[2024-08-19 23:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-19 23:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 537): INFO AMP_ENABLE: true AMP_OPT_LEVEL: '' AUG: AUTO_AUGMENT: rand-m9-mstd0.5-inc1 COLOR_JITTER: 0.4 CUTMIX: 1.0 CUTMIX_MINMAX: null MIXUP: 0.8 MIXUP_MODE: batch MIXUP_PROB: 1.0 MIXUP_SWITCH_PROB: 0.5 RECOUNT: 1 REMODE: pixel REPROB: 0.25 BASE: - '' DATA: BATCH_SIZE: 64 CACHE_MODE: part DATASET: imagenet DATA_PATH: /dataset/ImageNet_ILSVRC2012 IMG_SIZE: 224 INTERPOLATION: bicubic MASK_PATCH_SIZE: 32 MASK_RATIO: 0.6 NUM_WORKERS: 8 PERSISTENT_WORKERS: true PIN_MEMORY: true ZIP_MODE: false ENABLE_AMP: false EVAL_MODE: false FUSED_LAYERNORM: false MODEL: DDP: hfai DROP_PATH_RATE: 0.2 DROP_RATE: 0.0 LABEL_SMOOTHING: 0.1 MLLA: APE: false DEPTHS: - 2 - 4 - 8 - 4 DROP_PATH_RATE: 0.1 DROP_RATE: 0.0 EMBED_DIM: 64 IMAGE_SIZE: 224 IN_CHANS: 3 MLP_RATIO: 4.0 NUM_HEADS: - 2 - 4 - 8 - 16 PATCH_SIZE: 4 SIMPLE_DOWNSAMPLE: false SIMPLE_PATCH_EMBED: false MMCKPT: false NAME: msvmambav3_tiny_224 NUM_CLASSES: 1000 PRETRAINED: '' RESUME: '' RMT: CHUNKWISE_RECURRENTS: - true - true - false - false DEPTHS: - 2 - 2 - 6 - 2 DROP_PATH_RATE: 0.1 EMBED_DIMS: - 64 - 128 - 256 - 512 HEADS_RANGES: - 3 - 3 - 3 - 3 INIT_VALUES: - 1 - 1 - 1 - 1 LAYERSCALES: - false - false - false - false MLP_RATIOS: - 3 - 3 - 3 - 3 NUM_HEADS: - 3 - 6 - 12 - 24 PATCH_NORM: true TYPE: vssm VMAMBA2: APE: false ATTN_TYPES: - mamba2 - mamba2 - mamba2 - mamba2 BIDIRECTION: false DEPTHS: - 2 - 4 - 8 - 4 DROP_PATH_RATE: 0.2 DROP_RATE: 0.0 D_STATE: 64 EMBED_DIM: 64 IMAGE_SIZE: 224 IN_CHANS: 3 LEPE: false LINEAR_ATTN_DUALITY: false MLP_RATIO: 4.0 NUM_HEADS: - 2 - 4 - 8 - 16 PARTIAL_WIN_SIZE: -1 PATCH_SIZE: 4 RES_SCALE: - false - false - false - false SIMPLE_DOWNSAMPLE: true SIMPLE_PATCH_EMBED: true SSD_AEXP: false SSD_CHUNK_SIZE: 256 SSD_EXPANSION: 2 SSD_NGROUPS: 1 SSD_NORM_DA: false 
SSD_POSITIVE_DA: false WIN_ONLY: - false - false - false - false ZACT: false VSSM: ADD_SE: true AXIS_STAGE: [] CONVFFN: true CONV_FFN_RATIO: 4 DEPTHS: - 2 - 2 - 9 - 2 DOWNSAMPLE: v3 EMBED_DIM: 96 FULL_RES_INDEX: 1 GMLP: false IN_CHANS: 3 MLP_ACT_LAYER: gelu MLP_DROP_RATE: 0.0 MLP_RATIO: 4.0 NORM_LAYER: ln2d NUM_HEADS: - 1 - 2 - 4 - 8 PATCHEMBED: v2 PATCH_NORM: true PATCH_SIZE: 4 POSEMBED: false PRE_NORM: false SSM_ACT_LAYER: silu SSM_CONV: 3 SSM_CONV_BIAS: true SSM_DROP_RATE: 0.0 SSM_DT_RANK: auto SSM_D_STATE: 1 SSM_FORWARDTYPE: vms_noz SSM_INIT: v0 SSM_RANK_RATIO: 2.0 SSM_RATIO: 1.0 OUTPUT: ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009 PRINT_FREQ: 10 SAVE_FREQ: 1 SEED: 0 TAG: '20240819223009' TEST: CROP: true SEQUENTIAL: false SHUFFLE: false THROUGHPUT_MODE: false TRAIN: ACCUMULATION_STEPS: 1 AUTO_RESUME: true BASE_LR: 0.001 CLIP_GRAD: 5.0 EPOCHS: 300 LAYER_DECAY: 1.0 LR_SCHEDULER: DECAY_EPOCHS: 30 DECAY_RATE: 0.1 GAMMA: 0.1 MULTISTEPS: [] NAME: cosine WARMUP_PREFIX: true MIN_LR: 1.0e-05 MOE: SAVE_MASTER: false OPTIMIZER: BETAS: - 0.9 - 0.999 EPS: 1.0e-08 MOMENTUM: 0.9 NAME: adamw START_EPOCH: 0 USE_CHECKPOINT: false WARMUP_EPOCHS: 20 WARMUP_LR: 1.0e-06 WEIGHT_DECAY: 0.05 TRAINCOST_MODE: false [2024-08-19 23:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 538): INFO {"cfg": "./configs/msv2/msvmambav3_tiny_224.yaml", "opts": null, "batch_size": 64, "data_path": "/dataset/ImageNet_ILSVRC2012", "zip": false, "cache_mode": "part", "pretrained": null, "resume": null, "accumulation_steps": null, "use_checkpoint": false, "disable_amp": false, "output": "./exclude/output_msv2", "tag": "20240819223009", "eval": false, "throughput": false, "fused_layernorm": false, "optim": null, "model_ema": true, "model_ema_decay": 0.9999, "model_ema_force_cpu": false, "memory_limit_rate": -1, "ddp": "hfai", "enable_preload": true, "enable_persistance": true, "mesa": false, "mesa_value": 1.0, "mute_repeat": false} [2024-08-19 23:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 
129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-19 23:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 135): INFO VSSM( (patch_embed): Sequential( (0): Conv2d(3, 48, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (1): Identity() (2): LayerNorm2d((48,), eps=1e-05, elementwise_affine=True) (3): Identity() (4): GELU(approximate='none') (5): Conv2d(48, 96, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (6): Identity() (7): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True) ) (layers): ModuleList( (0): Sequential( (blocks): Sequential( (0): VSSBlock( (norm): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=96, out_features=12, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=12, out_features=96, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(96, 96, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=96) (in_proj): Linear2d(in_features=96, out_features=96, bias=False) (act): SiLU() (conv2d): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96) (out_act): Identity() (out_proj): Linear2d(in_features=96, out_features=96, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.0) (convFFN): ConvFFN( (linear1): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) ) (norm2): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True) ) (1): VSSBlock( (norm): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( 
(0): Linear(in_features=96, out_features=12, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=12, out_features=96, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(96, 96, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=96) (in_proj): Linear2d(in_features=96, out_features=96, bias=False) (act): SiLU() (conv2d): Conv2d(96, 96, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=96) (out_act): Identity() (out_proj): Linear2d(in_features=96, out_features=96, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.014285714365541935) (convFFN): ConvFFN( (linear1): Conv2d(96, 384, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(384, 96, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) ) (norm2): LayerNorm2d((96,), eps=1e-05, elementwise_affine=True) ) ) (downsample): Sequential( (0): Identity() (1): Conv2d(96, 192, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (2): Identity() (3): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True) ) ) (1): Sequential( (blocks): Sequential( (0): VSSBlock( (norm): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=192, out_features=24, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=24, out_features=192, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(192, 192, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=192) (in_proj): Linear2d(in_features=192, out_features=192, bias=False) (act): SiLU() (conv2d): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192) (out_act): Identity() (out_proj): Linear2d(in_features=192, out_features=192, bias=False) (dropout): Identity() ) 
(drop_path): timm.DropPath(0.02857142873108387) (convFFN): ConvFFN( (linear1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768) ) (norm2): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True) ) (1): VSSBlock( (norm): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=192, out_features=24, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=24, out_features=192, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(192, 192, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=192) (in_proj): Linear2d(in_features=192, out_features=192, bias=False) (act): SiLU() (conv2d): Conv2d(192, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=192) (out_act): Identity() (out_proj): Linear2d(in_features=192, out_features=192, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.04285714402794838) (convFFN): ConvFFN( (linear1): Conv2d(192, 768, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(768, 192, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768) ) (norm2): LayerNorm2d((192,), eps=1e-05, elementwise_affine=True) ) ) (downsample): Sequential( (0): Identity() (1): Conv2d(192, 384, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (2): Identity() (3): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) ) (2): Sequential( (blocks): Sequential( (0): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, 
elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.05714285746216774) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (1): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): 
timm.DropPath(0.0714285746216774) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (2): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.08571428805589676) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (3): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): 
ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.10000000149011612) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (4): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.11428571492433548) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): 
Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (5): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.12857143580913544) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (6): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() 
(conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.1428571492433548) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (7): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.15714286267757416) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) (8): VSSBlock( (norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (op): SS2D( 
(out_norm): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=384, out_features=48, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=48, out_features=384, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(384, 384, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=384) (in_proj): Linear2d(in_features=384, out_features=384, bias=False) (act): SiLU() (conv2d): Conv2d(384, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=384) (out_act): Identity() (out_proj): Linear2d(in_features=384, out_features=384, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.17142857611179352) (convFFN): ConvFFN( (linear1): Conv2d(384, 1536, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(1536, 384, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(1536, 1536, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=1536) ) (norm2): LayerNorm2d((384,), eps=1e-05, elementwise_affine=True) ) ) (downsample): Sequential( (0): Identity() (1): Conv2d(384, 768, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1)) (2): Identity() (3): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) ) ) (3): Sequential( (blocks): Sequential( (0): VSSBlock( (norm): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=768, out_features=96, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=96, out_features=768, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(768, 768, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=768) (in_proj): Linear2d(in_features=768, out_features=768, bias=False) (act): SiLU() (conv2d): Conv2d(768, 768, kernel_size=(3, 
3), stride=(1, 1), padding=(1, 1), groups=768) (out_act): Identity() (out_proj): Linear2d(in_features=768, out_features=768, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.18571428954601288) (convFFN): ConvFFN( (linear1): Conv2d(768, 3072, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(3072, 768, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(3072, 3072, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=3072) ) (norm2): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) ) (1): VSSBlock( (norm): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) (op): SS2D( (out_norm): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) (se): SEModule( (avg_pool): AdaptiveAvgPool2d(output_size=1) (fc): Sequential( (0): Linear(in_features=768, out_features=96, bias=False) (1): ReLU(inplace=True) (2): Linear(in_features=96, out_features=768, bias=False) (3): Sigmoid() ) ) (conv2d_b1): Conv2d(768, 768, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), groups=768) (in_proj): Linear2d(in_features=768, out_features=768, bias=False) (act): SiLU() (conv2d): Conv2d(768, 768, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=768) (out_act): Identity() (out_proj): Linear2d(in_features=768, out_features=768, bias=False) (dropout): Identity() ) (drop_path): timm.DropPath(0.20000000298023224) (convFFN): ConvFFN( (linear1): Conv2d(768, 3072, kernel_size=(1, 1), stride=(1, 1)) (drop1): Dropout(p=0.0, inplace=True) (act): GELU(approximate='none') (linear2): Conv2d(3072, 768, kernel_size=(1, 1), stride=(1, 1)) (drop2): Dropout(p=0.0, inplace=True) (dwc): Conv2d(3072, 3072, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=3072) ) (norm2): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) ) ) (downsample): Identity() ) ) (classifier): Sequential( (norm): LayerNorm2d((768,), eps=1e-05, elementwise_affine=True) (permute): 
Identity() (avgpool): AdaptiveAvgPool2d(output_size=1) (flatten): Flatten(start_dim=1, end_dim=-1) (head): Linear(in_features=768, out_features=1000, bias=True) ) ) [2024-08-19 23:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 137): INFO number of params: 32263720 [2024-08-19 23:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 139): INFO number of GFLOPs: 5.133861504 [2024-08-19 23:25:51 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-19 23:25:51 msvmambav3_tiny_224] (optimizer.py 27): INFO No weight decay list: ['patch_embed.0.bias', 'patch_embed.2.weight', 'patch_embed.2.bias', 'patch_embed.5.bias', 'patch_embed.7.weight', 'patch_embed.7.bias', 'layers.0.blocks.0.norm.weight', 'layers.0.blocks.0.norm.bias', 'layers.0.blocks.0.op.Ds', 'layers.0.blocks.0.op.out_norm.weight', 'layers.0.blocks.0.op.out_norm.bias', 'layers.0.blocks.0.op.conv2d_b1.bias', 'layers.0.blocks.0.op.conv2d.bias', 'layers.0.blocks.0.convFFN.linear1.bias', 'layers.0.blocks.0.convFFN.linear2.bias', 'layers.0.blocks.0.convFFN.dwc.bias', 'layers.0.blocks.0.norm2.weight', 'layers.0.blocks.0.norm2.bias', 'layers.0.blocks.1.norm.weight', 'layers.0.blocks.1.norm.bias', 'layers.0.blocks.1.op.Ds', 'layers.0.blocks.1.op.out_norm.weight', 'layers.0.blocks.1.op.out_norm.bias', 'layers.0.blocks.1.op.conv2d_b1.bias', 'layers.0.blocks.1.op.conv2d.bias', 'layers.0.blocks.1.convFFN.linear1.bias', 'layers.0.blocks.1.convFFN.linear2.bias', 'layers.0.blocks.1.convFFN.dwc.bias', 'layers.0.blocks.1.norm2.weight', 'layers.0.blocks.1.norm2.bias', 'layers.0.downsample.1.bias', 'layers.0.downsample.3.weight', 'layers.0.downsample.3.bias', 'layers.1.blocks.0.norm.weight', 'layers.1.blocks.0.norm.bias', 'layers.1.blocks.0.op.Ds', 'layers.1.blocks.0.op.out_norm.weight', 'layers.1.blocks.0.op.out_norm.bias', 'layers.1.blocks.0.op.conv2d_b1.bias', 'layers.1.blocks.0.op.conv2d.bias', 'layers.1.blocks.0.convFFN.linear1.bias', 
'layers.1.blocks.0.convFFN.linear2.bias', 'layers.1.blocks.0.convFFN.dwc.bias', 'layers.1.blocks.0.norm2.weight', 'layers.1.blocks.0.norm2.bias', 'layers.1.blocks.1.norm.weight', 'layers.1.blocks.1.norm.bias', 'layers.1.blocks.1.op.Ds', 'layers.1.blocks.1.op.out_norm.weight', 'layers.1.blocks.1.op.out_norm.bias', 'layers.1.blocks.1.op.conv2d_b1.bias', 'layers.1.blocks.1.op.conv2d.bias', 'layers.1.blocks.1.convFFN.linear1.bias', 'layers.1.blocks.1.convFFN.linear2.bias', 'layers.1.blocks.1.convFFN.dwc.bias', 'layers.1.blocks.1.norm2.weight', 'layers.1.blocks.1.norm2.bias', 'layers.1.downsample.1.bias', 'layers.1.downsample.3.weight', 'layers.1.downsample.3.bias', 'layers.2.blocks.0.norm.weight', 'layers.2.blocks.0.norm.bias', 'layers.2.blocks.0.op.Ds', 'layers.2.blocks.0.op.out_norm.weight', 'layers.2.blocks.0.op.out_norm.bias', 'layers.2.blocks.0.op.conv2d_b1.bias', 'layers.2.blocks.0.op.conv2d.bias', 'layers.2.blocks.0.convFFN.linear1.bias', 'layers.2.blocks.0.convFFN.linear2.bias', 'layers.2.blocks.0.convFFN.dwc.bias', 'layers.2.blocks.0.norm2.weight', 'layers.2.blocks.0.norm2.bias', 'layers.2.blocks.1.norm.weight', 'layers.2.blocks.1.norm.bias', 'layers.2.blocks.1.op.Ds', 'layers.2.blocks.1.op.out_norm.weight', 'layers.2.blocks.1.op.out_norm.bias', 'layers.2.blocks.1.op.conv2d_b1.bias', 'layers.2.blocks.1.op.conv2d.bias', 'layers.2.blocks.1.convFFN.linear1.bias', 'layers.2.blocks.1.convFFN.linear2.bias', 'layers.2.blocks.1.convFFN.dwc.bias', 'layers.2.blocks.1.norm2.weight', 'layers.2.blocks.1.norm2.bias', 'layers.2.blocks.2.norm.weight', 'layers.2.blocks.2.norm.bias', 'layers.2.blocks.2.op.Ds', 'layers.2.blocks.2.op.out_norm.weight', 'layers.2.blocks.2.op.out_norm.bias', 'layers.2.blocks.2.op.conv2d_b1.bias', 'layers.2.blocks.2.op.conv2d.bias', 'layers.2.blocks.2.convFFN.linear1.bias', 'layers.2.blocks.2.convFFN.linear2.bias', 'layers.2.blocks.2.convFFN.dwc.bias', 'layers.2.blocks.2.norm2.weight', 'layers.2.blocks.2.norm2.bias', 'layers.2.blocks.3.norm.weight', 
'layers.2.blocks.3.norm.bias', 'layers.2.blocks.3.op.Ds', 'layers.2.blocks.3.op.out_norm.weight', 'layers.2.blocks.3.op.out_norm.bias', 'layers.2.blocks.3.op.conv2d_b1.bias', 'layers.2.blocks.3.op.conv2d.bias', 'layers.2.blocks.3.convFFN.linear1.bias', 'layers.2.blocks.3.convFFN.linear2.bias', 'layers.2.blocks.3.convFFN.dwc.bias', 'layers.2.blocks.3.norm2.weight', 'layers.2.blocks.3.norm2.bias', 'layers.2.blocks.4.norm.weight', 'layers.2.blocks.4.norm.bias', 'layers.2.blocks.4.op.Ds', 'layers.2.blocks.4.op.out_norm.weight', 'layers.2.blocks.4.op.out_norm.bias', 'layers.2.blocks.4.op.conv2d_b1.bias', 'layers.2.blocks.4.op.conv2d.bias', 'layers.2.blocks.4.convFFN.linear1.bias', 'layers.2.blocks.4.convFFN.linear2.bias', 'layers.2.blocks.4.convFFN.dwc.bias', 'layers.2.blocks.4.norm2.weight', 'layers.2.blocks.4.norm2.bias', 'layers.2.blocks.5.norm.weight', 'layers.2.blocks.5.norm.bias', 'layers.2.blocks.5.op.Ds', 'layers.2.blocks.5.op.out_norm.weight', 'layers.2.blocks.5.op.out_norm.bias', 'layers.2.blocks.5.op.conv2d_b1.bias', 'layers.2.blocks.5.op.conv2d.bias', 'layers.2.blocks.5.convFFN.linear1.bias', 'layers.2.blocks.5.convFFN.linear2.bias', 'layers.2.blocks.5.convFFN.dwc.bias', 'layers.2.blocks.5.norm2.weight', 'layers.2.blocks.5.norm2.bias', 'layers.2.blocks.6.norm.weight', 'layers.2.blocks.6.norm.bias', 'layers.2.blocks.6.op.Ds', 'layers.2.blocks.6.op.out_norm.weight', 'layers.2.blocks.6.op.out_norm.bias', 'layers.2.blocks.6.op.conv2d_b1.bias', 'layers.2.blocks.6.op.conv2d.bias', 'layers.2.blocks.6.convFFN.linear1.bias', 'layers.2.blocks.6.convFFN.linear2.bias', 'layers.2.blocks.6.convFFN.dwc.bias', 'layers.2.blocks.6.norm2.weight', 'layers.2.blocks.6.norm2.bias', 'layers.2.blocks.7.norm.weight', 'layers.2.blocks.7.norm.bias', 'layers.2.blocks.7.op.Ds', 'layers.2.blocks.7.op.out_norm.weight', 'layers.2.blocks.7.op.out_norm.bias', 'layers.2.blocks.7.op.conv2d_b1.bias', 'layers.2.blocks.7.op.conv2d.bias', 'layers.2.blocks.7.convFFN.linear1.bias', 
'layers.2.blocks.7.convFFN.linear2.bias', 'layers.2.blocks.7.convFFN.dwc.bias', 'layers.2.blocks.7.norm2.weight', 'layers.2.blocks.7.norm2.bias', 'layers.2.blocks.8.norm.weight', 'layers.2.blocks.8.norm.bias', 'layers.2.blocks.8.op.Ds', 'layers.2.blocks.8.op.out_norm.weight', 'layers.2.blocks.8.op.out_norm.bias', 'layers.2.blocks.8.op.conv2d_b1.bias', 'layers.2.blocks.8.op.conv2d.bias', 'layers.2.blocks.8.convFFN.linear1.bias', 'layers.2.blocks.8.convFFN.linear2.bias', 'layers.2.blocks.8.convFFN.dwc.bias', 'layers.2.blocks.8.norm2.weight', 'layers.2.blocks.8.norm2.bias', 'layers.2.downsample.1.bias', 'layers.2.downsample.3.weight', 'layers.2.downsample.3.bias', 'layers.3.blocks.0.norm.weight', 'layers.3.blocks.0.norm.bias', 'layers.3.blocks.0.op.Ds', 'layers.3.blocks.0.op.out_norm.weight', 'layers.3.blocks.0.op.out_norm.bias', 'layers.3.blocks.0.op.conv2d_b1.bias', 'layers.3.blocks.0.op.conv2d.bias', 'layers.3.blocks.0.convFFN.linear1.bias', 'layers.3.blocks.0.convFFN.linear2.bias', 'layers.3.blocks.0.convFFN.dwc.bias', 'layers.3.blocks.0.norm2.weight', 'layers.3.blocks.0.norm2.bias', 'layers.3.blocks.1.norm.weight', 'layers.3.blocks.1.norm.bias', 'layers.3.blocks.1.op.Ds', 'layers.3.blocks.1.op.out_norm.weight', 'layers.3.blocks.1.op.out_norm.bias', 'layers.3.blocks.1.op.conv2d_b1.bias', 'layers.3.blocks.1.op.conv2d.bias', 'layers.3.blocks.1.convFFN.linear1.bias', 'layers.3.blocks.1.convFFN.linear2.bias', 'layers.3.blocks.1.convFFN.dwc.bias', 'layers.3.blocks.1.norm2.weight', 'layers.3.blocks.1.norm2.bias', 'classifier.norm.weight', 'classifier.norm.bias', 'classifier.head.bias'] [2024-08-19 23:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 196): INFO no checkpoint found in ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009, ignoring auto resume [2024-08-19 23:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-19 23:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][0/1251] eta 4:33:40 lr 0.000001 wd 
0.0500 time 13.1261 (13.1261) data time 0.6472 (0.6472) model time 0.0000 (0.0000) loss 7.0165 (7.0165) grad_norm 2.2812 (2.2812) loss_scale 65536.0000 (65536.0000) mem 19811MB
[2024-08-19 23:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][10/1251] eta 0:29:28 lr 0.000001 wd 0.0500 time 0.2447 (1.4254) data time 0.0010 (0.0598) model time 0.0000 (0.0000) loss 6.9737 (7.0202) grad_norm 2.2363 (2.2515) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][20/1251] eta 0:17:39 lr 0.000002 wd 0.0500 time 0.2456 (0.8608) data time 0.0013 (0.0319) model time 0.0000 (0.0000) loss 6.9811 (7.0121) grad_norm 2.2275 (2.2371) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][30/1251] eta 0:13:26 lr 0.000002 wd 0.0500 time 0.2369 (0.6605) data time 0.0008 (0.0219) model time 0.0000 (0.0000) loss 7.0235 (7.0070) grad_norm 2.2214 (2.2213) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][40/1251] eta 0:11:15 lr 0.000003 wd 0.0500 time 0.2402 (0.5579) data time 0.0010 (0.0169) model time 0.0000 (0.0000) loss 6.9647 (6.9970) grad_norm 2.1393 (2.2050) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][50/1251] eta 0:09:55 lr 0.000003 wd 0.0500 time 0.2351 (0.4954) data time 0.0009 (0.0138) model time 0.0000 (0.0000) loss 6.9222 (6.9869) grad_norm 2.0440 (2.1860) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][60/1251] eta 0:08:59 lr 0.000003 wd 0.0500 time 0.2419 (0.4533) data time 0.0011 (0.0117) model time 0.2408 (0.2377) loss 6.9562 (6.9837) grad_norm 2.1719 (2.1803) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][70/1251] eta 0:08:19 lr 0.000004 wd 0.0500 time 0.2403 (0.4233) data time 0.0010 (0.0102) model time 0.2393 (0.2382) loss 6.9259 (6.9798) grad_norm 2.0694 (2.1697) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][80/1251] eta 0:07:49 lr 0.000004 wd 0.0500 time 0.2409 (0.4005) data time 0.0011 (0.0091) model time 0.2398 (0.2381) loss 6.9586 (6.9753) grad_norm 2.0687 (2.1646) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][90/1251] eta 0:07:24 lr 0.000005 wd 0.0500 time 0.2299 (0.3829) data time 0.0009 (0.0082) model time 0.2290 (0.2384) loss 6.9424 (6.9706) grad_norm 2.2281 (2.1547) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][100/1251] eta 0:07:04 lr 0.000005 wd 0.0500 time 0.2485 (0.3688) data time 0.0008 (0.0075) model time 0.2477 (0.2386) loss 6.9467 (6.9667) grad_norm 2.0361 (2.1423) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][110/1251] eta 0:06:47 lr 0.000005 wd 0.0500 time 0.2346 (0.3574) data time 0.0010 (0.0069) model time 0.2336 (0.2390) loss 6.9427 (6.9634) grad_norm 2.0213 (2.1326) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][120/1251] eta 0:06:33 lr 0.000006 wd 0.0500 time 0.2443 (0.3479) data time 0.0009 (0.0065) model time 0.2433 (0.2393) loss 6.9502 (6.9615) grad_norm 2.0711 (2.1254) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][130/1251] eta 0:06:20 lr 0.000006 wd 0.0500 time 0.2396 (0.3398) data time 0.0011 (0.0060) model time 0.2385 (0.2394) loss 6.9465 (6.9595) grad_norm 1.9721 (2.1141) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][140/1251] eta 0:06:09 lr 0.000007 wd 0.0500 time 0.2417 (0.3328) data time 0.0010 (0.0057) model time 0.2408 (0.2395) loss 6.9007 (6.9569) grad_norm 1.9094 (2.1035) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][150/1251] eta 0:05:59 lr 0.000007 wd 0.0500 time 0.2401 (0.3268) data time 0.0010 (0.0055) model time 0.2391 (0.2396) loss 6.9039 (6.9542) grad_norm 1.9143 (2.0922) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][160/1251] eta 0:05:50 lr 0.000007 wd 0.0500 time 0.2406 (0.3216) data time 0.0011 (0.0052) model time 0.2395 (0.2397) loss 6.9031 (6.9512) grad_norm 1.8889 (2.0841) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][170/1251] eta 0:05:42 lr 0.000008 wd 0.0500 time 0.2467 (0.3168) data time 0.0010 (0.0050) model time 0.2457 (0.2397) loss 6.9247 (6.9493) grad_norm 1.9389 (2.0747) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][180/1251] eta 0:05:34 lr 0.000008 wd 0.0500 time 0.2458 (0.3126) data time 0.0010 (0.0048) model time 0.2447 (0.2396) loss 6.9165 (6.9487) grad_norm 1.9423 (2.0658) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][190/1251] eta 0:05:27 lr 0.000009 wd 0.0500 time 0.2477 (0.3089) data time 0.0015 (0.0046) model time 0.2461 (0.2397) loss 6.9092 (6.9473) grad_norm 1.9491 (2.0555) loss_scale 65536.0000 (65536.0000) mem 7403MB
[2024-08-19 23:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][200/1251] eta 0:05:21 lr 
0.000009 wd 0.0500 time 0.2424 (0.3055) data time 0.0012 (0.0044) model time 0.2413 (0.2398) loss 6.9603 (6.9459) grad_norm 1.9124 (2.0459) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][210/1251] eta 0:05:14 lr 0.000009 wd 0.0500 time 0.2367 (0.3024) data time 0.0013 (0.0042) model time 0.2354 (0.2397) loss 6.9370 (6.9442) grad_norm 1.7493 (2.0358) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][220/1251] eta 0:05:08 lr 0.000010 wd 0.0500 time 0.2436 (0.2996) data time 0.0009 (0.0041) model time 0.2427 (0.2397) loss 6.9217 (6.9431) grad_norm 1.7583 (2.0257) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][230/1251] eta 0:05:03 lr 0.000010 wd 0.0500 time 0.2355 (0.2971) data time 0.0014 (0.0040) model time 0.2341 (0.2397) loss 6.9241 (6.9415) grad_norm 1.8455 (2.0158) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][240/1251] eta 0:04:58 lr 0.000011 wd 0.0500 time 0.2447 (0.2948) data time 0.0010 (0.0038) model time 0.2436 (0.2397) loss 6.9043 (6.9394) grad_norm 1.7852 (2.0059) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][250/1251] eta 0:04:53 lr 0.000011 wd 0.0500 time 0.2422 (0.2928) data time 0.0009 (0.0037) model time 0.2413 (0.2399) loss 6.8666 (6.9383) grad_norm 1.7912 (1.9965) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][260/1251] eta 0:04:48 lr 0.000011 wd 0.0500 time 0.2385 (0.2908) data time 0.0010 (0.0036) model time 0.2375 (0.2399) loss 6.8852 (6.9373) grad_norm 1.6969 (1.9873) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 
23:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][270/1251] eta 0:04:43 lr 0.000012 wd 0.0500 time 0.2416 (0.2890) data time 0.0010 (0.0035) model time 0.2407 (0.2400) loss 6.8571 (6.9355) grad_norm 1.7721 (1.9778) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][280/1251] eta 0:04:38 lr 0.000012 wd 0.0500 time 0.2418 (0.2873) data time 0.0013 (0.0034) model time 0.2405 (0.2399) loss 6.8796 (6.9342) grad_norm 1.7525 (1.9686) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][290/1251] eta 0:04:34 lr 0.000013 wd 0.0500 time 0.2423 (0.2857) data time 0.0009 (0.0034) model time 0.2414 (0.2400) loss 6.9077 (6.9327) grad_norm 1.6487 (1.9579) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][300/1251] eta 0:04:30 lr 0.000013 wd 0.0500 time 0.2450 (0.2842) data time 0.0012 (0.0033) model time 0.2438 (0.2400) loss 6.8921 (6.9316) grad_norm 1.6065 (1.9481) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][310/1251] eta 0:04:26 lr 0.000013 wd 0.0500 time 0.2368 (0.2828) data time 0.0012 (0.0032) model time 0.2356 (0.2399) loss 6.8431 (6.9301) grad_norm 1.7334 (1.9378) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][320/1251] eta 0:04:22 lr 0.000014 wd 0.0500 time 0.2425 (0.2815) data time 0.0011 (0.0032) model time 0.2414 (0.2400) loss 6.8750 (6.9288) grad_norm 1.6385 (1.9298) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][330/1251] eta 0:04:18 lr 0.000014 wd 0.0500 time 0.2416 (0.2803) data time 0.0008 (0.0031) model time 0.2408 (0.2400) loss 
6.8924 (6.9279) grad_norm 1.6173 (1.9215) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][340/1251] eta 0:04:14 lr 0.000015 wd 0.0500 time 0.2459 (0.2792) data time 0.0011 (0.0030) model time 0.2448 (0.2400) loss 6.9030 (6.9270) grad_norm 1.7804 (1.9133) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][350/1251] eta 0:04:10 lr 0.000015 wd 0.0500 time 0.2410 (0.2781) data time 0.0008 (0.0030) model time 0.2402 (0.2401) loss 6.8055 (6.9257) grad_norm 1.5437 (1.9062) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][360/1251] eta 0:04:06 lr 0.000015 wd 0.0500 time 0.2486 (0.2772) data time 0.0008 (0.0029) model time 0.2478 (0.2401) loss 6.9375 (6.9247) grad_norm 1.5040 (1.8991) loss_scale 65536.0000 (65536.0000) mem 7403MB [2024-08-19 23:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][370/1251] eta 0:04:03 lr 0.000016 wd 0.0500 time 0.2408 (0.2768) data time 0.0010 (0.0029) model time 0.2398 (0.2409) loss 6.8349 (6.9234) grad_norm 1.6245 (1.8910) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][380/1251] eta 0:04:00 lr 0.000016 wd 0.0500 time 0.2444 (0.2759) data time 0.0009 (0.0028) model time 0.2436 (0.2408) loss 6.7946 (6.9218) grad_norm 1.7197 (1.8836) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][390/1251] eta 0:03:56 lr 0.000017 wd 0.0500 time 0.2335 (0.2750) data time 0.0009 (0.0028) model time 0.2327 (0.2408) loss 6.9075 (6.9206) grad_norm 1.6667 (1.8764) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][400/1251] eta 0:03:53 
lr 0.000017 wd 0.0500 time 0.2448 (0.2741) data time 0.0012 (0.0027) model time 0.2436 (0.2407) loss 6.8755 (6.9189) grad_norm 1.5630 (1.8687) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][410/1251] eta 0:03:50 lr 0.000017 wd 0.0500 time 0.2390 (0.2744) data time 0.0012 (0.0027) model time 0.2377 (0.2419) loss 6.8772 (6.9180) grad_norm 1.7076 (1.8645) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][420/1251] eta 0:03:47 lr 0.000018 wd 0.0500 time 0.2376 (0.2736) data time 0.0009 (0.0027) model time 0.2367 (0.2419) loss 6.8971 (6.9166) grad_norm 1.7900 (1.8595) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:27:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][430/1251] eta 0:03:44 lr 0.000018 wd 0.0500 time 0.2439 (0.2729) data time 0.0011 (0.0026) model time 0.2428 (0.2419) loss 6.8931 (6.9159) grad_norm 2.0376 (1.8566) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][440/1251] eta 0:03:40 lr 0.000019 wd 0.0500 time 0.2403 (0.2721) data time 0.0008 (0.0026) model time 0.2394 (0.2418) loss 6.8881 (6.9145) grad_norm 1.6564 (1.8500) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][450/1251] eta 0:03:37 lr 0.000019 wd 0.0500 time 0.2450 (0.2714) data time 0.0008 (0.0026) model time 0.2442 (0.2417) loss 6.7913 (6.9132) grad_norm 1.3973 (1.8463) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][460/1251] eta 0:03:34 lr 0.000019 wd 0.0500 time 0.2415 (0.2708) data time 0.0010 (0.0025) model time 0.2405 (0.2417) loss 6.9261 (6.9115) grad_norm 1.6922 (1.8419) loss_scale 65536.0000 (65536.0000) mem 7376MB 
[2024-08-19 23:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][470/1251] eta 0:03:30 lr 0.000020 wd 0.0500 time 0.2432 (0.2701) data time 0.0011 (0.0025) model time 0.2422 (0.2417) loss 6.8343 (6.9099) grad_norm 1.9429 (1.8394) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][480/1251] eta 0:03:27 lr 0.000020 wd 0.0500 time 0.2458 (0.2696) data time 0.0008 (0.0025) model time 0.2449 (0.2417) loss 6.8855 (6.9092) grad_norm 1.4462 (1.8367) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][490/1251] eta 0:03:24 lr 0.000021 wd 0.0500 time 0.2416 (0.2690) data time 0.0011 (0.0024) model time 0.2405 (0.2416) loss 6.8552 (6.9078) grad_norm 1.5637 (1.8367) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][500/1251] eta 0:03:21 lr 0.000021 wd 0.0500 time 0.2322 (0.2684) data time 0.0011 (0.0024) model time 0.2311 (0.2415) loss 6.8840 (6.9065) grad_norm 3.0841 (1.8400) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][510/1251] eta 0:03:18 lr 0.000021 wd 0.0500 time 0.2379 (0.2678) data time 0.0008 (0.0024) model time 0.2371 (0.2415) loss 6.8201 (6.9052) grad_norm 1.4335 (1.8503) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][520/1251] eta 0:03:15 lr 0.000022 wd 0.0500 time 0.2452 (0.2673) data time 0.0008 (0.0024) model time 0.2444 (0.2415) loss 6.7678 (6.9037) grad_norm 2.0638 (1.8553) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][530/1251] eta 0:03:12 lr 0.000022 wd 0.0500 time 0.2414 (0.2668) data time 0.0011 (0.0023) model time 0.2403 
(0.2414) loss 6.8643 (6.9019) grad_norm 1.4144 (1.8580) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][540/1251] eta 0:03:09 lr 0.000023 wd 0.0500 time 0.2370 (0.2663) data time 0.0011 (0.0023) model time 0.2359 (0.2413) loss 6.8427 (6.9003) grad_norm 1.5639 (1.8609) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][550/1251] eta 0:03:06 lr 0.000023 wd 0.0500 time 0.2348 (0.2659) data time 0.0008 (0.0023) model time 0.2340 (0.2413) loss 6.8332 (6.8985) grad_norm 1.7139 (1.8595) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][560/1251] eta 0:03:03 lr 0.000023 wd 0.0500 time 0.2366 (0.2654) data time 0.0013 (0.0023) model time 0.2353 (0.2413) loss 6.8468 (6.8976) grad_norm 2.0594 (1.8578) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][570/1251] eta 0:03:00 lr 0.000024 wd 0.0500 time 0.2420 (0.2650) data time 0.0010 (0.0022) model time 0.2410 (0.2413) loss 6.7812 (6.8961) grad_norm 2.2928 (1.8606) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][580/1251] eta 0:02:57 lr 0.000024 wd 0.0500 time 0.2394 (0.2646) data time 0.0007 (0.0022) model time 0.2386 (0.2412) loss 6.8586 (6.8946) grad_norm 1.8699 (1.8639) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][590/1251] eta 0:02:54 lr 0.000025 wd 0.0500 time 0.2426 (0.2641) data time 0.0010 (0.0022) model time 0.2415 (0.2412) loss 6.8392 (6.8933) grad_norm 2.6904 (1.8687) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[0/300][600/1251] eta 0:02:51 lr 0.000025 wd 0.0500 time 0.2431 (0.2637) data time 0.0011 (0.0022) model time 0.2420 (0.2411) loss 6.8100 (6.8912) grad_norm 2.7741 (1.8780) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][610/1251] eta 0:02:48 lr 0.000025 wd 0.0500 time 0.2361 (0.2633) data time 0.0010 (0.0022) model time 0.2351 (0.2411) loss 6.7677 (6.8895) grad_norm 1.8366 (1.8798) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][620/1251] eta 0:02:45 lr 0.000026 wd 0.0500 time 0.2449 (0.2630) data time 0.0008 (0.0021) model time 0.2442 (0.2411) loss 6.8593 (6.8879) grad_norm 1.7453 (1.8799) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][630/1251] eta 0:02:43 lr 0.000026 wd 0.0500 time 0.2385 (0.2626) data time 0.0011 (0.0021) model time 0.2374 (0.2410) loss 6.7998 (6.8865) grad_norm 1.4727 (1.8825) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][640/1251] eta 0:02:40 lr 0.000027 wd 0.0500 time 0.2506 (0.2623) data time 0.0008 (0.0021) model time 0.2497 (0.2410) loss 6.7275 (6.8850) grad_norm 1.4738 (1.8915) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][650/1251] eta 0:02:37 lr 0.000027 wd 0.0500 time 0.2414 (0.2619) data time 0.0009 (0.0021) model time 0.2405 (0.2410) loss 6.7962 (6.8834) grad_norm 1.7652 (1.8977) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][660/1251] eta 0:02:34 lr 0.000027 wd 0.0500 time 0.2420 (0.2616) data time 0.0008 (0.0021) model time 0.2412 (0.2410) loss 6.6208 (6.8814) grad_norm 3.1313 (1.9050) loss_scale 65536.0000 
(65536.0000) mem 7376MB [2024-08-19 23:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][670/1251] eta 0:02:31 lr 0.000028 wd 0.0500 time 0.2464 (0.2613) data time 0.0008 (0.0021) model time 0.2456 (0.2410) loss 6.7935 (6.8802) grad_norm 2.2804 (1.9108) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][680/1251] eta 0:02:29 lr 0.000028 wd 0.0500 time 0.2451 (0.2610) data time 0.0009 (0.0020) model time 0.2442 (0.2409) loss 6.7631 (6.8794) grad_norm 2.3952 (1.9163) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][690/1251] eta 0:02:26 lr 0.000029 wd 0.0500 time 0.2389 (0.2607) data time 0.0010 (0.0020) model time 0.2378 (0.2409) loss 6.8057 (6.8778) grad_norm 4.0495 (1.9270) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][700/1251] eta 0:02:23 lr 0.000029 wd 0.0500 time 0.2407 (0.2604) data time 0.0010 (0.0020) model time 0.2398 (0.2409) loss 6.8904 (6.8762) grad_norm 2.2994 (1.9370) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][710/1251] eta 0:02:20 lr 0.000029 wd 0.0500 time 0.2397 (0.2602) data time 0.0008 (0.0020) model time 0.2389 (0.2409) loss 6.6666 (6.8747) grad_norm 2.8720 (1.9459) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][720/1251] eta 0:02:18 lr 0.000030 wd 0.0500 time 0.2423 (0.2599) data time 0.0008 (0.0020) model time 0.2415 (0.2409) loss 6.8666 (6.8732) grad_norm 3.3628 (1.9652) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][730/1251] eta 0:02:15 lr 0.000030 wd 0.0500 time 0.2412 (0.2597) data time 0.0010 
(0.0020) model time 0.2402 (0.2409) loss 6.7732 (6.8721) grad_norm 2.5983 (1.9819) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][740/1251] eta 0:02:12 lr 0.000031 wd 0.0500 time 0.2382 (0.2594) data time 0.0011 (0.0020) model time 0.2371 (0.2408) loss 6.7992 (6.8712) grad_norm 1.9183 (1.9889) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][750/1251] eta 0:02:09 lr 0.000031 wd 0.0500 time 0.2431 (0.2591) data time 0.0010 (0.0020) model time 0.2420 (0.2408) loss 6.8063 (6.8700) grad_norm 3.1356 (1.9967) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][760/1251] eta 0:02:07 lr 0.000031 wd 0.0500 time 0.2421 (0.2589) data time 0.0011 (0.0019) model time 0.2410 (0.2408) loss 6.8203 (6.8689) grad_norm 2.1499 (2.0158) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][770/1251] eta 0:02:04 lr 0.000032 wd 0.0500 time 0.2345 (0.2587) data time 0.0008 (0.0020) model time 0.2337 (0.2408) loss 6.8009 (6.8681) grad_norm 1.7699 (2.0217) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][780/1251] eta 0:02:01 lr 0.000032 wd 0.0500 time 0.2391 (0.2585) data time 0.0009 (0.0019) model time 0.2382 (0.2408) loss 6.8474 (6.8668) grad_norm 1.6966 (2.0248) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][790/1251] eta 0:01:59 lr 0.000033 wd 0.0500 time 0.2391 (0.2582) data time 0.0011 (0.0019) model time 0.2380 (0.2407) loss 6.7020 (6.8654) grad_norm 5.3640 (2.0380) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [0/300][800/1251] eta 0:01:56 lr 0.000033 wd 0.0500 time 0.2452 (0.2580) data time 0.0008 (0.0019) model time 0.2444 (0.2407) loss 6.7843 (6.8641) grad_norm 3.9671 (2.0587) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][810/1251] eta 0:01:53 lr 0.000033 wd 0.0500 time 0.2478 (0.2578) data time 0.0008 (0.0019) model time 0.2470 (0.2407) loss 6.6439 (6.8621) grad_norm 3.5526 (2.0722) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][820/1251] eta 0:01:51 lr 0.000034 wd 0.0500 time 0.2449 (0.2576) data time 0.0008 (0.0019) model time 0.2441 (0.2407) loss 6.8334 (6.8607) grad_norm 3.8059 (2.0836) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][830/1251] eta 0:01:48 lr 0.000034 wd 0.0500 time 0.2332 (0.2574) data time 0.0012 (0.0019) model time 0.2320 (0.2407) loss 6.7837 (6.8589) grad_norm 2.2007 (2.1024) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][840/1251] eta 0:01:45 lr 0.000035 wd 0.0500 time 0.2430 (0.2572) data time 0.0009 (0.0019) model time 0.2421 (0.2407) loss 6.8399 (6.8578) grad_norm 3.0371 (2.1298) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][850/1251] eta 0:01:43 lr 0.000035 wd 0.0500 time 0.2427 (0.2569) data time 0.0008 (0.0019) model time 0.2418 (0.2406) loss 6.7349 (6.8564) grad_norm 2.3963 (2.1386) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][860/1251] eta 0:01:40 lr 0.000035 wd 0.0500 time 0.2400 (0.2567) data time 0.0010 (0.0019) model time 0.2389 (0.2406) loss 6.8302 (6.8551) grad_norm 1.8711 (2.1499) loss_scale 
65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][870/1251] eta 0:01:37 lr 0.000036 wd 0.0500 time 0.2408 (0.2566) data time 0.0008 (0.0019) model time 0.2400 (0.2406) loss 6.7320 (6.8540) grad_norm 2.3434 (2.1625) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][880/1251] eta 0:01:35 lr 0.000036 wd 0.0500 time 0.2298 (0.2563) data time 0.0008 (0.0018) model time 0.2290 (0.2405) loss 6.7541 (6.8524) grad_norm 3.2567 (2.1812) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][890/1251] eta 0:01:32 lr 0.000037 wd 0.0500 time 0.2127 (0.2564) data time 0.0012 (0.0018) model time 0.2115 (0.2408) loss 6.6611 (6.8509) grad_norm 3.9855 (2.2007) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][900/1251] eta 0:01:29 lr 0.000037 wd 0.0500 time 0.2404 (0.2562) data time 0.0010 (0.0018) model time 0.2394 (0.2407) loss 6.8117 (6.8497) grad_norm 2.9473 (2.2153) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][910/1251] eta 0:01:27 lr 0.000037 wd 0.0500 time 0.2356 (0.2560) data time 0.0012 (0.0018) model time 0.2345 (0.2407) loss 6.7609 (6.8484) grad_norm 3.2783 (2.2360) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][920/1251] eta 0:01:24 lr 0.000038 wd 0.0500 time 0.2365 (0.2559) data time 0.0010 (0.0018) model time 0.2354 (0.2407) loss 6.7617 (6.8472) grad_norm 3.1364 (2.2537) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][930/1251] eta 0:01:22 lr 0.000038 wd 0.0500 time 0.2402 (0.2559) data time 
0.0011 (0.0018) model time 0.2391 (0.2409) loss 6.7564 (6.8463) grad_norm 4.4982 (2.2656) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][940/1251] eta 0:01:19 lr 0.000039 wd 0.0500 time 0.2333 (0.2557) data time 0.0009 (0.0018) model time 0.2324 (0.2409) loss 6.8570 (6.8450) grad_norm 4.3392 (2.2827) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][950/1251] eta 0:01:16 lr 0.000039 wd 0.0500 time 0.2490 (0.2556) data time 0.0011 (0.0018) model time 0.2479 (0.2409) loss 6.6909 (6.8428) grad_norm 3.7323 (2.3111) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][960/1251] eta 0:01:14 lr 0.000039 wd 0.0500 time 0.2408 (0.2554) data time 0.0011 (0.0018) model time 0.2397 (0.2409) loss 6.9254 (6.8419) grad_norm 3.8155 (2.3359) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][970/1251] eta 0:01:11 lr 0.000040 wd 0.0500 time 0.2301 (0.2553) data time 0.0011 (0.0018) model time 0.2290 (0.2408) loss 6.7720 (6.8409) grad_norm 3.2882 (2.3623) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][980/1251] eta 0:01:09 lr 0.000040 wd 0.0500 time 0.2327 (0.2551) data time 0.0010 (0.0018) model time 0.2317 (0.2408) loss 6.7804 (6.8401) grad_norm 3.7547 (2.3790) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][990/1251] eta 0:01:06 lr 0.000041 wd 0.0500 time 0.2339 (0.2549) data time 0.0011 (0.0018) model time 0.2328 (0.2408) loss 6.7584 (6.8390) grad_norm 5.0378 (2.3921) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [0/300][1000/1251] eta 0:01:03 lr 0.000041 wd 0.0500 time 0.2539 (0.2548) data time 0.0010 (0.0018) model time 0.2530 (0.2408) loss 6.7704 (6.8375) grad_norm 2.3475 (2.4020) loss_scale 65536.0000 (65536.0000) mem 7376MB [2024-08-19 23:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1010/1251] eta 0:01:01 lr 0.000041 wd 0.0500 time 0.2417 (0.2547) data time 0.0011 (0.0017) model time 0.2406 (0.2408) loss 6.8383 (6.8369) grad_norm 8.3057 (inf) loss_scale 32768.0000 (65438.7656) mem 7376MB [2024-08-19 23:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1020/1251] eta 0:00:58 lr 0.000042 wd 0.0500 time 0.2406 (0.2545) data time 0.0008 (0.0017) model time 0.2399 (0.2407) loss 6.7775 (6.8354) grad_norm 2.3350 (inf) loss_scale 32768.0000 (65118.7777) mem 7376MB [2024-08-19 23:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1030/1251] eta 0:00:56 lr 0.000042 wd 0.0500 time 0.2396 (0.2544) data time 0.0011 (0.0017) model time 0.2385 (0.2407) loss 6.6384 (6.8339) grad_norm 3.4470 (inf) loss_scale 32768.0000 (64804.9971) mem 7376MB [2024-08-19 23:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1040/1251] eta 0:00:53 lr 0.000043 wd 0.0500 time 0.2453 (0.2543) data time 0.0007 (0.0017) model time 0.2446 (0.2407) loss 6.7876 (6.8328) grad_norm 5.7525 (inf) loss_scale 32768.0000 (64497.2450) mem 7376MB [2024-08-19 23:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1050/1251] eta 0:00:51 lr 0.000043 wd 0.0500 time 0.2429 (0.2542) data time 0.0010 (0.0017) model time 0.2419 (0.2407) loss 6.6882 (6.8314) grad_norm 4.3996 (inf) loss_scale 32768.0000 (64195.3492) mem 7376MB [2024-08-19 23:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1060/1251] eta 0:00:48 lr 0.000043 wd 0.0500 time 0.2525 (0.2541) data time 0.0008 (0.0017) model time 0.2517 (0.2408) loss 6.6215 (6.8306) grad_norm 4.0314 (inf) loss_scale 
32768.0000 (63899.1442) mem 7376MB [2024-08-19 23:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1070/1251] eta 0:00:45 lr 0.000044 wd 0.0500 time 0.2369 (0.2539) data time 0.0011 (0.0017) model time 0.2358 (0.2407) loss 6.7836 (6.8294) grad_norm 4.0629 (inf) loss_scale 32768.0000 (63608.4706) mem 7376MB [2024-08-19 23:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1080/1251] eta 0:00:43 lr 0.000044 wd 0.0500 time 0.2397 (0.2538) data time 0.0010 (0.0017) model time 0.2387 (0.2407) loss 6.4689 (6.8279) grad_norm 3.5000 (inf) loss_scale 32768.0000 (63323.1748) mem 7376MB [2024-08-19 23:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1090/1251] eta 0:00:40 lr 0.000045 wd 0.0500 time 0.2334 (0.2537) data time 0.0011 (0.0017) model time 0.2322 (0.2407) loss 6.7364 (6.8265) grad_norm 3.2389 (inf) loss_scale 32768.0000 (63043.1091) mem 7376MB [2024-08-19 23:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1100/1251] eta 0:00:38 lr 0.000045 wd 0.0500 time 0.2428 (0.2536) data time 0.0011 (0.0017) model time 0.2418 (0.2407) loss 6.6561 (6.8254) grad_norm 3.4803 (inf) loss_scale 32768.0000 (62768.1308) mem 7376MB [2024-08-19 23:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1110/1251] eta 0:00:35 lr 0.000045 wd 0.0500 time 0.2378 (0.2534) data time 0.0009 (0.0017) model time 0.2369 (0.2407) loss 6.7264 (6.8241) grad_norm 3.1193 (inf) loss_scale 32768.0000 (62498.1026) mem 7376MB [2024-08-19 23:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1120/1251] eta 0:00:33 lr 0.000046 wd 0.0500 time 0.2458 (0.2533) data time 0.0009 (0.0017) model time 0.2448 (0.2407) loss 6.5885 (6.8231) grad_norm 2.9074 (inf) loss_scale 32768.0000 (62232.8921) mem 7376MB [2024-08-19 23:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1130/1251] eta 0:00:30 lr 0.000046 wd 0.0500 time 0.2473 (0.2532) data time 0.0007 
(0.0017) model time 0.2466 (0.2407) loss 6.8458 (6.8219) grad_norm 3.5587 (inf) loss_scale 32768.0000 (61972.3714) mem 7376MB [2024-08-19 23:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1140/1251] eta 0:00:28 lr 0.000047 wd 0.0500 time 0.2297 (0.2531) data time 0.0012 (0.0017) model time 0.2285 (0.2407) loss 6.6732 (6.8205) grad_norm 3.2867 (inf) loss_scale 32768.0000 (61716.4172) mem 7376MB [2024-08-19 23:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1150/1251] eta 0:00:25 lr 0.000047 wd 0.0500 time 0.2412 (0.2530) data time 0.0009 (0.0017) model time 0.2402 (0.2407) loss 6.7672 (6.8193) grad_norm 3.9315 (inf) loss_scale 32768.0000 (61464.9105) mem 7376MB [2024-08-19 23:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1160/1251] eta 0:00:23 lr 0.000047 wd 0.0500 time 0.2423 (0.2529) data time 0.0007 (0.0017) model time 0.2416 (0.2407) loss 6.7119 (6.8178) grad_norm 5.5533 (inf) loss_scale 32768.0000 (61217.7364) mem 7376MB [2024-08-19 23:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1170/1251] eta 0:00:20 lr 0.000048 wd 0.0500 time 0.2395 (0.2528) data time 0.0010 (0.0017) model time 0.2386 (0.2406) loss 6.4706 (6.8166) grad_norm 7.0023 (inf) loss_scale 32768.0000 (60974.7839) mem 7376MB [2024-08-19 23:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1180/1251] eta 0:00:17 lr 0.000048 wd 0.0500 time 0.2396 (0.2526) data time 0.0011 (0.0016) model time 0.2384 (0.2406) loss 6.6519 (6.8156) grad_norm 3.3460 (inf) loss_scale 32768.0000 (60735.9458) mem 7376MB [2024-08-19 23:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1190/1251] eta 0:00:15 lr 0.000049 wd 0.0500 time 0.2379 (0.2525) data time 0.0010 (0.0016) model time 0.2368 (0.2405) loss 6.5454 (6.8144) grad_norm 2.3601 (inf) loss_scale 32768.0000 (60501.1184) mem 7376MB [2024-08-19 23:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[0/300][1200/1251] eta 0:00:12 lr 0.000049 wd 0.0500 time 0.2425 (0.2524) data time 0.0008 (0.0016) model time 0.2417 (0.2405) loss 6.6919 (6.8134) grad_norm 4.5518 (inf) loss_scale 32768.0000 (60270.2015) mem 7376MB [2024-08-19 23:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1210/1251] eta 0:00:10 lr 0.000049 wd 0.0500 time 0.2326 (0.2523) data time 0.0009 (0.0016) model time 0.2316 (0.2405) loss 6.6147 (6.8125) grad_norm 7.8804 (inf) loss_scale 32768.0000 (60043.0983) mem 7376MB [2024-08-19 23:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1220/1251] eta 0:00:07 lr 0.000050 wd 0.0500 time 0.2437 (0.2522) data time 0.0008 (0.0016) model time 0.2428 (0.2405) loss 6.5992 (6.8112) grad_norm 2.8572 (inf) loss_scale 32768.0000 (59819.7150) mem 7376MB [2024-08-19 23:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1230/1251] eta 0:00:05 lr 0.000050 wd 0.0500 time 0.2425 (0.2521) data time 0.0011 (0.0016) model time 0.2414 (0.2405) loss 6.7498 (6.8100) grad_norm 4.0815 (inf) loss_scale 32768.0000 (59599.9610) mem 7376MB [2024-08-19 23:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1240/1251] eta 0:00:02 lr 0.000051 wd 0.0500 time 0.2203 (0.2519) data time 0.0005 (0.0016) model time 0.2198 (0.2404) loss 6.8293 (6.8087) grad_norm 4.1167 (inf) loss_scale 32768.0000 (59383.7486) mem 7376MB [2024-08-19 23:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [0/300][1250/1251] eta 0:00:00 lr 0.000051 wd 0.0500 time 0.2291 (0.2517) data time 0.0006 (0.0016) model time 0.2285 (0.2402) loss 6.6649 (6.8077) grad_norm 3.8044 (inf) loss_scale 32768.0000 (59170.9928) mem 7376MB [2024-08-19 23:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 0 training takes 0:05:14 [2024-08-19 23:31:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-19 23:31:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-19 23:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.480 (0.480) Loss 5.9375 (5.9375) Acc@1 1.953 (1.953) Acc@5 13.770 (13.770) Mem 7376MB
[2024-08-19 23:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.115) Loss 6.4844 (6.2358) Acc@1 0.195 (1.199) Acc@5 2.539 (5.540) Mem 7376MB
[2024-08-19 23:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.097) Loss 6.5000 (6.2427) Acc@1 0.098 (1.618) Acc@5 1.172 (6.045) Mem 7376MB
[2024-08-19 23:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.090) Loss 6.2461 (6.2576) Acc@1 4.102 (1.833) Acc@5 9.180 (6.348) Mem 7376MB
[2024-08-19 23:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 6.3555 (6.2798) Acc@1 1.367 (1.813) Acc@5 5.469 (6.252) Mem 7376MB
[2024-08-19 23:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 2.204 Acc@5 7.346
[2024-08-19 23:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 2.2%
[2024-08-19 23:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 2.20%
[2024-08-19 23:31:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-19 23:31:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-19 23:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.423 (0.423) Loss 6.9922 (6.9922) Acc@1 0.000 (0.000) Acc@5 0.488 (0.488) Mem 7376MB
[2024-08-19 23:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.111) Loss 6.9492 (6.9698) Acc@1 0.000 (0.062) Acc@5 0.879 (0.568) Mem 7376MB
[2024-08-19 23:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.096) Loss 6.9961 (6.9775) Acc@1 0.488 (0.084) Acc@5 0.879 (0.507) Mem 7376MB
[2024-08-19 23:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 7.0000 (6.9806) Acc@1 0.098 (0.107) Acc@5 0.391 (0.542) Mem 7376MB
[2024-08-19 23:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 6.9648 (6.9794) Acc@1 0.000 (0.117) Acc@5 0.586 (0.536) Mem 7376MB
[2024-08-19 23:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.108 Acc@5 0.512
[2024-08-19 23:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.1%
[2024-08-19 23:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.11%
[2024-08-19 23:31:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-19 23:31:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-19 23:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][0/1251] eta 0:16:39 lr 0.000051 wd 0.0500 time 0.7990 (0.7990) data time 0.5257 (0.5257) model time 0.0000 (0.0000) loss 6.5453 (6.5453) grad_norm 4.3234 (4.3234) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][10/1251] eta 0:06:02 lr 0.000051 wd 0.0500 time 0.2464 (0.2923) data time 0.0009 (0.0489) model time 0.0000 (0.0000) loss 6.5898 (6.6652) grad_norm 4.8098 (4.3153) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][20/1251] eta 0:05:30 lr 0.000052 wd 0.0500 time 0.2402 (0.2681) data time 0.0012 (0.0261) model time 0.0000 (0.0000) loss 6.5977 (6.6698) grad_norm 4.2684 (4.4003) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][30/1251] eta 0:05:17 lr 0.000052 wd 0.0500 time 0.2317 (0.2599) data time 0.0010 (0.0180) model time 0.0000 (0.0000) loss 6.6395 (6.6819) grad_norm 2.2771 (4.3970) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][40/1251] eta 0:05:10 lr 0.000053 wd 0.0500 time 0.2430 (0.2561) data time 0.0015 (0.0139) model time 0.0000 (0.0000) loss 6.5846 (6.6770) grad_norm 3.2052 (4.2876) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][50/1251] eta 0:05:08 lr 0.000053 wd 0.0500 time 0.4384 (0.2573) data time 0.0010 (0.0114) model time 0.0000 (0.0000) loss 6.6348 (6.6763) grad_norm 6.4539 (4.2882) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][60/1251] eta 0:05:03 lr 0.000053 wd 0.0500 time 0.2437 (0.2546) data time 0.0011 (0.0097) model time 0.2426 (0.2400) 
loss 6.6500 (6.6643) grad_norm 4.7901 (4.1784) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][70/1251] eta 0:04:58 lr 0.000054 wd 0.0500 time 0.2374 (0.2528) data time 0.0008 (0.0085) model time 0.2367 (0.2405) loss 6.6241 (6.6516) grad_norm 3.6955 (4.2308) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][80/1251] eta 0:04:54 lr 0.000054 wd 0.0500 time 0.2439 (0.2514) data time 0.0008 (0.0076) model time 0.2431 (0.2403) loss 6.6929 (6.6537) grad_norm 4.4408 (4.3273) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][90/1251] eta 0:04:50 lr 0.000055 wd 0.0500 time 0.2465 (0.2505) data time 0.0010 (0.0069) model time 0.2456 (0.2407) loss 6.5094 (6.6535) grad_norm 4.8863 (4.3026) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][100/1251] eta 0:04:47 lr 0.000055 wd 0.0500 time 0.2436 (0.2494) data time 0.0007 (0.0063) model time 0.2428 (0.2403) loss 6.6800 (6.6562) grad_norm 3.0324 (4.2552) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][110/1251] eta 0:04:43 lr 0.000055 wd 0.0500 time 0.2387 (0.2488) data time 0.0014 (0.0058) model time 0.2373 (0.2406) loss 6.6313 (6.6571) grad_norm 6.0069 (4.3569) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][120/1251] eta 0:04:40 lr 0.000056 wd 0.0500 time 0.2386 (0.2481) data time 0.0010 (0.0054) model time 0.2376 (0.2403) loss 6.5966 (6.6593) grad_norm 3.9690 (4.3938) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][130/1251] eta 
0:04:37 lr 0.000056 wd 0.0500 time 0.2449 (0.2478) data time 0.0011 (0.0051) model time 0.2438 (0.2406) loss 6.6241 (6.6591) grad_norm 2.5616 (4.3340) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][140/1251] eta 0:04:34 lr 0.000057 wd 0.0500 time 0.2434 (0.2473) data time 0.0008 (0.0048) model time 0.2426 (0.2406) loss 6.3259 (6.6587) grad_norm 3.4130 (4.4246) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][150/1251] eta 0:04:31 lr 0.000057 wd 0.0500 time 0.2478 (0.2469) data time 0.0008 (0.0046) model time 0.2470 (0.2406) loss 6.7798 (6.6584) grad_norm 4.3581 (4.4721) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][160/1251] eta 0:04:28 lr 0.000057 wd 0.0500 time 0.2391 (0.2465) data time 0.0007 (0.0044) model time 0.2384 (0.2405) loss 6.6060 (6.6582) grad_norm 5.6808 (4.4758) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][170/1251] eta 0:04:26 lr 0.000058 wd 0.0500 time 0.2463 (0.2463) data time 0.0010 (0.0042) model time 0.2453 (0.2405) loss 6.7316 (6.6580) grad_norm 3.9854 (4.4586) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][180/1251] eta 0:04:23 lr 0.000058 wd 0.0500 time 0.2405 (0.2459) data time 0.0009 (0.0040) model time 0.2396 (0.2403) loss 6.7275 (6.6567) grad_norm 3.9243 (4.5111) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][190/1251] eta 0:04:20 lr 0.000059 wd 0.0500 time 0.2387 (0.2457) data time 0.0007 (0.0039) model time 0.2380 (0.2403) loss 6.7396 (6.6543) grad_norm 3.7536 (4.4860) loss_scale 32768.0000 (32768.0000) mem 7376MB 
[2024-08-19 23:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][200/1251] eta 0:04:17 lr 0.000059 wd 0.0500 time 0.2467 (0.2454) data time 0.0011 (0.0038) model time 0.2456 (0.2402) loss 6.5956 (6.6521) grad_norm 4.6144 (4.4658) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][210/1251] eta 0:04:15 lr 0.000059 wd 0.0500 time 0.2403 (0.2451) data time 0.0010 (0.0036) model time 0.2393 (0.2400) loss 6.7109 (6.6522) grad_norm 5.6832 (4.4636) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][220/1251] eta 0:04:12 lr 0.000060 wd 0.0500 time 0.2463 (0.2448) data time 0.0010 (0.0035) model time 0.2453 (0.2400) loss 6.7267 (6.6501) grad_norm 2.4726 (4.4507) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][230/1251] eta 0:04:09 lr 0.000060 wd 0.0500 time 0.2397 (0.2446) data time 0.0008 (0.0034) model time 0.2389 (0.2399) loss 6.4793 (6.6476) grad_norm 5.8117 (4.4643) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][240/1251] eta 0:04:08 lr 0.000061 wd 0.0500 time 0.2420 (0.2454) data time 0.0010 (0.0033) model time 0.2410 (0.2411) loss 6.5419 (6.6432) grad_norm 4.2556 (4.4710) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][250/1251] eta 0:04:05 lr 0.000061 wd 0.0500 time 0.2432 (0.2452) data time 0.0007 (0.0032) model time 0.2425 (0.2410) loss 6.5995 (6.6405) grad_norm 1.7760 (4.4207) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][260/1251] eta 0:04:02 lr 0.000061 wd 0.0500 time 0.2308 (0.2451) data time 0.0008 (0.0031) model time 0.2300 
(0.2409) loss 6.6927 (6.6410) grad_norm 3.5034 (4.3909) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][270/1251] eta 0:04:00 lr 0.000062 wd 0.0500 time 0.2452 (0.2450) data time 0.0007 (0.0031) model time 0.2445 (0.2410) loss 6.6453 (6.6409) grad_norm 4.9749 (4.3850) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][280/1251] eta 0:03:57 lr 0.000062 wd 0.0500 time 0.2429 (0.2448) data time 0.0010 (0.0030) model time 0.2419 (0.2409) loss 6.8191 (6.6386) grad_norm 3.1809 (4.3854) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][290/1251] eta 0:03:55 lr 0.000063 wd 0.0500 time 0.2438 (0.2447) data time 0.0010 (0.0029) model time 0.2428 (0.2410) loss 6.6226 (6.6381) grad_norm 4.4535 (4.3604) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][300/1251] eta 0:03:52 lr 0.000063 wd 0.0500 time 0.2350 (0.2446) data time 0.0009 (0.0029) model time 0.2341 (0.2409) loss 6.5305 (6.6361) grad_norm 3.8282 (4.3534) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][310/1251] eta 0:03:50 lr 0.000063 wd 0.0500 time 0.2419 (0.2445) data time 0.0010 (0.0028) model time 0.2409 (0.2409) loss 6.2301 (6.6341) grad_norm 6.3134 (4.3440) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][320/1251] eta 0:03:47 lr 0.000064 wd 0.0500 time 0.2332 (0.2443) data time 0.0010 (0.0028) model time 0.2323 (0.2408) loss 6.6970 (6.6314) grad_norm 4.1159 (4.3401) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[1/300][330/1251] eta 0:03:44 lr 0.000064 wd 0.0500 time 0.2328 (0.2443) data time 0.0009 (0.0027) model time 0.2319 (0.2408) loss 6.6018 (6.6283) grad_norm 3.9237 (4.3465) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][340/1251] eta 0:03:42 lr 0.000065 wd 0.0500 time 0.2385 (0.2442) data time 0.0012 (0.0027) model time 0.2374 (0.2408) loss 6.7518 (6.6272) grad_norm 4.3495 (4.3480) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][350/1251] eta 0:03:40 lr 0.000065 wd 0.0500 time 0.2398 (0.2447) data time 0.0010 (0.0026) model time 0.2388 (0.2415) loss 6.5628 (6.6276) grad_norm 3.6618 (4.3480) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][360/1251] eta 0:03:37 lr 0.000065 wd 0.0500 time 0.2394 (0.2446) data time 0.0007 (0.0026) model time 0.2387 (0.2415) loss 6.7299 (6.6274) grad_norm 3.7115 (4.3281) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][370/1251] eta 0:03:35 lr 0.000066 wd 0.0500 time 0.2334 (0.2445) data time 0.0012 (0.0025) model time 0.2323 (0.2414) loss 6.7079 (6.6258) grad_norm 4.4999 (4.3126) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][380/1251] eta 0:03:32 lr 0.000066 wd 0.0500 time 0.2450 (0.2445) data time 0.0010 (0.0025) model time 0.2440 (0.2414) loss 6.7510 (6.6251) grad_norm 4.8633 (4.3255) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][390/1251] eta 0:03:30 lr 0.000067 wd 0.0500 time 0.2408 (0.2444) data time 0.0010 (0.0025) model time 0.2398 (0.2414) loss 6.6786 (6.6248) grad_norm 4.2062 (4.3216) loss_scale 32768.0000 
(32768.0000) mem 7376MB [2024-08-19 23:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][400/1251] eta 0:03:28 lr 0.000067 wd 0.0500 time 0.2366 (0.2444) data time 0.0010 (0.0024) model time 0.2357 (0.2414) loss 6.5330 (6.6204) grad_norm 3.8784 (4.3336) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][410/1251] eta 0:03:25 lr 0.000067 wd 0.0500 time 0.2436 (0.2444) data time 0.0012 (0.0024) model time 0.2424 (0.2414) loss 6.5030 (6.6171) grad_norm 5.6932 (4.3414) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][420/1251] eta 0:03:22 lr 0.000068 wd 0.0500 time 0.2444 (0.2443) data time 0.0008 (0.0024) model time 0.2437 (0.2414) loss 6.4215 (6.6167) grad_norm 4.0713 (4.3738) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][430/1251] eta 0:03:20 lr 0.000068 wd 0.0500 time 0.2427 (0.2442) data time 0.0011 (0.0023) model time 0.2416 (0.2413) loss 6.5524 (6.6156) grad_norm 5.8545 (4.3836) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][440/1251] eta 0:03:17 lr 0.000069 wd 0.0500 time 0.2370 (0.2441) data time 0.0009 (0.0023) model time 0.2361 (0.2413) loss 6.6228 (6.6148) grad_norm 4.8891 (4.3887) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][450/1251] eta 0:03:15 lr 0.000069 wd 0.0500 time 0.2417 (0.2441) data time 0.0010 (0.0023) model time 0.2407 (0.2413) loss 6.8056 (6.6155) grad_norm 2.9350 (4.3785) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][460/1251] eta 0:03:13 lr 0.000069 wd 0.0500 time 0.2441 (0.2440) data time 0.0012 
(0.0023) model time 0.2429 (0.2413) loss 6.5497 (6.6137) grad_norm 4.6905 (4.3870) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][470/1251] eta 0:03:10 lr 0.000070 wd 0.0500 time 0.2462 (0.2439) data time 0.0010 (0.0022) model time 0.2452 (0.2412) loss 6.7323 (6.6112) grad_norm 3.1426 (4.3680) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][480/1251] eta 0:03:08 lr 0.000070 wd 0.0500 time 0.2434 (0.2439) data time 0.0009 (0.0022) model time 0.2425 (0.2412) loss 6.7809 (6.6115) grad_norm 5.3766 (4.3821) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][490/1251] eta 0:03:05 lr 0.000071 wd 0.0500 time 0.2411 (0.2438) data time 0.0009 (0.0022) model time 0.2402 (0.2412) loss 6.6786 (6.6098) grad_norm 2.6642 (4.3742) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][500/1251] eta 0:03:03 lr 0.000071 wd 0.0500 time 0.2346 (0.2438) data time 0.0010 (0.0022) model time 0.2336 (0.2411) loss 6.4889 (6.6084) grad_norm 6.0191 (4.3748) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][510/1251] eta 0:03:00 lr 0.000071 wd 0.0500 time 0.2440 (0.2437) data time 0.0011 (0.0021) model time 0.2429 (0.2411) loss 6.5661 (6.6073) grad_norm 4.1703 (4.3846) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][520/1251] eta 0:02:58 lr 0.000072 wd 0.0500 time 0.2486 (0.2437) data time 0.0010 (0.0021) model time 0.2476 (0.2411) loss 6.4130 (6.6062) grad_norm 4.4405 (4.3835) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [1/300][530/1251] eta 0:02:55 lr 0.000072 wd 0.0500 time 0.2436 (0.2436) data time 0.0011 (0.0021) model time 0.2425 (0.2411) loss 6.5029 (6.6063) grad_norm 3.9871 (4.3730) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][540/1251] eta 0:02:53 lr 0.000073 wd 0.0500 time 0.2364 (0.2436) data time 0.0012 (0.0021) model time 0.2351 (0.2410) loss 6.7429 (6.6046) grad_norm 5.2852 (4.3691) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][550/1251] eta 0:02:50 lr 0.000073 wd 0.0500 time 0.2470 (0.2435) data time 0.0010 (0.0021) model time 0.2460 (0.2410) loss 6.4758 (6.6035) grad_norm 4.0516 (4.3627) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][560/1251] eta 0:02:48 lr 0.000073 wd 0.0500 time 0.2447 (0.2435) data time 0.0009 (0.0020) model time 0.2438 (0.2410) loss 6.4540 (6.6020) grad_norm 4.2463 (4.3485) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][570/1251] eta 0:02:45 lr 0.000074 wd 0.0500 time 0.2442 (0.2434) data time 0.0010 (0.0020) model time 0.2432 (0.2410) loss 6.3657 (6.6004) grad_norm 4.6802 (4.3418) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][580/1251] eta 0:02:43 lr 0.000074 wd 0.0500 time 0.2404 (0.2434) data time 0.0011 (0.0020) model time 0.2393 (0.2410) loss 6.5828 (6.5999) grad_norm 6.9094 (4.3434) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][590/1251] eta 0:02:40 lr 0.000075 wd 0.0500 time 0.2410 (0.2434) data time 0.0009 (0.0020) model time 0.2402 (0.2410) loss 6.4341 (6.5971) grad_norm 4.1402 (4.3359) loss_scale 
32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][600/1251] eta 0:02:38 lr 0.000075 wd 0.0500 time 0.2362 (0.2434) data time 0.0009 (0.0020) model time 0.2353 (0.2410) loss 6.4251 (6.5942) grad_norm 2.9411 (4.3166) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][610/1251] eta 0:02:35 lr 0.000075 wd 0.0500 time 0.2442 (0.2433) data time 0.0010 (0.0020) model time 0.2432 (0.2410) loss 6.6362 (6.5932) grad_norm 6.6622 (4.3206) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][620/1251] eta 0:02:33 lr 0.000076 wd 0.0500 time 0.2290 (0.2433) data time 0.0008 (0.0019) model time 0.2283 (0.2410) loss 6.4909 (6.5907) grad_norm 3.9219 (4.3140) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][630/1251] eta 0:02:31 lr 0.000076 wd 0.0500 time 0.2401 (0.2433) data time 0.0011 (0.0019) model time 0.2390 (0.2410) loss 6.4956 (6.5904) grad_norm 6.2658 (4.3257) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][640/1251] eta 0:02:28 lr 0.000077 wd 0.0500 time 0.2350 (0.2433) data time 0.0007 (0.0019) model time 0.2343 (0.2409) loss 6.4336 (6.5889) grad_norm 4.4012 (4.3419) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][650/1251] eta 0:02:26 lr 0.000077 wd 0.0500 time 0.2419 (0.2432) data time 0.0011 (0.0019) model time 0.2408 (0.2409) loss 6.5832 (6.5884) grad_norm 3.9068 (4.3467) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][660/1251] eta 0:02:23 lr 0.000077 wd 0.0500 time 0.2375 (0.2432) data time 
0.0011 (0.0019) model time 0.2363 (0.2409) loss 6.6304 (6.5880) grad_norm 4.6302 (4.3408) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][670/1251] eta 0:02:21 lr 0.000078 wd 0.0500 time 0.2301 (0.2431) data time 0.0011 (0.0019) model time 0.2290 (0.2408) loss 6.4990 (6.5857) grad_norm 4.9995 (4.3413) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][680/1251] eta 0:02:18 lr 0.000078 wd 0.0500 time 0.2471 (0.2431) data time 0.0010 (0.0019) model time 0.2461 (0.2409) loss 6.3130 (6.5842) grad_norm 3.9626 (4.3376) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][690/1251] eta 0:02:16 lr 0.000079 wd 0.0500 time 0.2391 (0.2431) data time 0.0010 (0.0019) model time 0.2381 (0.2408) loss 6.6619 (6.5833) grad_norm 4.2035 (4.3291) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][700/1251] eta 0:02:13 lr 0.000079 wd 0.0500 time 0.2399 (0.2430) data time 0.0012 (0.0019) model time 0.2387 (0.2408) loss 6.4759 (6.5823) grad_norm 3.8292 (4.3331) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][710/1251] eta 0:02:11 lr 0.000079 wd 0.0500 time 0.2384 (0.2430) data time 0.0008 (0.0018) model time 0.2377 (0.2408) loss 6.7443 (6.5809) grad_norm 2.9709 (4.3346) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][720/1251] eta 0:02:09 lr 0.000080 wd 0.0500 time 0.2364 (0.2430) data time 0.0011 (0.0018) model time 0.2353 (0.2408) loss 6.4812 (6.5783) grad_norm 3.3513 (4.3202) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [1/300][730/1251] eta 0:02:06 lr 0.000080 wd 0.0500 time 0.2336 (0.2429) data time 0.0012 (0.0018) model time 0.2325 (0.2407) loss 6.2638 (6.5768) grad_norm 2.3104 (4.3110) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][740/1251] eta 0:02:04 lr 0.000080 wd 0.0500 time 0.2349 (0.2428) data time 0.0011 (0.0018) model time 0.2339 (0.2407) loss 6.4847 (6.5762) grad_norm 3.5961 (4.2974) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][750/1251] eta 0:02:01 lr 0.000081 wd 0.0500 time 0.2436 (0.2428) data time 0.0007 (0.0018) model time 0.2429 (0.2406) loss 6.2562 (6.5738) grad_norm 5.2626 (4.2920) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][760/1251] eta 0:01:59 lr 0.000081 wd 0.0500 time 0.2400 (0.2430) data time 0.0010 (0.0018) model time 0.2390 (0.2409) loss 6.6183 (6.5742) grad_norm 8.3599 (4.2976) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][770/1251] eta 0:01:56 lr 0.000082 wd 0.0500 time 0.2413 (0.2430) data time 0.0012 (0.0018) model time 0.2402 (0.2409) loss 6.5262 (6.5733) grad_norm 3.2098 (4.2860) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][780/1251] eta 0:01:54 lr 0.000082 wd 0.0500 time 0.2388 (0.2429) data time 0.0010 (0.0018) model time 0.2378 (0.2408) loss 6.7483 (6.5729) grad_norm 2.7131 (4.2772) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][790/1251] eta 0:01:51 lr 0.000082 wd 0.0500 time 0.2422 (0.2429) data time 0.0012 (0.0018) model time 0.2410 (0.2408) loss 6.5257 (6.5727) grad_norm 4.4387 (4.2872) 
loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][800/1251] eta 0:01:49 lr 0.000083 wd 0.0500 time 0.2511 (0.2428) data time 0.0010 (0.0018) model time 0.2501 (0.2408) loss 6.6195 (6.5715) grad_norm 3.7160 (4.2898) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][810/1251] eta 0:01:47 lr 0.000083 wd 0.0500 time 0.2364 (0.2428) data time 0.0009 (0.0017) model time 0.2355 (0.2408) loss 6.0987 (6.5692) grad_norm 3.4275 (4.2843) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][820/1251] eta 0:01:44 lr 0.000084 wd 0.0500 time 0.2407 (0.2428) data time 0.0011 (0.0017) model time 0.2396 (0.2408) loss 6.4878 (6.5680) grad_norm 4.5750 (4.2782) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][830/1251] eta 0:01:42 lr 0.000084 wd 0.0500 time 0.2430 (0.2428) data time 0.0008 (0.0017) model time 0.2422 (0.2408) loss 6.1494 (6.5662) grad_norm 3.7188 (4.2787) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][840/1251] eta 0:01:39 lr 0.000084 wd 0.0500 time 0.2380 (0.2428) data time 0.0008 (0.0017) model time 0.2373 (0.2408) loss 6.0951 (6.5650) grad_norm 2.9927 (4.2812) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][850/1251] eta 0:01:37 lr 0.000085 wd 0.0500 time 0.2362 (0.2428) data time 0.0009 (0.0017) model time 0.2353 (0.2407) loss 6.5011 (6.5635) grad_norm 3.8861 (4.2776) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][860/1251] eta 0:01:34 lr 0.000085 wd 0.0500 time 0.2507 (0.2427) 
data time 0.0011 (0.0017) model time 0.2496 (0.2407) loss 6.1873 (6.5623) grad_norm 3.3547 (4.2755) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][870/1251] eta 0:01:32 lr 0.000086 wd 0.0500 time 0.2414 (0.2428) data time 0.0008 (0.0017) model time 0.2407 (0.2408) loss 6.2340 (6.5611) grad_norm 4.3749 (4.2739) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][880/1251] eta 0:01:30 lr 0.000086 wd 0.0500 time 0.2439 (0.2427) data time 0.0008 (0.0017) model time 0.2431 (0.2407) loss 6.4879 (6.5605) grad_norm 4.9621 (4.2650) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][890/1251] eta 0:01:27 lr 0.000086 wd 0.0500 time 0.2448 (0.2427) data time 0.0010 (0.0017) model time 0.2438 (0.2408) loss 6.6383 (6.5604) grad_norm 4.3879 (4.2674) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][900/1251] eta 0:01:25 lr 0.000087 wd 0.0500 time 0.2392 (0.2427) data time 0.0011 (0.0017) model time 0.2382 (0.2407) loss 6.6882 (6.5593) grad_norm 4.7653 (4.2700) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][910/1251] eta 0:01:22 lr 0.000087 wd 0.0500 time 0.2455 (0.2427) data time 0.0011 (0.0017) model time 0.2444 (0.2407) loss 6.4677 (6.5583) grad_norm 4.6179 (4.2633) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][920/1251] eta 0:01:20 lr 0.000088 wd 0.0500 time 0.2367 (0.2427) data time 0.0008 (0.0017) model time 0.2359 (0.2407) loss 6.1426 (6.5579) grad_norm 2.9187 (4.2665) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:10 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [1/300][930/1251] eta 0:01:17 lr 0.000088 wd 0.0500 time 0.2470 (0.2427) data time 0.0011 (0.0017) model time 0.2459 (0.2407) loss 6.6250 (6.5573) grad_norm 5.1926 (4.2672) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][940/1251] eta 0:01:15 lr 0.000088 wd 0.0500 time 0.2404 (0.2427) data time 0.0010 (0.0017) model time 0.2394 (0.2407) loss 6.2837 (6.5568) grad_norm 3.2010 (4.2649) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][950/1251] eta 0:01:13 lr 0.000089 wd 0.0500 time 0.2462 (0.2427) data time 0.0011 (0.0017) model time 0.2451 (0.2407) loss 6.4059 (6.5549) grad_norm 3.8174 (4.2583) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][960/1251] eta 0:01:10 lr 0.000089 wd 0.0500 time 0.2392 (0.2426) data time 0.0009 (0.0016) model time 0.2384 (0.2407) loss 6.6792 (6.5535) grad_norm 3.1765 (4.2575) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][970/1251] eta 0:01:08 lr 0.000090 wd 0.0500 time 0.2390 (0.2426) data time 0.0012 (0.0016) model time 0.2379 (0.2407) loss 6.3910 (6.5521) grad_norm 4.3487 (4.2582) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][980/1251] eta 0:01:05 lr 0.000090 wd 0.0500 time 0.4292 (0.2428) data time 0.0009 (0.0016) model time 0.4283 (0.2409) loss 6.1172 (6.5500) grad_norm 4.1796 (4.2551) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][990/1251] eta 0:01:03 lr 0.000090 wd 0.0500 time 0.2344 (0.2428) data time 0.0011 (0.0016) model time 0.2333 (0.2409) loss 6.4540 (6.5486) grad_norm 
3.8736 (4.2519) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1000/1251] eta 0:01:00 lr 0.000091 wd 0.0500 time 0.2406 (0.2428) data time 0.0008 (0.0016) model time 0.2398 (0.2409) loss 6.5911 (6.5463) grad_norm 2.6548 (4.2554) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1010/1251] eta 0:00:58 lr 0.000091 wd 0.0500 time 0.2380 (0.2427) data time 0.0011 (0.0016) model time 0.2369 (0.2409) loss 6.6394 (6.5449) grad_norm 2.5716 (4.2448) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1020/1251] eta 0:00:56 lr 0.000092 wd 0.0500 time 0.2373 (0.2427) data time 0.0008 (0.0016) model time 0.2364 (0.2408) loss 6.5516 (6.5442) grad_norm 4.6366 (4.2371) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1030/1251] eta 0:00:53 lr 0.000092 wd 0.0500 time 0.2414 (0.2427) data time 0.0010 (0.0016) model time 0.2404 (0.2408) loss 6.8436 (6.5440) grad_norm 3.9428 (4.2351) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1040/1251] eta 0:00:51 lr 0.000092 wd 0.0500 time 0.2382 (0.2427) data time 0.0007 (0.0016) model time 0.2375 (0.2408) loss 6.3244 (6.5431) grad_norm 3.2215 (4.2309) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1050/1251] eta 0:00:48 lr 0.000093 wd 0.0500 time 0.2434 (0.2427) data time 0.0009 (0.0016) model time 0.2425 (0.2408) loss 6.2271 (6.5410) grad_norm 3.3898 (4.2321) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1060/1251] eta 0:00:46 lr 0.000093 wd 
0.0500 time 0.2482 (0.2427) data time 0.0011 (0.0016) model time 0.2471 (0.2408) loss 6.6161 (6.5401) grad_norm 3.8400 (4.2312) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1070/1251] eta 0:00:43 lr 0.000094 wd 0.0500 time 0.2380 (0.2427) data time 0.0009 (0.0016) model time 0.2371 (0.2408) loss 6.7476 (6.5401) grad_norm 2.8824 (4.2241) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1080/1251] eta 0:00:41 lr 0.000094 wd 0.0500 time 0.2414 (0.2426) data time 0.0007 (0.0016) model time 0.2406 (0.2408) loss 6.6885 (6.5405) grad_norm 4.6177 (4.2272) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1090/1251] eta 0:00:39 lr 0.000094 wd 0.0500 time 0.2399 (0.2426) data time 0.0011 (0.0016) model time 0.2388 (0.2408) loss 6.5812 (6.5397) grad_norm 3.4005 (4.2237) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1100/1251] eta 0:00:36 lr 0.000095 wd 0.0500 time 0.2422 (0.2426) data time 0.0007 (0.0016) model time 0.2415 (0.2408) loss 6.3672 (6.5385) grad_norm 2.9195 (4.2186) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1110/1251] eta 0:00:34 lr 0.000095 wd 0.0500 time 0.2343 (0.2426) data time 0.0009 (0.0016) model time 0.2334 (0.2408) loss 6.4561 (6.5368) grad_norm 2.3499 (4.2095) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1120/1251] eta 0:00:31 lr 0.000096 wd 0.0500 time 0.2372 (0.2426) data time 0.0009 (0.0016) model time 0.2363 (0.2408) loss 6.5613 (6.5354) grad_norm 3.7157 (4.2055) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 
23:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1130/1251] eta 0:00:29 lr 0.000096 wd 0.0500 time 0.2350 (0.2426) data time 0.0011 (0.0016) model time 0.2338 (0.2408) loss 6.5504 (6.5347) grad_norm 2.7631 (4.2042) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1140/1251] eta 0:00:26 lr 0.000096 wd 0.0500 time 0.2350 (0.2426) data time 0.0011 (0.0016) model time 0.2339 (0.2408) loss 6.1889 (6.5326) grad_norm 3.1282 (4.1985) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1150/1251] eta 0:00:24 lr 0.000097 wd 0.0500 time 0.2454 (0.2425) data time 0.0010 (0.0016) model time 0.2444 (0.2407) loss 6.5499 (6.5314) grad_norm 4.0431 (4.1952) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1160/1251] eta 0:00:22 lr 0.000097 wd 0.0500 time 0.2364 (0.2425) data time 0.0009 (0.0016) model time 0.2354 (0.2407) loss 6.1639 (6.5304) grad_norm 5.1790 (4.1924) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1170/1251] eta 0:00:19 lr 0.000098 wd 0.0500 time 0.2429 (0.2425) data time 0.0009 (0.0016) model time 0.2420 (0.2408) loss 6.0595 (6.5280) grad_norm 3.1495 (4.1862) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1180/1251] eta 0:00:17 lr 0.000098 wd 0.0500 time 0.2373 (0.2425) data time 0.0013 (0.0016) model time 0.2360 (0.2408) loss 6.3783 (6.5275) grad_norm 2.6585 (4.1858) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1190/1251] eta 0:00:14 lr 0.000098 wd 0.0500 time 0.2413 (0.2425) data time 0.0010 (0.0015) model time 0.2404 
(0.2408) loss 6.4423 (6.5265) grad_norm 2.5984 (4.1780) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1200/1251] eta 0:00:12 lr 0.000099 wd 0.0500 time 0.2448 (0.2425) data time 0.0008 (0.0015) model time 0.2440 (0.2408) loss 6.5499 (6.5253) grad_norm 3.3176 (4.1714) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1210/1251] eta 0:00:09 lr 0.000099 wd 0.0500 time 0.2401 (0.2425) data time 0.0008 (0.0015) model time 0.2394 (0.2407) loss 5.9940 (6.5241) grad_norm 2.9650 (4.1663) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1220/1251] eta 0:00:07 lr 0.000100 wd 0.0500 time 0.2509 (0.2425) data time 0.0010 (0.0015) model time 0.2498 (0.2407) loss 6.0429 (6.5227) grad_norm 3.3357 (4.1661) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1230/1251] eta 0:00:05 lr 0.000100 wd 0.0500 time 0.2464 (0.2425) data time 0.0008 (0.0015) model time 0.2456 (0.2407) loss 6.2405 (6.5212) grad_norm 3.6250 (4.1611) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1240/1251] eta 0:00:02 lr 0.000100 wd 0.0500 time 0.2305 (0.2424) data time 0.0007 (0.0015) model time 0.2298 (0.2407) loss 6.2799 (6.5206) grad_norm 3.4423 (4.1587) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [1/300][1250/1251] eta 0:00:00 lr 0.000101 wd 0.0500 time 0.2265 (0.2423) data time 0.0005 (0.0015) model time 0.2260 (0.2405) loss 6.1796 (6.5193) grad_norm 5.1356 (4.1542) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 1 training 
takes 0:05:03 [2024-08-19 23:36:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-19 23:36:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-19 23:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.455 (0.455) Loss 5.3320 (5.3320) Acc@1 3.809 (3.809) Acc@5 21.289 (21.289) Mem 7376MB [2024-08-19 23:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.113) Loss 5.3984 (5.4364) Acc@1 6.055 (5.034) Acc@5 18.359 (16.752) Mem 7376MB [2024-08-19 23:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.097) Loss 5.9258 (5.4470) Acc@1 2.246 (5.404) Acc@5 8.691 (17.457) Mem 7376MB [2024-08-19 23:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.091) Loss 5.6250 (5.4790) Acc@1 5.371 (5.727) Acc@5 17.578 (17.912) Mem 7376MB [2024-08-19 23:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 5.6367 (5.5162) Acc@1 3.027 (5.600) Acc@5 12.891 (17.059) Mem 7376MB [2024-08-19 23:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 6.384 Acc@5 18.512 [2024-08-19 23:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 6.4% [2024-08-19 23:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 6.38% [2024-08-19 23:36:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-19 23:36:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-19 23:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.475 (0.475) Loss 6.9570 (6.9570) Acc@1 0.098 (0.098) Acc@5 0.684 (0.684) Mem 7376MB [2024-08-19 23:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.116) Loss 6.9375 (6.9489) Acc@1 0.098 (0.071) Acc@5 0.977 (0.577) Mem 7376MB [2024-08-19 23:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.098) Loss 6.9727 (6.9552) Acc@1 0.195 (0.088) Acc@5 0.586 (0.521) Mem 7376MB [2024-08-19 23:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.091) Loss 6.9844 (6.9585) Acc@1 0.098 (0.088) Acc@5 0.391 (0.548) Mem 7376MB [2024-08-19 23:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 6.9531 (6.9596) Acc@1 0.098 (0.119) Acc@5 0.586 (0.572) Mem 7376MB [2024-08-19 23:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.122 Acc@5 0.548 [2024-08-19 23:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.1% [2024-08-19 23:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.12% [2024-08-19 23:36:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-19 23:36:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-19 23:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][0/1251] eta 0:14:38 lr 0.000101 wd 0.0500 time 0.7022 (0.7022) data time 0.4724 (0.4724) model time 0.0000 (0.0000) loss 6.1529 (6.1529) grad_norm 3.6022 (3.6022) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][10/1251] eta 0:05:50 lr 0.000101 wd 0.0500 time 0.2324 (0.2824) data time 0.0011 (0.0439) model time 0.0000 (0.0000) loss 6.0342 (6.3519) grad_norm 3.1138 (3.5759) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][20/1251] eta 0:05:36 lr 0.000102 wd 0.0500 time 0.2348 (0.2734) data time 0.0009 (0.0235) model time 0.0000 (0.0000) loss 5.9488 (6.3853) grad_norm 4.5064 (4.2672) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][30/1251] eta 0:05:21 lr 0.000102 wd 0.0500 time 0.2357 (0.2634) data time 0.0009 (0.0163) model time 0.0000 (0.0000) loss 6.0648 (6.4017) grad_norm 3.4260 (4.2619) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][40/1251] eta 0:05:12 lr 0.000102 wd 0.0500 time 0.2422 (0.2577) data time 0.0009 (0.0126) model time 0.0000 (0.0000) loss 6.6426 (6.3924) grad_norm 3.0907 (3.9943) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][50/1251] eta 0:05:05 lr 0.000103 wd 0.0500 time 0.2453 (0.2544) data time 0.0011 (0.0104) model time 0.0000 (0.0000) loss 6.5601 (6.3994) grad_norm 3.6470 (3.8382) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][60/1251] eta 0:05:00 lr 0.000103 wd 0.0500 time 0.2375 (0.2523) data time 0.0010 (0.0088) model time 0.2365 (0.2404) 
loss 6.2759 (6.4167) grad_norm 2.9947 (3.8480) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][70/1251] eta 0:04:55 lr 0.000104 wd 0.0500 time 0.2581 (0.2506) data time 0.0008 (0.0077) model time 0.2572 (0.2399) loss 6.3907 (6.4256) grad_norm 3.1426 (3.9239) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][80/1251] eta 0:04:51 lr 0.000104 wd 0.0500 time 0.2362 (0.2492) data time 0.0013 (0.0069) model time 0.2349 (0.2393) loss 6.5245 (6.4301) grad_norm 3.9203 (3.9387) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][90/1251] eta 0:04:48 lr 0.000104 wd 0.0500 time 0.2348 (0.2481) data time 0.0007 (0.0063) model time 0.2341 (0.2391) loss 6.4494 (6.4200) grad_norm 3.5530 (4.0217) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][100/1251] eta 0:04:44 lr 0.000105 wd 0.0500 time 0.2457 (0.2474) data time 0.0007 (0.0058) model time 0.2450 (0.2392) loss 6.3806 (6.4180) grad_norm 6.1436 (4.0703) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][110/1251] eta 0:04:41 lr 0.000105 wd 0.0500 time 0.2460 (0.2471) data time 0.0011 (0.0053) model time 0.2449 (0.2399) loss 6.0458 (6.3999) grad_norm 3.8199 (4.0332) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][120/1251] eta 0:04:38 lr 0.000106 wd 0.0500 time 0.2319 (0.2466) data time 0.0012 (0.0050) model time 0.2307 (0.2399) loss 6.5361 (6.3987) grad_norm 2.6368 (3.9711) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][130/1251] eta 
0:04:35 lr 0.000106 wd 0.0500 time 0.2441 (0.2462) data time 0.0009 (0.0047) model time 0.2432 (0.2398) loss 6.4378 (6.3927) grad_norm 3.9263 (3.9169) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][140/1251] eta 0:04:32 lr 0.000106 wd 0.0500 time 0.2392 (0.2456) data time 0.0011 (0.0044) model time 0.2381 (0.2396) loss 6.1011 (6.3961) grad_norm 3.7657 (3.8875) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][150/1251] eta 0:04:30 lr 0.000107 wd 0.0500 time 0.2394 (0.2453) data time 0.0008 (0.0042) model time 0.2386 (0.2396) loss 5.9608 (6.3860) grad_norm 4.3220 (3.9006) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][160/1251] eta 0:04:27 lr 0.000107 wd 0.0500 time 0.2452 (0.2449) data time 0.0008 (0.0040) model time 0.2444 (0.2395) loss 6.2529 (6.3870) grad_norm 2.8432 (3.8676) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][170/1251] eta 0:04:24 lr 0.000108 wd 0.0500 time 0.2387 (0.2447) data time 0.0011 (0.0039) model time 0.2377 (0.2394) loss 6.4065 (6.3869) grad_norm 3.4966 (3.8694) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][180/1251] eta 0:04:21 lr 0.000108 wd 0.0500 time 0.2383 (0.2445) data time 0.0013 (0.0037) model time 0.2370 (0.2395) loss 6.2531 (6.3890) grad_norm 4.3778 (3.9038) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][190/1251] eta 0:04:19 lr 0.000108 wd 0.0500 time 0.2496 (0.2443) data time 0.0008 (0.0036) model time 0.2488 (0.2395) loss 6.2216 (6.3861) grad_norm 3.8324 (3.8884) loss_scale 32768.0000 (32768.0000) mem 7376MB 
[2024-08-19 23:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][200/1251] eta 0:04:16 lr 0.000109 wd 0.0500 time 0.2354 (0.2442) data time 0.0010 (0.0035) model time 0.2344 (0.2395) loss 6.4589 (6.3889) grad_norm 4.2660 (3.8761) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][210/1251] eta 0:04:13 lr 0.000109 wd 0.0500 time 0.2340 (0.2439) data time 0.0008 (0.0034) model time 0.2331 (0.2394) loss 6.3881 (6.3905) grad_norm 3.8537 (3.8658) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][220/1251] eta 0:04:11 lr 0.000110 wd 0.0500 time 0.2355 (0.2438) data time 0.0011 (0.0033) model time 0.2344 (0.2395) loss 6.3860 (6.3853) grad_norm 3.3016 (3.8468) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][230/1251] eta 0:04:08 lr 0.000110 wd 0.0500 time 0.2383 (0.2437) data time 0.0009 (0.0032) model time 0.2374 (0.2396) loss 6.5434 (6.3852) grad_norm 3.8824 (3.8375) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][240/1251] eta 0:04:06 lr 0.000110 wd 0.0500 time 0.2440 (0.2437) data time 0.0011 (0.0031) model time 0.2430 (0.2397) loss 6.3889 (6.3767) grad_norm 4.4585 (3.8458) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][250/1251] eta 0:04:03 lr 0.000111 wd 0.0500 time 0.2462 (0.2435) data time 0.0010 (0.0030) model time 0.2452 (0.2397) loss 5.7439 (6.3732) grad_norm 3.5621 (3.8281) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][260/1251] eta 0:04:01 lr 0.000111 wd 0.0500 time 0.2432 (0.2434) data time 0.0008 (0.0029) model time 0.2424 
(0.2397) loss 6.4180 (6.3697) grad_norm 3.4645 (3.8499) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][270/1251] eta 0:03:58 lr 0.000112 wd 0.0500 time 0.2383 (0.2434) data time 0.0007 (0.0029) model time 0.2376 (0.2397) loss 6.2013 (6.3711) grad_norm 3.2204 (3.8365) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][280/1251] eta 0:03:56 lr 0.000112 wd 0.0500 time 0.2376 (0.2433) data time 0.0009 (0.0028) model time 0.2367 (0.2397) loss 6.5165 (6.3688) grad_norm 3.5458 (3.8320) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][290/1251] eta 0:03:53 lr 0.000112 wd 0.0500 time 0.2414 (0.2433) data time 0.0011 (0.0028) model time 0.2403 (0.2398) loss 6.0246 (6.3623) grad_norm 3.3424 (3.8323) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][300/1251] eta 0:03:51 lr 0.000113 wd 0.0500 time 0.2433 (0.2431) data time 0.0010 (0.0027) model time 0.2423 (0.2397) loss 6.4161 (6.3618) grad_norm 3.7664 (3.8351) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][310/1251] eta 0:03:48 lr 0.000113 wd 0.0500 time 0.2369 (0.2431) data time 0.0010 (0.0026) model time 0.2360 (0.2398) loss 6.4134 (6.3611) grad_norm 3.8872 (3.8568) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][320/1251] eta 0:03:46 lr 0.000114 wd 0.0500 time 0.2449 (0.2431) data time 0.0010 (0.0026) model time 0.2439 (0.2398) loss 6.1793 (6.3583) grad_norm 3.9798 (3.8844) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[2/300][330/1251] eta 0:03:43 lr 0.000114 wd 0.0500 time 0.2371 (0.2430) data time 0.0009 (0.0025) model time 0.2362 (0.2398) loss 6.3522 (6.3541) grad_norm 3.3058 (3.8702) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][340/1251] eta 0:03:41 lr 0.000114 wd 0.0500 time 0.2422 (0.2429) data time 0.0010 (0.0025) model time 0.2412 (0.2397) loss 6.1026 (6.3530) grad_norm 3.3591 (3.8573) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][350/1251] eta 0:03:38 lr 0.000115 wd 0.0500 time 0.2381 (0.2428) data time 0.0010 (0.0025) model time 0.2371 (0.2398) loss 6.3510 (6.3511) grad_norm 4.9845 (3.8515) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][360/1251] eta 0:03:36 lr 0.000115 wd 0.0500 time 0.2462 (0.2429) data time 0.0009 (0.0024) model time 0.2452 (0.2399) loss 5.8430 (6.3421) grad_norm 4.9175 (3.8575) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][370/1251] eta 0:03:33 lr 0.000116 wd 0.0500 time 0.2349 (0.2429) data time 0.0012 (0.0024) model time 0.2337 (0.2399) loss 6.4695 (6.3398) grad_norm 4.3417 (3.8602) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][380/1251] eta 0:03:31 lr 0.000116 wd 0.0500 time 0.2472 (0.2428) data time 0.0007 (0.0024) model time 0.2465 (0.2399) loss 6.1193 (6.3397) grad_norm 4.5593 (3.8654) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][390/1251] eta 0:03:28 lr 0.000116 wd 0.0500 time 0.2340 (0.2427) data time 0.0013 (0.0023) model time 0.2327 (0.2398) loss 6.4206 (6.3363) grad_norm 3.0103 (3.8575) loss_scale 32768.0000 
(32768.0000) mem 7376MB [2024-08-19 23:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][400/1251] eta 0:03:26 lr 0.000117 wd 0.0500 time 0.2451 (0.2427) data time 0.0011 (0.0023) model time 0.2441 (0.2399) loss 5.9643 (6.3365) grad_norm 3.2965 (3.8532) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][410/1251] eta 0:03:24 lr 0.000117 wd 0.0500 time 0.2423 (0.2426) data time 0.0010 (0.0023) model time 0.2414 (0.2398) loss 6.6254 (6.3340) grad_norm 2.6565 (3.8444) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][420/1251] eta 0:03:21 lr 0.000118 wd 0.0500 time 0.2416 (0.2426) data time 0.0007 (0.0022) model time 0.2408 (0.2398) loss 6.1751 (6.3322) grad_norm 2.4276 (3.8356) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][430/1251] eta 0:03:19 lr 0.000118 wd 0.0500 time 0.2399 (0.2426) data time 0.0008 (0.0022) model time 0.2391 (0.2399) loss 6.2656 (6.3326) grad_norm 3.5898 (3.8249) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][440/1251] eta 0:03:16 lr 0.000118 wd 0.0500 time 0.2457 (0.2426) data time 0.0008 (0.0022) model time 0.2449 (0.2399) loss 5.8550 (6.3319) grad_norm 4.6385 (3.8246) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][450/1251] eta 0:03:14 lr 0.000119 wd 0.0500 time 0.2422 (0.2426) data time 0.0009 (0.0022) model time 0.2414 (0.2400) loss 6.4506 (6.3278) grad_norm 4.8089 (3.8333) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][460/1251] eta 0:03:11 lr 0.000119 wd 0.0500 time 0.2383 (0.2425) data time 0.0010 
(0.0021) model time 0.2373 (0.2399) loss 5.9984 (6.3248) grad_norm 2.6831 (3.8287) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][470/1251] eta 0:03:09 lr 0.000120 wd 0.0500 time 0.2389 (0.2429) data time 0.0007 (0.0021) model time 0.2381 (0.2404) loss 6.1495 (6.3249) grad_norm 4.3083 (3.8230) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][480/1251] eta 0:03:07 lr 0.000120 wd 0.0500 time 0.2448 (0.2433) data time 0.0008 (0.0021) model time 0.2440 (0.2409) loss 5.9503 (6.3220) grad_norm 2.7757 (3.8122) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][490/1251] eta 0:03:05 lr 0.000120 wd 0.0500 time 0.2426 (0.2433) data time 0.0010 (0.0021) model time 0.2416 (0.2409) loss 6.4971 (6.3233) grad_norm 2.8774 (3.8034) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][500/1251] eta 0:03:02 lr 0.000121 wd 0.0500 time 0.2414 (0.2432) data time 0.0012 (0.0020) model time 0.2402 (0.2408) loss 6.3916 (6.3211) grad_norm 4.1635 (3.8026) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][510/1251] eta 0:03:00 lr 0.000121 wd 0.0500 time 0.2394 (0.2431) data time 0.0012 (0.0020) model time 0.2382 (0.2407) loss 6.3562 (6.3193) grad_norm 2.9601 (3.8026) loss_scale 65536.0000 (33088.6262) mem 7376MB [2024-08-19 23:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][520/1251] eta 0:02:57 lr 0.000122 wd 0.0500 time 0.2402 (0.2430) data time 0.0012 (0.0020) model time 0.2391 (0.2407) loss 6.1352 (6.3178) grad_norm 3.0410 (3.8057) loss_scale 65536.0000 (33711.4165) mem 7376MB [2024-08-19 23:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [2/300][530/1251] eta 0:02:55 lr 0.000122 wd 0.0500 time 0.2420 (0.2430) data time 0.0008 (0.0020) model time 0.2412 (0.2407) loss 6.2401 (6.3143) grad_norm 3.5773 (3.7894) loss_scale 65536.0000 (34310.7495) mem 7376MB [2024-08-19 23:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][540/1251] eta 0:02:52 lr 0.000122 wd 0.0500 time 0.2404 (0.2430) data time 0.0010 (0.0020) model time 0.2394 (0.2407) loss 6.3305 (6.3125) grad_norm 3.7974 (3.7931) loss_scale 65536.0000 (34887.9261) mem 7376MB [2024-08-19 23:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][550/1251] eta 0:02:50 lr 0.000123 wd 0.0500 time 0.2383 (0.2430) data time 0.0012 (0.0020) model time 0.2372 (0.2407) loss 6.1450 (6.3120) grad_norm 3.1283 (3.7997) loss_scale 65536.0000 (35444.1525) mem 7376MB [2024-08-19 23:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][560/1251] eta 0:02:47 lr 0.000123 wd 0.0500 time 0.2372 (0.2430) data time 0.0008 (0.0019) model time 0.2363 (0.2407) loss 5.8969 (6.3111) grad_norm 3.2464 (3.7964) loss_scale 65536.0000 (35980.5490) mem 7376MB [2024-08-19 23:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][570/1251] eta 0:02:45 lr 0.000124 wd 0.0500 time 0.2465 (0.2429) data time 0.0010 (0.0019) model time 0.2455 (0.2407) loss 6.2571 (6.3084) grad_norm 2.6995 (3.7844) loss_scale 65536.0000 (36498.1576) mem 7376MB [2024-08-19 23:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][580/1251] eta 0:02:42 lr 0.000124 wd 0.0500 time 0.2436 (0.2429) data time 0.0008 (0.0019) model time 0.2429 (0.2407) loss 5.9504 (6.3058) grad_norm 4.9723 (3.7786) loss_scale 65536.0000 (36997.9484) mem 7376MB [2024-08-19 23:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][590/1251] eta 0:02:40 lr 0.000124 wd 0.0500 time 0.2436 (0.2428) data time 0.0010 (0.0019) model time 0.2426 (0.2407) loss 6.5274 (6.3051) grad_norm 2.6491 (inf) loss_scale 
32768.0000 (37314.4907) mem 7376MB [2024-08-19 23:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][600/1251] eta 0:02:38 lr 0.000125 wd 0.0500 time 0.2373 (0.2428) data time 0.0015 (0.0019) model time 0.2359 (0.2406) loss 6.0076 (6.3038) grad_norm 2.5820 (inf) loss_scale 32768.0000 (37238.8419) mem 7376MB [2024-08-19 23:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][610/1251] eta 0:02:35 lr 0.000125 wd 0.0500 time 0.2431 (0.2428) data time 0.0011 (0.0019) model time 0.2421 (0.2406) loss 5.8078 (6.3017) grad_norm 3.0729 (inf) loss_scale 32768.0000 (37165.6694) mem 7376MB [2024-08-19 23:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][620/1251] eta 0:02:33 lr 0.000126 wd 0.0500 time 0.2375 (0.2428) data time 0.0008 (0.0019) model time 0.2367 (0.2406) loss 6.3230 (6.3008) grad_norm 4.1111 (inf) loss_scale 32768.0000 (37094.8535) mem 7376MB [2024-08-19 23:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][630/1251] eta 0:02:30 lr 0.000126 wd 0.0500 time 0.2482 (0.2427) data time 0.0008 (0.0018) model time 0.2473 (0.2406) loss 6.3547 (6.3001) grad_norm 4.7512 (inf) loss_scale 32768.0000 (37026.2821) mem 7376MB [2024-08-19 23:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][640/1251] eta 0:02:28 lr 0.000126 wd 0.0500 time 0.2387 (0.2428) data time 0.0008 (0.0018) model time 0.2379 (0.2407) loss 6.2190 (6.2962) grad_norm 2.7803 (inf) loss_scale 32768.0000 (36959.8502) mem 7376MB [2024-08-19 23:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][650/1251] eta 0:02:25 lr 0.000127 wd 0.0500 time 0.2401 (0.2427) data time 0.0011 (0.0018) model time 0.2390 (0.2406) loss 6.4395 (6.2957) grad_norm 4.1026 (inf) loss_scale 32768.0000 (36895.4593) mem 7376MB [2024-08-19 23:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][660/1251] eta 0:02:23 lr 0.000127 wd 0.0500 time 0.2415 (0.2427) data time 0.0008 (0.0018) 
model time 0.2407 (0.2407) loss 6.3871 (6.2938) grad_norm 3.1030 (inf) loss_scale 32768.0000 (36833.0166) mem 7376MB [2024-08-19 23:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][670/1251] eta 0:02:21 lr 0.000128 wd 0.0500 time 0.2407 (0.2427) data time 0.0007 (0.0018) model time 0.2400 (0.2406) loss 6.6291 (6.2939) grad_norm 3.8667 (inf) loss_scale 32768.0000 (36772.4352) mem 7376MB [2024-08-19 23:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][680/1251] eta 0:02:18 lr 0.000128 wd 0.0500 time 0.2475 (0.2427) data time 0.0013 (0.0018) model time 0.2463 (0.2406) loss 5.8001 (6.2932) grad_norm 4.0047 (inf) loss_scale 32768.0000 (36713.6329) mem 7376MB [2024-08-19 23:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][690/1251] eta 0:02:16 lr 0.000128 wd 0.0500 time 0.2397 (0.2426) data time 0.0008 (0.0018) model time 0.2389 (0.2406) loss 6.3093 (6.2910) grad_norm 3.1917 (inf) loss_scale 32768.0000 (36656.5326) mem 7376MB [2024-08-19 23:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][700/1251] eta 0:02:13 lr 0.000129 wd 0.0500 time 0.2377 (0.2426) data time 0.0009 (0.0018) model time 0.2368 (0.2406) loss 6.5686 (6.2916) grad_norm 3.3933 (inf) loss_scale 32768.0000 (36601.0613) mem 7376MB [2024-08-19 23:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][710/1251] eta 0:02:11 lr 0.000129 wd 0.0500 time 0.2462 (0.2426) data time 0.0009 (0.0018) model time 0.2453 (0.2406) loss 6.2566 (6.2908) grad_norm 4.7242 (inf) loss_scale 32768.0000 (36547.1505) mem 7376MB [2024-08-19 23:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][720/1251] eta 0:02:08 lr 0.000130 wd 0.0500 time 0.2378 (0.2426) data time 0.0010 (0.0018) model time 0.2369 (0.2406) loss 6.3226 (6.2896) grad_norm 3.9902 (inf) loss_scale 32768.0000 (36494.7351) mem 7376MB [2024-08-19 23:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][730/1251] 
eta 0:02:06 lr 0.000130 wd 0.0500 time 0.2342 (0.2425) data time 0.0011 (0.0017) model time 0.2331 (0.2405) loss 6.4750 (6.2907) grad_norm 2.8874 (inf) loss_scale 32768.0000 (36443.7538) mem 7376MB [2024-08-19 23:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][740/1251] eta 0:02:03 lr 0.000130 wd 0.0500 time 0.2406 (0.2425) data time 0.0009 (0.0017) model time 0.2396 (0.2406) loss 5.8531 (6.2913) grad_norm 2.7698 (inf) loss_scale 32768.0000 (36394.1484) mem 7376MB [2024-08-19 23:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][750/1251] eta 0:02:01 lr 0.000131 wd 0.0500 time 0.2304 (0.2425) data time 0.0008 (0.0017) model time 0.2296 (0.2405) loss 5.9522 (6.2895) grad_norm 4.2587 (inf) loss_scale 32768.0000 (36345.8642) mem 7376MB [2024-08-19 23:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][760/1251] eta 0:01:59 lr 0.000131 wd 0.0500 time 0.2511 (0.2425) data time 0.0009 (0.0017) model time 0.2502 (0.2406) loss 6.4653 (6.2884) grad_norm 3.5340 (inf) loss_scale 32768.0000 (36298.8489) mem 7376MB [2024-08-19 23:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][770/1251] eta 0:01:56 lr 0.000132 wd 0.0500 time 0.2502 (0.2425) data time 0.0011 (0.0017) model time 0.2491 (0.2406) loss 6.0224 (6.2866) grad_norm 2.2670 (inf) loss_scale 32768.0000 (36253.0532) mem 7376MB [2024-08-19 23:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][780/1251] eta 0:01:54 lr 0.000132 wd 0.0500 time 0.2399 (0.2425) data time 0.0008 (0.0017) model time 0.2391 (0.2406) loss 6.5414 (6.2846) grad_norm 2.8497 (inf) loss_scale 32768.0000 (36208.4302) mem 7376MB [2024-08-19 23:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][790/1251] eta 0:01:51 lr 0.000132 wd 0.0500 time 0.2434 (0.2425) data time 0.0011 (0.0017) model time 0.2423 (0.2406) loss 6.0844 (6.2830) grad_norm 3.6357 (inf) loss_scale 32768.0000 (36164.9355) mem 7376MB [2024-08-19 
23:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][800/1251] eta 0:01:49 lr 0.000133 wd 0.0500 time 0.2381 (0.2425) data time 0.0008 (0.0017) model time 0.2373 (0.2406) loss 6.3083 (6.2814) grad_norm 3.7468 (inf) loss_scale 32768.0000 (36122.5268) mem 7376MB [2024-08-19 23:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][810/1251] eta 0:01:46 lr 0.000133 wd 0.0500 time 0.2357 (0.2425) data time 0.0010 (0.0017) model time 0.2347 (0.2406) loss 6.2348 (6.2801) grad_norm 3.6693 (inf) loss_scale 32768.0000 (36081.1640) mem 7376MB [2024-08-19 23:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][820/1251] eta 0:01:44 lr 0.000134 wd 0.0500 time 0.2414 (0.2425) data time 0.0008 (0.0017) model time 0.2407 (0.2406) loss 5.6998 (6.2785) grad_norm 3.6161 (inf) loss_scale 32768.0000 (36040.8088) mem 7376MB [2024-08-19 23:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][830/1251] eta 0:01:42 lr 0.000134 wd 0.0500 time 0.2383 (0.2425) data time 0.0010 (0.0017) model time 0.2374 (0.2406) loss 6.0972 (6.2763) grad_norm 3.8736 (inf) loss_scale 32768.0000 (36001.4248) mem 7376MB [2024-08-19 23:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][840/1251] eta 0:01:39 lr 0.000134 wd 0.0500 time 0.2430 (0.2424) data time 0.0012 (0.0017) model time 0.2418 (0.2406) loss 6.3052 (6.2748) grad_norm 2.3871 (inf) loss_scale 32768.0000 (35962.9774) mem 7376MB [2024-08-19 23:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][850/1251] eta 0:01:37 lr 0.000135 wd 0.0500 time 0.2530 (0.2424) data time 0.0008 (0.0016) model time 0.2522 (0.2406) loss 6.3259 (6.2736) grad_norm 2.2604 (inf) loss_scale 32768.0000 (35925.4336) mem 7376MB [2024-08-19 23:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][860/1251] eta 0:01:34 lr 0.000135 wd 0.0500 time 0.2387 (0.2424) data time 0.0010 (0.0016) model time 0.2377 (0.2405) loss 6.2634 (6.2710) 
grad_norm 3.4613 (inf) loss_scale 32768.0000 (35888.7619) mem 7376MB [2024-08-19 23:40:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][870/1251] eta 0:01:32 lr 0.000136 wd 0.0500 time 0.2402 (0.2424) data time 0.0008 (0.0016) model time 0.2395 (0.2405) loss 5.8238 (6.2691) grad_norm 2.8005 (inf) loss_scale 32768.0000 (35852.9323) mem 7376MB [2024-08-19 23:40:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][880/1251] eta 0:01:29 lr 0.000136 wd 0.0500 time 0.2429 (0.2424) data time 0.0009 (0.0016) model time 0.2420 (0.2405) loss 6.5996 (6.2687) grad_norm 3.5718 (inf) loss_scale 32768.0000 (35817.9160) mem 7376MB [2024-08-19 23:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][890/1251] eta 0:01:27 lr 0.000136 wd 0.0500 time 0.2416 (0.2423) data time 0.0008 (0.0016) model time 0.2408 (0.2405) loss 6.0985 (6.2689) grad_norm 6.8130 (inf) loss_scale 32768.0000 (35783.6857) mem 7376MB [2024-08-19 23:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][900/1251] eta 0:01:25 lr 0.000137 wd 0.0500 time 0.2437 (0.2423) data time 0.0011 (0.0016) model time 0.2426 (0.2405) loss 6.4421 (6.2669) grad_norm 2.9884 (inf) loss_scale 32768.0000 (35750.2153) mem 7376MB [2024-08-19 23:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][910/1251] eta 0:01:22 lr 0.000137 wd 0.0500 time 0.2396 (0.2423) data time 0.0008 (0.0016) model time 0.2388 (0.2405) loss 6.0892 (6.2661) grad_norm 2.9940 (inf) loss_scale 32768.0000 (35717.4797) mem 7376MB [2024-08-19 23:40:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][920/1251] eta 0:01:20 lr 0.000138 wd 0.0500 time 0.2446 (0.2423) data time 0.0011 (0.0016) model time 0.2436 (0.2405) loss 6.3352 (6.2664) grad_norm 3.0249 (inf) loss_scale 32768.0000 (35685.4549) mem 7376MB [2024-08-19 23:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][930/1251] eta 0:01:17 lr 0.000138 wd 0.0500 time 0.2426 
(0.2423) data time 0.0011 (0.0016) model time 0.2415 (0.2405) loss 6.2130 (6.2643) grad_norm 3.8179 (inf) loss_scale 32768.0000 (35654.1182) mem 7376MB [2024-08-19 23:40:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][940/1251] eta 0:01:15 lr 0.000138 wd 0.0500 time 0.2447 (0.2423) data time 0.0010 (0.0016) model time 0.2436 (0.2405) loss 6.3287 (6.2619) grad_norm 3.2153 (inf) loss_scale 32768.0000 (35623.4474) mem 7376MB [2024-08-19 23:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][950/1251] eta 0:01:13 lr 0.000139 wd 0.0500 time 0.2462 (0.2425) data time 0.0009 (0.0016) model time 0.2453 (0.2408) loss 6.2632 (6.2606) grad_norm 3.8280 (inf) loss_scale 32768.0000 (35593.4217) mem 7376MB [2024-08-19 23:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][960/1251] eta 0:01:10 lr 0.000139 wd 0.0500 time 0.2376 (0.2425) data time 0.0009 (0.0016) model time 0.2367 (0.2408) loss 6.4486 (6.2597) grad_norm 3.0667 (inf) loss_scale 32768.0000 (35564.0208) mem 7376MB [2024-08-19 23:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][970/1251] eta 0:01:08 lr 0.000140 wd 0.0500 time 0.2363 (0.2425) data time 0.0011 (0.0016) model time 0.2352 (0.2408) loss 5.9621 (6.2568) grad_norm 3.4529 (inf) loss_scale 32768.0000 (35535.2255) mem 7376MB [2024-08-19 23:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][980/1251] eta 0:01:05 lr 0.000140 wd 0.0500 time 0.2391 (0.2425) data time 0.0008 (0.0016) model time 0.2383 (0.2408) loss 5.5612 (6.2551) grad_norm 2.9347 (inf) loss_scale 32768.0000 (35507.0173) mem 7376MB [2024-08-19 23:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][990/1251] eta 0:01:03 lr 0.000140 wd 0.0500 time 0.2469 (0.2425) data time 0.0008 (0.0016) model time 0.2461 (0.2408) loss 6.4814 (6.2546) grad_norm 3.0441 (inf) loss_scale 32768.0000 (35479.3784) mem 7376MB [2024-08-19 23:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [2/300][1000/1251] eta 0:01:00 lr 0.000141 wd 0.0500 time 0.2381 (0.2427) data time 0.0010 (0.0016) model time 0.2371 (0.2409) loss 6.4368 (6.2530) grad_norm 3.6425 (inf) loss_scale 32768.0000 (35452.2917) mem 7376MB [2024-08-19 23:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1010/1251] eta 0:00:58 lr 0.000141 wd 0.0500 time 0.2423 (0.2426) data time 0.0009 (0.0016) model time 0.2413 (0.2409) loss 6.2841 (6.2517) grad_norm 3.4757 (inf) loss_scale 32768.0000 (35425.7409) mem 7376MB [2024-08-19 23:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1020/1251] eta 0:00:56 lr 0.000142 wd 0.0500 time 0.2422 (0.2426) data time 0.0007 (0.0015) model time 0.2414 (0.2409) loss 6.5054 (6.2497) grad_norm 3.3252 (inf) loss_scale 32768.0000 (35399.7101) mem 7376MB [2024-08-19 23:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1030/1251] eta 0:00:53 lr 0.000142 wd 0.0500 time 0.2393 (0.2426) data time 0.0009 (0.0015) model time 0.2384 (0.2409) loss 6.5252 (6.2489) grad_norm 3.4161 (inf) loss_scale 32768.0000 (35374.1843) mem 7376MB [2024-08-19 23:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1040/1251] eta 0:00:51 lr 0.000142 wd 0.0500 time 0.2386 (0.2426) data time 0.0007 (0.0015) model time 0.2379 (0.2409) loss 5.5878 (6.2471) grad_norm 4.0555 (inf) loss_scale 32768.0000 (35349.1489) mem 7376MB [2024-08-19 23:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1050/1251] eta 0:00:48 lr 0.000143 wd 0.0500 time 0.2485 (0.2426) data time 0.0008 (0.0015) model time 0.2477 (0.2409) loss 6.2520 (6.2448) grad_norm 3.5926 (inf) loss_scale 32768.0000 (35324.5899) mem 7376MB [2024-08-19 23:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1060/1251] eta 0:00:46 lr 0.000143 wd 0.0500 time 0.2436 (0.2426) data time 0.0010 (0.0015) model time 0.2425 (0.2409) loss 6.0653 (6.2440) grad_norm 2.5562 (inf) loss_scale 32768.0000 
(35300.4939) mem 7376MB [2024-08-19 23:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1070/1251] eta 0:00:43 lr 0.000144 wd 0.0500 time 0.2436 (0.2426) data time 0.0010 (0.0015) model time 0.2425 (0.2409) loss 6.3955 (6.2432) grad_norm 2.8350 (inf) loss_scale 32768.0000 (35276.8478) mem 7376MB [2024-08-19 23:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1080/1251] eta 0:00:41 lr 0.000144 wd 0.0500 time 0.2369 (0.2426) data time 0.0008 (0.0015) model time 0.2361 (0.2409) loss 5.8518 (6.2423) grad_norm 2.9718 (inf) loss_scale 32768.0000 (35253.6392) mem 7376MB [2024-08-19 23:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1090/1251] eta 0:00:39 lr 0.000144 wd 0.0500 time 0.2447 (0.2425) data time 0.0009 (0.0015) model time 0.2437 (0.2409) loss 5.6657 (6.2423) grad_norm 3.7266 (inf) loss_scale 32768.0000 (35230.8561) mem 7376MB [2024-08-19 23:41:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1100/1251] eta 0:00:36 lr 0.000145 wd 0.0500 time 0.2437 (0.2425) data time 0.0009 (0.0015) model time 0.2428 (0.2409) loss 5.6023 (6.2404) grad_norm 3.5767 (inf) loss_scale 32768.0000 (35208.4868) mem 7376MB [2024-08-19 23:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1110/1251] eta 0:00:34 lr 0.000145 wd 0.0500 time 0.2459 (0.2425) data time 0.0010 (0.0015) model time 0.2450 (0.2409) loss 6.3465 (6.2411) grad_norm 4.4375 (inf) loss_scale 32768.0000 (35186.5203) mem 7376MB [2024-08-19 23:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1120/1251] eta 0:00:31 lr 0.000146 wd 0.0500 time 0.2435 (0.2425) data time 0.0007 (0.0015) model time 0.2428 (0.2409) loss 5.7659 (6.2402) grad_norm 3.0489 (inf) loss_scale 32768.0000 (35164.9456) mem 7376MB [2024-08-19 23:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1130/1251] eta 0:00:29 lr 0.000146 wd 0.0500 time 0.2587 (0.2425) data time 0.0009 (0.0015) model 
time 0.2578 (0.2409) loss 6.2260 (6.2397) grad_norm 4.0090 (inf) loss_scale 32768.0000 (35143.7524) mem 7376MB [2024-08-19 23:41:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1140/1251] eta 0:00:26 lr 0.000146 wd 0.0500 time 0.2432 (0.2426) data time 0.0007 (0.0015) model time 0.2425 (0.2409) loss 6.2777 (6.2381) grad_norm 3.7517 (inf) loss_scale 32768.0000 (35122.9308) mem 7376MB [2024-08-19 23:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1150/1251] eta 0:00:24 lr 0.000147 wd 0.0500 time 0.2410 (0.2425) data time 0.0011 (0.0015) model time 0.2399 (0.2409) loss 6.3016 (6.2372) grad_norm 3.7069 (inf) loss_scale 32768.0000 (35102.4709) mem 7376MB [2024-08-19 23:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1160/1251] eta 0:00:22 lr 0.000147 wd 0.0500 time 0.2427 (0.2425) data time 0.0008 (0.0015) model time 0.2419 (0.2409) loss 6.0187 (6.2359) grad_norm 2.9045 (inf) loss_scale 32768.0000 (35082.3635) mem 7376MB [2024-08-19 23:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1170/1251] eta 0:00:19 lr 0.000148 wd 0.0500 time 0.2430 (0.2425) data time 0.0010 (0.0015) model time 0.2420 (0.2409) loss 6.1600 (6.2355) grad_norm 2.5919 (inf) loss_scale 32768.0000 (35062.5995) mem 7376MB [2024-08-19 23:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1180/1251] eta 0:00:17 lr 0.000148 wd 0.0500 time 0.2403 (0.2425) data time 0.0010 (0.0015) model time 0.2393 (0.2409) loss 6.1152 (6.2342) grad_norm 3.9885 (inf) loss_scale 32768.0000 (35043.1702) mem 7376MB [2024-08-19 23:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1190/1251] eta 0:00:14 lr 0.000148 wd 0.0500 time 0.2355 (0.2425) data time 0.0010 (0.0015) model time 0.2345 (0.2409) loss 5.8568 (6.2329) grad_norm 3.9255 (inf) loss_scale 32768.0000 (35024.0672) mem 7376MB [2024-08-19 23:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1200/1251] 
eta 0:00:12 lr 0.000149 wd 0.0500 time 0.2415 (0.2425) data time 0.0008 (0.0015) model time 0.2406 (0.2409) loss 6.1435 (6.2314) grad_norm 5.1181 (inf) loss_scale 32768.0000 (35005.2823) mem 7376MB [2024-08-19 23:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1210/1251] eta 0:00:09 lr 0.000149 wd 0.0500 time 0.2509 (0.2425) data time 0.0010 (0.0015) model time 0.2499 (0.2409) loss 6.2891 (6.2303) grad_norm 2.9149 (inf) loss_scale 32768.0000 (34986.8076) mem 7376MB [2024-08-19 23:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1220/1251] eta 0:00:07 lr 0.000150 wd 0.0500 time 0.2480 (0.2425) data time 0.0011 (0.0015) model time 0.2470 (0.2409) loss 6.3197 (6.2299) grad_norm 3.1098 (inf) loss_scale 32768.0000 (34968.6355) mem 7376MB [2024-08-19 23:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1230/1251] eta 0:00:05 lr 0.000150 wd 0.0500 time 0.2405 (0.2425) data time 0.0010 (0.0015) model time 0.2395 (0.2409) loss 5.9453 (6.2289) grad_norm 4.7592 (inf) loss_scale 32768.0000 (34950.7587) mem 7376MB [2024-08-19 23:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1240/1251] eta 0:00:02 lr 0.000150 wd 0.0500 time 0.2270 (0.2424) data time 0.0007 (0.0015) model time 0.2263 (0.2408) loss 6.2253 (6.2282) grad_norm 3.0117 (inf) loss_scale 32768.0000 (34933.1700) mem 7376MB [2024-08-19 23:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [2/300][1250/1251] eta 0:00:00 lr 0.000151 wd 0.0500 time 0.2238 (0.2423) data time 0.0007 (0.0015) model time 0.2231 (0.2407) loss 6.3471 (6.2263) grad_norm 3.0893 (inf) loss_scale 32768.0000 (34915.8625) mem 7376MB [2024-08-19 23:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 2 training takes 0:05:03 [2024-08-19 23:41:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-19 23:41:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-19 23:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.475 (0.475) Loss 4.1680 (4.1680) Acc@1 20.020 (20.020) Acc@5 46.289 (46.289) Mem 7376MB [2024-08-19 23:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.113) Loss 4.5117 (4.4737) Acc@1 12.207 (14.595) Acc@5 31.934 (35.210) Mem 7376MB [2024-08-19 23:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.097) Loss 5.0820 (4.4790) Acc@1 9.766 (15.258) Acc@5 24.316 (35.761) Mem 7376MB [2024-08-19 23:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.091) Loss 4.8086 (4.5735) Acc@1 13.379 (14.652) Acc@5 29.297 (34.290) Mem 7376MB [2024-08-19 23:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 4.9062 (4.6417) Acc@1 8.008 (14.086) Acc@5 26.367 (32.991) Mem 7376MB [2024-08-19 23:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 14.872 Acc@5 34.168 [2024-08-19 23:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 14.9% [2024-08-19 23:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 14.87% [2024-08-19 23:41:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-19 23:41:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-19 23:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 6.9336 (6.9336) Acc@1 0.391 (0.391) Acc@5 0.977 (0.977) Mem 7376MB [2024-08-19 23:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.112) Loss 6.9258 (6.9297) Acc@1 0.000 (0.151) Acc@5 1.074 (0.746) Mem 7376MB [2024-08-19 23:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.097) Loss 6.9531 (6.9349) Acc@1 0.293 (0.144) Acc@5 0.684 (0.660) Mem 7376MB [2024-08-19 23:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 6.9570 (6.9381) Acc@1 0.000 (0.135) Acc@5 0.195 (0.646) Mem 7376MB [2024-08-19 23:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 6.9570 (6.9416) Acc@1 0.195 (0.145) Acc@5 0.977 (0.676) Mem 7376MB [2024-08-19 23:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.154 Acc@5 0.672 [2024-08-19 23:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.2% [2024-08-19 23:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.15% [2024-08-19 23:41:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-19 23:41:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-19 23:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][0/1251] eta 0:15:17 lr 0.000151 wd 0.0500 time 0.7338 (0.7338) data time 0.5108 (0.5108) model time 0.0000 (0.0000) loss 5.6679 (5.6679) grad_norm 2.7281 (2.7281) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][10/1251] eta 0:05:53 lr 0.000151 wd 0.0500 time 0.2437 (0.2846) data time 0.0008 (0.0474) model time 0.0000 (0.0000) loss 6.1826 (6.0990) grad_norm 4.0453 (3.6172) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][20/1251] eta 0:05:25 lr 0.000152 wd 0.0500 time 0.2443 (0.2648) data time 0.0008 (0.0253) model time 0.0000 (0.0000) loss 5.4547 (6.0468) grad_norm 2.5085 (3.6822) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][30/1251] eta 0:05:14 lr 0.000152 wd 0.0500 time 0.2413 (0.2573) data time 0.0009 (0.0175) model time 0.0000 (0.0000) loss 6.0161 (6.0598) grad_norm 2.8487 (3.7303) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][40/1251] eta 0:05:06 lr 0.000152 wd 0.0500 time 0.2375 (0.2530) data time 0.0007 (0.0135) model time 0.0000 (0.0000) loss 6.0766 (6.0714) grad_norm 2.7313 (3.6823) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][50/1251] eta 0:05:00 lr 0.000153 wd 0.0500 time 0.2364 (0.2506) data time 0.0012 (0.0110) model time 0.0000 (0.0000) loss 5.9988 (6.0857) grad_norm 2.8662 (3.5773) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][60/1251] eta 0:04:56 lr 0.000153 wd 0.0500 time 0.2459 (0.2487) data time 0.0008 (0.0094) model time 0.2451 (0.2382) 
loss 5.5858 (6.0439) grad_norm 3.7657 (3.5048) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][70/1251] eta 0:04:52 lr 0.000154 wd 0.0500 time 0.2415 (0.2474) data time 0.0011 (0.0082) model time 0.2404 (0.2380) loss 6.3481 (6.0642) grad_norm 3.6816 (3.4754) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][80/1251] eta 0:04:48 lr 0.000154 wd 0.0500 time 0.2462 (0.2466) data time 0.0008 (0.0075) model time 0.2454 (0.2384) loss 6.1694 (6.0715) grad_norm 5.0387 (3.4760) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][90/1251] eta 0:04:45 lr 0.000154 wd 0.0500 time 0.2440 (0.2458) data time 0.0008 (0.0068) model time 0.2432 (0.2384) loss 5.8167 (6.0545) grad_norm 2.9606 (3.5132) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][100/1251] eta 0:04:42 lr 0.000155 wd 0.0500 time 0.2374 (0.2454) data time 0.0010 (0.0062) model time 0.2364 (0.2388) loss 5.9204 (6.0678) grad_norm 3.4453 (3.4896) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][110/1251] eta 0:04:39 lr 0.000155 wd 0.0500 time 0.2397 (0.2449) data time 0.0011 (0.0057) model time 0.2386 (0.2389) loss 6.0377 (6.0535) grad_norm 3.4192 (3.5098) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][120/1251] eta 0:04:36 lr 0.000156 wd 0.0500 time 0.2357 (0.2445) data time 0.0008 (0.0053) model time 0.2350 (0.2388) loss 6.2240 (6.0475) grad_norm 3.5946 (3.4887) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][130/1251] eta 
0:04:33 lr 0.000156 wd 0.0500 time 0.2317 (0.2441) data time 0.0011 (0.0050) model time 0.2306 (0.2388) loss 6.3916 (6.0351) grad_norm 3.9210 (3.4810) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][140/1251] eta 0:04:31 lr 0.000156 wd 0.0500 time 0.2414 (0.2441) data time 0.0009 (0.0047) model time 0.2406 (0.2393) loss 6.3476 (6.0340) grad_norm 3.1695 (3.5022) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][150/1251] eta 0:04:28 lr 0.000157 wd 0.0500 time 0.2381 (0.2441) data time 0.0009 (0.0045) model time 0.2371 (0.2397) loss 6.1054 (6.0276) grad_norm 3.0930 (3.4780) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][160/1251] eta 0:04:25 lr 0.000157 wd 0.0500 time 0.2358 (0.2438) data time 0.0010 (0.0043) model time 0.2348 (0.2395) loss 5.9329 (6.0268) grad_norm 4.3594 (3.4897) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][170/1251] eta 0:04:23 lr 0.000158 wd 0.0500 time 0.2446 (0.2438) data time 0.0008 (0.0041) model time 0.2438 (0.2397) loss 6.1804 (6.0205) grad_norm 3.8146 (3.4720) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][180/1251] eta 0:04:20 lr 0.000158 wd 0.0500 time 0.2406 (0.2436) data time 0.0008 (0.0039) model time 0.2398 (0.2398) loss 6.4265 (6.0204) grad_norm 2.4352 (3.4673) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][190/1251] eta 0:04:18 lr 0.000158 wd 0.0500 time 0.2401 (0.2436) data time 0.0009 (0.0038) model time 0.2392 (0.2399) loss 5.9936 (6.0109) grad_norm 2.9673 (3.4652) loss_scale 32768.0000 (32768.0000) mem 7376MB 
[2024-08-19 23:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][200/1251] eta 0:04:15 lr 0.000159 wd 0.0500 time 0.2387 (0.2435) data time 0.0010 (0.0036) model time 0.2378 (0.2399) loss 6.3721 (6.0054) grad_norm 2.6207 (3.4593) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][210/1251] eta 0:04:13 lr 0.000159 wd 0.0500 time 0.2419 (0.2434) data time 0.0008 (0.0035) model time 0.2411 (0.2400) loss 6.3298 (6.0092) grad_norm 3.8837 (3.4589) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][220/1251] eta 0:04:10 lr 0.000160 wd 0.0500 time 0.2521 (0.2434) data time 0.0011 (0.0034) model time 0.2511 (0.2401) loss 5.6989 (6.0144) grad_norm 2.8405 (3.4379) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][230/1251] eta 0:04:08 lr 0.000160 wd 0.0500 time 0.2461 (0.2433) data time 0.0007 (0.0033) model time 0.2454 (0.2401) loss 5.8697 (6.0163) grad_norm 3.4154 (3.4493) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][240/1251] eta 0:04:05 lr 0.000160 wd 0.0500 time 0.2430 (0.2433) data time 0.0008 (0.0032) model time 0.2422 (0.2402) loss 6.3756 (6.0202) grad_norm 3.4947 (3.4869) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][250/1251] eta 0:04:03 lr 0.000161 wd 0.0500 time 0.2385 (0.2431) data time 0.0008 (0.0031) model time 0.2377 (0.2401) loss 6.3476 (6.0161) grad_norm 3.2711 (inf) loss_scale 16384.0000 (32441.6255) mem 7376MB [2024-08-19 23:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][260/1251] eta 0:04:00 lr 0.000161 wd 0.0500 time 0.2344 (0.2431) data time 0.0010 (0.0031) model time 0.2334 
(0.2401) loss 6.0836 (6.0114) grad_norm 2.8391 (inf) loss_scale 16384.0000 (31826.3908) mem 7376MB [2024-08-19 23:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][270/1251] eta 0:03:58 lr 0.000162 wd 0.0500 time 0.2376 (0.2431) data time 0.0010 (0.0030) model time 0.2366 (0.2403) loss 5.7349 (6.0111) grad_norm 3.1670 (inf) loss_scale 16384.0000 (31256.5609) mem 7376MB [2024-08-19 23:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][280/1251] eta 0:03:56 lr 0.000162 wd 0.0500 time 0.2379 (0.2437) data time 0.0008 (0.0029) model time 0.2372 (0.2410) loss 5.7171 (6.0052) grad_norm 2.6816 (inf) loss_scale 16384.0000 (30727.2883) mem 7376MB [2024-08-19 23:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][290/1251] eta 0:03:54 lr 0.000162 wd 0.0500 time 0.2377 (0.2436) data time 0.0011 (0.0029) model time 0.2366 (0.2410) loss 6.4110 (6.0122) grad_norm 2.7524 (inf) loss_scale 16384.0000 (30234.3918) mem 7376MB [2024-08-19 23:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][300/1251] eta 0:03:51 lr 0.000163 wd 0.0500 time 0.2425 (0.2435) data time 0.0010 (0.0028) model time 0.2415 (0.2409) loss 5.5149 (6.0162) grad_norm 3.0021 (inf) loss_scale 16384.0000 (29774.2458) mem 7376MB [2024-08-19 23:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][310/1251] eta 0:03:49 lr 0.000163 wd 0.0500 time 0.2427 (0.2435) data time 0.0011 (0.0027) model time 0.2417 (0.2410) loss 5.5427 (6.0173) grad_norm 3.8873 (inf) loss_scale 16384.0000 (29343.6913) mem 7376MB [2024-08-19 23:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][320/1251] eta 0:03:46 lr 0.000164 wd 0.0500 time 0.2445 (0.2435) data time 0.0010 (0.0027) model time 0.2435 (0.2410) loss 5.7457 (6.0090) grad_norm 3.0770 (inf) loss_scale 16384.0000 (28939.9626) mem 7376MB [2024-08-19 23:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][330/1251] eta 0:03:44 lr 
0.000164 wd 0.0500 time 0.2457 (0.2435) data time 0.0008 (0.0026) model time 0.2448 (0.2411) loss 5.5778 (6.0037) grad_norm 3.1035 (inf) loss_scale 16384.0000 (28560.6284) mem 7376MB [2024-08-19 23:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][340/1251] eta 0:03:41 lr 0.000164 wd 0.0500 time 0.2327 (0.2434) data time 0.0010 (0.0026) model time 0.2317 (0.2410) loss 6.2559 (6.0038) grad_norm 4.1553 (inf) loss_scale 16384.0000 (28203.5425) mem 7376MB [2024-08-19 23:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][350/1251] eta 0:03:39 lr 0.000165 wd 0.0500 time 0.2461 (0.2433) data time 0.0007 (0.0025) model time 0.2454 (0.2409) loss 6.2439 (5.9986) grad_norm 3.4674 (inf) loss_scale 16384.0000 (27866.8034) mem 7376MB [2024-08-19 23:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][360/1251] eta 0:03:36 lr 0.000165 wd 0.0500 time 0.2389 (0.2432) data time 0.0009 (0.0025) model time 0.2380 (0.2409) loss 5.8136 (5.9992) grad_norm 3.7556 (inf) loss_scale 16384.0000 (27548.7202) mem 7376MB [2024-08-19 23:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][370/1251] eta 0:03:34 lr 0.000166 wd 0.0500 time 0.2511 (0.2432) data time 0.0010 (0.0025) model time 0.2501 (0.2409) loss 6.2780 (5.9966) grad_norm 3.4433 (inf) loss_scale 16384.0000 (27247.7844) mem 7376MB [2024-08-19 23:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][380/1251] eta 0:03:31 lr 0.000166 wd 0.0500 time 0.2405 (0.2432) data time 0.0008 (0.0024) model time 0.2397 (0.2409) loss 5.1639 (5.9933) grad_norm 3.4794 (inf) loss_scale 16384.0000 (26962.6457) mem 7376MB [2024-08-19 23:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][390/1251] eta 0:03:29 lr 0.000166 wd 0.0500 time 0.2553 (0.2432) data time 0.0008 (0.0024) model time 0.2546 (0.2410) loss 5.4438 (5.9930) grad_norm 3.6823 (inf) loss_scale 16384.0000 (26692.0921) mem 7376MB [2024-08-19 23:43:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][400/1251] eta 0:03:27 lr 0.000167 wd 0.0500 time 0.2472 (0.2437) data time 0.0008 (0.0024) model time 0.2464 (0.2416) loss 5.2621 (5.9884) grad_norm 4.7944 (inf) loss_scale 16384.0000 (26435.0324) mem 7376MB [2024-08-19 23:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][410/1251] eta 0:03:24 lr 0.000167 wd 0.0500 time 0.2401 (0.2436) data time 0.0008 (0.0023) model time 0.2393 (0.2415) loss 6.0949 (5.9895) grad_norm 3.6781 (inf) loss_scale 16384.0000 (26190.4818) mem 7376MB [2024-08-19 23:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][420/1251] eta 0:03:22 lr 0.000168 wd 0.0500 time 0.2486 (0.2436) data time 0.0008 (0.0023) model time 0.2478 (0.2415) loss 6.2763 (5.9909) grad_norm 2.6308 (inf) loss_scale 16384.0000 (25957.5487) mem 7376MB [2024-08-19 23:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][430/1251] eta 0:03:19 lr 0.000168 wd 0.0500 time 0.2355 (0.2435) data time 0.0010 (0.0023) model time 0.2344 (0.2415) loss 6.0570 (5.9914) grad_norm 3.0643 (inf) loss_scale 16384.0000 (25735.4246) mem 7376MB [2024-08-19 23:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][440/1251] eta 0:03:17 lr 0.000168 wd 0.0500 time 0.2358 (0.2434) data time 0.0010 (0.0022) model time 0.2348 (0.2414) loss 5.3142 (5.9851) grad_norm 2.6495 (inf) loss_scale 16384.0000 (25523.3741) mem 7376MB [2024-08-19 23:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][450/1251] eta 0:03:14 lr 0.000169 wd 0.0500 time 0.2371 (0.2434) data time 0.0013 (0.0022) model time 0.2358 (0.2413) loss 5.6359 (5.9846) grad_norm 3.7467 (inf) loss_scale 16384.0000 (25320.7273) mem 7376MB [2024-08-19 23:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][460/1251] eta 0:03:12 lr 0.000169 wd 0.0500 time 0.2379 (0.2433) data time 0.0010 (0.0022) model time 0.2369 (0.2413) loss 6.1073 (5.9839) grad_norm 
4.5673 (inf) loss_scale 16384.0000 (25126.8720) mem 7376MB [2024-08-19 23:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][470/1251] eta 0:03:09 lr 0.000170 wd 0.0500 time 0.2438 (0.2432) data time 0.0010 (0.0022) model time 0.2428 (0.2412) loss 5.4941 (5.9843) grad_norm 4.4345 (inf) loss_scale 16384.0000 (24941.2484) mem 7376MB [2024-08-19 23:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][480/1251] eta 0:03:07 lr 0.000170 wd 0.0500 time 0.2404 (0.2432) data time 0.0011 (0.0021) model time 0.2393 (0.2412) loss 6.0796 (5.9809) grad_norm 2.8072 (inf) loss_scale 16384.0000 (24763.3430) mem 7376MB [2024-08-19 23:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][490/1251] eta 0:03:05 lr 0.000170 wd 0.0500 time 0.2442 (0.2431) data time 0.0010 (0.0021) model time 0.2432 (0.2412) loss 6.3040 (5.9814) grad_norm 2.8958 (inf) loss_scale 16384.0000 (24592.6843) mem 7376MB [2024-08-19 23:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][500/1251] eta 0:03:02 lr 0.000171 wd 0.0500 time 0.2372 (0.2431) data time 0.0011 (0.0021) model time 0.2361 (0.2411) loss 6.2958 (5.9802) grad_norm 2.5524 (inf) loss_scale 16384.0000 (24428.8383) mem 7376MB [2024-08-19 23:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][510/1251] eta 0:03:00 lr 0.000171 wd 0.0500 time 0.2418 (0.2431) data time 0.0009 (0.0021) model time 0.2409 (0.2412) loss 5.6306 (5.9791) grad_norm 2.2754 (inf) loss_scale 16384.0000 (24271.4051) mem 7376MB [2024-08-19 23:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][520/1251] eta 0:02:57 lr 0.000172 wd 0.0500 time 0.2438 (0.2435) data time 0.0011 (0.0021) model time 0.2426 (0.2416) loss 6.2203 (5.9780) grad_norm 2.8312 (inf) loss_scale 16384.0000 (24120.0154) mem 7376MB [2024-08-19 23:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][530/1251] eta 0:02:55 lr 0.000172 wd 0.0500 time 0.2390 (0.2434) data 
time 0.0010 (0.0020) model time 0.2380 (0.2416) loss 5.5209 (5.9750) grad_norm 3.3798 (inf) loss_scale 16384.0000 (23974.3277) mem 7376MB [2024-08-19 23:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][540/1251] eta 0:02:53 lr 0.000172 wd 0.0500 time 0.2485 (0.2434) data time 0.0011 (0.0020) model time 0.2474 (0.2416) loss 6.2557 (5.9764) grad_norm 3.7311 (inf) loss_scale 16384.0000 (23834.0259) mem 7376MB [2024-08-19 23:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][550/1251] eta 0:02:50 lr 0.000173 wd 0.0500 time 0.2458 (0.2434) data time 0.0010 (0.0020) model time 0.2448 (0.2415) loss 6.1832 (5.9723) grad_norm 3.2081 (inf) loss_scale 16384.0000 (23698.8167) mem 7376MB [2024-08-19 23:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][560/1251] eta 0:02:48 lr 0.000173 wd 0.0500 time 0.2468 (0.2434) data time 0.0009 (0.0020) model time 0.2459 (0.2415) loss 4.8791 (5.9710) grad_norm 3.7016 (inf) loss_scale 16384.0000 (23568.4278) mem 7376MB [2024-08-19 23:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][570/1251] eta 0:02:45 lr 0.000174 wd 0.0500 time 0.2504 (0.2434) data time 0.0012 (0.0020) model time 0.2492 (0.2416) loss 6.0198 (5.9720) grad_norm 2.8788 (inf) loss_scale 16384.0000 (23442.6060) mem 7376MB [2024-08-19 23:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][580/1251] eta 0:02:43 lr 0.000174 wd 0.0500 time 0.2350 (0.2434) data time 0.0008 (0.0020) model time 0.2342 (0.2416) loss 5.4045 (5.9693) grad_norm 2.8709 (inf) loss_scale 16384.0000 (23321.1153) mem 7376MB [2024-08-19 23:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][590/1251] eta 0:02:40 lr 0.000174 wd 0.0500 time 0.2481 (0.2433) data time 0.0008 (0.0019) model time 0.2473 (0.2415) loss 5.5105 (5.9666) grad_norm 2.8730 (inf) loss_scale 16384.0000 (23203.7360) mem 7376MB [2024-08-19 23:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [3/300][600/1251] eta 0:02:38 lr 0.000175 wd 0.0500 time 0.2520 (0.2433) data time 0.0011 (0.0019) model time 0.2509 (0.2415) loss 5.8653 (5.9664) grad_norm 2.8493 (inf) loss_scale 16384.0000 (23090.2629) mem 7376MB [2024-08-19 23:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][610/1251] eta 0:02:35 lr 0.000175 wd 0.0500 time 0.2475 (0.2432) data time 0.0011 (0.0019) model time 0.2464 (0.2415) loss 5.3508 (5.9648) grad_norm 3.5604 (inf) loss_scale 16384.0000 (22980.5041) mem 7376MB [2024-08-19 23:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][620/1251] eta 0:02:33 lr 0.000176 wd 0.0500 time 0.2396 (0.2433) data time 0.0011 (0.0019) model time 0.2385 (0.2415) loss 5.9780 (5.9668) grad_norm 3.4664 (inf) loss_scale 16384.0000 (22874.2802) mem 7376MB [2024-08-19 23:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][630/1251] eta 0:02:31 lr 0.000176 wd 0.0500 time 0.2420 (0.2432) data time 0.0009 (0.0019) model time 0.2411 (0.2415) loss 5.5138 (5.9633) grad_norm 2.8950 (inf) loss_scale 16384.0000 (22771.4231) mem 7376MB [2024-08-19 23:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][640/1251] eta 0:02:28 lr 0.000176 wd 0.0500 time 0.2490 (0.2432) data time 0.0010 (0.0019) model time 0.2480 (0.2415) loss 5.6461 (5.9637) grad_norm 2.8344 (inf) loss_scale 16384.0000 (22671.7754) mem 7376MB [2024-08-19 23:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][650/1251] eta 0:02:26 lr 0.000177 wd 0.0500 time 0.2401 (0.2433) data time 0.0012 (0.0019) model time 0.2389 (0.2415) loss 6.2589 (5.9653) grad_norm 3.2921 (inf) loss_scale 16384.0000 (22575.1889) mem 7376MB [2024-08-19 23:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][660/1251] eta 0:02:23 lr 0.000177 wd 0.0500 time 0.2426 (0.2432) data time 0.0011 (0.0019) model time 0.2415 (0.2415) loss 6.0002 (5.9655) grad_norm 2.8190 (inf) loss_scale 16384.0000 (22481.5250) mem 
7376MB [2024-08-19 23:44:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][670/1251] eta 0:02:21 lr 0.000178 wd 0.0500 time 0.2403 (0.2432) data time 0.0010 (0.0018) model time 0.2394 (0.2415) loss 5.5369 (5.9649) grad_norm 3.3195 (inf) loss_scale 16384.0000 (22390.6528) mem 7376MB [2024-08-19 23:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][680/1251] eta 0:02:18 lr 0.000178 wd 0.0500 time 0.2373 (0.2431) data time 0.0009 (0.0018) model time 0.2363 (0.2414) loss 6.2447 (5.9627) grad_norm 3.0308 (inf) loss_scale 16384.0000 (22302.4493) mem 7376MB [2024-08-19 23:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][690/1251] eta 0:02:16 lr 0.000178 wd 0.0500 time 0.2435 (0.2431) data time 0.0012 (0.0018) model time 0.2423 (0.2414) loss 6.2687 (5.9637) grad_norm 2.4649 (inf) loss_scale 16384.0000 (22216.7988) mem 7376MB [2024-08-19 23:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][700/1251] eta 0:02:13 lr 0.000179 wd 0.0500 time 0.2400 (0.2430) data time 0.0011 (0.0018) model time 0.2390 (0.2414) loss 5.5602 (5.9622) grad_norm 3.9173 (inf) loss_scale 16384.0000 (22133.5920) mem 7376MB [2024-08-19 23:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][710/1251] eta 0:02:11 lr 0.000179 wd 0.0500 time 0.2375 (0.2430) data time 0.0010 (0.0018) model time 0.2365 (0.2413) loss 6.3087 (5.9607) grad_norm 4.3175 (inf) loss_scale 16384.0000 (22052.7257) mem 7376MB [2024-08-19 23:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][720/1251] eta 0:02:09 lr 0.000180 wd 0.0500 time 0.2474 (0.2430) data time 0.0008 (0.0018) model time 0.2466 (0.2413) loss 6.3023 (5.9642) grad_norm 5.8601 (inf) loss_scale 16384.0000 (21974.1026) mem 7376MB [2024-08-19 23:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][730/1251] eta 0:02:06 lr 0.000180 wd 0.0500 time 0.2421 (0.2429) data time 0.0007 (0.0018) model time 0.2414 (0.2413) 
loss 6.0553 (5.9632) grad_norm 3.4669 (inf) loss_scale 16384.0000 (21897.6306) mem 7376MB [2024-08-19 23:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][740/1251] eta 0:02:04 lr 0.000180 wd 0.0500 time 0.2369 (0.2429) data time 0.0009 (0.0018) model time 0.2360 (0.2412) loss 5.9343 (5.9599) grad_norm 3.4141 (inf) loss_scale 16384.0000 (21823.2227) mem 7376MB [2024-08-19 23:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][750/1251] eta 0:02:01 lr 0.000181 wd 0.0500 time 0.2392 (0.2428) data time 0.0009 (0.0018) model time 0.2384 (0.2412) loss 5.1741 (5.9581) grad_norm 2.6180 (inf) loss_scale 16384.0000 (21750.7963) mem 7376MB [2024-08-19 23:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][760/1251] eta 0:01:59 lr 0.000181 wd 0.0500 time 0.2461 (0.2428) data time 0.0008 (0.0017) model time 0.2453 (0.2412) loss 5.5600 (5.9588) grad_norm 2.8975 (inf) loss_scale 16384.0000 (21680.2733) mem 7376MB [2024-08-19 23:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][770/1251] eta 0:01:56 lr 0.000182 wd 0.0500 time 0.2369 (0.2428) data time 0.0009 (0.0017) model time 0.2360 (0.2412) loss 5.7281 (5.9574) grad_norm 3.0895 (inf) loss_scale 16384.0000 (21611.5798) mem 7376MB [2024-08-19 23:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][780/1251] eta 0:01:54 lr 0.000182 wd 0.0500 time 0.2434 (0.2428) data time 0.0010 (0.0017) model time 0.2425 (0.2411) loss 5.8922 (5.9573) grad_norm 3.2621 (inf) loss_scale 16384.0000 (21544.6453) mem 7376MB [2024-08-19 23:45:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][790/1251] eta 0:01:51 lr 0.000182 wd 0.0500 time 0.2389 (0.2427) data time 0.0011 (0.0017) model time 0.2378 (0.2411) loss 5.9224 (5.9545) grad_norm 2.8109 (inf) loss_scale 16384.0000 (21479.4033) mem 7376MB [2024-08-19 23:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][800/1251] eta 0:01:49 lr 0.000183 wd 
0.0500 time 0.2373 (0.2429) data time 0.0013 (0.0017) model time 0.2360 (0.2413) loss 6.0836 (5.9547) grad_norm 3.2427 (inf) loss_scale 16384.0000 (21415.7903) mem 7376MB [2024-08-19 23:45:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][810/1251] eta 0:01:47 lr 0.000183 wd 0.0500 time 0.2367 (0.2428) data time 0.0007 (0.0017) model time 0.2360 (0.2412) loss 6.2209 (5.9555) grad_norm 2.5103 (inf) loss_scale 16384.0000 (21353.7460) mem 7376MB [2024-08-19 23:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][820/1251] eta 0:01:44 lr 0.000184 wd 0.0500 time 0.2403 (0.2428) data time 0.0010 (0.0017) model time 0.2393 (0.2412) loss 6.0237 (5.9543) grad_norm 2.4095 (inf) loss_scale 16384.0000 (21293.2132) mem 7376MB [2024-08-19 23:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][830/1251] eta 0:01:42 lr 0.000184 wd 0.0500 time 0.2347 (0.2427) data time 0.0012 (0.0017) model time 0.2335 (0.2411) loss 5.6799 (5.9545) grad_norm 2.4797 (inf) loss_scale 16384.0000 (21234.1372) mem 7376MB [2024-08-19 23:45:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][840/1251] eta 0:01:39 lr 0.000184 wd 0.0500 time 0.2320 (0.2427) data time 0.0010 (0.0017) model time 0.2310 (0.2411) loss 5.8969 (5.9541) grad_norm 2.6497 (inf) loss_scale 16384.0000 (21176.4661) mem 7376MB [2024-08-19 23:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][850/1251] eta 0:01:37 lr 0.000185 wd 0.0500 time 0.2482 (0.2427) data time 0.0010 (0.0017) model time 0.2472 (0.2411) loss 6.0691 (5.9541) grad_norm 2.8266 (inf) loss_scale 16384.0000 (21120.1504) mem 7376MB [2024-08-19 23:45:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][860/1251] eta 0:01:34 lr 0.000185 wd 0.0500 time 0.2406 (0.2427) data time 0.0009 (0.0017) model time 0.2397 (0.2411) loss 6.4903 (5.9515) grad_norm 3.4276 (inf) loss_scale 16384.0000 (21065.1429) mem 7376MB [2024-08-19 23:45:23 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [3/300][870/1251] eta 0:01:32 lr 0.000186 wd 0.0500 time 0.2422 (0.2427) data time 0.0011 (0.0017) model time 0.2411 (0.2411) loss 6.0442 (5.9507) grad_norm 3.2617 (inf) loss_scale 16384.0000 (21011.3984) mem 7376MB [2024-08-19 23:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][880/1251] eta 0:01:30 lr 0.000186 wd 0.0500 time 0.2411 (0.2427) data time 0.0012 (0.0017) model time 0.2399 (0.2411) loss 6.0066 (5.9493) grad_norm 3.0653 (inf) loss_scale 16384.0000 (20958.8740) mem 7376MB [2024-08-19 23:45:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][890/1251] eta 0:01:27 lr 0.000186 wd 0.0500 time 0.2353 (0.2427) data time 0.0009 (0.0017) model time 0.2344 (0.2411) loss 6.0783 (5.9500) grad_norm 3.7404 (inf) loss_scale 16384.0000 (20907.5286) mem 7376MB [2024-08-19 23:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][900/1251] eta 0:01:25 lr 0.000187 wd 0.0500 time 0.2393 (0.2427) data time 0.0008 (0.0016) model time 0.2386 (0.2411) loss 6.1057 (5.9504) grad_norm 2.6879 (inf) loss_scale 16384.0000 (20857.3230) mem 7376MB [2024-08-19 23:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][910/1251] eta 0:01:22 lr 0.000187 wd 0.0500 time 0.2299 (0.2426) data time 0.0008 (0.0016) model time 0.2292 (0.2411) loss 5.3821 (5.9506) grad_norm 3.0889 (inf) loss_scale 16384.0000 (20808.2195) mem 7376MB [2024-08-19 23:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][920/1251] eta 0:01:20 lr 0.000188 wd 0.0500 time 0.2323 (0.2426) data time 0.0012 (0.0016) model time 0.2311 (0.2410) loss 6.0706 (5.9499) grad_norm 3.2480 (inf) loss_scale 16384.0000 (20760.1824) mem 7376MB [2024-08-19 23:45:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][930/1251] eta 0:01:17 lr 0.000188 wd 0.0500 time 0.2494 (0.2426) data time 0.0008 (0.0016) model time 0.2486 (0.2410) loss 6.3716 (5.9495) grad_norm 2.9139 (inf) 
loss_scale 16384.0000 (20713.1772) mem 7376MB [2024-08-19 23:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][940/1251] eta 0:01:15 lr 0.000188 wd 0.0500 time 0.2376 (0.2428) data time 0.0010 (0.0016) model time 0.2366 (0.2413) loss 5.9196 (5.9505) grad_norm 2.9592 (inf) loss_scale 16384.0000 (20667.1711) mem 7376MB [2024-08-19 23:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][950/1251] eta 0:01:13 lr 0.000189 wd 0.0500 time 0.2368 (0.2428) data time 0.0012 (0.0016) model time 0.2356 (0.2413) loss 5.9986 (5.9515) grad_norm 2.6456 (inf) loss_scale 16384.0000 (20622.1325) mem 7376MB [2024-08-19 23:45:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][960/1251] eta 0:01:10 lr 0.000189 wd 0.0500 time 0.2402 (0.2428) data time 0.0011 (0.0016) model time 0.2391 (0.2412) loss 6.0866 (5.9493) grad_norm 2.6142 (inf) loss_scale 16384.0000 (20578.0312) mem 7376MB [2024-08-19 23:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][970/1251] eta 0:01:08 lr 0.000190 wd 0.0500 time 0.2515 (0.2428) data time 0.0009 (0.0016) model time 0.2507 (0.2413) loss 5.3933 (5.9495) grad_norm 3.6135 (inf) loss_scale 16384.0000 (20534.8383) mem 7376MB [2024-08-19 23:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][980/1251] eta 0:01:05 lr 0.000190 wd 0.0500 time 0.2401 (0.2427) data time 0.0010 (0.0016) model time 0.2391 (0.2412) loss 5.9486 (5.9472) grad_norm 3.1346 (inf) loss_scale 16384.0000 (20492.5260) mem 7376MB [2024-08-19 23:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][990/1251] eta 0:01:03 lr 0.000190 wd 0.0500 time 0.2348 (0.2427) data time 0.0009 (0.0016) model time 0.2340 (0.2412) loss 5.1996 (5.9472) grad_norm 2.7019 (inf) loss_scale 16384.0000 (20451.0676) mem 7376MB [2024-08-19 23:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1000/1251] eta 0:01:00 lr 0.000191 wd 0.0500 time 0.2444 (0.2427) data time 0.0010 
(0.0016) model time 0.2434 (0.2412) loss 6.1948 (5.9452) grad_norm 5.6666 (inf) loss_scale 16384.0000 (20410.4376) mem 7376MB [2024-08-19 23:45:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1010/1251] eta 0:00:58 lr 0.000191 wd 0.0500 time 0.2431 (0.2427) data time 0.0008 (0.0016) model time 0.2423 (0.2412) loss 6.1106 (5.9431) grad_norm 2.7411 (inf) loss_scale 16384.0000 (20370.6113) mem 7376MB [2024-08-19 23:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1020/1251] eta 0:00:56 lr 0.000192 wd 0.0500 time 0.2387 (0.2427) data time 0.0010 (0.0016) model time 0.2377 (0.2412) loss 5.8418 (5.9404) grad_norm 2.3426 (inf) loss_scale 16384.0000 (20331.5651) mem 7376MB [2024-08-19 23:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1030/1251] eta 0:00:53 lr 0.000192 wd 0.0500 time 0.2337 (0.2427) data time 0.0010 (0.0016) model time 0.2328 (0.2412) loss 6.0192 (5.9392) grad_norm 2.7385 (inf) loss_scale 16384.0000 (20293.2764) mem 7376MB [2024-08-19 23:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1040/1251] eta 0:00:51 lr 0.000192 wd 0.0500 time 0.2325 (0.2428) data time 0.0010 (0.0016) model time 0.2315 (0.2413) loss 5.9599 (5.9385) grad_norm 3.3704 (inf) loss_scale 16384.0000 (20255.7233) mem 7376MB [2024-08-19 23:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1050/1251] eta 0:00:48 lr 0.000193 wd 0.0500 time 0.2483 (0.2428) data time 0.0010 (0.0016) model time 0.2473 (0.2413) loss 6.2957 (5.9375) grad_norm 3.1125 (inf) loss_scale 16384.0000 (20218.8849) mem 7376MB [2024-08-19 23:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1060/1251] eta 0:00:46 lr 0.000193 wd 0.0500 time 0.2419 (0.2428) data time 0.0011 (0.0016) model time 0.2408 (0.2413) loss 6.0429 (5.9364) grad_norm 3.8369 (inf) loss_scale 16384.0000 (20182.7408) mem 7376MB [2024-08-19 23:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[3/300][1070/1251] eta 0:00:43 lr 0.000194 wd 0.0500 time 0.2440 (0.2428) data time 0.0009 (0.0016) model time 0.2431 (0.2413) loss 5.4294 (5.9332) grad_norm 2.6966 (inf) loss_scale 16384.0000 (20147.2717) mem 7376MB [2024-08-19 23:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1080/1251] eta 0:00:41 lr 0.000194 wd 0.0500 time 0.2410 (0.2428) data time 0.0012 (0.0015) model time 0.2398 (0.2413) loss 6.1306 (5.9319) grad_norm 2.7656 (inf) loss_scale 16384.0000 (20112.4588) mem 7376MB [2024-08-19 23:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1090/1251] eta 0:00:39 lr 0.000194 wd 0.0500 time 0.2391 (0.2427) data time 0.0010 (0.0015) model time 0.2381 (0.2413) loss 6.2133 (5.9310) grad_norm 2.3231 (inf) loss_scale 16384.0000 (20078.2841) mem 7376MB [2024-08-19 23:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1100/1251] eta 0:00:36 lr 0.000195 wd 0.0500 time 0.2415 (0.2427) data time 0.0008 (0.0015) model time 0.2407 (0.2413) loss 6.2725 (5.9290) grad_norm 3.5019 (inf) loss_scale 16384.0000 (20044.7302) mem 7376MB [2024-08-19 23:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1110/1251] eta 0:00:34 lr 0.000195 wd 0.0500 time 0.2436 (0.2427) data time 0.0008 (0.0015) model time 0.2429 (0.2412) loss 5.6198 (5.9246) grad_norm 3.5549 (inf) loss_scale 16384.0000 (20011.7804) mem 7376MB [2024-08-19 23:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1120/1251] eta 0:00:31 lr 0.000196 wd 0.0500 time 0.2425 (0.2427) data time 0.0009 (0.0015) model time 0.2417 (0.2413) loss 6.2352 (5.9217) grad_norm 3.4459 (inf) loss_scale 16384.0000 (19979.4184) mem 7376MB [2024-08-19 23:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1130/1251] eta 0:00:29 lr 0.000196 wd 0.0500 time 0.2296 (0.2427) data time 0.0009 (0.0015) model time 0.2287 (0.2412) loss 5.2429 (5.9198) grad_norm 2.7875 (inf) loss_scale 16384.0000 (19947.6286) mem 
7376MB [2024-08-19 23:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1140/1251] eta 0:00:26 lr 0.000196 wd 0.0500 time 0.2483 (0.2427) data time 0.0009 (0.0015) model time 0.2473 (0.2412) loss 5.7271 (5.9184) grad_norm 2.4396 (inf) loss_scale 16384.0000 (19916.3961) mem 7376MB [2024-08-19 23:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1150/1251] eta 0:00:24 lr 0.000197 wd 0.0500 time 0.2405 (0.2427) data time 0.0008 (0.0015) model time 0.2396 (0.2412) loss 5.3517 (5.9156) grad_norm 2.8938 (inf) loss_scale 16384.0000 (19885.7063) mem 7376MB [2024-08-19 23:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1160/1251] eta 0:00:22 lr 0.000197 wd 0.0500 time 0.2397 (0.2427) data time 0.0009 (0.0015) model time 0.2388 (0.2412) loss 5.3568 (5.9142) grad_norm 2.4703 (inf) loss_scale 16384.0000 (19855.5452) mem 7376MB [2024-08-19 23:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1170/1251] eta 0:00:19 lr 0.000198 wd 0.0500 time 0.2434 (0.2427) data time 0.0007 (0.0015) model time 0.2427 (0.2412) loss 5.7291 (5.9143) grad_norm 2.6973 (inf) loss_scale 16384.0000 (19825.8992) mem 7376MB [2024-08-19 23:46:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1180/1251] eta 0:00:17 lr 0.000198 wd 0.0500 time 0.2458 (0.2426) data time 0.0011 (0.0015) model time 0.2447 (0.2412) loss 6.3049 (5.9146) grad_norm 2.2812 (inf) loss_scale 16384.0000 (19796.7553) mem 7376MB [2024-08-19 23:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1190/1251] eta 0:00:14 lr 0.000198 wd 0.0500 time 0.2373 (0.2426) data time 0.0009 (0.0015) model time 0.2364 (0.2412) loss 5.6310 (5.9151) grad_norm 2.8048 (inf) loss_scale 16384.0000 (19768.1008) mem 7376MB [2024-08-19 23:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1200/1251] eta 0:00:12 lr 0.000199 wd 0.0500 time 0.2416 (0.2426) data time 0.0008 (0.0015) model time 0.2407 
(0.2412) loss 6.1269 (5.9144) grad_norm 2.9746 (inf) loss_scale 16384.0000 (19739.9234) mem 7376MB [2024-08-19 23:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1210/1251] eta 0:00:09 lr 0.000199 wd 0.0500 time 0.2366 (0.2426) data time 0.0007 (0.0015) model time 0.2359 (0.2412) loss 5.7805 (5.9111) grad_norm 3.0576 (inf) loss_scale 16384.0000 (19712.2114) mem 7376MB [2024-08-19 23:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1220/1251] eta 0:00:07 lr 0.000200 wd 0.0500 time 0.2429 (0.2426) data time 0.0013 (0.0015) model time 0.2416 (0.2411) loss 6.2160 (5.9105) grad_norm 3.7075 (inf) loss_scale 16384.0000 (19684.9533) mem 7376MB [2024-08-19 23:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1230/1251] eta 0:00:05 lr 0.000200 wd 0.0500 time 0.2403 (0.2426) data time 0.0011 (0.0015) model time 0.2392 (0.2411) loss 6.1578 (5.9113) grad_norm 2.8724 (inf) loss_scale 16384.0000 (19658.1381) mem 7376MB [2024-08-19 23:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1240/1251] eta 0:00:02 lr 0.000200 wd 0.0500 time 0.2214 (0.2425) data time 0.0007 (0.0015) model time 0.2207 (0.2410) loss 5.1997 (5.9095) grad_norm 3.7850 (inf) loss_scale 16384.0000 (19631.7550) mem 7376MB [2024-08-19 23:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [3/300][1250/1251] eta 0:00:00 lr 0.000201 wd 0.0500 time 0.2230 (0.2423) data time 0.0005 (0.0015) model time 0.2225 (0.2409) loss 6.3217 (5.9105) grad_norm 2.9757 (inf) loss_scale 16384.0000 (19605.7938) mem 7376MB [2024-08-19 23:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 3 training takes 0:05:03 [2024-08-19 23:46:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-19 23:46:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-19 23:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.428 (0.428) Loss 3.1816 (3.1816) Acc@1 32.422 (32.422) Acc@5 63.965 (63.965) Mem 7376MB [2024-08-19 23:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.116) Loss 3.7949 (3.6536) Acc@1 18.066 (24.414) Acc@5 46.875 (51.536) Mem 7376MB [2024-08-19 23:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.099) Loss 4.3906 (3.6109) Acc@1 18.164 (25.586) Acc@5 40.039 (52.576) Mem 7376MB [2024-08-19 23:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.093) Loss 4.4531 (3.7974) Acc@1 17.871 (24.269) Acc@5 36.719 (49.527) Mem 7376MB [2024-08-19 23:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 4.3281 (3.9156) Acc@1 14.160 (22.980) Acc@5 38.281 (47.206) Mem 7376MB [2024-08-19 23:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 23.852 Acc@5 47.938 [2024-08-19 23:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 23.9% [2024-08-19 23:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 23.85% [2024-08-19 23:46:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-19 23:47:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-19 23:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.453 (0.453) Loss 6.9023 (6.9023) Acc@1 0.488 (0.488) Acc@5 1.465 (1.465) Mem 7376MB [2024-08-19 23:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.087 (0.111) Loss 6.9961 (6.9215) Acc@1 0.195 (0.222) Acc@5 0.879 (0.994) Mem 7376MB [2024-08-19 23:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.096) Loss 6.9609 (6.9293) Acc@1 0.293 (0.214) Acc@5 1.172 (0.939) Mem 7376MB [2024-08-19 23:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 6.9570 (6.9326) Acc@1 0.000 (0.173) Acc@5 0.488 (0.847) Mem 7376MB [2024-08-19 23:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 6.9531 (6.9380) Acc@1 0.195 (0.162) Acc@5 0.879 (0.791) Mem 7376MB [2024-08-19 23:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.188 Acc@5 0.820 [2024-08-19 23:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.2% [2024-08-19 23:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.19% [2024-08-19 23:47:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-19 23:47:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-19 23:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][0/1251] eta 0:13:33 lr 0.000201 wd 0.0500 time 0.6501 (0.6501) data time 0.4159 (0.4159) model time 0.0000 (0.0000) loss 6.2323 (6.2323) grad_norm 3.0323 (3.0323) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][10/1251] eta 0:05:45 lr 0.000201 wd 0.0500 time 0.2435 (0.2785) data time 0.0008 (0.0388) model time 0.0000 (0.0000) loss 5.9751 (5.7171) grad_norm 2.4167 (2.8863) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][20/1251] eta 0:05:20 lr 0.000202 wd 0.0500 time 0.2425 (0.2603) data time 0.0012 (0.0208) model time 0.0000 (0.0000) loss 5.9160 (5.7196) grad_norm 2.6949 (2.8787) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][30/1251] eta 0:05:09 lr 0.000202 wd 0.0500 time 0.2396 (0.2535) data time 0.0010 (0.0144) model time 0.0000 (0.0000) loss 4.9320 (5.6593) grad_norm 3.0491 (3.0893) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][40/1251] eta 0:05:02 lr 0.000202 wd 0.0500 time 0.2431 (0.2499) data time 0.0010 (0.0112) model time 0.0000 (0.0000) loss 5.7135 (5.6977) grad_norm 2.4894 (3.1387) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][50/1251] eta 0:04:57 lr 0.000203 wd 0.0500 time 0.2387 (0.2476) data time 0.0009 (0.0092) model time 0.0000 (0.0000) loss 5.8101 (5.6969) grad_norm 2.8307 (3.0907) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][60/1251] eta 0:04:53 lr 0.000203 wd 0.0500 time 0.2320 (0.2465) data time 0.0008 (0.0078) model time 0.2312 (0.2398) 
loss 5.6043 (5.7148) grad_norm 2.1476 (3.0484) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][70/1251] eta 0:04:52 lr 0.000204 wd 0.0500 time 0.2386 (0.2474) data time 0.0010 (0.0069) model time 0.2377 (0.2458) loss 5.8585 (5.6941) grad_norm 3.2388 (3.0446) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][80/1251] eta 0:04:48 lr 0.000204 wd 0.0500 time 0.2433 (0.2464) data time 0.0009 (0.0062) model time 0.2425 (0.2434) loss 5.3294 (5.6872) grad_norm 2.9815 (2.9870) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][90/1251] eta 0:04:45 lr 0.000204 wd 0.0500 time 0.2364 (0.2459) data time 0.0011 (0.0056) model time 0.2353 (0.2428) loss 5.6757 (5.6805) grad_norm 3.0736 (2.9715) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][100/1251] eta 0:04:42 lr 0.000205 wd 0.0500 time 0.2435 (0.2457) data time 0.0008 (0.0052) model time 0.2426 (0.2427) loss 6.3902 (5.7052) grad_norm 6.0397 (3.0244) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][110/1251] eta 0:04:39 lr 0.000205 wd 0.0500 time 0.2469 (0.2452) data time 0.0010 (0.0048) model time 0.2459 (0.2421) loss 5.7833 (5.6985) grad_norm 3.0043 (3.0433) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][120/1251] eta 0:04:37 lr 0.000206 wd 0.0500 time 0.2334 (0.2449) data time 0.0009 (0.0045) model time 0.2325 (0.2419) loss 6.2228 (5.6932) grad_norm 2.7551 (3.0314) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][130/1251] eta 
0:04:34 lr 0.000206 wd 0.0500 time 0.2420 (0.2446) data time 0.0009 (0.0042) model time 0.2410 (0.2416) loss 5.8169 (5.7051) grad_norm 2.9205 (3.0274) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][140/1251] eta 0:04:31 lr 0.000206 wd 0.0500 time 0.2452 (0.2443) data time 0.0008 (0.0040) model time 0.2444 (0.2414) loss 5.8340 (5.7165) grad_norm 2.3691 (3.0064) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][150/1251] eta 0:04:28 lr 0.000207 wd 0.0500 time 0.2492 (0.2440) data time 0.0010 (0.0038) model time 0.2482 (0.2411) loss 5.5472 (5.7065) grad_norm 2.8353 (2.9972) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][160/1251] eta 0:04:25 lr 0.000207 wd 0.0500 time 0.2344 (0.2437) data time 0.0008 (0.0036) model time 0.2336 (0.2408) loss 5.4321 (5.7201) grad_norm 2.5874 (2.9666) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][170/1251] eta 0:04:23 lr 0.000208 wd 0.0500 time 0.2386 (0.2436) data time 0.0008 (0.0035) model time 0.2378 (0.2408) loss 5.6163 (5.7139) grad_norm 3.7586 (2.9647) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][180/1251] eta 0:04:20 lr 0.000208 wd 0.0500 time 0.2393 (0.2434) data time 0.0008 (0.0033) model time 0.2385 (0.2407) loss 5.7134 (5.7175) grad_norm 2.7280 (2.9777) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][190/1251] eta 0:04:18 lr 0.000208 wd 0.0500 time 0.2454 (0.2432) data time 0.0009 (0.0032) model time 0.2445 (0.2405) loss 5.2463 (5.7093) grad_norm 3.1471 (2.9600) loss_scale 16384.0000 (16384.0000) mem 7376MB 
[2024-08-19 23:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][200/1251] eta 0:04:15 lr 0.000209 wd 0.0500 time 0.2340 (0.2430) data time 0.0010 (0.0031) model time 0.2330 (0.2404) loss 6.1321 (5.7125) grad_norm 3.5843 (2.9665) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][210/1251] eta 0:04:14 lr 0.000209 wd 0.0500 time 0.2401 (0.2440) data time 0.0007 (0.0030) model time 0.2394 (0.2418) loss 5.8586 (5.7190) grad_norm 2.7813 (2.9576) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][220/1251] eta 0:04:12 lr 0.000210 wd 0.0500 time 0.2465 (0.2445) data time 0.0007 (0.0029) model time 0.2458 (0.2425) loss 5.5188 (5.7220) grad_norm 2.5417 (2.9447) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][230/1251] eta 0:04:09 lr 0.000210 wd 0.0500 time 0.2392 (0.2444) data time 0.0008 (0.0028) model time 0.2385 (0.2424) loss 5.8805 (5.7160) grad_norm 2.7401 (2.9418) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][240/1251] eta 0:04:06 lr 0.000210 wd 0.0500 time 0.2498 (0.2442) data time 0.0008 (0.0028) model time 0.2490 (0.2422) loss 6.1707 (5.7255) grad_norm 3.0102 (2.9351) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][250/1251] eta 0:04:04 lr 0.000211 wd 0.0500 time 0.2288 (0.2440) data time 0.0008 (0.0027) model time 0.2280 (0.2420) loss 6.2393 (5.7241) grad_norm 2.7281 (2.9460) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][260/1251] eta 0:04:01 lr 0.000211 wd 0.0500 time 0.2453 (0.2439) data time 0.0009 (0.0026) model time 0.2444 
(0.2419) loss 6.2409 (5.7226) grad_norm 2.8258 (2.9463) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][270/1251] eta 0:03:59 lr 0.000212 wd 0.0500 time 0.2390 (0.2438) data time 0.0011 (0.0026) model time 0.2379 (0.2419) loss 5.7656 (5.7197) grad_norm 3.2483 (2.9536) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][280/1251] eta 0:03:56 lr 0.000212 wd 0.0500 time 0.2340 (0.2437) data time 0.0007 (0.0025) model time 0.2333 (0.2418) loss 6.0900 (5.7253) grad_norm 3.1791 (2.9484) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][290/1251] eta 0:03:54 lr 0.000212 wd 0.0500 time 0.2374 (0.2437) data time 0.0010 (0.0025) model time 0.2364 (0.2418) loss 4.8177 (5.7243) grad_norm 3.0323 (2.9543) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][300/1251] eta 0:03:51 lr 0.000213 wd 0.0500 time 0.2467 (0.2437) data time 0.0008 (0.0025) model time 0.2459 (0.2417) loss 5.6898 (5.7307) grad_norm 2.3368 (2.9499) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][310/1251] eta 0:03:49 lr 0.000213 wd 0.0500 time 0.2422 (0.2436) data time 0.0010 (0.0025) model time 0.2412 (0.2416) loss 6.0291 (5.7269) grad_norm 4.2350 (2.9615) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][320/1251] eta 0:03:46 lr 0.000214 wd 0.0500 time 0.2381 (0.2435) data time 0.0009 (0.0024) model time 0.2373 (0.2416) loss 5.8146 (5.7252) grad_norm 2.7443 (2.9583) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[4/300][330/1251] eta 0:03:44 lr 0.000214 wd 0.0500 time 0.2397 (0.2434) data time 0.0012 (0.0024) model time 0.2385 (0.2415) loss 5.2655 (5.7193) grad_norm 3.9021 (2.9624) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][340/1251] eta 0:03:41 lr 0.000214 wd 0.0500 time 0.2418 (0.2433) data time 0.0010 (0.0023) model time 0.2409 (0.2414) loss 5.9314 (5.7230) grad_norm 2.9983 (2.9727) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][350/1251] eta 0:03:39 lr 0.000215 wd 0.0500 time 0.2402 (0.2432) data time 0.0010 (0.0023) model time 0.2393 (0.2414) loss 6.1627 (5.7227) grad_norm 2.6476 (2.9716) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][360/1251] eta 0:03:36 lr 0.000215 wd 0.0500 time 0.2470 (0.2432) data time 0.0007 (0.0023) model time 0.2462 (0.2413) loss 5.4070 (5.7199) grad_norm 3.9165 (2.9663) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][370/1251] eta 0:03:34 lr 0.000216 wd 0.0500 time 0.2347 (0.2431) data time 0.0008 (0.0022) model time 0.2339 (0.2413) loss 5.5951 (5.7148) grad_norm 3.0160 (2.9605) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][380/1251] eta 0:03:31 lr 0.000216 wd 0.0500 time 0.2304 (0.2430) data time 0.0011 (0.0022) model time 0.2293 (0.2412) loss 5.9193 (5.7091) grad_norm 2.8856 (2.9595) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][390/1251] eta 0:03:29 lr 0.000216 wd 0.0500 time 0.2474 (0.2430) data time 0.0009 (0.0022) model time 0.2465 (0.2412) loss 5.8157 (5.7078) grad_norm 2.7263 (2.9766) loss_scale 16384.0000 
(16384.0000) mem 7376MB [2024-08-19 23:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][400/1251] eta 0:03:26 lr 0.000217 wd 0.0500 time 0.2513 (0.2430) data time 0.0010 (0.0021) model time 0.2503 (0.2412) loss 6.2000 (5.7114) grad_norm 2.8438 (2.9718) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][410/1251] eta 0:03:24 lr 0.000217 wd 0.0500 time 0.2396 (0.2429) data time 0.0009 (0.0021) model time 0.2388 (0.2412) loss 5.8696 (5.7126) grad_norm 2.5097 (2.9677) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][420/1251] eta 0:03:21 lr 0.000218 wd 0.0500 time 0.2402 (0.2429) data time 0.0010 (0.0021) model time 0.2392 (0.2411) loss 5.5380 (5.7097) grad_norm 2.9708 (2.9628) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][430/1251] eta 0:03:19 lr 0.000218 wd 0.0500 time 0.2439 (0.2429) data time 0.0011 (0.0021) model time 0.2429 (0.2412) loss 5.9878 (5.7077) grad_norm 2.0007 (2.9516) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][440/1251] eta 0:03:16 lr 0.000218 wd 0.0500 time 0.2433 (0.2429) data time 0.0010 (0.0020) model time 0.2423 (0.2412) loss 6.0084 (5.7090) grad_norm 2.4773 (2.9417) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][450/1251] eta 0:03:14 lr 0.000219 wd 0.0500 time 0.2371 (0.2429) data time 0.0008 (0.0020) model time 0.2364 (0.2411) loss 5.5285 (5.7075) grad_norm 2.7595 (2.9370) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][460/1251] eta 0:03:12 lr 0.000219 wd 0.0500 time 0.2506 (0.2428) data time 0.0012 
(0.0020) model time 0.2495 (0.2411) loss 6.0703 (5.7099) grad_norm 3.1184 (2.9330) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][470/1251] eta 0:03:09 lr 0.000220 wd 0.0500 time 0.2368 (0.2428) data time 0.0010 (0.0020) model time 0.2358 (0.2411) loss 6.0874 (5.7104) grad_norm 3.1663 (2.9306) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][480/1251] eta 0:03:07 lr 0.000220 wd 0.0500 time 0.2363 (0.2427) data time 0.0011 (0.0020) model time 0.2352 (0.2410) loss 4.9807 (5.7068) grad_norm 3.2740 (2.9329) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][490/1251] eta 0:03:04 lr 0.000220 wd 0.0500 time 0.2355 (0.2426) data time 0.0012 (0.0020) model time 0.2344 (0.2409) loss 6.0270 (5.7078) grad_norm 2.9235 (2.9303) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][500/1251] eta 0:03:02 lr 0.000221 wd 0.0500 time 0.2470 (0.2426) data time 0.0007 (0.0019) model time 0.2463 (0.2409) loss 5.9891 (5.7054) grad_norm 2.6152 (2.9290) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][510/1251] eta 0:02:59 lr 0.000221 wd 0.0500 time 0.2538 (0.2426) data time 0.0010 (0.0019) model time 0.2528 (0.2409) loss 5.3014 (5.7008) grad_norm 2.8069 (2.9278) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][520/1251] eta 0:02:57 lr 0.000222 wd 0.0500 time 0.2444 (0.2426) data time 0.0009 (0.0019) model time 0.2435 (0.2409) loss 5.9415 (5.7018) grad_norm 2.4379 (2.9326) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [4/300][530/1251] eta 0:02:54 lr 0.000222 wd 0.0500 time 0.2456 (0.2426) data time 0.0011 (0.0019) model time 0.2444 (0.2409) loss 4.8519 (5.6958) grad_norm 2.5767 (2.9256) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][540/1251] eta 0:02:52 lr 0.000222 wd 0.0500 time 0.2435 (0.2426) data time 0.0008 (0.0019) model time 0.2427 (0.2409) loss 4.8179 (5.6940) grad_norm 2.7876 (2.9265) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][550/1251] eta 0:02:50 lr 0.000223 wd 0.0500 time 0.2453 (0.2425) data time 0.0010 (0.0019) model time 0.2444 (0.2409) loss 5.9597 (5.6917) grad_norm 2.7713 (2.9256) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][560/1251] eta 0:02:47 lr 0.000223 wd 0.0500 time 0.2454 (0.2425) data time 0.0012 (0.0018) model time 0.2442 (0.2409) loss 5.2783 (5.6937) grad_norm 2.3753 (2.9366) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][570/1251] eta 0:02:45 lr 0.000224 wd 0.0500 time 0.2327 (0.2425) data time 0.0009 (0.0018) model time 0.2317 (0.2409) loss 4.9347 (5.6916) grad_norm 2.8697 (2.9396) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][580/1251] eta 0:02:42 lr 0.000224 wd 0.0500 time 0.2444 (0.2425) data time 0.0010 (0.0018) model time 0.2434 (0.2409) loss 6.0439 (5.6908) grad_norm 2.9698 (2.9409) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][590/1251] eta 0:02:40 lr 0.000224 wd 0.0500 time 0.2505 (0.2425) data time 0.0007 (0.0018) model time 0.2498 (0.2409) loss 5.8775 (5.6925) grad_norm 2.6538 (2.9395) loss_scale 
16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][600/1251] eta 0:02:38 lr 0.000225 wd 0.0500 time 0.2343 (0.2428) data time 0.0007 (0.0018) model time 0.2336 (0.2412) loss 5.0075 (5.6881) grad_norm 2.6031 (2.9381) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][610/1251] eta 0:02:35 lr 0.000225 wd 0.0500 time 0.2359 (0.2427) data time 0.0009 (0.0018) model time 0.2351 (0.2412) loss 5.6536 (5.6872) grad_norm 2.5871 (2.9332) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][620/1251] eta 0:02:33 lr 0.000226 wd 0.0500 time 0.2439 (0.2427) data time 0.0008 (0.0018) model time 0.2431 (0.2412) loss 6.1710 (5.6864) grad_norm 2.4427 (2.9288) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][630/1251] eta 0:02:30 lr 0.000226 wd 0.0500 time 0.2451 (0.2426) data time 0.0010 (0.0018) model time 0.2441 (0.2411) loss 6.0147 (5.6896) grad_norm 2.6604 (2.9320) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][640/1251] eta 0:02:28 lr 0.000226 wd 0.0500 time 0.2411 (0.2426) data time 0.0008 (0.0017) model time 0.2402 (0.2411) loss 4.8826 (5.6875) grad_norm 2.6769 (2.9305) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][650/1251] eta 0:02:25 lr 0.000227 wd 0.0500 time 0.2425 (0.2426) data time 0.0010 (0.0017) model time 0.2415 (0.2410) loss 5.7138 (5.6856) grad_norm 2.6504 (2.9266) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][660/1251] eta 0:02:23 lr 0.000227 wd 0.0500 time 0.2361 (0.2425) data time 
0.0007 (0.0017) model time 0.2354 (0.2410) loss 5.3177 (5.6820) grad_norm 2.6086 (2.9268) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][670/1251] eta 0:02:20 lr 0.000228 wd 0.0500 time 0.2431 (0.2425) data time 0.0007 (0.0017) model time 0.2424 (0.2410) loss 5.9970 (5.6812) grad_norm 2.8172 (2.9250) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][680/1251] eta 0:02:18 lr 0.000228 wd 0.0500 time 0.2398 (0.2424) data time 0.0012 (0.0017) model time 0.2387 (0.2409) loss 5.4462 (5.6819) grad_norm 2.8091 (2.9188) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][690/1251] eta 0:02:15 lr 0.000228 wd 0.0500 time 0.2373 (0.2424) data time 0.0009 (0.0017) model time 0.2364 (0.2409) loss 4.5037 (5.6806) grad_norm 2.5969 (2.9155) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][700/1251] eta 0:02:13 lr 0.000229 wd 0.0500 time 0.2442 (0.2424) data time 0.0010 (0.0017) model time 0.2432 (0.2409) loss 5.3743 (5.6807) grad_norm 2.2114 (2.9187) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][710/1251] eta 0:02:11 lr 0.000229 wd 0.0500 time 0.2442 (0.2424) data time 0.0008 (0.0017) model time 0.2433 (0.2409) loss 5.9568 (5.6787) grad_norm 2.5497 (2.9241) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][720/1251] eta 0:02:08 lr 0.000230 wd 0.0500 time 0.2437 (0.2424) data time 0.0011 (0.0017) model time 0.2427 (0.2409) loss 5.0947 (5.6742) grad_norm 4.0511 (2.9244) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [4/300][730/1251] eta 0:02:06 lr 0.000230 wd 0.0500 time 0.2367 (0.2428) data time 0.0012 (0.0017) model time 0.2356 (0.2414) loss 5.9969 (5.6725) grad_norm 3.3968 (2.9238) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][740/1251] eta 0:02:04 lr 0.000230 wd 0.0500 time 0.2441 (0.2431) data time 0.0010 (0.0017) model time 0.2431 (0.2417) loss 5.5478 (5.6716) grad_norm 3.0731 (2.9233) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][750/1251] eta 0:02:01 lr 0.000231 wd 0.0500 time 0.2393 (0.2430) data time 0.0010 (0.0016) model time 0.2383 (0.2416) loss 5.4850 (5.6714) grad_norm 3.4081 (2.9212) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][760/1251] eta 0:01:59 lr 0.000231 wd 0.0500 time 0.2358 (0.2430) data time 0.0010 (0.0016) model time 0.2348 (0.2416) loss 4.7489 (5.6718) grad_norm 2.6134 (2.9208) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][770/1251] eta 0:01:56 lr 0.000232 wd 0.0500 time 0.2415 (0.2429) data time 0.0010 (0.0016) model time 0.2405 (0.2415) loss 5.0569 (5.6698) grad_norm 2.8887 (2.9199) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][780/1251] eta 0:01:54 lr 0.000232 wd 0.0500 time 0.2439 (0.2429) data time 0.0011 (0.0016) model time 0.2429 (0.2415) loss 5.9736 (5.6691) grad_norm 2.7065 (2.9245) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][790/1251] eta 0:01:51 lr 0.000232 wd 0.0500 time 0.2415 (0.2429) data time 0.0007 (0.0016) model time 0.2408 (0.2415) loss 6.2245 (5.6675) grad_norm 3.5549 (2.9261) 
loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][800/1251] eta 0:01:49 lr 0.000233 wd 0.0500 time 0.2351 (0.2429) data time 0.0008 (0.0016) model time 0.2342 (0.2414) loss 6.1431 (5.6658) grad_norm 4.7294 (2.9272) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][810/1251] eta 0:01:47 lr 0.000233 wd 0.0500 time 0.2441 (0.2429) data time 0.0008 (0.0016) model time 0.2433 (0.2414) loss 5.2764 (5.6647) grad_norm 2.5133 (2.9260) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][820/1251] eta 0:01:44 lr 0.000234 wd 0.0500 time 0.2437 (0.2428) data time 0.0011 (0.0016) model time 0.2426 (0.2414) loss 5.2813 (5.6650) grad_norm 2.9653 (2.9253) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][830/1251] eta 0:01:42 lr 0.000234 wd 0.0500 time 0.2437 (0.2428) data time 0.0010 (0.0016) model time 0.2427 (0.2414) loss 6.0028 (5.6644) grad_norm 2.2742 (2.9219) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][840/1251] eta 0:01:39 lr 0.000234 wd 0.0500 time 0.2351 (0.2428) data time 0.0009 (0.0016) model time 0.2341 (0.2414) loss 6.0787 (5.6626) grad_norm 2.7070 (2.9194) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][850/1251] eta 0:01:37 lr 0.000235 wd 0.0500 time 0.2364 (0.2428) data time 0.0014 (0.0016) model time 0.2350 (0.2414) loss 5.8343 (5.6611) grad_norm 3.2136 (2.9190) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][860/1251] eta 0:01:34 lr 0.000235 wd 0.0500 time 0.2388 (0.2428) 
data time 0.0008 (0.0016) model time 0.2380 (0.2414) loss 6.0592 (5.6649) grad_norm 2.7736 (2.9155) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][870/1251] eta 0:01:32 lr 0.000236 wd 0.0500 time 0.2393 (0.2427) data time 0.0009 (0.0016) model time 0.2384 (0.2413) loss 6.0361 (5.6669) grad_norm 3.5415 (2.9193) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][880/1251] eta 0:01:30 lr 0.000236 wd 0.0500 time 0.2402 (0.2427) data time 0.0009 (0.0016) model time 0.2393 (0.2413) loss 5.8238 (5.6665) grad_norm 2.8186 (2.9170) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][890/1251] eta 0:01:27 lr 0.000236 wd 0.0500 time 0.2659 (0.2427) data time 0.0009 (0.0016) model time 0.2649 (0.2413) loss 5.1211 (5.6635) grad_norm 2.5149 (2.9114) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][900/1251] eta 0:01:25 lr 0.000237 wd 0.0500 time 0.2481 (0.2427) data time 0.0010 (0.0016) model time 0.2471 (0.2413) loss 5.9414 (5.6607) grad_norm 2.7923 (2.9113) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][910/1251] eta 0:01:22 lr 0.000237 wd 0.0500 time 0.2497 (0.2426) data time 0.0010 (0.0016) model time 0.2487 (0.2412) loss 5.0719 (5.6597) grad_norm 2.5224 (2.9116) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][920/1251] eta 0:01:20 lr 0.000238 wd 0.0500 time 0.2406 (0.2426) data time 0.0010 (0.0015) model time 0.2396 (0.2412) loss 5.7174 (5.6593) grad_norm 2.7967 (2.9117) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [4/300][930/1251] eta 0:01:17 lr 0.000238 wd 0.0500 time 0.2446 (0.2426) data time 0.0010 (0.0015) model time 0.2436 (0.2412) loss 5.9158 (5.6592) grad_norm 3.1920 (2.9088) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][940/1251] eta 0:01:15 lr 0.000238 wd 0.0500 time 0.2337 (0.2426) data time 0.0011 (0.0015) model time 0.2326 (0.2412) loss 5.1923 (5.6606) grad_norm 2.8774 (2.9100) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][950/1251] eta 0:01:13 lr 0.000239 wd 0.0500 time 0.2376 (0.2426) data time 0.0011 (0.0015) model time 0.2365 (0.2412) loss 5.3491 (5.6586) grad_norm 2.8892 (2.9090) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][960/1251] eta 0:01:10 lr 0.000239 wd 0.0500 time 0.2406 (0.2426) data time 0.0011 (0.0015) model time 0.2395 (0.2412) loss 6.2353 (5.6569) grad_norm 2.9542 (2.9080) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][970/1251] eta 0:01:08 lr 0.000240 wd 0.0500 time 0.2471 (0.2426) data time 0.0009 (0.0015) model time 0.2462 (0.2412) loss 5.4261 (5.6571) grad_norm 3.0200 (2.9053) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][980/1251] eta 0:01:05 lr 0.000240 wd 0.0500 time 0.2472 (0.2426) data time 0.0010 (0.0015) model time 0.2463 (0.2412) loss 5.5041 (5.6560) grad_norm 2.5909 (2.9057) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][990/1251] eta 0:01:03 lr 0.000240 wd 0.0500 time 0.2332 (0.2425) data time 0.0010 (0.0015) model time 0.2322 (0.2412) loss 5.6909 (5.6549) grad_norm 
2.5213 (2.9041) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1000/1251] eta 0:01:00 lr 0.000241 wd 0.0500 time 0.2558 (0.2425) data time 0.0011 (0.0015) model time 0.2547 (0.2412) loss 5.8653 (5.6546) grad_norm 2.7582 (2.9029) loss_scale 32768.0000 (16482.2058) mem 7376MB [2024-08-19 23:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1010/1251] eta 0:00:58 lr 0.000241 wd 0.0500 time 0.2395 (0.2425) data time 0.0010 (0.0015) model time 0.2385 (0.2412) loss 5.5231 (5.6551) grad_norm 2.2527 (2.9015) loss_scale 32768.0000 (16643.2918) mem 7376MB [2024-08-19 23:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1020/1251] eta 0:00:56 lr 0.000242 wd 0.0500 time 0.2490 (0.2425) data time 0.0010 (0.0015) model time 0.2480 (0.2412) loss 5.1150 (5.6525) grad_norm 2.6042 (2.8986) loss_scale 32768.0000 (16801.2223) mem 7376MB [2024-08-19 23:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1030/1251] eta 0:00:53 lr 0.000242 wd 0.0500 time 0.2346 (0.2425) data time 0.0010 (0.0015) model time 0.2336 (0.2411) loss 5.5465 (5.6482) grad_norm 3.1753 (2.8985) loss_scale 32768.0000 (16956.0892) mem 7376MB [2024-08-19 23:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1040/1251] eta 0:00:51 lr 0.000242 wd 0.0500 time 0.2365 (0.2425) data time 0.0008 (0.0015) model time 0.2357 (0.2411) loss 6.0849 (5.6474) grad_norm 2.3882 (2.8984) loss_scale 32768.0000 (17107.9808) mem 7376MB [2024-08-19 23:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1050/1251] eta 0:00:48 lr 0.000243 wd 0.0500 time 0.2438 (0.2424) data time 0.0008 (0.0015) model time 0.2430 (0.2411) loss 5.3860 (5.6459) grad_norm 3.2420 (2.8974) loss_scale 32768.0000 (17256.9819) mem 7376MB [2024-08-19 23:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1060/1251] eta 0:00:46 lr 0.000243 wd 
0.0500 time 0.2443 (0.2424) data time 0.0008 (0.0015) model time 0.2434 (0.2411) loss 5.8165 (5.6453) grad_norm 2.6147 (2.8945) loss_scale 32768.0000 (17403.1744) mem 7376MB [2024-08-19 23:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1070/1251] eta 0:00:43 lr 0.000244 wd 0.0500 time 0.2472 (0.2424) data time 0.0010 (0.0015) model time 0.2461 (0.2410) loss 6.1563 (5.6455) grad_norm 2.4221 (2.8914) loss_scale 32768.0000 (17546.6368) mem 7376MB [2024-08-19 23:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1080/1251] eta 0:00:41 lr 0.000244 wd 0.0500 time 0.2416 (0.2424) data time 0.0007 (0.0015) model time 0.2409 (0.2410) loss 4.9816 (5.6445) grad_norm 3.4005 (2.8885) loss_scale 32768.0000 (17687.4450) mem 7376MB [2024-08-19 23:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1090/1251] eta 0:00:39 lr 0.000244 wd 0.0500 time 0.2363 (0.2424) data time 0.0012 (0.0015) model time 0.2351 (0.2410) loss 4.9871 (5.6441) grad_norm 2.6542 (2.8865) loss_scale 32768.0000 (17825.6719) mem 7376MB [2024-08-19 23:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1100/1251] eta 0:00:36 lr 0.000245 wd 0.0500 time 0.2420 (0.2424) data time 0.0009 (0.0015) model time 0.2411 (0.2410) loss 5.0056 (5.6414) grad_norm 2.7692 (2.8850) loss_scale 32768.0000 (17961.3878) mem 7376MB [2024-08-19 23:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1110/1251] eta 0:00:34 lr 0.000245 wd 0.0500 time 0.2444 (0.2423) data time 0.0011 (0.0015) model time 0.2434 (0.2410) loss 5.8965 (5.6371) grad_norm 4.8223 (2.8873) loss_scale 32768.0000 (18094.6607) mem 7376MB [2024-08-19 23:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1120/1251] eta 0:00:31 lr 0.000246 wd 0.0500 time 0.2527 (0.2424) data time 0.0010 (0.0015) model time 0.2517 (0.2410) loss 5.6636 (5.6368) grad_norm 2.7011 (2.8880) loss_scale 32768.0000 (18225.5558) mem 7376MB [2024-08-19 
23:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1130/1251] eta 0:00:29 lr 0.000246 wd 0.0500 time 0.2444 (0.2424) data time 0.0010 (0.0015) model time 0.2435 (0.2410) loss 5.1070 (5.6368) grad_norm 2.9068 (2.8867) loss_scale 32768.0000 (18354.1362) mem 7376MB [2024-08-19 23:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1140/1251] eta 0:00:26 lr 0.000246 wd 0.0500 time 0.2350 (0.2423) data time 0.0007 (0.0015) model time 0.2343 (0.2410) loss 5.4432 (5.6337) grad_norm 2.6179 (2.8843) loss_scale 32768.0000 (18480.4628) mem 7376MB [2024-08-19 23:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1150/1251] eta 0:00:24 lr 0.000247 wd 0.0500 time 0.2429 (0.2423) data time 0.0010 (0.0015) model time 0.2419 (0.2410) loss 5.9372 (5.6333) grad_norm 2.6758 (2.8814) loss_scale 32768.0000 (18604.5943) mem 7376MB [2024-08-19 23:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1160/1251] eta 0:00:22 lr 0.000247 wd 0.0500 time 0.2488 (0.2423) data time 0.0010 (0.0014) model time 0.2479 (0.2410) loss 5.7977 (5.6324) grad_norm 2.8289 (2.8786) loss_scale 32768.0000 (18726.5874) mem 7376MB [2024-08-19 23:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1170/1251] eta 0:00:19 lr 0.000248 wd 0.0500 time 0.2356 (0.2423) data time 0.0008 (0.0014) model time 0.2348 (0.2410) loss 5.8208 (5.6328) grad_norm 2.7232 (2.8761) loss_scale 32768.0000 (18846.4970) mem 7376MB [2024-08-19 23:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1180/1251] eta 0:00:17 lr 0.000248 wd 0.0500 time 0.2379 (0.2423) data time 0.0010 (0.0014) model time 0.2369 (0.2410) loss 5.7661 (5.6315) grad_norm 2.6840 (2.8754) loss_scale 32768.0000 (18964.3760) mem 7376MB [2024-08-19 23:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1190/1251] eta 0:00:14 lr 0.000248 wd 0.0500 time 0.2339 (0.2423) data time 0.0008 (0.0014) model time 0.2331 
(0.2410) loss 5.1083 (5.6307) grad_norm 2.7533 (2.8742) loss_scale 32768.0000 (19080.2754) mem 7376MB [2024-08-19 23:51:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1200/1251] eta 0:00:12 lr 0.000249 wd 0.0500 time 0.2357 (0.2423) data time 0.0012 (0.0014) model time 0.2345 (0.2410) loss 4.9677 (5.6313) grad_norm 2.5078 (2.8711) loss_scale 32768.0000 (19194.2448) mem 7376MB [2024-08-19 23:51:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1210/1251] eta 0:00:09 lr 0.000249 wd 0.0500 time 0.2434 (0.2423) data time 0.0008 (0.0014) model time 0.2426 (0.2410) loss 4.5711 (5.6261) grad_norm 2.0989 (2.8675) loss_scale 32768.0000 (19306.3320) mem 7376MB [2024-08-19 23:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1220/1251] eta 0:00:07 lr 0.000250 wd 0.0500 time 0.2429 (0.2423) data time 0.0012 (0.0014) model time 0.2417 (0.2410) loss 5.6721 (5.6254) grad_norm 2.4697 (2.8657) loss_scale 32768.0000 (19416.5831) mem 7376MB [2024-08-19 23:52:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1230/1251] eta 0:00:05 lr 0.000250 wd 0.0500 time 0.2362 (0.2423) data time 0.0011 (0.0014) model time 0.2351 (0.2409) loss 5.6653 (5.6231) grad_norm 2.1146 (2.8638) loss_scale 32768.0000 (19525.0431) mem 7376MB [2024-08-19 23:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1240/1251] eta 0:00:02 lr 0.000250 wd 0.0500 time 0.2291 (0.2422) data time 0.0008 (0.0014) model time 0.2283 (0.2408) loss 5.1624 (5.6235) grad_norm 2.4161 (2.8617) loss_scale 32768.0000 (19631.7550) mem 7376MB [2024-08-19 23:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [4/300][1250/1251] eta 0:00:00 lr 0.000251 wd 0.0500 time 0.2285 (0.2422) data time 0.0007 (0.0014) model time 0.2278 (0.2409) loss 4.8510 (5.6241) grad_norm 2.6198 (2.8616) loss_scale 32768.0000 (19736.7610) mem 7376MB [2024-08-19 23:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 4 training 
takes 0:05:03 [2024-08-19 23:52:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-19 23:52:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-19 23:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.546 (0.546) Loss 2.3379 (2.3379) Acc@1 49.023 (49.023) Acc@5 78.125 (78.125) Mem 7376MB [2024-08-19 23:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.121) Loss 2.8867 (2.9460) Acc@1 36.914 (36.630) Acc@5 68.164 (64.498) Mem 7376MB [2024-08-19 23:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.100) Loss 3.6855 (2.9234) Acc@1 28.516 (37.109) Acc@5 51.367 (65.565) Mem 7376MB [2024-08-19 23:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.092) Loss 3.8301 (3.1416) Acc@1 26.172 (34.555) Acc@5 50.195 (61.580) Mem 7376MB [2024-08-19 23:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 3.7520 (3.2779) Acc@1 23.828 (32.570) Acc@5 47.363 (58.889) Mem 7376MB [2024-08-19 23:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 33.026 Acc@5 59.216 [2024-08-19 23:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 33.0% [2024-08-19 23:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 33.03% [2024-08-19 23:52:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-19 23:52:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-19 23:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.452 (0.452) Loss 6.9219 (6.9219) Acc@1 0.195 (0.195) Acc@5 1.270 (1.270) Mem 7376MB [2024-08-19 23:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.111) Loss 7.1602 (6.9716) Acc@1 0.098 (0.169) Acc@5 0.977 (0.755) Mem 7376MB [2024-08-19 23:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.096) Loss 6.9688 (6.9669) Acc@1 0.000 (0.149) Acc@5 0.293 (0.749) Mem 7376MB [2024-08-19 23:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 6.9531 (6.9618) Acc@1 0.000 (0.161) Acc@5 0.391 (0.750) Mem 7376MB [2024-08-19 23:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 6.9688 (6.9581) Acc@1 0.000 (0.157) Acc@5 0.293 (0.776) Mem 7376MB [2024-08-19 23:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.210 Acc@5 0.902 [2024-08-19 23:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.2% [2024-08-19 23:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.21% [2024-08-19 23:52:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-19 23:52:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-19 23:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][0/1251] eta 0:15:24 lr 0.000251 wd 0.0500 time 0.7388 (0.7388) data time 0.5065 (0.5065) model time 0.0000 (0.0000) loss 5.6333 (5.6333) grad_norm 2.4445 (2.4445) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-19 23:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][10/1251] eta 0:05:54 lr 0.000251 wd 0.0500 time 0.2409 (0.2856) data time 0.0009 (0.0470) model time 0.0000 (0.0000) loss 5.8513 (5.6605) grad_norm 4.6378 (inf) loss_scale 16384.0000 (25320.7273) mem 7376MB [2024-08-19 23:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][20/1251] eta 0:05:25 lr 0.000252 wd 0.0500 time 0.2417 (0.2644) data time 0.0010 (0.0252) model time 0.0000 (0.0000) loss 5.4772 (5.5810) grad_norm 3.5984 (inf) loss_scale 16384.0000 (21065.1429) mem 7376MB [2024-08-19 23:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][30/1251] eta 0:05:13 lr 0.000252 wd 0.0500 time 0.2439 (0.2566) data time 0.0008 (0.0174) model time 0.0000 (0.0000) loss 5.2604 (5.5627) grad_norm 2.5033 (inf) loss_scale 16384.0000 (19555.0968) mem 7376MB [2024-08-19 23:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][40/1251] eta 0:05:06 lr 0.000252 wd 0.0500 time 0.2418 (0.2527) data time 0.0012 (0.0134) model time 0.0000 (0.0000) loss 6.2547 (5.6419) grad_norm 2.6217 (inf) loss_scale 16384.0000 (18781.6585) mem 7376MB [2024-08-19 23:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][50/1251] eta 0:05:00 lr 0.000253 wd 0.0500 time 0.2371 (0.2504) data time 0.0008 (0.0110) model time 0.0000 (0.0000) loss 5.4949 (5.6447) grad_norm 2.5432 (inf) loss_scale 16384.0000 (18311.5294) mem 7376MB [2024-08-19 23:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][60/1251] eta 0:04:57 lr 0.000253 wd 0.0500 time 0.2452 (0.2494) data time 0.0007 (0.0094) model time 0.2445 (0.2428) loss 5.7313 
(5.6373) grad_norm 2.4238 (inf) loss_scale 16384.0000 (17995.5410) mem 7376MB [2024-08-19 23:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][70/1251] eta 0:04:53 lr 0.000254 wd 0.0500 time 0.2402 (0.2482) data time 0.0008 (0.0083) model time 0.2394 (0.2415) loss 4.6550 (5.6214) grad_norm 2.1312 (inf) loss_scale 16384.0000 (17768.5634) mem 7376MB [2024-08-19 23:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][80/1251] eta 0:04:49 lr 0.000254 wd 0.0500 time 0.2506 (0.2474) data time 0.0010 (0.0074) model time 0.2496 (0.2412) loss 5.7281 (5.5947) grad_norm 2.8996 (inf) loss_scale 16384.0000 (17597.6296) mem 7376MB [2024-08-19 23:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][90/1251] eta 0:04:46 lr 0.000254 wd 0.0500 time 0.2383 (0.2466) data time 0.0010 (0.0067) model time 0.2374 (0.2406) loss 5.5994 (5.5908) grad_norm 3.2201 (inf) loss_scale 16384.0000 (17464.2637) mem 7376MB [2024-08-19 23:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][100/1251] eta 0:04:43 lr 0.000255 wd 0.0500 time 0.2359 (0.2461) data time 0.0011 (0.0061) model time 0.2348 (0.2407) loss 5.4167 (5.5675) grad_norm 3.3577 (inf) loss_scale 16384.0000 (17357.3069) mem 7376MB [2024-08-19 23:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][110/1251] eta 0:04:40 lr 0.000255 wd 0.0500 time 0.2384 (0.2457) data time 0.0008 (0.0056) model time 0.2376 (0.2407) loss 5.9476 (5.5705) grad_norm 3.8875 (inf) loss_scale 16384.0000 (17269.6216) mem 7376MB [2024-08-19 23:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][120/1251] eta 0:04:37 lr 0.000256 wd 0.0500 time 0.2532 (0.2454) data time 0.0010 (0.0053) model time 0.2521 (0.2406) loss 5.9239 (5.5751) grad_norm 3.3605 (inf) loss_scale 16384.0000 (17196.4298) mem 7376MB [2024-08-19 23:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][130/1251] eta 0:04:34 lr 0.000256 wd 0.0500 time 
0.2463 (0.2451) data time 0.0009 (0.0049) model time 0.2454 (0.2406) loss 5.9242 (5.5653) grad_norm 2.9468 (inf) loss_scale 16384.0000 (17134.4122) mem 7376MB [2024-08-19 23:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][140/1251] eta 0:04:33 lr 0.000256 wd 0.0500 time 0.2461 (0.2464) data time 0.0011 (0.0047) model time 0.2450 (0.2430) loss 5.4117 (5.5595) grad_norm 2.4092 (inf) loss_scale 16384.0000 (17081.1915) mem 7376MB [2024-08-19 23:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][150/1251] eta 0:04:30 lr 0.000257 wd 0.0500 time 0.2406 (0.2461) data time 0.0008 (0.0044) model time 0.2398 (0.2428) loss 6.1156 (5.5630) grad_norm 2.1942 (inf) loss_scale 16384.0000 (17035.0199) mem 7376MB [2024-08-19 23:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][160/1251] eta 0:04:29 lr 0.000257 wd 0.0500 time 0.2490 (0.2471) data time 0.0010 (0.0043) model time 0.2480 (0.2443) loss 5.5296 (5.5572) grad_norm 3.6338 (inf) loss_scale 16384.0000 (16994.5839) mem 7376MB [2024-08-19 23:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][170/1251] eta 0:04:26 lr 0.000258 wd 0.0500 time 0.2447 (0.2468) data time 0.0008 (0.0041) model time 0.2439 (0.2441) loss 5.3132 (5.5560) grad_norm 2.9404 (inf) loss_scale 16384.0000 (16958.8772) mem 7376MB [2024-08-19 23:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][180/1251] eta 0:04:23 lr 0.000258 wd 0.0500 time 0.2411 (0.2464) data time 0.0010 (0.0040) model time 0.2401 (0.2436) loss 5.7290 (5.5361) grad_norm 2.4392 (inf) loss_scale 16384.0000 (16927.1160) mem 7376MB [2024-08-19 23:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][190/1251] eta 0:04:21 lr 0.000258 wd 0.0500 time 0.2366 (0.2461) data time 0.0007 (0.0038) model time 0.2358 (0.2433) loss 4.0946 (5.5255) grad_norm 3.2303 (inf) loss_scale 16384.0000 (16898.6806) mem 7376MB [2024-08-19 23:53:07 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [5/300][200/1251] eta 0:04:18 lr 0.000259 wd 0.0500 time 0.2460 (0.2460) data time 0.0008 (0.0037) model time 0.2452 (0.2433) loss 4.2453 (5.5140) grad_norm 3.0064 (inf) loss_scale 16384.0000 (16873.0746) mem 7376MB [2024-08-19 23:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][210/1251] eta 0:04:15 lr 0.000259 wd 0.0500 time 0.2361 (0.2457) data time 0.0009 (0.0035) model time 0.2351 (0.2431) loss 4.5398 (5.5063) grad_norm 2.8879 (inf) loss_scale 16384.0000 (16849.8957) mem 7376MB [2024-08-19 23:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][220/1251] eta 0:04:13 lr 0.000260 wd 0.0500 time 0.2387 (0.2455) data time 0.0010 (0.0035) model time 0.2376 (0.2428) loss 5.4696 (5.4933) grad_norm 2.9749 (inf) loss_scale 16384.0000 (16828.8145) mem 7376MB [2024-08-19 23:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][230/1251] eta 0:04:10 lr 0.000260 wd 0.0500 time 0.2422 (0.2452) data time 0.0009 (0.0034) model time 0.2413 (0.2426) loss 6.1063 (5.4868) grad_norm 2.6013 (inf) loss_scale 16384.0000 (16809.5584) mem 7376MB [2024-08-19 23:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][240/1251] eta 0:04:07 lr 0.000260 wd 0.0500 time 0.2385 (0.2451) data time 0.0010 (0.0033) model time 0.2375 (0.2425) loss 5.4152 (5.4859) grad_norm 3.2124 (inf) loss_scale 16384.0000 (16791.9004) mem 7376MB [2024-08-19 23:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][250/1251] eta 0:04:05 lr 0.000261 wd 0.0500 time 0.2447 (0.2449) data time 0.0007 (0.0032) model time 0.2440 (0.2424) loss 5.8553 (5.4783) grad_norm 2.2432 (inf) loss_scale 16384.0000 (16775.6494) mem 7376MB [2024-08-19 23:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][260/1251] eta 0:04:02 lr 0.000261 wd 0.0500 time 0.2431 (0.2448) data time 0.0011 (0.0031) model time 0.2420 (0.2423) loss 5.5658 (5.4831) grad_norm 2.5210 (inf) 
loss_scale 16384.0000 (16760.6437) mem 7376MB [2024-08-19 23:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][270/1251] eta 0:03:59 lr 0.000262 wd 0.0500 time 0.2310 (0.2446) data time 0.0012 (0.0030) model time 0.2299 (0.2422) loss 5.7924 (5.4795) grad_norm 2.2863 (inf) loss_scale 16384.0000 (16746.7454) mem 7376MB [2024-08-19 23:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][280/1251] eta 0:03:57 lr 0.000262 wd 0.0500 time 0.2427 (0.2446) data time 0.0009 (0.0029) model time 0.2418 (0.2421) loss 5.6768 (5.4767) grad_norm 2.2742 (inf) loss_scale 16384.0000 (16733.8363) mem 7376MB [2024-08-19 23:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][290/1251] eta 0:03:54 lr 0.000262 wd 0.0500 time 0.2346 (0.2444) data time 0.0010 (0.0029) model time 0.2336 (0.2419) loss 5.8773 (5.4803) grad_norm 1.9320 (inf) loss_scale 16384.0000 (16721.8144) mem 7376MB [2024-08-19 23:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][300/1251] eta 0:03:52 lr 0.000263 wd 0.0500 time 0.2375 (0.2442) data time 0.0011 (0.0028) model time 0.2363 (0.2418) loss 5.0529 (5.4811) grad_norm 2.8067 (inf) loss_scale 16384.0000 (16710.5914) mem 7376MB [2024-08-19 23:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][310/1251] eta 0:03:49 lr 0.000263 wd 0.0500 time 0.2433 (0.2441) data time 0.0009 (0.0028) model time 0.2424 (0.2417) loss 5.4872 (5.4777) grad_norm 2.9056 (inf) loss_scale 16384.0000 (16700.0900) mem 7376MB [2024-08-19 23:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][320/1251] eta 0:03:47 lr 0.000264 wd 0.0500 time 0.2479 (0.2441) data time 0.0009 (0.0028) model time 0.2470 (0.2417) loss 4.2047 (5.4776) grad_norm 3.0770 (inf) loss_scale 16384.0000 (16690.2430) mem 7376MB [2024-08-19 23:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][330/1251] eta 0:03:44 lr 0.000264 wd 0.0500 time 0.2315 (0.2439) data time 0.0008 
(0.0027) model time 0.2308 (0.2415) loss 5.7232 (5.4764) grad_norm 2.1636 (inf) loss_scale 16384.0000 (16680.9909) mem 7376MB [2024-08-19 23:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][340/1251] eta 0:03:42 lr 0.000264 wd 0.0500 time 0.2372 (0.2439) data time 0.0010 (0.0027) model time 0.2362 (0.2415) loss 5.6925 (5.4779) grad_norm 2.1138 (inf) loss_scale 16384.0000 (16672.2815) mem 7376MB [2024-08-19 23:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][350/1251] eta 0:03:39 lr 0.000265 wd 0.0500 time 0.2418 (0.2438) data time 0.0009 (0.0027) model time 0.2409 (0.2414) loss 4.7788 (5.4695) grad_norm 2.5976 (inf) loss_scale 16384.0000 (16664.0684) mem 7376MB [2024-08-19 23:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][360/1251] eta 0:03:37 lr 0.000265 wd 0.0500 time 0.2455 (0.2437) data time 0.0009 (0.0026) model time 0.2447 (0.2414) loss 5.7540 (5.4714) grad_norm 2.8303 (inf) loss_scale 16384.0000 (16656.3102) mem 7376MB [2024-08-19 23:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][370/1251] eta 0:03:34 lr 0.000266 wd 0.0500 time 0.2297 (0.2436) data time 0.0012 (0.0026) model time 0.2286 (0.2412) loss 5.4593 (5.4760) grad_norm 2.5450 (inf) loss_scale 16384.0000 (16648.9704) mem 7376MB [2024-08-19 23:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][380/1251] eta 0:03:32 lr 0.000266 wd 0.0500 time 0.2405 (0.2435) data time 0.0011 (0.0026) model time 0.2394 (0.2411) loss 5.8381 (5.4727) grad_norm 3.1330 (inf) loss_scale 16384.0000 (16642.0157) mem 7376MB [2024-08-19 23:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][390/1251] eta 0:03:29 lr 0.000266 wd 0.0500 time 0.2293 (0.2434) data time 0.0010 (0.0025) model time 0.2283 (0.2411) loss 5.5156 (5.4732) grad_norm 2.6535 (inf) loss_scale 16384.0000 (16635.4169) mem 7376MB [2024-08-19 23:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[5/300][400/1251] eta 0:03:27 lr 0.000267 wd 0.0500 time 0.2443 (0.2433) data time 0.0010 (0.0025) model time 0.2432 (0.2410) loss 5.5036 (5.4679) grad_norm 2.6043 (inf) loss_scale 16384.0000 (16629.1471) mem 7376MB [2024-08-19 23:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][410/1251] eta 0:03:24 lr 0.000267 wd 0.0500 time 0.2435 (0.2432) data time 0.0008 (0.0025) model time 0.2427 (0.2410) loss 5.7468 (5.4682) grad_norm 2.3410 (inf) loss_scale 16384.0000 (16623.1825) mem 7376MB [2024-08-19 23:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][420/1251] eta 0:03:22 lr 0.000268 wd 0.0500 time 0.2410 (0.2433) data time 0.0009 (0.0024) model time 0.2401 (0.2410) loss 5.8869 (5.4636) grad_norm 2.3054 (inf) loss_scale 16384.0000 (16617.5012) mem 7376MB [2024-08-19 23:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][430/1251] eta 0:03:19 lr 0.000268 wd 0.0500 time 0.2420 (0.2432) data time 0.0010 (0.0024) model time 0.2411 (0.2410) loss 5.0537 (5.4574) grad_norm 2.9288 (inf) loss_scale 16384.0000 (16612.0835) mem 7376MB [2024-08-19 23:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][440/1251] eta 0:03:17 lr 0.000268 wd 0.0500 time 0.2358 (0.2432) data time 0.0009 (0.0024) model time 0.2349 (0.2410) loss 5.9447 (5.4559) grad_norm 2.5387 (inf) loss_scale 16384.0000 (16606.9116) mem 7376MB [2024-08-19 23:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][450/1251] eta 0:03:14 lr 0.000269 wd 0.0500 time 0.2398 (0.2430) data time 0.0010 (0.0023) model time 0.2388 (0.2409) loss 5.8812 (5.4601) grad_norm 2.1136 (inf) loss_scale 16384.0000 (16601.9690) mem 7376MB [2024-08-19 23:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][460/1251] eta 0:03:12 lr 0.000269 wd 0.0500 time 0.2464 (0.2430) data time 0.0009 (0.0023) model time 0.2455 (0.2409) loss 6.0563 (5.4625) grad_norm 2.0797 (inf) loss_scale 16384.0000 (16597.2408) mem 7376MB 
[2024-08-19 23:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][470/1251] eta 0:03:09 lr 0.000270 wd 0.0500 time 0.2392 (0.2429) data time 0.0012 (0.0023) model time 0.2381 (0.2408) loss 5.2278 (5.4674) grad_norm 2.6004 (inf) loss_scale 16384.0000 (16592.7134) mem 7376MB [2024-08-19 23:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][480/1251] eta 0:03:07 lr 0.000270 wd 0.0500 time 0.2418 (0.2429) data time 0.0008 (0.0023) model time 0.2410 (0.2407) loss 4.6077 (5.4674) grad_norm 2.3427 (inf) loss_scale 16384.0000 (16588.3742) mem 7376MB [2024-08-19 23:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][490/1251] eta 0:03:04 lr 0.000270 wd 0.0500 time 0.2389 (0.2428) data time 0.0008 (0.0023) model time 0.2381 (0.2407) loss 4.5307 (5.4589) grad_norm 3.1492 (inf) loss_scale 16384.0000 (16584.2118) mem 7376MB [2024-08-19 23:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][500/1251] eta 0:03:02 lr 0.000271 wd 0.0500 time 0.2385 (0.2428) data time 0.0009 (0.0022) model time 0.2376 (0.2407) loss 6.1628 (5.4579) grad_norm 2.2622 (inf) loss_scale 16384.0000 (16580.2156) mem 7376MB [2024-08-19 23:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][510/1251] eta 0:02:59 lr 0.000271 wd 0.0500 time 0.2395 (0.2428) data time 0.0009 (0.0022) model time 0.2386 (0.2407) loss 5.2893 (5.4519) grad_norm 2.5867 (inf) loss_scale 16384.0000 (16576.3757) mem 7376MB [2024-08-19 23:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][520/1251] eta 0:02:57 lr 0.000272 wd 0.0500 time 0.2440 (0.2427) data time 0.0011 (0.0022) model time 0.2429 (0.2406) loss 5.9137 (5.4532) grad_norm 2.5697 (inf) loss_scale 16384.0000 (16572.6833) mem 7376MB [2024-08-19 23:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][530/1251] eta 0:02:54 lr 0.000272 wd 0.0500 time 0.2442 (0.2427) data time 0.0007 (0.0022) model time 0.2435 (0.2406) loss 
4.8746 (5.4511) grad_norm 2.3673 (inf) loss_scale 16384.0000 (16569.1299) mem 7376MB [2024-08-19 23:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][540/1251] eta 0:02:52 lr 0.000272 wd 0.0500 time 0.2333 (0.2426) data time 0.0012 (0.0022) model time 0.2321 (0.2406) loss 5.6104 (5.4561) grad_norm 3.1303 (inf) loss_scale 16384.0000 (16565.7079) mem 7376MB [2024-08-19 23:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][550/1251] eta 0:02:50 lr 0.000273 wd 0.0500 time 0.2428 (0.2426) data time 0.0007 (0.0021) model time 0.2421 (0.2405) loss 4.6903 (5.4512) grad_norm 2.5108 (inf) loss_scale 16384.0000 (16562.4102) mem 7376MB [2024-08-19 23:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][560/1251] eta 0:02:47 lr 0.000273 wd 0.0500 time 0.2460 (0.2426) data time 0.0007 (0.0021) model time 0.2453 (0.2405) loss 4.9854 (5.4513) grad_norm 7.1622 (inf) loss_scale 16384.0000 (16559.2299) mem 7376MB [2024-08-19 23:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][570/1251] eta 0:02:45 lr 0.000274 wd 0.0500 time 0.2432 (0.2426) data time 0.0011 (0.0021) model time 0.2420 (0.2406) loss 5.2841 (5.4510) grad_norm 2.5442 (inf) loss_scale 16384.0000 (16556.1611) mem 7376MB [2024-08-19 23:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][580/1251] eta 0:02:42 lr 0.000274 wd 0.0500 time 0.2373 (0.2425) data time 0.0010 (0.0021) model time 0.2363 (0.2405) loss 4.3596 (5.4487) grad_norm 2.3565 (inf) loss_scale 16384.0000 (16553.1979) mem 7376MB [2024-08-19 23:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][590/1251] eta 0:02:40 lr 0.000274 wd 0.0500 time 0.2387 (0.2425) data time 0.0010 (0.0021) model time 0.2377 (0.2405) loss 5.7236 (5.4491) grad_norm 3.0987 (inf) loss_scale 16384.0000 (16550.3350) mem 7376MB [2024-08-19 23:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][600/1251] eta 0:02:37 lr 0.000275 wd 0.0500 
time 0.2429 (0.2425) data time 0.0011 (0.0020) model time 0.2418 (0.2405) loss 5.0614 (5.4417) grad_norm 3.6763 (inf) loss_scale 16384.0000 (16547.5674) mem 7376MB [2024-08-19 23:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][610/1251] eta 0:02:35 lr 0.000275 wd 0.0500 time 0.2407 (0.2424) data time 0.0008 (0.0020) model time 0.2399 (0.2405) loss 4.7353 (5.4393) grad_norm 2.5766 (inf) loss_scale 16384.0000 (16544.8903) mem 7376MB [2024-08-19 23:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][620/1251] eta 0:02:32 lr 0.000276 wd 0.0500 time 0.2393 (0.2425) data time 0.0008 (0.0020) model time 0.2385 (0.2405) loss 4.6562 (5.4382) grad_norm 2.2090 (inf) loss_scale 16384.0000 (16542.2995) mem 7376MB [2024-08-19 23:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][630/1251] eta 0:02:30 lr 0.000276 wd 0.0500 time 0.2422 (0.2424) data time 0.0007 (0.0020) model time 0.2415 (0.2405) loss 6.2389 (5.4371) grad_norm 2.5044 (inf) loss_scale 16384.0000 (16539.7908) mem 7376MB [2024-08-19 23:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][640/1251] eta 0:02:28 lr 0.000276 wd 0.0500 time 0.2433 (0.2424) data time 0.0012 (0.0020) model time 0.2421 (0.2405) loss 6.2089 (5.4379) grad_norm 2.7560 (inf) loss_scale 16384.0000 (16537.3604) mem 7376MB [2024-08-19 23:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][650/1251] eta 0:02:25 lr 0.000277 wd 0.0500 time 0.2422 (0.2424) data time 0.0010 (0.0020) model time 0.2412 (0.2405) loss 4.2165 (5.4327) grad_norm 3.0872 (inf) loss_scale 16384.0000 (16535.0046) mem 7376MB [2024-08-19 23:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][660/1251] eta 0:02:23 lr 0.000277 wd 0.0500 time 0.2478 (0.2424) data time 0.0010 (0.0020) model time 0.2468 (0.2405) loss 5.4247 (5.4321) grad_norm 3.2707 (inf) loss_scale 16384.0000 (16532.7201) mem 7376MB [2024-08-19 23:55:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [5/300][670/1251] eta 0:02:20 lr 0.000278 wd 0.0500 time 0.2405 (0.2424) data time 0.0010 (0.0019) model time 0.2395 (0.2405) loss 4.8244 (5.4291) grad_norm 3.2973 (inf) loss_scale 16384.0000 (16530.5037) mem 7376MB [2024-08-19 23:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][680/1251] eta 0:02:18 lr 0.000278 wd 0.0500 time 0.2431 (0.2424) data time 0.0008 (0.0019) model time 0.2423 (0.2405) loss 5.9985 (5.4309) grad_norm 2.7042 (inf) loss_scale 16384.0000 (16528.3524) mem 7376MB [2024-08-19 23:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][690/1251] eta 0:02:15 lr 0.000278 wd 0.0500 time 0.2434 (0.2423) data time 0.0012 (0.0019) model time 0.2422 (0.2405) loss 5.4821 (5.4262) grad_norm 2.0868 (inf) loss_scale 16384.0000 (16526.2634) mem 7376MB [2024-08-19 23:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][700/1251] eta 0:02:13 lr 0.000279 wd 0.0500 time 0.2386 (0.2423) data time 0.0009 (0.0019) model time 0.2377 (0.2405) loss 5.2414 (5.4212) grad_norm 6.1572 (inf) loss_scale 16384.0000 (16524.2340) mem 7376MB [2024-08-19 23:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][710/1251] eta 0:02:11 lr 0.000279 wd 0.0500 time 0.2400 (0.2423) data time 0.0012 (0.0019) model time 0.2388 (0.2405) loss 4.8998 (5.4212) grad_norm 2.5464 (inf) loss_scale 16384.0000 (16522.2616) mem 7376MB [2024-08-19 23:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][720/1251] eta 0:02:08 lr 0.000279 wd 0.0500 time 0.2408 (0.2423) data time 0.0012 (0.0019) model time 0.2396 (0.2405) loss 5.7578 (5.4180) grad_norm 2.5845 (inf) loss_scale 16384.0000 (16520.3440) mem 7376MB [2024-08-19 23:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][730/1251] eta 0:02:06 lr 0.000280 wd 0.0500 time 0.2379 (0.2423) data time 0.0012 (0.0019) model time 0.2368 (0.2405) loss 4.7151 (5.4162) grad_norm 3.9958 (inf) 
loss_scale 16384.0000 (16518.4788) mem 7376MB [2024-08-19 23:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][740/1251] eta 0:02:03 lr 0.000280 wd 0.0500 time 0.2390 (0.2423) data time 0.0009 (0.0019) model time 0.2381 (0.2405) loss 4.8739 (5.4143) grad_norm 2.8815 (inf) loss_scale 16384.0000 (16516.6640) mem 7376MB [2024-08-19 23:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][750/1251] eta 0:02:01 lr 0.000281 wd 0.0500 time 0.2490 (0.2423) data time 0.0011 (0.0018) model time 0.2479 (0.2405) loss 4.7961 (5.4071) grad_norm 2.5928 (inf) loss_scale 16384.0000 (16514.8975) mem 7376MB [2024-08-19 23:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][760/1251] eta 0:01:58 lr 0.000281 wd 0.0500 time 0.2403 (0.2422) data time 0.0012 (0.0018) model time 0.2392 (0.2405) loss 5.9368 (5.4075) grad_norm 2.9216 (inf) loss_scale 16384.0000 (16513.1774) mem 7376MB [2024-08-19 23:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][770/1251] eta 0:01:56 lr 0.000281 wd 0.0500 time 0.2439 (0.2422) data time 0.0010 (0.0018) model time 0.2429 (0.2404) loss 5.6251 (5.4055) grad_norm 2.6459 (inf) loss_scale 16384.0000 (16511.5019) mem 7376MB [2024-08-19 23:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][780/1251] eta 0:01:54 lr 0.000282 wd 0.0500 time 0.2372 (0.2422) data time 0.0008 (0.0018) model time 0.2365 (0.2404) loss 5.6306 (5.4059) grad_norm 2.2144 (inf) loss_scale 16384.0000 (16509.8694) mem 7376MB [2024-08-19 23:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][790/1251] eta 0:01:51 lr 0.000282 wd 0.0500 time 0.2386 (0.2422) data time 0.0011 (0.0018) model time 0.2375 (0.2404) loss 5.8619 (5.4025) grad_norm 2.5738 (inf) loss_scale 16384.0000 (16508.2781) mem 7376MB [2024-08-19 23:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][800/1251] eta 0:01:49 lr 0.000283 wd 0.0500 time 0.2390 (0.2422) data time 0.0010 
(0.0018) model time 0.2379 (0.2404) loss 5.5399 (5.3992) grad_norm 2.7303 (inf) loss_scale 16384.0000 (16506.7266) mem 7376MB [2024-08-19 23:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][810/1251] eta 0:01:46 lr 0.000283 wd 0.0500 time 0.2484 (0.2422) data time 0.0008 (0.0018) model time 0.2476 (0.2405) loss 5.1538 (5.4018) grad_norm 2.1857 (inf) loss_scale 16384.0000 (16505.2133) mem 7376MB [2024-08-19 23:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][820/1251] eta 0:01:44 lr 0.000283 wd 0.0500 time 0.2404 (0.2422) data time 0.0009 (0.0018) model time 0.2394 (0.2404) loss 4.8189 (5.3993) grad_norm 2.1071 (inf) loss_scale 16384.0000 (16503.7369) mem 7376MB [2024-08-19 23:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][830/1251] eta 0:01:41 lr 0.000284 wd 0.0500 time 0.2423 (0.2421) data time 0.0010 (0.0018) model time 0.2413 (0.2404) loss 4.4342 (5.3949) grad_norm 2.3469 (inf) loss_scale 16384.0000 (16502.2960) mem 7376MB [2024-08-19 23:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][840/1251] eta 0:01:39 lr 0.000284 wd 0.0500 time 0.2394 (0.2421) data time 0.0008 (0.0018) model time 0.2386 (0.2404) loss 5.1003 (5.3956) grad_norm 2.6801 (inf) loss_scale 16384.0000 (16500.8894) mem 7376MB [2024-08-19 23:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][850/1251] eta 0:01:37 lr 0.000285 wd 0.0500 time 0.2467 (0.2421) data time 0.0007 (0.0018) model time 0.2459 (0.2404) loss 4.9974 (5.3958) grad_norm 2.2454 (inf) loss_scale 16384.0000 (16499.5159) mem 7376MB [2024-08-19 23:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][860/1251] eta 0:01:34 lr 0.000285 wd 0.0500 time 0.2393 (0.2422) data time 0.0012 (0.0017) model time 0.2382 (0.2405) loss 5.4533 (5.3937) grad_norm 2.7307 (inf) loss_scale 16384.0000 (16498.1742) mem 7376MB [2024-08-19 23:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[5/300][870/1251] eta 0:01:32 lr 0.000285 wd 0.0500 time 0.2465 (0.2422) data time 0.0008 (0.0017) model time 0.2457 (0.2405) loss 5.8163 (5.3926) grad_norm 2.5724 (inf) loss_scale 16384.0000 (16496.8634) mem 7376MB [2024-08-19 23:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][880/1251] eta 0:01:29 lr 0.000286 wd 0.0500 time 0.2414 (0.2421) data time 0.0012 (0.0017) model time 0.2402 (0.2405) loss 5.2383 (5.3933) grad_norm 2.9907 (inf) loss_scale 16384.0000 (16495.5823) mem 7376MB [2024-08-19 23:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][890/1251] eta 0:01:27 lr 0.000286 wd 0.0500 time 0.2467 (0.2421) data time 0.0008 (0.0017) model time 0.2459 (0.2405) loss 5.6270 (5.3921) grad_norm 2.3979 (inf) loss_scale 16384.0000 (16494.3300) mem 7376MB [2024-08-19 23:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][900/1251] eta 0:01:24 lr 0.000287 wd 0.0500 time 0.2470 (0.2421) data time 0.0007 (0.0017) model time 0.2463 (0.2405) loss 5.9613 (5.3964) grad_norm 6.2463 (inf) loss_scale 16384.0000 (16493.1054) mem 7376MB [2024-08-19 23:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][910/1251] eta 0:01:22 lr 0.000287 wd 0.0500 time 0.2422 (0.2421) data time 0.0010 (0.0017) model time 0.2412 (0.2405) loss 5.0731 (5.3981) grad_norm 2.5427 (inf) loss_scale 16384.0000 (16491.9078) mem 7376MB [2024-08-19 23:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][920/1251] eta 0:01:20 lr 0.000287 wd 0.0500 time 0.2426 (0.2421) data time 0.0008 (0.0017) model time 0.2418 (0.2405) loss 5.8347 (5.3996) grad_norm 3.5820 (inf) loss_scale 16384.0000 (16490.7362) mem 7376MB [2024-08-19 23:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][930/1251] eta 0:01:17 lr 0.000288 wd 0.0500 time 0.2412 (0.2426) data time 0.0007 (0.0017) model time 0.2405 (0.2410) loss 4.9418 (5.3960) grad_norm 2.3384 (inf) loss_scale 16384.0000 (16489.5897) mem 7376MB 
[2024-08-19 23:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][940/1251] eta 0:01:15 lr 0.000288 wd 0.0500 time 0.2422 (0.2426) data time 0.0010 (0.0017) model time 0.2412 (0.2410) loss 5.8781 (5.3956) grad_norm 2.5192 (inf) loss_scale 16384.0000 (16488.4676) mem 7376MB [2024-08-19 23:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][950/1251] eta 0:01:13 lr 0.000289 wd 0.0500 time 0.2364 (0.2425) data time 0.0012 (0.0017) model time 0.2352 (0.2409) loss 4.7552 (5.3948) grad_norm 2.6182 (inf) loss_scale 16384.0000 (16487.3691) mem 7376MB [2024-08-19 23:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][960/1251] eta 0:01:10 lr 0.000289 wd 0.0500 time 0.2433 (0.2425) data time 0.0007 (0.0017) model time 0.2425 (0.2409) loss 5.4080 (5.3934) grad_norm 2.6683 (inf) loss_scale 16384.0000 (16486.2934) mem 7376MB [2024-08-19 23:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][970/1251] eta 0:01:08 lr 0.000289 wd 0.0500 time 0.2381 (0.2425) data time 0.0009 (0.0017) model time 0.2372 (0.2410) loss 5.3629 (5.3944) grad_norm 2.2612 (inf) loss_scale 16384.0000 (16485.2400) mem 7376MB [2024-08-19 23:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][980/1251] eta 0:01:05 lr 0.000290 wd 0.0500 time 0.2580 (0.2425) data time 0.0009 (0.0017) model time 0.2571 (0.2409) loss 4.7355 (5.3927) grad_norm 3.9670 (inf) loss_scale 16384.0000 (16484.2080) mem 7376MB [2024-08-19 23:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][990/1251] eta 0:01:03 lr 0.000290 wd 0.0500 time 0.2423 (0.2425) data time 0.0008 (0.0017) model time 0.2415 (0.2409) loss 4.8210 (5.3927) grad_norm 2.9363 (inf) loss_scale 16384.0000 (16483.1968) mem 7376MB [2024-08-19 23:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1000/1251] eta 0:01:00 lr 0.000291 wd 0.0500 time 0.2489 (0.2425) data time 0.0009 (0.0017) model time 0.2480 (0.2409) loss 
5.4241 (5.3910) grad_norm 2.1366 (inf) loss_scale 16384.0000 (16482.2058) mem 7376MB [2024-08-19 23:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1010/1251] eta 0:00:58 lr 0.000291 wd 0.0500 time 0.2492 (0.2425) data time 0.0010 (0.0017) model time 0.2481 (0.2409) loss 5.7230 (5.3925) grad_norm 2.3872 (inf) loss_scale 16384.0000 (16481.2344) mem 7376MB [2024-08-19 23:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1020/1251] eta 0:00:56 lr 0.000291 wd 0.0500 time 0.2398 (0.2425) data time 0.0011 (0.0016) model time 0.2387 (0.2409) loss 4.1539 (5.3928) grad_norm 2.9949 (inf) loss_scale 16384.0000 (16480.2821) mem 7376MB [2024-08-19 23:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1030/1251] eta 0:00:53 lr 0.000292 wd 0.0500 time 0.2433 (0.2425) data time 0.0010 (0.0016) model time 0.2423 (0.2409) loss 5.5946 (5.3881) grad_norm 2.3770 (inf) loss_scale 16384.0000 (16479.3482) mem 7376MB [2024-08-19 23:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1040/1251] eta 0:00:51 lr 0.000292 wd 0.0500 time 0.2393 (0.2425) data time 0.0007 (0.0016) model time 0.2385 (0.2409) loss 5.4491 (5.3869) grad_norm 3.6275 (inf) loss_scale 16384.0000 (16478.4323) mem 7376MB [2024-08-19 23:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1050/1251] eta 0:00:48 lr 0.000293 wd 0.0500 time 0.2455 (0.2425) data time 0.0010 (0.0016) model time 0.2446 (0.2409) loss 5.0579 (5.3867) grad_norm 3.4400 (inf) loss_scale 16384.0000 (16477.5338) mem 7376MB [2024-08-19 23:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1060/1251] eta 0:00:46 lr 0.000293 wd 0.0500 time 0.2381 (0.2424) data time 0.0012 (0.0016) model time 0.2369 (0.2409) loss 5.2806 (5.3862) grad_norm 2.5524 (inf) loss_scale 16384.0000 (16476.6522) mem 7376MB [2024-08-19 23:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1070/1251] eta 0:00:43 lr 0.000293 wd 
0.0500 time 0.2392 (0.2426) data time 0.0008 (0.0016) model time 0.2383 (0.2411) loss 4.4396 (5.3841) grad_norm 2.6683 (inf) loss_scale 16384.0000 (16475.7871) mem 7376MB [2024-08-19 23:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1080/1251] eta 0:00:41 lr 0.000294 wd 0.0500 time 0.2436 (0.2426) data time 0.0011 (0.0016) model time 0.2425 (0.2410) loss 5.4152 (5.3819) grad_norm 2.4763 (inf) loss_scale 16384.0000 (16474.9380) mem 7376MB [2024-08-19 23:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1090/1251] eta 0:00:39 lr 0.000294 wd 0.0500 time 0.2368 (0.2427) data time 0.0007 (0.0016) model time 0.2360 (0.2412) loss 5.7709 (5.3831) grad_norm 2.3760 (inf) loss_scale 16384.0000 (16474.1045) mem 7376MB [2024-08-19 23:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1100/1251] eta 0:00:36 lr 0.000295 wd 0.0500 time 0.2352 (0.2427) data time 0.0011 (0.0016) model time 0.2341 (0.2412) loss 5.7316 (5.3825) grad_norm 2.1518 (inf) loss_scale 16384.0000 (16473.2861) mem 7376MB [2024-08-19 23:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1110/1251] eta 0:00:34 lr 0.000295 wd 0.0500 time 0.2399 (0.2427) data time 0.0009 (0.0016) model time 0.2390 (0.2412) loss 5.4919 (5.3818) grad_norm 2.0023 (inf) loss_scale 16384.0000 (16472.4824) mem 7376MB [2024-08-19 23:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1120/1251] eta 0:00:31 lr 0.000295 wd 0.0500 time 0.2387 (0.2427) data time 0.0008 (0.0016) model time 0.2379 (0.2412) loss 5.8614 (5.3803) grad_norm 2.5596 (inf) loss_scale 16384.0000 (16471.6931) mem 7376MB [2024-08-19 23:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1130/1251] eta 0:00:29 lr 0.000296 wd 0.0500 time 0.2358 (0.2427) data time 0.0009 (0.0016) model time 0.2349 (0.2412) loss 5.9201 (5.3795) grad_norm 2.3029 (inf) loss_scale 16384.0000 (16470.9178) mem 7376MB [2024-08-19 23:56:55 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1140/1251] eta 0:00:26 lr 0.000296 wd 0.0500 time 0.2414 (0.2427) data time 0.0009 (0.0016) model time 0.2405 (0.2412) loss 4.5709 (5.3790) grad_norm 2.4224 (inf) loss_scale 16384.0000 (16470.1560) mem 7376MB [2024-08-19 23:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1150/1251] eta 0:00:24 lr 0.000297 wd 0.0500 time 0.2332 (0.2427) data time 0.0010 (0.0016) model time 0.2322 (0.2412) loss 4.8306 (5.3755) grad_norm 3.1303 (inf) loss_scale 16384.0000 (16469.4075) mem 7376MB [2024-08-19 23:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1160/1251] eta 0:00:22 lr 0.000297 wd 0.0500 time 0.2441 (0.2427) data time 0.0010 (0.0016) model time 0.2431 (0.2412) loss 5.7190 (5.3743) grad_norm 2.9268 (inf) loss_scale 16384.0000 (16468.6718) mem 7376MB [2024-08-19 23:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1170/1251] eta 0:00:19 lr 0.000297 wd 0.0500 time 0.2362 (0.2427) data time 0.0010 (0.0016) model time 0.2352 (0.2412) loss 5.9113 (5.3751) grad_norm 2.7700 (inf) loss_scale 16384.0000 (16467.9488) mem 7376MB [2024-08-19 23:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1180/1251] eta 0:00:17 lr 0.000298 wd 0.0500 time 0.2360 (0.2427) data time 0.0008 (0.0016) model time 0.2352 (0.2412) loss 4.5298 (5.3731) grad_norm 2.6567 (inf) loss_scale 16384.0000 (16467.2379) mem 7376MB [2024-08-19 23:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1190/1251] eta 0:00:14 lr 0.000298 wd 0.0500 time 0.2405 (0.2426) data time 0.0009 (0.0016) model time 0.2395 (0.2412) loss 5.2010 (5.3707) grad_norm 2.5455 (inf) loss_scale 16384.0000 (16466.5390) mem 7376MB [2024-08-19 23:57:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1200/1251] eta 0:00:12 lr 0.000299 wd 0.0500 time 0.2395 (0.2426) data time 0.0011 (0.0016) model time 0.2384 (0.2411) loss 5.7728 (5.3719) 
grad_norm 2.7996 (inf) loss_scale 16384.0000 (16465.8518) mem 7376MB [2024-08-19 23:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1210/1251] eta 0:00:09 lr 0.000299 wd 0.0500 time 0.2442 (0.2426) data time 0.0010 (0.0016) model time 0.2432 (0.2411) loss 5.2681 (5.3717) grad_norm 2.3623 (inf) loss_scale 16384.0000 (16465.1759) mem 7376MB [2024-08-19 23:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1220/1251] eta 0:00:07 lr 0.000299 wd 0.0500 time 0.2394 (0.2426) data time 0.0008 (0.0016) model time 0.2386 (0.2411) loss 4.7920 (5.3710) grad_norm 2.6549 (inf) loss_scale 16384.0000 (16464.5111) mem 7376MB [2024-08-19 23:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1230/1251] eta 0:00:05 lr 0.000300 wd 0.0500 time 0.2417 (0.2426) data time 0.0008 (0.0015) model time 0.2409 (0.2411) loss 4.9318 (5.3698) grad_norm 3.0464 (inf) loss_scale 16384.0000 (16463.8570) mem 7376MB [2024-08-19 23:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1240/1251] eta 0:00:02 lr 0.000300 wd 0.0500 time 0.2289 (0.2425) data time 0.0005 (0.0015) model time 0.2284 (0.2410) loss 4.5300 (5.3674) grad_norm 2.2805 (inf) loss_scale 16384.0000 (16463.2135) mem 7376MB [2024-08-19 23:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [5/300][1250/1251] eta 0:00:00 lr 0.000301 wd 0.0500 time 0.2294 (0.2424) data time 0.0005 (0.0015) model time 0.2289 (0.2409) loss 5.1123 (5.3670) grad_norm 3.2141 (inf) loss_scale 16384.0000 (16462.5803) mem 7376MB [2024-08-19 23:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 5 training takes 0:05:03 [2024-08-19 23:57:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-19 23:57:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-19 23:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.425 (0.425) Loss 1.8926 (1.8926) Acc@1 59.082 (59.082) Acc@5 84.277 (84.277) Mem 7376MB
[2024-08-19 23:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.113) Loss 2.7266 (2.4877) Acc@1 37.500 (44.354) Acc@5 71.289 (72.745) Mem 7376MB
[2024-08-19 23:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.096) Loss 3.2617 (2.4894) Acc@1 33.496 (44.452) Acc@5 59.570 (73.010) Mem 7376MB
[2024-08-19 23:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.090) Loss 3.4492 (2.7185) Acc@1 32.812 (41.573) Acc@5 55.176 (68.596) Mem 7376MB
[2024-08-19 23:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 3.5254 (2.8580) Acc@1 27.344 (39.553) Acc@5 52.637 (66.047) Mem 7376MB
[2024-08-19 23:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 39.742 Acc@5 66.120
[2024-08-19 23:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 39.7%
[2024-08-19 23:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 39.74%
[2024-08-19 23:57:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-19 23:57:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-19 23:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.454 (0.454) Loss 6.9062 (6.9062) Acc@1 0.391 (0.391) Acc@5 1.855 (1.855) Mem 7376MB
[2024-08-19 23:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.113) Loss 7.2109 (7.0064) Acc@1 0.195 (0.213) Acc@5 0.781 (0.861) Mem 7376MB
[2024-08-19 23:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.097) Loss 6.9492 (6.9825) Acc@1 0.000 (0.163) Acc@5 0.293 (0.823) Mem 7376MB
[2024-08-19 23:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.091) Loss 6.8984 (6.9661) Acc@1 0.000 (0.189) Acc@5 0.684 (0.907) Mem 7376MB
[2024-08-19 23:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.085) Loss 6.9883 (6.9546) Acc@1 0.000 (0.202) Acc@5 0.879 (0.972) Mem 7376MB
[2024-08-19 23:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.288 Acc@5 1.156
[2024-08-19 23:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.3%
[2024-08-19 23:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.29%
[2024-08-19 23:57:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-19 23:57:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-19 23:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][0/1251] eta 0:15:40 lr 0.000301 wd 0.0500 time 0.7521 (0.7521) data time 0.5152 (0.5152) model time 0.0000 (0.0000) loss 4.7435 (4.7435) grad_norm 2.8479 (2.8479) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][10/1251] eta 0:05:55 lr 0.000301 wd 0.0500 time 0.2359 (0.2868) data time 0.0010 (0.0479) model time 0.0000 (0.0000) loss 5.0685 (5.1556) grad_norm 2.4947 (3.0764) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][20/1251] eta 0:05:26 lr 0.000301 wd 0.0500 time 0.2416 (0.2655) data time 0.0009 (0.0256) model time 0.0000 (0.0000) loss 5.1445 (5.1787) grad_norm 2.4671 (2.7764) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][30/1251] eta 0:05:14 lr 0.000302 wd 0.0500 time 0.2406 (0.2576) data time 0.0012 (0.0177) model time 0.0000 (0.0000) loss 4.8831 (5.1425) grad_norm 2.4426 (2.7134) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][40/1251] eta 0:05:06 lr 0.000302 wd 0.0500 time 0.2555 (0.2534) data time 0.0010 (0.0136) model time 0.0000 (0.0000) loss 5.2174 (5.2083) grad_norm 2.1331 (2.6794) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][50/1251] eta 0:05:01 lr 0.000303 wd 0.0500 time 0.2428 (0.2509) data time 0.0007 (0.0112) model time 0.0000 (0.0000) loss 4.2215 (5.1749) grad_norm 2.1778 (2.6210) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][60/1251] eta 0:04:56 lr 0.000303 wd 0.0500 time 0.2434 (0.2490) data time 0.0011 (0.0095) model time 0.2424 (0.2380) 
loss 5.3915 (5.1215) grad_norm 2.5421 (2.6335) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][70/1251] eta 0:04:53 lr 0.000303 wd 0.0500 time 0.2452 (0.2482) data time 0.0009 (0.0083) model time 0.2443 (0.2402) loss 5.1474 (5.1555) grad_norm 3.6358 (2.6248) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][80/1251] eta 0:04:49 lr 0.000304 wd 0.0500 time 0.2402 (0.2473) data time 0.0012 (0.0074) model time 0.2390 (0.2400) loss 5.6782 (5.1848) grad_norm 2.8421 (2.6430) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][90/1251] eta 0:04:46 lr 0.000304 wd 0.0500 time 0.2466 (0.2467) data time 0.0007 (0.0067) model time 0.2458 (0.2401) loss 6.0331 (5.1841) grad_norm 2.4156 (2.6253) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][100/1251] eta 0:04:43 lr 0.000305 wd 0.0500 time 0.2410 (0.2463) data time 0.0012 (0.0062) model time 0.2398 (0.2405) loss 5.6535 (5.1929) grad_norm 2.2548 (2.6151) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][110/1251] eta 0:04:40 lr 0.000305 wd 0.0500 time 0.2382 (0.2459) data time 0.0007 (0.0057) model time 0.2374 (0.2405) loss 6.1374 (5.1851) grad_norm 2.1674 (2.6079) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][120/1251] eta 0:04:37 lr 0.000305 wd 0.0500 time 0.2420 (0.2454) data time 0.0011 (0.0053) model time 0.2409 (0.2403) loss 5.3962 (5.1950) grad_norm 2.2611 (2.5909) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][130/1251] eta 
0:04:34 lr 0.000306 wd 0.0500 time 0.2405 (0.2453) data time 0.0009 (0.0050) model time 0.2396 (0.2406) loss 4.5936 (5.2047) grad_norm 2.7587 (2.6167) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][140/1251] eta 0:04:32 lr 0.000306 wd 0.0500 time 0.2512 (0.2452) data time 0.0008 (0.0047) model time 0.2504 (0.2410) loss 6.0413 (5.1996) grad_norm 2.2468 (2.6261) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][150/1251] eta 0:04:29 lr 0.000307 wd 0.0500 time 0.2434 (0.2450) data time 0.0011 (0.0045) model time 0.2423 (0.2409) loss 5.6650 (5.2090) grad_norm 2.8838 (2.6246) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][160/1251] eta 0:04:27 lr 0.000307 wd 0.0500 time 0.2464 (0.2449) data time 0.0012 (0.0043) model time 0.2452 (0.2410) loss 5.3242 (5.2251) grad_norm 1.8975 (2.6114) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][170/1251] eta 0:04:24 lr 0.000307 wd 0.0500 time 0.2461 (0.2447) data time 0.0010 (0.0041) model time 0.2451 (0.2410) loss 5.2346 (5.2176) grad_norm 2.3999 (2.6134) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][180/1251] eta 0:04:21 lr 0.000308 wd 0.0500 time 0.2450 (0.2444) data time 0.0011 (0.0039) model time 0.2439 (0.2408) loss 4.7652 (5.2103) grad_norm 3.5369 (2.6151) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][190/1251] eta 0:04:19 lr 0.000308 wd 0.0500 time 0.2398 (0.2443) data time 0.0008 (0.0038) model time 0.2390 (0.2408) loss 5.1992 (5.2183) grad_norm 3.9710 (2.6380) loss_scale 16384.0000 (16384.0000) mem 7376MB 
[2024-08-19 23:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][200/1251] eta 0:04:16 lr 0.000309 wd 0.0500 time 0.2407 (0.2443) data time 0.0011 (0.0036) model time 0.2397 (0.2409) loss 4.7458 (5.2290) grad_norm 3.3773 (2.6453) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][210/1251] eta 0:04:14 lr 0.000309 wd 0.0500 time 0.2537 (0.2442) data time 0.0008 (0.0035) model time 0.2529 (0.2410) loss 5.7119 (5.2285) grad_norm 3.2460 (2.6365) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][220/1251] eta 0:04:11 lr 0.000309 wd 0.0500 time 0.2389 (0.2441) data time 0.0011 (0.0034) model time 0.2378 (0.2410) loss 5.1918 (5.2062) grad_norm 2.5600 (2.6291) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][230/1251] eta 0:04:09 lr 0.000310 wd 0.0500 time 0.2399 (0.2441) data time 0.0009 (0.0033) model time 0.2390 (0.2411) loss 5.9244 (5.2113) grad_norm 2.5427 (2.6262) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][240/1251] eta 0:04:06 lr 0.000310 wd 0.0500 time 0.2419 (0.2439) data time 0.0009 (0.0032) model time 0.2411 (0.2410) loss 5.7929 (5.2071) grad_norm 2.4348 (2.6277) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][250/1251] eta 0:04:03 lr 0.000311 wd 0.0500 time 0.2416 (0.2437) data time 0.0010 (0.0031) model time 0.2406 (0.2408) loss 5.4398 (5.2069) grad_norm 2.6169 (2.6269) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][260/1251] eta 0:04:01 lr 0.000311 wd 0.0500 time 0.2379 (0.2436) data time 0.0010 (0.0030) model time 0.2369 
(0.2408) loss 5.7074 (5.2078) grad_norm 3.3414 (2.6371) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][270/1251] eta 0:03:58 lr 0.000311 wd 0.0500 time 0.2364 (0.2435) data time 0.0009 (0.0030) model time 0.2355 (0.2407) loss 5.6118 (5.2055) grad_norm 2.2957 (2.6361) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][280/1251] eta 0:03:56 lr 0.000312 wd 0.0500 time 0.2354 (0.2435) data time 0.0008 (0.0029) model time 0.2346 (0.2407) loss 5.6644 (5.2020) grad_norm 2.1286 (2.6344) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][290/1251] eta 0:03:53 lr 0.000312 wd 0.0500 time 0.2514 (0.2434) data time 0.0010 (0.0028) model time 0.2504 (0.2407) loss 5.5442 (5.2086) grad_norm 2.3137 (2.6573) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][300/1251] eta 0:03:51 lr 0.000313 wd 0.0500 time 0.2493 (0.2434) data time 0.0008 (0.0028) model time 0.2485 (0.2408) loss 5.3633 (5.2045) grad_norm 2.6363 (2.6547) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][310/1251] eta 0:03:48 lr 0.000313 wd 0.0500 time 0.2384 (0.2433) data time 0.0009 (0.0027) model time 0.2375 (0.2408) loss 5.5025 (5.2035) grad_norm 2.3633 (2.6560) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][320/1251] eta 0:03:46 lr 0.000313 wd 0.0500 time 0.2406 (0.2433) data time 0.0009 (0.0027) model time 0.2397 (0.2407) loss 4.6427 (5.1992) grad_norm 3.9433 (2.6618) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[6/300][330/1251] eta 0:03:44 lr 0.000314 wd 0.0500 time 0.2499 (0.2432) data time 0.0009 (0.0026) model time 0.2490 (0.2408) loss 5.9676 (5.2043) grad_norm 2.6564 (2.6594) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][340/1251] eta 0:03:41 lr 0.000314 wd 0.0500 time 0.2572 (0.2433) data time 0.0008 (0.0026) model time 0.2564 (0.2408) loss 5.4453 (5.2018) grad_norm 2.2706 (2.6554) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][350/1251] eta 0:03:39 lr 0.000315 wd 0.0500 time 0.2388 (0.2432) data time 0.0009 (0.0025) model time 0.2379 (0.2408) loss 5.4025 (5.1944) grad_norm 3.1670 (2.6541) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][360/1251] eta 0:03:36 lr 0.000315 wd 0.0500 time 0.2372 (0.2432) data time 0.0011 (0.0025) model time 0.2361 (0.2408) loss 5.7991 (5.2004) grad_norm 2.1389 (2.6566) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][370/1251] eta 0:03:34 lr 0.000315 wd 0.0500 time 0.2366 (0.2437) data time 0.0012 (0.0025) model time 0.2354 (0.2414) loss 5.1444 (5.2030) grad_norm 3.2279 (2.6585) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][380/1251] eta 0:03:32 lr 0.000316 wd 0.0500 time 0.2449 (0.2443) data time 0.0010 (0.0025) model time 0.2439 (0.2422) loss 5.4669 (5.1993) grad_norm 2.4254 (2.6592) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][390/1251] eta 0:03:30 lr 0.000316 wd 0.0500 time 0.2500 (0.2443) data time 0.0010 (0.0024) model time 0.2490 (0.2421) loss 4.9705 (5.2040) grad_norm 2.7801 (2.6571) loss_scale 16384.0000 
(16384.0000) mem 7376MB [2024-08-19 23:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][400/1251] eta 0:03:27 lr 0.000317 wd 0.0500 time 0.2367 (0.2442) data time 0.0010 (0.0024) model time 0.2357 (0.2421) loss 5.7276 (5.2105) grad_norm 2.6915 (2.6485) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][410/1251] eta 0:03:25 lr 0.000317 wd 0.0500 time 0.2495 (0.2447) data time 0.0010 (0.0024) model time 0.2484 (0.2427) loss 4.3481 (5.2052) grad_norm 4.5810 (2.6589) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][420/1251] eta 0:03:23 lr 0.000317 wd 0.0500 time 0.2345 (0.2446) data time 0.0008 (0.0024) model time 0.2337 (0.2425) loss 5.6773 (5.2043) grad_norm 2.5611 (2.6570) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][430/1251] eta 0:03:20 lr 0.000318 wd 0.0500 time 0.2419 (0.2444) data time 0.0012 (0.0023) model time 0.2407 (0.2424) loss 5.2051 (5.2016) grad_norm 2.4244 (2.6534) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][440/1251] eta 0:03:18 lr 0.000318 wd 0.0500 time 0.2345 (0.2443) data time 0.0010 (0.0023) model time 0.2335 (0.2423) loss 5.2466 (5.2058) grad_norm 2.4402 (2.6525) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][450/1251] eta 0:03:15 lr 0.000319 wd 0.0500 time 0.2416 (0.2443) data time 0.0007 (0.0023) model time 0.2409 (0.2423) loss 5.7408 (5.2019) grad_norm 3.1690 (2.6572) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][460/1251] eta 0:03:13 lr 0.000319 wd 0.0500 time 0.2404 (0.2442) data time 0.0008 
(0.0023) model time 0.2396 (0.2422) loss 4.2420 (5.1969) grad_norm 2.2684 (2.6498) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][470/1251] eta 0:03:10 lr 0.000319 wd 0.0500 time 0.2389 (0.2441) data time 0.0011 (0.0022) model time 0.2378 (0.2422) loss 4.9879 (5.1947) grad_norm 2.4661 (2.6566) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][480/1251] eta 0:03:08 lr 0.000320 wd 0.0500 time 0.2349 (0.2441) data time 0.0012 (0.0022) model time 0.2338 (0.2421) loss 5.8410 (5.1933) grad_norm 3.5252 (2.6604) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][490/1251] eta 0:03:05 lr 0.000320 wd 0.0500 time 0.2375 (0.2440) data time 0.0009 (0.0022) model time 0.2366 (0.2421) loss 4.3766 (5.1925) grad_norm 3.9266 (2.6604) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][500/1251] eta 0:03:03 lr 0.000321 wd 0.0500 time 0.2423 (0.2440) data time 0.0007 (0.0022) model time 0.2415 (0.2421) loss 5.7748 (5.1923) grad_norm 2.1463 (2.6549) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][510/1251] eta 0:03:00 lr 0.000321 wd 0.0500 time 0.2449 (0.2440) data time 0.0010 (0.0021) model time 0.2438 (0.2420) loss 5.2877 (5.1902) grad_norm 2.7993 (2.6622) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][520/1251] eta 0:02:58 lr 0.000321 wd 0.0500 time 0.2375 (0.2439) data time 0.0008 (0.0021) model time 0.2366 (0.2420) loss 4.5732 (5.1848) grad_norm 2.6958 (2.6681) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [6/300][530/1251] eta 0:02:55 lr 0.000322 wd 0.0500 time 0.2501 (0.2438) data time 0.0010 (0.0021) model time 0.2491 (0.2419) loss 5.7164 (5.1859) grad_norm 2.6541 (2.6725) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-19 23:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][540/1251] eta 0:02:53 lr 0.000322 wd 0.0500 time 0.2423 (0.2437) data time 0.0010 (0.0021) model time 0.2413 (0.2418) loss 5.5203 (5.1822) grad_norm 2.3247 (inf) loss_scale 8192.0000 (16262.8614) mem 7376MB [2024-08-19 23:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][550/1251] eta 0:02:50 lr 0.000323 wd 0.0500 time 0.2387 (0.2437) data time 0.0010 (0.0021) model time 0.2377 (0.2418) loss 4.0352 (5.1747) grad_norm 2.4531 (inf) loss_scale 8192.0000 (16116.3848) mem 7376MB [2024-08-19 23:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][560/1251] eta 0:02:48 lr 0.000323 wd 0.0500 time 0.2366 (0.2436) data time 0.0010 (0.0021) model time 0.2356 (0.2418) loss 5.7185 (5.1751) grad_norm 2.9644 (inf) loss_scale 8192.0000 (15975.1301) mem 7376MB [2024-08-19 23:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][570/1251] eta 0:02:45 lr 0.000323 wd 0.0500 time 0.2397 (0.2436) data time 0.0008 (0.0020) model time 0.2389 (0.2417) loss 5.6877 (5.1795) grad_norm 3.3674 (inf) loss_scale 8192.0000 (15838.8231) mem 7376MB [2024-08-19 23:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][580/1251] eta 0:02:43 lr 0.000324 wd 0.0500 time 0.2377 (0.2435) data time 0.0011 (0.0020) model time 0.2367 (0.2416) loss 5.2419 (5.1804) grad_norm 2.3251 (inf) loss_scale 8192.0000 (15707.2083) mem 7376MB [2024-08-19 23:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][590/1251] eta 0:02:40 lr 0.000324 wd 0.0500 time 0.2442 (0.2435) data time 0.0008 (0.0020) model time 0.2434 (0.2416) loss 5.5872 (5.1859) grad_norm 2.2118 (inf) loss_scale 8192.0000 (15580.0474) mem 
7376MB [2024-08-19 23:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][600/1251] eta 0:02:38 lr 0.000325 wd 0.0500 time 0.2439 (0.2434) data time 0.0010 (0.0020) model time 0.2429 (0.2415) loss 5.6228 (5.1821) grad_norm 3.7001 (inf) loss_scale 8192.0000 (15457.1181) mem 7376MB [2024-08-20 00:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][610/1251] eta 0:02:35 lr 0.000325 wd 0.0500 time 0.2465 (0.2434) data time 0.0010 (0.0020) model time 0.2455 (0.2415) loss 4.6639 (5.1790) grad_norm 2.8972 (inf) loss_scale 8192.0000 (15338.2128) mem 7376MB [2024-08-20 00:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][620/1251] eta 0:02:33 lr 0.000325 wd 0.0500 time 0.2355 (0.2433) data time 0.0008 (0.0020) model time 0.2347 (0.2415) loss 5.7407 (5.1825) grad_norm 2.4589 (inf) loss_scale 8192.0000 (15223.1369) mem 7376MB [2024-08-20 00:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][630/1251] eta 0:02:31 lr 0.000326 wd 0.0500 time 0.2340 (0.2433) data time 0.0010 (0.0020) model time 0.2330 (0.2415) loss 5.1218 (5.1802) grad_norm 2.4410 (inf) loss_scale 8192.0000 (15111.7084) mem 7376MB [2024-08-20 00:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][640/1251] eta 0:02:28 lr 0.000326 wd 0.0500 time 0.2426 (0.2436) data time 0.0010 (0.0019) model time 0.2416 (0.2418) loss 4.7035 (5.1820) grad_norm 2.7645 (inf) loss_scale 8192.0000 (15003.7566) mem 7376MB [2024-08-20 00:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][650/1251] eta 0:02:26 lr 0.000327 wd 0.0500 time 0.2424 (0.2436) data time 0.0010 (0.0019) model time 0.2414 (0.2418) loss 4.9476 (5.1816) grad_norm 2.1817 (inf) loss_scale 8192.0000 (14899.1214) mem 7376MB [2024-08-20 00:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][660/1251] eta 0:02:23 lr 0.000327 wd 0.0500 time 0.2421 (0.2435) data time 0.0009 (0.0019) model time 0.2412 (0.2418) loss 
5.5222 (5.1792) grad_norm 2.8782 (inf) loss_scale 8192.0000 (14797.6520) mem 7376MB [2024-08-20 00:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][670/1251] eta 0:02:21 lr 0.000327 wd 0.0500 time 0.2425 (0.2435) data time 0.0009 (0.0019) model time 0.2416 (0.2418) loss 5.4928 (5.1795) grad_norm 2.5974 (inf) loss_scale 8192.0000 (14699.2072) mem 7376MB [2024-08-20 00:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][680/1251] eta 0:02:19 lr 0.000328 wd 0.0500 time 0.2382 (0.2435) data time 0.0011 (0.0019) model time 0.2372 (0.2418) loss 5.7893 (5.1795) grad_norm 2.2769 (inf) loss_scale 8192.0000 (14603.6535) mem 7376MB [2024-08-20 00:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][690/1251] eta 0:02:16 lr 0.000328 wd 0.0500 time 0.2448 (0.2435) data time 0.0011 (0.0019) model time 0.2436 (0.2417) loss 5.2148 (5.1778) grad_norm 2.4947 (inf) loss_scale 8192.0000 (14510.8654) mem 7376MB [2024-08-20 00:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][700/1251] eta 0:02:14 lr 0.000329 wd 0.0500 time 0.2412 (0.2434) data time 0.0008 (0.0019) model time 0.2405 (0.2417) loss 5.9379 (5.1786) grad_norm 2.0599 (inf) loss_scale 8192.0000 (14420.7247) mem 7376MB [2024-08-20 00:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][710/1251] eta 0:02:11 lr 0.000329 wd 0.0500 time 0.2330 (0.2434) data time 0.0010 (0.0019) model time 0.2320 (0.2417) loss 5.5441 (5.1743) grad_norm 2.8243 (inf) loss_scale 8192.0000 (14333.1195) mem 7376MB [2024-08-20 00:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][720/1251] eta 0:02:09 lr 0.000329 wd 0.0500 time 0.2473 (0.2434) data time 0.0010 (0.0018) model time 0.2463 (0.2417) loss 4.4822 (5.1701) grad_norm 3.0342 (inf) loss_scale 8192.0000 (14247.9445) mem 7376MB [2024-08-20 00:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][730/1251] eta 0:02:06 lr 0.000330 wd 0.0500 time 
0.2350 (0.2434) data time 0.0011 (0.0018) model time 0.2339 (0.2417) loss 4.6423 (5.1674) grad_norm 3.0212 (inf) loss_scale 8192.0000 (14165.0999) mem 7376MB [2024-08-20 00:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][740/1251] eta 0:02:04 lr 0.000330 wd 0.0500 time 0.2410 (0.2434) data time 0.0012 (0.0018) model time 0.2398 (0.2417) loss 5.7354 (5.1684) grad_norm 2.6716 (inf) loss_scale 8192.0000 (14084.4912) mem 7376MB [2024-08-20 00:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][750/1251] eta 0:02:01 lr 0.000331 wd 0.0500 time 0.2462 (0.2434) data time 0.0007 (0.0018) model time 0.2455 (0.2417) loss 5.7524 (5.1694) grad_norm 2.3541 (inf) loss_scale 8192.0000 (14006.0293) mem 7376MB [2024-08-20 00:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][760/1251] eta 0:01:59 lr 0.000331 wd 0.0500 time 0.2437 (0.2434) data time 0.0008 (0.0018) model time 0.2429 (0.2417) loss 5.4246 (5.1735) grad_norm 2.4369 (inf) loss_scale 8192.0000 (13929.6294) mem 7376MB [2024-08-20 00:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][770/1251] eta 0:01:57 lr 0.000331 wd 0.0500 time 0.2351 (0.2434) data time 0.0011 (0.0018) model time 0.2340 (0.2417) loss 5.8409 (5.1758) grad_norm 2.8704 (inf) loss_scale 8192.0000 (13855.2114) mem 7376MB [2024-08-20 00:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][780/1251] eta 0:01:54 lr 0.000332 wd 0.0500 time 0.2443 (0.2433) data time 0.0009 (0.0018) model time 0.2434 (0.2417) loss 4.9666 (5.1757) grad_norm 2.7292 (inf) loss_scale 8192.0000 (13782.6991) mem 7376MB [2024-08-20 00:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][790/1251] eta 0:01:52 lr 0.000332 wd 0.0500 time 0.2410 (0.2433) data time 0.0012 (0.0018) model time 0.2398 (0.2416) loss 5.1813 (5.1753) grad_norm 2.0409 (inf) loss_scale 8192.0000 (13712.0202) mem 7376MB [2024-08-20 00:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [6/300][800/1251] eta 0:01:49 lr 0.000333 wd 0.0500 time 0.2309 (0.2433) data time 0.0012 (0.0018) model time 0.2297 (0.2416) loss 4.2945 (5.1732) grad_norm 1.9926 (inf) loss_scale 8192.0000 (13643.1061) mem 7376MB [2024-08-20 00:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][810/1251] eta 0:01:47 lr 0.000333 wd 0.0500 time 0.2306 (0.2433) data time 0.0012 (0.0018) model time 0.2295 (0.2416) loss 5.0778 (5.1716) grad_norm 2.1625 (inf) loss_scale 8192.0000 (13575.8915) mem 7376MB [2024-08-20 00:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][820/1251] eta 0:01:44 lr 0.000333 wd 0.0500 time 0.2448 (0.2433) data time 0.0010 (0.0018) model time 0.2438 (0.2416) loss 4.1825 (5.1722) grad_norm 2.6943 (inf) loss_scale 8192.0000 (13510.3143) mem 7376MB [2024-08-20 00:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][830/1251] eta 0:01:42 lr 0.000334 wd 0.0500 time 0.2441 (0.2433) data time 0.0009 (0.0018) model time 0.2432 (0.2416) loss 4.9868 (5.1717) grad_norm 2.2900 (inf) loss_scale 8192.0000 (13446.3153) mem 7376MB [2024-08-20 00:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][840/1251] eta 0:01:39 lr 0.000334 wd 0.0500 time 0.2413 (0.2432) data time 0.0010 (0.0017) model time 0.2402 (0.2416) loss 5.3338 (5.1676) grad_norm 2.9525 (inf) loss_scale 8192.0000 (13383.8383) mem 7376MB [2024-08-20 00:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][850/1251] eta 0:01:37 lr 0.000335 wd 0.0500 time 0.2420 (0.2432) data time 0.0010 (0.0017) model time 0.2411 (0.2416) loss 6.0052 (5.1683) grad_norm 2.2144 (inf) loss_scale 8192.0000 (13322.8296) mem 7376MB [2024-08-20 00:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][860/1251] eta 0:01:35 lr 0.000335 wd 0.0500 time 0.2318 (0.2432) data time 0.0009 (0.0017) model time 0.2309 (0.2415) loss 4.4463 (5.1676) grad_norm 2.6881 (inf) loss_scale 8192.0000 (13263.2381) 
mem 7376MB [2024-08-20 00:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][870/1251] eta 0:01:32 lr 0.000335 wd 0.0500 time 0.2340 (0.2432) data time 0.0011 (0.0017) model time 0.2329 (0.2416) loss 4.9845 (5.1687) grad_norm 3.3356 (inf) loss_scale 8192.0000 (13205.0149) mem 7376MB [2024-08-20 00:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][880/1251] eta 0:01:30 lr 0.000336 wd 0.0500 time 0.2363 (0.2432) data time 0.0011 (0.0017) model time 0.2352 (0.2416) loss 5.5206 (5.1699) grad_norm 2.6407 (inf) loss_scale 8192.0000 (13148.1135) mem 7376MB [2024-08-20 00:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][890/1251] eta 0:01:27 lr 0.000336 wd 0.0500 time 0.2431 (0.2431) data time 0.0007 (0.0017) model time 0.2424 (0.2415) loss 4.6285 (5.1696) grad_norm 1.9174 (inf) loss_scale 8192.0000 (13092.4893) mem 7376MB [2024-08-20 00:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][900/1251] eta 0:01:25 lr 0.000337 wd 0.0500 time 0.2442 (0.2431) data time 0.0007 (0.0017) model time 0.2435 (0.2415) loss 4.9533 (5.1701) grad_norm 2.0794 (inf) loss_scale 8192.0000 (13038.0999) mem 7376MB [2024-08-20 00:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][910/1251] eta 0:01:22 lr 0.000337 wd 0.0500 time 0.2418 (0.2434) data time 0.0012 (0.0017) model time 0.2406 (0.2418) loss 5.2527 (5.1683) grad_norm 4.1675 (inf) loss_scale 8192.0000 (12984.9045) mem 7376MB [2024-08-20 00:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][920/1251] eta 0:01:20 lr 0.000337 wd 0.0500 time 0.2370 (0.2436) data time 0.0013 (0.0017) model time 0.2357 (0.2420) loss 5.5765 (5.1675) grad_norm 2.5277 (inf) loss_scale 8192.0000 (12932.8643) mem 7376MB [2024-08-20 00:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][930/1251] eta 0:01:18 lr 0.000338 wd 0.0500 time 0.2485 (0.2435) data time 0.0008 (0.0017) model time 0.2478 (0.2420) loss 
5.9244 (5.1671) grad_norm 4.7042 (inf) loss_scale 8192.0000 (12881.9420) mem 7376MB [2024-08-20 00:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][940/1251] eta 0:01:15 lr 0.000338 wd 0.0500 time 0.2335 (0.2435) data time 0.0009 (0.0017) model time 0.2326 (0.2420) loss 4.7181 (5.1681) grad_norm 2.5825 (inf) loss_scale 8192.0000 (12832.1020) mem 7376MB [2024-08-20 00:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][950/1251] eta 0:01:14 lr 0.000339 wd 0.0500 time 0.2362 (0.2463) data time 0.0010 (0.0017) model time 0.2352 (0.2449) loss 5.7710 (5.1704) grad_norm 2.7277 (inf) loss_scale 8192.0000 (12783.3102) mem 7376MB [2024-08-20 00:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][960/1251] eta 0:01:11 lr 0.000339 wd 0.0500 time 0.2471 (0.2462) data time 0.0010 (0.0017) model time 0.2461 (0.2448) loss 5.2354 (5.1709) grad_norm 2.3685 (inf) loss_scale 8192.0000 (12735.5338) mem 7376MB [2024-08-20 00:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][970/1251] eta 0:01:09 lr 0.000339 wd 0.0500 time 0.2409 (0.2462) data time 0.0011 (0.0017) model time 0.2399 (0.2448) loss 4.9951 (5.1687) grad_norm 2.7839 (inf) loss_scale 8192.0000 (12688.7415) mem 7376MB [2024-08-20 00:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][980/1251] eta 0:01:06 lr 0.000340 wd 0.0500 time 0.2432 (0.2461) data time 0.0010 (0.0017) model time 0.2422 (0.2447) loss 5.2254 (5.1657) grad_norm 1.9354 (inf) loss_scale 8192.0000 (12642.9032) mem 7376MB [2024-08-20 00:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][990/1251] eta 0:01:04 lr 0.000340 wd 0.0500 time 0.2429 (0.2461) data time 0.0009 (0.0017) model time 0.2419 (0.2447) loss 5.5810 (5.1687) grad_norm 2.1911 (inf) loss_scale 8192.0000 (12597.9899) mem 7376MB [2024-08-20 00:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1000/1251] eta 0:01:01 lr 0.000341 wd 0.0500 time 
0.2528 (0.2461) data time 0.0009 (0.0017) model time 0.2518 (0.2447) loss 5.9536 (5.1699) grad_norm 3.6078 (inf) loss_scale 8192.0000 (12553.9740) mem 7376MB [2024-08-20 00:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1010/1251] eta 0:00:59 lr 0.000341 wd 0.0500 time 0.2450 (0.2460) data time 0.0010 (0.0017) model time 0.2439 (0.2446) loss 4.3452 (5.1686) grad_norm 2.8074 (inf) loss_scale 8192.0000 (12510.8289) mem 7376MB [2024-08-20 00:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1020/1251] eta 0:00:56 lr 0.000341 wd 0.0500 time 0.2574 (0.2460) data time 0.0010 (0.0016) model time 0.2564 (0.2446) loss 4.3594 (5.1687) grad_norm 2.6011 (inf) loss_scale 8192.0000 (12468.5289) mem 7376MB [2024-08-20 00:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1030/1251] eta 0:00:54 lr 0.000342 wd 0.0500 time 0.2487 (0.2460) data time 0.0011 (0.0016) model time 0.2476 (0.2446) loss 4.9918 (5.1693) grad_norm 2.1552 (inf) loss_scale 8192.0000 (12427.0495) mem 7376MB [2024-08-20 00:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1040/1251] eta 0:00:51 lr 0.000342 wd 0.0500 time 0.2570 (0.2459) data time 0.0008 (0.0016) model time 0.2562 (0.2445) loss 4.7143 (5.1705) grad_norm 2.9901 (inf) loss_scale 8192.0000 (12386.3670) mem 7376MB [2024-08-20 00:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1050/1251] eta 0:00:49 lr 0.000343 wd 0.0500 time 0.2489 (0.2459) data time 0.0009 (0.0016) model time 0.2480 (0.2445) loss 5.1489 (5.1722) grad_norm 3.7206 (inf) loss_scale 8192.0000 (12346.4586) mem 7376MB [2024-08-20 00:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1060/1251] eta 0:00:46 lr 0.000343 wd 0.0500 time 0.2403 (0.2458) data time 0.0011 (0.0016) model time 0.2391 (0.2444) loss 5.4069 (5.1714) grad_norm 2.9843 (inf) loss_scale 8192.0000 (12307.3025) mem 7376MB [2024-08-20 00:01:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [6/300][1070/1251] eta 0:00:44 lr 0.000343 wd 0.0500 time 0.2480 (0.2458) data time 0.0011 (0.0016) model time 0.2469 (0.2444) loss 4.3082 (5.1698) grad_norm 2.6042 (inf) loss_scale 8192.0000 (12268.8777) mem 7376MB [2024-08-20 00:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1080/1251] eta 0:00:42 lr 0.000344 wd 0.0500 time 0.2382 (0.2458) data time 0.0008 (0.0016) model time 0.2374 (0.2443) loss 5.9042 (5.1680) grad_norm 2.4070 (inf) loss_scale 8192.0000 (12231.1637) mem 7376MB [2024-08-20 00:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1090/1251] eta 0:00:39 lr 0.000344 wd 0.0500 time 0.2466 (0.2457) data time 0.0011 (0.0016) model time 0.2455 (0.2443) loss 4.5457 (5.1695) grad_norm 2.6899 (inf) loss_scale 8192.0000 (12194.1412) mem 7376MB [2024-08-20 00:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1100/1251] eta 0:00:37 lr 0.000345 wd 0.0500 time 0.2445 (0.2457) data time 0.0010 (0.0016) model time 0.2435 (0.2443) loss 5.4456 (5.1702) grad_norm 1.9870 (inf) loss_scale 8192.0000 (12157.7911) mem 7376MB [2024-08-20 00:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1110/1251] eta 0:00:34 lr 0.000345 wd 0.0500 time 0.2427 (0.2456) data time 0.0008 (0.0016) model time 0.2419 (0.2442) loss 5.1643 (5.1698) grad_norm 2.3472 (inf) loss_scale 8192.0000 (12122.0954) mem 7376MB [2024-08-20 00:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1120/1251] eta 0:00:32 lr 0.000345 wd 0.0500 time 0.2358 (0.2456) data time 0.0011 (0.0016) model time 0.2347 (0.2442) loss 4.3752 (5.1705) grad_norm 2.4979 (inf) loss_scale 8192.0000 (12087.0366) mem 7376MB [2024-08-20 00:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1130/1251] eta 0:00:29 lr 0.000346 wd 0.0500 time 0.2420 (0.2455) data time 0.0011 (0.0016) model time 0.2410 (0.2441) loss 4.8381 (5.1687) grad_norm 3.2129 (inf) 
loss_scale 8192.0000 (12052.5977) mem 7376MB [2024-08-20 00:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1140/1251] eta 0:00:27 lr 0.000346 wd 0.0500 time 0.2437 (0.2455) data time 0.0008 (0.0016) model time 0.2429 (0.2441) loss 4.0285 (5.1700) grad_norm 2.4208 (inf) loss_scale 8192.0000 (12018.7625) mem 7376MB [2024-08-20 00:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1150/1251] eta 0:00:24 lr 0.000347 wd 0.0500 time 0.2413 (0.2455) data time 0.0011 (0.0016) model time 0.2402 (0.2441) loss 4.6836 (5.1687) grad_norm 2.5615 (inf) loss_scale 8192.0000 (11985.5152) mem 7376MB [2024-08-20 00:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1160/1251] eta 0:00:22 lr 0.000347 wd 0.0500 time 0.2405 (0.2454) data time 0.0008 (0.0016) model time 0.2398 (0.2440) loss 5.7495 (5.1676) grad_norm 2.0781 (inf) loss_scale 8192.0000 (11952.8407) mem 7376MB [2024-08-20 00:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1170/1251] eta 0:00:19 lr 0.000347 wd 0.0500 time 0.2458 (0.2454) data time 0.0010 (0.0016) model time 0.2449 (0.2440) loss 4.5805 (5.1669) grad_norm 2.9112 (inf) loss_scale 8192.0000 (11920.7242) mem 7376MB [2024-08-20 00:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1180/1251] eta 0:00:17 lr 0.000348 wd 0.0500 time 0.2429 (0.2455) data time 0.0011 (0.0016) model time 0.2418 (0.2441) loss 5.1004 (5.1637) grad_norm 2.6008 (inf) loss_scale 8192.0000 (11889.1516) mem 7376MB [2024-08-20 00:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1190/1251] eta 0:00:14 lr 0.000348 wd 0.0500 time 0.2383 (0.2455) data time 0.0009 (0.0016) model time 0.2374 (0.2441) loss 4.6038 (5.1616) grad_norm 2.5690 (inf) loss_scale 8192.0000 (11858.1092) mem 7376MB [2024-08-20 00:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1200/1251] eta 0:00:12 lr 0.000349 wd 0.0500 time 0.2338 (0.2455) data time 0.0010 
(0.0016) model time 0.2328 (0.2441) loss 4.6205 (5.1619) grad_norm 2.1802 (inf) loss_scale 8192.0000 (11827.5837) mem 7376MB [2024-08-20 00:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1210/1251] eta 0:00:10 lr 0.000349 wd 0.0500 time 0.2438 (0.2454) data time 0.0010 (0.0016) model time 0.2428 (0.2440) loss 4.8866 (5.1608) grad_norm 2.5212 (inf) loss_scale 8192.0000 (11797.5623) mem 7376MB [2024-08-20 00:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1220/1251] eta 0:00:07 lr 0.000349 wd 0.0500 time 0.2384 (0.2454) data time 0.0011 (0.0016) model time 0.2373 (0.2440) loss 3.9062 (5.1609) grad_norm 2.7805 (inf) loss_scale 8192.0000 (11768.0328) mem 7376MB [2024-08-20 00:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1230/1251] eta 0:00:05 lr 0.000350 wd 0.0500 time 0.2383 (0.2454) data time 0.0008 (0.0016) model time 0.2375 (0.2440) loss 5.7266 (5.1618) grad_norm 2.3072 (inf) loss_scale 8192.0000 (11738.9829) mem 7376MB [2024-08-20 00:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1240/1251] eta 0:00:02 lr 0.000350 wd 0.0500 time 0.2267 (0.2453) data time 0.0007 (0.0016) model time 0.2260 (0.2439) loss 5.3902 (5.1622) grad_norm 2.1424 (inf) loss_scale 8192.0000 (11710.4013) mem 7376MB [2024-08-20 00:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [6/300][1250/1251] eta 0:00:00 lr 0.000351 wd 0.0500 time 0.2240 (0.2451) data time 0.0007 (0.0016) model time 0.2234 (0.2437) loss 5.2358 (5.1637) grad_norm 2.3748 (inf) loss_scale 8192.0000 (11682.2766) mem 7376MB [2024-08-20 00:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 6 training takes 0:05:06 [2024-08-20 00:02:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-20 00:02:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 00:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.428 (0.428) Loss 1.5996 (1.5996) Acc@1 65.430 (65.430) Acc@5 87.598 (87.598) Mem 7376MB [2024-08-20 00:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.111) Loss 2.1465 (2.1747) Acc@1 47.266 (49.938) Acc@5 79.492 (77.264) Mem 7376MB [2024-08-20 00:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 2.8926 (2.1420) Acc@1 41.504 (50.363) Acc@5 65.234 (78.172) Mem 7376MB [2024-08-20 00:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 3.1191 (2.3804) Acc@1 38.867 (47.354) Acc@5 61.133 (73.929) Mem 7376MB [2024-08-20 00:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 3.2422 (2.5349) Acc@1 32.910 (44.893) Acc@5 56.934 (71.210) Mem 7376MB [2024-08-20 00:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 44.892 Acc@5 71.264 [2024-08-20 00:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 44.9% [2024-08-20 00:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 44.89% [2024-08-20 00:02:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-20 00:02:43 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-20 00:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.453 (0.453) Loss 6.8477 (6.8477) Acc@1 0.293 (0.293) Acc@5 2.246 (2.246) Mem 7376MB [2024-08-20 00:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.115) Loss 7.1758 (6.9876) Acc@1 0.293 (0.355) Acc@5 0.879 (1.314) Mem 7376MB [2024-08-20 00:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.098) Loss 6.9102 (6.9550) Acc@1 0.098 (0.302) Acc@5 0.391 (1.246) Mem 7376MB [2024-08-20 00:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.092) Loss 6.7852 (6.9258) Acc@1 0.293 (0.328) Acc@5 1.855 (1.355) Mem 7376MB [2024-08-20 00:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 6.9648 (6.9083) Acc@1 0.098 (0.341) Acc@5 1.562 (1.460) Mem 7376MB [2024-08-20 00:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.470 Acc@5 1.726 [2024-08-20 00:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.5% [2024-08-20 00:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.47% [2024-08-20 00:02:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-20 00:02:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-20 00:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][0/1251] eta 0:14:22 lr 0.000351 wd 0.0500 time 0.6892 (0.6892) data time 0.4439 (0.4439) model time 0.0000 (0.0000) loss 5.3013 (5.3013) grad_norm 2.6002 (2.6002) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][10/1251] eta 0:05:52 lr 0.000351 wd 0.0500 time 0.2441 (0.2842) data time 0.0007 (0.0414) model time 0.0000 (0.0000) loss 5.7649 (5.2224) grad_norm 2.0733 (2.4084) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][20/1251] eta 0:05:23 lr 0.000351 wd 0.0500 time 0.2464 (0.2631) data time 0.0012 (0.0223) model time 0.0000 (0.0000) loss 5.4280 (5.1309) grad_norm 2.4994 (2.6504) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][30/1251] eta 0:05:12 lr 0.000352 wd 0.0500 time 0.2430 (0.2557) data time 0.0011 (0.0155) model time 0.0000 (0.0000) loss 5.1454 (5.1375) grad_norm 2.5401 (2.7477) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][40/1251] eta 0:05:05 lr 0.000352 wd 0.0500 time 0.2442 (0.2522) data time 0.0010 (0.0120) model time 0.0000 (0.0000) loss 4.8737 (5.1429) grad_norm 3.4759 (2.7121) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][50/1251] eta 0:04:59 lr 0.000353 wd 0.0500 time 0.2408 (0.2496) data time 0.0010 (0.0098) model time 0.0000 (0.0000) loss 4.7868 (5.1036) grad_norm 2.1341 (2.6968) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][60/1251] eta 0:04:55 lr 0.000353 wd 0.0500 time 0.2493 (0.2481) data time 0.0010 (0.0084) model time 0.2483 (0.2393) loss 5.0146 
(5.1092) grad_norm 2.8616 (2.6684) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][70/1251] eta 0:04:51 lr 0.000353 wd 0.0500 time 0.2464 (0.2471) data time 0.0010 (0.0074) model time 0.2455 (0.2394) loss 5.4023 (5.1159) grad_norm 2.4895 (2.6346) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][80/1251] eta 0:04:48 lr 0.000354 wd 0.0500 time 0.2466 (0.2460) data time 0.0011 (0.0066) model time 0.2455 (0.2386) loss 4.8846 (5.1002) grad_norm 3.9901 (2.6595) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][90/1251] eta 0:04:44 lr 0.000354 wd 0.0500 time 0.2456 (0.2453) data time 0.0007 (0.0060) model time 0.2448 (0.2387) loss 3.9612 (5.0972) grad_norm 4.0937 (2.6802) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][100/1251] eta 0:04:42 lr 0.000355 wd 0.0500 time 0.2461 (0.2451) data time 0.0008 (0.0055) model time 0.2454 (0.2393) loss 4.8074 (5.0969) grad_norm 2.7411 (2.6624) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][110/1251] eta 0:04:41 lr 0.000355 wd 0.0500 time 0.2439 (0.2464) data time 0.0010 (0.0051) model time 0.2429 (0.2425) loss 4.4511 (5.0940) grad_norm 2.6254 (2.6604) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][120/1251] eta 0:04:38 lr 0.000355 wd 0.0500 time 0.2406 (0.2460) data time 0.0011 (0.0048) model time 0.2395 (0.2422) loss 5.1303 (5.0933) grad_norm 2.3747 (2.6499) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][130/1251] eta 0:04:35 lr 0.000356 wd 0.0500 
time 0.2358 (0.2456) data time 0.0012 (0.0045) model time 0.2346 (0.2419) loss 5.6630 (5.1013) grad_norm 2.4420 (2.6259) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][140/1251] eta 0:04:32 lr 0.000356 wd 0.0500 time 0.2327 (0.2452) data time 0.0011 (0.0043) model time 0.2316 (0.2415) loss 4.8853 (5.0884) grad_norm 2.8494 (2.6214) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][150/1251] eta 0:04:29 lr 0.000357 wd 0.0500 time 0.2411 (0.2449) data time 0.0008 (0.0041) model time 0.2403 (0.2414) loss 5.4232 (5.0799) grad_norm 2.7490 (2.6254) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][160/1251] eta 0:04:26 lr 0.000357 wd 0.0500 time 0.2358 (0.2446) data time 0.0009 (0.0039) model time 0.2349 (0.2412) loss 4.9799 (5.0833) grad_norm 1.9972 (2.6149) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][170/1251] eta 0:04:24 lr 0.000357 wd 0.0500 time 0.2359 (0.2445) data time 0.0008 (0.0037) model time 0.2350 (0.2411) loss 5.1385 (5.0876) grad_norm 1.7907 (2.5949) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][180/1251] eta 0:04:22 lr 0.000358 wd 0.0500 time 0.2472 (0.2455) data time 0.0010 (0.0036) model time 0.2461 (0.2428) loss 5.2323 (5.1000) grad_norm 2.6547 (2.5890) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][190/1251] eta 0:04:21 lr 0.000358 wd 0.0500 time 0.2330 (0.2465) data time 0.0008 (0.0034) model time 0.2322 (0.2442) loss 5.5218 (5.0912) grad_norm 2.5809 (2.5884) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:38 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [7/300][200/1251] eta 0:04:18 lr 0.000359 wd 0.0500 time 0.2369 (0.2462) data time 0.0010 (0.0033) model time 0.2359 (0.2440) loss 5.1189 (5.0879) grad_norm 2.2493 (2.5990) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][210/1251] eta 0:04:16 lr 0.000359 wd 0.0500 time 0.2434 (0.2459) data time 0.0008 (0.0032) model time 0.2426 (0.2436) loss 5.6600 (5.1131) grad_norm 2.4425 (2.6121) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][220/1251] eta 0:04:13 lr 0.000359 wd 0.0500 time 0.2346 (0.2457) data time 0.0010 (0.0031) model time 0.2336 (0.2434) loss 5.1485 (5.1031) grad_norm 2.1551 (2.6100) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][230/1251] eta 0:04:10 lr 0.000360 wd 0.0500 time 0.2427 (0.2455) data time 0.0007 (0.0030) model time 0.2420 (0.2433) loss 5.4018 (5.1091) grad_norm 2.4313 (2.5964) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][240/1251] eta 0:04:08 lr 0.000360 wd 0.0500 time 0.2412 (0.2453) data time 0.0008 (0.0030) model time 0.2404 (0.2431) loss 4.0085 (5.0976) grad_norm 3.3128 (2.5932) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][250/1251] eta 0:04:05 lr 0.000361 wd 0.0500 time 0.2445 (0.2451) data time 0.0012 (0.0029) model time 0.2433 (0.2429) loss 4.1824 (5.0801) grad_norm 2.9160 (2.6037) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][260/1251] eta 0:04:02 lr 0.000361 wd 0.0500 time 0.2509 (0.2450) data time 0.0007 (0.0028) model time 0.2502 (0.2428) loss 5.4970 (5.0777) grad_norm 2.4879 (2.6046) 
loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][270/1251] eta 0:04:00 lr 0.000361 wd 0.0500 time 0.2407 (0.2449) data time 0.0007 (0.0028) model time 0.2400 (0.2427) loss 5.6718 (5.0840) grad_norm 1.7629 (2.5967) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][280/1251] eta 0:03:57 lr 0.000362 wd 0.0500 time 0.2480 (0.2447) data time 0.0007 (0.0027) model time 0.2473 (0.2425) loss 4.8490 (5.0776) grad_norm 2.3432 (2.5839) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][290/1251] eta 0:03:55 lr 0.000362 wd 0.0500 time 0.2427 (0.2446) data time 0.0008 (0.0026) model time 0.2420 (0.2424) loss 3.8949 (5.0665) grad_norm 2.7354 (2.5763) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][300/1251] eta 0:03:52 lr 0.000363 wd 0.0500 time 0.2452 (0.2445) data time 0.0008 (0.0026) model time 0.2443 (0.2424) loss 4.8657 (5.0690) grad_norm 1.9418 (2.5779) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][310/1251] eta 0:03:50 lr 0.000363 wd 0.0500 time 0.2466 (0.2445) data time 0.0010 (0.0025) model time 0.2456 (0.2423) loss 4.9848 (5.0641) grad_norm 2.2502 (2.5679) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][320/1251] eta 0:03:47 lr 0.000363 wd 0.0500 time 0.2415 (0.2444) data time 0.0009 (0.0025) model time 0.2406 (0.2423) loss 4.6360 (5.0647) grad_norm 2.0356 (2.5553) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][330/1251] eta 0:03:44 lr 0.000364 wd 0.0500 time 0.2328 (0.2443) data time 
0.0008 (0.0025) model time 0.2320 (0.2422) loss 4.2388 (5.0596) grad_norm 2.0829 (2.5575) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][340/1251] eta 0:03:42 lr 0.000364 wd 0.0500 time 0.2437 (0.2442) data time 0.0010 (0.0024) model time 0.2427 (0.2421) loss 4.8828 (5.0568) grad_norm 1.8941 (2.5755) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][350/1251] eta 0:03:40 lr 0.000365 wd 0.0500 time 0.2399 (0.2442) data time 0.0009 (0.0024) model time 0.2390 (0.2421) loss 5.5585 (5.0638) grad_norm 2.6242 (2.6043) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][360/1251] eta 0:03:37 lr 0.000365 wd 0.0500 time 0.2417 (0.2442) data time 0.0010 (0.0024) model time 0.2407 (0.2421) loss 4.2134 (5.0616) grad_norm 2.2980 (2.5988) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][370/1251] eta 0:03:35 lr 0.000365 wd 0.0500 time 0.2483 (0.2441) data time 0.0011 (0.0023) model time 0.2472 (0.2421) loss 4.9942 (5.0582) grad_norm 2.3266 (2.5960) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][380/1251] eta 0:03:32 lr 0.000366 wd 0.0500 time 0.2374 (0.2440) data time 0.0010 (0.0023) model time 0.2364 (0.2420) loss 4.5642 (5.0536) grad_norm 3.2869 (2.6013) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][390/1251] eta 0:03:30 lr 0.000366 wd 0.0500 time 0.2348 (0.2440) data time 0.0014 (0.0023) model time 0.2334 (0.2420) loss 4.8179 (5.0562) grad_norm 2.3973 (2.6057) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [7/300][400/1251] eta 0:03:27 lr 0.000367 wd 0.0500 time 0.2407 (0.2440) data time 0.0008 (0.0023) model time 0.2399 (0.2420) loss 4.8540 (5.0505) grad_norm 1.9355 (2.6066) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][410/1251] eta 0:03:25 lr 0.000367 wd 0.0500 time 0.2349 (0.2438) data time 0.0012 (0.0022) model time 0.2337 (0.2419) loss 5.3794 (5.0515) grad_norm 3.4044 (2.6049) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][420/1251] eta 0:03:22 lr 0.000367 wd 0.0500 time 0.2418 (0.2438) data time 0.0009 (0.0022) model time 0.2409 (0.2418) loss 5.0166 (5.0455) grad_norm 2.4023 (2.6142) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][430/1251] eta 0:03:20 lr 0.000368 wd 0.0500 time 0.2386 (0.2437) data time 0.0010 (0.0022) model time 0.2376 (0.2418) loss 5.6977 (5.0410) grad_norm 1.9283 (2.6092) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][440/1251] eta 0:03:17 lr 0.000368 wd 0.0500 time 0.2421 (0.2436) data time 0.0007 (0.0022) model time 0.2413 (0.2417) loss 5.8609 (5.0432) grad_norm 2.7552 (2.6029) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][450/1251] eta 0:03:15 lr 0.000369 wd 0.0500 time 0.2334 (0.2436) data time 0.0011 (0.0021) model time 0.2324 (0.2417) loss 5.3897 (5.0401) grad_norm 1.9584 (2.6070) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][460/1251] eta 0:03:12 lr 0.000369 wd 0.0500 time 0.2354 (0.2436) data time 0.0010 (0.0021) model time 0.2344 (0.2417) loss 5.1501 (5.0436) grad_norm 4.9032 (2.6082) loss_scale 8192.0000 
(8192.0000) mem 7376MB [2024-08-20 00:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][470/1251] eta 0:03:10 lr 0.000369 wd 0.0500 time 0.2401 (0.2440) data time 0.0008 (0.0021) model time 0.2393 (0.2421) loss 5.4171 (5.0448) grad_norm 2.2066 (2.6102) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][480/1251] eta 0:03:08 lr 0.000370 wd 0.0500 time 0.2491 (0.2440) data time 0.0008 (0.0021) model time 0.2483 (0.2422) loss 5.2160 (5.0395) grad_norm 2.5690 (2.6116) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][490/1251] eta 0:03:05 lr 0.000370 wd 0.0500 time 0.2366 (0.2440) data time 0.0011 (0.0021) model time 0.2356 (0.2422) loss 4.6396 (5.0360) grad_norm 2.2230 (2.6110) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][500/1251] eta 0:03:03 lr 0.000371 wd 0.0500 time 0.2450 (0.2440) data time 0.0009 (0.0021) model time 0.2441 (0.2422) loss 4.8180 (5.0370) grad_norm 2.2694 (2.6076) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][510/1251] eta 0:03:00 lr 0.000371 wd 0.0500 time 0.2376 (0.2440) data time 0.0011 (0.0020) model time 0.2365 (0.2422) loss 5.4265 (5.0404) grad_norm 3.2029 (2.6069) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][520/1251] eta 0:02:58 lr 0.000371 wd 0.0500 time 0.2478 (0.2440) data time 0.0008 (0.0020) model time 0.2470 (0.2422) loss 5.6068 (5.0404) grad_norm 2.8667 (2.6076) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][530/1251] eta 0:02:55 lr 0.000372 wd 0.0500 time 0.2375 (0.2439) data time 0.0011 (0.0020) model 
time 0.2365 (0.2421) loss 4.8719 (5.0411) grad_norm 2.1163 (2.6057) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][540/1251] eta 0:02:53 lr 0.000372 wd 0.0500 time 0.2458 (0.2440) data time 0.0007 (0.0020) model time 0.2451 (0.2422) loss 4.1882 (5.0358) grad_norm 2.5796 (2.6048) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][550/1251] eta 0:02:50 lr 0.000373 wd 0.0500 time 0.2405 (0.2439) data time 0.0010 (0.0020) model time 0.2395 (0.2422) loss 4.8306 (5.0381) grad_norm 2.3292 (2.6019) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][560/1251] eta 0:02:48 lr 0.000373 wd 0.0500 time 0.2407 (0.2439) data time 0.0011 (0.0020) model time 0.2396 (0.2421) loss 5.4962 (5.0348) grad_norm 5.5801 (2.6203) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][570/1251] eta 0:02:46 lr 0.000373 wd 0.0500 time 0.2404 (0.2438) data time 0.0010 (0.0020) model time 0.2394 (0.2420) loss 4.5943 (5.0317) grad_norm 2.3991 (2.6259) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][580/1251] eta 0:02:43 lr 0.000374 wd 0.0500 time 0.2394 (0.2437) data time 0.0012 (0.0019) model time 0.2383 (0.2420) loss 5.5000 (5.0281) grad_norm 2.8995 (2.6262) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][590/1251] eta 0:02:41 lr 0.000374 wd 0.0500 time 0.2454 (0.2437) data time 0.0009 (0.0019) model time 0.2445 (0.2420) loss 5.4611 (5.0250) grad_norm 1.9678 (2.6220) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][600/1251] 
eta 0:02:38 lr 0.000375 wd 0.0500 time 0.2452 (0.2437) data time 0.0009 (0.0019) model time 0.2443 (0.2420) loss 4.6131 (5.0241) grad_norm 2.3504 (2.6218) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][610/1251] eta 0:02:36 lr 0.000375 wd 0.0500 time 0.2394 (0.2437) data time 0.0009 (0.0019) model time 0.2385 (0.2420) loss 5.1594 (5.0291) grad_norm 2.5779 (2.6189) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][620/1251] eta 0:02:33 lr 0.000375 wd 0.0500 time 0.2335 (0.2436) data time 0.0008 (0.0019) model time 0.2327 (0.2419) loss 5.1346 (5.0254) grad_norm 2.1950 (2.6150) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][630/1251] eta 0:02:31 lr 0.000376 wd 0.0500 time 0.2394 (0.2438) data time 0.0010 (0.0019) model time 0.2384 (0.2422) loss 4.6935 (5.0262) grad_norm 2.0962 (2.6097) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][640/1251] eta 0:02:28 lr 0.000376 wd 0.0500 time 0.2473 (0.2438) data time 0.0008 (0.0019) model time 0.2465 (0.2421) loss 4.6375 (5.0289) grad_norm 2.0873 (2.6042) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][650/1251] eta 0:02:26 lr 0.000377 wd 0.0500 time 0.2343 (0.2438) data time 0.0009 (0.0019) model time 0.2334 (0.2421) loss 5.6646 (5.0291) grad_norm 2.3083 (2.6013) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][660/1251] eta 0:02:24 lr 0.000377 wd 0.0500 time 0.2396 (0.2438) data time 0.0009 (0.0018) model time 0.2388 (0.2421) loss 5.6469 (5.0326) grad_norm 2.4341 (2.6015) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 
00:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][670/1251] eta 0:02:21 lr 0.000377 wd 0.0500 time 0.2446 (0.2438) data time 0.0009 (0.0018) model time 0.2437 (0.2422) loss 5.1390 (5.0363) grad_norm 2.5590 (2.5999) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][680/1251] eta 0:02:19 lr 0.000378 wd 0.0500 time 0.2465 (0.2438) data time 0.0011 (0.0018) model time 0.2454 (0.2422) loss 5.2778 (5.0371) grad_norm 2.7042 (2.5979) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][690/1251] eta 0:02:16 lr 0.000378 wd 0.0500 time 0.2450 (0.2438) data time 0.0009 (0.0018) model time 0.2441 (0.2422) loss 5.2834 (5.0385) grad_norm 3.1258 (2.5966) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][700/1251] eta 0:02:14 lr 0.000379 wd 0.0500 time 0.2387 (0.2442) data time 0.0008 (0.0018) model time 0.2379 (0.2426) loss 4.8838 (5.0351) grad_norm 2.0224 (2.5960) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][710/1251] eta 0:02:12 lr 0.000379 wd 0.0500 time 0.2445 (0.2445) data time 0.0008 (0.0018) model time 0.2437 (0.2429) loss 4.2011 (5.0355) grad_norm 3.1554 (2.5955) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][720/1251] eta 0:02:09 lr 0.000379 wd 0.0500 time 0.2342 (0.2444) data time 0.0011 (0.0018) model time 0.2331 (0.2428) loss 3.8716 (5.0368) grad_norm 5.9493 (2.6008) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][730/1251] eta 0:02:07 lr 0.000380 wd 0.0500 time 0.2389 (0.2444) data time 0.0011 (0.0018) model time 0.2378 (0.2428) loss 5.6420 
(5.0406) grad_norm 2.2615 (2.6006) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][740/1251] eta 0:02:04 lr 0.000380 wd 0.0500 time 0.2454 (0.2444) data time 0.0008 (0.0018) model time 0.2446 (0.2428) loss 4.4044 (5.0400) grad_norm 2.3710 (2.6019) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][750/1251] eta 0:02:02 lr 0.000381 wd 0.0500 time 0.2463 (0.2444) data time 0.0010 (0.0018) model time 0.2453 (0.2428) loss 4.3726 (5.0317) grad_norm 2.5324 (2.5986) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][760/1251] eta 0:01:59 lr 0.000381 wd 0.0500 time 0.2334 (0.2443) data time 0.0011 (0.0017) model time 0.2324 (0.2428) loss 4.7664 (5.0278) grad_norm 2.1115 (2.5969) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][770/1251] eta 0:01:57 lr 0.000381 wd 0.0500 time 0.2385 (0.2443) data time 0.0008 (0.0017) model time 0.2377 (0.2427) loss 5.8199 (5.0297) grad_norm 2.3087 (2.5949) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][780/1251] eta 0:01:55 lr 0.000382 wd 0.0500 time 0.2401 (0.2442) data time 0.0008 (0.0017) model time 0.2394 (0.2427) loss 5.0600 (5.0327) grad_norm 2.6726 (2.5937) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][790/1251] eta 0:01:52 lr 0.000382 wd 0.0500 time 0.2401 (0.2442) data time 0.0008 (0.0017) model time 0.2393 (0.2426) loss 4.7765 (5.0304) grad_norm 2.5933 (2.5919) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][800/1251] eta 0:01:50 lr 0.000383 wd 0.0500 
time 0.2445 (0.2442) data time 0.0007 (0.0017) model time 0.2438 (0.2426) loss 5.7886 (5.0299) grad_norm 2.3217 (2.5960) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][810/1251] eta 0:01:47 lr 0.000383 wd 0.0500 time 0.2336 (0.2441) data time 0.0011 (0.0017) model time 0.2325 (0.2426) loss 4.0508 (5.0278) grad_norm 1.9925 (2.5949) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][820/1251] eta 0:01:45 lr 0.000383 wd 0.0500 time 0.2355 (0.2441) data time 0.0009 (0.0017) model time 0.2347 (0.2426) loss 3.8866 (5.0267) grad_norm 2.2806 (2.5929) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][830/1251] eta 0:01:42 lr 0.000384 wd 0.0500 time 0.2385 (0.2441) data time 0.0007 (0.0017) model time 0.2377 (0.2426) loss 4.7356 (5.0264) grad_norm 2.1668 (2.5939) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][840/1251] eta 0:01:40 lr 0.000384 wd 0.0500 time 0.2439 (0.2441) data time 0.0009 (0.0017) model time 0.2430 (0.2426) loss 4.6124 (5.0218) grad_norm 2.3427 (2.5918) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][850/1251] eta 0:01:37 lr 0.000385 wd 0.0500 time 0.2335 (0.2441) data time 0.0010 (0.0017) model time 0.2325 (0.2426) loss 5.0931 (5.0212) grad_norm 2.8047 (2.5912) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][860/1251] eta 0:01:35 lr 0.000385 wd 0.0500 time 0.2387 (0.2441) data time 0.0011 (0.0017) model time 0.2376 (0.2426) loss 5.2917 (5.0196) grad_norm 3.6730 (2.5949) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:21 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [7/300][870/1251] eta 0:01:32 lr 0.000385 wd 0.0500 time 0.2402 (0.2441) data time 0.0008 (0.0017) model time 0.2395 (0.2426) loss 3.8380 (5.0215) grad_norm 2.9534 (2.5939) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][880/1251] eta 0:01:30 lr 0.000386 wd 0.0500 time 0.2362 (0.2441) data time 0.0012 (0.0017) model time 0.2350 (0.2426) loss 5.0500 (5.0200) grad_norm 2.1846 (2.5929) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][890/1251] eta 0:01:28 lr 0.000386 wd 0.0500 time 0.2412 (0.2441) data time 0.0010 (0.0016) model time 0.2402 (0.2426) loss 5.2611 (5.0164) grad_norm 2.1700 (2.5893) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][900/1251] eta 0:01:25 lr 0.000387 wd 0.0500 time 0.2378 (0.2440) data time 0.0010 (0.0016) model time 0.2368 (0.2425) loss 4.2065 (5.0163) grad_norm 2.4028 (2.5860) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][910/1251] eta 0:01:23 lr 0.000387 wd 0.0500 time 0.2407 (0.2440) data time 0.0010 (0.0016) model time 0.2397 (0.2425) loss 5.4474 (5.0185) grad_norm 2.0467 (2.5861) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][920/1251] eta 0:01:20 lr 0.000387 wd 0.0500 time 0.2421 (0.2440) data time 0.0011 (0.0016) model time 0.2410 (0.2425) loss 4.8341 (5.0201) grad_norm 2.7075 (2.5878) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][930/1251] eta 0:01:18 lr 0.000388 wd 0.0500 time 0.2419 (0.2440) data time 0.0011 (0.0016) model time 0.2408 (0.2425) loss 5.4780 (5.0222) grad_norm 2.7713 (2.5905) 
loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][940/1251] eta 0:01:15 lr 0.000388 wd 0.0500 time 0.2470 (0.2440) data time 0.0011 (0.0016) model time 0.2460 (0.2425) loss 5.4076 (5.0218) grad_norm 3.1515 (2.5900) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][950/1251] eta 0:01:13 lr 0.000389 wd 0.0500 time 0.2412 (0.2440) data time 0.0013 (0.0016) model time 0.2400 (0.2425) loss 4.8556 (5.0237) grad_norm 2.0046 (2.5862) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][960/1251] eta 0:01:10 lr 0.000389 wd 0.0500 time 0.2457 (0.2440) data time 0.0010 (0.0016) model time 0.2447 (0.2425) loss 5.2152 (5.0241) grad_norm 2.6133 (2.5855) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][970/1251] eta 0:01:08 lr 0.000389 wd 0.0500 time 0.2348 (0.2440) data time 0.0010 (0.0016) model time 0.2338 (0.2425) loss 4.3247 (5.0216) grad_norm 1.8820 (2.5874) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][980/1251] eta 0:01:06 lr 0.000390 wd 0.0500 time 0.2443 (0.2440) data time 0.0009 (0.0016) model time 0.2434 (0.2425) loss 5.2239 (5.0209) grad_norm 2.5025 (2.5867) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][990/1251] eta 0:01:03 lr 0.000390 wd 0.0500 time 0.2370 (0.2441) data time 0.0011 (0.0016) model time 0.2359 (0.2426) loss 4.9026 (5.0208) grad_norm 2.5712 (2.5853) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1000/1251] eta 0:01:01 lr 0.000391 wd 0.0500 time 0.2476 (0.2441) data time 
0.0010 (0.0016) model time 0.2466 (0.2426) loss 5.7195 (5.0192) grad_norm 2.2734 (2.5842) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1010/1251] eta 0:00:58 lr 0.000391 wd 0.0500 time 0.2412 (0.2440) data time 0.0007 (0.0016) model time 0.2404 (0.2426) loss 5.9927 (5.0190) grad_norm 2.3004 (2.5863) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1020/1251] eta 0:00:56 lr 0.000391 wd 0.0500 time 0.2460 (0.2440) data time 0.0011 (0.0016) model time 0.2449 (0.2425) loss 4.6916 (5.0168) grad_norm 2.6754 (2.5852) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1030/1251] eta 0:00:53 lr 0.000392 wd 0.0500 time 0.2360 (0.2440) data time 0.0011 (0.0016) model time 0.2349 (0.2425) loss 4.9821 (5.0167) grad_norm 2.3867 (2.5842) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1040/1251] eta 0:00:51 lr 0.000392 wd 0.0500 time 0.2398 (0.2440) data time 0.0010 (0.0016) model time 0.2388 (0.2425) loss 4.7649 (5.0163) grad_norm 5.5160 (2.5856) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1050/1251] eta 0:00:49 lr 0.000393 wd 0.0500 time 0.2450 (0.2440) data time 0.0010 (0.0016) model time 0.2440 (0.2425) loss 4.8128 (5.0150) grad_norm 2.4403 (2.5848) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1060/1251] eta 0:00:46 lr 0.000393 wd 0.0500 time 0.2446 (0.2439) data time 0.0008 (0.0016) model time 0.2438 (0.2425) loss 4.1198 (5.0132) grad_norm 2.3568 (2.5867) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [7/300][1070/1251] eta 0:00:44 lr 0.000393 wd 0.0500 time 0.2443 (0.2439) data time 0.0010 (0.0016) model time 0.2433 (0.2425) loss 5.1735 (5.0139) grad_norm 3.8963 (2.5893) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1080/1251] eta 0:00:41 lr 0.000394 wd 0.0500 time 0.2449 (0.2439) data time 0.0010 (0.0016) model time 0.2439 (0.2425) loss 5.0085 (5.0125) grad_norm 2.2754 (2.5868) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1090/1251] eta 0:00:39 lr 0.000394 wd 0.0500 time 0.2433 (0.2439) data time 0.0011 (0.0016) model time 0.2422 (0.2425) loss 5.4025 (5.0108) grad_norm 2.4276 (2.5848) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1100/1251] eta 0:00:36 lr 0.000395 wd 0.0500 time 0.2455 (0.2439) data time 0.0011 (0.0016) model time 0.2444 (0.2425) loss 5.0192 (5.0071) grad_norm 2.5613 (2.5878) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1110/1251] eta 0:00:34 lr 0.000395 wd 0.0500 time 0.2407 (0.2439) data time 0.0008 (0.0016) model time 0.2399 (0.2425) loss 4.3992 (5.0069) grad_norm 2.2584 (2.5869) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1120/1251] eta 0:00:31 lr 0.000395 wd 0.0500 time 0.2521 (0.2439) data time 0.0010 (0.0015) model time 0.2512 (0.2425) loss 5.2945 (5.0065) grad_norm 2.3560 (2.5849) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1130/1251] eta 0:00:29 lr 0.000396 wd 0.0500 time 0.2445 (0.2439) data time 0.0011 (0.0015) model time 0.2434 (0.2425) loss 5.2316 (5.0065) grad_norm 2.4368 (2.5847) loss_scale 
8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1140/1251] eta 0:00:27 lr 0.000396 wd 0.0500 time 0.2433 (0.2439) data time 0.0010 (0.0015) model time 0.2423 (0.2425) loss 4.3944 (5.0051) grad_norm 2.3674 (2.5842) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1150/1251] eta 0:00:24 lr 0.000397 wd 0.0500 time 0.2483 (0.2440) data time 0.0009 (0.0015) model time 0.2475 (0.2426) loss 4.7509 (5.0040) grad_norm 2.1171 (2.5811) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1160/1251] eta 0:00:22 lr 0.000397 wd 0.0500 time 0.2406 (0.2440) data time 0.0011 (0.0015) model time 0.2395 (0.2426) loss 5.6038 (5.0060) grad_norm 2.4906 (2.5797) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1170/1251] eta 0:00:19 lr 0.000397 wd 0.0500 time 0.2380 (0.2440) data time 0.0011 (0.0015) model time 0.2369 (0.2426) loss 5.1826 (5.0048) grad_norm 2.7840 (2.5780) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1180/1251] eta 0:00:17 lr 0.000398 wd 0.0500 time 0.2431 (0.2440) data time 0.0007 (0.0015) model time 0.2423 (0.2426) loss 4.1617 (5.0035) grad_norm 2.8759 (2.5767) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1190/1251] eta 0:00:14 lr 0.000398 wd 0.0500 time 0.2465 (0.2440) data time 0.0009 (0.0015) model time 0.2456 (0.2426) loss 4.0847 (5.0040) grad_norm 2.1957 (2.5752) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1200/1251] eta 0:00:12 lr 0.000399 wd 0.0500 time 0.2330 (0.2439) data time 0.0009 
(0.0015) model time 0.2321 (0.2425) loss 3.7432 (5.0013) grad_norm 1.9911 (2.5739) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1210/1251] eta 0:00:10 lr 0.000399 wd 0.0500 time 0.2380 (0.2439) data time 0.0010 (0.0015) model time 0.2370 (0.2425) loss 5.2876 (5.0013) grad_norm 2.5743 (2.5716) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1220/1251] eta 0:00:07 lr 0.000399 wd 0.0500 time 0.2471 (0.2439) data time 0.0011 (0.0015) model time 0.2460 (0.2425) loss 3.7647 (5.0001) grad_norm 3.0131 (2.5714) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1230/1251] eta 0:00:05 lr 0.000400 wd 0.0500 time 0.4518 (0.2444) data time 0.0011 (0.0015) model time 0.4507 (0.2430) loss 5.2265 (4.9994) grad_norm 3.4720 (2.5703) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1240/1251] eta 0:00:02 lr 0.000400 wd 0.0500 time 0.2275 (0.2443) data time 0.0007 (0.0015) model time 0.2267 (0.2429) loss 4.6322 (4.9983) grad_norm 2.5215 (2.5735) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [7/300][1250/1251] eta 0:00:00 lr 0.000401 wd 0.0500 time 0.2248 (0.2441) data time 0.0005 (0.0015) model time 0.2243 (0.2427) loss 4.4909 (4.9979) grad_norm 2.8672 (2.5739) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 7 training takes 0:05:05 [2024-08-20 00:07:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-20 00:07:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-20 00:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.469 (0.469) Loss 1.5850 (1.5850) Acc@1 65.234 (65.234) Acc@5 87.012 (87.012) Mem 7376MB
[2024-08-20 00:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.113) Loss 2.0488 (2.0018) Acc@1 50.781 (54.057) Acc@5 82.422 (80.664) Mem 7376MB
[2024-08-20 00:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.098) Loss 2.6094 (1.9720) Acc@1 42.383 (54.427) Acc@5 70.703 (81.338) Mem 7376MB
[2024-08-20 00:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.091) Loss 2.8184 (2.1952) Acc@1 42.773 (50.992) Acc@5 64.941 (77.193) Mem 7376MB
[2024-08-20 00:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 2.9863 (2.3241) Acc@1 36.523 (48.807) Acc@5 62.793 (74.848) Mem 7376MB
[2024-08-20 00:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 48.892 Acc@5 74.896
[2024-08-20 00:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 48.9%
[2024-08-20 00:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 48.89%
[2024-08-20 00:07:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-20 00:07:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-20 00:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.444 (0.444) Loss 6.6016 (6.6016) Acc@1 0.684 (0.684) Acc@5 5.371 (5.371) Mem 7376MB
[2024-08-20 00:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.115) Loss 7.0117 (6.8388) Acc@1 0.195 (0.639) Acc@5 1.270 (2.370) Mem 7376MB
[2024-08-20 00:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.098) Loss 6.7969 (6.8082) Acc@1 0.098 (0.558) Acc@5 1.855 (2.190) Mem 7376MB
[2024-08-20 00:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.090) Loss 6.6328 (6.7825) Acc@1 0.488 (0.554) Acc@5 3.223 (2.290) Mem 7376MB
[2024-08-20 00:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 6.8516 (6.7691) Acc@1 0.391 (0.593) Acc@5 2.051 (2.413) Mem 7376MB
[2024-08-20 00:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 0.760 Acc@5 2.794
[2024-08-20 00:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 0.8%
[2024-08-20 00:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 0.76%
[2024-08-20 00:08:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-20 00:08:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-20 00:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][0/1251] eta 0:14:22 lr 0.000401 wd 0.0500 time 0.6894 (0.6894) data time 0.4531 (0.4531) model time 0.0000 (0.0000) loss 5.2442 (5.2442) grad_norm 2.3888 (2.3888) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][10/1251] eta 0:05:50 lr 0.000401 wd 0.0500 time 0.2403 (0.2822) data time 0.0011 (0.0423) model time 0.0000 (0.0000) loss 4.5797 (5.0891) grad_norm 2.0462 (2.7512) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][20/1251] eta 0:05:24 lr 0.000401 wd 0.0500 time 0.2434 (0.2633) data time 0.0009 (0.0227) model time 0.0000 (0.0000) loss 5.6603 (5.1734) grad_norm 2.5896 (2.6710) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][30/1251] eta 0:05:12 lr 0.000402 wd 0.0500 time 0.2475 (0.2561) data time 0.0008 (0.0157) model time 0.0000 (0.0000) loss 5.4847 (5.1952) grad_norm 1.8787 (2.6933) loss_scale 8192.0000 (8192.0000) mem 7376MB [2024-08-20 00:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][40/1251] eta 0:05:05 lr 0.000402 wd 0.0500 time 0.2432 (0.2520) data time 0.0012 (0.0121) model time 0.0000 (0.0000) loss 4.9132 (5.0609) grad_norm 2.4126 (2.6122) loss_scale 16384.0000 (10190.0488) mem 7376MB [2024-08-20 00:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][50/1251] eta 0:05:00 lr 0.000403 wd 0.0500 time 0.2465 (0.2501) data time 0.0008 (0.0100) model time 0.0000 (0.0000) loss 5.4432 (5.1010) grad_norm 2.3322 (2.6036) loss_scale 16384.0000 (11404.5490) mem 7376MB [2024-08-20 00:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][60/1251] eta 0:04:56 lr 0.000403 wd 0.0500 time 0.2400 (0.2487) data time 0.0011 (0.0085) model time 0.2390 (0.2405) loss 
4.8864 (5.0823) grad_norm 2.1352 (2.5319) loss_scale 16384.0000 (12220.8525) mem 7376MB [2024-08-20 00:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][70/1251] eta 0:04:52 lr 0.000403 wd 0.0500 time 0.2499 (0.2481) data time 0.0008 (0.0075) model time 0.2491 (0.2419) loss 5.1256 (5.0801) grad_norm 2.3664 (2.5479) loss_scale 16384.0000 (12807.2113) mem 7376MB [2024-08-20 00:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][80/1251] eta 0:04:49 lr 0.000404 wd 0.0500 time 0.2493 (0.2473) data time 0.0010 (0.0067) model time 0.2482 (0.2415) loss 4.0221 (5.0536) grad_norm 2.3743 (2.5244) loss_scale 16384.0000 (13248.7901) mem 7376MB [2024-08-20 00:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][90/1251] eta 0:04:46 lr 0.000404 wd 0.0500 time 0.2410 (0.2469) data time 0.0011 (0.0061) model time 0.2398 (0.2417) loss 5.1153 (5.0493) grad_norm 1.7586 (2.5358) loss_scale 16384.0000 (13593.3187) mem 7376MB [2024-08-20 00:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][100/1251] eta 0:04:43 lr 0.000405 wd 0.0500 time 0.2465 (0.2466) data time 0.0011 (0.0056) model time 0.2454 (0.2419) loss 4.3682 (5.0294) grad_norm 2.3278 (2.5266) loss_scale 16384.0000 (13869.6238) mem 7376MB [2024-08-20 00:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][110/1251] eta 0:04:41 lr 0.000405 wd 0.0500 time 0.2483 (0.2464) data time 0.0012 (0.0052) model time 0.2471 (0.2421) loss 5.8287 (5.0376) grad_norm 1.8886 (2.5069) loss_scale 16384.0000 (14096.1441) mem 7376MB [2024-08-20 00:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][120/1251] eta 0:04:38 lr 0.000405 wd 0.0500 time 0.2488 (0.2461) data time 0.0008 (0.0048) model time 0.2481 (0.2420) loss 4.4578 (5.0378) grad_norm 2.0150 (2.4916) loss_scale 16384.0000 (14285.2231) mem 7376MB [2024-08-20 00:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][130/1251] eta 0:04:35 lr 
0.000406 wd 0.0500 time 0.2351 (0.2456) data time 0.0011 (0.0046) model time 0.2340 (0.2417) loss 5.2534 (5.0320) grad_norm 2.1450 (2.4663) loss_scale 16384.0000 (14445.4351) mem 7376MB [2024-08-20 00:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][140/1251] eta 0:04:32 lr 0.000406 wd 0.0500 time 0.2518 (0.2454) data time 0.0008 (0.0043) model time 0.2510 (0.2416) loss 3.3760 (5.0044) grad_norm 2.2128 (2.4514) loss_scale 16384.0000 (14582.9220) mem 7376MB [2024-08-20 00:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][150/1251] eta 0:04:30 lr 0.000407 wd 0.0500 time 0.2437 (0.2452) data time 0.0011 (0.0041) model time 0.2426 (0.2417) loss 4.5364 (4.9909) grad_norm 1.9335 (2.4516) loss_scale 16384.0000 (14702.1987) mem 7376MB [2024-08-20 00:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][160/1251] eta 0:04:27 lr 0.000407 wd 0.0500 time 0.2464 (0.2452) data time 0.0009 (0.0039) model time 0.2455 (0.2419) loss 5.6723 (4.9888) grad_norm 2.9526 (2.4490) loss_scale 16384.0000 (14806.6584) mem 7376MB [2024-08-20 00:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][170/1251] eta 0:04:24 lr 0.000407 wd 0.0500 time 0.2478 (0.2451) data time 0.0010 (0.0038) model time 0.2468 (0.2418) loss 5.3283 (4.9968) grad_norm 3.3948 (2.4793) loss_scale 16384.0000 (14898.9006) mem 7376MB [2024-08-20 00:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][180/1251] eta 0:04:22 lr 0.000408 wd 0.0500 time 0.2492 (0.2449) data time 0.0007 (0.0036) model time 0.2485 (0.2418) loss 5.1551 (4.9847) grad_norm 2.1374 (2.4908) loss_scale 16384.0000 (14980.9503) mem 7376MB [2024-08-20 00:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][190/1251] eta 0:04:19 lr 0.000408 wd 0.0500 time 0.2349 (0.2448) data time 0.0008 (0.0035) model time 0.2341 (0.2417) loss 5.5708 (4.9936) grad_norm 2.2663 (2.4712) loss_scale 16384.0000 (15054.4084) mem 7376MB [2024-08-20 
00:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][200/1251] eta 0:04:17 lr 0.000409 wd 0.0500 time 0.2528 (0.2447) data time 0.0007 (0.0034) model time 0.2521 (0.2417) loss 4.2933 (4.9986) grad_norm 2.7566 (2.4772) loss_scale 16384.0000 (15120.5572) mem 7376MB [2024-08-20 00:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][210/1251] eta 0:04:14 lr 0.000409 wd 0.0500 time 0.2490 (0.2446) data time 0.0007 (0.0033) model time 0.2483 (0.2417) loss 3.9077 (4.9928) grad_norm 2.3745 (2.4937) loss_scale 16384.0000 (15180.4360) mem 7376MB [2024-08-20 00:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][220/1251] eta 0:04:12 lr 0.000409 wd 0.0500 time 0.2493 (0.2445) data time 0.0011 (0.0032) model time 0.2482 (0.2416) loss 4.1332 (4.9770) grad_norm 2.4102 (2.5062) loss_scale 16384.0000 (15234.8959) mem 7376MB [2024-08-20 00:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][230/1251] eta 0:04:09 lr 0.000410 wd 0.0500 time 0.2534 (0.2445) data time 0.0008 (0.0031) model time 0.2527 (0.2417) loss 3.8433 (4.9708) grad_norm 4.1177 (2.5182) loss_scale 16384.0000 (15284.6407) mem 7376MB [2024-08-20 00:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][240/1251] eta 0:04:07 lr 0.000410 wd 0.0500 time 0.2439 (0.2445) data time 0.0008 (0.0030) model time 0.2431 (0.2418) loss 5.6770 (4.9773) grad_norm 2.1355 (2.5177) loss_scale 16384.0000 (15330.2573) mem 7376MB [2024-08-20 00:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][250/1251] eta 0:04:04 lr 0.000411 wd 0.0500 time 0.2431 (0.2444) data time 0.0008 (0.0030) model time 0.2423 (0.2418) loss 5.5890 (4.9871) grad_norm 1.8179 (2.5068) loss_scale 16384.0000 (15372.2390) mem 7376MB [2024-08-20 00:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][260/1251] eta 0:04:02 lr 0.000411 wd 0.0500 time 0.2339 (0.2449) data time 0.0010 (0.0029) model time 0.2329 (0.2425) loss 
5.2572 (4.9893) grad_norm 1.7570 (2.4944) loss_scale 16384.0000 (15411.0038) mem 7376MB [2024-08-20 00:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][270/1251] eta 0:04:00 lr 0.000411 wd 0.0500 time 0.2430 (0.2448) data time 0.0008 (0.0029) model time 0.2422 (0.2424) loss 5.8184 (4.9800) grad_norm 2.6115 (2.4933) loss_scale 16384.0000 (15446.9077) mem 7376MB [2024-08-20 00:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][280/1251] eta 0:03:57 lr 0.000412 wd 0.0500 time 0.2443 (0.2448) data time 0.0008 (0.0028) model time 0.2435 (0.2424) loss 5.5066 (4.9794) grad_norm 1.9101 (2.4851) loss_scale 16384.0000 (15480.2562) mem 7376MB [2024-08-20 00:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][290/1251] eta 0:03:55 lr 0.000412 wd 0.0500 time 0.2368 (0.2447) data time 0.0008 (0.0027) model time 0.2360 (0.2424) loss 5.4542 (4.9796) grad_norm 2.0965 (2.4796) loss_scale 16384.0000 (15511.3127) mem 7376MB [2024-08-20 00:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][300/1251] eta 0:03:52 lr 0.000413 wd 0.0500 time 0.2466 (0.2448) data time 0.0011 (0.0027) model time 0.2455 (0.2425) loss 4.1494 (4.9862) grad_norm 2.7338 (2.4863) loss_scale 16384.0000 (15540.3056) mem 7376MB [2024-08-20 00:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][310/1251] eta 0:03:50 lr 0.000413 wd 0.0500 time 0.2408 (0.2447) data time 0.0008 (0.0026) model time 0.2400 (0.2424) loss 4.5950 (4.9728) grad_norm 3.2651 (2.5008) loss_scale 16384.0000 (15567.4341) mem 7376MB [2024-08-20 00:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][320/1251] eta 0:03:47 lr 0.000413 wd 0.0500 time 0.2521 (0.2447) data time 0.0011 (0.0026) model time 0.2511 (0.2425) loss 5.2228 (4.9698) grad_norm 2.4418 (2.5063) loss_scale 16384.0000 (15592.8723) mem 7376MB [2024-08-20 00:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][330/1251] eta 0:03:45 
lr 0.000414 wd 0.0500 time 0.2399 (0.2447) data time 0.0010 (0.0026) model time 0.2389 (0.2425) loss 5.4114 (4.9698) grad_norm 2.9422 (2.5192) loss_scale 16384.0000 (15616.7734) mem 7376MB [2024-08-20 00:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][340/1251] eta 0:03:43 lr 0.000414 wd 0.0500 time 0.2420 (0.2448) data time 0.0009 (0.0025) model time 0.2411 (0.2427) loss 4.4114 (4.9687) grad_norm 3.1734 (2.5318) loss_scale 16384.0000 (15639.2727) mem 7376MB [2024-08-20 00:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][350/1251] eta 0:03:40 lr 0.000415 wd 0.0500 time 0.2386 (0.2447) data time 0.0010 (0.0025) model time 0.2376 (0.2426) loss 5.2426 (4.9678) grad_norm 2.6819 (2.5411) loss_scale 16384.0000 (15660.4900) mem 7376MB [2024-08-20 00:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][360/1251] eta 0:03:38 lr 0.000415 wd 0.0500 time 0.2533 (0.2447) data time 0.0010 (0.0025) model time 0.2523 (0.2425) loss 5.1505 (4.9678) grad_norm 2.8084 (2.5350) loss_scale 16384.0000 (15680.5319) mem 7376MB [2024-08-20 00:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][370/1251] eta 0:03:36 lr 0.000415 wd 0.0500 time 0.2449 (0.2452) data time 0.0008 (0.0025) model time 0.2441 (0.2432) loss 4.2009 (4.9562) grad_norm 3.0028 (2.5349) loss_scale 16384.0000 (15699.4933) mem 7376MB [2024-08-20 00:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][380/1251] eta 0:03:33 lr 0.000416 wd 0.0500 time 0.2405 (0.2452) data time 0.0011 (0.0024) model time 0.2394 (0.2431) loss 4.7355 (4.9489) grad_norm 2.3727 (2.5334) loss_scale 16384.0000 (15717.4593) mem 7376MB [2024-08-20 00:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][390/1251] eta 0:03:31 lr 0.000416 wd 0.0500 time 0.2442 (0.2451) data time 0.0007 (0.0024) model time 0.2434 (0.2431) loss 4.1114 (4.9433) grad_norm 3.3052 (2.5313) loss_scale 16384.0000 (15734.5064) mem 7376MB 
[2024-08-20 00:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][400/1251] eta 0:03:28 lr 0.000417 wd 0.0500 time 0.2487 (0.2451) data time 0.0011 (0.0024) model time 0.2477 (0.2431) loss 4.8643 (4.9372) grad_norm 2.3781 (2.5305) loss_scale 16384.0000 (15750.7032) mem 7376MB [2024-08-20 00:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][410/1251] eta 0:03:26 lr 0.000417 wd 0.0500 time 0.2416 (0.2450) data time 0.0011 (0.0023) model time 0.2405 (0.2430) loss 4.6996 (4.9391) grad_norm 2.1695 (2.5342) loss_scale 16384.0000 (15766.1119) mem 7376MB [2024-08-20 00:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][420/1251] eta 0:03:23 lr 0.000417 wd 0.0500 time 0.2460 (0.2450) data time 0.0012 (0.0023) model time 0.2449 (0.2430) loss 4.4995 (4.9314) grad_norm 2.0818 (2.5374) loss_scale 16384.0000 (15780.7886) mem 7376MB [2024-08-20 00:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][430/1251] eta 0:03:21 lr 0.000418 wd 0.0500 time 0.2377 (0.2449) data time 0.0010 (0.0023) model time 0.2367 (0.2429) loss 3.7993 (4.9275) grad_norm 2.5425 (2.5346) loss_scale 16384.0000 (15794.7842) mem 7376MB [2024-08-20 00:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][440/1251] eta 0:03:18 lr 0.000418 wd 0.0500 time 0.2446 (0.2449) data time 0.0010 (0.0023) model time 0.2435 (0.2430) loss 4.6452 (4.9306) grad_norm 3.3534 (2.5383) loss_scale 16384.0000 (15808.1451) mem 7376MB [2024-08-20 00:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][450/1251] eta 0:03:16 lr 0.000419 wd 0.0500 time 0.2391 (0.2448) data time 0.0011 (0.0022) model time 0.2380 (0.2428) loss 4.5266 (4.9279) grad_norm 2.5017 (2.5402) loss_scale 16384.0000 (15820.9135) mem 7376MB [2024-08-20 00:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][460/1251] eta 0:03:13 lr 0.000419 wd 0.0500 time 0.2484 (0.2447) data time 0.0012 (0.0022) model time 0.2472 
(0.2428) loss 5.2864 (4.9249) grad_norm 1.8584 (2.5403) loss_scale 16384.0000 (15833.1280) mem 7376MB [2024-08-20 00:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][470/1251] eta 0:03:11 lr 0.000419 wd 0.0500 time 0.2505 (0.2446) data time 0.0010 (0.0022) model time 0.2495 (0.2427) loss 4.8223 (4.9282) grad_norm 2.6214 (2.5376) loss_scale 16384.0000 (15844.8238) mem 7376MB [2024-08-20 00:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][480/1251] eta 0:03:08 lr 0.000420 wd 0.0500 time 0.2406 (0.2446) data time 0.0010 (0.0022) model time 0.2396 (0.2427) loss 4.1377 (4.9249) grad_norm 2.1815 (2.5326) loss_scale 16384.0000 (15856.0333) mem 7376MB [2024-08-20 00:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][490/1251] eta 0:03:06 lr 0.000420 wd 0.0500 time 0.2443 (0.2446) data time 0.0009 (0.0022) model time 0.2434 (0.2427) loss 6.0306 (4.9263) grad_norm 2.7172 (2.5302) loss_scale 16384.0000 (15866.7862) mem 7376MB [2024-08-20 00:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][500/1251] eta 0:03:03 lr 0.000421 wd 0.0500 time 0.2440 (0.2446) data time 0.0007 (0.0022) model time 0.2433 (0.2426) loss 5.1141 (4.9300) grad_norm 2.6386 (2.5322) loss_scale 16384.0000 (15877.1098) mem 7376MB [2024-08-20 00:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][510/1251] eta 0:03:01 lr 0.000421 wd 0.0500 time 0.2402 (0.2449) data time 0.0011 (0.0022) model time 0.2391 (0.2430) loss 5.2458 (4.9294) grad_norm 1.9970 (2.5272) loss_scale 16384.0000 (15887.0294) mem 7376MB [2024-08-20 00:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][520/1251] eta 0:02:58 lr 0.000421 wd 0.0500 time 0.2384 (0.2448) data time 0.0010 (0.0021) model time 0.2374 (0.2430) loss 5.5720 (4.9308) grad_norm 2.2461 (2.5241) loss_scale 16384.0000 (15896.5681) mem 7376MB [2024-08-20 00:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[8/300][530/1251] eta 0:02:57 lr 0.000422 wd 0.0500 time 0.2492 (0.2456) data time 0.0007 (0.0021) model time 0.2485 (0.2438) loss 3.6743 (4.9241) grad_norm 3.3629 (2.5192) loss_scale 16384.0000 (15905.7476) mem 7376MB [2024-08-20 00:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][540/1251] eta 0:02:54 lr 0.000422 wd 0.0500 time 0.2535 (0.2456) data time 0.0007 (0.0021) model time 0.2528 (0.2438) loss 5.3157 (4.9259) grad_norm 2.1059 (2.5127) loss_scale 16384.0000 (15914.5878) mem 7376MB [2024-08-20 00:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][550/1251] eta 0:02:52 lr 0.000423 wd 0.0500 time 0.2549 (0.2455) data time 0.0007 (0.0021) model time 0.2542 (0.2438) loss 3.9922 (4.9222) grad_norm 2.0891 (2.5119) loss_scale 16384.0000 (15923.1071) mem 7376MB [2024-08-20 00:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][560/1251] eta 0:02:49 lr 0.000423 wd 0.0500 time 0.2509 (0.2455) data time 0.0010 (0.0021) model time 0.2499 (0.2438) loss 5.5100 (4.9135) grad_norm 2.0560 (2.5056) loss_scale 16384.0000 (15931.3226) mem 7376MB [2024-08-20 00:10:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][570/1251] eta 0:02:47 lr 0.000423 wd 0.0500 time 0.2425 (0.2455) data time 0.0009 (0.0021) model time 0.2416 (0.2437) loss 5.6026 (4.9145) grad_norm 2.1727 (2.5038) loss_scale 16384.0000 (15939.2504) mem 7376MB [2024-08-20 00:10:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][580/1251] eta 0:02:44 lr 0.000424 wd 0.0500 time 0.2487 (0.2454) data time 0.0009 (0.0020) model time 0.2478 (0.2437) loss 3.9850 (4.9150) grad_norm 2.0429 (2.5009) loss_scale 16384.0000 (15946.9053) mem 7376MB [2024-08-20 00:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][590/1251] eta 0:02:42 lr 0.000424 wd 0.0500 time 0.2548 (0.2453) data time 0.0011 (0.0020) model time 0.2537 (0.2436) loss 4.7510 (4.9085) grad_norm 2.3534 (2.4991) loss_scale 16384.0000 
(15954.3012) mem 7376MB [2024-08-20 00:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][600/1251] eta 0:02:39 lr 0.000425 wd 0.0500 time 0.2424 (0.2453) data time 0.0011 (0.0020) model time 0.2412 (0.2436) loss 5.0669 (4.9072) grad_norm 3.1643 (2.5084) loss_scale 16384.0000 (15961.4509) mem 7376MB [2024-08-20 00:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][610/1251] eta 0:02:37 lr 0.000425 wd 0.0500 time 0.2441 (0.2452) data time 0.0008 (0.0020) model time 0.2434 (0.2435) loss 4.5734 (4.9049) grad_norm 1.9334 (2.5078) loss_scale 16384.0000 (15968.3666) mem 7376MB [2024-08-20 00:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][620/1251] eta 0:02:34 lr 0.000425 wd 0.0500 time 0.2438 (0.2452) data time 0.0010 (0.0020) model time 0.2428 (0.2435) loss 5.2054 (4.9020) grad_norm 2.0821 (2.5015) loss_scale 16384.0000 (15975.0596) mem 7376MB [2024-08-20 00:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][630/1251] eta 0:02:32 lr 0.000426 wd 0.0500 time 0.2403 (0.2452) data time 0.0009 (0.0020) model time 0.2393 (0.2435) loss 4.1419 (4.9086) grad_norm 2.1246 (2.5016) loss_scale 16384.0000 (15981.5404) mem 7376MB [2024-08-20 00:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][640/1251] eta 0:02:29 lr 0.000426 wd 0.0500 time 0.2412 (0.2452) data time 0.0009 (0.0020) model time 0.2403 (0.2435) loss 5.2777 (4.9063) grad_norm 1.9873 (2.4969) loss_scale 16384.0000 (15987.8190) mem 7376MB [2024-08-20 00:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][650/1251] eta 0:02:27 lr 0.000427 wd 0.0500 time 0.2413 (0.2451) data time 0.0010 (0.0020) model time 0.2402 (0.2434) loss 4.6407 (4.9019) grad_norm 2.3741 (2.4976) loss_scale 16384.0000 (15993.9048) mem 7376MB [2024-08-20 00:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][660/1251] eta 0:02:24 lr 0.000427 wd 0.0500 time 0.2440 (0.2451) data time 0.0008 
(0.0019) model time 0.2432 (0.2434) loss 4.8327 (4.9014) grad_norm 1.6240 (2.4937) loss_scale 16384.0000 (15999.8064) mem 7376MB [2024-08-20 00:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][670/1251] eta 0:02:22 lr 0.000427 wd 0.0500 time 0.2438 (0.2451) data time 0.0010 (0.0019) model time 0.2427 (0.2434) loss 4.9252 (4.9009) grad_norm 2.5030 (2.4898) loss_scale 16384.0000 (16005.5320) mem 7376MB [2024-08-20 00:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][680/1251] eta 0:02:19 lr 0.000428 wd 0.0500 time 0.2382 (0.2451) data time 0.0010 (0.0019) model time 0.2372 (0.2434) loss 5.2308 (4.9061) grad_norm 2.9122 (2.4913) loss_scale 16384.0000 (16011.0896) mem 7376MB [2024-08-20 00:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][690/1251] eta 0:02:17 lr 0.000428 wd 0.0500 time 0.2392 (0.2450) data time 0.0008 (0.0019) model time 0.2385 (0.2433) loss 5.1752 (4.9078) grad_norm 2.2495 (2.4890) loss_scale 16384.0000 (16016.4863) mem 7376MB [2024-08-20 00:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][700/1251] eta 0:02:14 lr 0.000429 wd 0.0500 time 0.2423 (0.2449) data time 0.0011 (0.0019) model time 0.2412 (0.2433) loss 4.1431 (4.9072) grad_norm 2.6150 (2.4856) loss_scale 16384.0000 (16021.7290) mem 7376MB [2024-08-20 00:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][710/1251] eta 0:02:12 lr 0.000429 wd 0.0500 time 0.2525 (0.2449) data time 0.0008 (0.0019) model time 0.2516 (0.2433) loss 3.6897 (4.9034) grad_norm 1.9578 (2.4839) loss_scale 16384.0000 (16026.8242) mem 7376MB [2024-08-20 00:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][720/1251] eta 0:02:10 lr 0.000429 wd 0.0500 time 0.2368 (0.2449) data time 0.0007 (0.0019) model time 0.2360 (0.2432) loss 5.0216 (4.9011) grad_norm 2.2135 (2.4796) loss_scale 16384.0000 (16031.7781) mem 7376MB [2024-08-20 00:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [8/300][730/1251] eta 0:02:07 lr 0.000430 wd 0.0500 time 0.2418 (0.2448) data time 0.0008 (0.0019) model time 0.2410 (0.2432) loss 5.6010 (4.9029) grad_norm 2.0834 (2.4830) loss_scale 16384.0000 (16036.5964) mem 7376MB [2024-08-20 00:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][740/1251] eta 0:02:05 lr 0.000430 wd 0.0500 time 0.2372 (0.2448) data time 0.0012 (0.0019) model time 0.2360 (0.2431) loss 4.2421 (4.9018) grad_norm 2.1436 (2.4805) loss_scale 16384.0000 (16041.2848) mem 7376MB [2024-08-20 00:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][750/1251] eta 0:02:02 lr 0.000431 wd 0.0500 time 0.2458 (0.2447) data time 0.0010 (0.0018) model time 0.2448 (0.2431) loss 3.9277 (4.9030) grad_norm 1.8902 (2.4772) loss_scale 16384.0000 (16045.8482) mem 7376MB [2024-08-20 00:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][760/1251] eta 0:02:00 lr 0.000431 wd 0.0500 time 0.2369 (0.2447) data time 0.0008 (0.0018) model time 0.2360 (0.2431) loss 4.6110 (4.9011) grad_norm 2.0639 (2.4740) loss_scale 16384.0000 (16050.2917) mem 7376MB [2024-08-20 00:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][770/1251] eta 0:01:57 lr 0.000431 wd 0.0500 time 0.2387 (0.2447) data time 0.0009 (0.0018) model time 0.2379 (0.2431) loss 4.6886 (4.8989) grad_norm 3.7936 (2.4779) loss_scale 16384.0000 (16054.6200) mem 7376MB [2024-08-20 00:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][780/1251] eta 0:01:55 lr 0.000432 wd 0.0500 time 0.2349 (0.2449) data time 0.0011 (0.0018) model time 0.2338 (0.2433) loss 4.9580 (4.8993) grad_norm 2.2021 (2.4788) loss_scale 16384.0000 (16058.8374) mem 7376MB [2024-08-20 00:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][790/1251] eta 0:01:52 lr 0.000432 wd 0.0500 time 0.2413 (0.2449) data time 0.0007 (0.0018) model time 0.2405 (0.2433) loss 4.2908 (4.8960) grad_norm 1.9919 (2.4780) loss_scale 
16384.0000 (16062.9482) mem 7376MB [2024-08-20 00:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][800/1251] eta 0:01:50 lr 0.000433 wd 0.0500 time 0.2397 (0.2449) data time 0.0010 (0.0018) model time 0.2387 (0.2433) loss 5.1849 (4.8957) grad_norm 1.9138 (2.4768) loss_scale 16384.0000 (16066.9563) mem 7376MB [2024-08-20 00:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][810/1251] eta 0:01:47 lr 0.000433 wd 0.0500 time 0.2446 (0.2448) data time 0.0011 (0.0018) model time 0.2434 (0.2432) loss 5.1044 (4.8968) grad_norm 2.6687 (2.4746) loss_scale 16384.0000 (16070.8656) mem 7376MB [2024-08-20 00:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][820/1251] eta 0:01:45 lr 0.000433 wd 0.0500 time 0.2408 (0.2448) data time 0.0009 (0.0018) model time 0.2400 (0.2432) loss 4.5922 (4.8938) grad_norm 2.4267 (2.4772) loss_scale 16384.0000 (16074.6797) mem 7376MB [2024-08-20 00:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][830/1251] eta 0:01:43 lr 0.000434 wd 0.0500 time 0.2382 (0.2448) data time 0.0011 (0.0018) model time 0.2371 (0.2432) loss 4.9779 (4.8947) grad_norm 2.3531 (2.4769) loss_scale 16384.0000 (16078.4019) mem 7376MB [2024-08-20 00:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][840/1251] eta 0:01:40 lr 0.000434 wd 0.0500 time 0.2401 (0.2447) data time 0.0011 (0.0018) model time 0.2390 (0.2431) loss 5.3097 (4.8954) grad_norm 2.3695 (2.4751) loss_scale 16384.0000 (16082.0357) mem 7376MB [2024-08-20 00:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][850/1251] eta 0:01:38 lr 0.000435 wd 0.0500 time 0.2347 (0.2447) data time 0.0011 (0.0018) model time 0.2337 (0.2431) loss 5.1872 (4.8957) grad_norm 2.2578 (2.4771) loss_scale 16384.0000 (16085.5840) mem 7376MB [2024-08-20 00:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][860/1251] eta 0:01:35 lr 0.000435 wd 0.0500 time 0.2384 (0.2447) data time 
0.0011 (0.0018) model time 0.2373 (0.2431) loss 4.6689 (4.8967) grad_norm 2.4200 (2.4770) loss_scale 16384.0000 (16089.0499) mem 7376MB [2024-08-20 00:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][870/1251] eta 0:01:33 lr 0.000435 wd 0.0500 time 0.2434 (0.2447) data time 0.0007 (0.0018) model time 0.2427 (0.2431) loss 5.7400 (4.8987) grad_norm 2.2619 (2.4766) loss_scale 16384.0000 (16092.4363) mem 7376MB [2024-08-20 00:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][880/1251] eta 0:01:30 lr 0.000436 wd 0.0500 time 0.2378 (0.2446) data time 0.0011 (0.0017) model time 0.2367 (0.2430) loss 5.1351 (4.8975) grad_norm 1.9116 (2.4839) loss_scale 16384.0000 (16095.7457) mem 7376MB [2024-08-20 00:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][890/1251] eta 0:01:28 lr 0.000436 wd 0.0500 time 0.2500 (0.2446) data time 0.0007 (0.0017) model time 0.2492 (0.2430) loss 4.6180 (4.8963) grad_norm 1.9511 (2.4814) loss_scale 16384.0000 (16098.9809) mem 7376MB [2024-08-20 00:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][900/1251] eta 0:01:25 lr 0.000437 wd 0.0500 time 0.2458 (0.2446) data time 0.0010 (0.0017) model time 0.2448 (0.2430) loss 4.7116 (4.8969) grad_norm 2.8159 (2.4810) loss_scale 16384.0000 (16102.1443) mem 7376MB [2024-08-20 00:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][910/1251] eta 0:01:23 lr 0.000437 wd 0.0500 time 0.2501 (0.2446) data time 0.0011 (0.0017) model time 0.2490 (0.2430) loss 5.3527 (4.8947) grad_norm 3.1758 (2.4805) loss_scale 16384.0000 (16105.2382) mem 7376MB [2024-08-20 00:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][920/1251] eta 0:01:20 lr 0.000437 wd 0.0500 time 0.2421 (0.2445) data time 0.0008 (0.0017) model time 0.2413 (0.2429) loss 4.0490 (4.8932) grad_norm 1.5583 (2.4827) loss_scale 16384.0000 (16108.2649) mem 7376MB [2024-08-20 00:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [8/300][930/1251] eta 0:01:18 lr 0.000438 wd 0.0500 time 0.2442 (0.2445) data time 0.0010 (0.0017) model time 0.2432 (0.2429) loss 4.3636 (4.8921) grad_norm 2.8804 (2.4818) loss_scale 16384.0000 (16111.2266) mem 7376MB [2024-08-20 00:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][940/1251] eta 0:01:16 lr 0.000438 wd 0.0500 time 0.2464 (0.2445) data time 0.0010 (0.0017) model time 0.2454 (0.2429) loss 4.6382 (4.8928) grad_norm 2.3171 (2.4804) loss_scale 16384.0000 (16114.1254) mem 7376MB [2024-08-20 00:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][950/1251] eta 0:01:13 lr 0.000439 wd 0.0500 time 0.2396 (0.2445) data time 0.0007 (0.0017) model time 0.2388 (0.2429) loss 3.6234 (4.8911) grad_norm 2.9197 (2.4777) loss_scale 16384.0000 (16116.9632) mem 7376MB [2024-08-20 00:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][960/1251] eta 0:01:11 lr 0.000439 wd 0.0500 time 0.2392 (0.2444) data time 0.0008 (0.0017) model time 0.2384 (0.2429) loss 5.1448 (4.8936) grad_norm 2.1393 (2.4751) loss_scale 16384.0000 (16119.7419) mem 7376MB [2024-08-20 00:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][970/1251] eta 0:01:08 lr 0.000439 wd 0.0500 time 0.2389 (0.2444) data time 0.0008 (0.0017) model time 0.2381 (0.2429) loss 5.3011 (4.8930) grad_norm 2.2159 (2.4778) loss_scale 16384.0000 (16122.4634) mem 7376MB [2024-08-20 00:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][980/1251] eta 0:01:06 lr 0.000440 wd 0.0500 time 0.2430 (0.2444) data time 0.0009 (0.0017) model time 0.2421 (0.2428) loss 5.0066 (4.8916) grad_norm 2.0044 (2.4795) loss_scale 16384.0000 (16125.1295) mem 7376MB [2024-08-20 00:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][990/1251] eta 0:01:03 lr 0.000440 wd 0.0500 time 0.2396 (0.2444) data time 0.0010 (0.0017) model time 0.2386 (0.2428) loss 4.2752 (4.8917) grad_norm 2.6471 (2.4839) 
loss_scale 16384.0000 (16127.7417) mem 7376MB [2024-08-20 00:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1000/1251] eta 0:01:01 lr 0.000441 wd 0.0500 time 0.2375 (0.2443) data time 0.0011 (0.0017) model time 0.2363 (0.2428) loss 5.1232 (4.8937) grad_norm 3.3008 (2.4836) loss_scale 16384.0000 (16130.3017) mem 7376MB [2024-08-20 00:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1010/1251] eta 0:00:58 lr 0.000441 wd 0.0500 time 0.2283 (0.2443) data time 0.0010 (0.0017) model time 0.2273 (0.2428) loss 4.8585 (4.8954) grad_norm 2.4174 (2.4812) loss_scale 16384.0000 (16132.8111) mem 7376MB [2024-08-20 00:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1020/1251] eta 0:00:56 lr 0.000441 wd 0.0500 time 0.2427 (0.2443) data time 0.0011 (0.0017) model time 0.2416 (0.2428) loss 4.7723 (4.8957) grad_norm 3.7014 (2.4823) loss_scale 16384.0000 (16135.2713) mem 7376MB [2024-08-20 00:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1030/1251] eta 0:00:54 lr 0.000442 wd 0.0500 time 0.2385 (0.2444) data time 0.0009 (0.0016) model time 0.2376 (0.2429) loss 4.9967 (4.8948) grad_norm 1.9418 (2.4806) loss_scale 16384.0000 (16137.6838) mem 7376MB [2024-08-20 00:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1040/1251] eta 0:00:51 lr 0.000442 wd 0.0500 time 0.2407 (0.2446) data time 0.0010 (0.0016) model time 0.2397 (0.2431) loss 4.0803 (4.8959) grad_norm 2.2191 (2.4792) loss_scale 16384.0000 (16140.0500) mem 7376MB [2024-08-20 00:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1050/1251] eta 0:00:49 lr 0.000443 wd 0.0500 time 0.2468 (0.2447) data time 0.0010 (0.0016) model time 0.2458 (0.2432) loss 5.3976 (4.8930) grad_norm 1.8916 (2.4778) loss_scale 16384.0000 (16142.3711) mem 7376MB [2024-08-20 00:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1060/1251] eta 0:00:46 lr 0.000443 wd 0.0500 time 0.2433 
(0.2447) data time 0.0011 (0.0016) model time 0.2422 (0.2432) loss 4.9817 (4.8949) grad_norm 2.1577 (2.4794) loss_scale 16384.0000 (16144.6484) mem 7376MB [2024-08-20 00:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1070/1251] eta 0:00:44 lr 0.000443 wd 0.0500 time 0.2415 (0.2447) data time 0.0010 (0.0016) model time 0.2405 (0.2432) loss 4.6472 (4.8937) grad_norm 2.0637 (2.4779) loss_scale 16384.0000 (16146.8833) mem 7376MB [2024-08-20 00:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1080/1251] eta 0:00:41 lr 0.000444 wd 0.0500 time 0.2446 (0.2446) data time 0.0008 (0.0016) model time 0.2439 (0.2432) loss 4.5309 (4.8913) grad_norm 2.2090 (2.4754) loss_scale 16384.0000 (16149.0768) mem 7376MB [2024-08-20 00:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1090/1251] eta 0:00:39 lr 0.000444 wd 0.0500 time 0.2431 (0.2446) data time 0.0008 (0.0016) model time 0.2424 (0.2431) loss 5.4676 (4.8929) grad_norm 2.0493 (2.4748) loss_scale 16384.0000 (16151.2301) mem 7376MB [2024-08-20 00:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1100/1251] eta 0:00:36 lr 0.000445 wd 0.0500 time 0.2409 (0.2446) data time 0.0011 (0.0016) model time 0.2398 (0.2431) loss 5.2216 (4.8896) grad_norm 1.7737 (2.4752) loss_scale 16384.0000 (16153.3442) mem 7376MB [2024-08-20 00:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1110/1251] eta 0:00:34 lr 0.000445 wd 0.0500 time 0.2343 (0.2445) data time 0.0009 (0.0016) model time 0.2334 (0.2431) loss 5.3451 (4.8899) grad_norm 2.5064 (2.4773) loss_scale 16384.0000 (16155.4203) mem 7376MB [2024-08-20 00:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1120/1251] eta 0:00:32 lr 0.000445 wd 0.0500 time 0.2431 (0.2445) data time 0.0011 (0.0016) model time 0.2421 (0.2430) loss 5.2610 (4.8900) grad_norm 2.7272 (2.4793) loss_scale 16384.0000 (16157.4594) mem 7376MB [2024-08-20 00:12:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1130/1251] eta 0:00:29 lr 0.000446 wd 0.0500 time 0.2373 (0.2445) data time 0.0010 (0.0016) model time 0.2363 (0.2430) loss 5.0322 (4.8883) grad_norm 1.8656 (2.4792) loss_scale 16384.0000 (16159.4624) mem 7376MB [2024-08-20 00:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1140/1251] eta 0:00:27 lr 0.000446 wd 0.0500 time 0.2496 (0.2445) data time 0.0010 (0.0016) model time 0.2486 (0.2430) loss 4.7743 (4.8889) grad_norm 2.0389 (2.4784) loss_scale 16384.0000 (16161.4303) mem 7376MB [2024-08-20 00:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1150/1251] eta 0:00:24 lr 0.000447 wd 0.0500 time 0.2420 (0.2445) data time 0.0011 (0.0016) model time 0.2409 (0.2430) loss 4.5211 (4.8881) grad_norm 2.9060 (2.4806) loss_scale 16384.0000 (16163.3640) mem 7376MB [2024-08-20 00:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1160/1251] eta 0:00:22 lr 0.000447 wd 0.0500 time 0.2468 (0.2445) data time 0.0009 (0.0016) model time 0.2458 (0.2430) loss 3.6692 (4.8870) grad_norm 2.8362 (2.4827) loss_scale 16384.0000 (16165.2644) mem 7376MB [2024-08-20 00:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1170/1251] eta 0:00:19 lr 0.000447 wd 0.0500 time 0.2339 (0.2444) data time 0.0010 (0.0016) model time 0.2330 (0.2429) loss 4.0190 (4.8844) grad_norm 2.2181 (2.4805) loss_scale 16384.0000 (16167.1324) mem 7376MB [2024-08-20 00:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1180/1251] eta 0:00:17 lr 0.000448 wd 0.0500 time 0.2492 (0.2444) data time 0.0011 (0.0016) model time 0.2481 (0.2429) loss 4.8122 (4.8817) grad_norm 2.3103 (2.4786) loss_scale 16384.0000 (16168.9687) mem 7376MB [2024-08-20 00:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1190/1251] eta 0:00:14 lr 0.000448 wd 0.0500 time 0.2400 (0.2443) data time 0.0010 (0.0016) model time 0.2389 (0.2429) loss 
5.4699 (4.8810) grad_norm 1.7364 (2.4762) loss_scale 16384.0000 (16170.7741) mem 7376MB [2024-08-20 00:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1200/1251] eta 0:00:12 lr 0.000449 wd 0.0500 time 0.2402 (0.2443) data time 0.0008 (0.0016) model time 0.2394 (0.2429) loss 4.1383 (4.8807) grad_norm 1.9829 (2.4759) loss_scale 16384.0000 (16172.5495) mem 7376MB [2024-08-20 00:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1210/1251] eta 0:00:10 lr 0.000449 wd 0.0500 time 0.2390 (0.2443) data time 0.0009 (0.0016) model time 0.2381 (0.2428) loss 5.3749 (4.8821) grad_norm 2.7551 (2.4789) loss_scale 16384.0000 (16174.2956) mem 7376MB [2024-08-20 00:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1220/1251] eta 0:00:07 lr 0.000449 wd 0.0500 time 0.2410 (0.2443) data time 0.0011 (0.0016) model time 0.2399 (0.2428) loss 4.6311 (4.8809) grad_norm 2.0627 (2.4764) loss_scale 16384.0000 (16176.0131) mem 7376MB [2024-08-20 00:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1230/1251] eta 0:00:05 lr 0.000450 wd 0.0500 time 0.2372 (0.2442) data time 0.0009 (0.0016) model time 0.2363 (0.2428) loss 4.7416 (4.8809) grad_norm 3.3444 (2.4773) loss_scale 16384.0000 (16177.7027) mem 7376MB [2024-08-20 00:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1240/1251] eta 0:00:02 lr 0.000450 wd 0.0500 time 0.2243 (0.2441) data time 0.0005 (0.0016) model time 0.2239 (0.2426) loss 3.6094 (4.8783) grad_norm 2.6903 (2.4779) loss_scale 16384.0000 (16179.3650) mem 7376MB [2024-08-20 00:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [8/300][1250/1251] eta 0:00:00 lr 0.000451 wd 0.0500 time 0.2292 (0.2440) data time 0.0007 (0.0016) model time 0.2285 (0.2425) loss 5.1333 (4.8792) grad_norm 1.8273 (2.4768) loss_scale 16384.0000 (16181.0008) mem 7376MB [2024-08-20 00:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 8 training takes 0:05:05 
[2024-08-20 00:13:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-20 00:13:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-20 00:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.412 (0.412) Loss 1.3701 (1.3701) Acc@1 71.680 (71.680) Acc@5 90.137 (90.137) Mem 7376MB
[2024-08-20 00:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.107) Loss 1.7998 (1.8211) Acc@1 57.227 (59.100) Acc@5 85.938 (83.993) Mem 7376MB
[2024-08-20 00:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.094) Loss 2.4707 (1.8378) Acc@1 49.121 (58.398) Acc@5 72.656 (84.003) Mem 7376MB
[2024-08-20 00:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.089) Loss 2.7617 (2.0560) Acc@1 43.457 (54.826) Acc@5 67.090 (80.129) Mem 7376MB
[2024-08-20 00:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 2.9258 (2.1890) Acc@1 39.355 (52.470) Acc@5 63.477 (77.720) Mem 7376MB
[2024-08-20 00:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 52.260 Acc@5 77.632
[2024-08-20 00:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 52.3%
[2024-08-20 00:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 52.26%
[2024-08-20 00:13:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-20 00:13:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-20 00:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.431 (0.431) Loss 6.2461 (6.2461) Acc@1 1.758 (1.758) Acc@5 8.594 (8.594) Mem 7376MB
[2024-08-20 00:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.115) Loss 6.7734 (6.5614) Acc@1 0.195 (1.199) Acc@5 2.344 (4.998) Mem 7376MB
[2024-08-20 00:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.097) Loss 6.5781 (6.5383) Acc@1 0.879 (1.186) Acc@5 3.906 (4.799) Mem 7376MB
[2024-08-20 00:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.087 (0.092) Loss 6.3984 (6.5228) Acc@1 1.758 (1.219) Acc@5 6.445 (4.854) Mem 7376MB
[2024-08-20 00:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 6.6289 (6.5165) Acc@1 0.879 (1.246) Acc@5 3.711 (4.869) Mem 7376MB
[2024-08-20 00:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 1.496 Acc@5 5.304
[2024-08-20 00:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 1.5%
[2024-08-20 00:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 1.50%
[2024-08-20 00:13:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-20 00:13:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-20 00:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][0/1251] eta 0:13:36 lr 0.000451 wd 0.0500 time 0.6524 (0.6524) data time 0.4272 (0.4272) model time 0.0000 (0.0000) loss 3.9125 (3.9125) grad_norm 2.0806 (2.0806) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][10/1251] eta 0:05:46 lr 0.000451 wd 0.0500 time 0.2391 (0.2794) data time 0.0010 (0.0399) model time 0.0000 (0.0000) loss 4.6157 (4.6407) grad_norm 3.1988 (2.4117) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][20/1251] eta 0:05:23 lr 0.000451 wd 0.0500 time 0.2458 (0.2629) data time 0.0007 (0.0214) model time 0.0000 (0.0000) loss 3.6333 (4.5297) grad_norm 2.1933 (2.4813) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][30/1251] eta 0:05:11 lr 0.000452 wd 0.0500 time 0.2392 (0.2552) data time 0.0011 (0.0149) model time 0.0000 (0.0000) loss 4.4410 (4.6469) grad_norm 1.8635 (2.3711) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][40/1251] eta 0:05:05 lr 0.000452 wd 0.0500 time 0.2374 (0.2520) data time 0.0009 (0.0115) model time 0.0000 (0.0000) loss 3.8002 (4.6722) grad_norm 2.0394 (2.3499) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][50/1251] eta 0:05:00 lr 0.000453 wd 0.0500 time 0.2517 (0.2502) data time 0.0008 (0.0095) model time 0.0000 (0.0000) loss 3.8130 (4.7175) grad_norm 2.4091 (2.4221) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][60/1251] eta 0:04:56 lr 0.000453 wd 0.0500 time 0.2446 (0.2486) data time 0.0011 (0.0081) model time 0.2435 (0.2388) 
loss 4.1058 (4.7265) grad_norm 2.3334 (2.4237) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][70/1251] eta 0:04:52 lr 0.000453 wd 0.0500 time 0.2478 (0.2478) data time 0.0010 (0.0072) model time 0.2468 (0.2401) loss 5.0668 (4.7790) grad_norm 2.2009 (2.4087) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][80/1251] eta 0:04:49 lr 0.000454 wd 0.0500 time 0.2497 (0.2469) data time 0.0009 (0.0064) model time 0.2489 (0.2401) loss 5.4442 (4.7787) grad_norm 2.4147 (2.4285) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][90/1251] eta 0:04:46 lr 0.000454 wd 0.0500 time 0.2505 (0.2467) data time 0.0010 (0.0058) model time 0.2495 (0.2409) loss 5.4145 (4.8051) grad_norm 2.1170 (2.4359) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][100/1251] eta 0:04:43 lr 0.000455 wd 0.0500 time 0.2491 (0.2463) data time 0.0010 (0.0054) model time 0.2481 (0.2410) loss 5.0585 (4.8161) grad_norm 2.2141 (2.4250) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][110/1251] eta 0:04:40 lr 0.000455 wd 0.0500 time 0.2472 (0.2458) data time 0.0009 (0.0050) model time 0.2462 (0.2409) loss 4.4324 (4.8211) grad_norm 3.6371 (2.4403) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][120/1251] eta 0:04:37 lr 0.000455 wd 0.0500 time 0.2556 (0.2457) data time 0.0008 (0.0047) model time 0.2548 (0.2412) loss 4.6447 (4.8337) grad_norm 2.0107 (2.4471) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][130/1251] eta 
0:04:34 lr 0.000456 wd 0.0500 time 0.2407 (0.2453) data time 0.0011 (0.0044) model time 0.2396 (0.2409) loss 4.9825 (4.8184) grad_norm 1.9649 (2.4440) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][140/1251] eta 0:04:32 lr 0.000456 wd 0.0500 time 0.2615 (0.2452) data time 0.0008 (0.0042) model time 0.2607 (0.2413) loss 5.9399 (4.8237) grad_norm 2.0168 (2.4169) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][150/1251] eta 0:04:29 lr 0.000457 wd 0.0500 time 0.2379 (0.2450) data time 0.0009 (0.0039) model time 0.2370 (0.2412) loss 3.5928 (4.7910) grad_norm 3.2059 (2.4265) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][160/1251] eta 0:04:27 lr 0.000457 wd 0.0500 time 0.2450 (0.2449) data time 0.0010 (0.0038) model time 0.2440 (0.2413) loss 4.6196 (4.8164) grad_norm 2.1913 (2.4219) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][170/1251] eta 0:04:24 lr 0.000457 wd 0.0500 time 0.2504 (0.2449) data time 0.0011 (0.0036) model time 0.2493 (0.2415) loss 5.1610 (4.8135) grad_norm 3.9126 (2.4451) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][180/1251] eta 0:04:22 lr 0.000458 wd 0.0500 time 0.2396 (0.2447) data time 0.0011 (0.0035) model time 0.2385 (0.2414) loss 4.5309 (4.8147) grad_norm 1.9766 (2.4330) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][190/1251] eta 0:04:19 lr 0.000458 wd 0.0500 time 0.2541 (0.2447) data time 0.0007 (0.0033) model time 0.2534 (0.2416) loss 5.3503 (4.8189) grad_norm 2.3481 (2.4288) loss_scale 16384.0000 (16384.0000) mem 7376MB 
[2024-08-20 00:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][200/1251] eta 0:04:17 lr 0.000459 wd 0.0500 time 0.2477 (0.2446) data time 0.0008 (0.0032) model time 0.2469 (0.2415) loss 4.6623 (4.8176) grad_norm 2.7957 (2.4375) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][210/1251] eta 0:04:14 lr 0.000459 wd 0.0500 time 0.2347 (0.2444) data time 0.0009 (0.0031) model time 0.2338 (0.2414) loss 3.9141 (4.8213) grad_norm 3.9718 (2.4440) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][220/1251] eta 0:04:11 lr 0.000459 wd 0.0500 time 0.2499 (0.2443) data time 0.0012 (0.0030) model time 0.2487 (0.2413) loss 5.1312 (4.8203) grad_norm 1.8849 (2.4620) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][230/1251] eta 0:04:09 lr 0.000460 wd 0.0500 time 0.2449 (0.2442) data time 0.0010 (0.0030) model time 0.2439 (0.2413) loss 5.1509 (4.8356) grad_norm 2.5752 (2.4656) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][240/1251] eta 0:04:06 lr 0.000460 wd 0.0500 time 0.2409 (0.2441) data time 0.0010 (0.0029) model time 0.2399 (0.2414) loss 4.7028 (4.8462) grad_norm 2.5400 (2.4731) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][250/1251] eta 0:04:04 lr 0.000461 wd 0.0500 time 0.2401 (0.2440) data time 0.0011 (0.0028) model time 0.2390 (0.2413) loss 4.8874 (4.8543) grad_norm 1.8256 (2.4810) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][260/1251] eta 0:04:02 lr 0.000461 wd 0.0500 time 0.2411 (0.2448) data time 0.0007 (0.0028) model time 0.2404 
(0.2424) loss 3.6964 (4.8512) grad_norm 3.3100 (2.4728) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][270/1251] eta 0:04:00 lr 0.000461 wd 0.0500 time 0.2488 (0.2449) data time 0.0008 (0.0027) model time 0.2481 (0.2426) loss 3.8497 (4.8408) grad_norm 2.4115 (2.4677) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][280/1251] eta 0:03:57 lr 0.000462 wd 0.0500 time 0.2478 (0.2448) data time 0.0010 (0.0026) model time 0.2467 (0.2425) loss 5.2258 (4.8417) grad_norm 1.9420 (2.4579) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][290/1251] eta 0:03:55 lr 0.000462 wd 0.0500 time 0.2449 (0.2447) data time 0.0011 (0.0026) model time 0.2438 (0.2424) loss 5.3010 (4.8450) grad_norm 2.9596 (2.4573) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][300/1251] eta 0:03:52 lr 0.000463 wd 0.0500 time 0.2406 (0.2447) data time 0.0008 (0.0025) model time 0.2398 (0.2424) loss 4.1469 (4.8320) grad_norm 2.3110 (2.4525) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][310/1251] eta 0:03:51 lr 0.000463 wd 0.0500 time 0.2368 (0.2457) data time 0.0011 (0.0025) model time 0.2357 (0.2436) loss 4.0748 (4.8279) grad_norm 2.8814 (2.4512) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][320/1251] eta 0:03:48 lr 0.000463 wd 0.0500 time 0.2377 (0.2455) data time 0.0012 (0.0025) model time 0.2364 (0.2435) loss 5.4259 (4.8226) grad_norm 2.8869 (2.4451) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[9/300][330/1251] eta 0:03:46 lr 0.000464 wd 0.0500 time 0.2461 (0.2455) data time 0.0009 (0.0024) model time 0.2452 (0.2435) loss 4.0929 (4.8170) grad_norm 2.7209 (2.4461) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][340/1251] eta 0:03:43 lr 0.000464 wd 0.0500 time 0.2460 (0.2454) data time 0.0011 (0.0024) model time 0.2450 (0.2434) loss 4.7535 (4.8184) grad_norm 1.7714 (2.4500) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][350/1251] eta 0:03:41 lr 0.000465 wd 0.0500 time 0.2362 (0.2453) data time 0.0012 (0.0024) model time 0.2350 (0.2433) loss 4.9679 (4.8172) grad_norm 2.2045 (2.4520) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][360/1251] eta 0:03:38 lr 0.000465 wd 0.0500 time 0.2367 (0.2453) data time 0.0008 (0.0023) model time 0.2359 (0.2433) loss 4.9589 (4.8153) grad_norm 2.4010 (2.4488) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][370/1251] eta 0:03:36 lr 0.000465 wd 0.0500 time 0.2352 (0.2453) data time 0.0009 (0.0023) model time 0.2343 (0.2433) loss 4.5913 (4.8069) grad_norm 2.4423 (2.4663) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][380/1251] eta 0:03:33 lr 0.000466 wd 0.0500 time 0.2335 (0.2452) data time 0.0009 (0.0023) model time 0.2325 (0.2433) loss 5.3611 (4.8099) grad_norm 2.5275 (2.4705) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][390/1251] eta 0:03:31 lr 0.000466 wd 0.0500 time 0.2495 (0.2451) data time 0.0011 (0.0022) model time 0.2484 (0.2432) loss 5.2461 (4.8096) grad_norm 2.3477 (2.4649) loss_scale 16384.0000 
(16384.0000) mem 7376MB [2024-08-20 00:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][400/1251] eta 0:03:29 lr 0.000467 wd 0.0500 time 0.2370 (0.2456) data time 0.0010 (0.0022) model time 0.2360 (0.2438) loss 3.9643 (4.8091) grad_norm 1.8490 (2.4570) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][410/1251] eta 0:03:26 lr 0.000467 wd 0.0500 time 0.2463 (0.2455) data time 0.0010 (0.0022) model time 0.2453 (0.2437) loss 3.9501 (4.8068) grad_norm 1.8151 (2.4509) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][420/1251] eta 0:03:23 lr 0.000467 wd 0.0500 time 0.2407 (0.2455) data time 0.0007 (0.0022) model time 0.2400 (0.2437) loss 3.9757 (4.8087) grad_norm 6.5574 (2.4547) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][430/1251] eta 0:03:21 lr 0.000468 wd 0.0500 time 0.2374 (0.2454) data time 0.0009 (0.0021) model time 0.2366 (0.2436) loss 4.3852 (4.8059) grad_norm 2.0909 (2.4549) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][440/1251] eta 0:03:18 lr 0.000468 wd 0.0500 time 0.2378 (0.2452) data time 0.0013 (0.0021) model time 0.2364 (0.2434) loss 4.9568 (4.8090) grad_norm 3.2851 (2.4674) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][450/1251] eta 0:03:16 lr 0.000469 wd 0.0500 time 0.2481 (0.2452) data time 0.0009 (0.0021) model time 0.2472 (0.2434) loss 4.6905 (4.8091) grad_norm 1.9065 (2.4651) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][460/1251] eta 0:03:14 lr 0.000469 wd 0.0500 time 0.2436 (0.2454) data time 0.0008 
(0.0021) model time 0.2427 (0.2437) loss 3.6607 (4.8010) grad_norm 2.5675 (2.4602) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][470/1251] eta 0:03:11 lr 0.000469 wd 0.0500 time 0.2614 (0.2454) data time 0.0010 (0.0021) model time 0.2604 (0.2437) loss 5.2745 (4.7960) grad_norm 3.1546 (2.4612) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][480/1251] eta 0:03:09 lr 0.000470 wd 0.0500 time 0.2368 (0.2453) data time 0.0012 (0.0020) model time 0.2356 (0.2436) loss 5.4192 (4.7948) grad_norm 1.9227 (2.4632) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][490/1251] eta 0:03:06 lr 0.000470 wd 0.0500 time 0.2435 (0.2453) data time 0.0009 (0.0020) model time 0.2426 (0.2435) loss 3.7588 (4.7877) grad_norm 3.1808 (2.4604) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][500/1251] eta 0:03:04 lr 0.000471 wd 0.0500 time 0.2438 (0.2452) data time 0.0012 (0.0020) model time 0.2426 (0.2434) loss 5.2475 (4.7871) grad_norm 4.4243 (2.4680) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][510/1251] eta 0:03:01 lr 0.000471 wd 0.0500 time 0.2554 (0.2451) data time 0.0011 (0.0020) model time 0.2543 (0.2434) loss 5.3643 (4.7889) grad_norm 2.1444 (2.4674) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][520/1251] eta 0:02:59 lr 0.000471 wd 0.0500 time 0.2381 (0.2450) data time 0.0011 (0.0020) model time 0.2370 (0.2433) loss 5.0314 (4.7883) grad_norm 2.4759 (2.4621) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [9/300][530/1251] eta 0:02:56 lr 0.000472 wd 0.0500 time 0.2408 (0.2450) data time 0.0009 (0.0020) model time 0.2399 (0.2432) loss 4.0378 (4.7857) grad_norm 2.6724 (2.4675) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][540/1251] eta 0:02:54 lr 0.000472 wd 0.0500 time 0.2453 (0.2449) data time 0.0010 (0.0020) model time 0.2443 (0.2431) loss 5.2229 (4.7851) grad_norm 1.6512 (2.4606) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][550/1251] eta 0:02:51 lr 0.000473 wd 0.0500 time 0.2461 (0.2448) data time 0.0010 (0.0019) model time 0.2451 (0.2431) loss 5.4721 (4.7879) grad_norm 3.9253 (2.4645) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][560/1251] eta 0:02:49 lr 0.000473 wd 0.0500 time 0.2525 (0.2448) data time 0.0009 (0.0019) model time 0.2516 (0.2431) loss 3.7873 (4.7891) grad_norm 3.5971 (2.4765) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][570/1251] eta 0:02:46 lr 0.000473 wd 0.0500 time 0.2484 (0.2448) data time 0.0007 (0.0019) model time 0.2477 (0.2430) loss 3.6716 (4.7863) grad_norm 1.7672 (2.4721) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][580/1251] eta 0:02:44 lr 0.000474 wd 0.0500 time 0.2453 (0.2447) data time 0.0011 (0.0019) model time 0.2442 (0.2430) loss 5.0348 (4.7857) grad_norm 3.6039 (2.4715) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][590/1251] eta 0:02:41 lr 0.000474 wd 0.0500 time 0.2412 (0.2447) data time 0.0010 (0.0019) model time 0.2402 (0.2430) loss 5.6854 (4.7871) grad_norm 2.2698 (2.4732) loss_scale 
16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][600/1251] eta 0:02:39 lr 0.000475 wd 0.0500 time 0.2403 (0.2446) data time 0.0007 (0.0019) model time 0.2395 (0.2429) loss 4.3286 (4.7833) grad_norm 1.9139 (2.4694) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][610/1251] eta 0:02:36 lr 0.000475 wd 0.0500 time 0.2532 (0.2446) data time 0.0010 (0.0019) model time 0.2522 (0.2429) loss 5.2473 (4.7818) grad_norm 2.0398 (2.4661) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][620/1251] eta 0:02:34 lr 0.000475 wd 0.0500 time 0.2416 (0.2445) data time 0.0009 (0.0018) model time 0.2407 (0.2429) loss 5.1730 (4.7815) grad_norm 1.8875 (2.4623) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][630/1251] eta 0:02:31 lr 0.000476 wd 0.0500 time 0.2416 (0.2445) data time 0.0007 (0.0018) model time 0.2409 (0.2428) loss 5.2419 (4.7761) grad_norm 2.5002 (2.4627) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][640/1251] eta 0:02:29 lr 0.000476 wd 0.0500 time 0.2494 (0.2445) data time 0.0007 (0.0018) model time 0.2487 (0.2428) loss 5.6032 (4.7830) grad_norm 2.5771 (2.4597) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][650/1251] eta 0:02:26 lr 0.000477 wd 0.0500 time 0.2497 (0.2444) data time 0.0010 (0.0018) model time 0.2487 (0.2428) loss 4.8362 (4.7839) grad_norm 2.0999 (2.4551) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][660/1251] eta 0:02:24 lr 0.000477 wd 0.0500 time 0.2494 (0.2444) data time 
0.0010 (0.0018) model time 0.2484 (0.2428) loss 4.7240 (4.7828) grad_norm 2.2981 (2.4502) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][670/1251] eta 0:02:21 lr 0.000477 wd 0.0500 time 0.2477 (0.2444) data time 0.0011 (0.0018) model time 0.2465 (0.2427) loss 4.5076 (4.7840) grad_norm 3.5140 (2.4497) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][680/1251] eta 0:02:19 lr 0.000478 wd 0.0500 time 0.2434 (0.2443) data time 0.0012 (0.0018) model time 0.2423 (0.2427) loss 4.4687 (4.7804) grad_norm 2.1660 (2.4458) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][690/1251] eta 0:02:17 lr 0.000478 wd 0.0500 time 0.2438 (0.2443) data time 0.0011 (0.0018) model time 0.2427 (0.2427) loss 4.8776 (4.7820) grad_norm 2.3622 (2.4427) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][700/1251] eta 0:02:14 lr 0.000478 wd 0.0500 time 0.2453 (0.2443) data time 0.0008 (0.0018) model time 0.2446 (0.2427) loss 5.2714 (4.7807) grad_norm 2.2184 (2.4423) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][710/1251] eta 0:02:12 lr 0.000479 wd 0.0500 time 0.2495 (0.2443) data time 0.0011 (0.0018) model time 0.2484 (0.2426) loss 4.5657 (4.7790) grad_norm 2.2568 (2.4415) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][720/1251] eta 0:02:09 lr 0.000479 wd 0.0500 time 0.2483 (0.2442) data time 0.0011 (0.0017) model time 0.2472 (0.2426) loss 5.1056 (4.7797) grad_norm 2.4533 (2.4411) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [9/300][730/1251] eta 0:02:07 lr 0.000480 wd 0.0500 time 0.2401 (0.2442) data time 0.0008 (0.0017) model time 0.2393 (0.2426) loss 3.6731 (4.7783) grad_norm 2.2501 (2.4367) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][740/1251] eta 0:02:04 lr 0.000480 wd 0.0500 time 0.2511 (0.2441) data time 0.0011 (0.0017) model time 0.2500 (0.2425) loss 5.2700 (4.7795) grad_norm 2.3053 (2.4363) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][750/1251] eta 0:02:02 lr 0.000480 wd 0.0500 time 0.2406 (0.2441) data time 0.0012 (0.0017) model time 0.2394 (0.2424) loss 4.8102 (4.7819) grad_norm 2.2855 (2.4376) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][760/1251] eta 0:01:59 lr 0.000481 wd 0.0500 time 0.2375 (0.2440) data time 0.0010 (0.0017) model time 0.2365 (0.2424) loss 4.6598 (4.7802) grad_norm 3.3207 (2.4384) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][770/1251] eta 0:01:57 lr 0.000481 wd 0.0500 time 0.2554 (0.2440) data time 0.0012 (0.0017) model time 0.2542 (0.2424) loss 4.4707 (4.7792) grad_norm 2.0097 (2.4360) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][780/1251] eta 0:01:54 lr 0.000482 wd 0.0500 time 0.2386 (0.2440) data time 0.0007 (0.0017) model time 0.2378 (0.2424) loss 3.9288 (4.7799) grad_norm 2.6755 (2.4370) loss_scale 32768.0000 (16404.9782) mem 7376MB [2024-08-20 00:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][790/1251] eta 0:01:52 lr 0.000482 wd 0.0500 time 0.2493 (0.2439) data time 0.0008 (0.0017) model time 0.2485 (0.2423) loss 5.0361 (4.7798) grad_norm 1.9538 (2.4335) 
loss_scale 32768.0000 (16611.8432) mem 7376MB [2024-08-20 00:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][800/1251] eta 0:01:50 lr 0.000482 wd 0.0500 time 0.2361 (0.2439) data time 0.0010 (0.0017) model time 0.2351 (0.2423) loss 3.7280 (4.7799) grad_norm 1.8133 (2.4319) loss_scale 32768.0000 (16813.5431) mem 7376MB [2024-08-20 00:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][810/1251] eta 0:01:47 lr 0.000483 wd 0.0500 time 0.2405 (0.2439) data time 0.0011 (0.0017) model time 0.2394 (0.2423) loss 5.4166 (4.7831) grad_norm 2.1122 (2.4320) loss_scale 32768.0000 (17010.2688) mem 7376MB [2024-08-20 00:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][820/1251] eta 0:01:45 lr 0.000483 wd 0.0500 time 0.2434 (0.2438) data time 0.0008 (0.0017) model time 0.2426 (0.2422) loss 4.7799 (4.7834) grad_norm 2.0057 (2.4307) loss_scale 32768.0000 (17202.2022) mem 7376MB [2024-08-20 00:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][830/1251] eta 0:01:42 lr 0.000484 wd 0.0500 time 0.2373 (0.2438) data time 0.0009 (0.0017) model time 0.2363 (0.2422) loss 3.8579 (4.7812) grad_norm 1.9897 (2.4276) loss_scale 32768.0000 (17389.5162) mem 7376MB [2024-08-20 00:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][840/1251] eta 0:01:40 lr 0.000484 wd 0.0500 time 0.2450 (0.2443) data time 0.0011 (0.0017) model time 0.2438 (0.2428) loss 4.5870 (4.7767) grad_norm 1.7920 (2.4282) loss_scale 32768.0000 (17572.3757) mem 7376MB [2024-08-20 00:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][850/1251] eta 0:01:37 lr 0.000484 wd 0.0500 time 0.2316 (0.2442) data time 0.0012 (0.0017) model time 0.2305 (0.2427) loss 4.8968 (4.7759) grad_norm 2.1711 (2.4265) loss_scale 32768.0000 (17750.9377) mem 7376MB [2024-08-20 00:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][860/1251] eta 0:01:35 lr 0.000485 wd 0.0500 time 0.2416 (0.2442) 
data time 0.0010 (0.0017) model time 0.2406 (0.2427) loss 4.3119 (4.7733) grad_norm 1.9427 (2.4221) loss_scale 32768.0000 (17925.3519) mem 7376MB [2024-08-20 00:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][870/1251] eta 0:01:33 lr 0.000485 wd 0.0500 time 0.2352 (0.2442) data time 0.0013 (0.0016) model time 0.2339 (0.2427) loss 5.1472 (4.7714) grad_norm 2.5105 (2.4289) loss_scale 32768.0000 (18095.7612) mem 7376MB [2024-08-20 00:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][880/1251] eta 0:01:30 lr 0.000486 wd 0.0500 time 0.2466 (0.2442) data time 0.0008 (0.0016) model time 0.2458 (0.2427) loss 4.7920 (4.7670) grad_norm 2.5213 (2.4264) loss_scale 32768.0000 (18262.3019) mem 7376MB [2024-08-20 00:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][890/1251] eta 0:01:28 lr 0.000486 wd 0.0500 time 0.2404 (0.2443) data time 0.0011 (0.0016) model time 0.2393 (0.2427) loss 5.0230 (4.7681) grad_norm 2.0631 (2.4235) loss_scale 32768.0000 (18425.1044) mem 7376MB [2024-08-20 00:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][900/1251] eta 0:01:25 lr 0.000486 wd 0.0500 time 0.2455 (0.2442) data time 0.0007 (0.0016) model time 0.2448 (0.2427) loss 5.4354 (4.7664) grad_norm 1.9633 (2.4221) loss_scale 32768.0000 (18584.2930) mem 7376MB [2024-08-20 00:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][910/1251] eta 0:01:23 lr 0.000487 wd 0.0500 time 0.2416 (0.2442) data time 0.0028 (0.0016) model time 0.2388 (0.2427) loss 5.2914 (4.7672) grad_norm 2.1086 (2.4216) loss_scale 32768.0000 (18739.9868) mem 7376MB [2024-08-20 00:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][920/1251] eta 0:01:20 lr 0.000487 wd 0.0500 time 0.2390 (0.2442) data time 0.0011 (0.0016) model time 0.2379 (0.2426) loss 5.0564 (4.7656) grad_norm 1.9457 (2.4223) loss_scale 32768.0000 (18892.2997) mem 7376MB [2024-08-20 00:17:07 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [9/300][930/1251] eta 0:01:18 lr 0.000488 wd 0.0500 time 0.2405 (0.2441) data time 0.0008 (0.0016) model time 0.2398 (0.2426) loss 4.8486 (4.7620) grad_norm 2.9477 (2.4243) loss_scale 32768.0000 (19041.3405) mem 7376MB [2024-08-20 00:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][940/1251] eta 0:01:15 lr 0.000488 wd 0.0500 time 0.2422 (0.2441) data time 0.0007 (0.0016) model time 0.2415 (0.2426) loss 3.8428 (4.7578) grad_norm 2.4177 (2.4263) loss_scale 32768.0000 (19187.2136) mem 7376MB [2024-08-20 00:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][950/1251] eta 0:01:13 lr 0.000488 wd 0.0500 time 0.2345 (0.2441) data time 0.0010 (0.0016) model time 0.2334 (0.2426) loss 5.4583 (4.7559) grad_norm 1.8162 (2.4225) loss_scale 32768.0000 (19330.0189) mem 7376MB [2024-08-20 00:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][960/1251] eta 0:01:11 lr 0.000489 wd 0.0500 time 0.2438 (0.2441) data time 0.0008 (0.0016) model time 0.2430 (0.2426) loss 4.3214 (4.7546) grad_norm 2.2907 (2.4251) loss_scale 32768.0000 (19469.8522) mem 7376MB [2024-08-20 00:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][970/1251] eta 0:01:08 lr 0.000489 wd 0.0500 time 0.2473 (0.2441) data time 0.0011 (0.0016) model time 0.2462 (0.2426) loss 5.2683 (4.7546) grad_norm 2.5038 (2.4247) loss_scale 32768.0000 (19606.8054) mem 7376MB [2024-08-20 00:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][980/1251] eta 0:01:06 lr 0.000490 wd 0.0500 time 0.2413 (0.2440) data time 0.0010 (0.0016) model time 0.2402 (0.2425) loss 5.2922 (4.7549) grad_norm 1.7527 (2.4250) loss_scale 32768.0000 (19740.9664) mem 7376MB [2024-08-20 00:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][990/1251] eta 0:01:03 lr 0.000490 wd 0.0500 time 0.2380 (0.2442) data time 0.0009 (0.0016) model time 0.2371 (0.2427) loss 4.6743 (4.7538) grad_norm 
2.1168 (2.4272) loss_scale 32768.0000 (19872.4198) mem 7376MB [2024-08-20 00:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1000/1251] eta 0:01:01 lr 0.000490 wd 0.0500 time 0.2403 (0.2442) data time 0.0009 (0.0016) model time 0.2393 (0.2427) loss 5.3718 (4.7550) grad_norm 3.5096 (2.4297) loss_scale 32768.0000 (20001.2468) mem 7376MB [2024-08-20 00:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1010/1251] eta 0:00:58 lr 0.000491 wd 0.0500 time 0.2386 (0.2442) data time 0.0008 (0.0016) model time 0.2378 (0.2427) loss 3.6653 (4.7543) grad_norm 2.0724 (2.4309) loss_scale 32768.0000 (20127.5252) mem 7376MB [2024-08-20 00:17:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1020/1251] eta 0:00:56 lr 0.000491 wd 0.0500 time 0.2403 (0.2442) data time 0.0009 (0.0016) model time 0.2394 (0.2427) loss 5.4336 (4.7532) grad_norm 1.7173 (2.4327) loss_scale 32768.0000 (20251.3301) mem 7376MB [2024-08-20 00:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1030/1251] eta 0:00:53 lr 0.000492 wd 0.0500 time 0.2413 (0.2441) data time 0.0010 (0.0016) model time 0.2403 (0.2427) loss 4.8344 (4.7545) grad_norm 2.3467 (2.4307) loss_scale 32768.0000 (20372.7333) mem 7376MB [2024-08-20 00:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1040/1251] eta 0:00:51 lr 0.000492 wd 0.0500 time 0.2433 (0.2441) data time 0.0010 (0.0016) model time 0.2423 (0.2427) loss 5.3157 (4.7546) grad_norm 2.0315 (2.4332) loss_scale 32768.0000 (20491.8040) mem 7376MB [2024-08-20 00:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1050/1251] eta 0:00:49 lr 0.000492 wd 0.0500 time 0.2421 (0.2441) data time 0.0010 (0.0016) model time 0.2411 (0.2426) loss 4.2550 (4.7533) grad_norm 2.0604 (2.4324) loss_scale 32768.0000 (20608.6089) mem 7376MB [2024-08-20 00:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1060/1251] eta 0:00:46 lr 0.000493 wd 
0.0500 time 0.2515 (0.2441) data time 0.0009 (0.0016) model time 0.2506 (0.2426) loss 5.0189 (4.7528) grad_norm 2.1295 (2.4326) loss_scale 32768.0000 (20723.2121) mem 7376MB [2024-08-20 00:17:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1070/1251] eta 0:00:44 lr 0.000493 wd 0.0500 time 0.2410 (0.2441) data time 0.0010 (0.0016) model time 0.2400 (0.2426) loss 4.9544 (4.7539) grad_norm 1.8060 (2.4356) loss_scale 32768.0000 (20835.6751) mem 7376MB [2024-08-20 00:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1080/1251] eta 0:00:41 lr 0.000494 wd 0.0500 time 0.2449 (0.2440) data time 0.0010 (0.0015) model time 0.2439 (0.2426) loss 4.7577 (4.7530) grad_norm 2.0513 (2.4312) loss_scale 32768.0000 (20946.0574) mem 7376MB [2024-08-20 00:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1090/1251] eta 0:00:39 lr 0.000494 wd 0.0500 time 0.2364 (0.2440) data time 0.0010 (0.0015) model time 0.2354 (0.2425) loss 4.5927 (4.7528) grad_norm 1.9133 (2.4276) loss_scale 32768.0000 (21054.4161) mem 7376MB [2024-08-20 00:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1100/1251] eta 0:00:36 lr 0.000494 wd 0.0500 time 0.2380 (0.2440) data time 0.0010 (0.0015) model time 0.2369 (0.2425) loss 4.3394 (4.7514) grad_norm 1.8130 (2.4262) loss_scale 32768.0000 (21160.8065) mem 7376MB [2024-08-20 00:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1110/1251] eta 0:00:34 lr 0.000495 wd 0.0500 time 0.2420 (0.2440) data time 0.0008 (0.0015) model time 0.2413 (0.2425) loss 3.6161 (4.7461) grad_norm 1.8163 (2.4235) loss_scale 32768.0000 (21265.2817) mem 7376MB [2024-08-20 00:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1120/1251] eta 0:00:31 lr 0.000495 wd 0.0500 time 0.2454 (0.2440) data time 0.0011 (0.0015) model time 0.2442 (0.2425) loss 5.0829 (4.7460) grad_norm 2.7501 (2.4217) loss_scale 32768.0000 (21367.8930) mem 7376MB [2024-08-20 
00:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1130/1251] eta 0:00:29 lr 0.000496 wd 0.0500 time 0.2557 (0.2440) data time 0.0008 (0.0015) model time 0.2549 (0.2425) loss 4.1742 (4.7451) grad_norm 2.4386 (2.4231) loss_scale 32768.0000 (21468.6897) mem 7376MB [2024-08-20 00:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1140/1251] eta 0:00:27 lr 0.000496 wd 0.0500 time 0.2429 (0.2440) data time 0.0008 (0.0015) model time 0.2422 (0.2425) loss 4.0286 (4.7426) grad_norm 2.1187 (2.4209) loss_scale 32768.0000 (21567.7195) mem 7376MB [2024-08-20 00:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1150/1251] eta 0:00:24 lr 0.000496 wd 0.0500 time 0.2373 (0.2440) data time 0.0007 (0.0015) model time 0.2366 (0.2425) loss 5.3845 (4.7442) grad_norm 2.1677 (2.4197) loss_scale 32768.0000 (21665.0287) mem 7376MB [2024-08-20 00:18:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1160/1251] eta 0:00:22 lr 0.000497 wd 0.0500 time 0.2377 (0.2440) data time 0.0009 (0.0015) model time 0.2368 (0.2425) loss 5.3690 (4.7416) grad_norm 2.7565 (2.4234) loss_scale 32768.0000 (21760.6615) mem 7376MB [2024-08-20 00:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1170/1251] eta 0:00:19 lr 0.000497 wd 0.0500 time 0.2414 (0.2440) data time 0.0008 (0.0015) model time 0.2406 (0.2425) loss 4.1589 (4.7410) grad_norm 2.2342 (2.4238) loss_scale 32768.0000 (21854.6610) mem 7376MB [2024-08-20 00:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1180/1251] eta 0:00:17 lr 0.000498 wd 0.0500 time 0.2379 (0.2439) data time 0.0010 (0.0015) model time 0.2370 (0.2425) loss 4.0046 (4.7380) grad_norm 2.0431 (2.4218) loss_scale 32768.0000 (21947.0686) mem 7376MB [2024-08-20 00:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1190/1251] eta 0:00:14 lr 0.000498 wd 0.0500 time 0.2419 (0.2441) data time 0.0009 (0.0015) model time 0.2411 
(0.2427) loss 3.6879 (4.7373) grad_norm 2.3871 (2.4222) loss_scale 32768.0000 (22037.9244) mem 7376MB [2024-08-20 00:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1200/1251] eta 0:00:12 lr 0.000498 wd 0.0500 time 0.2356 (0.2441) data time 0.0008 (0.0015) model time 0.2348 (0.2426) loss 4.1339 (4.7361) grad_norm 2.2522 (2.4196) loss_scale 32768.0000 (22127.2673) mem 7376MB [2024-08-20 00:18:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1210/1251] eta 0:00:10 lr 0.000499 wd 0.0500 time 0.2315 (0.2440) data time 0.0010 (0.0015) model time 0.2305 (0.2426) loss 5.0478 (4.7380) grad_norm 2.4085 (2.4216) loss_scale 32768.0000 (22215.1346) mem 7376MB [2024-08-20 00:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1220/1251] eta 0:00:07 lr 0.000499 wd 0.0500 time 0.2296 (0.2440) data time 0.0012 (0.0015) model time 0.2284 (0.2426) loss 4.7025 (4.7372) grad_norm 2.0462 (2.4227) loss_scale 32768.0000 (22301.5627) mem 7376MB [2024-08-20 00:18:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1230/1251] eta 0:00:05 lr 0.000500 wd 0.0500 time 0.2336 (0.2440) data time 0.0009 (0.0015) model time 0.2327 (0.2426) loss 5.2417 (4.7364) grad_norm 1.7682 (2.4225) loss_scale 32768.0000 (22386.5865) mem 7376MB [2024-08-20 00:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1240/1251] eta 0:00:02 lr 0.000500 wd 0.0500 time 0.2225 (0.2440) data time 0.0005 (0.0015) model time 0.2220 (0.2425) loss 3.4698 (4.7358) grad_norm 2.1839 (2.4213) loss_scale 32768.0000 (22470.2401) mem 7376MB [2024-08-20 00:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [9/300][1250/1251] eta 0:00:00 lr 0.000500 wd 0.0500 time 0.2217 (0.2438) data time 0.0005 (0.0015) model time 0.2212 (0.2424) loss 5.0443 (4.7366) grad_norm 2.1925 (2.4214) loss_scale 32768.0000 (22552.5564) mem 7376MB [2024-08-20 00:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 9 training 
takes 0:05:05 [2024-08-20 00:18:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-20 00:18:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 00:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.499 (0.499) Loss 1.2256 (1.2256) Acc@1 74.316 (74.316) Acc@5 90.918 (90.918) Mem 7376MB [2024-08-20 00:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.115) Loss 1.4805 (1.6352) Acc@1 63.672 (61.648) Acc@5 89.258 (86.000) Mem 7376MB [2024-08-20 00:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.098) Loss 2.3516 (1.6430) Acc@1 49.219 (61.356) Acc@5 75.000 (86.272) Mem 7376MB [2024-08-20 00:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.095 (0.092) Loss 2.7480 (1.8839) Acc@1 45.117 (57.475) Acc@5 66.309 (82.123) Mem 7376MB [2024-08-20 00:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 2.6855 (2.0246) Acc@1 44.238 (55.088) Acc@5 69.531 (79.819) Mem 7376MB [2024-08-20 00:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 54.948 Acc@5 79.748 [2024-08-20 00:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 54.9% [2024-08-20 00:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 54.95% [2024-08-20 00:18:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-20 00:18:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-20 00:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.446 (0.446) Loss 5.8203 (5.8203) Acc@1 3.320 (3.320) Acc@5 14.258 (14.258) Mem 7376MB [2024-08-20 00:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.116) Loss 6.4180 (6.1843) Acc@1 1.367 (2.841) Acc@5 7.129 (9.632) Mem 7376MB [2024-08-20 00:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.099) Loss 6.2695 (6.1708) Acc@1 2.539 (2.967) Acc@5 7.520 (9.705) Mem 7376MB [2024-08-20 00:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.092) Loss 6.0859 (6.1644) Acc@1 3.809 (3.090) Acc@5 11.523 (9.681) Mem 7376MB [2024-08-20 00:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 6.2969 (6.1647) Acc@1 1.660 (3.030) Acc@5 6.250 (9.573) Mem 7376MB [2024-08-20 00:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 3.406 Acc@5 10.316 [2024-08-20 00:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 3.4% [2024-08-20 00:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 3.41% [2024-08-20 00:18:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-20 00:18:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-20 00:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][0/1251] eta 0:16:45 lr 0.000501 wd 0.0500 time 0.8036 (0.8036) data time 0.5756 (0.5756) model time 0.0000 (0.0000) loss 5.1352 (5.1352) grad_norm 1.7848 (1.7848) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-20 00:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][10/1251] eta 0:06:02 lr 0.000501 wd 0.0500 time 0.2410 (0.2920) data time 0.0011 (0.0533) model time 0.0000 (0.0000) loss 5.3661 (5.1294) grad_norm 2.1813 (2.4888) loss_scale 32768.0000 (32768.0000) mem 7376MB [2024-08-20 00:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][20/1251] eta 0:05:29 lr 0.000501 wd 0.0500 time 0.2429 (0.2675) data time 0.0007 (0.0284) model time 0.0000 (0.0000) loss 3.7573 (4.9597) grad_norm 1.8986 (inf) loss_scale 16384.0000 (24966.0952) mem 7376MB [2024-08-20 00:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][30/1251] eta 0:05:16 lr 0.000502 wd 0.0500 time 0.2451 (0.2594) data time 0.0009 (0.0196) model time 0.0000 (0.0000) loss 5.5322 (4.8677) grad_norm 1.8071 (inf) loss_scale 16384.0000 (22197.6774) mem 7376MB [2024-08-20 00:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][40/1251] eta 0:05:09 lr 0.000502 wd 0.0500 time 0.2624 (0.2556) data time 0.0007 (0.0152) model time 0.0000 (0.0000) loss 4.8788 (4.8531) grad_norm 2.3687 (inf) loss_scale 16384.0000 (20779.7073) mem 7376MB [2024-08-20 00:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][50/1251] eta 0:05:09 lr 0.000502 wd 0.0500 time 0.2404 (0.2575) data time 0.0012 (0.0125) model time 0.0000 (0.0000) loss 3.5718 (4.8176) grad_norm 2.2468 (inf) loss_scale 16384.0000 (19917.8039) mem 7376MB [2024-08-20 00:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][60/1251] eta 0:05:03 lr 0.000503 wd 0.0500 time 0.2415 (0.2550) data time 0.0010 (0.0106) model time 0.2404 (0.2410) loss 
5.2634 (4.7954) grad_norm 2.0249 (inf) loss_scale 16384.0000 (19338.4918) mem 7376MB [2024-08-20 00:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][70/1251] eta 0:04:59 lr 0.000503 wd 0.0500 time 0.2390 (0.2533) data time 0.0008 (0.0093) model time 0.2383 (0.2416) loss 3.5511 (4.8069) grad_norm 2.1192 (inf) loss_scale 16384.0000 (18922.3662) mem 7376MB [2024-08-20 00:18:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][80/1251] eta 0:04:55 lr 0.000504 wd 0.0500 time 0.2536 (0.2520) data time 0.0011 (0.0083) model time 0.2525 (0.2415) loss 4.9415 (4.8218) grad_norm 2.3018 (inf) loss_scale 16384.0000 (18608.9877) mem 7376MB [2024-08-20 00:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][90/1251] eta 0:04:51 lr 0.000504 wd 0.0500 time 0.2490 (0.2509) data time 0.0008 (0.0075) model time 0.2482 (0.2413) loss 5.1034 (4.8020) grad_norm 2.1780 (inf) loss_scale 16384.0000 (18364.4835) mem 7376MB [2024-08-20 00:19:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][100/1251] eta 0:04:47 lr 0.000504 wd 0.0500 time 0.2366 (0.2499) data time 0.0007 (0.0070) model time 0.2358 (0.2409) loss 4.3547 (4.7743) grad_norm 2.8934 (inf) loss_scale 16384.0000 (18168.3960) mem 7376MB [2024-08-20 00:19:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][110/1251] eta 0:04:44 lr 0.000505 wd 0.0500 time 0.2457 (0.2493) data time 0.0010 (0.0064) model time 0.2446 (0.2411) loss 4.0465 (4.7697) grad_norm 2.4817 (inf) loss_scale 16384.0000 (18007.6396) mem 7376MB [2024-08-20 00:19:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][120/1251] eta 0:04:41 lr 0.000505 wd 0.0500 time 0.2348 (0.2487) data time 0.0010 (0.0060) model time 0.2338 (0.2410) loss 4.9946 (4.7687) grad_norm 2.2622 (inf) loss_scale 16384.0000 (17873.4545) mem 7376MB [2024-08-20 00:19:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][130/1251] eta 0:04:38 lr 0.000506 wd 
0.0500 time 0.2392 (0.2482) data time 0.0011 (0.0056) model time 0.2381 (0.2410) loss 5.0977 (4.7602) grad_norm 1.9005 (inf) loss_scale 16384.0000 (17759.7557) mem 7376MB [2024-08-20 00:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][140/1251] eta 0:04:35 lr 0.000506 wd 0.0500 time 0.2424 (0.2478) data time 0.0013 (0.0053) model time 0.2412 (0.2411) loss 4.6634 (4.7593) grad_norm 1.8254 (inf) loss_scale 16384.0000 (17662.1844) mem 7376MB [2024-08-20 00:19:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][150/1251] eta 0:04:32 lr 0.000506 wd 0.0500 time 0.2416 (0.2473) data time 0.0009 (0.0050) model time 0.2406 (0.2409) loss 4.9740 (4.7581) grad_norm 3.6070 (inf) loss_scale 16384.0000 (17577.5364) mem 7376MB [2024-08-20 00:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][160/1251] eta 0:04:29 lr 0.000507 wd 0.0500 time 0.2365 (0.2470) data time 0.0011 (0.0048) model time 0.2354 (0.2409) loss 5.2562 (4.7648) grad_norm 2.1679 (inf) loss_scale 16384.0000 (17503.4037) mem 7376MB [2024-08-20 00:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][170/1251] eta 0:04:26 lr 0.000507 wd 0.0500 time 0.2440 (0.2469) data time 0.0008 (0.0046) model time 0.2432 (0.2412) loss 4.9133 (4.7729) grad_norm 2.4880 (inf) loss_scale 16384.0000 (17437.9415) mem 7376MB [2024-08-20 00:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][180/1251] eta 0:04:24 lr 0.000508 wd 0.0500 time 0.2437 (0.2466) data time 0.0007 (0.0044) model time 0.2430 (0.2411) loss 5.3108 (4.7742) grad_norm 1.8222 (inf) loss_scale 16384.0000 (17379.7127) mem 7376MB [2024-08-20 00:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][190/1251] eta 0:04:21 lr 0.000508 wd 0.0500 time 0.2314 (0.2464) data time 0.0013 (0.0042) model time 0.2301 (0.2411) loss 5.2158 (4.7733) grad_norm 1.9057 (inf) loss_scale 16384.0000 (17327.5812) mem 7376MB [2024-08-20 00:19:24 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][200/1251] eta 0:04:18 lr 0.000508 wd 0.0500 time 0.2371 (0.2463) data time 0.0011 (0.0041) model time 0.2360 (0.2413) loss 4.5496 (4.7640) grad_norm 2.0351 (inf) loss_scale 16384.0000 (17280.6368) mem 7376MB [2024-08-20 00:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][210/1251] eta 0:04:16 lr 0.000509 wd 0.0500 time 0.2393 (0.2460) data time 0.0008 (0.0039) model time 0.2385 (0.2411) loss 5.4527 (4.7661) grad_norm 2.0480 (inf) loss_scale 16384.0000 (17238.1422) mem 7376MB [2024-08-20 00:19:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][220/1251] eta 0:04:13 lr 0.000509 wd 0.0500 time 0.2451 (0.2458) data time 0.0007 (0.0038) model time 0.2444 (0.2412) loss 4.5310 (4.7676) grad_norm 1.6313 (inf) loss_scale 16384.0000 (17199.4932) mem 7376MB [2024-08-20 00:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][230/1251] eta 0:04:10 lr 0.000510 wd 0.0500 time 0.2413 (0.2457) data time 0.0007 (0.0037) model time 0.2406 (0.2412) loss 5.7074 (4.7665) grad_norm 2.3633 (inf) loss_scale 16384.0000 (17164.1905) mem 7376MB [2024-08-20 00:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][240/1251] eta 0:04:08 lr 0.000510 wd 0.0500 time 0.2380 (0.2454) data time 0.0008 (0.0036) model time 0.2373 (0.2410) loss 4.0324 (4.7475) grad_norm 2.0758 (inf) loss_scale 16384.0000 (17131.8174) mem 7376MB [2024-08-20 00:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][250/1251] eta 0:04:05 lr 0.000510 wd 0.0500 time 0.2365 (0.2452) data time 0.0008 (0.0035) model time 0.2357 (0.2409) loss 3.7700 (4.7490) grad_norm 1.6853 (inf) loss_scale 16384.0000 (17102.0239) mem 7376MB [2024-08-20 00:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][260/1251] eta 0:04:02 lr 0.000511 wd 0.0500 time 0.2338 (0.2451) data time 0.0008 (0.0034) model time 0.2330 (0.2410) loss 5.4658 (4.7566) 
grad_norm 2.6420 (inf) loss_scale 16384.0000 (17074.5134) mem 7376MB [2024-08-20 00:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][270/1251] eta 0:04:00 lr 0.000511 wd 0.0500 time 0.2398 (0.2451) data time 0.0012 (0.0033) model time 0.2386 (0.2411) loss 4.3369 (4.7495) grad_norm 2.4137 (inf) loss_scale 16384.0000 (17049.0332) mem 7376MB [2024-08-20 00:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][280/1251] eta 0:03:58 lr 0.000512 wd 0.0500 time 0.2400 (0.2458) data time 0.0012 (0.0032) model time 0.2389 (0.2421) loss 4.6902 (4.7408) grad_norm 2.1029 (inf) loss_scale 16384.0000 (17025.3665) mem 7376MB [2024-08-20 00:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][290/1251] eta 0:03:56 lr 0.000512 wd 0.0500 time 0.2402 (0.2465) data time 0.0012 (0.0031) model time 0.2391 (0.2430) loss 4.7622 (4.7363) grad_norm 2.1940 (inf) loss_scale 16384.0000 (17003.3265) mem 7376MB [2024-08-20 00:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][300/1251] eta 0:03:54 lr 0.000512 wd 0.0500 time 0.2405 (0.2463) data time 0.0011 (0.0031) model time 0.2394 (0.2428) loss 5.3106 (4.7299) grad_norm 4.0156 (inf) loss_scale 16384.0000 (16982.7508) mem 7376MB [2024-08-20 00:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][310/1251] eta 0:03:51 lr 0.000513 wd 0.0500 time 0.2466 (0.2461) data time 0.0008 (0.0030) model time 0.2458 (0.2427) loss 5.3280 (4.7334) grad_norm 2.2113 (inf) loss_scale 16384.0000 (16963.4984) mem 7376MB [2024-08-20 00:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][320/1251] eta 0:03:49 lr 0.000513 wd 0.0500 time 0.2433 (0.2460) data time 0.0008 (0.0030) model time 0.2425 (0.2427) loss 4.3382 (4.7317) grad_norm 2.6085 (inf) loss_scale 16384.0000 (16945.4455) mem 7376MB [2024-08-20 00:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][330/1251] eta 0:03:46 lr 0.000514 wd 0.0500 time 
0.2429 (0.2459) data time 0.0007 (0.0029) model time 0.2422 (0.2427) loss 4.0878 (4.7274) grad_norm 1.9483 (inf) loss_scale 16384.0000 (16928.4834) mem 7376MB [2024-08-20 00:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][340/1251] eta 0:03:43 lr 0.000514 wd 0.0500 time 0.2440 (0.2458) data time 0.0011 (0.0029) model time 0.2429 (0.2426) loss 4.6018 (4.7321) grad_norm 3.2317 (inf) loss_scale 16384.0000 (16912.5161) mem 7376MB [2024-08-20 00:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][350/1251] eta 0:03:41 lr 0.000514 wd 0.0500 time 0.2383 (0.2457) data time 0.0010 (0.0028) model time 0.2374 (0.2426) loss 4.1568 (4.7248) grad_norm 2.4105 (inf) loss_scale 16384.0000 (16897.4587) mem 7376MB [2024-08-20 00:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][360/1251] eta 0:03:38 lr 0.000515 wd 0.0500 time 0.2363 (0.2457) data time 0.0012 (0.0028) model time 0.2351 (0.2426) loss 5.1905 (4.7295) grad_norm 2.4762 (inf) loss_scale 16384.0000 (16883.2355) mem 7376MB [2024-08-20 00:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][370/1251] eta 0:03:36 lr 0.000515 wd 0.0500 time 0.2406 (0.2455) data time 0.0010 (0.0027) model time 0.2396 (0.2425) loss 4.9361 (4.7296) grad_norm 3.5824 (inf) loss_scale 16384.0000 (16869.7790) mem 7376MB [2024-08-20 00:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][380/1251] eta 0:03:33 lr 0.000516 wd 0.0500 time 0.2410 (0.2455) data time 0.0010 (0.0027) model time 0.2400 (0.2424) loss 5.3933 (4.7283) grad_norm 1.9046 (inf) loss_scale 16384.0000 (16857.0289) mem 7376MB [2024-08-20 00:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][390/1251] eta 0:03:31 lr 0.000516 wd 0.0500 time 0.2308 (0.2454) data time 0.0009 (0.0027) model time 0.2299 (0.2424) loss 5.5466 (4.7321) grad_norm 2.1124 (inf) loss_scale 16384.0000 (16844.9309) mem 7376MB [2024-08-20 00:20:13 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [10/300][400/1251] eta 0:03:28 lr 0.000516 wd 0.0500 time 0.2438 (0.2453) data time 0.0007 (0.0026) model time 0.2431 (0.2424) loss 3.6960 (4.7373) grad_norm 3.2584 (inf) loss_scale 16384.0000 (16833.4364) mem 7376MB [2024-08-20 00:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][410/1251] eta 0:03:26 lr 0.000517 wd 0.0500 time 0.2353 (0.2452) data time 0.0011 (0.0026) model time 0.2343 (0.2423) loss 4.8472 (4.7440) grad_norm 1.6882 (inf) loss_scale 16384.0000 (16822.5012) mem 7376MB [2024-08-20 00:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][420/1251] eta 0:03:23 lr 0.000517 wd 0.0500 time 0.2389 (0.2452) data time 0.0012 (0.0026) model time 0.2377 (0.2422) loss 4.8817 (4.7398) grad_norm 2.1532 (inf) loss_scale 16384.0000 (16812.0855) mem 7376MB [2024-08-20 00:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][430/1251] eta 0:03:21 lr 0.000518 wd 0.0500 time 0.2419 (0.2451) data time 0.0010 (0.0026) model time 0.2410 (0.2422) loss 5.2312 (4.7379) grad_norm 1.7390 (inf) loss_scale 16384.0000 (16802.1531) mem 7376MB [2024-08-20 00:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][440/1251] eta 0:03:18 lr 0.000518 wd 0.0500 time 0.2453 (0.2451) data time 0.0010 (0.0025) model time 0.2443 (0.2422) loss 4.9729 (4.7355) grad_norm 2.3283 (inf) loss_scale 16384.0000 (16792.6712) mem 7376MB [2024-08-20 00:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][450/1251] eta 0:03:16 lr 0.000518 wd 0.0500 time 0.2363 (0.2451) data time 0.0011 (0.0025) model time 0.2352 (0.2423) loss 4.7692 (4.7375) grad_norm 2.7144 (inf) loss_scale 16384.0000 (16783.6098) mem 7376MB [2024-08-20 00:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][460/1251] eta 0:03:13 lr 0.000519 wd 0.0500 time 0.2407 (0.2451) data time 0.0010 (0.0025) model time 0.2398 (0.2423) loss 3.4500 (4.7380) grad_norm 1.9349 (inf) 
loss_scale 16384.0000 (16774.9414) mem 7376MB [2024-08-20 00:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][470/1251] eta 0:03:11 lr 0.000519 wd 0.0500 time 0.2362 (0.2450) data time 0.0010 (0.0024) model time 0.2352 (0.2422) loss 3.7976 (4.7327) grad_norm 2.7620 (inf) loss_scale 16384.0000 (16766.6412) mem 7376MB [2024-08-20 00:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][480/1251] eta 0:03:08 lr 0.000520 wd 0.0500 time 0.2432 (0.2449) data time 0.0010 (0.0024) model time 0.2421 (0.2422) loss 4.7092 (4.7240) grad_norm 4.6014 (inf) loss_scale 16384.0000 (16758.6861) mem 7376MB [2024-08-20 00:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][490/1251] eta 0:03:06 lr 0.000520 wd 0.0500 time 0.2418 (0.2449) data time 0.0011 (0.0024) model time 0.2407 (0.2422) loss 3.5848 (4.7189) grad_norm 2.9478 (inf) loss_scale 16384.0000 (16751.0550) mem 7376MB [2024-08-20 00:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][500/1251] eta 0:03:03 lr 0.000520 wd 0.0500 time 0.2439 (0.2449) data time 0.0010 (0.0024) model time 0.2428 (0.2422) loss 4.7719 (4.7201) grad_norm 1.9993 (inf) loss_scale 16384.0000 (16743.7285) mem 7376MB [2024-08-20 00:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][510/1251] eta 0:03:01 lr 0.000521 wd 0.0500 time 0.2468 (0.2448) data time 0.0005 (0.0024) model time 0.2462 (0.2422) loss 5.2808 (4.7254) grad_norm 2.1503 (inf) loss_scale 16384.0000 (16736.6888) mem 7376MB [2024-08-20 00:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][520/1251] eta 0:02:58 lr 0.000521 wd 0.0500 time 0.2411 (0.2448) data time 0.0008 (0.0023) model time 0.2404 (0.2422) loss 3.7142 (4.7250) grad_norm 2.1344 (inf) loss_scale 16384.0000 (16729.9194) mem 7376MB [2024-08-20 00:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][530/1251] eta 0:02:56 lr 0.000522 wd 0.0500 time 0.2436 (0.2448) data time 
0.0008 (0.0023) model time 0.2428 (0.2422) loss 4.7441 (4.7273) grad_norm 2.7917 (inf) loss_scale 16384.0000 (16723.4049) mem 7376MB [2024-08-20 00:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][540/1251] eta 0:02:54 lr 0.000522 wd 0.0500 time 0.2385 (0.2447) data time 0.0008 (0.0023) model time 0.2377 (0.2422) loss 4.8102 (4.7315) grad_norm 1.9828 (inf) loss_scale 16384.0000 (16717.1312) mem 7376MB [2024-08-20 00:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][550/1251] eta 0:02:51 lr 0.000522 wd 0.0500 time 0.2481 (0.2447) data time 0.0010 (0.0023) model time 0.2471 (0.2422) loss 3.4950 (4.7279) grad_norm 2.4536 (inf) loss_scale 16384.0000 (16711.0853) mem 7376MB [2024-08-20 00:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][560/1251] eta 0:02:49 lr 0.000523 wd 0.0500 time 0.2530 (0.2447) data time 0.0011 (0.0023) model time 0.2518 (0.2422) loss 4.6995 (4.7254) grad_norm 2.1449 (inf) loss_scale 16384.0000 (16705.2549) mem 7376MB [2024-08-20 00:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][570/1251] eta 0:02:46 lr 0.000523 wd 0.0500 time 0.2400 (0.2447) data time 0.0010 (0.0023) model time 0.2390 (0.2422) loss 5.2061 (4.7196) grad_norm 2.7518 (inf) loss_scale 16384.0000 (16699.6287) mem 7376MB [2024-08-20 00:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][580/1251] eta 0:02:44 lr 0.000524 wd 0.0500 time 0.2465 (0.2447) data time 0.0010 (0.0022) model time 0.2455 (0.2422) loss 3.5552 (4.7204) grad_norm 2.2871 (inf) loss_scale 16384.0000 (16694.1962) mem 7376MB [2024-08-20 00:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][590/1251] eta 0:02:41 lr 0.000524 wd 0.0500 time 0.2388 (0.2446) data time 0.0007 (0.0022) model time 0.2380 (0.2422) loss 5.2518 (4.7204) grad_norm 2.1518 (inf) loss_scale 16384.0000 (16688.9475) mem 7376MB [2024-08-20 00:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [10/300][600/1251] eta 0:02:39 lr 0.000524 wd 0.0500 time 0.2369 (0.2446) data time 0.0010 (0.0022) model time 0.2360 (0.2421) loss 4.9319 (4.7230) grad_norm 2.7516 (inf) loss_scale 16384.0000 (16683.8735) mem 7376MB [2024-08-20 00:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][610/1251] eta 0:02:36 lr 0.000525 wd 0.0500 time 0.2388 (0.2446) data time 0.0010 (0.0022) model time 0.2378 (0.2421) loss 5.2950 (4.7201) grad_norm 3.1562 (inf) loss_scale 16384.0000 (16678.9656) mem 7376MB [2024-08-20 00:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][620/1251] eta 0:02:34 lr 0.000525 wd 0.0500 time 0.2359 (0.2445) data time 0.0010 (0.0022) model time 0.2349 (0.2421) loss 4.2033 (4.7195) grad_norm 1.7489 (inf) loss_scale 16384.0000 (16674.2158) mem 7376MB [2024-08-20 00:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][630/1251] eta 0:02:31 lr 0.000526 wd 0.0500 time 0.2410 (0.2445) data time 0.0009 (0.0022) model time 0.2401 (0.2421) loss 3.9718 (4.7196) grad_norm 2.2228 (inf) loss_scale 16384.0000 (16669.6165) mem 7376MB [2024-08-20 00:21:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][640/1251] eta 0:02:29 lr 0.000526 wd 0.0500 time 0.2401 (0.2444) data time 0.0010 (0.0022) model time 0.2391 (0.2420) loss 4.8700 (4.7234) grad_norm 2.0724 (inf) loss_scale 16384.0000 (16665.1607) mem 7376MB [2024-08-20 00:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][650/1251] eta 0:02:26 lr 0.000526 wd 0.0500 time 0.2335 (0.2444) data time 0.0012 (0.0021) model time 0.2323 (0.2420) loss 4.6523 (4.7210) grad_norm 3.8828 (inf) loss_scale 16384.0000 (16660.8418) mem 7376MB [2024-08-20 00:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][660/1251] eta 0:02:24 lr 0.000527 wd 0.0500 time 0.2424 (0.2444) data time 0.0008 (0.0021) model time 0.2416 (0.2420) loss 4.3524 (4.7177) grad_norm 2.4025 (inf) loss_scale 16384.0000 
(16656.6536) mem 7376MB [2024-08-20 00:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][670/1251] eta 0:02:21 lr 0.000527 wd 0.0500 time 0.2409 (0.2444) data time 0.0011 (0.0021) model time 0.2398 (0.2420) loss 4.7909 (4.7195) grad_norm 2.0343 (inf) loss_scale 16384.0000 (16652.5902) mem 7376MB [2024-08-20 00:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][680/1251] eta 0:02:19 lr 0.000528 wd 0.0500 time 0.2402 (0.2443) data time 0.0011 (0.0021) model time 0.2392 (0.2420) loss 4.4918 (4.7169) grad_norm 1.9161 (inf) loss_scale 16384.0000 (16648.6461) mem 7376MB [2024-08-20 00:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][690/1251] eta 0:02:17 lr 0.000528 wd 0.0500 time 0.2353 (0.2443) data time 0.0010 (0.0021) model time 0.2343 (0.2420) loss 4.9477 (4.7173) grad_norm 2.9727 (inf) loss_scale 16384.0000 (16644.8162) mem 7376MB [2024-08-20 00:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][700/1251] eta 0:02:14 lr 0.000528 wd 0.0500 time 0.2414 (0.2442) data time 0.0008 (0.0021) model time 0.2406 (0.2419) loss 5.2304 (4.7189) grad_norm 1.9983 (inf) loss_scale 16384.0000 (16641.0956) mem 7376MB [2024-08-20 00:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][710/1251] eta 0:02:12 lr 0.000529 wd 0.0500 time 0.2381 (0.2442) data time 0.0008 (0.0021) model time 0.2373 (0.2419) loss 4.3120 (4.7213) grad_norm 2.9418 (inf) loss_scale 16384.0000 (16637.4796) mem 7376MB [2024-08-20 00:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][720/1251] eta 0:02:09 lr 0.000529 wd 0.0500 time 0.2413 (0.2441) data time 0.0009 (0.0021) model time 0.2405 (0.2418) loss 5.6096 (4.7220) grad_norm 2.0214 (inf) loss_scale 16384.0000 (16633.9639) mem 7376MB [2024-08-20 00:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][730/1251] eta 0:02:07 lr 0.000530 wd 0.0500 time 0.2381 (0.2441) data time 0.0008 (0.0021) model 
time 0.2373 (0.2418) loss 3.5298 (4.7182) grad_norm 2.0883 (inf) loss_scale 16384.0000 (16630.5445) mem 7376MB [2024-08-20 00:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][740/1251] eta 0:02:04 lr 0.000530 wd 0.0500 time 0.2458 (0.2441) data time 0.0011 (0.0020) model time 0.2447 (0.2418) loss 4.8206 (4.7131) grad_norm 1.8597 (inf) loss_scale 16384.0000 (16627.2173) mem 7376MB [2024-08-20 00:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][750/1251] eta 0:02:02 lr 0.000530 wd 0.0500 time 0.2392 (0.2441) data time 0.0007 (0.0020) model time 0.2385 (0.2418) loss 4.3540 (4.7105) grad_norm 2.3420 (inf) loss_scale 16384.0000 (16623.9787) mem 7376MB [2024-08-20 00:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][760/1251] eta 0:01:59 lr 0.000531 wd 0.0500 time 0.2486 (0.2441) data time 0.0011 (0.0020) model time 0.2475 (0.2418) loss 3.4875 (4.7079) grad_norm 2.3096 (inf) loss_scale 16384.0000 (16620.8252) mem 7376MB [2024-08-20 00:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][770/1251] eta 0:01:57 lr 0.000531 wd 0.0500 time 0.2459 (0.2440) data time 0.0010 (0.0020) model time 0.2450 (0.2418) loss 5.3494 (4.7053) grad_norm 2.5423 (inf) loss_scale 16384.0000 (16617.7536) mem 7376MB [2024-08-20 00:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][780/1251] eta 0:01:54 lr 0.000532 wd 0.0500 time 0.2418 (0.2440) data time 0.0011 (0.0020) model time 0.2407 (0.2418) loss 4.7987 (4.7074) grad_norm 3.3138 (inf) loss_scale 16384.0000 (16614.7606) mem 7376MB [2024-08-20 00:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][790/1251] eta 0:01:52 lr 0.000532 wd 0.0500 time 0.2377 (0.2440) data time 0.0008 (0.0020) model time 0.2370 (0.2418) loss 4.4931 (4.7050) grad_norm 1.7241 (inf) loss_scale 16384.0000 (16611.8432) mem 7376MB [2024-08-20 00:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][800/1251] 
eta 0:01:50 lr 0.000532 wd 0.0500 time 0.2417 (0.2440) data time 0.0011 (0.0020) model time 0.2406 (0.2418) loss 3.1492 (4.6990) grad_norm 2.0070 (inf) loss_scale 16384.0000 (16608.9988) mem 7376MB [2024-08-20 00:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][810/1251] eta 0:01:47 lr 0.000533 wd 0.0500 time 0.2410 (0.2440) data time 0.0011 (0.0020) model time 0.2399 (0.2418) loss 4.9008 (4.6958) grad_norm 2.2987 (inf) loss_scale 16384.0000 (16606.2244) mem 7376MB [2024-08-20 00:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][820/1251] eta 0:01:45 lr 0.000533 wd 0.0500 time 0.4492 (0.2442) data time 0.0010 (0.0020) model time 0.4483 (0.2420) loss 5.6580 (4.6963) grad_norm 1.9318 (inf) loss_scale 16384.0000 (16603.5177) mem 7376MB [2024-08-20 00:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][830/1251] eta 0:01:42 lr 0.000534 wd 0.0500 time 0.2530 (0.2442) data time 0.0007 (0.0019) model time 0.2522 (0.2420) loss 5.3327 (4.6966) grad_norm 2.5642 (inf) loss_scale 16384.0000 (16600.8761) mem 7376MB [2024-08-20 00:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][840/1251] eta 0:01:40 lr 0.000534 wd 0.0500 time 0.2395 (0.2441) data time 0.0012 (0.0019) model time 0.2383 (0.2420) loss 4.8974 (4.6975) grad_norm 2.5671 (inf) loss_scale 16384.0000 (16598.2973) mem 7376MB [2024-08-20 00:22:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][850/1251] eta 0:01:37 lr 0.000534 wd 0.0500 time 0.2510 (0.2441) data time 0.0007 (0.0019) model time 0.2503 (0.2420) loss 3.4477 (4.6948) grad_norm 1.6885 (inf) loss_scale 16384.0000 (16595.7791) mem 7376MB [2024-08-20 00:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][860/1251] eta 0:01:35 lr 0.000535 wd 0.0500 time 0.2465 (0.2441) data time 0.0011 (0.0019) model time 0.2454 (0.2420) loss 4.9844 (4.6978) grad_norm 1.8283 (inf) loss_scale 16384.0000 (16593.3194) mem 7376MB [2024-08-20 
00:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][870/1251] eta 0:01:32 lr 0.000535 wd 0.0500 time 0.2539 (0.2441) data time 0.0010 (0.0019) model time 0.2529 (0.2420) loss 5.3415 (4.7014) grad_norm 2.3262 (inf) loss_scale 16384.0000 (16590.9162) mem 7376MB [2024-08-20 00:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][880/1251] eta 0:01:30 lr 0.000536 wd 0.0500 time 0.2454 (0.2441) data time 0.0011 (0.0019) model time 0.2444 (0.2420) loss 4.5323 (4.7001) grad_norm 4.0006 (inf) loss_scale 16384.0000 (16588.5675) mem 7376MB [2024-08-20 00:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][890/1251] eta 0:01:28 lr 0.000536 wd 0.0500 time 0.2521 (0.2440) data time 0.0009 (0.0019) model time 0.2513 (0.2420) loss 3.9386 (4.6953) grad_norm 1.6378 (inf) loss_scale 16384.0000 (16586.2716) mem 7376MB [2024-08-20 00:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][900/1251] eta 0:01:25 lr 0.000536 wd 0.0500 time 0.2440 (0.2440) data time 0.0008 (0.0019) model time 0.2432 (0.2420) loss 3.4906 (4.6881) grad_norm 1.8598 (inf) loss_scale 16384.0000 (16584.0266) mem 7376MB [2024-08-20 00:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][910/1251] eta 0:01:23 lr 0.000537 wd 0.0500 time 0.2436 (0.2440) data time 0.0010 (0.0019) model time 0.2426 (0.2419) loss 4.6344 (4.6883) grad_norm 2.4593 (inf) loss_scale 16384.0000 (16581.8310) mem 7376MB [2024-08-20 00:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][920/1251] eta 0:01:20 lr 0.000537 wd 0.0500 time 0.2577 (0.2439) data time 0.0009 (0.0019) model time 0.2568 (0.2419) loss 4.5003 (4.6890) grad_norm 2.3101 (inf) loss_scale 16384.0000 (16579.6830) mem 7376MB [2024-08-20 00:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][930/1251] eta 0:01:18 lr 0.000538 wd 0.0500 time 0.2450 (0.2439) data time 0.0010 (0.0019) model time 0.2440 (0.2419) loss 3.8864 
(4.6931) grad_norm 2.1051 (inf) loss_scale 16384.0000 (16577.5811) mem 7376MB [2024-08-20 00:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][940/1251] eta 0:01:15 lr 0.000538 wd 0.0500 time 0.2406 (0.2439) data time 0.0007 (0.0019) model time 0.2399 (0.2419) loss 4.0406 (4.6940) grad_norm 1.9882 (inf) loss_scale 16384.0000 (16575.5239) mem 7376MB [2024-08-20 00:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][950/1251] eta 0:01:13 lr 0.000538 wd 0.0500 time 0.2531 (0.2439) data time 0.0008 (0.0018) model time 0.2523 (0.2419) loss 4.2455 (4.6940) grad_norm 1.5116 (inf) loss_scale 16384.0000 (16573.5100) mem 7376MB [2024-08-20 00:22:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][960/1251] eta 0:01:10 lr 0.000539 wd 0.0500 time 0.2490 (0.2439) data time 0.0009 (0.0018) model time 0.2481 (0.2419) loss 4.8464 (4.6953) grad_norm 2.9769 (inf) loss_scale 16384.0000 (16571.5380) mem 7376MB [2024-08-20 00:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][970/1251] eta 0:01:08 lr 0.000539 wd 0.0500 time 0.2531 (0.2439) data time 0.0007 (0.0018) model time 0.2524 (0.2419) loss 4.1362 (4.6932) grad_norm 1.6154 (inf) loss_scale 16384.0000 (16569.6066) mem 7376MB [2024-08-20 00:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][980/1251] eta 0:01:06 lr 0.000540 wd 0.0500 time 0.2341 (0.2441) data time 0.0010 (0.0018) model time 0.2331 (0.2421) loss 5.0060 (4.6932) grad_norm 1.7939 (inf) loss_scale 16384.0000 (16567.7146) mem 7376MB [2024-08-20 00:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][990/1251] eta 0:01:03 lr 0.000540 wd 0.0500 time 0.2362 (0.2440) data time 0.0007 (0.0018) model time 0.2355 (0.2421) loss 4.8662 (4.6916) grad_norm 2.1554 (inf) loss_scale 16384.0000 (16565.8607) mem 7376MB [2024-08-20 00:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1000/1251] eta 0:01:01 lr 0.000540 wd 
0.0500 time 0.2376 (0.2440) data time 0.0008 (0.0018) model time 0.2367 (0.2420) loss 5.1966 (4.6924) grad_norm 4.4944 (inf) loss_scale 16384.0000 (16564.0440) mem 7376MB [2024-08-20 00:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1010/1251] eta 0:00:58 lr 0.000541 wd 0.0500 time 0.2394 (0.2440) data time 0.0010 (0.0018) model time 0.2384 (0.2420) loss 3.5440 (4.6873) grad_norm 2.8424 (inf) loss_scale 16384.0000 (16562.2631) mem 7376MB [2024-08-20 00:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1020/1251] eta 0:00:56 lr 0.000541 wd 0.0500 time 0.2395 (0.2440) data time 0.0014 (0.0018) model time 0.2381 (0.2420) loss 4.5652 (4.6853) grad_norm 2.0114 (inf) loss_scale 16384.0000 (16560.5171) mem 7376MB [2024-08-20 00:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1030/1251] eta 0:00:53 lr 0.000542 wd 0.0500 time 0.2396 (0.2440) data time 0.0007 (0.0018) model time 0.2389 (0.2420) loss 4.2499 (4.6839) grad_norm 2.5759 (inf) loss_scale 16384.0000 (16558.8050) mem 7376MB [2024-08-20 00:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1040/1251] eta 0:00:51 lr 0.000542 wd 0.0500 time 0.2423 (0.2439) data time 0.0008 (0.0018) model time 0.2415 (0.2420) loss 3.8627 (4.6827) grad_norm 3.3427 (inf) loss_scale 16384.0000 (16557.1258) mem 7376MB [2024-08-20 00:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1050/1251] eta 0:00:49 lr 0.000542 wd 0.0500 time 0.2365 (0.2439) data time 0.0013 (0.0018) model time 0.2352 (0.2420) loss 4.6508 (4.6821) grad_norm 1.8158 (inf) loss_scale 16384.0000 (16555.4786) mem 7376MB [2024-08-20 00:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1060/1251] eta 0:00:46 lr 0.000543 wd 0.0500 time 0.2350 (0.2439) data time 0.0009 (0.0018) model time 0.2341 (0.2420) loss 4.5701 (4.6792) grad_norm 2.8678 (inf) loss_scale 16384.0000 (16553.8624) mem 7376MB [2024-08-20 00:22:56 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1070/1251] eta 0:00:44 lr 0.000543 wd 0.0500 time 0.2496 (0.2438) data time 0.0012 (0.0018) model time 0.2484 (0.2419) loss 4.8339 (4.6777) grad_norm 2.7427 (inf) loss_scale 16384.0000 (16552.2764) mem 7376MB [2024-08-20 00:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1080/1251] eta 0:00:41 lr 0.000544 wd 0.0500 time 0.2485 (0.2439) data time 0.0010 (0.0018) model time 0.2475 (0.2420) loss 4.3799 (4.6774) grad_norm 2.4231 (inf) loss_scale 16384.0000 (16550.7197) mem 7376MB [2024-08-20 00:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1090/1251] eta 0:00:39 lr 0.000544 wd 0.0500 time 0.2414 (0.2438) data time 0.0012 (0.0018) model time 0.2402 (0.2419) loss 3.6013 (4.6736) grad_norm 1.6594 (inf) loss_scale 16384.0000 (16549.1916) mem 7376MB [2024-08-20 00:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1100/1251] eta 0:00:36 lr 0.000544 wd 0.0500 time 0.2376 (0.2438) data time 0.0009 (0.0017) model time 0.2367 (0.2419) loss 4.8326 (4.6723) grad_norm 2.7112 (inf) loss_scale 16384.0000 (16547.6912) mem 7376MB [2024-08-20 00:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1110/1251] eta 0:00:34 lr 0.000545 wd 0.0500 time 0.2446 (0.2438) data time 0.0009 (0.0017) model time 0.2437 (0.2419) loss 5.0529 (4.6717) grad_norm 2.6469 (inf) loss_scale 16384.0000 (16546.2178) mem 7376MB [2024-08-20 00:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1120/1251] eta 0:00:31 lr 0.000545 wd 0.0500 time 0.2453 (0.2438) data time 0.0012 (0.0017) model time 0.2441 (0.2419) loss 4.8268 (4.6734) grad_norm 2.2804 (inf) loss_scale 16384.0000 (16544.7707) mem 7376MB [2024-08-20 00:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1130/1251] eta 0:00:29 lr 0.000546 wd 0.0500 time 0.2386 (0.2438) data time 0.0008 (0.0017) model time 0.2379 (0.2419) loss 4.0308 
(4.6695) grad_norm 2.0895 (inf) loss_scale 16384.0000 (16543.3492) mem 7376MB [2024-08-20 00:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1140/1251] eta 0:00:27 lr 0.000546 wd 0.0500 time 0.2340 (0.2438) data time 0.0012 (0.0017) model time 0.2328 (0.2419) loss 4.3409 (4.6640) grad_norm 2.2704 (inf) loss_scale 16384.0000 (16541.9527) mem 7376MB [2024-08-20 00:23:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1150/1251] eta 0:00:24 lr 0.000546 wd 0.0500 time 0.2418 (0.2438) data time 0.0010 (0.0017) model time 0.2408 (0.2419) loss 4.0162 (4.6617) grad_norm 2.9239 (inf) loss_scale 16384.0000 (16540.5804) mem 7376MB [2024-08-20 00:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1160/1251] eta 0:00:22 lr 0.000547 wd 0.0500 time 0.2452 (0.2437) data time 0.0010 (0.0017) model time 0.2442 (0.2419) loss 4.0251 (4.6630) grad_norm 2.7623 (inf) loss_scale 16384.0000 (16539.2317) mem 7376MB [2024-08-20 00:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1170/1251] eta 0:00:19 lr 0.000547 wd 0.0500 time 0.2446 (0.2438) data time 0.0010 (0.0017) model time 0.2436 (0.2419) loss 4.9036 (4.6646) grad_norm 2.4027 (inf) loss_scale 16384.0000 (16537.9061) mem 7376MB [2024-08-20 00:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1180/1251] eta 0:00:17 lr 0.000548 wd 0.0500 time 0.2383 (0.2437) data time 0.0010 (0.0017) model time 0.2373 (0.2419) loss 3.9052 (4.6624) grad_norm 2.6659 (inf) loss_scale 16384.0000 (16536.6029) mem 7376MB [2024-08-20 00:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1190/1251] eta 0:00:14 lr 0.000548 wd 0.0500 time 0.2440 (0.2437) data time 0.0008 (0.0017) model time 0.2432 (0.2419) loss 4.1856 (4.6622) grad_norm 2.6944 (inf) loss_scale 16384.0000 (16535.3216) mem 7376MB [2024-08-20 00:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1200/1251] eta 0:00:12 lr 0.000548 wd 
0.0500 time 0.2411 (0.2437) data time 0.0008 (0.0017) model time 0.2403 (0.2419) loss 5.3521 (4.6624) grad_norm 2.5923 (inf) loss_scale 16384.0000 (16534.0616) mem 7376MB [2024-08-20 00:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1210/1251] eta 0:00:09 lr 0.000549 wd 0.0500 time 0.2421 (0.2439) data time 0.0012 (0.0017) model time 0.2410 (0.2420) loss 4.9394 (4.6617) grad_norm 1.9075 (inf) loss_scale 16384.0000 (16532.8225) mem 7376MB [2024-08-20 00:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1220/1251] eta 0:00:07 lr 0.000549 wd 0.0500 time 0.2380 (0.2440) data time 0.0013 (0.0017) model time 0.2367 (0.2422) loss 4.7912 (4.6610) grad_norm 2.1611 (inf) loss_scale 16384.0000 (16531.6036) mem 7376MB [2024-08-20 00:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1230/1251] eta 0:00:05 lr 0.000550 wd 0.0500 time 0.2490 (0.2440) data time 0.0009 (0.0017) model time 0.2482 (0.2422) loss 4.5004 (4.6618) grad_norm 3.4413 (inf) loss_scale 16384.0000 (16530.4045) mem 7376MB [2024-08-20 00:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1240/1251] eta 0:00:02 lr 0.000550 wd 0.0500 time 0.2207 (0.2439) data time 0.0007 (0.0017) model time 0.2199 (0.2421) loss 4.8459 (4.6631) grad_norm 1.7622 (inf) loss_scale 16384.0000 (16529.2248) mem 7376MB [2024-08-20 00:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [10/300][1250/1251] eta 0:00:00 lr 0.000550 wd 0.0500 time 0.2222 (0.2438) data time 0.0007 (0.0017) model time 0.2215 (0.2420) loss 4.7988 (4.6636) grad_norm 1.8595 (inf) loss_scale 16384.0000 (16528.0639) mem 7376MB [2024-08-20 00:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 10 training takes 0:05:04 [2024-08-20 00:23:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-20 00:23:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 00:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.458 (0.458) Loss 1.1660 (1.1660) Acc@1 73.828 (73.828) Acc@5 92.090 (92.090) Mem 7376MB [2024-08-20 00:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.109) Loss 1.4814 (1.5130) Acc@1 62.402 (64.213) Acc@5 90.137 (87.837) Mem 7376MB [2024-08-20 00:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.095) Loss 2.2969 (1.5484) Acc@1 51.562 (63.449) Acc@5 75.000 (87.444) Mem 7376MB [2024-08-20 00:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.089) Loss 2.5918 (1.7742) Acc@1 45.605 (59.413) Acc@5 69.043 (83.660) Mem 7376MB [2024-08-20 00:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 2.4980 (1.9107) Acc@1 44.629 (57.026) Acc@5 71.875 (81.362) Mem 7376MB [2024-08-20 00:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 57.040 Acc@5 81.238 [2024-08-20 00:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 57.0% [2024-08-20 00:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 57.04% [2024-08-20 00:23:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-20 00:23:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-20 00:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.432 (0.432) Loss 5.3086 (5.3086) Acc@1 6.934 (6.934) Acc@5 25.391 (25.391) Mem 7376MB [2024-08-20 00:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.110) Loss 5.9297 (5.7141) Acc@1 4.883 (6.357) Acc@5 15.527 (17.791) Mem 7376MB [2024-08-20 00:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.095) Loss 5.8672 (5.7009) Acc@1 5.273 (6.580) Acc@5 15.430 (18.252) Mem 7376MB [2024-08-20 00:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.090) Loss 5.6875 (5.7056) Acc@1 7.812 (6.685) Acc@5 19.043 (18.224) Mem 7376MB [2024-08-20 00:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 5.8750 (5.7122) Acc@1 3.711 (6.479) Acc@5 12.500 (17.964) Mem 7376MB [2024-08-20 00:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 7.118 Acc@5 19.142 [2024-08-20 00:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 7.1% [2024-08-20 00:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 7.12% [2024-08-20 00:23:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-20 00:23:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-20 00:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][0/1251] eta 0:15:19 lr 0.000550 wd 0.0500 time 0.7354 (0.7354) data time 0.5059 (0.5059) model time 0.0000 (0.0000) loss 5.0663 (5.0663) grad_norm 1.8828 (1.8828) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][10/1251] eta 0:05:58 lr 0.000551 wd 0.0500 time 0.2523 (0.2890) data time 0.0008 (0.0469) model time 0.0000 (0.0000) loss 5.2592 (4.6438) grad_norm 2.0843 (2.1097) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][20/1251] eta 0:05:28 lr 0.000551 wd 0.0500 time 0.2403 (0.2668) data time 0.0012 (0.0251) model time 0.0000 (0.0000) loss 4.9552 (4.7767) grad_norm 2.4542 (2.1478) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][30/1251] eta 0:05:16 lr 0.000552 wd 0.0500 time 0.2433 (0.2592) data time 0.0010 (0.0175) model time 0.0000 (0.0000) loss 4.1292 (4.6637) grad_norm 2.2791 (2.2873) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][40/1251] eta 0:05:08 lr 0.000552 wd 0.0500 time 0.2426 (0.2550) data time 0.0011 (0.0135) model time 0.0000 (0.0000) loss 4.0600 (4.6343) grad_norm 2.8466 (2.2706) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][50/1251] eta 0:05:03 lr 0.000552 wd 0.0500 time 0.2542 (0.2525) data time 0.0008 (0.0111) model time 0.0000 (0.0000) loss 4.5933 (4.5794) grad_norm 1.6648 (2.2390) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][60/1251] eta 0:04:58 lr 0.000553 wd 0.0500 time 0.2394 (0.2509) data time 0.0009 (0.0095) model time 0.2385 
(0.2418) loss 5.1253 (4.5479) grad_norm 2.0787 (2.2327) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][70/1251] eta 0:04:55 lr 0.000553 wd 0.0500 time 0.2505 (0.2498) data time 0.0008 (0.0083) model time 0.2497 (0.2419) loss 3.6172 (4.5381) grad_norm 2.3339 (2.2429) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][80/1251] eta 0:04:51 lr 0.000554 wd 0.0500 time 0.2394 (0.2490) data time 0.0009 (0.0074) model time 0.2385 (0.2420) loss 5.2158 (4.5842) grad_norm 2.5331 (2.3200) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][90/1251] eta 0:04:48 lr 0.000554 wd 0.0500 time 0.2476 (0.2483) data time 0.0010 (0.0067) model time 0.2467 (0.2417) loss 4.6542 (4.5780) grad_norm 2.0061 (2.3177) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][100/1251] eta 0:04:45 lr 0.000554 wd 0.0500 time 0.2545 (0.2478) data time 0.0009 (0.0062) model time 0.2536 (0.2418) loss 4.4822 (4.5863) grad_norm 2.2595 (2.3131) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][110/1251] eta 0:04:42 lr 0.000555 wd 0.0500 time 0.2369 (0.2472) data time 0.0010 (0.0057) model time 0.2358 (0.2416) loss 5.0835 (4.5969) grad_norm 2.3075 (2.2898) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][120/1251] eta 0:04:39 lr 0.000555 wd 0.0500 time 0.2409 (0.2467) data time 0.0008 (0.0054) model time 0.2401 (0.2414) loss 4.9521 (4.5737) grad_norm 1.6976 (2.2774) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[11/300][130/1251] eta 0:04:36 lr 0.000556 wd 0.0500 time 0.2552 (0.2465) data time 0.0010 (0.0050) model time 0.2542 (0.2415) loss 5.2514 (4.5877) grad_norm 2.8619 (2.2783) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][140/1251] eta 0:04:33 lr 0.000556 wd 0.0500 time 0.2405 (0.2460) data time 0.0014 (0.0047) model time 0.2391 (0.2412) loss 4.3663 (4.5797) grad_norm 1.7642 (2.2997) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][150/1251] eta 0:04:30 lr 0.000556 wd 0.0500 time 0.2468 (0.2457) data time 0.0010 (0.0045) model time 0.2459 (0.2410) loss 4.8848 (4.5814) grad_norm 1.6635 (2.3148) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][160/1251] eta 0:04:27 lr 0.000557 wd 0.0500 time 0.2477 (0.2454) data time 0.0011 (0.0043) model time 0.2466 (0.2409) loss 5.0470 (4.5592) grad_norm 3.0530 (2.3126) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][170/1251] eta 0:04:25 lr 0.000557 wd 0.0500 time 0.2485 (0.2453) data time 0.0011 (0.0041) model time 0.2474 (0.2411) loss 4.6411 (4.5580) grad_norm 1.8848 (2.2978) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][180/1251] eta 0:04:22 lr 0.000558 wd 0.0500 time 0.2433 (0.2451) data time 0.0011 (0.0040) model time 0.2422 (0.2411) loss 4.8062 (4.5582) grad_norm 2.5705 (2.3050) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][190/1251] eta 0:04:21 lr 0.000558 wd 0.0500 time 0.2436 (0.2460) data time 0.0010 (0.0038) model time 0.2426 (0.2425) loss 4.6426 (4.5437) grad_norm 2.6484 (2.3062) loss_scale 
16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][200/1251] eta 0:04:18 lr 0.000558 wd 0.0500 time 0.2344 (0.2458) data time 0.0009 (0.0037) model time 0.2336 (0.2424) loss 3.7733 (4.5360) grad_norm 2.5267 (2.3039) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][210/1251] eta 0:04:15 lr 0.000559 wd 0.0500 time 0.2474 (0.2458) data time 0.0012 (0.0036) model time 0.2463 (0.2425) loss 4.0936 (4.5365) grad_norm 2.1159 (2.2972) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][220/1251] eta 0:04:13 lr 0.000559 wd 0.0500 time 0.2390 (0.2459) data time 0.0011 (0.0034) model time 0.2379 (0.2427) loss 5.0356 (4.5382) grad_norm 2.2441 (2.3098) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][230/1251] eta 0:04:10 lr 0.000560 wd 0.0500 time 0.2509 (0.2458) data time 0.0007 (0.0034) model time 0.2501 (0.2427) loss 5.1390 (4.5545) grad_norm 2.0283 (2.3042) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][240/1251] eta 0:04:08 lr 0.000560 wd 0.0500 time 0.2368 (0.2455) data time 0.0009 (0.0033) model time 0.2359 (0.2425) loss 5.4706 (4.5650) grad_norm 1.9459 (2.3107) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][250/1251] eta 0:04:05 lr 0.000560 wd 0.0500 time 0.2477 (0.2454) data time 0.0009 (0.0032) model time 0.2468 (0.2424) loss 4.6467 (4.5761) grad_norm 2.1003 (2.3100) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][260/1251] eta 0:04:03 lr 0.000561 wd 0.0500 time 0.2420 (0.2453) 
data time 0.0010 (0.0031) model time 0.2410 (0.2424) loss 4.2895 (4.5742) grad_norm 1.6910 (2.3013) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][270/1251] eta 0:04:00 lr 0.000561 wd 0.0500 time 0.2333 (0.2452) data time 0.0008 (0.0031) model time 0.2326 (0.2423) loss 3.8645 (4.5752) grad_norm 1.8560 (2.3057) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][280/1251] eta 0:03:57 lr 0.000562 wd 0.0500 time 0.2415 (0.2451) data time 0.0011 (0.0030) model time 0.2404 (0.2422) loss 4.4403 (4.5751) grad_norm 2.5649 (2.3027) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][290/1251] eta 0:03:55 lr 0.000562 wd 0.0500 time 0.2438 (0.2451) data time 0.0010 (0.0030) model time 0.2428 (0.2423) loss 4.6831 (4.5775) grad_norm 2.2421 (2.3012) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][300/1251] eta 0:03:53 lr 0.000562 wd 0.0500 time 0.2422 (0.2450) data time 0.0008 (0.0029) model time 0.2414 (0.2422) loss 5.0067 (4.5718) grad_norm 1.9956 (2.3038) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][310/1251] eta 0:03:50 lr 0.000563 wd 0.0500 time 0.2415 (0.2449) data time 0.0010 (0.0029) model time 0.2405 (0.2421) loss 5.1934 (4.5832) grad_norm 2.1879 (2.2996) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][320/1251] eta 0:03:47 lr 0.000563 wd 0.0500 time 0.2373 (0.2448) data time 0.0010 (0.0028) model time 0.2364 (0.2421) loss 5.5746 (4.5886) grad_norm 1.9905 (2.2971) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:11 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [11/300][330/1251] eta 0:03:45 lr 0.000564 wd 0.0500 time 0.2444 (0.2448) data time 0.0013 (0.0028) model time 0.2431 (0.2421) loss 3.8220 (4.5929) grad_norm 3.1648 (2.3031) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][340/1251] eta 0:03:42 lr 0.000564 wd 0.0500 time 0.2425 (0.2447) data time 0.0008 (0.0027) model time 0.2417 (0.2420) loss 5.1109 (4.5969) grad_norm 1.8691 (2.2956) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][350/1251] eta 0:03:40 lr 0.000564 wd 0.0500 time 0.2478 (0.2446) data time 0.0008 (0.0027) model time 0.2470 (0.2420) loss 3.0653 (4.5862) grad_norm 1.8579 (2.3002) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][360/1251] eta 0:03:37 lr 0.000565 wd 0.0500 time 0.2356 (0.2445) data time 0.0011 (0.0026) model time 0.2345 (0.2419) loss 4.7304 (4.5866) grad_norm 1.6665 (2.3182) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][370/1251] eta 0:03:35 lr 0.000565 wd 0.0500 time 0.2438 (0.2446) data time 0.0009 (0.0026) model time 0.2429 (0.2420) loss 4.4059 (4.5892) grad_norm 1.8990 (2.3207) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][380/1251] eta 0:03:32 lr 0.000566 wd 0.0500 time 0.2369 (0.2445) data time 0.0011 (0.0026) model time 0.2358 (0.2420) loss 5.3599 (4.5906) grad_norm 3.3133 (2.3381) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][390/1251] eta 0:03:30 lr 0.000566 wd 0.0500 time 0.2373 (0.2445) data time 0.0010 (0.0026) model time 0.2363 (0.2420) loss 4.7140 (4.5962) 
grad_norm 2.4829 (2.3424) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][400/1251] eta 0:03:28 lr 0.000566 wd 0.0500 time 0.2145 (0.2449) data time 0.0010 (0.0025) model time 0.2136 (0.2425) loss 5.0108 (4.5974) grad_norm 1.8753 (2.3417) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][410/1251] eta 0:03:25 lr 0.000567 wd 0.0500 time 0.2469 (0.2449) data time 0.0010 (0.0025) model time 0.2459 (0.2425) loss 5.0139 (4.5931) grad_norm 2.4925 (2.3424) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][420/1251] eta 0:03:23 lr 0.000567 wd 0.0500 time 0.2401 (0.2449) data time 0.0009 (0.0025) model time 0.2392 (0.2426) loss 5.3417 (4.5894) grad_norm 1.7048 (2.3480) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][430/1251] eta 0:03:21 lr 0.000568 wd 0.0500 time 0.2416 (0.2449) data time 0.0008 (0.0025) model time 0.2408 (0.2425) loss 4.9013 (4.5980) grad_norm 3.2957 (2.3493) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][440/1251] eta 0:03:18 lr 0.000568 wd 0.0500 time 0.2419 (0.2450) data time 0.0011 (0.0025) model time 0.2407 (0.2426) loss 3.4690 (4.5996) grad_norm 2.2126 (2.3511) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][450/1251] eta 0:03:16 lr 0.000568 wd 0.0500 time 0.2420 (0.2455) data time 0.0009 (0.0025) model time 0.2411 (0.2432) loss 4.4787 (4.5898) grad_norm 2.2103 (2.3521) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][460/1251] eta 0:03:14 lr 
0.000569 wd 0.0500 time 0.2539 (0.2454) data time 0.0008 (0.0025) model time 0.2531 (0.2432) loss 5.1643 (4.5873) grad_norm 2.1224 (2.3511) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][470/1251] eta 0:03:11 lr 0.000569 wd 0.0500 time 0.2437 (0.2453) data time 0.0011 (0.0024) model time 0.2426 (0.2431) loss 5.5274 (4.5833) grad_norm 2.3944 (2.3495) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][480/1251] eta 0:03:09 lr 0.000570 wd 0.0500 time 0.2548 (0.2452) data time 0.0007 (0.0024) model time 0.2541 (0.2430) loss 5.0096 (4.5854) grad_norm 1.9144 (2.3492) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][490/1251] eta 0:03:06 lr 0.000570 wd 0.0500 time 0.2494 (0.2452) data time 0.0008 (0.0024) model time 0.2485 (0.2430) loss 5.0533 (4.5882) grad_norm 1.9593 (2.3613) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][500/1251] eta 0:03:04 lr 0.000570 wd 0.0500 time 0.2298 (0.2455) data time 0.0011 (0.0024) model time 0.2287 (0.2434) loss 4.8356 (4.5840) grad_norm 2.1637 (2.3551) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][510/1251] eta 0:03:01 lr 0.000571 wd 0.0500 time 0.2376 (0.2455) data time 0.0011 (0.0023) model time 0.2365 (0.2434) loss 4.9819 (4.5833) grad_norm 2.1290 (2.3549) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][520/1251] eta 0:02:59 lr 0.000571 wd 0.0500 time 0.2336 (0.2455) data time 0.0011 (0.0023) model time 0.2325 (0.2434) loss 4.8776 (4.5824) grad_norm 2.4134 (2.3647) loss_scale 16384.0000 (16384.0000) mem 7376MB 
[2024-08-20 00:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][530/1251] eta 0:02:56 lr 0.000572 wd 0.0500 time 0.2422 (0.2454) data time 0.0012 (0.0023) model time 0.2410 (0.2433) loss 4.7279 (4.5866) grad_norm 1.8443 (2.3650) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][540/1251] eta 0:02:54 lr 0.000572 wd 0.0500 time 0.2396 (0.2454) data time 0.0010 (0.0023) model time 0.2387 (0.2433) loss 4.2653 (4.5858) grad_norm 2.6330 (2.3613) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][550/1251] eta 0:02:52 lr 0.000572 wd 0.0500 time 0.2489 (0.2454) data time 0.0011 (0.0023) model time 0.2478 (0.2433) loss 4.8946 (4.5799) grad_norm 1.9664 (2.3551) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][560/1251] eta 0:02:49 lr 0.000573 wd 0.0500 time 0.2395 (0.2453) data time 0.0008 (0.0022) model time 0.2386 (0.2432) loss 4.8508 (4.5839) grad_norm 1.7715 (2.3535) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][570/1251] eta 0:02:47 lr 0.000573 wd 0.0500 time 0.2429 (0.2452) data time 0.0010 (0.0022) model time 0.2419 (0.2432) loss 4.9725 (4.5840) grad_norm 2.9591 (2.3605) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][580/1251] eta 0:02:44 lr 0.000574 wd 0.0500 time 0.2397 (0.2452) data time 0.0007 (0.0022) model time 0.2389 (0.2431) loss 5.3716 (4.5867) grad_norm 2.8967 (2.3600) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][590/1251] eta 0:02:42 lr 0.000574 wd 0.0500 time 0.2375 (0.2451) data time 0.0008 (0.0022) model time 
0.2368 (0.2431) loss 4.9280 (4.5858) grad_norm 1.9209 (2.3554) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][600/1251] eta 0:02:39 lr 0.000574 wd 0.0500 time 0.2401 (0.2451) data time 0.0011 (0.0022) model time 0.2389 (0.2430) loss 5.4151 (4.5829) grad_norm 3.3598 (2.3541) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][610/1251] eta 0:02:37 lr 0.000575 wd 0.0500 time 0.2383 (0.2450) data time 0.0009 (0.0022) model time 0.2374 (0.2430) loss 4.3890 (4.5858) grad_norm 1.9146 (2.3552) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][620/1251] eta 0:02:34 lr 0.000575 wd 0.0500 time 0.2384 (0.2449) data time 0.0009 (0.0022) model time 0.2376 (0.2429) loss 4.1906 (4.5867) grad_norm 1.7187 (2.3574) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][630/1251] eta 0:02:32 lr 0.000576 wd 0.0500 time 0.2427 (0.2449) data time 0.0007 (0.0021) model time 0.2420 (0.2429) loss 4.9116 (4.5869) grad_norm 2.0038 (2.3539) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][640/1251] eta 0:02:29 lr 0.000576 wd 0.0500 time 0.2435 (0.2449) data time 0.0008 (0.0021) model time 0.2427 (0.2429) loss 4.3398 (4.5903) grad_norm 1.7365 (2.3522) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][650/1251] eta 0:02:27 lr 0.000576 wd 0.0500 time 0.2421 (0.2448) data time 0.0010 (0.0021) model time 0.2411 (0.2429) loss 5.0178 (4.5937) grad_norm 2.1394 (2.3547) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[11/300][660/1251] eta 0:02:24 lr 0.000577 wd 0.0500 time 0.2477 (0.2448) data time 0.0009 (0.0021) model time 0.2468 (0.2429) loss 4.8732 (4.5904) grad_norm 4.2523 (2.3538) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][670/1251] eta 0:02:22 lr 0.000577 wd 0.0500 time 0.2372 (0.2448) data time 0.0009 (0.0021) model time 0.2363 (0.2428) loss 4.3749 (4.5940) grad_norm 2.2758 (2.3531) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][680/1251] eta 0:02:19 lr 0.000578 wd 0.0500 time 0.2352 (0.2448) data time 0.0010 (0.0021) model time 0.2342 (0.2428) loss 4.7924 (4.5938) grad_norm 2.1259 (2.3539) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][690/1251] eta 0:02:17 lr 0.000578 wd 0.0500 time 0.2420 (0.2448) data time 0.0012 (0.0020) model time 0.2408 (0.2429) loss 4.8747 (4.5944) grad_norm 2.4527 (2.3509) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][700/1251] eta 0:02:14 lr 0.000578 wd 0.0500 time 0.2370 (0.2447) data time 0.0011 (0.0020) model time 0.2359 (0.2428) loss 4.9998 (4.5954) grad_norm 2.7304 (2.3482) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][710/1251] eta 0:02:12 lr 0.000579 wd 0.0500 time 0.2423 (0.2447) data time 0.0009 (0.0020) model time 0.2414 (0.2428) loss 4.9262 (4.6002) grad_norm 2.1172 (2.3455) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][720/1251] eta 0:02:09 lr 0.000579 wd 0.0500 time 0.2505 (0.2447) data time 0.0008 (0.0020) model time 0.2497 (0.2428) loss 4.6624 (4.5987) grad_norm 3.3866 (2.3472) loss_scale 
16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][730/1251] eta 0:02:07 lr 0.000580 wd 0.0500 time 0.2419 (0.2449) data time 0.0009 (0.0020) model time 0.2411 (0.2430) loss 5.5810 (4.6039) grad_norm 2.7644 (2.3467) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][740/1251] eta 0:02:05 lr 0.000580 wd 0.0500 time 0.2469 (0.2449) data time 0.0010 (0.0020) model time 0.2459 (0.2430) loss 4.6643 (4.6019) grad_norm 2.9163 (2.3489) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][750/1251] eta 0:02:02 lr 0.000580 wd 0.0500 time 0.2388 (0.2448) data time 0.0010 (0.0020) model time 0.2378 (0.2430) loss 5.0521 (4.6008) grad_norm 2.7510 (2.3512) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][760/1251] eta 0:02:00 lr 0.000581 wd 0.0500 time 0.2400 (0.2448) data time 0.0009 (0.0019) model time 0.2391 (0.2430) loss 4.7945 (4.6039) grad_norm 1.9950 (2.3513) loss_scale 32768.0000 (16405.5296) mem 7376MB [2024-08-20 00:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][770/1251] eta 0:01:57 lr 0.000581 wd 0.0500 time 0.2424 (0.2448) data time 0.0008 (0.0019) model time 0.2416 (0.2430) loss 4.9873 (4.6039) grad_norm 2.3513 (2.3501) loss_scale 32768.0000 (16617.7536) mem 7376MB [2024-08-20 00:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][780/1251] eta 0:01:55 lr 0.000582 wd 0.0500 time 0.2430 (0.2448) data time 0.0008 (0.0019) model time 0.2423 (0.2429) loss 3.5979 (4.6033) grad_norm 2.5269 (2.3495) loss_scale 32768.0000 (16824.5429) mem 7376MB [2024-08-20 00:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][790/1251] eta 0:01:52 lr 0.000582 wd 0.0500 time 0.2430 (0.2448) 
data time 0.0009 (0.0019) model time 0.2421 (0.2429) loss 5.1736 (4.6047) grad_norm 1.7973 (2.3507) loss_scale 32768.0000 (17026.1037) mem 7376MB [2024-08-20 00:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][800/1251] eta 0:01:50 lr 0.000582 wd 0.0500 time 0.2409 (0.2447) data time 0.0011 (0.0019) model time 0.2398 (0.2429) loss 4.4805 (4.6050) grad_norm 1.7959 (2.3505) loss_scale 32768.0000 (17222.6317) mem 7376MB [2024-08-20 00:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][810/1251] eta 0:01:47 lr 0.000583 wd 0.0500 time 0.2448 (0.2447) data time 0.0009 (0.0019) model time 0.2438 (0.2429) loss 4.3109 (4.6023) grad_norm 2.2473 (2.3492) loss_scale 32768.0000 (17414.3132) mem 7376MB [2024-08-20 00:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][820/1251] eta 0:01:45 lr 0.000583 wd 0.0500 time 0.2380 (0.2447) data time 0.0009 (0.0019) model time 0.2371 (0.2429) loss 5.2554 (4.6028) grad_norm 1.7178 (2.3460) loss_scale 32768.0000 (17601.3252) mem 7376MB [2024-08-20 00:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][830/1251] eta 0:01:42 lr 0.000584 wd 0.0500 time 0.2306 (0.2446) data time 0.0009 (0.0019) model time 0.2297 (0.2428) loss 5.5689 (4.6031) grad_norm 1.9963 (2.3455) loss_scale 32768.0000 (17783.8363) mem 7376MB [2024-08-20 00:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][840/1251] eta 0:01:40 lr 0.000584 wd 0.0500 time 0.2398 (0.2446) data time 0.0008 (0.0019) model time 0.2391 (0.2428) loss 5.4032 (4.6048) grad_norm 3.8165 (2.3490) loss_scale 32768.0000 (17962.0071) mem 7376MB [2024-08-20 00:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][850/1251] eta 0:01:38 lr 0.000584 wd 0.0500 time 0.2380 (0.2446) data time 0.0010 (0.0019) model time 0.2370 (0.2428) loss 4.5490 (4.6079) grad_norm 2.7488 (2.3495) loss_scale 32768.0000 (18135.9906) mem 7376MB [2024-08-20 00:27:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [11/300][860/1251] eta 0:01:35 lr 0.000585 wd 0.0500 time 0.2410 (0.2445) data time 0.0008 (0.0019) model time 0.2402 (0.2427) loss 4.0857 (4.6106) grad_norm 2.1474 (2.3487) loss_scale 32768.0000 (18305.9326) mem 7376MB [2024-08-20 00:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][870/1251] eta 0:01:33 lr 0.000585 wd 0.0500 time 0.2417 (0.2445) data time 0.0012 (0.0019) model time 0.2405 (0.2427) loss 4.5981 (4.6116) grad_norm 2.8447 (2.3508) loss_scale 32768.0000 (18471.9724) mem 7376MB [2024-08-20 00:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][880/1251] eta 0:01:30 lr 0.000586 wd 0.0500 time 0.2471 (0.2444) data time 0.0012 (0.0018) model time 0.2460 (0.2427) loss 4.8972 (4.6074) grad_norm 1.8310 (2.3517) loss_scale 32768.0000 (18634.2429) mem 7376MB [2024-08-20 00:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][890/1251] eta 0:01:28 lr 0.000586 wd 0.0500 time 0.2431 (0.2444) data time 0.0010 (0.0018) model time 0.2421 (0.2426) loss 3.7520 (4.6049) grad_norm 3.2198 (2.3488) loss_scale 32768.0000 (18792.8709) mem 7376MB [2024-08-20 00:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][900/1251] eta 0:01:25 lr 0.000586 wd 0.0500 time 0.2503 (0.2444) data time 0.0008 (0.0018) model time 0.2495 (0.2427) loss 3.6698 (4.6065) grad_norm 2.3748 (inf) loss_scale 16384.0000 (18820.6881) mem 7376MB [2024-08-20 00:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][910/1251] eta 0:01:23 lr 0.000587 wd 0.0500 time 0.2414 (0.2444) data time 0.0012 (0.0018) model time 0.2402 (0.2426) loss 4.8939 (4.6055) grad_norm 2.7716 (inf) loss_scale 16384.0000 (18793.9407) mem 7376MB [2024-08-20 00:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][920/1251] eta 0:01:20 lr 0.000587 wd 0.0500 time 0.2483 (0.2444) data time 0.0010 (0.0018) model time 0.2473 (0.2426) loss 4.7774 (4.6044) grad_norm 
2.7701 (inf) loss_scale 16384.0000 (18767.7742) mem 7376MB [2024-08-20 00:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][930/1251] eta 0:01:18 lr 0.000588 wd 0.0500 time 0.2400 (0.2444) data time 0.0010 (0.0018) model time 0.2390 (0.2426) loss 3.6089 (4.6015) grad_norm 3.9542 (inf) loss_scale 16384.0000 (18742.1697) mem 7376MB [2024-08-20 00:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][940/1251] eta 0:01:15 lr 0.000588 wd 0.0500 time 0.2479 (0.2444) data time 0.0008 (0.0018) model time 0.2471 (0.2426) loss 5.3110 (4.5967) grad_norm 1.9590 (inf) loss_scale 16384.0000 (18717.1095) mem 7376MB [2024-08-20 00:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][950/1251] eta 0:01:13 lr 0.000588 wd 0.0500 time 0.2405 (0.2443) data time 0.0009 (0.0018) model time 0.2396 (0.2426) loss 4.4670 (4.5949) grad_norm 1.4898 (inf) loss_scale 16384.0000 (18692.5762) mem 7376MB [2024-08-20 00:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][960/1251] eta 0:01:11 lr 0.000589 wd 0.0500 time 0.2429 (0.2443) data time 0.0010 (0.0018) model time 0.2419 (0.2426) loss 3.4174 (4.5931) grad_norm 2.1826 (inf) loss_scale 16384.0000 (18668.5536) mem 7376MB [2024-08-20 00:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][970/1251] eta 0:01:08 lr 0.000589 wd 0.0500 time 0.2454 (0.2443) data time 0.0010 (0.0018) model time 0.2444 (0.2426) loss 4.7956 (4.5930) grad_norm 4.1769 (inf) loss_scale 16384.0000 (18645.0257) mem 7376MB [2024-08-20 00:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][980/1251] eta 0:01:06 lr 0.000590 wd 0.0500 time 0.2434 (0.2443) data time 0.0008 (0.0018) model time 0.2426 (0.2425) loss 3.7935 (4.5931) grad_norm 1.9382 (inf) loss_scale 16384.0000 (18621.9776) mem 7376MB [2024-08-20 00:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][990/1251] eta 0:01:03 lr 0.000590 wd 0.0500 time 0.2427 
(0.2445) data time 0.0010 (0.0018) model time 0.2416 (0.2428) loss 4.4021 (4.5930) grad_norm 2.7443 (inf) loss_scale 16384.0000 (18599.3946) mem 7376MB [2024-08-20 00:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1000/1251] eta 0:01:01 lr 0.000590 wd 0.0500 time 0.2500 (0.2444) data time 0.0010 (0.0018) model time 0.2490 (0.2427) loss 4.9391 (4.5896) grad_norm 1.7270 (inf) loss_scale 16384.0000 (18577.2627) mem 7376MB [2024-08-20 00:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1010/1251] eta 0:00:58 lr 0.000591 wd 0.0500 time 0.2406 (0.2444) data time 0.0011 (0.0018) model time 0.2395 (0.2427) loss 3.3702 (4.5860) grad_norm 2.2494 (inf) loss_scale 16384.0000 (18555.5687) mem 7376MB [2024-08-20 00:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1020/1251] eta 0:00:56 lr 0.000591 wd 0.0500 time 0.2375 (0.2444) data time 0.0010 (0.0018) model time 0.2365 (0.2427) loss 5.1308 (4.5844) grad_norm 3.3567 (inf) loss_scale 16384.0000 (18534.2997) mem 7376MB [2024-08-20 00:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1030/1251] eta 0:00:53 lr 0.000592 wd 0.0500 time 0.2381 (0.2443) data time 0.0011 (0.0018) model time 0.2370 (0.2426) loss 5.0375 (4.5872) grad_norm 1.8564 (inf) loss_scale 16384.0000 (18513.4433) mem 7376MB [2024-08-20 00:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1040/1251] eta 0:00:51 lr 0.000592 wd 0.0500 time 0.2370 (0.2443) data time 0.0010 (0.0017) model time 0.2359 (0.2426) loss 4.7896 (4.5885) grad_norm 2.5270 (inf) loss_scale 16384.0000 (18492.9875) mem 7376MB [2024-08-20 00:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1050/1251] eta 0:00:49 lr 0.000592 wd 0.0500 time 0.2418 (0.2442) data time 0.0010 (0.0017) model time 0.2409 (0.2426) loss 4.0221 (4.5876) grad_norm 3.9307 (inf) loss_scale 16384.0000 (18472.9210) mem 7376MB [2024-08-20 00:28:09 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [11/300][1060/1251] eta 0:00:46 lr 0.000593 wd 0.0500 time 0.2451 (0.2442) data time 0.0008 (0.0017) model time 0.2442 (0.2425) loss 3.5243 (4.5858) grad_norm 1.9780 (inf) loss_scale 16384.0000 (18453.2328) mem 7376MB [2024-08-20 00:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1070/1251] eta 0:00:44 lr 0.000593 wd 0.0500 time 0.2561 (0.2442) data time 0.0008 (0.0017) model time 0.2553 (0.2425) loss 3.6017 (4.5851) grad_norm 2.4308 (inf) loss_scale 16384.0000 (18433.9122) mem 7376MB [2024-08-20 00:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1080/1251] eta 0:00:41 lr 0.000594 wd 0.0500 time 0.2539 (0.2442) data time 0.0011 (0.0017) model time 0.2528 (0.2425) loss 4.5217 (4.5875) grad_norm 2.3446 (inf) loss_scale 16384.0000 (18414.9491) mem 7376MB [2024-08-20 00:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1090/1251] eta 0:00:39 lr 0.000594 wd 0.0500 time 0.2475 (0.2441) data time 0.0009 (0.0017) model time 0.2466 (0.2425) loss 3.3391 (4.5867) grad_norm 6.7373 (inf) loss_scale 16384.0000 (18396.3336) mem 7376MB [2024-08-20 00:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1100/1251] eta 0:00:36 lr 0.000594 wd 0.0500 time 0.2383 (0.2442) data time 0.0009 (0.0017) model time 0.2375 (0.2425) loss 5.3759 (4.5865) grad_norm 2.2929 (inf) loss_scale 16384.0000 (18378.0563) mem 7376MB [2024-08-20 00:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1110/1251] eta 0:00:34 lr 0.000595 wd 0.0500 time 0.2614 (0.2441) data time 0.0011 (0.0017) model time 0.2603 (0.2425) loss 4.3977 (4.5863) grad_norm 2.4854 (inf) loss_scale 16384.0000 (18360.1080) mem 7376MB [2024-08-20 00:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1120/1251] eta 0:00:31 lr 0.000595 wd 0.0500 time 0.2496 (0.2441) data time 0.0007 (0.0017) model time 0.2489 (0.2424) loss 5.3065 (4.5867) grad_norm 2.2779 
(inf) loss_scale 16384.0000 (18342.4799) mem 7376MB [2024-08-20 00:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1130/1251] eta 0:00:29 lr 0.000596 wd 0.0500 time 0.2359 (0.2441) data time 0.0011 (0.0017) model time 0.2348 (0.2424) loss 5.1355 (4.5865) grad_norm 2.7180 (inf) loss_scale 16384.0000 (18325.1636) mem 7376MB [2024-08-20 00:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1140/1251] eta 0:00:27 lr 0.000596 wd 0.0500 time 0.2373 (0.2440) data time 0.0010 (0.0017) model time 0.2363 (0.2424) loss 3.7523 (4.5849) grad_norm 2.6769 (inf) loss_scale 16384.0000 (18308.1507) mem 7376MB [2024-08-20 00:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1150/1251] eta 0:00:24 lr 0.000596 wd 0.0500 time 0.2366 (0.2440) data time 0.0010 (0.0017) model time 0.2356 (0.2424) loss 5.1363 (4.5866) grad_norm 1.8538 (inf) loss_scale 16384.0000 (18291.4335) mem 7376MB [2024-08-20 00:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1160/1251] eta 0:00:22 lr 0.000597 wd 0.0500 time 0.2521 (0.2440) data time 0.0008 (0.0017) model time 0.2513 (0.2424) loss 4.6980 (4.5874) grad_norm 2.3223 (inf) loss_scale 16384.0000 (18275.0043) mem 7376MB [2024-08-20 00:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1170/1251] eta 0:00:19 lr 0.000597 wd 0.0500 time 0.2407 (0.2440) data time 0.0011 (0.0017) model time 0.2396 (0.2423) loss 4.5486 (4.5848) grad_norm 2.1216 (inf) loss_scale 16384.0000 (18258.8557) mem 7376MB [2024-08-20 00:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1180/1251] eta 0:00:17 lr 0.000598 wd 0.0500 time 0.2412 (0.2440) data time 0.0012 (0.0017) model time 0.2401 (0.2423) loss 3.5538 (4.5860) grad_norm 3.3159 (inf) loss_scale 16384.0000 (18242.9805) mem 7376MB [2024-08-20 00:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1190/1251] eta 0:00:14 lr 0.000598 wd 0.0500 time 0.2423 
(0.2440) data time 0.0008 (0.0017) model time 0.2416 (0.2423) loss 5.4003 (4.5864) grad_norm 2.2955 (inf) loss_scale 16384.0000 (18227.3720) mem 7376MB [2024-08-20 00:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1200/1251] eta 0:00:12 lr 0.000598 wd 0.0500 time 0.2469 (0.2440) data time 0.0011 (0.0017) model time 0.2458 (0.2423) loss 4.7212 (4.5850) grad_norm 2.2326 (inf) loss_scale 16384.0000 (18212.0233) mem 7376MB [2024-08-20 00:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1210/1251] eta 0:00:10 lr 0.000599 wd 0.0500 time 0.2424 (0.2440) data time 0.0007 (0.0017) model time 0.2416 (0.2423) loss 3.2089 (4.5821) grad_norm 1.9708 (inf) loss_scale 16384.0000 (18196.9282) mem 7376MB [2024-08-20 00:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1220/1251] eta 0:00:07 lr 0.000599 wd 0.0500 time 0.2530 (0.2440) data time 0.0014 (0.0017) model time 0.2516 (0.2423) loss 4.8548 (4.5819) grad_norm 2.5216 (inf) loss_scale 16384.0000 (18182.0803) mem 7376MB [2024-08-20 00:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1230/1251] eta 0:00:05 lr 0.000600 wd 0.0500 time 0.2483 (0.2440) data time 0.0013 (0.0017) model time 0.2470 (0.2423) loss 4.6655 (4.5814) grad_norm 2.3103 (inf) loss_scale 16384.0000 (18167.4736) mem 7376MB [2024-08-20 00:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1240/1251] eta 0:00:02 lr 0.000600 wd 0.0500 time 0.2253 (0.2440) data time 0.0007 (0.0017) model time 0.2246 (0.2423) loss 4.7482 (4.5827) grad_norm 2.1114 (inf) loss_scale 16384.0000 (18153.1023) mem 7376MB [2024-08-20 00:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [11/300][1250/1251] eta 0:00:00 lr 0.000600 wd 0.0500 time 0.2216 (0.2440) data time 0.0005 (0.0017) model time 0.2212 (0.2423) loss 4.8074 (4.5820) grad_norm 2.2933 (inf) loss_scale 16384.0000 (18138.9608) mem 7376MB [2024-08-20 00:28:55 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 398): INFO EPOCH 11 training takes 0:05:05 [2024-08-20 00:28:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-20 00:28:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 00:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.429 (0.429) Loss 1.0762 (1.0762) Acc@1 76.758 (76.758) Acc@5 93.262 (93.262) Mem 7376MB [2024-08-20 00:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.108) Loss 1.4287 (1.5400) Acc@1 66.211 (64.462) Acc@5 90.723 (87.607) Mem 7376MB [2024-08-20 00:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.095) Loss 2.1094 (1.5572) Acc@1 55.664 (64.049) Acc@5 78.027 (87.751) Mem 7376MB [2024-08-20 00:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 2.4238 (1.7440) Acc@1 49.609 (60.726) Acc@5 71.875 (84.432) Mem 7376MB [2024-08-20 00:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 2.4512 (1.8634) Acc@1 46.973 (58.477) Acc@5 72.461 (82.403) Mem 7376MB [2024-08-20 00:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 58.464 Acc@5 82.366 [2024-08-20 00:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 58.5% [2024-08-20 00:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 58.46% [2024-08-20 00:29:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-20 00:29:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-20 00:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.438 (0.438) Loss 4.7070 (4.7070) Acc@1 16.895 (16.895) Acc@5 42.773 (42.773) Mem 7376MB [2024-08-20 00:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.112) Loss 5.3359 (5.1435) Acc@1 10.938 (12.536) Acc@5 28.711 (30.131) Mem 7376MB [2024-08-20 00:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 5.3867 (5.1263) Acc@1 10.547 (12.974) Acc@5 25.879 (30.655) Mem 7376MB [2024-08-20 00:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.091) Loss 5.2148 (5.1576) Acc@1 13.672 (12.944) Acc@5 30.273 (29.870) Mem 7376MB [2024-08-20 00:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 5.3867 (5.1785) Acc@1 7.227 (12.540) Acc@5 21.680 (29.209) Mem 7376MB [2024-08-20 00:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 13.374 Acc@5 30.624 [2024-08-20 00:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 13.4% [2024-08-20 00:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 13.37% [2024-08-20 00:29:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-20 00:29:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-20 00:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][0/1251] eta 0:15:07 lr 0.000600 wd 0.0500 time 0.7256 (0.7256) data time 0.4945 (0.4945) model time 0.0000 (0.0000) loss 5.2927 (5.2927) grad_norm 2.4819 (2.4819) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][10/1251] eta 0:05:56 lr 0.000601 wd 0.0500 time 0.2547 (0.2874) data time 0.0007 (0.0459) model time 0.0000 (0.0000) loss 4.5468 (4.5290) grad_norm 1.9834 (2.3051) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][20/1251] eta 0:05:26 lr 0.000601 wd 0.0500 time 0.2457 (0.2655) data time 0.0008 (0.0247) model time 0.0000 (0.0000) loss 4.4291 (4.4283) grad_norm 1.9951 (2.4394) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][30/1251] eta 0:05:14 lr 0.000602 wd 0.0500 time 0.2586 (0.2577) data time 0.0012 (0.0170) model time 0.0000 (0.0000) loss 4.7924 (4.4757) grad_norm 3.2953 (2.4940) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][40/1251] eta 0:05:07 lr 0.000602 wd 0.0500 time 0.2387 (0.2539) data time 0.0008 (0.0131) model time 0.0000 (0.0000) loss 5.1874 (4.4236) grad_norm 2.0084 (2.3556) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][50/1251] eta 0:05:02 lr 0.000602 wd 0.0500 time 0.2538 (0.2520) data time 0.0009 (0.0107) model time 0.0000 (0.0000) loss 4.6533 (4.3854) grad_norm 1.7931 (2.3255) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][60/1251] eta 0:04:58 lr 0.000603 wd 0.0500 time 0.2488 (0.2504) data time 0.0012 (0.0093) model time 0.2476 
(0.2405) loss 4.9689 (4.3323) grad_norm 2.4287 (2.3637) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][70/1251] eta 0:04:54 lr 0.000603 wd 0.0500 time 0.2488 (0.2495) data time 0.0010 (0.0081) model time 0.2478 (0.2416) loss 4.4960 (4.3837) grad_norm 2.5197 (2.3527) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][80/1251] eta 0:04:51 lr 0.000604 wd 0.0500 time 0.2584 (0.2489) data time 0.0010 (0.0072) model time 0.2574 (0.2423) loss 4.7956 (4.4050) grad_norm 2.2754 (2.3201) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][90/1251] eta 0:04:47 lr 0.000604 wd 0.0500 time 0.2439 (0.2480) data time 0.0007 (0.0067) model time 0.2431 (0.2414) loss 5.2427 (4.4102) grad_norm 1.5073 (2.2652) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][100/1251] eta 0:04:47 lr 0.000604 wd 0.0500 time 0.2515 (0.2498) data time 0.0009 (0.0061) model time 0.2506 (0.2461) loss 4.0698 (4.4142) grad_norm 3.0739 (2.2564) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][110/1251] eta 0:04:44 lr 0.000605 wd 0.0500 time 0.2367 (0.2491) data time 0.0008 (0.0057) model time 0.2359 (0.2452) loss 4.6726 (4.4282) grad_norm 2.1560 (2.2728) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][120/1251] eta 0:04:41 lr 0.000605 wd 0.0500 time 0.2408 (0.2487) data time 0.0009 (0.0053) model time 0.2399 (0.2450) loss 4.4620 (4.4244) grad_norm 3.3250 (2.2896) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[12/300][130/1251] eta 0:04:38 lr 0.000606 wd 0.0500 time 0.2368 (0.2484) data time 0.0016 (0.0050) model time 0.2352 (0.2448) loss 4.2075 (4.4438) grad_norm 1.7617 (2.2744) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][140/1251] eta 0:04:35 lr 0.000606 wd 0.0500 time 0.2390 (0.2479) data time 0.0012 (0.0048) model time 0.2378 (0.2442) loss 4.8673 (4.4827) grad_norm 2.3875 (2.2774) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][150/1251] eta 0:04:32 lr 0.000606 wd 0.0500 time 0.2414 (0.2476) data time 0.0010 (0.0045) model time 0.2404 (0.2440) loss 4.3625 (4.4831) grad_norm 3.7272 (2.2943) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][160/1251] eta 0:04:29 lr 0.000607 wd 0.0500 time 0.2419 (0.2471) data time 0.0015 (0.0043) model time 0.2405 (0.2435) loss 4.6904 (4.4858) grad_norm 2.0080 (2.2963) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][170/1251] eta 0:04:26 lr 0.000607 wd 0.0500 time 0.2376 (0.2468) data time 0.0010 (0.0041) model time 0.2366 (0.2433) loss 4.7106 (4.5025) grad_norm 2.9134 (2.2909) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][180/1251] eta 0:04:24 lr 0.000608 wd 0.0500 time 0.2385 (0.2468) data time 0.0008 (0.0040) model time 0.2376 (0.2435) loss 3.4483 (4.4943) grad_norm 1.6929 (2.2761) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][190/1251] eta 0:04:21 lr 0.000608 wd 0.0500 time 0.2414 (0.2466) data time 0.0008 (0.0038) model time 0.2406 (0.2434) loss 4.3639 (4.4947) grad_norm 2.4459 (2.2670) loss_scale 
16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][200/1251] eta 0:04:19 lr 0.000608 wd 0.0500 time 0.2451 (0.2465) data time 0.0013 (0.0037) model time 0.2438 (0.2434) loss 4.9215 (4.5025) grad_norm 3.0789 (2.2911) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][210/1251] eta 0:04:16 lr 0.000609 wd 0.0500 time 0.2335 (0.2462) data time 0.0012 (0.0036) model time 0.2323 (0.2430) loss 4.7743 (4.5055) grad_norm 2.0854 (2.2895) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][220/1251] eta 0:04:13 lr 0.000609 wd 0.0500 time 0.2470 (0.2460) data time 0.0010 (0.0035) model time 0.2461 (0.2428) loss 5.4453 (4.5106) grad_norm 1.9709 (2.2836) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][230/1251] eta 0:04:11 lr 0.000610 wd 0.0500 time 0.2322 (0.2460) data time 0.0009 (0.0034) model time 0.2313 (0.2429) loss 3.2587 (4.5013) grad_norm 2.0289 (2.2765) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][240/1251] eta 0:04:08 lr 0.000610 wd 0.0500 time 0.2372 (0.2458) data time 0.0007 (0.0033) model time 0.2365 (0.2429) loss 3.2544 (4.5104) grad_norm 2.7409 (2.2715) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][250/1251] eta 0:04:05 lr 0.000610 wd 0.0500 time 0.2452 (0.2456) data time 0.0012 (0.0032) model time 0.2440 (0.2427) loss 4.9933 (4.5072) grad_norm 1.8573 (2.2592) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][260/1251] eta 0:04:03 lr 0.000611 wd 0.0500 time 0.2397 (0.2455) 
data time 0.0012 (0.0032) model time 0.2385 (0.2426) loss 4.5809 (4.5105) grad_norm 2.8776 (2.2496) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][270/1251] eta 0:04:02 lr 0.000611 wd 0.0500 time 0.2388 (0.2469) data time 0.0011 (0.0031) model time 0.2377 (0.2444) loss 4.9958 (4.5247) grad_norm 2.9306 (2.2523) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][280/1251] eta 0:03:59 lr 0.000612 wd 0.0500 time 0.2387 (0.2467) data time 0.0008 (0.0030) model time 0.2380 (0.2443) loss 5.3231 (4.5274) grad_norm 2.2019 (2.2505) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][290/1251] eta 0:03:56 lr 0.000612 wd 0.0500 time 0.2372 (0.2466) data time 0.0010 (0.0029) model time 0.2361 (0.2441) loss 3.9623 (4.5262) grad_norm 1.8320 (2.2421) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][300/1251] eta 0:03:54 lr 0.000612 wd 0.0500 time 0.2416 (0.2464) data time 0.0009 (0.0029) model time 0.2407 (0.2440) loss 5.1745 (4.5262) grad_norm 3.9806 (2.2412) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][310/1251] eta 0:03:51 lr 0.000613 wd 0.0500 time 0.2293 (0.2464) data time 0.0009 (0.0028) model time 0.2284 (0.2440) loss 4.4007 (4.5143) grad_norm 1.9505 (2.2501) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][320/1251] eta 0:03:49 lr 0.000613 wd 0.0500 time 0.2387 (0.2463) data time 0.0010 (0.0028) model time 0.2377 (0.2439) loss 4.8709 (4.5044) grad_norm 1.8195 (2.2504) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [12/300][330/1251] eta 0:03:46 lr 0.000614 wd 0.0500 time 0.2348 (0.2462) data time 0.0011 (0.0028) model time 0.2337 (0.2439) loss 5.0495 (4.5035) grad_norm 2.7611 (2.2512) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][340/1251] eta 0:03:44 lr 0.000614 wd 0.0500 time 0.2455 (0.2461) data time 0.0010 (0.0027) model time 0.2445 (0.2438) loss 5.2050 (4.5090) grad_norm 1.5987 (2.2489) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][350/1251] eta 0:03:41 lr 0.000614 wd 0.0500 time 0.2376 (0.2460) data time 0.0011 (0.0027) model time 0.2365 (0.2436) loss 4.4916 (4.5087) grad_norm 3.1961 (2.2436) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][360/1251] eta 0:03:39 lr 0.000615 wd 0.0500 time 0.2478 (0.2459) data time 0.0008 (0.0027) model time 0.2470 (0.2436) loss 4.6152 (4.5109) grad_norm 2.3381 (2.2427) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][370/1251] eta 0:03:36 lr 0.000615 wd 0.0500 time 0.2440 (0.2458) data time 0.0010 (0.0027) model time 0.2430 (0.2435) loss 4.8622 (4.5164) grad_norm 1.9890 (2.2440) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][380/1251] eta 0:03:34 lr 0.000616 wd 0.0500 time 0.2388 (0.2458) data time 0.0009 (0.0026) model time 0.2379 (0.2435) loss 3.2566 (4.5011) grad_norm 2.4209 (2.2482) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][390/1251] eta 0:03:31 lr 0.000616 wd 0.0500 time 0.2382 (0.2457) data time 0.0009 (0.0026) model time 0.2374 (0.2434) loss 4.8750 (4.5016) 
grad_norm 1.9425 (2.2434) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][400/1251] eta 0:03:29 lr 0.000616 wd 0.0500 time 0.2424 (0.2457) data time 0.0007 (0.0026) model time 0.2416 (0.2434) loss 4.8165 (4.5050) grad_norm 2.0569 (2.2451) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][410/1251] eta 0:03:26 lr 0.000617 wd 0.0500 time 0.2405 (0.2455) data time 0.0012 (0.0025) model time 0.2393 (0.2433) loss 3.9791 (4.5056) grad_norm 2.5435 (2.2442) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][420/1251] eta 0:03:23 lr 0.000617 wd 0.0500 time 0.2404 (0.2454) data time 0.0008 (0.0025) model time 0.2396 (0.2432) loss 3.4423 (4.4981) grad_norm 3.2522 (2.2573) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][430/1251] eta 0:03:21 lr 0.000618 wd 0.0500 time 0.2384 (0.2454) data time 0.0012 (0.0025) model time 0.2372 (0.2431) loss 4.0395 (4.4994) grad_norm 1.7225 (2.2753) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][440/1251] eta 0:03:18 lr 0.000618 wd 0.0500 time 0.2448 (0.2454) data time 0.0012 (0.0025) model time 0.2436 (0.2431) loss 3.4276 (4.4989) grad_norm 1.5373 (2.2776) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][450/1251] eta 0:03:16 lr 0.000618 wd 0.0500 time 0.2395 (0.2453) data time 0.0009 (0.0024) model time 0.2387 (0.2431) loss 4.2526 (4.4993) grad_norm 1.9679 (2.2800) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][460/1251] eta 0:03:14 lr 
0.000619 wd 0.0500 time 0.2448 (0.2456) data time 0.0007 (0.0024) model time 0.2441 (0.2434) loss 5.2005 (4.4889) grad_norm 2.2378 (2.2789) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][470/1251] eta 0:03:11 lr 0.000619 wd 0.0500 time 0.2302 (0.2455) data time 0.0009 (0.0024) model time 0.2293 (0.2433) loss 4.0910 (4.4896) grad_norm 2.0062 (2.2843) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][480/1251] eta 0:03:09 lr 0.000620 wd 0.0500 time 0.2367 (0.2454) data time 0.0008 (0.0023) model time 0.2360 (0.2433) loss 4.9173 (4.4972) grad_norm 4.7274 (2.2915) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][490/1251] eta 0:03:06 lr 0.000620 wd 0.0500 time 0.2509 (0.2454) data time 0.0011 (0.0023) model time 0.2498 (0.2433) loss 5.1172 (4.4963) grad_norm 2.3297 (2.2927) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][500/1251] eta 0:03:04 lr 0.000620 wd 0.0500 time 0.2419 (0.2453) data time 0.0008 (0.0023) model time 0.2411 (0.2432) loss 5.1887 (4.4962) grad_norm 2.4984 (2.2854) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][510/1251] eta 0:03:01 lr 0.000621 wd 0.0500 time 0.2417 (0.2453) data time 0.0012 (0.0023) model time 0.2405 (0.2432) loss 4.3193 (4.5029) grad_norm 2.3366 (2.2858) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][520/1251] eta 0:02:59 lr 0.000621 wd 0.0500 time 0.2550 (0.2453) data time 0.0007 (0.0023) model time 0.2543 (0.2432) loss 4.0594 (4.5005) grad_norm 2.0557 (2.2843) loss_scale 16384.0000 (16384.0000) mem 7376MB 
[2024-08-20 00:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][530/1251] eta 0:02:56 lr 0.000622 wd 0.0500 time 0.2435 (0.2452) data time 0.0011 (0.0022) model time 0.2424 (0.2431) loss 4.9175 (4.5029) grad_norm 1.7018 (2.2788) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][540/1251] eta 0:02:54 lr 0.000622 wd 0.0500 time 0.2463 (0.2451) data time 0.0009 (0.0022) model time 0.2453 (0.2431) loss 5.2088 (4.5031) grad_norm 2.7550 (2.2779) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][550/1251] eta 0:02:51 lr 0.000622 wd 0.0500 time 0.2414 (0.2450) data time 0.0009 (0.0022) model time 0.2404 (0.2430) loss 4.7827 (4.5047) grad_norm 1.7113 (2.2775) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][560/1251] eta 0:02:49 lr 0.000623 wd 0.0500 time 0.2348 (0.2450) data time 0.0008 (0.0022) model time 0.2339 (0.2430) loss 5.2437 (4.5110) grad_norm 1.7347 (2.2717) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][570/1251] eta 0:02:46 lr 0.000623 wd 0.0500 time 0.2453 (0.2450) data time 0.0008 (0.0022) model time 0.2446 (0.2430) loss 3.8396 (4.5088) grad_norm 2.2495 (2.2743) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][580/1251] eta 0:02:44 lr 0.000624 wd 0.0500 time 0.2564 (0.2450) data time 0.0007 (0.0022) model time 0.2556 (0.2430) loss 4.1237 (4.5042) grad_norm 1.6908 (2.2757) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][590/1251] eta 0:02:41 lr 0.000624 wd 0.0500 time 0.2468 (0.2449) data time 0.0008 (0.0022) model time 
0.2460 (0.2429) loss 4.5608 (4.4987) grad_norm 1.9072 (2.2750) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][600/1251] eta 0:02:39 lr 0.000624 wd 0.0500 time 0.2468 (0.2449) data time 0.0008 (0.0021) model time 0.2460 (0.2429) loss 5.2541 (4.5028) grad_norm 1.5682 (2.2733) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][610/1251] eta 0:02:36 lr 0.000625 wd 0.0500 time 0.2426 (0.2449) data time 0.0008 (0.0021) model time 0.2418 (0.2429) loss 4.4033 (4.5041) grad_norm 1.8802 (2.2719) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][620/1251] eta 0:02:34 lr 0.000625 wd 0.0500 time 0.2485 (0.2449) data time 0.0010 (0.0021) model time 0.2475 (0.2429) loss 4.5757 (4.5071) grad_norm 3.0897 (2.2702) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][630/1251] eta 0:02:32 lr 0.000626 wd 0.0500 time 0.2485 (0.2449) data time 0.0007 (0.0021) model time 0.2478 (0.2429) loss 3.3178 (4.5048) grad_norm 2.4072 (2.2698) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][640/1251] eta 0:02:29 lr 0.000626 wd 0.0500 time 0.2337 (0.2448) data time 0.0010 (0.0021) model time 0.2327 (0.2429) loss 4.7272 (4.5029) grad_norm 2.5612 (2.2706) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][650/1251] eta 0:02:27 lr 0.000626 wd 0.0500 time 0.2425 (0.2448) data time 0.0008 (0.0021) model time 0.2417 (0.2429) loss 3.4333 (4.5044) grad_norm 3.3143 (2.2725) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[12/300][660/1251] eta 0:02:24 lr 0.000627 wd 0.0500 time 0.2391 (0.2448) data time 0.0010 (0.0021) model time 0.2381 (0.2429) loss 5.0041 (4.5058) grad_norm 2.6219 (2.2844) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][670/1251] eta 0:02:22 lr 0.000627 wd 0.0500 time 0.2467 (0.2448) data time 0.0009 (0.0020) model time 0.2457 (0.2428) loss 4.8724 (4.5047) grad_norm 2.1040 (2.2875) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][680/1251] eta 0:02:19 lr 0.000628 wd 0.0500 time 0.2334 (0.2447) data time 0.0010 (0.0020) model time 0.2324 (0.2428) loss 5.5025 (4.5011) grad_norm 2.2087 (2.2877) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][690/1251] eta 0:02:17 lr 0.000628 wd 0.0500 time 0.2409 (0.2447) data time 0.0010 (0.0020) model time 0.2399 (0.2428) loss 4.8451 (4.5038) grad_norm 1.7205 (2.2863) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][700/1251] eta 0:02:14 lr 0.000628 wd 0.0500 time 0.2462 (0.2447) data time 0.0010 (0.0020) model time 0.2453 (0.2428) loss 4.8264 (4.5035) grad_norm 3.1767 (2.2864) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][710/1251] eta 0:02:12 lr 0.000629 wd 0.0500 time 0.2416 (0.2447) data time 0.0009 (0.0020) model time 0.2407 (0.2428) loss 4.6832 (4.5037) grad_norm 1.8504 (2.2882) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][720/1251] eta 0:02:09 lr 0.000629 wd 0.0500 time 0.2391 (0.2446) data time 0.0008 (0.0020) model time 0.2384 (0.2427) loss 4.8334 (4.5038) grad_norm 2.2488 (2.2832) loss_scale 
16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][730/1251] eta 0:02:07 lr 0.000630 wd 0.0500 time 0.2427 (0.2446) data time 0.0009 (0.0020) model time 0.2418 (0.2427) loss 4.7262 (4.5024) grad_norm 2.2426 (2.2793) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][740/1251] eta 0:02:04 lr 0.000630 wd 0.0500 time 0.2541 (0.2446) data time 0.0011 (0.0020) model time 0.2530 (0.2427) loss 4.6415 (4.5026) grad_norm 2.0225 (2.2792) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][750/1251] eta 0:02:02 lr 0.000630 wd 0.0500 time 0.2407 (0.2445) data time 0.0008 (0.0019) model time 0.2400 (0.2427) loss 4.3039 (4.5015) grad_norm 2.0989 (2.2809) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][760/1251] eta 0:02:00 lr 0.000631 wd 0.0500 time 0.2483 (0.2445) data time 0.0010 (0.0019) model time 0.2473 (0.2427) loss 4.8223 (4.5008) grad_norm 2.0599 (2.2785) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][770/1251] eta 0:01:57 lr 0.000631 wd 0.0500 time 0.2383 (0.2445) data time 0.0009 (0.0019) model time 0.2374 (0.2426) loss 4.1257 (4.4984) grad_norm 1.8623 (2.2813) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][780/1251] eta 0:01:55 lr 0.000632 wd 0.0500 time 0.2472 (0.2447) data time 0.0008 (0.0019) model time 0.2464 (0.2429) loss 5.3196 (4.5018) grad_norm 1.8255 (2.2817) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][790/1251] eta 0:01:52 lr 0.000632 wd 0.0500 time 0.2482 (0.2449) 
data time 0.0010 (0.0019) model time 0.2472 (0.2431) loss 4.4109 (4.5010) grad_norm 2.3507 (2.2797) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][800/1251] eta 0:01:50 lr 0.000632 wd 0.0500 time 0.2518 (0.2449) data time 0.0009 (0.0019) model time 0.2509 (0.2431) loss 4.0003 (4.4987) grad_norm 1.8159 (2.2768) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][810/1251] eta 0:01:47 lr 0.000633 wd 0.0500 time 0.2449 (0.2449) data time 0.0008 (0.0019) model time 0.2442 (0.2431) loss 5.2562 (4.5024) grad_norm 1.6390 (2.2749) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][820/1251] eta 0:01:45 lr 0.000633 wd 0.0500 time 0.2400 (0.2448) data time 0.0008 (0.0019) model time 0.2393 (0.2430) loss 5.3441 (4.5042) grad_norm 2.5365 (2.2739) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][830/1251] eta 0:01:43 lr 0.000634 wd 0.0500 time 0.2448 (0.2448) data time 0.0010 (0.0019) model time 0.2438 (0.2430) loss 4.7716 (4.5062) grad_norm 3.0832 (2.2800) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][840/1251] eta 0:01:40 lr 0.000634 wd 0.0500 time 0.2436 (0.2448) data time 0.0008 (0.0019) model time 0.2427 (0.2430) loss 3.3486 (4.5042) grad_norm 2.5381 (2.2818) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][850/1251] eta 0:01:38 lr 0.000634 wd 0.0500 time 0.2521 (0.2448) data time 0.0009 (0.0019) model time 0.2512 (0.2430) loss 4.2102 (4.5068) grad_norm 1.5974 (2.2866) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [12/300][860/1251] eta 0:01:35 lr 0.000635 wd 0.0500 time 0.2404 (0.2448) data time 0.0007 (0.0019) model time 0.2396 (0.2430) loss 4.0574 (4.5078) grad_norm 3.3735 (2.2862) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][870/1251] eta 0:01:33 lr 0.000635 wd 0.0500 time 0.2392 (0.2447) data time 0.0008 (0.0019) model time 0.2384 (0.2430) loss 4.2038 (4.5125) grad_norm 1.5956 (2.2864) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][880/1251] eta 0:01:30 lr 0.000636 wd 0.0500 time 0.2375 (0.2447) data time 0.0010 (0.0019) model time 0.2364 (0.2430) loss 4.9697 (4.5152) grad_norm 2.2385 (2.2859) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][890/1251] eta 0:01:28 lr 0.000636 wd 0.0500 time 0.2420 (0.2447) data time 0.0011 (0.0018) model time 0.2409 (0.2429) loss 3.7539 (4.5139) grad_norm 2.0265 (2.2865) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][900/1251] eta 0:01:25 lr 0.000636 wd 0.0500 time 0.2438 (0.2446) data time 0.0011 (0.0018) model time 0.2427 (0.2429) loss 4.7735 (4.5147) grad_norm 2.0503 (2.2863) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][910/1251] eta 0:01:23 lr 0.000637 wd 0.0500 time 0.2508 (0.2448) data time 0.0007 (0.0018) model time 0.2500 (0.2431) loss 5.5911 (4.5125) grad_norm 2.3799 (2.2872) loss_scale 16384.0000 (16384.0000) mem 7376MB [2024-08-20 00:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-20 00:32:50 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-20 00:32:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 09:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-20 09:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-20 09:33:16 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-20 09:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-20 09:33:26 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-20 09:33:27 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-20 09:33:28 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-20 09:33:28 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 12) [2024-08-20 09:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-20 09:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][920/1251] eta 0:11:16 lr 0.000637 wd 0.0500 time 0.2345 (2.0449) data time 0.0014 (0.1449) model time 0.2331 (1.9000) loss 4.9354 (5.0777) grad_norm 1.7959 (2.2419) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][930/1251] eta 0:05:15 lr 0.000638 wd 0.0500 time 0.2413 (0.9827) data time 0.0011 (0.0603) model time 0.2402 (0.9224) loss 4.2457 (4.8124) grad_norm 3.6573 (2.6769) loss_scale 16384.0000 (16384.0000) 
mem 7377MB [2024-08-20 09:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][940/1251] eta 0:03:40 lr 0.000638 wd 0.0500 time 0.2467 (0.7078) data time 0.0009 (0.0384) model time 0.2459 (0.6693) loss 5.3416 (4.8466) grad_norm 2.0536 (2.4412) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][950/1251] eta 0:02:54 lr 0.000638 wd 0.0500 time 0.2360 (0.5812) data time 0.0012 (0.0283) model time 0.2348 (0.5528) loss 4.0348 (4.7794) grad_norm 1.8665 (2.3739) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][960/1251] eta 0:02:27 lr 0.000639 wd 0.0500 time 0.2362 (0.5079) data time 0.0008 (0.0225) model time 0.2355 (0.4854) loss 4.8106 (4.7314) grad_norm 1.9644 (2.3179) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][970/1251] eta 0:02:09 lr 0.000639 wd 0.0500 time 0.2378 (0.4608) data time 0.0011 (0.0188) model time 0.2367 (0.4421) loss 4.9159 (4.7308) grad_norm 4.3296 (2.3413) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][980/1251] eta 0:01:55 lr 0.000640 wd 0.0500 time 0.2540 (0.4280) data time 0.0012 (0.0161) model time 0.2528 (0.4119) loss 4.4005 (4.6890) grad_norm 1.6091 (2.3136) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][990/1251] eta 0:01:45 lr 0.000640 wd 0.0500 time 0.2347 (0.4033) data time 0.0010 (0.0142) model time 0.2337 (0.3891) loss 4.7107 (4.6455) grad_norm 2.0549 (2.3622) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1000/1251] eta 0:01:36 lr 0.000640 wd 0.0500 time 0.2449 (0.3843) data time 0.0010 (0.0127) 
model time 0.2439 (0.3716) loss 4.3639 (4.6077) grad_norm 2.1979 (2.3751) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1010/1251] eta 0:01:29 lr 0.000641 wd 0.0500 time 0.2384 (0.3694) data time 0.0011 (0.0115) model time 0.2373 (0.3579) loss 4.9293 (4.6274) grad_norm 2.3047 (2.3750) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1020/1251] eta 0:01:22 lr 0.000641 wd 0.0500 time 0.2347 (0.3574) data time 0.0008 (0.0105) model time 0.2339 (0.3469) loss 4.6851 (4.6467) grad_norm 2.4267 (2.3668) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1030/1251] eta 0:01:16 lr 0.000642 wd 0.0500 time 0.2400 (0.3477) data time 0.0010 (0.0097) model time 0.2390 (0.3380) loss 4.9827 (4.6402) grad_norm 2.0533 (2.3569) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1040/1251] eta 0:01:11 lr 0.000642 wd 0.0500 time 0.2414 (0.3391) data time 0.0011 (0.0090) model time 0.2403 (0.3301) loss 4.5572 (4.6206) grad_norm 2.1016 (2.3462) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1050/1251] eta 0:01:06 lr 0.000642 wd 0.0500 time 0.2422 (0.3318) data time 0.0007 (0.0084) model time 0.2415 (0.3234) loss 3.7574 (4.6102) grad_norm 2.4066 (2.3405) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1060/1251] eta 0:01:02 lr 0.000643 wd 0.0500 time 0.2455 (0.3255) data time 0.0009 (0.0079) model time 0.2446 (0.3175) loss 4.4586 (4.5941) grad_norm 1.6839 (2.3478) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [12/300][1070/1251] eta 0:00:57 lr 0.000643 wd 0.0500 time 0.2481 (0.3201) data time 0.0010 (0.0075) model time 0.2471 (0.3126) loss 4.0395 (4.5893) grad_norm 1.8117 (2.3375) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1080/1251] eta 0:00:53 lr 0.000644 wd 0.0500 time 0.2406 (0.3151) data time 0.0010 (0.0071) model time 0.2396 (0.3080) loss 4.7107 (4.5858) grad_norm 2.0088 (2.3447) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1090/1251] eta 0:00:50 lr 0.000644 wd 0.0500 time 0.2468 (0.3110) data time 0.0009 (0.0068) model time 0.2458 (0.3043) loss 4.3676 (4.5687) grad_norm 1.6072 (2.3283) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1100/1251] eta 0:00:46 lr 0.000644 wd 0.0500 time 0.2499 (0.3073) data time 0.0008 (0.0065) model time 0.2491 (0.3008) loss 4.1915 (4.5572) grad_norm 1.9378 (2.3185) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1110/1251] eta 0:00:42 lr 0.000645 wd 0.0500 time 0.2378 (0.3038) data time 0.0007 (0.0062) model time 0.2371 (0.2976) loss 5.0421 (4.5513) grad_norm 2.2505 (2.3230) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1120/1251] eta 0:00:39 lr 0.000645 wd 0.0500 time 0.2413 (0.3008) data time 0.0008 (0.0059) model time 0.2405 (0.2949) loss 4.5137 (4.5317) grad_norm 2.2027 (2.3198) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1130/1251] eta 0:00:36 lr 0.000646 wd 0.0500 time 0.2467 (0.2981) data time 0.0008 (0.0057) model time 0.2459 (0.2924) loss 4.7748 (4.5248) grad_norm 2.3982 
(2.3216) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1140/1251] eta 0:00:32 lr 0.000646 wd 0.0500 time 0.2399 (0.2955) data time 0.0012 (0.0055) model time 0.2387 (0.2900) loss 4.6142 (4.5240) grad_norm 2.1653 (2.3204) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1150/1251] eta 0:00:29 lr 0.000646 wd 0.0500 time 0.2414 (0.2932) data time 0.0008 (0.0054) model time 0.2405 (0.2879) loss 4.3796 (4.5179) grad_norm 3.3221 (2.3446) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1160/1251] eta 0:00:26 lr 0.000647 wd 0.0500 time 0.2460 (0.2911) data time 0.0011 (0.0052) model time 0.2448 (0.2859) loss 4.5292 (4.5172) grad_norm 3.0097 (2.3620) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1170/1251] eta 0:00:23 lr 0.000647 wd 0.0500 time 0.2427 (0.2892) data time 0.0007 (0.0050) model time 0.2419 (0.2841) loss 3.3816 (4.5039) grad_norm 2.1416 (2.3605) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1180/1251] eta 0:00:20 lr 0.000648 wd 0.0500 time 0.2482 (0.2873) data time 0.0008 (0.0049) model time 0.2474 (0.2824) loss 3.5533 (4.4885) grad_norm 1.5681 (2.3520) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1190/1251] eta 0:00:17 lr 0.000648 wd 0.0500 time 0.2460 (0.2857) data time 0.0011 (0.0048) model time 0.2449 (0.2809) loss 4.4565 (4.4967) grad_norm 2.4647 (2.3504) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1200/1251] eta 0:00:14 lr 0.000648 wd 
0.0500 time 0.4680 (0.2849) data time 0.0009 (0.0046) model time 0.4671 (0.2802) loss 5.4268 (4.4913) grad_norm 5.5692 (2.3574) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1210/1251] eta 0:00:11 lr 0.000649 wd 0.0500 time 0.2316 (0.2833) data time 0.0014 (0.0045) model time 0.2302 (0.2788) loss 3.0891 (4.4798) grad_norm 1.8260 (2.3681) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1220/1251] eta 0:00:08 lr 0.000649 wd 0.0500 time 0.4909 (0.2828) data time 0.0008 (0.0044) model time 0.4901 (0.2783) loss 5.2764 (4.4771) grad_norm 1.8426 (2.3729) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1230/1251] eta 0:00:05 lr 0.000650 wd 0.0500 time 0.2509 (0.2815) data time 0.0010 (0.0043) model time 0.2499 (0.2772) loss 4.8611 (4.4877) grad_norm 1.9725 (2.3765) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1240/1251] eta 0:00:03 lr 0.000650 wd 0.0500 time 0.2310 (0.2801) data time 0.0005 (0.0042) model time 0.2305 (0.2759) loss 3.7877 (4.4983) grad_norm 2.5655 (2.3773) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [12/300][1250/1251] eta 0:00:00 lr 0.000650 wd 0.0500 time 0.2249 (0.2786) data time 0.0007 (0.0042) model time 0.2242 (0.2744) loss 3.7644 (4.4915) grad_norm 2.8145 (2.3759) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-20 09:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 12 training takes 0:01:33 [2024-08-20 09:35:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-20 09:35:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 09:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.395 (0.395) Loss 0.9956 (0.9956) Acc@1 78.613 (78.613) Acc@5 93.848 (93.848) Mem 7377MB [2024-08-20 09:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.105) Loss 1.5264 (1.4360) Acc@1 65.527 (67.321) Acc@5 89.941 (89.489) Mem 7377MB [2024-08-20 09:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.092) Loss 2.0723 (1.4412) Acc@1 56.934 (66.960) Acc@5 78.906 (89.634) Mem 7377MB [2024-08-20 09:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.087) Loss 2.4160 (1.6642) Acc@1 50.488 (62.717) Acc@5 72.363 (86.013) Mem 7377MB [2024-08-20 09:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 2.3652 (1.8022) Acc@1 50.000 (60.149) Acc@5 73.926 (83.810) Mem 7377MB [2024-08-20 09:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 59.924 Acc@5 83.612 [2024-08-20 09:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 59.9% [2024-08-20 09:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 59.92% [2024-08-20 09:35:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-20 09:35:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-20 09:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.430 (0.430) Loss 3.0215 (3.0215) Acc@1 48.438 (48.438) Acc@5 73.730 (73.730) Mem 7377MB [2024-08-20 09:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.112) Loss 3.1895 (3.4714) Acc@1 38.672 (35.476) Acc@5 68.262 (61.941) Mem 7377MB [2024-08-20 09:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.097) Loss 4.0273 (3.4211) Acc@1 30.469 (36.114) Acc@5 52.344 (63.202) Mem 7377MB [2024-08-20 09:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.091) Loss 3.8379 (3.5421) Acc@1 33.887 (34.832) Acc@5 54.688 (60.755) Mem 7377MB [2024-08-20 09:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 3.9727 (3.6258) Acc@1 25.586 (33.398) Acc@5 50.879 (59.068) Mem 7377MB [2024-08-20 09:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 34.180 Acc@5 59.954 [2024-08-20 09:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 34.2% [2024-08-20 09:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 34.18% [2024-08-20 09:35:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-20 09:35:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-20 09:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][0/1251] eta 0:17:42 lr 0.000650 wd 0.0500 time 0.8494 (0.8494) data time 0.5257 (0.5257) model time 0.0000 (0.0000) loss 4.8077 (4.8077) grad_norm 1.6726 (1.6726) loss_scale 16384.0000 (16384.0000) mem 7380MB [2024-08-20 09:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][10/1251] eta 0:06:07 lr 0.000651 wd 0.0500 time 0.2416 (0.2960) data time 0.0007 (0.0489) model time 0.0000 (0.0000) loss 4.1804 (4.6494) grad_norm 1.6471 (1.8644) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][20/1251] eta 0:05:31 lr 0.000651 wd 0.0500 time 0.2461 (0.2692) data time 0.0011 (0.0262) model time 0.0000 (0.0000) loss 4.5441 (4.6172) grad_norm 3.1994 (2.2235) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][30/1251] eta 0:05:17 lr 0.000652 wd 0.0500 time 0.2418 (0.2598) data time 0.0009 (0.0181) model time 0.0000 (0.0000) loss 3.8848 (4.5466) grad_norm 2.1159 (2.3566) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][40/1251] eta 0:05:09 lr 0.000652 wd 0.0500 time 0.2455 (0.2552) data time 0.0011 (0.0139) model time 0.0000 (0.0000) loss 4.5714 (4.5147) grad_norm 1.9308 (2.3539) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][50/1251] eta 0:05:02 lr 0.000652 wd 0.0500 time 0.2504 (0.2522) data time 0.0008 (0.0114) model time 0.0000 (0.0000) loss 3.7392 (4.4207) grad_norm 2.1787 (2.3045) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][60/1251] eta 0:04:58 lr 0.000653 wd 0.0500 time 0.2444 (0.2504) data time 0.0008 (0.0097) model time 0.2436 
(0.2401) loss 4.4911 (4.4400) grad_norm 1.7263 (2.2632) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][70/1251] eta 0:04:54 lr 0.000653 wd 0.0500 time 0.2502 (0.2490) data time 0.0010 (0.0085) model time 0.2492 (0.2395) loss 5.1040 (4.4917) grad_norm 2.8604 (2.3257) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][80/1251] eta 0:04:50 lr 0.000654 wd 0.0500 time 0.2520 (0.2480) data time 0.0007 (0.0076) model time 0.2513 (0.2395) loss 5.1814 (4.4892) grad_norm 4.2143 (2.3399) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][90/1251] eta 0:04:47 lr 0.000654 wd 0.0500 time 0.2585 (0.2472) data time 0.0008 (0.0070) model time 0.2577 (0.2394) loss 5.1995 (4.4996) grad_norm 1.8464 (2.3592) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][100/1251] eta 0:04:43 lr 0.000654 wd 0.0500 time 0.2379 (0.2466) data time 0.0010 (0.0064) model time 0.2369 (0.2395) loss 4.7823 (4.5270) grad_norm 2.2226 (2.3375) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][110/1251] eta 0:04:41 lr 0.000655 wd 0.0500 time 0.2616 (0.2464) data time 0.0012 (0.0060) model time 0.2605 (0.2401) loss 4.0777 (4.5179) grad_norm 1.7221 (2.3039) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][120/1251] eta 0:04:38 lr 0.000655 wd 0.0500 time 0.2614 (0.2460) data time 0.0008 (0.0057) model time 0.2606 (0.2398) loss 4.1143 (4.4878) grad_norm 1.7680 (2.2651) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[13/300][130/1251] eta 0:04:35 lr 0.000656 wd 0.0500 time 0.2492 (0.2455) data time 0.0010 (0.0054) model time 0.2482 (0.2397) loss 3.6116 (4.4641) grad_norm 2.9446 (2.2624) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][140/1251] eta 0:04:32 lr 0.000656 wd 0.0500 time 0.2497 (0.2452) data time 0.0011 (0.0050) model time 0.2486 (0.2397) loss 3.2012 (4.4417) grad_norm 1.6618 (2.2442) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][150/1251] eta 0:04:29 lr 0.000656 wd 0.0500 time 0.2387 (0.2449) data time 0.0010 (0.0048) model time 0.2377 (0.2397) loss 4.5993 (4.4571) grad_norm 2.0596 (2.2627) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-20 09:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-20 09:36:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-20 09:36:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 09:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-20 09:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-20 09:44:22 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-20 09:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-20 09:44:35 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-20 09:44:36 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-20 09:44:38 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-20 09:44:38 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 13) [2024-08-20 09:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-20 09:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][160/1251] eta 3:51:54 lr 0.000657 wd 0.0500 time 12.7542 (12.7542) data time 0.6779 (0.6779) model time 12.0763 (12.0763) loss 5.3501 (5.3501) grad_norm 1.8023 (1.8023) loss_scale 16384.0000 (16384.0000) mem 20033MB [2024-08-20 09:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][170/1251] eta 0:24:45 lr 0.000657 wd 0.0500 time 0.2158 (1.3741) data time 0.0010 (0.0624) model time 0.2147 (1.3117) loss 3.4706 (4.7852) grad_norm 3.4331 (2.0838) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][180/1251] eta 0:14:43 lr 0.000658 wd 0.0500 time 0.2156 (0.8249) data time 0.0010 (0.0332) model time 0.2146 (0.7917) loss 4.3839 (4.7150) grad_norm 2.6027 (2.1950) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][190/1251] eta 0:11:10 lr 0.000658 wd 0.0500 time 0.2225 (0.6317) data time 0.0007 (0.0228) model time 0.2218 (0.6089) loss 3.6852 (4.7301) grad_norm 1.8399 (2.1674) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 
09:45:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][200/1251] eta 0:09:19 lr 0.000658 wd 0.0500 time 0.2221 (0.5323) data time 0.0009 (0.0175) model time 0.2212 (0.5148) loss 4.5897 (4.6477) grad_norm 1.8254 (2.2064) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][210/1251] eta 0:08:11 lr 0.000659 wd 0.0500 time 0.2222 (0.4720) data time 0.0007 (0.0143) model time 0.2215 (0.4577) loss 4.8644 (4.6342) grad_norm 2.1251 (2.2606) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][220/1251] eta 0:07:24 lr 0.000659 wd 0.0500 time 0.2223 (0.4313) data time 0.0010 (0.0121) model time 0.2213 (0.4192) loss 4.8148 (4.5973) grad_norm 5.4830 (2.3581) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][230/1251] eta 0:06:50 lr 0.000660 wd 0.0500 time 0.2268 (0.4019) data time 0.0009 (0.0105) model time 0.2259 (0.3914) loss 4.3143 (4.5509) grad_norm 2.0622 (2.3512) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][240/1251] eta 0:06:23 lr 0.000660 wd 0.0500 time 0.2276 (0.3798) data time 0.0010 (0.0094) model time 0.2266 (0.3704) loss 3.9194 (4.5316) grad_norm 2.3365 (2.3328) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][250/1251] eta 0:06:03 lr 0.000660 wd 0.0500 time 0.2209 (0.3628) data time 0.0007 (0.0085) model time 0.2202 (0.3543) loss 5.2900 (4.5063) grad_norm 2.1117 (2.3421) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][260/1251] eta 0:05:45 lr 0.000661 wd 0.0500 time 0.2212 (0.3490) data time 0.0007 (0.0077) model time 0.2205 
(0.3413) loss 4.5248 (4.5241) grad_norm 1.8794 (2.3190) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][270/1251] eta 0:05:31 lr 0.000661 wd 0.0500 time 0.2220 (0.3375) data time 0.0008 (0.0071) model time 0.2212 (0.3304) loss 3.8547 (4.5156) grad_norm 5.3486 (2.3398) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][280/1251] eta 0:05:18 lr 0.000662 wd 0.0500 time 0.2234 (0.3279) data time 0.0008 (0.0066) model time 0.2226 (0.3213) loss 3.4457 (4.5284) grad_norm 1.9614 (2.3581) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][290/1251] eta 0:05:07 lr 0.000662 wd 0.0500 time 0.2264 (0.3198) data time 0.0011 (0.0062) model time 0.2254 (0.3136) loss 4.8165 (4.5098) grad_norm 1.9992 (2.3356) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][300/1251] eta 0:04:57 lr 0.000662 wd 0.0500 time 0.2218 (0.3128) data time 0.0007 (0.0058) model time 0.2212 (0.3070) loss 4.6469 (4.4965) grad_norm 1.9833 (2.3213) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][310/1251] eta 0:04:48 lr 0.000663 wd 0.0500 time 0.2292 (0.3069) data time 0.0008 (0.0055) model time 0.2284 (0.3014) loss 3.2086 (4.4852) grad_norm 2.8591 (2.3539) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][320/1251] eta 0:04:40 lr 0.000663 wd 0.0500 time 0.2228 (0.3017) data time 0.0009 (0.0052) model time 0.2219 (0.2965) loss 4.7497 (4.4942) grad_norm 2.4330 (2.3379) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[13/300][330/1251] eta 0:04:33 lr 0.000664 wd 0.0500 time 0.2236 (0.2972) data time 0.0010 (0.0050) model time 0.2227 (0.2922) loss 4.5661 (4.4874) grad_norm 4.7719 (2.3377) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][340/1251] eta 0:04:27 lr 0.000664 wd 0.0500 time 0.2263 (0.2932) data time 0.0009 (0.0048) model time 0.2253 (0.2884) loss 4.8212 (4.4683) grad_norm 2.5246 (2.3434) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][350/1251] eta 0:04:20 lr 0.000664 wd 0.0500 time 0.2209 (0.2896) data time 0.0007 (0.0046) model time 0.2201 (0.2850) loss 4.2873 (4.4681) grad_norm 1.6980 (2.3293) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][360/1251] eta 0:04:15 lr 0.000665 wd 0.0500 time 0.2166 (0.2863) data time 0.0009 (0.0044) model time 0.2158 (0.2819) loss 4.5364 (4.4541) grad_norm 2.3657 (2.3198) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][370/1251] eta 0:04:09 lr 0.000665 wd 0.0500 time 0.2217 (0.2833) data time 0.0009 (0.0042) model time 0.2209 (0.2790) loss 4.6114 (4.4479) grad_norm 1.8503 (2.3178) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][380/1251] eta 0:04:04 lr 0.000666 wd 0.0500 time 0.2253 (0.2806) data time 0.0008 (0.0041) model time 0.2245 (0.2765) loss 4.5452 (4.4412) grad_norm 2.4901 (2.3246) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][390/1251] eta 0:03:59 lr 0.000666 wd 0.0500 time 0.2221 (0.2780) data time 0.0006 (0.0039) model time 0.2216 (0.2741) loss 3.2668 (4.4357) grad_norm 1.8577 (2.3118) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-20 09:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][400/1251] eta 0:03:54 lr 0.000666 wd 0.0500 time 0.2285 (0.2759) data time 0.0007 (0.0038) model time 0.2279 (0.2721) loss 4.4772 (4.4326) grad_norm 1.5909 (2.3023) loss_scale 32768.0000 (16995.8506) mem 7379MB [2024-08-20 09:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][410/1251] eta 0:03:50 lr 0.000667 wd 0.0500 time 0.2206 (0.2738) data time 0.0007 (0.0037) model time 0.2199 (0.2701) loss 4.9748 (4.4199) grad_norm 2.3763 (2.3030) loss_scale 32768.0000 (17624.2231) mem 7379MB [2024-08-20 09:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][420/1251] eta 0:03:45 lr 0.000667 wd 0.0500 time 0.2221 (0.2719) data time 0.0007 (0.0036) model time 0.2214 (0.2683) loss 4.5292 (4.4112) grad_norm 2.6017 (inf) loss_scale 16384.0000 (17702.2529) mem 7379MB [2024-08-20 09:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][430/1251] eta 0:03:41 lr 0.000668 wd 0.0500 time 0.2183 (0.2701) data time 0.0007 (0.0035) model time 0.2176 (0.2666) loss 5.0979 (4.4039) grad_norm 2.1393 (inf) loss_scale 16384.0000 (17653.6089) mem 7379MB [2024-08-20 09:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][440/1251] eta 0:03:37 lr 0.000668 wd 0.0500 time 0.2219 (0.2685) data time 0.0010 (0.0034) model time 0.2209 (0.2651) loss 4.6210 (4.4127) grad_norm 2.2349 (inf) loss_scale 16384.0000 (17608.4270) mem 7379MB [2024-08-20 09:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][450/1251] eta 0:03:34 lr 0.000668 wd 0.0500 time 0.2229 (0.2678) data time 0.0008 (0.0034) model time 0.2221 (0.2644) loss 3.5983 (4.4082) grad_norm 1.9641 (inf) loss_scale 16384.0000 (17566.3505) mem 7379MB [2024-08-20 09:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][460/1251] eta 0:03:30 lr 0.000669 wd 0.0500 time 0.2201 (0.2663) data time 
0.0008 (0.0033) model time 0.2192 (0.2631) loss 4.0010 (4.3997) grad_norm 1.7311 (inf) loss_scale 16384.0000 (17527.0698) mem 7379MB [2024-08-20 09:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][470/1251] eta 0:03:27 lr 0.000669 wd 0.0500 time 0.2174 (0.2657) data time 0.0010 (0.0032) model time 0.2164 (0.2625) loss 5.0301 (4.4058) grad_norm 1.9569 (inf) loss_scale 16384.0000 (17490.3151) mem 7379MB [2024-08-20 09:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][480/1251] eta 0:03:23 lr 0.000670 wd 0.0500 time 0.2247 (0.2644) data time 0.0006 (0.0031) model time 0.2241 (0.2613) loss 5.4939 (4.4216) grad_norm 1.7421 (inf) loss_scale 16384.0000 (17455.8505) mem 7379MB [2024-08-20 09:46:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][490/1251] eta 0:03:20 lr 0.000670 wd 0.0500 time 0.2176 (0.2632) data time 0.0006 (0.0031) model time 0.2169 (0.2601) loss 3.6891 (4.4207) grad_norm 2.9824 (inf) loss_scale 16384.0000 (17423.4683) mem 7379MB [2024-08-20 09:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][500/1251] eta 0:03:16 lr 0.000670 wd 0.0500 time 0.2244 (0.2621) data time 0.0008 (0.0030) model time 0.2236 (0.2591) loss 5.0044 (4.4265) grad_norm 2.0529 (inf) loss_scale 16384.0000 (17392.9853) mem 7379MB [2024-08-20 09:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][510/1251] eta 0:03:13 lr 0.000671 wd 0.0500 time 0.2183 (0.2609) data time 0.0009 (0.0030) model time 0.2174 (0.2579) loss 4.6230 (4.4264) grad_norm 1.9903 (inf) loss_scale 16384.0000 (17364.2393) mem 7379MB [2024-08-20 09:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][520/1251] eta 0:03:09 lr 0.000671 wd 0.0500 time 0.2251 (0.2599) data time 0.0006 (0.0029) model time 0.2246 (0.2570) loss 4.1366 (4.4253) grad_norm 1.5402 (inf) loss_scale 16384.0000 (17337.0859) mem 7379MB [2024-08-20 09:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [13/300][530/1251] eta 0:03:06 lr 0.000672 wd 0.0500 time 0.2249 (0.2589) data time 0.0009 (0.0029) model time 0.2241 (0.2561) loss 3.6386 (4.4243) grad_norm 1.8996 (inf) loss_scale 16384.0000 (17311.3962) mem 7379MB [2024-08-20 09:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][540/1251] eta 0:03:03 lr 0.000672 wd 0.0500 time 0.2256 (0.2581) data time 0.0007 (0.0028) model time 0.2249 (0.2553) loss 2.7941 (4.4214) grad_norm 2.2210 (inf) loss_scale 16384.0000 (17287.0551) mem 7379MB [2024-08-20 09:46:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][550/1251] eta 0:03:00 lr 0.000672 wd 0.0500 time 0.2366 (0.2572) data time 0.0006 (0.0028) model time 0.2360 (0.2545) loss 5.3218 (4.4168) grad_norm 2.4840 (inf) loss_scale 16384.0000 (17263.9591) mem 7379MB [2024-08-20 09:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][560/1251] eta 0:02:57 lr 0.000673 wd 0.0500 time 0.2198 (0.2565) data time 0.0009 (0.0027) model time 0.2189 (0.2538) loss 4.4906 (4.4212) grad_norm 3.6915 (inf) loss_scale 16384.0000 (17242.0150) mem 7379MB [2024-08-20 09:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][570/1251] eta 0:02:54 lr 0.000673 wd 0.0500 time 0.2214 (0.2557) data time 0.0009 (0.0027) model time 0.2205 (0.2530) loss 4.9958 (4.4255) grad_norm 1.7088 (inf) loss_scale 16384.0000 (17221.1387) mem 7379MB [2024-08-20 09:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][580/1251] eta 0:02:51 lr 0.000674 wd 0.0500 time 0.2275 (0.2550) data time 0.0009 (0.0026) model time 0.2266 (0.2523) loss 5.3022 (4.4239) grad_norm 4.4623 (inf) loss_scale 16384.0000 (17201.2542) mem 7379MB [2024-08-20 09:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][590/1251] eta 0:02:48 lr 0.000674 wd 0.0500 time 0.2232 (0.2542) data time 0.0006 (0.0026) model time 0.2226 (0.2516) loss 4.7936 (4.4309) grad_norm 1.7508 (inf) loss_scale 16384.0000 
(17182.2923) mem 7379MB [2024-08-20 09:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][600/1251] eta 0:02:45 lr 0.000674 wd 0.0500 time 0.2227 (0.2536) data time 0.0007 (0.0026) model time 0.2221 (0.2510) loss 5.5674 (4.4351) grad_norm 2.5275 (inf) loss_scale 16384.0000 (17164.1905) mem 7379MB [2024-08-20 09:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][610/1251] eta 0:02:42 lr 0.000675 wd 0.0500 time 0.2231 (0.2530) data time 0.0007 (0.0025) model time 0.2225 (0.2504) loss 4.2546 (4.4328) grad_norm 1.8603 (inf) loss_scale 16384.0000 (17146.8914) mem 7379MB [2024-08-20 09:46:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][620/1251] eta 0:02:39 lr 0.000675 wd 0.0500 time 0.2281 (0.2523) data time 0.0009 (0.0025) model time 0.2273 (0.2499) loss 4.8275 (4.4263) grad_norm 2.6691 (inf) loss_scale 16384.0000 (17130.3427) mem 7379MB [2024-08-20 09:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][630/1251] eta 0:02:36 lr 0.000676 wd 0.0500 time 0.2269 (0.2518) data time 0.0009 (0.0025) model time 0.2260 (0.2493) loss 4.5507 (4.4202) grad_norm 5.1076 (inf) loss_scale 16384.0000 (17114.4968) mem 7379MB [2024-08-20 09:46:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][640/1251] eta 0:02:33 lr 0.000676 wd 0.0500 time 0.2171 (0.2512) data time 0.0007 (0.0024) model time 0.2164 (0.2488) loss 4.7719 (4.4169) grad_norm 1.9785 (inf) loss_scale 16384.0000 (17099.3098) mem 7379MB [2024-08-20 09:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][650/1251] eta 0:02:30 lr 0.000676 wd 0.0500 time 0.2241 (0.2506) data time 0.0008 (0.0024) model time 0.2233 (0.2482) loss 4.8917 (4.4239) grad_norm 2.2287 (inf) loss_scale 16384.0000 (17084.7413) mem 7379MB [2024-08-20 09:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][660/1251] eta 0:02:27 lr 0.000677 wd 0.0500 time 0.2256 (0.2502) data time 0.0009 (0.0024) model 
time 0.2247 (0.2478) loss 4.5126 (4.4214) grad_norm 2.7026 (inf) loss_scale 16384.0000 (17070.7545) mem 7379MB [2024-08-20 09:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][670/1251] eta 0:02:25 lr 0.000677 wd 0.0500 time 0.2350 (0.2497) data time 0.0006 (0.0023) model time 0.2344 (0.2474) loss 4.8572 (4.4266) grad_norm 1.8420 (inf) loss_scale 16384.0000 (17057.3151) mem 7379MB [2024-08-20 09:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-20 09:46:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-20 09:46:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-20 09:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-20 09:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-20 09:52:14 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-20 09:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 06:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 06:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 06:25:51 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-21 06:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 06:26:06 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-21 06:26:08 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-21 06:26:09 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-21 06:26:09 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 13) [2024-08-21 06:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-21 06:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 06:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 06:27:55 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-21 06:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 06:28:07 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-21 06:28:08 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-21 06:28:09 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-21 06:28:09 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 13) [2024-08-21 06:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-21 06:28:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][680/1251] eta 0:56:15 lr 0.000678 wd 0.0500 time 0.3551 (5.9117) data time 0.0007 (0.4185) model time 0.3544 (5.4933) loss 5.1365 (5.0393) grad_norm 2.1640 (2.2112) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-21 06:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][690/1251] eta 0:10:58 lr 0.000678 wd 0.0500 time 0.2203 (1.1731) data time 0.0007 (0.0706) model time 0.2195 (1.1025) loss 3.8388 (4.6468) grad_norm 2.4738 (2.4900) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][700/1251] eta 0:06:48 lr 0.000678 wd 0.0500 time 0.2244 (0.7412) data time 0.0008 (0.0390) model time 0.2236 (0.7023) loss 4.8599 (4.6715) grad_norm 1.9316 (2.4306) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][710/1251] eta 0:05:13 lr 0.000679 wd 0.0500 time 0.2257 (0.5789) data time 0.0006 (0.0271) model time 0.2251 (0.5518) loss 4.6051 (4.6767) grad_norm 1.5467 (2.4060) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][720/1251] eta 0:04:22 lr 0.000679 wd 0.0500 time 0.2243 (0.4947) data time 0.0010 (0.0209) model time 0.2233 (0.4738) loss 4.3643 (4.6391) grad_norm 2.0599 (2.4058) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [13/300][730/1251] eta 0:03:50 lr 0.000679 wd 0.0500 time 0.2237 (0.4427) data time 0.0006 (0.0171) model time 0.2231 (0.4256) loss 4.6668 (4.6146) grad_norm 2.6317 (2.3690) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][740/1251] eta 0:03:28 lr 0.000680 wd 0.0500 time 0.2275 (0.4073) data time 0.0007 (0.0145) model time 0.2269 (0.3929) loss 4.9959 (4.5894) grad_norm 1.6332 (2.3282) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][750/1251] eta 0:03:11 lr 0.000680 wd 0.0500 time 0.2243 (0.3820) data time 0.0009 (0.0126) model time 0.2234 (0.3694) loss 4.7255 (4.5432) grad_norm 2.8874 (2.3900) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][760/1251] eta 0:02:58 lr 0.000681 wd 0.0500 time 0.2195 (0.3627) data time 0.0010 (0.0112) model time 0.2185 (0.3515) loss 4.2548 (4.5383) grad_norm 1.4472 (2.3359) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][770/1251] eta 0:02:47 lr 0.000681 wd 0.0500 time 0.2276 (0.3478) data time 0.0008 (0.0101) model time 0.2268 (0.3377) loss 3.0485 (4.5119) grad_norm 2.5374 (2.2973) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][780/1251] eta 0:02:38 lr 0.000681 wd 0.0500 time 0.2183 (0.3357) data time 0.0008 (0.0092) model time 0.2175 (0.3265) loss 5.0781 (4.5419) grad_norm 2.5611 (2.2927) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][790/1251] eta 0:02:30 lr 0.000682 wd 0.0500 time 0.2248 (0.3258) data time 0.0008 (0.0085) model time 0.2240 (0.3173) loss 4.7339 (4.5369) grad_norm 1.7493 (2.2752) loss_scale 
16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][800/1251] eta 0:02:23 lr 0.000682 wd 0.0500 time 0.2195 (0.3173) data time 0.0008 (0.0079) model time 0.2187 (0.3095) loss 4.8714 (4.5317) grad_norm 1.7299 (2.2733) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][810/1251] eta 0:02:16 lr 0.000683 wd 0.0500 time 0.2238 (0.3102) data time 0.0009 (0.0073) model time 0.2229 (0.3028) loss 4.7953 (4.5207) grad_norm 1.7168 (2.2712) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][820/1251] eta 0:02:11 lr 0.000683 wd 0.0500 time 0.2261 (0.3042) data time 0.0008 (0.0069) model time 0.2253 (0.2973) loss 4.6823 (4.5048) grad_norm 2.1245 (2.2807) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][830/1251] eta 0:02:05 lr 0.000683 wd 0.0500 time 0.2245 (0.2990) data time 0.0010 (0.0065) model time 0.2235 (0.2925) loss 4.7851 (4.4995) grad_norm 2.8385 (2.2833) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][840/1251] eta 0:02:00 lr 0.000684 wd 0.0500 time 0.2269 (0.2943) data time 0.0008 (0.0062) model time 0.2261 (0.2882) loss 4.4125 (4.4990) grad_norm 2.6156 (2.2889) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][850/1251] eta 0:01:56 lr 0.000684 wd 0.0500 time 0.2289 (0.2903) data time 0.0009 (0.0059) model time 0.2280 (0.2844) loss 3.7635 (4.4904) grad_norm 2.8549 (2.2780) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][860/1251] eta 0:01:52 lr 0.000685 wd 0.0500 time 0.2302 (0.2868) 
data time 0.0009 (0.0056) model time 0.2293 (0.2812) loss 4.6177 (4.4756) grad_norm 2.3108 (2.2772) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][870/1251] eta 0:01:48 lr 0.000685 wd 0.0500 time 0.2259 (0.2836) data time 0.0009 (0.0054) model time 0.2250 (0.2782) loss 4.5740 (4.4731) grad_norm 3.8894 (2.3015) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][880/1251] eta 0:01:44 lr 0.000685 wd 0.0500 time 0.2228 (0.2806) data time 0.0006 (0.0052) model time 0.2222 (0.2754) loss 4.9477 (4.4645) grad_norm 3.6941 (2.3059) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][890/1251] eta 0:01:40 lr 0.000686 wd 0.0500 time 0.2233 (0.2780) data time 0.0009 (0.0050) model time 0.2225 (0.2730) loss 4.1823 (4.4515) grad_norm 2.6631 (2.3344) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][900/1251] eta 0:01:36 lr 0.000686 wd 0.0500 time 0.2242 (0.2756) data time 0.0008 (0.0048) model time 0.2234 (0.2708) loss 5.2625 (4.4483) grad_norm 3.8223 (2.3465) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][910/1251] eta 0:01:33 lr 0.000687 wd 0.0500 time 0.2239 (0.2735) data time 0.0008 (0.0046) model time 0.2232 (0.2689) loss 4.5696 (4.4407) grad_norm 1.7839 (2.3548) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][920/1251] eta 0:01:29 lr 0.000687 wd 0.0500 time 0.2281 (0.2714) data time 0.0006 (0.0045) model time 0.2275 (0.2669) loss 4.7454 (4.4401) grad_norm 2.0965 (2.3533) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:21 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [13/300][930/1251] eta 0:01:26 lr 0.000687 wd 0.0500 time 0.2241 (0.2696) data time 0.0007 (0.0043) model time 0.2234 (0.2652) loss 5.2404 (4.4320) grad_norm 2.1813 (2.3419) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][940/1251] eta 0:01:23 lr 0.000688 wd 0.0500 time 0.2374 (0.2679) data time 0.0010 (0.0042) model time 0.2364 (0.2637) loss 4.5256 (4.4180) grad_norm 1.6686 (2.3262) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][950/1251] eta 0:01:20 lr 0.000688 wd 0.0500 time 0.2243 (0.2664) data time 0.0008 (0.0041) model time 0.2235 (0.2622) loss 3.9712 (4.4114) grad_norm 2.4520 (2.3169) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][960/1251] eta 0:01:17 lr 0.000689 wd 0.0500 time 0.2334 (0.2649) data time 0.0009 (0.0040) model time 0.2325 (0.2609) loss 3.1363 (4.4131) grad_norm 3.0913 (2.3263) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][970/1251] eta 0:01:14 lr 0.000689 wd 0.0500 time 0.2273 (0.2643) data time 0.0011 (0.0039) model time 0.2262 (0.2604) loss 3.8312 (4.4090) grad_norm 4.5637 (2.3463) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][980/1251] eta 0:01:11 lr 0.000689 wd 0.0500 time 0.2309 (0.2630) data time 0.0009 (0.0038) model time 0.2300 (0.2592) loss 4.7213 (4.3995) grad_norm 1.5925 (2.3449) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][990/1251] eta 0:01:08 lr 0.000690 wd 0.0500 time 0.2315 (0.2627) data time 0.0009 (0.0037) model time 0.2307 (0.2590) loss 4.7475 (4.4018) 
grad_norm 1.8585 (2.3336) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1000/1251] eta 0:01:05 lr 0.000690 wd 0.0500 time 0.2285 (0.2616) data time 0.0008 (0.0036) model time 0.2277 (0.2579) loss 4.6195 (4.4157) grad_norm 1.6903 (2.3305) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1010/1251] eta 0:01:02 lr 0.000691 wd 0.0500 time 0.2195 (0.2604) data time 0.0009 (0.0035) model time 0.2185 (0.2569) loss 4.8492 (4.4172) grad_norm 1.7796 (2.3207) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1020/1251] eta 0:00:59 lr 0.000691 wd 0.0500 time 0.2241 (0.2594) data time 0.0006 (0.0035) model time 0.2234 (0.2560) loss 4.7054 (4.4194) grad_norm 1.6332 (2.3040) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1030/1251] eta 0:00:57 lr 0.000691 wd 0.0500 time 0.2204 (0.2585) data time 0.0007 (0.0034) model time 0.2196 (0.2551) loss 5.4025 (4.4239) grad_norm 2.4378 (2.3017) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1040/1251] eta 0:00:54 lr 0.000692 wd 0.0500 time 0.2241 (0.2577) data time 0.0007 (0.0033) model time 0.2234 (0.2544) loss 5.1965 (4.4258) grad_norm 2.8099 (2.3170) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 06:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-21 06:29:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-21 06:29:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-21 07:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 07:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 07:05:45 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-21 07:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 07:05:54 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-21 07:05:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-21 07:05:57 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-21 07:05:57 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 13) [2024-08-21 07:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-21 07:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1050/1251] eta 0:08:43 lr 0.000692 wd 0.0500 time 0.2425 (2.6027) data time 0.0009 (0.3796) model time 0.2416 (2.2232) loss 5.0094 (5.0763) grad_norm 1.7529 (2.2545) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1060/1251] eta 0:04:19 lr 0.000693 wd 0.0500 time 0.2363 (1.3571) data time 0.0011 (0.1804) model time 0.2352 (1.1766) loss 4.4670 (4.7912) grad_norm 2.1016 (2.2981) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [13/300][1070/1251] eta 0:02:55 lr 0.000693 wd 0.0500 time 0.2363 (0.9707) data time 0.0008 (0.1186) model time 0.2355 (0.8521) loss 4.9842 (4.8227) grad_norm 2.6460 (2.3511) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1080/1251] eta 0:02:13 lr 0.000693 wd 0.0500 time 0.2389 (0.7822) data time 0.0010 (0.0885) model time 0.2380 (0.6937) loss 4.4691 (4.7419) grad_norm 1.9829 (2.3321) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1090/1251] eta 0:01:47 lr 0.000694 wd 0.0500 time 0.2344 (0.6707) data time 0.0011 (0.0707) model time 0.2333 (0.6000) loss 3.9678 (4.6929) grad_norm 2.9776 (2.2844) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1100/1251] eta 0:01:30 lr 0.000694 wd 0.0500 time 0.2364 (0.5967) data time 0.0007 (0.0589) model time 0.2356 (0.5378) loss 4.0110 (4.6366) grad_norm 1.8849 (2.2994) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1110/1251] eta 0:01:16 lr 0.000695 wd 0.0500 time 0.2268 (0.5441) data time 0.0011 (0.0505) model time 0.2257 (0.4936) loss 4.0300 (4.5947) grad_norm 2.1478 (2.3214) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1120/1251] eta 0:01:06 lr 0.000695 wd 0.0500 time 0.2404 (0.5049) data time 0.0012 (0.0443) model time 0.2392 (0.4607) loss 4.4954 (4.5598) grad_norm 1.5671 (2.2773) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1130/1251] eta 0:00:57 lr 0.000695 wd 0.0500 time 0.2253 (0.4745) data time 0.0009 (0.0394) model time 0.2244 (0.4351) loss 4.6183 (4.5353) grad_norm 5.7573 (2.3381) 
loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1140/1251] eta 0:00:49 lr 0.000696 wd 0.0500 time 0.2301 (0.4503) data time 0.0012 (0.0355) model time 0.2289 (0.4147) loss 4.5302 (4.5439) grad_norm 2.6930 (2.3191) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1150/1251] eta 0:00:43 lr 0.000696 wd 0.0500 time 0.2376 (0.4305) data time 0.0008 (0.0324) model time 0.2367 (0.3981) loss 5.2106 (4.5553) grad_norm 2.2200 (2.3041) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1160/1251] eta 0:00:37 lr 0.000697 wd 0.0500 time 0.2331 (0.4143) data time 0.0010 (0.0298) model time 0.2321 (0.3845) loss 4.7551 (4.5504) grad_norm 2.0291 (2.2724) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1170/1251] eta 0:00:32 lr 0.000697 wd 0.0500 time 0.2367 (0.4005) data time 0.0007 (0.0275) model time 0.2360 (0.3730) loss 4.5817 (4.5338) grad_norm 2.3675 (2.2601) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1180/1251] eta 0:00:27 lr 0.000697 wd 0.0500 time 0.2325 (0.3885) data time 0.0008 (0.0256) model time 0.2317 (0.3628) loss 5.1678 (4.5283) grad_norm 1.7782 (2.2438) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1190/1251] eta 0:00:23 lr 0.000698 wd 0.0500 time 0.2255 (0.3779) data time 0.0009 (0.0240) model time 0.2246 (0.3539) loss 3.9315 (4.5110) grad_norm 1.9292 (2.2450) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1200/1251] eta 0:00:18 lr 0.000698 wd 0.0500 time 
0.2278 (0.3689) data time 0.0008 (0.0226) model time 0.2271 (0.3463) loss 5.2512 (4.5041) grad_norm 1.5977 (2.2466) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1210/1251] eta 0:00:15 lr 0.000699 wd 0.0500 time 0.3303 (0.3687) data time 0.0012 (0.0221) model time 0.3291 (0.3466) loss 4.3791 (4.4998) grad_norm 1.9028 (2.2372) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1220/1251] eta 0:00:11 lr 0.000699 wd 0.0500 time 0.2331 (0.3611) data time 0.0008 (0.0209) model time 0.2323 (0.3402) loss 5.0499 (4.4784) grad_norm 2.6337 (2.2364) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1230/1251] eta 0:00:07 lr 0.000699 wd 0.0500 time 0.2348 (0.3579) data time 0.0008 (0.0199) model time 0.2339 (0.3380) loss 5.6517 (4.4768) grad_norm 2.1162 (2.2173) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1240/1251] eta 0:00:03 lr 0.000700 wd 0.0500 time 0.2338 (0.3515) data time 0.0007 (0.0190) model time 0.2331 (0.3325) loss 3.5423 (4.4538) grad_norm 2.1190 (2.2282) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [13/300][1250/1251] eta 0:00:00 lr 0.000700 wd 0.0500 time 0.2261 (0.3454) data time 0.0005 (0.0181) model time 0.2256 (0.3273) loss 4.8625 (4.4482) grad_norm 2.2983 (2.2437) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-21 07:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 13 training takes 0:01:12 [2024-08-21 07:07:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-21 07:07:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-21 07:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.715 (0.715) Loss 0.9688 (0.9688) Acc@1 80.469 (80.469) Acc@5 94.238 (94.238) Mem 7379MB [2024-08-21 07:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.143) Loss 1.4346 (1.4199) Acc@1 66.699 (68.084) Acc@5 90.625 (89.959) Mem 7379MB [2024-08-21 07:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.110) Loss 2.0664 (1.4250) Acc@1 56.836 (67.834) Acc@5 80.273 (90.007) Mem 7379MB [2024-08-21 07:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.101) Loss 2.2773 (1.6147) Acc@1 52.148 (64.324) Acc@5 74.609 (86.741) Mem 7379MB [2024-08-21 07:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 2.3125 (1.7326) Acc@1 49.707 (61.866) Acc@5 76.465 (84.894) Mem 7379MB [2024-08-21 07:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 61.568 Acc@5 84.742 [2024-08-21 07:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 61.6% [2024-08-21 07:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 61.57% [2024-08-21 07:07:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-21 07:07:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-21 07:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.515 (0.515) Loss 2.4395 (2.4395) Acc@1 56.836 (56.836) Acc@5 79.980 (79.980) Mem 7379MB [2024-08-21 07:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.120) Loss 2.7227 (2.9519) Acc@1 46.387 (42.969) Acc@5 75.391 (70.011) Mem 7379MB [2024-08-21 07:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.100) Loss 3.5508 (2.9191) Acc@1 34.961 (43.383) Acc@5 60.156 (70.843) Mem 7379MB [2024-08-21 07:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.092) Loss 3.4492 (3.0701) Acc@1 39.453 (41.614) Acc@5 61.426 (67.918) Mem 7379MB [2024-08-21 07:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 3.6016 (3.1706) Acc@1 32.129 (39.870) Acc@5 56.934 (66.054) Mem 7379MB [2024-08-21 07:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 40.556 Acc@5 66.662 [2024-08-21 07:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 40.6% [2024-08-21 07:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 40.56% [2024-08-21 07:07:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-21 07:07:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-21 07:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][0/1251] eta 0:17:29 lr 0.000700 wd 0.0500 time 0.8390 (0.8390) data time 0.5090 (0.5090) model time 0.0000 (0.0000) loss 4.3072 (4.3072) grad_norm 2.9610 (2.9610) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][10/1251] eta 0:05:59 lr 0.000701 wd 0.0500 time 0.2377 (0.2899) data time 0.0010 (0.0473) model time 0.0000 (0.0000) loss 3.7927 (4.1925) grad_norm 2.3326 (2.1401) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][20/1251] eta 0:05:24 lr 0.000701 wd 0.0500 time 0.2383 (0.2638) data time 0.0010 (0.0252) model time 0.0000 (0.0000) loss 4.9777 (4.3419) grad_norm 5.4279 (2.6814) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][30/1251] eta 0:05:10 lr 0.000701 wd 0.0500 time 0.2317 (0.2542) data time 0.0011 (0.0175) model time 0.0000 (0.0000) loss 4.7126 (4.3077) grad_norm 1.6362 (2.5829) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][40/1251] eta 0:05:02 lr 0.000702 wd 0.0500 time 0.2300 (0.2494) data time 0.0008 (0.0135) model time 0.0000 (0.0000) loss 3.3939 (4.2800) grad_norm 2.6631 (2.4603) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][50/1251] eta 0:04:56 lr 0.000702 wd 0.0500 time 0.2345 (0.2471) data time 0.0011 (0.0111) model time 0.0000 (0.0000) loss 3.3283 (4.2502) grad_norm 1.8231 (2.4288) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][60/1251] eta 0:04:52 lr 0.000703 wd 0.0500 time 0.2349 (0.2454) data time 0.0008 (0.0094) model time 0.2341 
(0.2354) loss 5.6247 (4.2421) grad_norm 2.1123 (2.3883) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][70/1251] eta 0:04:48 lr 0.000703 wd 0.0500 time 0.2523 (0.2445) data time 0.0013 (0.0083) model time 0.2510 (0.2367) loss 4.5071 (4.2977) grad_norm 1.9093 (2.4181) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][80/1251] eta 0:04:49 lr 0.000703 wd 0.0500 time 0.2357 (0.2472) data time 0.0010 (0.0074) model time 0.2347 (0.2461) loss 3.7467 (4.3047) grad_norm 1.9916 (2.4466) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][90/1251] eta 0:04:45 lr 0.000704 wd 0.0500 time 0.2357 (0.2458) data time 0.0008 (0.0067) model time 0.2349 (0.2430) loss 4.1669 (4.2617) grad_norm 2.0951 (2.3903) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][100/1251] eta 0:04:41 lr 0.000704 wd 0.0500 time 0.2271 (0.2446) data time 0.0011 (0.0062) model time 0.2260 (0.2409) loss 4.1437 (4.2629) grad_norm 1.8641 (2.3811) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][110/1251] eta 0:04:38 lr 0.000705 wd 0.0500 time 0.2364 (0.2438) data time 0.0010 (0.0057) model time 0.2354 (0.2399) loss 4.1995 (4.3019) grad_norm 2.6558 (2.3604) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][120/1251] eta 0:04:35 lr 0.000705 wd 0.0500 time 0.2357 (0.2433) data time 0.0008 (0.0054) model time 0.2349 (0.2394) loss 4.6867 (4.3241) grad_norm 2.4092 (2.3833) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[14/300][130/1251] eta 0:04:32 lr 0.000705 wd 0.0500 time 0.2410 (0.2428) data time 0.0010 (0.0050) model time 0.2400 (0.2388) loss 4.4340 (4.3222) grad_norm 2.1965 (2.3812) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][140/1251] eta 0:04:29 lr 0.000706 wd 0.0500 time 0.2417 (0.2423) data time 0.0011 (0.0048) model time 0.2406 (0.2383) loss 3.7953 (4.3292) grad_norm 2.0879 (2.3672) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][150/1251] eta 0:04:26 lr 0.000706 wd 0.0500 time 0.2270 (0.2418) data time 0.0009 (0.0045) model time 0.2261 (0.2379) loss 5.3081 (4.3331) grad_norm 2.1666 (2.3737) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][160/1251] eta 0:04:23 lr 0.000707 wd 0.0500 time 0.2448 (0.2415) data time 0.0011 (0.0043) model time 0.2437 (0.2377) loss 4.7230 (4.3341) grad_norm 1.9593 (2.3717) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][170/1251] eta 0:04:20 lr 0.000707 wd 0.0500 time 0.2378 (0.2412) data time 0.0008 (0.0041) model time 0.2370 (0.2376) loss 4.4753 (4.3308) grad_norm 2.2613 (2.3655) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][180/1251] eta 0:04:18 lr 0.000707 wd 0.0500 time 0.2330 (0.2409) data time 0.0011 (0.0040) model time 0.2319 (0.2373) loss 3.4867 (4.3145) grad_norm 2.9736 (2.3789) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][190/1251] eta 0:04:15 lr 0.000708 wd 0.0500 time 0.2336 (0.2405) data time 0.0010 (0.0038) model time 0.2326 (0.2369) loss 4.5776 (4.3277) grad_norm 1.8994 (2.3796) loss_scale 
16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][200/1251] eta 0:04:12 lr 0.000708 wd 0.0500 time 0.2389 (0.2400) data time 0.0008 (0.0037) model time 0.2381 (0.2364) loss 4.6532 (4.3444) grad_norm 1.6433 (2.3624) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][210/1251] eta 0:04:10 lr 0.000709 wd 0.0500 time 0.2312 (0.2411) data time 0.0008 (0.0036) model time 0.2304 (0.2380) loss 4.8467 (4.3440) grad_norm 2.0204 (2.3448) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][220/1251] eta 0:04:08 lr 0.000709 wd 0.0500 time 0.2314 (0.2407) data time 0.0010 (0.0035) model time 0.2304 (0.2377) loss 4.7528 (4.3627) grad_norm 1.9983 (2.3225) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][230/1251] eta 0:04:05 lr 0.000709 wd 0.0500 time 0.2372 (0.2405) data time 0.0011 (0.0034) model time 0.2361 (0.2374) loss 4.0714 (4.3741) grad_norm 2.0149 (2.3216) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][240/1251] eta 0:04:02 lr 0.000710 wd 0.0500 time 0.2280 (0.2400) data time 0.0008 (0.0033) model time 0.2272 (0.2370) loss 4.4745 (4.3751) grad_norm 2.0328 (2.3188) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][250/1251] eta 0:04:00 lr 0.000710 wd 0.0500 time 0.2377 (0.2403) data time 0.0008 (0.0032) model time 0.2369 (0.2374) loss 4.5162 (4.3599) grad_norm 3.7931 (2.3197) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 07:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-21 07:08:29 
msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-21 07:08:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-21 08:55:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 08:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 08:55:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-21 08:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 08:55:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-21 08:55:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-21 08:55:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-21 08:55:42 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 14) [2024-08-21 08:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-21 09:03:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 09:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 09:03:14 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-21 09:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 09:03:28 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-21 09:03:29 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-21 09:03:30 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-21 09:03:30 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 14) [2024-08-21 09:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-21 09:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][260/1251] eta 0:41:11 lr 0.000711 wd 0.0500 time 0.2260 (2.4944) data time 0.0008 (0.1189) model time 0.2251 (2.3755) loss 5.3944 (5.0435) grad_norm 1.8596 (1.7337) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][270/1251] eta 0:16:02 lr 0.000711 wd 0.0500 time 0.2250 (0.9811) data time 0.0008 (0.0403) model time 0.2242 (0.9408) loss 4.9354 (4.8443) grad_norm 2.4678 (2.1259) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][280/1251] eta 0:10:57 lr 0.000711 wd 0.0500 time 0.2213 (0.6776) data time 0.0009 (0.0246) model time 0.2204 (0.6530) loss 4.4316 (4.7909) grad_norm 2.6284 (2.1046) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][290/1251] eta 0:08:46 lr 0.000712 wd 0.0500 time 0.2271 (0.5482) data time 0.0009 (0.0179) model time 0.2262 (0.5303) loss 4.3155 (4.7524) grad_norm 1.5371 (2.1451) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:03:55 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][300/1251] eta 0:07:32 lr 0.000712 wd 0.0500 time 0.2247 (0.4761) data time 0.0008 (0.0141) model time 0.2239 (0.4619) loss 4.8291 (4.6889) grad_norm 1.7805 (2.1666) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][310/1251] eta 0:06:44 lr 0.000713 wd 0.0500 time 0.2249 (0.4303) data time 0.0008 (0.0117) model time 0.2241 (0.4186) loss 3.7953 (4.6451) grad_norm 2.8193 (2.1729) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][320/1251] eta 0:06:10 lr 0.000713 wd 0.0500 time 0.2209 (0.3984) data time 0.0008 (0.0101) model time 0.2201 (0.3884) loss 4.5185 (4.6088) grad_norm 2.1172 (2.2042) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][330/1251] eta 0:05:45 lr 0.000713 wd 0.0500 time 0.2265 (0.3750) data time 0.0010 (0.0089) model time 0.2255 (0.3662) loss 3.8379 (4.5516) grad_norm 2.6295 (2.2112) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][340/1251] eta 0:05:25 lr 0.000714 wd 0.0500 time 0.2228 (0.3573) data time 0.0007 (0.0079) model time 0.2221 (0.3493) loss 4.3380 (4.5222) grad_norm 1.7346 (2.2671) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][350/1251] eta 0:05:09 lr 0.000714 wd 0.0500 time 0.2211 (0.3432) data time 0.0009 (0.0072) model time 0.2202 (0.3360) loss 4.4094 (4.5125) grad_norm 3.1863 (2.3248) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][360/1251] eta 0:04:55 lr 0.000715 wd 0.0500 time 0.2233 (0.3317) data time 0.0009 (0.0066) model time 0.2224 (0.3251) loss 
4.1392 (4.5432) grad_norm 3.7395 (2.3431) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][370/1251] eta 0:04:44 lr 0.000715 wd 0.0500 time 0.2203 (0.3224) data time 0.0007 (0.0061) model time 0.2196 (0.3163) loss 3.8459 (4.5224) grad_norm 1.8543 (2.3123) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][380/1251] eta 0:04:34 lr 0.000715 wd 0.0500 time 0.2250 (0.3146) data time 0.0006 (0.0057) model time 0.2243 (0.3089) loss 3.6163 (4.5138) grad_norm 1.6385 (2.2964) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][390/1251] eta 0:04:25 lr 0.000716 wd 0.0500 time 0.2250 (0.3080) data time 0.0007 (0.0054) model time 0.2243 (0.3027) loss 4.3393 (4.5090) grad_norm 1.6430 (2.3074) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][400/1251] eta 0:04:17 lr 0.000716 wd 0.0500 time 0.2162 (0.3020) data time 0.0009 (0.0051) model time 0.2152 (0.2970) loss 4.6726 (4.4852) grad_norm 1.9519 (2.2842) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][410/1251] eta 0:04:09 lr 0.000717 wd 0.0500 time 0.2292 (0.2970) data time 0.0009 (0.0048) model time 0.2283 (0.2922) loss 4.6883 (4.4811) grad_norm 3.3054 (2.2858) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][420/1251] eta 0:04:03 lr 0.000717 wd 0.0500 time 0.2297 (0.2926) data time 0.0008 (0.0046) model time 0.2290 (0.2880) loss 4.6962 (4.4814) grad_norm 1.7355 (2.2812) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][430/1251] eta 
0:03:56 lr 0.000717 wd 0.0500 time 0.2224 (0.2887) data time 0.0009 (0.0044) model time 0.2216 (0.2843) loss 4.8395 (4.4665) grad_norm 3.7775 (2.3055) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][440/1251] eta 0:03:51 lr 0.000718 wd 0.0500 time 0.2254 (0.2852) data time 0.0010 (0.0042) model time 0.2244 (0.2810) loss 4.4820 (4.4580) grad_norm 1.6410 (2.2973) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][450/1251] eta 0:03:45 lr 0.000718 wd 0.0500 time 0.2257 (0.2821) data time 0.0009 (0.0040) model time 0.2248 (0.2781) loss 3.4376 (4.4476) grad_norm 2.4829 (2.2884) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][460/1251] eta 0:03:40 lr 0.000719 wd 0.0500 time 0.2278 (0.2792) data time 0.0008 (0.0039) model time 0.2270 (0.2753) loss 3.6698 (4.4374) grad_norm 1.7387 (2.2814) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][470/1251] eta 0:03:36 lr 0.000719 wd 0.0500 time 0.2254 (0.2766) data time 0.0008 (0.0038) model time 0.2246 (0.2728) loss 4.3307 (4.4271) grad_norm 2.8547 (2.2865) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][480/1251] eta 0:03:31 lr 0.000719 wd 0.0500 time 0.2258 (0.2742) data time 0.0008 (0.0036) model time 0.2250 (0.2706) loss 4.5211 (4.4271) grad_norm 2.3494 (2.2777) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][490/1251] eta 0:03:27 lr 0.000720 wd 0.0500 time 0.2224 (0.2721) data time 0.0010 (0.0035) model time 0.2214 (0.2686) loss 5.0080 (4.4161) grad_norm 3.3174 (2.2905) loss_scale 16384.0000 (16384.0000) mem 
7377MB [2024-08-21 09:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][500/1251] eta 0:03:22 lr 0.000720 wd 0.0500 time 0.2227 (0.2701) data time 0.0010 (0.0034) model time 0.2217 (0.2667) loss 4.0870 (4.4147) grad_norm 1.9747 (2.3076) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][510/1251] eta 0:03:18 lr 0.000721 wd 0.0500 time 0.2323 (0.2682) data time 0.0007 (0.0033) model time 0.2316 (0.2649) loss 3.8880 (4.4035) grad_norm 2.1206 (2.2966) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][520/1251] eta 0:03:14 lr 0.000721 wd 0.0500 time 0.2252 (0.2666) data time 0.0007 (0.0032) model time 0.2245 (0.2633) loss 3.5434 (4.3948) grad_norm 2.6111 (2.2918) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][530/1251] eta 0:03:11 lr 0.000721 wd 0.0500 time 0.2218 (0.2650) data time 0.0008 (0.0032) model time 0.2209 (0.2618) loss 4.8423 (4.3944) grad_norm 1.8089 (2.3052) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][540/1251] eta 0:03:07 lr 0.000722 wd 0.0500 time 0.2237 (0.2635) data time 0.0008 (0.0031) model time 0.2229 (0.2604) loss 4.8465 (4.3897) grad_norm 2.1235 (2.3043) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][550/1251] eta 0:03:04 lr 0.000722 wd 0.0500 time 0.2282 (0.2629) data time 0.0010 (0.0030) model time 0.2273 (0.2599) loss 4.8148 (4.3830) grad_norm 4.5542 (2.3213) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][560/1251] eta 0:03:00 lr 0.000723 wd 0.0500 time 0.2219 (0.2616) data time 0.0009 (0.0029) model 
time 0.2210 (0.2586) loss 3.5946 (4.3749) grad_norm 1.7116 (2.3132) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][570/1251] eta 0:02:57 lr 0.000723 wd 0.0500 time 0.2215 (0.2612) data time 0.0009 (0.0029) model time 0.2205 (0.2583) loss 4.4202 (4.3767) grad_norm 1.5490 (2.3079) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][580/1251] eta 0:02:54 lr 0.000723 wd 0.0500 time 0.2257 (0.2601) data time 0.0007 (0.0028) model time 0.2251 (0.2573) loss 5.2129 (4.3890) grad_norm 2.5725 (2.2962) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][590/1251] eta 0:02:51 lr 0.000724 wd 0.0500 time 0.2284 (0.2591) data time 0.0007 (0.0028) model time 0.2277 (0.2563) loss 4.7410 (4.3848) grad_norm 2.4821 (2.3030) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][600/1251] eta 0:02:48 lr 0.000724 wd 0.0500 time 0.2232 (0.2581) data time 0.0008 (0.0027) model time 0.2224 (0.2553) loss 4.0812 (4.3869) grad_norm 2.1862 (2.3038) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][610/1251] eta 0:02:44 lr 0.000725 wd 0.0500 time 0.2220 (0.2572) data time 0.0011 (0.0027) model time 0.2209 (0.2545) loss 4.1788 (4.3901) grad_norm 4.3056 (2.3100) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][620/1251] eta 0:02:41 lr 0.000725 wd 0.0500 time 0.2304 (0.2563) data time 0.0008 (0.0026) model time 0.2296 (0.2537) loss 4.3032 (4.3845) grad_norm 2.8000 (2.3150) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [14/300][630/1251] eta 0:02:38 lr 0.000725 wd 0.0500 time 0.2254 (0.2555) data time 0.0008 (0.0026) model time 0.2246 (0.2529) loss 4.0482 (4.3836) grad_norm 1.9383 (2.3153) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][640/1251] eta 0:02:35 lr 0.000726 wd 0.0500 time 0.2203 (0.2546) data time 0.0007 (0.0025) model time 0.2196 (0.2521) loss 3.3712 (4.3748) grad_norm 1.4516 (2.3084) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][650/1251] eta 0:02:32 lr 0.000726 wd 0.0500 time 0.2331 (0.2539) data time 0.0008 (0.0025) model time 0.2323 (0.2514) loss 4.4823 (4.3780) grad_norm 2.1796 (2.3062) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][660/1251] eta 0:02:29 lr 0.000727 wd 0.0500 time 0.2245 (0.2532) data time 0.0006 (0.0025) model time 0.2239 (0.2507) loss 4.1552 (4.3830) grad_norm 2.0488 (2.2978) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][670/1251] eta 0:02:26 lr 0.000727 wd 0.0500 time 0.2327 (0.2525) data time 0.0009 (0.0024) model time 0.2318 (0.2501) loss 4.7380 (4.3858) grad_norm 1.6496 (2.2883) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][680/1251] eta 0:02:23 lr 0.000727 wd 0.0500 time 0.2227 (0.2519) data time 0.0007 (0.0024) model time 0.2219 (0.2495) loss 4.9397 (4.3843) grad_norm 1.8267 (2.2812) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][690/1251] eta 0:02:21 lr 0.000728 wd 0.0500 time 0.2299 (0.2514) data time 0.0009 (0.0024) model time 0.2290 (0.2490) loss 4.5077 (4.3923) grad_norm 1.7703 (2.2735) loss_scale 
16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][700/1251] eta 0:02:18 lr 0.000728 wd 0.0500 time 0.2244 (0.2508) data time 0.0010 (0.0023) model time 0.2234 (0.2484) loss 4.2670 (4.3946) grad_norm 2.1099 (2.2690) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][710/1251] eta 0:02:15 lr 0.000729 wd 0.0500 time 0.2246 (0.2502) data time 0.0007 (0.0023) model time 0.2238 (0.2479) loss 4.1684 (4.3909) grad_norm 1.8333 (2.2726) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][720/1251] eta 0:02:12 lr 0.000729 wd 0.0500 time 0.2208 (0.2497) data time 0.0008 (0.0023) model time 0.2200 (0.2474) loss 3.6186 (4.3848) grad_norm 1.4951 (2.2750) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][730/1251] eta 0:02:09 lr 0.000729 wd 0.0500 time 0.2277 (0.2492) data time 0.0008 (0.0023) model time 0.2269 (0.2469) loss 4.1930 (4.3799) grad_norm 2.7525 (2.2771) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][740/1251] eta 0:02:07 lr 0.000730 wd 0.0500 time 0.2241 (0.2487) data time 0.0014 (0.0022) model time 0.2226 (0.2465) loss 5.2890 (4.3845) grad_norm 1.8094 (2.2740) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][750/1251] eta 0:02:04 lr 0.000730 wd 0.0500 time 0.2200 (0.2482) data time 0.0008 (0.0022) model time 0.2192 (0.2460) loss 4.6712 (4.3843) grad_norm 2.2812 (2.2748) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][760/1251] eta 0:02:01 lr 0.000731 wd 0.0500 time 0.2204 (0.2478) 
data time 0.0010 (0.0022) model time 0.2194 (0.2456) loss 4.8277 (4.3824) grad_norm 1.7188 (2.2698) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][770/1251] eta 0:01:58 lr 0.000731 wd 0.0500 time 0.2200 (0.2473) data time 0.0009 (0.0022) model time 0.2190 (0.2452) loss 4.5654 (4.3907) grad_norm 1.7569 (2.2678) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][780/1251] eta 0:01:56 lr 0.000731 wd 0.0500 time 0.2293 (0.2469) data time 0.0009 (0.0021) model time 0.2284 (0.2448) loss 5.2277 (4.3834) grad_norm 1.9636 (2.2646) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][790/1251] eta 0:01:53 lr 0.000732 wd 0.0500 time 0.2290 (0.2465) data time 0.0008 (0.0021) model time 0.2282 (0.2444) loss 3.5192 (4.3758) grad_norm 1.5140 (2.2618) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][800/1251] eta 0:01:50 lr 0.000732 wd 0.0500 time 0.2224 (0.2461) data time 0.0008 (0.0021) model time 0.2215 (0.2440) loss 4.0449 (4.3765) grad_norm 2.1290 (2.2593) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][810/1251] eta 0:01:48 lr 0.000733 wd 0.0500 time 0.2147 (0.2457) data time 0.0009 (0.0021) model time 0.2138 (0.2436) loss 4.6532 (4.3825) grad_norm 1.9016 (2.2527) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][820/1251] eta 0:01:45 lr 0.000733 wd 0.0500 time 0.2279 (0.2453) data time 0.0010 (0.0021) model time 0.2269 (0.2433) loss 4.5697 (4.3850) grad_norm 1.8184 (2.2509) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:55 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [14/300][830/1251] eta 0:01:43 lr 0.000733 wd 0.0500 time 0.2233 (0.2450) data time 0.0011 (0.0020) model time 0.2223 (0.2430) loss 4.5784 (4.3837) grad_norm 2.8208 (2.2528) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][840/1251] eta 0:01:40 lr 0.000734 wd 0.0500 time 0.2170 (0.2447) data time 0.0018 (0.0020) model time 0.2152 (0.2427) loss 3.9939 (4.3862) grad_norm 3.0464 (2.2509) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][850/1251] eta 0:01:37 lr 0.000734 wd 0.0500 time 0.2195 (0.2444) data time 0.0007 (0.0020) model time 0.2188 (0.2423) loss 5.3602 (4.3893) grad_norm 1.4073 (2.2509) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][860/1251] eta 0:01:35 lr 0.000735 wd 0.0500 time 0.2256 (0.2441) data time 0.0007 (0.0020) model time 0.2249 (0.2420) loss 5.1161 (4.3854) grad_norm 2.0637 (2.2521) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][870/1251] eta 0:01:32 lr 0.000735 wd 0.0500 time 0.2230 (0.2438) data time 0.0008 (0.0020) model time 0.2222 (0.2418) loss 5.0002 (4.3870) grad_norm 2.1633 (2.2492) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][880/1251] eta 0:01:30 lr 0.000735 wd 0.0500 time 0.2156 (0.2435) data time 0.0011 (0.0020) model time 0.2145 (0.2415) loss 4.5749 (4.3901) grad_norm 1.6174 (2.2460) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][890/1251] eta 0:01:27 lr 0.000736 wd 0.0500 time 0.2251 (0.2432) data time 0.0011 (0.0020) model time 0.2240 (0.2412) loss 4.1778 (4.3885) 
grad_norm 1.6761 (2.2413) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][900/1251] eta 0:01:25 lr 0.000736 wd 0.0500 time 0.2273 (0.2430) data time 0.0008 (0.0019) model time 0.2264 (0.2410) loss 3.5117 (4.3839) grad_norm 2.6534 (2.2419) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][910/1251] eta 0:01:22 lr 0.000737 wd 0.0500 time 0.2215 (0.2427) data time 0.0009 (0.0019) model time 0.2206 (0.2408) loss 4.2791 (4.3854) grad_norm 2.9159 (2.2484) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][920/1251] eta 0:01:20 lr 0.000737 wd 0.0500 time 0.2203 (0.2425) data time 0.0009 (0.0019) model time 0.2195 (0.2405) loss 4.3400 (4.3802) grad_norm 2.0658 (2.2554) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][930/1251] eta 0:01:17 lr 0.000737 wd 0.0500 time 0.2332 (0.2423) data time 0.0007 (0.0019) model time 0.2325 (0.2403) loss 4.5570 (4.3870) grad_norm 1.7517 (2.2565) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][940/1251] eta 0:01:15 lr 0.000738 wd 0.0500 time 0.2216 (0.2420) data time 0.0009 (0.0019) model time 0.2207 (0.2401) loss 4.7234 (4.3859) grad_norm 1.9079 (2.2536) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][950/1251] eta 0:01:12 lr 0.000738 wd 0.0500 time 0.2243 (0.2417) data time 0.0010 (0.0019) model time 0.2233 (0.2399) loss 4.0691 (4.3817) grad_norm 1.6578 (2.2524) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][960/1251] eta 0:01:10 lr 
0.000739 wd 0.0500 time 0.2235 (0.2415) data time 0.0008 (0.0019) model time 0.2227 (0.2396) loss 5.0114 (4.3790) grad_norm 2.0022 (2.2490) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][970/1251] eta 0:01:07 lr 0.000739 wd 0.0500 time 0.2221 (0.2413) data time 0.0008 (0.0019) model time 0.2213 (0.2394) loss 4.7758 (4.3750) grad_norm 1.8315 (2.2459) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][980/1251] eta 0:01:05 lr 0.000739 wd 0.0500 time 0.2359 (0.2411) data time 0.0008 (0.0019) model time 0.2351 (0.2392) loss 4.1202 (4.3738) grad_norm 2.0060 (2.2413) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][990/1251] eta 0:01:02 lr 0.000740 wd 0.0500 time 0.2265 (0.2409) data time 0.0010 (0.0018) model time 0.2255 (0.2390) loss 4.7179 (4.3757) grad_norm 1.8481 (2.2397) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1000/1251] eta 0:01:00 lr 0.000740 wd 0.0500 time 0.2239 (0.2407) data time 0.0009 (0.0018) model time 0.2230 (0.2388) loss 4.5141 (4.3776) grad_norm 1.5593 (2.2370) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1010/1251] eta 0:00:57 lr 0.000741 wd 0.0500 time 0.2274 (0.2405) data time 0.0008 (0.0018) model time 0.2266 (0.2387) loss 3.3886 (4.3747) grad_norm 3.2453 (2.2420) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1020/1251] eta 0:00:55 lr 0.000741 wd 0.0500 time 0.2215 (0.2403) data time 0.0007 (0.0018) model time 0.2208 (0.2385) loss 4.5172 (4.3779) grad_norm 2.9043 (2.2541) loss_scale 16384.0000 (16384.0000) mem 7377MB 
[2024-08-21 09:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1030/1251] eta 0:00:53 lr 0.000741 wd 0.0500 time 0.2216 (0.2401) data time 0.0010 (0.0018) model time 0.2206 (0.2383) loss 3.1181 (4.3786) grad_norm 1.8264 (2.2521) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1040/1251] eta 0:00:50 lr 0.000742 wd 0.0500 time 0.2231 (0.2399) data time 0.0009 (0.0018) model time 0.2222 (0.2381) loss 3.2138 (4.3769) grad_norm 2.2051 (2.2514) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1050/1251] eta 0:00:48 lr 0.000742 wd 0.0500 time 0.2245 (0.2397) data time 0.0007 (0.0018) model time 0.2238 (0.2379) loss 3.9220 (4.3775) grad_norm 1.8551 (2.2510) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1060/1251] eta 0:00:45 lr 0.000743 wd 0.0500 time 0.2201 (0.2395) data time 0.0008 (0.0018) model time 0.2193 (0.2377) loss 5.5494 (4.3763) grad_norm 1.7268 (2.2494) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1070/1251] eta 0:00:43 lr 0.000743 wd 0.0500 time 0.2253 (0.2396) data time 0.0006 (0.0018) model time 0.2246 (0.2378) loss 5.1546 (4.3708) grad_norm 2.9531 (2.2494) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1080/1251] eta 0:00:40 lr 0.000743 wd 0.0500 time 0.2235 (0.2394) data time 0.0008 (0.0018) model time 0.2226 (0.2376) loss 3.2084 (4.3704) grad_norm 2.6798 (2.2488) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1090/1251] eta 0:00:38 lr 0.000744 wd 0.0500 time 0.2196 (0.2394) data time 0.0009 (0.0017) model 
time 0.2187 (0.2376) loss 3.4767 (4.3671) grad_norm 1.6136 (2.2480) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1100/1251] eta 0:00:36 lr 0.000744 wd 0.0500 time 0.2258 (0.2392) data time 0.0006 (0.0017) model time 0.2252 (0.2375) loss 2.9812 (4.3657) grad_norm 1.9929 (2.2484) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1110/1251] eta 0:00:33 lr 0.000745 wd 0.0500 time 0.2193 (0.2390) data time 0.0009 (0.0017) model time 0.2185 (0.2373) loss 4.5919 (4.3668) grad_norm 1.5834 (2.2495) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1120/1251] eta 0:00:31 lr 0.000745 wd 0.0500 time 0.2238 (0.2388) data time 0.0008 (0.0017) model time 0.2229 (0.2371) loss 4.0539 (4.3634) grad_norm 3.2794 (2.2521) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1130/1251] eta 0:00:28 lr 0.000745 wd 0.0500 time 0.2243 (0.2386) data time 0.0008 (0.0017) model time 0.2236 (0.2369) loss 4.3602 (4.3636) grad_norm 2.0895 (2.2486) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1140/1251] eta 0:00:26 lr 0.000746 wd 0.0500 time 0.2225 (0.2385) data time 0.0007 (0.0017) model time 0.2218 (0.2368) loss 3.6791 (4.3609) grad_norm 1.8917 (2.2500) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1150/1251] eta 0:00:24 lr 0.000746 wd 0.0500 time 0.2175 (0.2383) data time 0.0010 (0.0017) model time 0.2166 (0.2366) loss 4.0755 (4.3591) grad_norm 3.3677 (2.2574) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [14/300][1160/1251] eta 0:00:21 lr 0.000747 wd 0.0500 time 0.2243 (0.2381) data time 0.0007 (0.0017) model time 0.2236 (0.2364) loss 4.9563 (4.3607) grad_norm 3.1986 (2.2640) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-21 09:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1170/1251] eta 0:00:19 lr 0.000747 wd 0.0500 time 0.2299 (0.2380) data time 0.0006 (0.0017) model time 0.2293 (0.2363) loss 5.1119 (4.3603) grad_norm 2.2028 (2.2619) loss_scale 32768.0000 (16545.1541) mem 7377MB [2024-08-21 09:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1180/1251] eta 0:00:16 lr 0.000747 wd 0.0500 time 0.2239 (0.2379) data time 0.0008 (0.0017) model time 0.2231 (0.2362) loss 4.7047 (4.3636) grad_norm 1.7223 (2.2608) loss_scale 32768.0000 (16720.5362) mem 7377MB [2024-08-21 09:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1190/1251] eta 0:00:14 lr 0.000748 wd 0.0500 time 0.2250 (0.2377) data time 0.0009 (0.0017) model time 0.2241 (0.2360) loss 3.4805 (4.3647) grad_norm 2.5203 (2.2580) loss_scale 32768.0000 (16892.1668) mem 7377MB [2024-08-21 09:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1200/1251] eta 0:00:12 lr 0.000748 wd 0.0500 time 0.2240 (0.2376) data time 0.0007 (0.0017) model time 0.2233 (0.2359) loss 4.8236 (4.3618) grad_norm 2.0120 (2.2617) loss_scale 32768.0000 (17060.1651) mem 7377MB [2024-08-21 09:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1210/1251] eta 0:00:09 lr 0.000749 wd 0.0500 time 0.2214 (0.2374) data time 0.0008 (0.0017) model time 0.2206 (0.2358) loss 4.6103 (4.3570) grad_norm 2.5021 (2.2616) loss_scale 32768.0000 (17224.6450) mem 7377MB [2024-08-21 09:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1220/1251] eta 0:00:07 lr 0.000749 wd 0.0500 time 0.2341 (0.2373) data time 0.0009 (0.0016) model time 0.2332 (0.2357) loss 4.2333 (4.3565) grad_norm 2.4139 (2.2618) 
loss_scale 32768.0000 (17385.7161) mem 7377MB [2024-08-21 09:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1230/1251] eta 0:00:04 lr 0.000749 wd 0.0500 time 0.2119 (0.2372) data time 0.0016 (0.0016) model time 0.2103 (0.2355) loss 5.3918 (4.3591) grad_norm inf (inf) loss_scale 16384.0000 (17526.6790) mem 7377MB [2024-08-21 09:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1240/1251] eta 0:00:02 lr 0.000750 wd 0.0500 time 0.2149 (0.2370) data time 0.0005 (0.0016) model time 0.2144 (0.2353) loss 4.8206 (4.3585) grad_norm 4.5051 (inf) loss_scale 16384.0000 (17515.0782) mem 7377MB [2024-08-21 09:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [14/300][1250/1251] eta 0:00:00 lr 0.000750 wd 0.0500 time 0.2148 (0.2368) data time 0.0003 (0.0016) model time 0.2144 (0.2351) loss 4.6099 (4.3583) grad_norm 3.3193 (inf) loss_scale 16384.0000 (17503.7106) mem 7377MB [2024-08-21 09:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 14 training takes 0:03:55 [2024-08-21 09:07:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-21 09:07:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-21 09:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.344 (0.344) Loss 0.9204 (0.9204) Acc@1 81.445 (81.445) Acc@5 94.238 (94.238) Mem 7377MB
[2024-08-21 09:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.096) Loss 1.3887 (1.3271) Acc@1 67.090 (68.928) Acc@5 90.430 (90.385) Mem 7377MB
[2024-08-21 09:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.086) Loss 1.9824 (1.3438) Acc@1 56.445 (68.745) Acc@5 80.859 (90.495) Mem 7377MB
[2024-08-21 09:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.081) Loss 2.3418 (1.5538) Acc@1 49.609 (64.812) Acc@5 72.559 (87.153) Mem 7377MB
[2024-08-21 09:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.077) Loss 2.2617 (1.6691) Acc@1 50.586 (62.688) Acc@5 77.344 (85.456) Mem 7377MB
[2024-08-21 09:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 62.462 Acc@5 85.350
[2024-08-21 09:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 62.5%
[2024-08-21 09:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 62.46%
[2024-08-21 09:07:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-21 09:07:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-21 09:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.354 (0.354) Loss 1.9795 (1.9795) Acc@1 64.160 (64.160) Acc@5 86.133 (86.133) Mem 7377MB
[2024-08-21 09:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.095) Loss 2.3301 (2.5242) Acc@1 52.930 (49.441) Acc@5 81.348 (75.684) Mem 7377MB
[2024-08-21 09:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.083) Loss 3.1465 (2.5037) Acc@1 40.234 (49.609) Acc@5 65.234 (76.418) Mem 7377MB
[2024-08-21 09:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.079) Loss 3.1172 (2.6770) Acc@1 43.359 (47.291) Acc@5 64.941 (73.233) Mem 7377MB
[2024-08-21 09:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.074) Loss 3.2812 (2.7890) Acc@1 35.547 (45.339) Acc@5 61.035 (71.260) Mem 7377MB
[2024-08-21 09:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 45.752 Acc@5 71.596
[2024-08-21 09:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 45.8%
[2024-08-21 09:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 45.75%
[2024-08-21 09:07:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-21 09:07:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-21 09:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][0/1251] eta 0:14:56 lr 0.000750 wd 0.0500 time 0.7169 (0.7169) data time 0.4821 (0.4821) model time 0.0000 (0.0000) loss 4.1949 (4.1949) grad_norm 1.8128 (1.8128) loss_scale 16384.0000 (16384.0000) mem 7380MB [2024-08-21 09:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][10/1251] eta 0:05:34 lr 0.000751 wd 0.0500 time 0.2235 (0.2697) data time 0.0009 (0.0452) model time 0.0000 (0.0000) loss 4.6987 (4.2116) grad_norm 2.2505 (2.4610) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][20/1251] eta 0:05:05 lr 0.000751 wd 0.0500 time 0.2279 (0.2485) data time 0.0009 (0.0241) model time 0.0000 (0.0000) loss 3.8959 (4.4095) grad_norm 3.1490 (2.5165) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][30/1251] eta 0:04:53 lr 0.000751 wd 0.0500 time 0.2249 (0.2406) data time 0.0009 (0.0167) model time 0.0000 (0.0000) loss 4.2286 (4.2577) grad_norm 3.0050 (2.3975) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][40/1251] eta 0:04:46 lr 0.000752 wd 0.0500 time 0.2231 (0.2368) data time 0.0007 (0.0128) model time 0.0000 (0.0000) loss 4.9627 (4.2721) grad_norm 2.1410 (2.4056) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][50/1251] eta 0:04:41 lr 0.000752 wd 0.0500 time 0.2236 (0.2347) data time 0.0010 (0.0105) model time 0.0000 (0.0000) loss 4.2842 (4.3164) grad_norm 2.4326 (2.3470) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][60/1251] eta 0:04:37 lr 0.000753 wd 0.0500 time 0.2278 (0.2333) data time 0.0009 (0.0090) model time 0.2269 
(0.2250) loss 4.1625 (4.3196) grad_norm 3.0137 (2.3128) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][70/1251] eta 0:04:34 lr 0.000753 wd 0.0500 time 0.2236 (0.2322) data time 0.0007 (0.0079) model time 0.2229 (0.2246) loss 3.4358 (4.3404) grad_norm 1.7748 (2.3168) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][80/1251] eta 0:04:30 lr 0.000753 wd 0.0500 time 0.2220 (0.2313) data time 0.0010 (0.0070) model time 0.2210 (0.2244) loss 3.7075 (4.2965) grad_norm 3.0664 (2.3058) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][90/1251] eta 0:04:27 lr 0.000754 wd 0.0500 time 0.2264 (0.2305) data time 0.0007 (0.0064) model time 0.2257 (0.2241) loss 4.5237 (4.3024) grad_norm 2.1109 (2.3197) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][100/1251] eta 0:04:24 lr 0.000754 wd 0.0500 time 0.2269 (0.2300) data time 0.0009 (0.0058) model time 0.2260 (0.2241) loss 5.3786 (4.2961) grad_norm 2.2559 (2.3219) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][110/1251] eta 0:04:21 lr 0.000755 wd 0.0500 time 0.2231 (0.2295) data time 0.0010 (0.0054) model time 0.2221 (0.2240) loss 4.3049 (4.3016) grad_norm 1.5714 (2.2920) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][120/1251] eta 0:04:19 lr 0.000755 wd 0.0500 time 0.2208 (0.2290) data time 0.0009 (0.0050) model time 0.2199 (0.2239) loss 4.9568 (4.3116) grad_norm 1.8025 (2.2827) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[15/300][130/1251] eta 0:04:16 lr 0.000755 wd 0.0500 time 0.2245 (0.2286) data time 0.0009 (0.0047) model time 0.2236 (0.2238) loss 4.7867 (4.3188) grad_norm 3.4845 (2.3395) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][140/1251] eta 0:04:13 lr 0.000756 wd 0.0500 time 0.2281 (0.2283) data time 0.0008 (0.0045) model time 0.2273 (0.2237) loss 4.7775 (4.3204) grad_norm 2.3677 (2.3143) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][150/1251] eta 0:04:11 lr 0.000756 wd 0.0500 time 0.2235 (0.2281) data time 0.0008 (0.0042) model time 0.2227 (0.2237) loss 3.7473 (4.3178) grad_norm 1.6150 (2.2874) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][160/1251] eta 0:04:08 lr 0.000757 wd 0.0500 time 0.2210 (0.2278) data time 0.0007 (0.0041) model time 0.2203 (0.2235) loss 5.0410 (4.3224) grad_norm 1.7774 (2.2760) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][170/1251] eta 0:04:06 lr 0.000757 wd 0.0500 time 0.2261 (0.2276) data time 0.0008 (0.0039) model time 0.2253 (0.2236) loss 4.7376 (4.3015) grad_norm 1.6530 (2.2678) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][180/1251] eta 0:04:03 lr 0.000757 wd 0.0500 time 0.2267 (0.2274) data time 0.0008 (0.0037) model time 0.2259 (0.2236) loss 4.8169 (4.3256) grad_norm 2.4571 (2.2668) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][190/1251] eta 0:04:01 lr 0.000758 wd 0.0500 time 0.2214 (0.2272) data time 0.0008 (0.0036) model time 0.2206 (0.2234) loss 4.4893 (4.3464) grad_norm 2.3859 (2.2652) loss_scale 
16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][200/1251] eta 0:03:58 lr 0.000758 wd 0.0500 time 0.2189 (0.2270) data time 0.0010 (0.0034) model time 0.2179 (0.2234) loss 4.5525 (4.3455) grad_norm 2.4542 (2.2866) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][210/1251] eta 0:03:56 lr 0.000759 wd 0.0500 time 0.2262 (0.2269) data time 0.0009 (0.0033) model time 0.2253 (0.2234) loss 4.6053 (4.3624) grad_norm 4.1304 (2.3049) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][220/1251] eta 0:03:53 lr 0.000759 wd 0.0500 time 0.2303 (0.2269) data time 0.0006 (0.0032) model time 0.2297 (0.2235) loss 3.4501 (4.3665) grad_norm 2.1272 (2.2867) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][230/1251] eta 0:03:51 lr 0.000759 wd 0.0500 time 0.2299 (0.2268) data time 0.0007 (0.0031) model time 0.2292 (0.2235) loss 3.3431 (4.3595) grad_norm 1.8990 (2.2733) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][240/1251] eta 0:03:49 lr 0.000760 wd 0.0500 time 0.2216 (0.2267) data time 0.0006 (0.0030) model time 0.2210 (0.2235) loss 3.4786 (4.3533) grad_norm 1.9605 (2.2682) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][250/1251] eta 0:03:46 lr 0.000760 wd 0.0500 time 0.2331 (0.2266) data time 0.0009 (0.0030) model time 0.2322 (0.2235) loss 4.6002 (4.3581) grad_norm 2.5349 (2.2639) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][260/1251] eta 0:03:44 lr 0.000761 wd 0.0500 time 0.2301 (0.2265) 
data time 0.0007 (0.0029) model time 0.2293 (0.2235) loss 5.1466 (4.3538) grad_norm 1.9488 (2.2583) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][270/1251] eta 0:03:42 lr 0.000761 wd 0.0500 time 0.2242 (0.2264) data time 0.0008 (0.0028) model time 0.2235 (0.2235) loss 5.0892 (4.3626) grad_norm 4.2036 (2.2602) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][280/1251] eta 0:03:39 lr 0.000761 wd 0.0500 time 0.2254 (0.2264) data time 0.0009 (0.0027) model time 0.2245 (0.2235) loss 4.4442 (4.3661) grad_norm 3.7261 (2.2577) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][290/1251] eta 0:03:37 lr 0.000762 wd 0.0500 time 0.2223 (0.2262) data time 0.0008 (0.0027) model time 0.2215 (0.2234) loss 4.7641 (4.3751) grad_norm 1.9193 (2.2638) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][300/1251] eta 0:03:35 lr 0.000762 wd 0.0500 time 0.2275 (0.2262) data time 0.0007 (0.0026) model time 0.2268 (0.2234) loss 4.5165 (4.3735) grad_norm 1.7059 (2.2594) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][310/1251] eta 0:03:32 lr 0.000763 wd 0.0500 time 0.2259 (0.2261) data time 0.0007 (0.0026) model time 0.2251 (0.2234) loss 3.9415 (4.3721) grad_norm 1.9585 (2.2541) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][320/1251] eta 0:03:30 lr 0.000763 wd 0.0500 time 0.2203 (0.2260) data time 0.0009 (0.0025) model time 0.2194 (0.2234) loss 4.0307 (4.3626) grad_norm 2.2147 (2.2501) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:57 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [15/300][330/1251] eta 0:03:28 lr 0.000763 wd 0.0500 time 0.2237 (0.2260) data time 0.0008 (0.0025) model time 0.2228 (0.2234) loss 4.5768 (4.3581) grad_norm 2.1267 (2.2486) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][340/1251] eta 0:03:25 lr 0.000764 wd 0.0500 time 0.2241 (0.2259) data time 0.0009 (0.0024) model time 0.2231 (0.2233) loss 3.8014 (4.3562) grad_norm 2.4752 (2.2468) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][350/1251] eta 0:03:23 lr 0.000764 wd 0.0500 time 0.2266 (0.2258) data time 0.0010 (0.0024) model time 0.2256 (0.2232) loss 4.6534 (4.3584) grad_norm 2.9838 (2.2473) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][360/1251] eta 0:03:21 lr 0.000765 wd 0.0500 time 0.2229 (0.2257) data time 0.0008 (0.0024) model time 0.2221 (0.2232) loss 4.9699 (4.3629) grad_norm 1.6459 (2.2425) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][370/1251] eta 0:03:18 lr 0.000765 wd 0.0500 time 0.2240 (0.2257) data time 0.0009 (0.0023) model time 0.2231 (0.2233) loss 4.4220 (4.3716) grad_norm 3.0310 (2.2441) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][380/1251] eta 0:03:16 lr 0.000765 wd 0.0500 time 0.2245 (0.2257) data time 0.0010 (0.0023) model time 0.2235 (0.2232) loss 3.8967 (4.3738) grad_norm 2.4571 (2.2426) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][390/1251] eta 0:03:14 lr 0.000766 wd 0.0500 time 0.2280 (0.2256) data time 0.0008 (0.0023) model time 0.2272 (0.2232) loss 4.6186 (4.3751) 
grad_norm 2.3840 (2.2487) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][400/1251] eta 0:03:11 lr 0.000766 wd 0.0500 time 0.2293 (0.2256) data time 0.0007 (0.0022) model time 0.2286 (0.2233) loss 4.0303 (4.3730) grad_norm 2.6284 (2.2538) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][410/1251] eta 0:03:09 lr 0.000767 wd 0.0500 time 0.2272 (0.2256) data time 0.0011 (0.0022) model time 0.2261 (0.2233) loss 4.1536 (4.3808) grad_norm 2.0374 (2.2553) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][420/1251] eta 0:03:07 lr 0.000767 wd 0.0500 time 0.2232 (0.2261) data time 0.0007 (0.0022) model time 0.2225 (0.2239) loss 3.2543 (4.3792) grad_norm 2.0120 (2.2463) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][430/1251] eta 0:03:05 lr 0.000767 wd 0.0500 time 0.2301 (0.2261) data time 0.0008 (0.0022) model time 0.2293 (0.2239) loss 4.1233 (4.3793) grad_norm 2.0244 (2.2412) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][440/1251] eta 0:03:03 lr 0.000768 wd 0.0500 time 0.2221 (0.2261) data time 0.0010 (0.0021) model time 0.2211 (0.2239) loss 4.3783 (4.3797) grad_norm 2.4182 (2.2448) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][450/1251] eta 0:03:01 lr 0.000768 wd 0.0500 time 0.2215 (0.2260) data time 0.0010 (0.0021) model time 0.2206 (0.2239) loss 4.8106 (4.3809) grad_norm 2.6573 (2.2471) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][460/1251] eta 0:02:58 lr 
0.000769 wd 0.0500 time 0.2209 (0.2260) data time 0.0007 (0.0021) model time 0.2202 (0.2239) loss 5.1784 (4.3801) grad_norm 2.4650 (2.2520) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][470/1251] eta 0:02:56 lr 0.000769 wd 0.0500 time 0.2260 (0.2260) data time 0.0011 (0.0021) model time 0.2249 (0.2239) loss 2.7117 (4.3813) grad_norm 1.4478 (2.2564) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][480/1251] eta 0:02:54 lr 0.000769 wd 0.0500 time 0.4428 (0.2264) data time 0.0008 (0.0020) model time 0.4420 (0.2244) loss 3.8758 (4.3776) grad_norm 2.3140 (2.2515) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][490/1251] eta 0:02:52 lr 0.000770 wd 0.0500 time 0.2230 (0.2263) data time 0.0009 (0.0020) model time 0.2221 (0.2243) loss 4.4885 (4.3676) grad_norm 2.3029 (2.2549) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][500/1251] eta 0:02:49 lr 0.000770 wd 0.0500 time 0.2204 (0.2262) data time 0.0010 (0.0020) model time 0.2193 (0.2242) loss 4.7882 (4.3620) grad_norm 3.0660 (2.2577) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][510/1251] eta 0:02:47 lr 0.000771 wd 0.0500 time 0.2179 (0.2262) data time 0.0010 (0.0020) model time 0.2169 (0.2242) loss 4.6592 (4.3625) grad_norm 2.2003 (2.2585) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][520/1251] eta 0:02:45 lr 0.000771 wd 0.0500 time 0.2236 (0.2261) data time 0.0008 (0.0020) model time 0.2227 (0.2241) loss 4.5952 (4.3630) grad_norm 2.4559 (2.2581) loss_scale 16384.0000 (16384.0000) mem 7381MB 
[2024-08-21 09:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][530/1251] eta 0:02:42 lr 0.000771 wd 0.0500 time 0.2182 (0.2260) data time 0.0009 (0.0019) model time 0.2173 (0.2241) loss 4.8826 (4.3676) grad_norm 1.5276 (2.2569) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][540/1251] eta 0:02:40 lr 0.000772 wd 0.0500 time 0.2207 (0.2260) data time 0.0010 (0.0019) model time 0.2197 (0.2241) loss 4.5977 (4.3623) grad_norm 4.4616 (2.2556) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][550/1251] eta 0:02:38 lr 0.000772 wd 0.0500 time 0.2244 (0.2260) data time 0.0007 (0.0019) model time 0.2237 (0.2241) loss 4.2111 (4.3590) grad_norm 2.5485 (2.2546) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][560/1251] eta 0:02:36 lr 0.000773 wd 0.0500 time 0.2220 (0.2260) data time 0.0008 (0.0019) model time 0.2211 (0.2241) loss 4.6418 (4.3601) grad_norm 1.4974 (2.2468) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][570/1251] eta 0:02:33 lr 0.000773 wd 0.0500 time 0.2173 (0.2260) data time 0.0007 (0.0019) model time 0.2165 (0.2241) loss 3.9025 (4.3516) grad_norm 2.3179 (2.2446) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][580/1251] eta 0:02:31 lr 0.000773 wd 0.0500 time 0.2205 (0.2260) data time 0.0007 (0.0019) model time 0.2198 (0.2241) loss 3.6981 (4.3512) grad_norm 2.6696 (2.2438) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][590/1251] eta 0:02:29 lr 0.000774 wd 0.0500 time 0.2238 (0.2259) data time 0.0007 (0.0019) model time 
0.2230 (0.2241) loss 4.9772 (4.3486) grad_norm 1.8253 (2.2390) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][600/1251] eta 0:02:27 lr 0.000774 wd 0.0500 time 0.2210 (0.2259) data time 0.0008 (0.0018) model time 0.2203 (0.2240) loss 4.4398 (4.3513) grad_norm 2.1300 (2.2384) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][610/1251] eta 0:02:24 lr 0.000775 wd 0.0500 time 0.2183 (0.2258) data time 0.0011 (0.0018) model time 0.2172 (0.2240) loss 4.2394 (4.3577) grad_norm 3.5837 (2.2363) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][620/1251] eta 0:02:22 lr 0.000775 wd 0.0500 time 0.2194 (0.2258) data time 0.0010 (0.0018) model time 0.2184 (0.2240) loss 4.6394 (4.3572) grad_norm 3.3774 (2.2543) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][630/1251] eta 0:02:20 lr 0.000775 wd 0.0500 time 0.2195 (0.2258) data time 0.0007 (0.0018) model time 0.2188 (0.2240) loss 5.1981 (4.3573) grad_norm 2.8223 (2.2573) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][640/1251] eta 0:02:17 lr 0.000776 wd 0.0500 time 0.2319 (0.2258) data time 0.0010 (0.0018) model time 0.2309 (0.2240) loss 4.7077 (4.3622) grad_norm 1.7760 (2.2569) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][650/1251] eta 0:02:15 lr 0.000776 wd 0.0500 time 0.2218 (0.2258) data time 0.0007 (0.0018) model time 0.2211 (0.2240) loss 3.1252 (4.3554) grad_norm 1.6879 (2.2569) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[15/300][660/1251] eta 0:02:13 lr 0.000777 wd 0.0500 time 0.2169 (0.2258) data time 0.0010 (0.0018) model time 0.2159 (0.2239) loss 3.8315 (4.3485) grad_norm 1.9389 (2.2543) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][670/1251] eta 0:02:11 lr 0.000777 wd 0.0500 time 0.2270 (0.2258) data time 0.0008 (0.0018) model time 0.2262 (0.2239) loss 4.9443 (4.3463) grad_norm 1.5002 (2.2663) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][680/1251] eta 0:02:08 lr 0.000777 wd 0.0500 time 0.2234 (0.2257) data time 0.0010 (0.0018) model time 0.2225 (0.2239) loss 5.1936 (4.3478) grad_norm 2.9432 (2.2708) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][690/1251] eta 0:02:06 lr 0.000778 wd 0.0500 time 0.2246 (0.2257) data time 0.0007 (0.0018) model time 0.2239 (0.2239) loss 5.3379 (4.3480) grad_norm 2.0317 (2.2701) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][700/1251] eta 0:02:04 lr 0.000778 wd 0.0500 time 0.2272 (0.2257) data time 0.0007 (0.0018) model time 0.2265 (0.2240) loss 5.1521 (4.3508) grad_norm 1.7039 (2.2660) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][710/1251] eta 0:02:02 lr 0.000779 wd 0.0500 time 0.2250 (0.2257) data time 0.0006 (0.0018) model time 0.2244 (0.2239) loss 4.6064 (4.3545) grad_norm 1.8701 (2.2607) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][720/1251] eta 0:01:59 lr 0.000779 wd 0.0500 time 0.2293 (0.2257) data time 0.0009 (0.0017) model time 0.2284 (0.2239) loss 4.5157 (4.3518) grad_norm 1.4235 (2.2570) loss_scale 
16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][730/1251] eta 0:01:57 lr 0.000779 wd 0.0500 time 0.2278 (0.2256) data time 0.0008 (0.0017) model time 0.2270 (0.2239) loss 4.2343 (4.3500) grad_norm 2.2505 (2.2551) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][740/1251] eta 0:01:55 lr 0.000780 wd 0.0500 time 0.2240 (0.2256) data time 0.0008 (0.0017) model time 0.2232 (0.2239) loss 4.4784 (4.3494) grad_norm 2.4381 (2.2539) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][750/1251] eta 0:01:53 lr 0.000780 wd 0.0500 time 0.2228 (0.2256) data time 0.0009 (0.0017) model time 0.2219 (0.2239) loss 3.6036 (4.3476) grad_norm 2.3614 (2.2537) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][760/1251] eta 0:01:50 lr 0.000781 wd 0.0500 time 0.2250 (0.2256) data time 0.0010 (0.0017) model time 0.2239 (0.2239) loss 4.1260 (4.3425) grad_norm 2.9119 (2.2540) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][770/1251] eta 0:01:48 lr 0.000781 wd 0.0500 time 0.2241 (0.2256) data time 0.0009 (0.0017) model time 0.2232 (0.2238) loss 4.6698 (4.3457) grad_norm 1.8054 (2.2539) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][780/1251] eta 0:01:46 lr 0.000781 wd 0.0500 time 0.2248 (0.2255) data time 0.0011 (0.0017) model time 0.2237 (0.2238) loss 4.6628 (4.3463) grad_norm 1.9413 (2.2502) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][790/1251] eta 0:01:43 lr 0.000782 wd 0.0500 time 0.2294 (0.2255) 
data time 0.0009 (0.0017) model time 0.2285 (0.2238) loss 4.9433 (4.3424) grad_norm 1.7098 (2.2505) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][800/1251] eta 0:01:41 lr 0.000782 wd 0.0500 time 0.2258 (0.2255) data time 0.0010 (0.0017) model time 0.2248 (0.2238) loss 4.9162 (4.3458) grad_norm 1.7975 (2.2539) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][810/1251] eta 0:01:39 lr 0.000783 wd 0.0500 time 0.2254 (0.2255) data time 0.0010 (0.0017) model time 0.2243 (0.2238) loss 4.5365 (4.3455) grad_norm 1.8587 (2.2571) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][820/1251] eta 0:01:37 lr 0.000783 wd 0.0500 time 0.2223 (0.2255) data time 0.0009 (0.0017) model time 0.2214 (0.2238) loss 3.5698 (4.3434) grad_norm 1.7495 (2.2585) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][830/1251] eta 0:01:34 lr 0.000783 wd 0.0500 time 0.2254 (0.2254) data time 0.0006 (0.0016) model time 0.2248 (0.2238) loss 5.1344 (4.3463) grad_norm 1.6871 (2.2563) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][840/1251] eta 0:01:32 lr 0.000784 wd 0.0500 time 0.2280 (0.2254) data time 0.0007 (0.0016) model time 0.2274 (0.2237) loss 3.6535 (4.3474) grad_norm 2.1274 (2.2543) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][850/1251] eta 0:01:30 lr 0.000784 wd 0.0500 time 0.2240 (0.2254) data time 0.0009 (0.0016) model time 0.2231 (0.2237) loss 3.9810 (4.3459) grad_norm 2.5659 (2.2521) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:56 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [15/300][860/1251] eta 0:01:28 lr 0.000785 wd 0.0500 time 0.2260 (0.2254) data time 0.0011 (0.0016) model time 0.2249 (0.2237) loss 4.4910 (4.3431) grad_norm 1.6960 (2.2491) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][870/1251] eta 0:01:25 lr 0.000785 wd 0.0500 time 0.2217 (0.2254) data time 0.0006 (0.0016) model time 0.2211 (0.2237) loss 3.8931 (4.3421) grad_norm 2.3380 (2.2457) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][880/1251] eta 0:01:23 lr 0.000785 wd 0.0500 time 0.2191 (0.2254) data time 0.0009 (0.0016) model time 0.2182 (0.2237) loss 5.0476 (4.3428) grad_norm 2.2502 (2.2461) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][890/1251] eta 0:01:21 lr 0.000786 wd 0.0500 time 0.2244 (0.2254) data time 0.0008 (0.0016) model time 0.2236 (0.2237) loss 4.4331 (4.3449) grad_norm 1.6993 (2.2475) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][900/1251] eta 0:01:19 lr 0.000786 wd 0.0500 time 0.2271 (0.2254) data time 0.0009 (0.0016) model time 0.2262 (0.2237) loss 5.2239 (4.3429) grad_norm 1.5168 (2.2441) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][910/1251] eta 0:01:16 lr 0.000787 wd 0.0500 time 0.2239 (0.2254) data time 0.0009 (0.0016) model time 0.2230 (0.2238) loss 4.7487 (4.3421) grad_norm 1.7841 (2.2407) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][920/1251] eta 0:01:14 lr 0.000787 wd 0.0500 time 0.2220 (0.2254) data time 0.0007 (0.0016) model time 0.2213 (0.2238) loss 4.2206 (4.3383) 
grad_norm 2.3042 (2.2407) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][930/1251] eta 0:01:12 lr 0.000787 wd 0.0500 time 0.2206 (0.2254) data time 0.0008 (0.0016) model time 0.2198 (0.2238) loss 3.0825 (4.3343) grad_norm 1.9108 (2.2373) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][940/1251] eta 0:01:10 lr 0.000788 wd 0.0500 time 0.2244 (0.2254) data time 0.0007 (0.0016) model time 0.2237 (0.2238) loss 4.0519 (4.3327) grad_norm 2.2703 (2.2364) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][950/1251] eta 0:01:07 lr 0.000788 wd 0.0500 time 0.2297 (0.2254) data time 0.0007 (0.0016) model time 0.2290 (0.2238) loss 4.0921 (4.3322) grad_norm 2.0141 (2.2387) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][960/1251] eta 0:01:05 lr 0.000789 wd 0.0500 time 0.2278 (0.2254) data time 0.0007 (0.0016) model time 0.2271 (0.2238) loss 4.9882 (4.3290) grad_norm 2.9911 (2.2405) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][970/1251] eta 0:01:03 lr 0.000789 wd 0.0500 time 0.2267 (0.2253) data time 0.0009 (0.0016) model time 0.2258 (0.2238) loss 3.6090 (4.3270) grad_norm 1.5997 (2.2412) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][980/1251] eta 0:01:01 lr 0.000789 wd 0.0500 time 0.2282 (0.2254) data time 0.0009 (0.0016) model time 0.2273 (0.2238) loss 4.5844 (4.3268) grad_norm 2.0035 (2.2428) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][990/1251] eta 0:00:58 lr 
0.000790 wd 0.0500 time 0.2222 (0.2253) data time 0.0008 (0.0016) model time 0.2214 (0.2238) loss 3.2922 (4.3219) grad_norm 2.0896 (2.2436) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1000/1251] eta 0:00:56 lr 0.000790 wd 0.0500 time 0.2261 (0.2253) data time 0.0008 (0.0016) model time 0.2253 (0.2238) loss 5.1363 (4.3231) grad_norm 1.5467 (2.2401) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1010/1251] eta 0:00:54 lr 0.000791 wd 0.0500 time 0.2211 (0.2253) data time 0.0007 (0.0015) model time 0.2204 (0.2237) loss 5.2893 (4.3258) grad_norm 1.3954 (2.2389) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1020/1251] eta 0:00:52 lr 0.000791 wd 0.0500 time 0.2229 (0.2255) data time 0.0007 (0.0015) model time 0.2222 (0.2239) loss 3.9975 (4.3252) grad_norm 1.8753 (2.2358) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1030/1251] eta 0:00:49 lr 0.000791 wd 0.0500 time 0.2211 (0.2255) data time 0.0010 (0.0015) model time 0.2201 (0.2239) loss 4.6046 (4.3265) grad_norm 1.6201 (2.2336) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1040/1251] eta 0:00:47 lr 0.000792 wd 0.0500 time 0.2246 (0.2255) data time 0.0010 (0.0015) model time 0.2237 (0.2240) loss 4.6195 (4.3282) grad_norm 2.2441 (2.2363) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1050/1251] eta 0:00:45 lr 0.000792 wd 0.0500 time 0.2235 (0.2255) data time 0.0008 (0.0015) model time 0.2227 (0.2239) loss 4.1501 (4.3277) grad_norm 1.5229 (2.2333) loss_scale 16384.0000 (16384.0000) mem 7381MB 
[2024-08-21 09:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1060/1251] eta 0:00:43 lr 0.000793 wd 0.0500 time 0.2236 (0.2255) data time 0.0008 (0.0015) model time 0.2228 (0.2239) loss 3.7074 (4.3286) grad_norm 2.2657 (2.2318) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1070/1251] eta 0:00:40 lr 0.000793 wd 0.0500 time 0.2225 (0.2255) data time 0.0007 (0.0015) model time 0.2219 (0.2239) loss 4.7560 (4.3275) grad_norm 1.7648 (2.2306) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1080/1251] eta 0:00:38 lr 0.000793 wd 0.0500 time 0.2248 (0.2255) data time 0.0008 (0.0015) model time 0.2239 (0.2239) loss 4.8047 (4.3281) grad_norm 2.3894 (2.2321) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1090/1251] eta 0:00:36 lr 0.000794 wd 0.0500 time 0.2172 (0.2255) data time 0.0007 (0.0015) model time 0.2164 (0.2239) loss 4.9488 (4.3278) grad_norm 1.8151 (2.2291) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1100/1251] eta 0:00:34 lr 0.000794 wd 0.0500 time 0.2303 (0.2255) data time 0.0008 (0.0015) model time 0.2295 (0.2239) loss 4.8612 (4.3290) grad_norm 2.2374 (2.2285) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1110/1251] eta 0:00:31 lr 0.000795 wd 0.0500 time 0.2257 (0.2255) data time 0.0007 (0.0015) model time 0.2250 (0.2240) loss 4.8523 (4.3267) grad_norm 2.5066 (2.2276) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1120/1251] eta 0:00:29 lr 0.000795 wd 0.0500 time 0.2361 (0.2255) data time 0.0007 (0.0015) model 
time 0.2354 (0.2240) loss 4.9617 (4.3272) grad_norm 1.9351 (2.2262) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1130/1251] eta 0:00:27 lr 0.000795 wd 0.0500 time 0.2191 (0.2255) data time 0.0010 (0.0015) model time 0.2181 (0.2240) loss 4.6449 (4.3279) grad_norm 1.9011 (2.2267) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1140/1251] eta 0:00:25 lr 0.000796 wd 0.0500 time 0.2257 (0.2255) data time 0.0010 (0.0015) model time 0.2246 (0.2240) loss 4.6672 (4.3301) grad_norm 2.2929 (2.2279) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1150/1251] eta 0:00:22 lr 0.000796 wd 0.0500 time 0.2231 (0.2255) data time 0.0011 (0.0015) model time 0.2220 (0.2240) loss 4.9420 (4.3302) grad_norm 2.1637 (2.2261) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1160/1251] eta 0:00:20 lr 0.000797 wd 0.0500 time 0.2251 (0.2255) data time 0.0010 (0.0015) model time 0.2241 (0.2240) loss 5.1336 (4.3304) grad_norm 1.5238 (2.2236) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1170/1251] eta 0:00:18 lr 0.000797 wd 0.0500 time 0.2275 (0.2255) data time 0.0007 (0.0015) model time 0.2268 (0.2240) loss 4.8153 (4.3317) grad_norm 1.6419 (2.2207) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1180/1251] eta 0:00:16 lr 0.000797 wd 0.0500 time 0.2232 (0.2255) data time 0.0008 (0.0015) model time 0.2224 (0.2240) loss 4.7685 (4.3305) grad_norm 1.8697 (2.2210) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [15/300][1190/1251] eta 0:00:13 lr 0.000798 wd 0.0500 time 0.2221 (0.2255) data time 0.0007 (0.0015) model time 0.2214 (0.2240) loss 5.4820 (4.3309) grad_norm 2.5778 (2.2238) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1200/1251] eta 0:00:11 lr 0.000798 wd 0.0500 time 0.2233 (0.2255) data time 0.0006 (0.0015) model time 0.2227 (0.2240) loss 3.4022 (4.3320) grad_norm 1.8299 (2.2228) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1210/1251] eta 0:00:09 lr 0.000799 wd 0.0500 time 0.2245 (0.2254) data time 0.0009 (0.0015) model time 0.2236 (0.2240) loss 4.0009 (4.3297) grad_norm 2.1504 (2.2276) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1220/1251] eta 0:00:06 lr 0.000799 wd 0.0500 time 0.2260 (0.2254) data time 0.0008 (0.0015) model time 0.2252 (0.2240) loss 3.7603 (4.3296) grad_norm 3.3576 (2.2301) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1230/1251] eta 0:00:04 lr 0.000799 wd 0.0500 time 0.2219 (0.2254) data time 0.0008 (0.0014) model time 0.2211 (0.2240) loss 4.8840 (4.3268) grad_norm 2.1326 (2.2297) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1240/1251] eta 0:00:02 lr 0.000800 wd 0.0500 time 0.2169 (0.2254) data time 0.0005 (0.0014) model time 0.2164 (0.2239) loss 4.8031 (4.3251) grad_norm 2.6788 (2.2301) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [15/300][1250/1251] eta 0:00:00 lr 0.000800 wd 0.0500 time 0.2153 (0.2253) data time 0.0003 (0.0014) model time 0.2150 (0.2238) loss 4.4810 (4.3239) grad_norm 3.8783 (2.2320) 
loss_scale 16384.0000 (16384.0000) mem 7381MB
[2024-08-21 09:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 15 training takes 0:04:41
[2024-08-21 09:12:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-21 09:12:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-21 09:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.383 (0.383) Loss 1.0195 (1.0195) Acc@1 78.320 (78.320) Acc@5 93.652 (93.652) Mem 7381MB
[2024-08-21 09:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.098) Loss 1.3252 (1.3416) Acc@1 69.531 (68.919) Acc@5 91.113 (90.776) Mem 7381MB
[2024-08-21 09:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.085) Loss 1.8330 (1.3374) Acc@1 59.180 (69.299) Acc@5 82.715 (90.839) Mem 7381MB
[2024-08-21 09:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.066 (0.081) Loss 2.1992 (1.5240) Acc@1 51.953 (65.486) Acc@5 75.781 (87.853) Mem 7381MB
[2024-08-21 09:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 2.1484 (1.6265) Acc@1 53.125 (63.553) Acc@5 78.027 (86.164) Mem 7381MB
[2024-08-21 09:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 63.334 Acc@5 86.032
[2024-08-21 09:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 63.3%
[2024-08-21 09:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 63.33%
[2024-08-21 09:12:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-21 09:12:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-21 09:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.362 (0.362) Loss 1.6357 (1.6357) Acc@1 67.578 (67.578) Acc@5 88.867 (88.867) Mem 7381MB
[2024-08-21 09:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.065 (0.094) Loss 2.0254 (2.1802) Acc@1 56.543 (54.057) Acc@5 84.961 (79.936) Mem 7381MB
[2024-08-21 09:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.064 (0.083) Loss 2.8184 (2.1686) Acc@1 44.629 (54.288) Acc@5 69.043 (80.483) Mem 7381MB
[2024-08-21 09:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.078) Loss 2.8457 (2.3564) Acc@1 45.801 (51.742) Acc@5 67.578 (77.199) Mem 7381MB
[2024-08-21 09:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.074) Loss 3.0098 (2.4764) Acc@1 39.062 (49.736) Acc@5 64.941 (75.191) Mem 7381MB
[2024-08-21 09:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 49.984 Acc@5 75.402
[2024-08-21 09:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 50.0%
[2024-08-21 09:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 49.98%
[2024-08-21 09:12:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-21 09:12:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-21 09:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][0/1251] eta 0:14:43 lr 0.000800 wd 0.0500 time 0.7060 (0.7060) data time 0.5000 (0.5000) model time 0.0000 (0.0000) loss 4.8435 (4.8435) grad_norm 1.5862 (1.5862) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][10/1251] eta 0:05:34 lr 0.000801 wd 0.0500 time 0.2307 (0.2695) data time 0.0007 (0.0463) model time 0.0000 (0.0000) loss 4.9602 (4.2618) grad_norm 1.8608 (2.0734) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][20/1251] eta 0:05:06 lr 0.000801 wd 0.0500 time 0.2307 (0.2486) data time 0.0012 (0.0247) model time 0.0000 (0.0000) loss 4.1087 (4.2887) grad_norm 2.1512 (1.9923) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][30/1251] eta 0:04:53 lr 0.000801 wd 0.0500 time 0.2269 (0.2406) data time 0.0008 (0.0171) model time 0.0000 (0.0000) loss 4.3845 (4.3414) grad_norm 1.9337 (2.0565) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][40/1251] eta 0:04:46 lr 0.000802 wd 0.0500 time 0.2301 (0.2364) data time 0.0007 (0.0131) model time 0.0000 (0.0000) loss 4.5864 (4.3221) grad_norm 1.6620 (2.1192) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][50/1251] eta 0:04:40 lr 0.000802 wd 0.0500 time 0.2259 (0.2339) data time 0.0008 (0.0107) model time 0.0000 (0.0000) loss 5.2566 (4.2941) grad_norm 1.5796 (2.1066) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][60/1251] eta 0:04:41 lr 0.000803 wd 0.0500 time 0.2271 (0.2361) data time 0.0011 (0.0092) model time 0.2260 
(0.2465) loss 4.0828 (4.2662) grad_norm 2.0102 (2.0885) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][70/1251] eta 0:04:37 lr 0.000803 wd 0.0500 time 0.2238 (0.2346) data time 0.0006 (0.0080) model time 0.2232 (0.2354) loss 4.3638 (4.2628) grad_norm 1.7219 (2.1133) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][80/1251] eta 0:04:33 lr 0.000803 wd 0.0500 time 0.2183 (0.2336) data time 0.0007 (0.0072) model time 0.2176 (0.2320) loss 4.9846 (4.2956) grad_norm 2.8747 (2.1299) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][90/1251] eta 0:04:29 lr 0.000804 wd 0.0500 time 0.2208 (0.2324) data time 0.0009 (0.0065) model time 0.2199 (0.2294) loss 3.8863 (4.3016) grad_norm 2.7270 (2.1549) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][100/1251] eta 0:04:26 lr 0.000804 wd 0.0500 time 0.2227 (0.2314) data time 0.0008 (0.0059) model time 0.2220 (0.2279) loss 5.1435 (4.2961) grad_norm 2.8472 (2.1674) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][110/1251] eta 0:04:23 lr 0.000805 wd 0.0500 time 0.2303 (0.2307) data time 0.0006 (0.0055) model time 0.2297 (0.2270) loss 4.6743 (4.2914) grad_norm 1.5326 (2.1585) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][120/1251] eta 0:04:20 lr 0.000805 wd 0.0500 time 0.2271 (0.2303) data time 0.0009 (0.0051) model time 0.2262 (0.2267) loss 3.3130 (4.2927) grad_norm 2.0485 (2.1277) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[16/300][130/1251] eta 0:04:17 lr 0.000805 wd 0.0500 time 0.2274 (0.2299) data time 0.0007 (0.0048) model time 0.2267 (0.2263) loss 4.8354 (4.2829) grad_norm 1.8561 (2.1270) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][140/1251] eta 0:04:14 lr 0.000806 wd 0.0500 time 0.2309 (0.2295) data time 0.0006 (0.0045) model time 0.2302 (0.2260) loss 3.7559 (4.2835) grad_norm 1.9728 (2.1195) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][150/1251] eta 0:04:12 lr 0.000806 wd 0.0500 time 0.2221 (0.2291) data time 0.0008 (0.0043) model time 0.2213 (0.2257) loss 3.3291 (4.2633) grad_norm 2.7468 (2.1237) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][160/1251] eta 0:04:09 lr 0.000807 wd 0.0500 time 0.2255 (0.2287) data time 0.0008 (0.0041) model time 0.2247 (0.2253) loss 5.6591 (4.2924) grad_norm 1.7427 (2.1185) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][170/1251] eta 0:04:06 lr 0.000807 wd 0.0500 time 0.2253 (0.2285) data time 0.0008 (0.0039) model time 0.2244 (0.2251) loss 2.9093 (4.2666) grad_norm 1.7838 (2.1012) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][180/1251] eta 0:04:04 lr 0.000807 wd 0.0500 time 0.2220 (0.2282) data time 0.0007 (0.0038) model time 0.2213 (0.2249) loss 4.4890 (4.2686) grad_norm 1.9937 (2.0986) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][190/1251] eta 0:04:02 lr 0.000808 wd 0.0500 time 0.2197 (0.2281) data time 0.0008 (0.0036) model time 0.2189 (0.2250) loss 3.1039 (4.2722) grad_norm 1.8392 (2.0955) loss_scale 
16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][200/1251] eta 0:03:59 lr 0.000808 wd 0.0500 time 0.2218 (0.2280) data time 0.0006 (0.0035) model time 0.2211 (0.2250) loss 3.6695 (4.2769) grad_norm 2.2892 (2.0926) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][210/1251] eta 0:03:58 lr 0.000809 wd 0.0500 time 0.2322 (0.2289) data time 0.0008 (0.0034) model time 0.2314 (0.2263) loss 3.6501 (4.2985) grad_norm 3.5359 (2.1023) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][220/1251] eta 0:03:55 lr 0.000809 wd 0.0500 time 0.2194 (0.2287) data time 0.0009 (0.0033) model time 0.2184 (0.2261) loss 4.6450 (4.2797) grad_norm 2.6693 (2.1412) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][230/1251] eta 0:03:53 lr 0.000809 wd 0.0500 time 0.2232 (0.2284) data time 0.0008 (0.0032) model time 0.2225 (0.2259) loss 4.6112 (4.2773) grad_norm 1.8598 (2.1323) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][240/1251] eta 0:03:50 lr 0.000810 wd 0.0500 time 0.2299 (0.2284) data time 0.0008 (0.0031) model time 0.2291 (0.2259) loss 4.4103 (4.2810) grad_norm 2.0303 (2.1232) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][250/1251] eta 0:03:48 lr 0.000810 wd 0.0500 time 0.2245 (0.2282) data time 0.0008 (0.0030) model time 0.2238 (0.2257) loss 3.8938 (4.2790) grad_norm 1.6166 (2.1191) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][260/1251] eta 0:03:46 lr 0.000811 wd 0.0500 time 0.2177 (0.2281) 
data time 0.0009 (0.0029) model time 0.2168 (0.2257) loss 3.7850 (4.2808) grad_norm 2.9289 (2.1505) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][270/1251] eta 0:03:43 lr 0.000811 wd 0.0500 time 0.2240 (0.2280) data time 0.0007 (0.0028) model time 0.2232 (0.2256) loss 5.2175 (4.2856) grad_norm 1.8017 (2.1558) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][280/1251] eta 0:03:41 lr 0.000811 wd 0.0500 time 0.2236 (0.2279) data time 0.0009 (0.0028) model time 0.2227 (0.2255) loss 4.3200 (4.2875) grad_norm 2.4166 (2.1488) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][290/1251] eta 0:03:38 lr 0.000812 wd 0.0500 time 0.2287 (0.2278) data time 0.0008 (0.0027) model time 0.2279 (0.2254) loss 4.5688 (4.2866) grad_norm 2.2907 (2.1418) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][300/1251] eta 0:03:36 lr 0.000812 wd 0.0500 time 0.2307 (0.2277) data time 0.0008 (0.0027) model time 0.2299 (0.2254) loss 4.9363 (4.2890) grad_norm 1.7683 (2.1563) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][310/1251] eta 0:03:34 lr 0.000813 wd 0.0500 time 0.2296 (0.2277) data time 0.0009 (0.0026) model time 0.2287 (0.2254) loss 4.9001 (4.3004) grad_norm 1.9737 (2.1601) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][320/1251] eta 0:03:31 lr 0.000813 wd 0.0500 time 0.2254 (0.2276) data time 0.0010 (0.0026) model time 0.2243 (0.2254) loss 4.3215 (4.3088) grad_norm 1.6032 (2.1516) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:48 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [16/300][330/1251] eta 0:03:29 lr 0.000813 wd 0.0500 time 0.2254 (0.2276) data time 0.0010 (0.0025) model time 0.2244 (0.2254) loss 3.1787 (4.3108) grad_norm 1.4229 (2.1631) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][340/1251] eta 0:03:27 lr 0.000814 wd 0.0500 time 0.2197 (0.2274) data time 0.0009 (0.0025) model time 0.2188 (0.2252) loss 4.7569 (4.3125) grad_norm 3.0424 (2.1732) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][350/1251] eta 0:03:24 lr 0.000814 wd 0.0500 time 0.2236 (0.2273) data time 0.0008 (0.0024) model time 0.2228 (0.2252) loss 4.4271 (4.3095) grad_norm 2.3799 (2.1777) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][360/1251] eta 0:03:22 lr 0.000815 wd 0.0500 time 0.2249 (0.2273) data time 0.0008 (0.0024) model time 0.2241 (0.2252) loss 4.2721 (4.3108) grad_norm 2.9242 (2.1717) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][370/1251] eta 0:03:20 lr 0.000815 wd 0.0500 time 0.2280 (0.2272) data time 0.0008 (0.0023) model time 0.2272 (0.2251) loss 4.7576 (4.3060) grad_norm 2.6878 (2.1817) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][380/1251] eta 0:03:17 lr 0.000815 wd 0.0500 time 0.2162 (0.2271) data time 0.0008 (0.0023) model time 0.2154 (0.2250) loss 5.2240 (4.3080) grad_norm 1.7797 (2.1923) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][390/1251] eta 0:03:15 lr 0.000816 wd 0.0500 time 0.2188 (0.2271) data time 0.0008 (0.0023) model time 0.2180 (0.2250) loss 3.8323 (4.3095) 
grad_norm 2.2641 (2.1943) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][400/1251] eta 0:03:13 lr 0.000816 wd 0.0500 time 0.2219 (0.2270) data time 0.0009 (0.0022) model time 0.2210 (0.2250) loss 5.0532 (4.3195) grad_norm 1.9916 (2.1918) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][410/1251] eta 0:03:10 lr 0.000817 wd 0.0500 time 0.2223 (0.2269) data time 0.0009 (0.0022) model time 0.2214 (0.2250) loss 4.4071 (4.3165) grad_norm 1.6370 (2.1899) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][420/1251] eta 0:03:08 lr 0.000817 wd 0.0500 time 0.2169 (0.2269) data time 0.0009 (0.0022) model time 0.2161 (0.2249) loss 4.2650 (4.3172) grad_norm 1.6213 (2.1937) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][430/1251] eta 0:03:06 lr 0.000817 wd 0.0500 time 0.2275 (0.2268) data time 0.0009 (0.0022) model time 0.2265 (0.2249) loss 4.1341 (4.3160) grad_norm 1.7223 (2.1988) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][440/1251] eta 0:03:03 lr 0.000818 wd 0.0500 time 0.2203 (0.2268) data time 0.0008 (0.0021) model time 0.2196 (0.2248) loss 5.1816 (4.3193) grad_norm 2.5787 (2.1989) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][450/1251] eta 0:03:01 lr 0.000818 wd 0.0500 time 0.2230 (0.2267) data time 0.0008 (0.0021) model time 0.2222 (0.2248) loss 4.8202 (4.3216) grad_norm 2.2265 (2.2010) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][460/1251] eta 0:02:59 lr 
0.000819 wd 0.0500 time 0.2221 (0.2267) data time 0.0006 (0.0021) model time 0.2215 (0.2248) loss 4.5544 (4.3283) grad_norm 2.2022 (2.2020) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][470/1251] eta 0:02:57 lr 0.000819 wd 0.0500 time 0.2306 (0.2267) data time 0.0007 (0.0021) model time 0.2298 (0.2248) loss 3.3752 (4.3288) grad_norm 1.4394 (2.2025) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][480/1251] eta 0:02:54 lr 0.000819 wd 0.0500 time 0.2256 (0.2267) data time 0.0006 (0.0020) model time 0.2249 (0.2248) loss 4.7926 (4.3323) grad_norm 2.3601 (2.2044) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][490/1251] eta 0:02:52 lr 0.000820 wd 0.0500 time 0.2363 (0.2266) data time 0.0008 (0.0020) model time 0.2355 (0.2248) loss 4.6285 (4.3304) grad_norm 1.7450 (2.2021) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][500/1251] eta 0:02:50 lr 0.000820 wd 0.0500 time 0.2296 (0.2266) data time 0.0006 (0.0020) model time 0.2290 (0.2248) loss 3.0139 (4.3256) grad_norm 2.2393 (2.2037) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][510/1251] eta 0:02:47 lr 0.000821 wd 0.0500 time 0.2212 (0.2266) data time 0.0007 (0.0020) model time 0.2205 (0.2247) loss 4.7117 (4.3227) grad_norm 1.9734 (2.2016) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][520/1251] eta 0:02:45 lr 0.000821 wd 0.0500 time 0.2259 (0.2265) data time 0.0008 (0.0020) model time 0.2251 (0.2247) loss 4.7317 (4.3249) grad_norm 2.1438 (2.2069) loss_scale 16384.0000 (16384.0000) mem 7381MB 
[2024-08-21 09:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][530/1251] eta 0:02:43 lr 0.000821 wd 0.0500 time 0.2311 (0.2265) data time 0.0008 (0.0019) model time 0.2303 (0.2247) loss 4.9283 (4.3262) grad_norm 2.6584 (2.2115) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][540/1251] eta 0:02:40 lr 0.000822 wd 0.0500 time 0.2252 (0.2264) data time 0.0009 (0.0019) model time 0.2243 (0.2247) loss 3.0445 (4.3189) grad_norm 1.7561 (2.2096) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][550/1251] eta 0:02:38 lr 0.000822 wd 0.0500 time 0.2248 (0.2264) data time 0.0008 (0.0019) model time 0.2240 (0.2246) loss 4.6657 (4.3161) grad_norm 1.5923 (2.2112) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][560/1251] eta 0:02:36 lr 0.000823 wd 0.0500 time 0.2257 (0.2264) data time 0.0006 (0.0019) model time 0.2251 (0.2246) loss 4.0134 (4.3164) grad_norm 1.5634 (2.2105) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][570/1251] eta 0:02:34 lr 0.000823 wd 0.0500 time 0.2267 (0.2263) data time 0.0006 (0.0019) model time 0.2260 (0.2246) loss 5.0203 (4.3200) grad_norm 1.7899 (2.2121) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][580/1251] eta 0:02:31 lr 0.000823 wd 0.0500 time 0.2218 (0.2263) data time 0.0008 (0.0019) model time 0.2210 (0.2246) loss 3.0620 (4.3150) grad_norm 1.6037 (2.2158) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][590/1251] eta 0:02:29 lr 0.000824 wd 0.0500 time 0.2248 (0.2263) data time 0.0006 (0.0018) model time 
0.2241 (0.2245) loss 3.5140 (4.3137) grad_norm 1.8379 (2.2163) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][600/1251] eta 0:02:27 lr 0.000824 wd 0.0500 time 0.2421 (0.2263) data time 0.0006 (0.0018) model time 0.2416 (0.2245) loss 3.7127 (4.3126) grad_norm 1.8919 (2.2210) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][610/1251] eta 0:02:25 lr 0.000825 wd 0.0500 time 0.2263 (0.2262) data time 0.0008 (0.0018) model time 0.2255 (0.2245) loss 3.6011 (4.3090) grad_norm 2.5066 (2.2298) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][620/1251] eta 0:02:22 lr 0.000825 wd 0.0500 time 0.2290 (0.2262) data time 0.0006 (0.0018) model time 0.2284 (0.2245) loss 3.1862 (4.3019) grad_norm 1.2700 (2.2312) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][630/1251] eta 0:02:20 lr 0.000825 wd 0.0500 time 0.2242 (0.2261) data time 0.0010 (0.0018) model time 0.2232 (0.2245) loss 4.2691 (4.3027) grad_norm 3.9174 (2.2447) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][640/1251] eta 0:02:18 lr 0.000826 wd 0.0500 time 0.2192 (0.2261) data time 0.0010 (0.0018) model time 0.2182 (0.2245) loss 3.2978 (4.2969) grad_norm 3.1441 (2.2482) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][650/1251] eta 0:02:15 lr 0.000826 wd 0.0500 time 0.2266 (0.2261) data time 0.0007 (0.0018) model time 0.2259 (0.2244) loss 5.0077 (4.2974) grad_norm 1.8008 (2.2472) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[16/300][660/1251] eta 0:02:13 lr 0.000827 wd 0.0500 time 0.2251 (0.2261) data time 0.0006 (0.0018) model time 0.2245 (0.2245) loss 4.5519 (4.2962) grad_norm 1.7777 (2.2455) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][670/1251] eta 0:02:11 lr 0.000827 wd 0.0500 time 0.2294 (0.2261) data time 0.0008 (0.0017) model time 0.2286 (0.2245) loss 4.4435 (4.2989) grad_norm 1.9330 (2.2496) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][680/1251] eta 0:02:09 lr 0.000827 wd 0.0500 time 0.2268 (0.2262) data time 0.0008 (0.0017) model time 0.2260 (0.2245) loss 4.5770 (4.2968) grad_norm 2.2352 (2.2461) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][690/1251] eta 0:02:06 lr 0.000828 wd 0.0500 time 0.2259 (0.2262) data time 0.0008 (0.0017) model time 0.2251 (0.2246) loss 4.4689 (4.3010) grad_norm 2.7174 (2.2465) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][700/1251] eta 0:02:04 lr 0.000828 wd 0.0500 time 0.2284 (0.2261) data time 0.0007 (0.0017) model time 0.2278 (0.2245) loss 2.9323 (4.2989) grad_norm 2.3668 (2.2458) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][710/1251] eta 0:02:02 lr 0.000829 wd 0.0500 time 0.2288 (0.2261) data time 0.0007 (0.0017) model time 0.2281 (0.2245) loss 4.0038 (4.2974) grad_norm 2.8255 (2.2446) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][720/1251] eta 0:02:00 lr 0.000829 wd 0.0500 time 0.2263 (0.2261) data time 0.0009 (0.0017) model time 0.2254 (0.2245) loss 4.1108 (4.2958) grad_norm 3.3621 (2.2548) loss_scale 
16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][730/1251] eta 0:01:57 lr 0.000829 wd 0.0500 time 0.2326 (0.2263) data time 0.0008 (0.0017) model time 0.2317 (0.2247) loss 4.6598 (4.2978) grad_norm 2.5195 (2.2563) loss_scale 32768.0000 (16451.2394) mem 7381MB [2024-08-21 09:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][740/1251] eta 0:01:55 lr 0.000830 wd 0.0500 time 0.2225 (0.2263) data time 0.0009 (0.0017) model time 0.2216 (0.2247) loss 4.4328 (4.2952) grad_norm 2.0509 (2.2534) loss_scale 32768.0000 (16671.4386) mem 7381MB [2024-08-21 09:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][750/1251] eta 0:01:53 lr 0.000830 wd 0.0500 time 0.2192 (0.2263) data time 0.0008 (0.0017) model time 0.2185 (0.2247) loss 5.4283 (4.2989) grad_norm 2.3456 (2.2514) loss_scale 32768.0000 (16885.7736) mem 7381MB [2024-08-21 09:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][760/1251] eta 0:01:51 lr 0.000831 wd 0.0500 time 0.2251 (0.2263) data time 0.0006 (0.0016) model time 0.2245 (0.2247) loss 3.9020 (4.2979) grad_norm 2.0568 (2.2519) loss_scale 32768.0000 (17094.4757) mem 7381MB [2024-08-21 09:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][770/1251] eta 0:01:48 lr 0.000831 wd 0.0500 time 0.2270 (0.2263) data time 0.0010 (0.0016) model time 0.2259 (0.2247) loss 3.6563 (4.2986) grad_norm 2.8577 (2.2541) loss_scale 32768.0000 (17297.7639) mem 7381MB [2024-08-21 09:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][780/1251] eta 0:01:46 lr 0.000831 wd 0.0500 time 0.2215 (0.2262) data time 0.0008 (0.0016) model time 0.2206 (0.2247) loss 3.6882 (4.2959) grad_norm 1.9983 (2.2516) loss_scale 32768.0000 (17495.8464) mem 7381MB [2024-08-21 09:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][790/1251] eta 0:01:44 lr 0.000832 wd 0.0500 time 0.2272 (0.2262) 
data time 0.0008 (0.0016) model time 0.2264 (0.2247) loss 4.5954 (4.2974) grad_norm 2.8003 (inf) loss_scale 16384.0000 (17647.4943) mem 7381MB [2024-08-21 09:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][800/1251] eta 0:01:42 lr 0.000832 wd 0.0500 time 0.2305 (0.2262) data time 0.0009 (0.0016) model time 0.2296 (0.2247) loss 3.0234 (4.2925) grad_norm 1.5497 (inf) loss_scale 16384.0000 (17631.7203) mem 7381MB [2024-08-21 09:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][810/1251] eta 0:01:39 lr 0.000833 wd 0.0500 time 0.2246 (0.2262) data time 0.0010 (0.0016) model time 0.2236 (0.2247) loss 5.1696 (4.2954) grad_norm 2.4396 (inf) loss_scale 16384.0000 (17616.3354) mem 7381MB [2024-08-21 09:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][820/1251] eta 0:01:37 lr 0.000833 wd 0.0500 time 0.2299 (0.2262) data time 0.0009 (0.0016) model time 0.2291 (0.2247) loss 3.7400 (4.2910) grad_norm 2.6009 (inf) loss_scale 16384.0000 (17601.3252) mem 7381MB [2024-08-21 09:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][830/1251] eta 0:01:35 lr 0.000833 wd 0.0500 time 0.2200 (0.2262) data time 0.0008 (0.0016) model time 0.2192 (0.2247) loss 3.3630 (4.2896) grad_norm 1.9192 (inf) loss_scale 16384.0000 (17586.6763) mem 7381MB [2024-08-21 09:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][840/1251] eta 0:01:32 lr 0.000834 wd 0.0500 time 0.2215 (0.2261) data time 0.0006 (0.0016) model time 0.2208 (0.2246) loss 4.0536 (4.2851) grad_norm 1.4130 (inf) loss_scale 16384.0000 (17572.3757) mem 7381MB [2024-08-21 09:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][850/1251] eta 0:01:30 lr 0.000834 wd 0.0500 time 0.2209 (0.2261) data time 0.0010 (0.0016) model time 0.2199 (0.2246) loss 4.4840 (4.2876) grad_norm 2.5095 (inf) loss_scale 16384.0000 (17558.4113) mem 7381MB [2024-08-21 09:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [16/300][860/1251] eta 0:01:28 lr 0.000835 wd 0.0500 time 0.2190 (0.2261) data time 0.0010 (0.0016) model time 0.2180 (0.2246) loss 4.3253 (4.2856) grad_norm 2.2078 (inf) loss_scale 16384.0000 (17544.7712) mem 7381MB [2024-08-21 09:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][870/1251] eta 0:01:26 lr 0.000835 wd 0.0500 time 0.2264 (0.2261) data time 0.0009 (0.0016) model time 0.2255 (0.2246) loss 4.2726 (4.2856) grad_norm 1.9827 (inf) loss_scale 16384.0000 (17531.4443) mem 7381MB [2024-08-21 09:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][880/1251] eta 0:01:23 lr 0.000835 wd 0.0500 time 0.2212 (0.2261) data time 0.0009 (0.0016) model time 0.2203 (0.2246) loss 3.8657 (4.2847) grad_norm 1.8102 (inf) loss_scale 16384.0000 (17518.4200) mem 7381MB [2024-08-21 09:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][890/1251] eta 0:01:21 lr 0.000836 wd 0.0500 time 0.2302 (0.2261) data time 0.0007 (0.0016) model time 0.2295 (0.2246) loss 3.2632 (4.2857) grad_norm 1.8084 (inf) loss_scale 16384.0000 (17505.6880) mem 7381MB [2024-08-21 09:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][900/1251] eta 0:01:19 lr 0.000836 wd 0.0500 time 0.2316 (0.2261) data time 0.0008 (0.0016) model time 0.2308 (0.2246) loss 5.0822 (4.2829) grad_norm 3.6860 (inf) loss_scale 16384.0000 (17493.2386) mem 7381MB [2024-08-21 09:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][910/1251] eta 0:01:17 lr 0.000837 wd 0.0500 time 0.2210 (0.2261) data time 0.0007 (0.0015) model time 0.2203 (0.2246) loss 4.1383 (4.2827) grad_norm 2.4442 (inf) loss_scale 16384.0000 (17481.0626) mem 7381MB [2024-08-21 09:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][920/1251] eta 0:01:14 lr 0.000837 wd 0.0500 time 0.2292 (0.2261) data time 0.0008 (0.0015) model time 0.2284 (0.2246) loss 3.3553 (4.2804) grad_norm 2.6768 (inf) loss_scale 16384.0000 
(17469.1509) mem 7381MB [2024-08-21 09:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][930/1251] eta 0:01:12 lr 0.000837 wd 0.0500 time 0.2246 (0.2261) data time 0.0010 (0.0015) model time 0.2236 (0.2246) loss 4.8267 (4.2799) grad_norm 2.4679 (inf) loss_scale 16384.0000 (17457.4952) mem 7381MB [2024-08-21 09:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][940/1251] eta 0:01:10 lr 0.000838 wd 0.0500 time 0.2286 (0.2261) data time 0.0008 (0.0015) model time 0.2277 (0.2246) loss 3.5838 (4.2790) grad_norm 2.2348 (inf) loss_scale 16384.0000 (17446.0871) mem 7381MB [2024-08-21 09:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][950/1251] eta 0:01:08 lr 0.000838 wd 0.0500 time 0.2119 (0.2260) data time 0.0015 (0.0015) model time 0.2104 (0.2246) loss 4.6209 (4.2797) grad_norm 1.7961 (inf) loss_scale 16384.0000 (17434.9190) mem 7381MB [2024-08-21 09:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][960/1251] eta 0:01:05 lr 0.000839 wd 0.0500 time 0.2189 (0.2260) data time 0.0010 (0.0015) model time 0.2179 (0.2246) loss 3.9880 (4.2821) grad_norm 1.7849 (inf) loss_scale 16384.0000 (17423.9834) mem 7381MB [2024-08-21 09:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][970/1251] eta 0:01:03 lr 0.000839 wd 0.0500 time 0.2264 (0.2260) data time 0.0009 (0.0015) model time 0.2255 (0.2246) loss 3.8658 (4.2810) grad_norm 2.0613 (inf) loss_scale 16384.0000 (17413.2729) mem 7381MB [2024-08-21 09:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][980/1251] eta 0:01:01 lr 0.000839 wd 0.0500 time 0.2231 (0.2260) data time 0.0014 (0.0015) model time 0.2217 (0.2245) loss 4.8474 (4.2837) grad_norm 1.8311 (inf) loss_scale 16384.0000 (17402.7808) mem 7381MB [2024-08-21 09:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][990/1251] eta 0:00:59 lr 0.000840 wd 0.0500 time 0.2212 (0.2262) data time 0.0008 (0.0015) model 
time 0.2205 (0.2247) loss 4.6385 (4.2871) grad_norm 2.0214 (inf) loss_scale 16384.0000 (17392.5005) mem 7381MB [2024-08-21 09:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1000/1251] eta 0:00:56 lr 0.000840 wd 0.0500 time 0.2231 (0.2262) data time 0.0007 (0.0015) model time 0.2224 (0.2247) loss 4.0556 (4.2866) grad_norm 2.0145 (inf) loss_scale 16384.0000 (17382.4256) mem 7381MB [2024-08-21 09:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1010/1251] eta 0:00:54 lr 0.000841 wd 0.0500 time 0.2175 (0.2261) data time 0.0009 (0.0015) model time 0.2166 (0.2247) loss 3.6949 (4.2872) grad_norm 2.2358 (inf) loss_scale 16384.0000 (17372.5500) mem 7381MB [2024-08-21 09:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1020/1251] eta 0:00:52 lr 0.000841 wd 0.0500 time 0.2354 (0.2261) data time 0.0007 (0.0015) model time 0.2347 (0.2247) loss 3.7103 (4.2889) grad_norm 2.0919 (inf) loss_scale 16384.0000 (17362.8678) mem 7381MB [2024-08-21 09:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1030/1251] eta 0:00:49 lr 0.000841 wd 0.0500 time 0.2225 (0.2261) data time 0.0011 (0.0015) model time 0.2215 (0.2247) loss 4.4458 (4.2877) grad_norm 1.5621 (inf) loss_scale 16384.0000 (17353.3734) mem 7381MB [2024-08-21 09:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1040/1251] eta 0:00:47 lr 0.000842 wd 0.0500 time 0.2264 (0.2261) data time 0.0007 (0.0015) model time 0.2257 (0.2247) loss 3.7760 (4.2854) grad_norm 2.8477 (inf) loss_scale 16384.0000 (17344.0615) mem 7381MB [2024-08-21 09:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1050/1251] eta 0:00:45 lr 0.000842 wd 0.0500 time 0.2229 (0.2261) data time 0.0008 (0.0015) model time 0.2222 (0.2247) loss 4.0812 (4.2885) grad_norm 1.6389 (inf) loss_scale 16384.0000 (17334.9267) mem 7381MB [2024-08-21 09:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[16/300][1060/1251] eta 0:00:43 lr 0.000843 wd 0.0500 time 0.2305 (0.2261) data time 0.0006 (0.0015) model time 0.2299 (0.2247) loss 4.7201 (4.2899) grad_norm 2.4682 (inf) loss_scale 16384.0000 (17325.9642) mem 7381MB [2024-08-21 09:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1070/1251] eta 0:00:40 lr 0.000843 wd 0.0500 time 0.2252 (0.2261) data time 0.0007 (0.0015) model time 0.2245 (0.2247) loss 4.6751 (4.2900) grad_norm 2.1436 (inf) loss_scale 16384.0000 (17317.1690) mem 7381MB [2024-08-21 09:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1080/1251] eta 0:00:38 lr 0.000843 wd 0.0500 time 0.2279 (0.2261) data time 0.0006 (0.0015) model time 0.2273 (0.2247) loss 3.2576 (4.2868) grad_norm 2.3339 (inf) loss_scale 16384.0000 (17308.5365) mem 7381MB [2024-08-21 09:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1090/1251] eta 0:00:36 lr 0.000844 wd 0.0500 time 0.2238 (0.2260) data time 0.0008 (0.0015) model time 0.2230 (0.2247) loss 2.9804 (4.2828) grad_norm 3.2118 (inf) loss_scale 16384.0000 (17300.0623) mem 7381MB [2024-08-21 09:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1100/1251] eta 0:00:34 lr 0.000844 wd 0.0500 time 0.2267 (0.2260) data time 0.0006 (0.0015) model time 0.2261 (0.2246) loss 4.9205 (4.2844) grad_norm 1.6187 (inf) loss_scale 16384.0000 (17291.7421) mem 7381MB [2024-08-21 09:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1110/1251] eta 0:00:31 lr 0.000845 wd 0.0500 time 0.2234 (0.2260) data time 0.0008 (0.0015) model time 0.2226 (0.2246) loss 5.2894 (4.2856) grad_norm 3.2328 (inf) loss_scale 16384.0000 (17283.5716) mem 7381MB [2024-08-21 09:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1120/1251] eta 0:00:29 lr 0.000845 wd 0.0500 time 0.2288 (0.2260) data time 0.0007 (0.0015) model time 0.2281 (0.2246) loss 4.9969 (4.2865) grad_norm 2.5926 (inf) loss_scale 16384.0000 
(17275.5468) mem 7381MB [2024-08-21 09:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1130/1251] eta 0:00:27 lr 0.000845 wd 0.0500 time 0.2261 (0.2260) data time 0.0008 (0.0014) model time 0.2252 (0.2246) loss 5.0506 (4.2870) grad_norm 1.8470 (inf) loss_scale 16384.0000 (17267.6640) mem 7381MB [2024-08-21 09:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1140/1251] eta 0:00:25 lr 0.000846 wd 0.0500 time 0.2233 (0.2260) data time 0.0008 (0.0014) model time 0.2226 (0.2246) loss 4.4726 (4.2885) grad_norm 3.2983 (inf) loss_scale 16384.0000 (17259.9194) mem 7381MB [2024-08-21 09:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1150/1251] eta 0:00:22 lr 0.000846 wd 0.0500 time 0.2200 (0.2260) data time 0.0008 (0.0014) model time 0.2193 (0.2246) loss 3.4700 (4.2873) grad_norm 2.0489 (inf) loss_scale 16384.0000 (17252.3093) mem 7381MB [2024-08-21 09:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1160/1251] eta 0:00:20 lr 0.000847 wd 0.0500 time 0.2182 (0.2260) data time 0.0007 (0.0014) model time 0.2174 (0.2246) loss 4.7478 (4.2871) grad_norm 3.7777 (inf) loss_scale 16384.0000 (17244.8303) mem 7381MB [2024-08-21 09:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1170/1251] eta 0:00:18 lr 0.000847 wd 0.0500 time 0.2162 (0.2259) data time 0.0007 (0.0014) model time 0.2154 (0.2246) loss 4.7260 (4.2860) grad_norm 3.4716 (inf) loss_scale 16384.0000 (17237.4791) mem 7381MB [2024-08-21 09:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1180/1251] eta 0:00:16 lr 0.000847 wd 0.0500 time 0.2242 (0.2259) data time 0.0008 (0.0014) model time 0.2234 (0.2246) loss 4.2198 (4.2854) grad_norm 2.3276 (inf) loss_scale 16384.0000 (17230.2523) mem 7381MB [2024-08-21 09:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1190/1251] eta 0:00:13 lr 0.000848 wd 0.0500 time 0.2200 (0.2259) data time 0.0009 (0.0014) 
model time 0.2191 (0.2246) loss 3.8896 (4.2828) grad_norm 2.0169 (inf) loss_scale 16384.0000 (17223.1469) mem 7381MB [2024-08-21 09:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1200/1251] eta 0:00:11 lr 0.000848 wd 0.0500 time 0.2222 (0.2259) data time 0.0009 (0.0014) model time 0.2214 (0.2246) loss 3.8499 (4.2826) grad_norm 2.6590 (inf) loss_scale 16384.0000 (17216.1599) mem 7381MB [2024-08-21 09:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1210/1251] eta 0:00:09 lr 0.000849 wd 0.0500 time 0.2219 (0.2259) data time 0.0007 (0.0014) model time 0.2212 (0.2246) loss 5.1795 (4.2834) grad_norm 2.4469 (inf) loss_scale 16384.0000 (17209.2882) mem 7381MB [2024-08-21 09:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1220/1251] eta 0:00:07 lr 0.000849 wd 0.0500 time 0.2228 (0.2259) data time 0.0008 (0.0014) model time 0.2220 (0.2246) loss 5.1896 (4.2820) grad_norm 1.6660 (inf) loss_scale 16384.0000 (17202.5291) mem 7381MB [2024-08-21 09:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1230/1251] eta 0:00:04 lr 0.000849 wd 0.0500 time 0.2242 (0.2259) data time 0.0007 (0.0014) model time 0.2235 (0.2246) loss 3.2400 (4.2800) grad_norm 2.0053 (inf) loss_scale 16384.0000 (17195.8798) mem 7381MB [2024-08-21 09:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1240/1251] eta 0:00:02 lr 0.000850 wd 0.0500 time 0.2127 (0.2258) data time 0.0004 (0.0014) model time 0.2123 (0.2245) loss 5.0739 (4.2803) grad_norm 3.4867 (inf) loss_scale 16384.0000 (17189.3376) mem 7381MB [2024-08-21 09:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [16/300][1250/1251] eta 0:00:00 lr 0.000850 wd 0.0500 time 0.2149 (0.2259) data time 0.0006 (0.0014) model time 0.2143 (0.2246) loss 3.8200 (4.2797) grad_norm 2.2289 (inf) loss_scale 16384.0000 (17182.9001) mem 7381MB [2024-08-21 09:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 16 
training takes 0:04:42 [2024-08-21 09:17:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-21 09:17:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-21 09:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.417 (0.417) Loss 0.9609 (0.9609) Acc@1 79.688 (79.688) Acc@5 95.117 (95.117) Mem 7381MB [2024-08-21 09:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.102) Loss 1.2549 (1.2551) Acc@1 70.898 (71.112) Acc@5 91.602 (91.610) Mem 7381MB [2024-08-21 09:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.088) Loss 1.9307 (1.2739) Acc@1 57.031 (70.340) Acc@5 82.617 (91.546) Mem 7381MB [2024-08-21 09:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.083) Loss 2.1172 (1.4590) Acc@1 52.637 (66.548) Acc@5 77.246 (88.533) Mem 7381MB [2024-08-21 09:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.078) Loss 2.0957 (1.5724) Acc@1 54.395 (64.379) Acc@5 79.102 (86.821) Mem 7381MB [2024-08-21 09:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 64.320 Acc@5 86.706 [2024-08-21 09:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 64.3% [2024-08-21 09:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 64.32% [2024-08-21 09:17:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-21 09:17:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-21 09:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.391 (0.391) Loss 1.3896 (1.3896) Acc@1 71.484 (71.484) Acc@5 90.430 (90.430) Mem 7381MB [2024-08-21 09:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.065 (0.097) Loss 1.7930 (1.9123) Acc@1 60.547 (58.052) Acc@5 86.816 (83.345) Mem 7381MB [2024-08-21 09:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.067 (0.084) Loss 2.5469 (1.9066) Acc@1 48.047 (58.287) Acc@5 73.633 (83.654) Mem 7381MB [2024-08-21 09:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.080) Loss 2.6172 (2.1014) Acc@1 49.121 (55.418) Acc@5 70.996 (80.421) Mem 7381MB [2024-08-21 09:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 2.7891 (2.2243) Acc@1 41.699 (53.296) Acc@5 69.629 (78.370) Mem 7381MB [2024-08-21 09:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 53.440 Acc@5 78.470 [2024-08-21 09:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 53.4% [2024-08-21 09:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 53.44% [2024-08-21 09:17:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-21 09:17:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-21 09:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][0/1251] eta 0:13:11 lr 0.000850 wd 0.0500 time 0.6331 (0.6331) data time 0.4165 (0.4165) model time 0.0000 (0.0000) loss 4.7136 (4.7136) grad_norm 1.6828 (1.6828) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][10/1251] eta 0:05:23 lr 0.000851 wd 0.0500 time 0.2232 (0.2609) data time 0.0010 (0.0388) model time 0.0000 (0.0000) loss 3.8547 (4.4729) grad_norm 2.6637 (1.9990) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][20/1251] eta 0:05:01 lr 0.000851 wd 0.0500 time 0.2277 (0.2446) data time 0.0006 (0.0208) model time 0.0000 (0.0000) loss 4.0681 (4.2291) grad_norm 1.7759 (1.9140) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][30/1251] eta 0:04:50 lr 0.000851 wd 0.0500 time 0.2289 (0.2380) data time 0.0009 (0.0144) model time 0.0000 (0.0000) loss 4.1082 (4.2117) grad_norm 2.2370 (2.0180) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:17:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][40/1251] eta 0:04:44 lr 0.000852 wd 0.0500 time 0.2282 (0.2346) data time 0.0008 (0.0111) model time 0.0000 (0.0000) loss 4.2147 (4.2547) grad_norm 2.8428 (2.0067) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][50/1251] eta 0:04:38 lr 0.000852 wd 0.0500 time 0.2230 (0.2321) data time 0.0007 (0.0091) model time 0.0000 (0.0000) loss 4.9076 (4.2195) grad_norm 2.4502 (2.0598) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-21 09:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-21 09:17:38 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-21 09:17:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-21 09:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 09:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 09:19:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-21 09:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 09:19:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-21 09:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 09:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 09:27:16 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-21 09:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 09:27:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-21 09:27:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-21 09:27:25 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-21 09:27:25 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-21 09:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-21 09:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 09:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 09:33:05 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-21 09:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-21 09:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-21 09:35:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-21 09:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-21 09:35:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-21 09:35:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-21 09:35:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-21 09:35:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-21 09:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 03:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 03:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 03:36:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 03:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 03:36:45 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-23 03:36:46 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 03:36:47 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 03:36:47 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 03:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 03:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][60/1251] eta 0:45:02 lr 0.000853 wd 0.0500 time 0.2270 (2.2695) data time 0.0008 (0.1209) model time 0.2263 (2.1486) loss 4.8828 (4.8616) grad_norm 3.3440 (2.9707) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][70/1251] eta 0:19:31 lr 0.000853 wd 0.0500 time 0.2191 (0.9923) data time 0.0009 (0.0460) model time 0.2183 (0.9463) loss 4.3686 (4.5258) grad_norm 1.3229 (2.6352) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][80/1251] eta 0:13:36 lr 0.000853 wd 0.0500 time 0.2240 (0.6972) data time 0.0006 (0.0286) model time 0.2234 (0.6686) loss 4.5138 (4.5128) grad_norm 1.5986 (2.4058) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][90/1251] eta 0:10:58 lr 0.000854 wd 0.0500 time 0.2287 (0.5669) data time 0.0009 (0.0209) model time 0.2278 (0.5460) loss 4.2264 (4.4753) grad_norm 2.8392 (2.2881) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][100/1251] eta 0:09:27 lr 0.000854 wd 0.0500 time 0.2270 (0.4927) data time 0.0010 (0.0166) model time 0.2260 (0.4761) loss 3.7995 (4.4075) grad_norm 2.0256 (2.2342) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [17/300][110/1251] eta 0:08:28 lr 0.000855 wd 0.0500 time 0.2304 (0.4453) data time 0.0007 (0.0138) model time 0.2297 (0.4315) loss 5.0834 (4.4263) grad_norm 2.1995 (2.2962) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][120/1251] eta 0:07:46 lr 0.000855 wd 0.0500 time 0.2268 (0.4124) data time 0.0006 (0.0120) model time 0.2262 (0.4004) loss 3.5726 (4.3888) grad_norm 2.5734 (2.3359) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][130/1251] eta 0:07:15 lr 0.000855 wd 0.0500 time 0.2268 (0.3881) data time 0.0008 (0.0106) model time 0.2259 (0.3775) loss 4.7848 (4.3606) grad_norm 2.1698 (2.3062) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][140/1251] eta 0:06:50 lr 0.000856 wd 0.0500 time 0.2214 (0.3691) data time 0.0007 (0.0094) model time 0.2207 (0.3597) loss 3.0013 (4.3221) grad_norm 1.7966 (2.2701) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][150/1251] eta 0:06:29 lr 0.000856 wd 0.0500 time 0.2311 (0.3542) data time 0.0006 (0.0085) model time 0.2305 (0.3457) loss 4.5265 (4.3253) grad_norm 2.5362 (2.2939) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][160/1251] eta 0:06:13 lr 0.000857 wd 0.0500 time 0.2286 (0.3421) data time 0.0008 (0.0078) model time 0.2278 (0.3343) loss 4.9937 (4.3615) grad_norm 2.2769 (2.2775) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 03:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 03:37:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth 
saving...... [2024-08-23 03:37:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-23 05:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 05:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 05:34:43 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 05:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 05:34:54 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-23 05:34:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 05:34:57 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 05:34:57 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 05:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 05:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][170/1251] eta 0:27:59 lr 0.000857 wd 0.0500 time 0.2265 (1.5538) data time 0.0008 (0.1146) model time 0.2256 (1.4391) loss 4.6567 (4.7616) grad_norm 3.3893 (2.4592) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][180/1251] eta 0:15:51 lr 0.000857 wd 0.0500 time 0.2226 (0.8889) data time 0.0007 (0.0578) model time 0.2220 (0.8311) loss 4.4868 (4.4869) grad_norm 1.5304 (2.2287) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [17/300][190/1251] eta 0:11:48 lr 0.000858 wd 0.0500 time 0.2243 (0.6674) data time 0.0009 (0.0388) model time 0.2234 (0.6286) loss 4.7205 (4.5673) grad_norm 4.3653 (2.2889) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][200/1251] eta 0:09:44 lr 0.000858 wd 0.0500 time 0.2239 (0.5565) data time 0.0007 (0.0294) model time 0.2231 (0.5271) loss 3.9470 (4.4750) grad_norm 2.0319 (2.2587) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][210/1251] eta 0:08:30 lr 0.000859 wd 0.0500 time 0.2233 (0.4903) data time 0.0009 (0.0238) model time 0.2224 (0.4665) loss 4.0948 (4.4334) grad_norm 1.7081 (2.2299) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][220/1251] eta 0:07:39 lr 0.000859 wd 0.0500 time 0.2259 (0.4461) data time 0.0007 (0.0199) model time 0.2252 (0.4261) loss 4.4959 (4.3969) grad_norm 1.9490 (2.2080) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][230/1251] eta 0:07:02 lr 0.000859 wd 0.0500 time 0.2183 (0.4139) data time 0.0007 (0.0172) model time 0.2176 (0.3967) loss 3.1868 (4.3643) grad_norm 2.4884 (2.2086) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][240/1251] eta 0:06:34 lr 0.000860 wd 0.0500 time 0.2231 (0.3899) data time 0.0008 (0.0152) model time 0.2222 (0.3747) loss 4.8194 (4.3383) grad_norm 2.9331 (2.2313) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][250/1251] eta 0:06:11 lr 0.000860 wd 0.0500 time 0.2191 (0.3712) data time 0.0008 (0.0136) model time 0.2183 (0.3576) loss 4.7199 (4.3103) grad_norm 2.7103 (2.2459) 
loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][260/1251] eta 0:05:53 lr 0.000861 wd 0.0500 time 0.2323 (0.3565) data time 0.0009 (0.0123) model time 0.2314 (0.3442) loss 4.5014 (4.3387) grad_norm 2.0308 (2.2178) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][270/1251] eta 0:05:38 lr 0.000861 wd 0.0500 time 0.2192 (0.3446) data time 0.0016 (0.0113) model time 0.2176 (0.3333) loss 3.8499 (4.3441) grad_norm 2.3238 (2.2163) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][280/1251] eta 0:05:24 lr 0.000861 wd 0.0500 time 0.2303 (0.3346) data time 0.0008 (0.0104) model time 0.2295 (0.3242) loss 4.9178 (4.3418) grad_norm 2.0958 (2.2361) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][290/1251] eta 0:05:13 lr 0.000862 wd 0.0500 time 0.2202 (0.3261) data time 0.0007 (0.0097) model time 0.2195 (0.3164) loss 4.0605 (4.3126) grad_norm 3.0511 (2.2296) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][300/1251] eta 0:05:03 lr 0.000862 wd 0.0500 time 0.2268 (0.3188) data time 0.0008 (0.0091) model time 0.2260 (0.3097) loss 3.3037 (4.3044) grad_norm 1.8951 (2.2680) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 05:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 05:35:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 05:35:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 05:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 05:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 05:40:36 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 05:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 05:40:47 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-23 05:40:48 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 05:40:49 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 05:40:49 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 05:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 05:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][310/1251] eta 3:11:27 lr 0.000863 wd 0.0500 time 12.2079 (12.2079) data time 0.5994 (0.5994) model time 11.6085 (11.6085) loss 5.0972 (5.0972) grad_norm 2.3889 (2.3889) loss_scale 16384.0000 (16384.0000) mem 20033MB [2024-08-23 05:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 05:41:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 05:41:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 05:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 05:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 06:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 06:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 06:14:41 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 06:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 06:14:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-23 06:14:53 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 06:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 06:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 06:17:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 06:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 06:17:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-23 06:17:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 06:17:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 06:17:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 06:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 06:18:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][320/1251] eta 0:24:29 lr 0.000863 wd 0.0500 time 0.2252 (1.5786) data time 0.0008 (0.0969) model time 0.2244 (1.4817) loss 4.5877 (4.7914) grad_norm 2.2302 (1.8888) loss_scale 16384.0000 (16384.0000) mem 7374MB [2024-08-23 06:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][330/1251] eta 0:13:50 lr 0.000863 wd 0.0500 time 0.2179 (0.9016) data time 0.0008 (0.0490) model time 0.2171 (0.8526) loss 5.0088 (4.5933) grad_norm 1.5816 (2.2724) loss_scale 16384.0000 (16384.0000) mem 7374MB [2024-08-23 06:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][340/1251] eta 0:10:16 lr 0.000864 wd 0.0500 time 0.2243 (0.6767) data time 0.0009 (0.0330) model time 0.2234 (0.6437) loss 4.6810 (4.6236) grad_norm 1.4420 (2.3621) loss_scale 16384.0000 (16384.0000) mem 7374MB [2024-08-23 06:18:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 06:18:13 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 06:18:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 06:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 06:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 06:20:27 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 06:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 06:20:37 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-23 06:20:38 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 06:20:40 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 06:20:40 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 06:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 06:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][350/1251] eta 0:25:19 lr 0.000864 wd 0.0500 time 0.2427 (1.6860) data time 0.0008 (0.1193) model time 0.2419 (1.5667) loss 4.6629 (4.7955) grad_norm 1.7249 (2.0418) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][360/1251] eta 0:13:46 lr 0.000865 wd 0.0500 time 0.2364 (0.9281) data time 0.0013 (0.0576) model time 0.2351 (0.8705) loss 4.1381 (4.5341) grad_norm 1.8571 (2.1756) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][370/1251] eta 0:10:09 lr 0.000865 wd 0.0500 time 0.2562 (0.6920) data time 0.0008 (0.0381) model time 0.2554 (0.6539) loss 5.2888 
(4.5946) grad_norm 2.8037 (2.2216) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][380/1251] eta 0:08:22 lr 0.000865 wd 0.0500 time 0.2381 (0.5773) data time 0.0010 (0.0288) model time 0.2371 (0.5485) loss 4.3242 (4.5206) grad_norm 1.2824 (2.3496) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][390/1251] eta 0:07:18 lr 0.000866 wd 0.0500 time 0.2352 (0.5088) data time 0.0010 (0.0231) model time 0.2343 (0.4856) loss 4.5789 (4.4713) grad_norm 2.3313 (2.3085) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][400/1251] eta 0:06:34 lr 0.000866 wd 0.0500 time 0.2373 (0.4632) data time 0.0010 (0.0194) model time 0.2363 (0.4438) loss 3.2143 (4.4332) grad_norm 2.5386 (2.2749) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][410/1251] eta 0:06:02 lr 0.000867 wd 0.0500 time 0.2436 (0.4314) data time 0.0011 (0.0167) model time 0.2425 (0.4147) loss 4.5591 (4.4136) grad_norm 2.4446 (2.2218) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][420/1251] eta 0:05:38 lr 0.000867 wd 0.0500 time 0.2489 (0.4075) data time 0.0009 (0.0148) model time 0.2480 (0.3927) loss 4.3677 (4.3653) grad_norm 1.9127 (2.1953) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][430/1251] eta 0:05:19 lr 0.000867 wd 0.0500 time 0.2343 (0.3888) data time 0.0010 (0.0134) model time 0.2333 (0.3755) loss 4.1965 (4.3354) grad_norm 1.6603 (2.1575) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][440/1251] eta 0:05:03 
lr 0.000868 wd 0.0500 time 0.2336 (0.3739) data time 0.0012 (0.0121) model time 0.2324 (0.3618) loss 4.3315 (4.3550) grad_norm 2.3315 (2.1689) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][450/1251] eta 0:04:49 lr 0.000868 wd 0.0500 time 0.2306 (0.3618) data time 0.0009 (0.0111) model time 0.2297 (0.3507) loss 4.9279 (4.3679) grad_norm 1.8205 (2.1594) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][460/1251] eta 0:04:38 lr 0.000869 wd 0.0500 time 0.2396 (0.3516) data time 0.0010 (0.0103) model time 0.2385 (0.3413) loss 4.7512 (4.3649) grad_norm 2.1144 (2.1443) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][470/1251] eta 0:04:27 lr 0.000869 wd 0.0500 time 0.2427 (0.3429) data time 0.0010 (0.0095) model time 0.2417 (0.3334) loss 4.0705 (4.3460) grad_norm 1.7930 (2.1426) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][480/1251] eta 0:04:18 lr 0.000869 wd 0.0500 time 0.2418 (0.3355) data time 0.0009 (0.0089) model time 0.2408 (0.3266) loss 4.6872 (4.3336) grad_norm 2.2264 (2.2443) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][490/1251] eta 0:04:10 lr 0.000870 wd 0.0500 time 0.2377 (0.3295) data time 0.0009 (0.0084) model time 0.2368 (0.3211) loss 3.9607 (4.3220) grad_norm 2.0169 (2.2223) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][500/1251] eta 0:04:03 lr 0.000870 wd 0.0500 time 0.2347 (0.3238) data time 0.0009 (0.0079) model time 0.2338 (0.3159) loss 5.1454 (4.3172) grad_norm 1.3391 (2.2175) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-23 06:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][510/1251] eta 0:03:56 lr 0.000871 wd 0.0500 time 0.2419 (0.3190) data time 0.0010 (0.0075) model time 0.2409 (0.3115) loss 4.2767 (4.3197) grad_norm 2.2337 (2.2153) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][520/1251] eta 0:03:50 lr 0.000871 wd 0.0500 time 0.2438 (0.3147) data time 0.0012 (0.0072) model time 0.2426 (0.3075) loss 4.1344 (4.3053) grad_norm 2.2372 (2.2026) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][530/1251] eta 0:03:44 lr 0.000871 wd 0.0500 time 0.2477 (0.3108) data time 0.0007 (0.0069) model time 0.2470 (0.3039) loss 5.2907 (4.3078) grad_norm 2.9163 (2.2111) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][540/1251] eta 0:03:38 lr 0.000872 wd 0.0500 time 0.2455 (0.3072) data time 0.0011 (0.0066) model time 0.2444 (0.3006) loss 3.2035 (4.2894) grad_norm 2.6005 (2.2221) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][550/1251] eta 0:03:33 lr 0.000872 wd 0.0500 time 0.2399 (0.3045) data time 0.0009 (0.0063) model time 0.2390 (0.2982) loss 4.8740 (4.2813) grad_norm 1.6558 (2.2207) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][560/1251] eta 0:03:28 lr 0.000873 wd 0.0500 time 0.2450 (0.3018) data time 0.0008 (0.0061) model time 0.2442 (0.2957) loss 4.8077 (4.2770) grad_norm 2.1644 (2.2045) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][570/1251] eta 0:03:23 lr 0.000873 wd 0.0500 time 0.2405 (0.2992) data time 0.0009 (0.0059) model time 
0.2397 (0.2934) loss 3.0577 (4.2760) grad_norm 2.1252 (2.2125) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][580/1251] eta 0:03:19 lr 0.000873 wd 0.0500 time 0.2340 (0.2969) data time 0.0009 (0.0057) model time 0.2332 (0.2912) loss 2.8280 (4.2675) grad_norm 2.1331 (2.2169) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][590/1251] eta 0:03:14 lr 0.000874 wd 0.0500 time 0.2456 (0.2948) data time 0.0010 (0.0055) model time 0.2446 (0.2893) loss 3.9176 (4.2579) grad_norm 1.6183 (2.2127) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][600/1251] eta 0:03:10 lr 0.000874 wd 0.0500 time 0.2446 (0.2928) data time 0.0010 (0.0053) model time 0.2437 (0.2875) loss 4.6611 (4.2474) grad_norm 2.0411 (2.2063) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][610/1251] eta 0:03:06 lr 0.000875 wd 0.0500 time 0.2458 (0.2910) data time 0.0008 (0.0052) model time 0.2450 (0.2858) loss 3.1204 (4.2343) grad_norm 2.0023 (2.1994) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][620/1251] eta 0:03:02 lr 0.000875 wd 0.0500 time 0.2387 (0.2892) data time 0.0011 (0.0050) model time 0.2376 (0.2842) loss 4.4767 (4.2416) grad_norm 1.8874 (2.1906) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][630/1251] eta 0:02:58 lr 0.000875 wd 0.0500 time 0.2435 (0.2882) data time 0.0010 (0.0049) model time 0.2425 (0.2834) loss 4.2249 (4.2429) grad_norm 3.4161 (2.1916) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[17/300][640/1251] eta 0:02:55 lr 0.000876 wd 0.0500 time 0.2349 (0.2867) data time 0.0011 (0.0047) model time 0.2338 (0.2820) loss 4.6138 (4.2310) grad_norm 2.0180 (2.1973) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][650/1251] eta 0:02:51 lr 0.000876 wd 0.0500 time 0.2447 (0.2861) data time 0.0012 (0.0046) model time 0.2436 (0.2815) loss 4.6418 (4.2272) grad_norm 2.0310 (2.2141) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][660/1251] eta 0:02:48 lr 0.000877 wd 0.0500 time 0.2376 (0.2847) data time 0.0009 (0.0045) model time 0.2367 (0.2802) loss 4.8487 (4.2394) grad_norm 6.6524 (2.2453) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][670/1251] eta 0:02:44 lr 0.000877 wd 0.0500 time 0.2422 (0.2834) data time 0.0010 (0.0044) model time 0.2413 (0.2790) loss 3.1940 (4.2392) grad_norm 1.8067 (2.2585) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][680/1251] eta 0:02:41 lr 0.000877 wd 0.0500 time 0.2381 (0.2821) data time 0.0010 (0.0043) model time 0.2371 (0.2778) loss 4.1206 (4.2360) grad_norm 2.2189 (2.2520) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][690/1251] eta 0:02:37 lr 0.000878 wd 0.0500 time 0.2384 (0.2810) data time 0.0008 (0.0042) model time 0.2376 (0.2767) loss 4.6450 (4.2378) grad_norm 2.1121 (2.2578) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][700/1251] eta 0:02:34 lr 0.000878 wd 0.0500 time 0.2399 (0.2799) data time 0.0010 (0.0041) model time 0.2389 (0.2758) loss 4.0052 (4.2369) grad_norm 1.8195 (2.2586) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][710/1251] eta 0:02:30 lr 0.000878 wd 0.0500 time 0.2304 (0.2789) data time 0.0011 (0.0041) model time 0.2293 (0.2748) loss 4.7130 (4.2372) grad_norm 1.5375 (2.2469) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][720/1251] eta 0:02:27 lr 0.000879 wd 0.0500 time 0.2411 (0.2779) data time 0.0009 (0.0040) model time 0.2402 (0.2739) loss 4.7873 (4.2389) grad_norm 1.9095 (2.2415) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][730/1251] eta 0:02:24 lr 0.000879 wd 0.0500 time 0.2405 (0.2770) data time 0.0007 (0.0039) model time 0.2398 (0.2731) loss 4.4289 (4.2309) grad_norm 1.4037 (2.2356) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][740/1251] eta 0:02:21 lr 0.000880 wd 0.0500 time 0.2466 (0.2762) data time 0.0009 (0.0038) model time 0.2457 (0.2723) loss 4.9652 (4.2352) grad_norm 2.2847 (2.2245) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][750/1251] eta 0:02:17 lr 0.000880 wd 0.0500 time 0.2360 (0.2753) data time 0.0010 (0.0038) model time 0.2350 (0.2715) loss 4.3057 (4.2424) grad_norm 1.9536 (2.2177) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][760/1251] eta 0:02:14 lr 0.000880 wd 0.0500 time 0.2396 (0.2745) data time 0.0007 (0.0037) model time 0.2389 (0.2708) loss 3.0556 (4.2395) grad_norm 1.9821 (2.2117) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][770/1251] eta 0:02:11 lr 0.000881 wd 0.0500 time 0.2452 (0.2738) 
data time 0.0007 (0.0036) model time 0.2444 (0.2701) loss 4.7207 (4.2471) grad_norm 2.1991 (2.2126) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][780/1251] eta 0:02:08 lr 0.000881 wd 0.0500 time 0.2384 (0.2730) data time 0.0011 (0.0036) model time 0.2374 (0.2694) loss 4.0317 (4.2494) grad_norm 2.0519 (2.2105) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][790/1251] eta 0:02:05 lr 0.000882 wd 0.0500 time 0.2459 (0.2724) data time 0.0010 (0.0035) model time 0.2449 (0.2688) loss 4.0891 (4.2489) grad_norm 2.6286 (2.2137) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][800/1251] eta 0:02:02 lr 0.000882 wd 0.0500 time 0.2448 (0.2717) data time 0.0010 (0.0035) model time 0.2437 (0.2682) loss 3.8160 (4.2413) grad_norm 2.3180 (2.2090) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][810/1251] eta 0:01:59 lr 0.000882 wd 0.0500 time 0.2356 (0.2711) data time 0.0009 (0.0034) model time 0.2347 (0.2676) loss 3.0827 (4.2316) grad_norm 2.5362 (2.2161) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][820/1251] eta 0:01:56 lr 0.000883 wd 0.0500 time 0.2400 (0.2705) data time 0.0010 (0.0034) model time 0.2391 (0.2671) loss 4.8126 (4.2300) grad_norm 1.5602 (2.2112) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][830/1251] eta 0:01:53 lr 0.000883 wd 0.0500 time 0.2311 (0.2699) data time 0.0011 (0.0033) model time 0.2300 (0.2666) loss 4.1523 (4.2354) grad_norm 2.1018 (2.2124) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:22:59 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [17/300][840/1251] eta 0:01:50 lr 0.000884 wd 0.0500 time 0.2419 (0.2694) data time 0.0008 (0.0033) model time 0.2411 (0.2661) loss 4.6520 (4.2365) grad_norm 2.2895 (2.2199) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][850/1251] eta 0:01:47 lr 0.000884 wd 0.0500 time 0.2385 (0.2689) data time 0.0008 (0.0033) model time 0.2378 (0.2656) loss 5.0618 (4.2376) grad_norm 1.5874 (2.2194) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][860/1251] eta 0:01:44 lr 0.000884 wd 0.0500 time 0.2339 (0.2684) data time 0.0011 (0.0032) model time 0.2328 (0.2651) loss 2.8215 (4.2405) grad_norm 2.4473 (2.2273) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][870/1251] eta 0:01:42 lr 0.000885 wd 0.0500 time 0.2343 (0.2679) data time 0.0009 (0.0032) model time 0.2334 (0.2647) loss 5.0673 (4.2350) grad_norm 2.1394 (2.2259) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][880/1251] eta 0:01:39 lr 0.000885 wd 0.0500 time 0.2406 (0.2674) data time 0.0010 (0.0031) model time 0.2396 (0.2643) loss 4.5672 (4.2316) grad_norm 2.7503 (2.2264) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][890/1251] eta 0:01:36 lr 0.000886 wd 0.0500 time 0.2394 (0.2669) data time 0.0008 (0.0031) model time 0.2385 (0.2639) loss 4.9723 (4.2355) grad_norm 1.4273 (2.2199) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][900/1251] eta 0:01:33 lr 0.000886 wd 0.0500 time 0.2352 (0.2665) data time 0.0007 (0.0031) model time 0.2345 (0.2635) loss 5.2495 (4.2412) 
grad_norm 2.7688 (2.2175) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][910/1251] eta 0:01:30 lr 0.000886 wd 0.0500 time 0.2443 (0.2661) data time 0.0010 (0.0030) model time 0.2434 (0.2631) loss 4.0513 (4.2438) grad_norm 4.3021 (2.2254) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][920/1251] eta 0:01:27 lr 0.000887 wd 0.0500 time 0.2420 (0.2657) data time 0.0008 (0.0030) model time 0.2412 (0.2627) loss 5.1829 (4.2458) grad_norm 2.1934 (2.2248) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][930/1251] eta 0:01:25 lr 0.000887 wd 0.0500 time 0.2310 (0.2653) data time 0.0010 (0.0030) model time 0.2299 (0.2623) loss 3.5311 (4.2447) grad_norm 2.0193 (2.2271) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][940/1251] eta 0:01:22 lr 0.000888 wd 0.0500 time 0.2432 (0.2649) data time 0.0009 (0.0029) model time 0.2423 (0.2620) loss 3.8119 (4.2457) grad_norm 1.8150 (2.2259) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 06:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 06:23:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 06:23:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 06:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 06:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 06:38:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 06:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 06:38:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-23 06:38:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 06:38:49 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 06:38:49 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 06:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 06:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 06:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 06:41:49 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 06:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 06:41:58 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-23 06:42:00 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 06:42:01 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 06:42:01 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 06:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 06:42:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][950/1251] eta 0:09:48 lr 0.000888 wd 0.0500 time 0.2218 (1.9567) data time 0.0006 (0.1032) model time 0.2212 (1.8535) loss 4.6273 (4.7259) grad_norm 2.0089 (2.1547) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][960/1251] eta 0:04:49 lr 0.000888 wd 0.0500 time 0.2246 (0.9950) data time 0.0007 (0.0463) model time 0.2238 (0.9487) loss 4.8988 (4.5296) grad_norm 1.6576 (2.0499) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][970/1251] eta 0:03:22 lr 0.000889 wd 0.0500 time 0.2277 (0.7205) data time 0.0010 (0.0301) model time 0.2267 (0.6904) loss 4.9007 (4.5538) grad_norm 2.0571 (2.0578) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][980/1251] eta 0:02:39 lr 0.000889 wd 0.0500 time 0.2236 (0.5897) data time 0.0008 (0.0224) model time 0.2228 (0.5673) loss 3.8405 (4.4718) grad_norm 3.4776 (2.1954) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][990/1251] eta 0:02:14 lr 0.000890 wd 0.0500 time 0.2351 (0.5142) data time 0.0008 (0.0179) model time 0.2343 (0.4963) loss 4.8239 (4.4176) grad_norm 2.0914 (2.1354) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [17/300][1000/1251] eta 0:01:56 lr 0.000890 wd 0.0500 time 0.2218 (0.4640) data time 0.0006 (0.0150) model time 0.2213 (0.4490) loss 3.6826 (4.3941) grad_norm 2.2573 (2.1410) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1010/1251] eta 0:01:43 lr 0.000890 wd 0.0500 time 0.2315 (0.4287) data time 0.0008 (0.0129) model time 0.2307 (0.4158) loss 3.2197 (4.3603) grad_norm 2.4052 (2.1764) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1020/1251] eta 0:01:32 lr 0.000891 wd 0.0500 time 0.2255 (0.4026) data time 0.0006 (0.0114) model time 0.2249 (0.3912) loss 3.1115 (4.3176) grad_norm 2.3794 (2.2099) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1030/1251] eta 0:01:24 lr 0.000891 wd 0.0500 time 0.2302 (0.3821) data time 0.0007 (0.0102) model time 0.2294 (0.3719) loss 4.6156 (4.2925) grad_norm 2.9984 (2.2351) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1040/1251] eta 0:01:17 lr 0.000892 wd 0.0500 time 0.2213 (0.3658) data time 0.0007 (0.0092) model time 0.2206 (0.3566) loss 5.0574 (4.3036) grad_norm 2.1884 (2.2197) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1050/1251] eta 0:01:10 lr 0.000892 wd 0.0500 time 0.2229 (0.3524) data time 0.0006 (0.0084) model time 0.2223 (0.3440) loss 3.4286 (4.3123) grad_norm 4.1508 (2.2542) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1060/1251] eta 0:01:05 lr 0.000892 wd 0.0500 time 0.2258 (0.3418) data time 0.0009 (0.0078) model time 0.2250 (0.3339) loss 4.4331 (4.3165) grad_norm 1.9405 (2.3623) 
loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1070/1251] eta 0:01:00 lr 0.000893 wd 0.0500 time 0.2223 (0.3326) data time 0.0007 (0.0073) model time 0.2216 (0.3254) loss 4.4110 (4.2996) grad_norm 2.6566 (2.3666) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1080/1251] eta 0:00:55 lr 0.000893 wd 0.0500 time 0.2229 (0.3248) data time 0.0008 (0.0068) model time 0.2220 (0.3180) loss 4.1567 (4.2893) grad_norm 1.7172 (2.3629) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1090/1251] eta 0:00:51 lr 0.000894 wd 0.0500 time 0.2213 (0.3180) data time 0.0007 (0.0064) model time 0.2206 (0.3116) loss 4.6588 (4.2872) grad_norm 1.5522 (2.3562) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1100/1251] eta 0:00:47 lr 0.000894 wd 0.0500 time 0.2240 (0.3120) data time 0.0008 (0.0061) model time 0.2233 (0.3059) loss 3.8356 (4.2777) grad_norm 1.8509 (2.3395) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1110/1251] eta 0:00:43 lr 0.000894 wd 0.0500 time 0.2226 (0.3067) data time 0.0009 (0.0058) model time 0.2217 (0.3009) loss 4.6341 (4.2819) grad_norm 2.3701 (2.3211) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1120/1251] eta 0:00:39 lr 0.000895 wd 0.0500 time 0.2233 (0.3021) data time 0.0006 (0.0055) model time 0.2227 (0.2966) loss 3.5729 (4.2565) grad_norm 2.3695 (2.3188) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:43:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1130/1251] eta 0:00:36 lr 0.000895 wd 0.0500 time 
0.2234 (0.2979) data time 0.0007 (0.0053) model time 0.2226 (0.2926) loss 4.7940 (4.2516) grad_norm 2.7837 (2.3135) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1140/1251] eta 0:00:32 lr 0.000896 wd 0.0500 time 0.2361 (0.2943) data time 0.0008 (0.0050) model time 0.2353 (0.2892) loss 3.2367 (4.2309) grad_norm 1.9891 (2.2991) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1150/1251] eta 0:00:29 lr 0.000896 wd 0.0500 time 0.2214 (0.2909) data time 0.0009 (0.0048) model time 0.2206 (0.2861) loss 4.4048 (4.2184) grad_norm 1.5728 (2.2836) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1160/1251] eta 0:00:26 lr 0.000896 wd 0.0500 time 0.2293 (0.2879) data time 0.0007 (0.0047) model time 0.2286 (0.2832) loss 4.1331 (4.2098) grad_norm 2.8148 (2.2815) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1170/1251] eta 0:00:23 lr 0.000897 wd 0.0500 time 0.2287 (0.2852) data time 0.0009 (0.0045) model time 0.2278 (0.2807) loss 4.5786 (4.2189) grad_norm 2.0944 (2.2686) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 06:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 06:43:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 06:43:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 06:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 06:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 06:55:28 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 06:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 06:55:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-23 06:55:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 06:55:42 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 06:55:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 17) [2024-08-23 06:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 06:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1180/1251] eta 0:05:40 lr 0.000897 wd 0.0500 time 0.2321 (4.7926) data time 0.0009 (0.3477) model time 0.2312 (4.4449) loss 3.7229 (4.6866) grad_norm 2.0870 (2.0573) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1190/1251] eta 0:01:18 lr 0.000898 wd 0.0500 time 0.2342 (1.2885) data time 0.0010 (0.0811) model time 0.2332 (1.2074) loss 4.2802 (4.6155) grad_norm 1.7624 (1.9549) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1200/1251] eta 0:00:42 lr 0.000898 wd 0.0500 time 0.2421 (0.8332) data time 0.0009 (0.0463) model time 0.2412 (0.7869) loss 
4.9541 (4.5875) grad_norm 3.7954 (2.1164) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1210/1251] eta 0:00:26 lr 0.000898 wd 0.0500 time 0.2364 (0.6543) data time 0.0008 (0.0326) model time 0.2356 (0.6217) loss 5.5137 (4.6014) grad_norm 2.0339 (2.1776) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1220/1251] eta 0:00:17 lr 0.000899 wd 0.0500 time 0.2348 (0.5575) data time 0.0011 (0.0252) model time 0.2338 (0.5323) loss 4.1026 (4.4747) grad_norm 1.7845 (2.2282) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1230/1251] eta 0:00:10 lr 0.000899 wd 0.0500 time 0.2391 (0.4972) data time 0.0010 (0.0207) model time 0.2382 (0.4766) loss 4.0063 (4.4560) grad_norm 2.6919 (2.2421) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1240/1251] eta 0:00:05 lr 0.000900 wd 0.0500 time 0.2232 (0.4555) data time 0.0007 (0.0176) model time 0.2224 (0.4379) loss 3.9660 (4.4216) grad_norm 3.4770 (2.2139) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [17/300][1250/1251] eta 0:00:00 lr 0.000900 wd 0.0500 time 0.2274 (0.4243) data time 0.0007 (0.0153) model time 0.2267 (0.4090) loss 4.6686 (4.3818) grad_norm 2.0105 (2.2611) loss_scale 16384.0000 (16384.0000) mem 7378MB [2024-08-23 06:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 17 training takes 0:00:30 [2024-08-23 06:56:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 06:56:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 06:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.419 (0.419) Loss 0.8408 (0.8408) Acc@1 82.715 (82.715) Acc@5 95.312 (95.312) Mem 7378MB [2024-08-23 06:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.105) Loss 1.3203 (1.2492) Acc@1 71.484 (71.378) Acc@5 91.699 (91.850) Mem 7378MB [2024-08-23 06:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.092) Loss 1.8389 (1.2742) Acc@1 59.375 (70.671) Acc@5 82.715 (91.750) Mem 7378MB [2024-08-23 06:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.088) Loss 2.1699 (1.4657) Acc@1 53.809 (66.948) Acc@5 75.977 (88.609) Mem 7378MB [2024-08-23 06:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 2.0996 (1.5696) Acc@1 53.125 (65.034) Acc@5 79.395 (87.016) Mem 7378MB [2024-08-23 06:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 64.746 Acc@5 86.844 [2024-08-23 06:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 64.7% [2024-08-23 06:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 64.75% [2024-08-23 06:56:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-23 06:56:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-23 06:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.470 (0.470) Loss 1.2158 (1.2158) Acc@1 75.488 (75.488) Acc@5 91.699 (91.699) Mem 7378MB [2024-08-23 06:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.112) Loss 1.6172 (1.7082) Acc@1 64.160 (61.763) Acc@5 87.695 (85.574) Mem 7378MB [2024-08-23 06:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.095) Loss 2.3320 (1.7059) Acc@1 50.488 (61.719) Acc@5 76.074 (85.840) Mem 7378MB [2024-08-23 06:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.090) Loss 2.4336 (1.9018) Acc@1 51.074 (58.603) Acc@5 72.461 (82.627) Mem 7378MB [2024-08-23 06:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.084) Loss 2.6016 (2.0248) Acc@1 43.750 (56.369) Acc@5 71.777 (80.681) Mem 7378MB [2024-08-23 06:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 56.426 Acc@5 80.686 [2024-08-23 06:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 56.4% [2024-08-23 06:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 56.43% [2024-08-23 06:56:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-23 06:56:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-23 06:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][0/1251] eta 0:16:30 lr 0.000900 wd 0.0500 time 0.7916 (0.7916) data time 0.5037 (0.5037) model time 0.0000 (0.0000) loss 4.4223 (4.4223) grad_norm 2.1364 (2.1364) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][10/1251] eta 0:06:02 lr 0.000900 wd 0.0500 time 0.2574 (0.2920) data time 0.0010 (0.0469) model time 0.0000 (0.0000) loss 4.5958 (4.0961) grad_norm 1.5488 (2.1110) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][20/1251] eta 0:05:28 lr 0.000901 wd 0.0500 time 0.2395 (0.2667) data time 0.0011 (0.0261) model time 0.0000 (0.0000) loss 3.9965 (4.1155) grad_norm 1.7546 (2.0767) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][30/1251] eta 0:05:15 lr 0.000901 wd 0.0500 time 0.2527 (0.2581) data time 0.0010 (0.0181) model time 0.0000 (0.0000) loss 4.1294 (4.2745) grad_norm 1.6662 (2.0333) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][40/1251] eta 0:05:09 lr 0.000902 wd 0.0500 time 0.2553 (0.2555) data time 0.0011 (0.0141) model time 0.0000 (0.0000) loss 4.5730 (4.2516) grad_norm 3.1040 (2.1476) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][50/1251] eta 0:05:02 lr 0.000902 wd 0.0500 time 0.2417 (0.2519) data time 0.0010 (0.0115) model time 0.0000 (0.0000) loss 4.3125 (4.2604) grad_norm 2.5387 (2.1573) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][60/1251] eta 0:04:57 lr 0.000902 wd 0.0500 time 0.2398 (0.2499) data time 0.0011 (0.0098) model time 0.2387 
(0.2390) loss 4.4832 (4.2485) grad_norm 2.0271 (2.1319) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][70/1251] eta 0:04:53 lr 0.000903 wd 0.0500 time 0.2426 (0.2489) data time 0.0008 (0.0085) model time 0.2418 (0.2403) loss 3.9828 (4.2153) grad_norm 3.1963 (2.1530) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][80/1251] eta 0:04:50 lr 0.000903 wd 0.0500 time 0.2428 (0.2479) data time 0.0007 (0.0076) model time 0.2421 (0.2401) loss 4.0854 (4.2119) grad_norm 2.3411 (2.1360) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][90/1251] eta 0:04:47 lr 0.000904 wd 0.0500 time 0.2505 (0.2473) data time 0.0010 (0.0069) model time 0.2495 (0.2404) loss 4.5919 (4.2145) grad_norm 3.3539 (2.1650) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][100/1251] eta 0:04:43 lr 0.000904 wd 0.0500 time 0.2373 (0.2465) data time 0.0009 (0.0063) model time 0.2363 (0.2398) loss 2.9494 (4.1936) grad_norm 3.0528 (2.1565) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][110/1251] eta 0:04:40 lr 0.000904 wd 0.0500 time 0.2380 (0.2459) data time 0.0010 (0.0059) model time 0.2370 (0.2398) loss 3.7149 (4.1773) grad_norm 3.1853 (2.1859) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][120/1251] eta 0:04:37 lr 0.000905 wd 0.0500 time 0.2401 (0.2454) data time 0.0007 (0.0055) model time 0.2394 (0.2396) loss 4.3703 (4.1863) grad_norm 1.6948 (2.1930) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[18/300][130/1251] eta 0:04:34 lr 0.000905 wd 0.0500 time 0.2495 (0.2452) data time 0.0010 (0.0052) model time 0.2485 (0.2398) loss 4.4380 (4.1700) grad_norm 1.4162 (2.2264) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][140/1251] eta 0:04:32 lr 0.000906 wd 0.0500 time 0.2399 (0.2448) data time 0.0008 (0.0049) model time 0.2391 (0.2397) loss 4.3562 (4.1604) grad_norm 2.5502 (2.2260) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][150/1251] eta 0:04:29 lr 0.000906 wd 0.0500 time 0.2412 (0.2445) data time 0.0011 (0.0047) model time 0.2401 (0.2396) loss 4.6489 (4.1636) grad_norm 1.7892 (2.2356) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][160/1251] eta 0:04:26 lr 0.000906 wd 0.0500 time 0.2417 (0.2441) data time 0.0009 (0.0044) model time 0.2408 (0.2394) loss 3.0216 (4.1516) grad_norm 2.0667 (2.2503) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][170/1251] eta 0:04:23 lr 0.000907 wd 0.0500 time 0.2367 (0.2439) data time 0.0008 (0.0042) model time 0.2359 (0.2393) loss 3.1214 (4.1585) grad_norm 2.3403 (2.2468) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][180/1251] eta 0:04:20 lr 0.000907 wd 0.0500 time 0.2445 (0.2437) data time 0.0008 (0.0040) model time 0.2437 (0.2393) loss 3.0356 (4.1469) grad_norm 1.9979 (2.2323) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][190/1251] eta 0:04:18 lr 0.000908 wd 0.0500 time 0.2302 (0.2433) data time 0.0007 (0.0039) model time 0.2295 (0.2391) loss 4.5486 (4.1353) grad_norm 1.8768 (2.2399) loss_scale 
16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][200/1251] eta 0:04:15 lr 0.000908 wd 0.0500 time 0.2457 (0.2432) data time 0.0012 (0.0037) model time 0.2445 (0.2392) loss 4.2664 (4.1362) grad_norm 2.5575 (2.2296) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][210/1251] eta 0:04:13 lr 0.000908 wd 0.0500 time 0.2454 (0.2432) data time 0.0011 (0.0037) model time 0.2443 (0.2392) loss 3.6481 (4.1316) grad_norm 4.9551 (2.2341) loss_scale 16384.0000 (16384.0000) mem 7381MB [2024-08-23 06:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 06:57:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 06:57:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-23 07:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 07:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 07:06:15 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 07:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 07:06:27 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-23 07:06:29 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 07:06:30 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 07:06:30 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-23 07:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 07:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][220/1251] eta 1:00:18 lr 0.000909 wd 0.0500 time 0.2168 (3.5101) data time 0.0008 (0.4543) model time 0.2161 (3.0558) loss 5.1339 (4.6860) grad_norm 1.9073 (2.2564) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 07:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][230/1251] eta 0:19:45 lr 0.000909 wd 0.0500 time 0.2230 (1.1606) data time 0.0008 (0.1304) model time 0.2222 (1.0302) loss 4.5651 (4.5083) grad_norm 2.4916 (2.1625) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 07:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][240/1251] eta 0:14:07 lr 0.000910 wd 0.0500 time 0.2249 (0.8387) data time 0.0011 (0.0764) model time 0.2238 (0.7623) loss 4.2142 (4.4911) grad_norm 3.1771 (2.1000) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 07:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][250/1251] eta 0:10:59 lr 0.000910 wd 0.0500 time 0.2272 (0.6593) data time 0.0007 (0.0542) model time 0.2265 (0.6050) loss 3.5366 (4.5141) grad_norm 2.1239 (2.1898) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 07:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][260/1251] eta 0:09:16 lr 0.000910 wd 0.0500 time 0.2241 (0.5611) data time 0.0006 (0.0421) model time 0.2235 (0.5190) loss 4.0855 (4.4357) grad_norm 1.8406 (2.2654) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 07:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [18/300][270/1251] eta 0:08:09 lr 0.000911 wd 0.0500 time 0.2321 (0.4993) data time 0.0006 (0.0345) model time 0.2315 (0.4649) loss 4.9098 (4.4350) grad_norm 1.7071 (2.2568) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 07:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][280/1251] eta 0:07:24 lr 0.000911 wd 0.0500 time 0.2339 (0.4573) data time 0.0008 (0.0292) model time 0.2331 (0.4281) loss 4.3119 (4.3800) grad_norm 1.7143 (2.2082) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-23 07:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][290/1251] eta 0:06:48 lr 0.000912 wd 0.0500 time 0.2166 (0.4256) data time 0.0006 (0.0254) model time 0.2160 (0.4002) loss 4.6014 (4.3413) grad_norm 2.4180 (2.1960) loss_scale 32768.0000 (17269.6216) mem 7379MB [2024-08-23 07:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][300/1251] eta 0:06:21 lr 0.000912 wd 0.0500 time 0.2185 (0.4016) data time 0.0009 (0.0225) model time 0.2177 (0.3791) loss 4.5798 (4.3064) grad_norm 2.0672 (2.2008) loss_scale 32768.0000 (19114.6667) mem 7379MB [2024-08-23 07:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][310/1251] eta 0:06:00 lr 0.000912 wd 0.0500 time 0.2378 (0.3827) data time 0.0008 (0.0202) model time 0.2370 (0.3625) loss 4.1285 (4.2997) grad_norm 1.6257 (2.1916) loss_scale 32768.0000 (20567.1489) mem 7379MB [2024-08-23 07:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][320/1251] eta 0:05:42 lr 0.000913 wd 0.0500 time 0.2282 (0.3680) data time 0.0011 (0.0183) model time 0.2271 (0.3497) loss 4.0855 (4.3285) grad_norm 2.5460 (2.1675) loss_scale 32768.0000 (21740.3077) mem 7379MB [2024-08-23 07:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][330/1251] eta 0:05:27 lr 0.000913 wd 0.0500 time 0.2277 (0.3552) data time 0.0009 (0.0168) model time 0.2268 (0.3384) loss 4.3314 (4.3231) grad_norm 2.0110 (2.1980) loss_scale 
32768.0000 (22707.6491) mem 7379MB [2024-08-23 07:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][340/1251] eta 0:05:13 lr 0.000914 wd 0.0500 time 0.2092 (0.3442) data time 0.0007 (0.0155) model time 0.2084 (0.3287) loss 3.7588 (4.3226) grad_norm inf (inf) loss_scale 16384.0000 (23386.8387) mem 7379MB [2024-08-23 07:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][350/1251] eta 0:05:01 lr 0.000914 wd 0.0500 time 0.2345 (0.3349) data time 0.0009 (0.0144) model time 0.2336 (0.3204) loss 4.5391 (4.3060) grad_norm 1.8816 (inf) loss_scale 16384.0000 (22864.2388) mem 7379MB [2024-08-23 07:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][360/1251] eta 0:04:51 lr 0.000914 wd 0.0500 time 0.2193 (0.3273) data time 0.0008 (0.0135) model time 0.2185 (0.3138) loss 4.0252 (4.2877) grad_norm 3.0584 (inf) loss_scale 16384.0000 (22414.2222) mem 7379MB [2024-08-23 07:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][370/1251] eta 0:04:42 lr 0.000915 wd 0.0500 time 0.2205 (0.3207) data time 0.0007 (0.0127) model time 0.2198 (0.3081) loss 4.0627 (4.2750) grad_norm 1.7123 (inf) loss_scale 16384.0000 (22022.6494) mem 7379MB [2024-08-23 07:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][380/1251] eta 0:04:34 lr 0.000915 wd 0.0500 time 0.2149 (0.3147) data time 0.0006 (0.0119) model time 0.2143 (0.3027) loss 4.2767 (4.2757) grad_norm 1.8792 (inf) loss_scale 16384.0000 (21678.8293) mem 7379MB [2024-08-23 07:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][390/1251] eta 0:04:26 lr 0.000916 wd 0.0500 time 0.2143 (0.3093) data time 0.0008 (0.0113) model time 0.2135 (0.2980) loss 2.9078 (4.2592) grad_norm 1.7331 (inf) loss_scale 16384.0000 (21374.5287) mem 7379MB [2024-08-23 07:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][400/1251] eta 0:04:35 lr 0.000916 wd 0.0500 time 0.2418 (0.3233) data time 0.0009 
(0.0232) model time 0.2409 (0.3001) loss 4.0354 (4.2520) grad_norm 5.6845 (inf) loss_scale 16384.0000 (21103.3043) mem 7379MB [2024-08-23 07:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][410/1251] eta 0:04:27 lr 0.000916 wd 0.0500 time 0.2253 (0.3181) data time 0.0006 (0.0221) model time 0.2247 (0.2960) loss 4.3459 (4.2517) grad_norm 1.9843 (inf) loss_scale 16384.0000 (20860.0412) mem 7379MB [2024-08-23 07:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][420/1251] eta 0:04:20 lr 0.000917 wd 0.0500 time 0.2187 (0.3134) data time 0.0009 (0.0211) model time 0.2178 (0.2923) loss 4.6328 (4.2406) grad_norm 2.4195 (inf) loss_scale 16384.0000 (20640.6275) mem 7379MB [2024-08-23 07:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][430/1251] eta 0:04:13 lr 0.000917 wd 0.0500 time 0.2160 (0.3090) data time 0.0006 (0.0201) model time 0.2154 (0.2889) loss 3.9963 (4.2286) grad_norm 2.7243 (inf) loss_scale 16384.0000 (20441.7196) mem 7379MB [2024-08-23 07:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][440/1251] eta 0:04:20 lr 0.000918 wd 0.0500 time 0.2167 (0.3211) data time 0.0009 (0.0269) model time 0.2158 (0.2942) loss 4.2665 (4.2349) grad_norm 2.5188 (inf) loss_scale 16384.0000 (20260.5714) mem 7379MB [2024-08-23 07:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][450/1251] eta 0:04:13 lr 0.000918 wd 0.0500 time 0.2166 (0.3167) data time 0.0007 (0.0258) model time 0.2159 (0.2909) loss 3.6540 (4.2230) grad_norm 2.0009 (inf) loss_scale 16384.0000 (20094.9060) mem 7379MB [2024-08-23 07:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][460/1251] eta 0:04:07 lr 0.000918 wd 0.0500 time 0.2253 (0.3130) data time 0.0007 (0.0248) model time 0.2246 (0.2883) loss 2.9061 (4.2230) grad_norm 1.3734 (inf) loss_scale 16384.0000 (19942.8197) mem 7379MB [2024-08-23 07:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[18/300][470/1251] eta 0:04:01 lr 0.000919 wd 0.0500 time 0.2205 (0.3095) data time 0.0006 (0.0238) model time 0.2199 (0.2857) loss 2.9810 (4.2110) grad_norm 2.5205 (inf) loss_scale 16384.0000 (19802.7087) mem 7379MB [2024-08-23 07:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][480/1251] eta 0:03:56 lr 0.000919 wd 0.0500 time 0.2245 (0.3062) data time 0.0006 (0.0230) model time 0.2239 (0.2833) loss 4.2654 (4.2031) grad_norm 1.3562 (inf) loss_scale 16384.0000 (19673.2121) mem 7379MB [2024-08-23 07:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][490/1251] eta 0:03:50 lr 0.000920 wd 0.0500 time 0.2202 (0.3032) data time 0.0009 (0.0222) model time 0.2193 (0.2811) loss 4.3701 (4.1986) grad_norm 1.9439 (inf) loss_scale 16384.0000 (19553.1679) mem 7379MB [2024-08-23 07:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][500/1251] eta 0:03:45 lr 0.000920 wd 0.0500 time 0.2252 (0.3004) data time 0.0007 (0.0214) model time 0.2245 (0.2790) loss 3.1384 (4.1930) grad_norm 1.8168 (inf) loss_scale 16384.0000 (19441.5775) mem 7379MB [2024-08-23 07:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][510/1251] eta 0:03:41 lr 0.000920 wd 0.0500 time 0.2229 (0.2986) data time 0.0007 (0.0207) model time 0.2222 (0.2779) loss 3.6190 (4.1838) grad_norm 1.6187 (inf) loss_scale 16384.0000 (19337.5782) mem 7379MB [2024-08-23 07:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][520/1251] eta 0:03:36 lr 0.000921 wd 0.0500 time 0.2138 (0.2960) data time 0.0008 (0.0201) model time 0.2130 (0.2759) loss 4.5473 (4.1742) grad_norm 1.4846 (inf) loss_scale 16384.0000 (19240.4211) mem 7379MB [2024-08-23 07:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][530/1251] eta 0:03:33 lr 0.000921 wd 0.0500 time 0.2265 (0.2958) data time 0.0010 (0.0194) model time 0.2255 (0.2764) loss 4.3271 (4.1778) grad_norm 1.7634 (inf) loss_scale 16384.0000 (19149.4522) mem 
7379MB [2024-08-23 07:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][540/1251] eta 0:03:28 lr 0.000922 wd 0.0500 time 0.2244 (0.2936) data time 0.0006 (0.0189) model time 0.2238 (0.2747) loss 5.0154 (4.1900) grad_norm 2.3785 (inf) loss_scale 16384.0000 (19064.0988) mem 7379MB [2024-08-23 07:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][550/1251] eta 0:03:24 lr 0.000922 wd 0.0500 time 0.2265 (0.2915) data time 0.0007 (0.0183) model time 0.2258 (0.2732) loss 3.6894 (4.1888) grad_norm 1.6480 (inf) loss_scale 16384.0000 (18983.8563) mem 7379MB [2024-08-23 07:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][560/1251] eta 0:03:20 lr 0.000922 wd 0.0500 time 0.2301 (0.2896) data time 0.0009 (0.0178) model time 0.2293 (0.2718) loss 4.5263 (4.1899) grad_norm 1.8037 (inf) loss_scale 16384.0000 (18908.2791) mem 7379MB [2024-08-23 07:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][570/1251] eta 0:03:15 lr 0.000923 wd 0.0500 time 0.2291 (0.2878) data time 0.0009 (0.0174) model time 0.2282 (0.2704) loss 4.4125 (4.1943) grad_norm 1.7711 (inf) loss_scale 16384.0000 (18836.9718) mem 7379MB [2024-08-23 07:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][580/1251] eta 0:03:11 lr 0.000923 wd 0.0500 time 0.2247 (0.2861) data time 0.0007 (0.0169) model time 0.2240 (0.2691) loss 3.2910 (4.1892) grad_norm 2.4854 (inf) loss_scale 16384.0000 (18769.5824) mem 7379MB [2024-08-23 07:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 07:08:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 07:08:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 07:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 09:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 09:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 09:14:03 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 09:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 09:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 09:32:07 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 09:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 09:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 09:35:59 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 09:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 09:36:20 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-23 09:36:22 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 09:36:23 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 09:36:23 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-23 09:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 09:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][590/1251] eta 0:18:53 lr 0.000924 wd 0.0500 time 0.2453 (1.7153) data time 0.0008 (0.0909) model time 0.2445 (1.6245) loss 4.9710 (4.7734) grad_norm 2.7818 (2.3059) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 09:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][600/1251] eta 0:10:11 lr 0.000924 wd 0.0500 time 0.2308 (0.9386) data time 0.0011 (0.0436) model time 0.2297 (0.8950) loss 4.7294 (4.5439) grad_norm 1.4467 (2.1678) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-23 09:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 09:36:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 09:36:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-23 10:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 10:17:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 10:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 10:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 10:34:55 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 11:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 11:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 11:00:27 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 11:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 11:00:33 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-23 11:00:34 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 11:00:36 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 11:00:36 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-23 11:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 11:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-23 11:00:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-23 11:00:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-23 17:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 17:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 17:27:17 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 22:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 22:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 22:49:27 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-23 22:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 22:49:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-23 22:49:42 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 22:49:43 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 22:49:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-23 22:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-23 23:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-23 23:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-23 23:47:28 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-23 23:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-23 23:47:37 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-23 23:47:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-23 23:47:40 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-23 23:47:40 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-23 23:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-24 00:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-24 00:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-24 02:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-24 02:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-24 02:30:53 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-24 02:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-24 02:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-24 02:34:56 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-24 03:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-24 03:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-24 03:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-24 03:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-24 03:35:32 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-24 03:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-24 03:35:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-24 03:35:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-24 03:35:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-24 03:35:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-24 03:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-24 03:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][610/1251] eta 0:32:49 lr 0.000924 wd 0.0500 time 0.2260 (3.0733) data time 0.0006 (0.1566) model time 0.2254 (2.9167) loss 4.9291 (4.7854) grad_norm 2.5595 (2.4728) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-24 03:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][620/1251] eta 0:10:56 lr 0.000925 wd 0.0500 time 0.2225 (1.0404) data time 0.0008 (0.0455) model time 0.2217 (0.9949) loss 4.6964 (4.5229) grad_norm 2.0144 (2.4747) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-24 03:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][630/1251] eta 0:07:15 lr 0.000925 wd 0.0500 time 0.2303 (0.7005) data time 0.0010 (0.0270) model time 0.2293 (0.6735) loss 4.0067 (4.5256) grad_norm 3.4103 (2.3498) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-24 03:36:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][640/1251] eta 0:05:42 lr 0.000926 wd 0.0500 time 0.2346 (0.5611) data time 0.0006 (0.0195) model time 0.2340 (0.5415) loss 3.6447 (4.5209) grad_norm 2.9691 (2.3012) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-24 03:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][650/1251] eta 0:04:51 lr 0.000926 wd 0.0500 time 0.2275 (0.4848) data time 0.0006 (0.0153) model time 0.2269 (0.4695) loss 4.5490 (4.4546) grad_norm 1.6833 (2.2306) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-24 03:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [18/300][660/1251] eta 0:04:17 lr 0.000926 wd 0.0500 time 0.2222 (0.4365) data time 0.0006 (0.0127) model time 0.2215 (0.4238) loss 4.9888 (4.4440) grad_norm 1.6791 (2.1758) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-24 03:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][670/1251] eta 0:03:54 lr 0.000927 wd 0.0500 time 0.2295 (0.4037) data time 0.0006 (0.0108) model time 0.2289 (0.3929) loss 4.3081 (4.3936) grad_norm 3.2924 (2.1913) loss_scale 16384.0000 (16384.0000) mem 7377MB [2024-08-24 03:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][680/1251] eta 0:03:36 lr 0.000927 wd 0.0500 time 0.2218 (0.3795) data time 0.0007 (0.0095) model time 0.2210 (0.3700) loss 4.6145 (4.3499) grad_norm 1.8262 (inf) loss_scale 8192.0000 (15941.1892) mem 7377MB [2024-08-24 03:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][690/1251] eta 0:03:22 lr 0.000928 wd 0.0500 time 0.2293 (0.3610) data time 0.0009 (0.0085) model time 0.2284 (0.3524) loss 4.4708 (4.3164) grad_norm 2.4011 (inf) loss_scale 8192.0000 (15018.6667) mem 7377MB [2024-08-24 03:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][700/1251] eta 0:03:10 lr 0.000928 wd 0.0500 time 0.2252 (0.3465) data time 0.0010 (0.0077) model time 0.2242 (0.3387) loss 4.2039 (4.3039) grad_norm 1.8939 (inf) loss_scale 8192.0000 (14292.4255) mem 7377MB [2024-08-24 03:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][710/1251] eta 0:03:01 lr 0.000928 wd 0.0500 time 0.2294 (0.3348) data time 0.0008 (0.0071) model time 0.2286 (0.3277) loss 4.0228 (4.3328) grad_norm 3.4779 (inf) loss_scale 8192.0000 (13705.8462) mem 7377MB [2024-08-24 03:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][720/1251] eta 0:02:52 lr 0.000929 wd 0.0500 time 0.2231 (0.3252) data time 0.0013 (0.0065) model time 0.2219 (0.3187) loss 4.9135 (4.3253) grad_norm 2.1001 (inf) loss_scale 8192.0000 
(13222.1754) mem 7377MB [2024-08-24 03:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][730/1251] eta 0:02:45 lr 0.000929 wd 0.0500 time 0.2169 (0.3170) data time 0.0010 (0.0061) model time 0.2159 (0.3109) loss 3.9456 (4.3191) grad_norm 4.1475 (inf) loss_scale 8192.0000 (12816.5161) mem 7377MB [2024-08-24 03:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][740/1251] eta 0:02:38 lr 0.000930 wd 0.0500 time 0.2266 (0.3100) data time 0.0008 (0.0057) model time 0.2258 (0.3043) loss 4.8460 (4.3167) grad_norm 3.7521 (inf) loss_scale 8192.0000 (12471.4030) mem 7377MB [2024-08-24 03:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][750/1251] eta 0:02:32 lr 0.000930 wd 0.0500 time 0.2204 (0.3042) data time 0.0007 (0.0054) model time 0.2197 (0.2988) loss 3.8083 (4.2890) grad_norm 2.1899 (inf) loss_scale 8192.0000 (12174.2222) mem 7377MB [2024-08-24 03:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][760/1251] eta 0:02:26 lr 0.000930 wd 0.0500 time 0.2230 (0.2990) data time 0.0007 (0.0051) model time 0.2223 (0.2939) loss 4.0475 (4.2763) grad_norm 2.0704 (inf) loss_scale 8192.0000 (11915.6364) mem 7377MB [2024-08-24 03:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][770/1251] eta 0:02:21 lr 0.000931 wd 0.0500 time 0.2339 (0.2946) data time 0.0007 (0.0048) model time 0.2332 (0.2898) loss 4.1366 (4.2789) grad_norm 2.9512 (inf) loss_scale 8192.0000 (11688.5854) mem 7377MB [2024-08-24 03:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][780/1251] eta 0:02:16 lr 0.000931 wd 0.0500 time 0.2275 (0.2905) data time 0.0010 (0.0046) model time 0.2265 (0.2859) loss 2.9757 (4.2640) grad_norm 2.4101 (inf) loss_scale 8192.0000 (11487.6322) mem 7377MB [2024-08-24 03:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][790/1251] eta 0:02:12 lr 0.000932 wd 0.0500 time 0.2254 (0.2869) data time 0.0008 (0.0044) model time 
0.2246 (0.2825) loss 3.8070 (4.2531) grad_norm 2.1717 (inf) loss_scale 8192.0000 (11308.5217) mem 7377MB [2024-08-24 03:36:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][800/1251] eta 0:02:07 lr 0.000932 wd 0.0500 time 0.2218 (0.2837) data time 0.0006 (0.0042) model time 0.2212 (0.2795) loss 4.2314 (4.2484) grad_norm 1.6729 (inf) loss_scale 8192.0000 (11147.8763) mem 7377MB [2024-08-24 03:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][810/1251] eta 0:02:03 lr 0.000932 wd 0.0500 time 0.2269 (0.2808) data time 0.0009 (0.0041) model time 0.2259 (0.2768) loss 4.0689 (4.2283) grad_norm 1.9467 (inf) loss_scale 8192.0000 (11002.9804) mem 7377MB [2024-08-24 03:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][820/1251] eta 0:01:59 lr 0.000933 wd 0.0500 time 0.2197 (0.2783) data time 0.0008 (0.0039) model time 0.2189 (0.2743) loss 4.0967 (4.2177) grad_norm 2.9736 (inf) loss_scale 8192.0000 (10871.6262) mem 7377MB [2024-08-24 03:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][830/1251] eta 0:01:56 lr 0.000933 wd 0.0500 time 0.2273 (0.2759) data time 0.0010 (0.0038) model time 0.2263 (0.2721) loss 4.6247 (4.2183) grad_norm 2.1433 (inf) loss_scale 8192.0000 (10752.0000) mem 7377MB [2024-08-24 03:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][840/1251] eta 0:01:52 lr 0.000934 wd 0.0500 time 0.2264 (0.2736) data time 0.0007 (0.0037) model time 0.2256 (0.2700) loss 3.4155 (4.2070) grad_norm 1.7154 (inf) loss_scale 8192.0000 (10642.5983) mem 7377MB [2024-08-24 03:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][850/1251] eta 0:01:48 lr 0.000934 wd 0.0500 time 0.2240 (0.2716) data time 0.0007 (0.0036) model time 0.2233 (0.2680) loss 3.3126 (4.2124) grad_norm 1.7049 (inf) loss_scale 8192.0000 (10542.1639) mem 7377MB [2024-08-24 03:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][860/1251] eta 0:01:45 
lr 0.000934 wd 0.0500 time 0.2193 (0.2697) data time 0.0008 (0.0035) model time 0.2185 (0.2662) loss 2.9862 (4.2009) grad_norm 2.2203 (inf) loss_scale 8192.0000 (10449.6378) mem 7377MB [2024-08-24 03:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][870/1251] eta 0:01:42 lr 0.000935 wd 0.0500 time 0.2183 (0.2679) data time 0.0008 (0.0034) model time 0.2175 (0.2645) loss 4.0407 (4.1888) grad_norm 1.8363 (inf) loss_scale 8192.0000 (10364.1212) mem 7377MB [2024-08-24 03:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][880/1251] eta 0:01:38 lr 0.000935 wd 0.0500 time 0.2217 (0.2663) data time 0.0010 (0.0033) model time 0.2207 (0.2630) loss 4.5545 (4.1873) grad_norm 1.4389 (inf) loss_scale 8192.0000 (10284.8467) mem 7377MB [2024-08-24 03:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-24 03:36:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-24 03:37:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-24 03:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-24 03:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-24 03:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-24 03:45:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-24 03:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-24 03:46:07 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-24 03:46:09 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-24 03:46:10 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-24 03:46:10 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-24 03:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-24 03:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][890/1251] eta 0:10:52 lr 0.000936 wd 0.0500 time 0.2255 (1.8076) data time 0.0011 (0.1135) model time 0.2244 (1.6941) loss 4.9865 (4.8021) grad_norm 3.4974 (1.9170) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][900/1251] eta 0:05:08 lr 0.000936 wd 0.0500 time 0.2265 (0.8776) data time 0.0010 (0.0473) model time 0.2255 (0.8302) loss 4.2817 (4.4931) grad_norm 1.8706 (2.1579) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][910/1251] eta 0:03:37 lr 0.000936 wd 0.0500 time 0.2258 (0.6366) data time 0.0009 (0.0302) model time 0.2249 (0.6064) loss 5.0501 (4.5322) grad_norm 1.8739 (2.1274) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][920/1251] eta 0:02:54 lr 0.000937 wd 0.0500 time 0.2231 (0.5259) data time 0.0007 (0.0224) model time 0.2223 (0.5036) loss 4.1364 (4.4764) grad_norm 1.8384 (2.1293) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][930/1251] eta 0:02:28 lr 0.000937 wd 0.0500 time 0.2295 (0.4624) data time 0.0007 (0.0178) model time 0.2288 (0.4445) loss 4.7401 (4.4373) grad_norm 3.1326 (2.2523) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][940/1251] eta 0:02:10 lr 0.000938 wd 0.0500 time 0.2209 (0.4207) data time 0.0011 (0.0149) model time 0.2199 (0.4058) loss 4.2351 (4.4242) grad_norm 1.5957 (2.2217) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][950/1251] eta 0:01:57 lr 0.000938 wd 0.0500 time 0.2287 (0.3919) data time 0.0009 (0.0128) model time 0.2278 (0.3791) loss 4.3842 (4.3668) grad_norm 2.0227 (2.1830) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][960/1251] eta 0:01:47 lr 0.000938 wd 0.0500 time 0.2183 (0.3702) data time 0.0009 (0.0113) model time 0.2174 (0.3589) loss 4.2445 (4.3202) grad_norm 2.8770 (2.1888) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][970/1251] eta 0:01:39 lr 0.000939 wd 0.0500 time 0.2224 (0.3536) data time 0.0013 (0.0101) model time 0.2211 (0.3435) loss 4.2443 (4.2897) grad_norm 3.1776 (2.2105) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][980/1251] eta 0:01:32 lr 0.000939 wd 0.0500 time 0.2196 (0.3404) data time 0.0012 (0.0092) model time 0.2184 (0.3312) loss 4.7773 (4.3001) grad_norm 3.2874 (2.2279) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][990/1251] eta 0:01:26 lr 0.000940 wd 0.0500 time 0.2258 (0.3296) data time 0.0007 (0.0084) model time 0.2251 (0.3212) loss 4.5486 
(4.3282) grad_norm 3.2087 (2.2788) loss_scale 8192.0000 (8192.0000) mem 7377MB [2024-08-24 03:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-24 03:46:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-24 03:46:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-25 14:25:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-25 14:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-25 14:25:24 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-25 14:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-25 14:25:34 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-25 14:25:35 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-25 14:25:36 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-25 14:25:36 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 18) [2024-08-25 14:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-25 14:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1000/1251] eta 0:08:01 lr 0.000940 wd 0.0500 time 0.2378 (1.9171) data time 0.0008 (0.0914) model time 0.2370 (1.8256) loss 4.4243 (4.7118) grad_norm 2.5037 (2.2035) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1010/1251] eta 0:03:57 lr 0.000940 wd 0.0500 time 0.2427 (0.9863) data time 0.0008 (0.0412) model time 0.2419 (0.9451) loss 5.0086 (4.4743) grad_norm 3.4331 (2.3631) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1020/1251] eta 0:02:46 lr 0.000941 wd 0.0500 time 0.2504 (0.7208) data time 0.0010 (0.0269) model time 0.2493 (0.6939) loss 4.8775 (4.5130) grad_norm 2.4120 (2.2634) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1030/1251] eta 0:02:11 lr 0.000941 wd 0.0500 time 0.2462 (0.5949) data time 0.0012 (0.0201) model time 0.2449 (0.5748) loss 4.6497 (4.4530) grad_norm 1.5397 (2.3006) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1040/1251] eta 0:01:50 lr 0.000942 wd 0.0500 time 0.2455 (0.5217) data time 0.0007 (0.0161) model time 0.2448 (0.5056) loss 4.6262 (4.4149) grad_norm 1.8009 (2.2311) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [18/300][1050/1251] eta 0:01:35 lr 0.000942 wd 0.0500 time 0.2417 (0.4737) data time 0.0008 (0.0135) model time 0.2409 (0.4601) loss 3.9042 (4.3949) grad_norm 1.7364 (2.1898) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1060/1251] eta 0:01:23 lr 0.000942 wd 0.0500 time 0.2333 (0.4398) data time 0.0007 (0.0117) model time 0.2327 (0.4281) loss 2.9166 (4.3431) grad_norm 2.4222 (2.2108) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1070/1251] eta 0:01:15 lr 0.000943 wd 0.0500 time 0.2399 (0.4146) data time 0.0009 (0.0103) model time 0.2390 (0.4043) loss 3.1959 (4.2946) grad_norm 2.1538 (2.1752) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1080/1251] eta 0:01:07 lr 0.000943 wd 0.0500 time 0.2422 (0.3952) data time 0.0012 (0.0093) model time 0.2409 (0.3860) loss 4.7699 (4.2698) grad_norm 2.1586 (2.1662) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1090/1251] eta 0:01:01 lr 0.000944 wd 0.0500 time 0.2394 (0.3798) data time 0.0007 (0.0084) model time 0.2387 (0.3713) loss 5.0516 (4.2920) grad_norm 2.0952 (2.1665) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1100/1251] eta 0:00:55 lr 0.000944 wd 0.0500 time 0.2387 (0.3672) data time 0.0009 (0.0077) model time 0.2377 (0.3595) loss 3.3462 (4.2947) grad_norm 1.6950 (2.1531) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1110/1251] eta 0:00:50 lr 0.000944 wd 0.0500 time 0.2421 (0.3567) data time 0.0014 (0.0072) model time 0.2408 (0.3496) loss 4.2059 (4.3053) grad_norm 1.8535 (2.1845) loss_scale 
8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1120/1251] eta 0:00:45 lr 0.000945 wd 0.0500 time 0.2469 (0.3481) data time 0.0009 (0.0067) model time 0.2460 (0.3414) loss 4.5285 (4.2929) grad_norm 4.5533 (2.1792) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1130/1251] eta 0:00:41 lr 0.000945 wd 0.0500 time 0.2426 (0.3406) data time 0.0010 (0.0063) model time 0.2416 (0.3343) loss 4.0053 (4.2795) grad_norm 2.0359 (2.1855) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1140/1251] eta 0:00:37 lr 0.000946 wd 0.0500 time 0.2436 (0.3342) data time 0.0009 (0.0059) model time 0.2426 (0.3282) loss 4.3023 (4.2736) grad_norm 1.7110 (2.1813) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1150/1251] eta 0:00:33 lr 0.000946 wd 0.0500 time 0.2442 (0.3285) data time 0.0008 (0.0056) model time 0.2434 (0.3229) loss 3.6643 (4.2702) grad_norm 4.6381 (2.1934) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1160/1251] eta 0:00:29 lr 0.000946 wd 0.0500 time 0.2454 (0.3235) data time 0.0011 (0.0054) model time 0.2443 (0.3181) loss 4.6657 (4.2779) grad_norm 2.3680 (2.2084) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1170/1251] eta 0:00:25 lr 0.000947 wd 0.0500 time 0.2457 (0.3191) data time 0.0008 (0.0051) model time 0.2449 (0.3139) loss 3.0671 (4.2545) grad_norm 1.7917 (2.1882) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1180/1251] eta 0:00:22 lr 0.000947 wd 0.0500 time 0.2342 (0.3153) data time 
0.0009 (0.0049) model time 0.2333 (0.3104) loss 4.8826 (4.2502) grad_norm 3.6102 (2.1994) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1190/1251] eta 0:00:19 lr 0.000948 wd 0.0500 time 0.2384 (0.3119) data time 0.0011 (0.0048) model time 0.2374 (0.3071) loss 3.2050 (4.2398) grad_norm 2.6338 (2.2023) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1200/1251] eta 0:00:15 lr 0.000948 wd 0.0500 time 0.2531 (0.3090) data time 0.0009 (0.0047) model time 0.2522 (0.3043) loss 4.5730 (4.2254) grad_norm 1.5699 (2.1822) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1210/1251] eta 0:00:12 lr 0.000948 wd 0.0500 time 0.2362 (0.3066) data time 0.0011 (0.0048) model time 0.2351 (0.3018) loss 3.9899 (4.2162) grad_norm 1.9872 (2.2055) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1220/1251] eta 0:00:09 lr 0.000949 wd 0.0500 time 0.2533 (0.3041) data time 0.0010 (0.0047) model time 0.2523 (0.2994) loss 4.3087 (4.2205) grad_norm 1.4105 (2.2088) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1230/1251] eta 0:00:06 lr 0.000949 wd 0.0500 time 0.2470 (0.3015) data time 0.0008 (0.0045) model time 0.2462 (0.2970) loss 4.7248 (4.2129) grad_norm 1.5424 (2.2063) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [18/300][1240/1251] eta 0:00:03 lr 0.000950 wd 0.0500 time 0.2275 (0.2990) data time 0.0007 (0.0044) model time 0.2268 (0.2946) loss 3.4080 (4.2023) grad_norm 1.7007 (2.2115) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [18/300][1250/1251] eta 0:00:00 lr 0.000950 wd 0.0500 time 0.2251 (0.2962) data time 0.0008 (0.0042) model time 0.2244 (0.2920) loss 4.0940 (4.1923) grad_norm 1.9351 (2.2019) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 14:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 18 training takes 0:01:16 [2024-08-25 14:26:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-25 14:26:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-25 14:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.411 (0.411) Loss 0.8794 (0.8794) Acc@1 81.152 (81.152) Acc@5 96.094 (96.094) Mem 7379MB [2024-08-25 14:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.109) Loss 1.2861 (1.2288) Acc@1 71.484 (71.902) Acc@5 91.602 (92.214) Mem 7379MB [2024-08-25 14:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.094) Loss 1.7725 (1.2436) Acc@1 61.621 (71.298) Acc@5 83.887 (92.001) Mem 7379MB [2024-08-25 14:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.090) Loss 2.1016 (1.4286) Acc@1 54.883 (67.814) Acc@5 78.516 (89.248) Mem 7379MB [2024-08-25 14:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 2.0293 (1.5389) Acc@1 56.250 (65.787) Acc@5 81.543 (87.586) Mem 7379MB [2024-08-25 14:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 65.506 Acc@5 87.384 [2024-08-25 14:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 65.5% [2024-08-25 14:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 65.51% [2024-08-25 14:27:06 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-25 14:27:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-25 14:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.506 (0.506) Loss 1.0781 (1.0781) Acc@1 77.832 (77.832) Acc@5 92.871 (92.871) Mem 7379MB [2024-08-25 14:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.116) Loss 1.4756 (1.5481) Acc@1 67.676 (64.409) Acc@5 89.258 (87.260) Mem 7379MB [2024-08-25 14:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.099) Loss 2.1504 (1.5490) Acc@1 53.223 (64.309) Acc@5 77.930 (87.440) Mem 7379MB [2024-08-25 14:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.093) Loss 2.2871 (1.7428) Acc@1 52.832 (61.029) Acc@5 74.121 (84.362) Mem 7379MB [2024-08-25 14:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 2.4434 (1.8642) Acc@1 45.996 (58.803) Acc@5 74.414 (82.474) Mem 7379MB [2024-08-25 14:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 58.786 Acc@5 82.492 [2024-08-25 14:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 58.8% [2024-08-25 14:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 58.79% [2024-08-25 14:27:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-25 14:27:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-25 14:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][0/1251] eta 0:22:39 lr 0.000950 wd 0.0500 time 1.0868 (1.0868) data time 0.3637 (0.3637) model time 0.0000 (0.0000) loss 4.7845 (4.7845) grad_norm 1.5617 (1.5617) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][10/1251] eta 0:06:45 lr 0.000950 wd 0.0500 time 0.2351 (0.3269) data time 0.0009 (0.0347) model time 0.0000 (0.0000) loss 2.9631 (3.8473) grad_norm 1.7925 (2.1249) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][20/1251] eta 0:05:53 lr 0.000951 wd 0.0500 time 0.2439 (0.2870) data time 0.0010 (0.0187) model time 0.0000 (0.0000) loss 4.5047 (4.1330) grad_norm 1.9892 (2.1928) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][30/1251] eta 0:05:45 lr 0.000951 wd 0.0500 time 0.5344 (0.2834) data time 0.0012 (0.0131) model time 0.0000 (0.0000) loss 4.0781 (4.1135) grad_norm 2.9704 (2.0939) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][40/1251] eta 0:05:31 lr 0.000952 wd 0.0500 time 0.2442 (0.2739) data time 0.0011 (0.0101) model time 0.0000 (0.0000) loss 4.5565 (4.0390) grad_norm 1.7337 (2.1506) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][50/1251] eta 0:05:22 lr 0.000952 wd 0.0500 time 0.2467 (0.2681) data time 0.0010 (0.0084) model time 0.0000 (0.0000) loss 4.4638 (4.0296) grad_norm 2.9591 (2.1912) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][60/1251] eta 0:05:15 lr 0.000952 wd 0.0500 time 0.2409 (0.2649) data time 0.0009 (0.0072) model time 0.2399 (0.2474) loss 
4.9422 (4.1025) grad_norm 2.1115 (2.1660) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][70/1251] eta 0:05:10 lr 0.000953 wd 0.0500 time 0.2446 (0.2631) data time 0.0009 (0.0064) model time 0.2436 (0.2489) loss 3.2134 (4.1199) grad_norm 1.5278 (2.1085) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][80/1251] eta 0:05:05 lr 0.000953 wd 0.0500 time 0.2491 (0.2612) data time 0.0010 (0.0057) model time 0.2481 (0.2484) loss 3.6939 (4.1192) grad_norm 3.0218 (2.1455) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][90/1251] eta 0:05:01 lr 0.000954 wd 0.0500 time 0.2486 (0.2596) data time 0.0008 (0.0052) model time 0.2478 (0.2475) loss 4.5792 (4.1310) grad_norm 1.8651 (2.1715) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][100/1251] eta 0:04:56 lr 0.000954 wd 0.0500 time 0.2467 (0.2578) data time 0.0007 (0.0048) model time 0.2460 (0.2462) loss 3.6216 (4.1341) grad_norm 1.5723 (2.1749) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][110/1251] eta 0:04:53 lr 0.000954 wd 0.0500 time 0.2422 (0.2568) data time 0.0012 (0.0044) model time 0.2410 (0.2461) loss 4.3752 (4.1400) grad_norm 2.3641 (2.1695) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][120/1251] eta 0:04:49 lr 0.000955 wd 0.0500 time 0.2421 (0.2557) data time 0.0010 (0.0042) model time 0.2411 (0.2456) loss 4.6344 (4.1500) grad_norm 2.3559 (2.1688) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][130/1251] eta 0:04:45 lr 
0.000955 wd 0.0500 time 0.2458 (0.2551) data time 0.0009 (0.0039) model time 0.2448 (0.2456) loss 4.0701 (4.1285) grad_norm 2.8800 (2.1635) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][140/1251] eta 0:04:42 lr 0.000956 wd 0.0500 time 0.2447 (0.2545) data time 0.0009 (0.0037) model time 0.2438 (0.2457) loss 4.7136 (4.1432) grad_norm 2.0859 (2.1727) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][150/1251] eta 0:04:39 lr 0.000956 wd 0.0500 time 0.2434 (0.2538) data time 0.0009 (0.0036) model time 0.2425 (0.2455) loss 4.1458 (4.1667) grad_norm 1.8349 (2.1618) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][160/1251] eta 0:04:36 lr 0.000956 wd 0.0500 time 0.2559 (0.2535) data time 0.0008 (0.0034) model time 0.2552 (0.2456) loss 3.3759 (4.1660) grad_norm 2.5918 (2.1972) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][170/1251] eta 0:04:33 lr 0.000957 wd 0.0500 time 0.2521 (0.2531) data time 0.0008 (0.0033) model time 0.2513 (0.2456) loss 4.2710 (4.1835) grad_norm 2.0534 (2.1783) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][180/1251] eta 0:04:30 lr 0.000957 wd 0.0500 time 0.2467 (0.2529) data time 0.0011 (0.0031) model time 0.2456 (0.2458) loss 4.1850 (4.1964) grad_norm 1.7231 (2.1922) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][190/1251] eta 0:04:27 lr 0.000958 wd 0.0500 time 0.2493 (0.2526) data time 0.0010 (0.0030) model time 0.2483 (0.2458) loss 4.1649 (4.1933) grad_norm 2.0299 (2.1763) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:28:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][200/1251] eta 0:04:25 lr 0.000958 wd 0.0500 time 0.2462 (0.2522) data time 0.0011 (0.0029) model time 0.2452 (0.2458) loss 3.8739 (4.1786) grad_norm 1.9858 (2.1697) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-25 14:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-25 14:28:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-25 14:28:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-25 14:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-25 14:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-25 14:30:28 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-25 14:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-25 14:30:37 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-25 14:30:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-25 14:30:40 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-25 14:30:40 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 19) [2024-08-25 14:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-25 15:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-25 15:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-25 15:40:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-25 15:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-25 15:41:08 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-25 15:41:09 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-25 15:41:10 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-25 15:41:10 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 19) [2024-08-25 15:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-25 15:41:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][210/1251] eta 0:58:13 lr 0.000958 wd 0.0500 time 0.2370 (3.3561) data time 0.0009 (0.2476) model time 0.2361 (3.1085) loss 4.8387 (4.6990) grad_norm 4.4707 (2.7027) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][220/1251] eta 0:19:23 lr 0.000959 wd 0.0500 time 0.2384 (1.1290) data time 0.0009 (0.0716) model time 0.2375 (1.0574) loss 4.3397 (4.5262) grad_norm 2.5602 (2.4468) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][230/1251] eta 0:12:53 lr 0.000959 wd 0.0500 time 0.2404 (0.7578) data time 0.0011 (0.0422) model time 0.2393 (0.7156) loss 4.0752 (4.5046) grad_norm 2.4682 (2.3197) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][240/1251] eta 0:10:11 lr 0.000960 wd 0.0500 time 0.2407 (0.6048) data time 0.0009 (0.0301) model time 0.2399 (0.5746) loss 3.5421 (4.4901) grad_norm 1.8753 (2.2330) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][250/1251] eta 0:08:41 lr 0.000960 wd 0.0500 time 0.2420 (0.5212) data time 0.0008 (0.0235) model time 0.2412 (0.4976) loss 4.6401 (4.4359) grad_norm 1.2811 (2.1435) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[19/300][260/1251] eta 0:07:44 lr 0.000960 wd 0.0500 time 0.2396 (0.4686) data time 0.0009 (0.0194) model time 0.2387 (0.4492) loss 5.0199 (4.4239) grad_norm 1.5879 (2.1547) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][270/1251] eta 0:07:04 lr 0.000961 wd 0.0500 time 0.2391 (0.4325) data time 0.0009 (0.0165) model time 0.2382 (0.4160) loss 4.5094 (4.3787) grad_norm 3.0741 (2.1473) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][280/1251] eta 0:06:34 lr 0.000961 wd 0.0500 time 0.2452 (0.4065) data time 0.0009 (0.0145) model time 0.2443 (0.3920) loss 4.7038 (4.3368) grad_norm 1.7052 (2.1425) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][290/1251] eta 0:06:11 lr 0.000962 wd 0.0500 time 0.2339 (0.3863) data time 0.0012 (0.0129) model time 0.2327 (0.3735) loss 4.4849 (4.2963) grad_norm 2.6262 (2.1199) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][300/1251] eta 0:05:52 lr 0.000962 wd 0.0500 time 0.2394 (0.3705) data time 0.0011 (0.0116) model time 0.2384 (0.3588) loss 3.9131 (4.2872) grad_norm 2.6977 (2.1677) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][310/1251] eta 0:05:36 lr 0.000962 wd 0.0500 time 0.2434 (0.3580) data time 0.0011 (0.0106) model time 0.2423 (0.3473) loss 4.1219 (4.3130) grad_norm 1.3373 (2.2537) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][320/1251] eta 0:05:23 lr 0.000963 wd 0.0500 time 0.2446 (0.3475) data time 0.0011 (0.0098) model time 0.2435 (0.3377) loss 4.3116 (4.3004) grad_norm 2.7220 (2.2679) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-25 15:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][330/1251] eta 0:05:12 lr 0.000963 wd 0.0500 time 0.2404 (0.3389) data time 0.0011 (0.0091) model time 0.2393 (0.3298) loss 3.9308 (4.3032) grad_norm 1.8343 (2.2602) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][340/1251] eta 0:05:02 lr 0.000964 wd 0.0500 time 0.2454 (0.3316) data time 0.0010 (0.0085) model time 0.2444 (0.3231) loss 4.4126 (4.2988) grad_norm 1.4909 (2.2995) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][350/1251] eta 0:04:52 lr 0.000964 wd 0.0500 time 0.2478 (0.3252) data time 0.0008 (0.0080) model time 0.2470 (0.3172) loss 3.8840 (4.2848) grad_norm 1.6431 (2.3002) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][360/1251] eta 0:04:44 lr 0.000964 wd 0.0500 time 0.2401 (0.3196) data time 0.0009 (0.0075) model time 0.2391 (0.3121) loss 3.8169 (4.2703) grad_norm 1.4258 (2.2985) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][370/1251] eta 0:04:37 lr 0.000965 wd 0.0500 time 0.2255 (0.3147) data time 0.0011 (0.0072) model time 0.2243 (0.3076) loss 4.1158 (4.2562) grad_norm 2.3563 (2.3062) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][380/1251] eta 0:04:30 lr 0.000965 wd 0.0500 time 0.2406 (0.3104) data time 0.0010 (0.0068) model time 0.2396 (0.3036) loss 3.2822 (4.2463) grad_norm 2.2469 (2.2803) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][390/1251] eta 0:04:23 lr 0.000966 wd 0.0500 time 0.2372 (0.3065) data time 0.0009 (0.0065) 
model time 0.2363 (0.3000) loss 3.5967 (4.2276) grad_norm 1.3796 (2.2648) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][400/1251] eta 0:04:17 lr 0.000966 wd 0.0500 time 0.2403 (0.3030) data time 0.0008 (0.0062) model time 0.2395 (0.2968) loss 4.2122 (4.2284) grad_norm 1.4423 (2.2494) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][410/1251] eta 0:04:12 lr 0.000966 wd 0.0500 time 0.2338 (0.2998) data time 0.0011 (0.0060) model time 0.2327 (0.2938) loss 4.1635 (4.2125) grad_norm 2.8883 (2.2400) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][420/1251] eta 0:04:06 lr 0.000967 wd 0.0500 time 0.2397 (0.2970) data time 0.0010 (0.0057) model time 0.2387 (0.2912) loss 4.2459 (4.2039) grad_norm 2.2240 (2.2361) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][430/1251] eta 0:04:01 lr 0.000967 wd 0.0500 time 0.2398 (0.2945) data time 0.0010 (0.0055) model time 0.2388 (0.2889) loss 4.1261 (4.1979) grad_norm 3.1659 (2.2608) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][440/1251] eta 0:03:57 lr 0.000968 wd 0.0500 time 0.2428 (0.2923) data time 0.0008 (0.0053) model time 0.2421 (0.2869) loss 3.5515 (4.1907) grad_norm 1.8943 (2.2697) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][450/1251] eta 0:03:52 lr 0.000968 wd 0.0500 time 0.2327 (0.2902) data time 0.0009 (0.0052) model time 0.2318 (0.2850) loss 2.9569 (4.1875) grad_norm 2.8209 (2.2884) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[19/300][460/1251] eta 0:03:47 lr 0.000968 wd 0.0500 time 0.2462 (0.2881) data time 0.0007 (0.0050) model time 0.2455 (0.2831) loss 2.8219 (4.1765) grad_norm 1.7287 (2.2844) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][470/1251] eta 0:03:43 lr 0.000969 wd 0.0500 time 0.2373 (0.2864) data time 0.0007 (0.0049) model time 0.2366 (0.2815) loss 4.1061 (4.1696) grad_norm 2.2277 (2.2682) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][480/1251] eta 0:03:39 lr 0.000969 wd 0.0500 time 0.2363 (0.2848) data time 0.0011 (0.0047) model time 0.2352 (0.2800) loss 4.4347 (4.1700) grad_norm 2.0226 (2.2520) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][490/1251] eta 0:03:35 lr 0.000970 wd 0.0500 time 0.2443 (0.2832) data time 0.0010 (0.0046) model time 0.2433 (0.2786) loss 3.0147 (4.1623) grad_norm 2.5003 (2.2484) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][500/1251] eta 0:03:32 lr 0.000970 wd 0.0500 time 0.2391 (0.2825) data time 0.0009 (0.0045) model time 0.2382 (0.2780) loss 3.9492 (4.1608) grad_norm 2.0463 (2.2570) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][510/1251] eta 0:03:28 lr 0.000970 wd 0.0500 time 0.2348 (0.2811) data time 0.0012 (0.0044) model time 0.2337 (0.2767) loss 4.2204 (4.1513) grad_norm 1.9956 (2.2597) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][520/1251] eta 0:03:25 lr 0.000971 wd 0.0500 time 0.2408 (0.2806) data time 0.0010 (0.0043) model time 0.2399 (0.2763) loss 4.1532 (4.1532) grad_norm 1.8293 (2.2519) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-25 15:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][530/1251] eta 0:03:21 lr 0.000971 wd 0.0500 time 0.2388 (0.2794) data time 0.0010 (0.0042) model time 0.2378 (0.2752) loss 4.7308 (4.1660) grad_norm 1.7525 (2.2506) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][540/1251] eta 0:03:17 lr 0.000972 wd 0.0500 time 0.2397 (0.2781) data time 0.0007 (0.0041) model time 0.2389 (0.2741) loss 3.9964 (4.1648) grad_norm 2.1904 (2.2465) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][550/1251] eta 0:03:14 lr 0.000972 wd 0.0500 time 0.2409 (0.2771) data time 0.0010 (0.0040) model time 0.2399 (0.2731) loss 4.4123 (4.1673) grad_norm 3.2430 (2.2408) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-25 15:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-25 15:42:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-25 15:42:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 03:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 03:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 03:48:55 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 03:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 03:49:05 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 03:49:06 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 03:49:07 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 03:49:07 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 19) [2024-08-26 03:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 03:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][560/1251] eta 0:55:03 lr 0.000972 wd 0.0500 time 0.2424 (4.7809) data time 0.0009 (0.2697) model time 0.2414 (4.5112) loss 3.7408 (4.6578) grad_norm 1.6198 (1.9074) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][570/1251] eta 0:14:37 lr 0.000973 wd 0.0500 time 0.2456 (1.2892) data time 0.0011 (0.0631) model time 0.2445 (1.2261) loss 4.1427 (4.5044) grad_norm 2.0005 (1.8384) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][580/1251] eta 0:09:18 lr 0.000973 wd 0.0500 time 0.2423 (0.8327) data time 0.0007 (0.0361) model time 0.2416 (0.7966) loss 4.9005 (4.5569) grad_norm 2.0336 (1.8773) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][590/1251] eta 0:07:11 lr 0.000974 wd 0.0500 time 0.2383 (0.6531) data time 0.0008 (0.0255) model time 0.2375 (0.6276) loss 4.6951 (4.5488) grad_norm 1.9493 (1.9114) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][600/1251] eta 0:06:02 lr 0.000974 wd 0.0500 time 0.2395 (0.5573) data time 0.0011 (0.0199) model time 0.2384 (0.5374) loss 4.0457 (4.4317) grad_norm 1.9800 (1.9898) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][610/1251] eta 0:05:18 lr 0.000974 wd 0.0500 time 0.2339 (0.4973) data time 0.0011 (0.0163) model time 0.2328 (0.4810) loss 4.2285 (4.4169) grad_norm 1.7641 (2.0008) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][620/1251] eta 0:04:47 lr 0.000975 wd 0.0500 time 0.2338 (0.4560) data time 0.0010 (0.0139) model time 0.2328 (0.4421) loss 3.7754 (4.3748) grad_norm 1.5848 (2.1266) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][630/1251] eta 0:04:24 lr 0.000975 wd 0.0500 time 0.2375 (0.4263) data time 0.0012 (0.0121) model time 0.2363 (0.4141) loss 4.4633 (4.3166) grad_norm 1.4623 (2.0926) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][640/1251] eta 0:04:06 lr 0.000976 wd 0.0500 time 0.2371 (0.4039) data time 0.0010 (0.0108) model time 0.2361 (0.3931) loss 3.0629 (4.2816) grad_norm 2.6696 (2.1342) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][650/1251] eta 0:03:52 lr 0.000976 wd 0.0500 time 0.2310 (0.3861) data time 0.0009 (0.0098) model time 0.2301 (0.3763) loss 4.9963 (4.2668) grad_norm 1.6978 (2.1762) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][660/1251] eta 0:03:39 lr 0.000976 wd 0.0500 time 0.2409 (0.3719) data time 0.0009 (0.0089) model time 0.2400 (0.3630) loss 5.1053 
(4.2885) grad_norm 4.0902 (2.1695) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][670/1251] eta 0:03:29 lr 0.000977 wd 0.0500 time 0.2359 (0.3601) data time 0.0008 (0.0083) model time 0.2352 (0.3519) loss 4.4581 (4.2821) grad_norm 1.7936 (2.1574) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][680/1251] eta 0:03:19 lr 0.000977 wd 0.0500 time 0.2364 (0.3503) data time 0.0010 (0.0077) model time 0.2354 (0.3426) loss 4.3727 (4.2833) grad_norm 2.0379 (2.1577) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][690/1251] eta 0:03:11 lr 0.000978 wd 0.0500 time 0.2340 (0.3420) data time 0.0009 (0.0072) model time 0.2331 (0.3348) loss 4.2270 (4.2655) grad_norm 2.1241 (2.1519) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][700/1251] eta 0:03:04 lr 0.000978 wd 0.0500 time 0.2384 (0.3348) data time 0.0010 (0.0067) model time 0.2375 (0.3281) loss 4.5158 (4.2454) grad_norm 1.5918 (2.1274) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][710/1251] eta 0:02:57 lr 0.000978 wd 0.0500 time 0.2397 (0.3286) data time 0.0009 (0.0064) model time 0.2388 (0.3223) loss 4.1440 (4.2316) grad_norm 3.5207 (2.1405) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][720/1251] eta 0:02:51 lr 0.000979 wd 0.0500 time 0.2452 (0.3234) data time 0.0007 (0.0060) model time 0.2445 (0.3174) loss 3.2308 (4.2367) grad_norm 2.9177 (2.1384) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][730/1251] eta 0:02:45 lr 0.000979 wd 
0.0500 time 0.2356 (0.3185) data time 0.0012 (0.0057) model time 0.2344 (0.3128) loss 4.4684 (4.2312) grad_norm 2.5176 (2.1490) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][740/1251] eta 0:02:40 lr 0.000980 wd 0.0500 time 0.2391 (0.3143) data time 0.0012 (0.0055) model time 0.2379 (0.3088) loss 4.5268 (4.2206) grad_norm 1.9316 (2.1861) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][750/1251] eta 0:02:35 lr 0.000980 wd 0.0500 time 0.2458 (0.3105) data time 0.0007 (0.0053) model time 0.2451 (0.3052) loss 4.0573 (4.2152) grad_norm 2.5325 (2.2253) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][760/1251] eta 0:02:30 lr 0.000980 wd 0.0500 time 0.2413 (0.3071) data time 0.0008 (0.0051) model time 0.2405 (0.3020) loss 3.2573 (4.2035) grad_norm 1.5780 (2.2191) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][770/1251] eta 0:02:26 lr 0.000981 wd 0.0500 time 0.2417 (0.3039) data time 0.0011 (0.0049) model time 0.2406 (0.2990) loss 3.3404 (4.2000) grad_norm 3.0268 (2.2352) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][780/1251] eta 0:02:21 lr 0.000981 wd 0.0500 time 0.2397 (0.3010) data time 0.0008 (0.0047) model time 0.2389 (0.2963) loss 3.0574 (4.1994) grad_norm 2.1653 (2.2457) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][790/1251] eta 0:02:17 lr 0.000982 wd 0.0500 time 0.2306 (0.2984) data time 0.0016 (0.0045) model time 0.2290 (0.2939) loss 3.6304 (4.1927) grad_norm 1.5537 (2.2490) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:23 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][800/1251] eta 0:02:13 lr 0.000982 wd 0.0500 time 0.2448 (0.2961) data time 0.0010 (0.0044) model time 0.2438 (0.2917) loss 4.2601 (4.1916) grad_norm 4.1740 (2.2460) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][810/1251] eta 0:02:09 lr 0.000982 wd 0.0500 time 0.2411 (0.2938) data time 0.0012 (0.0043) model time 0.2400 (0.2896) loss 4.4613 (4.1846) grad_norm 1.9112 (2.2573) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][820/1251] eta 0:02:05 lr 0.000983 wd 0.0500 time 0.2429 (0.2918) data time 0.0008 (0.0041) model time 0.2421 (0.2877) loss 3.9231 (4.1708) grad_norm 1.8132 (2.2526) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][830/1251] eta 0:02:02 lr 0.000983 wd 0.0500 time 0.2396 (0.2900) data time 0.0009 (0.0040) model time 0.2387 (0.2860) loss 5.3691 (4.1724) grad_norm 3.1537 (2.2518) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][840/1251] eta 0:01:58 lr 0.000984 wd 0.0500 time 0.2410 (0.2882) data time 0.0010 (0.0039) model time 0.2400 (0.2843) loss 4.4297 (4.1664) grad_norm 2.1627 (2.2332) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][850/1251] eta 0:01:55 lr 0.000984 wd 0.0500 time 0.2304 (0.2873) data time 0.0012 (0.0038) model time 0.2292 (0.2835) loss 3.9089 (4.1594) grad_norm 1.7538 (2.2273) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][860/1251] eta 0:01:51 lr 0.000984 wd 0.0500 time 0.2384 (0.2858) data time 0.0010 (0.0037) model time 0.2374 (0.2821) loss 4.2919 
(4.1505) grad_norm 1.8967 (2.2223) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][870/1251] eta 0:01:48 lr 0.000985 wd 0.0500 time 0.2373 (0.2851) data time 0.0007 (0.0036) model time 0.2365 (0.2815) loss 4.8100 (4.1557) grad_norm 2.5449 (2.2224) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][880/1251] eta 0:01:45 lr 0.000985 wd 0.0500 time 0.2358 (0.2838) data time 0.0012 (0.0036) model time 0.2347 (0.2802) loss 4.5892 (4.1671) grad_norm 2.4166 (2.2191) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][890/1251] eta 0:01:41 lr 0.000986 wd 0.0500 time 0.2356 (0.2824) data time 0.0011 (0.0035) model time 0.2345 (0.2789) loss 4.6118 (4.1664) grad_norm 1.9324 (2.2214) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][900/1251] eta 0:01:38 lr 0.000986 wd 0.0500 time 0.2425 (0.2812) data time 0.0011 (0.0034) model time 0.2415 (0.2778) loss 4.1585 (4.1681) grad_norm 1.9907 (2.2290) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][910/1251] eta 0:01:35 lr 0.000986 wd 0.0500 time 0.2381 (0.2800) data time 0.0009 (0.0033) model time 0.2371 (0.2767) loss 4.6702 (4.1723) grad_norm 1.8678 (2.2226) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][920/1251] eta 0:01:32 lr 0.000987 wd 0.0500 time 0.2388 (0.2789) data time 0.0007 (0.0033) model time 0.2381 (0.2756) loss 4.6299 (4.1746) grad_norm 2.0514 (2.2334) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][930/1251] eta 0:01:29 lr 0.000987 wd 
0.0500 time 0.2439 (0.2778) data time 0.0007 (0.0032) model time 0.2432 (0.2746) loss 4.6250 (4.1688) grad_norm 2.0843 (2.2327) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][940/1251] eta 0:01:26 lr 0.000988 wd 0.0500 time 0.2421 (0.2768) data time 0.0010 (0.0032) model time 0.2411 (0.2737) loss 3.2244 (4.1634) grad_norm 3.7780 (2.2449) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][950/1251] eta 0:01:23 lr 0.000988 wd 0.0500 time 0.2392 (0.2759) data time 0.0012 (0.0031) model time 0.2381 (0.2728) loss 4.2047 (4.1581) grad_norm 2.7908 (2.2453) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][960/1251] eta 0:01:20 lr 0.000988 wd 0.0500 time 0.2344 (0.2750) data time 0.0011 (0.0031) model time 0.2333 (0.2719) loss 3.7747 (4.1615) grad_norm 2.2036 (2.2428) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][970/1251] eta 0:01:17 lr 0.000989 wd 0.0500 time 0.2437 (0.2741) data time 0.0010 (0.0030) model time 0.2427 (0.2711) loss 4.5874 (4.1690) grad_norm 2.3939 (2.2576) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][980/1251] eta 0:01:14 lr 0.000989 wd 0.0500 time 0.2461 (0.2733) data time 0.0007 (0.0030) model time 0.2454 (0.2704) loss 2.9984 (4.1676) grad_norm 1.9912 (2.2540) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][990/1251] eta 0:01:11 lr 0.000990 wd 0.0500 time 0.2531 (0.2726) data time 0.0008 (0.0029) model time 0.2523 (0.2697) loss 4.8671 (4.1777) grad_norm 2.1714 (2.2473) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:12 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1000/1251] eta 0:01:08 lr 0.000990 wd 0.0500 time 0.2488 (0.2719) data time 0.0007 (0.0029) model time 0.2481 (0.2690) loss 4.0648 (4.1824) grad_norm 1.9738 (2.2502) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1010/1251] eta 0:01:05 lr 0.000990 wd 0.0500 time 0.2454 (0.2712) data time 0.0008 (0.0028) model time 0.2446 (0.2683) loss 3.2000 (4.1769) grad_norm 1.6031 (2.2468) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1020/1251] eta 0:01:02 lr 0.000991 wd 0.0500 time 0.2396 (0.2705) data time 0.0009 (0.0028) model time 0.2387 (0.2677) loss 3.6181 (4.1715) grad_norm 1.8092 (2.2367) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1030/1251] eta 0:00:59 lr 0.000991 wd 0.0500 time 0.2400 (0.2699) data time 0.0008 (0.0028) model time 0.2392 (0.2671) loss 3.7793 (4.1638) grad_norm 2.3815 (2.2345) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1040/1251] eta 0:00:56 lr 0.000992 wd 0.0500 time 0.2489 (0.2694) data time 0.0007 (0.0027) model time 0.2482 (0.2666) loss 5.0358 (4.1646) grad_norm 2.2765 (2.2248) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1050/1251] eta 0:00:54 lr 0.000992 wd 0.0500 time 0.2349 (0.2688) data time 0.0009 (0.0027) model time 0.2340 (0.2661) loss 3.4811 (4.1662) grad_norm 1.6057 (2.2342) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1060/1251] eta 0:00:51 lr 0.000992 wd 0.0500 time 0.2380 (0.2682) data time 0.0007 (0.0027) model time 0.2372 (0.2656) loss 3.4446 
(4.1636) grad_norm 2.5388 (2.2372) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1070/1251] eta 0:00:48 lr 0.000993 wd 0.0500 time 0.2413 (0.2677) data time 0.0009 (0.0026) model time 0.2404 (0.2650) loss 4.7457 (4.1716) grad_norm 2.4020 (2.2367) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1080/1251] eta 0:00:45 lr 0.000993 wd 0.0500 time 0.2597 (0.2672) data time 0.0009 (0.0026) model time 0.2588 (0.2646) loss 3.9140 (4.1661) grad_norm 2.0900 (2.2367) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1090/1251] eta 0:00:42 lr 0.000994 wd 0.0500 time 0.2370 (0.2666) data time 0.0007 (0.0026) model time 0.2363 (0.2641) loss 4.4460 (4.1629) grad_norm 2.4050 (2.2359) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1100/1251] eta 0:00:40 lr 0.000994 wd 0.0500 time 0.2389 (0.2661) data time 0.0011 (0.0025) model time 0.2378 (0.2636) loss 4.1398 (4.1601) grad_norm 2.1740 (2.2380) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1110/1251] eta 0:00:37 lr 0.000994 wd 0.0500 time 0.2380 (0.2657) data time 0.0009 (0.0025) model time 0.2371 (0.2632) loss 4.9757 (4.1661) grad_norm 2.3548 (2.2402) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1120/1251] eta 0:00:34 lr 0.000995 wd 0.0500 time 0.2475 (0.2653) data time 0.0010 (0.0025) model time 0.2465 (0.2628) loss 4.0389 (4.1689) grad_norm 2.1010 (2.2403) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1130/1251] eta 0:00:32 lr 
0.000995 wd 0.0500 time 0.2412 (0.2648) data time 0.0007 (0.0025) model time 0.2405 (0.2624) loss 3.8601 (4.1700) grad_norm 2.5848 (2.2402) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1140/1251] eta 0:00:29 lr 0.000996 wd 0.0500 time 0.2419 (0.2644) data time 0.0010 (0.0024) model time 0.2409 (0.2620) loss 4.3322 (4.1736) grad_norm 2.0392 (2.2408) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1150/1251] eta 0:00:26 lr 0.000996 wd 0.0500 time 0.2407 (0.2640) data time 0.0011 (0.0024) model time 0.2395 (0.2616) loss 3.3957 (4.1751) grad_norm 1.9560 (2.2369) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1160/1251] eta 0:00:23 lr 0.000996 wd 0.0500 time 0.2366 (0.2636) data time 0.0010 (0.0024) model time 0.2355 (0.2612) loss 2.8632 (4.1704) grad_norm 2.2191 (2.2313) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1170/1251] eta 0:00:21 lr 0.000997 wd 0.0500 time 0.2387 (0.2632) data time 0.0010 (0.0024) model time 0.2377 (0.2609) loss 4.3203 (4.1711) grad_norm 1.3627 (2.2293) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1180/1251] eta 0:00:18 lr 0.000997 wd 0.0500 time 0.2422 (0.2629) data time 0.0010 (0.0023) model time 0.2412 (0.2605) loss 5.1515 (4.1746) grad_norm 1.6187 (2.2227) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1190/1251] eta 0:00:16 lr 0.000998 wd 0.0500 time 0.2449 (0.2625) data time 0.0008 (0.0023) model time 0.2441 (0.2602) loss 4.8775 (4.1785) grad_norm 2.0272 (2.2251) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 
03:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1200/1251] eta 0:00:13 lr 0.000998 wd 0.0500 time 0.2385 (0.2622) data time 0.0008 (0.0023) model time 0.2377 (0.2599) loss 5.3377 (4.1748) grad_norm 3.6088 (2.2318) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1210/1251] eta 0:00:10 lr 0.000998 wd 0.0500 time 0.2394 (0.2619) data time 0.0008 (0.0023) model time 0.2386 (0.2596) loss 4.7091 (4.1756) grad_norm 2.1522 (2.2330) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1220/1251] eta 0:00:08 lr 0.000999 wd 0.0500 time 0.2397 (0.2615) data time 0.0011 (0.0023) model time 0.2385 (0.2593) loss 3.9509 (4.1720) grad_norm 2.4505 (2.2334) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1230/1251] eta 0:00:05 lr 0.000999 wd 0.0500 time 0.2420 (0.2612) data time 0.0008 (0.0022) model time 0.2413 (0.2590) loss 4.7325 (4.1771) grad_norm 1.4229 (2.2257) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1240/1251] eta 0:00:02 lr 0.001000 wd 0.0500 time 0.2263 (0.2608) data time 0.0005 (0.0022) model time 0.2258 (0.2586) loss 4.2123 (4.1781) grad_norm 2.9650 (2.2291) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [19/300][1250/1251] eta 0:00:00 lr 0.001000 wd 0.0500 time 0.2254 (0.2603) data time 0.0007 (0.0022) model time 0.2247 (0.2581) loss 4.3911 (4.1749) grad_norm 1.7985 (2.2329) loss_scale 8192.0000 (8192.0000) mem 7373MB [2024-08-26 03:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 19 training takes 0:03:00 [2024-08-26 03:52:12 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 03:52:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 03:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.428 (0.428) Loss 0.9473 (0.9473) Acc@1 80.371 (80.371) Acc@5 93.945 (93.945) Mem 7373MB
[2024-08-26 03:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.109) Loss 1.2178 (1.2228) Acc@1 72.754 (72.576) Acc@5 93.066 (92.054) Mem 7373MB
[2024-08-26 03:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.095) Loss 1.7402 (1.2384) Acc@1 61.719 (71.773) Acc@5 84.668 (92.076) Mem 7373MB
[2024-08-26 03:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 2.1465 (1.4094) Acc@1 52.637 (68.340) Acc@5 77.148 (89.359) Mem 7373MB
[2024-08-26 03:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.084) Loss 2.0332 (1.5194) Acc@1 56.836 (66.240) Acc@5 81.934 (87.783) Mem 7373MB
[2024-08-26 03:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 65.892 Acc@5 87.622
[2024-08-26 03:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 65.9%
[2024-08-26 03:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 65.89%
[2024-08-26 03:52:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 03:52:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 03:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.405 (0.405) Loss 0.9727 (0.9727) Acc@1 78.906 (78.906) Acc@5 93.164 (93.164) Mem 7373MB
[2024-08-26 03:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.108) Loss 1.3672 (1.4201) Acc@1 69.727 (66.735) Acc@5 90.527 (88.699) Mem 7373MB
[2024-08-26 03:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.095) Loss 2.0000 (1.4230) Acc@1 56.641 (66.741) Acc@5 79.590 (88.802) Mem 7373MB
[2024-08-26 03:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 2.1738 (1.6144) Acc@1 54.102 (63.297) Acc@5 76.270 (85.903) Mem 7373MB
[2024-08-26 03:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 2.3125 (1.7343) Acc@1 48.535 (61.011) Acc@5 75.781 (84.027) Mem 7373MB
[2024-08-26 03:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 60.906 Acc@5 84.000
[2024-08-26 03:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 60.9%
[2024-08-26 03:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 60.91%
[2024-08-26 03:52:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 03:52:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 03:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][0/1251] eta 0:15:55 lr 0.001000 wd 0.0500 time 0.7635 (0.7635) data time 0.4622 (0.4622) model time 0.0000 (0.0000) loss 4.4364 (4.4364) grad_norm 1.7317 (1.7317) loss_scale 8192.0000 (8192.0000) mem 7381MB [2024-08-26 03:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][10/1251] eta 0:06:00 lr 0.001000 wd 0.0500 time 0.2380 (0.2902) data time 0.0011 (0.0430) model time 0.0000 (0.0000) loss 4.6915 (4.0001) grad_norm 5.2200 (2.3161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][20/1251] eta 0:05:28 lr 0.001000 wd 0.0500 time 0.2371 (0.2669) data time 0.0008 (0.0230) model time 0.0000 (0.0000) loss 3.4351 (3.8749) grad_norm 1.9833 (2.4255) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][30/1251] eta 0:05:14 lr 0.001000 wd 0.0500 time 0.2432 (0.2575) data time 0.0007 (0.0159) model time 0.0000 (0.0000) loss 3.6374 (3.9386) grad_norm 1.6213 (2.2377) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][40/1251] eta 0:05:13 lr 0.001000 wd 0.0500 time 0.2435 (0.2591) data time 0.0011 (0.0123) model time 0.0000 (0.0000) loss 4.4922 (4.0193) grad_norm 2.2558 (2.2652) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][50/1251] eta 0:05:07 lr 0.001000 wd 0.0500 time 0.2481 (0.2558) data time 0.0007 (0.0101) model time 0.0000 (0.0000) loss 3.1178 (4.0562) grad_norm 3.6555 (2.2864) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][60/1251] eta 0:05:01 lr 0.001000 wd 0.0500 time 0.2423 (0.2533) data time 0.0009 (0.0086) model time 0.2413 (0.2392) loss 
3.3668 (4.0678) grad_norm 2.8550 (2.3242) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][70/1251] eta 0:04:56 lr 0.001000 wd 0.0500 time 0.2457 (0.2515) data time 0.0010 (0.0075) model time 0.2447 (0.2394) loss 4.0054 (4.1017) grad_norm 2.0057 (2.2863) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][80/1251] eta 0:04:52 lr 0.001000 wd 0.0500 time 0.2381 (0.2499) data time 0.0012 (0.0067) model time 0.2369 (0.2389) loss 4.5540 (4.1430) grad_norm 1.4017 (2.2719) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][90/1251] eta 0:04:49 lr 0.001000 wd 0.0500 time 0.2447 (0.2491) data time 0.0008 (0.0061) model time 0.2439 (0.2394) loss 3.4810 (4.1351) grad_norm 2.1209 (2.2360) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][100/1251] eta 0:04:45 lr 0.001000 wd 0.0500 time 0.2425 (0.2483) data time 0.0010 (0.0056) model time 0.2415 (0.2396) loss 4.6700 (4.1446) grad_norm 1.4227 (2.2195) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][110/1251] eta 0:04:42 lr 0.001000 wd 0.0500 time 0.2380 (0.2478) data time 0.0011 (0.0052) model time 0.2368 (0.2399) loss 3.6654 (4.1230) grad_norm 1.5716 (2.1785) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][120/1251] eta 0:04:42 lr 0.001000 wd 0.0500 time 0.2475 (0.2501) data time 0.0011 (0.0048) model time 0.2464 (0.2450) loss 4.5567 (4.0795) grad_norm 3.0367 (2.1831) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][130/1251] eta 0:04:39 lr 
0.001000 wd 0.0500 time 0.2390 (0.2495) data time 0.0010 (0.0045) model time 0.2380 (0.2444) loss 4.0700 (4.0902) grad_norm 1.7655 (2.1902) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][140/1251] eta 0:04:36 lr 0.001000 wd 0.0500 time 0.2475 (0.2491) data time 0.0008 (0.0043) model time 0.2467 (0.2443) loss 3.7094 (4.0650) grad_norm 2.2624 (2.1842) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][150/1251] eta 0:04:33 lr 0.001000 wd 0.0500 time 0.2402 (0.2486) data time 0.0008 (0.0041) model time 0.2393 (0.2439) loss 3.1032 (4.0653) grad_norm 2.0567 (2.1971) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][160/1251] eta 0:04:30 lr 0.001000 wd 0.0500 time 0.2422 (0.2482) data time 0.0010 (0.0039) model time 0.2413 (0.2436) loss 4.3725 (4.0686) grad_norm 2.4280 (2.2124) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][170/1251] eta 0:04:27 lr 0.001000 wd 0.0500 time 0.2368 (0.2477) data time 0.0011 (0.0037) model time 0.2358 (0.2433) loss 4.2913 (4.0619) grad_norm 2.0668 (2.2052) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][180/1251] eta 0:04:24 lr 0.001000 wd 0.0500 time 0.2393 (0.2474) data time 0.0010 (0.0035) model time 0.2383 (0.2431) loss 4.4565 (4.0673) grad_norm 2.0834 (2.1867) loss_scale 16384.0000 (8463.5580) mem 7379MB [2024-08-26 03:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][190/1251] eta 0:04:22 lr 0.001000 wd 0.0500 time 0.2380 (0.2471) data time 0.0009 (0.0034) model time 0.2371 (0.2430) loss 5.0881 (4.0665) grad_norm 1.9540 (2.1762) loss_scale 16384.0000 (8878.2408) mem 7379MB [2024-08-26 
03:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][200/1251] eta 0:04:19 lr 0.001000 wd 0.0500 time 0.2524 (0.2469) data time 0.0009 (0.0033) model time 0.2515 (0.2429) loss 4.4514 (4.0665) grad_norm 2.6741 (inf) loss_scale 8192.0000 (8884.8557) mem 7379MB [2024-08-26 03:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][210/1251] eta 0:04:16 lr 0.001000 wd 0.0500 time 0.2395 (0.2467) data time 0.0007 (0.0032) model time 0.2388 (0.2428) loss 3.8034 (4.0724) grad_norm 1.9389 (inf) loss_scale 8192.0000 (8852.0190) mem 7379MB [2024-08-26 03:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][220/1251] eta 0:04:14 lr 0.001000 wd 0.0500 time 0.2332 (0.2464) data time 0.0010 (0.0031) model time 0.2322 (0.2426) loss 3.2828 (4.0701) grad_norm 1.3552 (inf) loss_scale 8192.0000 (8822.1538) mem 7379MB [2024-08-26 03:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][230/1251] eta 0:04:11 lr 0.001000 wd 0.0500 time 0.2441 (0.2463) data time 0.0007 (0.0030) model time 0.2434 (0.2426) loss 5.1820 (4.0891) grad_norm 2.2818 (inf) loss_scale 8192.0000 (8794.8745) mem 7379MB [2024-08-26 03:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][240/1251] eta 0:04:08 lr 0.001000 wd 0.0500 time 0.2396 (0.2461) data time 0.0011 (0.0029) model time 0.2385 (0.2425) loss 3.7553 (4.0969) grad_norm 1.8353 (inf) loss_scale 8192.0000 (8769.8589) mem 7379MB [2024-08-26 03:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][250/1251] eta 0:04:06 lr 0.001000 wd 0.0500 time 0.2463 (0.2460) data time 0.0010 (0.0028) model time 0.2454 (0.2425) loss 3.3520 (4.0842) grad_norm 2.8155 (inf) loss_scale 8192.0000 (8746.8367) mem 7379MB [2024-08-26 03:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][260/1251] eta 0:04:03 lr 0.001000 wd 0.0500 time 0.2415 (0.2459) data time 0.0008 (0.0028) model time 0.2407 (0.2424) loss 3.2108 (4.0701) 
grad_norm 2.4739 (inf) loss_scale 8192.0000 (8725.5785) mem 7379MB [2024-08-26 03:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][270/1251] eta 0:04:01 lr 0.001000 wd 0.0500 time 0.2375 (0.2457) data time 0.0008 (0.0027) model time 0.2366 (0.2423) loss 3.8960 (4.0782) grad_norm 1.8331 (inf) loss_scale 8192.0000 (8705.8893) mem 7379MB [2024-08-26 03:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][280/1251] eta 0:03:58 lr 0.001000 wd 0.0500 time 0.2421 (0.2455) data time 0.0010 (0.0027) model time 0.2411 (0.2422) loss 4.3605 (4.0880) grad_norm 1.8337 (inf) loss_scale 8192.0000 (8687.6014) mem 7379MB [2024-08-26 03:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][290/1251] eta 0:03:55 lr 0.001000 wd 0.0500 time 0.2401 (0.2453) data time 0.0010 (0.0026) model time 0.2391 (0.2421) loss 4.0185 (4.0872) grad_norm 1.9614 (inf) loss_scale 8192.0000 (8670.5704) mem 7379MB [2024-08-26 03:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][300/1251] eta 0:03:53 lr 0.001000 wd 0.0500 time 0.2358 (0.2453) data time 0.0012 (0.0025) model time 0.2346 (0.2421) loss 4.0414 (4.0913) grad_norm 2.0264 (inf) loss_scale 8192.0000 (8654.6711) mem 7379MB [2024-08-26 03:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][310/1251] eta 0:03:50 lr 0.001000 wd 0.0500 time 0.2382 (0.2452) data time 0.0008 (0.0025) model time 0.2375 (0.2420) loss 4.2015 (4.0927) grad_norm 1.8281 (inf) loss_scale 8192.0000 (8639.7942) mem 7379MB [2024-08-26 03:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][320/1251] eta 0:03:48 lr 0.001000 wd 0.0500 time 0.2412 (0.2451) data time 0.0009 (0.0025) model time 0.2403 (0.2420) loss 4.6289 (4.1031) grad_norm 2.0988 (inf) loss_scale 8192.0000 (8625.8442) mem 7379MB [2024-08-26 03:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][330/1251] eta 0:03:45 lr 0.001000 wd 0.0500 time 0.2400 (0.2450) 
data time 0.0011 (0.0024) model time 0.2389 (0.2420) loss 2.9526 (4.0899) grad_norm 2.4795 (inf) loss_scale 8192.0000 (8612.7372) mem 7379MB [2024-08-26 03:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][340/1251] eta 0:03:43 lr 0.001000 wd 0.0500 time 0.2449 (0.2449) data time 0.0009 (0.0024) model time 0.2440 (0.2420) loss 4.1485 (4.0824) grad_norm 2.2290 (inf) loss_scale 8192.0000 (8600.3988) mem 7379MB [2024-08-26 03:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][350/1251] eta 0:03:40 lr 0.001000 wd 0.0500 time 0.2402 (0.2448) data time 0.0010 (0.0023) model time 0.2392 (0.2420) loss 2.8490 (4.0934) grad_norm 1.9177 (inf) loss_scale 8192.0000 (8588.7635) mem 7379MB [2024-08-26 03:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][360/1251] eta 0:03:38 lr 0.001000 wd 0.0500 time 0.2453 (0.2448) data time 0.0011 (0.0023) model time 0.2442 (0.2420) loss 4.0235 (4.0935) grad_norm 2.1865 (inf) loss_scale 8192.0000 (8577.7729) mem 7379MB [2024-08-26 03:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][370/1251] eta 0:03:35 lr 0.001000 wd 0.0500 time 0.2450 (0.2448) data time 0.0009 (0.0023) model time 0.2440 (0.2420) loss 4.3715 (4.1008) grad_norm 2.0567 (inf) loss_scale 8192.0000 (8567.3747) mem 7379MB [2024-08-26 03:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][380/1251] eta 0:03:33 lr 0.001000 wd 0.0500 time 0.2348 (0.2447) data time 0.0010 (0.0022) model time 0.2337 (0.2419) loss 4.6300 (4.0973) grad_norm 2.4567 (inf) loss_scale 8192.0000 (8557.5223) mem 7379MB [2024-08-26 03:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][390/1251] eta 0:03:30 lr 0.001000 wd 0.0500 time 0.2364 (0.2446) data time 0.0011 (0.0022) model time 0.2354 (0.2419) loss 3.7923 (4.0990) grad_norm 1.4323 (inf) loss_scale 8192.0000 (8548.1739) mem 7379MB [2024-08-26 03:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[20/300][400/1251] eta 0:03:28 lr 0.001000 wd 0.0500 time 0.2437 (0.2445) data time 0.0011 (0.0022) model time 0.2426 (0.2418) loss 3.6122 (4.0921) grad_norm 1.6317 (inf) loss_scale 8192.0000 (8539.2918) mem 7379MB [2024-08-26 03:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][410/1251] eta 0:03:25 lr 0.001000 wd 0.0500 time 0.2449 (0.2444) data time 0.0011 (0.0021) model time 0.2437 (0.2418) loss 4.3893 (4.0923) grad_norm 1.7289 (inf) loss_scale 8192.0000 (8530.8418) mem 7379MB [2024-08-26 03:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][420/1251] eta 0:03:23 lr 0.001000 wd 0.0500 time 0.2374 (0.2443) data time 0.0011 (0.0021) model time 0.2363 (0.2417) loss 4.4566 (4.0951) grad_norm 1.6069 (inf) loss_scale 8192.0000 (8522.7933) mem 7379MB [2024-08-26 03:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][430/1251] eta 0:03:20 lr 0.001000 wd 0.0500 time 0.2437 (0.2443) data time 0.0010 (0.0021) model time 0.2427 (0.2417) loss 3.9352 (4.0975) grad_norm 2.9613 (inf) loss_scale 8192.0000 (8515.1183) mem 7379MB [2024-08-26 03:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][440/1251] eta 0:03:18 lr 0.001000 wd 0.0500 time 0.2429 (0.2443) data time 0.0008 (0.0021) model time 0.2421 (0.2418) loss 4.0160 (4.0995) grad_norm 2.0059 (inf) loss_scale 8192.0000 (8507.7914) mem 7379MB [2024-08-26 03:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][450/1251] eta 0:03:15 lr 0.001000 wd 0.0500 time 0.2504 (0.2443) data time 0.0010 (0.0020) model time 0.2494 (0.2418) loss 4.3049 (4.1014) grad_norm 2.8616 (inf) loss_scale 8192.0000 (8500.7894) mem 7379MB [2024-08-26 03:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][460/1251] eta 0:03:13 lr 0.001000 wd 0.0500 time 0.2352 (0.2442) data time 0.0007 (0.0020) model time 0.2344 (0.2417) loss 3.4567 (4.1000) grad_norm 1.8361 (inf) loss_scale 8192.0000 (8494.0911) mem 7379MB 
[2024-08-26 03:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][470/1251] eta 0:03:10 lr 0.001000 wd 0.0500 time 0.2439 (0.2441) data time 0.0007 (0.0020) model time 0.2432 (0.2417) loss 3.2185 (4.0925) grad_norm 3.4858 (inf) loss_scale 8192.0000 (8487.6773) mem 7379MB [2024-08-26 03:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][480/1251] eta 0:03:08 lr 0.001000 wd 0.0500 time 0.2397 (0.2442) data time 0.0007 (0.0020) model time 0.2390 (0.2418) loss 5.0007 (4.1010) grad_norm 1.5299 (inf) loss_scale 8192.0000 (8481.5301) mem 7379MB [2024-08-26 03:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][490/1251] eta 0:03:05 lr 0.001000 wd 0.0500 time 0.2364 (0.2442) data time 0.0011 (0.0020) model time 0.2352 (0.2418) loss 4.7890 (4.1114) grad_norm 1.7377 (inf) loss_scale 8192.0000 (8475.6334) mem 7379MB [2024-08-26 03:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][500/1251] eta 0:03:03 lr 0.001000 wd 0.0500 time 0.2436 (0.2441) data time 0.0010 (0.0020) model time 0.2427 (0.2418) loss 4.1943 (4.1126) grad_norm 2.0075 (inf) loss_scale 8192.0000 (8469.9721) mem 7379MB [2024-08-26 03:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][510/1251] eta 0:03:00 lr 0.001000 wd 0.0500 time 0.2404 (0.2441) data time 0.0010 (0.0019) model time 0.2394 (0.2417) loss 4.3325 (4.1186) grad_norm 2.0326 (inf) loss_scale 8192.0000 (8464.5323) mem 7379MB [2024-08-26 03:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][520/1251] eta 0:02:58 lr 0.001000 wd 0.0500 time 0.2398 (0.2440) data time 0.0011 (0.0019) model time 0.2388 (0.2417) loss 4.2375 (4.1206) grad_norm 1.7720 (inf) loss_scale 8192.0000 (8459.3013) mem 7379MB [2024-08-26 03:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][530/1251] eta 0:02:55 lr 0.001000 wd 0.0500 time 0.2386 (0.2440) data time 0.0010 (0.0019) model time 0.2375 (0.2417) loss 4.4391 
(4.1210) grad_norm 2.1580 (inf) loss_scale 8192.0000 (8454.2674) mem 7379MB [2024-08-26 03:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][540/1251] eta 0:02:53 lr 0.001000 wd 0.0500 time 0.2359 (0.2439) data time 0.0012 (0.0019) model time 0.2347 (0.2417) loss 4.2110 (4.1203) grad_norm 1.7560 (inf) loss_scale 8192.0000 (8449.4196) mem 7379MB [2024-08-26 03:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][550/1251] eta 0:02:50 lr 0.001000 wd 0.0500 time 0.2391 (0.2439) data time 0.0009 (0.0019) model time 0.2382 (0.2416) loss 3.9646 (4.1223) grad_norm 2.6417 (inf) loss_scale 8192.0000 (8444.7477) mem 7379MB [2024-08-26 03:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][560/1251] eta 0:02:48 lr 0.001000 wd 0.0500 time 0.2353 (0.2438) data time 0.0010 (0.0019) model time 0.2343 (0.2416) loss 4.1736 (4.1179) grad_norm 3.1953 (inf) loss_scale 8192.0000 (8440.2424) mem 7379MB [2024-08-26 03:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][570/1251] eta 0:02:46 lr 0.001000 wd 0.0500 time 0.2381 (0.2439) data time 0.0012 (0.0019) model time 0.2369 (0.2416) loss 4.1074 (4.1220) grad_norm 2.4174 (inf) loss_scale 8192.0000 (8435.8949) mem 7379MB [2024-08-26 03:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][580/1251] eta 0:02:43 lr 0.001000 wd 0.0500 time 0.2378 (0.2438) data time 0.0009 (0.0018) model time 0.2368 (0.2416) loss 5.1237 (4.1252) grad_norm 2.3737 (inf) loss_scale 8192.0000 (8431.6971) mem 7379MB [2024-08-26 03:54:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][590/1251] eta 0:02:41 lr 0.001000 wd 0.0500 time 0.2428 (0.2438) data time 0.0009 (0.0018) model time 0.2419 (0.2416) loss 4.1119 (4.1322) grad_norm 2.3559 (inf) loss_scale 8192.0000 (8427.6413) mem 7379MB [2024-08-26 03:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][600/1251] eta 0:02:38 lr 0.001000 wd 0.0500 time 0.2359 
(0.2438) data time 0.0009 (0.0018) model time 0.2351 (0.2416) loss 3.9508 (4.1336) grad_norm 2.6599 (inf) loss_scale 8192.0000 (8423.7205) mem 7379MB [2024-08-26 03:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][610/1251] eta 0:02:36 lr 0.001000 wd 0.0500 time 0.2405 (0.2437) data time 0.0008 (0.0018) model time 0.2397 (0.2415) loss 4.8582 (4.1353) grad_norm 2.2594 (inf) loss_scale 8192.0000 (8419.9280) mem 7379MB [2024-08-26 03:54:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][620/1251] eta 0:02:33 lr 0.001000 wd 0.0500 time 0.2440 (0.2437) data time 0.0007 (0.0018) model time 0.2433 (0.2415) loss 4.9016 (4.1285) grad_norm 3.0535 (inf) loss_scale 8192.0000 (8416.2576) mem 7379MB [2024-08-26 03:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][630/1251] eta 0:02:31 lr 0.001000 wd 0.0500 time 0.2397 (0.2436) data time 0.0008 (0.0018) model time 0.2389 (0.2415) loss 3.6585 (4.1239) grad_norm 1.4580 (inf) loss_scale 8192.0000 (8412.7036) mem 7379MB [2024-08-26 03:55:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][640/1251] eta 0:02:29 lr 0.001000 wd 0.0500 time 0.4633 (0.2440) data time 0.0010 (0.0018) model time 0.4622 (0.2420) loss 4.7316 (4.1271) grad_norm 1.7673 (inf) loss_scale 8192.0000 (8409.2605) mem 7379MB [2024-08-26 03:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][650/1251] eta 0:02:26 lr 0.001000 wd 0.0500 time 0.2429 (0.2444) data time 0.0011 (0.0018) model time 0.2418 (0.2424) loss 3.9110 (4.1276) grad_norm 1.7005 (inf) loss_scale 8192.0000 (8405.9232) mem 7379MB [2024-08-26 03:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][660/1251] eta 0:02:24 lr 0.001000 wd 0.0500 time 0.2373 (0.2444) data time 0.0012 (0.0017) model time 0.2361 (0.2424) loss 4.4147 (4.1310) grad_norm 1.4948 (inf) loss_scale 8192.0000 (8402.6868) mem 7379MB [2024-08-26 03:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [20/300][670/1251] eta 0:02:21 lr 0.001000 wd 0.0500 time 0.2332 (0.2444) data time 0.0010 (0.0017) model time 0.2322 (0.2424) loss 4.3332 (4.1365) grad_norm 1.6971 (inf) loss_scale 8192.0000 (8399.5469) mem 7379MB [2024-08-26 03:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][680/1251] eta 0:02:19 lr 0.001000 wd 0.0500 time 0.2483 (0.2444) data time 0.0010 (0.0017) model time 0.2473 (0.2424) loss 4.5545 (4.1393) grad_norm 1.7480 (inf) loss_scale 8192.0000 (8396.4993) mem 7379MB [2024-08-26 03:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][690/1251] eta 0:02:17 lr 0.001000 wd 0.0500 time 0.2316 (0.2443) data time 0.0011 (0.0017) model time 0.2305 (0.2423) loss 3.1956 (4.1390) grad_norm 1.5987 (inf) loss_scale 8192.0000 (8393.5398) mem 7379MB [2024-08-26 03:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][700/1251] eta 0:02:14 lr 0.001000 wd 0.0500 time 0.2394 (0.2442) data time 0.0009 (0.0017) model time 0.2385 (0.2423) loss 4.6532 (4.1398) grad_norm 2.8254 (inf) loss_scale 8192.0000 (8390.6648) mem 7379MB [2024-08-26 03:55:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][710/1251] eta 0:02:12 lr 0.001000 wd 0.0500 time 0.2383 (0.2442) data time 0.0007 (0.0017) model time 0.2375 (0.2422) loss 4.3641 (4.1435) grad_norm 1.5844 (inf) loss_scale 8192.0000 (8387.8706) mem 7379MB [2024-08-26 03:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][720/1251] eta 0:02:09 lr 0.001000 wd 0.0500 time 0.2403 (0.2441) data time 0.0011 (0.0017) model time 0.2391 (0.2422) loss 4.4090 (4.1454) grad_norm 3.3638 (inf) loss_scale 8192.0000 (8385.1540) mem 7379MB [2024-08-26 03:55:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][730/1251] eta 0:02:07 lr 0.001000 wd 0.0500 time 0.2367 (0.2441) data time 0.0010 (0.0017) model time 0.2356 (0.2422) loss 3.9241 (4.1446) grad_norm 1.8395 (inf) loss_scale 8192.0000 (8382.5116) mem 7379MB 
[2024-08-26 03:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][740/1251] eta 0:02:04 lr 0.001000 wd 0.0500 time 0.2414 (0.2441) data time 0.0007 (0.0017) model time 0.2407 (0.2422) loss 2.7747 (4.1417) grad_norm 1.5337 (inf) loss_scale 8192.0000 (8379.9406) mem 7379MB [2024-08-26 03:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][750/1251] eta 0:02:02 lr 0.001000 wd 0.0500 time 0.2441 (0.2441) data time 0.0007 (0.0017) model time 0.2434 (0.2422) loss 5.2065 (4.1436) grad_norm 1.4564 (inf) loss_scale 8192.0000 (8377.4381) mem 7379MB [2024-08-26 03:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][760/1251] eta 0:01:59 lr 0.001000 wd 0.0500 time 0.2422 (0.2441) data time 0.0010 (0.0017) model time 0.2413 (0.2422) loss 4.5705 (4.1427) grad_norm 2.8205 (inf) loss_scale 8192.0000 (8375.0013) mem 7379MB [2024-08-26 03:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][770/1251] eta 0:01:57 lr 0.001000 wd 0.0500 time 0.2391 (0.2441) data time 0.0009 (0.0016) model time 0.2382 (0.2422) loss 4.6043 (4.1468) grad_norm 3.6815 (inf) loss_scale 8192.0000 (8372.6278) mem 7379MB [2024-08-26 03:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][780/1251] eta 0:01:54 lr 0.001000 wd 0.0500 time 0.2514 (0.2440) data time 0.0010 (0.0016) model time 0.2504 (0.2422) loss 4.5097 (4.1450) grad_norm 1.4520 (inf) loss_scale 8192.0000 (8370.3150) mem 7379MB [2024-08-26 03:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][790/1251] eta 0:01:52 lr 0.001000 wd 0.0500 time 0.2402 (0.2440) data time 0.0008 (0.0016) model time 0.2394 (0.2421) loss 3.2570 (4.1377) grad_norm 2.9439 (inf) loss_scale 8192.0000 (8368.0607) mem 7379MB [2024-08-26 03:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][800/1251] eta 0:01:50 lr 0.001000 wd 0.0500 time 0.2505 (0.2440) data time 0.0007 (0.0016) model time 0.2498 (0.2421) loss 4.4518 
(4.1347) grad_norm 2.0488 (inf) loss_scale 8192.0000 (8365.8627) mem 7379MB [2024-08-26 03:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][810/1251] eta 0:01:47 lr 0.001000 wd 0.0500 time 0.2381 (0.2439) data time 0.0010 (0.0016) model time 0.2371 (0.2421) loss 4.1001 (4.1345) grad_norm 1.7578 (inf) loss_scale 8192.0000 (8363.7189) mem 7379MB [2024-08-26 03:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][820/1251] eta 0:01:45 lr 0.001000 wd 0.0500 time 0.2388 (0.2439) data time 0.0011 (0.0016) model time 0.2377 (0.2420) loss 3.6356 (4.1367) grad_norm 1.8595 (inf) loss_scale 8192.0000 (8361.6273) mem 7379MB [2024-08-26 03:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][830/1251] eta 0:01:42 lr 0.001000 wd 0.0500 time 0.2405 (0.2439) data time 0.0007 (0.0016) model time 0.2398 (0.2421) loss 4.7462 (4.1400) grad_norm 2.4334 (inf) loss_scale 8192.0000 (8359.5860) mem 7379MB [2024-08-26 03:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][840/1251] eta 0:01:40 lr 0.001000 wd 0.0500 time 0.2418 (0.2439) data time 0.0010 (0.0016) model time 0.2409 (0.2420) loss 4.4287 (4.1376) grad_norm 2.1550 (inf) loss_scale 8192.0000 (8357.5933) mem 7379MB [2024-08-26 03:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][850/1251] eta 0:01:37 lr 0.001000 wd 0.0500 time 0.2416 (0.2438) data time 0.0008 (0.0016) model time 0.2409 (0.2420) loss 3.0798 (4.1367) grad_norm 1.9732 (inf) loss_scale 8192.0000 (8355.6475) mem 7379MB [2024-08-26 03:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][860/1251] eta 0:01:35 lr 0.001000 wd 0.0500 time 0.2404 (0.2438) data time 0.0007 (0.0016) model time 0.2397 (0.2419) loss 4.5902 (4.1352) grad_norm 1.7321 (inf) loss_scale 8192.0000 (8353.7468) mem 7379MB [2024-08-26 03:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][870/1251] eta 0:01:32 lr 0.001000 wd 0.0500 time 0.2334 
(0.2438) data time 0.0012 (0.0016) model time 0.2322 (0.2419) loss 4.5751 (4.1319) grad_norm 3.7531 (inf) loss_scale 8192.0000 (8351.8898) mem 7379MB [2024-08-26 03:55:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][880/1251] eta 0:01:30 lr 0.001000 wd 0.0500 time 0.2371 (0.2437) data time 0.0009 (0.0016) model time 0.2361 (0.2419) loss 4.4497 (4.1315) grad_norm 1.6745 (inf) loss_scale 8192.0000 (8350.0749) mem 7379MB [2024-08-26 03:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][890/1251] eta 0:01:27 lr 0.001000 wd 0.0500 time 0.2401 (0.2437) data time 0.0008 (0.0016) model time 0.2393 (0.2419) loss 4.3921 (4.1285) grad_norm 2.5973 (inf) loss_scale 8192.0000 (8348.3008) mem 7379MB [2024-08-26 03:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][900/1251] eta 0:01:25 lr 0.001000 wd 0.0500 time 0.2349 (0.2437) data time 0.0011 (0.0016) model time 0.2338 (0.2419) loss 4.4893 (4.1312) grad_norm 1.5732 (inf) loss_scale 8192.0000 (8346.5660) mem 7379MB [2024-08-26 03:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][910/1251] eta 0:01:23 lr 0.001000 wd 0.0500 time 0.2461 (0.2437) data time 0.0010 (0.0016) model time 0.2451 (0.2419) loss 4.6406 (4.1341) grad_norm 1.7727 (inf) loss_scale 8192.0000 (8344.8694) mem 7379MB [2024-08-26 03:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][920/1251] eta 0:01:20 lr 0.001000 wd 0.0500 time 0.2466 (0.2437) data time 0.0007 (0.0016) model time 0.2459 (0.2419) loss 4.4568 (4.1345) grad_norm 2.3555 (inf) loss_scale 8192.0000 (8343.2096) mem 7379MB [2024-08-26 03:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][930/1251] eta 0:01:18 lr 0.001000 wd 0.0500 time 0.2421 (0.2437) data time 0.0007 (0.0016) model time 0.2414 (0.2419) loss 4.9738 (4.1355) grad_norm 2.6201 (inf) loss_scale 8192.0000 (8341.5854) mem 7379MB [2024-08-26 03:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [20/300][940/1251] eta 0:01:15 lr 0.001000 wd 0.0500 time 0.2424 (0.2437) data time 0.0009 (0.0016) model time 0.2415 (0.2419) loss 4.4820 (4.1373) grad_norm 2.6331 (inf) loss_scale 8192.0000 (8339.9957) mem 7379MB [2024-08-26 03:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][950/1251] eta 0:01:13 lr 0.001000 wd 0.0500 time 0.2399 (0.2436) data time 0.0008 (0.0016) model time 0.2391 (0.2419) loss 3.3517 (4.1324) grad_norm 1.6521 (inf) loss_scale 8192.0000 (8338.4395) mem 7379MB [2024-08-26 03:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][960/1251] eta 0:01:10 lr 0.001000 wd 0.0500 time 0.2443 (0.2436) data time 0.0011 (0.0016) model time 0.2432 (0.2418) loss 3.5436 (4.1275) grad_norm 1.4891 (inf) loss_scale 8192.0000 (8336.9157) mem 7379MB [2024-08-26 03:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][970/1251] eta 0:01:08 lr 0.001000 wd 0.0500 time 0.2483 (0.2438) data time 0.0007 (0.0016) model time 0.2476 (0.2421) loss 2.9379 (4.1220) grad_norm 1.2863 (inf) loss_scale 8192.0000 (8335.4233) mem 7379MB [2024-08-26 03:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][980/1251] eta 0:01:06 lr 0.001000 wd 0.0500 time 0.2423 (0.2438) data time 0.0007 (0.0016) model time 0.2415 (0.2421) loss 3.2543 (4.1220) grad_norm 5.5011 (inf) loss_scale 8192.0000 (8333.9613) mem 7379MB [2024-08-26 03:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][990/1251] eta 0:01:03 lr 0.001000 wd 0.0500 time 0.2433 (0.2438) data time 0.0010 (0.0016) model time 0.2424 (0.2420) loss 4.2976 (4.1223) grad_norm 2.3676 (inf) loss_scale 8192.0000 (8332.5288) mem 7379MB [2024-08-26 03:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1000/1251] eta 0:01:01 lr 0.001000 wd 0.0500 time 0.2382 (0.2438) data time 0.0009 (0.0015) model time 0.2373 (0.2420) loss 4.7991 (4.1229) grad_norm 1.6340 (inf) loss_scale 8192.0000 (8331.1249) mem 7379MB 
[2024-08-26 03:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1010/1251] eta 0:00:58 lr 0.001000 wd 0.0500 time 0.2512 (0.2438) data time 0.0007 (0.0015) model time 0.2505 (0.2420) loss 3.9736 (4.1259) grad_norm 1.5962 (inf) loss_scale 8192.0000 (8329.7488) mem 7379MB [2024-08-26 03:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1020/1251] eta 0:00:56 lr 0.001000 wd 0.0500 time 0.2394 (0.2437) data time 0.0007 (0.0015) model time 0.2388 (0.2420) loss 3.5357 (4.1253) grad_norm 1.6230 (inf) loss_scale 8192.0000 (8328.3996) mem 7379MB [2024-08-26 03:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1030/1251] eta 0:00:53 lr 0.001000 wd 0.0500 time 0.2392 (0.2437) data time 0.0010 (0.0015) model time 0.2382 (0.2420) loss 3.5307 (4.1228) grad_norm 3.4774 (inf) loss_scale 8192.0000 (8327.0766) mem 7379MB [2024-08-26 03:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1040/1251] eta 0:00:51 lr 0.001000 wd 0.0500 time 0.2365 (0.2437) data time 0.0007 (0.0015) model time 0.2358 (0.2420) loss 3.2243 (4.1255) grad_norm 4.1039 (inf) loss_scale 8192.0000 (8325.7791) mem 7379MB [2024-08-26 03:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1050/1251] eta 0:00:48 lr 0.001000 wd 0.0500 time 0.2404 (0.2437) data time 0.0008 (0.0015) model time 0.2395 (0.2420) loss 3.1666 (4.1244) grad_norm 1.9088 (inf) loss_scale 8192.0000 (8324.5062) mem 7379MB [2024-08-26 03:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1060/1251] eta 0:00:46 lr 0.001000 wd 0.0500 time 0.2392 (0.2436) data time 0.0009 (0.0015) model time 0.2382 (0.2419) loss 4.3384 (4.1216) grad_norm 1.7959 (inf) loss_scale 8192.0000 (8323.2573) mem 7379MB [2024-08-26 03:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1070/1251] eta 0:00:44 lr 0.001000 wd 0.0500 time 0.2383 (0.2436) data time 0.0009 (0.0015) model time 0.2374 (0.2419) loss 
5.0844 (4.1238) grad_norm 3.3225 (inf) loss_scale 8192.0000 (8322.0317) mem 7379MB [2024-08-26 03:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1080/1251] eta 0:00:41 lr 0.001000 wd 0.0500 time 0.2396 (0.2436) data time 0.0008 (0.0015) model time 0.2388 (0.2419) loss 5.1735 (4.1241) grad_norm 6.4039 (inf) loss_scale 8192.0000 (8320.8289) mem 7379MB [2024-08-26 03:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1090/1251] eta 0:00:39 lr 0.001000 wd 0.0500 time 0.2346 (0.2436) data time 0.0009 (0.0015) model time 0.2337 (0.2419) loss 2.8904 (4.1225) grad_norm 2.1153 (inf) loss_scale 8192.0000 (8319.6480) mem 7379MB [2024-08-26 03:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1100/1251] eta 0:00:36 lr 0.001000 wd 0.0500 time 0.2323 (0.2436) data time 0.0009 (0.0015) model time 0.2315 (0.2419) loss 4.5782 (4.1229) grad_norm 2.8049 (inf) loss_scale 8192.0000 (8318.4886) mem 7379MB [2024-08-26 03:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1110/1251] eta 0:00:34 lr 0.001000 wd 0.0500 time 0.2422 (0.2435) data time 0.0012 (0.0015) model time 0.2409 (0.2418) loss 4.2177 (4.1242) grad_norm 1.3718 (inf) loss_scale 8192.0000 (8317.3501) mem 7379MB [2024-08-26 03:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1120/1251] eta 0:00:31 lr 0.001000 wd 0.0500 time 0.2451 (0.2435) data time 0.0007 (0.0015) model time 0.2443 (0.2418) loss 3.9077 (4.1237) grad_norm 3.0622 (inf) loss_scale 8192.0000 (8316.2319) mem 7379MB [2024-08-26 03:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1130/1251] eta 0:00:29 lr 0.001000 wd 0.0500 time 0.2374 (0.2435) data time 0.0007 (0.0015) model time 0.2366 (0.2418) loss 4.0626 (4.1231) grad_norm 2.6175 (inf) loss_scale 8192.0000 (8315.1335) mem 7379MB [2024-08-26 03:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1140/1251] eta 0:00:27 lr 0.001000 wd 0.0500 
time 0.2502 (0.2435) data time 0.0012 (0.0015) model time 0.2490 (0.2418) loss 3.6720 (4.1251) grad_norm 1.6444 (inf) loss_scale 8192.0000 (8314.0543) mem 7379MB [2024-08-26 03:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1150/1251] eta 0:00:24 lr 0.001000 wd 0.0500 time 0.2335 (0.2435) data time 0.0009 (0.0015) model time 0.2326 (0.2418) loss 4.5584 (4.1251) grad_norm 1.6519 (inf) loss_scale 8192.0000 (8312.9939) mem 7379MB [2024-08-26 03:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1160/1251] eta 0:00:22 lr 0.001000 wd 0.0500 time 0.2404 (0.2435) data time 0.0007 (0.0015) model time 0.2397 (0.2418) loss 2.8646 (4.1219) grad_norm 2.5518 (inf) loss_scale 8192.0000 (8311.9518) mem 7379MB [2024-08-26 03:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1170/1251] eta 0:00:19 lr 0.001000 wd 0.0500 time 0.2392 (0.2435) data time 0.0011 (0.0015) model time 0.2381 (0.2418) loss 2.8225 (4.1219) grad_norm 3.7461 (inf) loss_scale 8192.0000 (8310.9274) mem 7379MB [2024-08-26 03:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1180/1251] eta 0:00:17 lr 0.001000 wd 0.0500 time 0.2406 (0.2435) data time 0.0009 (0.0015) model time 0.2397 (0.2418) loss 4.8536 (4.1221) grad_norm 2.2563 (inf) loss_scale 8192.0000 (8309.9204) mem 7379MB [2024-08-26 03:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1190/1251] eta 0:00:14 lr 0.001000 wd 0.0500 time 0.2445 (0.2435) data time 0.0007 (0.0015) model time 0.2438 (0.2418) loss 5.0001 (4.1236) grad_norm 1.7402 (inf) loss_scale 8192.0000 (8308.9303) mem 7379MB [2024-08-26 03:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1200/1251] eta 0:00:12 lr 0.001000 wd 0.0500 time 0.2399 (0.2435) data time 0.0010 (0.0015) model time 0.2389 (0.2418) loss 3.7336 (4.1235) grad_norm 1.9873 (inf) loss_scale 8192.0000 (8307.9567) mem 7379MB [2024-08-26 03:57:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [20/300][1210/1251] eta 0:00:09 lr 0.001000 wd 0.0500 time 0.2419 (0.2435) data time 0.0010 (0.0015) model time 0.2409 (0.2418) loss 4.5695 (4.1235) grad_norm 1.5953 (inf) loss_scale 8192.0000 (8306.9992) mem 7379MB [2024-08-26 03:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1220/1251] eta 0:00:07 lr 0.001000 wd 0.0500 time 0.2404 (0.2435) data time 0.0007 (0.0015) model time 0.2397 (0.2418) loss 3.0925 (4.1224) grad_norm 1.7075 (inf) loss_scale 8192.0000 (8306.0573) mem 7379MB [2024-08-26 03:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1230/1251] eta 0:00:05 lr 0.001000 wd 0.0500 time 0.2416 (0.2434) data time 0.0010 (0.0015) model time 0.2406 (0.2418) loss 3.4935 (4.1189) grad_norm 1.6549 (inf) loss_scale 8192.0000 (8305.1308) mem 7379MB [2024-08-26 03:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1240/1251] eta 0:00:02 lr 0.001000 wd 0.0500 time 0.2272 (0.2434) data time 0.0007 (0.0015) model time 0.2265 (0.2417) loss 4.5855 (4.1181) grad_norm 1.2091 (inf) loss_scale 8192.0000 (8304.2192) mem 7379MB [2024-08-26 03:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [20/300][1250/1251] eta 0:00:00 lr 0.001000 wd 0.0500 time 0.2234 (0.2432) data time 0.0005 (0.0015) model time 0.2229 (0.2416) loss 3.8156 (4.1178) grad_norm 1.8131 (inf) loss_scale 8192.0000 (8303.3221) mem 7379MB [2024-08-26 03:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 20 training takes 0:05:04 [2024-08-26 03:57:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 03:57:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 03:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.396 (0.396) Loss 0.7700 (0.7700) Acc@1 84.082 (84.082) Acc@5 96.875 (96.875) Mem 7379MB
[2024-08-26 03:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.108) Loss 1.1445 (1.1537) Acc@1 74.316 (73.571) Acc@5 93.066 (93.129) Mem 7379MB
[2024-08-26 03:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.094) Loss 1.7324 (1.1818) Acc@1 62.402 (72.707) Acc@5 84.375 (92.876) Mem 7379MB
[2024-08-26 03:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.088) Loss 2.0664 (1.3548) Acc@1 54.102 (69.465) Acc@5 78.125 (90.206) Mem 7379MB
[2024-08-26 03:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 2.1191 (1.4737) Acc@1 54.590 (67.068) Acc@5 79.102 (88.467) Mem 7379MB
[2024-08-26 03:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 66.712 Acc@5 88.292
[2024-08-26 03:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 66.7%
[2024-08-26 03:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 66.71%
[2024-08-26 03:57:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 03:57:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 03:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.425 (0.425) Loss 0.8916 (0.8916) Acc@1 79.883 (79.883) Acc@5 93.652 (93.652) Mem 7379MB
[2024-08-26 03:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.096 (0.109) Loss 1.2852 (1.3204) Acc@1 71.094 (68.670) Acc@5 91.016 (89.693) Mem 7379MB
[2024-08-26 03:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.094) Loss 1.8857 (1.3251) Acc@1 58.105 (68.522) Acc@5 81.543 (89.848) Mem 7379MB
[2024-08-26 03:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.088) Loss 2.0840 (1.5129) Acc@1 54.785 (65.058) Acc@5 77.246 (87.043) Mem 7379MB
[2024-08-26 03:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 2.1836 (1.6302) Acc@1 50.977 (62.772) Acc@5 78.320 (85.278) Mem 7379MB
[2024-08-26 03:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 62.688 Acc@5 85.216
[2024-08-26 03:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 62.7%
[2024-08-26 03:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 62.69%
[2024-08-26 03:57:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 03:57:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 03:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][0/1251] eta 0:13:14 lr 0.001000 wd 0.0500 time 0.6348 (0.6348) data time 0.4068 (0.4068) model time 0.0000 (0.0000) loss 4.4495 (4.4495) grad_norm 1.4762 (1.4762) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][10/1251] eta 0:05:46 lr 0.001000 wd 0.0500 time 0.2365 (0.2790) data time 0.0009 (0.0379) model time 0.0000 (0.0000) loss 4.1535 (3.7010) grad_norm 3.0550 (2.2118) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][20/1251] eta 0:05:21 lr 0.001000 wd 0.0500 time 0.2430 (0.2615) data time 0.0008 (0.0203) model time 0.0000 (0.0000) loss 4.2163 (3.9032) grad_norm 1.8979 (2.1448) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][30/1251] eta 0:05:11 lr 0.001000 wd 0.0500 time 0.2421 (0.2554) data time 0.0011 (0.0141) model time 0.0000 (0.0000) loss 3.9082 (3.9401) grad_norm 1.8612 (2.0915) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][40/1251] eta 0:05:06 lr 0.001000 wd 0.0500 time 0.2505 (0.2530) data time 0.0007 (0.0110) model time 0.0000 (0.0000) loss 5.0870 (3.9476) grad_norm 1.8597 (2.1285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][50/1251] eta 0:05:01 lr 0.001000 wd 0.0500 time 0.2412 (0.2514) data time 0.0009 (0.0091) model time 0.0000 (0.0000) loss 3.0841 (3.9550) grad_norm 3.7747 (2.1691) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][60/1251] eta 0:04:58 lr 0.001000 wd 0.0500 time 0.2745 (0.2505) data time 0.0009 (0.0077) model time 0.2736 (0.2447) loss 
4.7150 (4.0082) grad_norm 2.1404 (2.1765) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][70/1251] eta 0:04:54 lr 0.001000 wd 0.0500 time 0.2397 (0.2493) data time 0.0008 (0.0068) model time 0.2389 (0.2429) loss 3.6295 (4.0466) grad_norm 2.9698 (2.1984) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][80/1251] eta 0:04:50 lr 0.001000 wd 0.0500 time 0.2360 (0.2483) data time 0.0011 (0.0061) model time 0.2349 (0.2418) loss 4.3442 (4.0610) grad_norm 2.4264 (2.2088) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][90/1251] eta 0:04:47 lr 0.001000 wd 0.0500 time 0.2421 (0.2477) data time 0.0007 (0.0055) model time 0.2414 (0.2418) loss 3.3511 (4.0843) grad_norm 2.3142 (2.1992) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][100/1251] eta 0:04:44 lr 0.001000 wd 0.0500 time 0.2561 (0.2473) data time 0.0010 (0.0051) model time 0.2551 (0.2420) loss 4.1905 (4.0873) grad_norm 1.3817 (2.2139) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][110/1251] eta 0:04:41 lr 0.001000 wd 0.0500 time 0.2400 (0.2469) data time 0.0009 (0.0047) model time 0.2391 (0.2419) loss 3.5970 (4.0956) grad_norm 2.4207 (2.2143) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][120/1251] eta 0:04:38 lr 0.001000 wd 0.0500 time 0.2428 (0.2464) data time 0.0007 (0.0044) model time 0.2421 (0.2417) loss 4.9024 (4.0800) grad_norm 2.5825 (2.2258) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][130/1251] eta 0:04:36 lr 
0.001000 wd 0.0500 time 0.2580 (0.2463) data time 0.0007 (0.0042) model time 0.2573 (0.2420) loss 4.2455 (4.0833) grad_norm 1.3528 (2.2454) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][140/1251] eta 0:04:33 lr 0.001000 wd 0.0500 time 0.2313 (0.2460) data time 0.0011 (0.0040) model time 0.2302 (0.2419) loss 4.2379 (4.0712) grad_norm 1.4631 (2.2116) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][150/1251] eta 0:04:30 lr 0.001000 wd 0.0500 time 0.2395 (0.2457) data time 0.0009 (0.0038) model time 0.2386 (0.2417) loss 4.9836 (4.0925) grad_norm 2.1694 (2.2457) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][160/1251] eta 0:04:27 lr 0.001000 wd 0.0500 time 0.2473 (0.2454) data time 0.0010 (0.0036) model time 0.2463 (0.2415) loss 4.3763 (4.0805) grad_norm 1.3413 (2.2380) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][170/1251] eta 0:04:25 lr 0.001000 wd 0.0500 time 0.2511 (0.2453) data time 0.0013 (0.0035) model time 0.2499 (0.2416) loss 4.2602 (4.0810) grad_norm 3.2153 (2.2253) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][180/1251] eta 0:04:23 lr 0.001000 wd 0.0500 time 0.2390 (0.2462) data time 0.0009 (0.0033) model time 0.2381 (0.2431) loss 4.7262 (4.0872) grad_norm 3.0656 (2.2253) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][190/1251] eta 0:04:22 lr 0.001000 wd 0.0500 time 0.2385 (0.2470) data time 0.0010 (0.0032) model time 0.2375 (0.2444) loss 4.7545 (4.1014) grad_norm 2.1521 (2.2196) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][200/1251] eta 0:04:19 lr 0.001000 wd 0.0500 time 0.2371 (0.2467) data time 0.0008 (0.0031) model time 0.2363 (0.2440) loss 4.5829 (4.1072) grad_norm 1.5252 (2.2036) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][210/1251] eta 0:04:16 lr 0.001000 wd 0.0500 time 0.2389 (0.2464) data time 0.0011 (0.0030) model time 0.2378 (0.2438) loss 3.9300 (4.1126) grad_norm 2.7857 (2.2095) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][220/1251] eta 0:04:13 lr 0.001000 wd 0.0500 time 0.2413 (0.2462) data time 0.0008 (0.0029) model time 0.2405 (0.2436) loss 4.1542 (4.1165) grad_norm 1.7530 (2.2087) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][230/1251] eta 0:04:11 lr 0.001000 wd 0.0500 time 0.2348 (0.2460) data time 0.0010 (0.0028) model time 0.2339 (0.2434) loss 3.1625 (4.1175) grad_norm 1.7797 (2.2030) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][240/1251] eta 0:04:09 lr 0.001000 wd 0.0500 time 0.2401 (0.2466) data time 0.0012 (0.0027) model time 0.2390 (0.2443) loss 4.5833 (4.1154) grad_norm 1.6445 (2.2016) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][250/1251] eta 0:04:06 lr 0.001000 wd 0.0500 time 0.2484 (0.2464) data time 0.0010 (0.0027) model time 0.2474 (0.2441) loss 3.7768 (4.1257) grad_norm 2.2714 (2.2052) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][260/1251] eta 0:04:04 lr 0.001000 wd 0.0500 time 0.2307 (0.2462) data time 0.0011 (0.0027) model time 0.2296 (0.2439) loss 3.4055 
(4.1111) grad_norm 2.0985 (2.2019) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][270/1251] eta 0:04:01 lr 0.001000 wd 0.0500 time 0.2332 (0.2460) data time 0.0010 (0.0026) model time 0.2322 (0.2436) loss 4.4622 (4.1152) grad_norm 1.7310 (2.2384) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][280/1251] eta 0:03:58 lr 0.001000 wd 0.0500 time 0.2479 (0.2458) data time 0.0007 (0.0025) model time 0.2472 (0.2434) loss 2.9638 (4.0965) grad_norm 3.0028 (2.2323) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][290/1251] eta 0:03:56 lr 0.001000 wd 0.0500 time 0.2445 (0.2456) data time 0.0010 (0.0025) model time 0.2435 (0.2433) loss 4.4817 (4.0919) grad_norm 1.9252 (2.2267) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][300/1251] eta 0:03:53 lr 0.001000 wd 0.0500 time 0.2461 (0.2455) data time 0.0010 (0.0024) model time 0.2451 (0.2432) loss 3.8740 (4.0892) grad_norm 2.0881 (2.2132) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][310/1251] eta 0:03:50 lr 0.001000 wd 0.0500 time 0.2377 (0.2453) data time 0.0008 (0.0024) model time 0.2369 (0.2431) loss 3.7243 (4.0868) grad_norm 1.9120 (2.2085) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][320/1251] eta 0:03:48 lr 0.001000 wd 0.0500 time 0.2472 (0.2452) data time 0.0008 (0.0023) model time 0.2464 (0.2430) loss 4.8191 (4.0928) grad_norm 1.4330 (2.1990) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][330/1251] eta 0:03:45 lr 0.001000 wd 
0.0500 time 0.2383 (0.2451) data time 0.0010 (0.0023) model time 0.2373 (0.2429) loss 4.2798 (4.0943) grad_norm 2.3062 (2.1988) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][340/1251] eta 0:03:43 lr 0.001000 wd 0.0500 time 0.2528 (0.2450) data time 0.0008 (0.0023) model time 0.2519 (0.2429) loss 3.0731 (4.0874) grad_norm 2.1493 (2.2073) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][350/1251] eta 0:03:40 lr 0.001000 wd 0.0500 time 0.2355 (0.2450) data time 0.0008 (0.0022) model time 0.2347 (0.2428) loss 3.1567 (4.0864) grad_norm 2.6866 (2.2186) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][360/1251] eta 0:03:38 lr 0.001000 wd 0.0500 time 0.2372 (0.2449) data time 0.0007 (0.0022) model time 0.2365 (0.2428) loss 3.4596 (4.0851) grad_norm 1.4853 (2.2048) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][370/1251] eta 0:03:35 lr 0.001000 wd 0.0500 time 0.2375 (0.2448) data time 0.0011 (0.0022) model time 0.2364 (0.2427) loss 3.8932 (4.0822) grad_norm 1.9637 (2.2022) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][380/1251] eta 0:03:33 lr 0.001000 wd 0.0500 time 0.2448 (0.2448) data time 0.0010 (0.0021) model time 0.2438 (0.2427) loss 4.5245 (4.0860) grad_norm 2.6990 (2.2123) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][390/1251] eta 0:03:30 lr 0.001000 wd 0.0500 time 0.2467 (0.2447) data time 0.0010 (0.0021) model time 0.2457 (0.2426) loss 5.0296 (4.0931) grad_norm 2.2794 (2.2179) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][400/1251] eta 0:03:28 lr 0.001000 wd 0.0500 time 0.2471 (0.2446) data time 0.0009 (0.0021) model time 0.2462 (0.2426) loss 3.6614 (4.0896) grad_norm 1.9102 (2.2117) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][410/1251] eta 0:03:25 lr 0.001000 wd 0.0500 time 0.2328 (0.2445) data time 0.0011 (0.0020) model time 0.2316 (0.2425) loss 3.4465 (4.0865) grad_norm 1.8608 (2.2040) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][420/1251] eta 0:03:23 lr 0.001000 wd 0.0500 time 0.2416 (0.2444) data time 0.0010 (0.0020) model time 0.2406 (0.2424) loss 3.9606 (4.0842) grad_norm 2.2437 (2.2023) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][430/1251] eta 0:03:20 lr 0.001000 wd 0.0500 time 0.2336 (0.2443) data time 0.0008 (0.0020) model time 0.2327 (0.2423) loss 3.0717 (4.0790) grad_norm 1.8860 (2.2046) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][440/1251] eta 0:03:18 lr 0.001000 wd 0.0500 time 0.2355 (0.2442) data time 0.0009 (0.0020) model time 0.2346 (0.2422) loss 4.6598 (4.0800) grad_norm 2.4080 (2.1980) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][450/1251] eta 0:03:15 lr 0.001000 wd 0.0500 time 0.2447 (0.2442) data time 0.0010 (0.0020) model time 0.2436 (0.2422) loss 3.9226 (4.0775) grad_norm 1.1941 (2.1881) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][460/1251] eta 0:03:13 lr 0.001000 wd 0.0500 time 0.2392 (0.2441) data time 0.0010 (0.0019) model time 0.2382 (0.2422) loss 4.2569 
(4.0802) grad_norm 3.0004 (2.1939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][470/1251] eta 0:03:10 lr 0.001000 wd 0.0500 time 0.2421 (0.2442) data time 0.0011 (0.0019) model time 0.2410 (0.2422) loss 4.5889 (4.0814) grad_norm 2.4235 (2.2028) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][480/1251] eta 0:03:08 lr 0.001000 wd 0.0500 time 0.2419 (0.2441) data time 0.0009 (0.0019) model time 0.2411 (0.2422) loss 3.8239 (4.0776) grad_norm 3.0295 (2.2007) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][490/1251] eta 0:03:05 lr 0.001000 wd 0.0500 time 0.2401 (0.2441) data time 0.0008 (0.0019) model time 0.2393 (0.2422) loss 4.1198 (4.0785) grad_norm 1.5811 (2.1968) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][500/1251] eta 0:03:03 lr 0.001000 wd 0.0500 time 0.2380 (0.2441) data time 0.0009 (0.0019) model time 0.2371 (0.2422) loss 4.4835 (4.0792) grad_norm 1.6282 (2.1958) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][510/1251] eta 0:03:00 lr 0.001000 wd 0.0500 time 0.2303 (0.2440) data time 0.0012 (0.0018) model time 0.2291 (0.2422) loss 4.2956 (4.0854) grad_norm 1.8045 (2.1922) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][520/1251] eta 0:02:58 lr 0.001000 wd 0.0500 time 0.2392 (0.2440) data time 0.0009 (0.0018) model time 0.2383 (0.2421) loss 4.2308 (4.0770) grad_norm 2.0156 (2.1871) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][530/1251] eta 0:02:55 lr 0.001000 wd 
0.0500 time 0.2421 (0.2439) data time 0.0011 (0.0018) model time 0.2410 (0.2421) loss 4.6709 (4.0762) grad_norm 1.4270 (2.1846) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][540/1251] eta 0:02:53 lr 0.001000 wd 0.0500 time 0.2361 (0.2438) data time 0.0009 (0.0018) model time 0.2352 (0.2420) loss 4.9541 (4.0758) grad_norm 1.8650 (2.1837) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][550/1251] eta 0:02:50 lr 0.001000 wd 0.0500 time 0.2432 (0.2438) data time 0.0010 (0.0018) model time 0.2422 (0.2420) loss 4.1951 (4.0780) grad_norm 1.9497 (2.1830) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][560/1251] eta 0:02:48 lr 0.001000 wd 0.0500 time 0.2392 (0.2437) data time 0.0007 (0.0018) model time 0.2385 (0.2419) loss 4.7032 (4.0765) grad_norm 2.1492 (2.1822) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 03:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][570/1251] eta 0:02:45 lr 0.001000 wd 0.0500 time 0.2405 (0.2437) data time 0.0011 (0.0018) model time 0.2395 (0.2419) loss 4.4869 (4.0764) grad_norm 2.0364 (2.1895) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][580/1251] eta 0:02:43 lr 0.001000 wd 0.0500 time 0.2472 (0.2437) data time 0.0011 (0.0017) model time 0.2461 (0.2419) loss 4.7578 (4.0799) grad_norm 3.4169 (2.1919) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][590/1251] eta 0:02:41 lr 0.001000 wd 0.0500 time 0.2409 (0.2436) data time 0.0007 (0.0017) model time 0.2402 (0.2419) loss 5.0766 (4.0834) grad_norm 1.7231 (2.1861) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][600/1251] eta 0:02:38 lr 0.001000 wd 0.0500 time 0.2355 (0.2436) data time 0.0013 (0.0017) model time 0.2342 (0.2419) loss 4.1705 (4.0811) grad_norm 2.9097 (2.1872) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][610/1251] eta 0:02:36 lr 0.001000 wd 0.0500 time 0.2403 (0.2436) data time 0.0011 (0.0017) model time 0.2393 (0.2418) loss 4.5387 (4.0866) grad_norm 1.9032 (2.1865) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][620/1251] eta 0:02:33 lr 0.001000 wd 0.0500 time 0.2444 (0.2436) data time 0.0008 (0.0017) model time 0.2436 (0.2418) loss 4.6654 (4.0914) grad_norm 2.0252 (2.1874) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][630/1251] eta 0:02:31 lr 0.001000 wd 0.0500 time 0.2411 (0.2435) data time 0.0010 (0.0017) model time 0.2401 (0.2418) loss 3.6519 (4.0936) grad_norm 2.3841 (2.1896) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][640/1251] eta 0:02:28 lr 0.001000 wd 0.0500 time 0.2335 (0.2435) data time 0.0008 (0.0017) model time 0.2327 (0.2417) loss 3.2993 (4.0916) grad_norm 2.1564 (2.1953) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][650/1251] eta 0:02:26 lr 0.001000 wd 0.0500 time 0.2374 (0.2434) data time 0.0009 (0.0017) model time 0.2364 (0.2417) loss 4.1195 (4.0888) grad_norm 2.4058 (2.1948) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][660/1251] eta 0:02:23 lr 0.001000 wd 0.0500 time 0.2376 (0.2434) data time 0.0009 (0.0017) model time 0.2367 (0.2417) loss 3.9282 
(4.0910) grad_norm 2.4733 (2.1903) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][670/1251] eta 0:02:21 lr 0.001000 wd 0.0500 time 0.2372 (0.2434) data time 0.0008 (0.0016) model time 0.2364 (0.2417) loss 4.5010 (4.0886) grad_norm 1.6721 (2.1929) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][680/1251] eta 0:02:18 lr 0.001000 wd 0.0500 time 0.2413 (0.2434) data time 0.0011 (0.0016) model time 0.2402 (0.2417) loss 4.1665 (4.0890) grad_norm 2.5733 (2.1959) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][690/1251] eta 0:02:16 lr 0.001000 wd 0.0500 time 0.2427 (0.2433) data time 0.0010 (0.0016) model time 0.2417 (0.2417) loss 3.8361 (4.0902) grad_norm 1.4220 (2.2031) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][700/1251] eta 0:02:14 lr 0.001000 wd 0.0500 time 0.2369 (0.2433) data time 0.0008 (0.0016) model time 0.2361 (0.2416) loss 4.8629 (4.0956) grad_norm 1.4744 (2.1974) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][710/1251] eta 0:02:11 lr 0.001000 wd 0.0500 time 0.2358 (0.2433) data time 0.0009 (0.0016) model time 0.2348 (0.2416) loss 3.9144 (4.0924) grad_norm 1.6467 (2.1905) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][720/1251] eta 0:02:09 lr 0.001000 wd 0.0500 time 0.2404 (0.2432) data time 0.0009 (0.0016) model time 0.2394 (0.2416) loss 4.4945 (4.0937) grad_norm 2.9193 (2.1849) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][730/1251] eta 0:02:06 lr 0.001000 wd 
0.0500 time 0.2390 (0.2432) data time 0.0008 (0.0016) model time 0.2383 (0.2415) loss 3.3050 (4.0943) grad_norm 2.0627 (2.1824) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][740/1251] eta 0:02:04 lr 0.001000 wd 0.0500 time 0.2400 (0.2432) data time 0.0012 (0.0016) model time 0.2388 (0.2415) loss 3.9301 (4.0946) grad_norm 2.7295 (2.1798) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][750/1251] eta 0:02:01 lr 0.001000 wd 0.0500 time 0.2510 (0.2432) data time 0.0009 (0.0016) model time 0.2501 (0.2415) loss 4.8629 (4.0981) grad_norm 2.0183 (2.1783) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][760/1251] eta 0:01:59 lr 0.001000 wd 0.0500 time 0.2421 (0.2432) data time 0.0010 (0.0016) model time 0.2411 (0.2415) loss 4.4197 (4.1018) grad_norm 1.4150 (2.1760) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][770/1251] eta 0:01:56 lr 0.001000 wd 0.0500 time 0.2453 (0.2432) data time 0.0011 (0.0016) model time 0.2443 (0.2415) loss 4.2696 (4.1027) grad_norm 2.1148 (2.1717) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][780/1251] eta 0:01:54 lr 0.001000 wd 0.0500 time 0.2356 (0.2434) data time 0.0007 (0.0016) model time 0.2349 (0.2418) loss 4.6052 (4.1042) grad_norm 2.1643 (2.1786) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][790/1251] eta 0:01:52 lr 0.001000 wd 0.0500 time 0.2339 (0.2434) data time 0.0009 (0.0016) model time 0.2330 (0.2418) loss 3.5018 (4.1037) grad_norm 3.0495 (2.1821) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][800/1251] eta 0:01:49 lr 0.001000 wd 0.0500 time 0.2406 (0.2434) data time 0.0009 (0.0016) model time 0.2397 (0.2418) loss 5.0670 (4.1024) grad_norm 2.3179 (2.1830) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][810/1251] eta 0:01:47 lr 0.001000 wd 0.0500 time 0.2452 (0.2434) data time 0.0008 (0.0015) model time 0.2444 (0.2418) loss 2.8791 (4.0997) grad_norm 2.7175 (2.1800) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][820/1251] eta 0:01:44 lr 0.001000 wd 0.0500 time 0.2490 (0.2433) data time 0.0010 (0.0015) model time 0.2480 (0.2418) loss 4.6684 (4.1002) grad_norm 1.6576 (2.1789) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][830/1251] eta 0:01:42 lr 0.001000 wd 0.0500 time 0.2323 (0.2433) data time 0.0009 (0.0015) model time 0.2314 (0.2418) loss 3.9990 (4.1009) grad_norm 1.5610 (2.1745) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][840/1251] eta 0:01:40 lr 0.001000 wd 0.0500 time 0.2489 (0.2433) data time 0.0008 (0.0015) model time 0.2480 (0.2418) loss 3.7199 (4.0992) grad_norm 2.6053 (2.1732) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][850/1251] eta 0:01:37 lr 0.001000 wd 0.0500 time 0.2474 (0.2434) data time 0.0011 (0.0015) model time 0.2463 (0.2418) loss 4.4136 (4.0954) grad_norm 5.8595 (2.1779) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][860/1251] eta 0:01:35 lr 0.001000 wd 0.0500 time 0.2441 (0.2433) data time 0.0009 (0.0015) model time 0.2432 (0.2418) loss 4.4261 
(4.0956) grad_norm 2.4472 (2.1867) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][870/1251] eta 0:01:32 lr 0.001000 wd 0.0500 time 0.2393 (0.2433) data time 0.0011 (0.0015) model time 0.2382 (0.2418) loss 3.3130 (4.0960) grad_norm 1.6315 (2.1860) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][880/1251] eta 0:01:30 lr 0.001000 wd 0.0500 time 0.2424 (0.2433) data time 0.0008 (0.0015) model time 0.2416 (0.2418) loss 4.7328 (4.0962) grad_norm 2.6547 (2.1833) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][890/1251] eta 0:01:27 lr 0.001000 wd 0.0500 time 0.2430 (0.2433) data time 0.0010 (0.0015) model time 0.2421 (0.2418) loss 3.6755 (4.0956) grad_norm 1.8258 (2.1772) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][900/1251] eta 0:01:25 lr 0.001000 wd 0.0500 time 0.2382 (0.2433) data time 0.0008 (0.0015) model time 0.2374 (0.2418) loss 4.5708 (4.0930) grad_norm 1.6786 (2.1746) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][910/1251] eta 0:01:22 lr 0.001000 wd 0.0500 time 0.2458 (0.2433) data time 0.0007 (0.0015) model time 0.2451 (0.2418) loss 3.9240 (4.0908) grad_norm 1.8543 (2.1719) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][920/1251] eta 0:01:20 lr 0.001000 wd 0.0500 time 0.2308 (0.2433) data time 0.0011 (0.0015) model time 0.2297 (0.2418) loss 4.4371 (4.0858) grad_norm 1.6050 (2.1691) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][930/1251] eta 0:01:18 lr 0.001000 wd 
0.0500 time 0.2387 (0.2433) data time 0.0008 (0.0015) model time 0.2379 (0.2418) loss 4.2405 (4.0843) grad_norm 1.5053 (2.1650) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][940/1251] eta 0:01:15 lr 0.001000 wd 0.0500 time 0.2381 (0.2432) data time 0.0011 (0.0015) model time 0.2371 (0.2417) loss 3.3838 (4.0822) grad_norm 3.0754 (2.1651) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][950/1251] eta 0:01:13 lr 0.001000 wd 0.0500 time 0.2401 (0.2432) data time 0.0009 (0.0015) model time 0.2392 (0.2417) loss 4.1807 (4.0811) grad_norm 1.4581 (2.1657) loss_scale 16384.0000 (8278.1409) mem 7379MB [2024-08-26 04:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][960/1251] eta 0:01:10 lr 0.001000 wd 0.0500 time 0.2462 (0.2432) data time 0.0008 (0.0015) model time 0.2454 (0.2417) loss 3.0524 (4.0809) grad_norm 1.4692 (2.1650) loss_scale 16384.0000 (8362.4891) mem 7379MB [2024-08-26 04:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][970/1251] eta 0:01:08 lr 0.001000 wd 0.0500 time 0.2438 (0.2432) data time 0.0010 (0.0015) model time 0.2428 (0.2417) loss 3.9549 (4.0798) grad_norm 2.1550 (2.1638) loss_scale 16384.0000 (8445.0999) mem 7379MB [2024-08-26 04:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][980/1251] eta 0:01:05 lr 0.001000 wd 0.0500 time 0.2338 (0.2432) data time 0.0008 (0.0015) model time 0.2331 (0.2417) loss 4.1372 (4.0798) grad_norm 1.9268 (2.1671) loss_scale 16384.0000 (8526.0265) mem 7379MB [2024-08-26 04:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][990/1251] eta 0:01:03 lr 0.001000 wd 0.0500 time 0.2438 (0.2432) data time 0.0009 (0.0015) model time 0.2429 (0.2417) loss 4.0895 (4.0838) grad_norm 2.1900 (2.1657) loss_scale 16384.0000 (8605.3199) mem 7379MB [2024-08-26 04:01:42 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1000/1251] eta 0:01:01 lr 0.001000 wd 0.0500 time 0.2436 (0.2432) data time 0.0011 (0.0014) model time 0.2425 (0.2417) loss 4.1211 (4.0841) grad_norm 4.7660 (2.1682) loss_scale 16384.0000 (8683.0290) mem 7379MB [2024-08-26 04:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1010/1251] eta 0:00:58 lr 0.001000 wd 0.0500 time 0.2399 (0.2432) data time 0.0011 (0.0014) model time 0.2388 (0.2417) loss 4.2266 (4.0792) grad_norm 2.7235 (2.1679) loss_scale 16384.0000 (8759.2008) mem 7379MB [2024-08-26 04:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1020/1251] eta 0:00:56 lr 0.001000 wd 0.0500 time 0.2449 (0.2432) data time 0.0007 (0.0014) model time 0.2442 (0.2417) loss 3.5978 (4.0793) grad_norm 2.6209 (2.1706) loss_scale 16384.0000 (8833.8805) mem 7379MB [2024-08-26 04:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1030/1251] eta 0:00:53 lr 0.001000 wd 0.0500 time 0.2380 (0.2432) data time 0.0009 (0.0014) model time 0.2370 (0.2417) loss 4.5463 (4.0806) grad_norm 1.5506 (2.1678) loss_scale 16384.0000 (8907.1115) mem 7379MB [2024-08-26 04:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1040/1251] eta 0:00:51 lr 0.001000 wd 0.0500 time 0.2459 (0.2432) data time 0.0009 (0.0014) model time 0.2450 (0.2418) loss 3.0026 (4.0771) grad_norm 1.8540 (2.1652) loss_scale 16384.0000 (8978.9356) mem 7379MB [2024-08-26 04:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1050/1251] eta 0:00:48 lr 0.001000 wd 0.0500 time 0.2352 (0.2432) data time 0.0007 (0.0014) model time 0.2344 (0.2417) loss 3.2115 (4.0794) grad_norm 1.9070 (2.1690) loss_scale 16384.0000 (9049.3930) mem 7379MB [2024-08-26 04:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1060/1251] eta 0:00:46 lr 0.001000 wd 0.0500 time 0.2389 (0.2432) data time 0.0009 (0.0014) model time 0.2379 (0.2417) loss 
4.3839 (4.0796) grad_norm 1.5074 (2.1663) loss_scale 16384.0000 (9118.5221) mem 7379MB [2024-08-26 04:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1070/1251] eta 0:00:44 lr 0.001000 wd 0.0500 time 0.2430 (0.2432) data time 0.0008 (0.0014) model time 0.2422 (0.2417) loss 3.7494 (4.0795) grad_norm 3.1178 (2.1656) loss_scale 16384.0000 (9186.3604) mem 7379MB [2024-08-26 04:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1080/1251] eta 0:00:41 lr 0.001000 wd 0.0500 time 0.2354 (0.2431) data time 0.0011 (0.0014) model time 0.2343 (0.2417) loss 3.0767 (4.0781) grad_norm 1.3731 (2.1633) loss_scale 16384.0000 (9252.9436) mem 7379MB [2024-08-26 04:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1090/1251] eta 0:00:39 lr 0.001000 wd 0.0500 time 0.2432 (0.2431) data time 0.0010 (0.0014) model time 0.2422 (0.2417) loss 4.3735 (4.0783) grad_norm 2.7600 (2.1618) loss_scale 16384.0000 (9318.3061) mem 7379MB [2024-08-26 04:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1100/1251] eta 0:00:36 lr 0.001000 wd 0.0500 time 0.2399 (0.2431) data time 0.0007 (0.0014) model time 0.2392 (0.2417) loss 4.9589 (4.0782) grad_norm 3.4867 (2.1659) loss_scale 16384.0000 (9382.4814) mem 7379MB [2024-08-26 04:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1110/1251] eta 0:00:34 lr 0.001000 wd 0.0500 time 0.2376 (0.2433) data time 0.0011 (0.0014) model time 0.2366 (0.2419) loss 4.4506 (4.0794) grad_norm 1.7038 (2.1665) loss_scale 16384.0000 (9445.5014) mem 7379MB [2024-08-26 04:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1120/1251] eta 0:00:31 lr 0.001000 wd 0.0500 time 0.2594 (0.2435) data time 0.0011 (0.0014) model time 0.2582 (0.2421) loss 3.7847 (4.0774) grad_norm 1.7636 (2.1638) loss_scale 16384.0000 (9507.3970) mem 7379MB [2024-08-26 04:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1130/1251] eta 
0:00:29 lr 0.001000 wd 0.0500 time 0.2431 (0.2435) data time 0.0009 (0.0014) model time 0.2422 (0.2421) loss 2.8930 (4.0769) grad_norm 2.4732 (2.1678) loss_scale 16384.0000 (9568.1981) mem 7379MB [2024-08-26 04:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1140/1251] eta 0:00:27 lr 0.001000 wd 0.0500 time 0.2456 (0.2435) data time 0.0008 (0.0014) model time 0.2449 (0.2420) loss 3.0996 (4.0742) grad_norm 1.6360 (2.1682) loss_scale 16384.0000 (9627.9334) mem 7379MB [2024-08-26 04:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1150/1251] eta 0:00:24 lr 0.001000 wd 0.0500 time 0.2357 (0.2434) data time 0.0009 (0.0014) model time 0.2348 (0.2420) loss 4.4049 (4.0759) grad_norm 1.6436 (2.1694) loss_scale 16384.0000 (9686.6308) mem 7379MB [2024-08-26 04:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1160/1251] eta 0:00:22 lr 0.001000 wd 0.0500 time 0.2372 (0.2434) data time 0.0011 (0.0014) model time 0.2362 (0.2420) loss 3.7173 (4.0743) grad_norm 2.1122 (2.1701) loss_scale 16384.0000 (9744.3170) mem 7379MB [2024-08-26 04:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1170/1251] eta 0:00:19 lr 0.001000 wd 0.0500 time 0.2485 (0.2434) data time 0.0010 (0.0014) model time 0.2475 (0.2420) loss 3.7911 (4.0747) grad_norm 1.7647 (2.1674) loss_scale 16384.0000 (9801.0179) mem 7379MB [2024-08-26 04:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1180/1251] eta 0:00:17 lr 0.001000 wd 0.0500 time 0.2427 (0.2434) data time 0.0007 (0.0014) model time 0.2420 (0.2420) loss 5.0934 (4.0768) grad_norm 2.4950 (2.1658) loss_scale 16384.0000 (9856.7587) mem 7379MB [2024-08-26 04:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1190/1251] eta 0:00:14 lr 0.001000 wd 0.0500 time 0.2458 (0.2434) data time 0.0009 (0.0014) model time 0.2448 (0.2420) loss 4.5840 (4.0779) grad_norm 2.2063 (2.1640) loss_scale 16384.0000 (9911.5634) mem 
7379MB [2024-08-26 04:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1200/1251] eta 0:00:12 lr 0.001000 wd 0.0500 time 0.2409 (0.2434) data time 0.0008 (0.0014) model time 0.2401 (0.2420) loss 4.3664 (4.0756) grad_norm 2.3818 (2.1625) loss_scale 16384.0000 (9965.4555) mem 7379MB [2024-08-26 04:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1210/1251] eta 0:00:09 lr 0.001000 wd 0.0500 time 0.2384 (0.2434) data time 0.0009 (0.0014) model time 0.2375 (0.2420) loss 3.5719 (4.0760) grad_norm 1.6596 (2.1632) loss_scale 16384.0000 (10018.4575) mem 7379MB [2024-08-26 04:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1220/1251] eta 0:00:07 lr 0.001000 wd 0.0500 time 0.2430 (0.2433) data time 0.0011 (0.0014) model time 0.2419 (0.2420) loss 4.4200 (4.0753) grad_norm 1.5881 (2.1656) loss_scale 16384.0000 (10070.5913) mem 7379MB [2024-08-26 04:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1230/1251] eta 0:00:05 lr 0.001000 wd 0.0500 time 0.2403 (0.2433) data time 0.0009 (0.0014) model time 0.2393 (0.2420) loss 2.7139 (4.0736) grad_norm 2.3319 (2.1664) loss_scale 16384.0000 (10121.8781) mem 7379MB [2024-08-26 04:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1240/1251] eta 0:00:02 lr 0.001000 wd 0.0500 time 0.2226 (0.2433) data time 0.0005 (0.0014) model time 0.2221 (0.2419) loss 2.9681 (4.0732) grad_norm 1.7582 (2.1632) loss_scale 16384.0000 (10172.3384) mem 7379MB [2024-08-26 04:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [21/300][1250/1251] eta 0:00:00 lr 0.001000 wd 0.0500 time 0.2271 (0.2431) data time 0.0007 (0.0014) model time 0.2264 (0.2418) loss 4.5313 (4.0739) grad_norm 2.8980 (2.1610) loss_scale 16384.0000 (10221.9920) mem 7379MB [2024-08-26 04:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 21 training takes 0:05:04 [2024-08-26 04:02:43 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:02:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.403 (0.403) Loss 0.7388 (0.7388) Acc@1 84.277 (84.277) Acc@5 96.191 (96.191) Mem 7379MB [2024-08-26 04:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.106) Loss 1.0811 (1.1202) Acc@1 74.316 (74.015) Acc@5 93.750 (92.889) Mem 7379MB [2024-08-26 04:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.093) Loss 1.6924 (1.1466) Acc@1 61.230 (73.419) Acc@5 85.645 (92.769) Mem 7379MB [2024-08-26 04:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.087) Loss 1.9697 (1.3136) Acc@1 56.934 (70.020) Acc@5 79.395 (90.455) Mem 7379MB [2024-08-26 04:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 1.9463 (1.4171) Acc@1 56.348 (67.940) Acc@5 81.348 (88.896) Mem 7379MB [2024-08-26 04:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 67.722 Acc@5 88.712 [2024-08-26 04:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 67.7% [2024-08-26 04:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 67.72% [2024-08-26 04:02:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:02:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.434 (0.434) Loss 0.8252 (0.8252) Acc@1 80.566 (80.566) Acc@5 94.336 (94.336) Mem 7379MB [2024-08-26 04:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.108) Loss 1.2158 (1.2372) Acc@1 72.363 (70.073) Acc@5 91.602 (90.687) Mem 7379MB [2024-08-26 04:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.093) Loss 1.7871 (1.2435) Acc@1 59.180 (69.982) Acc@5 83.203 (90.792) Mem 7379MB [2024-08-26 04:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.087) Loss 2.0059 (1.4268) Acc@1 56.152 (66.564) Acc@5 77.734 (88.064) Mem 7379MB [2024-08-26 04:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 2.0820 (1.5414) Acc@1 53.125 (64.375) Acc@5 79.395 (86.359) Mem 7379MB [2024-08-26 04:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 64.288 Acc@5 86.300 [2024-08-26 04:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 64.3% [2024-08-26 04:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 64.29% [2024-08-26 04:02:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:02:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][0/1251] eta 0:14:42 lr 0.001000 wd 0.0500 time 0.7053 (0.7053) data time 0.4844 (0.4844) model time 0.0000 (0.0000) loss 3.7858 (3.7858) grad_norm 2.2508 (2.2508) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][10/1251] eta 0:05:50 lr 0.001000 wd 0.0500 time 0.2421 (0.2828) data time 0.0008 (0.0449) model time 0.0000 (0.0000) loss 4.8209 (4.2884) grad_norm 2.6142 (2.5395) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][20/1251] eta 0:05:24 lr 0.001000 wd 0.0500 time 0.2474 (0.2633) data time 0.0010 (0.0240) model time 0.0000 (0.0000) loss 3.1802 (4.1716) grad_norm 2.0550 (2.1817) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][30/1251] eta 0:05:13 lr 0.001000 wd 0.0500 time 0.2441 (0.2564) data time 0.0009 (0.0167) model time 0.0000 (0.0000) loss 4.8836 (4.1766) grad_norm 1.7415 (2.3117) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][40/1251] eta 0:05:05 lr 0.001000 wd 0.0500 time 0.2359 (0.2520) data time 0.0010 (0.0129) model time 0.0000 (0.0000) loss 4.5452 (4.2239) grad_norm 1.5373 (2.2644) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][50/1251] eta 0:04:59 lr 0.001000 wd 0.0500 time 0.2444 (0.2497) data time 0.0007 (0.0106) model time 0.0000 (0.0000) loss 2.6605 (4.1922) grad_norm 1.5776 (2.2319) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][60/1251] eta 0:04:55 lr 0.001000 wd 0.0500 time 0.2448 (0.2481) data time 0.0010 (0.0090) model time 0.2438 
(0.2389) loss 3.8775 (4.1813) grad_norm 1.9975 (2.1745) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][70/1251] eta 0:04:52 lr 0.001000 wd 0.0500 time 0.2421 (0.2473) data time 0.0009 (0.0079) model time 0.2413 (0.2403) loss 4.3436 (4.1884) grad_norm 1.3434 (2.2067) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][80/1251] eta 0:04:48 lr 0.001000 wd 0.0500 time 0.2368 (0.2465) data time 0.0009 (0.0070) model time 0.2359 (0.2402) loss 3.4731 (4.1475) grad_norm 4.8361 (2.3310) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][90/1251] eta 0:04:45 lr 0.001000 wd 0.0500 time 0.2456 (0.2459) data time 0.0008 (0.0064) model time 0.2448 (0.2402) loss 4.4690 (4.1178) grad_norm 1.9210 (2.2872) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][100/1251] eta 0:04:42 lr 0.001000 wd 0.0500 time 0.2403 (0.2456) data time 0.0009 (0.0058) model time 0.2394 (0.2405) loss 5.0678 (4.1469) grad_norm 2.1756 (2.2623) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][110/1251] eta 0:04:39 lr 0.001000 wd 0.0500 time 0.2345 (0.2449) data time 0.0010 (0.0054) model time 0.2335 (0.2399) loss 4.1977 (4.1437) grad_norm 4.0608 (2.2522) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][120/1251] eta 0:04:36 lr 0.001000 wd 0.0500 time 0.2456 (0.2447) data time 0.0007 (0.0050) model time 0.2449 (0.2401) loss 3.2594 (4.1338) grad_norm 2.7908 (2.2414) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[22/300][130/1251] eta 0:04:33 lr 0.001000 wd 0.0500 time 0.2369 (0.2443) data time 0.0011 (0.0047) model time 0.2358 (0.2398) loss 3.1497 (4.1196) grad_norm 3.4909 (2.2910) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][140/1251] eta 0:04:31 lr 0.001000 wd 0.0500 time 0.2496 (0.2441) data time 0.0010 (0.0044) model time 0.2485 (0.2401) loss 3.7158 (4.0945) grad_norm 2.7351 (2.3097) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][150/1251] eta 0:04:28 lr 0.001000 wd 0.0500 time 0.2412 (0.2439) data time 0.0007 (0.0042) model time 0.2405 (0.2400) loss 2.9157 (4.0886) grad_norm 1.2647 (2.3065) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][160/1251] eta 0:04:25 lr 0.001000 wd 0.0500 time 0.2419 (0.2436) data time 0.0007 (0.0040) model time 0.2412 (0.2398) loss 3.7705 (4.0804) grad_norm 1.6612 (2.2885) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][170/1251] eta 0:04:23 lr 0.001000 wd 0.0500 time 0.2368 (0.2435) data time 0.0009 (0.0038) model time 0.2359 (0.2399) loss 5.0936 (4.0885) grad_norm 2.8207 (2.2696) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][180/1251] eta 0:04:20 lr 0.001000 wd 0.0500 time 0.2391 (0.2433) data time 0.0013 (0.0037) model time 0.2379 (0.2398) loss 4.4375 (4.0851) grad_norm 1.4494 (2.2832) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][190/1251] eta 0:04:17 lr 0.001000 wd 0.0500 time 0.2439 (0.2431) data time 0.0010 (0.0035) model time 0.2430 (0.2398) loss 4.6019 (4.0981) grad_norm 2.0239 (2.2816) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][200/1251] eta 0:04:15 lr 0.001000 wd 0.0500 time 0.2412 (0.2430) data time 0.0010 (0.0034) model time 0.2402 (0.2397) loss 3.6571 (4.0974) grad_norm 1.5615 (2.2629) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][210/1251] eta 0:04:12 lr 0.001000 wd 0.0500 time 0.2474 (0.2430) data time 0.0010 (0.0033) model time 0.2464 (0.2398) loss 4.3812 (4.0898) grad_norm 2.0730 (2.2514) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][220/1251] eta 0:04:10 lr 0.001000 wd 0.0500 time 0.2513 (0.2430) data time 0.0011 (0.0032) model time 0.2502 (0.2399) loss 4.4853 (4.0887) grad_norm 1.8930 (2.2418) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][230/1251] eta 0:04:08 lr 0.001000 wd 0.0500 time 0.2386 (0.2430) data time 0.0009 (0.0031) model time 0.2377 (0.2400) loss 3.5024 (4.0863) grad_norm 2.1523 (2.2385) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][240/1251] eta 0:04:05 lr 0.001000 wd 0.0500 time 0.2397 (0.2429) data time 0.0009 (0.0030) model time 0.2388 (0.2401) loss 3.3039 (4.0730) grad_norm 3.3273 (2.2381) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][250/1251] eta 0:04:03 lr 0.001000 wd 0.0500 time 0.2445 (0.2429) data time 0.0010 (0.0030) model time 0.2435 (0.2401) loss 4.0613 (4.0698) grad_norm 3.7868 (2.2519) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][260/1251] eta 0:04:00 lr 0.001000 wd 0.0500 time 0.2452 (0.2428) 
data time 0.0009 (0.0029) model time 0.2443 (0.2401) loss 4.9562 (4.0700) grad_norm 1.4813 (2.2821) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][270/1251] eta 0:03:58 lr 0.001000 wd 0.0500 time 0.2483 (0.2428) data time 0.0010 (0.0028) model time 0.2474 (0.2402) loss 3.0146 (4.0632) grad_norm 1.6360 (2.2637) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][280/1251] eta 0:03:55 lr 0.001000 wd 0.0500 time 0.2513 (0.2428) data time 0.0010 (0.0028) model time 0.2502 (0.2403) loss 3.9189 (4.0630) grad_norm 2.3070 (2.2664) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][290/1251] eta 0:03:53 lr 0.001000 wd 0.0500 time 0.2382 (0.2428) data time 0.0011 (0.0027) model time 0.2372 (0.2402) loss 4.3454 (4.0608) grad_norm 2.3624 (2.2666) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][300/1251] eta 0:03:50 lr 0.001000 wd 0.0500 time 0.2442 (0.2427) data time 0.0011 (0.0027) model time 0.2431 (0.2402) loss 4.2060 (4.0672) grad_norm 2.0653 (2.2568) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][310/1251] eta 0:03:48 lr 0.001000 wd 0.0500 time 0.2350 (0.2427) data time 0.0010 (0.0026) model time 0.2340 (0.2403) loss 4.7869 (4.0756) grad_norm 1.3905 (2.2428) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][320/1251] eta 0:03:45 lr 0.001000 wd 0.0500 time 0.2374 (0.2427) data time 0.0009 (0.0026) model time 0.2365 (0.2403) loss 4.5267 (4.0717) grad_norm 1.8652 (2.2251) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [22/300][330/1251] eta 0:03:43 lr 0.001000 wd 0.0500 time 0.2418 (0.2427) data time 0.0009 (0.0025) model time 0.2410 (0.2403) loss 3.5229 (4.0662) grad_norm 1.9317 (2.2152) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][340/1251] eta 0:03:41 lr 0.001000 wd 0.0500 time 0.2622 (0.2427) data time 0.0009 (0.0025) model time 0.2613 (0.2405) loss 3.0853 (4.0634) grad_norm 1.8922 (2.1996) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][350/1251] eta 0:03:38 lr 0.001000 wd 0.0500 time 0.2395 (0.2427) data time 0.0012 (0.0024) model time 0.2383 (0.2405) loss 3.9257 (4.0586) grad_norm 2.2905 (2.2005) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][360/1251] eta 0:03:36 lr 0.001000 wd 0.0500 time 0.2425 (0.2427) data time 0.0011 (0.0024) model time 0.2415 (0.2405) loss 4.6649 (4.0598) grad_norm 1.7124 (2.1905) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][370/1251] eta 0:03:33 lr 0.001000 wd 0.0500 time 0.2357 (0.2427) data time 0.0009 (0.0024) model time 0.2348 (0.2405) loss 2.8155 (4.0670) grad_norm 1.6795 (2.1805) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][380/1251] eta 0:03:31 lr 0.001000 wd 0.0500 time 0.2362 (0.2426) data time 0.0011 (0.0024) model time 0.2351 (0.2405) loss 4.5544 (4.0686) grad_norm 1.4600 (2.1777) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][390/1251] eta 0:03:28 lr 0.001000 wd 0.0500 time 0.2387 (0.2426) data time 0.0011 (0.0023) model time 0.2376 (0.2405) loss 4.0839 (4.0757) 
grad_norm 1.9814 (2.1758) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][400/1251] eta 0:03:26 lr 0.001000 wd 0.0500 time 0.2493 (0.2425) data time 0.0009 (0.0023) model time 0.2484 (0.2404) loss 3.6095 (4.0779) grad_norm 2.2162 (2.1799) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][410/1251] eta 0:03:23 lr 0.001000 wd 0.0500 time 0.2347 (0.2425) data time 0.0010 (0.0023) model time 0.2337 (0.2404) loss 4.2105 (4.0751) grad_norm 2.5019 (2.1771) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][420/1251] eta 0:03:21 lr 0.001000 wd 0.0500 time 0.2403 (0.2425) data time 0.0013 (0.0022) model time 0.2390 (0.2404) loss 4.3388 (4.0789) grad_norm 3.6332 (2.1828) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][430/1251] eta 0:03:19 lr 0.001000 wd 0.0500 time 0.2497 (0.2425) data time 0.0008 (0.0022) model time 0.2489 (0.2405) loss 4.7056 (4.0814) grad_norm 2.1476 (2.1951) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][440/1251] eta 0:03:16 lr 0.001000 wd 0.0500 time 0.2356 (0.2425) data time 0.0011 (0.0022) model time 0.2345 (0.2404) loss 4.1926 (4.0838) grad_norm 1.4997 (2.1924) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][450/1251] eta 0:03:14 lr 0.001000 wd 0.0500 time 0.2374 (0.2425) data time 0.0010 (0.0021) model time 0.2365 (0.2405) loss 4.3801 (4.0851) grad_norm 2.1383 (2.1944) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][460/1251] eta 0:03:11 lr 
0.001000 wd 0.0500 time 0.2341 (0.2425) data time 0.0010 (0.0021) model time 0.2331 (0.2405) loss 4.4114 (4.0847) grad_norm 1.9850 (2.1979) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][470/1251] eta 0:03:09 lr 0.001000 wd 0.0500 time 0.2363 (0.2428) data time 0.0009 (0.0021) model time 0.2355 (0.2409) loss 5.2638 (4.0854) grad_norm 2.0190 (2.2056) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][480/1251] eta 0:03:07 lr 0.001000 wd 0.0500 time 0.2423 (0.2428) data time 0.0010 (0.0021) model time 0.2413 (0.2409) loss 3.9671 (4.0868) grad_norm 1.5925 (2.2110) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][490/1251] eta 0:03:04 lr 0.001000 wd 0.0500 time 0.2447 (0.2428) data time 0.0007 (0.0021) model time 0.2440 (0.2409) loss 3.5930 (4.0856) grad_norm 1.7300 (2.2069) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][500/1251] eta 0:03:02 lr 0.001000 wd 0.0500 time 0.2412 (0.2427) data time 0.0007 (0.0020) model time 0.2404 (0.2409) loss 3.1679 (4.0853) grad_norm 2.0229 (2.1949) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][510/1251] eta 0:02:59 lr 0.001000 wd 0.0500 time 0.2392 (0.2427) data time 0.0009 (0.0020) model time 0.2383 (0.2409) loss 3.3569 (4.0824) grad_norm 2.3035 (2.1958) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][520/1251] eta 0:02:57 lr 0.001000 wd 0.0500 time 0.2311 (0.2427) data time 0.0011 (0.0020) model time 0.2301 (0.2408) loss 4.6712 (4.0851) grad_norm 2.7635 (2.2029) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 04:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][530/1251] eta 0:02:54 lr 0.001000 wd 0.0500 time 0.2476 (0.2427) data time 0.0010 (0.0020) model time 0.2466 (0.2409) loss 4.0737 (4.0858) grad_norm 1.5551 (2.1968) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][540/1251] eta 0:02:52 lr 0.001000 wd 0.0500 time 0.2386 (0.2427) data time 0.0009 (0.0020) model time 0.2377 (0.2409) loss 3.2245 (4.0848) grad_norm 1.1939 (2.1909) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][550/1251] eta 0:02:50 lr 0.001000 wd 0.0500 time 0.2417 (0.2427) data time 0.0014 (0.0020) model time 0.2403 (0.2409) loss 2.8360 (4.0886) grad_norm 2.3126 (2.1899) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][560/1251] eta 0:02:47 lr 0.001000 wd 0.0500 time 0.2346 (0.2426) data time 0.0011 (0.0019) model time 0.2334 (0.2409) loss 4.1647 (4.0870) grad_norm 2.2131 (2.1889) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][570/1251] eta 0:02:45 lr 0.001000 wd 0.0500 time 0.2437 (0.2426) data time 0.0008 (0.0019) model time 0.2430 (0.2408) loss 4.9867 (4.0902) grad_norm 1.9834 (2.1847) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][580/1251] eta 0:02:42 lr 0.001000 wd 0.0500 time 0.2435 (0.2426) data time 0.0011 (0.0019) model time 0.2424 (0.2408) loss 4.7367 (4.0895) grad_norm 2.1512 (2.1842) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][590/1251] eta 0:02:40 lr 0.001000 wd 0.0500 time 0.2425 (0.2426) data time 0.0010 (0.0020) model time 
0.2415 (0.2408) loss 4.6579 (4.0905) grad_norm 2.8908 (2.1825) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][600/1251] eta 0:02:37 lr 0.001000 wd 0.0500 time 0.2253 (0.2426) data time 0.0008 (0.0020) model time 0.2244 (0.2408) loss 4.4387 (4.0909) grad_norm 1.9381 (2.1832) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][610/1251] eta 0:02:35 lr 0.001000 wd 0.0500 time 0.2423 (0.2426) data time 0.0009 (0.0020) model time 0.2414 (0.2408) loss 4.2809 (4.0902) grad_norm 1.9092 (2.1863) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][620/1251] eta 0:02:33 lr 0.001000 wd 0.0500 time 0.2467 (0.2426) data time 0.0008 (0.0020) model time 0.2459 (0.2408) loss 4.0194 (4.0834) grad_norm 1.6269 (2.1888) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][630/1251] eta 0:02:30 lr 0.001000 wd 0.0500 time 0.2408 (0.2426) data time 0.0010 (0.0019) model time 0.2398 (0.2408) loss 4.7148 (4.0870) grad_norm 1.4526 (2.1858) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][640/1251] eta 0:02:28 lr 0.001000 wd 0.0500 time 0.2367 (0.2426) data time 0.0008 (0.0019) model time 0.2358 (0.2408) loss 2.3322 (4.0839) grad_norm 1.9679 (2.1891) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][650/1251] eta 0:02:25 lr 0.001000 wd 0.0500 time 0.2451 (0.2426) data time 0.0008 (0.0019) model time 0.2443 (0.2408) loss 3.1932 (4.0810) grad_norm 2.3304 (2.1914) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[22/300][660/1251] eta 0:02:23 lr 0.001000 wd 0.0500 time 0.2485 (0.2426) data time 0.0010 (0.0019) model time 0.2475 (0.2408) loss 3.9343 (4.0833) grad_norm 1.5882 (2.1870) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][670/1251] eta 0:02:20 lr 0.001000 wd 0.0500 time 0.2383 (0.2426) data time 0.0008 (0.0019) model time 0.2375 (0.2408) loss 2.8545 (4.0773) grad_norm 1.4274 (2.1792) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][680/1251] eta 0:02:18 lr 0.001000 wd 0.0500 time 0.2459 (0.2426) data time 0.0011 (0.0019) model time 0.2447 (0.2409) loss 3.8458 (4.0788) grad_norm 1.9219 (2.1750) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][690/1251] eta 0:02:16 lr 0.001000 wd 0.0500 time 0.2419 (0.2429) data time 0.0010 (0.0019) model time 0.2410 (0.2412) loss 4.3000 (4.0764) grad_norm 1.7369 (2.1741) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][700/1251] eta 0:02:14 lr 0.001000 wd 0.0500 time 0.2332 (0.2432) data time 0.0010 (0.0019) model time 0.2322 (0.2415) loss 3.6686 (4.0735) grad_norm 2.2843 (2.1761) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][710/1251] eta 0:02:11 lr 0.001000 wd 0.0500 time 0.2408 (0.2432) data time 0.0009 (0.0019) model time 0.2398 (0.2415) loss 3.8989 (4.0705) grad_norm 1.6300 (2.1736) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][720/1251] eta 0:02:09 lr 0.001000 wd 0.0500 time 0.2439 (0.2432) data time 0.0007 (0.0019) model time 0.2432 (0.2415) loss 4.4725 (4.0715) grad_norm 1.9502 (2.1708) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][730/1251] eta 0:02:06 lr 0.001000 wd 0.0500 time 0.2492 (0.2432) data time 0.0007 (0.0018) model time 0.2484 (0.2415) loss 3.2306 (4.0694) grad_norm 1.8277 (2.1671) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][740/1251] eta 0:02:04 lr 0.001000 wd 0.0500 time 0.2415 (0.2432) data time 0.0012 (0.0018) model time 0.2403 (0.2415) loss 3.6605 (4.0700) grad_norm 2.8845 (2.1641) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][750/1251] eta 0:02:01 lr 0.001000 wd 0.0500 time 0.2418 (0.2432) data time 0.0009 (0.0018) model time 0.2409 (0.2415) loss 4.4804 (4.0691) grad_norm 2.6960 (2.1636) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][760/1251] eta 0:01:59 lr 0.001000 wd 0.0500 time 0.2469 (0.2432) data time 0.0011 (0.0018) model time 0.2457 (0.2416) loss 4.6062 (4.0658) grad_norm 2.0975 (2.1601) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][770/1251] eta 0:01:56 lr 0.001000 wd 0.0500 time 0.2317 (0.2432) data time 0.0009 (0.0018) model time 0.2308 (0.2416) loss 4.6597 (4.0689) grad_norm 1.7309 (2.1587) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][780/1251] eta 0:01:54 lr 0.001000 wd 0.0500 time 0.2384 (0.2432) data time 0.0011 (0.0018) model time 0.2373 (0.2415) loss 4.2091 (4.0694) grad_norm 2.3479 (2.1578) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][790/1251] eta 0:01:52 lr 0.001000 wd 0.0500 time 0.2369 (0.2432) 
data time 0.0009 (0.0018) model time 0.2360 (0.2416) loss 4.9750 (4.0723) grad_norm 1.9320 (2.1575) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][800/1251] eta 0:01:49 lr 0.001000 wd 0.0500 time 0.2457 (0.2432) data time 0.0008 (0.0018) model time 0.2449 (0.2416) loss 3.8039 (4.0713) grad_norm 1.5923 (2.1544) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][810/1251] eta 0:01:47 lr 0.001000 wd 0.0500 time 0.2422 (0.2432) data time 0.0009 (0.0018) model time 0.2413 (0.2416) loss 3.2104 (4.0705) grad_norm 3.4854 (2.1527) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][820/1251] eta 0:01:44 lr 0.001000 wd 0.0500 time 0.2413 (0.2432) data time 0.0010 (0.0018) model time 0.2403 (0.2416) loss 4.4757 (4.0694) grad_norm 2.2711 (2.1534) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][830/1251] eta 0:01:42 lr 0.001000 wd 0.0500 time 0.2423 (0.2432) data time 0.0008 (0.0017) model time 0.2416 (0.2416) loss 3.9874 (4.0721) grad_norm 1.3395 (2.1482) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][840/1251] eta 0:01:39 lr 0.001000 wd 0.0500 time 0.2521 (0.2432) data time 0.0008 (0.0017) model time 0.2513 (0.2416) loss 4.7462 (4.0710) grad_norm 2.2113 (2.1466) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][850/1251] eta 0:01:37 lr 0.001000 wd 0.0500 time 0.2454 (0.2432) data time 0.0009 (0.0017) model time 0.2445 (0.2416) loss 4.3852 (4.0737) grad_norm 1.8487 (2.1431) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:23 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [22/300][860/1251] eta 0:01:35 lr 0.001000 wd 0.0500 time 0.2437 (0.2432) data time 0.0008 (0.0017) model time 0.2429 (0.2416) loss 5.1933 (4.0782) grad_norm 3.0671 (2.1407) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][870/1251] eta 0:01:32 lr 0.001000 wd 0.0500 time 0.2472 (0.2431) data time 0.0009 (0.0017) model time 0.2463 (0.2416) loss 3.5555 (4.0753) grad_norm 1.3784 (2.1379) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][880/1251] eta 0:01:30 lr 0.001000 wd 0.0500 time 0.2454 (0.2431) data time 0.0008 (0.0017) model time 0.2447 (0.2415) loss 3.3765 (4.0712) grad_norm 1.8772 (2.1354) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][890/1251] eta 0:01:27 lr 0.001000 wd 0.0500 time 0.2403 (0.2431) data time 0.0010 (0.0017) model time 0.2392 (0.2415) loss 3.9723 (4.0671) grad_norm 3.1808 (2.1335) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][900/1251] eta 0:01:25 lr 0.001000 wd 0.0500 time 0.2381 (0.2431) data time 0.0011 (0.0017) model time 0.2371 (0.2415) loss 4.5613 (4.0681) grad_norm 1.7264 (2.1341) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][910/1251] eta 0:01:22 lr 0.001000 wd 0.0500 time 0.2408 (0.2431) data time 0.0012 (0.0017) model time 0.2397 (0.2415) loss 4.3373 (4.0624) grad_norm 2.1323 (2.1347) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][920/1251] eta 0:01:20 lr 0.001000 wd 0.0500 time 0.2378 (0.2431) data time 0.0010 (0.0017) model time 0.2368 (0.2415) loss 3.6009 (4.0638) 
grad_norm 1.7655 (2.1373) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][930/1251] eta 0:01:18 lr 0.001000 wd 0.0500 time 0.2328 (0.2430) data time 0.0009 (0.0017) model time 0.2319 (0.2415) loss 4.0758 (4.0640) grad_norm 1.4091 (2.1358) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][940/1251] eta 0:01:15 lr 0.001000 wd 0.0500 time 0.2394 (0.2430) data time 0.0009 (0.0017) model time 0.2384 (0.2415) loss 4.4018 (4.0633) grad_norm 5.8085 (2.1448) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][950/1251] eta 0:01:13 lr 0.001000 wd 0.0500 time 0.2373 (0.2430) data time 0.0007 (0.0017) model time 0.2366 (0.2414) loss 2.7949 (4.0625) grad_norm 2.1154 (2.1492) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][960/1251] eta 0:01:10 lr 0.001000 wd 0.0500 time 0.2407 (0.2430) data time 0.0008 (0.0016) model time 0.2398 (0.2415) loss 3.5377 (4.0643) grad_norm 2.1756 (2.1486) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][970/1251] eta 0:01:08 lr 0.001000 wd 0.0500 time 0.2343 (0.2430) data time 0.0007 (0.0016) model time 0.2336 (0.2414) loss 4.5696 (4.0638) grad_norm 2.9586 (2.1503) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][980/1251] eta 0:01:05 lr 0.001000 wd 0.0500 time 0.2398 (0.2429) data time 0.0010 (0.0016) model time 0.2388 (0.2414) loss 4.4450 (4.0653) grad_norm 2.4057 (2.1534) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][990/1251] eta 0:01:03 lr 
0.001000 wd 0.0500 time 0.2344 (0.2429) data time 0.0011 (0.0016) model time 0.2334 (0.2414) loss 2.9120 (4.0657) grad_norm 1.2187 (2.1517) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1000/1251] eta 0:01:01 lr 0.001000 wd 0.0500 time 0.2531 (0.2432) data time 0.0008 (0.0016) model time 0.2523 (0.2417) loss 5.0787 (4.0637) grad_norm 1.5959 (2.1494) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1010/1251] eta 0:00:58 lr 0.001000 wd 0.0500 time 0.2363 (0.2431) data time 0.0011 (0.0016) model time 0.2352 (0.2416) loss 4.3162 (4.0627) grad_norm 2.2511 (2.1519) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1020/1251] eta 0:00:56 lr 0.001000 wd 0.0500 time 0.2629 (0.2432) data time 0.0009 (0.0016) model time 0.2620 (0.2417) loss 4.9920 (4.0638) grad_norm 2.0014 (2.1563) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1030/1251] eta 0:00:53 lr 0.001000 wd 0.0500 time 0.2347 (0.2431) data time 0.0010 (0.0016) model time 0.2337 (0.2417) loss 3.2866 (4.0636) grad_norm 1.3423 (2.1541) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1040/1251] eta 0:00:51 lr 0.001000 wd 0.0500 time 0.2467 (0.2431) data time 0.0008 (0.0016) model time 0.2459 (0.2417) loss 2.8655 (4.0595) grad_norm 1.9555 (2.1541) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1050/1251] eta 0:00:48 lr 0.001000 wd 0.0500 time 0.2441 (0.2431) data time 0.0009 (0.0016) model time 0.2431 (0.2417) loss 3.3179 (4.0602) grad_norm 1.7894 (2.1493) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 04:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1060/1251] eta 0:00:46 lr 0.001000 wd 0.0500 time 0.2459 (0.2431) data time 0.0010 (0.0016) model time 0.2449 (0.2417) loss 5.0584 (4.0616) grad_norm 1.9419 (2.1465) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1070/1251] eta 0:00:44 lr 0.001000 wd 0.0500 time 0.2506 (0.2431) data time 0.0009 (0.0016) model time 0.2497 (0.2417) loss 3.9476 (4.0596) grad_norm 1.3508 (2.1422) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1080/1251] eta 0:00:41 lr 0.001000 wd 0.0500 time 0.2350 (0.2431) data time 0.0010 (0.0016) model time 0.2339 (0.2417) loss 4.2803 (4.0588) grad_norm 1.8300 (2.1410) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1090/1251] eta 0:00:39 lr 0.001000 wd 0.0500 time 0.2372 (0.2431) data time 0.0008 (0.0016) model time 0.2364 (0.2417) loss 3.8661 (4.0572) grad_norm 1.9368 (2.1449) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1100/1251] eta 0:00:36 lr 0.001000 wd 0.0500 time 0.2518 (0.2431) data time 0.0008 (0.0016) model time 0.2510 (0.2417) loss 3.5190 (4.0566) grad_norm 2.4041 (2.1472) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1110/1251] eta 0:00:34 lr 0.001000 wd 0.0500 time 0.2484 (0.2431) data time 0.0007 (0.0016) model time 0.2476 (0.2417) loss 3.4522 (4.0535) grad_norm 1.9497 (2.1455) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1120/1251] eta 0:00:31 lr 0.001000 wd 0.0500 time 0.2368 (0.2431) data time 0.0010 (0.0016) model 
time 0.2358 (0.2416) loss 4.1891 (4.0544) grad_norm 1.7065 (2.1439) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1130/1251] eta 0:00:29 lr 0.001000 wd 0.0500 time 0.2304 (0.2431) data time 0.0011 (0.0016) model time 0.2293 (0.2416) loss 4.0734 (4.0570) grad_norm 2.5140 (2.1458) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1140/1251] eta 0:00:26 lr 0.001000 wd 0.0500 time 0.2502 (0.2431) data time 0.0007 (0.0016) model time 0.2494 (0.2416) loss 4.4878 (4.0552) grad_norm 1.5222 (2.1445) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1150/1251] eta 0:00:24 lr 0.001000 wd 0.0500 time 0.2578 (0.2431) data time 0.0009 (0.0016) model time 0.2569 (0.2416) loss 4.4807 (4.0516) grad_norm 1.7589 (2.1450) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1160/1251] eta 0:00:22 lr 0.001000 wd 0.0500 time 0.2424 (0.2430) data time 0.0010 (0.0016) model time 0.2414 (0.2416) loss 3.6618 (4.0513) grad_norm 2.7118 (2.1451) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1170/1251] eta 0:00:19 lr 0.001000 wd 0.0500 time 0.2398 (0.2430) data time 0.0011 (0.0016) model time 0.2387 (0.2416) loss 4.9616 (4.0520) grad_norm 1.4018 (2.1493) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1180/1251] eta 0:00:17 lr 0.001000 wd 0.0500 time 0.2411 (0.2430) data time 0.0011 (0.0016) model time 0.2401 (0.2416) loss 3.8615 (4.0532) grad_norm 1.7872 (2.1473) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [22/300][1190/1251] eta 0:00:14 lr 0.001000 wd 0.0500 time 0.2559 (0.2430) data time 0.0008 (0.0016) model time 0.2552 (0.2416) loss 3.8262 (4.0515) grad_norm 1.5748 (2.1474) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1200/1251] eta 0:00:12 lr 0.001000 wd 0.0500 time 0.2400 (0.2430) data time 0.0011 (0.0015) model time 0.2388 (0.2416) loss 4.1405 (4.0522) grad_norm 4.4112 (2.1527) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1210/1251] eta 0:00:09 lr 0.001000 wd 0.0500 time 0.2398 (0.2430) data time 0.0008 (0.0015) model time 0.2390 (0.2416) loss 3.7332 (4.0524) grad_norm 1.4355 (2.1546) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1220/1251] eta 0:00:07 lr 0.001000 wd 0.0500 time 0.2428 (0.2430) data time 0.0011 (0.0015) model time 0.2417 (0.2416) loss 3.6114 (4.0514) grad_norm 3.3857 (2.1583) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1230/1251] eta 0:00:05 lr 0.001000 wd 0.0500 time 0.2365 (0.2434) data time 0.0009 (0.0015) model time 0.2356 (0.2420) loss 2.8300 (4.0507) grad_norm 2.4391 (2.1594) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1240/1251] eta 0:00:02 lr 0.001000 wd 0.0500 time 0.2239 (0.2433) data time 0.0005 (0.0015) model time 0.2234 (0.2419) loss 3.7156 (4.0488) grad_norm 1.7509 (2.1574) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [22/300][1250/1251] eta 0:00:00 lr 0.001000 wd 0.0500 time 0.2273 (0.2432) data time 0.0008 (0.0015) model time 0.2265 (0.2418) loss 4.3720 (4.0504) grad_norm 1.7317 (2.1553) 
loss_scale 16384.0000 (16384.0000) mem 7379MB
[2024-08-26 04:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 22 training takes 0:05:04
[2024-08-26 04:07:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 04:07:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 04:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.456 (0.456) Loss 0.7202 (0.7202) Acc@1 85.938 (85.938) Acc@5 95.996 (95.996) Mem 7379MB
[2024-08-26 04:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.112) Loss 1.1357 (1.1210) Acc@1 74.707 (74.103) Acc@5 93.164 (93.200) Mem 7379MB
[2024-08-26 04:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.096) Loss 1.6455 (1.1349) Acc@1 63.379 (73.972) Acc@5 86.426 (93.252) Mem 7379MB
[2024-08-26 04:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 1.8955 (1.3109) Acc@1 57.227 (70.432) Acc@5 80.469 (90.625) Mem 7379MB
[2024-08-26 04:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.8828 (1.4157) Acc@1 58.691 (68.226) Acc@5 82.422 (89.177) Mem 7379MB
[2024-08-26 04:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 67.868 Acc@5 89.012
[2024-08-26 04:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 67.9%
[2024-08-26 04:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 67.87%
[2024-08-26 04:08:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 04:08:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 04:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.391 (0.391) Loss 0.7661 (0.7661) Acc@1 81.641 (81.641) Acc@5 94.824 (94.824) Mem 7379MB
[2024-08-26 04:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.103) Loss 1.1543 (1.1690) Acc@1 73.633 (71.529) Acc@5 92.480 (91.477) Mem 7379MB
[2024-08-26 04:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.091) Loss 1.6982 (1.1762) Acc@1 61.035 (71.354) Acc@5 84.375 (91.569) Mem 7379MB
[2024-08-26 04:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.086) Loss 1.9473 (1.3556) Acc@1 57.129 (67.937) Acc@5 78.613 (88.886) Mem 7379MB
[2024-08-26 04:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.081) Loss 1.9912 (1.4679) Acc@1 54.199 (65.742) Acc@5 79.980 (87.262) Mem 7379MB
[2024-08-26 04:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 65.602 Acc@5 87.224
[2024-08-26 04:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 65.6%
[2024-08-26 04:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 65.60%
[2024-08-26 04:08:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 04:08:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 04:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][0/1251] eta 0:13:18 lr 0.001000 wd 0.0500 time 0.6383 (0.6383) data time 0.4157 (0.4157) model time 0.0000 (0.0000) loss 3.9482 (3.9482) grad_norm 1.9002 (1.9002) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][10/1251] eta 0:05:47 lr 0.001000 wd 0.0500 time 0.2413 (0.2800) data time 0.0013 (0.0388) model time 0.0000 (0.0000) loss 3.1296 (3.7700) grad_norm 1.6839 (2.1424) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][20/1251] eta 0:05:21 lr 0.001000 wd 0.0500 time 0.2371 (0.2610) data time 0.0007 (0.0208) model time 0.0000 (0.0000) loss 5.0031 (4.0491) grad_norm 2.9616 (2.1774) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][30/1251] eta 0:05:11 lr 0.001000 wd 0.0500 time 0.2450 (0.2549) data time 0.0009 (0.0144) model time 0.0000 (0.0000) loss 4.4232 (4.0579) grad_norm 2.5383 (2.1093) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][40/1251] eta 0:05:05 lr 0.001000 wd 0.0500 time 0.2429 (0.2524) data time 0.0009 (0.0111) model time 0.0000 (0.0000) loss 2.5911 (4.0493) grad_norm 2.1818 (2.1371) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][50/1251] eta 0:05:00 lr 0.001000 wd 0.0500 time 0.2463 (0.2505) data time 0.0011 (0.0092) model time 0.0000 (0.0000) loss 4.0986 (4.0038) grad_norm 1.8727 (2.1094) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][60/1251] eta 0:04:56 lr 0.001000 wd 0.0500 time 0.2382 (0.2493) data time 0.0010 (0.0078) model time 0.2373 
(0.2417) loss 4.1473 (4.0162) grad_norm 2.4312 (2.0717) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][70/1251] eta 0:04:53 lr 0.001000 wd 0.0500 time 0.2440 (0.2483) data time 0.0011 (0.0069) model time 0.2429 (0.2414) loss 3.5869 (3.9998) grad_norm 1.9442 (2.1270) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][80/1251] eta 0:04:49 lr 0.001000 wd 0.0500 time 0.2427 (0.2474) data time 0.0010 (0.0061) model time 0.2417 (0.2411) loss 4.4961 (4.0218) grad_norm 3.7078 (2.2310) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][90/1251] eta 0:04:46 lr 0.001000 wd 0.0500 time 0.2417 (0.2469) data time 0.0007 (0.0056) model time 0.2409 (0.2413) loss 3.1268 (4.0102) grad_norm 2.2707 (2.2231) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][100/1251] eta 0:04:43 lr 0.001000 wd 0.0500 time 0.2389 (0.2461) data time 0.0007 (0.0051) model time 0.2382 (0.2406) loss 4.5277 (4.0107) grad_norm 1.8478 (2.2345) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][110/1251] eta 0:04:40 lr 0.001000 wd 0.0500 time 0.2455 (0.2458) data time 0.0008 (0.0048) model time 0.2447 (0.2407) loss 4.3000 (4.0244) grad_norm 2.6851 (2.2545) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][120/1251] eta 0:04:37 lr 0.001000 wd 0.0500 time 0.2464 (0.2454) data time 0.0011 (0.0044) model time 0.2453 (0.2406) loss 2.8153 (4.0204) grad_norm 2.0222 (2.2310) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[23/300][130/1251] eta 0:04:34 lr 0.001000 wd 0.0500 time 0.2610 (0.2452) data time 0.0008 (0.0042) model time 0.2602 (0.2407) loss 4.7924 (4.0235) grad_norm 1.9968 (2.2181) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][140/1251] eta 0:04:31 lr 0.001000 wd 0.0500 time 0.2428 (0.2448) data time 0.0007 (0.0040) model time 0.2420 (0.2405) loss 4.4667 (4.0411) grad_norm 2.1513 (2.2132) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][150/1251] eta 0:04:29 lr 0.001000 wd 0.0500 time 0.2372 (0.2445) data time 0.0008 (0.0038) model time 0.2364 (0.2404) loss 4.2678 (4.0296) grad_norm 1.4098 (2.1726) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][160/1251] eta 0:04:26 lr 0.001000 wd 0.0500 time 0.2409 (0.2444) data time 0.0010 (0.0036) model time 0.2399 (0.2405) loss 3.1606 (4.0215) grad_norm 2.6927 (2.1694) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][170/1251] eta 0:04:23 lr 0.001000 wd 0.0500 time 0.2381 (0.2441) data time 0.0008 (0.0034) model time 0.2373 (0.2404) loss 5.0436 (4.0393) grad_norm 2.4912 (2.1576) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][180/1251] eta 0:04:21 lr 0.001000 wd 0.0500 time 0.2400 (0.2439) data time 0.0010 (0.0033) model time 0.2390 (0.2404) loss 4.1809 (4.0350) grad_norm 2.6982 (2.1662) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][190/1251] eta 0:04:18 lr 0.001000 wd 0.0500 time 0.2468 (0.2438) data time 0.0007 (0.0032) model time 0.2461 (0.2404) loss 3.8891 (4.0402) grad_norm 1.8193 (2.1688) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][200/1251] eta 0:04:16 lr 0.001000 wd 0.0500 time 0.2447 (0.2437) data time 0.0011 (0.0031) model time 0.2436 (0.2404) loss 2.7575 (4.0423) grad_norm 3.1910 (2.1859) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][210/1251] eta 0:04:13 lr 0.001000 wd 0.0500 time 0.2463 (0.2435) data time 0.0010 (0.0030) model time 0.2454 (0.2403) loss 4.5748 (4.0509) grad_norm 1.8568 (2.1703) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][220/1251] eta 0:04:11 lr 0.001000 wd 0.0500 time 0.2377 (0.2435) data time 0.0007 (0.0029) model time 0.2369 (0.2404) loss 4.6537 (4.0660) grad_norm 1.5484 (2.1693) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][230/1251] eta 0:04:08 lr 0.001000 wd 0.0500 time 0.2566 (0.2434) data time 0.0008 (0.0028) model time 0.2558 (0.2404) loss 4.9457 (4.0659) grad_norm 2.6925 (2.1581) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][240/1251] eta 0:04:06 lr 0.001000 wd 0.0500 time 0.2459 (0.2433) data time 0.0009 (0.0027) model time 0.2451 (0.2404) loss 4.3042 (4.0695) grad_norm 1.9711 (2.1654) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][250/1251] eta 0:04:03 lr 0.001000 wd 0.0500 time 0.2391 (0.2432) data time 0.0008 (0.0027) model time 0.2384 (0.2404) loss 4.8600 (4.0794) grad_norm 1.5375 (2.1573) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][260/1251] eta 0:04:00 lr 0.001000 wd 0.0500 time 0.2385 (0.2431) 
data time 0.0009 (0.0026) model time 0.2375 (0.2403) loss 4.5583 (4.0954) grad_norm 1.6190 (2.1396) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][270/1251] eta 0:03:58 lr 0.001000 wd 0.0500 time 0.2699 (0.2432) data time 0.0010 (0.0025) model time 0.2690 (0.2404) loss 4.7005 (4.0925) grad_norm 2.2981 (2.1559) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][280/1251] eta 0:03:56 lr 0.001000 wd 0.0500 time 0.2423 (0.2438) data time 0.0009 (0.0025) model time 0.2413 (0.2414) loss 4.6446 (4.0939) grad_norm 1.9381 (2.1513) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][290/1251] eta 0:03:54 lr 0.001000 wd 0.0500 time 0.2466 (0.2438) data time 0.0007 (0.0024) model time 0.2459 (0.2414) loss 5.0471 (4.0917) grad_norm 1.4576 (2.1437) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][300/1251] eta 0:03:51 lr 0.001000 wd 0.0500 time 0.2436 (0.2437) data time 0.0007 (0.0024) model time 0.2428 (0.2413) loss 4.5231 (4.0855) grad_norm 1.8224 (2.1408) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][310/1251] eta 0:03:49 lr 0.001000 wd 0.0500 time 0.2425 (0.2437) data time 0.0010 (0.0023) model time 0.2415 (0.2414) loss 4.2342 (4.0802) grad_norm 1.7008 (2.1313) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][320/1251] eta 0:03:46 lr 0.001000 wd 0.0500 time 0.2384 (0.2436) data time 0.0008 (0.0023) model time 0.2376 (0.2413) loss 2.8478 (4.0764) grad_norm 2.4419 (2.1409) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:28 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [23/300][330/1251] eta 0:03:44 lr 0.001000 wd 0.0500 time 0.2430 (0.2436) data time 0.0008 (0.0023) model time 0.2422 (0.2413) loss 4.1503 (4.0708) grad_norm 2.8761 (2.1607) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][340/1251] eta 0:03:41 lr 0.001000 wd 0.0500 time 0.2480 (0.2436) data time 0.0010 (0.0022) model time 0.2470 (0.2414) loss 4.4638 (4.0698) grad_norm 1.5832 (2.1557) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][350/1251] eta 0:03:39 lr 0.001000 wd 0.0500 time 0.2405 (0.2435) data time 0.0008 (0.0022) model time 0.2397 (0.2413) loss 4.2809 (4.0716) grad_norm 1.9331 (2.1518) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][360/1251] eta 0:03:36 lr 0.001000 wd 0.0500 time 0.2436 (0.2435) data time 0.0009 (0.0022) model time 0.2427 (0.2414) loss 3.4440 (4.0740) grad_norm 1.5783 (2.1532) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][370/1251] eta 0:03:34 lr 0.001000 wd 0.0500 time 0.2405 (0.2435) data time 0.0011 (0.0021) model time 0.2395 (0.2414) loss 4.3351 (4.0787) grad_norm 2.3942 (2.1450) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][380/1251] eta 0:03:32 lr 0.001000 wd 0.0500 time 0.2497 (0.2435) data time 0.0007 (0.0021) model time 0.2490 (0.2414) loss 3.7457 (4.0680) grad_norm 2.0357 (2.1491) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][390/1251] eta 0:03:29 lr 0.001000 wd 0.0500 time 0.2421 (0.2435) data time 0.0010 (0.0021) model time 0.2411 (0.2414) loss 4.7620 (4.0737) 
grad_norm 4.3137 (2.1530) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][400/1251] eta 0:03:27 lr 0.001000 wd 0.0500 time 0.2414 (0.2434) data time 0.0011 (0.0020) model time 0.2404 (0.2414) loss 4.2212 (4.0728) grad_norm 1.5469 (2.1522) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][410/1251] eta 0:03:24 lr 0.001000 wd 0.0500 time 0.2324 (0.2434) data time 0.0009 (0.0020) model time 0.2315 (0.2414) loss 3.9660 (4.0640) grad_norm 1.6615 (2.1459) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][420/1251] eta 0:03:22 lr 0.001000 wd 0.0500 time 0.2464 (0.2433) data time 0.0009 (0.0020) model time 0.2455 (0.2413) loss 3.5064 (4.0536) grad_norm 2.0996 (2.1369) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][430/1251] eta 0:03:19 lr 0.001000 wd 0.0500 time 0.2432 (0.2432) data time 0.0009 (0.0020) model time 0.2423 (0.2412) loss 3.5536 (4.0509) grad_norm 1.3778 (2.1392) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][440/1251] eta 0:03:17 lr 0.001000 wd 0.0500 time 0.2395 (0.2432) data time 0.0010 (0.0019) model time 0.2385 (0.2412) loss 4.5358 (4.0493) grad_norm 2.5106 (2.1424) loss_scale 32768.0000 (16458.3039) mem 7379MB [2024-08-26 04:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][450/1251] eta 0:03:14 lr 0.001000 wd 0.0500 time 0.2389 (0.2431) data time 0.0007 (0.0019) model time 0.2381 (0.2412) loss 3.2509 (4.0424) grad_norm 2.2279 (inf) loss_scale 16384.0000 (16492.9845) mem 7379MB [2024-08-26 04:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][460/1251] eta 0:03:12 lr 0.001000 
wd 0.0500 time 0.2399 (0.2431) data time 0.0009 (0.0019) model time 0.2390 (0.2412) loss 5.1662 (4.0450) grad_norm 2.5753 (inf) loss_scale 16384.0000 (16490.6204) mem 7379MB [2024-08-26 04:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][470/1251] eta 0:03:09 lr 0.001000 wd 0.0500 time 0.2492 (0.2431) data time 0.0010 (0.0019) model time 0.2482 (0.2412) loss 4.4155 (4.0432) grad_norm 2.5689 (inf) loss_scale 16384.0000 (16488.3567) mem 7379MB [2024-08-26 04:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][480/1251] eta 0:03:07 lr 0.001000 wd 0.0500 time 0.2436 (0.2431) data time 0.0011 (0.0019) model time 0.2425 (0.2412) loss 3.9890 (4.0397) grad_norm 3.1351 (inf) loss_scale 16384.0000 (16486.1871) mem 7379MB [2024-08-26 04:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][490/1251] eta 0:03:04 lr 0.001000 wd 0.0500 time 0.2392 (0.2431) data time 0.0012 (0.0019) model time 0.2380 (0.2412) loss 3.9551 (4.0467) grad_norm 2.1745 (inf) loss_scale 16384.0000 (16484.1059) mem 7379MB [2024-08-26 04:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][500/1251] eta 0:03:02 lr 0.001000 wd 0.0500 time 0.2455 (0.2435) data time 0.0010 (0.0018) model time 0.2446 (0.2417) loss 3.3413 (4.0502) grad_norm 1.7230 (inf) loss_scale 16384.0000 (16482.1078) mem 7379MB [2024-08-26 04:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][510/1251] eta 0:03:00 lr 0.001000 wd 0.0500 time 0.2406 (0.2439) data time 0.0011 (0.0018) model time 0.2395 (0.2422) loss 4.0347 (4.0505) grad_norm 2.0587 (inf) loss_scale 16384.0000 (16480.1879) mem 7379MB [2024-08-26 04:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][520/1251] eta 0:02:58 lr 0.001000 wd 0.0500 time 0.2365 (0.2439) data time 0.0009 (0.0018) model time 0.2356 (0.2421) loss 3.0247 (4.0465) grad_norm 2.0959 (inf) loss_scale 16384.0000 (16478.3417) mem 7379MB [2024-08-26 04:10:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][530/1251] eta 0:02:55 lr 0.001000 wd 0.0500 time 0.2331 (0.2438) data time 0.0007 (0.0018) model time 0.2324 (0.2421) loss 4.5997 (4.0488) grad_norm 1.8844 (inf) loss_scale 16384.0000 (16476.5650) mem 7379MB [2024-08-26 04:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][540/1251] eta 0:02:53 lr 0.001000 wd 0.0500 time 0.2426 (0.2438) data time 0.0008 (0.0018) model time 0.2419 (0.2421) loss 4.9027 (4.0541) grad_norm 1.3365 (inf) loss_scale 16384.0000 (16474.8540) mem 7379MB [2024-08-26 04:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][550/1251] eta 0:02:50 lr 0.001000 wd 0.0500 time 0.2384 (0.2437) data time 0.0013 (0.0018) model time 0.2372 (0.2420) loss 4.1740 (4.0527) grad_norm 1.3241 (inf) loss_scale 16384.0000 (16473.2051) mem 7379MB [2024-08-26 04:10:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][560/1251] eta 0:02:48 lr 0.001000 wd 0.0500 time 0.2445 (0.2437) data time 0.0007 (0.0017) model time 0.2438 (0.2420) loss 3.6408 (4.0547) grad_norm 2.5001 (inf) loss_scale 16384.0000 (16471.6150) mem 7379MB [2024-08-26 04:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][570/1251] eta 0:02:45 lr 0.001000 wd 0.0500 time 0.2404 (0.2436) data time 0.0012 (0.0017) model time 0.2393 (0.2419) loss 4.8009 (4.0604) grad_norm 1.9019 (inf) loss_scale 16384.0000 (16470.0806) mem 7379MB [2024-08-26 04:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][580/1251] eta 0:02:43 lr 0.001000 wd 0.0500 time 0.2414 (0.2436) data time 0.0009 (0.0017) model time 0.2405 (0.2419) loss 4.1196 (4.0557) grad_norm 2.1599 (inf) loss_scale 16384.0000 (16468.5990) mem 7379MB [2024-08-26 04:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][590/1251] eta 0:02:40 lr 0.001000 wd 0.0500 time 0.2451 (0.2436) data time 0.0010 (0.0017) model time 0.2441 (0.2419) loss 3.8638 (4.0494) 
grad_norm 2.7409 (inf) loss_scale 16384.0000 (16467.1675) mem 7379MB [2024-08-26 04:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][600/1251] eta 0:02:38 lr 0.001000 wd 0.0500 time 0.2343 (0.2435) data time 0.0009 (0.0017) model time 0.2334 (0.2419) loss 3.4478 (4.0485) grad_norm 1.9228 (inf) loss_scale 16384.0000 (16465.7837) mem 7379MB [2024-08-26 04:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][610/1251] eta 0:02:36 lr 0.001000 wd 0.0500 time 0.2398 (0.2435) data time 0.0011 (0.0017) model time 0.2388 (0.2419) loss 4.6351 (4.0486) grad_norm 1.7636 (inf) loss_scale 16384.0000 (16464.4452) mem 7379MB [2024-08-26 04:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][620/1251] eta 0:02:33 lr 0.001000 wd 0.0500 time 0.2491 (0.2435) data time 0.0008 (0.0017) model time 0.2483 (0.2419) loss 5.1943 (4.0494) grad_norm 3.2590 (inf) loss_scale 16384.0000 (16463.1498) mem 7379MB [2024-08-26 04:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][630/1251] eta 0:02:31 lr 0.001000 wd 0.0500 time 0.2369 (0.2435) data time 0.0009 (0.0017) model time 0.2360 (0.2419) loss 3.1625 (4.0429) grad_norm 1.5633 (inf) loss_scale 16384.0000 (16461.8954) mem 7379MB [2024-08-26 04:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][640/1251] eta 0:02:28 lr 0.001000 wd 0.0500 time 0.2471 (0.2435) data time 0.0008 (0.0017) model time 0.2463 (0.2419) loss 4.6554 (4.0432) grad_norm 4.3653 (inf) loss_scale 16384.0000 (16460.6802) mem 7379MB [2024-08-26 04:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][650/1251] eta 0:02:26 lr 0.001000 wd 0.0500 time 0.2397 (0.2434) data time 0.0007 (0.0016) model time 0.2390 (0.2418) loss 4.9161 (4.0411) grad_norm 2.1307 (inf) loss_scale 16384.0000 (16459.5023) mem 7379MB [2024-08-26 04:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][660/1251] eta 0:02:23 lr 0.001000 wd 0.0500 time 
0.2415 (0.2434) data time 0.0012 (0.0016) model time 0.2403 (0.2418) loss 4.0671 (4.0413) grad_norm 1.5354 (inf) loss_scale 16384.0000 (16458.3601) mem 7379MB [2024-08-26 04:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][670/1251] eta 0:02:21 lr 0.001000 wd 0.0500 time 0.2526 (0.2434) data time 0.0008 (0.0016) model time 0.2518 (0.2418) loss 3.4787 (4.0397) grad_norm 2.1314 (inf) loss_scale 16384.0000 (16457.2519) mem 7379MB [2024-08-26 04:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][680/1251] eta 0:02:18 lr 0.001000 wd 0.0500 time 0.2386 (0.2433) data time 0.0008 (0.0016) model time 0.2378 (0.2418) loss 4.3427 (4.0405) grad_norm 1.5017 (inf) loss_scale 8192.0000 (16420.0881) mem 7379MB [2024-08-26 04:10:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][690/1251] eta 0:02:16 lr 0.001000 wd 0.0500 time 0.2440 (0.2434) data time 0.0007 (0.0016) model time 0.2433 (0.2418) loss 3.7014 (4.0415) grad_norm 2.0517 (inf) loss_scale 8192.0000 (16301.0130) mem 7379MB [2024-08-26 04:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][700/1251] eta 0:02:14 lr 0.001000 wd 0.0500 time 0.2338 (0.2433) data time 0.0009 (0.0016) model time 0.2329 (0.2417) loss 3.7543 (4.0435) grad_norm 2.3201 (inf) loss_scale 8192.0000 (16185.3352) mem 7379MB [2024-08-26 04:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][710/1251] eta 0:02:11 lr 0.001000 wd 0.0500 time 0.2421 (0.2433) data time 0.0010 (0.0016) model time 0.2412 (0.2417) loss 4.6946 (4.0446) grad_norm 3.0025 (inf) loss_scale 8192.0000 (16072.9114) mem 7379MB [2024-08-26 04:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][720/1251] eta 0:02:09 lr 0.001000 wd 0.0500 time 0.2347 (0.2432) data time 0.0011 (0.0016) model time 0.2336 (0.2417) loss 3.8014 (4.0471) grad_norm 2.5245 (inf) loss_scale 8192.0000 (15963.6061) mem 7379MB [2024-08-26 04:11:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [23/300][730/1251] eta 0:02:06 lr 0.001000 wd 0.0500 time 0.2407 (0.2432) data time 0.0011 (0.0016) model time 0.2396 (0.2417) loss 2.8535 (4.0435) grad_norm 4.5888 (inf) loss_scale 8192.0000 (15857.2914) mem 7379MB [2024-08-26 04:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][740/1251] eta 0:02:04 lr 0.001000 wd 0.0500 time 0.2370 (0.2432) data time 0.0011 (0.0016) model time 0.2359 (0.2416) loss 3.6519 (4.0434) grad_norm 2.1333 (inf) loss_scale 8192.0000 (15753.8462) mem 7379MB [2024-08-26 04:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][750/1251] eta 0:02:01 lr 0.001000 wd 0.0500 time 0.2449 (0.2431) data time 0.0008 (0.0016) model time 0.2441 (0.2416) loss 4.1466 (4.0415) grad_norm 1.6847 (inf) loss_scale 8192.0000 (15653.1558) mem 7379MB [2024-08-26 04:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][760/1251] eta 0:01:59 lr 0.001000 wd 0.0500 time 0.2412 (0.2431) data time 0.0008 (0.0016) model time 0.2404 (0.2416) loss 4.7233 (4.0415) grad_norm 1.6757 (inf) loss_scale 8192.0000 (15555.1117) mem 7379MB [2024-08-26 04:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][770/1251] eta 0:01:56 lr 0.001000 wd 0.0500 time 0.2418 (0.2431) data time 0.0007 (0.0015) model time 0.2411 (0.2416) loss 4.2100 (4.0453) grad_norm 2.2703 (inf) loss_scale 8192.0000 (15459.6109) mem 7379MB [2024-08-26 04:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][780/1251] eta 0:01:54 lr 0.001000 wd 0.0500 time 0.2451 (0.2430) data time 0.0009 (0.0015) model time 0.2442 (0.2415) loss 3.1696 (4.0421) grad_norm 1.8161 (inf) loss_scale 8192.0000 (15366.5557) mem 7379MB [2024-08-26 04:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][790/1251] eta 0:01:52 lr 0.001000 wd 0.0500 time 0.2370 (0.2430) data time 0.0008 (0.0015) model time 0.2362 (0.2415) loss 4.9156 (4.0436) grad_norm 1.9955 (inf) 
loss_scale 8192.0000 (15275.8534) mem 7379MB [2024-08-26 04:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][800/1251] eta 0:01:49 lr 0.001000 wd 0.0500 time 0.2467 (0.2430) data time 0.0008 (0.0015) model time 0.2459 (0.2415) loss 3.4707 (4.0464) grad_norm 2.7776 (inf) loss_scale 8192.0000 (15187.4157) mem 7379MB [2024-08-26 04:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][810/1251] eta 0:01:47 lr 0.001000 wd 0.0500 time 0.2387 (0.2430) data time 0.0009 (0.0015) model time 0.2378 (0.2415) loss 4.2816 (4.0467) grad_norm 2.3468 (inf) loss_scale 8192.0000 (15101.1591) mem 7379MB [2024-08-26 04:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][820/1251] eta 0:01:44 lr 0.001000 wd 0.0500 time 0.2423 (0.2430) data time 0.0010 (0.0015) model time 0.2413 (0.2415) loss 4.4129 (4.0495) grad_norm 3.3567 (inf) loss_scale 8192.0000 (15017.0037) mem 7379MB [2024-08-26 04:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][830/1251] eta 0:01:42 lr 0.001000 wd 0.0500 time 0.2399 (0.2430) data time 0.0009 (0.0015) model time 0.2390 (0.2414) loss 3.1666 (4.0486) grad_norm 2.5640 (inf) loss_scale 8192.0000 (14934.8736) mem 7379MB [2024-08-26 04:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][840/1251] eta 0:01:39 lr 0.001000 wd 0.0500 time 0.2379 (0.2429) data time 0.0007 (0.0015) model time 0.2372 (0.2414) loss 4.1537 (4.0520) grad_norm 1.7997 (inf) loss_scale 8192.0000 (14854.6968) mem 7379MB [2024-08-26 04:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][850/1251] eta 0:01:37 lr 0.001000 wd 0.0500 time 0.2390 (0.2430) data time 0.0009 (0.0015) model time 0.2381 (0.2415) loss 4.3088 (4.0529) grad_norm 2.4454 (inf) loss_scale 8192.0000 (14776.4042) mem 7379MB [2024-08-26 04:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][860/1251] eta 0:01:35 lr 0.001000 wd 0.0500 time 0.2377 (0.2430) data time 0.0009 
(0.0015) model time 0.2368 (0.2415) loss 4.8926 (4.0535) grad_norm 1.9255 (inf) loss_scale 8192.0000 (14699.9303) mem 7379MB [2024-08-26 04:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][870/1251] eta 0:01:32 lr 0.001000 wd 0.0500 time 0.2359 (0.2430) data time 0.0008 (0.0015) model time 0.2350 (0.2415) loss 3.3118 (4.0507) grad_norm 1.6109 (inf) loss_scale 8192.0000 (14625.2124) mem 7379MB [2024-08-26 04:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][880/1251] eta 0:01:30 lr 0.001000 wd 0.0500 time 0.2478 (0.2430) data time 0.0007 (0.0015) model time 0.2471 (0.2415) loss 4.5021 (4.0500) grad_norm 1.8925 (inf) loss_scale 8192.0000 (14552.1907) mem 7379MB [2024-08-26 04:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][890/1251] eta 0:01:27 lr 0.001000 wd 0.0500 time 0.2376 (0.2429) data time 0.0011 (0.0015) model time 0.2365 (0.2415) loss 4.4748 (4.0485) grad_norm 2.4645 (inf) loss_scale 8192.0000 (14480.8081) mem 7379MB [2024-08-26 04:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][900/1251] eta 0:01:25 lr 0.001000 wd 0.0500 time 0.2396 (0.2429) data time 0.0008 (0.0015) model time 0.2388 (0.2414) loss 5.0483 (4.0505) grad_norm 1.6542 (inf) loss_scale 8192.0000 (14411.0100) mem 7379MB [2024-08-26 04:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][910/1251] eta 0:01:22 lr 0.001000 wd 0.0500 time 0.2397 (0.2429) data time 0.0010 (0.0015) model time 0.2387 (0.2414) loss 4.2599 (4.0491) grad_norm 4.0017 (inf) loss_scale 8192.0000 (14342.7442) mem 7379MB [2024-08-26 04:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][920/1251] eta 0:01:20 lr 0.001000 wd 0.0500 time 0.2423 (0.2429) data time 0.0008 (0.0015) model time 0.2415 (0.2414) loss 3.3319 (4.0473) grad_norm 1.6915 (inf) loss_scale 8192.0000 (14275.9609) mem 7379MB [2024-08-26 04:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[23/300][930/1251] eta 0:01:17 lr 0.001000 wd 0.0500 time 0.2375 (0.2429) data time 0.0009 (0.0015) model time 0.2367 (0.2414) loss 4.3172 (4.0473) grad_norm 2.1436 (inf) loss_scale 8192.0000 (14210.6122) mem 7379MB [2024-08-26 04:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][940/1251] eta 0:01:15 lr 0.001000 wd 0.0500 time 0.2354 (0.2429) data time 0.0010 (0.0015) model time 0.2344 (0.2414) loss 3.2975 (4.0441) grad_norm 2.3539 (inf) loss_scale 8192.0000 (14146.6525) mem 7379MB [2024-08-26 04:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][950/1251] eta 0:01:13 lr 0.001000 wd 0.0500 time 0.2419 (0.2428) data time 0.0007 (0.0015) model time 0.2411 (0.2414) loss 4.3359 (4.0449) grad_norm 1.8491 (inf) loss_scale 8192.0000 (14084.0379) mem 7379MB [2024-08-26 04:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][960/1251] eta 0:01:10 lr 0.001000 wd 0.0500 time 0.2426 (0.2428) data time 0.0007 (0.0015) model time 0.2419 (0.2414) loss 3.7344 (4.0449) grad_norm 2.6969 (inf) loss_scale 8192.0000 (14022.7263) mem 7379MB [2024-08-26 04:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][970/1251] eta 0:01:08 lr 0.001000 wd 0.0500 time 0.2402 (0.2428) data time 0.0010 (0.0015) model time 0.2392 (0.2414) loss 4.3634 (4.0447) grad_norm 6.8440 (inf) loss_scale 8192.0000 (13962.6777) mem 7379MB [2024-08-26 04:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][980/1251] eta 0:01:05 lr 0.001000 wd 0.0500 time 0.2392 (0.2428) data time 0.0008 (0.0015) model time 0.2385 (0.2414) loss 4.5414 (4.0402) grad_norm 2.2888 (inf) loss_scale 8192.0000 (13903.8532) mem 7379MB [2024-08-26 04:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][990/1251] eta 0:01:03 lr 0.001000 wd 0.0500 time 0.2366 (0.2428) data time 0.0011 (0.0014) model time 0.2355 (0.2413) loss 3.6549 (4.0391) grad_norm 2.6471 (inf) loss_scale 8192.0000 (13846.2159) mem 7379MB 
[2024-08-26 04:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1000/1251] eta 0:01:00 lr 0.001000 wd 0.0500 time 0.2311 (0.2428) data time 0.0008 (0.0014) model time 0.2303 (0.2413) loss 2.8080 (4.0386) grad_norm 1.6279 (inf) loss_scale 8192.0000 (13789.7303) mem 7379MB [2024-08-26 04:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1010/1251] eta 0:00:58 lr 0.001000 wd 0.0500 time 0.2421 (0.2427) data time 0.0007 (0.0014) model time 0.2413 (0.2413) loss 2.7183 (4.0376) grad_norm 1.6203 (inf) loss_scale 8192.0000 (13734.3620) mem 7379MB [2024-08-26 04:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1020/1251] eta 0:00:56 lr 0.001000 wd 0.0500 time 0.2358 (0.2427) data time 0.0011 (0.0014) model time 0.2347 (0.2413) loss 4.1178 (4.0406) grad_norm 1.9740 (inf) loss_scale 8192.0000 (13680.0784) mem 7379MB [2024-08-26 04:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1030/1251] eta 0:00:53 lr 0.001000 wd 0.0500 time 0.2369 (0.2427) data time 0.0007 (0.0014) model time 0.2361 (0.2412) loss 3.6277 (4.0396) grad_norm 2.4996 (inf) loss_scale 8192.0000 (13626.8477) mem 7379MB [2024-08-26 04:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1040/1251] eta 0:00:51 lr 0.001000 wd 0.0500 time 0.2442 (0.2427) data time 0.0010 (0.0014) model time 0.2432 (0.2412) loss 4.5640 (4.0381) grad_norm 1.8481 (inf) loss_scale 8192.0000 (13574.6398) mem 7379MB [2024-08-26 04:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1050/1251] eta 0:00:48 lr 0.001000 wd 0.0500 time 0.2346 (0.2426) data time 0.0011 (0.0014) model time 0.2335 (0.2412) loss 3.3831 (4.0385) grad_norm 1.1262 (inf) loss_scale 8192.0000 (13523.4253) mem 7379MB [2024-08-26 04:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1060/1251] eta 0:00:46 lr 0.001000 wd 0.0500 time 0.2390 (0.2426) data time 0.0008 (0.0014) model time 0.2383 (0.2412) 
loss 2.9491 (4.0384) grad_norm 1.6851 (inf) loss_scale 8192.0000 (13473.1762) mem 7379MB [2024-08-26 04:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1070/1251] eta 0:00:43 lr 0.001000 wd 0.0500 time 0.2400 (0.2426) data time 0.0009 (0.0014) model time 0.2391 (0.2411) loss 4.0954 (4.0370) grad_norm 4.1875 (inf) loss_scale 8192.0000 (13423.8655) mem 7379MB [2024-08-26 04:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1080/1251] eta 0:00:41 lr 0.001000 wd 0.0500 time 0.2480 (0.2425) data time 0.0007 (0.0014) model time 0.2472 (0.2411) loss 4.1831 (4.0375) grad_norm 1.5677 (inf) loss_scale 8192.0000 (13375.4672) mem 7379MB [2024-08-26 04:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1090/1251] eta 0:00:39 lr 0.001000 wd 0.0500 time 0.2369 (0.2425) data time 0.0011 (0.0014) model time 0.2358 (0.2411) loss 4.4161 (4.0365) grad_norm 1.8810 (inf) loss_scale 8192.0000 (13327.9560) mem 7379MB [2024-08-26 04:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1100/1251] eta 0:00:36 lr 0.001000 wd 0.0500 time 0.2451 (0.2425) data time 0.0011 (0.0014) model time 0.2440 (0.2410) loss 4.6876 (4.0340) grad_norm 1.6681 (inf) loss_scale 8192.0000 (13281.3079) mem 7379MB [2024-08-26 04:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1110/1251] eta 0:00:34 lr 0.001000 wd 0.0500 time 0.2326 (0.2424) data time 0.0010 (0.0014) model time 0.2315 (0.2410) loss 4.1146 (4.0334) grad_norm 2.2309 (inf) loss_scale 8192.0000 (13235.4995) mem 7379MB [2024-08-26 04:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1120/1251] eta 0:00:31 lr 0.001000 wd 0.0500 time 0.2429 (0.2424) data time 0.0010 (0.0014) model time 0.2419 (0.2410) loss 4.4268 (4.0338) grad_norm 1.4902 (inf) loss_scale 8192.0000 (13190.5085) mem 7379MB [2024-08-26 04:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1130/1251] eta 0:00:29 lr 
0.001000 wd 0.0500 time 0.2404 (0.2424) data time 0.0011 (0.0014) model time 0.2393 (0.2410) loss 4.1885 (4.0319) grad_norm 1.7058 (inf) loss_scale 8192.0000 (13146.3130) mem 7379MB [2024-08-26 04:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1140/1251] eta 0:00:26 lr 0.001000 wd 0.0500 time 0.2430 (0.2424) data time 0.0008 (0.0014) model time 0.2422 (0.2410) loss 5.0168 (4.0333) grad_norm 2.1159 (inf) loss_scale 8192.0000 (13102.8922) mem 7379MB [2024-08-26 04:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1150/1251] eta 0:00:24 lr 0.001000 wd 0.0500 time 0.2336 (0.2424) data time 0.0011 (0.0014) model time 0.2325 (0.2410) loss 3.8750 (4.0325) grad_norm 2.7382 (inf) loss_scale 8192.0000 (13060.2259) mem 7379MB [2024-08-26 04:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1160/1251] eta 0:00:22 lr 0.001000 wd 0.0500 time 0.2433 (0.2423) data time 0.0009 (0.0014) model time 0.2423 (0.2409) loss 3.9946 (4.0319) grad_norm 1.9667 (inf) loss_scale 8192.0000 (13018.2946) mem 7379MB [2024-08-26 04:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1170/1251] eta 0:00:19 lr 0.001000 wd 0.0500 time 0.2445 (0.2423) data time 0.0010 (0.0014) model time 0.2436 (0.2409) loss 3.2363 (4.0313) grad_norm 1.7054 (inf) loss_scale 8192.0000 (12977.0794) mem 7379MB [2024-08-26 04:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1180/1251] eta 0:00:17 lr 0.001000 wd 0.0500 time 0.2357 (0.2423) data time 0.0009 (0.0014) model time 0.2349 (0.2409) loss 4.9282 (4.0315) grad_norm 4.1872 (inf) loss_scale 8192.0000 (12936.5622) mem 7379MB [2024-08-26 04:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1190/1251] eta 0:00:14 lr 0.001000 wd 0.0500 time 0.2511 (0.2423) data time 0.0009 (0.0014) model time 0.2503 (0.2409) loss 3.9829 (4.0341) grad_norm 2.0762 (inf) loss_scale 8192.0000 (12896.7254) mem 7379MB [2024-08-26 04:12:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1200/1251] eta 0:00:12 lr 0.001000 wd 0.0500 time 0.2474 (0.2423) data time 0.0009 (0.0014) model time 0.2465 (0.2409) loss 3.2479 (4.0338) grad_norm 2.6820 (inf) loss_scale 8192.0000 (12857.5520) mem 7379MB [2024-08-26 04:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1210/1251] eta 0:00:09 lr 0.001000 wd 0.0500 time 0.2351 (0.2424) data time 0.0012 (0.0014) model time 0.2340 (0.2411) loss 2.8519 (4.0328) grad_norm 1.4989 (inf) loss_scale 8192.0000 (12819.0256) mem 7379MB [2024-08-26 04:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1220/1251] eta 0:00:07 lr 0.001000 wd 0.0500 time 0.2404 (0.2425) data time 0.0008 (0.0014) model time 0.2396 (0.2411) loss 3.8968 (4.0334) grad_norm 2.0067 (inf) loss_scale 8192.0000 (12781.1302) mem 7379MB [2024-08-26 04:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1230/1251] eta 0:00:05 lr 0.001000 wd 0.0500 time 0.2282 (0.2424) data time 0.0011 (0.0014) model time 0.2271 (0.2411) loss 4.2726 (4.0325) grad_norm 2.7586 (inf) loss_scale 8192.0000 (12743.8505) mem 7379MB [2024-08-26 04:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1240/1251] eta 0:00:02 lr 0.001000 wd 0.0500 time 0.2241 (0.2424) data time 0.0007 (0.0014) model time 0.2234 (0.2410) loss 4.0413 (4.0329) grad_norm 2.0570 (inf) loss_scale 8192.0000 (12707.1716) mem 7379MB [2024-08-26 04:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [23/300][1250/1251] eta 0:00:00 lr 0.001000 wd 0.0500 time 0.2314 (0.2422) data time 0.0007 (0.0014) model time 0.2307 (0.2409) loss 4.4633 (4.0344) grad_norm 2.7079 (inf) loss_scale 8192.0000 (12671.0791) mem 7379MB [2024-08-26 04:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 23 training takes 0:05:03 [2024-08-26 04:13:11 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 04:13:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 04:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.436 (0.436) Loss 0.7563 (0.7563) Acc@1 84.570 (84.570) Acc@5 97.070 (97.070) Mem 7379MB
[2024-08-26 04:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.111) Loss 1.1494 (1.1129) Acc@1 74.219 (74.583) Acc@5 93.359 (93.510) Mem 7379MB
[2024-08-26 04:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.097) Loss 1.5859 (1.1251) Acc@1 64.648 (74.451) Acc@5 87.598 (93.578) Mem 7379MB
[2024-08-26 04:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.090) Loss 1.9355 (1.2926) Acc@1 56.152 (71.135) Acc@5 81.836 (91.183) Mem 7379MB
[2024-08-26 04:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.9600 (1.3955) Acc@1 54.883 (68.910) Acc@5 82.617 (89.806) Mem 7379MB
[2024-08-26 04:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 68.618 Acc@5 89.604
[2024-08-26 04:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 68.6%
[2024-08-26 04:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 68.62%
[2024-08-26 04:13:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 04:13:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 04:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.447 (0.447) Loss 0.7183 (0.7183) Acc@1 82.324 (82.324) Acc@5 95.215 (95.215) Mem 7379MB
[2024-08-26 04:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.109) Loss 1.1035 (1.1126) Acc@1 74.414 (72.630) Acc@5 92.578 (92.037) Mem 7379MB
[2024-08-26 04:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.094) Loss 1.6318 (1.1211) Acc@1 62.012 (72.414) Acc@5 84.961 (92.099) Mem 7379MB
[2024-08-26 04:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.089) Loss 1.8984 (1.2964) Acc@1 56.836 (68.974) Acc@5 79.590 (89.519) Mem 7379MB
[2024-08-26 04:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.9199 (1.4059) Acc@1 55.859 (66.852) Acc@5 81.348 (88.005) Mem 7379MB
[2024-08-26 04:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 66.680 Acc@5 87.964
[2024-08-26 04:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 66.7%
[2024-08-26 04:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 66.68%
[2024-08-26 04:13:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 04:13:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 04:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][0/1251] eta 0:15:45 lr 0.001000 wd 0.0500 time 0.7560 (0.7560) data time 0.5327 (0.5327) model time 0.0000 (0.0000) loss 4.4272 (4.4272) grad_norm 2.5089 (2.5089) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][10/1251] eta 0:05:58 lr 0.000999 wd 0.0500 time 0.2418 (0.2892) data time 0.0007 (0.0499) model time 0.0000 (0.0000) loss 4.3736 (4.0358) grad_norm 1.6689 (2.7896) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][20/1251] eta 0:05:27 lr 0.000999 wd 0.0500 time 0.2302 (0.2657) data time 0.0009 (0.0266) model time 0.0000 (0.0000) loss 3.0646 (3.8488) grad_norm 2.1930 (2.8103) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][30/1251] eta 0:05:15 lr 0.000999 wd 0.0500 time 0.2507 (0.2586) data time 0.0008 (0.0183) model time 0.0000 (0.0000) loss 3.5657 (3.9330) grad_norm 2.9010 (2.7601) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][40/1251] eta 0:05:08 lr 0.000999 wd 0.0500 time 0.2456 (0.2545) data time 0.0010 (0.0141) model time 0.0000 (0.0000) loss 4.4853 (3.9174) grad_norm 2.3925 (2.6060) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][50/1251] eta 0:05:02 lr 0.000999 wd 0.0500 time 0.2429 (0.2522) data time 0.0009 (0.0115) model time 0.0000 (0.0000) loss 4.0816 (3.9356) grad_norm 1.6793 (2.4887) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][60/1251] eta 0:04:58 lr 0.000999 wd 0.0500 time 0.2467 (0.2503) data time 0.0008 (0.0098) model time 0.2459 (0.2398) loss 
3.0286 (3.9549) grad_norm 2.4033 (2.3802) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][70/1251] eta 0:04:54 lr 0.000999 wd 0.0500 time 0.2392 (0.2493) data time 0.0011 (0.0086) model time 0.2381 (0.2409) loss 3.4382 (3.9825) grad_norm 2.1732 (2.3747) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][80/1251] eta 0:04:50 lr 0.000999 wd 0.0500 time 0.2343 (0.2484) data time 0.0012 (0.0077) model time 0.2330 (0.2409) loss 3.3241 (3.9822) grad_norm 2.4742 (2.3489) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][90/1251] eta 0:04:47 lr 0.000999 wd 0.0500 time 0.2309 (0.2475) data time 0.0009 (0.0070) model time 0.2300 (0.2405) loss 2.5655 (3.9556) grad_norm 1.5982 (2.2888) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][100/1251] eta 0:04:48 lr 0.000999 wd 0.0500 time 0.4347 (0.2510) data time 0.0011 (0.0064) model time 0.4336 (0.2487) loss 3.9960 (3.9629) grad_norm 1.9131 (2.2320) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][110/1251] eta 0:04:47 lr 0.000999 wd 0.0500 time 0.2431 (0.2521) data time 0.0008 (0.0059) model time 0.2424 (0.2510) loss 4.4972 (3.9815) grad_norm 1.8210 (2.1958) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][120/1251] eta 0:04:44 lr 0.000999 wd 0.0500 time 0.2378 (0.2512) data time 0.0012 (0.0055) model time 0.2366 (0.2494) loss 4.2791 (3.9578) grad_norm 1.4126 (2.1718) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][130/1251] eta 0:04:40 lr 
0.000999 wd 0.0500 time 0.2375 (0.2502) data time 0.0009 (0.0051) model time 0.2366 (0.2479) loss 4.6639 (3.9909) grad_norm 1.9796 (2.1541) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][140/1251] eta 0:04:37 lr 0.000999 wd 0.0500 time 0.2370 (0.2495) data time 0.0009 (0.0049) model time 0.2361 (0.2469) loss 2.9321 (3.9838) grad_norm 2.4008 (2.1388) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][150/1251] eta 0:04:34 lr 0.000999 wd 0.0500 time 0.2473 (0.2490) data time 0.0011 (0.0046) model time 0.2462 (0.2463) loss 4.4972 (3.9875) grad_norm 1.4332 (2.1339) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][160/1251] eta 0:04:31 lr 0.000999 wd 0.0500 time 0.2442 (0.2485) data time 0.0007 (0.0044) model time 0.2435 (0.2457) loss 4.5575 (3.9838) grad_norm 1.8369 (2.1209) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][170/1251] eta 0:04:28 lr 0.000999 wd 0.0500 time 0.2465 (0.2481) data time 0.0010 (0.0042) model time 0.2455 (0.2454) loss 4.2028 (4.0067) grad_norm 1.8487 (2.1161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][180/1251] eta 0:04:25 lr 0.000999 wd 0.0500 time 0.2427 (0.2479) data time 0.0008 (0.0040) model time 0.2419 (0.2452) loss 3.8997 (4.0355) grad_norm 2.1843 (2.1029) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][190/1251] eta 0:04:22 lr 0.000999 wd 0.0500 time 0.2402 (0.2475) data time 0.0010 (0.0038) model time 0.2392 (0.2447) loss 4.4035 (4.0433) grad_norm 1.6500 (2.1029) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][200/1251] eta 0:04:19 lr 0.000999 wd 0.0500 time 0.2390 (0.2472) data time 0.0007 (0.0037) model time 0.2383 (0.2444) loss 2.5921 (4.0105) grad_norm 1.7862 (2.0939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][210/1251] eta 0:04:16 lr 0.000999 wd 0.0500 time 0.2419 (0.2468) data time 0.0007 (0.0036) model time 0.2411 (0.2441) loss 3.4687 (4.0111) grad_norm 1.8823 (2.0861) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][220/1251] eta 0:04:14 lr 0.000999 wd 0.0500 time 0.2472 (0.2466) data time 0.0009 (0.0035) model time 0.2463 (0.2438) loss 4.2820 (4.0127) grad_norm 3.8006 (2.1022) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][230/1251] eta 0:04:11 lr 0.000999 wd 0.0500 time 0.2451 (0.2464) data time 0.0010 (0.0034) model time 0.2440 (0.2437) loss 4.1431 (4.0141) grad_norm 1.5290 (2.0974) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][240/1251] eta 0:04:08 lr 0.000999 wd 0.0500 time 0.2325 (0.2461) data time 0.0010 (0.0033) model time 0.2315 (0.2435) loss 4.2135 (4.0209) grad_norm 1.6251 (2.0926) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][250/1251] eta 0:04:06 lr 0.000999 wd 0.0500 time 0.2379 (0.2460) data time 0.0009 (0.0032) model time 0.2369 (0.2434) loss 4.1778 (4.0267) grad_norm 1.2648 (2.0885) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][260/1251] eta 0:04:03 lr 0.000999 wd 0.0500 time 0.2395 (0.2458) data time 0.0010 (0.0031) model time 0.2385 (0.2432) loss 2.8312 
(4.0178) grad_norm 1.8961 (2.0798) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][270/1251] eta 0:04:00 lr 0.000999 wd 0.0500 time 0.2481 (0.2456) data time 0.0008 (0.0030) model time 0.2473 (0.2430) loss 4.3695 (4.0238) grad_norm 1.9745 (2.0768) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][280/1251] eta 0:03:58 lr 0.000999 wd 0.0500 time 0.2353 (0.2454) data time 0.0010 (0.0030) model time 0.2343 (0.2428) loss 3.7142 (4.0261) grad_norm 2.0106 (2.0770) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][290/1251] eta 0:03:55 lr 0.000999 wd 0.0500 time 0.2388 (0.2453) data time 0.0008 (0.0029) model time 0.2381 (0.2428) loss 3.4758 (4.0254) grad_norm 1.6768 (2.0822) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][300/1251] eta 0:03:53 lr 0.000999 wd 0.0500 time 0.2378 (0.2451) data time 0.0010 (0.0028) model time 0.2368 (0.2426) loss 4.4240 (4.0277) grad_norm 2.0072 (2.0846) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][310/1251] eta 0:03:50 lr 0.000999 wd 0.0500 time 0.2354 (0.2449) data time 0.0008 (0.0028) model time 0.2345 (0.2424) loss 4.8049 (4.0298) grad_norm 2.0215 (2.0833) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][320/1251] eta 0:03:47 lr 0.000999 wd 0.0500 time 0.2482 (0.2449) data time 0.0007 (0.0027) model time 0.2475 (0.2424) loss 4.0916 (4.0319) grad_norm 1.7442 (2.0759) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][330/1251] eta 0:03:45 lr 0.000999 wd 
0.0500 time 0.2381 (0.2447) data time 0.0011 (0.0027) model time 0.2370 (0.2423) loss 3.8411 (4.0298) grad_norm 1.3197 (2.0743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][340/1251] eta 0:03:42 lr 0.000999 wd 0.0500 time 0.2455 (0.2447) data time 0.0007 (0.0026) model time 0.2447 (0.2423) loss 4.7036 (4.0208) grad_norm 1.6536 (2.0656) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][350/1251] eta 0:03:40 lr 0.000999 wd 0.0500 time 0.2484 (0.2446) data time 0.0010 (0.0026) model time 0.2474 (0.2423) loss 4.1419 (4.0236) grad_norm 2.7004 (2.0627) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][360/1251] eta 0:03:37 lr 0.000999 wd 0.0500 time 0.2342 (0.2446) data time 0.0011 (0.0025) model time 0.2331 (0.2423) loss 3.6102 (4.0201) grad_norm 1.6448 (2.0639) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][370/1251] eta 0:03:35 lr 0.000999 wd 0.0500 time 0.2444 (0.2445) data time 0.0007 (0.0025) model time 0.2437 (0.2422) loss 4.4575 (4.0262) grad_norm 3.6260 (2.0711) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][380/1251] eta 0:03:33 lr 0.000999 wd 0.0500 time 0.2430 (0.2449) data time 0.0010 (0.0024) model time 0.2421 (0.2427) loss 3.8468 (4.0293) grad_norm 2.1106 (2.0688) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][390/1251] eta 0:03:30 lr 0.000999 wd 0.0500 time 0.2364 (0.2448) data time 0.0007 (0.0024) model time 0.2357 (0.2426) loss 4.9416 (4.0340) grad_norm 1.2512 (2.0577) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:14:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][400/1251] eta 0:03:28 lr 0.000999 wd 0.0500 time 0.2398 (0.2447) data time 0.0008 (0.0024) model time 0.2391 (0.2425) loss 4.6583 (4.0322) grad_norm 1.4382 (2.0571) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][410/1251] eta 0:03:25 lr 0.000999 wd 0.0500 time 0.2341 (0.2446) data time 0.0009 (0.0023) model time 0.2332 (0.2425) loss 5.0922 (4.0274) grad_norm 3.8096 (2.0615) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][420/1251] eta 0:03:23 lr 0.000999 wd 0.0500 time 0.2417 (0.2445) data time 0.0009 (0.0023) model time 0.2408 (0.2424) loss 3.6915 (4.0212) grad_norm 1.3328 (2.0650) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][430/1251] eta 0:03:20 lr 0.000999 wd 0.0500 time 0.2405 (0.2445) data time 0.0011 (0.0023) model time 0.2394 (0.2424) loss 4.1794 (4.0177) grad_norm 2.3302 (2.0595) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][440/1251] eta 0:03:18 lr 0.000999 wd 0.0500 time 0.2308 (0.2444) data time 0.0007 (0.0023) model time 0.2300 (0.2423) loss 4.2898 (4.0130) grad_norm 2.4248 (2.0568) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][450/1251] eta 0:03:15 lr 0.000999 wd 0.0500 time 0.2405 (0.2443) data time 0.0011 (0.0022) model time 0.2394 (0.2423) loss 4.0539 (4.0071) grad_norm 3.8508 (2.0603) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][460/1251] eta 0:03:13 lr 0.000999 wd 0.0500 time 0.2554 (0.2443) data time 0.0011 (0.0022) model time 0.2543 (0.2423) loss 4.1864 
(4.0111) grad_norm 1.3505 (2.0747) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][470/1251] eta 0:03:10 lr 0.000999 wd 0.0500 time 0.2427 (0.2443) data time 0.0010 (0.0022) model time 0.2417 (0.2423) loss 4.2259 (4.0122) grad_norm 1.7125 (2.0753) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][480/1251] eta 0:03:08 lr 0.000999 wd 0.0500 time 0.2393 (0.2442) data time 0.0011 (0.0022) model time 0.2382 (0.2422) loss 3.9606 (4.0161) grad_norm 2.0362 (2.0802) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][490/1251] eta 0:03:05 lr 0.000999 wd 0.0500 time 0.2336 (0.2441) data time 0.0010 (0.0021) model time 0.2326 (0.2421) loss 4.5170 (4.0145) grad_norm 1.9106 (2.0841) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][500/1251] eta 0:03:03 lr 0.000999 wd 0.0500 time 0.2410 (0.2441) data time 0.0010 (0.0021) model time 0.2400 (0.2421) loss 3.9321 (4.0159) grad_norm 2.4606 (2.0920) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][510/1251] eta 0:03:00 lr 0.000999 wd 0.0500 time 0.2458 (0.2440) data time 0.0007 (0.0021) model time 0.2451 (0.2420) loss 3.7151 (4.0110) grad_norm 2.1792 (2.0914) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][520/1251] eta 0:02:58 lr 0.000999 wd 0.0500 time 0.2449 (0.2439) data time 0.0010 (0.0021) model time 0.2439 (0.2420) loss 4.1375 (4.0049) grad_norm 2.0404 (2.0981) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][530/1251] eta 0:02:55 lr 0.000999 wd 
0.0500 time 0.2377 (0.2439) data time 0.0008 (0.0020) model time 0.2369 (0.2420) loss 2.5753 (3.9995) grad_norm 3.3346 (2.0936) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][540/1251] eta 0:02:53 lr 0.000999 wd 0.0500 time 0.2494 (0.2439) data time 0.0007 (0.0021) model time 0.2487 (0.2419) loss 4.5852 (4.0023) grad_norm 4.0563 (2.0927) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][550/1251] eta 0:02:50 lr 0.000999 wd 0.0500 time 0.2423 (0.2439) data time 0.0009 (0.0021) model time 0.2413 (0.2419) loss 4.6524 (4.0068) grad_norm 1.3012 (2.0944) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][560/1251] eta 0:02:48 lr 0.000999 wd 0.0500 time 0.2440 (0.2439) data time 0.0009 (0.0020) model time 0.2432 (0.2419) loss 3.9387 (4.0077) grad_norm 2.1107 (2.0978) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][570/1251] eta 0:02:46 lr 0.000999 wd 0.0500 time 0.2379 (0.2438) data time 0.0009 (0.0020) model time 0.2369 (0.2419) loss 4.1507 (4.0138) grad_norm 2.4785 (2.1036) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][580/1251] eta 0:02:43 lr 0.000999 wd 0.0500 time 0.2458 (0.2438) data time 0.0010 (0.0020) model time 0.2448 (0.2418) loss 4.3815 (4.0121) grad_norm 2.2323 (2.1175) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][590/1251] eta 0:02:41 lr 0.000999 wd 0.0500 time 0.2370 (0.2437) data time 0.0010 (0.0020) model time 0.2360 (0.2418) loss 3.7845 (4.0129) grad_norm 1.8543 (2.1230) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:47 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][600/1251] eta 0:02:38 lr 0.000999 wd 0.0500 time 0.2451 (0.2437) data time 0.0009 (0.0020) model time 0.2443 (0.2418) loss 4.0042 (4.0183) grad_norm 1.3750 (2.1308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][610/1251] eta 0:02:36 lr 0.000999 wd 0.0500 time 0.2486 (0.2436) data time 0.0008 (0.0020) model time 0.2479 (0.2418) loss 4.8622 (4.0230) grad_norm 1.4115 (2.1272) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][620/1251] eta 0:02:33 lr 0.000999 wd 0.0500 time 0.2415 (0.2436) data time 0.0011 (0.0020) model time 0.2404 (0.2418) loss 3.8383 (4.0218) grad_norm 2.0664 (2.1287) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][630/1251] eta 0:02:31 lr 0.000999 wd 0.0500 time 0.2384 (0.2436) data time 0.0009 (0.0019) model time 0.2375 (0.2418) loss 3.3656 (4.0236) grad_norm 2.2960 (2.1234) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][640/1251] eta 0:02:28 lr 0.000999 wd 0.0500 time 0.2433 (0.2436) data time 0.0010 (0.0019) model time 0.2423 (0.2417) loss 3.7781 (4.0219) grad_norm 1.4296 (2.1196) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][650/1251] eta 0:02:26 lr 0.000999 wd 0.0500 time 0.2414 (0.2436) data time 0.0010 (0.0019) model time 0.2404 (0.2418) loss 4.3020 (4.0237) grad_norm 1.6762 (2.1226) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][660/1251] eta 0:02:23 lr 0.000999 wd 0.0500 time 0.2402 (0.2435) data time 0.0007 (0.0019) model time 0.2395 (0.2417) loss 2.8822 
(4.0197) grad_norm 2.9698 (2.1251) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][670/1251] eta 0:02:21 lr 0.000999 wd 0.0500 time 0.2487 (0.2436) data time 0.0012 (0.0019) model time 0.2475 (0.2418) loss 4.1163 (4.0208) grad_norm 1.9975 (2.1197) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][680/1251] eta 0:02:19 lr 0.000999 wd 0.0500 time 0.2352 (0.2435) data time 0.0009 (0.0019) model time 0.2343 (0.2417) loss 3.7362 (4.0255) grad_norm 2.0958 (2.1184) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][690/1251] eta 0:02:16 lr 0.000999 wd 0.0500 time 0.2403 (0.2436) data time 0.0007 (0.0019) model time 0.2396 (0.2418) loss 3.0310 (4.0233) grad_norm 2.3711 (2.1154) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][700/1251] eta 0:02:14 lr 0.000999 wd 0.0500 time 0.2441 (0.2436) data time 0.0008 (0.0019) model time 0.2433 (0.2418) loss 3.5694 (4.0258) grad_norm 2.9964 (2.1169) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][710/1251] eta 0:02:11 lr 0.000999 wd 0.0500 time 0.2429 (0.2436) data time 0.0010 (0.0019) model time 0.2419 (0.2418) loss 4.1601 (4.0268) grad_norm 1.6717 (2.1190) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][720/1251] eta 0:02:09 lr 0.000999 wd 0.0500 time 0.2431 (0.2436) data time 0.0011 (0.0019) model time 0.2420 (0.2418) loss 4.8497 (4.0267) grad_norm 1.6625 (2.1205) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][730/1251] eta 0:02:06 lr 0.000999 wd 
0.0500 time 0.2481 (0.2436) data time 0.0007 (0.0019) model time 0.2473 (0.2418) loss 3.4095 (4.0228) grad_norm 1.5711 (2.1192) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][740/1251] eta 0:02:04 lr 0.000999 wd 0.0500 time 0.2423 (0.2436) data time 0.0007 (0.0018) model time 0.2416 (0.2418) loss 4.5901 (4.0194) grad_norm 2.7518 (2.1235) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][750/1251] eta 0:02:02 lr 0.000999 wd 0.0500 time 0.2375 (0.2436) data time 0.0010 (0.0018) model time 0.2365 (0.2418) loss 4.0346 (4.0148) grad_norm 1.6183 (2.1206) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][760/1251] eta 0:01:59 lr 0.000999 wd 0.0500 time 0.2416 (0.2435) data time 0.0007 (0.0018) model time 0.2409 (0.2418) loss 3.5694 (4.0136) grad_norm 3.1778 (2.1218) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][770/1251] eta 0:01:57 lr 0.000999 wd 0.0500 time 0.2417 (0.2435) data time 0.0007 (0.0018) model time 0.2410 (0.2418) loss 4.1078 (4.0130) grad_norm 1.9104 (2.1211) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][780/1251] eta 0:01:54 lr 0.000999 wd 0.0500 time 0.2444 (0.2435) data time 0.0010 (0.0018) model time 0.2434 (0.2417) loss 3.9800 (4.0094) grad_norm 2.0920 (2.1192) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][790/1251] eta 0:01:52 lr 0.000999 wd 0.0500 time 0.2343 (0.2434) data time 0.0010 (0.0018) model time 0.2333 (0.2417) loss 4.1091 (4.0098) grad_norm 1.9600 (2.1174) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][800/1251] eta 0:01:49 lr 0.000999 wd 0.0500 time 0.2395 (0.2434) data time 0.0010 (0.0018) model time 0.2385 (0.2417) loss 3.3153 (4.0073) grad_norm 2.7599 (2.1172) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][810/1251] eta 0:01:47 lr 0.000999 wd 0.0500 time 0.2430 (0.2434) data time 0.0008 (0.0018) model time 0.2422 (0.2417) loss 3.6831 (4.0082) grad_norm 3.1415 (2.1155) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][820/1251] eta 0:01:44 lr 0.000999 wd 0.0500 time 0.2442 (0.2434) data time 0.0007 (0.0018) model time 0.2434 (0.2417) loss 3.1826 (4.0091) grad_norm 1.9766 (2.1163) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][830/1251] eta 0:01:42 lr 0.000999 wd 0.0500 time 0.2441 (0.2434) data time 0.0009 (0.0018) model time 0.2432 (0.2417) loss 4.2223 (4.0075) grad_norm 1.4296 (2.1158) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][840/1251] eta 0:01:40 lr 0.000999 wd 0.0500 time 0.2350 (0.2434) data time 0.0009 (0.0017) model time 0.2340 (0.2417) loss 3.7294 (4.0072) grad_norm 2.0991 (2.1104) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][850/1251] eta 0:01:37 lr 0.000999 wd 0.0500 time 0.2389 (0.2434) data time 0.0008 (0.0017) model time 0.2381 (0.2417) loss 4.9140 (4.0100) grad_norm 2.2840 (2.1092) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][860/1251] eta 0:01:35 lr 0.000999 wd 0.0500 time 0.2413 (0.2434) data time 0.0011 (0.0017) model time 0.2402 (0.2417) loss 3.6675 
(4.0111) grad_norm 3.5055 (2.1128) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][870/1251] eta 0:01:32 lr 0.000999 wd 0.0500 time 0.2406 (0.2433) data time 0.0010 (0.0017) model time 0.2396 (0.2417) loss 4.6253 (4.0138) grad_norm 1.7968 (2.1154) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][880/1251] eta 0:01:30 lr 0.000999 wd 0.0500 time 0.2412 (0.2433) data time 0.0011 (0.0017) model time 0.2400 (0.2417) loss 3.5638 (4.0113) grad_norm 2.0849 (2.1180) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][890/1251] eta 0:01:27 lr 0.000999 wd 0.0500 time 0.2434 (0.2433) data time 0.0007 (0.0017) model time 0.2427 (0.2417) loss 4.4319 (4.0117) grad_norm 1.6375 (2.1178) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][900/1251] eta 0:01:25 lr 0.000999 wd 0.0500 time 0.2430 (0.2433) data time 0.0010 (0.0017) model time 0.2420 (0.2416) loss 4.2804 (4.0139) grad_norm 2.7906 (2.1200) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][910/1251] eta 0:01:22 lr 0.000999 wd 0.0500 time 0.2409 (0.2433) data time 0.0009 (0.0017) model time 0.2400 (0.2417) loss 3.7412 (4.0115) grad_norm 2.0859 (2.1186) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][920/1251] eta 0:01:20 lr 0.000999 wd 0.0500 time 0.2384 (0.2432) data time 0.0013 (0.0017) model time 0.2371 (0.2416) loss 3.9795 (4.0112) grad_norm 1.7666 (2.1161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][930/1251] eta 0:01:18 lr 0.000999 wd 
0.0500 time 0.2374 (0.2433) data time 0.0009 (0.0017) model time 0.2365 (0.2416) loss 4.8059 (4.0141) grad_norm 1.9790 (2.1127) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][940/1251] eta 0:01:15 lr 0.000999 wd 0.0500 time 0.2433 (0.2433) data time 0.0009 (0.0017) model time 0.2425 (0.2416) loss 3.9338 (4.0101) grad_norm 1.4565 (2.1096) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][950/1251] eta 0:01:13 lr 0.000999 wd 0.0500 time 0.2413 (0.2433) data time 0.0011 (0.0017) model time 0.2403 (0.2417) loss 3.8275 (4.0115) grad_norm 2.2730 (2.1083) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][960/1251] eta 0:01:10 lr 0.000999 wd 0.0500 time 0.2415 (0.2432) data time 0.0009 (0.0017) model time 0.2406 (0.2416) loss 4.1509 (4.0102) grad_norm 3.2716 (2.1103) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][970/1251] eta 0:01:08 lr 0.000999 wd 0.0500 time 0.2432 (0.2432) data time 0.0007 (0.0016) model time 0.2425 (0.2416) loss 3.9754 (4.0090) grad_norm 1.7994 (2.1190) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][980/1251] eta 0:01:05 lr 0.000999 wd 0.0500 time 0.2453 (0.2432) data time 0.0010 (0.0016) model time 0.2444 (0.2417) loss 3.4197 (4.0059) grad_norm 3.0212 (2.1212) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][990/1251] eta 0:01:03 lr 0.000999 wd 0.0500 time 0.2474 (0.2432) data time 0.0007 (0.0016) model time 0.2467 (0.2416) loss 5.2427 (4.0066) grad_norm 2.2659 (2.1258) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:24 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1000/1251] eta 0:01:01 lr 0.000999 wd 0.0500 time 0.2387 (0.2432) data time 0.0011 (0.0016) model time 0.2376 (0.2416) loss 4.0994 (4.0052) grad_norm 2.0560 (2.1225) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1010/1251] eta 0:00:58 lr 0.000999 wd 0.0500 time 0.2427 (0.2432) data time 0.0012 (0.0016) model time 0.2414 (0.2416) loss 3.8533 (4.0070) grad_norm 2.1887 (2.1195) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1020/1251] eta 0:00:56 lr 0.000999 wd 0.0500 time 0.2425 (0.2432) data time 0.0010 (0.0016) model time 0.2416 (0.2416) loss 3.5541 (4.0086) grad_norm 4.2382 (2.1229) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1030/1251] eta 0:00:53 lr 0.000999 wd 0.0500 time 0.4380 (0.2436) data time 0.0012 (0.0016) model time 0.4369 (0.2420) loss 4.1915 (4.0097) grad_norm 2.5358 (2.1291) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1040/1251] eta 0:00:51 lr 0.000999 wd 0.0500 time 0.2518 (0.2438) data time 0.0009 (0.0016) model time 0.2509 (0.2422) loss 3.7440 (4.0093) grad_norm 1.4448 (2.1300) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1050/1251] eta 0:00:48 lr 0.000999 wd 0.0500 time 0.2503 (0.2437) data time 0.0009 (0.0016) model time 0.2494 (0.2422) loss 2.6403 (4.0034) grad_norm 1.4071 (2.1254) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1060/1251] eta 0:00:46 lr 0.000999 wd 0.0500 time 0.2415 (0.2437) data time 0.0009 (0.0016) model time 0.2406 (0.2422) loss 2.7304 
(4.0028) grad_norm 1.5093 (2.1243) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1070/1251] eta 0:00:44 lr 0.000999 wd 0.0500 time 0.2343 (0.2437) data time 0.0007 (0.0016) model time 0.2336 (0.2422) loss 3.0121 (4.0017) grad_norm 1.9097 (2.1210) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1080/1251] eta 0:00:41 lr 0.000999 wd 0.0500 time 0.2495 (0.2437) data time 0.0011 (0.0016) model time 0.2484 (0.2422) loss 4.0694 (4.0034) grad_norm 2.3063 (2.1246) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1090/1251] eta 0:00:39 lr 0.000999 wd 0.0500 time 0.2378 (0.2437) data time 0.0011 (0.0016) model time 0.2367 (0.2422) loss 4.2201 (4.0048) grad_norm 2.8338 (2.1277) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1100/1251] eta 0:00:36 lr 0.000999 wd 0.0500 time 0.2519 (0.2436) data time 0.0008 (0.0016) model time 0.2511 (0.2421) loss 2.6783 (4.0053) grad_norm 1.7607 (2.1275) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1110/1251] eta 0:00:34 lr 0.000999 wd 0.0500 time 0.2427 (0.2436) data time 0.0008 (0.0016) model time 0.2419 (0.2421) loss 4.5113 (4.0047) grad_norm 2.3043 (2.1278) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1120/1251] eta 0:00:31 lr 0.000999 wd 0.0500 time 0.2470 (0.2436) data time 0.0008 (0.0016) model time 0.2463 (0.2421) loss 4.6720 (4.0067) grad_norm 1.3845 (2.1252) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1130/1251] eta 0:00:29 lr 
0.000999 wd 0.0500 time 0.2400 (0.2436) data time 0.0011 (0.0016) model time 0.2389 (0.2421) loss 4.0820 (4.0058) grad_norm 1.6019 (2.1237) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1140/1251] eta 0:00:27 lr 0.000999 wd 0.0500 time 0.2449 (0.2435) data time 0.0007 (0.0016) model time 0.2441 (0.2420) loss 4.4092 (4.0037) grad_norm 1.7986 (2.1221) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1150/1251] eta 0:00:24 lr 0.000999 wd 0.0500 time 0.2513 (0.2435) data time 0.0008 (0.0016) model time 0.2506 (0.2420) loss 4.4050 (4.0048) grad_norm 2.3664 (2.1226) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1160/1251] eta 0:00:22 lr 0.000999 wd 0.0500 time 0.2377 (0.2435) data time 0.0009 (0.0016) model time 0.2368 (0.2420) loss 3.3953 (3.9996) grad_norm 1.9452 (2.1191) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1170/1251] eta 0:00:19 lr 0.000999 wd 0.0500 time 0.2493 (0.2436) data time 0.0010 (0.0015) model time 0.2483 (0.2421) loss 3.1996 (4.0014) grad_norm 2.0513 (2.1172) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1180/1251] eta 0:00:17 lr 0.000999 wd 0.0500 time 0.2405 (0.2435) data time 0.0010 (0.0015) model time 0.2395 (0.2421) loss 4.1062 (4.0003) grad_norm 1.8074 (2.1166) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1190/1251] eta 0:00:14 lr 0.000999 wd 0.0500 time 0.2403 (0.2435) data time 0.0009 (0.0015) model time 0.2393 (0.2421) loss 2.6310 (3.9992) grad_norm 1.4681 (2.1161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
04:18:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1200/1251] eta 0:00:12 lr 0.000999 wd 0.0500 time 0.2459 (0.2435) data time 0.0007 (0.0015) model time 0.2452 (0.2420) loss 4.0275 (3.9998) grad_norm 2.1160 (2.1182) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1210/1251] eta 0:00:09 lr 0.000999 wd 0.0500 time 0.2401 (0.2435) data time 0.0008 (0.0015) model time 0.2393 (0.2420) loss 4.8676 (3.9980) grad_norm 1.8239 (2.1245) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1220/1251] eta 0:00:07 lr 0.000999 wd 0.0500 time 0.2432 (0.2435) data time 0.0010 (0.0015) model time 0.2422 (0.2420) loss 4.0165 (3.9967) grad_norm 2.7925 (2.1259) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1230/1251] eta 0:00:05 lr 0.000999 wd 0.0500 time 0.2453 (0.2434) data time 0.0008 (0.0015) model time 0.2445 (0.2420) loss 3.7482 (3.9964) grad_norm 6.1066 (2.1338) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1240/1251] eta 0:00:02 lr 0.000999 wd 0.0500 time 0.2317 (0.2434) data time 0.0007 (0.0015) model time 0.2310 (0.2419) loss 4.5849 (3.9970) grad_norm 1.4710 (2.1309) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [24/300][1250/1251] eta 0:00:00 lr 0.000999 wd 0.0500 time 0.2333 (0.2432) data time 0.0007 (0.0015) model time 0.2325 (0.2418) loss 3.3105 (3.9958) grad_norm 1.9875 (2.1316) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 24 training takes 0:05:04 [2024-08-26 04:18:25 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:18:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.418 (0.418) Loss 0.7544 (0.7544) Acc@1 84.961 (84.961) Acc@5 95.898 (95.898) Mem 7379MB [2024-08-26 04:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.111) Loss 1.1367 (1.0684) Acc@1 72.754 (75.684) Acc@5 94.238 (93.963) Mem 7379MB [2024-08-26 04:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.097) Loss 1.5762 (1.1167) Acc@1 64.648 (74.475) Acc@5 87.109 (93.601) Mem 7379MB [2024-08-26 04:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.9033 (1.2825) Acc@1 58.789 (71.270) Acc@5 80.176 (91.164) Mem 7379MB [2024-08-26 04:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.7842 (1.3784) Acc@1 61.328 (69.310) Acc@5 83.984 (89.837) Mem 7379MB [2024-08-26 04:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 68.966 Acc@5 89.630 [2024-08-26 04:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 69.0% [2024-08-26 04:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 68.97% [2024-08-26 04:18:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:18:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.427 (0.427) Loss 0.6787 (0.6787) Acc@1 82.910 (82.910) Acc@5 95.508 (95.508) Mem 7379MB [2024-08-26 04:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.113) Loss 1.0635 (1.0639) Acc@1 75.098 (73.668) Acc@5 92.773 (92.480) Mem 7379MB [2024-08-26 04:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.096) Loss 1.5693 (1.0740) Acc@1 62.305 (73.312) Acc@5 85.938 (92.541) Mem 7379MB [2024-08-26 04:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.089) Loss 1.8525 (1.2453) Acc@1 57.129 (69.922) Acc@5 80.078 (90.058) Mem 7379MB [2024-08-26 04:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.8574 (1.3516) Acc@1 56.738 (67.847) Acc@5 81.934 (88.572) Mem 7379MB [2024-08-26 04:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 67.686 Acc@5 88.546 [2024-08-26 04:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 67.7% [2024-08-26 04:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 67.69% [2024-08-26 04:18:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:18:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][0/1251] eta 0:14:39 lr 0.000999 wd 0.0500 time 0.7032 (0.7032) data time 0.4768 (0.4768) model time 0.0000 (0.0000) loss 4.5331 (4.5331) grad_norm 2.3393 (2.3393) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][10/1251] eta 0:05:50 lr 0.000999 wd 0.0500 time 0.2403 (0.2820) data time 0.0009 (0.0442) model time 0.0000 (0.0000) loss 5.0318 (3.8455) grad_norm 3.7519 (2.3149) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][20/1251] eta 0:05:23 lr 0.000999 wd 0.0500 time 0.2456 (0.2630) data time 0.0010 (0.0237) model time 0.0000 (0.0000) loss 3.8429 (3.9431) grad_norm 1.9522 (2.2147) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][30/1251] eta 0:05:11 lr 0.000999 wd 0.0500 time 0.2425 (0.2554) data time 0.0014 (0.0164) model time 0.0000 (0.0000) loss 3.8612 (3.9437) grad_norm 2.1548 (2.1473) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][40/1251] eta 0:05:05 lr 0.000999 wd 0.0500 time 0.2419 (0.2521) data time 0.0009 (0.0127) model time 0.0000 (0.0000) loss 4.3668 (3.9091) grad_norm 1.8876 (2.1652) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][50/1251] eta 0:04:59 lr 0.000999 wd 0.0500 time 0.2369 (0.2496) data time 0.0009 (0.0104) model time 0.0000 (0.0000) loss 3.5542 (3.8837) grad_norm 1.6901 (2.1636) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][60/1251] eta 0:04:55 lr 0.000999 wd 0.0500 time 0.2434 (0.2479) data time 0.0010 (0.0088) model time 0.2423 (0.2383) loss 
4.4328 (3.8486) grad_norm 1.9951 (2.1652) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][70/1251] eta 0:04:51 lr 0.000999 wd 0.0500 time 0.2421 (0.2471) data time 0.0010 (0.0078) model time 0.2411 (0.2393) loss 2.7070 (3.8438) grad_norm 2.3583 (2.2027) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][80/1251] eta 0:04:48 lr 0.000999 wd 0.0500 time 0.2395 (0.2464) data time 0.0010 (0.0069) model time 0.2385 (0.2397) loss 3.3230 (3.8606) grad_norm 1.6445 (2.1517) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][90/1251] eta 0:04:45 lr 0.000999 wd 0.0500 time 0.2433 (0.2458) data time 0.0010 (0.0063) model time 0.2423 (0.2399) loss 4.0310 (3.8775) grad_norm 1.7329 (2.1031) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][100/1251] eta 0:04:42 lr 0.000999 wd 0.0500 time 0.2508 (0.2454) data time 0.0007 (0.0058) model time 0.2501 (0.2401) loss 4.6395 (3.8622) grad_norm 1.9774 (2.0775) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][110/1251] eta 0:04:39 lr 0.000999 wd 0.0500 time 0.2512 (0.2450) data time 0.0011 (0.0054) model time 0.2501 (0.2401) loss 4.5121 (3.8558) grad_norm 3.2795 (2.0907) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][120/1251] eta 0:04:36 lr 0.000999 wd 0.0500 time 0.2441 (0.2447) data time 0.0010 (0.0050) model time 0.2430 (0.2401) loss 4.1916 (3.8762) grad_norm 3.4513 (2.1044) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][130/1251] eta 0:04:34 lr 
0.000999 wd 0.0500 time 0.2406 (0.2446) data time 0.0010 (0.0047) model time 0.2397 (0.2404) loss 2.5482 (3.8448) grad_norm 2.3223 (2.1205) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][140/1251] eta 0:04:31 lr 0.000999 wd 0.0500 time 0.2429 (0.2444) data time 0.0009 (0.0044) model time 0.2420 (0.2404) loss 4.7383 (3.8596) grad_norm 1.4146 (2.1136) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][150/1251] eta 0:04:28 lr 0.000999 wd 0.0500 time 0.2389 (0.2441) data time 0.0009 (0.0042) model time 0.2379 (0.2402) loss 4.0211 (3.8712) grad_norm 1.8354 (2.1551) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][160/1251] eta 0:04:26 lr 0.000999 wd 0.0500 time 0.2404 (0.2440) data time 0.0007 (0.0040) model time 0.2397 (0.2404) loss 3.6393 (3.8654) grad_norm 2.8658 (2.1732) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][170/1251] eta 0:04:23 lr 0.000999 wd 0.0500 time 0.2382 (0.2438) data time 0.0007 (0.0038) model time 0.2375 (0.2402) loss 4.9475 (3.8755) grad_norm 1.8880 (2.1646) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][180/1251] eta 0:04:21 lr 0.000999 wd 0.0500 time 0.2409 (0.2437) data time 0.0009 (0.0037) model time 0.2400 (0.2403) loss 3.5522 (3.8720) grad_norm 1.2786 (2.1529) loss_scale 16384.0000 (8418.2983) mem 7379MB [2024-08-26 04:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][190/1251] eta 0:04:18 lr 0.000999 wd 0.0500 time 0.2491 (0.2438) data time 0.0012 (0.0035) model time 0.2478 (0.2406) loss 4.7260 (3.8887) grad_norm 1.6615 (2.1428) loss_scale 16384.0000 (8835.3508) mem 7379MB [2024-08-26 
04:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][200/1251] eta 0:04:16 lr 0.000999 wd 0.0500 time 0.2432 (0.2437) data time 0.0007 (0.0034) model time 0.2425 (0.2406) loss 4.6057 (3.9031) grad_norm 1.5173 (2.1338) loss_scale 16384.0000 (9210.9055) mem 7379MB [2024-08-26 04:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][210/1251] eta 0:04:13 lr 0.000999 wd 0.0500 time 0.2338 (0.2437) data time 0.0007 (0.0033) model time 0.2330 (0.2407) loss 4.3162 (3.9271) grad_norm 4.2258 (2.1388) loss_scale 16384.0000 (9550.8626) mem 7379MB [2024-08-26 04:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][220/1251] eta 0:04:11 lr 0.000999 wd 0.0500 time 0.2581 (0.2437) data time 0.0009 (0.0032) model time 0.2573 (0.2409) loss 4.4796 (3.9358) grad_norm 1.7630 (2.1514) loss_scale 16384.0000 (9860.0543) mem 7379MB [2024-08-26 04:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][230/1251] eta 0:04:08 lr 0.000999 wd 0.0500 time 0.2389 (0.2435) data time 0.0007 (0.0031) model time 0.2382 (0.2407) loss 3.3654 (3.9333) grad_norm 3.0046 (2.1526) loss_scale 16384.0000 (10142.4762) mem 7379MB [2024-08-26 04:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][240/1251] eta 0:04:06 lr 0.000999 wd 0.0500 time 0.2441 (0.2434) data time 0.0009 (0.0030) model time 0.2432 (0.2407) loss 4.2131 (3.9233) grad_norm 1.7733 (2.1621) loss_scale 16384.0000 (10401.4606) mem 7379MB [2024-08-26 04:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][250/1251] eta 0:04:03 lr 0.000999 wd 0.0500 time 0.2397 (0.2433) data time 0.0010 (0.0029) model time 0.2387 (0.2407) loss 4.0650 (3.9241) grad_norm 3.3774 (2.1741) loss_scale 16384.0000 (10639.8088) mem 7379MB [2024-08-26 04:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][260/1251] eta 0:04:01 lr 0.000999 wd 0.0500 time 0.2417 (0.2432) data time 0.0012 (0.0029) model time 0.2404 (0.2406) 
loss 4.0538 (3.9159) grad_norm 1.5102 (2.1635) loss_scale 16384.0000 (10859.8927) mem 7379MB [2024-08-26 04:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][270/1251] eta 0:03:58 lr 0.000999 wd 0.0500 time 0.2394 (0.2431) data time 0.0010 (0.0028) model time 0.2384 (0.2406) loss 4.5322 (3.9164) grad_norm 1.3695 (2.1603) loss_scale 16384.0000 (11063.7343) mem 7379MB [2024-08-26 04:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][280/1251] eta 0:03:55 lr 0.000999 wd 0.0500 time 0.2376 (0.2430) data time 0.0010 (0.0027) model time 0.2367 (0.2405) loss 3.0759 (3.9137) grad_norm 1.7413 (2.1582) loss_scale 16384.0000 (11253.0676) mem 7379MB [2024-08-26 04:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][290/1251] eta 0:03:53 lr 0.000999 wd 0.0500 time 0.2374 (0.2430) data time 0.0012 (0.0027) model time 0.2362 (0.2406) loss 4.1290 (3.9236) grad_norm 1.9065 (2.1476) loss_scale 16384.0000 (11429.3883) mem 7379MB [2024-08-26 04:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][300/1251] eta 0:03:51 lr 0.000999 wd 0.0500 time 0.2468 (0.2431) data time 0.0008 (0.0026) model time 0.2461 (0.2407) loss 4.7178 (3.9207) grad_norm 2.0547 (2.1481) loss_scale 16384.0000 (11593.9934) mem 7379MB [2024-08-26 04:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][310/1251] eta 0:03:48 lr 0.000999 wd 0.0500 time 0.2386 (0.2430) data time 0.0010 (0.0026) model time 0.2376 (0.2407) loss 3.2784 (3.9234) grad_norm 2.0500 (2.1654) loss_scale 16384.0000 (11748.0129) mem 7379MB [2024-08-26 04:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][320/1251] eta 0:03:46 lr 0.000999 wd 0.0500 time 0.2410 (0.2430) data time 0.0007 (0.0025) model time 0.2403 (0.2407) loss 3.1438 (3.9378) grad_norm 2.9911 (2.1685) loss_scale 16384.0000 (11892.4361) mem 7379MB [2024-08-26 04:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][330/1251] 
eta 0:03:43 lr 0.000999 wd 0.0500 time 0.2479 (0.2429) data time 0.0007 (0.0025) model time 0.2471 (0.2406) loss 2.6287 (3.9371) grad_norm 1.7683 (2.1611) loss_scale 16384.0000 (12028.1329) mem 7379MB [2024-08-26 04:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][340/1251] eta 0:03:41 lr 0.000999 wd 0.0500 time 0.2320 (0.2429) data time 0.0011 (0.0024) model time 0.2309 (0.2407) loss 4.2639 (3.9315) grad_norm 1.3499 (2.1513) loss_scale 16384.0000 (12155.8710) mem 7379MB [2024-08-26 04:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][350/1251] eta 0:03:38 lr 0.000999 wd 0.0500 time 0.2398 (0.2428) data time 0.0007 (0.0024) model time 0.2391 (0.2406) loss 3.4747 (3.9301) grad_norm 2.3871 (2.1497) loss_scale 16384.0000 (12276.3305) mem 7379MB [2024-08-26 04:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][360/1251] eta 0:03:36 lr 0.000999 wd 0.0500 time 0.2510 (0.2427) data time 0.0009 (0.0023) model time 0.2502 (0.2406) loss 4.6379 (3.9364) grad_norm 1.5263 (2.1439) loss_scale 16384.0000 (12390.1163) mem 7379MB [2024-08-26 04:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][370/1251] eta 0:03:33 lr 0.000999 wd 0.0500 time 0.2427 (0.2427) data time 0.0009 (0.0023) model time 0.2418 (0.2405) loss 4.8881 (3.9356) grad_norm 3.5759 (2.1524) loss_scale 16384.0000 (12497.7682) mem 7379MB [2024-08-26 04:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][380/1251] eta 0:03:31 lr 0.000999 wd 0.0500 time 0.2404 (0.2427) data time 0.0011 (0.0023) model time 0.2393 (0.2406) loss 3.8117 (3.9284) grad_norm 1.7149 (2.1461) loss_scale 16384.0000 (12599.7690) mem 7379MB [2024-08-26 04:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][390/1251] eta 0:03:28 lr 0.000999 wd 0.0500 time 0.2357 (0.2427) data time 0.0013 (0.0022) model time 0.2345 (0.2406) loss 3.7833 (3.9300) grad_norm 2.4801 (2.1514) loss_scale 16384.0000 (12696.5524) mem 
7379MB [2024-08-26 04:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][400/1251] eta 0:03:26 lr 0.000999 wd 0.0500 time 0.2453 (0.2427) data time 0.0009 (0.0022) model time 0.2445 (0.2406) loss 3.8838 (3.9376) grad_norm 1.9059 (2.1513) loss_scale 16384.0000 (12788.5087) mem 7379MB [2024-08-26 04:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][410/1251] eta 0:03:24 lr 0.000999 wd 0.0500 time 0.2414 (0.2426) data time 0.0008 (0.0022) model time 0.2406 (0.2406) loss 4.8089 (3.9386) grad_norm 3.1040 (2.1606) loss_scale 16384.0000 (12875.9903) mem 7379MB [2024-08-26 04:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][420/1251] eta 0:03:21 lr 0.000999 wd 0.0500 time 0.2476 (0.2426) data time 0.0010 (0.0022) model time 0.2466 (0.2406) loss 3.4519 (3.9374) grad_norm 1.4828 (2.1663) loss_scale 16384.0000 (12959.3159) mem 7379MB [2024-08-26 04:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][430/1251] eta 0:03:19 lr 0.000999 wd 0.0500 time 0.2415 (0.2426) data time 0.0009 (0.0021) model time 0.2406 (0.2406) loss 2.5895 (3.9367) grad_norm 1.8151 (2.1665) loss_scale 16384.0000 (13038.7749) mem 7379MB [2024-08-26 04:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][440/1251] eta 0:03:16 lr 0.000999 wd 0.0500 time 0.2336 (0.2425) data time 0.0011 (0.0021) model time 0.2325 (0.2405) loss 4.1362 (3.9419) grad_norm 1.3767 (2.1565) loss_scale 16384.0000 (13114.6304) mem 7379MB [2024-08-26 04:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][450/1251] eta 0:03:14 lr 0.000999 wd 0.0500 time 0.2461 (0.2425) data time 0.0009 (0.0021) model time 0.2452 (0.2406) loss 2.8228 (3.9348) grad_norm 2.3481 (2.1594) loss_scale 16384.0000 (13187.1220) mem 7379MB [2024-08-26 04:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][460/1251] eta 0:03:11 lr 0.000999 wd 0.0500 time 0.2409 (0.2425) data time 0.0007 (0.0021) model 
time 0.2402 (0.2406) loss 3.2182 (3.9315) grad_norm 2.1933 (2.1558) loss_scale 16384.0000 (13256.4685) mem 7379MB [2024-08-26 04:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][470/1251] eta 0:03:09 lr 0.000999 wd 0.0500 time 0.2443 (0.2425) data time 0.0007 (0.0021) model time 0.2435 (0.2406) loss 4.1770 (3.9329) grad_norm 2.9647 (2.1511) loss_scale 16384.0000 (13322.8705) mem 7379MB [2024-08-26 04:20:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][480/1251] eta 0:03:06 lr 0.000999 wd 0.0500 time 0.2422 (0.2425) data time 0.0009 (0.0020) model time 0.2413 (0.2406) loss 3.0953 (3.9361) grad_norm 2.7963 (2.1582) loss_scale 16384.0000 (13386.5114) mem 7379MB [2024-08-26 04:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][490/1251] eta 0:03:04 lr 0.000999 wd 0.0500 time 0.2450 (0.2425) data time 0.0010 (0.0020) model time 0.2440 (0.2407) loss 4.3581 (3.9339) grad_norm 1.7582 (2.1561) loss_scale 16384.0000 (13447.5601) mem 7379MB [2024-08-26 04:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][500/1251] eta 0:03:02 lr 0.000999 wd 0.0500 time 0.2434 (0.2425) data time 0.0009 (0.0020) model time 0.2425 (0.2407) loss 3.6536 (3.9336) grad_norm 1.8112 (2.1552) loss_scale 16384.0000 (13506.1717) mem 7379MB [2024-08-26 04:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][510/1251] eta 0:02:59 lr 0.000999 wd 0.0500 time 0.2438 (0.2425) data time 0.0010 (0.0020) model time 0.2428 (0.2407) loss 4.0370 (3.9341) grad_norm 1.5890 (2.1577) loss_scale 16384.0000 (13562.4892) mem 7379MB [2024-08-26 04:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][520/1251] eta 0:02:57 lr 0.000999 wd 0.0500 time 0.2464 (0.2426) data time 0.0007 (0.0020) model time 0.2457 (0.2407) loss 5.0256 (3.9423) grad_norm 2.9630 (2.1589) loss_scale 16384.0000 (13616.6449) mem 7379MB [2024-08-26 04:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [25/300][530/1251] eta 0:02:54 lr 0.000999 wd 0.0500 time 0.2356 (0.2425) data time 0.0010 (0.0019) model time 0.2345 (0.2407) loss 4.2634 (3.9419) grad_norm 1.8355 (2.1617) loss_scale 16384.0000 (13668.7608) mem 7379MB [2024-08-26 04:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][540/1251] eta 0:02:52 lr 0.000999 wd 0.0500 time 0.2391 (0.2425) data time 0.0010 (0.0019) model time 0.2382 (0.2407) loss 4.1633 (3.9413) grad_norm 1.7929 (2.1571) loss_scale 16384.0000 (13718.9501) mem 7379MB [2024-08-26 04:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][550/1251] eta 0:02:50 lr 0.000999 wd 0.0500 time 0.4418 (0.2432) data time 0.0011 (0.0019) model time 0.4408 (0.2416) loss 4.0876 (3.9424) grad_norm 1.6018 (2.1571) loss_scale 16384.0000 (13767.3176) mem 7379MB [2024-08-26 04:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][560/1251] eta 0:02:48 lr 0.000999 wd 0.0500 time 0.2482 (0.2440) data time 0.0011 (0.0019) model time 0.2471 (0.2424) loss 3.7744 (3.9373) grad_norm 1.5656 (2.1528) loss_scale 16384.0000 (13813.9608) mem 7379MB [2024-08-26 04:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][570/1251] eta 0:02:46 lr 0.000999 wd 0.0500 time 0.2403 (0.2440) data time 0.0008 (0.0019) model time 0.2394 (0.2424) loss 4.2915 (3.9364) grad_norm 2.6404 (2.1564) loss_scale 16384.0000 (13858.9702) mem 7379MB [2024-08-26 04:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][580/1251] eta 0:02:43 lr 0.000999 wd 0.0500 time 0.2376 (0.2440) data time 0.0010 (0.0019) model time 0.2365 (0.2424) loss 4.0257 (3.9354) grad_norm 2.8349 (2.1583) loss_scale 16384.0000 (13902.4303) mem 7379MB [2024-08-26 04:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][590/1251] eta 0:02:41 lr 0.000999 wd 0.0500 time 0.2345 (0.2440) data time 0.0011 (0.0018) model time 0.2334 (0.2424) loss 3.5077 (3.9329) grad_norm 1.6382 (2.1618) loss_scale 
16384.0000 (13944.4196) mem 7379MB [2024-08-26 04:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][600/1251] eta 0:02:38 lr 0.000999 wd 0.0500 time 0.2357 (0.2439) data time 0.0009 (0.0018) model time 0.2348 (0.2423) loss 4.7563 (3.9379) grad_norm 3.9641 (2.1749) loss_scale 16384.0000 (13985.0116) mem 7379MB [2024-08-26 04:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][610/1251] eta 0:02:36 lr 0.000999 wd 0.0500 time 0.2435 (0.2439) data time 0.0010 (0.0018) model time 0.2424 (0.2423) loss 4.7888 (3.9329) grad_norm 1.2479 (2.1800) loss_scale 16384.0000 (14024.2750) mem 7379MB [2024-08-26 04:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][620/1251] eta 0:02:33 lr 0.000999 wd 0.0500 time 0.2408 (0.2439) data time 0.0009 (0.0018) model time 0.2399 (0.2423) loss 3.9172 (3.9267) grad_norm 2.4966 (2.1853) loss_scale 16384.0000 (14062.2738) mem 7379MB [2024-08-26 04:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][630/1251] eta 0:02:31 lr 0.000999 wd 0.0500 time 0.2355 (0.2438) data time 0.0010 (0.0018) model time 0.2345 (0.2423) loss 4.3417 (3.9284) grad_norm 2.8760 (2.1863) loss_scale 16384.0000 (14099.0681) mem 7379MB [2024-08-26 04:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][640/1251] eta 0:02:28 lr 0.000999 wd 0.0500 time 0.2445 (0.2438) data time 0.0010 (0.0018) model time 0.2435 (0.2422) loss 2.7378 (3.9302) grad_norm 1.6096 (2.1789) loss_scale 16384.0000 (14134.7145) mem 7379MB [2024-08-26 04:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][650/1251] eta 0:02:26 lr 0.000999 wd 0.0500 time 0.2364 (0.2437) data time 0.0008 (0.0018) model time 0.2357 (0.2422) loss 4.1479 (3.9290) grad_norm 2.3509 (2.1741) loss_scale 16384.0000 (14169.2657) mem 7379MB [2024-08-26 04:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][660/1251] eta 0:02:24 lr 0.000999 wd 0.0500 time 0.2448 (0.2437) 
data time 0.0009 (0.0018) model time 0.2439 (0.2421) loss 3.8441 (3.9307) grad_norm 2.2570 (2.1692) loss_scale 16384.0000 (14202.7716) mem 7379MB [2024-08-26 04:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][670/1251] eta 0:02:21 lr 0.000999 wd 0.0500 time 0.2425 (0.2437) data time 0.0007 (0.0018) model time 0.2417 (0.2421) loss 4.3213 (3.9292) grad_norm 1.7908 (2.1735) loss_scale 16384.0000 (14235.2787) mem 7379MB [2024-08-26 04:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][680/1251] eta 0:02:19 lr 0.000999 wd 0.0500 time 0.2317 (0.2436) data time 0.0010 (0.0017) model time 0.2307 (0.2421) loss 4.2477 (3.9301) grad_norm 1.6148 (2.1746) loss_scale 16384.0000 (14266.8311) mem 7379MB [2024-08-26 04:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][690/1251] eta 0:02:16 lr 0.000999 wd 0.0500 time 0.2363 (0.2436) data time 0.0010 (0.0017) model time 0.2353 (0.2421) loss 4.1408 (3.9283) grad_norm 2.4277 (2.1704) loss_scale 16384.0000 (14297.4703) mem 7379MB [2024-08-26 04:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][700/1251] eta 0:02:14 lr 0.000999 wd 0.0500 time 0.2358 (0.2435) data time 0.0011 (0.0017) model time 0.2346 (0.2420) loss 3.5177 (3.9288) grad_norm 1.6516 (2.1664) loss_scale 16384.0000 (14327.2354) mem 7379MB [2024-08-26 04:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][710/1251] eta 0:02:11 lr 0.000999 wd 0.0500 time 0.2384 (0.2435) data time 0.0009 (0.0017) model time 0.2375 (0.2420) loss 4.1229 (3.9295) grad_norm 2.1689 (2.1598) loss_scale 16384.0000 (14356.1632) mem 7379MB [2024-08-26 04:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][720/1251] eta 0:02:09 lr 0.000999 wd 0.0500 time 0.2333 (0.2434) data time 0.0012 (0.0017) model time 0.2321 (0.2419) loss 4.3067 (3.9350) grad_norm 2.5928 (inf) loss_scale 8192.0000 (14316.1165) mem 7379MB [2024-08-26 04:21:34 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [25/300][730/1251] eta 0:02:06 lr 0.000999 wd 0.0500 time 0.2484 (0.2434) data time 0.0013 (0.0017) model time 0.2471 (0.2419) loss 4.2871 (3.9311) grad_norm 3.6714 (inf) loss_scale 8192.0000 (14232.3393) mem 7379MB [2024-08-26 04:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][740/1251] eta 0:02:04 lr 0.000999 wd 0.0500 time 0.2433 (0.2434) data time 0.0007 (0.0017) model time 0.2426 (0.2419) loss 4.8110 (3.9315) grad_norm 3.5075 (inf) loss_scale 8192.0000 (14150.8232) mem 7379MB [2024-08-26 04:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][750/1251] eta 0:02:01 lr 0.000999 wd 0.0500 time 0.2431 (0.2433) data time 0.0007 (0.0017) model time 0.2423 (0.2418) loss 4.3398 (3.9348) grad_norm 1.3152 (inf) loss_scale 8192.0000 (14071.4780) mem 7379MB [2024-08-26 04:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][760/1251] eta 0:01:59 lr 0.000999 wd 0.0500 time 0.2385 (0.2433) data time 0.0012 (0.0017) model time 0.2374 (0.2418) loss 4.0356 (3.9387) grad_norm 2.0907 (inf) loss_scale 8192.0000 (13994.2181) mem 7379MB [2024-08-26 04:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][770/1251] eta 0:01:57 lr 0.000999 wd 0.0500 time 0.2483 (0.2433) data time 0.0010 (0.0017) model time 0.2473 (0.2418) loss 2.6153 (3.9337) grad_norm 2.0375 (inf) loss_scale 8192.0000 (13918.9624) mem 7379MB [2024-08-26 04:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][780/1251] eta 0:01:54 lr 0.000999 wd 0.0500 time 0.2491 (0.2432) data time 0.0011 (0.0016) model time 0.2479 (0.2417) loss 3.6681 (3.9290) grad_norm 1.6232 (inf) loss_scale 8192.0000 (13845.6338) mem 7379MB [2024-08-26 04:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][790/1251] eta 0:01:52 lr 0.000999 wd 0.0500 time 0.2412 (0.2433) data time 0.0010 (0.0016) model time 0.2403 (0.2418) loss 4.5077 (3.9296) grad_norm 1.8740 (inf) 
loss_scale 8192.0000 (13774.1593) mem 7379MB [2024-08-26 04:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][800/1251] eta 0:01:49 lr 0.000999 wd 0.0500 time 0.2404 (0.2432) data time 0.0010 (0.0016) model time 0.2394 (0.2418) loss 3.4456 (3.9290) grad_norm 2.5143 (inf) loss_scale 8192.0000 (13704.4694) mem 7379MB [2024-08-26 04:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][810/1251] eta 0:01:47 lr 0.000999 wd 0.0500 time 0.2334 (0.2432) data time 0.0007 (0.0016) model time 0.2327 (0.2418) loss 4.1733 (3.9283) grad_norm 2.2433 (inf) loss_scale 8192.0000 (13636.4982) mem 7379MB [2024-08-26 04:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][820/1251] eta 0:01:44 lr 0.000999 wd 0.0500 time 0.2392 (0.2433) data time 0.0011 (0.0016) model time 0.2382 (0.2418) loss 3.5409 (3.9275) grad_norm 1.5707 (inf) loss_scale 8192.0000 (13570.1827) mem 7379MB [2024-08-26 04:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][830/1251] eta 0:01:42 lr 0.000999 wd 0.0500 time 0.2366 (0.2432) data time 0.0009 (0.0016) model time 0.2358 (0.2418) loss 3.5616 (3.9266) grad_norm 1.9973 (inf) loss_scale 8192.0000 (13505.4633) mem 7379MB [2024-08-26 04:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][840/1251] eta 0:01:39 lr 0.000999 wd 0.0500 time 0.2441 (0.2432) data time 0.0008 (0.0016) model time 0.2433 (0.2417) loss 4.5393 (3.9285) grad_norm 1.7239 (inf) loss_scale 8192.0000 (13442.2830) mem 7379MB [2024-08-26 04:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][850/1251] eta 0:01:37 lr 0.000999 wd 0.0500 time 0.2396 (0.2432) data time 0.0007 (0.0016) model time 0.2389 (0.2417) loss 2.6250 (3.9285) grad_norm 1.4800 (inf) loss_scale 8192.0000 (13380.5875) mem 7379MB [2024-08-26 04:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][860/1251] eta 0:01:35 lr 0.000999 wd 0.0500 time 0.2149 (0.2434) data time 0.0012 
(0.0016) model time 0.2137 (0.2420) loss 2.5442 (3.9288) grad_norm 2.3449 (inf) loss_scale 8192.0000 (13320.3252) mem 7379MB [2024-08-26 04:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][870/1251] eta 0:01:32 lr 0.000999 wd 0.0500 time 0.2419 (0.2434) data time 0.0010 (0.0016) model time 0.2409 (0.2420) loss 3.8374 (3.9302) grad_norm 1.9706 (inf) loss_scale 8192.0000 (13261.4466) mem 7379MB [2024-08-26 04:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][880/1251] eta 0:01:30 lr 0.000999 wd 0.0500 time 0.2411 (0.2434) data time 0.0010 (0.0016) model time 0.2402 (0.2420) loss 4.3469 (3.9284) grad_norm 1.4522 (inf) loss_scale 8192.0000 (13203.9047) mem 7379MB [2024-08-26 04:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][890/1251] eta 0:01:27 lr 0.000999 wd 0.0500 time 0.2394 (0.2434) data time 0.0009 (0.0016) model time 0.2385 (0.2420) loss 3.0305 (3.9279) grad_norm 1.9604 (inf) loss_scale 8192.0000 (13147.6543) mem 7379MB [2024-08-26 04:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][900/1251] eta 0:01:25 lr 0.000999 wd 0.0500 time 0.2508 (0.2434) data time 0.0010 (0.0016) model time 0.2498 (0.2420) loss 4.2567 (3.9311) grad_norm 1.9566 (inf) loss_scale 8192.0000 (13092.6526) mem 7379MB [2024-08-26 04:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][910/1251] eta 0:01:22 lr 0.000999 wd 0.0500 time 0.2375 (0.2434) data time 0.0010 (0.0016) model time 0.2365 (0.2420) loss 4.5165 (3.9286) grad_norm 1.3585 (inf) loss_scale 8192.0000 (13038.8584) mem 7379MB [2024-08-26 04:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][920/1251] eta 0:01:20 lr 0.000999 wd 0.0500 time 0.2449 (0.2434) data time 0.0009 (0.0016) model time 0.2440 (0.2420) loss 4.6687 (3.9325) grad_norm 1.8202 (inf) loss_scale 8192.0000 (12986.2324) mem 7379MB [2024-08-26 04:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[25/300][930/1251] eta 0:01:18 lr 0.000999 wd 0.0500 time 0.2356 (0.2434) data time 0.0010 (0.0015) model time 0.2346 (0.2420) loss 3.9087 (3.9325) grad_norm 2.0823 (inf) loss_scale 8192.0000 (12934.7368) mem 7379MB [2024-08-26 04:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][940/1251] eta 0:01:15 lr 0.000999 wd 0.0500 time 0.2452 (0.2434) data time 0.0011 (0.0015) model time 0.2441 (0.2420) loss 4.1019 (3.9321) grad_norm 1.6647 (inf) loss_scale 8192.0000 (12884.3358) mem 7379MB [2024-08-26 04:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][950/1251] eta 0:01:13 lr 0.000999 wd 0.0500 time 0.2441 (0.2433) data time 0.0008 (0.0015) model time 0.2433 (0.2420) loss 4.7867 (3.9342) grad_norm 1.9063 (inf) loss_scale 8192.0000 (12834.9947) mem 7379MB [2024-08-26 04:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][960/1251] eta 0:01:10 lr 0.000999 wd 0.0500 time 0.2447 (0.2434) data time 0.0013 (0.0015) model time 0.2434 (0.2420) loss 3.4922 (3.9366) grad_norm 2.3366 (inf) loss_scale 8192.0000 (12786.6805) mem 7379MB [2024-08-26 04:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][970/1251] eta 0:01:08 lr 0.000999 wd 0.0500 time 0.2409 (0.2433) data time 0.0010 (0.0015) model time 0.2399 (0.2420) loss 4.0178 (3.9356) grad_norm 2.0475 (inf) loss_scale 8192.0000 (12739.3615) mem 7379MB [2024-08-26 04:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][980/1251] eta 0:01:05 lr 0.000999 wd 0.0500 time 0.2383 (0.2433) data time 0.0008 (0.0015) model time 0.2375 (0.2419) loss 4.4707 (3.9358) grad_norm 2.1036 (inf) loss_scale 8192.0000 (12693.0071) mem 7379MB [2024-08-26 04:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][990/1251] eta 0:01:03 lr 0.000999 wd 0.0500 time 0.2465 (0.2433) data time 0.0009 (0.0015) model time 0.2456 (0.2419) loss 4.1038 (3.9354) grad_norm 1.9555 (inf) loss_scale 8192.0000 (12647.5883) mem 7379MB 
[2024-08-26 04:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1000/1251] eta 0:01:01 lr 0.000999 wd 0.0500 time 0.2411 (0.2433) data time 0.0007 (0.0015) model time 0.2404 (0.2419) loss 4.3321 (3.9388) grad_norm 2.9384 (inf) loss_scale 8192.0000 (12603.0769) mem 7379MB [2024-08-26 04:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1010/1251] eta 0:00:58 lr 0.000999 wd 0.0500 time 0.2376 (0.2433) data time 0.0012 (0.0015) model time 0.2364 (0.2419) loss 4.1783 (3.9372) grad_norm 1.5733 (inf) loss_scale 8192.0000 (12559.4461) mem 7379MB [2024-08-26 04:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1020/1251] eta 0:00:56 lr 0.000999 wd 0.0500 time 0.2442 (0.2433) data time 0.0009 (0.0015) model time 0.2433 (0.2419) loss 3.9856 (3.9353) grad_norm 1.8878 (inf) loss_scale 8192.0000 (12516.6699) mem 7379MB [2024-08-26 04:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1030/1251] eta 0:00:53 lr 0.000999 wd 0.0500 time 0.2408 (0.2433) data time 0.0007 (0.0015) model time 0.2401 (0.2419) loss 2.9809 (3.9346) grad_norm 2.0536 (inf) loss_scale 8192.0000 (12474.7236) mem 7379MB [2024-08-26 04:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1040/1251] eta 0:00:51 lr 0.000999 wd 0.0500 time 0.2327 (0.2433) data time 0.0011 (0.0015) model time 0.2316 (0.2419) loss 4.4190 (3.9352) grad_norm 1.6185 (inf) loss_scale 8192.0000 (12433.5831) mem 7379MB [2024-08-26 04:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1050/1251] eta 0:00:48 lr 0.000999 wd 0.0500 time 0.2417 (0.2432) data time 0.0008 (0.0015) model time 0.2409 (0.2419) loss 3.0863 (3.9319) grad_norm 1.8254 (inf) loss_scale 8192.0000 (12393.2255) mem 7379MB [2024-08-26 04:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1060/1251] eta 0:00:46 lr 0.000999 wd 0.0500 time 0.2550 (0.2432) data time 0.0008 (0.0015) model time 0.2542 (0.2419) 
loss 2.9712 (3.9328) grad_norm 1.9660 (inf) loss_scale 8192.0000 (12353.6287) mem 7379MB [2024-08-26 04:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1070/1251] eta 0:00:44 lr 0.000999 wd 0.0500 time 0.2414 (0.2432) data time 0.0008 (0.0015) model time 0.2406 (0.2419) loss 4.5097 (3.9338) grad_norm 1.7734 (inf) loss_scale 8192.0000 (12314.7712) mem 7379MB [2024-08-26 04:22:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1080/1251] eta 0:00:41 lr 0.000999 wd 0.0500 time 0.2449 (0.2432) data time 0.0008 (0.0015) model time 0.2441 (0.2419) loss 4.7936 (3.9318) grad_norm 2.1789 (inf) loss_scale 8192.0000 (12276.6327) mem 7379MB [2024-08-26 04:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1090/1251] eta 0:00:39 lr 0.000999 wd 0.0500 time 0.4779 (0.2438) data time 0.0010 (0.0015) model time 0.4769 (0.2425) loss 4.3908 (3.9314) grad_norm 1.8604 (inf) loss_scale 8192.0000 (12239.1934) mem 7379MB [2024-08-26 04:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1100/1251] eta 0:00:36 lr 0.000999 wd 0.0500 time 0.2467 (0.2440) data time 0.0012 (0.0015) model time 0.2455 (0.2427) loss 2.9547 (3.9287) grad_norm 1.5553 (inf) loss_scale 8192.0000 (12202.4342) mem 7379MB [2024-08-26 04:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1110/1251] eta 0:00:34 lr 0.000999 wd 0.0500 time 0.2386 (0.2440) data time 0.0011 (0.0015) model time 0.2375 (0.2427) loss 4.7672 (3.9277) grad_norm 1.9414 (inf) loss_scale 8192.0000 (12166.3366) mem 7379MB [2024-08-26 04:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1120/1251] eta 0:00:31 lr 0.000999 wd 0.0500 time 0.2423 (0.2440) data time 0.0009 (0.0015) model time 0.2413 (0.2427) loss 4.0098 (3.9288) grad_norm 1.9164 (inf) loss_scale 8192.0000 (12130.8831) mem 7379MB [2024-08-26 04:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1130/1251] eta 0:00:29 lr 
0.000999 wd 0.0500 time 0.2357 (0.2440) data time 0.0007 (0.0015) model time 0.2349 (0.2427) loss 3.0697 (3.9299) grad_norm 2.2444 (inf) loss_scale 8192.0000 (12096.0566) mem 7379MB [2024-08-26 04:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1140/1251] eta 0:00:27 lr 0.000999 wd 0.0500 time 0.2516 (0.2440) data time 0.0010 (0.0015) model time 0.2506 (0.2426) loss 4.7944 (3.9311) grad_norm 1.7533 (inf) loss_scale 8192.0000 (12061.8405) mem 7379MB [2024-08-26 04:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1150/1251] eta 0:00:24 lr 0.000999 wd 0.0500 time 0.2396 (0.2439) data time 0.0010 (0.0015) model time 0.2386 (0.2426) loss 3.5782 (3.9331) grad_norm 2.6589 (inf) loss_scale 8192.0000 (12028.2189) mem 7379MB [2024-08-26 04:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1160/1251] eta 0:00:22 lr 0.000999 wd 0.0500 time 0.2392 (0.2439) data time 0.0011 (0.0014) model time 0.2382 (0.2426) loss 4.1633 (3.9316) grad_norm 2.4657 (inf) loss_scale 8192.0000 (11995.1766) mem 7379MB [2024-08-26 04:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1170/1251] eta 0:00:19 lr 0.000999 wd 0.0500 time 0.2431 (0.2439) data time 0.0009 (0.0014) model time 0.2422 (0.2426) loss 4.5070 (3.9310) grad_norm 1.4368 (inf) loss_scale 8192.0000 (11962.6985) mem 7379MB [2024-08-26 04:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1180/1251] eta 0:00:17 lr 0.000999 wd 0.0500 time 0.2427 (0.2439) data time 0.0011 (0.0014) model time 0.2416 (0.2426) loss 4.3578 (3.9328) grad_norm 1.7081 (inf) loss_scale 8192.0000 (11930.7705) mem 7379MB [2024-08-26 04:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1190/1251] eta 0:00:14 lr 0.000999 wd 0.0500 time 0.2405 (0.2439) data time 0.0009 (0.0014) model time 0.2396 (0.2426) loss 4.5492 (3.9315) grad_norm 2.1490 (inf) loss_scale 8192.0000 (11899.3787) mem 7379MB [2024-08-26 04:23:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1200/1251] eta 0:00:12 lr 0.000999 wd 0.0500 time 0.2487 (0.2438) data time 0.0010 (0.0014) model time 0.2477 (0.2425) loss 4.4850 (3.9316) grad_norm 1.6165 (inf) loss_scale 8192.0000 (11868.5096) mem 7379MB [2024-08-26 04:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1210/1251] eta 0:00:09 lr 0.000999 wd 0.0500 time 0.2360 (0.2438) data time 0.0011 (0.0014) model time 0.2349 (0.2425) loss 3.7637 (3.9323) grad_norm 2.5216 (inf) loss_scale 8192.0000 (11838.1503) mem 7379MB [2024-08-26 04:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1220/1251] eta 0:00:07 lr 0.000999 wd 0.0500 time 0.2405 (0.2438) data time 0.0010 (0.0014) model time 0.2395 (0.2425) loss 4.1904 (3.9314) grad_norm 1.8676 (inf) loss_scale 8192.0000 (11808.2883) mem 7379MB [2024-08-26 04:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1230/1251] eta 0:00:05 lr 0.000999 wd 0.0500 time 0.2434 (0.2438) data time 0.0007 (0.0014) model time 0.2428 (0.2425) loss 4.4433 (3.9319) grad_norm 2.2654 (inf) loss_scale 8192.0000 (11778.9115) mem 7379MB [2024-08-26 04:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1240/1251] eta 0:00:02 lr 0.000999 wd 0.0500 time 0.2240 (0.2437) data time 0.0005 (0.0014) model time 0.2236 (0.2424) loss 4.4048 (3.9319) grad_norm 2.4677 (inf) loss_scale 8192.0000 (11750.0081) mem 7379MB [2024-08-26 04:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [25/300][1250/1251] eta 0:00:00 lr 0.000999 wd 0.0500 time 0.2226 (0.2436) data time 0.0007 (0.0014) model time 0.2219 (0.2423) loss 3.6038 (3.9290) grad_norm 1.7837 (inf) loss_scale 8192.0000 (11721.5667) mem 7379MB [2024-08-26 04:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 25 training takes 0:05:04 [2024-08-26 04:23:40 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:23:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.443 (0.443) Loss 0.6924 (0.6924) Acc@1 85.938 (85.938) Acc@5 97.070 (97.070) Mem 7379MB [2024-08-26 04:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.112) Loss 1.1455 (1.0926) Acc@1 75.391 (75.648) Acc@5 93.359 (93.350) Mem 7379MB [2024-08-26 04:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.097) Loss 1.5088 (1.1021) Acc@1 65.430 (75.088) Acc@5 88.086 (93.480) Mem 7379MB [2024-08-26 04:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.091) Loss 1.8604 (1.2489) Acc@1 59.180 (72.042) Acc@5 79.492 (91.161) Mem 7379MB [2024-08-26 04:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.8535 (1.3406) Acc@1 58.496 (69.917) Acc@5 82.227 (89.899) Mem 7379MB [2024-08-26 04:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 69.584 Acc@5 89.716 [2024-08-26 04:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 69.6% [2024-08-26 04:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 69.58% [2024-08-26 04:23:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:23:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.421 (0.421) Loss 0.6465 (0.6465) Acc@1 83.496 (83.496) Acc@5 95.801 (95.801) Mem 7379MB [2024-08-26 04:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.113) Loss 1.0254 (1.0220) Acc@1 75.391 (74.423) Acc@5 93.359 (93.120) Mem 7379MB [2024-08-26 04:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.096) Loss 1.5146 (1.0343) Acc@1 63.867 (74.075) Acc@5 86.719 (93.062) Mem 7379MB [2024-08-26 04:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.091) Loss 1.8096 (1.2011) Acc@1 57.422 (70.731) Acc@5 80.762 (90.609) Mem 7379MB [2024-08-26 04:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.8057 (1.3041) Acc@1 57.617 (68.683) Acc@5 82.617 (89.163) Mem 7379MB [2024-08-26 04:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 68.472 Acc@5 89.118 [2024-08-26 04:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 68.5% [2024-08-26 04:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 68.47% [2024-08-26 04:23:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:23:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][0/1251] eta 0:15:12 lr 0.000999 wd 0.0500 time 0.7298 (0.7298) data time 0.5012 (0.5012) model time 0.0000 (0.0000) loss 4.4727 (4.4727) grad_norm 2.1481 (2.1481) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:23:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][10/1251] eta 0:05:53 lr 0.000999 wd 0.0500 time 0.2445 (0.2848) data time 0.0009 (0.0464) model time 0.0000 (0.0000) loss 2.8949 (4.1182) grad_norm 2.1581 (1.9822) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][20/1251] eta 0:05:26 lr 0.000999 wd 0.0500 time 0.2434 (0.2651) data time 0.0011 (0.0248) model time 0.0000 (0.0000) loss 3.0912 (3.9369) grad_norm 1.9667 (2.3872) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][30/1251] eta 0:05:14 lr 0.000999 wd 0.0500 time 0.2477 (0.2577) data time 0.0011 (0.0173) model time 0.0000 (0.0000) loss 4.4319 (3.9160) grad_norm 1.9750 (2.2667) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][40/1251] eta 0:05:09 lr 0.000999 wd 0.0500 time 0.2503 (0.2552) data time 0.0014 (0.0134) model time 0.0000 (0.0000) loss 3.8520 (3.8680) grad_norm 1.5589 (2.1832) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][50/1251] eta 0:05:08 lr 0.000999 wd 0.0500 time 0.2205 (0.2565) data time 0.0009 (0.0110) model time 0.0000 (0.0000) loss 3.5722 (3.8463) grad_norm 1.4137 (2.2022) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][60/1251] eta 0:05:02 lr 0.000999 wd 0.0500 time 0.2373 (0.2541) data time 0.0009 (0.0093) model time 0.2364 (0.2405) loss 
4.4355 (3.8994) grad_norm 2.1276 (2.2893) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][70/1251] eta 0:04:57 lr 0.000999 wd 0.0500 time 0.2458 (0.2523) data time 0.0012 (0.0081) model time 0.2447 (0.2405) loss 3.8368 (3.9181) grad_norm 2.8052 (2.2749) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][80/1251] eta 0:04:54 lr 0.000999 wd 0.0500 time 0.2559 (0.2511) data time 0.0008 (0.0073) model time 0.2552 (0.2410) loss 3.4906 (3.9202) grad_norm 1.6686 (2.2845) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][90/1251] eta 0:04:50 lr 0.000999 wd 0.0500 time 0.2436 (0.2501) data time 0.0007 (0.0066) model time 0.2429 (0.2409) loss 4.5796 (3.9555) grad_norm 1.4831 (2.2339) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][100/1251] eta 0:04:46 lr 0.000999 wd 0.0500 time 0.2418 (0.2492) data time 0.0007 (0.0060) model time 0.2411 (0.2407) loss 4.8000 (3.9498) grad_norm 1.2085 (2.2224) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][110/1251] eta 0:04:43 lr 0.000999 wd 0.0500 time 0.2383 (0.2485) data time 0.0012 (0.0056) model time 0.2372 (0.2407) loss 3.5379 (3.9279) grad_norm 2.6828 (2.2653) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][120/1251] eta 0:04:40 lr 0.000999 wd 0.0500 time 0.2609 (0.2482) data time 0.0011 (0.0052) model time 0.2598 (0.2412) loss 4.2753 (3.9234) grad_norm 2.4975 (2.2690) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][130/1251] eta 0:04:37 lr 
0.000999 wd 0.0500 time 0.2430 (0.2477) data time 0.0010 (0.0049) model time 0.2421 (0.2410) loss 4.0728 (3.9245) grad_norm 2.2540 (2.2575) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][140/1251] eta 0:04:34 lr 0.000999 wd 0.0500 time 0.2496 (0.2474) data time 0.0011 (0.0046) model time 0.2485 (0.2411) loss 3.3974 (3.9256) grad_norm 1.5695 (2.2182) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][150/1251] eta 0:04:32 lr 0.000999 wd 0.0500 time 0.2501 (0.2472) data time 0.0010 (0.0044) model time 0.2491 (0.2414) loss 4.3683 (3.9089) grad_norm 3.3074 (2.2073) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][160/1251] eta 0:04:29 lr 0.000999 wd 0.0500 time 0.2497 (0.2468) data time 0.0010 (0.0042) model time 0.2487 (0.2413) loss 4.2113 (3.9167) grad_norm 1.5370 (2.1877) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][170/1251] eta 0:04:26 lr 0.000999 wd 0.0500 time 0.2391 (0.2465) data time 0.0009 (0.0040) model time 0.2383 (0.2412) loss 4.6359 (3.9147) grad_norm 2.0635 (2.1738) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][180/1251] eta 0:04:23 lr 0.000999 wd 0.0500 time 0.2437 (0.2463) data time 0.0010 (0.0038) model time 0.2427 (0.2412) loss 3.7136 (3.9195) grad_norm 3.9607 (2.1685) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][190/1251] eta 0:04:21 lr 0.000999 wd 0.0500 time 0.2439 (0.2461) data time 0.0011 (0.0037) model time 0.2429 (0.2412) loss 3.7946 (3.9300) grad_norm 1.3198 (2.1753) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][200/1251] eta 0:04:18 lr 0.000999 wd 0.0500 time 0.2430 (0.2458) data time 0.0009 (0.0035) model time 0.2421 (0.2412) loss 4.3564 (3.9321) grad_norm 1.7563 (2.1792) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][210/1251] eta 0:04:15 lr 0.000999 wd 0.0500 time 0.2417 (0.2456) data time 0.0010 (0.0034) model time 0.2407 (0.2411) loss 3.9799 (3.9376) grad_norm 1.5813 (2.1613) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][220/1251] eta 0:04:12 lr 0.000999 wd 0.0500 time 0.2375 (0.2453) data time 0.0009 (0.0033) model time 0.2366 (0.2409) loss 3.3831 (3.9169) grad_norm 1.8189 (2.1598) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][230/1251] eta 0:04:10 lr 0.000999 wd 0.0500 time 0.2409 (0.2451) data time 0.0012 (0.0032) model time 0.2397 (0.2408) loss 4.2636 (3.9237) grad_norm 2.6295 (2.1610) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][240/1251] eta 0:04:07 lr 0.000999 wd 0.0500 time 0.2413 (0.2450) data time 0.0008 (0.0031) model time 0.2405 (0.2409) loss 4.7244 (3.9210) grad_norm 1.4846 (2.1447) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][250/1251] eta 0:04:05 lr 0.000999 wd 0.0500 time 0.2482 (0.2449) data time 0.0009 (0.0030) model time 0.2473 (0.2409) loss 4.9262 (3.9382) grad_norm 2.1344 (2.1342) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][260/1251] eta 0:04:02 lr 0.000999 wd 0.0500 time 0.2463 (0.2448) data time 0.0009 (0.0030) model time 0.2454 (0.2409) loss 4.3191 
(3.9476) grad_norm 1.7282 (2.1292) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][270/1251] eta 0:04:00 lr 0.000999 wd 0.0500 time 0.2407 (0.2447) data time 0.0008 (0.0029) model time 0.2400 (0.2409) loss 3.2539 (3.9401) grad_norm 1.3897 (2.1239) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][280/1251] eta 0:03:57 lr 0.000999 wd 0.0500 time 0.2466 (0.2446) data time 0.0009 (0.0028) model time 0.2457 (0.2409) loss 4.3694 (3.9350) grad_norm 1.5484 (2.1130) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][290/1251] eta 0:03:54 lr 0.000999 wd 0.0500 time 0.2453 (0.2445) data time 0.0007 (0.0028) model time 0.2445 (0.2409) loss 4.4160 (3.9337) grad_norm 1.5741 (2.1073) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][300/1251] eta 0:03:52 lr 0.000999 wd 0.0500 time 0.2335 (0.2443) data time 0.0009 (0.0027) model time 0.2326 (0.2408) loss 4.7734 (3.9398) grad_norm 2.5775 (2.1130) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][310/1251] eta 0:03:49 lr 0.000999 wd 0.0500 time 0.2419 (0.2442) data time 0.0011 (0.0026) model time 0.2408 (0.2407) loss 4.3321 (3.9489) grad_norm 2.1537 (2.1190) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][320/1251] eta 0:03:47 lr 0.000999 wd 0.0500 time 0.2442 (0.2441) data time 0.0012 (0.0026) model time 0.2430 (0.2407) loss 4.5194 (3.9416) grad_norm 1.6247 (2.1194) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][330/1251] eta 0:03:44 lr 0.000999 wd 
0.0500 time 0.2524 (0.2440) data time 0.0009 (0.0026) model time 0.2515 (0.2407) loss 4.5618 (3.9472) grad_norm 2.1383 (2.1165) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][340/1251] eta 0:03:42 lr 0.000999 wd 0.0500 time 0.2451 (0.2439) data time 0.0010 (0.0025) model time 0.2442 (0.2407) loss 4.6564 (3.9559) grad_norm 1.6354 (2.1209) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][350/1251] eta 0:03:39 lr 0.000999 wd 0.0500 time 0.2445 (0.2439) data time 0.0011 (0.0025) model time 0.2434 (0.2407) loss 3.9891 (3.9571) grad_norm 1.6527 (2.1214) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][360/1251] eta 0:03:37 lr 0.000999 wd 0.0500 time 0.2421 (0.2437) data time 0.0010 (0.0024) model time 0.2411 (0.2406) loss 3.7012 (3.9590) grad_norm 1.9540 (2.1175) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][370/1251] eta 0:03:36 lr 0.000999 wd 0.0500 time 0.4612 (0.2454) data time 0.0011 (0.0024) model time 0.4602 (0.2426) loss 2.8433 (3.9547) grad_norm 1.8130 (2.1099) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][380/1251] eta 0:03:34 lr 0.000999 wd 0.0500 time 0.2416 (0.2459) data time 0.0011 (0.0024) model time 0.2405 (0.2433) loss 4.5059 (3.9533) grad_norm 1.8695 (2.1152) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][390/1251] eta 0:03:31 lr 0.000999 wd 0.0500 time 0.2448 (0.2458) data time 0.0009 (0.0023) model time 0.2439 (0.2432) loss 3.3615 (3.9524) grad_norm 2.6633 (2.1132) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][400/1251] eta 0:03:29 lr 0.000999 wd 0.0500 time 0.2499 (0.2458) data time 0.0008 (0.0023) model time 0.2490 (0.2432) loss 3.6109 (3.9610) grad_norm 3.0539 (2.1072) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][410/1251] eta 0:03:26 lr 0.000999 wd 0.0500 time 0.2451 (0.2456) data time 0.0010 (0.0023) model time 0.2441 (0.2430) loss 4.4578 (3.9631) grad_norm 2.2980 (2.1100) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][420/1251] eta 0:03:24 lr 0.000999 wd 0.0500 time 0.2402 (0.2455) data time 0.0008 (0.0022) model time 0.2394 (0.2430) loss 4.0488 (3.9654) grad_norm 2.7611 (2.1093) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][430/1251] eta 0:03:21 lr 0.000999 wd 0.0500 time 0.2436 (0.2454) data time 0.0008 (0.0022) model time 0.2428 (0.2429) loss 3.3675 (3.9660) grad_norm 1.8751 (2.1170) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][440/1251] eta 0:03:18 lr 0.000999 wd 0.0500 time 0.2319 (0.2453) data time 0.0009 (0.0022) model time 0.2310 (0.2428) loss 4.6471 (3.9624) grad_norm 1.9040 (2.1123) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][450/1251] eta 0:03:16 lr 0.000999 wd 0.0500 time 0.2364 (0.2453) data time 0.0008 (0.0022) model time 0.2356 (0.2428) loss 3.7290 (3.9608) grad_norm 2.9279 (2.1105) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][460/1251] eta 0:03:13 lr 0.000999 wd 0.0500 time 0.2433 (0.2452) data time 0.0008 (0.0022) model time 0.2425 (0.2427) loss 4.1008 
(3.9599) grad_norm 1.5077 (2.1041) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][470/1251] eta 0:03:11 lr 0.000999 wd 0.0500 time 0.2338 (0.2451) data time 0.0010 (0.0021) model time 0.2328 (0.2426) loss 2.6613 (3.9606) grad_norm 1.5410 (2.1006) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][480/1251] eta 0:03:08 lr 0.000999 wd 0.0500 time 0.2420 (0.2450) data time 0.0008 (0.0021) model time 0.2412 (0.2426) loss 3.2797 (3.9564) grad_norm 2.9550 (2.1041) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][490/1251] eta 0:03:06 lr 0.000999 wd 0.0500 time 0.2394 (0.2449) data time 0.0011 (0.0021) model time 0.2382 (0.2425) loss 4.2714 (3.9537) grad_norm 1.6818 (2.0989) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][500/1251] eta 0:03:03 lr 0.000999 wd 0.0500 time 0.2489 (0.2448) data time 0.0015 (0.0021) model time 0.2473 (0.2424) loss 4.1082 (3.9498) grad_norm 1.6967 (2.0971) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][510/1251] eta 0:03:01 lr 0.000999 wd 0.0500 time 0.2402 (0.2448) data time 0.0010 (0.0021) model time 0.2392 (0.2424) loss 3.9578 (3.9511) grad_norm 1.8319 (2.0928) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][520/1251] eta 0:02:58 lr 0.000999 wd 0.0500 time 0.2444 (0.2447) data time 0.0010 (0.0020) model time 0.2434 (0.2423) loss 3.5383 (3.9497) grad_norm 4.4359 (2.0960) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][530/1251] eta 0:02:56 lr 0.000999 wd 
0.0500 time 0.2356 (0.2446) data time 0.0007 (0.0020) model time 0.2349 (0.2423) loss 4.0935 (3.9487) grad_norm 2.4870 (2.0976) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][540/1251] eta 0:02:53 lr 0.000999 wd 0.0500 time 0.2402 (0.2446) data time 0.0009 (0.0020) model time 0.2393 (0.2423) loss 4.5070 (3.9551) grad_norm 3.1757 (2.1032) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][550/1251] eta 0:02:51 lr 0.000999 wd 0.0500 time 0.2468 (0.2446) data time 0.0011 (0.0020) model time 0.2457 (0.2423) loss 4.5691 (3.9569) grad_norm 1.4664 (2.1058) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][560/1251] eta 0:02:48 lr 0.000999 wd 0.0500 time 0.2380 (0.2445) data time 0.0010 (0.0020) model time 0.2370 (0.2423) loss 3.5818 (3.9583) grad_norm 1.9466 (2.1108) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][570/1251] eta 0:02:46 lr 0.000999 wd 0.0500 time 0.2492 (0.2445) data time 0.0007 (0.0020) model time 0.2485 (0.2423) loss 3.0819 (3.9572) grad_norm 1.6373 (2.1074) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][580/1251] eta 0:02:44 lr 0.000999 wd 0.0500 time 0.2441 (0.2445) data time 0.0013 (0.0019) model time 0.2427 (0.2423) loss 3.6528 (3.9590) grad_norm 1.8949 (2.1052) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][590/1251] eta 0:02:41 lr 0.000999 wd 0.0500 time 0.2383 (0.2448) data time 0.0009 (0.0019) model time 0.2374 (0.2426) loss 3.7848 (3.9583) grad_norm 2.0466 (2.1069) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][600/1251] eta 0:02:39 lr 0.000999 wd 0.0500 time 0.2371 (0.2447) data time 0.0008 (0.0019) model time 0.2363 (0.2425) loss 3.9057 (3.9572) grad_norm 2.0325 (2.1036) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][610/1251] eta 0:02:36 lr 0.000999 wd 0.0500 time 0.2553 (0.2446) data time 0.0010 (0.0019) model time 0.2544 (0.2425) loss 3.1712 (3.9551) grad_norm 1.6779 (2.1013) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][620/1251] eta 0:02:34 lr 0.000999 wd 0.0500 time 0.2426 (0.2446) data time 0.0009 (0.0019) model time 0.2417 (0.2424) loss 4.1535 (3.9555) grad_norm 1.6369 (2.0986) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][630/1251] eta 0:02:31 lr 0.000999 wd 0.0500 time 0.2432 (0.2445) data time 0.0011 (0.0019) model time 0.2422 (0.2424) loss 3.7114 (3.9565) grad_norm 1.2203 (2.0939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][640/1251] eta 0:02:29 lr 0.000999 wd 0.0500 time 0.2413 (0.2445) data time 0.0011 (0.0019) model time 0.2402 (0.2424) loss 4.6746 (3.9571) grad_norm 2.6475 (2.0964) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][650/1251] eta 0:02:26 lr 0.000999 wd 0.0500 time 0.2313 (0.2445) data time 0.0008 (0.0019) model time 0.2306 (0.2423) loss 3.3796 (3.9570) grad_norm 1.9579 (2.1056) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][660/1251] eta 0:02:24 lr 0.000999 wd 0.0500 time 0.2439 (0.2444) data time 0.0008 (0.0018) model time 0.2431 (0.2423) loss 4.6094 
(3.9598) grad_norm 1.9019 (2.1101) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][670/1251] eta 0:02:22 lr 0.000999 wd 0.0500 time 0.2452 (0.2444) data time 0.0009 (0.0018) model time 0.2443 (0.2424) loss 4.6154 (3.9588) grad_norm 2.8361 (2.1151) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][680/1251] eta 0:02:19 lr 0.000999 wd 0.0500 time 0.2529 (0.2444) data time 0.0012 (0.0018) model time 0.2517 (0.2424) loss 4.1832 (3.9591) grad_norm 1.7674 (2.1093) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][690/1251] eta 0:02:17 lr 0.000999 wd 0.0500 time 0.2446 (0.2444) data time 0.0011 (0.0018) model time 0.2436 (0.2423) loss 3.7771 (3.9562) grad_norm 2.4437 (2.1036) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][700/1251] eta 0:02:14 lr 0.000999 wd 0.0500 time 0.2386 (0.2444) data time 0.0010 (0.0018) model time 0.2376 (0.2423) loss 3.6359 (3.9550) grad_norm 1.9969 (2.0979) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][710/1251] eta 0:02:12 lr 0.000999 wd 0.0500 time 0.2402 (0.2443) data time 0.0009 (0.0018) model time 0.2394 (0.2423) loss 4.5569 (3.9570) grad_norm 1.7679 (2.0977) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][720/1251] eta 0:02:09 lr 0.000999 wd 0.0500 time 0.2483 (0.2443) data time 0.0010 (0.0018) model time 0.2473 (0.2423) loss 3.9597 (3.9582) grad_norm 2.4625 (2.1013) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][730/1251] eta 0:02:07 lr 0.000999 wd 
0.0500 time 0.2461 (0.2443) data time 0.0010 (0.0018) model time 0.2451 (0.2423) loss 4.7952 (3.9595) grad_norm 1.6812 (2.1001) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][740/1251] eta 0:02:04 lr 0.000999 wd 0.0500 time 0.2419 (0.2443) data time 0.0011 (0.0018) model time 0.2409 (0.2423) loss 3.7031 (3.9616) grad_norm 2.1789 (2.0962) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][750/1251] eta 0:02:02 lr 0.000999 wd 0.0500 time 0.2372 (0.2442) data time 0.0007 (0.0017) model time 0.2364 (0.2422) loss 3.3649 (3.9564) grad_norm 2.7983 (2.0946) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][760/1251] eta 0:01:59 lr 0.000999 wd 0.0500 time 0.2472 (0.2442) data time 0.0010 (0.0017) model time 0.2462 (0.2422) loss 3.8450 (3.9569) grad_norm 1.7919 (2.0921) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][770/1251] eta 0:01:57 lr 0.000999 wd 0.0500 time 0.2438 (0.2442) data time 0.0008 (0.0017) model time 0.2430 (0.2422) loss 4.1111 (3.9623) grad_norm 1.6064 (2.0916) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][780/1251] eta 0:01:54 lr 0.000999 wd 0.0500 time 0.2504 (0.2441) data time 0.0010 (0.0017) model time 0.2494 (0.2422) loss 3.0940 (3.9561) grad_norm 2.0272 (2.0928) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][790/1251] eta 0:01:52 lr 0.000999 wd 0.0500 time 0.2360 (0.2441) data time 0.0009 (0.0017) model time 0.2351 (0.2421) loss 4.7053 (3.9611) grad_norm 1.8855 (2.0903) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:06 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][800/1251] eta 0:01:50 lr 0.000999 wd 0.0500 time 0.2441 (0.2440) data time 0.0011 (0.0017) model time 0.2431 (0.2421) loss 3.7183 (3.9542) grad_norm 1.5056 (2.0923) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][810/1251] eta 0:01:47 lr 0.000999 wd 0.0500 time 0.2402 (0.2440) data time 0.0009 (0.0017) model time 0.2393 (0.2421) loss 4.5037 (3.9482) grad_norm 1.3027 (2.0991) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][820/1251] eta 0:01:45 lr 0.000999 wd 0.0500 time 0.2412 (0.2440) data time 0.0009 (0.0017) model time 0.2403 (0.2421) loss 4.3839 (3.9499) grad_norm 2.0854 (2.0957) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][830/1251] eta 0:01:42 lr 0.000999 wd 0.0500 time 0.2515 (0.2440) data time 0.0007 (0.0017) model time 0.2508 (0.2421) loss 3.5654 (3.9513) grad_norm 1.7323 (2.0920) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][840/1251] eta 0:01:40 lr 0.000999 wd 0.0500 time 0.2395 (0.2440) data time 0.0010 (0.0017) model time 0.2385 (0.2421) loss 3.1669 (3.9517) grad_norm 1.3381 (2.0880) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][850/1251] eta 0:01:37 lr 0.000999 wd 0.0500 time 0.2498 (0.2439) data time 0.0009 (0.0017) model time 0.2489 (0.2421) loss 4.3725 (3.9484) grad_norm 3.1695 (2.0886) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][860/1251] eta 0:01:35 lr 0.000999 wd 0.0500 time 0.2489 (0.2439) data time 0.0007 (0.0017) model time 0.2481 (0.2421) loss 2.7704 
(3.9464) grad_norm 1.9172 (2.0949) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][870/1251] eta 0:01:32 lr 0.000999 wd 0.0500 time 0.2464 (0.2439) data time 0.0009 (0.0016) model time 0.2454 (0.2421) loss 4.0811 (3.9477) grad_norm 1.8184 (2.0974) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][880/1251] eta 0:01:30 lr 0.000999 wd 0.0500 time 0.2398 (0.2439) data time 0.0007 (0.0016) model time 0.2390 (0.2420) loss 4.0749 (3.9477) grad_norm 2.8262 (2.0983) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][890/1251] eta 0:01:28 lr 0.000999 wd 0.0500 time 0.2492 (0.2443) data time 0.0007 (0.0016) model time 0.2484 (0.2424) loss 4.2648 (3.9479) grad_norm 2.0471 (2.0976) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][900/1251] eta 0:01:25 lr 0.000999 wd 0.0500 time 0.2415 (0.2444) data time 0.0008 (0.0016) model time 0.2407 (0.2426) loss 3.6154 (3.9462) grad_norm 1.7681 (2.0933) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][910/1251] eta 0:01:23 lr 0.000999 wd 0.0500 time 0.2404 (0.2443) data time 0.0011 (0.0016) model time 0.2392 (0.2426) loss 4.3832 (3.9491) grad_norm 2.0241 (2.0903) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][920/1251] eta 0:01:20 lr 0.000999 wd 0.0500 time 0.2440 (0.2443) data time 0.0009 (0.0016) model time 0.2431 (0.2425) loss 4.3564 (3.9462) grad_norm 1.4619 (2.0924) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][930/1251] eta 0:01:18 lr 0.000999 wd 
0.0500 time 0.2363 (0.2443) data time 0.0010 (0.0016) model time 0.2353 (0.2425) loss 4.2545 (3.9442) grad_norm 1.5944 (2.0928) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][940/1251] eta 0:01:15 lr 0.000999 wd 0.0500 time 0.2437 (0.2442) data time 0.0009 (0.0016) model time 0.2428 (0.2425) loss 4.8105 (3.9431) grad_norm 1.4640 (2.0896) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][950/1251] eta 0:01:13 lr 0.000999 wd 0.0500 time 0.2423 (0.2442) data time 0.0010 (0.0016) model time 0.2413 (0.2425) loss 3.8244 (3.9419) grad_norm 1.9522 (2.0904) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][960/1251] eta 0:01:11 lr 0.000999 wd 0.0500 time 0.2367 (0.2442) data time 0.0010 (0.0016) model time 0.2357 (0.2424) loss 4.4783 (3.9411) grad_norm 1.6661 (2.0903) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][970/1251] eta 0:01:08 lr 0.000999 wd 0.0500 time 0.2379 (0.2442) data time 0.0011 (0.0016) model time 0.2368 (0.2424) loss 3.8730 (3.9405) grad_norm 3.1635 (2.0938) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][980/1251] eta 0:01:06 lr 0.000999 wd 0.0500 time 0.2382 (0.2442) data time 0.0011 (0.0016) model time 0.2370 (0.2424) loss 4.4975 (3.9376) grad_norm 2.3414 (2.0924) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][990/1251] eta 0:01:03 lr 0.000999 wd 0.0500 time 0.2383 (0.2442) data time 0.0008 (0.0016) model time 0.2375 (0.2424) loss 4.6791 (3.9372) grad_norm 1.9993 (2.0926) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:55 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1000/1251] eta 0:01:01 lr 0.000999 wd 0.0500 time 0.2470 (0.2442) data time 0.0011 (0.0016) model time 0.2459 (0.2424) loss 4.3156 (3.9368) grad_norm 3.6799 (2.0962) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1010/1251] eta 0:00:58 lr 0.000999 wd 0.0500 time 0.2365 (0.2442) data time 0.0010 (0.0016) model time 0.2355 (0.2424) loss 4.7732 (3.9373) grad_norm 1.5374 (2.0966) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1020/1251] eta 0:00:56 lr 0.000999 wd 0.0500 time 0.2424 (0.2442) data time 0.0008 (0.0016) model time 0.2416 (0.2424) loss 4.4360 (3.9392) grad_norm 1.2356 (2.0934) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1030/1251] eta 0:00:53 lr 0.000999 wd 0.0500 time 0.2374 (0.2441) data time 0.0008 (0.0016) model time 0.2366 (0.2424) loss 2.5479 (3.9373) grad_norm 2.1086 (2.0915) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1040/1251] eta 0:00:51 lr 0.000999 wd 0.0500 time 0.2503 (0.2441) data time 0.0010 (0.0016) model time 0.2493 (0.2424) loss 3.4165 (3.9367) grad_norm 1.6896 (2.0920) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1050/1251] eta 0:00:49 lr 0.000999 wd 0.0500 time 0.2319 (0.2441) data time 0.0009 (0.0015) model time 0.2311 (0.2424) loss 4.4418 (3.9351) grad_norm 1.8552 (2.0931) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1060/1251] eta 0:00:46 lr 0.000999 wd 0.0500 time 0.2433 (0.2441) data time 0.0010 (0.0015) model time 0.2423 (0.2424) loss 4.0916 
(3.9350) grad_norm 3.5236 (2.0932) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1070/1251] eta 0:00:44 lr 0.000999 wd 0.0500 time 0.2396 (0.2440) data time 0.0007 (0.0015) model time 0.2389 (0.2423) loss 4.4640 (3.9370) grad_norm 1.6924 (2.0935) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1080/1251] eta 0:00:41 lr 0.000999 wd 0.0500 time 0.2369 (0.2440) data time 0.0010 (0.0015) model time 0.2359 (0.2423) loss 3.5170 (3.9372) grad_norm 1.3549 (2.0908) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1090/1251] eta 0:00:39 lr 0.000999 wd 0.0500 time 0.2465 (0.2440) data time 0.0007 (0.0015) model time 0.2458 (0.2423) loss 4.3150 (3.9373) grad_norm 2.2201 (2.0886) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1100/1251] eta 0:00:36 lr 0.000999 wd 0.0500 time 0.2360 (0.2440) data time 0.0009 (0.0015) model time 0.2352 (0.2423) loss 3.6505 (3.9344) grad_norm 3.3476 (2.0881) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1110/1251] eta 0:00:34 lr 0.000999 wd 0.0500 time 0.2407 (0.2441) data time 0.0007 (0.0015) model time 0.2401 (0.2424) loss 2.7593 (3.9343) grad_norm 1.4918 (2.0872) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1120/1251] eta 0:00:31 lr 0.000999 wd 0.0500 time 0.2443 (0.2441) data time 0.0010 (0.0015) model time 0.2433 (0.2424) loss 3.7973 (3.9328) grad_norm 1.4671 (2.0856) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1130/1251] eta 0:00:29 lr 
0.000999 wd 0.0500 time 0.2360 (0.2440) data time 0.0008 (0.0015) model time 0.2352 (0.2424) loss 4.1052 (3.9308) grad_norm 3.4409 (2.0897) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1140/1251] eta 0:00:27 lr 0.000999 wd 0.0500 time 0.2493 (0.2440) data time 0.0009 (0.0015) model time 0.2484 (0.2424) loss 4.1483 (3.9289) grad_norm 1.5313 (2.0876) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1150/1251] eta 0:00:24 lr 0.000999 wd 0.0500 time 0.2410 (0.2440) data time 0.0011 (0.0015) model time 0.2399 (0.2424) loss 4.4587 (3.9272) grad_norm 2.7470 (2.0903) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1160/1251] eta 0:00:22 lr 0.000999 wd 0.0500 time 0.2434 (0.2440) data time 0.0010 (0.0015) model time 0.2424 (0.2423) loss 4.1608 (3.9277) grad_norm 2.0378 (2.0889) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1170/1251] eta 0:00:19 lr 0.000999 wd 0.0500 time 0.2402 (0.2440) data time 0.0008 (0.0015) model time 0.2394 (0.2423) loss 3.0332 (3.9290) grad_norm 1.8837 (2.0901) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1180/1251] eta 0:00:17 lr 0.000998 wd 0.0500 time 0.2413 (0.2440) data time 0.0009 (0.0015) model time 0.2404 (0.2423) loss 4.1578 (3.9287) grad_norm 1.7209 (2.0977) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1190/1251] eta 0:00:14 lr 0.000998 wd 0.0500 time 0.2444 (0.2439) data time 0.0007 (0.0015) model time 0.2437 (0.2423) loss 4.7801 (3.9294) grad_norm 1.5080 (2.0984) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
04:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1200/1251] eta 0:00:12 lr 0.000998 wd 0.0500 time 0.2423 (0.2440) data time 0.0010 (0.0015) model time 0.2414 (0.2423) loss 3.8742 (3.9294) grad_norm 1.9002 (2.0991) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1210/1251] eta 0:00:10 lr 0.000998 wd 0.0500 time 0.2425 (0.2439) data time 0.0007 (0.0015) model time 0.2418 (0.2423) loss 4.5200 (3.9317) grad_norm 1.8535 (2.0977) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1220/1251] eta 0:00:07 lr 0.000998 wd 0.0500 time 0.2316 (0.2439) data time 0.0009 (0.0015) model time 0.2307 (0.2423) loss 4.3451 (3.9339) grad_norm 2.1608 (2.0965) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1230/1251] eta 0:00:05 lr 0.000998 wd 0.0500 time 0.2454 (0.2439) data time 0.0011 (0.0015) model time 0.2443 (0.2423) loss 4.3523 (3.9357) grad_norm 2.0010 (2.0971) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1240/1251] eta 0:00:02 lr 0.000998 wd 0.0500 time 0.2348 (0.2438) data time 0.0007 (0.0015) model time 0.2341 (0.2422) loss 3.8159 (3.9352) grad_norm 1.4674 (2.1003) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [26/300][1250/1251] eta 0:00:00 lr 0.000998 wd 0.0500 time 0.2263 (0.2437) data time 0.0005 (0.0015) model time 0.2259 (0.2421) loss 3.4425 (3.9365) grad_norm 1.7578 (2.1004) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 26 training takes 0:05:04 [2024-08-26 04:28:56 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:28:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.396 (0.396) Loss 0.7002 (0.7002) Acc@1 85.938 (85.938) Acc@5 96.191 (96.191) Mem 7379MB [2024-08-26 04:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.107) Loss 1.0400 (1.0562) Acc@1 78.223 (76.092) Acc@5 94.727 (94.309) Mem 7379MB [2024-08-26 04:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.093) Loss 1.6670 (1.0879) Acc@1 62.793 (75.288) Acc@5 86.816 (94.052) Mem 7379MB [2024-08-26 04:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.088) Loss 1.9150 (1.2542) Acc@1 57.520 (72.136) Acc@5 81.055 (91.627) Mem 7379MB [2024-08-26 04:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 1.8447 (1.3465) Acc@1 60.254 (70.358) Acc@5 82.910 (90.306) Mem 7379MB [2024-08-26 04:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 70.130 Acc@5 90.268 [2024-08-26 04:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 70.1% [2024-08-26 04:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 70.13% [2024-08-26 04:29:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:29:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.427 (0.427) Loss 0.6177 (0.6177) Acc@1 84.473 (84.473) Acc@5 96.191 (96.191) Mem 7379MB [2024-08-26 04:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.108) Loss 0.9946 (0.9866) Acc@1 76.172 (75.222) Acc@5 93.555 (93.519) Mem 7379MB [2024-08-26 04:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.094) Loss 1.4668 (0.9994) Acc@1 64.551 (74.777) Acc@5 87.500 (93.490) Mem 7379MB [2024-08-26 04:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.089) Loss 1.7686 (1.1626) Acc@1 57.520 (71.421) Acc@5 81.250 (91.063) Mem 7379MB [2024-08-26 04:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.7588 (1.2628) Acc@1 58.496 (69.453) Acc@5 83.203 (89.672) Mem 7379MB [2024-08-26 04:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 69.222 Acc@5 89.616 [2024-08-26 04:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 69.2% [2024-08-26 04:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 69.22% [2024-08-26 04:29:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:29:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][0/1251] eta 0:12:43 lr 0.000998 wd 0.0500 time 0.6106 (0.6106) data time 0.3869 (0.3869) model time 0.0000 (0.0000) loss 3.4616 (3.4616) grad_norm 1.5865 (1.5865) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][10/1251] eta 0:05:44 lr 0.000998 wd 0.0500 time 0.2482 (0.2773) data time 0.0008 (0.0361) model time 0.0000 (0.0000) loss 3.0351 (3.7921) grad_norm 1.5721 (1.8281) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][20/1251] eta 0:05:20 lr 0.000998 wd 0.0500 time 0.2419 (0.2600) data time 0.0010 (0.0194) model time 0.0000 (0.0000) loss 4.0617 (3.9389) grad_norm 2.5837 (1.8081) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][30/1251] eta 0:05:10 lr 0.000998 wd 0.0500 time 0.2476 (0.2543) data time 0.0012 (0.0135) model time 0.0000 (0.0000) loss 3.9520 (3.9746) grad_norm 2.4372 (1.9091) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][40/1251] eta 0:05:04 lr 0.000998 wd 0.0500 time 0.2443 (0.2511) data time 0.0009 (0.0104) model time 0.0000 (0.0000) loss 2.4461 (3.9886) grad_norm 1.4893 (1.8871) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][50/1251] eta 0:04:59 lr 0.000998 wd 0.0500 time 0.2368 (0.2491) data time 0.0008 (0.0086) model time 0.0000 (0.0000) loss 4.0050 (3.9643) grad_norm 3.6578 (1.9466) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][60/1251] eta 0:04:54 lr 0.000998 wd 0.0500 time 0.2425 (0.2476) data time 0.0013 (0.0073) model time 0.2413 (0.2388) loss 
3.2939 (3.8375) grad_norm 1.8673 (1.9461) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][70/1251] eta 0:04:51 lr 0.000998 wd 0.0500 time 0.2415 (0.2467) data time 0.0009 (0.0064) model time 0.2406 (0.2398) loss 4.3965 (3.8426) grad_norm 1.6972 (1.9521) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][80/1251] eta 0:04:48 lr 0.000998 wd 0.0500 time 0.2451 (0.2460) data time 0.0009 (0.0058) model time 0.2442 (0.2396) loss 3.9288 (3.8637) grad_norm 1.5393 (1.9304) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][90/1251] eta 0:04:44 lr 0.000998 wd 0.0500 time 0.2402 (0.2454) data time 0.0013 (0.0053) model time 0.2389 (0.2396) loss 3.9805 (3.8991) grad_norm 2.0784 (1.9037) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][100/1251] eta 0:04:41 lr 0.000998 wd 0.0500 time 0.2382 (0.2448) data time 0.0010 (0.0048) model time 0.2372 (0.2395) loss 4.5381 (3.8843) grad_norm 1.8580 (1.9355) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][110/1251] eta 0:04:38 lr 0.000998 wd 0.0500 time 0.2383 (0.2445) data time 0.0011 (0.0045) model time 0.2372 (0.2396) loss 2.5927 (3.8951) grad_norm 2.3458 (1.9900) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][120/1251] eta 0:04:36 lr 0.000998 wd 0.0500 time 0.2415 (0.2444) data time 0.0010 (0.0042) model time 0.2405 (0.2400) loss 3.6948 (3.8949) grad_norm 2.0567 (2.0113) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][130/1251] eta 0:04:34 lr 
0.000998 wd 0.0500 time 0.2453 (0.2445) data time 0.0010 (0.0040) model time 0.2443 (0.2406) loss 3.4709 (3.8834) grad_norm 2.0716 (2.0273) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][140/1251] eta 0:04:31 lr 0.000998 wd 0.0500 time 0.2449 (0.2443) data time 0.0008 (0.0037) model time 0.2441 (0.2406) loss 4.2888 (3.8842) grad_norm 1.3482 (2.0388) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][150/1251] eta 0:04:28 lr 0.000998 wd 0.0500 time 0.2356 (0.2441) data time 0.0007 (0.0036) model time 0.2349 (0.2405) loss 4.8459 (3.8851) grad_norm 1.8179 (2.0379) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][160/1251] eta 0:04:28 lr 0.000998 wd 0.0500 time 0.2391 (0.2461) data time 0.0008 (0.0034) model time 0.2382 (0.2437) loss 4.1617 (3.8883) grad_norm 1.6854 (2.0264) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][170/1251] eta 0:04:26 lr 0.000998 wd 0.0500 time 0.2426 (0.2468) data time 0.0010 (0.0033) model time 0.2416 (0.2449) loss 4.4072 (3.8756) grad_norm 1.6900 (2.0186) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][180/1251] eta 0:04:24 lr 0.000998 wd 0.0500 time 0.2400 (0.2465) data time 0.0010 (0.0031) model time 0.2390 (0.2445) loss 4.2567 (3.8857) grad_norm 1.4497 (2.0215) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][190/1251] eta 0:04:21 lr 0.000998 wd 0.0500 time 0.2407 (0.2463) data time 0.0008 (0.0030) model time 0.2399 (0.2443) loss 3.3293 (3.8706) grad_norm 2.0191 (2.0045) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:55 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][200/1251] eta 0:04:18 lr 0.000998 wd 0.0500 time 0.2370 (0.2460) data time 0.0009 (0.0029) model time 0.2361 (0.2439) loss 4.3858 (3.8724) grad_norm 2.7575 (2.0097) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][210/1251] eta 0:04:15 lr 0.000998 wd 0.0500 time 0.2408 (0.2456) data time 0.0009 (0.0028) model time 0.2399 (0.2435) loss 4.8253 (3.8846) grad_norm 2.1025 (1.9991) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][220/1251] eta 0:04:13 lr 0.000998 wd 0.0500 time 0.2419 (0.2455) data time 0.0009 (0.0028) model time 0.2409 (0.2434) loss 4.1060 (3.8951) grad_norm 2.7211 (2.0063) loss_scale 16384.0000 (8488.5430) mem 7379MB [2024-08-26 04:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][230/1251] eta 0:04:10 lr 0.000998 wd 0.0500 time 0.2500 (0.2453) data time 0.0010 (0.0027) model time 0.2490 (0.2433) loss 3.5914 (3.9034) grad_norm 2.4269 (1.9968) loss_scale 16384.0000 (8830.3377) mem 7379MB [2024-08-26 04:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][240/1251] eta 0:04:07 lr 0.000998 wd 0.0500 time 0.2430 (0.2451) data time 0.0008 (0.0026) model time 0.2421 (0.2431) loss 3.6682 (3.9081) grad_norm 1.5204 (2.0179) loss_scale 16384.0000 (9143.7676) mem 7379MB [2024-08-26 04:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][250/1251] eta 0:04:05 lr 0.000998 wd 0.0500 time 0.2405 (0.2449) data time 0.0009 (0.0026) model time 0.2395 (0.2429) loss 3.0177 (3.9100) grad_norm 1.6757 (2.0234) loss_scale 16384.0000 (9432.2231) mem 7379MB [2024-08-26 04:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][260/1251] eta 0:04:02 lr 0.000998 wd 0.0500 time 0.2393 (0.2448) data time 0.0010 (0.0025) model time 0.2384 (0.2427) loss 3.7492 
(3.9166) grad_norm 2.9950 (2.0324) loss_scale 16384.0000 (9698.5747) mem 7379MB [2024-08-26 04:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][270/1251] eta 0:03:59 lr 0.000998 wd 0.0500 time 0.2427 (0.2446) data time 0.0007 (0.0024) model time 0.2421 (0.2426) loss 3.5698 (3.9185) grad_norm 1.9796 (2.0241) loss_scale 16384.0000 (9945.2694) mem 7379MB [2024-08-26 04:30:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][280/1251] eta 0:03:57 lr 0.000998 wd 0.0500 time 0.2454 (0.2445) data time 0.0009 (0.0024) model time 0.2445 (0.2425) loss 4.0409 (3.9175) grad_norm 1.6838 (2.0244) loss_scale 16384.0000 (10174.4057) mem 7379MB [2024-08-26 04:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][290/1251] eta 0:03:55 lr 0.000998 wd 0.0500 time 0.2542 (0.2449) data time 0.0009 (0.0023) model time 0.2533 (0.2430) loss 4.3064 (3.9182) grad_norm 2.3695 (2.0262) loss_scale 16384.0000 (10387.7938) mem 7379MB [2024-08-26 04:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][300/1251] eta 0:03:52 lr 0.000998 wd 0.0500 time 0.2366 (0.2448) data time 0.0008 (0.0023) model time 0.2358 (0.2429) loss 3.5005 (3.9156) grad_norm 2.1312 (2.0285) loss_scale 16384.0000 (10587.0033) mem 7379MB [2024-08-26 04:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][310/1251] eta 0:03:50 lr 0.000998 wd 0.0500 time 0.2445 (0.2447) data time 0.0009 (0.0023) model time 0.2436 (0.2429) loss 4.3223 (3.9114) grad_norm 2.2514 (2.0233) loss_scale 16384.0000 (10773.4019) mem 7379MB [2024-08-26 04:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][320/1251] eta 0:03:47 lr 0.000998 wd 0.0500 time 0.2392 (0.2446) data time 0.0011 (0.0022) model time 0.2382 (0.2427) loss 3.9034 (3.9079) grad_norm 2.0391 (2.0372) loss_scale 16384.0000 (10948.1869) mem 7379MB [2024-08-26 04:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][330/1251] eta 0:03:45 lr 
0.000998 wd 0.0500 time 0.2441 (0.2445) data time 0.0007 (0.0022) model time 0.2434 (0.2427) loss 3.2614 (3.9046) grad_norm 1.9084 (2.0387) loss_scale 16384.0000 (11112.4109) mem 7379MB [2024-08-26 04:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][340/1251] eta 0:03:42 lr 0.000998 wd 0.0500 time 0.2401 (0.2444) data time 0.0010 (0.0021) model time 0.2391 (0.2425) loss 3.9311 (3.9086) grad_norm 1.8880 (2.0319) loss_scale 16384.0000 (11267.0029) mem 7379MB [2024-08-26 04:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][350/1251] eta 0:03:40 lr 0.000998 wd 0.0500 time 0.2343 (0.2443) data time 0.0010 (0.0021) model time 0.2333 (0.2425) loss 4.2283 (3.9097) grad_norm 1.9015 (2.0352) loss_scale 16384.0000 (11412.7863) mem 7379MB [2024-08-26 04:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][360/1251] eta 0:03:37 lr 0.000998 wd 0.0500 time 0.2365 (0.2443) data time 0.0010 (0.0021) model time 0.2356 (0.2425) loss 3.0819 (3.9022) grad_norm 1.6054 (2.0320) loss_scale 16384.0000 (11550.4931) mem 7379MB [2024-08-26 04:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][370/1251] eta 0:03:35 lr 0.000998 wd 0.0500 time 0.2368 (0.2441) data time 0.0007 (0.0021) model time 0.2361 (0.2423) loss 3.9744 (3.9019) grad_norm 3.7569 (2.0496) loss_scale 16384.0000 (11680.7763) mem 7379MB [2024-08-26 04:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][380/1251] eta 0:03:32 lr 0.000998 wd 0.0500 time 0.2429 (0.2440) data time 0.0010 (0.0020) model time 0.2419 (0.2422) loss 3.6771 (3.8999) grad_norm 1.9766 (2.0471) loss_scale 16384.0000 (11804.2205) mem 7379MB [2024-08-26 04:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][390/1251] eta 0:03:30 lr 0.000998 wd 0.0500 time 0.2457 (0.2440) data time 0.0009 (0.0020) model time 0.2448 (0.2422) loss 4.0057 (3.8991) grad_norm 1.4952 (2.0444) loss_scale 16384.0000 (11921.3504) mem 7379MB 
[2024-08-26 04:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][400/1251] eta 0:03:27 lr 0.000998 wd 0.0500 time 0.2395 (0.2439) data time 0.0011 (0.0020) model time 0.2384 (0.2421) loss 4.2299 (3.8992) grad_norm 2.2015 (2.0488) loss_scale 16384.0000 (12032.6384) mem 7379MB [2024-08-26 04:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][410/1251] eta 0:03:25 lr 0.000998 wd 0.0500 time 0.2355 (0.2438) data time 0.0013 (0.0020) model time 0.2342 (0.2420) loss 3.9285 (3.8984) grad_norm 1.8161 (2.0448) loss_scale 16384.0000 (12138.5109) mem 7379MB [2024-08-26 04:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][420/1251] eta 0:03:22 lr 0.000998 wd 0.0500 time 0.2420 (0.2438) data time 0.0009 (0.0019) model time 0.2410 (0.2420) loss 2.9120 (3.8981) grad_norm 1.8109 (2.0552) loss_scale 16384.0000 (12239.3539) mem 7379MB [2024-08-26 04:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][430/1251] eta 0:03:20 lr 0.000998 wd 0.0500 time 0.2462 (0.2438) data time 0.0010 (0.0019) model time 0.2451 (0.2420) loss 4.1494 (3.8971) grad_norm 2.5515 (2.0613) loss_scale 16384.0000 (12335.5174) mem 7379MB [2024-08-26 04:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][440/1251] eta 0:03:17 lr 0.000998 wd 0.0500 time 0.2388 (0.2437) data time 0.0010 (0.0019) model time 0.2378 (0.2420) loss 4.6086 (3.9001) grad_norm 1.8672 (2.0685) loss_scale 16384.0000 (12427.3197) mem 7379MB [2024-08-26 04:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][450/1251] eta 0:03:15 lr 0.000998 wd 0.0500 time 0.2364 (0.2437) data time 0.0011 (0.0019) model time 0.2353 (0.2420) loss 4.1856 (3.8945) grad_norm 2.0152 (2.0663) loss_scale 16384.0000 (12515.0510) mem 7379MB [2024-08-26 04:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][460/1251] eta 0:03:12 lr 0.000998 wd 0.0500 time 0.2438 (0.2436) data time 0.0009 (0.0019) model time 
0.2429 (0.2419) loss 3.9147 (3.8922) grad_norm 3.7842 (2.0679) loss_scale 16384.0000 (12598.9761) mem 7379MB [2024-08-26 04:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][470/1251] eta 0:03:10 lr 0.000998 wd 0.0500 time 0.2417 (0.2436) data time 0.0011 (0.0018) model time 0.2406 (0.2419) loss 4.0647 (3.8938) grad_norm 2.7792 (2.0717) loss_scale 16384.0000 (12679.3376) mem 7379MB [2024-08-26 04:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][480/1251] eta 0:03:07 lr 0.000998 wd 0.0500 time 0.2395 (0.2436) data time 0.0011 (0.0018) model time 0.2384 (0.2419) loss 4.1618 (3.8952) grad_norm 1.9136 (2.0705) loss_scale 16384.0000 (12756.3576) mem 7379MB [2024-08-26 04:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][490/1251] eta 0:03:05 lr 0.000998 wd 0.0500 time 0.2513 (0.2436) data time 0.0008 (0.0018) model time 0.2505 (0.2419) loss 4.4020 (3.8955) grad_norm 1.9737 (2.0691) loss_scale 16384.0000 (12830.2403) mem 7379MB [2024-08-26 04:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][500/1251] eta 0:03:02 lr 0.000998 wd 0.0500 time 0.2436 (0.2436) data time 0.0009 (0.0018) model time 0.2427 (0.2419) loss 4.3446 (3.8972) grad_norm 1.8098 (2.0710) loss_scale 16384.0000 (12901.1737) mem 7379MB [2024-08-26 04:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][510/1251] eta 0:03:00 lr 0.000998 wd 0.0500 time 0.2449 (0.2435) data time 0.0008 (0.0018) model time 0.2441 (0.2419) loss 3.1698 (3.8903) grad_norm 2.7091 (2.0809) loss_scale 16384.0000 (12969.3307) mem 7379MB [2024-08-26 04:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][520/1251] eta 0:02:57 lr 0.000998 wd 0.0500 time 0.2321 (0.2435) data time 0.0011 (0.0018) model time 0.2310 (0.2418) loss 3.9354 (3.8929) grad_norm 1.3869 (2.0763) loss_scale 16384.0000 (13034.8714) mem 7379MB [2024-08-26 04:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[27/300][530/1251] eta 0:02:55 lr 0.000998 wd 0.0500 time 0.2359 (0.2435) data time 0.0009 (0.0017) model time 0.2351 (0.2418) loss 4.4859 (3.8932) grad_norm 2.0553 (2.0758) loss_scale 16384.0000 (13097.9435) mem 7379MB [2024-08-26 04:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][540/1251] eta 0:02:53 lr 0.000998 wd 0.0500 time 0.2348 (0.2434) data time 0.0012 (0.0017) model time 0.2335 (0.2418) loss 4.3557 (3.8928) grad_norm 2.0559 (2.0738) loss_scale 16384.0000 (13158.6839) mem 7379MB [2024-08-26 04:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][550/1251] eta 0:02:50 lr 0.000998 wd 0.0500 time 0.2486 (0.2434) data time 0.0009 (0.0017) model time 0.2477 (0.2418) loss 4.2422 (3.8915) grad_norm 2.6931 (2.0772) loss_scale 16384.0000 (13217.2196) mem 7379MB [2024-08-26 04:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][560/1251] eta 0:02:48 lr 0.000998 wd 0.0500 time 0.2457 (0.2434) data time 0.0009 (0.0017) model time 0.2448 (0.2418) loss 4.5957 (3.8927) grad_norm 1.9916 (2.0720) loss_scale 16384.0000 (13273.6684) mem 7379MB [2024-08-26 04:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][570/1251] eta 0:02:45 lr 0.000998 wd 0.0500 time 0.2398 (0.2433) data time 0.0008 (0.0017) model time 0.2390 (0.2417) loss 4.8776 (3.8971) grad_norm 2.1676 (2.0687) loss_scale 16384.0000 (13328.1401) mem 7379MB [2024-08-26 04:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][580/1251] eta 0:02:43 lr 0.000998 wd 0.0500 time 0.2357 (0.2433) data time 0.0010 (0.0017) model time 0.2347 (0.2417) loss 3.2645 (3.8968) grad_norm 2.9941 (2.0798) loss_scale 16384.0000 (13380.7367) mem 7379MB [2024-08-26 04:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][590/1251] eta 0:02:40 lr 0.000998 wd 0.0500 time 0.2425 (0.2433) data time 0.0011 (0.0017) model time 0.2414 (0.2417) loss 4.2296 (3.9046) grad_norm 1.9512 (2.0800) loss_scale 
16384.0000 (13431.5533) mem 7379MB [2024-08-26 04:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][600/1251] eta 0:02:38 lr 0.000998 wd 0.0500 time 0.2466 (0.2433) data time 0.0009 (0.0017) model time 0.2456 (0.2417) loss 3.2541 (3.8980) grad_norm 2.0789 (2.0860) loss_scale 16384.0000 (13480.6789) mem 7379MB [2024-08-26 04:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][610/1251] eta 0:02:35 lr 0.000998 wd 0.0500 time 0.2371 (0.2432) data time 0.0009 (0.0017) model time 0.2361 (0.2416) loss 4.1135 (3.9067) grad_norm 1.7802 (2.0779) loss_scale 16384.0000 (13528.1964) mem 7379MB [2024-08-26 04:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][620/1251] eta 0:02:33 lr 0.000998 wd 0.0500 time 0.2457 (0.2432) data time 0.0008 (0.0016) model time 0.2448 (0.2416) loss 4.7029 (3.9087) grad_norm 2.1970 (2.0765) loss_scale 16384.0000 (13574.1836) mem 7379MB [2024-08-26 04:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][630/1251] eta 0:02:30 lr 0.000998 wd 0.0500 time 0.2418 (0.2432) data time 0.0009 (0.0016) model time 0.2409 (0.2416) loss 4.5857 (3.9110) grad_norm 1.8681 (2.0784) loss_scale 16384.0000 (13618.7132) mem 7379MB [2024-08-26 04:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][640/1251] eta 0:02:28 lr 0.000998 wd 0.0500 time 0.2417 (0.2432) data time 0.0010 (0.0016) model time 0.2407 (0.2417) loss 3.4314 (3.9082) grad_norm 1.7097 (2.0808) loss_scale 16384.0000 (13661.8534) mem 7379MB [2024-08-26 04:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][650/1251] eta 0:02:26 lr 0.000998 wd 0.0500 time 0.2443 (0.2432) data time 0.0009 (0.0016) model time 0.2435 (0.2417) loss 2.8246 (3.9103) grad_norm 2.2033 (2.0880) loss_scale 16384.0000 (13703.6682) mem 7379MB [2024-08-26 04:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][660/1251] eta 0:02:23 lr 0.000998 wd 0.0500 time 0.2465 (0.2432) 
data time 0.0008 (0.0016) model time 0.2456 (0.2417) loss 4.5978 (3.9114) grad_norm 2.0156 (2.0849) loss_scale 16384.0000 (13744.2179) mem 7379MB [2024-08-26 04:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][670/1251] eta 0:02:21 lr 0.000998 wd 0.0500 time 0.2438 (0.2432) data time 0.0012 (0.0016) model time 0.2425 (0.2417) loss 4.0046 (3.9148) grad_norm 1.8433 (2.0838) loss_scale 16384.0000 (13783.5589) mem 7379MB [2024-08-26 04:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][680/1251] eta 0:02:19 lr 0.000998 wd 0.0500 time 0.2434 (0.2435) data time 0.0012 (0.0016) model time 0.2422 (0.2420) loss 4.1771 (3.9144) grad_norm 1.8211 (2.0855) loss_scale 16384.0000 (13821.7445) mem 7379MB [2024-08-26 04:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][690/1251] eta 0:02:16 lr 0.000998 wd 0.0500 time 0.2434 (0.2441) data time 0.0008 (0.0016) model time 0.2426 (0.2426) loss 4.5514 (3.9142) grad_norm 1.5081 (2.0829) loss_scale 16384.0000 (13858.8249) mem 7379MB [2024-08-26 04:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][700/1251] eta 0:02:14 lr 0.000998 wd 0.0500 time 0.2532 (0.2441) data time 0.0009 (0.0016) model time 0.2523 (0.2427) loss 3.5041 (3.9114) grad_norm 2.3937 (2.0822) loss_scale 16384.0000 (13894.8474) mem 7379MB [2024-08-26 04:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][710/1251] eta 0:02:12 lr 0.000998 wd 0.0500 time 0.2330 (0.2440) data time 0.0010 (0.0016) model time 0.2320 (0.2426) loss 2.8219 (3.9076) grad_norm 2.3174 (2.0822) loss_scale 16384.0000 (13929.8565) mem 7379MB [2024-08-26 04:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][720/1251] eta 0:02:09 lr 0.000998 wd 0.0500 time 0.2349 (0.2440) data time 0.0011 (0.0016) model time 0.2338 (0.2426) loss 3.9650 (3.9110) grad_norm 1.5671 (2.0797) loss_scale 16384.0000 (13963.8946) mem 7379MB [2024-08-26 04:32:04 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [27/300][730/1251] eta 0:02:07 lr 0.000998 wd 0.0500 time 0.2434 (0.2440) data time 0.0015 (0.0016) model time 0.2419 (0.2425) loss 4.4677 (3.9145) grad_norm 1.9645 (2.0772) loss_scale 16384.0000 (13997.0014) mem 7379MB [2024-08-26 04:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][740/1251] eta 0:02:04 lr 0.000998 wd 0.0500 time 0.2364 (0.2439) data time 0.0009 (0.0015) model time 0.2355 (0.2425) loss 4.2831 (3.9136) grad_norm 2.5337 (2.0817) loss_scale 16384.0000 (14029.2146) mem 7379MB [2024-08-26 04:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][750/1251] eta 0:02:02 lr 0.000998 wd 0.0500 time 0.2418 (0.2439) data time 0.0007 (0.0015) model time 0.2411 (0.2424) loss 4.8370 (3.9174) grad_norm 1.4718 (2.0794) loss_scale 16384.0000 (14060.5699) mem 7379MB [2024-08-26 04:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][760/1251] eta 0:01:59 lr 0.000998 wd 0.0500 time 0.2471 (0.2438) data time 0.0011 (0.0015) model time 0.2461 (0.2424) loss 4.6680 (3.9186) grad_norm 3.1310 (2.0845) loss_scale 16384.0000 (14091.1012) mem 7379MB [2024-08-26 04:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][770/1251] eta 0:01:57 lr 0.000998 wd 0.0500 time 0.2382 (0.2438) data time 0.0009 (0.0015) model time 0.2373 (0.2424) loss 3.1709 (3.9156) grad_norm 2.8991 (2.0871) loss_scale 16384.0000 (14120.8405) mem 7379MB [2024-08-26 04:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][780/1251] eta 0:01:54 lr 0.000998 wd 0.0500 time 0.2478 (0.2438) data time 0.0007 (0.0015) model time 0.2471 (0.2424) loss 3.5348 (3.9144) grad_norm 1.5287 (2.0906) loss_scale 16384.0000 (14149.8182) mem 7379MB [2024-08-26 04:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][790/1251] eta 0:01:52 lr 0.000998 wd 0.0500 time 0.2399 (0.2438) data time 0.0007 (0.0015) model time 0.2392 (0.2423) loss 3.9608 (3.9119) 
grad_norm 2.7068 (2.0880) loss_scale 16384.0000 (14178.0632) mem 7379MB [2024-08-26 04:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][800/1251] eta 0:01:49 lr 0.000998 wd 0.0500 time 0.2445 (0.2437) data time 0.0008 (0.0015) model time 0.2437 (0.2423) loss 4.6676 (3.9183) grad_norm 1.4654 (2.0834) loss_scale 16384.0000 (14205.6030) mem 7379MB [2024-08-26 04:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][810/1251] eta 0:01:47 lr 0.000998 wd 0.0500 time 0.2415 (0.2440) data time 0.0009 (0.0015) model time 0.2406 (0.2426) loss 4.1882 (3.9217) grad_norm 1.4008 (2.0826) loss_scale 16384.0000 (14232.4636) mem 7379MB [2024-08-26 04:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][820/1251] eta 0:01:45 lr 0.000998 wd 0.0500 time 0.2361 (0.2439) data time 0.0010 (0.0015) model time 0.2351 (0.2426) loss 3.4989 (3.9215) grad_norm 2.3624 (2.0791) loss_scale 16384.0000 (14258.6699) mem 7379MB [2024-08-26 04:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][830/1251] eta 0:01:42 lr 0.000998 wd 0.0500 time 0.2452 (0.2439) data time 0.0010 (0.0015) model time 0.2442 (0.2425) loss 4.4311 (3.9205) grad_norm 1.7756 (2.0838) loss_scale 16384.0000 (14284.2455) mem 7379MB [2024-08-26 04:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][840/1251] eta 0:01:40 lr 0.000998 wd 0.0500 time 0.2446 (0.2439) data time 0.0009 (0.0015) model time 0.2437 (0.2425) loss 4.3216 (3.9202) grad_norm 2.0443 (2.0829) loss_scale 16384.0000 (14309.2128) mem 7379MB [2024-08-26 04:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][850/1251] eta 0:01:37 lr 0.000998 wd 0.0500 time 0.2408 (0.2439) data time 0.0008 (0.0015) model time 0.2400 (0.2425) loss 5.3512 (3.9220) grad_norm 4.5090 (2.0849) loss_scale 16384.0000 (14333.5934) mem 7379MB [2024-08-26 04:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][860/1251] eta 0:01:35 lr 
0.000998 wd 0.0500 time 0.2393 (0.2438) data time 0.0009 (0.0015) model time 0.2385 (0.2424) loss 3.8143 (3.9207) grad_norm 1.6276 (2.0811) loss_scale 16384.0000 (14357.4077) mem 7379MB [2024-08-26 04:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][870/1251] eta 0:01:32 lr 0.000998 wd 0.0500 time 0.2397 (0.2438) data time 0.0011 (0.0015) model time 0.2386 (0.2424) loss 4.2004 (3.9165) grad_norm 1.9005 (2.0821) loss_scale 16384.0000 (14380.6751) mem 7379MB [2024-08-26 04:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][880/1251] eta 0:01:30 lr 0.000998 wd 0.0500 time 0.2390 (0.2438) data time 0.0009 (0.0015) model time 0.2380 (0.2424) loss 4.5481 (3.9179) grad_norm 2.2143 (2.0811) loss_scale 16384.0000 (14403.4143) mem 7379MB [2024-08-26 04:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][890/1251] eta 0:01:28 lr 0.000998 wd 0.0500 time 0.2445 (0.2438) data time 0.0010 (0.0015) model time 0.2435 (0.2424) loss 4.1971 (3.9200) grad_norm 1.6077 (2.0795) loss_scale 16384.0000 (14425.6431) mem 7379MB [2024-08-26 04:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][900/1251] eta 0:01:25 lr 0.000998 wd 0.0500 time 0.2444 (0.2437) data time 0.0009 (0.0015) model time 0.2434 (0.2424) loss 3.7064 (3.9197) grad_norm 1.8908 (2.0808) loss_scale 16384.0000 (14447.3785) mem 7379MB [2024-08-26 04:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][910/1251] eta 0:01:23 lr 0.000998 wd 0.0500 time 0.2270 (0.2437) data time 0.0012 (0.0015) model time 0.2258 (0.2423) loss 4.0285 (3.9223) grad_norm 1.8633 (2.0815) loss_scale 16384.0000 (14468.6367) mem 7379MB [2024-08-26 04:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][920/1251] eta 0:01:20 lr 0.000998 wd 0.0500 time 0.2442 (0.2437) data time 0.0007 (0.0015) model time 0.2435 (0.2423) loss 2.5995 (3.9213) grad_norm 2.8144 (2.0808) loss_scale 16384.0000 (14489.4332) mem 7379MB 
[2024-08-26 04:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][930/1251] eta 0:01:18 lr 0.000998 wd 0.0500 time 0.2481 (0.2437) data time 0.0010 (0.0015) model time 0.2472 (0.2423) loss 4.4466 (3.9199) grad_norm 2.0114 (2.0801) loss_scale 16384.0000 (14509.7830) mem 7379MB [2024-08-26 04:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][940/1251] eta 0:01:15 lr 0.000998 wd 0.0500 time 0.2473 (0.2436) data time 0.0007 (0.0014) model time 0.2466 (0.2423) loss 4.8716 (3.9186) grad_norm 1.8778 (2.0830) loss_scale 16384.0000 (14529.7003) mem 7379MB [2024-08-26 04:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][950/1251] eta 0:01:13 lr 0.000998 wd 0.0500 time 0.2379 (0.2436) data time 0.0014 (0.0014) model time 0.2365 (0.2423) loss 3.0528 (3.9167) grad_norm 1.9254 (2.0825) loss_scale 16384.0000 (14549.1987) mem 7379MB [2024-08-26 04:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][960/1251] eta 0:01:10 lr 0.000998 wd 0.0500 time 0.2356 (0.2436) data time 0.0009 (0.0014) model time 0.2347 (0.2422) loss 3.8203 (3.9162) grad_norm 1.6221 (2.0809) loss_scale 16384.0000 (14568.2914) mem 7379MB [2024-08-26 04:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][970/1251] eta 0:01:08 lr 0.000998 wd 0.0500 time 0.2387 (0.2436) data time 0.0007 (0.0014) model time 0.2380 (0.2422) loss 4.2395 (3.9172) grad_norm 2.1569 (2.0814) loss_scale 16384.0000 (14586.9907) mem 7379MB [2024-08-26 04:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][980/1251] eta 0:01:05 lr 0.000998 wd 0.0500 time 0.2402 (0.2435) data time 0.0009 (0.0014) model time 0.2393 (0.2422) loss 4.0618 (3.9121) grad_norm 1.3653 (2.0826) loss_scale 16384.0000 (14605.3089) mem 7379MB [2024-08-26 04:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][990/1251] eta 0:01:03 lr 0.000998 wd 0.0500 time 0.2343 (0.2435) data time 0.0011 (0.0014) model time 
0.2332 (0.2422) loss 3.4184 (3.9096) grad_norm 1.7456 (2.0850) loss_scale 16384.0000 (14623.2573) mem 7379MB [2024-08-26 04:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1000/1251] eta 0:01:01 lr 0.000998 wd 0.0500 time 0.2401 (0.2435) data time 0.0009 (0.0014) model time 0.2392 (0.2421) loss 4.1567 (3.9095) grad_norm 1.3408 (2.0810) loss_scale 16384.0000 (14640.8472) mem 7379MB [2024-08-26 04:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1010/1251] eta 0:00:58 lr 0.000998 wd 0.0500 time 0.2416 (0.2434) data time 0.0010 (0.0014) model time 0.2406 (0.2421) loss 2.8593 (3.9084) grad_norm 1.5424 (2.0816) loss_scale 16384.0000 (14658.0890) mem 7379MB [2024-08-26 04:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1020/1251] eta 0:00:56 lr 0.000998 wd 0.0500 time 0.2353 (0.2434) data time 0.0011 (0.0014) model time 0.2342 (0.2421) loss 3.9776 (3.9101) grad_norm 1.5998 (2.0786) loss_scale 16384.0000 (14674.9931) mem 7379MB [2024-08-26 04:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1030/1251] eta 0:00:53 lr 0.000998 wd 0.0500 time 0.2485 (0.2434) data time 0.0011 (0.0014) model time 0.2474 (0.2421) loss 2.6206 (3.9107) grad_norm 1.9127 (2.0759) loss_scale 16384.0000 (14691.5694) mem 7379MB [2024-08-26 04:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1040/1251] eta 0:00:51 lr 0.000998 wd 0.0500 time 0.2393 (0.2434) data time 0.0008 (0.0014) model time 0.2385 (0.2421) loss 4.4170 (3.9104) grad_norm 1.6067 (2.0740) loss_scale 16384.0000 (14707.8271) mem 7379MB [2024-08-26 04:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1050/1251] eta 0:00:48 lr 0.000998 wd 0.0500 time 0.2485 (0.2434) data time 0.0008 (0.0014) model time 0.2477 (0.2420) loss 3.6377 (3.9091) grad_norm 1.3766 (2.0745) loss_scale 16384.0000 (14723.7755) mem 7379MB [2024-08-26 04:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [27/300][1060/1251] eta 0:00:46 lr 0.000998 wd 0.0500 time 0.2835 (0.2434) data time 0.0010 (0.0014) model time 0.2825 (0.2421) loss 4.4348 (3.9112) grad_norm 1.5683 (2.0702) loss_scale 16384.0000 (14739.4232) mem 7379MB [2024-08-26 04:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1070/1251] eta 0:00:44 lr 0.000998 wd 0.0500 time 0.2420 (0.2434) data time 0.0011 (0.0014) model time 0.2409 (0.2421) loss 4.2744 (3.9102) grad_norm 1.6684 (2.0709) loss_scale 16384.0000 (14754.7787) mem 7379MB [2024-08-26 04:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1080/1251] eta 0:00:41 lr 0.000998 wd 0.0500 time 0.2498 (0.2434) data time 0.0010 (0.0014) model time 0.2488 (0.2421) loss 3.6699 (3.9100) grad_norm 1.6762 (2.0726) loss_scale 16384.0000 (14769.8501) mem 7379MB [2024-08-26 04:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1090/1251] eta 0:00:39 lr 0.000998 wd 0.0500 time 0.2366 (0.2434) data time 0.0011 (0.0014) model time 0.2356 (0.2421) loss 4.4075 (3.9080) grad_norm 1.9401 (2.0719) loss_scale 16384.0000 (14784.6453) mem 7379MB [2024-08-26 04:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1100/1251] eta 0:00:36 lr 0.000998 wd 0.0500 time 0.2402 (0.2434) data time 0.0010 (0.0014) model time 0.2393 (0.2421) loss 4.5838 (3.9099) grad_norm 2.4202 (2.0742) loss_scale 16384.0000 (14799.1717) mem 7379MB [2024-08-26 04:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1110/1251] eta 0:00:34 lr 0.000998 wd 0.0500 time 0.2427 (0.2434) data time 0.0007 (0.0014) model time 0.2420 (0.2421) loss 3.4741 (3.9083) grad_norm 2.2182 (2.0735) loss_scale 16384.0000 (14813.4365) mem 7379MB [2024-08-26 04:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1120/1251] eta 0:00:31 lr 0.000998 wd 0.0500 time 0.2379 (0.2434) data time 0.0008 (0.0014) model time 0.2371 (0.2421) loss 4.7867 (3.9066) grad_norm 1.4280 (2.0735) 
loss_scale 16384.0000 (14827.4469) mem 7379MB [2024-08-26 04:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1130/1251] eta 0:00:29 lr 0.000998 wd 0.0500 time 0.2460 (0.2434) data time 0.0009 (0.0014) model time 0.2451 (0.2421) loss 4.3114 (3.9072) grad_norm 1.6465 (2.0733) loss_scale 16384.0000 (14841.2095) mem 7379MB [2024-08-26 04:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1140/1251] eta 0:00:27 lr 0.000998 wd 0.0500 time 0.2472 (0.2434) data time 0.0007 (0.0014) model time 0.2464 (0.2421) loss 3.5082 (3.9048) grad_norm 1.7990 (2.0764) loss_scale 16384.0000 (14854.7309) mem 7379MB [2024-08-26 04:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1150/1251] eta 0:00:24 lr 0.000998 wd 0.0500 time 0.2397 (0.2434) data time 0.0009 (0.0014) model time 0.2388 (0.2421) loss 3.3217 (3.9027) grad_norm 1.6071 (2.0750) loss_scale 16384.0000 (14868.0174) mem 7379MB [2024-08-26 04:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1160/1251] eta 0:00:22 lr 0.000998 wd 0.0500 time 0.2422 (0.2434) data time 0.0011 (0.0014) model time 0.2411 (0.2421) loss 3.6089 (3.9031) grad_norm 2.2736 (2.0748) loss_scale 16384.0000 (14881.0749) mem 7379MB [2024-08-26 04:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1170/1251] eta 0:00:19 lr 0.000998 wd 0.0500 time 0.2313 (0.2434) data time 0.0008 (0.0014) model time 0.2305 (0.2421) loss 3.6378 (3.9024) grad_norm 2.0004 (2.0730) loss_scale 16384.0000 (14893.9095) mem 7379MB [2024-08-26 04:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1180/1251] eta 0:00:17 lr 0.000998 wd 0.0500 time 0.2405 (0.2434) data time 0.0007 (0.0014) model time 0.2397 (0.2421) loss 4.8213 (3.9032) grad_norm 1.8310 (2.0717) loss_scale 16384.0000 (14906.5267) mem 7379MB [2024-08-26 04:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1190/1251] eta 0:00:14 lr 0.000998 wd 0.0500 time 
0.2388 (0.2433) data time 0.0013 (0.0014) model time 0.2375 (0.2421) loss 3.8308 (3.9058) grad_norm 1.3706 (2.0693) loss_scale 16384.0000 (14918.9320) mem 7379MB [2024-08-26 04:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1200/1251] eta 0:00:12 lr 0.000998 wd 0.0500 time 0.2451 (0.2434) data time 0.0011 (0.0014) model time 0.2440 (0.2421) loss 3.5197 (3.9051) grad_norm 1.4336 (2.0681) loss_scale 16384.0000 (14931.1307) mem 7379MB [2024-08-26 04:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1210/1251] eta 0:00:09 lr 0.000998 wd 0.0500 time 0.2348 (0.2433) data time 0.0011 (0.0014) model time 0.2337 (0.2421) loss 4.4458 (3.9032) grad_norm 2.9366 (2.0681) loss_scale 16384.0000 (14943.1280) mem 7379MB [2024-08-26 04:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1220/1251] eta 0:00:07 lr 0.000998 wd 0.0500 time 0.2400 (0.2433) data time 0.0012 (0.0014) model time 0.2388 (0.2420) loss 4.0167 (3.9042) grad_norm 4.1023 (2.0671) loss_scale 16384.0000 (14954.9287) mem 7379MB [2024-08-26 04:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1230/1251] eta 0:00:05 lr 0.000998 wd 0.0500 time 0.2357 (0.2433) data time 0.0009 (0.0014) model time 0.2348 (0.2420) loss 4.0163 (3.9043) grad_norm 1.2924 (2.0670) loss_scale 16384.0000 (14966.5378) mem 7379MB [2024-08-26 04:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1240/1251] eta 0:00:02 lr 0.000998 wd 0.0500 time 0.2229 (0.2432) data time 0.0007 (0.0014) model time 0.2222 (0.2419) loss 3.7581 (3.9059) grad_norm 1.6770 (2.0683) loss_scale 16384.0000 (14977.9597) mem 7379MB [2024-08-26 04:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [27/300][1250/1251] eta 0:00:00 lr 0.000998 wd 0.0500 time 0.2302 (0.2431) data time 0.0005 (0.0014) model time 0.2297 (0.2418) loss 4.4251 (3.9066) grad_norm 1.6684 (2.0701) loss_scale 16384.0000 (14989.1990) mem 7379MB [2024-08-26 04:34:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 27 training takes 0:05:04 [2024-08-26 04:34:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:34:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.6313 (0.6313) Acc@1 86.914 (86.914) Acc@5 96.680 (96.680) Mem 7379MB [2024-08-26 04:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.092 (0.111) Loss 1.0674 (1.0180) Acc@1 76.855 (76.509) Acc@5 93.652 (93.848) Mem 7379MB [2024-08-26 04:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.096) Loss 1.4551 (1.0303) Acc@1 68.555 (75.684) Acc@5 88.770 (93.894) Mem 7379MB [2024-08-26 04:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.091) Loss 1.8965 (1.1915) Acc@1 55.957 (72.414) Acc@5 80.176 (91.630) Mem 7379MB [2024-08-26 04:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.6953 (1.2855) Acc@1 62.500 (70.532) Acc@5 85.254 (90.437) Mem 7379MB [2024-08-26 04:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 70.294 Acc@5 90.326 [2024-08-26 04:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 70.3% [2024-08-26 04:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 70.29% [2024-08-26 04:34:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:34:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.461 (0.461) Loss 0.5918 (0.5918) Acc@1 85.059 (85.059) Acc@5 96.387 (96.387) Mem 7379MB [2024-08-26 04:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.111) Loss 0.9663 (0.9569) Acc@1 76.660 (75.906) Acc@5 93.555 (93.777) Mem 7379MB [2024-08-26 04:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.096) Loss 1.4209 (0.9694) Acc@1 65.527 (75.460) Acc@5 88.086 (93.783) Mem 7379MB [2024-08-26 04:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.7314 (1.1290) Acc@1 58.398 (72.234) Acc@5 81.543 (91.447) Mem 7379MB [2024-08-26 04:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.7168 (1.2268) Acc@1 58.691 (70.324) Acc@5 84.082 (90.134) Mem 7379MB [2024-08-26 04:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 70.094 Acc@5 90.052 [2024-08-26 04:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 70.1% [2024-08-26 04:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 70.09% [2024-08-26 04:34:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:34:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][0/1251] eta 0:14:18 lr 0.000998 wd 0.0500 time 0.6859 (0.6859) data time 0.4655 (0.4655) model time 0.0000 (0.0000) loss 3.1359 (3.1359) grad_norm 3.0936 (3.0936) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][10/1251] eta 0:05:52 lr 0.000998 wd 0.0500 time 0.2489 (0.2839) data time 0.0006 (0.0442) model time 0.0000 (0.0000) loss 3.2296 (3.8901) grad_norm 2.3657 (2.0649) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][20/1251] eta 0:05:24 lr 0.000998 wd 0.0500 time 0.2465 (0.2636) data time 0.0007 (0.0236) model time 0.0000 (0.0000) loss 4.5957 (3.8073) grad_norm 1.9703 (1.9470) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][30/1251] eta 0:05:13 lr 0.000998 wd 0.0500 time 0.2459 (0.2570) data time 0.0011 (0.0163) model time 0.0000 (0.0000) loss 3.8668 (3.8080) grad_norm 2.1961 (1.9823) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][40/1251] eta 0:05:07 lr 0.000998 wd 0.0500 time 0.2549 (0.2536) data time 0.0011 (0.0126) model time 0.0000 (0.0000) loss 4.6837 (3.8977) grad_norm 1.5631 (1.9563) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][50/1251] eta 0:05:01 lr 0.000998 wd 0.0500 time 0.2450 (0.2514) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 3.1856 (3.8971) grad_norm 1.6195 (1.9956) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][60/1251] eta 0:04:57 lr 0.000998 wd 0.0500 time 0.2473 (0.2501) data time 0.0011 (0.0088) model time 0.2462 
(0.2424) loss 2.9651 (3.8915) grad_norm 2.4727 (2.0266) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][70/1251] eta 0:04:54 lr 0.000998 wd 0.0500 time 0.2577 (0.2490) data time 0.0015 (0.0078) model time 0.2562 (0.2416) loss 3.8677 (3.8777) grad_norm 3.3071 (2.0222) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][80/1251] eta 0:04:50 lr 0.000998 wd 0.0500 time 0.2422 (0.2484) data time 0.0009 (0.0070) model time 0.2413 (0.2420) loss 3.9862 (3.8761) grad_norm 1.9660 (2.0394) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][90/1251] eta 0:04:48 lr 0.000998 wd 0.0500 time 0.2484 (0.2481) data time 0.0010 (0.0063) model time 0.2474 (0.2428) loss 4.0744 (3.8863) grad_norm 2.4967 (2.0970) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][100/1251] eta 0:04:45 lr 0.000998 wd 0.0500 time 0.2408 (0.2476) data time 0.0010 (0.0058) model time 0.2398 (0.2426) loss 3.7461 (3.8654) grad_norm 1.7819 (2.0830) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][110/1251] eta 0:04:41 lr 0.000998 wd 0.0500 time 0.2489 (0.2470) data time 0.0010 (0.0054) model time 0.2479 (0.2420) loss 4.3938 (3.8708) grad_norm 2.0755 (2.0793) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][120/1251] eta 0:04:38 lr 0.000998 wd 0.0500 time 0.2399 (0.2465) data time 0.0009 (0.0050) model time 0.2390 (0.2418) loss 2.9457 (3.8586) grad_norm 1.7933 (2.0784) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[28/300][130/1251] eta 0:04:35 lr 0.000998 wd 0.0500 time 0.2499 (0.2462) data time 0.0007 (0.0047) model time 0.2492 (0.2418) loss 3.5555 (3.8350) grad_norm 2.1428 (2.0741) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][140/1251] eta 0:04:33 lr 0.000998 wd 0.0500 time 0.2386 (0.2458) data time 0.0010 (0.0045) model time 0.2376 (0.2415) loss 3.2161 (3.8560) grad_norm 2.1996 (2.0537) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][150/1251] eta 0:04:30 lr 0.000998 wd 0.0500 time 0.2431 (0.2454) data time 0.0009 (0.0042) model time 0.2422 (0.2413) loss 5.0299 (3.8873) grad_norm 1.7020 (2.0768) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][160/1251] eta 0:04:27 lr 0.000998 wd 0.0500 time 0.2448 (0.2452) data time 0.0009 (0.0040) model time 0.2439 (0.2412) loss 4.3139 (3.8841) grad_norm 3.2778 (2.0895) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][170/1251] eta 0:04:26 lr 0.000998 wd 0.0500 time 0.4871 (0.2463) data time 0.0010 (0.0039) model time 0.4861 (0.2431) loss 4.2660 (3.8838) grad_norm 2.7768 (2.0816) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][180/1251] eta 0:04:23 lr 0.000998 wd 0.0500 time 0.2442 (0.2458) data time 0.0007 (0.0037) model time 0.2435 (0.2426) loss 3.9945 (3.8870) grad_norm 1.6636 (2.0816) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][190/1251] eta 0:04:20 lr 0.000998 wd 0.0500 time 0.2302 (0.2454) data time 0.0011 (0.0036) model time 0.2291 (0.2422) loss 3.3385 (3.8860) grad_norm 2.2108 (2.0902) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][200/1251] eta 0:04:18 lr 0.000998 wd 0.0500 time 0.4582 (0.2463) data time 0.0011 (0.0034) model time 0.4572 (0.2434) loss 3.8097 (3.8670) grad_norm 2.0130 (2.0821) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][210/1251] eta 0:04:19 lr 0.000998 wd 0.0500 time 0.2408 (0.2493) data time 0.0009 (0.0033) model time 0.2399 (0.2475) loss 4.4327 (3.8795) grad_norm 1.7922 (2.0658) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][220/1251] eta 0:04:16 lr 0.000998 wd 0.0500 time 0.2410 (0.2490) data time 0.0009 (0.0033) model time 0.2401 (0.2471) loss 2.5404 (3.8567) grad_norm 1.6725 (2.0547) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][230/1251] eta 0:04:13 lr 0.000998 wd 0.0500 time 0.2252 (0.2487) data time 0.0009 (0.0032) model time 0.2244 (0.2467) loss 4.5945 (3.8567) grad_norm 1.8477 (2.0471) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][240/1251] eta 0:04:11 lr 0.000998 wd 0.0500 time 0.2358 (0.2483) data time 0.0011 (0.0031) model time 0.2346 (0.2463) loss 4.2650 (3.8524) grad_norm 1.8382 (2.0564) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][250/1251] eta 0:04:08 lr 0.000998 wd 0.0500 time 0.2448 (0.2480) data time 0.0009 (0.0030) model time 0.2439 (0.2460) loss 3.4676 (3.8607) grad_norm 1.5553 (2.0472) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][260/1251] eta 0:04:05 lr 0.000998 wd 0.0500 time 0.2440 (0.2478) 
data time 0.0007 (0.0030) model time 0.2433 (0.2457) loss 3.1158 (3.8606) grad_norm 1.7461 (2.0547) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][270/1251] eta 0:04:02 lr 0.000998 wd 0.0500 time 0.2384 (0.2475) data time 0.0010 (0.0029) model time 0.2375 (0.2454) loss 4.5723 (3.8735) grad_norm 1.9874 (2.0597) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][280/1251] eta 0:04:00 lr 0.000998 wd 0.0500 time 0.2495 (0.2473) data time 0.0012 (0.0028) model time 0.2482 (0.2453) loss 3.7969 (3.8765) grad_norm 1.6140 (2.0508) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][290/1251] eta 0:03:57 lr 0.000998 wd 0.0500 time 0.2403 (0.2470) data time 0.0009 (0.0028) model time 0.2395 (0.2450) loss 4.3826 (3.8697) grad_norm 1.7018 (2.0371) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][300/1251] eta 0:03:54 lr 0.000998 wd 0.0500 time 0.2423 (0.2469) data time 0.0009 (0.0027) model time 0.2414 (0.2448) loss 4.2221 (3.8769) grad_norm 1.9622 (2.0295) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][310/1251] eta 0:03:52 lr 0.000998 wd 0.0500 time 0.2440 (0.2468) data time 0.0010 (0.0027) model time 0.2430 (0.2448) loss 4.1568 (3.8890) grad_norm 1.4528 (2.0362) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][320/1251] eta 0:03:49 lr 0.000998 wd 0.0500 time 0.2418 (0.2467) data time 0.0009 (0.0026) model time 0.2409 (0.2447) loss 3.7990 (3.8952) grad_norm 3.2578 (2.0484) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:42 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [28/300][330/1251] eta 0:03:47 lr 0.000998 wd 0.0500 time 0.2375 (0.2465) data time 0.0007 (0.0026) model time 0.2368 (0.2445) loss 4.6463 (3.8993) grad_norm 1.6879 (2.0469) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][340/1251] eta 0:03:44 lr 0.000998 wd 0.0500 time 0.2348 (0.2464) data time 0.0009 (0.0025) model time 0.2338 (0.2444) loss 3.8972 (3.8932) grad_norm 1.8150 (2.0525) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][350/1251] eta 0:03:41 lr 0.000998 wd 0.0500 time 0.2399 (0.2463) data time 0.0009 (0.0025) model time 0.2390 (0.2443) loss 4.3814 (3.8975) grad_norm 2.5059 (2.0672) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][360/1251] eta 0:03:39 lr 0.000998 wd 0.0500 time 0.2316 (0.2462) data time 0.0011 (0.0024) model time 0.2305 (0.2442) loss 3.9732 (3.8940) grad_norm 3.6112 (2.0673) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][370/1251] eta 0:03:36 lr 0.000998 wd 0.0500 time 0.2393 (0.2461) data time 0.0011 (0.0024) model time 0.2382 (0.2441) loss 3.3335 (3.8962) grad_norm 2.2298 (2.0697) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][380/1251] eta 0:03:34 lr 0.000998 wd 0.0500 time 0.2453 (0.2460) data time 0.0011 (0.0024) model time 0.2442 (0.2440) loss 4.3487 (3.8946) grad_norm 1.7921 (2.0682) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][390/1251] eta 0:03:31 lr 0.000998 wd 0.0500 time 0.2385 (0.2459) data time 0.0011 (0.0023) model time 0.2374 (0.2440) loss 4.3092 (3.8882) 
grad_norm 1.6069 (2.0678) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][400/1251] eta 0:03:29 lr 0.000998 wd 0.0500 time 0.2392 (0.2458) data time 0.0009 (0.0023) model time 0.2383 (0.2439) loss 4.0766 (3.8886) grad_norm 1.9001 (2.0803) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][410/1251] eta 0:03:26 lr 0.000998 wd 0.0500 time 0.2406 (0.2457) data time 0.0009 (0.0023) model time 0.2397 (0.2438) loss 4.5083 (3.8848) grad_norm 1.3969 (2.0721) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][420/1251] eta 0:03:24 lr 0.000998 wd 0.0500 time 0.2459 (0.2456) data time 0.0009 (0.0023) model time 0.2450 (0.2436) loss 4.1420 (3.8810) grad_norm 2.9190 (2.0780) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][430/1251] eta 0:03:21 lr 0.000998 wd 0.0500 time 0.2512 (0.2455) data time 0.0008 (0.0022) model time 0.2504 (0.2436) loss 4.6019 (3.8862) grad_norm 2.0991 (2.0729) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][440/1251] eta 0:03:19 lr 0.000998 wd 0.0500 time 0.2441 (0.2455) data time 0.0009 (0.0022) model time 0.2432 (0.2435) loss 4.5195 (3.8846) grad_norm 2.8721 (2.0818) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][450/1251] eta 0:03:16 lr 0.000998 wd 0.0500 time 0.2399 (0.2453) data time 0.0009 (0.0022) model time 0.2390 (0.2434) loss 4.5976 (3.8931) grad_norm 1.4157 (2.0763) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][460/1251] eta 0:03:13 lr 
0.000998 wd 0.0500 time 0.2347 (0.2452) data time 0.0010 (0.0022) model time 0.2337 (0.2433) loss 4.1584 (3.8885) grad_norm 1.5222 (2.0740) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][470/1251] eta 0:03:11 lr 0.000998 wd 0.0500 time 0.2344 (0.2451) data time 0.0009 (0.0021) model time 0.2336 (0.2432) loss 3.0980 (3.8841) grad_norm 1.8671 (2.0714) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][480/1251] eta 0:03:08 lr 0.000998 wd 0.0500 time 0.2381 (0.2450) data time 0.0008 (0.0021) model time 0.2373 (0.2431) loss 3.0684 (3.8827) grad_norm 1.8334 (2.0686) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][490/1251] eta 0:03:06 lr 0.000998 wd 0.0500 time 0.2437 (0.2450) data time 0.0010 (0.0021) model time 0.2427 (0.2431) loss 4.3844 (3.8823) grad_norm 3.5598 (2.0728) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][500/1251] eta 0:03:03 lr 0.000998 wd 0.0500 time 0.2371 (0.2449) data time 0.0009 (0.0021) model time 0.2362 (0.2430) loss 4.1037 (3.8851) grad_norm 2.0364 (2.0717) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][510/1251] eta 0:03:01 lr 0.000998 wd 0.0500 time 0.2460 (0.2448) data time 0.0007 (0.0021) model time 0.2453 (0.2430) loss 4.6443 (3.8813) grad_norm 3.2639 (2.0821) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][520/1251] eta 0:02:58 lr 0.000998 wd 0.0500 time 0.2439 (0.2448) data time 0.0007 (0.0020) model time 0.2431 (0.2429) loss 3.2858 (3.8798) grad_norm 3.2938 (2.0808) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 04:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][530/1251] eta 0:02:56 lr 0.000998 wd 0.0500 time 0.2449 (0.2447) data time 0.0008 (0.0020) model time 0.2441 (0.2429) loss 4.3713 (3.8872) grad_norm 1.6370 (2.0721) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][540/1251] eta 0:02:53 lr 0.000998 wd 0.0500 time 0.2435 (0.2447) data time 0.0010 (0.0020) model time 0.2425 (0.2428) loss 4.5730 (3.8835) grad_norm 2.4613 (2.0721) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][550/1251] eta 0:02:51 lr 0.000998 wd 0.0500 time 0.2477 (0.2447) data time 0.0008 (0.0020) model time 0.2469 (0.2429) loss 4.5530 (3.8814) grad_norm 1.7141 (2.0664) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][560/1251] eta 0:02:49 lr 0.000998 wd 0.0500 time 0.2437 (0.2446) data time 0.0008 (0.0020) model time 0.2429 (0.2428) loss 4.3584 (3.8794) grad_norm 2.1844 (2.0721) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][570/1251] eta 0:02:46 lr 0.000998 wd 0.0500 time 0.2380 (0.2446) data time 0.0008 (0.0019) model time 0.2373 (0.2428) loss 2.4216 (3.8775) grad_norm 2.6575 (2.0782) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][580/1251] eta 0:02:44 lr 0.000998 wd 0.0500 time 0.2335 (0.2446) data time 0.0008 (0.0019) model time 0.2327 (0.2428) loss 3.8659 (3.8739) grad_norm 1.9207 (2.0757) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][590/1251] eta 0:02:41 lr 0.000998 wd 0.0500 time 0.2477 (0.2445) data time 0.0010 (0.0019) model time 
0.2467 (0.2428) loss 3.7358 (3.8756) grad_norm 4.2623 (2.0841) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][600/1251] eta 0:02:39 lr 0.000998 wd 0.0500 time 0.2453 (0.2445) data time 0.0009 (0.0019) model time 0.2443 (0.2427) loss 4.2679 (3.8780) grad_norm 3.1002 (2.0894) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][610/1251] eta 0:02:36 lr 0.000998 wd 0.0500 time 0.2449 (0.2445) data time 0.0010 (0.0019) model time 0.2439 (0.2427) loss 4.2986 (3.8795) grad_norm 1.7825 (2.0884) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][620/1251] eta 0:02:34 lr 0.000998 wd 0.0500 time 0.2516 (0.2444) data time 0.0008 (0.0019) model time 0.2508 (0.2427) loss 4.6151 (3.8796) grad_norm 2.7212 (2.0952) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][630/1251] eta 0:02:31 lr 0.000998 wd 0.0500 time 0.2355 (0.2444) data time 0.0008 (0.0019) model time 0.2347 (0.2427) loss 3.4224 (3.8772) grad_norm 1.4226 (2.0894) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][640/1251] eta 0:02:29 lr 0.000998 wd 0.0500 time 0.2472 (0.2443) data time 0.0012 (0.0019) model time 0.2460 (0.2426) loss 4.0159 (3.8819) grad_norm 1.6147 (2.0852) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][650/1251] eta 0:02:26 lr 0.000998 wd 0.0500 time 0.2463 (0.2443) data time 0.0007 (0.0018) model time 0.2456 (0.2426) loss 4.5336 (3.8862) grad_norm 1.8967 (2.0821) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[28/300][660/1251] eta 0:02:24 lr 0.000998 wd 0.0500 time 0.2373 (0.2442) data time 0.0008 (0.0018) model time 0.2365 (0.2425) loss 2.7060 (3.8822) grad_norm 3.9413 (2.0873) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][670/1251] eta 0:02:21 lr 0.000998 wd 0.0500 time 0.2409 (0.2441) data time 0.0009 (0.0018) model time 0.2400 (0.2424) loss 3.9130 (3.8826) grad_norm 1.6767 (2.0829) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][680/1251] eta 0:02:19 lr 0.000998 wd 0.0500 time 0.2469 (0.2441) data time 0.0010 (0.0018) model time 0.2459 (0.2424) loss 3.8248 (3.8850) grad_norm 1.8839 (2.0839) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][690/1251] eta 0:02:16 lr 0.000998 wd 0.0500 time 0.2406 (0.2440) data time 0.0008 (0.0018) model time 0.2398 (0.2423) loss 3.1043 (3.8825) grad_norm 2.1822 (2.0791) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][700/1251] eta 0:02:14 lr 0.000998 wd 0.0500 time 0.2354 (0.2440) data time 0.0009 (0.0018) model time 0.2345 (0.2423) loss 3.0917 (3.8830) grad_norm 2.2703 (2.0772) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][710/1251] eta 0:02:11 lr 0.000998 wd 0.0500 time 0.2395 (0.2439) data time 0.0009 (0.0018) model time 0.2386 (0.2423) loss 4.9148 (3.8797) grad_norm 2.2457 (2.0770) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][720/1251] eta 0:02:09 lr 0.000998 wd 0.0500 time 0.2456 (0.2439) data time 0.0010 (0.0018) model time 0.2447 (0.2422) loss 4.0358 (3.8843) grad_norm 1.6370 (2.0767) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][730/1251] eta 0:02:07 lr 0.000998 wd 0.0500 time 0.2397 (0.2439) data time 0.0007 (0.0018) model time 0.2389 (0.2422) loss 3.7092 (3.8820) grad_norm 1.8594 (2.0743) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][740/1251] eta 0:02:04 lr 0.000998 wd 0.0500 time 0.2363 (0.2439) data time 0.0010 (0.0017) model time 0.2353 (0.2422) loss 3.6892 (3.8825) grad_norm 2.2762 (2.0749) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][750/1251] eta 0:02:02 lr 0.000998 wd 0.0500 time 0.2390 (0.2439) data time 0.0009 (0.0017) model time 0.2381 (0.2422) loss 4.0943 (3.8853) grad_norm 1.7903 (2.0704) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][760/1251] eta 0:01:59 lr 0.000998 wd 0.0500 time 0.2359 (0.2439) data time 0.0010 (0.0017) model time 0.2348 (0.2422) loss 3.9548 (3.8862) grad_norm 1.7835 (2.0670) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][770/1251] eta 0:01:57 lr 0.000998 wd 0.0500 time 0.2483 (0.2438) data time 0.0010 (0.0017) model time 0.2474 (0.2422) loss 2.8111 (3.8866) grad_norm 2.6099 (2.0741) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][780/1251] eta 0:01:54 lr 0.000998 wd 0.0500 time 0.2401 (0.2438) data time 0.0008 (0.0017) model time 0.2393 (0.2422) loss 4.7440 (3.8887) grad_norm 2.1728 (2.0740) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][790/1251] eta 0:01:52 lr 0.000998 wd 0.0500 time 0.2359 (0.2438) 
data time 0.0008 (0.0017) model time 0.2351 (0.2421) loss 4.4387 (3.8920) grad_norm 1.5195 (2.0740) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][800/1251] eta 0:01:49 lr 0.000998 wd 0.0500 time 0.2413 (0.2437) data time 0.0009 (0.0017) model time 0.2404 (0.2421) loss 4.6255 (3.8965) grad_norm 1.8034 (2.0692) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][810/1251] eta 0:01:47 lr 0.000998 wd 0.0500 time 0.2460 (0.2437) data time 0.0010 (0.0017) model time 0.2450 (0.2421) loss 3.9454 (3.8981) grad_norm 2.0789 (2.0676) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][820/1251] eta 0:01:45 lr 0.000998 wd 0.0500 time 0.2382 (0.2437) data time 0.0009 (0.0017) model time 0.2373 (0.2421) loss 3.1070 (3.8963) grad_norm 2.4924 (2.0686) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][830/1251] eta 0:01:42 lr 0.000998 wd 0.0500 time 0.2401 (0.2437) data time 0.0012 (0.0017) model time 0.2389 (0.2420) loss 4.2007 (3.8936) grad_norm 2.0078 (2.0706) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][840/1251] eta 0:01:40 lr 0.000998 wd 0.0500 time 0.2387 (0.2437) data time 0.0009 (0.0017) model time 0.2378 (0.2420) loss 4.1848 (3.8961) grad_norm 1.6456 (2.0699) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][850/1251] eta 0:01:37 lr 0.000998 wd 0.0500 time 0.2456 (0.2437) data time 0.0007 (0.0017) model time 0.2449 (0.2420) loss 4.7106 (3.8976) grad_norm 2.0611 (2.0665) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [28/300][860/1251] eta 0:01:35 lr 0.000998 wd 0.0500 time 0.2510 (0.2437) data time 0.0009 (0.0017) model time 0.2501 (0.2420) loss 4.8262 (3.8952) grad_norm 1.4259 (2.0653) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][870/1251] eta 0:01:32 lr 0.000998 wd 0.0500 time 0.2404 (0.2437) data time 0.0011 (0.0017) model time 0.2394 (0.2421) loss 3.7589 (3.8954) grad_norm 1.4993 (2.0669) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][880/1251] eta 0:01:30 lr 0.000998 wd 0.0500 time 0.2470 (0.2437) data time 0.0010 (0.0017) model time 0.2460 (0.2420) loss 3.2665 (3.8905) grad_norm 1.6153 (2.0634) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][890/1251] eta 0:01:27 lr 0.000998 wd 0.0500 time 0.2409 (0.2436) data time 0.0007 (0.0017) model time 0.2402 (0.2420) loss 3.3378 (3.8894) grad_norm 3.0384 (2.0614) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][900/1251] eta 0:01:25 lr 0.000998 wd 0.0500 time 0.2487 (0.2436) data time 0.0011 (0.0016) model time 0.2477 (0.2420) loss 3.5297 (3.8898) grad_norm 1.8949 (2.0581) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][910/1251] eta 0:01:23 lr 0.000998 wd 0.0500 time 0.2416 (0.2436) data time 0.0007 (0.0016) model time 0.2409 (0.2420) loss 3.8718 (3.8913) grad_norm 1.5410 (2.0545) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][920/1251] eta 0:01:20 lr 0.000998 wd 0.0500 time 0.2355 (0.2436) data time 0.0010 (0.0016) model time 0.2346 (0.2420) loss 3.8271 (3.8895) 
grad_norm 1.5069 (2.0569) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][930/1251] eta 0:01:18 lr 0.000998 wd 0.0500 time 0.2423 (0.2435) data time 0.0007 (0.0016) model time 0.2416 (0.2420) loss 4.3790 (3.8891) grad_norm 2.0740 (2.0587) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][940/1251] eta 0:01:15 lr 0.000998 wd 0.0500 time 0.2325 (0.2435) data time 0.0011 (0.0016) model time 0.2314 (0.2419) loss 3.0925 (3.8881) grad_norm 2.1290 (2.0587) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][950/1251] eta 0:01:13 lr 0.000998 wd 0.0500 time 0.2361 (0.2435) data time 0.0012 (0.0016) model time 0.2349 (0.2419) loss 4.0135 (3.8857) grad_norm 1.4813 (2.0586) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][960/1251] eta 0:01:10 lr 0.000998 wd 0.0500 time 0.2431 (0.2435) data time 0.0008 (0.0016) model time 0.2423 (0.2419) loss 4.4267 (3.8877) grad_norm 1.7779 (2.0559) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][970/1251] eta 0:01:08 lr 0.000998 wd 0.0500 time 0.2353 (0.2435) data time 0.0011 (0.0016) model time 0.2342 (0.2419) loss 3.2960 (3.8850) grad_norm 1.9732 (2.0568) loss_scale 32768.0000 (16535.8599) mem 7379MB [2024-08-26 04:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][980/1251] eta 0:01:05 lr 0.000998 wd 0.0500 time 0.2346 (0.2434) data time 0.0010 (0.0016) model time 0.2336 (0.2419) loss 4.4059 (3.8875) grad_norm 3.4823 (inf) loss_scale 16384.0000 (16584.4159) mem 7379MB [2024-08-26 04:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][990/1251] eta 0:01:03 lr 0.000998 
wd 0.0500 time 0.2410 (0.2434) data time 0.0009 (0.0016) model time 0.2401 (0.2419) loss 3.9646 (3.8848) grad_norm 1.5031 (inf) loss_scale 16384.0000 (16582.3935) mem 7379MB [2024-08-26 04:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1000/1251] eta 0:01:01 lr 0.000998 wd 0.0500 time 0.2398 (0.2434) data time 0.0011 (0.0016) model time 0.2387 (0.2419) loss 4.2318 (3.8870) grad_norm 1.5145 (inf) loss_scale 16384.0000 (16580.4116) mem 7379MB [2024-08-26 04:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1010/1251] eta 0:00:58 lr 0.000998 wd 0.0500 time 0.2358 (0.2434) data time 0.0011 (0.0016) model time 0.2348 (0.2419) loss 3.4738 (3.8859) grad_norm 4.1645 (inf) loss_scale 16384.0000 (16578.4688) mem 7379MB [2024-08-26 04:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1020/1251] eta 0:00:56 lr 0.000998 wd 0.0500 time 0.2491 (0.2434) data time 0.0008 (0.0016) model time 0.2483 (0.2419) loss 3.3460 (3.8810) grad_norm 1.7613 (inf) loss_scale 16384.0000 (16576.5642) mem 7379MB [2024-08-26 04:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1030/1251] eta 0:00:53 lr 0.000998 wd 0.0500 time 0.2451 (0.2434) data time 0.0007 (0.0016) model time 0.2443 (0.2418) loss 3.4157 (3.8798) grad_norm 2.2128 (inf) loss_scale 16384.0000 (16574.6964) mem 7379MB [2024-08-26 04:38:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1040/1251] eta 0:00:51 lr 0.000998 wd 0.0500 time 0.2400 (0.2433) data time 0.0009 (0.0016) model time 0.2390 (0.2418) loss 4.0231 (3.8808) grad_norm 1.8816 (inf) loss_scale 16384.0000 (16572.8646) mem 7379MB [2024-08-26 04:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1050/1251] eta 0:00:48 lr 0.000998 wd 0.0500 time 0.2447 (0.2433) data time 0.0010 (0.0016) model time 0.2437 (0.2418) loss 3.9826 (3.8791) grad_norm 1.6643 (inf) loss_scale 16384.0000 (16571.0676) mem 7379MB [2024-08-26 04:38:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1060/1251] eta 0:00:46 lr 0.000998 wd 0.0500 time 0.2400 (0.2433) data time 0.0009 (0.0016) model time 0.2392 (0.2418) loss 4.7810 (3.8789) grad_norm 1.4665 (inf) loss_scale 16384.0000 (16569.3044) mem 7379MB [2024-08-26 04:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1070/1251] eta 0:00:44 lr 0.000998 wd 0.0500 time 0.2356 (0.2432) data time 0.0011 (0.0015) model time 0.2345 (0.2417) loss 3.7952 (3.8778) grad_norm 2.3783 (inf) loss_scale 8192.0000 (16521.6807) mem 7379MB [2024-08-26 04:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1080/1251] eta 0:00:41 lr 0.000998 wd 0.0500 time 0.2435 (0.2432) data time 0.0010 (0.0015) model time 0.2425 (0.2417) loss 3.9525 (3.8801) grad_norm 1.6324 (inf) loss_scale 8192.0000 (16444.6253) mem 7379MB [2024-08-26 04:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1090/1251] eta 0:00:39 lr 0.000998 wd 0.0500 time 0.2411 (0.2432) data time 0.0011 (0.0015) model time 0.2400 (0.2417) loss 3.5937 (3.8790) grad_norm 2.6750 (inf) loss_scale 8192.0000 (16368.9826) mem 7379MB [2024-08-26 04:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1100/1251] eta 0:00:36 lr 0.000998 wd 0.0500 time 0.4787 (0.2434) data time 0.0007 (0.0015) model time 0.4779 (0.2419) loss 4.4495 (3.8781) grad_norm 1.8866 (inf) loss_scale 8192.0000 (16294.7139) mem 7379MB [2024-08-26 04:38:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1110/1251] eta 0:00:34 lr 0.000998 wd 0.0500 time 0.2366 (0.2433) data time 0.0007 (0.0015) model time 0.2359 (0.2418) loss 3.5497 (3.8811) grad_norm 1.6045 (inf) loss_scale 8192.0000 (16221.7822) mem 7379MB [2024-08-26 04:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1120/1251] eta 0:00:31 lr 0.000998 wd 0.0500 time 0.2358 (0.2433) data time 0.0007 (0.0015) model time 0.2351 (0.2418) loss 4.6922 (3.8830) 
grad_norm 1.2832 (inf) loss_scale 8192.0000 (16150.1517) mem 7379MB [2024-08-26 04:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1130/1251] eta 0:00:29 lr 0.000998 wd 0.0500 time 0.4788 (0.2435) data time 0.0007 (0.0015) model time 0.4780 (0.2420) loss 4.4141 (3.8825) grad_norm 2.2828 (inf) loss_scale 8192.0000 (16079.7878) mem 7379MB [2024-08-26 04:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1140/1251] eta 0:00:27 lr 0.000998 wd 0.0500 time 0.2395 (0.2440) data time 0.0010 (0.0015) model time 0.2384 (0.2426) loss 3.9398 (3.8809) grad_norm 2.7264 (inf) loss_scale 8192.0000 (16010.6573) mem 7379MB [2024-08-26 04:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1150/1251] eta 0:00:24 lr 0.000998 wd 0.0500 time 0.2423 (0.2441) data time 0.0009 (0.0015) model time 0.2414 (0.2426) loss 4.4610 (3.8828) grad_norm 1.9214 (inf) loss_scale 8192.0000 (15942.7281) mem 7379MB [2024-08-26 04:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1160/1251] eta 0:00:22 lr 0.000998 wd 0.0500 time 0.2411 (0.2440) data time 0.0007 (0.0015) model time 0.2404 (0.2426) loss 3.0772 (3.8807) grad_norm 2.4292 (inf) loss_scale 8192.0000 (15875.9690) mem 7379MB [2024-08-26 04:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1170/1251] eta 0:00:19 lr 0.000998 wd 0.0500 time 0.2428 (0.2440) data time 0.0009 (0.0015) model time 0.2419 (0.2426) loss 3.2017 (3.8808) grad_norm 1.9862 (inf) loss_scale 8192.0000 (15810.3501) mem 7379MB [2024-08-26 04:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1180/1251] eta 0:00:17 lr 0.000998 wd 0.0500 time 0.2408 (0.2440) data time 0.0010 (0.0015) model time 0.2398 (0.2425) loss 4.5402 (3.8824) grad_norm 2.2202 (inf) loss_scale 8192.0000 (15745.8425) mem 7379MB [2024-08-26 04:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1190/1251] eta 0:00:14 lr 0.000998 wd 0.0500 time 
0.2445 (0.2440) data time 0.0007 (0.0015) model time 0.2438 (0.2425) loss 4.2922 (3.8782) grad_norm 1.4981 (inf) loss_scale 8192.0000 (15682.4181) mem 7379MB [2024-08-26 04:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1200/1251] eta 0:00:12 lr 0.000998 wd 0.0500 time 0.2400 (0.2440) data time 0.0007 (0.0015) model time 0.2393 (0.2425) loss 4.0526 (3.8808) grad_norm 1.9372 (inf) loss_scale 8192.0000 (15620.0500) mem 7379MB [2024-08-26 04:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1210/1251] eta 0:00:10 lr 0.000997 wd 0.0500 time 0.2419 (0.2439) data time 0.0010 (0.0015) model time 0.2409 (0.2425) loss 3.0336 (3.8800) grad_norm 1.6779 (inf) loss_scale 8192.0000 (15558.7118) mem 7379MB [2024-08-26 04:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1220/1251] eta 0:00:07 lr 0.000997 wd 0.0500 time 0.2374 (0.2439) data time 0.0008 (0.0015) model time 0.2366 (0.2425) loss 3.8428 (3.8815) grad_norm 2.0045 (inf) loss_scale 8192.0000 (15498.3784) mem 7379MB [2024-08-26 04:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1230/1251] eta 0:00:05 lr 0.000997 wd 0.0500 time 0.2391 (0.2439) data time 0.0008 (0.0015) model time 0.2383 (0.2425) loss 4.2850 (3.8817) grad_norm 1.6366 (inf) loss_scale 8192.0000 (15439.0252) mem 7379MB [2024-08-26 04:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1240/1251] eta 0:00:02 lr 0.000997 wd 0.0500 time 0.2248 (0.2438) data time 0.0007 (0.0015) model time 0.2240 (0.2424) loss 3.6204 (3.8801) grad_norm 1.5041 (inf) loss_scale 8192.0000 (15380.6285) mem 7379MB [2024-08-26 04:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [28/300][1250/1251] eta 0:00:00 lr 0.000997 wd 0.0500 time 0.2239 (0.2437) data time 0.0007 (0.0015) model time 0.2232 (0.2422) loss 4.0179 (3.8814) grad_norm 3.1583 (inf) loss_scale 8192.0000 (15323.1655) mem 7379MB [2024-08-26 04:39:25 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 398): INFO EPOCH 28 training takes 0:05:04 [2024-08-26 04:39:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:39:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.439 (0.439) Loss 0.6963 (0.6963) Acc@1 86.133 (86.133) Acc@5 97.266 (97.266) Mem 7379MB [2024-08-26 04:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.111) Loss 1.0615 (1.0444) Acc@1 78.809 (76.776) Acc@5 93.359 (94.043) Mem 7379MB [2024-08-26 04:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.096) Loss 1.5684 (1.0639) Acc@1 64.453 (75.939) Acc@5 87.305 (93.992) Mem 7379MB [2024-08-26 04:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.091) Loss 1.8242 (1.2228) Acc@1 58.203 (72.615) Acc@5 82.520 (91.731) Mem 7379MB [2024-08-26 04:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.6777 (1.3186) Acc@1 63.867 (70.555) Acc@5 84.668 (90.413) Mem 7379MB [2024-08-26 04:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 70.336 Acc@5 90.314 [2024-08-26 04:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 70.3% [2024-08-26 04:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 70.34% [2024-08-26 04:39:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:39:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.485 (0.485) Loss 0.5718 (0.5718) Acc@1 86.230 (86.230) Acc@5 96.484 (96.484) Mem 7379MB [2024-08-26 04:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.115) Loss 0.9395 (0.9316) Acc@1 78.027 (76.749) Acc@5 94.043 (94.061) Mem 7379MB [2024-08-26 04:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.098) Loss 1.3789 (0.9439) Acc@1 66.016 (76.209) Acc@5 89.160 (94.085) Mem 7379MB [2024-08-26 04:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.091) Loss 1.6973 (1.1001) Acc@1 58.398 (72.968) Acc@5 82.129 (91.819) Mem 7379MB [2024-08-26 04:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.6797 (1.1956) Acc@1 59.961 (71.120) Acc@5 84.668 (90.546) Mem 7379MB [2024-08-26 04:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 70.872 Acc@5 90.466 [2024-08-26 04:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 70.9% [2024-08-26 04:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 70.87% [2024-08-26 04:39:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:39:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][0/1251] eta 0:14:24 lr 0.000997 wd 0.0500 time 0.6914 (0.6914) data time 0.4659 (0.4659) model time 0.0000 (0.0000) loss 3.6642 (3.6642) grad_norm 1.4132 (1.4132) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][10/1251] eta 0:05:46 lr 0.000997 wd 0.0500 time 0.2356 (0.2793) data time 0.0007 (0.0433) model time 0.0000 (0.0000) loss 2.8722 (3.6522) grad_norm 1.7797 (2.1962) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][20/1251] eta 0:05:21 lr 0.000997 wd 0.0500 time 0.2403 (0.2609) data time 0.0008 (0.0231) model time 0.0000 (0.0000) loss 4.4219 (3.7244) grad_norm 1.8080 (2.0457) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][30/1251] eta 0:05:10 lr 0.000997 wd 0.0500 time 0.2475 (0.2541) data time 0.0011 (0.0160) model time 0.0000 (0.0000) loss 3.6357 (3.7684) grad_norm 2.7037 (2.0384) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][40/1251] eta 0:05:03 lr 0.000997 wd 0.0500 time 0.2385 (0.2508) data time 0.0009 (0.0124) model time 0.0000 (0.0000) loss 4.4671 (3.7095) grad_norm 1.9936 (2.0722) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][50/1251] eta 0:04:59 lr 0.000997 wd 0.0500 time 0.2418 (0.2491) data time 0.0011 (0.0102) model time 0.0000 (0.0000) loss 4.0849 (3.7723) grad_norm 1.5597 (2.1390) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][60/1251] eta 0:04:54 lr 0.000997 wd 0.0500 time 0.2394 (0.2475) data time 0.0007 (0.0087) model time 0.2387 (0.2388) loss 
3.4591 (3.7976) grad_norm 2.5014 (2.1130) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][70/1251] eta 0:04:51 lr 0.000997 wd 0.0500 time 0.2393 (0.2467) data time 0.0007 (0.0076) model time 0.2385 (0.2396) loss 4.7950 (3.8006) grad_norm 2.9146 (2.0975) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][80/1251] eta 0:04:48 lr 0.000997 wd 0.0500 time 0.2488 (0.2460) data time 0.0009 (0.0068) model time 0.2479 (0.2399) loss 3.8614 (3.8200) grad_norm 2.5634 (2.0873) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][90/1251] eta 0:04:45 lr 0.000997 wd 0.0500 time 0.2436 (0.2457) data time 0.0007 (0.0061) model time 0.2429 (0.2405) loss 4.3914 (3.8600) grad_norm 1.6423 (2.0542) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][100/1251] eta 0:04:42 lr 0.000997 wd 0.0500 time 0.2389 (0.2454) data time 0.0010 (0.0056) model time 0.2380 (0.2406) loss 3.6867 (3.8592) grad_norm 2.2134 (2.0595) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][110/1251] eta 0:04:39 lr 0.000997 wd 0.0500 time 0.2390 (0.2451) data time 0.0008 (0.0052) model time 0.2382 (0.2408) loss 4.7027 (3.8724) grad_norm 2.4935 (2.0794) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][120/1251] eta 0:04:37 lr 0.000997 wd 0.0500 time 0.2465 (0.2450) data time 0.0010 (0.0048) model time 0.2456 (0.2411) loss 3.7932 (3.8806) grad_norm 1.3698 (2.0727) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][130/1251] eta 0:04:34 lr 
0.000997 wd 0.0500 time 0.2475 (0.2448) data time 0.0007 (0.0045) model time 0.2468 (0.2411) loss 3.9502 (3.8659) grad_norm 1.5461 (2.0433) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][140/1251] eta 0:04:31 lr 0.000997 wd 0.0500 time 0.2455 (0.2446) data time 0.0009 (0.0043) model time 0.2445 (0.2410) loss 3.1783 (3.8577) grad_norm 1.5510 (2.0324) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][150/1251] eta 0:04:29 lr 0.000997 wd 0.0500 time 0.3432 (0.2449) data time 0.0010 (0.0041) model time 0.3422 (0.2418) loss 3.0680 (3.8334) grad_norm 2.6017 (2.0286) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][160/1251] eta 0:04:26 lr 0.000997 wd 0.0500 time 0.2434 (0.2446) data time 0.0009 (0.0039) model time 0.2425 (0.2415) loss 3.0469 (3.8382) grad_norm 1.7901 (2.0180) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][170/1251] eta 0:04:24 lr 0.000997 wd 0.0500 time 0.2424 (0.2444) data time 0.0009 (0.0037) model time 0.2415 (0.2415) loss 4.0138 (3.8386) grad_norm 1.7562 (1.9958) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][180/1251] eta 0:04:21 lr 0.000997 wd 0.0500 time 0.2428 (0.2444) data time 0.0007 (0.0036) model time 0.2421 (0.2415) loss 4.2964 (3.8308) grad_norm 2.4508 (1.9987) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][190/1251] eta 0:04:19 lr 0.000997 wd 0.0500 time 0.2429 (0.2442) data time 0.0009 (0.0035) model time 0.2420 (0.2414) loss 3.8561 (3.8151) grad_norm 2.1428 (2.0015) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:24 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][200/1251] eta 0:04:16 lr 0.000997 wd 0.0500 time 0.2406 (0.2441) data time 0.0008 (0.0033) model time 0.2398 (0.2413) loss 4.1166 (3.8113) grad_norm 2.6920 (2.0024) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][210/1251] eta 0:04:14 lr 0.000997 wd 0.0500 time 0.2455 (0.2440) data time 0.0011 (0.0032) model time 0.2444 (0.2414) loss 4.2197 (3.8118) grad_norm 1.1864 (2.0049) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][220/1251] eta 0:04:11 lr 0.000997 wd 0.0500 time 0.2441 (0.2440) data time 0.0007 (0.0031) model time 0.2434 (0.2414) loss 4.9714 (3.8159) grad_norm 1.7945 (2.0011) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][230/1251] eta 0:04:08 lr 0.000997 wd 0.0500 time 0.2365 (0.2437) data time 0.0009 (0.0030) model time 0.2355 (0.2412) loss 2.8970 (3.8041) grad_norm 1.9735 (1.9883) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][240/1251] eta 0:04:06 lr 0.000997 wd 0.0500 time 0.2397 (0.2436) data time 0.0007 (0.0030) model time 0.2390 (0.2411) loss 2.9446 (3.7964) grad_norm 1.7704 (1.9948) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][250/1251] eta 0:04:03 lr 0.000997 wd 0.0500 time 0.2447 (0.2435) data time 0.0010 (0.0029) model time 0.2437 (0.2411) loss 3.7443 (3.7904) grad_norm 2.0771 (1.9907) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][260/1251] eta 0:04:01 lr 0.000997 wd 0.0500 time 0.2355 (0.2433) data time 0.0011 (0.0028) model time 0.2344 (0.2409) loss 2.9595 
(3.7906) grad_norm 2.0412 (1.9796) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][270/1251] eta 0:03:58 lr 0.000997 wd 0.0500 time 0.2366 (0.2432) data time 0.0011 (0.0027) model time 0.2355 (0.2409) loss 3.7010 (3.7897) grad_norm 1.4364 (1.9852) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][280/1251] eta 0:03:56 lr 0.000997 wd 0.0500 time 0.2391 (0.2431) data time 0.0011 (0.0027) model time 0.2381 (0.2408) loss 4.0544 (3.7937) grad_norm 1.8924 (1.9857) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][290/1251] eta 0:03:53 lr 0.000997 wd 0.0500 time 0.2485 (0.2431) data time 0.0009 (0.0026) model time 0.2476 (0.2408) loss 4.1542 (3.7952) grad_norm 2.2073 (1.9978) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][300/1251] eta 0:03:51 lr 0.000997 wd 0.0500 time 0.2471 (0.2431) data time 0.0009 (0.0026) model time 0.2462 (0.2409) loss 4.1956 (3.7925) grad_norm 1.5797 (1.9975) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][310/1251] eta 0:03:48 lr 0.000997 wd 0.0500 time 0.2415 (0.2431) data time 0.0009 (0.0025) model time 0.2406 (0.2409) loss 4.5602 (3.7971) grad_norm 1.6039 (1.9875) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][320/1251] eta 0:03:46 lr 0.000997 wd 0.0500 time 0.2443 (0.2431) data time 0.0010 (0.0025) model time 0.2434 (0.2410) loss 3.0939 (3.7967) grad_norm 1.5775 (1.9768) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][330/1251] eta 0:03:43 lr 0.000997 wd 
0.0500 time 0.2406 (0.2430) data time 0.0007 (0.0024) model time 0.2398 (0.2409) loss 4.2587 (3.7995) grad_norm 1.8881 (1.9807) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][340/1251] eta 0:03:41 lr 0.000997 wd 0.0500 time 0.2471 (0.2430) data time 0.0009 (0.0024) model time 0.2462 (0.2409) loss 4.1124 (3.8077) grad_norm 2.7515 (1.9791) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][350/1251] eta 0:03:38 lr 0.000997 wd 0.0500 time 0.2370 (0.2430) data time 0.0010 (0.0023) model time 0.2361 (0.2409) loss 3.7684 (3.8134) grad_norm 1.2163 (1.9810) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][360/1251] eta 0:03:36 lr 0.000997 wd 0.0500 time 0.2455 (0.2430) data time 0.0010 (0.0023) model time 0.2445 (0.2410) loss 4.2411 (3.8160) grad_norm 1.9828 (1.9704) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][370/1251] eta 0:03:34 lr 0.000997 wd 0.0500 time 0.2433 (0.2430) data time 0.0011 (0.0023) model time 0.2422 (0.2410) loss 4.1010 (3.8192) grad_norm 1.2634 (1.9624) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][380/1251] eta 0:03:31 lr 0.000997 wd 0.0500 time 0.2477 (0.2430) data time 0.0007 (0.0022) model time 0.2470 (0.2411) loss 3.9871 (3.8166) grad_norm 2.2698 (1.9595) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][390/1251] eta 0:03:29 lr 0.000997 wd 0.0500 time 0.2562 (0.2432) data time 0.0008 (0.0022) model time 0.2553 (0.2413) loss 4.2713 (3.8218) grad_norm 2.1447 (1.9569) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:13 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][400/1251] eta 0:03:26 lr 0.000997 wd 0.0500 time 0.2398 (0.2431) data time 0.0008 (0.0022) model time 0.2390 (0.2412) loss 2.5896 (3.8143) grad_norm 2.1062 (1.9646) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][410/1251] eta 0:03:24 lr 0.000997 wd 0.0500 time 0.2413 (0.2431) data time 0.0009 (0.0021) model time 0.2404 (0.2412) loss 2.8006 (3.8141) grad_norm 2.9480 (1.9713) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][420/1251] eta 0:03:22 lr 0.000997 wd 0.0500 time 0.2467 (0.2431) data time 0.0009 (0.0021) model time 0.2458 (0.2413) loss 3.0304 (3.8035) grad_norm 1.8577 (1.9741) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][430/1251] eta 0:03:19 lr 0.000997 wd 0.0500 time 0.2445 (0.2431) data time 0.0007 (0.0021) model time 0.2438 (0.2412) loss 3.5628 (3.8055) grad_norm 2.4247 (1.9764) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][440/1251] eta 0:03:17 lr 0.000997 wd 0.0500 time 0.2413 (0.2431) data time 0.0010 (0.0021) model time 0.2403 (0.2413) loss 3.2506 (3.8076) grad_norm 2.3535 (1.9740) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][450/1251] eta 0:03:14 lr 0.000997 wd 0.0500 time 0.2396 (0.2431) data time 0.0009 (0.0020) model time 0.2387 (0.2413) loss 3.5788 (3.8060) grad_norm 1.6872 (1.9742) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][460/1251] eta 0:03:12 lr 0.000997 wd 0.0500 time 0.2711 (0.2431) data time 0.0011 (0.0020) model time 0.2700 (0.2414) loss 4.0057 
(3.8084) grad_norm 1.7895 (1.9742) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][470/1251] eta 0:03:09 lr 0.000997 wd 0.0500 time 0.2430 (0.2431) data time 0.0009 (0.0020) model time 0.2421 (0.2414) loss 2.8631 (3.8042) grad_norm 2.0241 (1.9768) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][480/1251] eta 0:03:07 lr 0.000997 wd 0.0500 time 0.2372 (0.2435) data time 0.0011 (0.0020) model time 0.2361 (0.2418) loss 3.2855 (3.8016) grad_norm 1.8087 (1.9794) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][490/1251] eta 0:03:05 lr 0.000997 wd 0.0500 time 0.2402 (0.2434) data time 0.0012 (0.0020) model time 0.2390 (0.2418) loss 4.1978 (3.7976) grad_norm 1.4641 (1.9743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][500/1251] eta 0:03:02 lr 0.000997 wd 0.0500 time 0.2429 (0.2434) data time 0.0009 (0.0019) model time 0.2420 (0.2418) loss 3.9089 (3.7965) grad_norm 2.4146 (1.9738) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][510/1251] eta 0:03:00 lr 0.000997 wd 0.0500 time 0.2497 (0.2434) data time 0.0008 (0.0019) model time 0.2490 (0.2417) loss 3.7309 (3.7948) grad_norm 1.7214 (1.9828) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][520/1251] eta 0:02:57 lr 0.000997 wd 0.0500 time 0.2477 (0.2434) data time 0.0008 (0.0019) model time 0.2469 (0.2418) loss 3.6948 (3.7998) grad_norm 2.5598 (1.9861) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][530/1251] eta 0:02:55 lr 0.000997 wd 
0.0500 time 0.2400 (0.2434) data time 0.0007 (0.0019) model time 0.2393 (0.2417) loss 4.0697 (3.8003) grad_norm 2.0389 (1.9830) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][540/1251] eta 0:02:52 lr 0.000997 wd 0.0500 time 0.2496 (0.2433) data time 0.0008 (0.0019) model time 0.2488 (0.2417) loss 3.7936 (3.7985) grad_norm 1.3916 (1.9848) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][550/1251] eta 0:02:50 lr 0.000997 wd 0.0500 time 0.2391 (0.2433) data time 0.0010 (0.0019) model time 0.2381 (0.2417) loss 4.3832 (3.8011) grad_norm 1.5899 (1.9846) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][560/1251] eta 0:02:48 lr 0.000997 wd 0.0500 time 0.2413 (0.2433) data time 0.0007 (0.0018) model time 0.2405 (0.2417) loss 3.5600 (3.8025) grad_norm 1.8836 (1.9814) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][570/1251] eta 0:02:45 lr 0.000997 wd 0.0500 time 0.2433 (0.2432) data time 0.0010 (0.0018) model time 0.2423 (0.2416) loss 3.6861 (3.8056) grad_norm 3.0077 (1.9873) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][580/1251] eta 0:02:43 lr 0.000997 wd 0.0500 time 0.2396 (0.2432) data time 0.0010 (0.0018) model time 0.2386 (0.2416) loss 4.0033 (3.8060) grad_norm 1.8553 (1.9879) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][590/1251] eta 0:02:40 lr 0.000997 wd 0.0500 time 0.2418 (0.2432) data time 0.0007 (0.0018) model time 0.2410 (0.2416) loss 3.7490 (3.8035) grad_norm 2.5939 (1.9909) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][600/1251] eta 0:02:38 lr 0.000997 wd 0.0500 time 0.2436 (0.2432) data time 0.0009 (0.0018) model time 0.2427 (0.2416) loss 3.5876 (3.8057) grad_norm 1.9751 (1.9881) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][610/1251] eta 0:02:35 lr 0.000997 wd 0.0500 time 0.2406 (0.2432) data time 0.0009 (0.0018) model time 0.2397 (0.2416) loss 4.6301 (3.8045) grad_norm 1.6086 (1.9842) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][620/1251] eta 0:02:33 lr 0.000997 wd 0.0500 time 0.2435 (0.2432) data time 0.0008 (0.0018) model time 0.2427 (0.2416) loss 4.3506 (3.8066) grad_norm 1.7827 (1.9814) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][630/1251] eta 0:02:31 lr 0.000997 wd 0.0500 time 0.2659 (0.2432) data time 0.0008 (0.0018) model time 0.2651 (0.2417) loss 3.7985 (3.8118) grad_norm 1.3691 (1.9777) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][640/1251] eta 0:02:28 lr 0.000997 wd 0.0500 time 0.2334 (0.2432) data time 0.0009 (0.0017) model time 0.2326 (0.2416) loss 3.2205 (3.8134) grad_norm 1.5144 (1.9729) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][650/1251] eta 0:02:26 lr 0.000997 wd 0.0500 time 0.2369 (0.2431) data time 0.0010 (0.0017) model time 0.2358 (0.2416) loss 3.1148 (3.8125) grad_norm 1.7799 (1.9767) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][660/1251] eta 0:02:23 lr 0.000997 wd 0.0500 time 0.2487 (0.2431) data time 0.0010 (0.0018) model time 0.2476 (0.2416) loss 4.0586 
(3.8147) grad_norm 2.2941 (1.9764) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][670/1251] eta 0:02:21 lr 0.000997 wd 0.0500 time 0.2443 (0.2431) data time 0.0012 (0.0017) model time 0.2431 (0.2416) loss 4.0882 (3.8154) grad_norm 2.0790 (1.9803) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][680/1251] eta 0:02:18 lr 0.000997 wd 0.0500 time 0.2405 (0.2431) data time 0.0012 (0.0017) model time 0.2393 (0.2416) loss 4.0100 (3.8171) grad_norm 1.9503 (1.9837) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][690/1251] eta 0:02:16 lr 0.000997 wd 0.0500 time 0.2443 (0.2431) data time 0.0008 (0.0017) model time 0.2434 (0.2416) loss 4.7486 (3.8168) grad_norm 1.7686 (1.9886) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][700/1251] eta 0:02:13 lr 0.000997 wd 0.0500 time 0.2455 (0.2431) data time 0.0007 (0.0017) model time 0.2448 (0.2415) loss 5.1105 (3.8186) grad_norm 1.3826 (1.9852) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][710/1251] eta 0:02:11 lr 0.000997 wd 0.0500 time 0.2413 (0.2430) data time 0.0011 (0.0017) model time 0.2403 (0.2415) loss 4.0345 (3.8193) grad_norm 1.4739 (1.9861) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][720/1251] eta 0:02:09 lr 0.000997 wd 0.0500 time 0.2407 (0.2430) data time 0.0008 (0.0017) model time 0.2399 (0.2415) loss 4.3833 (3.8223) grad_norm 2.4807 (1.9912) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][730/1251] eta 0:02:06 lr 0.000997 wd 
0.0500 time 0.2377 (0.2436) data time 0.0008 (0.0017) model time 0.2370 (0.2421) loss 3.0854 (3.8237) grad_norm 1.2716 (1.9887) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][740/1251] eta 0:02:04 lr 0.000997 wd 0.0500 time 0.2433 (0.2445) data time 0.0008 (0.0017) model time 0.2425 (0.2431) loss 4.0937 (3.8267) grad_norm 1.5987 (1.9887) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][750/1251] eta 0:02:02 lr 0.000997 wd 0.0500 time 0.2376 (0.2444) data time 0.0009 (0.0017) model time 0.2367 (0.2430) loss 4.2816 (3.8284) grad_norm 2.3895 (1.9920) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][760/1251] eta 0:01:59 lr 0.000997 wd 0.0500 time 0.2488 (0.2444) data time 0.0009 (0.0017) model time 0.2479 (0.2430) loss 3.6573 (3.8268) grad_norm 1.8737 (1.9982) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][770/1251] eta 0:01:57 lr 0.000997 wd 0.0500 time 0.2366 (0.2443) data time 0.0008 (0.0017) model time 0.2359 (0.2429) loss 5.0946 (3.8336) grad_norm 2.2065 (1.9967) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][780/1251] eta 0:01:55 lr 0.000997 wd 0.0500 time 0.2456 (0.2442) data time 0.0008 (0.0017) model time 0.2448 (0.2428) loss 4.6090 (3.8333) grad_norm 1.4107 (1.9963) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][790/1251] eta 0:01:52 lr 0.000997 wd 0.0500 time 0.2435 (0.2442) data time 0.0009 (0.0016) model time 0.2425 (0.2428) loss 4.4227 (3.8310) grad_norm 2.3893 (1.9953) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:51 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][800/1251] eta 0:01:50 lr 0.000997 wd 0.0500 time 0.2367 (0.2442) data time 0.0010 (0.0016) model time 0.2357 (0.2428) loss 3.4647 (3.8271) grad_norm 1.5209 (1.9960) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][810/1251] eta 0:01:47 lr 0.000997 wd 0.0500 time 0.2398 (0.2441) data time 0.0007 (0.0016) model time 0.2391 (0.2427) loss 3.9487 (3.8284) grad_norm 1.5806 (1.9993) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][820/1251] eta 0:01:45 lr 0.000997 wd 0.0500 time 0.2379 (0.2441) data time 0.0011 (0.0016) model time 0.2368 (0.2427) loss 3.4764 (3.8324) grad_norm 2.0459 (2.0031) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][830/1251] eta 0:01:42 lr 0.000997 wd 0.0500 time 0.2501 (0.2441) data time 0.0010 (0.0016) model time 0.2491 (0.2427) loss 2.9441 (3.8282) grad_norm 1.6422 (2.0013) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][840/1251] eta 0:01:40 lr 0.000997 wd 0.0500 time 0.2413 (0.2440) data time 0.0007 (0.0016) model time 0.2406 (0.2427) loss 3.2042 (3.8283) grad_norm 2.5539 (2.0037) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][850/1251] eta 0:01:37 lr 0.000997 wd 0.0500 time 0.2367 (0.2440) data time 0.0007 (0.0016) model time 0.2360 (0.2427) loss 2.9124 (3.8271) grad_norm 1.3037 (2.0044) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][860/1251] eta 0:01:35 lr 0.000997 wd 0.0500 time 0.2467 (0.2440) data time 0.0010 (0.0016) model time 0.2457 (0.2426) loss 3.8218 
(3.8271) grad_norm 1.5277 (2.0000) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][870/1251] eta 0:01:32 lr 0.000997 wd 0.0500 time 0.2378 (0.2440) data time 0.0012 (0.0016) model time 0.2366 (0.2426) loss 4.2872 (3.8276) grad_norm 3.2617 (2.0030) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][880/1251] eta 0:01:30 lr 0.000997 wd 0.0500 time 0.2641 (0.2440) data time 0.0008 (0.0016) model time 0.2632 (0.2426) loss 4.5188 (3.8314) grad_norm 1.7299 (2.0048) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][890/1251] eta 0:01:28 lr 0.000997 wd 0.0500 time 0.2431 (0.2440) data time 0.0008 (0.0016) model time 0.2423 (0.2426) loss 3.6516 (3.8300) grad_norm 1.7087 (2.0001) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][900/1251] eta 0:01:25 lr 0.000997 wd 0.0500 time 0.2470 (0.2440) data time 0.0008 (0.0016) model time 0.2462 (0.2426) loss 4.3133 (3.8330) grad_norm 1.5782 (1.9985) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][910/1251] eta 0:01:23 lr 0.000997 wd 0.0500 time 0.2406 (0.2439) data time 0.0009 (0.0016) model time 0.2398 (0.2426) loss 3.1428 (3.8298) grad_norm 1.8278 (1.9959) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][920/1251] eta 0:01:20 lr 0.000997 wd 0.0500 time 0.2394 (0.2439) data time 0.0010 (0.0016) model time 0.2384 (0.2425) loss 3.5079 (3.8332) grad_norm 2.6254 (1.9968) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][930/1251] eta 0:01:18 lr 0.000997 wd 
0.0500 time 0.2454 (0.2439) data time 0.0010 (0.0016) model time 0.2444 (0.2425) loss 3.9695 (3.8334) grad_norm 2.3865 (1.9935) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][940/1251] eta 0:01:15 lr 0.000997 wd 0.0500 time 0.2355 (0.2439) data time 0.0010 (0.0016) model time 0.2345 (0.2425) loss 4.4876 (3.8346) grad_norm 3.2626 (1.9988) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][950/1251] eta 0:01:13 lr 0.000997 wd 0.0500 time 0.2420 (0.2439) data time 0.0010 (0.0015) model time 0.2410 (0.2425) loss 4.5096 (3.8364) grad_norm 1.6030 (1.9981) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][960/1251] eta 0:01:10 lr 0.000997 wd 0.0500 time 0.2466 (0.2438) data time 0.0008 (0.0015) model time 0.2458 (0.2425) loss 4.9380 (3.8388) grad_norm 1.6121 (1.9966) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][970/1251] eta 0:01:08 lr 0.000997 wd 0.0500 time 0.2467 (0.2438) data time 0.0008 (0.0015) model time 0.2459 (0.2424) loss 3.9139 (3.8420) grad_norm 2.4046 (1.9993) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][980/1251] eta 0:01:06 lr 0.000997 wd 0.0500 time 0.2408 (0.2438) data time 0.0007 (0.0015) model time 0.2401 (0.2424) loss 3.5380 (3.8388) grad_norm 1.7340 (1.9999) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][990/1251] eta 0:01:03 lr 0.000997 wd 0.0500 time 0.2395 (0.2438) data time 0.0007 (0.0015) model time 0.2388 (0.2424) loss 4.6391 (3.8404) grad_norm 2.3225 (1.9983) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:39 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1000/1251] eta 0:01:01 lr 0.000997 wd 0.0500 time 0.2480 (0.2437) data time 0.0009 (0.0015) model time 0.2471 (0.2424) loss 3.5427 (3.8402) grad_norm 2.1629 (1.9980) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1010/1251] eta 0:00:58 lr 0.000997 wd 0.0500 time 0.2377 (0.2437) data time 0.0007 (0.0015) model time 0.2370 (0.2424) loss 3.7733 (3.8417) grad_norm 1.6136 (1.9989) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1020/1251] eta 0:00:56 lr 0.000997 wd 0.0500 time 0.2476 (0.2439) data time 0.0010 (0.0015) model time 0.2466 (0.2426) loss 4.1762 (3.8424) grad_norm 2.2472 (1.9999) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1030/1251] eta 0:00:53 lr 0.000997 wd 0.0500 time 0.2472 (0.2439) data time 0.0008 (0.0015) model time 0.2463 (0.2426) loss 3.2136 (3.8407) grad_norm 2.8261 (2.0008) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1040/1251] eta 0:00:51 lr 0.000997 wd 0.0500 time 0.2503 (0.2439) data time 0.0007 (0.0015) model time 0.2495 (0.2426) loss 4.7994 (3.8390) grad_norm 1.7619 (2.0044) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1050/1251] eta 0:00:49 lr 0.000997 wd 0.0500 time 0.2370 (0.2439) data time 0.0008 (0.0015) model time 0.2363 (0.2426) loss 3.0509 (3.8360) grad_norm 1.8845 (2.0041) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1060/1251] eta 0:00:46 lr 0.000997 wd 0.0500 time 0.2439 (0.2439) data time 0.0008 (0.0015) model time 0.2431 (0.2426) loss 2.8199 
(3.8376) grad_norm 2.0692 (2.0039) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1070/1251] eta 0:00:44 lr 0.000997 wd 0.0500 time 0.2509 (0.2439) data time 0.0009 (0.0015) model time 0.2500 (0.2426) loss 4.5377 (3.8420) grad_norm 1.3463 (2.0012) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1080/1251] eta 0:00:41 lr 0.000997 wd 0.0500 time 0.2383 (0.2439) data time 0.0009 (0.0015) model time 0.2374 (0.2425) loss 4.4547 (3.8423) grad_norm 1.5312 (1.9997) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1090/1251] eta 0:00:39 lr 0.000997 wd 0.0500 time 0.2412 (0.2438) data time 0.0007 (0.0015) model time 0.2405 (0.2425) loss 4.7747 (3.8438) grad_norm 2.3823 (2.0027) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1100/1251] eta 0:00:36 lr 0.000997 wd 0.0500 time 0.2526 (0.2438) data time 0.0011 (0.0015) model time 0.2515 (0.2425) loss 3.4436 (3.8455) grad_norm 2.0594 (2.0048) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1110/1251] eta 0:00:34 lr 0.000997 wd 0.0500 time 0.2330 (0.2438) data time 0.0010 (0.0015) model time 0.2320 (0.2425) loss 4.3110 (3.8448) grad_norm 1.9689 (2.0018) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1120/1251] eta 0:00:31 lr 0.000997 wd 0.0500 time 0.2422 (0.2438) data time 0.0007 (0.0015) model time 0.2415 (0.2425) loss 2.9612 (3.8427) grad_norm 1.9183 (1.9998) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1130/1251] eta 0:00:29 lr 
0.000997 wd 0.0500 time 0.2408 (0.2438) data time 0.0007 (0.0015) model time 0.2401 (0.2425) loss 4.5928 (3.8422) grad_norm 1.8990 (1.9976) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1140/1251] eta 0:00:27 lr 0.000997 wd 0.0500 time 0.2463 (0.2438) data time 0.0007 (0.0015) model time 0.2456 (0.2425) loss 3.8281 (3.8433) grad_norm 1.7447 (1.9989) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1150/1251] eta 0:00:24 lr 0.000997 wd 0.0500 time 0.2440 (0.2438) data time 0.0009 (0.0015) model time 0.2431 (0.2425) loss 4.6747 (3.8453) grad_norm 1.3330 (1.9981) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1160/1251] eta 0:00:22 lr 0.000997 wd 0.0500 time 0.2418 (0.2438) data time 0.0010 (0.0015) model time 0.2408 (0.2425) loss 3.2164 (3.8458) grad_norm 1.3873 (1.9977) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1170/1251] eta 0:00:19 lr 0.000997 wd 0.0500 time 0.2413 (0.2438) data time 0.0007 (0.0015) model time 0.2405 (0.2425) loss 3.6188 (3.8457) grad_norm 1.7774 (1.9947) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1180/1251] eta 0:00:17 lr 0.000997 wd 0.0500 time 0.2402 (0.2438) data time 0.0013 (0.0014) model time 0.2389 (0.2425) loss 3.6568 (3.8458) grad_norm 1.7583 (1.9924) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1190/1251] eta 0:00:14 lr 0.000997 wd 0.0500 time 0.2416 (0.2438) data time 0.0008 (0.0014) model time 0.2409 (0.2425) loss 4.5674 (3.8480) grad_norm 1.5275 (1.9899) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
04:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1200/1251] eta 0:00:12 lr 0.000997 wd 0.0500 time 0.2475 (0.2438) data time 0.0009 (0.0014) model time 0.2466 (0.2425) loss 4.2072 (3.8506) grad_norm 1.5033 (1.9897) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1210/1251] eta 0:00:09 lr 0.000997 wd 0.0500 time 0.2377 (0.2437) data time 0.0009 (0.0014) model time 0.2368 (0.2425) loss 3.8981 (3.8485) grad_norm 2.4961 (1.9907) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1220/1251] eta 0:00:07 lr 0.000997 wd 0.0500 time 0.2334 (0.2437) data time 0.0012 (0.0014) model time 0.2323 (0.2424) loss 3.6532 (3.8479) grad_norm 1.9071 (1.9908) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1230/1251] eta 0:00:05 lr 0.000997 wd 0.0500 time 0.2389 (0.2437) data time 0.0008 (0.0014) model time 0.2381 (0.2424) loss 4.8631 (3.8493) grad_norm 1.5239 (1.9894) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1240/1251] eta 0:00:02 lr 0.000997 wd 0.0500 time 0.2261 (0.2436) data time 0.0007 (0.0014) model time 0.2254 (0.2424) loss 3.4524 (3.8478) grad_norm 2.1123 (1.9900) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [29/300][1250/1251] eta 0:00:00 lr 0.000997 wd 0.0500 time 0.2234 (0.2435) data time 0.0005 (0.0014) model time 0.2229 (0.2422) loss 4.6250 (3.8470) grad_norm 1.4607 (1.9894) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 29 training takes 0:05:04 [2024-08-26 04:44:40 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:44:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:44:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.444 (0.444) Loss 0.6860 (0.6860) Acc@1 86.914 (86.914) Acc@5 96.875 (96.875) Mem 7379MB [2024-08-26 04:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.110) Loss 1.1631 (1.0242) Acc@1 74.316 (77.282) Acc@5 92.383 (94.221) Mem 7379MB [2024-08-26 04:44:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.096) Loss 1.3965 (1.0368) Acc@1 67.773 (76.218) Acc@5 89.551 (94.289) Mem 7379MB [2024-08-26 04:44:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.8008 (1.1859) Acc@1 60.254 (73.078) Acc@5 82.227 (92.096) Mem 7379MB [2024-08-26 04:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.6299 (1.2745) Acc@1 62.793 (71.115) Acc@5 86.426 (90.842) Mem 7379MB [2024-08-26 04:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 70.866 Acc@5 90.748 [2024-08-26 04:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 70.9% [2024-08-26 04:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 70.87% [2024-08-26 04:44:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:44:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.463 (0.463) Loss 0.5542 (0.5542) Acc@1 87.012 (87.012) Acc@5 96.973 (96.973) Mem 7379MB [2024-08-26 04:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.109) Loss 0.9175 (0.9092) Acc@1 78.320 (77.415) Acc@5 94.727 (94.336) Mem 7379MB [2024-08-26 04:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.095) Loss 1.3428 (0.9212) Acc@1 67.188 (76.888) Acc@5 89.160 (94.317) Mem 7379MB [2024-08-26 04:44:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 1.6650 (1.0743) Acc@1 58.887 (73.598) Acc@5 83.398 (92.153) Mem 7379MB [2024-08-26 04:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.084) Loss 1.6396 (1.1674) Acc@1 60.449 (71.706) Acc@5 85.254 (90.958) Mem 7379MB [2024-08-26 04:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 71.464 Acc@5 90.846 [2024-08-26 04:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 71.5% [2024-08-26 04:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 71.46% [2024-08-26 04:44:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:44:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][0/1251] eta 0:14:17 lr 0.000997 wd 0.0500 time 0.6856 (0.6856) data time 0.4607 (0.4607) model time 0.0000 (0.0000) loss 3.1162 (3.1162) grad_norm 1.7678 (1.7678) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][10/1251] eta 0:05:50 lr 0.000997 wd 0.0500 time 0.2442 (0.2826) data time 0.0011 (0.0428) model time 0.0000 (0.0000) loss 4.4690 (4.2443) grad_norm 2.1493 (2.3633) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][20/1251] eta 0:05:35 lr 0.000997 wd 0.0500 time 0.2443 (0.2724) data time 0.0008 (0.0229) model time 0.0000 (0.0000) loss 4.3644 (4.0218) grad_norm 1.5079 (2.1586) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][30/1251] eta 0:05:30 lr 0.000997 wd 0.0500 time 0.2450 (0.2704) data time 0.0011 (0.0159) model time 0.0000 (0.0000) loss 3.6394 (4.0111) grad_norm 1.6389 (2.0207) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][40/1251] eta 0:05:31 lr 0.000997 wd 0.0500 time 0.2403 (0.2734) data time 0.0008 (0.0122) model time 0.0000 (0.0000) loss 2.7181 (3.8698) grad_norm 1.6876 (2.0065) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][50/1251] eta 0:05:25 lr 0.000997 wd 0.0500 time 0.2346 (0.2711) data time 0.0009 (0.0101) model time 0.0000 (0.0000) loss 4.1942 (3.8602) grad_norm 1.7739 (1.9671) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][60/1251] eta 0:05:16 lr 0.000997 wd 0.0500 time 0.2402 (0.2660) data time 0.0010 (0.0086) model time 0.2392 (0.2392) loss 
4.6711 (3.8609) grad_norm 1.4939 (1.9452) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][70/1251] eta 0:05:09 lr 0.000997 wd 0.0500 time 0.2416 (0.2624) data time 0.0010 (0.0075) model time 0.2406 (0.2390) loss 3.6007 (3.8255) grad_norm 1.7019 (1.9148) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][80/1251] eta 0:05:03 lr 0.000997 wd 0.0500 time 0.2402 (0.2595) data time 0.0008 (0.0067) model time 0.2394 (0.2389) loss 2.9957 (3.8210) grad_norm 3.0152 (1.9102) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][90/1251] eta 0:04:59 lr 0.000997 wd 0.0500 time 0.2380 (0.2577) data time 0.0015 (0.0061) model time 0.2364 (0.2396) loss 3.9820 (3.8290) grad_norm 3.1826 (1.9292) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][100/1251] eta 0:04:54 lr 0.000997 wd 0.0500 time 0.2344 (0.2559) data time 0.0009 (0.0056) model time 0.2336 (0.2392) loss 3.1209 (3.7887) grad_norm 1.5858 (1.9348) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][110/1251] eta 0:04:50 lr 0.000997 wd 0.0500 time 0.2388 (0.2545) data time 0.0011 (0.0052) model time 0.2377 (0.2394) loss 3.6692 (3.7526) grad_norm 1.9583 (1.9457) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][120/1251] eta 0:04:46 lr 0.000997 wd 0.0500 time 0.2467 (0.2533) data time 0.0007 (0.0049) model time 0.2460 (0.2393) loss 4.3601 (3.7557) grad_norm 1.3445 (1.9797) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][130/1251] eta 0:04:43 lr 
0.000997 wd 0.0500 time 0.2400 (0.2526) data time 0.0009 (0.0046) model time 0.2391 (0.2397) loss 2.7521 (3.7532) grad_norm 1.7497 (2.0111) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][140/1251] eta 0:04:39 lr 0.000997 wd 0.0500 time 0.2356 (0.2518) data time 0.0011 (0.0043) model time 0.2345 (0.2399) loss 3.3931 (3.7651) grad_norm 2.5262 (2.0077) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][150/1251] eta 0:04:36 lr 0.000997 wd 0.0500 time 0.2538 (0.2512) data time 0.0010 (0.0041) model time 0.2529 (0.2400) loss 3.4802 (3.7809) grad_norm 2.3362 (2.0331) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][160/1251] eta 0:04:33 lr 0.000997 wd 0.0500 time 0.2497 (0.2507) data time 0.0007 (0.0039) model time 0.2490 (0.2401) loss 3.2534 (3.7782) grad_norm 2.4676 (2.0157) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][170/1251] eta 0:04:30 lr 0.000997 wd 0.0500 time 0.2415 (0.2500) data time 0.0009 (0.0038) model time 0.2405 (0.2400) loss 3.6575 (3.8013) grad_norm 1.7828 (2.0032) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][180/1251] eta 0:04:28 lr 0.000997 wd 0.0500 time 0.2403 (0.2507) data time 0.0009 (0.0036) model time 0.2394 (0.2417) loss 4.6534 (3.8207) grad_norm 1.6904 (1.9931) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][190/1251] eta 0:04:25 lr 0.000997 wd 0.0500 time 0.2429 (0.2502) data time 0.0007 (0.0035) model time 0.2422 (0.2416) loss 2.7470 (3.8107) grad_norm 1.8731 (1.9935) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][200/1251] eta 0:04:22 lr 0.000997 wd 0.0500 time 0.2490 (0.2500) data time 0.0010 (0.0033) model time 0.2480 (0.2417) loss 2.3827 (3.7823) grad_norm 1.2184 (2.0029) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][210/1251] eta 0:04:19 lr 0.000997 wd 0.0500 time 0.2457 (0.2497) data time 0.0009 (0.0032) model time 0.2447 (0.2418) loss 4.8072 (3.7784) grad_norm 2.0692 (2.0287) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][220/1251] eta 0:04:17 lr 0.000997 wd 0.0500 time 0.2466 (0.2493) data time 0.0009 (0.0031) model time 0.2458 (0.2417) loss 3.2393 (3.7722) grad_norm 1.7794 (2.0157) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][230/1251] eta 0:04:14 lr 0.000997 wd 0.0500 time 0.2427 (0.2490) data time 0.0012 (0.0030) model time 0.2415 (0.2417) loss 3.3989 (3.7919) grad_norm 1.4397 (2.0044) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][240/1251] eta 0:04:11 lr 0.000997 wd 0.0500 time 0.2353 (0.2487) data time 0.0011 (0.0030) model time 0.2342 (0.2417) loss 4.1592 (3.8077) grad_norm 1.7360 (2.0130) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][250/1251] eta 0:04:08 lr 0.000997 wd 0.0500 time 0.2460 (0.2485) data time 0.0008 (0.0029) model time 0.2452 (0.2417) loss 3.9789 (3.8099) grad_norm 1.4830 (2.0077) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][260/1251] eta 0:04:05 lr 0.000997 wd 0.0500 time 0.2403 (0.2482) data time 0.0013 (0.0028) model time 0.2391 (0.2416) loss 4.6480 
(3.8088) grad_norm 2.3045 (2.0051) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][270/1251] eta 0:04:03 lr 0.000997 wd 0.0500 time 0.2419 (0.2480) data time 0.0010 (0.0027) model time 0.2410 (0.2416) loss 3.1773 (3.8087) grad_norm 2.4379 (2.0042) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][280/1251] eta 0:04:00 lr 0.000997 wd 0.0500 time 0.2451 (0.2477) data time 0.0008 (0.0027) model time 0.2444 (0.2415) loss 2.7959 (3.8081) grad_norm 1.8551 (1.9938) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][290/1251] eta 0:03:57 lr 0.000997 wd 0.0500 time 0.2404 (0.2475) data time 0.0012 (0.0026) model time 0.2393 (0.2414) loss 4.0734 (3.8126) grad_norm 1.9145 (1.9910) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][300/1251] eta 0:03:55 lr 0.000997 wd 0.0500 time 0.2435 (0.2472) data time 0.0007 (0.0026) model time 0.2428 (0.2413) loss 4.1029 (3.8114) grad_norm 2.3862 (1.9864) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][310/1251] eta 0:03:52 lr 0.000997 wd 0.0500 time 0.2427 (0.2470) data time 0.0012 (0.0025) model time 0.2416 (0.2412) loss 4.2664 (3.8131) grad_norm 2.0559 (1.9848) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][320/1251] eta 0:03:49 lr 0.000997 wd 0.0500 time 0.2395 (0.2469) data time 0.0007 (0.0025) model time 0.2388 (0.2413) loss 2.8388 (3.8042) grad_norm 2.2418 (1.9739) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][330/1251] eta 0:03:47 lr 0.000997 wd 
0.0500 time 0.2481 (0.2467) data time 0.0009 (0.0025) model time 0.2472 (0.2412) loss 2.8706 (3.8027) grad_norm 1.3937 (1.9688) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][340/1251] eta 0:03:44 lr 0.000997 wd 0.0500 time 0.2387 (0.2466) data time 0.0008 (0.0024) model time 0.2379 (0.2412) loss 4.1467 (3.8027) grad_norm 1.4463 (1.9554) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][350/1251] eta 0:03:42 lr 0.000997 wd 0.0500 time 0.2457 (0.2464) data time 0.0009 (0.0024) model time 0.2449 (0.2412) loss 4.6386 (3.8006) grad_norm 1.6671 (1.9532) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][360/1251] eta 0:03:39 lr 0.000997 wd 0.0500 time 0.2365 (0.2463) data time 0.0011 (0.0024) model time 0.2354 (0.2412) loss 3.8719 (3.7954) grad_norm 2.2710 (1.9694) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][370/1251] eta 0:03:36 lr 0.000997 wd 0.0500 time 0.2415 (0.2463) data time 0.0009 (0.0023) model time 0.2406 (0.2412) loss 3.3546 (3.7914) grad_norm 1.7750 (1.9725) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][380/1251] eta 0:03:34 lr 0.000997 wd 0.0500 time 0.2331 (0.2461) data time 0.0012 (0.0023) model time 0.2320 (0.2412) loss 4.2290 (3.7943) grad_norm 1.7736 (1.9657) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][390/1251] eta 0:03:31 lr 0.000997 wd 0.0500 time 0.2427 (0.2460) data time 0.0007 (0.0022) model time 0.2420 (0.2412) loss 3.4450 (3.7881) grad_norm 1.4905 (1.9688) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:28 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][400/1251] eta 0:03:29 lr 0.000997 wd 0.0500 time 0.2448 (0.2459) data time 0.0008 (0.0022) model time 0.2440 (0.2412) loss 2.9675 (3.7886) grad_norm 1.4813 (1.9743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][410/1251] eta 0:03:26 lr 0.000997 wd 0.0500 time 0.2430 (0.2458) data time 0.0008 (0.0022) model time 0.2422 (0.2412) loss 3.8140 (3.7871) grad_norm 2.0476 (1.9773) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][420/1251] eta 0:03:24 lr 0.000997 wd 0.0500 time 0.2366 (0.2458) data time 0.0010 (0.0022) model time 0.2357 (0.2412) loss 4.0031 (3.7871) grad_norm 1.6773 (1.9751) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][430/1251] eta 0:03:21 lr 0.000997 wd 0.0500 time 0.2532 (0.2457) data time 0.0007 (0.0021) model time 0.2525 (0.2413) loss 3.4831 (3.7903) grad_norm 1.5401 (1.9699) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][440/1251] eta 0:03:19 lr 0.000997 wd 0.0500 time 0.2512 (0.2457) data time 0.0009 (0.0021) model time 0.2503 (0.2413) loss 2.9201 (3.7796) grad_norm 1.4894 (1.9664) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][450/1251] eta 0:03:16 lr 0.000997 wd 0.0500 time 0.2373 (0.2456) data time 0.0009 (0.0021) model time 0.2364 (0.2413) loss 2.8388 (3.7733) grad_norm 2.8441 (1.9742) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][460/1251] eta 0:03:14 lr 0.000997 wd 0.0500 time 0.2392 (0.2456) data time 0.0009 (0.0021) model time 0.2383 (0.2413) loss 4.1637 
(3.7713) grad_norm 2.3732 (1.9790) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][470/1251] eta 0:03:11 lr 0.000997 wd 0.0500 time 0.2542 (0.2455) data time 0.0010 (0.0020) model time 0.2532 (0.2413) loss 4.4635 (3.7746) grad_norm 1.6513 (1.9782) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][480/1251] eta 0:03:09 lr 0.000997 wd 0.0500 time 0.2440 (0.2454) data time 0.0008 (0.0020) model time 0.2432 (0.2412) loss 4.1829 (3.7796) grad_norm 3.1457 (1.9813) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][490/1251] eta 0:03:06 lr 0.000997 wd 0.0500 time 0.2417 (0.2452) data time 0.0007 (0.0020) model time 0.2410 (0.2412) loss 4.0375 (3.7743) grad_norm 2.2224 (1.9790) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][500/1251] eta 0:03:04 lr 0.000997 wd 0.0500 time 0.2417 (0.2451) data time 0.0007 (0.0020) model time 0.2410 (0.2411) loss 4.4285 (3.7714) grad_norm 1.5693 (1.9782) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][510/1251] eta 0:03:01 lr 0.000997 wd 0.0500 time 0.2614 (0.2451) data time 0.0009 (0.0020) model time 0.2605 (0.2412) loss 3.9402 (3.7747) grad_norm 1.4702 (1.9755) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][520/1251] eta 0:02:59 lr 0.000997 wd 0.0500 time 0.2384 (0.2451) data time 0.0011 (0.0019) model time 0.2373 (0.2412) loss 3.5539 (3.7765) grad_norm 3.0100 (1.9789) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][530/1251] eta 0:02:56 lr 0.000997 wd 
0.0500 time 0.2375 (0.2450) data time 0.0011 (0.0019) model time 0.2364 (0.2412) loss 3.1885 (3.7726) grad_norm 3.4096 (1.9818) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][540/1251] eta 0:02:54 lr 0.000997 wd 0.0500 time 0.2359 (0.2449) data time 0.0011 (0.0019) model time 0.2348 (0.2411) loss 3.8655 (3.7760) grad_norm 1.3438 (1.9855) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][550/1251] eta 0:02:51 lr 0.000997 wd 0.0500 time 0.2455 (0.2448) data time 0.0008 (0.0019) model time 0.2447 (0.2411) loss 3.3279 (3.7793) grad_norm 2.4191 (1.9855) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][560/1251] eta 0:02:49 lr 0.000997 wd 0.0500 time 0.2406 (0.2452) data time 0.0011 (0.0019) model time 0.2395 (0.2415) loss 2.9847 (3.7853) grad_norm 2.5821 (1.9898) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 04:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][570/1251] eta 0:02:47 lr 0.000997 wd 0.0500 time 0.2392 (0.2459) data time 0.0007 (0.0019) model time 0.2385 (0.2424) loss 3.0534 (3.7877) grad_norm 1.4756 (1.9933) loss_scale 16384.0000 (8306.7741) mem 7379MB [2024-08-26 04:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][580/1251] eta 0:02:45 lr 0.000997 wd 0.0500 time 0.2413 (0.2465) data time 0.0008 (0.0018) model time 0.2405 (0.2431) loss 2.6057 (3.7899) grad_norm 2.0973 (1.9965) loss_scale 16384.0000 (8445.7969) mem 7379MB [2024-08-26 04:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][590/1251] eta 0:02:42 lr 0.000997 wd 0.0500 time 0.2383 (0.2464) data time 0.0008 (0.0018) model time 0.2374 (0.2430) loss 3.6083 (3.7919) grad_norm 1.5743 (1.9976) loss_scale 16384.0000 (8580.1151) mem 7379MB [2024-08-26 04:47:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][600/1251] eta 0:02:40 lr 0.000997 wd 0.0500 time 0.2412 (0.2463) data time 0.0010 (0.0018) model time 0.2402 (0.2430) loss 4.3181 (3.7946) grad_norm 2.2011 (1.9965) loss_scale 16384.0000 (8709.9634) mem 7379MB [2024-08-26 04:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][610/1251] eta 0:02:37 lr 0.000997 wd 0.0500 time 0.2412 (0.2462) data time 0.0009 (0.0018) model time 0.2403 (0.2429) loss 3.6475 (3.7901) grad_norm 1.4131 (1.9970) loss_scale 16384.0000 (8835.5614) mem 7379MB [2024-08-26 04:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][620/1251] eta 0:02:35 lr 0.000997 wd 0.0500 time 0.2410 (0.2462) data time 0.0009 (0.0018) model time 0.2401 (0.2429) loss 4.3199 (3.7873) grad_norm 2.2723 (1.9960) loss_scale 16384.0000 (8957.1143) mem 7379MB [2024-08-26 04:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][630/1251] eta 0:02:32 lr 0.000997 wd 0.0500 time 0.2525 (0.2461) data time 0.0010 (0.0018) model time 0.2515 (0.2429) loss 2.9286 (3.7814) grad_norm 2.2325 (1.9979) loss_scale 16384.0000 (9074.8146) mem 7379MB [2024-08-26 04:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][640/1251] eta 0:02:30 lr 0.000997 wd 0.0500 time 0.2472 (0.2461) data time 0.0010 (0.0018) model time 0.2462 (0.2428) loss 3.9436 (3.7882) grad_norm 2.8880 (1.9976) loss_scale 16384.0000 (9188.8424) mem 7379MB [2024-08-26 04:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][650/1251] eta 0:02:27 lr 0.000997 wd 0.0500 time 0.2297 (0.2460) data time 0.0010 (0.0018) model time 0.2287 (0.2428) loss 4.5421 (3.7900) grad_norm 1.5435 (1.9927) loss_scale 16384.0000 (9299.3671) mem 7379MB [2024-08-26 04:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][660/1251] eta 0:02:25 lr 0.000997 wd 0.0500 time 0.2377 (0.2459) data time 0.0010 (0.0017) model time 0.2367 (0.2428) loss 4.0559 
(3.7905) grad_norm 1.9861 (1.9880) loss_scale 16384.0000 (9406.5477) mem 7379MB [2024-08-26 04:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][670/1251] eta 0:02:22 lr 0.000997 wd 0.0500 time 0.2458 (0.2459) data time 0.0009 (0.0017) model time 0.2449 (0.2428) loss 3.9525 (3.7945) grad_norm 1.1925 (1.9873) loss_scale 16384.0000 (9510.5335) mem 7379MB [2024-08-26 04:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][680/1251] eta 0:02:20 lr 0.000997 wd 0.0500 time 0.2520 (0.2459) data time 0.0008 (0.0017) model time 0.2512 (0.2428) loss 3.5346 (3.7929) grad_norm 2.5135 (1.9861) loss_scale 16384.0000 (9611.4655) mem 7379MB [2024-08-26 04:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][690/1251] eta 0:02:17 lr 0.000997 wd 0.0500 time 0.2398 (0.2458) data time 0.0009 (0.0017) model time 0.2389 (0.2428) loss 3.9870 (3.7954) grad_norm 1.6313 (1.9828) loss_scale 16384.0000 (9709.4761) mem 7379MB [2024-08-26 04:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][700/1251] eta 0:02:15 lr 0.000997 wd 0.0500 time 0.2412 (0.2460) data time 0.0011 (0.0017) model time 0.2401 (0.2430) loss 4.0544 (3.7937) grad_norm 2.3881 (1.9823) loss_scale 16384.0000 (9804.6904) mem 7379MB [2024-08-26 04:47:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][710/1251] eta 0:02:13 lr 0.000997 wd 0.0500 time 0.2497 (0.2459) data time 0.0007 (0.0017) model time 0.2490 (0.2430) loss 3.7571 (3.7955) grad_norm 1.7014 (1.9830) loss_scale 16384.0000 (9897.2264) mem 7379MB [2024-08-26 04:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][720/1251] eta 0:02:10 lr 0.000997 wd 0.0500 time 0.2483 (0.2459) data time 0.0007 (0.0017) model time 0.2476 (0.2429) loss 3.3226 (3.7930) grad_norm 2.3126 (1.9826) loss_scale 16384.0000 (9987.1956) mem 7379MB [2024-08-26 04:47:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][730/1251] eta 0:02:08 lr 
0.000997 wd 0.0500 time 0.2384 (0.2458) data time 0.0009 (0.0017) model time 0.2375 (0.2429) loss 3.0199 (3.7921) grad_norm 3.4805 (1.9845) loss_scale 16384.0000 (10074.7031) mem 7379MB [2024-08-26 04:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][740/1251] eta 0:02:05 lr 0.000997 wd 0.0500 time 0.2430 (0.2458) data time 0.0008 (0.0017) model time 0.2421 (0.2429) loss 3.5930 (3.7921) grad_norm 2.2166 (1.9877) loss_scale 16384.0000 (10159.8489) mem 7379MB [2024-08-26 04:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][750/1251] eta 0:02:03 lr 0.000997 wd 0.0500 time 0.2448 (0.2457) data time 0.0008 (0.0017) model time 0.2441 (0.2428) loss 4.4368 (3.7887) grad_norm 2.5512 (1.9877) loss_scale 16384.0000 (10242.7270) mem 7379MB [2024-08-26 04:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][760/1251] eta 0:02:00 lr 0.000996 wd 0.0500 time 0.2304 (0.2456) data time 0.0009 (0.0016) model time 0.2294 (0.2428) loss 4.4020 (3.7872) grad_norm 2.3021 (1.9902) loss_scale 16384.0000 (10323.4271) mem 7379MB [2024-08-26 04:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][770/1251] eta 0:01:58 lr 0.000996 wd 0.0500 time 0.2412 (0.2456) data time 0.0011 (0.0016) model time 0.2402 (0.2427) loss 3.9601 (3.7877) grad_norm 1.5293 (1.9903) loss_scale 16384.0000 (10402.0337) mem 7379MB [2024-08-26 04:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][780/1251] eta 0:01:55 lr 0.000996 wd 0.0500 time 0.2388 (0.2455) data time 0.0010 (0.0016) model time 0.2378 (0.2427) loss 4.1303 (3.7885) grad_norm 2.0419 (1.9887) loss_scale 16384.0000 (10478.6274) mem 7379MB [2024-08-26 04:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][790/1251] eta 0:01:53 lr 0.000996 wd 0.0500 time 0.2323 (0.2455) data time 0.0009 (0.0016) model time 0.2314 (0.2426) loss 4.6187 (3.7934) grad_norm 1.3262 (1.9895) loss_scale 16384.0000 (10553.2845) mem 7379MB 
[2024-08-26 04:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][800/1251] eta 0:01:50 lr 0.000996 wd 0.0500 time 0.2448 (0.2454) data time 0.0007 (0.0016) model time 0.2441 (0.2426) loss 4.6712 (3.7971) grad_norm 1.3798 (1.9878) loss_scale 16384.0000 (10626.0774) mem 7379MB [2024-08-26 04:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][810/1251] eta 0:01:48 lr 0.000996 wd 0.0500 time 0.2404 (0.2454) data time 0.0008 (0.0016) model time 0.2396 (0.2426) loss 3.1542 (3.7980) grad_norm 1.2884 (1.9827) loss_scale 16384.0000 (10697.0752) mem 7379MB [2024-08-26 04:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][820/1251] eta 0:01:45 lr 0.000996 wd 0.0500 time 0.2405 (0.2453) data time 0.0009 (0.0016) model time 0.2396 (0.2425) loss 4.3022 (3.8012) grad_norm 2.6813 (1.9859) loss_scale 16384.0000 (10766.3435) mem 7379MB [2024-08-26 04:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][830/1251] eta 0:01:43 lr 0.000996 wd 0.0500 time 0.2367 (0.2453) data time 0.0011 (0.0016) model time 0.2356 (0.2425) loss 4.3522 (3.8033) grad_norm 5.4872 (1.9942) loss_scale 16384.0000 (10833.9446) mem 7379MB [2024-08-26 04:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][840/1251] eta 0:01:40 lr 0.000996 wd 0.0500 time 0.2368 (0.2452) data time 0.0007 (0.0016) model time 0.2361 (0.2424) loss 3.3674 (3.8046) grad_norm 1.7276 (1.9943) loss_scale 16384.0000 (10899.9382) mem 7379MB [2024-08-26 04:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][850/1251] eta 0:01:38 lr 0.000996 wd 0.0500 time 0.2351 (0.2451) data time 0.0009 (0.0016) model time 0.2342 (0.2424) loss 2.9784 (3.8048) grad_norm 2.8206 (1.9946) loss_scale 16384.0000 (10964.3807) mem 7379MB [2024-08-26 04:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][860/1251] eta 0:01:35 lr 0.000996 wd 0.0500 time 0.2427 (0.2451) data time 0.0010 (0.0016) model time 
0.2417 (0.2424) loss 3.8393 (3.8065) grad_norm 1.5134 (1.9920) loss_scale 16384.0000 (11027.3264) mem 7379MB [2024-08-26 04:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][870/1251] eta 0:01:33 lr 0.000996 wd 0.0500 time 0.2439 (0.2450) data time 0.0007 (0.0016) model time 0.2432 (0.2423) loss 3.0182 (3.8088) grad_norm 1.4209 (1.9890) loss_scale 16384.0000 (11088.8266) mem 7379MB [2024-08-26 04:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][880/1251] eta 0:01:30 lr 0.000996 wd 0.0500 time 0.2262 (0.2449) data time 0.0010 (0.0016) model time 0.2252 (0.2423) loss 3.9981 (3.8132) grad_norm 2.0951 (1.9871) loss_scale 16384.0000 (11148.9308) mem 7379MB [2024-08-26 04:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][890/1251] eta 0:01:28 lr 0.000996 wd 0.0500 time 0.2424 (0.2449) data time 0.0012 (0.0016) model time 0.2412 (0.2422) loss 4.2571 (3.8145) grad_norm 2.9940 (1.9880) loss_scale 16384.0000 (11207.6857) mem 7379MB [2024-08-26 04:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][900/1251] eta 0:01:25 lr 0.000996 wd 0.0500 time 0.2394 (0.2449) data time 0.0010 (0.0016) model time 0.2384 (0.2422) loss 4.1553 (3.8165) grad_norm 1.7481 (1.9883) loss_scale 16384.0000 (11265.1365) mem 7379MB [2024-08-26 04:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][910/1251] eta 0:01:23 lr 0.000996 wd 0.0500 time 0.2501 (0.2448) data time 0.0010 (0.0016) model time 0.2491 (0.2422) loss 3.8566 (3.8135) grad_norm 1.8942 (1.9854) loss_scale 16384.0000 (11321.3260) mem 7379MB [2024-08-26 04:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][920/1251] eta 0:01:21 lr 0.000996 wd 0.0500 time 0.2395 (0.2447) data time 0.0009 (0.0016) model time 0.2386 (0.2421) loss 3.3120 (3.8128) grad_norm 2.0460 (1.9834) loss_scale 16384.0000 (11376.2953) mem 7379MB [2024-08-26 04:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[30/300][930/1251] eta 0:01:18 lr 0.000996 wd 0.0500 time 0.2422 (0.2447) data time 0.0009 (0.0016) model time 0.2413 (0.2421) loss 4.6038 (3.8149) grad_norm 2.2544 (1.9812) loss_scale 16384.0000 (11430.0838) mem 7379MB [2024-08-26 04:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][940/1251] eta 0:01:16 lr 0.000996 wd 0.0500 time 0.2421 (0.2447) data time 0.0010 (0.0015) model time 0.2411 (0.2421) loss 4.2473 (3.8173) grad_norm 1.3273 (1.9794) loss_scale 16384.0000 (11482.7290) mem 7379MB [2024-08-26 04:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][950/1251] eta 0:01:13 lr 0.000996 wd 0.0500 time 0.2384 (0.2447) data time 0.0011 (0.0015) model time 0.2373 (0.2421) loss 3.6971 (3.8192) grad_norm 1.7863 (1.9798) loss_scale 16384.0000 (11534.2671) mem 7379MB [2024-08-26 04:48:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][960/1251] eta 0:01:11 lr 0.000996 wd 0.0500 time 0.2427 (0.2446) data time 0.0010 (0.0015) model time 0.2417 (0.2421) loss 4.0818 (3.8208) grad_norm 1.5842 (1.9794) loss_scale 16384.0000 (11584.7326) mem 7379MB [2024-08-26 04:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][970/1251] eta 0:01:08 lr 0.000996 wd 0.0500 time 0.2470 (0.2446) data time 0.0009 (0.0015) model time 0.2461 (0.2421) loss 3.8077 (3.8168) grad_norm 2.1408 (1.9810) loss_scale 16384.0000 (11634.1586) mem 7379MB [2024-08-26 04:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][980/1251] eta 0:01:06 lr 0.000996 wd 0.0500 time 0.2435 (0.2446) data time 0.0007 (0.0015) model time 0.2427 (0.2421) loss 4.7682 (3.8172) grad_norm 1.5630 (1.9793) loss_scale 16384.0000 (11682.5770) mem 7379MB [2024-08-26 04:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][990/1251] eta 0:01:03 lr 0.000996 wd 0.0500 time 0.2381 (0.2445) data time 0.0010 (0.0015) model time 0.2371 (0.2420) loss 4.4438 (3.8209) grad_norm 2.3468 (1.9778) loss_scale 
16384.0000 (11730.0182) mem 7379MB [2024-08-26 04:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1000/1251] eta 0:01:01 lr 0.000996 wd 0.0500 time 0.2384 (0.2445) data time 0.0009 (0.0015) model time 0.2375 (0.2420) loss 3.5690 (3.8212) grad_norm 1.7213 (1.9764) loss_scale 16384.0000 (11776.5115) mem 7379MB [2024-08-26 04:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1010/1251] eta 0:00:58 lr 0.000996 wd 0.0500 time 0.2411 (0.2445) data time 0.0007 (0.0015) model time 0.2404 (0.2420) loss 3.3228 (3.8167) grad_norm 1.5048 (1.9739) loss_scale 16384.0000 (11822.0851) mem 7379MB [2024-08-26 04:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1020/1251] eta 0:00:56 lr 0.000996 wd 0.0500 time 0.2475 (0.2445) data time 0.0010 (0.0015) model time 0.2465 (0.2420) loss 4.4144 (3.8166) grad_norm 1.7939 (1.9726) loss_scale 16384.0000 (11866.7659) mem 7379MB [2024-08-26 04:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1030/1251] eta 0:00:54 lr 0.000996 wd 0.0500 time 0.2355 (0.2444) data time 0.0010 (0.0015) model time 0.2345 (0.2420) loss 2.6588 (3.8137) grad_norm 1.4170 (1.9776) loss_scale 16384.0000 (11910.5800) mem 7379MB [2024-08-26 04:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1040/1251] eta 0:00:51 lr 0.000996 wd 0.0500 time 0.2368 (0.2444) data time 0.0012 (0.0015) model time 0.2356 (0.2419) loss 3.3565 (3.8172) grad_norm 1.5031 (1.9805) loss_scale 16384.0000 (11953.5524) mem 7379MB [2024-08-26 04:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1050/1251] eta 0:00:49 lr 0.000996 wd 0.0500 time 0.2503 (0.2443) data time 0.0009 (0.0015) model time 0.2494 (0.2419) loss 2.8468 (3.8173) grad_norm 2.9545 (1.9791) loss_scale 16384.0000 (11995.7069) mem 7379MB [2024-08-26 04:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1060/1251] eta 0:00:46 lr 0.000996 wd 0.0500 time 0.2446 
(0.2443) data time 0.0007 (0.0015) model time 0.2439 (0.2419) loss 4.6869 (3.8177) grad_norm 1.8304 (1.9786) loss_scale 16384.0000 (12037.0669) mem 7379MB [2024-08-26 04:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1070/1251] eta 0:00:44 lr 0.000996 wd 0.0500 time 0.2429 (0.2443) data time 0.0007 (0.0015) model time 0.2421 (0.2419) loss 4.6047 (3.8178) grad_norm 1.8335 (1.9760) loss_scale 16384.0000 (12077.6545) mem 7379MB [2024-08-26 04:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1080/1251] eta 0:00:41 lr 0.000996 wd 0.0500 time 0.4298 (0.2446) data time 0.0011 (0.0015) model time 0.4287 (0.2423) loss 3.4688 (3.8164) grad_norm 1.8274 (1.9742) loss_scale 16384.0000 (12117.4912) mem 7379MB [2024-08-26 04:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1090/1251] eta 0:00:39 lr 0.000996 wd 0.0500 time 0.2468 (0.2448) data time 0.0007 (0.0015) model time 0.2461 (0.2424) loss 3.1146 (3.8170) grad_norm 1.8772 (1.9740) loss_scale 16384.0000 (12156.5976) mem 7379MB [2024-08-26 04:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1100/1251] eta 0:00:37 lr 0.000996 wd 0.0500 time 0.2390 (0.2450) data time 0.0010 (0.0015) model time 0.2380 (0.2427) loss 4.2468 (3.8178) grad_norm 2.6132 (1.9739) loss_scale 16384.0000 (12194.9936) mem 7379MB [2024-08-26 04:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1110/1251] eta 0:00:34 lr 0.000996 wd 0.0500 time 0.2379 (0.2450) data time 0.0011 (0.0015) model time 0.2368 (0.2427) loss 3.8263 (3.8160) grad_norm 1.9501 (1.9740) loss_scale 16384.0000 (12232.6985) mem 7379MB [2024-08-26 04:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1120/1251] eta 0:00:32 lr 0.000996 wd 0.0500 time 0.2426 (0.2450) data time 0.0007 (0.0015) model time 0.2419 (0.2427) loss 4.0240 (3.8165) grad_norm 2.9077 (1.9736) loss_scale 16384.0000 (12269.7306) mem 7379MB [2024-08-26 04:49:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1130/1251] eta 0:00:29 lr 0.000996 wd 0.0500 time 0.2381 (0.2449) data time 0.0010 (0.0015) model time 0.2371 (0.2426) loss 3.3191 (3.8173) grad_norm 1.6255 (1.9734) loss_scale 16384.0000 (12306.1079) mem 7379MB [2024-08-26 04:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1140/1251] eta 0:00:27 lr 0.000996 wd 0.0500 time 0.2439 (0.2449) data time 0.0008 (0.0015) model time 0.2431 (0.2426) loss 3.3030 (3.8182) grad_norm 1.1965 (1.9737) loss_scale 16384.0000 (12341.8475) mem 7379MB [2024-08-26 04:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1150/1251] eta 0:00:24 lr 0.000996 wd 0.0500 time 0.2327 (0.2449) data time 0.0008 (0.0014) model time 0.2319 (0.2426) loss 2.9334 (3.8173) grad_norm 1.7302 (1.9737) loss_scale 16384.0000 (12376.9661) mem 7379MB [2024-08-26 04:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1160/1251] eta 0:00:22 lr 0.000996 wd 0.0500 time 0.2429 (0.2449) data time 0.0009 (0.0014) model time 0.2421 (0.2426) loss 3.3957 (3.8175) grad_norm 2.2282 (1.9745) loss_scale 16384.0000 (12411.4798) mem 7379MB [2024-08-26 04:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1170/1251] eta 0:00:19 lr 0.000996 wd 0.0500 time 0.2461 (0.2449) data time 0.0009 (0.0014) model time 0.2452 (0.2426) loss 4.3528 (3.8199) grad_norm 1.9438 (1.9747) loss_scale 16384.0000 (12445.4039) mem 7379MB [2024-08-26 04:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1180/1251] eta 0:00:17 lr 0.000996 wd 0.0500 time 0.2493 (0.2448) data time 0.0010 (0.0014) model time 0.2483 (0.2426) loss 3.4938 (3.8205) grad_norm 1.6224 (1.9727) loss_scale 16384.0000 (12478.7536) mem 7379MB [2024-08-26 04:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1190/1251] eta 0:00:14 lr 0.000996 wd 0.0500 time 0.2466 (0.2448) data time 0.0007 (0.0014) model time 0.2460 (0.2426) 
loss 3.8341 (3.8214) grad_norm 1.4283 (1.9703) loss_scale 16384.0000 (12511.5432) mem 7379MB [2024-08-26 04:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1200/1251] eta 0:00:12 lr 0.000996 wd 0.0500 time 0.2411 (0.2448) data time 0.0009 (0.0014) model time 0.2402 (0.2426) loss 3.1560 (3.8238) grad_norm 2.0865 (1.9706) loss_scale 16384.0000 (12543.7868) mem 7379MB [2024-08-26 04:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1210/1251] eta 0:00:10 lr 0.000996 wd 0.0500 time 0.2386 (0.2448) data time 0.0009 (0.0014) model time 0.2377 (0.2425) loss 2.7502 (3.8247) grad_norm 2.4051 (1.9740) loss_scale 16384.0000 (12575.4979) mem 7379MB [2024-08-26 04:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1220/1251] eta 0:00:07 lr 0.000996 wd 0.0500 time 0.2329 (0.2449) data time 0.0010 (0.0014) model time 0.2318 (0.2427) loss 3.9299 (3.8256) grad_norm 1.6194 (1.9781) loss_scale 16384.0000 (12606.6896) mem 7379MB [2024-08-26 04:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1230/1251] eta 0:00:05 lr 0.000996 wd 0.0500 time 0.2513 (0.2449) data time 0.0007 (0.0014) model time 0.2506 (0.2427) loss 4.5521 (3.8282) grad_norm 1.2335 (1.9780) loss_scale 16384.0000 (12637.3745) mem 7379MB [2024-08-26 04:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1240/1251] eta 0:00:02 lr 0.000996 wd 0.0500 time 0.2233 (0.2448) data time 0.0008 (0.0014) model time 0.2226 (0.2426) loss 4.0780 (3.8274) grad_norm 2.4672 (1.9802) loss_scale 16384.0000 (12667.5649) mem 7379MB [2024-08-26 04:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [30/300][1250/1251] eta 0:00:00 lr 0.000996 wd 0.0500 time 0.2251 (0.2446) data time 0.0005 (0.0014) model time 0.2246 (0.2425) loss 3.4643 (3.8286) grad_norm 2.2787 (1.9815) loss_scale 16384.0000 (12697.2726) mem 7379MB [2024-08-26 04:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 30 training 
takes 0:05:06 [2024-08-26 04:49:56 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:49:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.430 (0.430) Loss 0.7026 (0.7026) Acc@1 86.426 (86.426) Acc@5 97.754 (97.754) Mem 7379MB [2024-08-26 04:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.110) Loss 1.1201 (1.0617) Acc@1 75.195 (77.060) Acc@5 94.238 (94.354) Mem 7379MB [2024-08-26 04:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.097) Loss 1.4492 (1.0626) Acc@1 68.652 (76.614) Acc@5 89.062 (94.415) Mem 7379MB [2024-08-26 04:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.090) Loss 1.7900 (1.2165) Acc@1 60.254 (73.374) Acc@5 82.910 (92.084) Mem 7379MB [2024-08-26 04:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.7012 (1.3004) Acc@1 62.207 (71.584) Acc@5 85.840 (90.885) Mem 7379MB [2024-08-26 04:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 71.252 Acc@5 90.746 [2024-08-26 04:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 71.3% [2024-08-26 04:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 71.25% [2024-08-26 04:50:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:50:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.424 (0.424) Loss 0.5400 (0.5400) Acc@1 87.207 (87.207) Acc@5 97.266 (97.266) Mem 7379MB [2024-08-26 04:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.110) Loss 0.8975 (0.8897) Acc@1 78.516 (77.894) Acc@5 94.727 (94.576) Mem 7379MB [2024-08-26 04:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.095) Loss 1.3145 (0.9019) Acc@1 67.969 (77.395) Acc@5 89.160 (94.513) Mem 7379MB [2024-08-26 04:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.089) Loss 1.6328 (1.0518) Acc@1 59.766 (74.105) Acc@5 83.887 (92.411) Mem 7379MB [2024-08-26 04:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.6055 (1.1427) Acc@1 61.328 (72.254) Acc@5 85.547 (91.244) Mem 7379MB [2024-08-26 04:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 72.000 Acc@5 91.102 [2024-08-26 04:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 72.0% [2024-08-26 04:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 72.00% [2024-08-26 04:50:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 04:50:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 04:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][0/1251] eta 0:14:07 lr 0.000996 wd 0.0500 time 0.6772 (0.6772) data time 0.4583 (0.4583) model time 0.0000 (0.0000) loss 4.4249 (4.4249) grad_norm 2.1198 (2.1198) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][10/1251] eta 0:05:48 lr 0.000996 wd 0.0500 time 0.2479 (0.2808) data time 0.0007 (0.0426) model time 0.0000 (0.0000) loss 4.8221 (4.0512) grad_norm 1.4426 (1.7349) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][20/1251] eta 0:05:23 lr 0.000996 wd 0.0500 time 0.2487 (0.2631) data time 0.0010 (0.0227) model time 0.0000 (0.0000) loss 4.1633 (4.1820) grad_norm 1.8371 (1.7439) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][30/1251] eta 0:05:13 lr 0.000996 wd 0.0500 time 0.2360 (0.2564) data time 0.0010 (0.0157) model time 0.0000 (0.0000) loss 3.3498 (4.0018) grad_norm 1.7285 (1.7501) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][40/1251] eta 0:05:06 lr 0.000996 wd 0.0500 time 0.2430 (0.2532) data time 0.0008 (0.0121) model time 0.0000 (0.0000) loss 3.4469 (3.9877) grad_norm 1.9988 (1.8534) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][50/1251] eta 0:05:01 lr 0.000996 wd 0.0500 time 0.2458 (0.2509) data time 0.0011 (0.0100) model time 0.0000 (0.0000) loss 3.6708 (3.9465) grad_norm 2.2276 (1.9662) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][60/1251] eta 0:04:56 lr 0.000996 wd 0.0500 time 0.2401 (0.2492) data time 0.0008 (0.0085) model time 0.2393 
(0.2394) loss 4.3848 (3.9203) grad_norm 1.6139 (1.9281) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][70/1251] eta 0:04:53 lr 0.000996 wd 0.0500 time 0.2430 (0.2481) data time 0.0008 (0.0074) model time 0.2422 (0.2401) loss 3.0052 (3.9134) grad_norm 2.0078 (1.9460) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][80/1251] eta 0:04:49 lr 0.000996 wd 0.0500 time 0.2417 (0.2471) data time 0.0011 (0.0066) model time 0.2407 (0.2398) loss 4.4094 (3.9172) grad_norm 2.0930 (1.9579) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][90/1251] eta 0:04:46 lr 0.000996 wd 0.0500 time 0.2413 (0.2467) data time 0.0011 (0.0060) model time 0.2402 (0.2403) loss 3.7857 (3.9157) grad_norm 1.5535 (1.9494) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][100/1251] eta 0:04:43 lr 0.000996 wd 0.0500 time 0.2414 (0.2464) data time 0.0007 (0.0055) model time 0.2407 (0.2407) loss 3.8914 (3.9209) grad_norm 2.7847 (1.9426) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][110/1251] eta 0:04:40 lr 0.000996 wd 0.0500 time 0.2367 (0.2459) data time 0.0012 (0.0051) model time 0.2356 (0.2407) loss 4.5172 (3.9214) grad_norm 3.1480 (1.9328) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][120/1251] eta 0:04:37 lr 0.000996 wd 0.0500 time 0.2422 (0.2456) data time 0.0008 (0.0048) model time 0.2414 (0.2407) loss 4.6271 (3.9150) grad_norm 2.1105 (1.9393) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[31/300][130/1251] eta 0:04:35 lr 0.000996 wd 0.0500 time 0.2433 (0.2454) data time 0.0007 (0.0045) model time 0.2426 (0.2409) loss 3.4412 (3.8966) grad_norm 2.7259 (1.9516) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][140/1251] eta 0:04:32 lr 0.000996 wd 0.0500 time 0.2484 (0.2453) data time 0.0009 (0.0042) model time 0.2475 (0.2412) loss 4.7219 (3.9147) grad_norm 2.5610 (1.9734) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][150/1251] eta 0:04:29 lr 0.000996 wd 0.0500 time 0.2480 (0.2451) data time 0.0010 (0.0040) model time 0.2470 (0.2412) loss 4.0715 (3.9143) grad_norm 1.3511 (1.9541) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][160/1251] eta 0:04:27 lr 0.000996 wd 0.0500 time 0.2452 (0.2449) data time 0.0010 (0.0038) model time 0.2442 (0.2412) loss 3.7797 (3.9249) grad_norm 2.0737 (1.9569) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][170/1251] eta 0:04:24 lr 0.000996 wd 0.0500 time 0.2423 (0.2447) data time 0.0007 (0.0037) model time 0.2415 (0.2411) loss 4.3406 (3.9251) grad_norm 3.0576 (1.9730) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][180/1251] eta 0:04:21 lr 0.000996 wd 0.0500 time 0.2423 (0.2445) data time 0.0008 (0.0035) model time 0.2416 (0.2410) loss 4.0855 (3.9171) grad_norm 1.4631 (1.9605) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][190/1251] eta 0:04:19 lr 0.000996 wd 0.0500 time 0.2477 (0.2444) data time 0.0009 (0.0034) model time 0.2468 (0.2411) loss 4.1746 (3.9035) grad_norm 1.6974 (1.9451) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][200/1251] eta 0:04:16 lr 0.000996 wd 0.0500 time 0.2432 (0.2442) data time 0.0008 (0.0033) model time 0.2425 (0.2410) loss 3.6795 (3.9033) grad_norm 1.2942 (1.9301) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][210/1251] eta 0:04:14 lr 0.000996 wd 0.0500 time 0.2345 (0.2440) data time 0.0011 (0.0032) model time 0.2334 (0.2408) loss 3.9278 (3.9031) grad_norm 1.8496 (1.9364) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][220/1251] eta 0:04:11 lr 0.000996 wd 0.0500 time 0.2360 (0.2439) data time 0.0012 (0.0031) model time 0.2348 (0.2407) loss 4.3972 (3.9005) grad_norm 1.6679 (1.9299) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][230/1251] eta 0:04:09 lr 0.000996 wd 0.0500 time 0.2424 (0.2439) data time 0.0010 (0.0030) model time 0.2415 (0.2409) loss 3.9007 (3.8835) grad_norm 1.8140 (1.9180) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][240/1251] eta 0:04:06 lr 0.000996 wd 0.0500 time 0.2389 (0.2438) data time 0.0008 (0.0029) model time 0.2381 (0.2409) loss 2.5579 (3.8587) grad_norm 1.6835 (1.9188) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][250/1251] eta 0:04:03 lr 0.000996 wd 0.0500 time 0.2363 (0.2437) data time 0.0010 (0.0028) model time 0.2353 (0.2408) loss 4.5297 (3.8498) grad_norm 2.2706 (1.9155) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][260/1251] eta 0:04:01 lr 0.000996 wd 0.0500 time 0.2456 (0.2435) 
data time 0.0007 (0.0028) model time 0.2449 (0.2407) loss 3.9950 (3.8430) grad_norm 1.9641 (1.9065) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][270/1251] eta 0:03:58 lr 0.000996 wd 0.0500 time 0.2401 (0.2434) data time 0.0009 (0.0027) model time 0.2392 (0.2407) loss 4.5923 (3.8378) grad_norm 1.5833 (1.8986) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][280/1251] eta 0:03:56 lr 0.000996 wd 0.0500 time 0.2447 (0.2433) data time 0.0010 (0.0026) model time 0.2437 (0.2406) loss 3.7818 (3.8364) grad_norm 1.5426 (1.8935) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][290/1251] eta 0:03:53 lr 0.000996 wd 0.0500 time 0.2407 (0.2432) data time 0.0009 (0.0026) model time 0.2398 (0.2406) loss 3.9167 (3.8316) grad_norm 1.6212 (1.8933) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][300/1251] eta 0:03:51 lr 0.000996 wd 0.0500 time 0.2452 (0.2431) data time 0.0009 (0.0025) model time 0.2442 (0.2406) loss 4.2314 (3.8316) grad_norm 1.4152 (1.8853) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][310/1251] eta 0:03:48 lr 0.000996 wd 0.0500 time 0.2475 (0.2431) data time 0.0009 (0.0025) model time 0.2465 (0.2406) loss 4.1443 (3.8259) grad_norm 2.9537 (1.8967) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][320/1251] eta 0:03:46 lr 0.000996 wd 0.0500 time 0.2374 (0.2431) data time 0.0009 (0.0024) model time 0.2366 (0.2406) loss 3.9041 (3.8181) grad_norm 1.5445 (1.9099) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [31/300][330/1251] eta 0:03:43 lr 0.000996 wd 0.0500 time 0.2443 (0.2430) data time 0.0011 (0.0024) model time 0.2433 (0.2406) loss 3.6935 (3.8140) grad_norm 1.9056 (1.9130) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][340/1251] eta 0:03:41 lr 0.000996 wd 0.0500 time 0.2342 (0.2434) data time 0.0011 (0.0024) model time 0.2331 (0.2411) loss 3.5543 (3.8234) grad_norm 1.6041 (1.9083) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][350/1251] eta 0:03:39 lr 0.000996 wd 0.0500 time 0.2422 (0.2439) data time 0.0009 (0.0023) model time 0.2413 (0.2417) loss 3.8616 (3.8251) grad_norm 1.4299 (1.9023) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][360/1251] eta 0:03:38 lr 0.000996 wd 0.0500 time 0.3955 (0.2447) data time 0.0041 (0.0023) model time 0.3914 (0.2427) loss 3.4623 (3.8241) grad_norm 1.5259 (1.8968) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][370/1251] eta 0:03:35 lr 0.000996 wd 0.0500 time 0.2412 (0.2451) data time 0.0008 (0.0022) model time 0.2404 (0.2431) loss 3.1038 (3.8139) grad_norm 1.5431 (1.9012) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][380/1251] eta 0:03:33 lr 0.000996 wd 0.0500 time 0.2385 (0.2450) data time 0.0009 (0.0022) model time 0.2376 (0.2431) loss 4.3881 (3.8218) grad_norm 1.5822 (1.8997) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][390/1251] eta 0:03:30 lr 0.000996 wd 0.0500 time 0.2417 (0.2449) data time 0.0009 (0.0022) model time 0.2408 (0.2430) loss 4.5868 (3.8226) 
grad_norm 2.6730 (1.9033) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][400/1251] eta 0:03:28 lr 0.000996 wd 0.0500 time 0.2396 (0.2454) data time 0.0010 (0.0022) model time 0.2386 (0.2436) loss 3.8992 (3.8231) grad_norm 1.8134 (1.9122) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][410/1251] eta 0:03:26 lr 0.000996 wd 0.0500 time 0.2379 (0.2453) data time 0.0009 (0.0021) model time 0.2370 (0.2435) loss 4.5748 (3.8251) grad_norm 1.8534 (1.9148) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][420/1251] eta 0:03:23 lr 0.000996 wd 0.0500 time 0.2414 (0.2453) data time 0.0010 (0.0021) model time 0.2404 (0.2435) loss 3.8788 (3.8223) grad_norm 2.0638 (1.9229) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][430/1251] eta 0:03:21 lr 0.000996 wd 0.0500 time 0.2367 (0.2452) data time 0.0010 (0.0021) model time 0.2357 (0.2434) loss 3.2020 (3.8226) grad_norm 1.6064 (1.9172) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][440/1251] eta 0:03:18 lr 0.000996 wd 0.0500 time 0.2485 (0.2452) data time 0.0011 (0.0021) model time 0.2474 (0.2434) loss 3.5214 (3.8188) grad_norm 2.0072 (1.9148) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][450/1251] eta 0:03:16 lr 0.000996 wd 0.0500 time 0.2492 (0.2451) data time 0.0009 (0.0020) model time 0.2484 (0.2434) loss 4.5121 (3.8186) grad_norm 1.5180 (1.9200) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][460/1251] eta 0:03:13 lr 
0.000996 wd 0.0500 time 0.2417 (0.2450) data time 0.0011 (0.0020) model time 0.2405 (0.2433) loss 4.2395 (3.8244) grad_norm 1.8205 (1.9300) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][470/1251] eta 0:03:11 lr 0.000996 wd 0.0500 time 0.2384 (0.2449) data time 0.0010 (0.0020) model time 0.2375 (0.2432) loss 4.4117 (3.8249) grad_norm 1.3470 (1.9285) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][480/1251] eta 0:03:08 lr 0.000996 wd 0.0500 time 0.2513 (0.2449) data time 0.0008 (0.0020) model time 0.2505 (0.2432) loss 4.4823 (3.8187) grad_norm 3.1893 (1.9263) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][490/1251] eta 0:03:06 lr 0.000996 wd 0.0500 time 0.2381 (0.2448) data time 0.0009 (0.0019) model time 0.2372 (0.2431) loss 3.3394 (3.8203) grad_norm 3.1204 (1.9320) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][500/1251] eta 0:03:03 lr 0.000996 wd 0.0500 time 0.2400 (0.2448) data time 0.0008 (0.0019) model time 0.2393 (0.2431) loss 3.4591 (3.8233) grad_norm 1.1982 (1.9308) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][510/1251] eta 0:03:01 lr 0.000996 wd 0.0500 time 0.2394 (0.2448) data time 0.0010 (0.0019) model time 0.2384 (0.2431) loss 4.2465 (3.8256) grad_norm 2.2005 (1.9273) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][520/1251] eta 0:02:58 lr 0.000996 wd 0.0500 time 0.2364 (0.2447) data time 0.0011 (0.0019) model time 0.2353 (0.2430) loss 4.0604 (3.8190) grad_norm 1.8063 (1.9223) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 04:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][530/1251] eta 0:02:56 lr 0.000996 wd 0.0500 time 0.2411 (0.2447) data time 0.0010 (0.0019) model time 0.2401 (0.2430) loss 4.3190 (3.8228) grad_norm 1.9033 (1.9338) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][540/1251] eta 0:02:53 lr 0.000996 wd 0.0500 time 0.2466 (0.2446) data time 0.0009 (0.0019) model time 0.2457 (0.2429) loss 3.6731 (3.8242) grad_norm 2.1782 (1.9349) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][550/1251] eta 0:02:51 lr 0.000996 wd 0.0500 time 0.2380 (0.2446) data time 0.0007 (0.0019) model time 0.2374 (0.2429) loss 4.5511 (3.8263) grad_norm 1.4517 (1.9327) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][560/1251] eta 0:02:49 lr 0.000996 wd 0.0500 time 0.2497 (0.2446) data time 0.0009 (0.0018) model time 0.2488 (0.2429) loss 2.4491 (3.8277) grad_norm 1.8648 (1.9307) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][570/1251] eta 0:02:46 lr 0.000996 wd 0.0500 time 0.2395 (0.2446) data time 0.0010 (0.0018) model time 0.2385 (0.2429) loss 3.7922 (3.8279) grad_norm 2.1753 (1.9314) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][580/1251] eta 0:02:44 lr 0.000996 wd 0.0500 time 0.2469 (0.2445) data time 0.0011 (0.0018) model time 0.2458 (0.2429) loss 4.1927 (3.8279) grad_norm 1.3059 (1.9331) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][590/1251] eta 0:02:41 lr 0.000996 wd 0.0500 time 0.2391 (0.2445) data time 0.0009 (0.0018) model time 
0.2381 (0.2429) loss 3.0195 (3.8280) grad_norm 1.8812 (1.9339) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][600/1251] eta 0:02:39 lr 0.000996 wd 0.0500 time 0.2483 (0.2444) data time 0.0010 (0.0018) model time 0.2473 (0.2428) loss 4.1914 (3.8254) grad_norm 1.8612 (1.9315) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][610/1251] eta 0:02:36 lr 0.000996 wd 0.0500 time 0.2330 (0.2444) data time 0.0009 (0.0018) model time 0.2321 (0.2428) loss 2.7959 (3.8194) grad_norm 1.8092 (1.9298) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][620/1251] eta 0:02:34 lr 0.000996 wd 0.0500 time 0.2475 (0.2444) data time 0.0010 (0.0018) model time 0.2466 (0.2428) loss 4.2591 (3.8193) grad_norm 1.4492 (1.9297) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][630/1251] eta 0:02:31 lr 0.000996 wd 0.0500 time 0.2441 (0.2443) data time 0.0007 (0.0018) model time 0.2433 (0.2427) loss 3.7926 (3.8207) grad_norm 2.3179 (1.9336) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][640/1251] eta 0:02:29 lr 0.000996 wd 0.0500 time 0.2343 (0.2443) data time 0.0008 (0.0017) model time 0.2335 (0.2427) loss 3.3589 (3.8218) grad_norm 1.6663 (1.9319) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][650/1251] eta 0:02:26 lr 0.000996 wd 0.0500 time 0.2436 (0.2442) data time 0.0007 (0.0017) model time 0.2429 (0.2426) loss 2.7049 (3.8188) grad_norm 2.0719 (1.9345) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[31/300][660/1251] eta 0:02:24 lr 0.000996 wd 0.0500 time 0.2454 (0.2442) data time 0.0008 (0.0017) model time 0.2446 (0.2426) loss 4.0028 (3.8227) grad_norm 2.4923 (1.9321) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][670/1251] eta 0:02:21 lr 0.000996 wd 0.0500 time 0.2531 (0.2442) data time 0.0008 (0.0017) model time 0.2522 (0.2426) loss 3.5620 (3.8229) grad_norm 1.8224 (1.9318) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][680/1251] eta 0:02:19 lr 0.000996 wd 0.0500 time 0.2431 (0.2441) data time 0.0010 (0.0017) model time 0.2421 (0.2426) loss 4.1924 (3.8188) grad_norm 1.4097 (1.9335) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][690/1251] eta 0:02:16 lr 0.000996 wd 0.0500 time 0.2349 (0.2441) data time 0.0007 (0.0017) model time 0.2342 (0.2425) loss 4.3718 (3.8219) grad_norm 1.5985 (1.9335) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][700/1251] eta 0:02:14 lr 0.000996 wd 0.0500 time 0.2452 (0.2441) data time 0.0008 (0.0017) model time 0.2444 (0.2425) loss 3.6263 (3.8277) grad_norm 2.0110 (1.9307) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][710/1251] eta 0:02:12 lr 0.000996 wd 0.0500 time 0.2433 (0.2441) data time 0.0009 (0.0017) model time 0.2424 (0.2425) loss 4.3635 (3.8299) grad_norm 2.0647 (1.9342) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][720/1251] eta 0:02:09 lr 0.000996 wd 0.0500 time 0.2372 (0.2440) data time 0.0009 (0.0017) model time 0.2363 (0.2425) loss 4.2622 (3.8310) grad_norm 1.9117 (1.9389) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][730/1251] eta 0:02:07 lr 0.000996 wd 0.0500 time 0.2425 (0.2440) data time 0.0007 (0.0017) model time 0.2417 (0.2425) loss 2.9651 (3.8312) grad_norm 1.8463 (1.9392) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][740/1251] eta 0:02:04 lr 0.000996 wd 0.0500 time 0.2432 (0.2440) data time 0.0012 (0.0016) model time 0.2421 (0.2424) loss 4.0305 (3.8326) grad_norm 2.0458 (1.9399) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][750/1251] eta 0:02:02 lr 0.000996 wd 0.0500 time 0.2327 (0.2439) data time 0.0009 (0.0016) model time 0.2318 (0.2424) loss 4.1599 (3.8312) grad_norm 1.6759 (1.9385) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][760/1251] eta 0:01:59 lr 0.000996 wd 0.0500 time 0.2374 (0.2439) data time 0.0011 (0.0016) model time 0.2363 (0.2423) loss 3.7517 (3.8328) grad_norm 2.1332 (1.9371) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][770/1251] eta 0:01:57 lr 0.000996 wd 0.0500 time 0.2455 (0.2438) data time 0.0010 (0.0016) model time 0.2445 (0.2423) loss 3.2128 (3.8311) grad_norm 3.0375 (1.9399) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][780/1251] eta 0:01:54 lr 0.000996 wd 0.0500 time 0.2407 (0.2438) data time 0.0009 (0.0016) model time 0.2398 (0.2423) loss 4.3036 (3.8320) grad_norm 2.9228 (1.9422) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][790/1251] eta 0:01:52 lr 0.000996 wd 0.0500 time 0.2442 (0.2437) 
data time 0.0008 (0.0016) model time 0.2434 (0.2422) loss 4.6863 (3.8310) grad_norm 1.7187 (1.9408) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][800/1251] eta 0:01:49 lr 0.000996 wd 0.0500 time 0.2475 (0.2437) data time 0.0011 (0.0016) model time 0.2464 (0.2422) loss 4.3921 (3.8324) grad_norm 2.1087 (1.9447) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][810/1251] eta 0:01:47 lr 0.000996 wd 0.0500 time 0.2446 (0.2437) data time 0.0010 (0.0016) model time 0.2436 (0.2422) loss 4.2354 (3.8348) grad_norm 1.9702 (1.9471) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][820/1251] eta 0:01:45 lr 0.000996 wd 0.0500 time 0.2433 (0.2437) data time 0.0010 (0.0016) model time 0.2424 (0.2422) loss 4.2600 (3.8364) grad_norm 2.0803 (1.9478) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][830/1251] eta 0:01:42 lr 0.000996 wd 0.0500 time 0.2436 (0.2437) data time 0.0012 (0.0016) model time 0.2424 (0.2422) loss 3.0622 (3.8366) grad_norm 3.1142 (1.9491) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][840/1251] eta 0:01:40 lr 0.000996 wd 0.0500 time 0.2442 (0.2437) data time 0.0007 (0.0016) model time 0.2435 (0.2422) loss 4.8973 (3.8393) grad_norm 1.7240 (1.9463) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][850/1251] eta 0:01:37 lr 0.000996 wd 0.0500 time 0.2421 (0.2437) data time 0.0008 (0.0016) model time 0.2413 (0.2422) loss 4.1947 (3.8372) grad_norm 1.9775 (1.9442) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [31/300][860/1251] eta 0:01:35 lr 0.000996 wd 0.0500 time 0.2466 (0.2437) data time 0.0007 (0.0016) model time 0.2459 (0.2422) loss 2.4226 (3.8364) grad_norm 1.7861 (1.9450) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][870/1251] eta 0:01:32 lr 0.000996 wd 0.0500 time 0.2394 (0.2439) data time 0.0009 (0.0016) model time 0.2385 (0.2424) loss 4.1123 (3.8342) grad_norm 1.9261 (1.9431) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][880/1251] eta 0:01:30 lr 0.000996 wd 0.0500 time 0.2472 (0.2442) data time 0.0011 (0.0016) model time 0.2460 (0.2427) loss 3.5830 (3.8302) grad_norm 3.1701 (1.9433) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][890/1251] eta 0:01:28 lr 0.000996 wd 0.0500 time 0.2395 (0.2446) data time 0.0008 (0.0015) model time 0.2387 (0.2432) loss 3.3901 (3.8301) grad_norm 1.0993 (1.9455) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][900/1251] eta 0:01:25 lr 0.000996 wd 0.0500 time 0.2317 (0.2449) data time 0.0010 (0.0015) model time 0.2307 (0.2435) loss 4.1594 (3.8328) grad_norm 1.5468 (1.9429) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][910/1251] eta 0:01:23 lr 0.000996 wd 0.0500 time 0.2422 (0.2448) data time 0.0008 (0.0015) model time 0.2415 (0.2434) loss 4.0488 (3.8332) grad_norm 2.0483 (1.9428) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][920/1251] eta 0:01:21 lr 0.000996 wd 0.0500 time 0.2383 (0.2450) data time 0.0009 (0.0015) model time 0.2374 (0.2436) loss 4.7667 (3.8344) 
grad_norm 3.2025 (1.9452) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][930/1251] eta 0:01:18 lr 0.000996 wd 0.0500 time 0.2349 (0.2449) data time 0.0011 (0.0015) model time 0.2338 (0.2436) loss 4.3142 (3.8364) grad_norm 1.7018 (1.9431) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][940/1251] eta 0:01:16 lr 0.000996 wd 0.0500 time 0.2376 (0.2449) data time 0.0009 (0.0015) model time 0.2367 (0.2435) loss 3.5613 (3.8312) grad_norm 2.1617 (1.9418) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][950/1251] eta 0:01:13 lr 0.000996 wd 0.0500 time 0.2414 (0.2449) data time 0.0011 (0.0015) model time 0.2403 (0.2435) loss 3.1736 (3.8300) grad_norm 2.1108 (1.9445) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][960/1251] eta 0:01:11 lr 0.000996 wd 0.0500 time 0.2327 (0.2449) data time 0.0011 (0.0015) model time 0.2316 (0.2435) loss 4.3050 (3.8316) grad_norm 1.6675 (1.9483) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][970/1251] eta 0:01:08 lr 0.000996 wd 0.0500 time 0.2461 (0.2449) data time 0.0008 (0.0015) model time 0.2454 (0.2435) loss 2.8423 (3.8302) grad_norm 1.5968 (1.9487) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][980/1251] eta 0:01:06 lr 0.000996 wd 0.0500 time 0.2395 (0.2448) data time 0.0012 (0.0015) model time 0.2383 (0.2435) loss 3.3853 (3.8286) grad_norm 2.8280 (1.9491) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][990/1251] eta 0:01:03 lr 
0.000996 wd 0.0500 time 0.2348 (0.2448) data time 0.0009 (0.0015) model time 0.2339 (0.2434) loss 3.9213 (3.8285) grad_norm 1.9070 (1.9474) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1000/1251] eta 0:01:01 lr 0.000996 wd 0.0500 time 0.2397 (0.2447) data time 0.0007 (0.0015) model time 0.2390 (0.2434) loss 4.1208 (3.8274) grad_norm 2.1272 (1.9476) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1010/1251] eta 0:00:58 lr 0.000996 wd 0.0500 time 0.2474 (0.2447) data time 0.0008 (0.0015) model time 0.2466 (0.2434) loss 4.7035 (3.8260) grad_norm 1.9514 (1.9491) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1020/1251] eta 0:00:56 lr 0.000996 wd 0.0500 time 0.2381 (0.2447) data time 0.0010 (0.0015) model time 0.2371 (0.2433) loss 2.9033 (3.8217) grad_norm 1.6569 (1.9467) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1030/1251] eta 0:00:54 lr 0.000996 wd 0.0500 time 0.2336 (0.2446) data time 0.0008 (0.0015) model time 0.2328 (0.2433) loss 4.7577 (3.8236) grad_norm 1.9662 (1.9455) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1040/1251] eta 0:00:51 lr 0.000996 wd 0.0500 time 0.2348 (0.2446) data time 0.0008 (0.0015) model time 0.2340 (0.2432) loss 4.5202 (3.8274) grad_norm 1.7834 (1.9449) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1050/1251] eta 0:00:49 lr 0.000996 wd 0.0500 time 0.2472 (0.2446) data time 0.0011 (0.0015) model time 0.2461 (0.2432) loss 4.1734 (3.8266) grad_norm 3.0225 (1.9458) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 04:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1060/1251] eta 0:00:46 lr 0.000996 wd 0.0500 time 0.2435 (0.2446) data time 0.0008 (0.0015) model time 0.2427 (0.2432) loss 4.3049 (3.8273) grad_norm 1.6551 (1.9512) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1070/1251] eta 0:00:44 lr 0.000996 wd 0.0500 time 0.2433 (0.2446) data time 0.0007 (0.0015) model time 0.2426 (0.2432) loss 4.4338 (3.8287) grad_norm 1.4965 (1.9497) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1080/1251] eta 0:00:41 lr 0.000996 wd 0.0500 time 0.2404 (0.2445) data time 0.0010 (0.0015) model time 0.2394 (0.2432) loss 4.1604 (3.8278) grad_norm 2.6083 (1.9509) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1090/1251] eta 0:00:39 lr 0.000996 wd 0.0500 time 0.2416 (0.2445) data time 0.0011 (0.0014) model time 0.2405 (0.2432) loss 3.9939 (3.8280) grad_norm 3.1861 (1.9519) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1100/1251] eta 0:00:36 lr 0.000996 wd 0.0500 time 0.2399 (0.2445) data time 0.0010 (0.0014) model time 0.2389 (0.2432) loss 3.8767 (3.8286) grad_norm 1.8297 (1.9501) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1110/1251] eta 0:00:34 lr 0.000996 wd 0.0500 time 0.2497 (0.2445) data time 0.0009 (0.0014) model time 0.2488 (0.2431) loss 3.7894 (3.8277) grad_norm 1.7852 (1.9484) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1120/1251] eta 0:00:32 lr 0.000996 wd 0.0500 time 0.2362 (0.2445) data time 0.0010 (0.0014) model 
time 0.2352 (0.2431) loss 4.8596 (3.8306) grad_norm 2.9530 (1.9496) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1130/1251] eta 0:00:29 lr 0.000996 wd 0.0500 time 0.2453 (0.2444) data time 0.0007 (0.0014) model time 0.2446 (0.2431) loss 4.1459 (3.8287) grad_norm 1.4826 (1.9480) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1140/1251] eta 0:00:27 lr 0.000996 wd 0.0500 time 0.2417 (0.2444) data time 0.0010 (0.0014) model time 0.2407 (0.2430) loss 2.8799 (3.8267) grad_norm 3.3614 (1.9474) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1150/1251] eta 0:00:24 lr 0.000996 wd 0.0500 time 0.2457 (0.2444) data time 0.0008 (0.0014) model time 0.2449 (0.2431) loss 3.6931 (3.8255) grad_norm 4.4465 (1.9510) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1160/1251] eta 0:00:22 lr 0.000996 wd 0.0500 time 0.2352 (0.2444) data time 0.0009 (0.0014) model time 0.2343 (0.2430) loss 4.0260 (3.8273) grad_norm 1.9451 (1.9538) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1170/1251] eta 0:00:19 lr 0.000996 wd 0.0500 time 0.2358 (0.2443) data time 0.0008 (0.0014) model time 0.2350 (0.2430) loss 4.5461 (3.8298) grad_norm 1.7191 (1.9519) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1180/1251] eta 0:00:17 lr 0.000996 wd 0.0500 time 0.2380 (0.2443) data time 0.0008 (0.0014) model time 0.2372 (0.2430) loss 4.1825 (3.8290) grad_norm 1.2783 (1.9497) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [31/300][1190/1251] eta 0:00:14 lr 0.000996 wd 0.0500 time 0.2459 (0.2443) data time 0.0011 (0.0014) model time 0.2448 (0.2430) loss 2.9964 (3.8278) grad_norm 2.0956 (1.9480) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1200/1251] eta 0:00:12 lr 0.000996 wd 0.0500 time 0.2412 (0.2443) data time 0.0007 (0.0014) model time 0.2404 (0.2429) loss 4.3404 (3.8266) grad_norm 1.8544 (1.9477) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1210/1251] eta 0:00:10 lr 0.000996 wd 0.0500 time 0.2375 (0.2443) data time 0.0009 (0.0014) model time 0.2366 (0.2429) loss 4.1256 (3.8261) grad_norm 2.0012 (1.9477) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1220/1251] eta 0:00:07 lr 0.000996 wd 0.0500 time 0.2338 (0.2442) data time 0.0008 (0.0014) model time 0.2331 (0.2429) loss 4.8567 (3.8281) grad_norm 1.9483 (1.9468) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1230/1251] eta 0:00:05 lr 0.000996 wd 0.0500 time 0.2389 (0.2442) data time 0.0007 (0.0014) model time 0.2382 (0.2429) loss 3.1791 (3.8257) grad_norm 1.6549 (1.9461) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1240/1251] eta 0:00:02 lr 0.000996 wd 0.0500 time 0.2258 (0.2441) data time 0.0005 (0.0014) model time 0.2253 (0.2428) loss 3.8653 (3.8263) grad_norm 1.9100 (1.9471) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [31/300][1250/1251] eta 0:00:00 lr 0.000996 wd 0.0500 time 0.2257 (0.2440) data time 0.0005 (0.0014) model time 0.2253 (0.2427) loss 3.1838 (3.8259) grad_norm 1.4900 (1.9454) 
loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 31 training takes 0:05:05 [2024-08-26 04:55:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 04:55:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 04:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.428 (0.428) Loss 0.6558 (0.6558) Acc@1 86.426 (86.426) Acc@5 96.484 (96.484) Mem 7379MB [2024-08-26 04:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.110) Loss 0.9941 (0.9676) Acc@1 76.758 (77.566) Acc@5 93.945 (94.327) Mem 7379MB [2024-08-26 04:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.096) Loss 1.4316 (0.9876) Acc@1 67.871 (76.828) Acc@5 88.379 (94.341) Mem 7379MB [2024-08-26 04:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.6787 (1.1327) Acc@1 62.793 (73.907) Acc@5 82.910 (92.310) Mem 7379MB [2024-08-26 04:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.5850 (1.2242) Acc@1 63.574 (71.935) Acc@5 86.719 (91.023) Mem 7379MB [2024-08-26 04:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 71.672 Acc@5 90.972 [2024-08-26 04:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 71.7% [2024-08-26 04:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 71.67% [2024-08-26 04:55:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 04:55:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 04:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.418 (0.418) Loss 0.5278 (0.5278) Acc@1 87.695 (87.695) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 04:55:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.114) Loss 0.8818 (0.8720) Acc@1 79.199 (78.542) Acc@5 94.922 (94.806) Mem 7379MB
[2024-08-26 04:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.095) Loss 1.2910 (0.8848) Acc@1 68.066 (77.916) Acc@5 89.453 (94.689) Mem 7379MB
[2024-08-26 04:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.089) Loss 1.6064 (1.0318) Acc@1 60.645 (74.720) Acc@5 84.375 (92.669) Mem 7379MB
[2024-08-26 04:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.083) Loss 1.5732 (1.1208) Acc@1 62.402 (72.885) Acc@5 86.035 (91.535) Mem 7379MB
[2024-08-26 04:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 72.608 Acc@5 91.392
[2024-08-26 04:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 72.6%
[2024-08-26 04:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 72.61%
[2024-08-26 04:55:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 04:55:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 04:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][0/1251] eta 0:14:35 lr 0.000996 wd 0.0500 time 0.6996 (0.6996) data time 0.4783 (0.4783) model time 0.0000 (0.0000) loss 4.4410 (4.4410) grad_norm 1.7378 (1.7378) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][10/1251] eta 0:05:47 lr 0.000996 wd 0.0500 time 0.2333 (0.2797) data time 0.0011 (0.0445) model time 0.0000 (0.0000) loss 4.4470 (3.7425) grad_norm 1.5587 (1.6232) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][20/1251] eta 0:05:23 lr 0.000996 wd 0.0500 time 0.2622 (0.2624) data time 0.0008 (0.0238) model time 0.0000 (0.0000) loss 3.3098 (3.7351) grad_norm 1.3718 (1.7864) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][30/1251] eta 0:05:12 lr 0.000996 wd 0.0500 time 0.2520 (0.2555) data time 0.0007 (0.0164) model time 0.0000 (0.0000) loss 4.2137 (3.8089) grad_norm 3.0696 (1.8858) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][40/1251] eta 0:05:06 lr 0.000995 wd 0.0500 time 0.2661 (0.2527) data time 0.0008 (0.0127) model time 0.0000 (0.0000) loss 2.6589 (3.7668) grad_norm 2.7794 (1.9191) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][50/1251] eta 0:05:00 lr 0.000995 wd 0.0500 time 0.2431 (0.2502) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 3.8033 (3.8079) grad_norm 1.1886 (1.9246) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][60/1251] eta 0:04:56 lr 0.000995 wd 0.0500 time 0.2432 (0.2486) data time 0.0008 (0.0089) model time 0.2424 
(0.2399) loss 2.7814 (3.7482) grad_norm 1.3645 (1.8674) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 04:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][70/1251] eta 0:04:52 lr 0.000995 wd 0.0500 time 0.2498 (0.2477) data time 0.0007 (0.0079) model time 0.2490 (0.2398) loss 3.1096 (3.7716) grad_norm 1.1956 (1.8478) loss_scale 32768.0000 (18691.6056) mem 7379MB [2024-08-26 04:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][80/1251] eta 0:04:49 lr 0.000995 wd 0.0500 time 0.2338 (0.2468) data time 0.0008 (0.0071) model time 0.2330 (0.2398) loss 4.4995 (3.7587) grad_norm 1.9557 (1.8458) loss_scale 32768.0000 (20429.4321) mem 7379MB [2024-08-26 04:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][90/1251] eta 0:04:46 lr 0.000995 wd 0.0500 time 0.2503 (0.2464) data time 0.0009 (0.0065) model time 0.2493 (0.2402) loss 4.1729 (3.7754) grad_norm 1.7496 (1.8714) loss_scale 32768.0000 (21785.3187) mem 7379MB [2024-08-26 04:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][100/1251] eta 0:04:44 lr 0.000995 wd 0.0500 time 0.2125 (0.2474) data time 0.0011 (0.0060) model time 0.2114 (0.2432) loss 3.9577 (3.7770) grad_norm 1.9463 (inf) loss_scale 8192.0000 (20926.0990) mem 7379MB [2024-08-26 04:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][110/1251] eta 0:04:41 lr 0.000995 wd 0.0500 time 0.2478 (0.2471) data time 0.0009 (0.0055) model time 0.2469 (0.2432) loss 3.6418 (3.7751) grad_norm 1.7949 (inf) loss_scale 8192.0000 (19778.8829) mem 7379MB [2024-08-26 04:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][120/1251] eta 0:04:39 lr 0.000995 wd 0.0500 time 0.2314 (0.2467) data time 0.0010 (0.0053) model time 0.2304 (0.2427) loss 4.0189 (3.7953) grad_norm 1.8723 (inf) loss_scale 8192.0000 (18821.2893) mem 7379MB [2024-08-26 04:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][130/1251] eta 
0:04:36 lr 0.000995 wd 0.0500 time 0.2420 (0.2464) data time 0.0009 (0.0050) model time 0.2411 (0.2426) loss 4.3592 (3.7995) grad_norm 2.8043 (inf) loss_scale 8192.0000 (18009.8931) mem 7379MB [2024-08-26 04:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][140/1251] eta 0:04:33 lr 0.000995 wd 0.0500 time 0.2430 (0.2461) data time 0.0007 (0.0047) model time 0.2423 (0.2424) loss 4.8986 (3.8159) grad_norm 3.1713 (inf) loss_scale 8192.0000 (17313.5887) mem 7379MB [2024-08-26 04:55:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][150/1251] eta 0:04:30 lr 0.000995 wd 0.0500 time 0.2346 (0.2460) data time 0.0010 (0.0045) model time 0.2335 (0.2424) loss 3.8276 (3.8079) grad_norm 1.4724 (inf) loss_scale 8192.0000 (16709.5099) mem 7379MB [2024-08-26 04:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][160/1251] eta 0:04:27 lr 0.000995 wd 0.0500 time 0.2426 (0.2456) data time 0.0008 (0.0043) model time 0.2417 (0.2421) loss 4.2395 (3.8138) grad_norm 2.3442 (inf) loss_scale 8192.0000 (16180.4720) mem 7379MB [2024-08-26 04:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][170/1251] eta 0:04:25 lr 0.000995 wd 0.0500 time 0.2479 (0.2454) data time 0.0010 (0.0041) model time 0.2469 (0.2420) loss 3.9442 (3.8209) grad_norm 1.6629 (inf) loss_scale 8192.0000 (15713.3099) mem 7379MB [2024-08-26 04:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][180/1251] eta 0:04:22 lr 0.000995 wd 0.0500 time 0.2500 (0.2454) data time 0.0010 (0.0039) model time 0.2491 (0.2421) loss 4.1685 (3.8338) grad_norm 2.0024 (inf) loss_scale 8192.0000 (15297.7680) mem 7379MB [2024-08-26 04:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][190/1251] eta 0:04:20 lr 0.000995 wd 0.0500 time 0.2419 (0.2451) data time 0.0009 (0.0038) model time 0.2410 (0.2419) loss 2.8587 (3.8318) grad_norm 1.4426 (inf) loss_scale 8192.0000 (14925.7382) mem 7379MB [2024-08-26 04:56:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][200/1251] eta 0:04:17 lr 0.000995 wd 0.0500 time 0.2379 (0.2450) data time 0.0008 (0.0036) model time 0.2370 (0.2419) loss 4.5694 (3.8383) grad_norm 1.7829 (inf) loss_scale 8192.0000 (14590.7264) mem 7379MB [2024-08-26 04:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][210/1251] eta 0:04:14 lr 0.000995 wd 0.0500 time 0.2405 (0.2448) data time 0.0009 (0.0035) model time 0.2396 (0.2418) loss 3.0640 (3.8153) grad_norm 1.6382 (inf) loss_scale 8192.0000 (14287.4692) mem 7379MB [2024-08-26 04:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][220/1251] eta 0:04:12 lr 0.000995 wd 0.0500 time 0.2404 (0.2447) data time 0.0008 (0.0034) model time 0.2396 (0.2417) loss 2.2724 (3.8047) grad_norm 1.8821 (inf) loss_scale 8192.0000 (14011.6561) mem 7379MB [2024-08-26 04:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][230/1251] eta 0:04:09 lr 0.000995 wd 0.0500 time 0.2323 (0.2445) data time 0.0012 (0.0033) model time 0.2311 (0.2416) loss 4.2575 (3.7959) grad_norm 2.3796 (inf) loss_scale 8192.0000 (13759.7229) mem 7379MB [2024-08-26 04:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][240/1251] eta 0:04:07 lr 0.000995 wd 0.0500 time 0.2424 (0.2444) data time 0.0008 (0.0032) model time 0.2416 (0.2416) loss 2.9041 (3.7950) grad_norm 1.4017 (inf) loss_scale 8192.0000 (13528.6971) mem 7379MB [2024-08-26 04:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][250/1251] eta 0:04:04 lr 0.000995 wd 0.0500 time 0.2383 (0.2443) data time 0.0010 (0.0031) model time 0.2372 (0.2415) loss 4.1026 (3.7934) grad_norm 1.5446 (inf) loss_scale 8192.0000 (13316.0797) mem 7379MB [2024-08-26 04:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][260/1251] eta 0:04:02 lr 0.000995 wd 0.0500 time 0.2424 (0.2442) data time 0.0012 (0.0030) model time 0.2412 (0.2415) loss 3.7421 (3.7917) grad_norm 
3.6059 (inf) loss_scale 8192.0000 (13119.7548) mem 7379MB [2024-08-26 04:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][270/1251] eta 0:03:59 lr 0.000995 wd 0.0500 time 0.2410 (0.2442) data time 0.0008 (0.0030) model time 0.2402 (0.2415) loss 4.2269 (3.7930) grad_norm 2.3344 (inf) loss_scale 8192.0000 (12937.9188) mem 7379MB [2024-08-26 04:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][280/1251] eta 0:03:56 lr 0.000995 wd 0.0500 time 0.2416 (0.2440) data time 0.0010 (0.0029) model time 0.2406 (0.2414) loss 3.3940 (3.7757) grad_norm 1.8643 (inf) loss_scale 8192.0000 (12769.0249) mem 7379MB [2024-08-26 04:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][290/1251] eta 0:03:54 lr 0.000995 wd 0.0500 time 0.2444 (0.2440) data time 0.0007 (0.0028) model time 0.2437 (0.2415) loss 4.6235 (3.7825) grad_norm 2.3895 (inf) loss_scale 8192.0000 (12611.7388) mem 7379MB [2024-08-26 04:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][300/1251] eta 0:03:51 lr 0.000995 wd 0.0500 time 0.2432 (0.2439) data time 0.0010 (0.0028) model time 0.2422 (0.2415) loss 2.7402 (3.7849) grad_norm 1.3371 (inf) loss_scale 8192.0000 (12464.9037) mem 7379MB [2024-08-26 04:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][310/1251] eta 0:03:50 lr 0.000995 wd 0.0500 time 0.2395 (0.2446) data time 0.0007 (0.0027) model time 0.2388 (0.2423) loss 4.6106 (3.7767) grad_norm 1.8365 (inf) loss_scale 8192.0000 (12327.5113) mem 7379MB [2024-08-26 04:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][320/1251] eta 0:03:49 lr 0.000995 wd 0.0500 time 0.2537 (0.2466) data time 0.0008 (0.0027) model time 0.2529 (0.2447) loss 2.7674 (3.7776) grad_norm 2.0113 (inf) loss_scale 8192.0000 (12198.6791) mem 7379MB [2024-08-26 04:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][330/1251] eta 0:03:47 lr 0.000995 wd 0.0500 time 0.2384 (0.2471) data 
time 0.0009 (0.0026) model time 0.2375 (0.2453) loss 3.9112 (3.7818) grad_norm 2.6622 (inf) loss_scale 8192.0000 (12077.6314) mem 7379MB [2024-08-26 04:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][340/1251] eta 0:03:44 lr 0.000995 wd 0.0500 time 0.2389 (0.2469) data time 0.0011 (0.0026) model time 0.2377 (0.2452) loss 4.1542 (3.7852) grad_norm 2.4542 (inf) loss_scale 8192.0000 (11963.6833) mem 7379MB [2024-08-26 04:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][350/1251] eta 0:03:42 lr 0.000995 wd 0.0500 time 0.2681 (0.2468) data time 0.0011 (0.0025) model time 0.2670 (0.2450) loss 3.7795 (3.7805) grad_norm 1.4875 (inf) loss_scale 8192.0000 (11856.2279) mem 7379MB [2024-08-26 04:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][360/1251] eta 0:03:39 lr 0.000995 wd 0.0500 time 0.2401 (0.2466) data time 0.0011 (0.0025) model time 0.2390 (0.2449) loss 4.0265 (3.7840) grad_norm 1.8199 (inf) loss_scale 8192.0000 (11754.7258) mem 7379MB [2024-08-26 04:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][370/1251] eta 0:03:37 lr 0.000995 wd 0.0500 time 0.2352 (0.2465) data time 0.0011 (0.0024) model time 0.2342 (0.2447) loss 3.7953 (3.7867) grad_norm 1.7837 (inf) loss_scale 8192.0000 (11658.6954) mem 7379MB [2024-08-26 04:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][380/1251] eta 0:03:34 lr 0.000995 wd 0.0500 time 0.2442 (0.2464) data time 0.0013 (0.0024) model time 0.2430 (0.2446) loss 4.4418 (3.7964) grad_norm 2.4762 (inf) loss_scale 8192.0000 (11567.7060) mem 7379MB [2024-08-26 04:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][390/1251] eta 0:03:32 lr 0.000995 wd 0.0500 time 0.2420 (0.2463) data time 0.0010 (0.0024) model time 0.2410 (0.2445) loss 4.2895 (3.7920) grad_norm 2.8791 (inf) loss_scale 8192.0000 (11481.3708) mem 7379MB [2024-08-26 04:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[32/300][400/1251] eta 0:03:29 lr 0.000995 wd 0.0500 time 0.2365 (0.2462) data time 0.0009 (0.0024) model time 0.2356 (0.2444) loss 3.1234 (3.7821) grad_norm 1.5148 (inf) loss_scale 8192.0000 (11399.3416) mem 7379MB [2024-08-26 04:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][410/1251] eta 0:03:26 lr 0.000995 wd 0.0500 time 0.2347 (0.2460) data time 0.0011 (0.0023) model time 0.2336 (0.2443) loss 3.1596 (3.7800) grad_norm 1.3373 (inf) loss_scale 8192.0000 (11321.3041) mem 7379MB [2024-08-26 04:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][420/1251] eta 0:03:24 lr 0.000995 wd 0.0500 time 0.2377 (0.2459) data time 0.0012 (0.0023) model time 0.2365 (0.2441) loss 3.3895 (3.7840) grad_norm 1.4254 (inf) loss_scale 8192.0000 (11246.9739) mem 7379MB [2024-08-26 04:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][430/1251] eta 0:03:21 lr 0.000995 wd 0.0500 time 0.2377 (0.2458) data time 0.0011 (0.0023) model time 0.2365 (0.2440) loss 4.2497 (3.7852) grad_norm 3.6113 (inf) loss_scale 8192.0000 (11176.0928) mem 7379MB [2024-08-26 04:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][440/1251] eta 0:03:19 lr 0.000995 wd 0.0500 time 0.2447 (0.2457) data time 0.0008 (0.0022) model time 0.2440 (0.2440) loss 4.5949 (3.7911) grad_norm 2.3845 (inf) loss_scale 8192.0000 (11108.4263) mem 7379MB [2024-08-26 04:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][450/1251] eta 0:03:16 lr 0.000995 wd 0.0500 time 0.2380 (0.2456) data time 0.0013 (0.0022) model time 0.2367 (0.2439) loss 3.7531 (3.7902) grad_norm 1.6324 (inf) loss_scale 8192.0000 (11043.7605) mem 7379MB [2024-08-26 04:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][460/1251] eta 0:03:14 lr 0.000995 wd 0.0500 time 0.2394 (0.2455) data time 0.0008 (0.0022) model time 0.2386 (0.2438) loss 4.6616 (3.7958) grad_norm 3.5522 (inf) loss_scale 8192.0000 (10981.9002) mem 7379MB 
[2024-08-26 04:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][470/1251] eta 0:03:11 lr 0.000995 wd 0.0500 time 0.2466 (0.2455) data time 0.0012 (0.0022) model time 0.2454 (0.2437) loss 3.8657 (3.7924) grad_norm 1.2960 (inf) loss_scale 8192.0000 (10922.6667) mem 7379MB [2024-08-26 04:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][480/1251] eta 0:03:09 lr 0.000995 wd 0.0500 time 0.2407 (0.2454) data time 0.0010 (0.0022) model time 0.2397 (0.2437) loss 3.3697 (3.7875) grad_norm 1.4960 (inf) loss_scale 8192.0000 (10865.8960) mem 7379MB [2024-08-26 04:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][490/1251] eta 0:03:06 lr 0.000995 wd 0.0500 time 0.2399 (0.2453) data time 0.0009 (0.0021) model time 0.2391 (0.2436) loss 4.5762 (3.7923) grad_norm 1.3002 (inf) loss_scale 8192.0000 (10811.4379) mem 7379MB [2024-08-26 04:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][500/1251] eta 0:03:04 lr 0.000995 wd 0.0500 time 0.2400 (0.2453) data time 0.0007 (0.0021) model time 0.2394 (0.2435) loss 4.2601 (3.7977) grad_norm 1.9897 (inf) loss_scale 8192.0000 (10759.1537) mem 7379MB [2024-08-26 04:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][510/1251] eta 0:03:01 lr 0.000995 wd 0.0500 time 0.2441 (0.2452) data time 0.0010 (0.0021) model time 0.2432 (0.2435) loss 4.1472 (3.7931) grad_norm 1.5410 (inf) loss_scale 8192.0000 (10708.9159) mem 7379MB [2024-08-26 04:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][520/1251] eta 0:02:59 lr 0.000995 wd 0.0500 time 0.2381 (0.2451) data time 0.0010 (0.0021) model time 0.2371 (0.2434) loss 3.9056 (3.7949) grad_norm 3.0862 (inf) loss_scale 8192.0000 (10660.6065) mem 7379MB [2024-08-26 04:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][530/1251] eta 0:02:56 lr 0.000995 wd 0.0500 time 0.2369 (0.2450) data time 0.0009 (0.0020) model time 0.2361 (0.2433) loss 
2.6506 (3.7913) grad_norm 1.9639 (inf) loss_scale 8192.0000 (10614.1168) mem 7379MB [2024-08-26 04:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][540/1251] eta 0:02:54 lr 0.000995 wd 0.0500 time 0.2431 (0.2450) data time 0.0011 (0.0020) model time 0.2420 (0.2433) loss 2.9835 (3.7917) grad_norm 1.3714 (inf) loss_scale 8192.0000 (10569.3457) mem 7379MB [2024-08-26 04:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][550/1251] eta 0:02:51 lr 0.000995 wd 0.0500 time 0.2461 (0.2448) data time 0.0010 (0.0020) model time 0.2451 (0.2432) loss 3.7221 (3.7897) grad_norm 2.0562 (inf) loss_scale 8192.0000 (10526.1996) mem 7379MB [2024-08-26 04:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][560/1251] eta 0:02:49 lr 0.000995 wd 0.0500 time 0.2340 (0.2448) data time 0.0010 (0.0020) model time 0.2330 (0.2431) loss 3.2692 (3.7910) grad_norm 1.3148 (inf) loss_scale 8192.0000 (10484.5918) mem 7379MB [2024-08-26 04:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][570/1251] eta 0:02:46 lr 0.000995 wd 0.0500 time 0.2427 (0.2447) data time 0.0007 (0.0020) model time 0.2420 (0.2430) loss 2.9129 (3.7869) grad_norm 3.1535 (inf) loss_scale 8192.0000 (10444.4413) mem 7379MB [2024-08-26 04:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][580/1251] eta 0:02:44 lr 0.000995 wd 0.0500 time 0.2409 (0.2447) data time 0.0009 (0.0020) model time 0.2401 (0.2430) loss 4.3989 (3.7913) grad_norm 1.6018 (inf) loss_scale 8192.0000 (10405.6730) mem 7379MB [2024-08-26 04:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][590/1251] eta 0:02:41 lr 0.000995 wd 0.0500 time 0.2331 (0.2446) data time 0.0009 (0.0019) model time 0.2322 (0.2429) loss 3.7829 (3.7893) grad_norm 2.3699 (inf) loss_scale 8192.0000 (10368.2166) mem 7379MB [2024-08-26 04:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][600/1251] eta 0:02:39 lr 0.000995 wd 0.0500 
time 0.2439 (0.2445) data time 0.0009 (0.0019) model time 0.2430 (0.2428) loss 3.7646 (3.7870) grad_norm 1.4523 (inf) loss_scale 8192.0000 (10332.0067) mem 7379MB [2024-08-26 04:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][610/1251] eta 0:02:36 lr 0.000995 wd 0.0500 time 0.2343 (0.2445) data time 0.0011 (0.0019) model time 0.2332 (0.2428) loss 4.3349 (3.7890) grad_norm 2.9668 (inf) loss_scale 8192.0000 (10296.9820) mem 7379MB [2024-08-26 04:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][620/1251] eta 0:02:34 lr 0.000995 wd 0.0500 time 0.2407 (0.2444) data time 0.0011 (0.0019) model time 0.2396 (0.2428) loss 4.1373 (3.7899) grad_norm 3.3483 (inf) loss_scale 8192.0000 (10263.0853) mem 7379MB [2024-08-26 04:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][630/1251] eta 0:02:31 lr 0.000995 wd 0.0500 time 0.2436 (0.2447) data time 0.0009 (0.0019) model time 0.2427 (0.2431) loss 3.6873 (3.7893) grad_norm 1.8974 (inf) loss_scale 8192.0000 (10230.2631) mem 7379MB [2024-08-26 04:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][640/1251] eta 0:02:29 lr 0.000995 wd 0.0500 time 0.2397 (0.2447) data time 0.0009 (0.0019) model time 0.2388 (0.2430) loss 2.4860 (3.7846) grad_norm 1.4421 (inf) loss_scale 8192.0000 (10198.4649) mem 7379MB [2024-08-26 04:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][650/1251] eta 0:02:26 lr 0.000995 wd 0.0500 time 0.2349 (0.2446) data time 0.0011 (0.0019) model time 0.2338 (0.2430) loss 3.8615 (3.7821) grad_norm 1.6819 (inf) loss_scale 8192.0000 (10167.6436) mem 7379MB [2024-08-26 04:58:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][660/1251] eta 0:02:24 lr 0.000995 wd 0.0500 time 0.2438 (0.2445) data time 0.0010 (0.0018) model time 0.2428 (0.2429) loss 2.9379 (3.7805) grad_norm 1.6755 (inf) loss_scale 8192.0000 (10137.7549) mem 7379MB [2024-08-26 04:58:05 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [32/300][670/1251] eta 0:02:22 lr 0.000995 wd 0.0500 time 0.2404 (0.2445) data time 0.0009 (0.0018) model time 0.2395 (0.2429) loss 4.3678 (3.7827) grad_norm 1.7875 (inf) loss_scale 8192.0000 (10108.7571) mem 7379MB [2024-08-26 04:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][680/1251] eta 0:02:19 lr 0.000995 wd 0.0500 time 0.2421 (0.2445) data time 0.0008 (0.0018) model time 0.2413 (0.2429) loss 3.1176 (3.7793) grad_norm 3.1917 (inf) loss_scale 8192.0000 (10080.6109) mem 7379MB [2024-08-26 04:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][690/1251] eta 0:02:17 lr 0.000995 wd 0.0500 time 0.2336 (0.2444) data time 0.0008 (0.0018) model time 0.2329 (0.2429) loss 4.5533 (3.7842) grad_norm 1.7696 (inf) loss_scale 8192.0000 (10053.2793) mem 7379MB [2024-08-26 04:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][700/1251] eta 0:02:14 lr 0.000995 wd 0.0500 time 0.2443 (0.2444) data time 0.0009 (0.0018) model time 0.2435 (0.2428) loss 3.6181 (3.7818) grad_norm 1.9780 (inf) loss_scale 8192.0000 (10026.7275) mem 7379MB [2024-08-26 04:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][710/1251] eta 0:02:12 lr 0.000995 wd 0.0500 time 0.2392 (0.2444) data time 0.0009 (0.0018) model time 0.2384 (0.2428) loss 3.8750 (3.7851) grad_norm 1.8390 (inf) loss_scale 8192.0000 (10000.9226) mem 7379MB [2024-08-26 04:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][720/1251] eta 0:02:09 lr 0.000995 wd 0.0500 time 0.2390 (0.2443) data time 0.0007 (0.0018) model time 0.2383 (0.2427) loss 3.9958 (3.7834) grad_norm 1.4603 (inf) loss_scale 8192.0000 (9975.8336) mem 7379MB [2024-08-26 04:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][730/1251] eta 0:02:07 lr 0.000995 wd 0.0500 time 0.2416 (0.2443) data time 0.0009 (0.0018) model time 0.2407 (0.2427) loss 3.8831 (3.7829) grad_norm 2.4595 (inf) 
loss_scale 8192.0000 (9951.4309) mem 7379MB [2024-08-26 04:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][740/1251] eta 0:02:04 lr 0.000995 wd 0.0500 time 0.2421 (0.2443) data time 0.0011 (0.0018) model time 0.2410 (0.2427) loss 3.0434 (3.7878) grad_norm 2.1158 (inf) loss_scale 8192.0000 (9927.6869) mem 7379MB [2024-08-26 04:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][750/1251] eta 0:02:02 lr 0.000995 wd 0.0500 time 0.2426 (0.2442) data time 0.0011 (0.0018) model time 0.2415 (0.2427) loss 3.7328 (3.7835) grad_norm 2.2069 (inf) loss_scale 8192.0000 (9904.5752) mem 7379MB [2024-08-26 04:58:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][760/1251] eta 0:01:59 lr 0.000995 wd 0.0500 time 0.2337 (0.2442) data time 0.0011 (0.0017) model time 0.2326 (0.2426) loss 3.0353 (3.7817) grad_norm 1.5470 (inf) loss_scale 8192.0000 (9882.0710) mem 7379MB [2024-08-26 04:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][770/1251] eta 0:01:57 lr 0.000995 wd 0.0500 time 0.2473 (0.2442) data time 0.0010 (0.0017) model time 0.2463 (0.2427) loss 3.7923 (3.7851) grad_norm 1.7654 (inf) loss_scale 8192.0000 (9860.1505) mem 7379MB [2024-08-26 04:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][780/1251] eta 0:01:55 lr 0.000995 wd 0.0500 time 0.2483 (0.2442) data time 0.0010 (0.0017) model time 0.2473 (0.2427) loss 3.9025 (3.7865) grad_norm 1.7705 (inf) loss_scale 8192.0000 (9838.7913) mem 7379MB [2024-08-26 04:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][790/1251] eta 0:01:52 lr 0.000995 wd 0.0500 time 0.2358 (0.2442) data time 0.0012 (0.0017) model time 0.2346 (0.2427) loss 3.4226 (3.7834) grad_norm 1.5791 (inf) loss_scale 8192.0000 (9817.9722) mem 7379MB [2024-08-26 04:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][800/1251] eta 0:01:50 lr 0.000995 wd 0.0500 time 0.2433 (0.2442) data time 0.0009 
(0.0017) model time 0.2423 (0.2427) loss 4.1820 (3.7879) grad_norm 1.5641 (inf) loss_scale 8192.0000 (9797.6729) mem 7379MB [2024-08-26 04:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][810/1251] eta 0:01:47 lr 0.000995 wd 0.0500 time 0.2380 (0.2442) data time 0.0010 (0.0017) model time 0.2370 (0.2427) loss 4.2175 (3.7908) grad_norm 1.6308 (inf) loss_scale 8192.0000 (9777.8742) mem 7379MB [2024-08-26 04:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][820/1251] eta 0:01:45 lr 0.000995 wd 0.0500 time 0.2477 (0.2442) data time 0.0007 (0.0017) model time 0.2470 (0.2426) loss 4.3803 (3.7900) grad_norm 2.1236 (inf) loss_scale 8192.0000 (9758.5579) mem 7379MB [2024-08-26 04:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][830/1251] eta 0:01:42 lr 0.000995 wd 0.0500 time 0.2400 (0.2441) data time 0.0010 (0.0017) model time 0.2390 (0.2426) loss 2.5065 (3.7860) grad_norm 1.3520 (inf) loss_scale 8192.0000 (9739.7064) mem 7379MB [2024-08-26 04:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][840/1251] eta 0:01:40 lr 0.000995 wd 0.0500 time 0.2409 (0.2441) data time 0.0008 (0.0017) model time 0.2401 (0.2426) loss 4.4092 (3.7864) grad_norm 1.5392 (inf) loss_scale 8192.0000 (9721.3032) mem 7379MB [2024-08-26 04:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][850/1251] eta 0:01:37 lr 0.000995 wd 0.0500 time 0.2431 (0.2441) data time 0.0009 (0.0017) model time 0.2422 (0.2426) loss 2.5526 (3.7820) grad_norm 1.5524 (inf) loss_scale 8192.0000 (9703.3325) mem 7379MB [2024-08-26 04:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][860/1251] eta 0:01:35 lr 0.000995 wd 0.0500 time 0.2452 (0.2441) data time 0.0007 (0.0017) model time 0.2445 (0.2426) loss 4.4396 (3.7828) grad_norm 1.8387 (inf) loss_scale 8192.0000 (9685.7793) mem 7379MB [2024-08-26 04:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][870/1251] 
eta 0:01:32 lr 0.000995 wd 0.0500 time 0.2466 (0.2441) data time 0.0009 (0.0017) model time 0.2457 (0.2426) loss 3.8667 (3.7830) grad_norm 2.0301 (inf) loss_scale 8192.0000 (9668.6292) mem 7379MB [2024-08-26 04:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][880/1251] eta 0:01:30 lr 0.000995 wd 0.0500 time 0.2479 (0.2441) data time 0.0007 (0.0016) model time 0.2472 (0.2426) loss 4.7898 (3.7880) grad_norm 1.7216 (inf) loss_scale 8192.0000 (9651.8683) mem 7379MB [2024-08-26 04:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][890/1251] eta 0:01:28 lr 0.000995 wd 0.0500 time 0.2378 (0.2440) data time 0.0008 (0.0016) model time 0.2370 (0.2425) loss 2.6238 (3.7859) grad_norm 1.8327 (inf) loss_scale 8192.0000 (9635.4837) mem 7379MB [2024-08-26 04:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][900/1251] eta 0:01:25 lr 0.000995 wd 0.0500 time 0.2361 (0.2440) data time 0.0010 (0.0016) model time 0.2350 (0.2425) loss 3.4911 (3.7836) grad_norm 1.5428 (inf) loss_scale 8192.0000 (9619.4628) mem 7379MB [2024-08-26 04:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][910/1251] eta 0:01:23 lr 0.000995 wd 0.0500 time 0.2327 (0.2440) data time 0.0011 (0.0016) model time 0.2316 (0.2425) loss 3.9244 (3.7840) grad_norm 1.4979 (inf) loss_scale 8192.0000 (9603.7936) mem 7379MB [2024-08-26 04:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][920/1251] eta 0:01:20 lr 0.000995 wd 0.0500 time 0.2389 (0.2440) data time 0.0008 (0.0016) model time 0.2382 (0.2425) loss 4.3494 (3.7841) grad_norm 1.4681 (inf) loss_scale 8192.0000 (9588.4647) mem 7379MB [2024-08-26 04:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][930/1251] eta 0:01:18 lr 0.000995 wd 0.0500 time 0.2370 (0.2439) data time 0.0007 (0.0016) model time 0.2363 (0.2425) loss 2.6529 (3.7849) grad_norm 1.5320 (inf) loss_scale 8192.0000 (9573.4651) mem 7379MB [2024-08-26 04:59:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][940/1251] eta 0:01:15 lr 0.000995 wd 0.0500 time 0.2446 (0.2439) data time 0.0009 (0.0016) model time 0.2437 (0.2425) loss 3.8925 (3.7853) grad_norm 1.6019 (inf) loss_scale 8192.0000 (9558.7843) mem 7379MB [2024-08-26 04:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][950/1251] eta 0:01:13 lr 0.000995 wd 0.0500 time 0.2333 (0.2439) data time 0.0011 (0.0016) model time 0.2321 (0.2424) loss 4.4003 (3.7838) grad_norm 1.9834 (inf) loss_scale 8192.0000 (9544.4122) mem 7379MB [2024-08-26 04:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][960/1251] eta 0:01:10 lr 0.000995 wd 0.0500 time 0.2457 (0.2439) data time 0.0007 (0.0016) model time 0.2450 (0.2424) loss 3.1573 (3.7824) grad_norm 2.9137 (inf) loss_scale 8192.0000 (9530.3392) mem 7379MB [2024-08-26 04:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][970/1251] eta 0:01:08 lr 0.000995 wd 0.0500 time 0.2455 (0.2439) data time 0.0008 (0.0016) model time 0.2448 (0.2424) loss 3.0755 (3.7816) grad_norm 2.0357 (inf) loss_scale 8192.0000 (9516.5561) mem 7379MB [2024-08-26 04:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][980/1251] eta 0:01:06 lr 0.000995 wd 0.0500 time 0.2449 (0.2438) data time 0.0010 (0.0016) model time 0.2439 (0.2424) loss 4.3883 (3.7839) grad_norm 2.1209 (inf) loss_scale 8192.0000 (9503.0540) mem 7379MB [2024-08-26 04:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][990/1251] eta 0:01:03 lr 0.000995 wd 0.0500 time 0.2355 (0.2438) data time 0.0009 (0.0016) model time 0.2346 (0.2423) loss 4.3357 (3.7857) grad_norm 1.9333 (inf) loss_scale 8192.0000 (9489.8244) mem 7379MB [2024-08-26 04:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1000/1251] eta 0:01:01 lr 0.000995 wd 0.0500 time 0.2513 (0.2438) data time 0.0008 (0.0016) model time 0.2505 (0.2423) loss 4.6554 (3.7885) grad_norm 
2.1711 (inf) loss_scale 8192.0000 (9476.8591) mem 7379MB [2024-08-26 04:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1010/1251] eta 0:00:58 lr 0.000995 wd 0.0500 time 0.2406 (0.2437) data time 0.0009 (0.0016) model time 0.2396 (0.2423) loss 3.0903 (3.7889) grad_norm 2.2006 (inf) loss_scale 8192.0000 (9464.1503) mem 7379MB [2024-08-26 04:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1020/1251] eta 0:00:56 lr 0.000995 wd 0.0500 time 0.2396 (0.2437) data time 0.0009 (0.0016) model time 0.2387 (0.2423) loss 2.8503 (3.7854) grad_norm 1.4348 (inf) loss_scale 8192.0000 (9451.6905) mem 7379MB [2024-08-26 04:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1030/1251] eta 0:00:53 lr 0.000995 wd 0.0500 time 0.2420 (0.2437) data time 0.0010 (0.0016) model time 0.2411 (0.2423) loss 4.0146 (3.7851) grad_norm 1.7148 (inf) loss_scale 8192.0000 (9439.4724) mem 7379MB [2024-08-26 04:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1040/1251] eta 0:00:51 lr 0.000995 wd 0.0500 time 0.2451 (0.2437) data time 0.0011 (0.0015) model time 0.2439 (0.2422) loss 4.0222 (3.7869) grad_norm 2.4278 (inf) loss_scale 8192.0000 (9427.4890) mem 7379MB [2024-08-26 04:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1050/1251] eta 0:00:48 lr 0.000995 wd 0.0500 time 0.2358 (0.2436) data time 0.0009 (0.0015) model time 0.2350 (0.2422) loss 3.5030 (3.7898) grad_norm 2.3610 (inf) loss_scale 8192.0000 (9415.7336) mem 7379MB [2024-08-26 04:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1060/1251] eta 0:00:46 lr 0.000995 wd 0.0500 time 0.2373 (0.2436) data time 0.0008 (0.0015) model time 0.2365 (0.2422) loss 4.3056 (3.7915) grad_norm 2.4843 (inf) loss_scale 8192.0000 (9404.1998) mem 7379MB [2024-08-26 04:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1070/1251] eta 0:00:44 lr 0.000995 wd 0.0500 time 0.2394 (0.2436) data 
time 0.0010 (0.0015) model time 0.2385 (0.2422) loss 4.1536 (3.7949) grad_norm 1.4665 (inf) loss_scale 8192.0000 (9392.8814) mem 7379MB [2024-08-26 04:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1080/1251] eta 0:00:41 lr 0.000995 wd 0.0500 time 0.2447 (0.2435) data time 0.0010 (0.0015) model time 0.2437 (0.2421) loss 3.1932 (3.7932) grad_norm 2.0971 (inf) loss_scale 8192.0000 (9381.7724) mem 7379MB [2024-08-26 04:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1090/1251] eta 0:00:39 lr 0.000995 wd 0.0500 time 0.2482 (0.2436) data time 0.0014 (0.0015) model time 0.2468 (0.2421) loss 2.7187 (3.7892) grad_norm 1.3643 (inf) loss_scale 8192.0000 (9370.8671) mem 7379MB [2024-08-26 04:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1100/1251] eta 0:00:36 lr 0.000995 wd 0.0500 time 0.2440 (0.2435) data time 0.0010 (0.0015) model time 0.2430 (0.2421) loss 3.3519 (3.7889) grad_norm 1.6899 (inf) loss_scale 8192.0000 (9360.1599) mem 7379MB [2024-08-26 04:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1110/1251] eta 0:00:34 lr 0.000995 wd 0.0500 time 0.2477 (0.2435) data time 0.0008 (0.0015) model time 0.2469 (0.2421) loss 2.9154 (3.7869) grad_norm 1.8425 (inf) loss_scale 8192.0000 (9349.6454) mem 7379MB [2024-08-26 04:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1120/1251] eta 0:00:31 lr 0.000995 wd 0.0500 time 0.2511 (0.2435) data time 0.0008 (0.0015) model time 0.2503 (0.2422) loss 4.3738 (3.7867) grad_norm 1.4783 (inf) loss_scale 8192.0000 (9339.3185) mem 7379MB [2024-08-26 04:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1130/1251] eta 0:00:29 lr 0.000995 wd 0.0500 time 0.2378 (0.2435) data time 0.0010 (0.0015) model time 0.2368 (0.2421) loss 3.9164 (3.7877) grad_norm 1.6379 (inf) loss_scale 8192.0000 (9329.1742) mem 7379MB [2024-08-26 04:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[32/300][1140/1251] eta 0:00:27 lr 0.000995 wd 0.0500 time 0.2379 (0.2435) data time 0.0010 (0.0015) model time 0.2368 (0.2421) loss 4.6508 (3.7898) grad_norm 1.8910 (inf) loss_scale 8192.0000 (9319.2077) mem 7379MB [2024-08-26 05:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1150/1251] eta 0:00:24 lr 0.000995 wd 0.0500 time 0.2324 (0.2435) data time 0.0009 (0.0015) model time 0.2315 (0.2421) loss 4.3808 (3.7906) grad_norm 1.3997 (inf) loss_scale 8192.0000 (9309.4144) mem 7379MB [2024-08-26 05:00:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1160/1251] eta 0:00:22 lr 0.000995 wd 0.0500 time 0.2391 (0.2434) data time 0.0009 (0.0015) model time 0.2382 (0.2421) loss 4.7705 (3.7905) grad_norm 2.3777 (inf) loss_scale 8192.0000 (9299.7898) mem 7379MB [2024-08-26 05:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1170/1251] eta 0:00:19 lr 0.000995 wd 0.0500 time 0.2407 (0.2434) data time 0.0010 (0.0015) model time 0.2397 (0.2420) loss 4.1524 (3.7916) grad_norm 1.7947 (inf) loss_scale 8192.0000 (9290.3296) mem 7379MB [2024-08-26 05:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1180/1251] eta 0:00:17 lr 0.000995 wd 0.0500 time 0.2487 (0.2434) data time 0.0008 (0.0015) model time 0.2478 (0.2420) loss 3.8118 (3.7919) grad_norm 2.0927 (inf) loss_scale 8192.0000 (9281.0296) mem 7379MB [2024-08-26 05:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1190/1251] eta 0:00:14 lr 0.000995 wd 0.0500 time 0.2406 (0.2434) data time 0.0008 (0.0015) model time 0.2399 (0.2420) loss 3.0974 (3.7906) grad_norm 2.4635 (inf) loss_scale 8192.0000 (9271.8858) mem 7379MB [2024-08-26 05:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1200/1251] eta 0:00:12 lr 0.000995 wd 0.0500 time 0.2437 (0.2434) data time 0.0009 (0.0015) model time 0.2428 (0.2420) loss 2.6628 (3.7896) grad_norm 1.7607 (inf) loss_scale 8192.0000 (9262.8943) mem 7379MB 
[2024-08-26 05:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1210/1251] eta 0:00:09 lr 0.000995 wd 0.0500 time 0.2377 (0.2433) data time 0.0008 (0.0015) model time 0.2370 (0.2420) loss 4.5544 (3.7889) grad_norm 2.0617 (inf) loss_scale 8192.0000 (9254.0512) mem 7379MB
[2024-08-26 05:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1220/1251] eta 0:00:07 lr 0.000995 wd 0.0500 time 0.2337 (0.2433) data time 0.0012 (0.0015) model time 0.2325 (0.2420) loss 3.5139 (3.7885) grad_norm 1.9989 (inf) loss_scale 8192.0000 (9245.3530) mem 7379MB
[2024-08-26 05:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1230/1251] eta 0:00:05 lr 0.000995 wd 0.0500 time 0.2351 (0.2433) data time 0.0009 (0.0015) model time 0.2342 (0.2419) loss 2.7888 (3.7866) grad_norm 1.5474 (inf) loss_scale 8192.0000 (9236.7961) mem 7379MB
[2024-08-26 05:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1240/1251] eta 0:00:02 lr 0.000995 wd 0.0500 time 0.2252 (0.2434) data time 0.0005 (0.0015) model time 0.2247 (0.2421) loss 4.8767 (3.7879) grad_norm 1.4933 (inf) loss_scale 8192.0000 (9228.3771) mem 7379MB
[2024-08-26 05:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [32/300][1250/1251] eta 0:00:00 lr 0.000995 wd 0.0500 time 0.2252 (0.2436) data time 0.0007 (0.0015) model time 0.2245 (0.2423) loss 4.0328 (3.7900) grad_norm 2.0622 (inf) loss_scale 8192.0000 (9220.0927) mem 7379MB
[2024-08-26 05:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 32 training takes 0:05:04
[2024-08-26 05:00:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 05:00:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 05:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.504 (0.504) Loss 0.6313 (0.6313) Acc@1 87.305 (87.305) Acc@5 97.559 (97.559) Mem 7379MB
[2024-08-26 05:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.119) Loss 0.9971 (0.9855) Acc@1 77.637 (77.876) Acc@5 94.336 (94.585) Mem 7379MB
[2024-08-26 05:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.100) Loss 1.4961 (0.9999) Acc@1 66.016 (77.130) Acc@5 88.184 (94.513) Mem 7379MB
[2024-08-26 05:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.092) Loss 1.7188 (1.1464) Acc@1 61.914 (73.970) Acc@5 82.715 (92.436) Mem 7379MB
[2024-08-26 05:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.6318 (1.2311) Acc@1 62.695 (72.113) Acc@5 86.426 (91.318) Mem 7379MB
[2024-08-26 05:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 71.812 Acc@5 91.218
[2024-08-26 05:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 71.8%
[2024-08-26 05:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 71.81%
[2024-08-26 05:00:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:00:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.505 (0.505) Loss 0.5181 (0.5181) Acc@1 88.184 (88.184) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 05:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.116) Loss 0.8667 (0.8567) Acc@1 79.785 (79.048) Acc@5 95.020 (94.957) Mem 7379MB
[2024-08-26 05:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.098) Loss 1.2705 (0.8703) Acc@1 68.652 (78.362) Acc@5 90.137 (94.834) Mem 7379MB
[2024-08-26 05:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.093) Loss 1.5781 (1.0146) Acc@1 61.230 (75.205) Acc@5 84.766 (92.884) Mem 7379MB
[2024-08-26 05:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.5469 (1.1018) Acc@1 62.695 (73.323) Acc@5 86.523 (91.756) Mem 7379MB
[2024-08-26 05:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.046 Acc@5 91.634
[2024-08-26 05:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 73.0%
[2024-08-26 05:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 73.05%
[2024-08-26 05:00:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:00:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][0/1251] eta 0:14:12 lr 0.000995 wd 0.0500 time 0.6813 (0.6813) data time 0.4206 (0.4206) model time 0.0000 (0.0000) loss 3.5100 (3.5100) grad_norm 3.2269 (3.2269) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][10/1251] eta 0:06:12 lr 0.000995 wd 0.0500 time 0.2365 (0.3005) data time 0.0007 (0.0391) model time 0.0000 (0.0000) loss 2.7774 (3.4996) grad_norm 1.7353 (1.9713) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][20/1251] eta 0:05:36 lr 0.000995 wd 0.0500 time 0.2422 (0.2732) data time 0.0012 (0.0210) model time 0.0000 (0.0000) loss 4.0623 (3.6528) grad_norm 2.1068 (2.0431) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][30/1251] eta 0:05:21 lr 0.000995 wd 0.0500 time 0.2419 (0.2630) data time 0.0011 (0.0146) model time 0.0000 (0.0000) loss 4.1455 (3.6037) grad_norm 1.6079 (2.0606) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][40/1251] eta 0:05:11 lr 0.000995 wd 0.0500 time 0.2432 (0.2570) data time 0.0009 (0.0113) model time 0.0000 (0.0000) loss 4.4433 (3.6761) grad_norm 1.6820 (1.9862) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][50/1251] eta 0:05:05 lr 0.000995 wd 0.0500 time 0.2435 (0.2542) data time 0.0009 (0.0093) model time 0.0000 (0.0000) loss 4.0647 (3.6997) grad_norm 1.5941 (1.9224) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][60/1251] eta 0:05:00 lr 0.000995 wd 0.0500 time 0.2401 (0.2523) data time 0.0009 (0.0080) model time 0.2392 (0.2407) loss 
3.0709 (3.6908) grad_norm 1.8481 (1.9033) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][70/1251] eta 0:04:56 lr 0.000995 wd 0.0500 time 0.2366 (0.2508) data time 0.0010 (0.0070) model time 0.2356 (0.2406) loss 4.6513 (3.6885) grad_norm 2.6394 (1.9285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][80/1251] eta 0:04:55 lr 0.000995 wd 0.0500 time 0.2372 (0.2519) data time 0.0007 (0.0063) model time 0.2365 (0.2468) loss 4.8701 (3.6651) grad_norm 1.6886 (1.9281) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][90/1251] eta 0:04:51 lr 0.000995 wd 0.0500 time 0.2482 (0.2507) data time 0.0008 (0.0057) model time 0.2475 (0.2450) loss 3.3515 (3.6813) grad_norm 1.6808 (1.9288) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][100/1251] eta 0:04:47 lr 0.000995 wd 0.0500 time 0.2409 (0.2499) data time 0.0009 (0.0052) model time 0.2400 (0.2444) loss 3.2966 (3.6800) grad_norm 1.2492 (1.9417) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][110/1251] eta 0:04:44 lr 0.000995 wd 0.0500 time 0.2445 (0.2491) data time 0.0009 (0.0048) model time 0.2436 (0.2436) loss 3.9157 (3.6911) grad_norm 1.4746 (1.9288) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][120/1251] eta 0:04:40 lr 0.000995 wd 0.0500 time 0.2411 (0.2483) data time 0.0010 (0.0045) model time 0.2401 (0.2430) loss 3.1606 (3.6883) grad_norm 2.5826 (1.9420) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][130/1251] eta 0:04:37 lr 
0.000995 wd 0.0500 time 0.2412 (0.2477) data time 0.0008 (0.0043) model time 0.2404 (0.2424) loss 3.9409 (3.7099) grad_norm 1.9572 (1.9499) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][140/1251] eta 0:04:34 lr 0.000995 wd 0.0500 time 0.2370 (0.2472) data time 0.0009 (0.0040) model time 0.2361 (0.2422) loss 4.4216 (3.7049) grad_norm 1.2593 (1.9271) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][150/1251] eta 0:04:31 lr 0.000995 wd 0.0500 time 0.2330 (0.2467) data time 0.0009 (0.0038) model time 0.2321 (0.2418) loss 3.8125 (3.7250) grad_norm 2.3379 (1.9231) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][160/1251] eta 0:04:28 lr 0.000995 wd 0.0500 time 0.2435 (0.2464) data time 0.0007 (0.0037) model time 0.2428 (0.2418) loss 4.3180 (3.7379) grad_norm 1.8443 (1.9342) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][170/1251] eta 0:04:26 lr 0.000995 wd 0.0500 time 0.2399 (0.2461) data time 0.0010 (0.0035) model time 0.2389 (0.2416) loss 3.5761 (3.7400) grad_norm 1.8546 (1.9538) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][180/1251] eta 0:04:23 lr 0.000995 wd 0.0500 time 0.2406 (0.2458) data time 0.0008 (0.0034) model time 0.2399 (0.2414) loss 2.2780 (3.7312) grad_norm 1.8104 (1.9568) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][190/1251] eta 0:04:20 lr 0.000995 wd 0.0500 time 0.2475 (0.2457) data time 0.0007 (0.0032) model time 0.2467 (0.2415) loss 3.3284 (3.7292) grad_norm 2.0823 (1.9590) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:26 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][200/1251] eta 0:04:17 lr 0.000995 wd 0.0500 time 0.2468 (0.2454) data time 0.0007 (0.0031) model time 0.2461 (0.2414) loss 3.2236 (3.7404) grad_norm 2.1790 (1.9498) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][210/1251] eta 0:04:15 lr 0.000995 wd 0.0500 time 0.2365 (0.2452) data time 0.0011 (0.0030) model time 0.2354 (0.2413) loss 3.7925 (3.7525) grad_norm 2.3353 (1.9487) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][220/1251] eta 0:04:12 lr 0.000995 wd 0.0500 time 0.2462 (0.2452) data time 0.0009 (0.0029) model time 0.2453 (0.2414) loss 3.7549 (3.7490) grad_norm 1.6509 (1.9407) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][230/1251] eta 0:04:10 lr 0.000995 wd 0.0500 time 0.2401 (0.2450) data time 0.0009 (0.0029) model time 0.2391 (0.2413) loss 3.0463 (3.7372) grad_norm 1.5397 (1.9357) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][240/1251] eta 0:04:07 lr 0.000995 wd 0.0500 time 0.2426 (0.2449) data time 0.0007 (0.0028) model time 0.2419 (0.2413) loss 4.3306 (3.7363) grad_norm 1.4392 (1.9316) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][250/1251] eta 0:04:04 lr 0.000995 wd 0.0500 time 0.2363 (0.2446) data time 0.0010 (0.0027) model time 0.2353 (0.2412) loss 4.6814 (3.7479) grad_norm 1.8460 (1.9228) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][260/1251] eta 0:04:02 lr 0.000995 wd 0.0500 time 0.2414 (0.2445) data time 0.0011 (0.0026) model time 0.2403 (0.2411) loss 4.1866 
(3.7480) grad_norm 1.4864 (1.9175) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][270/1251] eta 0:03:59 lr 0.000995 wd 0.0500 time 0.2434 (0.2443) data time 0.0007 (0.0026) model time 0.2427 (0.2410) loss 3.2933 (3.7397) grad_norm 1.5420 (1.9126) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][280/1251] eta 0:03:57 lr 0.000995 wd 0.0500 time 0.2426 (0.2442) data time 0.0010 (0.0025) model time 0.2416 (0.2409) loss 3.8390 (3.7403) grad_norm 2.1473 (1.9085) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][290/1251] eta 0:03:54 lr 0.000995 wd 0.0500 time 0.2402 (0.2442) data time 0.0010 (0.0025) model time 0.2392 (0.2410) loss 3.9092 (3.7471) grad_norm 2.8381 (1.9188) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][300/1251] eta 0:03:52 lr 0.000995 wd 0.0500 time 0.2417 (0.2441) data time 0.0007 (0.0024) model time 0.2410 (0.2410) loss 3.6186 (3.7460) grad_norm 1.9738 (1.9188) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][310/1251] eta 0:03:49 lr 0.000995 wd 0.0500 time 0.2456 (0.2441) data time 0.0007 (0.0024) model time 0.2449 (0.2410) loss 4.2040 (3.7443) grad_norm 2.3783 (1.9196) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][320/1251] eta 0:03:47 lr 0.000995 wd 0.0500 time 0.2421 (0.2440) data time 0.0009 (0.0023) model time 0.2412 (0.2410) loss 4.1115 (3.7404) grad_norm 1.4937 (1.9151) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][330/1251] eta 0:03:44 lr 0.000995 wd 
0.0500 time 0.2425 (0.2440) data time 0.0010 (0.0023) model time 0.2415 (0.2411) loss 3.3365 (3.7394) grad_norm 1.5090 (1.9108) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][340/1251] eta 0:03:42 lr 0.000995 wd 0.0500 time 0.2327 (0.2439) data time 0.0009 (0.0023) model time 0.2318 (0.2410) loss 3.5778 (3.7481) grad_norm 2.2970 (1.9094) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][350/1251] eta 0:03:39 lr 0.000995 wd 0.0500 time 0.2397 (0.2438) data time 0.0011 (0.0022) model time 0.2386 (0.2410) loss 4.0126 (3.7463) grad_norm 1.9768 (1.9081) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][360/1251] eta 0:03:37 lr 0.000995 wd 0.0500 time 0.2389 (0.2438) data time 0.0008 (0.0022) model time 0.2380 (0.2410) loss 4.1489 (3.7471) grad_norm 1.2187 (1.9086) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][370/1251] eta 0:03:34 lr 0.000995 wd 0.0500 time 0.2449 (0.2437) data time 0.0011 (0.0022) model time 0.2438 (0.2410) loss 3.9062 (3.7519) grad_norm 1.8679 (1.9119) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][380/1251] eta 0:03:32 lr 0.000994 wd 0.0500 time 0.2457 (0.2437) data time 0.0010 (0.0021) model time 0.2447 (0.2410) loss 3.1752 (3.7549) grad_norm 2.6480 (1.9205) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][390/1251] eta 0:03:29 lr 0.000994 wd 0.0500 time 0.2487 (0.2436) data time 0.0010 (0.0021) model time 0.2478 (0.2409) loss 4.2640 (3.7528) grad_norm 2.1171 (1.9308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:14 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][400/1251] eta 0:03:27 lr 0.000994 wd 0.0500 time 0.2326 (0.2435) data time 0.0011 (0.0021) model time 0.2315 (0.2409) loss 4.2076 (3.7544) grad_norm 2.2881 (1.9230) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][410/1251] eta 0:03:24 lr 0.000994 wd 0.0500 time 0.2395 (0.2434) data time 0.0010 (0.0021) model time 0.2385 (0.2408) loss 2.9166 (3.7469) grad_norm 1.7334 (1.9362) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][420/1251] eta 0:03:22 lr 0.000994 wd 0.0500 time 0.2357 (0.2433) data time 0.0008 (0.0021) model time 0.2350 (0.2408) loss 4.1683 (3.7482) grad_norm 1.4856 (1.9312) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][430/1251] eta 0:03:19 lr 0.000994 wd 0.0500 time 0.2448 (0.2433) data time 0.0011 (0.0020) model time 0.2437 (0.2407) loss 3.8474 (3.7415) grad_norm 1.7153 (1.9315) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][440/1251] eta 0:03:17 lr 0.000994 wd 0.0500 time 0.2337 (0.2432) data time 0.0010 (0.0020) model time 0.2327 (0.2407) loss 4.1280 (3.7429) grad_norm 1.6329 (1.9263) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][450/1251] eta 0:03:14 lr 0.000994 wd 0.0500 time 0.2422 (0.2432) data time 0.0008 (0.0020) model time 0.2414 (0.2408) loss 3.2907 (3.7482) grad_norm 3.5458 (1.9382) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][460/1251] eta 0:03:12 lr 0.000994 wd 0.0500 time 0.2375 (0.2432) data time 0.0009 (0.0020) model time 0.2366 (0.2408) loss 4.2087 
(3.7539) grad_norm 1.7502 (1.9385) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][470/1251] eta 0:03:09 lr 0.000994 wd 0.0500 time 0.2460 (0.2432) data time 0.0010 (0.0019) model time 0.2450 (0.2408) loss 2.7303 (3.7559) grad_norm 1.3972 (1.9357) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][480/1251] eta 0:03:07 lr 0.000994 wd 0.0500 time 0.2375 (0.2432) data time 0.0009 (0.0019) model time 0.2366 (0.2408) loss 3.8416 (3.7572) grad_norm 1.6264 (1.9344) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][490/1251] eta 0:03:05 lr 0.000994 wd 0.0500 time 0.2450 (0.2432) data time 0.0009 (0.0019) model time 0.2441 (0.2408) loss 3.1097 (3.7560) grad_norm 2.2944 (1.9332) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][500/1251] eta 0:03:02 lr 0.000994 wd 0.0500 time 0.2415 (0.2432) data time 0.0007 (0.0019) model time 0.2408 (0.2409) loss 2.8879 (3.7480) grad_norm 1.3157 (1.9312) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][510/1251] eta 0:03:00 lr 0.000994 wd 0.0500 time 0.4231 (0.2441) data time 0.0009 (0.0019) model time 0.4222 (0.2419) loss 4.1288 (3.7513) grad_norm 1.2547 (1.9278) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][520/1251] eta 0:02:58 lr 0.000994 wd 0.0500 time 0.2366 (0.2441) data time 0.0010 (0.0019) model time 0.2356 (0.2419) loss 3.8026 (3.7512) grad_norm 2.2707 (1.9278) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][530/1251] eta 0:02:56 lr 0.000994 wd 
0.0500 time 0.4727 (0.2445) data time 0.0011 (0.0019) model time 0.4717 (0.2424) loss 4.4855 (3.7515) grad_norm 2.2073 (1.9267) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][540/1251] eta 0:02:54 lr 0.000994 wd 0.0500 time 0.2394 (0.2448) data time 0.0010 (0.0018) model time 0.2384 (0.2427) loss 3.9100 (3.7517) grad_norm 1.8577 (1.9281) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][550/1251] eta 0:02:51 lr 0.000994 wd 0.0500 time 0.2361 (0.2447) data time 0.0012 (0.0018) model time 0.2349 (0.2427) loss 4.1572 (3.7465) grad_norm 2.0049 (1.9299) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][560/1251] eta 0:02:49 lr 0.000994 wd 0.0500 time 0.2521 (0.2448) data time 0.0008 (0.0018) model time 0.2513 (0.2427) loss 4.2621 (3.7519) grad_norm 1.2453 (1.9285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][570/1251] eta 0:02:46 lr 0.000994 wd 0.0500 time 0.2484 (0.2448) data time 0.0009 (0.0018) model time 0.2476 (0.2428) loss 3.0631 (3.7524) grad_norm 1.9482 (1.9308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][580/1251] eta 0:02:44 lr 0.000994 wd 0.0500 time 0.2376 (0.2447) data time 0.0007 (0.0018) model time 0.2368 (0.2427) loss 4.4341 (3.7516) grad_norm 1.5668 (1.9273) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][590/1251] eta 0:02:41 lr 0.000994 wd 0.0500 time 0.2546 (0.2447) data time 0.0009 (0.0018) model time 0.2538 (0.2427) loss 4.1950 (3.7519) grad_norm 2.0442 (1.9290) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][600/1251] eta 0:02:39 lr 0.000994 wd 0.0500 time 0.2372 (0.2446) data time 0.0008 (0.0018) model time 0.2364 (0.2427) loss 3.2685 (3.7471) grad_norm 1.4872 (1.9253) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][610/1251] eta 0:02:36 lr 0.000994 wd 0.0500 time 0.2473 (0.2446) data time 0.0007 (0.0018) model time 0.2466 (0.2426) loss 4.5239 (3.7493) grad_norm 2.2025 (1.9274) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][620/1251] eta 0:02:34 lr 0.000994 wd 0.0500 time 0.2397 (0.2445) data time 0.0011 (0.0017) model time 0.2386 (0.2426) loss 4.1932 (3.7514) grad_norm 2.3051 (1.9290) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][630/1251] eta 0:02:31 lr 0.000994 wd 0.0500 time 0.2514 (0.2445) data time 0.0008 (0.0017) model time 0.2506 (0.2426) loss 4.3549 (3.7515) grad_norm 3.1061 (1.9350) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][640/1251] eta 0:02:29 lr 0.000994 wd 0.0500 time 0.2403 (0.2445) data time 0.0011 (0.0017) model time 0.2392 (0.2426) loss 4.2165 (3.7526) grad_norm 3.7341 (1.9442) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][650/1251] eta 0:02:26 lr 0.000994 wd 0.0500 time 0.2566 (0.2445) data time 0.0009 (0.0017) model time 0.2557 (0.2426) loss 3.8119 (3.7533) grad_norm 1.9056 (1.9449) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][660/1251] eta 0:02:24 lr 0.000994 wd 0.0500 time 0.2371 (0.2444) data time 0.0007 (0.0017) model time 0.2363 (0.2425) loss 2.6500 
(3.7508) grad_norm 1.4098 (1.9438) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][670/1251] eta 0:02:21 lr 0.000994 wd 0.0500 time 0.2409 (0.2444) data time 0.0010 (0.0017) model time 0.2399 (0.2425) loss 4.7697 (3.7526) grad_norm 1.5506 (1.9409) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][680/1251] eta 0:02:19 lr 0.000994 wd 0.0500 time 0.2486 (0.2444) data time 0.0008 (0.0017) model time 0.2478 (0.2425) loss 3.7208 (3.7511) grad_norm 2.3026 (1.9411) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][690/1251] eta 0:02:17 lr 0.000994 wd 0.0500 time 0.2563 (0.2444) data time 0.0011 (0.0017) model time 0.2552 (0.2425) loss 3.8173 (3.7521) grad_norm 2.4375 (1.9406) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][700/1251] eta 0:02:14 lr 0.000994 wd 0.0500 time 0.2323 (0.2444) data time 0.0010 (0.0017) model time 0.2313 (0.2425) loss 3.8946 (3.7529) grad_norm 1.4070 (1.9423) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][710/1251] eta 0:02:12 lr 0.000994 wd 0.0500 time 0.2439 (0.2443) data time 0.0009 (0.0017) model time 0.2430 (0.2425) loss 3.9265 (3.7536) grad_norm 3.8181 (1.9498) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][720/1251] eta 0:02:09 lr 0.000994 wd 0.0500 time 0.2433 (0.2443) data time 0.0009 (0.0017) model time 0.2424 (0.2424) loss 3.8911 (3.7565) grad_norm 1.9592 (1.9524) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][730/1251] eta 0:02:07 lr 0.000994 wd 
0.0500 time 0.2438 (0.2443) data time 0.0007 (0.0017) model time 0.2430 (0.2425) loss 2.5121 (3.7498) grad_norm 1.8521 (1.9507) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][740/1251] eta 0:02:04 lr 0.000994 wd 0.0500 time 0.2401 (0.2442) data time 0.0011 (0.0016) model time 0.2389 (0.2424) loss 4.1757 (3.7456) grad_norm 1.3805 (1.9479) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][750/1251] eta 0:02:02 lr 0.000994 wd 0.0500 time 0.2403 (0.2442) data time 0.0009 (0.0016) model time 0.2394 (0.2424) loss 4.9826 (3.7526) grad_norm 1.7161 (1.9439) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][760/1251] eta 0:01:59 lr 0.000994 wd 0.0500 time 0.2346 (0.2441) data time 0.0007 (0.0016) model time 0.2338 (0.2423) loss 4.5064 (3.7530) grad_norm 1.7900 (1.9454) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][770/1251] eta 0:01:57 lr 0.000994 wd 0.0500 time 0.2438 (0.2441) data time 0.0008 (0.0016) model time 0.2430 (0.2423) loss 3.9446 (3.7511) grad_norm 1.7927 (1.9471) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][780/1251] eta 0:01:54 lr 0.000994 wd 0.0500 time 0.2385 (0.2440) data time 0.0008 (0.0016) model time 0.2378 (0.2422) loss 4.5249 (3.7530) grad_norm 1.6753 (1.9474) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][790/1251] eta 0:01:52 lr 0.000994 wd 0.0500 time 0.2498 (0.2440) data time 0.0009 (0.0016) model time 0.2489 (0.2422) loss 4.5567 (3.7534) grad_norm 2.3882 (1.9537) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:52 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][800/1251] eta 0:01:50 lr 0.000994 wd 0.0500 time 0.2525 (0.2440) data time 0.0010 (0.0016) model time 0.2515 (0.2422) loss 3.8399 (3.7566) grad_norm 2.0856 (1.9539) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][810/1251] eta 0:01:47 lr 0.000994 wd 0.0500 time 0.2504 (0.2440) data time 0.0010 (0.0016) model time 0.2494 (0.2422) loss 4.3158 (3.7555) grad_norm 1.5605 (1.9524) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][820/1251] eta 0:01:45 lr 0.000994 wd 0.0500 time 0.2425 (0.2440) data time 0.0008 (0.0016) model time 0.2417 (0.2422) loss 2.4143 (3.7546) grad_norm 1.6912 (1.9511) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][830/1251] eta 0:01:42 lr 0.000994 wd 0.0500 time 0.2423 (0.2440) data time 0.0009 (0.0016) model time 0.2414 (0.2423) loss 3.1573 (3.7537) grad_norm 1.7296 (1.9505) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][840/1251] eta 0:01:40 lr 0.000994 wd 0.0500 time 0.2427 (0.2440) data time 0.0011 (0.0016) model time 0.2416 (0.2422) loss 3.0487 (3.7522) grad_norm 2.2306 (1.9517) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][850/1251] eta 0:01:37 lr 0.000994 wd 0.0500 time 0.2427 (0.2440) data time 0.0008 (0.0016) model time 0.2420 (0.2423) loss 3.0394 (3.7527) grad_norm 1.5934 (1.9490) loss_scale 16384.0000 (8240.1316) mem 7379MB [2024-08-26 05:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][860/1251] eta 0:01:35 lr 0.000994 wd 0.0500 time 0.2435 (0.2440) data time 0.0010 (0.0016) model time 0.2425 (0.2422) loss 3.9905 
(3.7544) grad_norm 2.8350 (1.9515) loss_scale 16384.0000 (8334.7178) mem 7379MB [2024-08-26 05:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][870/1251] eta 0:01:32 lr 0.000994 wd 0.0500 time 0.2368 (0.2439) data time 0.0007 (0.0016) model time 0.2361 (0.2422) loss 4.4568 (3.7530) grad_norm 1.3776 (1.9501) loss_scale 16384.0000 (8427.1320) mem 7379MB [2024-08-26 05:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][880/1251] eta 0:01:30 lr 0.000994 wd 0.0500 time 0.2500 (0.2439) data time 0.0011 (0.0016) model time 0.2489 (0.2422) loss 3.6795 (3.7524) grad_norm 1.8847 (1.9482) loss_scale 16384.0000 (8517.4484) mem 7379MB [2024-08-26 05:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][890/1251] eta 0:01:28 lr 0.000994 wd 0.0500 time 0.2412 (0.2439) data time 0.0011 (0.0016) model time 0.2401 (0.2422) loss 3.2420 (3.7530) grad_norm 2.0506 (1.9479) loss_scale 16384.0000 (8605.7374) mem 7379MB [2024-08-26 05:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][900/1251] eta 0:01:25 lr 0.000994 wd 0.0500 time 0.2399 (0.2439) data time 0.0010 (0.0015) model time 0.2389 (0.2422) loss 3.9074 (3.7517) grad_norm 2.0347 (1.9468) loss_scale 16384.0000 (8692.0666) mem 7379MB [2024-08-26 05:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][910/1251] eta 0:01:23 lr 0.000994 wd 0.0500 time 0.2447 (0.2439) data time 0.0009 (0.0015) model time 0.2438 (0.2422) loss 3.6280 (3.7511) grad_norm 1.2049 (1.9453) loss_scale 16384.0000 (8776.5005) mem 7379MB [2024-08-26 05:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][920/1251] eta 0:01:20 lr 0.000994 wd 0.0500 time 0.2447 (0.2439) data time 0.0009 (0.0015) model time 0.2438 (0.2422) loss 3.2877 (3.7526) grad_norm 1.7095 (1.9439) loss_scale 16384.0000 (8859.1010) mem 7379MB [2024-08-26 05:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][930/1251] eta 0:01:18 lr 
0.000994 wd 0.0500 time 0.2416 (0.2438) data time 0.0007 (0.0015) model time 0.2409 (0.2422) loss 4.5873 (3.7508) grad_norm 3.8319 (1.9449) loss_scale 16384.0000 (8939.9270) mem 7379MB [2024-08-26 05:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][940/1251] eta 0:01:15 lr 0.000994 wd 0.0500 time 0.2371 (0.2441) data time 0.0009 (0.0015) model time 0.2362 (0.2424) loss 3.8827 (3.7527) grad_norm 1.7126 (1.9480) loss_scale 16384.0000 (9019.0351) mem 7379MB [2024-08-26 05:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][950/1251] eta 0:01:13 lr 0.000994 wd 0.0500 time 0.2393 (0.2441) data time 0.0008 (0.0015) model time 0.2384 (0.2424) loss 4.0360 (3.7535) grad_norm 1.5983 (1.9465) loss_scale 16384.0000 (9096.4795) mem 7379MB [2024-08-26 05:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][960/1251] eta 0:01:11 lr 0.000994 wd 0.0500 time 0.2430 (0.2441) data time 0.0009 (0.0015) model time 0.2421 (0.2424) loss 3.5628 (3.7560) grad_norm 1.4478 (1.9432) loss_scale 16384.0000 (9172.3122) mem 7379MB [2024-08-26 05:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][970/1251] eta 0:01:08 lr 0.000994 wd 0.0500 time 0.2381 (0.2441) data time 0.0011 (0.0015) model time 0.2370 (0.2424) loss 4.0780 (3.7579) grad_norm 2.1555 (1.9417) loss_scale 16384.0000 (9246.5829) mem 7379MB [2024-08-26 05:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][980/1251] eta 0:01:06 lr 0.000994 wd 0.0500 time 0.2401 (0.2441) data time 0.0010 (0.0015) model time 0.2391 (0.2424) loss 3.9179 (3.7558) grad_norm 2.3806 (1.9433) loss_scale 16384.0000 (9319.3394) mem 7379MB [2024-08-26 05:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][990/1251] eta 0:01:03 lr 0.000994 wd 0.0500 time 0.2366 (0.2440) data time 0.0010 (0.0015) model time 0.2356 (0.2424) loss 3.5697 (3.7561) grad_norm 2.2060 (1.9461) loss_scale 16384.0000 (9390.6276) mem 7379MB [2024-08-26 
05:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1000/1251] eta 0:01:01 lr 0.000994 wd 0.0500 time 0.2402 (0.2440) data time 0.0007 (0.0015) model time 0.2394 (0.2424) loss 2.8012 (3.7537) grad_norm 2.2512 (1.9438) loss_scale 16384.0000 (9460.4915) mem 7379MB [2024-08-26 05:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1010/1251] eta 0:00:58 lr 0.000994 wd 0.0500 time 0.2378 (0.2442) data time 0.0008 (0.0015) model time 0.2370 (0.2426) loss 3.8898 (3.7549) grad_norm 1.9780 (1.9445) loss_scale 16384.0000 (9528.9733) mem 7379MB [2024-08-26 05:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1020/1251] eta 0:00:56 lr 0.000994 wd 0.0500 time 0.2353 (0.2442) data time 0.0008 (0.0015) model time 0.2346 (0.2425) loss 3.5792 (3.7560) grad_norm 1.5282 (1.9432) loss_scale 16384.0000 (9596.1136) mem 7379MB [2024-08-26 05:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1030/1251] eta 0:00:54 lr 0.000994 wd 0.0500 time 0.2387 (0.2444) data time 0.0010 (0.0015) model time 0.2377 (0.2427) loss 4.2636 (3.7532) grad_norm 1.7642 (1.9425) loss_scale 16384.0000 (9661.9515) mem 7379MB [2024-08-26 05:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1040/1251] eta 0:00:51 lr 0.000994 wd 0.0500 time 0.2458 (0.2448) data time 0.0008 (0.0015) model time 0.2450 (0.2432) loss 4.3793 (3.7516) grad_norm 1.8090 (1.9419) loss_scale 16384.0000 (9726.5245) mem 7379MB [2024-08-26 05:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1050/1251] eta 0:00:49 lr 0.000994 wd 0.0500 time 0.4111 (0.2449) data time 0.0011 (0.0015) model time 0.4100 (0.2433) loss 3.4875 (3.7513) grad_norm 1.4260 (1.9393) loss_scale 16384.0000 (9789.8687) mem 7379MB [2024-08-26 05:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1060/1251] eta 0:00:46 lr 0.000994 wd 0.0500 time 0.2339 (0.2452) data time 0.0011 (0.0015) model time 0.2327 
(0.2436) loss 3.9831 (3.7525) grad_norm 1.5605 (1.9384) loss_scale 16384.0000 (9852.0189) mem 7379MB [2024-08-26 05:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1070/1251] eta 0:00:44 lr 0.000994 wd 0.0500 time 0.2375 (0.2451) data time 0.0011 (0.0015) model time 0.2364 (0.2435) loss 3.1003 (3.7543) grad_norm 2.3153 (1.9377) loss_scale 16384.0000 (9913.0084) mem 7379MB [2024-08-26 05:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1080/1251] eta 0:00:41 lr 0.000994 wd 0.0500 time 0.2357 (0.2451) data time 0.0009 (0.0015) model time 0.2347 (0.2435) loss 4.0503 (3.7526) grad_norm 2.4028 (1.9379) loss_scale 16384.0000 (9972.8696) mem 7379MB [2024-08-26 05:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1090/1251] eta 0:00:39 lr 0.000994 wd 0.0500 time 0.2412 (0.2451) data time 0.0008 (0.0015) model time 0.2405 (0.2435) loss 2.6694 (3.7526) grad_norm 1.8992 (1.9403) loss_scale 16384.0000 (10031.6334) mem 7379MB [2024-08-26 05:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1100/1251] eta 0:00:36 lr 0.000994 wd 0.0500 time 0.2454 (0.2450) data time 0.0008 (0.0015) model time 0.2446 (0.2434) loss 3.8540 (3.7527) grad_norm 1.7277 (1.9413) loss_scale 16384.0000 (10089.3297) mem 7379MB [2024-08-26 05:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1110/1251] eta 0:00:34 lr 0.000994 wd 0.0500 time 0.2490 (0.2450) data time 0.0007 (0.0015) model time 0.2483 (0.2434) loss 4.2812 (3.7537) grad_norm 1.6344 (1.9424) loss_scale 16384.0000 (10145.9874) mem 7379MB [2024-08-26 05:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1120/1251] eta 0:00:32 lr 0.000994 wd 0.0500 time 0.2432 (0.2450) data time 0.0011 (0.0015) model time 0.2421 (0.2434) loss 4.1173 (3.7564) grad_norm 1.6021 (1.9414) loss_scale 16384.0000 (10201.6343) mem 7379MB [2024-08-26 05:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[33/300][1130/1251] eta 0:00:29 lr 0.000994 wd 0.0500 time 0.2394 (0.2450) data time 0.0009 (0.0015) model time 0.2385 (0.2434) loss 4.9355 (3.7583) grad_norm 1.4642 (1.9395) loss_scale 16384.0000 (10256.2971) mem 7379MB [2024-08-26 05:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1140/1251] eta 0:00:27 lr 0.000994 wd 0.0500 time 0.2535 (0.2449) data time 0.0011 (0.0015) model time 0.2525 (0.2434) loss 3.9676 (3.7619) grad_norm 1.5783 (1.9387) loss_scale 16384.0000 (10310.0018) mem 7379MB [2024-08-26 05:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1150/1251] eta 0:00:24 lr 0.000994 wd 0.0500 time 0.2366 (0.2449) data time 0.0009 (0.0015) model time 0.2357 (0.2434) loss 2.8865 (3.7635) grad_norm 1.8321 (1.9383) loss_scale 16384.0000 (10362.7732) mem 7379MB [2024-08-26 05:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1160/1251] eta 0:00:22 lr 0.000994 wd 0.0500 time 0.2416 (0.2449) data time 0.0009 (0.0015) model time 0.2407 (0.2434) loss 2.6494 (3.7638) grad_norm 1.8408 (1.9373) loss_scale 16384.0000 (10414.6357) mem 7379MB [2024-08-26 05:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1170/1251] eta 0:00:19 lr 0.000994 wd 0.0500 time 0.2528 (0.2449) data time 0.0007 (0.0015) model time 0.2520 (0.2433) loss 4.2063 (3.7642) grad_norm 2.4873 (1.9394) loss_scale 16384.0000 (10465.6123) mem 7379MB [2024-08-26 05:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1180/1251] eta 0:00:17 lr 0.000994 wd 0.0500 time 0.2497 (0.2449) data time 0.0011 (0.0015) model time 0.2486 (0.2433) loss 3.7295 (3.7627) grad_norm 1.6099 (1.9419) loss_scale 16384.0000 (10515.7257) mem 7379MB [2024-08-26 05:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1190/1251] eta 0:00:14 lr 0.000994 wd 0.0500 time 0.2485 (0.2449) data time 0.0012 (0.0015) model time 0.2474 (0.2433) loss 3.3227 (3.7623) grad_norm 1.8897 (1.9410) loss_scale 
16384.0000 (10564.9975) mem 7379MB [2024-08-26 05:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1200/1251] eta 0:00:12 lr 0.000994 wd 0.0500 time 0.2564 (0.2449) data time 0.0011 (0.0015) model time 0.2553 (0.2433) loss 4.0885 (3.7622) grad_norm 1.5397 (1.9415) loss_scale 16384.0000 (10613.4488) mem 7379MB [2024-08-26 05:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1210/1251] eta 0:00:10 lr 0.000994 wd 0.0500 time 0.2422 (0.2449) data time 0.0012 (0.0015) model time 0.2410 (0.2433) loss 4.1544 (3.7611) grad_norm 1.4902 (1.9400) loss_scale 16384.0000 (10661.0999) mem 7379MB [2024-08-26 05:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1220/1251] eta 0:00:07 lr 0.000994 wd 0.0500 time 0.2460 (0.2449) data time 0.0010 (0.0015) model time 0.2450 (0.2433) loss 3.0799 (3.7621) grad_norm 1.5239 (1.9392) loss_scale 16384.0000 (10707.9705) mem 7379MB [2024-08-26 05:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1230/1251] eta 0:00:05 lr 0.000994 wd 0.0500 time 0.2428 (0.2449) data time 0.0007 (0.0015) model time 0.2421 (0.2433) loss 3.7529 (3.7629) grad_norm 1.6748 (1.9406) loss_scale 16384.0000 (10754.0796) mem 7379MB [2024-08-26 05:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1240/1251] eta 0:00:02 lr 0.000994 wd 0.0500 time 0.2240 (0.2448) data time 0.0008 (0.0015) model time 0.2232 (0.2432) loss 3.8927 (3.7640) grad_norm 3.3644 (1.9440) loss_scale 16384.0000 (10799.4456) mem 7379MB [2024-08-26 05:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [33/300][1250/1251] eta 0:00:00 lr 0.000994 wd 0.0500 time 0.2229 (0.2446) data time 0.0007 (0.0015) model time 0.2222 (0.2431) loss 3.5965 (3.7648) grad_norm 1.8967 (1.9454) loss_scale 16384.0000 (10844.0863) mem 7379MB [2024-08-26 05:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 33 training takes 0:05:06 [2024-08-26 05:05:42 msvmambav3_tiny_224] 
(utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 05:05:43 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 05:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.471 (0.471) Loss 0.6455 (0.6455) Acc@1 86.621 (86.621) Acc@5 96.973 (96.973) Mem 7379MB
[2024-08-26 05:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.114) Loss 1.0273 (0.9860) Acc@1 76.074 (77.779) Acc@5 93.848 (94.345) Mem 7379MB
[2024-08-26 05:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.097) Loss 1.3125 (0.9917) Acc@1 70.605 (77.325) Acc@5 90.137 (94.443) Mem 7379MB
[2024-08-26 05:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.091) Loss 1.6650 (1.1329) Acc@1 60.742 (74.134) Acc@5 83.496 (92.480) Mem 7379MB
[2024-08-26 05:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.5479 (1.2127) Acc@1 62.891 (72.275) Acc@5 86.426 (91.399) Mem 7379MB
[2024-08-26 05:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 72.042 Acc@5 91.326
[2024-08-26 05:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 72.0%
[2024-08-26 05:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 72.04%
[2024-08-26 05:05:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:05:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.476 (0.476) Loss 0.5093 (0.5093) Acc@1 88.379 (88.379) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 05:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.091 (0.115) Loss 0.8564 (0.8433) Acc@1 80.273 (79.510) Acc@5 95.117 (95.126) Mem 7379MB
[2024-08-26 05:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.096) Loss 1.2539 (0.8576) Acc@1 69.141 (78.804) Acc@5 90.332 (94.978) Mem 7379MB
[2024-08-26 05:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.090) Loss 1.5498 (0.9988) Acc@1 62.109 (75.712) Acc@5 85.352 (93.092) Mem 7379MB
[2024-08-26 05:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.5234 (1.0843) Acc@1 63.086 (73.828) Acc@5 86.719 (92.018) Mem 7379MB
[2024-08-26 05:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.546 Acc@5 91.892
[2024-08-26 05:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 73.5%
[2024-08-26 05:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 73.55%
[2024-08-26 05:05:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:05:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][0/1251] eta 0:14:28 lr 0.000994 wd 0.0500 time 0.6944 (0.6944) data time 0.4695 (0.4695) model time 0.0000 (0.0000) loss 3.7743 (3.7743) grad_norm 1.3854 (1.3854) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][10/1251] eta 0:05:50 lr 0.000994 wd 0.0500 time 0.2360 (0.2821) data time 0.0008 (0.0436) model time 0.0000 (0.0000) loss 2.5908 (3.9114) grad_norm 2.0114 (1.8334) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][20/1251] eta 0:05:24 lr 0.000994 wd 0.0500 time 0.2425 (0.2634) data time 0.0014 (0.0233) model time 0.0000 (0.0000) loss 2.5135 (3.8728) grad_norm 1.5340 (1.8453) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][30/1251] eta 0:05:14 lr 0.000994 wd 0.0500 time 0.2431 (0.2573) data time 0.0008 (0.0161) model time 0.0000 (0.0000) loss 4.1041 (3.8796) grad_norm 1.7931 (1.8231) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][40/1251] eta 0:05:08 lr 0.000994 wd 0.0500 time 0.2375 (0.2548) data time 0.0008 (0.0124) model time 0.0000 (0.0000) loss 3.5708 (3.8487) grad_norm 1.6330 (1.8037) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][50/1251] eta 0:05:03 lr 0.000994 wd 0.0500 time 0.2468 (0.2525) data time 0.0012 (0.0102) model time 0.0000 (0.0000) loss 3.6959 (3.8821) grad_norm 1.4692 (1.7840) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][60/1251] eta 0:04:58 lr 0.000994 wd 0.0500 time 0.2433 (0.2510) data time 0.0009 (0.0088) model time 0.2424 
(0.2420) loss 4.1269 (3.9297) grad_norm 1.7442 (1.7827) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][70/1251] eta 0:04:55 lr 0.000994 wd 0.0500 time 0.2410 (0.2500) data time 0.0007 (0.0077) model time 0.2402 (0.2422) loss 4.0108 (3.9345) grad_norm 2.4993 (1.8529) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][80/1251] eta 0:04:51 lr 0.000994 wd 0.0500 time 0.2438 (0.2491) data time 0.0009 (0.0069) model time 0.2429 (0.2421) loss 4.4300 (3.9324) grad_norm 1.9641 (1.8369) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][90/1251] eta 0:04:48 lr 0.000994 wd 0.0500 time 0.2409 (0.2486) data time 0.0009 (0.0062) model time 0.2401 (0.2425) loss 4.7363 (3.8973) grad_norm 1.3529 (1.8307) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][100/1251] eta 0:04:45 lr 0.000994 wd 0.0500 time 0.2475 (0.2480) data time 0.0011 (0.0057) model time 0.2464 (0.2423) loss 3.6023 (3.8857) grad_norm 2.7713 (1.8448) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][110/1251] eta 0:04:42 lr 0.000994 wd 0.0500 time 0.2408 (0.2472) data time 0.0007 (0.0053) model time 0.2400 (0.2416) loss 3.1983 (3.8710) grad_norm 2.4157 (1.8649) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][120/1251] eta 0:04:39 lr 0.000994 wd 0.0500 time 0.2429 (0.2468) data time 0.0008 (0.0049) model time 0.2421 (0.2415) loss 4.3635 (3.8811) grad_norm 2.9915 (1.8580) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[34/300][130/1251] eta 0:04:36 lr 0.000994 wd 0.0500 time 0.2440 (0.2466) data time 0.0007 (0.0046) model time 0.2433 (0.2418) loss 2.9847 (3.8644) grad_norm 2.5808 (1.8694) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][140/1251] eta 0:04:33 lr 0.000994 wd 0.0500 time 0.2447 (0.2464) data time 0.0008 (0.0044) model time 0.2439 (0.2418) loss 4.5045 (3.8450) grad_norm 1.8229 (1.8671) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][150/1251] eta 0:04:30 lr 0.000994 wd 0.0500 time 0.2406 (0.2460) data time 0.0012 (0.0041) model time 0.2394 (0.2416) loss 4.2156 (3.8278) grad_norm 2.0538 (1.8676) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][160/1251] eta 0:04:28 lr 0.000994 wd 0.0500 time 0.2392 (0.2457) data time 0.0010 (0.0040) model time 0.2383 (0.2414) loss 3.8103 (3.8221) grad_norm 1.9659 (1.8808) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][170/1251] eta 0:04:25 lr 0.000994 wd 0.0500 time 0.2450 (0.2454) data time 0.0009 (0.0038) model time 0.2441 (0.2413) loss 4.2841 (3.8258) grad_norm 1.8793 (1.8741) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][180/1251] eta 0:04:22 lr 0.000994 wd 0.0500 time 0.2372 (0.2451) data time 0.0011 (0.0036) model time 0.2361 (0.2412) loss 3.8631 (3.8313) grad_norm 2.1311 (1.8638) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][190/1251] eta 0:04:19 lr 0.000994 wd 0.0500 time 0.2410 (0.2449) data time 0.0010 (0.0035) model time 0.2400 (0.2411) loss 4.1312 (3.8358) grad_norm 2.4393 (1.8749) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][200/1251] eta 0:04:17 lr 0.000994 wd 0.0500 time 0.2351 (0.2448) data time 0.0011 (0.0034) model time 0.2341 (0.2411) loss 3.8123 (3.8313) grad_norm 1.7145 (1.8684) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][210/1251] eta 0:04:14 lr 0.000994 wd 0.0500 time 0.2379 (0.2445) data time 0.0008 (0.0033) model time 0.2372 (0.2409) loss 4.5598 (3.8404) grad_norm 1.5183 (1.8635) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][220/1251] eta 0:04:11 lr 0.000994 wd 0.0500 time 0.2386 (0.2443) data time 0.0007 (0.0032) model time 0.2378 (0.2408) loss 3.5577 (3.8351) grad_norm 1.8365 (1.8915) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][230/1251] eta 0:04:09 lr 0.000994 wd 0.0500 time 0.2404 (0.2441) data time 0.0011 (0.0031) model time 0.2393 (0.2407) loss 4.1425 (3.8329) grad_norm 1.6307 (1.8924) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][240/1251] eta 0:04:06 lr 0.000994 wd 0.0500 time 0.2377 (0.2440) data time 0.0009 (0.0030) model time 0.2368 (0.2407) loss 3.9979 (3.8252) grad_norm 2.3266 (1.9085) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][250/1251] eta 0:04:04 lr 0.000994 wd 0.0500 time 0.2397 (0.2439) data time 0.0010 (0.0029) model time 0.2387 (0.2406) loss 3.9832 (3.8096) grad_norm 1.8124 (1.9246) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][260/1251] eta 0:04:01 lr 0.000994 wd 0.0500 time 0.2378 (0.2437) 
data time 0.0008 (0.0028) model time 0.2370 (0.2405) loss 4.3125 (3.8030) grad_norm 1.4383 (1.9456) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][270/1251] eta 0:03:58 lr 0.000994 wd 0.0500 time 0.2480 (0.2436) data time 0.0007 (0.0028) model time 0.2473 (0.2405) loss 4.5303 (3.8154) grad_norm 1.4630 (1.9493) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][280/1251] eta 0:03:56 lr 0.000994 wd 0.0500 time 0.2427 (0.2436) data time 0.0010 (0.0027) model time 0.2417 (0.2405) loss 3.7779 (3.8116) grad_norm 1.4329 (1.9482) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][290/1251] eta 0:03:54 lr 0.000994 wd 0.0500 time 0.2448 (0.2435) data time 0.0007 (0.0026) model time 0.2440 (0.2406) loss 3.7131 (3.8062) grad_norm 1.4596 (1.9343) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][300/1251] eta 0:03:51 lr 0.000994 wd 0.0500 time 0.2416 (0.2435) data time 0.0008 (0.0026) model time 0.2408 (0.2406) loss 3.4362 (3.8118) grad_norm 1.5107 (1.9335) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][310/1251] eta 0:03:49 lr 0.000994 wd 0.0500 time 0.2407 (0.2434) data time 0.0008 (0.0025) model time 0.2399 (0.2406) loss 4.7673 (3.8138) grad_norm 1.8260 (1.9356) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][320/1251] eta 0:03:47 lr 0.000994 wd 0.0500 time 0.2466 (0.2439) data time 0.0011 (0.0025) model time 0.2456 (0.2412) loss 3.3547 (3.8131) grad_norm 2.3828 (1.9345) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [34/300][330/1251] eta 0:03:44 lr 0.000994 wd 0.0500 time 0.2420 (0.2442) data time 0.0007 (0.0024) model time 0.2413 (0.2417) loss 2.8374 (3.8050) grad_norm 1.7959 (1.9276) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][340/1251] eta 0:03:42 lr 0.000994 wd 0.0500 time 0.2380 (0.2441) data time 0.0009 (0.0024) model time 0.2371 (0.2417) loss 3.4907 (3.7992) grad_norm 1.4488 (1.9321) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][350/1251] eta 0:03:39 lr 0.000994 wd 0.0500 time 0.2390 (0.2440) data time 0.0007 (0.0024) model time 0.2383 (0.2415) loss 3.9898 (3.8054) grad_norm 1.7336 (1.9325) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][360/1251] eta 0:03:37 lr 0.000994 wd 0.0500 time 0.4709 (0.2447) data time 0.0010 (0.0023) model time 0.4700 (0.2423) loss 3.8107 (3.7992) grad_norm 1.9821 (1.9355) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][370/1251] eta 0:03:35 lr 0.000994 wd 0.0500 time 0.2491 (0.2445) data time 0.0010 (0.0023) model time 0.2481 (0.2422) loss 3.3289 (3.7991) grad_norm 2.0182 (1.9286) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][380/1251] eta 0:03:32 lr 0.000994 wd 0.0500 time 0.2454 (0.2445) data time 0.0010 (0.0023) model time 0.2444 (0.2422) loss 4.0535 (3.7969) grad_norm 1.8708 (1.9439) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][390/1251] eta 0:03:31 lr 0.000994 wd 0.0500 time 0.2375 (0.2452) data time 0.0008 (0.0022) model time 0.2367 (0.2431) loss 3.5214 (3.7978) 
grad_norm 1.7273 (1.9387) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][400/1251] eta 0:03:29 lr 0.000994 wd 0.0500 time 0.2363 (0.2457) data time 0.0008 (0.0022) model time 0.2356 (0.2437) loss 4.5973 (3.8000) grad_norm 1.5483 (1.9350) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][410/1251] eta 0:03:27 lr 0.000994 wd 0.0500 time 0.2475 (0.2461) data time 0.0010 (0.0022) model time 0.2465 (0.2442) loss 2.2747 (3.7958) grad_norm 1.9421 (1.9313) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][420/1251] eta 0:03:24 lr 0.000994 wd 0.0500 time 0.2413 (0.2461) data time 0.0008 (0.0021) model time 0.2405 (0.2442) loss 2.9978 (3.7828) grad_norm 1.6061 (1.9361) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][430/1251] eta 0:03:21 lr 0.000994 wd 0.0500 time 0.2379 (0.2460) data time 0.0009 (0.0021) model time 0.2370 (0.2441) loss 3.3739 (3.7832) grad_norm 1.5964 (1.9364) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][440/1251] eta 0:03:19 lr 0.000994 wd 0.0500 time 0.2396 (0.2459) data time 0.0011 (0.0021) model time 0.2385 (0.2440) loss 4.0627 (3.7871) grad_norm 2.1253 (1.9409) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][450/1251] eta 0:03:16 lr 0.000994 wd 0.0500 time 0.2404 (0.2458) data time 0.0008 (0.0021) model time 0.2396 (0.2439) loss 3.4647 (3.7866) grad_norm 2.6344 (1.9390) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][460/1251] eta 0:03:14 lr 
0.000994 wd 0.0500 time 0.2405 (0.2456) data time 0.0007 (0.0020) model time 0.2398 (0.2438) loss 4.2363 (3.7885) grad_norm 1.5965 (1.9350) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][470/1251] eta 0:03:11 lr 0.000994 wd 0.0500 time 0.2339 (0.2455) data time 0.0008 (0.0020) model time 0.2330 (0.2436) loss 2.7328 (3.7807) grad_norm 1.6411 (1.9312) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][480/1251] eta 0:03:09 lr 0.000994 wd 0.0500 time 0.2366 (0.2454) data time 0.0011 (0.0020) model time 0.2355 (0.2436) loss 3.8926 (3.7820) grad_norm 1.3552 (1.9312) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][490/1251] eta 0:03:06 lr 0.000994 wd 0.0500 time 0.2452 (0.2454) data time 0.0008 (0.0020) model time 0.2443 (0.2435) loss 3.0907 (3.7850) grad_norm 2.2593 (1.9374) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][500/1251] eta 0:03:04 lr 0.000994 wd 0.0500 time 0.2469 (0.2453) data time 0.0009 (0.0020) model time 0.2460 (0.2435) loss 4.5437 (3.7817) grad_norm 1.5958 (1.9391) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][510/1251] eta 0:03:01 lr 0.000994 wd 0.0500 time 0.2433 (0.2452) data time 0.0007 (0.0019) model time 0.2425 (0.2434) loss 3.4653 (3.7764) grad_norm 1.4748 (1.9370) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][520/1251] eta 0:02:59 lr 0.000994 wd 0.0500 time 0.2487 (0.2452) data time 0.0009 (0.0019) model time 0.2477 (0.2434) loss 3.8751 (3.7789) grad_norm 1.8204 (1.9399) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 05:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][530/1251] eta 0:02:56 lr 0.000994 wd 0.0500 time 0.2399 (0.2451) data time 0.0010 (0.0019) model time 0.2389 (0.2433) loss 3.1573 (3.7727) grad_norm 2.0749 (1.9377) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][540/1251] eta 0:02:54 lr 0.000994 wd 0.0500 time 0.2358 (0.2450) data time 0.0010 (0.0019) model time 0.2348 (0.2432) loss 4.4869 (3.7729) grad_norm 2.2056 (1.9380) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][550/1251] eta 0:02:51 lr 0.000994 wd 0.0500 time 0.2353 (0.2450) data time 0.0009 (0.0019) model time 0.2344 (0.2432) loss 3.6591 (3.7688) grad_norm 1.7095 (1.9388) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][560/1251] eta 0:02:49 lr 0.000994 wd 0.0500 time 0.2501 (0.2449) data time 0.0010 (0.0019) model time 0.2491 (0.2432) loss 4.0890 (3.7712) grad_norm 1.1472 (1.9345) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][570/1251] eta 0:02:46 lr 0.000994 wd 0.0500 time 0.2440 (0.2449) data time 0.0012 (0.0018) model time 0.2428 (0.2431) loss 4.2224 (3.7730) grad_norm 2.1690 (1.9325) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][580/1251] eta 0:02:44 lr 0.000993 wd 0.0500 time 0.2430 (0.2448) data time 0.0012 (0.0018) model time 0.2418 (0.2431) loss 3.7114 (3.7748) grad_norm 1.5996 (1.9371) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][590/1251] eta 0:02:41 lr 0.000993 wd 0.0500 time 0.2391 (0.2448) data time 0.0008 (0.0018) model time 
0.2383 (0.2430) loss 4.4348 (3.7736) grad_norm 1.4897 (1.9359) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][600/1251] eta 0:02:39 lr 0.000993 wd 0.0500 time 0.2428 (0.2447) data time 0.0009 (0.0018) model time 0.2419 (0.2430) loss 4.7592 (3.7764) grad_norm 2.5322 (1.9343) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][610/1251] eta 0:02:36 lr 0.000993 wd 0.0500 time 0.2425 (0.2447) data time 0.0010 (0.0018) model time 0.2416 (0.2430) loss 4.2886 (3.7792) grad_norm 2.3542 (1.9330) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][620/1251] eta 0:02:34 lr 0.000993 wd 0.0500 time 0.2389 (0.2446) data time 0.0009 (0.0018) model time 0.2380 (0.2429) loss 3.6939 (3.7762) grad_norm 2.0632 (1.9349) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][630/1251] eta 0:02:31 lr 0.000993 wd 0.0500 time 0.2464 (0.2446) data time 0.0007 (0.0018) model time 0.2457 (0.2429) loss 3.8984 (3.7742) grad_norm 1.7029 (1.9383) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][640/1251] eta 0:02:29 lr 0.000993 wd 0.0500 time 0.2365 (0.2446) data time 0.0009 (0.0018) model time 0.2357 (0.2429) loss 2.8499 (3.7709) grad_norm 1.2759 (1.9344) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][650/1251] eta 0:02:26 lr 0.000993 wd 0.0500 time 0.2418 (0.2445) data time 0.0011 (0.0017) model time 0.2407 (0.2428) loss 3.9669 (3.7704) grad_norm 1.2724 (1.9291) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[34/300][660/1251] eta 0:02:24 lr 0.000993 wd 0.0500 time 0.2461 (0.2445) data time 0.0010 (0.0017) model time 0.2451 (0.2428) loss 4.7653 (3.7737) grad_norm 2.8029 (1.9343) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][670/1251] eta 0:02:22 lr 0.000993 wd 0.0500 time 0.2462 (0.2444) data time 0.0010 (0.0017) model time 0.2451 (0.2428) loss 4.0810 (3.7743) grad_norm 1.7840 (1.9361) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][680/1251] eta 0:02:19 lr 0.000993 wd 0.0500 time 0.2365 (0.2444) data time 0.0008 (0.0017) model time 0.2358 (0.2427) loss 4.3705 (3.7749) grad_norm 1.3804 (1.9348) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][690/1251] eta 0:02:17 lr 0.000993 wd 0.0500 time 0.2467 (0.2443) data time 0.0009 (0.0017) model time 0.2457 (0.2426) loss 3.4821 (3.7780) grad_norm 1.3525 (1.9347) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][700/1251] eta 0:02:14 lr 0.000993 wd 0.0500 time 0.2427 (0.2443) data time 0.0013 (0.0017) model time 0.2414 (0.2426) loss 3.0125 (3.7747) grad_norm 1.7434 (1.9315) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][710/1251] eta 0:02:12 lr 0.000993 wd 0.0500 time 0.2384 (0.2442) data time 0.0012 (0.0017) model time 0.2372 (0.2426) loss 3.1857 (3.7715) grad_norm 1.9294 (1.9300) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][720/1251] eta 0:02:09 lr 0.000993 wd 0.0500 time 0.2435 (0.2442) data time 0.0009 (0.0017) model time 0.2425 (0.2426) loss 3.6107 (3.7730) grad_norm 1.7671 (1.9306) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][730/1251] eta 0:02:07 lr 0.000993 wd 0.0500 time 0.2446 (0.2442) data time 0.0009 (0.0017) model time 0.2438 (0.2426) loss 3.5454 (3.7698) grad_norm 1.9318 (1.9299) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][740/1251] eta 0:02:04 lr 0.000993 wd 0.0500 time 0.2432 (0.2442) data time 0.0012 (0.0017) model time 0.2420 (0.2426) loss 3.9648 (3.7672) grad_norm 2.1140 (1.9307) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][750/1251] eta 0:02:02 lr 0.000993 wd 0.0500 time 0.2402 (0.2441) data time 0.0007 (0.0016) model time 0.2395 (0.2425) loss 3.0596 (3.7623) grad_norm 1.7950 (1.9316) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][760/1251] eta 0:01:59 lr 0.000993 wd 0.0500 time 0.2413 (0.2441) data time 0.0009 (0.0016) model time 0.2404 (0.2425) loss 3.7999 (3.7619) grad_norm 1.6446 (1.9316) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][770/1251] eta 0:01:57 lr 0.000993 wd 0.0500 time 0.2464 (0.2441) data time 0.0012 (0.0016) model time 0.2452 (0.2425) loss 4.2044 (3.7648) grad_norm 2.0144 (1.9311) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][780/1251] eta 0:01:54 lr 0.000993 wd 0.0500 time 0.2479 (0.2441) data time 0.0007 (0.0016) model time 0.2472 (0.2424) loss 4.6647 (3.7640) grad_norm 1.8195 (1.9297) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][790/1251] eta 0:01:52 lr 0.000993 wd 0.0500 time 0.2425 (0.2440) 
data time 0.0011 (0.0016) model time 0.2414 (0.2424) loss 2.7017 (3.7604) grad_norm 1.7599 (1.9338) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][800/1251] eta 0:01:50 lr 0.000993 wd 0.0500 time 0.2359 (0.2440) data time 0.0009 (0.0016) model time 0.2350 (0.2424) loss 3.5700 (3.7614) grad_norm 1.9816 (1.9331) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][810/1251] eta 0:01:47 lr 0.000993 wd 0.0500 time 0.2349 (0.2439) data time 0.0008 (0.0016) model time 0.2341 (0.2423) loss 2.6070 (3.7611) grad_norm 2.1479 (1.9322) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][820/1251] eta 0:01:45 lr 0.000993 wd 0.0500 time 0.2435 (0.2439) data time 0.0010 (0.0016) model time 0.2425 (0.2423) loss 3.6539 (3.7609) grad_norm 1.9373 (1.9303) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][830/1251] eta 0:01:42 lr 0.000993 wd 0.0500 time 0.2344 (0.2439) data time 0.0010 (0.0016) model time 0.2334 (0.2423) loss 2.8852 (3.7582) grad_norm 1.9864 (1.9298) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][840/1251] eta 0:01:40 lr 0.000993 wd 0.0500 time 0.2397 (0.2438) data time 0.0011 (0.0016) model time 0.2386 (0.2422) loss 4.2487 (3.7590) grad_norm 2.2218 (1.9405) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][850/1251] eta 0:01:37 lr 0.000993 wd 0.0500 time 0.2354 (0.2443) data time 0.0008 (0.0016) model time 0.2347 (0.2427) loss 3.1814 (3.7580) grad_norm 2.2729 (1.9423) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:23 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [34/300][860/1251] eta 0:01:35 lr 0.000993 wd 0.0500 time 0.2391 (0.2442) data time 0.0008 (0.0016) model time 0.2383 (0.2427) loss 4.7908 (3.7589) grad_norm 1.3537 (1.9371) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][870/1251] eta 0:01:33 lr 0.000993 wd 0.0500 time 0.2391 (0.2442) data time 0.0007 (0.0016) model time 0.2384 (0.2426) loss 4.0956 (3.7595) grad_norm 1.4097 (1.9336) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][880/1251] eta 0:01:30 lr 0.000993 wd 0.0500 time 0.2380 (0.2441) data time 0.0009 (0.0016) model time 0.2371 (0.2426) loss 3.2204 (3.7593) grad_norm 1.7187 (1.9322) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][890/1251] eta 0:01:28 lr 0.000993 wd 0.0500 time 0.2329 (0.2441) data time 0.0012 (0.0016) model time 0.2317 (0.2426) loss 3.8424 (3.7598) grad_norm 1.4985 (1.9325) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][900/1251] eta 0:01:25 lr 0.000993 wd 0.0500 time 0.2591 (0.2443) data time 0.0007 (0.0015) model time 0.2584 (0.2428) loss 4.5128 (3.7636) grad_norm 2.0566 (1.9359) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][910/1251] eta 0:01:23 lr 0.000993 wd 0.0500 time 0.2425 (0.2443) data time 0.0007 (0.0015) model time 0.2418 (0.2428) loss 3.8793 (3.7626) grad_norm 1.7270 (1.9346) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][920/1251] eta 0:01:20 lr 0.000993 wd 0.0500 time 0.2516 (0.2446) data time 0.0009 (0.0015) model time 0.2507 (0.2431) loss 4.1584 (3.7637) 
grad_norm 1.2947 (1.9353) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][930/1251] eta 0:01:18 lr 0.000993 wd 0.0500 time 0.2561 (0.2445) data time 0.0007 (0.0015) model time 0.2554 (0.2431) loss 2.6999 (3.7569) grad_norm 3.4288 (1.9389) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][940/1251] eta 0:01:16 lr 0.000993 wd 0.0500 time 0.2355 (0.2445) data time 0.0010 (0.0015) model time 0.2345 (0.2430) loss 3.0695 (3.7563) grad_norm 2.0980 (1.9438) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][950/1251] eta 0:01:13 lr 0.000993 wd 0.0500 time 0.2477 (0.2445) data time 0.0008 (0.0015) model time 0.2468 (0.2430) loss 2.6448 (3.7551) grad_norm 2.3446 (1.9441) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][960/1251] eta 0:01:11 lr 0.000993 wd 0.0500 time 0.2401 (0.2445) data time 0.0009 (0.0015) model time 0.2392 (0.2430) loss 4.0893 (3.7554) grad_norm 2.0674 (1.9456) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][970/1251] eta 0:01:08 lr 0.000993 wd 0.0500 time 0.2359 (0.2444) data time 0.0010 (0.0015) model time 0.2349 (0.2430) loss 3.2622 (3.7581) grad_norm 2.1366 (1.9456) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][980/1251] eta 0:01:06 lr 0.000993 wd 0.0500 time 0.2450 (0.2444) data time 0.0007 (0.0015) model time 0.2443 (0.2429) loss 4.8221 (3.7605) grad_norm 1.4583 (1.9450) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][990/1251] eta 0:01:03 lr 
0.000993 wd 0.0500 time 0.2422 (0.2444) data time 0.0007 (0.0015) model time 0.2415 (0.2429) loss 4.4746 (3.7599) grad_norm 2.1179 (1.9427) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1000/1251] eta 0:01:01 lr 0.000993 wd 0.0500 time 0.2429 (0.2444) data time 0.0011 (0.0015) model time 0.2418 (0.2429) loss 2.6010 (3.7583) grad_norm 2.2337 (1.9406) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1010/1251] eta 0:00:58 lr 0.000993 wd 0.0500 time 0.2422 (0.2443) data time 0.0010 (0.0015) model time 0.2413 (0.2429) loss 4.1550 (3.7614) grad_norm 1.8337 (1.9435) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1020/1251] eta 0:00:56 lr 0.000993 wd 0.0500 time 0.2381 (0.2443) data time 0.0010 (0.0015) model time 0.2371 (0.2429) loss 4.4055 (3.7644) grad_norm 1.8345 (1.9454) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1030/1251] eta 0:00:53 lr 0.000993 wd 0.0500 time 0.2427 (0.2443) data time 0.0010 (0.0015) model time 0.2417 (0.2429) loss 3.7061 (3.7657) grad_norm 2.1138 (1.9468) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1040/1251] eta 0:00:51 lr 0.000993 wd 0.0500 time 0.2454 (0.2443) data time 0.0009 (0.0015) model time 0.2444 (0.2428) loss 4.2068 (3.7642) grad_norm 1.7077 (1.9470) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1050/1251] eta 0:00:49 lr 0.000993 wd 0.0500 time 0.2491 (0.2443) data time 0.0009 (0.0015) model time 0.2482 (0.2428) loss 3.1772 (3.7603) grad_norm 1.4639 (1.9455) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 05:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1060/1251] eta 0:00:46 lr 0.000993 wd 0.0500 time 0.2346 (0.2442) data time 0.0009 (0.0015) model time 0.2336 (0.2428) loss 3.1920 (3.7596) grad_norm 2.3487 (1.9433) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1070/1251] eta 0:00:44 lr 0.000993 wd 0.0500 time 0.2431 (0.2442) data time 0.0011 (0.0015) model time 0.2421 (0.2428) loss 4.2078 (3.7605) grad_norm 1.8785 (1.9418) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1080/1251] eta 0:00:41 lr 0.000993 wd 0.0500 time 0.2469 (0.2442) data time 0.0011 (0.0015) model time 0.2458 (0.2428) loss 3.5278 (3.7591) grad_norm 2.0478 (1.9455) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1090/1251] eta 0:00:39 lr 0.000993 wd 0.0500 time 0.2438 (0.2442) data time 0.0009 (0.0014) model time 0.2429 (0.2427) loss 3.4331 (3.7582) grad_norm 1.6953 (1.9485) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1100/1251] eta 0:00:36 lr 0.000993 wd 0.0500 time 0.2521 (0.2442) data time 0.0011 (0.0014) model time 0.2510 (0.2428) loss 2.9173 (3.7582) grad_norm 1.6848 (1.9461) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1110/1251] eta 0:00:34 lr 0.000993 wd 0.0500 time 0.2377 (0.2442) data time 0.0012 (0.0014) model time 0.2366 (0.2427) loss 3.8486 (3.7579) grad_norm 1.7678 (1.9472) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1120/1251] eta 0:00:31 lr 0.000993 wd 0.0500 time 0.2453 (0.2442) data time 0.0011 (0.0014) model 
time 0.2442 (0.2427) loss 3.5488 (3.7568) grad_norm 2.1884 (1.9475) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1130/1251] eta 0:00:29 lr 0.000993 wd 0.0500 time 0.2385 (0.2441) data time 0.0008 (0.0014) model time 0.2377 (0.2427) loss 3.4994 (3.7564) grad_norm 2.4398 (1.9467) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1140/1251] eta 0:00:27 lr 0.000993 wd 0.0500 time 0.2371 (0.2441) data time 0.0010 (0.0014) model time 0.2361 (0.2427) loss 3.7009 (3.7595) grad_norm 1.8241 (1.9476) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1150/1251] eta 0:00:24 lr 0.000993 wd 0.0500 time 0.2429 (0.2441) data time 0.0010 (0.0014) model time 0.2419 (0.2426) loss 3.9186 (3.7583) grad_norm 2.2022 (1.9475) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1160/1251] eta 0:00:22 lr 0.000993 wd 0.0500 time 0.2408 (0.2440) data time 0.0008 (0.0014) model time 0.2400 (0.2426) loss 4.8606 (3.7588) grad_norm 1.6651 (1.9467) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1170/1251] eta 0:00:19 lr 0.000993 wd 0.0500 time 0.2472 (0.2440) data time 0.0007 (0.0014) model time 0.2465 (0.2426) loss 2.8728 (3.7557) grad_norm 2.3868 (1.9467) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1180/1251] eta 0:00:17 lr 0.000993 wd 0.0500 time 0.2462 (0.2440) data time 0.0010 (0.0014) model time 0.2451 (0.2426) loss 3.4924 (3.7531) grad_norm 1.6965 (1.9440) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [34/300][1190/1251] eta 0:00:14 lr 0.000993 wd 0.0500 time 0.2534 (0.2440) data time 0.0009 (0.0014) model time 0.2525 (0.2426) loss 3.6741 (3.7515) grad_norm 2.6396 (1.9450) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1200/1251] eta 0:00:12 lr 0.000993 wd 0.0500 time 0.2414 (0.2440) data time 0.0008 (0.0014) model time 0.2406 (0.2426) loss 4.3500 (3.7529) grad_norm 1.4672 (1.9447) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1210/1251] eta 0:00:10 lr 0.000993 wd 0.0500 time 0.2424 (0.2440) data time 0.0008 (0.0014) model time 0.2416 (0.2426) loss 4.2617 (3.7541) grad_norm 2.5390 (1.9499) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1220/1251] eta 0:00:07 lr 0.000993 wd 0.0500 time 0.2469 (0.2439) data time 0.0012 (0.0014) model time 0.2457 (0.2425) loss 3.0330 (3.7556) grad_norm 1.6472 (1.9515) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1230/1251] eta 0:00:05 lr 0.000993 wd 0.0500 time 0.2373 (0.2439) data time 0.0010 (0.0014) model time 0.2363 (0.2425) loss 4.6889 (3.7564) grad_norm 2.5308 (1.9495) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1240/1251] eta 0:00:02 lr 0.000993 wd 0.0500 time 0.2261 (0.2438) data time 0.0007 (0.0014) model time 0.2254 (0.2424) loss 4.2666 (3.7577) grad_norm 1.6317 (1.9467) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [34/300][1250/1251] eta 0:00:00 lr 0.000993 wd 0.0500 time 0.2271 (0.2437) data time 0.0005 (0.0014) model time 0.2267 (0.2423) loss 4.5835 (3.7594) grad_norm 2.2129 (1.9464) 
loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 34 training takes 0:05:04 [2024-08-26 05:10:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 05:10:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 05:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.466 (0.466) Loss 0.6255 (0.6255) Acc@1 87.891 (87.891) Acc@5 97.266 (97.266) Mem 7379MB [2024-08-26 05:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.111) Loss 1.0527 (0.9755) Acc@1 75.195 (77.894) Acc@5 94.434 (94.354) Mem 7379MB [2024-08-26 05:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.094) Loss 1.4014 (0.9876) Acc@1 68.555 (77.111) Acc@5 87.988 (94.368) Mem 7379MB [2024-08-26 05:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.089) Loss 1.6162 (1.1164) Acc@1 61.035 (74.288) Acc@5 85.254 (92.635) Mem 7379MB [2024-08-26 05:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.5312 (1.2003) Acc@1 64.062 (72.435) Acc@5 86.816 (91.530) Mem 7379MB [2024-08-26 05:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 72.042 Acc@5 91.412 [2024-08-26 05:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 72.0% [2024-08-26 05:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 72.04% [2024-08-26 05:11:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 05:11:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 05:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.467 (0.467) Loss 0.5010 (0.5010) Acc@1 88.477 (88.477) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 05:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.113) Loss 0.8477 (0.8317) Acc@1 80.957 (79.776) Acc@5 95.410 (95.179) Mem 7379MB [2024-08-26 05:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.097) Loss 1.2402 (0.8463) Acc@1 69.531 (79.111) Acc@5 90.234 (95.047) Mem 7379MB [2024-08-26 05:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.091) Loss 1.5244 (0.9847) Acc@1 62.207 (76.005) Acc@5 85.742 (93.211) Mem 7379MB [2024-08-26 05:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.4990 (1.0686) Acc@1 63.379 (74.100) Acc@5 86.719 (92.147) Mem 7379MB [2024-08-26 05:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.824 Acc@5 92.054 [2024-08-26 05:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 73.8% [2024-08-26 05:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 73.82% [2024-08-26 05:11:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 05:11:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 05:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][0/1251] eta 0:14:40 lr 0.000993 wd 0.0500 time 0.7035 (0.7035) data time 0.4786 (0.4786) model time 0.0000 (0.0000) loss 3.5877 (3.5877) grad_norm 3.0662 (3.0662) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][10/1251] eta 0:05:53 lr 0.000993 wd 0.0500 time 0.2389 (0.2846) data time 0.0010 (0.0444) model time 0.0000 (0.0000) loss 4.0763 (3.7206) grad_norm 2.1163 (2.2074) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][20/1251] eta 0:05:24 lr 0.000993 wd 0.0500 time 0.2441 (0.2636) data time 0.0011 (0.0238) model time 0.0000 (0.0000) loss 3.5594 (3.7688) grad_norm 4.6697 (2.3573) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][30/1251] eta 0:05:13 lr 0.000993 wd 0.0500 time 0.2417 (0.2571) data time 0.0012 (0.0165) model time 0.0000 (0.0000) loss 4.4016 (3.8314) grad_norm 1.6961 (2.3854) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][40/1251] eta 0:05:07 lr 0.000993 wd 0.0500 time 0.2555 (0.2537) data time 0.0007 (0.0127) model time 0.0000 (0.0000) loss 3.5804 (3.8152) grad_norm 1.5996 (2.2415) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][50/1251] eta 0:05:11 lr 0.000993 wd 0.0500 time 0.2401 (0.2598) data time 0.0011 (0.0106) model time 0.0000 (0.0000) loss 3.8708 (3.7605) grad_norm 2.0262 (2.1495) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][60/1251] eta 0:05:05 lr 0.000993 wd 0.0500 time 0.2415 (0.2565) data time 0.0008 (0.0090) model time 0.2407 
(0.2387) loss 4.6188 (3.7553) grad_norm 1.6305 (2.0891) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][70/1251] eta 0:05:00 lr 0.000993 wd 0.0500 time 0.2413 (0.2541) data time 0.0008 (0.0079) model time 0.2405 (0.2388) loss 4.2872 (3.7581) grad_norm 1.5657 (2.0513) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][80/1251] eta 0:04:58 lr 0.000993 wd 0.0500 time 0.2413 (0.2550) data time 0.0007 (0.0070) model time 0.2405 (0.2458) loss 4.6066 (3.7968) grad_norm 1.7382 (2.0345) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][90/1251] eta 0:04:54 lr 0.000993 wd 0.0500 time 0.2386 (0.2534) data time 0.0010 (0.0064) model time 0.2377 (0.2443) loss 3.1413 (3.7931) grad_norm 1.3564 (1.9890) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][100/1251] eta 0:04:50 lr 0.000993 wd 0.0500 time 0.2322 (0.2523) data time 0.0008 (0.0058) model time 0.2315 (0.2438) loss 4.0593 (3.7656) grad_norm 2.2804 (1.9661) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][110/1251] eta 0:04:46 lr 0.000993 wd 0.0500 time 0.2503 (0.2513) data time 0.0010 (0.0054) model time 0.2493 (0.2432) loss 3.7627 (3.7930) grad_norm 1.4674 (1.9449) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][120/1251] eta 0:04:43 lr 0.000993 wd 0.0500 time 0.2457 (0.2507) data time 0.0008 (0.0050) model time 0.2449 (0.2431) loss 3.9968 (3.7991) grad_norm 2.0295 (1.9357) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[35/300][130/1251] eta 0:04:39 lr 0.000993 wd 0.0500 time 0.2369 (0.2497) data time 0.0009 (0.0047) model time 0.2360 (0.2423) loss 4.6646 (3.7897) grad_norm 1.7549 (1.9273) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][140/1251] eta 0:04:36 lr 0.000993 wd 0.0500 time 0.2391 (0.2491) data time 0.0010 (0.0045) model time 0.2381 (0.2420) loss 3.4869 (3.7867) grad_norm 1.9942 (1.9127) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][150/1251] eta 0:04:33 lr 0.000993 wd 0.0500 time 0.2345 (0.2485) data time 0.0012 (0.0042) model time 0.2333 (0.2417) loss 3.5885 (3.7890) grad_norm 2.3434 (1.9073) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][160/1251] eta 0:04:30 lr 0.000993 wd 0.0500 time 0.2450 (0.2481) data time 0.0009 (0.0040) model time 0.2440 (0.2417) loss 4.0029 (3.7860) grad_norm 1.6327 (1.9078) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][170/1251] eta 0:04:27 lr 0.000993 wd 0.0500 time 0.2521 (0.2478) data time 0.0010 (0.0038) model time 0.2511 (0.2417) loss 4.1851 (3.7956) grad_norm 1.4052 (1.9089) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][180/1251] eta 0:04:25 lr 0.000993 wd 0.0500 time 0.2406 (0.2474) data time 0.0009 (0.0037) model time 0.2397 (0.2416) loss 4.0668 (3.7910) grad_norm 1.3406 (1.9182) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][190/1251] eta 0:04:22 lr 0.000993 wd 0.0500 time 0.2400 (0.2471) data time 0.0009 (0.0035) model time 0.2391 (0.2415) loss 4.5021 (3.7769) grad_norm 1.6009 (1.9064) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][200/1251] eta 0:04:20 lr 0.000993 wd 0.0500 time 0.2505 (0.2479) data time 0.0010 (0.0034) model time 0.2496 (0.2429) loss 4.2201 (3.7677) grad_norm 1.5319 (1.9120) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][210/1251] eta 0:04:17 lr 0.000993 wd 0.0500 time 0.2370 (0.2475) data time 0.0008 (0.0033) model time 0.2361 (0.2427) loss 2.7113 (3.7608) grad_norm 1.7834 (1.9070) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][220/1251] eta 0:04:14 lr 0.000993 wd 0.0500 time 0.2382 (0.2473) data time 0.0010 (0.0033) model time 0.2371 (0.2425) loss 3.3766 (3.7623) grad_norm 2.2492 (1.9206) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][230/1251] eta 0:04:12 lr 0.000993 wd 0.0500 time 0.2283 (0.2469) data time 0.0009 (0.0032) model time 0.2274 (0.2422) loss 4.2393 (3.7639) grad_norm 2.5436 (1.9299) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][240/1251] eta 0:04:09 lr 0.000993 wd 0.0500 time 0.2467 (0.2468) data time 0.0010 (0.0031) model time 0.2457 (0.2422) loss 4.3639 (3.7709) grad_norm 2.7081 (1.9306) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][250/1251] eta 0:04:06 lr 0.000993 wd 0.0500 time 0.2407 (0.2465) data time 0.0010 (0.0030) model time 0.2397 (0.2421) loss 3.6311 (3.7690) grad_norm 2.0850 (1.9349) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][260/1251] eta 0:04:04 lr 0.000993 wd 0.0500 time 0.2429 (0.2463) 
data time 0.0010 (0.0029) model time 0.2420 (0.2419) loss 3.8586 (3.7560) grad_norm 2.4190 (1.9338) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][270/1251] eta 0:04:01 lr 0.000993 wd 0.0500 time 0.2393 (0.2461) data time 0.0011 (0.0029) model time 0.2382 (0.2418) loss 4.0694 (3.7643) grad_norm 1.8258 (1.9260) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][280/1251] eta 0:03:58 lr 0.000993 wd 0.0500 time 0.2542 (0.2460) data time 0.0010 (0.0028) model time 0.2532 (0.2419) loss 4.3337 (3.7661) grad_norm 2.6420 (1.9225) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][290/1251] eta 0:03:56 lr 0.000993 wd 0.0500 time 0.2409 (0.2458) data time 0.0007 (0.0027) model time 0.2402 (0.2418) loss 3.0320 (3.7648) grad_norm 2.1117 (1.9359) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][300/1251] eta 0:03:54 lr 0.000993 wd 0.0500 time 0.2391 (0.2471) data time 0.0007 (0.0027) model time 0.2384 (0.2434) loss 3.5744 (3.7603) grad_norm 2.4034 (1.9420) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][310/1251] eta 0:03:52 lr 0.000993 wd 0.0500 time 0.2447 (0.2469) data time 0.0009 (0.0026) model time 0.2438 (0.2434) loss 3.2885 (3.7685) grad_norm 1.9594 (1.9405) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][320/1251] eta 0:03:49 lr 0.000993 wd 0.0500 time 0.2352 (0.2467) data time 0.0009 (0.0026) model time 0.2343 (0.2432) loss 3.4100 (3.7625) grad_norm 1.7265 (1.9346) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:29 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [35/300][330/1251] eta 0:03:47 lr 0.000993 wd 0.0500 time 0.2405 (0.2466) data time 0.0009 (0.0025) model time 0.2395 (0.2431) loss 3.7752 (3.7566) grad_norm 1.6294 (1.9276) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][340/1251] eta 0:03:44 lr 0.000993 wd 0.0500 time 0.2416 (0.2464) data time 0.0007 (0.0025) model time 0.2408 (0.2430) loss 3.8202 (3.7551) grad_norm 1.6518 (1.9242) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][350/1251] eta 0:03:41 lr 0.000993 wd 0.0500 time 0.2384 (0.2463) data time 0.0008 (0.0024) model time 0.2376 (0.2429) loss 3.5782 (3.7528) grad_norm 1.6371 (1.9211) loss_scale 32768.0000 (16710.7464) mem 7379MB [2024-08-26 05:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][360/1251] eta 0:03:39 lr 0.000993 wd 0.0500 time 0.2353 (0.2461) data time 0.0008 (0.0024) model time 0.2345 (0.2428) loss 3.9865 (3.7456) grad_norm 2.8598 (1.9227) loss_scale 32768.0000 (17155.5457) mem 7379MB [2024-08-26 05:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][370/1251] eta 0:03:36 lr 0.000993 wd 0.0500 time 0.2398 (0.2459) data time 0.0008 (0.0024) model time 0.2390 (0.2426) loss 4.6930 (3.7421) grad_norm 1.8479 (inf) loss_scale 16384.0000 (17223.0728) mem 7379MB [2024-08-26 05:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][380/1251] eta 0:03:34 lr 0.000993 wd 0.0500 time 0.2390 (0.2458) data time 0.0012 (0.0023) model time 0.2377 (0.2426) loss 2.9934 (3.7392) grad_norm 1.4483 (inf) loss_scale 16384.0000 (17201.0499) mem 7379MB [2024-08-26 05:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][390/1251] eta 0:03:31 lr 0.000993 wd 0.0500 time 0.2370 (0.2456) data time 0.0009 (0.0023) model time 0.2361 (0.2425) loss 2.1913 (3.7379) grad_norm 
1.3659 (inf) loss_scale 16384.0000 (17180.1535) mem 7379MB [2024-08-26 05:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][400/1251] eta 0:03:28 lr 0.000993 wd 0.0500 time 0.2386 (0.2455) data time 0.0008 (0.0023) model time 0.2378 (0.2424) loss 4.6180 (3.7382) grad_norm 2.0529 (inf) loss_scale 16384.0000 (17160.2993) mem 7379MB [2024-08-26 05:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][410/1251] eta 0:03:26 lr 0.000993 wd 0.0500 time 0.2395 (0.2454) data time 0.0007 (0.0022) model time 0.2387 (0.2423) loss 2.9155 (3.7429) grad_norm 3.2534 (inf) loss_scale 16384.0000 (17141.4112) mem 7379MB [2024-08-26 05:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][420/1251] eta 0:03:23 lr 0.000993 wd 0.0500 time 0.2422 (0.2453) data time 0.0010 (0.0022) model time 0.2412 (0.2423) loss 3.7777 (3.7398) grad_norm 2.0770 (inf) loss_scale 16384.0000 (17123.4204) mem 7379MB [2024-08-26 05:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][430/1251] eta 0:03:21 lr 0.000993 wd 0.0500 time 0.2443 (0.2452) data time 0.0010 (0.0022) model time 0.2433 (0.2422) loss 3.8936 (3.7471) grad_norm 1.5572 (inf) loss_scale 16384.0000 (17106.2645) mem 7379MB [2024-08-26 05:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][440/1251] eta 0:03:18 lr 0.000993 wd 0.0500 time 0.2401 (0.2452) data time 0.0010 (0.0022) model time 0.2391 (0.2422) loss 3.9402 (3.7529) grad_norm 1.3202 (inf) loss_scale 16384.0000 (17089.8866) mem 7379MB [2024-08-26 05:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][450/1251] eta 0:03:16 lr 0.000993 wd 0.0500 time 0.2385 (0.2451) data time 0.0008 (0.0021) model time 0.2378 (0.2422) loss 2.8695 (3.7477) grad_norm 1.6545 (inf) loss_scale 16384.0000 (17074.2350) mem 7379MB [2024-08-26 05:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][460/1251] eta 0:03:13 lr 0.000993 wd 0.0500 time 0.2376 
(0.2450) data time 0.0009 (0.0021) model time 0.2367 (0.2421) loss 3.5655 (3.7504) grad_norm 1.9261 (inf) loss_scale 16384.0000 (17059.2625) mem 7379MB [2024-08-26 05:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][470/1251] eta 0:03:11 lr 0.000993 wd 0.0500 time 0.2381 (0.2449) data time 0.0010 (0.0021) model time 0.2370 (0.2420) loss 2.9800 (3.7552) grad_norm 2.4202 (inf) loss_scale 16384.0000 (17044.9257) mem 7379MB [2024-08-26 05:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][480/1251] eta 0:03:08 lr 0.000993 wd 0.0500 time 0.2388 (0.2448) data time 0.0008 (0.0021) model time 0.2380 (0.2420) loss 4.1964 (3.7607) grad_norm 1.8089 (inf) loss_scale 16384.0000 (17031.1850) mem 7379MB [2024-08-26 05:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][490/1251] eta 0:03:06 lr 0.000993 wd 0.0500 time 0.2414 (0.2447) data time 0.0009 (0.0020) model time 0.2405 (0.2419) loss 4.2077 (3.7583) grad_norm 1.7467 (inf) loss_scale 16384.0000 (17018.0041) mem 7379MB [2024-08-26 05:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][500/1251] eta 0:03:03 lr 0.000993 wd 0.0500 time 0.2308 (0.2447) data time 0.0011 (0.0020) model time 0.2297 (0.2419) loss 3.7451 (3.7620) grad_norm 2.4972 (inf) loss_scale 16384.0000 (17005.3493) mem 7379MB [2024-08-26 05:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][510/1251] eta 0:03:01 lr 0.000993 wd 0.0500 time 0.2459 (0.2446) data time 0.0010 (0.0020) model time 0.2449 (0.2419) loss 3.6465 (3.7606) grad_norm 2.0902 (inf) loss_scale 16384.0000 (16993.1898) mem 7379MB [2024-08-26 05:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][520/1251] eta 0:02:58 lr 0.000993 wd 0.0500 time 0.2407 (0.2445) data time 0.0007 (0.0020) model time 0.2400 (0.2418) loss 4.1529 (3.7558) grad_norm 1.5281 (inf) loss_scale 16384.0000 (16981.4971) mem 7379MB [2024-08-26 05:13:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [35/300][530/1251] eta 0:02:56 lr 0.000993 wd 0.0500 time 0.2390 (0.2445) data time 0.0009 (0.0020) model time 0.2380 (0.2418) loss 3.7733 (3.7522) grad_norm 1.9148 (inf) loss_scale 16384.0000 (16970.2448) mem 7379MB [2024-08-26 05:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][540/1251] eta 0:02:53 lr 0.000993 wd 0.0500 time 0.2433 (0.2444) data time 0.0010 (0.0020) model time 0.2424 (0.2418) loss 3.7772 (3.7508) grad_norm 1.7197 (inf) loss_scale 16384.0000 (16959.4085) mem 7379MB [2024-08-26 05:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][550/1251] eta 0:02:51 lr 0.000993 wd 0.0500 time 0.2480 (0.2444) data time 0.0008 (0.0019) model time 0.2472 (0.2418) loss 2.6619 (3.7495) grad_norm 1.9102 (inf) loss_scale 16384.0000 (16948.9655) mem 7379MB [2024-08-26 05:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][560/1251] eta 0:02:48 lr 0.000993 wd 0.0500 time 0.2390 (0.2444) data time 0.0009 (0.0019) model time 0.2381 (0.2418) loss 3.8284 (3.7524) grad_norm 1.5402 (inf) loss_scale 16384.0000 (16938.8948) mem 7379MB [2024-08-26 05:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][570/1251] eta 0:02:46 lr 0.000993 wd 0.0500 time 0.2387 (0.2444) data time 0.0009 (0.0019) model time 0.2378 (0.2418) loss 4.7057 (3.7514) grad_norm 1.6197 (inf) loss_scale 16384.0000 (16929.1769) mem 7379MB [2024-08-26 05:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][580/1251] eta 0:02:43 lr 0.000993 wd 0.0500 time 0.2459 (0.2443) data time 0.0009 (0.0019) model time 0.2450 (0.2418) loss 4.2702 (3.7585) grad_norm 2.3986 (inf) loss_scale 16384.0000 (16919.7935) mem 7379MB [2024-08-26 05:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][590/1251] eta 0:02:41 lr 0.000993 wd 0.0500 time 0.2401 (0.2443) data time 0.0009 (0.0019) model time 0.2392 (0.2418) loss 4.2786 (3.7613) grad_norm 1.8433 (inf) 
loss_scale 16384.0000 (16910.7276) mem 7379MB [2024-08-26 05:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][600/1251] eta 0:02:39 lr 0.000993 wd 0.0500 time 0.2421 (0.2445) data time 0.0010 (0.0019) model time 0.2412 (0.2420) loss 4.1063 (3.7591) grad_norm 2.1211 (inf) loss_scale 16384.0000 (16901.9634) mem 7379MB [2024-08-26 05:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][610/1251] eta 0:02:36 lr 0.000993 wd 0.0500 time 0.2380 (0.2445) data time 0.0011 (0.0019) model time 0.2370 (0.2420) loss 4.5236 (3.7617) grad_norm 1.9255 (inf) loss_scale 16384.0000 (16893.4861) mem 7379MB [2024-08-26 05:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][620/1251] eta 0:02:34 lr 0.000993 wd 0.0500 time 0.2369 (0.2444) data time 0.0010 (0.0018) model time 0.2359 (0.2420) loss 4.1749 (3.7654) grad_norm 1.7669 (inf) loss_scale 16384.0000 (16885.2818) mem 7379MB [2024-08-26 05:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][630/1251] eta 0:02:31 lr 0.000993 wd 0.0500 time 0.2404 (0.2444) data time 0.0014 (0.0018) model time 0.2389 (0.2419) loss 2.4610 (3.7631) grad_norm 1.3911 (inf) loss_scale 16384.0000 (16877.3376) mem 7379MB [2024-08-26 05:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][640/1251] eta 0:02:29 lr 0.000993 wd 0.0500 time 0.2420 (0.2444) data time 0.0009 (0.0018) model time 0.2411 (0.2420) loss 4.7330 (3.7689) grad_norm 1.4425 (inf) loss_scale 16384.0000 (16869.6412) mem 7379MB [2024-08-26 05:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][650/1251] eta 0:02:26 lr 0.000993 wd 0.0500 time 0.2389 (0.2443) data time 0.0009 (0.0018) model time 0.2380 (0.2419) loss 3.4875 (3.7675) grad_norm 2.3223 (inf) loss_scale 16384.0000 (16862.1813) mem 7379MB [2024-08-26 05:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][660/1251] eta 0:02:24 lr 0.000993 wd 0.0500 time 0.2447 (0.2443) data time 
0.0010 (0.0018) model time 0.2437 (0.2419) loss 2.5302 (3.7612) grad_norm 1.6762 (inf) loss_scale 16384.0000 (16854.9470) mem 7379MB [2024-08-26 05:13:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][670/1251] eta 0:02:21 lr 0.000992 wd 0.0500 time 0.2399 (0.2443) data time 0.0007 (0.0018) model time 0.2391 (0.2419) loss 2.4876 (3.7606) grad_norm 1.8790 (inf) loss_scale 16384.0000 (16847.9285) mem 7379MB [2024-08-26 05:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][680/1251] eta 0:02:19 lr 0.000992 wd 0.0500 time 0.2419 (0.2442) data time 0.0007 (0.0018) model time 0.2412 (0.2419) loss 3.9405 (3.7601) grad_norm 2.6855 (inf) loss_scale 16384.0000 (16841.1160) mem 7379MB [2024-08-26 05:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][690/1251] eta 0:02:16 lr 0.000992 wd 0.0500 time 0.2518 (0.2442) data time 0.0012 (0.0018) model time 0.2506 (0.2419) loss 4.0027 (3.7596) grad_norm 2.6073 (inf) loss_scale 16384.0000 (16834.5007) mem 7379MB [2024-08-26 05:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][700/1251] eta 0:02:14 lr 0.000992 wd 0.0500 time 0.2416 (0.2441) data time 0.0010 (0.0018) model time 0.2407 (0.2418) loss 3.1531 (3.7595) grad_norm 1.5604 (inf) loss_scale 16384.0000 (16828.0742) mem 7379MB [2024-08-26 05:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][710/1251] eta 0:02:12 lr 0.000992 wd 0.0500 time 0.2354 (0.2441) data time 0.0007 (0.0018) model time 0.2347 (0.2418) loss 2.3643 (3.7571) grad_norm 1.7219 (inf) loss_scale 16384.0000 (16821.8284) mem 7379MB [2024-08-26 05:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][720/1251] eta 0:02:09 lr 0.000992 wd 0.0500 time 0.2491 (0.2443) data time 0.0008 (0.0017) model time 0.2483 (0.2420) loss 2.4776 (3.7540) grad_norm 1.6091 (inf) loss_scale 16384.0000 (16815.7559) mem 7379MB [2024-08-26 05:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [35/300][730/1251] eta 0:02:07 lr 0.000992 wd 0.0500 time 0.2410 (0.2443) data time 0.0011 (0.0017) model time 0.2399 (0.2420) loss 3.9676 (3.7515) grad_norm 1.7378 (inf) loss_scale 16384.0000 (16809.8495) mem 7379MB [2024-08-26 05:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][740/1251] eta 0:02:04 lr 0.000992 wd 0.0500 time 0.2468 (0.2442) data time 0.0009 (0.0017) model time 0.2458 (0.2420) loss 3.5731 (3.7512) grad_norm 3.4369 (inf) loss_scale 16384.0000 (16804.1026) mem 7379MB [2024-08-26 05:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][750/1251] eta 0:02:02 lr 0.000992 wd 0.0500 time 0.2381 (0.2442) data time 0.0008 (0.0017) model time 0.2372 (0.2420) loss 3.9683 (3.7514) grad_norm 1.5960 (inf) loss_scale 16384.0000 (16798.5087) mem 7379MB [2024-08-26 05:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][760/1251] eta 0:01:59 lr 0.000992 wd 0.0500 time 0.2364 (0.2442) data time 0.0009 (0.0017) model time 0.2356 (0.2420) loss 4.2311 (3.7516) grad_norm 1.6621 (inf) loss_scale 16384.0000 (16793.0618) mem 7379MB [2024-08-26 05:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][770/1251] eta 0:01:57 lr 0.000992 wd 0.0500 time 0.2485 (0.2442) data time 0.0010 (0.0017) model time 0.2475 (0.2420) loss 4.0779 (3.7517) grad_norm 1.9087 (inf) loss_scale 16384.0000 (16787.7562) mem 7379MB [2024-08-26 05:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][780/1251] eta 0:01:54 lr 0.000992 wd 0.0500 time 0.2492 (0.2441) data time 0.0011 (0.0017) model time 0.2482 (0.2420) loss 4.0855 (3.7517) grad_norm 1.5108 (inf) loss_scale 16384.0000 (16782.5864) mem 7379MB [2024-08-26 05:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][790/1251] eta 0:01:52 lr 0.000992 wd 0.0500 time 0.2314 (0.2441) data time 0.0009 (0.0017) model time 0.2305 (0.2420) loss 2.6456 (3.7531) grad_norm 2.2945 (inf) loss_scale 16384.0000 
(16777.5474) mem 7379MB [2024-08-26 05:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][800/1251] eta 0:01:50 lr 0.000992 wd 0.0500 time 0.2422 (0.2441) data time 0.0011 (0.0017) model time 0.2411 (0.2420) loss 4.9046 (3.7509) grad_norm 1.2414 (inf) loss_scale 16384.0000 (16772.6342) mem 7379MB [2024-08-26 05:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][810/1251] eta 0:01:47 lr 0.000992 wd 0.0500 time 0.2605 (0.2441) data time 0.0011 (0.0017) model time 0.2594 (0.2419) loss 3.5412 (3.7504) grad_norm 1.8446 (inf) loss_scale 16384.0000 (16767.8422) mem 7379MB [2024-08-26 05:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][820/1251] eta 0:01:45 lr 0.000992 wd 0.0500 time 0.2375 (0.2441) data time 0.0010 (0.0017) model time 0.2365 (0.2419) loss 3.5526 (3.7496) grad_norm 1.6095 (inf) loss_scale 16384.0000 (16763.1669) mem 7379MB [2024-08-26 05:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][830/1251] eta 0:01:42 lr 0.000992 wd 0.0500 time 0.2401 (0.2440) data time 0.0012 (0.0017) model time 0.2389 (0.2419) loss 2.9046 (3.7526) grad_norm 1.6024 (inf) loss_scale 16384.0000 (16758.6041) mem 7379MB [2024-08-26 05:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][840/1251] eta 0:01:40 lr 0.000992 wd 0.0500 time 0.2395 (0.2440) data time 0.0010 (0.0017) model time 0.2385 (0.2419) loss 3.6926 (3.7546) grad_norm 1.8201 (inf) loss_scale 8192.0000 (16705.4459) mem 7379MB [2024-08-26 05:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][850/1251] eta 0:01:37 lr 0.000992 wd 0.0500 time 0.2432 (0.2439) data time 0.0007 (0.0017) model time 0.2425 (0.2418) loss 4.0254 (3.7521) grad_norm 1.9451 (inf) loss_scale 8192.0000 (16605.4054) mem 7379MB [2024-08-26 05:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][860/1251] eta 0:01:35 lr 0.000992 wd 0.0500 time 0.2417 (0.2439) data time 0.0009 (0.0017) model 
time 0.2408 (0.2418) loss 3.1097 (3.7527) grad_norm 1.5370 (inf) loss_scale 8192.0000 (16507.6887) mem 7379MB [2024-08-26 05:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][870/1251] eta 0:01:32 lr 0.000992 wd 0.0500 time 0.2463 (0.2439) data time 0.0009 (0.0017) model time 0.2454 (0.2418) loss 3.5935 (3.7519) grad_norm 1.3638 (inf) loss_scale 8192.0000 (16412.2158) mem 7379MB [2024-08-26 05:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][880/1251] eta 0:01:30 lr 0.000992 wd 0.0500 time 0.2456 (0.2438) data time 0.0011 (0.0017) model time 0.2445 (0.2417) loss 3.3331 (3.7548) grad_norm 1.7517 (inf) loss_scale 8192.0000 (16318.9103) mem 7379MB [2024-08-26 05:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][890/1251] eta 0:01:28 lr 0.000992 wd 0.0500 time 0.2548 (0.2438) data time 0.0010 (0.0016) model time 0.2538 (0.2417) loss 4.1404 (3.7518) grad_norm 1.9261 (inf) loss_scale 8192.0000 (16227.6992) mem 7379MB [2024-08-26 05:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][900/1251] eta 0:01:25 lr 0.000992 wd 0.0500 time 0.2415 (0.2438) data time 0.0009 (0.0016) model time 0.2405 (0.2417) loss 2.7654 (3.7489) grad_norm 1.2630 (inf) loss_scale 8192.0000 (16138.5128) mem 7379MB [2024-08-26 05:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][910/1251] eta 0:01:23 lr 0.000992 wd 0.0500 time 0.2400 (0.2437) data time 0.0009 (0.0016) model time 0.2391 (0.2416) loss 2.9860 (3.7496) grad_norm 2.1594 (inf) loss_scale 8192.0000 (16051.2843) mem 7379MB [2024-08-26 05:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][920/1251] eta 0:01:20 lr 0.000992 wd 0.0500 time 0.2317 (0.2437) data time 0.0009 (0.0016) model time 0.2308 (0.2416) loss 3.2253 (3.7494) grad_norm 1.4963 (inf) loss_scale 8192.0000 (15965.9501) mem 7379MB [2024-08-26 05:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][930/1251] eta 
0:01:18 lr 0.000992 wd 0.0500 time 0.2445 (0.2437) data time 0.0011 (0.0016) model time 0.2434 (0.2416) loss 3.6756 (3.7495) grad_norm 2.8573 (inf) loss_scale 8192.0000 (15882.4490) mem 7379MB [2024-08-26 05:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][940/1251] eta 0:01:15 lr 0.000992 wd 0.0500 time 0.2411 (0.2437) data time 0.0010 (0.0016) model time 0.2401 (0.2416) loss 3.9400 (3.7519) grad_norm 2.2827 (inf) loss_scale 8192.0000 (15800.7226) mem 7379MB [2024-08-26 05:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][950/1251] eta 0:01:13 lr 0.000992 wd 0.0500 time 0.2406 (0.2437) data time 0.0011 (0.0016) model time 0.2395 (0.2416) loss 3.3489 (3.7486) grad_norm 1.6772 (inf) loss_scale 8192.0000 (15720.7150) mem 7379MB [2024-08-26 05:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][960/1251] eta 0:01:10 lr 0.000992 wd 0.0500 time 0.2416 (0.2437) data time 0.0008 (0.0016) model time 0.2408 (0.2416) loss 4.2588 (3.7504) grad_norm 2.3113 (inf) loss_scale 8192.0000 (15642.3725) mem 7379MB [2024-08-26 05:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][970/1251] eta 0:01:08 lr 0.000992 wd 0.0500 time 0.2533 (0.2436) data time 0.0010 (0.0016) model time 0.2523 (0.2416) loss 3.9469 (3.7530) grad_norm 1.5651 (inf) loss_scale 8192.0000 (15565.6437) mem 7379MB [2024-08-26 05:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][980/1251] eta 0:01:06 lr 0.000992 wd 0.0500 time 0.2448 (0.2441) data time 0.0011 (0.0016) model time 0.2437 (0.2421) loss 3.1692 (3.7526) grad_norm 1.5924 (inf) loss_scale 8192.0000 (15490.4791) mem 7379MB [2024-08-26 05:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][990/1251] eta 0:01:03 lr 0.000992 wd 0.0500 time 0.2404 (0.2440) data time 0.0008 (0.0016) model time 0.2397 (0.2421) loss 4.0946 (3.7513) grad_norm 1.8088 (inf) loss_scale 8192.0000 (15416.8315) mem 7379MB [2024-08-26 05:15:12 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1000/1251] eta 0:01:01 lr 0.000992 wd 0.0500 time 0.2453 (0.2440) data time 0.0007 (0.0016) model time 0.2446 (0.2421) loss 4.3717 (3.7523) grad_norm 1.7914 (inf) loss_scale 8192.0000 (15344.6553) mem 7379MB [2024-08-26 05:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1010/1251] eta 0:00:58 lr 0.000992 wd 0.0500 time 0.2415 (0.2440) data time 0.0011 (0.0016) model time 0.2404 (0.2420) loss 3.4941 (3.7517) grad_norm 1.6583 (inf) loss_scale 8192.0000 (15273.9070) mem 7379MB [2024-08-26 05:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1020/1251] eta 0:00:56 lr 0.000992 wd 0.0500 time 0.2465 (0.2440) data time 0.0010 (0.0016) model time 0.2455 (0.2420) loss 4.0650 (3.7539) grad_norm 1.8877 (inf) loss_scale 8192.0000 (15204.5446) mem 7379MB [2024-08-26 05:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1030/1251] eta 0:00:53 lr 0.000992 wd 0.0500 time 0.2508 (0.2440) data time 0.0011 (0.0016) model time 0.2496 (0.2420) loss 2.6883 (3.7533) grad_norm 1.5957 (inf) loss_scale 8192.0000 (15136.5276) mem 7379MB [2024-08-26 05:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1040/1251] eta 0:00:51 lr 0.000992 wd 0.0500 time 0.2360 (0.2439) data time 0.0008 (0.0016) model time 0.2352 (0.2420) loss 3.1367 (3.7525) grad_norm 2.5817 (inf) loss_scale 8192.0000 (15069.8175) mem 7379MB [2024-08-26 05:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1050/1251] eta 0:00:49 lr 0.000992 wd 0.0500 time 0.2445 (0.2439) data time 0.0010 (0.0016) model time 0.2435 (0.2420) loss 2.8897 (3.7526) grad_norm 1.7974 (inf) loss_scale 8192.0000 (15004.3768) mem 7379MB [2024-08-26 05:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1060/1251] eta 0:00:46 lr 0.000992 wd 0.0500 time 0.2467 (0.2439) data time 0.0007 (0.0016) model time 0.2460 (0.2420) loss 4.0595 (3.7519) 
grad_norm 1.6383 (inf) loss_scale 8192.0000 (14940.1697) mem 7379MB [2024-08-26 05:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1070/1251] eta 0:00:44 lr 0.000992 wd 0.0500 time 0.2379 (0.2439) data time 0.0010 (0.0016) model time 0.2370 (0.2419) loss 2.8840 (3.7514) grad_norm 1.3544 (inf) loss_scale 8192.0000 (14877.1615) mem 7379MB [2024-08-26 05:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1080/1251] eta 0:00:41 lr 0.000992 wd 0.0500 time 0.2365 (0.2438) data time 0.0012 (0.0016) model time 0.2353 (0.2419) loss 4.3798 (3.7536) grad_norm 1.6722 (inf) loss_scale 8192.0000 (14815.3191) mem 7379MB [2024-08-26 05:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1090/1251] eta 0:00:39 lr 0.000992 wd 0.0500 time 0.2480 (0.2438) data time 0.0010 (0.0015) model time 0.2470 (0.2419) loss 3.6324 (3.7534) grad_norm 2.7949 (inf) loss_scale 8192.0000 (14754.6104) mem 7379MB [2024-08-26 05:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1100/1251] eta 0:00:36 lr 0.000992 wd 0.0500 time 0.2444 (0.2438) data time 0.0009 (0.0015) model time 0.2436 (0.2419) loss 3.6691 (3.7520) grad_norm 1.7040 (inf) loss_scale 8192.0000 (14695.0045) mem 7379MB [2024-08-26 05:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1110/1251] eta 0:00:34 lr 0.000992 wd 0.0500 time 0.2338 (0.2437) data time 0.0009 (0.0016) model time 0.2329 (0.2419) loss 4.3906 (3.7543) grad_norm 2.9287 (inf) loss_scale 8192.0000 (14636.4716) mem 7379MB [2024-08-26 05:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1120/1251] eta 0:00:31 lr 0.000992 wd 0.0500 time 0.2355 (0.2437) data time 0.0009 (0.0015) model time 0.2346 (0.2418) loss 4.1411 (3.7545) grad_norm 1.3762 (inf) loss_scale 8192.0000 (14578.9831) mem 7379MB [2024-08-26 05:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1130/1251] eta 0:00:29 lr 0.000992 wd 0.0500 time 
0.2363 (0.2439) data time 0.0011 (0.0015) model time 0.2352 (0.2420) loss 2.9501 (3.7572) grad_norm 1.4861 (inf) loss_scale 8192.0000 (14522.5111) mem 7379MB [2024-08-26 05:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1140/1251] eta 0:00:27 lr 0.000992 wd 0.0500 time 0.2408 (0.2438) data time 0.0018 (0.0015) model time 0.2390 (0.2420) loss 3.9432 (3.7582) grad_norm 1.8899 (inf) loss_scale 8192.0000 (14467.0289) mem 7379MB [2024-08-26 05:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1150/1251] eta 0:00:24 lr 0.000992 wd 0.0500 time 0.2457 (0.2438) data time 0.0008 (0.0015) model time 0.2449 (0.2420) loss 4.5348 (3.7593) grad_norm 4.3029 (inf) loss_scale 8192.0000 (14412.5109) mem 7379MB [2024-08-26 05:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1160/1251] eta 0:00:22 lr 0.000992 wd 0.0500 time 0.2448 (0.2438) data time 0.0010 (0.0015) model time 0.2439 (0.2420) loss 3.7137 (3.7621) grad_norm 1.9426 (inf) loss_scale 8192.0000 (14358.9320) mem 7379MB [2024-08-26 05:15:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1170/1251] eta 0:00:19 lr 0.000992 wd 0.0500 time 0.2451 (0.2438) data time 0.0008 (0.0015) model time 0.2443 (0.2420) loss 2.9042 (3.7607) grad_norm 1.9461 (inf) loss_scale 8192.0000 (14306.2681) mem 7379MB [2024-08-26 05:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1180/1251] eta 0:00:17 lr 0.000992 wd 0.0500 time 0.2361 (0.2438) data time 0.0010 (0.0015) model time 0.2350 (0.2420) loss 3.2149 (3.7595) grad_norm 1.7816 (inf) loss_scale 8192.0000 (14254.4962) mem 7379MB [2024-08-26 05:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1190/1251] eta 0:00:14 lr 0.000992 wd 0.0500 time 0.2491 (0.2438) data time 0.0010 (0.0015) model time 0.2481 (0.2419) loss 3.7245 (3.7590) grad_norm 1.2865 (inf) loss_scale 8192.0000 (14203.5936) mem 7379MB [2024-08-26 05:16:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [35/300][1200/1251] eta 0:00:12 lr 0.000992 wd 0.0500 time 0.2357 (0.2438) data time 0.0009 (0.0015) model time 0.2348 (0.2419) loss 3.1016 (3.7600) grad_norm 1.9351 (inf) loss_scale 8192.0000 (14153.5387) mem 7379MB [2024-08-26 05:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1210/1251] eta 0:00:09 lr 0.000992 wd 0.0500 time 0.2359 (0.2438) data time 0.0009 (0.0015) model time 0.2350 (0.2419) loss 4.4673 (3.7602) grad_norm 2.3743 (inf) loss_scale 8192.0000 (14104.3105) mem 7379MB [2024-08-26 05:16:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1220/1251] eta 0:00:07 lr 0.000992 wd 0.0500 time 0.2437 (0.2438) data time 0.0012 (0.0015) model time 0.2425 (0.2419) loss 4.2799 (3.7605) grad_norm 1.4856 (inf) loss_scale 8192.0000 (14055.8886) mem 7379MB [2024-08-26 05:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1230/1251] eta 0:00:05 lr 0.000992 wd 0.0500 time 0.2370 (0.2441) data time 0.0011 (0.0015) model time 0.2358 (0.2423) loss 3.7421 (3.7599) grad_norm 2.0328 (inf) loss_scale 8192.0000 (14008.2535) mem 7379MB [2024-08-26 05:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1240/1251] eta 0:00:02 lr 0.000992 wd 0.0500 time 0.4446 (0.2442) data time 0.0007 (0.0015) model time 0.4439 (0.2424) loss 4.3801 (3.7620) grad_norm 2.0612 (inf) loss_scale 8192.0000 (13961.3860) mem 7379MB [2024-08-26 05:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [35/300][1250/1251] eta 0:00:00 lr 0.000992 wd 0.0500 time 0.2229 (0.2440) data time 0.0007 (0.0015) model time 0.2222 (0.2422) loss 4.2127 (3.7614) grad_norm 1.3814 (inf) loss_scale 8192.0000 (13915.2678) mem 7379MB [2024-08-26 05:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 35 training takes 0:05:05 [2024-08-26 05:16:13 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth 
saving...... [2024-08-26 05:16:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 05:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.469 (0.469) Loss 0.6436 (0.6436) Acc@1 86.719 (86.719) Acc@5 96.582 (96.582) Mem 7379MB [2024-08-26 05:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.112) Loss 0.9414 (0.9303) Acc@1 79.395 (78.453) Acc@5 94.629 (94.824) Mem 7379MB [2024-08-26 05:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.096) Loss 1.3223 (0.9525) Acc@1 69.043 (77.753) Acc@5 89.160 (94.685) Mem 7379MB [2024-08-26 05:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.090) Loss 1.6768 (1.0947) Acc@1 60.938 (74.619) Acc@5 84.180 (92.673) Mem 7379MB [2024-08-26 05:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.5596 (1.1806) Acc@1 62.988 (72.661) Acc@5 86.133 (91.475) Mem 7379MB [2024-08-26 05:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 72.270 Acc@5 91.368 [2024-08-26 05:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 72.3% [2024-08-26 05:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 72.27% [2024-08-26 05:16:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 05:16:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 05:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.450 (0.450) Loss 0.4956 (0.4956) Acc@1 88.965 (88.965) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 05:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.110) Loss 0.8389 (0.8222) Acc@1 81.055 (80.131) Acc@5 95.117 (95.304) Mem 7379MB [2024-08-26 05:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.096) Loss 1.2275 (0.8366) Acc@1 70.215 (79.432) Acc@5 90.527 (95.201) Mem 7379MB [2024-08-26 05:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.090) Loss 1.5068 (0.9728) Acc@1 62.402 (76.336) Acc@5 86.230 (93.391) Mem 7379MB [2024-08-26 05:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.4766 (1.0554) Acc@1 63.379 (74.426) Acc@5 87.012 (92.342) Mem 7379MB [2024-08-26 05:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.128 Acc@5 92.258 [2024-08-26 05:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 74.1% [2024-08-26 05:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 74.13% [2024-08-26 05:16:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 05:16:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 05:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][0/1251] eta 0:15:24 lr 0.000992 wd 0.0500 time 0.7391 (0.7391) data time 0.5121 (0.5121) model time 0.0000 (0.0000) loss 3.9631 (3.9631) grad_norm 2.1663 (2.1663) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][10/1251] eta 0:05:55 lr 0.000992 wd 0.0500 time 0.2499 (0.2862) data time 0.0010 (0.0475) model time 0.0000 (0.0000) loss 3.4165 (3.6950) grad_norm 1.5857 (1.8473) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][20/1251] eta 0:05:26 lr 0.000992 wd 0.0500 time 0.2407 (0.2654) data time 0.0007 (0.0253) model time 0.0000 (0.0000) loss 2.8701 (3.7239) grad_norm 1.3052 (1.9118) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][30/1251] eta 0:05:14 lr 0.000992 wd 0.0500 time 0.2405 (0.2576) data time 0.0009 (0.0175) model time 0.0000 (0.0000) loss 3.1706 (3.6272) grad_norm 1.7113 (1.9709) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][40/1251] eta 0:05:07 lr 0.000992 wd 0.0500 time 0.2408 (0.2541) data time 0.0010 (0.0135) model time 0.0000 (0.0000) loss 3.2279 (3.7082) grad_norm 2.0234 (1.9247) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][50/1251] eta 0:05:01 lr 0.000992 wd 0.0500 time 0.2401 (0.2514) data time 0.0010 (0.0110) model time 0.0000 (0.0000) loss 3.8531 (3.6973) grad_norm 1.7698 (1.9411) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][60/1251] eta 0:04:57 lr 0.000992 wd 0.0500 time 0.2464 (0.2502) data time 0.0008 (0.0094) model time 0.2457 (0.2428) loss 
3.4390 (3.7120) grad_norm 1.7455 (2.0144) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][70/1251] eta 0:04:54 lr 0.000992 wd 0.0500 time 0.2453 (0.2492) data time 0.0007 (0.0082) model time 0.2445 (0.2426) loss 4.6900 (3.7247) grad_norm 1.3946 (1.9724) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][80/1251] eta 0:04:50 lr 0.000992 wd 0.0500 time 0.2440 (0.2483) data time 0.0010 (0.0073) model time 0.2429 (0.2419) loss 2.6976 (3.7491) grad_norm 1.6313 (1.9576) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][90/1251] eta 0:04:47 lr 0.000992 wd 0.0500 time 0.2423 (0.2477) data time 0.0011 (0.0066) model time 0.2412 (0.2419) loss 2.9738 (3.7289) grad_norm 2.7057 (1.9576) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][100/1251] eta 0:04:44 lr 0.000992 wd 0.0500 time 0.2446 (0.2470) data time 0.0009 (0.0061) model time 0.2438 (0.2415) loss 3.1667 (3.7368) grad_norm 1.2335 (1.9364) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][110/1251] eta 0:04:41 lr 0.000992 wd 0.0500 time 0.2398 (0.2465) data time 0.0008 (0.0056) model time 0.2390 (0.2413) loss 3.3730 (3.7447) grad_norm 1.4670 (1.9368) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][120/1251] eta 0:04:38 lr 0.000992 wd 0.0500 time 0.2376 (0.2461) data time 0.0009 (0.0052) model time 0.2367 (0.2413) loss 3.8076 (3.7366) grad_norm 1.4407 (1.9449) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][130/1251] eta 0:04:35 lr 
0.000992 wd 0.0500 time 0.2414 (0.2457) data time 0.0011 (0.0049) model time 0.2403 (0.2410) loss 4.2056 (3.7602) grad_norm 1.4013 (1.9388) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][140/1251] eta 0:04:32 lr 0.000992 wd 0.0500 time 0.2534 (0.2456) data time 0.0011 (0.0046) model time 0.2524 (0.2412) loss 3.4957 (3.7589) grad_norm 2.7449 (1.9517) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][150/1251] eta 0:04:30 lr 0.000992 wd 0.0500 time 0.2477 (0.2454) data time 0.0011 (0.0044) model time 0.2465 (0.2413) loss 3.8975 (3.7425) grad_norm 1.5423 (1.9514) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][160/1251] eta 0:04:27 lr 0.000992 wd 0.0500 time 0.2476 (0.2451) data time 0.0007 (0.0042) model time 0.2469 (0.2412) loss 4.5624 (3.7193) grad_norm 2.3196 (1.9676) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][170/1251] eta 0:04:24 lr 0.000992 wd 0.0500 time 0.2402 (0.2449) data time 0.0011 (0.0040) model time 0.2391 (0.2412) loss 3.5013 (3.7330) grad_norm 1.6966 (1.9668) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][180/1251] eta 0:04:22 lr 0.000992 wd 0.0500 time 0.2377 (0.2448) data time 0.0009 (0.0038) model time 0.2367 (0.2411) loss 4.3475 (3.7287) grad_norm 1.8375 (1.9491) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][190/1251] eta 0:04:19 lr 0.000992 wd 0.0500 time 0.2344 (0.2446) data time 0.0009 (0.0037) model time 0.2335 (0.2411) loss 3.2952 (3.7290) grad_norm 1.5766 (1.9299) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:12 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][200/1251] eta 0:04:16 lr 0.000992 wd 0.0500 time 0.2465 (0.2445) data time 0.0007 (0.0036) model time 0.2458 (0.2411) loss 3.3789 (3.7227) grad_norm 1.7256 (1.9271) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][210/1251] eta 0:04:14 lr 0.000992 wd 0.0500 time 0.2430 (0.2444) data time 0.0011 (0.0034) model time 0.2419 (0.2412) loss 4.2968 (3.7231) grad_norm 1.9730 (1.9324) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][220/1251] eta 0:04:11 lr 0.000992 wd 0.0500 time 0.2354 (0.2443) data time 0.0010 (0.0033) model time 0.2344 (0.2412) loss 3.0629 (3.7239) grad_norm 1.8144 (1.9276) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][230/1251] eta 0:04:09 lr 0.000992 wd 0.0500 time 0.2456 (0.2441) data time 0.0007 (0.0032) model time 0.2449 (0.2411) loss 3.9605 (3.7190) grad_norm 2.1956 (1.9158) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][240/1251] eta 0:04:06 lr 0.000992 wd 0.0500 time 0.2423 (0.2440) data time 0.0008 (0.0031) model time 0.2415 (0.2410) loss 4.5356 (3.7149) grad_norm 2.5401 (1.9269) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][250/1251] eta 0:04:04 lr 0.000992 wd 0.0500 time 0.2425 (0.2439) data time 0.0012 (0.0031) model time 0.2413 (0.2409) loss 3.8691 (3.7265) grad_norm 1.4069 (1.9316) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][260/1251] eta 0:04:01 lr 0.000992 wd 0.0500 time 0.2434 (0.2437) data time 0.0011 (0.0030) model time 0.2423 (0.2409) loss 3.7096 
(3.7166) grad_norm 1.6934 (1.9357) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][270/1251] eta 0:03:59 lr 0.000992 wd 0.0500 time 0.2463 (0.2437) data time 0.0011 (0.0029) model time 0.2452 (0.2409) loss 3.5786 (3.7100) grad_norm 1.6932 (1.9312) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][280/1251] eta 0:03:56 lr 0.000992 wd 0.0500 time 0.2423 (0.2436) data time 0.0007 (0.0028) model time 0.2416 (0.2409) loss 4.4476 (3.7177) grad_norm 1.8721 (1.9252) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][290/1251] eta 0:03:54 lr 0.000992 wd 0.0500 time 0.2376 (0.2436) data time 0.0012 (0.0028) model time 0.2364 (0.2409) loss 3.7699 (3.7208) grad_norm 1.8889 (1.9222) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][300/1251] eta 0:03:51 lr 0.000992 wd 0.0500 time 0.2474 (0.2436) data time 0.0010 (0.0027) model time 0.2464 (0.2410) loss 4.1889 (3.7239) grad_norm 2.7374 (1.9340) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][310/1251] eta 0:03:49 lr 0.000992 wd 0.0500 time 0.2446 (0.2435) data time 0.0010 (0.0027) model time 0.2435 (0.2409) loss 3.2353 (3.7084) grad_norm 1.8345 (1.9264) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][320/1251] eta 0:03:46 lr 0.000992 wd 0.0500 time 0.2412 (0.2434) data time 0.0009 (0.0026) model time 0.2403 (0.2408) loss 3.7739 (3.7105) grad_norm 3.1418 (1.9414) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][330/1251] eta 0:03:44 lr 0.000992 wd 
0.0500 time 0.2484 (0.2433) data time 0.0008 (0.0026) model time 0.2476 (0.2408) loss 3.5969 (3.7090) grad_norm 1.3760 (1.9424) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][340/1251] eta 0:03:41 lr 0.000992 wd 0.0500 time 0.2459 (0.2432) data time 0.0008 (0.0025) model time 0.2450 (0.2407) loss 3.2452 (3.7056) grad_norm 2.0538 (1.9391) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][350/1251] eta 0:03:39 lr 0.000992 wd 0.0500 time 0.2326 (0.2437) data time 0.0009 (0.0025) model time 0.2317 (0.2414) loss 4.1673 (3.7106) grad_norm 1.7622 (1.9303) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][360/1251] eta 0:03:37 lr 0.000992 wd 0.0500 time 0.2444 (0.2437) data time 0.0008 (0.0024) model time 0.2435 (0.2414) loss 4.5681 (3.7096) grad_norm 1.9785 (1.9253) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][370/1251] eta 0:03:34 lr 0.000992 wd 0.0500 time 0.2436 (0.2436) data time 0.0008 (0.0024) model time 0.2429 (0.2414) loss 3.8306 (3.7072) grad_norm 1.5834 (1.9263) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][380/1251] eta 0:03:32 lr 0.000992 wd 0.0500 time 0.2306 (0.2435) data time 0.0011 (0.0024) model time 0.2294 (0.2413) loss 4.2404 (3.7084) grad_norm 2.1280 (1.9357) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][390/1251] eta 0:03:29 lr 0.000992 wd 0.0500 time 0.2423 (0.2435) data time 0.0008 (0.0023) model time 0.2416 (0.2413) loss 4.1730 (3.7097) grad_norm 2.5699 (1.9341) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][400/1251] eta 0:03:27 lr 0.000992 wd 0.0500 time 0.2462 (0.2435) data time 0.0009 (0.0023) model time 0.2452 (0.2413) loss 3.9518 (3.7049) grad_norm 2.3426 (1.9486) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][410/1251] eta 0:03:24 lr 0.000992 wd 0.0500 time 0.2410 (0.2434) data time 0.0010 (0.0023) model time 0.2400 (0.2413) loss 2.7431 (3.7032) grad_norm 2.0348 (1.9459) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][420/1251] eta 0:03:22 lr 0.000992 wd 0.0500 time 0.2388 (0.2433) data time 0.0008 (0.0022) model time 0.2380 (0.2412) loss 3.6635 (3.7019) grad_norm 2.8580 (1.9481) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][430/1251] eta 0:03:19 lr 0.000992 wd 0.0500 time 0.2356 (0.2432) data time 0.0007 (0.0022) model time 0.2349 (0.2411) loss 4.6050 (3.6991) grad_norm 1.9200 (1.9605) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][440/1251] eta 0:03:17 lr 0.000992 wd 0.0500 time 0.2426 (0.2431) data time 0.0008 (0.0022) model time 0.2418 (0.2410) loss 4.3489 (3.7044) grad_norm 1.5702 (1.9564) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][450/1251] eta 0:03:14 lr 0.000992 wd 0.0500 time 0.2432 (0.2431) data time 0.0009 (0.0021) model time 0.2423 (0.2410) loss 4.1294 (3.7059) grad_norm 1.4020 (1.9493) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][460/1251] eta 0:03:12 lr 0.000992 wd 0.0500 time 0.2346 (0.2440) data time 0.0010 (0.0021) model time 0.2336 (0.2420) loss 3.8657 
(3.7020) grad_norm 1.6166 (1.9449) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][470/1251] eta 0:03:10 lr 0.000992 wd 0.0500 time 0.2365 (0.2439) data time 0.0010 (0.0021) model time 0.2355 (0.2420) loss 3.7948 (3.7006) grad_norm 1.3421 (1.9373) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][480/1251] eta 0:03:07 lr 0.000992 wd 0.0500 time 0.2427 (0.2438) data time 0.0007 (0.0021) model time 0.2420 (0.2419) loss 3.9352 (3.7042) grad_norm 1.4523 (1.9369) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][490/1251] eta 0:03:05 lr 0.000992 wd 0.0500 time 0.2286 (0.2437) data time 0.0012 (0.0021) model time 0.2274 (0.2418) loss 3.4515 (3.7107) grad_norm 1.8347 (1.9381) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][500/1251] eta 0:03:03 lr 0.000992 wd 0.0500 time 0.2416 (0.2444) data time 0.0010 (0.0021) model time 0.2406 (0.2425) loss 4.1697 (3.7170) grad_norm 3.1450 (1.9430) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][510/1251] eta 0:03:01 lr 0.000992 wd 0.0500 time 0.2370 (0.2443) data time 0.0007 (0.0020) model time 0.2362 (0.2425) loss 4.1662 (3.7167) grad_norm 1.3996 (1.9441) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][520/1251] eta 0:02:58 lr 0.000992 wd 0.0500 time 0.2363 (0.2442) data time 0.0009 (0.0020) model time 0.2354 (0.2424) loss 4.6231 (3.7199) grad_norm 1.5321 (1.9417) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][530/1251] eta 0:02:56 lr 0.000992 wd 
0.0500 time 0.2327 (0.2441) data time 0.0010 (0.0020) model time 0.2317 (0.2423) loss 3.0324 (3.7167) grad_norm 1.4280 (1.9396) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][540/1251] eta 0:02:53 lr 0.000992 wd 0.0500 time 0.2420 (0.2445) data time 0.0007 (0.0020) model time 0.2413 (0.2427) loss 2.5907 (3.7202) grad_norm 1.4646 (1.9390) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][550/1251] eta 0:02:51 lr 0.000992 wd 0.0500 time 0.2402 (0.2444) data time 0.0007 (0.0020) model time 0.2395 (0.2426) loss 3.6903 (3.7219) grad_norm 1.2962 (1.9312) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][560/1251] eta 0:02:48 lr 0.000992 wd 0.0500 time 0.2361 (0.2443) data time 0.0010 (0.0020) model time 0.2351 (0.2425) loss 3.6103 (3.7207) grad_norm 1.4270 (1.9256) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][570/1251] eta 0:02:46 lr 0.000992 wd 0.0500 time 0.2399 (0.2442) data time 0.0007 (0.0019) model time 0.2392 (0.2425) loss 4.5921 (3.7266) grad_norm 2.3140 (1.9240) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][580/1251] eta 0:02:43 lr 0.000992 wd 0.0500 time 0.2403 (0.2442) data time 0.0010 (0.0019) model time 0.2393 (0.2424) loss 3.1823 (3.7244) grad_norm 2.3155 (1.9285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][590/1251] eta 0:02:41 lr 0.000992 wd 0.0500 time 0.2429 (0.2441) data time 0.0008 (0.0019) model time 0.2421 (0.2424) loss 4.6762 (3.7235) grad_norm 1.7531 (1.9290) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:50 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][600/1251] eta 0:02:38 lr 0.000992 wd 0.0500 time 0.2418 (0.2441) data time 0.0010 (0.0019) model time 0.2408 (0.2423) loss 2.9484 (3.7155) grad_norm 2.6147 (1.9284) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][610/1251] eta 0:02:36 lr 0.000992 wd 0.0500 time 0.2371 (0.2440) data time 0.0009 (0.0019) model time 0.2362 (0.2423) loss 4.7819 (3.7125) grad_norm 2.5247 (1.9295) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][620/1251] eta 0:02:33 lr 0.000992 wd 0.0500 time 0.2430 (0.2440) data time 0.0010 (0.0019) model time 0.2421 (0.2423) loss 3.7114 (3.7103) grad_norm 1.8507 (1.9382) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][630/1251] eta 0:02:31 lr 0.000992 wd 0.0500 time 0.2391 (0.2440) data time 0.0009 (0.0019) model time 0.2382 (0.2423) loss 3.8182 (3.7119) grad_norm 2.0381 (1.9388) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:18:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][640/1251] eta 0:02:29 lr 0.000992 wd 0.0500 time 0.2400 (0.2439) data time 0.0008 (0.0018) model time 0.2393 (0.2422) loss 4.1443 (3.7148) grad_norm 1.3933 (1.9389) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][650/1251] eta 0:02:26 lr 0.000992 wd 0.0500 time 0.2407 (0.2439) data time 0.0013 (0.0018) model time 0.2394 (0.2422) loss 4.2951 (3.7193) grad_norm 1.9001 (1.9361) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][660/1251] eta 0:02:24 lr 0.000992 wd 0.0500 time 0.2366 (0.2439) data time 0.0010 (0.0018) model time 0.2355 (0.2422) loss 4.0867 
(3.7226) grad_norm 2.2097 (1.9358) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][670/1251] eta 0:02:21 lr 0.000992 wd 0.0500 time 0.2367 (0.2438) data time 0.0011 (0.0018) model time 0.2356 (0.2421) loss 2.8806 (3.7196) grad_norm 2.2840 (1.9376) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][680/1251] eta 0:02:19 lr 0.000991 wd 0.0500 time 0.2496 (0.2438) data time 0.0009 (0.0018) model time 0.2486 (0.2421) loss 2.9524 (3.7195) grad_norm 1.5680 (1.9358) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][690/1251] eta 0:02:16 lr 0.000991 wd 0.0500 time 0.2392 (0.2438) data time 0.0011 (0.0018) model time 0.2381 (0.2421) loss 3.2014 (3.7237) grad_norm 1.8592 (1.9350) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][700/1251] eta 0:02:14 lr 0.000991 wd 0.0500 time 0.2454 (0.2438) data time 0.0008 (0.0018) model time 0.2447 (0.2421) loss 2.9579 (3.7197) grad_norm 1.8782 (1.9332) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][710/1251] eta 0:02:11 lr 0.000991 wd 0.0500 time 0.2378 (0.2437) data time 0.0011 (0.0018) model time 0.2367 (0.2421) loss 3.3912 (3.7187) grad_norm 1.5719 (1.9278) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][720/1251] eta 0:02:09 lr 0.000991 wd 0.0500 time 0.2344 (0.2437) data time 0.0011 (0.0018) model time 0.2332 (0.2420) loss 4.1514 (3.7179) grad_norm 3.0203 (1.9289) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][730/1251] eta 0:02:06 lr 0.000991 wd 
0.0500 time 0.2389 (0.2437) data time 0.0011 (0.0017) model time 0.2378 (0.2421) loss 3.4603 (3.7167) grad_norm 1.8334 (1.9271) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][740/1251] eta 0:02:04 lr 0.000991 wd 0.0500 time 0.2380 (0.2437) data time 0.0008 (0.0017) model time 0.2372 (0.2421) loss 3.1635 (3.7183) grad_norm 2.0385 (1.9268) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][750/1251] eta 0:02:02 lr 0.000991 wd 0.0500 time 0.2442 (0.2436) data time 0.0007 (0.0017) model time 0.2435 (0.2420) loss 3.2735 (3.7149) grad_norm 3.0687 (1.9315) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][760/1251] eta 0:01:59 lr 0.000991 wd 0.0500 time 0.2416 (0.2436) data time 0.0011 (0.0017) model time 0.2405 (0.2419) loss 3.0939 (3.7138) grad_norm 1.3838 (1.9277) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][770/1251] eta 0:01:57 lr 0.000991 wd 0.0500 time 0.2458 (0.2435) data time 0.0007 (0.0017) model time 0.2451 (0.2419) loss 2.3106 (3.7152) grad_norm 1.5268 (1.9258) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][780/1251] eta 0:01:54 lr 0.000991 wd 0.0500 time 0.2530 (0.2435) data time 0.0007 (0.0017) model time 0.2523 (0.2420) loss 3.3480 (3.7187) grad_norm 1.6130 (1.9218) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][790/1251] eta 0:01:52 lr 0.000991 wd 0.0500 time 0.2352 (0.2435) data time 0.0007 (0.0017) model time 0.2345 (0.2419) loss 2.8027 (3.7155) grad_norm 1.5989 (1.9239) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][800/1251] eta 0:01:49 lr 0.000991 wd 0.0500 time 0.2561 (0.2435) data time 0.0007 (0.0017) model time 0.2554 (0.2419) loss 3.0438 (3.7138) grad_norm 1.9366 (1.9249) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][810/1251] eta 0:01:47 lr 0.000991 wd 0.0500 time 0.2442 (0.2434) data time 0.0008 (0.0017) model time 0.2434 (0.2419) loss 4.3542 (3.7125) grad_norm 2.9379 (1.9283) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][820/1251] eta 0:01:44 lr 0.000991 wd 0.0500 time 0.2389 (0.2434) data time 0.0011 (0.0017) model time 0.2378 (0.2418) loss 3.4724 (3.7133) grad_norm 1.7194 (1.9323) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][830/1251] eta 0:01:42 lr 0.000991 wd 0.0500 time 0.2466 (0.2434) data time 0.0007 (0.0017) model time 0.2459 (0.2418) loss 2.6048 (3.7092) grad_norm 1.7304 (1.9298) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][840/1251] eta 0:01:40 lr 0.000991 wd 0.0500 time 0.2436 (0.2434) data time 0.0010 (0.0016) model time 0.2426 (0.2418) loss 2.8588 (3.7071) grad_norm 1.6513 (1.9308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][850/1251] eta 0:01:37 lr 0.000991 wd 0.0500 time 0.2429 (0.2434) data time 0.0008 (0.0016) model time 0.2421 (0.2418) loss 3.8027 (3.7054) grad_norm 1.9361 (1.9296) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][860/1251] eta 0:01:35 lr 0.000991 wd 0.0500 time 0.2396 (0.2433) data time 0.0011 (0.0016) model time 0.2385 (0.2418) loss 3.6961 
(3.7084) grad_norm 1.5005 (1.9276) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][870/1251] eta 0:01:32 lr 0.000991 wd 0.0500 time 0.2418 (0.2433) data time 0.0008 (0.0016) model time 0.2410 (0.2418) loss 4.0238 (3.7111) grad_norm 1.5912 (1.9277) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][880/1251] eta 0:01:30 lr 0.000991 wd 0.0500 time 0.2457 (0.2433) data time 0.0009 (0.0016) model time 0.2448 (0.2418) loss 3.3741 (3.7113) grad_norm 1.3743 (1.9288) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][890/1251] eta 0:01:27 lr 0.000991 wd 0.0500 time 0.2411 (0.2433) data time 0.0010 (0.0016) model time 0.2401 (0.2417) loss 2.9281 (3.7070) grad_norm 1.8460 (1.9267) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][900/1251] eta 0:01:25 lr 0.000991 wd 0.0500 time 0.2415 (0.2432) data time 0.0009 (0.0016) model time 0.2405 (0.2417) loss 2.7373 (3.7043) grad_norm 2.3045 (1.9247) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][910/1251] eta 0:01:22 lr 0.000991 wd 0.0500 time 0.2321 (0.2432) data time 0.0011 (0.0016) model time 0.2310 (0.2417) loss 4.1375 (3.7031) grad_norm 1.5771 (1.9250) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][920/1251] eta 0:01:20 lr 0.000991 wd 0.0500 time 0.2462 (0.2432) data time 0.0010 (0.0016) model time 0.2452 (0.2417) loss 3.6615 (3.7049) grad_norm 2.2238 (1.9247) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][930/1251] eta 0:01:18 lr 0.000991 wd 
0.0500 time 0.2377 (0.2432) data time 0.0009 (0.0016) model time 0.2368 (0.2417) loss 4.3489 (3.7083) grad_norm 2.3316 (1.9280) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][940/1251] eta 0:01:15 lr 0.000991 wd 0.0500 time 0.2397 (0.2432) data time 0.0008 (0.0016) model time 0.2388 (0.2417) loss 3.6752 (3.7091) grad_norm 3.0077 (1.9292) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][950/1251] eta 0:01:13 lr 0.000991 wd 0.0500 time 0.2419 (0.2432) data time 0.0011 (0.0016) model time 0.2409 (0.2417) loss 3.7732 (3.7085) grad_norm 1.4560 (1.9277) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][960/1251] eta 0:01:10 lr 0.000991 wd 0.0500 time 0.2420 (0.2431) data time 0.0009 (0.0016) model time 0.2411 (0.2416) loss 4.5666 (3.7084) grad_norm 2.1482 (1.9275) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][970/1251] eta 0:01:08 lr 0.000991 wd 0.0500 time 0.2529 (0.2431) data time 0.0010 (0.0016) model time 0.2519 (0.2416) loss 3.9405 (3.7093) grad_norm 1.6444 (1.9248) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][980/1251] eta 0:01:05 lr 0.000991 wd 0.0500 time 0.2414 (0.2431) data time 0.0008 (0.0016) model time 0.2405 (0.2416) loss 3.6202 (3.7096) grad_norm 3.2841 (1.9251) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][990/1251] eta 0:01:03 lr 0.000991 wd 0.0500 time 0.4513 (0.2433) data time 0.0011 (0.0016) model time 0.4502 (0.2418) loss 3.7772 (3.7108) grad_norm 2.4535 (1.9225) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1000/1251] eta 0:01:01 lr 0.000991 wd 0.0500 time 0.2430 (0.2435) data time 0.0009 (0.0016) model time 0.2422 (0.2420) loss 4.7592 (3.7109) grad_norm 1.6333 (1.9194) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1010/1251] eta 0:00:58 lr 0.000991 wd 0.0500 time 0.2352 (0.2435) data time 0.0011 (0.0015) model time 0.2341 (0.2420) loss 3.5711 (3.7119) grad_norm 2.0078 (1.9172) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1020/1251] eta 0:00:56 lr 0.000991 wd 0.0500 time 0.4486 (0.2439) data time 0.0007 (0.0015) model time 0.4478 (0.2424) loss 3.3893 (3.7126) grad_norm 1.9550 (1.9173) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1030/1251] eta 0:00:53 lr 0.000991 wd 0.0500 time 0.2431 (0.2439) data time 0.0009 (0.0015) model time 0.2422 (0.2424) loss 4.0720 (3.7147) grad_norm 2.2275 (1.9171) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1040/1251] eta 0:00:51 lr 0.000991 wd 0.0500 time 0.2469 (0.2439) data time 0.0007 (0.0015) model time 0.2462 (0.2425) loss 3.8294 (3.7133) grad_norm 2.2048 (1.9179) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1050/1251] eta 0:00:49 lr 0.000991 wd 0.0500 time 0.2353 (0.2439) data time 0.0009 (0.0015) model time 0.2344 (0.2424) loss 4.4867 (3.7155) grad_norm 2.3055 (1.9213) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1060/1251] eta 0:00:46 lr 0.000991 wd 0.0500 time 0.2432 (0.2440) data time 0.0007 (0.0015) model time 0.2424 (0.2426) loss 4.3560 
(3.7159) grad_norm 1.7236 (1.9184) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1070/1251] eta 0:00:44 lr 0.000991 wd 0.0500 time 0.2381 (0.2440) data time 0.0011 (0.0015) model time 0.2370 (0.2426) loss 3.1222 (3.7145) grad_norm 1.6817 (1.9193) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1080/1251] eta 0:00:41 lr 0.000991 wd 0.0500 time 0.2369 (0.2440) data time 0.0009 (0.0015) model time 0.2360 (0.2426) loss 2.7866 (3.7156) grad_norm 1.5948 (1.9185) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1090/1251] eta 0:00:39 lr 0.000991 wd 0.0500 time 0.2359 (0.2439) data time 0.0013 (0.0015) model time 0.2346 (0.2425) loss 3.7903 (3.7135) grad_norm 3.1631 (1.9248) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1100/1251] eta 0:00:36 lr 0.000991 wd 0.0500 time 0.2488 (0.2439) data time 0.0012 (0.0015) model time 0.2476 (0.2425) loss 4.2930 (3.7150) grad_norm 1.3967 (1.9286) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1110/1251] eta 0:00:34 lr 0.000991 wd 0.0500 time 0.2388 (0.2439) data time 0.0009 (0.0015) model time 0.2380 (0.2425) loss 4.3332 (3.7182) grad_norm 2.3069 (1.9311) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1120/1251] eta 0:00:31 lr 0.000991 wd 0.0500 time 0.2407 (0.2438) data time 0.0007 (0.0015) model time 0.2400 (0.2424) loss 4.0662 (3.7201) grad_norm 1.5234 (1.9302) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1130/1251] eta 0:00:29 lr 
0.000991 wd 0.0500 time 0.2380 (0.2438) data time 0.0008 (0.0015) model time 0.2371 (0.2424) loss 3.3880 (3.7171) grad_norm 2.1840 (1.9321) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1140/1251] eta 0:00:27 lr 0.000991 wd 0.0500 time 0.2289 (0.2438) data time 0.0009 (0.0015) model time 0.2281 (0.2424) loss 2.3151 (3.7159) grad_norm 1.5875 (1.9301) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1150/1251] eta 0:00:24 lr 0.000991 wd 0.0500 time 0.2402 (0.2438) data time 0.0010 (0.0015) model time 0.2392 (0.2424) loss 3.7344 (3.7151) grad_norm 1.9038 (1.9289) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1160/1251] eta 0:00:22 lr 0.000991 wd 0.0500 time 0.2423 (0.2438) data time 0.0010 (0.0015) model time 0.2413 (0.2424) loss 3.9282 (3.7171) grad_norm 1.5672 (1.9287) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1170/1251] eta 0:00:19 lr 0.000991 wd 0.0500 time 0.2472 (0.2437) data time 0.0008 (0.0015) model time 0.2463 (0.2424) loss 3.8585 (3.7189) grad_norm 4.0636 (1.9329) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1180/1251] eta 0:00:17 lr 0.000991 wd 0.0500 time 0.2428 (0.2437) data time 0.0012 (0.0015) model time 0.2416 (0.2423) loss 3.4342 (3.7177) grad_norm 1.4374 (1.9353) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1190/1251] eta 0:00:14 lr 0.000991 wd 0.0500 time 0.2410 (0.2437) data time 0.0008 (0.0015) model time 0.2402 (0.2423) loss 4.0031 (3.7190) grad_norm 1.9436 (1.9344) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
05:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1200/1251] eta 0:00:12 lr 0.000991 wd 0.0500 time 0.2428 (0.2437) data time 0.0010 (0.0015) model time 0.2418 (0.2423) loss 2.6438 (3.7165) grad_norm 1.6733 (1.9331) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1210/1251] eta 0:00:09 lr 0.000991 wd 0.0500 time 0.2321 (0.2437) data time 0.0009 (0.0015) model time 0.2313 (0.2423) loss 4.5698 (3.7178) grad_norm 3.2337 (1.9338) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1220/1251] eta 0:00:07 lr 0.000991 wd 0.0500 time 0.2432 (0.2437) data time 0.0010 (0.0015) model time 0.2422 (0.2423) loss 3.4525 (3.7192) grad_norm 1.6976 (1.9338) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1230/1251] eta 0:00:05 lr 0.000991 wd 0.0500 time 0.2420 (0.2437) data time 0.0007 (0.0014) model time 0.2413 (0.2423) loss 3.4947 (3.7188) grad_norm 1.8997 (1.9361) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1240/1251] eta 0:00:02 lr 0.000991 wd 0.0500 time 0.2246 (0.2436) data time 0.0007 (0.0014) model time 0.2239 (0.2422) loss 2.8404 (3.7170) grad_norm 1.9661 (1.9361) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [36/300][1250/1251] eta 0:00:00 lr 0.000991 wd 0.0500 time 0.2289 (0.2434) data time 0.0007 (0.0014) model time 0.2283 (0.2420) loss 3.9789 (3.7167) grad_norm 1.9793 (1.9373) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 36 training takes 0:05:04 [2024-08-26 05:21:28 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 05:21:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 05:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.416 (0.416) Loss 0.6089 (0.6089) Acc@1 86.621 (86.621) Acc@5 96.777 (96.777) Mem 7379MB [2024-08-26 05:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.110) Loss 0.9517 (0.9383) Acc@1 78.906 (77.983) Acc@5 95.215 (94.727) Mem 7379MB [2024-08-26 05:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.097) Loss 1.3721 (0.9515) Acc@1 68.555 (77.832) Acc@5 89.551 (94.703) Mem 7379MB [2024-08-26 05:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.6963 (1.0943) Acc@1 61.133 (74.713) Acc@5 84.082 (92.840) Mem 7379MB [2024-08-26 05:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.5449 (1.1745) Acc@1 64.355 (73.080) Acc@5 87.109 (91.687) Mem 7379MB [2024-08-26 05:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 72.818 Acc@5 91.636 [2024-08-26 05:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 72.8% [2024-08-26 05:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 72.82% [2024-08-26 05:21:33 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 05:21:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 05:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.466 (0.466) Loss 0.4895 (0.4895) Acc@1 89.160 (89.160) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 05:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.113) Loss 0.8340 (0.8145) Acc@1 81.445 (80.344) Acc@5 95.215 (95.384) Mem 7379MB [2024-08-26 05:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.095) Loss 1.2158 (0.8288) Acc@1 71.094 (79.664) Acc@5 90.527 (95.308) Mem 7379MB [2024-08-26 05:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.090) Loss 1.4951 (0.9630) Acc@1 63.477 (76.616) Acc@5 86.426 (93.520) Mem 7379MB [2024-08-26 05:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.4561 (1.0438) Acc@1 63.965 (74.748) Acc@5 87.012 (92.480) Mem 7379MB [2024-08-26 05:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.440 Acc@5 92.398 [2024-08-26 05:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 74.4% [2024-08-26 05:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 74.44% [2024-08-26 05:21:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 05:21:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 05:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][0/1251] eta 0:12:59 lr 0.000991 wd 0.0500 time 0.6228 (0.6228) data time 0.3924 (0.3924) model time 0.0000 (0.0000) loss 3.9165 (3.9165) grad_norm 2.6221 (2.6221) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][10/1251] eta 0:05:42 lr 0.000991 wd 0.0500 time 0.2346 (0.2762) data time 0.0012 (0.0366) model time 0.0000 (0.0000) loss 3.3349 (3.6281) grad_norm 1.6179 (1.9312) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][20/1251] eta 0:05:20 lr 0.000991 wd 0.0500 time 0.2412 (0.2600) data time 0.0009 (0.0197) model time 0.0000 (0.0000) loss 2.6963 (3.7801) grad_norm 1.9405 (1.7969) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][30/1251] eta 0:05:11 lr 0.000991 wd 0.0500 time 0.2807 (0.2551) data time 0.0011 (0.0137) model time 0.0000 (0.0000) loss 3.1635 (3.7597) grad_norm 1.4521 (1.8490) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][40/1251] eta 0:05:05 lr 0.000991 wd 0.0500 time 0.2339 (0.2519) data time 0.0009 (0.0107) model time 0.0000 (0.0000) loss 3.0370 (3.7241) grad_norm 1.8068 (1.8149) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][50/1251] eta 0:05:00 lr 0.000991 wd 0.0500 time 0.2383 (0.2498) data time 0.0008 (0.0088) model time 0.0000 (0.0000) loss 3.4016 (3.8096) grad_norm 2.2692 (1.8709) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][60/1251] eta 0:04:55 lr 0.000991 wd 0.0500 time 0.2467 (0.2483) data time 0.0009 (0.0075) model time 0.2459 (0.2397) loss 
3.8962 (3.7930) grad_norm 1.4032 (1.8319) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][70/1251] eta 0:04:51 lr 0.000991 wd 0.0500 time 0.2372 (0.2472) data time 0.0008 (0.0066) model time 0.2364 (0.2396) loss 2.4853 (3.7460) grad_norm 1.3883 (1.8273) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][80/1251] eta 0:04:49 lr 0.000991 wd 0.0500 time 0.2396 (0.2468) data time 0.0009 (0.0059) model time 0.2387 (0.2407) loss 4.6510 (3.7086) grad_norm 2.0687 (1.8115) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][90/1251] eta 0:04:46 lr 0.000991 wd 0.0500 time 0.2499 (0.2464) data time 0.0007 (0.0054) model time 0.2492 (0.2410) loss 3.2842 (3.7033) grad_norm 1.8986 (1.8055) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][100/1251] eta 0:04:43 lr 0.000991 wd 0.0500 time 0.2355 (0.2459) data time 0.0009 (0.0050) model time 0.2346 (0.2408) loss 2.7243 (3.7046) grad_norm 2.4684 (1.8112) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][110/1251] eta 0:04:40 lr 0.000991 wd 0.0500 time 0.2496 (0.2456) data time 0.0009 (0.0047) model time 0.2487 (0.2409) loss 4.3953 (3.7332) grad_norm 1.6991 (1.8376) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][120/1251] eta 0:04:37 lr 0.000991 wd 0.0500 time 0.2549 (0.2452) data time 0.0011 (0.0044) model time 0.2538 (0.2407) loss 4.1520 (3.7236) grad_norm 1.2844 (1.8255) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][130/1251] eta 0:04:34 lr 
0.000991 wd 0.0500 time 0.2438 (0.2448) data time 0.0008 (0.0041) model time 0.2430 (0.2406) loss 4.2524 (3.7174) grad_norm 1.7636 (1.8243) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][140/1251] eta 0:04:31 lr 0.000991 wd 0.0500 time 0.2367 (0.2447) data time 0.0010 (0.0039) model time 0.2356 (0.2407) loss 3.8010 (3.7079) grad_norm 2.2345 (1.8172) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][150/1251] eta 0:04:29 lr 0.000991 wd 0.0500 time 0.2452 (0.2444) data time 0.0011 (0.0037) model time 0.2441 (0.2406) loss 3.6680 (3.7069) grad_norm 1.5514 (1.8128) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][160/1251] eta 0:04:26 lr 0.000991 wd 0.0500 time 0.2480 (0.2443) data time 0.0007 (0.0035) model time 0.2473 (0.2406) loss 3.0533 (3.6937) grad_norm 2.2548 (1.8306) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][170/1251] eta 0:04:23 lr 0.000991 wd 0.0500 time 0.2419 (0.2442) data time 0.0009 (0.0034) model time 0.2411 (0.2407) loss 3.2051 (3.6938) grad_norm 1.8677 (1.8433) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][180/1251] eta 0:04:21 lr 0.000991 wd 0.0500 time 0.2380 (0.2439) data time 0.0008 (0.0032) model time 0.2371 (0.2405) loss 3.6116 (3.6859) grad_norm 2.7996 (1.8502) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][190/1251] eta 0:04:18 lr 0.000991 wd 0.0500 time 0.2505 (0.2438) data time 0.0007 (0.0031) model time 0.2497 (0.2406) loss 4.3325 (3.6951) grad_norm 1.7195 (1.8661) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][200/1251] eta 0:04:16 lr 0.000991 wd 0.0500 time 0.2382 (0.2436) data time 0.0008 (0.0031) model time 0.2374 (0.2404) loss 2.0956 (3.6817) grad_norm 1.5602 (1.8757) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][210/1251] eta 0:04:13 lr 0.000991 wd 0.0500 time 0.2407 (0.2436) data time 0.0008 (0.0030) model time 0.2400 (0.2404) loss 4.6331 (3.6918) grad_norm 2.2199 (1.8714) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][220/1251] eta 0:04:11 lr 0.000991 wd 0.0500 time 0.2396 (0.2436) data time 0.0007 (0.0029) model time 0.2390 (0.2405) loss 4.4096 (3.7056) grad_norm 1.3983 (1.8716) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][230/1251] eta 0:04:08 lr 0.000991 wd 0.0500 time 0.2411 (0.2434) data time 0.0008 (0.0028) model time 0.2404 (0.2404) loss 2.4441 (3.6998) grad_norm 2.1360 (1.8886) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][240/1251] eta 0:04:06 lr 0.000991 wd 0.0500 time 0.2460 (0.2434) data time 0.0011 (0.0028) model time 0.2449 (0.2406) loss 3.8081 (3.6987) grad_norm 2.0869 (1.8880) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][250/1251] eta 0:04:03 lr 0.000991 wd 0.0500 time 0.2438 (0.2434) data time 0.0008 (0.0027) model time 0.2431 (0.2406) loss 2.9866 (3.7045) grad_norm 1.8121 (1.8912) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][260/1251] eta 0:04:01 lr 0.000991 wd 0.0500 time 0.2430 (0.2433) data time 0.0011 (0.0026) model time 0.2419 (0.2406) loss 4.0064 
(3.7061) grad_norm 1.5984 (1.8816) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][270/1251] eta 0:03:59 lr 0.000991 wd 0.0500 time 0.4743 (0.2441) data time 0.0011 (0.0026) model time 0.4732 (0.2416) loss 2.8063 (3.7037) grad_norm 2.3704 (1.8893) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][280/1251] eta 0:03:57 lr 0.000991 wd 0.0500 time 0.2354 (0.2447) data time 0.0007 (0.0025) model time 0.2346 (0.2425) loss 3.8351 (3.7054) grad_norm 1.5145 (1.8860) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][290/1251] eta 0:03:55 lr 0.000991 wd 0.0500 time 0.2355 (0.2446) data time 0.0009 (0.0025) model time 0.2347 (0.2424) loss 4.6189 (3.7155) grad_norm 1.6512 (1.8868) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][300/1251] eta 0:03:52 lr 0.000991 wd 0.0500 time 0.2410 (0.2445) data time 0.0010 (0.0024) model time 0.2401 (0.2423) loss 3.6885 (3.7186) grad_norm 2.0098 (1.8952) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][310/1251] eta 0:03:49 lr 0.000991 wd 0.0500 time 0.2415 (0.2444) data time 0.0009 (0.0024) model time 0.2407 (0.2422) loss 3.9640 (3.7301) grad_norm 1.7353 (1.8971) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][320/1251] eta 0:03:47 lr 0.000991 wd 0.0500 time 0.2409 (0.2448) data time 0.0007 (0.0023) model time 0.2402 (0.2427) loss 4.7182 (3.7413) grad_norm 1.2805 (1.8909) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:22:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][330/1251] eta 0:03:45 lr 0.000991 wd 
0.0500 time 0.2408 (0.2452) data time 0.0008 (0.0023) model time 0.2400 (0.2433) loss 3.9280 (3.7376) grad_norm 1.9023 (1.8798) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][340/1251] eta 0:03:43 lr 0.000991 wd 0.0500 time 0.2471 (0.2452) data time 0.0008 (0.0023) model time 0.2463 (0.2433) loss 3.9654 (3.7383) grad_norm 1.9983 (1.8783) loss_scale 16384.0000 (8360.1642) mem 7379MB [2024-08-26 05:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][350/1251] eta 0:03:40 lr 0.000991 wd 0.0500 time 0.2578 (0.2452) data time 0.0008 (0.0022) model time 0.2570 (0.2432) loss 4.3181 (3.7390) grad_norm 2.4427 (1.8875) loss_scale 16384.0000 (8588.7635) mem 7379MB [2024-08-26 05:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][360/1251] eta 0:03:38 lr 0.000991 wd 0.0500 time 0.2436 (0.2451) data time 0.0011 (0.0022) model time 0.2425 (0.2432) loss 2.6816 (3.7370) grad_norm 2.0170 (1.8962) loss_scale 16384.0000 (8804.6981) mem 7379MB [2024-08-26 05:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][370/1251] eta 0:03:35 lr 0.000991 wd 0.0500 time 0.2429 (0.2451) data time 0.0007 (0.0022) model time 0.2422 (0.2432) loss 3.8265 (3.7466) grad_norm 2.2304 (1.9033) loss_scale 16384.0000 (9008.9919) mem 7379MB [2024-08-26 05:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][380/1251] eta 0:03:33 lr 0.000991 wd 0.0500 time 0.2438 (0.2450) data time 0.0009 (0.0021) model time 0.2429 (0.2432) loss 3.4201 (3.7516) grad_norm 2.1425 (1.9062) loss_scale 16384.0000 (9202.5617) mem 7379MB [2024-08-26 05:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][390/1251] eta 0:03:30 lr 0.000991 wd 0.0500 time 0.2353 (0.2449) data time 0.0011 (0.0021) model time 0.2342 (0.2431) loss 3.1285 (3.7517) grad_norm 1.5814 (1.9102) loss_scale 16384.0000 (9386.2302) mem 7379MB [2024-08-26 05:23:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][400/1251] eta 0:03:28 lr 0.000991 wd 0.0500 time 0.2472 (0.2454) data time 0.0010 (0.0021) model time 0.2463 (0.2436) loss 3.8455 (3.7531) grad_norm 1.7588 (1.9060) loss_scale 16384.0000 (9560.7382) mem 7379MB [2024-08-26 05:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][410/1251] eta 0:03:26 lr 0.000991 wd 0.0500 time 0.2425 (0.2453) data time 0.0007 (0.0021) model time 0.2418 (0.2435) loss 4.1845 (3.7541) grad_norm 1.4105 (1.9001) loss_scale 16384.0000 (9726.7543) mem 7379MB [2024-08-26 05:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][420/1251] eta 0:03:23 lr 0.000991 wd 0.0500 time 0.2459 (0.2452) data time 0.0010 (0.0021) model time 0.2449 (0.2434) loss 4.4534 (3.7560) grad_norm 2.5438 (1.9026) loss_scale 16384.0000 (9884.8836) mem 7379MB [2024-08-26 05:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][430/1251] eta 0:03:21 lr 0.000991 wd 0.0500 time 0.2486 (0.2451) data time 0.0007 (0.0020) model time 0.2479 (0.2434) loss 3.9149 (3.7599) grad_norm 1.9515 (1.9131) loss_scale 16384.0000 (10035.6752) mem 7379MB [2024-08-26 05:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][440/1251] eta 0:03:18 lr 0.000991 wd 0.0500 time 0.2516 (0.2451) data time 0.0007 (0.0020) model time 0.2508 (0.2434) loss 4.1713 (3.7581) grad_norm 1.6092 (1.9179) loss_scale 16384.0000 (10179.6281) mem 7379MB [2024-08-26 05:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][450/1251] eta 0:03:16 lr 0.000991 wd 0.0500 time 0.2402 (0.2451) data time 0.0010 (0.0020) model time 0.2393 (0.2434) loss 3.9329 (3.7553) grad_norm 1.7151 (1.9158) loss_scale 16384.0000 (10317.1973) mem 7379MB [2024-08-26 05:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][460/1251] eta 0:03:13 lr 0.000991 wd 0.0500 time 0.2393 (0.2450) data time 0.0009 (0.0020) model time 0.2384 (0.2433) loss 
3.4366 (3.7563) grad_norm 1.9095 (1.9152) loss_scale 16384.0000 (10448.7983) mem 7379MB [2024-08-26 05:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][470/1251] eta 0:03:11 lr 0.000991 wd 0.0500 time 0.2399 (0.2449) data time 0.0012 (0.0020) model time 0.2387 (0.2432) loss 3.5566 (3.7534) grad_norm 1.6518 (1.9168) loss_scale 16384.0000 (10574.8110) mem 7379MB [2024-08-26 05:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][480/1251] eta 0:03:08 lr 0.000991 wd 0.0500 time 0.2407 (0.2448) data time 0.0010 (0.0019) model time 0.2398 (0.2431) loss 4.2732 (3.7579) grad_norm 1.9818 (1.9154) loss_scale 16384.0000 (10695.5842) mem 7379MB [2024-08-26 05:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][490/1251] eta 0:03:06 lr 0.000991 wd 0.0500 time 0.2430 (0.2448) data time 0.0009 (0.0019) model time 0.2421 (0.2431) loss 4.0022 (3.7592) grad_norm 2.1224 (1.9194) loss_scale 16384.0000 (10811.4379) mem 7379MB [2024-08-26 05:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][500/1251] eta 0:03:03 lr 0.000991 wd 0.0500 time 0.2405 (0.2448) data time 0.0010 (0.0019) model time 0.2395 (0.2431) loss 3.9531 (3.7613) grad_norm 2.3215 (1.9248) loss_scale 16384.0000 (10922.6667) mem 7379MB [2024-08-26 05:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][510/1251] eta 0:03:01 lr 0.000991 wd 0.0500 time 0.2437 (0.2447) data time 0.0010 (0.0019) model time 0.2428 (0.2431) loss 3.7705 (3.7634) grad_norm 1.8728 (1.9262) loss_scale 16384.0000 (11029.5421) mem 7379MB [2024-08-26 05:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][520/1251] eta 0:02:58 lr 0.000991 wd 0.0500 time 0.2362 (0.2447) data time 0.0009 (0.0019) model time 0.2353 (0.2430) loss 3.7635 (3.7574) grad_norm 1.7453 (1.9235) loss_scale 16384.0000 (11132.3148) mem 7379MB [2024-08-26 05:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][530/1251] eta 
0:02:56 lr 0.000991 wd 0.0500 time 0.2461 (0.2447) data time 0.0009 (0.0018) model time 0.2452 (0.2430) loss 2.5171 (3.7557) grad_norm 4.2195 (1.9311) loss_scale 16384.0000 (11231.2166) mem 7379MB [2024-08-26 05:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][540/1251] eta 0:02:53 lr 0.000991 wd 0.0500 time 0.2426 (0.2446) data time 0.0009 (0.0018) model time 0.2417 (0.2430) loss 4.9446 (3.7557) grad_norm 1.4679 (1.9413) loss_scale 16384.0000 (11326.4621) mem 7379MB [2024-08-26 05:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][550/1251] eta 0:02:51 lr 0.000991 wd 0.0500 time 0.2408 (0.2446) data time 0.0009 (0.0018) model time 0.2400 (0.2430) loss 4.4931 (3.7630) grad_norm 1.8146 (1.9410) loss_scale 16384.0000 (11418.2505) mem 7379MB [2024-08-26 05:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][560/1251] eta 0:02:49 lr 0.000991 wd 0.0500 time 0.2427 (0.2446) data time 0.0011 (0.0019) model time 0.2416 (0.2429) loss 3.7510 (3.7655) grad_norm 1.7903 (1.9359) loss_scale 16384.0000 (11506.7665) mem 7379MB [2024-08-26 05:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][570/1251] eta 0:02:46 lr 0.000991 wd 0.0500 time 0.2389 (0.2446) data time 0.0010 (0.0018) model time 0.2379 (0.2429) loss 3.2615 (3.7647) grad_norm 1.4342 (1.9345) loss_scale 16384.0000 (11592.1821) mem 7379MB [2024-08-26 05:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][580/1251] eta 0:02:44 lr 0.000991 wd 0.0500 time 0.2366 (0.2445) data time 0.0010 (0.0018) model time 0.2356 (0.2429) loss 3.5693 (3.7604) grad_norm 1.9407 (1.9289) loss_scale 16384.0000 (11674.6575) mem 7379MB [2024-08-26 05:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][590/1251] eta 0:02:41 lr 0.000991 wd 0.0500 time 0.2373 (0.2445) data time 0.0009 (0.0018) model time 0.2363 (0.2428) loss 3.6591 (3.7584) grad_norm 1.6022 (1.9301) loss_scale 16384.0000 (11754.3418) mem 
7379MB [2024-08-26 05:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][600/1251] eta 0:02:39 lr 0.000991 wd 0.0500 time 0.2341 (0.2444) data time 0.0010 (0.0018) model time 0.2331 (0.2428) loss 2.4235 (3.7563) grad_norm 1.8767 (1.9325) loss_scale 16384.0000 (11831.3744) mem 7379MB [2024-08-26 05:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][610/1251] eta 0:02:36 lr 0.000991 wd 0.0500 time 0.2390 (0.2444) data time 0.0010 (0.0018) model time 0.2380 (0.2428) loss 4.3733 (3.7523) grad_norm 2.4501 (1.9356) loss_scale 16384.0000 (11905.8854) mem 7379MB [2024-08-26 05:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][620/1251] eta 0:02:34 lr 0.000990 wd 0.0500 time 0.2477 (0.2444) data time 0.0007 (0.0018) model time 0.2470 (0.2427) loss 4.6801 (3.7571) grad_norm 3.5418 (1.9448) loss_scale 16384.0000 (11977.9968) mem 7379MB [2024-08-26 05:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][630/1251] eta 0:02:31 lr 0.000990 wd 0.0500 time 0.2482 (0.2443) data time 0.0009 (0.0018) model time 0.2472 (0.2427) loss 4.7024 (3.7541) grad_norm 1.6816 (1.9482) loss_scale 16384.0000 (12047.8225) mem 7379MB [2024-08-26 05:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][640/1251] eta 0:02:29 lr 0.000990 wd 0.0500 time 0.2389 (0.2443) data time 0.0010 (0.0018) model time 0.2380 (0.2427) loss 3.4994 (3.7511) grad_norm 1.8475 (1.9504) loss_scale 16384.0000 (12115.4696) mem 7379MB [2024-08-26 05:24:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][650/1251] eta 0:02:26 lr 0.000990 wd 0.0500 time 0.2392 (0.2443) data time 0.0010 (0.0017) model time 0.2382 (0.2426) loss 3.9943 (3.7540) grad_norm 2.2668 (1.9472) loss_scale 16384.0000 (12181.0384) mem 7379MB [2024-08-26 05:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][660/1251] eta 0:02:24 lr 0.000990 wd 0.0500 time 0.2396 (0.2442) data time 0.0010 (0.0017) model 
time 0.2386 (0.2426) loss 3.9775 (3.7517) grad_norm 1.6342 (1.9463) loss_scale 16384.0000 (12244.6233) mem 7379MB [2024-08-26 05:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][670/1251] eta 0:02:21 lr 0.000990 wd 0.0500 time 0.2430 (0.2442) data time 0.0008 (0.0017) model time 0.2421 (0.2426) loss 3.4100 (3.7541) grad_norm 1.4588 (1.9421) loss_scale 16384.0000 (12306.3130) mem 7379MB [2024-08-26 05:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][680/1251] eta 0:02:19 lr 0.000990 wd 0.0500 time 0.2364 (0.2442) data time 0.0014 (0.0017) model time 0.2350 (0.2426) loss 3.5424 (3.7577) grad_norm 1.7162 (1.9460) loss_scale 16384.0000 (12366.1909) mem 7379MB [2024-08-26 05:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][690/1251] eta 0:02:16 lr 0.000990 wd 0.0500 time 0.2427 (0.2441) data time 0.0011 (0.0017) model time 0.2416 (0.2425) loss 3.7438 (3.7564) grad_norm 2.0170 (1.9555) loss_scale 16384.0000 (12424.3357) mem 7379MB [2024-08-26 05:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][700/1251] eta 0:02:14 lr 0.000990 wd 0.0500 time 0.2434 (0.2441) data time 0.0007 (0.0017) model time 0.2427 (0.2425) loss 4.1735 (3.7584) grad_norm 1.9602 (1.9559) loss_scale 16384.0000 (12480.8217) mem 7379MB [2024-08-26 05:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][710/1251] eta 0:02:12 lr 0.000990 wd 0.0500 time 0.2458 (0.2441) data time 0.0007 (0.0017) model time 0.2452 (0.2425) loss 3.4256 (3.7565) grad_norm 1.7995 (1.9569) loss_scale 16384.0000 (12535.7187) mem 7379MB [2024-08-26 05:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][720/1251] eta 0:02:09 lr 0.000990 wd 0.0500 time 0.2351 (0.2440) data time 0.0007 (0.0017) model time 0.2344 (0.2425) loss 3.9858 (3.7552) grad_norm 1.4941 (1.9556) loss_scale 16384.0000 (12589.0929) mem 7379MB [2024-08-26 05:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [37/300][730/1251] eta 0:02:07 lr 0.000990 wd 0.0500 time 0.2389 (0.2440) data time 0.0009 (0.0017) model time 0.2380 (0.2424) loss 2.5797 (3.7454) grad_norm 1.7877 (1.9500) loss_scale 16384.0000 (12641.0068) mem 7379MB [2024-08-26 05:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][740/1251] eta 0:02:04 lr 0.000990 wd 0.0500 time 0.2461 (0.2440) data time 0.0007 (0.0017) model time 0.2454 (0.2424) loss 4.7720 (3.7447) grad_norm 1.7854 (1.9504) loss_scale 16384.0000 (12691.5196) mem 7379MB [2024-08-26 05:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][750/1251] eta 0:02:02 lr 0.000990 wd 0.0500 time 0.2463 (0.2440) data time 0.0009 (0.0017) model time 0.2454 (0.2424) loss 3.9518 (3.7427) grad_norm 2.0241 (1.9517) loss_scale 16384.0000 (12740.6871) mem 7379MB [2024-08-26 05:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][760/1251] eta 0:01:59 lr 0.000990 wd 0.0500 time 0.2373 (0.2439) data time 0.0008 (0.0016) model time 0.2365 (0.2424) loss 4.0987 (3.7423) grad_norm 1.8413 (1.9511) loss_scale 16384.0000 (12788.5624) mem 7379MB [2024-08-26 05:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][770/1251] eta 0:01:57 lr 0.000990 wd 0.0500 time 0.2412 (0.2439) data time 0.0011 (0.0016) model time 0.2401 (0.2424) loss 4.1250 (3.7413) grad_norm 1.2121 (1.9481) loss_scale 16384.0000 (12835.1958) mem 7379MB [2024-08-26 05:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][780/1251] eta 0:01:54 lr 0.000990 wd 0.0500 time 0.2417 (0.2439) data time 0.0007 (0.0016) model time 0.2410 (0.2424) loss 2.6221 (3.7394) grad_norm 1.6102 (1.9495) loss_scale 16384.0000 (12880.6351) mem 7379MB [2024-08-26 05:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][790/1251] eta 0:01:52 lr 0.000990 wd 0.0500 time 0.4173 (0.2441) data time 0.0008 (0.0016) model time 0.4165 (0.2426) loss 4.1914 (3.7401) grad_norm 2.1200 (1.9498) loss_scale 
16384.0000 (12924.9254) mem 7379MB [2024-08-26 05:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][800/1251] eta 0:01:50 lr 0.000990 wd 0.0500 time 0.2476 (0.2443) data time 0.0007 (0.0016) model time 0.2469 (0.2428) loss 4.5338 (3.7413) grad_norm 2.4129 (1.9523) loss_scale 16384.0000 (12968.1099) mem 7379MB [2024-08-26 05:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][810/1251] eta 0:01:47 lr 0.000990 wd 0.0500 time 0.2370 (0.2443) data time 0.0010 (0.0016) model time 0.2359 (0.2428) loss 2.9800 (3.7404) grad_norm 1.6406 (1.9504) loss_scale 16384.0000 (13010.2293) mem 7379MB [2024-08-26 05:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][820/1251] eta 0:01:45 lr 0.000990 wd 0.0500 time 0.2505 (0.2443) data time 0.0010 (0.0016) model time 0.2495 (0.2428) loss 3.7394 (3.7421) grad_norm 1.2107 (1.9523) loss_scale 16384.0000 (13051.3228) mem 7379MB [2024-08-26 05:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][830/1251] eta 0:01:42 lr 0.000990 wd 0.0500 time 0.2412 (0.2443) data time 0.0007 (0.0016) model time 0.2405 (0.2428) loss 4.0140 (3.7435) grad_norm 1.9218 (1.9531) loss_scale 16384.0000 (13091.4272) mem 7379MB [2024-08-26 05:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][840/1251] eta 0:01:40 lr 0.000990 wd 0.0500 time 0.2430 (0.2445) data time 0.0009 (0.0016) model time 0.2421 (0.2431) loss 3.8151 (3.7430) grad_norm 1.7645 (1.9502) loss_scale 16384.0000 (13130.5779) mem 7379MB [2024-08-26 05:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][850/1251] eta 0:01:38 lr 0.000990 wd 0.0500 time 0.2408 (0.2450) data time 0.0009 (0.0016) model time 0.2399 (0.2435) loss 2.5120 (3.7426) grad_norm 1.9108 (1.9480) loss_scale 16384.0000 (13168.8085) mem 7379MB [2024-08-26 05:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][860/1251] eta 0:01:35 lr 0.000990 wd 0.0500 time 0.2407 (0.2449) 
data time 0.0013 (0.0016) model time 0.2395 (0.2435) loss 3.4718 (3.7435) grad_norm 1.3149 (1.9468) loss_scale 16384.0000 (13206.1510) mem 7379MB [2024-08-26 05:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][870/1251] eta 0:01:33 lr 0.000990 wd 0.0500 time 0.2376 (0.2449) data time 0.0008 (0.0016) model time 0.2368 (0.2435) loss 4.1327 (3.7404) grad_norm 1.8649 (1.9470) loss_scale 16384.0000 (13242.6361) mem 7379MB [2024-08-26 05:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][880/1251] eta 0:01:30 lr 0.000990 wd 0.0500 time 0.2390 (0.2449) data time 0.0008 (0.0016) model time 0.2382 (0.2435) loss 2.6639 (3.7405) grad_norm 1.9846 (1.9440) loss_scale 16384.0000 (13278.2928) mem 7379MB [2024-08-26 05:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][890/1251] eta 0:01:28 lr 0.000990 wd 0.0500 time 0.2424 (0.2448) data time 0.0009 (0.0016) model time 0.2415 (0.2434) loss 2.2905 (3.7403) grad_norm 2.7680 (1.9418) loss_scale 16384.0000 (13313.1493) mem 7379MB [2024-08-26 05:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][900/1251] eta 0:01:25 lr 0.000990 wd 0.0500 time 0.2390 (0.2448) data time 0.0012 (0.0016) model time 0.2379 (0.2434) loss 3.3973 (3.7422) grad_norm 1.5839 (1.9427) loss_scale 16384.0000 (13347.2320) mem 7379MB [2024-08-26 05:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][910/1251] eta 0:01:23 lr 0.000990 wd 0.0500 time 0.2399 (0.2448) data time 0.0010 (0.0016) model time 0.2389 (0.2434) loss 2.5857 (3.7409) grad_norm 1.6404 (1.9396) loss_scale 16384.0000 (13380.5664) mem 7379MB [2024-08-26 05:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][920/1251] eta 0:01:21 lr 0.000990 wd 0.0500 time 0.2460 (0.2448) data time 0.0007 (0.0015) model time 0.2453 (0.2433) loss 4.5542 (3.7363) grad_norm 1.5075 (1.9408) loss_scale 16384.0000 (13413.1770) mem 7379MB [2024-08-26 05:25:26 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [37/300][930/1251] eta 0:01:18 lr 0.000990 wd 0.0500 time 0.2426 (0.2447) data time 0.0010 (0.0015) model time 0.2415 (0.2433) loss 4.0486 (3.7354) grad_norm 1.7466 (1.9438) loss_scale 16384.0000 (13445.0870) mem 7379MB [2024-08-26 05:25:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][940/1251] eta 0:01:16 lr 0.000990 wd 0.0500 time 0.2502 (0.2447) data time 0.0010 (0.0015) model time 0.2493 (0.2433) loss 3.6333 (3.7359) grad_norm 1.8782 (1.9442) loss_scale 16384.0000 (13476.3188) mem 7379MB [2024-08-26 05:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][950/1251] eta 0:01:13 lr 0.000990 wd 0.0500 time 0.2421 (0.2447) data time 0.0007 (0.0015) model time 0.2413 (0.2433) loss 4.3556 (3.7365) grad_norm 1.4172 (1.9428) loss_scale 16384.0000 (13506.8938) mem 7379MB [2024-08-26 05:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][960/1251] eta 0:01:11 lr 0.000990 wd 0.0500 time 0.2410 (0.2447) data time 0.0009 (0.0015) model time 0.2400 (0.2433) loss 4.5859 (3.7375) grad_norm 1.4256 (1.9401) loss_scale 16384.0000 (13536.8325) mem 7379MB [2024-08-26 05:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][970/1251] eta 0:01:08 lr 0.000990 wd 0.0500 time 0.2428 (0.2447) data time 0.0009 (0.0015) model time 0.2419 (0.2433) loss 3.7850 (3.7358) grad_norm 1.7620 (1.9381) loss_scale 16384.0000 (13566.1545) mem 7379MB [2024-08-26 05:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][980/1251] eta 0:01:06 lr 0.000990 wd 0.0500 time 0.2460 (0.2447) data time 0.0009 (0.0015) model time 0.2450 (0.2433) loss 2.7179 (3.7357) grad_norm 2.6345 (1.9378) loss_scale 16384.0000 (13594.8787) mem 7379MB [2024-08-26 05:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][990/1251] eta 0:01:03 lr 0.000990 wd 0.0500 time 0.2389 (0.2447) data time 0.0007 (0.0015) model time 0.2382 (0.2433) loss 4.3631 (3.7366) 
grad_norm 2.3862 (1.9379) loss_scale 16384.0000 (13623.0232) mem 7379MB [2024-08-26 05:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1000/1251] eta 0:01:01 lr 0.000990 wd 0.0500 time 0.2320 (0.2447) data time 0.0009 (0.0015) model time 0.2312 (0.2433) loss 4.3969 (3.7381) grad_norm 2.7894 (1.9393) loss_scale 16384.0000 (13650.6054) mem 7379MB [2024-08-26 05:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1010/1251] eta 0:00:58 lr 0.000990 wd 0.0500 time 0.2377 (0.2447) data time 0.0010 (0.0015) model time 0.2366 (0.2433) loss 3.1665 (3.7362) grad_norm 2.9914 (1.9415) loss_scale 16384.0000 (13677.6419) mem 7379MB [2024-08-26 05:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1020/1251] eta 0:00:56 lr 0.000990 wd 0.0500 time 0.2374 (0.2447) data time 0.0009 (0.0015) model time 0.2365 (0.2433) loss 3.3411 (3.7356) grad_norm 1.8867 (1.9409) loss_scale 16384.0000 (13704.1489) mem 7379MB [2024-08-26 05:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1030/1251] eta 0:00:54 lr 0.000990 wd 0.0500 time 0.2429 (0.2447) data time 0.0009 (0.0015) model time 0.2420 (0.2433) loss 3.7061 (3.7355) grad_norm 2.5744 (1.9409) loss_scale 16384.0000 (13730.1416) mem 7379MB [2024-08-26 05:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1040/1251] eta 0:00:51 lr 0.000990 wd 0.0500 time 0.2462 (0.2447) data time 0.0008 (0.0015) model time 0.2455 (0.2433) loss 2.4507 (3.7345) grad_norm 1.7966 (1.9403) loss_scale 16384.0000 (13755.6350) mem 7379MB [2024-08-26 05:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1050/1251] eta 0:00:49 lr 0.000990 wd 0.0500 time 0.2374 (0.2446) data time 0.0009 (0.0015) model time 0.2365 (0.2432) loss 4.3881 (3.7382) grad_norm 2.0051 (1.9421) loss_scale 16384.0000 (13780.6432) mem 7379MB [2024-08-26 05:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1060/1251] eta 0:00:46 lr 
0.000990 wd 0.0500 time 0.2425 (0.2446) data time 0.0011 (0.0015) model time 0.2414 (0.2432) loss 3.0557 (3.7385) grad_norm 1.7856 (1.9426) loss_scale 16384.0000 (13805.1800) mem 7379MB [2024-08-26 05:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1070/1251] eta 0:00:44 lr 0.000990 wd 0.0500 time 0.2406 (0.2446) data time 0.0011 (0.0015) model time 0.2396 (0.2432) loss 2.4554 (3.7365) grad_norm 3.6862 (1.9422) loss_scale 16384.0000 (13829.2586) mem 7379MB [2024-08-26 05:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1080/1251] eta 0:00:41 lr 0.000990 wd 0.0500 time 0.2596 (0.2446) data time 0.0007 (0.0015) model time 0.2588 (0.2432) loss 2.5844 (3.7375) grad_norm 1.6431 (1.9424) loss_scale 16384.0000 (13852.8918) mem 7379MB [2024-08-26 05:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1090/1251] eta 0:00:39 lr 0.000990 wd 0.0500 time 0.2387 (0.2445) data time 0.0011 (0.0015) model time 0.2376 (0.2431) loss 4.1941 (3.7356) grad_norm 1.5362 (1.9405) loss_scale 16384.0000 (13876.0917) mem 7379MB [2024-08-26 05:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1100/1251] eta 0:00:36 lr 0.000990 wd 0.0500 time 0.2334 (0.2445) data time 0.0011 (0.0015) model time 0.2324 (0.2431) loss 4.2706 (3.7325) grad_norm 1.4044 (1.9384) loss_scale 16384.0000 (13898.8701) mem 7379MB [2024-08-26 05:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1110/1251] eta 0:00:34 lr 0.000990 wd 0.0500 time 0.2448 (0.2445) data time 0.0010 (0.0015) model time 0.2438 (0.2431) loss 3.8200 (3.7333) grad_norm 1.9038 (1.9386) loss_scale 16384.0000 (13921.2385) mem 7379MB [2024-08-26 05:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1120/1251] eta 0:00:32 lr 0.000990 wd 0.0500 time 0.2399 (0.2444) data time 0.0007 (0.0015) model time 0.2392 (0.2431) loss 4.0626 (3.7323) grad_norm 1.8778 (1.9385) loss_scale 16384.0000 (13943.2079) mem 7379MB 
[2024-08-26 05:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1130/1251] eta 0:00:29 lr 0.000990 wd 0.0500 time 0.2398 (0.2444) data time 0.0011 (0.0015) model time 0.2387 (0.2431) loss 2.9610 (3.7333) grad_norm 1.7348 (1.9403) loss_scale 16384.0000 (13964.7887) mem 7379MB [2024-08-26 05:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1140/1251] eta 0:00:27 lr 0.000990 wd 0.0500 time 0.2397 (0.2444) data time 0.0011 (0.0015) model time 0.2386 (0.2430) loss 3.9222 (3.7351) grad_norm 1.7595 (1.9389) loss_scale 16384.0000 (13985.9912) mem 7379MB [2024-08-26 05:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1150/1251] eta 0:00:24 lr 0.000990 wd 0.0500 time 0.2425 (0.2444) data time 0.0011 (0.0015) model time 0.2413 (0.2430) loss 2.6360 (3.7344) grad_norm 1.8932 (1.9388) loss_scale 16384.0000 (14006.8254) mem 7379MB [2024-08-26 05:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1160/1251] eta 0:00:22 lr 0.000990 wd 0.0500 time 0.2382 (0.2443) data time 0.0010 (0.0015) model time 0.2372 (0.2430) loss 3.9342 (3.7339) grad_norm 1.5961 (1.9364) loss_scale 16384.0000 (14027.3006) mem 7379MB [2024-08-26 05:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1170/1251] eta 0:00:19 lr 0.000990 wd 0.0500 time 0.2410 (0.2443) data time 0.0011 (0.0015) model time 0.2400 (0.2429) loss 3.6107 (3.7331) grad_norm 1.7011 (1.9356) loss_scale 16384.0000 (14047.4261) mem 7379MB [2024-08-26 05:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1180/1251] eta 0:00:17 lr 0.000990 wd 0.0500 time 0.2442 (0.2443) data time 0.0007 (0.0015) model time 0.2435 (0.2429) loss 4.2405 (3.7343) grad_norm 2.8175 (1.9360) loss_scale 16384.0000 (14067.2108) mem 7379MB [2024-08-26 05:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1190/1251] eta 0:00:14 lr 0.000990 wd 0.0500 time 0.2364 (0.2443) data time 0.0011 (0.0014) model 
time 0.2353 (0.2429) loss 3.9467 (3.7343) grad_norm 1.8269 (1.9358) loss_scale 16384.0000 (14086.6633) mem 7379MB [2024-08-26 05:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1200/1251] eta 0:00:12 lr 0.000990 wd 0.0500 time 0.2415 (0.2443) data time 0.0010 (0.0014) model time 0.2405 (0.2429) loss 3.1349 (3.7332) grad_norm 1.8382 (1.9363) loss_scale 16384.0000 (14105.7918) mem 7379MB [2024-08-26 05:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1210/1251] eta 0:00:10 lr 0.000990 wd 0.0500 time 0.2448 (0.2442) data time 0.0009 (0.0014) model time 0.2439 (0.2429) loss 3.9540 (3.7316) grad_norm 2.0962 (1.9343) loss_scale 16384.0000 (14124.6045) mem 7379MB [2024-08-26 05:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1220/1251] eta 0:00:07 lr 0.000990 wd 0.0500 time 0.2471 (0.2442) data time 0.0007 (0.0014) model time 0.2464 (0.2429) loss 3.1440 (3.7322) grad_norm 2.6863 (1.9345) loss_scale 16384.0000 (14143.1089) mem 7379MB [2024-08-26 05:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1230/1251] eta 0:00:05 lr 0.000990 wd 0.0500 time 0.2361 (0.2442) data time 0.0014 (0.0014) model time 0.2347 (0.2429) loss 4.0768 (3.7321) grad_norm 2.0079 (1.9350) loss_scale 16384.0000 (14161.3128) mem 7379MB [2024-08-26 05:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1240/1251] eta 0:00:02 lr 0.000990 wd 0.0500 time 0.2254 (0.2441) data time 0.0007 (0.0014) model time 0.2247 (0.2428) loss 3.2533 (3.7307) grad_norm 2.3739 (1.9371) loss_scale 16384.0000 (14179.2232) mem 7379MB [2024-08-26 05:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [37/300][1250/1251] eta 0:00:00 lr 0.000990 wd 0.0500 time 0.2252 (0.2440) data time 0.0007 (0.0014) model time 0.2245 (0.2426) loss 3.5855 (3.7301) grad_norm 1.8387 (1.9357) loss_scale 16384.0000 (14196.8473) mem 7379MB [2024-08-26 05:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO 
EPOCH 37 training takes 0:05:05
[2024-08-26 05:26:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 05:26:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 05:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.466 (0.466) Loss 0.5610 (0.5610) Acc@1 88.867 (88.867) Acc@5 97.754 (97.754) Mem 7379MB
[2024-08-26 05:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.114) Loss 0.9482 (0.9130) Acc@1 78.320 (79.039) Acc@5 94.336 (94.993) Mem 7379MB
[2024-08-26 05:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.097) Loss 1.2646 (0.9290) Acc@1 70.801 (78.255) Acc@5 89.941 (95.024) Mem 7379MB
[2024-08-26 05:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.091) Loss 1.6367 (1.0687) Acc@1 62.109 (75.265) Acc@5 84.766 (93.060) Mem 7379MB
[2024-08-26 05:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.5410 (1.1520) Acc@1 65.723 (73.416) Acc@5 86.426 (91.980) Mem 7379MB
[2024-08-26 05:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.064 Acc@5 91.786
[2024-08-26 05:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.1%
[2024-08-26 05:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 73.06%
[2024-08-26 05:26:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:26:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.427 (0.427) Loss 0.4868 (0.4868) Acc@1 89.453 (89.453) Acc@5 97.559 (97.559) Mem 7379MB
[2024-08-26 05:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.107) Loss 0.8276 (0.8082) Acc@1 81.348 (80.487) Acc@5 95.215 (95.481) Mem 7379MB
[2024-08-26 05:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.093) Loss 1.2051 (0.8225) Acc@1 71.680 (79.827) Acc@5 90.625 (95.392) Mem 7379MB
[2024-08-26 05:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.090) Loss 1.4844 (0.9551) Acc@1 63.477 (76.799) Acc@5 86.426 (93.637) Mem 7379MB
[2024-08-26 05:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.4346 (1.0341) Acc@1 64.551 (74.950) Acc@5 87.598 (92.631) Mem 7379MB
[2024-08-26 05:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.632 Acc@5 92.548
[2024-08-26 05:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 74.6%
[2024-08-26 05:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 74.63%
[2024-08-26 05:26:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:26:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][0/1251] eta 0:13:48 lr 0.000990 wd 0.0500 time 0.6624 (0.6624) data time 0.4182 (0.4182) model time 0.0000 (0.0000) loss 2.7662 (2.7662) grad_norm 1.5932 (1.5932) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][10/1251] eta 0:05:46 lr 0.000990 wd 0.0500 time 0.2465 (0.2790) data time 0.0010 (0.0391) model time 0.0000 (0.0000) loss 4.0625 (3.9098) grad_norm 2.9424 (1.9189) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][20/1251] eta 0:05:20 lr 0.000990 wd 0.0500 time 0.2382 (0.2603) data time 0.0007 (0.0210) model time 0.0000 (0.0000) loss 3.6758 (3.8299) grad_norm 2.8711 (1.9483) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][30/1251] eta 0:05:10 lr 0.000990 wd 0.0500 time 0.2395 (0.2542) data time 0.0008 (0.0145) model time 0.0000 (0.0000) loss 3.1348 (3.8268) grad_norm 1.8957 (1.9017) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][40/1251] eta 0:05:04 lr 0.000990 wd 0.0500 time 0.2444 (0.2514) data time 0.0007 (0.0112) model time 0.0000 (0.0000) loss 2.6984 (3.7284) grad_norm 1.4856 (1.8575) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][50/1251] eta 0:05:04 lr 0.000990 wd 0.0500 time 0.2325 (0.2535) data time 0.0010 (0.0092) model time 0.0000 (0.0000) loss 4.0869 (3.7449) grad_norm 2.1511 (1.8351) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][60/1251] eta 0:04:59 lr 0.000990 wd 0.0500 time 0.2436 (0.2514) data time 0.0007 (0.0079) model time 0.2429 
(0.2396) loss 2.8536 (3.7760) grad_norm 1.8153 (1.8533) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][70/1251] eta 0:05:01 lr 0.000990 wd 0.0500 time 0.4085 (0.2553) data time 0.0011 (0.0069) model time 0.4074 (0.2588) loss 3.3122 (3.7887) grad_norm 1.4876 (1.8706) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][80/1251] eta 0:04:56 lr 0.000990 wd 0.0500 time 0.2378 (0.2533) data time 0.0007 (0.0062) model time 0.2371 (0.2520) loss 2.6060 (3.7120) grad_norm 2.7283 (1.8919) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][90/1251] eta 0:04:52 lr 0.000990 wd 0.0500 time 0.2416 (0.2519) data time 0.0010 (0.0056) model time 0.2406 (0.2489) loss 3.7850 (3.7136) grad_norm 2.0417 (1.9198) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][100/1251] eta 0:04:48 lr 0.000990 wd 0.0500 time 0.2379 (0.2508) data time 0.0009 (0.0052) model time 0.2370 (0.2471) loss 4.0655 (3.7439) grad_norm 1.7734 (1.9258) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][110/1251] eta 0:04:45 lr 0.000990 wd 0.0500 time 0.2385 (0.2500) data time 0.0007 (0.0048) model time 0.2378 (0.2460) loss 2.7125 (3.7456) grad_norm 1.6121 (1.9253) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][120/1251] eta 0:04:42 lr 0.000990 wd 0.0500 time 0.2421 (0.2494) data time 0.0014 (0.0045) model time 0.2406 (0.2455) loss 3.1207 (3.7435) grad_norm 1.6427 (1.9212) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[38/300][130/1251] eta 0:04:38 lr 0.000990 wd 0.0500 time 0.2416 (0.2489) data time 0.0011 (0.0042) model time 0.2405 (0.2449) loss 2.9403 (3.7312) grad_norm 3.3710 (1.9344) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][140/1251] eta 0:04:35 lr 0.000990 wd 0.0500 time 0.2402 (0.2482) data time 0.0008 (0.0040) model time 0.2394 (0.2442) loss 4.3219 (3.7298) grad_norm 2.0701 (1.9326) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][150/1251] eta 0:04:32 lr 0.000990 wd 0.0500 time 0.2364 (0.2479) data time 0.0009 (0.0038) model time 0.2356 (0.2440) loss 3.4189 (3.7298) grad_norm 1.9356 (1.9579) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][160/1251] eta 0:04:30 lr 0.000990 wd 0.0500 time 0.2397 (0.2475) data time 0.0009 (0.0036) model time 0.2389 (0.2437) loss 2.5881 (3.7247) grad_norm 1.8670 (1.9812) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][170/1251] eta 0:04:27 lr 0.000990 wd 0.0500 time 0.2518 (0.2471) data time 0.0007 (0.0035) model time 0.2511 (0.2434) loss 4.6456 (3.7197) grad_norm 1.8213 (1.9722) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][180/1251] eta 0:04:24 lr 0.000990 wd 0.0500 time 0.2387 (0.2470) data time 0.0011 (0.0034) model time 0.2376 (0.2434) loss 3.8614 (3.7124) grad_norm 1.5280 (1.9587) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][190/1251] eta 0:04:21 lr 0.000990 wd 0.0500 time 0.2304 (0.2467) data time 0.0011 (0.0032) model time 0.2294 (0.2431) loss 3.1913 (3.6977) grad_norm 2.1294 (1.9584) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][200/1251] eta 0:04:19 lr 0.000990 wd 0.0500 time 0.2370 (0.2464) data time 0.0011 (0.0031) model time 0.2359 (0.2430) loss 3.9115 (3.6948) grad_norm 1.6300 (1.9452) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][210/1251] eta 0:04:16 lr 0.000990 wd 0.0500 time 0.2428 (0.2464) data time 0.0009 (0.0030) model time 0.2419 (0.2430) loss 2.6784 (3.6788) grad_norm 1.9651 (1.9394) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][220/1251] eta 0:04:13 lr 0.000990 wd 0.0500 time 0.2388 (0.2461) data time 0.0011 (0.0029) model time 0.2377 (0.2429) loss 3.9356 (3.6826) grad_norm 1.8731 (1.9352) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][230/1251] eta 0:04:11 lr 0.000990 wd 0.0500 time 0.2349 (0.2459) data time 0.0011 (0.0029) model time 0.2339 (0.2427) loss 3.3387 (3.6788) grad_norm 1.7753 (1.9303) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][240/1251] eta 0:04:08 lr 0.000990 wd 0.0500 time 0.2422 (0.2456) data time 0.0010 (0.0028) model time 0.2412 (0.2425) loss 3.7836 (3.6823) grad_norm 2.3833 (1.9443) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][250/1251] eta 0:04:05 lr 0.000990 wd 0.0500 time 0.2411 (0.2454) data time 0.0012 (0.0027) model time 0.2400 (0.2423) loss 4.0009 (3.6964) grad_norm 1.6321 (1.9455) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][260/1251] eta 0:04:03 lr 0.000990 wd 0.0500 time 0.2393 (0.2453) 
data time 0.0009 (0.0026) model time 0.2384 (0.2423) loss 4.8033 (3.7023) grad_norm 1.4131 (1.9424) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][270/1251] eta 0:04:00 lr 0.000990 wd 0.0500 time 0.2419 (0.2452) data time 0.0008 (0.0026) model time 0.2411 (0.2422) loss 3.6851 (3.7043) grad_norm 1.9990 (1.9496) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][280/1251] eta 0:03:57 lr 0.000990 wd 0.0500 time 0.2327 (0.2450) data time 0.0009 (0.0025) model time 0.2319 (0.2420) loss 2.9474 (3.6996) grad_norm 1.8201 (1.9614) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][290/1251] eta 0:03:55 lr 0.000990 wd 0.0500 time 0.2367 (0.2448) data time 0.0009 (0.0025) model time 0.2358 (0.2419) loss 3.7210 (3.6949) grad_norm 1.5405 (1.9683) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][300/1251] eta 0:03:53 lr 0.000990 wd 0.0500 time 0.2462 (0.2455) data time 0.0009 (0.0024) model time 0.2452 (0.2428) loss 4.0717 (3.6933) grad_norm 1.5332 (1.9598) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][310/1251] eta 0:03:50 lr 0.000990 wd 0.0500 time 0.2392 (0.2454) data time 0.0008 (0.0024) model time 0.2384 (0.2427) loss 4.6008 (3.6954) grad_norm 2.3695 (1.9580) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][320/1251] eta 0:03:48 lr 0.000990 wd 0.0500 time 0.2423 (0.2452) data time 0.0009 (0.0023) model time 0.2414 (0.2426) loss 4.2794 (3.6914) grad_norm 1.9166 (1.9656) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:15 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [38/300][330/1251] eta 0:03:45 lr 0.000990 wd 0.0500 time 0.2410 (0.2451) data time 0.0007 (0.0023) model time 0.2403 (0.2426) loss 3.4556 (3.6848) grad_norm 1.5781 (1.9635) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][340/1251] eta 0:03:43 lr 0.000990 wd 0.0500 time 0.2373 (0.2450) data time 0.0010 (0.0023) model time 0.2363 (0.2425) loss 3.6931 (3.6911) grad_norm 2.2227 (1.9575) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][350/1251] eta 0:03:40 lr 0.000990 wd 0.0500 time 0.2442 (0.2449) data time 0.0007 (0.0022) model time 0.2435 (0.2424) loss 4.9083 (3.6991) grad_norm 1.3287 (1.9557) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][360/1251] eta 0:03:38 lr 0.000990 wd 0.0500 time 0.2421 (0.2448) data time 0.0008 (0.0022) model time 0.2413 (0.2423) loss 3.3099 (3.6959) grad_norm 1.5717 (1.9491) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][370/1251] eta 0:03:35 lr 0.000990 wd 0.0500 time 0.2444 (0.2447) data time 0.0009 (0.0021) model time 0.2436 (0.2423) loss 4.2664 (3.7069) grad_norm 2.2254 (1.9422) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][380/1251] eta 0:03:33 lr 0.000990 wd 0.0500 time 0.2390 (0.2446) data time 0.0009 (0.0021) model time 0.2380 (0.2422) loss 3.7627 (3.7071) grad_norm 1.7538 (1.9403) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][390/1251] eta 0:03:30 lr 0.000990 wd 0.0500 time 0.2434 (0.2445) data time 0.0009 (0.0021) model time 0.2425 (0.2421) loss 3.9734 (3.7128) 
grad_norm 1.9781 (1.9387) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][400/1251] eta 0:03:27 lr 0.000990 wd 0.0500 time 0.2405 (0.2444) data time 0.0009 (0.0021) model time 0.2395 (0.2421) loss 3.9963 (3.7183) grad_norm 1.6934 (1.9385) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][410/1251] eta 0:03:25 lr 0.000990 wd 0.0500 time 0.2512 (0.2444) data time 0.0010 (0.0020) model time 0.2503 (0.2421) loss 4.3059 (3.7234) grad_norm 1.7445 (1.9357) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][420/1251] eta 0:03:23 lr 0.000990 wd 0.0500 time 0.2415 (0.2443) data time 0.0010 (0.0020) model time 0.2405 (0.2421) loss 3.1733 (3.7213) grad_norm 2.7569 (1.9370) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][430/1251] eta 0:03:20 lr 0.000990 wd 0.0500 time 0.2350 (0.2443) data time 0.0010 (0.0020) model time 0.2340 (0.2420) loss 3.7973 (3.7227) grad_norm 1.3978 (1.9420) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][440/1251] eta 0:03:18 lr 0.000990 wd 0.0500 time 0.2487 (0.2443) data time 0.0010 (0.0020) model time 0.2477 (0.2420) loss 4.0636 (3.7187) grad_norm 1.8569 (1.9495) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][450/1251] eta 0:03:15 lr 0.000990 wd 0.0500 time 0.2387 (0.2442) data time 0.0007 (0.0019) model time 0.2380 (0.2420) loss 4.0683 (3.7165) grad_norm 1.4600 (1.9464) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][460/1251] eta 0:03:13 lr 
0.000990 wd 0.0500 time 0.2428 (0.2441) data time 0.0007 (0.0019) model time 0.2421 (0.2420) loss 4.4416 (3.7208) grad_norm 1.8598 (1.9502) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][470/1251] eta 0:03:10 lr 0.000990 wd 0.0500 time 0.2391 (0.2441) data time 0.0011 (0.0019) model time 0.2380 (0.2419) loss 3.4264 (3.7240) grad_norm 1.7164 (1.9517) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][480/1251] eta 0:03:08 lr 0.000990 wd 0.0500 time 0.2329 (0.2440) data time 0.0012 (0.0019) model time 0.2318 (0.2419) loss 3.5385 (3.7257) grad_norm 1.3506 (1.9473) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][490/1251] eta 0:03:05 lr 0.000989 wd 0.0500 time 0.2422 (0.2440) data time 0.0010 (0.0019) model time 0.2413 (0.2418) loss 2.9746 (3.7240) grad_norm 2.0444 (1.9473) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][500/1251] eta 0:03:03 lr 0.000989 wd 0.0500 time 0.2395 (0.2439) data time 0.0009 (0.0018) model time 0.2386 (0.2418) loss 3.6406 (3.7221) grad_norm 1.3797 (1.9413) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][510/1251] eta 0:03:00 lr 0.000989 wd 0.0500 time 0.2368 (0.2439) data time 0.0010 (0.0018) model time 0.2358 (0.2419) loss 4.3626 (3.7245) grad_norm 2.9990 (1.9541) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][520/1251] eta 0:02:58 lr 0.000989 wd 0.0500 time 0.2477 (0.2439) data time 0.0009 (0.0018) model time 0.2468 (0.2419) loss 4.1454 (3.7209) grad_norm 3.2234 (1.9568) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 05:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][530/1251] eta 0:02:55 lr 0.000989 wd 0.0500 time 0.2349 (0.2438) data time 0.0008 (0.0018) model time 0.2341 (0.2418) loss 3.2879 (3.7197) grad_norm 1.6296 (1.9505) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][540/1251] eta 0:02:53 lr 0.000989 wd 0.0500 time 0.2438 (0.2438) data time 0.0010 (0.0018) model time 0.2429 (0.2418) loss 3.3161 (3.7161) grad_norm 1.5704 (1.9486) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][550/1251] eta 0:02:50 lr 0.000989 wd 0.0500 time 0.2463 (0.2438) data time 0.0009 (0.0018) model time 0.2454 (0.2418) loss 3.7881 (3.7161) grad_norm 1.5974 (1.9441) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][560/1251] eta 0:02:48 lr 0.000989 wd 0.0500 time 0.2381 (0.2438) data time 0.0008 (0.0018) model time 0.2372 (0.2418) loss 2.7843 (3.7194) grad_norm 1.5506 (1.9408) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][570/1251] eta 0:02:45 lr 0.000989 wd 0.0500 time 0.2391 (0.2437) data time 0.0010 (0.0017) model time 0.2381 (0.2418) loss 3.3193 (3.7166) grad_norm 2.0427 (1.9384) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][580/1251] eta 0:02:43 lr 0.000989 wd 0.0500 time 0.2403 (0.2437) data time 0.0010 (0.0017) model time 0.2393 (0.2417) loss 3.9884 (3.7107) grad_norm 2.1593 (1.9361) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][590/1251] eta 0:02:41 lr 0.000989 wd 0.0500 time 0.2349 (0.2440) data time 0.0010 (0.0017) model time 
0.2339 (0.2421) loss 2.9845 (3.7099) grad_norm 1.5684 (1.9344) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][600/1251] eta 0:02:39 lr 0.000989 wd 0.0500 time 0.2444 (0.2444) data time 0.0011 (0.0017) model time 0.2433 (0.2425) loss 2.5804 (3.7085) grad_norm 1.6986 (1.9335) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][610/1251] eta 0:02:36 lr 0.000989 wd 0.0500 time 0.2395 (0.2447) data time 0.0010 (0.0017) model time 0.2384 (0.2429) loss 3.5938 (3.7048) grad_norm 2.8977 (1.9315) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][620/1251] eta 0:02:34 lr 0.000989 wd 0.0500 time 0.2461 (0.2447) data time 0.0007 (0.0017) model time 0.2454 (0.2429) loss 4.1198 (3.7074) grad_norm 1.7054 (1.9324) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][630/1251] eta 0:02:31 lr 0.000989 wd 0.0500 time 0.2368 (0.2447) data time 0.0011 (0.0017) model time 0.2357 (0.2429) loss 4.0384 (3.7089) grad_norm 1.5195 (1.9272) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][640/1251] eta 0:02:29 lr 0.000989 wd 0.0500 time 0.2409 (0.2446) data time 0.0008 (0.0017) model time 0.2401 (0.2429) loss 3.3680 (3.7048) grad_norm 1.5135 (1.9251) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][650/1251] eta 0:02:26 lr 0.000989 wd 0.0500 time 0.2427 (0.2446) data time 0.0009 (0.0017) model time 0.2417 (0.2428) loss 4.0770 (3.7074) grad_norm 1.4723 (1.9230) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[38/300][660/1251] eta 0:02:24 lr 0.000989 wd 0.0500 time 0.2450 (0.2445) data time 0.0010 (0.0016) model time 0.2440 (0.2427) loss 3.8021 (3.7126) grad_norm 1.2516 (1.9212) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][670/1251] eta 0:02:22 lr 0.000989 wd 0.0500 time 0.2444 (0.2445) data time 0.0010 (0.0016) model time 0.2434 (0.2427) loss 3.1404 (3.7074) grad_norm 1.4752 (1.9203) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][680/1251] eta 0:02:19 lr 0.000989 wd 0.0500 time 0.2404 (0.2445) data time 0.0009 (0.0016) model time 0.2395 (0.2427) loss 4.3609 (3.7126) grad_norm 1.4793 (1.9167) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][690/1251] eta 0:02:17 lr 0.000989 wd 0.0500 time 0.2371 (0.2444) data time 0.0010 (0.0016) model time 0.2362 (0.2427) loss 4.4638 (3.7098) grad_norm 1.9126 (1.9248) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][700/1251] eta 0:02:14 lr 0.000989 wd 0.0500 time 0.2486 (0.2444) data time 0.0009 (0.0016) model time 0.2477 (0.2427) loss 2.7869 (3.7071) grad_norm 1.5363 (1.9195) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][710/1251] eta 0:02:12 lr 0.000989 wd 0.0500 time 0.2383 (0.2444) data time 0.0009 (0.0016) model time 0.2374 (0.2426) loss 4.5546 (3.7115) grad_norm 2.2457 (1.9209) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][720/1251] eta 0:02:09 lr 0.000989 wd 0.0500 time 0.2399 (0.2443) data time 0.0010 (0.0016) model time 0.2389 (0.2426) loss 4.2523 (3.7132) grad_norm 1.8176 (1.9210) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][730/1251] eta 0:02:07 lr 0.000989 wd 0.0500 time 0.2361 (0.2443) data time 0.0011 (0.0016) model time 0.2350 (0.2425) loss 4.2519 (3.7125) grad_norm 1.9939 (1.9184) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][740/1251] eta 0:02:04 lr 0.000989 wd 0.0500 time 0.2427 (0.2442) data time 0.0010 (0.0016) model time 0.2418 (0.2425) loss 3.6182 (3.7122) grad_norm 1.6370 (1.9203) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][750/1251] eta 0:02:02 lr 0.000989 wd 0.0500 time 0.2439 (0.2442) data time 0.0009 (0.0016) model time 0.2430 (0.2425) loss 3.3785 (3.7146) grad_norm 1.4619 (1.9192) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][760/1251] eta 0:01:59 lr 0.000989 wd 0.0500 time 0.2383 (0.2441) data time 0.0007 (0.0016) model time 0.2375 (0.2424) loss 4.2871 (3.7160) grad_norm 1.6487 (1.9159) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][770/1251] eta 0:01:57 lr 0.000989 wd 0.0500 time 0.2379 (0.2441) data time 0.0008 (0.0016) model time 0.2371 (0.2424) loss 4.7106 (3.7167) grad_norm 2.0574 (1.9168) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][780/1251] eta 0:01:54 lr 0.000989 wd 0.0500 time 0.2398 (0.2440) data time 0.0008 (0.0016) model time 0.2390 (0.2424) loss 3.8617 (3.7159) grad_norm 1.5855 (1.9136) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][790/1251] eta 0:01:52 lr 0.000989 wd 0.0500 time 0.2390 (0.2440) 
data time 0.0013 (0.0015) model time 0.2377 (0.2423) loss 3.8429 (3.7167) grad_norm 3.7874 (1.9178) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][800/1251] eta 0:01:50 lr 0.000989 wd 0.0500 time 0.2345 (0.2440) data time 0.0009 (0.0015) model time 0.2336 (0.2423) loss 4.2242 (3.7202) grad_norm 1.3443 (1.9171) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][810/1251] eta 0:01:47 lr 0.000989 wd 0.0500 time 0.2388 (0.2439) data time 0.0007 (0.0015) model time 0.2380 (0.2423) loss 4.1464 (3.7177) grad_norm 2.0256 (1.9154) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][820/1251] eta 0:01:45 lr 0.000989 wd 0.0500 time 0.2374 (0.2439) data time 0.0007 (0.0015) model time 0.2367 (0.2422) loss 4.5478 (3.7189) grad_norm 1.5280 (1.9206) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][830/1251] eta 0:01:42 lr 0.000989 wd 0.0500 time 0.2403 (0.2438) data time 0.0010 (0.0015) model time 0.2393 (0.2422) loss 4.5693 (3.7166) grad_norm 1.6728 (1.9190) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][840/1251] eta 0:01:40 lr 0.000989 wd 0.0500 time 0.2393 (0.2438) data time 0.0007 (0.0015) model time 0.2386 (0.2422) loss 3.3384 (3.7171) grad_norm 2.7260 (1.9191) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][850/1251] eta 0:01:37 lr 0.000989 wd 0.0500 time 0.2433 (0.2438) data time 0.0011 (0.0015) model time 0.2422 (0.2422) loss 3.0631 (3.7151) grad_norm 2.3114 (1.9184) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [38/300][860/1251] eta 0:01:35 lr 0.000989 wd 0.0500 time 0.2402 (0.2438) data time 0.0011 (0.0015) model time 0.2391 (0.2422) loss 4.1254 (3.7162) grad_norm 2.4495 (1.9186) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][870/1251] eta 0:01:32 lr 0.000989 wd 0.0500 time 0.2387 (0.2438) data time 0.0007 (0.0015) model time 0.2380 (0.2422) loss 4.2554 (3.7153) grad_norm 1.5659 (1.9198) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][880/1251] eta 0:01:30 lr 0.000989 wd 0.0500 time 0.2388 (0.2438) data time 0.0007 (0.0015) model time 0.2380 (0.2422) loss 4.5521 (3.7175) grad_norm 2.6382 (1.9218) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][890/1251] eta 0:01:27 lr 0.000989 wd 0.0500 time 0.2368 (0.2438) data time 0.0010 (0.0015) model time 0.2358 (0.2421) loss 4.1846 (3.7207) grad_norm 2.1584 (1.9255) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][900/1251] eta 0:01:25 lr 0.000989 wd 0.0500 time 0.2438 (0.2437) data time 0.0009 (0.0015) model time 0.2429 (0.2421) loss 4.5020 (3.7209) grad_norm 2.2229 (1.9294) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][910/1251] eta 0:01:23 lr 0.000989 wd 0.0500 time 0.2463 (0.2437) data time 0.0011 (0.0015) model time 0.2452 (0.2421) loss 2.5229 (3.7187) grad_norm 1.6072 (1.9290) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][920/1251] eta 0:01:20 lr 0.000989 wd 0.0500 time 0.2456 (0.2437) data time 0.0009 (0.0015) model time 0.2447 (0.2421) loss 4.5176 (3.7174) 
grad_norm 1.4347 (1.9261) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][930/1251] eta 0:01:18 lr 0.000989 wd 0.0500 time 0.2361 (0.2437) data time 0.0009 (0.0015) model time 0.2351 (0.2421) loss 4.5185 (3.7207) grad_norm 2.0068 (1.9255) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][940/1251] eta 0:01:15 lr 0.000989 wd 0.0500 time 0.2392 (0.2437) data time 0.0011 (0.0015) model time 0.2381 (0.2421) loss 4.1651 (3.7216) grad_norm 1.9884 (1.9254) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][950/1251] eta 0:01:13 lr 0.000989 wd 0.0500 time 0.2433 (0.2437) data time 0.0008 (0.0015) model time 0.2426 (0.2421) loss 2.5784 (3.7241) grad_norm 1.9043 (1.9257) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][960/1251] eta 0:01:10 lr 0.000989 wd 0.0500 time 0.2393 (0.2437) data time 0.0009 (0.0015) model time 0.2384 (0.2421) loss 3.5524 (3.7244) grad_norm 1.5104 (1.9260) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][970/1251] eta 0:01:08 lr 0.000989 wd 0.0500 time 0.2389 (0.2437) data time 0.0009 (0.0015) model time 0.2380 (0.2421) loss 3.7569 (3.7245) grad_norm 1.4248 (1.9216) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][980/1251] eta 0:01:06 lr 0.000989 wd 0.0500 time 0.2359 (0.2439) data time 0.0008 (0.0014) model time 0.2352 (0.2423) loss 4.1363 (3.7243) grad_norm 1.5122 (1.9232) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][990/1251] eta 0:01:03 lr 
0.000989 wd 0.0500 time 0.2423 (0.2438) data time 0.0007 (0.0014) model time 0.2416 (0.2423) loss 2.5853 (3.7246) grad_norm 1.6673 (1.9205) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1000/1251] eta 0:01:01 lr 0.000989 wd 0.0500 time 0.2465 (0.2438) data time 0.0009 (0.0014) model time 0.2455 (0.2423) loss 3.6672 (3.7261) grad_norm 1.2017 (1.9217) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1010/1251] eta 0:00:58 lr 0.000989 wd 0.0500 time 0.2440 (0.2438) data time 0.0009 (0.0014) model time 0.2431 (0.2422) loss 3.7256 (3.7242) grad_norm 1.9665 (1.9219) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1020/1251] eta 0:00:56 lr 0.000989 wd 0.0500 time 0.2355 (0.2437) data time 0.0007 (0.0014) model time 0.2347 (0.2422) loss 2.5058 (3.7223) grad_norm 1.8829 (1.9217) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1030/1251] eta 0:00:53 lr 0.000989 wd 0.0500 time 0.2346 (0.2437) data time 0.0007 (0.0014) model time 0.2340 (0.2422) loss 3.4852 (3.7218) grad_norm 1.9848 (1.9200) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1040/1251] eta 0:00:51 lr 0.000989 wd 0.0500 time 0.2419 (0.2437) data time 0.0008 (0.0014) model time 0.2411 (0.2422) loss 3.4457 (3.7215) grad_norm 2.2117 (1.9193) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1050/1251] eta 0:00:48 lr 0.000989 wd 0.0500 time 0.2413 (0.2437) data time 0.0010 (0.0014) model time 0.2403 (0.2422) loss 4.3642 (3.7220) grad_norm 1.5878 (1.9200) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 05:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1060/1251] eta 0:00:46 lr 0.000989 wd 0.0500 time 0.2488 (0.2437) data time 0.0009 (0.0014) model time 0.2480 (0.2421) loss 3.6533 (3.7243) grad_norm 1.9424 (1.9211) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1070/1251] eta 0:00:44 lr 0.000989 wd 0.0500 time 0.2404 (0.2437) data time 0.0007 (0.0014) model time 0.2398 (0.2422) loss 2.9645 (3.7247) grad_norm 2.3680 (1.9268) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1080/1251] eta 0:00:41 lr 0.000989 wd 0.0500 time 0.2414 (0.2436) data time 0.0007 (0.0014) model time 0.2407 (0.2421) loss 3.4878 (3.7235) grad_norm 2.0563 (1.9266) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1090/1251] eta 0:00:39 lr 0.000989 wd 0.0500 time 0.2438 (0.2436) data time 0.0008 (0.0014) model time 0.2430 (0.2421) loss 4.7758 (3.7248) grad_norm 1.8363 (1.9262) loss_scale 32768.0000 (16504.1393) mem 7379MB [2024-08-26 05:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1100/1251] eta 0:00:36 lr 0.000989 wd 0.0500 time 0.2572 (0.2436) data time 0.0010 (0.0014) model time 0.2562 (0.2421) loss 3.7878 (3.7255) grad_norm 1.6590 (1.9255) loss_scale 32768.0000 (16651.8583) mem 7379MB [2024-08-26 05:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1110/1251] eta 0:00:34 lr 0.000989 wd 0.0500 time 0.2479 (0.2436) data time 0.0008 (0.0014) model time 0.2471 (0.2421) loss 4.7688 (3.7256) grad_norm 1.5389 (inf) loss_scale 16384.0000 (16664.1944) mem 7379MB [2024-08-26 05:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1120/1251] eta 0:00:31 lr 0.000989 wd 0.0500 time 0.4065 (0.2438) data time 0.0013 (0.0014) model 
time 0.4052 (0.2423) loss 3.9615 (3.7244) grad_norm 1.7749 (inf) loss_scale 16384.0000 (16661.6949) mem 7379MB [2024-08-26 05:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1130/1251] eta 0:00:29 lr 0.000989 wd 0.0500 time 0.2393 (0.2437) data time 0.0013 (0.0014) model time 0.2380 (0.2422) loss 3.7503 (3.7241) grad_norm 2.9325 (inf) loss_scale 16384.0000 (16659.2396) mem 7379MB [2024-08-26 05:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1140/1251] eta 0:00:27 lr 0.000989 wd 0.0500 time 0.2433 (0.2437) data time 0.0010 (0.0014) model time 0.2424 (0.2422) loss 3.7109 (3.7227) grad_norm 1.2284 (inf) loss_scale 16384.0000 (16656.8273) mem 7379MB [2024-08-26 05:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1150/1251] eta 0:00:24 lr 0.000989 wd 0.0500 time 0.2403 (0.2437) data time 0.0010 (0.0014) model time 0.2393 (0.2422) loss 4.4741 (3.7219) grad_norm 2.6173 (inf) loss_scale 16384.0000 (16654.4570) mem 7379MB [2024-08-26 05:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1160/1251] eta 0:00:22 lr 0.000989 wd 0.0500 time 0.2446 (0.2437) data time 0.0007 (0.0014) model time 0.2439 (0.2422) loss 5.0463 (3.7215) grad_norm 2.9445 (inf) loss_scale 16384.0000 (16652.1275) mem 7379MB [2024-08-26 05:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1170/1251] eta 0:00:19 lr 0.000989 wd 0.0500 time 0.2405 (0.2437) data time 0.0009 (0.0014) model time 0.2396 (0.2422) loss 2.6050 (3.7197) grad_norm 4.2631 (inf) loss_scale 16384.0000 (16649.8377) mem 7379MB [2024-08-26 05:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1180/1251] eta 0:00:17 lr 0.000989 wd 0.0500 time 0.2376 (0.2437) data time 0.0011 (0.0014) model time 0.2365 (0.2422) loss 2.2804 (3.7182) grad_norm 2.0588 (inf) loss_scale 16384.0000 (16647.5868) mem 7379MB [2024-08-26 05:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[38/300][1190/1251] eta 0:00:14 lr 0.000989 wd 0.0500 time 0.2452 (0.2437) data time 0.0012 (0.0014) model time 0.2440 (0.2422) loss 3.8682 (3.7198) grad_norm 1.7812 (inf) loss_scale 16384.0000 (16645.3736) mem 7379MB [2024-08-26 05:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1200/1251] eta 0:00:12 lr 0.000989 wd 0.0500 time 0.2447 (0.2437) data time 0.0012 (0.0014) model time 0.2436 (0.2422) loss 2.9359 (3.7213) grad_norm 1.4495 (inf) loss_scale 16384.0000 (16643.1973) mem 7379MB [2024-08-26 05:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1210/1251] eta 0:00:09 lr 0.000989 wd 0.0500 time 0.2420 (0.2437) data time 0.0010 (0.0014) model time 0.2411 (0.2422) loss 3.3580 (3.7203) grad_norm 4.7480 (inf) loss_scale 16384.0000 (16641.0570) mem 7379MB [2024-08-26 05:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1220/1251] eta 0:00:07 lr 0.000989 wd 0.0500 time 0.2368 (0.2437) data time 0.0009 (0.0014) model time 0.2359 (0.2422) loss 4.4424 (3.7206) grad_norm 1.6212 (inf) loss_scale 16384.0000 (16638.9517) mem 7379MB [2024-08-26 05:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1230/1251] eta 0:00:05 lr 0.000989 wd 0.0500 time 0.2377 (0.2438) data time 0.0008 (0.0014) model time 0.2369 (0.2424) loss 2.6556 (3.7181) grad_norm 1.6172 (inf) loss_scale 16384.0000 (16636.8806) mem 7379MB [2024-08-26 05:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1240/1251] eta 0:00:02 lr 0.000989 wd 0.0500 time 0.2270 (0.2437) data time 0.0005 (0.0014) model time 0.2265 (0.2423) loss 3.3957 (3.7149) grad_norm 2.3680 (inf) loss_scale 16384.0000 (16634.8429) mem 7379MB [2024-08-26 05:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [38/300][1250/1251] eta 0:00:00 lr 0.000989 wd 0.0500 time 0.2257 (0.2436) data time 0.0007 (0.0014) model time 0.2250 (0.2421) loss 3.6963 (3.7171) grad_norm 1.7563 (inf) loss_scale 16384.0000 
(16632.8377) mem 7379MB
[2024-08-26 05:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 38 training takes 0:05:04
[2024-08-26 05:31:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 05:31:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 05:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.456 (0.456) Loss 0.6196 (0.6196) Acc@1 88.281 (88.281) Acc@5 96.777 (96.777) Mem 7379MB
[2024-08-26 05:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.112) Loss 1.0596 (0.9581) Acc@1 77.637 (79.048) Acc@5 94.238 (95.037) Mem 7379MB
[2024-08-26 05:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.096) Loss 1.3623 (0.9774) Acc@1 69.531 (78.237) Acc@5 89.258 (95.043) Mem 7379MB
[2024-08-26 05:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 1.6572 (1.1222) Acc@1 63.281 (75.337) Acc@5 84.570 (93.066) Mem 7379MB
[2024-08-26 05:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.5977 (1.2014) Acc@1 63.770 (73.580) Acc@5 87.305 (92.061) Mem 7379MB
[2024-08-26 05:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.160 Acc@5 91.930
[2024-08-26 05:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.2%
[2024-08-26 05:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 73.16%
[2024-08-26 05:32:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:32:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.409 (0.409) Loss 0.4814 (0.4814) Acc@1 89.551 (89.551) Acc@5 97.559 (97.559) Mem 7379MB
[2024-08-26 05:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.109) Loss 0.8203 (0.8023) Acc@1 81.152 (80.531) Acc@5 95.312 (95.508) Mem 7379MB
[2024-08-26 05:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.095) Loss 1.1914 (0.8165) Acc@1 71.777 (79.948) Acc@5 90.625 (95.489) Mem 7379MB
[2024-08-26 05:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.090) Loss 1.4717 (0.9473) Acc@1 63.574 (76.969) Acc@5 86.816 (93.744) Mem 7379MB
[2024-08-26 05:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.4180 (1.0249) Acc@1 64.746 (75.169) Acc@5 87.598 (92.783) Mem 7379MB
[2024-08-26 05:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.834 Acc@5 92.706
[2024-08-26 05:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 74.8%
[2024-08-26 05:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 74.83%
[2024-08-26 05:32:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:32:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][0/1251] eta 0:13:07 lr 0.000989 wd 0.0500 time 0.6293 (0.6293) data time 0.4004 (0.4004) model time 0.0000 (0.0000) loss 3.1980 (3.1980) grad_norm 2.0075 (2.0075) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][10/1251] eta 0:05:41 lr 0.000989 wd 0.0500 time 0.2415 (0.2752) data time 0.0008 (0.0373) model time 0.0000 (0.0000) loss 3.5864 (3.7207) grad_norm 2.1761 (1.9503) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][20/1251] eta 0:05:18 lr 0.000989 wd 0.0500 time 0.2378 (0.2591) data time 0.0010 (0.0200) model time 0.0000 (0.0000) loss 3.6011 (3.7889) grad_norm 2.4826 (1.9644) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][30/1251] eta 0:05:09 lr 0.000989 wd 0.0500 time 0.2414 (0.2537) data time 0.0009 (0.0139) model time 0.0000 (0.0000) loss 3.9477 (3.7292) grad_norm 1.6315 (1.8754) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][40/1251] eta 0:05:04 lr 0.000989 wd 0.0500 time 0.2407 (0.2514) data time 0.0010 (0.0109) model time 0.0000 (0.0000) loss 4.2400 (3.6942) grad_norm 1.9184 (1.8654) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][50/1251] eta 0:04:59 lr 0.000989 wd 0.0500 time 0.2535 (0.2496) data time 0.0007 (0.0092) model time 0.0000 (0.0000) loss 2.8519 (3.6966) grad_norm 1.6183 (1.8480) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][60/1251] eta 0:04:55 lr 0.000989 wd 0.0500 time 0.2312 (0.2482) data time 0.0009 (0.0078) model time 0.2303 
(0.2401) loss 2.6615 (3.6936) grad_norm 2.0718 (1.8784) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][70/1251] eta 0:04:51 lr 0.000989 wd 0.0500 time 0.2416 (0.2469) data time 0.0010 (0.0069) model time 0.2406 (0.2390) loss 4.2709 (3.7130) grad_norm 3.0990 (1.8968) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][80/1251] eta 0:04:48 lr 0.000989 wd 0.0500 time 0.2421 (0.2461) data time 0.0009 (0.0061) model time 0.2412 (0.2390) loss 3.3884 (3.7440) grad_norm 1.9778 (1.9507) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][90/1251] eta 0:04:45 lr 0.000989 wd 0.0500 time 0.2452 (0.2456) data time 0.0011 (0.0056) model time 0.2441 (0.2395) loss 3.2666 (3.7659) grad_norm 1.8999 (1.9369) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][100/1251] eta 0:04:42 lr 0.000989 wd 0.0500 time 0.2412 (0.2451) data time 0.0007 (0.0051) model time 0.2405 (0.2395) loss 3.9452 (3.7382) grad_norm 1.4324 (1.9344) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][110/1251] eta 0:04:39 lr 0.000989 wd 0.0500 time 0.2629 (0.2451) data time 0.0011 (0.0048) model time 0.2618 (0.2403) loss 3.7278 (3.7404) grad_norm 1.8645 (1.9410) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][120/1251] eta 0:04:36 lr 0.000989 wd 0.0500 time 0.2434 (0.2449) data time 0.0007 (0.0045) model time 0.2426 (0.2403) loss 4.4953 (3.7436) grad_norm 1.3528 (1.9285) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[39/300][130/1251] eta 0:04:34 lr 0.000989 wd 0.0500 time 0.2414 (0.2448) data time 0.0012 (0.0042) model time 0.2402 (0.2406) loss 4.2932 (3.7454) grad_norm 1.7148 (1.9336) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][140/1251] eta 0:04:31 lr 0.000989 wd 0.0500 time 0.2460 (0.2447) data time 0.0008 (0.0040) model time 0.2452 (0.2408) loss 4.1303 (3.7422) grad_norm 1.7810 (1.9519) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][150/1251] eta 0:04:29 lr 0.000989 wd 0.0500 time 0.2325 (0.2444) data time 0.0011 (0.0038) model time 0.2314 (0.2406) loss 3.9705 (3.7327) grad_norm 2.2839 (1.9372) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][160/1251] eta 0:04:29 lr 0.000989 wd 0.0500 time 0.2335 (0.2470) data time 0.0009 (0.0037) model time 0.2326 (0.2446) loss 3.9970 (3.7362) grad_norm 2.0999 (1.9418) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][170/1251] eta 0:04:26 lr 0.000989 wd 0.0500 time 0.2387 (0.2466) data time 0.0007 (0.0035) model time 0.2380 (0.2443) loss 3.9932 (3.7400) grad_norm 2.5701 (1.9395) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][180/1251] eta 0:04:23 lr 0.000989 wd 0.0500 time 0.2469 (0.2464) data time 0.0008 (0.0034) model time 0.2462 (0.2440) loss 4.5521 (3.7346) grad_norm 2.0295 (1.9488) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][190/1251] eta 0:04:21 lr 0.000989 wd 0.0500 time 0.2344 (0.2463) data time 0.0009 (0.0033) model time 0.2335 (0.2439) loss 4.4432 (3.7377) grad_norm 1.7754 (1.9481) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][200/1251] eta 0:04:18 lr 0.000989 wd 0.0500 time 0.2488 (0.2461) data time 0.0010 (0.0032) model time 0.2478 (0.2438) loss 4.1511 (3.7450) grad_norm 2.0618 (1.9540) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][210/1251] eta 0:04:15 lr 0.000989 wd 0.0500 time 0.2369 (0.2459) data time 0.0010 (0.0031) model time 0.2359 (0.2436) loss 3.9120 (3.7384) grad_norm 2.1819 (1.9467) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][220/1251] eta 0:04:13 lr 0.000989 wd 0.0500 time 0.2418 (0.2457) data time 0.0007 (0.0030) model time 0.2411 (0.2434) loss 4.2691 (3.7520) grad_norm 2.0284 (1.9497) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][230/1251] eta 0:04:10 lr 0.000989 wd 0.0500 time 0.2548 (0.2456) data time 0.0007 (0.0029) model time 0.2540 (0.2433) loss 3.1959 (3.7456) grad_norm 3.0603 (1.9538) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][240/1251] eta 0:04:08 lr 0.000989 wd 0.0500 time 0.2387 (0.2454) data time 0.0013 (0.0028) model time 0.2374 (0.2432) loss 3.8209 (3.7461) grad_norm 2.2269 (1.9436) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][250/1251] eta 0:04:05 lr 0.000989 wd 0.0500 time 0.2475 (0.2453) data time 0.0011 (0.0027) model time 0.2464 (0.2431) loss 4.3862 (3.7404) grad_norm 2.1843 (1.9348) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][260/1251] eta 0:04:02 lr 0.000989 wd 0.0500 time 0.2388 (0.2452) 
data time 0.0009 (0.0027) model time 0.2379 (0.2430) loss 2.8901 (3.7448) grad_norm 1.7753 (1.9262) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][270/1251] eta 0:04:00 lr 0.000989 wd 0.0500 time 0.2465 (0.2452) data time 0.0011 (0.0026) model time 0.2454 (0.2431) loss 3.9067 (3.7526) grad_norm 2.0878 (1.9261) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][280/1251] eta 0:03:58 lr 0.000989 wd 0.0500 time 0.2474 (0.2451) data time 0.0008 (0.0026) model time 0.2466 (0.2430) loss 4.3514 (3.7682) grad_norm 1.4412 (1.9232) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][290/1251] eta 0:03:55 lr 0.000989 wd 0.0500 time 0.2151 (0.2456) data time 0.0011 (0.0025) model time 0.2140 (0.2436) loss 3.8146 (3.7703) grad_norm 1.8168 (1.9245) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][300/1251] eta 0:03:53 lr 0.000989 wd 0.0500 time 0.2502 (0.2455) data time 0.0009 (0.0025) model time 0.2493 (0.2435) loss 3.8405 (3.7703) grad_norm 1.3424 (1.9225) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][310/1251] eta 0:03:50 lr 0.000989 wd 0.0500 time 0.2302 (0.2454) data time 0.0009 (0.0024) model time 0.2293 (0.2435) loss 3.6296 (3.7702) grad_norm 2.0191 (1.9317) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][320/1251] eta 0:03:48 lr 0.000988 wd 0.0500 time 0.2513 (0.2453) data time 0.0008 (0.0024) model time 0.2505 (0.2434) loss 3.2587 (3.7636) grad_norm 2.4141 (1.9408) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [39/300][330/1251] eta 0:03:45 lr 0.000988 wd 0.0500 time 0.2315 (0.2452) data time 0.0011 (0.0023) model time 0.2304 (0.2433) loss 3.8690 (3.7507) grad_norm 1.8173 (1.9429) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][340/1251] eta 0:03:43 lr 0.000988 wd 0.0500 time 0.2420 (0.2451) data time 0.0009 (0.0023) model time 0.2411 (0.2432) loss 3.8055 (3.7500) grad_norm 2.0974 (1.9430) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][350/1251] eta 0:03:40 lr 0.000988 wd 0.0500 time 0.2391 (0.2450) data time 0.0010 (0.0023) model time 0.2380 (0.2431) loss 4.2329 (3.7472) grad_norm 1.2478 (1.9348) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][360/1251] eta 0:03:38 lr 0.000988 wd 0.0500 time 0.2343 (0.2450) data time 0.0011 (0.0023) model time 0.2332 (0.2431) loss 3.7717 (3.7515) grad_norm 1.7756 (1.9375) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][370/1251] eta 0:03:35 lr 0.000988 wd 0.0500 time 0.2540 (0.2450) data time 0.0007 (0.0022) model time 0.2534 (0.2431) loss 4.2754 (3.7444) grad_norm 2.3726 (1.9395) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][380/1251] eta 0:03:33 lr 0.000988 wd 0.0500 time 0.2454 (0.2449) data time 0.0007 (0.0022) model time 0.2447 (0.2431) loss 4.4800 (3.7421) grad_norm 1.4443 (1.9365) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][390/1251] eta 0:03:30 lr 0.000988 wd 0.0500 time 0.2390 (0.2449) data time 0.0010 (0.0022) model time 0.2379 (0.2430) loss 4.0970 (3.7469) 
grad_norm 1.5664 (1.9326) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][400/1251] eta 0:03:28 lr 0.000988 wd 0.0500 time 0.2456 (0.2449) data time 0.0008 (0.0022) model time 0.2448 (0.2430) loss 4.1214 (3.7490) grad_norm 1.5122 (1.9230) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][410/1251] eta 0:03:25 lr 0.000988 wd 0.0500 time 0.2388 (0.2448) data time 0.0009 (0.0021) model time 0.2379 (0.2430) loss 3.9120 (3.7525) grad_norm 2.0975 (1.9223) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][420/1251] eta 0:03:23 lr 0.000988 wd 0.0500 time 0.2409 (0.2447) data time 0.0010 (0.0021) model time 0.2399 (0.2429) loss 4.0721 (3.7534) grad_norm 2.0160 (1.9247) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][430/1251] eta 0:03:20 lr 0.000988 wd 0.0500 time 0.2416 (0.2446) data time 0.0013 (0.0021) model time 0.2404 (0.2428) loss 3.6103 (3.7540) grad_norm 1.5553 (1.9320) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][440/1251] eta 0:03:18 lr 0.000988 wd 0.0500 time 0.2359 (0.2446) data time 0.0010 (0.0021) model time 0.2349 (0.2429) loss 3.0310 (3.7460) grad_norm 1.7176 (1.9373) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][450/1251] eta 0:03:16 lr 0.000988 wd 0.0500 time 0.2404 (0.2450) data time 0.0009 (0.0020) model time 0.2395 (0.2433) loss 2.5176 (3.7423) grad_norm 1.9236 (1.9384) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][460/1251] eta 0:03:13 lr 
0.000988 wd 0.0500 time 0.2375 (0.2449) data time 0.0008 (0.0020) model time 0.2367 (0.2432) loss 4.3118 (3.7436) grad_norm 1.3967 (1.9311) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][470/1251] eta 0:03:11 lr 0.000988 wd 0.0500 time 0.2377 (0.2448) data time 0.0009 (0.0020) model time 0.2368 (0.2431) loss 3.8057 (3.7370) grad_norm 1.3660 (1.9283) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][480/1251] eta 0:03:08 lr 0.000988 wd 0.0500 time 0.2497 (0.2448) data time 0.0009 (0.0020) model time 0.2488 (0.2431) loss 4.0050 (3.7378) grad_norm 1.8839 (1.9247) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][490/1251] eta 0:03:06 lr 0.000988 wd 0.0500 time 0.2441 (0.2447) data time 0.0009 (0.0020) model time 0.2432 (0.2430) loss 3.4776 (3.7386) grad_norm 1.8587 (1.9265) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][500/1251] eta 0:03:03 lr 0.000988 wd 0.0500 time 0.2411 (0.2450) data time 0.0007 (0.0019) model time 0.2403 (0.2433) loss 2.6329 (3.7425) grad_norm 4.8287 (1.9432) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][510/1251] eta 0:03:01 lr 0.000988 wd 0.0500 time 0.2448 (0.2450) data time 0.0008 (0.0019) model time 0.2440 (0.2433) loss 5.0852 (3.7479) grad_norm 2.6195 (1.9448) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][520/1251] eta 0:02:59 lr 0.000988 wd 0.0500 time 0.2413 (0.2449) data time 0.0007 (0.0019) model time 0.2406 (0.2432) loss 3.2282 (3.7513) grad_norm 1.8115 (1.9450) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 05:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][530/1251] eta 0:02:56 lr 0.000988 wd 0.0500 time 0.2457 (0.2448) data time 0.0009 (0.0019) model time 0.2448 (0.2432) loss 4.4223 (3.7439) grad_norm 1.5057 (1.9383) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][540/1251] eta 0:02:53 lr 0.000988 wd 0.0500 time 0.2391 (0.2447) data time 0.0007 (0.0019) model time 0.2384 (0.2431) loss 4.2487 (3.7474) grad_norm 1.6300 (1.9330) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][550/1251] eta 0:02:51 lr 0.000988 wd 0.0500 time 0.2431 (0.2447) data time 0.0010 (0.0019) model time 0.2421 (0.2430) loss 3.9553 (3.7474) grad_norm 2.0252 (1.9394) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][560/1251] eta 0:02:49 lr 0.000988 wd 0.0500 time 0.2401 (0.2447) data time 0.0009 (0.0019) model time 0.2392 (0.2431) loss 3.7014 (3.7434) grad_norm 3.1336 (1.9479) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][570/1251] eta 0:02:46 lr 0.000988 wd 0.0500 time 0.2418 (0.2446) data time 0.0011 (0.0018) model time 0.2406 (0.2430) loss 2.4985 (3.7423) grad_norm 1.7619 (1.9476) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][580/1251] eta 0:02:44 lr 0.000988 wd 0.0500 time 0.2424 (0.2446) data time 0.0009 (0.0018) model time 0.2415 (0.2430) loss 3.5626 (3.7409) grad_norm 2.8604 (1.9484) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][590/1251] eta 0:02:41 lr 0.000988 wd 0.0500 time 0.2394 (0.2445) data time 0.0008 (0.0018) model time 
0.2386 (0.2429) loss 4.6679 (3.7396) grad_norm 1.7967 (1.9472) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][600/1251] eta 0:02:39 lr 0.000988 wd 0.0500 time 0.2401 (0.2444) data time 0.0010 (0.0018) model time 0.2391 (0.2428) loss 3.9836 (3.7349) grad_norm 1.5006 (1.9462) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][610/1251] eta 0:02:36 lr 0.000988 wd 0.0500 time 0.2302 (0.2443) data time 0.0011 (0.0018) model time 0.2291 (0.2427) loss 3.9423 (3.7304) grad_norm 1.7556 (1.9433) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][620/1251] eta 0:02:34 lr 0.000988 wd 0.0500 time 0.2410 (0.2443) data time 0.0007 (0.0018) model time 0.2403 (0.2426) loss 4.3954 (3.7330) grad_norm 2.6939 (1.9433) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][630/1251] eta 0:02:31 lr 0.000988 wd 0.0500 time 0.2401 (0.2442) data time 0.0011 (0.0018) model time 0.2390 (0.2426) loss 3.8174 (3.7349) grad_norm 2.8680 (1.9504) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][640/1251] eta 0:02:29 lr 0.000988 wd 0.0500 time 0.2398 (0.2441) data time 0.0010 (0.0018) model time 0.2388 (0.2425) loss 4.1186 (3.7357) grad_norm 1.9452 (1.9511) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][650/1251] eta 0:02:26 lr 0.000988 wd 0.0500 time 0.2380 (0.2440) data time 0.0007 (0.0018) model time 0.2373 (0.2424) loss 2.9599 (3.7318) grad_norm 1.9753 (1.9500) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[39/300][660/1251] eta 0:02:24 lr 0.000988 wd 0.0500 time 0.2414 (0.2440) data time 0.0008 (0.0018) model time 0.2407 (0.2424) loss 2.9040 (3.7264) grad_norm 1.5219 (1.9447) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][670/1251] eta 0:02:21 lr 0.000988 wd 0.0500 time 0.2420 (0.2439) data time 0.0010 (0.0018) model time 0.2411 (0.2423) loss 3.7720 (3.7260) grad_norm 2.6212 (1.9426) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][680/1251] eta 0:02:19 lr 0.000988 wd 0.0500 time 0.2445 (0.2439) data time 0.0008 (0.0018) model time 0.2437 (0.2423) loss 3.9938 (3.7237) grad_norm 1.7497 (1.9406) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][690/1251] eta 0:02:16 lr 0.000988 wd 0.0500 time 0.2436 (0.2438) data time 0.0009 (0.0017) model time 0.2428 (0.2422) loss 2.7149 (3.7205) grad_norm 1.5630 (1.9448) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][700/1251] eta 0:02:14 lr 0.000988 wd 0.0500 time 0.2637 (0.2439) data time 0.0009 (0.0017) model time 0.2628 (0.2423) loss 3.8662 (3.7194) grad_norm 1.4518 (1.9433) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][710/1251] eta 0:02:11 lr 0.000988 wd 0.0500 time 0.2419 (0.2439) data time 0.0009 (0.0017) model time 0.2410 (0.2423) loss 4.1452 (3.7207) grad_norm 2.7884 (1.9491) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][720/1251] eta 0:02:09 lr 0.000988 wd 0.0500 time 0.2468 (0.2438) data time 0.0010 (0.0017) model time 0.2458 (0.2422) loss 3.2225 (3.7204) grad_norm 2.1733 (1.9499) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][730/1251] eta 0:02:07 lr 0.000988 wd 0.0500 time 0.2388 (0.2438) data time 0.0009 (0.0017) model time 0.2379 (0.2422) loss 3.6940 (3.7167) grad_norm 1.8681 (1.9460) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][740/1251] eta 0:02:04 lr 0.000988 wd 0.0500 time 0.2332 (0.2437) data time 0.0009 (0.0017) model time 0.2323 (0.2421) loss 4.5586 (3.7215) grad_norm 1.7847 (1.9422) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][750/1251] eta 0:02:02 lr 0.000988 wd 0.0500 time 0.2443 (0.2437) data time 0.0009 (0.0017) model time 0.2434 (0.2421) loss 3.9607 (3.7258) grad_norm 1.5889 (1.9422) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][760/1251] eta 0:01:59 lr 0.000988 wd 0.0500 time 0.2408 (0.2437) data time 0.0007 (0.0017) model time 0.2401 (0.2422) loss 4.7811 (3.7300) grad_norm 1.6147 (1.9405) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][770/1251] eta 0:01:57 lr 0.000988 wd 0.0500 time 0.2431 (0.2438) data time 0.0011 (0.0017) model time 0.2420 (0.2422) loss 3.6350 (3.7327) grad_norm 2.7550 (1.9427) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][780/1251] eta 0:01:54 lr 0.000988 wd 0.0500 time 0.2443 (0.2437) data time 0.0007 (0.0017) model time 0.2436 (0.2422) loss 4.2755 (3.7337) grad_norm 1.8351 (1.9430) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][790/1251] eta 0:01:52 lr 0.000988 wd 0.0500 time 0.2365 (0.2437) 
data time 0.0010 (0.0017) model time 0.2356 (0.2422) loss 4.5053 (3.7340) grad_norm 2.9584 (1.9445) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][800/1251] eta 0:01:49 lr 0.000988 wd 0.0500 time 0.2365 (0.2437) data time 0.0011 (0.0017) model time 0.2354 (0.2421) loss 3.6146 (3.7321) grad_norm 1.3207 (1.9439) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][810/1251] eta 0:01:47 lr 0.000988 wd 0.0500 time 0.2519 (0.2437) data time 0.0007 (0.0017) model time 0.2512 (0.2421) loss 4.4514 (3.7302) grad_norm 1.8449 (1.9418) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][820/1251] eta 0:01:45 lr 0.000988 wd 0.0500 time 0.2365 (0.2439) data time 0.0009 (0.0017) model time 0.2356 (0.2424) loss 3.6043 (3.7316) grad_norm 2.1111 (1.9392) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][830/1251] eta 0:01:42 lr 0.000988 wd 0.0500 time 0.2415 (0.2439) data time 0.0007 (0.0016) model time 0.2409 (0.2423) loss 4.2205 (3.7310) grad_norm 2.1132 (1.9368) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][840/1251] eta 0:01:40 lr 0.000988 wd 0.0500 time 0.2388 (0.2438) data time 0.0011 (0.0016) model time 0.2377 (0.2423) loss 4.2199 (3.7338) grad_norm 2.4812 (1.9405) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][850/1251] eta 0:01:37 lr 0.000988 wd 0.0500 time 0.2370 (0.2438) data time 0.0009 (0.0016) model time 0.2360 (0.2423) loss 4.3941 (3.7335) grad_norm 1.9977 (1.9402) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:38 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [39/300][860/1251] eta 0:01:35 lr 0.000988 wd 0.0500 time 0.2459 (0.2438) data time 0.0008 (0.0016) model time 0.2451 (0.2423) loss 3.4318 (3.7313) grad_norm 1.9271 (1.9390) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][870/1251] eta 0:01:32 lr 0.000988 wd 0.0500 time 0.2336 (0.2438) data time 0.0012 (0.0016) model time 0.2324 (0.2423) loss 3.8020 (3.7297) grad_norm 1.6989 (1.9376) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][880/1251] eta 0:01:30 lr 0.000988 wd 0.0500 time 0.2425 (0.2437) data time 0.0009 (0.0016) model time 0.2416 (0.2422) loss 2.3558 (3.7286) grad_norm 1.4279 (1.9374) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][890/1251] eta 0:01:27 lr 0.000988 wd 0.0500 time 0.2396 (0.2438) data time 0.0010 (0.0016) model time 0.2386 (0.2423) loss 4.1925 (3.7319) grad_norm 1.9004 (1.9370) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][900/1251] eta 0:01:25 lr 0.000988 wd 0.0500 time 0.2494 (0.2437) data time 0.0010 (0.0016) model time 0.2484 (0.2422) loss 3.9708 (3.7334) grad_norm 2.2302 (inf) loss_scale 8192.0000 (16365.8158) mem 7379MB [2024-08-26 05:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][910/1251] eta 0:01:23 lr 0.000988 wd 0.0500 time 0.2419 (0.2437) data time 0.0008 (0.0016) model time 0.2411 (0.2422) loss 3.7770 (3.7306) grad_norm 1.8723 (inf) loss_scale 8192.0000 (16276.0922) mem 7379MB [2024-08-26 05:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][920/1251] eta 0:01:20 lr 0.000988 wd 0.0500 time 0.2385 (0.2437) data time 0.0010 (0.0016) model time 0.2375 (0.2422) loss 3.1267 (3.7289) grad_norm 
1.6379 (inf) loss_scale 8192.0000 (16188.3170) mem 7379MB [2024-08-26 05:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][930/1251] eta 0:01:18 lr 0.000988 wd 0.0500 time 0.2321 (0.2437) data time 0.0012 (0.0016) model time 0.2309 (0.2422) loss 3.4572 (3.7291) grad_norm 1.5561 (inf) loss_scale 8192.0000 (16102.4275) mem 7379MB [2024-08-26 05:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][940/1251] eta 0:01:15 lr 0.000988 wd 0.0500 time 0.2376 (0.2436) data time 0.0012 (0.0016) model time 0.2364 (0.2421) loss 3.4687 (3.7308) grad_norm 1.8760 (inf) loss_scale 8192.0000 (16018.3634) mem 7379MB [2024-08-26 05:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][950/1251] eta 0:01:13 lr 0.000988 wd 0.0500 time 0.2489 (0.2436) data time 0.0007 (0.0016) model time 0.2482 (0.2422) loss 4.2823 (3.7299) grad_norm 1.9649 (inf) loss_scale 8192.0000 (15936.0673) mem 7379MB [2024-08-26 05:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][960/1251] eta 0:01:10 lr 0.000988 wd 0.0500 time 0.2393 (0.2436) data time 0.0008 (0.0016) model time 0.2386 (0.2421) loss 2.8140 (3.7279) grad_norm 1.5614 (inf) loss_scale 8192.0000 (15855.4839) mem 7379MB [2024-08-26 05:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][970/1251] eta 0:01:08 lr 0.000988 wd 0.0500 time 0.2408 (0.2436) data time 0.0009 (0.0016) model time 0.2399 (0.2421) loss 4.2097 (3.7289) grad_norm 3.3545 (inf) loss_scale 8192.0000 (15776.5602) mem 7379MB [2024-08-26 05:36:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][980/1251] eta 0:01:06 lr 0.000988 wd 0.0500 time 0.2378 (0.2436) data time 0.0010 (0.0016) model time 0.2367 (0.2421) loss 3.9812 (3.7320) grad_norm 1.8671 (inf) loss_scale 8192.0000 (15699.2457) mem 7379MB [2024-08-26 05:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][990/1251] eta 0:01:03 lr 0.000988 wd 0.0500 time 0.2415 (0.2438) data 
time 0.0009 (0.0016) model time 0.2406 (0.2423) loss 3.0768 (3.7338) grad_norm 1.7917 (inf) loss_scale 8192.0000 (15623.4914) mem 7379MB [2024-08-26 05:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1000/1251] eta 0:01:01 lr 0.000988 wd 0.0500 time 0.2381 (0.2437) data time 0.0010 (0.0015) model time 0.2371 (0.2423) loss 3.7647 (3.7311) grad_norm 1.8239 (inf) loss_scale 8192.0000 (15549.2507) mem 7379MB [2024-08-26 05:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1010/1251] eta 0:00:58 lr 0.000988 wd 0.0500 time 0.2390 (0.2437) data time 0.0010 (0.0015) model time 0.2379 (0.2422) loss 4.1664 (3.7280) grad_norm 2.0023 (inf) loss_scale 8192.0000 (15476.4787) mem 7379MB [2024-08-26 05:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1020/1251] eta 0:00:56 lr 0.000988 wd 0.0500 time 0.2404 (0.2439) data time 0.0011 (0.0015) model time 0.2393 (0.2424) loss 3.8774 (3.7306) grad_norm 1.5548 (inf) loss_scale 8192.0000 (15405.1322) mem 7379MB [2024-08-26 05:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1030/1251] eta 0:00:53 lr 0.000988 wd 0.0500 time 0.2446 (0.2438) data time 0.0009 (0.0015) model time 0.2437 (0.2424) loss 3.7706 (3.7308) grad_norm 1.7501 (inf) loss_scale 8192.0000 (15335.1697) mem 7379MB [2024-08-26 05:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1040/1251] eta 0:00:51 lr 0.000988 wd 0.0500 time 0.2545 (0.2439) data time 0.0011 (0.0015) model time 0.2534 (0.2424) loss 3.5057 (3.7272) grad_norm 2.5547 (inf) loss_scale 8192.0000 (15266.5514) mem 7379MB [2024-08-26 05:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1050/1251] eta 0:00:49 lr 0.000988 wd 0.0500 time 0.2392 (0.2438) data time 0.0010 (0.0015) model time 0.2382 (0.2424) loss 4.0693 (3.7253) grad_norm 1.3877 (inf) loss_scale 8192.0000 (15199.2388) mem 7379MB [2024-08-26 05:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [39/300][1060/1251] eta 0:00:46 lr 0.000988 wd 0.0500 time 0.2440 (0.2438) data time 0.0013 (0.0015) model time 0.2427 (0.2424) loss 3.8962 (3.7235) grad_norm 1.6989 (inf) loss_scale 8192.0000 (15133.1951) mem 7379MB [2024-08-26 05:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1070/1251] eta 0:00:44 lr 0.000988 wd 0.0500 time 0.2400 (0.2438) data time 0.0008 (0.0015) model time 0.2392 (0.2424) loss 3.5702 (3.7190) grad_norm 2.6463 (inf) loss_scale 8192.0000 (15068.3847) mem 7379MB [2024-08-26 05:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1080/1251] eta 0:00:41 lr 0.000988 wd 0.0500 time 0.2416 (0.2438) data time 0.0010 (0.0015) model time 0.2406 (0.2424) loss 3.4502 (3.7177) grad_norm 3.1042 (inf) loss_scale 8192.0000 (15004.7734) mem 7379MB [2024-08-26 05:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1090/1251] eta 0:00:39 lr 0.000988 wd 0.0500 time 0.2570 (0.2442) data time 0.0009 (0.0015) model time 0.2561 (0.2428) loss 2.8224 (3.7177) grad_norm 1.9126 (inf) loss_scale 8192.0000 (14942.3281) mem 7379MB [2024-08-26 05:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1100/1251] eta 0:00:36 lr 0.000988 wd 0.0500 time 0.2395 (0.2442) data time 0.0008 (0.0015) model time 0.2387 (0.2428) loss 3.6494 (3.7166) grad_norm 2.3513 (inf) loss_scale 8192.0000 (14881.0173) mem 7379MB [2024-08-26 05:36:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1110/1251] eta 0:00:34 lr 0.000988 wd 0.0500 time 0.2418 (0.2441) data time 0.0008 (0.0015) model time 0.2410 (0.2427) loss 2.9971 (3.7162) grad_norm 1.7367 (inf) loss_scale 8192.0000 (14820.8101) mem 7379MB [2024-08-26 05:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1120/1251] eta 0:00:31 lr 0.000988 wd 0.0500 time 0.2423 (0.2441) data time 0.0009 (0.0015) model time 0.2414 (0.2427) loss 3.5954 (3.7151) grad_norm 1.5591 (inf) loss_scale 8192.0000 
(14761.6771) mem 7379MB [2024-08-26 05:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1130/1251] eta 0:00:29 lr 0.000988 wd 0.0500 time 0.2380 (0.2441) data time 0.0007 (0.0015) model time 0.2372 (0.2427) loss 3.7318 (3.7161) grad_norm 1.5178 (inf) loss_scale 8192.0000 (14703.5897) mem 7379MB [2024-08-26 05:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1140/1251] eta 0:00:27 lr 0.000988 wd 0.0500 time 0.2355 (0.2441) data time 0.0010 (0.0015) model time 0.2346 (0.2427) loss 2.8439 (3.7143) grad_norm 3.0103 (inf) loss_scale 8192.0000 (14646.5206) mem 7379MB [2024-08-26 05:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1150/1251] eta 0:00:24 lr 0.000988 wd 0.0500 time 0.2474 (0.2441) data time 0.0008 (0.0015) model time 0.2467 (0.2427) loss 3.9746 (3.7162) grad_norm 1.9755 (inf) loss_scale 8192.0000 (14590.4431) mem 7379MB [2024-08-26 05:36:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1160/1251] eta 0:00:22 lr 0.000988 wd 0.0500 time 0.2408 (0.2441) data time 0.0007 (0.0015) model time 0.2400 (0.2427) loss 4.3222 (3.7177) grad_norm 2.0686 (inf) loss_scale 8192.0000 (14535.3316) mem 7379MB [2024-08-26 05:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1170/1251] eta 0:00:19 lr 0.000988 wd 0.0500 time 0.2441 (0.2441) data time 0.0010 (0.0015) model time 0.2431 (0.2427) loss 4.0112 (3.7184) grad_norm 1.8781 (inf) loss_scale 8192.0000 (14481.1614) mem 7379MB [2024-08-26 05:36:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1180/1251] eta 0:00:17 lr 0.000988 wd 0.0500 time 0.2432 (0.2441) data time 0.0009 (0.0015) model time 0.2423 (0.2427) loss 3.8150 (3.7163) grad_norm 1.7264 (inf) loss_scale 8192.0000 (14427.9086) mem 7379MB [2024-08-26 05:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1190/1251] eta 0:00:14 lr 0.000988 wd 0.0500 time 0.2429 (0.2441) data time 0.0014 (0.0015) model 
time 0.2414 (0.2427) loss 4.6700 (3.7159) grad_norm 2.0254 (inf) loss_scale 8192.0000 (14375.5500) mem 7379MB [2024-08-26 05:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1200/1251] eta 0:00:12 lr 0.000988 wd 0.0500 time 0.2417 (0.2441) data time 0.0010 (0.0015) model time 0.2407 (0.2427) loss 3.0895 (3.7150) grad_norm 3.1793 (inf) loss_scale 8192.0000 (14324.0633) mem 7379MB [2024-08-26 05:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1210/1251] eta 0:00:10 lr 0.000988 wd 0.0500 time 0.2465 (0.2441) data time 0.0008 (0.0015) model time 0.2457 (0.2427) loss 2.6660 (3.7133) grad_norm 2.7000 (inf) loss_scale 8192.0000 (14273.4269) mem 7379MB [2024-08-26 05:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1220/1251] eta 0:00:07 lr 0.000988 wd 0.0500 time 0.2517 (0.2441) data time 0.0008 (0.0015) model time 0.2509 (0.2427) loss 2.9478 (3.7139) grad_norm 2.0369 (inf) loss_scale 8192.0000 (14223.6200) mem 7379MB [2024-08-26 05:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1230/1251] eta 0:00:05 lr 0.000988 wd 0.0500 time 0.2368 (0.2440) data time 0.0008 (0.0014) model time 0.2360 (0.2427) loss 3.8072 (3.7115) grad_norm 4.9577 (inf) loss_scale 8192.0000 (14174.6223) mem 7379MB [2024-08-26 05:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1240/1251] eta 0:00:02 lr 0.000988 wd 0.0500 time 0.2240 (0.2440) data time 0.0004 (0.0014) model time 0.2235 (0.2426) loss 4.2695 (3.7101) grad_norm 2.1512 (inf) loss_scale 8192.0000 (14126.4142) mem 7379MB [2024-08-26 05:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [39/300][1250/1251] eta 0:00:00 lr 0.000988 wd 0.0500 time 0.2258 (0.2438) data time 0.0007 (0.0014) model time 0.2252 (0.2425) loss 4.1143 (3.7121) grad_norm 1.7903 (inf) loss_scale 8192.0000 (14078.9768) mem 7379MB [2024-08-26 05:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 39 training takes 
0:05:05 [2024-08-26 05:37:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 05:37:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 05:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.492 (0.492) Loss 0.5884 (0.5884) Acc@1 87.891 (87.891) Acc@5 97.070 (97.070) Mem 7379MB [2024-08-26 05:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.116) Loss 0.9272 (0.9104) Acc@1 78.809 (79.031) Acc@5 95.117 (95.073) Mem 7379MB [2024-08-26 05:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.098) Loss 1.3633 (0.9275) Acc@1 68.555 (78.451) Acc@5 89.355 (95.066) Mem 7379MB [2024-08-26 05:37:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.091) Loss 1.5928 (1.0790) Acc@1 64.648 (75.387) Acc@5 85.156 (93.104) Mem 7379MB [2024-08-26 05:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.4658 (1.1680) Acc@1 66.797 (73.435) Acc@5 88.281 (92.011) Mem 7379MB [2024-08-26 05:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.096 Acc@5 91.962 [2024-08-26 05:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.1% [2024-08-26 05:37:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.737 (0.737) Loss 0.4768 (0.4768) Acc@1 89.844 (89.844) Acc@5 97.656 (97.656) Mem 7379MB [2024-08-26 05:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.144) Loss 0.8184 (0.7969) Acc@1 81.055 (80.922) Acc@5 95.410 (95.597) Mem 7379MB [2024-08-26 05:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.113) Loss 1.1787 (0.8112) Acc@1 72.266 (80.194) Acc@5 90.723 (95.601) Mem 7379MB 
[2024-08-26 05:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.102) Loss 1.4609 (0.9406) Acc@1 63.770 (77.265) Acc@5 86.816 (93.854) Mem 7379MB [2024-08-26 05:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.4023 (1.0170) Acc@1 65.430 (75.491) Acc@5 88.184 (92.943) Mem 7379MB [2024-08-26 05:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.160 Acc@5 92.864 [2024-08-26 05:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 75.2% [2024-08-26 05:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 75.16% [2024-08-26 05:37:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 05:37:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 05:37:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][0/1251] eta 0:14:26 lr 0.000988 wd 0.0500 time 0.6923 (0.6923) data time 0.4707 (0.4707) model time 0.0000 (0.0000) loss 3.7046 (3.7046) grad_norm 2.3024 (2.3024) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][10/1251] eta 0:05:50 lr 0.000988 wd 0.0500 time 0.2475 (0.2821) data time 0.0006 (0.0436) model time 0.0000 (0.0000) loss 3.9746 (3.8057) grad_norm 1.5648 (1.8990) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][20/1251] eta 0:05:24 lr 0.000988 wd 0.0500 time 0.2470 (0.2638) data time 0.0007 (0.0233) model time 0.0000 (0.0000) loss 4.3631 (3.7779) grad_norm 1.9434 (1.9169) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][30/1251] eta 
0:05:13 lr 0.000988 wd 0.0500 time 0.2443 (0.2567) data time 0.0010 (0.0161) model time 0.0000 (0.0000) loss 3.3612 (3.7755) grad_norm 1.8967 (1.8794) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][40/1251] eta 0:05:06 lr 0.000988 wd 0.0500 time 0.2456 (0.2527) data time 0.0008 (0.0125) model time 0.0000 (0.0000) loss 4.5073 (3.6955) grad_norm 1.7842 (1.8308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][50/1251] eta 0:05:00 lr 0.000988 wd 0.0500 time 0.2428 (0.2504) data time 0.0009 (0.0102) model time 0.0000 (0.0000) loss 2.7448 (3.6964) grad_norm 2.0371 (1.8139) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][60/1251] eta 0:04:56 lr 0.000988 wd 0.0500 time 0.2410 (0.2489) data time 0.0007 (0.0087) model time 0.2403 (0.2403) loss 2.7271 (3.6129) grad_norm 1.3645 (1.7898) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][70/1251] eta 0:04:52 lr 0.000988 wd 0.0500 time 0.2427 (0.2479) data time 0.0009 (0.0076) model time 0.2417 (0.2405) loss 4.5881 (3.6447) grad_norm 3.3641 (1.8306) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][80/1251] eta 0:04:49 lr 0.000988 wd 0.0500 time 0.2543 (0.2474) data time 0.0007 (0.0068) model time 0.2536 (0.2413) loss 3.6892 (3.6363) grad_norm 1.7170 (1.8330) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][90/1251] eta 0:04:46 lr 0.000988 wd 0.0500 time 0.2379 (0.2470) data time 0.0009 (0.0062) model time 0.2370 (0.2418) loss 4.4212 (3.6545) grad_norm 1.7125 (1.8603) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
05:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][100/1251] eta 0:04:43 lr 0.000987 wd 0.0500 time 0.2419 (0.2467) data time 0.0008 (0.0057) model time 0.2411 (0.2420) loss 2.7628 (3.6569) grad_norm 1.3255 (1.8914) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][110/1251] eta 0:04:41 lr 0.000987 wd 0.0500 time 0.2428 (0.2463) data time 0.0009 (0.0052) model time 0.2419 (0.2419) loss 2.5444 (3.6485) grad_norm 2.5132 (1.8754) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][120/1251] eta 0:04:38 lr 0.000987 wd 0.0500 time 0.2458 (0.2459) data time 0.0012 (0.0049) model time 0.2446 (0.2416) loss 4.1197 (3.6576) grad_norm 2.2183 (1.8652) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][130/1251] eta 0:04:35 lr 0.000987 wd 0.0500 time 0.2489 (0.2458) data time 0.0009 (0.0046) model time 0.2479 (0.2419) loss 2.7484 (3.6522) grad_norm 1.6994 (1.8759) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][140/1251] eta 0:04:32 lr 0.000987 wd 0.0500 time 0.2414 (0.2457) data time 0.0011 (0.0043) model time 0.2404 (0.2420) loss 3.9688 (3.6736) grad_norm 2.6092 (1.8867) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][150/1251] eta 0:04:30 lr 0.000987 wd 0.0500 time 0.2430 (0.2454) data time 0.0007 (0.0041) model time 0.2423 (0.2419) loss 4.0625 (3.6750) grad_norm 1.7649 (1.8750) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][160/1251] eta 0:04:27 lr 0.000987 wd 0.0500 time 0.2417 (0.2453) data time 0.0010 (0.0039) model time 0.2406 (0.2419) loss 
3.5007 (3.6861) grad_norm 1.6983 (1.8710) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][170/1251] eta 0:04:24 lr 0.000987 wd 0.0500 time 0.2385 (0.2451) data time 0.0011 (0.0037) model time 0.2374 (0.2419) loss 2.6879 (3.6558) grad_norm 1.4373 (1.8762) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][180/1251] eta 0:04:22 lr 0.000987 wd 0.0500 time 0.2400 (0.2450) data time 0.0012 (0.0036) model time 0.2388 (0.2418) loss 4.1765 (3.6459) grad_norm 2.6241 (1.8760) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][190/1251] eta 0:04:19 lr 0.000987 wd 0.0500 time 0.2447 (0.2448) data time 0.0007 (0.0035) model time 0.2440 (0.2418) loss 3.7122 (3.6327) grad_norm 1.8498 (1.8663) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][200/1251] eta 0:04:17 lr 0.000987 wd 0.0500 time 0.2445 (0.2448) data time 0.0010 (0.0033) model time 0.2435 (0.2419) loss 3.8280 (3.6330) grad_norm 2.0182 (1.8669) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][210/1251] eta 0:04:15 lr 0.000987 wd 0.0500 time 0.2404 (0.2457) data time 0.0013 (0.0032) model time 0.2391 (0.2432) loss 3.6717 (3.6255) grad_norm 1.8287 (1.8724) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][220/1251] eta 0:04:13 lr 0.000987 wd 0.0500 time 0.2404 (0.2456) data time 0.0010 (0.0031) model time 0.2393 (0.2431) loss 4.2276 (3.6264) grad_norm 1.8459 (1.8811) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][230/1251] eta 0:04:10 lr 
0.000987 wd 0.0500 time 0.2421 (0.2453) data time 0.0007 (0.0030) model time 0.2414 (0.2429) loss 4.1540 (3.6334) grad_norm 1.7180 (1.8720) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][240/1251] eta 0:04:07 lr 0.000987 wd 0.0500 time 0.2436 (0.2452) data time 0.0009 (0.0030) model time 0.2427 (0.2428) loss 3.0061 (3.6210) grad_norm 1.5559 (1.8751) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][250/1251] eta 0:04:05 lr 0.000987 wd 0.0500 time 0.2479 (0.2450) data time 0.0009 (0.0029) model time 0.2470 (0.2426) loss 3.8271 (3.6225) grad_norm 2.0313 (1.8928) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][260/1251] eta 0:04:02 lr 0.000987 wd 0.0500 time 0.2380 (0.2449) data time 0.0007 (0.0028) model time 0.2373 (0.2425) loss 3.6696 (3.6203) grad_norm 1.8553 (1.9040) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][270/1251] eta 0:04:01 lr 0.000987 wd 0.0500 time 0.2388 (0.2463) data time 0.0007 (0.0028) model time 0.2381 (0.2444) loss 3.2466 (3.6111) grad_norm 1.5038 (1.9014) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][280/1251] eta 0:03:59 lr 0.000987 wd 0.0500 time 0.2422 (0.2462) data time 0.0008 (0.0027) model time 0.2414 (0.2442) loss 2.7248 (3.6096) grad_norm 1.6475 (1.9221) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][290/1251] eta 0:03:56 lr 0.000987 wd 0.0500 time 0.2374 (0.2459) data time 0.0011 (0.0026) model time 0.2363 (0.2440) loss 3.7457 (3.6064) grad_norm 2.3088 (1.9241) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][300/1251] eta 0:03:53 lr 0.000987 wd 0.0500 time 0.2407 (0.2458) data time 0.0010 (0.0026) model time 0.2397 (0.2438) loss 3.1202 (3.6126) grad_norm 1.6781 (1.9124) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][310/1251] eta 0:03:51 lr 0.000987 wd 0.0500 time 0.2380 (0.2456) data time 0.0009 (0.0025) model time 0.2371 (0.2436) loss 3.3342 (3.6111) grad_norm 2.0809 (1.9039) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][320/1251] eta 0:03:48 lr 0.000987 wd 0.0500 time 0.2367 (0.2454) data time 0.0010 (0.0025) model time 0.2358 (0.2435) loss 3.3130 (3.6109) grad_norm 2.2733 (1.9073) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][330/1251] eta 0:03:45 lr 0.000987 wd 0.0500 time 0.2522 (0.2453) data time 0.0010 (0.0024) model time 0.2512 (0.2433) loss 3.3097 (3.6144) grad_norm 1.4545 (1.9045) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][340/1251] eta 0:03:43 lr 0.000987 wd 0.0500 time 0.2528 (0.2452) data time 0.0008 (0.0024) model time 0.2520 (0.2433) loss 3.3742 (3.6065) grad_norm 2.2563 (1.8967) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][350/1251] eta 0:03:40 lr 0.000987 wd 0.0500 time 0.2407 (0.2451) data time 0.0008 (0.0024) model time 0.2399 (0.2432) loss 2.5324 (3.6059) grad_norm 1.9958 (1.8994) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][360/1251] eta 0:03:38 lr 0.000987 wd 0.0500 time 0.2436 (0.2450) data time 0.0012 (0.0023) model time 0.2424 (0.2431) loss 3.8655 
(3.6075) grad_norm 2.1659 (1.9093) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][370/1251] eta 0:03:35 lr 0.000987 wd 0.0500 time 0.2397 (0.2450) data time 0.0010 (0.0023) model time 0.2387 (0.2431) loss 3.1309 (3.6095) grad_norm 1.9390 (1.9097) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][380/1251] eta 0:03:33 lr 0.000987 wd 0.0500 time 0.2434 (0.2449) data time 0.0011 (0.0023) model time 0.2423 (0.2430) loss 3.8544 (3.6121) grad_norm 2.1698 (1.9051) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][390/1251] eta 0:03:30 lr 0.000987 wd 0.0500 time 0.2403 (0.2448) data time 0.0010 (0.0022) model time 0.2393 (0.2429) loss 3.1912 (3.6105) grad_norm 1.3410 (1.8974) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][400/1251] eta 0:03:28 lr 0.000987 wd 0.0500 time 0.2373 (0.2452) data time 0.0010 (0.0022) model time 0.2362 (0.2435) loss 3.3648 (3.6141) grad_norm 1.4291 (1.8916) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][410/1251] eta 0:03:26 lr 0.000987 wd 0.0500 time 0.2454 (0.2451) data time 0.0010 (0.0022) model time 0.2444 (0.2434) loss 4.0815 (3.6070) grad_norm 2.0831 (1.8939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][420/1251] eta 0:03:23 lr 0.000987 wd 0.0500 time 0.2383 (0.2450) data time 0.0009 (0.0021) model time 0.2374 (0.2433) loss 4.2121 (3.6103) grad_norm 1.7747 (1.8961) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][430/1251] eta 0:03:21 lr 0.000987 wd 
0.0500 time 0.2386 (0.2449) data time 0.0010 (0.0021) model time 0.2376 (0.2432) loss 3.9370 (3.6085) grad_norm 2.0431 (1.8976) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][440/1251] eta 0:03:18 lr 0.000987 wd 0.0500 time 0.2466 (0.2449) data time 0.0009 (0.0021) model time 0.2458 (0.2432) loss 4.3547 (3.6115) grad_norm 1.9586 (1.8942) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][450/1251] eta 0:03:16 lr 0.000987 wd 0.0500 time 0.2373 (0.2448) data time 0.0008 (0.0021) model time 0.2365 (0.2431) loss 4.3800 (3.6087) grad_norm 1.8168 (1.8957) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][460/1251] eta 0:03:13 lr 0.000987 wd 0.0500 time 0.2340 (0.2448) data time 0.0008 (0.0020) model time 0.2333 (0.2431) loss 3.7057 (3.6142) grad_norm 1.7485 (1.8951) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][470/1251] eta 0:03:11 lr 0.000987 wd 0.0500 time 0.2414 (0.2447) data time 0.0009 (0.0020) model time 0.2405 (0.2430) loss 4.1057 (3.6157) grad_norm 1.8355 (1.8914) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][480/1251] eta 0:03:08 lr 0.000987 wd 0.0500 time 0.2495 (0.2447) data time 0.0008 (0.0020) model time 0.2487 (0.2430) loss 3.0660 (3.6158) grad_norm 1.6319 (1.8889) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][490/1251] eta 0:03:06 lr 0.000987 wd 0.0500 time 0.2442 (0.2447) data time 0.0007 (0.0020) model time 0.2435 (0.2430) loss 4.2791 (3.6175) grad_norm 4.0273 (1.9150) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:26 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][500/1251] eta 0:03:03 lr 0.000987 wd 0.0500 time 0.2383 (0.2447) data time 0.0010 (0.0020) model time 0.2373 (0.2430) loss 4.3177 (3.6208) grad_norm 1.8622 (1.9159) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][510/1251] eta 0:03:01 lr 0.000987 wd 0.0500 time 0.2465 (0.2446) data time 0.0009 (0.0019) model time 0.2456 (0.2430) loss 3.1533 (3.6216) grad_norm 1.7614 (1.9123) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][520/1251] eta 0:02:58 lr 0.000987 wd 0.0500 time 0.2385 (0.2446) data time 0.0009 (0.0019) model time 0.2376 (0.2429) loss 3.9688 (3.6180) grad_norm 2.6250 (1.9088) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][530/1251] eta 0:02:56 lr 0.000987 wd 0.0500 time 0.2417 (0.2445) data time 0.0007 (0.0019) model time 0.2410 (0.2428) loss 4.4847 (3.6211) grad_norm 2.7061 (1.9133) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][540/1251] eta 0:02:53 lr 0.000987 wd 0.0500 time 0.2388 (0.2444) data time 0.0007 (0.0019) model time 0.2381 (0.2427) loss 2.6309 (3.6242) grad_norm 2.4963 (1.9120) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][550/1251] eta 0:02:51 lr 0.000987 wd 0.0500 time 0.2374 (0.2443) data time 0.0009 (0.0019) model time 0.2365 (0.2427) loss 4.0705 (3.6321) grad_norm 2.6517 (1.9155) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][560/1251] eta 0:02:48 lr 0.000987 wd 0.0500 time 0.2346 (0.2442) data time 0.0009 (0.0019) model time 0.2337 (0.2426) loss 3.9494 
(3.6319) grad_norm 3.4052 (1.9188) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][570/1251] eta 0:02:46 lr 0.000987 wd 0.0500 time 0.2450 (0.2442) data time 0.0008 (0.0018) model time 0.2442 (0.2426) loss 3.6621 (3.6349) grad_norm 1.4869 (1.9165) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][580/1251] eta 0:02:43 lr 0.000987 wd 0.0500 time 0.2436 (0.2442) data time 0.0011 (0.0018) model time 0.2425 (0.2426) loss 3.3507 (3.6350) grad_norm 1.9574 (1.9161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][590/1251] eta 0:02:41 lr 0.000987 wd 0.0500 time 0.2432 (0.2442) data time 0.0011 (0.0018) model time 0.2421 (0.2426) loss 3.9142 (3.6376) grad_norm 1.4447 (1.9135) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][600/1251] eta 0:02:38 lr 0.000987 wd 0.0500 time 0.2405 (0.2442) data time 0.0008 (0.0018) model time 0.2397 (0.2426) loss 4.2667 (3.6367) grad_norm 1.9666 (1.9097) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][610/1251] eta 0:02:36 lr 0.000987 wd 0.0500 time 0.2433 (0.2441) data time 0.0009 (0.0018) model time 0.2425 (0.2425) loss 2.9725 (3.6385) grad_norm 3.1011 (1.9161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][620/1251] eta 0:02:34 lr 0.000987 wd 0.0500 time 0.2458 (0.2441) data time 0.0008 (0.0018) model time 0.2450 (0.2425) loss 3.4056 (3.6412) grad_norm 2.1135 (1.9168) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][630/1251] eta 0:02:31 lr 0.000987 wd 
0.0500 time 0.2376 (0.2441) data time 0.0010 (0.0018) model time 0.2366 (0.2425) loss 3.7244 (3.6442) grad_norm 1.5727 (1.9163) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][640/1251] eta 0:02:29 lr 0.000987 wd 0.0500 time 0.2379 (0.2444) data time 0.0009 (0.0017) model time 0.2370 (0.2428) loss 3.9980 (3.6413) grad_norm 1.5474 (1.9197) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][650/1251] eta 0:02:27 lr 0.000987 wd 0.0500 time 0.2470 (0.2447) data time 0.0007 (0.0017) model time 0.2463 (0.2432) loss 4.5866 (3.6447) grad_norm 1.8077 (1.9211) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][660/1251] eta 0:02:24 lr 0.000987 wd 0.0500 time 0.2537 (0.2447) data time 0.0008 (0.0017) model time 0.2529 (0.2432) loss 4.8098 (3.6485) grad_norm 1.6189 (1.9182) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][670/1251] eta 0:02:22 lr 0.000987 wd 0.0500 time 0.2366 (0.2446) data time 0.0008 (0.0017) model time 0.2358 (0.2431) loss 3.2583 (3.6500) grad_norm 1.7500 (1.9160) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][680/1251] eta 0:02:19 lr 0.000987 wd 0.0500 time 0.2354 (0.2446) data time 0.0012 (0.0017) model time 0.2342 (0.2431) loss 4.0953 (3.6495) grad_norm 2.2711 (1.9136) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][690/1251] eta 0:02:17 lr 0.000987 wd 0.0500 time 0.2389 (0.2445) data time 0.0008 (0.0017) model time 0.2381 (0.2431) loss 4.6434 (3.6515) grad_norm 1.4799 (1.9109) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][700/1251] eta 0:02:14 lr 0.000987 wd 0.0500 time 0.2403 (0.2445) data time 0.0010 (0.0017) model time 0.2393 (0.2430) loss 4.2023 (3.6515) grad_norm 2.5252 (1.9173) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][710/1251] eta 0:02:12 lr 0.000987 wd 0.0500 time 0.2448 (0.2445) data time 0.0009 (0.0017) model time 0.2439 (0.2430) loss 3.7393 (3.6557) grad_norm 1.6865 (1.9208) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][720/1251] eta 0:02:09 lr 0.000987 wd 0.0500 time 0.2589 (0.2444) data time 0.0010 (0.0017) model time 0.2580 (0.2430) loss 2.6767 (3.6569) grad_norm 2.3034 (1.9203) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][730/1251] eta 0:02:07 lr 0.000987 wd 0.0500 time 0.2407 (0.2445) data time 0.0009 (0.0017) model time 0.2398 (0.2430) loss 4.1084 (3.6587) grad_norm 1.5675 (1.9187) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][740/1251] eta 0:02:04 lr 0.000987 wd 0.0500 time 0.2462 (0.2445) data time 0.0011 (0.0017) model time 0.2451 (0.2430) loss 3.0365 (3.6605) grad_norm 1.3336 (1.9138) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][750/1251] eta 0:02:02 lr 0.000987 wd 0.0500 time 0.2449 (0.2444) data time 0.0009 (0.0016) model time 0.2441 (0.2429) loss 3.2108 (3.6602) grad_norm 1.8117 (1.9168) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][760/1251] eta 0:01:59 lr 0.000987 wd 0.0500 time 0.2465 (0.2444) data time 0.0011 (0.0016) model time 0.2454 (0.2429) loss 3.2736 
(3.6603) grad_norm 1.7618 (1.9136) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][770/1251] eta 0:01:57 lr 0.000987 wd 0.0500 time 0.2426 (0.2444) data time 0.0008 (0.0016) model time 0.2417 (0.2429) loss 3.6809 (3.6564) grad_norm 1.3892 (1.9160) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][780/1251] eta 0:01:55 lr 0.000987 wd 0.0500 time 0.2346 (0.2443) data time 0.0012 (0.0016) model time 0.2334 (0.2429) loss 2.9692 (3.6552) grad_norm 1.6400 (1.9202) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][790/1251] eta 0:01:52 lr 0.000987 wd 0.0500 time 0.2377 (0.2447) data time 0.0007 (0.0016) model time 0.2370 (0.2433) loss 2.7232 (3.6564) grad_norm 1.3571 (1.9176) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][800/1251] eta 0:01:50 lr 0.000987 wd 0.0500 time 0.2425 (0.2447) data time 0.0009 (0.0016) model time 0.2417 (0.2433) loss 4.1497 (3.6571) grad_norm 1.5067 (1.9145) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][810/1251] eta 0:01:47 lr 0.000987 wd 0.0500 time 0.2350 (0.2446) data time 0.0009 (0.0016) model time 0.2341 (0.2432) loss 4.2549 (3.6579) grad_norm 3.0011 (1.9154) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][820/1251] eta 0:01:45 lr 0.000987 wd 0.0500 time 0.2421 (0.2446) data time 0.0009 (0.0016) model time 0.2412 (0.2432) loss 4.1553 (3.6567) grad_norm 2.0458 (1.9153) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][830/1251] eta 0:01:42 lr 0.000987 wd 
0.0500 time 0.2365 (0.2445) data time 0.0008 (0.0016) model time 0.2358 (0.2431) loss 4.5408 (3.6588) grad_norm 2.6313 (1.9158) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][840/1251] eta 0:01:40 lr 0.000987 wd 0.0500 time 0.2418 (0.2445) data time 0.0010 (0.0016) model time 0.2408 (0.2431) loss 3.5010 (3.6604) grad_norm 1.9672 (1.9169) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][850/1251] eta 0:01:38 lr 0.000987 wd 0.0500 time 0.2431 (0.2445) data time 0.0010 (0.0016) model time 0.2420 (0.2431) loss 3.9060 (3.6609) grad_norm 2.1643 (1.9194) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][860/1251] eta 0:01:35 lr 0.000987 wd 0.0500 time 0.2395 (0.2445) data time 0.0010 (0.0016) model time 0.2385 (0.2431) loss 3.2672 (3.6606) grad_norm 1.3786 (1.9161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][870/1251] eta 0:01:33 lr 0.000987 wd 0.0500 time 0.2391 (0.2444) data time 0.0009 (0.0016) model time 0.2382 (0.2430) loss 2.3373 (3.6620) grad_norm 1.8417 (1.9142) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][880/1251] eta 0:01:30 lr 0.000987 wd 0.0500 time 0.2508 (0.2444) data time 0.0010 (0.0016) model time 0.2497 (0.2430) loss 4.0352 (3.6629) grad_norm 1.9559 (1.9151) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][890/1251] eta 0:01:28 lr 0.000987 wd 0.0500 time 0.2457 (0.2443) data time 0.0012 (0.0015) model time 0.2445 (0.2429) loss 2.8404 (3.6604) grad_norm 2.0566 (1.9140) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][900/1251] eta 0:01:25 lr 0.000987 wd 0.0500 time 0.2449 (0.2443) data time 0.0010 (0.0015) model time 0.2439 (0.2429) loss 4.0229 (3.6607) grad_norm 2.4000 (1.9141) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][910/1251] eta 0:01:23 lr 0.000987 wd 0.0500 time 0.2409 (0.2443) data time 0.0009 (0.0015) model time 0.2399 (0.2429) loss 4.2918 (3.6592) grad_norm 3.0433 (1.9150) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][920/1251] eta 0:01:20 lr 0.000987 wd 0.0500 time 0.2363 (0.2442) data time 0.0007 (0.0015) model time 0.2355 (0.2428) loss 3.7641 (3.6582) grad_norm 1.8573 (1.9136) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][930/1251] eta 0:01:18 lr 0.000987 wd 0.0500 time 0.2412 (0.2442) data time 0.0010 (0.0015) model time 0.2402 (0.2429) loss 3.3215 (3.6547) grad_norm 3.0326 (1.9181) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][940/1251] eta 0:01:15 lr 0.000987 wd 0.0500 time 0.2387 (0.2442) data time 0.0010 (0.0015) model time 0.2377 (0.2428) loss 4.0714 (3.6594) grad_norm 1.5722 (1.9198) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][950/1251] eta 0:01:13 lr 0.000987 wd 0.0500 time 0.2422 (0.2442) data time 0.0007 (0.0015) model time 0.2415 (0.2428) loss 4.5072 (3.6632) grad_norm 1.6910 (1.9172) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][960/1251] eta 0:01:11 lr 0.000987 wd 0.0500 time 0.2439 (0.2442) data time 0.0011 (0.0015) model time 0.2428 (0.2428) loss 4.5794 
(3.6657) grad_norm 2.2322 (1.9176) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][970/1251] eta 0:01:08 lr 0.000987 wd 0.0500 time 0.2368 (0.2441) data time 0.0010 (0.0015) model time 0.2358 (0.2427) loss 3.4355 (3.6666) grad_norm 1.8350 (1.9148) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][980/1251] eta 0:01:06 lr 0.000987 wd 0.0500 time 0.2426 (0.2441) data time 0.0007 (0.0015) model time 0.2419 (0.2427) loss 4.3968 (3.6683) grad_norm 2.1437 (1.9134) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][990/1251] eta 0:01:03 lr 0.000987 wd 0.0500 time 0.2401 (0.2440) data time 0.0007 (0.0015) model time 0.2394 (0.2427) loss 4.3337 (3.6703) grad_norm 1.8974 (1.9147) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1000/1251] eta 0:01:01 lr 0.000987 wd 0.0500 time 0.2427 (0.2441) data time 0.0011 (0.0015) model time 0.2416 (0.2427) loss 3.4710 (3.6706) grad_norm 1.7422 (1.9148) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1010/1251] eta 0:00:58 lr 0.000987 wd 0.0500 time 0.2388 (0.2440) data time 0.0010 (0.0015) model time 0.2378 (0.2427) loss 3.7832 (3.6725) grad_norm 1.5822 (1.9160) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1020/1251] eta 0:00:56 lr 0.000987 wd 0.0500 time 0.2451 (0.2440) data time 0.0009 (0.0015) model time 0.2442 (0.2427) loss 3.3903 (3.6733) grad_norm 3.4236 (1.9173) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1030/1251] eta 0:00:53 lr 
0.000987 wd 0.0500 time 0.2439 (0.2440) data time 0.0007 (0.0015) model time 0.2432 (0.2427) loss 4.0476 (3.6738) grad_norm 2.8434 (1.9159) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1040/1251] eta 0:00:51 lr 0.000987 wd 0.0500 time 0.2495 (0.2440) data time 0.0011 (0.0015) model time 0.2485 (0.2427) loss 3.0620 (3.6720) grad_norm 1.3165 (1.9148) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1050/1251] eta 0:00:49 lr 0.000987 wd 0.0500 time 0.2454 (0.2440) data time 0.0009 (0.0015) model time 0.2445 (0.2427) loss 3.7285 (3.6687) grad_norm 1.8141 (1.9145) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1060/1251] eta 0:00:46 lr 0.000987 wd 0.0500 time 0.2421 (0.2440) data time 0.0007 (0.0015) model time 0.2414 (0.2426) loss 4.0179 (3.6703) grad_norm 2.6021 (1.9152) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1070/1251] eta 0:00:44 lr 0.000987 wd 0.0500 time 0.2373 (0.2439) data time 0.0010 (0.0015) model time 0.2363 (0.2426) loss 3.6648 (3.6723) grad_norm 2.2659 (1.9151) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1080/1251] eta 0:00:41 lr 0.000986 wd 0.0500 time 0.2614 (0.2439) data time 0.0009 (0.0015) model time 0.2605 (0.2426) loss 4.0010 (3.6724) grad_norm 2.2407 (1.9177) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1090/1251] eta 0:00:39 lr 0.000986 wd 0.0500 time 0.2438 (0.2439) data time 0.0007 (0.0015) model time 0.2430 (0.2426) loss 4.5899 (3.6735) grad_norm 1.5129 (1.9181) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
05:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1100/1251] eta 0:00:36 lr 0.000986 wd 0.0500 time 0.2404 (0.2439) data time 0.0012 (0.0015) model time 0.2393 (0.2425) loss 4.3072 (3.6745) grad_norm 1.6437 (1.9178) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1110/1251] eta 0:00:34 lr 0.000986 wd 0.0500 time 0.2404 (0.2439) data time 0.0010 (0.0014) model time 0.2394 (0.2425) loss 4.0055 (3.6750) grad_norm 1.8444 (1.9183) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1120/1251] eta 0:00:31 lr 0.000986 wd 0.0500 time 0.2367 (0.2438) data time 0.0010 (0.0015) model time 0.2357 (0.2425) loss 2.5027 (3.6719) grad_norm 1.7381 (1.9183) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1130/1251] eta 0:00:29 lr 0.000986 wd 0.0500 time 0.2378 (0.2438) data time 0.0011 (0.0014) model time 0.2367 (0.2425) loss 4.2250 (3.6724) grad_norm 1.5820 (1.9183) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1140/1251] eta 0:00:27 lr 0.000986 wd 0.0500 time 0.2468 (0.2440) data time 0.0011 (0.0014) model time 0.2457 (0.2427) loss 4.1444 (3.6740) grad_norm 2.6434 (1.9205) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1150/1251] eta 0:00:24 lr 0.000986 wd 0.0500 time 0.2366 (0.2440) data time 0.0010 (0.0014) model time 0.2356 (0.2427) loss 3.9257 (3.6770) grad_norm 2.1544 (1.9215) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1160/1251] eta 0:00:22 lr 0.000986 wd 0.0500 time 0.2433 (0.2440) data time 0.0011 (0.0014) model time 0.2422 (0.2427) 
loss 3.8492 (3.6795) grad_norm 1.2507 (1.9198) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1170/1251] eta 0:00:19 lr 0.000986 wd 0.0500 time 0.2318 (0.2440) data time 0.0008 (0.0014) model time 0.2310 (0.2427) loss 3.8241 (3.6805) grad_norm 1.3539 (1.9180) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1180/1251] eta 0:00:17 lr 0.000986 wd 0.0500 time 0.4793 (0.2444) data time 0.0011 (0.0014) model time 0.4781 (0.2431) loss 3.9728 (3.6791) grad_norm 1.5551 (1.9165) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1190/1251] eta 0:00:14 lr 0.000986 wd 0.0500 time 0.2462 (0.2444) data time 0.0007 (0.0014) model time 0.2456 (0.2431) loss 4.4044 (3.6825) grad_norm 2.0890 (1.9161) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1200/1251] eta 0:00:12 lr 0.000986 wd 0.0500 time 0.2422 (0.2444) data time 0.0007 (0.0014) model time 0.2415 (0.2431) loss 2.6728 (3.6812) grad_norm 1.7343 (1.9190) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1210/1251] eta 0:00:10 lr 0.000986 wd 0.0500 time 0.2417 (0.2444) data time 0.0010 (0.0014) model time 0.2407 (0.2431) loss 4.1211 (3.6828) grad_norm 1.6272 (1.9185) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1220/1251] eta 0:00:07 lr 0.000986 wd 0.0500 time 0.2438 (0.2444) data time 0.0010 (0.0014) model time 0.2428 (0.2431) loss 3.2471 (3.6813) grad_norm 1.7876 (1.9166) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1230/1251] eta 
0:00:05 lr 0.000986 wd 0.0500 time 0.2366 (0.2443) data time 0.0009 (0.0014) model time 0.2357 (0.2430) loss 4.6298 (3.6822) grad_norm 1.8083 (1.9155) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1240/1251] eta 0:00:02 lr 0.000986 wd 0.0500 time 0.2309 (0.2443) data time 0.0005 (0.0014) model time 0.2304 (0.2429) loss 4.3988 (3.6838) grad_norm 1.7012 (1.9160) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [40/300][1250/1251] eta 0:00:00 lr 0.000986 wd 0.0500 time 0.2298 (0.2441) data time 0.0004 (0.0014) model time 0.2294 (0.2428) loss 3.7155 (3.6843) grad_norm 1.7462 (1.9156) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 40 training takes 0:05:05 [2024-08-26 05:42:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 05:42:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 05:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.445 (0.445) Loss 0.5801 (0.5801) Acc@1 87.793 (87.793) Acc@5 97.266 (97.266) Mem 7379MB
[2024-08-26 05:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.112) Loss 0.9790 (0.9116) Acc@1 77.637 (79.004) Acc@5 95.020 (95.153) Mem 7379MB
[2024-08-26 05:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.096) Loss 1.3115 (0.9345) Acc@1 69.141 (78.348) Acc@5 89.453 (95.136) Mem 7379MB
[2024-08-26 05:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.091) Loss 1.6230 (1.0582) Acc@1 60.352 (75.551) Acc@5 85.254 (93.498) Mem 7379MB
[2024-08-26 05:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.5312 (1.1464) Acc@1 65.723 (73.685) Acc@5 86.621 (92.326) Mem 7379MB
[2024-08-26 05:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.358 Acc@5 92.204
[2024-08-26 05:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.4%
[2024-08-26 05:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 73.36%
[2024-08-26 05:42:33 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:42:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.467 (0.467) Loss 0.4736 (0.4736) Acc@1 89.746 (89.746) Acc@5 97.754 (97.754) Mem 7379MB
[2024-08-26 05:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.113) Loss 0.8149 (0.7917) Acc@1 81.836 (81.179) Acc@5 95.312 (95.685) Mem 7379MB
[2024-08-26 05:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.097) Loss 1.1680 (0.8061) Acc@1 72.559 (80.441) Acc@5 90.918 (95.671) Mem 7379MB
[2024-08-26 05:42:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.091) Loss 1.4531 (0.9344) Acc@1 63.965 (77.463) Acc@5 87.012 (93.961) Mem 7379MB
[2024-08-26 05:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.3887 (1.0096) Acc@1 66.016 (75.719) Acc@5 88.184 (93.069) Mem 7379MB
[2024-08-26 05:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.392 Acc@5 93.002
[2024-08-26 05:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 75.4%
[2024-08-26 05:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 75.39%
[2024-08-26 05:42:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:42:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][0/1251] eta 0:13:41 lr 0.000986 wd 0.0500 time 0.6566 (0.6566) data time 0.4283 (0.4283) model time 0.0000 (0.0000) loss 3.9302 (3.9302) grad_norm 1.8138 (1.8138) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][10/1251] eta 0:05:46 lr 0.000986 wd 0.0500 time 0.2396 (0.2794) data time 0.0008 (0.0398) model time 0.0000 (0.0000) loss 3.2414 (3.7666) grad_norm 1.7854 (1.7765) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][20/1251] eta 0:05:21 lr 0.000986 wd 0.0500 time 0.2393 (0.2611) data time 0.0009 (0.0213) model time 0.0000 (0.0000) loss 3.5145 (3.7160) grad_norm 1.5381 (1.7066) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][30/1251] eta 0:05:10 lr 0.000986 wd 0.0500 time 0.2343 (0.2546) data time 0.0010 (0.0147) model time 0.0000 (0.0000) loss 4.2135 (3.7024) grad_norm 1.5220 (1.7529) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][40/1251] eta 0:05:11 lr 0.000986 wd 0.0500 time 0.4643 (0.2571) data time 0.0010 (0.0114) model time 0.0000 (0.0000) loss 3.5101 (3.7065) grad_norm 2.0369 (1.7779) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][50/1251] eta 0:05:05 lr 0.000986 wd 0.0500 time 0.2413 (0.2543) data time 0.0012 (0.0095) model time 0.0000 (0.0000) loss 2.9001 (3.6368) grad_norm 1.8949 (1.8005) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][60/1251] eta 0:05:06 lr 0.000986 wd 0.0500 time 0.2450 (0.2575) data time 0.0009 (0.0081) model time 0.2441 (0.2727) loss 
3.9000 (3.6550) grad_norm 1.8791 (1.8871) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][70/1251] eta 0:05:01 lr 0.000986 wd 0.0500 time 0.2379 (0.2552) data time 0.0007 (0.0071) model time 0.2371 (0.2563) loss 3.9851 (3.6600) grad_norm 1.6996 (1.8642) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:42:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][80/1251] eta 0:04:56 lr 0.000986 wd 0.0500 time 0.2369 (0.2535) data time 0.0008 (0.0064) model time 0.2361 (0.2509) loss 3.8358 (3.6693) grad_norm 1.7350 (1.8531) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][90/1251] eta 0:04:52 lr 0.000986 wd 0.0500 time 0.2402 (0.2520) data time 0.0012 (0.0058) model time 0.2391 (0.2480) loss 4.0841 (3.6920) grad_norm 1.7611 (1.8582) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][100/1251] eta 0:04:48 lr 0.000986 wd 0.0500 time 0.2375 (0.2508) data time 0.0008 (0.0054) model time 0.2368 (0.2460) loss 2.2799 (3.6615) grad_norm 2.1349 (1.8442) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][110/1251] eta 0:04:45 lr 0.000986 wd 0.0500 time 0.2361 (0.2499) data time 0.0011 (0.0050) model time 0.2351 (0.2449) loss 3.9859 (3.6575) grad_norm 1.8958 (1.8575) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][120/1251] eta 0:04:41 lr 0.000986 wd 0.0500 time 0.2383 (0.2493) data time 0.0011 (0.0047) model time 0.2372 (0.2444) loss 3.8833 (3.6725) grad_norm 1.9641 (1.8757) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][130/1251] eta 0:04:38 lr 
0.000986 wd 0.0500 time 0.2404 (0.2486) data time 0.0011 (0.0044) model time 0.2393 (0.2438) loss 3.8981 (3.7052) grad_norm 1.2870 (1.8769) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][140/1251] eta 0:04:35 lr 0.000986 wd 0.0500 time 0.2326 (0.2481) data time 0.0010 (0.0042) model time 0.2316 (0.2434) loss 4.6070 (3.7056) grad_norm 1.8462 (1.8807) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][150/1251] eta 0:04:32 lr 0.000986 wd 0.0500 time 0.2441 (0.2476) data time 0.0009 (0.0040) model time 0.2431 (0.2431) loss 3.6748 (3.7089) grad_norm 2.1586 (1.8834) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][160/1251] eta 0:04:29 lr 0.000986 wd 0.0500 time 0.2421 (0.2473) data time 0.0009 (0.0038) model time 0.2412 (0.2429) loss 4.5942 (3.7236) grad_norm 1.9177 (1.8788) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][170/1251] eta 0:04:26 lr 0.000986 wd 0.0500 time 0.2403 (0.2469) data time 0.0010 (0.0036) model time 0.2394 (0.2426) loss 3.9700 (3.7187) grad_norm 1.4457 (1.8701) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][180/1251] eta 0:04:24 lr 0.000986 wd 0.0500 time 0.2419 (0.2467) data time 0.0009 (0.0036) model time 0.2410 (0.2425) loss 3.6214 (3.7047) grad_norm 1.3409 (1.8987) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][190/1251] eta 0:04:21 lr 0.000986 wd 0.0500 time 0.2415 (0.2465) data time 0.0008 (0.0034) model time 0.2407 (0.2425) loss 3.6636 (3.7211) grad_norm 1.7845 (1.8914) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:28 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][200/1251] eta 0:04:18 lr 0.000986 wd 0.0500 time 0.2397 (0.2463) data time 0.0008 (0.0033) model time 0.2389 (0.2424) loss 4.0877 (3.7256) grad_norm 1.5808 (1.8809) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][210/1251] eta 0:04:16 lr 0.000986 wd 0.0500 time 0.2399 (0.2461) data time 0.0009 (0.0032) model time 0.2390 (0.2423) loss 3.4198 (3.7192) grad_norm 2.0568 (1.8790) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][220/1251] eta 0:04:13 lr 0.000986 wd 0.0500 time 0.2382 (0.2459) data time 0.0008 (0.0031) model time 0.2375 (0.2422) loss 3.1974 (3.7148) grad_norm 1.4885 (1.8884) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][230/1251] eta 0:04:11 lr 0.000986 wd 0.0500 time 0.2416 (0.2458) data time 0.0010 (0.0030) model time 0.2405 (0.2423) loss 4.4474 (3.7200) grad_norm 1.9036 (1.8836) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][240/1251] eta 0:04:08 lr 0.000986 wd 0.0500 time 0.2427 (0.2457) data time 0.0012 (0.0029) model time 0.2415 (0.2423) loss 3.6227 (3.7192) grad_norm 3.2601 (1.8910) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][250/1251] eta 0:04:05 lr 0.000986 wd 0.0500 time 0.2385 (0.2457) data time 0.0010 (0.0028) model time 0.2374 (0.2424) loss 4.0069 (3.7054) grad_norm 1.2857 (1.8829) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][260/1251] eta 0:04:03 lr 0.000986 wd 0.0500 time 0.2449 (0.2456) data time 0.0010 (0.0028) model time 0.2439 (0.2423) loss 3.3234 
(3.6949) grad_norm 1.9119 (1.8764) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][270/1251] eta 0:04:00 lr 0.000986 wd 0.0500 time 0.2405 (0.2454) data time 0.0008 (0.0027) model time 0.2397 (0.2422) loss 4.3067 (3.6808) grad_norm 1.8232 (1.8770) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][280/1251] eta 0:03:58 lr 0.000986 wd 0.0500 time 0.2418 (0.2452) data time 0.0010 (0.0026) model time 0.2408 (0.2421) loss 2.4482 (3.6699) grad_norm 1.6190 (1.8735) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][290/1251] eta 0:03:55 lr 0.000986 wd 0.0500 time 0.2344 (0.2451) data time 0.0008 (0.0026) model time 0.2335 (0.2420) loss 4.0508 (3.6796) grad_norm 2.0647 (1.8793) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][300/1251] eta 0:03:52 lr 0.000986 wd 0.0500 time 0.2360 (0.2449) data time 0.0011 (0.0025) model time 0.2349 (0.2419) loss 3.6656 (3.6836) grad_norm 1.3615 (1.8753) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][310/1251] eta 0:03:50 lr 0.000986 wd 0.0500 time 0.2365 (0.2448) data time 0.0007 (0.0025) model time 0.2357 (0.2418) loss 2.9014 (3.6863) grad_norm 2.7173 (1.8732) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][320/1251] eta 0:03:47 lr 0.000986 wd 0.0500 time 0.2417 (0.2447) data time 0.0009 (0.0024) model time 0.2407 (0.2418) loss 3.6923 (3.6830) grad_norm 2.0585 (1.8698) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][330/1251] eta 0:03:45 lr 0.000986 wd 
0.0500 time 0.2380 (0.2446) data time 0.0009 (0.0024) model time 0.2370 (0.2418) loss 2.4371 (3.6775) grad_norm 1.6880 (1.8805) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][340/1251] eta 0:03:42 lr 0.000986 wd 0.0500 time 0.2347 (0.2446) data time 0.0010 (0.0024) model time 0.2337 (0.2417) loss 3.6186 (3.6768) grad_norm 1.9423 (1.8942) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][350/1251] eta 0:03:40 lr 0.000986 wd 0.0500 time 0.2362 (0.2445) data time 0.0013 (0.0023) model time 0.2349 (0.2417) loss 3.1504 (3.6752) grad_norm 1.7749 (1.9118) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][360/1251] eta 0:03:37 lr 0.000986 wd 0.0500 time 0.2454 (0.2445) data time 0.0008 (0.0023) model time 0.2446 (0.2417) loss 4.6749 (3.6848) grad_norm 1.9405 (1.9115) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][370/1251] eta 0:03:35 lr 0.000986 wd 0.0500 time 0.2468 (0.2444) data time 0.0007 (0.0023) model time 0.2461 (0.2417) loss 2.8382 (3.6826) grad_norm 2.2796 (1.9139) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][380/1251] eta 0:03:32 lr 0.000986 wd 0.0500 time 0.2325 (0.2443) data time 0.0009 (0.0022) model time 0.2316 (0.2417) loss 4.1433 (3.6733) grad_norm 2.3583 (1.9191) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][390/1251] eta 0:03:30 lr 0.000986 wd 0.0500 time 0.2431 (0.2442) data time 0.0009 (0.0022) model time 0.2423 (0.2416) loss 2.4250 (3.6690) grad_norm 1.2569 (1.9134) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 05:44:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][400/1251] eta 0:03:27 lr 0.000986 wd 0.0500 time 0.2380 (0.2441) data time 0.0009 (0.0022) model time 0.2371 (0.2415) loss 3.4619 (3.6717) grad_norm 1.4038 (1.9064) loss_scale 16384.0000 (8273.7157) mem 7379MB [2024-08-26 05:44:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][410/1251] eta 0:03:25 lr 0.000986 wd 0.0500 time 0.2458 (0.2442) data time 0.0011 (0.0021) model time 0.2446 (0.2417) loss 3.2257 (3.6693) grad_norm 2.3150 (1.8995) loss_scale 16384.0000 (8471.0462) mem 7379MB [2024-08-26 05:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][420/1251] eta 0:03:22 lr 0.000986 wd 0.0500 time 0.2346 (0.2441) data time 0.0011 (0.0021) model time 0.2336 (0.2416) loss 4.1605 (3.6700) grad_norm 1.5965 (1.8969) loss_scale 16384.0000 (8659.0024) mem 7379MB [2024-08-26 05:44:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][430/1251] eta 0:03:20 lr 0.000986 wd 0.0500 time 0.2431 (0.2440) data time 0.0011 (0.0021) model time 0.2420 (0.2415) loss 3.4927 (3.6760) grad_norm 1.7005 (1.8949) loss_scale 16384.0000 (8838.2367) mem 7379MB [2024-08-26 05:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][440/1251] eta 0:03:17 lr 0.000986 wd 0.0500 time 0.2438 (0.2440) data time 0.0008 (0.0021) model time 0.2431 (0.2416) loss 3.9906 (3.6755) grad_norm 2.0909 (1.8913) loss_scale 16384.0000 (9009.3424) mem 7379MB [2024-08-26 05:44:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][450/1251] eta 0:03:15 lr 0.000986 wd 0.0500 time 0.2415 (0.2439) data time 0.0007 (0.0020) model time 0.2408 (0.2415) loss 3.8205 (3.6726) grad_norm 2.5182 (1.8903) loss_scale 16384.0000 (9172.8603) mem 7379MB [2024-08-26 05:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][460/1251] eta 0:03:13 lr 0.000986 wd 0.0500 time 0.2450 (0.2444) data time 0.0007 (0.0020) model time 0.2443 (0.2420) loss 4.7459 
(3.6741) grad_norm 1.9191 (1.8981) loss_scale 16384.0000 (9329.2842) mem 7379MB [2024-08-26 05:44:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][470/1251] eta 0:03:11 lr 0.000986 wd 0.0500 time 0.2410 (0.2448) data time 0.0009 (0.0020) model time 0.2401 (0.2426) loss 4.1018 (3.6716) grad_norm 2.3421 (1.9035) loss_scale 16384.0000 (9479.0658) mem 7379MB [2024-08-26 05:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][480/1251] eta 0:03:08 lr 0.000986 wd 0.0500 time 0.2533 (0.2448) data time 0.0011 (0.0020) model time 0.2522 (0.2426) loss 3.6872 (3.6688) grad_norm 1.9523 (1.9024) loss_scale 16384.0000 (9622.6195) mem 7379MB [2024-08-26 05:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][490/1251] eta 0:03:06 lr 0.000986 wd 0.0500 time 0.2401 (0.2447) data time 0.0010 (0.0020) model time 0.2391 (0.2425) loss 3.3407 (3.6670) grad_norm 1.6106 (1.9006) loss_scale 16384.0000 (9760.3259) mem 7379MB [2024-08-26 05:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][500/1251] eta 0:03:03 lr 0.000986 wd 0.0500 time 0.2329 (0.2447) data time 0.0015 (0.0019) model time 0.2313 (0.2425) loss 2.6791 (3.6608) grad_norm 1.8471 (1.8978) loss_scale 16384.0000 (9892.5349) mem 7379MB [2024-08-26 05:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][510/1251] eta 0:03:01 lr 0.000986 wd 0.0500 time 0.2447 (0.2447) data time 0.0009 (0.0019) model time 0.2438 (0.2425) loss 3.8161 (3.6621) grad_norm 1.7022 (1.8942) loss_scale 16384.0000 (10019.5695) mem 7379MB [2024-08-26 05:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][520/1251] eta 0:02:58 lr 0.000986 wd 0.0500 time 0.2395 (0.2446) data time 0.0007 (0.0019) model time 0.2388 (0.2425) loss 2.6257 (3.6629) grad_norm 1.7234 (1.8978) loss_scale 16384.0000 (10141.7274) mem 7379MB [2024-08-26 05:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][530/1251] eta 0:02:56 lr 
0.000986 wd 0.0500 time 0.2404 (0.2445) data time 0.0007 (0.0019) model time 0.2397 (0.2424) loss 4.4252 (3.6636) grad_norm 2.3081 (1.8972) loss_scale 16384.0000 (10259.2844) mem 7379MB [2024-08-26 05:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][540/1251] eta 0:02:54 lr 0.000986 wd 0.0500 time 0.2377 (0.2448) data time 0.0010 (0.0019) model time 0.2367 (0.2427) loss 3.3242 (3.6625) grad_norm 1.3093 (1.8950) loss_scale 16384.0000 (10372.4954) mem 7379MB [2024-08-26 05:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][550/1251] eta 0:02:51 lr 0.000986 wd 0.0500 time 0.2470 (0.2448) data time 0.0008 (0.0019) model time 0.2462 (0.2427) loss 2.4129 (3.6596) grad_norm 1.4571 (1.9008) loss_scale 16384.0000 (10481.5971) mem 7379MB [2024-08-26 05:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][560/1251] eta 0:02:49 lr 0.000986 wd 0.0500 time 0.2657 (0.2448) data time 0.0009 (0.0019) model time 0.2648 (0.2427) loss 3.9224 (3.6612) grad_norm 1.6954 (1.9052) loss_scale 16384.0000 (10586.8093) mem 7379MB [2024-08-26 05:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][570/1251] eta 0:02:46 lr 0.000986 wd 0.0500 time 0.2470 (0.2447) data time 0.0007 (0.0018) model time 0.2463 (0.2427) loss 3.7561 (3.6625) grad_norm 1.3733 (1.9025) loss_scale 16384.0000 (10688.3363) mem 7379MB [2024-08-26 05:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][580/1251] eta 0:02:44 lr 0.000986 wd 0.0500 time 0.4555 (0.2450) data time 0.0010 (0.0018) model time 0.4545 (0.2431) loss 2.6549 (3.6654) grad_norm 1.6979 (1.8981) loss_scale 16384.0000 (10786.3683) mem 7379MB [2024-08-26 05:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][590/1251] eta 0:02:42 lr 0.000986 wd 0.0500 time 0.2387 (0.2453) data time 0.0008 (0.0018) model time 0.2380 (0.2434) loss 3.2550 (3.6671) grad_norm 1.9106 (1.8972) loss_scale 16384.0000 (10881.0829) mem 7379MB 
[2024-08-26 05:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][600/1251] eta 0:02:39 lr 0.000986 wd 0.0500 time 0.2316 (0.2453) data time 0.0010 (0.0018) model time 0.2306 (0.2433) loss 2.8065 (3.6676) grad_norm 1.6401 (1.8948) loss_scale 16384.0000 (10972.6456) mem 7379MB [2024-08-26 05:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][610/1251] eta 0:02:37 lr 0.000986 wd 0.0500 time 0.2315 (0.2452) data time 0.0009 (0.0018) model time 0.2305 (0.2432) loss 4.1472 (3.6652) grad_norm 1.9116 (1.8984) loss_scale 16384.0000 (11061.2111) mem 7379MB [2024-08-26 05:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][620/1251] eta 0:02:34 lr 0.000986 wd 0.0500 time 0.2407 (0.2451) data time 0.0012 (0.0018) model time 0.2395 (0.2432) loss 3.0139 (3.6631) grad_norm 2.3862 (1.8985) loss_scale 16384.0000 (11146.9243) mem 7379MB [2024-08-26 05:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][630/1251] eta 0:02:32 lr 0.000986 wd 0.0500 time 0.2486 (0.2451) data time 0.0008 (0.0018) model time 0.2478 (0.2432) loss 2.5270 (3.6623) grad_norm 1.6067 (1.8969) loss_scale 16384.0000 (11229.9208) mem 7379MB [2024-08-26 05:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][640/1251] eta 0:02:29 lr 0.000986 wd 0.0500 time 0.2361 (0.2450) data time 0.0007 (0.0018) model time 0.2354 (0.2432) loss 3.2608 (3.6639) grad_norm 1.8878 (1.8956) loss_scale 16384.0000 (11310.3276) mem 7379MB [2024-08-26 05:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][650/1251] eta 0:02:27 lr 0.000986 wd 0.0500 time 0.2370 (0.2450) data time 0.0009 (0.0017) model time 0.2361 (0.2431) loss 3.3296 (3.6647) grad_norm 2.3813 (1.8956) loss_scale 16384.0000 (11388.2642) mem 7379MB [2024-08-26 05:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][660/1251] eta 0:02:24 lr 0.000986 wd 0.0500 time 0.2397 (0.2450) data time 0.0011 (0.0017) model time 
0.2387 (0.2431) loss 4.0714 (3.6635) grad_norm 1.4779 (1.8936) loss_scale 16384.0000 (11463.8427) mem 7379MB [2024-08-26 05:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][670/1251] eta 0:02:22 lr 0.000986 wd 0.0500 time 0.2420 (0.2449) data time 0.0009 (0.0017) model time 0.2411 (0.2431) loss 3.8818 (3.6639) grad_norm 3.4905 (1.9009) loss_scale 16384.0000 (11537.1684) mem 7379MB [2024-08-26 05:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][680/1251] eta 0:02:19 lr 0.000986 wd 0.0500 time 0.2420 (0.2449) data time 0.0008 (0.0017) model time 0.2412 (0.2431) loss 4.2721 (3.6629) grad_norm 1.7441 (1.8975) loss_scale 16384.0000 (11608.3407) mem 7379MB [2024-08-26 05:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][690/1251] eta 0:02:17 lr 0.000986 wd 0.0500 time 0.2385 (0.2449) data time 0.0008 (0.0017) model time 0.2377 (0.2430) loss 4.3600 (3.6633) grad_norm 2.2367 (1.8975) loss_scale 16384.0000 (11677.4530) mem 7379MB [2024-08-26 05:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][700/1251] eta 0:02:14 lr 0.000986 wd 0.0500 time 0.2395 (0.2448) data time 0.0008 (0.0017) model time 0.2387 (0.2430) loss 2.2257 (3.6650) grad_norm 1.2720 (1.8973) loss_scale 16384.0000 (11744.5934) mem 7379MB [2024-08-26 05:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][710/1251] eta 0:02:12 lr 0.000986 wd 0.0500 time 0.2496 (0.2448) data time 0.0008 (0.0017) model time 0.2488 (0.2429) loss 2.6964 (3.6629) grad_norm 3.2543 (1.8973) loss_scale 16384.0000 (11809.8453) mem 7379MB [2024-08-26 05:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][720/1251] eta 0:02:09 lr 0.000986 wd 0.0500 time 0.2442 (0.2447) data time 0.0008 (0.0017) model time 0.2434 (0.2429) loss 3.3744 (3.6598) grad_norm 2.3505 (1.9016) loss_scale 16384.0000 (11873.2871) mem 7379MB [2024-08-26 05:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[41/300][730/1251] eta 0:02:07 lr 0.000986 wd 0.0500 time 0.2425 (0.2447) data time 0.0009 (0.0017) model time 0.2416 (0.2429) loss 3.7098 (3.6632) grad_norm 2.0966 (1.9006) loss_scale 16384.0000 (11934.9932) mem 7379MB [2024-08-26 05:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][740/1251] eta 0:02:05 lr 0.000986 wd 0.0500 time 0.2546 (0.2447) data time 0.0009 (0.0017) model time 0.2537 (0.2429) loss 3.2484 (3.6645) grad_norm 2.2268 (1.9009) loss_scale 16384.0000 (11995.0337) mem 7379MB [2024-08-26 05:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][750/1251] eta 0:02:02 lr 0.000986 wd 0.0500 time 0.2397 (0.2446) data time 0.0010 (0.0016) model time 0.2387 (0.2428) loss 3.8221 (3.6622) grad_norm 1.2909 (1.9000) loss_scale 16384.0000 (12053.4754) mem 7379MB [2024-08-26 05:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][760/1251] eta 0:02:00 lr 0.000986 wd 0.0500 time 0.2398 (0.2446) data time 0.0010 (0.0016) model time 0.2388 (0.2428) loss 4.3450 (3.6635) grad_norm 1.5876 (1.8979) loss_scale 16384.0000 (12110.3811) mem 7379MB [2024-08-26 05:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][770/1251] eta 0:01:57 lr 0.000986 wd 0.0500 time 0.2396 (0.2446) data time 0.0007 (0.0016) model time 0.2389 (0.2428) loss 3.7959 (3.6639) grad_norm 1.8220 (1.8956) loss_scale 16384.0000 (12165.8106) mem 7379MB [2024-08-26 05:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][780/1251] eta 0:01:55 lr 0.000986 wd 0.0500 time 0.2474 (0.2446) data time 0.0007 (0.0016) model time 0.2467 (0.2428) loss 4.4125 (3.6621) grad_norm 1.4137 (1.8925) loss_scale 16384.0000 (12219.8207) mem 7379MB [2024-08-26 05:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][790/1251] eta 0:01:52 lr 0.000985 wd 0.0500 time 0.2449 (0.2445) data time 0.0009 (0.0016) model time 0.2440 (0.2428) loss 4.5507 (3.6620) grad_norm 2.2688 (1.8948) loss_scale 
16384.0000 (12272.4652) mem 7379MB [2024-08-26 05:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][800/1251] eta 0:01:50 lr 0.000985 wd 0.0500 time 0.2382 (0.2445) data time 0.0011 (0.0016) model time 0.2371 (0.2428) loss 3.8306 (3.6616) grad_norm 1.9493 (1.8964) loss_scale 16384.0000 (12323.7953) mem 7379MB [2024-08-26 05:45:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][810/1251] eta 0:01:47 lr 0.000985 wd 0.0500 time 0.2373 (0.2445) data time 0.0009 (0.0016) model time 0.2364 (0.2428) loss 4.1786 (3.6624) grad_norm 1.9390 (1.8987) loss_scale 16384.0000 (12373.8594) mem 7379MB [2024-08-26 05:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][820/1251] eta 0:01:45 lr 0.000985 wd 0.0500 time 0.2462 (0.2445) data time 0.0009 (0.0016) model time 0.2454 (0.2427) loss 3.3944 (3.6621) grad_norm 1.8912 (1.8976) loss_scale 16384.0000 (12422.7040) mem 7379MB [2024-08-26 05:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][830/1251] eta 0:01:42 lr 0.000985 wd 0.0500 time 0.2377 (0.2444) data time 0.0010 (0.0016) model time 0.2368 (0.2427) loss 5.0633 (3.6617) grad_norm 1.6058 (1.8943) loss_scale 16384.0000 (12470.3730) mem 7379MB [2024-08-26 05:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][840/1251] eta 0:01:40 lr 0.000985 wd 0.0500 time 0.2484 (0.2444) data time 0.0009 (0.0016) model time 0.2475 (0.2427) loss 3.6929 (3.6642) grad_norm 1.5483 (1.8920) loss_scale 16384.0000 (12516.9084) mem 7379MB [2024-08-26 05:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][850/1251] eta 0:01:37 lr 0.000985 wd 0.0500 time 0.2397 (0.2444) data time 0.0012 (0.0016) model time 0.2385 (0.2426) loss 3.6811 (3.6648) grad_norm 2.5701 (1.8948) loss_scale 16384.0000 (12562.3502) mem 7379MB [2024-08-26 05:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][860/1251] eta 0:01:35 lr 0.000985 wd 0.0500 time 0.2509 (0.2443) 
data time 0.0009 (0.0016) model time 0.2500 (0.2426) loss 3.5835 (3.6637) grad_norm 2.3667 (1.8975) loss_scale 16384.0000 (12606.7364) mem 7379MB [2024-08-26 05:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][870/1251] eta 0:01:33 lr 0.000985 wd 0.0500 time 0.2449 (0.2443) data time 0.0009 (0.0016) model time 0.2440 (0.2426) loss 3.4580 (3.6635) grad_norm 2.2463 (1.8967) loss_scale 16384.0000 (12650.1033) mem 7379MB [2024-08-26 05:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][880/1251] eta 0:01:30 lr 0.000985 wd 0.0500 time 0.2427 (0.2442) data time 0.0011 (0.0016) model time 0.2417 (0.2425) loss 4.5609 (3.6644) grad_norm 3.6516 (1.8992) loss_scale 16384.0000 (12692.4858) mem 7379MB [2024-08-26 05:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][890/1251] eta 0:01:28 lr 0.000985 wd 0.0500 time 0.2367 (0.2442) data time 0.0012 (0.0016) model time 0.2355 (0.2425) loss 2.7806 (3.6665) grad_norm 1.7652 (1.9025) loss_scale 16384.0000 (12733.9169) mem 7379MB [2024-08-26 05:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][900/1251] eta 0:01:25 lr 0.000985 wd 0.0500 time 0.2411 (0.2441) data time 0.0007 (0.0015) model time 0.2404 (0.2425) loss 4.5976 (3.6685) grad_norm 1.6647 (1.9007) loss_scale 16384.0000 (12774.4284) mem 7379MB [2024-08-26 05:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][910/1251] eta 0:01:23 lr 0.000985 wd 0.0500 time 0.2456 (0.2441) data time 0.0007 (0.0015) model time 0.2448 (0.2425) loss 3.7781 (3.6708) grad_norm 1.7543 (1.9010) loss_scale 16384.0000 (12814.0505) mem 7379MB [2024-08-26 05:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][920/1251] eta 0:01:20 lr 0.000985 wd 0.0500 time 0.2395 (0.2441) data time 0.0011 (0.0015) model time 0.2385 (0.2425) loss 2.6399 (3.6708) grad_norm 2.6033 (1.9028) loss_scale 16384.0000 (12852.8122) mem 7379MB [2024-08-26 05:46:26 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [41/300][930/1251] eta 0:01:18 lr 0.000985 wd 0.0500 time 0.2403 (0.2441) data time 0.0010 (0.0015) model time 0.2393 (0.2424) loss 4.0583 (3.6699) grad_norm 1.5263 (1.9020) loss_scale 16384.0000 (12890.7411) mem 7379MB [2024-08-26 05:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][940/1251] eta 0:01:15 lr 0.000985 wd 0.0500 time 0.2436 (0.2441) data time 0.0009 (0.0015) model time 0.2427 (0.2424) loss 3.5882 (3.6685) grad_norm 2.2766 (1.9007) loss_scale 16384.0000 (12927.8640) mem 7379MB [2024-08-26 05:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][950/1251] eta 0:01:13 lr 0.000985 wd 0.0500 time 0.2404 (0.2440) data time 0.0011 (0.0015) model time 0.2393 (0.2424) loss 3.8550 (3.6672) grad_norm 1.7601 (1.9002) loss_scale 16384.0000 (12964.2061) mem 7379MB [2024-08-26 05:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][960/1251] eta 0:01:11 lr 0.000985 wd 0.0500 time 0.2389 (0.2440) data time 0.0007 (0.0015) model time 0.2382 (0.2424) loss 3.8173 (3.6685) grad_norm 1.9609 (1.8983) loss_scale 16384.0000 (12999.7919) mem 7379MB [2024-08-26 05:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][970/1251] eta 0:01:08 lr 0.000985 wd 0.0500 time 0.4648 (0.2442) data time 0.0010 (0.0015) model time 0.4638 (0.2426) loss 2.9968 (3.6678) grad_norm 1.8536 (1.8974) loss_scale 16384.0000 (13034.6447) mem 7379MB [2024-08-26 05:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][980/1251] eta 0:01:06 lr 0.000985 wd 0.0500 time 0.4283 (0.2445) data time 0.0008 (0.0015) model time 0.4275 (0.2429) loss 4.0121 (3.6660) grad_norm 2.6831 (1.8980) loss_scale 16384.0000 (13068.7870) mem 7379MB [2024-08-26 05:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][990/1251] eta 0:01:03 lr 0.000985 wd 0.0500 time 0.2489 (0.2447) data time 0.0010 (0.0015) model time 0.2479 (0.2431) loss 3.7034 (3.6649) 
grad_norm 1.7711 (1.8965) loss_scale 16384.0000 (13102.2402) mem 7379MB [2024-08-26 05:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1000/1251] eta 0:01:01 lr 0.000985 wd 0.0500 time 0.2382 (0.2446) data time 0.0007 (0.0015) model time 0.2375 (0.2430) loss 4.4108 (3.6665) grad_norm 1.4537 (1.8960) loss_scale 16384.0000 (13135.0250) mem 7379MB [2024-08-26 05:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1010/1251] eta 0:00:58 lr 0.000985 wd 0.0500 time 0.2343 (0.2446) data time 0.0010 (0.0015) model time 0.2333 (0.2430) loss 3.1317 (3.6668) grad_norm 1.3638 (1.8933) loss_scale 16384.0000 (13167.1612) mem 7379MB [2024-08-26 05:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1020/1251] eta 0:00:56 lr 0.000985 wd 0.0500 time 0.2455 (0.2446) data time 0.0010 (0.0015) model time 0.2445 (0.2430) loss 3.7293 (3.6648) grad_norm 1.6740 (1.8946) loss_scale 16384.0000 (13198.6680) mem 7379MB [2024-08-26 05:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1030/1251] eta 0:00:54 lr 0.000985 wd 0.0500 time 0.2449 (0.2445) data time 0.0009 (0.0015) model time 0.2440 (0.2430) loss 3.9885 (3.6657) grad_norm 1.5702 (1.8958) loss_scale 16384.0000 (13229.5635) mem 7379MB [2024-08-26 05:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1040/1251] eta 0:00:51 lr 0.000985 wd 0.0500 time 0.2466 (0.2445) data time 0.0011 (0.0015) model time 0.2455 (0.2429) loss 3.6686 (3.6662) grad_norm 1.8933 (1.8943) loss_scale 16384.0000 (13259.8655) mem 7379MB [2024-08-26 05:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1050/1251] eta 0:00:49 lr 0.000985 wd 0.0500 time 0.2443 (0.2445) data time 0.0009 (0.0015) model time 0.2433 (0.2429) loss 3.8966 (3.6644) grad_norm 2.1079 (1.8940) loss_scale 16384.0000 (13289.5909) mem 7379MB [2024-08-26 05:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1060/1251] eta 0:00:46 lr 
0.000985 wd 0.0500 time 0.2486 (0.2445) data time 0.0010 (0.0015) model time 0.2476 (0.2429) loss 3.9176 (3.6646) grad_norm 2.2438 (1.8954) loss_scale 16384.0000 (13318.7559) mem 7379MB [2024-08-26 05:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1070/1251] eta 0:00:44 lr 0.000985 wd 0.0500 time 0.2434 (0.2445) data time 0.0008 (0.0015) model time 0.2426 (0.2429) loss 3.8508 (3.6642) grad_norm 1.7038 (1.8947) loss_scale 16384.0000 (13347.3763) mem 7379MB [2024-08-26 05:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1080/1251] eta 0:00:41 lr 0.000985 wd 0.0500 time 0.2474 (0.2446) data time 0.0007 (0.0015) model time 0.2468 (0.2431) loss 2.8921 (3.6646) grad_norm 1.4782 (1.8931) loss_scale 16384.0000 (13375.4672) mem 7379MB [2024-08-26 05:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1090/1251] eta 0:00:39 lr 0.000985 wd 0.0500 time 0.2383 (0.2446) data time 0.0009 (0.0015) model time 0.2374 (0.2430) loss 3.1255 (3.6634) grad_norm 2.2012 (1.8921) loss_scale 16384.0000 (13403.0431) mem 7379MB [2024-08-26 05:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1100/1251] eta 0:00:36 lr 0.000985 wd 0.0500 time 0.2427 (0.2446) data time 0.0010 (0.0015) model time 0.2417 (0.2430) loss 3.5574 (3.6632) grad_norm 1.6580 (1.8939) loss_scale 16384.0000 (13430.1181) mem 7379MB [2024-08-26 05:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1110/1251] eta 0:00:34 lr 0.000985 wd 0.0500 time 0.2487 (0.2446) data time 0.0010 (0.0015) model time 0.2477 (0.2430) loss 3.5577 (3.6625) grad_norm 1.9715 (1.8941) loss_scale 16384.0000 (13456.7057) mem 7379MB [2024-08-26 05:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1120/1251] eta 0:00:32 lr 0.000985 wd 0.0500 time 0.2384 (0.2445) data time 0.0010 (0.0014) model time 0.2374 (0.2430) loss 3.9079 (3.6656) grad_norm 1.5351 (1.8981) loss_scale 16384.0000 (13482.8189) mem 7379MB 
[2024-08-26 05:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1130/1251] eta 0:00:29 lr 0.000985 wd 0.0500 time 0.2372 (0.2445) data time 0.0009 (0.0014) model time 0.2363 (0.2430) loss 2.7673 (3.6664) grad_norm 2.1860 (1.8998) loss_scale 16384.0000 (13508.4704) mem 7379MB [2024-08-26 05:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1140/1251] eta 0:00:27 lr 0.000985 wd 0.0500 time 0.2409 (0.2445) data time 0.0007 (0.0014) model time 0.2402 (0.2430) loss 4.1252 (3.6638) grad_norm 1.5487 (1.8989) loss_scale 16384.0000 (13533.6722) mem 7379MB [2024-08-26 05:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1150/1251] eta 0:00:24 lr 0.000985 wd 0.0500 time 0.2399 (0.2445) data time 0.0009 (0.0014) model time 0.2390 (0.2429) loss 2.3392 (3.6640) grad_norm 2.0946 (1.9003) loss_scale 16384.0000 (13558.4361) mem 7379MB [2024-08-26 05:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1160/1251] eta 0:00:22 lr 0.000985 wd 0.0500 time 0.2483 (0.2444) data time 0.0008 (0.0014) model time 0.2476 (0.2429) loss 4.0492 (3.6655) grad_norm 2.6985 (1.9046) loss_scale 16384.0000 (13582.7735) mem 7379MB [2024-08-26 05:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1170/1251] eta 0:00:19 lr 0.000985 wd 0.0500 time 0.2457 (0.2444) data time 0.0008 (0.0014) model time 0.2450 (0.2429) loss 3.6894 (3.6610) grad_norm 1.5642 (1.9033) loss_scale 16384.0000 (13606.6951) mem 7379MB [2024-08-26 05:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1180/1251] eta 0:00:17 lr 0.000985 wd 0.0500 time 0.2452 (0.2444) data time 0.0010 (0.0014) model time 0.2443 (0.2429) loss 3.8689 (3.6618) grad_norm 2.1489 (1.9030) loss_scale 16384.0000 (13630.2117) mem 7379MB [2024-08-26 05:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1190/1251] eta 0:00:14 lr 0.000985 wd 0.0500 time 0.2431 (0.2444) data time 0.0008 (0.0014) model 
time 0.2423 (0.2429) loss 3.4043 (3.6644) grad_norm 1.6724 (1.9052) loss_scale 16384.0000 (13653.3333) mem 7379MB [2024-08-26 05:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1200/1251] eta 0:00:12 lr 0.000985 wd 0.0500 time 0.2415 (0.2444) data time 0.0009 (0.0014) model time 0.2406 (0.2429) loss 3.7084 (3.6639) grad_norm 2.4427 (1.9075) loss_scale 16384.0000 (13676.0699) mem 7379MB [2024-08-26 05:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1210/1251] eta 0:00:10 lr 0.000985 wd 0.0500 time 0.2435 (0.2444) data time 0.0010 (0.0014) model time 0.2426 (0.2429) loss 3.8047 (3.6649) grad_norm 1.8735 (1.9065) loss_scale 16384.0000 (13698.4310) mem 7379MB [2024-08-26 05:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1220/1251] eta 0:00:07 lr 0.000985 wd 0.0500 time 0.2390 (0.2444) data time 0.0008 (0.0014) model time 0.2382 (0.2429) loss 2.7555 (3.6654) grad_norm 1.3466 (1.9045) loss_scale 16384.0000 (13720.4259) mem 7379MB [2024-08-26 05:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1230/1251] eta 0:00:05 lr 0.000985 wd 0.0500 time 0.2428 (0.2444) data time 0.0010 (0.0014) model time 0.2418 (0.2429) loss 3.9474 (3.6629) grad_norm 1.9727 (1.9069) loss_scale 16384.0000 (13742.0634) mem 7379MB [2024-08-26 05:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1240/1251] eta 0:00:02 lr 0.000985 wd 0.0500 time 0.2324 (0.2443) data time 0.0007 (0.0014) model time 0.2317 (0.2428) loss 4.2779 (3.6643) grad_norm 2.0667 (1.9095) loss_scale 16384.0000 (13763.3521) mem 7379MB [2024-08-26 05:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [41/300][1250/1251] eta 0:00:00 lr 0.000985 wd 0.0500 time 0.2251 (0.2441) data time 0.0007 (0.0014) model time 0.2244 (0.2426) loss 3.9217 (3.6615) grad_norm 2.7754 (1.9130) loss_scale 16384.0000 (13784.3006) mem 7379MB [2024-08-26 05:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO 
EPOCH 41 training takes 0:05:05
[2024-08-26 05:47:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 05:47:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 05:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.484 (0.484) Loss 0.5923 (0.5923) Acc@1 88.770 (88.770) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 05:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.113) Loss 0.9795 (0.9159) Acc@1 77.148 (78.960) Acc@5 94.629 (95.144) Mem 7379MB
[2024-08-26 05:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.096) Loss 1.3037 (0.9235) Acc@1 68.848 (78.595) Acc@5 90.625 (94.992) Mem 7379MB
[2024-08-26 05:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.090) Loss 1.5439 (1.0581) Acc@1 64.062 (75.643) Acc@5 86.328 (93.233) Mem 7379MB
[2024-08-26 05:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.4629 (1.1378) Acc@1 65.234 (73.778) Acc@5 88.574 (92.166) Mem 7379MB
[2024-08-26 05:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.472 Acc@5 92.130
[2024-08-26 05:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.5%
[2024-08-26 05:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 73.47%
[2024-08-26 05:47:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:47:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:47:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.489 (0.489) Loss 0.4712 (0.4712) Acc@1 89.551 (89.551) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 05:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.114) Loss 0.8115 (0.7872) Acc@1 81.641 (81.330) Acc@5 95.410 (95.694) Mem 7379MB
[2024-08-26 05:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.097) Loss 1.1572 (0.8017) Acc@1 72.656 (80.571) Acc@5 91.016 (95.722) Mem 7379MB
[2024-08-26 05:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.091) Loss 1.4453 (0.9289) Acc@1 64.258 (77.627) Acc@5 87.109 (94.046) Mem 7379MB
[2024-08-26 05:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3740 (1.0027) Acc@1 66.113 (75.931) Acc@5 88.379 (93.200) Mem 7379MB
[2024-08-26 05:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.596 Acc@5 93.116
[2024-08-26 05:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 75.6%
[2024-08-26 05:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 75.60%
[2024-08-26 05:47:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:47:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][0/1251] eta 0:14:42 lr 0.000985 wd 0.0500 time 0.7054 (0.7054) data time 0.4861 (0.4861) model time 0.0000 (0.0000) loss 3.8324 (3.8324) grad_norm 2.9870 (2.9870) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][10/1251] eta 0:05:51 lr 0.000985 wd 0.0500 time 0.2425 (0.2832) data time 0.0010 (0.0452) model time 0.0000 (0.0000) loss 4.0012 (3.7656) grad_norm 1.2672 (1.9384) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][20/1251] eta 0:05:25 lr 0.000985 wd 0.0500 time 0.2511 (0.2646) data time 0.0010 (0.0242) model time 0.0000 (0.0000) loss 3.8875 (3.7284) grad_norm 2.1876 (1.9297) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][30/1251] eta 0:05:13 lr 0.000985 wd 0.0500 time 0.2382 (0.2566) data time 0.0011 (0.0167) model time 0.0000 (0.0000) loss 3.9356 (3.6423) grad_norm 1.6435 (1.9118) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][40/1251] eta 0:05:06 lr 0.000985 wd 0.0500 time 0.2431 (0.2530) data time 0.0012 (0.0129) model time 0.0000 (0.0000) loss 4.3938 (3.6979) grad_norm 1.8870 (1.8999) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][50/1251] eta 0:05:01 lr 0.000985 wd 0.0500 time 0.2494 (0.2511) data time 0.0008 (0.0106) model time 0.0000 (0.0000) loss 4.4159 (3.7375) grad_norm 1.4109 (1.8980) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][60/1251] eta 0:04:57 lr 0.000985 wd 0.0500 time 0.2437 (0.2495) data time 0.0010 (0.0090) model time 0.2427 
(0.2406) loss 4.0001 (3.7417) grad_norm 2.2776 (1.9080) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][70/1251] eta 0:04:53 lr 0.000985 wd 0.0500 time 0.2433 (0.2486) data time 0.0009 (0.0079) model time 0.2424 (0.2412) loss 3.4272 (3.7197) grad_norm 1.8472 (1.9163) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][80/1251] eta 0:04:50 lr 0.000985 wd 0.0500 time 0.2406 (0.2479) data time 0.0008 (0.0070) model time 0.2398 (0.2416) loss 4.1114 (3.6862) grad_norm 2.2151 (1.9175) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][90/1251] eta 0:04:46 lr 0.000985 wd 0.0500 time 0.2400 (0.2470) data time 0.0011 (0.0064) model time 0.2390 (0.2408) loss 3.0626 (3.6913) grad_norm 2.0651 (1.9170) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][100/1251] eta 0:04:43 lr 0.000985 wd 0.0500 time 0.2458 (0.2467) data time 0.0010 (0.0059) model time 0.2447 (0.2412) loss 3.4281 (3.6830) grad_norm 1.9249 (1.8938) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][110/1251] eta 0:04:41 lr 0.000985 wd 0.0500 time 0.2452 (0.2465) data time 0.0009 (0.0054) model time 0.2443 (0.2416) loss 3.8877 (3.6712) grad_norm 1.6547 (1.8719) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][120/1251] eta 0:04:38 lr 0.000985 wd 0.0500 time 0.2440 (0.2462) data time 0.0010 (0.0051) model time 0.2430 (0.2415) loss 2.7085 (3.6587) grad_norm 1.4800 (1.8657) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[42/300][130/1251] eta 0:04:35 lr 0.000985 wd 0.0500 time 0.2323 (0.2457) data time 0.0010 (0.0048) model time 0.2313 (0.2412) loss 3.0662 (3.6491) grad_norm 1.7837 (1.8807) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][140/1251] eta 0:04:32 lr 0.000985 wd 0.0500 time 0.2343 (0.2452) data time 0.0010 (0.0045) model time 0.2333 (0.2408) loss 3.9051 (3.6504) grad_norm 1.9742 (1.8878) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][150/1251] eta 0:04:32 lr 0.000985 wd 0.0500 time 0.2463 (0.2477) data time 0.0009 (0.0043) model time 0.2453 (0.2449) loss 3.0257 (3.6628) grad_norm 1.3637 (1.8709) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][160/1251] eta 0:04:29 lr 0.000985 wd 0.0500 time 0.2374 (0.2474) data time 0.0009 (0.0041) model time 0.2366 (0.2446) loss 3.7400 (3.6640) grad_norm 1.8537 (1.8680) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][170/1251] eta 0:04:27 lr 0.000985 wd 0.0500 time 0.2418 (0.2471) data time 0.0007 (0.0039) model time 0.2411 (0.2444) loss 3.0492 (3.6544) grad_norm 2.2791 (1.8637) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][180/1251] eta 0:04:24 lr 0.000985 wd 0.0500 time 0.2363 (0.2468) data time 0.0009 (0.0037) model time 0.2354 (0.2441) loss 3.9539 (3.6474) grad_norm 2.0562 (1.8815) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][190/1251] eta 0:04:21 lr 0.000985 wd 0.0500 time 0.2370 (0.2465) data time 0.0009 (0.0036) model time 0.2360 (0.2438) loss 3.3794 (3.6564) grad_norm 2.1195 (1.8740) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][200/1251] eta 0:04:18 lr 0.000985 wd 0.0500 time 0.2429 (0.2462) data time 0.0011 (0.0034) model time 0.2418 (0.2435) loss 4.2952 (3.6532) grad_norm 2.0709 (1.8700) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][210/1251] eta 0:04:16 lr 0.000985 wd 0.0500 time 0.2444 (0.2459) data time 0.0009 (0.0033) model time 0.2435 (0.2433) loss 3.6618 (3.6509) grad_norm 1.5889 (1.8697) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][220/1251] eta 0:04:13 lr 0.000985 wd 0.0500 time 0.2441 (0.2458) data time 0.0008 (0.0032) model time 0.2433 (0.2432) loss 3.1315 (3.6529) grad_norm 1.9575 (1.8663) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][230/1251] eta 0:04:10 lr 0.000985 wd 0.0500 time 0.2500 (0.2456) data time 0.0008 (0.0031) model time 0.2493 (0.2430) loss 3.5273 (3.6509) grad_norm 1.3469 (1.8600) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][240/1251] eta 0:04:08 lr 0.000985 wd 0.0500 time 0.2373 (0.2462) data time 0.0011 (0.0030) model time 0.2362 (0.2439) loss 2.9155 (3.6468) grad_norm 2.7951 (1.8627) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][250/1251] eta 0:04:06 lr 0.000985 wd 0.0500 time 0.2404 (0.2467) data time 0.0009 (0.0030) model time 0.2395 (0.2446) loss 3.6103 (3.6447) grad_norm 1.8842 (1.8664) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][260/1251] eta 0:04:05 lr 0.000985 wd 0.0500 time 0.2437 (0.2478) 
data time 0.0010 (0.0029) model time 0.2427 (0.2459) loss 3.7387 (3.6476) grad_norm 1.5792 (1.8657) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][270/1251] eta 0:04:02 lr 0.000985 wd 0.0500 time 0.2395 (0.2475) data time 0.0008 (0.0028) model time 0.2386 (0.2457) loss 4.4055 (3.6425) grad_norm 2.4008 (1.8641) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][280/1251] eta 0:04:00 lr 0.000985 wd 0.0500 time 0.2425 (0.2473) data time 0.0010 (0.0028) model time 0.2414 (0.2454) loss 4.1759 (3.6462) grad_norm 1.5800 (1.8627) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][290/1251] eta 0:03:57 lr 0.000985 wd 0.0500 time 0.2479 (0.2471) data time 0.0010 (0.0027) model time 0.2469 (0.2453) loss 3.8175 (3.6548) grad_norm 1.8607 (1.8546) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][300/1251] eta 0:03:54 lr 0.000985 wd 0.0500 time 0.2374 (0.2469) data time 0.0009 (0.0026) model time 0.2365 (0.2450) loss 3.2900 (3.6560) grad_norm 2.4024 (1.8578) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][310/1251] eta 0:03:52 lr 0.000985 wd 0.0500 time 0.2353 (0.2467) data time 0.0008 (0.0026) model time 0.2345 (0.2448) loss 2.6271 (3.6507) grad_norm 1.7941 (1.8542) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][320/1251] eta 0:03:49 lr 0.000985 wd 0.0500 time 0.2518 (0.2466) data time 0.0010 (0.0025) model time 0.2508 (0.2447) loss 3.5537 (3.6551) grad_norm 3.3734 (1.8687) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:16 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [42/300][330/1251] eta 0:03:46 lr 0.000985 wd 0.0500 time 0.2421 (0.2465) data time 0.0011 (0.0025) model time 0.2410 (0.2446) loss 3.1187 (3.6628) grad_norm 2.0682 (1.8677) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][340/1251] eta 0:03:44 lr 0.000985 wd 0.0500 time 0.2326 (0.2463) data time 0.0009 (0.0024) model time 0.2318 (0.2444) loss 3.7872 (3.6611) grad_norm 1.6365 (1.8613) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][350/1251] eta 0:03:41 lr 0.000985 wd 0.0500 time 0.2394 (0.2461) data time 0.0007 (0.0024) model time 0.2387 (0.2442) loss 2.3906 (3.6642) grad_norm 1.5923 (1.8688) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][360/1251] eta 0:03:39 lr 0.000985 wd 0.0500 time 0.2431 (0.2459) data time 0.0010 (0.0024) model time 0.2421 (0.2441) loss 3.7287 (3.6623) grad_norm 1.8108 (1.8710) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][370/1251] eta 0:03:36 lr 0.000985 wd 0.0500 time 0.2415 (0.2458) data time 0.0010 (0.0023) model time 0.2405 (0.2440) loss 4.2631 (3.6665) grad_norm 2.4189 (1.8691) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][380/1251] eta 0:03:34 lr 0.000985 wd 0.0500 time 0.2415 (0.2458) data time 0.0007 (0.0023) model time 0.2408 (0.2439) loss 2.7921 (3.6684) grad_norm 1.4622 (1.8718) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][390/1251] eta 0:03:31 lr 0.000985 wd 0.0500 time 0.2452 (0.2457) data time 0.0007 (0.0023) model time 0.2445 (0.2438) loss 3.2398 (3.6701) 
grad_norm 1.6328 (1.8808) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][400/1251] eta 0:03:28 lr 0.000985 wd 0.0500 time 0.2422 (0.2456) data time 0.0011 (0.0022) model time 0.2411 (0.2438) loss 4.4450 (3.6790) grad_norm 1.8688 (1.8968) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][410/1251] eta 0:03:26 lr 0.000985 wd 0.0500 time 0.2439 (0.2454) data time 0.0012 (0.0022) model time 0.2427 (0.2436) loss 2.7976 (3.6795) grad_norm 3.0109 (1.9020) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][420/1251] eta 0:03:23 lr 0.000985 wd 0.0500 time 0.2426 (0.2453) data time 0.0007 (0.0022) model time 0.2419 (0.2435) loss 4.1188 (3.6824) grad_norm 1.6063 (1.8986) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][430/1251] eta 0:03:21 lr 0.000985 wd 0.0500 time 0.2516 (0.2453) data time 0.0008 (0.0021) model time 0.2508 (0.2435) loss 4.2691 (3.6824) grad_norm 1.6256 (1.8998) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][440/1251] eta 0:03:18 lr 0.000985 wd 0.0500 time 0.2456 (0.2452) data time 0.0008 (0.0021) model time 0.2448 (0.2434) loss 3.4913 (3.6856) grad_norm 1.8491 (1.8998) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][450/1251] eta 0:03:16 lr 0.000985 wd 0.0500 time 0.2488 (0.2455) data time 0.0007 (0.0021) model time 0.2481 (0.2438) loss 3.6867 (3.6859) grad_norm 1.8176 (1.8994) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][460/1251] eta 0:03:14 lr 
0.000984 wd 0.0500 time 0.2348 (0.2454) data time 0.0008 (0.0021) model time 0.2340 (0.2437) loss 4.6607 (3.6858) grad_norm 2.0945 (1.9015) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][470/1251] eta 0:03:11 lr 0.000984 wd 0.0500 time 0.2396 (0.2453) data time 0.0011 (0.0020) model time 0.2386 (0.2436) loss 3.1720 (3.6846) grad_norm 1.8570 (1.9006) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][480/1251] eta 0:03:09 lr 0.000984 wd 0.0500 time 0.2400 (0.2452) data time 0.0008 (0.0020) model time 0.2392 (0.2435) loss 4.1517 (3.6859) grad_norm 1.9850 (1.8987) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][490/1251] eta 0:03:06 lr 0.000984 wd 0.0500 time 0.2364 (0.2452) data time 0.0010 (0.0020) model time 0.2354 (0.2435) loss 3.6199 (3.6891) grad_norm 1.4689 (1.9001) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][500/1251] eta 0:03:04 lr 0.000984 wd 0.0500 time 0.2437 (0.2451) data time 0.0010 (0.0020) model time 0.2426 (0.2434) loss 3.3808 (3.6840) grad_norm 2.1147 (1.9032) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][510/1251] eta 0:03:01 lr 0.000984 wd 0.0500 time 0.2464 (0.2450) data time 0.0008 (0.0020) model time 0.2456 (0.2433) loss 4.9028 (3.6884) grad_norm 2.1631 (1.8994) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][520/1251] eta 0:02:59 lr 0.000984 wd 0.0500 time 0.2405 (0.2449) data time 0.0007 (0.0019) model time 0.2397 (0.2432) loss 2.4155 (3.6919) grad_norm 1.8603 (1.8987) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 05:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][530/1251] eta 0:02:56 lr 0.000984 wd 0.0500 time 0.2429 (0.2448) data time 0.0007 (0.0019) model time 0.2422 (0.2431) loss 4.3194 (3.6919) grad_norm 1.8290 (1.9004) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][540/1251] eta 0:02:54 lr 0.000984 wd 0.0500 time 0.2455 (0.2447) data time 0.0011 (0.0019) model time 0.2444 (0.2431) loss 3.4783 (3.6915) grad_norm 1.5848 (1.8948) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][550/1251] eta 0:02:51 lr 0.000984 wd 0.0500 time 0.2359 (0.2447) data time 0.0009 (0.0019) model time 0.2350 (0.2430) loss 2.6585 (3.6843) grad_norm 1.5382 (1.8923) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][560/1251] eta 0:02:49 lr 0.000984 wd 0.0500 time 0.2456 (0.2446) data time 0.0009 (0.0019) model time 0.2446 (0.2429) loss 3.5550 (3.6798) grad_norm 1.4205 (1.8856) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][570/1251] eta 0:02:46 lr 0.000984 wd 0.0500 time 0.2418 (0.2445) data time 0.0012 (0.0019) model time 0.2407 (0.2429) loss 3.9781 (3.6782) grad_norm 1.6265 (1.8836) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][580/1251] eta 0:02:44 lr 0.000984 wd 0.0500 time 0.2439 (0.2444) data time 0.0008 (0.0018) model time 0.2431 (0.2428) loss 3.4982 (3.6785) grad_norm 2.8705 (1.8852) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][590/1251] eta 0:02:41 lr 0.000984 wd 0.0500 time 0.2400 (0.2444) data time 0.0008 (0.0018) model time 
0.2392 (0.2428) loss 2.8535 (3.6785) grad_norm 1.5539 (1.8852) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][600/1251] eta 0:02:39 lr 0.000984 wd 0.0500 time 0.2550 (0.2444) data time 0.0007 (0.0018) model time 0.2542 (0.2428) loss 4.1588 (3.6793) grad_norm 1.7570 (1.8840) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][610/1251] eta 0:02:36 lr 0.000984 wd 0.0500 time 0.2412 (0.2443) data time 0.0008 (0.0018) model time 0.2404 (0.2427) loss 4.0200 (3.6785) grad_norm 2.3218 (1.8853) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][620/1251] eta 0:02:34 lr 0.000984 wd 0.0500 time 0.2395 (0.2443) data time 0.0010 (0.0018) model time 0.2385 (0.2427) loss 3.7206 (3.6778) grad_norm 1.7298 (1.8892) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][630/1251] eta 0:02:31 lr 0.000984 wd 0.0500 time 0.2629 (0.2442) data time 0.0009 (0.0018) model time 0.2619 (0.2426) loss 4.3750 (3.6748) grad_norm 1.8722 (1.8936) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][640/1251] eta 0:02:29 lr 0.000984 wd 0.0500 time 0.2383 (0.2442) data time 0.0010 (0.0018) model time 0.2373 (0.2426) loss 2.5988 (3.6732) grad_norm 1.6144 (1.8977) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][650/1251] eta 0:02:26 lr 0.000984 wd 0.0500 time 0.2379 (0.2442) data time 0.0010 (0.0018) model time 0.2369 (0.2426) loss 4.4178 (3.6794) grad_norm 1.5090 (1.8977) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[42/300][660/1251] eta 0:02:24 lr 0.000984 wd 0.0500 time 0.2337 (0.2441) data time 0.0011 (0.0018) model time 0.2327 (0.2425) loss 3.3605 (3.6811) grad_norm 2.5288 (1.8996) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][670/1251] eta 0:02:21 lr 0.000984 wd 0.0500 time 0.2398 (0.2441) data time 0.0010 (0.0017) model time 0.2388 (0.2425) loss 4.3998 (3.6817) grad_norm 1.3813 (1.9023) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][680/1251] eta 0:02:19 lr 0.000984 wd 0.0500 time 0.2397 (0.2440) data time 0.0010 (0.0017) model time 0.2387 (0.2425) loss 3.6610 (3.6756) grad_norm 1.4347 (1.9060) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][690/1251] eta 0:02:16 lr 0.000984 wd 0.0500 time 0.2431 (0.2440) data time 0.0010 (0.0017) model time 0.2421 (0.2424) loss 2.7220 (3.6727) grad_norm 2.9790 (1.9126) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][700/1251] eta 0:02:14 lr 0.000984 wd 0.0500 time 0.2354 (0.2440) data time 0.0013 (0.0017) model time 0.2341 (0.2424) loss 3.2775 (3.6711) grad_norm 1.4657 (1.9114) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][710/1251] eta 0:02:11 lr 0.000984 wd 0.0500 time 0.2405 (0.2439) data time 0.0009 (0.0017) model time 0.2396 (0.2424) loss 2.6973 (3.6668) grad_norm 2.9173 (1.9223) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][720/1251] eta 0:02:09 lr 0.000984 wd 0.0500 time 0.2412 (0.2439) data time 0.0012 (0.0017) model time 0.2400 (0.2424) loss 2.6364 (3.6669) grad_norm 1.7814 (1.9185) loss_scale 
16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][730/1251] eta 0:02:07 lr 0.000984 wd 0.0500 time 0.2429 (0.2439) data time 0.0010 (0.0017) model time 0.2419 (0.2423) loss 3.7730 (3.6662) grad_norm 1.2269 (1.9176) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][740/1251] eta 0:02:04 lr 0.000984 wd 0.0500 time 0.2427 (0.2439) data time 0.0011 (0.0017) model time 0.2416 (0.2423) loss 2.2377 (3.6624) grad_norm 3.7371 (1.9223) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][750/1251] eta 0:02:02 lr 0.000984 wd 0.0500 time 0.2430 (0.2439) data time 0.0011 (0.0017) model time 0.2419 (0.2423) loss 3.5807 (3.6682) grad_norm 1.4655 (1.9267) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][760/1251] eta 0:01:59 lr 0.000984 wd 0.0500 time 0.2372 (0.2440) data time 0.0008 (0.0017) model time 0.2364 (0.2425) loss 3.1484 (3.6670) grad_norm 1.5613 (1.9246) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][770/1251] eta 0:01:57 lr 0.000984 wd 0.0500 time 0.2329 (0.2440) data time 0.0009 (0.0017) model time 0.2319 (0.2425) loss 3.0963 (3.6698) grad_norm 2.0817 (1.9231) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][780/1251] eta 0:01:55 lr 0.000984 wd 0.0500 time 0.2362 (0.2448) data time 0.0010 (0.0017) model time 0.2351 (0.2433) loss 3.6201 (3.6705) grad_norm 1.8143 (1.9240) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][790/1251] eta 0:01:52 lr 0.000984 wd 0.0500 time 0.2380 (0.2447) 
data time 0.0009 (0.0017) model time 0.2371 (0.2433) loss 2.7328 (3.6693) grad_norm 2.1215 (1.9220) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][800/1251] eta 0:01:50 lr 0.000984 wd 0.0500 time 0.2402 (0.2447) data time 0.0008 (0.0016) model time 0.2395 (0.2432) loss 2.6650 (3.6670) grad_norm 3.2318 (1.9297) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][810/1251] eta 0:01:47 lr 0.000984 wd 0.0500 time 0.2377 (0.2447) data time 0.0007 (0.0017) model time 0.2370 (0.2432) loss 4.1512 (3.6677) grad_norm 1.7166 (1.9314) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][820/1251] eta 0:01:45 lr 0.000984 wd 0.0500 time 0.2352 (0.2446) data time 0.0009 (0.0016) model time 0.2343 (0.2431) loss 3.3706 (3.6678) grad_norm 2.6148 (1.9321) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][830/1251] eta 0:01:42 lr 0.000984 wd 0.0500 time 0.2414 (0.2446) data time 0.0009 (0.0016) model time 0.2406 (0.2431) loss 3.3455 (3.6688) grad_norm 1.9993 (1.9318) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][840/1251] eta 0:01:40 lr 0.000984 wd 0.0500 time 0.2382 (0.2445) data time 0.0008 (0.0016) model time 0.2374 (0.2430) loss 4.0345 (3.6686) grad_norm 2.0933 (1.9321) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][850/1251] eta 0:01:38 lr 0.000984 wd 0.0500 time 0.2392 (0.2445) data time 0.0007 (0.0016) model time 0.2385 (0.2430) loss 4.4793 (3.6718) grad_norm 2.2019 (1.9324) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:25 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [42/300][860/1251] eta 0:01:35 lr 0.000984 wd 0.0500 time 0.2398 (0.2444) data time 0.0012 (0.0016) model time 0.2386 (0.2429) loss 4.0353 (3.6699) grad_norm 1.6734 (1.9319) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][870/1251] eta 0:01:33 lr 0.000984 wd 0.0500 time 0.2341 (0.2444) data time 0.0009 (0.0016) model time 0.2333 (0.2429) loss 4.2925 (3.6713) grad_norm 2.0738 (1.9295) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][880/1251] eta 0:01:30 lr 0.000984 wd 0.0500 time 0.2437 (0.2444) data time 0.0009 (0.0016) model time 0.2427 (0.2429) loss 3.4863 (3.6718) grad_norm 2.9624 (1.9289) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][890/1251] eta 0:01:28 lr 0.000984 wd 0.0500 time 0.2364 (0.2443) data time 0.0012 (0.0016) model time 0.2352 (0.2429) loss 3.0254 (3.6727) grad_norm 1.5796 (1.9255) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][900/1251] eta 0:01:25 lr 0.000984 wd 0.0500 time 0.2452 (0.2443) data time 0.0011 (0.0016) model time 0.2441 (0.2428) loss 3.0486 (3.6735) grad_norm 2.5582 (1.9267) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][910/1251] eta 0:01:23 lr 0.000984 wd 0.0500 time 0.2452 (0.2443) data time 0.0008 (0.0016) model time 0.2444 (0.2428) loss 4.0493 (3.6702) grad_norm 1.9672 (1.9305) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][920/1251] eta 0:01:20 lr 0.000984 wd 0.0500 time 0.2435 (0.2443) data time 0.0009 (0.0016) model time 0.2425 (0.2428) loss 3.9235 (3.6703) 
grad_norm 1.4548 (1.9309) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][930/1251] eta 0:01:18 lr 0.000984 wd 0.0500 time 0.2385 (0.2443) data time 0.0010 (0.0016) model time 0.2375 (0.2428) loss 3.5152 (3.6728) grad_norm 2.2510 (1.9301) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][940/1251] eta 0:01:15 lr 0.000984 wd 0.0500 time 0.2357 (0.2443) data time 0.0011 (0.0016) model time 0.2346 (0.2428) loss 3.5111 (3.6716) grad_norm 1.5918 (1.9286) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][950/1251] eta 0:01:13 lr 0.000984 wd 0.0500 time 0.2464 (0.2442) data time 0.0009 (0.0016) model time 0.2455 (0.2428) loss 2.8774 (3.6719) grad_norm 2.0245 (1.9327) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][960/1251] eta 0:01:11 lr 0.000984 wd 0.0500 time 0.2399 (0.2442) data time 0.0009 (0.0016) model time 0.2390 (0.2427) loss 3.7664 (3.6715) grad_norm 1.4408 (1.9362) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][970/1251] eta 0:01:08 lr 0.000984 wd 0.0500 time 0.2421 (0.2442) data time 0.0010 (0.0016) model time 0.2411 (0.2427) loss 2.2649 (3.6691) grad_norm 2.9968 (1.9361) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][980/1251] eta 0:01:06 lr 0.000984 wd 0.0500 time 0.2421 (0.2441) data time 0.0009 (0.0016) model time 0.2412 (0.2427) loss 4.1734 (3.6677) grad_norm 1.6765 (1.9353) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][990/1251] eta 0:01:03 lr 
0.000984 wd 0.0500 time 0.2381 (0.2443) data time 0.0012 (0.0016) model time 0.2369 (0.2429) loss 4.0096 (3.6685) grad_norm 1.6441 (1.9361) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1000/1251] eta 0:01:01 lr 0.000984 wd 0.0500 time 0.2347 (0.2443) data time 0.0012 (0.0016) model time 0.2334 (0.2428) loss 3.7602 (3.6676) grad_norm 1.8562 (1.9375) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1010/1251] eta 0:00:58 lr 0.000984 wd 0.0500 time 0.2435 (0.2443) data time 0.0009 (0.0016) model time 0.2427 (0.2428) loss 3.0186 (3.6667) grad_norm 1.9106 (1.9348) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1020/1251] eta 0:00:56 lr 0.000984 wd 0.0500 time 0.2404 (0.2442) data time 0.0011 (0.0016) model time 0.2393 (0.2428) loss 3.3195 (3.6664) grad_norm 1.5991 (1.9328) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1030/1251] eta 0:00:53 lr 0.000984 wd 0.0500 time 0.2363 (0.2442) data time 0.0010 (0.0016) model time 0.2353 (0.2428) loss 2.7883 (3.6639) grad_norm 1.3777 (1.9318) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1040/1251] eta 0:00:51 lr 0.000984 wd 0.0500 time 0.2416 (0.2442) data time 0.0009 (0.0016) model time 0.2407 (0.2428) loss 3.5011 (3.6654) grad_norm 2.5312 (1.9408) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1050/1251] eta 0:00:49 lr 0.000984 wd 0.0500 time 0.2385 (0.2442) data time 0.0009 (0.0016) model time 0.2376 (0.2427) loss 2.2536 (3.6621) grad_norm 1.7536 (1.9445) loss_scale 16384.0000 (16384.0000) mem 7379MB 
[2024-08-26 05:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1060/1251] eta 0:00:46 lr 0.000984 wd 0.0500 time 0.2453 (0.2442) data time 0.0008 (0.0016) model time 0.2446 (0.2427) loss 4.6555 (3.6662) grad_norm 2.1579 (1.9432) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1070/1251] eta 0:00:44 lr 0.000984 wd 0.0500 time 0.2408 (0.2441) data time 0.0009 (0.0015) model time 0.2399 (0.2427) loss 4.4942 (3.6689) grad_norm 1.7222 (1.9429) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1080/1251] eta 0:00:41 lr 0.000984 wd 0.0500 time 0.2354 (0.2445) data time 0.0010 (0.0015) model time 0.2344 (0.2431) loss 2.4344 (3.6697) grad_norm 2.1582 (1.9400) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1090/1251] eta 0:00:39 lr 0.000984 wd 0.0500 time 0.2373 (0.2445) data time 0.0008 (0.0015) model time 0.2366 (0.2431) loss 2.8825 (3.6708) grad_norm 2.3366 (1.9372) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1100/1251] eta 0:00:36 lr 0.000984 wd 0.0500 time 0.2436 (0.2445) data time 0.0009 (0.0015) model time 0.2427 (0.2431) loss 4.3191 (3.6722) grad_norm 2.2164 (1.9354) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1110/1251] eta 0:00:34 lr 0.000984 wd 0.0500 time 0.2429 (0.2445) data time 0.0010 (0.0015) model time 0.2419 (0.2431) loss 4.2232 (3.6717) grad_norm 1.7503 (1.9364) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1120/1251] eta 0:00:32 lr 0.000984 wd 0.0500 time 0.2463 (0.2445) data time 0.0007 (0.0015) model 
time 0.2456 (0.2431) loss 4.6416 (3.6740) grad_norm 1.8827 (1.9369) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1130/1251] eta 0:00:29 lr 0.000984 wd 0.0500 time 0.2530 (0.2445) data time 0.0007 (0.0015) model time 0.2523 (0.2430) loss 4.2435 (3.6763) grad_norm 1.4594 (1.9368) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1140/1251] eta 0:00:27 lr 0.000984 wd 0.0500 time 0.2417 (0.2444) data time 0.0009 (0.0015) model time 0.2408 (0.2430) loss 2.0051 (3.6761) grad_norm 2.0070 (1.9391) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1150/1251] eta 0:00:24 lr 0.000984 wd 0.0500 time 0.2389 (0.2444) data time 0.0009 (0.0015) model time 0.2380 (0.2430) loss 2.8753 (3.6743) grad_norm 1.7460 (1.9392) loss_scale 32768.0000 (16455.1729) mem 7379MB [2024-08-26 05:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1160/1251] eta 0:00:22 lr 0.000984 wd 0.0500 time 0.2372 (0.2444) data time 0.0009 (0.0015) model time 0.2363 (0.2430) loss 4.1297 (3.6766) grad_norm 2.9386 (1.9398) loss_scale 32768.0000 (16595.6796) mem 7379MB [2024-08-26 05:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1170/1251] eta 0:00:19 lr 0.000984 wd 0.0500 time 0.2391 (0.2444) data time 0.0008 (0.0015) model time 0.2382 (0.2430) loss 2.8986 (3.6740) grad_norm 1.9547 (1.9383) loss_scale 32768.0000 (16733.7865) mem 7379MB [2024-08-26 05:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1180/1251] eta 0:00:17 lr 0.000984 wd 0.0500 time 0.2437 (0.2444) data time 0.0009 (0.0015) model time 0.2428 (0.2430) loss 4.0910 (3.6766) grad_norm 1.4164 (1.9388) loss_scale 32768.0000 (16869.5546) mem 7379MB [2024-08-26 05:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [42/300][1190/1251] eta 0:00:14 lr 0.000984 wd 0.0500 time 0.2411 (0.2444) data time 0.0010 (0.0015) model time 0.2400 (0.2430) loss 3.1632 (3.6766) grad_norm 1.6907 (inf) loss_scale 16384.0000 (16948.0168) mem 7379MB [2024-08-26 05:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1200/1251] eta 0:00:12 lr 0.000984 wd 0.0500 time 0.2384 (0.2443) data time 0.0010 (0.0015) model time 0.2374 (0.2429) loss 3.0096 (3.6772) grad_norm 2.1583 (inf) loss_scale 16384.0000 (16943.3206) mem 7379MB [2024-08-26 05:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1210/1251] eta 0:00:10 lr 0.000984 wd 0.0500 time 0.2347 (0.2443) data time 0.0011 (0.0015) model time 0.2335 (0.2429) loss 3.5707 (3.6766) grad_norm 2.3170 (inf) loss_scale 16384.0000 (16938.7019) mem 7379MB [2024-08-26 05:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1220/1251] eta 0:00:07 lr 0.000984 wd 0.0500 time 0.2444 (0.2443) data time 0.0009 (0.0015) model time 0.2435 (0.2429) loss 3.7299 (3.6768) grad_norm 1.9980 (inf) loss_scale 16384.0000 (16934.1589) mem 7379MB [2024-08-26 05:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1230/1251] eta 0:00:05 lr 0.000984 wd 0.0500 time 0.2348 (0.2443) data time 0.0011 (0.0015) model time 0.2337 (0.2429) loss 3.3952 (3.6749) grad_norm 1.8983 (inf) loss_scale 16384.0000 (16929.6897) mem 7379MB [2024-08-26 05:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1240/1251] eta 0:00:02 lr 0.000984 wd 0.0500 time 0.2249 (0.2442) data time 0.0006 (0.0015) model time 0.2243 (0.2428) loss 3.7179 (3.6743) grad_norm 1.6064 (inf) loss_scale 16384.0000 (16925.2925) mem 7379MB [2024-08-26 05:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [42/300][1250/1251] eta 0:00:00 lr 0.000984 wd 0.0500 time 0.2231 (0.2441) data time 0.0007 (0.0015) model time 0.2224 (0.2427) loss 3.1570 (3.6748) grad_norm 2.4494 (inf) loss_scale 16384.0000 
(16920.9656) mem 7379MB
[2024-08-26 05:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 42 training takes 0:05:05
[2024-08-26 05:53:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 05:53:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 05:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.440 (0.440) Loss 0.6084 (0.6084) Acc@1 87.598 (87.598) Acc@5 97.266 (97.266) Mem 7379MB
[2024-08-26 05:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.111) Loss 0.9248 (0.9114) Acc@1 79.297 (79.510) Acc@5 94.922 (95.091) Mem 7379MB
[2024-08-26 05:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.097) Loss 1.3037 (0.9417) Acc@1 69.434 (78.390) Acc@5 90.918 (95.015) Mem 7379MB
[2024-08-26 05:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.091) Loss 1.6650 (1.0650) Acc@1 59.082 (75.576) Acc@5 84.473 (93.328) Mem 7379MB
[2024-08-26 05:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.4805 (1.1409) Acc@1 65.430 (73.809) Acc@5 88.770 (92.280) Mem 7379MB
[2024-08-26 05:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.488 Acc@5 92.142
[2024-08-26 05:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.5%
[2024-08-26 05:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 73.49%
[2024-08-26 05:53:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:53:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.434 (0.434) Loss 0.4692 (0.4692) Acc@1 90.039 (90.039) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 05:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.108) Loss 0.8057 (0.7825) Acc@1 81.738 (81.570) Acc@5 95.410 (95.721) Mem 7379MB
[2024-08-26 05:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.094) Loss 1.1484 (0.7975) Acc@1 72.949 (80.766) Acc@5 91.113 (95.740) Mem 7379MB
[2024-08-26 05:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.068 (0.088) Loss 1.4365 (0.9236) Acc@1 64.062 (77.832) Acc@5 87.012 (94.097) Mem 7379MB
[2024-08-26 05:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.3652 (0.9962) Acc@1 66.406 (76.127) Acc@5 88.477 (93.278) Mem 7379MB
[2024-08-26 05:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.804 Acc@5 93.198
[2024-08-26 05:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 75.8%
[2024-08-26 05:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 75.80%
[2024-08-26 05:53:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:53:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][0/1251] eta 0:15:06 lr 0.000984 wd 0.0500 time 0.7243 (0.7243) data time 0.5043 (0.5043) model time 0.0000 (0.0000) loss 3.9538 (3.9538) grad_norm 2.4107 (2.4107) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][10/1251] eta 0:05:55 lr 0.000984 wd 0.0500 time 0.2393 (0.2866) data time 0.0008 (0.0467) model time 0.0000 (0.0000) loss 3.8994 (3.7363) grad_norm 2.2949 (2.1673) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][20/1251] eta 0:05:27 lr 0.000984 wd 0.0500 time 0.2464 (0.2663) data time 0.0010 (0.0250) model time 0.0000 (0.0000) loss 1.8797 (3.6944) grad_norm 2.2242 (2.0485) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][30/1251] eta 0:05:15 lr 0.000984 wd 0.0500 time 0.2525 (0.2583) data time 0.0008 (0.0172) model time 0.0000 (0.0000) loss 3.6731 (3.7799) grad_norm 2.0705 (2.1382) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][40/1251] eta 0:05:07 lr 0.000984 wd 0.0500 time 0.2406 (0.2543) data time 0.0009 (0.0133) model time 0.0000 (0.0000) loss 3.8190 (3.7618) grad_norm 1.2643 (2.0763) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 05:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][50/1251] eta 0:05:02 lr 0.000984 wd 0.0500 time 0.2312 (0.2516) data time 0.0007 (0.0109) model time 0.0000 (0.0000) loss 2.6752 (3.7141) grad_norm inf (inf) loss_scale 8192.0000 (16223.3725) mem 7379MB [2024-08-26 05:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][60/1251] eta 0:04:57 lr 0.000984 wd 0.0500 time 0.2423 (0.2499) data time 0.0007 (0.0093) model time 0.2416 (0.2406) 
loss 3.5884 (3.6819) grad_norm 2.3098 (inf) loss_scale 8192.0000 (14906.7541) mem 7379MB [2024-08-26 05:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][70/1251] eta 0:04:53 lr 0.000984 wd 0.0500 time 0.2530 (0.2489) data time 0.0010 (0.0081) model time 0.2520 (0.2410) loss 3.5866 (3.6516) grad_norm 1.9438 (inf) loss_scale 8192.0000 (13961.0141) mem 7379MB [2024-08-26 05:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][80/1251] eta 0:04:50 lr 0.000984 wd 0.0500 time 0.2327 (0.2478) data time 0.0009 (0.0073) model time 0.2318 (0.2403) loss 2.6578 (3.6478) grad_norm 1.8612 (inf) loss_scale 8192.0000 (13248.7901) mem 7379MB [2024-08-26 05:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][90/1251] eta 0:04:46 lr 0.000984 wd 0.0500 time 0.2462 (0.2469) data time 0.0011 (0.0066) model time 0.2451 (0.2400) loss 4.2227 (3.6388) grad_norm 1.4939 (inf) loss_scale 8192.0000 (12693.0989) mem 7379MB [2024-08-26 05:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][100/1251] eta 0:04:43 lr 0.000983 wd 0.0500 time 0.2433 (0.2465) data time 0.0009 (0.0060) model time 0.2424 (0.2404) loss 3.6244 (3.6615) grad_norm 1.4574 (inf) loss_scale 8192.0000 (12247.4455) mem 7379MB [2024-08-26 05:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][110/1251] eta 0:04:40 lr 0.000983 wd 0.0500 time 0.2469 (0.2462) data time 0.0011 (0.0056) model time 0.2458 (0.2405) loss 3.7633 (3.6695) grad_norm 3.0673 (inf) loss_scale 8192.0000 (11882.0901) mem 7379MB [2024-08-26 05:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][120/1251] eta 0:04:37 lr 0.000983 wd 0.0500 time 0.2351 (0.2458) data time 0.0008 (0.0052) model time 0.2344 (0.2404) loss 3.7578 (3.6514) grad_norm 2.0156 (inf) loss_scale 8192.0000 (11577.1240) mem 7379MB [2024-08-26 05:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][130/1251] eta 0:04:34 lr 0.000983 wd 
0.0500 time 0.2381 (0.2453) data time 0.0010 (0.0049) model time 0.2371 (0.2402) loss 3.2333 (3.6538) grad_norm 1.9387 (inf) loss_scale 8192.0000 (11318.7176) mem 7379MB [2024-08-26 05:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][140/1251] eta 0:04:32 lr 0.000983 wd 0.0500 time 0.2440 (0.2451) data time 0.0011 (0.0046) model time 0.2429 (0.2403) loss 3.8995 (3.6508) grad_norm 1.5930 (inf) loss_scale 8192.0000 (11096.9645) mem 7379MB [2024-08-26 05:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][150/1251] eta 0:04:29 lr 0.000983 wd 0.0500 time 0.2427 (0.2448) data time 0.0010 (0.0044) model time 0.2418 (0.2402) loss 3.6861 (3.6422) grad_norm 1.2653 (inf) loss_scale 8192.0000 (10904.5828) mem 7379MB [2024-08-26 05:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][160/1251] eta 0:04:26 lr 0.000983 wd 0.0500 time 0.2416 (0.2445) data time 0.0007 (0.0042) model time 0.2408 (0.2402) loss 4.2015 (3.6147) grad_norm 1.5250 (inf) loss_scale 8192.0000 (10736.0994) mem 7379MB [2024-08-26 05:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][170/1251] eta 0:04:24 lr 0.000983 wd 0.0500 time 0.2350 (0.2443) data time 0.0011 (0.0040) model time 0.2339 (0.2401) loss 3.6295 (3.6333) grad_norm 2.1159 (inf) loss_scale 8192.0000 (10587.3216) mem 7379MB [2024-08-26 05:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][180/1251] eta 0:04:21 lr 0.000983 wd 0.0500 time 0.2362 (0.2441) data time 0.0009 (0.0038) model time 0.2354 (0.2401) loss 4.0119 (3.6327) grad_norm 1.7690 (inf) loss_scale 8192.0000 (10454.9834) mem 7379MB [2024-08-26 05:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][190/1251] eta 0:04:18 lr 0.000983 wd 0.0500 time 0.2446 (0.2439) data time 0.0011 (0.0037) model time 0.2435 (0.2400) loss 3.0881 (3.6327) grad_norm 2.2324 (inf) loss_scale 8192.0000 (10336.5026) mem 7379MB [2024-08-26 05:53:59 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [43/300][200/1251] eta 0:04:16 lr 0.000983 wd 0.0500 time 0.2326 (0.2437) data time 0.0011 (0.0036) model time 0.2315 (0.2400) loss 3.7765 (3.6411) grad_norm 2.8957 (inf) loss_scale 8192.0000 (10229.8109) mem 7379MB [2024-08-26 05:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][210/1251] eta 0:04:13 lr 0.000983 wd 0.0500 time 0.2531 (0.2437) data time 0.0010 (0.0034) model time 0.2522 (0.2402) loss 3.5960 (3.6503) grad_norm 1.7211 (inf) loss_scale 8192.0000 (10133.2322) mem 7379MB [2024-08-26 05:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][220/1251] eta 0:04:11 lr 0.000983 wd 0.0500 time 0.2392 (0.2436) data time 0.0010 (0.0033) model time 0.2381 (0.2402) loss 4.2515 (3.6620) grad_norm 3.0605 (inf) loss_scale 8192.0000 (10045.3937) mem 7379MB [2024-08-26 05:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][230/1251] eta 0:04:08 lr 0.000983 wd 0.0500 time 0.2447 (0.2435) data time 0.0012 (0.0032) model time 0.2435 (0.2401) loss 3.6030 (3.6636) grad_norm 1.8357 (inf) loss_scale 8192.0000 (9965.1602) mem 7379MB [2024-08-26 05:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][240/1251] eta 0:04:05 lr 0.000983 wd 0.0500 time 0.2368 (0.2433) data time 0.0009 (0.0031) model time 0.2359 (0.2400) loss 3.7669 (3.6664) grad_norm 2.1780 (inf) loss_scale 8192.0000 (9891.5851) mem 7379MB [2024-08-26 05:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][250/1251] eta 0:04:03 lr 0.000983 wd 0.0500 time 0.2352 (0.2431) data time 0.0008 (0.0030) model time 0.2344 (0.2399) loss 4.5021 (3.6721) grad_norm 2.6647 (inf) loss_scale 8192.0000 (9823.8725) mem 7379MB [2024-08-26 05:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][260/1251] eta 0:04:03 lr 0.000983 wd 0.0500 time 0.2379 (0.2456) data time 0.0010 (0.0030) model time 0.2369 (0.2431) loss 4.1691 (3.6745) grad_norm 2.2016 (inf) loss_scale 
8192.0000 (9761.3487) mem 7379MB [2024-08-26 05:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][270/1251] eta 0:04:00 lr 0.000983 wd 0.0500 time 0.2408 (0.2454) data time 0.0008 (0.0029) model time 0.2401 (0.2430) loss 3.0976 (3.6650) grad_norm 1.5223 (inf) loss_scale 8192.0000 (9703.4391) mem 7379MB [2024-08-26 05:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][280/1251] eta 0:03:58 lr 0.000983 wd 0.0500 time 0.2412 (0.2454) data time 0.0007 (0.0028) model time 0.2404 (0.2430) loss 4.0647 (3.6683) grad_norm 2.0098 (inf) loss_scale 8192.0000 (9649.6512) mem 7379MB [2024-08-26 05:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][290/1251] eta 0:03:55 lr 0.000983 wd 0.0500 time 0.2469 (0.2453) data time 0.0008 (0.0028) model time 0.2461 (0.2429) loss 3.9075 (3.6678) grad_norm 1.7546 (inf) loss_scale 8192.0000 (9599.5601) mem 7379MB [2024-08-26 05:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][300/1251] eta 0:03:53 lr 0.000983 wd 0.0500 time 0.2507 (0.2451) data time 0.0010 (0.0027) model time 0.2497 (0.2428) loss 3.9036 (3.6641) grad_norm 2.1416 (inf) loss_scale 8192.0000 (9552.7973) mem 7379MB [2024-08-26 05:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][310/1251] eta 0:03:50 lr 0.000983 wd 0.0500 time 0.2377 (0.2451) data time 0.0009 (0.0026) model time 0.2368 (0.2428) loss 3.5309 (3.6592) grad_norm 3.2536 (inf) loss_scale 8192.0000 (9509.0418) mem 7379MB [2024-08-26 05:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][320/1251] eta 0:03:48 lr 0.000983 wd 0.0500 time 0.2427 (0.2451) data time 0.0010 (0.0026) model time 0.2417 (0.2428) loss 3.4462 (3.6531) grad_norm 1.4826 (inf) loss_scale 8192.0000 (9468.0125) mem 7379MB [2024-08-26 05:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][330/1251] eta 0:03:45 lr 0.000983 wd 0.0500 time 0.2447 (0.2450) data time 0.0011 (0.0026) model 
time 0.2436 (0.2428) loss 3.8846 (3.6560) grad_norm 2.9160 (inf) loss_scale 8192.0000 (9429.4622) mem 7379MB [2024-08-26 05:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][340/1251] eta 0:03:43 lr 0.000983 wd 0.0500 time 0.2371 (0.2449) data time 0.0008 (0.0025) model time 0.2363 (0.2427) loss 4.1695 (3.6614) grad_norm 1.9773 (inf) loss_scale 8192.0000 (9393.1730) mem 7379MB [2024-08-26 05:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][350/1251] eta 0:03:40 lr 0.000983 wd 0.0500 time 0.2478 (0.2448) data time 0.0010 (0.0025) model time 0.2468 (0.2426) loss 4.2732 (3.6623) grad_norm 2.1989 (inf) loss_scale 8192.0000 (9358.9516) mem 7379MB [2024-08-26 05:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][360/1251] eta 0:03:38 lr 0.000983 wd 0.0500 time 0.2363 (0.2448) data time 0.0009 (0.0024) model time 0.2354 (0.2426) loss 4.2489 (3.6663) grad_norm 1.9295 (inf) loss_scale 8192.0000 (9326.6260) mem 7379MB [2024-08-26 05:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][370/1251] eta 0:03:35 lr 0.000983 wd 0.0500 time 0.2421 (0.2446) data time 0.0007 (0.0024) model time 0.2414 (0.2425) loss 2.5767 (3.6673) grad_norm 1.8725 (inf) loss_scale 8192.0000 (9296.0431) mem 7379MB [2024-08-26 05:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][380/1251] eta 0:03:33 lr 0.000983 wd 0.0500 time 0.2368 (0.2446) data time 0.0011 (0.0024) model time 0.2357 (0.2424) loss 4.0792 (3.6677) grad_norm 1.3661 (inf) loss_scale 8192.0000 (9267.0656) mem 7379MB [2024-08-26 05:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][390/1251] eta 0:03:30 lr 0.000983 wd 0.0500 time 0.2363 (0.2444) data time 0.0007 (0.0023) model time 0.2355 (0.2423) loss 4.1961 (3.6664) grad_norm 1.8443 (inf) loss_scale 8192.0000 (9239.5703) mem 7379MB [2024-08-26 05:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][400/1251] eta 0:03:28 lr 
0.000983 wd 0.0500 time 0.2458 (0.2444) data time 0.0007 (0.0023) model time 0.2450 (0.2423) loss 4.3334 (3.6658) grad_norm 1.9047 (inf) loss_scale 8192.0000 (9213.4464) mem 7379MB [2024-08-26 05:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][410/1251] eta 0:03:25 lr 0.000983 wd 0.0500 time 0.2350 (0.2444) data time 0.0010 (0.0023) model time 0.2340 (0.2423) loss 3.0128 (3.6634) grad_norm 1.6290 (inf) loss_scale 8192.0000 (9188.5937) mem 7379MB [2024-08-26 05:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][420/1251] eta 0:03:22 lr 0.000983 wd 0.0500 time 0.2355 (0.2443) data time 0.0008 (0.0022) model time 0.2347 (0.2422) loss 2.7277 (3.6578) grad_norm 1.3228 (inf) loss_scale 8192.0000 (9164.9216) mem 7379MB [2024-08-26 05:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][430/1251] eta 0:03:20 lr 0.000983 wd 0.0500 time 0.2464 (0.2443) data time 0.0010 (0.0022) model time 0.2455 (0.2422) loss 4.0748 (3.6566) grad_norm 1.7878 (inf) loss_scale 8192.0000 (9142.3480) mem 7379MB [2024-08-26 05:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][440/1251] eta 0:03:17 lr 0.000983 wd 0.0500 time 0.2367 (0.2441) data time 0.0010 (0.0022) model time 0.2356 (0.2421) loss 3.8706 (3.6567) grad_norm 2.4082 (inf) loss_scale 8192.0000 (9120.7982) mem 7379MB [2024-08-26 05:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][450/1251] eta 0:03:15 lr 0.000983 wd 0.0500 time 0.2432 (0.2441) data time 0.0011 (0.0022) model time 0.2421 (0.2421) loss 3.8871 (3.6520) grad_norm 1.5649 (inf) loss_scale 8192.0000 (9100.2040) mem 7379MB [2024-08-26 05:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][460/1251] eta 0:03:13 lr 0.000983 wd 0.0500 time 0.2420 (0.2441) data time 0.0011 (0.0021) model time 0.2409 (0.2421) loss 3.5731 (3.6530) grad_norm 1.3598 (inf) loss_scale 8192.0000 (9080.5033) mem 7379MB [2024-08-26 05:55:05 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [43/300][470/1251] eta 0:03:10 lr 0.000983 wd 0.0500 time 0.2413 (0.2445) data time 0.0010 (0.0021) model time 0.2403 (0.2426) loss 3.1922 (3.6509) grad_norm 1.9275 (inf) loss_scale 8192.0000 (9061.6391) mem 7379MB [2024-08-26 05:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][480/1251] eta 0:03:08 lr 0.000983 wd 0.0500 time 0.2381 (0.2444) data time 0.0011 (0.0021) model time 0.2369 (0.2425) loss 3.6996 (3.6542) grad_norm 1.8430 (inf) loss_scale 8192.0000 (9043.5593) mem 7379MB [2024-08-26 05:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][490/1251] eta 0:03:05 lr 0.000983 wd 0.0500 time 0.2399 (0.2443) data time 0.0007 (0.0021) model time 0.2391 (0.2424) loss 4.3873 (3.6637) grad_norm 1.8787 (inf) loss_scale 8192.0000 (9026.2159) mem 7379MB [2024-08-26 05:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][500/1251] eta 0:03:03 lr 0.000983 wd 0.0500 time 0.2391 (0.2443) data time 0.0012 (0.0021) model time 0.2379 (0.2424) loss 4.0467 (3.6655) grad_norm 1.8851 (inf) loss_scale 8192.0000 (9009.5649) mem 7379MB [2024-08-26 05:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][510/1251] eta 0:03:01 lr 0.000983 wd 0.0500 time 0.2461 (0.2443) data time 0.0011 (0.0020) model time 0.2450 (0.2425) loss 3.5129 (3.6649) grad_norm 2.0679 (inf) loss_scale 8192.0000 (8993.5656) mem 7379MB [2024-08-26 05:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][520/1251] eta 0:02:58 lr 0.000983 wd 0.0500 time 0.2478 (0.2443) data time 0.0010 (0.0020) model time 0.2469 (0.2424) loss 4.1201 (3.6605) grad_norm 1.6602 (inf) loss_scale 8192.0000 (8978.1804) mem 7379MB [2024-08-26 05:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][530/1251] eta 0:02:56 lr 0.000983 wd 0.0500 time 0.2455 (0.2442) data time 0.0007 (0.0020) model time 0.2447 (0.2424) loss 3.9049 (3.6653) grad_norm 2.2287 (inf) loss_scale 
8192.0000 (8963.3748) mem 7379MB [2024-08-26 05:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][540/1251] eta 0:02:53 lr 0.000983 wd 0.0500 time 0.2396 (0.2442) data time 0.0011 (0.0020) model time 0.2385 (0.2423) loss 3.1890 (3.6649) grad_norm 1.5933 (inf) loss_scale 8192.0000 (8949.1165) mem 7379MB [2024-08-26 05:55:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][550/1251] eta 0:02:51 lr 0.000983 wd 0.0500 time 0.2410 (0.2442) data time 0.0008 (0.0020) model time 0.2401 (0.2424) loss 4.4149 (3.6689) grad_norm 2.2707 (inf) loss_scale 8192.0000 (8935.3757) mem 7379MB [2024-08-26 05:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][560/1251] eta 0:02:48 lr 0.000983 wd 0.0500 time 0.2374 (0.2441) data time 0.0008 (0.0019) model time 0.2366 (0.2423) loss 4.2755 (3.6665) grad_norm 1.4915 (inf) loss_scale 8192.0000 (8922.1248) mem 7379MB [2024-08-26 05:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][570/1251] eta 0:02:46 lr 0.000983 wd 0.0500 time 0.2421 (0.2440) data time 0.0009 (0.0019) model time 0.2412 (0.2423) loss 3.6599 (3.6634) grad_norm 1.7180 (inf) loss_scale 8192.0000 (8909.3380) mem 7379MB [2024-08-26 05:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][580/1251] eta 0:02:43 lr 0.000983 wd 0.0500 time 0.2457 (0.2440) data time 0.0011 (0.0019) model time 0.2447 (0.2423) loss 4.0008 (3.6647) grad_norm 1.5862 (inf) loss_scale 8192.0000 (8896.9914) mem 7379MB [2024-08-26 05:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][590/1251] eta 0:02:41 lr 0.000983 wd 0.0500 time 0.2382 (0.2440) data time 0.0011 (0.0019) model time 0.2371 (0.2422) loss 3.5281 (3.6656) grad_norm 1.7204 (inf) loss_scale 8192.0000 (8885.0626) mem 7379MB [2024-08-26 05:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][600/1251] eta 0:02:38 lr 0.000983 wd 0.0500 time 0.2425 (0.2440) data time 0.0011 (0.0019) model 
time 0.2414 (0.2422) loss 3.4930 (3.6637) grad_norm 1.8483 (inf) loss_scale 8192.0000 (8873.5308) mem 7379MB [2024-08-26 05:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][610/1251] eta 0:02:36 lr 0.000983 wd 0.0500 time 0.2422 (0.2440) data time 0.0010 (0.0019) model time 0.2412 (0.2422) loss 2.6658 (3.6634) grad_norm 2.1023 (inf) loss_scale 8192.0000 (8862.3764) mem 7379MB [2024-08-26 05:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][620/1251] eta 0:02:33 lr 0.000983 wd 0.0500 time 0.2376 (0.2439) data time 0.0009 (0.0019) model time 0.2367 (0.2422) loss 3.9729 (3.6635) grad_norm 1.4594 (inf) loss_scale 8192.0000 (8851.5813) mem 7379MB [2024-08-26 05:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][630/1251] eta 0:02:31 lr 0.000983 wd 0.0500 time 0.2460 (0.2443) data time 0.0009 (0.0018) model time 0.2451 (0.2426) loss 4.2232 (3.6625) grad_norm 2.5758 (inf) loss_scale 8192.0000 (8841.1284) mem 7379MB [2024-08-26 05:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][640/1251] eta 0:02:29 lr 0.000983 wd 0.0500 time 0.2422 (0.2449) data time 0.0011 (0.0018) model time 0.2411 (0.2433) loss 3.4722 (3.6663) grad_norm 2.4464 (inf) loss_scale 8192.0000 (8831.0016) mem 7379MB [2024-08-26 05:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][650/1251] eta 0:02:27 lr 0.000983 wd 0.0500 time 0.2505 (0.2448) data time 0.0007 (0.0018) model time 0.2498 (0.2432) loss 3.8692 (3.6701) grad_norm 1.6168 (inf) loss_scale 8192.0000 (8821.1859) mem 7379MB [2024-08-26 05:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][660/1251] eta 0:02:24 lr 0.000983 wd 0.0500 time 0.2434 (0.2448) data time 0.0008 (0.0018) model time 0.2426 (0.2432) loss 3.5359 (3.6713) grad_norm 1.6122 (inf) loss_scale 8192.0000 (8811.6672) mem 7379MB [2024-08-26 05:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][670/1251] eta 0:02:22 lr 
0.000983 wd 0.0500 time 0.2368 (0.2448) data time 0.0009 (0.0018) model time 0.2359 (0.2432) loss 4.3169 (3.6748) grad_norm 2.0969 (inf) loss_scale 8192.0000 (8802.4322) mem 7379MB [2024-08-26 05:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][680/1251] eta 0:02:19 lr 0.000983 wd 0.0500 time 0.2394 (0.2447) data time 0.0011 (0.0018) model time 0.2382 (0.2431) loss 4.1677 (3.6721) grad_norm 1.8142 (inf) loss_scale 8192.0000 (8793.4684) mem 7379MB [2024-08-26 05:55:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][690/1251] eta 0:02:17 lr 0.000983 wd 0.0500 time 0.2318 (0.2447) data time 0.0007 (0.0018) model time 0.2311 (0.2431) loss 3.1762 (3.6746) grad_norm 1.6382 (inf) loss_scale 8192.0000 (8784.7641) mem 7379MB [2024-08-26 05:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][700/1251] eta 0:02:14 lr 0.000983 wd 0.0500 time 0.2477 (0.2446) data time 0.0010 (0.0018) model time 0.2467 (0.2431) loss 3.2272 (3.6743) grad_norm 3.0540 (inf) loss_scale 8192.0000 (8776.3081) mem 7379MB [2024-08-26 05:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][710/1251] eta 0:02:12 lr 0.000983 wd 0.0500 time 0.2434 (0.2446) data time 0.0007 (0.0017) model time 0.2427 (0.2430) loss 3.2265 (3.6760) grad_norm 1.2338 (inf) loss_scale 8192.0000 (8768.0900) mem 7379MB [2024-08-26 05:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][720/1251] eta 0:02:09 lr 0.000983 wd 0.0500 time 0.2364 (0.2446) data time 0.0012 (0.0017) model time 0.2352 (0.2430) loss 4.1433 (3.6762) grad_norm 1.8735 (inf) loss_scale 8192.0000 (8760.0999) mem 7379MB [2024-08-26 05:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][730/1251] eta 0:02:07 lr 0.000983 wd 0.0500 time 0.2397 (0.2446) data time 0.0007 (0.0017) model time 0.2390 (0.2430) loss 4.2930 (3.6777) grad_norm 1.9109 (inf) loss_scale 8192.0000 (8752.3283) mem 7379MB [2024-08-26 05:56:11 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [43/300][740/1251] eta 0:02:04 lr 0.000983 wd 0.0500 time 0.2437 (0.2446) data time 0.0010 (0.0017) model time 0.2427 (0.2430) loss 4.3206 (3.6791) grad_norm 2.4334 (inf) loss_scale 8192.0000 (8744.7665) mem 7379MB [2024-08-26 05:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][750/1251] eta 0:02:02 lr 0.000983 wd 0.0500 time 0.2378 (0.2445) data time 0.0010 (0.0017) model time 0.2368 (0.2430) loss 3.9669 (3.6778) grad_norm 1.6815 (inf) loss_scale 8192.0000 (8737.4061) mem 7379MB [2024-08-26 05:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][760/1251] eta 0:02:00 lr 0.000983 wd 0.0500 time 0.2382 (0.2445) data time 0.0011 (0.0017) model time 0.2371 (0.2429) loss 3.1704 (3.6705) grad_norm 1.4753 (inf) loss_scale 8192.0000 (8730.2392) mem 7379MB [2024-08-26 05:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][770/1251] eta 0:01:57 lr 0.000983 wd 0.0500 time 0.2380 (0.2444) data time 0.0009 (0.0017) model time 0.2371 (0.2429) loss 4.3351 (3.6699) grad_norm 1.4293 (inf) loss_scale 8192.0000 (8723.2581) mem 7379MB [2024-08-26 05:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][780/1251] eta 0:01:55 lr 0.000983 wd 0.0500 time 0.2363 (0.2448) data time 0.0012 (0.0017) model time 0.2351 (0.2433) loss 4.5467 (3.6713) grad_norm 1.4330 (inf) loss_scale 8192.0000 (8716.4558) mem 7379MB [2024-08-26 05:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][790/1251] eta 0:01:52 lr 0.000983 wd 0.0500 time 0.2442 (0.2448) data time 0.0008 (0.0017) model time 0.2435 (0.2433) loss 4.6356 (3.6744) grad_norm 2.2549 (inf) loss_scale 8192.0000 (8709.8255) mem 7379MB [2024-08-26 05:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][800/1251] eta 0:01:50 lr 0.000983 wd 0.0500 time 0.2441 (0.2447) data time 0.0007 (0.0017) model time 0.2434 (0.2432) loss 4.2168 (3.6738) grad_norm 2.4339 (inf) loss_scale 
8192.0000 (8703.3608) mem 7379MB [2024-08-26 05:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][810/1251] eta 0:01:47 lr 0.000983 wd 0.0500 time 0.2399 (0.2447) data time 0.0010 (0.0017) model time 0.2389 (0.2432) loss 3.5717 (3.6730) grad_norm 1.6523 (inf) loss_scale 8192.0000 (8697.0555) mem 7379MB [2024-08-26 05:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][820/1251] eta 0:01:45 lr 0.000983 wd 0.0500 time 0.2454 (0.2447) data time 0.0011 (0.0016) model time 0.2443 (0.2432) loss 3.7868 (3.6697) grad_norm 1.9078 (inf) loss_scale 8192.0000 (8690.9038) mem 7379MB [2024-08-26 05:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][830/1251] eta 0:01:42 lr 0.000983 wd 0.0500 time 0.2388 (0.2446) data time 0.0010 (0.0016) model time 0.2378 (0.2431) loss 3.9721 (3.6700) grad_norm 2.3721 (inf) loss_scale 8192.0000 (8684.9001) mem 7379MB [2024-08-26 05:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][840/1251] eta 0:01:40 lr 0.000983 wd 0.0500 time 0.2382 (0.2446) data time 0.0011 (0.0016) model time 0.2371 (0.2431) loss 2.6348 (3.6721) grad_norm 2.2410 (inf) loss_scale 8192.0000 (8679.0392) mem 7379MB [2024-08-26 05:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][850/1251] eta 0:01:38 lr 0.000983 wd 0.0500 time 0.2441 (0.2446) data time 0.0009 (0.0016) model time 0.2431 (0.2431) loss 4.0093 (3.6717) grad_norm 1.9360 (inf) loss_scale 8192.0000 (8673.3161) mem 7379MB [2024-08-26 05:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][860/1251] eta 0:01:35 lr 0.000983 wd 0.0500 time 0.2537 (0.2446) data time 0.0008 (0.0016) model time 0.2529 (0.2431) loss 4.3497 (3.6741) grad_norm 1.5009 (inf) loss_scale 8192.0000 (8667.7259) mem 7379MB [2024-08-26 05:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][870/1251] eta 0:01:33 lr 0.000983 wd 0.0500 time 0.2303 (0.2445) data time 0.0009 (0.0016) model 
time 0.2294 (0.2431) loss 3.8545 (3.6730) grad_norm 1.6694 (inf) loss_scale 8192.0000 (8662.2641) mem 7379MB [2024-08-26 05:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][880/1251] eta 0:01:30 lr 0.000983 wd 0.0500 time 0.2420 (0.2446) data time 0.0010 (0.0016) model time 0.2410 (0.2431) loss 2.5044 (3.6717) grad_norm 2.0686 (inf) loss_scale 8192.0000 (8656.9262) mem 7379MB [2024-08-26 05:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][890/1251] eta 0:01:28 lr 0.000983 wd 0.0500 time 0.2351 (0.2445) data time 0.0009 (0.0016) model time 0.2342 (0.2431) loss 4.4773 (3.6735) grad_norm 1.8167 (inf) loss_scale 8192.0000 (8651.7082) mem 7379MB [2024-08-26 05:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][900/1251] eta 0:01:25 lr 0.000983 wd 0.0500 time 0.2402 (0.2445) data time 0.0011 (0.0016) model time 0.2391 (0.2430) loss 3.0545 (3.6744) grad_norm 1.9753 (inf) loss_scale 8192.0000 (8646.6060) mem 7379MB [2024-08-26 05:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][910/1251] eta 0:01:23 lr 0.000983 wd 0.0500 time 0.2470 (0.2445) data time 0.0010 (0.0016) model time 0.2460 (0.2430) loss 3.6677 (3.6744) grad_norm 1.5208 (inf) loss_scale 4096.0000 (8601.1504) mem 7379MB [2024-08-26 05:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][920/1251] eta 0:01:20 lr 0.000983 wd 0.0500 time 0.2379 (0.2444) data time 0.0011 (0.0016) model time 0.2368 (0.2430) loss 3.7984 (3.6744) grad_norm 1.4605 (inf) loss_scale 4096.0000 (8552.2345) mem 7379MB [2024-08-26 05:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][930/1251] eta 0:01:18 lr 0.000983 wd 0.0500 time 0.2385 (0.2444) data time 0.0011 (0.0016) model time 0.2374 (0.2430) loss 3.6878 (3.6736) grad_norm 1.7073 (inf) loss_scale 4096.0000 (8504.3695) mem 7379MB [2024-08-26 05:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][940/1251] eta 0:01:16 lr 
0.000983 wd 0.0500 time 0.2366 (0.2444) data time 0.0007 (0.0016) model time 0.2359 (0.2430) loss 3.6714 (3.6700) grad_norm 2.3092 (inf) loss_scale 4096.0000 (8457.5218) mem 7379MB [2024-08-26 05:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][950/1251] eta 0:01:13 lr 0.000983 wd 0.0500 time 0.2407 (0.2444) data time 0.0010 (0.0016) model time 0.2397 (0.2430) loss 4.1001 (3.6700) grad_norm 1.7565 (inf) loss_scale 4096.0000 (8411.6593) mem 7379MB [2024-08-26 05:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][960/1251] eta 0:01:11 lr 0.000983 wd 0.0500 time 0.2496 (0.2444) data time 0.0011 (0.0016) model time 0.2486 (0.2430) loss 4.5977 (3.6738) grad_norm 1.4759 (inf) loss_scale 4096.0000 (8366.7513) mem 7379MB [2024-08-26 05:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][970/1251] eta 0:01:08 lr 0.000982 wd 0.0500 time 0.2489 (0.2444) data time 0.0010 (0.0015) model time 0.2479 (0.2430) loss 3.8913 (3.6746) grad_norm 1.4517 (inf) loss_scale 4096.0000 (8322.7683) mem 7379MB [2024-08-26 05:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][980/1251] eta 0:01:06 lr 0.000982 wd 0.0500 time 0.2441 (0.2443) data time 0.0007 (0.0015) model time 0.2434 (0.2429) loss 2.7647 (3.6726) grad_norm 2.0339 (inf) loss_scale 4096.0000 (8279.6820) mem 7379MB [2024-08-26 05:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][990/1251] eta 0:01:03 lr 0.000982 wd 0.0500 time 0.2386 (0.2443) data time 0.0010 (0.0015) model time 0.2376 (0.2429) loss 3.5365 (3.6711) grad_norm 1.7222 (inf) loss_scale 4096.0000 (8237.4652) mem 7379MB [2024-08-26 05:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1000/1251] eta 0:01:01 lr 0.000982 wd 0.0500 time 0.2432 (0.2443) data time 0.0010 (0.0015) model time 0.2422 (0.2429) loss 3.6963 (3.6706) grad_norm 2.3146 (inf) loss_scale 4096.0000 (8196.0919) mem 7379MB [2024-08-26 05:57:17 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [43/300][1010/1251] eta 0:00:58 lr 0.000982 wd 0.0500 time 0.2369 (0.2443) data time 0.0007 (0.0015) model time 0.2361 (0.2429) loss 3.4659 (3.6710) grad_norm 3.6657 (inf) loss_scale 4096.0000 (8155.5371) mem 7379MB [2024-08-26 05:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1020/1251] eta 0:00:56 lr 0.000982 wd 0.0500 time 0.2435 (0.2443) data time 0.0009 (0.0015) model time 0.2426 (0.2429) loss 3.0721 (3.6748) grad_norm 2.5651 (inf) loss_scale 4096.0000 (8115.7767) mem 7379MB [2024-08-26 05:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1030/1251] eta 0:00:53 lr 0.000982 wd 0.0500 time 0.2449 (0.2442) data time 0.0010 (0.0015) model time 0.2438 (0.2428) loss 4.1292 (3.6771) grad_norm 1.4947 (inf) loss_scale 4096.0000 (8076.7876) mem 7379MB [2024-08-26 05:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1040/1251] eta 0:00:51 lr 0.000982 wd 0.0500 time 0.2384 (0.2442) data time 0.0008 (0.0015) model time 0.2376 (0.2428) loss 3.2123 (3.6745) grad_norm 2.6185 (inf) loss_scale 4096.0000 (8038.5476) mem 7379MB [2024-08-26 05:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1050/1251] eta 0:00:49 lr 0.000982 wd 0.0500 time 0.2396 (0.2441) data time 0.0011 (0.0015) model time 0.2385 (0.2427) loss 3.9613 (3.6715) grad_norm 1.6674 (inf) loss_scale 4096.0000 (8001.0352) mem 7379MB [2024-08-26 05:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1060/1251] eta 0:00:46 lr 0.000982 wd 0.0500 time 0.2444 (0.2441) data time 0.0008 (0.0015) model time 0.2436 (0.2427) loss 3.1645 (3.6689) grad_norm 1.5364 (inf) loss_scale 4096.0000 (7964.2300) mem 7379MB [2024-08-26 05:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1070/1251] eta 0:00:44 lr 0.000982 wd 0.0500 time 0.2424 (0.2441) data time 0.0010 (0.0015) model time 0.2414 (0.2427) loss 4.5339 (3.6656) grad_norm 1.4516 (inf) 
loss_scale 4096.0000 (7928.1120) mem 7379MB [2024-08-26 05:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1080/1251] eta 0:00:41 lr 0.000982 wd 0.0500 time 0.2394 (0.2441) data time 0.0011 (0.0015) model time 0.2383 (0.2427) loss 3.3823 (3.6660) grad_norm 1.2894 (inf) loss_scale 4096.0000 (7892.6623) mem 7379MB [2024-08-26 05:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1090/1251] eta 0:00:39 lr 0.000982 wd 0.0500 time 0.2415 (0.2441) data time 0.0008 (0.0015) model time 0.2407 (0.2427) loss 3.2732 (3.6667) grad_norm 1.6483 (inf) loss_scale 4096.0000 (7857.8625) mem 7379MB [2024-08-26 05:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1100/1251] eta 0:00:36 lr 0.000982 wd 0.0500 time 0.2444 (0.2441) data time 0.0011 (0.0015) model time 0.2433 (0.2427) loss 3.9380 (3.6644) grad_norm 1.9677 (inf) loss_scale 4096.0000 (7823.6948) mem 7379MB [2024-08-26 05:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1110/1251] eta 0:00:34 lr 0.000982 wd 0.0500 time 0.2448 (0.2441) data time 0.0010 (0.0015) model time 0.2439 (0.2427) loss 4.0116 (3.6619) grad_norm 2.4396 (inf) loss_scale 4096.0000 (7790.1422) mem 7379MB [2024-08-26 05:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1120/1251] eta 0:00:31 lr 0.000982 wd 0.0500 time 0.2410 (0.2440) data time 0.0009 (0.0015) model time 0.2401 (0.2426) loss 3.9752 (3.6632) grad_norm 2.3829 (inf) loss_scale 4096.0000 (7757.1882) mem 7379MB [2024-08-26 05:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1130/1251] eta 0:00:29 lr 0.000982 wd 0.0500 time 0.2479 (0.2440) data time 0.0009 (0.0015) model time 0.2470 (0.2426) loss 3.3290 (3.6631) grad_norm 2.2537 (inf) loss_scale 4096.0000 (7724.8170) mem 7379MB [2024-08-26 05:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1140/1251] eta 0:00:27 lr 0.000982 wd 0.0500 time 0.2407 (0.2440) data time 0.0012 
(0.0015) model time 0.2396 (0.2426) loss 3.5034 (3.6635) grad_norm 1.8328 (inf) loss_scale 4096.0000 (7693.0131) mem 7379MB [2024-08-26 05:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1150/1251] eta 0:00:24 lr 0.000982 wd 0.0500 time 0.2353 (0.2440) data time 0.0010 (0.0015) model time 0.2343 (0.2426) loss 3.6225 (3.6637) grad_norm 1.6287 (inf) loss_scale 4096.0000 (7661.7619) mem 7379MB [2024-08-26 05:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1160/1251] eta 0:00:22 lr 0.000982 wd 0.0500 time 0.2410 (0.2439) data time 0.0013 (0.0015) model time 0.2396 (0.2426) loss 3.9172 (3.6652) grad_norm 1.8898 (inf) loss_scale 4096.0000 (7631.0491) mem 7379MB [2024-08-26 05:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1170/1251] eta 0:00:19 lr 0.000982 wd 0.0500 time 0.2320 (0.2443) data time 0.0010 (0.0015) model time 0.2310 (0.2429) loss 2.8933 (3.6636) grad_norm 3.9789 (inf) loss_scale 4096.0000 (7600.8608) mem 7379MB [2024-08-26 05:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1180/1251] eta 0:00:17 lr 0.000982 wd 0.0500 time 0.2447 (0.2444) data time 0.0011 (0.0015) model time 0.2436 (0.2431) loss 2.7454 (3.6636) grad_norm 1.7772 (inf) loss_scale 4096.0000 (7571.1837) mem 7379MB [2024-08-26 05:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1190/1251] eta 0:00:14 lr 0.000982 wd 0.0500 time 0.2410 (0.2448) data time 0.0008 (0.0015) model time 0.2402 (0.2434) loss 4.3347 (3.6671) grad_norm 2.0363 (inf) loss_scale 4096.0000 (7542.0050) mem 7379MB [2024-08-26 05:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1200/1251] eta 0:00:12 lr 0.000982 wd 0.0500 time 0.2425 (0.2447) data time 0.0010 (0.0015) model time 0.2415 (0.2434) loss 2.6555 (3.6678) grad_norm 1.9073 (inf) loss_scale 4096.0000 (7513.3122) mem 7379MB [2024-08-26 05:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[43/300][1210/1251] eta 0:00:10 lr 0.000982 wd 0.0500 time 0.2431 (0.2447) data time 0.0011 (0.0015) model time 0.2421 (0.2434) loss 3.0412 (3.6666) grad_norm 1.3522 (inf) loss_scale 4096.0000 (7485.0933) mem 7379MB [2024-08-26 05:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1220/1251] eta 0:00:07 lr 0.000982 wd 0.0500 time 0.2336 (0.2447) data time 0.0008 (0.0014) model time 0.2328 (0.2433) loss 3.0070 (3.6630) grad_norm 2.1771 (inf) loss_scale 4096.0000 (7457.3366) mem 7379MB [2024-08-26 05:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1230/1251] eta 0:00:05 lr 0.000982 wd 0.0500 time 0.2412 (0.2446) data time 0.0008 (0.0014) model time 0.2404 (0.2433) loss 4.2180 (3.6630) grad_norm 1.7468 (inf) loss_scale 4096.0000 (7430.0309) mem 7379MB [2024-08-26 05:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1240/1251] eta 0:00:02 lr 0.000982 wd 0.0500 time 0.2254 (0.2446) data time 0.0005 (0.0014) model time 0.2249 (0.2432) loss 2.3556 (3.6630) grad_norm 1.5641 (inf) loss_scale 4096.0000 (7403.1652) mem 7379MB [2024-08-26 05:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [43/300][1250/1251] eta 0:00:00 lr 0.000982 wd 0.0500 time 0.2241 (0.2444) data time 0.0007 (0.0014) model time 0.2234 (0.2431) loss 3.8703 (3.6649) grad_norm 1.7991 (inf) loss_scale 4096.0000 (7376.7290) mem 7379MB [2024-08-26 05:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 43 training takes 0:05:05 [2024-08-26 05:58:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 05:58:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 05:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.433 (0.433) Loss 0.5635 (0.5635) Acc@1 89.844 (89.844) Acc@5 97.754 (97.754) Mem 7379MB
[2024-08-26 05:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.110) Loss 0.8892 (0.8964) Acc@1 82.031 (79.901) Acc@5 94.531 (95.277) Mem 7379MB
[2024-08-26 05:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.095) Loss 1.2871 (0.9240) Acc@1 69.727 (78.832) Acc@5 90.820 (95.178) Mem 7379MB
[2024-08-26 05:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.6250 (1.0607) Acc@1 61.426 (75.895) Acc@5 84.766 (93.337) Mem 7379MB
[2024-08-26 05:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.5010 (1.1338) Acc@1 64.160 (74.095) Acc@5 87.402 (92.371) Mem 7379MB
[2024-08-26 05:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.716 Acc@5 92.276
[2024-08-26 05:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.7%
[2024-08-26 05:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 73.72%
[2024-08-26 05:58:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 05:58:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 05:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.515 (0.515) Loss 0.4685 (0.4685) Acc@1 89.844 (89.844) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 05:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.118) Loss 0.8013 (0.7787) Acc@1 81.934 (81.685) Acc@5 95.508 (95.792) Mem 7379MB
[2024-08-26 05:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.099) Loss 1.1416 (0.7945) Acc@1 72.949 (80.892) Acc@5 91.211 (95.829) Mem 7379MB
[2024-08-26 05:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.092) Loss 1.4268 (0.9192) Acc@1 64.160 (77.983) Acc@5 87.305 (94.197) Mem 7379MB
[2024-08-26 05:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.3535 (0.9910) Acc@1 66.504 (76.272) Acc@5 88.965 (93.378) Mem 7379MB
[2024-08-26 05:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.950 Acc@5 93.296
[2024-08-26 05:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 76.0%
[2024-08-26 05:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 75.95%
[2024-08-26 05:58:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 05:58:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 05:58:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][0/1251] eta 0:14:54 lr 0.000982 wd 0.0500 time 0.7151 (0.7151) data time 0.4843 (0.4843) model time 0.0000 (0.0000) loss 3.1604 (3.1604) grad_norm 1.5683 (1.5683) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][10/1251] eta 0:05:53 lr 0.000982 wd 0.0500 time 0.2442 (0.2850) data time 0.0009 (0.0449) model time 0.0000 (0.0000) loss 3.5226 (3.7535) grad_norm 1.7647 (1.6815) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][20/1251] eta 0:05:27 lr 0.000982 wd 0.0500 time 0.2421 (0.2658) data time 0.0010 (0.0240) model time 0.0000 (0.0000) loss 3.7574 (3.7067) grad_norm 2.4179 (1.8667) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][30/1251] eta 0:05:16 lr 0.000982 wd 0.0500 time 0.2406 (0.2590) data time 0.0010 (0.0167) model time 0.0000 (0.0000) loss 3.3111 (3.6873) grad_norm 2.6357 (1.9954) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][40/1251] eta 0:05:09 lr 0.000982 wd 0.0500 time 0.2420 (0.2556) data time 0.0008 (0.0128) model time 0.0000 (0.0000) loss 4.4074 (3.7280) grad_norm 2.4232 (2.0263) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][50/1251] eta 0:05:07 lr 0.000982 wd 0.0500 time 0.3903 (0.2558) data time 0.0012 (0.0105) model time 0.0000 (0.0000) loss 4.1267 (3.7409) grad_norm 1.6280 (1.9646) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][60/1251] eta 0:05:05 lr 0.000982 wd 0.0500 time 0.2406 (0.2566) data time 0.0010 (0.0090) model time 0.2396 (0.2596) loss 
3.5045 (3.7496) grad_norm 1.5182 (1.9502) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][70/1251] eta 0:05:00 lr 0.000982 wd 0.0500 time 0.2442 (0.2546) data time 0.0011 (0.0078) model time 0.2431 (0.2507) loss 3.4194 (3.7135) grad_norm 2.6030 (1.9590) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][80/1251] eta 0:04:56 lr 0.000982 wd 0.0500 time 0.2406 (0.2531) data time 0.0010 (0.0070) model time 0.2396 (0.2474) loss 3.2968 (3.7192) grad_norm 1.8308 (1.9364) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][90/1251] eta 0:04:52 lr 0.000982 wd 0.0500 time 0.2422 (0.2519) data time 0.0009 (0.0063) model time 0.2414 (0.2460) loss 4.3274 (3.7385) grad_norm 2.0706 (1.9224) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][100/1251] eta 0:04:48 lr 0.000982 wd 0.0500 time 0.2449 (0.2511) data time 0.0012 (0.0058) model time 0.2437 (0.2452) loss 3.8247 (3.7569) grad_norm 2.0457 (1.9536) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][110/1251] eta 0:04:45 lr 0.000982 wd 0.0500 time 0.2384 (0.2502) data time 0.0009 (0.0054) model time 0.2375 (0.2445) loss 3.7914 (3.7565) grad_norm 1.5660 (1.9320) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][120/1251] eta 0:04:42 lr 0.000982 wd 0.0500 time 0.2446 (0.2496) data time 0.0007 (0.0050) model time 0.2438 (0.2440) loss 4.1956 (3.7508) grad_norm 1.5591 (1.9115) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][130/1251] eta 0:04:39 lr 
0.000982 wd 0.0500 time 0.2420 (0.2491) data time 0.0011 (0.0047) model time 0.2409 (0.2437) loss 3.0820 (3.7356) grad_norm 1.8445 (1.9211) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][140/1251] eta 0:04:36 lr 0.000982 wd 0.0500 time 0.2495 (0.2488) data time 0.0007 (0.0045) model time 0.2488 (0.2437) loss 4.5177 (3.7408) grad_norm 2.2505 (1.9064) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][150/1251] eta 0:04:33 lr 0.000982 wd 0.0500 time 0.2500 (0.2483) data time 0.0011 (0.0042) model time 0.2488 (0.2435) loss 3.3718 (3.7526) grad_norm 1.9672 (1.9126) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][160/1251] eta 0:04:30 lr 0.000982 wd 0.0500 time 0.2516 (0.2480) data time 0.0010 (0.0040) model time 0.2506 (0.2434) loss 3.9445 (3.7589) grad_norm 1.7555 (1.9065) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][170/1251] eta 0:04:27 lr 0.000982 wd 0.0500 time 0.2355 (0.2476) data time 0.0010 (0.0039) model time 0.2346 (0.2430) loss 4.4087 (3.7512) grad_norm 2.8547 (1.9151) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][180/1251] eta 0:04:24 lr 0.000982 wd 0.0500 time 0.2449 (0.2473) data time 0.0008 (0.0037) model time 0.2442 (0.2429) loss 4.0766 (3.7323) grad_norm 2.3568 (1.9186) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][190/1251] eta 0:04:22 lr 0.000982 wd 0.0500 time 0.2411 (0.2469) data time 0.0011 (0.0036) model time 0.2400 (0.2427) loss 3.9651 (3.7373) grad_norm 2.1541 (1.9093) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:16 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][200/1251] eta 0:04:19 lr 0.000982 wd 0.0500 time 0.2446 (0.2468) data time 0.0009 (0.0034) model time 0.2437 (0.2427) loss 3.8340 (3.7433) grad_norm 1.7143 (1.9005) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][210/1251] eta 0:04:16 lr 0.000982 wd 0.0500 time 0.2459 (0.2466) data time 0.0009 (0.0033) model time 0.2450 (0.2426) loss 3.8121 (3.7342) grad_norm 1.6747 (1.8938) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][220/1251] eta 0:04:14 lr 0.000982 wd 0.0500 time 0.2425 (0.2464) data time 0.0010 (0.0032) model time 0.2415 (0.2426) loss 3.6141 (3.7431) grad_norm 1.9055 (1.9048) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][230/1251] eta 0:04:11 lr 0.000982 wd 0.0500 time 0.2380 (0.2461) data time 0.0011 (0.0031) model time 0.2369 (0.2423) loss 3.6622 (3.7270) grad_norm 1.8498 (1.9048) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][240/1251] eta 0:04:08 lr 0.000982 wd 0.0500 time 0.2444 (0.2459) data time 0.0007 (0.0030) model time 0.2437 (0.2422) loss 3.9779 (3.7115) grad_norm 1.9347 (1.8986) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][250/1251] eta 0:04:05 lr 0.000982 wd 0.0500 time 0.2415 (0.2458) data time 0.0007 (0.0029) model time 0.2408 (0.2422) loss 4.3854 (3.7075) grad_norm 1.5185 (1.8988) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][260/1251] eta 0:04:03 lr 0.000982 wd 0.0500 time 0.2417 (0.2456) data time 0.0008 (0.0029) model time 0.2409 (0.2421) loss 4.7104 
(3.7024) grad_norm 1.5170 (1.9059) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][270/1251] eta 0:04:00 lr 0.000982 wd 0.0500 time 0.2408 (0.2454) data time 0.0009 (0.0028) model time 0.2399 (0.2420) loss 2.6295 (3.6923) grad_norm 2.1249 (1.9038) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][280/1251] eta 0:03:58 lr 0.000982 wd 0.0500 time 0.2430 (0.2452) data time 0.0007 (0.0027) model time 0.2423 (0.2418) loss 4.7340 (3.6900) grad_norm 1.6138 (1.9087) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][290/1251] eta 0:03:55 lr 0.000982 wd 0.0500 time 0.2395 (0.2450) data time 0.0010 (0.0027) model time 0.2385 (0.2417) loss 3.7672 (3.6823) grad_norm 1.3911 (1.9129) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][300/1251] eta 0:03:52 lr 0.000982 wd 0.0500 time 0.2384 (0.2448) data time 0.0007 (0.0026) model time 0.2376 (0.2416) loss 2.7211 (3.6830) grad_norm 1.7177 (1.9207) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][310/1251] eta 0:03:50 lr 0.000982 wd 0.0500 time 0.2440 (0.2447) data time 0.0010 (0.0026) model time 0.2430 (0.2416) loss 3.1384 (3.6802) grad_norm 1.5187 (1.9256) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][320/1251] eta 0:03:47 lr 0.000982 wd 0.0500 time 0.2509 (0.2447) data time 0.0007 (0.0025) model time 0.2502 (0.2415) loss 2.5494 (3.6800) grad_norm 1.3521 (1.9222) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][330/1251] eta 0:03:45 lr 0.000982 wd 
0.0500 time 0.2409 (0.2446) data time 0.0011 (0.0025) model time 0.2398 (0.2416) loss 3.9945 (3.6934) grad_norm 1.9477 (1.9103) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][340/1251] eta 0:03:42 lr 0.000982 wd 0.0500 time 0.2385 (0.2445) data time 0.0011 (0.0024) model time 0.2374 (0.2415) loss 3.7854 (3.6948) grad_norm 1.6915 (1.9129) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][350/1251] eta 0:03:40 lr 0.000982 wd 0.0500 time 0.2406 (0.2444) data time 0.0010 (0.0024) model time 0.2396 (0.2414) loss 3.4359 (3.6981) grad_norm 1.5430 (1.9190) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][360/1251] eta 0:03:37 lr 0.000982 wd 0.0500 time 0.2430 (0.2442) data time 0.0008 (0.0024) model time 0.2422 (0.2413) loss 2.9122 (3.6997) grad_norm 1.3975 (1.9174) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][370/1251] eta 0:03:35 lr 0.000982 wd 0.0500 time 0.2406 (0.2441) data time 0.0009 (0.0023) model time 0.2397 (0.2413) loss 4.3147 (3.7017) grad_norm 2.7990 (1.9140) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 05:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][380/1251] eta 0:03:32 lr 0.000982 wd 0.0500 time 0.2453 (0.2441) data time 0.0008 (0.0023) model time 0.2446 (0.2412) loss 3.6264 (3.7074) grad_norm 1.5671 (1.9074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][390/1251] eta 0:03:30 lr 0.000982 wd 0.0500 time 0.2383 (0.2440) data time 0.0011 (0.0023) model time 0.2372 (0.2412) loss 4.2354 (3.7150) grad_norm 1.6454 (1.9074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:04 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][400/1251] eta 0:03:27 lr 0.000982 wd 0.0500 time 0.2434 (0.2439) data time 0.0011 (0.0022) model time 0.2423 (0.2412) loss 3.8419 (3.7145) grad_norm 2.5545 (1.9222) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][410/1251] eta 0:03:25 lr 0.000982 wd 0.0500 time 0.2362 (0.2439) data time 0.0010 (0.0022) model time 0.2352 (0.2412) loss 3.7133 (3.7197) grad_norm 2.1171 (1.9212) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][420/1251] eta 0:03:22 lr 0.000982 wd 0.0500 time 0.2441 (0.2438) data time 0.0009 (0.0022) model time 0.2431 (0.2411) loss 3.9035 (3.7261) grad_norm 1.6776 (1.9206) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][430/1251] eta 0:03:20 lr 0.000982 wd 0.0500 time 0.2418 (0.2438) data time 0.0008 (0.0022) model time 0.2410 (0.2412) loss 4.1360 (3.7242) grad_norm 1.6082 (1.9183) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][440/1251] eta 0:03:17 lr 0.000982 wd 0.0500 time 0.2475 (0.2438) data time 0.0007 (0.0022) model time 0.2468 (0.2412) loss 3.4993 (3.7270) grad_norm 1.5584 (1.9156) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][450/1251] eta 0:03:15 lr 0.000982 wd 0.0500 time 0.2388 (0.2442) data time 0.0007 (0.0021) model time 0.2381 (0.2417) loss 3.9314 (3.7173) grad_norm 2.0326 (1.9138) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][460/1251] eta 0:03:13 lr 0.000982 wd 0.0500 time 0.2397 (0.2451) data time 0.0009 (0.0021) model time 0.2389 (0.2427) loss 2.7428 
(3.7183) grad_norm 1.7072 (1.9196) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][470/1251] eta 0:03:11 lr 0.000982 wd 0.0500 time 0.2350 (0.2450) data time 0.0010 (0.0021) model time 0.2339 (0.2426) loss 3.7201 (3.7160) grad_norm 1.4003 (1.9173) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][480/1251] eta 0:03:08 lr 0.000982 wd 0.0500 time 0.2399 (0.2449) data time 0.0009 (0.0021) model time 0.2390 (0.2426) loss 3.3976 (3.7099) grad_norm 2.7377 (1.9160) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][490/1251] eta 0:03:06 lr 0.000982 wd 0.0500 time 0.2440 (0.2449) data time 0.0008 (0.0020) model time 0.2433 (0.2426) loss 3.9416 (3.7066) grad_norm 2.1022 (1.9193) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][500/1251] eta 0:03:03 lr 0.000982 wd 0.0500 time 0.2386 (0.2448) data time 0.0011 (0.0020) model time 0.2375 (0.2425) loss 3.6936 (3.7030) grad_norm 1.6415 (1.9194) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][510/1251] eta 0:03:01 lr 0.000982 wd 0.0500 time 0.2333 (0.2447) data time 0.0011 (0.0020) model time 0.2322 (0.2425) loss 4.1152 (3.7018) grad_norm 2.2156 (1.9225) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][520/1251] eta 0:02:58 lr 0.000982 wd 0.0500 time 0.2437 (0.2447) data time 0.0012 (0.0020) model time 0.2425 (0.2424) loss 3.9456 (3.7057) grad_norm 1.9337 (1.9203) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][530/1251] eta 0:02:56 lr 0.000982 wd 
0.0500 time 0.2370 (0.2446) data time 0.0009 (0.0020) model time 0.2361 (0.2424) loss 4.1595 (3.7057) grad_norm 1.7505 (1.9179) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][540/1251] eta 0:02:54 lr 0.000982 wd 0.0500 time 0.2406 (0.2449) data time 0.0011 (0.0020) model time 0.2395 (0.2427) loss 3.8009 (3.7066) grad_norm 2.1640 (1.9229) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][550/1251] eta 0:02:51 lr 0.000982 wd 0.0500 time 0.2358 (0.2449) data time 0.0011 (0.0019) model time 0.2348 (0.2427) loss 3.8660 (3.7015) grad_norm 1.6756 (1.9288) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][560/1251] eta 0:02:49 lr 0.000981 wd 0.0500 time 0.2408 (0.2449) data time 0.0010 (0.0019) model time 0.2398 (0.2427) loss 3.6626 (3.7000) grad_norm 1.7638 (1.9233) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][570/1251] eta 0:02:46 lr 0.000981 wd 0.0500 time 0.2494 (0.2449) data time 0.0010 (0.0019) model time 0.2484 (0.2427) loss 4.1694 (3.7003) grad_norm 1.8341 (1.9241) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][580/1251] eta 0:02:44 lr 0.000981 wd 0.0500 time 0.2397 (0.2456) data time 0.0007 (0.0019) model time 0.2390 (0.2435) loss 3.2179 (3.6923) grad_norm 2.5090 (1.9275) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][590/1251] eta 0:02:42 lr 0.000981 wd 0.0500 time 0.2390 (0.2455) data time 0.0010 (0.0019) model time 0.2380 (0.2435) loss 3.6474 (3.6885) grad_norm 2.1130 (1.9313) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][600/1251] eta 0:02:39 lr 0.000981 wd 0.0500 time 0.2458 (0.2455) data time 0.0010 (0.0019) model time 0.2448 (0.2434) loss 3.3968 (3.6865) grad_norm 3.5997 (1.9365) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][610/1251] eta 0:02:37 lr 0.000981 wd 0.0500 time 0.2403 (0.2454) data time 0.0010 (0.0019) model time 0.2393 (0.2434) loss 3.8810 (3.6831) grad_norm 2.1108 (1.9371) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][620/1251] eta 0:02:34 lr 0.000981 wd 0.0500 time 0.2383 (0.2454) data time 0.0010 (0.0019) model time 0.2373 (0.2434) loss 3.7016 (3.6898) grad_norm 1.9329 (1.9356) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][630/1251] eta 0:02:32 lr 0.000981 wd 0.0500 time 0.2453 (0.2454) data time 0.0007 (0.0019) model time 0.2445 (0.2433) loss 3.0376 (3.6894) grad_norm 1.9924 (1.9386) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][640/1251] eta 0:02:29 lr 0.000981 wd 0.0500 time 0.2430 (0.2454) data time 0.0009 (0.0019) model time 0.2421 (0.2433) loss 3.0746 (3.6872) grad_norm 2.6428 (1.9451) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][650/1251] eta 0:02:27 lr 0.000981 wd 0.0500 time 0.2394 (0.2453) data time 0.0012 (0.0019) model time 0.2382 (0.2433) loss 3.4558 (3.6883) grad_norm 2.2757 (1.9472) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][660/1251] eta 0:02:24 lr 0.000981 wd 0.0500 time 0.2527 (0.2453) data time 0.0012 (0.0019) model time 0.2515 (0.2433) loss 3.9575 
(3.6887) grad_norm 1.8747 (1.9474) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][670/1251] eta 0:02:22 lr 0.000981 wd 0.0500 time 0.2403 (0.2452) data time 0.0009 (0.0019) model time 0.2394 (0.2432) loss 3.6455 (3.6921) grad_norm 2.0286 (1.9488) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][680/1251] eta 0:02:19 lr 0.000981 wd 0.0500 time 0.2455 (0.2452) data time 0.0012 (0.0018) model time 0.2443 (0.2432) loss 3.7879 (3.6911) grad_norm 1.2093 (1.9487) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][690/1251] eta 0:02:17 lr 0.000981 wd 0.0500 time 0.2366 (0.2451) data time 0.0011 (0.0018) model time 0.2355 (0.2431) loss 3.7331 (3.6900) grad_norm 2.2280 (1.9456) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][700/1251] eta 0:02:15 lr 0.000981 wd 0.0500 time 0.2408 (0.2451) data time 0.0009 (0.0018) model time 0.2399 (0.2431) loss 3.5486 (3.6893) grad_norm 1.9615 (1.9453) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][710/1251] eta 0:02:12 lr 0.000981 wd 0.0500 time 0.2416 (0.2451) data time 0.0009 (0.0018) model time 0.2407 (0.2431) loss 2.4282 (3.6916) grad_norm 2.1268 (1.9431) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][720/1251] eta 0:02:10 lr 0.000981 wd 0.0500 time 0.2414 (0.2450) data time 0.0010 (0.0018) model time 0.2405 (0.2431) loss 3.7245 (3.6938) grad_norm 1.6144 (1.9414) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][730/1251] eta 0:02:07 lr 0.000981 wd 
0.0500 time 0.2368 (0.2450) data time 0.0009 (0.0018) model time 0.2359 (0.2430) loss 3.7730 (3.6902) grad_norm 2.2934 (1.9421) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][740/1251] eta 0:02:05 lr 0.000981 wd 0.0500 time 0.2395 (0.2449) data time 0.0007 (0.0018) model time 0.2388 (0.2429) loss 4.4228 (3.6891) grad_norm 2.7456 (1.9450) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][750/1251] eta 0:02:02 lr 0.000981 wd 0.0500 time 0.2398 (0.2448) data time 0.0012 (0.0018) model time 0.2386 (0.2429) loss 2.6685 (3.6887) grad_norm 2.1568 (1.9425) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][760/1251] eta 0:02:00 lr 0.000981 wd 0.0500 time 0.2454 (0.2448) data time 0.0007 (0.0018) model time 0.2447 (0.2429) loss 3.0644 (3.6896) grad_norm 2.4944 (1.9412) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][770/1251] eta 0:01:57 lr 0.000981 wd 0.0500 time 0.2440 (0.2448) data time 0.0008 (0.0018) model time 0.2432 (0.2428) loss 2.7209 (3.6876) grad_norm 2.3548 (1.9430) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][780/1251] eta 0:01:55 lr 0.000981 wd 0.0500 time 0.2358 (0.2447) data time 0.0009 (0.0018) model time 0.2349 (0.2428) loss 2.9331 (3.6876) grad_norm 1.5659 (1.9425) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][790/1251] eta 0:01:52 lr 0.000981 wd 0.0500 time 0.2415 (0.2447) data time 0.0008 (0.0018) model time 0.2407 (0.2428) loss 2.3111 (3.6864) grad_norm 1.4086 (1.9388) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:42 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][800/1251] eta 0:01:50 lr 0.000981 wd 0.0500 time 0.2404 (0.2447) data time 0.0007 (0.0018) model time 0.2397 (0.2427) loss 4.0169 (3.6886) grad_norm 1.3907 (1.9361) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][810/1251] eta 0:01:47 lr 0.000981 wd 0.0500 time 0.2526 (0.2446) data time 0.0008 (0.0018) model time 0.2519 (0.2427) loss 4.0850 (3.6904) grad_norm 1.9170 (1.9343) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][820/1251] eta 0:01:45 lr 0.000981 wd 0.0500 time 0.2326 (0.2449) data time 0.0013 (0.0017) model time 0.2314 (0.2430) loss 4.2059 (3.6896) grad_norm 2.7470 (1.9379) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][830/1251] eta 0:01:43 lr 0.000981 wd 0.0500 time 0.2376 (0.2451) data time 0.0011 (0.0017) model time 0.2366 (0.2433) loss 3.2445 (3.6840) grad_norm 1.9277 (1.9399) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][840/1251] eta 0:01:40 lr 0.000981 wd 0.0500 time 0.2377 (0.2451) data time 0.0009 (0.0017) model time 0.2368 (0.2432) loss 2.5506 (3.6804) grad_norm 2.0539 (1.9369) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][850/1251] eta 0:01:38 lr 0.000981 wd 0.0500 time 0.2391 (0.2451) data time 0.0010 (0.0017) model time 0.2381 (0.2432) loss 4.2843 (3.6828) grad_norm 2.6264 (1.9353) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][860/1251] eta 0:01:35 lr 0.000981 wd 0.0500 time 0.2424 (0.2451) data time 0.0011 (0.0017) model time 0.2414 (0.2432) loss 3.8110 
(3.6856) grad_norm 1.4128 (1.9357) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][870/1251] eta 0:01:33 lr 0.000981 wd 0.0500 time 0.2428 (0.2450) data time 0.0007 (0.0017) model time 0.2420 (0.2432) loss 3.7320 (3.6858) grad_norm 1.7640 (1.9339) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][880/1251] eta 0:01:30 lr 0.000981 wd 0.0500 time 0.2332 (0.2450) data time 0.0011 (0.0017) model time 0.2321 (0.2432) loss 3.4999 (3.6847) grad_norm 1.4463 (1.9318) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][890/1251] eta 0:01:28 lr 0.000981 wd 0.0500 time 0.2406 (0.2450) data time 0.0009 (0.0017) model time 0.2397 (0.2432) loss 3.3408 (3.6828) grad_norm 1.5852 (1.9304) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][900/1251] eta 0:01:25 lr 0.000981 wd 0.0500 time 0.2475 (0.2450) data time 0.0008 (0.0017) model time 0.2467 (0.2432) loss 3.3941 (3.6834) grad_norm 1.9457 (1.9288) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][910/1251] eta 0:01:23 lr 0.000981 wd 0.0500 time 0.2484 (0.2450) data time 0.0010 (0.0017) model time 0.2474 (0.2432) loss 3.5597 (3.6834) grad_norm 2.7272 (1.9319) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][920/1251] eta 0:01:21 lr 0.000981 wd 0.0500 time 0.2410 (0.2449) data time 0.0010 (0.0017) model time 0.2400 (0.2431) loss 3.8633 (3.6821) grad_norm 2.4226 (1.9335) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][930/1251] eta 0:01:18 lr 0.000981 wd 
0.0500 time 0.2315 (0.2449) data time 0.0009 (0.0017) model time 0.2306 (0.2431) loss 4.4899 (3.6828) grad_norm 1.8767 (1.9332) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][940/1251] eta 0:01:16 lr 0.000981 wd 0.0500 time 0.2457 (0.2448) data time 0.0008 (0.0017) model time 0.2449 (0.2431) loss 3.4874 (3.6815) grad_norm 1.6423 (1.9307) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][950/1251] eta 0:01:13 lr 0.000981 wd 0.0500 time 0.2420 (0.2448) data time 0.0008 (0.0016) model time 0.2412 (0.2430) loss 3.2076 (3.6814) grad_norm 2.2923 (1.9319) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][960/1251] eta 0:01:11 lr 0.000981 wd 0.0500 time 0.2436 (0.2448) data time 0.0010 (0.0016) model time 0.2426 (0.2430) loss 3.2111 (3.6831) grad_norm 1.8405 (1.9343) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][970/1251] eta 0:01:08 lr 0.000981 wd 0.0500 time 0.2428 (0.2449) data time 0.0009 (0.0016) model time 0.2419 (0.2432) loss 3.3057 (3.6831) grad_norm 1.5784 (1.9408) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][980/1251] eta 0:01:06 lr 0.000981 wd 0.0500 time 0.2538 (0.2452) data time 0.0012 (0.0016) model time 0.2526 (0.2435) loss 3.8782 (3.6826) grad_norm 1.4987 (1.9395) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][990/1251] eta 0:01:03 lr 0.000981 wd 0.0500 time 0.2422 (0.2451) data time 0.0010 (0.0016) model time 0.2412 (0.2434) loss 4.2781 (3.6858) grad_norm 1.7335 (1.9399) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1000/1251] eta 0:01:01 lr 0.000981 wd 0.0500 time 0.2655 (0.2451) data time 0.0007 (0.0016) model time 0.2648 (0.2434) loss 2.7774 (3.6843) grad_norm 1.8483 (1.9427) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1010/1251] eta 0:00:59 lr 0.000981 wd 0.0500 time 0.2527 (0.2451) data time 0.0010 (0.0016) model time 0.2517 (0.2433) loss 4.2230 (3.6863) grad_norm 1.9548 (1.9442) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1020/1251] eta 0:00:56 lr 0.000981 wd 0.0500 time 0.2458 (0.2450) data time 0.0008 (0.0016) model time 0.2450 (0.2433) loss 4.3547 (3.6831) grad_norm 1.4521 (1.9425) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1030/1251] eta 0:00:54 lr 0.000981 wd 0.0500 time 0.2424 (0.2450) data time 0.0010 (0.0016) model time 0.2414 (0.2433) loss 3.5813 (3.6820) grad_norm 1.6390 (1.9421) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1040/1251] eta 0:00:51 lr 0.000981 wd 0.0500 time 0.2415 (0.2450) data time 0.0009 (0.0016) model time 0.2406 (0.2433) loss 4.1334 (3.6824) grad_norm 1.7286 (1.9402) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1050/1251] eta 0:00:49 lr 0.000981 wd 0.0500 time 0.2387 (0.2449) data time 0.0011 (0.0016) model time 0.2376 (0.2432) loss 4.3035 (3.6850) grad_norm 2.5648 (1.9391) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1060/1251] eta 0:00:46 lr 0.000981 wd 0.0500 time 0.2403 (0.2449) data time 0.0011 (0.0016) model time 0.2391 (0.2432) loss 2.7766 
(3.6854) grad_norm 1.5476 (1.9367) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1070/1251] eta 0:00:44 lr 0.000981 wd 0.0500 time 0.2412 (0.2449) data time 0.0009 (0.0016) model time 0.2404 (0.2432) loss 3.6412 (3.6877) grad_norm 1.9563 (1.9358) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1080/1251] eta 0:00:41 lr 0.000981 wd 0.0500 time 0.2417 (0.2449) data time 0.0009 (0.0016) model time 0.2407 (0.2432) loss 3.5617 (3.6894) grad_norm 2.7884 (1.9348) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1090/1251] eta 0:00:39 lr 0.000981 wd 0.0500 time 0.2541 (0.2448) data time 0.0012 (0.0016) model time 0.2529 (0.2431) loss 4.0589 (3.6884) grad_norm 1.5720 (1.9376) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1100/1251] eta 0:00:36 lr 0.000981 wd 0.0500 time 0.2346 (0.2448) data time 0.0010 (0.0016) model time 0.2336 (0.2431) loss 2.5384 (3.6869) grad_norm 1.2957 (1.9368) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1110/1251] eta 0:00:34 lr 0.000981 wd 0.0500 time 0.2335 (0.2448) data time 0.0012 (0.0016) model time 0.2324 (0.2431) loss 3.5390 (3.6872) grad_norm 1.7300 (1.9351) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1120/1251] eta 0:00:32 lr 0.000981 wd 0.0500 time 0.2484 (0.2448) data time 0.0007 (0.0016) model time 0.2477 (0.2431) loss 3.5967 (3.6860) grad_norm 2.1926 (1.9351) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1130/1251] eta 0:00:29 lr 
0.000981 wd 0.0500 time 0.2434 (0.2447) data time 0.0008 (0.0016) model time 0.2426 (0.2430) loss 3.4533 (3.6857) grad_norm 1.7161 (1.9331) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1140/1251] eta 0:00:27 lr 0.000981 wd 0.0500 time 0.2349 (0.2447) data time 0.0010 (0.0016) model time 0.2338 (0.2430) loss 3.6691 (3.6811) grad_norm 1.7663 (1.9357) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1150/1251] eta 0:00:24 lr 0.000981 wd 0.0500 time 0.2350 (0.2447) data time 0.0013 (0.0016) model time 0.2337 (0.2430) loss 3.7106 (3.6833) grad_norm 2.5348 (1.9349) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1160/1251] eta 0:00:22 lr 0.000981 wd 0.0500 time 0.2430 (0.2447) data time 0.0010 (0.0016) model time 0.2421 (0.2430) loss 2.6173 (3.6836) grad_norm 2.6277 (1.9369) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1170/1251] eta 0:00:19 lr 0.000981 wd 0.0500 time 0.2656 (0.2447) data time 0.0007 (0.0016) model time 0.2649 (0.2430) loss 4.1728 (3.6819) grad_norm 2.4013 (1.9391) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1180/1251] eta 0:00:17 lr 0.000981 wd 0.0500 time 0.2437 (0.2446) data time 0.0011 (0.0016) model time 0.2426 (0.2429) loss 3.6569 (3.6807) grad_norm 1.4845 (1.9387) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1190/1251] eta 0:00:14 lr 0.000981 wd 0.0500 time 0.2387 (0.2446) data time 0.0009 (0.0016) model time 0.2378 (0.2429) loss 4.3535 (3.6816) grad_norm 1.6750 (1.9407) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
06:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1200/1251] eta 0:00:12 lr 0.000981 wd 0.0500 time 0.2461 (0.2446) data time 0.0007 (0.0016) model time 0.2453 (0.2429) loss 4.1304 (3.6826) grad_norm 2.8541 (1.9409) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1210/1251] eta 0:00:10 lr 0.000981 wd 0.0500 time 0.2396 (0.2445) data time 0.0009 (0.0015) model time 0.2387 (0.2429) loss 2.8111 (3.6835) grad_norm 2.3600 (1.9412) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1220/1251] eta 0:00:07 lr 0.000981 wd 0.0500 time 0.2432 (0.2445) data time 0.0009 (0.0015) model time 0.2424 (0.2429) loss 4.2323 (3.6845) grad_norm 1.4026 (1.9405) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1230/1251] eta 0:00:05 lr 0.000981 wd 0.0500 time 0.2452 (0.2445) data time 0.0007 (0.0015) model time 0.2444 (0.2428) loss 2.4409 (3.6842) grad_norm 1.3804 (1.9419) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1240/1251] eta 0:00:02 lr 0.000981 wd 0.0500 time 0.2252 (0.2444) data time 0.0007 (0.0015) model time 0.2245 (0.2427) loss 2.5844 (3.6840) grad_norm 1.9898 (1.9435) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [44/300][1250/1251] eta 0:00:00 lr 0.000981 wd 0.0500 time 0.2252 (0.2442) data time 0.0007 (0.0015) model time 0.2245 (0.2426) loss 3.5873 (3.6837) grad_norm 3.0323 (1.9449) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 44 training takes 0:05:05 [2024-08-26 06:03:32 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 06:03:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 06:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.436 (0.436) Loss 0.5518 (0.5518) Acc@1 88.379 (88.379) Acc@5 97.363 (97.363) Mem 7379MB
[2024-08-26 06:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.110) Loss 0.8574 (0.8796) Acc@1 82.422 (80.034) Acc@5 95.801 (95.224) Mem 7379MB
[2024-08-26 06:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.096) Loss 1.2490 (0.9079) Acc@1 69.727 (78.850) Acc@5 90.723 (95.131) Mem 7379MB
[2024-08-26 06:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.089 (0.090) Loss 1.5713 (1.0410) Acc@1 63.086 (75.784) Acc@5 85.059 (93.400) Mem 7379MB
[2024-08-26 06:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.4443 (1.1141) Acc@1 64.941 (74.252) Acc@5 87.988 (92.349) Mem 7379MB
[2024-08-26 06:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.930 Acc@5 92.270
[2024-08-26 06:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.9%
[2024-08-26 06:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 73.93%
[2024-08-26 06:03:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 06:03:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 06:03:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.483 (0.483) Loss 0.4680 (0.4680) Acc@1 89.844 (89.844) Acc@5 97.949 (97.949) Mem 7379MB
[2024-08-26 06:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.113) Loss 0.7979 (0.7751) Acc@1 82.227 (81.925) Acc@5 95.703 (95.934) Mem 7379MB
[2024-08-26 06:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.098) Loss 1.1377 (0.7912) Acc@1 73.145 (81.106) Acc@5 91.406 (95.936) Mem 7379MB
[2024-08-26 06:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.092) Loss 1.4180 (0.9148) Acc@1 63.867 (78.185) Acc@5 87.500 (94.333) Mem 7379MB
[2024-08-26 06:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.3438 (0.9855) Acc@1 66.699 (76.484) Acc@5 88.965 (93.498) Mem 7379MB
[2024-08-26 06:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.156 Acc@5 93.426
[2024-08-26 06:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 76.2%
[2024-08-26 06:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 76.16%
[2024-08-26 06:03:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 06:03:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 06:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][0/1251] eta 0:14:08 lr 0.000981 wd 0.0500 time 0.6781 (0.6781) data time 0.4520 (0.4520) model time 0.0000 (0.0000) loss 2.6238 (2.6238) grad_norm 2.0776 (2.0776) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][10/1251] eta 0:05:48 lr 0.000981 wd 0.0500 time 0.2369 (0.2812) data time 0.0009 (0.0420) model time 0.0000 (0.0000) loss 3.8542 (3.6122) grad_norm 2.5837 (2.1942) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][20/1251] eta 0:05:35 lr 0.000981 wd 0.0500 time 0.2508 (0.2724) data time 0.0010 (0.0225) model time 0.0000 (0.0000) loss 3.8703 (3.4962) grad_norm 2.0103 (2.0788) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][30/1251] eta 0:05:20 lr 0.000981 wd 0.0500 time 0.2348 (0.2629) data time 0.0009 (0.0156) model time 0.0000 (0.0000) loss 3.9163 (3.5623) grad_norm 1.6629 (2.1524) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][40/1251] eta 0:05:12 lr 0.000981 wd 0.0500 time 0.2433 (0.2582) data time 0.0009 (0.0120) model time 0.0000 (0.0000) loss 3.6559 (3.5456) grad_norm 1.6018 (2.1683) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][50/1251] eta 0:05:06 lr 0.000981 wd 0.0500 time 0.2452 (0.2551) data time 0.0008 (0.0099) model time 0.0000 (0.0000) loss 4.2513 (3.5556) grad_norm 1.5376 (2.1195) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][60/1251] eta 0:05:01 lr 0.000981 wd 0.0500 time 0.2440 (0.2534) data time 0.0009 (0.0085) model time 0.2431 (0.2433) loss 
3.0503 (3.5537) grad_norm 2.0579 (2.0970) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][70/1251] eta 0:04:57 lr 0.000981 wd 0.0500 time 0.2443 (0.2520) data time 0.0010 (0.0074) model time 0.2433 (0.2429) loss 3.0092 (3.5532) grad_norm 1.5104 (2.0487) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][80/1251] eta 0:04:53 lr 0.000981 wd 0.0500 time 0.2337 (0.2507) data time 0.0009 (0.0066) model time 0.2328 (0.2420) loss 4.1237 (3.5544) grad_norm 2.0028 (2.0748) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][90/1251] eta 0:04:49 lr 0.000981 wd 0.0500 time 0.2346 (0.2498) data time 0.0009 (0.0060) model time 0.2336 (0.2418) loss 3.6727 (3.5656) grad_norm 2.7713 (2.0665) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][100/1251] eta 0:04:46 lr 0.000981 wd 0.0500 time 0.2398 (0.2490) data time 0.0010 (0.0055) model time 0.2387 (0.2416) loss 3.7183 (3.5840) grad_norm 2.8340 (2.0426) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][110/1251] eta 0:04:43 lr 0.000981 wd 0.0500 time 0.2382 (0.2485) data time 0.0008 (0.0051) model time 0.2374 (0.2418) loss 4.6127 (3.6221) grad_norm 2.2134 (2.0465) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][120/1251] eta 0:04:40 lr 0.000981 wd 0.0500 time 0.2366 (0.2479) data time 0.0010 (0.0048) model time 0.2356 (0.2416) loss 3.6255 (3.6089) grad_norm 1.5220 (2.0334) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][130/1251] eta 0:04:37 lr 
0.000980 wd 0.0500 time 0.2391 (0.2474) data time 0.0011 (0.0045) model time 0.2380 (0.2415) loss 3.7969 (3.6154) grad_norm 1.9089 (2.0047) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][140/1251] eta 0:04:34 lr 0.000980 wd 0.0500 time 0.2388 (0.2470) data time 0.0010 (0.0042) model time 0.2378 (0.2414) loss 3.6581 (3.6279) grad_norm 1.8793 (1.9878) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][150/1251] eta 0:04:34 lr 0.000980 wd 0.0500 time 0.2393 (0.2496) data time 0.0010 (0.0040) model time 0.2383 (0.2457) loss 3.6252 (3.6347) grad_norm 1.6230 (1.9654) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][160/1251] eta 0:04:31 lr 0.000980 wd 0.0500 time 0.2449 (0.2491) data time 0.0008 (0.0038) model time 0.2441 (0.2453) loss 2.8708 (3.6468) grad_norm 2.0033 (1.9561) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][170/1251] eta 0:04:28 lr 0.000980 wd 0.0500 time 0.2458 (0.2488) data time 0.0009 (0.0037) model time 0.2450 (0.2451) loss 4.0476 (3.6474) grad_norm 1.4308 (1.9487) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][180/1251] eta 0:04:27 lr 0.000980 wd 0.0500 time 0.2439 (0.2496) data time 0.0007 (0.0035) model time 0.2431 (0.2463) loss 3.9065 (3.6660) grad_norm 1.6808 (1.9361) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][190/1251] eta 0:04:25 lr 0.000980 wd 0.0500 time 0.2397 (0.2504) data time 0.0009 (0.0034) model time 0.2388 (0.2477) loss 3.4074 (3.6684) grad_norm 1.4858 (1.9247) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][200/1251] eta 0:04:22 lr 0.000980 wd 0.0500 time 0.2435 (0.2500) data time 0.0012 (0.0033) model time 0.2424 (0.2472) loss 3.6986 (3.6727) grad_norm 3.5152 (1.9502) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][210/1251] eta 0:04:19 lr 0.000980 wd 0.0500 time 0.2472 (0.2496) data time 0.0007 (0.0032) model time 0.2465 (0.2467) loss 4.1595 (3.6755) grad_norm 2.1567 (1.9871) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][220/1251] eta 0:04:16 lr 0.000980 wd 0.0500 time 0.2535 (0.2492) data time 0.0010 (0.0031) model time 0.2525 (0.2464) loss 3.7803 (3.6820) grad_norm 1.7174 (1.9713) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][230/1251] eta 0:04:14 lr 0.000980 wd 0.0500 time 0.2376 (0.2488) data time 0.0009 (0.0030) model time 0.2367 (0.2460) loss 2.8664 (3.6740) grad_norm 2.1987 (1.9604) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][240/1251] eta 0:04:11 lr 0.000980 wd 0.0500 time 0.2328 (0.2491) data time 0.0009 (0.0029) model time 0.2320 (0.2464) loss 4.0663 (3.6765) grad_norm 1.9352 (1.9557) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][250/1251] eta 0:04:10 lr 0.000980 wd 0.0500 time 0.2386 (0.2500) data time 0.0009 (0.0028) model time 0.2377 (0.2477) loss 3.6972 (3.6781) grad_norm 1.5606 (1.9549) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][260/1251] eta 0:04:07 lr 0.000980 wd 0.0500 time 0.2418 (0.2497) data time 0.0009 (0.0028) model time 0.2408 (0.2473) loss 3.9510 
(3.6847) grad_norm 1.9197 (1.9471) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][270/1251] eta 0:04:04 lr 0.000980 wd 0.0500 time 0.2379 (0.2493) data time 0.0008 (0.0027) model time 0.2371 (0.2470) loss 3.4739 (3.6694) grad_norm 1.6214 (1.9407) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][280/1251] eta 0:04:01 lr 0.000980 wd 0.0500 time 0.2417 (0.2491) data time 0.0007 (0.0026) model time 0.2409 (0.2467) loss 3.5041 (3.6752) grad_norm 1.4506 (1.9377) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][290/1251] eta 0:03:59 lr 0.000980 wd 0.0500 time 0.2337 (0.2488) data time 0.0008 (0.0026) model time 0.2328 (0.2464) loss 4.5648 (3.6791) grad_norm 2.4200 (1.9343) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][300/1251] eta 0:03:56 lr 0.000980 wd 0.0500 time 0.2400 (0.2485) data time 0.0009 (0.0025) model time 0.2391 (0.2461) loss 2.9228 (3.6788) grad_norm 2.0145 (1.9447) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][310/1251] eta 0:03:53 lr 0.000980 wd 0.0500 time 0.2406 (0.2483) data time 0.0010 (0.0025) model time 0.2396 (0.2459) loss 3.9884 (3.6868) grad_norm 2.3538 (1.9578) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][320/1251] eta 0:03:50 lr 0.000980 wd 0.0500 time 0.2462 (0.2481) data time 0.0010 (0.0024) model time 0.2452 (0.2457) loss 3.7971 (3.6789) grad_norm 1.4624 (1.9489) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][330/1251] eta 0:03:48 lr 0.000980 wd 
0.0500 time 0.2445 (0.2480) data time 0.0007 (0.0024) model time 0.2437 (0.2457) loss 4.1100 (3.6794) grad_norm 1.7749 (1.9455) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][340/1251] eta 0:03:45 lr 0.000980 wd 0.0500 time 0.2360 (0.2478) data time 0.0008 (0.0024) model time 0.2351 (0.2454) loss 2.9481 (3.6811) grad_norm 1.4445 (1.9487) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][350/1251] eta 0:03:43 lr 0.000980 wd 0.0500 time 0.2408 (0.2476) data time 0.0010 (0.0023) model time 0.2398 (0.2453) loss 3.2898 (3.6861) grad_norm 1.9809 (1.9474) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][360/1251] eta 0:03:40 lr 0.000980 wd 0.0500 time 0.2477 (0.2474) data time 0.0007 (0.0023) model time 0.2470 (0.2451) loss 3.5113 (3.6898) grad_norm 2.0503 (1.9477) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][370/1251] eta 0:03:37 lr 0.000980 wd 0.0500 time 0.2424 (0.2473) data time 0.0010 (0.0023) model time 0.2414 (0.2450) loss 3.9982 (3.6944) grad_norm 1.5916 (1.9427) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][380/1251] eta 0:03:35 lr 0.000980 wd 0.0500 time 0.2391 (0.2472) data time 0.0010 (0.0022) model time 0.2381 (0.2449) loss 3.5836 (3.6955) grad_norm 2.1952 (1.9411) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][390/1251] eta 0:03:32 lr 0.000980 wd 0.0500 time 0.2424 (0.2471) data time 0.0008 (0.0022) model time 0.2416 (0.2449) loss 3.3402 (3.6926) grad_norm 2.8329 (1.9442) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:05:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][400/1251] eta 0:03:30 lr 0.000980 wd 0.0500 time 0.2535 (0.2470) data time 0.0007 (0.0022) model time 0.2528 (0.2448) loss 4.3951 (3.6867) grad_norm 2.5404 (1.9423) loss_scale 8192.0000 (4106.2145) mem 7379MB [2024-08-26 06:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][410/1251] eta 0:03:27 lr 0.000980 wd 0.0500 time 0.2369 (0.2469) data time 0.0009 (0.0021) model time 0.2360 (0.2446) loss 4.0327 (3.6862) grad_norm 1.6937 (1.9382) loss_scale 8192.0000 (4205.6253) mem 7379MB [2024-08-26 06:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][420/1251] eta 0:03:25 lr 0.000980 wd 0.0500 time 0.2420 (0.2467) data time 0.0009 (0.0021) model time 0.2411 (0.2445) loss 4.1769 (3.6934) grad_norm 1.7175 (1.9358) loss_scale 8192.0000 (4300.3135) mem 7379MB [2024-08-26 06:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][430/1251] eta 0:03:22 lr 0.000980 wd 0.0500 time 0.2473 (0.2466) data time 0.0009 (0.0021) model time 0.2464 (0.2444) loss 2.4541 (3.6888) grad_norm 1.7663 (1.9348) loss_scale 8192.0000 (4390.6079) mem 7379MB [2024-08-26 06:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][440/1251] eta 0:03:19 lr 0.000980 wd 0.0500 time 0.2363 (0.2465) data time 0.0009 (0.0021) model time 0.2354 (0.2443) loss 4.1141 (3.6952) grad_norm 1.7549 (1.9341) loss_scale 8192.0000 (4476.8073) mem 7379MB [2024-08-26 06:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][450/1251] eta 0:03:17 lr 0.000980 wd 0.0500 time 0.2337 (0.2464) data time 0.0009 (0.0020) model time 0.2327 (0.2442) loss 4.3157 (3.6912) grad_norm 1.5541 (1.9305) loss_scale 8192.0000 (4559.1840) mem 7379MB [2024-08-26 06:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][460/1251] eta 0:03:14 lr 0.000980 wd 0.0500 time 0.2478 (0.2464) data time 0.0009 (0.0020) model time 0.2468 (0.2443) loss 3.8065 
(3.6854) grad_norm 2.2910 (1.9304) loss_scale 8192.0000 (4637.9870) mem 7379MB [2024-08-26 06:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][470/1251] eta 0:03:12 lr 0.000980 wd 0.0500 time 0.2414 (0.2463) data time 0.0008 (0.0020) model time 0.2406 (0.2441) loss 4.0050 (3.6857) grad_norm 2.0759 (1.9328) loss_scale 8192.0000 (4713.4437) mem 7379MB [2024-08-26 06:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][480/1251] eta 0:03:09 lr 0.000980 wd 0.0500 time 0.2522 (0.2462) data time 0.0008 (0.0020) model time 0.2514 (0.2441) loss 3.0742 (3.6867) grad_norm 1.4374 (1.9284) loss_scale 8192.0000 (4785.7630) mem 7379MB [2024-08-26 06:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][490/1251] eta 0:03:07 lr 0.000980 wd 0.0500 time 0.2364 (0.2461) data time 0.0009 (0.0020) model time 0.2354 (0.2440) loss 4.0447 (3.6854) grad_norm 2.2149 (1.9232) loss_scale 8192.0000 (4855.1365) mem 7379MB [2024-08-26 06:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][500/1251] eta 0:03:04 lr 0.000980 wd 0.0500 time 0.2437 (0.2459) data time 0.0010 (0.0019) model time 0.2427 (0.2439) loss 3.4782 (3.6857) grad_norm 2.5387 (1.9227) loss_scale 8192.0000 (4921.7405) mem 7379MB [2024-08-26 06:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][510/1251] eta 0:03:02 lr 0.000980 wd 0.0500 time 0.2383 (0.2458) data time 0.0013 (0.0019) model time 0.2370 (0.2438) loss 4.1188 (3.6859) grad_norm 1.9073 (1.9174) loss_scale 8192.0000 (4985.7378) mem 7379MB [2024-08-26 06:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][520/1251] eta 0:02:59 lr 0.000980 wd 0.0500 time 0.2409 (0.2457) data time 0.0007 (0.0019) model time 0.2401 (0.2437) loss 2.8134 (3.6808) grad_norm 4.1589 (1.9378) loss_scale 8192.0000 (5047.2783) mem 7379MB [2024-08-26 06:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][530/1251] eta 0:02:57 lr 0.000980 wd 
0.0500 time 0.2400 (0.2456) data time 0.0007 (0.0019) model time 0.2392 (0.2436) loss 3.3230 (3.6786) grad_norm 2.0824 (1.9412) loss_scale 8192.0000 (5106.5009) mem 7379MB [2024-08-26 06:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][540/1251] eta 0:02:54 lr 0.000980 wd 0.0500 time 0.2469 (0.2456) data time 0.0009 (0.0019) model time 0.2459 (0.2435) loss 3.1124 (3.6791) grad_norm 2.3399 (1.9432) loss_scale 8192.0000 (5163.5342) mem 7379MB [2024-08-26 06:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][550/1251] eta 0:02:52 lr 0.000980 wd 0.0500 time 0.2384 (0.2455) data time 0.0011 (0.0019) model time 0.2374 (0.2435) loss 3.8593 (3.6773) grad_norm 1.6595 (1.9487) loss_scale 8192.0000 (5218.4973) mem 7379MB [2024-08-26 06:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][560/1251] eta 0:02:49 lr 0.000980 wd 0.0500 time 0.2437 (0.2455) data time 0.0010 (0.0018) model time 0.2427 (0.2434) loss 3.5583 (3.6766) grad_norm 2.0419 (1.9502) loss_scale 8192.0000 (5271.5009) mem 7379MB [2024-08-26 06:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][570/1251] eta 0:02:47 lr 0.000980 wd 0.0500 time 0.2415 (0.2454) data time 0.0008 (0.0018) model time 0.2407 (0.2434) loss 3.8926 (3.6751) grad_norm 1.9048 (1.9468) loss_scale 8192.0000 (5322.6480) mem 7379MB [2024-08-26 06:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][580/1251] eta 0:02:44 lr 0.000980 wd 0.0500 time 0.2443 (0.2454) data time 0.0009 (0.0018) model time 0.2435 (0.2434) loss 3.7147 (3.6727) grad_norm 2.1201 (1.9416) loss_scale 8192.0000 (5372.0344) mem 7379MB [2024-08-26 06:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][590/1251] eta 0:02:42 lr 0.000980 wd 0.0500 time 0.2458 (0.2453) data time 0.0007 (0.0018) model time 0.2451 (0.2433) loss 2.3143 (3.6687) grad_norm 2.2732 (1.9419) loss_scale 8192.0000 (5419.7496) mem 7379MB [2024-08-26 06:06:09 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][600/1251] eta 0:02:39 lr 0.000980 wd 0.0500 time 0.2433 (0.2452) data time 0.0010 (0.0018) model time 0.2423 (0.2432) loss 3.8030 (3.6717) grad_norm 1.8710 (1.9401) loss_scale 8192.0000 (5465.8769) mem 7379MB [2024-08-26 06:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][610/1251] eta 0:02:37 lr 0.000980 wd 0.0500 time 0.2378 (0.2452) data time 0.0011 (0.0018) model time 0.2367 (0.2432) loss 3.9938 (3.6756) grad_norm 1.9929 (1.9417) loss_scale 8192.0000 (5510.4943) mem 7379MB [2024-08-26 06:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][620/1251] eta 0:02:34 lr 0.000980 wd 0.0500 time 0.2437 (0.2451) data time 0.0010 (0.0018) model time 0.2427 (0.2432) loss 4.0871 (3.6757) grad_norm 1.8330 (1.9439) loss_scale 8192.0000 (5553.6747) mem 7379MB [2024-08-26 06:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][630/1251] eta 0:02:32 lr 0.000980 wd 0.0500 time 0.2491 (0.2451) data time 0.0010 (0.0018) model time 0.2481 (0.2432) loss 3.2959 (3.6735) grad_norm 1.9238 (1.9415) loss_scale 8192.0000 (5595.4865) mem 7379MB [2024-08-26 06:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][640/1251] eta 0:02:29 lr 0.000980 wd 0.0500 time 0.2449 (0.2450) data time 0.0007 (0.0018) model time 0.2442 (0.2431) loss 3.5390 (3.6724) grad_norm 1.5880 (1.9434) loss_scale 8192.0000 (5635.9938) mem 7379MB [2024-08-26 06:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][650/1251] eta 0:02:27 lr 0.000980 wd 0.0500 time 0.2367 (0.2450) data time 0.0009 (0.0018) model time 0.2358 (0.2430) loss 2.7105 (3.6695) grad_norm 2.2710 (1.9477) loss_scale 8192.0000 (5675.2565) mem 7379MB [2024-08-26 06:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][660/1251] eta 0:02:24 lr 0.000980 wd 0.0500 time 0.2480 (0.2449) data time 0.0009 (0.0017) model time 0.2471 (0.2430) loss 2.9152 
(3.6698) grad_norm 1.8570 (1.9440) loss_scale 8192.0000 (5713.3313) mem 7379MB [2024-08-26 06:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][670/1251] eta 0:02:22 lr 0.000980 wd 0.0500 time 0.2428 (0.2449) data time 0.0010 (0.0017) model time 0.2418 (0.2430) loss 4.0052 (3.6738) grad_norm 1.7439 (1.9443) loss_scale 8192.0000 (5750.2712) mem 7379MB [2024-08-26 06:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][680/1251] eta 0:02:19 lr 0.000980 wd 0.0500 time 0.2392 (0.2448) data time 0.0009 (0.0017) model time 0.2384 (0.2429) loss 3.7572 (3.6753) grad_norm 1.4515 (1.9417) loss_scale 8192.0000 (5786.1263) mem 7379MB [2024-08-26 06:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][690/1251] eta 0:02:17 lr 0.000980 wd 0.0500 time 0.2469 (0.2448) data time 0.0009 (0.0017) model time 0.2460 (0.2429) loss 3.9127 (3.6702) grad_norm 1.4794 (1.9423) loss_scale 8192.0000 (5820.9436) mem 7379MB [2024-08-26 06:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][700/1251] eta 0:02:14 lr 0.000980 wd 0.0500 time 0.2343 (0.2448) data time 0.0011 (0.0017) model time 0.2332 (0.2429) loss 3.7287 (3.6696) grad_norm 1.4718 (1.9441) loss_scale 8192.0000 (5854.7675) mem 7379MB [2024-08-26 06:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][710/1251] eta 0:02:12 lr 0.000980 wd 0.0500 time 0.2423 (0.2447) data time 0.0007 (0.0017) model time 0.2416 (0.2428) loss 3.7873 (3.6683) grad_norm 1.8422 (1.9421) loss_scale 8192.0000 (5887.6399) mem 7379MB [2024-08-26 06:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][720/1251] eta 0:02:10 lr 0.000980 wd 0.0500 time 0.2550 (0.2450) data time 0.0013 (0.0017) model time 0.2538 (0.2432) loss 2.6439 (3.6675) grad_norm 1.8459 (1.9410) loss_scale 8192.0000 (5919.6006) mem 7379MB [2024-08-26 06:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][730/1251] eta 0:02:07 lr 0.000980 wd 
0.0500 time 0.2429 (0.2453) data time 0.0008 (0.0017) model time 0.2422 (0.2435) loss 2.3413 (3.6646) grad_norm 3.6413 (1.9413) loss_scale 8192.0000 (5950.6867) mem 7379MB [2024-08-26 06:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][740/1251] eta 0:02:05 lr 0.000980 wd 0.0500 time 0.2423 (0.2452) data time 0.0011 (0.0017) model time 0.2412 (0.2434) loss 4.0721 (3.6630) grad_norm 2.0341 (1.9462) loss_scale 8192.0000 (5980.9339) mem 7379MB [2024-08-26 06:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][750/1251] eta 0:02:02 lr 0.000980 wd 0.0500 time 0.2387 (0.2452) data time 0.0010 (0.0017) model time 0.2377 (0.2434) loss 3.7502 (3.6654) grad_norm 2.5341 (1.9472) loss_scale 8192.0000 (6010.3755) mem 7379MB [2024-08-26 06:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][760/1251] eta 0:02:00 lr 0.000980 wd 0.0500 time 0.2451 (0.2452) data time 0.0009 (0.0017) model time 0.2442 (0.2435) loss 3.9293 (3.6703) grad_norm 2.1550 (1.9475) loss_scale 8192.0000 (6039.0434) mem 7379MB [2024-08-26 06:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][770/1251] eta 0:01:58 lr 0.000980 wd 0.0500 time 0.4371 (0.2460) data time 0.0010 (0.0016) model time 0.4361 (0.2443) loss 2.8861 (3.6709) grad_norm 1.7924 (1.9470) loss_scale 8192.0000 (6066.9676) mem 7379MB [2024-08-26 06:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][780/1251] eta 0:01:55 lr 0.000980 wd 0.0500 time 0.2478 (0.2459) data time 0.0008 (0.0016) model time 0.2470 (0.2442) loss 3.1858 (3.6680) grad_norm 1.8466 (1.9492) loss_scale 8192.0000 (6094.1767) mem 7379MB [2024-08-26 06:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][790/1251] eta 0:01:53 lr 0.000980 wd 0.0500 time 0.2387 (0.2459) data time 0.0010 (0.0016) model time 0.2378 (0.2442) loss 3.8568 (3.6681) grad_norm 1.5155 (1.9474) loss_scale 8192.0000 (6120.6979) mem 7379MB [2024-08-26 06:06:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][800/1251] eta 0:01:50 lr 0.000980 wd 0.0500 time 0.2412 (0.2458) data time 0.0010 (0.0016) model time 0.2403 (0.2441) loss 3.7887 (3.6670) grad_norm 2.2394 (1.9448) loss_scale 8192.0000 (6146.5568) mem 7379MB [2024-08-26 06:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][810/1251] eta 0:01:48 lr 0.000980 wd 0.0500 time 0.2415 (0.2457) data time 0.0009 (0.0016) model time 0.2406 (0.2440) loss 4.2618 (3.6704) grad_norm 1.6042 (1.9406) loss_scale 8192.0000 (6171.7781) mem 7379MB [2024-08-26 06:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][820/1251] eta 0:01:45 lr 0.000980 wd 0.0500 time 0.2375 (0.2456) data time 0.0009 (0.0016) model time 0.2366 (0.2439) loss 3.7075 (3.6707) grad_norm 2.6306 (1.9399) loss_scale 8192.0000 (6196.3849) mem 7379MB [2024-08-26 06:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][830/1251] eta 0:01:43 lr 0.000980 wd 0.0500 time 0.2403 (0.2456) data time 0.0008 (0.0016) model time 0.2395 (0.2439) loss 3.7209 (3.6732) grad_norm 2.1249 (1.9383) loss_scale 8192.0000 (6220.3995) mem 7379MB [2024-08-26 06:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][840/1251] eta 0:01:40 lr 0.000980 wd 0.0500 time 0.2447 (0.2456) data time 0.0009 (0.0016) model time 0.2438 (0.2439) loss 3.7135 (3.6698) grad_norm 3.7579 (1.9398) loss_scale 8192.0000 (6243.8430) mem 7379MB [2024-08-26 06:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][850/1251] eta 0:01:38 lr 0.000980 wd 0.0500 time 0.2374 (0.2455) data time 0.0007 (0.0016) model time 0.2367 (0.2438) loss 4.2063 (3.6718) grad_norm 1.8466 (1.9380) loss_scale 8192.0000 (6266.7356) mem 7379MB [2024-08-26 06:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][860/1251] eta 0:01:35 lr 0.000980 wd 0.0500 time 0.2474 (0.2455) data time 0.0007 (0.0016) model time 0.2467 (0.2438) loss 4.4532 
(3.6763) grad_norm 2.9968 (1.9436) loss_scale 8192.0000 (6289.0964) mem 7379MB [2024-08-26 06:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][870/1251] eta 0:01:33 lr 0.000980 wd 0.0500 time 0.2365 (0.2454) data time 0.0007 (0.0016) model time 0.2357 (0.2438) loss 3.0706 (3.6747) grad_norm 1.5592 (1.9456) loss_scale 8192.0000 (6310.9437) mem 7379MB [2024-08-26 06:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][880/1251] eta 0:01:31 lr 0.000980 wd 0.0500 time 0.2453 (0.2454) data time 0.0007 (0.0016) model time 0.2445 (0.2437) loss 4.1345 (3.6764) grad_norm 1.5077 (1.9460) loss_scale 8192.0000 (6332.2951) mem 7379MB [2024-08-26 06:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][890/1251] eta 0:01:28 lr 0.000980 wd 0.0500 time 0.2349 (0.2453) data time 0.0007 (0.0016) model time 0.2341 (0.2437) loss 2.9530 (3.6778) grad_norm 2.0141 (1.9516) loss_scale 8192.0000 (6353.1672) mem 7379MB [2024-08-26 06:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][900/1251] eta 0:01:26 lr 0.000980 wd 0.0500 time 0.2381 (0.2453) data time 0.0011 (0.0016) model time 0.2370 (0.2436) loss 4.6165 (3.6809) grad_norm 1.4985 (1.9520) loss_scale 8192.0000 (6373.5760) mem 7379MB [2024-08-26 06:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][910/1251] eta 0:01:23 lr 0.000980 wd 0.0500 time 0.2377 (0.2453) data time 0.0008 (0.0016) model time 0.2368 (0.2436) loss 3.7657 (3.6795) grad_norm 1.5286 (1.9488) loss_scale 8192.0000 (6393.5368) mem 7379MB [2024-08-26 06:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][920/1251] eta 0:01:21 lr 0.000980 wd 0.0500 time 0.2412 (0.2452) data time 0.0011 (0.0016) model time 0.2401 (0.2435) loss 4.1715 (3.6771) grad_norm 2.0304 (1.9474) loss_scale 8192.0000 (6413.0641) mem 7379MB [2024-08-26 06:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][930/1251] eta 0:01:18 lr 0.000979 wd 
0.0500 time 0.2423 (0.2452) data time 0.0008 (0.0016) model time 0.2415 (0.2435) loss 4.4366 (3.6756) grad_norm 2.8108 (1.9460) loss_scale 8192.0000 (6432.1719) mem 7379MB [2024-08-26 06:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][940/1251] eta 0:01:16 lr 0.000979 wd 0.0500 time 0.2350 (0.2452) data time 0.0010 (0.0015) model time 0.2339 (0.2435) loss 3.8782 (3.6740) grad_norm 1.6664 (1.9445) loss_scale 8192.0000 (6450.8735) mem 7379MB [2024-08-26 06:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][950/1251] eta 0:01:13 lr 0.000979 wd 0.0500 time 0.2452 (0.2453) data time 0.0011 (0.0015) model time 0.2441 (0.2437) loss 4.2512 (3.6748) grad_norm 1.7853 (1.9438) loss_scale 8192.0000 (6469.1819) mem 7379MB [2024-08-26 06:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][960/1251] eta 0:01:11 lr 0.000979 wd 0.0500 time 0.2403 (0.2453) data time 0.0007 (0.0015) model time 0.2396 (0.2437) loss 2.8843 (3.6710) grad_norm 1.6248 (1.9410) loss_scale 8192.0000 (6487.1093) mem 7379MB [2024-08-26 06:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][970/1251] eta 0:01:08 lr 0.000979 wd 0.0500 time 0.2395 (0.2453) data time 0.0008 (0.0015) model time 0.2387 (0.2437) loss 4.1056 (3.6753) grad_norm 2.2968 (1.9414) loss_scale 8192.0000 (6504.6674) mem 7379MB [2024-08-26 06:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][980/1251] eta 0:01:06 lr 0.000979 wd 0.0500 time 0.2330 (0.2452) data time 0.0008 (0.0015) model time 0.2322 (0.2436) loss 4.3310 (3.6762) grad_norm 1.2950 (1.9427) loss_scale 8192.0000 (6521.8675) mem 7379MB [2024-08-26 06:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][990/1251] eta 0:01:04 lr 0.000979 wd 0.0500 time 0.2488 (0.2452) data time 0.0008 (0.0015) model time 0.2480 (0.2436) loss 2.4758 (3.6727) grad_norm 1.5023 (1.9446) loss_scale 8192.0000 (6538.7205) mem 7379MB [2024-08-26 06:07:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1000/1251] eta 0:01:01 lr 0.000979 wd 0.0500 time 0.2436 (0.2452) data time 0.0009 (0.0015) model time 0.2427 (0.2436) loss 4.1121 (3.6720) grad_norm 1.5126 (1.9429) loss_scale 8192.0000 (6555.2368) mem 7379MB [2024-08-26 06:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1010/1251] eta 0:00:59 lr 0.000979 wd 0.0500 time 0.2493 (0.2452) data time 0.0014 (0.0015) model time 0.2479 (0.2436) loss 3.1820 (3.6707) grad_norm 2.0519 (1.9411) loss_scale 8192.0000 (6571.4263) mem 7379MB [2024-08-26 06:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1020/1251] eta 0:00:56 lr 0.000979 wd 0.0500 time 0.2381 (0.2452) data time 0.0007 (0.0015) model time 0.2373 (0.2436) loss 4.1995 (3.6681) grad_norm 2.2862 (1.9403) loss_scale 8192.0000 (6587.2987) mem 7379MB [2024-08-26 06:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1030/1251] eta 0:00:54 lr 0.000979 wd 0.0500 time 0.2428 (0.2452) data time 0.0008 (0.0015) model time 0.2419 (0.2436) loss 2.6317 (3.6666) grad_norm 1.6337 (1.9382) loss_scale 8192.0000 (6602.8632) mem 7379MB [2024-08-26 06:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1040/1251] eta 0:00:51 lr 0.000979 wd 0.0500 time 0.2448 (0.2451) data time 0.0008 (0.0015) model time 0.2440 (0.2436) loss 3.2174 (3.6679) grad_norm 3.5324 (1.9379) loss_scale 8192.0000 (6618.1287) mem 7379MB [2024-08-26 06:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1050/1251] eta 0:00:49 lr 0.000979 wd 0.0500 time 0.2481 (0.2451) data time 0.0008 (0.0015) model time 0.2474 (0.2435) loss 2.2058 (3.6656) grad_norm 1.9046 (1.9375) loss_scale 8192.0000 (6633.1037) mem 7379MB [2024-08-26 06:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1060/1251] eta 0:00:46 lr 0.000979 wd 0.0500 time 0.2396 (0.2451) data time 0.0009 (0.0015) model time 0.2388 (0.2435) loss 3.2762 
(3.6693) grad_norm 2.4665 (1.9389) loss_scale 8192.0000 (6647.7964) mem 7379MB [2024-08-26 06:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1070/1251] eta 0:00:44 lr 0.000979 wd 0.0500 time 0.2415 (0.2451) data time 0.0008 (0.0015) model time 0.2407 (0.2435) loss 3.4421 (3.6697) grad_norm 2.1912 (1.9388) loss_scale 8192.0000 (6662.2148) mem 7379MB [2024-08-26 06:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1080/1251] eta 0:00:41 lr 0.000979 wd 0.0500 time 0.2373 (0.2454) data time 0.0007 (0.0015) model time 0.2366 (0.2439) loss 3.5434 (3.6724) grad_norm 2.0496 (1.9410) loss_scale 8192.0000 (6676.3663) mem 7379MB [2024-08-26 06:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1090/1251] eta 0:00:39 lr 0.000979 wd 0.0500 time 0.2417 (0.2454) data time 0.0007 (0.0015) model time 0.2409 (0.2439) loss 4.1929 (3.6741) grad_norm 1.8280 (1.9398) loss_scale 8192.0000 (6690.2585) mem 7379MB [2024-08-26 06:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1100/1251] eta 0:00:37 lr 0.000979 wd 0.0500 time 0.2466 (0.2454) data time 0.0008 (0.0015) model time 0.2459 (0.2438) loss 2.4330 (3.6730) grad_norm 2.1900 (1.9387) loss_scale 8192.0000 (6703.8983) mem 7379MB [2024-08-26 06:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1110/1251] eta 0:00:34 lr 0.000979 wd 0.0500 time 0.2365 (0.2453) data time 0.0009 (0.0015) model time 0.2356 (0.2438) loss 4.4031 (3.6724) grad_norm 1.5013 (1.9371) loss_scale 8192.0000 (6717.2925) mem 7379MB [2024-08-26 06:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1120/1251] eta 0:00:32 lr 0.000979 wd 0.0500 time 0.2425 (0.2453) data time 0.0010 (0.0015) model time 0.2415 (0.2438) loss 3.8926 (3.6741) grad_norm 1.8068 (1.9357) loss_scale 8192.0000 (6730.4478) mem 7379MB [2024-08-26 06:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1130/1251] eta 0:00:29 lr 
0.000979 wd 0.0500 time 0.2343 (0.2453) data time 0.0009 (0.0015) model time 0.2334 (0.2438) loss 4.1807 (3.6738) grad_norm 2.0594 (1.9348) loss_scale 8192.0000 (6743.3705) mem 7379MB [2024-08-26 06:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1140/1251] eta 0:00:27 lr 0.000979 wd 0.0500 time 0.2373 (0.2453) data time 0.0010 (0.0014) model time 0.2363 (0.2438) loss 4.5854 (3.6742) grad_norm 2.0592 (1.9346) loss_scale 8192.0000 (6756.0666) mem 7379MB [2024-08-26 06:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1150/1251] eta 0:00:24 lr 0.000979 wd 0.0500 time 0.2395 (0.2453) data time 0.0010 (0.0014) model time 0.2386 (0.2438) loss 3.6933 (3.6749) grad_norm 2.0483 (1.9366) loss_scale 8192.0000 (6768.5421) mem 7379MB [2024-08-26 06:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1160/1251] eta 0:00:22 lr 0.000979 wd 0.0500 time 0.2418 (0.2452) data time 0.0011 (0.0014) model time 0.2407 (0.2437) loss 3.7793 (3.6746) grad_norm 1.9198 (1.9362) loss_scale 8192.0000 (6780.8028) mem 7379MB [2024-08-26 06:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1170/1251] eta 0:00:19 lr 0.000979 wd 0.0500 time 0.2406 (0.2452) data time 0.0008 (0.0014) model time 0.2398 (0.2437) loss 3.4558 (3.6718) grad_norm 3.5320 (1.9375) loss_scale 8192.0000 (6792.8540) mem 7379MB [2024-08-26 06:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1180/1251] eta 0:00:17 lr 0.000979 wd 0.0500 time 0.2454 (0.2452) data time 0.0009 (0.0014) model time 0.2445 (0.2437) loss 2.3993 (3.6677) grad_norm 2.3240 (1.9378) loss_scale 8192.0000 (6804.7011) mem 7379MB [2024-08-26 06:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1190/1251] eta 0:00:14 lr 0.000979 wd 0.0500 time 0.2367 (0.2452) data time 0.0008 (0.0014) model time 0.2360 (0.2437) loss 3.8566 (3.6687) grad_norm 1.9010 (1.9369) loss_scale 8192.0000 (6816.3493) mem 7379MB [2024-08-26 
06:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1200/1251] eta 0:00:12 lr 0.000979 wd 0.0500 time 0.2461 (0.2452) data time 0.0010 (0.0014) model time 0.2450 (0.2437) loss 2.8844 (3.6685) grad_norm 1.4407 (1.9333) loss_scale 8192.0000 (6827.8035) mem 7379MB [2024-08-26 06:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1210/1251] eta 0:00:10 lr 0.000979 wd 0.0500 time 0.2324 (0.2452) data time 0.0011 (0.0014) model time 0.2313 (0.2437) loss 3.7109 (3.6672) grad_norm 2.1489 (1.9307) loss_scale 8192.0000 (6839.0685) mem 7379MB [2024-08-26 06:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1220/1251] eta 0:00:07 lr 0.000979 wd 0.0500 time 0.2375 (0.2452) data time 0.0011 (0.0014) model time 0.2364 (0.2437) loss 3.8425 (3.6674) grad_norm 1.8320 (1.9300) loss_scale 8192.0000 (6850.1491) mem 7379MB [2024-08-26 06:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1230/1251] eta 0:00:05 lr 0.000979 wd 0.0500 time 0.4058 (0.2453) data time 0.0013 (0.0014) model time 0.4045 (0.2438) loss 3.8292 (3.6685) grad_norm 3.7766 (1.9323) loss_scale 8192.0000 (6861.0496) mem 7379MB [2024-08-26 06:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1240/1251] eta 0:00:02 lr 0.000979 wd 0.0500 time 0.2264 (0.2452) data time 0.0007 (0.0014) model time 0.2256 (0.2437) loss 2.3994 (3.6677) grad_norm 1.7779 (1.9317) loss_scale 8192.0000 (6871.7744) mem 7379MB [2024-08-26 06:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [45/300][1250/1251] eta 0:00:00 lr 0.000979 wd 0.0500 time 0.2335 (0.2452) data time 0.0005 (0.0014) model time 0.2330 (0.2437) loss 4.7390 (3.6677) grad_norm 2.0630 (1.9311) loss_scale 8192.0000 (6882.3277) mem 7379MB [2024-08-26 06:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 45 training takes 0:05:06 [2024-08-26 06:08:49 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 06:08:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 06:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.516 (0.516) Loss 0.5786 (0.5786) Acc@1 88.281 (88.281) Acc@5 97.949 (97.949) Mem 7379MB
[2024-08-26 06:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.116) Loss 0.9902 (0.9118) Acc@1 79.785 (80.300) Acc@5 94.141 (95.250) Mem 7379MB
[2024-08-26 06:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.099) Loss 1.3027 (0.9258) Acc@1 68.066 (79.306) Acc@5 91.113 (95.364) Mem 7379MB
[2024-08-26 06:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.092) Loss 1.5752 (1.0566) Acc@1 64.062 (76.238) Acc@5 85.449 (93.646) Mem 7379MB
[2024-08-26 06:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.4590 (1.1369) Acc@1 66.699 (74.481) Acc@5 89.062 (92.614) Mem 7379MB
[2024-08-26 06:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.046 Acc@5 92.514
[2024-08-26 06:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.0%
[2024-08-26 06:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 74.05%
[2024-08-26 06:08:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 06:08:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 06:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.452 (0.452) Loss 0.4680 (0.4680) Acc@1 89.844 (89.844) Acc@5 98.047 (98.047) Mem 7379MB
[2024-08-26 06:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.112) Loss 0.7949 (0.7726) Acc@1 81.836 (81.889) Acc@5 95.703 (95.961) Mem 7379MB
[2024-08-26 06:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.096) Loss 1.1318 (0.7886) Acc@1 73.340 (81.134) Acc@5 91.406 (95.959) Mem 7379MB
[2024-08-26 06:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 1.4072 (0.9109) Acc@1 64.648 (78.264) Acc@5 87.695 (94.408) Mem 7379MB
[2024-08-26 06:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3340 (0.9808) Acc@1 66.699 (76.615) Acc@5 89.062 (93.564) Mem 7379MB
[2024-08-26 06:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.300 Acc@5 93.504
[2024-08-26 06:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 76.3%
[2024-08-26 06:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 76.30%
[2024-08-26 06:08:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 06:08:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 06:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][0/1251] eta 0:14:43 lr 0.000979 wd 0.0500 time 0.7064 (0.7064) data time 0.4816 (0.4816) model time 0.0000 (0.0000) loss 2.7914 (2.7914) grad_norm 1.6470 (1.6470) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][10/1251] eta 0:05:51 lr 0.000979 wd 0.0500 time 0.2368 (0.2835) data time 0.0011 (0.0447) model time 0.0000 (0.0000) loss 3.5817 (3.5233) grad_norm 2.1487 (1.7483) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][20/1251] eta 0:05:23 lr 0.000979 wd 0.0500 time 0.2374 (0.2627) data time 0.0007 (0.0239) model time 0.0000 (0.0000) loss 4.4316 (3.6168) grad_norm 2.5868 (1.8194) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][30/1251] eta 0:05:11 lr 0.000979 wd 0.0500 time 0.2455 (0.2553) data time 0.0009 (0.0165) model time 0.0000 (0.0000) loss 3.6769 (3.7127) grad_norm 2.7484 (1.9524) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][40/1251] eta 0:05:04 lr 0.000979 wd 0.0500 time 0.2423 (0.2518) data time 0.0007 (0.0127) model time 0.0000 (0.0000) loss 3.0653 (3.6631) grad_norm 2.8714 (1.9535) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][50/1251] eta 0:05:00 lr 0.000979 wd 0.0500 time 0.2385 (0.2498) data time 0.0009 (0.0104) model time 0.0000 (0.0000) loss 4.3532 (3.7041) grad_norm 1.9581 (1.9622) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][60/1251] eta 0:04:55 lr 0.000979 wd 0.0500 time 0.2400 (0.2484) data time 0.0011 (0.0089) model time 0.2390 (0.2401) loss 
3.4718 (3.6450) grad_norm 1.9188 (1.9518) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][70/1251] eta 0:04:52 lr 0.000979 wd 0.0500 time 0.2459 (0.2479) data time 0.0011 (0.0078) model time 0.2447 (0.2420) loss 4.1817 (3.6859) grad_norm 1.9825 (1.9247) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][80/1251] eta 0:04:50 lr 0.000979 wd 0.0500 time 0.2891 (0.2479) data time 0.0010 (0.0069) model time 0.2882 (0.2434) loss 3.6193 (3.6856) grad_norm 1.9163 (1.9358) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][90/1251] eta 0:04:47 lr 0.000979 wd 0.0500 time 0.2447 (0.2476) data time 0.0007 (0.0063) model time 0.2440 (0.2436) loss 4.0792 (3.7020) grad_norm 2.7986 (1.9773) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][100/1251] eta 0:04:44 lr 0.000979 wd 0.0500 time 0.2415 (0.2471) data time 0.0009 (0.0058) model time 0.2406 (0.2433) loss 3.4817 (3.7057) grad_norm 2.2289 (1.9774) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][110/1251] eta 0:04:41 lr 0.000979 wd 0.0500 time 0.2418 (0.2471) data time 0.0011 (0.0054) model time 0.2407 (0.2436) loss 3.7806 (3.7100) grad_norm 2.2364 (1.9744) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][120/1251] eta 0:04:39 lr 0.000979 wd 0.0500 time 0.2454 (0.2468) data time 0.0011 (0.0050) model time 0.2444 (0.2435) loss 3.3210 (3.7056) grad_norm 2.0221 (1.9775) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][130/1251] eta 0:04:36 lr 
0.000979 wd 0.0500 time 0.2395 (0.2464) data time 0.0009 (0.0047) model time 0.2386 (0.2431) loss 4.3506 (3.7076) grad_norm 1.8607 (1.9706) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][140/1251] eta 0:04:33 lr 0.000979 wd 0.0500 time 0.2408 (0.2462) data time 0.0010 (0.0044) model time 0.2398 (0.2431) loss 4.1653 (3.7035) grad_norm 1.6503 (1.9771) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][150/1251] eta 0:04:30 lr 0.000979 wd 0.0500 time 0.2343 (0.2459) data time 0.0011 (0.0042) model time 0.2331 (0.2428) loss 3.4753 (3.6802) grad_norm 1.6966 (1.9711) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][160/1251] eta 0:04:28 lr 0.000979 wd 0.0500 time 0.2465 (0.2460) data time 0.0012 (0.0040) model time 0.2453 (0.2432) loss 3.5088 (3.6758) grad_norm 2.0213 (1.9595) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][170/1251] eta 0:04:25 lr 0.000979 wd 0.0500 time 0.2381 (0.2458) data time 0.0009 (0.0038) model time 0.2373 (0.2430) loss 3.7030 (3.6779) grad_norm 1.7655 (1.9523) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][180/1251] eta 0:04:23 lr 0.000979 wd 0.0500 time 0.2499 (0.2458) data time 0.0009 (0.0037) model time 0.2490 (0.2431) loss 4.3500 (3.6845) grad_norm 2.9654 (1.9542) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][190/1251] eta 0:04:20 lr 0.000979 wd 0.0500 time 0.2472 (0.2457) data time 0.0010 (0.0035) model time 0.2462 (0.2431) loss 3.9470 (3.6665) grad_norm 1.6442 (1.9534) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][200/1251] eta 0:04:18 lr 0.000979 wd 0.0500 time 0.2419 (0.2456) data time 0.0010 (0.0035) model time 0.2410 (0.2431) loss 4.1688 (3.6491) grad_norm 1.7998 (1.9506) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][210/1251] eta 0:04:16 lr 0.000979 wd 0.0500 time 0.2209 (0.2464) data time 0.0011 (0.0033) model time 0.2198 (0.2442) loss 3.8932 (3.6453) grad_norm 1.9717 (1.9394) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][220/1251] eta 0:04:13 lr 0.000979 wd 0.0500 time 0.2458 (0.2463) data time 0.0010 (0.0033) model time 0.2449 (0.2440) loss 3.9656 (3.6380) grad_norm 2.5467 (1.9399) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][230/1251] eta 0:04:11 lr 0.000979 wd 0.0500 time 0.2511 (0.2460) data time 0.0009 (0.0032) model time 0.2501 (0.2438) loss 3.4676 (3.6340) grad_norm 2.0917 (1.9240) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][240/1251] eta 0:04:08 lr 0.000979 wd 0.0500 time 0.2460 (0.2458) data time 0.0008 (0.0031) model time 0.2452 (0.2436) loss 2.8110 (3.6246) grad_norm 1.7464 (1.9145) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][250/1251] eta 0:04:06 lr 0.000979 wd 0.0500 time 0.2512 (0.2458) data time 0.0013 (0.0030) model time 0.2500 (0.2436) loss 4.0853 (3.6367) grad_norm 1.5764 (1.9106) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][260/1251] eta 0:04:05 lr 0.000979 wd 0.0500 time 0.2483 (0.2473) data time 0.0010 (0.0029) model time 0.2473 (0.2456) loss 3.9246 
(3.6487) grad_norm 2.2013 (1.9063) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][270/1251] eta 0:04:02 lr 0.000979 wd 0.0500 time 0.2496 (0.2471) data time 0.0011 (0.0029) model time 0.2484 (0.2454) loss 4.1416 (3.6530) grad_norm 2.9702 (1.9177) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][280/1251] eta 0:03:59 lr 0.000979 wd 0.0500 time 0.2455 (0.2470) data time 0.0011 (0.0028) model time 0.2444 (0.2453) loss 3.5770 (3.6573) grad_norm 1.8039 (1.9189) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][290/1251] eta 0:03:57 lr 0.000979 wd 0.0500 time 0.2465 (0.2468) data time 0.0007 (0.0027) model time 0.2458 (0.2451) loss 3.7806 (3.6631) grad_norm 2.7754 (1.9214) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][300/1251] eta 0:03:54 lr 0.000979 wd 0.0500 time 0.2466 (0.2466) data time 0.0007 (0.0027) model time 0.2459 (0.2448) loss 2.6293 (3.6657) grad_norm 2.1992 (1.9318) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][310/1251] eta 0:03:51 lr 0.000979 wd 0.0500 time 0.2478 (0.2464) data time 0.0009 (0.0026) model time 0.2468 (0.2447) loss 3.3185 (3.6680) grad_norm 1.5567 (1.9279) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][320/1251] eta 0:03:49 lr 0.000979 wd 0.0500 time 0.2473 (0.2464) data time 0.0011 (0.0026) model time 0.2462 (0.2446) loss 3.2417 (3.6668) grad_norm 1.9104 (1.9218) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][330/1251] eta 0:03:46 lr 0.000979 wd 
0.0500 time 0.2402 (0.2463) data time 0.0012 (0.0026) model time 0.2390 (0.2445) loss 4.1722 (3.6665) grad_norm 1.5498 (1.9181) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][340/1251] eta 0:03:44 lr 0.000979 wd 0.0500 time 0.2384 (0.2463) data time 0.0011 (0.0025) model time 0.2373 (0.2445) loss 3.8185 (3.6726) grad_norm 2.2772 (1.9156) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][350/1251] eta 0:03:41 lr 0.000979 wd 0.0500 time 0.2458 (0.2462) data time 0.0009 (0.0025) model time 0.2449 (0.2444) loss 4.4695 (3.6726) grad_norm 1.5582 (1.9099) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][360/1251] eta 0:03:39 lr 0.000979 wd 0.0500 time 0.2462 (0.2460) data time 0.0007 (0.0025) model time 0.2455 (0.2442) loss 2.6535 (3.6722) grad_norm 1.6440 (1.9126) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][370/1251] eta 0:03:36 lr 0.000979 wd 0.0500 time 0.2462 (0.2459) data time 0.0009 (0.0024) model time 0.2453 (0.2442) loss 4.2366 (3.6716) grad_norm 1.8886 (1.9073) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][380/1251] eta 0:03:34 lr 0.000979 wd 0.0500 time 0.2441 (0.2459) data time 0.0009 (0.0024) model time 0.2432 (0.2441) loss 3.3931 (3.6764) grad_norm 2.4711 (1.9071) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][390/1251] eta 0:03:31 lr 0.000979 wd 0.0500 time 0.2390 (0.2457) data time 0.0008 (0.0023) model time 0.2382 (0.2440) loss 4.3134 (3.6773) grad_norm 2.1162 (1.9145) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][400/1251] eta 0:03:29 lr 0.000979 wd 0.0500 time 0.2364 (0.2456) data time 0.0008 (0.0023) model time 0.2356 (0.2439) loss 3.7383 (3.6800) grad_norm 3.6203 (1.9198) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][410/1251] eta 0:03:26 lr 0.000979 wd 0.0500 time 0.2403 (0.2456) data time 0.0008 (0.0023) model time 0.2396 (0.2438) loss 4.1817 (3.6856) grad_norm 2.6210 (1.9198) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][420/1251] eta 0:03:23 lr 0.000979 wd 0.0500 time 0.2362 (0.2454) data time 0.0007 (0.0023) model time 0.2354 (0.2437) loss 4.1487 (3.6752) grad_norm 1.4011 (1.9197) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][430/1251] eta 0:03:21 lr 0.000979 wd 0.0500 time 0.2419 (0.2453) data time 0.0007 (0.0022) model time 0.2411 (0.2435) loss 2.5912 (3.6664) grad_norm 2.0183 (1.9137) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][440/1251] eta 0:03:18 lr 0.000979 wd 0.0500 time 0.2411 (0.2453) data time 0.0010 (0.0022) model time 0.2400 (0.2435) loss 3.8177 (3.6669) grad_norm 3.3627 (1.9250) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][450/1251] eta 0:03:16 lr 0.000979 wd 0.0500 time 0.2450 (0.2452) data time 0.0009 (0.0022) model time 0.2440 (0.2434) loss 3.0753 (3.6619) grad_norm 2.9709 (1.9268) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][460/1251] eta 0:03:13 lr 0.000978 wd 0.0500 time 0.2588 (0.2452) data time 0.0010 (0.0022) model time 0.2578 (0.2435) loss 3.6145 
(3.6647) grad_norm 2.1017 (1.9317) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][470/1251] eta 0:03:11 lr 0.000978 wd 0.0500 time 0.2404 (0.2451) data time 0.0011 (0.0021) model time 0.2392 (0.2434) loss 3.7647 (3.6586) grad_norm 1.7359 (1.9326) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][480/1251] eta 0:03:08 lr 0.000978 wd 0.0500 time 0.2466 (0.2451) data time 0.0012 (0.0021) model time 0.2454 (0.2434) loss 4.1609 (3.6616) grad_norm 1.8586 (1.9319) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][490/1251] eta 0:03:06 lr 0.000978 wd 0.0500 time 0.2387 (0.2450) data time 0.0010 (0.0021) model time 0.2377 (0.2434) loss 3.6055 (3.6629) grad_norm 3.4341 (1.9351) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][500/1251] eta 0:03:04 lr 0.000978 wd 0.0500 time 0.2352 (0.2451) data time 0.0009 (0.0021) model time 0.2343 (0.2434) loss 2.2482 (3.6550) grad_norm 1.4218 (1.9319) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][510/1251] eta 0:03:01 lr 0.000978 wd 0.0500 time 0.2338 (0.2450) data time 0.0010 (0.0021) model time 0.2328 (0.2434) loss 3.7621 (3.6493) grad_norm 2.0122 (1.9285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][520/1251] eta 0:02:59 lr 0.000978 wd 0.0500 time 0.2421 (0.2454) data time 0.0009 (0.0020) model time 0.2412 (0.2438) loss 4.2960 (3.6513) grad_norm 1.5480 (1.9242) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][530/1251] eta 0:02:56 lr 0.000978 wd 
0.0500 time 0.2396 (0.2453) data time 0.0009 (0.0020) model time 0.2387 (0.2437) loss 3.4528 (3.6532) grad_norm 2.3809 (1.9217) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][540/1251] eta 0:02:54 lr 0.000978 wd 0.0500 time 0.2421 (0.2453) data time 0.0009 (0.0020) model time 0.2413 (0.2437) loss 2.9028 (3.6476) grad_norm 1.8247 (1.9201) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][550/1251] eta 0:02:51 lr 0.000978 wd 0.0500 time 0.2406 (0.2453) data time 0.0011 (0.0020) model time 0.2395 (0.2437) loss 2.5519 (3.6453) grad_norm 1.5869 (1.9127) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][560/1251] eta 0:02:49 lr 0.000978 wd 0.0500 time 0.2457 (0.2452) data time 0.0009 (0.0020) model time 0.2448 (0.2437) loss 2.8763 (3.6430) grad_norm 1.8621 (1.9087) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][570/1251] eta 0:02:46 lr 0.000978 wd 0.0500 time 0.2378 (0.2452) data time 0.0010 (0.0019) model time 0.2369 (0.2436) loss 2.6594 (3.6380) grad_norm 2.3177 (1.9103) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][580/1251] eta 0:02:44 lr 0.000978 wd 0.0500 time 0.2452 (0.2452) data time 0.0009 (0.0019) model time 0.2443 (0.2436) loss 4.2276 (3.6422) grad_norm 2.7960 (1.9212) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][590/1251] eta 0:02:42 lr 0.000978 wd 0.0500 time 0.2365 (0.2451) data time 0.0009 (0.0019) model time 0.2356 (0.2436) loss 4.2260 (3.6463) grad_norm 1.2696 (1.9232) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:26 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][600/1251] eta 0:02:39 lr 0.000978 wd 0.0500 time 0.2445 (0.2451) data time 0.0009 (0.0019) model time 0.2436 (0.2435) loss 2.9167 (3.6471) grad_norm 1.5523 (1.9218) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][610/1251] eta 0:02:37 lr 0.000978 wd 0.0500 time 0.2419 (0.2450) data time 0.0008 (0.0019) model time 0.2411 (0.2435) loss 4.3288 (3.6454) grad_norm 2.1280 (1.9275) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][620/1251] eta 0:02:34 lr 0.000978 wd 0.0500 time 0.2466 (0.2450) data time 0.0009 (0.0019) model time 0.2457 (0.2435) loss 3.3773 (3.6444) grad_norm 1.8820 (1.9304) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][630/1251] eta 0:02:32 lr 0.000978 wd 0.0500 time 0.2398 (0.2457) data time 0.0008 (0.0019) model time 0.2391 (0.2442) loss 3.9991 (3.6472) grad_norm 1.7264 (1.9272) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][640/1251] eta 0:02:30 lr 0.000978 wd 0.0500 time 0.2364 (0.2456) data time 0.0007 (0.0018) model time 0.2357 (0.2442) loss 3.7978 (3.6502) grad_norm 1.7509 (1.9255) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][650/1251] eta 0:02:27 lr 0.000978 wd 0.0500 time 0.2397 (0.2456) data time 0.0009 (0.0019) model time 0.2388 (0.2441) loss 4.1698 (3.6541) grad_norm 2.2119 (1.9285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][660/1251] eta 0:02:25 lr 0.000978 wd 0.0500 time 0.2376 (0.2455) data time 0.0012 (0.0018) model time 0.2364 (0.2440) loss 3.1898 
(3.6551) grad_norm 2.4529 (1.9370) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][670/1251] eta 0:02:22 lr 0.000978 wd 0.0500 time 0.2388 (0.2455) data time 0.0009 (0.0018) model time 0.2379 (0.2440) loss 3.3409 (3.6553) grad_norm 2.0443 (1.9353) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][680/1251] eta 0:02:20 lr 0.000978 wd 0.0500 time 0.2423 (0.2454) data time 0.0007 (0.0018) model time 0.2415 (0.2440) loss 2.9182 (3.6538) grad_norm 1.5369 (1.9330) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][690/1251] eta 0:02:17 lr 0.000978 wd 0.0500 time 0.2340 (0.2454) data time 0.0011 (0.0018) model time 0.2328 (0.2439) loss 3.7655 (3.6518) grad_norm 1.4754 (1.9348) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][700/1251] eta 0:02:15 lr 0.000978 wd 0.0500 time 0.2375 (0.2453) data time 0.0009 (0.0018) model time 0.2366 (0.2439) loss 2.7907 (3.6479) grad_norm 1.9311 (1.9341) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][710/1251] eta 0:02:12 lr 0.000978 wd 0.0500 time 0.2359 (0.2453) data time 0.0009 (0.0018) model time 0.2350 (0.2438) loss 4.4372 (3.6499) grad_norm 1.9228 (1.9354) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][720/1251] eta 0:02:10 lr 0.000978 wd 0.0500 time 0.2361 (0.2453) data time 0.0007 (0.0018) model time 0.2353 (0.2438) loss 2.6561 (3.6483) grad_norm 2.5616 (1.9383) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][730/1251] eta 0:02:07 lr 0.000978 wd 
0.0500 time 0.2398 (0.2452) data time 0.0011 (0.0018) model time 0.2387 (0.2437) loss 2.3689 (3.6437) grad_norm 2.3361 (1.9368) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][740/1251] eta 0:02:05 lr 0.000978 wd 0.0500 time 0.2339 (0.2451) data time 0.0011 (0.0018) model time 0.2328 (0.2437) loss 3.2426 (3.6405) grad_norm 1.9076 (1.9370) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][750/1251] eta 0:02:02 lr 0.000978 wd 0.0500 time 0.2344 (0.2453) data time 0.0008 (0.0018) model time 0.2335 (0.2439) loss 2.3024 (3.6392) grad_norm 1.6746 (1.9390) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][760/1251] eta 0:02:00 lr 0.000978 wd 0.0500 time 0.2405 (0.2453) data time 0.0007 (0.0018) model time 0.2398 (0.2439) loss 4.5778 (3.6433) grad_norm 2.4770 (1.9382) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][770/1251] eta 0:01:57 lr 0.000978 wd 0.0500 time 0.2432 (0.2452) data time 0.0010 (0.0017) model time 0.2423 (0.2438) loss 3.8239 (3.6393) grad_norm 1.8088 (1.9367) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][780/1251] eta 0:01:55 lr 0.000978 wd 0.0500 time 0.2357 (0.2452) data time 0.0010 (0.0017) model time 0.2347 (0.2437) loss 3.5871 (3.6392) grad_norm 1.4954 (1.9350) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][790/1251] eta 0:01:53 lr 0.000978 wd 0.0500 time 0.2406 (0.2451) data time 0.0010 (0.0017) model time 0.2396 (0.2437) loss 2.5595 (3.6380) grad_norm 1.3927 (1.9318) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][800/1251] eta 0:01:50 lr 0.000978 wd 0.0500 time 0.2534 (0.2451) data time 0.0010 (0.0017) model time 0.2524 (0.2436) loss 4.0349 (3.6403) grad_norm 2.0684 (1.9303) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][810/1251] eta 0:01:48 lr 0.000978 wd 0.0500 time 0.2453 (0.2451) data time 0.0010 (0.0017) model time 0.2443 (0.2436) loss 3.9504 (3.6391) grad_norm 1.9041 (1.9293) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][820/1251] eta 0:01:45 lr 0.000978 wd 0.0500 time 0.2455 (0.2450) data time 0.0010 (0.0017) model time 0.2445 (0.2436) loss 3.7786 (3.6388) grad_norm 2.8448 (1.9255) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][830/1251] eta 0:01:43 lr 0.000978 wd 0.0500 time 0.2407 (0.2450) data time 0.0009 (0.0017) model time 0.2398 (0.2435) loss 3.4399 (3.6379) grad_norm 2.7203 (1.9324) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][840/1251] eta 0:01:40 lr 0.000978 wd 0.0500 time 0.2519 (0.2449) data time 0.0011 (0.0017) model time 0.2508 (0.2435) loss 3.6942 (3.6375) grad_norm 1.2305 (1.9325) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][850/1251] eta 0:01:38 lr 0.000978 wd 0.0500 time 0.2469 (0.2449) data time 0.0010 (0.0017) model time 0.2459 (0.2434) loss 3.3810 (3.6389) grad_norm 1.2708 (1.9293) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][860/1251] eta 0:01:35 lr 0.000978 wd 0.0500 time 0.2411 (0.2448) data time 0.0009 (0.0017) model time 0.2402 (0.2434) loss 3.3218 
(3.6402) grad_norm 1.6224 (1.9269) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][870/1251] eta 0:01:33 lr 0.000978 wd 0.0500 time 0.2378 (0.2448) data time 0.0007 (0.0017) model time 0.2372 (0.2434) loss 4.3172 (3.6383) grad_norm 1.5459 (1.9236) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][880/1251] eta 0:01:30 lr 0.000978 wd 0.0500 time 0.2442 (0.2448) data time 0.0007 (0.0017) model time 0.2435 (0.2433) loss 2.8135 (3.6392) grad_norm 1.7076 (1.9220) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][890/1251] eta 0:01:28 lr 0.000978 wd 0.0500 time 0.2274 (0.2447) data time 0.0008 (0.0017) model time 0.2266 (0.2433) loss 2.8840 (3.6370) grad_norm 1.9783 (1.9208) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][900/1251] eta 0:01:25 lr 0.000978 wd 0.0500 time 0.2414 (0.2450) data time 0.0007 (0.0016) model time 0.2407 (0.2435) loss 3.4200 (3.6361) grad_norm 2.0686 (1.9241) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][910/1251] eta 0:01:23 lr 0.000978 wd 0.0500 time 0.2448 (0.2449) data time 0.0007 (0.0016) model time 0.2440 (0.2435) loss 3.5619 (3.6359) grad_norm 1.6033 (1.9220) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][920/1251] eta 0:01:21 lr 0.000978 wd 0.0500 time 0.2428 (0.2449) data time 0.0009 (0.0016) model time 0.2420 (0.2435) loss 4.1518 (3.6387) grad_norm 1.8384 (1.9220) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][930/1251] eta 0:01:18 lr 0.000978 wd 
0.0500 time 0.2399 (0.2449) data time 0.0007 (0.0016) model time 0.2392 (0.2435) loss 2.7218 (3.6352) grad_norm 2.4124 (1.9207) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][940/1251] eta 0:01:16 lr 0.000978 wd 0.0500 time 0.2437 (0.2448) data time 0.0008 (0.0016) model time 0.2429 (0.2434) loss 2.8930 (3.6370) grad_norm 2.6294 (1.9210) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][950/1251] eta 0:01:13 lr 0.000978 wd 0.0500 time 0.2384 (0.2448) data time 0.0008 (0.0016) model time 0.2376 (0.2434) loss 2.6736 (3.6366) grad_norm 2.0386 (1.9198) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][960/1251] eta 0:01:11 lr 0.000978 wd 0.0500 time 0.2500 (0.2448) data time 0.0010 (0.0016) model time 0.2490 (0.2434) loss 3.5868 (3.6352) grad_norm 1.6637 (1.9182) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][970/1251] eta 0:01:08 lr 0.000978 wd 0.0500 time 0.2380 (0.2447) data time 0.0009 (0.0016) model time 0.2371 (0.2433) loss 4.5165 (3.6387) grad_norm 2.9647 (1.9259) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][980/1251] eta 0:01:06 lr 0.000978 wd 0.0500 time 0.2473 (0.2447) data time 0.0010 (0.0016) model time 0.2462 (0.2433) loss 2.2411 (3.6309) grad_norm 1.5748 (1.9242) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][990/1251] eta 0:01:03 lr 0.000978 wd 0.0500 time 0.2424 (0.2447) data time 0.0010 (0.0016) model time 0.2414 (0.2433) loss 3.1868 (3.6303) grad_norm 1.4975 (1.9213) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:04 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1000/1251] eta 0:01:01 lr 0.000978 wd 0.0500 time 0.2494 (0.2447) data time 0.0009 (0.0016) model time 0.2486 (0.2433) loss 2.5715 (3.6281) grad_norm 1.2840 (1.9213) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1010/1251] eta 0:00:58 lr 0.000978 wd 0.0500 time 0.2471 (0.2447) data time 0.0010 (0.0016) model time 0.2460 (0.2433) loss 4.0337 (3.6252) grad_norm 1.5549 (1.9224) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1020/1251] eta 0:00:56 lr 0.000978 wd 0.0500 time 0.2449 (0.2447) data time 0.0011 (0.0016) model time 0.2438 (0.2433) loss 3.8231 (3.6287) grad_norm 1.7266 (1.9217) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1030/1251] eta 0:00:54 lr 0.000978 wd 0.0500 time 0.2443 (0.2446) data time 0.0008 (0.0016) model time 0.2435 (0.2432) loss 2.9522 (3.6279) grad_norm 1.7349 (1.9216) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1040/1251] eta 0:00:51 lr 0.000978 wd 0.0500 time 0.2466 (0.2446) data time 0.0007 (0.0016) model time 0.2460 (0.2432) loss 4.2781 (3.6285) grad_norm 2.9950 (1.9234) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1050/1251] eta 0:00:49 lr 0.000978 wd 0.0500 time 0.2461 (0.2446) data time 0.0008 (0.0016) model time 0.2454 (0.2432) loss 3.4842 (3.6286) grad_norm 2.0130 (1.9270) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1060/1251] eta 0:00:46 lr 0.000978 wd 0.0500 time 0.2438 (0.2446) data time 0.0013 (0.0016) model time 0.2425 (0.2432) loss 3.9662 
(3.6307) grad_norm 1.6588 (1.9252) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1070/1251] eta 0:00:44 lr 0.000978 wd 0.0500 time 0.2383 (0.2446) data time 0.0011 (0.0016) model time 0.2371 (0.2432) loss 3.6135 (3.6316) grad_norm 1.8046 (1.9231) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1080/1251] eta 0:00:41 lr 0.000978 wd 0.0500 time 0.2433 (0.2445) data time 0.0009 (0.0015) model time 0.2423 (0.2432) loss 3.5880 (3.6313) grad_norm 1.2082 (1.9206) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1090/1251] eta 0:00:39 lr 0.000978 wd 0.0500 time 0.2399 (0.2445) data time 0.0012 (0.0015) model time 0.2387 (0.2431) loss 3.6201 (3.6335) grad_norm 2.2278 (1.9195) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1100/1251] eta 0:00:36 lr 0.000978 wd 0.0500 time 0.2477 (0.2445) data time 0.0009 (0.0015) model time 0.2468 (0.2431) loss 3.9581 (3.6348) grad_norm 2.2791 (1.9191) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1110/1251] eta 0:00:34 lr 0.000978 wd 0.0500 time 0.2484 (0.2445) data time 0.0016 (0.0015) model time 0.2468 (0.2431) loss 3.9676 (3.6370) grad_norm 2.4326 (1.9195) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1120/1251] eta 0:00:32 lr 0.000978 wd 0.0500 time 0.2411 (0.2444) data time 0.0011 (0.0015) model time 0.2400 (0.2431) loss 3.5560 (3.6350) grad_norm 2.0809 (1.9202) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1130/1251] eta 0:00:29 lr 
0.000978 wd 0.0500 time 0.2416 (0.2444) data time 0.0011 (0.0015) model time 0.2405 (0.2431) loss 4.1246 (3.6333) grad_norm 1.7973 (1.9232) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1140/1251] eta 0:00:27 lr 0.000978 wd 0.0500 time 0.2402 (0.2444) data time 0.0009 (0.0015) model time 0.2393 (0.2431) loss 3.6775 (3.6348) grad_norm 1.6759 (1.9223) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1150/1251] eta 0:00:24 lr 0.000978 wd 0.0500 time 0.2457 (0.2444) data time 0.0009 (0.0015) model time 0.2448 (0.2430) loss 4.0739 (3.6367) grad_norm 1.4240 (1.9213) loss_scale 16384.0000 (8206.2346) mem 7379MB [2024-08-26 06:13:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1160/1251] eta 0:00:22 lr 0.000978 wd 0.0500 time 0.4532 (0.2445) data time 0.0010 (0.0015) model time 0.4522 (0.2432) loss 4.0998 (3.6391) grad_norm 2.0830 (1.9206) loss_scale 16384.0000 (8276.6718) mem 7379MB [2024-08-26 06:13:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1170/1251] eta 0:00:19 lr 0.000978 wd 0.0500 time 0.2434 (0.2447) data time 0.0007 (0.0015) model time 0.2427 (0.2434) loss 3.0015 (3.6408) grad_norm 1.8262 (1.9196) loss_scale 16384.0000 (8345.9061) mem 7379MB [2024-08-26 06:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1180/1251] eta 0:00:17 lr 0.000978 wd 0.0500 time 0.2389 (0.2447) data time 0.0008 (0.0015) model time 0.2381 (0.2433) loss 3.6392 (3.6407) grad_norm 1.8940 (1.9189) loss_scale 16384.0000 (8413.9678) mem 7379MB [2024-08-26 06:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1190/1251] eta 0:00:14 lr 0.000978 wd 0.0500 time 0.2478 (0.2450) data time 0.0007 (0.0015) model time 0.2471 (0.2437) loss 4.0589 (3.6414) grad_norm 2.4654 (1.9211) loss_scale 16384.0000 (8480.8866) mem 7379MB 
[2024-08-26 06:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1200/1251] eta 0:00:12 lr 0.000978 wd 0.0500 time 0.2423 (0.2450) data time 0.0010 (0.0015) model time 0.2413 (0.2437) loss 3.7834 (3.6394) grad_norm 3.3873 (1.9280) loss_scale 16384.0000 (8546.6911) mem 7379MB [2024-08-26 06:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1210/1251] eta 0:00:10 lr 0.000978 wd 0.0500 time 0.2406 (0.2449) data time 0.0007 (0.0015) model time 0.2399 (0.2436) loss 3.2050 (3.6385) grad_norm 1.6145 (1.9277) loss_scale 16384.0000 (8611.4088) mem 7379MB [2024-08-26 06:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1220/1251] eta 0:00:07 lr 0.000978 wd 0.0500 time 0.2359 (0.2449) data time 0.0013 (0.0015) model time 0.2346 (0.2436) loss 3.8157 (3.6375) grad_norm 1.9132 (1.9274) loss_scale 16384.0000 (8675.0663) mem 7379MB [2024-08-26 06:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1230/1251] eta 0:00:05 lr 0.000977 wd 0.0500 time 0.2486 (0.2449) data time 0.0011 (0.0015) model time 0.2476 (0.2436) loss 4.1517 (3.6380) grad_norm 2.4459 (1.9261) loss_scale 16384.0000 (8737.6897) mem 7379MB [2024-08-26 06:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1240/1251] eta 0:00:02 lr 0.000977 wd 0.0500 time 0.2252 (0.2448) data time 0.0005 (0.0015) model time 0.2248 (0.2435) loss 3.9089 (3.6376) grad_norm 2.5191 (1.9281) loss_scale 16384.0000 (8799.3038) mem 7379MB [2024-08-26 06:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [46/300][1250/1251] eta 0:00:00 lr 0.000977 wd 0.0500 time 0.2263 (0.2446) data time 0.0005 (0.0015) model time 0.2258 (0.2433) loss 2.7601 (3.6352) grad_norm 1.2507 (1.9259) loss_scale 16384.0000 (8859.9329) mem 7379MB [2024-08-26 06:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 46 training takes 0:05:06 [2024-08-26 06:14:05 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 06:14:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 06:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.414 (0.414) Loss 0.5034 (0.5034) Acc@1 90.039 (90.039) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 06:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.110) Loss 0.9194 (0.8836) Acc@1 79.395 (79.767) Acc@5 95.312 (95.375) Mem 7379MB [2024-08-26 06:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.094) Loss 1.2793 (0.9026) Acc@1 69.043 (78.916) Acc@5 90.332 (95.340) Mem 7379MB [2024-08-26 06:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.088) Loss 1.5996 (1.0354) Acc@1 62.207 (76.024) Acc@5 85.352 (93.489) Mem 7379MB [2024-08-26 06:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.5244 (1.1164) Acc@1 64.648 (74.171) Acc@5 87.207 (92.476) Mem 7379MB [2024-08-26 06:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 73.920 Acc@5 92.382 [2024-08-26 06:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 73.9% [2024-08-26 06:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.834 (0.834) Loss 0.4690 (0.4690) Acc@1 89.844 (89.844) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 06:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.150) Loss 0.7915 (0.7705) Acc@1 82.031 (82.120) Acc@5 95.801 (96.049) Mem 7379MB [2024-08-26 06:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.116) Loss 1.1279 (0.7867) Acc@1 73.145 (81.283) Acc@5 91.602 (96.029) Mem 7379MB [2024-08-26 06:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.073 (0.103) Loss 1.3994 (0.9076) Acc@1 64.746 (78.453) Acc@5 87.793 (94.478) Mem 7379MB [2024-08-26 06:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.3281 (0.9769) Acc@1 66.699 (76.789) Acc@5 88.965 (93.643) Mem 7379MB [2024-08-26 06:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.482 Acc@5 93.582 [2024-08-26 06:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 76.5% [2024-08-26 06:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 76.48% [2024-08-26 06:14:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 06:14:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 06:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][0/1251] eta 0:14:13 lr 0.000977 wd 0.0500 time 0.6822 (0.6822) data time 0.4576 (0.4576) model time 0.0000 (0.0000) loss 3.5845 (3.5845) grad_norm 1.6298 (1.6298) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][10/1251] eta 0:05:48 lr 0.000977 wd 0.0500 time 0.2452 (0.2808) data time 0.0007 (0.0428) model time 0.0000 (0.0000) loss 4.1944 (3.8526) grad_norm 1.7197 (1.7727) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][20/1251] eta 0:05:21 lr 0.000977 wd 0.0500 time 0.2355 (0.2614) data time 0.0009 (0.0229) model time 0.0000 (0.0000) loss 4.4497 (3.8691) grad_norm 1.9033 (1.8655) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][30/1251] eta 0:05:10 lr 0.000977 wd 0.0500 time 0.2407 (0.2543) data time 0.0010 
(0.0158) model time 0.0000 (0.0000) loss 3.0991 (3.7985) grad_norm 1.8142 (1.8778) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][40/1251] eta 0:05:04 lr 0.000977 wd 0.0500 time 0.2511 (0.2513) data time 0.0010 (0.0122) model time 0.0000 (0.0000) loss 3.0920 (3.6838) grad_norm 2.0279 (1.9857) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][50/1251] eta 0:04:59 lr 0.000977 wd 0.0500 time 0.2525 (0.2494) data time 0.0010 (0.0100) model time 0.0000 (0.0000) loss 3.2047 (3.7031) grad_norm 1.5196 (2.0333) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][60/1251] eta 0:04:55 lr 0.000977 wd 0.0500 time 0.2502 (0.2479) data time 0.0009 (0.0086) model time 0.2493 (0.2392) loss 3.9670 (3.6554) grad_norm 1.6588 (1.9824) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][70/1251] eta 0:04:51 lr 0.000977 wd 0.0500 time 0.2367 (0.2469) data time 0.0010 (0.0075) model time 0.2356 (0.2392) loss 3.1199 (3.6752) grad_norm 1.4851 (1.9546) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][80/1251] eta 0:04:48 lr 0.000977 wd 0.0500 time 0.2418 (0.2462) data time 0.0009 (0.0067) model time 0.2409 (0.2397) loss 4.3497 (3.6720) grad_norm 2.5970 (1.9745) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][90/1251] eta 0:04:44 lr 0.000977 wd 0.0500 time 0.2355 (0.2454) data time 0.0010 (0.0061) model time 0.2345 (0.2393) loss 4.0435 (3.6504) grad_norm 1.7433 (1.9414) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [47/300][100/1251] eta 0:04:41 lr 0.000977 wd 0.0500 time 0.2474 (0.2449) data time 0.0010 (0.0056) model time 0.2464 (0.2392) loss 4.0184 (3.6415) grad_norm 1.8889 (1.9084) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][110/1251] eta 0:04:41 lr 0.000977 wd 0.0500 time 0.4740 (0.2467) data time 0.0011 (0.0052) model time 0.4730 (0.2434) loss 4.0598 (3.6465) grad_norm 1.1997 (1.9034) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][120/1251] eta 0:04:38 lr 0.000977 wd 0.0500 time 0.2485 (0.2463) data time 0.0011 (0.0048) model time 0.2474 (0.2429) loss 3.9376 (3.6527) grad_norm 2.0743 (1.9198) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][130/1251] eta 0:04:35 lr 0.000977 wd 0.0500 time 0.2494 (0.2459) data time 0.0008 (0.0045) model time 0.2487 (0.2426) loss 2.5192 (3.6363) grad_norm 1.9916 (1.9424) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][140/1251] eta 0:04:32 lr 0.000977 wd 0.0500 time 0.2466 (0.2456) data time 0.0010 (0.0043) model time 0.2456 (0.2425) loss 3.2852 (3.6299) grad_norm 1.8202 (1.9460) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][150/1251] eta 0:04:30 lr 0.000977 wd 0.0500 time 0.2590 (0.2455) data time 0.0011 (0.0042) model time 0.2580 (0.2423) loss 3.8603 (3.6074) grad_norm 3.0077 (1.9410) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][160/1251] eta 0:04:27 lr 0.000977 wd 0.0500 time 0.2587 (0.2453) data time 0.0010 (0.0040) model time 0.2578 (0.2422) loss 3.6761 (3.5996) grad_norm 2.7374 (1.9449) 
loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][170/1251] eta 0:04:24 lr 0.000977 wd 0.0500 time 0.2474 (0.2450) data time 0.0008 (0.0038) model time 0.2466 (0.2420) loss 3.0037 (3.5844) grad_norm 2.2589 (1.9343) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][180/1251] eta 0:04:22 lr 0.000977 wd 0.0500 time 0.2400 (0.2448) data time 0.0010 (0.0037) model time 0.2390 (0.2419) loss 2.6110 (3.5797) grad_norm 1.6357 (1.9246) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][190/1251] eta 0:04:19 lr 0.000977 wd 0.0500 time 0.2511 (0.2448) data time 0.0010 (0.0035) model time 0.2501 (0.2419) loss 3.5485 (3.5779) grad_norm 1.6292 (1.9145) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][200/1251] eta 0:04:17 lr 0.000977 wd 0.0500 time 0.2418 (0.2446) data time 0.0009 (0.0034) model time 0.2409 (0.2418) loss 3.7850 (3.5933) grad_norm 2.6171 (1.9185) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][210/1251] eta 0:04:14 lr 0.000977 wd 0.0500 time 0.2476 (0.2446) data time 0.0011 (0.0033) model time 0.2465 (0.2419) loss 3.8709 (3.5849) grad_norm 2.2512 (1.9204) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][220/1251] eta 0:04:12 lr 0.000977 wd 0.0500 time 0.2393 (0.2444) data time 0.0008 (0.0032) model time 0.2384 (0.2418) loss 4.0688 (3.5980) grad_norm 2.0734 (1.9185) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][230/1251] eta 0:04:09 lr 0.000977 wd 0.0500 time 0.2390 
(0.2442) data time 0.0009 (0.0031) model time 0.2381 (0.2416) loss 4.2005 (3.6108) grad_norm 1.3400 (1.9063) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][240/1251] eta 0:04:06 lr 0.000977 wd 0.0500 time 0.2434 (0.2441) data time 0.0007 (0.0030) model time 0.2427 (0.2415) loss 4.1985 (3.6271) grad_norm 1.9817 (1.9040) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][250/1251] eta 0:04:04 lr 0.000977 wd 0.0500 time 0.2415 (0.2440) data time 0.0010 (0.0029) model time 0.2404 (0.2415) loss 2.4796 (3.6283) grad_norm 2.5236 (1.9015) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][260/1251] eta 0:04:01 lr 0.000977 wd 0.0500 time 0.2481 (0.2439) data time 0.0012 (0.0029) model time 0.2469 (0.2415) loss 3.4568 (3.6292) grad_norm 1.4613 (1.9265) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][270/1251] eta 0:03:59 lr 0.000977 wd 0.0500 time 0.2418 (0.2439) data time 0.0009 (0.0028) model time 0.2409 (0.2415) loss 3.4136 (3.6204) grad_norm 1.9611 (1.9460) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][280/1251] eta 0:03:56 lr 0.000977 wd 0.0500 time 0.2385 (0.2438) data time 0.0011 (0.0028) model time 0.2374 (0.2414) loss 3.6803 (3.6283) grad_norm 1.5092 (1.9428) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][290/1251] eta 0:03:54 lr 0.000977 wd 0.0500 time 0.2421 (0.2436) data time 0.0007 (0.0027) model time 0.2414 (0.2413) loss 4.0054 (3.6330) grad_norm 2.0889 (1.9331) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][300/1251] eta 0:03:51 lr 0.000977 wd 0.0500 time 0.2576 (0.2437) data time 0.0014 (0.0026) model time 0.2562 (0.2414) loss 3.3724 (3.6359) grad_norm 1.5996 (1.9300) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][310/1251] eta 0:03:49 lr 0.000977 wd 0.0500 time 0.2397 (0.2437) data time 0.0009 (0.0026) model time 0.2388 (0.2414) loss 3.7014 (3.6413) grad_norm 2.2280 (1.9258) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][320/1251] eta 0:03:47 lr 0.000977 wd 0.0500 time 0.2429 (0.2443) data time 0.0007 (0.0025) model time 0.2422 (0.2422) loss 4.0825 (3.6368) grad_norm 2.0510 (1.9237) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][330/1251] eta 0:03:44 lr 0.000977 wd 0.0500 time 0.2472 (0.2443) data time 0.0012 (0.0025) model time 0.2460 (0.2423) loss 3.9410 (3.6323) grad_norm 2.0011 (1.9183) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][340/1251] eta 0:03:42 lr 0.000977 wd 0.0500 time 0.2548 (0.2442) data time 0.0007 (0.0024) model time 0.2541 (0.2422) loss 4.4163 (3.6278) grad_norm 1.3526 (1.9186) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][350/1251] eta 0:03:39 lr 0.000977 wd 0.0500 time 0.2449 (0.2442) data time 0.0009 (0.0024) model time 0.2440 (0.2422) loss 4.4558 (3.6320) grad_norm 2.2131 (1.9153) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][360/1251] eta 0:03:37 lr 0.000977 wd 0.0500 time 0.2414 (0.2440) data time 0.0008 (0.0024) model time 0.2406 (0.2420) loss 
3.8760 (3.6335) grad_norm 1.5388 (1.9105) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][370/1251] eta 0:03:34 lr 0.000977 wd 0.0500 time 0.2369 (0.2440) data time 0.0009 (0.0023) model time 0.2360 (0.2420) loss 3.3129 (3.6309) grad_norm 2.2193 (1.9092) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][380/1251] eta 0:03:32 lr 0.000977 wd 0.0500 time 0.2337 (0.2439) data time 0.0010 (0.0023) model time 0.2328 (0.2420) loss 3.5876 (3.6305) grad_norm 1.9842 (1.9120) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][390/1251] eta 0:03:29 lr 0.000977 wd 0.0500 time 0.2387 (0.2439) data time 0.0010 (0.0023) model time 0.2378 (0.2420) loss 4.2230 (3.6350) grad_norm 2.4162 (1.9110) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][400/1251] eta 0:03:27 lr 0.000977 wd 0.0500 time 0.2378 (0.2438) data time 0.0009 (0.0022) model time 0.2368 (0.2419) loss 3.8245 (3.6352) grad_norm 1.6259 (1.9163) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][410/1251] eta 0:03:24 lr 0.000977 wd 0.0500 time 0.2448 (0.2437) data time 0.0010 (0.0022) model time 0.2438 (0.2418) loss 2.9226 (3.6346) grad_norm 2.1433 (1.9201) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][420/1251] eta 0:03:22 lr 0.000977 wd 0.0500 time 0.2421 (0.2437) data time 0.0007 (0.0022) model time 0.2414 (0.2418) loss 3.8693 (3.6273) grad_norm 1.4901 (1.9146) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:16:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][430/1251] eta 
0:03:20 lr 0.000977 wd 0.0500 time 0.2481 (0.2436) data time 0.0010 (0.0021) model time 0.2471 (0.2418) loss 4.6793 (3.6285) grad_norm 2.5236 (1.9158) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][440/1251] eta 0:03:17 lr 0.000977 wd 0.0500 time 0.2435 (0.2437) data time 0.0011 (0.0021) model time 0.2424 (0.2418) loss 3.5563 (3.6290) grad_norm 2.0783 (1.9106) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:16:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][450/1251] eta 0:03:15 lr 0.000977 wd 0.0500 time 0.2374 (0.2446) data time 0.0009 (0.0021) model time 0.2365 (0.2429) loss 3.3340 (3.6325) grad_norm 1.4531 (1.9102) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][460/1251] eta 0:03:13 lr 0.000977 wd 0.0500 time 0.2428 (0.2445) data time 0.0009 (0.0021) model time 0.2419 (0.2428) loss 3.7999 (3.6360) grad_norm 1.4086 (1.9050) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][470/1251] eta 0:03:10 lr 0.000977 wd 0.0500 time 0.2395 (0.2444) data time 0.0009 (0.0021) model time 0.2386 (0.2427) loss 3.2797 (3.6310) grad_norm 3.1234 (inf) loss_scale 8192.0000 (16349.2144) mem 7379MB [2024-08-26 06:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][480/1251] eta 0:03:08 lr 0.000977 wd 0.0500 time 0.2495 (0.2444) data time 0.0007 (0.0020) model time 0.2489 (0.2427) loss 3.7534 (3.6364) grad_norm 2.1704 (inf) loss_scale 8192.0000 (16179.6258) mem 7379MB [2024-08-26 06:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][490/1251] eta 0:03:06 lr 0.000977 wd 0.0500 time 0.2449 (0.2449) data time 0.0007 (0.0020) model time 0.2442 (0.2433) loss 3.4823 (3.6362) grad_norm 1.7730 (inf) loss_scale 8192.0000 (16016.9450) mem 7379MB 
[2024-08-26 06:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][500/1251] eta 0:03:03 lr 0.000977 wd 0.0500 time 0.2381 (0.2449) data time 0.0009 (0.0020) model time 0.2372 (0.2433) loss 2.6944 (3.6423) grad_norm 1.8022 (inf) loss_scale 8192.0000 (15860.7585) mem 7379MB [2024-08-26 06:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][510/1251] eta 0:03:01 lr 0.000977 wd 0.0500 time 0.2526 (0.2448) data time 0.0009 (0.0020) model time 0.2517 (0.2433) loss 3.6194 (3.6392) grad_norm 1.9260 (inf) loss_scale 8192.0000 (15710.6849) mem 7379MB [2024-08-26 06:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][520/1251] eta 0:02:58 lr 0.000977 wd 0.0500 time 0.2516 (0.2448) data time 0.0008 (0.0020) model time 0.2508 (0.2432) loss 2.3905 (3.6329) grad_norm 1.7252 (inf) loss_scale 8192.0000 (15566.3724) mem 7379MB [2024-08-26 06:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][530/1251] eta 0:02:56 lr 0.000977 wd 0.0500 time 0.2483 (0.2447) data time 0.0010 (0.0019) model time 0.2473 (0.2432) loss 2.8894 (3.6342) grad_norm 1.7583 (inf) loss_scale 8192.0000 (15427.4953) mem 7379MB [2024-08-26 06:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][540/1251] eta 0:02:53 lr 0.000977 wd 0.0500 time 0.2406 (0.2447) data time 0.0010 (0.0019) model time 0.2396 (0.2431) loss 4.0529 (3.6314) grad_norm 1.5598 (inf) loss_scale 8192.0000 (15293.7523) mem 7379MB [2024-08-26 06:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][550/1251] eta 0:02:51 lr 0.000977 wd 0.0500 time 0.2397 (0.2447) data time 0.0009 (0.0019) model time 0.2387 (0.2432) loss 3.9950 (3.6288) grad_norm 1.6413 (inf) loss_scale 8192.0000 (15164.8639) mem 7379MB [2024-08-26 06:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][560/1251] eta 0:02:49 lr 0.000977 wd 0.0500 time 0.2375 (0.2447) data time 0.0009 (0.0019) model time 0.2366 (0.2431) loss 
3.0425 (3.6267) grad_norm 2.0175 (inf) loss_scale 8192.0000 (15040.5704) mem 7379MB [2024-08-26 06:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][570/1251] eta 0:02:46 lr 0.000977 wd 0.0500 time 0.2438 (0.2446) data time 0.0007 (0.0019) model time 0.2431 (0.2431) loss 3.0515 (3.6256) grad_norm 1.4839 (inf) loss_scale 8192.0000 (14920.6305) mem 7379MB [2024-08-26 06:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][580/1251] eta 0:02:44 lr 0.000977 wd 0.0500 time 0.2660 (0.2446) data time 0.0008 (0.0019) model time 0.2652 (0.2431) loss 3.4085 (3.6220) grad_norm 1.5625 (inf) loss_scale 8192.0000 (14804.8193) mem 7379MB [2024-08-26 06:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][590/1251] eta 0:02:41 lr 0.000977 wd 0.0500 time 0.2425 (0.2446) data time 0.0009 (0.0018) model time 0.2415 (0.2430) loss 3.4241 (3.6216) grad_norm 1.6925 (inf) loss_scale 8192.0000 (14692.9272) mem 7379MB [2024-08-26 06:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][600/1251] eta 0:02:39 lr 0.000977 wd 0.0500 time 0.2429 (0.2445) data time 0.0009 (0.0018) model time 0.2420 (0.2430) loss 3.7978 (3.6246) grad_norm 1.7186 (inf) loss_scale 8192.0000 (14584.7587) mem 7379MB [2024-08-26 06:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][610/1251] eta 0:02:36 lr 0.000977 wd 0.0500 time 0.2501 (0.2445) data time 0.0007 (0.0018) model time 0.2494 (0.2430) loss 4.0854 (3.6200) grad_norm 1.5044 (inf) loss_scale 8192.0000 (14480.1309) mem 7379MB [2024-08-26 06:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][620/1251] eta 0:02:34 lr 0.000977 wd 0.0500 time 0.2336 (0.2444) data time 0.0010 (0.0018) model time 0.2326 (0.2429) loss 3.6708 (3.6214) grad_norm 1.9209 (inf) loss_scale 8192.0000 (14378.8728) mem 7379MB [2024-08-26 06:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][630/1251] eta 0:02:31 lr 0.000977 wd 0.0500 
time 0.2463 (0.2444) data time 0.0008 (0.0018) model time 0.2455 (0.2429) loss 3.7974 (3.6203) grad_norm 1.5173 (inf) loss_scale 8192.0000 (14280.8241) mem 7379MB [2024-08-26 06:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][640/1251] eta 0:02:29 lr 0.000977 wd 0.0500 time 0.2364 (0.2444) data time 0.0011 (0.0018) model time 0.2354 (0.2429) loss 3.5164 (3.6184) grad_norm 1.8958 (inf) loss_scale 8192.0000 (14185.8346) mem 7379MB [2024-08-26 06:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][650/1251] eta 0:02:26 lr 0.000977 wd 0.0500 time 0.2361 (0.2443) data time 0.0009 (0.0018) model time 0.2352 (0.2429) loss 3.0721 (3.6228) grad_norm 2.4276 (inf) loss_scale 8192.0000 (14093.7634) mem 7379MB [2024-08-26 06:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][660/1251] eta 0:02:24 lr 0.000977 wd 0.0500 time 0.2427 (0.2443) data time 0.0009 (0.0018) model time 0.2417 (0.2428) loss 4.3222 (3.6280) grad_norm 2.2124 (inf) loss_scale 8192.0000 (14004.4781) mem 7379MB [2024-08-26 06:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][670/1251] eta 0:02:21 lr 0.000977 wd 0.0500 time 0.2354 (0.2442) data time 0.0007 (0.0017) model time 0.2347 (0.2428) loss 3.1620 (3.6317) grad_norm 1.8746 (inf) loss_scale 8192.0000 (13917.8539) mem 7379MB [2024-08-26 06:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][680/1251] eta 0:02:19 lr 0.000977 wd 0.0500 time 0.2366 (0.2442) data time 0.0009 (0.0017) model time 0.2356 (0.2427) loss 3.6298 (3.6308) grad_norm 2.2431 (inf) loss_scale 8192.0000 (13833.7739) mem 7379MB [2024-08-26 06:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][690/1251] eta 0:02:16 lr 0.000977 wd 0.0500 time 0.2407 (0.2442) data time 0.0010 (0.0017) model time 0.2397 (0.2427) loss 4.0939 (3.6313) grad_norm 2.1317 (inf) loss_scale 8192.0000 (13752.1274) mem 7379MB [2024-08-26 06:17:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [47/300][700/1251] eta 0:02:14 lr 0.000977 wd 0.0500 time 0.2462 (0.2442) data time 0.0010 (0.0017) model time 0.2452 (0.2427) loss 3.8978 (3.6341) grad_norm 1.5872 (inf) loss_scale 8192.0000 (13672.8103) mem 7379MB [2024-08-26 06:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][710/1251] eta 0:02:12 lr 0.000977 wd 0.0500 time 0.2344 (0.2441) data time 0.0009 (0.0017) model time 0.2335 (0.2426) loss 4.5418 (3.6393) grad_norm 1.5609 (inf) loss_scale 8192.0000 (13595.7243) mem 7379MB [2024-08-26 06:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][720/1251] eta 0:02:09 lr 0.000976 wd 0.0500 time 0.2457 (0.2441) data time 0.0010 (0.0017) model time 0.2447 (0.2426) loss 3.7759 (3.6430) grad_norm 1.6157 (inf) loss_scale 8192.0000 (13520.7767) mem 7379MB [2024-08-26 06:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][730/1251] eta 0:02:07 lr 0.000976 wd 0.0500 time 0.2358 (0.2441) data time 0.0007 (0.0017) model time 0.2351 (0.2427) loss 4.7119 (3.6461) grad_norm 2.0921 (inf) loss_scale 8192.0000 (13447.8796) mem 7379MB [2024-08-26 06:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][740/1251] eta 0:02:04 lr 0.000976 wd 0.0500 time 0.2614 (0.2442) data time 0.0010 (0.0017) model time 0.2604 (0.2427) loss 3.7719 (3.6471) grad_norm 1.7978 (inf) loss_scale 8192.0000 (13376.9501) mem 7379MB [2024-08-26 06:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][750/1251] eta 0:02:02 lr 0.000976 wd 0.0500 time 0.2560 (0.2441) data time 0.0008 (0.0017) model time 0.2553 (0.2426) loss 3.3590 (3.6479) grad_norm 1.6760 (inf) loss_scale 8192.0000 (13307.9095) mem 7379MB [2024-08-26 06:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][760/1251] eta 0:01:59 lr 0.000976 wd 0.0500 time 0.2447 (0.2441) data time 0.0009 (0.0017) model time 0.2438 (0.2426) loss 4.0377 (3.6499) grad_norm 1.5860 (inf) 
loss_scale 8192.0000 (13240.6833) mem 7379MB [2024-08-26 06:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][770/1251] eta 0:01:57 lr 0.000976 wd 0.0500 time 0.2474 (0.2441) data time 0.0009 (0.0017) model time 0.2465 (0.2426) loss 4.4971 (3.6491) grad_norm 1.6213 (inf) loss_scale 8192.0000 (13175.2010) mem 7379MB [2024-08-26 06:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][780/1251] eta 0:01:54 lr 0.000976 wd 0.0500 time 0.2456 (0.2441) data time 0.0010 (0.0017) model time 0.2446 (0.2426) loss 3.8151 (3.6541) grad_norm 1.5880 (inf) loss_scale 8192.0000 (13111.3956) mem 7379MB [2024-08-26 06:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][790/1251] eta 0:01:52 lr 0.000976 wd 0.0500 time 0.2435 (0.2441) data time 0.0009 (0.0017) model time 0.2426 (0.2426) loss 3.5563 (3.6575) grad_norm 1.9170 (inf) loss_scale 8192.0000 (13049.2035) mem 7379MB [2024-08-26 06:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][800/1251] eta 0:01:50 lr 0.000976 wd 0.0500 time 0.2496 (0.2441) data time 0.0010 (0.0017) model time 0.2486 (0.2426) loss 3.7373 (3.6533) grad_norm 1.6296 (inf) loss_scale 8192.0000 (12988.5643) mem 7379MB [2024-08-26 06:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][810/1251] eta 0:01:47 lr 0.000976 wd 0.0500 time 0.2422 (0.2444) data time 0.0007 (0.0017) model time 0.2415 (0.2429) loss 2.7762 (3.6511) grad_norm 1.7409 (inf) loss_scale 8192.0000 (12929.4205) mem 7379MB [2024-08-26 06:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][820/1251] eta 0:01:45 lr 0.000976 wd 0.0500 time 0.2484 (0.2448) data time 0.0008 (0.0016) model time 0.2476 (0.2434) loss 4.5035 (3.6485) grad_norm 1.9763 (inf) loss_scale 8192.0000 (12871.7174) mem 7379MB [2024-08-26 06:17:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][830/1251] eta 0:01:43 lr 0.000976 wd 0.0500 time 0.2448 (0.2448) data time 0.0008 
(0.0016) model time 0.2440 (0.2434) loss 4.0117 (3.6496) grad_norm 2.6290 (inf) loss_scale 8192.0000 (12815.4031) mem 7379MB [2024-08-26 06:17:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][840/1251] eta 0:01:40 lr 0.000976 wd 0.0500 time 0.2384 (0.2447) data time 0.0008 (0.0016) model time 0.2376 (0.2433) loss 4.1388 (3.6510) grad_norm 1.7945 (inf) loss_scale 8192.0000 (12760.4281) mem 7379MB [2024-08-26 06:17:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][850/1251] eta 0:01:38 lr 0.000976 wd 0.0500 time 0.2358 (0.2449) data time 0.0011 (0.0016) model time 0.2347 (0.2436) loss 2.1837 (3.6473) grad_norm 1.4926 (inf) loss_scale 8192.0000 (12706.7450) mem 7379MB [2024-08-26 06:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][860/1251] eta 0:01:35 lr 0.000976 wd 0.0500 time 0.2403 (0.2449) data time 0.0010 (0.0016) model time 0.2392 (0.2435) loss 4.2371 (3.6470) grad_norm 1.8098 (inf) loss_scale 8192.0000 (12654.3089) mem 7379MB [2024-08-26 06:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][870/1251] eta 0:01:33 lr 0.000976 wd 0.0500 time 0.2443 (0.2449) data time 0.0010 (0.0016) model time 0.2433 (0.2435) loss 3.6044 (3.6460) grad_norm 1.9492 (inf) loss_scale 8192.0000 (12603.0769) mem 7379MB [2024-08-26 06:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][880/1251] eta 0:01:30 lr 0.000976 wd 0.0500 time 0.2417 (0.2449) data time 0.0008 (0.0016) model time 0.2408 (0.2435) loss 2.5974 (3.6448) grad_norm 1.4479 (inf) loss_scale 8192.0000 (12553.0079) mem 7379MB [2024-08-26 06:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][890/1251] eta 0:01:28 lr 0.000976 wd 0.0500 time 0.2416 (0.2448) data time 0.0010 (0.0016) model time 0.2406 (0.2435) loss 2.6970 (3.6401) grad_norm 1.4501 (inf) loss_scale 8192.0000 (12504.0629) mem 7379MB [2024-08-26 06:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[47/300][900/1251] eta 0:01:25 lr 0.000976 wd 0.0500 time 0.2452 (0.2448) data time 0.0008 (0.0016) model time 0.2443 (0.2435) loss 2.8680 (3.6402) grad_norm 1.8280 (inf) loss_scale 8192.0000 (12456.2042) mem 7379MB [2024-08-26 06:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][910/1251] eta 0:01:23 lr 0.000976 wd 0.0500 time 0.2380 (0.2448) data time 0.0010 (0.0016) model time 0.2370 (0.2435) loss 3.3855 (3.6379) grad_norm 2.1755 (inf) loss_scale 8192.0000 (12409.3963) mem 7379MB [2024-08-26 06:18:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][920/1251] eta 0:01:21 lr 0.000976 wd 0.0500 time 0.2325 (0.2448) data time 0.0010 (0.0016) model time 0.2315 (0.2434) loss 2.9384 (3.6374) grad_norm 2.1012 (inf) loss_scale 8192.0000 (12363.6048) mem 7379MB [2024-08-26 06:18:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][930/1251] eta 0:01:18 lr 0.000976 wd 0.0500 time 0.2391 (0.2447) data time 0.0011 (0.0016) model time 0.2380 (0.2434) loss 3.6559 (3.6334) grad_norm 1.8368 (inf) loss_scale 8192.0000 (12318.7970) mem 7379MB [2024-08-26 06:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][940/1251] eta 0:01:16 lr 0.000976 wd 0.0500 time 0.2439 (0.2447) data time 0.0009 (0.0016) model time 0.2429 (0.2433) loss 3.3333 (3.6312) grad_norm 2.2194 (inf) loss_scale 8192.0000 (12274.9416) mem 7379MB [2024-08-26 06:18:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][950/1251] eta 0:01:13 lr 0.000976 wd 0.0500 time 0.2455 (0.2446) data time 0.0011 (0.0016) model time 0.2444 (0.2433) loss 3.2242 (3.6327) grad_norm 1.3547 (inf) loss_scale 8192.0000 (12232.0084) mem 7379MB [2024-08-26 06:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][960/1251] eta 0:01:11 lr 0.000976 wd 0.0500 time 0.2386 (0.2446) data time 0.0011 (0.0016) model time 0.2375 (0.2432) loss 3.9599 (3.6327) grad_norm 2.2758 (inf) loss_scale 8192.0000 (12189.9688) mem 7379MB 
[2024-08-26 06:18:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][970/1251] eta 0:01:08 lr 0.000976 wd 0.0500 time 0.2336 (0.2449) data time 0.0010 (0.0015) model time 0.2326 (0.2436) loss 3.6914 (3.6326) grad_norm 2.1259 (inf) loss_scale 8192.0000 (12148.7951) mem 7379MB [2024-08-26 06:18:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][980/1251] eta 0:01:06 lr 0.000976 wd 0.0500 time 0.2402 (0.2449) data time 0.0008 (0.0015) model time 0.2394 (0.2436) loss 4.3804 (3.6324) grad_norm 1.8385 (inf) loss_scale 8192.0000 (12108.4608) mem 7379MB [2024-08-26 06:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][990/1251] eta 0:01:03 lr 0.000976 wd 0.0500 time 0.2369 (0.2449) data time 0.0009 (0.0015) model time 0.2359 (0.2435) loss 3.8754 (3.6322) grad_norm 1.5847 (inf) loss_scale 8192.0000 (12068.9405) mem 7379MB [2024-08-26 06:18:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1000/1251] eta 0:01:01 lr 0.000976 wd 0.0500 time 0.2400 (0.2448) data time 0.0008 (0.0015) model time 0.2392 (0.2435) loss 4.2831 (3.6345) grad_norm 1.6235 (inf) loss_scale 8192.0000 (12030.2098) mem 7379MB [2024-08-26 06:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1010/1251] eta 0:00:58 lr 0.000976 wd 0.0500 time 0.2442 (0.2448) data time 0.0011 (0.0015) model time 0.2431 (0.2435) loss 3.6861 (3.6346) grad_norm 1.6938 (inf) loss_scale 8192.0000 (11992.2453) mem 7379MB [2024-08-26 06:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1020/1251] eta 0:00:56 lr 0.000976 wd 0.0500 time 0.2461 (0.2448) data time 0.0010 (0.0015) model time 0.2451 (0.2435) loss 2.9095 (3.6351) grad_norm 1.8556 (inf) loss_scale 8192.0000 (11955.0245) mem 7379MB [2024-08-26 06:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1030/1251] eta 0:00:54 lr 0.000976 wd 0.0500 time 0.2432 (0.2448) data time 0.0007 (0.0015) model time 0.2425 (0.2435) loss 
3.6919 (3.6350) grad_norm 1.9324 (inf) loss_scale 8192.0000 (11918.5257) mem 7379MB [2024-08-26 06:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1040/1251] eta 0:00:51 lr 0.000976 wd 0.0500 time 0.4596 (0.2450) data time 0.0007 (0.0015) model time 0.4589 (0.2437) loss 4.3707 (3.6342) grad_norm 1.8388 (inf) loss_scale 8192.0000 (11882.7281) mem 7379MB [2024-08-26 06:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1050/1251] eta 0:00:49 lr 0.000976 wd 0.0500 time 0.2409 (0.2449) data time 0.0011 (0.0015) model time 0.2398 (0.2436) loss 2.8621 (3.6338) grad_norm 1.9003 (inf) loss_scale 8192.0000 (11847.6118) mem 7379MB [2024-08-26 06:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1060/1251] eta 0:00:46 lr 0.000976 wd 0.0500 time 0.2412 (0.2449) data time 0.0007 (0.0015) model time 0.2405 (0.2436) loss 2.5334 (3.6329) grad_norm 1.2737 (inf) loss_scale 8192.0000 (11813.1574) mem 7379MB [2024-08-26 06:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1070/1251] eta 0:00:44 lr 0.000976 wd 0.0500 time 0.2402 (0.2449) data time 0.0010 (0.0015) model time 0.2392 (0.2436) loss 3.2742 (3.6305) grad_norm 2.3334 (inf) loss_scale 8192.0000 (11779.3464) mem 7379MB [2024-08-26 06:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1080/1251] eta 0:00:41 lr 0.000976 wd 0.0500 time 0.2412 (0.2449) data time 0.0011 (0.0015) model time 0.2401 (0.2436) loss 3.7420 (3.6320) grad_norm 1.4819 (inf) loss_scale 8192.0000 (11746.1610) mem 7379MB [2024-08-26 06:18:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1090/1251] eta 0:00:39 lr 0.000976 wd 0.0500 time 0.2433 (0.2449) data time 0.0008 (0.0015) model time 0.2424 (0.2436) loss 3.7882 (3.6315) grad_norm 2.1498 (inf) loss_scale 8192.0000 (11713.5839) mem 7379MB [2024-08-26 06:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1100/1251] eta 0:00:36 lr 0.000976 wd 
0.0500 time 0.2442 (0.2448) data time 0.0008 (0.0015) model time 0.2434 (0.2435) loss 4.0603 (3.6324) grad_norm 1.9958 (inf) loss_scale 8192.0000 (11681.5985) mem 7379MB [2024-08-26 06:18:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1110/1251] eta 0:00:34 lr 0.000976 wd 0.0500 time 0.2385 (0.2448) data time 0.0008 (0.0015) model time 0.2377 (0.2435) loss 3.0738 (3.6314) grad_norm 2.7986 (inf) loss_scale 8192.0000 (11650.1890) mem 7379MB [2024-08-26 06:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1120/1251] eta 0:00:32 lr 0.000976 wd 0.0500 time 0.2528 (0.2448) data time 0.0009 (0.0015) model time 0.2519 (0.2435) loss 3.9335 (3.6324) grad_norm 1.5892 (inf) loss_scale 8192.0000 (11619.3399) mem 7379MB [2024-08-26 06:18:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1130/1251] eta 0:00:29 lr 0.000976 wd 0.0500 time 0.2383 (0.2448) data time 0.0009 (0.0015) model time 0.2374 (0.2435) loss 4.4836 (3.6368) grad_norm 1.7721 (inf) loss_scale 8192.0000 (11589.0363) mem 7379MB [2024-08-26 06:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1140/1251] eta 0:00:27 lr 0.000976 wd 0.0500 time 0.2451 (0.2448) data time 0.0008 (0.0015) model time 0.2443 (0.2435) loss 3.1591 (3.6386) grad_norm 2.3900 (inf) loss_scale 8192.0000 (11559.2638) mem 7379MB [2024-08-26 06:18:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1150/1251] eta 0:00:24 lr 0.000976 wd 0.0500 time 0.2415 (0.2447) data time 0.0008 (0.0015) model time 0.2407 (0.2435) loss 4.7243 (3.6391) grad_norm 1.3983 (inf) loss_scale 8192.0000 (11530.0087) mem 7379MB [2024-08-26 06:18:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1160/1251] eta 0:00:22 lr 0.000976 wd 0.0500 time 0.2410 (0.2447) data time 0.0010 (0.0015) model time 0.2400 (0.2434) loss 3.7047 (3.6412) grad_norm 1.5311 (inf) loss_scale 8192.0000 (11501.2575) mem 7379MB [2024-08-26 06:19:02 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [47/300][1170/1251] eta 0:00:19 lr 0.000976 wd 0.0500 time 0.2359 (0.2447) data time 0.0010 (0.0015) model time 0.2349 (0.2434) loss 2.8988 (3.6404) grad_norm 1.7816 (inf) loss_scale 8192.0000 (11472.9974) mem 7379MB [2024-08-26 06:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1180/1251] eta 0:00:17 lr 0.000976 wd 0.0500 time 0.2427 (0.2446) data time 0.0010 (0.0015) model time 0.2417 (0.2433) loss 3.3482 (3.6377) grad_norm 1.9617 (inf) loss_scale 8192.0000 (11445.2159) mem 7379MB [2024-08-26 06:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1190/1251] eta 0:00:14 lr 0.000976 wd 0.0500 time 0.2475 (0.2446) data time 0.0010 (0.0014) model time 0.2465 (0.2433) loss 3.9793 (3.6394) grad_norm 1.6375 (inf) loss_scale 8192.0000 (11417.9009) mem 7379MB [2024-08-26 06:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1200/1251] eta 0:00:12 lr 0.000976 wd 0.0500 time 0.2361 (0.2446) data time 0.0010 (0.0014) model time 0.2351 (0.2433) loss 2.9312 (3.6362) grad_norm 1.8150 (inf) loss_scale 8192.0000 (11391.0408) mem 7379MB [2024-08-26 06:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1210/1251] eta 0:00:10 lr 0.000976 wd 0.0500 time 0.2349 (0.2445) data time 0.0011 (0.0014) model time 0.2339 (0.2433) loss 3.9277 (3.6367) grad_norm 1.5793 (inf) loss_scale 8192.0000 (11364.6243) mem 7379MB [2024-08-26 06:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1220/1251] eta 0:00:07 lr 0.000976 wd 0.0500 time 0.2392 (0.2445) data time 0.0010 (0.0014) model time 0.2382 (0.2433) loss 3.9773 (3.6356) grad_norm 2.0014 (inf) loss_scale 8192.0000 (11338.6405) mem 7379MB [2024-08-26 06:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1230/1251] eta 0:00:05 lr 0.000976 wd 0.0500 time 0.2403 (0.2445) data time 0.0010 (0.0014) model time 0.2393 (0.2432) loss 3.4317 (3.6354) grad_norm 2.1572 (inf) 
loss_scale 8192.0000 (11313.0788) mem 7379MB [2024-08-26 06:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1240/1251] eta 0:00:02 lr 0.000976 wd 0.0500 time 0.2241 (0.2444) data time 0.0005 (0.0014) model time 0.2236 (0.2432) loss 2.9316 (3.6333) grad_norm 2.1204 (inf) loss_scale 8192.0000 (11287.9291) mem 7379MB [2024-08-26 06:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [47/300][1250/1251] eta 0:00:00 lr 0.000976 wd 0.0500 time 0.2289 (0.2443) data time 0.0007 (0.0014) model time 0.2282 (0.2430) loss 3.7400 (3.6311) grad_norm 2.3583 (inf) loss_scale 8192.0000 (11263.1815) mem 7379MB [2024-08-26 06:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 47 training takes 0:05:05 [2024-08-26 06:19:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 06:19:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 06:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.458 (0.458) Loss 0.6074 (0.6074) Acc@1 89.746 (89.746) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 06:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.087 (0.114) Loss 0.9419 (0.8994) Acc@1 80.078 (80.247) Acc@5 94.922 (95.295) Mem 7379MB
[2024-08-26 06:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.098) Loss 1.2344 (0.9162) Acc@1 72.559 (79.618) Acc@5 91.211 (95.326) Mem 7379MB
[2024-08-26 06:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.092) Loss 1.5342 (1.0531) Acc@1 63.867 (76.581) Acc@5 86.621 (93.495) Mem 7379MB
[2024-08-26 06:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.5723 (1.1320) Acc@1 64.258 (74.809) Acc@5 86.914 (92.519) Mem 7379MB
[2024-08-26 06:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.350 Acc@5 92.410
[2024-08-26 06:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.4%
[2024-08-26 06:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 74.35%
[2024-08-26 06:19:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 06:19:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 06:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.392 (0.392) Loss 0.4714 (0.4714) Acc@1 90.137 (90.137) Acc@5 98.047 (98.047) Mem 7379MB
[2024-08-26 06:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.111) Loss 0.7910 (0.7691) Acc@1 82.227 (82.253) Acc@5 95.703 (96.138) Mem 7379MB
[2024-08-26 06:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.095) Loss 1.1240 (0.7855) Acc@1 73.340 (81.394) Acc@5 91.602 (96.094) Mem 7379MB
[2024-08-26 06:19:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.090) Loss 1.3945 (0.9052) Acc@1 64.258 (78.547) Acc@5 87.891 (94.563) Mem 7379MB
[2024-08-26 06:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.084) Loss 1.3232 (0.9737) Acc@1 66.992 (76.908) Acc@5 88.965 (93.731) Mem 7379MB
[2024-08-26 06:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.590 Acc@5 93.666
[2024-08-26 06:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 76.6%
[2024-08-26 06:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 76.59%
[2024-08-26 06:19:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 06:19:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 06:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][0/1251] eta 0:12:53 lr 0.000976 wd 0.0500 time 0.6182 (0.6182) data time 0.3876 (0.3876) model time 0.0000 (0.0000) loss 3.9687 (3.9687) grad_norm 2.3082 (2.3082) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][10/1251] eta 0:05:41 lr 0.000976 wd 0.0500 time 0.2443 (0.2751) data time 0.0010 (0.0362) model time 0.0000 (0.0000) loss 3.6817 (3.7369) grad_norm 1.6791 (2.1912) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][20/1251] eta 0:05:18 lr 0.000976 wd 0.0500 time 0.2455 (0.2587) data time 0.0010 (0.0194) model time 0.0000 (0.0000) loss 3.6689 (3.4978) grad_norm 2.2464 (2.0139) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][30/1251] eta 0:05:08 lr 0.000976 wd 0.0500 time 0.2352 (0.2530) data time 0.0008 (0.0135) model time 0.0000 (0.0000) loss 2.7465 (3.5944) grad_norm 3.6406 (2.0154) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][40/1251] eta 0:05:03 lr 0.000976 wd 0.0500 time 0.2439 (0.2503) data time 0.0009 (0.0104) model time 0.0000 (0.0000) loss 3.4578 (3.6370) grad_norm 2.0406 (2.1636) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][50/1251] eta 0:04:58 lr 0.000976 wd 0.0500 time 0.2362 (0.2487) data time 0.0010 (0.0086) model time 0.0000 (0.0000) loss 3.7602 (3.5716) grad_norm 1.1756 (2.1914) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][60/1251] eta 0:04:55 lr 0.000976 wd 0.0500 time 0.2464 (0.2478) data time 0.0009 (0.0074) model time 0.2456 (0.2424) loss 
3.4809 (3.5419) grad_norm 3.4298 (2.1590) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][70/1251] eta 0:04:51 lr 0.000976 wd 0.0500 time 0.2443 (0.2470) data time 0.0010 (0.0065) model time 0.2433 (0.2416) loss 2.5400 (3.5439) grad_norm 1.4582 (2.1374) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][80/1251] eta 0:04:48 lr 0.000976 wd 0.0500 time 0.2468 (0.2466) data time 0.0009 (0.0058) model time 0.2458 (0.2420) loss 4.6576 (3.6202) grad_norm 2.5505 (2.0838) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][90/1251] eta 0:04:45 lr 0.000976 wd 0.0500 time 0.2357 (0.2462) data time 0.0008 (0.0053) model time 0.2349 (0.2420) loss 4.5154 (3.5907) grad_norm 1.3395 (2.0740) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][100/1251] eta 0:04:43 lr 0.000976 wd 0.0500 time 0.2438 (0.2459) data time 0.0011 (0.0048) model time 0.2427 (0.2420) loss 3.4197 (3.5964) grad_norm 2.4295 (2.0812) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][110/1251] eta 0:04:40 lr 0.000976 wd 0.0500 time 0.2419 (0.2455) data time 0.0008 (0.0045) model time 0.2411 (0.2419) loss 3.5620 (3.6091) grad_norm 1.8079 (2.0594) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][120/1251] eta 0:04:39 lr 0.000976 wd 0.0500 time 0.2396 (0.2472) data time 0.0009 (0.0042) model time 0.2387 (0.2450) loss 3.9953 (3.6221) grad_norm 1.8476 (2.0372) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][130/1251] eta 0:04:36 lr 
0.000976 wd 0.0500 time 0.2473 (0.2468) data time 0.0007 (0.0040) model time 0.2466 (0.2446) loss 2.4597 (3.6135) grad_norm 1.5706 (2.0275) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][140/1251] eta 0:04:33 lr 0.000976 wd 0.0500 time 0.2427 (0.2464) data time 0.0007 (0.0038) model time 0.2420 (0.2441) loss 4.0555 (3.6049) grad_norm 1.3862 (2.0057) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][150/1251] eta 0:04:31 lr 0.000976 wd 0.0500 time 0.2382 (0.2463) data time 0.0010 (0.0036) model time 0.2372 (0.2440) loss 4.2263 (3.5978) grad_norm 1.8624 (2.0019) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][160/1251] eta 0:04:28 lr 0.000976 wd 0.0500 time 0.2382 (0.2460) data time 0.0010 (0.0035) model time 0.2372 (0.2437) loss 3.7985 (3.5962) grad_norm 1.6490 (1.9884) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][170/1251] eta 0:04:28 lr 0.000976 wd 0.0500 time 0.4459 (0.2483) data time 0.0009 (0.0033) model time 0.4450 (0.2471) loss 3.2888 (3.5812) grad_norm 1.3725 (1.9737) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][180/1251] eta 0:04:26 lr 0.000976 wd 0.0500 time 0.2406 (0.2490) data time 0.0009 (0.0032) model time 0.2397 (0.2480) loss 3.6259 (3.5764) grad_norm 1.8730 (1.9801) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][190/1251] eta 0:04:23 lr 0.000976 wd 0.0500 time 0.2438 (0.2486) data time 0.0010 (0.0031) model time 0.2428 (0.2475) loss 4.2128 (3.5860) grad_norm 2.0908 (1.9659) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][200/1251] eta 0:04:21 lr 0.000975 wd 0.0500 time 0.2369 (0.2484) data time 0.0010 (0.0031) model time 0.2359 (0.2471) loss 4.3034 (3.5899) grad_norm 2.6877 (1.9627) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][210/1251] eta 0:04:18 lr 0.000975 wd 0.0500 time 0.2401 (0.2481) data time 0.0008 (0.0030) model time 0.2394 (0.2467) loss 3.6605 (3.5742) grad_norm 2.7120 (1.9681) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][220/1251] eta 0:04:15 lr 0.000975 wd 0.0500 time 0.2431 (0.2478) data time 0.0012 (0.0029) model time 0.2419 (0.2464) loss 3.6769 (3.5831) grad_norm 2.0028 (1.9713) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][230/1251] eta 0:04:12 lr 0.000975 wd 0.0500 time 0.2490 (0.2478) data time 0.0007 (0.0028) model time 0.2483 (0.2463) loss 4.5248 (3.5883) grad_norm 1.8569 (1.9702) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][240/1251] eta 0:04:11 lr 0.000975 wd 0.0500 time 0.4079 (0.2488) data time 0.0008 (0.0027) model time 0.4071 (0.2476) loss 4.4152 (3.6008) grad_norm 2.3620 (1.9805) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][250/1251] eta 0:04:08 lr 0.000975 wd 0.0500 time 0.2476 (0.2485) data time 0.0007 (0.0027) model time 0.2468 (0.2473) loss 3.2696 (3.5867) grad_norm 2.4093 (1.9805) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][260/1251] eta 0:04:06 lr 0.000975 wd 0.0500 time 0.2380 (0.2483) data time 0.0009 (0.0026) model time 0.2371 (0.2471) loss 3.8479 
(3.5743) grad_norm 1.1803 (1.9850) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][270/1251] eta 0:04:03 lr 0.000975 wd 0.0500 time 0.2330 (0.2481) data time 0.0012 (0.0026) model time 0.2319 (0.2468) loss 2.8440 (3.5731) grad_norm 1.6635 (1.9849) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][280/1251] eta 0:04:00 lr 0.000975 wd 0.0500 time 0.2464 (0.2478) data time 0.0010 (0.0025) model time 0.2454 (0.2465) loss 3.8475 (3.5695) grad_norm 2.4007 (1.9789) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][290/1251] eta 0:03:58 lr 0.000975 wd 0.0500 time 0.2402 (0.2477) data time 0.0009 (0.0025) model time 0.2393 (0.2463) loss 2.9343 (3.5681) grad_norm 2.0152 (1.9778) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][300/1251] eta 0:03:55 lr 0.000975 wd 0.0500 time 0.2418 (0.2476) data time 0.0010 (0.0024) model time 0.2408 (0.2462) loss 4.2139 (3.5662) grad_norm 2.7580 (1.9784) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][310/1251] eta 0:03:52 lr 0.000975 wd 0.0500 time 0.2385 (0.2474) data time 0.0007 (0.0024) model time 0.2378 (0.2460) loss 3.7917 (3.5719) grad_norm 1.9173 (1.9698) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][320/1251] eta 0:03:50 lr 0.000975 wd 0.0500 time 0.2338 (0.2472) data time 0.0011 (0.0023) model time 0.2326 (0.2458) loss 2.5551 (3.5705) grad_norm 2.1582 (1.9602) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][330/1251] eta 0:03:47 lr 0.000975 wd 
0.0500 time 0.2438 (0.2471) data time 0.0011 (0.0023) model time 0.2427 (0.2456) loss 3.8144 (3.5831) grad_norm 2.1773 (1.9688) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][340/1251] eta 0:03:45 lr 0.000975 wd 0.0500 time 0.2440 (0.2470) data time 0.0010 (0.0023) model time 0.2431 (0.2455) loss 3.7981 (3.5888) grad_norm 2.4151 (1.9735) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][350/1251] eta 0:03:42 lr 0.000975 wd 0.0500 time 0.2419 (0.2468) data time 0.0007 (0.0023) model time 0.2412 (0.2453) loss 3.4528 (3.5869) grad_norm 2.1153 (1.9722) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][360/1251] eta 0:03:39 lr 0.000975 wd 0.0500 time 0.2329 (0.2467) data time 0.0011 (0.0022) model time 0.2318 (0.2452) loss 3.4549 (3.5788) grad_norm 2.2862 (1.9713) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][370/1251] eta 0:03:37 lr 0.000975 wd 0.0500 time 0.2367 (0.2465) data time 0.0011 (0.0022) model time 0.2356 (0.2450) loss 2.7936 (3.5800) grad_norm 1.3883 (1.9662) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][380/1251] eta 0:03:34 lr 0.000975 wd 0.0500 time 0.2422 (0.2464) data time 0.0008 (0.0022) model time 0.2414 (0.2448) loss 4.2412 (3.5794) grad_norm 1.9141 (1.9653) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][390/1251] eta 0:03:32 lr 0.000975 wd 0.0500 time 0.2461 (0.2463) data time 0.0009 (0.0021) model time 0.2452 (0.2447) loss 3.3549 (3.5749) grad_norm 1.7297 (1.9701) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][400/1251] eta 0:03:29 lr 0.000975 wd 0.0500 time 0.2459 (0.2462) data time 0.0011 (0.0021) model time 0.2448 (0.2447) loss 3.0985 (3.5739) grad_norm 1.7735 (1.9657) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][410/1251] eta 0:03:27 lr 0.000975 wd 0.0500 time 0.2400 (0.2462) data time 0.0010 (0.0021) model time 0.2390 (0.2447) loss 3.8418 (3.5726) grad_norm 1.6296 (1.9595) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][420/1251] eta 0:03:24 lr 0.000975 wd 0.0500 time 0.2408 (0.2461) data time 0.0011 (0.0021) model time 0.2397 (0.2446) loss 3.8301 (3.5740) grad_norm 2.2979 (1.9661) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][430/1251] eta 0:03:21 lr 0.000975 wd 0.0500 time 0.2389 (0.2460) data time 0.0011 (0.0020) model time 0.2378 (0.2445) loss 3.3128 (3.5723) grad_norm 1.8368 (1.9671) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][440/1251] eta 0:03:19 lr 0.000975 wd 0.0500 time 0.2427 (0.2459) data time 0.0008 (0.0020) model time 0.2419 (0.2444) loss 2.9669 (3.5796) grad_norm 2.9778 (1.9678) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][450/1251] eta 0:03:16 lr 0.000975 wd 0.0500 time 0.2382 (0.2458) data time 0.0009 (0.0020) model time 0.2372 (0.2443) loss 2.7249 (3.5783) grad_norm 1.8711 (1.9640) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][460/1251] eta 0:03:14 lr 0.000975 wd 0.0500 time 0.2427 (0.2458) data time 0.0009 (0.0020) model time 0.2418 (0.2443) loss 3.8890 
(3.5770) grad_norm 2.0745 (1.9603) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][470/1251] eta 0:03:11 lr 0.000975 wd 0.0500 time 0.2502 (0.2458) data time 0.0012 (0.0020) model time 0.2490 (0.2443) loss 3.8049 (3.5807) grad_norm 1.7594 (1.9579) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][480/1251] eta 0:03:09 lr 0.000975 wd 0.0500 time 0.2483 (0.2457) data time 0.0013 (0.0019) model time 0.2471 (0.2442) loss 3.1998 (3.5781) grad_norm 1.4777 (1.9611) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][490/1251] eta 0:03:06 lr 0.000975 wd 0.0500 time 0.2462 (0.2457) data time 0.0008 (0.0019) model time 0.2454 (0.2442) loss 2.9948 (3.5775) grad_norm 2.6205 (1.9579) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][500/1251] eta 0:03:04 lr 0.000975 wd 0.0500 time 0.2419 (0.2456) data time 0.0012 (0.0019) model time 0.2407 (0.2441) loss 3.9329 (3.5799) grad_norm 1.7517 (1.9556) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][510/1251] eta 0:03:01 lr 0.000975 wd 0.0500 time 0.2475 (0.2455) data time 0.0010 (0.0019) model time 0.2465 (0.2440) loss 3.1962 (3.5757) grad_norm 2.9473 (1.9572) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][520/1251] eta 0:02:59 lr 0.000975 wd 0.0500 time 0.2380 (0.2459) data time 0.0009 (0.0019) model time 0.2371 (0.2444) loss 3.5293 (3.5802) grad_norm 2.0616 (1.9639) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][530/1251] eta 0:02:57 lr 0.000975 wd 
0.0500 time 0.2427 (0.2458) data time 0.0007 (0.0019) model time 0.2420 (0.2443) loss 4.2112 (3.5823) grad_norm 2.0507 (1.9601) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][540/1251] eta 0:02:54 lr 0.000975 wd 0.0500 time 0.2526 (0.2458) data time 0.0010 (0.0018) model time 0.2517 (0.2443) loss 3.8814 (3.5838) grad_norm 1.4231 (1.9547) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][550/1251] eta 0:02:52 lr 0.000975 wd 0.0500 time 0.2433 (0.2457) data time 0.0009 (0.0018) model time 0.2424 (0.2442) loss 4.6053 (3.5871) grad_norm 3.0415 (1.9607) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][560/1251] eta 0:02:49 lr 0.000975 wd 0.0500 time 0.2422 (0.2456) data time 0.0010 (0.0018) model time 0.2412 (0.2441) loss 3.4247 (3.5890) grad_norm 1.4112 (1.9572) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][570/1251] eta 0:02:47 lr 0.000975 wd 0.0500 time 0.2351 (0.2459) data time 0.0011 (0.0018) model time 0.2340 (0.2444) loss 2.8333 (3.5925) grad_norm 1.8352 (1.9556) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][580/1251] eta 0:02:44 lr 0.000975 wd 0.0500 time 0.2404 (0.2458) data time 0.0007 (0.0018) model time 0.2396 (0.2444) loss 4.3652 (3.5925) grad_norm 1.8793 (1.9541) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][590/1251] eta 0:02:42 lr 0.000975 wd 0.0500 time 0.2522 (0.2457) data time 0.0008 (0.0018) model time 0.2514 (0.2443) loss 4.5018 (3.5925) grad_norm 2.4217 (1.9524) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:21:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][600/1251] eta 0:02:39 lr 0.000975 wd 0.0500 time 0.2406 (0.2456) data time 0.0010 (0.0018) model time 0.2396 (0.2442) loss 2.8359 (3.5890) grad_norm 1.4564 (1.9525) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][610/1251] eta 0:02:37 lr 0.000975 wd 0.0500 time 0.2458 (0.2456) data time 0.0007 (0.0017) model time 0.2451 (0.2442) loss 3.7945 (3.5918) grad_norm 3.5697 (1.9573) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][620/1251] eta 0:02:34 lr 0.000975 wd 0.0500 time 0.2403 (0.2456) data time 0.0009 (0.0017) model time 0.2394 (0.2442) loss 4.1191 (3.5877) grad_norm 1.4693 (1.9578) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][630/1251] eta 0:02:32 lr 0.000975 wd 0.0500 time 0.2398 (0.2456) data time 0.0008 (0.0017) model time 0.2390 (0.2442) loss 4.3169 (3.5885) grad_norm 1.5195 (1.9541) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][640/1251] eta 0:02:30 lr 0.000975 wd 0.0500 time 0.2359 (0.2460) data time 0.0008 (0.0017) model time 0.2351 (0.2446) loss 3.8383 (3.5873) grad_norm 1.5356 (1.9510) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][650/1251] eta 0:02:27 lr 0.000975 wd 0.0500 time 0.2441 (0.2459) data time 0.0009 (0.0017) model time 0.2432 (0.2446) loss 3.4525 (3.5887) grad_norm 2.4205 (1.9548) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][660/1251] eta 0:02:25 lr 0.000975 wd 0.0500 time 0.2486 (0.2459) data time 0.0010 (0.0017) model time 0.2476 (0.2445) loss 3.0893 
(3.5882) grad_norm 1.9739 (1.9566) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][670/1251] eta 0:02:22 lr 0.000975 wd 0.0500 time 0.2473 (0.2458) data time 0.0010 (0.0017) model time 0.2463 (0.2445) loss 3.7495 (3.5862) grad_norm 1.3545 (1.9546) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][680/1251] eta 0:02:20 lr 0.000975 wd 0.0500 time 0.2441 (0.2458) data time 0.0007 (0.0017) model time 0.2434 (0.2444) loss 2.9863 (3.5911) grad_norm 1.3199 (1.9519) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][690/1251] eta 0:02:17 lr 0.000975 wd 0.0500 time 0.2539 (0.2457) data time 0.0007 (0.0017) model time 0.2532 (0.2444) loss 2.5760 (3.5914) grad_norm 1.4501 (1.9528) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][700/1251] eta 0:02:15 lr 0.000975 wd 0.0500 time 0.2413 (0.2457) data time 0.0009 (0.0017) model time 0.2404 (0.2443) loss 3.7986 (3.5915) grad_norm 1.6109 (1.9511) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][710/1251] eta 0:02:13 lr 0.000975 wd 0.0500 time 0.2390 (0.2462) data time 0.0008 (0.0017) model time 0.2382 (0.2449) loss 4.4969 (3.5913) grad_norm 1.1642 (1.9495) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][720/1251] eta 0:02:10 lr 0.000975 wd 0.0500 time 0.2377 (0.2464) data time 0.0011 (0.0017) model time 0.2366 (0.2451) loss 2.8176 (3.5924) grad_norm 1.9215 (1.9474) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][730/1251] eta 0:02:08 lr 0.000975 wd 
0.0500 time 0.2426 (0.2464) data time 0.0010 (0.0016) model time 0.2417 (0.2451) loss 4.3067 (3.5954) grad_norm 2.3079 (1.9500) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][740/1251] eta 0:02:05 lr 0.000975 wd 0.0500 time 0.2437 (0.2463) data time 0.0007 (0.0016) model time 0.2430 (0.2450) loss 3.6126 (3.5944) grad_norm 3.7988 (1.9621) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][750/1251] eta 0:02:03 lr 0.000975 wd 0.0500 time 0.2388 (0.2462) data time 0.0013 (0.0016) model time 0.2375 (0.2449) loss 3.5169 (3.5924) grad_norm 1.8782 (1.9605) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][760/1251] eta 0:02:01 lr 0.000975 wd 0.0500 time 0.2382 (0.2464) data time 0.0010 (0.0016) model time 0.2372 (0.2452) loss 3.7832 (3.5901) grad_norm 3.1887 (1.9600) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][770/1251] eta 0:01:58 lr 0.000975 wd 0.0500 time 0.2423 (0.2467) data time 0.0010 (0.0016) model time 0.2413 (0.2454) loss 4.0810 (3.5925) grad_norm 2.0577 (1.9614) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][780/1251] eta 0:01:56 lr 0.000975 wd 0.0500 time 0.2387 (0.2466) data time 0.0008 (0.0016) model time 0.2379 (0.2453) loss 3.2239 (3.5940) grad_norm 2.6054 (1.9580) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][790/1251] eta 0:01:53 lr 0.000975 wd 0.0500 time 0.2469 (0.2465) data time 0.0009 (0.0016) model time 0.2461 (0.2453) loss 2.8950 (3.5964) grad_norm 2.9991 (1.9639) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:49 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][800/1251] eta 0:01:51 lr 0.000975 wd 0.0500 time 0.2341 (0.2465) data time 0.0009 (0.0016) model time 0.2332 (0.2452) loss 4.5609 (3.5984) grad_norm 2.3770 (1.9666) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][810/1251] eta 0:01:48 lr 0.000975 wd 0.0500 time 0.2364 (0.2464) data time 0.0010 (0.0016) model time 0.2355 (0.2451) loss 2.7877 (3.5981) grad_norm 1.7732 (1.9674) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][820/1251] eta 0:01:46 lr 0.000975 wd 0.0500 time 0.2398 (0.2463) data time 0.0009 (0.0016) model time 0.2389 (0.2450) loss 4.1569 (3.5956) grad_norm 1.8811 (1.9661) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][830/1251] eta 0:01:43 lr 0.000975 wd 0.0500 time 0.2412 (0.2462) data time 0.0011 (0.0016) model time 0.2401 (0.2450) loss 3.4979 (3.5942) grad_norm 2.3846 (1.9655) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][840/1251] eta 0:01:41 lr 0.000975 wd 0.0500 time 0.2371 (0.2462) data time 0.0010 (0.0016) model time 0.2361 (0.2449) loss 2.3640 (3.5911) grad_norm 1.5534 (1.9633) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][850/1251] eta 0:01:38 lr 0.000975 wd 0.0500 time 0.2424 (0.2461) data time 0.0010 (0.0016) model time 0.2414 (0.2449) loss 4.1931 (3.5936) grad_norm 2.1100 (1.9614) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][860/1251] eta 0:01:36 lr 0.000975 wd 0.0500 time 0.2395 (0.2461) data time 0.0007 (0.0016) model time 0.2388 (0.2448) loss 3.1725 
(3.5929) grad_norm 2.1737 (1.9601) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][870/1251] eta 0:01:33 lr 0.000975 wd 0.0500 time 0.2406 (0.2460) data time 0.0007 (0.0015) model time 0.2398 (0.2447) loss 3.3668 (3.5944) grad_norm 2.3432 (1.9601) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][880/1251] eta 0:01:31 lr 0.000975 wd 0.0500 time 0.2415 (0.2460) data time 0.0010 (0.0015) model time 0.2405 (0.2447) loss 4.8392 (3.5975) grad_norm 2.4656 (1.9617) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][890/1251] eta 0:01:28 lr 0.000975 wd 0.0500 time 0.2443 (0.2459) data time 0.0008 (0.0015) model time 0.2436 (0.2447) loss 3.9632 (3.5979) grad_norm 1.4998 (1.9573) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][900/1251] eta 0:01:26 lr 0.000975 wd 0.0500 time 0.2398 (0.2459) data time 0.0012 (0.0015) model time 0.2386 (0.2446) loss 3.6937 (3.5950) grad_norm 1.4115 (1.9561) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][910/1251] eta 0:01:23 lr 0.000975 wd 0.0500 time 0.2378 (0.2458) data time 0.0010 (0.0015) model time 0.2368 (0.2445) loss 3.5812 (3.5932) grad_norm 1.9789 (1.9538) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][920/1251] eta 0:01:21 lr 0.000974 wd 0.0500 time 0.2295 (0.2457) data time 0.0011 (0.0015) model time 0.2284 (0.2445) loss 3.7404 (3.5943) grad_norm 1.8877 (1.9535) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][930/1251] eta 0:01:18 lr 0.000974 wd 
0.0500 time 0.2337 (0.2457) data time 0.0011 (0.0015) model time 0.2326 (0.2444) loss 3.8382 (3.5932) grad_norm 1.6170 (1.9525) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][940/1251] eta 0:01:16 lr 0.000974 wd 0.0500 time 0.2371 (0.2457) data time 0.0011 (0.0015) model time 0.2360 (0.2444) loss 4.1147 (3.5976) grad_norm 1.8192 (1.9539) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][950/1251] eta 0:01:13 lr 0.000974 wd 0.0500 time 0.2417 (0.2456) data time 0.0013 (0.0015) model time 0.2405 (0.2443) loss 3.8957 (3.5997) grad_norm 1.6629 (1.9535) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][960/1251] eta 0:01:11 lr 0.000974 wd 0.0500 time 0.2329 (0.2456) data time 0.0010 (0.0015) model time 0.2319 (0.2443) loss 3.0176 (3.5971) grad_norm 2.1504 (1.9566) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][970/1251] eta 0:01:08 lr 0.000974 wd 0.0500 time 0.2438 (0.2455) data time 0.0007 (0.0015) model time 0.2431 (0.2443) loss 3.3330 (3.5966) grad_norm 2.2456 (1.9556) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][980/1251] eta 0:01:06 lr 0.000974 wd 0.0500 time 0.2387 (0.2455) data time 0.0011 (0.0015) model time 0.2377 (0.2442) loss 2.8664 (3.5939) grad_norm 1.8242 (1.9566) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][990/1251] eta 0:01:04 lr 0.000974 wd 0.0500 time 0.2386 (0.2455) data time 0.0007 (0.0015) model time 0.2379 (0.2442) loss 3.9309 (3.5953) grad_norm 1.8567 (1.9549) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1000/1251] eta 0:01:01 lr 0.000974 wd 0.0500 time 0.2349 (0.2455) data time 0.0011 (0.0015) model time 0.2338 (0.2442) loss 3.5498 (3.5946) grad_norm 1.6662 (1.9527) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1010/1251] eta 0:00:59 lr 0.000974 wd 0.0500 time 0.2404 (0.2455) data time 0.0009 (0.0015) model time 0.2395 (0.2442) loss 2.9644 (3.5924) grad_norm 2.3609 (1.9527) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1020/1251] eta 0:00:56 lr 0.000974 wd 0.0500 time 0.2399 (0.2455) data time 0.0011 (0.0015) model time 0.2388 (0.2442) loss 3.9518 (3.5911) grad_norm 2.3046 (1.9553) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1030/1251] eta 0:00:54 lr 0.000974 wd 0.0500 time 0.2391 (0.2454) data time 0.0010 (0.0015) model time 0.2382 (0.2442) loss 3.6811 (3.5920) grad_norm 1.6083 (1.9536) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1040/1251] eta 0:00:51 lr 0.000974 wd 0.0500 time 0.2349 (0.2455) data time 0.0011 (0.0015) model time 0.2338 (0.2443) loss 4.1004 (3.5971) grad_norm 1.4491 (1.9512) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1050/1251] eta 0:00:49 lr 0.000974 wd 0.0500 time 0.2310 (0.2455) data time 0.0009 (0.0015) model time 0.2301 (0.2442) loss 3.6767 (3.5974) grad_norm 2.1537 (1.9523) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1060/1251] eta 0:00:46 lr 0.000974 wd 0.0500 time 0.2408 (0.2454) data time 0.0007 (0.0015) model time 0.2401 (0.2441) loss 2.6746 
(3.5959) grad_norm 2.5548 (1.9514) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1070/1251] eta 0:00:44 lr 0.000974 wd 0.0500 time 0.2389 (0.2454) data time 0.0009 (0.0015) model time 0.2380 (0.2441) loss 3.4128 (3.5976) grad_norm 2.3877 (1.9538) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1080/1251] eta 0:00:41 lr 0.000974 wd 0.0500 time 0.2434 (0.2453) data time 0.0007 (0.0015) model time 0.2426 (0.2440) loss 2.9499 (3.5943) grad_norm 2.5167 (1.9550) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1090/1251] eta 0:00:39 lr 0.000974 wd 0.0500 time 0.2407 (0.2453) data time 0.0010 (0.0015) model time 0.2397 (0.2440) loss 3.7442 (3.5955) grad_norm 1.5524 (1.9557) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1100/1251] eta 0:00:37 lr 0.000974 wd 0.0500 time 0.2480 (0.2453) data time 0.0008 (0.0014) model time 0.2472 (0.2440) loss 3.5384 (3.5942) grad_norm 2.7684 (1.9575) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1110/1251] eta 0:00:34 lr 0.000974 wd 0.0500 time 0.2350 (0.2454) data time 0.0011 (0.0014) model time 0.2339 (0.2442) loss 4.1506 (3.5962) grad_norm 2.0566 (1.9565) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1120/1251] eta 0:00:32 lr 0.000974 wd 0.0500 time 0.2406 (0.2454) data time 0.0007 (0.0014) model time 0.2399 (0.2441) loss 4.3851 (3.5955) grad_norm 1.8213 (1.9558) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1130/1251] eta 0:00:29 lr 
0.000974 wd 0.0500 time 0.2394 (0.2454) data time 0.0009 (0.0014) model time 0.2385 (0.2441) loss 2.7683 (3.5954) grad_norm 1.5195 (1.9544) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1140/1251] eta 0:00:27 lr 0.000974 wd 0.0500 time 0.2433 (0.2453) data time 0.0010 (0.0014) model time 0.2423 (0.2441) loss 3.8317 (3.5967) grad_norm 1.5092 (1.9544) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1150/1251] eta 0:00:24 lr 0.000974 wd 0.0500 time 0.2433 (0.2453) data time 0.0010 (0.0014) model time 0.2422 (0.2441) loss 4.0566 (3.5982) grad_norm 1.5828 (1.9540) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1160/1251] eta 0:00:22 lr 0.000974 wd 0.0500 time 0.2379 (0.2453) data time 0.0008 (0.0014) model time 0.2371 (0.2440) loss 3.8012 (3.5974) grad_norm 3.3650 (1.9578) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1170/1251] eta 0:00:19 lr 0.000974 wd 0.0500 time 0.2533 (0.2456) data time 0.0010 (0.0014) model time 0.2523 (0.2444) loss 3.8892 (3.5999) grad_norm 1.4510 (1.9597) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1180/1251] eta 0:00:17 lr 0.000974 wd 0.0500 time 0.2383 (0.2456) data time 0.0010 (0.0014) model time 0.2373 (0.2443) loss 3.8311 (3.6001) grad_norm 2.6072 (1.9588) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1190/1251] eta 0:00:14 lr 0.000974 wd 0.0500 time 0.2409 (0.2455) data time 0.0009 (0.0014) model time 0.2399 (0.2443) loss 3.8486 (3.6008) grad_norm 1.4268 (1.9571) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
06:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1200/1251] eta 0:00:12 lr 0.000974 wd 0.0500 time 0.2446 (0.2455) data time 0.0011 (0.0014) model time 0.2435 (0.2443) loss 3.3252 (3.6002) grad_norm 1.4567 (1.9544) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1210/1251] eta 0:00:10 lr 0.000974 wd 0.0500 time 0.2445 (0.2455) data time 0.0010 (0.0014) model time 0.2436 (0.2442) loss 3.4252 (3.6012) grad_norm 2.1754 (1.9514) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:24:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1220/1251] eta 0:00:07 lr 0.000974 wd 0.0500 time 0.2419 (0.2454) data time 0.0010 (0.0014) model time 0.2409 (0.2442) loss 4.1204 (3.6025) grad_norm 1.4452 (1.9508) loss_scale 16384.0000 (8212.1278) mem 7379MB [2024-08-26 06:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1230/1251] eta 0:00:05 lr 0.000974 wd 0.0500 time 0.2429 (0.2456) data time 0.0009 (0.0014) model time 0.2420 (0.2443) loss 3.6161 (3.6016) grad_norm 1.4232 (1.9493) loss_scale 16384.0000 (8278.5118) mem 7379MB [2024-08-26 06:24:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1240/1251] eta 0:00:02 lr 0.000974 wd 0.0500 time 0.2227 (0.2456) data time 0.0005 (0.0014) model time 0.2222 (0.2443) loss 3.0633 (3.6024) grad_norm 1.4281 (1.9479) loss_scale 16384.0000 (8343.8259) mem 7379MB [2024-08-26 06:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [48/300][1250/1251] eta 0:00:00 lr 0.000974 wd 0.0500 time 0.2276 (0.2454) data time 0.0005 (0.0014) model time 0.2271 (0.2442) loss 3.0105 (3.6048) grad_norm 1.7681 (1.9472) loss_scale 16384.0000 (8408.0959) mem 7379MB [2024-08-26 06:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 48 training takes 0:05:07 [2024-08-26 06:24:38 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 06:24:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 06:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.491 (0.491) Loss 0.5483 (0.5483) Acc@1 89.551 (89.551) Acc@5 97.363 (97.363) Mem 7379MB [2024-08-26 06:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.115) Loss 0.9033 (0.8836) Acc@1 79.492 (80.229) Acc@5 94.531 (95.472) Mem 7379MB [2024-08-26 06:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.098) Loss 1.2842 (0.8995) Acc@1 69.336 (79.515) Acc@5 90.918 (95.415) Mem 7379MB [2024-08-26 06:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.092) Loss 1.5791 (1.0310) Acc@1 61.816 (76.528) Acc@5 85.645 (93.722) Mem 7379MB [2024-08-26 06:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.4385 (1.1054) Acc@1 65.625 (74.781) Acc@5 88.965 (92.783) Mem 7379MB [2024-08-26 06:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.358 Acc@5 92.698 [2024-08-26 06:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.4% [2024-08-26 06:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 74.36% [2024-08-26 06:24:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 06:24:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 06:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.4719 (0.4719) Acc@1 90.039 (90.039) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 06:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.110) Loss 0.7939 (0.7686) Acc@1 82.422 (82.413) Acc@5 95.703 (96.209) Mem 7379MB [2024-08-26 06:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.095) Loss 1.1182 (0.7846) Acc@1 73.730 (81.538) Acc@5 91.797 (96.164) Mem 7379MB [2024-08-26 06:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.089) Loss 1.3896 (0.9031) Acc@1 64.551 (78.720) Acc@5 87.988 (94.632) Mem 7379MB [2024-08-26 06:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.3164 (0.9708) Acc@1 67.578 (77.070) Acc@5 88.867 (93.802) Mem 7379MB [2024-08-26 06:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.728 Acc@5 93.752 [2024-08-26 06:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 76.7% [2024-08-26 06:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 76.73% [2024-08-26 06:24:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 06:24:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 06:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][0/1251] eta 0:13:30 lr 0.000974 wd 0.0500 time 0.6479 (0.6479) data time 0.4272 (0.4272) model time 0.0000 (0.0000) loss 2.6701 (2.6701) grad_norm 2.2310 (2.2310) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][10/1251] eta 0:05:45 lr 0.000974 wd 0.0500 time 0.2324 (0.2786) data time 0.0010 (0.0398) model time 0.0000 (0.0000) loss 4.1367 (3.7068) grad_norm 1.6602 (2.0091) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][20/1251] eta 0:05:20 lr 0.000974 wd 0.0500 time 0.2421 (0.2607) data time 0.0013 (0.0213) model time 0.0000 (0.0000) loss 3.7769 (3.7945) grad_norm 1.9182 (1.8958) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][30/1251] eta 0:05:11 lr 0.000974 wd 0.0500 time 0.2427 (0.2549) data time 0.0008 (0.0148) model time 0.0000 (0.0000) loss 3.8987 (3.7447) grad_norm 2.1018 (1.8911) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][40/1251] eta 0:05:05 lr 0.000974 wd 0.0500 time 0.2397 (0.2519) data time 0.0007 (0.0115) model time 0.0000 (0.0000) loss 3.3757 (3.6973) grad_norm 1.5846 (1.8972) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][50/1251] eta 0:05:00 lr 0.000974 wd 0.0500 time 0.2418 (0.2500) data time 0.0009 (0.0094) model time 0.0000 (0.0000) loss 3.4091 (3.6697) grad_norm 1.9169 (1.8899) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][60/1251] eta 0:04:56 lr 0.000974 wd 0.0500 time 0.2497 (0.2487) data time 0.0010 (0.0080) model time 0.2488 
(0.2412) loss 3.7939 (3.6352) grad_norm 3.1701 (1.9119) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][70/1251] eta 0:04:52 lr 0.000974 wd 0.0500 time 0.2459 (0.2480) data time 0.0011 (0.0071) model time 0.2447 (0.2417) loss 4.4464 (3.6143) grad_norm 2.8463 (1.9624) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][80/1251] eta 0:04:49 lr 0.000974 wd 0.0500 time 0.2426 (0.2474) data time 0.0008 (0.0063) model time 0.2419 (0.2420) loss 3.1448 (3.5877) grad_norm 1.9798 (2.0515) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][90/1251] eta 0:04:46 lr 0.000974 wd 0.0500 time 0.2452 (0.2467) data time 0.0010 (0.0057) model time 0.2441 (0.2414) loss 3.9259 (3.6021) grad_norm 1.7104 (2.0100) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 06:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][100/1251] eta 0:04:43 lr 0.000974 wd 0.0500 time 0.2471 (0.2461) data time 0.0010 (0.0053) model time 0.2461 (0.2409) loss 4.1487 (3.5964) grad_norm 1.4011 (inf) loss_scale 8192.0000 (15816.2376) mem 7379MB [2024-08-26 06:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][110/1251] eta 0:04:40 lr 0.000974 wd 0.0500 time 0.2368 (0.2456) data time 0.0013 (0.0049) model time 0.2354 (0.2408) loss 3.7532 (3.6101) grad_norm 1.8446 (inf) loss_scale 8192.0000 (15129.3694) mem 7379MB [2024-08-26 06:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][120/1251] eta 0:04:37 lr 0.000974 wd 0.0500 time 0.2423 (0.2455) data time 0.0008 (0.0046) model time 0.2414 (0.2411) loss 2.5151 (3.5892) grad_norm 1.5787 (inf) loss_scale 8192.0000 (14556.0331) mem 7379MB [2024-08-26 06:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][130/1251] eta 
0:04:34 lr 0.000974 wd 0.0500 time 0.2496 (0.2452) data time 0.0010 (0.0043) model time 0.2486 (0.2411) loss 2.6118 (3.5980) grad_norm 1.7362 (inf) loss_scale 8192.0000 (14070.2290) mem 7379MB [2024-08-26 06:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][140/1251] eta 0:04:32 lr 0.000974 wd 0.0500 time 0.2425 (0.2451) data time 0.0009 (0.0041) model time 0.2416 (0.2412) loss 4.1797 (3.6114) grad_norm 1.8363 (inf) loss_scale 8192.0000 (13653.3333) mem 7379MB [2024-08-26 06:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][150/1251] eta 0:04:29 lr 0.000974 wd 0.0500 time 0.2454 (0.2450) data time 0.0012 (0.0039) model time 0.2442 (0.2414) loss 3.4748 (3.5952) grad_norm 1.8279 (inf) loss_scale 8192.0000 (13291.6556) mem 7379MB [2024-08-26 06:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][160/1251] eta 0:04:27 lr 0.000974 wd 0.0500 time 0.2446 (0.2449) data time 0.0010 (0.0037) model time 0.2436 (0.2415) loss 3.7845 (3.6010) grad_norm 1.5057 (inf) loss_scale 8192.0000 (12974.9068) mem 7379MB [2024-08-26 06:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][170/1251] eta 0:04:24 lr 0.000974 wd 0.0500 time 0.2427 (0.2448) data time 0.0008 (0.0035) model time 0.2419 (0.2416) loss 3.0724 (3.6001) grad_norm 1.3872 (inf) loss_scale 8192.0000 (12695.2047) mem 7379MB [2024-08-26 06:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][180/1251] eta 0:04:22 lr 0.000974 wd 0.0500 time 0.2372 (0.2446) data time 0.0007 (0.0034) model time 0.2365 (0.2415) loss 3.4314 (3.6028) grad_norm 1.5908 (inf) loss_scale 8192.0000 (12446.4088) mem 7379MB [2024-08-26 06:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][190/1251] eta 0:04:19 lr 0.000974 wd 0.0500 time 0.2442 (0.2445) data time 0.0009 (0.0033) model time 0.2433 (0.2414) loss 3.2277 (3.5979) grad_norm 2.1517 (inf) loss_scale 8192.0000 (12223.6649) mem 7379MB [2024-08-26 06:25:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][200/1251] eta 0:04:16 lr 0.000974 wd 0.0500 time 0.2460 (0.2444) data time 0.0007 (0.0031) model time 0.2453 (0.2414) loss 4.0947 (3.6108) grad_norm 3.1351 (inf) loss_scale 8192.0000 (12023.0846) mem 7379MB [2024-08-26 06:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][210/1251] eta 0:04:14 lr 0.000974 wd 0.0500 time 0.2393 (0.2443) data time 0.0010 (0.0030) model time 0.2383 (0.2414) loss 3.8209 (3.6177) grad_norm 1.7795 (inf) loss_scale 8192.0000 (11841.5166) mem 7379MB [2024-08-26 06:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][220/1251] eta 0:04:12 lr 0.000974 wd 0.0500 time 0.2521 (0.2449) data time 0.0008 (0.0030) model time 0.2513 (0.2423) loss 2.4712 (3.6067) grad_norm 1.9203 (inf) loss_scale 8192.0000 (11676.3801) mem 7379MB [2024-08-26 06:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][230/1251] eta 0:04:09 lr 0.000974 wd 0.0500 time 0.2368 (0.2447) data time 0.0010 (0.0029) model time 0.2358 (0.2422) loss 4.2610 (3.6018) grad_norm 1.3612 (inf) loss_scale 8192.0000 (11525.5411) mem 7379MB [2024-08-26 06:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][240/1251] eta 0:04:07 lr 0.000974 wd 0.0500 time 0.2390 (0.2446) data time 0.0010 (0.0028) model time 0.2381 (0.2422) loss 3.5874 (3.5873) grad_norm 1.4726 (inf) loss_scale 8192.0000 (11387.2199) mem 7379MB [2024-08-26 06:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][250/1251] eta 0:04:05 lr 0.000974 wd 0.0500 time 0.2390 (0.2453) data time 0.0007 (0.0027) model time 0.2383 (0.2431) loss 3.9705 (3.5967) grad_norm 1.6133 (inf) loss_scale 8192.0000 (11259.9203) mem 7379MB [2024-08-26 06:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][260/1251] eta 0:04:03 lr 0.000974 wd 0.0500 time 0.2382 (0.2461) data time 0.0008 (0.0027) model time 0.2373 (0.2441) loss 4.4874 (3.6041) grad_norm 
1.8622 (inf) loss_scale 8192.0000 (11142.3755) mem 7379MB [2024-08-26 06:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][270/1251] eta 0:04:01 lr 0.000974 wd 0.0500 time 0.2396 (0.2460) data time 0.0007 (0.0026) model time 0.2389 (0.2440) loss 3.9705 (3.6068) grad_norm 1.5570 (inf) loss_scale 8192.0000 (11033.5055) mem 7379MB [2024-08-26 06:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][280/1251] eta 0:03:58 lr 0.000974 wd 0.0500 time 0.2378 (0.2458) data time 0.0008 (0.0025) model time 0.2370 (0.2438) loss 3.7850 (3.6068) grad_norm 2.7422 (inf) loss_scale 8192.0000 (10932.3843) mem 7379MB [2024-08-26 06:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][290/1251] eta 0:03:56 lr 0.000974 wd 0.0500 time 0.2399 (0.2457) data time 0.0010 (0.0025) model time 0.2389 (0.2437) loss 2.9125 (3.6081) grad_norm 1.9380 (inf) loss_scale 8192.0000 (10838.2131) mem 7379MB [2024-08-26 06:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][300/1251] eta 0:03:53 lr 0.000974 wd 0.0500 time 0.2416 (0.2455) data time 0.0007 (0.0024) model time 0.2408 (0.2436) loss 2.6944 (3.6046) grad_norm 1.9076 (inf) loss_scale 8192.0000 (10750.2990) mem 7379MB [2024-08-26 06:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][310/1251] eta 0:03:50 lr 0.000974 wd 0.0500 time 0.2345 (0.2454) data time 0.0010 (0.0024) model time 0.2335 (0.2434) loss 4.1167 (3.6156) grad_norm 2.1977 (inf) loss_scale 8192.0000 (10668.0386) mem 7379MB [2024-08-26 06:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][320/1251] eta 0:03:48 lr 0.000974 wd 0.0500 time 0.2304 (0.2452) data time 0.0011 (0.0024) model time 0.2293 (0.2432) loss 4.0303 (3.6158) grad_norm 1.8323 (inf) loss_scale 8192.0000 (10590.9034) mem 7379MB [2024-08-26 06:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][330/1251] eta 0:03:45 lr 0.000974 wd 0.0500 time 0.2388 (0.2451) data 
time 0.0010 (0.0024) model time 0.2378 (0.2431) loss 3.2794 (3.6152) grad_norm 2.1540 (inf) loss_scale 8192.0000 (10518.4290) mem 7379MB [2024-08-26 06:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][340/1251] eta 0:03:43 lr 0.000974 wd 0.0500 time 0.2506 (0.2450) data time 0.0009 (0.0023) model time 0.2497 (0.2430) loss 3.5292 (3.6181) grad_norm 3.5385 (inf) loss_scale 8192.0000 (10450.2053) mem 7379MB [2024-08-26 06:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][350/1251] eta 0:03:40 lr 0.000974 wd 0.0500 time 0.2380 (0.2449) data time 0.0009 (0.0023) model time 0.2372 (0.2429) loss 4.2985 (3.6236) grad_norm 1.7194 (inf) loss_scale 8192.0000 (10385.8689) mem 7379MB [2024-08-26 06:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][360/1251] eta 0:03:38 lr 0.000974 wd 0.0500 time 0.2558 (0.2449) data time 0.0011 (0.0023) model time 0.2546 (0.2429) loss 3.9798 (3.6274) grad_norm 3.3333 (inf) loss_scale 8192.0000 (10325.0970) mem 7379MB [2024-08-26 06:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][370/1251] eta 0:03:35 lr 0.000973 wd 0.0500 time 0.2426 (0.2448) data time 0.0008 (0.0022) model time 0.2418 (0.2428) loss 4.0846 (3.6246) grad_norm 1.9642 (inf) loss_scale 8192.0000 (10267.6011) mem 7379MB [2024-08-26 06:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][380/1251] eta 0:03:33 lr 0.000973 wd 0.0500 time 0.2474 (0.2447) data time 0.0008 (0.0022) model time 0.2466 (0.2428) loss 3.2244 (3.6274) grad_norm 2.3637 (inf) loss_scale 8192.0000 (10213.1234) mem 7379MB [2024-08-26 06:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][390/1251] eta 0:03:31 lr 0.000973 wd 0.0500 time 0.2426 (0.2453) data time 0.0007 (0.0022) model time 0.2419 (0.2435) loss 3.1346 (3.6329) grad_norm 2.2363 (inf) loss_scale 8192.0000 (10161.4322) mem 7379MB [2024-08-26 06:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[49/300][400/1251] eta 0:03:28 lr 0.000973 wd 0.0500 time 0.2412 (0.2452) data time 0.0011 (0.0021) model time 0.2401 (0.2434) loss 3.7733 (3.6307) grad_norm 2.2861 (inf) loss_scale 8192.0000 (10112.3192) mem 7379MB [2024-08-26 06:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][410/1251] eta 0:03:26 lr 0.000973 wd 0.0500 time 0.2433 (0.2452) data time 0.0012 (0.0021) model time 0.2421 (0.2434) loss 3.7580 (3.6293) grad_norm 1.7663 (inf) loss_scale 8192.0000 (10065.5961) mem 7379MB [2024-08-26 06:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][420/1251] eta 0:03:23 lr 0.000973 wd 0.0500 time 0.2427 (0.2451) data time 0.0008 (0.0021) model time 0.2419 (0.2433) loss 4.2895 (3.6333) grad_norm 1.7640 (inf) loss_scale 8192.0000 (10021.0926) mem 7379MB [2024-08-26 06:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][430/1251] eta 0:03:21 lr 0.000973 wd 0.0500 time 0.2410 (0.2450) data time 0.0011 (0.0021) model time 0.2399 (0.2433) loss 3.2511 (3.6339) grad_norm 1.8300 (inf) loss_scale 8192.0000 (9978.6543) mem 7379MB [2024-08-26 06:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][440/1251] eta 0:03:18 lr 0.000973 wd 0.0500 time 0.2607 (0.2450) data time 0.0010 (0.0020) model time 0.2597 (0.2433) loss 3.8979 (3.6322) grad_norm 1.6339 (inf) loss_scale 8192.0000 (9938.1406) mem 7379MB [2024-08-26 06:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][450/1251] eta 0:03:16 lr 0.000973 wd 0.0500 time 0.2407 (0.2449) data time 0.0011 (0.0020) model time 0.2396 (0.2432) loss 3.9151 (3.6342) grad_norm 1.5196 (inf) loss_scale 8192.0000 (9899.4235) mem 7379MB [2024-08-26 06:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][460/1251] eta 0:03:13 lr 0.000973 wd 0.0500 time 0.2431 (0.2449) data time 0.0007 (0.0020) model time 0.2424 (0.2432) loss 2.8437 (3.6278) grad_norm 1.9024 (inf) loss_scale 8192.0000 (9862.3861) mem 7379MB 
[2024-08-26 06:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][470/1251] eta 0:03:11 lr 0.000973 wd 0.0500 time 0.2394 (0.2448) data time 0.0011 (0.0020) model time 0.2383 (0.2431) loss 3.5597 (3.6315) grad_norm 3.3406 (inf) loss_scale 8192.0000 (9826.9214) mem 7379MB [2024-08-26 06:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][480/1251] eta 0:03:09 lr 0.000973 wd 0.0500 time 0.2349 (0.2452) data time 0.0013 (0.0020) model time 0.2336 (0.2435) loss 3.4173 (3.6303) grad_norm 1.5330 (inf) loss_scale 8192.0000 (9792.9314) mem 7379MB [2024-08-26 06:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][490/1251] eta 0:03:06 lr 0.000973 wd 0.0500 time 0.2382 (0.2451) data time 0.0011 (0.0019) model time 0.2372 (0.2435) loss 3.5053 (3.6279) grad_norm 1.4722 (inf) loss_scale 8192.0000 (9760.3259) mem 7379MB [2024-08-26 06:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][500/1251] eta 0:03:04 lr 0.000973 wd 0.0500 time 0.2459 (0.2450) data time 0.0011 (0.0019) model time 0.2449 (0.2434) loss 3.4536 (3.6235) grad_norm 1.9173 (inf) loss_scale 8192.0000 (9729.0220) mem 7379MB [2024-08-26 06:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][510/1251] eta 0:03:01 lr 0.000973 wd 0.0500 time 0.2357 (0.2450) data time 0.0011 (0.0019) model time 0.2346 (0.2433) loss 3.9206 (3.6238) grad_norm 1.4230 (inf) loss_scale 8192.0000 (9698.9432) mem 7379MB [2024-08-26 06:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][520/1251] eta 0:02:59 lr 0.000973 wd 0.0500 time 0.2394 (0.2449) data time 0.0007 (0.0019) model time 0.2387 (0.2433) loss 2.7290 (3.6202) grad_norm 1.8550 (inf) loss_scale 8192.0000 (9670.0192) mem 7379MB [2024-08-26 06:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][530/1251] eta 0:02:56 lr 0.000973 wd 0.0500 time 0.2319 (0.2449) data time 0.0010 (0.0019) model time 0.2309 (0.2432) loss 3.7539 
(3.6184) grad_norm 1.5904 (inf) loss_scale 8192.0000 (9642.1846) mem 7379MB [2024-08-26 06:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][540/1251] eta 0:02:54 lr 0.000973 wd 0.0500 time 0.2497 (0.2448) data time 0.0007 (0.0019) model time 0.2490 (0.2432) loss 2.4934 (3.6145) grad_norm 3.3152 (inf) loss_scale 8192.0000 (9615.3789) mem 7379MB [2024-08-26 06:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][550/1251] eta 0:02:51 lr 0.000973 wd 0.0500 time 0.2377 (0.2448) data time 0.0008 (0.0018) model time 0.2369 (0.2432) loss 2.9618 (3.6146) grad_norm 1.8521 (inf) loss_scale 8192.0000 (9589.5463) mem 7379MB [2024-08-26 06:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][560/1251] eta 0:02:49 lr 0.000973 wd 0.0500 time 0.2390 (0.2447) data time 0.0011 (0.0018) model time 0.2379 (0.2431) loss 3.8356 (3.6160) grad_norm 3.0238 (inf) loss_scale 8192.0000 (9564.6346) mem 7379MB [2024-08-26 06:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][570/1251] eta 0:02:46 lr 0.000973 wd 0.0500 time 0.2350 (0.2446) data time 0.0011 (0.0018) model time 0.2339 (0.2430) loss 3.2673 (3.6147) grad_norm 1.3240 (inf) loss_scale 8192.0000 (9540.5954) mem 7379MB [2024-08-26 06:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][580/1251] eta 0:02:44 lr 0.000973 wd 0.0500 time 0.2428 (0.2446) data time 0.0009 (0.0018) model time 0.2418 (0.2430) loss 4.3812 (3.6168) grad_norm 2.3267 (inf) loss_scale 8192.0000 (9517.3838) mem 7379MB [2024-08-26 06:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][590/1251] eta 0:02:41 lr 0.000973 wd 0.0500 time 0.2387 (0.2445) data time 0.0007 (0.0018) model time 0.2380 (0.2429) loss 4.2918 (3.6152) grad_norm 1.8169 (inf) loss_scale 8192.0000 (9494.9577) mem 7379MB [2024-08-26 06:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][600/1251] eta 0:02:39 lr 0.000973 wd 0.0500 time 0.2443 
(0.2445) data time 0.0007 (0.0018) model time 0.2435 (0.2429) loss 3.2570 (3.6156) grad_norm 1.4890 (inf) loss_scale 8192.0000 (9473.2779) mem 7379MB [2024-08-26 06:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][610/1251] eta 0:02:36 lr 0.000973 wd 0.0500 time 0.2401 (0.2445) data time 0.0008 (0.0018) model time 0.2393 (0.2429) loss 3.8080 (3.6197) grad_norm 1.7749 (inf) loss_scale 8192.0000 (9452.3077) mem 7379MB [2024-08-26 06:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][620/1251] eta 0:02:34 lr 0.000973 wd 0.0500 time 0.2453 (0.2444) data time 0.0009 (0.0018) model time 0.2443 (0.2429) loss 3.9158 (3.6198) grad_norm 1.8819 (inf) loss_scale 8192.0000 (9432.0129) mem 7379MB [2024-08-26 06:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][630/1251] eta 0:02:31 lr 0.000973 wd 0.0500 time 0.2468 (0.2444) data time 0.0009 (0.0017) model time 0.2459 (0.2429) loss 3.6298 (3.6144) grad_norm 1.7954 (inf) loss_scale 8192.0000 (9412.3613) mem 7379MB [2024-08-26 06:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][640/1251] eta 0:02:29 lr 0.000973 wd 0.0500 time 0.2432 (0.2444) data time 0.0007 (0.0017) model time 0.2424 (0.2429) loss 2.9207 (3.6150) grad_norm 3.7254 (inf) loss_scale 8192.0000 (9393.3229) mem 7379MB [2024-08-26 06:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][650/1251] eta 0:02:26 lr 0.000973 wd 0.0500 time 0.2445 (0.2444) data time 0.0010 (0.0017) model time 0.2436 (0.2429) loss 3.5498 (3.6128) grad_norm 2.0543 (inf) loss_scale 8192.0000 (9374.8694) mem 7379MB [2024-08-26 06:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][660/1251] eta 0:02:24 lr 0.000973 wd 0.0500 time 0.2462 (0.2444) data time 0.0010 (0.0017) model time 0.2452 (0.2428) loss 3.6680 (3.6136) grad_norm 1.5403 (inf) loss_scale 8192.0000 (9356.9743) mem 7379MB [2024-08-26 06:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [49/300][670/1251] eta 0:02:21 lr 0.000973 wd 0.0500 time 0.2348 (0.2443) data time 0.0010 (0.0017) model time 0.2339 (0.2428) loss 3.8070 (3.6161) grad_norm 2.1204 (inf) loss_scale 8192.0000 (9339.6125) mem 7379MB [2024-08-26 06:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][680/1251] eta 0:02:19 lr 0.000973 wd 0.0500 time 0.2507 (0.2444) data time 0.0010 (0.0017) model time 0.2497 (0.2428) loss 4.5644 (3.6205) grad_norm 1.6791 (inf) loss_scale 8192.0000 (9322.7606) mem 7379MB [2024-08-26 06:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][690/1251] eta 0:02:17 lr 0.000973 wd 0.0500 time 0.2566 (0.2443) data time 0.0008 (0.0017) model time 0.2558 (0.2428) loss 2.8498 (3.6195) grad_norm 1.7340 (inf) loss_scale 8192.0000 (9306.3965) mem 7379MB [2024-08-26 06:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][700/1251] eta 0:02:14 lr 0.000973 wd 0.0500 time 0.2340 (0.2443) data time 0.0008 (0.0017) model time 0.2332 (0.2428) loss 3.4481 (3.6218) grad_norm 3.2297 (inf) loss_scale 8192.0000 (9290.4993) mem 7379MB [2024-08-26 06:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][710/1251] eta 0:02:12 lr 0.000973 wd 0.0500 time 0.2402 (0.2443) data time 0.0012 (0.0017) model time 0.2391 (0.2427) loss 3.4183 (3.6188) grad_norm 1.8987 (inf) loss_scale 8192.0000 (9275.0492) mem 7379MB [2024-08-26 06:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][720/1251] eta 0:02:09 lr 0.000973 wd 0.0500 time 0.2401 (0.2442) data time 0.0009 (0.0016) model time 0.2392 (0.2427) loss 3.6040 (3.6219) grad_norm 3.0862 (inf) loss_scale 8192.0000 (9260.0277) mem 7379MB [2024-08-26 06:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][730/1251] eta 0:02:07 lr 0.000973 wd 0.0500 time 0.2541 (0.2442) data time 0.0009 (0.0016) model time 0.2532 (0.2427) loss 3.6669 (3.6235) grad_norm 1.4491 (inf) loss_scale 8192.0000 (9245.4172) mem 7379MB 
[2024-08-26 06:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][740/1251] eta 0:02:04 lr 0.000973 wd 0.0500 time 0.2424 (0.2445) data time 0.0009 (0.0016) model time 0.2415 (0.2430) loss 3.5535 (3.6229) grad_norm 2.0603 (inf) loss_scale 8192.0000 (9231.2011) mem 7379MB [2024-08-26 06:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][750/1251] eta 0:02:02 lr 0.000973 wd 0.0500 time 0.2487 (0.2445) data time 0.0009 (0.0016) model time 0.2477 (0.2430) loss 3.2773 (3.6237) grad_norm 2.0540 (inf) loss_scale 8192.0000 (9217.3635) mem 7379MB [2024-08-26 06:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][760/1251] eta 0:02:00 lr 0.000973 wd 0.0500 time 0.2391 (0.2444) data time 0.0007 (0.0016) model time 0.2384 (0.2430) loss 4.2447 (3.6283) grad_norm 2.9154 (inf) loss_scale 8192.0000 (9203.8896) mem 7379MB [2024-08-26 06:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][770/1251] eta 0:01:57 lr 0.000973 wd 0.0500 time 0.2356 (0.2444) data time 0.0010 (0.0016) model time 0.2346 (0.2429) loss 2.8298 (3.6230) grad_norm 2.5637 (inf) loss_scale 8192.0000 (9190.7652) mem 7379MB [2024-08-26 06:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][780/1251] eta 0:01:55 lr 0.000973 wd 0.0500 time 0.2467 (0.2443) data time 0.0007 (0.0016) model time 0.2460 (0.2429) loss 3.3243 (3.6182) grad_norm 1.6151 (inf) loss_scale 8192.0000 (9177.9770) mem 7379MB [2024-08-26 06:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][790/1251] eta 0:01:52 lr 0.000973 wd 0.0500 time 0.2445 (0.2443) data time 0.0012 (0.0016) model time 0.2433 (0.2429) loss 3.9409 (3.6207) grad_norm 1.5735 (inf) loss_scale 8192.0000 (9165.5120) mem 7379MB [2024-08-26 06:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][800/1251] eta 0:01:50 lr 0.000973 wd 0.0500 time 0.2392 (0.2443) data time 0.0009 (0.0016) model time 0.2383 (0.2428) loss 3.8202 
(3.6199) grad_norm 1.6288 (inf) loss_scale 8192.0000 (9153.3583) mem 7379MB [2024-08-26 06:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][810/1251] eta 0:01:47 lr 0.000973 wd 0.0500 time 0.2427 (0.2443) data time 0.0009 (0.0016) model time 0.2418 (0.2429) loss 4.0214 (3.6181) grad_norm 2.2221 (inf) loss_scale 8192.0000 (9141.5043) mem 7379MB [2024-08-26 06:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][820/1251] eta 0:01:45 lr 0.000973 wd 0.0500 time 0.2350 (0.2443) data time 0.0011 (0.0016) model time 0.2339 (0.2428) loss 3.7356 (3.6178) grad_norm 2.0212 (inf) loss_scale 8192.0000 (9129.9391) mem 7379MB [2024-08-26 06:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][830/1251] eta 0:01:42 lr 0.000973 wd 0.0500 time 0.2445 (0.2443) data time 0.0007 (0.0016) model time 0.2438 (0.2428) loss 3.1819 (3.6209) grad_norm 1.2357 (inf) loss_scale 8192.0000 (9118.6522) mem 7379MB [2024-08-26 06:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][840/1251] eta 0:01:40 lr 0.000973 wd 0.0500 time 0.2477 (0.2442) data time 0.0008 (0.0016) model time 0.2469 (0.2428) loss 2.7596 (3.6186) grad_norm 2.6875 (inf) loss_scale 8192.0000 (9107.6338) mem 7379MB [2024-08-26 06:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][850/1251] eta 0:01:37 lr 0.000973 wd 0.0500 time 0.2376 (0.2442) data time 0.0011 (0.0016) model time 0.2364 (0.2428) loss 3.7791 (3.6188) grad_norm 1.8814 (inf) loss_scale 8192.0000 (9096.8743) mem 7379MB [2024-08-26 06:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][860/1251] eta 0:01:35 lr 0.000973 wd 0.0500 time 0.2379 (0.2441) data time 0.0008 (0.0015) model time 0.2371 (0.2427) loss 4.0967 (3.6176) grad_norm 1.9441 (inf) loss_scale 8192.0000 (9086.3647) mem 7379MB [2024-08-26 06:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][870/1251] eta 0:01:33 lr 0.000973 wd 0.0500 time 0.2484 
(0.2441) data time 0.0009 (0.0015) model time 0.2475 (0.2427) loss 2.6419 (3.6154) grad_norm 2.0953 (inf) loss_scale 8192.0000 (9076.0964) mem 7379MB [2024-08-26 06:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][880/1251] eta 0:01:30 lr 0.000973 wd 0.0500 time 0.2440 (0.2441) data time 0.0009 (0.0015) model time 0.2431 (0.2427) loss 4.2146 (3.6183) grad_norm 1.9189 (inf) loss_scale 8192.0000 (9066.0613) mem 7379MB [2024-08-26 06:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][890/1251] eta 0:01:28 lr 0.000973 wd 0.0500 time 0.2367 (0.2443) data time 0.0011 (0.0015) model time 0.2356 (0.2429) loss 2.8334 (3.6183) grad_norm 1.4591 (inf) loss_scale 8192.0000 (9056.2514) mem 7379MB [2024-08-26 06:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][900/1251] eta 0:01:25 lr 0.000973 wd 0.0500 time 0.2369 (0.2446) data time 0.0012 (0.0015) model time 0.2358 (0.2432) loss 3.7666 (3.6190) grad_norm 1.8398 (inf) loss_scale 8192.0000 (9046.6593) mem 7379MB [2024-08-26 06:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][910/1251] eta 0:01:23 lr 0.000973 wd 0.0500 time 0.2454 (0.2447) data time 0.0009 (0.0015) model time 0.2445 (0.2433) loss 3.6949 (3.6227) grad_norm 1.8749 (inf) loss_scale 8192.0000 (9037.2777) mem 7379MB [2024-08-26 06:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][920/1251] eta 0:01:20 lr 0.000973 wd 0.0500 time 0.2460 (0.2447) data time 0.0008 (0.0015) model time 0.2453 (0.2433) loss 4.6566 (3.6232) grad_norm 2.5397 (inf) loss_scale 8192.0000 (9028.0999) mem 7379MB [2024-08-26 06:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][930/1251] eta 0:01:18 lr 0.000973 wd 0.0500 time 0.2430 (0.2446) data time 0.0009 (0.0015) model time 0.2420 (0.2433) loss 4.4200 (3.6210) grad_norm 1.5535 (inf) loss_scale 8192.0000 (9019.1192) mem 7379MB [2024-08-26 06:28:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [49/300][940/1251] eta 0:01:16 lr 0.000973 wd 0.0500 time 0.2410 (0.2446) data time 0.0008 (0.0015) model time 0.2402 (0.2432) loss 2.7833 (3.6222) grad_norm 2.0600 (inf) loss_scale 8192.0000 (9010.3294) mem 7379MB [2024-08-26 06:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][950/1251] eta 0:01:13 lr 0.000973 wd 0.0500 time 0.2428 (0.2446) data time 0.0011 (0.0015) model time 0.2417 (0.2432) loss 3.7763 (3.6228) grad_norm 1.3697 (inf) loss_scale 8192.0000 (9001.7245) mem 7379MB [2024-08-26 06:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][960/1251] eta 0:01:11 lr 0.000973 wd 0.0500 time 0.2425 (0.2445) data time 0.0010 (0.0015) model time 0.2415 (0.2432) loss 3.7518 (3.6241) grad_norm 1.9235 (inf) loss_scale 8192.0000 (8993.2986) mem 7379MB [2024-08-26 06:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][970/1251] eta 0:01:08 lr 0.000973 wd 0.0500 time 0.2349 (0.2445) data time 0.0011 (0.0015) model time 0.2338 (0.2431) loss 3.8352 (3.6221) grad_norm 1.3537 (inf) loss_scale 8192.0000 (8985.0463) mem 7379MB [2024-08-26 06:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][980/1251] eta 0:01:06 lr 0.000973 wd 0.0500 time 0.2410 (0.2444) data time 0.0008 (0.0015) model time 0.2402 (0.2431) loss 3.3796 (3.6186) grad_norm 1.8109 (inf) loss_scale 8192.0000 (8976.9623) mem 7379MB [2024-08-26 06:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][990/1251] eta 0:01:03 lr 0.000973 wd 0.0500 time 0.2325 (0.2444) data time 0.0009 (0.0015) model time 0.2316 (0.2431) loss 4.2165 (3.6215) grad_norm 1.5442 (inf) loss_scale 8192.0000 (8969.0414) mem 7379MB [2024-08-26 06:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1000/1251] eta 0:01:01 lr 0.000973 wd 0.0500 time 0.2526 (0.2444) data time 0.0009 (0.0015) model time 0.2517 (0.2431) loss 2.4995 (3.6195) grad_norm 1.6359 (inf) loss_scale 8192.0000 (8961.2787) mem 7379MB 
[2024-08-26 06:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1010/1251] eta 0:00:58 lr 0.000973 wd 0.0500 time 0.2411 (0.2444) data time 0.0007 (0.0015) model time 0.2404 (0.2431) loss 3.7596 (3.6181) grad_norm 1.8020 (inf) loss_scale 8192.0000 (8953.6696) mem 7379MB [2024-08-26 06:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1020/1251] eta 0:00:56 lr 0.000973 wd 0.0500 time 0.2491 (0.2444) data time 0.0010 (0.0015) model time 0.2481 (0.2431) loss 3.5836 (3.6186) grad_norm 1.6845 (inf) loss_scale 8192.0000 (8946.2096) mem 7379MB [2024-08-26 06:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1030/1251] eta 0:00:54 lr 0.000973 wd 0.0500 time 0.2403 (0.2444) data time 0.0008 (0.0015) model time 0.2395 (0.2430) loss 4.1708 (3.6186) grad_norm 1.4383 (inf) loss_scale 8192.0000 (8938.8943) mem 7379MB [2024-08-26 06:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1040/1251] eta 0:00:51 lr 0.000973 wd 0.0500 time 0.2468 (0.2444) data time 0.0010 (0.0015) model time 0.2458 (0.2430) loss 3.3714 (3.6157) grad_norm 1.6840 (inf) loss_scale 8192.0000 (8931.7195) mem 7379MB [2024-08-26 06:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1050/1251] eta 0:00:49 lr 0.000973 wd 0.0500 time 0.2412 (0.2443) data time 0.0007 (0.0015) model time 0.2405 (0.2430) loss 3.6157 (3.6164) grad_norm 2.2098 (inf) loss_scale 8192.0000 (8924.6813) mem 7379MB [2024-08-26 06:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1060/1251] eta 0:00:46 lr 0.000973 wd 0.0500 time 0.2668 (0.2443) data time 0.0007 (0.0015) model time 0.2661 (0.2430) loss 2.7289 (3.6144) grad_norm 2.1466 (inf) loss_scale 8192.0000 (8917.7757) mem 7379MB [2024-08-26 06:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1070/1251] eta 0:00:44 lr 0.000972 wd 0.0500 time 0.2417 (0.2443) data time 0.0010 (0.0015) model time 0.2407 (0.2430) loss 
4.2844 (3.6143) grad_norm 1.8210 (inf) loss_scale 8192.0000 (8910.9991) mem 7379MB [2024-08-26 06:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1080/1251] eta 0:00:41 lr 0.000972 wd 0.0500 time 0.2362 (0.2443) data time 0.0012 (0.0014) model time 0.2350 (0.2430) loss 3.2803 (3.6124) grad_norm 2.7381 (inf) loss_scale 8192.0000 (8904.3478) mem 7379MB [2024-08-26 06:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1090/1251] eta 0:00:39 lr 0.000972 wd 0.0500 time 0.2625 (0.2443) data time 0.0010 (0.0014) model time 0.2616 (0.2430) loss 3.1366 (3.6118) grad_norm 1.9731 (inf) loss_scale 8192.0000 (8897.8185) mem 7379MB [2024-08-26 06:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1100/1251] eta 0:00:36 lr 0.000972 wd 0.0500 time 0.2450 (0.2443) data time 0.0010 (0.0014) model time 0.2440 (0.2430) loss 4.1463 (3.6106) grad_norm 1.7379 (inf) loss_scale 8192.0000 (8891.4078) mem 7379MB [2024-08-26 06:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1110/1251] eta 0:00:34 lr 0.000972 wd 0.0500 time 0.2363 (0.2443) data time 0.0008 (0.0014) model time 0.2355 (0.2429) loss 3.4841 (3.6110) grad_norm 3.3689 (inf) loss_scale 8192.0000 (8885.1125) mem 7379MB [2024-08-26 06:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1120/1251] eta 0:00:31 lr 0.000972 wd 0.0500 time 0.2406 (0.2442) data time 0.0009 (0.0014) model time 0.2397 (0.2429) loss 3.3599 (3.6106) grad_norm 1.9623 (inf) loss_scale 8192.0000 (8878.9295) mem 7379MB [2024-08-26 06:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1130/1251] eta 0:00:29 lr 0.000972 wd 0.0500 time 0.2337 (0.2442) data time 0.0009 (0.0014) model time 0.2327 (0.2429) loss 3.8902 (3.6136) grad_norm 1.8479 (inf) loss_scale 8192.0000 (8872.8559) mem 7379MB [2024-08-26 06:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1140/1251] eta 0:00:27 lr 0.000972 wd 0.0500 
time 0.2386 (0.2442) data time 0.0010 (0.0014) model time 0.2376 (0.2428) loss 3.6553 (3.6153) grad_norm 2.0143 (inf) loss_scale 8192.0000 (8866.8887) mem 7379MB [2024-08-26 06:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1150/1251] eta 0:00:24 lr 0.000972 wd 0.0500 time 0.2418 (0.2441) data time 0.0008 (0.0014) model time 0.2410 (0.2428) loss 2.4454 (3.6160) grad_norm 1.9259 (inf) loss_scale 8192.0000 (8861.0252) mem 7379MB [2024-08-26 06:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1160/1251] eta 0:00:22 lr 0.000972 wd 0.0500 time 0.2398 (0.2441) data time 0.0012 (0.0014) model time 0.2386 (0.2428) loss 3.9055 (3.6178) grad_norm 2.5387 (inf) loss_scale 8192.0000 (8855.2627) mem 7379MB [2024-08-26 06:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1170/1251] eta 0:00:19 lr 0.000972 wd 0.0500 time 0.2437 (0.2441) data time 0.0008 (0.0014) model time 0.2429 (0.2428) loss 3.2716 (3.6157) grad_norm 1.7084 (inf) loss_scale 8192.0000 (8849.5986) mem 7379MB [2024-08-26 06:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1180/1251] eta 0:00:17 lr 0.000972 wd 0.0500 time 0.2388 (0.2443) data time 0.0009 (0.0014) model time 0.2379 (0.2429) loss 3.6053 (3.6180) grad_norm 1.8400 (inf) loss_scale 8192.0000 (8844.0305) mem 7379MB [2024-08-26 06:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1190/1251] eta 0:00:14 lr 0.000972 wd 0.0500 time 0.2398 (0.2445) data time 0.0010 (0.0014) model time 0.2388 (0.2431) loss 3.7642 (3.6188) grad_norm 1.8843 (inf) loss_scale 8192.0000 (8838.5558) mem 7379MB [2024-08-26 06:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1200/1251] eta 0:00:12 lr 0.000972 wd 0.0500 time 0.2407 (0.2444) data time 0.0007 (0.0014) model time 0.2400 (0.2431) loss 2.7994 (3.6181) grad_norm 1.9015 (inf) loss_scale 8192.0000 (8833.1724) mem 7379MB [2024-08-26 06:29:44 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [49/300][1210/1251] eta 0:00:10 lr 0.000972 wd 0.0500 time 0.2497 (0.2444) data time 0.0007 (0.0014) model time 0.2489 (0.2431) loss 4.2731 (3.6172) grad_norm 2.4157 (inf) loss_scale 8192.0000 (8827.8778) mem 7379MB [2024-08-26 06:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1220/1251] eta 0:00:07 lr 0.000972 wd 0.0500 time 0.2452 (0.2444) data time 0.0010 (0.0014) model time 0.2442 (0.2431) loss 3.3930 (3.6188) grad_norm 2.1763 (inf) loss_scale 8192.0000 (8822.6699) mem 7379MB [2024-08-26 06:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1230/1251] eta 0:00:05 lr 0.000972 wd 0.0500 time 0.2450 (0.2444) data time 0.0008 (0.0014) model time 0.2442 (0.2431) loss 4.0962 (3.6180) grad_norm 1.5194 (inf) loss_scale 8192.0000 (8817.5467) mem 7379MB [2024-08-26 06:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1240/1251] eta 0:00:02 lr 0.000972 wd 0.0500 time 0.2238 (0.2443) data time 0.0008 (0.0014) model time 0.2230 (0.2430) loss 3.5242 (3.6198) grad_norm 1.6687 (inf) loss_scale 8192.0000 (8812.5060) mem 7379MB [2024-08-26 06:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [49/300][1250/1251] eta 0:00:00 lr 0.000972 wd 0.0500 time 0.2257 (0.2442) data time 0.0008 (0.0014) model time 0.2249 (0.2429) loss 2.8095 (3.6197) grad_norm 2.1813 (inf) loss_scale 8192.0000 (8807.5460) mem 7379MB [2024-08-26 06:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 49 training takes 0:05:05 [2024-08-26 06:29:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 06:29:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 06:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.411 (0.411) Loss 0.5698 (0.5698) Acc@1 88.770 (88.770) Acc@5 97.363 (97.363) Mem 7379MB [2024-08-26 06:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.109) Loss 0.9170 (0.8946) Acc@1 79.492 (80.016) Acc@5 95.898 (95.312) Mem 7379MB [2024-08-26 06:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.095) Loss 1.2822 (0.9154) Acc@1 70.508 (79.246) Acc@5 91.309 (95.150) Mem 7379MB [2024-08-26 06:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.089) Loss 1.6855 (1.0527) Acc@1 59.570 (76.292) Acc@5 85.645 (93.589) Mem 7379MB [2024-08-26 06:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.5059 (1.1309) Acc@1 65.527 (74.609) Acc@5 88.574 (92.614) Mem 7379MB [2024-08-26 06:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.324 Acc@5 92.524 [2024-08-26 06:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.3% [2024-08-26 06:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.759 (0.759) Loss 0.4719 (0.4719) Acc@1 89.941 (89.941) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 06:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.144) Loss 0.7935 (0.7676) Acc@1 82.715 (82.457) Acc@5 95.801 (96.298) Mem 7379MB [2024-08-26 06:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.099 (0.115) Loss 1.1143 (0.7835) Acc@1 74.023 (81.617) Acc@5 91.992 (96.229) Mem 7379MB [2024-08-26 06:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.104) Loss 1.3818 (0.9006) Acc@1 64.648 (78.827) Acc@5 88.379 (94.686) Mem 7379MB [2024-08-26 06:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.095) Loss 1.3076 
(0.9680) Acc@1 67.578 (77.156) Acc@5 89.258 (93.902) Mem 7379MB [2024-08-26 06:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.802 Acc@5 93.840 [2024-08-26 06:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 76.8% [2024-08-26 06:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 76.80% [2024-08-26 06:30:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 06:30:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 06:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][0/1251] eta 0:14:48 lr 0.000972 wd 0.0500 time 0.7102 (0.7102) data time 0.4852 (0.4852) model time 0.0000 (0.0000) loss 3.7637 (3.7637) grad_norm 2.0233 (2.0233) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][10/1251] eta 0:05:52 lr 0.000972 wd 0.0500 time 0.2417 (0.2840) data time 0.0008 (0.0451) model time 0.0000 (0.0000) loss 4.3951 (3.6421) grad_norm 2.0997 (2.0329) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][20/1251] eta 0:05:26 lr 0.000972 wd 0.0500 time 0.2433 (0.2653) data time 0.0010 (0.0241) model time 0.0000 (0.0000) loss 3.4552 (3.7394) grad_norm 1.7520 (1.9174) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][30/1251] eta 0:05:15 lr 0.000972 wd 0.0500 time 0.2538 (0.2580) data time 0.0011 (0.0166) model time 0.0000 (0.0000) loss 3.9124 (3.6901) grad_norm 1.6852 (1.9051) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][40/1251] eta 0:05:07 
lr 0.000972 wd 0.0500 time 0.2420 (0.2540) data time 0.0008 (0.0129) model time 0.0000 (0.0000) loss 4.3170 (3.7294) grad_norm 1.8334 (1.9931) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][50/1251] eta 0:05:02 lr 0.000972 wd 0.0500 time 0.2522 (0.2516) data time 0.0010 (0.0106) model time 0.0000 (0.0000) loss 4.0655 (3.7146) grad_norm 2.2141 (1.9603) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][60/1251] eta 0:04:57 lr 0.000972 wd 0.0500 time 0.2376 (0.2499) data time 0.0012 (0.0091) model time 0.2364 (0.2401) loss 3.3542 (3.7058) grad_norm 1.6259 (1.9610) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][70/1251] eta 0:04:53 lr 0.000972 wd 0.0500 time 0.2484 (0.2488) data time 0.0008 (0.0079) model time 0.2475 (0.2408) loss 2.8690 (3.6823) grad_norm 1.5589 (2.0009) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][80/1251] eta 0:04:50 lr 0.000972 wd 0.0500 time 0.2421 (0.2482) data time 0.0010 (0.0071) model time 0.2411 (0.2415) loss 3.4488 (3.6612) grad_norm 2.5123 (1.9729) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][90/1251] eta 0:04:49 lr 0.000972 wd 0.0500 time 0.2451 (0.2498) data time 0.0009 (0.0064) model time 0.2442 (0.2465) loss 4.0098 (3.6212) grad_norm 1.5315 (1.9431) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][100/1251] eta 0:04:46 lr 0.000972 wd 0.0500 time 0.2377 (0.2489) data time 0.0009 (0.0059) model time 0.2367 (0.2451) loss 3.2389 (3.5951) grad_norm 1.7070 (1.9328) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][110/1251] eta 0:04:43 lr 0.000972 wd 0.0500 time 0.2440 (0.2481) data time 0.0008 (0.0054) model time 0.2433 (0.2442) loss 3.7987 (3.5881) grad_norm 1.5680 (1.9157) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][120/1251] eta 0:04:40 lr 0.000972 wd 0.0500 time 0.2443 (0.2476) data time 0.0009 (0.0051) model time 0.2434 (0.2437) loss 3.6467 (3.5806) grad_norm 2.0035 (1.9208) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][130/1251] eta 0:04:37 lr 0.000972 wd 0.0500 time 0.2405 (0.2472) data time 0.0009 (0.0048) model time 0.2396 (0.2434) loss 3.4812 (3.5731) grad_norm 2.7405 (1.9463) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][140/1251] eta 0:04:34 lr 0.000972 wd 0.0500 time 0.2403 (0.2468) data time 0.0009 (0.0045) model time 0.2394 (0.2430) loss 3.6183 (3.5783) grad_norm 2.0165 (1.9504) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][150/1251] eta 0:04:31 lr 0.000972 wd 0.0500 time 0.2355 (0.2464) data time 0.0013 (0.0043) model time 0.2342 (0.2427) loss 4.0517 (3.5783) grad_norm 1.6140 (1.9411) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][160/1251] eta 0:04:28 lr 0.000972 wd 0.0500 time 0.2385 (0.2459) data time 0.0009 (0.0041) model time 0.2376 (0.2423) loss 3.1310 (3.5614) grad_norm 1.6287 (1.9265) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][170/1251] eta 0:04:25 lr 0.000972 wd 0.0500 time 0.2404 (0.2457) data time 0.0007 (0.0039) model time 0.2397 (0.2421) loss 2.2564 
(3.5712) grad_norm 2.4894 (1.9236) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][180/1251] eta 0:04:25 lr 0.000972 wd 0.0500 time 0.2393 (0.2483) data time 0.0008 (0.0037) model time 0.2385 (0.2460) loss 3.3247 (3.5554) grad_norm 1.8485 (1.9189) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][190/1251] eta 0:04:22 lr 0.000972 wd 0.0500 time 0.2369 (0.2478) data time 0.0011 (0.0037) model time 0.2358 (0.2453) loss 2.3604 (3.5549) grad_norm 1.5518 (1.9079) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][200/1251] eta 0:04:20 lr 0.000972 wd 0.0500 time 0.2338 (0.2476) data time 0.0011 (0.0036) model time 0.2326 (0.2450) loss 3.8669 (3.5578) grad_norm 1.6756 (1.8970) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][210/1251] eta 0:04:17 lr 0.000972 wd 0.0500 time 0.2376 (0.2472) data time 0.0011 (0.0035) model time 0.2365 (0.2447) loss 3.2019 (3.5684) grad_norm 1.7568 (1.8881) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][220/1251] eta 0:04:14 lr 0.000972 wd 0.0500 time 0.2315 (0.2470) data time 0.0009 (0.0034) model time 0.2306 (0.2444) loss 3.5446 (3.5660) grad_norm 3.1423 (1.9097) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][230/1251] eta 0:04:12 lr 0.000972 wd 0.0500 time 0.2423 (0.2468) data time 0.0009 (0.0033) model time 0.2414 (0.2443) loss 2.4092 (3.5547) grad_norm 1.4994 (1.9190) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][240/1251] eta 0:04:09 lr 0.000972 wd 
0.0500 time 0.2416 (0.2466) data time 0.0010 (0.0032) model time 0.2406 (0.2441) loss 3.5005 (3.5547) grad_norm 1.7210 (1.9173) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][250/1251] eta 0:04:06 lr 0.000972 wd 0.0500 time 0.2428 (0.2466) data time 0.0007 (0.0031) model time 0.2420 (0.2441) loss 3.4591 (3.5564) grad_norm 1.6662 (1.9285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][260/1251] eta 0:04:04 lr 0.000972 wd 0.0500 time 0.2378 (0.2464) data time 0.0009 (0.0030) model time 0.2369 (0.2440) loss 3.2522 (3.5558) grad_norm 1.6672 (1.9308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][270/1251] eta 0:04:01 lr 0.000972 wd 0.0500 time 0.2450 (0.2462) data time 0.0008 (0.0029) model time 0.2442 (0.2439) loss 2.6088 (3.5568) grad_norm 2.0618 (1.9356) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][280/1251] eta 0:03:58 lr 0.000972 wd 0.0500 time 0.2336 (0.2461) data time 0.0009 (0.0029) model time 0.2328 (0.2437) loss 2.7187 (3.5570) grad_norm 1.6984 (1.9400) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][290/1251] eta 0:03:57 lr 0.000972 wd 0.0500 time 0.4735 (0.2467) data time 0.0007 (0.0028) model time 0.4728 (0.2445) loss 4.0841 (3.5656) grad_norm 1.2745 (1.9329) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][300/1251] eta 0:03:54 lr 0.000972 wd 0.0500 time 0.2542 (0.2465) data time 0.0007 (0.0027) model time 0.2534 (0.2443) loss 2.8222 (3.5698) grad_norm 1.7099 (1.9296) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:20 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][310/1251] eta 0:03:51 lr 0.000972 wd 0.0500 time 0.2493 (0.2463) data time 0.0009 (0.0027) model time 0.2484 (0.2441) loss 4.0935 (3.5732) grad_norm 2.2018 (1.9214) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][320/1251] eta 0:03:49 lr 0.000972 wd 0.0500 time 0.2511 (0.2461) data time 0.0009 (0.0026) model time 0.2501 (0.2440) loss 3.3788 (3.5709) grad_norm 2.2360 (1.9238) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][330/1251] eta 0:03:47 lr 0.000972 wd 0.0500 time 0.2386 (0.2466) data time 0.0009 (0.0026) model time 0.2377 (0.2445) loss 3.0624 (3.5655) grad_norm 2.0480 (1.9249) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][340/1251] eta 0:03:44 lr 0.000972 wd 0.0500 time 0.2393 (0.2464) data time 0.0010 (0.0025) model time 0.2384 (0.2444) loss 3.8583 (3.5672) grad_norm 1.6267 (1.9248) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][350/1251] eta 0:03:41 lr 0.000972 wd 0.0500 time 0.2508 (0.2463) data time 0.0011 (0.0025) model time 0.2497 (0.2443) loss 3.7938 (3.5677) grad_norm 1.7782 (1.9234) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][360/1251] eta 0:03:39 lr 0.000972 wd 0.0500 time 0.2463 (0.2462) data time 0.0007 (0.0024) model time 0.2456 (0.2442) loss 2.6311 (3.5675) grad_norm 2.1730 (1.9333) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][370/1251] eta 0:03:36 lr 0.000972 wd 0.0500 time 0.2419 (0.2461) data time 0.0010 (0.0024) model time 0.2409 (0.2441) loss 3.7975 
(3.5607) grad_norm 1.8293 (1.9334) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][380/1251] eta 0:03:34 lr 0.000972 wd 0.0500 time 0.2360 (0.2461) data time 0.0011 (0.0024) model time 0.2349 (0.2441) loss 3.2605 (3.5658) grad_norm 1.9078 (1.9365) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][390/1251] eta 0:03:31 lr 0.000972 wd 0.0500 time 0.2297 (0.2459) data time 0.0008 (0.0024) model time 0.2290 (0.2440) loss 3.2353 (3.5666) grad_norm 2.6475 (1.9337) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][400/1251] eta 0:03:29 lr 0.000972 wd 0.0500 time 0.2419 (0.2459) data time 0.0007 (0.0023) model time 0.2412 (0.2439) loss 3.9720 (3.5701) grad_norm 2.3827 (1.9334) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][410/1251] eta 0:03:26 lr 0.000972 wd 0.0500 time 0.2394 (0.2458) data time 0.0009 (0.0023) model time 0.2385 (0.2438) loss 3.2273 (3.5657) grad_norm 1.8176 (1.9303) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][420/1251] eta 0:03:24 lr 0.000972 wd 0.0500 time 0.2412 (0.2457) data time 0.0007 (0.0023) model time 0.2405 (0.2437) loss 2.6051 (3.5634) grad_norm 2.2333 (1.9333) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][430/1251] eta 0:03:21 lr 0.000972 wd 0.0500 time 0.2429 (0.2455) data time 0.0007 (0.0022) model time 0.2422 (0.2436) loss 4.3146 (3.5651) grad_norm 1.9544 (1.9326) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][440/1251] eta 0:03:19 lr 0.000972 wd 
0.0500 time 0.2379 (0.2455) data time 0.0007 (0.0022) model time 0.2372 (0.2436) loss 4.0660 (3.5660) grad_norm 2.2172 (1.9257) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][450/1251] eta 0:03:16 lr 0.000972 wd 0.0500 time 0.2379 (0.2454) data time 0.0011 (0.0022) model time 0.2369 (0.2435) loss 4.1435 (3.5676) grad_norm 1.7293 (1.9222) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][460/1251] eta 0:03:14 lr 0.000972 wd 0.0500 time 0.2407 (0.2454) data time 0.0009 (0.0022) model time 0.2398 (0.2435) loss 3.3197 (3.5647) grad_norm 2.3085 (1.9231) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][470/1251] eta 0:03:11 lr 0.000972 wd 0.0500 time 0.2494 (0.2453) data time 0.0009 (0.0021) model time 0.2485 (0.2434) loss 2.8697 (3.5601) grad_norm 1.5692 (1.9183) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][480/1251] eta 0:03:09 lr 0.000972 wd 0.0500 time 0.2409 (0.2453) data time 0.0009 (0.0021) model time 0.2400 (0.2434) loss 4.1513 (3.5665) grad_norm 2.0974 (1.9181) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][490/1251] eta 0:03:06 lr 0.000971 wd 0.0500 time 0.2470 (0.2452) data time 0.0010 (0.0021) model time 0.2460 (0.2434) loss 3.6401 (3.5664) grad_norm 1.4697 (1.9160) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][500/1251] eta 0:03:04 lr 0.000971 wd 0.0500 time 0.2358 (0.2452) data time 0.0010 (0.0021) model time 0.2348 (0.2434) loss 4.0379 (3.5659) grad_norm 2.5475 (1.9141) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:09 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][510/1251] eta 0:03:01 lr 0.000971 wd 0.0500 time 0.2448 (0.2452) data time 0.0007 (0.0021) model time 0.2441 (0.2433) loss 4.2137 (3.5709) grad_norm 1.5440 (1.9186) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][520/1251] eta 0:02:59 lr 0.000971 wd 0.0500 time 0.2433 (0.2452) data time 0.0009 (0.0021) model time 0.2423 (0.2433) loss 4.3225 (3.5724) grad_norm 2.2878 (1.9176) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][530/1251] eta 0:02:56 lr 0.000971 wd 0.0500 time 0.2560 (0.2452) data time 0.0008 (0.0020) model time 0.2552 (0.2434) loss 3.1188 (3.5748) grad_norm 2.5568 (1.9143) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][540/1251] eta 0:02:54 lr 0.000971 wd 0.0500 time 0.2405 (0.2451) data time 0.0009 (0.0020) model time 0.2395 (0.2433) loss 3.4148 (3.5689) grad_norm 1.4336 (1.9149) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][550/1251] eta 0:02:51 lr 0.000971 wd 0.0500 time 0.2522 (0.2451) data time 0.0007 (0.0020) model time 0.2515 (0.2433) loss 3.9982 (3.5679) grad_norm 1.5292 (1.9109) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][560/1251] eta 0:02:49 lr 0.000971 wd 0.0500 time 0.2390 (0.2450) data time 0.0008 (0.0020) model time 0.2382 (0.2432) loss 4.3214 (3.5674) grad_norm 2.3677 (1.9069) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][570/1251] eta 0:02:46 lr 0.000971 wd 0.0500 time 0.2381 (0.2450) data time 0.0008 (0.0020) model time 0.2372 (0.2432) loss 4.0888 
(3.5689) grad_norm 1.8755 (1.9084) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][580/1251] eta 0:02:44 lr 0.000971 wd 0.0500 time 0.2362 (0.2449) data time 0.0008 (0.0020) model time 0.2354 (0.2431) loss 4.1766 (3.5714) grad_norm 1.4551 (1.9096) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][590/1251] eta 0:02:41 lr 0.000971 wd 0.0500 time 0.2462 (0.2448) data time 0.0008 (0.0019) model time 0.2453 (0.2431) loss 2.9288 (3.5654) grad_norm 1.8381 (1.9102) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][600/1251] eta 0:02:39 lr 0.000971 wd 0.0500 time 0.2402 (0.2448) data time 0.0010 (0.0019) model time 0.2393 (0.2430) loss 2.5488 (3.5598) grad_norm 2.3117 (1.9085) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][610/1251] eta 0:02:36 lr 0.000971 wd 0.0500 time 0.2455 (0.2447) data time 0.0009 (0.0019) model time 0.2446 (0.2430) loss 2.8443 (3.5570) grad_norm 1.8092 (1.9079) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][620/1251] eta 0:02:34 lr 0.000971 wd 0.0500 time 0.2405 (0.2447) data time 0.0007 (0.0019) model time 0.2398 (0.2429) loss 3.5479 (3.5537) grad_norm 4.4549 (1.9133) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][630/1251] eta 0:02:31 lr 0.000971 wd 0.0500 time 0.2387 (0.2446) data time 0.0009 (0.0019) model time 0.2379 (0.2429) loss 4.0352 (3.5522) grad_norm 1.8247 (1.9226) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][640/1251] eta 0:02:29 lr 0.000971 wd 
0.0500 time 0.2420 (0.2446) data time 0.0010 (0.0019) model time 0.2410 (0.2428) loss 3.6480 (3.5511) grad_norm 1.8840 (1.9216) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][650/1251] eta 0:02:26 lr 0.000971 wd 0.0500 time 0.2383 (0.2445) data time 0.0008 (0.0019) model time 0.2375 (0.2428) loss 4.4900 (3.5550) grad_norm 2.2847 (1.9268) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][660/1251] eta 0:02:24 lr 0.000971 wd 0.0500 time 0.2395 (0.2444) data time 0.0009 (0.0018) model time 0.2386 (0.2427) loss 3.3456 (3.5591) grad_norm 1.7765 (1.9258) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][670/1251] eta 0:02:21 lr 0.000971 wd 0.0500 time 0.2455 (0.2444) data time 0.0007 (0.0018) model time 0.2448 (0.2427) loss 4.0551 (3.5643) grad_norm 1.3725 (1.9282) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][680/1251] eta 0:02:19 lr 0.000971 wd 0.0500 time 0.2318 (0.2443) data time 0.0008 (0.0018) model time 0.2311 (0.2426) loss 3.7810 (3.5693) grad_norm 1.6111 (1.9271) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][690/1251] eta 0:02:17 lr 0.000971 wd 0.0500 time 0.2444 (0.2443) data time 0.0008 (0.0018) model time 0.2436 (0.2426) loss 4.4668 (3.5696) grad_norm 2.1753 (1.9259) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][700/1251] eta 0:02:14 lr 0.000971 wd 0.0500 time 0.2345 (0.2448) data time 0.0011 (0.0018) model time 0.2334 (0.2432) loss 3.1745 (3.5666) grad_norm 1.6838 (1.9226) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:32:58 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][710/1251] eta 0:02:12 lr 0.000971 wd 0.0500 time 0.2432 (0.2448) data time 0.0010 (0.0018) model time 0.2422 (0.2432) loss 3.4794 (3.5671) grad_norm 1.4547 (1.9209) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][720/1251] eta 0:02:09 lr 0.000971 wd 0.0500 time 0.2438 (0.2448) data time 0.0008 (0.0018) model time 0.2431 (0.2432) loss 4.1499 (3.5656) grad_norm 1.4951 (1.9188) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][730/1251] eta 0:02:07 lr 0.000971 wd 0.0500 time 0.2445 (0.2448) data time 0.0007 (0.0018) model time 0.2438 (0.2432) loss 4.0869 (3.5717) grad_norm 1.4745 (1.9194) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][740/1251] eta 0:02:05 lr 0.000971 wd 0.0500 time 0.2320 (0.2447) data time 0.0011 (0.0018) model time 0.2309 (0.2431) loss 2.5234 (3.5718) grad_norm 1.5575 (1.9219) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][750/1251] eta 0:02:02 lr 0.000971 wd 0.0500 time 0.2454 (0.2447) data time 0.0010 (0.0017) model time 0.2444 (0.2430) loss 2.5737 (3.5707) grad_norm 1.7479 (1.9241) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][760/1251] eta 0:02:00 lr 0.000971 wd 0.0500 time 0.2446 (0.2446) data time 0.0011 (0.0017) model time 0.2435 (0.2430) loss 3.4910 (3.5743) grad_norm 1.6787 (1.9226) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][770/1251] eta 0:01:57 lr 0.000971 wd 0.0500 time 0.2401 (0.2446) data time 0.0010 (0.0017) model time 0.2391 (0.2430) loss 3.9260 
(3.5770) grad_norm 4.6209 (1.9293) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][780/1251] eta 0:01:55 lr 0.000971 wd 0.0500 time 0.2476 (0.2445) data time 0.0007 (0.0017) model time 0.2470 (0.2430) loss 3.4717 (3.5756) grad_norm 2.0829 (1.9320) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][790/1251] eta 0:01:52 lr 0.000971 wd 0.0500 time 0.2480 (0.2445) data time 0.0007 (0.0017) model time 0.2473 (0.2429) loss 4.1189 (3.5738) grad_norm 1.3970 (1.9287) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][800/1251] eta 0:01:50 lr 0.000971 wd 0.0500 time 0.2437 (0.2445) data time 0.0010 (0.0017) model time 0.2427 (0.2429) loss 3.8132 (3.5749) grad_norm 2.6642 (1.9315) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][810/1251] eta 0:01:48 lr 0.000971 wd 0.0500 time 0.4469 (0.2450) data time 0.0011 (0.0017) model time 0.4458 (0.2434) loss 3.7431 (3.5747) grad_norm 1.8105 (1.9299) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][820/1251] eta 0:01:45 lr 0.000971 wd 0.0500 time 0.2454 (0.2452) data time 0.0007 (0.0017) model time 0.2447 (0.2437) loss 4.2567 (3.5757) grad_norm 1.5840 (1.9259) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][830/1251] eta 0:01:43 lr 0.000971 wd 0.0500 time 0.2397 (0.2454) data time 0.0008 (0.0017) model time 0.2389 (0.2439) loss 4.6706 (3.5759) grad_norm 2.0078 (1.9265) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][840/1251] eta 0:01:40 lr 0.000971 wd 
0.0500 time 0.2438 (0.2454) data time 0.0010 (0.0017) model time 0.2428 (0.2439) loss 3.8950 (3.5776) grad_norm 1.7179 (1.9267) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][850/1251] eta 0:01:38 lr 0.000971 wd 0.0500 time 0.2407 (0.2454) data time 0.0007 (0.0017) model time 0.2400 (0.2439) loss 4.2650 (3.5778) grad_norm 1.3640 (1.9264) loss_scale 16384.0000 (8269.0106) mem 7379MB [2024-08-26 06:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][860/1251] eta 0:01:36 lr 0.000971 wd 0.0500 time 0.4405 (0.2456) data time 0.0011 (0.0017) model time 0.4394 (0.2441) loss 3.7631 (3.5814) grad_norm 1.6826 (inf) loss_scale 8192.0000 (8287.1452) mem 7379MB [2024-08-26 06:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][870/1251] eta 0:01:33 lr 0.000971 wd 0.0500 time 0.2524 (0.2456) data time 0.0008 (0.0017) model time 0.2516 (0.2441) loss 2.3842 (3.5844) grad_norm 1.7480 (inf) loss_scale 8192.0000 (8286.0528) mem 7379MB [2024-08-26 06:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][880/1251] eta 0:01:31 lr 0.000971 wd 0.0500 time 0.2448 (0.2455) data time 0.0007 (0.0016) model time 0.2441 (0.2440) loss 4.1796 (3.5848) grad_norm 1.6937 (inf) loss_scale 8192.0000 (8284.9852) mem 7379MB [2024-08-26 06:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][890/1251] eta 0:01:28 lr 0.000971 wd 0.0500 time 0.2372 (0.2455) data time 0.0010 (0.0016) model time 0.2362 (0.2440) loss 4.0075 (3.5877) grad_norm 1.9822 (inf) loss_scale 8192.0000 (8283.9416) mem 7379MB [2024-08-26 06:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][900/1251] eta 0:01:26 lr 0.000971 wd 0.0500 time 0.2415 (0.2454) data time 0.0010 (0.0016) model time 0.2406 (0.2440) loss 3.4356 (3.5882) grad_norm 2.6639 (inf) loss_scale 8192.0000 (8282.9212) mem 7379MB [2024-08-26 06:33:47 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [50/300][910/1251] eta 0:01:23 lr 0.000971 wd 0.0500 time 0.2480 (0.2454) data time 0.0010 (0.0016) model time 0.2470 (0.2439) loss 4.0911 (3.5872) grad_norm 1.6609 (inf) loss_scale 8192.0000 (8281.9232) mem 7379MB [2024-08-26 06:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][920/1251] eta 0:01:21 lr 0.000971 wd 0.0500 time 0.2443 (0.2453) data time 0.0010 (0.0016) model time 0.2433 (0.2439) loss 2.2684 (3.5847) grad_norm 1.6767 (inf) loss_scale 8192.0000 (8280.9468) mem 7379MB [2024-08-26 06:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][930/1251] eta 0:01:18 lr 0.000971 wd 0.0500 time 0.2444 (0.2453) data time 0.0010 (0.0016) model time 0.2434 (0.2438) loss 3.8065 (3.5828) grad_norm 3.2936 (inf) loss_scale 8192.0000 (8279.9914) mem 7379MB [2024-08-26 06:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][940/1251] eta 0:01:16 lr 0.000971 wd 0.0500 time 0.2320 (0.2453) data time 0.0011 (0.0016) model time 0.2308 (0.2438) loss 2.7327 (3.5831) grad_norm 2.2116 (inf) loss_scale 8192.0000 (8279.0563) mem 7379MB [2024-08-26 06:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][950/1251] eta 0:01:13 lr 0.000971 wd 0.0500 time 0.2428 (0.2452) data time 0.0010 (0.0016) model time 0.2418 (0.2438) loss 4.1257 (3.5865) grad_norm 2.7964 (inf) loss_scale 8192.0000 (8278.1409) mem 7379MB [2024-08-26 06:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][960/1251] eta 0:01:11 lr 0.000971 wd 0.0500 time 0.2445 (0.2452) data time 0.0011 (0.0016) model time 0.2434 (0.2438) loss 3.9056 (3.5896) grad_norm 1.7616 (inf) loss_scale 8192.0000 (8277.2445) mem 7379MB [2024-08-26 06:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][970/1251] eta 0:01:08 lr 0.000971 wd 0.0500 time 0.2394 (0.2452) data time 0.0011 (0.0016) model time 0.2384 (0.2437) loss 3.6907 (3.5896) grad_norm 1.5129 (inf) loss_scale 
8192.0000 (8276.3666) mem 7379MB [2024-08-26 06:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][980/1251] eta 0:01:06 lr 0.000971 wd 0.0500 time 0.2363 (0.2452) data time 0.0009 (0.0016) model time 0.2354 (0.2437) loss 3.7978 (3.5910) grad_norm 1.7999 (inf) loss_scale 8192.0000 (8275.5066) mem 7379MB [2024-08-26 06:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][990/1251] eta 0:01:03 lr 0.000971 wd 0.0500 time 0.2354 (0.2451) data time 0.0009 (0.0016) model time 0.2346 (0.2437) loss 2.3674 (3.5897) grad_norm 1.4966 (inf) loss_scale 8192.0000 (8274.6640) mem 7379MB [2024-08-26 06:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1000/1251] eta 0:01:01 lr 0.000971 wd 0.0500 time 0.2434 (0.2451) data time 0.0008 (0.0016) model time 0.2427 (0.2436) loss 3.3244 (3.5906) grad_norm 1.9583 (inf) loss_scale 8192.0000 (8273.8382) mem 7379MB [2024-08-26 06:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1010/1251] eta 0:00:59 lr 0.000971 wd 0.0500 time 0.2439 (0.2450) data time 0.0011 (0.0016) model time 0.2428 (0.2436) loss 3.7856 (3.5931) grad_norm 1.7175 (inf) loss_scale 8192.0000 (8273.0287) mem 7379MB [2024-08-26 06:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1020/1251] eta 0:00:56 lr 0.000971 wd 0.0500 time 0.2385 (0.2452) data time 0.0011 (0.0016) model time 0.2375 (0.2438) loss 3.2738 (3.5915) grad_norm 2.3823 (inf) loss_scale 8192.0000 (8272.2351) mem 7379MB [2024-08-26 06:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1030/1251] eta 0:00:54 lr 0.000971 wd 0.0500 time 0.2434 (0.2452) data time 0.0007 (0.0016) model time 0.2427 (0.2438) loss 4.3122 (3.5912) grad_norm 1.7165 (inf) loss_scale 8192.0000 (8271.4568) mem 7379MB [2024-08-26 06:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1040/1251] eta 0:00:51 lr 0.000971 wd 0.0500 time 0.2339 (0.2452) data time 0.0010 (0.0016) 
model time 0.2329 (0.2437) loss 3.8382 (3.5924) grad_norm 1.6284 (inf) loss_scale 8192.0000 (8270.6936) mem 7379MB [2024-08-26 06:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1050/1251] eta 0:00:49 lr 0.000971 wd 0.0500 time 0.2421 (0.2451) data time 0.0008 (0.0016) model time 0.2413 (0.2437) loss 4.4166 (3.5913) grad_norm 1.4617 (inf) loss_scale 8192.0000 (8269.9448) mem 7379MB [2024-08-26 06:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1060/1251] eta 0:00:46 lr 0.000971 wd 0.0500 time 0.2403 (0.2451) data time 0.0010 (0.0016) model time 0.2393 (0.2437) loss 3.8201 (3.5921) grad_norm 1.7452 (inf) loss_scale 8192.0000 (8269.2102) mem 7379MB [2024-08-26 06:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1070/1251] eta 0:00:44 lr 0.000971 wd 0.0500 time 0.2411 (0.2451) data time 0.0008 (0.0016) model time 0.2403 (0.2437) loss 4.6224 (3.5911) grad_norm 1.4115 (inf) loss_scale 8192.0000 (8268.4893) mem 7379MB [2024-08-26 06:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1080/1251] eta 0:00:41 lr 0.000971 wd 0.0500 time 0.2399 (0.2451) data time 0.0009 (0.0015) model time 0.2391 (0.2436) loss 4.2717 (3.5902) grad_norm 1.9218 (inf) loss_scale 8192.0000 (8267.7817) mem 7379MB [2024-08-26 06:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1090/1251] eta 0:00:39 lr 0.000971 wd 0.0500 time 0.2357 (0.2450) data time 0.0009 (0.0015) model time 0.2347 (0.2436) loss 3.3102 (3.5904) grad_norm 2.0740 (inf) loss_scale 8192.0000 (8267.0871) mem 7379MB [2024-08-26 06:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1100/1251] eta 0:00:36 lr 0.000971 wd 0.0500 time 0.2393 (0.2450) data time 0.0009 (0.0015) model time 0.2384 (0.2436) loss 2.6720 (3.5920) grad_norm 2.0373 (inf) loss_scale 8192.0000 (8266.4051) mem 7379MB [2024-08-26 06:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1110/1251] 
eta 0:00:34 lr 0.000971 wd 0.0500 time 0.2467 (0.2451) data time 0.0008 (0.0015) model time 0.2459 (0.2437) loss 4.2242 (3.5911) grad_norm 1.5248 (inf) loss_scale 8192.0000 (8265.7354) mem 7379MB [2024-08-26 06:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1120/1251] eta 0:00:32 lr 0.000971 wd 0.0500 time 0.2457 (0.2451) data time 0.0009 (0.0015) model time 0.2448 (0.2437) loss 4.2645 (3.5903) grad_norm 1.5576 (inf) loss_scale 8192.0000 (8265.0776) mem 7379MB [2024-08-26 06:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1130/1251] eta 0:00:29 lr 0.000971 wd 0.0500 time 0.2380 (0.2451) data time 0.0010 (0.0015) model time 0.2371 (0.2437) loss 3.6424 (3.5917) grad_norm 1.7437 (inf) loss_scale 8192.0000 (8264.4315) mem 7379MB [2024-08-26 06:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1140/1251] eta 0:00:27 lr 0.000971 wd 0.0500 time 0.2337 (0.2450) data time 0.0011 (0.0015) model time 0.2326 (0.2436) loss 3.2296 (3.5923) grad_norm 1.9463 (inf) loss_scale 8192.0000 (8263.7967) mem 7379MB [2024-08-26 06:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1150/1251] eta 0:00:24 lr 0.000971 wd 0.0500 time 0.2423 (0.2450) data time 0.0009 (0.0015) model time 0.2413 (0.2436) loss 4.1713 (3.5927) grad_norm 1.3701 (inf) loss_scale 8192.0000 (8263.1729) mem 7379MB [2024-08-26 06:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1160/1251] eta 0:00:22 lr 0.000970 wd 0.0500 time 0.2412 (0.2450) data time 0.0013 (0.0015) model time 0.2399 (0.2436) loss 2.8112 (3.5921) grad_norm 2.6480 (inf) loss_scale 8192.0000 (8262.5599) mem 7379MB [2024-08-26 06:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1170/1251] eta 0:00:19 lr 0.000970 wd 0.0500 time 0.2313 (0.2449) data time 0.0010 (0.0015) model time 0.2304 (0.2435) loss 3.9441 (3.5920) grad_norm 1.9415 (inf) loss_scale 8192.0000 (8261.9573) mem 7379MB [2024-08-26 06:34:53 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1180/1251] eta 0:00:17 lr 0.000970 wd 0.0500 time 0.2421 (0.2449) data time 0.0011 (0.0015) model time 0.2410 (0.2435) loss 3.6799 (3.5945) grad_norm 1.9177 (inf) loss_scale 8192.0000 (8261.3649) mem 7379MB [2024-08-26 06:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1190/1251] eta 0:00:14 lr 0.000970 wd 0.0500 time 0.2340 (0.2449) data time 0.0011 (0.0015) model time 0.2329 (0.2435) loss 3.0179 (3.5938) grad_norm 1.6928 (inf) loss_scale 8192.0000 (8260.7825) mem 7379MB [2024-08-26 06:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1200/1251] eta 0:00:12 lr 0.000970 wd 0.0500 time 0.2374 (0.2449) data time 0.0009 (0.0015) model time 0.2365 (0.2435) loss 3.2477 (3.5942) grad_norm 1.6639 (inf) loss_scale 8192.0000 (8260.2098) mem 7379MB [2024-08-26 06:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1210/1251] eta 0:00:10 lr 0.000970 wd 0.0500 time 0.2442 (0.2448) data time 0.0008 (0.0015) model time 0.2434 (0.2434) loss 2.9335 (3.5941) grad_norm 1.4266 (inf) loss_scale 8192.0000 (8259.6466) mem 7379MB [2024-08-26 06:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1220/1251] eta 0:00:07 lr 0.000970 wd 0.0500 time 0.2506 (0.2448) data time 0.0008 (0.0015) model time 0.2498 (0.2434) loss 4.1145 (3.5934) grad_norm 1.3272 (inf) loss_scale 8192.0000 (8259.0925) mem 7379MB [2024-08-26 06:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1230/1251] eta 0:00:05 lr 0.000970 wd 0.0500 time 0.2442 (0.2448) data time 0.0009 (0.0015) model time 0.2433 (0.2434) loss 4.2515 (3.5933) grad_norm 1.6640 (inf) loss_scale 8192.0000 (8258.5475) mem 7379MB [2024-08-26 06:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1240/1251] eta 0:00:02 lr 0.000970 wd 0.0500 time 0.2250 (0.2447) data time 0.0007 (0.0015) model time 0.2243 (0.2433) loss 4.0173 (3.5934) 
grad_norm 1.6475 (inf) loss_scale 8192.0000 (8258.0113) mem 7379MB [2024-08-26 06:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [50/300][1250/1251] eta 0:00:00 lr 0.000970 wd 0.0500 time 0.2289 (0.2446) data time 0.0007 (0.0015) model time 0.2282 (0.2432) loss 2.4490 (3.5924) grad_norm 1.5933 (inf) loss_scale 8192.0000 (8257.4836) mem 7379MB [2024-08-26 06:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 50 training takes 0:05:05 [2024-08-26 06:35:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 06:35:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 06:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.430 (0.430) Loss 0.6050 (0.6050) Acc@1 89.746 (89.746) Acc@5 97.852 (97.852) Mem 7379MB [2024-08-26 06:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.110) Loss 0.9404 (0.9361) Acc@1 80.664 (80.140) Acc@5 96.094 (95.339) Mem 7379MB [2024-08-26 06:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.095) Loss 1.2559 (0.9430) Acc@1 72.656 (79.497) Acc@5 90.723 (95.429) Mem 7379MB [2024-08-26 06:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.089) Loss 1.5684 (1.0755) Acc@1 65.137 (76.474) Acc@5 86.133 (93.630) Mem 7379MB [2024-08-26 06:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.4707 (1.1463) Acc@1 67.773 (74.752) Acc@5 87.598 (92.721) Mem 7379MB [2024-08-26 06:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.518 Acc@5 92.678 [2024-08-26 06:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.5% [2024-08-26 06:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max 
accuracy: 74.52% [2024-08-26 06:35:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 06:35:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-26 06:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.477 (0.477) Loss 0.4744 (0.4744) Acc@1 90.137 (90.137) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 06:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.112) Loss 0.7930 (0.7670) Acc@1 82.910 (82.608) Acc@5 95.703 (96.307) Mem 7379MB [2024-08-26 06:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.095) Loss 1.1113 (0.7828) Acc@1 74.414 (81.780) Acc@5 91.895 (96.224) Mem 7379MB [2024-08-26 06:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 1.3789 (0.8992) Acc@1 64.746 (79.013) Acc@5 88.477 (94.714) Mem 7379MB [2024-08-26 06:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.3018 (0.9659) Acc@1 68.164 (77.365) Acc@5 89.648 (93.936) Mem 7379MB [2024-08-26 06:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.982 Acc@5 93.880 [2024-08-26 06:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.0% [2024-08-26 06:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 76.98% [2024-08-26 06:35:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 06:35:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 06:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][0/1251] eta 0:13:58 lr 0.000970 wd 0.0500 time 0.6707 (0.6707) data time 0.4302 (0.4302) model time 0.0000 (0.0000) loss 3.7150 (3.7150) grad_norm 1.9386 (1.9386) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][10/1251] eta 0:05:47 lr 0.000970 wd 0.0500 time 0.2456 (0.2800) data time 0.0010 (0.0401) model time 0.0000 (0.0000) loss 3.0686 (3.8154) grad_norm 1.7086 (1.9260) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][20/1251] eta 0:05:22 lr 0.000970 wd 0.0500 time 0.2431 (0.2622) data time 0.0010 (0.0215) model time 0.0000 (0.0000) loss 2.7058 (3.5914) grad_norm 1.5241 (1.8556) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][30/1251] eta 0:05:12 lr 0.000970 wd 0.0500 time 0.2468 (0.2555) data time 0.0007 (0.0149) model time 0.0000 (0.0000) loss 3.9713 (3.6334) grad_norm 2.0392 (1.9401) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][40/1251] eta 0:05:04 lr 0.000970 wd 0.0500 time 0.2358 (0.2515) data time 0.0011 (0.0115) model time 0.0000 (0.0000) loss 4.0134 (3.6570) grad_norm 1.6025 (1.8962) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][50/1251] eta 0:04:59 lr 0.000970 wd 0.0500 time 0.2394 (0.2494) data time 0.0009 (0.0094) model time 0.0000 (0.0000) loss 3.4832 (3.6919) grad_norm 1.8836 (1.9021) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][60/1251] eta 0:04:55 lr 0.000970 wd 0.0500 time 0.2426 (0.2481) data time 0.0010 (0.0081) model time 0.2416 (0.2403) loss 
3.9077 (3.6493) grad_norm 2.3369 (1.9938) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][70/1251] eta 0:04:51 lr 0.000970 wd 0.0500 time 0.2505 (0.2471) data time 0.0008 (0.0071) model time 0.2497 (0.2401) loss 3.8579 (3.5944) grad_norm 1.6551 (1.9825) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][80/1251] eta 0:04:48 lr 0.000970 wd 0.0500 time 0.2455 (0.2466) data time 0.0009 (0.0063) model time 0.2445 (0.2406) loss 4.2191 (3.5478) grad_norm 1.7525 (1.9509) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][90/1251] eta 0:04:45 lr 0.000970 wd 0.0500 time 0.2385 (0.2459) data time 0.0007 (0.0058) model time 0.2378 (0.2404) loss 2.3166 (3.5388) grad_norm 1.9375 (1.9462) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][100/1251] eta 0:04:44 lr 0.000970 wd 0.0500 time 0.2418 (0.2476) data time 0.0012 (0.0053) model time 0.2405 (0.2446) loss 3.8020 (3.5774) grad_norm 1.8165 (2.0189) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][110/1251] eta 0:04:41 lr 0.000970 wd 0.0500 time 0.2382 (0.2470) data time 0.0011 (0.0049) model time 0.2372 (0.2438) loss 3.1743 (3.6002) grad_norm 1.7655 (2.0189) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][120/1251] eta 0:04:38 lr 0.000970 wd 0.0500 time 0.2366 (0.2463) data time 0.0008 (0.0046) model time 0.2358 (0.2430) loss 3.7122 (3.6168) grad_norm 1.7861 (2.0222) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][130/1251] eta 0:04:35 lr 
0.000970 wd 0.0500 time 0.2416 (0.2460) data time 0.0009 (0.0043) model time 0.2407 (0.2427) loss 3.6517 (3.6097) grad_norm 1.4767 (1.9999) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][140/1251] eta 0:04:34 lr 0.000970 wd 0.0500 time 0.2378 (0.2470) data time 0.0010 (0.0041) model time 0.2368 (0.2446) loss 4.0662 (3.6100) grad_norm 1.6215 (1.9989) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][150/1251] eta 0:04:31 lr 0.000970 wd 0.0500 time 0.2461 (0.2467) data time 0.0009 (0.0039) model time 0.2452 (0.2442) loss 3.4917 (3.6182) grad_norm 1.8843 (2.0159) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][160/1251] eta 0:04:28 lr 0.000970 wd 0.0500 time 0.2433 (0.2465) data time 0.0010 (0.0037) model time 0.2423 (0.2440) loss 3.4193 (3.6171) grad_norm 2.6503 (2.0169) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][170/1251] eta 0:04:28 lr 0.000970 wd 0.0500 time 0.2446 (0.2486) data time 0.0010 (0.0036) model time 0.2436 (0.2471) loss 3.9000 (3.6244) grad_norm 1.8407 (2.0054) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][180/1251] eta 0:04:27 lr 0.000970 wd 0.0500 time 0.2371 (0.2493) data time 0.0007 (0.0034) model time 0.2363 (0.2482) loss 2.5397 (3.6220) grad_norm 1.4376 (1.9964) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][190/1251] eta 0:04:24 lr 0.000970 wd 0.0500 time 0.2444 (0.2489) data time 0.0009 (0.0033) model time 0.2436 (0.2476) loss 2.6556 (3.6151) grad_norm 1.9384 (1.9930) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][200/1251] eta 0:04:21 lr 0.000970 wd 0.0500 time 0.2434 (0.2486) data time 0.0008 (0.0032) model time 0.2427 (0.2473) loss 2.9448 (3.6192) grad_norm 1.5002 (2.0186) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][210/1251] eta 0:04:18 lr 0.000970 wd 0.0500 time 0.2328 (0.2483) data time 0.0012 (0.0031) model time 0.2316 (0.2468) loss 3.7028 (3.6242) grad_norm 2.6393 (2.0196) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][220/1251] eta 0:04:16 lr 0.000970 wd 0.0500 time 0.2406 (0.2489) data time 0.0008 (0.0030) model time 0.2397 (0.2477) loss 3.7965 (3.6261) grad_norm 2.6934 (2.0300) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][230/1251] eta 0:04:13 lr 0.000970 wd 0.0500 time 0.2474 (0.2487) data time 0.0010 (0.0029) model time 0.2464 (0.2474) loss 2.8022 (3.6223) grad_norm 2.5960 (2.0418) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][240/1251] eta 0:04:11 lr 0.000970 wd 0.0500 time 0.2506 (0.2485) data time 0.0007 (0.0028) model time 0.2500 (0.2472) loss 3.9417 (3.6215) grad_norm 1.6370 (2.0392) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][250/1251] eta 0:04:08 lr 0.000970 wd 0.0500 time 0.2376 (0.2483) data time 0.0011 (0.0028) model time 0.2365 (0.2469) loss 3.9280 (3.6310) grad_norm 1.5846 (2.0310) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][260/1251] eta 0:04:05 lr 0.000970 wd 0.0500 time 0.2380 (0.2480) data time 0.0007 (0.0027) model time 0.2373 (0.2466) loss 4.0288 
(3.6344) grad_norm 1.3450 (2.0242) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][270/1251] eta 0:04:03 lr 0.000970 wd 0.0500 time 0.2451 (0.2478) data time 0.0009 (0.0026) model time 0.2442 (0.2463) loss 3.9970 (3.6327) grad_norm 1.5770 (2.0208) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][280/1251] eta 0:04:00 lr 0.000970 wd 0.0500 time 0.2413 (0.2476) data time 0.0009 (0.0026) model time 0.2404 (0.2461) loss 3.6907 (3.6304) grad_norm 2.2754 (2.0187) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][290/1251] eta 0:03:57 lr 0.000970 wd 0.0500 time 0.2383 (0.2474) data time 0.0010 (0.0025) model time 0.2373 (0.2459) loss 2.5441 (3.6249) grad_norm 1.5901 (2.0246) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][300/1251] eta 0:03:55 lr 0.000970 wd 0.0500 time 0.2396 (0.2473) data time 0.0010 (0.0025) model time 0.2386 (0.2457) loss 2.7834 (3.6227) grad_norm 2.2575 (2.0184) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][310/1251] eta 0:03:52 lr 0.000970 wd 0.0500 time 0.2454 (0.2472) data time 0.0011 (0.0025) model time 0.2443 (0.2457) loss 3.1547 (3.6117) grad_norm 1.6172 (2.0258) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][320/1251] eta 0:03:50 lr 0.000970 wd 0.0500 time 0.2447 (0.2471) data time 0.0010 (0.0025) model time 0.2437 (0.2455) loss 3.3393 (3.6073) grad_norm 1.5637 (2.0221) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][330/1251] eta 0:03:47 lr 0.000970 wd 
0.0500 time 0.2397 (0.2469) data time 0.0008 (0.0024) model time 0.2389 (0.2453) loss 2.5917 (3.6052) grad_norm 1.2633 (2.0159) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][340/1251] eta 0:03:44 lr 0.000970 wd 0.0500 time 0.2394 (0.2467) data time 0.0011 (0.0024) model time 0.2383 (0.2451) loss 4.3219 (3.6161) grad_norm 1.7435 (2.0076) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][350/1251] eta 0:03:42 lr 0.000970 wd 0.0500 time 0.2497 (0.2466) data time 0.0007 (0.0023) model time 0.2490 (0.2450) loss 2.4857 (3.6080) grad_norm 1.5805 (1.9992) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][360/1251] eta 0:03:39 lr 0.000970 wd 0.0500 time 0.2417 (0.2465) data time 0.0011 (0.0023) model time 0.2406 (0.2449) loss 3.3001 (3.6075) grad_norm 2.0114 (1.9939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][370/1251] eta 0:03:37 lr 0.000970 wd 0.0500 time 0.2438 (0.2464) data time 0.0008 (0.0023) model time 0.2430 (0.2448) loss 2.4041 (3.6008) grad_norm 2.3023 (1.9941) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][380/1251] eta 0:03:34 lr 0.000970 wd 0.0500 time 0.2431 (0.2462) data time 0.0009 (0.0022) model time 0.2422 (0.2446) loss 4.0512 (3.6072) grad_norm 2.7362 (2.0143) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][390/1251] eta 0:03:31 lr 0.000970 wd 0.0500 time 0.2418 (0.2461) data time 0.0011 (0.0022) model time 0.2406 (0.2445) loss 3.9230 (3.6065) grad_norm 1.9519 (2.0111) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:36:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][400/1251] eta 0:03:29 lr 0.000970 wd 0.0500 time 0.2456 (0.2459) data time 0.0011 (0.0022) model time 0.2445 (0.2443) loss 4.0783 (3.6129) grad_norm 1.8477 (2.0155) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][410/1251] eta 0:03:26 lr 0.000970 wd 0.0500 time 0.2445 (0.2458) data time 0.0007 (0.0021) model time 0.2438 (0.2442) loss 3.1677 (3.6058) grad_norm 1.6475 (2.0114) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][420/1251] eta 0:03:24 lr 0.000970 wd 0.0500 time 0.2492 (0.2458) data time 0.0011 (0.0021) model time 0.2481 (0.2442) loss 3.4280 (3.6029) grad_norm 1.7747 (2.0114) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][430/1251] eta 0:03:21 lr 0.000970 wd 0.0500 time 0.2341 (0.2458) data time 0.0013 (0.0021) model time 0.2329 (0.2442) loss 3.8672 (3.6021) grad_norm 2.6000 (2.0154) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][440/1251] eta 0:03:19 lr 0.000970 wd 0.0500 time 0.2407 (0.2457) data time 0.0010 (0.0021) model time 0.2397 (0.2441) loss 4.1603 (3.6026) grad_norm 2.0243 (2.0164) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][450/1251] eta 0:03:16 lr 0.000970 wd 0.0500 time 0.2396 (0.2457) data time 0.0008 (0.0020) model time 0.2387 (0.2442) loss 3.7921 (3.6033) grad_norm 1.7651 (2.0109) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][460/1251] eta 0:03:14 lr 0.000970 wd 0.0500 time 0.2462 (0.2457) data time 0.0009 (0.0020) model time 0.2453 (0.2442) loss 4.0176 
(3.5997) grad_norm 1.2747 (2.0069) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][470/1251] eta 0:03:11 lr 0.000970 wd 0.0500 time 0.2409 (0.2457) data time 0.0008 (0.0020) model time 0.2400 (0.2441) loss 2.8961 (3.6038) grad_norm 1.5325 (2.0072) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][480/1251] eta 0:03:09 lr 0.000970 wd 0.0500 time 0.2341 (0.2456) data time 0.0007 (0.0020) model time 0.2333 (0.2440) loss 4.5338 (3.6067) grad_norm 1.8797 (2.0072) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][490/1251] eta 0:03:06 lr 0.000970 wd 0.0500 time 0.2377 (0.2455) data time 0.0009 (0.0020) model time 0.2369 (0.2440) loss 4.0575 (3.6014) grad_norm 1.4753 (2.0045) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][500/1251] eta 0:03:04 lr 0.000970 wd 0.0500 time 0.2474 (0.2455) data time 0.0008 (0.0019) model time 0.2466 (0.2439) loss 2.6341 (3.5970) grad_norm 1.9947 (2.0054) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][510/1251] eta 0:03:01 lr 0.000970 wd 0.0500 time 0.2376 (0.2454) data time 0.0007 (0.0019) model time 0.2369 (0.2439) loss 4.1329 (3.6030) grad_norm 2.1055 (2.0064) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][520/1251] eta 0:02:59 lr 0.000970 wd 0.0500 time 0.2451 (0.2453) data time 0.0008 (0.0019) model time 0.2443 (0.2438) loss 2.5259 (3.6061) grad_norm 1.4812 (2.0051) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][530/1251] eta 0:02:57 lr 0.000970 wd 
0.0500 time 0.2416 (0.2457) data time 0.0011 (0.0019) model time 0.2405 (0.2442) loss 3.8900 (3.6067) grad_norm 2.0960 (2.0025) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][540/1251] eta 0:02:54 lr 0.000970 wd 0.0500 time 0.2464 (0.2460) data time 0.0009 (0.0019) model time 0.2454 (0.2445) loss 3.0931 (3.6055) grad_norm 1.3988 (1.9954) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][550/1251] eta 0:02:52 lr 0.000970 wd 0.0500 time 0.2428 (0.2459) data time 0.0009 (0.0018) model time 0.2419 (0.2445) loss 3.3597 (3.6016) grad_norm 1.9679 (1.9923) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][560/1251] eta 0:02:49 lr 0.000970 wd 0.0500 time 0.2429 (0.2459) data time 0.0008 (0.0018) model time 0.2422 (0.2444) loss 2.9852 (3.5971) grad_norm 1.8334 (1.9883) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][570/1251] eta 0:02:47 lr 0.000969 wd 0.0500 time 0.2356 (0.2458) data time 0.0010 (0.0018) model time 0.2347 (0.2444) loss 3.1405 (3.5948) grad_norm 1.9566 (1.9875) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][580/1251] eta 0:02:44 lr 0.000969 wd 0.0500 time 0.2444 (0.2458) data time 0.0009 (0.0018) model time 0.2435 (0.2443) loss 3.4546 (3.5944) grad_norm 2.9704 (1.9870) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][590/1251] eta 0:02:42 lr 0.000969 wd 0.0500 time 0.2358 (0.2457) data time 0.0011 (0.0018) model time 0.2347 (0.2442) loss 3.8018 (3.5975) grad_norm 3.8926 (1.9864) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][600/1251] eta 0:02:39 lr 0.000969 wd 0.0500 time 0.2381 (0.2456) data time 0.0008 (0.0018) model time 0.2373 (0.2442) loss 4.6107 (3.6013) grad_norm 3.2589 (1.9916) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][610/1251] eta 0:02:37 lr 0.000969 wd 0.0500 time 0.2395 (0.2455) data time 0.0010 (0.0018) model time 0.2385 (0.2441) loss 4.2386 (3.6029) grad_norm 2.6006 (1.9911) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][620/1251] eta 0:02:35 lr 0.000969 wd 0.0500 time 0.2438 (0.2458) data time 0.0010 (0.0018) model time 0.2429 (0.2444) loss 3.6601 (3.6044) grad_norm 1.7278 (1.9883) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][630/1251] eta 0:02:32 lr 0.000969 wd 0.0500 time 0.2383 (0.2457) data time 0.0009 (0.0017) model time 0.2375 (0.2443) loss 4.5022 (3.6021) grad_norm 1.7881 (1.9865) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][640/1251] eta 0:02:30 lr 0.000969 wd 0.0500 time 0.2392 (0.2457) data time 0.0010 (0.0017) model time 0.2382 (0.2443) loss 3.2163 (3.6006) grad_norm 2.3631 (1.9909) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][650/1251] eta 0:02:27 lr 0.000969 wd 0.0500 time 0.2510 (0.2459) data time 0.0007 (0.0017) model time 0.2502 (0.2445) loss 4.7120 (3.6005) grad_norm 2.5617 (1.9882) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][660/1251] eta 0:02:25 lr 0.000969 wd 0.0500 time 0.2385 (0.2460) data time 0.0009 (0.0017) model time 0.2377 (0.2446) loss 3.8987 
(3.5992) grad_norm 1.4951 (1.9861) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][670/1251] eta 0:02:22 lr 0.000969 wd 0.0500 time 0.2485 (0.2459) data time 0.0012 (0.0017) model time 0.2473 (0.2446) loss 4.2080 (3.6019) grad_norm 1.5847 (1.9876) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][680/1251] eta 0:02:20 lr 0.000969 wd 0.0500 time 0.2356 (0.2459) data time 0.0011 (0.0017) model time 0.2345 (0.2445) loss 4.2446 (3.6036) grad_norm 1.6684 (1.9874) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][690/1251] eta 0:02:17 lr 0.000969 wd 0.0500 time 0.2386 (0.2458) data time 0.0011 (0.0017) model time 0.2375 (0.2445) loss 3.6431 (3.6025) grad_norm 1.5507 (1.9837) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][700/1251] eta 0:02:15 lr 0.000969 wd 0.0500 time 0.2429 (0.2461) data time 0.0008 (0.0017) model time 0.2421 (0.2447) loss 4.4068 (3.6052) grad_norm 2.8658 (1.9824) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][710/1251] eta 0:02:13 lr 0.000969 wd 0.0500 time 0.2402 (0.2466) data time 0.0009 (0.0017) model time 0.2393 (0.2453) loss 3.4206 (3.6035) grad_norm 1.9141 (1.9823) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][720/1251] eta 0:02:10 lr 0.000969 wd 0.0500 time 0.2473 (0.2465) data time 0.0009 (0.0017) model time 0.2463 (0.2452) loss 3.3703 (3.6000) grad_norm 1.8666 (1.9844) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][730/1251] eta 0:02:08 lr 0.000969 wd 
0.0500 time 0.2355 (0.2464) data time 0.0012 (0.0016) model time 0.2343 (0.2452) loss 3.7422 (3.6032) grad_norm 1.7198 (1.9830) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][740/1251] eta 0:02:05 lr 0.000969 wd 0.0500 time 0.2390 (0.2464) data time 0.0010 (0.0016) model time 0.2380 (0.2451) loss 3.9636 (3.6006) grad_norm 1.8260 (1.9894) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][750/1251] eta 0:02:03 lr 0.000969 wd 0.0500 time 0.2523 (0.2463) data time 0.0007 (0.0016) model time 0.2516 (0.2450) loss 3.5348 (3.5998) grad_norm 1.8989 (1.9872) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][760/1251] eta 0:02:00 lr 0.000969 wd 0.0500 time 0.2447 (0.2462) data time 0.0008 (0.0016) model time 0.2440 (0.2449) loss 3.2579 (3.6000) grad_norm 1.6685 (1.9862) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][770/1251] eta 0:01:58 lr 0.000969 wd 0.0500 time 0.2479 (0.2462) data time 0.0008 (0.0016) model time 0.2471 (0.2449) loss 2.6360 (3.5951) grad_norm 1.6970 (1.9844) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][780/1251] eta 0:01:55 lr 0.000969 wd 0.0500 time 0.2415 (0.2461) data time 0.0011 (0.0016) model time 0.2404 (0.2448) loss 3.8802 (3.5965) grad_norm 2.0255 (1.9801) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][790/1251] eta 0:01:53 lr 0.000969 wd 0.0500 time 0.2369 (0.2460) data time 0.0008 (0.0016) model time 0.2361 (0.2447) loss 4.3215 (3.5997) grad_norm 1.3803 (1.9786) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][800/1251] eta 0:01:50 lr 0.000969 wd 0.0500 time 0.2394 (0.2460) data time 0.0010 (0.0016) model time 0.2384 (0.2447) loss 3.6192 (3.6015) grad_norm 1.5514 (1.9766) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][810/1251] eta 0:01:48 lr 0.000969 wd 0.0500 time 0.2351 (0.2459) data time 0.0009 (0.0016) model time 0.2342 (0.2447) loss 4.5838 (3.6044) grad_norm 1.4209 (1.9760) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][820/1251] eta 0:01:45 lr 0.000969 wd 0.0500 time 0.2371 (0.2459) data time 0.0009 (0.0016) model time 0.2362 (0.2446) loss 2.6912 (3.6092) grad_norm 1.4790 (1.9759) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][830/1251] eta 0:01:43 lr 0.000969 wd 0.0500 time 0.2427 (0.2459) data time 0.0007 (0.0016) model time 0.2420 (0.2446) loss 4.3744 (3.6127) grad_norm 1.9352 (1.9764) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][840/1251] eta 0:01:41 lr 0.000969 wd 0.0500 time 0.2409 (0.2459) data time 0.0008 (0.0016) model time 0.2401 (0.2446) loss 3.0799 (3.6133) grad_norm 1.9364 (1.9754) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][850/1251] eta 0:01:38 lr 0.000969 wd 0.0500 time 0.2407 (0.2458) data time 0.0011 (0.0016) model time 0.2396 (0.2446) loss 3.3809 (3.6126) grad_norm 2.8176 (1.9765) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][860/1251] eta 0:01:36 lr 0.000969 wd 0.0500 time 0.2380 (0.2458) data time 0.0009 (0.0016) model time 0.2371 (0.2445) loss 2.9453 
(3.6116) grad_norm 1.4176 (1.9751) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][870/1251] eta 0:01:33 lr 0.000969 wd 0.0500 time 0.2568 (0.2458) data time 0.0009 (0.0015) model time 0.2559 (0.2445) loss 2.7555 (3.6107) grad_norm 1.9965 (1.9744) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][880/1251] eta 0:01:31 lr 0.000969 wd 0.0500 time 0.2359 (0.2457) data time 0.0012 (0.0015) model time 0.2347 (0.2445) loss 2.4516 (3.6073) grad_norm 2.4253 (1.9757) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][890/1251] eta 0:01:28 lr 0.000969 wd 0.0500 time 0.2391 (0.2457) data time 0.0011 (0.0015) model time 0.2380 (0.2444) loss 4.0123 (3.6072) grad_norm 1.7872 (1.9767) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][900/1251] eta 0:01:26 lr 0.000969 wd 0.0500 time 0.2316 (0.2456) data time 0.0012 (0.0015) model time 0.2304 (0.2444) loss 3.9223 (3.6074) grad_norm 2.2674 (1.9750) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][910/1251] eta 0:01:23 lr 0.000969 wd 0.0500 time 0.2379 (0.2456) data time 0.0008 (0.0015) model time 0.2371 (0.2443) loss 4.5910 (3.6100) grad_norm 1.5819 (1.9743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][920/1251] eta 0:01:21 lr 0.000969 wd 0.0500 time 0.2380 (0.2456) data time 0.0009 (0.0015) model time 0.2370 (0.2443) loss 4.0032 (3.6084) grad_norm 1.9974 (1.9743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][930/1251] eta 0:01:18 lr 0.000969 wd 
0.0500 time 0.2308 (0.2455) data time 0.0009 (0.0015) model time 0.2299 (0.2442) loss 3.2778 (3.6058) grad_norm 1.9066 (1.9743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][940/1251] eta 0:01:16 lr 0.000969 wd 0.0500 time 0.2360 (0.2455) data time 0.0007 (0.0015) model time 0.2353 (0.2442) loss 3.4161 (3.6058) grad_norm 1.7689 (1.9754) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][950/1251] eta 0:01:13 lr 0.000969 wd 0.0500 time 0.2473 (0.2454) data time 0.0008 (0.0015) model time 0.2465 (0.2441) loss 3.3149 (3.6081) grad_norm 1.6520 (1.9773) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][960/1251] eta 0:01:11 lr 0.000969 wd 0.0500 time 0.2384 (0.2454) data time 0.0010 (0.0015) model time 0.2375 (0.2441) loss 3.8023 (3.6053) grad_norm 1.3314 (1.9758) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][970/1251] eta 0:01:08 lr 0.000969 wd 0.0500 time 0.2347 (0.2453) data time 0.0009 (0.0015) model time 0.2338 (0.2440) loss 4.3967 (3.6060) grad_norm 2.0823 (1.9755) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][980/1251] eta 0:01:06 lr 0.000969 wd 0.0500 time 0.2397 (0.2453) data time 0.0008 (0.0015) model time 0.2389 (0.2440) loss 3.7913 (3.6081) grad_norm 1.6990 (1.9759) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][990/1251] eta 0:01:04 lr 0.000969 wd 0.0500 time 0.2442 (0.2453) data time 0.0007 (0.0015) model time 0.2434 (0.2440) loss 4.2337 (3.6118) grad_norm 1.7806 (1.9770) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1000/1251] eta 0:01:01 lr 0.000969 wd 0.0500 time 0.2310 (0.2452) data time 0.0011 (0.0015) model time 0.2299 (0.2439) loss 3.1526 (3.6118) grad_norm 1.8132 (1.9752) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1010/1251] eta 0:00:59 lr 0.000969 wd 0.0500 time 0.2404 (0.2452) data time 0.0011 (0.0015) model time 0.2392 (0.2439) loss 3.5565 (3.6148) grad_norm 1.7140 (1.9730) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1020/1251] eta 0:00:56 lr 0.000969 wd 0.0500 time 0.2330 (0.2452) data time 0.0012 (0.0015) model time 0.2318 (0.2439) loss 3.4431 (3.6143) grad_norm 2.3067 (1.9714) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1030/1251] eta 0:00:54 lr 0.000969 wd 0.0500 time 0.2375 (0.2452) data time 0.0007 (0.0015) model time 0.2368 (0.2439) loss 2.7770 (3.6146) grad_norm 2.4069 (1.9699) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1040/1251] eta 0:00:51 lr 0.000969 wd 0.0500 time 0.2400 (0.2451) data time 0.0011 (0.0015) model time 0.2389 (0.2438) loss 3.4835 (3.6144) grad_norm 2.7903 (1.9691) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1050/1251] eta 0:00:49 lr 0.000969 wd 0.0500 time 0.2410 (0.2451) data time 0.0011 (0.0015) model time 0.2399 (0.2438) loss 3.8589 (3.6155) grad_norm 1.3776 (1.9669) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1060/1251] eta 0:00:46 lr 0.000969 wd 0.0500 time 0.2364 (0.2450) data time 0.0011 (0.0015) model time 0.2353 (0.2437) loss 2.6441 
(3.6128) grad_norm 1.8574 (1.9678) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1070/1251] eta 0:00:44 lr 0.000969 wd 0.0500 time 0.2380 (0.2454) data time 0.0007 (0.0015) model time 0.2373 (0.2441) loss 3.1646 (3.6107) grad_norm 1.5122 (1.9659) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1080/1251] eta 0:00:41 lr 0.000969 wd 0.0500 time 0.2412 (0.2453) data time 0.0010 (0.0015) model time 0.2401 (0.2441) loss 3.7985 (3.6084) grad_norm 2.3247 (1.9637) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1090/1251] eta 0:00:39 lr 0.000969 wd 0.0500 time 0.2401 (0.2453) data time 0.0011 (0.0015) model time 0.2390 (0.2440) loss 3.7001 (3.6078) grad_norm 3.3059 (1.9659) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1100/1251] eta 0:00:37 lr 0.000969 wd 0.0500 time 0.2395 (0.2453) data time 0.0008 (0.0015) model time 0.2386 (0.2440) loss 3.9497 (3.6090) grad_norm 1.6808 (1.9660) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1110/1251] eta 0:00:34 lr 0.000969 wd 0.0500 time 0.2452 (0.2452) data time 0.0009 (0.0015) model time 0.2443 (0.2440) loss 4.6523 (3.6116) grad_norm 1.5135 (1.9653) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1120/1251] eta 0:00:32 lr 0.000969 wd 0.0500 time 0.2390 (0.2452) data time 0.0009 (0.0015) model time 0.2381 (0.2439) loss 4.2774 (3.6096) grad_norm inf (inf) loss_scale 4096.0000 (8188.3461) mem 7379MB [2024-08-26 06:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1130/1251] eta 0:00:29 lr 0.000969 
wd 0.0500 time 0.2408 (0.2452) data time 0.0008 (0.0015) model time 0.2400 (0.2439) loss 4.2973 (3.6078) grad_norm 3.4923 (inf) loss_scale 4096.0000 (8152.1627) mem 7379MB [2024-08-26 06:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1140/1251] eta 0:00:27 lr 0.000969 wd 0.0500 time 0.2426 (0.2452) data time 0.0012 (0.0014) model time 0.2415 (0.2439) loss 3.7566 (3.6058) grad_norm 1.9255 (inf) loss_scale 4096.0000 (8116.6135) mem 7379MB [2024-08-26 06:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1150/1251] eta 0:00:24 lr 0.000969 wd 0.0500 time 0.2405 (0.2454) data time 0.0007 (0.0014) model time 0.2397 (0.2441) loss 3.1118 (3.6061) grad_norm 1.4809 (inf) loss_scale 4096.0000 (8081.6820) mem 7379MB [2024-08-26 06:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1160/1251] eta 0:00:22 lr 0.000969 wd 0.0500 time 0.2452 (0.2454) data time 0.0007 (0.0014) model time 0.2445 (0.2441) loss 4.2592 (3.6049) grad_norm 2.0240 (inf) loss_scale 4096.0000 (8047.3523) mem 7379MB [2024-08-26 06:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1170/1251] eta 0:00:19 lr 0.000969 wd 0.0500 time 0.2425 (0.2453) data time 0.0011 (0.0014) model time 0.2415 (0.2441) loss 3.9084 (3.6054) grad_norm 1.7893 (inf) loss_scale 4096.0000 (8013.6089) mem 7379MB [2024-08-26 06:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1180/1251] eta 0:00:17 lr 0.000969 wd 0.0500 time 0.2386 (0.2456) data time 0.0007 (0.0014) model time 0.2378 (0.2444) loss 4.3642 (3.6052) grad_norm 1.4109 (inf) loss_scale 4096.0000 (7980.4369) mem 7379MB [2024-08-26 06:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1190/1251] eta 0:00:14 lr 0.000969 wd 0.0500 time 0.2373 (0.2456) data time 0.0011 (0.0014) model time 0.2361 (0.2444) loss 3.4706 (3.6051) grad_norm 1.4795 (inf) loss_scale 4096.0000 (7947.8220) mem 7379MB [2024-08-26 06:40:15 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [51/300][1200/1251] eta 0:00:12 lr 0.000969 wd 0.0500 time 0.2437 (0.2457) data time 0.0010 (0.0014) model time 0.2427 (0.2444) loss 4.1932 (3.6070) grad_norm 1.7220 (inf) loss_scale 4096.0000 (7915.7502) mem 7379MB [2024-08-26 06:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1210/1251] eta 0:00:10 lr 0.000969 wd 0.0500 time 0.2349 (0.2456) data time 0.0010 (0.0014) model time 0.2339 (0.2444) loss 3.9029 (3.6053) grad_norm 2.2117 (inf) loss_scale 4096.0000 (7884.2081) mem 7379MB [2024-08-26 06:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1220/1251] eta 0:00:07 lr 0.000968 wd 0.0500 time 0.2358 (0.2457) data time 0.0010 (0.0014) model time 0.2348 (0.2445) loss 3.7091 (3.6036) grad_norm 1.4618 (inf) loss_scale 4096.0000 (7853.1826) mem 7379MB [2024-08-26 06:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1230/1251] eta 0:00:05 lr 0.000968 wd 0.0500 time 0.2485 (0.2460) data time 0.0009 (0.0014) model time 0.2477 (0.2448) loss 3.8585 (3.6028) grad_norm 2.1715 (inf) loss_scale 4096.0000 (7822.6613) mem 7379MB [2024-08-26 06:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1240/1251] eta 0:00:02 lr 0.000968 wd 0.0500 time 0.2249 (0.2459) data time 0.0005 (0.0014) model time 0.2244 (0.2446) loss 4.3760 (3.6031) grad_norm 1.4470 (inf) loss_scale 4096.0000 (7792.6317) mem 7379MB [2024-08-26 06:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [51/300][1250/1251] eta 0:00:00 lr 0.000968 wd 0.0500 time 0.2287 (0.2457) data time 0.0005 (0.0014) model time 0.2282 (0.2445) loss 4.4490 (3.6062) grad_norm 2.0011 (inf) loss_scale 4096.0000 (7763.0823) mem 7379MB [2024-08-26 06:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 51 training takes 0:05:07 [2024-08-26 06:40:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 06:40:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 06:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.438 (0.438) Loss 0.5850 (0.5850) Acc@1 88.574 (88.574) Acc@5 97.559 (97.559) Mem 7379MB
[2024-08-26 06:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.111) Loss 0.9390 (0.9192) Acc@1 78.320 (79.714) Acc@5 95.117 (95.384) Mem 7379MB
[2024-08-26 06:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.095) Loss 1.3281 (0.9361) Acc@1 70.703 (78.995) Acc@5 90.039 (95.275) Mem 7379MB
[2024-08-26 06:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.6465 (1.0659) Acc@1 62.012 (76.424) Acc@5 85.352 (93.514) Mem 7379MB
[2024-08-26 06:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.4502 (1.1366) Acc@1 68.262 (74.867) Acc@5 88.184 (92.650) Mem 7379MB
[2024-08-26 06:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.558 Acc@5 92.614
[2024-08-26 06:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.6%
[2024-08-26 06:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 74.56%
[2024-08-26 06:40:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 06:40:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 06:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.501 (0.501) Loss 0.4746 (0.4746) Acc@1 90.137 (90.137) Acc@5 98.047 (98.047) Mem 7379MB
[2024-08-26 06:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.114) Loss 0.7910 (0.7654) Acc@1 82.910 (82.635) Acc@5 95.703 (96.316) Mem 7379MB
[2024-08-26 06:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 1.1064 (0.7812) Acc@1 74.609 (81.836) Acc@5 91.895 (96.270) Mem 7379MB
[2024-08-26 06:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.091) Loss 1.3730 (0.8969) Acc@1 64.941 (79.102) Acc@5 88.672 (94.764) Mem 7379MB
[2024-08-26 06:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.3008 (0.9635) Acc@1 68.555 (77.480) Acc@5 89.941 (93.993) Mem 7379MB
[2024-08-26 06:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.124 Acc@5 93.934
[2024-08-26 06:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.1%
[2024-08-26 06:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.12%
[2024-08-26 06:40:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 06:40:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 06:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][0/1251] eta 0:16:09 lr 0.000968 wd 0.0500 time 0.7750 (0.7750) data time 0.5416 (0.5416) model time 0.0000 (0.0000) loss 4.3802 (4.3802) grad_norm 2.2991 (2.2991) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][10/1251] eta 0:06:00 lr 0.000968 wd 0.0500 time 0.2395 (0.2906) data time 0.0008 (0.0502) model time 0.0000 (0.0000) loss 4.7136 (3.6702) grad_norm 1.9593 (1.7773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][20/1251] eta 0:05:28 lr 0.000968 wd 0.0500 time 0.2440 (0.2666) data time 0.0008 (0.0268) model time 0.0000 (0.0000) loss 3.5244 (3.5544) grad_norm 2.5029 (2.0123) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][30/1251] eta 0:05:15 lr 0.000968 wd 0.0500 time 0.2390 (0.2587) data time 0.0010 (0.0185) model time 0.0000 (0.0000) loss 2.5199 (3.5072) grad_norm 1.6471 (2.0237) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][40/1251] eta 0:05:07 lr 0.000968 wd 0.0500 time 0.2446 (0.2542) data time 0.0011 (0.0142) model time 0.0000 (0.0000) loss 3.3285 (3.5553) grad_norm 1.5561 (2.0944) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][50/1251] eta 0:05:02 lr 0.000968 wd 0.0500 time 0.2380 (0.2520) data time 0.0011 (0.0117) model time 0.0000 (0.0000) loss 3.1937 (3.5251) grad_norm 2.8411 (2.0699) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][60/1251] eta 0:04:58 lr 0.000968 wd 0.0500 time 0.2437 (0.2507) data time 0.0011 (0.0099) model time 0.2426 (0.2429) loss 
3.7826 (3.5105) grad_norm 1.8136 (2.0769) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][70/1251] eta 0:04:54 lr 0.000968 wd 0.0500 time 0.2483 (0.2495) data time 0.0008 (0.0086) model time 0.2475 (0.2420) loss 3.7394 (3.5020) grad_norm 1.8316 (2.0940) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][80/1251] eta 0:04:50 lr 0.000968 wd 0.0500 time 0.2315 (0.2483) data time 0.0010 (0.0077) model time 0.2306 (0.2411) loss 3.3882 (3.4775) grad_norm 1.8149 (2.0693) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][90/1251] eta 0:04:47 lr 0.000968 wd 0.0500 time 0.2459 (0.2474) data time 0.0012 (0.0070) model time 0.2447 (0.2405) loss 4.0333 (3.4945) grad_norm 1.6830 (2.0398) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][100/1251] eta 0:04:44 lr 0.000968 wd 0.0500 time 0.2414 (0.2474) data time 0.0012 (0.0066) model time 0.2402 (0.2414) loss 2.3644 (3.4737) grad_norm 2.5505 (2.0262) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][110/1251] eta 0:04:41 lr 0.000968 wd 0.0500 time 0.2414 (0.2471) data time 0.0009 (0.0061) model time 0.2405 (0.2416) loss 2.8847 (3.4540) grad_norm 1.8679 (2.0044) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][120/1251] eta 0:04:39 lr 0.000968 wd 0.0500 time 0.2409 (0.2469) data time 0.0007 (0.0059) model time 0.2403 (0.2414) loss 2.6280 (3.4417) grad_norm 2.1066 (1.9917) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][130/1251] eta 0:04:36 lr 
0.000968 wd 0.0500 time 0.2457 (0.2465) data time 0.0011 (0.0056) model time 0.2446 (0.2413) loss 3.3472 (3.4406) grad_norm 2.2666 (2.0162) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][140/1251] eta 0:04:33 lr 0.000968 wd 0.0500 time 0.2391 (0.2461) data time 0.0012 (0.0053) model time 0.2379 (0.2412) loss 3.2997 (3.4699) grad_norm 2.5258 (2.0265) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][150/1251] eta 0:04:30 lr 0.000968 wd 0.0500 time 0.2336 (0.2459) data time 0.0010 (0.0050) model time 0.2326 (0.2413) loss 3.7627 (3.4963) grad_norm 1.9802 (2.0168) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][160/1251] eta 0:04:28 lr 0.000968 wd 0.0500 time 0.2497 (0.2457) data time 0.0012 (0.0047) model time 0.2485 (0.2412) loss 3.6804 (3.4948) grad_norm 1.5403 (2.0398) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][170/1251] eta 0:04:25 lr 0.000968 wd 0.0500 time 0.2369 (0.2456) data time 0.0009 (0.0045) model time 0.2359 (0.2413) loss 2.3645 (3.4885) grad_norm 1.3610 (2.0161) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][180/1251] eta 0:04:22 lr 0.000968 wd 0.0500 time 0.2369 (0.2454) data time 0.0008 (0.0044) model time 0.2361 (0.2413) loss 3.7552 (3.4988) grad_norm 2.0598 (1.9982) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][190/1251] eta 0:04:20 lr 0.000968 wd 0.0500 time 0.2361 (0.2451) data time 0.0010 (0.0042) model time 0.2351 (0.2412) loss 3.9395 (3.5059) grad_norm 1.7045 (1.9801) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][200/1251] eta 0:04:17 lr 0.000968 wd 0.0500 time 0.2402 (0.2449) data time 0.0009 (0.0040) model time 0.2393 (0.2410) loss 3.0062 (3.5042) grad_norm 2.0111 (1.9800) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][210/1251] eta 0:04:14 lr 0.000968 wd 0.0500 time 0.2345 (0.2446) data time 0.0007 (0.0039) model time 0.2338 (0.2408) loss 2.5413 (3.5028) grad_norm 1.8978 (1.9905) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][220/1251] eta 0:04:12 lr 0.000968 wd 0.0500 time 0.2466 (0.2446) data time 0.0011 (0.0038) model time 0.2454 (0.2410) loss 2.9349 (3.5138) grad_norm 1.4591 (1.9790) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][230/1251] eta 0:04:09 lr 0.000968 wd 0.0500 time 0.2438 (0.2444) data time 0.0010 (0.0036) model time 0.2428 (0.2408) loss 3.6933 (3.5072) grad_norm 2.1234 (1.9737) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][240/1251] eta 0:04:07 lr 0.000968 wd 0.0500 time 0.2387 (0.2444) data time 0.0010 (0.0036) model time 0.2377 (0.2408) loss 2.8370 (3.5044) grad_norm 2.4366 (1.9895) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][250/1251] eta 0:04:04 lr 0.000968 wd 0.0500 time 0.2399 (0.2442) data time 0.0011 (0.0035) model time 0.2388 (0.2408) loss 3.7655 (3.5139) grad_norm 1.8712 (1.9970) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][260/1251] eta 0:04:02 lr 0.000968 wd 0.0500 time 0.2374 (0.2450) data time 0.0010 (0.0034) model time 0.2364 (0.2418) loss 3.0453 
(3.5206) grad_norm 1.8443 (1.9876) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][270/1251] eta 0:04:00 lr 0.000968 wd 0.0500 time 0.2444 (0.2448) data time 0.0007 (0.0034) model time 0.2437 (0.2418) loss 3.6815 (3.5275) grad_norm 2.3627 (1.9849) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][280/1251] eta 0:03:57 lr 0.000968 wd 0.0500 time 0.2460 (0.2448) data time 0.0011 (0.0033) model time 0.2449 (0.2417) loss 3.6614 (3.5282) grad_norm 1.3524 (1.9747) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][290/1251] eta 0:03:55 lr 0.000968 wd 0.0500 time 0.2326 (0.2446) data time 0.0009 (0.0032) model time 0.2317 (0.2416) loss 3.5878 (3.5365) grad_norm 2.2945 (1.9695) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][300/1251] eta 0:03:52 lr 0.000968 wd 0.0500 time 0.2456 (0.2446) data time 0.0009 (0.0032) model time 0.2447 (0.2416) loss 2.7734 (3.5393) grad_norm 2.0256 (1.9747) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][310/1251] eta 0:03:50 lr 0.000968 wd 0.0500 time 0.2424 (0.2445) data time 0.0007 (0.0031) model time 0.2416 (0.2416) loss 3.8849 (3.5479) grad_norm 2.0752 (1.9669) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][320/1251] eta 0:03:47 lr 0.000968 wd 0.0500 time 0.2345 (0.2444) data time 0.0011 (0.0030) model time 0.2334 (0.2415) loss 3.2410 (3.5492) grad_norm 2.6737 (1.9627) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][330/1251] eta 0:03:44 lr 0.000968 wd 
0.0500 time 0.2455 (0.2443) data time 0.0008 (0.0030) model time 0.2447 (0.2415) loss 4.3259 (3.5498) grad_norm 1.7481 (1.9729) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][340/1251] eta 0:03:42 lr 0.000968 wd 0.0500 time 0.2345 (0.2442) data time 0.0009 (0.0029) model time 0.2335 (0.2414) loss 4.0646 (3.5504) grad_norm 2.2011 (1.9717) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][350/1251] eta 0:03:40 lr 0.000968 wd 0.0500 time 0.2444 (0.2447) data time 0.0007 (0.0029) model time 0.2437 (0.2421) loss 2.5755 (3.5548) grad_norm 1.8572 (1.9720) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][360/1251] eta 0:03:38 lr 0.000968 wd 0.0500 time 0.2436 (0.2447) data time 0.0007 (0.0028) model time 0.2429 (0.2421) loss 2.7524 (3.5595) grad_norm 1.9524 (1.9676) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][370/1251] eta 0:03:35 lr 0.000968 wd 0.0500 time 0.2404 (0.2446) data time 0.0007 (0.0028) model time 0.2397 (0.2421) loss 3.9287 (3.5633) grad_norm 2.0970 (1.9670) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][380/1251] eta 0:03:33 lr 0.000968 wd 0.0500 time 0.2334 (0.2446) data time 0.0009 (0.0027) model time 0.2324 (0.2421) loss 2.8427 (3.5673) grad_norm 2.1036 (1.9713) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][390/1251] eta 0:03:30 lr 0.000968 wd 0.0500 time 0.2314 (0.2446) data time 0.0009 (0.0027) model time 0.2306 (0.2422) loss 4.2130 (3.5727) grad_norm 2.1550 (1.9780) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:16 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][400/1251] eta 0:03:28 lr 0.000968 wd 0.0500 time 0.2428 (0.2446) data time 0.0010 (0.0026) model time 0.2418 (0.2422) loss 4.0271 (3.5777) grad_norm 2.1832 (1.9704) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][410/1251] eta 0:03:25 lr 0.000968 wd 0.0500 time 0.2399 (0.2445) data time 0.0011 (0.0026) model time 0.2388 (0.2421) loss 3.8335 (3.5800) grad_norm 2.0316 (1.9683) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][420/1251] eta 0:03:23 lr 0.000968 wd 0.0500 time 0.2442 (0.2445) data time 0.0010 (0.0026) model time 0.2432 (0.2421) loss 3.7065 (3.5883) grad_norm 2.4036 (1.9684) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][430/1251] eta 0:03:20 lr 0.000968 wd 0.0500 time 0.2352 (0.2444) data time 0.0011 (0.0025) model time 0.2342 (0.2420) loss 3.6966 (3.5851) grad_norm 1.8211 (1.9663) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][440/1251] eta 0:03:18 lr 0.000968 wd 0.0500 time 0.2361 (0.2443) data time 0.0007 (0.0025) model time 0.2353 (0.2420) loss 2.4284 (3.5814) grad_norm 2.2920 (1.9640) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][450/1251] eta 0:03:15 lr 0.000968 wd 0.0500 time 0.2400 (0.2443) data time 0.0012 (0.0025) model time 0.2387 (0.2420) loss 3.7237 (3.5830) grad_norm 1.6032 (1.9648) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][460/1251] eta 0:03:13 lr 0.000968 wd 0.0500 time 0.2407 (0.2442) data time 0.0010 (0.0024) model time 0.2397 (0.2420) loss 3.7616 
(3.5841) grad_norm 1.5569 (1.9631) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][470/1251] eta 0:03:11 lr 0.000968 wd 0.0500 time 0.2377 (0.2447) data time 0.0011 (0.0024) model time 0.2366 (0.2425) loss 3.4895 (3.5855) grad_norm 1.7909 (1.9605) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][480/1251] eta 0:03:08 lr 0.000968 wd 0.0500 time 0.4346 (0.2450) data time 0.0009 (0.0024) model time 0.4338 (0.2429) loss 4.1875 (3.5876) grad_norm 3.3560 (1.9623) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][490/1251] eta 0:03:06 lr 0.000968 wd 0.0500 time 0.2393 (0.2453) data time 0.0010 (0.0023) model time 0.2383 (0.2432) loss 2.7442 (3.5861) grad_norm 2.4758 (1.9638) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][500/1251] eta 0:03:04 lr 0.000968 wd 0.0500 time 0.2451 (0.2452) data time 0.0008 (0.0023) model time 0.2443 (0.2432) loss 4.3317 (3.5834) grad_norm 2.2516 (1.9642) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][510/1251] eta 0:03:01 lr 0.000968 wd 0.0500 time 0.2473 (0.2455) data time 0.0012 (0.0023) model time 0.2461 (0.2436) loss 3.7091 (3.5808) grad_norm 2.1371 (1.9623) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][520/1251] eta 0:02:59 lr 0.000968 wd 0.0500 time 0.2469 (0.2455) data time 0.0009 (0.0023) model time 0.2460 (0.2435) loss 2.7564 (3.5788) grad_norm 2.2732 (1.9594) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][530/1251] eta 0:02:56 lr 0.000968 wd 
0.0500 time 0.2433 (0.2454) data time 0.0010 (0.0022) model time 0.2424 (0.2435) loss 3.8920 (3.5798) grad_norm 1.2336 (1.9601) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][540/1251] eta 0:02:54 lr 0.000968 wd 0.0500 time 0.2390 (0.2453) data time 0.0007 (0.0022) model time 0.2382 (0.2434) loss 4.1334 (3.5785) grad_norm 1.3625 (1.9652) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][550/1251] eta 0:02:51 lr 0.000968 wd 0.0500 time 0.2446 (0.2453) data time 0.0010 (0.0022) model time 0.2436 (0.2434) loss 3.8327 (3.5825) grad_norm 2.7150 (1.9684) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][560/1251] eta 0:02:49 lr 0.000968 wd 0.0500 time 0.2446 (0.2452) data time 0.0007 (0.0022) model time 0.2439 (0.2433) loss 2.8645 (3.5859) grad_norm 4.5297 (1.9733) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][570/1251] eta 0:02:46 lr 0.000968 wd 0.0500 time 0.2369 (0.2451) data time 0.0011 (0.0022) model time 0.2358 (0.2432) loss 3.9585 (3.5922) grad_norm 1.4655 (1.9774) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][580/1251] eta 0:02:44 lr 0.000968 wd 0.0500 time 0.2388 (0.2451) data time 0.0009 (0.0021) model time 0.2379 (0.2432) loss 2.6011 (3.5894) grad_norm 1.9471 (1.9781) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][590/1251] eta 0:02:41 lr 0.000968 wd 0.0500 time 0.2448 (0.2450) data time 0.0007 (0.0021) model time 0.2441 (0.2432) loss 4.2678 (3.5927) grad_norm 2.1106 (1.9727) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][600/1251] eta 0:02:39 lr 0.000967 wd 0.0500 time 0.2375 (0.2450) data time 0.0007 (0.0021) model time 0.2368 (0.2431) loss 4.5076 (3.5903) grad_norm 2.0085 (1.9738) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][610/1251] eta 0:02:37 lr 0.000967 wd 0.0500 time 0.2391 (0.2449) data time 0.0012 (0.0021) model time 0.2379 (0.2431) loss 3.7373 (3.5872) grad_norm 1.8332 (1.9726) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][620/1251] eta 0:02:34 lr 0.000967 wd 0.0500 time 0.2438 (0.2449) data time 0.0007 (0.0021) model time 0.2430 (0.2431) loss 2.5477 (3.5814) grad_norm 1.1970 (1.9678) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][630/1251] eta 0:02:32 lr 0.000967 wd 0.0500 time 0.2568 (0.2449) data time 0.0008 (0.0020) model time 0.2560 (0.2430) loss 4.0378 (3.5812) grad_norm 1.8179 (1.9619) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][640/1251] eta 0:02:29 lr 0.000967 wd 0.0500 time 0.2420 (0.2448) data time 0.0010 (0.0020) model time 0.2411 (0.2430) loss 3.7152 (3.5845) grad_norm 1.7848 (1.9640) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][650/1251] eta 0:02:27 lr 0.000967 wd 0.0500 time 0.2403 (0.2447) data time 0.0010 (0.0020) model time 0.2393 (0.2429) loss 4.3363 (3.5867) grad_norm 1.9624 (1.9672) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][660/1251] eta 0:02:24 lr 0.000967 wd 0.0500 time 0.2378 (0.2447) data time 0.0009 (0.0020) model time 0.2369 (0.2429) loss 3.2774 
(3.5862) grad_norm 1.2558 (1.9637) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][670/1251] eta 0:02:22 lr 0.000967 wd 0.0500 time 0.2474 (0.2447) data time 0.0009 (0.0020) model time 0.2465 (0.2429) loss 4.0059 (3.5870) grad_norm 1.6055 (1.9613) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][680/1251] eta 0:02:19 lr 0.000967 wd 0.0500 time 0.2488 (0.2446) data time 0.0010 (0.0020) model time 0.2478 (0.2429) loss 3.0659 (3.5852) grad_norm 2.0303 (1.9597) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][690/1251] eta 0:02:17 lr 0.000967 wd 0.0500 time 0.2463 (0.2446) data time 0.0007 (0.0020) model time 0.2455 (0.2428) loss 2.6982 (3.5858) grad_norm 1.6372 (1.9587) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][700/1251] eta 0:02:14 lr 0.000967 wd 0.0500 time 0.2428 (0.2445) data time 0.0008 (0.0019) model time 0.2420 (0.2428) loss 4.2174 (3.5836) grad_norm 1.5312 (1.9573) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][710/1251] eta 0:02:12 lr 0.000967 wd 0.0500 time 0.2438 (0.2445) data time 0.0009 (0.0019) model time 0.2429 (0.2428) loss 2.9271 (3.5824) grad_norm 1.8646 (1.9541) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][720/1251] eta 0:02:09 lr 0.000967 wd 0.0500 time 0.2354 (0.2445) data time 0.0009 (0.0019) model time 0.2345 (0.2427) loss 4.1663 (3.5874) grad_norm 3.3039 (1.9554) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][730/1251] eta 0:02:07 lr 0.000967 wd 
0.0500 time 0.2356 (0.2445) data time 0.0009 (0.0019) model time 0.2347 (0.2427) loss 3.4195 (3.5875) grad_norm 1.5191 (1.9638) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][740/1251] eta 0:02:04 lr 0.000967 wd 0.0500 time 0.2394 (0.2444) data time 0.0011 (0.0019) model time 0.2383 (0.2427) loss 3.5165 (3.5840) grad_norm 1.3927 (1.9590) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][750/1251] eta 0:02:02 lr 0.000967 wd 0.0500 time 0.2455 (0.2447) data time 0.0009 (0.0020) model time 0.2446 (0.2430) loss 4.8629 (3.5845) grad_norm 1.4688 (1.9587) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][760/1251] eta 0:02:00 lr 0.000967 wd 0.0500 time 0.2424 (0.2447) data time 0.0008 (0.0019) model time 0.2416 (0.2429) loss 4.4037 (3.5850) grad_norm 1.6407 (1.9580) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][770/1251] eta 0:01:57 lr 0.000967 wd 0.0500 time 0.2417 (0.2447) data time 0.0008 (0.0019) model time 0.2409 (0.2429) loss 3.9021 (3.5843) grad_norm 1.4184 (1.9563) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][780/1251] eta 0:01:55 lr 0.000967 wd 0.0500 time 0.2466 (0.2448) data time 0.0007 (0.0019) model time 0.2458 (0.2431) loss 4.2943 (3.5804) grad_norm 1.8832 (1.9535) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][790/1251] eta 0:01:52 lr 0.000967 wd 0.0500 time 0.2320 (0.2448) data time 0.0007 (0.0019) model time 0.2313 (0.2430) loss 3.9878 (3.5813) grad_norm 1.8949 (1.9568) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][800/1251] eta 0:01:50 lr 0.000967 wd 0.0500 time 0.2427 (0.2447) data time 0.0007 (0.0019) model time 0.2419 (0.2430) loss 2.4798 (3.5776) grad_norm 1.9960 (1.9556) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][810/1251] eta 0:01:47 lr 0.000967 wd 0.0500 time 0.2417 (0.2446) data time 0.0009 (0.0019) model time 0.2407 (0.2429) loss 3.8028 (3.5773) grad_norm 2.4934 (1.9534) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][820/1251] eta 0:01:45 lr 0.000967 wd 0.0500 time 0.2473 (0.2446) data time 0.0008 (0.0019) model time 0.2465 (0.2429) loss 3.9888 (3.5783) grad_norm 1.7519 (1.9507) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][830/1251] eta 0:01:42 lr 0.000967 wd 0.0500 time 0.2407 (0.2446) data time 0.0010 (0.0019) model time 0.2397 (0.2428) loss 2.3795 (3.5775) grad_norm 1.5324 (1.9468) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][840/1251] eta 0:01:40 lr 0.000967 wd 0.0500 time 0.2400 (0.2445) data time 0.0008 (0.0019) model time 0.2392 (0.2428) loss 4.5725 (3.5840) grad_norm 3.0920 (1.9462) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][850/1251] eta 0:01:38 lr 0.000967 wd 0.0500 time 0.2371 (0.2445) data time 0.0009 (0.0018) model time 0.2363 (0.2428) loss 3.6436 (3.5843) grad_norm 1.9633 (1.9475) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][860/1251] eta 0:01:35 lr 0.000967 wd 0.0500 time 0.2422 (0.2444) data time 0.0010 (0.0018) model time 0.2412 (0.2427) loss 3.7855 
(3.5852) grad_norm 2.3297 (1.9464) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][870/1251] eta 0:01:33 lr 0.000967 wd 0.0500 time 0.2457 (0.2446) data time 0.0010 (0.0018) model time 0.2447 (0.2429) loss 3.7691 (3.5860) grad_norm 2.1773 (1.9463) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][880/1251] eta 0:01:30 lr 0.000967 wd 0.0500 time 0.2484 (0.2446) data time 0.0007 (0.0018) model time 0.2477 (0.2429) loss 2.6671 (3.5836) grad_norm 1.8713 (1.9489) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][890/1251] eta 0:01:28 lr 0.000967 wd 0.0500 time 0.2388 (0.2448) data time 0.0008 (0.0018) model time 0.2379 (0.2431) loss 2.7188 (3.5834) grad_norm 2.3505 (1.9522) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][900/1251] eta 0:01:25 lr 0.000967 wd 0.0500 time 0.2417 (0.2450) data time 0.0007 (0.0018) model time 0.2410 (0.2433) loss 4.3550 (3.5832) grad_norm 1.6320 (1.9497) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][910/1251] eta 0:01:23 lr 0.000967 wd 0.0500 time 0.2433 (0.2449) data time 0.0009 (0.0018) model time 0.2425 (0.2433) loss 2.9296 (3.5806) grad_norm 2.6920 (1.9478) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][920/1251] eta 0:01:21 lr 0.000967 wd 0.0500 time 0.2376 (0.2449) data time 0.0011 (0.0018) model time 0.2365 (0.2433) loss 3.9041 (3.5786) grad_norm 1.3904 (1.9490) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][930/1251] eta 0:01:18 lr 0.000967 wd 
0.0500 time 0.2341 (0.2449) data time 0.0010 (0.0018) model time 0.2330 (0.2433) loss 3.3431 (3.5773) grad_norm 2.4947 (1.9482) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][940/1251] eta 0:01:16 lr 0.000967 wd 0.0500 time 0.2451 (0.2449) data time 0.0010 (0.0018) model time 0.2441 (0.2433) loss 3.6926 (3.5769) grad_norm 3.8494 (1.9497) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][950/1251] eta 0:01:13 lr 0.000967 wd 0.0500 time 0.2368 (0.2449) data time 0.0011 (0.0018) model time 0.2357 (0.2433) loss 3.5591 (3.5782) grad_norm 2.6103 (1.9517) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][960/1251] eta 0:01:11 lr 0.000967 wd 0.0500 time 0.2484 (0.2449) data time 0.0010 (0.0018) model time 0.2474 (0.2433) loss 3.9935 (3.5771) grad_norm 2.6009 (1.9546) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][970/1251] eta 0:01:08 lr 0.000967 wd 0.0500 time 0.2468 (0.2448) data time 0.0007 (0.0017) model time 0.2461 (0.2432) loss 4.3115 (3.5771) grad_norm 1.7231 (1.9537) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][980/1251] eta 0:01:06 lr 0.000967 wd 0.0500 time 0.2375 (0.2448) data time 0.0011 (0.0017) model time 0.2364 (0.2432) loss 4.2236 (3.5756) grad_norm 1.8510 (1.9518) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][990/1251] eta 0:01:03 lr 0.000967 wd 0.0500 time 0.2404 (0.2448) data time 0.0010 (0.0017) model time 0.2395 (0.2432) loss 3.7428 (3.5759) grad_norm 2.1020 (1.9521) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:43 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1000/1251] eta 0:01:01 lr 0.000967 wd 0.0500 time 0.2405 (0.2447) data time 0.0007 (0.0017) model time 0.2398 (0.2432) loss 4.2937 (3.5751) grad_norm 2.3099 (1.9543) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1010/1251] eta 0:00:58 lr 0.000967 wd 0.0500 time 0.2457 (0.2447) data time 0.0011 (0.0017) model time 0.2446 (0.2431) loss 4.2536 (3.5745) grad_norm 1.8873 (1.9591) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1020/1251] eta 0:00:56 lr 0.000967 wd 0.0500 time 0.2427 (0.2447) data time 0.0007 (0.0017) model time 0.2420 (0.2431) loss 3.4857 (3.5751) grad_norm 2.2005 (1.9587) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1030/1251] eta 0:00:54 lr 0.000967 wd 0.0500 time 0.2481 (0.2448) data time 0.0007 (0.0017) model time 0.2474 (0.2433) loss 4.0693 (3.5728) grad_norm 2.3545 (1.9590) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1040/1251] eta 0:00:51 lr 0.000967 wd 0.0500 time 0.2350 (0.2448) data time 0.0010 (0.0017) model time 0.2341 (0.2432) loss 3.4306 (3.5716) grad_norm 1.3815 (1.9584) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1050/1251] eta 0:00:49 lr 0.000967 wd 0.0500 time 0.2453 (0.2448) data time 0.0009 (0.0017) model time 0.2444 (0.2432) loss 3.7753 (3.5702) grad_norm 1.7679 (1.9561) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1060/1251] eta 0:00:46 lr 0.000967 wd 0.0500 time 0.2398 (0.2447) data time 0.0008 (0.0017) model time 0.2390 (0.2432) loss 4.8080 
(3.5713) grad_norm 3.8556 (1.9558) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1070/1251] eta 0:00:44 lr 0.000967 wd 0.0500 time 0.2421 (0.2447) data time 0.0008 (0.0017) model time 0.2413 (0.2432) loss 2.9002 (3.5728) grad_norm 1.6597 (1.9531) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1080/1251] eta 0:00:41 lr 0.000967 wd 0.0500 time 0.2453 (0.2447) data time 0.0010 (0.0017) model time 0.2443 (0.2432) loss 3.3790 (3.5728) grad_norm 1.5049 (1.9514) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1090/1251] eta 0:00:39 lr 0.000967 wd 0.0500 time 0.2431 (0.2447) data time 0.0008 (0.0017) model time 0.2423 (0.2432) loss 2.9781 (3.5733) grad_norm 2.0959 (1.9512) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1100/1251] eta 0:00:36 lr 0.000967 wd 0.0500 time 0.2354 (0.2447) data time 0.0010 (0.0017) model time 0.2344 (0.2431) loss 3.0384 (3.5717) grad_norm 1.8616 (1.9542) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1110/1251] eta 0:00:34 lr 0.000967 wd 0.0500 time 0.2515 (0.2446) data time 0.0008 (0.0017) model time 0.2507 (0.2431) loss 2.6377 (3.5709) grad_norm 3.1286 (1.9561) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1120/1251] eta 0:00:32 lr 0.000967 wd 0.0500 time 0.2419 (0.2446) data time 0.0009 (0.0016) model time 0.2410 (0.2431) loss 4.4014 (3.5700) grad_norm 1.9194 (1.9555) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1130/1251] eta 0:00:29 lr 
0.000967 wd 0.0500 time 0.2422 (0.2446) data time 0.0007 (0.0016) model time 0.2415 (0.2431) loss 3.7582 (3.5696) grad_norm 2.9144 (1.9552) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1140/1251] eta 0:00:27 lr 0.000967 wd 0.0500 time 0.2419 (0.2446) data time 0.0009 (0.0016) model time 0.2409 (0.2430) loss 4.0074 (3.5694) grad_norm 1.6921 (1.9540) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1150/1251] eta 0:00:24 lr 0.000967 wd 0.0500 time 0.2488 (0.2446) data time 0.0008 (0.0016) model time 0.2480 (0.2430) loss 4.1459 (3.5702) grad_norm 1.8748 (1.9534) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1160/1251] eta 0:00:22 lr 0.000967 wd 0.0500 time 0.2395 (0.2445) data time 0.0009 (0.0016) model time 0.2386 (0.2430) loss 4.6486 (3.5715) grad_norm 1.7203 (1.9568) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1170/1251] eta 0:00:19 lr 0.000967 wd 0.0500 time 0.2381 (0.2445) data time 0.0009 (0.0016) model time 0.2372 (0.2430) loss 3.6416 (3.5711) grad_norm 1.6152 (1.9562) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1180/1251] eta 0:00:17 lr 0.000967 wd 0.0500 time 0.2399 (0.2445) data time 0.0009 (0.0016) model time 0.2389 (0.2430) loss 3.9145 (3.5729) grad_norm 2.0156 (1.9550) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1190/1251] eta 0:00:14 lr 0.000967 wd 0.0500 time 0.2462 (0.2445) data time 0.0009 (0.0016) model time 0.2453 (0.2430) loss 2.7808 (3.5749) grad_norm 1.7308 (1.9555) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
06:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1200/1251] eta 0:00:12 lr 0.000967 wd 0.0500 time 0.2380 (0.2444) data time 0.0009 (0.0016) model time 0.2371 (0.2429) loss 3.9249 (3.5743) grad_norm 1.6142 (1.9547) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1210/1251] eta 0:00:10 lr 0.000967 wd 0.0500 time 0.2413 (0.2444) data time 0.0010 (0.0016) model time 0.2403 (0.2429) loss 2.9606 (3.5736) grad_norm 1.5893 (1.9550) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1220/1251] eta 0:00:07 lr 0.000967 wd 0.0500 time 0.2417 (0.2444) data time 0.0011 (0.0016) model time 0.2405 (0.2429) loss 3.1492 (3.5748) grad_norm 2.5545 (1.9538) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1230/1251] eta 0:00:05 lr 0.000966 wd 0.0500 time 0.2439 (0.2444) data time 0.0008 (0.0016) model time 0.2432 (0.2429) loss 3.8674 (3.5769) grad_norm 1.7395 (1.9544) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1240/1251] eta 0:00:02 lr 0.000966 wd 0.0500 time 0.2240 (0.2443) data time 0.0007 (0.0016) model time 0.2233 (0.2428) loss 3.0971 (3.5779) grad_norm 1.6309 (1.9528) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [52/300][1250/1251] eta 0:00:00 lr 0.000966 wd 0.0500 time 0.2236 (0.2442) data time 0.0007 (0.0016) model time 0.2228 (0.2427) loss 3.7805 (3.5765) grad_norm 2.3817 (1.9507) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 52 training takes 0:05:05 [2024-08-26 06:45:43 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 06:45:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 06:45:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.405 (0.405) Loss 0.5874 (0.5874) Acc@1 88.867 (88.867) Acc@5 97.949 (97.949) Mem 7379MB
[2024-08-26 06:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.107) Loss 0.9526 (0.8866) Acc@1 80.762 (81.081) Acc@5 95.020 (95.827) Mem 7379MB
[2024-08-26 06:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.094) Loss 1.2617 (0.9164) Acc@1 70.996 (79.967) Acc@5 90.918 (95.647) Mem 7379MB
[2024-08-26 06:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.088) Loss 1.5781 (1.0445) Acc@1 63.477 (77.161) Acc@5 87.402 (93.920) Mem 7379MB
[2024-08-26 06:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.4580 (1.1163) Acc@1 66.309 (75.381) Acc@5 88.477 (92.964) Mem 7379MB
[2024-08-26 06:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.972 Acc@5 92.830
[2024-08-26 06:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.0%
[2024-08-26 06:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 74.97%
[2024-08-26 06:45:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 06:45:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 06:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.373 (0.373) Loss 0.4734 (0.4734) Acc@1 90.039 (90.039) Acc@5 97.949 (97.949) Mem 7379MB
[2024-08-26 06:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.104) Loss 0.7896 (0.7630) Acc@1 83.301 (82.839) Acc@5 95.703 (96.378) Mem 7379MB
[2024-08-26 06:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.092) Loss 1.1055 (0.7793) Acc@1 74.414 (81.924) Acc@5 91.992 (96.294) Mem 7379MB
[2024-08-26 06:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.088) Loss 1.3682 (0.8942) Acc@1 64.941 (79.183) Acc@5 88.672 (94.796) Mem 7379MB
[2024-08-26 06:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.2939 (0.9601) Acc@1 68.359 (77.591) Acc@5 90.137 (94.026) Mem 7379MB
[2024-08-26 06:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.228 Acc@5 93.980
[2024-08-26 06:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.2%
[2024-08-26 06:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.23%
[2024-08-26 06:45:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 06:45:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 06:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][0/1251] eta 0:13:24 lr 0.000966 wd 0.0500 time 0.6430 (0.6430) data time 0.4128 (0.4128) model time 0.0000 (0.0000) loss 3.8590 (3.8590) grad_norm 2.6978 (2.6978) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][10/1251] eta 0:05:46 lr 0.000966 wd 0.0500 time 0.2367 (0.2795) data time 0.0009 (0.0384) model time 0.0000 (0.0000) loss 3.9565 (3.8591) grad_norm 2.1763 (2.4536) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][20/1251] eta 0:05:21 lr 0.000966 wd 0.0500 time 0.2396 (0.2610) data time 0.0010 (0.0206) model time 0.0000 (0.0000) loss 3.2410 (3.6874) grad_norm 1.5275 (2.1505) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][30/1251] eta 0:05:11 lr 0.000966 wd 0.0500 time 0.2416 (0.2551) data time 0.0011 (0.0143) model time 0.0000 (0.0000) loss 3.3087 (3.6766) grad_norm 1.9136 (2.1514) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][40/1251] eta 0:05:04 lr 0.000966 wd 0.0500 time 0.2460 (0.2515) data time 0.0008 (0.0111) model time 0.0000 (0.0000) loss 4.1706 (3.6191) grad_norm 2.2564 (2.1161) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][50/1251] eta 0:04:59 lr 0.000966 wd 0.0500 time 0.2388 (0.2497) data time 0.0011 (0.0091) model time 0.0000 (0.0000) loss 3.8475 (3.6129) grad_norm 1.5323 (2.0313) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][60/1251] eta 0:05:00 lr 0.000966 wd 0.0500 time 0.2424 (0.2524) data time 0.0012 (0.0078) model time 0.2412 (0.2648) loss 
3.3487 (3.5975) grad_norm 1.1402 (1.9803) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][70/1251] eta 0:04:56 lr 0.000966 wd 0.0500 time 0.2368 (0.2508) data time 0.0012 (0.0068) model time 0.2356 (0.2525) loss 3.7188 (3.6254) grad_norm 1.6952 (1.9537) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][80/1251] eta 0:04:52 lr 0.000966 wd 0.0500 time 0.2399 (0.2496) data time 0.0012 (0.0061) model time 0.2387 (0.2485) loss 4.0403 (3.6379) grad_norm 1.6448 (1.9297) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][90/1251] eta 0:04:54 lr 0.000966 wd 0.0500 time 0.4583 (0.2537) data time 0.0011 (0.0055) model time 0.4572 (0.2577) loss 2.6878 (3.6186) grad_norm 1.6740 (1.9397) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][100/1251] eta 0:04:52 lr 0.000966 wd 0.0500 time 0.2385 (0.2545) data time 0.0010 (0.0051) model time 0.2376 (0.2584) loss 4.3389 (3.6239) grad_norm 1.5412 (1.9178) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][110/1251] eta 0:04:49 lr 0.000966 wd 0.0500 time 0.2438 (0.2535) data time 0.0007 (0.0047) model time 0.2431 (0.2557) loss 2.6134 (3.6009) grad_norm 2.2880 (1.9068) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][120/1251] eta 0:04:45 lr 0.000966 wd 0.0500 time 0.2318 (0.2524) data time 0.0009 (0.0044) model time 0.2309 (0.2533) loss 4.6120 (3.6063) grad_norm 1.3853 (1.9036) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][130/1251] eta 0:04:42 lr 
0.000966 wd 0.0500 time 0.2399 (0.2517) data time 0.0009 (0.0042) model time 0.2391 (0.2520) loss 2.9684 (3.6070) grad_norm 1.9212 (1.9006) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][140/1251] eta 0:04:40 lr 0.000966 wd 0.0500 time 0.2321 (0.2522) data time 0.0009 (0.0039) model time 0.2312 (0.2527) loss 4.2769 (3.6119) grad_norm 2.4991 (1.9134) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][150/1251] eta 0:04:36 lr 0.000966 wd 0.0500 time 0.2426 (0.2515) data time 0.0012 (0.0037) model time 0.2414 (0.2513) loss 3.6853 (3.6102) grad_norm 1.4324 (1.9169) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][160/1251] eta 0:04:33 lr 0.000966 wd 0.0500 time 0.2478 (0.2510) data time 0.0008 (0.0036) model time 0.2470 (0.2506) loss 3.4937 (3.6033) grad_norm 2.1129 (1.9223) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][170/1251] eta 0:04:30 lr 0.000966 wd 0.0500 time 0.2355 (0.2505) data time 0.0010 (0.0034) model time 0.2345 (0.2498) loss 3.9623 (3.6076) grad_norm 1.7347 (1.9210) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][180/1251] eta 0:04:27 lr 0.000966 wd 0.0500 time 0.2360 (0.2500) data time 0.0008 (0.0033) model time 0.2352 (0.2490) loss 3.2314 (3.6042) grad_norm 1.3832 (1.9182) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][190/1251] eta 0:04:24 lr 0.000966 wd 0.0500 time 0.2391 (0.2495) data time 0.0009 (0.0032) model time 0.2381 (0.2484) loss 2.9816 (3.5824) grad_norm 1.9904 (1.9176) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][200/1251] eta 0:04:21 lr 0.000966 wd 0.0500 time 0.2404 (0.2491) data time 0.0012 (0.0031) model time 0.2391 (0.2479) loss 3.9601 (3.6027) grad_norm 1.6240 (1.9118) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][210/1251] eta 0:04:18 lr 0.000966 wd 0.0500 time 0.2440 (0.2487) data time 0.0007 (0.0030) model time 0.2432 (0.2474) loss 4.6089 (3.6124) grad_norm 1.9783 (1.9127) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][220/1251] eta 0:04:16 lr 0.000966 wd 0.0500 time 0.2407 (0.2484) data time 0.0009 (0.0029) model time 0.2398 (0.2470) loss 4.1855 (3.6234) grad_norm 1.3857 (1.9175) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][230/1251] eta 0:04:13 lr 0.000966 wd 0.0500 time 0.2416 (0.2481) data time 0.0011 (0.0028) model time 0.2405 (0.2466) loss 3.5171 (3.6251) grad_norm 1.4260 (1.9220) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][240/1251] eta 0:04:10 lr 0.000966 wd 0.0500 time 0.2402 (0.2478) data time 0.0007 (0.0027) model time 0.2395 (0.2462) loss 4.1102 (3.6184) grad_norm 1.6191 (1.9368) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][250/1251] eta 0:04:07 lr 0.000966 wd 0.0500 time 0.2415 (0.2475) data time 0.0007 (0.0027) model time 0.2408 (0.2459) loss 4.1097 (3.6122) grad_norm 1.9252 (1.9318) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][260/1251] eta 0:04:05 lr 0.000966 wd 0.0500 time 0.2411 (0.2472) data time 0.0011 (0.0026) model time 0.2400 (0.2456) loss 3.9290 
(3.6208) grad_norm 1.5823 (1.9468) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][270/1251] eta 0:04:02 lr 0.000966 wd 0.0500 time 0.2416 (0.2470) data time 0.0011 (0.0025) model time 0.2405 (0.2454) loss 3.5079 (3.6081) grad_norm 1.6751 (1.9433) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][280/1251] eta 0:03:59 lr 0.000966 wd 0.0500 time 0.2411 (0.2468) data time 0.0007 (0.0025) model time 0.2404 (0.2451) loss 3.8001 (3.6138) grad_norm 2.3097 (1.9387) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][290/1251] eta 0:03:57 lr 0.000966 wd 0.0500 time 0.2386 (0.2466) data time 0.0011 (0.0024) model time 0.2375 (0.2449) loss 3.2556 (3.6141) grad_norm 2.7007 (1.9369) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][300/1251] eta 0:03:55 lr 0.000966 wd 0.0500 time 0.2563 (0.2472) data time 0.0010 (0.0024) model time 0.2553 (0.2457) loss 3.6014 (3.6027) grad_norm 1.2868 (1.9308) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][310/1251] eta 0:03:53 lr 0.000966 wd 0.0500 time 0.2381 (0.2483) data time 0.0012 (0.0023) model time 0.2370 (0.2470) loss 3.0092 (3.5962) grad_norm 2.0539 (1.9385) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][320/1251] eta 0:03:50 lr 0.000966 wd 0.0500 time 0.2377 (0.2481) data time 0.0011 (0.0023) model time 0.2366 (0.2467) loss 3.6853 (3.5881) grad_norm 2.0323 (1.9482) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][330/1251] eta 0:03:48 lr 0.000966 wd 
0.0500 time 0.2384 (0.2478) data time 0.0009 (0.0023) model time 0.2376 (0.2464) loss 4.1798 (3.5918) grad_norm 2.4277 (1.9601) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][340/1251] eta 0:03:45 lr 0.000966 wd 0.0500 time 0.2433 (0.2476) data time 0.0007 (0.0022) model time 0.2425 (0.2462) loss 4.6425 (3.5913) grad_norm 1.6527 (1.9649) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][350/1251] eta 0:03:42 lr 0.000966 wd 0.0500 time 0.2375 (0.2475) data time 0.0010 (0.0022) model time 0.2366 (0.2460) loss 3.9211 (3.6014) grad_norm 1.3457 (1.9603) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][360/1251] eta 0:03:40 lr 0.000966 wd 0.0500 time 0.2445 (0.2473) data time 0.0009 (0.0022) model time 0.2436 (0.2459) loss 3.9917 (3.6041) grad_norm 3.1092 (1.9695) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][370/1251] eta 0:03:37 lr 0.000966 wd 0.0500 time 0.2422 (0.2471) data time 0.0010 (0.0021) model time 0.2412 (0.2457) loss 2.9807 (3.6050) grad_norm 1.7707 (1.9708) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][380/1251] eta 0:03:35 lr 0.000966 wd 0.0500 time 0.2483 (0.2470) data time 0.0010 (0.0021) model time 0.2474 (0.2455) loss 3.9286 (3.6029) grad_norm 1.7527 (1.9679) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][390/1251] eta 0:03:32 lr 0.000966 wd 0.0500 time 0.2358 (0.2468) data time 0.0009 (0.0021) model time 0.2348 (0.2454) loss 3.0907 (3.5987) grad_norm 1.7090 (1.9677) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][400/1251] eta 0:03:29 lr 0.000966 wd 0.0500 time 0.2387 (0.2467) data time 0.0011 (0.0021) model time 0.2376 (0.2452) loss 3.9293 (3.5973) grad_norm 1.8808 (1.9608) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][410/1251] eta 0:03:27 lr 0.000966 wd 0.0500 time 0.2381 (0.2466) data time 0.0010 (0.0020) model time 0.2371 (0.2451) loss 4.3622 (3.5910) grad_norm 2.1775 (1.9556) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][420/1251] eta 0:03:24 lr 0.000966 wd 0.0500 time 0.2489 (0.2466) data time 0.0012 (0.0020) model time 0.2478 (0.2451) loss 3.5343 (3.5991) grad_norm 2.9484 (1.9578) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][430/1251] eta 0:03:22 lr 0.000966 wd 0.0500 time 0.2460 (0.2465) data time 0.0014 (0.0020) model time 0.2446 (0.2450) loss 3.0368 (3.5960) grad_norm 1.7515 (1.9535) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][440/1251] eta 0:03:19 lr 0.000966 wd 0.0500 time 0.2412 (0.2463) data time 0.0009 (0.0020) model time 0.2403 (0.2449) loss 4.0460 (3.5922) grad_norm 1.6597 (1.9511) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][450/1251] eta 0:03:17 lr 0.000966 wd 0.0500 time 0.2431 (0.2462) data time 0.0009 (0.0019) model time 0.2423 (0.2448) loss 3.8938 (3.5947) grad_norm 1.5722 (1.9504) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][460/1251] eta 0:03:14 lr 0.000966 wd 0.0500 time 0.2479 (0.2462) data time 0.0008 (0.0019) model time 0.2471 (0.2447) loss 3.0830 
(3.5942) grad_norm 2.3134 (1.9497) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][470/1251] eta 0:03:12 lr 0.000966 wd 0.0500 time 0.2462 (0.2461) data time 0.0009 (0.0019) model time 0.2453 (0.2446) loss 3.8448 (3.5920) grad_norm 3.3648 (1.9542) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][480/1251] eta 0:03:09 lr 0.000966 wd 0.0500 time 0.2450 (0.2460) data time 0.0007 (0.0019) model time 0.2443 (0.2446) loss 4.0893 (3.5882) grad_norm 2.1679 (1.9521) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][490/1251] eta 0:03:07 lr 0.000966 wd 0.0500 time 0.2412 (0.2459) data time 0.0007 (0.0019) model time 0.2404 (0.2445) loss 3.4267 (3.5884) grad_norm 2.3787 (1.9517) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][500/1251] eta 0:03:04 lr 0.000966 wd 0.0500 time 0.2427 (0.2459) data time 0.0009 (0.0018) model time 0.2418 (0.2444) loss 4.1010 (3.5882) grad_norm 1.9122 (1.9472) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][510/1251] eta 0:03:02 lr 0.000966 wd 0.0500 time 0.2509 (0.2458) data time 0.0007 (0.0018) model time 0.2502 (0.2443) loss 2.8876 (3.5894) grad_norm 1.6229 (1.9494) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][520/1251] eta 0:02:59 lr 0.000966 wd 0.0500 time 0.2351 (0.2461) data time 0.0007 (0.0018) model time 0.2344 (0.2447) loss 3.7109 (3.5890) grad_norm 1.4639 (1.9524) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][530/1251] eta 0:02:57 lr 0.000966 wd 
0.0500 time 0.2540 (0.2460) data time 0.0008 (0.0018) model time 0.2532 (0.2446) loss 3.2065 (3.5948) grad_norm 1.8196 (1.9509) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][540/1251] eta 0:02:54 lr 0.000966 wd 0.0500 time 0.2397 (0.2459) data time 0.0009 (0.0018) model time 0.2388 (0.2445) loss 4.2890 (3.5946) grad_norm 1.3426 (1.9464) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][550/1251] eta 0:02:52 lr 0.000966 wd 0.0500 time 0.2441 (0.2458) data time 0.0010 (0.0018) model time 0.2431 (0.2444) loss 4.0644 (3.5978) grad_norm 1.7604 (1.9437) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][560/1251] eta 0:02:49 lr 0.000966 wd 0.0500 time 0.2401 (0.2457) data time 0.0010 (0.0018) model time 0.2391 (0.2443) loss 3.1497 (3.5959) grad_norm 1.2908 (1.9427) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][570/1251] eta 0:02:47 lr 0.000966 wd 0.0500 time 0.2438 (0.2456) data time 0.0007 (0.0017) model time 0.2430 (0.2442) loss 3.0816 (3.5954) grad_norm 1.4832 (1.9413) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][580/1251] eta 0:02:44 lr 0.000966 wd 0.0500 time 0.2400 (0.2455) data time 0.0008 (0.0017) model time 0.2392 (0.2441) loss 2.7508 (3.5900) grad_norm 1.8528 (1.9374) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][590/1251] eta 0:02:42 lr 0.000966 wd 0.0500 time 0.2402 (0.2455) data time 0.0010 (0.0017) model time 0.2392 (0.2441) loss 4.1029 (3.5842) grad_norm 2.7723 (1.9427) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][600/1251] eta 0:02:40 lr 0.000965 wd 0.0500 time 0.2413 (0.2458) data time 0.0010 (0.0017) model time 0.2403 (0.2444) loss 3.4569 (3.5841) grad_norm 3.1058 (1.9463) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][610/1251] eta 0:02:37 lr 0.000965 wd 0.0500 time 0.2338 (0.2458) data time 0.0008 (0.0017) model time 0.2331 (0.2444) loss 2.4234 (3.5834) grad_norm 2.5523 (1.9465) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 06:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][620/1251] eta 0:02:35 lr 0.000965 wd 0.0500 time 0.2351 (0.2457) data time 0.0008 (0.0017) model time 0.2343 (0.2443) loss 3.7194 (3.5816) grad_norm 1.8420 (1.9474) loss_scale 8192.0000 (4115.7874) mem 7379MB [2024-08-26 06:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][630/1251] eta 0:02:32 lr 0.000965 wd 0.0500 time 0.2386 (0.2457) data time 0.0010 (0.0017) model time 0.2376 (0.2443) loss 3.6740 (3.5785) grad_norm 1.7687 (1.9464) loss_scale 8192.0000 (4180.3867) mem 7379MB [2024-08-26 06:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][640/1251] eta 0:02:30 lr 0.000965 wd 0.0500 time 0.2355 (0.2456) data time 0.0009 (0.0017) model time 0.2345 (0.2442) loss 3.7000 (3.5757) grad_norm 2.2855 (1.9419) loss_scale 8192.0000 (4242.9704) mem 7379MB [2024-08-26 06:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][650/1251] eta 0:02:27 lr 0.000965 wd 0.0500 time 0.2430 (0.2456) data time 0.0007 (0.0017) model time 0.2423 (0.2442) loss 4.7755 (3.5788) grad_norm 1.8295 (1.9405) loss_scale 8192.0000 (4303.6313) mem 7379MB [2024-08-26 06:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][660/1251] eta 0:02:25 lr 0.000965 wd 0.0500 time 0.2535 (0.2455) data time 0.0011 (0.0017) model time 0.2524 (0.2441) loss 4.0282 
(3.5812) grad_norm 1.7688 (1.9391) loss_scale 8192.0000 (4362.4569) mem 7379MB [2024-08-26 06:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][670/1251] eta 0:02:22 lr 0.000965 wd 0.0500 time 0.2417 (0.2458) data time 0.0012 (0.0017) model time 0.2405 (0.2444) loss 3.8457 (3.5829) grad_norm 1.7553 (1.9377) loss_scale 8192.0000 (4419.5291) mem 7379MB [2024-08-26 06:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][680/1251] eta 0:02:20 lr 0.000965 wd 0.0500 time 0.2446 (0.2457) data time 0.0010 (0.0017) model time 0.2435 (0.2443) loss 4.1055 (3.5858) grad_norm 1.2949 (1.9376) loss_scale 8192.0000 (4474.9251) mem 7379MB [2024-08-26 06:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][690/1251] eta 0:02:17 lr 0.000965 wd 0.0500 time 0.2418 (0.2457) data time 0.0011 (0.0017) model time 0.2408 (0.2443) loss 3.4478 (3.5824) grad_norm 2.1205 (1.9358) loss_scale 8192.0000 (4528.7178) mem 7379MB [2024-08-26 06:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][700/1251] eta 0:02:15 lr 0.000965 wd 0.0500 time 0.2452 (0.2456) data time 0.0011 (0.0017) model time 0.2441 (0.2442) loss 3.9534 (3.5793) grad_norm 1.7827 (1.9368) loss_scale 8192.0000 (4580.9757) mem 7379MB [2024-08-26 06:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][710/1251] eta 0:02:12 lr 0.000965 wd 0.0500 time 0.2451 (0.2455) data time 0.0007 (0.0017) model time 0.2443 (0.2441) loss 3.9500 (3.5777) grad_norm 2.7195 (1.9392) loss_scale 8192.0000 (4631.7637) mem 7379MB [2024-08-26 06:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][720/1251] eta 0:02:10 lr 0.000965 wd 0.0500 time 0.2451 (0.2455) data time 0.0010 (0.0017) model time 0.2441 (0.2440) loss 3.5120 (3.5726) grad_norm 1.9701 (1.9367) loss_scale 8192.0000 (4681.1429) mem 7379MB [2024-08-26 06:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][730/1251] eta 0:02:07 lr 0.000965 wd 
0.0500 time 0.2455 (0.2454) data time 0.0008 (0.0016) model time 0.2448 (0.2440) loss 3.4625 (3.5727) grad_norm 2.0707 (1.9421) loss_scale 8192.0000 (4729.1710) mem 7379MB [2024-08-26 06:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][740/1251] eta 0:02:05 lr 0.000965 wd 0.0500 time 0.2403 (0.2454) data time 0.0008 (0.0016) model time 0.2395 (0.2440) loss 3.5625 (3.5736) grad_norm 1.6853 (1.9448) loss_scale 8192.0000 (4775.9028) mem 7379MB [2024-08-26 06:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][750/1251] eta 0:02:02 lr 0.000965 wd 0.0500 time 0.2377 (0.2453) data time 0.0011 (0.0016) model time 0.2367 (0.2439) loss 3.7221 (3.5718) grad_norm 2.1313 (1.9448) loss_scale 8192.0000 (4821.3901) mem 7379MB [2024-08-26 06:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][760/1251] eta 0:02:00 lr 0.000965 wd 0.0500 time 0.2363 (0.2452) data time 0.0012 (0.0016) model time 0.2351 (0.2438) loss 4.0196 (3.5769) grad_norm 1.8988 (1.9408) loss_scale 8192.0000 (4865.6820) mem 7379MB [2024-08-26 06:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][770/1251] eta 0:01:57 lr 0.000965 wd 0.0500 time 0.2464 (0.2452) data time 0.0008 (0.0017) model time 0.2457 (0.2438) loss 3.9900 (3.5804) grad_norm 1.6127 (1.9373) loss_scale 8192.0000 (4908.8249) mem 7379MB [2024-08-26 06:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][780/1251] eta 0:01:55 lr 0.000965 wd 0.0500 time 0.2440 (0.2452) data time 0.0010 (0.0017) model time 0.2430 (0.2437) loss 3.2756 (3.5775) grad_norm 1.3492 (1.9365) loss_scale 8192.0000 (4950.8630) mem 7379MB [2024-08-26 06:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][790/1251] eta 0:01:52 lr 0.000965 wd 0.0500 time 0.2375 (0.2451) data time 0.0008 (0.0016) model time 0.2367 (0.2437) loss 2.8503 (3.5782) grad_norm 1.9471 (1.9339) loss_scale 8192.0000 (4991.8382) mem 7379MB [2024-08-26 06:49:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][800/1251] eta 0:01:50 lr 0.000965 wd 0.0500 time 0.2393 (0.2451) data time 0.0009 (0.0016) model time 0.2384 (0.2436) loss 3.5919 (3.5776) grad_norm 2.1288 (1.9343) loss_scale 8192.0000 (5031.7903) mem 7379MB [2024-08-26 06:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][810/1251] eta 0:01:48 lr 0.000965 wd 0.0500 time 0.2405 (0.2450) data time 0.0008 (0.0017) model time 0.2397 (0.2436) loss 4.1130 (3.5760) grad_norm 1.4777 (1.9318) loss_scale 8192.0000 (5070.7571) mem 7379MB [2024-08-26 06:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][820/1251] eta 0:01:45 lr 0.000965 wd 0.0500 time 0.2448 (0.2450) data time 0.0010 (0.0017) model time 0.2437 (0.2435) loss 3.6380 (3.5742) grad_norm 2.4674 (1.9295) loss_scale 8192.0000 (5108.7747) mem 7379MB [2024-08-26 06:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][830/1251] eta 0:01:43 lr 0.000965 wd 0.0500 time 0.2462 (0.2452) data time 0.0007 (0.0016) model time 0.2454 (0.2438) loss 2.6225 (3.5766) grad_norm 1.7240 (1.9300) loss_scale 8192.0000 (5145.8773) mem 7379MB [2024-08-26 06:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][840/1251] eta 0:01:40 lr 0.000965 wd 0.0500 time 0.2393 (0.2454) data time 0.0008 (0.0016) model time 0.2385 (0.2440) loss 4.2157 (3.5764) grad_norm 1.9386 (1.9297) loss_scale 8192.0000 (5182.0975) mem 7379MB [2024-08-26 06:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][850/1251] eta 0:01:38 lr 0.000965 wd 0.0500 time 0.2395 (0.2458) data time 0.0009 (0.0016) model time 0.2386 (0.2444) loss 2.4553 (3.5765) grad_norm 1.5234 (1.9292) loss_scale 8192.0000 (5217.4665) mem 7379MB [2024-08-26 06:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][860/1251] eta 0:01:36 lr 0.000965 wd 0.0500 time 0.2379 (0.2458) data time 0.0010 (0.0016) model time 0.2368 (0.2444) loss 3.1167 
(3.5758) grad_norm 1.8524 (1.9315) loss_scale 8192.0000 (5252.0139) mem 7379MB [2024-08-26 06:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][870/1251] eta 0:01:33 lr 0.000965 wd 0.0500 time 0.2424 (0.2457) data time 0.0009 (0.0016) model time 0.2415 (0.2443) loss 3.3560 (3.5750) grad_norm 2.3728 (1.9345) loss_scale 8192.0000 (5285.7681) mem 7379MB [2024-08-26 06:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][880/1251] eta 0:01:31 lr 0.000965 wd 0.0500 time 0.2490 (0.2457) data time 0.0010 (0.0016) model time 0.2480 (0.2443) loss 3.4533 (3.5755) grad_norm 2.0174 (1.9348) loss_scale 8192.0000 (5318.7560) mem 7379MB [2024-08-26 06:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][890/1251] eta 0:01:28 lr 0.000965 wd 0.0500 time 0.2483 (0.2456) data time 0.0008 (0.0016) model time 0.2474 (0.2442) loss 3.7106 (3.5776) grad_norm 1.5440 (1.9349) loss_scale 8192.0000 (5351.0034) mem 7379MB [2024-08-26 06:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][900/1251] eta 0:01:26 lr 0.000965 wd 0.0500 time 0.2357 (0.2456) data time 0.0010 (0.0016) model time 0.2346 (0.2442) loss 2.5617 (3.5782) grad_norm 2.1312 (1.9381) loss_scale 8192.0000 (5382.5350) mem 7379MB [2024-08-26 06:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][910/1251] eta 0:01:23 lr 0.000965 wd 0.0500 time 0.2430 (0.2455) data time 0.0007 (0.0016) model time 0.2422 (0.2441) loss 3.1018 (3.5766) grad_norm 1.4389 (1.9364) loss_scale 8192.0000 (5413.3743) mem 7379MB [2024-08-26 06:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][920/1251] eta 0:01:21 lr 0.000965 wd 0.0500 time 0.2498 (0.2455) data time 0.0007 (0.0016) model time 0.2491 (0.2441) loss 4.4766 (3.5735) grad_norm 1.4884 (1.9357) loss_scale 8192.0000 (5443.5440) mem 7379MB [2024-08-26 06:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][930/1251] eta 0:01:18 lr 0.000965 wd 
0.0500 time 0.2438 (0.2455) data time 0.0007 (0.0016) model time 0.2432 (0.2440) loss 3.6688 (3.5736) grad_norm 1.6679 (1.9340) loss_scale 8192.0000 (5473.0655) mem 7379MB [2024-08-26 06:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][940/1251] eta 0:01:16 lr 0.000965 wd 0.0500 time 0.2431 (0.2454) data time 0.0009 (0.0016) model time 0.2422 (0.2440) loss 4.2699 (3.5740) grad_norm 1.9340 (1.9335) loss_scale 8192.0000 (5501.9596) mem 7379MB [2024-08-26 06:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][950/1251] eta 0:01:13 lr 0.000965 wd 0.0500 time 0.2369 (0.2453) data time 0.0011 (0.0016) model time 0.2358 (0.2439) loss 3.6343 (3.5721) grad_norm 1.3366 (1.9311) loss_scale 8192.0000 (5530.2461) mem 7379MB [2024-08-26 06:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][960/1251] eta 0:01:11 lr 0.000965 wd 0.0500 time 0.2432 (0.2453) data time 0.0012 (0.0016) model time 0.2420 (0.2439) loss 2.9254 (3.5692) grad_norm 1.5642 (1.9303) loss_scale 8192.0000 (5557.9438) mem 7379MB [2024-08-26 06:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][970/1251] eta 0:01:08 lr 0.000965 wd 0.0500 time 0.2369 (0.2453) data time 0.0009 (0.0016) model time 0.2359 (0.2439) loss 3.1981 (3.5657) grad_norm 1.6343 (1.9282) loss_scale 8192.0000 (5585.0711) mem 7379MB [2024-08-26 06:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][980/1251] eta 0:01:06 lr 0.000965 wd 0.0500 time 0.2409 (0.2452) data time 0.0011 (0.0016) model time 0.2398 (0.2438) loss 3.8873 (3.5666) grad_norm 1.7705 (1.9282) loss_scale 8192.0000 (5611.6453) mem 7379MB [2024-08-26 06:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][990/1251] eta 0:01:03 lr 0.000965 wd 0.0500 time 0.2414 (0.2452) data time 0.0009 (0.0016) model time 0.2405 (0.2437) loss 3.1301 (3.5657) grad_norm 1.0795 (1.9250) loss_scale 8192.0000 (5637.6831) mem 7379MB [2024-08-26 06:49:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1000/1251] eta 0:01:01 lr 0.000965 wd 0.0500 time 0.2408 (0.2452) data time 0.0007 (0.0016) model time 0.2401 (0.2437) loss 3.6183 (3.5659) grad_norm 1.5659 (1.9234) loss_scale 8192.0000 (5663.2008) mem 7379MB [2024-08-26 06:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1010/1251] eta 0:00:59 lr 0.000965 wd 0.0500 time 0.2397 (0.2451) data time 0.0007 (0.0016) model time 0.2390 (0.2437) loss 3.9399 (3.5665) grad_norm 1.7685 (1.9254) loss_scale 8192.0000 (5688.2136) mem 7379MB [2024-08-26 06:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1020/1251] eta 0:00:56 lr 0.000965 wd 0.0500 time 0.4324 (0.2455) data time 0.0012 (0.0016) model time 0.4313 (0.2441) loss 3.6145 (3.5642) grad_norm 1.5861 (1.9278) loss_scale 8192.0000 (5712.7365) mem 7379MB [2024-08-26 06:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1030/1251] eta 0:00:54 lr 0.000965 wd 0.0500 time 0.2365 (0.2457) data time 0.0008 (0.0016) model time 0.2357 (0.2443) loss 4.3677 (3.5678) grad_norm 1.5947 (1.9274) loss_scale 8192.0000 (5736.7837) mem 7379MB [2024-08-26 06:50:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1040/1251] eta 0:00:51 lr 0.000965 wd 0.0500 time 0.2446 (0.2456) data time 0.0007 (0.0016) model time 0.2438 (0.2443) loss 3.9212 (3.5676) grad_norm 1.4249 (1.9261) loss_scale 8192.0000 (5760.3689) mem 7379MB [2024-08-26 06:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1050/1251] eta 0:00:49 lr 0.000965 wd 0.0500 time 0.2381 (0.2456) data time 0.0008 (0.0016) model time 0.2373 (0.2442) loss 3.7808 (3.5663) grad_norm 1.6267 (1.9237) loss_scale 8192.0000 (5783.5052) mem 7379MB [2024-08-26 06:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1060/1251] eta 0:00:46 lr 0.000965 wd 0.0500 time 0.2393 (0.2456) data time 0.0010 (0.0016) model time 0.2383 (0.2442) loss 3.7321 
(3.5676) grad_norm 2.5755 (1.9292) loss_scale 8192.0000 (5806.2055) mem 7379MB [2024-08-26 06:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1070/1251] eta 0:00:44 lr 0.000965 wd 0.0500 time 0.2410 (0.2455) data time 0.0007 (0.0016) model time 0.2403 (0.2441) loss 3.4356 (3.5680) grad_norm 1.4492 (1.9278) loss_scale 8192.0000 (5828.4818) mem 7379MB [2024-08-26 06:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1080/1251] eta 0:00:41 lr 0.000965 wd 0.0500 time 0.2429 (0.2455) data time 0.0007 (0.0015) model time 0.2422 (0.2441) loss 3.6528 (3.5681) grad_norm 2.0551 (1.9275) loss_scale 8192.0000 (5850.3460) mem 7379MB [2024-08-26 06:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1090/1251] eta 0:00:39 lr 0.000965 wd 0.0500 time 0.2409 (0.2454) data time 0.0010 (0.0015) model time 0.2399 (0.2440) loss 3.2445 (3.5689) grad_norm 2.0150 (1.9292) loss_scale 8192.0000 (5871.8093) mem 7379MB [2024-08-26 06:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1100/1251] eta 0:00:37 lr 0.000965 wd 0.0500 time 0.2373 (0.2454) data time 0.0008 (0.0015) model time 0.2365 (0.2440) loss 4.3459 (3.5721) grad_norm 2.2036 (1.9319) loss_scale 8192.0000 (5892.8828) mem 7379MB [2024-08-26 06:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1110/1251] eta 0:00:34 lr 0.000965 wd 0.0500 time 0.2486 (0.2453) data time 0.0007 (0.0015) model time 0.2479 (0.2440) loss 4.3681 (3.5747) grad_norm 1.5707 (1.9323) loss_scale 8192.0000 (5913.5770) mem 7379MB [2024-08-26 06:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1120/1251] eta 0:00:32 lr 0.000965 wd 0.0500 time 0.2451 (0.2455) data time 0.0007 (0.0015) model time 0.2445 (0.2441) loss 4.7486 (3.5770) grad_norm 1.5288 (1.9295) loss_scale 8192.0000 (5933.9019) mem 7379MB [2024-08-26 06:50:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1130/1251] eta 0:00:29 lr 
0.000965 wd 0.0500 time 0.2426 (0.2454) data time 0.0010 (0.0015) model time 0.2417 (0.2440) loss 3.8560 (3.5762) grad_norm 1.9855 (1.9273) loss_scale 8192.0000 (5953.8674) mem 7379MB [2024-08-26 06:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1140/1251] eta 0:00:27 lr 0.000965 wd 0.0500 time 0.2385 (0.2454) data time 0.0010 (0.0015) model time 0.2376 (0.2440) loss 3.9774 (3.5776) grad_norm 2.0259 (1.9281) loss_scale 8192.0000 (5973.4829) mem 7379MB [2024-08-26 06:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1150/1251] eta 0:00:24 lr 0.000965 wd 0.0500 time 0.2440 (0.2453) data time 0.0010 (0.0015) model time 0.2431 (0.2440) loss 4.0342 (3.5757) grad_norm 1.4631 (1.9269) loss_scale 8192.0000 (5992.7576) mem 7379MB [2024-08-26 06:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1160/1251] eta 0:00:22 lr 0.000965 wd 0.0500 time 0.2448 (0.2453) data time 0.0012 (0.0015) model time 0.2436 (0.2439) loss 2.7766 (3.5730) grad_norm 1.7120 (1.9279) loss_scale 8192.0000 (6011.7003) mem 7379MB [2024-08-26 06:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1170/1251] eta 0:00:19 lr 0.000965 wd 0.0500 time 0.2339 (0.2453) data time 0.0010 (0.0015) model time 0.2328 (0.2439) loss 4.2358 (3.5749) grad_norm 1.4191 (1.9284) loss_scale 8192.0000 (6030.3194) mem 7379MB [2024-08-26 06:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1180/1251] eta 0:00:17 lr 0.000965 wd 0.0500 time 0.2403 (0.2452) data time 0.0010 (0.0015) model time 0.2393 (0.2439) loss 3.7750 (3.5738) grad_norm 1.4508 (1.9265) loss_scale 8192.0000 (6048.6232) mem 7379MB [2024-08-26 06:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1190/1251] eta 0:00:14 lr 0.000965 wd 0.0500 time 0.2417 (0.2452) data time 0.0011 (0.0015) model time 0.2405 (0.2438) loss 3.4328 (3.5730) grad_norm 1.5328 (1.9255) loss_scale 8192.0000 (6066.6196) mem 7379MB [2024-08-26 
06:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1200/1251] eta 0:00:12 lr 0.000965 wd 0.0500 time 0.2439 (0.2452) data time 0.0010 (0.0015) model time 0.2430 (0.2438) loss 4.6700 (3.5766) grad_norm 1.8634 (1.9249) loss_scale 8192.0000 (6084.3164) mem 7379MB [2024-08-26 06:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1210/1251] eta 0:00:10 lr 0.000964 wd 0.0500 time 0.2416 (0.2452) data time 0.0008 (0.0015) model time 0.2408 (0.2438) loss 3.8988 (3.5775) grad_norm 2.2739 (1.9261) loss_scale 8192.0000 (6101.7209) mem 7379MB [2024-08-26 06:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1220/1251] eta 0:00:07 lr 0.000964 wd 0.0500 time 0.2409 (0.2452) data time 0.0016 (0.0015) model time 0.2394 (0.2438) loss 3.5494 (3.5774) grad_norm 1.9398 (1.9262) loss_scale 8192.0000 (6118.8403) mem 7379MB [2024-08-26 06:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1230/1251] eta 0:00:05 lr 0.000964 wd 0.0500 time 0.2448 (0.2451) data time 0.0009 (0.0015) model time 0.2439 (0.2438) loss 2.8116 (3.5778) grad_norm 1.6709 (1.9264) loss_scale 8192.0000 (6135.6816) mem 7379MB [2024-08-26 06:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1240/1251] eta 0:00:02 lr 0.000964 wd 0.0500 time 0.2240 (0.2450) data time 0.0005 (0.0015) model time 0.2235 (0.2437) loss 3.3554 (3.5793) grad_norm 1.8431 (1.9256) loss_scale 8192.0000 (6152.2514) mem 7379MB [2024-08-26 06:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [53/300][1250/1251] eta 0:00:00 lr 0.000964 wd 0.0500 time 0.2268 (0.2449) data time 0.0005 (0.0015) model time 0.2263 (0.2436) loss 2.8993 (3.5789) grad_norm 2.3773 (1.9258) loss_scale 8192.0000 (6168.5564) mem 7379MB [2024-08-26 06:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 53 training takes 0:05:06 [2024-08-26 06:51:00 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 06:51:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 06:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.5508 (0.5508) Acc@1 88.574 (88.574) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 06:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.111) Loss 0.9077 (0.8715) Acc@1 78.906 (80.380) Acc@5 95.117 (95.384) Mem 7379MB [2024-08-26 06:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.095) Loss 1.3164 (0.8886) Acc@1 70.020 (79.548) Acc@5 90.625 (95.429) Mem 7379MB [2024-08-26 06:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.5508 (1.0196) Acc@1 61.230 (76.692) Acc@5 86.523 (93.772) Mem 7379MB [2024-08-26 06:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.4424 (1.0953) Acc@1 66.699 (74.969) Acc@5 89.648 (92.928) Mem 7379MB [2024-08-26 06:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.738 Acc@5 92.828 [2024-08-26 06:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.7% [2024-08-26 06:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.756 (0.756) Loss 0.4722 (0.4722) Acc@1 90.039 (90.039) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 06:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.146) Loss 0.7881 (0.7606) Acc@1 83.594 (83.034) Acc@5 95.801 (96.440) Mem 7379MB [2024-08-26 06:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.114) Loss 1.1035 (0.7776) Acc@1 75.000 (82.124) Acc@5 91.992 (96.345) Mem 7379MB [2024-08-26 06:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.087 (0.103) Loss 1.3662 (0.8920) Acc@1 65.039 (79.363) Acc@5 88.672 (94.856) Mem 7379MB [2024-08-26 06:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.2852 (0.9574) Acc@1 68.750 (77.746) Acc@5 90.234 (94.098) Mem 7379MB [2024-08-26 06:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.382 Acc@5 94.046 [2024-08-26 06:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.4% [2024-08-26 06:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.38% [2024-08-26 06:51:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 06:51:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 06:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][0/1251] eta 0:13:35 lr 0.000964 wd 0.0500 time 0.6518 (0.6518) data time 0.4326 (0.4326) model time 0.0000 (0.0000) loss 4.6088 (4.6088) grad_norm 1.3890 (1.3890) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][10/1251] eta 0:05:45 lr 0.000964 wd 0.0500 time 0.2433 (0.2787) data time 0.0012 (0.0403) model time 0.0000 (0.0000) loss 2.9599 (3.7967) grad_norm 1.4618 (1.8054) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][20/1251] eta 0:05:20 lr 0.000964 wd 0.0500 time 0.2459 (0.2606) data time 0.0009 (0.0216) model time 0.0000 (0.0000) loss 3.2228 (3.7927) grad_norm 3.0892 (2.1259) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][30/1251] eta 0:05:11 lr 0.000964 wd 0.0500 time 0.2447 (0.2547) data time 0.0009 (0.0149) 
model time 0.0000 (0.0000) loss 3.6939 (3.8402) grad_norm 1.9855 (2.0282) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][40/1251] eta 0:05:11 lr 0.000964 wd 0.0500 time 0.2490 (0.2570) data time 0.0007 (0.0115) model time 0.0000 (0.0000) loss 2.4939 (3.7490) grad_norm 1.5321 (1.9640) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][50/1251] eta 0:05:05 lr 0.000964 wd 0.0500 time 0.2409 (0.2540) data time 0.0013 (0.0095) model time 0.0000 (0.0000) loss 3.8850 (3.7248) grad_norm 1.5746 (1.9756) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][60/1251] eta 0:05:00 lr 0.000964 wd 0.0500 time 0.2394 (0.2523) data time 0.0011 (0.0081) model time 0.2383 (0.2425) loss 4.0574 (3.7306) grad_norm 2.6177 (1.9607) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][70/1251] eta 0:04:56 lr 0.000964 wd 0.0500 time 0.2387 (0.2508) data time 0.0011 (0.0071) model time 0.2376 (0.2415) loss 3.3630 (3.7426) grad_norm 1.4045 (1.9210) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][80/1251] eta 0:04:52 lr 0.000964 wd 0.0500 time 0.2428 (0.2495) data time 0.0010 (0.0064) model time 0.2418 (0.2407) loss 3.5953 (3.6653) grad_norm 1.6878 (1.8865) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][90/1251] eta 0:04:48 lr 0.000964 wd 0.0500 time 0.2426 (0.2487) data time 0.0012 (0.0058) model time 0.2414 (0.2408) loss 2.8934 (3.6536) grad_norm 2.5439 (1.9022) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[54/300][100/1251] eta 0:04:45 lr 0.000964 wd 0.0500 time 0.2314 (0.2480) data time 0.0008 (0.0055) model time 0.2306 (0.2404) loss 4.2286 (3.6561) grad_norm 2.8379 (1.9236) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][110/1251] eta 0:04:44 lr 0.000964 wd 0.0500 time 0.2401 (0.2494) data time 0.0012 (0.0051) model time 0.2389 (0.2440) loss 3.5904 (3.6600) grad_norm 2.3239 (1.9460) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][120/1251] eta 0:04:44 lr 0.000964 wd 0.0500 time 0.4569 (0.2520) data time 0.0008 (0.0048) model time 0.4561 (0.2491) loss 3.9675 (3.6699) grad_norm 2.2224 (1.9490) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][130/1251] eta 0:04:41 lr 0.000964 wd 0.0500 time 0.2401 (0.2512) data time 0.0011 (0.0045) model time 0.2389 (0.2481) loss 2.8772 (3.6514) grad_norm 1.4536 (1.9503) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][140/1251] eta 0:04:38 lr 0.000964 wd 0.0500 time 0.2433 (0.2506) data time 0.0008 (0.0042) model time 0.2426 (0.2474) loss 3.5163 (3.6489) grad_norm 1.4044 (1.9293) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][150/1251] eta 0:04:35 lr 0.000964 wd 0.0500 time 0.2451 (0.2500) data time 0.0010 (0.0040) model time 0.2442 (0.2467) loss 3.9134 (3.6448) grad_norm 1.9191 (1.9230) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][160/1251] eta 0:04:32 lr 0.000964 wd 0.0500 time 0.2322 (0.2495) data time 0.0009 (0.0038) model time 0.2314 (0.2462) loss 3.9270 (3.6571) grad_norm 1.4430 (1.9197) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 06:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][170/1251] eta 0:04:29 lr 0.000964 wd 0.0500 time 0.2418 (0.2492) data time 0.0009 (0.0037) model time 0.2409 (0.2459) loss 2.7284 (3.6420) grad_norm 1.6211 (1.9142) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][180/1251] eta 0:04:26 lr 0.000964 wd 0.0500 time 0.2421 (0.2487) data time 0.0010 (0.0035) model time 0.2411 (0.2454) loss 3.7381 (3.6361) grad_norm 1.9125 (1.9099) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][190/1251] eta 0:04:24 lr 0.000964 wd 0.0500 time 0.4558 (0.2494) data time 0.0007 (0.0034) model time 0.4550 (0.2466) loss 4.4132 (3.6352) grad_norm 2.9076 (1.9221) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][200/1251] eta 0:04:21 lr 0.000964 wd 0.0500 time 0.2382 (0.2491) data time 0.0007 (0.0033) model time 0.2375 (0.2462) loss 3.6433 (3.6439) grad_norm 1.4968 (1.9399) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][210/1251] eta 0:04:18 lr 0.000964 wd 0.0500 time 0.2358 (0.2487) data time 0.0011 (0.0032) model time 0.2348 (0.2458) loss 2.9818 (3.6428) grad_norm 2.9151 (1.9461) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][220/1251] eta 0:04:16 lr 0.000964 wd 0.0500 time 0.2524 (0.2484) data time 0.0007 (0.0031) model time 0.2517 (0.2456) loss 4.0126 (3.6489) grad_norm 2.2677 (1.9545) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][230/1251] eta 0:04:13 lr 0.000964 wd 0.0500 time 0.2416 (0.2482) data time 0.0007 (0.0030) 
model time 0.2409 (0.2454) loss 2.1946 (3.6495) grad_norm 2.0147 (1.9479) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][240/1251] eta 0:04:10 lr 0.000964 wd 0.0500 time 0.2406 (0.2479) data time 0.0008 (0.0029) model time 0.2398 (0.2451) loss 3.4662 (3.6449) grad_norm 1.6142 (1.9466) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][250/1251] eta 0:04:07 lr 0.000964 wd 0.0500 time 0.2395 (0.2476) data time 0.0010 (0.0028) model time 0.2385 (0.2449) loss 3.2185 (3.6463) grad_norm 1.9015 (1.9540) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][260/1251] eta 0:04:05 lr 0.000964 wd 0.0500 time 0.2492 (0.2474) data time 0.0010 (0.0028) model time 0.2482 (0.2447) loss 3.0251 (3.6393) grad_norm 1.5883 (1.9487) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][270/1251] eta 0:04:02 lr 0.000964 wd 0.0500 time 0.2388 (0.2472) data time 0.0009 (0.0027) model time 0.2379 (0.2445) loss 2.8283 (3.6299) grad_norm 2.0861 (1.9534) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][280/1251] eta 0:03:59 lr 0.000964 wd 0.0500 time 0.2498 (0.2471) data time 0.0010 (0.0026) model time 0.2487 (0.2444) loss 3.5586 (3.6315) grad_norm 1.9985 (1.9412) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][290/1251] eta 0:03:57 lr 0.000964 wd 0.0500 time 0.2386 (0.2476) data time 0.0009 (0.0026) model time 0.2378 (0.2451) loss 2.5685 (3.6263) grad_norm 1.8316 (1.9382) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[54/300][300/1251] eta 0:03:55 lr 0.000964 wd 0.0500 time 0.2461 (0.2474) data time 0.0009 (0.0025) model time 0.2451 (0.2449) loss 3.3042 (3.6276) grad_norm 2.7184 (1.9369) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][310/1251] eta 0:03:52 lr 0.000964 wd 0.0500 time 0.2354 (0.2472) data time 0.0012 (0.0025) model time 0.2341 (0.2447) loss 2.8014 (3.6172) grad_norm 2.8027 (1.9321) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][320/1251] eta 0:03:49 lr 0.000964 wd 0.0500 time 0.2290 (0.2470) data time 0.0009 (0.0024) model time 0.2281 (0.2446) loss 4.4431 (3.6173) grad_norm 2.1367 (1.9401) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][330/1251] eta 0:03:47 lr 0.000964 wd 0.0500 time 0.2389 (0.2469) data time 0.0010 (0.0024) model time 0.2379 (0.2445) loss 3.7979 (3.6212) grad_norm 2.0363 (1.9397) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][340/1251] eta 0:03:44 lr 0.000964 wd 0.0500 time 0.2433 (0.2468) data time 0.0012 (0.0023) model time 0.2421 (0.2444) loss 3.8331 (3.6141) grad_norm 2.1569 (1.9359) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][350/1251] eta 0:03:42 lr 0.000964 wd 0.0500 time 0.2457 (0.2466) data time 0.0010 (0.0023) model time 0.2447 (0.2443) loss 3.7101 (3.6076) grad_norm 1.8068 (1.9361) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][360/1251] eta 0:03:39 lr 0.000964 wd 0.0500 time 0.2372 (0.2465) data time 0.0011 (0.0023) model time 0.2361 (0.2441) loss 4.1906 (3.6042) grad_norm 1.8418 (1.9361) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 06:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][370/1251] eta 0:03:37 lr 0.000964 wd 0.0500 time 0.2352 (0.2464) data time 0.0010 (0.0022) model time 0.2342 (0.2440) loss 3.4474 (3.6028) grad_norm 1.9236 (1.9366) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][380/1251] eta 0:03:34 lr 0.000964 wd 0.0500 time 0.2366 (0.2467) data time 0.0008 (0.0022) model time 0.2358 (0.2445) loss 3.3351 (3.6045) grad_norm 1.6325 (1.9352) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][390/1251] eta 0:03:32 lr 0.000964 wd 0.0500 time 0.2509 (0.2466) data time 0.0012 (0.0022) model time 0.2497 (0.2444) loss 3.6278 (3.6062) grad_norm 1.5618 (1.9299) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][400/1251] eta 0:03:29 lr 0.000964 wd 0.0500 time 0.2435 (0.2464) data time 0.0007 (0.0022) model time 0.2427 (0.2442) loss 4.1371 (3.5959) grad_norm 1.9713 (1.9256) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][410/1251] eta 0:03:27 lr 0.000964 wd 0.0500 time 0.2422 (0.2463) data time 0.0008 (0.0021) model time 0.2415 (0.2441) loss 2.7665 (3.5906) grad_norm 1.5420 (1.9229) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][420/1251] eta 0:03:24 lr 0.000964 wd 0.0500 time 0.2436 (0.2462) data time 0.0007 (0.0021) model time 0.2428 (0.2440) loss 3.9181 (3.5948) grad_norm 1.5211 (1.9273) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][430/1251] eta 0:03:22 lr 0.000964 wd 0.0500 time 0.2376 (0.2461) data time 0.0009 (0.0021) 
model time 0.2366 (0.2440) loss 3.0087 (3.5952) grad_norm 1.4367 (1.9313) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][440/1251] eta 0:03:19 lr 0.000964 wd 0.0500 time 0.2418 (0.2460) data time 0.0009 (0.0021) model time 0.2409 (0.2439) loss 4.3043 (3.5969) grad_norm 2.7810 (1.9320) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][450/1251] eta 0:03:17 lr 0.000964 wd 0.0500 time 0.2506 (0.2460) data time 0.0009 (0.0020) model time 0.2497 (0.2438) loss 3.5798 (3.5949) grad_norm 2.1023 (1.9333) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][460/1251] eta 0:03:14 lr 0.000964 wd 0.0500 time 0.2421 (0.2459) data time 0.0009 (0.0020) model time 0.2412 (0.2439) loss 3.7577 (3.5987) grad_norm 1.6146 (1.9381) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][470/1251] eta 0:03:11 lr 0.000964 wd 0.0500 time 0.2432 (0.2458) data time 0.0010 (0.0020) model time 0.2422 (0.2438) loss 3.7908 (3.6053) grad_norm 1.3574 (1.9374) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][480/1251] eta 0:03:09 lr 0.000964 wd 0.0500 time 0.2421 (0.2457) data time 0.0011 (0.0020) model time 0.2410 (0.2437) loss 3.6709 (3.6005) grad_norm 1.7147 (1.9366) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][490/1251] eta 0:03:06 lr 0.000964 wd 0.0500 time 0.2432 (0.2456) data time 0.0009 (0.0020) model time 0.2423 (0.2436) loss 4.0414 (3.6048) grad_norm 1.5314 (1.9319) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[54/300][500/1251] eta 0:03:04 lr 0.000964 wd 0.0500 time 0.2409 (0.2456) data time 0.0007 (0.0019) model time 0.2402 (0.2435) loss 2.6962 (3.6026) grad_norm 1.7792 (1.9352) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][510/1251] eta 0:03:01 lr 0.000964 wd 0.0500 time 0.2386 (0.2455) data time 0.0009 (0.0019) model time 0.2377 (0.2435) loss 4.1631 (3.5985) grad_norm 2.4617 (1.9375) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][520/1251] eta 0:02:59 lr 0.000964 wd 0.0500 time 0.2319 (0.2458) data time 0.0012 (0.0019) model time 0.2307 (0.2439) loss 3.9798 (3.5990) grad_norm 2.2878 (1.9405) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][530/1251] eta 0:02:57 lr 0.000964 wd 0.0500 time 0.2501 (0.2458) data time 0.0009 (0.0019) model time 0.2491 (0.2439) loss 3.2969 (3.5986) grad_norm 2.2418 (1.9489) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][540/1251] eta 0:02:55 lr 0.000964 wd 0.0500 time 0.2547 (0.2465) data time 0.0010 (0.0019) model time 0.2537 (0.2446) loss 4.0176 (3.5940) grad_norm 1.6409 (1.9511) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][550/1251] eta 0:02:52 lr 0.000964 wd 0.0500 time 0.2371 (0.2464) data time 0.0011 (0.0018) model time 0.2359 (0.2446) loss 2.4742 (3.5894) grad_norm 1.8737 (1.9516) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][560/1251] eta 0:02:50 lr 0.000963 wd 0.0500 time 0.2387 (0.2463) data time 0.0010 (0.0018) model time 0.2377 (0.2445) loss 3.6688 (3.5886) grad_norm 2.3089 (1.9526) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 06:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][570/1251] eta 0:02:47 lr 0.000963 wd 0.0500 time 0.2449 (0.2462) data time 0.0009 (0.0018) model time 0.2440 (0.2444) loss 3.9172 (3.5864) grad_norm 1.5217 (1.9549) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][580/1251] eta 0:02:45 lr 0.000963 wd 0.0500 time 0.2478 (0.2461) data time 0.0007 (0.0018) model time 0.2471 (0.2443) loss 2.8989 (3.5807) grad_norm 1.8319 (1.9593) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][590/1251] eta 0:02:42 lr 0.000963 wd 0.0500 time 0.2391 (0.2460) data time 0.0010 (0.0018) model time 0.2381 (0.2442) loss 2.6993 (3.5834) grad_norm 1.8303 (1.9603) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][600/1251] eta 0:02:40 lr 0.000963 wd 0.0500 time 0.2364 (0.2460) data time 0.0009 (0.0018) model time 0.2355 (0.2442) loss 3.2993 (3.5884) grad_norm 2.8062 (1.9604) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][610/1251] eta 0:02:37 lr 0.000963 wd 0.0500 time 0.2465 (0.2459) data time 0.0009 (0.0018) model time 0.2456 (0.2441) loss 3.2519 (3.5850) grad_norm 1.2757 (1.9574) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][620/1251] eta 0:02:35 lr 0.000963 wd 0.0500 time 0.2422 (0.2458) data time 0.0009 (0.0018) model time 0.2413 (0.2440) loss 4.0547 (3.5885) grad_norm 1.4054 (1.9526) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][630/1251] eta 0:02:32 lr 0.000963 wd 0.0500 time 0.2349 (0.2460) data time 0.0008 (0.0017) 
model time 0.2341 (0.2442) loss 3.1199 (3.5890) grad_norm 1.4295 (1.9491) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][640/1251] eta 0:02:30 lr 0.000963 wd 0.0500 time 0.2420 (0.2462) data time 0.0010 (0.0017) model time 0.2410 (0.2444) loss 3.6467 (3.5856) grad_norm 1.5994 (1.9469) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][650/1251] eta 0:02:27 lr 0.000963 wd 0.0500 time 0.2406 (0.2460) data time 0.0008 (0.0017) model time 0.2397 (0.2443) loss 3.0294 (3.5825) grad_norm 1.3632 (1.9470) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][660/1251] eta 0:02:25 lr 0.000963 wd 0.0500 time 0.2458 (0.2460) data time 0.0007 (0.0017) model time 0.2451 (0.2443) loss 2.9970 (3.5816) grad_norm 2.3380 (1.9452) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][670/1251] eta 0:02:22 lr 0.000963 wd 0.0500 time 0.2383 (0.2459) data time 0.0014 (0.0017) model time 0.2370 (0.2442) loss 3.7764 (3.5794) grad_norm 1.5581 (1.9421) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][680/1251] eta 0:02:20 lr 0.000963 wd 0.0500 time 0.2379 (0.2459) data time 0.0010 (0.0017) model time 0.2369 (0.2441) loss 3.4230 (3.5806) grad_norm 2.5705 (1.9387) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][690/1251] eta 0:02:17 lr 0.000963 wd 0.0500 time 0.2455 (0.2458) data time 0.0010 (0.0017) model time 0.2445 (0.2441) loss 3.9530 (3.5813) grad_norm 2.9040 (1.9426) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[54/300][700/1251] eta 0:02:15 lr 0.000963 wd 0.0500 time 0.2344 (0.2458) data time 0.0010 (0.0017) model time 0.2333 (0.2440) loss 3.8212 (3.5837) grad_norm 2.1725 (1.9426) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][710/1251] eta 0:02:12 lr 0.000963 wd 0.0500 time 0.2423 (0.2457) data time 0.0009 (0.0017) model time 0.2413 (0.2440) loss 3.7860 (3.5861) grad_norm 2.3480 (1.9392) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][720/1251] eta 0:02:10 lr 0.000963 wd 0.0500 time 0.2448 (0.2457) data time 0.0010 (0.0017) model time 0.2438 (0.2439) loss 4.0660 (3.5826) grad_norm 1.5610 (1.9361) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][730/1251] eta 0:02:07 lr 0.000963 wd 0.0500 time 0.2448 (0.2456) data time 0.0011 (0.0016) model time 0.2437 (0.2439) loss 4.1811 (3.5817) grad_norm 1.6748 (1.9360) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][740/1251] eta 0:02:05 lr 0.000963 wd 0.0500 time 0.2352 (0.2456) data time 0.0009 (0.0016) model time 0.2342 (0.2439) loss 4.2434 (3.5827) grad_norm 1.7066 (1.9384) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][750/1251] eta 0:02:02 lr 0.000963 wd 0.0500 time 0.2385 (0.2455) data time 0.0010 (0.0016) model time 0.2375 (0.2438) loss 3.1814 (3.5857) grad_norm 1.8069 (1.9371) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][760/1251] eta 0:02:00 lr 0.000963 wd 0.0500 time 0.2441 (0.2455) data time 0.0009 (0.0016) model time 0.2432 (0.2438) loss 3.7421 (3.5842) grad_norm 1.9731 (1.9342) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 06:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][770/1251] eta 0:01:58 lr 0.000963 wd 0.0500 time 0.2435 (0.2454) data time 0.0011 (0.0016) model time 0.2424 (0.2437) loss 3.6607 (3.5876) grad_norm 1.8633 (1.9336) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][780/1251] eta 0:01:55 lr 0.000963 wd 0.0500 time 0.2408 (0.2454) data time 0.0009 (0.0016) model time 0.2399 (0.2437) loss 2.7394 (3.5874) grad_norm 1.9083 (1.9301) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][790/1251] eta 0:01:53 lr 0.000963 wd 0.0500 time 0.2440 (0.2453) data time 0.0010 (0.0016) model time 0.2430 (0.2436) loss 4.1359 (3.5876) grad_norm 1.6216 (1.9294) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][800/1251] eta 0:01:50 lr 0.000963 wd 0.0500 time 0.2370 (0.2452) data time 0.0011 (0.0016) model time 0.2359 (0.2436) loss 4.0212 (3.5860) grad_norm 2.2174 (1.9301) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][810/1251] eta 0:01:48 lr 0.000963 wd 0.0500 time 0.2401 (0.2452) data time 0.0009 (0.0016) model time 0.2392 (0.2435) loss 4.4967 (3.5908) grad_norm 2.6181 (1.9321) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][820/1251] eta 0:01:45 lr 0.000963 wd 0.0500 time 0.2444 (0.2451) data time 0.0007 (0.0016) model time 0.2437 (0.2435) loss 3.5438 (3.5884) grad_norm 1.3472 (1.9283) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][830/1251] eta 0:01:43 lr 0.000963 wd 0.0500 time 0.2442 (0.2451) data time 0.0010 (0.0016) 
model time 0.2431 (0.2434) loss 2.6626 (3.5892) grad_norm 3.1391 (1.9287) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][840/1251] eta 0:01:40 lr 0.000963 wd 0.0500 time 0.2348 (0.2450) data time 0.0011 (0.0016) model time 0.2337 (0.2434) loss 3.1455 (3.5894) grad_norm 2.1498 (1.9307) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][850/1251] eta 0:01:38 lr 0.000963 wd 0.0500 time 0.2450 (0.2450) data time 0.0011 (0.0016) model time 0.2439 (0.2433) loss 3.5143 (3.5894) grad_norm 1.5655 (1.9325) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][860/1251] eta 0:01:35 lr 0.000963 wd 0.0500 time 0.2452 (0.2449) data time 0.0009 (0.0016) model time 0.2443 (0.2433) loss 3.4034 (3.5864) grad_norm 1.6608 (1.9343) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][870/1251] eta 0:01:33 lr 0.000963 wd 0.0500 time 0.2424 (0.2449) data time 0.0009 (0.0016) model time 0.2415 (0.2432) loss 3.6124 (3.5864) grad_norm 1.8220 (1.9328) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][880/1251] eta 0:01:30 lr 0.000963 wd 0.0500 time 0.2426 (0.2448) data time 0.0008 (0.0016) model time 0.2419 (0.2432) loss 3.9445 (3.5883) grad_norm 1.3377 (1.9316) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][890/1251] eta 0:01:28 lr 0.000963 wd 0.0500 time 0.2523 (0.2448) data time 0.0009 (0.0015) model time 0.2514 (0.2431) loss 4.4786 (3.5930) grad_norm 1.6513 (1.9353) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[54/300][900/1251] eta 0:01:25 lr 0.000963 wd 0.0500 time 0.2364 (0.2447) data time 0.0009 (0.0015) model time 0.2356 (0.2431) loss 4.2136 (3.5941) grad_norm 1.6591 (1.9336) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][910/1251] eta 0:01:23 lr 0.000963 wd 0.0500 time 0.2354 (0.2449) data time 0.0010 (0.0015) model time 0.2344 (0.2433) loss 3.7103 (3.5940) grad_norm 1.7821 (1.9326) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][920/1251] eta 0:01:21 lr 0.000963 wd 0.0500 time 0.2397 (0.2448) data time 0.0008 (0.0015) model time 0.2388 (0.2432) loss 2.9410 (3.5930) grad_norm 2.0513 (1.9357) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][930/1251] eta 0:01:18 lr 0.000963 wd 0.0500 time 0.2347 (0.2448) data time 0.0008 (0.0015) model time 0.2338 (0.2432) loss 3.9918 (3.5960) grad_norm 1.8757 (1.9330) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][940/1251] eta 0:01:16 lr 0.000963 wd 0.0500 time 0.2413 (0.2448) data time 0.0010 (0.0015) model time 0.2402 (0.2432) loss 3.9343 (3.5981) grad_norm 1.9967 (1.9342) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][950/1251] eta 0:01:13 lr 0.000963 wd 0.0500 time 0.2433 (0.2447) data time 0.0007 (0.0015) model time 0.2426 (0.2432) loss 3.2863 (3.5986) grad_norm 1.6418 (1.9347) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][960/1251] eta 0:01:11 lr 0.000963 wd 0.0500 time 0.2372 (0.2447) data time 0.0009 (0.0015) model time 0.2363 (0.2431) loss 4.4833 (3.6019) grad_norm 1.4455 (1.9368) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 06:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][970/1251] eta 0:01:08 lr 0.000963 wd 0.0500 time 0.2416 (0.2449) data time 0.0007 (0.0015) model time 0.2408 (0.2433) loss 4.0006 (3.6026) grad_norm 1.6450 (1.9353) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][980/1251] eta 0:01:06 lr 0.000963 wd 0.0500 time 0.2455 (0.2449) data time 0.0010 (0.0015) model time 0.2446 (0.2433) loss 3.5947 (3.6017) grad_norm 1.4598 (1.9353) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][990/1251] eta 0:01:03 lr 0.000963 wd 0.0500 time 0.2292 (0.2448) data time 0.0011 (0.0015) model time 0.2281 (0.2432) loss 3.2216 (3.6007) grad_norm 1.3767 (1.9338) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1000/1251] eta 0:01:01 lr 0.000963 wd 0.0500 time 0.2440 (0.2448) data time 0.0009 (0.0015) model time 0.2430 (0.2432) loss 3.7449 (3.6019) grad_norm 1.7899 (1.9344) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1010/1251] eta 0:00:58 lr 0.000963 wd 0.0500 time 0.2422 (0.2448) data time 0.0010 (0.0015) model time 0.2412 (0.2432) loss 3.5506 (3.6013) grad_norm 1.2537 (1.9308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1020/1251] eta 0:00:56 lr 0.000963 wd 0.0500 time 0.2402 (0.2447) data time 0.0008 (0.0015) model time 0.2395 (0.2432) loss 3.7862 (3.6034) grad_norm 2.1541 (1.9290) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1030/1251] eta 0:00:54 lr 0.000963 wd 0.0500 time 0.2397 (0.2447) data time 0.0007 
(0.0015) model time 0.2389 (0.2432) loss 3.5836 (3.6043) grad_norm 1.7279 (1.9281) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1040/1251] eta 0:00:51 lr 0.000963 wd 0.0500 time 0.2439 (0.2447) data time 0.0010 (0.0015) model time 0.2429 (0.2432) loss 2.6631 (3.6026) grad_norm 1.8377 (1.9302) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1050/1251] eta 0:00:49 lr 0.000963 wd 0.0500 time 0.2303 (0.2447) data time 0.0011 (0.0015) model time 0.2292 (0.2431) loss 4.1895 (3.6055) grad_norm 1.4219 (1.9291) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1060/1251] eta 0:00:46 lr 0.000963 wd 0.0500 time 0.2438 (0.2449) data time 0.0011 (0.0015) model time 0.2428 (0.2433) loss 3.8388 (3.6020) grad_norm 2.1659 (1.9273) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1070/1251] eta 0:00:44 lr 0.000963 wd 0.0500 time 0.2526 (0.2450) data time 0.0009 (0.0015) model time 0.2516 (0.2435) loss 3.8200 (3.6033) grad_norm 2.5090 (1.9292) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1080/1251] eta 0:00:41 lr 0.000963 wd 0.0500 time 0.2361 (0.2452) data time 0.0013 (0.0015) model time 0.2348 (0.2437) loss 2.4828 (3.6023) grad_norm 1.6935 (1.9285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1090/1251] eta 0:00:39 lr 0.000963 wd 0.0500 time 0.2365 (0.2452) data time 0.0009 (0.0015) model time 0.2356 (0.2437) loss 4.2654 (3.6028) grad_norm 1.9878 (1.9305) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [54/300][1100/1251] eta 0:00:37 lr 0.000963 wd 0.0500 time 0.2423 (0.2451) data time 0.0010 (0.0014) model time 0.2414 (0.2437) loss 3.2079 (3.6001) grad_norm 2.0938 (1.9297) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1110/1251] eta 0:00:34 lr 0.000963 wd 0.0500 time 0.2314 (0.2451) data time 0.0010 (0.0014) model time 0.2304 (0.2436) loss 4.1753 (3.6026) grad_norm 1.5065 (1.9303) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1120/1251] eta 0:00:32 lr 0.000963 wd 0.0500 time 0.4645 (0.2453) data time 0.0009 (0.0014) model time 0.4636 (0.2438) loss 3.9127 (3.6012) grad_norm 1.8935 (1.9292) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1130/1251] eta 0:00:29 lr 0.000963 wd 0.0500 time 0.2418 (0.2453) data time 0.0012 (0.0014) model time 0.2406 (0.2438) loss 3.6461 (3.6035) grad_norm 1.7589 (1.9311) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1140/1251] eta 0:00:27 lr 0.000963 wd 0.0500 time 0.2350 (0.2452) data time 0.0012 (0.0014) model time 0.2339 (0.2438) loss 3.8333 (3.6043) grad_norm 1.8694 (1.9317) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1150/1251] eta 0:00:24 lr 0.000962 wd 0.0500 time 0.2389 (0.2452) data time 0.0011 (0.0014) model time 0.2378 (0.2437) loss 3.8043 (3.6061) grad_norm 1.5845 (1.9314) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1160/1251] eta 0:00:22 lr 0.000962 wd 0.0500 time 0.2493 (0.2453) data time 0.0010 (0.0014) model time 0.2483 (0.2439) loss 4.0371 (3.6064) grad_norm 1.9502 (1.9313) loss_scale 
8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1170/1251] eta 0:00:19 lr 0.000962 wd 0.0500 time 0.2410 (0.2455) data time 0.0008 (0.0014) model time 0.2402 (0.2441) loss 4.1870 (3.6068) grad_norm 1.7983 (1.9308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1180/1251] eta 0:00:17 lr 0.000962 wd 0.0500 time 0.2441 (0.2455) data time 0.0009 (0.0014) model time 0.2432 (0.2440) loss 3.6605 (3.6053) grad_norm 2.1454 (1.9296) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1190/1251] eta 0:00:14 lr 0.000962 wd 0.0500 time 0.2442 (0.2455) data time 0.0010 (0.0014) model time 0.2432 (0.2440) loss 3.7560 (3.6020) grad_norm 2.4315 (1.9319) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1200/1251] eta 0:00:12 lr 0.000962 wd 0.0500 time 0.2440 (0.2454) data time 0.0009 (0.0014) model time 0.2431 (0.2440) loss 3.7852 (3.6016) grad_norm 2.3012 (1.9332) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1210/1251] eta 0:00:10 lr 0.000962 wd 0.0500 time 0.2399 (0.2454) data time 0.0011 (0.0014) model time 0.2388 (0.2440) loss 3.0913 (3.6020) grad_norm 1.5664 (1.9329) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1220/1251] eta 0:00:07 lr 0.000962 wd 0.0500 time 0.2347 (0.2455) data time 0.0009 (0.0014) model time 0.2338 (0.2441) loss 2.6072 (3.6008) grad_norm 1.6903 (1.9323) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1230/1251] eta 0:00:05 lr 0.000962 wd 0.0500 time 0.2531 (0.2455) data time 
0.0009 (0.0014) model time 0.2522 (0.2441) loss 4.2495 (3.5991) grad_norm 2.0673 (1.9333) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1240/1251] eta 0:00:02 lr 0.000962 wd 0.0500 time 0.2249 (0.2454) data time 0.0005 (0.0014) model time 0.2244 (0.2440) loss 3.6374 (3.6000) grad_norm 2.2471 (1.9343) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [54/300][1250/1251] eta 0:00:00 lr 0.000962 wd 0.0500 time 0.2267 (0.2453) data time 0.0005 (0.0014) model time 0.2263 (0.2438) loss 2.0393 (3.5975) grad_norm 2.1959 (1.9381) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 54 training takes 0:05:06 [2024-08-26 06:56:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 06:56:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 06:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.420 (0.420) Loss 0.5967 (0.5967) Acc@1 88.281 (88.281) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 06:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.101 (0.111) Loss 0.9497 (0.8837) Acc@1 80.273 (80.487) Acc@5 94.922 (95.526) Mem 7379MB [2024-08-26 06:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.090 (0.096) Loss 1.2666 (0.9064) Acc@1 71.191 (79.669) Acc@5 91.309 (95.485) Mem 7379MB [2024-08-26 06:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 1.5566 (1.0444) Acc@1 62.402 (76.591) Acc@5 87.207 (93.756) Mem 7379MB [2024-08-26 06:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.5537 (1.1254) Acc@1 64.941 (74.802) Acc@5 87.402 (92.771) Mem 7379MB [2024-08-26 06:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.480 Acc@5 92.662 [2024-08-26 06:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.5% [2024-08-26 06:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.742 (0.742) Loss 0.4727 (0.4727) Acc@1 89.941 (89.941) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 06:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.145) Loss 0.7861 (0.7590) Acc@1 83.398 (83.105) Acc@5 95.898 (96.520) Mem 7379MB [2024-08-26 06:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.113) Loss 1.1006 (0.7765) Acc@1 74.902 (82.213) Acc@5 92.188 (96.419) Mem 7379MB [2024-08-26 06:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.102) Loss 1.3594 (0.8902) Acc@1 65.039 (79.448) Acc@5 88.672 (94.931) Mem 7379MB [2024-08-26 06:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.2793 
(0.9547) Acc@1 68.555 (77.825) Acc@5 90.332 (94.183) Mem 7379MB [2024-08-26 06:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.442 Acc@5 94.120 [2024-08-26 06:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.4% [2024-08-26 06:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.44% [2024-08-26 06:56:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 06:56:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 06:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][0/1251] eta 0:14:53 lr 0.000962 wd 0.0500 time 0.7139 (0.7139) data time 0.4924 (0.4924) model time 0.0000 (0.0000) loss 4.0605 (4.0605) grad_norm 1.5511 (1.5511) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][10/1251] eta 0:05:50 lr 0.000962 wd 0.0500 time 0.2386 (0.2828) data time 0.0011 (0.0457) model time 0.0000 (0.0000) loss 3.4732 (3.5319) grad_norm 2.1748 (1.9533) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][20/1251] eta 0:05:25 lr 0.000962 wd 0.0500 time 0.2467 (0.2641) data time 0.0010 (0.0244) model time 0.0000 (0.0000) loss 3.9194 (3.6071) grad_norm 1.7556 (1.9292) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][30/1251] eta 0:05:14 lr 0.000962 wd 0.0500 time 0.2449 (0.2572) data time 0.0007 (0.0169) model time 0.0000 (0.0000) loss 2.1935 (3.5796) grad_norm 2.2777 (2.0351) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][40/1251] eta 0:05:06 
lr 0.000962 wd 0.0500 time 0.2414 (0.2532) data time 0.0007 (0.0130) model time 0.0000 (0.0000) loss 3.3269 (3.5572) grad_norm 1.3044 (2.0047) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][50/1251] eta 0:05:01 lr 0.000962 wd 0.0500 time 0.2354 (0.2511) data time 0.0009 (0.0106) model time 0.0000 (0.0000) loss 2.4338 (3.4723) grad_norm 1.7574 (2.0438) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 06:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][60/1251] eta 0:04:57 lr 0.000962 wd 0.0500 time 0.2399 (0.2498) data time 0.0007 (0.0091) model time 0.2392 (0.2420) loss 4.2909 (3.5023) grad_norm 1.7835 (inf) loss_scale 4096.0000 (7789.1148) mem 7379MB [2024-08-26 06:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][70/1251] eta 0:04:53 lr 0.000962 wd 0.0500 time 0.2393 (0.2488) data time 0.0010 (0.0079) model time 0.2383 (0.2421) loss 3.9794 (3.5617) grad_norm 1.4655 (inf) loss_scale 4096.0000 (7268.9577) mem 7379MB [2024-08-26 06:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][80/1251] eta 0:04:50 lr 0.000962 wd 0.0500 time 0.2387 (0.2478) data time 0.0007 (0.0071) model time 0.2379 (0.2413) loss 3.9558 (3.5257) grad_norm 1.7502 (inf) loss_scale 4096.0000 (6877.2346) mem 7379MB [2024-08-26 06:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][90/1251] eta 0:04:46 lr 0.000962 wd 0.0500 time 0.2439 (0.2472) data time 0.0008 (0.0064) model time 0.2431 (0.2412) loss 3.4128 (3.5366) grad_norm 1.6869 (inf) loss_scale 4096.0000 (6571.6044) mem 7379MB [2024-08-26 06:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][100/1251] eta 0:04:43 lr 0.000962 wd 0.0500 time 0.2377 (0.2464) data time 0.0009 (0.0059) model time 0.2368 (0.2406) loss 3.8457 (3.5363) grad_norm 2.1087 (inf) loss_scale 4096.0000 (6326.4950) mem 7379MB [2024-08-26 06:56:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][110/1251] eta 0:04:40 lr 0.000962 wd 0.0500 time 0.2474 (0.2459) data time 0.0010 (0.0054) model time 0.2464 (0.2404) loss 3.4659 (3.5451) grad_norm 1.7032 (inf) loss_scale 4096.0000 (6125.5495) mem 7379MB [2024-08-26 06:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][120/1251] eta 0:04:37 lr 0.000962 wd 0.0500 time 0.2406 (0.2454) data time 0.0012 (0.0051) model time 0.2394 (0.2402) loss 3.8780 (3.5462) grad_norm 1.9616 (inf) loss_scale 4096.0000 (5957.8182) mem 7379MB [2024-08-26 06:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][130/1251] eta 0:04:34 lr 0.000962 wd 0.0500 time 0.2333 (0.2451) data time 0.0009 (0.0047) model time 0.2324 (0.2403) loss 4.3543 (3.5608) grad_norm 2.3345 (inf) loss_scale 4096.0000 (5815.6947) mem 7379MB [2024-08-26 06:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][140/1251] eta 0:04:32 lr 0.000962 wd 0.0500 time 0.2548 (0.2450) data time 0.0007 (0.0045) model time 0.2541 (0.2405) loss 2.4567 (3.5504) grad_norm 1.4327 (inf) loss_scale 4096.0000 (5693.7305) mem 7379MB [2024-08-26 06:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][150/1251] eta 0:04:29 lr 0.000962 wd 0.0500 time 0.2464 (0.2447) data time 0.0012 (0.0043) model time 0.2452 (0.2404) loss 2.6049 (3.5511) grad_norm 1.1485 (inf) loss_scale 4096.0000 (5587.9205) mem 7379MB [2024-08-26 06:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][160/1251] eta 0:04:26 lr 0.000962 wd 0.0500 time 0.2368 (0.2445) data time 0.0009 (0.0041) model time 0.2359 (0.2404) loss 2.4243 (3.5463) grad_norm 1.5895 (inf) loss_scale 4096.0000 (5495.2547) mem 7379MB [2024-08-26 06:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][170/1251] eta 0:04:24 lr 0.000962 wd 0.0500 time 0.2342 (0.2442) data time 0.0011 (0.0039) model time 0.2331 (0.2403) loss 3.5293 (3.5772) grad_norm 
2.0917 (inf) loss_scale 4096.0000 (5413.4269) mem 7379MB [2024-08-26 06:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][180/1251] eta 0:04:21 lr 0.000962 wd 0.0500 time 0.2344 (0.2440) data time 0.0010 (0.0037) model time 0.2334 (0.2403) loss 4.1138 (3.5677) grad_norm 2.6740 (inf) loss_scale 4096.0000 (5340.6409) mem 7379MB [2024-08-26 06:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][190/1251] eta 0:04:18 lr 0.000962 wd 0.0500 time 0.2417 (0.2439) data time 0.0008 (0.0036) model time 0.2409 (0.2403) loss 3.2320 (3.5654) grad_norm 1.5113 (inf) loss_scale 4096.0000 (5275.4764) mem 7379MB [2024-08-26 06:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][200/1251] eta 0:04:16 lr 0.000962 wd 0.0500 time 0.2428 (0.2438) data time 0.0011 (0.0035) model time 0.2417 (0.2402) loss 3.0742 (3.5615) grad_norm 1.7956 (inf) loss_scale 4096.0000 (5216.7960) mem 7379MB [2024-08-26 06:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][210/1251] eta 0:04:13 lr 0.000962 wd 0.0500 time 0.2389 (0.2435) data time 0.0009 (0.0033) model time 0.2381 (0.2401) loss 3.0756 (3.5605) grad_norm 2.0075 (inf) loss_scale 4096.0000 (5163.6777) mem 7379MB [2024-08-26 06:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][220/1251] eta 0:04:10 lr 0.000962 wd 0.0500 time 0.2394 (0.2434) data time 0.0010 (0.0032) model time 0.2383 (0.2401) loss 3.8509 (3.5560) grad_norm 1.7677 (inf) loss_scale 4096.0000 (5115.3665) mem 7379MB [2024-08-26 06:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][230/1251] eta 0:04:08 lr 0.000962 wd 0.0500 time 0.2368 (0.2433) data time 0.0011 (0.0031) model time 0.2357 (0.2401) loss 4.1863 (3.5632) grad_norm 1.7845 (inf) loss_scale 4096.0000 (5071.2381) mem 7379MB [2024-08-26 06:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][240/1251] eta 0:04:05 lr 0.000962 wd 0.0500 time 0.2450 (0.2432) data time 
0.0011 (0.0030) model time 0.2439 (0.2401) loss 3.5700 (3.5659) grad_norm 2.5970 (inf) loss_scale 4096.0000 (5030.7718) mem 7379MB [2024-08-26 06:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][250/1251] eta 0:04:03 lr 0.000962 wd 0.0500 time 0.2411 (0.2431) data time 0.0009 (0.0030) model time 0.2402 (0.2401) loss 4.5296 (3.5812) grad_norm 2.6606 (inf) loss_scale 4096.0000 (4993.5299) mem 7379MB [2024-08-26 06:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][260/1251] eta 0:04:00 lr 0.000962 wd 0.0500 time 0.2489 (0.2431) data time 0.0007 (0.0029) model time 0.2482 (0.2401) loss 3.8088 (3.5800) grad_norm 2.7718 (inf) loss_scale 4096.0000 (4959.1418) mem 7379MB [2024-08-26 06:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][270/1251] eta 0:03:58 lr 0.000962 wd 0.0500 time 0.2419 (0.2430) data time 0.0009 (0.0028) model time 0.2410 (0.2401) loss 3.3330 (3.5783) grad_norm 1.9814 (inf) loss_scale 4096.0000 (4927.2915) mem 7379MB [2024-08-26 06:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][280/1251] eta 0:03:55 lr 0.000962 wd 0.0500 time 0.2425 (0.2429) data time 0.0007 (0.0028) model time 0.2417 (0.2401) loss 3.9321 (3.5819) grad_norm 1.7570 (inf) loss_scale 4096.0000 (4897.7082) mem 7379MB [2024-08-26 06:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][290/1251] eta 0:03:54 lr 0.000962 wd 0.0500 time 0.2437 (0.2436) data time 0.0011 (0.0027) model time 0.2425 (0.2410) loss 3.7720 (3.5725) grad_norm 1.3584 (inf) loss_scale 4096.0000 (4870.1581) mem 7379MB [2024-08-26 06:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][300/1251] eta 0:03:51 lr 0.000962 wd 0.0500 time 0.2351 (0.2435) data time 0.0007 (0.0026) model time 0.2344 (0.2410) loss 4.0987 (3.5779) grad_norm 1.6638 (inf) loss_scale 4096.0000 (4844.4385) mem 7379MB [2024-08-26 06:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[55/300][310/1251] eta 0:03:49 lr 0.000962 wd 0.0500 time 0.2433 (0.2435) data time 0.0009 (0.0026) model time 0.2423 (0.2410) loss 3.7031 (3.5771) grad_norm 2.8740 (inf) loss_scale 4096.0000 (4820.3730) mem 7379MB [2024-08-26 06:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][320/1251] eta 0:03:46 lr 0.000962 wd 0.0500 time 0.2459 (0.2434) data time 0.0010 (0.0025) model time 0.2449 (0.2410) loss 3.6338 (3.5770) grad_norm 2.2765 (inf) loss_scale 4096.0000 (4797.8069) mem 7379MB [2024-08-26 06:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][330/1251] eta 0:03:44 lr 0.000962 wd 0.0500 time 0.2424 (0.2440) data time 0.0007 (0.0025) model time 0.2417 (0.2417) loss 2.7458 (3.5806) grad_norm 1.3645 (inf) loss_scale 4096.0000 (4776.6042) mem 7379MB [2024-08-26 06:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][340/1251] eta 0:03:42 lr 0.000962 wd 0.0500 time 0.2417 (0.2445) data time 0.0008 (0.0024) model time 0.2408 (0.2424) loss 2.6957 (3.5733) grad_norm 1.6882 (inf) loss_scale 4096.0000 (4756.6452) mem 7379MB [2024-08-26 06:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][350/1251] eta 0:03:40 lr 0.000962 wd 0.0500 time 0.2393 (0.2444) data time 0.0011 (0.0024) model time 0.2382 (0.2423) loss 2.6573 (3.5658) grad_norm 2.1977 (inf) loss_scale 4096.0000 (4737.8234) mem 7379MB [2024-08-26 06:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][360/1251] eta 0:03:38 lr 0.000962 wd 0.0500 time 0.2393 (0.2454) data time 0.0011 (0.0024) model time 0.2381 (0.2435) loss 4.2559 (3.5680) grad_norm 1.4315 (inf) loss_scale 4096.0000 (4720.0443) mem 7379MB [2024-08-26 06:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][370/1251] eta 0:03:36 lr 0.000962 wd 0.0500 time 0.2406 (0.2453) data time 0.0008 (0.0023) model time 0.2399 (0.2434) loss 2.6822 (3.5643) grad_norm 1.7601 (inf) loss_scale 4096.0000 (4703.2237) mem 7379MB 
[2024-08-26 06:58:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][380/1251] eta 0:03:33 lr 0.000962 wd 0.0500 time 0.2427 (0.2453) data time 0.0007 (0.0023) model time 0.2420 (0.2434) loss 2.7066 (3.5574) grad_norm 3.0802 (inf) loss_scale 4096.0000 (4687.2861) mem 7379MB [2024-08-26 06:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][390/1251] eta 0:03:31 lr 0.000962 wd 0.0500 time 0.2456 (0.2452) data time 0.0011 (0.0023) model time 0.2445 (0.2433) loss 4.2052 (3.5567) grad_norm 2.0465 (inf) loss_scale 4096.0000 (4672.1637) mem 7379MB [2024-08-26 06:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][400/1251] eta 0:03:28 lr 0.000962 wd 0.0500 time 0.2425 (0.2451) data time 0.0010 (0.0022) model time 0.2415 (0.2432) loss 4.2227 (3.5608) grad_norm 1.3991 (inf) loss_scale 4096.0000 (4657.7955) mem 7379MB [2024-08-26 06:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][410/1251] eta 0:03:26 lr 0.000962 wd 0.0500 time 0.2451 (0.2450) data time 0.0011 (0.0022) model time 0.2440 (0.2431) loss 3.3892 (3.5613) grad_norm 1.7422 (inf) loss_scale 4096.0000 (4644.1265) mem 7379MB [2024-08-26 06:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][420/1251] eta 0:03:23 lr 0.000962 wd 0.0500 time 0.2468 (0.2450) data time 0.0008 (0.0022) model time 0.2460 (0.2431) loss 3.1799 (3.5603) grad_norm 1.4254 (inf) loss_scale 4096.0000 (4631.1069) mem 7379MB [2024-08-26 06:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][430/1251] eta 0:03:21 lr 0.000962 wd 0.0500 time 0.2430 (0.2449) data time 0.0013 (0.0021) model time 0.2418 (0.2430) loss 3.2248 (3.5547) grad_norm 1.2586 (inf) loss_scale 4096.0000 (4618.6914) mem 7379MB [2024-08-26 06:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][440/1251] eta 0:03:18 lr 0.000962 wd 0.0500 time 0.2320 (0.2447) data time 0.0008 (0.0021) model time 0.2312 (0.2429) loss 2.9314 
(3.5571) grad_norm 1.3908 (inf) loss_scale 4096.0000 (4606.8390) mem 7379MB [2024-08-26 06:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][450/1251] eta 0:03:16 lr 0.000962 wd 0.0500 time 0.2428 (0.2447) data time 0.0011 (0.0021) model time 0.2417 (0.2429) loss 3.9203 (3.5576) grad_norm 2.3416 (inf) loss_scale 4096.0000 (4595.5122) mem 7379MB [2024-08-26 06:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][460/1251] eta 0:03:13 lr 0.000962 wd 0.0500 time 0.2427 (0.2446) data time 0.0007 (0.0021) model time 0.2420 (0.2428) loss 4.0935 (3.5572) grad_norm 1.8369 (inf) loss_scale 4096.0000 (4584.6768) mem 7379MB [2024-08-26 06:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][470/1251] eta 0:03:11 lr 0.000962 wd 0.0500 time 0.2411 (0.2450) data time 0.0008 (0.0021) model time 0.2403 (0.2432) loss 3.3491 (3.5627) grad_norm 1.5018 (inf) loss_scale 4096.0000 (4574.3015) mem 7379MB [2024-08-26 06:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][480/1251] eta 0:03:09 lr 0.000962 wd 0.0500 time 0.2463 (0.2454) data time 0.0009 (0.0021) model time 0.2454 (0.2437) loss 3.5378 (3.5635) grad_norm 2.2983 (inf) loss_scale 4096.0000 (4564.3576) mem 7379MB [2024-08-26 06:58:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][490/1251] eta 0:03:07 lr 0.000961 wd 0.0500 time 0.2413 (0.2458) data time 0.0009 (0.0020) model time 0.2404 (0.2441) loss 3.9458 (3.5625) grad_norm 2.8408 (inf) loss_scale 4096.0000 (4554.8187) mem 7379MB [2024-08-26 06:58:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][500/1251] eta 0:03:04 lr 0.000961 wd 0.0500 time 0.2438 (0.2457) data time 0.0010 (0.0020) model time 0.2428 (0.2440) loss 3.5604 (3.5665) grad_norm 1.9410 (inf) loss_scale 4096.0000 (4545.6607) mem 7379MB [2024-08-26 06:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][510/1251] eta 0:03:01 lr 0.000961 wd 0.0500 time 0.2319 
(0.2456) data time 0.0010 (0.0020) model time 0.2309 (0.2440) loss 3.5438 (3.5634) grad_norm 2.1670 (inf) loss_scale 4096.0000 (4536.8611) mem 7379MB [2024-08-26 06:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][520/1251] eta 0:02:59 lr 0.000961 wd 0.0500 time 0.2355 (0.2455) data time 0.0007 (0.0020) model time 0.2347 (0.2439) loss 4.1959 (3.5669) grad_norm 2.5272 (inf) loss_scale 4096.0000 (4528.3992) mem 7379MB [2024-08-26 06:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][530/1251] eta 0:02:56 lr 0.000961 wd 0.0500 time 0.2481 (0.2455) data time 0.0009 (0.0020) model time 0.2472 (0.2439) loss 3.9454 (3.5686) grad_norm 2.8914 (inf) loss_scale 4096.0000 (4520.2561) mem 7379MB [2024-08-26 06:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][540/1251] eta 0:02:54 lr 0.000961 wd 0.0500 time 0.2390 (0.2454) data time 0.0011 (0.0019) model time 0.2379 (0.2438) loss 2.9705 (3.5679) grad_norm 2.3321 (inf) loss_scale 4096.0000 (4512.4140) mem 7379MB [2024-08-26 06:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][550/1251] eta 0:02:51 lr 0.000961 wd 0.0500 time 0.2457 (0.2453) data time 0.0012 (0.0019) model time 0.2445 (0.2437) loss 3.7432 (3.5668) grad_norm 2.5541 (inf) loss_scale 4096.0000 (4504.8566) mem 7379MB [2024-08-26 06:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][560/1251] eta 0:02:49 lr 0.000961 wd 0.0500 time 0.2370 (0.2453) data time 0.0012 (0.0019) model time 0.2358 (0.2436) loss 3.8066 (3.5710) grad_norm 1.9310 (inf) loss_scale 4096.0000 (4497.5686) mem 7379MB [2024-08-26 06:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][570/1251] eta 0:02:46 lr 0.000961 wd 0.0500 time 0.2400 (0.2452) data time 0.0008 (0.0019) model time 0.2393 (0.2436) loss 4.1826 (3.5717) grad_norm 2.2271 (inf) loss_scale 4096.0000 (4490.5359) mem 7379MB [2024-08-26 06:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [55/300][580/1251] eta 0:02:44 lr 0.000961 wd 0.0500 time 0.2438 (0.2451) data time 0.0011 (0.0019) model time 0.2427 (0.2435) loss 3.2028 (3.5702) grad_norm 2.0225 (inf) loss_scale 4096.0000 (4483.7453) mem 7379MB [2024-08-26 06:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][590/1251] eta 0:02:41 lr 0.000961 wd 0.0500 time 0.2462 (0.2450) data time 0.0007 (0.0019) model time 0.2455 (0.2434) loss 3.1052 (3.5706) grad_norm 1.6228 (inf) loss_scale 4096.0000 (4477.1844) mem 7379MB [2024-08-26 06:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][600/1251] eta 0:02:39 lr 0.000961 wd 0.0500 time 0.2441 (0.2450) data time 0.0009 (0.0019) model time 0.2432 (0.2434) loss 4.2499 (3.5764) grad_norm 3.2685 (inf) loss_scale 4096.0000 (4470.8419) mem 7379MB [2024-08-26 06:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][610/1251] eta 0:02:37 lr 0.000961 wd 0.0500 time 0.2474 (0.2449) data time 0.0007 (0.0019) model time 0.2466 (0.2433) loss 3.3695 (3.5797) grad_norm 2.5437 (inf) loss_scale 4096.0000 (4464.7070) mem 7379MB [2024-08-26 06:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][620/1251] eta 0:02:34 lr 0.000961 wd 0.0500 time 0.2416 (0.2449) data time 0.0009 (0.0019) model time 0.2408 (0.2432) loss 4.2110 (3.5833) grad_norm 1.9933 (inf) loss_scale 4096.0000 (4458.7697) mem 7379MB [2024-08-26 06:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][630/1251] eta 0:02:32 lr 0.000961 wd 0.0500 time 0.2423 (0.2448) data time 0.0009 (0.0018) model time 0.2415 (0.2432) loss 3.8272 (3.5821) grad_norm 1.5572 (inf) loss_scale 4096.0000 (4453.0206) mem 7379MB [2024-08-26 06:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][640/1251] eta 0:02:29 lr 0.000961 wd 0.0500 time 0.2479 (0.2448) data time 0.0010 (0.0018) model time 0.2469 (0.2432) loss 3.3963 (3.5818) grad_norm 1.7517 (inf) loss_scale 4096.0000 (4447.4509) mem 7379MB 
[2024-08-26 06:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][650/1251] eta 0:02:27 lr 0.000961 wd 0.0500 time 0.2434 (0.2447) data time 0.0011 (0.0018) model time 0.2422 (0.2431) loss 4.0313 (3.5788) grad_norm 2.1588 (inf) loss_scale 4096.0000 (4442.0522) mem 7379MB [2024-08-26 06:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][660/1251] eta 0:02:24 lr 0.000961 wd 0.0500 time 0.2361 (0.2447) data time 0.0009 (0.0018) model time 0.2352 (0.2431) loss 2.6351 (3.5797) grad_norm 1.7692 (inf) loss_scale 4096.0000 (4436.8169) mem 7379MB [2024-08-26 06:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][670/1251] eta 0:02:22 lr 0.000961 wd 0.0500 time 0.2403 (0.2446) data time 0.0012 (0.0018) model time 0.2391 (0.2430) loss 3.9673 (3.5824) grad_norm 1.9031 (inf) loss_scale 4096.0000 (4431.7377) mem 7379MB [2024-08-26 06:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][680/1251] eta 0:02:19 lr 0.000961 wd 0.0500 time 0.2380 (0.2446) data time 0.0007 (0.0018) model time 0.2373 (0.2430) loss 2.3797 (3.5790) grad_norm 1.9737 (inf) loss_scale 4096.0000 (4426.8076) mem 7379MB [2024-08-26 06:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][690/1251] eta 0:02:17 lr 0.000961 wd 0.0500 time 0.2371 (0.2445) data time 0.0009 (0.0018) model time 0.2362 (0.2429) loss 3.0926 (3.5797) grad_norm 1.6100 (inf) loss_scale 4096.0000 (4422.0203) mem 7379MB [2024-08-26 06:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][700/1251] eta 0:02:14 lr 0.000961 wd 0.0500 time 0.2432 (0.2444) data time 0.0009 (0.0018) model time 0.2423 (0.2428) loss 3.4925 (3.5823) grad_norm 1.8624 (inf) loss_scale 4096.0000 (4417.3695) mem 7379MB [2024-08-26 06:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][710/1251] eta 0:02:12 lr 0.000961 wd 0.0500 time 0.2448 (0.2447) data time 0.0008 (0.0018) model time 0.2440 (0.2431) loss 3.4350 
(3.5847) grad_norm 2.4254 (inf) loss_scale 4096.0000 (4412.8495) mem 7379MB [2024-08-26 06:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][720/1251] eta 0:02:09 lr 0.000961 wd 0.0500 time 0.2402 (0.2447) data time 0.0010 (0.0017) model time 0.2392 (0.2431) loss 3.8115 (3.5825) grad_norm 2.7282 (inf) loss_scale 4096.0000 (4408.4549) mem 7379MB [2024-08-26 06:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][730/1251] eta 0:02:07 lr 0.000961 wd 0.0500 time 0.2347 (0.2446) data time 0.0011 (0.0018) model time 0.2336 (0.2430) loss 3.2590 (3.5832) grad_norm 1.4089 (inf) loss_scale 4096.0000 (4404.1806) mem 7379MB [2024-08-26 06:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][740/1251] eta 0:02:04 lr 0.000961 wd 0.0500 time 0.2441 (0.2446) data time 0.0008 (0.0017) model time 0.2433 (0.2430) loss 4.2190 (3.5828) grad_norm 1.5413 (inf) loss_scale 4096.0000 (4400.0216) mem 7379MB [2024-08-26 06:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][750/1251] eta 0:02:02 lr 0.000961 wd 0.0500 time 0.2463 (0.2445) data time 0.0010 (0.0017) model time 0.2453 (0.2430) loss 3.2572 (3.5825) grad_norm 1.5218 (inf) loss_scale 4096.0000 (4395.9734) mem 7379MB [2024-08-26 06:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][760/1251] eta 0:02:00 lr 0.000961 wd 0.0500 time 0.2415 (0.2445) data time 0.0009 (0.0017) model time 0.2406 (0.2429) loss 3.5426 (3.5825) grad_norm 1.4319 (inf) loss_scale 4096.0000 (4392.0315) mem 7379MB [2024-08-26 06:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][770/1251] eta 0:01:57 lr 0.000961 wd 0.0500 time 0.2402 (0.2445) data time 0.0011 (0.0017) model time 0.2392 (0.2429) loss 2.7324 (3.5809) grad_norm 1.9452 (inf) loss_scale 4096.0000 (4388.1920) mem 7379MB [2024-08-26 06:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][780/1251] eta 0:01:55 lr 0.000961 wd 0.0500 time 0.2399 
(0.2444) data time 0.0008 (0.0017) model time 0.2390 (0.2429) loss 3.4689 (3.5812) grad_norm 1.4251 (inf) loss_scale 4096.0000 (4384.4507) mem 7379MB [2024-08-26 06:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][790/1251] eta 0:01:52 lr 0.000961 wd 0.0500 time 0.2341 (0.2444) data time 0.0009 (0.0017) model time 0.2332 (0.2429) loss 3.4279 (3.5835) grad_norm 2.1225 (inf) loss_scale 4096.0000 (4380.8040) mem 7379MB [2024-08-26 06:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][800/1251] eta 0:01:50 lr 0.000961 wd 0.0500 time 0.2350 (0.2444) data time 0.0011 (0.0017) model time 0.2339 (0.2428) loss 3.7468 (3.5817) grad_norm 2.0045 (inf) loss_scale 4096.0000 (4377.2484) mem 7379MB [2024-08-26 06:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][810/1251] eta 0:01:47 lr 0.000961 wd 0.0500 time 0.2408 (0.2443) data time 0.0011 (0.0017) model time 0.2397 (0.2428) loss 3.6758 (3.5839) grad_norm 1.7574 (inf) loss_scale 4096.0000 (4373.7805) mem 7379MB [2024-08-26 06:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][820/1251] eta 0:01:45 lr 0.000961 wd 0.0500 time 0.2423 (0.2443) data time 0.0007 (0.0017) model time 0.2416 (0.2428) loss 3.8924 (3.5858) grad_norm 1.6024 (inf) loss_scale 4096.0000 (4370.3971) mem 7379MB [2024-08-26 06:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][830/1251] eta 0:01:42 lr 0.000961 wd 0.0500 time 0.2392 (0.2445) data time 0.0010 (0.0017) model time 0.2382 (0.2430) loss 3.8710 (3.5893) grad_norm 1.5686 (inf) loss_scale 4096.0000 (4367.0951) mem 7379MB [2024-08-26 06:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][840/1251] eta 0:01:40 lr 0.000961 wd 0.0500 time 0.2376 (0.2445) data time 0.0011 (0.0017) model time 0.2364 (0.2430) loss 4.1147 (3.5910) grad_norm 2.0074 (inf) loss_scale 4096.0000 (4363.8716) mem 7379MB [2024-08-26 06:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [55/300][850/1251] eta 0:01:38 lr 0.000961 wd 0.0500 time 0.3669 (0.2446) data time 0.0011 (0.0017) model time 0.3658 (0.2431) loss 3.6237 (3.5867) grad_norm 2.3399 (inf) loss_scale 4096.0000 (4360.7239) mem 7379MB [2024-08-26 06:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][860/1251] eta 0:01:35 lr 0.000961 wd 0.0500 time 0.2399 (0.2448) data time 0.0011 (0.0017) model time 0.2388 (0.2433) loss 3.8348 (3.5861) grad_norm 1.4504 (inf) loss_scale 4096.0000 (4357.6492) mem 7379MB [2024-08-26 07:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][870/1251] eta 0:01:33 lr 0.000961 wd 0.0500 time 0.2454 (0.2448) data time 0.0007 (0.0017) model time 0.2447 (0.2433) loss 2.7813 (3.5869) grad_norm 1.6998 (inf) loss_scale 4096.0000 (4354.6452) mem 7379MB [2024-08-26 07:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][880/1251] eta 0:01:30 lr 0.000961 wd 0.0500 time 0.2402 (0.2451) data time 0.0009 (0.0016) model time 0.2394 (0.2436) loss 4.2899 (3.5841) grad_norm 2.2284 (inf) loss_scale 4096.0000 (4351.7094) mem 7379MB [2024-08-26 07:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][890/1251] eta 0:01:28 lr 0.000961 wd 0.0500 time 0.2358 (0.2450) data time 0.0009 (0.0016) model time 0.2349 (0.2435) loss 3.9320 (3.5849) grad_norm 3.1200 (inf) loss_scale 4096.0000 (4348.8395) mem 7379MB [2024-08-26 07:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][900/1251] eta 0:01:25 lr 0.000961 wd 0.0500 time 0.2347 (0.2450) data time 0.0007 (0.0016) model time 0.2340 (0.2435) loss 2.3570 (3.5802) grad_norm 2.1763 (inf) loss_scale 4096.0000 (4346.0333) mem 7379MB [2024-08-26 07:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][910/1251] eta 0:01:23 lr 0.000961 wd 0.0500 time 0.2404 (0.2449) data time 0.0010 (0.0016) model time 0.2394 (0.2435) loss 3.9315 (3.5829) grad_norm 2.0717 (inf) loss_scale 4096.0000 (4343.2887) mem 7379MB 
[2024-08-26 07:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][920/1251] eta 0:01:21 lr 0.000961 wd 0.0500 time 0.2442 (0.2449) data time 0.0009 (0.0016) model time 0.2432 (0.2434) loss 2.3832 (3.5824) grad_norm 1.7584 (inf) loss_scale 4096.0000 (4340.6037) mem 7379MB [2024-08-26 07:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][930/1251] eta 0:01:18 lr 0.000961 wd 0.0500 time 0.2454 (0.2449) data time 0.0010 (0.0016) model time 0.2443 (0.2434) loss 3.7830 (3.5835) grad_norm 1.5029 (inf) loss_scale 4096.0000 (4337.9764) mem 7379MB [2024-08-26 07:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][940/1251] eta 0:01:16 lr 0.000961 wd 0.0500 time 0.2459 (0.2449) data time 0.0010 (0.0016) model time 0.2448 (0.2434) loss 3.7732 (3.5831) grad_norm 1.4898 (inf) loss_scale 4096.0000 (4335.4049) mem 7379MB [2024-08-26 07:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][950/1251] eta 0:01:13 lr 0.000961 wd 0.0500 time 0.2463 (0.2448) data time 0.0009 (0.0016) model time 0.2454 (0.2434) loss 4.0397 (3.5843) grad_norm 3.9288 (inf) loss_scale 4096.0000 (4332.8875) mem 7379MB [2024-08-26 07:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][960/1251] eta 0:01:11 lr 0.000961 wd 0.0500 time 0.2412 (0.2448) data time 0.0012 (0.0016) model time 0.2399 (0.2434) loss 3.7020 (3.5851) grad_norm 1.3549 (inf) loss_scale 4096.0000 (4330.4225) mem 7379MB [2024-08-26 07:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][970/1251] eta 0:01:08 lr 0.000961 wd 0.0500 time 0.2425 (0.2448) data time 0.0009 (0.0016) model time 0.2416 (0.2433) loss 2.9852 (3.5846) grad_norm 1.2809 (inf) loss_scale 4096.0000 (4328.0082) mem 7379MB [2024-08-26 07:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][980/1251] eta 0:01:06 lr 0.000961 wd 0.0500 time 0.2455 (0.2448) data time 0.0009 (0.0016) model time 0.2446 (0.2433) loss 3.7244 
(3.5829) grad_norm 1.7215 (inf) loss_scale 4096.0000 (4325.6432) mem 7379MB [2024-08-26 07:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][990/1251] eta 0:01:03 lr 0.000961 wd 0.0500 time 0.2380 (0.2447) data time 0.0010 (0.0016) model time 0.2370 (0.2433) loss 3.2086 (3.5838) grad_norm 2.6088 (inf) loss_scale 4096.0000 (4323.3259) mem 7379MB [2024-08-26 07:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1000/1251] eta 0:01:01 lr 0.000961 wd 0.0500 time 0.2528 (0.2448) data time 0.0008 (0.0016) model time 0.2521 (0.2434) loss 3.0565 (3.5830) grad_norm 1.7381 (inf) loss_scale 4096.0000 (4321.0549) mem 7379MB [2024-08-26 07:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1010/1251] eta 0:00:59 lr 0.000961 wd 0.0500 time 0.2583 (0.2450) data time 0.0011 (0.0016) model time 0.2571 (0.2436) loss 4.1597 (3.5868) grad_norm 2.0457 (inf) loss_scale 4096.0000 (4318.8289) mem 7379MB [2024-08-26 07:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1020/1251] eta 0:00:56 lr 0.000961 wd 0.0500 time 0.2451 (0.2449) data time 0.0007 (0.0016) model time 0.2444 (0.2435) loss 3.7273 (3.5839) grad_norm 1.7819 (inf) loss_scale 4096.0000 (4316.6464) mem 7379MB [2024-08-26 07:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1030/1251] eta 0:00:54 lr 0.000961 wd 0.0500 time 0.2357 (0.2449) data time 0.0008 (0.0016) model time 0.2350 (0.2434) loss 2.7037 (3.5844) grad_norm 1.6770 (inf) loss_scale 4096.0000 (4314.5063) mem 7379MB [2024-08-26 07:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1040/1251] eta 0:00:51 lr 0.000961 wd 0.0500 time 0.2467 (0.2448) data time 0.0007 (0.0016) model time 0.2459 (0.2434) loss 3.1483 (3.5820) grad_norm 1.8453 (inf) loss_scale 4096.0000 (4312.4073) mem 7379MB [2024-08-26 07:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1050/1251] eta 0:00:49 lr 0.000961 wd 0.0500 time 
0.2357 (0.2447) data time 0.0007 (0.0016) model time 0.2350 (0.2433) loss 2.9974 (3.5810) grad_norm 1.6083 (inf) loss_scale 4096.0000 (4310.3482) mem 7379MB [2024-08-26 07:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1060/1251] eta 0:00:46 lr 0.000960 wd 0.0500 time 0.2372 (0.2447) data time 0.0009 (0.0016) model time 0.2363 (0.2432) loss 4.0240 (3.5770) grad_norm 2.1125 (inf) loss_scale 4096.0000 (4308.3280) mem 7379MB [2024-08-26 07:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1070/1251] eta 0:00:44 lr 0.000960 wd 0.0500 time 0.2294 (0.2446) data time 0.0008 (0.0015) model time 0.2286 (0.2432) loss 4.0041 (3.5771) grad_norm 2.2761 (inf) loss_scale 4096.0000 (4306.3455) mem 7379MB [2024-08-26 07:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1080/1251] eta 0:00:41 lr 0.000960 wd 0.0500 time 0.2345 (0.2445) data time 0.0011 (0.0015) model time 0.2335 (0.2431) loss 3.8546 (3.5785) grad_norm 1.7155 (inf) loss_scale 4096.0000 (4304.3996) mem 7379MB [2024-08-26 07:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1090/1251] eta 0:00:39 lr 0.000960 wd 0.0500 time 0.2372 (0.2445) data time 0.0011 (0.0015) model time 0.2361 (0.2430) loss 3.5385 (3.5769) grad_norm 2.3232 (inf) loss_scale 4096.0000 (4302.4895) mem 7379MB [2024-08-26 07:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1100/1251] eta 0:00:36 lr 0.000960 wd 0.0500 time 0.2429 (0.2444) data time 0.0008 (0.0015) model time 0.2421 (0.2430) loss 4.7670 (3.5766) grad_norm 2.1138 (inf) loss_scale 4096.0000 (4300.6140) mem 7379MB [2024-08-26 07:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1110/1251] eta 0:00:34 lr 0.000960 wd 0.0500 time 0.2386 (0.2444) data time 0.0009 (0.0015) model time 0.2378 (0.2430) loss 3.9154 (3.5762) grad_norm 2.1924 (inf) loss_scale 4096.0000 (4298.7723) mem 7379MB [2024-08-26 07:01:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [55/300][1120/1251] eta 0:00:32 lr 0.000960 wd 0.0500 time 0.2415 (0.2443) data time 0.0010 (0.0015) model time 0.2405 (0.2429) loss 3.2043 (3.5775) grad_norm 1.8222 (inf) loss_scale 4096.0000 (4296.9634) mem 7379MB [2024-08-26 07:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1130/1251] eta 0:00:29 lr 0.000960 wd 0.0500 time 0.2361 (0.2443) data time 0.0008 (0.0015) model time 0.2353 (0.2428) loss 3.4070 (3.5758) grad_norm 1.5583 (inf) loss_scale 4096.0000 (4295.1866) mem 7379MB [2024-08-26 07:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1140/1251] eta 0:00:27 lr 0.000960 wd 0.0500 time 0.2453 (0.2442) data time 0.0007 (0.0015) model time 0.2446 (0.2428) loss 4.0186 (3.5768) grad_norm 1.9311 (inf) loss_scale 4096.0000 (4293.4408) mem 7379MB [2024-08-26 07:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1150/1251] eta 0:00:24 lr 0.000960 wd 0.0500 time 0.2398 (0.2442) data time 0.0010 (0.0015) model time 0.2388 (0.2428) loss 2.7138 (3.5749) grad_norm 2.0410 (inf) loss_scale 4096.0000 (4291.7255) mem 7379MB [2024-08-26 07:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1160/1251] eta 0:00:22 lr 0.000960 wd 0.0500 time 0.2405 (0.2441) data time 0.0008 (0.0015) model time 0.2397 (0.2427) loss 2.3951 (3.5744) grad_norm 1.6555 (inf) loss_scale 4096.0000 (4290.0396) mem 7379MB [2024-08-26 07:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1170/1251] eta 0:00:19 lr 0.000960 wd 0.0500 time 0.2311 (0.2441) data time 0.0009 (0.0015) model time 0.2301 (0.2427) loss 3.7484 (3.5717) grad_norm 3.9659 (inf) loss_scale 4096.0000 (4288.3826) mem 7379MB [2024-08-26 07:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1180/1251] eta 0:00:17 lr 0.000960 wd 0.0500 time 0.2353 (0.2440) data time 0.0009 (0.0015) model time 0.2345 (0.2426) loss 2.7955 (3.5694) grad_norm 1.7432 (inf) 
loss_scale 4096.0000 (4286.7536) mem 7379MB [2024-08-26 07:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1190/1251] eta 0:00:14 lr 0.000960 wd 0.0500 time 0.2376 (0.2440) data time 0.0012 (0.0015) model time 0.2365 (0.2426) loss 4.0422 (3.5718) grad_norm 2.4512 (inf) loss_scale 4096.0000 (4285.1520) mem 7379MB [2024-08-26 07:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1200/1251] eta 0:00:12 lr 0.000960 wd 0.0500 time 0.2319 (0.2440) data time 0.0011 (0.0015) model time 0.2309 (0.2426) loss 2.6961 (3.5721) grad_norm 1.5456 (inf) loss_scale 4096.0000 (4283.5770) mem 7379MB [2024-08-26 07:01:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1210/1251] eta 0:00:10 lr 0.000960 wd 0.0500 time 0.2483 (0.2439) data time 0.0007 (0.0015) model time 0.2476 (0.2425) loss 3.3245 (3.5748) grad_norm 1.6017 (inf) loss_scale 4096.0000 (4282.0281) mem 7379MB [2024-08-26 07:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1220/1251] eta 0:00:07 lr 0.000960 wd 0.0500 time 0.2496 (0.2439) data time 0.0009 (0.0015) model time 0.2487 (0.2425) loss 3.8562 (3.5752) grad_norm 2.0818 (inf) loss_scale 4096.0000 (4280.5045) mem 7379MB [2024-08-26 07:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1230/1251] eta 0:00:05 lr 0.000960 wd 0.0500 time 0.2372 (0.2438) data time 0.0010 (0.0015) model time 0.2361 (0.2424) loss 3.8754 (3.5745) grad_norm 2.0363 (inf) loss_scale 4096.0000 (4279.0057) mem 7379MB [2024-08-26 07:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1240/1251] eta 0:00:02 lr 0.000960 wd 0.0500 time 0.2299 (0.2438) data time 0.0006 (0.0015) model time 0.2293 (0.2424) loss 3.7945 (3.5738) grad_norm 2.0403 (inf) loss_scale 4096.0000 (4277.5310) mem 7379MB [2024-08-26 07:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [55/300][1250/1251] eta 0:00:00 lr 0.000960 wd 0.0500 time 0.2251 (0.2440) data time 0.0007 
(0.0015) model time 0.2244 (0.2426) loss 3.3155 (3.5752) grad_norm 1.4700 (inf) loss_scale 4096.0000 (4276.0799) mem 7379MB
[2024-08-26 07:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 55 training takes 0:05:05
[2024-08-26 07:01:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 07:01:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 07:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.471 (0.471) Loss 0.5522 (0.5522) Acc@1 88.867 (88.867) Acc@5 97.852 (97.852) Mem 7379MB
[2024-08-26 07:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.112) Loss 0.9580 (0.8932) Acc@1 78.711 (80.247) Acc@5 95.605 (95.872) Mem 7379MB
[2024-08-26 07:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.096) Loss 1.2959 (0.9114) Acc@1 71.191 (79.501) Acc@5 90.430 (95.657) Mem 7379MB
[2024-08-26 07:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.095 (0.091) Loss 1.5449 (1.0429) Acc@1 64.648 (76.903) Acc@5 88.184 (93.955) Mem 7379MB
[2024-08-26 07:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.085) Loss 1.4961 (1.1186) Acc@1 66.211 (75.269) Acc@5 88.574 (93.033) Mem 7379MB
[2024-08-26 07:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.836 Acc@5 92.884
[2024-08-26 07:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 74.8%
[2024-08-26 07:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.961 (0.961) Loss 0.4729 (0.4729) Acc@1 90.039 (90.039) Acc@5 98.145 (98.145) Mem 7379MB
[2024-08-26 07:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.090 (0.160) Loss 0.7837 (0.7568) Acc@1 83.887 (83.301) Acc@5 95.898 (96.555) Mem 7379MB
[2024-08-26 07:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.121) Loss 1.0977 (0.7746) Acc@1 74.609 (82.296) Acc@5 92.188 (96.461) Mem 7379MB
[2024-08-26 07:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.106) Loss 1.3584 (0.8879) Acc@1 64.746 (79.486) Acc@5 88.672 (94.997) Mem 7379MB
[2024-08-26 07:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.097) Loss 1.2734 (0.9519) Acc@1 68.652 (77.887) Acc@5 90.234 (94.222) Mem 7379MB
[2024-08-26 07:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.498 Acc@5 94.154
[2024-08-26 07:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.5%
[2024-08-26 07:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.50%
[2024-08-26 07:01:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 07:01:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 07:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][0/1251] eta 0:14:50 lr 0.000960 wd 0.0500 time 0.7119 (0.7119) data time 0.4708 (0.4708) model time 0.0000 (0.0000) loss 2.9810 (2.9810) grad_norm 1.3501 (1.3501) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][10/1251] eta 0:06:14 lr 0.000960 wd 0.0500 time 0.2423 (0.3019) data time 0.0012 (0.0438) model time 0.0000 (0.0000) loss 3.8489 (3.3130) grad_norm 1.3630 (1.6238) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][20/1251] eta 0:05:33 lr 0.000960 wd 0.0500 time 0.2293 (0.2710) data time 0.0010 (0.0234) model time 0.0000 (0.0000) loss 2.3852 (3.3652) grad_norm 1.3857 (1.7863) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][30/1251] eta 0:05:26 lr 0.000960 wd 0.0500 time 0.2397 (0.2672) data time 0.0013 (0.0162) model time 0.0000 (0.0000) loss 3.3545 (3.3420) grad_norm 1.9715 (1.7631) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][40/1251] eta 0:05:14 lr 0.000960 wd 0.0500 time 0.2353 (0.2598) data time 0.0011 (0.0125) model time 0.0000 (0.0000) loss 3.0675 (3.3701) grad_norm 2.8178 (1.8337) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][50/1251] eta 0:05:07 lr 0.000960 wd 0.0500 time 0.2395 (0.2563) data time 0.0007 (0.0103) model time 0.0000 (0.0000) loss 3.6020 (3.4302) grad_norm 1.5031 (1.8136) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][60/1251] eta 0:05:01 lr 0.000960 wd 0.0500 time 0.2424 (0.2532) data time 0.0010 (0.0089) model time 0.2414 (0.2356) loss 
3.3843 (3.4725) grad_norm 1.4723 (1.7949) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][70/1251] eta 0:04:56 lr 0.000960 wd 0.0500 time 0.2350 (0.2510) data time 0.0011 (0.0078) model time 0.2340 (0.2359) loss 2.3419 (3.4067) grad_norm 4.0478 (1.8416) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][80/1251] eta 0:04:52 lr 0.000960 wd 0.0500 time 0.2375 (0.2495) data time 0.0010 (0.0070) model time 0.2365 (0.2364) loss 4.1206 (3.4125) grad_norm 1.2110 (1.8668) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][90/1251] eta 0:04:53 lr 0.000960 wd 0.0500 time 0.2507 (0.2530) data time 0.0010 (0.0063) model time 0.2497 (0.2474) loss 3.2316 (3.4220) grad_norm 2.7930 (1.8644) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][100/1251] eta 0:04:52 lr 0.000960 wd 0.0500 time 0.2472 (0.2539) data time 0.0009 (0.0058) model time 0.2463 (0.2502) loss 3.7580 (3.4640) grad_norm 1.7145 (1.8521) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][110/1251] eta 0:04:48 lr 0.000960 wd 0.0500 time 0.2313 (0.2525) data time 0.0008 (0.0054) model time 0.2304 (0.2479) loss 4.2266 (3.4760) grad_norm 1.6651 (1.8542) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][120/1251] eta 0:04:44 lr 0.000960 wd 0.0500 time 0.2572 (0.2513) data time 0.0009 (0.0050) model time 0.2563 (0.2463) loss 4.4415 (3.4823) grad_norm 1.8462 (1.8579) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][130/1251] eta 0:04:43 lr 
0.000960 wd 0.0500 time 0.2327 (0.2525) data time 0.0011 (0.0047) model time 0.2316 (0.2488) loss 3.5065 (3.4729) grad_norm 2.2084 (1.8608) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][140/1251] eta 0:04:39 lr 0.000960 wd 0.0500 time 0.2481 (0.2520) data time 0.0009 (0.0046) model time 0.2472 (0.2481) loss 2.7231 (3.4400) grad_norm 1.5482 (1.8585) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][150/1251] eta 0:04:38 lr 0.000960 wd 0.0500 time 0.2338 (0.2529) data time 0.0009 (0.0043) model time 0.2329 (0.2498) loss 2.9827 (3.4454) grad_norm 1.5842 (1.8527) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][160/1251] eta 0:04:34 lr 0.000960 wd 0.0500 time 0.2427 (0.2519) data time 0.0010 (0.0041) model time 0.2417 (0.2486) loss 2.6147 (3.4386) grad_norm 1.9142 (1.8493) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][170/1251] eta 0:04:31 lr 0.000960 wd 0.0500 time 0.2334 (0.2512) data time 0.0011 (0.0040) model time 0.2323 (0.2478) loss 3.2961 (3.4552) grad_norm 1.9016 (1.8622) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][180/1251] eta 0:04:28 lr 0.000960 wd 0.0500 time 0.2318 (0.2505) data time 0.0008 (0.0038) model time 0.2310 (0.2470) loss 4.2390 (3.4605) grad_norm 2.0465 (1.8807) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][190/1251] eta 0:04:25 lr 0.000960 wd 0.0500 time 0.2378 (0.2499) data time 0.0008 (0.0036) model time 0.2370 (0.2463) loss 4.5956 (3.4695) grad_norm 1.8670 (1.9028) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][200/1251] eta 0:04:21 lr 0.000960 wd 0.0500 time 0.2331 (0.2492) data time 0.0010 (0.0035) model time 0.2321 (0.2455) loss 3.8049 (3.4677) grad_norm 1.8571 (1.9192) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][210/1251] eta 0:04:19 lr 0.000960 wd 0.0500 time 0.2373 (0.2489) data time 0.0009 (0.0034) model time 0.2363 (0.2453) loss 3.8802 (3.4766) grad_norm 1.9517 (1.9253) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][220/1251] eta 0:04:16 lr 0.000960 wd 0.0500 time 0.2312 (0.2483) data time 0.0010 (0.0033) model time 0.2302 (0.2447) loss 3.3029 (3.4942) grad_norm 4.8923 (1.9506) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][230/1251] eta 0:04:13 lr 0.000960 wd 0.0500 time 0.2362 (0.2480) data time 0.0007 (0.0032) model time 0.2355 (0.2444) loss 4.0004 (3.5041) grad_norm 1.8534 (1.9474) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][240/1251] eta 0:04:10 lr 0.000960 wd 0.0500 time 0.2424 (0.2476) data time 0.0007 (0.0031) model time 0.2417 (0.2440) loss 2.3868 (3.5049) grad_norm 1.8303 (1.9380) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][250/1251] eta 0:04:07 lr 0.000960 wd 0.0500 time 0.2426 (0.2472) data time 0.0010 (0.0030) model time 0.2415 (0.2437) loss 3.6891 (3.5109) grad_norm 1.4594 (1.9359) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][260/1251] eta 0:04:04 lr 0.000960 wd 0.0500 time 0.2370 (0.2469) data time 0.0011 (0.0030) model time 0.2359 (0.2434) loss 3.7121 
(3.5017) grad_norm 1.6772 (1.9361) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][270/1251] eta 0:04:01 lr 0.000960 wd 0.0500 time 0.2313 (0.2466) data time 0.0013 (0.0029) model time 0.2300 (0.2431) loss 2.7977 (3.5056) grad_norm 1.4505 (1.9311) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][280/1251] eta 0:03:59 lr 0.000960 wd 0.0500 time 0.2364 (0.2462) data time 0.0009 (0.0028) model time 0.2355 (0.2428) loss 2.3129 (3.5061) grad_norm 2.0140 (1.9354) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][290/1251] eta 0:03:56 lr 0.000960 wd 0.0500 time 0.2388 (0.2459) data time 0.0011 (0.0028) model time 0.2378 (0.2425) loss 3.5687 (3.5120) grad_norm 1.7750 (1.9347) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][300/1251] eta 0:03:53 lr 0.000960 wd 0.0500 time 0.2385 (0.2456) data time 0.0011 (0.0027) model time 0.2375 (0.2422) loss 3.5894 (3.5151) grad_norm 1.8925 (1.9287) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][310/1251] eta 0:03:51 lr 0.000960 wd 0.0500 time 0.2503 (0.2455) data time 0.0008 (0.0027) model time 0.2495 (0.2422) loss 2.5893 (3.5132) grad_norm 1.5984 (1.9251) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][320/1251] eta 0:03:48 lr 0.000960 wd 0.0500 time 0.2350 (0.2453) data time 0.0011 (0.0026) model time 0.2339 (0.2420) loss 2.9248 (3.5125) grad_norm 2.0824 (1.9336) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][330/1251] eta 0:03:45 lr 0.000960 wd 
0.0500 time 0.2405 (0.2451) data time 0.0008 (0.0026) model time 0.2397 (0.2419) loss 3.7514 (3.5194) grad_norm 1.3130 (1.9278) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][340/1251] eta 0:03:43 lr 0.000960 wd 0.0500 time 0.2352 (0.2449) data time 0.0009 (0.0025) model time 0.2343 (0.2417) loss 3.8250 (3.5234) grad_norm 2.2176 (1.9283) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][350/1251] eta 0:03:40 lr 0.000960 wd 0.0500 time 0.2359 (0.2447) data time 0.0007 (0.0025) model time 0.2352 (0.2415) loss 4.3040 (3.5237) grad_norm 2.1734 (1.9317) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][360/1251] eta 0:03:37 lr 0.000960 wd 0.0500 time 0.2399 (0.2445) data time 0.0008 (0.0025) model time 0.2391 (0.2413) loss 2.7506 (3.5165) grad_norm 2.3027 (1.9340) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][370/1251] eta 0:03:35 lr 0.000960 wd 0.0500 time 0.2392 (0.2443) data time 0.0010 (0.0024) model time 0.2382 (0.2413) loss 3.8787 (3.5198) grad_norm 1.6035 (1.9297) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][380/1251] eta 0:03:32 lr 0.000960 wd 0.0500 time 0.2446 (0.2442) data time 0.0010 (0.0024) model time 0.2436 (0.2412) loss 4.1429 (3.5216) grad_norm 1.4770 (1.9259) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][390/1251] eta 0:03:30 lr 0.000959 wd 0.0500 time 0.2392 (0.2441) data time 0.0009 (0.0023) model time 0.2382 (0.2411) loss 3.6433 (3.5259) grad_norm 1.5977 (1.9250) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][400/1251] eta 0:03:27 lr 0.000959 wd 0.0500 time 0.2453 (0.2439) data time 0.0010 (0.0023) model time 0.2443 (0.2409) loss 1.9751 (3.5221) grad_norm 1.8601 (1.9273) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][410/1251] eta 0:03:25 lr 0.000959 wd 0.0500 time 0.2356 (0.2438) data time 0.0010 (0.0023) model time 0.2347 (0.2409) loss 3.8331 (3.5226) grad_norm 1.5956 (1.9339) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][420/1251] eta 0:03:22 lr 0.000959 wd 0.0500 time 0.2352 (0.2436) data time 0.0009 (0.0023) model time 0.2344 (0.2407) loss 3.8856 (3.5251) grad_norm 3.5088 (1.9397) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][430/1251] eta 0:03:19 lr 0.000959 wd 0.0500 time 0.2381 (0.2435) data time 0.0011 (0.0022) model time 0.2371 (0.2406) loss 3.7229 (3.5283) grad_norm 1.4891 (1.9424) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][440/1251] eta 0:03:17 lr 0.000959 wd 0.0500 time 0.2363 (0.2434) data time 0.0008 (0.0022) model time 0.2354 (0.2405) loss 4.1633 (3.5331) grad_norm 1.8450 (1.9450) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][450/1251] eta 0:03:14 lr 0.000959 wd 0.0500 time 0.2382 (0.2432) data time 0.0010 (0.0022) model time 0.2372 (0.2404) loss 4.0571 (3.5349) grad_norm 1.5015 (1.9456) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][460/1251] eta 0:03:12 lr 0.000959 wd 0.0500 time 0.2348 (0.2431) data time 0.0011 (0.0022) model time 0.2337 (0.2403) loss 4.2386 
(3.5405) grad_norm 1.8898 (1.9421) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][470/1251] eta 0:03:09 lr 0.000959 wd 0.0500 time 0.2366 (0.2429) data time 0.0009 (0.0021) model time 0.2357 (0.2402) loss 3.4572 (3.5442) grad_norm 1.4646 (1.9344) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][480/1251] eta 0:03:07 lr 0.000959 wd 0.0500 time 0.2346 (0.2428) data time 0.0009 (0.0021) model time 0.2337 (0.2400) loss 3.2415 (3.5420) grad_norm 2.0562 (1.9359) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][490/1251] eta 0:03:04 lr 0.000959 wd 0.0500 time 0.2366 (0.2427) data time 0.0009 (0.0021) model time 0.2357 (0.2399) loss 3.9921 (3.5448) grad_norm 2.6001 (1.9358) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][500/1251] eta 0:03:02 lr 0.000959 wd 0.0500 time 0.2411 (0.2426) data time 0.0010 (0.0021) model time 0.2401 (0.2399) loss 2.5703 (3.5481) grad_norm 1.8276 (1.9343) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][510/1251] eta 0:02:59 lr 0.000959 wd 0.0500 time 0.2397 (0.2425) data time 0.0008 (0.0020) model time 0.2389 (0.2399) loss 3.1331 (3.5467) grad_norm 3.0204 (1.9357) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][520/1251] eta 0:02:57 lr 0.000959 wd 0.0500 time 0.4516 (0.2432) data time 0.0009 (0.0020) model time 0.4506 (0.2406) loss 3.8721 (3.5506) grad_norm 1.6316 (1.9311) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][530/1251] eta 0:02:55 lr 0.000959 wd 
0.0500 time 0.2303 (0.2431) data time 0.0011 (0.0020) model time 0.2292 (0.2405) loss 3.0638 (3.5542) grad_norm 2.1825 (1.9296) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][540/1251] eta 0:02:52 lr 0.000959 wd 0.0500 time 0.2327 (0.2430) data time 0.0011 (0.0020) model time 0.2316 (0.2404) loss 3.2673 (3.5510) grad_norm 2.4873 (1.9318) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][550/1251] eta 0:02:50 lr 0.000959 wd 0.0500 time 0.2326 (0.2432) data time 0.0009 (0.0020) model time 0.2317 (0.2407) loss 3.7643 (3.5528) grad_norm 1.6806 (1.9403) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][560/1251] eta 0:02:47 lr 0.000959 wd 0.0500 time 0.2270 (0.2431) data time 0.0009 (0.0019) model time 0.2261 (0.2406) loss 3.5128 (3.5572) grad_norm 2.6166 (1.9440) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][570/1251] eta 0:02:45 lr 0.000959 wd 0.0500 time 0.2413 (0.2430) data time 0.0008 (0.0019) model time 0.2406 (0.2406) loss 2.5889 (3.5593) grad_norm 2.4834 (1.9475) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][580/1251] eta 0:02:43 lr 0.000959 wd 0.0500 time 0.2451 (0.2430) data time 0.0007 (0.0019) model time 0.2444 (0.2406) loss 4.5441 (3.5576) grad_norm 2.4214 (1.9493) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][590/1251] eta 0:02:40 lr 0.000959 wd 0.0500 time 0.2358 (0.2429) data time 0.0008 (0.0019) model time 0.2349 (0.2405) loss 4.3389 (3.5610) grad_norm 1.8310 (1.9501) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:08 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][600/1251] eta 0:02:38 lr 0.000959 wd 0.0500 time 0.2375 (0.2428) data time 0.0007 (0.0019) model time 0.2368 (0.2405) loss 4.2418 (3.5618) grad_norm 3.8366 (1.9482) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][610/1251] eta 0:02:35 lr 0.000959 wd 0.0500 time 0.2391 (0.2427) data time 0.0010 (0.0019) model time 0.2381 (0.2404) loss 4.0137 (3.5668) grad_norm 2.4031 (1.9478) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][620/1251] eta 0:02:33 lr 0.000959 wd 0.0500 time 0.2339 (0.2427) data time 0.0010 (0.0019) model time 0.2329 (0.2404) loss 3.7466 (3.5662) grad_norm 1.6198 (1.9468) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][630/1251] eta 0:02:30 lr 0.000959 wd 0.0500 time 0.2426 (0.2427) data time 0.0009 (0.0018) model time 0.2417 (0.2404) loss 3.9514 (3.5643) grad_norm 2.3456 (1.9463) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][640/1251] eta 0:02:28 lr 0.000959 wd 0.0500 time 0.2330 (0.2426) data time 0.0008 (0.0018) model time 0.2322 (0.2403) loss 4.3944 (3.5724) grad_norm 2.3792 (1.9439) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][650/1251] eta 0:02:26 lr 0.000959 wd 0.0500 time 0.2400 (0.2433) data time 0.0010 (0.0018) model time 0.2390 (0.2410) loss 3.3858 (3.5698) grad_norm 1.7545 (1.9423) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][660/1251] eta 0:02:23 lr 0.000959 wd 0.0500 time 0.2411 (0.2432) data time 0.0008 (0.0018) model time 0.2403 (0.2410) loss 2.9474 
(3.5680) grad_norm 1.9430 (1.9409) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][670/1251] eta 0:02:21 lr 0.000959 wd 0.0500 time 0.2404 (0.2434) data time 0.0009 (0.0018) model time 0.2395 (0.2413) loss 4.4625 (3.5711) grad_norm 1.7593 (1.9408) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][680/1251] eta 0:02:19 lr 0.000959 wd 0.0500 time 0.2296 (0.2436) data time 0.0011 (0.0018) model time 0.2285 (0.2414) loss 4.1490 (3.5713) grad_norm 1.9994 (1.9406) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][690/1251] eta 0:02:16 lr 0.000959 wd 0.0500 time 0.2324 (0.2435) data time 0.0011 (0.0018) model time 0.2314 (0.2414) loss 3.5263 (3.5703) grad_norm 1.5888 (1.9424) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][700/1251] eta 0:02:14 lr 0.000959 wd 0.0500 time 0.2377 (0.2434) data time 0.0009 (0.0018) model time 0.2368 (0.2413) loss 3.3808 (3.5705) grad_norm 1.8433 (1.9403) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][710/1251] eta 0:02:11 lr 0.000959 wd 0.0500 time 0.2410 (0.2433) data time 0.0009 (0.0018) model time 0.2401 (0.2412) loss 2.6282 (3.5669) grad_norm 1.5724 (1.9373) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][720/1251] eta 0:02:09 lr 0.000959 wd 0.0500 time 0.2387 (0.2432) data time 0.0008 (0.0017) model time 0.2379 (0.2411) loss 4.4008 (3.5682) grad_norm 1.6920 (1.9350) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][730/1251] eta 0:02:06 lr 0.000959 wd 
0.0500 time 0.2340 (0.2431) data time 0.0007 (0.0017) model time 0.2332 (0.2410) loss 3.7826 (3.5708) grad_norm 1.6272 (1.9351) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][740/1251] eta 0:02:04 lr 0.000959 wd 0.0500 time 0.2318 (0.2431) data time 0.0010 (0.0017) model time 0.2308 (0.2410) loss 3.1315 (3.5691) grad_norm 2.4122 (1.9353) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][750/1251] eta 0:02:01 lr 0.000959 wd 0.0500 time 0.2358 (0.2430) data time 0.0011 (0.0017) model time 0.2347 (0.2409) loss 3.2438 (3.5672) grad_norm 1.6002 (1.9357) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][760/1251] eta 0:01:59 lr 0.000959 wd 0.0500 time 0.2395 (0.2429) data time 0.0007 (0.0017) model time 0.2387 (0.2409) loss 2.3686 (3.5640) grad_norm 2.0111 (1.9361) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][770/1251] eta 0:01:56 lr 0.000959 wd 0.0500 time 0.2375 (0.2429) data time 0.0008 (0.0017) model time 0.2367 (0.2408) loss 4.2444 (3.5635) grad_norm 1.7365 (1.9405) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][780/1251] eta 0:01:54 lr 0.000959 wd 0.0500 time 0.2420 (0.2428) data time 0.0007 (0.0017) model time 0.2412 (0.2408) loss 4.4064 (3.5636) grad_norm 1.5811 (1.9376) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][790/1251] eta 0:01:51 lr 0.000959 wd 0.0500 time 0.2331 (0.2427) data time 0.0007 (0.0017) model time 0.2324 (0.2407) loss 2.7661 (3.5626) grad_norm 1.6713 (1.9338) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:56 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][800/1251] eta 0:01:49 lr 0.000959 wd 0.0500 time 0.2325 (0.2426) data time 0.0008 (0.0017) model time 0.2317 (0.2406) loss 3.6098 (3.5635) grad_norm 2.0106 (1.9337) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][810/1251] eta 0:01:46 lr 0.000959 wd 0.0500 time 0.2457 (0.2426) data time 0.0007 (0.0017) model time 0.2450 (0.2406) loss 3.3811 (3.5648) grad_norm 1.8324 (1.9304) loss_scale 8192.0000 (4131.3539) mem 7379MB [2024-08-26 07:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][820/1251] eta 0:01:44 lr 0.000959 wd 0.0500 time 0.2589 (0.2426) data time 0.0009 (0.0017) model time 0.2579 (0.2406) loss 3.5647 (3.5654) grad_norm 1.9370 (1.9295) loss_scale 8192.0000 (4180.8136) mem 7379MB [2024-08-26 07:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][830/1251] eta 0:01:42 lr 0.000959 wd 0.0500 time 0.2510 (0.2426) data time 0.0009 (0.0016) model time 0.2501 (0.2406) loss 3.8237 (3.5632) grad_norm 1.6919 (1.9316) loss_scale 8192.0000 (4229.0830) mem 7379MB [2024-08-26 07:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][840/1251] eta 0:01:39 lr 0.000959 wd 0.0500 time 0.2385 (0.2426) data time 0.0010 (0.0016) model time 0.2375 (0.2406) loss 3.4115 (3.5647) grad_norm 1.8252 (1.9340) loss_scale 8192.0000 (4276.2045) mem 7379MB [2024-08-26 07:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][850/1251] eta 0:01:37 lr 0.000959 wd 0.0500 time 0.2427 (0.2426) data time 0.0010 (0.0016) model time 0.2418 (0.2406) loss 3.7259 (3.5655) grad_norm 2.6915 (1.9338) loss_scale 8192.0000 (4322.2186) mem 7379MB [2024-08-26 07:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][860/1251] eta 0:01:34 lr 0.000959 wd 0.0500 time 0.2423 (0.2426) data time 0.0008 (0.0016) model time 0.2415 (0.2406) loss 4.6226 
(3.5663) grad_norm 2.1729 (1.9388) loss_scale 8192.0000 (4367.1638) mem 7379MB [2024-08-26 07:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][870/1251] eta 0:01:32 lr 0.000959 wd 0.0500 time 0.2370 (0.2425) data time 0.0011 (0.0016) model time 0.2359 (0.2406) loss 3.8703 (3.5661) grad_norm 2.5039 (1.9448) loss_scale 8192.0000 (4411.0769) mem 7379MB [2024-08-26 07:05:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][880/1251] eta 0:01:29 lr 0.000959 wd 0.0500 time 0.2361 (0.2425) data time 0.0009 (0.0016) model time 0.2352 (0.2406) loss 3.9773 (3.5670) grad_norm 1.7569 (1.9444) loss_scale 8192.0000 (4453.9932) mem 7379MB [2024-08-26 07:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][890/1251] eta 0:01:27 lr 0.000959 wd 0.0500 time 0.2404 (0.2425) data time 0.0011 (0.0016) model time 0.2393 (0.2406) loss 3.1394 (3.5641) grad_norm 1.9839 (1.9445) loss_scale 8192.0000 (4495.9461) mem 7379MB [2024-08-26 07:05:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][900/1251] eta 0:01:25 lr 0.000959 wd 0.0500 time 0.2304 (0.2424) data time 0.0007 (0.0016) model time 0.2297 (0.2405) loss 3.8884 (3.5635) grad_norm 3.2024 (1.9500) loss_scale 8192.0000 (4536.9678) mem 7379MB [2024-08-26 07:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][910/1251] eta 0:01:22 lr 0.000959 wd 0.0500 time 0.2381 (0.2424) data time 0.0010 (0.0016) model time 0.2371 (0.2405) loss 3.9614 (3.5677) grad_norm 1.8976 (1.9503) loss_scale 8192.0000 (4577.0889) mem 7379MB [2024-08-26 07:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][920/1251] eta 0:01:20 lr 0.000959 wd 0.0500 time 0.2429 (0.2423) data time 0.0008 (0.0016) model time 0.2421 (0.2404) loss 2.6380 (3.5672) grad_norm 2.0731 (1.9492) loss_scale 8192.0000 (4616.3388) mem 7379MB [2024-08-26 07:05:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][930/1251] eta 0:01:17 lr 0.000959 wd 
0.0500 time 0.2335 (0.2423) data time 0.0012 (0.0016) model time 0.2323 (0.2404) loss 3.3201 (3.5657) grad_norm 1.3526 (1.9473) loss_scale 8192.0000 (4654.7454) mem 7379MB [2024-08-26 07:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][940/1251] eta 0:01:15 lr 0.000959 wd 0.0500 time 0.2380 (0.2425) data time 0.0009 (0.0016) model time 0.2372 (0.2406) loss 2.9315 (3.5658) grad_norm 1.7730 (1.9472) loss_scale 8192.0000 (4692.3358) mem 7379MB [2024-08-26 07:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][950/1251] eta 0:01:12 lr 0.000958 wd 0.0500 time 0.2391 (0.2425) data time 0.0012 (0.0016) model time 0.2379 (0.2406) loss 3.7867 (3.5669) grad_norm 1.6684 (1.9457) loss_scale 8192.0000 (4729.1356) mem 7379MB [2024-08-26 07:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][960/1251] eta 0:01:10 lr 0.000958 wd 0.0500 time 0.2384 (0.2424) data time 0.0008 (0.0016) model time 0.2376 (0.2406) loss 4.1833 (3.5654) grad_norm 1.7320 (1.9429) loss_scale 8192.0000 (4765.1696) mem 7379MB [2024-08-26 07:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][970/1251] eta 0:01:08 lr 0.000958 wd 0.0500 time 0.2399 (0.2424) data time 0.0011 (0.0016) model time 0.2388 (0.2405) loss 3.6260 (3.5653) grad_norm 1.9585 (1.9415) loss_scale 8192.0000 (4800.4614) mem 7379MB [2024-08-26 07:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][980/1251] eta 0:01:05 lr 0.000958 wd 0.0500 time 0.2424 (0.2424) data time 0.0010 (0.0016) model time 0.2414 (0.2405) loss 3.6409 (3.5614) grad_norm 2.3246 (1.9399) loss_scale 8192.0000 (4835.0336) mem 7379MB [2024-08-26 07:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][990/1251] eta 0:01:03 lr 0.000958 wd 0.0500 time 0.2294 (0.2423) data time 0.0012 (0.0015) model time 0.2282 (0.2405) loss 2.6581 (3.5624) grad_norm 1.6670 (1.9389) loss_scale 8192.0000 (4868.9082) mem 7379MB [2024-08-26 07:05:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1000/1251] eta 0:01:00 lr 0.000958 wd 0.0500 time 0.2456 (0.2423) data time 0.0007 (0.0015) model time 0.2448 (0.2405) loss 2.9168 (3.5628) grad_norm 1.9522 (1.9363) loss_scale 8192.0000 (4902.1059) mem 7379MB [2024-08-26 07:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1010/1251] eta 0:00:58 lr 0.000958 wd 0.0500 time 0.2443 (0.2423) data time 0.0010 (0.0015) model time 0.2433 (0.2405) loss 3.6623 (3.5608) grad_norm 1.5768 (1.9380) loss_scale 8192.0000 (4934.6469) mem 7379MB [2024-08-26 07:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1020/1251] eta 0:00:56 lr 0.000958 wd 0.0500 time 0.2333 (0.2425) data time 0.0009 (0.0015) model time 0.2324 (0.2407) loss 2.9669 (3.5568) grad_norm 1.6914 (1.9426) loss_scale 8192.0000 (4966.5504) mem 7379MB [2024-08-26 07:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1030/1251] eta 0:00:53 lr 0.000958 wd 0.0500 time 0.2367 (0.2427) data time 0.0010 (0.0015) model time 0.2358 (0.2409) loss 4.2173 (3.5560) grad_norm 1.4824 (1.9437) loss_scale 8192.0000 (4997.8351) mem 7379MB [2024-08-26 07:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1040/1251] eta 0:00:51 lr 0.000958 wd 0.0500 time 0.2530 (0.2429) data time 0.0007 (0.0015) model time 0.2524 (0.2411) loss 4.1383 (3.5584) grad_norm 1.9359 (1.9425) loss_scale 8192.0000 (5028.5187) mem 7379MB [2024-08-26 07:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1050/1251] eta 0:00:48 lr 0.000958 wd 0.0500 time 0.2458 (0.2429) data time 0.0011 (0.0015) model time 0.2447 (0.2411) loss 4.1678 (3.5624) grad_norm 1.6369 (1.9420) loss_scale 8192.0000 (5058.6185) mem 7379MB [2024-08-26 07:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1060/1251] eta 0:00:46 lr 0.000958 wd 0.0500 time 0.2361 (0.2428) data time 0.0009 (0.0015) model time 0.2352 (0.2411) loss 2.8110 
(3.5628) grad_norm 3.2221 (1.9420) loss_scale 8192.0000 (5088.1508) mem 7379MB [2024-08-26 07:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1070/1251] eta 0:00:43 lr 0.000958 wd 0.0500 time 0.2371 (0.2429) data time 0.0011 (0.0015) model time 0.2361 (0.2411) loss 3.3123 (3.5668) grad_norm 1.3564 (1.9433) loss_scale 8192.0000 (5117.1317) mem 7379MB [2024-08-26 07:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1080/1251] eta 0:00:41 lr 0.000958 wd 0.0500 time 0.2379 (0.2430) data time 0.0013 (0.0015) model time 0.2366 (0.2413) loss 2.8676 (3.5654) grad_norm 1.6006 (1.9409) loss_scale 8192.0000 (5145.5763) mem 7379MB [2024-08-26 07:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1090/1251] eta 0:00:39 lr 0.000958 wd 0.0500 time 0.2387 (0.2430) data time 0.0010 (0.0015) model time 0.2378 (0.2413) loss 3.4662 (3.5674) grad_norm 1.5144 (1.9383) loss_scale 8192.0000 (5173.4995) mem 7379MB [2024-08-26 07:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1100/1251] eta 0:00:36 lr 0.000958 wd 0.0500 time 0.2462 (0.2430) data time 0.0008 (0.0015) model time 0.2454 (0.2412) loss 2.9342 (3.5695) grad_norm 1.4475 (1.9371) loss_scale 8192.0000 (5200.9155) mem 7379MB [2024-08-26 07:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1110/1251] eta 0:00:34 lr 0.000958 wd 0.0500 time 0.2400 (0.2429) data time 0.0007 (0.0015) model time 0.2393 (0.2412) loss 3.7435 (3.5696) grad_norm 1.8207 (1.9395) loss_scale 8192.0000 (5227.8380) mem 7379MB [2024-08-26 07:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1120/1251] eta 0:00:31 lr 0.000958 wd 0.0500 time 0.2313 (0.2429) data time 0.0011 (0.0015) model time 0.2302 (0.2412) loss 4.1638 (3.5701) grad_norm 1.9389 (1.9394) loss_scale 8192.0000 (5254.2801) mem 7379MB [2024-08-26 07:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1130/1251] eta 0:00:29 lr 
0.000958 wd 0.0500 time 0.2460 (0.2428) data time 0.0009 (0.0015) model time 0.2450 (0.2411) loss 3.7989 (3.5710) grad_norm 1.5905 (1.9386) loss_scale 8192.0000 (5280.2546) mem 7379MB [2024-08-26 07:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1140/1251] eta 0:00:26 lr 0.000958 wd 0.0500 time 0.2358 (0.2428) data time 0.0009 (0.0015) model time 0.2349 (0.2411) loss 3.8631 (3.5708) grad_norm 2.1628 (1.9400) loss_scale 8192.0000 (5305.7739) mem 7379MB [2024-08-26 07:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1150/1251] eta 0:00:24 lr 0.000958 wd 0.0500 time 0.2359 (0.2427) data time 0.0009 (0.0015) model time 0.2350 (0.2410) loss 2.2560 (3.5697) grad_norm 2.9872 (1.9392) loss_scale 8192.0000 (5330.8497) mem 7379MB [2024-08-26 07:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1160/1251] eta 0:00:22 lr 0.000958 wd 0.0500 time 0.2345 (0.2427) data time 0.0011 (0.0015) model time 0.2334 (0.2410) loss 4.1108 (3.5733) grad_norm 1.9095 (1.9372) loss_scale 8192.0000 (5355.4935) mem 7379MB [2024-08-26 07:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1170/1251] eta 0:00:19 lr 0.000958 wd 0.0500 time 0.2296 (0.2427) data time 0.0009 (0.0015) model time 0.2287 (0.2410) loss 1.8351 (3.5714) grad_norm 2.0109 (1.9398) loss_scale 8192.0000 (5379.7165) mem 7379MB [2024-08-26 07:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1180/1251] eta 0:00:17 lr 0.000958 wd 0.0500 time 0.2343 (0.2426) data time 0.0010 (0.0015) model time 0.2333 (0.2409) loss 3.4624 (3.5721) grad_norm 2.3496 (1.9432) loss_scale 8192.0000 (5403.5292) mem 7379MB [2024-08-26 07:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1190/1251] eta 0:00:14 lr 0.000958 wd 0.0500 time 0.2363 (0.2425) data time 0.0009 (0.0015) model time 0.2354 (0.2409) loss 3.5189 (3.5697) grad_norm 1.5901 (1.9436) loss_scale 8192.0000 (5426.9421) mem 7379MB [2024-08-26 
07:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1200/1251] eta 0:00:12 lr 0.000958 wd 0.0500 time 0.2467 (0.2425) data time 0.0012 (0.0015) model time 0.2456 (0.2408) loss 3.8644 (3.5706) grad_norm 1.7475 (1.9445) loss_scale 8192.0000 (5449.9650) mem 7379MB [2024-08-26 07:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1210/1251] eta 0:00:09 lr 0.000958 wd 0.0500 time 0.2333 (0.2425) data time 0.0007 (0.0015) model time 0.2326 (0.2408) loss 4.1239 (3.5723) grad_norm 1.3009 (1.9428) loss_scale 8192.0000 (5472.6078) mem 7379MB [2024-08-26 07:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1220/1251] eta 0:00:07 lr 0.000958 wd 0.0500 time 0.2358 (0.2424) data time 0.0009 (0.0015) model time 0.2348 (0.2408) loss 4.9191 (3.5755) grad_norm 1.7437 (1.9436) loss_scale 8192.0000 (5494.8796) mem 7379MB [2024-08-26 07:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1230/1251] eta 0:00:05 lr 0.000958 wd 0.0500 time 0.2387 (0.2424) data time 0.0011 (0.0015) model time 0.2376 (0.2407) loss 3.7747 (3.5760) grad_norm 1.7466 (1.9417) loss_scale 8192.0000 (5516.7896) mem 7379MB [2024-08-26 07:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1240/1251] eta 0:00:02 lr 0.000958 wd 0.0500 time 0.2239 (0.2423) data time 0.0005 (0.0015) model time 0.2234 (0.2406) loss 3.3160 (3.5767) grad_norm 3.8728 (1.9443) loss_scale 8192.0000 (5538.3465) mem 7379MB [2024-08-26 07:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [56/300][1250/1251] eta 0:00:00 lr 0.000958 wd 0.0500 time 0.2253 (0.2422) data time 0.0007 (0.0014) model time 0.2246 (0.2405) loss 3.9683 (3.5758) grad_norm 1.8762 (1.9443) loss_scale 8192.0000 (5559.5588) mem 7379MB [2024-08-26 07:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 56 training takes 0:05:02 [2024-08-26 07:06:45 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:06:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.516 (0.516) Loss 0.5298 (0.5298) Acc@1 89.844 (89.844) Acc@5 97.852 (97.852) Mem 7379MB [2024-08-26 07:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.117) Loss 0.8882 (0.8569) Acc@1 81.250 (81.161) Acc@5 95.215 (95.748) Mem 7379MB [2024-08-26 07:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.099) Loss 1.2451 (0.8730) Acc@1 71.289 (80.148) Acc@5 91.016 (95.624) Mem 7379MB [2024-08-26 07:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.092) Loss 1.5156 (1.0077) Acc@1 65.039 (77.123) Acc@5 86.328 (93.863) Mem 7379MB [2024-08-26 07:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.3604 (1.0751) Acc@1 68.457 (75.519) Acc@5 89.844 (93.009) Mem 7379MB [2024-08-26 07:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.218 Acc@5 92.928 [2024-08-26 07:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.2% [2024-08-26 07:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 75.22% [2024-08-26 07:06:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 07:06:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 07:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.496 (0.496) Loss 0.4724 (0.4724) Acc@1 90.039 (90.039) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 07:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.122) Loss 0.7832 (0.7548) Acc@1 83.984 (83.416) Acc@5 95.898 (96.591) Mem 7379MB [2024-08-26 07:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.102) Loss 1.0928 (0.7729) Acc@1 74.512 (82.334) Acc@5 92.285 (96.456) Mem 7379MB [2024-08-26 07:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.094) Loss 1.3574 (0.8859) Acc@1 65.137 (79.580) Acc@5 89.453 (95.007) Mem 7379MB [2024-08-26 07:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.2686 (0.9495) Acc@1 68.555 (77.970) Acc@5 90.137 (94.250) Mem 7379MB [2024-08-26 07:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.590 Acc@5 94.178 [2024-08-26 07:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.6% [2024-08-26 07:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.59% [2024-08-26 07:06:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:06:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 07:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][0/1251] eta 0:16:15 lr 0.000958 wd 0.0500 time 0.7798 (0.7798) data time 0.5499 (0.5499) model time 0.0000 (0.0000) loss 2.4317 (2.4317) grad_norm 1.7973 (1.7973) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][10/1251] eta 0:05:58 lr 0.000958 wd 0.0500 time 0.2496 (0.2889) data time 0.0010 (0.0509) model time 0.0000 (0.0000) loss 3.9461 (3.3136) grad_norm 2.0440 (1.9521) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][20/1251] eta 0:05:27 lr 0.000958 wd 0.0500 time 0.2437 (0.2659) data time 0.0011 (0.0276) model time 0.0000 (0.0000) loss 4.1668 (3.5004) grad_norm 3.3208 (2.0360) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][30/1251] eta 0:05:13 lr 0.000958 wd 0.0500 time 0.2431 (0.2571) data time 0.0008 (0.0191) model time 0.0000 (0.0000) loss 4.1904 (3.6389) grad_norm 1.7373 (2.0567) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][40/1251] eta 0:05:06 lr 0.000958 wd 0.0500 time 0.2394 (0.2528) data time 0.0010 (0.0147) model time 0.0000 (0.0000) loss 3.7444 (3.6644) grad_norm 1.4626 (1.9977) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][50/1251] eta 0:05:00 lr 0.000958 wd 0.0500 time 0.2337 (0.2498) data time 0.0009 (0.0120) model time 0.0000 (0.0000) loss 2.4767 (3.6247) grad_norm 1.4681 (2.0000) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][60/1251] eta 0:04:55 lr 0.000958 wd 0.0500 time 0.2400 (0.2482) data time 0.0007 (0.0102) model time 0.2392 (0.2389) loss 
4.5320 (3.6705) grad_norm 2.2896 (2.0111) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][70/1251] eta 0:04:51 lr 0.000958 wd 0.0500 time 0.2291 (0.2468) data time 0.0010 (0.0089) model time 0.2280 (0.2381) loss 3.4498 (3.6553) grad_norm 1.8783 (2.0052) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][80/1251] eta 0:04:47 lr 0.000958 wd 0.0500 time 0.2344 (0.2454) data time 0.0010 (0.0079) model time 0.2334 (0.2370) loss 3.5277 (3.6099) grad_norm 2.3312 (1.9754) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][90/1251] eta 0:04:44 lr 0.000958 wd 0.0500 time 0.2400 (0.2448) data time 0.0010 (0.0072) model time 0.2390 (0.2373) loss 3.7472 (3.5941) grad_norm 1.5685 (1.9750) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][100/1251] eta 0:04:40 lr 0.000958 wd 0.0500 time 0.2469 (0.2441) data time 0.0010 (0.0066) model time 0.2459 (0.2372) loss 3.7478 (3.6113) grad_norm 2.1120 (1.9689) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][110/1251] eta 0:04:38 lr 0.000958 wd 0.0500 time 0.2423 (0.2437) data time 0.0008 (0.0061) model time 0.2414 (0.2374) loss 3.5192 (3.6014) grad_norm 1.7513 (1.9533) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][120/1251] eta 0:04:37 lr 0.000958 wd 0.0500 time 0.2376 (0.2450) data time 0.0011 (0.0057) model time 0.2365 (0.2404) loss 3.7763 (3.6010) grad_norm 1.8237 (1.9414) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][130/1251] eta 0:04:34 lr 
0.000958 wd 0.0500 time 0.2384 (0.2449) data time 0.0009 (0.0053) model time 0.2375 (0.2407) loss 3.9298 (3.5945) grad_norm 1.4148 (1.9308) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][140/1251] eta 0:04:31 lr 0.000958 wd 0.0500 time 0.2422 (0.2443) data time 0.0008 (0.0050) model time 0.2415 (0.2402) loss 2.7705 (3.5791) grad_norm 1.7131 (1.9453) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][150/1251] eta 0:04:28 lr 0.000958 wd 0.0500 time 0.2594 (0.2441) data time 0.0009 (0.0047) model time 0.2585 (0.2401) loss 3.4546 (3.5595) grad_norm 1.5924 (1.9461) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][160/1251] eta 0:04:25 lr 0.000958 wd 0.0500 time 0.2312 (0.2436) data time 0.0009 (0.0045) model time 0.2303 (0.2397) loss 4.0481 (3.5764) grad_norm 2.2979 (1.9337) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][170/1251] eta 0:04:23 lr 0.000958 wd 0.0500 time 0.2430 (0.2435) data time 0.0008 (0.0043) model time 0.2422 (0.2398) loss 3.0120 (3.5667) grad_norm 2.2576 (1.9311) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][180/1251] eta 0:04:20 lr 0.000958 wd 0.0500 time 0.2337 (0.2432) data time 0.0009 (0.0041) model time 0.2328 (0.2396) loss 3.6901 (3.5782) grad_norm 3.2463 (1.9509) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][190/1251] eta 0:04:19 lr 0.000958 wd 0.0500 time 0.2392 (0.2441) data time 0.0008 (0.0040) model time 0.2384 (0.2410) loss 3.5725 (3.5699) grad_norm 1.2410 (1.9628) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][200/1251] eta 0:04:18 lr 0.000958 wd 0.0500 time 0.2334 (0.2459) data time 0.0010 (0.0038) model time 0.2324 (0.2435) loss 3.8346 (3.5808) grad_norm 2.3416 (1.9684) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][210/1251] eta 0:04:15 lr 0.000958 wd 0.0500 time 0.2430 (0.2454) data time 0.0010 (0.0037) model time 0.2421 (0.2429) loss 2.5355 (3.5624) grad_norm 1.7685 (1.9704) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][220/1251] eta 0:04:12 lr 0.000958 wd 0.0500 time 0.2433 (0.2450) data time 0.0010 (0.0036) model time 0.2423 (0.2425) loss 3.5349 (3.5555) grad_norm 1.9770 (1.9714) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][230/1251] eta 0:04:09 lr 0.000958 wd 0.0500 time 0.2313 (0.2447) data time 0.0010 (0.0034) model time 0.2303 (0.2422) loss 3.1700 (3.5552) grad_norm 2.5414 (1.9870) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][240/1251] eta 0:04:07 lr 0.000958 wd 0.0500 time 0.2344 (0.2445) data time 0.0007 (0.0033) model time 0.2337 (0.2420) loss 3.8600 (3.5627) grad_norm 1.5021 (1.9842) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][250/1251] eta 0:04:04 lr 0.000958 wd 0.0500 time 0.2412 (0.2442) data time 0.0009 (0.0033) model time 0.2402 (0.2417) loss 3.4849 (3.5674) grad_norm 1.6401 (1.9774) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][260/1251] eta 0:04:01 lr 0.000957 wd 0.0500 time 0.2382 (0.2440) data time 0.0009 (0.0032) model time 0.2373 (0.2415) loss 3.8159 
(3.5735) grad_norm 1.6289 (1.9649) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][270/1251] eta 0:03:59 lr 0.000957 wd 0.0500 time 0.2422 (0.2439) data time 0.0011 (0.0031) model time 0.2412 (0.2415) loss 3.5279 (3.5631) grad_norm 2.3787 (1.9678) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][280/1251] eta 0:03:56 lr 0.000957 wd 0.0500 time 0.2484 (0.2438) data time 0.0009 (0.0030) model time 0.2476 (0.2415) loss 3.1680 (3.5691) grad_norm 2.0217 (1.9743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][290/1251] eta 0:03:54 lr 0.000957 wd 0.0500 time 0.2408 (0.2437) data time 0.0010 (0.0029) model time 0.2398 (0.2413) loss 3.7646 (3.5721) grad_norm 2.1549 (1.9844) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][300/1251] eta 0:03:51 lr 0.000957 wd 0.0500 time 0.2440 (0.2436) data time 0.0009 (0.0029) model time 0.2431 (0.2413) loss 3.5355 (3.5718) grad_norm 2.2644 (1.9799) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][310/1251] eta 0:03:49 lr 0.000957 wd 0.0500 time 0.2331 (0.2435) data time 0.0010 (0.0028) model time 0.2321 (0.2412) loss 3.3512 (3.5719) grad_norm 1.5467 (1.9688) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][320/1251] eta 0:03:47 lr 0.000957 wd 0.0500 time 0.2457 (0.2440) data time 0.0007 (0.0028) model time 0.2449 (0.2418) loss 2.8289 (3.5635) grad_norm 2.1563 (1.9643) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][330/1251] eta 0:03:44 lr 0.000957 wd 
0.0500 time 0.2364 (0.2438) data time 0.0010 (0.0027) model time 0.2355 (0.2417) loss 4.1126 (3.5685) grad_norm 2.2514 (1.9695) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][340/1251] eta 0:03:41 lr 0.000957 wd 0.0500 time 0.2334 (0.2437) data time 0.0008 (0.0027) model time 0.2327 (0.2416) loss 2.4144 (3.5633) grad_norm 1.2596 (1.9696) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][350/1251] eta 0:03:39 lr 0.000957 wd 0.0500 time 0.2486 (0.2436) data time 0.0008 (0.0026) model time 0.2478 (0.2415) loss 2.9811 (3.5704) grad_norm 1.4363 (1.9662) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][360/1251] eta 0:03:36 lr 0.000957 wd 0.0500 time 0.2397 (0.2435) data time 0.0012 (0.0026) model time 0.2385 (0.2414) loss 4.0829 (3.5725) grad_norm 1.8745 (1.9638) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][370/1251] eta 0:03:34 lr 0.000957 wd 0.0500 time 0.2441 (0.2434) data time 0.0007 (0.0025) model time 0.2434 (0.2413) loss 4.3296 (3.5757) grad_norm 1.9731 (1.9612) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][380/1251] eta 0:03:31 lr 0.000957 wd 0.0500 time 0.2356 (0.2433) data time 0.0009 (0.0025) model time 0.2347 (0.2412) loss 3.4522 (3.5748) grad_norm 2.4799 (1.9606) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][390/1251] eta 0:03:29 lr 0.000957 wd 0.0500 time 0.2339 (0.2438) data time 0.0009 (0.0025) model time 0.2329 (0.2419) loss 2.9708 (3.5752) grad_norm 2.0047 (1.9606) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:33 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][400/1251] eta 0:03:27 lr 0.000957 wd 0.0500 time 0.2362 (0.2437) data time 0.0010 (0.0024) model time 0.2352 (0.2418) loss 3.1042 (3.5733) grad_norm 4.4430 (1.9663) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][410/1251] eta 0:03:25 lr 0.000957 wd 0.0500 time 0.2321 (0.2440) data time 0.0009 (0.0024) model time 0.2312 (0.2422) loss 3.5196 (3.5724) grad_norm 2.3037 (1.9647) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][420/1251] eta 0:03:22 lr 0.000957 wd 0.0500 time 0.2419 (0.2440) data time 0.0009 (0.0024) model time 0.2410 (0.2421) loss 3.7704 (3.5768) grad_norm 2.2402 (1.9608) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][430/1251] eta 0:03:20 lr 0.000957 wd 0.0500 time 0.2500 (0.2438) data time 0.0008 (0.0023) model time 0.2492 (0.2420) loss 2.8335 (3.5760) grad_norm 1.7869 (1.9554) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][440/1251] eta 0:03:17 lr 0.000957 wd 0.0500 time 0.2393 (0.2437) data time 0.0008 (0.0023) model time 0.2385 (0.2419) loss 4.0191 (3.5823) grad_norm 1.9445 (1.9467) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][450/1251] eta 0:03:15 lr 0.000957 wd 0.0500 time 0.2319 (0.2437) data time 0.0010 (0.0023) model time 0.2309 (0.2419) loss 3.1506 (3.5794) grad_norm 2.7090 (1.9507) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][460/1251] eta 0:03:12 lr 0.000957 wd 0.0500 time 0.2383 (0.2436) data time 0.0009 (0.0022) model time 0.2373 (0.2418) loss 2.5814 
(3.5829) grad_norm 1.1482 (1.9479) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][470/1251] eta 0:03:10 lr 0.000957 wd 0.0500 time 0.2275 (0.2435) data time 0.0012 (0.0022) model time 0.2263 (0.2417) loss 3.3471 (3.5834) grad_norm 2.0350 (1.9462) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][480/1251] eta 0:03:07 lr 0.000957 wd 0.0500 time 0.2401 (0.2433) data time 0.0009 (0.0022) model time 0.2393 (0.2415) loss 4.2158 (3.5763) grad_norm 1.5813 (1.9468) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][490/1251] eta 0:03:05 lr 0.000957 wd 0.0500 time 0.2392 (0.2432) data time 0.0010 (0.0022) model time 0.2383 (0.2414) loss 4.1671 (3.5828) grad_norm 2.9433 (1.9499) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][500/1251] eta 0:03:02 lr 0.000957 wd 0.0500 time 0.2292 (0.2431) data time 0.0009 (0.0021) model time 0.2283 (0.2413) loss 2.7626 (3.5846) grad_norm 2.0618 (1.9514) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][510/1251] eta 0:03:00 lr 0.000957 wd 0.0500 time 0.2392 (0.2430) data time 0.0007 (0.0021) model time 0.2385 (0.2412) loss 2.2696 (3.5841) grad_norm 3.1635 (1.9504) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][520/1251] eta 0:02:57 lr 0.000957 wd 0.0500 time 0.2377 (0.2429) data time 0.0009 (0.0021) model time 0.2368 (0.2411) loss 4.3462 (3.5831) grad_norm 1.9627 (1.9487) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][530/1251] eta 0:02:55 lr 0.000957 wd 
0.0500 time 0.2484 (0.2433) data time 0.0008 (0.0021) model time 0.2476 (0.2416) loss 2.4644 (3.5793) grad_norm 1.8184 (1.9655) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][540/1251] eta 0:02:52 lr 0.000957 wd 0.0500 time 0.2433 (0.2432) data time 0.0007 (0.0021) model time 0.2427 (0.2415) loss 3.9944 (3.5795) grad_norm 3.2107 (1.9636) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][550/1251] eta 0:02:50 lr 0.000957 wd 0.0500 time 0.2382 (0.2434) data time 0.0010 (0.0020) model time 0.2372 (0.2418) loss 3.5077 (3.5823) grad_norm 1.8657 (1.9616) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][560/1251] eta 0:02:48 lr 0.000957 wd 0.0500 time 0.2353 (0.2434) data time 0.0010 (0.0020) model time 0.2343 (0.2417) loss 3.6112 (3.5820) grad_norm 1.8573 (1.9567) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][570/1251] eta 0:02:45 lr 0.000957 wd 0.0500 time 0.2445 (0.2433) data time 0.0010 (0.0020) model time 0.2436 (0.2417) loss 3.7977 (3.5792) grad_norm 2.6065 (1.9581) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][580/1251] eta 0:02:43 lr 0.000957 wd 0.0500 time 0.2420 (0.2433) data time 0.0008 (0.0020) model time 0.2412 (0.2416) loss 2.5561 (3.5783) grad_norm 1.4761 (1.9558) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][590/1251] eta 0:02:40 lr 0.000957 wd 0.0500 time 0.2438 (0.2432) data time 0.0009 (0.0020) model time 0.2429 (0.2415) loss 2.7588 (3.5756) grad_norm 1.3534 (1.9565) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][600/1251] eta 0:02:38 lr 0.000957 wd 0.0500 time 0.2303 (0.2431) data time 0.0014 (0.0019) model time 0.2289 (0.2415) loss 4.0946 (3.5766) grad_norm 2.0716 (1.9602) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][610/1251] eta 0:02:35 lr 0.000957 wd 0.0500 time 0.2345 (0.2431) data time 0.0009 (0.0019) model time 0.2336 (0.2415) loss 2.9460 (3.5753) grad_norm 1.8780 (1.9631) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][620/1251] eta 0:02:33 lr 0.000957 wd 0.0500 time 0.2344 (0.2430) data time 0.0011 (0.0019) model time 0.2334 (0.2414) loss 2.9469 (3.5765) grad_norm 1.2804 (1.9603) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][630/1251] eta 0:02:30 lr 0.000957 wd 0.0500 time 0.2382 (0.2430) data time 0.0010 (0.0019) model time 0.2372 (0.2413) loss 2.8204 (3.5770) grad_norm 1.8395 (1.9554) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][640/1251] eta 0:02:28 lr 0.000957 wd 0.0500 time 0.2377 (0.2429) data time 0.0009 (0.0019) model time 0.2368 (0.2413) loss 4.0834 (3.5760) grad_norm 1.6745 (1.9508) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][650/1251] eta 0:02:25 lr 0.000957 wd 0.0500 time 0.2310 (0.2429) data time 0.0011 (0.0019) model time 0.2299 (0.2412) loss 3.5129 (3.5753) grad_norm 2.0007 (1.9515) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][660/1251] eta 0:02:23 lr 0.000957 wd 0.0500 time 0.2461 (0.2427) data time 0.0011 (0.0019) model time 0.2449 (0.2411) loss 4.3185 
(3.5763) grad_norm 1.6622 (1.9526) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][670/1251] eta 0:02:21 lr 0.000957 wd 0.0500 time 0.2296 (0.2427) data time 0.0009 (0.0019) model time 0.2287 (0.2411) loss 2.7303 (3.5759) grad_norm 2.3299 (1.9478) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][680/1251] eta 0:02:18 lr 0.000957 wd 0.0500 time 0.2343 (0.2426) data time 0.0010 (0.0018) model time 0.2333 (0.2410) loss 3.7635 (3.5799) grad_norm 1.8437 (1.9540) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][690/1251] eta 0:02:16 lr 0.000957 wd 0.0500 time 0.2344 (0.2425) data time 0.0007 (0.0018) model time 0.2336 (0.2409) loss 2.9596 (3.5798) grad_norm 1.6057 (1.9535) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][700/1251] eta 0:02:13 lr 0.000957 wd 0.0500 time 0.2388 (0.2424) data time 0.0011 (0.0018) model time 0.2377 (0.2408) loss 3.0295 (3.5771) grad_norm 1.6667 (1.9504) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][710/1251] eta 0:02:11 lr 0.000957 wd 0.0500 time 0.2373 (0.2423) data time 0.0007 (0.0018) model time 0.2366 (0.2407) loss 3.1996 (3.5794) grad_norm 1.7959 (1.9486) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][720/1251] eta 0:02:08 lr 0.000957 wd 0.0500 time 0.2356 (0.2423) data time 0.0007 (0.0018) model time 0.2348 (0.2407) loss 3.9749 (3.5802) grad_norm 1.7115 (1.9495) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][730/1251] eta 0:02:06 lr 0.000957 wd 
0.0500 time 0.2413 (0.2422) data time 0.0007 (0.0018) model time 0.2406 (0.2406) loss 4.0302 (3.5799) grad_norm 1.6533 (1.9477) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][740/1251] eta 0:02:03 lr 0.000957 wd 0.0500 time 0.2410 (0.2422) data time 0.0010 (0.0018) model time 0.2400 (0.2406) loss 3.8241 (3.5826) grad_norm 2.3069 (1.9455) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][750/1251] eta 0:02:01 lr 0.000957 wd 0.0500 time 0.2367 (0.2421) data time 0.0009 (0.0018) model time 0.2357 (0.2405) loss 3.3624 (3.5824) grad_norm 1.3069 (1.9436) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][760/1251] eta 0:01:58 lr 0.000957 wd 0.0500 time 0.2305 (0.2421) data time 0.0010 (0.0018) model time 0.2295 (0.2405) loss 3.6009 (3.5820) grad_norm 1.8665 (1.9480) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][770/1251] eta 0:01:56 lr 0.000957 wd 0.0500 time 0.2432 (0.2420) data time 0.0009 (0.0018) model time 0.2424 (0.2405) loss 4.4550 (3.5850) grad_norm 1.7653 (1.9464) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][780/1251] eta 0:01:53 lr 0.000957 wd 0.0500 time 0.2338 (0.2420) data time 0.0010 (0.0017) model time 0.2328 (0.2404) loss 3.7238 (3.5856) grad_norm 2.0282 (1.9463) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][790/1251] eta 0:01:51 lr 0.000957 wd 0.0500 time 0.2299 (0.2419) data time 0.0011 (0.0017) model time 0.2288 (0.2403) loss 3.2597 (3.5846) grad_norm 3.1865 (1.9543) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:09 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][800/1251] eta 0:01:49 lr 0.000957 wd 0.0500 time 0.2335 (0.2418) data time 0.0010 (0.0017) model time 0.2325 (0.2403) loss 3.2424 (3.5839) grad_norm 1.4375 (1.9528) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][810/1251] eta 0:01:46 lr 0.000956 wd 0.0500 time 0.2451 (0.2418) data time 0.0009 (0.0017) model time 0.2442 (0.2402) loss 4.3845 (3.5877) grad_norm 1.9681 (1.9515) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][820/1251] eta 0:01:44 lr 0.000956 wd 0.0500 time 0.2312 (0.2418) data time 0.0011 (0.0017) model time 0.2301 (0.2402) loss 3.3972 (3.5911) grad_norm 1.7930 (1.9511) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][830/1251] eta 0:01:41 lr 0.000956 wd 0.0500 time 0.2446 (0.2417) data time 0.0008 (0.0017) model time 0.2438 (0.2402) loss 3.1541 (3.5875) grad_norm 1.6572 (1.9492) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][840/1251] eta 0:01:39 lr 0.000956 wd 0.0500 time 0.2366 (0.2417) data time 0.0009 (0.0017) model time 0.2357 (0.2401) loss 4.1089 (3.5867) grad_norm 2.2409 (1.9480) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][850/1251] eta 0:01:36 lr 0.000956 wd 0.0500 time 0.2343 (0.2416) data time 0.0010 (0.0017) model time 0.2334 (0.2401) loss 3.7231 (3.5887) grad_norm 2.0026 (1.9489) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][860/1251] eta 0:01:34 lr 0.000956 wd 0.0500 time 0.2494 (0.2416) data time 0.0008 (0.0017) model time 0.2487 (0.2401) loss 3.1811 
(3.5871) grad_norm 1.9661 (1.9478) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][870/1251] eta 0:01:32 lr 0.000956 wd 0.0500 time 0.2447 (0.2416) data time 0.0011 (0.0017) model time 0.2436 (0.2401) loss 4.2626 (3.5861) grad_norm 2.0415 (1.9470) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][880/1251] eta 0:01:29 lr 0.000956 wd 0.0500 time 0.2352 (0.2416) data time 0.0007 (0.0017) model time 0.2345 (0.2401) loss 3.8220 (3.5863) grad_norm 1.5101 (1.9455) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][890/1251] eta 0:01:27 lr 0.000956 wd 0.0500 time 0.2354 (0.2416) data time 0.0008 (0.0017) model time 0.2346 (0.2400) loss 3.7766 (3.5876) grad_norm 2.2799 (1.9491) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][900/1251] eta 0:01:24 lr 0.000956 wd 0.0500 time 0.2335 (0.2415) data time 0.0007 (0.0016) model time 0.2327 (0.2400) loss 4.0266 (3.5892) grad_norm 2.0467 (1.9529) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][910/1251] eta 0:01:22 lr 0.000956 wd 0.0500 time 0.2447 (0.2415) data time 0.0011 (0.0016) model time 0.2436 (0.2400) loss 3.5087 (3.5864) grad_norm 1.6204 (1.9529) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][920/1251] eta 0:01:19 lr 0.000956 wd 0.0500 time 0.2378 (0.2415) data time 0.0009 (0.0016) model time 0.2369 (0.2400) loss 3.7848 (3.5889) grad_norm 2.0256 (1.9522) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][930/1251] eta 0:01:17 lr 0.000956 wd 
0.0500 time 0.2434 (0.2417) data time 0.0009 (0.0016) model time 0.2424 (0.2402) loss 2.9684 (3.5876) grad_norm 2.0976 (1.9542) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][940/1251] eta 0:01:15 lr 0.000956 wd 0.0500 time 0.2486 (0.2417) data time 0.0008 (0.0016) model time 0.2478 (0.2402) loss 3.0537 (3.5881) grad_norm 1.6076 (1.9532) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][950/1251] eta 0:01:12 lr 0.000956 wd 0.0500 time 0.2369 (0.2417) data time 0.0007 (0.0016) model time 0.2362 (0.2403) loss 4.6506 (3.5876) grad_norm 1.7395 (1.9510) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][960/1251] eta 0:01:10 lr 0.000956 wd 0.0500 time 0.2412 (0.2417) data time 0.0010 (0.0016) model time 0.2402 (0.2402) loss 3.4331 (3.5889) grad_norm 2.1472 (1.9520) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][970/1251] eta 0:01:07 lr 0.000956 wd 0.0500 time 0.2389 (0.2417) data time 0.0008 (0.0016) model time 0.2380 (0.2402) loss 3.6991 (3.5869) grad_norm 1.5937 (1.9556) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][980/1251] eta 0:01:05 lr 0.000956 wd 0.0500 time 0.2448 (0.2417) data time 0.0007 (0.0016) model time 0.2440 (0.2402) loss 3.1856 (3.5860) grad_norm 1.5667 (1.9547) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][990/1251] eta 0:01:03 lr 0.000956 wd 0.0500 time 0.2317 (0.2417) data time 0.0009 (0.0016) model time 0.2308 (0.2402) loss 2.5166 (3.5821) grad_norm 2.0998 (1.9516) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:57 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1000/1251] eta 0:01:00 lr 0.000956 wd 0.0500 time 0.2419 (0.2417) data time 0.0012 (0.0016) model time 0.2407 (0.2403) loss 3.0919 (3.5833) grad_norm 1.6942 (1.9495) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1010/1251] eta 0:00:58 lr 0.000956 wd 0.0500 time 0.2411 (0.2417) data time 0.0009 (0.0016) model time 0.2402 (0.2403) loss 3.4247 (3.5837) grad_norm 1.9619 (1.9484) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1020/1251] eta 0:00:55 lr 0.000956 wd 0.0500 time 0.2445 (0.2418) data time 0.0008 (0.0016) model time 0.2437 (0.2403) loss 3.7726 (3.5837) grad_norm 1.7135 (1.9475) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1030/1251] eta 0:00:53 lr 0.000956 wd 0.0500 time 0.2467 (0.2418) data time 0.0010 (0.0016) model time 0.2457 (0.2403) loss 4.0682 (3.5861) grad_norm 1.5320 (1.9488) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1040/1251] eta 0:00:51 lr 0.000956 wd 0.0500 time 0.2500 (0.2418) data time 0.0007 (0.0016) model time 0.2493 (0.2403) loss 2.3484 (3.5867) grad_norm 1.5506 (1.9482) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1050/1251] eta 0:00:48 lr 0.000956 wd 0.0500 time 0.2420 (0.2420) data time 0.0007 (0.0016) model time 0.2412 (0.2406) loss 3.4061 (3.5857) grad_norm 1.9032 (1.9461) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1060/1251] eta 0:00:46 lr 0.000956 wd 0.0500 time 0.4694 (0.2422) data time 0.0011 (0.0016) model time 0.4683 (0.2408) loss 2.9153 
(3.5867) grad_norm 1.7672 (1.9455) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1070/1251] eta 0:00:43 lr 0.000956 wd 0.0500 time 0.2473 (0.2422) data time 0.0007 (0.0016) model time 0.2466 (0.2408) loss 3.3767 (3.5880) grad_norm 1.9618 (1.9476) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1080/1251] eta 0:00:41 lr 0.000956 wd 0.0500 time 0.2411 (0.2424) data time 0.0011 (0.0015) model time 0.2400 (0.2410) loss 3.9479 (3.5889) grad_norm 1.4460 (1.9466) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1090/1251] eta 0:00:39 lr 0.000956 wd 0.0500 time 0.2440 (0.2424) data time 0.0011 (0.0015) model time 0.2429 (0.2410) loss 3.6632 (3.5871) grad_norm 2.6014 (1.9462) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1100/1251] eta 0:00:36 lr 0.000956 wd 0.0500 time 0.2559 (0.2425) data time 0.0010 (0.0015) model time 0.2549 (0.2411) loss 3.0643 (3.5864) grad_norm 1.9339 (1.9450) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1110/1251] eta 0:00:34 lr 0.000956 wd 0.0500 time 0.2386 (0.2426) data time 0.0009 (0.0015) model time 0.2377 (0.2412) loss 2.3925 (3.5852) grad_norm 1.5403 (1.9438) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1120/1251] eta 0:00:31 lr 0.000956 wd 0.0500 time 0.2364 (0.2428) data time 0.0012 (0.0015) model time 0.2352 (0.2414) loss 3.7187 (3.5811) grad_norm 2.8481 (1.9472) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1130/1251] eta 0:00:29 lr 
0.000956 wd 0.0500 time 0.2297 (0.2431) data time 0.0007 (0.0015) model time 0.2289 (0.2418) loss 2.6194 (3.5821) grad_norm 3.6587 (1.9487) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1140/1251] eta 0:00:26 lr 0.000956 wd 0.0500 time 0.2442 (0.2431) data time 0.0010 (0.0015) model time 0.2433 (0.2418) loss 4.1654 (3.5827) grad_norm 1.4447 (1.9475) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1150/1251] eta 0:00:24 lr 0.000956 wd 0.0500 time 0.2409 (0.2431) data time 0.0010 (0.0015) model time 0.2399 (0.2417) loss 2.7491 (3.5819) grad_norm 3.2885 (1.9518) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1160/1251] eta 0:00:22 lr 0.000956 wd 0.0500 time 0.2448 (0.2430) data time 0.0010 (0.0015) model time 0.2438 (0.2417) loss 3.0524 (3.5812) grad_norm 1.6565 (1.9525) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1170/1251] eta 0:00:19 lr 0.000956 wd 0.0500 time 0.2355 (0.2430) data time 0.0009 (0.0015) model time 0.2346 (0.2417) loss 4.0886 (3.5809) grad_norm 1.4390 (1.9527) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1180/1251] eta 0:00:17 lr 0.000956 wd 0.0500 time 0.2327 (0.2430) data time 0.0011 (0.0015) model time 0.2316 (0.2416) loss 3.6413 (3.5834) grad_norm 1.9512 (1.9507) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1190/1251] eta 0:00:14 lr 0.000956 wd 0.0500 time 0.2317 (0.2429) data time 0.0012 (0.0015) model time 0.2306 (0.2416) loss 3.4551 (3.5828) grad_norm 1.4478 (1.9535) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
07:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1200/1251] eta 0:00:12 lr 0.000956 wd 0.0500 time 0.2415 (0.2429) data time 0.0008 (0.0015) model time 0.2407 (0.2416) loss 4.3800 (3.5828) grad_norm 1.9879 (1.9534) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1210/1251] eta 0:00:09 lr 0.000956 wd 0.0500 time 0.2391 (0.2429) data time 0.0008 (0.0015) model time 0.2382 (0.2415) loss 2.5110 (3.5817) grad_norm 1.7683 (1.9509) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1220/1251] eta 0:00:07 lr 0.000956 wd 0.0500 time 0.2350 (0.2428) data time 0.0010 (0.0015) model time 0.2340 (0.2415) loss 2.3382 (3.5791) grad_norm 1.8745 (1.9508) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1230/1251] eta 0:00:05 lr 0.000956 wd 0.0500 time 0.2443 (0.2428) data time 0.0010 (0.0015) model time 0.2433 (0.2414) loss 3.8304 (3.5806) grad_norm 2.3139 (1.9511) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1240/1251] eta 0:00:02 lr 0.000956 wd 0.0500 time 0.2286 (0.2427) data time 0.0005 (0.0015) model time 0.2281 (0.2414) loss 3.6429 (3.5808) grad_norm 1.6266 (1.9523) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [57/300][1250/1251] eta 0:00:00 lr 0.000956 wd 0.0500 time 0.2126 (0.2427) data time 0.0008 (0.0015) model time 0.2117 (0.2414) loss 4.1083 (3.5828) grad_norm 1.6856 (1.9499) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 57 training takes 0:05:03 [2024-08-26 07:11:58 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:11:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.507 (0.507) Loss 0.5767 (0.5767) Acc@1 89.355 (89.355) Acc@5 97.363 (97.363) Mem 7379MB [2024-08-26 07:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.117) Loss 0.8521 (0.8785) Acc@1 80.664 (80.584) Acc@5 95.508 (95.517) Mem 7379MB [2024-08-26 07:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.099) Loss 1.2246 (0.8922) Acc@1 71.484 (79.646) Acc@5 91.211 (95.489) Mem 7379MB [2024-08-26 07:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.093) Loss 1.5615 (1.0173) Acc@1 62.695 (76.969) Acc@5 86.621 (93.889) Mem 7379MB [2024-08-26 07:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.4951 (1.0892) Acc@1 66.309 (75.353) Acc@5 88.379 (92.983) Mem 7379MB [2024-08-26 07:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 74.974 Acc@5 92.906 [2024-08-26 07:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.0% [2024-08-26 07:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.909 (0.909) Loss 0.4702 (0.4702) Acc@1 90.527 (90.527) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 07:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.160) Loss 0.7798 (0.7525) Acc@1 84.277 (83.558) Acc@5 95.898 (96.591) Mem 7379MB [2024-08-26 07:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.121) Loss 1.0898 (0.7708) Acc@1 74.902 (82.492) Acc@5 92.285 (96.461) Mem 7379MB [2024-08-26 07:12:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.078 (0.108) Loss 1.3525 (0.8836) Acc@1 65.039 (79.725) Acc@5 89.355 (95.007) Mem 7379MB [2024-08-26 07:12:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.098) Loss 1.2646 (0.9469) Acc@1 68.262 (78.065) Acc@5 90.430 (94.255) Mem 7379MB [2024-08-26 07:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.712 Acc@5 94.200 [2024-08-26 07:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.7% [2024-08-26 07:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.71% [2024-08-26 07:12:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:12:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 07:12:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][0/1251] eta 0:15:58 lr 0.000956 wd 0.0500 time 0.7662 (0.7662) data time 0.5427 (0.5427) model time 0.0000 (0.0000) loss 3.5845 (3.5845) grad_norm 1.7953 (1.7953) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][10/1251] eta 0:06:01 lr 0.000956 wd 0.0500 time 0.2627 (0.2911) data time 0.0008 (0.0503) model time 0.0000 (0.0000) loss 3.9938 (3.8463) grad_norm 1.9697 (1.9145) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][20/1251] eta 0:05:29 lr 0.000956 wd 0.0500 time 0.2332 (0.2679) data time 0.0008 (0.0268) model time 0.0000 (0.0000) loss 4.0710 (3.8097) grad_norm 3.0374 (2.0947) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][30/1251] eta 0:05:16 lr 0.000956 wd 0.0500 time 0.2379 (0.2593) data time 0.0012 (0.0185) 
model time 0.0000 (0.0000) loss 3.7944 (3.5940) grad_norm 1.4916 (2.0613) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][40/1251] eta 0:05:08 lr 0.000956 wd 0.0500 time 0.2407 (0.2547) data time 0.0007 (0.0142) model time 0.0000 (0.0000) loss 4.4195 (3.5728) grad_norm 1.5207 (1.9926) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][50/1251] eta 0:05:08 lr 0.000956 wd 0.0500 time 0.2503 (0.2567) data time 0.0007 (0.0116) model time 0.0000 (0.0000) loss 3.7515 (3.6057) grad_norm 1.8232 (1.9747) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][60/1251] eta 0:05:02 lr 0.000956 wd 0.0500 time 0.2362 (0.2539) data time 0.0012 (0.0099) model time 0.2350 (0.2384) loss 4.1240 (3.6403) grad_norm 1.9496 (1.9613) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][70/1251] eta 0:04:57 lr 0.000956 wd 0.0500 time 0.2381 (0.2517) data time 0.0008 (0.0086) model time 0.2373 (0.2379) loss 2.9827 (3.5998) grad_norm 1.4362 (1.9679) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][80/1251] eta 0:04:53 lr 0.000956 wd 0.0500 time 0.2413 (0.2503) data time 0.0007 (0.0077) model time 0.2406 (0.2386) loss 2.3479 (3.5709) grad_norm 2.1174 (1.9503) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][90/1251] eta 0:04:49 lr 0.000956 wd 0.0500 time 0.2403 (0.2495) data time 0.0011 (0.0070) model time 0.2392 (0.2394) loss 3.5431 (3.5264) grad_norm 2.1767 (1.9415) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[58/300][100/1251] eta 0:04:46 lr 0.000956 wd 0.0500 time 0.2323 (0.2488) data time 0.0011 (0.0064) model time 0.2312 (0.2396) loss 4.1031 (3.5358) grad_norm 2.7387 (1.9532) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][110/1251] eta 0:04:43 lr 0.000955 wd 0.0500 time 0.2445 (0.2482) data time 0.0011 (0.0059) model time 0.2433 (0.2399) loss 3.5724 (3.5651) grad_norm 1.8373 (1.9531) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][120/1251] eta 0:04:39 lr 0.000955 wd 0.0500 time 0.2344 (0.2475) data time 0.0010 (0.0055) model time 0.2334 (0.2399) loss 3.7230 (3.5489) grad_norm 1.7158 (1.9616) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][130/1251] eta 0:04:36 lr 0.000955 wd 0.0500 time 0.2411 (0.2470) data time 0.0011 (0.0051) model time 0.2400 (0.2398) loss 3.7189 (3.5571) grad_norm 2.6875 (1.9533) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][140/1251] eta 0:04:34 lr 0.000955 wd 0.0500 time 0.2435 (0.2467) data time 0.0009 (0.0049) model time 0.2427 (0.2401) loss 4.2208 (3.5669) grad_norm 1.6439 (1.9429) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][150/1251] eta 0:04:31 lr 0.000955 wd 0.0500 time 0.2380 (0.2463) data time 0.0010 (0.0046) model time 0.2370 (0.2400) loss 2.8018 (3.5670) grad_norm 1.6375 (1.9211) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][160/1251] eta 0:04:28 lr 0.000955 wd 0.0500 time 0.2380 (0.2460) data time 0.0010 (0.0044) model time 0.2370 (0.2400) loss 3.9804 (3.5809) grad_norm 1.9765 (1.9160) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 07:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][170/1251] eta 0:04:25 lr 0.000955 wd 0.0500 time 0.2409 (0.2457) data time 0.0013 (0.0042) model time 0.2396 (0.2399) loss 3.6163 (3.5956) grad_norm 1.8689 (1.9139) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][180/1251] eta 0:04:22 lr 0.000955 wd 0.0500 time 0.2375 (0.2453) data time 0.0008 (0.0040) model time 0.2367 (0.2398) loss 4.1132 (3.5936) grad_norm 1.4414 (1.9251) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][190/1251] eta 0:04:20 lr 0.000955 wd 0.0500 time 0.2440 (0.2452) data time 0.0011 (0.0039) model time 0.2430 (0.2400) loss 3.3631 (3.5971) grad_norm 1.7111 (1.9229) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][200/1251] eta 0:04:18 lr 0.000955 wd 0.0500 time 0.2433 (0.2462) data time 0.0008 (0.0037) model time 0.2425 (0.2416) loss 4.2733 (3.5932) grad_norm 1.7142 (1.9371) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][210/1251] eta 0:04:16 lr 0.000955 wd 0.0500 time 0.2406 (0.2460) data time 0.0009 (0.0036) model time 0.2397 (0.2415) loss 3.4460 (3.6135) grad_norm 2.8043 (1.9453) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][220/1251] eta 0:04:13 lr 0.000955 wd 0.0500 time 0.2455 (0.2458) data time 0.0009 (0.0035) model time 0.2446 (0.2415) loss 3.6540 (3.6165) grad_norm 2.6807 (1.9480) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][230/1251] eta 0:04:10 lr 0.000955 wd 0.0500 time 0.2452 (0.2456) data time 0.0009 (0.0034) 
model time 0.2443 (0.2415) loss 3.6769 (3.6158) grad_norm 1.4325 (1.9417) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][240/1251] eta 0:04:08 lr 0.000955 wd 0.0500 time 0.2365 (0.2454) data time 0.0010 (0.0033) model time 0.2355 (0.2413) loss 3.4765 (3.5970) grad_norm 1.6052 (1.9313) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][250/1251] eta 0:04:05 lr 0.000955 wd 0.0500 time 0.2436 (0.2453) data time 0.0008 (0.0032) model time 0.2429 (0.2413) loss 4.6853 (3.6104) grad_norm 1.3821 (1.9268) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][260/1251] eta 0:04:02 lr 0.000955 wd 0.0500 time 0.2315 (0.2452) data time 0.0009 (0.0031) model time 0.2307 (0.2413) loss 4.4278 (3.6090) grad_norm 1.6203 (1.9254) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][270/1251] eta 0:04:00 lr 0.000955 wd 0.0500 time 0.2430 (0.2451) data time 0.0008 (0.0030) model time 0.2422 (0.2413) loss 3.8715 (3.6068) grad_norm 2.1653 (1.9335) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][280/1251] eta 0:03:57 lr 0.000955 wd 0.0500 time 0.2365 (0.2449) data time 0.0009 (0.0030) model time 0.2355 (0.2412) loss 3.2470 (3.6096) grad_norm 1.8333 (1.9384) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][290/1251] eta 0:03:55 lr 0.000955 wd 0.0500 time 0.2374 (0.2448) data time 0.0009 (0.0029) model time 0.2365 (0.2412) loss 4.0064 (3.6081) grad_norm 1.6918 (1.9262) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[58/300][300/1251] eta 0:03:52 lr 0.000955 wd 0.0500 time 0.2389 (0.2446) data time 0.0011 (0.0028) model time 0.2378 (0.2411) loss 2.9799 (3.6008) grad_norm 1.5002 (1.9184) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][310/1251] eta 0:03:50 lr 0.000955 wd 0.0500 time 0.2458 (0.2446) data time 0.0009 (0.0028) model time 0.2449 (0.2412) loss 4.2328 (3.5996) grad_norm 1.7317 (1.9190) loss_scale 16384.0000 (8429.0675) mem 7379MB [2024-08-26 07:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][320/1251] eta 0:03:47 lr 0.000955 wd 0.0500 time 0.2367 (0.2445) data time 0.0011 (0.0027) model time 0.2356 (0.2412) loss 4.2849 (3.5892) grad_norm 1.4005 (1.9184) loss_scale 16384.0000 (8676.8847) mem 7379MB [2024-08-26 07:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][330/1251] eta 0:03:45 lr 0.000955 wd 0.0500 time 0.2387 (0.2446) data time 0.0007 (0.0027) model time 0.2380 (0.2413) loss 4.4736 (3.6033) grad_norm 2.7110 (1.9160) loss_scale 16384.0000 (8909.7281) mem 7379MB [2024-08-26 07:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][340/1251] eta 0:03:43 lr 0.000955 wd 0.0500 time 0.4624 (0.2451) data time 0.0008 (0.0026) model time 0.4616 (0.2420) loss 2.9764 (3.5934) grad_norm 1.8438 (1.9213) loss_scale 16384.0000 (9128.9150) mem 7379MB [2024-08-26 07:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][350/1251] eta 0:03:40 lr 0.000955 wd 0.0500 time 0.2437 (0.2450) data time 0.0008 (0.0026) model time 0.2429 (0.2420) loss 2.5916 (3.5870) grad_norm 1.5843 (1.9173) loss_scale 16384.0000 (9335.6125) mem 7379MB [2024-08-26 07:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][360/1251] eta 0:03:38 lr 0.000955 wd 0.0500 time 0.2348 (0.2456) data time 0.0009 (0.0025) model time 0.2339 (0.2428) loss 3.5673 (3.5865) grad_norm 1.3706 (1.9144) loss_scale 16384.0000 
(9530.8587) mem 7379MB [2024-08-26 07:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][370/1251] eta 0:03:36 lr 0.000955 wd 0.0500 time 0.2486 (0.2455) data time 0.0011 (0.0025) model time 0.2475 (0.2427) loss 3.6645 (3.5822) grad_norm 1.4479 (1.9054) loss_scale 16384.0000 (9715.5795) mem 7379MB [2024-08-26 07:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][380/1251] eta 0:03:33 lr 0.000955 wd 0.0500 time 0.2416 (0.2455) data time 0.0009 (0.0024) model time 0.2407 (0.2427) loss 2.4439 (3.5780) grad_norm 2.0749 (1.9111) loss_scale 16384.0000 (9890.6037) mem 7379MB [2024-08-26 07:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][390/1251] eta 0:03:31 lr 0.000955 wd 0.0500 time 0.2372 (0.2454) data time 0.0008 (0.0024) model time 0.2364 (0.2426) loss 3.8760 (3.5849) grad_norm 2.0302 (1.9119) loss_scale 16384.0000 (10056.6752) mem 7379MB [2024-08-26 07:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][400/1251] eta 0:03:28 lr 0.000955 wd 0.0500 time 0.2383 (0.2453) data time 0.0009 (0.0024) model time 0.2374 (0.2426) loss 3.2520 (3.5872) grad_norm 1.3701 (1.9107) loss_scale 16384.0000 (10214.4638) mem 7379MB [2024-08-26 07:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][410/1251] eta 0:03:26 lr 0.000955 wd 0.0500 time 0.2428 (0.2452) data time 0.0010 (0.0023) model time 0.2418 (0.2426) loss 3.4369 (3.5823) grad_norm 1.9830 (1.9116) loss_scale 16384.0000 (10364.5742) mem 7379MB [2024-08-26 07:13:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][420/1251] eta 0:03:23 lr 0.000955 wd 0.0500 time 0.2465 (0.2452) data time 0.0010 (0.0023) model time 0.2455 (0.2426) loss 3.7260 (3.5860) grad_norm 2.0124 (1.9196) loss_scale 16384.0000 (10507.5534) mem 7379MB [2024-08-26 07:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][430/1251] eta 0:03:21 lr 0.000955 wd 0.0500 time 0.2396 (0.2451) data time 0.0007 
(0.0023) model time 0.2389 (0.2425) loss 2.6813 (3.5765) grad_norm 2.6660 (1.9184) loss_scale 16384.0000 (10643.8979) mem 7379MB [2024-08-26 07:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][440/1251] eta 0:03:18 lr 0.000955 wd 0.0500 time 0.2475 (0.2450) data time 0.0008 (0.0023) model time 0.2468 (0.2424) loss 3.1989 (3.5750) grad_norm 1.8538 (1.9174) loss_scale 16384.0000 (10774.0590) mem 7379MB [2024-08-26 07:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][450/1251] eta 0:03:16 lr 0.000955 wd 0.0500 time 0.2420 (0.2453) data time 0.0007 (0.0022) model time 0.2412 (0.2428) loss 4.6915 (3.5817) grad_norm 1.5024 (inf) loss_scale 8192.0000 (10753.1353) mem 7379MB [2024-08-26 07:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][460/1251] eta 0:03:13 lr 0.000955 wd 0.0500 time 0.2415 (0.2452) data time 0.0010 (0.0022) model time 0.2405 (0.2428) loss 3.0156 (3.5850) grad_norm 2.1007 (inf) loss_scale 8192.0000 (10697.5792) mem 7379MB [2024-08-26 07:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][470/1251] eta 0:03:11 lr 0.000955 wd 0.0500 time 0.2596 (0.2452) data time 0.0010 (0.0022) model time 0.2586 (0.2428) loss 3.1784 (3.5804) grad_norm 1.9214 (inf) loss_scale 8192.0000 (10644.3822) mem 7379MB [2024-08-26 07:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][480/1251] eta 0:03:09 lr 0.000955 wd 0.0500 time 0.2483 (0.2452) data time 0.0010 (0.0021) model time 0.2473 (0.2428) loss 2.5023 (3.5800) grad_norm 1.9141 (inf) loss_scale 8192.0000 (10593.3971) mem 7379MB [2024-08-26 07:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][490/1251] eta 0:03:06 lr 0.000955 wd 0.0500 time 0.2429 (0.2451) data time 0.0007 (0.0021) model time 0.2422 (0.2427) loss 2.2851 (3.5738) grad_norm 1.5630 (inf) loss_scale 8192.0000 (10544.4888) mem 7379MB [2024-08-26 07:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[58/300][500/1251] eta 0:03:04 lr 0.000955 wd 0.0500 time 0.2503 (0.2451) data time 0.0009 (0.0021) model time 0.2494 (0.2427) loss 2.9083 (3.5727) grad_norm 2.0158 (inf) loss_scale 8192.0000 (10497.5329) mem 7379MB [2024-08-26 07:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][510/1251] eta 0:03:01 lr 0.000955 wd 0.0500 time 0.2462 (0.2450) data time 0.0010 (0.0021) model time 0.2452 (0.2427) loss 3.5369 (3.5744) grad_norm 1.9221 (inf) loss_scale 8192.0000 (10452.4149) mem 7379MB [2024-08-26 07:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][520/1251] eta 0:02:59 lr 0.000955 wd 0.0500 time 0.2453 (0.2449) data time 0.0008 (0.0021) model time 0.2445 (0.2426) loss 4.4961 (3.5748) grad_norm 2.8105 (inf) loss_scale 8192.0000 (10409.0288) mem 7379MB [2024-08-26 07:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][530/1251] eta 0:02:56 lr 0.000955 wd 0.0500 time 0.2416 (0.2449) data time 0.0010 (0.0021) model time 0.2407 (0.2425) loss 3.8349 (3.5772) grad_norm 2.3579 (inf) loss_scale 8192.0000 (10367.2768) mem 7379MB [2024-08-26 07:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][540/1251] eta 0:02:54 lr 0.000955 wd 0.0500 time 0.2406 (0.2448) data time 0.0009 (0.0020) model time 0.2398 (0.2425) loss 4.4129 (3.5792) grad_norm 1.7037 (inf) loss_scale 8192.0000 (10327.0684) mem 7379MB [2024-08-26 07:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][550/1251] eta 0:02:51 lr 0.000955 wd 0.0500 time 0.2476 (0.2448) data time 0.0012 (0.0020) model time 0.2465 (0.2425) loss 3.4647 (3.5794) grad_norm 1.6818 (inf) loss_scale 8192.0000 (10288.3194) mem 7379MB [2024-08-26 07:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][560/1251] eta 0:02:49 lr 0.000955 wd 0.0500 time 0.2585 (0.2447) data time 0.0010 (0.0020) model time 0.2575 (0.2425) loss 3.5839 (3.5728) grad_norm 1.9277 (inf) loss_scale 8192.0000 (10250.9519) mem 7379MB 
[2024-08-26 07:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][570/1251] eta 0:02:46 lr 0.000955 wd 0.0500 time 0.2399 (0.2446) data time 0.0007 (0.0020) model time 0.2392 (0.2424) loss 3.8050 (3.5746) grad_norm 1.5199 (inf) loss_scale 8192.0000 (10214.8932) mem 7379MB [2024-08-26 07:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][580/1251] eta 0:02:44 lr 0.000955 wd 0.0500 time 0.2446 (0.2449) data time 0.0011 (0.0020) model time 0.2435 (0.2427) loss 3.8726 (3.5795) grad_norm 2.1865 (inf) loss_scale 8192.0000 (10180.0757) mem 7379MB [2024-08-26 07:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][590/1251] eta 0:02:41 lr 0.000955 wd 0.0500 time 0.2476 (0.2449) data time 0.0009 (0.0020) model time 0.2467 (0.2427) loss 4.4981 (3.5808) grad_norm 1.3035 (inf) loss_scale 8192.0000 (10146.4365) mem 7379MB [2024-08-26 07:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][600/1251] eta 0:02:39 lr 0.000955 wd 0.0500 time 0.2378 (0.2448) data time 0.0011 (0.0020) model time 0.2366 (0.2427) loss 3.5812 (3.5763) grad_norm 1.3276 (inf) loss_scale 8192.0000 (10113.9168) mem 7379MB [2024-08-26 07:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][610/1251] eta 0:02:36 lr 0.000955 wd 0.0500 time 0.2426 (0.2448) data time 0.0007 (0.0019) model time 0.2419 (0.2426) loss 2.7778 (3.5731) grad_norm 1.8130 (inf) loss_scale 8192.0000 (10082.4615) mem 7379MB [2024-08-26 07:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][620/1251] eta 0:02:34 lr 0.000955 wd 0.0500 time 0.2331 (0.2447) data time 0.0009 (0.0019) model time 0.2323 (0.2426) loss 3.5114 (3.5709) grad_norm 2.0753 (inf) loss_scale 8192.0000 (10052.0193) mem 7379MB [2024-08-26 07:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][630/1251] eta 0:02:31 lr 0.000955 wd 0.0500 time 0.2415 (0.2446) data time 0.0011 (0.0019) model time 0.2403 (0.2425) loss 
3.7061 (3.5705) grad_norm 1.8974 (inf) loss_scale 8192.0000 (10022.5420) mem 7379MB [2024-08-26 07:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][640/1251] eta 0:02:29 lr 0.000955 wd 0.0500 time 0.2502 (0.2446) data time 0.0007 (0.0019) model time 0.2495 (0.2425) loss 3.2122 (3.5685) grad_norm 3.0308 (inf) loss_scale 8192.0000 (9993.9844) mem 7379MB [2024-08-26 07:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][650/1251] eta 0:02:26 lr 0.000954 wd 0.0500 time 0.2429 (0.2445) data time 0.0008 (0.0019) model time 0.2421 (0.2424) loss 3.6272 (3.5734) grad_norm 1.9627 (inf) loss_scale 8192.0000 (9966.3041) mem 7379MB [2024-08-26 07:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][660/1251] eta 0:02:24 lr 0.000954 wd 0.0500 time 0.2440 (0.2445) data time 0.0009 (0.0019) model time 0.2430 (0.2424) loss 3.5901 (3.5768) grad_norm 2.4256 (inf) loss_scale 8192.0000 (9939.4614) mem 7379MB [2024-08-26 07:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][670/1251] eta 0:02:21 lr 0.000954 wd 0.0500 time 0.2373 (0.2444) data time 0.0009 (0.0018) model time 0.2364 (0.2423) loss 3.5700 (3.5788) grad_norm 1.7834 (inf) loss_scale 8192.0000 (9913.4188) mem 7379MB [2024-08-26 07:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][680/1251] eta 0:02:19 lr 0.000954 wd 0.0500 time 0.2329 (0.2443) data time 0.0009 (0.0018) model time 0.2320 (0.2423) loss 4.4541 (3.5825) grad_norm 1.7950 (inf) loss_scale 8192.0000 (9888.1410) mem 7379MB [2024-08-26 07:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][690/1251] eta 0:02:17 lr 0.000954 wd 0.0500 time 0.2433 (0.2443) data time 0.0008 (0.0018) model time 0.2425 (0.2422) loss 3.2707 (3.5844) grad_norm 1.7988 (inf) loss_scale 8192.0000 (9863.5948) mem 7379MB [2024-08-26 07:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][700/1251] eta 0:02:14 lr 0.000954 wd 0.0500 time 
0.2465 (0.2446) data time 0.0010 (0.0018) model time 0.2455 (0.2426) loss 3.3304 (3.5827) grad_norm 1.4087 (inf) loss_scale 8192.0000 (9839.7489) mem 7379MB [2024-08-26 07:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][710/1251] eta 0:02:12 lr 0.000954 wd 0.0500 time 0.2498 (0.2445) data time 0.0010 (0.0018) model time 0.2488 (0.2426) loss 3.6401 (3.5862) grad_norm 1.8224 (inf) loss_scale 8192.0000 (9816.5738) mem 7379MB [2024-08-26 07:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][720/1251] eta 0:02:10 lr 0.000954 wd 0.0500 time 0.2438 (0.2450) data time 0.0010 (0.0018) model time 0.2428 (0.2431) loss 2.6210 (3.5822) grad_norm 3.0569 (inf) loss_scale 8192.0000 (9794.0416) mem 7379MB [2024-08-26 07:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][730/1251] eta 0:02:07 lr 0.000954 wd 0.0500 time 0.2393 (0.2452) data time 0.0007 (0.0018) model time 0.2385 (0.2433) loss 2.5511 (3.5822) grad_norm 2.3408 (inf) loss_scale 8192.0000 (9772.1259) mem 7379MB [2024-08-26 07:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][740/1251] eta 0:02:05 lr 0.000954 wd 0.0500 time 0.2389 (0.2452) data time 0.0009 (0.0018) model time 0.2380 (0.2433) loss 2.9944 (3.5842) grad_norm 2.1085 (inf) loss_scale 8192.0000 (9750.8016) mem 7379MB [2024-08-26 07:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][750/1251] eta 0:02:02 lr 0.000954 wd 0.0500 time 0.2469 (0.2451) data time 0.0008 (0.0018) model time 0.2461 (0.2432) loss 4.3452 (3.5843) grad_norm 2.0800 (inf) loss_scale 8192.0000 (9730.0453) mem 7379MB [2024-08-26 07:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][760/1251] eta 0:02:00 lr 0.000954 wd 0.0500 time 0.2526 (0.2451) data time 0.0012 (0.0018) model time 0.2513 (0.2432) loss 3.8093 (3.5854) grad_norm 1.8318 (inf) loss_scale 8192.0000 (9709.8344) mem 7379MB [2024-08-26 07:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [58/300][770/1251] eta 0:01:57 lr 0.000954 wd 0.0500 time 0.2330 (0.2450) data time 0.0007 (0.0017) model time 0.2322 (0.2431) loss 3.1399 (3.5821) grad_norm 2.8601 (inf) loss_scale 8192.0000 (9690.1479) mem 7379MB [2024-08-26 07:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][780/1251] eta 0:01:55 lr 0.000954 wd 0.0500 time 0.2401 (0.2450) data time 0.0010 (0.0017) model time 0.2391 (0.2431) loss 4.4355 (3.5869) grad_norm 1.6353 (inf) loss_scale 8192.0000 (9670.9654) mem 7379MB [2024-08-26 07:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][790/1251] eta 0:01:52 lr 0.000954 wd 0.0500 time 0.2424 (0.2449) data time 0.0009 (0.0017) model time 0.2415 (0.2431) loss 3.6321 (3.5878) grad_norm 3.1464 (inf) loss_scale 8192.0000 (9652.2680) mem 7379MB [2024-08-26 07:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][800/1251] eta 0:01:50 lr 0.000954 wd 0.0500 time 0.2410 (0.2449) data time 0.0007 (0.0017) model time 0.2403 (0.2430) loss 2.8325 (3.5846) grad_norm 2.3142 (inf) loss_scale 8192.0000 (9634.0375) mem 7379MB [2024-08-26 07:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][810/1251] eta 0:01:47 lr 0.000954 wd 0.0500 time 0.2370 (0.2448) data time 0.0008 (0.0017) model time 0.2362 (0.2430) loss 2.4634 (3.5830) grad_norm 1.9712 (inf) loss_scale 8192.0000 (9616.2565) mem 7379MB [2024-08-26 07:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][820/1251] eta 0:01:45 lr 0.000954 wd 0.0500 time 0.2454 (0.2448) data time 0.0009 (0.0017) model time 0.2445 (0.2430) loss 4.1668 (3.5850) grad_norm 1.7326 (inf) loss_scale 8192.0000 (9598.9086) mem 7379MB [2024-08-26 07:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][830/1251] eta 0:01:43 lr 0.000954 wd 0.0500 time 0.2410 (0.2447) data time 0.0011 (0.0017) model time 0.2399 (0.2429) loss 3.9556 (3.5862) grad_norm 2.2305 (inf) loss_scale 8192.0000 (9581.9783) 
mem 7379MB [2024-08-26 07:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][840/1251] eta 0:01:40 lr 0.000954 wd 0.0500 time 0.2393 (0.2447) data time 0.0008 (0.0017) model time 0.2385 (0.2429) loss 3.0226 (3.5857) grad_norm 1.7135 (inf) loss_scale 8192.0000 (9565.4507) mem 7379MB [2024-08-26 07:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][850/1251] eta 0:01:38 lr 0.000954 wd 0.0500 time 0.2383 (0.2446) data time 0.0009 (0.0017) model time 0.2374 (0.2428) loss 4.1653 (3.5844) grad_norm 1.4503 (inf) loss_scale 8192.0000 (9549.3114) mem 7379MB [2024-08-26 07:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][860/1251] eta 0:01:35 lr 0.000954 wd 0.0500 time 0.4079 (0.2448) data time 0.0008 (0.0017) model time 0.4071 (0.2430) loss 4.2780 (3.5867) grad_norm 2.3804 (inf) loss_scale 8192.0000 (9533.5470) mem 7379MB [2024-08-26 07:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][870/1251] eta 0:01:33 lr 0.000954 wd 0.0500 time 0.2379 (0.2447) data time 0.0011 (0.0017) model time 0.2368 (0.2430) loss 3.8867 (3.5871) grad_norm 1.5171 (inf) loss_scale 8192.0000 (9518.1447) mem 7379MB [2024-08-26 07:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][880/1251] eta 0:01:30 lr 0.000954 wd 0.0500 time 0.2451 (0.2449) data time 0.0011 (0.0017) model time 0.2439 (0.2431) loss 3.6442 (3.5883) grad_norm 1.4069 (inf) loss_scale 8192.0000 (9503.0919) mem 7379MB [2024-08-26 07:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][890/1251] eta 0:01:28 lr 0.000954 wd 0.0500 time 0.2429 (0.2448) data time 0.0009 (0.0016) model time 0.2419 (0.2431) loss 2.7991 (3.5869) grad_norm 2.2383 (inf) loss_scale 8192.0000 (9488.3771) mem 7379MB [2024-08-26 07:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][900/1251] eta 0:01:25 lr 0.000954 wd 0.0500 time 0.2548 (0.2448) data time 0.0008 (0.0016) model time 0.2540 (0.2431) loss 
2.7172 (3.5871) grad_norm 2.9549 (inf) loss_scale 8192.0000 (9473.9889) mem 7379MB [2024-08-26 07:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][910/1251] eta 0:01:23 lr 0.000954 wd 0.0500 time 0.2342 (0.2448) data time 0.0007 (0.0016) model time 0.2334 (0.2430) loss 4.0553 (3.5868) grad_norm 1.5890 (inf) loss_scale 8192.0000 (9459.9166) mem 7379MB [2024-08-26 07:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][920/1251] eta 0:01:21 lr 0.000954 wd 0.0500 time 0.2460 (0.2447) data time 0.0008 (0.0016) model time 0.2452 (0.2430) loss 4.1562 (3.5853) grad_norm 2.2433 (inf) loss_scale 8192.0000 (9446.1498) mem 7379MB [2024-08-26 07:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][930/1251] eta 0:01:18 lr 0.000954 wd 0.0500 time 0.2438 (0.2447) data time 0.0009 (0.0016) model time 0.2430 (0.2430) loss 3.8009 (3.5848) grad_norm 1.5369 (inf) loss_scale 8192.0000 (9432.6788) mem 7379MB [2024-08-26 07:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][940/1251] eta 0:01:16 lr 0.000954 wd 0.0500 time 0.2462 (0.2448) data time 0.0009 (0.0016) model time 0.2453 (0.2430) loss 3.1155 (3.5866) grad_norm 1.8658 (inf) loss_scale 8192.0000 (9419.4942) mem 7379MB [2024-08-26 07:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][950/1251] eta 0:01:13 lr 0.000954 wd 0.0500 time 0.2426 (0.2447) data time 0.0014 (0.0016) model time 0.2412 (0.2430) loss 3.9290 (3.5885) grad_norm 2.9526 (inf) loss_scale 8192.0000 (9406.5868) mem 7379MB [2024-08-26 07:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][960/1251] eta 0:01:11 lr 0.000954 wd 0.0500 time 0.2377 (0.2447) data time 0.0010 (0.0016) model time 0.2367 (0.2430) loss 3.7869 (3.5900) grad_norm 1.7846 (inf) loss_scale 8192.0000 (9393.9480) mem 7379MB [2024-08-26 07:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][970/1251] eta 0:01:08 lr 0.000954 wd 0.0500 time 
0.2342 (0.2449) data time 0.0008 (0.0016) model time 0.2334 (0.2432) loss 2.6814 (3.5901) grad_norm 1.7540 (inf) loss_scale 8192.0000 (9381.5695) mem 7379MB [2024-08-26 07:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][980/1251] eta 0:01:06 lr 0.000954 wd 0.0500 time 0.2379 (0.2451) data time 0.0011 (0.0016) model time 0.2368 (0.2434) loss 2.8089 (3.5915) grad_norm 1.9126 (inf) loss_scale 8192.0000 (9369.4434) mem 7379MB [2024-08-26 07:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][990/1251] eta 0:01:03 lr 0.000954 wd 0.0500 time 0.2460 (0.2451) data time 0.0007 (0.0016) model time 0.2453 (0.2434) loss 3.0040 (3.5922) grad_norm 1.5537 (inf) loss_scale 8192.0000 (9357.5621) mem 7379MB [2024-08-26 07:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1000/1251] eta 0:01:01 lr 0.000954 wd 0.0500 time 0.2474 (0.2451) data time 0.0010 (0.0016) model time 0.2464 (0.2434) loss 3.5640 (3.5918) grad_norm 1.8923 (inf) loss_scale 8192.0000 (9345.9181) mem 7379MB [2024-08-26 07:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1010/1251] eta 0:00:59 lr 0.000954 wd 0.0500 time 0.2483 (0.2450) data time 0.0007 (0.0016) model time 0.2476 (0.2434) loss 2.8447 (3.5897) grad_norm 1.8059 (inf) loss_scale 8192.0000 (9334.5045) mem 7379MB [2024-08-26 07:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1020/1251] eta 0:00:56 lr 0.000954 wd 0.0500 time 0.2335 (0.2450) data time 0.0012 (0.0016) model time 0.2323 (0.2433) loss 3.6968 (3.5867) grad_norm 2.0921 (inf) loss_scale 8192.0000 (9323.3144) mem 7379MB [2024-08-26 07:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1030/1251] eta 0:00:54 lr 0.000954 wd 0.0500 time 0.2354 (0.2449) data time 0.0007 (0.0016) model time 0.2347 (0.2433) loss 3.0897 (3.5869) grad_norm 1.9576 (inf) loss_scale 8192.0000 (9312.3414) mem 7379MB [2024-08-26 07:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [58/300][1040/1251] eta 0:00:51 lr 0.000954 wd 0.0500 time 0.2458 (0.2449) data time 0.0008 (0.0016) model time 0.2450 (0.2433) loss 4.2795 (3.5894) grad_norm 2.0133 (inf) loss_scale 8192.0000 (9301.5793) mem 7379MB [2024-08-26 07:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1050/1251] eta 0:00:49 lr 0.000954 wd 0.0500 time 0.2467 (0.2449) data time 0.0010 (0.0016) model time 0.2457 (0.2432) loss 3.6661 (3.5906) grad_norm 1.5229 (inf) loss_scale 8192.0000 (9291.0219) mem 7379MB [2024-08-26 07:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1060/1251] eta 0:00:46 lr 0.000954 wd 0.0500 time 0.2413 (0.2449) data time 0.0009 (0.0016) model time 0.2404 (0.2432) loss 4.4562 (3.5870) grad_norm 2.4638 (inf) loss_scale 8192.0000 (9280.6635) mem 7379MB [2024-08-26 07:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1070/1251] eta 0:00:44 lr 0.000954 wd 0.0500 time 0.2381 (0.2448) data time 0.0009 (0.0016) model time 0.2372 (0.2432) loss 4.7123 (3.5884) grad_norm 1.7566 (inf) loss_scale 8192.0000 (9270.4986) mem 7379MB [2024-08-26 07:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1080/1251] eta 0:00:41 lr 0.000954 wd 0.0500 time 0.2361 (0.2448) data time 0.0008 (0.0016) model time 0.2352 (0.2432) loss 2.9689 (3.5882) grad_norm 1.7335 (inf) loss_scale 8192.0000 (9260.5217) mem 7379MB [2024-08-26 07:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1090/1251] eta 0:00:39 lr 0.000954 wd 0.0500 time 0.2387 (0.2448) data time 0.0009 (0.0015) model time 0.2378 (0.2432) loss 4.0781 (3.5880) grad_norm 2.0940 (inf) loss_scale 8192.0000 (9250.7278) mem 7379MB [2024-08-26 07:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1100/1251] eta 0:00:36 lr 0.000954 wd 0.0500 time 0.2354 (0.2448) data time 0.0011 (0.0015) model time 0.2343 (0.2432) loss 3.0565 (3.5853) grad_norm 1.5590 (inf) loss_scale 8192.0000 
(9241.1117) mem 7379MB [2024-08-26 07:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1110/1251] eta 0:00:34 lr 0.000954 wd 0.0500 time 0.2400 (0.2449) data time 0.0008 (0.0015) model time 0.2392 (0.2433) loss 3.8181 (3.5846) grad_norm 1.5749 (inf) loss_scale 8192.0000 (9231.6688) mem 7379MB [2024-08-26 07:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1120/1251] eta 0:00:32 lr 0.000954 wd 0.0500 time 0.2479 (0.2449) data time 0.0007 (0.0015) model time 0.2471 (0.2433) loss 4.0979 (3.5855) grad_norm 1.7903 (inf) loss_scale 8192.0000 (9222.3943) mem 7379MB [2024-08-26 07:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1130/1251] eta 0:00:29 lr 0.000954 wd 0.0500 time 0.2404 (0.2449) data time 0.0009 (0.0015) model time 0.2395 (0.2433) loss 4.3499 (3.5884) grad_norm 1.8775 (inf) loss_scale 8192.0000 (9213.2838) mem 7379MB [2024-08-26 07:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1140/1251] eta 0:00:27 lr 0.000954 wd 0.0500 time 0.2353 (0.2448) data time 0.0009 (0.0015) model time 0.2344 (0.2432) loss 4.1047 (3.5897) grad_norm 2.1026 (inf) loss_scale 8192.0000 (9204.3330) mem 7379MB [2024-08-26 07:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1150/1251] eta 0:00:24 lr 0.000954 wd 0.0500 time 0.2352 (0.2448) data time 0.0010 (0.0015) model time 0.2343 (0.2432) loss 3.6521 (3.5897) grad_norm 1.5183 (inf) loss_scale 8192.0000 (9195.5378) mem 7379MB [2024-08-26 07:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1160/1251] eta 0:00:22 lr 0.000954 wd 0.0500 time 0.2410 (0.2448) data time 0.0012 (0.0015) model time 0.2398 (0.2432) loss 4.0740 (3.5914) grad_norm 1.7699 (inf) loss_scale 8192.0000 (9186.8941) mem 7379MB [2024-08-26 07:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1170/1251] eta 0:00:19 lr 0.000954 wd 0.0500 time 0.2443 (0.2448) data time 0.0007 (0.0015) model time 
0.2436 (0.2432) loss 3.3143 (3.5935) grad_norm 2.1373 (inf) loss_scale 8192.0000 (9178.3980) mem 7379MB [2024-08-26 07:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1180/1251] eta 0:00:17 lr 0.000953 wd 0.0500 time 0.2479 (0.2448) data time 0.0010 (0.0015) model time 0.2470 (0.2432) loss 2.5693 (3.5922) grad_norm 2.1434 (inf) loss_scale 8192.0000 (9170.0457) mem 7379MB [2024-08-26 07:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1190/1251] eta 0:00:14 lr 0.000953 wd 0.0500 time 0.2371 (0.2447) data time 0.0007 (0.0015) model time 0.2364 (0.2431) loss 2.4741 (3.5922) grad_norm 1.4492 (inf) loss_scale 8192.0000 (9161.8338) mem 7379MB [2024-08-26 07:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1200/1251] eta 0:00:12 lr 0.000953 wd 0.0500 time 0.2312 (0.2447) data time 0.0007 (0.0015) model time 0.2305 (0.2431) loss 4.2567 (3.5913) grad_norm 1.8361 (inf) loss_scale 8192.0000 (9153.7585) mem 7379MB [2024-08-26 07:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1210/1251] eta 0:00:10 lr 0.000953 wd 0.0500 time 0.2467 (0.2447) data time 0.0010 (0.0015) model time 0.2457 (0.2431) loss 2.7509 (3.5902) grad_norm 1.5753 (inf) loss_scale 8192.0000 (9145.8167) mem 7379MB [2024-08-26 07:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1220/1251] eta 0:00:07 lr 0.000953 wd 0.0500 time 0.2436 (0.2446) data time 0.0007 (0.0015) model time 0.2429 (0.2431) loss 4.1391 (3.5912) grad_norm 2.6032 (inf) loss_scale 8192.0000 (9138.0049) mem 7379MB [2024-08-26 07:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1230/1251] eta 0:00:05 lr 0.000953 wd 0.0500 time 0.2400 (0.2446) data time 0.0007 (0.0015) model time 0.2393 (0.2430) loss 3.5106 (3.5934) grad_norm 1.8826 (inf) loss_scale 8192.0000 (9130.3201) mem 7379MB [2024-08-26 07:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1240/1251] eta 0:00:02 
lr 0.000953 wd 0.0500 time 0.2228 (0.2449) data time 0.0005 (0.0015) model time 0.2223 (0.2433) loss 2.3280 (3.5900) grad_norm 1.6051 (inf) loss_scale 8192.0000 (9122.7591) mem 7379MB [2024-08-26 07:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [58/300][1250/1251] eta 0:00:00 lr 0.000953 wd 0.0500 time 0.2231 (0.2449) data time 0.0007 (0.0015) model time 0.2224 (0.2433) loss 3.4021 (3.5881) grad_norm 1.8588 (inf) loss_scale 8192.0000 (9115.3189) mem 7379MB [2024-08-26 07:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 58 training takes 0:05:06 [2024-08-26 07:17:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:17:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.485 (0.485) Loss 0.5273 (0.5273) Acc@1 89.160 (89.160) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-26 07:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.113) Loss 0.8735 (0.8271) Acc@1 80.762 (80.877) Acc@5 95.312 (95.907) Mem 7379MB [2024-08-26 07:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.096) Loss 1.3213 (0.8635) Acc@1 69.043 (80.027) Acc@5 90.527 (95.717) Mem 7379MB [2024-08-26 07:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.091) Loss 1.5273 (0.9901) Acc@1 63.867 (77.378) Acc@5 86.914 (94.068) Mem 7379MB [2024-08-26 07:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.3193 (1.0594) Acc@1 69.824 (75.805) Acc@5 90.820 (93.297) Mem 7379MB [2024-08-26 07:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.414 Acc@5 93.220 [2024-08-26 07:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on 
the 50000 test images: 75.4%
[2024-08-26 07:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 75.41%
[2024-08-26 07:17:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 07:17:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 07:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.434 (0.434) Loss 0.4695 (0.4695) Acc@1 90.625 (90.625) Acc@5 98.242 (98.242) Mem 7379MB
[2024-08-26 07:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.109) Loss 0.7778 (0.7510) Acc@1 83.887 (83.407) Acc@5 95.996 (96.591) Mem 7379MB
[2024-08-26 07:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.094) Loss 1.0908 (0.7697) Acc@1 74.902 (82.431) Acc@5 92.383 (96.526) Mem 7379MB
[2024-08-26 07:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.089) Loss 1.3457 (0.8818) Acc@1 64.746 (79.744) Acc@5 89.258 (95.086) Mem 7379MB
[2024-08-26 07:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.2627 (0.9448) Acc@1 68.652 (78.125) Acc@5 90.527 (94.334) Mem 7379MB
[2024-08-26 07:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.786 Acc@5 94.274
[2024-08-26 07:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.8%
[2024-08-26 07:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.79%
[2024-08-26 07:17:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 07:17:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 07:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][0/1251] eta 0:13:29 lr 0.000953 wd 0.0500 time 0.6473 (0.6473) data time 0.4256 (0.4256) model time 0.0000 (0.0000) loss 3.9615 (3.9615) grad_norm 1.4836 (1.4836) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][10/1251] eta 0:06:05 lr 0.000953 wd 0.0500 time 0.2345 (0.2946) data time 0.0009 (0.0397) model time 0.0000 (0.0000) loss 3.9033 (3.5103) grad_norm 2.5662 (2.0723) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][20/1251] eta 0:05:44 lr 0.000953 wd 0.0500 time 0.2433 (0.2800) data time 0.0008 (0.0212) model time 0.0000 (0.0000) loss 3.4619 (3.5424) grad_norm 1.9381 (2.1502) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][30/1251] eta 0:05:26 lr 0.000953 wd 0.0500 time 0.2404 (0.2673) data time 0.0012 (0.0154) model time 0.0000 (0.0000) loss 3.9598 (3.6312) grad_norm 2.0551 (2.0242) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][40/1251] eta 0:05:16 lr 0.000953 wd 0.0500 time 0.2429 (0.2613) data time 0.0008 (0.0120) model time 0.0000 (0.0000) loss 4.3275 (3.6508) grad_norm 1.8743 (1.9730) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][50/1251] eta 0:05:09 lr 0.000953 wd 0.0500 time 0.2460 (0.2575) data time 0.0007 (0.0098) model time 0.0000 (0.0000) loss 4.0401 (3.5471) grad_norm 1.8839 (1.9290) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][60/1251] eta 0:05:03 lr 0.000953 wd 0.0500 time 0.2455 (0.2548) data time 0.0007 (0.0084) model time 0.2448 (0.2400) loss 
2.6514 (3.5346) grad_norm 2.4330 (1.9876) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][70/1251] eta 0:04:58 lr 0.000953 wd 0.0500 time 0.2297 (0.2526) data time 0.0009 (0.0073) model time 0.2288 (0.2391) loss 4.5278 (3.5235) grad_norm 1.7303 (1.9925) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][80/1251] eta 0:04:54 lr 0.000953 wd 0.0500 time 0.2438 (0.2512) data time 0.0007 (0.0066) model time 0.2431 (0.2395) loss 3.3281 (3.4925) grad_norm 1.6828 (1.9939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][90/1251] eta 0:04:50 lr 0.000953 wd 0.0500 time 0.2424 (0.2500) data time 0.0011 (0.0060) model time 0.2413 (0.2393) loss 2.3557 (3.4795) grad_norm 1.9060 (1.9618) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][100/1251] eta 0:04:46 lr 0.000953 wd 0.0500 time 0.2390 (0.2491) data time 0.0011 (0.0055) model time 0.2379 (0.2394) loss 3.9583 (3.4857) grad_norm 1.6049 (1.9398) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][110/1251] eta 0:04:43 lr 0.000953 wd 0.0500 time 0.2332 (0.2482) data time 0.0009 (0.0051) model time 0.2323 (0.2392) loss 3.3448 (3.4982) grad_norm 1.4483 (1.9362) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][120/1251] eta 0:04:40 lr 0.000953 wd 0.0500 time 0.2433 (0.2477) data time 0.0009 (0.0047) model time 0.2424 (0.2395) loss 2.8853 (3.4982) grad_norm 1.8360 (1.9298) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][130/1251] eta 0:04:37 lr 
0.000953 wd 0.0500 time 0.2360 (0.2472) data time 0.0011 (0.0044) model time 0.2350 (0.2395) loss 4.1608 (3.5228) grad_norm 2.2882 (1.9293) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][140/1251] eta 0:04:35 lr 0.000953 wd 0.0500 time 0.2361 (0.2480) data time 0.0010 (0.0042) model time 0.2351 (0.2415) loss 4.0067 (3.5075) grad_norm 1.6939 (1.9455) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][150/1251] eta 0:04:33 lr 0.000953 wd 0.0500 time 0.2372 (0.2486) data time 0.0010 (0.0040) model time 0.2362 (0.2430) loss 3.5444 (3.4937) grad_norm 1.2445 (1.9503) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][160/1251] eta 0:04:30 lr 0.000953 wd 0.0500 time 0.2425 (0.2483) data time 0.0007 (0.0038) model time 0.2417 (0.2431) loss 3.3715 (3.4966) grad_norm 1.7654 (1.9416) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][170/1251] eta 0:04:27 lr 0.000953 wd 0.0500 time 0.2352 (0.2479) data time 0.0009 (0.0036) model time 0.2343 (0.2428) loss 2.5740 (3.5041) grad_norm 2.0226 (1.9350) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][180/1251] eta 0:04:25 lr 0.000953 wd 0.0500 time 0.2397 (0.2476) data time 0.0011 (0.0035) model time 0.2387 (0.2426) loss 2.9911 (3.4957) grad_norm 2.5462 (1.9294) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][190/1251] eta 0:04:22 lr 0.000953 wd 0.0500 time 0.2335 (0.2473) data time 0.0008 (0.0034) model time 0.2327 (0.2425) loss 4.4331 (3.5021) grad_norm 1.7109 (1.9431) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][200/1251] eta 0:04:19 lr 0.000953 wd 0.0500 time 0.2452 (0.2470) data time 0.0009 (0.0032) model time 0.2444 (0.2424) loss 3.8274 (3.5033) grad_norm 1.2891 (1.9393) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][210/1251] eta 0:04:16 lr 0.000953 wd 0.0500 time 0.2359 (0.2467) data time 0.0009 (0.0032) model time 0.2351 (0.2423) loss 4.0079 (3.5106) grad_norm 1.8945 (1.9407) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][220/1251] eta 0:04:14 lr 0.000953 wd 0.0500 time 0.2418 (0.2466) data time 0.0009 (0.0031) model time 0.2409 (0.2423) loss 4.5716 (3.5144) grad_norm 1.6839 (1.9405) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][230/1251] eta 0:04:11 lr 0.000953 wd 0.0500 time 0.2374 (0.2464) data time 0.0007 (0.0030) model time 0.2366 (0.2422) loss 3.9252 (3.4990) grad_norm 2.1887 (1.9568) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][240/1251] eta 0:04:08 lr 0.000953 wd 0.0500 time 0.2452 (0.2463) data time 0.0010 (0.0029) model time 0.2442 (0.2422) loss 3.0330 (3.4989) grad_norm 1.3557 (1.9633) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][250/1251] eta 0:04:06 lr 0.000953 wd 0.0500 time 0.2409 (0.2462) data time 0.0008 (0.0028) model time 0.2401 (0.2422) loss 3.1751 (3.4976) grad_norm 1.9694 (1.9665) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][260/1251] eta 0:04:03 lr 0.000953 wd 0.0500 time 0.2422 (0.2460) data time 0.0010 (0.0027) model time 0.2411 (0.2422) loss 3.6008 
(3.5069) grad_norm 1.3425 (1.9570) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][270/1251] eta 0:04:01 lr 0.000953 wd 0.0500 time 0.2355 (0.2466) data time 0.0008 (0.0027) model time 0.2347 (0.2430) loss 2.9473 (3.5014) grad_norm 1.4277 (1.9532) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][280/1251] eta 0:03:59 lr 0.000953 wd 0.0500 time 0.2393 (0.2465) data time 0.0011 (0.0027) model time 0.2382 (0.2430) loss 3.8496 (3.4978) grad_norm 1.2774 (1.9445) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][290/1251] eta 0:03:56 lr 0.000953 wd 0.0500 time 0.2398 (0.2464) data time 0.0009 (0.0026) model time 0.2389 (0.2430) loss 2.8743 (3.5011) grad_norm 2.4341 (1.9457) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][300/1251] eta 0:03:54 lr 0.000953 wd 0.0500 time 0.2450 (0.2462) data time 0.0010 (0.0025) model time 0.2440 (0.2429) loss 3.9057 (3.5115) grad_norm 1.9623 (1.9504) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][310/1251] eta 0:03:51 lr 0.000953 wd 0.0500 time 0.2388 (0.2461) data time 0.0009 (0.0025) model time 0.2379 (0.2428) loss 2.8612 (3.5091) grad_norm 1.9289 (1.9685) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][320/1251] eta 0:03:49 lr 0.000953 wd 0.0500 time 0.2515 (0.2461) data time 0.0009 (0.0025) model time 0.2505 (0.2429) loss 2.6732 (3.5093) grad_norm 1.7494 (1.9625) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][330/1251] eta 0:03:46 lr 0.000953 wd 
0.0500 time 0.2427 (0.2460) data time 0.0011 (0.0024) model time 0.2415 (0.2429) loss 3.7425 (3.5118) grad_norm 2.3952 (1.9584) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][340/1251] eta 0:03:43 lr 0.000953 wd 0.0500 time 0.2430 (0.2459) data time 0.0009 (0.0024) model time 0.2421 (0.2427) loss 4.2760 (3.5203) grad_norm 1.6538 (1.9502) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][350/1251] eta 0:03:41 lr 0.000953 wd 0.0500 time 0.2317 (0.2458) data time 0.0010 (0.0024) model time 0.2307 (0.2427) loss 3.8119 (3.5169) grad_norm 2.5835 (1.9536) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][360/1251] eta 0:03:38 lr 0.000953 wd 0.0500 time 0.2449 (0.2456) data time 0.0007 (0.0023) model time 0.2442 (0.2426) loss 3.8449 (3.5197) grad_norm 2.4899 (1.9683) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][370/1251] eta 0:03:36 lr 0.000953 wd 0.0500 time 0.2588 (0.2456) data time 0.0011 (0.0023) model time 0.2577 (0.2426) loss 3.5415 (3.5166) grad_norm 2.8027 (1.9690) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][380/1251] eta 0:03:33 lr 0.000953 wd 0.0500 time 0.2472 (0.2455) data time 0.0009 (0.0023) model time 0.2463 (0.2426) loss 2.8966 (3.5136) grad_norm 1.4394 (1.9675) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][390/1251] eta 0:03:31 lr 0.000953 wd 0.0500 time 0.2447 (0.2455) data time 0.0010 (0.0022) model time 0.2436 (0.2426) loss 3.3604 (3.5237) grad_norm 1.3534 (1.9751) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][400/1251] eta 0:03:29 lr 0.000953 wd 0.0500 time 0.2406 (0.2459) data time 0.0008 (0.0022) model time 0.2398 (0.2431) loss 3.1409 (3.5232) grad_norm 1.6648 (1.9704) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][410/1251] eta 0:03:26 lr 0.000953 wd 0.0500 time 0.2435 (0.2458) data time 0.0010 (0.0022) model time 0.2425 (0.2430) loss 4.2365 (3.5235) grad_norm 1.7815 (1.9679) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][420/1251] eta 0:03:24 lr 0.000953 wd 0.0500 time 0.2336 (0.2457) data time 0.0012 (0.0021) model time 0.2324 (0.2429) loss 3.7708 (3.5291) grad_norm 2.7617 (1.9704) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][430/1251] eta 0:03:21 lr 0.000953 wd 0.0500 time 0.2342 (0.2456) data time 0.0012 (0.0021) model time 0.2330 (0.2429) loss 3.6673 (3.5307) grad_norm 1.9740 (1.9691) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][440/1251] eta 0:03:19 lr 0.000953 wd 0.0500 time 0.2452 (0.2455) data time 0.0008 (0.0021) model time 0.2443 (0.2429) loss 4.4626 (3.5328) grad_norm 1.5867 (1.9644) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][450/1251] eta 0:03:16 lr 0.000953 wd 0.0500 time 0.2405 (0.2454) data time 0.0008 (0.0021) model time 0.2397 (0.2428) loss 3.9421 (3.5400) grad_norm 1.7511 (1.9609) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][460/1251] eta 0:03:14 lr 0.000952 wd 0.0500 time 0.2439 (0.2458) data time 0.0008 (0.0020) model time 0.2431 (0.2433) loss 3.8098 
(3.5499) grad_norm 1.9030 (1.9613) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][470/1251] eta 0:03:11 lr 0.000952 wd 0.0500 time 0.2400 (0.2458) data time 0.0008 (0.0020) model time 0.2391 (0.2432) loss 3.6561 (3.5530) grad_norm 1.6701 (1.9599) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][480/1251] eta 0:03:09 lr 0.000952 wd 0.0500 time 0.2462 (0.2457) data time 0.0010 (0.0020) model time 0.2453 (0.2432) loss 3.9749 (3.5539) grad_norm 1.5294 (1.9582) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][490/1251] eta 0:03:06 lr 0.000952 wd 0.0500 time 0.2452 (0.2457) data time 0.0007 (0.0020) model time 0.2445 (0.2432) loss 4.0967 (3.5476) grad_norm 1.9617 (1.9605) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][500/1251] eta 0:03:04 lr 0.000952 wd 0.0500 time 0.2419 (0.2456) data time 0.0010 (0.0020) model time 0.2408 (0.2431) loss 3.8616 (3.5525) grad_norm 2.0767 (1.9613) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][510/1251] eta 0:03:02 lr 0.000952 wd 0.0500 time 0.2376 (0.2459) data time 0.0008 (0.0020) model time 0.2369 (0.2436) loss 2.6076 (3.5533) grad_norm 2.1020 (1.9594) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][520/1251] eta 0:02:59 lr 0.000952 wd 0.0500 time 0.2456 (0.2459) data time 0.0008 (0.0019) model time 0.2448 (0.2435) loss 3.2939 (3.5494) grad_norm 1.8868 (1.9564) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][530/1251] eta 0:02:57 lr 0.000952 wd 
0.0500 time 0.2441 (0.2462) data time 0.0010 (0.0019) model time 0.2431 (0.2440) loss 3.9422 (3.5455) grad_norm 2.1412 (1.9643) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][540/1251] eta 0:02:55 lr 0.000952 wd 0.0500 time 0.2414 (0.2462) data time 0.0007 (0.0019) model time 0.2406 (0.2439) loss 2.9799 (3.5443) grad_norm 1.4430 (1.9640) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][550/1251] eta 0:02:53 lr 0.000952 wd 0.0500 time 0.4359 (0.2468) data time 0.0008 (0.0019) model time 0.4351 (0.2447) loss 2.4737 (3.5453) grad_norm 1.6432 (1.9643) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][560/1251] eta 0:02:50 lr 0.000952 wd 0.0500 time 0.2429 (0.2467) data time 0.0010 (0.0019) model time 0.2420 (0.2446) loss 3.2362 (3.5363) grad_norm 1.3879 (1.9584) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][570/1251] eta 0:02:47 lr 0.000952 wd 0.0500 time 0.2388 (0.2466) data time 0.0011 (0.0019) model time 0.2377 (0.2445) loss 3.5522 (3.5387) grad_norm 1.9568 (1.9588) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][580/1251] eta 0:02:45 lr 0.000952 wd 0.0500 time 0.2406 (0.2465) data time 0.0009 (0.0018) model time 0.2397 (0.2444) loss 4.0623 (3.5407) grad_norm 1.9495 (1.9643) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][590/1251] eta 0:02:42 lr 0.000952 wd 0.0500 time 0.2502 (0.2465) data time 0.0009 (0.0018) model time 0.2493 (0.2443) loss 3.8029 (3.5459) grad_norm 2.9722 (1.9784) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:53 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][600/1251] eta 0:02:40 lr 0.000952 wd 0.0500 time 0.2448 (0.2464) data time 0.0007 (0.0018) model time 0.2440 (0.2443) loss 4.0424 (3.5467) grad_norm 2.2597 (1.9772) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][610/1251] eta 0:02:37 lr 0.000952 wd 0.0500 time 0.2438 (0.2463) data time 0.0007 (0.0018) model time 0.2431 (0.2442) loss 3.7621 (3.5443) grad_norm 1.5198 (1.9741) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][620/1251] eta 0:02:35 lr 0.000952 wd 0.0500 time 0.2421 (0.2463) data time 0.0008 (0.0018) model time 0.2413 (0.2442) loss 2.4997 (3.5398) grad_norm 2.0240 (1.9737) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][630/1251] eta 0:02:32 lr 0.000952 wd 0.0500 time 0.2368 (0.2462) data time 0.0009 (0.0018) model time 0.2359 (0.2441) loss 3.3932 (3.5367) grad_norm 2.2235 (1.9741) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][640/1251] eta 0:02:30 lr 0.000952 wd 0.0500 time 0.2405 (0.2461) data time 0.0008 (0.0018) model time 0.2398 (0.2440) loss 2.7478 (3.5380) grad_norm 2.5735 (1.9750) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][650/1251] eta 0:02:27 lr 0.000952 wd 0.0500 time 0.2410 (0.2460) data time 0.0008 (0.0018) model time 0.2403 (0.2440) loss 2.2853 (3.5318) grad_norm 1.7875 (1.9729) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][660/1251] eta 0:02:25 lr 0.000952 wd 0.0500 time 0.2441 (0.2463) data time 0.0009 (0.0017) model time 0.2432 (0.2443) loss 3.8043 
(3.5319) grad_norm 1.7102 (1.9707) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][670/1251] eta 0:02:23 lr 0.000952 wd 0.0500 time 0.2395 (0.2462) data time 0.0010 (0.0017) model time 0.2385 (0.2442) loss 3.6892 (3.5324) grad_norm 2.1465 (1.9708) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][680/1251] eta 0:02:20 lr 0.000952 wd 0.0500 time 0.2438 (0.2464) data time 0.0010 (0.0017) model time 0.2428 (0.2445) loss 2.4886 (3.5311) grad_norm 1.7880 (1.9691) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][690/1251] eta 0:02:18 lr 0.000952 wd 0.0500 time 0.2440 (0.2464) data time 0.0011 (0.0017) model time 0.2429 (0.2444) loss 3.5021 (3.5322) grad_norm 1.4162 (1.9660) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][700/1251] eta 0:02:15 lr 0.000952 wd 0.0500 time 0.2437 (0.2463) data time 0.0010 (0.0017) model time 0.2428 (0.2444) loss 3.7425 (3.5315) grad_norm 1.8981 (1.9661) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][710/1251] eta 0:02:13 lr 0.000952 wd 0.0500 time 0.2405 (0.2462) data time 0.0010 (0.0017) model time 0.2395 (0.2443) loss 4.1281 (3.5344) grad_norm 1.3884 (1.9666) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][720/1251] eta 0:02:10 lr 0.000952 wd 0.0500 time 0.2435 (0.2462) data time 0.0007 (0.0017) model time 0.2428 (0.2443) loss 4.3505 (3.5351) grad_norm 1.5537 (1.9626) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][730/1251] eta 0:02:08 lr 0.000952 wd 
0.0500 time 0.2399 (0.2461) data time 0.0012 (0.0017) model time 0.2387 (0.2442) loss 3.6526 (3.5354) grad_norm 1.7640 (1.9645) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][740/1251] eta 0:02:05 lr 0.000952 wd 0.0500 time 0.2414 (0.2461) data time 0.0009 (0.0017) model time 0.2405 (0.2442) loss 3.8987 (3.5354) grad_norm 1.9755 (1.9622) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][750/1251] eta 0:02:03 lr 0.000952 wd 0.0500 time 0.2380 (0.2461) data time 0.0008 (0.0017) model time 0.2372 (0.2442) loss 2.8988 (3.5330) grad_norm 1.6870 (1.9589) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][760/1251] eta 0:02:00 lr 0.000952 wd 0.0500 time 0.2368 (0.2460) data time 0.0009 (0.0016) model time 0.2359 (0.2441) loss 3.8399 (3.5324) grad_norm 1.8353 (1.9592) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][770/1251] eta 0:01:58 lr 0.000952 wd 0.0500 time 0.2443 (0.2460) data time 0.0007 (0.0016) model time 0.2435 (0.2441) loss 3.0654 (3.5323) grad_norm 1.5320 (1.9562) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][780/1251] eta 0:01:55 lr 0.000952 wd 0.0500 time 0.2479 (0.2459) data time 0.0011 (0.0016) model time 0.2468 (0.2440) loss 2.9596 (3.5317) grad_norm 2.2451 (1.9534) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][790/1251] eta 0:01:53 lr 0.000952 wd 0.0500 time 0.2425 (0.2459) data time 0.0007 (0.0016) model time 0.2418 (0.2440) loss 4.0228 (3.5344) grad_norm 1.4584 (1.9533) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:42 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][800/1251] eta 0:01:50 lr 0.000952 wd 0.0500 time 0.2460 (0.2458) data time 0.0008 (0.0016) model time 0.2452 (0.2439) loss 3.7435 (3.5352) grad_norm 1.7644 (1.9564) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][810/1251] eta 0:01:48 lr 0.000952 wd 0.0500 time 0.2392 (0.2457) data time 0.0011 (0.0016) model time 0.2381 (0.2439) loss 3.7223 (3.5396) grad_norm 2.8582 (1.9554) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][820/1251] eta 0:01:45 lr 0.000952 wd 0.0500 time 0.2443 (0.2457) data time 0.0008 (0.0016) model time 0.2435 (0.2439) loss 2.5047 (3.5380) grad_norm 2.1727 (1.9579) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][830/1251] eta 0:01:43 lr 0.000952 wd 0.0500 time 0.2389 (0.2457) data time 0.0009 (0.0016) model time 0.2380 (0.2438) loss 3.2085 (3.5362) grad_norm 1.8399 (1.9570) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][840/1251] eta 0:01:40 lr 0.000952 wd 0.0500 time 0.2350 (0.2456) data time 0.0011 (0.0016) model time 0.2339 (0.2438) loss 3.7408 (3.5361) grad_norm 2.3206 (1.9560) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][850/1251] eta 0:01:38 lr 0.000952 wd 0.0500 time 0.2397 (0.2456) data time 0.0008 (0.0016) model time 0.2390 (0.2438) loss 3.2930 (3.5394) grad_norm 1.8783 (1.9621) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][860/1251] eta 0:01:36 lr 0.000952 wd 0.0500 time 0.2367 (0.2455) data time 0.0010 (0.0016) model time 0.2357 (0.2437) loss 3.8928 
(3.5413) grad_norm 1.7726 (1.9620) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][870/1251] eta 0:01:33 lr 0.000952 wd 0.0500 time 0.2416 (0.2455) data time 0.0009 (0.0016) model time 0.2406 (0.2437) loss 3.0118 (3.5436) grad_norm 2.9667 (1.9623) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][880/1251] eta 0:01:31 lr 0.000952 wd 0.0500 time 0.2501 (0.2455) data time 0.0007 (0.0016) model time 0.2494 (0.2437) loss 4.3754 (3.5410) grad_norm 4.2169 (1.9656) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][890/1251] eta 0:01:28 lr 0.000952 wd 0.0500 time 0.2437 (0.2454) data time 0.0009 (0.0016) model time 0.2428 (0.2436) loss 3.7763 (3.5435) grad_norm 2.0843 (1.9639) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][900/1251] eta 0:01:26 lr 0.000952 wd 0.0500 time 0.2422 (0.2454) data time 0.0009 (0.0016) model time 0.2413 (0.2436) loss 2.7752 (3.5466) grad_norm 2.2577 (1.9656) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][910/1251] eta 0:01:23 lr 0.000952 wd 0.0500 time 0.2463 (0.2454) data time 0.0009 (0.0016) model time 0.2454 (0.2436) loss 3.5315 (3.5466) grad_norm 1.8334 (1.9654) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][920/1251] eta 0:01:21 lr 0.000952 wd 0.0500 time 0.2581 (0.2456) data time 0.0009 (0.0015) model time 0.2573 (0.2438) loss 4.4029 (3.5511) grad_norm 1.8804 (1.9632) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][930/1251] eta 0:01:18 lr 0.000952 wd 
0.0500 time 0.2362 (0.2455) data time 0.0009 (0.0015) model time 0.2353 (0.2437) loss 4.0576 (3.5550) grad_norm 1.8198 (1.9634) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][940/1251] eta 0:01:16 lr 0.000952 wd 0.0500 time 0.2467 (0.2455) data time 0.0008 (0.0015) model time 0.2458 (0.2437) loss 4.5552 (3.5569) grad_norm 2.5251 (1.9659) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][950/1251] eta 0:01:13 lr 0.000952 wd 0.0500 time 0.2356 (0.2454) data time 0.0016 (0.0015) model time 0.2340 (0.2437) loss 3.3190 (3.5588) grad_norm 1.3702 (1.9647) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][960/1251] eta 0:01:11 lr 0.000952 wd 0.0500 time 0.2428 (0.2454) data time 0.0010 (0.0015) model time 0.2418 (0.2436) loss 2.9980 (3.5618) grad_norm 2.5424 (1.9652) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][970/1251] eta 0:01:08 lr 0.000952 wd 0.0500 time 0.2417 (0.2453) data time 0.0008 (0.0015) model time 0.2409 (0.2436) loss 3.7695 (3.5631) grad_norm 2.2545 (1.9663) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][980/1251] eta 0:01:06 lr 0.000952 wd 0.0500 time 0.2459 (0.2453) data time 0.0010 (0.0015) model time 0.2449 (0.2436) loss 4.1966 (3.5657) grad_norm 1.3590 (1.9659) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][990/1251] eta 0:01:04 lr 0.000951 wd 0.0500 time 0.2391 (0.2453) data time 0.0011 (0.0015) model time 0.2379 (0.2436) loss 2.9360 (3.5661) grad_norm 2.0306 (1.9653) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1000/1251] eta 0:01:01 lr 0.000951 wd 0.0500 time 0.2496 (0.2454) data time 0.0008 (0.0015) model time 0.2489 (0.2437) loss 2.2598 (3.5644) grad_norm 2.2078 (1.9631) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1010/1251] eta 0:00:59 lr 0.000951 wd 0.0500 time 0.2402 (0.2454) data time 0.0008 (0.0015) model time 0.2395 (0.2437) loss 4.3164 (3.5645) grad_norm 2.7243 (1.9624) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1020/1251] eta 0:00:56 lr 0.000951 wd 0.0500 time 0.2435 (0.2454) data time 0.0008 (0.0015) model time 0.2428 (0.2437) loss 3.9627 (3.5654) grad_norm 2.3925 (1.9648) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1030/1251] eta 0:00:54 lr 0.000951 wd 0.0500 time 0.2467 (0.2454) data time 0.0010 (0.0015) model time 0.2457 (0.2437) loss 3.9773 (3.5665) grad_norm 1.8891 (1.9624) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1040/1251] eta 0:00:51 lr 0.000951 wd 0.0500 time 0.2418 (0.2453) data time 0.0010 (0.0015) model time 0.2408 (0.2436) loss 3.0734 (3.5659) grad_norm 2.2602 (1.9610) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1050/1251] eta 0:00:49 lr 0.000951 wd 0.0500 time 0.2393 (0.2455) data time 0.0008 (0.0015) model time 0.2386 (0.2438) loss 3.1599 (3.5669) grad_norm 1.8485 (1.9626) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1060/1251] eta 0:00:46 lr 0.000951 wd 0.0500 time 0.2426 (0.2456) data time 0.0007 (0.0015) model time 0.2419 (0.2439) loss 4.0350 
(3.5696) grad_norm 1.9136 (1.9658) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1070/1251] eta 0:00:44 lr 0.000951 wd 0.0500 time 0.3845 (0.2457) data time 0.0010 (0.0015) model time 0.3836 (0.2440) loss 3.7435 (3.5716) grad_norm 1.5556 (1.9646) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1080/1251] eta 0:00:41 lr 0.000951 wd 0.0500 time 0.2346 (0.2456) data time 0.0012 (0.0015) model time 0.2334 (0.2440) loss 3.4011 (3.5723) grad_norm 1.5219 (1.9633) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1090/1251] eta 0:00:39 lr 0.000951 wd 0.0500 time 0.2452 (0.2455) data time 0.0007 (0.0015) model time 0.2445 (0.2439) loss 3.0074 (3.5726) grad_norm 1.4371 (1.9628) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1100/1251] eta 0:00:37 lr 0.000951 wd 0.0500 time 0.2346 (0.2455) data time 0.0010 (0.0015) model time 0.2336 (0.2439) loss 4.1817 (3.5734) grad_norm 1.6186 (1.9615) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1110/1251] eta 0:00:34 lr 0.000951 wd 0.0500 time 0.2438 (0.2454) data time 0.0009 (0.0015) model time 0.2429 (0.2438) loss 4.3218 (3.5745) grad_norm 1.7904 (1.9605) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1120/1251] eta 0:00:32 lr 0.000951 wd 0.0500 time 0.2374 (0.2454) data time 0.0009 (0.0014) model time 0.2366 (0.2438) loss 2.8418 (3.5707) grad_norm 2.0940 (1.9616) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1130/1251] eta 0:00:29 lr 
0.000951 wd 0.0500 time 0.2383 (0.2454) data time 0.0010 (0.0014) model time 0.2373 (0.2438) loss 3.8290 (3.5717) grad_norm 2.3626 (1.9614) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1140/1251] eta 0:00:27 lr 0.000951 wd 0.0500 time 0.2430 (0.2454) data time 0.0009 (0.0014) model time 0.2421 (0.2438) loss 3.1954 (3.5701) grad_norm 1.6082 (1.9616) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1150/1251] eta 0:00:24 lr 0.000951 wd 0.0500 time 0.2391 (0.2453) data time 0.0008 (0.0014) model time 0.2384 (0.2437) loss 3.9863 (3.5670) grad_norm 1.6130 (1.9599) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1160/1251] eta 0:00:22 lr 0.000951 wd 0.0500 time 0.2326 (0.2453) data time 0.0010 (0.0014) model time 0.2316 (0.2437) loss 3.9800 (3.5680) grad_norm 2.5141 (1.9584) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1170/1251] eta 0:00:19 lr 0.000951 wd 0.0500 time 0.2394 (0.2453) data time 0.0012 (0.0014) model time 0.2383 (0.2437) loss 4.0911 (3.5660) grad_norm 2.0164 (1.9589) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1180/1251] eta 0:00:17 lr 0.000951 wd 0.0500 time 0.2403 (0.2453) data time 0.0009 (0.0014) model time 0.2395 (0.2437) loss 3.6824 (3.5664) grad_norm 1.3867 (1.9580) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1190/1251] eta 0:00:14 lr 0.000951 wd 0.0500 time 0.2480 (0.2452) data time 0.0010 (0.0014) model time 0.2470 (0.2436) loss 3.7726 (3.5688) grad_norm 1.2094 (1.9555) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
07:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1200/1251] eta 0:00:12 lr 0.000951 wd 0.0500 time 0.2466 (0.2454) data time 0.0010 (0.0014) model time 0.2456 (0.2438) loss 3.7609 (3.5676) grad_norm 1.3848 (1.9542) loss_scale 16384.0000 (8253.3888) mem 7379MB [2024-08-26 07:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1210/1251] eta 0:00:10 lr 0.000951 wd 0.0500 time 0.2396 (0.2453) data time 0.0010 (0.0014) model time 0.2386 (0.2437) loss 4.0179 (3.5704) grad_norm 1.6157 (1.9539) loss_scale 16384.0000 (8320.5285) mem 7379MB [2024-08-26 07:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1220/1251] eta 0:00:07 lr 0.000951 wd 0.0500 time 0.2431 (0.2453) data time 0.0011 (0.0014) model time 0.2420 (0.2437) loss 3.5980 (3.5728) grad_norm 1.6033 (1.9512) loss_scale 16384.0000 (8386.5684) mem 7379MB [2024-08-26 07:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1230/1251] eta 0:00:05 lr 0.000951 wd 0.0500 time 0.2443 (0.2453) data time 0.0007 (0.0014) model time 0.2436 (0.2437) loss 4.4360 (3.5725) grad_norm 2.5050 (1.9515) loss_scale 16384.0000 (8451.5353) mem 7379MB [2024-08-26 07:22:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1240/1251] eta 0:00:02 lr 0.000951 wd 0.0500 time 0.2275 (0.2452) data time 0.0005 (0.0014) model time 0.2270 (0.2436) loss 3.1664 (3.5739) grad_norm 3.7066 (1.9549) loss_scale 16384.0000 (8515.4553) mem 7379MB [2024-08-26 07:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [59/300][1250/1251] eta 0:00:00 lr 0.000951 wd 0.0500 time 0.2259 (0.2451) data time 0.0005 (0.0014) model time 0.2254 (0.2435) loss 4.3604 (3.5760) grad_norm 3.5101 (1.9588) loss_scale 16384.0000 (8578.3533) mem 7379MB [2024-08-26 07:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 59 training takes 0:05:06 [2024-08-26 07:22:31 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:22:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.447 (0.447) Loss 0.5356 (0.5356) Acc@1 89.941 (89.941) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-26 07:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.109) Loss 0.8433 (0.8423) Acc@1 81.348 (81.188) Acc@5 96.680 (95.987) Mem 7379MB [2024-08-26 07:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.094) Loss 1.1934 (0.8717) Acc@1 72.656 (80.013) Acc@5 91.309 (95.703) Mem 7379MB [2024-08-26 07:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.4775 (0.9880) Acc@1 63.477 (77.378) Acc@5 87.988 (94.150) Mem 7379MB [2024-08-26 07:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.4121 (1.0575) Acc@1 67.090 (75.760) Acc@5 88.867 (93.338) Mem 7379MB [2024-08-26 07:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.304 Acc@5 93.240 [2024-08-26 07:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.3% [2024-08-26 07:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.827 (0.827) Loss 0.4668 (0.4668) Acc@1 90.625 (90.625) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 07:22:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.149) Loss 0.7769 (0.7496) Acc@1 83.887 (83.461) Acc@5 95.898 (96.618) Mem 7379MB [2024-08-26 07:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.116) Loss 1.0859 (0.7686) Acc@1 74.707 (82.520) Acc@5 92.480 (96.549) Mem 7379MB [2024-08-26 07:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.074 (0.103) Loss 1.3428 (0.8803) Acc@1 65.332 (79.845) Acc@5 89.453 (95.089) Mem 7379MB [2024-08-26 07:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.094) Loss 1.2578 (0.9426) Acc@1 68.945 (78.235) Acc@5 90.820 (94.376) Mem 7379MB [2024-08-26 07:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.890 Acc@5 94.318 [2024-08-26 07:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 77.9% [2024-08-26 07:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.89% [2024-08-26 07:22:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:22:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 07:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][0/1251] eta 0:15:06 lr 0.000951 wd 0.0500 time 0.7246 (0.7246) data time 0.5070 (0.5070) model time 0.0000 (0.0000) loss 2.6667 (2.6667) grad_norm 2.5719 (2.5719) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][10/1251] eta 0:05:52 lr 0.000951 wd 0.0500 time 0.2385 (0.2839) data time 0.0012 (0.0470) model time 0.0000 (0.0000) loss 3.8323 (3.5486) grad_norm 1.9955 (2.0593) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][20/1251] eta 0:05:24 lr 0.000951 wd 0.0500 time 0.2462 (0.2636) data time 0.0010 (0.0251) model time 0.0000 (0.0000) loss 4.1647 (3.5125) grad_norm 1.6511 (1.8925) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][30/1251] eta 0:05:12 lr 0.000951 wd 0.0500 time 0.2478 (0.2563) data time 0.0010 
(0.0173) model time 0.0000 (0.0000) loss 3.5543 (3.4164) grad_norm 1.6822 (1.8476) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][40/1251] eta 0:05:05 lr 0.000951 wd 0.0500 time 0.2309 (0.2519) data time 0.0011 (0.0134) model time 0.0000 (0.0000) loss 3.0015 (3.4507) grad_norm 1.7856 (1.8831) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][50/1251] eta 0:04:59 lr 0.000951 wd 0.0500 time 0.2437 (0.2496) data time 0.0010 (0.0110) model time 0.0000 (0.0000) loss 2.2643 (3.4121) grad_norm 1.3624 (1.8738) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][60/1251] eta 0:04:55 lr 0.000951 wd 0.0500 time 0.2423 (0.2483) data time 0.0007 (0.0093) model time 0.2417 (0.2402) loss 3.5137 (3.4283) grad_norm 2.1769 (1.8777) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:22:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][70/1251] eta 0:04:52 lr 0.000951 wd 0.0500 time 0.2401 (0.2475) data time 0.0009 (0.0082) model time 0.2392 (0.2410) loss 3.6125 (3.4169) grad_norm 1.6791 (1.8957) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][80/1251] eta 0:04:49 lr 0.000951 wd 0.0500 time 0.2497 (0.2471) data time 0.0007 (0.0073) model time 0.2490 (0.2416) loss 4.2279 (3.4543) grad_norm 2.3720 (1.9501) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][90/1251] eta 0:04:46 lr 0.000951 wd 0.0500 time 0.2428 (0.2464) data time 0.0008 (0.0066) model time 0.2420 (0.2413) loss 3.5796 (3.4498) grad_norm 1.4104 (1.9299) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [60/300][100/1251] eta 0:04:43 lr 0.000951 wd 0.0500 time 0.2817 (0.2463) data time 0.0008 (0.0060) model time 0.2809 (0.2419) loss 3.8823 (3.4881) grad_norm 1.4114 (1.8900) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][110/1251] eta 0:04:42 lr 0.000951 wd 0.0500 time 0.2407 (0.2480) data time 0.0008 (0.0058) model time 0.2399 (0.2452) loss 4.1354 (3.4990) grad_norm 1.8528 (1.8738) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][120/1251] eta 0:04:39 lr 0.000951 wd 0.0500 time 0.2432 (0.2473) data time 0.0009 (0.0054) model time 0.2423 (0.2444) loss 3.6374 (3.5189) grad_norm 1.7192 (1.8675) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][130/1251] eta 0:04:36 lr 0.000951 wd 0.0500 time 0.2336 (0.2470) data time 0.0009 (0.0050) model time 0.2327 (0.2440) loss 4.0704 (3.5245) grad_norm 1.7138 (1.8615) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][140/1251] eta 0:04:33 lr 0.000951 wd 0.0500 time 0.2309 (0.2466) data time 0.0009 (0.0047) model time 0.2300 (0.2436) loss 4.0944 (3.5320) grad_norm 1.7481 (1.8698) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][150/1251] eta 0:04:31 lr 0.000951 wd 0.0500 time 0.2463 (0.2463) data time 0.0007 (0.0045) model time 0.2456 (0.2434) loss 3.6905 (3.5518) grad_norm 1.1524 (1.8758) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][160/1251] eta 0:04:28 lr 0.000951 wd 0.0500 time 0.2412 (0.2461) data time 0.0010 (0.0043) model time 0.2401 (0.2433) loss 3.6235 (3.5408) grad_norm 3.2523 (1.8818) 
loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][170/1251] eta 0:04:25 lr 0.000951 wd 0.0500 time 0.2459 (0.2459) data time 0.0010 (0.0041) model time 0.2449 (0.2432) loss 3.7461 (3.5577) grad_norm 2.4376 (1.8758) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][180/1251] eta 0:04:23 lr 0.000951 wd 0.0500 time 0.2423 (0.2457) data time 0.0008 (0.0039) model time 0.2415 (0.2430) loss 4.2994 (3.5589) grad_norm 1.5685 (1.8651) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][190/1251] eta 0:04:22 lr 0.000951 wd 0.0500 time 0.2417 (0.2475) data time 0.0008 (0.0038) model time 0.2409 (0.2455) loss 3.2848 (3.5683) grad_norm 1.6853 (1.8550) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][200/1251] eta 0:04:20 lr 0.000951 wd 0.0500 time 0.4477 (0.2482) data time 0.0010 (0.0036) model time 0.4466 (0.2466) loss 3.8368 (3.5715) grad_norm 2.8243 (1.8776) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][210/1251] eta 0:04:18 lr 0.000951 wd 0.0500 time 0.2428 (0.2480) data time 0.0009 (0.0035) model time 0.2419 (0.2463) loss 3.5075 (3.5736) grad_norm 2.0262 (1.8858) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][220/1251] eta 0:04:15 lr 0.000951 wd 0.0500 time 0.2408 (0.2477) data time 0.0007 (0.0034) model time 0.2401 (0.2459) loss 4.4719 (3.5683) grad_norm 1.6876 (1.8871) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][230/1251] eta 0:04:12 lr 0.000951 wd 0.0500 time 0.2427 
(0.2476) data time 0.0010 (0.0033) model time 0.2417 (0.2458) loss 3.6781 (3.5722) grad_norm 2.0119 (1.8793) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][240/1251] eta 0:04:10 lr 0.000951 wd 0.0500 time 0.2347 (0.2473) data time 0.0009 (0.0032) model time 0.2338 (0.2456) loss 4.2171 (3.5739) grad_norm 1.7093 (1.8773) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][250/1251] eta 0:04:07 lr 0.000950 wd 0.0500 time 0.2329 (0.2472) data time 0.0008 (0.0031) model time 0.2321 (0.2454) loss 2.5704 (3.5627) grad_norm 2.1254 (1.8771) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][260/1251] eta 0:04:04 lr 0.000950 wd 0.0500 time 0.2453 (0.2470) data time 0.0007 (0.0030) model time 0.2446 (0.2452) loss 3.3160 (3.5507) grad_norm 1.4056 (1.8815) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][270/1251] eta 0:04:02 lr 0.000950 wd 0.0500 time 0.2421 (0.2468) data time 0.0011 (0.0030) model time 0.2409 (0.2450) loss 3.9736 (3.5550) grad_norm 1.7121 (1.8854) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][280/1251] eta 0:04:00 lr 0.000950 wd 0.0500 time 0.2409 (0.2474) data time 0.0007 (0.0029) model time 0.2402 (0.2458) loss 3.6169 (3.5518) grad_norm 1.4149 (1.8870) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][290/1251] eta 0:03:57 lr 0.000950 wd 0.0500 time 0.2431 (0.2472) data time 0.0013 (0.0029) model time 0.2418 (0.2456) loss 3.9263 (3.5505) grad_norm 1.6908 (1.8903) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:56 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][300/1251] eta 0:03:54 lr 0.000950 wd 0.0500 time 0.2396 (0.2471) data time 0.0008 (0.0028) model time 0.2388 (0.2454) loss 3.8061 (3.5525) grad_norm 1.4900 (1.8845) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][310/1251] eta 0:03:52 lr 0.000950 wd 0.0500 time 0.2477 (0.2469) data time 0.0009 (0.0027) model time 0.2468 (0.2452) loss 3.6407 (3.5617) grad_norm 2.6974 (1.8858) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][320/1251] eta 0:03:50 lr 0.000950 wd 0.0500 time 0.2572 (0.2472) data time 0.0009 (0.0027) model time 0.2563 (0.2457) loss 3.5433 (3.5577) grad_norm 1.9612 (1.8970) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][330/1251] eta 0:03:47 lr 0.000950 wd 0.0500 time 0.2382 (0.2475) data time 0.0011 (0.0027) model time 0.2371 (0.2459) loss 3.8182 (3.5616) grad_norm 1.9569 (1.9041) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][340/1251] eta 0:03:45 lr 0.000950 wd 0.0500 time 0.2366 (0.2478) data time 0.0012 (0.0027) model time 0.2354 (0.2463) loss 3.8139 (3.5537) grad_norm 2.3427 (1.9100) loss_scale 16384.0000 (16384.0000) mem 7379MB [2024-08-26 07:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][350/1251] eta 0:03:43 lr 0.000950 wd 0.0500 time 0.2409 (0.2476) data time 0.0009 (0.0026) model time 0.2400 (0.2460) loss 4.1633 (3.5528) grad_norm 1.5977 (nan) loss_scale 8192.0000 (16313.9829) mem 7379MB [2024-08-26 07:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][360/1251] eta 0:03:40 lr 0.000950 wd 0.0500 time 0.2521 (0.2474) data time 0.0010 (0.0026) model time 0.2511 (0.2459) loss 
3.7442 (3.5569) grad_norm 1.9987 (nan) loss_scale 8192.0000 (16088.9972) mem 7379MB [2024-08-26 07:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][370/1251] eta 0:03:37 lr 0.000950 wd 0.0500 time 0.2422 (0.2473) data time 0.0009 (0.0025) model time 0.2414 (0.2458) loss 2.7093 (3.5574) grad_norm 1.6928 (nan) loss_scale 8192.0000 (15876.1402) mem 7379MB [2024-08-26 07:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][380/1251] eta 0:03:35 lr 0.000950 wd 0.0500 time 0.2360 (0.2471) data time 0.0007 (0.0025) model time 0.2353 (0.2456) loss 3.1104 (3.5633) grad_norm 1.5705 (nan) loss_scale 8192.0000 (15674.4567) mem 7379MB [2024-08-26 07:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][390/1251] eta 0:03:32 lr 0.000950 wd 0.0500 time 0.2416 (0.2470) data time 0.0007 (0.0025) model time 0.2409 (0.2455) loss 3.5662 (3.5613) grad_norm 1.4824 (nan) loss_scale 8192.0000 (15483.0895) mem 7379MB [2024-08-26 07:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][400/1251] eta 0:03:30 lr 0.000950 wd 0.0500 time 0.2424 (0.2469) data time 0.0009 (0.0024) model time 0.2415 (0.2454) loss 3.8956 (3.5549) grad_norm 1.5430 (nan) loss_scale 8192.0000 (15301.2668) mem 7379MB [2024-08-26 07:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][410/1251] eta 0:03:27 lr 0.000950 wd 0.0500 time 0.2484 (0.2468) data time 0.0008 (0.0024) model time 0.2477 (0.2452) loss 2.8327 (3.5487) grad_norm 1.7071 (nan) loss_scale 8192.0000 (15128.2920) mem 7379MB [2024-08-26 07:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][420/1251] eta 0:03:24 lr 0.000950 wd 0.0500 time 0.2438 (0.2467) data time 0.0010 (0.0024) model time 0.2429 (0.2451) loss 4.0122 (3.5507) grad_norm 1.5932 (nan) loss_scale 8192.0000 (14963.5344) mem 7379MB [2024-08-26 07:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][430/1251] eta 0:03:22 lr 0.000950 wd 0.0500 
time 0.2444 (0.2466) data time 0.0009 (0.0023) model time 0.2435 (0.2450) loss 3.6955 (3.5551) grad_norm 1.8132 (nan) loss_scale 8192.0000 (14806.4223) mem 7379MB [2024-08-26 07:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][440/1251] eta 0:03:19 lr 0.000950 wd 0.0500 time 0.2388 (0.2464) data time 0.0011 (0.0023) model time 0.2378 (0.2448) loss 3.6490 (3.5541) grad_norm 1.3032 (nan) loss_scale 8192.0000 (14656.4354) mem 7379MB [2024-08-26 07:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][450/1251] eta 0:03:17 lr 0.000950 wd 0.0500 time 0.2422 (0.2464) data time 0.0010 (0.0023) model time 0.2411 (0.2448) loss 3.6654 (3.5516) grad_norm 2.5018 (nan) loss_scale 8192.0000 (14513.0998) mem 7379MB [2024-08-26 07:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][460/1251] eta 0:03:14 lr 0.000950 wd 0.0500 time 0.2502 (0.2463) data time 0.0011 (0.0023) model time 0.2491 (0.2447) loss 4.1435 (3.5517) grad_norm 1.4599 (nan) loss_scale 8192.0000 (14375.9826) mem 7379MB [2024-08-26 07:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][470/1251] eta 0:03:12 lr 0.000950 wd 0.0500 time 0.2392 (0.2462) data time 0.0009 (0.0022) model time 0.2383 (0.2446) loss 2.6447 (3.5476) grad_norm 1.5930 (nan) loss_scale 8192.0000 (14244.6879) mem 7379MB [2024-08-26 07:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][480/1251] eta 0:03:09 lr 0.000950 wd 0.0500 time 0.2412 (0.2461) data time 0.0008 (0.0022) model time 0.2405 (0.2445) loss 2.7242 (3.5428) grad_norm 1.8814 (nan) loss_scale 8192.0000 (14118.8524) mem 7379MB [2024-08-26 07:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][490/1251] eta 0:03:07 lr 0.000950 wd 0.0500 time 0.2329 (0.2459) data time 0.0008 (0.0022) model time 0.2321 (0.2443) loss 4.4911 (3.5457) grad_norm 1.6215 (nan) loss_scale 8192.0000 (13998.1426) mem 7379MB [2024-08-26 07:24:44 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [60/300][500/1251] eta 0:03:04 lr 0.000950 wd 0.0500 time 0.2429 (0.2458) data time 0.0009 (0.0022) model time 0.2420 (0.2442) loss 3.0684 (3.5369) grad_norm 2.7864 (nan) loss_scale 8192.0000 (13882.2515) mem 7379MB [2024-08-26 07:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][510/1251] eta 0:03:02 lr 0.000950 wd 0.0500 time 0.2361 (0.2458) data time 0.0010 (0.0021) model time 0.2352 (0.2442) loss 3.9842 (3.5354) grad_norm 1.6868 (nan) loss_scale 8192.0000 (13770.8963) mem 7379MB [2024-08-26 07:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][520/1251] eta 0:02:59 lr 0.000950 wd 0.0500 time 0.2305 (0.2457) data time 0.0009 (0.0021) model time 0.2296 (0.2441) loss 3.9082 (3.5320) grad_norm 2.0882 (nan) loss_scale 8192.0000 (13663.8157) mem 7379MB [2024-08-26 07:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][530/1251] eta 0:02:57 lr 0.000950 wd 0.0500 time 0.2530 (0.2457) data time 0.0008 (0.0021) model time 0.2522 (0.2441) loss 3.7856 (3.5311) grad_norm 1.4241 (nan) loss_scale 8192.0000 (13560.7684) mem 7379MB [2024-08-26 07:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][540/1251] eta 0:02:54 lr 0.000950 wd 0.0500 time 0.2480 (0.2456) data time 0.0010 (0.0021) model time 0.2470 (0.2440) loss 2.6189 (3.5304) grad_norm 2.0026 (nan) loss_scale 8192.0000 (13461.5305) mem 7379MB [2024-08-26 07:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][550/1251] eta 0:02:52 lr 0.000950 wd 0.0500 time 0.2542 (0.2455) data time 0.0009 (0.0021) model time 0.2533 (0.2440) loss 3.1873 (3.5296) grad_norm 1.9817 (nan) loss_scale 8192.0000 (13365.8947) mem 7379MB [2024-08-26 07:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][560/1251] eta 0:02:49 lr 0.000950 wd 0.0500 time 0.2309 (0.2454) data time 0.0011 (0.0020) model time 0.2299 (0.2438) loss 4.0152 (3.5330) grad_norm 1.4602 (nan) 
loss_scale 8192.0000 (13273.6684) mem 7379MB [2024-08-26 07:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][570/1251] eta 0:02:47 lr 0.000950 wd 0.0500 time 0.2513 (0.2454) data time 0.0007 (0.0021) model time 0.2506 (0.2438) loss 4.1843 (3.5343) grad_norm 1.6430 (nan) loss_scale 8192.0000 (13184.6725) mem 7379MB [2024-08-26 07:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][580/1251] eta 0:02:44 lr 0.000950 wd 0.0500 time 0.2395 (0.2453) data time 0.0013 (0.0020) model time 0.2383 (0.2437) loss 2.3213 (3.5322) grad_norm 1.9070 (nan) loss_scale 8192.0000 (13098.7401) mem 7379MB [2024-08-26 07:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][590/1251] eta 0:02:42 lr 0.000950 wd 0.0500 time 0.2480 (0.2453) data time 0.0010 (0.0020) model time 0.2470 (0.2437) loss 3.7624 (3.5269) grad_norm 1.3308 (nan) loss_scale 8192.0000 (13015.7157) mem 7379MB [2024-08-26 07:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][600/1251] eta 0:02:39 lr 0.000950 wd 0.0500 time 0.2403 (0.2452) data time 0.0007 (0.0020) model time 0.2395 (0.2436) loss 4.2149 (3.5281) grad_norm 1.7988 (nan) loss_scale 8192.0000 (12935.4542) mem 7379MB [2024-08-26 07:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][610/1251] eta 0:02:37 lr 0.000950 wd 0.0500 time 0.2315 (0.2451) data time 0.0010 (0.0020) model time 0.2305 (0.2435) loss 4.1848 (3.5317) grad_norm 1.1846 (nan) loss_scale 8192.0000 (12857.8200) mem 7379MB [2024-08-26 07:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][620/1251] eta 0:02:34 lr 0.000950 wd 0.0500 time 0.2407 (0.2450) data time 0.0010 (0.0020) model time 0.2397 (0.2434) loss 3.8673 (3.5318) grad_norm 1.6200 (nan) loss_scale 8192.0000 (12782.6860) mem 7379MB [2024-08-26 07:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][630/1251] eta 0:02:32 lr 0.000950 wd 0.0500 time 0.2400 (0.2449) data time 0.0009 
(0.0020) model time 0.2390 (0.2434) loss 4.2922 (3.5361) grad_norm 1.4168 (nan) loss_scale 8192.0000 (12709.9334) mem 7379MB [2024-08-26 07:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][640/1251] eta 0:02:29 lr 0.000950 wd 0.0500 time 0.2418 (0.2449) data time 0.0010 (0.0020) model time 0.2408 (0.2433) loss 2.8570 (3.5371) grad_norm 2.2674 (nan) loss_scale 8192.0000 (12639.4509) mem 7379MB [2024-08-26 07:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][650/1251] eta 0:02:27 lr 0.000950 wd 0.0500 time 0.2381 (0.2448) data time 0.0011 (0.0019) model time 0.2370 (0.2433) loss 3.5959 (3.5349) grad_norm 1.7422 (nan) loss_scale 8192.0000 (12571.1336) mem 7379MB [2024-08-26 07:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][660/1251] eta 0:02:24 lr 0.000950 wd 0.0500 time 0.2434 (0.2449) data time 0.0011 (0.0019) model time 0.2423 (0.2433) loss 3.5465 (3.5367) grad_norm 1.7436 (nan) loss_scale 8192.0000 (12504.8835) mem 7379MB [2024-08-26 07:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][670/1251] eta 0:02:22 lr 0.000950 wd 0.0500 time 0.2422 (0.2448) data time 0.0012 (0.0019) model time 0.2409 (0.2433) loss 3.8867 (3.5393) grad_norm 1.9236 (nan) loss_scale 8192.0000 (12440.6080) mem 7379MB [2024-08-26 07:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][680/1251] eta 0:02:19 lr 0.000950 wd 0.0500 time 0.2466 (0.2448) data time 0.0009 (0.0019) model time 0.2456 (0.2432) loss 3.5110 (3.5408) grad_norm 2.2321 (nan) loss_scale 8192.0000 (12378.2203) mem 7379MB [2024-08-26 07:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][690/1251] eta 0:02:17 lr 0.000950 wd 0.0500 time 0.2416 (0.2448) data time 0.0007 (0.0019) model time 0.2409 (0.2432) loss 3.6197 (3.5457) grad_norm 1.7713 (nan) loss_scale 8192.0000 (12317.6382) mem 7379MB [2024-08-26 07:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[60/300][700/1251] eta 0:02:14 lr 0.000950 wd 0.0500 time 0.2404 (0.2448) data time 0.0009 (0.0019) model time 0.2395 (0.2432) loss 3.6688 (3.5453) grad_norm 1.6449 (nan) loss_scale 8192.0000 (12258.7846) mem 7379MB [2024-08-26 07:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][710/1251] eta 0:02:12 lr 0.000950 wd 0.0500 time 0.2448 (0.2450) data time 0.0010 (0.0019) model time 0.2438 (0.2434) loss 3.9168 (3.5475) grad_norm 1.5933 (nan) loss_scale 8192.0000 (12201.5865) mem 7379MB [2024-08-26 07:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][720/1251] eta 0:02:10 lr 0.000950 wd 0.0500 time 0.4781 (0.2453) data time 0.0007 (0.0019) model time 0.4774 (0.2438) loss 3.2909 (3.5471) grad_norm 1.8718 (nan) loss_scale 8192.0000 (12145.9750) mem 7379MB [2024-08-26 07:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][730/1251] eta 0:02:07 lr 0.000950 wd 0.0500 time 0.2411 (0.2452) data time 0.0009 (0.0019) model time 0.2402 (0.2437) loss 3.9653 (3.5474) grad_norm 1.4860 (nan) loss_scale 8192.0000 (12091.8851) mem 7379MB [2024-08-26 07:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][740/1251] eta 0:02:05 lr 0.000950 wd 0.0500 time 0.2465 (0.2452) data time 0.0009 (0.0019) model time 0.2456 (0.2437) loss 2.9795 (3.5459) grad_norm 1.9742 (nan) loss_scale 8192.0000 (12039.2551) mem 7379MB [2024-08-26 07:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][750/1251] eta 0:02:02 lr 0.000950 wd 0.0500 time 0.2432 (0.2451) data time 0.0008 (0.0019) model time 0.2424 (0.2436) loss 3.1451 (3.5410) grad_norm 1.5322 (nan) loss_scale 8192.0000 (11988.0266) mem 7379MB [2024-08-26 07:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][760/1251] eta 0:02:00 lr 0.000950 wd 0.0500 time 0.2448 (0.2451) data time 0.0010 (0.0019) model time 0.2438 (0.2435) loss 3.8989 (3.5415) grad_norm 1.9580 (nan) loss_scale 8192.0000 (11938.1445) mem 7379MB 
[2024-08-26 07:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][770/1251] eta 0:01:57 lr 0.000949 wd 0.0500 time 0.2424 (0.2450) data time 0.0009 (0.0019) model time 0.2415 (0.2435) loss 4.3076 (3.5430) grad_norm 1.8142 (nan) loss_scale 8192.0000 (11889.5564) mem 7379MB [2024-08-26 07:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][780/1251] eta 0:01:55 lr 0.000949 wd 0.0500 time 0.2392 (0.2450) data time 0.0009 (0.0018) model time 0.2383 (0.2435) loss 3.8628 (3.5447) grad_norm 1.2097 (nan) loss_scale 8192.0000 (11842.2125) mem 7379MB [2024-08-26 07:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][790/1251] eta 0:01:52 lr 0.000949 wd 0.0500 time 0.2439 (0.2450) data time 0.0009 (0.0018) model time 0.2430 (0.2434) loss 3.6189 (3.5441) grad_norm 1.8193 (nan) loss_scale 8192.0000 (11796.0657) mem 7379MB [2024-08-26 07:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][800/1251] eta 0:01:50 lr 0.000949 wd 0.0500 time 0.2394 (0.2452) data time 0.0007 (0.0018) model time 0.2386 (0.2437) loss 4.1478 (3.5477) grad_norm 1.7860 (nan) loss_scale 8192.0000 (11751.0712) mem 7379MB [2024-08-26 07:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][810/1251] eta 0:01:48 lr 0.000949 wd 0.0500 time 0.2342 (0.2451) data time 0.0011 (0.0018) model time 0.2331 (0.2436) loss 3.6417 (3.5492) grad_norm 1.4189 (nan) loss_scale 8192.0000 (11707.1862) mem 7379MB [2024-08-26 07:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][820/1251] eta 0:01:45 lr 0.000949 wd 0.0500 time 0.2404 (0.2450) data time 0.0007 (0.0018) model time 0.2396 (0.2435) loss 3.9322 (3.5480) grad_norm 1.5210 (nan) loss_scale 8192.0000 (11664.3703) mem 7379MB [2024-08-26 07:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][830/1251] eta 0:01:43 lr 0.000949 wd 0.0500 time 0.2293 (0.2450) data time 0.0010 (0.0018) model time 0.2283 (0.2435) loss 
4.1999 (3.5490) grad_norm 1.7036 (nan) loss_scale 8192.0000 (11622.5848) mem 7379MB [2024-08-26 07:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][840/1251] eta 0:01:40 lr 0.000949 wd 0.0500 time 0.2376 (0.2450) data time 0.0009 (0.0018) model time 0.2367 (0.2435) loss 3.2286 (3.5446) grad_norm 1.6533 (nan) loss_scale 8192.0000 (11581.7931) mem 7379MB [2024-08-26 07:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][850/1251] eta 0:01:38 lr 0.000949 wd 0.0500 time 0.2390 (0.2452) data time 0.0007 (0.0018) model time 0.2382 (0.2437) loss 3.6590 (3.5467) grad_norm 1.8266 (nan) loss_scale 8192.0000 (11541.9600) mem 7379MB [2024-08-26 07:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][860/1251] eta 0:01:35 lr 0.000949 wd 0.0500 time 0.2406 (0.2454) data time 0.0012 (0.0018) model time 0.2393 (0.2439) loss 3.8185 (3.5474) grad_norm 2.4103 (nan) loss_scale 8192.0000 (11503.0523) mem 7379MB [2024-08-26 07:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][870/1251] eta 0:01:33 lr 0.000949 wd 0.0500 time 0.2371 (0.2456) data time 0.0008 (0.0018) model time 0.2363 (0.2441) loss 4.0919 (3.5491) grad_norm 2.2466 (nan) loss_scale 8192.0000 (11465.0379) mem 7379MB [2024-08-26 07:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][880/1251] eta 0:01:31 lr 0.000949 wd 0.0500 time 0.2547 (0.2455) data time 0.0011 (0.0017) model time 0.2537 (0.2441) loss 3.5240 (3.5512) grad_norm 1.6641 (nan) loss_scale 8192.0000 (11427.8865) mem 7379MB [2024-08-26 07:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][890/1251] eta 0:01:28 lr 0.000949 wd 0.0500 time 0.2451 (0.2455) data time 0.0010 (0.0017) model time 0.2441 (0.2441) loss 3.9195 (3.5521) grad_norm 2.1216 (nan) loss_scale 8192.0000 (11391.5690) mem 7379MB [2024-08-26 07:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][900/1251] eta 0:01:26 lr 0.000949 wd 0.0500 
time 0.2419 (0.2455) data time 0.0007 (0.0017) model time 0.2412 (0.2441) loss 4.2655 (3.5504) grad_norm 1.9359 (nan) loss_scale 8192.0000 (11356.0577) mem 7379MB [2024-08-26 07:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][910/1251] eta 0:01:23 lr 0.000949 wd 0.0500 time 0.2445 (0.2454) data time 0.0009 (0.0017) model time 0.2435 (0.2440) loss 4.1129 (3.5551) grad_norm 1.8653 (nan) loss_scale 8192.0000 (11321.3260) mem 7379MB [2024-08-26 07:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][920/1251] eta 0:01:21 lr 0.000949 wd 0.0500 time 0.2397 (0.2454) data time 0.0009 (0.0017) model time 0.2389 (0.2440) loss 2.8711 (3.5540) grad_norm 1.1946 (nan) loss_scale 8192.0000 (11287.3485) mem 7379MB [2024-08-26 07:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][930/1251] eta 0:01:18 lr 0.000949 wd 0.0500 time 0.2436 (0.2454) data time 0.0007 (0.0017) model time 0.2428 (0.2440) loss 4.4321 (3.5538) grad_norm 1.6613 (nan) loss_scale 8192.0000 (11254.1010) mem 7379MB [2024-08-26 07:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][940/1251] eta 0:01:16 lr 0.000949 wd 0.0500 time 0.2409 (0.2453) data time 0.0010 (0.0017) model time 0.2399 (0.2439) loss 3.9113 (3.5555) grad_norm 2.0146 (nan) loss_scale 8192.0000 (11221.5600) mem 7379MB [2024-08-26 07:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][950/1251] eta 0:01:13 lr 0.000949 wd 0.0500 time 0.2430 (0.2453) data time 0.0011 (0.0017) model time 0.2419 (0.2439) loss 4.1360 (3.5565) grad_norm 1.7645 (nan) loss_scale 8192.0000 (11189.7035) mem 7379MB [2024-08-26 07:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][960/1251] eta 0:01:11 lr 0.000949 wd 0.0500 time 0.2414 (0.2453) data time 0.0009 (0.0017) model time 0.2406 (0.2438) loss 2.5567 (3.5555) grad_norm 1.7716 (nan) loss_scale 8192.0000 (11158.5099) mem 7379MB [2024-08-26 07:26:39 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [60/300][970/1251] eta 0:01:08 lr 0.000949 wd 0.0500 time 0.2465 (0.2452) data time 0.0011 (0.0017) model time 0.2454 (0.2438) loss 3.3071 (3.5541) grad_norm 1.8897 (nan) loss_scale 8192.0000 (11127.9588) mem 7379MB [2024-08-26 07:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][980/1251] eta 0:01:06 lr 0.000949 wd 0.0500 time 0.2346 (0.2452) data time 0.0011 (0.0017) model time 0.2335 (0.2438) loss 3.6184 (3.5528) grad_norm 2.5746 (nan) loss_scale 8192.0000 (11098.0306) mem 7379MB [2024-08-26 07:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][990/1251] eta 0:01:03 lr 0.000949 wd 0.0500 time 0.2421 (0.2452) data time 0.0011 (0.0017) model time 0.2410 (0.2438) loss 3.6848 (3.5527) grad_norm 2.7858 (nan) loss_scale 8192.0000 (11068.7064) mem 7379MB [2024-08-26 07:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1000/1251] eta 0:01:01 lr 0.000949 wd 0.0500 time 0.2380 (0.2452) data time 0.0010 (0.0017) model time 0.2370 (0.2438) loss 3.9608 (3.5534) grad_norm 1.3617 (nan) loss_scale 8192.0000 (11039.9680) mem 7379MB [2024-08-26 07:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1010/1251] eta 0:00:59 lr 0.000949 wd 0.0500 time 0.2428 (0.2451) data time 0.0010 (0.0017) model time 0.2418 (0.2437) loss 3.9903 (3.5554) grad_norm 3.1070 (nan) loss_scale 8192.0000 (11011.7982) mem 7379MB [2024-08-26 07:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1020/1251] eta 0:00:56 lr 0.000949 wd 0.0500 time 0.2360 (0.2451) data time 0.0008 (0.0016) model time 0.2352 (0.2437) loss 3.3129 (3.5557) grad_norm 1.6073 (nan) loss_scale 8192.0000 (10984.1802) mem 7379MB [2024-08-26 07:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1030/1251] eta 0:00:54 lr 0.000949 wd 0.0500 time 0.2471 (0.2450) data time 0.0007 (0.0016) model time 0.2464 (0.2436) loss 2.8160 (3.5525) grad_norm 1.8027 (nan) 
loss_scale 8192.0000 (10957.0980) mem 7379MB [2024-08-26 07:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1040/1251] eta 0:00:51 lr 0.000949 wd 0.0500 time 0.2414 (0.2452) data time 0.0012 (0.0016) model time 0.2402 (0.2438) loss 3.6777 (3.5509) grad_norm 1.7060 (nan) loss_scale 8192.0000 (10930.5360) mem 7379MB [2024-08-26 07:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1050/1251] eta 0:00:49 lr 0.000949 wd 0.0500 time 0.2398 (0.2452) data time 0.0007 (0.0016) model time 0.2391 (0.2438) loss 4.0167 (3.5508) grad_norm 1.7644 (nan) loss_scale 8192.0000 (10904.4795) mem 7379MB [2024-08-26 07:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1060/1251] eta 0:00:46 lr 0.000949 wd 0.0500 time 0.2636 (0.2452) data time 0.0010 (0.0016) model time 0.2626 (0.2438) loss 3.8167 (3.5514) grad_norm 2.6790 (nan) loss_scale 8192.0000 (10878.9142) mem 7379MB [2024-08-26 07:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1070/1251] eta 0:00:44 lr 0.000949 wd 0.0500 time 0.2427 (0.2451) data time 0.0007 (0.0016) model time 0.2420 (0.2438) loss 4.1878 (3.5514) grad_norm 1.7165 (nan) loss_scale 8192.0000 (10853.8263) mem 7379MB [2024-08-26 07:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1080/1251] eta 0:00:41 lr 0.000949 wd 0.0500 time 0.2447 (0.2451) data time 0.0009 (0.0016) model time 0.2438 (0.2437) loss 3.2640 (3.5511) grad_norm 1.6530 (nan) loss_scale 8192.0000 (10829.2026) mem 7379MB [2024-08-26 07:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1090/1251] eta 0:00:39 lr 0.000949 wd 0.0500 time 0.2446 (0.2451) data time 0.0011 (0.0016) model time 0.2434 (0.2437) loss 3.7375 (3.5516) grad_norm 1.8335 (nan) loss_scale 8192.0000 (10805.0302) mem 7379MB [2024-08-26 07:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1100/1251] eta 0:00:37 lr 0.000949 wd 0.0500 time 0.2530 (0.2451) data time 
0.0009 (0.0016) model time 0.2521 (0.2437) loss 4.3295 (3.5518) grad_norm 1.5833 (nan) loss_scale 8192.0000 (10781.2970) mem 7379MB [2024-08-26 07:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1110/1251] eta 0:00:34 lr 0.000949 wd 0.0500 time 0.2475 (0.2451) data time 0.0009 (0.0016) model time 0.2466 (0.2437) loss 3.3215 (3.5517) grad_norm 2.6468 (nan) loss_scale 8192.0000 (10757.9910) mem 7379MB [2024-08-26 07:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1120/1251] eta 0:00:32 lr 0.000949 wd 0.0500 time 0.2407 (0.2453) data time 0.0011 (0.0016) model time 0.2396 (0.2439) loss 3.5083 (3.5499) grad_norm 2.0385 (nan) loss_scale 8192.0000 (10735.1008) mem 7379MB [2024-08-26 07:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1130/1251] eta 0:00:29 lr 0.000949 wd 0.0500 time 0.4549 (0.2454) data time 0.0010 (0.0016) model time 0.4539 (0.2441) loss 3.7021 (3.5503) grad_norm 1.5279 (nan) loss_scale 8192.0000 (10712.6154) mem 7379MB [2024-08-26 07:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1140/1251] eta 0:00:27 lr 0.000949 wd 0.0500 time 0.2417 (0.2454) data time 0.0010 (0.0016) model time 0.2407 (0.2440) loss 4.0510 (3.5504) grad_norm 2.4575 (nan) loss_scale 8192.0000 (10690.5241) mem 7379MB [2024-08-26 07:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1150/1251] eta 0:00:24 lr 0.000949 wd 0.0500 time 0.2448 (0.2453) data time 0.0008 (0.0016) model time 0.2440 (0.2440) loss 4.5233 (3.5490) grad_norm 2.3404 (nan) loss_scale 8192.0000 (10668.8167) mem 7379MB [2024-08-26 07:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1160/1251] eta 0:00:22 lr 0.000949 wd 0.0500 time 0.2423 (0.2453) data time 0.0007 (0.0016) model time 0.2417 (0.2440) loss 3.2531 (3.5521) grad_norm 2.2684 (nan) loss_scale 8192.0000 (10647.4832) mem 7379MB [2024-08-26 07:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [60/300][1170/1251] eta 0:00:19 lr 0.000949 wd 0.0500 time 0.2387 (0.2453) data time 0.0010 (0.0016) model time 0.2377 (0.2439) loss 3.4242 (3.5522) grad_norm 1.9712 (nan) loss_scale 8192.0000 (10626.5141) mem 7379MB [2024-08-26 07:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1180/1251] eta 0:00:17 lr 0.000949 wd 0.0500 time 0.2358 (0.2452) data time 0.0011 (0.0016) model time 0.2346 (0.2439) loss 3.0943 (3.5542) grad_norm 1.4034 (nan) loss_scale 8192.0000 (10605.9001) mem 7379MB [2024-08-26 07:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1190/1251] eta 0:00:14 lr 0.000949 wd 0.0500 time 0.2478 (0.2452) data time 0.0008 (0.0016) model time 0.2469 (0.2439) loss 4.0126 (3.5549) grad_norm 2.6900 (nan) loss_scale 8192.0000 (10585.6322) mem 7379MB [2024-08-26 07:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1200/1251] eta 0:00:12 lr 0.000949 wd 0.0500 time 0.2383 (0.2452) data time 0.0009 (0.0016) model time 0.2374 (0.2438) loss 4.0082 (3.5570) grad_norm 1.6045 (nan) loss_scale 8192.0000 (10565.7019) mem 7379MB [2024-08-26 07:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1210/1251] eta 0:00:10 lr 0.000949 wd 0.0500 time 0.2404 (0.2451) data time 0.0010 (0.0015) model time 0.2394 (0.2438) loss 3.7859 (3.5576) grad_norm 2.6239 (nan) loss_scale 8192.0000 (10546.1007) mem 7379MB [2024-08-26 07:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1220/1251] eta 0:00:07 lr 0.000949 wd 0.0500 time 0.2330 (0.2451) data time 0.0009 (0.0015) model time 0.2321 (0.2438) loss 3.1597 (3.5575) grad_norm 1.8333 (nan) loss_scale 8192.0000 (10526.8206) mem 7379MB [2024-08-26 07:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1230/1251] eta 0:00:05 lr 0.000949 wd 0.0500 time 0.2463 (0.2451) data time 0.0009 (0.0015) model time 0.2454 (0.2438) loss 3.9593 (3.5589) grad_norm 1.3120 (nan) loss_scale 8192.0000 
(10507.8538) mem 7379MB [2024-08-26 07:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1240/1251] eta 0:00:02 lr 0.000949 wd 0.0500 time 0.2276 (0.2450) data time 0.0007 (0.0015) model time 0.2269 (0.2437) loss 4.0212 (3.5595) grad_norm 1.6612 (nan) loss_scale 8192.0000 (10489.1926) mem 7379MB [2024-08-26 07:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [60/300][1250/1251] eta 0:00:00 lr 0.000949 wd 0.0500 time 0.2342 (0.2449) data time 0.0005 (0.0015) model time 0.2337 (0.2435) loss 2.8965 (3.5587) grad_norm 1.8274 (nan) loss_scale 8192.0000 (10470.8297) mem 7379MB [2024-08-26 07:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 60 training takes 0:05:06 [2024-08-26 07:27:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:27:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 07:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.392 (0.392) Loss 0.5308 (0.5308) Acc@1 89.844 (89.844) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 07:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.109) Loss 0.9058 (0.8469) Acc@1 79.395 (80.620) Acc@5 95.117 (95.810) Mem 7379MB [2024-08-26 07:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.095) Loss 1.1875 (0.8598) Acc@1 70.605 (79.887) Acc@5 91.309 (95.726) Mem 7379MB [2024-08-26 07:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.089) Loss 1.4717 (0.9835) Acc@1 65.332 (77.252) Acc@5 88.770 (94.106) Mem 7379MB [2024-08-26 07:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.3867 (1.0554) Acc@1 66.602 (75.700) Acc@5 89.551 (93.209) Mem 7379MB [2024-08-26 07:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.288 Acc@5 93.124 [2024-08-26 07:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.3% [2024-08-26 07:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.746 (0.746) Loss 0.4658 (0.4658) Acc@1 91.016 (91.016) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 07:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.142) Loss 0.7749 (0.7480) Acc@1 83.984 (83.567) Acc@5 95.898 (96.582) Mem 7379MB [2024-08-26 07:27:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.112) Loss 1.0830 (0.7673) Acc@1 74.805 (82.571) Acc@5 92.773 (96.545) Mem 7379MB [2024-08-26 07:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.101) Loss 1.3379 (0.8788) Acc@1 65.625 (79.905) Acc@5 89.453 (95.114) Mem 7379MB [2024-08-26 07:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 1.2559 
(0.9405) Acc@1 68.945 (78.306) Acc@5 90.723 (94.415) Mem 7379MB [2024-08-26 07:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.952 Acc@5 94.362 [2024-08-26 07:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.0% [2024-08-26 07:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 77.95% [2024-08-26 07:27:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:27:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 07:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][0/1251] eta 0:14:57 lr 0.000949 wd 0.0500 time 0.7177 (0.7177) data time 0.4895 (0.4895) model time 0.0000 (0.0000) loss 3.0504 (3.0504) grad_norm 1.8003 (1.8003) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][10/1251] eta 0:05:52 lr 0.000949 wd 0.0500 time 0.2301 (0.2843) data time 0.0008 (0.0455) model time 0.0000 (0.0000) loss 3.6075 (3.6044) grad_norm 1.2729 (1.8802) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][20/1251] eta 0:05:25 lr 0.000949 wd 0.0500 time 0.2358 (0.2642) data time 0.0011 (0.0243) model time 0.0000 (0.0000) loss 3.6322 (3.6490) grad_norm 1.9592 (1.9812) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][30/1251] eta 0:05:13 lr 0.000948 wd 0.0500 time 0.2413 (0.2565) data time 0.0010 (0.0168) model time 0.0000 (0.0000) loss 3.0661 (3.6129) grad_norm 1.9675 (1.8682) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][40/1251] eta 0:05:06 
lr 0.000948 wd 0.0500 time 0.2343 (0.2531) data time 0.0008 (0.0129) model time 0.0000 (0.0000) loss 3.7247 (3.6424) grad_norm 1.8361 (1.8682) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][50/1251] eta 0:05:01 lr 0.000948 wd 0.0500 time 0.2413 (0.2508) data time 0.0007 (0.0106) model time 0.0000 (0.0000) loss 3.6907 (3.5601) grad_norm 1.6363 (1.8298) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][60/1251] eta 0:04:56 lr 0.000948 wd 0.0500 time 0.2430 (0.2491) data time 0.0010 (0.0090) model time 0.2421 (0.2396) loss 2.2941 (3.6172) grad_norm 2.1567 (1.8639) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][70/1251] eta 0:04:55 lr 0.000948 wd 0.0500 time 0.2413 (0.2506) data time 0.0007 (0.0079) model time 0.2406 (0.2491) loss 4.4477 (3.5841) grad_norm 1.7786 (1.8586) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][80/1251] eta 0:04:52 lr 0.000948 wd 0.0500 time 0.2465 (0.2496) data time 0.0009 (0.0071) model time 0.2456 (0.2464) loss 3.5814 (3.5580) grad_norm 1.7007 (1.8666) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][90/1251] eta 0:04:48 lr 0.000948 wd 0.0500 time 0.2475 (0.2488) data time 0.0011 (0.0064) model time 0.2464 (0.2453) loss 3.8236 (3.5496) grad_norm 1.5499 (1.8556) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][100/1251] eta 0:04:45 lr 0.000948 wd 0.0500 time 0.2415 (0.2480) data time 0.0010 (0.0059) model time 0.2405 (0.2441) loss 4.0177 (3.5604) grad_norm 2.2954 (1.8727) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][110/1251] eta 0:04:42 lr 0.000948 wd 0.0500 time 0.2527 (0.2473) data time 0.0008 (0.0055) model time 0.2519 (0.2432) loss 3.8333 (3.5677) grad_norm 2.0539 (1.8808) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][120/1251] eta 0:04:39 lr 0.000948 wd 0.0500 time 0.2388 (0.2468) data time 0.0011 (0.0051) model time 0.2377 (0.2428) loss 3.6326 (3.5549) grad_norm 2.2173 (1.8896) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][130/1251] eta 0:04:36 lr 0.000948 wd 0.0500 time 0.2429 (0.2464) data time 0.0008 (0.0048) model time 0.2421 (0.2425) loss 3.0061 (3.5530) grad_norm 1.4211 (1.8951) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][140/1251] eta 0:04:33 lr 0.000948 wd 0.0500 time 0.2460 (0.2461) data time 0.0008 (0.0045) model time 0.2451 (0.2423) loss 4.5493 (3.5715) grad_norm 1.9614 (1.8835) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][150/1251] eta 0:04:30 lr 0.000948 wd 0.0500 time 0.2402 (0.2458) data time 0.0010 (0.0043) model time 0.2392 (0.2421) loss 3.4485 (3.5908) grad_norm 2.0707 (1.9065) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][160/1251] eta 0:04:28 lr 0.000948 wd 0.0500 time 0.2693 (0.2457) data time 0.0009 (0.0041) model time 0.2683 (0.2422) loss 2.8514 (3.5964) grad_norm 2.6372 (1.9173) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][170/1251] eta 0:04:25 lr 0.000948 wd 0.0500 time 0.2473 (0.2455) data time 0.0011 (0.0039) model time 0.2461 (0.2422) loss 3.4615 
(3.6016) grad_norm 3.5250 (1.9223) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][180/1251] eta 0:04:22 lr 0.000948 wd 0.0500 time 0.2476 (0.2454) data time 0.0012 (0.0038) model time 0.2464 (0.2422) loss 4.1030 (3.6134) grad_norm 1.8935 (1.9215) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][190/1251] eta 0:04:20 lr 0.000948 wd 0.0500 time 0.2355 (0.2451) data time 0.0009 (0.0036) model time 0.2345 (0.2420) loss 4.6293 (3.6173) grad_norm 1.4739 (1.9195) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][200/1251] eta 0:04:17 lr 0.000948 wd 0.0500 time 0.2414 (0.2449) data time 0.0010 (0.0035) model time 0.2405 (0.2418) loss 2.2253 (3.6093) grad_norm 1.4243 (1.9317) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][210/1251] eta 0:04:14 lr 0.000948 wd 0.0500 time 0.2405 (0.2446) data time 0.0010 (0.0034) model time 0.2395 (0.2416) loss 2.4717 (3.6045) grad_norm 1.3941 (1.9155) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][220/1251] eta 0:04:12 lr 0.000948 wd 0.0500 time 0.2418 (0.2454) data time 0.0008 (0.0033) model time 0.2410 (0.2427) loss 3.9792 (3.6018) grad_norm 3.0040 (1.9140) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][230/1251] eta 0:04:10 lr 0.000948 wd 0.0500 time 0.2341 (0.2452) data time 0.0007 (0.0032) model time 0.2334 (0.2425) loss 3.9444 (3.5990) grad_norm 1.9338 (1.9196) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][240/1251] eta 0:04:07 lr 0.000948 wd 
0.0500 time 0.2375 (0.2449) data time 0.0010 (0.0031) model time 0.2365 (0.2423) loss 3.2027 (3.6042) grad_norm 3.0037 (1.9336) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][250/1251] eta 0:04:05 lr 0.000948 wd 0.0500 time 0.2380 (0.2448) data time 0.0010 (0.0030) model time 0.2371 (0.2422) loss 4.0448 (3.6123) grad_norm 1.5153 (1.9381) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][260/1251] eta 0:04:02 lr 0.000948 wd 0.0500 time 0.2401 (0.2447) data time 0.0008 (0.0029) model time 0.2394 (0.2421) loss 4.1854 (3.6133) grad_norm 1.6741 (1.9323) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][270/1251] eta 0:03:59 lr 0.000948 wd 0.0500 time 0.2397 (0.2445) data time 0.0008 (0.0029) model time 0.2390 (0.2420) loss 2.3167 (3.6165) grad_norm 1.6128 (1.9271) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][280/1251] eta 0:03:57 lr 0.000948 wd 0.0500 time 0.2429 (0.2444) data time 0.0011 (0.0028) model time 0.2418 (0.2420) loss 3.0312 (3.6084) grad_norm 1.9680 (1.9295) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][290/1251] eta 0:03:54 lr 0.000948 wd 0.0500 time 0.2370 (0.2444) data time 0.0010 (0.0027) model time 0.2360 (0.2419) loss 2.9273 (3.6027) grad_norm 1.6969 (1.9356) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][300/1251] eta 0:03:53 lr 0.000948 wd 0.0500 time 0.2398 (0.2457) data time 0.0008 (0.0027) model time 0.2390 (0.2436) loss 3.9159 (3.6074) grad_norm 1.4244 (1.9460) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:14 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][310/1251] eta 0:03:51 lr 0.000948 wd 0.0500 time 0.2356 (0.2461) data time 0.0008 (0.0026) model time 0.2348 (0.2442) loss 3.4716 (3.6016) grad_norm 2.1192 (1.9444) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][320/1251] eta 0:03:48 lr 0.000948 wd 0.0500 time 0.2350 (0.2459) data time 0.0010 (0.0026) model time 0.2341 (0.2440) loss 3.4452 (3.6046) grad_norm 1.4952 (1.9431) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][330/1251] eta 0:03:46 lr 0.000948 wd 0.0500 time 0.2310 (0.2458) data time 0.0010 (0.0025) model time 0.2300 (0.2438) loss 2.9528 (3.6090) grad_norm 1.6816 (1.9563) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][340/1251] eta 0:03:43 lr 0.000948 wd 0.0500 time 0.2410 (0.2456) data time 0.0010 (0.0025) model time 0.2401 (0.2437) loss 4.0051 (3.6109) grad_norm 1.7357 (1.9591) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][350/1251] eta 0:03:41 lr 0.000948 wd 0.0500 time 0.2382 (0.2454) data time 0.0010 (0.0024) model time 0.2371 (0.2435) loss 3.8244 (3.6120) grad_norm 2.6584 (1.9560) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][360/1251] eta 0:03:38 lr 0.000948 wd 0.0500 time 0.2410 (0.2453) data time 0.0007 (0.0024) model time 0.2403 (0.2434) loss 4.4627 (3.5998) grad_norm 2.2704 (1.9612) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][370/1251] eta 0:03:36 lr 0.000948 wd 0.0500 time 0.2374 (0.2452) data time 0.0012 (0.0024) model time 0.2362 (0.2433) loss 3.6803 
(3.5946) grad_norm 1.3702 (1.9591) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][380/1251] eta 0:03:33 lr 0.000948 wd 0.0500 time 0.2348 (0.2451) data time 0.0012 (0.0023) model time 0.2336 (0.2432) loss 3.9131 (3.5973) grad_norm 2.0563 (1.9590) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][390/1251] eta 0:03:30 lr 0.000948 wd 0.0500 time 0.2342 (0.2450) data time 0.0011 (0.0023) model time 0.2331 (0.2431) loss 3.2462 (3.5967) grad_norm 1.5072 (1.9578) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][400/1251] eta 0:03:28 lr 0.000948 wd 0.0500 time 0.2492 (0.2449) data time 0.0010 (0.0023) model time 0.2482 (0.2430) loss 3.2723 (3.5998) grad_norm 2.2530 (1.9580) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][410/1251] eta 0:03:25 lr 0.000948 wd 0.0500 time 0.2306 (0.2448) data time 0.0009 (0.0022) model time 0.2297 (0.2429) loss 4.4291 (3.6041) grad_norm 2.0338 (1.9630) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][420/1251] eta 0:03:23 lr 0.000948 wd 0.0500 time 0.2398 (0.2447) data time 0.0011 (0.0022) model time 0.2387 (0.2429) loss 3.7058 (3.6023) grad_norm 1.9838 (1.9623) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][430/1251] eta 0:03:20 lr 0.000948 wd 0.0500 time 0.2380 (0.2446) data time 0.0007 (0.0022) model time 0.2373 (0.2427) loss 4.5274 (3.6070) grad_norm 1.7375 (1.9595) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][440/1251] eta 0:03:18 lr 0.000948 wd 
0.0500 time 0.2543 (0.2445) data time 0.0008 (0.0021) model time 0.2534 (0.2427) loss 3.9523 (3.6081) grad_norm 1.3111 (1.9565) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][450/1251] eta 0:03:15 lr 0.000948 wd 0.0500 time 0.2452 (0.2444) data time 0.0010 (0.0021) model time 0.2442 (0.2426) loss 4.2414 (3.6063) grad_norm 2.0422 (1.9730) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][460/1251] eta 0:03:13 lr 0.000948 wd 0.0500 time 0.2441 (0.2444) data time 0.0011 (0.0021) model time 0.2431 (0.2425) loss 2.8139 (3.6056) grad_norm 1.3598 (1.9753) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][470/1251] eta 0:03:10 lr 0.000948 wd 0.0500 time 0.2493 (0.2444) data time 0.0007 (0.0021) model time 0.2486 (0.2425) loss 3.4208 (3.6057) grad_norm 2.1195 (1.9729) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][480/1251] eta 0:03:08 lr 0.000948 wd 0.0500 time 0.2381 (0.2443) data time 0.0010 (0.0021) model time 0.2371 (0.2425) loss 3.2995 (3.6069) grad_norm 1.4378 (1.9670) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][490/1251] eta 0:03:05 lr 0.000948 wd 0.0500 time 0.2444 (0.2442) data time 0.0008 (0.0020) model time 0.2437 (0.2424) loss 2.2108 (3.6041) grad_norm 1.7196 (1.9616) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][500/1251] eta 0:03:03 lr 0.000948 wd 0.0500 time 0.2369 (0.2444) data time 0.0011 (0.0020) model time 0.2359 (0.2427) loss 3.2397 (3.6001) grad_norm 1.8976 (1.9572) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][510/1251] eta 0:03:01 lr 0.000948 wd 0.0500 time 0.2424 (0.2444) data time 0.0010 (0.0020) model time 0.2414 (0.2427) loss 3.9104 (3.5969) grad_norm 1.8247 (1.9562) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][520/1251] eta 0:02:58 lr 0.000948 wd 0.0500 time 0.2475 (0.2443) data time 0.0010 (0.0020) model time 0.2465 (0.2426) loss 3.4502 (3.5946) grad_norm 1.8948 (1.9540) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][530/1251] eta 0:02:56 lr 0.000947 wd 0.0500 time 0.2532 (0.2443) data time 0.0009 (0.0020) model time 0.2523 (0.2426) loss 4.4173 (3.5972) grad_norm 1.8961 (1.9495) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][540/1251] eta 0:02:53 lr 0.000947 wd 0.0500 time 0.2410 (0.2443) data time 0.0010 (0.0019) model time 0.2399 (0.2426) loss 3.6069 (3.5934) grad_norm 1.6323 (1.9461) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][550/1251] eta 0:02:51 lr 0.000947 wd 0.0500 time 0.2463 (0.2442) data time 0.0010 (0.0019) model time 0.2453 (0.2425) loss 3.8103 (3.5936) grad_norm 2.1073 (1.9509) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][560/1251] eta 0:02:49 lr 0.000947 wd 0.0500 time 0.2423 (0.2446) data time 0.0009 (0.0019) model time 0.2413 (0.2429) loss 3.9499 (3.5953) grad_norm 1.7144 (1.9498) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][570/1251] eta 0:02:46 lr 0.000947 wd 0.0500 time 0.2363 (0.2447) data time 0.0008 (0.0019) model time 0.2355 (0.2431) loss 3.3975 
(3.5989) grad_norm 2.2946 (1.9503) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][580/1251] eta 0:02:44 lr 0.000947 wd 0.0500 time 0.2333 (0.2446) data time 0.0011 (0.0019) model time 0.2322 (0.2430) loss 3.9577 (3.6010) grad_norm 2.4327 (1.9504) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][590/1251] eta 0:02:42 lr 0.000947 wd 0.0500 time 0.4634 (0.2451) data time 0.0008 (0.0019) model time 0.4626 (0.2435) loss 4.1363 (3.6038) grad_norm 2.9260 (1.9522) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][600/1251] eta 0:02:39 lr 0.000947 wd 0.0500 time 0.2385 (0.2450) data time 0.0007 (0.0019) model time 0.2378 (0.2434) loss 4.2653 (3.6051) grad_norm 2.0985 (1.9562) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][610/1251] eta 0:02:36 lr 0.000947 wd 0.0500 time 0.2350 (0.2449) data time 0.0008 (0.0018) model time 0.2342 (0.2433) loss 2.7006 (3.6029) grad_norm 1.7812 (1.9566) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][620/1251] eta 0:02:34 lr 0.000947 wd 0.0500 time 0.2384 (0.2447) data time 0.0009 (0.0018) model time 0.2374 (0.2432) loss 3.6543 (3.6033) grad_norm 1.6572 (1.9603) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][630/1251] eta 0:02:31 lr 0.000947 wd 0.0500 time 0.2399 (0.2446) data time 0.0009 (0.0018) model time 0.2390 (0.2431) loss 3.8765 (3.6045) grad_norm 1.8808 (1.9577) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][640/1251] eta 0:02:29 lr 0.000947 wd 
0.0500 time 0.2393 (0.2445) data time 0.0010 (0.0018) model time 0.2382 (0.2430) loss 2.8262 (3.6026) grad_norm 1.4869 (1.9540) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][650/1251] eta 0:02:26 lr 0.000947 wd 0.0500 time 0.2367 (0.2445) data time 0.0009 (0.0018) model time 0.2359 (0.2429) loss 4.3900 (3.5983) grad_norm 1.7788 (1.9499) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][660/1251] eta 0:02:24 lr 0.000947 wd 0.0500 time 0.2336 (0.2444) data time 0.0012 (0.0018) model time 0.2324 (0.2428) loss 3.7865 (3.5992) grad_norm 2.0955 (1.9459) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][670/1251] eta 0:02:21 lr 0.000947 wd 0.0500 time 0.2428 (0.2443) data time 0.0007 (0.0018) model time 0.2421 (0.2427) loss 3.2764 (3.5952) grad_norm 1.6737 (1.9431) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][680/1251] eta 0:02:19 lr 0.000947 wd 0.0500 time 0.2341 (0.2442) data time 0.0008 (0.0018) model time 0.2333 (0.2426) loss 4.1206 (3.5905) grad_norm 1.2675 (1.9424) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][690/1251] eta 0:02:16 lr 0.000947 wd 0.0500 time 0.2350 (0.2441) data time 0.0011 (0.0017) model time 0.2339 (0.2425) loss 3.4494 (3.5844) grad_norm 1.8406 (1.9449) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][700/1251] eta 0:02:14 lr 0.000947 wd 0.0500 time 0.2423 (0.2440) data time 0.0009 (0.0017) model time 0.2415 (0.2424) loss 3.9089 (3.5831) grad_norm 1.8640 (1.9467) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:51 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][710/1251] eta 0:02:12 lr 0.000947 wd 0.0500 time 0.2362 (0.2442) data time 0.0010 (0.0017) model time 0.2352 (0.2427) loss 3.9210 (3.5825) grad_norm 1.8966 (1.9447) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][720/1251] eta 0:02:09 lr 0.000947 wd 0.0500 time 0.2349 (0.2441) data time 0.0008 (0.0017) model time 0.2341 (0.2426) loss 3.9204 (3.5828) grad_norm 2.1304 (1.9462) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][730/1251] eta 0:02:07 lr 0.000947 wd 0.0500 time 0.2463 (0.2443) data time 0.0008 (0.0017) model time 0.2455 (0.2428) loss 4.5567 (3.5823) grad_norm 2.0871 (1.9448) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][740/1251] eta 0:02:04 lr 0.000947 wd 0.0500 time 0.2321 (0.2442) data time 0.0012 (0.0017) model time 0.2309 (0.2427) loss 3.5789 (3.5836) grad_norm 1.4824 (1.9492) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][750/1251] eta 0:02:02 lr 0.000947 wd 0.0500 time 0.2374 (0.2441) data time 0.0011 (0.0017) model time 0.2363 (0.2426) loss 3.7954 (3.5826) grad_norm 1.4421 (1.9481) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][760/1251] eta 0:01:59 lr 0.000947 wd 0.0500 time 0.2358 (0.2440) data time 0.0010 (0.0017) model time 0.2348 (0.2425) loss 4.1620 (3.5788) grad_norm 1.9097 (1.9450) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][770/1251] eta 0:01:57 lr 0.000947 wd 0.0500 time 0.2366 (0.2439) data time 0.0008 (0.0017) model time 0.2358 (0.2424) loss 3.8829 
(3.5801) grad_norm 1.8852 (1.9502) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][780/1251] eta 0:01:54 lr 0.000947 wd 0.0500 time 0.2392 (0.2439) data time 0.0011 (0.0017) model time 0.2382 (0.2423) loss 3.1803 (3.5804) grad_norm 1.9593 (1.9477) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][790/1251] eta 0:01:52 lr 0.000947 wd 0.0500 time 0.2327 (0.2438) data time 0.0011 (0.0016) model time 0.2315 (0.2423) loss 2.5231 (3.5756) grad_norm 1.6489 (1.9443) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][800/1251] eta 0:01:49 lr 0.000947 wd 0.0500 time 0.2360 (0.2437) data time 0.0010 (0.0016) model time 0.2351 (0.2422) loss 2.7567 (3.5705) grad_norm 1.8963 (1.9444) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][810/1251] eta 0:01:47 lr 0.000947 wd 0.0500 time 0.2349 (0.2436) data time 0.0008 (0.0016) model time 0.2341 (0.2421) loss 3.4804 (3.5730) grad_norm 2.0484 (1.9426) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][820/1251] eta 0:01:44 lr 0.000947 wd 0.0500 time 0.2378 (0.2436) data time 0.0007 (0.0016) model time 0.2371 (0.2421) loss 3.8567 (3.5724) grad_norm 2.1120 (1.9410) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][830/1251] eta 0:01:42 lr 0.000947 wd 0.0500 time 0.2408 (0.2436) data time 0.0009 (0.0016) model time 0.2400 (0.2421) loss 3.5778 (3.5716) grad_norm 1.8327 (1.9393) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][840/1251] eta 0:01:40 lr 0.000947 wd 
0.0500 time 0.2402 (0.2435) data time 0.0011 (0.0016) model time 0.2391 (0.2420) loss 3.6317 (3.5686) grad_norm 2.4429 (1.9375) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][850/1251] eta 0:01:37 lr 0.000947 wd 0.0500 time 0.2367 (0.2434) data time 0.0009 (0.0016) model time 0.2358 (0.2419) loss 3.5079 (3.5670) grad_norm 2.6547 (1.9390) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][860/1251] eta 0:01:35 lr 0.000947 wd 0.0500 time 0.2363 (0.2434) data time 0.0008 (0.0016) model time 0.2355 (0.2419) loss 3.4706 (3.5652) grad_norm 2.0365 (1.9417) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][870/1251] eta 0:01:32 lr 0.000947 wd 0.0500 time 0.2396 (0.2433) data time 0.0010 (0.0016) model time 0.2386 (0.2418) loss 3.3993 (3.5654) grad_norm 2.5065 (inf) loss_scale 4096.0000 (8163.7842) mem 7379MB [2024-08-26 07:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][880/1251] eta 0:01:30 lr 0.000947 wd 0.0500 time 0.2313 (0.2432) data time 0.0010 (0.0016) model time 0.2303 (0.2417) loss 3.4828 (3.5666) grad_norm 1.8801 (inf) loss_scale 4096.0000 (8117.6118) mem 7379MB [2024-08-26 07:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][890/1251] eta 0:01:27 lr 0.000947 wd 0.0500 time 0.2486 (0.2432) data time 0.0007 (0.0016) model time 0.2479 (0.2417) loss 3.8479 (3.5672) grad_norm 1.8663 (inf) loss_scale 4096.0000 (8072.4759) mem 7379MB [2024-08-26 07:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][900/1251] eta 0:01:25 lr 0.000947 wd 0.0500 time 0.2362 (0.2431) data time 0.0010 (0.0016) model time 0.2352 (0.2416) loss 4.0219 (3.5697) grad_norm 1.9316 (inf) loss_scale 4096.0000 (8028.3418) mem 7379MB [2024-08-26 07:31:39 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [61/300][910/1251] eta 0:01:22 lr 0.000947 wd 0.0500 time 0.2393 (0.2431) data time 0.0008 (0.0016) model time 0.2386 (0.2416) loss 2.7038 (3.5686) grad_norm 1.8249 (inf) loss_scale 4096.0000 (7985.1767) mem 7379MB [2024-08-26 07:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][920/1251] eta 0:01:20 lr 0.000947 wd 0.0500 time 0.2392 (0.2430) data time 0.0010 (0.0016) model time 0.2382 (0.2415) loss 2.5331 (3.5659) grad_norm 3.3788 (inf) loss_scale 4096.0000 (7942.9490) mem 7379MB [2024-08-26 07:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][930/1251] eta 0:01:17 lr 0.000947 wd 0.0500 time 0.2384 (0.2430) data time 0.0007 (0.0016) model time 0.2377 (0.2415) loss 3.2715 (3.5624) grad_norm 1.6321 (inf) loss_scale 4096.0000 (7901.6284) mem 7379MB [2024-08-26 07:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][940/1251] eta 0:01:15 lr 0.000947 wd 0.0500 time 0.2376 (0.2429) data time 0.0011 (0.0016) model time 0.2365 (0.2414) loss 3.2132 (3.5620) grad_norm 2.3246 (inf) loss_scale 4096.0000 (7861.1860) mem 7379MB [2024-08-26 07:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][950/1251] eta 0:01:13 lr 0.000947 wd 0.0500 time 0.2365 (0.2429) data time 0.0011 (0.0015) model time 0.2354 (0.2414) loss 3.7348 (3.5617) grad_norm 1.4851 (inf) loss_scale 4096.0000 (7821.5941) mem 7379MB [2024-08-26 07:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][960/1251] eta 0:01:10 lr 0.000947 wd 0.0500 time 0.2297 (0.2428) data time 0.0010 (0.0015) model time 0.2287 (0.2414) loss 2.6692 (3.5627) grad_norm 2.0245 (inf) loss_scale 4096.0000 (7782.8262) mem 7379MB [2024-08-26 07:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][970/1251] eta 0:01:08 lr 0.000947 wd 0.0500 time 0.2287 (0.2428) data time 0.0008 (0.0015) model time 0.2279 (0.2413) loss 2.5739 (3.5632) grad_norm 1.6789 (inf) loss_scale 
4096.0000 (7744.8568) mem 7379MB [2024-08-26 07:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][980/1251] eta 0:01:05 lr 0.000947 wd 0.0500 time 0.2313 (0.2427) data time 0.0009 (0.0015) model time 0.2304 (0.2413) loss 3.8454 (3.5605) grad_norm 2.1417 (inf) loss_scale 4096.0000 (7707.6616) mem 7379MB [2024-08-26 07:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][990/1251] eta 0:01:03 lr 0.000947 wd 0.0500 time 0.2414 (0.2427) data time 0.0010 (0.0015) model time 0.2404 (0.2412) loss 4.2701 (3.5603) grad_norm 1.9786 (inf) loss_scale 4096.0000 (7671.2170) mem 7379MB [2024-08-26 07:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1000/1251] eta 0:01:00 lr 0.000947 wd 0.0500 time 0.2413 (0.2427) data time 0.0010 (0.0015) model time 0.2403 (0.2412) loss 2.4261 (3.5613) grad_norm 1.3805 (inf) loss_scale 4096.0000 (7635.5005) mem 7379MB [2024-08-26 07:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1010/1251] eta 0:00:58 lr 0.000947 wd 0.0500 time 0.2358 (0.2426) data time 0.0010 (0.0015) model time 0.2348 (0.2411) loss 3.1745 (3.5638) grad_norm 2.1199 (inf) loss_scale 4096.0000 (7600.4906) mem 7379MB [2024-08-26 07:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1020/1251] eta 0:00:56 lr 0.000947 wd 0.0500 time 0.2359 (0.2427) data time 0.0007 (0.0015) model time 0.2352 (0.2413) loss 3.1432 (3.5644) grad_norm 1.9960 (inf) loss_scale 4096.0000 (7566.1665) mem 7379MB [2024-08-26 07:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1030/1251] eta 0:00:53 lr 0.000946 wd 0.0500 time 0.2550 (0.2427) data time 0.0009 (0.0015) model time 0.2540 (0.2412) loss 2.8458 (3.5630) grad_norm 1.9836 (inf) loss_scale 4096.0000 (7532.5082) mem 7379MB [2024-08-26 07:32:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1040/1251] eta 0:00:51 lr 0.000946 wd 0.0500 time 0.2386 (0.2426) data time 0.0007 (0.0015) 
model time 0.2378 (0.2412) loss 4.2015 (3.5644) grad_norm 1.8000 (inf) loss_scale 4096.0000 (7499.4966) mem 7379MB [2024-08-26 07:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1050/1251] eta 0:00:48 lr 0.000946 wd 0.0500 time 0.2313 (0.2426) data time 0.0009 (0.0015) model time 0.2304 (0.2411) loss 2.6470 (3.5640) grad_norm 1.6976 (inf) loss_scale 4096.0000 (7467.1132) mem 7379MB [2024-08-26 07:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1060/1251] eta 0:00:46 lr 0.000946 wd 0.0500 time 0.2370 (0.2425) data time 0.0009 (0.0015) model time 0.2361 (0.2410) loss 3.5218 (3.5641) grad_norm 2.5251 (inf) loss_scale 4096.0000 (7435.3402) mem 7379MB [2024-08-26 07:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1070/1251] eta 0:00:43 lr 0.000946 wd 0.0500 time 0.2354 (0.2424) data time 0.0008 (0.0015) model time 0.2346 (0.2410) loss 4.2446 (3.5643) grad_norm 1.7072 (inf) loss_scale 4096.0000 (7404.1606) mem 7379MB [2024-08-26 07:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1080/1251] eta 0:00:41 lr 0.000946 wd 0.0500 time 0.2372 (0.2424) data time 0.0011 (0.0015) model time 0.2361 (0.2410) loss 3.2671 (3.5661) grad_norm 1.6653 (inf) loss_scale 4096.0000 (7373.5578) mem 7379MB [2024-08-26 07:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1090/1251] eta 0:00:39 lr 0.000946 wd 0.0500 time 0.2343 (0.2424) data time 0.0010 (0.0015) model time 0.2332 (0.2409) loss 3.8312 (3.5627) grad_norm 1.3271 (inf) loss_scale 4096.0000 (7343.5160) mem 7379MB [2024-08-26 07:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1100/1251] eta 0:00:36 lr 0.000946 wd 0.0500 time 0.2323 (0.2425) data time 0.0009 (0.0015) model time 0.2314 (0.2411) loss 4.2186 (3.5650) grad_norm 2.9065 (inf) loss_scale 4096.0000 (7314.0200) mem 7379MB [2024-08-26 07:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1110/1251] 
eta 0:00:34 lr 0.000946 wd 0.0500 time 0.2379 (0.2425) data time 0.0010 (0.0015) model time 0.2369 (0.2411) loss 3.4760 (3.5615) grad_norm 1.6897 (inf) loss_scale 4096.0000 (7285.0549) mem 7379MB [2024-08-26 07:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1120/1251] eta 0:00:31 lr 0.000946 wd 0.0500 time 0.2407 (0.2425) data time 0.0009 (0.0015) model time 0.2398 (0.2410) loss 1.9899 (3.5623) grad_norm 1.9267 (inf) loss_scale 4096.0000 (7256.6066) mem 7379MB [2024-08-26 07:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1130/1251] eta 0:00:29 lr 0.000946 wd 0.0500 time 0.2373 (0.2424) data time 0.0011 (0.0015) model time 0.2362 (0.2410) loss 3.7312 (3.5647) grad_norm 1.8754 (inf) loss_scale 4096.0000 (7228.6614) mem 7379MB [2024-08-26 07:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1140/1251] eta 0:00:26 lr 0.000946 wd 0.0500 time 0.2356 (0.2424) data time 0.0007 (0.0015) model time 0.2348 (0.2410) loss 2.9847 (3.5665) grad_norm 1.3739 (inf) loss_scale 4096.0000 (7201.2060) mem 7379MB [2024-08-26 07:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1150/1251] eta 0:00:24 lr 0.000946 wd 0.0500 time 0.2445 (0.2426) data time 0.0008 (0.0015) model time 0.2438 (0.2411) loss 3.3135 (3.5648) grad_norm 1.8023 (inf) loss_scale 4096.0000 (7174.2276) mem 7379MB [2024-08-26 07:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1160/1251] eta 0:00:22 lr 0.000946 wd 0.0500 time 0.2376 (0.2425) data time 0.0009 (0.0015) model time 0.2367 (0.2411) loss 3.0039 (3.5650) grad_norm 2.3714 (inf) loss_scale 4096.0000 (7147.7140) mem 7379MB [2024-08-26 07:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1170/1251] eta 0:00:19 lr 0.000946 wd 0.0500 time 0.2424 (0.2425) data time 0.0010 (0.0014) model time 0.2415 (0.2411) loss 3.8254 (3.5670) grad_norm 1.5770 (inf) loss_scale 4096.0000 (7121.6533) mem 7379MB [2024-08-26 07:32:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1180/1251] eta 0:00:17 lr 0.000946 wd 0.0500 time 0.2353 (0.2425) data time 0.0008 (0.0014) model time 0.2346 (0.2411) loss 3.5698 (3.5685) grad_norm 1.4910 (inf) loss_scale 4096.0000 (7096.0339) mem 7379MB [2024-08-26 07:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1190/1251] eta 0:00:14 lr 0.000946 wd 0.0500 time 0.2360 (0.2424) data time 0.0015 (0.0014) model time 0.2345 (0.2410) loss 2.8540 (3.5641) grad_norm 1.5663 (inf) loss_scale 4096.0000 (7070.8447) mem 7379MB [2024-08-26 07:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1200/1251] eta 0:00:12 lr 0.000946 wd 0.0500 time 0.2416 (0.2424) data time 0.0010 (0.0014) model time 0.2406 (0.2410) loss 3.7551 (3.5641) grad_norm 1.7401 (inf) loss_scale 4096.0000 (7046.0749) mem 7379MB [2024-08-26 07:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1210/1251] eta 0:00:09 lr 0.000946 wd 0.0500 time 0.2377 (0.2424) data time 0.0007 (0.0014) model time 0.2370 (0.2410) loss 2.5393 (3.5641) grad_norm 1.5459 (inf) loss_scale 4096.0000 (7021.7143) mem 7379MB [2024-08-26 07:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1220/1251] eta 0:00:07 lr 0.000946 wd 0.0500 time 0.2389 (0.2424) data time 0.0012 (0.0014) model time 0.2377 (0.2410) loss 3.7596 (3.5627) grad_norm 1.6341 (inf) loss_scale 4096.0000 (6997.7527) mem 7379MB [2024-08-26 07:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1230/1251] eta 0:00:05 lr 0.000946 wd 0.0500 time 0.2343 (0.2427) data time 0.0008 (0.0014) model time 0.2335 (0.2413) loss 2.6741 (3.5590) grad_norm 2.7374 (inf) loss_scale 4096.0000 (6974.1803) mem 7379MB [2024-08-26 07:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1240/1251] eta 0:00:02 lr 0.000946 wd 0.0500 time 0.2264 (0.2428) data time 0.0005 (0.0014) model time 0.2259 (0.2414) loss 2.8588 (3.5616) 
grad_norm 1.7485 (inf) loss_scale 4096.0000 (6950.9879) mem 7379MB [2024-08-26 07:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [61/300][1250/1251] eta 0:00:00 lr 0.000946 wd 0.0500 time 0.2265 (0.2427) data time 0.0007 (0.0014) model time 0.2258 (0.2413) loss 3.2462 (3.5612) grad_norm 1.3894 (inf) loss_scale 4096.0000 (6928.1663) mem 7379MB [2024-08-26 07:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 61 training takes 0:05:03 [2024-08-26 07:33:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:33:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.508 (0.508) Loss 0.5781 (0.5781) Acc@1 88.086 (88.086) Acc@5 97.363 (97.363) Mem 7379MB [2024-08-26 07:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.119) Loss 0.8179 (0.8224) Acc@1 81.152 (81.170) Acc@5 96.191 (95.898) Mem 7379MB [2024-08-26 07:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.100) Loss 1.1836 (0.8401) Acc@1 72.266 (80.311) Acc@5 91.992 (95.968) Mem 7379MB [2024-08-26 07:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.093) Loss 1.5059 (0.9855) Acc@1 63.672 (77.161) Acc@5 87.988 (94.141) Mem 7379MB [2024-08-26 07:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.087) Loss 1.3955 (1.0606) Acc@1 67.676 (75.457) Acc@5 88.770 (93.281) Mem 7379MB [2024-08-26 07:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.258 Acc@5 93.192 [2024-08-26 07:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.3% [2024-08-26 07:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] 
Time 0.948 (0.948) Loss 0.4661 (0.4661) Acc@1 91.016 (91.016) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 07:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.159) Loss 0.7715 (0.7464) Acc@1 84.082 (83.620) Acc@5 95.996 (96.626) Mem 7379MB [2024-08-26 07:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.121) Loss 1.0811 (0.7661) Acc@1 74.707 (82.617) Acc@5 92.871 (96.582) Mem 7379MB [2024-08-26 07:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.107) Loss 1.3340 (0.8774) Acc@1 65.723 (79.984) Acc@5 89.258 (95.161) Mem 7379MB [2024-08-26 07:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.098) Loss 1.2529 (0.9389) Acc@1 69.043 (78.387) Acc@5 90.918 (94.465) Mem 7379MB [2024-08-26 07:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.012 Acc@5 94.398 [2024-08-26 07:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.0% [2024-08-26 07:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.01% [2024-08-26 07:33:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:33:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 07:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][0/1251] eta 0:15:35 lr 0.000946 wd 0.0500 time 0.7482 (0.7482) data time 0.5086 (0.5086) model time 0.0000 (0.0000) loss 3.4940 (3.4940) grad_norm 1.7038 (1.7038) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][10/1251] eta 0:05:54 lr 0.000946 wd 0.0500 time 0.2315 (0.2855) data time 0.0010 (0.0472) model time 0.0000 (0.0000) loss 2.5106 (3.2698) grad_norm 1.5955 (1.7936) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][20/1251] eta 0:05:36 lr 0.000946 wd 0.0500 time 0.2387 (0.2731) data time 0.0009 (0.0252) model time 0.0000 (0.0000) loss 4.0553 (3.3905) grad_norm 2.4089 (1.9315) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][30/1251] eta 0:05:19 lr 0.000946 wd 0.0500 time 0.2388 (0.2613) data time 0.0010 (0.0174) model time 0.0000 (0.0000) loss 3.0342 (3.3552) grad_norm 2.8807 (2.0638) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][40/1251] eta 0:05:09 lr 0.000946 wd 0.0500 time 0.2433 (0.2557) data time 0.0009 (0.0134) model time 0.0000 (0.0000) loss 4.2438 (3.4101) grad_norm 3.0270 (2.0718) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][50/1251] eta 0:05:03 lr 0.000946 wd 0.0500 time 0.2446 (0.2528) data time 0.0010 (0.0110) model time 0.0000 (0.0000) loss 3.5318 (3.4593) grad_norm 1.9082 (2.0675) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][60/1251] eta 0:04:58 lr 0.000946 wd 0.0500 time 0.2406 (0.2505) data time 0.0007 (0.0094) model time 0.2399 (0.2376) loss 
3.4133 (3.4764) grad_norm 2.1746 (2.0278) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][70/1251] eta 0:04:54 lr 0.000946 wd 0.0500 time 0.2384 (0.2490) data time 0.0009 (0.0082) model time 0.2375 (0.2382) loss 2.5941 (3.4625) grad_norm 1.4710 (1.9869) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][80/1251] eta 0:04:49 lr 0.000946 wd 0.0500 time 0.2371 (0.2476) data time 0.0007 (0.0073) model time 0.2364 (0.2376) loss 2.3421 (3.4633) grad_norm 1.8258 (1.9906) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][90/1251] eta 0:04:46 lr 0.000946 wd 0.0500 time 0.2369 (0.2469) data time 0.0011 (0.0066) model time 0.2358 (0.2383) loss 3.7592 (3.4472) grad_norm 1.6687 (1.9719) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][100/1251] eta 0:04:42 lr 0.000946 wd 0.0500 time 0.2341 (0.2458) data time 0.0012 (0.0061) model time 0.2329 (0.2376) loss 3.4363 (3.4284) grad_norm 1.7246 (2.0112) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][110/1251] eta 0:04:39 lr 0.000946 wd 0.0500 time 0.2408 (0.2451) data time 0.0012 (0.0056) model time 0.2396 (0.2375) loss 3.9656 (3.4403) grad_norm 1.4965 (1.9923) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][120/1251] eta 0:04:36 lr 0.000946 wd 0.0500 time 0.2388 (0.2448) data time 0.0009 (0.0053) model time 0.2378 (0.2379) loss 3.9370 (3.4468) grad_norm 1.4281 (1.9680) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][130/1251] eta 0:04:33 lr 
0.000946 wd 0.0500 time 0.2425 (0.2443) data time 0.0008 (0.0049) model time 0.2417 (0.2378) loss 4.2617 (3.4647) grad_norm 2.1372 (1.9428) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][140/1251] eta 0:04:31 lr 0.000946 wd 0.0500 time 0.2346 (0.2440) data time 0.0011 (0.0047) model time 0.2335 (0.2379) loss 3.5415 (3.4723) grad_norm 1.6069 (1.9247) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][150/1251] eta 0:04:27 lr 0.000946 wd 0.0500 time 0.2352 (0.2434) data time 0.0009 (0.0044) model time 0.2343 (0.2376) loss 2.7094 (3.4512) grad_norm 1.8493 (1.9159) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][160/1251] eta 0:04:26 lr 0.000946 wd 0.0500 time 0.2322 (0.2444) data time 0.0010 (0.0042) model time 0.2312 (0.2395) loss 2.8978 (3.4500) grad_norm 1.5236 (1.9264) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][170/1251] eta 0:04:23 lr 0.000946 wd 0.0500 time 0.2322 (0.2442) data time 0.0007 (0.0040) model time 0.2314 (0.2394) loss 3.8295 (3.4655) grad_norm 1.5842 (1.9157) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][180/1251] eta 0:04:21 lr 0.000946 wd 0.0500 time 0.2454 (0.2439) data time 0.0008 (0.0039) model time 0.2446 (0.2394) loss 4.1160 (3.4768) grad_norm 1.6159 (1.9037) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][190/1251] eta 0:04:18 lr 0.000946 wd 0.0500 time 0.2363 (0.2436) data time 0.0008 (0.0037) model time 0.2356 (0.2392) loss 3.7901 (3.4747) grad_norm 1.4979 (1.9010) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:00 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][200/1251] eta 0:04:15 lr 0.000946 wd 0.0500 time 0.2309 (0.2435) data time 0.0009 (0.0036) model time 0.2301 (0.2393) loss 2.4497 (3.4632) grad_norm 2.7872 (1.9028) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][210/1251] eta 0:04:13 lr 0.000946 wd 0.0500 time 0.2368 (0.2434) data time 0.0007 (0.0035) model time 0.2361 (0.2393) loss 3.7322 (3.4800) grad_norm 1.6591 (1.9127) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][220/1251] eta 0:04:10 lr 0.000946 wd 0.0500 time 0.2455 (0.2432) data time 0.0008 (0.0033) model time 0.2447 (0.2392) loss 2.4842 (3.4762) grad_norm 1.8948 (1.9178) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][230/1251] eta 0:04:08 lr 0.000946 wd 0.0500 time 0.2394 (0.2430) data time 0.0008 (0.0032) model time 0.2385 (0.2392) loss 3.0763 (3.4748) grad_norm 3.4097 (1.9211) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][240/1251] eta 0:04:05 lr 0.000946 wd 0.0500 time 0.2417 (0.2427) data time 0.0009 (0.0032) model time 0.2407 (0.2389) loss 3.3609 (3.4754) grad_norm 1.9495 (1.9403) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][250/1251] eta 0:04:02 lr 0.000946 wd 0.0500 time 0.2405 (0.2426) data time 0.0009 (0.0031) model time 0.2396 (0.2389) loss 4.5532 (3.4710) grad_norm 1.6112 (1.9443) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][260/1251] eta 0:04:00 lr 0.000946 wd 0.0500 time 0.2469 (0.2423) data time 0.0011 (0.0030) model time 0.2458 (0.2388) loss 3.2608 
(3.4806) grad_norm 1.4921 (1.9401) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][270/1251] eta 0:03:57 lr 0.000946 wd 0.0500 time 0.2424 (0.2422) data time 0.0009 (0.0029) model time 0.2415 (0.2387) loss 3.7978 (3.4846) grad_norm 1.8891 (1.9397) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][280/1251] eta 0:03:55 lr 0.000945 wd 0.0500 time 0.2443 (0.2422) data time 0.0011 (0.0029) model time 0.2432 (0.2388) loss 3.8648 (3.4861) grad_norm 1.8973 (1.9318) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][290/1251] eta 0:03:52 lr 0.000945 wd 0.0500 time 0.2426 (0.2421) data time 0.0009 (0.0028) model time 0.2418 (0.2387) loss 4.3339 (3.4998) grad_norm 2.2839 (1.9318) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][300/1251] eta 0:03:50 lr 0.000945 wd 0.0500 time 0.2420 (0.2420) data time 0.0007 (0.0028) model time 0.2413 (0.2387) loss 3.7240 (3.5028) grad_norm 1.4395 (1.9295) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][310/1251] eta 0:03:47 lr 0.000945 wd 0.0500 time 0.2460 (0.2419) data time 0.0008 (0.0027) model time 0.2452 (0.2387) loss 4.1506 (3.5061) grad_norm 2.2813 (1.9368) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][320/1251] eta 0:03:45 lr 0.000945 wd 0.0500 time 0.2386 (0.2425) data time 0.0009 (0.0027) model time 0.2376 (0.2395) loss 3.7013 (3.5118) grad_norm 1.9957 (1.9415) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][330/1251] eta 0:03:43 lr 0.000945 wd 
0.0500 time 0.2313 (0.2424) data time 0.0009 (0.0026) model time 0.2305 (0.2394) loss 2.3645 (3.5107) grad_norm 1.8973 (1.9401) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][340/1251] eta 0:03:40 lr 0.000945 wd 0.0500 time 0.2460 (0.2423) data time 0.0007 (0.0026) model time 0.2453 (0.2394) loss 3.4313 (3.5123) grad_norm 1.6537 (1.9377) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][350/1251] eta 0:03:38 lr 0.000945 wd 0.0500 time 0.2363 (0.2421) data time 0.0007 (0.0025) model time 0.2356 (0.2392) loss 3.3923 (3.5160) grad_norm 1.7499 (1.9422) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][360/1251] eta 0:03:35 lr 0.000945 wd 0.0500 time 0.2393 (0.2420) data time 0.0008 (0.0025) model time 0.2385 (0.2391) loss 3.4803 (3.5214) grad_norm 1.9141 (1.9399) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][370/1251] eta 0:03:33 lr 0.000945 wd 0.0500 time 0.2365 (0.2419) data time 0.0012 (0.0025) model time 0.2353 (0.2390) loss 3.1249 (3.5270) grad_norm 2.2500 (1.9435) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][380/1251] eta 0:03:31 lr 0.000945 wd 0.0500 time 0.2397 (0.2424) data time 0.0008 (0.0024) model time 0.2389 (0.2397) loss 2.8948 (3.5305) grad_norm 1.9095 (1.9405) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][390/1251] eta 0:03:28 lr 0.000945 wd 0.0500 time 0.2321 (0.2423) data time 0.0009 (0.0024) model time 0.2312 (0.2396) loss 3.9070 (3.5266) grad_norm 2.5979 (1.9497) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][400/1251] eta 0:03:26 lr 0.000945 wd 0.0500 time 0.2449 (0.2422) data time 0.0009 (0.0024) model time 0.2440 (0.2396) loss 3.7264 (3.5250) grad_norm 2.5067 (1.9607) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][410/1251] eta 0:03:23 lr 0.000945 wd 0.0500 time 0.2371 (0.2421) data time 0.0007 (0.0023) model time 0.2364 (0.2395) loss 2.2446 (3.5161) grad_norm 1.5397 (1.9523) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][420/1251] eta 0:03:21 lr 0.000945 wd 0.0500 time 0.2409 (0.2421) data time 0.0010 (0.0023) model time 0.2399 (0.2395) loss 4.0622 (3.5185) grad_norm 1.5672 (1.9566) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][430/1251] eta 0:03:18 lr 0.000945 wd 0.0500 time 0.2309 (0.2420) data time 0.0008 (0.0023) model time 0.2301 (0.2394) loss 2.3188 (3.5065) grad_norm 1.9694 (1.9492) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][440/1251] eta 0:03:16 lr 0.000945 wd 0.0500 time 0.2377 (0.2420) data time 0.0009 (0.0022) model time 0.2368 (0.2395) loss 3.7297 (3.5066) grad_norm 1.8415 (1.9438) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][450/1251] eta 0:03:13 lr 0.000945 wd 0.0500 time 0.2347 (0.2420) data time 0.0009 (0.0022) model time 0.2338 (0.2395) loss 4.1851 (3.5111) grad_norm 1.5805 (1.9439) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][460/1251] eta 0:03:11 lr 0.000945 wd 0.0500 time 0.2384 (0.2419) data time 0.0009 (0.0022) model time 0.2375 (0.2395) loss 2.9353 
(3.5072) grad_norm 1.8690 (1.9492) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][470/1251] eta 0:03:08 lr 0.000945 wd 0.0500 time 0.2418 (0.2418) data time 0.0010 (0.0022) model time 0.2407 (0.2394) loss 3.6844 (3.5120) grad_norm 2.7336 (1.9521) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][480/1251] eta 0:03:06 lr 0.000945 wd 0.0500 time 0.2432 (0.2418) data time 0.0009 (0.0021) model time 0.2423 (0.2394) loss 3.4897 (3.5113) grad_norm 2.1301 (1.9630) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][490/1251] eta 0:03:04 lr 0.000945 wd 0.0500 time 0.2498 (0.2418) data time 0.0009 (0.0021) model time 0.2489 (0.2394) loss 3.5683 (3.5131) grad_norm 1.7597 (1.9629) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][500/1251] eta 0:03:02 lr 0.000945 wd 0.0500 time 0.2336 (0.2423) data time 0.0011 (0.0021) model time 0.2325 (0.2401) loss 3.9915 (3.5109) grad_norm 1.3966 (1.9583) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][510/1251] eta 0:02:59 lr 0.000945 wd 0.0500 time 0.2387 (0.2426) data time 0.0007 (0.0021) model time 0.2380 (0.2404) loss 2.8074 (3.5122) grad_norm 2.2483 (1.9541) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][520/1251] eta 0:02:57 lr 0.000945 wd 0.0500 time 0.2429 (0.2430) data time 0.0009 (0.0021) model time 0.2421 (0.2408) loss 3.6986 (3.5134) grad_norm 1.9505 (1.9532) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][530/1251] eta 0:02:55 lr 0.000945 wd 
0.0500 time 0.2476 (0.2429) data time 0.0007 (0.0020) model time 0.2469 (0.2408) loss 2.8023 (3.5131) grad_norm 3.6993 (1.9525) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][540/1251] eta 0:02:52 lr 0.000945 wd 0.0500 time 0.2350 (0.2429) data time 0.0011 (0.0020) model time 0.2339 (0.2407) loss 3.6907 (3.5135) grad_norm 1.8208 (1.9526) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][550/1251] eta 0:02:50 lr 0.000945 wd 0.0500 time 0.2322 (0.2428) data time 0.0010 (0.0020) model time 0.2312 (0.2407) loss 3.7596 (3.5161) grad_norm 3.3779 (1.9582) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][560/1251] eta 0:02:47 lr 0.000945 wd 0.0500 time 0.2441 (0.2431) data time 0.0011 (0.0020) model time 0.2430 (0.2410) loss 4.0218 (3.5114) grad_norm 2.1908 (1.9641) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][570/1251] eta 0:02:45 lr 0.000945 wd 0.0500 time 0.2385 (0.2430) data time 0.0010 (0.0020) model time 0.2375 (0.2409) loss 3.5661 (3.5146) grad_norm 2.2792 (1.9659) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][580/1251] eta 0:02:43 lr 0.000945 wd 0.0500 time 0.2457 (0.2430) data time 0.0008 (0.0020) model time 0.2449 (0.2409) loss 3.8538 (3.5159) grad_norm 2.0573 (1.9725) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][590/1251] eta 0:02:40 lr 0.000945 wd 0.0500 time 0.2400 (0.2428) data time 0.0010 (0.0019) model time 0.2390 (0.2408) loss 2.5479 (3.5188) grad_norm 1.4783 (1.9784) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][600/1251] eta 0:02:38 lr 0.000945 wd 0.0500 time 0.2585 (0.2428) data time 0.0011 (0.0019) model time 0.2575 (0.2408) loss 3.8322 (3.5243) grad_norm 1.9613 (1.9778) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][610/1251] eta 0:02:35 lr 0.000945 wd 0.0500 time 0.2399 (0.2428) data time 0.0011 (0.0019) model time 0.2388 (0.2408) loss 3.5921 (3.5278) grad_norm 1.5823 (1.9737) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][620/1251] eta 0:02:33 lr 0.000945 wd 0.0500 time 0.2403 (0.2427) data time 0.0007 (0.0019) model time 0.2396 (0.2407) loss 2.3847 (3.5250) grad_norm 1.8566 (1.9715) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][630/1251] eta 0:02:30 lr 0.000945 wd 0.0500 time 0.2306 (0.2426) data time 0.0011 (0.0019) model time 0.2296 (0.2406) loss 3.8405 (3.5254) grad_norm 1.5034 (1.9745) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][640/1251] eta 0:02:28 lr 0.000945 wd 0.0500 time 0.2390 (0.2425) data time 0.0010 (0.0019) model time 0.2379 (0.2406) loss 3.8785 (3.5248) grad_norm 3.5781 (1.9797) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][650/1251] eta 0:02:25 lr 0.000945 wd 0.0500 time 0.2374 (0.2425) data time 0.0008 (0.0019) model time 0.2366 (0.2405) loss 2.5828 (3.5241) grad_norm 2.1226 (1.9807) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][660/1251] eta 0:02:23 lr 0.000945 wd 0.0500 time 0.2366 (0.2424) data time 0.0009 (0.0019) model time 0.2356 (0.2405) loss 3.0229 
(3.5270) grad_norm 1.7337 (1.9832) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][670/1251] eta 0:02:20 lr 0.000945 wd 0.0500 time 0.2369 (0.2423) data time 0.0010 (0.0019) model time 0.2359 (0.2404) loss 3.4325 (3.5290) grad_norm 1.8683 (1.9798) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][680/1251] eta 0:02:18 lr 0.000945 wd 0.0500 time 0.2365 (0.2423) data time 0.0010 (0.0019) model time 0.2355 (0.2403) loss 2.5453 (3.5263) grad_norm 2.7066 (1.9795) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][690/1251] eta 0:02:15 lr 0.000945 wd 0.0500 time 0.2514 (0.2423) data time 0.0007 (0.0019) model time 0.2506 (0.2403) loss 3.5610 (3.5250) grad_norm 2.4298 (1.9790) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][700/1251] eta 0:02:13 lr 0.000945 wd 0.0500 time 0.2354 (0.2422) data time 0.0008 (0.0018) model time 0.2346 (0.2403) loss 4.1991 (3.5255) grad_norm 2.2899 (1.9779) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][710/1251] eta 0:02:11 lr 0.000945 wd 0.0500 time 0.2402 (0.2422) data time 0.0010 (0.0018) model time 0.2393 (0.2403) loss 3.7620 (3.5289) grad_norm 2.5244 (1.9768) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][720/1251] eta 0:02:08 lr 0.000945 wd 0.0500 time 0.2499 (0.2422) data time 0.0008 (0.0018) model time 0.2491 (0.2403) loss 2.3577 (3.5297) grad_norm 1.9247 (1.9799) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][730/1251] eta 0:02:06 lr 0.000945 wd 
0.0500 time 0.2411 (0.2421) data time 0.0009 (0.0018) model time 0.2403 (0.2402) loss 2.6864 (3.5310) grad_norm 1.9787 (1.9852) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][740/1251] eta 0:02:03 lr 0.000945 wd 0.0500 time 0.2398 (0.2421) data time 0.0008 (0.0018) model time 0.2390 (0.2402) loss 3.0912 (3.5291) grad_norm 1.2473 (1.9939) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][750/1251] eta 0:02:01 lr 0.000945 wd 0.0500 time 0.2367 (0.2420) data time 0.0012 (0.0018) model time 0.2356 (0.2401) loss 2.7118 (3.5265) grad_norm 1.7950 (1.9922) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][760/1251] eta 0:01:58 lr 0.000945 wd 0.0500 time 0.2357 (0.2423) data time 0.0011 (0.0018) model time 0.2346 (0.2404) loss 3.7021 (3.5275) grad_norm 2.2592 (1.9942) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][770/1251] eta 0:01:56 lr 0.000944 wd 0.0500 time 0.2358 (0.2423) data time 0.0008 (0.0018) model time 0.2349 (0.2404) loss 4.2778 (3.5241) grad_norm 1.6443 (1.9962) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][780/1251] eta 0:01:54 lr 0.000944 wd 0.0500 time 0.2470 (0.2422) data time 0.0007 (0.0018) model time 0.2463 (0.2404) loss 2.9826 (3.5223) grad_norm 1.4708 (1.9918) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][790/1251] eta 0:01:51 lr 0.000944 wd 0.0500 time 0.2308 (0.2422) data time 0.0011 (0.0018) model time 0.2297 (0.2403) loss 4.3732 (3.5231) grad_norm 1.7511 (1.9886) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][800/1251] eta 0:01:49 lr 0.000944 wd 0.0500 time 0.2363 (0.2421) data time 0.0010 (0.0018) model time 0.2353 (0.2403) loss 3.5691 (3.5263) grad_norm 1.7893 (1.9872) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][810/1251] eta 0:01:46 lr 0.000944 wd 0.0500 time 0.2412 (0.2421) data time 0.0009 (0.0017) model time 0.2403 (0.2402) loss 3.3888 (3.5242) grad_norm 1.5670 (1.9872) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][820/1251] eta 0:01:44 lr 0.000944 wd 0.0500 time 0.2454 (0.2420) data time 0.0009 (0.0017) model time 0.2444 (0.2402) loss 3.7134 (3.5256) grad_norm 2.3129 (1.9890) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][830/1251] eta 0:01:41 lr 0.000944 wd 0.0500 time 0.2309 (0.2420) data time 0.0011 (0.0017) model time 0.2298 (0.2401) loss 3.9583 (3.5287) grad_norm 1.9604 (1.9889) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][840/1251] eta 0:01:39 lr 0.000944 wd 0.0500 time 0.2595 (0.2420) data time 0.0012 (0.0017) model time 0.2583 (0.2401) loss 3.9551 (3.5294) grad_norm 1.5926 (1.9883) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][850/1251] eta 0:01:37 lr 0.000944 wd 0.0500 time 0.2439 (0.2419) data time 0.0009 (0.0017) model time 0.2430 (0.2401) loss 3.9145 (3.5328) grad_norm 2.0244 (1.9875) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][860/1251] eta 0:01:34 lr 0.000944 wd 0.0500 time 0.2388 (0.2419) data time 0.0008 (0.0017) model time 0.2381 (0.2401) loss 3.3484 
(3.5344) grad_norm 2.6767 (1.9874) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][870/1251] eta 0:01:32 lr 0.000944 wd 0.0500 time 0.2291 (0.2419) data time 0.0011 (0.0017) model time 0.2281 (0.2401) loss 2.7938 (3.5273) grad_norm 1.6686 (1.9884) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][880/1251] eta 0:01:29 lr 0.000944 wd 0.0500 time 0.2343 (0.2418) data time 0.0009 (0.0017) model time 0.2333 (0.2401) loss 2.9274 (3.5293) grad_norm 1.5820 (1.9868) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][890/1251] eta 0:01:27 lr 0.000944 wd 0.0500 time 0.4284 (0.2421) data time 0.0011 (0.0017) model time 0.4273 (0.2403) loss 3.6421 (3.5249) grad_norm 1.9855 (1.9841) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][900/1251] eta 0:01:24 lr 0.000944 wd 0.0500 time 0.2304 (0.2420) data time 0.0011 (0.0017) model time 0.2293 (0.2403) loss 3.3538 (3.5232) grad_norm 1.6762 (1.9846) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][910/1251] eta 0:01:22 lr 0.000944 wd 0.0500 time 0.2350 (0.2420) data time 0.0009 (0.0017) model time 0.2341 (0.2402) loss 2.4272 (3.5225) grad_norm 1.5840 (1.9877) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][920/1251] eta 0:01:20 lr 0.000944 wd 0.0500 time 0.2379 (0.2419) data time 0.0010 (0.0017) model time 0.2368 (0.2402) loss 2.7392 (3.5232) grad_norm 2.0276 (1.9859) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][930/1251] eta 0:01:17 lr 0.000944 wd 
0.0500 time 0.2351 (0.2419) data time 0.0011 (0.0017) model time 0.2340 (0.2401) loss 3.6563 (3.5204) grad_norm 1.7088 (1.9856) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][940/1251] eta 0:01:15 lr 0.000944 wd 0.0500 time 0.2373 (0.2418) data time 0.0010 (0.0016) model time 0.2362 (0.2401) loss 4.5277 (3.5223) grad_norm 1.4259 (1.9826) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][950/1251] eta 0:01:12 lr 0.000944 wd 0.0500 time 0.2481 (0.2418) data time 0.0007 (0.0016) model time 0.2474 (0.2401) loss 4.1248 (3.5224) grad_norm 2.0926 (1.9813) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][960/1251] eta 0:01:10 lr 0.000944 wd 0.0500 time 0.2388 (0.2418) data time 0.0010 (0.0016) model time 0.2378 (0.2400) loss 4.1754 (3.5231) grad_norm 2.4551 (1.9815) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][970/1251] eta 0:01:07 lr 0.000944 wd 0.0500 time 0.2369 (0.2417) data time 0.0008 (0.0016) model time 0.2361 (0.2400) loss 2.9457 (3.5238) grad_norm 1.8656 (1.9825) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][980/1251] eta 0:01:05 lr 0.000944 wd 0.0500 time 0.2408 (0.2417) data time 0.0008 (0.0016) model time 0.2399 (0.2400) loss 4.0383 (3.5242) grad_norm 1.6309 (1.9790) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][990/1251] eta 0:01:03 lr 0.000944 wd 0.0500 time 0.2357 (0.2417) data time 0.0009 (0.0016) model time 0.2348 (0.2399) loss 3.4309 (3.5242) grad_norm 1.6163 (1.9783) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:13 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1000/1251] eta 0:01:00 lr 0.000944 wd 0.0500 time 0.2420 (0.2416) data time 0.0007 (0.0016) model time 0.2413 (0.2399) loss 4.0536 (3.5242) grad_norm 1.7694 (1.9774) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1010/1251] eta 0:00:58 lr 0.000944 wd 0.0500 time 0.2415 (0.2416) data time 0.0010 (0.0016) model time 0.2405 (0.2399) loss 3.9005 (3.5248) grad_norm 2.1823 (1.9776) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1020/1251] eta 0:00:55 lr 0.000944 wd 0.0500 time 0.2354 (0.2418) data time 0.0009 (0.0016) model time 0.2345 (0.2401) loss 4.4204 (3.5258) grad_norm 2.4212 (1.9773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1030/1251] eta 0:00:53 lr 0.000944 wd 0.0500 time 0.2476 (0.2420) data time 0.0009 (0.0016) model time 0.2466 (0.2403) loss 4.2635 (3.5281) grad_norm 1.9457 (1.9754) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1040/1251] eta 0:00:51 lr 0.000944 wd 0.0500 time 0.2391 (0.2422) data time 0.0007 (0.0016) model time 0.2384 (0.2405) loss 4.6485 (3.5294) grad_norm 2.2229 (1.9763) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1050/1251] eta 0:00:48 lr 0.000944 wd 0.0500 time 0.2458 (0.2422) data time 0.0019 (0.0016) model time 0.2439 (0.2405) loss 4.0243 (3.5295) grad_norm 1.6206 (1.9779) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1060/1251] eta 0:00:46 lr 0.000944 wd 0.0500 time 0.2440 (0.2421) data time 0.0007 (0.0016) model time 0.2433 (0.2405) loss 4.3789 
(3.5292) grad_norm 1.7740 (1.9756) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1070/1251] eta 0:00:43 lr 0.000944 wd 0.0500 time 0.2345 (0.2422) data time 0.0010 (0.0016) model time 0.2335 (0.2406) loss 3.7141 (3.5299) grad_norm 1.7277 (1.9741) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1080/1251] eta 0:00:41 lr 0.000944 wd 0.0500 time 0.2314 (0.2422) data time 0.0011 (0.0016) model time 0.2303 (0.2406) loss 4.0412 (3.5300) grad_norm 2.0535 (1.9738) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1090/1251] eta 0:00:39 lr 0.000944 wd 0.0500 time 0.2346 (0.2424) data time 0.0010 (0.0016) model time 0.2336 (0.2408) loss 3.6833 (3.5307) grad_norm 1.8356 (1.9727) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1100/1251] eta 0:00:36 lr 0.000944 wd 0.0500 time 0.2473 (0.2423) data time 0.0010 (0.0016) model time 0.2463 (0.2407) loss 4.0373 (3.5310) grad_norm 1.6113 (1.9722) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1110/1251] eta 0:00:34 lr 0.000944 wd 0.0500 time 0.2346 (0.2423) data time 0.0010 (0.0016) model time 0.2336 (0.2407) loss 2.1014 (3.5297) grad_norm 1.6746 (1.9694) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1120/1251] eta 0:00:31 lr 0.000944 wd 0.0500 time 0.2338 (0.2423) data time 0.0008 (0.0016) model time 0.2330 (0.2407) loss 3.8937 (3.5270) grad_norm 1.3345 (1.9692) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1130/1251] eta 0:00:29 lr 
0.000944 wd 0.0500 time 0.2378 (0.2422) data time 0.0012 (0.0015) model time 0.2366 (0.2406) loss 3.3221 (3.5286) grad_norm 2.0130 (1.9698) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1140/1251] eta 0:00:26 lr 0.000944 wd 0.0500 time 0.2403 (0.2422) data time 0.0008 (0.0015) model time 0.2395 (0.2406) loss 4.6334 (3.5258) grad_norm 1.5982 (1.9700) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1150/1251] eta 0:00:24 lr 0.000944 wd 0.0500 time 0.2439 (0.2422) data time 0.0007 (0.0015) model time 0.2432 (0.2406) loss 4.3984 (3.5279) grad_norm 3.4587 (1.9701) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1160/1251] eta 0:00:22 lr 0.000944 wd 0.0500 time 0.2307 (0.2421) data time 0.0011 (0.0015) model time 0.2296 (0.2406) loss 3.1778 (3.5284) grad_norm 1.8333 (1.9705) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1170/1251] eta 0:00:19 lr 0.000944 wd 0.0500 time 0.2432 (0.2421) data time 0.0010 (0.0015) model time 0.2422 (0.2405) loss 2.4846 (3.5262) grad_norm 3.1459 (1.9711) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1180/1251] eta 0:00:17 lr 0.000944 wd 0.0500 time 0.2362 (0.2421) data time 0.0011 (0.0015) model time 0.2351 (0.2405) loss 3.9690 (3.5259) grad_norm 2.3202 (1.9761) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1190/1251] eta 0:00:14 lr 0.000944 wd 0.0500 time 0.2447 (0.2421) data time 0.0008 (0.0015) model time 0.2439 (0.2405) loss 2.8862 (3.5239) grad_norm 2.5430 (1.9756) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
07:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1200/1251] eta 0:00:12 lr 0.000944 wd 0.0500 time 0.2347 (0.2420) data time 0.0007 (0.0015) model time 0.2340 (0.2405) loss 3.4456 (3.5272) grad_norm 2.1096 (1.9755) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1210/1251] eta 0:00:09 lr 0.000944 wd 0.0500 time 0.2340 (0.2420) data time 0.0009 (0.0015) model time 0.2331 (0.2404) loss 2.6831 (3.5254) grad_norm 1.8003 (1.9755) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1220/1251] eta 0:00:07 lr 0.000944 wd 0.0500 time 0.2330 (0.2420) data time 0.0009 (0.0015) model time 0.2321 (0.2404) loss 4.3715 (3.5263) grad_norm 1.6748 (1.9743) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1230/1251] eta 0:00:05 lr 0.000944 wd 0.0500 time 0.2430 (0.2420) data time 0.0011 (0.0015) model time 0.2419 (0.2404) loss 4.3258 (3.5261) grad_norm 2.2738 (1.9751) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1240/1251] eta 0:00:02 lr 0.000944 wd 0.0500 time 0.2212 (0.2419) data time 0.0005 (0.0015) model time 0.2208 (0.2403) loss 3.6095 (3.5277) grad_norm 2.0435 (1.9741) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [62/300][1250/1251] eta 0:00:00 lr 0.000944 wd 0.0500 time 0.5151 (0.2420) data time 0.0006 (0.0015) model time 0.5145 (0.2404) loss 3.1149 (3.5287) grad_norm 2.1800 (1.9745) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 62 training takes 0:05:02 [2024-08-26 07:38:14 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 07:38:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 07:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.480 (0.480) Loss 0.5942 (0.5942) Acc@1 89.160 (89.160) Acc@5 97.949 (97.949) Mem 7379MB
[2024-08-26 07:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.113) Loss 0.8750 (0.8944) Acc@1 82.324 (80.797) Acc@5 96.387 (95.774) Mem 7379MB
[2024-08-26 07:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.097) Loss 1.2227 (0.9128) Acc@1 70.801 (79.734) Acc@5 92.285 (95.698) Mem 7379MB
[2024-08-26 07:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.091) Loss 1.4570 (1.0313) Acc@1 65.625 (77.148) Acc@5 88.184 (94.074) Mem 7379MB
[2024-08-26 07:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.085) Loss 1.4971 (1.0948) Acc@1 65.723 (75.641) Acc@5 88.574 (93.245) Mem 7379MB
[2024-08-26 07:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.328 Acc@5 93.184
[2024-08-26 07:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.3%
[2024-08-26 07:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.947 (0.947) Loss 0.4666 (0.4666) Acc@1 91.113 (91.113) Acc@5 98.145 (98.145) Mem 7379MB
[2024-08-26 07:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.162) Loss 0.7705 (0.7447) Acc@1 84.375 (83.842) Acc@5 95.996 (96.662) Mem 7379MB
[2024-08-26 07:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.123) Loss 1.0771 (0.7646) Acc@1 74.707 (82.738) Acc@5 92.969 (96.619) Mem 7379MB
[2024-08-26 07:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.088 (0.109) Loss 1.3320 (0.8757) Acc@1 65.723 (80.094) Acc@5 89.746 (95.183) Mem 7379MB
[2024-08-26 07:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.098) Loss 1.2490 (0.9370) Acc@1 68.945 (78.494) Acc@5 90.820 (94.491) Mem 7379MB
[2024-08-26 07:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.102 Acc@5 94.424
[2024-08-26 07:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.1%
[2024-08-26 07:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.10%
[2024-08-26 07:38:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 07:38:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 07:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][0/1251] eta 0:17:05 lr 0.000943 wd 0.0500 time 0.8196 (0.8196) data time 0.5862 (0.5862) model time 0.0000 (0.0000) loss 2.5494 (2.5494) grad_norm 2.4621 (2.4621) loss_scale 4096.0000 (4096.0000) mem 7379MB
[2024-08-26 07:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][10/1251] eta 0:06:02 lr 0.000943 wd 0.0500 time 0.2354 (0.2925) data time 0.0010 (0.0543) model time 0.0000 (0.0000) loss 3.1140 (3.4185) grad_norm 1.5146 (1.8336) loss_scale 4096.0000 (4096.0000) mem 7379MB
[2024-08-26 07:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][20/1251] eta 0:05:27 lr 0.000943 wd 0.0500 time 0.2365 (0.2660) data time 0.0009 (0.0289) model time 0.0000 (0.0000) loss 2.5761 (3.5670) grad_norm 1.4475 (1.8124) loss_scale 4096.0000 (4096.0000) mem 7379MB
[2024-08-26 07:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][30/1251] eta 0:05:13 lr 0.000943 wd 0.0500 time 0.2375 (0.2565) data time 0.0009 (0.0199) 
model time 0.0000 (0.0000) loss 2.1053 (3.5738) grad_norm 1.6113 (1.8359) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][40/1251] eta 0:05:05 lr 0.000943 wd 0.0500 time 0.2393 (0.2519) data time 0.0011 (0.0155) model time 0.0000 (0.0000) loss 3.6870 (3.5608) grad_norm 1.6879 (1.8418) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][50/1251] eta 0:04:59 lr 0.000943 wd 0.0500 time 0.2376 (0.2491) data time 0.0007 (0.0127) model time 0.0000 (0.0000) loss 4.1205 (3.5312) grad_norm 1.5174 (1.8240) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][60/1251] eta 0:04:54 lr 0.000943 wd 0.0500 time 0.2512 (0.2474) data time 0.0007 (0.0108) model time 0.2505 (0.2378) loss 2.9822 (3.5007) grad_norm 1.6950 (1.8313) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][70/1251] eta 0:04:50 lr 0.000943 wd 0.0500 time 0.2388 (0.2459) data time 0.0009 (0.0094) model time 0.2379 (0.2369) loss 4.4126 (3.5155) grad_norm 2.0286 (1.8571) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][80/1251] eta 0:04:49 lr 0.000943 wd 0.0500 time 0.2389 (0.2476) data time 0.0011 (0.0084) model time 0.2378 (0.2442) loss 3.7513 (3.5687) grad_norm 1.6843 (1.8523) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][90/1251] eta 0:04:46 lr 0.000943 wd 0.0500 time 0.2305 (0.2464) data time 0.0011 (0.0076) model time 0.2294 (0.2421) loss 3.5760 (3.5879) grad_norm 2.3222 (1.8373) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[63/300][100/1251] eta 0:04:42 lr 0.000943 wd 0.0500 time 0.2381 (0.2459) data time 0.0010 (0.0069) model time 0.2371 (0.2416) loss 3.6097 (3.6001) grad_norm 1.7134 (1.8294) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][110/1251] eta 0:04:44 lr 0.000943 wd 0.0500 time 0.4479 (0.2494) data time 0.0010 (0.0064) model time 0.4469 (0.2486) loss 3.3808 (3.6007) grad_norm 2.2749 (1.8821) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][120/1251] eta 0:04:40 lr 0.000943 wd 0.0500 time 0.2340 (0.2484) data time 0.0010 (0.0059) model time 0.2330 (0.2468) loss 3.7596 (3.6095) grad_norm 1.7879 (1.9211) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][130/1251] eta 0:04:37 lr 0.000943 wd 0.0500 time 0.2414 (0.2477) data time 0.0009 (0.0056) model time 0.2405 (0.2458) loss 4.1264 (3.5873) grad_norm 1.4846 (1.9434) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][140/1251] eta 0:04:34 lr 0.000943 wd 0.0500 time 0.2348 (0.2471) data time 0.0008 (0.0053) model time 0.2340 (0.2449) loss 3.4374 (3.5817) grad_norm 1.8432 (1.9464) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][150/1251] eta 0:04:31 lr 0.000943 wd 0.0500 time 0.2366 (0.2466) data time 0.0007 (0.0050) model time 0.2359 (0.2443) loss 3.0862 (3.5858) grad_norm 2.3686 (1.9656) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][160/1251] eta 0:04:28 lr 0.000943 wd 0.0500 time 0.2369 (0.2462) data time 0.0010 (0.0048) model time 0.2360 (0.2437) loss 2.9795 (3.5789) grad_norm 1.9272 (1.9844) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 07:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][170/1251] eta 0:04:26 lr 0.000943 wd 0.0500 time 0.2373 (0.2467) data time 0.0008 (0.0045) model time 0.2365 (0.2447) loss 3.6717 (3.5694) grad_norm 1.6364 (1.9878) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][180/1251] eta 0:04:23 lr 0.000943 wd 0.0500 time 0.2314 (0.2463) data time 0.0011 (0.0043) model time 0.2302 (0.2442) loss 3.3651 (3.5566) grad_norm 1.6728 (1.9694) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][190/1251] eta 0:04:21 lr 0.000943 wd 0.0500 time 0.2386 (0.2460) data time 0.0008 (0.0042) model time 0.2378 (0.2438) loss 3.6799 (3.5586) grad_norm 1.8758 (1.9622) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][200/1251] eta 0:04:18 lr 0.000943 wd 0.0500 time 0.2411 (0.2455) data time 0.0008 (0.0040) model time 0.2403 (0.2433) loss 4.0126 (3.5629) grad_norm 1.8237 (1.9635) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][210/1251] eta 0:04:15 lr 0.000943 wd 0.0500 time 0.2447 (0.2452) data time 0.0007 (0.0039) model time 0.2440 (0.2430) loss 2.2628 (3.5346) grad_norm 2.4709 (1.9773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][220/1251] eta 0:04:12 lr 0.000943 wd 0.0500 time 0.2393 (0.2449) data time 0.0010 (0.0038) model time 0.2383 (0.2426) loss 2.2766 (3.5160) grad_norm 2.3889 (1.9968) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][230/1251] eta 0:04:09 lr 0.000943 wd 0.0500 time 0.2385 (0.2446) data time 0.0007 (0.0036) 
model time 0.2378 (0.2423) loss 4.0259 (3.5104) grad_norm 1.2872 (1.9861) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][240/1251] eta 0:04:07 lr 0.000943 wd 0.0500 time 0.2320 (0.2444) data time 0.0012 (0.0035) model time 0.2307 (0.2421) loss 2.4764 (3.5032) grad_norm 1.2927 (1.9745) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][250/1251] eta 0:04:04 lr 0.000943 wd 0.0500 time 0.2372 (0.2442) data time 0.0009 (0.0034) model time 0.2364 (0.2419) loss 2.6619 (3.4918) grad_norm 2.5102 (1.9756) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][260/1251] eta 0:04:01 lr 0.000943 wd 0.0500 time 0.2469 (0.2441) data time 0.0007 (0.0033) model time 0.2462 (0.2418) loss 3.2361 (3.4868) grad_norm 1.8493 (1.9743) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][270/1251] eta 0:03:59 lr 0.000943 wd 0.0500 time 0.2351 (0.2438) data time 0.0012 (0.0033) model time 0.2339 (0.2415) loss 3.5692 (3.4893) grad_norm 1.4834 (1.9712) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][280/1251] eta 0:03:56 lr 0.000943 wd 0.0500 time 0.2342 (0.2436) data time 0.0011 (0.0032) model time 0.2331 (0.2414) loss 3.6826 (3.4877) grad_norm 2.0981 (1.9715) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][290/1251] eta 0:03:54 lr 0.000943 wd 0.0500 time 0.2374 (0.2435) data time 0.0008 (0.0031) model time 0.2366 (0.2413) loss 4.3595 (3.4931) grad_norm 1.9539 (1.9779) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[63/300][300/1251] eta 0:03:51 lr 0.000943 wd 0.0500 time 0.2387 (0.2434) data time 0.0008 (0.0030) model time 0.2379 (0.2412) loss 3.7555 (3.4960) grad_norm 2.0359 (1.9792) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][310/1251] eta 0:03:48 lr 0.000943 wd 0.0500 time 0.2341 (0.2432) data time 0.0009 (0.0030) model time 0.2332 (0.2410) loss 3.9853 (3.4957) grad_norm 1.8514 (1.9772) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][320/1251] eta 0:03:46 lr 0.000943 wd 0.0500 time 0.2424 (0.2431) data time 0.0009 (0.0029) model time 0.2415 (0.2410) loss 3.3825 (3.4915) grad_norm 1.9339 (1.9783) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][330/1251] eta 0:03:43 lr 0.000943 wd 0.0500 time 0.2341 (0.2429) data time 0.0008 (0.0028) model time 0.2333 (0.2408) loss 4.4632 (3.4919) grad_norm 1.4692 (1.9751) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][340/1251] eta 0:03:41 lr 0.000943 wd 0.0500 time 0.2347 (0.2434) data time 0.0009 (0.0028) model time 0.2338 (0.2413) loss 2.1642 (3.4875) grad_norm 1.6867 (1.9639) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][350/1251] eta 0:03:39 lr 0.000943 wd 0.0500 time 0.2343 (0.2433) data time 0.0009 (0.0027) model time 0.2335 (0.2412) loss 4.0747 (3.4941) grad_norm 2.1823 (1.9657) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][360/1251] eta 0:03:36 lr 0.000943 wd 0.0500 time 0.2436 (0.2432) data time 0.0009 (0.0027) model time 0.2427 (0.2412) loss 4.0110 (3.4972) grad_norm 3.4571 (1.9709) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 07:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][370/1251] eta 0:03:34 lr 0.000943 wd 0.0500 time 0.2369 (0.2431) data time 0.0009 (0.0026) model time 0.2360 (0.2412) loss 2.7234 (3.4938) grad_norm 1.7642 (1.9710) loss_scale 8192.0000 (4184.3235) mem 7379MB [2024-08-26 07:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][380/1251] eta 0:03:31 lr 0.000943 wd 0.0500 time 0.2322 (0.2429) data time 0.0010 (0.0026) model time 0.2312 (0.2410) loss 3.7084 (3.5005) grad_norm 1.8776 (1.9701) loss_scale 8192.0000 (4289.5118) mem 7379MB [2024-08-26 07:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][390/1251] eta 0:03:29 lr 0.000943 wd 0.0500 time 0.2431 (0.2429) data time 0.0009 (0.0026) model time 0.2422 (0.2409) loss 4.0786 (3.5094) grad_norm 2.0134 (1.9714) loss_scale 8192.0000 (4389.3197) mem 7379MB [2024-08-26 07:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][400/1251] eta 0:03:27 lr 0.000943 wd 0.0500 time 0.2332 (0.2439) data time 0.0008 (0.0025) model time 0.2324 (0.2421) loss 3.1628 (3.5081) grad_norm 1.9070 (1.9689) loss_scale 8192.0000 (4484.1496) mem 7379MB [2024-08-26 07:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][410/1251] eta 0:03:25 lr 0.000943 wd 0.0500 time 0.2327 (0.2442) data time 0.0008 (0.0025) model time 0.2318 (0.2425) loss 4.4562 (3.5139) grad_norm 1.7771 (1.9688) loss_scale 8192.0000 (4574.3650) mem 7379MB [2024-08-26 07:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][420/1251] eta 0:03:22 lr 0.000943 wd 0.0500 time 0.2361 (0.2441) data time 0.0011 (0.0025) model time 0.2350 (0.2423) loss 2.9323 (3.5180) grad_norm 2.4191 (1.9701) loss_scale 8192.0000 (4660.2945) mem 7379MB [2024-08-26 07:40:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][430/1251] eta 0:03:20 lr 0.000943 wd 0.0500 time 0.2409 (0.2440) data time 0.0008 (0.0024) 
model time 0.2401 (0.2423) loss 3.1376 (3.5178) grad_norm 2.3032 (1.9694) loss_scale 8192.0000 (4742.2367) mem 7379MB [2024-08-26 07:40:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][440/1251] eta 0:03:17 lr 0.000943 wd 0.0500 time 0.2502 (0.2439) data time 0.0010 (0.0024) model time 0.2492 (0.2422) loss 2.9677 (3.5116) grad_norm 1.2579 (1.9660) loss_scale 8192.0000 (4820.4626) mem 7379MB [2024-08-26 07:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][450/1251] eta 0:03:15 lr 0.000943 wd 0.0500 time 0.2402 (0.2443) data time 0.0009 (0.0024) model time 0.2393 (0.2426) loss 4.1624 (3.5190) grad_norm 1.6300 (1.9690) loss_scale 8192.0000 (4895.2195) mem 7379MB [2024-08-26 07:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][460/1251] eta 0:03:13 lr 0.000943 wd 0.0500 time 0.2286 (0.2442) data time 0.0010 (0.0023) model time 0.2276 (0.2425) loss 3.7074 (3.5184) grad_norm 2.9581 (1.9722) loss_scale 8192.0000 (4966.7332) mem 7379MB [2024-08-26 07:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][470/1251] eta 0:03:10 lr 0.000943 wd 0.0500 time 0.2448 (0.2441) data time 0.0008 (0.0023) model time 0.2440 (0.2424) loss 2.5613 (3.5135) grad_norm 1.7898 (1.9721) loss_scale 8192.0000 (5035.2102) mem 7379MB [2024-08-26 07:40:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][480/1251] eta 0:03:08 lr 0.000943 wd 0.0500 time 0.2498 (0.2440) data time 0.0009 (0.0023) model time 0.2489 (0.2424) loss 3.9030 (3.5104) grad_norm 2.1857 (1.9695) loss_scale 8192.0000 (5100.8399) mem 7379MB [2024-08-26 07:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][490/1251] eta 0:03:05 lr 0.000942 wd 0.0500 time 0.2391 (0.2440) data time 0.0009 (0.0023) model time 0.2382 (0.2423) loss 3.9451 (3.5055) grad_norm 1.6441 (1.9663) loss_scale 8192.0000 (5163.7963) mem 7379MB [2024-08-26 07:40:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[63/300][500/1251] eta 0:03:03 lr 0.000942 wd 0.0500 time 0.2351 (0.2439) data time 0.0010 (0.0022) model time 0.2341 (0.2423) loss 4.0805 (3.5107) grad_norm 1.8263 (1.9730) loss_scale 8192.0000 (5224.2395) mem 7379MB [2024-08-26 07:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][510/1251] eta 0:03:00 lr 0.000942 wd 0.0500 time 0.2380 (0.2439) data time 0.0008 (0.0022) model time 0.2372 (0.2422) loss 3.4953 (3.5054) grad_norm 2.6778 (1.9831) loss_scale 8192.0000 (5282.3170) mem 7379MB [2024-08-26 07:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][520/1251] eta 0:02:58 lr 0.000942 wd 0.0500 time 0.2456 (0.2440) data time 0.0007 (0.0022) model time 0.2449 (0.2424) loss 3.1468 (3.5088) grad_norm 1.3706 (1.9863) loss_scale 8192.0000 (5338.1651) mem 7379MB [2024-08-26 07:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][530/1251] eta 0:02:55 lr 0.000942 wd 0.0500 time 0.2414 (0.2440) data time 0.0012 (0.0022) model time 0.2402 (0.2425) loss 3.4942 (3.5068) grad_norm 1.9160 (1.9836) loss_scale 8192.0000 (5391.9096) mem 7379MB [2024-08-26 07:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][540/1251] eta 0:02:53 lr 0.000942 wd 0.0500 time 0.2354 (0.2440) data time 0.0010 (0.0021) model time 0.2344 (0.2424) loss 3.4569 (3.5093) grad_norm 1.8016 (1.9804) loss_scale 8192.0000 (5443.6673) mem 7379MB [2024-08-26 07:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][550/1251] eta 0:02:51 lr 0.000942 wd 0.0500 time 0.2565 (0.2440) data time 0.0010 (0.0021) model time 0.2555 (0.2424) loss 3.3410 (3.5096) grad_norm 1.6440 (1.9756) loss_scale 8192.0000 (5493.5463) mem 7379MB [2024-08-26 07:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][560/1251] eta 0:02:48 lr 0.000942 wd 0.0500 time 0.2417 (0.2439) data time 0.0007 (0.0021) model time 0.2410 (0.2423) loss 3.0097 (3.5091) grad_norm 1.7217 (1.9769) loss_scale 8192.0000 
(5541.6471) mem 7379MB [2024-08-26 07:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][570/1251] eta 0:02:46 lr 0.000942 wd 0.0500 time 0.2480 (0.2439) data time 0.0007 (0.0021) model time 0.2473 (0.2423) loss 4.3203 (3.5120) grad_norm 1.8327 (1.9831) loss_scale 8192.0000 (5588.0630) mem 7379MB [2024-08-26 07:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][580/1251] eta 0:02:43 lr 0.000942 wd 0.0500 time 0.2402 (0.2438) data time 0.0007 (0.0021) model time 0.2395 (0.2423) loss 4.4534 (3.5126) grad_norm 1.6482 (1.9811) loss_scale 8192.0000 (5632.8812) mem 7379MB [2024-08-26 07:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][590/1251] eta 0:02:41 lr 0.000942 wd 0.0500 time 0.2492 (0.2438) data time 0.0010 (0.0020) model time 0.2482 (0.2423) loss 3.2115 (3.5174) grad_norm 1.8222 (1.9786) loss_scale 8192.0000 (5676.1827) mem 7379MB [2024-08-26 07:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][600/1251] eta 0:02:38 lr 0.000942 wd 0.0500 time 0.2360 (0.2438) data time 0.0008 (0.0020) model time 0.2352 (0.2423) loss 4.4228 (3.5152) grad_norm 3.0065 (1.9806) loss_scale 8192.0000 (5718.0433) mem 7379MB [2024-08-26 07:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][610/1251] eta 0:02:36 lr 0.000942 wd 0.0500 time 0.2417 (0.2438) data time 0.0010 (0.0020) model time 0.2407 (0.2423) loss 3.3986 (3.5142) grad_norm 1.6972 (1.9865) loss_scale 8192.0000 (5758.5336) mem 7379MB [2024-08-26 07:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][620/1251] eta 0:02:33 lr 0.000942 wd 0.0500 time 0.2453 (0.2441) data time 0.0009 (0.0020) model time 0.2444 (0.2426) loss 3.7808 (3.5173) grad_norm 1.9183 (1.9831) loss_scale 8192.0000 (5797.7198) mem 7379MB [2024-08-26 07:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][630/1251] eta 0:02:31 lr 0.000942 wd 0.0500 time 0.2401 (0.2440) data time 0.0011 (0.0020) 
model time 0.2390 (0.2425) loss 2.5743 (3.5165) grad_norm 1.9782 (1.9793) loss_scale 8192.0000 (5835.6640) mem 7379MB [2024-08-26 07:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][640/1251] eta 0:02:29 lr 0.000942 wd 0.0500 time 0.2376 (0.2443) data time 0.0008 (0.0020) model time 0.2369 (0.2428) loss 3.6440 (3.5196) grad_norm 1.9106 (1.9755) loss_scale 8192.0000 (5872.4243) mem 7379MB [2024-08-26 07:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][650/1251] eta 0:02:26 lr 0.000942 wd 0.0500 time 0.2457 (0.2442) data time 0.0009 (0.0019) model time 0.2448 (0.2428) loss 4.1436 (3.5167) grad_norm 1.5415 (1.9714) loss_scale 8192.0000 (5908.0553) mem 7379MB [2024-08-26 07:41:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][660/1251] eta 0:02:24 lr 0.000942 wd 0.0500 time 0.2387 (0.2442) data time 0.0010 (0.0019) model time 0.2378 (0.2427) loss 3.8822 (3.5183) grad_norm 2.2936 (1.9690) loss_scale 8192.0000 (5942.6082) mem 7379MB [2024-08-26 07:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][670/1251] eta 0:02:21 lr 0.000942 wd 0.0500 time 0.2443 (0.2442) data time 0.0011 (0.0019) model time 0.2433 (0.2427) loss 2.9727 (3.5171) grad_norm 3.3359 (1.9702) loss_scale 8192.0000 (5976.1311) mem 7379MB [2024-08-26 07:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][680/1251] eta 0:02:19 lr 0.000942 wd 0.0500 time 0.2466 (0.2441) data time 0.0010 (0.0019) model time 0.2456 (0.2427) loss 3.7011 (3.5169) grad_norm 2.3801 (1.9764) loss_scale 8192.0000 (6008.6696) mem 7379MB [2024-08-26 07:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][690/1251] eta 0:02:17 lr 0.000942 wd 0.0500 time 0.2391 (0.2444) data time 0.0012 (0.0019) model time 0.2379 (0.2430) loss 3.3350 (3.5165) grad_norm 1.6493 (1.9718) loss_scale 8192.0000 (6040.2663) mem 7379MB [2024-08-26 07:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[63/300][700/1251] eta 0:02:14 lr 0.000942 wd 0.0500 time 0.2445 (0.2444) data time 0.0010 (0.0019) model time 0.2436 (0.2430) loss 3.1562 (3.5160) grad_norm 1.6818 (1.9704) loss_scale 8192.0000 (6070.9615) mem 7379MB [2024-08-26 07:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][710/1251] eta 0:02:12 lr 0.000942 wd 0.0500 time 0.2429 (0.2444) data time 0.0016 (0.0019) model time 0.2413 (0.2430) loss 4.1013 (3.5177) grad_norm 2.0864 (1.9711) loss_scale 8192.0000 (6100.7932) mem 7379MB [2024-08-26 07:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][720/1251] eta 0:02:09 lr 0.000942 wd 0.0500 time 0.2407 (0.2444) data time 0.0015 (0.0019) model time 0.2392 (0.2430) loss 2.5165 (3.5210) grad_norm 1.9809 (1.9748) loss_scale 8192.0000 (6129.7975) mem 7379MB [2024-08-26 07:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][730/1251] eta 0:02:07 lr 0.000942 wd 0.0500 time 0.2391 (0.2443) data time 0.0011 (0.0018) model time 0.2380 (0.2429) loss 3.2409 (3.5191) grad_norm 1.8760 (1.9761) loss_scale 8192.0000 (6158.0082) mem 7379MB [2024-08-26 07:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][740/1251] eta 0:02:04 lr 0.000942 wd 0.0500 time 0.2430 (0.2443) data time 0.0008 (0.0018) model time 0.2422 (0.2429) loss 3.0354 (3.5197) grad_norm 1.6305 (1.9809) loss_scale 8192.0000 (6185.4575) mem 7379MB [2024-08-26 07:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][750/1251] eta 0:02:02 lr 0.000942 wd 0.0500 time 0.2428 (0.2442) data time 0.0010 (0.0018) model time 0.2419 (0.2428) loss 4.2874 (3.5245) grad_norm 1.3908 (1.9793) loss_scale 8192.0000 (6212.1758) mem 7379MB [2024-08-26 07:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][760/1251] eta 0:01:59 lr 0.000942 wd 0.0500 time 0.2413 (0.2442) data time 0.0008 (0.0018) model time 0.2405 (0.2428) loss 4.4904 (3.5206) grad_norm 1.4929 (1.9762) loss_scale 8192.0000 
(6238.1919) mem 7379MB [2024-08-26 07:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][770/1251] eta 0:01:57 lr 0.000942 wd 0.0500 time 0.2321 (0.2441) data time 0.0010 (0.0018) model time 0.2311 (0.2427) loss 3.7061 (3.5200) grad_norm 2.3404 (1.9742) loss_scale 8192.0000 (6263.5331) mem 7379MB [2024-08-26 07:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][780/1251] eta 0:01:54 lr 0.000942 wd 0.0500 time 0.2428 (0.2441) data time 0.0009 (0.0018) model time 0.2419 (0.2427) loss 3.7124 (3.5175) grad_norm 1.5398 (1.9726) loss_scale 8192.0000 (6288.2254) mem 7379MB [2024-08-26 07:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][790/1251] eta 0:01:52 lr 0.000942 wd 0.0500 time 0.2399 (0.2440) data time 0.0013 (0.0018) model time 0.2386 (0.2427) loss 3.1548 (3.5177) grad_norm 1.7845 (1.9733) loss_scale 8192.0000 (6312.2933) mem 7379MB [2024-08-26 07:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][800/1251] eta 0:01:50 lr 0.000942 wd 0.0500 time 0.2379 (0.2440) data time 0.0009 (0.0018) model time 0.2369 (0.2426) loss 3.9301 (3.5184) grad_norm 1.7639 (1.9716) loss_scale 8192.0000 (6335.7603) mem 7379MB [2024-08-26 07:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][810/1251] eta 0:01:47 lr 0.000942 wd 0.0500 time 0.2567 (0.2440) data time 0.0009 (0.0018) model time 0.2558 (0.2426) loss 3.1381 (3.5162) grad_norm 1.8236 (1.9677) loss_scale 8192.0000 (6358.6486) mem 7379MB [2024-08-26 07:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][820/1251] eta 0:01:45 lr 0.000942 wd 0.0500 time 0.2416 (0.2439) data time 0.0012 (0.0018) model time 0.2404 (0.2426) loss 3.2811 (3.5194) grad_norm 1.6585 (1.9651) loss_scale 8192.0000 (6380.9793) mem 7379MB [2024-08-26 07:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][830/1251] eta 0:01:42 lr 0.000942 wd 0.0500 time 0.2384 (0.2439) data time 0.0007 (0.0018) 
model time 0.2376 (0.2425) loss 2.7602 (3.5181) grad_norm 3.2175 (1.9694) loss_scale 8192.0000 (6402.7726) mem 7379MB [2024-08-26 07:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][840/1251] eta 0:01:40 lr 0.000942 wd 0.0500 time 0.2358 (0.2439) data time 0.0011 (0.0017) model time 0.2347 (0.2425) loss 4.0216 (3.5209) grad_norm 1.8444 (1.9686) loss_scale 8192.0000 (6424.0476) mem 7379MB [2024-08-26 07:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][850/1251] eta 0:01:37 lr 0.000942 wd 0.0500 time 0.2393 (0.2438) data time 0.0010 (0.0017) model time 0.2383 (0.2425) loss 2.3458 (3.5205) grad_norm 1.5877 (1.9655) loss_scale 8192.0000 (6444.8226) mem 7379MB [2024-08-26 07:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][860/1251] eta 0:01:35 lr 0.000942 wd 0.0500 time 0.2382 (0.2438) data time 0.0008 (0.0017) model time 0.2375 (0.2424) loss 2.8611 (3.5221) grad_norm 1.4311 (1.9629) loss_scale 8192.0000 (6465.1150) mem 7379MB [2024-08-26 07:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][870/1251] eta 0:01:32 lr 0.000942 wd 0.0500 time 0.2493 (0.2440) data time 0.0009 (0.0017) model time 0.2484 (0.2426) loss 3.9032 (3.5229) grad_norm 1.7628 (1.9594) loss_scale 8192.0000 (6484.9414) mem 7379MB [2024-08-26 07:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][880/1251] eta 0:01:30 lr 0.000942 wd 0.0500 time 0.2354 (0.2439) data time 0.0010 (0.0017) model time 0.2344 (0.2426) loss 3.7808 (3.5235) grad_norm 2.3566 (1.9593) loss_scale 8192.0000 (6504.3178) mem 7379MB [2024-08-26 07:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][890/1251] eta 0:01:28 lr 0.000942 wd 0.0500 time 0.2434 (0.2440) data time 0.0010 (0.0017) model time 0.2424 (0.2426) loss 3.6558 (3.5262) grad_norm 3.2780 (1.9644) loss_scale 8192.0000 (6523.2593) mem 7379MB [2024-08-26 07:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[63/300][900/1251] eta 0:01:25 lr 0.000942 wd 0.0500 time 0.2553 (0.2439) data time 0.0011 (0.0017) model time 0.2542 (0.2426) loss 3.4275 (3.5245) grad_norm 3.7067 (1.9714) loss_scale 8192.0000 (6541.7802) mem 7379MB [2024-08-26 07:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][910/1251] eta 0:01:23 lr 0.000942 wd 0.0500 time 0.2448 (0.2439) data time 0.0010 (0.0017) model time 0.2439 (0.2426) loss 2.5839 (3.5258) grad_norm 2.0156 (1.9700) loss_scale 8192.0000 (6559.8946) mem 7379MB [2024-08-26 07:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][920/1251] eta 0:01:20 lr 0.000942 wd 0.0500 time 0.2330 (0.2439) data time 0.0007 (0.0017) model time 0.2323 (0.2426) loss 4.0744 (3.5245) grad_norm 1.4380 (1.9668) loss_scale 8192.0000 (6577.6156) mem 7379MB [2024-08-26 07:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][930/1251] eta 0:01:18 lr 0.000942 wd 0.0500 time 0.2428 (0.2439) data time 0.0010 (0.0017) model time 0.2418 (0.2425) loss 4.0015 (3.5257) grad_norm 1.4395 (1.9644) loss_scale 8192.0000 (6594.9560) mem 7379MB [2024-08-26 07:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][940/1251] eta 0:01:15 lr 0.000942 wd 0.0500 time 0.2407 (0.2439) data time 0.0011 (0.0017) model time 0.2396 (0.2425) loss 3.6487 (3.5272) grad_norm 2.1163 (1.9620) loss_scale 8192.0000 (6611.9277) mem 7379MB [2024-08-26 07:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][950/1251] eta 0:01:13 lr 0.000942 wd 0.0500 time 0.2394 (0.2438) data time 0.0014 (0.0017) model time 0.2379 (0.2425) loss 4.1757 (3.5266) grad_norm 1.7627 (1.9600) loss_scale 8192.0000 (6628.5426) mem 7379MB [2024-08-26 07:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][960/1251] eta 0:01:11 lr 0.000942 wd 0.0500 time 0.4160 (0.2440) data time 0.0010 (0.0017) model time 0.4151 (0.2427) loss 3.7578 (3.5281) grad_norm 1.9913 (1.9624) loss_scale 8192.0000 
(6644.8117) mem 7379MB [2024-08-26 07:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][970/1251] eta 0:01:08 lr 0.000941 wd 0.0500 time 0.2384 (0.2440) data time 0.0009 (0.0016) model time 0.2375 (0.2426) loss 3.6913 (3.5297) grad_norm 1.6361 (1.9616) loss_scale 8192.0000 (6660.7456) mem 7379MB [2024-08-26 07:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][980/1251] eta 0:01:06 lr 0.000941 wd 0.0500 time 0.2445 (0.2440) data time 0.0009 (0.0016) model time 0.2436 (0.2426) loss 3.1149 (3.5275) grad_norm 2.2642 (1.9601) loss_scale 8192.0000 (6676.3547) mem 7379MB [2024-08-26 07:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][990/1251] eta 0:01:03 lr 0.000941 wd 0.0500 time 0.2387 (0.2439) data time 0.0008 (0.0016) model time 0.2379 (0.2426) loss 3.1742 (3.5281) grad_norm 1.6349 (1.9596) loss_scale 8192.0000 (6691.6488) mem 7379MB [2024-08-26 07:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1000/1251] eta 0:01:01 lr 0.000941 wd 0.0500 time 0.2397 (0.2439) data time 0.0010 (0.0016) model time 0.2387 (0.2426) loss 2.7271 (3.5307) grad_norm 1.4775 (1.9609) loss_scale 8192.0000 (6706.6374) mem 7379MB [2024-08-26 07:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1010/1251] eta 0:00:58 lr 0.000941 wd 0.0500 time 0.2357 (0.2438) data time 0.0010 (0.0016) model time 0.2347 (0.2425) loss 3.7146 (3.5303) grad_norm 2.0653 (1.9631) loss_scale 8192.0000 (6721.3294) mem 7379MB [2024-08-26 07:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1020/1251] eta 0:00:56 lr 0.000941 wd 0.0500 time 0.2371 (0.2438) data time 0.0009 (0.0016) model time 0.2362 (0.2425) loss 4.0182 (3.5290) grad_norm 1.6330 (1.9612) loss_scale 8192.0000 (6735.7336) mem 7379MB [2024-08-26 07:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1030/1251] eta 0:00:53 lr 0.000941 wd 0.0500 time 0.2430 (0.2437) data time 0.0009 
(0.0016) model time 0.2421 (0.2424) loss 2.6753 (3.5283) grad_norm 2.0435 (1.9624) loss_scale 8192.0000 (6749.8584) mem 7379MB [2024-08-26 07:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1040/1251] eta 0:00:51 lr 0.000941 wd 0.0500 time 0.4579 (0.2441) data time 0.0010 (0.0016) model time 0.4569 (0.2428) loss 3.9428 (3.5300) grad_norm 2.1413 (1.9622) loss_scale 8192.0000 (6763.7118) mem 7379MB [2024-08-26 07:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1050/1251] eta 0:00:49 lr 0.000941 wd 0.0500 time 0.2459 (0.2441) data time 0.0010 (0.0016) model time 0.2449 (0.2429) loss 3.8326 (3.5317) grad_norm 1.7779 (1.9612) loss_scale 8192.0000 (6777.3016) mem 7379MB [2024-08-26 07:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1060/1251] eta 0:00:46 lr 0.000941 wd 0.0500 time 0.2439 (0.2441) data time 0.0008 (0.0016) model time 0.2432 (0.2428) loss 3.5020 (3.5296) grad_norm 1.6864 (1.9605) loss_scale 8192.0000 (6790.6352) mem 7379MB [2024-08-26 07:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1070/1251] eta 0:00:44 lr 0.000941 wd 0.0500 time 0.2410 (0.2441) data time 0.0011 (0.0016) model time 0.2399 (0.2428) loss 4.3672 (3.5310) grad_norm 2.0154 (1.9576) loss_scale 8192.0000 (6803.7199) mem 7379MB [2024-08-26 07:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1080/1251] eta 0:00:41 lr 0.000941 wd 0.0500 time 0.2448 (0.2441) data time 0.0008 (0.0016) model time 0.2440 (0.2428) loss 2.7807 (3.5326) grad_norm 3.2166 (1.9601) loss_scale 8192.0000 (6816.5624) mem 7379MB [2024-08-26 07:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1090/1251] eta 0:00:39 lr 0.000941 wd 0.0500 time 0.2471 (0.2441) data time 0.0010 (0.0016) model time 0.2461 (0.2428) loss 3.1557 (3.5305) grad_norm 1.5016 (1.9595) loss_scale 8192.0000 (6829.1696) mem 7379MB [2024-08-26 07:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [63/300][1100/1251] eta 0:00:36 lr 0.000941 wd 0.0500 time 0.2415 (0.2441) data time 0.0010 (0.0016) model time 0.2405 (0.2428) loss 3.9350 (3.5322) grad_norm 2.5310 (1.9610) loss_scale 8192.0000 (6841.5477) mem 7379MB [2024-08-26 07:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1110/1251] eta 0:00:34 lr 0.000941 wd 0.0500 time 0.2402 (0.2440) data time 0.0011 (0.0016) model time 0.2391 (0.2428) loss 3.4014 (3.5317) grad_norm 1.7728 (1.9598) loss_scale 8192.0000 (6853.7030) mem 7379MB [2024-08-26 07:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1120/1251] eta 0:00:31 lr 0.000941 wd 0.0500 time 0.2413 (0.2440) data time 0.0010 (0.0016) model time 0.2403 (0.2427) loss 3.6269 (3.5328) grad_norm 1.6591 (1.9573) loss_scale 8192.0000 (6865.6414) mem 7379MB [2024-08-26 07:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1130/1251] eta 0:00:29 lr 0.000941 wd 0.0500 time 0.2390 (0.2441) data time 0.0013 (0.0016) model time 0.2378 (0.2428) loss 3.0793 (3.5325) grad_norm 1.6138 (1.9556) loss_scale 8192.0000 (6877.3687) mem 7379MB [2024-08-26 07:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1140/1251] eta 0:00:27 lr 0.000941 wd 0.0500 time 0.2510 (0.2441) data time 0.0011 (0.0016) model time 0.2499 (0.2428) loss 3.6418 (3.5338) grad_norm 1.6219 (1.9547) loss_scale 8192.0000 (6888.8904) mem 7379MB [2024-08-26 07:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1150/1251] eta 0:00:24 lr 0.000941 wd 0.0500 time 0.2404 (0.2441) data time 0.0008 (0.0015) model time 0.2396 (0.2428) loss 4.2667 (3.5333) grad_norm 2.4938 (1.9560) loss_scale 8192.0000 (6900.2120) mem 7379MB [2024-08-26 07:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1160/1251] eta 0:00:22 lr 0.000941 wd 0.0500 time 0.2371 (0.2441) data time 0.0012 (0.0015) model time 0.2359 (0.2428) loss 2.5616 (3.5310) grad_norm 1.8626 (1.9560) loss_scale 
8192.0000 (6911.3385) mem 7379MB [2024-08-26 07:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1170/1251] eta 0:00:19 lr 0.000941 wd 0.0500 time 0.2509 (0.2441) data time 0.0010 (0.0015) model time 0.2500 (0.2428) loss 3.4284 (3.5285) grad_norm 1.6097 (1.9554) loss_scale 8192.0000 (6922.2750) mem 7379MB [2024-08-26 07:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1180/1251] eta 0:00:17 lr 0.000941 wd 0.0500 time 0.2396 (0.2442) data time 0.0008 (0.0015) model time 0.2388 (0.2430) loss 4.1558 (3.5296) grad_norm 1.5532 (1.9536) loss_scale 8192.0000 (6933.0262) mem 7379MB [2024-08-26 07:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1190/1251] eta 0:00:14 lr 0.000941 wd 0.0500 time 0.2466 (0.2443) data time 0.0011 (0.0015) model time 0.2455 (0.2430) loss 3.5517 (3.5293) grad_norm 2.7268 (1.9567) loss_scale 8192.0000 (6943.5970) mem 7379MB [2024-08-26 07:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1200/1251] eta 0:00:12 lr 0.000941 wd 0.0500 time 0.2322 (0.2442) data time 0.0009 (0.0015) model time 0.2313 (0.2430) loss 3.8498 (3.5283) grad_norm 2.0720 (1.9586) loss_scale 8192.0000 (6953.9917) mem 7379MB [2024-08-26 07:43:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1210/1251] eta 0:00:10 lr 0.000941 wd 0.0500 time 0.2455 (0.2442) data time 0.0008 (0.0015) model time 0.2447 (0.2430) loss 4.1651 (3.5305) grad_norm 2.0119 (1.9575) loss_scale 8192.0000 (6964.2147) mem 7379MB [2024-08-26 07:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1220/1251] eta 0:00:07 lr 0.000941 wd 0.0500 time 0.2311 (0.2442) data time 0.0009 (0.0015) model time 0.2302 (0.2430) loss 2.6295 (3.5309) grad_norm 1.7003 (1.9565) loss_scale 8192.0000 (6974.2703) mem 7379MB [2024-08-26 07:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1230/1251] eta 0:00:05 lr 0.000941 wd 0.0500 time 0.2409 (0.2442) data time 
0.0014 (0.0015) model time 0.2396 (0.2429) loss 2.3848 (3.5289) grad_norm 2.1490 (1.9548) loss_scale 8192.0000 (6984.1625) mem 7379MB [2024-08-26 07:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1240/1251] eta 0:00:02 lr 0.000941 wd 0.0500 time 0.2255 (0.2441) data time 0.0005 (0.0015) model time 0.2250 (0.2428) loss 4.1621 (3.5290) grad_norm 1.3727 (1.9529) loss_scale 8192.0000 (6993.8952) mem 7379MB [2024-08-26 07:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [63/300][1250/1251] eta 0:00:00 lr 0.000941 wd 0.0500 time 0.2284 (0.2439) data time 0.0007 (0.0015) model time 0.2276 (0.2427) loss 3.7374 (3.5278) grad_norm 1.7859 (1.9515) loss_scale 8192.0000 (7003.4724) mem 7379MB [2024-08-26 07:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 63 training takes 0:05:05 [2024-08-26 07:43:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:43:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 07:43:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.488 (0.488) Loss 0.5073 (0.5073) Acc@1 89.941 (89.941) Acc@5 97.559 (97.559) Mem 7379MB
[2024-08-26 07:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.117) Loss 0.8799 (0.8352) Acc@1 80.566 (81.179) Acc@5 96.094 (95.872) Mem 7379MB
[2024-08-26 07:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.099) Loss 1.1768 (0.8571) Acc@1 71.582 (80.101) Acc@5 91.504 (95.764) Mem 7379MB
[2024-08-26 07:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.092) Loss 1.5000 (0.9888) Acc@1 65.039 (77.303) Acc@5 86.914 (94.068) Mem 7379MB
[2024-08-26 07:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.3584 (1.0504) Acc@1 68.262 (75.838) Acc@5 89.062 (93.295) Mem 7379MB
[2024-08-26 07:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.528 Acc@5 93.236
[2024-08-26 07:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.5%
[2024-08-26 07:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 75.53%
[2024-08-26 07:43:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 07:43:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 07:43:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.433 (0.433) Loss 0.4678 (0.4678) Acc@1 90.918 (90.918) Acc@5 98.145 (98.145) Mem 7379MB
[2024-08-26 07:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.110) Loss 0.7705 (0.7436) Acc@1 84.277 (83.736) Acc@5 95.996 (96.626) Mem 7379MB
[2024-08-26 07:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.095) Loss 1.0723 (0.7634) Acc@1 74.316 (82.710) Acc@5 92.969 (96.619) Mem 7379MB
[2024-08-26 07:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 1.3301 (0.8745) Acc@1 65.527 (80.081) Acc@5 89.453 (95.186) Mem 7379MB
[2024-08-26 07:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.2441 (0.9351) Acc@1 69.727 (78.537) Acc@5 91.016 (94.510) Mem 7379MB
[2024-08-26 07:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.150 Acc@5 94.436
[2024-08-26 07:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.2%
[2024-08-26 07:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.15%
[2024-08-26 07:43:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 07:43:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 07:43:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][0/1251] eta 0:15:16 lr 0.000941 wd 0.0500 time 0.7329 (0.7329) data time 0.5093 (0.5093) model time 0.0000 (0.0000) loss 3.7774 (3.7774) grad_norm 1.7847 (1.7847) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][10/1251] eta 0:05:54 lr 0.000941 wd 0.0500 time 0.2450 (0.2854) data time 0.0011 (0.0472) model time 0.0000 (0.0000) loss 4.0767 (3.6407) grad_norm 3.9421 (2.4761) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][20/1251] eta 0:05:25 lr 0.000941 wd 0.0500 time 0.2399 (0.2646) data time 0.0009 (0.0252) model time 0.0000 (0.0000) loss 4.4062 (3.6494) grad_norm 1.5306 (2.3677) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][30/1251] eta 0:05:14 lr 0.000941 wd 0.0500 time 0.2441 (0.2574) data time 0.0007 (0.0175) model time 0.0000 (0.0000) loss 4.0092 (3.5816) grad_norm 3.5750 (2.4439) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][40/1251] eta 0:05:08 lr 0.000941 wd 0.0500 time 0.2497 (0.2547) data time 0.0009 (0.0135) model time 0.0000 (0.0000) loss 2.8520 (3.5202) grad_norm 1.7830 (2.3155) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:43:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][50/1251] eta 0:05:12 lr 0.000941 wd 0.0500 time 0.2473 (0.2605) data time 0.0008 (0.0110) model time 0.0000 (0.0000) loss 3.6291 (3.5396) grad_norm 1.5408 (2.2552) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][60/1251] eta 0:05:06 lr 0.000941 wd 0.0500 time 0.2470 (0.2574) data time 0.0011 (0.0094) model time 0.2459 (0.2409) loss 
2.5038 (3.5041) grad_norm 1.4450 (2.1981) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][70/1251] eta 0:05:01 lr 0.000941 wd 0.0500 time 0.2338 (0.2551) data time 0.0008 (0.0082) model time 0.2330 (0.2405) loss 2.3327 (3.5254) grad_norm 1.8150 (2.1388) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][80/1251] eta 0:04:56 lr 0.000941 wd 0.0500 time 0.2477 (0.2532) data time 0.0007 (0.0073) model time 0.2469 (0.2399) loss 2.8673 (3.5530) grad_norm 1.4470 (2.1182) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][90/1251] eta 0:04:52 lr 0.000941 wd 0.0500 time 0.2397 (0.2519) data time 0.0010 (0.0066) model time 0.2386 (0.2399) loss 2.7733 (3.5050) grad_norm 1.5678 (2.0749) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][100/1251] eta 0:04:48 lr 0.000941 wd 0.0500 time 0.2403 (0.2506) data time 0.0007 (0.0061) model time 0.2396 (0.2396) loss 3.6689 (3.5148) grad_norm 1.7278 (2.0514) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][110/1251] eta 0:04:45 lr 0.000941 wd 0.0500 time 0.2356 (0.2499) data time 0.0011 (0.0057) model time 0.2345 (0.2398) loss 3.5437 (3.4969) grad_norm 1.4177 (2.0566) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][120/1251] eta 0:04:42 lr 0.000941 wd 0.0500 time 0.2562 (0.2495) data time 0.0008 (0.0053) model time 0.2555 (0.2403) loss 4.3548 (3.5029) grad_norm 1.8397 (2.0343) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][130/1251] eta 0:04:39 lr 
0.000941 wd 0.0500 time 0.2402 (0.2489) data time 0.0009 (0.0050) model time 0.2393 (0.2403) loss 2.9626 (3.5200) grad_norm 1.3367 (2.0169) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][140/1251] eta 0:04:36 lr 0.000941 wd 0.0500 time 0.2422 (0.2484) data time 0.0013 (0.0047) model time 0.2409 (0.2405) loss 3.7468 (3.5133) grad_norm 2.7031 (2.0167) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][150/1251] eta 0:04:34 lr 0.000941 wd 0.0500 time 0.2400 (0.2491) data time 0.0007 (0.0045) model time 0.2393 (0.2422) loss 2.7251 (3.5053) grad_norm 2.5022 (2.0055) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][160/1251] eta 0:04:31 lr 0.000941 wd 0.0500 time 0.2407 (0.2488) data time 0.0010 (0.0043) model time 0.2396 (0.2423) loss 3.5422 (3.5154) grad_norm 1.5793 (1.9918) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][170/1251] eta 0:04:28 lr 0.000941 wd 0.0500 time 0.2450 (0.2485) data time 0.0009 (0.0041) model time 0.2441 (0.2423) loss 3.3344 (3.4928) grad_norm 1.8851 (1.9743) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][180/1251] eta 0:04:25 lr 0.000941 wd 0.0500 time 0.2445 (0.2482) data time 0.0009 (0.0039) model time 0.2436 (0.2423) loss 3.4087 (3.4998) grad_norm 3.7827 (1.9793) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][190/1251] eta 0:04:22 lr 0.000940 wd 0.0500 time 0.2388 (0.2478) data time 0.0009 (0.0037) model time 0.2379 (0.2421) loss 4.1720 (3.5030) grad_norm 1.9101 (2.0049) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][200/1251] eta 0:04:20 lr 0.000940 wd 0.0500 time 0.2442 (0.2477) data time 0.0016 (0.0036) model time 0.2426 (0.2423) loss 3.2634 (3.5005) grad_norm 1.7897 (1.9935) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][210/1251] eta 0:04:18 lr 0.000940 wd 0.0500 time 0.2418 (0.2484) data time 0.0009 (0.0035) model time 0.2408 (0.2435) loss 4.0358 (3.5162) grad_norm 1.2765 (1.9936) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][220/1251] eta 0:04:15 lr 0.000940 wd 0.0500 time 0.2475 (0.2481) data time 0.0009 (0.0034) model time 0.2466 (0.2433) loss 4.0662 (3.5289) grad_norm 1.8091 (1.9916) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][230/1251] eta 0:04:13 lr 0.000940 wd 0.0500 time 0.2362 (0.2480) data time 0.0008 (0.0033) model time 0.2355 (0.2434) loss 3.2937 (3.5230) grad_norm 1.5693 (1.9906) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][240/1251] eta 0:04:10 lr 0.000940 wd 0.0500 time 0.2342 (0.2477) data time 0.0009 (0.0032) model time 0.2332 (0.2432) loss 3.6307 (3.5334) grad_norm 2.0049 (1.9959) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][250/1251] eta 0:04:07 lr 0.000940 wd 0.0500 time 0.2422 (0.2475) data time 0.0007 (0.0031) model time 0.2416 (0.2431) loss 4.1261 (3.5392) grad_norm 2.2260 (1.9855) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][260/1251] eta 0:04:05 lr 0.000940 wd 0.0500 time 0.2394 (0.2474) data time 0.0007 (0.0030) model time 0.2387 (0.2431) loss 4.0399 
(3.5362) grad_norm 1.5408 (1.9719) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][270/1251] eta 0:04:02 lr 0.000940 wd 0.0500 time 0.2408 (0.2473) data time 0.0012 (0.0029) model time 0.2396 (0.2432) loss 3.4068 (3.5315) grad_norm 2.1336 (1.9691) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][280/1251] eta 0:03:59 lr 0.000940 wd 0.0500 time 0.2368 (0.2471) data time 0.0007 (0.0029) model time 0.2361 (0.2431) loss 3.0474 (3.5349) grad_norm 1.8105 (1.9701) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][290/1251] eta 0:03:57 lr 0.000940 wd 0.0500 time 0.2406 (0.2469) data time 0.0007 (0.0028) model time 0.2398 (0.2430) loss 3.2976 (3.5256) grad_norm 1.6073 (1.9647) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][300/1251] eta 0:03:54 lr 0.000940 wd 0.0500 time 0.2389 (0.2467) data time 0.0009 (0.0028) model time 0.2379 (0.2429) loss 3.9695 (3.5269) grad_norm 3.9129 (1.9703) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][310/1251] eta 0:03:52 lr 0.000940 wd 0.0500 time 0.2384 (0.2472) data time 0.0008 (0.0027) model time 0.2377 (0.2435) loss 3.4244 (3.5339) grad_norm 1.6606 (1.9612) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][320/1251] eta 0:03:49 lr 0.000940 wd 0.0500 time 0.2482 (0.2470) data time 0.0007 (0.0026) model time 0.2475 (0.2434) loss 3.8168 (3.5367) grad_norm 3.1521 (1.9666) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][330/1251] eta 0:03:47 lr 0.000940 wd 
0.0500 time 0.2531 (0.2469) data time 0.0009 (0.0026) model time 0.2523 (0.2434) loss 3.4021 (3.5386) grad_norm 3.1067 (1.9758) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][340/1251] eta 0:03:44 lr 0.000940 wd 0.0500 time 0.2392 (0.2469) data time 0.0010 (0.0025) model time 0.2383 (0.2434) loss 2.6673 (3.5317) grad_norm 1.3504 (1.9760) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][350/1251] eta 0:03:42 lr 0.000940 wd 0.0500 time 0.2447 (0.2467) data time 0.0009 (0.0025) model time 0.2439 (0.2433) loss 4.0352 (3.5330) grad_norm 1.5476 (1.9698) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][360/1251] eta 0:03:39 lr 0.000940 wd 0.0500 time 0.2386 (0.2466) data time 0.0010 (0.0025) model time 0.2376 (0.2432) loss 3.7348 (3.5258) grad_norm 1.4899 (1.9635) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][370/1251] eta 0:03:37 lr 0.000940 wd 0.0500 time 0.2464 (0.2465) data time 0.0007 (0.0024) model time 0.2456 (0.2432) loss 2.9806 (3.5302) grad_norm 2.0803 (1.9646) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][380/1251] eta 0:03:34 lr 0.000940 wd 0.0500 time 0.2356 (0.2462) data time 0.0010 (0.0024) model time 0.2346 (0.2430) loss 3.4109 (3.5307) grad_norm 1.5581 (1.9711) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][390/1251] eta 0:03:31 lr 0.000940 wd 0.0500 time 0.2377 (0.2461) data time 0.0008 (0.0023) model time 0.2370 (0.2429) loss 4.1392 (3.5237) grad_norm 1.5899 (1.9631) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][400/1251] eta 0:03:29 lr 0.000940 wd 0.0500 time 0.2353 (0.2463) data time 0.0009 (0.0023) model time 0.2345 (0.2432) loss 3.8887 (3.5263) grad_norm 1.5792 (1.9594) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][410/1251] eta 0:03:27 lr 0.000940 wd 0.0500 time 0.2614 (0.2463) data time 0.0008 (0.0023) model time 0.2606 (0.2433) loss 2.2860 (3.5195) grad_norm 1.7035 (1.9584) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][420/1251] eta 0:03:24 lr 0.000940 wd 0.0500 time 0.2459 (0.2462) data time 0.0011 (0.0023) model time 0.2448 (0.2432) loss 2.5581 (3.5142) grad_norm 1.6142 (1.9621) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][430/1251] eta 0:03:22 lr 0.000940 wd 0.0500 time 0.2367 (0.2462) data time 0.0009 (0.0022) model time 0.2358 (0.2432) loss 4.2231 (3.5177) grad_norm 1.9172 (1.9717) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][440/1251] eta 0:03:19 lr 0.000940 wd 0.0500 time 0.2347 (0.2461) data time 0.0009 (0.0022) model time 0.2338 (0.2431) loss 3.3999 (3.5150) grad_norm 2.0929 (1.9673) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][450/1251] eta 0:03:17 lr 0.000940 wd 0.0500 time 0.2440 (0.2460) data time 0.0008 (0.0022) model time 0.2433 (0.2431) loss 4.1306 (3.5183) grad_norm 2.2457 (1.9684) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][460/1251] eta 0:03:14 lr 0.000940 wd 0.0500 time 0.2412 (0.2459) data time 0.0012 (0.0021) model time 0.2400 (0.2430) loss 3.3848 
(3.5204) grad_norm 1.3977 (1.9718) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][470/1251] eta 0:03:12 lr 0.000940 wd 0.0500 time 0.2360 (0.2462) data time 0.0008 (0.0021) model time 0.2352 (0.2435) loss 2.3567 (3.5187) grad_norm 1.5690 (1.9731) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][480/1251] eta 0:03:09 lr 0.000940 wd 0.0500 time 0.2440 (0.2462) data time 0.0009 (0.0021) model time 0.2431 (0.2435) loss 3.8622 (3.5240) grad_norm 1.7701 (1.9727) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][490/1251] eta 0:03:07 lr 0.000940 wd 0.0500 time 0.2423 (0.2462) data time 0.0008 (0.0021) model time 0.2415 (0.2435) loss 3.6883 (3.5191) grad_norm 2.1449 (1.9806) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][500/1251] eta 0:03:04 lr 0.000940 wd 0.0500 time 0.2430 (0.2461) data time 0.0007 (0.0021) model time 0.2422 (0.2435) loss 4.4202 (3.5230) grad_norm 1.6264 (1.9806) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][510/1251] eta 0:03:02 lr 0.000940 wd 0.0500 time 0.2379 (0.2461) data time 0.0011 (0.0020) model time 0.2368 (0.2434) loss 2.8291 (3.5303) grad_norm 1.5532 (1.9766) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][520/1251] eta 0:02:59 lr 0.000940 wd 0.0500 time 0.2432 (0.2460) data time 0.0010 (0.0020) model time 0.2423 (0.2434) loss 3.8588 (3.5291) grad_norm 1.9261 (1.9734) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][530/1251] eta 0:02:57 lr 0.000940 wd 
0.0500 time 0.2515 (0.2460) data time 0.0009 (0.0020) model time 0.2505 (0.2434) loss 4.2907 (3.5294) grad_norm 1.5028 (1.9721) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][540/1251] eta 0:02:54 lr 0.000940 wd 0.0500 time 0.2436 (0.2459) data time 0.0008 (0.0020) model time 0.2429 (0.2434) loss 4.0573 (3.5333) grad_norm 2.4401 (1.9697) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][550/1251] eta 0:02:52 lr 0.000940 wd 0.0500 time 0.2547 (0.2459) data time 0.0010 (0.0020) model time 0.2538 (0.2433) loss 3.5650 (3.5318) grad_norm 1.9404 (1.9685) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:45:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][560/1251] eta 0:02:50 lr 0.000940 wd 0.0500 time 0.4549 (0.2462) data time 0.0008 (0.0019) model time 0.4541 (0.2437) loss 4.3419 (3.5355) grad_norm 2.5598 (1.9783) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][570/1251] eta 0:02:47 lr 0.000940 wd 0.0500 time 0.2381 (0.2465) data time 0.0010 (0.0019) model time 0.2371 (0.2441) loss 2.7021 (3.5338) grad_norm 1.7309 (1.9814) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][580/1251] eta 0:02:45 lr 0.000940 wd 0.0500 time 0.2401 (0.2464) data time 0.0010 (0.0019) model time 0.2391 (0.2440) loss 4.0827 (3.5413) grad_norm 2.4275 (1.9811) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][590/1251] eta 0:02:42 lr 0.000940 wd 0.0500 time 0.2416 (0.2463) data time 0.0007 (0.0019) model time 0.2408 (0.2439) loss 3.0683 (3.5412) grad_norm 1.8634 (1.9838) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:07 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][600/1251] eta 0:02:40 lr 0.000940 wd 0.0500 time 0.2442 (0.2462) data time 0.0008 (0.0019) model time 0.2434 (0.2438) loss 4.2926 (3.5472) grad_norm 1.8383 (1.9834) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][610/1251] eta 0:02:37 lr 0.000940 wd 0.0500 time 0.2297 (0.2461) data time 0.0009 (0.0019) model time 0.2288 (0.2437) loss 3.2329 (3.5419) grad_norm 1.6054 (1.9880) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][620/1251] eta 0:02:35 lr 0.000940 wd 0.0500 time 0.2378 (0.2460) data time 0.0010 (0.0019) model time 0.2369 (0.2437) loss 3.8685 (3.5460) grad_norm 1.5430 (1.9894) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][630/1251] eta 0:02:32 lr 0.000940 wd 0.0500 time 0.2479 (0.2459) data time 0.0010 (0.0018) model time 0.2470 (0.2436) loss 3.4692 (3.5464) grad_norm 2.1561 (1.9858) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][640/1251] eta 0:02:30 lr 0.000940 wd 0.0500 time 0.2416 (0.2459) data time 0.0011 (0.0018) model time 0.2405 (0.2436) loss 3.5365 (3.5470) grad_norm 1.4173 (1.9806) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][650/1251] eta 0:02:27 lr 0.000940 wd 0.0500 time 0.2446 (0.2458) data time 0.0011 (0.0018) model time 0.2435 (0.2435) loss 3.2644 (3.5522) grad_norm 1.4059 (1.9771) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][660/1251] eta 0:02:25 lr 0.000939 wd 0.0500 time 0.2312 (0.2458) data time 0.0014 (0.0018) model time 0.2298 (0.2435) loss 4.1414 
(3.5552) grad_norm 2.1301 (1.9728) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][670/1251] eta 0:02:22 lr 0.000939 wd 0.0500 time 0.2197 (0.2460) data time 0.0011 (0.0018) model time 0.2186 (0.2438) loss 2.5801 (3.5550) grad_norm 1.5290 (1.9761) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][680/1251] eta 0:02:20 lr 0.000939 wd 0.0500 time 0.2471 (0.2460) data time 0.0010 (0.0018) model time 0.2462 (0.2438) loss 3.1199 (3.5550) grad_norm 3.1902 (1.9799) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][690/1251] eta 0:02:17 lr 0.000939 wd 0.0500 time 0.2408 (0.2459) data time 0.0008 (0.0018) model time 0.2401 (0.2437) loss 3.5130 (3.5565) grad_norm 1.3640 (1.9813) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][700/1251] eta 0:02:15 lr 0.000939 wd 0.0500 time 0.2455 (0.2459) data time 0.0011 (0.0018) model time 0.2444 (0.2437) loss 4.4082 (3.5541) grad_norm 1.6883 (1.9802) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][710/1251] eta 0:02:13 lr 0.000939 wd 0.0500 time 0.2473 (0.2459) data time 0.0011 (0.0018) model time 0.2462 (0.2437) loss 3.5085 (3.5550) grad_norm 1.8392 (1.9801) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][720/1251] eta 0:02:10 lr 0.000939 wd 0.0500 time 0.2409 (0.2459) data time 0.0010 (0.0017) model time 0.2398 (0.2437) loss 3.8947 (3.5508) grad_norm 1.9522 (1.9799) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][730/1251] eta 0:02:08 lr 0.000939 wd 
0.0500 time 0.2348 (0.2458) data time 0.0008 (0.0017) model time 0.2340 (0.2437) loss 2.1183 (3.5480) grad_norm 1.3608 (1.9773) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][740/1251] eta 0:02:05 lr 0.000939 wd 0.0500 time 0.2311 (0.2458) data time 0.0009 (0.0017) model time 0.2302 (0.2436) loss 3.8518 (3.5485) grad_norm 2.0433 (1.9787) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][750/1251] eta 0:02:03 lr 0.000939 wd 0.0500 time 0.2388 (0.2457) data time 0.0008 (0.0017) model time 0.2380 (0.2436) loss 3.6475 (3.5506) grad_norm 2.2209 (1.9784) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][760/1251] eta 0:02:00 lr 0.000939 wd 0.0500 time 0.2383 (0.2457) data time 0.0007 (0.0017) model time 0.2376 (0.2436) loss 3.6324 (3.5506) grad_norm 1.7388 (1.9762) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][770/1251] eta 0:01:58 lr 0.000939 wd 0.0500 time 0.2402 (0.2457) data time 0.0011 (0.0017) model time 0.2391 (0.2436) loss 3.5222 (3.5489) grad_norm 1.8444 (1.9749) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][780/1251] eta 0:01:55 lr 0.000939 wd 0.0500 time 0.2421 (0.2456) data time 0.0010 (0.0017) model time 0.2411 (0.2435) loss 3.8589 (3.5513) grad_norm 2.1579 (1.9759) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][790/1251] eta 0:01:53 lr 0.000939 wd 0.0500 time 0.2369 (0.2455) data time 0.0011 (0.0017) model time 0.2358 (0.2435) loss 3.8327 (3.5495) grad_norm 1.9760 (1.9775) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:56 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][800/1251] eta 0:01:50 lr 0.000939 wd 0.0500 time 0.2401 (0.2455) data time 0.0007 (0.0017) model time 0.2394 (0.2434) loss 3.4072 (3.5442) grad_norm 1.7191 (1.9787) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][810/1251] eta 0:01:48 lr 0.000939 wd 0.0500 time 0.2402 (0.2454) data time 0.0007 (0.0017) model time 0.2395 (0.2434) loss 2.9615 (3.5456) grad_norm 2.0535 (1.9787) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][820/1251] eta 0:01:45 lr 0.000939 wd 0.0500 time 0.2506 (0.2454) data time 0.0011 (0.0017) model time 0.2495 (0.2433) loss 2.8380 (3.5438) grad_norm 2.8204 (1.9785) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][830/1251] eta 0:01:43 lr 0.000939 wd 0.0500 time 0.2422 (0.2453) data time 0.0007 (0.0016) model time 0.2414 (0.2433) loss 2.6558 (3.5418) grad_norm 1.6643 (1.9762) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][840/1251] eta 0:01:40 lr 0.000939 wd 0.0500 time 0.2422 (0.2452) data time 0.0009 (0.0016) model time 0.2413 (0.2432) loss 3.6417 (3.5412) grad_norm 1.6532 (1.9773) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][850/1251] eta 0:01:38 lr 0.000939 wd 0.0500 time 0.2360 (0.2451) data time 0.0011 (0.0016) model time 0.2349 (0.2431) loss 3.0785 (3.5381) grad_norm 1.6422 (1.9744) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][860/1251] eta 0:01:35 lr 0.000939 wd 0.0500 time 0.2421 (0.2451) data time 0.0007 (0.0016) model time 0.2414 (0.2431) loss 3.9404 
(3.5414) grad_norm 1.5617 (1.9753) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][870/1251] eta 0:01:33 lr 0.000939 wd 0.0500 time 0.2432 (0.2450) data time 0.0010 (0.0016) model time 0.2422 (0.2431) loss 3.9118 (3.5427) grad_norm 1.4506 (1.9772) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][880/1251] eta 0:01:30 lr 0.000939 wd 0.0500 time 0.2398 (0.2450) data time 0.0009 (0.0016) model time 0.2389 (0.2430) loss 3.0388 (3.5436) grad_norm 1.8190 (1.9764) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][890/1251] eta 0:01:28 lr 0.000939 wd 0.0500 time 0.2384 (0.2449) data time 0.0011 (0.0016) model time 0.2372 (0.2430) loss 3.7097 (3.5413) grad_norm 3.4808 (1.9796) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][900/1251] eta 0:01:25 lr 0.000939 wd 0.0500 time 0.2388 (0.2449) data time 0.0011 (0.0016) model time 0.2377 (0.2429) loss 3.7505 (3.5430) grad_norm 2.1444 (1.9821) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][910/1251] eta 0:01:23 lr 0.000939 wd 0.0500 time 0.2450 (0.2448) data time 0.0008 (0.0016) model time 0.2441 (0.2429) loss 3.9561 (3.5428) grad_norm 2.7268 (1.9846) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][920/1251] eta 0:01:21 lr 0.000939 wd 0.0500 time 0.2371 (0.2448) data time 0.0009 (0.0016) model time 0.2362 (0.2428) loss 3.0546 (3.5419) grad_norm 2.3224 (1.9874) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][930/1251] eta 0:01:18 lr 0.000939 wd 
0.0500 time 0.2366 (0.2450) data time 0.0011 (0.0016) model time 0.2355 (0.2430) loss 3.7288 (3.5433) grad_norm 1.8605 (1.9933) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][940/1251] eta 0:01:16 lr 0.000939 wd 0.0500 time 0.2320 (0.2449) data time 0.0008 (0.0016) model time 0.2311 (0.2430) loss 4.1693 (3.5462) grad_norm 1.4323 (1.9937) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][950/1251] eta 0:01:13 lr 0.000939 wd 0.0500 time 0.2411 (0.2449) data time 0.0011 (0.0016) model time 0.2400 (0.2430) loss 2.5082 (3.5440) grad_norm 2.3550 (1.9953) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][960/1251] eta 0:01:11 lr 0.000939 wd 0.0500 time 0.2318 (0.2449) data time 0.0009 (0.0016) model time 0.2310 (0.2430) loss 4.0737 (3.5439) grad_norm 1.6084 (1.9975) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][970/1251] eta 0:01:08 lr 0.000939 wd 0.0500 time 0.2454 (0.2448) data time 0.0007 (0.0016) model time 0.2447 (0.2429) loss 3.8897 (3.5415) grad_norm 1.9990 (1.9954) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][980/1251] eta 0:01:06 lr 0.000939 wd 0.0500 time 0.2374 (0.2452) data time 0.0011 (0.0016) model time 0.2364 (0.2433) loss 3.3391 (3.5434) grad_norm 1.5115 (1.9930) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][990/1251] eta 0:01:03 lr 0.000939 wd 0.0500 time 0.2447 (0.2452) data time 0.0011 (0.0015) model time 0.2436 (0.2433) loss 3.2245 (3.5427) grad_norm 2.5058 (1.9931) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1000/1251] eta 0:01:01 lr 0.000939 wd 0.0500 time 0.2421 (0.2451) data time 0.0009 (0.0015) model time 0.2412 (0.2433) loss 3.4941 (3.5419) grad_norm 2.0067 (1.9942) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1010/1251] eta 0:00:59 lr 0.000939 wd 0.0500 time 0.2439 (0.2451) data time 0.0008 (0.0015) model time 0.2431 (0.2432) loss 3.5641 (3.5415) grad_norm 2.2527 (1.9928) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1020/1251] eta 0:00:56 lr 0.000939 wd 0.0500 time 0.2418 (0.2450) data time 0.0011 (0.0015) model time 0.2407 (0.2432) loss 4.0076 (3.5397) grad_norm 3.0521 (1.9935) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1030/1251] eta 0:00:54 lr 0.000939 wd 0.0500 time 0.2362 (0.2450) data time 0.0008 (0.0015) model time 0.2354 (0.2431) loss 3.3618 (3.5396) grad_norm 1.6530 (1.9921) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1040/1251] eta 0:00:51 lr 0.000939 wd 0.0500 time 0.2389 (0.2449) data time 0.0009 (0.0015) model time 0.2380 (0.2431) loss 3.5776 (3.5406) grad_norm 1.5821 (1.9912) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1050/1251] eta 0:00:49 lr 0.000939 wd 0.0500 time 0.2497 (0.2449) data time 0.0011 (0.0015) model time 0.2486 (0.2431) loss 4.0305 (3.5407) grad_norm 1.8210 (1.9891) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1060/1251] eta 0:00:46 lr 0.000939 wd 0.0500 time 0.2428 (0.2449) data time 0.0010 (0.0015) model time 0.2418 (0.2431) loss 3.8360 
(3.5406) grad_norm 1.6854 (1.9882) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1070/1251] eta 0:00:44 lr 0.000939 wd 0.0500 time 0.2398 (0.2449) data time 0.0007 (0.0015) model time 0.2391 (0.2430) loss 3.7600 (3.5396) grad_norm 1.7672 (1.9921) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1080/1251] eta 0:00:41 lr 0.000939 wd 0.0500 time 0.2423 (0.2448) data time 0.0009 (0.0015) model time 0.2415 (0.2430) loss 4.1105 (3.5389) grad_norm 1.7662 (1.9920) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1090/1251] eta 0:00:39 lr 0.000939 wd 0.0500 time 0.2448 (0.2448) data time 0.0012 (0.0015) model time 0.2436 (0.2430) loss 3.8289 (3.5403) grad_norm 1.8006 (1.9901) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1100/1251] eta 0:00:36 lr 0.000939 wd 0.0500 time 0.2454 (0.2450) data time 0.0010 (0.0015) model time 0.2444 (0.2432) loss 3.2165 (3.5415) grad_norm 1.7699 (1.9896) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1110/1251] eta 0:00:34 lr 0.000939 wd 0.0500 time 0.2384 (0.2451) data time 0.0008 (0.0015) model time 0.2376 (0.2434) loss 4.2186 (3.5446) grad_norm 1.5146 (1.9875) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1120/1251] eta 0:00:32 lr 0.000939 wd 0.0500 time 0.2377 (0.2451) data time 0.0012 (0.0015) model time 0.2365 (0.2433) loss 3.0362 (3.5431) grad_norm 2.1065 (1.9883) loss_scale 16384.0000 (8257.7698) mem 7379MB [2024-08-26 07:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1130/1251] eta 0:00:29 lr 
0.000938 wd 0.0500 time 0.2389 (0.2451) data time 0.0008 (0.0015) model time 0.2381 (0.2433) loss 3.7655 (3.5423) grad_norm 1.2976 (1.9858) loss_scale 16384.0000 (8329.6198) mem 7379MB [2024-08-26 07:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1140/1251] eta 0:00:27 lr 0.000938 wd 0.0500 time 0.2432 (0.2452) data time 0.0011 (0.0015) model time 0.2421 (0.2435) loss 3.5836 (3.5415) grad_norm 1.5724 (1.9842) loss_scale 16384.0000 (8400.2103) mem 7379MB [2024-08-26 07:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1150/1251] eta 0:00:24 lr 0.000938 wd 0.0500 time 0.2523 (0.2452) data time 0.0010 (0.0015) model time 0.2513 (0.2435) loss 3.6739 (3.5420) grad_norm 1.3942 (1.9835) loss_scale 16384.0000 (8469.5743) mem 7379MB [2024-08-26 07:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1160/1251] eta 0:00:22 lr 0.000938 wd 0.0500 time 0.2424 (0.2452) data time 0.0009 (0.0015) model time 0.2415 (0.2434) loss 4.0713 (3.5416) grad_norm 2.4272 (1.9827) loss_scale 16384.0000 (8537.7433) mem 7379MB [2024-08-26 07:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1170/1251] eta 0:00:19 lr 0.000938 wd 0.0500 time 0.2387 (0.2451) data time 0.0009 (0.0015) model time 0.2378 (0.2434) loss 3.3854 (3.5419) grad_norm 1.7201 (1.9834) loss_scale 16384.0000 (8604.7481) mem 7379MB [2024-08-26 07:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1180/1251] eta 0:00:17 lr 0.000938 wd 0.0500 time 0.2496 (0.2451) data time 0.0010 (0.0015) model time 0.2486 (0.2434) loss 2.6202 (3.5419) grad_norm 2.2447 (inf) loss_scale 8192.0000 (8608.1897) mem 7379MB [2024-08-26 07:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1190/1251] eta 0:00:14 lr 0.000938 wd 0.0500 time 0.2462 (0.2451) data time 0.0009 (0.0015) model time 0.2454 (0.2434) loss 3.5911 (3.5418) grad_norm 2.1203 (inf) loss_scale 8192.0000 (8604.6952) mem 7379MB [2024-08-26 
07:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1200/1251] eta 0:00:12 lr 0.000938 wd 0.0500 time 0.2347 (0.2451) data time 0.0011 (0.0015) model time 0.2336 (0.2434) loss 3.7223 (3.5408) grad_norm 1.6708 (inf) loss_scale 8192.0000 (8601.2590) mem 7379MB [2024-08-26 07:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1210/1251] eta 0:00:10 lr 0.000938 wd 0.0500 time 0.2379 (0.2451) data time 0.0010 (0.0015) model time 0.2370 (0.2434) loss 2.9354 (3.5409) grad_norm 2.7736 (inf) loss_scale 8192.0000 (8597.8794) mem 7379MB [2024-08-26 07:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1220/1251] eta 0:00:07 lr 0.000938 wd 0.0500 time 0.2408 (0.2450) data time 0.0009 (0.0014) model time 0.2399 (0.2433) loss 4.7502 (3.5388) grad_norm 1.7795 (inf) loss_scale 8192.0000 (8594.5553) mem 7379MB [2024-08-26 07:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1230/1251] eta 0:00:05 lr 0.000938 wd 0.0500 time 0.2369 (0.2450) data time 0.0009 (0.0014) model time 0.2360 (0.2433) loss 2.5646 (3.5393) grad_norm 1.9372 (inf) loss_scale 8192.0000 (8591.2851) mem 7379MB [2024-08-26 07:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1240/1251] eta 0:00:02 lr 0.000938 wd 0.0500 time 0.2252 (0.2451) data time 0.0005 (0.0014) model time 0.2246 (0.2434) loss 3.5687 (3.5398) grad_norm 2.1116 (inf) loss_scale 8192.0000 (8588.0677) mem 7379MB [2024-08-26 07:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [64/300][1250/1251] eta 0:00:00 lr 0.000938 wd 0.0500 time 0.2243 (0.2450) data time 0.0007 (0.0014) model time 0.2235 (0.2433) loss 4.1490 (3.5422) grad_norm 1.3658 (inf) loss_scale 8192.0000 (8584.9017) mem 7379MB [2024-08-26 07:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 64 training takes 0:05:06 [2024-08-26 07:48:46 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:48:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.426 (0.426) Loss 0.5586 (0.5586) Acc@1 90.137 (90.137) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 07:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.111) Loss 0.9702 (0.8785) Acc@1 79.102 (80.753) Acc@5 95.703 (95.943) Mem 7379MB [2024-08-26 07:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.096) Loss 1.2119 (0.9018) Acc@1 70.410 (79.762) Acc@5 92.480 (95.829) Mem 7379MB [2024-08-26 07:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.6211 (1.0233) Acc@1 61.914 (77.148) Acc@5 86.230 (94.219) Mem 7379MB [2024-08-26 07:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.4092 (1.0911) Acc@1 67.969 (75.531) Acc@5 89.453 (93.345) Mem 7379MB [2024-08-26 07:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.334 Acc@5 93.222 [2024-08-26 07:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.3% [2024-08-26 07:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.877 (0.877) Loss 0.4688 (0.4688) Acc@1 90.820 (90.820) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 07:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.151) Loss 0.7690 (0.7423) Acc@1 84.277 (83.629) Acc@5 96.094 (96.697) Mem 7379MB [2024-08-26 07:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.117) Loss 1.0693 (0.7622) Acc@1 74.414 (82.710) Acc@5 92.969 (96.656) Mem 7379MB [2024-08-26 07:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.078 (0.103) Loss 1.3311 (0.8732) Acc@1 65.625 (80.103) Acc@5 89.648 (95.209) Mem 7379MB [2024-08-26 07:48:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.2412 (0.9332) Acc@1 69.629 (78.606) Acc@5 91.016 (94.550) Mem 7379MB [2024-08-26 07:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.252 Acc@5 94.482 [2024-08-26 07:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.3% [2024-08-26 07:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.25% [2024-08-26 07:48:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:48:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 07:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][0/1251] eta 0:13:51 lr 0.000938 wd 0.0500 time 0.6644 (0.6644) data time 0.4310 (0.4310) model time 0.0000 (0.0000) loss 3.1224 (3.1224) grad_norm 2.0165 (2.0165) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][10/1251] eta 0:05:51 lr 0.000938 wd 0.0500 time 0.2527 (0.2835) data time 0.0010 (0.0401) model time 0.0000 (0.0000) loss 2.9031 (3.2892) grad_norm 1.5007 (2.2093) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][20/1251] eta 0:05:25 lr 0.000938 wd 0.0500 time 0.2443 (0.2640) data time 0.0010 (0.0215) model time 0.0000 (0.0000) loss 3.2834 (3.2669) grad_norm 1.9539 (2.0474) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][30/1251] eta 0:05:13 lr 0.000938 wd 0.0500 time 0.2494 (0.2565) data time 0.0010 (0.0151) 
model time 0.0000 (0.0000) loss 3.9413 (3.4268) grad_norm 1.6950 (2.0019) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][40/1251] eta 0:05:06 lr 0.000938 wd 0.0500 time 0.2446 (0.2530) data time 0.0009 (0.0117) model time 0.0000 (0.0000) loss 2.3437 (3.4136) grad_norm 1.9491 (2.0055) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][50/1251] eta 0:05:01 lr 0.000938 wd 0.0500 time 0.2456 (0.2509) data time 0.0009 (0.0096) model time 0.0000 (0.0000) loss 3.8394 (3.4158) grad_norm 1.8360 (1.9601) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][60/1251] eta 0:04:57 lr 0.000938 wd 0.0500 time 0.2535 (0.2496) data time 0.0009 (0.0082) model time 0.2526 (0.2423) loss 3.8327 (3.4243) grad_norm 2.0896 (1.9921) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][70/1251] eta 0:04:53 lr 0.000938 wd 0.0500 time 0.2419 (0.2489) data time 0.0011 (0.0072) model time 0.2409 (0.2428) loss 4.0499 (3.4500) grad_norm 2.0532 (1.9919) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][80/1251] eta 0:04:50 lr 0.000938 wd 0.0500 time 0.2456 (0.2478) data time 0.0009 (0.0064) model time 0.2447 (0.2416) loss 2.9757 (3.4946) grad_norm 2.2808 (2.0176) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][90/1251] eta 0:04:47 lr 0.000938 wd 0.0500 time 0.2512 (0.2472) data time 0.0009 (0.0058) model time 0.2503 (0.2416) loss 2.6964 (3.4647) grad_norm 1.8693 (2.0580) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[65/300][100/1251] eta 0:04:44 lr 0.000938 wd 0.0500 time 0.2446 (0.2468) data time 0.0007 (0.0053) model time 0.2439 (0.2417) loss 4.1321 (3.4707) grad_norm 2.5635 (2.0494) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][110/1251] eta 0:04:41 lr 0.000938 wd 0.0500 time 0.2455 (0.2464) data time 0.0008 (0.0049) model time 0.2448 (0.2416) loss 2.7809 (3.4664) grad_norm 1.8242 (2.0371) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][120/1251] eta 0:04:40 lr 0.000938 wd 0.0500 time 0.2305 (0.2477) data time 0.0010 (0.0046) model time 0.2295 (0.2443) loss 2.6374 (3.4838) grad_norm 1.9504 (2.0214) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][130/1251] eta 0:04:37 lr 0.000938 wd 0.0500 time 0.2393 (0.2474) data time 0.0011 (0.0044) model time 0.2382 (0.2442) loss 3.5226 (3.4716) grad_norm 1.8438 (2.0300) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][140/1251] eta 0:04:34 lr 0.000938 wd 0.0500 time 0.2388 (0.2470) data time 0.0011 (0.0041) model time 0.2377 (0.2438) loss 3.4231 (3.4951) grad_norm 2.0071 (2.0217) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][150/1251] eta 0:04:31 lr 0.000938 wd 0.0500 time 0.2441 (0.2466) data time 0.0008 (0.0039) model time 0.2433 (0.2434) loss 4.4304 (3.4860) grad_norm 1.6492 (2.0015) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][160/1251] eta 0:04:28 lr 0.000938 wd 0.0500 time 0.2398 (0.2462) data time 0.0010 (0.0037) model time 0.2388 (0.2430) loss 3.1892 (3.4835) grad_norm 1.7225 (1.9908) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 07:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][170/1251] eta 0:04:25 lr 0.000938 wd 0.0500 time 0.2363 (0.2458) data time 0.0008 (0.0036) model time 0.2355 (0.2427) loss 3.2865 (3.4881) grad_norm 2.5349 (2.0014) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][180/1251] eta 0:04:23 lr 0.000938 wd 0.0500 time 0.2382 (0.2457) data time 0.0007 (0.0034) model time 0.2375 (0.2426) loss 4.0053 (3.4900) grad_norm 1.7831 (1.9874) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][190/1251] eta 0:04:20 lr 0.000938 wd 0.0500 time 0.2607 (0.2456) data time 0.0009 (0.0033) model time 0.2598 (0.2427) loss 3.2731 (3.4716) grad_norm 2.0416 (1.9788) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][200/1251] eta 0:04:17 lr 0.000938 wd 0.0500 time 0.2456 (0.2453) data time 0.0010 (0.0032) model time 0.2446 (0.2424) loss 3.6723 (3.4697) grad_norm 1.4547 (1.9690) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][210/1251] eta 0:04:15 lr 0.000938 wd 0.0500 time 0.2317 (0.2451) data time 0.0011 (0.0031) model time 0.2306 (0.2422) loss 4.2731 (3.4680) grad_norm 2.2687 (1.9617) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][220/1251] eta 0:04:12 lr 0.000938 wd 0.0500 time 0.2492 (0.2450) data time 0.0010 (0.0030) model time 0.2482 (0.2423) loss 3.4129 (3.4740) grad_norm 2.0293 (1.9656) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][230/1251] eta 0:04:10 lr 0.000938 wd 0.0500 time 0.2450 (0.2449) data time 0.0009 (0.0029) 
model time 0.2442 (0.2421) loss 4.0760 (3.4714) grad_norm 1.7129 (1.9521) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][240/1251] eta 0:04:07 lr 0.000938 wd 0.0500 time 0.2323 (0.2447) data time 0.0009 (0.0028) model time 0.2315 (0.2420) loss 3.5233 (3.4645) grad_norm 1.6272 (1.9505) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][250/1251] eta 0:04:05 lr 0.000938 wd 0.0500 time 0.2383 (0.2452) data time 0.0008 (0.0028) model time 0.2375 (0.2428) loss 2.2540 (3.4553) grad_norm 2.4608 (1.9625) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][260/1251] eta 0:04:02 lr 0.000938 wd 0.0500 time 0.2406 (0.2451) data time 0.0008 (0.0027) model time 0.2398 (0.2427) loss 3.5698 (3.4525) grad_norm 2.8204 (1.9591) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][270/1251] eta 0:04:00 lr 0.000938 wd 0.0500 time 0.2402 (0.2451) data time 0.0009 (0.0026) model time 0.2393 (0.2428) loss 3.8038 (3.4599) grad_norm 1.5380 (1.9531) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][280/1251] eta 0:03:57 lr 0.000938 wd 0.0500 time 0.2531 (0.2450) data time 0.0010 (0.0026) model time 0.2521 (0.2427) loss 3.9206 (3.4603) grad_norm 3.5203 (1.9923) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][290/1251] eta 0:03:55 lr 0.000938 wd 0.0500 time 0.2353 (0.2450) data time 0.0010 (0.0025) model time 0.2343 (0.2427) loss 3.6184 (3.4641) grad_norm 1.5106 (1.9903) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[65/300][300/1251] eta 0:03:52 lr 0.000938 wd 0.0500 time 0.2369 (0.2448) data time 0.0009 (0.0025) model time 0.2360 (0.2426) loss 2.6824 (3.4657) grad_norm 1.7276 (1.9846) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][310/1251] eta 0:03:50 lr 0.000938 wd 0.0500 time 0.2473 (0.2448) data time 0.0007 (0.0024) model time 0.2466 (0.2426) loss 4.1549 (3.4598) grad_norm 1.5378 (1.9791) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][320/1251] eta 0:03:47 lr 0.000938 wd 0.0500 time 0.2429 (0.2447) data time 0.0008 (0.0024) model time 0.2421 (0.2425) loss 4.0656 (3.4701) grad_norm 1.4918 (1.9844) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][330/1251] eta 0:03:45 lr 0.000938 wd 0.0500 time 0.2324 (0.2446) data time 0.0008 (0.0023) model time 0.2316 (0.2424) loss 3.9070 (3.4742) grad_norm 1.6649 (1.9724) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][340/1251] eta 0:03:43 lr 0.000938 wd 0.0500 time 0.2405 (0.2450) data time 0.0007 (0.0023) model time 0.2398 (0.2430) loss 2.3995 (3.4774) grad_norm 1.9061 (1.9653) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][350/1251] eta 0:03:40 lr 0.000937 wd 0.0500 time 0.2248 (0.2449) data time 0.0010 (0.0023) model time 0.2238 (0.2429) loss 3.7386 (3.4805) grad_norm 2.8992 (1.9713) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][360/1251] eta 0:03:38 lr 0.000937 wd 0.0500 time 0.2462 (0.2448) data time 0.0012 (0.0022) model time 0.2450 (0.2428) loss 3.9792 (3.4830) grad_norm 2.1215 (1.9719) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 07:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][370/1251] eta 0:03:35 lr 0.000937 wd 0.0500 time 0.2335 (0.2448) data time 0.0008 (0.0022) model time 0.2327 (0.2428) loss 4.3860 (3.4800) grad_norm 1.5495 (1.9677) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][380/1251] eta 0:03:33 lr 0.000937 wd 0.0500 time 0.2376 (0.2452) data time 0.0007 (0.0022) model time 0.2368 (0.2433) loss 3.5972 (3.4768) grad_norm 1.7614 (1.9694) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][390/1251] eta 0:03:31 lr 0.000937 wd 0.0500 time 0.2339 (0.2457) data time 0.0009 (0.0021) model time 0.2330 (0.2439) loss 3.7074 (3.4801) grad_norm 2.0533 (1.9749) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][400/1251] eta 0:03:29 lr 0.000937 wd 0.0500 time 0.2413 (0.2457) data time 0.0007 (0.0021) model time 0.2405 (0.2439) loss 3.2761 (3.4825) grad_norm 1.7002 (1.9748) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][410/1251] eta 0:03:26 lr 0.000937 wd 0.0500 time 0.2452 (0.2457) data time 0.0010 (0.0021) model time 0.2441 (0.2439) loss 4.2476 (3.4776) grad_norm 1.8419 (1.9683) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][420/1251] eta 0:03:24 lr 0.000937 wd 0.0500 time 0.2391 (0.2456) data time 0.0009 (0.0021) model time 0.2382 (0.2438) loss 3.9940 (3.4777) grad_norm 1.5073 (1.9633) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][430/1251] eta 0:03:21 lr 0.000937 wd 0.0500 time 0.2297 (0.2455) data time 0.0009 (0.0020) 
model time 0.2288 (0.2438) loss 3.3550 (3.4815) grad_norm 1.7101 (1.9580) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][440/1251] eta 0:03:19 lr 0.000937 wd 0.0500 time 0.2429 (0.2455) data time 0.0007 (0.0020) model time 0.2421 (0.2438) loss 3.8313 (3.4763) grad_norm 2.3921 (1.9622) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][450/1251] eta 0:03:16 lr 0.000937 wd 0.0500 time 0.2582 (0.2459) data time 0.0011 (0.0020) model time 0.2571 (0.2443) loss 3.3114 (3.4798) grad_norm 1.4067 (1.9649) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][460/1251] eta 0:03:14 lr 0.000937 wd 0.0500 time 0.2424 (0.2463) data time 0.0007 (0.0020) model time 0.2417 (0.2446) loss 4.1392 (3.4814) grad_norm 3.0582 (1.9648) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][470/1251] eta 0:03:12 lr 0.000937 wd 0.0500 time 0.2456 (0.2462) data time 0.0011 (0.0020) model time 0.2446 (0.2446) loss 3.3225 (3.4828) grad_norm 1.3621 (1.9608) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][480/1251] eta 0:03:09 lr 0.000937 wd 0.0500 time 0.2447 (0.2461) data time 0.0009 (0.0019) model time 0.2438 (0.2445) loss 2.7685 (3.4763) grad_norm 1.6878 (1.9576) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][490/1251] eta 0:03:07 lr 0.000937 wd 0.0500 time 0.2414 (0.2460) data time 0.0010 (0.0019) model time 0.2405 (0.2444) loss 4.3561 (3.4784) grad_norm 1.5246 (1.9548) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[65/300][500/1251] eta 0:03:04 lr 0.000937 wd 0.0500 time 0.2482 (0.2460) data time 0.0007 (0.0019) model time 0.2475 (0.2444) loss 2.6773 (3.4760) grad_norm 1.5368 (1.9495) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][510/1251] eta 0:03:02 lr 0.000937 wd 0.0500 time 0.2395 (0.2462) data time 0.0011 (0.0019) model time 0.2384 (0.2446) loss 2.5402 (3.4721) grad_norm 1.5167 (1.9485) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][520/1251] eta 0:02:59 lr 0.000937 wd 0.0500 time 0.2451 (0.2461) data time 0.0008 (0.0019) model time 0.2444 (0.2445) loss 3.9722 (3.4720) grad_norm 1.8787 (1.9565) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][530/1251] eta 0:02:57 lr 0.000937 wd 0.0500 time 0.2422 (0.2460) data time 0.0011 (0.0019) model time 0.2411 (0.2445) loss 2.6397 (3.4748) grad_norm 2.6690 (1.9559) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][540/1251] eta 0:02:54 lr 0.000937 wd 0.0500 time 0.2397 (0.2460) data time 0.0007 (0.0019) model time 0.2389 (0.2444) loss 3.8168 (3.4760) grad_norm 2.6847 (1.9575) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][550/1251] eta 0:02:52 lr 0.000937 wd 0.0500 time 0.2462 (0.2459) data time 0.0010 (0.0018) model time 0.2452 (0.2444) loss 3.6475 (3.4783) grad_norm 1.7696 (1.9595) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 07:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][560/1251] eta 0:02:49 lr 0.000937 wd 0.0500 time 0.2438 (0.2459) data time 0.0011 (0.0018) model time 0.2427 (0.2443) loss 2.0470 (3.4755) grad_norm 2.3213 (inf) loss_scale 4096.0000 (8148.1925) 
mem 7379MB [2024-08-26 07:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][570/1251] eta 0:02:47 lr 0.000937 wd 0.0500 time 0.2461 (0.2458) data time 0.0007 (0.0018) model time 0.2454 (0.2443) loss 2.6213 (3.4739) grad_norm 2.0785 (inf) loss_scale 4096.0000 (8077.2259) mem 7379MB [2024-08-26 07:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][580/1251] eta 0:02:44 lr 0.000937 wd 0.0500 time 0.2414 (0.2458) data time 0.0010 (0.0019) model time 0.2403 (0.2442) loss 3.8692 (3.4744) grad_norm 1.9958 (inf) loss_scale 4096.0000 (8008.7022) mem 7379MB [2024-08-26 07:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][590/1251] eta 0:02:42 lr 0.000937 wd 0.0500 time 0.2403 (0.2458) data time 0.0007 (0.0019) model time 0.2396 (0.2442) loss 3.9878 (3.4783) grad_norm 1.4518 (inf) loss_scale 4096.0000 (7942.4975) mem 7379MB [2024-08-26 07:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][600/1251] eta 0:02:39 lr 0.000937 wd 0.0500 time 0.2432 (0.2457) data time 0.0009 (0.0018) model time 0.2423 (0.2441) loss 3.6254 (3.4836) grad_norm 1.7544 (inf) loss_scale 4096.0000 (7878.4958) mem 7379MB [2024-08-26 07:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][610/1251] eta 0:02:37 lr 0.000937 wd 0.0500 time 0.2397 (0.2456) data time 0.0008 (0.0018) model time 0.2389 (0.2440) loss 4.4043 (3.4844) grad_norm 3.8023 (inf) loss_scale 4096.0000 (7816.5892) mem 7379MB [2024-08-26 07:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][620/1251] eta 0:02:34 lr 0.000937 wd 0.0500 time 0.2492 (0.2456) data time 0.0012 (0.0018) model time 0.2480 (0.2440) loss 3.7579 (3.4879) grad_norm 1.9183 (inf) loss_scale 4096.0000 (7756.6763) mem 7379MB [2024-08-26 07:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][630/1251] eta 0:02:32 lr 0.000937 wd 0.0500 time 0.2459 (0.2455) data time 0.0009 (0.0018) model time 0.2450 (0.2440) loss 
4.5080 (3.4933) grad_norm 1.7070 (inf) loss_scale 4096.0000 (7698.6624) mem 7379MB [2024-08-26 07:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][640/1251] eta 0:02:29 lr 0.000937 wd 0.0500 time 0.2391 (0.2455) data time 0.0010 (0.0018) model time 0.2381 (0.2439) loss 3.8930 (3.4949) grad_norm 1.3765 (inf) loss_scale 4096.0000 (7642.4587) mem 7379MB [2024-08-26 07:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][650/1251] eta 0:02:27 lr 0.000937 wd 0.0500 time 0.2397 (0.2454) data time 0.0011 (0.0018) model time 0.2385 (0.2438) loss 3.0134 (3.4984) grad_norm 1.8885 (inf) loss_scale 4096.0000 (7587.9816) mem 7379MB [2024-08-26 07:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][660/1251] eta 0:02:25 lr 0.000937 wd 0.0500 time 0.2367 (0.2454) data time 0.0007 (0.0018) model time 0.2360 (0.2438) loss 2.5036 (3.4970) grad_norm 1.7125 (inf) loss_scale 4096.0000 (7535.1528) mem 7379MB [2024-08-26 07:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][670/1251] eta 0:02:22 lr 0.000937 wd 0.0500 time 0.2355 (0.2454) data time 0.0008 (0.0018) model time 0.2347 (0.2438) loss 4.1044 (3.4977) grad_norm 1.6899 (inf) loss_scale 4096.0000 (7483.8987) mem 7379MB [2024-08-26 07:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][680/1251] eta 0:02:20 lr 0.000937 wd 0.0500 time 0.2415 (0.2453) data time 0.0010 (0.0018) model time 0.2404 (0.2438) loss 3.6289 (3.4970) grad_norm 1.7344 (inf) loss_scale 4096.0000 (7434.1498) mem 7379MB [2024-08-26 07:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][690/1251] eta 0:02:17 lr 0.000937 wd 0.0500 time 0.2366 (0.2453) data time 0.0009 (0.0018) model time 0.2357 (0.2437) loss 3.9166 (3.4983) grad_norm 1.7974 (inf) loss_scale 4096.0000 (7385.8408) mem 7379MB [2024-08-26 07:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][700/1251] eta 0:02:15 lr 0.000937 wd 0.0500 time 
0.2433 (0.2452) data time 0.0009 (0.0017) model time 0.2424 (0.2437) loss 4.0160 (3.5042) grad_norm 1.2896 (inf) loss_scale 4096.0000 (7338.9101) mem 7379MB [2024-08-26 07:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][710/1251] eta 0:02:12 lr 0.000937 wd 0.0500 time 0.2397 (0.2452) data time 0.0009 (0.0017) model time 0.2388 (0.2437) loss 4.2423 (3.5061) grad_norm 1.9690 (inf) loss_scale 4096.0000 (7293.2996) mem 7379MB [2024-08-26 07:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][720/1251] eta 0:02:10 lr 0.000937 wd 0.0500 time 0.2439 (0.2452) data time 0.0010 (0.0017) model time 0.2429 (0.2437) loss 3.5552 (3.5050) grad_norm 1.9457 (inf) loss_scale 4096.0000 (7248.9542) mem 7379MB [2024-08-26 07:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][730/1251] eta 0:02:07 lr 0.000937 wd 0.0500 time 0.2351 (0.2452) data time 0.0011 (0.0017) model time 0.2340 (0.2436) loss 3.4956 (3.5003) grad_norm 1.9597 (inf) loss_scale 4096.0000 (7205.8222) mem 7379MB [2024-08-26 07:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][740/1251] eta 0:02:05 lr 0.000937 wd 0.0500 time 0.2476 (0.2454) data time 0.0009 (0.0017) model time 0.2467 (0.2439) loss 3.6695 (3.4993) grad_norm 1.6302 (inf) loss_scale 4096.0000 (7163.8543) mem 7379MB [2024-08-26 07:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][750/1251] eta 0:02:02 lr 0.000937 wd 0.0500 time 0.2354 (0.2454) data time 0.0010 (0.0017) model time 0.2344 (0.2439) loss 2.8352 (3.4968) grad_norm 1.6868 (inf) loss_scale 4096.0000 (7123.0040) mem 7379MB [2024-08-26 07:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][760/1251] eta 0:02:00 lr 0.000937 wd 0.0500 time 0.2352 (0.2454) data time 0.0011 (0.0017) model time 0.2341 (0.2439) loss 2.5456 (3.5008) grad_norm 1.7354 (inf) loss_scale 4096.0000 (7083.2273) mem 7379MB [2024-08-26 07:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [65/300][770/1251] eta 0:01:58 lr 0.000937 wd 0.0500 time 0.2396 (0.2453) data time 0.0007 (0.0017) model time 0.2389 (0.2438) loss 4.0352 (3.4997) grad_norm 1.9701 (inf) loss_scale 4096.0000 (7044.4825) mem 7379MB [2024-08-26 07:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][780/1251] eta 0:01:55 lr 0.000937 wd 0.0500 time 0.2322 (0.2456) data time 0.0012 (0.0017) model time 0.2310 (0.2441) loss 3.6540 (3.5032) grad_norm 2.1733 (inf) loss_scale 4096.0000 (7006.7298) mem 7379MB [2024-08-26 07:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][790/1251] eta 0:01:53 lr 0.000937 wd 0.0500 time 0.2426 (0.2455) data time 0.0011 (0.0017) model time 0.2415 (0.2440) loss 3.0871 (3.5040) grad_norm 1.3196 (inf) loss_scale 4096.0000 (6969.9317) mem 7379MB [2024-08-26 07:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][800/1251] eta 0:01:50 lr 0.000937 wd 0.0500 time 0.2398 (0.2455) data time 0.0007 (0.0017) model time 0.2390 (0.2440) loss 3.4804 (3.5063) grad_norm 1.7758 (inf) loss_scale 4096.0000 (6934.0524) mem 7379MB [2024-08-26 07:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][810/1251] eta 0:01:48 lr 0.000936 wd 0.0500 time 0.2342 (0.2454) data time 0.0007 (0.0016) model time 0.2335 (0.2439) loss 3.6241 (3.5099) grad_norm 1.6223 (inf) loss_scale 4096.0000 (6899.0580) mem 7379MB [2024-08-26 07:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][820/1251] eta 0:01:45 lr 0.000936 wd 0.0500 time 0.2443 (0.2454) data time 0.0008 (0.0016) model time 0.2435 (0.2439) loss 3.4920 (3.5106) grad_norm 1.8963 (inf) loss_scale 4096.0000 (6864.9160) mem 7379MB [2024-08-26 07:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][830/1251] eta 0:01:43 lr 0.000936 wd 0.0500 time 0.2366 (0.2453) data time 0.0009 (0.0016) model time 0.2357 (0.2439) loss 3.9520 (3.5089) grad_norm 2.0611 (inf) loss_scale 4096.0000 (6831.5957) 
mem 7379MB [2024-08-26 07:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][840/1251] eta 0:01:40 lr 0.000936 wd 0.0500 time 0.2451 (0.2453) data time 0.0009 (0.0016) model time 0.2442 (0.2438) loss 3.5172 (3.5050) grad_norm 1.9690 (inf) loss_scale 4096.0000 (6799.0678) mem 7379MB [2024-08-26 07:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][850/1251] eta 0:01:38 lr 0.000936 wd 0.0500 time 0.2511 (0.2452) data time 0.0007 (0.0016) model time 0.2504 (0.2438) loss 3.1255 (3.5047) grad_norm 1.4965 (inf) loss_scale 4096.0000 (6767.3043) mem 7379MB [2024-08-26 07:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][860/1251] eta 0:01:35 lr 0.000936 wd 0.0500 time 0.2313 (0.2452) data time 0.0012 (0.0016) model time 0.2301 (0.2437) loss 3.7685 (3.5060) grad_norm 1.9342 (inf) loss_scale 4096.0000 (6736.2787) mem 7379MB [2024-08-26 07:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][870/1251] eta 0:01:33 lr 0.000936 wd 0.0500 time 0.2339 (0.2451) data time 0.0011 (0.0016) model time 0.2328 (0.2437) loss 3.7562 (3.5046) grad_norm 1.9618 (inf) loss_scale 4096.0000 (6705.9656) mem 7379MB [2024-08-26 07:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][880/1251] eta 0:01:30 lr 0.000936 wd 0.0500 time 0.2464 (0.2451) data time 0.0007 (0.0016) model time 0.2457 (0.2436) loss 2.2303 (3.5022) grad_norm 1.6992 (inf) loss_scale 4096.0000 (6676.3405) mem 7379MB [2024-08-26 07:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][890/1251] eta 0:01:28 lr 0.000936 wd 0.0500 time 0.2378 (0.2450) data time 0.0008 (0.0016) model time 0.2370 (0.2436) loss 3.9889 (3.4986) grad_norm 1.7257 (inf) loss_scale 4096.0000 (6647.3805) mem 7379MB [2024-08-26 07:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][900/1251] eta 0:01:26 lr 0.000936 wd 0.0500 time 0.2457 (0.2453) data time 0.0008 (0.0016) model time 0.2449 (0.2438) loss 
4.0087 (3.4995) grad_norm 1.9065 (inf) loss_scale 4096.0000 (6619.0633) mem 7379MB [2024-08-26 07:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][910/1251] eta 0:01:23 lr 0.000936 wd 0.0500 time 0.2412 (0.2455) data time 0.0007 (0.0016) model time 0.2405 (0.2440) loss 2.9518 (3.4985) grad_norm 1.7719 (inf) loss_scale 4096.0000 (6591.3677) mem 7379MB [2024-08-26 07:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][920/1251] eta 0:01:21 lr 0.000936 wd 0.0500 time 0.2435 (0.2454) data time 0.0007 (0.0016) model time 0.2428 (0.2440) loss 3.0956 (3.4978) grad_norm 1.7842 (inf) loss_scale 4096.0000 (6564.2736) mem 7379MB [2024-08-26 07:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][930/1251] eta 0:01:18 lr 0.000936 wd 0.0500 time 0.2384 (0.2454) data time 0.0008 (0.0016) model time 0.2376 (0.2439) loss 4.0842 (3.4981) grad_norm 1.7636 (inf) loss_scale 4096.0000 (6537.7615) mem 7379MB [2024-08-26 07:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][940/1251] eta 0:01:16 lr 0.000936 wd 0.0500 time 0.2350 (0.2453) data time 0.0009 (0.0016) model time 0.2341 (0.2439) loss 3.6221 (3.4983) grad_norm 1.5364 (inf) loss_scale 4096.0000 (6511.8130) mem 7379MB [2024-08-26 07:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][950/1251] eta 0:01:13 lr 0.000936 wd 0.0500 time 0.2360 (0.2453) data time 0.0011 (0.0016) model time 0.2349 (0.2438) loss 4.5873 (3.4997) grad_norm 2.2493 (inf) loss_scale 4096.0000 (6486.4101) mem 7379MB [2024-08-26 07:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][960/1251] eta 0:01:11 lr 0.000936 wd 0.0500 time 0.2483 (0.2452) data time 0.0007 (0.0016) model time 0.2476 (0.2438) loss 3.0665 (3.5013) grad_norm 1.5364 (inf) loss_scale 4096.0000 (6461.5359) mem 7379MB [2024-08-26 07:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][970/1251] eta 0:01:08 lr 0.000936 wd 0.0500 time 
0.2459 (0.2452) data time 0.0007 (0.0016) model time 0.2452 (0.2438) loss 3.3919 (3.5028) grad_norm 2.2555 (inf) loss_scale 4096.0000 (6437.1740) mem 7379MB [2024-08-26 07:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][980/1251] eta 0:01:06 lr 0.000936 wd 0.0500 time 0.2472 (0.2451) data time 0.0008 (0.0016) model time 0.2464 (0.2437) loss 3.4468 (3.5037) grad_norm 2.7420 (inf) loss_scale 4096.0000 (6413.3089) mem 7379MB [2024-08-26 07:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][990/1251] eta 0:01:04 lr 0.000936 wd 0.0500 time 0.2443 (0.2453) data time 0.0010 (0.0015) model time 0.2434 (0.2439) loss 3.4220 (3.5026) grad_norm 1.2649 (inf) loss_scale 4096.0000 (6389.9253) mem 7379MB [2024-08-26 07:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1000/1251] eta 0:01:01 lr 0.000936 wd 0.0500 time 0.2430 (0.2455) data time 0.0010 (0.0015) model time 0.2419 (0.2441) loss 3.9149 (3.5015) grad_norm 2.1910 (inf) loss_scale 4096.0000 (6367.0090) mem 7379MB [2024-08-26 07:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1010/1251] eta 0:00:59 lr 0.000936 wd 0.0500 time 0.2386 (0.2454) data time 0.0010 (0.0015) model time 0.2376 (0.2440) loss 2.9759 (3.5000) grad_norm 1.6327 (inf) loss_scale 4096.0000 (6344.5460) mem 7379MB [2024-08-26 07:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1020/1251] eta 0:00:56 lr 0.000936 wd 0.0500 time 0.2472 (0.2454) data time 0.0008 (0.0015) model time 0.2464 (0.2440) loss 3.8499 (3.5001) grad_norm 1.8031 (inf) loss_scale 4096.0000 (6322.5230) mem 7379MB [2024-08-26 07:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1030/1251] eta 0:00:54 lr 0.000936 wd 0.0500 time 0.2308 (0.2455) data time 0.0008 (0.0015) model time 0.2300 (0.2442) loss 4.4890 (3.5013) grad_norm 1.7082 (inf) loss_scale 4096.0000 (6300.9273) mem 7379MB [2024-08-26 07:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [65/300][1040/1251] eta 0:00:51 lr 0.000936 wd 0.0500 time 0.2367 (0.2455) data time 0.0010 (0.0015) model time 0.2357 (0.2441) loss 3.6894 (3.5008) grad_norm 1.8890 (inf) loss_scale 4096.0000 (6279.7464) mem 7379MB [2024-08-26 07:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1050/1251] eta 0:00:49 lr 0.000936 wd 0.0500 time 0.2409 (0.2456) data time 0.0010 (0.0015) model time 0.2399 (0.2443) loss 4.0122 (3.5034) grad_norm 2.1092 (inf) loss_scale 4096.0000 (6258.9686) mem 7379MB [2024-08-26 07:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1060/1251] eta 0:00:46 lr 0.000936 wd 0.0500 time 0.2424 (0.2456) data time 0.0007 (0.0015) model time 0.2417 (0.2442) loss 3.6340 (3.5075) grad_norm 2.6256 (inf) loss_scale 4096.0000 (6238.5825) mem 7379MB [2024-08-26 07:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1070/1251] eta 0:00:44 lr 0.000936 wd 0.0500 time 0.2385 (0.2456) data time 0.0009 (0.0015) model time 0.2376 (0.2442) loss 3.8327 (3.5092) grad_norm 1.4013 (inf) loss_scale 4096.0000 (6218.5770) mem 7379MB [2024-08-26 07:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1080/1251] eta 0:00:41 lr 0.000936 wd 0.0500 time 0.2407 (0.2455) data time 0.0010 (0.0015) model time 0.2397 (0.2441) loss 3.0655 (3.5109) grad_norm 1.6833 (inf) loss_scale 4096.0000 (6198.9417) mem 7379MB [2024-08-26 07:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1090/1251] eta 0:00:39 lr 0.000936 wd 0.0500 time 0.2498 (0.2455) data time 0.0011 (0.0015) model time 0.2487 (0.2441) loss 3.0521 (3.5120) grad_norm 1.8150 (inf) loss_scale 4096.0000 (6179.6664) mem 7379MB [2024-08-26 07:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1100/1251] eta 0:00:37 lr 0.000936 wd 0.0500 time 0.2387 (0.2454) data time 0.0010 (0.0015) model time 0.2378 (0.2441) loss 3.2029 (3.5113) grad_norm 1.7591 (inf) loss_scale 4096.0000 
(6160.7411) mem 7379MB [2024-08-26 07:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1110/1251] eta 0:00:34 lr 0.000936 wd 0.0500 time 0.2391 (0.2454) data time 0.0007 (0.0015) model time 0.2383 (0.2440) loss 3.9549 (3.5130) grad_norm 2.3882 (inf) loss_scale 4096.0000 (6142.1566) mem 7379MB [2024-08-26 07:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1120/1251] eta 0:00:32 lr 0.000936 wd 0.0500 time 0.2355 (0.2453) data time 0.0007 (0.0015) model time 0.2349 (0.2440) loss 4.0495 (3.5152) grad_norm 1.6753 (inf) loss_scale 4096.0000 (6123.9037) mem 7379MB [2024-08-26 07:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1130/1251] eta 0:00:29 lr 0.000936 wd 0.0500 time 0.2561 (0.2453) data time 0.0009 (0.0015) model time 0.2552 (0.2440) loss 4.0378 (3.5164) grad_norm 1.7200 (inf) loss_scale 4096.0000 (6105.9735) mem 7379MB [2024-08-26 07:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1140/1251] eta 0:00:27 lr 0.000936 wd 0.0500 time 0.2444 (0.2453) data time 0.0009 (0.0015) model time 0.2435 (0.2439) loss 2.6087 (3.5144) grad_norm 3.7739 (inf) loss_scale 4096.0000 (6088.3576) mem 7379MB [2024-08-26 07:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1150/1251] eta 0:00:24 lr 0.000936 wd 0.0500 time 0.2467 (0.2453) data time 0.0007 (0.0015) model time 0.2460 (0.2439) loss 4.4544 (3.5163) grad_norm 2.4892 (inf) loss_scale 4096.0000 (6071.0478) mem 7379MB [2024-08-26 07:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1160/1251] eta 0:00:22 lr 0.000936 wd 0.0500 time 0.2331 (0.2452) data time 0.0009 (0.0015) model time 0.2321 (0.2439) loss 3.0119 (3.5134) grad_norm 1.8112 (inf) loss_scale 4096.0000 (6054.0362) mem 7379MB [2024-08-26 07:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1170/1251] eta 0:00:19 lr 0.000936 wd 0.0500 time 0.2434 (0.2452) data time 0.0008 (0.0015) model time 
0.2426 (0.2439) loss 2.5267 (3.5127) grad_norm 2.8085 (inf) loss_scale 4096.0000 (6037.3151) mem 7379MB [2024-08-26 07:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1180/1251] eta 0:00:17 lr 0.000936 wd 0.0500 time 0.2458 (0.2452) data time 0.0009 (0.0015) model time 0.2448 (0.2438) loss 2.7574 (3.5124) grad_norm 1.4449 (inf) loss_scale 4096.0000 (6020.8772) mem 7379MB [2024-08-26 07:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1190/1251] eta 0:00:14 lr 0.000936 wd 0.0500 time 0.2460 (0.2452) data time 0.0010 (0.0015) model time 0.2450 (0.2438) loss 2.4877 (3.5106) grad_norm 1.5733 (inf) loss_scale 4096.0000 (6004.7154) mem 7379MB [2024-08-26 07:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1200/1251] eta 0:00:12 lr 0.000936 wd 0.0500 time 0.2466 (0.2452) data time 0.0009 (0.0015) model time 0.2457 (0.2438) loss 3.9073 (3.5135) grad_norm 2.0445 (inf) loss_scale 4096.0000 (5988.8226) mem 7379MB [2024-08-26 07:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1210/1251] eta 0:00:10 lr 0.000936 wd 0.0500 time 0.2478 (0.2451) data time 0.0009 (0.0014) model time 0.2468 (0.2438) loss 3.6450 (3.5130) grad_norm 2.0252 (inf) loss_scale 4096.0000 (5973.1924) mem 7379MB [2024-08-26 07:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1220/1251] eta 0:00:07 lr 0.000936 wd 0.0500 time 0.2416 (0.2451) data time 0.0012 (0.0015) model time 0.2404 (0.2438) loss 2.6866 (3.5129) grad_norm 2.3806 (inf) loss_scale 4096.0000 (5957.8182) mem 7379MB [2024-08-26 07:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1230/1251] eta 0:00:05 lr 0.000936 wd 0.0500 time 0.2448 (0.2451) data time 0.0010 (0.0015) model time 0.2438 (0.2437) loss 3.4033 (3.5125) grad_norm 2.5391 (inf) loss_scale 4096.0000 (5942.6937) mem 7379MB [2024-08-26 07:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1240/1251] eta 0:00:02 
lr 0.000936 wd 0.0500 time 0.2280 (0.2450) data time 0.0005 (0.0015) model time 0.2275 (0.2436) loss 4.0966 (3.5141) grad_norm 1.9714 (inf) loss_scale 4096.0000 (5927.8131) mem 7379MB [2024-08-26 07:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [65/300][1250/1251] eta 0:00:00 lr 0.000936 wd 0.0500 time 0.2334 (0.2449) data time 0.0007 (0.0014) model time 0.2327 (0.2435) loss 3.4305 (3.5129) grad_norm 2.6266 (inf) loss_scale 4096.0000 (5913.1703) mem 7379MB [2024-08-26 07:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 65 training takes 0:05:06 [2024-08-26 07:54:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:54:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.453 (0.453) Loss 0.5791 (0.5791) Acc@1 89.258 (89.258) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 07:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.112) Loss 0.8276 (0.8715) Acc@1 81.641 (80.975) Acc@5 96.289 (95.827) Mem 7379MB [2024-08-26 07:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.097) Loss 1.2969 (0.8883) Acc@1 70.801 (80.259) Acc@5 90.723 (95.736) Mem 7379MB [2024-08-26 07:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.091) Loss 1.4707 (1.0123) Acc@1 64.355 (77.284) Acc@5 88.770 (94.232) Mem 7379MB [2024-08-26 07:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.4346 (1.0824) Acc@1 67.285 (75.698) Acc@5 89.062 (93.252) Mem 7379MB [2024-08-26 07:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.518 Acc@5 93.242 [2024-08-26 07:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on 
the 50000 test images: 75.5% [2024-08-26 07:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.810 (0.810) Loss 0.4685 (0.4685) Acc@1 91.016 (91.016) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 07:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.148) Loss 0.7671 (0.7407) Acc@1 84.863 (83.691) Acc@5 95.996 (96.689) Mem 7379MB [2024-08-26 07:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.115) Loss 1.0654 (0.7609) Acc@1 74.512 (82.766) Acc@5 92.871 (96.661) Mem 7379MB [2024-08-26 07:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.103) Loss 1.3311 (0.8715) Acc@1 65.723 (80.160) Acc@5 90.039 (95.234) Mem 7379MB [2024-08-26 07:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.2412 (0.9311) Acc@1 69.824 (78.682) Acc@5 91.113 (94.574) Mem 7379MB [2024-08-26 07:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.328 Acc@5 94.502 [2024-08-26 07:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.3% [2024-08-26 07:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.33% [2024-08-26 07:54:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:54:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 07:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][0/1251] eta 0:14:25 lr 0.000936 wd 0.0500 time 0.6921 (0.6921) data time 0.4690 (0.4690) model time 0.0000 (0.0000) loss 2.6361 (2.6361) grad_norm 2.6597 (2.6597) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][10/1251] eta 0:05:52 lr 0.000935 wd 0.0500 time 0.2431 (0.2838) data time 0.0009 (0.0436) model time 0.0000 (0.0000) loss 3.7954 (3.3593) grad_norm 1.7981 (2.3229) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][20/1251] eta 0:05:36 lr 0.000935 wd 0.0500 time 0.2396 (0.2737) data time 0.0007 (0.0233) model time 0.0000 (0.0000) loss 2.4058 (3.3987) grad_norm 1.7637 (2.0885) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][30/1251] eta 0:05:21 lr 0.000935 wd 0.0500 time 0.2391 (0.2637) data time 0.0010 (0.0161) model time 0.0000 (0.0000) loss 2.5366 (3.4646) grad_norm 1.6057 (1.9862) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][40/1251] eta 0:05:19 lr 0.000935 wd 0.0500 time 0.2455 (0.2639) data time 0.0010 (0.0124) model time 0.0000 (0.0000) loss 2.3895 (3.5138) grad_norm 3.2365 (2.1261) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][50/1251] eta 0:05:11 lr 0.000935 wd 0.0500 time 0.2415 (0.2595) data time 0.0010 (0.0102) model time 0.0000 (0.0000) loss 2.5183 (3.4760) grad_norm 1.3286 (2.1288) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][60/1251] eta 0:05:05 lr 0.000935 wd 0.0500 time 0.2360 (0.2568) data time 0.0012 (0.0087) model time 0.2348 (0.2418) loss 
3.6640 (3.4940) grad_norm 2.4525 (2.1324) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][70/1251] eta 0:05:01 lr 0.000935 wd 0.0500 time 0.2358 (0.2549) data time 0.0012 (0.0076) model time 0.2345 (0.2421) loss 3.8241 (3.4993) grad_norm 2.0073 (2.1968) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][80/1251] eta 0:04:56 lr 0.000935 wd 0.0500 time 0.2378 (0.2532) data time 0.0009 (0.0068) model time 0.2369 (0.2414) loss 3.8829 (3.5541) grad_norm 1.8925 (2.1650) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][90/1251] eta 0:04:52 lr 0.000935 wd 0.0500 time 0.2343 (0.2520) data time 0.0009 (0.0062) model time 0.2335 (0.2414) loss 3.7154 (3.5334) grad_norm 2.2920 (2.1637) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][100/1251] eta 0:04:48 lr 0.000935 wd 0.0500 time 0.2401 (0.2509) data time 0.0010 (0.0057) model time 0.2391 (0.2410) loss 3.6698 (3.5301) grad_norm 1.7341 (2.1493) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][110/1251] eta 0:04:45 lr 0.000935 wd 0.0500 time 0.2383 (0.2499) data time 0.0008 (0.0052) model time 0.2375 (0.2408) loss 4.3783 (3.5272) grad_norm 1.7169 (2.1074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][120/1251] eta 0:04:41 lr 0.000935 wd 0.0500 time 0.2419 (0.2492) data time 0.0010 (0.0049) model time 0.2409 (0.2407) loss 3.4637 (3.5081) grad_norm 1.8089 (2.1096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][130/1251] eta 0:04:38 lr 
0.000935 wd 0.0500 time 0.2346 (0.2486) data time 0.0010 (0.0046) model time 0.2336 (0.2406) loss 3.2557 (3.4963) grad_norm 1.9084 (2.0814) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][140/1251] eta 0:04:35 lr 0.000935 wd 0.0500 time 0.2320 (0.2481) data time 0.0012 (0.0044) model time 0.2308 (0.2406) loss 3.5492 (3.4779) grad_norm 1.6799 (2.0960) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][150/1251] eta 0:04:32 lr 0.000935 wd 0.0500 time 0.2431 (0.2477) data time 0.0011 (0.0041) model time 0.2421 (0.2407) loss 2.9617 (3.4878) grad_norm 1.8095 (2.0809) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][160/1251] eta 0:04:29 lr 0.000935 wd 0.0500 time 0.2549 (0.2475) data time 0.0009 (0.0039) model time 0.2540 (0.2409) loss 3.8752 (3.4775) grad_norm 2.0712 (2.0750) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][170/1251] eta 0:04:29 lr 0.000935 wd 0.0500 time 0.2331 (0.2489) data time 0.0009 (0.0038) model time 0.2322 (0.2434) loss 3.2157 (3.4689) grad_norm 1.5946 (2.0691) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][180/1251] eta 0:04:27 lr 0.000935 wd 0.0500 time 0.2406 (0.2495) data time 0.0011 (0.0036) model time 0.2395 (0.2445) loss 3.7186 (3.4595) grad_norm 2.4338 (2.0526) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][190/1251] eta 0:04:24 lr 0.000935 wd 0.0500 time 0.2385 (0.2491) data time 0.0009 (0.0035) model time 0.2376 (0.2442) loss 4.1646 (3.4618) grad_norm 2.1113 (2.0476) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][200/1251] eta 0:04:21 lr 0.000935 wd 0.0500 time 0.2482 (0.2488) data time 0.0008 (0.0033) model time 0.2474 (0.2442) loss 3.9481 (3.4752) grad_norm 3.9343 (2.0545) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][210/1251] eta 0:04:18 lr 0.000935 wd 0.0500 time 0.2355 (0.2484) data time 0.0008 (0.0032) model time 0.2347 (0.2438) loss 4.3954 (3.4858) grad_norm 2.3352 (2.0697) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][220/1251] eta 0:04:15 lr 0.000935 wd 0.0500 time 0.2474 (0.2481) data time 0.0009 (0.0031) model time 0.2465 (0.2436) loss 4.2089 (3.4910) grad_norm 2.5476 (2.0632) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][230/1251] eta 0:04:12 lr 0.000935 wd 0.0500 time 0.2412 (0.2477) data time 0.0009 (0.0030) model time 0.2403 (0.2434) loss 3.6178 (3.4915) grad_norm 1.6755 (2.0508) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][240/1251] eta 0:04:10 lr 0.000935 wd 0.0500 time 0.2474 (0.2475) data time 0.0008 (0.0030) model time 0.2466 (0.2432) loss 4.3168 (3.4990) grad_norm 1.6033 (2.0405) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][250/1251] eta 0:04:07 lr 0.000935 wd 0.0500 time 0.2394 (0.2472) data time 0.0010 (0.0029) model time 0.2383 (0.2430) loss 3.8209 (3.5001) grad_norm 2.0358 (2.0364) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][260/1251] eta 0:04:06 lr 0.000935 wd 0.0500 time 0.4618 (0.2486) data time 0.0010 (0.0028) model time 0.4608 (0.2449) loss 4.0013 
(3.5035) grad_norm 1.7173 (2.0361) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][270/1251] eta 0:04:03 lr 0.000935 wd 0.0500 time 0.2389 (0.2483) data time 0.0008 (0.0027) model time 0.2381 (0.2446) loss 3.4929 (3.5170) grad_norm 2.1444 (2.0274) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][280/1251] eta 0:04:01 lr 0.000935 wd 0.0500 time 0.2406 (0.2487) data time 0.0007 (0.0027) model time 0.2399 (0.2453) loss 3.1591 (3.5071) grad_norm 1.9131 (2.0201) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][290/1251] eta 0:03:58 lr 0.000935 wd 0.0500 time 0.2399 (0.2485) data time 0.0009 (0.0026) model time 0.2390 (0.2452) loss 3.4600 (3.5061) grad_norm 1.7227 (2.0127) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][300/1251] eta 0:03:56 lr 0.000935 wd 0.0500 time 0.2358 (0.2483) data time 0.0010 (0.0026) model time 0.2347 (0.2450) loss 3.3239 (3.5016) grad_norm 2.2751 (2.0275) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][310/1251] eta 0:03:53 lr 0.000935 wd 0.0500 time 0.2414 (0.2481) data time 0.0012 (0.0025) model time 0.2402 (0.2448) loss 3.7032 (3.4976) grad_norm 1.8646 (2.0262) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][320/1251] eta 0:03:50 lr 0.000935 wd 0.0500 time 0.2400 (0.2480) data time 0.0011 (0.0025) model time 0.2389 (0.2448) loss 2.6080 (3.4939) grad_norm 1.8927 (2.0199) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][330/1251] eta 0:03:48 lr 0.000935 wd 
0.0500 time 0.2418 (0.2478) data time 0.0008 (0.0024) model time 0.2410 (0.2447) loss 3.7935 (3.4893) grad_norm 1.9481 (2.0121) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][340/1251] eta 0:03:45 lr 0.000935 wd 0.0500 time 0.2322 (0.2476) data time 0.0012 (0.0024) model time 0.2310 (0.2445) loss 3.5367 (3.4856) grad_norm 1.9367 (2.0040) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][350/1251] eta 0:03:42 lr 0.000935 wd 0.0500 time 0.2419 (0.2475) data time 0.0011 (0.0024) model time 0.2408 (0.2444) loss 2.9337 (3.4849) grad_norm 2.2208 (2.0005) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][360/1251] eta 0:03:40 lr 0.000935 wd 0.0500 time 0.2412 (0.2474) data time 0.0010 (0.0023) model time 0.2402 (0.2443) loss 2.8420 (3.4806) grad_norm 1.8798 (1.9977) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][370/1251] eta 0:03:37 lr 0.000935 wd 0.0500 time 0.2398 (0.2472) data time 0.0010 (0.0023) model time 0.2388 (0.2442) loss 4.2001 (3.4843) grad_norm 2.2753 (1.9980) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][380/1251] eta 0:03:35 lr 0.000935 wd 0.0500 time 0.2467 (0.2471) data time 0.0013 (0.0022) model time 0.2454 (0.2442) loss 3.5418 (3.4835) grad_norm 1.5357 (1.9953) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][390/1251] eta 0:03:32 lr 0.000935 wd 0.0500 time 0.2430 (0.2471) data time 0.0010 (0.0022) model time 0.2419 (0.2442) loss 4.1242 (3.4773) grad_norm 2.2490 (1.9937) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:51 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][400/1251] eta 0:03:30 lr 0.000935 wd 0.0500 time 0.2393 (0.2475) data time 0.0010 (0.0022) model time 0.2383 (0.2447) loss 3.2836 (3.4727) grad_norm 1.7939 (1.9960) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][410/1251] eta 0:03:27 lr 0.000935 wd 0.0500 time 0.2388 (0.2472) data time 0.0007 (0.0022) model time 0.2380 (0.2445) loss 3.1650 (3.4703) grad_norm 2.1541 (1.9928) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][420/1251] eta 0:03:25 lr 0.000935 wd 0.0500 time 0.2355 (0.2471) data time 0.0007 (0.0021) model time 0.2348 (0.2444) loss 2.2277 (3.4690) grad_norm 2.0299 (1.9998) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][430/1251] eta 0:03:22 lr 0.000935 wd 0.0500 time 0.2453 (0.2470) data time 0.0009 (0.0021) model time 0.2444 (0.2443) loss 3.8515 (3.4758) grad_norm 1.7196 (1.9994) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][440/1251] eta 0:03:20 lr 0.000935 wd 0.0500 time 0.4706 (0.2474) data time 0.0009 (0.0021) model time 0.4697 (0.2447) loss 3.5095 (3.4782) grad_norm 1.7659 (2.0014) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][450/1251] eta 0:03:18 lr 0.000935 wd 0.0500 time 0.2403 (0.2472) data time 0.0009 (0.0021) model time 0.2394 (0.2446) loss 3.9386 (3.4808) grad_norm 2.4549 (1.9994) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][460/1251] eta 0:03:15 lr 0.000935 wd 0.0500 time 0.2369 (0.2472) data time 0.0011 (0.0021) model time 0.2358 (0.2446) loss 2.4130 
(3.4801) grad_norm 1.8912 (1.9970) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][470/1251] eta 0:03:12 lr 0.000934 wd 0.0500 time 0.2429 (0.2470) data time 0.0008 (0.0020) model time 0.2421 (0.2444) loss 4.2952 (3.4862) grad_norm 2.3174 (1.9988) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][480/1251] eta 0:03:10 lr 0.000934 wd 0.0500 time 0.2315 (0.2469) data time 0.0008 (0.0020) model time 0.2307 (0.2444) loss 3.5380 (3.4869) grad_norm 2.6151 (1.9960) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][490/1251] eta 0:03:07 lr 0.000934 wd 0.0500 time 0.2466 (0.2468) data time 0.0008 (0.0020) model time 0.2459 (0.2443) loss 3.9449 (3.4857) grad_norm 2.1314 (1.9936) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][500/1251] eta 0:03:05 lr 0.000934 wd 0.0500 time 0.2420 (0.2467) data time 0.0010 (0.0020) model time 0.2410 (0.2442) loss 3.6190 (3.4859) grad_norm 2.3182 (1.9922) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][510/1251] eta 0:03:02 lr 0.000934 wd 0.0500 time 0.2395 (0.2467) data time 0.0008 (0.0020) model time 0.2387 (0.2441) loss 3.9573 (3.4886) grad_norm 1.7804 (1.9933) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][520/1251] eta 0:03:00 lr 0.000934 wd 0.0500 time 0.2356 (0.2466) data time 0.0010 (0.0020) model time 0.2346 (0.2441) loss 3.2664 (3.4894) grad_norm 2.5608 (1.9984) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][530/1251] eta 0:02:57 lr 0.000934 wd 
0.0500 time 0.2453 (0.2465) data time 0.0007 (0.0020) model time 0.2446 (0.2440) loss 3.2164 (3.4924) grad_norm 2.0240 (1.9960) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][540/1251] eta 0:02:55 lr 0.000934 wd 0.0500 time 0.2385 (0.2464) data time 0.0011 (0.0019) model time 0.2375 (0.2439) loss 3.9930 (3.4984) grad_norm 2.8082 (1.9923) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][550/1251] eta 0:02:52 lr 0.000934 wd 0.0500 time 0.2475 (0.2463) data time 0.0007 (0.0019) model time 0.2468 (0.2439) loss 3.0925 (3.5000) grad_norm 1.5652 (1.9930) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][560/1251] eta 0:02:50 lr 0.000934 wd 0.0500 time 0.2378 (0.2462) data time 0.0009 (0.0019) model time 0.2369 (0.2438) loss 2.9675 (3.5030) grad_norm 2.4847 (1.9986) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][570/1251] eta 0:02:47 lr 0.000934 wd 0.0500 time 0.2439 (0.2462) data time 0.0007 (0.0019) model time 0.2432 (0.2438) loss 3.4723 (3.5030) grad_norm 1.4904 (1.9988) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][580/1251] eta 0:02:45 lr 0.000934 wd 0.0500 time 0.2365 (0.2465) data time 0.0009 (0.0019) model time 0.2355 (0.2441) loss 3.2749 (3.4993) grad_norm 1.5197 (1.9951) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][590/1251] eta 0:02:42 lr 0.000934 wd 0.0500 time 0.2421 (0.2464) data time 0.0010 (0.0019) model time 0.2411 (0.2441) loss 3.1934 (3.5007) grad_norm 1.7789 (1.9957) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][600/1251] eta 0:02:40 lr 0.000934 wd 0.0500 time 0.2420 (0.2463) data time 0.0009 (0.0018) model time 0.2411 (0.2440) loss 4.3431 (3.4991) grad_norm 1.5235 (1.9923) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][610/1251] eta 0:02:37 lr 0.000934 wd 0.0500 time 0.2357 (0.2463) data time 0.0009 (0.0018) model time 0.2349 (0.2440) loss 2.2840 (3.4972) grad_norm 1.5508 (1.9889) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][620/1251] eta 0:02:35 lr 0.000934 wd 0.0500 time 0.2389 (0.2462) data time 0.0012 (0.0018) model time 0.2378 (0.2439) loss 3.7524 (3.4990) grad_norm 2.4162 (1.9878) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][630/1251] eta 0:02:32 lr 0.000934 wd 0.0500 time 0.2476 (0.2462) data time 0.0009 (0.0018) model time 0.2467 (0.2439) loss 3.4175 (3.5011) grad_norm 1.6986 (1.9851) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][640/1251] eta 0:02:30 lr 0.000934 wd 0.0500 time 0.2394 (0.2461) data time 0.0009 (0.0018) model time 0.2385 (0.2439) loss 4.3818 (3.5044) grad_norm 1.9151 (1.9824) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][650/1251] eta 0:02:27 lr 0.000934 wd 0.0500 time 0.2385 (0.2461) data time 0.0010 (0.0018) model time 0.2376 (0.2439) loss 3.8596 (3.5067) grad_norm 1.6140 (1.9842) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][660/1251] eta 0:02:25 lr 0.000934 wd 0.0500 time 0.2388 (0.2460) data time 0.0011 (0.0018) model time 0.2378 (0.2438) loss 2.7375 
(3.5046) grad_norm 2.5971 (1.9850) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][670/1251] eta 0:02:22 lr 0.000934 wd 0.0500 time 0.2443 (0.2460) data time 0.0010 (0.0018) model time 0.2433 (0.2438) loss 3.1457 (3.5033) grad_norm 1.5777 (1.9847) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][680/1251] eta 0:02:20 lr 0.000934 wd 0.0500 time 0.2363 (0.2459) data time 0.0008 (0.0018) model time 0.2355 (0.2437) loss 3.5249 (3.5024) grad_norm 2.4876 (1.9848) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][690/1251] eta 0:02:18 lr 0.000934 wd 0.0500 time 0.4604 (0.2461) data time 0.0009 (0.0017) model time 0.4595 (0.2440) loss 3.7977 (3.5048) grad_norm 2.1736 (1.9892) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][700/1251] eta 0:02:15 lr 0.000934 wd 0.0500 time 0.2385 (0.2467) data time 0.0007 (0.0017) model time 0.2377 (0.2446) loss 3.5841 (3.5018) grad_norm 1.3448 (1.9894) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][710/1251] eta 0:02:13 lr 0.000934 wd 0.0500 time 0.2400 (0.2466) data time 0.0010 (0.0017) model time 0.2390 (0.2445) loss 3.6044 (3.5037) grad_norm 1.8442 (1.9872) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][720/1251] eta 0:02:10 lr 0.000934 wd 0.0500 time 0.2432 (0.2465) data time 0.0007 (0.0017) model time 0.2424 (0.2445) loss 4.1986 (3.5098) grad_norm 1.8542 (1.9885) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][730/1251] eta 0:02:08 lr 0.000934 wd 
0.0500 time 0.2385 (0.2465) data time 0.0009 (0.0017) model time 0.2375 (0.2444) loss 3.1047 (3.5112) grad_norm 1.9019 (1.9892) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][740/1251] eta 0:02:05 lr 0.000934 wd 0.0500 time 0.2401 (0.2464) data time 0.0008 (0.0017) model time 0.2392 (0.2444) loss 3.8781 (3.5108) grad_norm 1.3087 (1.9911) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][750/1251] eta 0:02:03 lr 0.000934 wd 0.0500 time 0.2436 (0.2463) data time 0.0008 (0.0017) model time 0.2427 (0.2443) loss 4.1340 (3.5140) grad_norm 2.3318 (1.9926) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][760/1251] eta 0:02:00 lr 0.000934 wd 0.0500 time 0.2385 (0.2463) data time 0.0009 (0.0017) model time 0.2375 (0.2442) loss 3.3145 (3.5111) grad_norm 1.1316 (1.9893) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][770/1251] eta 0:01:58 lr 0.000934 wd 0.0500 time 0.2536 (0.2462) data time 0.0008 (0.0017) model time 0.2528 (0.2442) loss 4.4581 (3.5129) grad_norm 2.2494 (1.9890) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][780/1251] eta 0:01:56 lr 0.000934 wd 0.0500 time 0.4104 (0.2464) data time 0.0009 (0.0017) model time 0.4095 (0.2444) loss 3.0227 (3.5101) grad_norm 1.6488 (1.9882) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][790/1251] eta 0:01:53 lr 0.000934 wd 0.0500 time 0.2419 (0.2464) data time 0.0011 (0.0017) model time 0.2408 (0.2444) loss 2.0453 (3.5073) grad_norm 2.2402 (1.9882) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][800/1251] eta 0:01:51 lr 0.000934 wd 0.0500 time 0.2373 (0.2465) data time 0.0009 (0.0016) model time 0.2364 (0.2445) loss 4.5157 (3.5112) grad_norm 1.6089 (1.9885) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][810/1251] eta 0:01:48 lr 0.000934 wd 0.0500 time 0.2322 (0.2464) data time 0.0009 (0.0016) model time 0.2313 (0.2444) loss 2.8064 (3.5099) grad_norm 1.8892 (1.9843) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][820/1251] eta 0:01:46 lr 0.000934 wd 0.0500 time 0.2420 (0.2463) data time 0.0008 (0.0016) model time 0.2412 (0.2444) loss 3.2564 (3.5068) grad_norm 2.1199 (1.9860) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][830/1251] eta 0:01:43 lr 0.000934 wd 0.0500 time 0.2469 (0.2463) data time 0.0008 (0.0016) model time 0.2461 (0.2444) loss 4.4109 (3.5096) grad_norm 2.5012 (1.9848) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][840/1251] eta 0:01:41 lr 0.000934 wd 0.0500 time 0.2405 (0.2463) data time 0.0008 (0.0016) model time 0.2397 (0.2444) loss 2.9312 (3.5111) grad_norm 2.1770 (1.9867) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][850/1251] eta 0:01:38 lr 0.000934 wd 0.0500 time 0.2479 (0.2462) data time 0.0010 (0.0016) model time 0.2469 (0.2443) loss 2.4702 (3.5097) grad_norm 1.7037 (1.9959) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][860/1251] eta 0:01:36 lr 0.000934 wd 0.0500 time 0.2329 (0.2462) data time 0.0008 (0.0016) model time 0.2321 (0.2443) loss 3.0094 
(3.5081) grad_norm 2.0310 (1.9930) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][870/1251] eta 0:01:33 lr 0.000934 wd 0.0500 time 0.2376 (0.2461) data time 0.0007 (0.0016) model time 0.2368 (0.2442) loss 4.1334 (3.5094) grad_norm 1.3194 (1.9980) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][880/1251] eta 0:01:31 lr 0.000934 wd 0.0500 time 0.2393 (0.2460) data time 0.0009 (0.0016) model time 0.2384 (0.2441) loss 4.5245 (3.5122) grad_norm 3.0684 (2.0018) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][890/1251] eta 0:01:28 lr 0.000934 wd 0.0500 time 0.2434 (0.2460) data time 0.0010 (0.0016) model time 0.2424 (0.2441) loss 2.8352 (3.5125) grad_norm 2.2242 (2.0033) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][900/1251] eta 0:01:26 lr 0.000934 wd 0.0500 time 0.2391 (0.2459) data time 0.0011 (0.0016) model time 0.2381 (0.2441) loss 3.7094 (3.5123) grad_norm 1.7892 (2.0021) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][910/1251] eta 0:01:23 lr 0.000934 wd 0.0500 time 0.2495 (0.2459) data time 0.0007 (0.0016) model time 0.2489 (0.2440) loss 3.8052 (3.5101) grad_norm 1.9600 (2.0018) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][920/1251] eta 0:01:21 lr 0.000933 wd 0.0500 time 0.2394 (0.2459) data time 0.0010 (0.0016) model time 0.2384 (0.2440) loss 3.6840 (3.5117) grad_norm 5.2841 (2.0074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][930/1251] eta 0:01:18 lr 0.000933 wd 
0.0500 time 0.2424 (0.2459) data time 0.0009 (0.0016) model time 0.2415 (0.2440) loss 3.3122 (3.5108) grad_norm 1.7292 (2.0077) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][940/1251] eta 0:01:16 lr 0.000933 wd 0.0500 time 0.2441 (0.2458) data time 0.0010 (0.0016) model time 0.2431 (0.2440) loss 3.4692 (3.5124) grad_norm 2.1502 (2.0068) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][950/1251] eta 0:01:14 lr 0.000933 wd 0.0500 time 0.2440 (0.2460) data time 0.0007 (0.0015) model time 0.2432 (0.2442) loss 4.1757 (3.5102) grad_norm 1.5302 (2.0040) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][960/1251] eta 0:01:11 lr 0.000933 wd 0.0500 time 0.2399 (0.2459) data time 0.0010 (0.0015) model time 0.2389 (0.2441) loss 3.8146 (3.5110) grad_norm 1.8468 (2.0044) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][970/1251] eta 0:01:09 lr 0.000933 wd 0.0500 time 0.2417 (0.2459) data time 0.0007 (0.0015) model time 0.2409 (0.2441) loss 4.3304 (3.5127) grad_norm 1.7724 (2.0017) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][980/1251] eta 0:01:06 lr 0.000933 wd 0.0500 time 0.2436 (0.2461) data time 0.0007 (0.0015) model time 0.2429 (0.2443) loss 3.6520 (3.5154) grad_norm 1.3547 (2.0005) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][990/1251] eta 0:01:04 lr 0.000933 wd 0.0500 time 0.2377 (0.2460) data time 0.0010 (0.0015) model time 0.2367 (0.2442) loss 4.2594 (3.5165) grad_norm 1.5512 (2.0000) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1000/1251] eta 0:01:01 lr 0.000933 wd 0.0500 time 0.2436 (0.2459) data time 0.0008 (0.0015) model time 0.2428 (0.2442) loss 3.8613 (3.5177) grad_norm 1.9219 (1.9978) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1010/1251] eta 0:00:59 lr 0.000933 wd 0.0500 time 0.2364 (0.2459) data time 0.0009 (0.0015) model time 0.2354 (0.2441) loss 3.2715 (3.5158) grad_norm 2.0767 (1.9969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1020/1251] eta 0:00:56 lr 0.000933 wd 0.0500 time 0.2383 (0.2458) data time 0.0009 (0.0015) model time 0.2375 (0.2441) loss 3.8492 (3.5140) grad_norm 2.0441 (1.9972) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1030/1251] eta 0:00:54 lr 0.000933 wd 0.0500 time 0.2369 (0.2458) data time 0.0011 (0.0015) model time 0.2359 (0.2440) loss 3.3521 (3.5128) grad_norm 2.7766 (1.9978) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1040/1251] eta 0:00:51 lr 0.000933 wd 0.0500 time 0.2494 (0.2458) data time 0.0009 (0.0015) model time 0.2485 (0.2440) loss 3.6576 (3.5112) grad_norm 1.8868 (1.9981) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1050/1251] eta 0:00:49 lr 0.000933 wd 0.0500 time 0.2381 (0.2457) data time 0.0008 (0.0015) model time 0.2374 (0.2440) loss 3.5488 (3.5087) grad_norm 1.9076 (1.9966) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1060/1251] eta 0:00:46 lr 0.000933 wd 0.0500 time 0.2292 (0.2457) data time 0.0010 (0.0015) model time 0.2282 (0.2439) loss 3.3229 
(3.5076) grad_norm 1.3616 (1.9977) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1070/1251] eta 0:00:44 lr 0.000933 wd 0.0500 time 0.2421 (0.2457) data time 0.0011 (0.0015) model time 0.2410 (0.2439) loss 2.4973 (3.5099) grad_norm 1.5025 (1.9983) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1080/1251] eta 0:00:42 lr 0.000933 wd 0.0500 time 0.2456 (0.2456) data time 0.0010 (0.0015) model time 0.2446 (0.2439) loss 4.2200 (3.5109) grad_norm 1.7816 (1.9963) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1090/1251] eta 0:00:39 lr 0.000933 wd 0.0500 time 0.2358 (0.2456) data time 0.0008 (0.0015) model time 0.2350 (0.2439) loss 3.5576 (3.5105) grad_norm 2.3169 (1.9942) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1100/1251] eta 0:00:37 lr 0.000933 wd 0.0500 time 0.2469 (0.2458) data time 0.0010 (0.0015) model time 0.2459 (0.2441) loss 3.1451 (3.5100) grad_norm 1.8983 (1.9969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1110/1251] eta 0:00:34 lr 0.000933 wd 0.0500 time 0.2409 (0.2457) data time 0.0011 (0.0015) model time 0.2398 (0.2440) loss 3.6127 (3.5116) grad_norm 1.6582 (1.9948) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1120/1251] eta 0:00:32 lr 0.000933 wd 0.0500 time 0.2427 (0.2457) data time 0.0009 (0.0015) model time 0.2418 (0.2440) loss 3.7855 (3.5148) grad_norm 1.5091 (1.9921) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1130/1251] eta 0:00:29 lr 
0.000933 wd 0.0500 time 0.2374 (0.2456) data time 0.0010 (0.0015) model time 0.2364 (0.2439) loss 3.3346 (3.5145) grad_norm 1.5572 (1.9892) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1140/1251] eta 0:00:27 lr 0.000933 wd 0.0500 time 0.2422 (0.2456) data time 0.0008 (0.0015) model time 0.2415 (0.2439) loss 3.0735 (3.5160) grad_norm 1.5895 (1.9881) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1150/1251] eta 0:00:24 lr 0.000933 wd 0.0500 time 0.2480 (0.2456) data time 0.0011 (0.0015) model time 0.2469 (0.2439) loss 3.6088 (3.5175) grad_norm 1.4716 (1.9886) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1160/1251] eta 0:00:22 lr 0.000933 wd 0.0500 time 0.2461 (0.2456) data time 0.0010 (0.0015) model time 0.2451 (0.2439) loss 3.7228 (3.5180) grad_norm 1.7410 (1.9888) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1170/1251] eta 0:00:19 lr 0.000933 wd 0.0500 time 0.2443 (0.2455) data time 0.0009 (0.0014) model time 0.2434 (0.2439) loss 2.8788 (3.5178) grad_norm 2.2798 (1.9887) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1180/1251] eta 0:00:17 lr 0.000933 wd 0.0500 time 0.2391 (0.2455) data time 0.0008 (0.0014) model time 0.2384 (0.2438) loss 3.9272 (3.5184) grad_norm 2.0440 (1.9907) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1190/1251] eta 0:00:14 lr 0.000933 wd 0.0500 time 0.2444 (0.2456) data time 0.0010 (0.0014) model time 0.2433 (0.2440) loss 3.5944 (3.5190) grad_norm 1.7840 (1.9922) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
07:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1200/1251] eta 0:00:12 lr 0.000933 wd 0.0500 time 0.2367 (0.2456) data time 0.0011 (0.0014) model time 0.2356 (0.2439) loss 3.8091 (3.5184) grad_norm 1.9950 (1.9919) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1210/1251] eta 0:00:10 lr 0.000933 wd 0.0500 time 0.2463 (0.2456) data time 0.0012 (0.0014) model time 0.2451 (0.2439) loss 3.6482 (3.5200) grad_norm 1.7376 (1.9933) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1220/1251] eta 0:00:07 lr 0.000933 wd 0.0500 time 0.2397 (0.2456) data time 0.0009 (0.0014) model time 0.2389 (0.2439) loss 2.7676 (3.5176) grad_norm 1.9884 (1.9944) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1230/1251] eta 0:00:05 lr 0.000933 wd 0.0500 time 0.2427 (0.2456) data time 0.0011 (0.0014) model time 0.2417 (0.2439) loss 3.6904 (3.5175) grad_norm 2.0133 (1.9950) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1240/1251] eta 0:00:02 lr 0.000933 wd 0.0500 time 0.2271 (0.2455) data time 0.0007 (0.0014) model time 0.2264 (0.2438) loss 2.8447 (3.5156) grad_norm 2.3115 (1.9948) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [66/300][1250/1251] eta 0:00:00 lr 0.000933 wd 0.0500 time 0.2241 (0.2453) data time 0.0005 (0.0014) model time 0.2236 (0.2437) loss 4.2142 (3.5173) grad_norm 3.7883 (1.9960) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 66 training takes 0:05:06 [2024-08-26 07:59:19 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 07:59:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 07:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.440 (0.440) Loss 0.5762 (0.5762) Acc@1 90.039 (90.039) Acc@5 97.656 (97.656) Mem 7379MB [2024-08-26 07:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.114) Loss 0.9814 (0.8669) Acc@1 77.930 (80.762) Acc@5 94.531 (95.872) Mem 7379MB [2024-08-26 07:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.097) Loss 1.2012 (0.8811) Acc@1 71.777 (80.115) Acc@5 91.016 (95.833) Mem 7379MB [2024-08-26 07:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.091) Loss 1.4570 (0.9978) Acc@1 64.355 (77.482) Acc@5 87.793 (94.182) Mem 7379MB [2024-08-26 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.063 (0.085) Loss 1.3340 (1.0654) Acc@1 68.066 (75.900) Acc@5 89.844 (93.350) Mem 7379MB [2024-08-26 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.684 Acc@5 93.270 [2024-08-26 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.7% [2024-08-26 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 75.68% [2024-08-26 07:59:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 07:59:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 07:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.445 (0.445) Loss 0.4690 (0.4690) Acc@1 91.016 (91.016) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 07:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.109) Loss 0.7681 (0.7399) Acc@1 84.766 (83.860) Acc@5 96.191 (96.715) Mem 7379MB [2024-08-26 07:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.095) Loss 1.0674 (0.7604) Acc@1 74.609 (82.868) Acc@5 92.969 (96.670) Mem 7379MB [2024-08-26 07:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.3311 (0.8703) Acc@1 65.430 (80.248) Acc@5 90.039 (95.275) Mem 7379MB [2024-08-26 07:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.2383 (0.9294) Acc@1 70.117 (78.773) Acc@5 91.309 (94.638) Mem 7379MB [2024-08-26 07:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.410 Acc@5 94.558 [2024-08-26 07:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.4% [2024-08-26 07:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.41% [2024-08-26 07:59:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 07:59:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 07:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][0/1251] eta 0:15:07 lr 0.000933 wd 0.0500 time 0.7258 (0.7258) data time 0.5061 (0.5061) model time 0.0000 (0.0000) loss 3.6778 (3.6778) grad_norm 1.8922 (1.8922) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][10/1251] eta 0:05:54 lr 0.000933 wd 0.0500 time 0.2470 (0.2853) data time 0.0010 (0.0469) model time 0.0000 (0.0000) loss 3.7396 (3.7106) grad_norm 2.7903 (2.1125) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][20/1251] eta 0:05:25 lr 0.000933 wd 0.0500 time 0.2444 (0.2642) data time 0.0010 (0.0250) model time 0.0000 (0.0000) loss 3.4874 (3.6050) grad_norm 1.9075 (2.0373) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][30/1251] eta 0:05:13 lr 0.000933 wd 0.0500 time 0.2425 (0.2569) data time 0.0011 (0.0173) model time 0.0000 (0.0000) loss 2.5665 (3.6271) grad_norm 2.4080 (2.1261) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][40/1251] eta 0:05:06 lr 0.000933 wd 0.0500 time 0.2455 (0.2531) data time 0.0011 (0.0133) model time 0.0000 (0.0000) loss 3.8794 (3.6578) grad_norm 1.7586 (2.1220) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][50/1251] eta 0:05:06 lr 0.000933 wd 0.0500 time 0.2482 (0.2551) data time 0.0010 (0.0110) model time 0.0000 (0.0000) loss 2.7826 (3.5832) grad_norm 2.3302 (2.0727) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 07:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][60/1251] eta 0:05:04 lr 0.000933 wd 0.0500 time 0.2381 (0.2559) data time 0.0008 (0.0094) model time 0.2373 (0.2591) loss 
4.4261 (3.5594) grad_norm 1.4943 (2.0876) loss_scale 8192.0000 (4633.1803) mem 7379MB [2024-08-26 07:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][70/1251] eta 0:05:01 lr 0.000933 wd 0.0500 time 0.2444 (0.2557) data time 0.0011 (0.0082) model time 0.2433 (0.2563) loss 3.1033 (3.5881) grad_norm 2.1905 (2.0979) loss_scale 8192.0000 (5134.4225) mem 7379MB [2024-08-26 07:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][80/1251] eta 0:04:57 lr 0.000933 wd 0.0500 time 0.2332 (0.2537) data time 0.0009 (0.0073) model time 0.2322 (0.2504) loss 3.7547 (3.5522) grad_norm 1.5140 (2.0706) loss_scale 8192.0000 (5511.9012) mem 7379MB [2024-08-26 07:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][90/1251] eta 0:04:53 lr 0.000933 wd 0.0500 time 0.2360 (0.2525) data time 0.0011 (0.0066) model time 0.2349 (0.2482) loss 3.2232 (3.5535) grad_norm 1.3930 (2.0601) loss_scale 8192.0000 (5806.4176) mem 7379MB [2024-08-26 07:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][100/1251] eta 0:04:49 lr 0.000933 wd 0.0500 time 0.2392 (0.2514) data time 0.0011 (0.0061) model time 0.2380 (0.2466) loss 3.6114 (3.5439) grad_norm 1.7267 (2.0364) loss_scale 8192.0000 (6042.6139) mem 7379MB [2024-08-26 07:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][110/1251] eta 0:04:45 lr 0.000933 wd 0.0500 time 0.2368 (0.2506) data time 0.0010 (0.0056) model time 0.2358 (0.2459) loss 3.3453 (3.5659) grad_norm 1.6928 (2.0130) loss_scale 8192.0000 (6236.2523) mem 7379MB [2024-08-26 07:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][120/1251] eta 0:04:42 lr 0.000932 wd 0.0500 time 0.2363 (0.2501) data time 0.0011 (0.0052) model time 0.2352 (0.2454) loss 3.0408 (3.5348) grad_norm 1.7203 (2.0170) loss_scale 8192.0000 (6397.8843) mem 7379MB [2024-08-26 08:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][130/1251] eta 0:04:40 lr 
0.000932 wd 0.0500 time 0.2485 (0.2498) data time 0.0010 (0.0049) model time 0.2476 (0.2454) loss 3.5757 (3.5339) grad_norm 2.1998 (2.0217) loss_scale 8192.0000 (6534.8397) mem 7379MB [2024-08-26 08:00:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][140/1251] eta 0:04:38 lr 0.000932 wd 0.0500 time 0.2363 (0.2506) data time 0.0012 (0.0046) model time 0.2351 (0.2470) loss 2.4356 (3.5541) grad_norm 1.5903 (2.0300) loss_scale 8192.0000 (6652.3688) mem 7379MB [2024-08-26 08:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][150/1251] eta 0:04:35 lr 0.000932 wd 0.0500 time 0.2466 (0.2501) data time 0.0007 (0.0044) model time 0.2459 (0.2465) loss 3.0867 (3.5432) grad_norm 2.3852 (2.0260) loss_scale 8192.0000 (6754.3311) mem 7379MB [2024-08-26 08:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][160/1251] eta 0:04:32 lr 0.000932 wd 0.0500 time 0.2373 (0.2497) data time 0.0010 (0.0042) model time 0.2363 (0.2462) loss 3.2521 (3.5282) grad_norm 2.0266 (2.0111) loss_scale 8192.0000 (6843.6273) mem 7379MB [2024-08-26 08:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][170/1251] eta 0:04:29 lr 0.000932 wd 0.0500 time 0.2422 (0.2493) data time 0.0010 (0.0040) model time 0.2412 (0.2458) loss 3.8444 (3.5045) grad_norm 2.0587 (2.0140) loss_scale 8192.0000 (6922.4795) mem 7379MB [2024-08-26 08:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][180/1251] eta 0:04:26 lr 0.000932 wd 0.0500 time 0.2399 (0.2488) data time 0.0011 (0.0038) model time 0.2389 (0.2453) loss 3.4801 (3.5020) grad_norm 1.4686 (2.0160) loss_scale 8192.0000 (6992.6188) mem 7379MB [2024-08-26 08:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][190/1251] eta 0:04:23 lr 0.000932 wd 0.0500 time 0.2330 (0.2484) data time 0.0010 (0.0037) model time 0.2320 (0.2448) loss 4.1271 (3.5007) grad_norm 1.5429 (2.0137) loss_scale 8192.0000 (7055.4136) mem 7379MB [2024-08-26 08:00:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][200/1251] eta 0:04:20 lr 0.000932 wd 0.0500 time 0.2364 (0.2480) data time 0.0011 (0.0036) model time 0.2353 (0.2445) loss 3.7517 (3.4907) grad_norm 1.7888 (2.0037) loss_scale 8192.0000 (7111.9602) mem 7379MB [2024-08-26 08:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][210/1251] eta 0:04:19 lr 0.000932 wd 0.0500 time 0.4279 (0.2495) data time 0.0013 (0.0035) model time 0.4266 (0.2467) loss 3.4037 (3.4739) grad_norm 1.8542 (1.9951) loss_scale 8192.0000 (7163.1469) mem 7379MB [2024-08-26 08:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][220/1251] eta 0:04:17 lr 0.000932 wd 0.0500 time 0.2469 (0.2502) data time 0.0010 (0.0034) model time 0.2459 (0.2477) loss 2.5653 (3.4658) grad_norm 1.5153 (1.9907) loss_scale 8192.0000 (7209.7014) mem 7379MB [2024-08-26 08:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][230/1251] eta 0:04:15 lr 0.000932 wd 0.0500 time 0.2462 (0.2499) data time 0.0009 (0.0033) model time 0.2453 (0.2473) loss 3.6904 (3.4723) grad_norm 2.5192 (2.0142) loss_scale 8192.0000 (7252.2251) mem 7379MB [2024-08-26 08:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][240/1251] eta 0:04:12 lr 0.000932 wd 0.0500 time 0.2422 (0.2495) data time 0.0007 (0.0032) model time 0.2414 (0.2470) loss 2.5906 (3.4644) grad_norm 1.7556 (2.0077) loss_scale 8192.0000 (7291.2199) mem 7379MB [2024-08-26 08:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][250/1251] eta 0:04:09 lr 0.000932 wd 0.0500 time 0.2421 (0.2492) data time 0.0012 (0.0031) model time 0.2408 (0.2466) loss 3.5852 (3.4567) grad_norm 1.8461 (2.0021) loss_scale 8192.0000 (7327.1076) mem 7379MB [2024-08-26 08:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][260/1251] eta 0:04:06 lr 0.000932 wd 0.0500 time 0.2428 (0.2490) data time 0.0011 (0.0030) model time 0.2417 (0.2464) loss 3.3756 
(3.4646) grad_norm 1.8240 (2.0004) loss_scale 8192.0000 (7360.2452) mem 7379MB [2024-08-26 08:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][270/1251] eta 0:04:03 lr 0.000932 wd 0.0500 time 0.2363 (0.2487) data time 0.0010 (0.0029) model time 0.2353 (0.2462) loss 3.8517 (3.4690) grad_norm 1.6942 (2.0006) loss_scale 8192.0000 (7390.9373) mem 7379MB [2024-08-26 08:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][280/1251] eta 0:04:01 lr 0.000932 wd 0.0500 time 0.2433 (0.2485) data time 0.0008 (0.0029) model time 0.2425 (0.2460) loss 4.3416 (3.4643) grad_norm 1.4585 (1.9981) loss_scale 8192.0000 (7419.4448) mem 7379MB [2024-08-26 08:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][290/1251] eta 0:03:58 lr 0.000932 wd 0.0500 time 0.2408 (0.2483) data time 0.0011 (0.0028) model time 0.2397 (0.2458) loss 3.9498 (3.4585) grad_norm 1.7472 (1.9983) loss_scale 8192.0000 (7445.9931) mem 7379MB [2024-08-26 08:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][300/1251] eta 0:03:55 lr 0.000932 wd 0.0500 time 0.2486 (0.2481) data time 0.0008 (0.0028) model time 0.2478 (0.2455) loss 3.2891 (3.4558) grad_norm 2.4053 (2.0258) loss_scale 8192.0000 (7470.7774) mem 7379MB [2024-08-26 08:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][310/1251] eta 0:03:53 lr 0.000932 wd 0.0500 time 0.2491 (0.2479) data time 0.0008 (0.0028) model time 0.2483 (0.2454) loss 4.1985 (3.4555) grad_norm 2.0901 (2.0269) loss_scale 8192.0000 (7493.9678) mem 7379MB [2024-08-26 08:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][320/1251] eta 0:03:50 lr 0.000932 wd 0.0500 time 0.2466 (0.2478) data time 0.0009 (0.0027) model time 0.2456 (0.2452) loss 4.0407 (3.4711) grad_norm 2.0671 (2.0198) loss_scale 8192.0000 (7515.7134) mem 7379MB [2024-08-26 08:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][330/1251] eta 0:03:48 lr 0.000932 wd 
0.0500 time 0.2457 (0.2476) data time 0.0009 (0.0027) model time 0.2447 (0.2451) loss 2.4147 (3.4699) grad_norm 2.8266 (2.0265) loss_scale 8192.0000 (7536.1450) mem 7379MB [2024-08-26 08:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][340/1251] eta 0:03:45 lr 0.000932 wd 0.0500 time 0.2397 (0.2475) data time 0.0009 (0.0026) model time 0.2388 (0.2451) loss 4.3673 (3.4715) grad_norm 2.3578 (2.0202) loss_scale 8192.0000 (7555.3783) mem 7379MB [2024-08-26 08:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][350/1251] eta 0:03:42 lr 0.000932 wd 0.0500 time 0.2453 (0.2474) data time 0.0009 (0.0026) model time 0.2444 (0.2449) loss 2.6379 (3.4671) grad_norm 1.7691 (2.0343) loss_scale 8192.0000 (7573.5157) mem 7379MB [2024-08-26 08:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][360/1251] eta 0:03:40 lr 0.000932 wd 0.0500 time 0.4165 (0.2477) data time 0.0009 (0.0025) model time 0.4156 (0.2453) loss 3.6507 (3.4682) grad_norm 1.1817 (2.0361) loss_scale 8192.0000 (7590.6482) mem 7379MB [2024-08-26 08:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][370/1251] eta 0:03:38 lr 0.000932 wd 0.0500 time 0.2363 (0.2476) data time 0.0007 (0.0025) model time 0.2356 (0.2453) loss 3.7639 (3.4669) grad_norm 1.9972 (2.0372) loss_scale 8192.0000 (7606.8571) mem 7379MB [2024-08-26 08:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][380/1251] eta 0:03:35 lr 0.000932 wd 0.0500 time 0.2442 (0.2475) data time 0.0009 (0.0025) model time 0.2433 (0.2451) loss 3.4459 (3.4730) grad_norm 1.5288 (2.0384) loss_scale 8192.0000 (7622.2152) mem 7379MB [2024-08-26 08:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][390/1251] eta 0:03:32 lr 0.000932 wd 0.0500 time 0.2517 (0.2473) data time 0.0009 (0.0024) model time 0.2508 (0.2450) loss 3.6475 (3.4697) grad_norm 1.5858 (2.0316) loss_scale 8192.0000 (7636.7877) mem 7379MB [2024-08-26 08:01:08 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][400/1251] eta 0:03:30 lr 0.000932 wd 0.0500 time 0.2415 (0.2476) data time 0.0011 (0.0024) model time 0.2404 (0.2454) loss 3.6390 (3.4687) grad_norm 1.6514 (2.0229) loss_scale 8192.0000 (7650.6334) mem 7379MB [2024-08-26 08:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][410/1251] eta 0:03:28 lr 0.000932 wd 0.0500 time 0.2413 (0.2475) data time 0.0009 (0.0023) model time 0.2403 (0.2453) loss 3.3868 (3.4755) grad_norm 1.8181 (2.0145) loss_scale 8192.0000 (7663.8054) mem 7379MB [2024-08-26 08:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][420/1251] eta 0:03:25 lr 0.000932 wd 0.0500 time 0.2439 (0.2473) data time 0.0010 (0.0023) model time 0.2429 (0.2451) loss 3.5160 (3.4725) grad_norm 1.9330 (2.0107) loss_scale 8192.0000 (7676.3515) mem 7379MB [2024-08-26 08:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][430/1251] eta 0:03:22 lr 0.000932 wd 0.0500 time 0.2369 (0.2473) data time 0.0011 (0.0023) model time 0.2358 (0.2451) loss 3.7102 (3.4686) grad_norm 2.0245 (2.0092) loss_scale 8192.0000 (7688.3155) mem 7379MB [2024-08-26 08:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][440/1251] eta 0:03:20 lr 0.000932 wd 0.0500 time 0.2413 (0.2471) data time 0.0009 (0.0023) model time 0.2404 (0.2450) loss 3.7091 (3.4653) grad_norm 3.1213 (2.0142) loss_scale 8192.0000 (7699.7370) mem 7379MB [2024-08-26 08:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][450/1251] eta 0:03:17 lr 0.000932 wd 0.0500 time 0.2384 (0.2470) data time 0.0008 (0.0022) model time 0.2376 (0.2448) loss 4.0510 (3.4665) grad_norm 1.8914 (2.0162) loss_scale 8192.0000 (7710.6519) mem 7379MB [2024-08-26 08:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][460/1251] eta 0:03:15 lr 0.000932 wd 0.0500 time 0.2461 (0.2469) data time 0.0007 (0.0022) model time 0.2455 (0.2447) loss 4.0701 
(3.4722) grad_norm 2.1476 (2.0120) loss_scale 8192.0000 (7721.0933) mem 7379MB [2024-08-26 08:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][470/1251] eta 0:03:12 lr 0.000932 wd 0.0500 time 0.2460 (0.2467) data time 0.0007 (0.0022) model time 0.2453 (0.2446) loss 4.2384 (3.4745) grad_norm 1.9517 (2.0082) loss_scale 8192.0000 (7731.0913) mem 7379MB [2024-08-26 08:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][480/1251] eta 0:03:10 lr 0.000932 wd 0.0500 time 0.2394 (0.2466) data time 0.0007 (0.0022) model time 0.2387 (0.2445) loss 2.4030 (3.4776) grad_norm 1.9834 (2.0036) loss_scale 8192.0000 (7740.6736) mem 7379MB [2024-08-26 08:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][490/1251] eta 0:03:07 lr 0.000932 wd 0.0500 time 0.2486 (0.2466) data time 0.0009 (0.0021) model time 0.2476 (0.2445) loss 2.9771 (3.4811) grad_norm 1.4516 (2.0030) loss_scale 8192.0000 (7749.8656) mem 7379MB [2024-08-26 08:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][500/1251] eta 0:03:05 lr 0.000932 wd 0.0500 time 0.2412 (0.2465) data time 0.0009 (0.0021) model time 0.2403 (0.2445) loss 3.8492 (3.4815) grad_norm 1.4431 (2.0009) loss_scale 8192.0000 (7758.6906) mem 7379MB [2024-08-26 08:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][510/1251] eta 0:03:02 lr 0.000932 wd 0.0500 time 0.2401 (0.2464) data time 0.0015 (0.0021) model time 0.2386 (0.2444) loss 3.6843 (3.4897) grad_norm 2.3764 (2.0041) loss_scale 8192.0000 (7767.1703) mem 7379MB [2024-08-26 08:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][520/1251] eta 0:03:00 lr 0.000932 wd 0.0500 time 0.2437 (0.2463) data time 0.0010 (0.0021) model time 0.2427 (0.2443) loss 3.2576 (3.4841) grad_norm 2.8310 (2.0089) loss_scale 8192.0000 (7775.3244) mem 7379MB [2024-08-26 08:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][530/1251] eta 0:02:57 lr 0.000932 wd 
0.0500 time 0.2416 (0.2462) data time 0.0011 (0.0021) model time 0.2405 (0.2442) loss 3.4518 (3.4881) grad_norm 1.3754 (2.0039) loss_scale 8192.0000 (7783.1714) mem 7379MB [2024-08-26 08:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][540/1251] eta 0:02:55 lr 0.000932 wd 0.0500 time 0.2322 (0.2462) data time 0.0009 (0.0020) model time 0.2312 (0.2441) loss 3.4931 (3.4879) grad_norm 1.6524 (1.9998) loss_scale 8192.0000 (7790.7283) mem 7379MB [2024-08-26 08:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][550/1251] eta 0:02:52 lr 0.000932 wd 0.0500 time 0.2393 (0.2461) data time 0.0009 (0.0020) model time 0.2384 (0.2441) loss 4.2180 (3.4908) grad_norm 1.6937 (1.9992) loss_scale 8192.0000 (7798.0109) mem 7379MB [2024-08-26 08:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][560/1251] eta 0:02:49 lr 0.000931 wd 0.0500 time 0.2459 (0.2460) data time 0.0011 (0.0020) model time 0.2447 (0.2440) loss 3.7109 (3.4943) grad_norm 2.0060 (1.9943) loss_scale 8192.0000 (7805.0339) mem 7379MB [2024-08-26 08:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][570/1251] eta 0:02:47 lr 0.000931 wd 0.0500 time 0.2406 (0.2459) data time 0.0007 (0.0020) model time 0.2399 (0.2440) loss 3.6069 (3.5013) grad_norm 1.6716 (1.9912) loss_scale 8192.0000 (7811.8109) mem 7379MB [2024-08-26 08:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][580/1251] eta 0:02:45 lr 0.000931 wd 0.0500 time 0.2363 (0.2463) data time 0.0009 (0.0020) model time 0.2355 (0.2443) loss 2.6258 (3.4971) grad_norm 3.8803 (1.9906) loss_scale 8192.0000 (7818.3546) mem 7379MB [2024-08-26 08:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][590/1251] eta 0:02:42 lr 0.000931 wd 0.0500 time 0.2435 (0.2462) data time 0.0010 (0.0019) model time 0.2425 (0.2443) loss 3.9190 (3.4993) grad_norm 2.1943 (1.9966) loss_scale 8192.0000 (7824.6768) mem 7379MB [2024-08-26 08:01:57 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][600/1251] eta 0:02:40 lr 0.000931 wd 0.0500 time 0.2496 (0.2465) data time 0.0009 (0.0019) model time 0.2487 (0.2446) loss 3.8471 (3.5006) grad_norm 1.5458 (1.9926) loss_scale 8192.0000 (7830.7887) mem 7379MB [2024-08-26 08:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][610/1251] eta 0:02:37 lr 0.000931 wd 0.0500 time 0.2368 (0.2464) data time 0.0010 (0.0019) model time 0.2357 (0.2445) loss 2.8273 (3.5011) grad_norm 2.0878 (1.9921) loss_scale 8192.0000 (7836.7005) mem 7379MB [2024-08-26 08:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][620/1251] eta 0:02:35 lr 0.000931 wd 0.0500 time 0.2478 (0.2463) data time 0.0010 (0.0019) model time 0.2468 (0.2444) loss 3.7762 (3.5036) grad_norm 3.0609 (1.9951) loss_scale 8192.0000 (7842.4219) mem 7379MB [2024-08-26 08:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][630/1251] eta 0:02:32 lr 0.000931 wd 0.0500 time 0.2450 (0.2463) data time 0.0010 (0.0019) model time 0.2440 (0.2444) loss 3.7523 (3.5035) grad_norm 1.5300 (1.9966) loss_scale 8192.0000 (7847.9620) mem 7379MB [2024-08-26 08:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][640/1251] eta 0:02:30 lr 0.000931 wd 0.0500 time 0.2356 (0.2462) data time 0.0010 (0.0019) model time 0.2345 (0.2444) loss 3.8789 (3.5083) grad_norm 2.0338 (1.9963) loss_scale 8192.0000 (7853.3292) mem 7379MB [2024-08-26 08:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][650/1251] eta 0:02:27 lr 0.000931 wd 0.0500 time 0.2431 (0.2461) data time 0.0010 (0.0019) model time 0.2421 (0.2443) loss 3.7960 (3.5103) grad_norm 3.2795 (1.9992) loss_scale 8192.0000 (7858.5315) mem 7379MB [2024-08-26 08:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][660/1251] eta 0:02:25 lr 0.000931 wd 0.0500 time 0.2405 (0.2463) data time 0.0011 (0.0018) model time 0.2394 (0.2445) loss 3.8601 
(3.5118) grad_norm 4.2460 (2.0042) loss_scale 8192.0000 (7863.5764) mem 7379MB [2024-08-26 08:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][670/1251] eta 0:02:23 lr 0.000931 wd 0.0500 time 0.2454 (0.2462) data time 0.0010 (0.0018) model time 0.2443 (0.2444) loss 3.8344 (3.5128) grad_norm 1.7436 (2.0052) loss_scale 8192.0000 (7868.4709) mem 7379MB [2024-08-26 08:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][680/1251] eta 0:02:20 lr 0.000931 wd 0.0500 time 0.2349 (0.2462) data time 0.0009 (0.0018) model time 0.2341 (0.2444) loss 4.0217 (3.5110) grad_norm 1.4441 (2.0071) loss_scale 8192.0000 (7873.2217) mem 7379MB [2024-08-26 08:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][690/1251] eta 0:02:18 lr 0.000931 wd 0.0500 time 0.2487 (0.2461) data time 0.0007 (0.0018) model time 0.2479 (0.2443) loss 3.9709 (3.5120) grad_norm 2.0173 (2.0042) loss_scale 8192.0000 (7877.8350) mem 7379MB [2024-08-26 08:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][700/1251] eta 0:02:15 lr 0.000931 wd 0.0500 time 0.2475 (0.2460) data time 0.0007 (0.0018) model time 0.2468 (0.2442) loss 3.6099 (3.5125) grad_norm 1.3488 (1.9996) loss_scale 8192.0000 (7882.3167) mem 7379MB [2024-08-26 08:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][710/1251] eta 0:02:13 lr 0.000931 wd 0.0500 time 0.2470 (0.2460) data time 0.0009 (0.0018) model time 0.2461 (0.2442) loss 4.0378 (3.5085) grad_norm 2.5967 (2.0001) loss_scale 8192.0000 (7886.6723) mem 7379MB [2024-08-26 08:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][720/1251] eta 0:02:10 lr 0.000931 wd 0.0500 time 0.2445 (0.2459) data time 0.0008 (0.0018) model time 0.2437 (0.2441) loss 3.9966 (3.5134) grad_norm 1.8108 (2.0045) loss_scale 8192.0000 (7890.9071) mem 7379MB [2024-08-26 08:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][730/1251] eta 0:02:08 lr 0.000931 wd 
0.0500 time 0.2456 (0.2459) data time 0.0010 (0.0018) model time 0.2446 (0.2441) loss 3.9044 (3.5164) grad_norm 2.3666 (2.0043) loss_scale 8192.0000 (7895.0260) mem 7379MB [2024-08-26 08:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][740/1251] eta 0:02:05 lr 0.000931 wd 0.0500 time 0.2429 (0.2458) data time 0.0007 (0.0018) model time 0.2422 (0.2440) loss 3.9025 (3.5130) grad_norm 2.5080 (2.0079) loss_scale 8192.0000 (7899.0337) mem 7379MB [2024-08-26 08:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][750/1251] eta 0:02:03 lr 0.000931 wd 0.0500 time 0.2476 (0.2457) data time 0.0013 (0.0018) model time 0.2464 (0.2440) loss 3.5636 (3.5136) grad_norm 2.7682 (2.0116) loss_scale 8192.0000 (7902.9348) mem 7379MB [2024-08-26 08:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][760/1251] eta 0:02:00 lr 0.000931 wd 0.0500 time 0.2528 (0.2457) data time 0.0011 (0.0018) model time 0.2517 (0.2439) loss 2.5021 (3.5118) grad_norm 2.4860 (2.0179) loss_scale 8192.0000 (7906.7332) mem 7379MB [2024-08-26 08:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][770/1251] eta 0:01:58 lr 0.000931 wd 0.0500 time 0.2484 (0.2457) data time 0.0010 (0.0017) model time 0.2474 (0.2439) loss 4.0569 (3.5100) grad_norm 1.3926 (2.0168) loss_scale 8192.0000 (7910.4332) mem 7379MB [2024-08-26 08:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][780/1251] eta 0:01:55 lr 0.000931 wd 0.0500 time 0.2421 (0.2456) data time 0.0010 (0.0017) model time 0.2411 (0.2438) loss 3.9079 (3.5097) grad_norm 1.6916 (2.0160) loss_scale 8192.0000 (7914.0384) mem 7379MB [2024-08-26 08:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][790/1251] eta 0:01:53 lr 0.000931 wd 0.0500 time 0.2419 (0.2456) data time 0.0007 (0.0017) model time 0.2411 (0.2438) loss 3.8812 (3.5073) grad_norm 2.4267 (2.0168) loss_scale 8192.0000 (7917.5525) mem 7379MB [2024-08-26 08:02:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][800/1251] eta 0:01:50 lr 0.000931 wd 0.0500 time 0.2454 (0.2455) data time 0.0008 (0.0017) model time 0.2447 (0.2437) loss 4.2883 (3.5075) grad_norm 1.5669 (2.0167) loss_scale 8192.0000 (7920.9788) mem 7379MB [2024-08-26 08:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][810/1251] eta 0:01:48 lr 0.000931 wd 0.0500 time 0.2448 (0.2455) data time 0.0009 (0.0017) model time 0.2439 (0.2437) loss 3.7820 (3.5103) grad_norm 1.5438 (2.0149) loss_scale 8192.0000 (7924.3206) mem 7379MB [2024-08-26 08:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][820/1251] eta 0:01:45 lr 0.000931 wd 0.0500 time 0.2394 (0.2457) data time 0.0007 (0.0017) model time 0.2387 (0.2439) loss 4.1599 (3.5091) grad_norm 1.9990 (2.0111) loss_scale 8192.0000 (7927.5810) mem 7379MB [2024-08-26 08:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][830/1251] eta 0:01:43 lr 0.000931 wd 0.0500 time 0.2469 (0.2456) data time 0.0011 (0.0017) model time 0.2458 (0.2439) loss 3.1412 (3.5084) grad_norm 1.4374 (2.0084) loss_scale 8192.0000 (7930.7629) mem 7379MB [2024-08-26 08:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][840/1251] eta 0:01:40 lr 0.000931 wd 0.0500 time 0.2482 (0.2456) data time 0.0010 (0.0017) model time 0.2472 (0.2439) loss 2.7157 (3.5048) grad_norm 1.6379 (2.0046) loss_scale 8192.0000 (7933.8692) mem 7379MB [2024-08-26 08:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][850/1251] eta 0:01:38 lr 0.000931 wd 0.0500 time 0.2482 (0.2455) data time 0.0009 (0.0017) model time 0.2473 (0.2438) loss 3.6378 (3.5043) grad_norm 1.7127 (2.0029) loss_scale 8192.0000 (7936.9025) mem 7379MB [2024-08-26 08:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][860/1251] eta 0:01:35 lr 0.000931 wd 0.0500 time 0.2339 (0.2455) data time 0.0010 (0.0017) model time 0.2329 (0.2438) loss 4.0605 
(3.5075) grad_norm 1.6641 (inf) loss_scale 4096.0000 (7911.3217) mem 7379MB [2024-08-26 08:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][870/1251] eta 0:01:33 lr 0.000931 wd 0.0500 time 0.2401 (0.2455) data time 0.0011 (0.0017) model time 0.2390 (0.2438) loss 3.1622 (3.5053) grad_norm 1.5293 (inf) loss_scale 4096.0000 (7867.5178) mem 7379MB [2024-08-26 08:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][880/1251] eta 0:01:31 lr 0.000931 wd 0.0500 time 0.2466 (0.2454) data time 0.0010 (0.0017) model time 0.2456 (0.2437) loss 3.9984 (3.5078) grad_norm 2.8551 (inf) loss_scale 4096.0000 (7824.7083) mem 7379MB [2024-08-26 08:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][890/1251] eta 0:01:28 lr 0.000931 wd 0.0500 time 0.2484 (0.2456) data time 0.0008 (0.0017) model time 0.2476 (0.2440) loss 2.4343 (3.5031) grad_norm 2.1837 (inf) loss_scale 4096.0000 (7782.8597) mem 7379MB [2024-08-26 08:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][900/1251] eta 0:01:26 lr 0.000931 wd 0.0500 time 0.2431 (0.2456) data time 0.0010 (0.0017) model time 0.2422 (0.2439) loss 3.9286 (3.5060) grad_norm 2.9016 (inf) loss_scale 4096.0000 (7741.9401) mem 7379MB [2024-08-26 08:03:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][910/1251] eta 0:01:23 lr 0.000931 wd 0.0500 time 0.2348 (0.2455) data time 0.0011 (0.0017) model time 0.2337 (0.2439) loss 3.6035 (3.5065) grad_norm 1.9820 (inf) loss_scale 4096.0000 (7701.9188) mem 7379MB [2024-08-26 08:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][920/1251] eta 0:01:21 lr 0.000931 wd 0.0500 time 0.2395 (0.2455) data time 0.0008 (0.0016) model time 0.2386 (0.2439) loss 3.9741 (3.5090) grad_norm 2.6155 (inf) loss_scale 4096.0000 (7662.7666) mem 7379MB [2024-08-26 08:03:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][930/1251] eta 0:01:18 lr 0.000931 wd 0.0500 time 0.2397 
(0.2454) data time 0.0007 (0.0016) model time 0.2390 (0.2438) loss 3.8174 (3.5062) grad_norm 1.7277 (inf) loss_scale 4096.0000 (7624.4554) mem 7379MB [2024-08-26 08:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][940/1251] eta 0:01:16 lr 0.000931 wd 0.0500 time 0.2414 (0.2456) data time 0.0009 (0.0016) model time 0.2405 (0.2440) loss 4.5544 (3.5066) grad_norm 2.0769 (inf) loss_scale 4096.0000 (7586.9586) mem 7379MB [2024-08-26 08:03:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][950/1251] eta 0:01:13 lr 0.000931 wd 0.0500 time 0.2418 (0.2456) data time 0.0011 (0.0016) model time 0.2408 (0.2439) loss 3.5938 (3.5083) grad_norm 3.0855 (inf) loss_scale 4096.0000 (7550.2503) mem 7379MB [2024-08-26 08:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][960/1251] eta 0:01:11 lr 0.000931 wd 0.0500 time 0.2377 (0.2455) data time 0.0011 (0.0016) model time 0.2366 (0.2439) loss 3.8155 (3.5069) grad_norm 2.9631 (inf) loss_scale 4096.0000 (7514.3059) mem 7379MB [2024-08-26 08:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][970/1251] eta 0:01:08 lr 0.000931 wd 0.0500 time 0.2385 (0.2455) data time 0.0010 (0.0016) model time 0.2375 (0.2439) loss 3.9466 (3.5078) grad_norm 2.3497 (inf) loss_scale 4096.0000 (7479.1020) mem 7379MB [2024-08-26 08:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][980/1251] eta 0:01:06 lr 0.000931 wd 0.0500 time 0.2418 (0.2457) data time 0.0011 (0.0016) model time 0.2407 (0.2441) loss 3.9517 (3.5082) grad_norm 1.6495 (inf) loss_scale 4096.0000 (7444.6157) mem 7379MB [2024-08-26 08:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][990/1251] eta 0:01:04 lr 0.000931 wd 0.0500 time 0.2398 (0.2456) data time 0.0009 (0.0016) model time 0.2389 (0.2441) loss 3.9519 (3.5065) grad_norm 2.1737 (inf) loss_scale 4096.0000 (7410.8254) mem 7379MB [2024-08-26 08:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [67/300][1000/1251] eta 0:01:01 lr 0.000931 wd 0.0500 time 0.2438 (0.2456) data time 0.0008 (0.0016) model time 0.2430 (0.2440) loss 2.4207 (3.5052) grad_norm 2.7592 (inf) loss_scale 4096.0000 (7377.7103) mem 7379MB [2024-08-26 08:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1010/1251] eta 0:00:59 lr 0.000930 wd 0.0500 time 0.2504 (0.2456) data time 0.0010 (0.0016) model time 0.2494 (0.2440) loss 2.8019 (3.5044) grad_norm 1.6222 (inf) loss_scale 4096.0000 (7345.2502) mem 7379MB [2024-08-26 08:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1020/1251] eta 0:00:56 lr 0.000930 wd 0.0500 time 0.2467 (0.2455) data time 0.0009 (0.0016) model time 0.2458 (0.2440) loss 4.0473 (3.5055) grad_norm 2.2943 (inf) loss_scale 4096.0000 (7313.4261) mem 7379MB [2024-08-26 08:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1030/1251] eta 0:00:54 lr 0.000930 wd 0.0500 time 0.2458 (0.2455) data time 0.0011 (0.0016) model time 0.2447 (0.2439) loss 2.9660 (3.5055) grad_norm 1.7369 (inf) loss_scale 4096.0000 (7282.2192) mem 7379MB [2024-08-26 08:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1040/1251] eta 0:00:51 lr 0.000930 wd 0.0500 time 0.2436 (0.2455) data time 0.0010 (0.0016) model time 0.2426 (0.2439) loss 4.2634 (3.5087) grad_norm 1.5010 (inf) loss_scale 4096.0000 (7251.6119) mem 7379MB [2024-08-26 08:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1050/1251] eta 0:00:49 lr 0.000930 wd 0.0500 time 0.2457 (0.2455) data time 0.0011 (0.0016) model time 0.2446 (0.2439) loss 3.9980 (3.5121) grad_norm 2.0284 (inf) loss_scale 4096.0000 (7221.5871) mem 7379MB [2024-08-26 08:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1060/1251] eta 0:00:46 lr 0.000930 wd 0.0500 time 0.2374 (0.2454) data time 0.0009 (0.0016) model time 0.2365 (0.2439) loss 3.5269 (3.5135) grad_norm 2.0489 (inf) loss_scale 4096.0000 (7192.1282) mem 
7379MB [2024-08-26 08:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1070/1251] eta 0:00:44 lr 0.000930 wd 0.0500 time 0.2432 (0.2454) data time 0.0011 (0.0016) model time 0.2421 (0.2439) loss 4.0256 (3.5151) grad_norm 2.0210 (inf) loss_scale 4096.0000 (7163.2194) mem 7379MB [2024-08-26 08:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1080/1251] eta 0:00:41 lr 0.000930 wd 0.0500 time 0.2357 (0.2454) data time 0.0013 (0.0015) model time 0.2344 (0.2438) loss 2.9733 (3.5110) grad_norm 1.4969 (inf) loss_scale 4096.0000 (7134.8455) mem 7379MB [2024-08-26 08:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1090/1251] eta 0:00:39 lr 0.000930 wd 0.0500 time 0.2385 (0.2454) data time 0.0008 (0.0015) model time 0.2377 (0.2438) loss 3.6277 (3.5116) grad_norm 1.9542 (inf) loss_scale 4096.0000 (7106.9918) mem 7379MB [2024-08-26 08:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1100/1251] eta 0:00:37 lr 0.000930 wd 0.0500 time 0.2352 (0.2454) data time 0.0010 (0.0015) model time 0.2343 (0.2438) loss 4.4886 (3.5156) grad_norm 1.7479 (inf) loss_scale 4096.0000 (7079.6440) mem 7379MB [2024-08-26 08:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1110/1251] eta 0:00:34 lr 0.000930 wd 0.0500 time 0.2482 (0.2454) data time 0.0009 (0.0015) model time 0.2473 (0.2438) loss 4.3638 (3.5156) grad_norm 2.5807 (inf) loss_scale 4096.0000 (7052.7885) mem 7379MB [2024-08-26 08:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1120/1251] eta 0:00:32 lr 0.000930 wd 0.0500 time 0.2343 (0.2453) data time 0.0011 (0.0015) model time 0.2332 (0.2438) loss 3.5697 (3.5147) grad_norm 1.8180 (inf) loss_scale 4096.0000 (7026.4121) mem 7379MB [2024-08-26 08:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1130/1251] eta 0:00:29 lr 0.000930 wd 0.0500 time 0.2438 (0.2453) data time 0.0008 (0.0015) model time 0.2430 (0.2437) 
loss 3.1761 (3.5165) grad_norm 2.0224 (inf) loss_scale 4096.0000 (7000.5022) mem 7379MB [2024-08-26 08:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1140/1251] eta 0:00:27 lr 0.000930 wd 0.0500 time 0.4390 (0.2456) data time 0.0008 (0.0015) model time 0.4383 (0.2441) loss 3.7070 (3.5157) grad_norm 1.9539 (inf) loss_scale 4096.0000 (6975.0465) mem 7379MB [2024-08-26 08:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1150/1251] eta 0:00:24 lr 0.000930 wd 0.0500 time 0.2554 (0.2458) data time 0.0008 (0.0015) model time 0.2547 (0.2443) loss 2.7636 (3.5150) grad_norm 1.8135 (inf) loss_scale 4096.0000 (6950.0330) mem 7379MB [2024-08-26 08:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1160/1251] eta 0:00:22 lr 0.000930 wd 0.0500 time 0.2340 (0.2458) data time 0.0011 (0.0015) model time 0.2329 (0.2443) loss 3.1485 (3.5155) grad_norm 1.4979 (inf) loss_scale 4096.0000 (6925.4505) mem 7379MB [2024-08-26 08:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1170/1251] eta 0:00:19 lr 0.000930 wd 0.0500 time 0.2523 (0.2457) data time 0.0009 (0.0015) model time 0.2514 (0.2442) loss 3.0602 (3.5166) grad_norm 1.9999 (inf) loss_scale 4096.0000 (6901.2878) mem 7379MB [2024-08-26 08:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1180/1251] eta 0:00:17 lr 0.000930 wd 0.0500 time 0.2360 (0.2457) data time 0.0007 (0.0015) model time 0.2353 (0.2442) loss 3.1695 (3.5163) grad_norm 1.5430 (inf) loss_scale 4096.0000 (6877.5343) mem 7379MB [2024-08-26 08:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1190/1251] eta 0:00:14 lr 0.000930 wd 0.0500 time 0.2426 (0.2458) data time 0.0011 (0.0015) model time 0.2415 (0.2444) loss 3.6089 (3.5170) grad_norm 1.6253 (inf) loss_scale 4096.0000 (6854.1797) mem 7379MB [2024-08-26 08:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1200/1251] eta 0:00:12 lr 0.000930 wd 
0.0500 time 0.2434 (0.2458) data time 0.0011 (0.0015) model time 0.2423 (0.2443) loss 4.1640 (3.5194) grad_norm 1.7169 (inf) loss_scale 4096.0000 (6831.2140) mem 7379MB [2024-08-26 08:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1210/1251] eta 0:00:10 lr 0.000930 wd 0.0500 time 0.2380 (0.2458) data time 0.0009 (0.0015) model time 0.2371 (0.2443) loss 3.0578 (3.5175) grad_norm 1.7930 (inf) loss_scale 4096.0000 (6808.6276) mem 7379MB [2024-08-26 08:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1220/1251] eta 0:00:07 lr 0.000930 wd 0.0500 time 0.2347 (0.2458) data time 0.0011 (0.0015) model time 0.2337 (0.2443) loss 3.7489 (3.5181) grad_norm 1.7304 (inf) loss_scale 4096.0000 (6786.4111) mem 7379MB [2024-08-26 08:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1230/1251] eta 0:00:05 lr 0.000930 wd 0.0500 time 0.2477 (0.2458) data time 0.0007 (0.0015) model time 0.2469 (0.2443) loss 2.7002 (3.5173) grad_norm 1.7813 (inf) loss_scale 4096.0000 (6764.5556) mem 7379MB [2024-08-26 08:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1240/1251] eta 0:00:02 lr 0.000930 wd 0.0500 time 0.2234 (0.2457) data time 0.0005 (0.0015) model time 0.2229 (0.2442) loss 4.0495 (3.5172) grad_norm 1.7440 (inf) loss_scale 4096.0000 (6743.0524) mem 7379MB [2024-08-26 08:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [67/300][1250/1251] eta 0:00:00 lr 0.000930 wd 0.0500 time 0.2240 (0.2455) data time 0.0005 (0.0015) model time 0.2235 (0.2440) loss 4.6040 (3.5198) grad_norm 1.7340 (inf) loss_scale 4096.0000 (6721.8929) mem 7379MB [2024-08-26 08:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 67 training takes 0:05:07 [2024-08-26 08:04:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 08:04:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 08:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.468 (0.468) Loss 0.5273 (0.5273) Acc@1 89.062 (89.062) Acc@5 97.559 (97.559) Mem 7379MB
[2024-08-26 08:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.110) Loss 0.8345 (0.8229) Acc@1 81.152 (81.259) Acc@5 95.801 (96.183) Mem 7379MB
[2024-08-26 08:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.096) Loss 1.2334 (0.8647) Acc@1 71.094 (80.264) Acc@5 92.383 (95.959) Mem 7379MB
[2024-08-26 08:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.090) Loss 1.4941 (0.9822) Acc@1 64.453 (77.923) Acc@5 86.816 (94.326) Mem 7379MB
[2024-08-26 08:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3574 (1.0528) Acc@1 69.727 (76.298) Acc@5 89.551 (93.467) Mem 7379MB
[2024-08-26 08:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.946 Acc@5 93.344
[2024-08-26 08:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.9%
[2024-08-26 08:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 75.95%
[2024-08-26 08:04:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 08:04:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 08:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.423 (0.423) Loss 0.4697 (0.4697) Acc@1 91.309 (91.309) Acc@5 98.145 (98.145) Mem 7379MB
[2024-08-26 08:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.106) Loss 0.7656 (0.7383) Acc@1 84.570 (83.967) Acc@5 96.191 (96.751) Mem 7379MB
[2024-08-26 08:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.093) Loss 1.0664 (0.7592) Acc@1 74.805 (82.971) Acc@5 92.871 (96.666) Mem 7379MB
[2024-08-26 08:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.088) Loss 1.3301 (0.8686) Acc@1 65.723 (80.377) Acc@5 89.844 (95.268) Mem 7379MB
[2024-08-26 08:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.2373 (0.9272) Acc@1 70.215 (78.859) Acc@5 91.113 (94.643) Mem 7379MB
[2024-08-26 08:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.502 Acc@5 94.562
[2024-08-26 08:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.5%
[2024-08-26 08:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.50%
[2024-08-26 08:04:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 08:04:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 08:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][0/1251] eta 0:14:31 lr 0.000930 wd 0.0500 time 0.6965 (0.6965) data time 0.4676 (0.4676) model time 0.0000 (0.0000) loss 4.1848 (4.1848) grad_norm 3.0505 (3.0505) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][10/1251] eta 0:05:49 lr 0.000930 wd 0.0500 time 0.2414 (0.2815) data time 0.0009 (0.0434) model time 0.0000 (0.0000) loss 3.2265 (3.5483) grad_norm 2.6019 (2.1619) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][20/1251] eta 0:05:23 lr 0.000930 wd 0.0500 time 0.2377 (0.2628) data time 0.0009 (0.0232) model time 0.0000 (0.0000) loss 3.9691 (3.5753) grad_norm 2.5761 (2.0916) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][30/1251] eta 0:05:12 lr 0.000930 wd 0.0500 time 0.2473 (0.2558) data time 0.0009 (0.0161) model time 0.0000 (0.0000) loss 3.4322 (3.4604) grad_norm 1.4689 (2.0542) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][40/1251] eta 0:05:05 lr 0.000930 wd 0.0500 time 0.2404 (0.2525) data time 0.0010 (0.0124) model time 0.0000 (0.0000) loss 3.6572 (3.4403) grad_norm 1.5277 (1.9958) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][50/1251] eta 0:05:00 lr 0.000930 wd 0.0500 time 0.2391 (0.2503) data time 0.0010 (0.0102) model time 0.0000 (0.0000) loss 4.0807 (3.4397) grad_norm 1.4733 (1.9706) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][60/1251] eta 0:04:56 lr 0.000930 wd 0.0500 time 0.2487 (0.2490) data time 0.0010 (0.0087) model time 0.2477 (0.2416) loss 
3.0635 (3.4571) grad_norm 1.5672 (1.9776) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][70/1251] eta 0:04:52 lr 0.000930 wd 0.0500 time 0.2328 (0.2478) data time 0.0007 (0.0076) model time 0.2322 (0.2406) loss 2.7310 (3.4792) grad_norm 2.3591 (1.9669) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][80/1251] eta 0:04:49 lr 0.000930 wd 0.0500 time 0.2437 (0.2470) data time 0.0010 (0.0068) model time 0.2427 (0.2405) loss 4.0152 (3.5048) grad_norm 2.0851 (1.9652) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][90/1251] eta 0:04:46 lr 0.000930 wd 0.0500 time 0.2434 (0.2466) data time 0.0008 (0.0061) model time 0.2426 (0.2409) loss 4.3420 (3.5246) grad_norm 1.4403 (1.9493) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][100/1251] eta 0:04:43 lr 0.000930 wd 0.0500 time 0.2367 (0.2460) data time 0.0009 (0.0056) model time 0.2358 (0.2407) loss 3.3264 (3.5346) grad_norm 2.1869 (1.9710) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][110/1251] eta 0:04:40 lr 0.000930 wd 0.0500 time 0.2446 (0.2456) data time 0.0007 (0.0052) model time 0.2439 (0.2406) loss 2.5640 (3.5283) grad_norm 1.6460 (1.9795) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][120/1251] eta 0:04:37 lr 0.000930 wd 0.0500 time 0.2379 (0.2454) data time 0.0009 (0.0049) model time 0.2370 (0.2408) loss 3.9676 (3.5111) grad_norm 1.6825 (1.9866) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][130/1251] eta 0:04:34 lr 
0.000930 wd 0.0500 time 0.2449 (0.2451) data time 0.0010 (0.0046) model time 0.2439 (0.2408) loss 4.6611 (3.5068) grad_norm 2.0367 (2.0012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][140/1251] eta 0:04:32 lr 0.000930 wd 0.0500 time 0.2461 (0.2449) data time 0.0011 (0.0043) model time 0.2450 (0.2409) loss 3.5499 (3.5014) grad_norm 1.6186 (2.0256) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][150/1251] eta 0:04:31 lr 0.000930 wd 0.0500 time 0.2366 (0.2463) data time 0.0007 (0.0041) model time 0.2359 (0.2432) loss 2.3195 (3.4966) grad_norm 1.5427 (2.0530) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][160/1251] eta 0:04:29 lr 0.000930 wd 0.0500 time 0.2364 (0.2472) data time 0.0007 (0.0039) model time 0.2357 (0.2448) loss 4.1262 (3.5022) grad_norm 1.4862 (2.0521) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][170/1251] eta 0:04:26 lr 0.000930 wd 0.0500 time 0.2511 (0.2470) data time 0.0010 (0.0038) model time 0.2501 (0.2445) loss 3.2621 (3.5005) grad_norm 1.4710 (2.0307) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][180/1251] eta 0:04:25 lr 0.000930 wd 0.0500 time 0.2489 (0.2479) data time 0.0008 (0.0036) model time 0.2482 (0.2460) loss 3.9184 (3.5047) grad_norm 2.0435 (2.0146) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][190/1251] eta 0:04:22 lr 0.000929 wd 0.0500 time 0.2392 (0.2476) data time 0.0009 (0.0035) model time 0.2383 (0.2456) loss 3.0234 (3.5048) grad_norm 1.6589 (1.9980) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][200/1251] eta 0:04:20 lr 0.000929 wd 0.0500 time 0.2453 (0.2474) data time 0.0007 (0.0033) model time 0.2447 (0.2454) loss 4.3156 (3.5025) grad_norm 2.2250 (1.9969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][210/1251] eta 0:04:18 lr 0.000929 wd 0.0500 time 0.4232 (0.2481) data time 0.0009 (0.0032) model time 0.4223 (0.2463) loss 3.7594 (3.4999) grad_norm 1.9521 (2.0069) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][220/1251] eta 0:04:15 lr 0.000929 wd 0.0500 time 0.2389 (0.2478) data time 0.0010 (0.0031) model time 0.2379 (0.2460) loss 3.4351 (3.4924) grad_norm 1.4738 (2.0052) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][230/1251] eta 0:04:12 lr 0.000929 wd 0.0500 time 0.2450 (0.2476) data time 0.0008 (0.0030) model time 0.2443 (0.2458) loss 3.8707 (3.4930) grad_norm 1.5364 (2.0075) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][240/1251] eta 0:04:10 lr 0.000929 wd 0.0500 time 0.2411 (0.2474) data time 0.0007 (0.0030) model time 0.2404 (0.2456) loss 3.7154 (3.5019) grad_norm 2.2834 (1.9970) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][250/1251] eta 0:04:07 lr 0.000929 wd 0.0500 time 0.2461 (0.2472) data time 0.0007 (0.0029) model time 0.2453 (0.2454) loss 3.0935 (3.5060) grad_norm 1.5696 (1.9969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][260/1251] eta 0:04:04 lr 0.000929 wd 0.0500 time 0.2418 (0.2470) data time 0.0008 (0.0028) model time 0.2411 (0.2452) loss 3.6740 
(3.5071) grad_norm 1.7814 (2.0209) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][270/1251] eta 0:04:02 lr 0.000929 wd 0.0500 time 0.2435 (0.2468) data time 0.0009 (0.0027) model time 0.2426 (0.2450) loss 3.4357 (3.4992) grad_norm 1.8521 (2.0150) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][280/1251] eta 0:03:59 lr 0.000929 wd 0.0500 time 0.2444 (0.2466) data time 0.0010 (0.0027) model time 0.2434 (0.2447) loss 2.8316 (3.4993) grad_norm 1.9237 (2.0106) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][290/1251] eta 0:03:56 lr 0.000929 wd 0.0500 time 0.2447 (0.2463) data time 0.0010 (0.0026) model time 0.2437 (0.2445) loss 3.8051 (3.5013) grad_norm 1.6771 (2.0083) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][300/1251] eta 0:03:54 lr 0.000929 wd 0.0500 time 0.2496 (0.2463) data time 0.0007 (0.0026) model time 0.2490 (0.2444) loss 3.8441 (3.5132) grad_norm 1.8279 (2.0138) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][310/1251] eta 0:03:51 lr 0.000929 wd 0.0500 time 0.2401 (0.2461) data time 0.0010 (0.0025) model time 0.2392 (0.2442) loss 3.9989 (3.5052) grad_norm 1.9421 (2.0192) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][320/1251] eta 0:03:49 lr 0.000929 wd 0.0500 time 0.2611 (0.2467) data time 0.0010 (0.0025) model time 0.2601 (0.2450) loss 2.9335 (3.5022) grad_norm 3.4137 (2.0313) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][330/1251] eta 0:03:47 lr 0.000929 wd 
0.0500 time 0.2433 (0.2466) data time 0.0007 (0.0024) model time 0.2426 (0.2449) loss 2.3987 (3.4931) grad_norm 2.0399 (2.0307) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][340/1251] eta 0:03:44 lr 0.000929 wd 0.0500 time 0.2538 (0.2465) data time 0.0012 (0.0024) model time 0.2526 (0.2448) loss 3.6936 (3.4922) grad_norm 2.2923 (2.0236) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][350/1251] eta 0:03:41 lr 0.000929 wd 0.0500 time 0.2347 (0.2463) data time 0.0012 (0.0024) model time 0.2336 (0.2447) loss 3.4928 (3.4858) grad_norm 1.5680 (2.0120) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][360/1251] eta 0:03:39 lr 0.000929 wd 0.0500 time 0.2418 (0.2462) data time 0.0009 (0.0023) model time 0.2409 (0.2446) loss 3.1126 (3.4810) grad_norm 2.1271 (2.0051) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][370/1251] eta 0:03:36 lr 0.000929 wd 0.0500 time 0.2334 (0.2461) data time 0.0011 (0.0023) model time 0.2323 (0.2444) loss 2.5598 (3.4806) grad_norm 2.4679 (2.0052) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][380/1251] eta 0:03:34 lr 0.000929 wd 0.0500 time 0.2446 (0.2460) data time 0.0011 (0.0022) model time 0.2435 (0.2443) loss 4.1952 (3.4888) grad_norm 1.7304 (2.0011) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][390/1251] eta 0:03:32 lr 0.000929 wd 0.0500 time 0.2435 (0.2465) data time 0.0011 (0.0022) model time 0.2425 (0.2449) loss 3.5882 (3.4873) grad_norm 1.9730 (2.0026) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][400/1251] eta 0:03:29 lr 0.000929 wd 0.0500 time 0.2514 (0.2464) data time 0.0009 (0.0022) model time 0.2505 (0.2448) loss 2.7157 (3.4806) grad_norm 1.6428 (1.9972) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][410/1251] eta 0:03:27 lr 0.000929 wd 0.0500 time 0.2434 (0.2464) data time 0.0008 (0.0022) model time 0.2426 (0.2448) loss 3.9762 (3.4836) grad_norm 1.4240 (1.9962) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][420/1251] eta 0:03:24 lr 0.000929 wd 0.0500 time 0.2454 (0.2463) data time 0.0007 (0.0022) model time 0.2447 (0.2447) loss 2.6786 (3.4824) grad_norm 1.6270 (1.9987) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][430/1251] eta 0:03:22 lr 0.000929 wd 0.0500 time 0.2381 (0.2462) data time 0.0010 (0.0021) model time 0.2371 (0.2446) loss 4.0314 (3.4732) grad_norm 2.7398 (2.0042) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][440/1251] eta 0:03:19 lr 0.000929 wd 0.0500 time 0.2397 (0.2462) data time 0.0009 (0.0021) model time 0.2388 (0.2446) loss 3.8727 (3.4752) grad_norm 2.1589 (2.0027) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][450/1251] eta 0:03:17 lr 0.000929 wd 0.0500 time 0.2388 (0.2460) data time 0.0009 (0.0021) model time 0.2380 (0.2444) loss 3.4499 (3.4720) grad_norm 1.6486 (2.0030) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][460/1251] eta 0:03:14 lr 0.000929 wd 0.0500 time 0.2533 (0.2464) data time 0.0009 (0.0021) model time 0.2524 (0.2449) loss 3.7124 
(3.4696) grad_norm 1.5588 (2.0059) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][470/1251] eta 0:03:12 lr 0.000929 wd 0.0500 time 0.2361 (0.2463) data time 0.0009 (0.0020) model time 0.2352 (0.2448) loss 3.8505 (3.4688) grad_norm 1.6205 (2.0066) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][480/1251] eta 0:03:09 lr 0.000929 wd 0.0500 time 0.2332 (0.2462) data time 0.0008 (0.0020) model time 0.2324 (0.2447) loss 2.6493 (3.4725) grad_norm 2.0546 (2.0086) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][490/1251] eta 0:03:07 lr 0.000929 wd 0.0500 time 0.2404 (0.2461) data time 0.0010 (0.0020) model time 0.2394 (0.2446) loss 3.7384 (3.4783) grad_norm 1.4804 (2.0041) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][500/1251] eta 0:03:04 lr 0.000929 wd 0.0500 time 0.2385 (0.2460) data time 0.0009 (0.0020) model time 0.2376 (0.2444) loss 2.3511 (3.4801) grad_norm 1.7455 (2.0102) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][510/1251] eta 0:03:02 lr 0.000929 wd 0.0500 time 0.2386 (0.2459) data time 0.0008 (0.0020) model time 0.2377 (0.2443) loss 4.3206 (3.4810) grad_norm 1.5056 (2.0095) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][520/1251] eta 0:02:59 lr 0.000929 wd 0.0500 time 0.2356 (0.2458) data time 0.0009 (0.0019) model time 0.2347 (0.2442) loss 3.5538 (3.4842) grad_norm 1.3974 (2.0074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][530/1251] eta 0:02:57 lr 0.000929 wd 
0.0500 time 0.2409 (0.2457) data time 0.0010 (0.0019) model time 0.2399 (0.2442) loss 3.8854 (3.4864) grad_norm 2.3390 (2.0081) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][540/1251] eta 0:02:54 lr 0.000929 wd 0.0500 time 0.2465 (0.2457) data time 0.0007 (0.0019) model time 0.2458 (0.2441) loss 3.1329 (3.4858) grad_norm 1.9550 (2.0025) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][550/1251] eta 0:02:52 lr 0.000929 wd 0.0500 time 0.2398 (0.2456) data time 0.0009 (0.0019) model time 0.2389 (0.2440) loss 3.4024 (3.4791) grad_norm 2.0897 (2.0027) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][560/1251] eta 0:02:49 lr 0.000929 wd 0.0500 time 0.2504 (0.2455) data time 0.0009 (0.0019) model time 0.2494 (0.2440) loss 3.3191 (3.4817) grad_norm 2.4390 (2.0013) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][570/1251] eta 0:02:47 lr 0.000929 wd 0.0500 time 0.2369 (0.2454) data time 0.0012 (0.0019) model time 0.2357 (0.2439) loss 3.5529 (3.4846) grad_norm 1.7737 (1.9966) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][580/1251] eta 0:02:44 lr 0.000929 wd 0.0500 time 0.2467 (0.2454) data time 0.0011 (0.0019) model time 0.2456 (0.2439) loss 3.7907 (3.4844) grad_norm 1.7717 (1.9964) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][590/1251] eta 0:02:42 lr 0.000929 wd 0.0500 time 0.2368 (0.2453) data time 0.0007 (0.0018) model time 0.2361 (0.2438) loss 3.8833 (3.4819) grad_norm 1.4533 (1.9949) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:13 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][600/1251] eta 0:02:39 lr 0.000929 wd 0.0500 time 0.2334 (0.2453) data time 0.0007 (0.0018) model time 0.2328 (0.2437) loss 3.7481 (3.4823) grad_norm 3.0024 (1.9933) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][610/1251] eta 0:02:37 lr 0.000929 wd 0.0500 time 0.2384 (0.2452) data time 0.0011 (0.0018) model time 0.2372 (0.2437) loss 3.5701 (3.4835) grad_norm 1.6995 (1.9896) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][620/1251] eta 0:02:34 lr 0.000929 wd 0.0500 time 0.2428 (0.2452) data time 0.0008 (0.0018) model time 0.2420 (0.2436) loss 3.9529 (3.4822) grad_norm 2.1383 (1.9882) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][630/1251] eta 0:02:32 lr 0.000928 wd 0.0500 time 0.2349 (0.2451) data time 0.0010 (0.0018) model time 0.2339 (0.2436) loss 2.9192 (3.4827) grad_norm 1.7935 (1.9876) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][640/1251] eta 0:02:29 lr 0.000928 wd 0.0500 time 0.2428 (0.2451) data time 0.0008 (0.0018) model time 0.2420 (0.2436) loss 4.1346 (3.4857) grad_norm 1.5762 (1.9853) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][650/1251] eta 0:02:27 lr 0.000928 wd 0.0500 time 0.2421 (0.2451) data time 0.0012 (0.0018) model time 0.2409 (0.2436) loss 3.6950 (3.4830) grad_norm 2.5722 (1.9842) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][660/1251] eta 0:02:24 lr 0.000928 wd 0.0500 time 0.2411 (0.2451) data time 0.0010 (0.0018) model time 0.2401 (0.2436) loss 3.6765 
(3.4800) grad_norm 1.9108 (1.9849) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][670/1251] eta 0:02:22 lr 0.000928 wd 0.0500 time 0.2399 (0.2450) data time 0.0009 (0.0018) model time 0.2390 (0.2435) loss 3.9162 (3.4812) grad_norm 1.5165 (1.9839) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][680/1251] eta 0:02:19 lr 0.000928 wd 0.0500 time 0.2409 (0.2449) data time 0.0011 (0.0017) model time 0.2399 (0.2435) loss 2.7365 (3.4795) grad_norm 1.9503 (1.9860) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][690/1251] eta 0:02:17 lr 0.000928 wd 0.0500 time 0.2446 (0.2449) data time 0.0011 (0.0017) model time 0.2435 (0.2434) loss 3.1420 (3.4807) grad_norm 3.5760 (1.9860) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][700/1251] eta 0:02:14 lr 0.000928 wd 0.0500 time 0.2478 (0.2448) data time 0.0011 (0.0017) model time 0.2467 (0.2434) loss 3.5088 (3.4846) grad_norm 1.5792 (1.9866) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][710/1251] eta 0:02:12 lr 0.000928 wd 0.0500 time 0.2536 (0.2448) data time 0.0009 (0.0017) model time 0.2527 (0.2433) loss 2.4267 (3.4834) grad_norm 1.3667 (1.9818) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][720/1251] eta 0:02:10 lr 0.000928 wd 0.0500 time 0.2382 (0.2451) data time 0.0013 (0.0017) model time 0.2369 (0.2436) loss 3.1943 (3.4839) grad_norm 1.9293 (1.9813) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][730/1251] eta 0:02:07 lr 0.000928 wd 
0.0500 time 0.3679 (0.2452) data time 0.0009 (0.0017) model time 0.3670 (0.2437) loss 3.7358 (3.4813) grad_norm 1.6823 (1.9779) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][740/1251] eta 0:02:05 lr 0.000928 wd 0.0500 time 0.2399 (0.2457) data time 0.0009 (0.0017) model time 0.2390 (0.2443) loss 3.8641 (3.4836) grad_norm 1.6534 (1.9765) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][750/1251] eta 0:02:03 lr 0.000928 wd 0.0500 time 0.2421 (0.2459) data time 0.0008 (0.0017) model time 0.2414 (0.2446) loss 2.3589 (3.4803) grad_norm 1.7991 (1.9785) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][760/1251] eta 0:02:00 lr 0.000928 wd 0.0500 time 0.2390 (0.2459) data time 0.0010 (0.0017) model time 0.2380 (0.2445) loss 3.6488 (3.4831) grad_norm 2.0676 (1.9812) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][770/1251] eta 0:01:58 lr 0.000928 wd 0.0500 time 0.2400 (0.2459) data time 0.0009 (0.0017) model time 0.2391 (0.2445) loss 3.6935 (3.4789) grad_norm 1.7631 (1.9811) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][780/1251] eta 0:01:55 lr 0.000928 wd 0.0500 time 0.2426 (0.2458) data time 0.0010 (0.0017) model time 0.2416 (0.2445) loss 3.7520 (3.4798) grad_norm 2.7315 (1.9834) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][790/1251] eta 0:01:53 lr 0.000928 wd 0.0500 time 0.2485 (0.2458) data time 0.0008 (0.0016) model time 0.2477 (0.2444) loss 2.4753 (3.4800) grad_norm 1.6849 (1.9824) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][800/1251] eta 0:01:50 lr 0.000928 wd 0.0500 time 0.2341 (0.2457) data time 0.0011 (0.0016) model time 0.2330 (0.2443) loss 3.1797 (3.4831) grad_norm 2.1288 (1.9849) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][810/1251] eta 0:01:48 lr 0.000928 wd 0.0500 time 0.2366 (0.2457) data time 0.0011 (0.0016) model time 0.2355 (0.2443) loss 3.4210 (3.4814) grad_norm 2.8745 (1.9821) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][820/1251] eta 0:01:45 lr 0.000928 wd 0.0500 time 0.2391 (0.2456) data time 0.0013 (0.0016) model time 0.2379 (0.2442) loss 3.9436 (3.4832) grad_norm 2.8821 (1.9857) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][830/1251] eta 0:01:43 lr 0.000928 wd 0.0500 time 0.2421 (0.2455) data time 0.0011 (0.0016) model time 0.2410 (0.2442) loss 3.6961 (3.4814) grad_norm 1.6374 (1.9845) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][840/1251] eta 0:01:40 lr 0.000928 wd 0.0500 time 0.2384 (0.2455) data time 0.0012 (0.0016) model time 0.2372 (0.2441) loss 3.7212 (3.4817) grad_norm 1.9703 (1.9842) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][850/1251] eta 0:01:38 lr 0.000928 wd 0.0500 time 0.2336 (0.2454) data time 0.0014 (0.0016) model time 0.2322 (0.2441) loss 3.3580 (3.4831) grad_norm 1.9556 (1.9836) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][860/1251] eta 0:01:35 lr 0.000928 wd 0.0500 time 0.2425 (0.2454) data time 0.0009 (0.0016) model time 0.2416 (0.2440) loss 2.8044 
(3.4799) grad_norm 1.7885 (1.9849) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][870/1251] eta 0:01:33 lr 0.000928 wd 0.0500 time 0.2439 (0.2454) data time 0.0008 (0.0016) model time 0.2431 (0.2440) loss 3.1100 (3.4800) grad_norm 2.3751 (1.9890) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][880/1251] eta 0:01:31 lr 0.000928 wd 0.0500 time 0.2427 (0.2453) data time 0.0009 (0.0016) model time 0.2418 (0.2440) loss 4.1219 (3.4801) grad_norm 1.9477 (1.9865) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][890/1251] eta 0:01:28 lr 0.000928 wd 0.0500 time 0.2414 (0.2453) data time 0.0010 (0.0016) model time 0.2404 (0.2439) loss 3.2062 (3.4823) grad_norm 2.6693 (1.9870) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][900/1251] eta 0:01:26 lr 0.000928 wd 0.0500 time 0.2420 (0.2452) data time 0.0010 (0.0016) model time 0.2410 (0.2439) loss 3.6150 (3.4825) grad_norm 3.0308 (1.9917) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][910/1251] eta 0:01:23 lr 0.000928 wd 0.0500 time 0.2376 (0.2452) data time 0.0012 (0.0016) model time 0.2364 (0.2438) loss 2.8727 (3.4817) grad_norm 1.8199 (1.9902) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][920/1251] eta 0:01:21 lr 0.000928 wd 0.0500 time 0.2408 (0.2451) data time 0.0009 (0.0016) model time 0.2399 (0.2438) loss 3.8509 (3.4836) grad_norm 2.0007 (1.9879) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][930/1251] eta 0:01:18 lr 0.000928 wd 
0.0500 time 0.2440 (0.2451) data time 0.0009 (0.0016) model time 0.2431 (0.2438) loss 2.6821 (3.4808) grad_norm 1.6112 (1.9856) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][940/1251] eta 0:01:16 lr 0.000928 wd 0.0500 time 0.2414 (0.2451) data time 0.0009 (0.0016) model time 0.2405 (0.2437) loss 3.7169 (3.4821) grad_norm 1.8755 (1.9830) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][950/1251] eta 0:01:13 lr 0.000928 wd 0.0500 time 0.2402 (0.2451) data time 0.0010 (0.0016) model time 0.2392 (0.2437) loss 3.6703 (3.4806) grad_norm 1.7318 (1.9808) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][960/1251] eta 0:01:11 lr 0.000928 wd 0.0500 time 0.2393 (0.2450) data time 0.0009 (0.0015) model time 0.2383 (0.2436) loss 3.7471 (3.4831) grad_norm 1.7704 (1.9784) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][970/1251] eta 0:01:08 lr 0.000928 wd 0.0500 time 0.2391 (0.2450) data time 0.0008 (0.0015) model time 0.2383 (0.2436) loss 3.9424 (3.4835) grad_norm 1.9507 (1.9766) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][980/1251] eta 0:01:06 lr 0.000928 wd 0.0500 time 0.2446 (0.2449) data time 0.0010 (0.0015) model time 0.2437 (0.2436) loss 3.8170 (3.4825) grad_norm 2.6569 (1.9762) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][990/1251] eta 0:01:03 lr 0.000928 wd 0.0500 time 0.4460 (0.2451) data time 0.0008 (0.0015) model time 0.4453 (0.2438) loss 4.4119 (3.4829) grad_norm 1.3233 (1.9776) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:51 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1000/1251] eta 0:01:01 lr 0.000928 wd 0.0500 time 0.2360 (0.2451) data time 0.0011 (0.0015) model time 0.2349 (0.2437) loss 3.6482 (3.4848) grad_norm 2.3074 (1.9757) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1010/1251] eta 0:00:59 lr 0.000928 wd 0.0500 time 0.2303 (0.2450) data time 0.0010 (0.0015) model time 0.2292 (0.2437) loss 3.7509 (3.4871) grad_norm 2.0310 (1.9755) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1020/1251] eta 0:00:56 lr 0.000928 wd 0.0500 time 0.2454 (0.2450) data time 0.0009 (0.0015) model time 0.2445 (0.2436) loss 3.8286 (3.4871) grad_norm 1.4744 (1.9739) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1030/1251] eta 0:00:54 lr 0.000928 wd 0.0500 time 0.2447 (0.2449) data time 0.0008 (0.0015) model time 0.2439 (0.2436) loss 3.1318 (3.4856) grad_norm 1.8890 (1.9710) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1040/1251] eta 0:00:51 lr 0.000928 wd 0.0500 time 0.2478 (0.2449) data time 0.0010 (0.0015) model time 0.2468 (0.2436) loss 3.8079 (3.4865) grad_norm 1.9398 (1.9723) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1050/1251] eta 0:00:49 lr 0.000928 wd 0.0500 time 0.2350 (0.2449) data time 0.0011 (0.0015) model time 0.2338 (0.2436) loss 2.0469 (3.4866) grad_norm 2.0226 (1.9759) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1060/1251] eta 0:00:46 lr 0.000927 wd 0.0500 time 0.2416 (0.2449) data time 0.0007 (0.0015) model time 0.2409 (0.2436) loss 4.0861 
(3.4864) grad_norm 2.1259 (1.9753) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1070/1251] eta 0:00:44 lr 0.000927 wd 0.0500 time 0.2411 (0.2449) data time 0.0008 (0.0015) model time 0.2403 (0.2436) loss 3.6844 (3.4867) grad_norm 1.7695 (1.9753) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1080/1251] eta 0:00:41 lr 0.000927 wd 0.0500 time 0.2418 (0.2451) data time 0.0010 (0.0015) model time 0.2408 (0.2438) loss 4.5515 (3.4897) grad_norm 2.5282 (1.9763) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1090/1251] eta 0:00:39 lr 0.000927 wd 0.0500 time 0.2489 (0.2452) data time 0.0009 (0.0015) model time 0.2480 (0.2439) loss 2.7053 (3.4877) grad_norm 2.2084 (1.9783) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1100/1251] eta 0:00:37 lr 0.000927 wd 0.0500 time 0.2459 (0.2452) data time 0.0007 (0.0015) model time 0.2452 (0.2439) loss 3.8348 (3.4869) grad_norm 1.7653 (1.9774) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1110/1251] eta 0:00:34 lr 0.000927 wd 0.0500 time 0.2482 (0.2452) data time 0.0007 (0.0015) model time 0.2474 (0.2439) loss 2.8007 (3.4861) grad_norm 1.4652 (1.9747) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1120/1251] eta 0:00:32 lr 0.000927 wd 0.0500 time 0.2419 (0.2452) data time 0.0010 (0.0015) model time 0.2409 (0.2439) loss 3.0455 (3.4867) grad_norm 1.8735 (1.9754) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1130/1251] eta 0:00:29 lr 
0.000927 wd 0.0500 time 0.2393 (0.2451) data time 0.0010 (0.0015) model time 0.2384 (0.2438) loss 2.7525 (3.4885) grad_norm 2.4535 (1.9766) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1140/1251] eta 0:00:27 lr 0.000927 wd 0.0500 time 0.2415 (0.2451) data time 0.0011 (0.0015) model time 0.2404 (0.2438) loss 3.8661 (3.4888) grad_norm 1.7958 (1.9743) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1150/1251] eta 0:00:24 lr 0.000927 wd 0.0500 time 0.2431 (0.2451) data time 0.0007 (0.0015) model time 0.2424 (0.2438) loss 3.8446 (3.4903) grad_norm 1.7537 (1.9742) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1160/1251] eta 0:00:22 lr 0.000927 wd 0.0500 time 0.2366 (0.2451) data time 0.0009 (0.0015) model time 0.2357 (0.2438) loss 2.8157 (3.4889) grad_norm 1.4535 (1.9715) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1170/1251] eta 0:00:19 lr 0.000927 wd 0.0500 time 0.2413 (0.2451) data time 0.0009 (0.0015) model time 0.2404 (0.2438) loss 4.3081 (3.4914) grad_norm 1.8872 (1.9714) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1180/1251] eta 0:00:17 lr 0.000927 wd 0.0500 time 0.2333 (0.2450) data time 0.0007 (0.0014) model time 0.2326 (0.2437) loss 4.5562 (3.4937) grad_norm 3.6636 (1.9755) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1190/1251] eta 0:00:14 lr 0.000927 wd 0.0500 time 0.2411 (0.2450) data time 0.0008 (0.0014) model time 0.2403 (0.2437) loss 3.6961 (3.4946) grad_norm 1.6957 (1.9729) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
08:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1200/1251] eta 0:00:12 lr 0.000927 wd 0.0500 time 0.2363 (0.2450) data time 0.0009 (0.0014) model time 0.2354 (0.2437) loss 4.2172 (3.4963) grad_norm 2.6237 (1.9724) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1210/1251] eta 0:00:10 lr 0.000927 wd 0.0500 time 0.2422 (0.2449) data time 0.0008 (0.0014) model time 0.2413 (0.2437) loss 2.5632 (3.4967) grad_norm 2.4293 (1.9725) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1220/1251] eta 0:00:07 lr 0.000927 wd 0.0500 time 0.2435 (0.2449) data time 0.0008 (0.0014) model time 0.2427 (0.2437) loss 4.0758 (3.4958) grad_norm 2.0167 (1.9718) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1230/1251] eta 0:00:05 lr 0.000927 wd 0.0500 time 0.2433 (0.2449) data time 0.0009 (0.0014) model time 0.2424 (0.2436) loss 2.4630 (3.4949) grad_norm 2.3702 (1.9728) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1240/1251] eta 0:00:02 lr 0.000927 wd 0.0500 time 0.2259 (0.2450) data time 0.0005 (0.0014) model time 0.2254 (0.2437) loss 2.7929 (3.4928) grad_norm 1.6328 (1.9727) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [68/300][1250/1251] eta 0:00:00 lr 0.000927 wd 0.0500 time 0.2274 (0.2450) data time 0.0007 (0.0014) model time 0.2267 (0.2437) loss 4.0612 (3.4939) grad_norm 1.5695 (1.9707) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 68 training takes 0:05:06 [2024-08-26 08:09:53 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 08:09:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 08:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.446 (0.446) Loss 0.5361 (0.5361) Acc@1 90.430 (90.430) Acc@5 97.852 (97.852) Mem 7379MB [2024-08-26 08:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.110) Loss 0.8682 (0.8575) Acc@1 81.055 (81.454) Acc@5 95.898 (95.890) Mem 7379MB [2024-08-26 08:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.096) Loss 1.2949 (0.8932) Acc@1 67.480 (79.850) Acc@5 91.895 (95.787) Mem 7379MB [2024-08-26 08:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.091) Loss 1.5049 (1.0071) Acc@1 63.867 (77.391) Acc@5 88.184 (94.367) Mem 7379MB [2024-08-26 08:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.4160 (1.0733) Acc@1 66.895 (75.848) Acc@5 88.867 (93.517) Mem 7379MB [2024-08-26 08:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.486 Acc@5 93.394 [2024-08-26 08:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.5% [2024-08-26 08:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.753 (0.753) Loss 0.4700 (0.4700) Acc@1 91.113 (91.113) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 08:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.146) Loss 0.7622 (0.7367) Acc@1 84.375 (83.993) Acc@5 96.387 (96.760) Mem 7379MB [2024-08-26 08:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.113) Loss 1.0635 (0.7588) Acc@1 74.512 (82.938) Acc@5 93.262 (96.689) Mem 7379MB [2024-08-26 08:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.071 (0.102) Loss 1.3301 (0.8673) Acc@1 65.820 (80.403) Acc@5 90.039 (95.284) Mem 7379MB [2024-08-26 08:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.2324 (0.9256) Acc@1 70.215 (78.921) Acc@5 91.113 (94.672) Mem 7379MB [2024-08-26 08:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.562 Acc@5 94.592 [2024-08-26 08:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.6% [2024-08-26 08:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.56% [2024-08-26 08:10:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 08:10:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 08:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][0/1251] eta 0:13:26 lr 0.000927 wd 0.0500 time 0.6445 (0.6445) data time 0.4226 (0.4226) model time 0.0000 (0.0000) loss 3.6620 (3.6620) grad_norm 1.7503 (1.7503) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][10/1251] eta 0:06:00 lr 0.000927 wd 0.0500 time 0.2428 (0.2906) data time 0.0007 (0.0393) model time 0.0000 (0.0000) loss 3.4795 (3.5708) grad_norm 1.5130 (1.8910) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][20/1251] eta 0:05:28 lr 0.000927 wd 0.0500 time 0.2354 (0.2667) data time 0.0011 (0.0211) model time 0.0000 (0.0000) loss 3.8385 (3.5515) grad_norm 1.8562 (2.0307) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][30/1251] eta 0:05:15 lr 0.000927 wd 0.0500 time 0.2417 (0.2585) data time 0.0009 (0.0146) 
model time 0.0000 (0.0000) loss 2.5756 (3.4950) grad_norm 1.7054 (1.9671) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][40/1251] eta 0:05:14 lr 0.000927 wd 0.0500 time 0.2483 (0.2598) data time 0.0009 (0.0113) model time 0.0000 (0.0000) loss 3.0471 (3.5263) grad_norm 1.7548 (1.9796) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][50/1251] eta 0:05:12 lr 0.000927 wd 0.0500 time 0.2399 (0.2599) data time 0.0010 (0.0093) model time 0.0000 (0.0000) loss 3.4495 (3.5013) grad_norm 1.7912 (1.9901) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][60/1251] eta 0:05:10 lr 0.000927 wd 0.0500 time 0.2408 (0.2609) data time 0.0011 (0.0079) model time 0.2396 (0.2653) loss 4.0014 (3.5432) grad_norm 1.6102 (1.9815) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][70/1251] eta 0:05:05 lr 0.000927 wd 0.0500 time 0.2413 (0.2587) data time 0.0012 (0.0070) model time 0.2401 (0.2546) loss 3.5518 (3.5408) grad_norm 1.5635 (1.9702) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][80/1251] eta 0:05:00 lr 0.000927 wd 0.0500 time 0.2422 (0.2566) data time 0.0009 (0.0062) model time 0.2413 (0.2502) loss 3.6764 (3.5419) grad_norm 1.7036 (1.9517) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][90/1251] eta 0:04:56 lr 0.000927 wd 0.0500 time 0.2440 (0.2553) data time 0.0011 (0.0057) model time 0.2429 (0.2485) loss 3.5701 (3.5105) grad_norm 2.8751 (1.9601) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[69/300][100/1251] eta 0:04:52 lr 0.000927 wd 0.0500 time 0.2346 (0.2539) data time 0.0010 (0.0052) model time 0.2336 (0.2468) loss 3.5962 (3.4975) grad_norm 3.2699 (1.9777) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][110/1251] eta 0:04:48 lr 0.000927 wd 0.0500 time 0.2442 (0.2528) data time 0.0010 (0.0048) model time 0.2432 (0.2457) loss 3.5210 (3.5098) grad_norm 1.7415 (1.9645) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][120/1251] eta 0:04:44 lr 0.000927 wd 0.0500 time 0.2341 (0.2517) data time 0.0008 (0.0045) model time 0.2332 (0.2448) loss 2.7415 (3.4767) grad_norm 2.4892 (1.9529) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][130/1251] eta 0:04:41 lr 0.000927 wd 0.0500 time 0.2489 (0.2510) data time 0.0007 (0.0043) model time 0.2482 (0.2443) loss 2.4531 (3.4872) grad_norm 1.7703 (1.9615) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][140/1251] eta 0:04:38 lr 0.000927 wd 0.0500 time 0.2344 (0.2502) data time 0.0010 (0.0040) model time 0.2334 (0.2437) loss 4.3002 (3.4854) grad_norm 1.7657 (1.9837) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][150/1251] eta 0:04:34 lr 0.000927 wd 0.0500 time 0.2375 (0.2496) data time 0.0013 (0.0038) model time 0.2362 (0.2433) loss 3.6096 (3.4787) grad_norm 2.5873 (1.9793) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][160/1251] eta 0:04:31 lr 0.000927 wd 0.0500 time 0.2423 (0.2491) data time 0.0011 (0.0037) model time 0.2412 (0.2431) loss 2.8477 (3.4829) grad_norm 1.7333 (1.9631) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 08:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][170/1251] eta 0:04:28 lr 0.000927 wd 0.0500 time 0.2400 (0.2487) data time 0.0010 (0.0035) model time 0.2390 (0.2429) loss 3.3004 (3.4808) grad_norm 1.5468 (1.9547) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][180/1251] eta 0:04:26 lr 0.000927 wd 0.0500 time 0.2435 (0.2484) data time 0.0008 (0.0034) model time 0.2427 (0.2428) loss 4.1832 (3.4902) grad_norm 2.1746 (1.9480) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][190/1251] eta 0:04:23 lr 0.000927 wd 0.0500 time 0.2374 (0.2481) data time 0.0009 (0.0032) model time 0.2365 (0.2427) loss 4.0376 (3.4992) grad_norm 2.3611 (1.9557) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][200/1251] eta 0:04:20 lr 0.000927 wd 0.0500 time 0.2414 (0.2477) data time 0.0007 (0.0031) model time 0.2407 (0.2425) loss 4.1882 (3.4997) grad_norm 2.0769 (1.9521) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][210/1251] eta 0:04:17 lr 0.000927 wd 0.0500 time 0.2407 (0.2473) data time 0.0010 (0.0030) model time 0.2397 (0.2423) loss 2.9470 (3.4928) grad_norm 2.5989 (1.9660) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][220/1251] eta 0:04:14 lr 0.000927 wd 0.0500 time 0.2434 (0.2471) data time 0.0007 (0.0029) model time 0.2427 (0.2422) loss 3.8398 (3.4831) grad_norm 1.4226 (1.9659) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][230/1251] eta 0:04:12 lr 0.000927 wd 0.0500 time 0.2399 (0.2469) data time 0.0009 (0.0029) 
model time 0.2389 (0.2422) loss 3.6780 (3.4861) grad_norm 1.5440 (1.9581) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][240/1251] eta 0:04:09 lr 0.000926 wd 0.0500 time 0.2435 (0.2468) data time 0.0008 (0.0028) model time 0.2427 (0.2422) loss 2.5905 (3.4790) grad_norm 1.4469 (1.9495) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][250/1251] eta 0:04:06 lr 0.000926 wd 0.0500 time 0.2417 (0.2466) data time 0.0008 (0.0027) model time 0.2409 (0.2422) loss 3.4212 (3.4664) grad_norm 1.4909 (1.9449) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][260/1251] eta 0:04:04 lr 0.000926 wd 0.0500 time 0.2342 (0.2465) data time 0.0009 (0.0026) model time 0.2333 (0.2422) loss 3.1715 (3.4770) grad_norm 1.7146 (1.9559) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][270/1251] eta 0:04:02 lr 0.000926 wd 0.0500 time 0.4438 (0.2471) data time 0.0011 (0.0026) model time 0.4426 (0.2431) loss 3.8083 (3.4681) grad_norm 1.7901 (1.9525) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][280/1251] eta 0:03:59 lr 0.000926 wd 0.0500 time 0.2433 (0.2468) data time 0.0010 (0.0025) model time 0.2423 (0.2429) loss 3.1222 (3.4753) grad_norm 5.6442 (1.9773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][290/1251] eta 0:03:57 lr 0.000926 wd 0.0500 time 0.2482 (0.2467) data time 0.0010 (0.0025) model time 0.2472 (0.2428) loss 2.7570 (3.4688) grad_norm 1.8528 (1.9820) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[69/300][300/1251] eta 0:03:54 lr 0.000926 wd 0.0500 time 0.2449 (0.2465) data time 0.0007 (0.0024) model time 0.2442 (0.2427) loss 2.8481 (3.4765) grad_norm 1.9626 (1.9818) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][310/1251] eta 0:03:51 lr 0.000926 wd 0.0500 time 0.2381 (0.2463) data time 0.0010 (0.0024) model time 0.2370 (0.2426) loss 2.7408 (3.4752) grad_norm 2.3308 (1.9819) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][320/1251] eta 0:03:49 lr 0.000926 wd 0.0500 time 0.2454 (0.2461) data time 0.0011 (0.0023) model time 0.2443 (0.2425) loss 3.3100 (3.4776) grad_norm 1.9793 (1.9852) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][330/1251] eta 0:03:46 lr 0.000926 wd 0.0500 time 0.2546 (0.2460) data time 0.0007 (0.0023) model time 0.2539 (0.2424) loss 3.9452 (3.4787) grad_norm 2.2540 (1.9869) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][340/1251] eta 0:03:43 lr 0.000926 wd 0.0500 time 0.2420 (0.2458) data time 0.0007 (0.0023) model time 0.2413 (0.2423) loss 3.7734 (3.4796) grad_norm 1.5080 (1.9877) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][350/1251] eta 0:03:41 lr 0.000926 wd 0.0500 time 0.2340 (0.2456) data time 0.0011 (0.0022) model time 0.2329 (0.2422) loss 2.9860 (3.4753) grad_norm 2.4573 (1.9866) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][360/1251] eta 0:03:38 lr 0.000926 wd 0.0500 time 0.2351 (0.2455) data time 0.0010 (0.0022) model time 0.2341 (0.2421) loss 2.1883 (3.4716) grad_norm 1.9251 (1.9970) loss_scale 8192.0000 
(4186.7701) mem 7379MB [2024-08-26 08:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][370/1251] eta 0:03:36 lr 0.000926 wd 0.0500 time 0.2410 (0.2454) data time 0.0009 (0.0022) model time 0.2401 (0.2421) loss 3.6894 (3.4776) grad_norm 2.2757 (2.0056) loss_scale 8192.0000 (4294.7278) mem 7379MB [2024-08-26 08:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][380/1251] eta 0:03:33 lr 0.000926 wd 0.0500 time 0.2418 (0.2453) data time 0.0010 (0.0021) model time 0.2408 (0.2420) loss 3.1771 (3.4801) grad_norm 1.6329 (2.0024) loss_scale 8192.0000 (4397.0184) mem 7379MB [2024-08-26 08:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][390/1251] eta 0:03:31 lr 0.000926 wd 0.0500 time 0.2466 (0.2453) data time 0.0007 (0.0021) model time 0.2459 (0.2420) loss 3.5303 (3.4728) grad_norm 2.0435 (2.0043) loss_scale 8192.0000 (4494.0767) mem 7379MB [2024-08-26 08:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][400/1251] eta 0:03:28 lr 0.000926 wd 0.0500 time 0.2440 (0.2451) data time 0.0010 (0.0021) model time 0.2430 (0.2419) loss 3.9871 (3.4782) grad_norm 1.5210 (2.0011) loss_scale 8192.0000 (4586.2943) mem 7379MB [2024-08-26 08:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][410/1251] eta 0:03:26 lr 0.000926 wd 0.0500 time 0.2465 (0.2450) data time 0.0010 (0.0021) model time 0.2455 (0.2419) loss 3.9928 (3.4853) grad_norm 2.0061 (2.0037) loss_scale 8192.0000 (4674.0243) mem 7379MB [2024-08-26 08:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][420/1251] eta 0:03:23 lr 0.000926 wd 0.0500 time 0.2465 (0.2450) data time 0.0010 (0.0020) model time 0.2455 (0.2419) loss 2.7742 (3.4769) grad_norm 1.4916 (2.0043) loss_scale 8192.0000 (4757.5867) mem 7379MB [2024-08-26 08:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][430/1251] eta 0:03:21 lr 0.000926 wd 0.0500 time 0.2446 (0.2449) data time 0.0008 (0.0020) 
model time 0.2438 (0.2418) loss 3.5849 (3.4783) grad_norm 3.2547 (2.0038) loss_scale 8192.0000 (4837.2715) mem 7379MB [2024-08-26 08:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][440/1251] eta 0:03:18 lr 0.000926 wd 0.0500 time 0.2447 (0.2449) data time 0.0007 (0.0020) model time 0.2440 (0.2419) loss 3.6131 (3.4759) grad_norm 1.6632 (2.0014) loss_scale 8192.0000 (4913.3424) mem 7379MB [2024-08-26 08:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][450/1251] eta 0:03:16 lr 0.000926 wd 0.0500 time 0.2516 (0.2448) data time 0.0011 (0.0020) model time 0.2505 (0.2419) loss 3.5072 (3.4804) grad_norm 1.5325 (1.9991) loss_scale 8192.0000 (4986.0399) mem 7379MB [2024-08-26 08:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][460/1251] eta 0:03:13 lr 0.000926 wd 0.0500 time 0.2408 (0.2447) data time 0.0010 (0.0020) model time 0.2398 (0.2418) loss 3.8763 (3.4852) grad_norm 1.6603 (1.9945) loss_scale 8192.0000 (5055.5835) mem 7379MB [2024-08-26 08:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][470/1251] eta 0:03:11 lr 0.000926 wd 0.0500 time 0.2463 (0.2448) data time 0.0008 (0.0019) model time 0.2455 (0.2419) loss 3.9448 (3.4845) grad_norm 1.8420 (1.9938) loss_scale 8192.0000 (5122.1741) mem 7379MB [2024-08-26 08:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][480/1251] eta 0:03:08 lr 0.000926 wd 0.0500 time 0.2415 (0.2450) data time 0.0010 (0.0019) model time 0.2405 (0.2422) loss 3.5565 (3.4826) grad_norm 1.7879 (1.9932) loss_scale 8192.0000 (5185.9958) mem 7379MB [2024-08-26 08:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][490/1251] eta 0:03:06 lr 0.000926 wd 0.0500 time 0.2402 (0.2450) data time 0.0010 (0.0019) model time 0.2392 (0.2422) loss 2.7412 (3.4807) grad_norm 3.6143 (1.9957) loss_scale 8192.0000 (5247.2179) mem 7379MB [2024-08-26 08:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[69/300][500/1251] eta 0:03:03 lr 0.000926 wd 0.0500 time 0.2433 (0.2449) data time 0.0009 (0.0019) model time 0.2424 (0.2421) loss 2.8682 (3.4771) grad_norm 1.9587 (1.9985) loss_scale 8192.0000 (5305.9960) mem 7379MB [2024-08-26 08:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][510/1251] eta 0:03:01 lr 0.000926 wd 0.0500 time 0.2416 (0.2448) data time 0.0007 (0.0019) model time 0.2409 (0.2421) loss 4.2459 (3.4825) grad_norm 1.3673 (1.9957) loss_scale 8192.0000 (5362.4736) mem 7379MB [2024-08-26 08:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][520/1251] eta 0:02:58 lr 0.000926 wd 0.0500 time 0.2543 (0.2448) data time 0.0008 (0.0018) model time 0.2535 (0.2421) loss 2.4616 (3.4787) grad_norm 1.6423 (1.9945) loss_scale 8192.0000 (5416.7831) mem 7379MB [2024-08-26 08:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][530/1251] eta 0:02:56 lr 0.000926 wd 0.0500 time 0.2352 (0.2450) data time 0.0010 (0.0018) model time 0.2343 (0.2424) loss 3.8496 (3.4775) grad_norm 1.5649 (1.9909) loss_scale 8192.0000 (5469.0471) mem 7379MB [2024-08-26 08:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][540/1251] eta 0:02:54 lr 0.000926 wd 0.0500 time 0.2426 (0.2458) data time 0.0010 (0.0018) model time 0.2416 (0.2433) loss 3.7541 (3.4797) grad_norm 2.1977 (1.9854) loss_scale 8192.0000 (5519.3789) mem 7379MB [2024-08-26 08:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][550/1251] eta 0:02:52 lr 0.000926 wd 0.0500 time 0.2383 (0.2457) data time 0.0008 (0.0018) model time 0.2375 (0.2433) loss 3.8534 (3.4859) grad_norm 1.5459 (1.9839) loss_scale 8192.0000 (5567.8838) mem 7379MB [2024-08-26 08:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][560/1251] eta 0:02:49 lr 0.000926 wd 0.0500 time 0.2504 (0.2457) data time 0.0008 (0.0018) model time 0.2495 (0.2432) loss 4.1246 (3.4843) grad_norm 2.9264 (1.9841) loss_scale 8192.0000 
(5614.6595) mem 7379MB [2024-08-26 08:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][570/1251] eta 0:02:47 lr 0.000926 wd 0.0500 time 0.2438 (0.2456) data time 0.0008 (0.0018) model time 0.2429 (0.2431) loss 3.9752 (3.4852) grad_norm 2.5449 (1.9847) loss_scale 8192.0000 (5659.7968) mem 7379MB [2024-08-26 08:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][580/1251] eta 0:02:45 lr 0.000926 wd 0.0500 time 0.2384 (0.2462) data time 0.0007 (0.0018) model time 0.2377 (0.2438) loss 4.1991 (3.4847) grad_norm 1.8275 (1.9851) loss_scale 8192.0000 (5703.3804) mem 7379MB [2024-08-26 08:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][590/1251] eta 0:02:42 lr 0.000926 wd 0.0500 time 0.2395 (0.2461) data time 0.0011 (0.0018) model time 0.2385 (0.2438) loss 3.6270 (3.4848) grad_norm 1.9123 (1.9857) loss_scale 8192.0000 (5745.4890) mem 7379MB [2024-08-26 08:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][600/1251] eta 0:02:40 lr 0.000926 wd 0.0500 time 0.2351 (0.2464) data time 0.0009 (0.0017) model time 0.2342 (0.2441) loss 4.4090 (3.4845) grad_norm 2.3056 (1.9918) loss_scale 8192.0000 (5786.1963) mem 7379MB [2024-08-26 08:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][610/1251] eta 0:02:37 lr 0.000926 wd 0.0500 time 0.2463 (0.2463) data time 0.0009 (0.0017) model time 0.2453 (0.2440) loss 3.8423 (3.4829) grad_norm 1.5036 (1.9933) loss_scale 8192.0000 (5825.5712) mem 7379MB [2024-08-26 08:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][620/1251] eta 0:02:35 lr 0.000926 wd 0.0500 time 0.2472 (0.2462) data time 0.0008 (0.0017) model time 0.2464 (0.2440) loss 4.4166 (3.4901) grad_norm 2.7649 (1.9923) loss_scale 8192.0000 (5863.6779) mem 7379MB [2024-08-26 08:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][630/1251] eta 0:02:33 lr 0.000926 wd 0.0500 time 0.4605 (0.2465) data time 0.0008 (0.0017) 
model time 0.4596 (0.2443) loss 2.8134 (3.4851) grad_norm 2.9504 (1.9923) loss_scale 8192.0000 (5900.5769) mem 7379MB [2024-08-26 08:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][640/1251] eta 0:02:30 lr 0.000926 wd 0.0500 time 0.2453 (0.2468) data time 0.0008 (0.0017) model time 0.2444 (0.2446) loss 4.0034 (3.4843) grad_norm 2.0892 (1.9950) loss_scale 8192.0000 (5936.3245) mem 7379MB [2024-08-26 08:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][650/1251] eta 0:02:28 lr 0.000926 wd 0.0500 time 0.2385 (0.2466) data time 0.0013 (0.0017) model time 0.2372 (0.2445) loss 2.3185 (3.4869) grad_norm 1.5205 (1.9922) loss_scale 8192.0000 (5970.9739) mem 7379MB [2024-08-26 08:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][660/1251] eta 0:02:25 lr 0.000926 wd 0.0500 time 0.2367 (0.2465) data time 0.0010 (0.0017) model time 0.2357 (0.2444) loss 3.5048 (3.4825) grad_norm 2.1467 (1.9933) loss_scale 8192.0000 (6004.5749) mem 7379MB [2024-08-26 08:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][670/1251] eta 0:02:23 lr 0.000925 wd 0.0500 time 0.2405 (0.2465) data time 0.0012 (0.0017) model time 0.2393 (0.2444) loss 3.4581 (3.4869) grad_norm 2.0279 (1.9926) loss_scale 8192.0000 (6037.1744) mem 7379MB [2024-08-26 08:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][680/1251] eta 0:02:20 lr 0.000925 wd 0.0500 time 0.2461 (0.2465) data time 0.0008 (0.0017) model time 0.2454 (0.2443) loss 3.7912 (3.4911) grad_norm 2.5453 (1.9964) loss_scale 8192.0000 (6068.8164) mem 7379MB [2024-08-26 08:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][690/1251] eta 0:02:18 lr 0.000925 wd 0.0500 time 0.2412 (0.2464) data time 0.0010 (0.0016) model time 0.2402 (0.2443) loss 4.1531 (3.4902) grad_norm 1.7387 (1.9972) loss_scale 8192.0000 (6099.5427) mem 7379MB [2024-08-26 08:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[69/300][700/1251] eta 0:02:15 lr 0.000925 wd 0.0500 time 0.2477 (0.2463) data time 0.0009 (0.0016) model time 0.2469 (0.2442) loss 3.8698 (3.4948) grad_norm 2.2982 (1.9975) loss_scale 8192.0000 (6129.3923) mem 7379MB [2024-08-26 08:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][710/1251] eta 0:02:13 lr 0.000925 wd 0.0500 time 0.2379 (0.2462) data time 0.0012 (0.0016) model time 0.2367 (0.2442) loss 2.9964 (3.4953) grad_norm 1.9229 (1.9980) loss_scale 8192.0000 (6158.4023) mem 7379MB [2024-08-26 08:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][720/1251] eta 0:02:10 lr 0.000925 wd 0.0500 time 0.2428 (0.2462) data time 0.0010 (0.0016) model time 0.2418 (0.2441) loss 3.6036 (3.4969) grad_norm 1.7089 (1.9945) loss_scale 8192.0000 (6186.6075) mem 7379MB [2024-08-26 08:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][730/1251] eta 0:02:08 lr 0.000925 wd 0.0500 time 0.2462 (0.2461) data time 0.0007 (0.0016) model time 0.2456 (0.2441) loss 3.4287 (3.4978) grad_norm 1.4138 (1.9990) loss_scale 8192.0000 (6214.0410) mem 7379MB [2024-08-26 08:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][740/1251] eta 0:02:05 lr 0.000925 wd 0.0500 time 0.2352 (0.2461) data time 0.0008 (0.0016) model time 0.2344 (0.2440) loss 3.9741 (3.4998) grad_norm 2.7301 (1.9984) loss_scale 8192.0000 (6240.7341) mem 7379MB [2024-08-26 08:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][750/1251] eta 0:02:03 lr 0.000925 wd 0.0500 time 0.2448 (0.2460) data time 0.0010 (0.0016) model time 0.2438 (0.2440) loss 2.8085 (3.5001) grad_norm 2.1841 (1.9977) loss_scale 8192.0000 (6266.7164) mem 7379MB [2024-08-26 08:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][760/1251] eta 0:02:00 lr 0.000925 wd 0.0500 time 0.2438 (0.2460) data time 0.0015 (0.0016) model time 0.2423 (0.2439) loss 3.6339 (3.4970) grad_norm 1.7540 (1.9947) loss_scale 8192.0000 
(6292.0158) mem 7379MB [2024-08-26 08:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][770/1251] eta 0:01:58 lr 0.000925 wd 0.0500 time 0.2406 (0.2459) data time 0.0010 (0.0016) model time 0.2397 (0.2439) loss 3.1479 (3.4957) grad_norm 2.1482 (1.9955) loss_scale 8192.0000 (6316.6589) mem 7379MB [2024-08-26 08:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][780/1251] eta 0:01:55 lr 0.000925 wd 0.0500 time 0.2360 (0.2458) data time 0.0011 (0.0016) model time 0.2350 (0.2438) loss 3.2831 (3.4952) grad_norm 1.8869 (1.9961) loss_scale 8192.0000 (6340.6709) mem 7379MB [2024-08-26 08:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][790/1251] eta 0:01:53 lr 0.000925 wd 0.0500 time 0.2337 (0.2460) data time 0.0011 (0.0016) model time 0.2326 (0.2440) loss 3.9211 (3.4969) grad_norm 2.1255 (1.9929) loss_scale 8192.0000 (6364.0759) mem 7379MB [2024-08-26 08:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][800/1251] eta 0:01:50 lr 0.000925 wd 0.0500 time 0.2435 (0.2459) data time 0.0010 (0.0016) model time 0.2425 (0.2440) loss 2.6627 (3.4980) grad_norm 1.8543 (1.9905) loss_scale 8192.0000 (6386.8964) mem 7379MB [2024-08-26 08:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][810/1251] eta 0:01:48 lr 0.000925 wd 0.0500 time 0.2339 (0.2459) data time 0.0009 (0.0016) model time 0.2330 (0.2439) loss 2.4350 (3.4957) grad_norm 1.9876 (1.9920) loss_scale 8192.0000 (6409.1541) mem 7379MB [2024-08-26 08:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][820/1251] eta 0:01:45 lr 0.000925 wd 0.0500 time 0.2424 (0.2458) data time 0.0011 (0.0015) model time 0.2413 (0.2439) loss 4.1494 (3.4966) grad_norm 1.6983 (1.9914) loss_scale 8192.0000 (6430.8697) mem 7379MB [2024-08-26 08:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][830/1251] eta 0:01:43 lr 0.000925 wd 0.0500 time 0.2428 (0.2457) data time 0.0007 (0.0015) 
model time 0.2421 (0.2438) loss 4.2475 (3.4997) grad_norm 1.8983 (1.9892) loss_scale 8192.0000 (6452.0626) mem 7379MB [2024-08-26 08:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][840/1251] eta 0:01:40 lr 0.000925 wd 0.0500 time 0.2427 (0.2457) data time 0.0010 (0.0015) model time 0.2417 (0.2438) loss 2.9815 (3.4991) grad_norm 4.2467 (1.9978) loss_scale 8192.0000 (6472.7515) mem 7379MB [2024-08-26 08:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][850/1251] eta 0:01:38 lr 0.000925 wd 0.0500 time 0.2403 (0.2456) data time 0.0007 (0.0015) model time 0.2396 (0.2437) loss 2.5716 (3.4982) grad_norm 1.7906 (1.9966) loss_scale 8192.0000 (6492.9542) mem 7379MB [2024-08-26 08:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][860/1251] eta 0:01:36 lr 0.000925 wd 0.0500 time 0.2356 (0.2456) data time 0.0008 (0.0015) model time 0.2348 (0.2437) loss 3.7457 (3.4963) grad_norm 2.1874 (1.9949) loss_scale 8192.0000 (6512.6876) mem 7379MB [2024-08-26 08:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][870/1251] eta 0:01:33 lr 0.000925 wd 0.0500 time 0.2447 (0.2455) data time 0.0007 (0.0015) model time 0.2440 (0.2436) loss 4.0133 (3.4994) grad_norm 2.0341 (1.9939) loss_scale 8192.0000 (6531.9679) mem 7379MB [2024-08-26 08:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][880/1251] eta 0:01:31 lr 0.000925 wd 0.0500 time 0.2374 (0.2455) data time 0.0007 (0.0015) model time 0.2366 (0.2436) loss 3.4808 (3.5007) grad_norm 1.8241 (1.9949) loss_scale 8192.0000 (6550.8104) mem 7379MB [2024-08-26 08:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][890/1251] eta 0:01:28 lr 0.000925 wd 0.0500 time 0.2490 (0.2455) data time 0.0008 (0.0015) model time 0.2483 (0.2436) loss 3.7925 (3.5024) grad_norm 1.7685 (1.9957) loss_scale 8192.0000 (6569.2301) mem 7379MB [2024-08-26 08:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[69/300][900/1251] eta 0:01:26 lr 0.000925 wd 0.0500 time 0.2429 (0.2455) data time 0.0009 (0.0015) model time 0.2420 (0.2436) loss 4.0671 (3.5037) grad_norm 1.4238 (1.9959) loss_scale 8192.0000 (6587.2408) mem 7379MB [2024-08-26 08:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][910/1251] eta 0:01:23 lr 0.000925 wd 0.0500 time 0.2470 (0.2456) data time 0.0010 (0.0015) model time 0.2460 (0.2438) loss 3.4591 (3.5064) grad_norm 2.4132 (1.9965) loss_scale 8192.0000 (6604.8562) mem 7379MB [2024-08-26 08:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][920/1251] eta 0:01:21 lr 0.000925 wd 0.0500 time 0.2319 (0.2456) data time 0.0011 (0.0015) model time 0.2308 (0.2437) loss 3.3690 (3.5074) grad_norm 2.6604 (1.9981) loss_scale 8192.0000 (6622.0890) mem 7379MB [2024-08-26 08:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][930/1251] eta 0:01:18 lr 0.000925 wd 0.0500 time 0.2521 (0.2456) data time 0.0009 (0.0015) model time 0.2512 (0.2437) loss 3.3277 (3.5049) grad_norm 1.3338 (1.9949) loss_scale 8192.0000 (6638.9517) mem 7379MB [2024-08-26 08:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][940/1251] eta 0:01:16 lr 0.000925 wd 0.0500 time 0.2463 (0.2455) data time 0.0007 (0.0015) model time 0.2456 (0.2437) loss 3.0300 (3.5048) grad_norm 1.4917 (1.9935) loss_scale 8192.0000 (6655.4559) mem 7379MB [2024-08-26 08:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][950/1251] eta 0:01:13 lr 0.000925 wd 0.0500 time 0.2411 (0.2455) data time 0.0010 (0.0015) model time 0.2402 (0.2436) loss 3.4909 (3.5053) grad_norm 1.6169 (1.9917) loss_scale 8192.0000 (6671.6130) mem 7379MB [2024-08-26 08:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][960/1251] eta 0:01:11 lr 0.000925 wd 0.0500 time 0.2471 (0.2455) data time 0.0011 (0.0015) model time 0.2460 (0.2436) loss 3.6156 (3.5056) grad_norm 1.5083 (1.9905) loss_scale 8192.0000 
(6687.4339) mem 7379MB [2024-08-26 08:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][970/1251] eta 0:01:08 lr 0.000925 wd 0.0500 time 0.2382 (0.2455) data time 0.0007 (0.0015) model time 0.2374 (0.2436) loss 3.8513 (3.5051) grad_norm 1.9413 (1.9901) loss_scale 8192.0000 (6702.9289) mem 7379MB [2024-08-26 08:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][980/1251] eta 0:01:06 lr 0.000925 wd 0.0500 time 0.2421 (0.2454) data time 0.0011 (0.0015) model time 0.2410 (0.2436) loss 2.9163 (3.5065) grad_norm 1.6450 (1.9890) loss_scale 8192.0000 (6718.1081) mem 7379MB [2024-08-26 08:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][990/1251] eta 0:01:04 lr 0.000925 wd 0.0500 time 0.2361 (0.2454) data time 0.0009 (0.0015) model time 0.2353 (0.2436) loss 3.5657 (3.5090) grad_norm 1.9762 (1.9889) loss_scale 8192.0000 (6732.9808) mem 7379MB [2024-08-26 08:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1000/1251] eta 0:01:01 lr 0.000925 wd 0.0500 time 0.2416 (0.2455) data time 0.0010 (0.0014) model time 0.2406 (0.2437) loss 3.7940 (3.5092) grad_norm 2.3376 (1.9886) loss_scale 8192.0000 (6747.5564) mem 7379MB [2024-08-26 08:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1010/1251] eta 0:00:59 lr 0.000925 wd 0.0500 time 0.2452 (0.2455) data time 0.0010 (0.0014) model time 0.2442 (0.2437) loss 3.8743 (3.5087) grad_norm 1.9178 (1.9873) loss_scale 8192.0000 (6761.8437) mem 7379MB [2024-08-26 08:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1020/1251] eta 0:00:56 lr 0.000925 wd 0.0500 time 0.2419 (0.2455) data time 0.0011 (0.0014) model time 0.2408 (0.2437) loss 3.7741 (3.5106) grad_norm 2.5627 (1.9887) loss_scale 8192.0000 (6775.8511) mem 7379MB [2024-08-26 08:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1030/1251] eta 0:00:54 lr 0.000925 wd 0.0500 time 0.2499 (0.2455) data time 0.0010 
(0.0014) model time 0.2489 (0.2437) loss 3.5429 (3.5096) grad_norm 1.9495 (1.9885) loss_scale 8192.0000 (6789.5868) mem 7379MB [2024-08-26 08:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1040/1251] eta 0:00:51 lr 0.000925 wd 0.0500 time 0.2436 (0.2454) data time 0.0010 (0.0014) model time 0.2426 (0.2437) loss 3.3235 (3.5098) grad_norm 4.6416 (1.9912) loss_scale 8192.0000 (6803.0586) mem 7379MB [2024-08-26 08:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1050/1251] eta 0:00:49 lr 0.000925 wd 0.0500 time 0.4156 (0.2456) data time 0.0007 (0.0014) model time 0.4148 (0.2438) loss 4.4059 (3.5102) grad_norm 2.0808 (1.9924) loss_scale 8192.0000 (6816.2740) mem 7379MB [2024-08-26 08:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1060/1251] eta 0:00:46 lr 0.000925 wd 0.0500 time 0.2438 (0.2458) data time 0.0010 (0.0014) model time 0.2428 (0.2441) loss 3.1274 (3.5102) grad_norm 1.7420 (1.9899) loss_scale 8192.0000 (6829.2403) mem 7379MB [2024-08-26 08:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1070/1251] eta 0:00:44 lr 0.000925 wd 0.0500 time 0.2409 (0.2458) data time 0.0010 (0.0014) model time 0.2399 (0.2440) loss 3.8779 (3.5110) grad_norm 1.4922 (1.9876) loss_scale 8192.0000 (6841.9645) mem 7379MB [2024-08-26 08:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1080/1251] eta 0:00:42 lr 0.000925 wd 0.0500 time 0.2468 (0.2457) data time 0.0010 (0.0014) model time 0.2458 (0.2440) loss 3.6094 (3.5086) grad_norm 1.9627 (1.9868) loss_scale 8192.0000 (6854.4533) mem 7379MB [2024-08-26 08:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1090/1251] eta 0:00:39 lr 0.000925 wd 0.0500 time 0.2416 (0.2457) data time 0.0008 (0.0014) model time 0.2408 (0.2440) loss 4.0435 (3.5108) grad_norm 1.9575 (1.9885) loss_scale 8192.0000 (6866.7131) mem 7379MB [2024-08-26 08:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [69/300][1100/1251] eta 0:00:37 lr 0.000924 wd 0.0500 time 0.2449 (0.2459) data time 0.0010 (0.0014) model time 0.2439 (0.2442) loss 3.9492 (3.5135) grad_norm 1.4238 (1.9887) loss_scale 8192.0000 (6878.7502) mem 7379MB [2024-08-26 08:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1110/1251] eta 0:00:34 lr 0.000924 wd 0.0500 time 0.2446 (0.2459) data time 0.0007 (0.0014) model time 0.2439 (0.2442) loss 3.9052 (3.5144) grad_norm 1.7385 (1.9878) loss_scale 8192.0000 (6890.5707) mem 7379MB [2024-08-26 08:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1120/1251] eta 0:00:32 lr 0.000924 wd 0.0500 time 0.2398 (0.2460) data time 0.0012 (0.0014) model time 0.2386 (0.2443) loss 3.9288 (3.5147) grad_norm 2.0472 (1.9880) loss_scale 8192.0000 (6902.1802) mem 7379MB [2024-08-26 08:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1130/1251] eta 0:00:29 lr 0.000924 wd 0.0500 time 0.2445 (0.2460) data time 0.0011 (0.0014) model time 0.2434 (0.2443) loss 4.0880 (3.5173) grad_norm 3.0646 (1.9894) loss_scale 8192.0000 (6913.5844) mem 7379MB [2024-08-26 08:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1140/1251] eta 0:00:27 lr 0.000924 wd 0.0500 time 0.2388 (0.2460) data time 0.0007 (0.0014) model time 0.2381 (0.2443) loss 3.4901 (3.5186) grad_norm 2.8616 (1.9924) loss_scale 8192.0000 (6924.7888) mem 7379MB [2024-08-26 08:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1150/1251] eta 0:00:24 lr 0.000924 wd 0.0500 time 0.2587 (0.2459) data time 0.0007 (0.0014) model time 0.2580 (0.2443) loss 4.0332 (3.5175) grad_norm 2.7281 (1.9924) loss_scale 8192.0000 (6935.7984) mem 7379MB [2024-08-26 08:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1160/1251] eta 0:00:22 lr 0.000924 wd 0.0500 time 0.2538 (0.2459) data time 0.0010 (0.0014) model time 0.2528 (0.2443) loss 3.3325 (3.5153) grad_norm 1.8465 (1.9902) loss_scale 
8192.0000 (6946.6184) mem 7379MB [2024-08-26 08:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1170/1251] eta 0:00:19 lr 0.000924 wd 0.0500 time 0.2468 (0.2461) data time 0.0010 (0.0014) model time 0.2458 (0.2444) loss 2.6088 (3.5138) grad_norm 2.6940 (1.9894) loss_scale 8192.0000 (6957.2536) mem 7379MB [2024-08-26 08:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1180/1251] eta 0:00:17 lr 0.000924 wd 0.0500 time 0.2437 (0.2462) data time 0.0010 (0.0014) model time 0.2427 (0.2446) loss 3.4176 (3.5132) grad_norm 2.2580 (1.9900) loss_scale 8192.0000 (6967.7087) mem 7379MB [2024-08-26 08:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1190/1251] eta 0:00:15 lr 0.000924 wd 0.0500 time 0.2412 (0.2461) data time 0.0008 (0.0014) model time 0.2403 (0.2445) loss 3.7814 (3.5131) grad_norm 2.0180 (1.9908) loss_scale 8192.0000 (6977.9882) mem 7379MB [2024-08-26 08:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1200/1251] eta 0:00:12 lr 0.000924 wd 0.0500 time 0.2411 (0.2461) data time 0.0011 (0.0014) model time 0.2399 (0.2445) loss 3.3560 (3.5128) grad_norm 1.6934 (1.9949) loss_scale 8192.0000 (6988.0966) mem 7379MB [2024-08-26 08:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1210/1251] eta 0:00:10 lr 0.000924 wd 0.0500 time 0.2382 (0.2461) data time 0.0008 (0.0014) model time 0.2375 (0.2445) loss 3.2972 (3.5122) grad_norm 1.7281 (1.9951) loss_scale 8192.0000 (6998.0380) mem 7379MB [2024-08-26 08:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1220/1251] eta 0:00:07 lr 0.000924 wd 0.0500 time 0.2443 (0.2460) data time 0.0008 (0.0014) model time 0.2435 (0.2444) loss 3.4687 (3.5133) grad_norm 2.1544 (1.9947) loss_scale 8192.0000 (7007.8165) mem 7379MB [2024-08-26 08:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1230/1251] eta 0:00:05 lr 0.000924 wd 0.0500 time 0.2322 (0.2460) data time 
0.0007 (0.0014) model time 0.2315 (0.2443) loss 4.1743 (3.5160) grad_norm 2.6315 (1.9941) loss_scale 8192.0000 (7017.4362) mem 7379MB [2024-08-26 08:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1240/1251] eta 0:00:02 lr 0.000924 wd 0.0500 time 0.2251 (0.2459) data time 0.0005 (0.0014) model time 0.2247 (0.2442) loss 3.9876 (3.5162) grad_norm 2.5255 (1.9952) loss_scale 8192.0000 (7026.9009) mem 7379MB [2024-08-26 08:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [69/300][1250/1251] eta 0:00:00 lr 0.000924 wd 0.0500 time 0.2314 (0.2457) data time 0.0005 (0.0014) model time 0.2309 (0.2441) loss 2.6584 (3.5155) grad_norm 2.0864 (1.9951) loss_scale 8192.0000 (7036.2142) mem 7379MB [2024-08-26 08:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 69 training takes 0:05:07 [2024-08-26 08:15:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 08:15:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 08:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.437 (0.437) Loss 0.5933 (0.5933) Acc@1 88.672 (88.672) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 08:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.113) Loss 0.8296 (0.8515) Acc@1 80.664 (81.064) Acc@5 95.703 (95.863) Mem 7379MB [2024-08-26 08:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 1.2344 (0.8711) Acc@1 71.289 (80.199) Acc@5 91.602 (95.801) Mem 7379MB [2024-08-26 08:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.091) Loss 1.4033 (0.9947) Acc@1 66.211 (77.608) Acc@5 88.965 (94.200) Mem 7379MB [2024-08-26 08:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.085) Loss 1.3721 (1.0520) Acc@1 67.383 (76.200) Acc@5 90.430 (93.450) Mem 7379MB [2024-08-26 08:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.682 Acc@5 93.354 [2024-08-26 08:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.7% [2024-08-26 08:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.786 (0.786) Loss 0.4688 (0.4688) Acc@1 91.406 (91.406) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 08:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.146) Loss 0.7588 (0.7342) Acc@1 84.961 (84.180) Acc@5 96.191 (96.768) Mem 7379MB [2024-08-26 08:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.114) Loss 1.0596 (0.7567) Acc@1 74.707 (83.050) Acc@5 93.164 (96.731) Mem 7379MB [2024-08-26 08:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.102) Loss 1.3252 (0.8647) Acc@1 66.406 (80.535) Acc@5 90.039 (95.331) Mem 7379MB [2024-08-26 08:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.2295 
(0.9229) Acc@1 70.215 (79.018) Acc@5 91.016 (94.717) Mem 7379MB [2024-08-26 08:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.640 Acc@5 94.636 [2024-08-26 08:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.6% [2024-08-26 08:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.64% [2024-08-26 08:15:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 08:15:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 08:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][0/1251] eta 0:13:23 lr 0.000924 wd 0.0500 time 0.6424 (0.6424) data time 0.4212 (0.4212) model time 0.0000 (0.0000) loss 3.2271 (3.2271) grad_norm 1.9610 (1.9610) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][10/1251] eta 0:05:42 lr 0.000924 wd 0.0500 time 0.2370 (0.2762) data time 0.0008 (0.0392) model time 0.0000 (0.0000) loss 4.1239 (3.4244) grad_norm 1.7145 (2.1099) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][20/1251] eta 0:05:18 lr 0.000924 wd 0.0500 time 0.2394 (0.2587) data time 0.0010 (0.0210) model time 0.0000 (0.0000) loss 3.4233 (3.5938) grad_norm 2.0495 (1.9976) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][30/1251] eta 0:05:09 lr 0.000924 wd 0.0500 time 0.2398 (0.2534) data time 0.0009 (0.0145) model time 0.0000 (0.0000) loss 3.8847 (3.5571) grad_norm 2.9340 (2.0116) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][40/1251] eta 0:05:02 
lr 0.000924 wd 0.0500 time 0.2418 (0.2502) data time 0.0010 (0.0112) model time 0.0000 (0.0000) loss 3.5626 (3.5219) grad_norm 1.4720 (1.9232) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][50/1251] eta 0:04:58 lr 0.000924 wd 0.0500 time 0.2400 (0.2483) data time 0.0011 (0.0092) model time 0.0000 (0.0000) loss 3.6467 (3.4689) grad_norm 1.4300 (1.8832) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][60/1251] eta 0:04:54 lr 0.000924 wd 0.0500 time 0.2430 (0.2473) data time 0.0007 (0.0079) model time 0.2423 (0.2414) loss 4.1232 (3.4591) grad_norm 1.7136 (1.9008) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][70/1251] eta 0:04:53 lr 0.000924 wd 0.0500 time 0.2410 (0.2485) data time 0.0012 (0.0069) model time 0.2398 (0.2479) loss 2.9775 (3.4605) grad_norm 1.8757 (1.9471) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][80/1251] eta 0:04:50 lr 0.000924 wd 0.0500 time 0.2368 (0.2477) data time 0.0010 (0.0063) model time 0.2358 (0.2455) loss 3.1216 (3.4963) grad_norm 1.6482 (1.9697) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][90/1251] eta 0:04:46 lr 0.000924 wd 0.0500 time 0.2450 (0.2470) data time 0.0007 (0.0057) model time 0.2443 (0.2443) loss 3.9049 (3.5315) grad_norm 1.5826 (1.9651) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][100/1251] eta 0:04:43 lr 0.000924 wd 0.0500 time 0.2410 (0.2465) data time 0.0008 (0.0052) model time 0.2402 (0.2435) loss 4.0573 (3.5340) grad_norm 2.4816 (1.9713) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:47 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][110/1251] eta 0:04:40 lr 0.000924 wd 0.0500 time 0.2421 (0.2459) data time 0.0010 (0.0048) model time 0.2411 (0.2427) loss 2.7147 (3.5135) grad_norm 2.4440 (1.9848) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][120/1251] eta 0:04:39 lr 0.000924 wd 0.0500 time 0.2485 (0.2472) data time 0.0007 (0.0045) model time 0.2477 (0.2454) loss 2.8922 (3.5155) grad_norm 1.9672 (1.9936) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][130/1251] eta 0:04:36 lr 0.000924 wd 0.0500 time 0.2396 (0.2468) data time 0.0007 (0.0043) model time 0.2390 (0.2447) loss 4.1872 (3.5212) grad_norm 1.5875 (1.9666) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][140/1251] eta 0:04:33 lr 0.000924 wd 0.0500 time 0.2442 (0.2464) data time 0.0009 (0.0041) model time 0.2433 (0.2442) loss 4.0634 (3.5282) grad_norm 2.0056 (1.9503) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][150/1251] eta 0:04:31 lr 0.000924 wd 0.0500 time 0.2460 (0.2461) data time 0.0007 (0.0039) model time 0.2453 (0.2439) loss 2.8646 (3.5305) grad_norm 1.5593 (1.9417) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][160/1251] eta 0:04:28 lr 0.000924 wd 0.0500 time 0.2403 (0.2459) data time 0.0013 (0.0037) model time 0.2391 (0.2437) loss 3.9370 (3.5220) grad_norm 1.6393 (1.9689) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][170/1251] eta 0:04:25 lr 0.000924 wd 0.0500 time 0.2452 (0.2456) data time 0.0007 (0.0035) model time 0.2444 (0.2433) loss 4.4919 
(3.5133) grad_norm 1.9581 (1.9780) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][180/1251] eta 0:04:22 lr 0.000924 wd 0.0500 time 0.2523 (0.2452) data time 0.0010 (0.0034) model time 0.2514 (0.2429) loss 3.8703 (3.5265) grad_norm 1.5783 (1.9936) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][190/1251] eta 0:04:19 lr 0.000924 wd 0.0500 time 0.2404 (0.2449) data time 0.0010 (0.0033) model time 0.2394 (0.2425) loss 4.2445 (3.5242) grad_norm 2.7706 (1.9959) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][200/1251] eta 0:04:17 lr 0.000924 wd 0.0500 time 0.2410 (0.2446) data time 0.0008 (0.0031) model time 0.2403 (0.2423) loss 2.8449 (3.5225) grad_norm 1.8273 (1.9888) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][210/1251] eta 0:04:14 lr 0.000924 wd 0.0500 time 0.2418 (0.2444) data time 0.0007 (0.0030) model time 0.2411 (0.2421) loss 2.9518 (3.5221) grad_norm 1.9273 (1.9825) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][220/1251] eta 0:04:11 lr 0.000924 wd 0.0500 time 0.2483 (0.2443) data time 0.0009 (0.0029) model time 0.2474 (0.2421) loss 3.6961 (3.5241) grad_norm 1.9644 (1.9744) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][230/1251] eta 0:04:09 lr 0.000924 wd 0.0500 time 0.2399 (0.2443) data time 0.0009 (0.0029) model time 0.2390 (0.2421) loss 4.5431 (3.5218) grad_norm 2.5172 (1.9734) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][240/1251] eta 0:04:06 lr 0.000924 wd 
0.0500 time 0.2422 (0.2442) data time 0.0010 (0.0028) model time 0.2412 (0.2420) loss 2.4102 (3.5118) grad_norm 2.3504 (1.9825) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][250/1251] eta 0:04:04 lr 0.000924 wd 0.0500 time 0.2348 (0.2440) data time 0.0011 (0.0027) model time 0.2337 (0.2418) loss 3.9598 (3.5064) grad_norm 2.2478 (1.9795) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][260/1251] eta 0:04:01 lr 0.000924 wd 0.0500 time 0.2349 (0.2438) data time 0.0012 (0.0027) model time 0.2337 (0.2417) loss 3.4284 (3.5154) grad_norm 2.4646 (1.9990) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][270/1251] eta 0:03:59 lr 0.000923 wd 0.0500 time 0.2446 (0.2438) data time 0.0009 (0.0026) model time 0.2437 (0.2417) loss 2.5144 (3.5219) grad_norm 2.3217 (2.0082) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][280/1251] eta 0:03:56 lr 0.000923 wd 0.0500 time 0.2444 (0.2437) data time 0.0009 (0.0025) model time 0.2435 (0.2416) loss 3.1986 (3.5178) grad_norm 1.7195 (1.9987) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][290/1251] eta 0:03:54 lr 0.000923 wd 0.0500 time 0.2316 (0.2436) data time 0.0011 (0.0025) model time 0.2306 (0.2415) loss 3.8960 (3.5095) grad_norm 1.9612 (1.9978) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][300/1251] eta 0:03:51 lr 0.000923 wd 0.0500 time 0.2395 (0.2435) data time 0.0008 (0.0024) model time 0.2387 (0.2414) loss 2.5921 (3.5148) grad_norm 1.9982 (1.9927) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][310/1251] eta 0:03:49 lr 0.000923 wd 0.0500 time 0.2423 (0.2434) data time 0.0007 (0.0024) model time 0.2416 (0.2414) loss 3.7688 (3.5092) grad_norm 1.7462 (1.9855) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][320/1251] eta 0:03:47 lr 0.000923 wd 0.0500 time 0.2448 (0.2439) data time 0.0008 (0.0023) model time 0.2439 (0.2421) loss 3.7754 (3.5082) grad_norm 2.3210 (1.9853) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][330/1251] eta 0:03:46 lr 0.000923 wd 0.0500 time 0.2427 (0.2460) data time 0.0011 (0.0023) model time 0.2416 (0.2445) loss 3.6511 (3.5106) grad_norm 1.8440 (1.9822) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][340/1251] eta 0:03:44 lr 0.000923 wd 0.0500 time 0.2409 (0.2459) data time 0.0011 (0.0023) model time 0.2399 (0.2444) loss 3.5132 (3.5058) grad_norm 2.1211 (1.9754) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][350/1251] eta 0:03:41 lr 0.000923 wd 0.0500 time 0.2367 (0.2457) data time 0.0009 (0.0022) model time 0.2357 (0.2442) loss 2.2787 (3.5005) grad_norm 2.1040 (1.9718) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][360/1251] eta 0:03:38 lr 0.000923 wd 0.0500 time 0.2445 (0.2456) data time 0.0010 (0.0022) model time 0.2435 (0.2441) loss 3.8402 (3.4957) grad_norm 1.7972 (1.9701) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][370/1251] eta 0:03:36 lr 0.000923 wd 0.0500 time 0.2362 (0.2462) data time 0.0007 (0.0022) model time 0.2354 (0.2448) loss 4.2626 
(3.4968) grad_norm 2.9734 (1.9823) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][380/1251] eta 0:03:34 lr 0.000923 wd 0.0500 time 0.2298 (0.2465) data time 0.0008 (0.0021) model time 0.2291 (0.2452) loss 2.6688 (3.4922) grad_norm 1.9510 (1.9777) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][390/1251] eta 0:03:32 lr 0.000923 wd 0.0500 time 0.2390 (0.2463) data time 0.0009 (0.0021) model time 0.2382 (0.2450) loss 2.2303 (3.4926) grad_norm 1.8482 (inf) loss_scale 4096.0000 (8097.7187) mem 7379MB [2024-08-26 08:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][400/1251] eta 0:03:29 lr 0.000923 wd 0.0500 time 0.2312 (0.2462) data time 0.0010 (0.0021) model time 0.2302 (0.2449) loss 2.7457 (3.4932) grad_norm 1.6300 (inf) loss_scale 4096.0000 (7997.9252) mem 7379MB [2024-08-26 08:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][410/1251] eta 0:03:26 lr 0.000923 wd 0.0500 time 0.2395 (0.2461) data time 0.0010 (0.0021) model time 0.2385 (0.2447) loss 2.8086 (3.4939) grad_norm 1.7146 (inf) loss_scale 4096.0000 (7902.9878) mem 7379MB [2024-08-26 08:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][420/1251] eta 0:03:24 lr 0.000923 wd 0.0500 time 0.2382 (0.2460) data time 0.0008 (0.0020) model time 0.2374 (0.2446) loss 3.9644 (3.5011) grad_norm 2.1815 (inf) loss_scale 4096.0000 (7812.5606) mem 7379MB [2024-08-26 08:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][430/1251] eta 0:03:21 lr 0.000923 wd 0.0500 time 0.2403 (0.2458) data time 0.0009 (0.0020) model time 0.2394 (0.2445) loss 3.0761 (3.5077) grad_norm 3.1083 (inf) loss_scale 4096.0000 (7726.3295) mem 7379MB [2024-08-26 08:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][440/1251] eta 0:03:19 lr 0.000923 wd 0.0500 time 
0.2396 (0.2458) data time 0.0008 (0.0020) model time 0.2387 (0.2444) loss 2.2919 (3.5043) grad_norm 1.7030 (inf) loss_scale 4096.0000 (7644.0091) mem 7379MB [2024-08-26 08:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][450/1251] eta 0:03:16 lr 0.000923 wd 0.0500 time 0.2387 (0.2457) data time 0.0010 (0.0020) model time 0.2377 (0.2443) loss 3.6947 (3.5071) grad_norm 1.6000 (inf) loss_scale 4096.0000 (7565.3392) mem 7379MB [2024-08-26 08:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][460/1251] eta 0:03:14 lr 0.000923 wd 0.0500 time 0.2459 (0.2461) data time 0.0007 (0.0019) model time 0.2451 (0.2448) loss 3.6420 (3.5067) grad_norm 2.1176 (inf) loss_scale 4096.0000 (7490.0824) mem 7379MB [2024-08-26 08:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][470/1251] eta 0:03:12 lr 0.000923 wd 0.0500 time 0.2384 (0.2465) data time 0.0012 (0.0019) model time 0.2372 (0.2452) loss 3.0503 (3.5031) grad_norm 2.3277 (inf) loss_scale 4096.0000 (7418.0212) mem 7379MB [2024-08-26 08:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][480/1251] eta 0:03:09 lr 0.000923 wd 0.0500 time 0.2433 (0.2464) data time 0.0007 (0.0019) model time 0.2426 (0.2451) loss 3.6160 (3.5074) grad_norm 1.4574 (inf) loss_scale 4096.0000 (7348.9563) mem 7379MB [2024-08-26 08:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][490/1251] eta 0:03:07 lr 0.000923 wd 0.0500 time 0.2403 (0.2462) data time 0.0008 (0.0019) model time 0.2395 (0.2450) loss 3.8522 (3.5065) grad_norm 2.0310 (inf) loss_scale 4096.0000 (7282.7047) mem 7379MB [2024-08-26 08:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][500/1251] eta 0:03:04 lr 0.000923 wd 0.0500 time 0.2470 (0.2462) data time 0.0011 (0.0019) model time 0.2459 (0.2449) loss 3.2376 (3.5053) grad_norm 1.8030 (inf) loss_scale 4096.0000 (7219.0978) mem 7379MB [2024-08-26 08:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [70/300][510/1251] eta 0:03:02 lr 0.000923 wd 0.0500 time 0.2326 (0.2461) data time 0.0010 (0.0019) model time 0.2316 (0.2448) loss 2.8512 (3.5088) grad_norm 1.5983 (inf) loss_scale 4096.0000 (7157.9804) mem 7379MB [2024-08-26 08:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][520/1251] eta 0:02:59 lr 0.000923 wd 0.0500 time 0.2499 (0.2460) data time 0.0007 (0.0018) model time 0.2492 (0.2447) loss 4.1184 (3.5117) grad_norm 1.7247 (inf) loss_scale 4096.0000 (7099.2092) mem 7379MB [2024-08-26 08:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][530/1251] eta 0:02:57 lr 0.000923 wd 0.0500 time 0.2435 (0.2459) data time 0.0012 (0.0018) model time 0.2423 (0.2446) loss 4.1429 (3.5129) grad_norm 1.5623 (inf) loss_scale 4096.0000 (7042.6516) mem 7379MB [2024-08-26 08:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][540/1251] eta 0:02:54 lr 0.000923 wd 0.0500 time 0.2445 (0.2458) data time 0.0010 (0.0018) model time 0.2436 (0.2445) loss 4.4951 (3.5129) grad_norm 1.6024 (inf) loss_scale 4096.0000 (6988.1848) mem 7379MB [2024-08-26 08:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][550/1251] eta 0:02:52 lr 0.000923 wd 0.0500 time 0.2514 (0.2458) data time 0.0007 (0.0018) model time 0.2507 (0.2445) loss 4.4798 (3.5173) grad_norm 2.0683 (inf) loss_scale 4096.0000 (6935.6951) mem 7379MB [2024-08-26 08:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][560/1251] eta 0:02:49 lr 0.000923 wd 0.0500 time 0.2400 (0.2457) data time 0.0008 (0.0018) model time 0.2392 (0.2444) loss 4.3647 (3.5132) grad_norm 1.3821 (inf) loss_scale 4096.0000 (6885.0766) mem 7379MB [2024-08-26 08:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][570/1251] eta 0:02:47 lr 0.000923 wd 0.0500 time 0.2411 (0.2456) data time 0.0011 (0.0018) model time 0.2399 (0.2443) loss 3.9231 (3.5136) grad_norm 1.9293 (inf) loss_scale 4096.0000 (6836.2312) 
mem 7379MB [2024-08-26 08:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][580/1251] eta 0:02:44 lr 0.000923 wd 0.0500 time 0.2427 (0.2455) data time 0.0008 (0.0018) model time 0.2419 (0.2442) loss 2.1854 (3.5082) grad_norm 1.8020 (inf) loss_scale 4096.0000 (6789.0671) mem 7379MB [2024-08-26 08:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][590/1251] eta 0:02:42 lr 0.000923 wd 0.0500 time 0.2389 (0.2458) data time 0.0011 (0.0017) model time 0.2378 (0.2446) loss 3.2052 (3.5107) grad_norm 1.4835 (inf) loss_scale 4096.0000 (6743.4992) mem 7379MB [2024-08-26 08:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][600/1251] eta 0:02:40 lr 0.000923 wd 0.0500 time 0.2458 (0.2458) data time 0.0007 (0.0017) model time 0.2451 (0.2445) loss 4.0119 (3.5080) grad_norm 2.3375 (inf) loss_scale 4096.0000 (6699.4476) mem 7379MB [2024-08-26 08:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][610/1251] eta 0:02:37 lr 0.000923 wd 0.0500 time 0.2450 (0.2458) data time 0.0011 (0.0017) model time 0.2439 (0.2445) loss 3.4015 (3.5103) grad_norm 1.7691 (inf) loss_scale 4096.0000 (6656.8380) mem 7379MB [2024-08-26 08:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][620/1251] eta 0:02:35 lr 0.000923 wd 0.0500 time 0.2445 (0.2457) data time 0.0011 (0.0017) model time 0.2434 (0.2444) loss 3.5413 (3.5078) grad_norm 1.6961 (inf) loss_scale 4096.0000 (6615.6006) mem 7379MB [2024-08-26 08:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][630/1251] eta 0:02:32 lr 0.000923 wd 0.0500 time 0.2389 (0.2456) data time 0.0011 (0.0017) model time 0.2378 (0.2443) loss 3.2524 (3.5111) grad_norm 1.3484 (inf) loss_scale 4096.0000 (6575.6704) mem 7379MB [2024-08-26 08:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][640/1251] eta 0:02:30 lr 0.000923 wd 0.0500 time 0.2376 (0.2456) data time 0.0011 (0.0017) model time 0.2365 (0.2443) loss 
3.2936 (3.5123) grad_norm 2.6349 (inf) loss_scale 4096.0000 (6536.9860) mem 7379MB [2024-08-26 08:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][650/1251] eta 0:02:27 lr 0.000923 wd 0.0500 time 0.2352 (0.2455) data time 0.0009 (0.0017) model time 0.2343 (0.2442) loss 4.6703 (3.5155) grad_norm 2.5775 (inf) loss_scale 4096.0000 (6499.4900) mem 7379MB [2024-08-26 08:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][660/1251] eta 0:02:25 lr 0.000923 wd 0.0500 time 0.2451 (0.2455) data time 0.0009 (0.0017) model time 0.2442 (0.2442) loss 4.1130 (3.5177) grad_norm 1.7894 (inf) loss_scale 4096.0000 (6463.1286) mem 7379MB [2024-08-26 08:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][670/1251] eta 0:02:22 lr 0.000923 wd 0.0500 time 0.2437 (0.2454) data time 0.0007 (0.0017) model time 0.2429 (0.2441) loss 4.0921 (3.5178) grad_norm 1.7504 (inf) loss_scale 4096.0000 (6427.8510) mem 7379MB [2024-08-26 08:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][680/1251] eta 0:02:20 lr 0.000923 wd 0.0500 time 0.2308 (0.2453) data time 0.0012 (0.0017) model time 0.2296 (0.2440) loss 3.5954 (3.5175) grad_norm 1.7677 (inf) loss_scale 4096.0000 (6393.6094) mem 7379MB [2024-08-26 08:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][690/1251] eta 0:02:17 lr 0.000922 wd 0.0500 time 0.2380 (0.2453) data time 0.0011 (0.0016) model time 0.2370 (0.2440) loss 2.5597 (3.5143) grad_norm 1.6997 (inf) loss_scale 4096.0000 (6360.3589) mem 7379MB [2024-08-26 08:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][700/1251] eta 0:02:15 lr 0.000922 wd 0.0500 time 0.2436 (0.2453) data time 0.0009 (0.0016) model time 0.2427 (0.2440) loss 3.4294 (3.5129) grad_norm 2.3031 (inf) loss_scale 4096.0000 (6328.0571) mem 7379MB [2024-08-26 08:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][710/1251] eta 0:02:12 lr 0.000922 wd 0.0500 time 
0.2434 (0.2452) data time 0.0007 (0.0016) model time 0.2427 (0.2439) loss 4.0216 (3.5107) grad_norm 1.5809 (inf) loss_scale 4096.0000 (6296.6639) mem 7379MB [2024-08-26 08:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][720/1251] eta 0:02:10 lr 0.000922 wd 0.0500 time 0.2526 (0.2452) data time 0.0011 (0.0016) model time 0.2515 (0.2439) loss 4.1828 (3.5130) grad_norm 1.4741 (inf) loss_scale 4096.0000 (6266.1415) mem 7379MB [2024-08-26 08:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][730/1251] eta 0:02:07 lr 0.000922 wd 0.0500 time 0.2359 (0.2451) data time 0.0007 (0.0016) model time 0.2351 (0.2438) loss 4.1179 (3.5114) grad_norm 2.2058 (inf) loss_scale 4096.0000 (6236.4542) mem 7379MB [2024-08-26 08:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][740/1251] eta 0:02:05 lr 0.000922 wd 0.0500 time 0.2381 (0.2450) data time 0.0010 (0.0016) model time 0.2372 (0.2437) loss 3.4838 (3.5109) grad_norm 1.5978 (inf) loss_scale 4096.0000 (6207.5682) mem 7379MB [2024-08-26 08:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][750/1251] eta 0:02:02 lr 0.000922 wd 0.0500 time 0.2441 (0.2450) data time 0.0008 (0.0016) model time 0.2433 (0.2437) loss 4.6187 (3.5123) grad_norm 2.4029 (inf) loss_scale 4096.0000 (6179.4514) mem 7379MB [2024-08-26 08:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][760/1251] eta 0:02:00 lr 0.000922 wd 0.0500 time 0.2349 (0.2449) data time 0.0008 (0.0016) model time 0.2341 (0.2437) loss 3.6356 (3.5125) grad_norm 2.0063 (inf) loss_scale 4096.0000 (6152.0736) mem 7379MB [2024-08-26 08:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][770/1251] eta 0:01:57 lr 0.000922 wd 0.0500 time 0.2378 (0.2449) data time 0.0007 (0.0016) model time 0.2371 (0.2436) loss 3.6182 (3.5133) grad_norm 1.9237 (inf) loss_scale 4096.0000 (6125.4060) mem 7379MB [2024-08-26 08:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [70/300][780/1251] eta 0:01:55 lr 0.000922 wd 0.0500 time 0.2414 (0.2448) data time 0.0009 (0.0016) model time 0.2404 (0.2436) loss 4.0294 (3.5163) grad_norm 2.2988 (inf) loss_scale 4096.0000 (6099.4213) mem 7379MB [2024-08-26 08:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][790/1251] eta 0:01:52 lr 0.000922 wd 0.0500 time 0.2374 (0.2448) data time 0.0011 (0.0016) model time 0.2363 (0.2435) loss 3.2055 (3.5121) grad_norm 1.9003 (inf) loss_scale 4096.0000 (6074.0936) mem 7379MB [2024-08-26 08:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][800/1251] eta 0:01:50 lr 0.000922 wd 0.0500 time 0.2424 (0.2448) data time 0.0009 (0.0016) model time 0.2415 (0.2435) loss 4.5023 (3.5113) grad_norm 2.2971 (inf) loss_scale 4096.0000 (6049.3983) mem 7379MB [2024-08-26 08:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][810/1251] eta 0:01:47 lr 0.000922 wd 0.0500 time 0.2372 (0.2447) data time 0.0011 (0.0016) model time 0.2361 (0.2434) loss 3.8005 (3.5094) grad_norm 2.0779 (inf) loss_scale 4096.0000 (6025.3120) mem 7379MB [2024-08-26 08:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][820/1251] eta 0:01:45 lr 0.000922 wd 0.0500 time 0.2439 (0.2447) data time 0.0009 (0.0015) model time 0.2430 (0.2434) loss 3.6529 (3.5139) grad_norm 2.5553 (inf) loss_scale 4096.0000 (6001.8124) mem 7379MB [2024-08-26 08:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][830/1251] eta 0:01:43 lr 0.000922 wd 0.0500 time 0.2373 (0.2447) data time 0.0011 (0.0015) model time 0.2362 (0.2434) loss 2.7657 (3.5130) grad_norm 2.9664 (inf) loss_scale 4096.0000 (5978.8785) mem 7379MB [2024-08-26 08:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][840/1251] eta 0:01:40 lr 0.000922 wd 0.0500 time 0.2411 (0.2447) data time 0.0009 (0.0015) model time 0.2402 (0.2434) loss 3.1911 (3.5149) grad_norm 2.0085 (inf) loss_scale 4096.0000 (5956.4899) 
mem 7379MB [2024-08-26 08:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][850/1251] eta 0:01:38 lr 0.000922 wd 0.0500 time 0.2423 (0.2454) data time 0.0011 (0.0015) model time 0.2412 (0.2441) loss 3.9243 (3.5115) grad_norm 1.3572 (inf) loss_scale 4096.0000 (5934.6275) mem 7379MB [2024-08-26 08:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][860/1251] eta 0:01:35 lr 0.000922 wd 0.0500 time 0.2620 (0.2453) data time 0.0007 (0.0015) model time 0.2613 (0.2441) loss 3.7379 (3.5134) grad_norm 1.5535 (inf) loss_scale 4096.0000 (5913.2729) mem 7379MB [2024-08-26 08:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][870/1251] eta 0:01:33 lr 0.000922 wd 0.0500 time 0.2488 (0.2455) data time 0.0007 (0.0015) model time 0.2481 (0.2443) loss 2.6316 (3.5108) grad_norm 1.6877 (inf) loss_scale 4096.0000 (5892.4087) mem 7379MB [2024-08-26 08:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][880/1251] eta 0:01:31 lr 0.000922 wd 0.0500 time 0.2397 (0.2455) data time 0.0007 (0.0015) model time 0.2389 (0.2443) loss 2.8189 (3.5092) grad_norm 1.4803 (inf) loss_scale 4096.0000 (5872.0182) mem 7379MB [2024-08-26 08:18:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][890/1251] eta 0:01:28 lr 0.000922 wd 0.0500 time 0.2390 (0.2457) data time 0.0011 (0.0015) model time 0.2379 (0.2445) loss 3.5138 (3.5091) grad_norm 2.9736 (inf) loss_scale 4096.0000 (5852.0853) mem 7379MB [2024-08-26 08:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][900/1251] eta 0:01:26 lr 0.000922 wd 0.0500 time 0.2468 (0.2459) data time 0.0010 (0.0015) model time 0.2458 (0.2447) loss 4.0902 (3.5091) grad_norm 2.0188 (inf) loss_scale 4096.0000 (5832.5949) mem 7379MB [2024-08-26 08:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][910/1251] eta 0:01:23 lr 0.000922 wd 0.0500 time 0.2497 (0.2461) data time 0.0010 (0.0015) model time 0.2487 (0.2449) loss 
3.6077 (3.5107) grad_norm 2.5544 (inf) loss_scale 4096.0000 (5813.5324) mem 7379MB [2024-08-26 08:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][920/1251] eta 0:01:21 lr 0.000922 wd 0.0500 time 0.2408 (0.2460) data time 0.0007 (0.0015) model time 0.2401 (0.2448) loss 4.0407 (3.5062) grad_norm 1.8331 (inf) loss_scale 4096.0000 (5794.8838) mem 7379MB [2024-08-26 08:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][930/1251] eta 0:01:18 lr 0.000922 wd 0.0500 time 0.2449 (0.2460) data time 0.0009 (0.0015) model time 0.2440 (0.2448) loss 4.1944 (3.5094) grad_norm 1.6027 (inf) loss_scale 4096.0000 (5776.6359) mem 7379MB [2024-08-26 08:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][940/1251] eta 0:01:16 lr 0.000922 wd 0.0500 time 0.2473 (0.2459) data time 0.0011 (0.0015) model time 0.2462 (0.2447) loss 4.0149 (3.5120) grad_norm 1.8277 (inf) loss_scale 4096.0000 (5758.7758) mem 7379MB [2024-08-26 08:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][950/1251] eta 0:01:14 lr 0.000922 wd 0.0500 time 0.2413 (0.2459) data time 0.0008 (0.0015) model time 0.2404 (0.2447) loss 2.3140 (3.5081) grad_norm 2.0369 (inf) loss_scale 4096.0000 (5741.2913) mem 7379MB [2024-08-26 08:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][960/1251] eta 0:01:11 lr 0.000922 wd 0.0500 time 0.2455 (0.2459) data time 0.0008 (0.0015) model time 0.2447 (0.2447) loss 4.1055 (3.5082) grad_norm 1.2772 (inf) loss_scale 4096.0000 (5724.1707) mem 7379MB [2024-08-26 08:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][970/1251] eta 0:01:09 lr 0.000922 wd 0.0500 time 0.2408 (0.2458) data time 0.0007 (0.0015) model time 0.2401 (0.2446) loss 3.5645 (3.5054) grad_norm 1.7527 (inf) loss_scale 4096.0000 (5707.4027) mem 7379MB [2024-08-26 08:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][980/1251] eta 0:01:06 lr 0.000922 wd 0.0500 time 
0.2445 (0.2459) data time 0.0008 (0.0015) model time 0.2437 (0.2448) loss 3.4498 (3.5072) grad_norm 1.7993 (inf) loss_scale 4096.0000 (5690.9766) mem 7379MB [2024-08-26 08:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][990/1251] eta 0:01:04 lr 0.000922 wd 0.0500 time 0.2525 (0.2460) data time 0.0013 (0.0015) model time 0.2512 (0.2448) loss 3.2514 (3.5067) grad_norm 1.4012 (inf) loss_scale 4096.0000 (5674.8819) mem 7379MB [2024-08-26 08:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1000/1251] eta 0:01:01 lr 0.000922 wd 0.0500 time 0.2289 (0.2459) data time 0.0008 (0.0015) model time 0.2280 (0.2448) loss 2.7299 (3.5054) grad_norm 1.6734 (inf) loss_scale 4096.0000 (5659.1089) mem 7379MB [2024-08-26 08:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1010/1251] eta 0:00:59 lr 0.000922 wd 0.0500 time 0.2510 (0.2459) data time 0.0007 (0.0015) model time 0.2503 (0.2447) loss 2.1039 (3.5040) grad_norm 1.7871 (inf) loss_scale 4096.0000 (5643.6479) mem 7379MB [2024-08-26 08:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1020/1251] eta 0:00:56 lr 0.000922 wd 0.0500 time 0.2469 (0.2458) data time 0.0009 (0.0015) model time 0.2460 (0.2447) loss 3.6345 (3.5037) grad_norm 1.7868 (inf) loss_scale 4096.0000 (5628.4897) mem 7379MB [2024-08-26 08:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1030/1251] eta 0:00:54 lr 0.000922 wd 0.0500 time 0.2351 (0.2458) data time 0.0009 (0.0014) model time 0.2342 (0.2446) loss 4.0464 (3.5043) grad_norm 2.8382 (inf) loss_scale 4096.0000 (5613.6256) mem 7379MB [2024-08-26 08:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1040/1251] eta 0:00:51 lr 0.000922 wd 0.0500 time 0.2372 (0.2457) data time 0.0009 (0.0014) model time 0.2364 (0.2445) loss 4.0050 (3.5038) grad_norm 2.1468 (inf) loss_scale 4096.0000 (5599.0471) mem 7379MB [2024-08-26 08:19:38 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [70/300][1050/1251] eta 0:00:49 lr 0.000922 wd 0.0500 time 0.2410 (0.2458) data time 0.0007 (0.0014) model time 0.2404 (0.2447) loss 3.8071 (3.5040) grad_norm 1.8601 (inf) loss_scale 4096.0000 (5584.7460) mem 7379MB [2024-08-26 08:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1060/1251] eta 0:00:46 lr 0.000922 wd 0.0500 time 0.2389 (0.2458) data time 0.0008 (0.0014) model time 0.2381 (0.2446) loss 3.9355 (3.5069) grad_norm 2.0969 (inf) loss_scale 4096.0000 (5570.7144) mem 7379MB [2024-08-26 08:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1070/1251] eta 0:00:44 lr 0.000922 wd 0.0500 time 0.2467 (0.2458) data time 0.0008 (0.0014) model time 0.2459 (0.2446) loss 2.4285 (3.5067) grad_norm 2.4340 (inf) loss_scale 4096.0000 (5556.9449) mem 7379MB [2024-08-26 08:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1080/1251] eta 0:00:42 lr 0.000922 wd 0.0500 time 0.2387 (0.2457) data time 0.0007 (0.0014) model time 0.2380 (0.2446) loss 4.0959 (3.5070) grad_norm 1.9391 (inf) loss_scale 4096.0000 (5543.4302) mem 7379MB [2024-08-26 08:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1090/1251] eta 0:00:39 lr 0.000922 wd 0.0500 time 0.2335 (0.2457) data time 0.0007 (0.0014) model time 0.2327 (0.2445) loss 2.8827 (3.5082) grad_norm 1.6295 (inf) loss_scale 4096.0000 (5530.1632) mem 7379MB [2024-08-26 08:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1100/1251] eta 0:00:37 lr 0.000922 wd 0.0500 time 0.2471 (0.2456) data time 0.0010 (0.0014) model time 0.2462 (0.2445) loss 2.6121 (3.5069) grad_norm 1.9899 (inf) loss_scale 4096.0000 (5517.1371) mem 7379MB [2024-08-26 08:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1110/1251] eta 0:00:34 lr 0.000921 wd 0.0500 time 0.2397 (0.2456) data time 0.0008 (0.0014) model time 0.2390 (0.2444) loss 4.1455 (3.5074) grad_norm 2.8354 (inf) 
loss_scale 4096.0000 (5504.3456) mem 7379MB [2024-08-26 08:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1120/1251] eta 0:00:32 lr 0.000921 wd 0.0500 time 0.2463 (0.2456) data time 0.0008 (0.0014) model time 0.2455 (0.2444) loss 3.4026 (3.5059) grad_norm 2.4727 (inf) loss_scale 4096.0000 (5491.7823) mem 7379MB [2024-08-26 08:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1130/1251] eta 0:00:29 lr 0.000921 wd 0.0500 time 0.2414 (0.2455) data time 0.0008 (0.0014) model time 0.2406 (0.2444) loss 3.7316 (3.5056) grad_norm 2.5809 (inf) loss_scale 4096.0000 (5479.4412) mem 7379MB [2024-08-26 08:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1140/1251] eta 0:00:27 lr 0.000921 wd 0.0500 time 0.2426 (0.2455) data time 0.0011 (0.0014) model time 0.2415 (0.2443) loss 3.8890 (3.5063) grad_norm 2.6152 (inf) loss_scale 4096.0000 (5467.3164) mem 7379MB [2024-08-26 08:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1150/1251] eta 0:00:24 lr 0.000921 wd 0.0500 time 0.2424 (0.2454) data time 0.0011 (0.0014) model time 0.2413 (0.2443) loss 3.4485 (3.5061) grad_norm 1.9884 (inf) loss_scale 4096.0000 (5455.4023) mem 7379MB [2024-08-26 08:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1160/1251] eta 0:00:22 lr 0.000921 wd 0.0500 time 0.2327 (0.2454) data time 0.0009 (0.0014) model time 0.2317 (0.2442) loss 2.7638 (3.5051) grad_norm 2.0486 (inf) loss_scale 4096.0000 (5443.6934) mem 7379MB [2024-08-26 08:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1170/1251] eta 0:00:19 lr 0.000921 wd 0.0500 time 0.2415 (0.2454) data time 0.0008 (0.0014) model time 0.2407 (0.2442) loss 4.4250 (3.5083) grad_norm 2.9961 (inf) loss_scale 4096.0000 (5432.1845) mem 7379MB [2024-08-26 08:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1180/1251] eta 0:00:17 lr 0.000921 wd 0.0500 time 0.2407 (0.2453) data time 0.0009 
(0.0014) model time 0.2397 (0.2442) loss 3.6939 (3.5098) grad_norm 1.5406 (inf) loss_scale 4096.0000 (5420.8704) mem 7379MB [2024-08-26 08:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1190/1251] eta 0:00:14 lr 0.000921 wd 0.0500 time 0.2434 (0.2453) data time 0.0007 (0.0014) model time 0.2427 (0.2441) loss 3.2637 (3.5070) grad_norm 1.7554 (inf) loss_scale 4096.0000 (5409.7464) mem 7379MB [2024-08-26 08:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1200/1251] eta 0:00:12 lr 0.000921 wd 0.0500 time 0.2385 (0.2453) data time 0.0012 (0.0014) model time 0.2373 (0.2441) loss 2.5254 (3.5049) grad_norm 2.0682 (inf) loss_scale 4096.0000 (5398.8077) mem 7379MB [2024-08-26 08:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1210/1251] eta 0:00:10 lr 0.000921 wd 0.0500 time 0.2471 (0.2452) data time 0.0011 (0.0014) model time 0.2460 (0.2441) loss 3.6631 (3.5073) grad_norm 1.5801 (inf) loss_scale 4096.0000 (5388.0495) mem 7379MB [2024-08-26 08:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1220/1251] eta 0:00:07 lr 0.000921 wd 0.0500 time 0.2393 (0.2452) data time 0.0007 (0.0014) model time 0.2385 (0.2441) loss 3.6681 (3.5076) grad_norm 1.4351 (inf) loss_scale 4096.0000 (5377.4676) mem 7379MB [2024-08-26 08:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1230/1251] eta 0:00:05 lr 0.000921 wd 0.0500 time 0.2399 (0.2452) data time 0.0007 (0.0014) model time 0.2392 (0.2440) loss 4.0992 (3.5089) grad_norm 1.2666 (inf) loss_scale 4096.0000 (5367.0577) mem 7379MB [2024-08-26 08:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [70/300][1240/1251] eta 0:00:02 lr 0.000921 wd 0.0500 time 0.2247 (0.2451) data time 0.0007 (0.0014) model time 0.2240 (0.2439) loss 3.6251 (3.5093) grad_norm 1.9503 (inf) loss_scale 4096.0000 (5356.8155) mem 7379MB [2024-08-26 08:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[70/300][1250/1251] eta 0:00:00 lr 0.000921 wd 0.0500 time 0.2248 (0.2450) data time 0.0006 (0.0014) model time 0.2241 (0.2438) loss 2.9039 (3.5079) grad_norm 2.0189 (inf) loss_scale 4096.0000 (5346.7370) mem 7379MB [2024-08-26 08:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 70 training takes 0:05:06 [2024-08-26 08:20:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 08:20:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 08:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.439 (0.439) Loss 0.5234 (0.5234) Acc@1 89.453 (89.453) Acc@5 97.559 (97.559) Mem 7379MB [2024-08-26 08:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.109) Loss 0.9155 (0.8481) Acc@1 79.785 (81.232) Acc@5 95.215 (95.961) Mem 7379MB [2024-08-26 08:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.094) Loss 1.2412 (0.8678) Acc@1 70.020 (80.413) Acc@5 92.090 (95.940) Mem 7379MB [2024-08-26 08:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.088) Loss 1.4570 (0.9920) Acc@1 64.453 (77.555) Acc@5 87.598 (94.292) Mem 7379MB [2024-08-26 08:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.3828 (1.0517) Acc@1 67.969 (76.155) Acc@5 89.746 (93.514) Mem 7379MB [2024-08-26 08:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.874 Acc@5 93.438 [2024-08-26 08:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.9% [2024-08-26 08:20:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.782 (0.782) Loss 0.4697 (0.4697) Acc@1 91.504 (91.504) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 08:20:33 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.146) Loss 0.7559 (0.7330) Acc@1 84.766 (84.277) Acc@5 95.996 (96.777) Mem 7379MB [2024-08-26 08:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.114) Loss 1.0566 (0.7556) Acc@1 75.098 (83.138) Acc@5 93.262 (96.735) Mem 7379MB [2024-08-26 08:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.102) Loss 1.3174 (0.8634) Acc@1 66.504 (80.639) Acc@5 90.332 (95.360) Mem 7379MB [2024-08-26 08:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.2295 (0.9211) Acc@1 70.508 (79.128) Acc@5 91.309 (94.762) Mem 7379MB [2024-08-26 08:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.722 Acc@5 94.692 [2024-08-26 08:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.7% [2024-08-26 08:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.72% [2024-08-26 08:20:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 08:20:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 08:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][0/1251] eta 0:14:37 lr 0.000921 wd 0.0500 time 0.7014 (0.7014) data time 0.4737 (0.4737) model time 0.0000 (0.0000) loss 3.4680 (3.4680) grad_norm 2.3241 (2.3241) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][10/1251] eta 0:05:49 lr 0.000921 wd 0.0500 time 0.2370 (0.2817) data time 0.0007 (0.0439) model time 0.0000 (0.0000) loss 2.2809 (3.3121) grad_norm 2.4281 (2.1906) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][20/1251] eta 0:05:24 lr 0.000921 wd 0.0500 time 0.2424 (0.2636) data time 0.0010 (0.0235) model time 0.0000 (0.0000) loss 3.1202 (3.2348) grad_norm 1.6790 (2.1159) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][30/1251] eta 0:05:12 lr 0.000921 wd 0.0500 time 0.2332 (0.2560) data time 0.0011 (0.0163) model time 0.0000 (0.0000) loss 3.1775 (3.3507) grad_norm 2.0316 (2.0146) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][40/1251] eta 0:05:07 lr 0.000921 wd 0.0500 time 0.2463 (0.2537) data time 0.0007 (0.0125) model time 0.0000 (0.0000) loss 2.6833 (3.2829) grad_norm 2.2705 (2.0682) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][50/1251] eta 0:05:01 lr 0.000921 wd 0.0500 time 0.2453 (0.2515) data time 0.0010 (0.0103) model time 0.0000 (0.0000) loss 3.7733 (3.3141) grad_norm 1.3139 (2.0124) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][60/1251] eta 0:04:57 lr 0.000921 wd 0.0500 time 0.2433 (0.2500) data time 0.0009 (0.0088) model time 0.2424 (0.2416) loss 
4.2173 (3.3581) grad_norm 2.4723 (1.9986) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][70/1251] eta 0:04:53 lr 0.000921 wd 0.0500 time 0.2379 (0.2489) data time 0.0010 (0.0077) model time 0.2368 (0.2414) loss 3.5526 (3.3535) grad_norm 1.5472 (2.0086) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][80/1251] eta 0:04:50 lr 0.000921 wd 0.0500 time 0.2471 (0.2481) data time 0.0011 (0.0069) model time 0.2460 (0.2415) loss 3.9661 (3.3792) grad_norm 2.6239 (2.0382) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][90/1251] eta 0:04:47 lr 0.000921 wd 0.0500 time 0.2356 (0.2476) data time 0.0008 (0.0062) model time 0.2347 (0.2417) loss 3.0938 (3.4023) grad_norm 1.6745 (2.0310) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][100/1251] eta 0:04:44 lr 0.000921 wd 0.0500 time 0.2323 (0.2469) data time 0.0009 (0.0057) model time 0.2313 (0.2413) loss 3.5190 (3.3928) grad_norm 2.9391 (2.0256) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][110/1251] eta 0:04:41 lr 0.000921 wd 0.0500 time 0.2473 (0.2465) data time 0.0010 (0.0053) model time 0.2463 (0.2412) loss 3.8829 (3.4374) grad_norm 1.4657 (2.0027) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][120/1251] eta 0:04:38 lr 0.000921 wd 0.0500 time 0.2395 (0.2461) data time 0.0009 (0.0049) model time 0.2385 (0.2411) loss 3.3354 (3.4610) grad_norm 2.0435 (1.9938) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][130/1251] eta 0:04:35 lr 
0.000921 wd 0.0500 time 0.2375 (0.2458) data time 0.0010 (0.0046) model time 0.2365 (0.2411) loss 3.7764 (3.4576) grad_norm 1.3623 (1.9980) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][140/1251] eta 0:04:34 lr 0.000921 wd 0.0500 time 0.2434 (0.2471) data time 0.0007 (0.0044) model time 0.2426 (0.2435) loss 4.0654 (3.4452) grad_norm 1.7198 (1.9869) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][150/1251] eta 0:04:33 lr 0.000921 wd 0.0500 time 0.2434 (0.2481) data time 0.0010 (0.0042) model time 0.2424 (0.2454) loss 3.2134 (3.4459) grad_norm 1.9898 (1.9814) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][160/1251] eta 0:04:30 lr 0.000921 wd 0.0500 time 0.2395 (0.2477) data time 0.0010 (0.0040) model time 0.2385 (0.2449) loss 3.0701 (3.4454) grad_norm 2.4467 (1.9851) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][170/1251] eta 0:04:27 lr 0.000921 wd 0.0500 time 0.2376 (0.2473) data time 0.0009 (0.0038) model time 0.2367 (0.2445) loss 2.2156 (3.4237) grad_norm 1.7784 (1.9706) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][180/1251] eta 0:04:24 lr 0.000921 wd 0.0500 time 0.2433 (0.2471) data time 0.0010 (0.0036) model time 0.2423 (0.2444) loss 4.0308 (3.4374) grad_norm 2.1439 (1.9790) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][190/1251] eta 0:04:21 lr 0.000921 wd 0.0500 time 0.2335 (0.2468) data time 0.0011 (0.0035) model time 0.2324 (0.2441) loss 4.0342 (3.4437) grad_norm 1.8928 (1.9770) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:26 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][200/1251] eta 0:04:19 lr 0.000921 wd 0.0500 time 0.2416 (0.2465) data time 0.0010 (0.0034) model time 0.2406 (0.2437) loss 2.2468 (3.4394) grad_norm 1.7135 (1.9725) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][210/1251] eta 0:04:16 lr 0.000921 wd 0.0500 time 0.2392 (0.2462) data time 0.0010 (0.0033) model time 0.2383 (0.2436) loss 3.7902 (3.4429) grad_norm 1.8336 (1.9688) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][220/1251] eta 0:04:13 lr 0.000921 wd 0.0500 time 0.2467 (0.2459) data time 0.0009 (0.0032) model time 0.2458 (0.2433) loss 3.6522 (3.4489) grad_norm 2.9170 (1.9883) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][230/1251] eta 0:04:10 lr 0.000921 wd 0.0500 time 0.2452 (0.2458) data time 0.0009 (0.0031) model time 0.2442 (0.2431) loss 3.6583 (3.4433) grad_norm 1.6325 (1.9819) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][240/1251] eta 0:04:08 lr 0.000921 wd 0.0500 time 0.2352 (0.2455) data time 0.0011 (0.0030) model time 0.2341 (0.2429) loss 3.4974 (3.4398) grad_norm 1.7363 (1.9770) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][250/1251] eta 0:04:06 lr 0.000921 wd 0.0500 time 0.2362 (0.2461) data time 0.0009 (0.0029) model time 0.2354 (0.2437) loss 4.1703 (3.4541) grad_norm 1.5959 (1.9790) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][260/1251] eta 0:04:04 lr 0.000921 wd 0.0500 time 0.2371 (0.2465) data time 0.0011 (0.0028) model time 0.2360 (0.2443) loss 2.3333 
(3.4489) grad_norm 2.1840 (1.9723) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][270/1251] eta 0:04:01 lr 0.000920 wd 0.0500 time 0.2437 (0.2463) data time 0.0007 (0.0028) model time 0.2431 (0.2441) loss 2.5549 (3.4421) grad_norm 2.3407 (1.9786) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][280/1251] eta 0:03:58 lr 0.000920 wd 0.0500 time 0.2421 (0.2461) data time 0.0008 (0.0027) model time 0.2413 (0.2439) loss 3.0424 (3.4438) grad_norm 1.4326 (1.9804) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][290/1251] eta 0:03:56 lr 0.000920 wd 0.0500 time 0.2395 (0.2460) data time 0.0009 (0.0026) model time 0.2386 (0.2438) loss 3.8183 (3.4465) grad_norm 1.8768 (1.9815) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][300/1251] eta 0:03:55 lr 0.000920 wd 0.0500 time 0.2418 (0.2473) data time 0.0008 (0.0026) model time 0.2410 (0.2454) loss 4.0716 (3.4449) grad_norm 1.8501 (1.9854) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][310/1251] eta 0:03:52 lr 0.000920 wd 0.0500 time 0.2379 (0.2471) data time 0.0011 (0.0025) model time 0.2368 (0.2453) loss 3.1143 (3.4358) grad_norm 1.4528 (1.9816) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][320/1251] eta 0:03:50 lr 0.000920 wd 0.0500 time 0.2416 (0.2476) data time 0.0009 (0.0025) model time 0.2406 (0.2459) loss 3.9023 (3.4317) grad_norm 2.1261 (1.9833) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:21:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][330/1251] eta 0:03:49 lr 0.000920 wd 
0.0500 time 0.2400 (0.2488) data time 0.0012 (0.0024) model time 0.2388 (0.2473) loss 3.8582 (3.4335) grad_norm 2.1000 (1.9948) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][340/1251] eta 0:03:46 lr 0.000920 wd 0.0500 time 0.2451 (0.2486) data time 0.0010 (0.0024) model time 0.2442 (0.2470) loss 3.9347 (3.4316) grad_norm 2.2016 (1.9999) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][350/1251] eta 0:03:43 lr 0.000920 wd 0.0500 time 0.2424 (0.2484) data time 0.0009 (0.0024) model time 0.2416 (0.2468) loss 4.2593 (3.4333) grad_norm 1.9027 (2.0145) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][360/1251] eta 0:03:41 lr 0.000920 wd 0.0500 time 0.2358 (0.2482) data time 0.0008 (0.0023) model time 0.2350 (0.2466) loss 3.0253 (3.4342) grad_norm 1.5868 (2.0185) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][370/1251] eta 0:03:38 lr 0.000920 wd 0.0500 time 0.2427 (0.2481) data time 0.0010 (0.0023) model time 0.2417 (0.2465) loss 3.9524 (3.4440) grad_norm 2.4715 (2.0153) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][380/1251] eta 0:03:35 lr 0.000920 wd 0.0500 time 0.2432 (0.2479) data time 0.0007 (0.0023) model time 0.2425 (0.2463) loss 4.1607 (3.4506) grad_norm 1.4319 (2.0143) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][390/1251] eta 0:03:33 lr 0.000920 wd 0.0500 time 0.2332 (0.2478) data time 0.0014 (0.0022) model time 0.2318 (0.2463) loss 3.6182 (3.4484) grad_norm 1.6745 (2.0050) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:16 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][400/1251] eta 0:03:30 lr 0.000920 wd 0.0500 time 0.2386 (0.2476) data time 0.0011 (0.0022) model time 0.2375 (0.2461) loss 4.2140 (3.4542) grad_norm 2.3214 (2.0033) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][410/1251] eta 0:03:28 lr 0.000920 wd 0.0500 time 0.2372 (0.2475) data time 0.0010 (0.0022) model time 0.2363 (0.2460) loss 3.5972 (3.4588) grad_norm 2.3766 (1.9984) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][420/1251] eta 0:03:25 lr 0.000920 wd 0.0500 time 0.2352 (0.2474) data time 0.0012 (0.0021) model time 0.2340 (0.2458) loss 4.1743 (3.4592) grad_norm 1.6250 (2.0012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][430/1251] eta 0:03:22 lr 0.000920 wd 0.0500 time 0.2451 (0.2473) data time 0.0008 (0.0021) model time 0.2444 (0.2457) loss 3.8314 (3.4616) grad_norm 2.2556 (2.0088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][440/1251] eta 0:03:20 lr 0.000920 wd 0.0500 time 0.2384 (0.2471) data time 0.0009 (0.0021) model time 0.2375 (0.2455) loss 2.8847 (3.4681) grad_norm 3.1182 (2.0107) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][450/1251] eta 0:03:17 lr 0.000920 wd 0.0500 time 0.2449 (0.2469) data time 0.0011 (0.0021) model time 0.2438 (0.2454) loss 3.5381 (3.4650) grad_norm 1.8967 (2.0047) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][460/1251] eta 0:03:15 lr 0.000920 wd 0.0500 time 0.2420 (0.2468) data time 0.0010 (0.0020) model time 0.2410 (0.2452) loss 3.8788 
(3.4673) grad_norm 1.9904 (1.9994) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][470/1251] eta 0:03:12 lr 0.000920 wd 0.0500 time 0.2452 (0.2467) data time 0.0010 (0.0020) model time 0.2442 (0.2452) loss 4.0119 (3.4643) grad_norm 1.6211 (1.9961) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][480/1251] eta 0:03:10 lr 0.000920 wd 0.0500 time 0.2450 (0.2466) data time 0.0010 (0.0020) model time 0.2440 (0.2450) loss 3.7760 (3.4659) grad_norm 1.8883 (1.9964) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][490/1251] eta 0:03:07 lr 0.000920 wd 0.0500 time 0.2478 (0.2470) data time 0.0009 (0.0020) model time 0.2469 (0.2454) loss 2.3328 (3.4659) grad_norm 1.4645 (1.9995) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][500/1251] eta 0:03:05 lr 0.000920 wd 0.0500 time 0.2457 (0.2469) data time 0.0008 (0.0020) model time 0.2449 (0.2454) loss 3.1852 (3.4663) grad_norm 1.9026 (1.9960) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][510/1251] eta 0:03:02 lr 0.000920 wd 0.0500 time 0.2479 (0.2468) data time 0.0011 (0.0019) model time 0.2468 (0.2453) loss 2.4921 (3.4635) grad_norm 1.8692 (2.0008) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][520/1251] eta 0:03:00 lr 0.000920 wd 0.0500 time 0.2483 (0.2468) data time 0.0010 (0.0019) model time 0.2473 (0.2452) loss 3.0947 (3.4590) grad_norm 2.0319 (2.0001) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][530/1251] eta 0:02:57 lr 0.000920 wd 
0.0500 time 0.2471 (0.2467) data time 0.0009 (0.0019) model time 0.2462 (0.2452) loss 3.9903 (3.4625) grad_norm 2.5846 (2.0080) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][540/1251] eta 0:02:55 lr 0.000920 wd 0.0500 time 0.2377 (0.2467) data time 0.0012 (0.0019) model time 0.2365 (0.2451) loss 4.0141 (3.4652) grad_norm 2.0510 (2.0201) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][550/1251] eta 0:02:52 lr 0.000920 wd 0.0500 time 0.2424 (0.2466) data time 0.0008 (0.0019) model time 0.2416 (0.2450) loss 3.5223 (3.4620) grad_norm 1.9133 (2.0173) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][560/1251] eta 0:02:50 lr 0.000920 wd 0.0500 time 0.2446 (0.2465) data time 0.0009 (0.0019) model time 0.2437 (0.2450) loss 3.4659 (3.4644) grad_norm 1.5662 (2.0161) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][570/1251] eta 0:02:47 lr 0.000920 wd 0.0500 time 0.2383 (0.2464) data time 0.0009 (0.0018) model time 0.2374 (0.2449) loss 3.5613 (3.4686) grad_norm 1.6612 (2.0146) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][580/1251] eta 0:02:45 lr 0.000920 wd 0.0500 time 0.4523 (0.2467) data time 0.0010 (0.0018) model time 0.4513 (0.2452) loss 3.0358 (3.4675) grad_norm 1.4178 (2.0087) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][590/1251] eta 0:02:43 lr 0.000920 wd 0.0500 time 0.2362 (0.2466) data time 0.0010 (0.0018) model time 0.2353 (0.2452) loss 3.3264 (3.4683) grad_norm 2.4453 (2.0069) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][600/1251] eta 0:02:40 lr 0.000920 wd 0.0500 time 0.2452 (0.2465) data time 0.0009 (0.0018) model time 0.2443 (0.2451) loss 2.9776 (3.4716) grad_norm 1.9678 (2.0047) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][610/1251] eta 0:02:38 lr 0.000920 wd 0.0500 time 0.2406 (0.2465) data time 0.0008 (0.0018) model time 0.2398 (0.2450) loss 4.1932 (3.4705) grad_norm 1.7809 (2.0020) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][620/1251] eta 0:02:35 lr 0.000920 wd 0.0500 time 0.2390 (0.2464) data time 0.0010 (0.0018) model time 0.2380 (0.2450) loss 4.3871 (3.4750) grad_norm 2.1947 (2.0026) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][630/1251] eta 0:02:32 lr 0.000920 wd 0.0500 time 0.2407 (0.2463) data time 0.0010 (0.0018) model time 0.2397 (0.2449) loss 4.0549 (3.4740) grad_norm 1.8072 (1.9970) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][640/1251] eta 0:02:30 lr 0.000920 wd 0.0500 time 0.2394 (0.2463) data time 0.0008 (0.0018) model time 0.2385 (0.2448) loss 3.6840 (3.4729) grad_norm 2.4316 (1.9950) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][650/1251] eta 0:02:27 lr 0.000920 wd 0.0500 time 0.2492 (0.2462) data time 0.0007 (0.0018) model time 0.2485 (0.2447) loss 2.8988 (3.4682) grad_norm 1.3873 (2.0042) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][660/1251] eta 0:02:25 lr 0.000920 wd 0.0500 time 0.2393 (0.2464) data time 0.0009 (0.0017) model time 0.2384 (0.2449) loss 3.5469 
(3.4695) grad_norm 2.0865 (2.0043) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][670/1251] eta 0:02:23 lr 0.000920 wd 0.0500 time 0.2452 (0.2463) data time 0.0009 (0.0017) model time 0.2443 (0.2449) loss 3.3689 (3.4720) grad_norm 1.6883 (2.0103) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][680/1251] eta 0:02:20 lr 0.000920 wd 0.0500 time 0.2393 (0.2463) data time 0.0010 (0.0017) model time 0.2382 (0.2448) loss 3.4901 (3.4738) grad_norm 2.0321 (2.0079) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][690/1251] eta 0:02:18 lr 0.000919 wd 0.0500 time 0.2362 (0.2462) data time 0.0010 (0.0017) model time 0.2352 (0.2448) loss 3.9148 (3.4786) grad_norm 1.5909 (2.0063) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][700/1251] eta 0:02:15 lr 0.000919 wd 0.0500 time 0.2441 (0.2461) data time 0.0011 (0.0017) model time 0.2430 (0.2447) loss 3.5761 (3.4807) grad_norm 1.5861 (2.0161) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][710/1251] eta 0:02:13 lr 0.000919 wd 0.0500 time 0.2406 (0.2461) data time 0.0010 (0.0017) model time 0.2396 (0.2447) loss 3.5527 (3.4855) grad_norm 2.4019 (2.0124) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][720/1251] eta 0:02:10 lr 0.000919 wd 0.0500 time 0.2463 (0.2461) data time 0.0007 (0.0017) model time 0.2456 (0.2446) loss 2.3863 (3.4867) grad_norm 2.9111 (2.0110) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][730/1251] eta 0:02:08 lr 0.000919 wd 
0.0500 time 0.2370 (0.2460) data time 0.0011 (0.0017) model time 0.2359 (0.2446) loss 3.2788 (3.4880) grad_norm 1.8568 (2.0102) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][740/1251] eta 0:02:05 lr 0.000919 wd 0.0500 time 0.2469 (0.2460) data time 0.0011 (0.0017) model time 0.2458 (0.2445) loss 2.2869 (3.4865) grad_norm 2.2503 (2.0085) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][750/1251] eta 0:02:03 lr 0.000919 wd 0.0500 time 0.2442 (0.2459) data time 0.0008 (0.0017) model time 0.2434 (0.2445) loss 3.0029 (3.4869) grad_norm 2.1697 (2.0102) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][760/1251] eta 0:02:00 lr 0.000919 wd 0.0500 time 0.2429 (0.2459) data time 0.0010 (0.0016) model time 0.2419 (0.2444) loss 2.7583 (3.4873) grad_norm 1.9475 (2.0136) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][770/1251] eta 0:01:58 lr 0.000919 wd 0.0500 time 0.2346 (0.2461) data time 0.0008 (0.0016) model time 0.2338 (0.2447) loss 3.3951 (3.4886) grad_norm 1.8569 (2.0135) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][780/1251] eta 0:01:56 lr 0.000919 wd 0.0500 time 0.2442 (0.2463) data time 0.0009 (0.0016) model time 0.2433 (0.2449) loss 3.3514 (3.4892) grad_norm 1.7604 (2.0125) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][790/1251] eta 0:01:53 lr 0.000919 wd 0.0500 time 0.2491 (0.2463) data time 0.0009 (0.0016) model time 0.2483 (0.2449) loss 2.3536 (3.4889) grad_norm 1.7771 (2.0096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][800/1251] eta 0:01:51 lr 0.000919 wd 0.0500 time 0.2479 (0.2462) data time 0.0010 (0.0016) model time 0.2469 (0.2448) loss 3.0164 (3.4878) grad_norm 1.9434 (2.0103) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][810/1251] eta 0:01:48 lr 0.000919 wd 0.0500 time 0.2356 (0.2461) data time 0.0010 (0.0016) model time 0.2346 (0.2448) loss 3.5530 (3.4837) grad_norm 2.1810 (2.0074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][820/1251] eta 0:01:46 lr 0.000919 wd 0.0500 time 0.2367 (0.2461) data time 0.0011 (0.0016) model time 0.2356 (0.2447) loss 3.5797 (3.4853) grad_norm 1.7610 (2.0053) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][830/1251] eta 0:01:43 lr 0.000919 wd 0.0500 time 0.2398 (0.2460) data time 0.0009 (0.0016) model time 0.2389 (0.2446) loss 3.6848 (3.4848) grad_norm 2.2851 (2.0029) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][840/1251] eta 0:01:41 lr 0.000919 wd 0.0500 time 0.2333 (0.2460) data time 0.0007 (0.0016) model time 0.2325 (0.2446) loss 3.5060 (3.4826) grad_norm 2.4300 (2.0015) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][850/1251] eta 0:01:38 lr 0.000919 wd 0.0500 time 0.2454 (0.2459) data time 0.0010 (0.0016) model time 0.2444 (0.2445) loss 3.6056 (3.4793) grad_norm 1.8887 (2.0001) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][860/1251] eta 0:01:36 lr 0.000919 wd 0.0500 time 0.2423 (0.2459) data time 0.0010 (0.0016) model time 0.2413 (0.2445) loss 3.6932 
(3.4802) grad_norm 2.1404 (1.9999) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][870/1251] eta 0:01:33 lr 0.000919 wd 0.0500 time 0.2391 (0.2458) data time 0.0008 (0.0016) model time 0.2383 (0.2444) loss 2.6645 (3.4790) grad_norm 2.0987 (2.0032) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][880/1251] eta 0:01:31 lr 0.000919 wd 0.0500 time 0.2433 (0.2458) data time 0.0007 (0.0016) model time 0.2426 (0.2444) loss 3.8475 (3.4803) grad_norm 2.1393 (2.0064) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][890/1251] eta 0:01:28 lr 0.000919 wd 0.0500 time 0.2362 (0.2457) data time 0.0011 (0.0015) model time 0.2351 (0.2444) loss 3.1495 (3.4799) grad_norm 1.9735 (2.0078) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][900/1251] eta 0:01:26 lr 0.000919 wd 0.0500 time 0.2403 (0.2457) data time 0.0009 (0.0015) model time 0.2393 (0.2443) loss 3.1999 (3.4807) grad_norm 1.7935 (2.0073) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][910/1251] eta 0:01:23 lr 0.000919 wd 0.0500 time 0.2360 (0.2456) data time 0.0011 (0.0015) model time 0.2349 (0.2443) loss 3.5342 (3.4827) grad_norm 1.9856 (2.0093) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][920/1251] eta 0:01:21 lr 0.000919 wd 0.0500 time 0.2400 (0.2456) data time 0.0011 (0.0015) model time 0.2390 (0.2443) loss 3.6594 (3.4829) grad_norm 3.6840 (2.0096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][930/1251] eta 0:01:18 lr 0.000919 wd 
0.0500 time 0.2396 (0.2456) data time 0.0010 (0.0015) model time 0.2386 (0.2442) loss 3.9019 (3.4862) grad_norm 1.7865 (2.0071) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][940/1251] eta 0:01:16 lr 0.000919 wd 0.0500 time 0.2439 (0.2456) data time 0.0009 (0.0015) model time 0.2430 (0.2442) loss 3.5890 (3.4866) grad_norm 1.5836 (2.0070) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][950/1251] eta 0:01:13 lr 0.000919 wd 0.0500 time 0.2494 (0.2455) data time 0.0007 (0.0015) model time 0.2486 (0.2442) loss 3.9592 (3.4900) grad_norm 2.1091 (2.0057) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][960/1251] eta 0:01:11 lr 0.000919 wd 0.0500 time 0.2427 (0.2455) data time 0.0010 (0.0015) model time 0.2417 (0.2441) loss 3.7041 (3.4926) grad_norm 1.5046 (2.0073) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][970/1251] eta 0:01:08 lr 0.000919 wd 0.0500 time 0.2480 (0.2455) data time 0.0010 (0.0015) model time 0.2470 (0.2441) loss 3.5010 (3.4952) grad_norm 1.8755 (2.0060) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][980/1251] eta 0:01:06 lr 0.000919 wd 0.0500 time 0.2404 (0.2454) data time 0.0007 (0.0015) model time 0.2396 (0.2441) loss 4.2696 (3.4947) grad_norm 1.6371 (2.0072) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][990/1251] eta 0:01:04 lr 0.000919 wd 0.0500 time 0.2521 (0.2454) data time 0.0007 (0.0015) model time 0.2514 (0.2440) loss 3.2538 (3.4945) grad_norm 1.5556 (2.0067) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:42 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1000/1251] eta 0:01:01 lr 0.000919 wd 0.0500 time 0.2538 (0.2453) data time 0.0009 (0.0015) model time 0.2529 (0.2440) loss 3.2346 (3.4958) grad_norm 2.0232 (2.0060) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1010/1251] eta 0:00:59 lr 0.000919 wd 0.0500 time 0.2443 (0.2454) data time 0.0009 (0.0015) model time 0.2433 (0.2441) loss 3.7970 (3.4985) grad_norm 1.5962 (2.0052) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1020/1251] eta 0:00:56 lr 0.000919 wd 0.0500 time 0.2444 (0.2454) data time 0.0013 (0.0015) model time 0.2431 (0.2440) loss 3.8250 (3.4977) grad_norm 1.7397 (2.0032) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1030/1251] eta 0:00:54 lr 0.000919 wd 0.0500 time 0.2392 (0.2453) data time 0.0012 (0.0015) model time 0.2380 (0.2440) loss 2.5759 (3.4956) grad_norm 1.7402 (2.0021) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1040/1251] eta 0:00:51 lr 0.000919 wd 0.0500 time 0.2413 (0.2453) data time 0.0007 (0.0015) model time 0.2406 (0.2439) loss 3.5176 (3.4972) grad_norm 2.4002 (2.0076) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1050/1251] eta 0:00:49 lr 0.000919 wd 0.0500 time 0.2405 (0.2453) data time 0.0007 (0.0015) model time 0.2398 (0.2439) loss 3.2621 (3.4980) grad_norm 1.5159 (2.0084) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1060/1251] eta 0:00:46 lr 0.000919 wd 0.0500 time 0.2395 (0.2452) data time 0.0010 (0.0015) model time 0.2385 (0.2439) loss 4.0229 
(3.4980) grad_norm 1.8235 (2.0066) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1070/1251] eta 0:00:44 lr 0.000919 wd 0.0500 time 0.2366 (0.2452) data time 0.0011 (0.0015) model time 0.2355 (0.2438) loss 3.5595 (3.4984) grad_norm 2.7292 (2.0079) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1080/1251] eta 0:00:41 lr 0.000919 wd 0.0500 time 0.2458 (0.2453) data time 0.0009 (0.0015) model time 0.2448 (0.2440) loss 3.2046 (3.4990) grad_norm 1.8299 (2.0101) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1090/1251] eta 0:00:39 lr 0.000919 wd 0.0500 time 0.2333 (0.2453) data time 0.0008 (0.0015) model time 0.2325 (0.2440) loss 2.7153 (3.4987) grad_norm 1.4213 (2.0088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1100/1251] eta 0:00:37 lr 0.000918 wd 0.0500 time 0.2401 (0.2453) data time 0.0009 (0.0014) model time 0.2392 (0.2439) loss 2.6750 (3.4957) grad_norm 1.8829 (2.0097) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1110/1251] eta 0:00:34 lr 0.000918 wd 0.0500 time 0.2342 (0.2452) data time 0.0009 (0.0014) model time 0.2333 (0.2439) loss 3.8594 (3.4960) grad_norm 2.0945 (2.0088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1120/1251] eta 0:00:32 lr 0.000918 wd 0.0500 time 0.2418 (0.2453) data time 0.0010 (0.0014) model time 0.2408 (0.2440) loss 3.3741 (3.4961) grad_norm 1.7438 (2.0081) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1130/1251] eta 0:00:29 lr 
0.000918 wd 0.0500 time 0.2495 (0.2453) data time 0.0012 (0.0014) model time 0.2484 (0.2440) loss 3.4351 (3.4941) grad_norm 1.7404 (2.0096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1140/1251] eta 0:00:27 lr 0.000918 wd 0.0500 time 0.2457 (0.2453) data time 0.0009 (0.0014) model time 0.2448 (0.2440) loss 2.7874 (3.4948) grad_norm 2.0138 (2.0086) loss_scale 8192.0000 (4131.8983) mem 7379MB [2024-08-26 08:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1150/1251] eta 0:00:24 lr 0.000918 wd 0.0500 time 0.2435 (0.2453) data time 0.0010 (0.0014) model time 0.2425 (0.2440) loss 3.4259 (3.4942) grad_norm 1.5529 (2.0058) loss_scale 8192.0000 (4167.1729) mem 7379MB [2024-08-26 08:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1160/1251] eta 0:00:22 lr 0.000918 wd 0.0500 time 0.2418 (0.2452) data time 0.0009 (0.0014) model time 0.2409 (0.2439) loss 4.4599 (3.4919) grad_norm 2.8730 (2.0050) loss_scale 8192.0000 (4201.8398) mem 7379MB [2024-08-26 08:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1170/1251] eta 0:00:19 lr 0.000918 wd 0.0500 time 0.2410 (0.2452) data time 0.0011 (0.0014) model time 0.2399 (0.2439) loss 3.4641 (3.4916) grad_norm 2.0718 (2.0062) loss_scale 8192.0000 (4235.9146) mem 7379MB [2024-08-26 08:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1180/1251] eta 0:00:17 lr 0.000918 wd 0.0500 time 0.2374 (0.2451) data time 0.0009 (0.0014) model time 0.2365 (0.2438) loss 3.6017 (3.4924) grad_norm 2.1106 (2.0074) loss_scale 8192.0000 (4269.4124) mem 7379MB [2024-08-26 08:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1190/1251] eta 0:00:14 lr 0.000918 wd 0.0500 time 0.2405 (0.2453) data time 0.0007 (0.0014) model time 0.2397 (0.2440) loss 3.4641 (3.4915) grad_norm 1.7996 (inf) loss_scale 4096.0000 (4281.7128) mem 7379MB [2024-08-26 
08:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1200/1251] eta 0:00:12 lr 0.000918 wd 0.0500 time 0.2398 (0.2453) data time 0.0009 (0.0014) model time 0.2388 (0.2440) loss 3.6825 (3.4921) grad_norm 2.1773 (inf) loss_scale 4096.0000 (4280.1665) mem 7379MB [2024-08-26 08:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1210/1251] eta 0:00:10 lr 0.000918 wd 0.0500 time 0.2400 (0.2452) data time 0.0011 (0.0014) model time 0.2389 (0.2439) loss 3.8194 (3.4931) grad_norm 1.9497 (inf) loss_scale 4096.0000 (4278.6457) mem 7379MB [2024-08-26 08:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1220/1251] eta 0:00:07 lr 0.000918 wd 0.0500 time 0.2568 (0.2452) data time 0.0007 (0.0014) model time 0.2561 (0.2439) loss 2.5311 (3.4907) grad_norm 1.5480 (inf) loss_scale 4096.0000 (4277.1499) mem 7379MB [2024-08-26 08:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1230/1251] eta 0:00:05 lr 0.000918 wd 0.0500 time 0.2510 (0.2455) data time 0.0007 (0.0014) model time 0.2503 (0.2443) loss 3.8003 (3.4931) grad_norm 2.7275 (inf) loss_scale 4096.0000 (4275.6783) mem 7379MB [2024-08-26 08:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1240/1251] eta 0:00:02 lr 0.000918 wd 0.0500 time 0.2238 (0.2454) data time 0.0007 (0.0014) model time 0.2231 (0.2442) loss 3.9534 (3.4911) grad_norm 2.8039 (inf) loss_scale 4096.0000 (4274.2305) mem 7379MB [2024-08-26 08:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [71/300][1250/1251] eta 0:00:00 lr 0.000918 wd 0.0500 time 0.2235 (0.2453) data time 0.0005 (0.0014) model time 0.2230 (0.2440) loss 2.7183 (3.4895) grad_norm 1.5982 (inf) loss_scale 4096.0000 (4272.8058) mem 7379MB [2024-08-26 08:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 71 training takes 0:05:06 [2024-08-26 08:25:43 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 08:25:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 08:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.423 (0.423) Loss 0.4673 (0.4673) Acc@1 90.527 (90.527) Acc@5 97.656 (97.656) Mem 7379MB [2024-08-26 08:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.112) Loss 0.8613 (0.8047) Acc@1 80.176 (81.392) Acc@5 95.020 (95.863) Mem 7379MB [2024-08-26 08:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.097) Loss 1.1523 (0.8312) Acc@1 71.191 (80.101) Acc@5 92.578 (95.829) Mem 7379MB [2024-08-26 08:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.4502 (0.9559) Acc@1 65.039 (77.523) Acc@5 87.793 (94.276) Mem 7379MB [2024-08-26 08:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3877 (1.0223) Acc@1 67.676 (76.043) Acc@5 87.695 (93.395) Mem 7379MB [2024-08-26 08:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.662 Acc@5 93.292 [2024-08-26 08:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 75.7% [2024-08-26 08:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.783 (0.783) Loss 0.4697 (0.4697) Acc@1 91.504 (91.504) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 08:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.148) Loss 0.7510 (0.7314) Acc@1 84.863 (84.242) Acc@5 95.898 (96.786) Mem 7379MB [2024-08-26 08:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.114) Loss 1.0547 (0.7543) Acc@1 75.098 (83.143) Acc@5 93.359 (96.763) Mem 7379MB [2024-08-26 08:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.077 (0.102) Loss 1.3145 (0.8617) Acc@1 66.992 (80.620) Acc@5 90.234 (95.388) Mem 7379MB [2024-08-26 08:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.093) Loss 1.2246 (0.9191) Acc@1 70.312 (79.137) Acc@5 91.309 (94.779) Mem 7379MB [2024-08-26 08:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.742 Acc@5 94.702 [2024-08-26 08:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.7% [2024-08-26 08:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.74% [2024-08-26 08:25:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 08:25:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 08:25:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][0/1251] eta 0:18:58 lr 0.000918 wd 0.0500 time 0.9098 (0.9098) data time 0.6884 (0.6884) model time 0.0000 (0.0000) loss 4.0344 (4.0344) grad_norm 1.3915 (1.3915) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][10/1251] eta 0:07:02 lr 0.000918 wd 0.0500 time 0.2373 (0.3407) data time 0.0010 (0.0635) model time 0.0000 (0.0000) loss 3.6424 (3.7055) grad_norm 2.1555 (1.9566) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:25:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][20/1251] eta 0:06:00 lr 0.000918 wd 0.0500 time 0.2390 (0.2932) data time 0.0009 (0.0337) model time 0.0000 (0.0000) loss 4.2180 (3.6191) grad_norm 1.8049 (1.9381) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][30/1251] eta 0:05:37 lr 0.000918 wd 0.0500 time 0.2477 (0.2761) data time 0.0007 (0.0232) 
model time 0.0000 (0.0000) loss 2.5754 (3.5394) grad_norm 2.4561 (1.9501) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][40/1251] eta 0:05:23 lr 0.000918 wd 0.0500 time 0.2383 (0.2670) data time 0.0007 (0.0178) model time 0.0000 (0.0000) loss 4.1680 (3.4929) grad_norm 2.9231 (2.0030) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][50/1251] eta 0:05:14 lr 0.000918 wd 0.0500 time 0.2357 (0.2622) data time 0.0010 (0.0145) model time 0.0000 (0.0000) loss 3.5381 (3.5110) grad_norm 2.2833 (2.0091) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][60/1251] eta 0:05:08 lr 0.000918 wd 0.0500 time 0.2461 (0.2591) data time 0.0009 (0.0123) model time 0.2451 (0.2422) loss 3.6287 (3.5264) grad_norm 1.8842 (1.9807) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][70/1251] eta 0:05:03 lr 0.000918 wd 0.0500 time 0.2412 (0.2568) data time 0.0008 (0.0107) model time 0.2404 (0.2422) loss 4.3867 (3.4970) grad_norm 2.0909 (1.9556) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][80/1251] eta 0:04:59 lr 0.000918 wd 0.0500 time 0.2452 (0.2555) data time 0.0008 (0.0095) model time 0.2445 (0.2431) loss 2.8217 (3.4427) grad_norm 3.7326 (1.9745) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][90/1251] eta 0:04:54 lr 0.000918 wd 0.0500 time 0.2502 (0.2539) data time 0.0010 (0.0085) model time 0.2492 (0.2423) loss 3.6848 (3.4315) grad_norm 2.0124 (1.9814) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[72/300][100/1251] eta 0:04:51 lr 0.000918 wd 0.0500 time 0.2526 (0.2529) data time 0.0011 (0.0078) model time 0.2514 (0.2424) loss 3.1630 (3.4014) grad_norm 1.3969 (1.9603) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][110/1251] eta 0:04:47 lr 0.000918 wd 0.0500 time 0.2385 (0.2520) data time 0.0007 (0.0072) model time 0.2378 (0.2423) loss 2.1208 (3.4272) grad_norm 1.9744 (1.9691) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][120/1251] eta 0:04:43 lr 0.000918 wd 0.0500 time 0.2388 (0.2509) data time 0.0009 (0.0067) model time 0.2379 (0.2417) loss 4.1587 (3.4337) grad_norm 3.6751 (1.9859) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][130/1251] eta 0:04:40 lr 0.000918 wd 0.0500 time 0.2395 (0.2502) data time 0.0010 (0.0062) model time 0.2386 (0.2415) loss 3.1185 (3.4392) grad_norm 2.2931 (1.9969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][140/1251] eta 0:04:37 lr 0.000918 wd 0.0500 time 0.2371 (0.2495) data time 0.0010 (0.0059) model time 0.2362 (0.2414) loss 3.6812 (3.4551) grad_norm 2.7733 (2.0072) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][150/1251] eta 0:04:34 lr 0.000918 wd 0.0500 time 0.2470 (0.2491) data time 0.0008 (0.0055) model time 0.2462 (0.2415) loss 3.5630 (3.4457) grad_norm 1.7302 (1.9960) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][160/1251] eta 0:04:31 lr 0.000918 wd 0.0500 time 0.2395 (0.2487) data time 0.0010 (0.0053) model time 0.2384 (0.2415) loss 3.7823 (3.4509) grad_norm 1.5612 (1.9843) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 08:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][170/1251] eta 0:04:28 lr 0.000918 wd 0.0500 time 0.2366 (0.2483) data time 0.0012 (0.0050) model time 0.2354 (0.2414) loss 2.5239 (3.4622) grad_norm 1.6015 (1.9742) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][180/1251] eta 0:04:25 lr 0.000918 wd 0.0500 time 0.2381 (0.2480) data time 0.0008 (0.0048) model time 0.2373 (0.2414) loss 3.4446 (3.4689) grad_norm 5.0301 (1.9824) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][190/1251] eta 0:04:22 lr 0.000918 wd 0.0500 time 0.2403 (0.2475) data time 0.0011 (0.0046) model time 0.2392 (0.2412) loss 3.5895 (3.4702) grad_norm 2.7851 (1.9959) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][200/1251] eta 0:04:19 lr 0.000918 wd 0.0500 time 0.2340 (0.2471) data time 0.0011 (0.0044) model time 0.2329 (0.2410) loss 3.7697 (3.4827) grad_norm 1.9896 (1.9946) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][210/1251] eta 0:04:17 lr 0.000918 wd 0.0500 time 0.2488 (0.2475) data time 0.0010 (0.0043) model time 0.2478 (0.2419) loss 3.3766 (3.4751) grad_norm 1.9924 (1.9974) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][220/1251] eta 0:04:14 lr 0.000918 wd 0.0500 time 0.2388 (0.2473) data time 0.0008 (0.0041) model time 0.2380 (0.2418) loss 3.2918 (3.4616) grad_norm 3.8726 (1.9996) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][230/1251] eta 0:04:12 lr 0.000918 wd 0.0500 time 0.2346 (0.2470) data time 0.0010 (0.0040) 
model time 0.2337 (0.2417) loss 3.6696 (3.4604) grad_norm 1.9607 (1.9974) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][240/1251] eta 0:04:09 lr 0.000918 wd 0.0500 time 0.2345 (0.2468) data time 0.0011 (0.0039) model time 0.2334 (0.2416) loss 3.2868 (3.4575) grad_norm 1.6983 (2.0059) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][250/1251] eta 0:04:06 lr 0.000918 wd 0.0500 time 0.2357 (0.2464) data time 0.0010 (0.0037) model time 0.2347 (0.2414) loss 3.2273 (3.4710) grad_norm 1.8988 (2.0129) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][260/1251] eta 0:04:05 lr 0.000917 wd 0.0500 time 0.2417 (0.2478) data time 0.0009 (0.0036) model time 0.2408 (0.2433) loss 2.5192 (3.4592) grad_norm 1.9647 (2.0058) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][270/1251] eta 0:04:02 lr 0.000917 wd 0.0500 time 0.2408 (0.2477) data time 0.0010 (0.0035) model time 0.2398 (0.2433) loss 4.0631 (3.4605) grad_norm 2.1892 (2.0118) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][280/1251] eta 0:04:00 lr 0.000917 wd 0.0500 time 0.2389 (0.2474) data time 0.0008 (0.0034) model time 0.2381 (0.2432) loss 3.8009 (3.4671) grad_norm 1.2884 (2.0077) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][290/1251] eta 0:03:57 lr 0.000917 wd 0.0500 time 0.2457 (0.2473) data time 0.0010 (0.0034) model time 0.2447 (0.2431) loss 3.8435 (3.4595) grad_norm 1.3652 (2.0040) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[72/300][300/1251] eta 0:03:54 lr 0.000917 wd 0.0500 time 0.2329 (0.2470) data time 0.0009 (0.0033) model time 0.2320 (0.2429) loss 2.4836 (3.4596) grad_norm 1.9454 (1.9998) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][310/1251] eta 0:03:52 lr 0.000917 wd 0.0500 time 0.2417 (0.2469) data time 0.0011 (0.0032) model time 0.2405 (0.2429) loss 2.8204 (3.4629) grad_norm 1.8981 (1.9984) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][320/1251] eta 0:03:49 lr 0.000917 wd 0.0500 time 0.2457 (0.2468) data time 0.0007 (0.0031) model time 0.2450 (0.2429) loss 3.2328 (3.4647) grad_norm 2.0529 (2.0085) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][330/1251] eta 0:03:47 lr 0.000917 wd 0.0500 time 0.2387 (0.2466) data time 0.0007 (0.0031) model time 0.2380 (0.2428) loss 4.4190 (3.4682) grad_norm 1.7321 (2.0079) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][340/1251] eta 0:03:44 lr 0.000917 wd 0.0500 time 0.2348 (0.2465) data time 0.0012 (0.0030) model time 0.2336 (0.2428) loss 3.2761 (3.4762) grad_norm 1.7714 (2.0062) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][350/1251] eta 0:03:42 lr 0.000917 wd 0.0500 time 0.2457 (0.2464) data time 0.0008 (0.0030) model time 0.2450 (0.2427) loss 3.6158 (3.4817) grad_norm 2.6254 (2.0138) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][360/1251] eta 0:03:39 lr 0.000917 wd 0.0500 time 0.2479 (0.2463) data time 0.0010 (0.0029) model time 0.2469 (0.2427) loss 3.9632 (3.4840) grad_norm 1.7407 (2.0172) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 08:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][370/1251] eta 0:03:36 lr 0.000917 wd 0.0500 time 0.2430 (0.2462) data time 0.0011 (0.0029) model time 0.2420 (0.2427) loss 2.5786 (3.4847) grad_norm 1.4204 (2.0138) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][380/1251] eta 0:03:34 lr 0.000917 wd 0.0500 time 0.2469 (0.2461) data time 0.0007 (0.0028) model time 0.2462 (0.2426) loss 3.2524 (3.4804) grad_norm 1.8052 (2.0112) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][390/1251] eta 0:03:31 lr 0.000917 wd 0.0500 time 0.2392 (0.2461) data time 0.0008 (0.0028) model time 0.2384 (0.2426) loss 3.5477 (3.4842) grad_norm 2.5886 (2.0160) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][400/1251] eta 0:03:29 lr 0.000917 wd 0.0500 time 0.4253 (0.2464) data time 0.0009 (0.0027) model time 0.4244 (0.2431) loss 4.1418 (3.4909) grad_norm 3.4241 (2.0274) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][410/1251] eta 0:03:27 lr 0.000917 wd 0.0500 time 0.2422 (0.2463) data time 0.0011 (0.0027) model time 0.2411 (0.2431) loss 3.6336 (3.4898) grad_norm 2.4517 (2.0328) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][420/1251] eta 0:03:24 lr 0.000917 wd 0.0500 time 0.2383 (0.2463) data time 0.0007 (0.0026) model time 0.2376 (0.2430) loss 4.2266 (3.4967) grad_norm 2.3883 (2.0288) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][430/1251] eta 0:03:22 lr 0.000917 wd 0.0500 time 0.2441 (0.2461) data time 0.0009 (0.0026) 
model time 0.2432 (0.2430) loss 3.8822 (3.4994) grad_norm 1.6599 (2.0258) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][440/1251] eta 0:03:19 lr 0.000917 wd 0.0500 time 0.2426 (0.2460) data time 0.0007 (0.0026) model time 0.2418 (0.2429) loss 4.0981 (3.4963) grad_norm 1.6919 (2.0263) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][450/1251] eta 0:03:16 lr 0.000917 wd 0.0500 time 0.2461 (0.2459) data time 0.0009 (0.0025) model time 0.2452 (0.2428) loss 3.8693 (3.4985) grad_norm 1.9450 (2.0243) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][460/1251] eta 0:03:14 lr 0.000917 wd 0.0500 time 0.2390 (0.2459) data time 0.0007 (0.0025) model time 0.2384 (0.2428) loss 3.5941 (3.4969) grad_norm 3.3611 (2.0330) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][470/1251] eta 0:03:11 lr 0.000917 wd 0.0500 time 0.2366 (0.2458) data time 0.0010 (0.0025) model time 0.2356 (0.2428) loss 3.2194 (3.4969) grad_norm 1.5381 (2.0296) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][480/1251] eta 0:03:09 lr 0.000917 wd 0.0500 time 0.2413 (0.2457) data time 0.0009 (0.0024) model time 0.2404 (0.2428) loss 4.3624 (3.4952) grad_norm 2.5460 (2.0251) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][490/1251] eta 0:03:07 lr 0.000917 wd 0.0500 time 0.2651 (0.2461) data time 0.0007 (0.0024) model time 0.2644 (0.2433) loss 4.4212 (3.5012) grad_norm 2.1294 (2.0215) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[72/300][500/1251] eta 0:03:05 lr 0.000917 wd 0.0500 time 0.2382 (0.2466) data time 0.0012 (0.0024) model time 0.2370 (0.2439) loss 3.2711 (3.5035) grad_norm 2.4978 (2.0261) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][510/1251] eta 0:03:02 lr 0.000917 wd 0.0500 time 0.2437 (0.2465) data time 0.0010 (0.0024) model time 0.2427 (0.2438) loss 3.5972 (3.5012) grad_norm 1.8504 (2.0276) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][520/1251] eta 0:03:00 lr 0.000917 wd 0.0500 time 0.2320 (0.2464) data time 0.0011 (0.0023) model time 0.2309 (0.2437) loss 3.6142 (3.4971) grad_norm 1.9579 (2.0306) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][530/1251] eta 0:02:57 lr 0.000917 wd 0.0500 time 0.2448 (0.2463) data time 0.0010 (0.0023) model time 0.2437 (0.2436) loss 3.4247 (3.4926) grad_norm 1.6542 (2.0317) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][540/1251] eta 0:02:55 lr 0.000917 wd 0.0500 time 0.2450 (0.2463) data time 0.0008 (0.0023) model time 0.2442 (0.2436) loss 3.4692 (3.4919) grad_norm 1.8999 (2.0360) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][550/1251] eta 0:02:52 lr 0.000917 wd 0.0500 time 0.2469 (0.2462) data time 0.0010 (0.0023) model time 0.2459 (0.2435) loss 3.8613 (3.4908) grad_norm 2.0978 (2.0317) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][560/1251] eta 0:02:50 lr 0.000917 wd 0.0500 time 0.2425 (0.2461) data time 0.0011 (0.0022) model time 0.2414 (0.2435) loss 3.5860 (3.4916) grad_norm 2.1254 (2.0247) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 08:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][570/1251] eta 0:02:47 lr 0.000917 wd 0.0500 time 0.2430 (0.2460) data time 0.0009 (0.0022) model time 0.2421 (0.2434) loss 3.2198 (3.4829) grad_norm 1.7813 (2.0253) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][580/1251] eta 0:02:45 lr 0.000917 wd 0.0500 time 0.2364 (0.2459) data time 0.0010 (0.0022) model time 0.2354 (0.2434) loss 2.8404 (3.4793) grad_norm 2.0184 (2.0259) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][590/1251] eta 0:02:42 lr 0.000917 wd 0.0500 time 0.2410 (0.2459) data time 0.0010 (0.0022) model time 0.2400 (0.2433) loss 2.7843 (3.4791) grad_norm 2.1988 (2.0313) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][600/1251] eta 0:02:40 lr 0.000917 wd 0.0500 time 0.2432 (0.2458) data time 0.0008 (0.0022) model time 0.2423 (0.2433) loss 3.9707 (3.4819) grad_norm 2.1897 (2.0303) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][610/1251] eta 0:02:37 lr 0.000917 wd 0.0500 time 0.2354 (0.2457) data time 0.0007 (0.0021) model time 0.2347 (0.2432) loss 3.9784 (3.4836) grad_norm 2.7307 (2.0353) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][620/1251] eta 0:02:35 lr 0.000917 wd 0.0500 time 0.2388 (0.2456) data time 0.0009 (0.0021) model time 0.2380 (0.2431) loss 3.8747 (3.4879) grad_norm 2.4925 (2.0365) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][630/1251] eta 0:02:32 lr 0.000917 wd 0.0500 time 0.2496 (0.2456) data time 0.0007 (0.0021) 
model time 0.2488 (0.2431) loss 2.9898 (3.4900) grad_norm 1.4709 (2.0344) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][640/1251] eta 0:02:30 lr 0.000917 wd 0.0500 time 0.2441 (0.2458) data time 0.0011 (0.0021) model time 0.2430 (0.2434) loss 3.8533 (3.4890) grad_norm 1.7472 (2.0326) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][650/1251] eta 0:02:27 lr 0.000917 wd 0.0500 time 0.2406 (0.2458) data time 0.0010 (0.0021) model time 0.2395 (0.2434) loss 3.6510 (3.4926) grad_norm 2.2340 (2.0271) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][660/1251] eta 0:02:25 lr 0.000916 wd 0.0500 time 0.2339 (0.2457) data time 0.0011 (0.0021) model time 0.2328 (0.2433) loss 3.0178 (3.4928) grad_norm 1.8776 (2.0269) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][670/1251] eta 0:02:22 lr 0.000916 wd 0.0500 time 0.2419 (0.2457) data time 0.0008 (0.0020) model time 0.2411 (0.2433) loss 3.9992 (3.4965) grad_norm 1.6934 (2.0270) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][680/1251] eta 0:02:20 lr 0.000916 wd 0.0500 time 0.2441 (0.2456) data time 0.0009 (0.0020) model time 0.2432 (0.2433) loss 4.0555 (3.4934) grad_norm 1.7527 (2.0296) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][690/1251] eta 0:02:17 lr 0.000916 wd 0.0500 time 0.2402 (0.2456) data time 0.0008 (0.0020) model time 0.2394 (0.2432) loss 4.0040 (3.4974) grad_norm 1.6916 (2.0290) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[72/300][700/1251] eta 0:02:15 lr 0.000916 wd 0.0500 time 0.2424 (0.2455) data time 0.0011 (0.0020) model time 0.2413 (0.2431) loss 2.4128 (3.4933) grad_norm 1.6962 (2.0289) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][710/1251] eta 0:02:12 lr 0.000916 wd 0.0500 time 0.2413 (0.2454) data time 0.0008 (0.0020) model time 0.2405 (0.2431) loss 3.5192 (3.4897) grad_norm 1.7229 (2.0250) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][720/1251] eta 0:02:10 lr 0.000916 wd 0.0500 time 0.2418 (0.2453) data time 0.0008 (0.0020) model time 0.2410 (0.2430) loss 3.7031 (3.4906) grad_norm 1.5493 (2.0242) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][730/1251] eta 0:02:07 lr 0.000916 wd 0.0500 time 0.2433 (0.2456) data time 0.0007 (0.0020) model time 0.2426 (0.2433) loss 2.6969 (3.4891) grad_norm 2.0730 (2.0269) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][740/1251] eta 0:02:05 lr 0.000916 wd 0.0500 time 0.2379 (0.2456) data time 0.0012 (0.0019) model time 0.2367 (0.2433) loss 3.6566 (3.4911) grad_norm 1.6083 (2.0277) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][750/1251] eta 0:02:03 lr 0.000916 wd 0.0500 time 0.2374 (0.2455) data time 0.0007 (0.0019) model time 0.2367 (0.2433) loss 2.3366 (3.4881) grad_norm 1.6914 (2.0322) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][760/1251] eta 0:02:00 lr 0.000916 wd 0.0500 time 0.2399 (0.2455) data time 0.0007 (0.0019) model time 0.2391 (0.2432) loss 3.4308 (3.4887) grad_norm 2.6472 (2.0337) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 08:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][770/1251] eta 0:01:58 lr 0.000916 wd 0.0500 time 0.2466 (0.2454) data time 0.0008 (0.0019) model time 0.2457 (0.2432) loss 3.7827 (3.4891) grad_norm 1.8776 (2.0302) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][780/1251] eta 0:01:55 lr 0.000916 wd 0.0500 time 0.2380 (0.2454) data time 0.0007 (0.0019) model time 0.2372 (0.2432) loss 4.3759 (3.4935) grad_norm 2.2021 (2.0266) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][790/1251] eta 0:01:53 lr 0.000916 wd 0.0500 time 0.2472 (0.2453) data time 0.0010 (0.0019) model time 0.2463 (0.2431) loss 3.5304 (3.4919) grad_norm 2.0801 (2.0254) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][800/1251] eta 0:01:50 lr 0.000916 wd 0.0500 time 0.2537 (0.2453) data time 0.0009 (0.0019) model time 0.2528 (0.2431) loss 3.2431 (3.4924) grad_norm 2.2897 (2.0258) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][810/1251] eta 0:01:48 lr 0.000916 wd 0.0500 time 0.2417 (0.2453) data time 0.0010 (0.0019) model time 0.2407 (0.2431) loss 4.5742 (3.4929) grad_norm 3.2205 (2.0259) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][820/1251] eta 0:01:45 lr 0.000916 wd 0.0500 time 0.2443 (0.2453) data time 0.0007 (0.0019) model time 0.2436 (0.2431) loss 3.1948 (3.4934) grad_norm 2.1882 (2.0235) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][830/1251] eta 0:01:43 lr 0.000916 wd 0.0500 time 0.2365 (0.2452) data time 0.0010 (0.0018) 
model time 0.2356 (0.2431) loss 3.7369 (3.4941) grad_norm 2.0600 (2.0218) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][840/1251] eta 0:01:40 lr 0.000916 wd 0.0500 time 0.2399 (0.2452) data time 0.0010 (0.0018) model time 0.2389 (0.2431) loss 3.8419 (3.4955) grad_norm 4.3189 (2.0263) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][850/1251] eta 0:01:38 lr 0.000916 wd 0.0500 time 0.2410 (0.2452) data time 0.0008 (0.0018) model time 0.2401 (0.2431) loss 2.6544 (3.4923) grad_norm 1.9918 (2.0245) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][860/1251] eta 0:01:35 lr 0.000916 wd 0.0500 time 0.2431 (0.2451) data time 0.0011 (0.0018) model time 0.2420 (0.2430) loss 3.1155 (3.4914) grad_norm 1.8080 (2.0235) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][870/1251] eta 0:01:33 lr 0.000916 wd 0.0500 time 0.2322 (0.2450) data time 0.0008 (0.0018) model time 0.2314 (0.2430) loss 4.4934 (3.4915) grad_norm 1.7055 (2.0232) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][880/1251] eta 0:01:30 lr 0.000916 wd 0.0500 time 0.2392 (0.2450) data time 0.0008 (0.0018) model time 0.2384 (0.2429) loss 3.7272 (3.4879) grad_norm 1.5288 (2.0204) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][890/1251] eta 0:01:28 lr 0.000916 wd 0.0500 time 0.2352 (0.2450) data time 0.0011 (0.0018) model time 0.2341 (0.2429) loss 3.6907 (3.4894) grad_norm 2.0137 (2.0199) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[72/300][900/1251] eta 0:01:25 lr 0.000916 wd 0.0500 time 0.2388 (0.2449) data time 0.0008 (0.0018) model time 0.2380 (0.2429) loss 4.0076 (3.4900) grad_norm 1.9248 (2.0181) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][910/1251] eta 0:01:23 lr 0.000916 wd 0.0500 time 0.2419 (0.2449) data time 0.0008 (0.0018) model time 0.2412 (0.2428) loss 4.2681 (3.4923) grad_norm 1.8257 (2.0167) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][920/1251] eta 0:01:21 lr 0.000916 wd 0.0500 time 0.3695 (0.2450) data time 0.0009 (0.0018) model time 0.3686 (0.2430) loss 3.5636 (3.4926) grad_norm 3.0001 (2.0180) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][930/1251] eta 0:01:18 lr 0.000916 wd 0.0500 time 0.2475 (0.2449) data time 0.0008 (0.0018) model time 0.2467 (0.2429) loss 3.4992 (3.4943) grad_norm 2.6081 (2.0227) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][940/1251] eta 0:01:16 lr 0.000916 wd 0.0500 time 0.2345 (0.2453) data time 0.0010 (0.0017) model time 0.2336 (0.2433) loss 3.4805 (3.4950) grad_norm 1.8847 (2.0221) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][950/1251] eta 0:01:13 lr 0.000916 wd 0.0500 time 0.2395 (0.2453) data time 0.0010 (0.0017) model time 0.2385 (0.2433) loss 3.0205 (3.4930) grad_norm 1.6085 (2.0196) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][960/1251] eta 0:01:11 lr 0.000916 wd 0.0500 time 0.2362 (0.2452) data time 0.0008 (0.0017) model time 0.2354 (0.2433) loss 3.5606 (3.4962) grad_norm 2.3218 (2.0195) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 08:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][970/1251] eta 0:01:08 lr 0.000916 wd 0.0500 time 0.2424 (0.2452) data time 0.0008 (0.0017) model time 0.2416 (0.2432) loss 4.4543 (3.4970) grad_norm 2.1155 (2.0179) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][980/1251] eta 0:01:06 lr 0.000916 wd 0.0500 time 0.2412 (0.2452) data time 0.0012 (0.0017) model time 0.2400 (0.2432) loss 3.8455 (3.4958) grad_norm 1.4667 (2.0169) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][990/1251] eta 0:01:03 lr 0.000916 wd 0.0500 time 0.2381 (0.2451) data time 0.0011 (0.0017) model time 0.2370 (0.2432) loss 3.4269 (3.4964) grad_norm 2.0359 (2.0197) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1000/1251] eta 0:01:01 lr 0.000916 wd 0.0500 time 0.2461 (0.2451) data time 0.0008 (0.0017) model time 0.2453 (0.2431) loss 4.4217 (3.4962) grad_norm 1.6151 (2.0171) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1010/1251] eta 0:00:59 lr 0.000916 wd 0.0500 time 0.2433 (0.2450) data time 0.0007 (0.0017) model time 0.2426 (0.2431) loss 2.8273 (3.4963) grad_norm 1.4575 (2.0137) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1020/1251] eta 0:00:56 lr 0.000916 wd 0.0500 time 0.4657 (0.2455) data time 0.0017 (0.0017) model time 0.4640 (0.2436) loss 3.9003 (3.4966) grad_norm 2.1389 (2.0125) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1030/1251] eta 0:00:54 lr 0.000916 wd 0.0500 time 0.2414 (0.2454) data time 0.0009 
(0.0017) model time 0.2405 (0.2435) loss 2.3590 (3.4968) grad_norm 1.8192 (2.0144) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1040/1251] eta 0:00:51 lr 0.000916 wd 0.0500 time 0.2462 (0.2454) data time 0.0011 (0.0017) model time 0.2451 (0.2435) loss 3.9222 (3.4978) grad_norm 1.6333 (2.0128) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1050/1251] eta 0:00:49 lr 0.000916 wd 0.0500 time 0.2400 (0.2454) data time 0.0009 (0.0017) model time 0.2390 (0.2435) loss 3.2040 (3.4985) grad_norm 1.4012 (2.0098) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1060/1251] eta 0:00:46 lr 0.000916 wd 0.0500 time 0.2354 (0.2453) data time 0.0010 (0.0017) model time 0.2344 (0.2435) loss 3.5453 (3.4995) grad_norm 1.9992 (2.0096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1070/1251] eta 0:00:44 lr 0.000915 wd 0.0500 time 0.2396 (0.2453) data time 0.0008 (0.0017) model time 0.2388 (0.2434) loss 4.5436 (3.5021) grad_norm 2.4447 (2.0111) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1080/1251] eta 0:00:41 lr 0.000915 wd 0.0500 time 0.2508 (0.2453) data time 0.0010 (0.0017) model time 0.2498 (0.2434) loss 3.1265 (3.5037) grad_norm 1.4691 (2.0086) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1090/1251] eta 0:00:39 lr 0.000915 wd 0.0500 time 0.2458 (0.2453) data time 0.0010 (0.0016) model time 0.2449 (0.2434) loss 3.8480 (3.5070) grad_norm 2.2188 (2.0063) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [72/300][1100/1251] eta 0:00:37 lr 0.000915 wd 0.0500 time 0.2394 (0.2452) data time 0.0012 (0.0016) model time 0.2382 (0.2434) loss 3.8362 (3.5072) grad_norm 1.8279 (2.0063) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1110/1251] eta 0:00:34 lr 0.000915 wd 0.0500 time 0.2444 (0.2452) data time 0.0007 (0.0016) model time 0.2437 (0.2434) loss 3.9217 (3.5126) grad_norm 1.8989 (2.0061) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1120/1251] eta 0:00:32 lr 0.000915 wd 0.0500 time 0.2526 (0.2452) data time 0.0011 (0.0016) model time 0.2515 (0.2434) loss 3.5065 (3.5115) grad_norm 2.5008 (2.0055) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1130/1251] eta 0:00:29 lr 0.000915 wd 0.0500 time 0.2428 (0.2451) data time 0.0011 (0.0016) model time 0.2417 (0.2433) loss 3.4911 (3.5110) grad_norm 1.6128 (2.0050) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1140/1251] eta 0:00:27 lr 0.000915 wd 0.0500 time 0.2454 (0.2451) data time 0.0007 (0.0016) model time 0.2447 (0.2433) loss 4.2085 (3.5125) grad_norm 1.7901 (2.0071) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1150/1251] eta 0:00:24 lr 0.000915 wd 0.0500 time 0.2431 (0.2451) data time 0.0009 (0.0016) model time 0.2421 (0.2433) loss 3.8356 (3.5116) grad_norm 1.8175 (2.0059) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1160/1251] eta 0:00:22 lr 0.000915 wd 0.0500 time 0.2510 (0.2451) data time 0.0009 (0.0016) model time 0.2501 (0.2433) loss 3.2095 (3.5094) grad_norm 1.7209 (2.0078) loss_scale 
4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1170/1251] eta 0:00:19 lr 0.000915 wd 0.0500 time 0.2358 (0.2450) data time 0.0007 (0.0016) model time 0.2351 (0.2432) loss 3.1838 (3.5094) grad_norm 1.8213 (2.0066) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1180/1251] eta 0:00:17 lr 0.000915 wd 0.0500 time 0.2431 (0.2452) data time 0.0009 (0.0016) model time 0.2422 (0.2434) loss 3.5956 (3.5095) grad_norm 1.8042 (2.0059) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1190/1251] eta 0:00:14 lr 0.000915 wd 0.0500 time 0.2420 (0.2455) data time 0.0007 (0.0016) model time 0.2413 (0.2437) loss 3.3990 (3.5096) grad_norm 1.6019 (2.0060) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1200/1251] eta 0:00:12 lr 0.000915 wd 0.0500 time 0.2409 (0.2455) data time 0.0007 (0.0016) model time 0.2401 (0.2437) loss 4.3791 (3.5102) grad_norm 2.3145 (2.0040) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1210/1251] eta 0:00:10 lr 0.000915 wd 0.0500 time 0.2442 (0.2454) data time 0.0007 (0.0016) model time 0.2435 (0.2437) loss 4.0550 (3.5113) grad_norm 2.3900 (2.0063) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1220/1251] eta 0:00:07 lr 0.000915 wd 0.0500 time 0.2446 (0.2454) data time 0.0008 (0.0016) model time 0.2438 (0.2436) loss 4.2827 (3.5100) grad_norm 1.7915 (2.0078) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1230/1251] eta 0:00:05 lr 0.000915 wd 0.0500 time 0.2468 (0.2453) data time 
0.0009 (0.0016) model time 0.2459 (0.2436) loss 3.5306 (3.5097) grad_norm 1.7973 (2.0076) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1240/1251] eta 0:00:02 lr 0.000915 wd 0.0500 time 0.2233 (0.2453) data time 0.0004 (0.0016) model time 0.2228 (0.2435) loss 3.8339 (3.5096) grad_norm 1.3811 (2.0088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [72/300][1250/1251] eta 0:00:00 lr 0.000915 wd 0.0500 time 0.2245 (0.2451) data time 0.0005 (0.0016) model time 0.2240 (0.2434) loss 3.3698 (3.5078) grad_norm 1.9453 (2.0074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 72 training takes 0:05:06 [2024-08-26 08:31:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 08:31:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 08:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.446 (0.446) Loss 0.5186 (0.5186) Acc@1 89.258 (89.258) Acc@5 97.559 (97.559) Mem 7379MB
[2024-08-26 08:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.112) Loss 0.8638 (0.8400) Acc@1 81.348 (81.445) Acc@5 95.898 (95.898) Mem 7379MB
[2024-08-26 08:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.096) Loss 1.1973 (0.8572) Acc@1 71.191 (80.562) Acc@5 92.285 (95.871) Mem 7379MB
[2024-08-26 08:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.090) Loss 1.4160 (0.9768) Acc@1 66.699 (77.826) Acc@5 88.477 (94.311) Mem 7379MB
[2024-08-26 08:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3818 (1.0441) Acc@1 68.359 (76.260) Acc@5 89.746 (93.548) Mem 7379MB
[2024-08-26 08:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.950 Acc@5 93.496
[2024-08-26 08:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.0%
[2024-08-26 08:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 75.95%
[2024-08-26 08:31:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 08:31:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 08:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.442 (0.442) Loss 0.4690 (0.4690) Acc@1 91.406 (91.406) Acc@5 98.242 (98.242) Mem 7379MB
[2024-08-26 08:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.111) Loss 0.7480 (0.7292) Acc@1 85.059 (84.171) Acc@5 96.094 (96.813) Mem 7379MB
[2024-08-26 08:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.097) Loss 1.0527 (0.7522) Acc@1 75.488 (83.119) Acc@5 93.359 (96.759) Mem 7379MB
[2024-08-26 08:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.091) Loss 1.3096 (0.8592) Acc@1 67.285 (80.636) Acc@5 90.039 (95.401) Mem 7379MB
[2024-08-26 08:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.2217 (0.9166) Acc@1 70.605 (79.171) Acc@5 91.211 (94.808) Mem 7379MB
[2024-08-26 08:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.788 Acc@5 94.742
[2024-08-26 08:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.8%
[2024-08-26 08:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.79%
[2024-08-26 08:31:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 08:31:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 08:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][0/1251] eta 0:14:40 lr 0.000915 wd 0.0500 time 0.7040 (0.7040) data time 0.4813 (0.4813) model time 0.0000 (0.0000) loss 3.7341 (3.7341) grad_norm 1.2914 (1.2914) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][10/1251] eta 0:05:50 lr 0.000915 wd 0.0500 time 0.2441 (0.2823) data time 0.0012 (0.0447) model time 0.0000 (0.0000) loss 3.6420 (3.4540) grad_norm 1.3817 (1.8066) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][20/1251] eta 0:05:24 lr 0.000915 wd 0.0500 time 0.2461 (0.2632) data time 0.0011 (0.0239) model time 0.0000 (0.0000) loss 3.1500 (3.5141) grad_norm 1.9500 (1.8653) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][30/1251] eta 0:05:12 lr 0.000915 wd 0.0500 time 0.2366 (0.2561) data time 0.0008 (0.0165) model time 0.0000 (0.0000) loss 2.7342 (3.4238) grad_norm 2.8916 (1.9222) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][40/1251] eta 0:05:06 lr 0.000915 wd 0.0500 time 0.2548 (0.2529) data time 0.0010 (0.0127) model time 0.0000 (0.0000) loss 2.7522 (3.4225) grad_norm 3.5219 (2.0054) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][50/1251] eta 0:05:00 lr 0.000915 wd 0.0500 time 0.2354 (0.2501) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 2.6919 (3.4381) grad_norm 1.8108 (1.9858) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][60/1251] eta 0:04:56 lr 0.000915 wd 0.0500 time 0.2423 (0.2489) data time 0.0011 (0.0089) model time 0.2413 (0.2417) loss 
4.0432 (3.4233) grad_norm 1.8885 (1.9787) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][70/1251] eta 0:04:52 lr 0.000915 wd 0.0500 time 0.2379 (0.2475) data time 0.0009 (0.0078) model time 0.2370 (0.2399) loss 2.9643 (3.4499) grad_norm 1.8067 (1.9631) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][80/1251] eta 0:04:48 lr 0.000915 wd 0.0500 time 0.2420 (0.2466) data time 0.0009 (0.0070) model time 0.2410 (0.2397) loss 4.1909 (3.4626) grad_norm 2.1586 (1.9820) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][90/1251] eta 0:04:45 lr 0.000915 wd 0.0500 time 0.2369 (0.2461) data time 0.0011 (0.0063) model time 0.2358 (0.2400) loss 3.7085 (3.4314) grad_norm 1.5991 (1.9990) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][100/1251] eta 0:04:45 lr 0.000915 wd 0.0500 time 0.2406 (0.2478) data time 0.0009 (0.0058) model time 0.2396 (0.2444) loss 3.2066 (3.4285) grad_norm 3.3836 (2.0172) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][110/1251] eta 0:04:42 lr 0.000915 wd 0.0500 time 0.2399 (0.2473) data time 0.0009 (0.0054) model time 0.2390 (0.2438) loss 2.2862 (3.4202) grad_norm 1.9428 (2.0058) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][120/1251] eta 0:04:39 lr 0.000915 wd 0.0500 time 0.2401 (0.2467) data time 0.0012 (0.0050) model time 0.2390 (0.2432) loss 3.5832 (3.4116) grad_norm 1.7664 (2.0090) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][130/1251] eta 0:04:37 lr 
0.000915 wd 0.0500 time 0.2345 (0.2480) data time 0.0010 (0.0047) model time 0.2335 (0.2456) loss 3.0490 (3.4141) grad_norm 2.1739 (1.9937) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][140/1251] eta 0:04:34 lr 0.000915 wd 0.0500 time 0.2459 (0.2475) data time 0.0011 (0.0044) model time 0.2448 (0.2450) loss 3.3150 (3.4101) grad_norm 1.7252 (1.9958) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][150/1251] eta 0:04:31 lr 0.000915 wd 0.0500 time 0.2441 (0.2470) data time 0.0007 (0.0042) model time 0.2434 (0.2444) loss 3.7659 (3.4053) grad_norm 1.8396 (2.0069) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][160/1251] eta 0:04:28 lr 0.000915 wd 0.0500 time 0.2434 (0.2466) data time 0.0007 (0.0040) model time 0.2427 (0.2438) loss 3.6806 (3.4139) grad_norm 1.8063 (2.0078) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][170/1251] eta 0:04:26 lr 0.000915 wd 0.0500 time 0.2436 (0.2463) data time 0.0009 (0.0039) model time 0.2426 (0.2437) loss 3.2113 (3.4145) grad_norm 1.5203 (1.9940) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][180/1251] eta 0:04:23 lr 0.000915 wd 0.0500 time 0.2592 (0.2461) data time 0.0010 (0.0037) model time 0.2583 (0.2435) loss 4.3545 (3.4358) grad_norm 1.4080 (1.9711) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][190/1251] eta 0:04:20 lr 0.000915 wd 0.0500 time 0.2403 (0.2458) data time 0.0007 (0.0036) model time 0.2396 (0.2431) loss 4.0283 (3.4543) grad_norm 1.9712 (1.9717) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:31:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][200/1251] eta 0:04:18 lr 0.000915 wd 0.0500 time 0.2411 (0.2462) data time 0.0009 (0.0035) model time 0.2401 (0.2437) loss 3.6556 (3.4422) grad_norm 2.2171 (1.9824) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][210/1251] eta 0:04:16 lr 0.000915 wd 0.0500 time 0.2410 (0.2460) data time 0.0010 (0.0034) model time 0.2400 (0.2436) loss 2.8446 (3.4357) grad_norm 2.1444 (1.9863) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][220/1251] eta 0:04:13 lr 0.000914 wd 0.0500 time 0.2443 (0.2458) data time 0.0009 (0.0033) model time 0.2434 (0.2434) loss 3.8531 (3.4466) grad_norm 1.6597 (1.9831) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][230/1251] eta 0:04:10 lr 0.000914 wd 0.0500 time 0.2389 (0.2456) data time 0.0012 (0.0032) model time 0.2378 (0.2433) loss 3.4654 (3.4461) grad_norm 1.9051 (1.9816) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][240/1251] eta 0:04:08 lr 0.000914 wd 0.0500 time 0.2345 (0.2455) data time 0.0010 (0.0031) model time 0.2335 (0.2431) loss 3.6345 (3.4506) grad_norm 1.8117 (1.9889) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][250/1251] eta 0:04:05 lr 0.000914 wd 0.0500 time 0.2399 (0.2453) data time 0.0010 (0.0030) model time 0.2390 (0.2430) loss 3.3854 (3.4348) grad_norm 1.5897 (1.9982) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][260/1251] eta 0:04:02 lr 0.000914 wd 0.0500 time 0.2344 (0.2451) data time 0.0010 (0.0029) model time 0.2333 (0.2428) loss 3.6413 
(3.4320) grad_norm 1.8699 (1.9925) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][270/1251] eta 0:04:00 lr 0.000914 wd 0.0500 time 0.2370 (0.2449) data time 0.0010 (0.0029) model time 0.2359 (0.2426) loss 2.9555 (3.4256) grad_norm 4.2475 (2.0126) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][280/1251] eta 0:03:57 lr 0.000914 wd 0.0500 time 0.2433 (0.2448) data time 0.0009 (0.0028) model time 0.2424 (0.2425) loss 3.6055 (3.4324) grad_norm 1.5266 (2.0128) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][290/1251] eta 0:03:55 lr 0.000914 wd 0.0500 time 0.2428 (0.2446) data time 0.0010 (0.0027) model time 0.2418 (0.2424) loss 4.0251 (3.4425) grad_norm 2.6098 (2.0138) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][300/1251] eta 0:03:52 lr 0.000914 wd 0.0500 time 0.2383 (0.2445) data time 0.0011 (0.0027) model time 0.2372 (0.2422) loss 3.6700 (3.4454) grad_norm 1.8769 (2.0128) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][310/1251] eta 0:03:49 lr 0.000914 wd 0.0500 time 0.2399 (0.2444) data time 0.0011 (0.0026) model time 0.2389 (0.2421) loss 3.4831 (3.4438) grad_norm 2.2793 (2.0067) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][320/1251] eta 0:03:47 lr 0.000914 wd 0.0500 time 0.2500 (0.2443) data time 0.0007 (0.0026) model time 0.2493 (0.2421) loss 3.9808 (3.4422) grad_norm 2.1961 (2.0143) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][330/1251] eta 0:03:44 lr 0.000914 wd 
0.0500 time 0.2403 (0.2442) data time 0.0009 (0.0025) model time 0.2393 (0.2420) loss 3.2014 (3.4338) grad_norm 2.3478 (2.0167) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][340/1251] eta 0:03:42 lr 0.000914 wd 0.0500 time 0.2438 (0.2441) data time 0.0009 (0.0025) model time 0.2430 (0.2420) loss 3.3793 (3.4372) grad_norm 1.4414 (2.0133) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][350/1251] eta 0:03:39 lr 0.000914 wd 0.0500 time 0.2379 (0.2439) data time 0.0008 (0.0024) model time 0.2371 (0.2418) loss 3.6978 (3.4474) grad_norm 1.5964 (2.0092) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][360/1251] eta 0:03:37 lr 0.000914 wd 0.0500 time 0.2366 (0.2439) data time 0.0011 (0.0024) model time 0.2355 (0.2418) loss 3.9271 (3.4452) grad_norm 1.4238 (2.0067) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][370/1251] eta 0:03:34 lr 0.000914 wd 0.0500 time 0.2477 (0.2439) data time 0.0009 (0.0024) model time 0.2469 (0.2418) loss 3.1477 (3.4489) grad_norm 1.9120 (2.0093) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][380/1251] eta 0:03:32 lr 0.000914 wd 0.0500 time 0.2379 (0.2443) data time 0.0008 (0.0023) model time 0.2371 (0.2424) loss 3.8835 (3.4486) grad_norm 1.7174 (2.0064) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][390/1251] eta 0:03:30 lr 0.000914 wd 0.0500 time 0.2341 (0.2448) data time 0.0007 (0.0023) model time 0.2334 (0.2429) loss 3.3175 (3.4441) grad_norm 1.7862 (2.0082) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][400/1251] eta 0:03:28 lr 0.000914 wd 0.0500 time 0.2377 (0.2453) data time 0.0010 (0.0022) model time 0.2367 (0.2435) loss 2.9993 (3.4449) grad_norm 1.5530 (2.0026) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][410/1251] eta 0:03:26 lr 0.000914 wd 0.0500 time 0.2431 (0.2452) data time 0.0008 (0.0022) model time 0.2423 (0.2435) loss 3.8581 (3.4436) grad_norm 2.2795 (2.0110) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][420/1251] eta 0:03:23 lr 0.000914 wd 0.0500 time 0.2400 (0.2451) data time 0.0010 (0.0022) model time 0.2390 (0.2434) loss 3.7968 (3.4469) grad_norm 2.0103 (2.0193) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][430/1251] eta 0:03:21 lr 0.000914 wd 0.0500 time 0.2474 (0.2451) data time 0.0009 (0.0022) model time 0.2465 (0.2433) loss 3.8359 (3.4467) grad_norm 2.4087 (2.0201) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][440/1251] eta 0:03:18 lr 0.000914 wd 0.0500 time 0.2519 (0.2450) data time 0.0009 (0.0021) model time 0.2510 (0.2433) loss 3.9974 (3.4490) grad_norm 1.9461 (2.0264) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][450/1251] eta 0:03:16 lr 0.000914 wd 0.0500 time 0.2429 (0.2450) data time 0.0009 (0.0021) model time 0.2420 (0.2433) loss 4.0867 (3.4482) grad_norm 1.8382 (2.0293) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][460/1251] eta 0:03:14 lr 0.000914 wd 0.0500 time 0.2486 (0.2454) data time 0.0008 (0.0021) model time 0.2478 (0.2438) loss 4.7235 
(3.4454) grad_norm 1.7435 (2.0244) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][470/1251] eta 0:03:11 lr 0.000914 wd 0.0500 time 0.2452 (0.2453) data time 0.0010 (0.0021) model time 0.2442 (0.2437) loss 3.6714 (3.4471) grad_norm 1.9481 (2.0273) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][480/1251] eta 0:03:09 lr 0.000914 wd 0.0500 time 0.2422 (0.2452) data time 0.0009 (0.0020) model time 0.2414 (0.2435) loss 2.1512 (3.4453) grad_norm 1.6998 (2.0261) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][490/1251] eta 0:03:06 lr 0.000914 wd 0.0500 time 0.2405 (0.2451) data time 0.0009 (0.0020) model time 0.2396 (0.2435) loss 2.4746 (3.4461) grad_norm 2.1526 (2.0227) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][500/1251] eta 0:03:04 lr 0.000914 wd 0.0500 time 0.2406 (0.2450) data time 0.0009 (0.0020) model time 0.2397 (0.2434) loss 2.9058 (3.4416) grad_norm 1.9015 (2.0240) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][510/1251] eta 0:03:01 lr 0.000914 wd 0.0500 time 0.2458 (0.2450) data time 0.0009 (0.0020) model time 0.2449 (0.2434) loss 3.1919 (3.4460) grad_norm 1.6620 (2.0260) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][520/1251] eta 0:02:59 lr 0.000914 wd 0.0500 time 0.2442 (0.2449) data time 0.0011 (0.0020) model time 0.2431 (0.2433) loss 2.1870 (3.4445) grad_norm 1.7018 (2.0230) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][530/1251] eta 0:02:56 lr 0.000914 wd 
0.0500 time 0.2407 (0.2449) data time 0.0010 (0.0019) model time 0.2396 (0.2433) loss 3.5202 (3.4474) grad_norm 2.2124 (2.0278) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][540/1251] eta 0:02:54 lr 0.000914 wd 0.0500 time 0.2394 (0.2448) data time 0.0011 (0.0019) model time 0.2383 (0.2432) loss 3.3039 (3.4433) grad_norm 2.1028 (2.0293) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][550/1251] eta 0:02:51 lr 0.000914 wd 0.0500 time 0.2394 (0.2447) data time 0.0009 (0.0019) model time 0.2386 (0.2431) loss 2.6128 (3.4422) grad_norm 2.0100 (2.0302) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][560/1251] eta 0:02:49 lr 0.000914 wd 0.0500 time 0.2441 (0.2446) data time 0.0008 (0.0019) model time 0.2434 (0.2430) loss 3.0790 (3.4359) grad_norm 2.0109 (2.0277) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][570/1251] eta 0:02:46 lr 0.000914 wd 0.0500 time 0.2399 (0.2446) data time 0.0012 (0.0019) model time 0.2387 (0.2430) loss 3.6064 (3.4419) grad_norm 2.1597 (2.0254) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][580/1251] eta 0:02:44 lr 0.000914 wd 0.0500 time 0.2365 (0.2445) data time 0.0010 (0.0019) model time 0.2355 (0.2429) loss 3.5769 (3.4463) grad_norm 1.9296 (2.0200) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][590/1251] eta 0:02:41 lr 0.000914 wd 0.0500 time 0.2410 (0.2444) data time 0.0010 (0.0018) model time 0.2400 (0.2428) loss 3.8156 (3.4469) grad_norm 2.5029 (2.0204) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][600/1251] eta 0:02:39 lr 0.000914 wd 0.0500 time 0.2354 (0.2443) data time 0.0010 (0.0018) model time 0.2343 (0.2428) loss 3.5200 (3.4458) grad_norm 2.2325 (2.0186) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][610/1251] eta 0:02:36 lr 0.000914 wd 0.0500 time 0.2377 (0.2443) data time 0.0008 (0.0018) model time 0.2368 (0.2427) loss 3.7925 (3.4445) grad_norm 1.4584 (2.0182) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][620/1251] eta 0:02:34 lr 0.000913 wd 0.0500 time 0.2398 (0.2442) data time 0.0009 (0.0018) model time 0.2389 (0.2426) loss 3.7091 (3.4502) grad_norm 1.7565 (2.0130) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][630/1251] eta 0:02:31 lr 0.000913 wd 0.0500 time 0.2490 (0.2441) data time 0.0011 (0.0018) model time 0.2479 (0.2426) loss 2.5580 (3.4477) grad_norm 1.8997 (2.0096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][640/1251] eta 0:02:29 lr 0.000913 wd 0.0500 time 0.2478 (0.2441) data time 0.0008 (0.0018) model time 0.2469 (0.2425) loss 3.9086 (3.4500) grad_norm 1.3201 (2.0080) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][650/1251] eta 0:02:26 lr 0.000913 wd 0.0500 time 0.2423 (0.2440) data time 0.0007 (0.0018) model time 0.2416 (0.2425) loss 3.8361 (3.4540) grad_norm 2.0301 (2.0090) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][660/1251] eta 0:02:24 lr 0.000913 wd 0.0500 time 0.2384 (0.2440) data time 0.0010 (0.0018) model time 0.2374 (0.2424) loss 3.5936 
(3.4554) grad_norm 3.0412 (2.0088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][670/1251] eta 0:02:21 lr 0.000913 wd 0.0500 time 0.2347 (0.2439) data time 0.0007 (0.0018) model time 0.2339 (0.2424) loss 3.8577 (3.4594) grad_norm 2.0550 (2.0095) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][680/1251] eta 0:02:19 lr 0.000913 wd 0.0500 time 0.2478 (0.2439) data time 0.0010 (0.0017) model time 0.2468 (0.2423) loss 3.3979 (3.4586) grad_norm 2.7717 (2.0088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][690/1251] eta 0:02:16 lr 0.000913 wd 0.0500 time 0.2411 (0.2439) data time 0.0010 (0.0017) model time 0.2402 (0.2423) loss 3.8258 (3.4623) grad_norm 2.0625 (2.0059) loss_scale 8192.0000 (4143.4211) mem 7379MB [2024-08-26 08:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][700/1251] eta 0:02:14 lr 0.000913 wd 0.0500 time 0.2468 (0.2439) data time 0.0009 (0.0017) model time 0.2458 (0.2423) loss 3.8621 (3.4635) grad_norm 2.1745 (2.0104) loss_scale 8192.0000 (4201.1755) mem 7379MB [2024-08-26 08:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][710/1251] eta 0:02:11 lr 0.000913 wd 0.0500 time 0.2446 (0.2439) data time 0.0009 (0.0017) model time 0.2437 (0.2423) loss 3.7063 (3.4648) grad_norm 2.0918 (2.0125) loss_scale 8192.0000 (4257.3052) mem 7379MB [2024-08-26 08:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][720/1251] eta 0:02:09 lr 0.000913 wd 0.0500 time 0.2428 (0.2441) data time 0.0011 (0.0017) model time 0.2417 (0.2426) loss 3.5387 (3.4666) grad_norm 1.4502 (2.0125) loss_scale 8192.0000 (4311.8779) mem 7379MB [2024-08-26 08:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][730/1251] eta 0:02:07 lr 0.000913 wd 
0.0500 time 0.2405 (0.2441) data time 0.0011 (0.0017) model time 0.2393 (0.2426) loss 3.9749 (3.4652) grad_norm 2.5784 (2.0131) loss_scale 8192.0000 (4364.9576) mem 7379MB [2024-08-26 08:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][740/1251] eta 0:02:04 lr 0.000913 wd 0.0500 time 0.2433 (0.2441) data time 0.0010 (0.0017) model time 0.2423 (0.2426) loss 3.7610 (3.4679) grad_norm 1.6772 (2.0125) loss_scale 8192.0000 (4416.6046) mem 7379MB [2024-08-26 08:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][750/1251] eta 0:02:02 lr 0.000913 wd 0.0500 time 0.2456 (0.2440) data time 0.0010 (0.0017) model time 0.2446 (0.2425) loss 3.3206 (3.4650) grad_norm 1.4151 (2.0112) loss_scale 8192.0000 (4466.8762) mem 7379MB [2024-08-26 08:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][760/1251] eta 0:01:59 lr 0.000913 wd 0.0500 time 0.2367 (0.2440) data time 0.0007 (0.0017) model time 0.2360 (0.2425) loss 2.6380 (3.4657) grad_norm 1.7498 (2.0089) loss_scale 8192.0000 (4515.8265) mem 7379MB [2024-08-26 08:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][770/1251] eta 0:01:57 lr 0.000913 wd 0.0500 time 0.2405 (0.2439) data time 0.0010 (0.0017) model time 0.2396 (0.2425) loss 3.4238 (3.4646) grad_norm 1.7121 (2.0138) loss_scale 8192.0000 (4563.5071) mem 7379MB [2024-08-26 08:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][780/1251] eta 0:01:54 lr 0.000913 wd 0.0500 time 0.2416 (0.2439) data time 0.0007 (0.0016) model time 0.2409 (0.2424) loss 4.2569 (3.4649) grad_norm 1.3552 (2.0127) loss_scale 8192.0000 (4609.9667) mem 7379MB [2024-08-26 08:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][790/1251] eta 0:01:52 lr 0.000913 wd 0.0500 time 0.2348 (0.2438) data time 0.0009 (0.0016) model time 0.2340 (0.2424) loss 3.3302 (3.4618) grad_norm 1.6627 (2.0128) loss_scale 8192.0000 (4655.2516) mem 7379MB [2024-08-26 08:34:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][800/1251] eta 0:01:49 lr 0.000913 wd 0.0500 time 0.2533 (0.2438) data time 0.0007 (0.0016) model time 0.2526 (0.2423) loss 3.9793 (3.4646) grad_norm 2.4750 (2.0143) loss_scale 8192.0000 (4699.4057) mem 7379MB [2024-08-26 08:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][810/1251] eta 0:01:47 lr 0.000913 wd 0.0500 time 0.2448 (0.2438) data time 0.0007 (0.0016) model time 0.2440 (0.2423) loss 3.7164 (3.4686) grad_norm 1.9802 (2.0147) loss_scale 8192.0000 (4742.4710) mem 7379MB [2024-08-26 08:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][820/1251] eta 0:01:45 lr 0.000913 wd 0.0500 time 0.4456 (0.2443) data time 0.0010 (0.0016) model time 0.4446 (0.2428) loss 3.6489 (3.4709) grad_norm 1.5289 (2.0118) loss_scale 8192.0000 (4784.4872) mem 7379MB [2024-08-26 08:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][830/1251] eta 0:01:42 lr 0.000913 wd 0.0500 time 0.2405 (0.2442) data time 0.0009 (0.0016) model time 0.2396 (0.2428) loss 2.8528 (3.4711) grad_norm 2.1063 (2.0076) loss_scale 8192.0000 (4825.4922) mem 7379MB [2024-08-26 08:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][840/1251] eta 0:01:40 lr 0.000913 wd 0.0500 time 0.2495 (0.2442) data time 0.0010 (0.0016) model time 0.2485 (0.2428) loss 3.6131 (3.4712) grad_norm 1.4098 (2.0054) loss_scale 8192.0000 (4865.5220) mem 7379MB [2024-08-26 08:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][850/1251] eta 0:01:37 lr 0.000913 wd 0.0500 time 0.2378 (0.2442) data time 0.0011 (0.0016) model time 0.2367 (0.2428) loss 2.8078 (3.4689) grad_norm 1.8615 (2.0052) loss_scale 8192.0000 (4904.6110) mem 7379MB [2024-08-26 08:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][860/1251] eta 0:01:35 lr 0.000913 wd 0.0500 time 0.2382 (0.2442) data time 0.0008 (0.0016) model time 0.2374 (0.2427) loss 3.1733 
(3.4665) grad_norm 2.4828 (2.0090) loss_scale 8192.0000 (4942.7921) mem 7379MB [2024-08-26 08:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][870/1251] eta 0:01:33 lr 0.000913 wd 0.0500 time 0.2387 (0.2441) data time 0.0008 (0.0016) model time 0.2379 (0.2427) loss 2.6954 (3.4632) grad_norm 1.8505 (2.0120) loss_scale 8192.0000 (4980.0964) mem 7379MB [2024-08-26 08:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][880/1251] eta 0:01:30 lr 0.000913 wd 0.0500 time 0.2357 (0.2441) data time 0.0009 (0.0016) model time 0.2348 (0.2427) loss 3.8081 (3.4655) grad_norm 1.7905 (2.0113) loss_scale 8192.0000 (5016.5539) mem 7379MB [2024-08-26 08:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][890/1251] eta 0:01:28 lr 0.000913 wd 0.0500 time 0.2427 (0.2441) data time 0.0007 (0.0016) model time 0.2420 (0.2427) loss 3.8997 (3.4690) grad_norm 1.7551 (2.0122) loss_scale 8192.0000 (5052.1930) mem 7379MB [2024-08-26 08:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][900/1251] eta 0:01:25 lr 0.000913 wd 0.0500 time 0.2412 (0.2441) data time 0.0010 (0.0016) model time 0.2402 (0.2427) loss 2.6552 (3.4692) grad_norm 2.0590 (2.0108) loss_scale 8192.0000 (5087.0411) mem 7379MB [2024-08-26 08:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][910/1251] eta 0:01:23 lr 0.000913 wd 0.0500 time 0.2423 (0.2440) data time 0.0007 (0.0016) model time 0.2415 (0.2426) loss 2.3948 (3.4698) grad_norm 2.1541 (2.0132) loss_scale 8192.0000 (5121.1240) mem 7379MB [2024-08-26 08:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][920/1251] eta 0:01:20 lr 0.000913 wd 0.0500 time 0.2450 (0.2445) data time 0.0011 (0.0015) model time 0.2438 (0.2431) loss 4.1497 (3.4698) grad_norm 2.5257 (2.0130) loss_scale 8192.0000 (5154.4669) mem 7379MB [2024-08-26 08:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][930/1251] eta 0:01:18 lr 0.000913 wd 
0.0500 time 0.2413 (0.2444) data time 0.0011 (0.0015) model time 0.2402 (0.2431) loss 3.5950 (3.4702) grad_norm 2.0482 (2.0132) loss_scale 8192.0000 (5187.0934) mem 7379MB [2024-08-26 08:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][940/1251] eta 0:01:16 lr 0.000913 wd 0.0500 time 0.2434 (0.2444) data time 0.0009 (0.0015) model time 0.2424 (0.2431) loss 3.8576 (3.4677) grad_norm 2.5595 (2.0118) loss_scale 8192.0000 (5219.0266) mem 7379MB [2024-08-26 08:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][950/1251] eta 0:01:13 lr 0.000913 wd 0.0500 time 0.2408 (0.2444) data time 0.0008 (0.0015) model time 0.2400 (0.2430) loss 4.1177 (3.4666) grad_norm 1.7602 (2.0114) loss_scale 8192.0000 (5250.2881) mem 7379MB [2024-08-26 08:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][960/1251] eta 0:01:11 lr 0.000913 wd 0.0500 time 0.2425 (0.2444) data time 0.0007 (0.0015) model time 0.2417 (0.2430) loss 2.8953 (3.4662) grad_norm 2.3143 (2.0099) loss_scale 8192.0000 (5280.8991) mem 7379MB [2024-08-26 08:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][970/1251] eta 0:01:08 lr 0.000913 wd 0.0500 time 0.2396 (0.2443) data time 0.0009 (0.0015) model time 0.2387 (0.2430) loss 3.5678 (3.4663) grad_norm 2.0153 (2.0127) loss_scale 8192.0000 (5310.8795) mem 7379MB [2024-08-26 08:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][980/1251] eta 0:01:06 lr 0.000913 wd 0.0500 time 0.2387 (0.2445) data time 0.0009 (0.0015) model time 0.2378 (0.2432) loss 4.1647 (3.4664) grad_norm 2.4977 (2.0161) loss_scale 8192.0000 (5340.2487) mem 7379MB [2024-08-26 08:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][990/1251] eta 0:01:03 lr 0.000913 wd 0.0500 time 0.2448 (0.2445) data time 0.0007 (0.0015) model time 0.2441 (0.2431) loss 2.6480 (3.4678) grad_norm 1.6100 (2.0127) loss_scale 8192.0000 (5369.0252) mem 7379MB [2024-08-26 08:35:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1000/1251] eta 0:01:01 lr 0.000913 wd 0.0500 time 0.2345 (0.2444) data time 0.0009 (0.0015) model time 0.2337 (0.2431) loss 3.9197 (3.4646) grad_norm 2.5449 (2.0115) loss_scale 8192.0000 (5397.2268) mem 7379MB [2024-08-26 08:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1010/1251] eta 0:00:58 lr 0.000912 wd 0.0500 time 0.2569 (0.2444) data time 0.0008 (0.0015) model time 0.2561 (0.2431) loss 3.7156 (3.4671) grad_norm 3.0752 (2.0137) loss_scale 8192.0000 (5424.8704) mem 7379MB [2024-08-26 08:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1020/1251] eta 0:00:56 lr 0.000912 wd 0.0500 time 0.2425 (0.2444) data time 0.0010 (0.0015) model time 0.2416 (0.2431) loss 2.8552 (3.4643) grad_norm 1.7587 (2.0166) loss_scale 8192.0000 (5451.9726) mem 7379MB [2024-08-26 08:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1030/1251] eta 0:00:54 lr 0.000912 wd 0.0500 time 0.2420 (0.2446) data time 0.0008 (0.0015) model time 0.2412 (0.2433) loss 3.3501 (3.4639) grad_norm 3.2676 (2.0183) loss_scale 8192.0000 (5478.5490) mem 7379MB [2024-08-26 08:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1040/1251] eta 0:00:51 lr 0.000912 wd 0.0500 time 0.2415 (0.2446) data time 0.0007 (0.0015) model time 0.2408 (0.2433) loss 3.2452 (3.4653) grad_norm 1.7539 (2.0176) loss_scale 8192.0000 (5504.6148) mem 7379MB [2024-08-26 08:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1050/1251] eta 0:00:49 lr 0.000912 wd 0.0500 time 0.2449 (0.2445) data time 0.0008 (0.0015) model time 0.2441 (0.2432) loss 2.8219 (3.4633) grad_norm 1.5171 (2.0220) loss_scale 8192.0000 (5530.1846) mem 7379MB [2024-08-26 08:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1060/1251] eta 0:00:46 lr 0.000912 wd 0.0500 time 0.2389 (0.2447) data time 0.0007 (0.0015) model time 0.2381 (0.2434) loss 3.4193 
(3.4633) grad_norm 1.9142 (2.0248) loss_scale 8192.0000 (5555.2724) mem 7379MB [2024-08-26 08:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1070/1251] eta 0:00:44 lr 0.000912 wd 0.0500 time 0.2529 (0.2447) data time 0.0007 (0.0015) model time 0.2522 (0.2434) loss 2.5377 (3.4633) grad_norm 2.1219 (2.0283) loss_scale 8192.0000 (5579.8917) mem 7379MB [2024-08-26 08:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1080/1251] eta 0:00:41 lr 0.000912 wd 0.0500 time 0.2397 (0.2447) data time 0.0010 (0.0015) model time 0.2387 (0.2434) loss 2.4703 (3.4640) grad_norm 1.6940 (2.0272) loss_scale 8192.0000 (5604.0555) mem 7379MB [2024-08-26 08:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1090/1251] eta 0:00:39 lr 0.000912 wd 0.0500 time 0.2394 (0.2447) data time 0.0010 (0.0015) model time 0.2385 (0.2434) loss 4.1672 (3.4640) grad_norm 1.7891 (2.0258) loss_scale 8192.0000 (5627.7764) mem 7379MB [2024-08-26 08:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1100/1251] eta 0:00:36 lr 0.000912 wd 0.0500 time 0.2351 (0.2446) data time 0.0009 (0.0015) model time 0.2342 (0.2434) loss 3.6280 (3.4645) grad_norm 1.9562 (2.0245) loss_scale 8192.0000 (5651.0663) mem 7379MB [2024-08-26 08:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1110/1251] eta 0:00:34 lr 0.000912 wd 0.0500 time 0.2451 (0.2446) data time 0.0010 (0.0015) model time 0.2441 (0.2434) loss 3.0048 (3.4675) grad_norm 1.8874 (2.0219) loss_scale 8192.0000 (5673.9370) mem 7379MB [2024-08-26 08:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1120/1251] eta 0:00:32 lr 0.000912 wd 0.0500 time 0.2455 (0.2446) data time 0.0008 (0.0015) model time 0.2447 (0.2433) loss 2.8791 (3.4689) grad_norm 1.7095 (2.0228) loss_scale 8192.0000 (5696.3996) mem 7379MB [2024-08-26 08:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1130/1251] eta 0:00:29 lr 
0.000912 wd 0.0500 time 0.2365 (0.2446) data time 0.0011 (0.0014) model time 0.2355 (0.2433) loss 3.9264 (3.4677) grad_norm 1.4190 (2.0259) loss_scale 8192.0000 (5718.4651) mem 7379MB [2024-08-26 08:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1140/1251] eta 0:00:27 lr 0.000912 wd 0.0500 time 0.2398 (0.2446) data time 0.0012 (0.0014) model time 0.2386 (0.2433) loss 3.3400 (3.4702) grad_norm 2.1590 (2.0280) loss_scale 8192.0000 (5740.1437) mem 7379MB [2024-08-26 08:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1150/1251] eta 0:00:24 lr 0.000912 wd 0.0500 time 0.2392 (0.2446) data time 0.0009 (0.0014) model time 0.2383 (0.2433) loss 3.7098 (3.4690) grad_norm 1.9349 (2.0292) loss_scale 8192.0000 (5761.4457) mem 7379MB [2024-08-26 08:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1160/1251] eta 0:00:22 lr 0.000912 wd 0.0500 time 0.2361 (0.2445) data time 0.0009 (0.0014) model time 0.2352 (0.2432) loss 2.4424 (3.4687) grad_norm 1.4578 (2.0291) loss_scale 8192.0000 (5782.3807) mem 7379MB [2024-08-26 08:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1170/1251] eta 0:00:19 lr 0.000912 wd 0.0500 time 0.2385 (0.2445) data time 0.0008 (0.0014) model time 0.2376 (0.2432) loss 3.4873 (3.4673) grad_norm 1.6576 (2.0271) loss_scale 8192.0000 (5802.9582) mem 7379MB [2024-08-26 08:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1180/1251] eta 0:00:17 lr 0.000912 wd 0.0500 time 0.2439 (0.2445) data time 0.0010 (0.0014) model time 0.2428 (0.2432) loss 4.1747 (3.4701) grad_norm 1.7277 (2.0257) loss_scale 8192.0000 (5823.1871) mem 7379MB [2024-08-26 08:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1190/1251] eta 0:00:14 lr 0.000912 wd 0.0500 time 0.2430 (0.2445) data time 0.0010 (0.0014) model time 0.2421 (0.2432) loss 3.2617 (3.4720) grad_norm 1.6916 (2.0263) loss_scale 8192.0000 (5843.0764) mem 7379MB [2024-08-26 
08:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1200/1251] eta 0:00:12 lr 0.000912 wd 0.0500 time 0.2411 (0.2444) data time 0.0007 (0.0014) model time 0.2404 (0.2432) loss 3.8933 (3.4736) grad_norm 1.6729 (2.0260) loss_scale 8192.0000 (5862.6345) mem 7379MB [2024-08-26 08:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1210/1251] eta 0:00:10 lr 0.000912 wd 0.0500 time 0.2378 (0.2444) data time 0.0011 (0.0014) model time 0.2367 (0.2431) loss 4.0373 (3.4741) grad_norm 2.5202 (2.0253) loss_scale 8192.0000 (5881.8695) mem 7379MB [2024-08-26 08:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1220/1251] eta 0:00:07 lr 0.000912 wd 0.0500 time 0.2408 (0.2444) data time 0.0013 (0.0014) model time 0.2395 (0.2431) loss 3.9433 (3.4745) grad_norm 1.6805 (2.0256) loss_scale 8192.0000 (5900.7895) mem 7379MB [2024-08-26 08:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1230/1251] eta 0:00:05 lr 0.000912 wd 0.0500 time 0.2465 (0.2443) data time 0.0007 (0.0014) model time 0.2458 (0.2431) loss 3.4460 (3.4744) grad_norm 1.9308 (2.0244) loss_scale 8192.0000 (5919.4021) mem 7379MB [2024-08-26 08:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1240/1251] eta 0:00:02 lr 0.000912 wd 0.0500 time 0.2235 (0.2442) data time 0.0005 (0.0014) model time 0.2230 (0.2430) loss 2.8533 (3.4764) grad_norm 2.3573 (2.0239) loss_scale 8192.0000 (5937.7147) mem 7379MB [2024-08-26 08:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [73/300][1250/1251] eta 0:00:00 lr 0.000912 wd 0.0500 time 0.2236 (0.2441) data time 0.0005 (0.0014) model time 0.2231 (0.2428) loss 2.7229 (3.4764) grad_norm 2.4509 (2.0240) loss_scale 8192.0000 (5955.7346) mem 7379MB [2024-08-26 08:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 73 training takes 0:05:05 [2024-08-26 08:36:15 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 08:36:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 08:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.439 (0.439) Loss 0.5444 (0.5444) Acc@1 89.941 (89.941) Acc@5 97.363 (97.363) Mem 7379MB
[2024-08-26 08:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.108) Loss 0.8599 (0.8426) Acc@1 80.664 (81.428) Acc@5 95.703 (96.103) Mem 7379MB
[2024-08-26 08:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.095) Loss 1.2168 (0.8694) Acc@1 72.070 (80.683) Acc@5 91.602 (95.954) Mem 7379MB
[2024-08-26 08:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.4043 (0.9911) Acc@1 67.578 (78.068) Acc@5 88.574 (94.421) Mem 7379MB
[2024-08-26 08:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.4277 (1.0555) Acc@1 67.871 (76.496) Acc@5 89.453 (93.590) Mem 7379MB
[2024-08-26 08:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.050 Acc@5 93.476
[2024-08-26 08:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.1%
[2024-08-26 08:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 76.05%
[2024-08-26 08:36:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 08:36:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 08:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.437 (0.437) Loss 0.4678 (0.4678) Acc@1 91.406 (91.406) Acc@5 98.242 (98.242) Mem 7379MB
[2024-08-26 08:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.093 (0.111) Loss 0.7456 (0.7269) Acc@1 84.863 (84.304) Acc@5 96.191 (96.804) Mem 7379MB
[2024-08-26 08:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.096) Loss 1.0488 (0.7504) Acc@1 75.586 (83.208) Acc@5 93.359 (96.763) Mem 7379MB
[2024-08-26 08:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.3057 (0.8571) Acc@1 66.895 (80.733) Acc@5 90.137 (95.451) Mem 7379MB
[2024-08-26 08:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.2168 (0.9143) Acc@1 70.801 (79.252) Acc@5 91.211 (94.817) Mem 7379MB
[2024-08-26 08:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.864 Acc@5 94.748
[2024-08-26 08:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.9%
[2024-08-26 08:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.86%
[2024-08-26 08:36:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 08:36:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 08:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][0/1251] eta 0:15:00 lr 0.000912 wd 0.0500 time 0.7199 (0.7199) data time 0.4973 (0.4973) model time 0.0000 (0.0000) loss 4.1594 (4.1594) grad_norm 1.9799 (1.9799) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][10/1251] eta 0:05:53 lr 0.000912 wd 0.0500 time 0.2433 (0.2847) data time 0.0007 (0.0462) model time 0.0000 (0.0000) loss 3.5702 (3.5755) grad_norm 2.4134 (2.0424) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][20/1251] eta 0:05:23 lr 0.000912 wd 0.0500 time 0.2405 (0.2631) data time 0.0008 (0.0248) model time 0.0000 (0.0000) loss 4.1591 (3.5987) grad_norm 2.1358 (1.9321) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][30/1251] eta 0:05:12 lr 0.000912 wd 0.0500 time 0.2405 (0.2559) data time 0.0010 (0.0172) model time 0.0000 (0.0000) loss 2.9220 (3.5568) grad_norm 1.8398 (1.8870) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][40/1251] eta 0:05:05 lr 0.000912 wd 0.0500 time 0.2370 (0.2524) data time 0.0008 (0.0132) model time 0.0000 (0.0000) loss 2.4541 (3.4871) grad_norm 2.5788 (2.0129) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][50/1251] eta 0:05:05 lr 0.000912 wd 0.0500 time 0.2363 (0.2547) data time 0.0010 (0.0108) model time 0.0000 (0.0000) loss 3.7272 (3.4906) grad_norm 1.6218 (2.0006) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][60/1251] eta 0:05:00 lr 0.000912 wd 0.0500 time 0.2387 (0.2522) data time 0.0008 (0.0092) model time 0.2380 (0.2386) loss 
4.1844 (3.5377) grad_norm 1.7641 (1.9793) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][70/1251] eta 0:04:55 lr 0.000912 wd 0.0500 time 0.2395 (0.2503) data time 0.0012 (0.0080) model time 0.2383 (0.2384) loss 3.5656 (3.5452) grad_norm 2.1203 (1.9687) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][80/1251] eta 0:04:51 lr 0.000912 wd 0.0500 time 0.2380 (0.2490) data time 0.0010 (0.0072) model time 0.2370 (0.2384) loss 3.6857 (3.5396) grad_norm 1.7341 (1.9722) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][90/1251] eta 0:04:47 lr 0.000912 wd 0.0500 time 0.2399 (0.2480) data time 0.0007 (0.0065) model time 0.2392 (0.2386) loss 3.5158 (3.5480) grad_norm 1.6854 (1.9555) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][100/1251] eta 0:04:44 lr 0.000912 wd 0.0500 time 0.2407 (0.2474) data time 0.0008 (0.0059) model time 0.2399 (0.2389) loss 4.4196 (3.5661) grad_norm 3.3370 (1.9776) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][110/1251] eta 0:04:41 lr 0.000912 wd 0.0500 time 0.2490 (0.2469) data time 0.0007 (0.0055) model time 0.2483 (0.2393) loss 4.6284 (3.5584) grad_norm 2.7502 (1.9832) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][120/1251] eta 0:04:38 lr 0.000912 wd 0.0500 time 0.2420 (0.2463) data time 0.0013 (0.0051) model time 0.2408 (0.2392) loss 3.9230 (3.5521) grad_norm 1.7933 (1.9817) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][130/1251] eta 0:04:35 lr 
0.000912 wd 0.0500 time 0.2418 (0.2460) data time 0.0009 (0.0048) model time 0.2410 (0.2394) loss 3.2139 (3.5673) grad_norm 2.8463 (1.9834) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][140/1251] eta 0:04:32 lr 0.000912 wd 0.0500 time 0.2458 (0.2456) data time 0.0009 (0.0045) model time 0.2449 (0.2395) loss 3.6835 (3.5742) grad_norm 2.0290 (1.9991) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][150/1251] eta 0:04:30 lr 0.000912 wd 0.0500 time 0.2436 (0.2453) data time 0.0010 (0.0043) model time 0.2426 (0.2395) loss 3.6553 (3.5690) grad_norm 2.2102 (2.0443) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][160/1251] eta 0:04:27 lr 0.000911 wd 0.0500 time 0.2436 (0.2450) data time 0.0009 (0.0041) model time 0.2427 (0.2394) loss 3.3371 (3.5677) grad_norm 1.9931 (2.0537) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][170/1251] eta 0:04:24 lr 0.000911 wd 0.0500 time 0.2408 (0.2447) data time 0.0008 (0.0039) model time 0.2400 (0.2395) loss 3.2279 (3.5603) grad_norm 1.5557 (2.0409) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][180/1251] eta 0:04:23 lr 0.000911 wd 0.0500 time 0.2576 (0.2458) data time 0.0013 (0.0038) model time 0.2563 (0.2413) loss 3.3089 (3.5536) grad_norm 1.5545 (2.0511) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][190/1251] eta 0:04:22 lr 0.000911 wd 0.0500 time 0.4442 (0.2476) data time 0.0009 (0.0037) model time 0.4432 (0.2440) loss 4.5428 (3.5730) grad_norm 1.6430 (2.0472) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][200/1251] eta 0:04:20 lr 0.000911 wd 0.0500 time 0.2387 (0.2482) data time 0.0011 (0.0035) model time 0.2376 (0.2449) loss 3.1075 (3.5727) grad_norm 1.8727 (2.0372) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][210/1251] eta 0:04:18 lr 0.000911 wd 0.0500 time 0.2438 (0.2478) data time 0.0011 (0.0034) model time 0.2426 (0.2446) loss 4.0475 (3.5608) grad_norm 2.1839 (2.0272) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][220/1251] eta 0:04:15 lr 0.000911 wd 0.0500 time 0.2402 (0.2475) data time 0.0011 (0.0033) model time 0.2391 (0.2442) loss 4.0114 (3.5688) grad_norm 2.3241 (2.0343) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][230/1251] eta 0:04:13 lr 0.000911 wd 0.0500 time 0.2443 (0.2480) data time 0.0007 (0.0032) model time 0.2436 (0.2451) loss 4.1902 (3.5728) grad_norm 1.9367 (2.0431) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][240/1251] eta 0:04:10 lr 0.000911 wd 0.0500 time 0.2492 (0.2478) data time 0.0007 (0.0031) model time 0.2485 (0.2450) loss 3.8117 (3.5594) grad_norm 1.6138 (2.0438) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][250/1251] eta 0:04:08 lr 0.000911 wd 0.0500 time 0.2423 (0.2482) data time 0.0007 (0.0030) model time 0.2416 (0.2455) loss 3.2328 (3.5480) grad_norm 2.0960 (2.0424) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][260/1251] eta 0:04:05 lr 0.000911 wd 0.0500 time 0.2276 (0.2479) data time 0.0011 (0.0030) model time 0.2265 (0.2451) loss 3.7310 
(3.5424) grad_norm 1.6045 (2.0310) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][270/1251] eta 0:04:02 lr 0.000911 wd 0.0500 time 0.2348 (0.2476) data time 0.0009 (0.0029) model time 0.2339 (0.2449) loss 3.6152 (3.5358) grad_norm 1.7603 (inf) loss_scale 4096.0000 (8071.0849) mem 7379MB [2024-08-26 08:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][280/1251] eta 0:04:00 lr 0.000911 wd 0.0500 time 0.2375 (0.2472) data time 0.0007 (0.0028) model time 0.2368 (0.2445) loss 3.1120 (3.5400) grad_norm 2.0851 (inf) loss_scale 4096.0000 (7929.6228) mem 7379MB [2024-08-26 08:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][290/1251] eta 0:03:57 lr 0.000911 wd 0.0500 time 0.2406 (0.2471) data time 0.0008 (0.0028) model time 0.2398 (0.2445) loss 3.8965 (3.5429) grad_norm 1.6643 (inf) loss_scale 4096.0000 (7797.8832) mem 7379MB [2024-08-26 08:37:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][300/1251] eta 0:03:54 lr 0.000911 wd 0.0500 time 0.2582 (0.2470) data time 0.0012 (0.0027) model time 0.2570 (0.2443) loss 3.6319 (3.5465) grad_norm 1.9131 (inf) loss_scale 4096.0000 (7674.8970) mem 7379MB [2024-08-26 08:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][310/1251] eta 0:03:52 lr 0.000911 wd 0.0500 time 0.2371 (0.2467) data time 0.0009 (0.0026) model time 0.2361 (0.2441) loss 3.5013 (3.5468) grad_norm 2.5364 (inf) loss_scale 4096.0000 (7559.8199) mem 7379MB [2024-08-26 08:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][320/1251] eta 0:03:49 lr 0.000911 wd 0.0500 time 0.2387 (0.2466) data time 0.0007 (0.0026) model time 0.2380 (0.2441) loss 3.1681 (3.5540) grad_norm 1.5364 (inf) loss_scale 4096.0000 (7451.9128) mem 7379MB [2024-08-26 08:37:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][330/1251] eta 0:03:46 lr 0.000911 wd 0.0500 time 
0.2320 (0.2464) data time 0.0012 (0.0025) model time 0.2308 (0.2438) loss 3.0952 (3.5502) grad_norm 1.6906 (inf) loss_scale 4096.0000 (7350.5257) mem 7379MB [2024-08-26 08:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][340/1251] eta 0:03:44 lr 0.000911 wd 0.0500 time 0.2378 (0.2462) data time 0.0010 (0.0025) model time 0.2369 (0.2437) loss 3.6753 (3.5487) grad_norm 1.7360 (inf) loss_scale 4096.0000 (7255.0850) mem 7379MB [2024-08-26 08:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][350/1251] eta 0:03:41 lr 0.000911 wd 0.0500 time 0.2398 (0.2460) data time 0.0011 (0.0025) model time 0.2387 (0.2435) loss 3.8969 (3.5524) grad_norm 2.2792 (inf) loss_scale 4096.0000 (7165.0826) mem 7379MB [2024-08-26 08:37:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][360/1251] eta 0:03:39 lr 0.000911 wd 0.0500 time 0.2453 (0.2459) data time 0.0007 (0.0024) model time 0.2446 (0.2434) loss 3.8792 (3.5485) grad_norm 1.8748 (inf) loss_scale 4096.0000 (7080.0665) mem 7379MB [2024-08-26 08:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][370/1251] eta 0:03:36 lr 0.000911 wd 0.0500 time 0.2410 (0.2457) data time 0.0013 (0.0024) model time 0.2398 (0.2432) loss 3.3321 (3.5451) grad_norm 2.2750 (inf) loss_scale 4096.0000 (6999.6334) mem 7379MB [2024-08-26 08:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][380/1251] eta 0:03:33 lr 0.000911 wd 0.0500 time 0.2388 (0.2456) data time 0.0007 (0.0024) model time 0.2381 (0.2432) loss 4.0397 (3.5533) grad_norm 2.0445 (inf) loss_scale 4096.0000 (6923.4226) mem 7379MB [2024-08-26 08:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][390/1251] eta 0:03:31 lr 0.000911 wd 0.0500 time 0.2385 (0.2455) data time 0.0007 (0.0023) model time 0.2378 (0.2430) loss 2.8892 (3.5430) grad_norm 1.4053 (inf) loss_scale 4096.0000 (6851.1100) mem 7379MB [2024-08-26 08:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [74/300][400/1251] eta 0:03:28 lr 0.000911 wd 0.0500 time 0.2402 (0.2454) data time 0.0012 (0.0023) model time 0.2390 (0.2430) loss 3.3429 (3.5470) grad_norm 1.4227 (inf) loss_scale 4096.0000 (6782.4040) mem 7379MB [2024-08-26 08:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][410/1251] eta 0:03:26 lr 0.000911 wd 0.0500 time 0.2425 (0.2458) data time 0.0007 (0.0023) model time 0.2418 (0.2435) loss 3.9608 (3.5470) grad_norm 2.0443 (inf) loss_scale 4096.0000 (6717.0414) mem 7379MB [2024-08-26 08:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][420/1251] eta 0:03:24 lr 0.000911 wd 0.0500 time 0.2465 (0.2458) data time 0.0009 (0.0022) model time 0.2456 (0.2435) loss 4.1652 (3.5477) grad_norm 2.2886 (inf) loss_scale 4096.0000 (6654.7838) mem 7379MB [2024-08-26 08:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][430/1251] eta 0:03:21 lr 0.000911 wd 0.0500 time 0.2409 (0.2457) data time 0.0011 (0.0022) model time 0.2398 (0.2434) loss 3.6787 (3.5451) grad_norm 1.4468 (inf) loss_scale 4096.0000 (6595.4153) mem 7379MB [2024-08-26 08:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][440/1251] eta 0:03:19 lr 0.000911 wd 0.0500 time 0.2513 (0.2456) data time 0.0011 (0.0022) model time 0.2502 (0.2434) loss 3.5885 (3.5353) grad_norm 1.5512 (inf) loss_scale 4096.0000 (6538.7392) mem 7379MB [2024-08-26 08:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][450/1251] eta 0:03:16 lr 0.000911 wd 0.0500 time 0.2441 (0.2455) data time 0.0009 (0.0022) model time 0.2431 (0.2433) loss 2.8951 (3.5301) grad_norm 2.8059 (inf) loss_scale 4096.0000 (6484.5765) mem 7379MB [2024-08-26 08:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][460/1251] eta 0:03:14 lr 0.000911 wd 0.0500 time 0.2423 (0.2454) data time 0.0011 (0.0021) model time 0.2412 (0.2432) loss 3.6757 (3.5313) grad_norm 2.4847 (inf) loss_scale 4096.0000 (6432.7636) 
mem 7379MB [2024-08-26 08:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][470/1251] eta 0:03:11 lr 0.000911 wd 0.0500 time 0.2452 (0.2453) data time 0.0009 (0.0021) model time 0.2443 (0.2431) loss 4.0836 (3.5378) grad_norm 1.8919 (inf) loss_scale 4096.0000 (6383.1507) mem 7379MB [2024-08-26 08:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][480/1251] eta 0:03:09 lr 0.000911 wd 0.0500 time 0.2343 (0.2452) data time 0.0012 (0.0021) model time 0.2331 (0.2430) loss 3.3938 (3.5404) grad_norm 2.0575 (inf) loss_scale 4096.0000 (6335.6008) mem 7379MB [2024-08-26 08:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][490/1251] eta 0:03:06 lr 0.000911 wd 0.0500 time 0.2380 (0.2451) data time 0.0009 (0.0021) model time 0.2371 (0.2430) loss 3.7230 (3.5355) grad_norm 1.7372 (inf) loss_scale 4096.0000 (6289.9878) mem 7379MB [2024-08-26 08:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][500/1251] eta 0:03:04 lr 0.000911 wd 0.0500 time 0.2376 (0.2451) data time 0.0013 (0.0021) model time 0.2363 (0.2429) loss 3.6943 (3.5356) grad_norm 2.9099 (inf) loss_scale 4096.0000 (6246.1956) mem 7379MB [2024-08-26 08:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][510/1251] eta 0:03:01 lr 0.000911 wd 0.0500 time 0.2444 (0.2450) data time 0.0008 (0.0020) model time 0.2436 (0.2429) loss 2.3832 (3.5359) grad_norm 1.5900 (inf) loss_scale 4096.0000 (6204.1174) mem 7379MB [2024-08-26 08:38:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][520/1251] eta 0:02:59 lr 0.000911 wd 0.0500 time 0.2393 (0.2449) data time 0.0008 (0.0020) model time 0.2385 (0.2428) loss 3.7918 (3.5304) grad_norm 1.6347 (inf) loss_scale 4096.0000 (6163.6545) mem 7379MB [2024-08-26 08:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][530/1251] eta 0:02:56 lr 0.000911 wd 0.0500 time 0.2446 (0.2448) data time 0.0010 (0.0020) model time 0.2436 (0.2427) loss 
2.6363 (3.5239) grad_norm 1.7498 (inf) loss_scale 4096.0000 (6124.7156) mem 7379MB [2024-08-26 08:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][540/1251] eta 0:02:54 lr 0.000911 wd 0.0500 time 0.2391 (0.2451) data time 0.0010 (0.0020) model time 0.2382 (0.2431) loss 3.4240 (3.5254) grad_norm 2.2008 (inf) loss_scale 4096.0000 (6087.2163) mem 7379MB [2024-08-26 08:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][550/1251] eta 0:02:51 lr 0.000910 wd 0.0500 time 0.2411 (0.2450) data time 0.0008 (0.0020) model time 0.2403 (0.2430) loss 3.1032 (3.5195) grad_norm 2.2035 (inf) loss_scale 4096.0000 (6051.0780) mem 7379MB [2024-08-26 08:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][560/1251] eta 0:02:49 lr 0.000910 wd 0.0500 time 0.2350 (0.2450) data time 0.0011 (0.0019) model time 0.2339 (0.2429) loss 2.4715 (3.5195) grad_norm 1.9183 (inf) loss_scale 4096.0000 (6016.2282) mem 7379MB [2024-08-26 08:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][570/1251] eta 0:02:46 lr 0.000910 wd 0.0500 time 0.2432 (0.2449) data time 0.0007 (0.0019) model time 0.2425 (0.2429) loss 3.1160 (3.5188) grad_norm 1.6373 (inf) loss_scale 4096.0000 (5982.5989) mem 7379MB [2024-08-26 08:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][580/1251] eta 0:02:44 lr 0.000910 wd 0.0500 time 0.2302 (0.2449) data time 0.0011 (0.0019) model time 0.2291 (0.2428) loss 3.4386 (3.5205) grad_norm 1.7057 (inf) loss_scale 4096.0000 (5950.1274) mem 7379MB [2024-08-26 08:38:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][590/1251] eta 0:02:41 lr 0.000910 wd 0.0500 time 0.2408 (0.2448) data time 0.0010 (0.0019) model time 0.2397 (0.2428) loss 3.5938 (3.5248) grad_norm 2.4999 (inf) loss_scale 4096.0000 (5918.7547) mem 7379MB [2024-08-26 08:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][600/1251] eta 0:02:39 lr 0.000910 wd 0.0500 time 
0.2463 (0.2448) data time 0.0009 (0.0019) model time 0.2453 (0.2428) loss 3.8635 (3.5220) grad_norm 2.3223 (inf) loss_scale 4096.0000 (5888.4260) mem 7379MB [2024-08-26 08:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][610/1251] eta 0:02:36 lr 0.000910 wd 0.0500 time 0.2385 (0.2447) data time 0.0010 (0.0019) model time 0.2375 (0.2428) loss 3.8730 (3.5202) grad_norm 2.3487 (inf) loss_scale 4096.0000 (5859.0900) mem 7379MB [2024-08-26 08:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][620/1251] eta 0:02:34 lr 0.000910 wd 0.0500 time 0.2417 (0.2447) data time 0.0007 (0.0019) model time 0.2409 (0.2427) loss 4.3241 (3.5202) grad_norm 1.6901 (inf) loss_scale 4096.0000 (5830.6989) mem 7379MB [2024-08-26 08:39:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][630/1251] eta 0:02:31 lr 0.000910 wd 0.0500 time 0.2311 (0.2446) data time 0.0009 (0.0018) model time 0.2303 (0.2427) loss 3.7238 (3.5149) grad_norm 1.8099 (inf) loss_scale 4096.0000 (5803.2076) mem 7379MB [2024-08-26 08:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][640/1251] eta 0:02:29 lr 0.000910 wd 0.0500 time 0.2370 (0.2446) data time 0.0012 (0.0018) model time 0.2358 (0.2426) loss 3.6114 (3.5161) grad_norm 1.7213 (inf) loss_scale 4096.0000 (5776.5741) mem 7379MB [2024-08-26 08:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][650/1251] eta 0:02:26 lr 0.000910 wd 0.0500 time 0.2375 (0.2445) data time 0.0007 (0.0018) model time 0.2368 (0.2425) loss 4.0385 (3.5164) grad_norm 1.7679 (inf) loss_scale 4096.0000 (5750.7588) mem 7379MB [2024-08-26 08:39:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][660/1251] eta 0:02:24 lr 0.000910 wd 0.0500 time 0.2370 (0.2444) data time 0.0012 (0.0018) model time 0.2358 (0.2425) loss 3.5259 (3.5123) grad_norm 1.4928 (inf) loss_scale 4096.0000 (5725.7247) mem 7379MB [2024-08-26 08:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [74/300][670/1251] eta 0:02:21 lr 0.000910 wd 0.0500 time 0.2455 (0.2444) data time 0.0008 (0.0018) model time 0.2447 (0.2425) loss 4.0300 (3.5149) grad_norm 1.9345 (inf) loss_scale 4096.0000 (5701.4367) mem 7379MB [2024-08-26 08:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][680/1251] eta 0:02:19 lr 0.000910 wd 0.0500 time 0.2352 (0.2444) data time 0.0010 (0.0018) model time 0.2342 (0.2425) loss 2.9441 (3.5145) grad_norm 1.8856 (inf) loss_scale 4096.0000 (5677.8620) mem 7379MB [2024-08-26 08:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][690/1251] eta 0:02:17 lr 0.000910 wd 0.0500 time 0.2350 (0.2443) data time 0.0011 (0.0018) model time 0.2338 (0.2424) loss 4.0030 (3.5160) grad_norm 1.8430 (inf) loss_scale 4096.0000 (5654.9696) mem 7379MB [2024-08-26 08:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][700/1251] eta 0:02:14 lr 0.000910 wd 0.0500 time 0.2442 (0.2442) data time 0.0008 (0.0018) model time 0.2433 (0.2424) loss 4.0684 (3.5186) grad_norm 2.2154 (inf) loss_scale 4096.0000 (5632.7304) mem 7379MB [2024-08-26 08:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][710/1251] eta 0:02:12 lr 0.000910 wd 0.0500 time 0.3917 (0.2444) data time 0.0010 (0.0018) model time 0.3907 (0.2425) loss 3.8537 (3.5176) grad_norm 2.1405 (inf) loss_scale 4096.0000 (5611.1167) mem 7379MB [2024-08-26 08:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][720/1251] eta 0:02:09 lr 0.000910 wd 0.0500 time 0.2420 (0.2447) data time 0.0011 (0.0017) model time 0.2409 (0.2428) loss 3.6306 (3.5200) grad_norm 1.6193 (inf) loss_scale 4096.0000 (5590.1026) mem 7379MB [2024-08-26 08:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][730/1251] eta 0:02:07 lr 0.000910 wd 0.0500 time 0.2380 (0.2449) data time 0.0011 (0.0017) model time 0.2369 (0.2431) loss 4.2677 (3.5224) grad_norm 4.0695 (inf) loss_scale 4096.0000 (5569.6635) 
mem 7379MB [2024-08-26 08:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][740/1251] eta 0:02:05 lr 0.000910 wd 0.0500 time 0.2442 (0.2448) data time 0.0009 (0.0017) model time 0.2433 (0.2431) loss 3.3352 (3.5194) grad_norm 1.3829 (inf) loss_scale 4096.0000 (5549.7760) mem 7379MB [2024-08-26 08:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][750/1251] eta 0:02:02 lr 0.000910 wd 0.0500 time 0.2418 (0.2448) data time 0.0009 (0.0017) model time 0.2409 (0.2430) loss 4.1876 (3.5235) grad_norm 1.6200 (inf) loss_scale 4096.0000 (5530.4181) mem 7379MB [2024-08-26 08:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][760/1251] eta 0:02:00 lr 0.000910 wd 0.0500 time 0.2405 (0.2448) data time 0.0007 (0.0017) model time 0.2398 (0.2430) loss 3.9523 (3.5234) grad_norm 2.2328 (inf) loss_scale 4096.0000 (5511.5690) mem 7379MB [2024-08-26 08:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][770/1251] eta 0:01:57 lr 0.000910 wd 0.0500 time 0.4398 (0.2450) data time 0.0008 (0.0017) model time 0.4391 (0.2432) loss 2.5757 (3.5235) grad_norm 1.4862 (inf) loss_scale 4096.0000 (5493.2088) mem 7379MB [2024-08-26 08:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][780/1251] eta 0:01:55 lr 0.000910 wd 0.0500 time 0.2413 (0.2450) data time 0.0012 (0.0017) model time 0.2401 (0.2432) loss 3.4211 (3.5251) grad_norm 1.6572 (inf) loss_scale 4096.0000 (5475.3188) mem 7379MB [2024-08-26 08:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][790/1251] eta 0:01:52 lr 0.000910 wd 0.0500 time 0.2429 (0.2449) data time 0.0009 (0.0017) model time 0.2420 (0.2432) loss 2.7690 (3.5208) grad_norm 1.6763 (inf) loss_scale 4096.0000 (5457.8812) mem 7379MB [2024-08-26 08:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][800/1251] eta 0:01:50 lr 0.000910 wd 0.0500 time 0.2445 (0.2448) data time 0.0010 (0.0017) model time 0.2435 (0.2431) loss 
3.6941 (3.5191) grad_norm 1.8760 (inf) loss_scale 4096.0000 (5440.8789) mem 7379MB [2024-08-26 08:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][810/1251] eta 0:01:47 lr 0.000910 wd 0.0500 time 0.2436 (0.2448) data time 0.0009 (0.0017) model time 0.2427 (0.2431) loss 3.7544 (3.5212) grad_norm 2.0063 (inf) loss_scale 4096.0000 (5424.2959) mem 7379MB [2024-08-26 08:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][820/1251] eta 0:01:45 lr 0.000910 wd 0.0500 time 0.2392 (0.2448) data time 0.0009 (0.0017) model time 0.2383 (0.2430) loss 2.8120 (3.5213) grad_norm 3.6975 (inf) loss_scale 4096.0000 (5408.1169) mem 7379MB [2024-08-26 08:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][830/1251] eta 0:01:43 lr 0.000910 wd 0.0500 time 0.2415 (0.2447) data time 0.0008 (0.0016) model time 0.2407 (0.2430) loss 3.4242 (3.5195) grad_norm 1.8757 (inf) loss_scale 4096.0000 (5392.3273) mem 7379MB [2024-08-26 08:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][840/1251] eta 0:01:40 lr 0.000910 wd 0.0500 time 0.2401 (0.2446) data time 0.0008 (0.0016) model time 0.2393 (0.2430) loss 4.0856 (3.5148) grad_norm 1.5074 (inf) loss_scale 4096.0000 (5376.9132) mem 7379MB [2024-08-26 08:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][850/1251] eta 0:01:38 lr 0.000910 wd 0.0500 time 0.2427 (0.2446) data time 0.0007 (0.0016) model time 0.2420 (0.2429) loss 3.4569 (3.5093) grad_norm 1.7536 (inf) loss_scale 4096.0000 (5361.8613) mem 7379MB [2024-08-26 08:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][860/1251] eta 0:01:35 lr 0.000910 wd 0.0500 time 0.2440 (0.2446) data time 0.0009 (0.0016) model time 0.2431 (0.2429) loss 4.1084 (3.5084) grad_norm 1.4039 (inf) loss_scale 4096.0000 (5347.1591) mem 7379MB [2024-08-26 08:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][870/1251] eta 0:01:33 lr 0.000910 wd 0.0500 time 
0.2346 (0.2446) data time 0.0009 (0.0016) model time 0.2337 (0.2429) loss 2.0008 (3.5075) grad_norm 2.8939 (inf) loss_scale 4096.0000 (5332.7945) mem 7379MB [2024-08-26 08:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][880/1251] eta 0:01:30 lr 0.000910 wd 0.0500 time 0.2438 (0.2445) data time 0.0010 (0.0016) model time 0.2428 (0.2429) loss 3.8523 (3.5077) grad_norm 1.6043 (inf) loss_scale 4096.0000 (5318.7560) mem 7379MB [2024-08-26 08:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][890/1251] eta 0:01:28 lr 0.000910 wd 0.0500 time 0.2433 (0.2445) data time 0.0008 (0.0016) model time 0.2425 (0.2428) loss 3.6426 (3.5062) grad_norm 2.6812 (inf) loss_scale 4096.0000 (5305.0325) mem 7379MB [2024-08-26 08:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][900/1251] eta 0:01:25 lr 0.000910 wd 0.0500 time 0.2409 (0.2444) data time 0.0007 (0.0016) model time 0.2402 (0.2428) loss 2.2642 (3.5033) grad_norm 2.2205 (inf) loss_scale 4096.0000 (5291.6138) mem 7379MB [2024-08-26 08:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][910/1251] eta 0:01:23 lr 0.000910 wd 0.0500 time 0.2354 (0.2445) data time 0.0010 (0.0016) model time 0.2344 (0.2428) loss 3.8754 (3.5062) grad_norm 2.2169 (inf) loss_scale 4096.0000 (5278.4896) mem 7379MB [2024-08-26 08:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][920/1251] eta 0:01:20 lr 0.000910 wd 0.0500 time 0.2444 (0.2445) data time 0.0009 (0.0016) model time 0.2435 (0.2428) loss 2.4862 (3.5038) grad_norm 2.0258 (inf) loss_scale 4096.0000 (5265.6504) mem 7379MB [2024-08-26 08:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][930/1251] eta 0:01:18 lr 0.000910 wd 0.0500 time 0.2413 (0.2444) data time 0.0008 (0.0016) model time 0.2404 (0.2428) loss 2.8166 (3.5063) grad_norm 2.9041 (inf) loss_scale 4096.0000 (5253.0870) mem 7379MB [2024-08-26 08:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [74/300][940/1251] eta 0:01:16 lr 0.000910 wd 0.0500 time 0.2392 (0.2446) data time 0.0010 (0.0016) model time 0.2382 (0.2430) loss 3.8367 (3.5066) grad_norm 1.9363 (inf) loss_scale 4096.0000 (5240.7906) mem 7379MB [2024-08-26 08:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][950/1251] eta 0:01:13 lr 0.000909 wd 0.0500 time 0.2428 (0.2446) data time 0.0009 (0.0016) model time 0.2419 (0.2429) loss 4.1768 (3.5051) grad_norm 1.4055 (inf) loss_scale 4096.0000 (5228.7529) mem 7379MB [2024-08-26 08:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][960/1251] eta 0:01:11 lr 0.000909 wd 0.0500 time 0.2432 (0.2446) data time 0.0008 (0.0016) model time 0.2424 (0.2429) loss 4.3157 (3.5048) grad_norm 2.3719 (inf) loss_scale 4096.0000 (5216.9657) mem 7379MB [2024-08-26 08:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][970/1251] eta 0:01:08 lr 0.000909 wd 0.0500 time 0.2362 (0.2445) data time 0.0012 (0.0016) model time 0.2350 (0.2429) loss 3.0914 (3.5069) grad_norm 1.5731 (inf) loss_scale 4096.0000 (5205.4212) mem 7379MB [2024-08-26 08:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][980/1251] eta 0:01:06 lr 0.000909 wd 0.0500 time 0.2477 (0.2447) data time 0.0009 (0.0016) model time 0.2468 (0.2431) loss 2.5239 (3.5075) grad_norm 1.7827 (inf) loss_scale 4096.0000 (5194.1121) mem 7379MB [2024-08-26 08:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][990/1251] eta 0:01:03 lr 0.000909 wd 0.0500 time 0.2441 (0.2447) data time 0.0009 (0.0016) model time 0.2432 (0.2431) loss 2.9692 (3.5061) grad_norm 1.7111 (inf) loss_scale 4096.0000 (5183.0313) mem 7379MB [2024-08-26 08:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1000/1251] eta 0:01:01 lr 0.000909 wd 0.0500 time 0.2356 (0.2447) data time 0.0010 (0.0016) model time 0.2346 (0.2431) loss 2.5050 (3.5037) grad_norm 1.7002 (inf) loss_scale 4096.0000 (5172.1718) 
mem 7379MB [2024-08-26 08:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1010/1251] eta 0:00:58 lr 0.000909 wd 0.0500 time 0.2418 (0.2446) data time 0.0009 (0.0015) model time 0.2409 (0.2430) loss 2.7071 (3.5029) grad_norm 2.5289 (inf) loss_scale 4096.0000 (5161.5272) mem 7379MB [2024-08-26 08:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1020/1251] eta 0:00:56 lr 0.000909 wd 0.0500 time 0.2427 (0.2446) data time 0.0010 (0.0015) model time 0.2417 (0.2430) loss 3.9540 (3.5032) grad_norm 1.8037 (inf) loss_scale 4096.0000 (5151.0911) mem 7379MB [2024-08-26 08:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1030/1251] eta 0:00:54 lr 0.000909 wd 0.0500 time 0.2414 (0.2445) data time 0.0010 (0.0015) model time 0.2404 (0.2430) loss 3.0352 (3.5019) grad_norm 2.1345 (inf) loss_scale 4096.0000 (5140.8574) mem 7379MB [2024-08-26 08:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1040/1251] eta 0:00:51 lr 0.000909 wd 0.0500 time 0.2399 (0.2445) data time 0.0011 (0.0015) model time 0.2388 (0.2429) loss 3.6522 (3.5026) grad_norm 1.6631 (inf) loss_scale 4096.0000 (5130.8204) mem 7379MB [2024-08-26 08:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1050/1251] eta 0:00:49 lr 0.000909 wd 0.0500 time 0.2484 (0.2445) data time 0.0010 (0.0015) model time 0.2474 (0.2429) loss 3.5579 (3.5038) grad_norm 1.7800 (inf) loss_scale 4096.0000 (5120.9743) mem 7379MB [2024-08-26 08:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1060/1251] eta 0:00:46 lr 0.000909 wd 0.0500 time 0.2527 (0.2445) data time 0.0009 (0.0015) model time 0.2518 (0.2429) loss 3.7527 (3.5022) grad_norm 1.8598 (inf) loss_scale 4096.0000 (5111.3139) mem 7379MB [2024-08-26 08:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1070/1251] eta 0:00:44 lr 0.000909 wd 0.0500 time 0.2421 (0.2444) data time 0.0011 (0.0015) model time 0.2410 
(0.2429) loss 3.4421 (3.5022) grad_norm 1.7190 (inf) loss_scale 4096.0000 (5101.8338) mem 7379MB [2024-08-26 08:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1080/1251] eta 0:00:41 lr 0.000909 wd 0.0500 time 0.2417 (0.2446) data time 0.0008 (0.0015) model time 0.2409 (0.2430) loss 4.7529 (3.5030) grad_norm 2.4129 (inf) loss_scale 4096.0000 (5092.5291) mem 7379MB [2024-08-26 08:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1090/1251] eta 0:00:39 lr 0.000909 wd 0.0500 time 0.2492 (0.2446) data time 0.0008 (0.0015) model time 0.2484 (0.2430) loss 4.4455 (3.5050) grad_norm 1.9187 (inf) loss_scale 4096.0000 (5083.3951) mem 7379MB [2024-08-26 08:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1100/1251] eta 0:00:36 lr 0.000909 wd 0.0500 time 0.2418 (0.2445) data time 0.0010 (0.0015) model time 0.2408 (0.2430) loss 3.8335 (3.5047) grad_norm 3.3573 (inf) loss_scale 4096.0000 (5074.4269) mem 7379MB [2024-08-26 08:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1110/1251] eta 0:00:34 lr 0.000909 wd 0.0500 time 0.2415 (0.2445) data time 0.0011 (0.0015) model time 0.2404 (0.2430) loss 3.4852 (3.5054) grad_norm 2.6373 (inf) loss_scale 4096.0000 (5065.6202) mem 7379MB [2024-08-26 08:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1120/1251] eta 0:00:32 lr 0.000909 wd 0.0500 time 0.2359 (0.2445) data time 0.0011 (0.0015) model time 0.2348 (0.2429) loss 3.9250 (3.5054) grad_norm 1.8407 (inf) loss_scale 4096.0000 (5056.9706) mem 7379MB [2024-08-26 08:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1130/1251] eta 0:00:29 lr 0.000909 wd 0.0500 time 0.2334 (0.2444) data time 0.0008 (0.0015) model time 0.2325 (0.2429) loss 3.3499 (3.5045) grad_norm 1.3908 (inf) loss_scale 4096.0000 (5048.4739) mem 7379MB [2024-08-26 08:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1140/1251] eta 0:00:27 lr 
0.000909 wd 0.0500 time 0.2395 (0.2444) data time 0.0010 (0.0015) model time 0.2385 (0.2429) loss 3.6571 (3.5066) grad_norm 1.8510 (inf) loss_scale 4096.0000 (5040.1262) mem 7379MB [2024-08-26 08:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1150/1251] eta 0:00:24 lr 0.000909 wd 0.0500 time 0.2433 (0.2444) data time 0.0008 (0.0015) model time 0.2426 (0.2428) loss 4.4845 (3.5058) grad_norm 1.8815 (inf) loss_scale 4096.0000 (5031.9235) mem 7379MB [2024-08-26 08:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1160/1251] eta 0:00:22 lr 0.000909 wd 0.0500 time 0.2484 (0.2445) data time 0.0009 (0.0015) model time 0.2474 (0.2430) loss 4.0070 (3.5041) grad_norm 2.0387 (inf) loss_scale 4096.0000 (5023.8622) mem 7379MB [2024-08-26 08:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1170/1251] eta 0:00:19 lr 0.000909 wd 0.0500 time 0.2428 (0.2445) data time 0.0009 (0.0015) model time 0.2419 (0.2429) loss 4.1088 (3.5052) grad_norm 1.6222 (inf) loss_scale 4096.0000 (5015.9385) mem 7379MB [2024-08-26 08:41:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1180/1251] eta 0:00:17 lr 0.000909 wd 0.0500 time 0.2404 (0.2444) data time 0.0010 (0.0015) model time 0.2394 (0.2429) loss 2.9544 (3.5068) grad_norm 2.6601 (inf) loss_scale 4096.0000 (5008.1490) mem 7379MB [2024-08-26 08:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1190/1251] eta 0:00:14 lr 0.000909 wd 0.0500 time 0.2376 (0.2444) data time 0.0007 (0.0015) model time 0.2369 (0.2429) loss 3.8108 (3.5103) grad_norm 1.7696 (inf) loss_scale 4096.0000 (5000.4903) mem 7379MB [2024-08-26 08:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1200/1251] eta 0:00:12 lr 0.000909 wd 0.0500 time 0.2369 (0.2443) data time 0.0011 (0.0015) model time 0.2358 (0.2428) loss 3.7801 (3.5113) grad_norm 3.2222 (inf) loss_scale 4096.0000 (4992.9592) mem 7379MB [2024-08-26 08:41:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1210/1251] eta 0:00:10 lr 0.000909 wd 0.0500 time 0.2495 (0.2443) data time 0.0007 (0.0015) model time 0.2488 (0.2428) loss 4.2438 (3.5113) grad_norm 1.5819 (inf) loss_scale 4096.0000 (4985.5524) mem 7379MB [2024-08-26 08:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1220/1251] eta 0:00:07 lr 0.000909 wd 0.0500 time 0.2398 (0.2443) data time 0.0007 (0.0015) model time 0.2391 (0.2428) loss 2.7206 (3.5112) grad_norm 1.9308 (inf) loss_scale 4096.0000 (4978.2670) mem 7379MB [2024-08-26 08:41:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1230/1251] eta 0:00:05 lr 0.000909 wd 0.0500 time 0.2498 (0.2443) data time 0.0010 (0.0015) model time 0.2488 (0.2428) loss 3.9505 (3.5092) grad_norm 2.4030 (inf) loss_scale 4096.0000 (4971.0999) mem 7379MB [2024-08-26 08:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1240/1251] eta 0:00:02 lr 0.000909 wd 0.0500 time 0.2232 (0.2447) data time 0.0005 (0.0015) model time 0.2227 (0.2432) loss 2.3824 (3.5090) grad_norm 1.3374 (inf) loss_scale 4096.0000 (4964.0483) mem 7379MB [2024-08-26 08:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [74/300][1250/1251] eta 0:00:00 lr 0.000909 wd 0.0500 time 0.2270 (0.2446) data time 0.0005 (0.0014) model time 0.2266 (0.2432) loss 2.3930 (3.5060) grad_norm 2.4571 (inf) loss_scale 4096.0000 (4957.1095) mem 7379MB [2024-08-26 08:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 74 training takes 0:05:06 [2024-08-26 08:41:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 08:41:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 08:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.402 (0.402) Loss 0.4983 (0.4983) Acc@1 90.625 (90.625) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 08:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.118) Loss 0.8071 (0.8242) Acc@1 82.715 (82.076) Acc@5 95.898 (95.943) Mem 7379MB [2024-08-26 08:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.099) Loss 1.1748 (0.8464) Acc@1 71.387 (80.994) Acc@5 92.480 (95.922) Mem 7379MB [2024-08-26 08:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.092) Loss 1.4209 (0.9703) Acc@1 65.820 (78.141) Acc@5 88.672 (94.418) Mem 7379MB [2024-08-26 08:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.2539 (1.0414) Acc@1 70.801 (76.624) Acc@5 91.406 (93.590) Mem 7379MB [2024-08-26 08:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.332 Acc@5 93.572 [2024-08-26 08:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.3% [2024-08-26 08:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 76.33% [2024-08-26 08:41:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 08:41:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 08:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.427 (0.427) Loss 0.4658 (0.4658) Acc@1 91.602 (91.602) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 08:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.108) Loss 0.7437 (0.7248) Acc@1 85.059 (84.348) Acc@5 96.289 (96.822) Mem 7379MB [2024-08-26 08:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.094) Loss 1.0420 (0.7480) Acc@1 75.586 (83.222) Acc@5 93.457 (96.805) Mem 7379MB [2024-08-26 08:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.088) Loss 1.3057 (0.8545) Acc@1 66.895 (80.771) Acc@5 90.234 (95.492) Mem 7379MB [2024-08-26 08:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.2119 (0.9118) Acc@1 70.605 (79.278) Acc@5 91.211 (94.846) Mem 7379MB [2024-08-26 08:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.876 Acc@5 94.782 [2024-08-26 08:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.9% [2024-08-26 08:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.88% [2024-08-26 08:41:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 08:41:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 08:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][0/1251] eta 0:13:29 lr 0.000909 wd 0.0500 time 0.6470 (0.6470) data time 0.4189 (0.4189) model time 0.0000 (0.0000) loss 3.7468 (3.7468) grad_norm 1.7796 (1.7796) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][10/1251] eta 0:05:46 lr 0.000909 wd 0.0500 time 0.2357 (0.2796) data time 0.0007 (0.0390) model time 0.0000 (0.0000) loss 3.4778 (3.5800) grad_norm 1.8812 (2.0856) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][20/1251] eta 0:05:21 lr 0.000909 wd 0.0500 time 0.2463 (0.2615) data time 0.0009 (0.0209) model time 0.0000 (0.0000) loss 3.6030 (3.4411) grad_norm 2.0878 (2.0566) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][30/1251] eta 0:05:11 lr 0.000909 wd 0.0500 time 0.2381 (0.2548) data time 0.0008 (0.0145) model time 0.0000 (0.0000) loss 3.5416 (3.4225) grad_norm 1.9084 (1.9961) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][40/1251] eta 0:05:05 lr 0.000909 wd 0.0500 time 0.2510 (0.2520) data time 0.0010 (0.0112) model time 0.0000 (0.0000) loss 2.2343 (3.4634) grad_norm 1.9824 (2.0087) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][50/1251] eta 0:05:00 lr 0.000909 wd 0.0500 time 0.2474 (0.2501) data time 0.0007 (0.0092) model time 0.0000 (0.0000) loss 3.5241 (3.5021) grad_norm 3.7190 (2.0782) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][60/1251] eta 0:04:56 lr 0.000909 wd 0.0500 time 0.2455 (0.2491) data time 0.0008 (0.0079) model time 0.2447 (0.2433) loss 
2.9057 (3.4561) grad_norm 1.9747 (2.0909) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][70/1251] eta 0:04:52 lr 0.000909 wd 0.0500 time 0.2372 (0.2480) data time 0.0009 (0.0069) model time 0.2363 (0.2416) loss 2.8104 (3.4551) grad_norm 2.5506 (2.0982) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][80/1251] eta 0:04:49 lr 0.000908 wd 0.0500 time 0.2424 (0.2472) data time 0.0012 (0.0062) model time 0.2412 (0.2414) loss 3.6903 (3.4473) grad_norm 1.7529 (2.1628) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][90/1251] eta 0:04:46 lr 0.000908 wd 0.0500 time 0.2394 (0.2465) data time 0.0011 (0.0056) model time 0.2382 (0.2409) loss 3.2662 (3.4207) grad_norm 3.0391 (2.2069) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][100/1251] eta 0:04:45 lr 0.000908 wd 0.0500 time 0.2368 (0.2479) data time 0.0010 (0.0052) model time 0.2358 (0.2447) loss 2.9446 (3.4190) grad_norm 1.6626 (2.1585) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][110/1251] eta 0:04:42 lr 0.000908 wd 0.0500 time 0.2414 (0.2475) data time 0.0008 (0.0048) model time 0.2405 (0.2442) loss 4.1617 (3.4080) grad_norm 1.5095 (2.1433) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][120/1251] eta 0:04:39 lr 0.000908 wd 0.0500 time 0.2488 (0.2471) data time 0.0007 (0.0045) model time 0.2481 (0.2440) loss 2.1534 (3.4000) grad_norm 1.6634 (2.1244) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][130/1251] eta 0:04:36 lr 
0.000908 wd 0.0500 time 0.2359 (0.2467) data time 0.0010 (0.0042) model time 0.2349 (0.2436) loss 3.8720 (3.4006) grad_norm 2.2054 (2.1259) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][140/1251] eta 0:04:33 lr 0.000908 wd 0.0500 time 0.2432 (0.2463) data time 0.0009 (0.0040) model time 0.2423 (0.2432) loss 4.1201 (3.4120) grad_norm 1.4738 (2.1284) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][150/1251] eta 0:04:30 lr 0.000908 wd 0.0500 time 0.2444 (0.2460) data time 0.0007 (0.0038) model time 0.2437 (0.2430) loss 3.3633 (3.4045) grad_norm 1.5409 (2.1055) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][160/1251] eta 0:04:28 lr 0.000908 wd 0.0500 time 0.2363 (0.2457) data time 0.0007 (0.0036) model time 0.2356 (0.2427) loss 3.8400 (3.3988) grad_norm 2.0787 (2.1194) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][170/1251] eta 0:04:25 lr 0.000908 wd 0.0500 time 0.2410 (0.2455) data time 0.0012 (0.0034) model time 0.2399 (0.2425) loss 2.7016 (3.4010) grad_norm 1.4786 (2.1150) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][180/1251] eta 0:04:22 lr 0.000908 wd 0.0500 time 0.2372 (0.2452) data time 0.0011 (0.0033) model time 0.2361 (0.2423) loss 3.9202 (3.4032) grad_norm 1.8228 (2.1019) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][190/1251] eta 0:04:19 lr 0.000908 wd 0.0500 time 0.2341 (0.2449) data time 0.0009 (0.0032) model time 0.2332 (0.2421) loss 3.9559 (3.4139) grad_norm 2.0697 (2.0950) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][200/1251] eta 0:04:17 lr 0.000908 wd 0.0500 time 0.2475 (0.2448) data time 0.0007 (0.0031) model time 0.2467 (0.2421) loss 3.3142 (3.4128) grad_norm 1.7006 (2.0885) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][210/1251] eta 0:04:14 lr 0.000908 wd 0.0500 time 0.2381 (0.2448) data time 0.0010 (0.0030) model time 0.2371 (0.2421) loss 3.3693 (3.4238) grad_norm 1.8893 (2.0744) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][220/1251] eta 0:04:12 lr 0.000908 wd 0.0500 time 0.2441 (0.2446) data time 0.0009 (0.0029) model time 0.2433 (0.2419) loss 2.7321 (3.4183) grad_norm 2.9146 (2.0656) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][230/1251] eta 0:04:09 lr 0.000908 wd 0.0500 time 0.2379 (0.2445) data time 0.0009 (0.0028) model time 0.2371 (0.2419) loss 4.1586 (3.4132) grad_norm 3.6210 (2.0913) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][240/1251] eta 0:04:07 lr 0.000908 wd 0.0500 time 0.2435 (0.2444) data time 0.0013 (0.0027) model time 0.2422 (0.2418) loss 3.6344 (3.4285) grad_norm 1.9482 (2.0927) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][250/1251] eta 0:04:04 lr 0.000908 wd 0.0500 time 0.2451 (0.2443) data time 0.0013 (0.0027) model time 0.2438 (0.2419) loss 3.3801 (3.4363) grad_norm 2.1956 (2.0842) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][260/1251] eta 0:04:02 lr 0.000908 wd 0.0500 time 0.2365 (0.2450) data time 0.0011 (0.0026) model time 0.2353 (0.2428) loss 3.9507 
(3.4359) grad_norm 1.7242 (2.0906) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][270/1251] eta 0:04:00 lr 0.000908 wd 0.0500 time 0.2449 (0.2449) data time 0.0008 (0.0025) model time 0.2441 (0.2427) loss 2.6642 (3.4477) grad_norm 1.7149 (2.0847) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][280/1251] eta 0:03:57 lr 0.000908 wd 0.0500 time 0.2443 (0.2449) data time 0.0015 (0.0025) model time 0.2429 (0.2428) loss 3.6974 (3.4489) grad_norm 1.6458 (2.0802) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][290/1251] eta 0:03:55 lr 0.000908 wd 0.0500 time 0.2434 (0.2448) data time 0.0010 (0.0024) model time 0.2424 (0.2427) loss 3.8797 (3.4469) grad_norm 1.5257 (2.0729) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][300/1251] eta 0:03:52 lr 0.000908 wd 0.0500 time 0.2386 (0.2447) data time 0.0007 (0.0024) model time 0.2379 (0.2426) loss 3.7510 (3.4455) grad_norm 1.5219 (2.0652) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][310/1251] eta 0:03:50 lr 0.000908 wd 0.0500 time 0.2416 (0.2446) data time 0.0011 (0.0023) model time 0.2405 (0.2425) loss 3.7530 (3.4531) grad_norm 1.7545 (2.0669) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][320/1251] eta 0:03:47 lr 0.000908 wd 0.0500 time 0.2420 (0.2446) data time 0.0007 (0.0023) model time 0.2414 (0.2426) loss 4.1276 (3.4490) grad_norm 2.0645 (2.0756) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][330/1251] eta 0:03:45 lr 0.000908 wd 
0.0500 time 0.2374 (0.2445) data time 0.0010 (0.0023) model time 0.2364 (0.2425) loss 3.7290 (3.4424) grad_norm 1.6727 (2.0749) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][340/1251] eta 0:03:42 lr 0.000908 wd 0.0500 time 0.2405 (0.2445) data time 0.0011 (0.0022) model time 0.2394 (0.2425) loss 4.0238 (3.4535) grad_norm 2.2697 (2.0655) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][350/1251] eta 0:03:40 lr 0.000908 wd 0.0500 time 0.2369 (0.2444) data time 0.0007 (0.0022) model time 0.2362 (0.2424) loss 3.7115 (3.4561) grad_norm 2.1247 (2.0691) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][360/1251] eta 0:03:38 lr 0.000908 wd 0.0500 time 0.2407 (0.2449) data time 0.0011 (0.0022) model time 0.2395 (0.2431) loss 3.6142 (3.4533) grad_norm 1.6191 (2.0640) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][370/1251] eta 0:03:35 lr 0.000908 wd 0.0500 time 0.2375 (0.2449) data time 0.0010 (0.0021) model time 0.2365 (0.2430) loss 2.4323 (3.4485) grad_norm 1.9590 (2.0585) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][380/1251] eta 0:03:33 lr 0.000908 wd 0.0500 time 0.2478 (0.2448) data time 0.0009 (0.0021) model time 0.2469 (0.2430) loss 4.2691 (3.4369) grad_norm 2.5416 (2.0594) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][390/1251] eta 0:03:30 lr 0.000908 wd 0.0500 time 0.2464 (0.2447) data time 0.0010 (0.0021) model time 0.2454 (0.2429) loss 3.4495 (3.4315) grad_norm 2.4156 (2.0580) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:20 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][400/1251] eta 0:03:28 lr 0.000908 wd 0.0500 time 0.2337 (0.2446) data time 0.0007 (0.0020) model time 0.2329 (0.2428) loss 4.1412 (3.4325) grad_norm 2.2686 (2.0554) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][410/1251] eta 0:03:25 lr 0.000908 wd 0.0500 time 0.2420 (0.2446) data time 0.0013 (0.0020) model time 0.2408 (0.2428) loss 3.9942 (3.4351) grad_norm 2.2505 (2.0513) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][420/1251] eta 0:03:23 lr 0.000908 wd 0.0500 time 0.2423 (0.2446) data time 0.0008 (0.0020) model time 0.2415 (0.2428) loss 3.0287 (3.4393) grad_norm 1.8216 (2.0533) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][430/1251] eta 0:03:20 lr 0.000908 wd 0.0500 time 0.2373 (0.2445) data time 0.0010 (0.0020) model time 0.2363 (0.2427) loss 3.8324 (3.4422) grad_norm 1.7873 (2.0484) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][440/1251] eta 0:03:18 lr 0.000908 wd 0.0500 time 0.2499 (0.2445) data time 0.0007 (0.0020) model time 0.2492 (0.2428) loss 3.8211 (3.4486) grad_norm 2.7562 (2.0473) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][450/1251] eta 0:03:16 lr 0.000908 wd 0.0500 time 0.2404 (0.2449) data time 0.0012 (0.0019) model time 0.2392 (0.2433) loss 3.4131 (3.4466) grad_norm 1.7569 (2.0481) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][460/1251] eta 0:03:13 lr 0.000908 wd 0.0500 time 0.2425 (0.2449) data time 0.0009 (0.0019) model time 0.2416 (0.2432) loss 2.9592 
(3.4482) grad_norm 2.1584 (2.0462) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][470/1251] eta 0:03:11 lr 0.000907 wd 0.0500 time 0.2396 (0.2448) data time 0.0011 (0.0019) model time 0.2385 (0.2431) loss 3.1252 (3.4458) grad_norm 1.6677 (2.0468) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][480/1251] eta 0:03:08 lr 0.000907 wd 0.0500 time 0.2489 (0.2447) data time 0.0008 (0.0019) model time 0.2482 (0.2431) loss 3.0075 (3.4449) grad_norm 1.7791 (2.0423) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][490/1251] eta 0:03:06 lr 0.000907 wd 0.0500 time 0.2400 (0.2446) data time 0.0008 (0.0019) model time 0.2392 (0.2430) loss 2.4560 (3.4424) grad_norm 1.6482 (2.0394) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][500/1251] eta 0:03:03 lr 0.000907 wd 0.0500 time 0.2421 (0.2446) data time 0.0008 (0.0018) model time 0.2414 (0.2429) loss 3.7353 (3.4430) grad_norm 1.5723 (2.0339) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][510/1251] eta 0:03:01 lr 0.000907 wd 0.0500 time 0.2441 (0.2445) data time 0.0009 (0.0018) model time 0.2432 (0.2429) loss 3.4976 (3.4369) grad_norm 3.5799 (2.0358) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][520/1251] eta 0:02:58 lr 0.000907 wd 0.0500 time 0.2317 (0.2447) data time 0.0011 (0.0018) model time 0.2305 (0.2432) loss 3.5448 (3.4406) grad_norm 2.6782 (2.0442) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][530/1251] eta 0:02:56 lr 0.000907 wd 
0.0500 time 0.2379 (0.2455) data time 0.0009 (0.0018) model time 0.2369 (0.2440) loss 3.4149 (3.4353) grad_norm 1.5313 (2.0473) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][540/1251] eta 0:02:54 lr 0.000907 wd 0.0500 time 0.2422 (0.2454) data time 0.0009 (0.0018) model time 0.2413 (0.2439) loss 2.8132 (3.4309) grad_norm 2.1686 (2.0556) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][550/1251] eta 0:02:51 lr 0.000907 wd 0.0500 time 0.2420 (0.2453) data time 0.0010 (0.0018) model time 0.2411 (0.2438) loss 3.4021 (3.4369) grad_norm 3.4228 (2.0598) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][560/1251] eta 0:02:49 lr 0.000907 wd 0.0500 time 0.2485 (0.2452) data time 0.0009 (0.0018) model time 0.2476 (0.2437) loss 2.9967 (3.4350) grad_norm 1.5695 (2.0652) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][570/1251] eta 0:02:46 lr 0.000907 wd 0.0500 time 0.2441 (0.2452) data time 0.0010 (0.0017) model time 0.2430 (0.2437) loss 4.0650 (3.4361) grad_norm 1.4747 (2.0605) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][580/1251] eta 0:02:44 lr 0.000907 wd 0.0500 time 0.2450 (0.2452) data time 0.0010 (0.0017) model time 0.2440 (0.2437) loss 3.7765 (3.4414) grad_norm 2.2260 (2.0547) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][590/1251] eta 0:02:42 lr 0.000907 wd 0.0500 time 0.2488 (0.2451) data time 0.0010 (0.0017) model time 0.2477 (0.2436) loss 3.6067 (3.4414) grad_norm 1.7504 (2.0574) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:09 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][600/1251] eta 0:02:39 lr 0.000907 wd 0.0500 time 0.2390 (0.2450) data time 0.0008 (0.0017) model time 0.2383 (0.2436) loss 3.1610 (3.4361) grad_norm 1.9073 (2.0624) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][610/1251] eta 0:02:37 lr 0.000907 wd 0.0500 time 0.2385 (0.2450) data time 0.0010 (0.0017) model time 0.2374 (0.2435) loss 2.2845 (3.4333) grad_norm 2.3836 (2.0654) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][620/1251] eta 0:02:34 lr 0.000907 wd 0.0500 time 0.2469 (0.2452) data time 0.0010 (0.0017) model time 0.2459 (0.2437) loss 2.8431 (3.4361) grad_norm 1.5344 (2.0591) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][630/1251] eta 0:02:32 lr 0.000907 wd 0.0500 time 0.2397 (0.2451) data time 0.0011 (0.0017) model time 0.2387 (0.2437) loss 3.0035 (3.4350) grad_norm 1.3449 (2.0551) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][640/1251] eta 0:02:29 lr 0.000907 wd 0.0500 time 0.2357 (0.2450) data time 0.0008 (0.0017) model time 0.2350 (0.2436) loss 2.8610 (3.4318) grad_norm 1.8781 (2.0514) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][650/1251] eta 0:02:27 lr 0.000907 wd 0.0500 time 0.2416 (0.2450) data time 0.0009 (0.0017) model time 0.2408 (0.2435) loss 4.4146 (3.4358) grad_norm 1.5415 (2.0513) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][660/1251] eta 0:02:24 lr 0.000907 wd 0.0500 time 0.2362 (0.2450) data time 0.0011 (0.0016) model time 0.2351 (0.2435) loss 3.6606 
(3.4382) grad_norm 1.5080 (2.0498) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][670/1251] eta 0:02:22 lr 0.000907 wd 0.0500 time 0.2401 (0.2449) data time 0.0010 (0.0016) model time 0.2391 (0.2435) loss 3.4304 (3.4372) grad_norm 2.2131 (2.0516) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][680/1251] eta 0:02:19 lr 0.000907 wd 0.0500 time 0.2556 (0.2449) data time 0.0009 (0.0016) model time 0.2547 (0.2435) loss 3.7762 (3.4364) grad_norm 1.9935 (2.0500) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][690/1251] eta 0:02:17 lr 0.000907 wd 0.0500 time 0.2396 (0.2449) data time 0.0011 (0.0016) model time 0.2386 (0.2435) loss 3.8569 (3.4390) grad_norm 2.9176 (2.0509) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][700/1251] eta 0:02:14 lr 0.000907 wd 0.0500 time 0.2390 (0.2449) data time 0.0013 (0.0016) model time 0.2377 (0.2435) loss 3.1371 (3.4415) grad_norm 3.1345 (2.0561) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][710/1251] eta 0:02:12 lr 0.000907 wd 0.0500 time 0.2438 (0.2449) data time 0.0010 (0.0016) model time 0.2427 (0.2434) loss 3.3473 (3.4416) grad_norm 1.2429 (2.0512) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][720/1251] eta 0:02:09 lr 0.000907 wd 0.0500 time 0.2372 (0.2448) data time 0.0008 (0.0016) model time 0.2363 (0.2434) loss 3.5687 (3.4384) grad_norm 1.6240 (2.0469) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][730/1251] eta 0:02:07 lr 0.000907 wd 
0.0500 time 0.2419 (0.2448) data time 0.0007 (0.0016) model time 0.2412 (0.2433) loss 3.6483 (3.4359) grad_norm 1.5094 (2.0473) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][740/1251] eta 0:02:05 lr 0.000907 wd 0.0500 time 0.2417 (0.2447) data time 0.0010 (0.0016) model time 0.2406 (0.2433) loss 4.1798 (3.4389) grad_norm 1.8854 (2.0509) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][750/1251] eta 0:02:02 lr 0.000907 wd 0.0500 time 0.2426 (0.2447) data time 0.0007 (0.0016) model time 0.2419 (0.2433) loss 2.8454 (3.4394) grad_norm 1.4111 (2.0500) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][760/1251] eta 0:02:00 lr 0.000907 wd 0.0500 time 0.2397 (0.2446) data time 0.0009 (0.0016) model time 0.2388 (0.2432) loss 2.5998 (3.4352) grad_norm 1.5741 (2.0467) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][770/1251] eta 0:01:57 lr 0.000907 wd 0.0500 time 0.2424 (0.2449) data time 0.0007 (0.0016) model time 0.2416 (0.2435) loss 3.5965 (3.4384) grad_norm 2.2860 (2.0434) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][780/1251] eta 0:01:55 lr 0.000907 wd 0.0500 time 0.2406 (0.2448) data time 0.0009 (0.0015) model time 0.2397 (0.2435) loss 2.5482 (3.4376) grad_norm 2.0811 (2.0429) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][790/1251] eta 0:01:52 lr 0.000907 wd 0.0500 time 0.2422 (0.2448) data time 0.0009 (0.0015) model time 0.2413 (0.2434) loss 3.6413 (3.4397) grad_norm 1.6679 (2.0389) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:44:58 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][800/1251] eta 0:01:50 lr 0.000907 wd 0.0500 time 0.2455 (0.2448) data time 0.0010 (0.0015) model time 0.2445 (0.2434) loss 3.4339 (3.4421) grad_norm 1.6353 (2.0361) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][810/1251] eta 0:01:47 lr 0.000907 wd 0.0500 time 0.2439 (0.2447) data time 0.0009 (0.0015) model time 0.2430 (0.2434) loss 3.2472 (3.4417) grad_norm 2.4274 (2.0406) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][820/1251] eta 0:01:45 lr 0.000907 wd 0.0500 time 0.2428 (0.2447) data time 0.0012 (0.0015) model time 0.2416 (0.2433) loss 3.8690 (3.4415) grad_norm 1.9290 (2.0377) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][830/1251] eta 0:01:42 lr 0.000907 wd 0.0500 time 0.2500 (0.2447) data time 0.0010 (0.0015) model time 0.2490 (0.2433) loss 3.9091 (3.4409) grad_norm 2.6429 (2.0354) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][840/1251] eta 0:01:40 lr 0.000907 wd 0.0500 time 0.2365 (0.2447) data time 0.0013 (0.0015) model time 0.2352 (0.2433) loss 3.0981 (3.4442) grad_norm 2.0320 (2.0348) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][850/1251] eta 0:01:38 lr 0.000907 wd 0.0500 time 0.2510 (0.2446) data time 0.0009 (0.0015) model time 0.2501 (0.2433) loss 3.2431 (3.4477) grad_norm 1.9101 (2.0332) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][860/1251] eta 0:01:35 lr 0.000906 wd 0.0500 time 0.2408 (0.2446) data time 0.0008 (0.0015) model time 0.2400 (0.2432) loss 3.8451 
(3.4469) grad_norm 2.1844 (2.0308) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][870/1251] eta 0:01:33 lr 0.000906 wd 0.0500 time 0.2447 (0.2445) data time 0.0007 (0.0015) model time 0.2440 (0.2432) loss 2.3982 (3.4442) grad_norm 1.8949 (2.0312) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][880/1251] eta 0:01:30 lr 0.000906 wd 0.0500 time 0.2320 (0.2447) data time 0.0011 (0.0015) model time 0.2310 (0.2433) loss 3.8628 (3.4486) grad_norm 2.5481 (2.0340) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][890/1251] eta 0:01:28 lr 0.000906 wd 0.0500 time 0.2454 (0.2446) data time 0.0009 (0.0015) model time 0.2444 (0.2433) loss 3.8740 (3.4495) grad_norm 1.4616 (2.0338) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][900/1251] eta 0:01:25 lr 0.000906 wd 0.0500 time 0.2420 (0.2449) data time 0.0007 (0.0015) model time 0.2412 (0.2436) loss 2.1463 (3.4498) grad_norm 1.7609 (2.0321) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][910/1251] eta 0:01:23 lr 0.000906 wd 0.0500 time 0.2479 (0.2448) data time 0.0010 (0.0015) model time 0.2469 (0.2435) loss 3.3828 (3.4520) grad_norm 2.9231 (2.0319) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][920/1251] eta 0:01:21 lr 0.000906 wd 0.0500 time 0.2415 (0.2448) data time 0.0007 (0.0015) model time 0.2407 (0.2434) loss 2.5357 (3.4519) grad_norm 1.9070 (2.0322) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][930/1251] eta 0:01:18 lr 0.000906 wd 
0.0500 time 0.2370 (0.2447) data time 0.0011 (0.0015) model time 0.2360 (0.2434) loss 3.3155 (3.4516) grad_norm 1.8082 (2.0294) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][940/1251] eta 0:01:16 lr 0.000906 wd 0.0500 time 0.2485 (0.2447) data time 0.0008 (0.0015) model time 0.2478 (0.2434) loss 2.7724 (3.4513) grad_norm 2.1801 (2.0293) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][950/1251] eta 0:01:13 lr 0.000906 wd 0.0500 time 0.2332 (0.2447) data time 0.0011 (0.0015) model time 0.2321 (0.2433) loss 3.2726 (3.4538) grad_norm 1.9526 (2.0276) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][960/1251] eta 0:01:11 lr 0.000906 wd 0.0500 time 0.2389 (0.2446) data time 0.0007 (0.0014) model time 0.2382 (0.2433) loss 3.4614 (3.4551) grad_norm 2.0402 (2.0264) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][970/1251] eta 0:01:08 lr 0.000906 wd 0.0500 time 0.2381 (0.2446) data time 0.0010 (0.0014) model time 0.2371 (0.2432) loss 3.1706 (3.4531) grad_norm 1.6694 (2.0229) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][980/1251] eta 0:01:06 lr 0.000906 wd 0.0500 time 0.2351 (0.2445) data time 0.0009 (0.0014) model time 0.2342 (0.2432) loss 3.9631 (3.4535) grad_norm 2.3821 (2.0224) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][990/1251] eta 0:01:03 lr 0.000906 wd 0.0500 time 0.2427 (0.2447) data time 0.0009 (0.0014) model time 0.2418 (0.2434) loss 3.4759 (3.4532) grad_norm 2.2628 (2.0213) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:46 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1000/1251] eta 0:01:01 lr 0.000906 wd 0.0500 time 0.2304 (0.2447) data time 0.0010 (0.0014) model time 0.2294 (0.2434) loss 3.6191 (3.4537) grad_norm 1.7855 (2.0218) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1010/1251] eta 0:00:58 lr 0.000906 wd 0.0500 time 0.2374 (0.2446) data time 0.0007 (0.0014) model time 0.2367 (0.2433) loss 3.5340 (3.4550) grad_norm 1.6147 (2.0206) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1020/1251] eta 0:00:56 lr 0.000906 wd 0.0500 time 0.2341 (0.2446) data time 0.0010 (0.0014) model time 0.2331 (0.2433) loss 3.2386 (3.4556) grad_norm 2.0724 (2.0210) loss_scale 8192.0000 (4132.1058) mem 7379MB [2024-08-26 08:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1030/1251] eta 0:00:54 lr 0.000906 wd 0.0500 time 0.2487 (0.2446) data time 0.0010 (0.0014) model time 0.2476 (0.2433) loss 3.6029 (3.4564) grad_norm 2.0258 (2.0235) loss_scale 8192.0000 (4171.4840) mem 7379MB [2024-08-26 08:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1040/1251] eta 0:00:51 lr 0.000906 wd 0.0500 time 0.2435 (0.2445) data time 0.0007 (0.0014) model time 0.2428 (0.2432) loss 3.9904 (3.4599) grad_norm 1.6196 (2.0237) loss_scale 8192.0000 (4210.1057) mem 7379MB [2024-08-26 08:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1050/1251] eta 0:00:49 lr 0.000906 wd 0.0500 time 0.2317 (0.2449) data time 0.0007 (0.0014) model time 0.2309 (0.2436) loss 3.5703 (3.4585) grad_norm 2.3181 (2.0250) loss_scale 8192.0000 (4247.9924) mem 7379MB [2024-08-26 08:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1060/1251] eta 0:00:46 lr 0.000906 wd 0.0500 time 0.2485 (0.2449) data time 0.0007 (0.0014) model time 0.2478 (0.2436) loss 3.9430 
(3.4595) grad_norm 1.9047 (2.0262) loss_scale 8192.0000 (4285.1649) mem 7379MB [2024-08-26 08:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1070/1251] eta 0:00:44 lr 0.000906 wd 0.0500 time 0.2390 (0.2448) data time 0.0009 (0.0014) model time 0.2380 (0.2436) loss 3.5188 (3.4589) grad_norm 2.0967 (2.0285) loss_scale 8192.0000 (4321.6433) mem 7379MB [2024-08-26 08:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1080/1251] eta 0:00:41 lr 0.000906 wd 0.0500 time 0.2336 (0.2448) data time 0.0010 (0.0014) model time 0.2326 (0.2435) loss 2.7816 (3.4584) grad_norm 1.8062 (2.0291) loss_scale 8192.0000 (4357.4468) mem 7379MB [2024-08-26 08:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1090/1251] eta 0:00:39 lr 0.000906 wd 0.0500 time 0.2437 (0.2448) data time 0.0007 (0.0014) model time 0.2431 (0.2435) loss 4.1977 (3.4595) grad_norm 1.5270 (2.0277) loss_scale 8192.0000 (4392.5940) mem 7379MB [2024-08-26 08:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1100/1251] eta 0:00:36 lr 0.000906 wd 0.0500 time 0.2487 (0.2447) data time 0.0007 (0.0014) model time 0.2481 (0.2435) loss 4.2947 (3.4627) grad_norm 1.9881 (2.0294) loss_scale 8192.0000 (4427.1026) mem 7379MB [2024-08-26 08:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1110/1251] eta 0:00:34 lr 0.000906 wd 0.0500 time 0.2407 (0.2447) data time 0.0010 (0.0014) model time 0.2398 (0.2434) loss 2.2793 (3.4639) grad_norm 1.6677 (2.0297) loss_scale 8192.0000 (4460.9901) mem 7379MB [2024-08-26 08:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1120/1251] eta 0:00:32 lr 0.000906 wd 0.0500 time 0.2287 (0.2447) data time 0.0011 (0.0014) model time 0.2276 (0.2434) loss 3.7834 (3.4654) grad_norm 1.7198 (2.0314) loss_scale 8192.0000 (4494.2730) mem 7379MB [2024-08-26 08:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1130/1251] eta 0:00:29 lr 
0.000906 wd 0.0500 time 0.2501 (0.2447) data time 0.0010 (0.0014) model time 0.2492 (0.2434) loss 4.2488 (3.4646) grad_norm 1.6398 (2.0306) loss_scale 8192.0000 (4526.9673) mem 7379MB [2024-08-26 08:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1140/1251] eta 0:00:27 lr 0.000906 wd 0.0500 time 0.4692 (0.2449) data time 0.0009 (0.0014) model time 0.4683 (0.2436) loss 4.0429 (3.4658) grad_norm 2.6096 (2.0312) loss_scale 8192.0000 (4559.0885) mem 7379MB [2024-08-26 08:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1150/1251] eta 0:00:24 lr 0.000906 wd 0.0500 time 0.2401 (0.2448) data time 0.0011 (0.0014) model time 0.2391 (0.2436) loss 2.7609 (3.4647) grad_norm 2.7632 (2.0321) loss_scale 8192.0000 (4590.6516) mem 7379MB [2024-08-26 08:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1160/1251] eta 0:00:22 lr 0.000906 wd 0.0500 time 0.2348 (0.2448) data time 0.0009 (0.0014) model time 0.2339 (0.2435) loss 3.4517 (3.4660) grad_norm 2.7142 (2.0325) loss_scale 8192.0000 (4621.6710) mem 7379MB [2024-08-26 08:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1170/1251] eta 0:00:19 lr 0.000906 wd 0.0500 time 0.2562 (0.2448) data time 0.0008 (0.0014) model time 0.2554 (0.2435) loss 3.7018 (3.4677) grad_norm 2.1027 (2.0334) loss_scale 8192.0000 (4652.1605) mem 7379MB [2024-08-26 08:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1180/1251] eta 0:00:17 lr 0.000906 wd 0.0500 time 0.2412 (0.2448) data time 0.0007 (0.0014) model time 0.2405 (0.2435) loss 4.4297 (3.4701) grad_norm 2.1168 (2.0349) loss_scale 8192.0000 (4682.1338) mem 7379MB [2024-08-26 08:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1190/1251] eta 0:00:14 lr 0.000906 wd 0.0500 time 0.2836 (0.2450) data time 0.0007 (0.0014) model time 0.2828 (0.2437) loss 3.6035 (3.4708) grad_norm 2.0022 (2.0352) loss_scale 8192.0000 (4711.6037) mem 7379MB [2024-08-26 
08:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1200/1251] eta 0:00:12 lr 0.000906 wd 0.0500 time 0.2389 (0.2449) data time 0.0011 (0.0014) model time 0.2378 (0.2437) loss 3.7965 (3.4704) grad_norm 1.8815 (2.0335) loss_scale 8192.0000 (4740.5828) mem 7379MB [2024-08-26 08:46:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1210/1251] eta 0:00:10 lr 0.000906 wd 0.0500 time 0.2475 (0.2449) data time 0.0009 (0.0014) model time 0.2466 (0.2437) loss 4.0233 (3.4732) grad_norm 2.9605 (2.0343) loss_scale 8192.0000 (4769.0834) mem 7379MB [2024-08-26 08:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1220/1251] eta 0:00:07 lr 0.000906 wd 0.0500 time 0.2409 (0.2449) data time 0.0009 (0.0014) model time 0.2399 (0.2437) loss 2.9742 (3.4719) grad_norm 1.3420 (2.0347) loss_scale 8192.0000 (4797.1171) mem 7379MB [2024-08-26 08:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1230/1251] eta 0:00:05 lr 0.000906 wd 0.0500 time 0.2658 (0.2449) data time 0.0008 (0.0014) model time 0.2649 (0.2437) loss 4.0376 (3.4734) grad_norm 2.6675 (2.0340) loss_scale 8192.0000 (4824.6954) mem 7379MB [2024-08-26 08:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1240/1251] eta 0:00:02 lr 0.000905 wd 0.0500 time 0.2267 (0.2448) data time 0.0007 (0.0014) model time 0.2260 (0.2436) loss 3.7795 (3.4745) grad_norm 1.9600 (2.0334) loss_scale 8192.0000 (4851.8292) mem 7379MB [2024-08-26 08:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [75/300][1250/1251] eta 0:00:00 lr 0.000905 wd 0.0500 time 0.2238 (0.2447) data time 0.0007 (0.0013) model time 0.2231 (0.2434) loss 3.7324 (3.4741) grad_norm 1.6861 (2.0319) loss_scale 8192.0000 (4878.5292) mem 7379MB [2024-08-26 08:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 75 training takes 0:05:06 [2024-08-26 08:46:48 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 08:46:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 08:46:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/75_ckpt.pth saving......
[2024-08-26 08:46:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/75_ckpt.pth saved !!!
[2024-08-26 08:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.440 (0.440) Loss 0.5537 (0.5537) Acc@1 90.820 (90.820) Acc@5 98.047 (98.047) Mem 7379MB
[2024-08-26 08:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.110) Loss 0.8110 (0.8458) Acc@1 83.496 (81.561) Acc@5 95.898 (95.827) Mem 7379MB
[2024-08-26 08:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.096) Loss 1.2275 (0.8637) Acc@1 71.191 (80.501) Acc@5 91.699 (95.833) Mem 7379MB
[2024-08-26 08:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 1.4922 (0.9790) Acc@1 64.453 (77.958) Acc@5 87.988 (94.405) Mem 7379MB
[2024-08-26 08:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3145 (1.0480) Acc@1 69.727 (76.305) Acc@5 89.941 (93.524) Mem 7379MB
[2024-08-26 08:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 75.960 Acc@5 93.438
[2024-08-26 08:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.0%
[2024-08-26 08:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.756 (0.756) Loss 0.4641 (0.4641) Acc@1 91.797 (91.797) Acc@5 98.242 (98.242) Mem 7379MB
[2024-08-26 08:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.145) Loss 0.7422 (0.7232) Acc@1 85.059 (84.384) Acc@5 96.387 (96.804) Mem 7379MB
[2024-08-26 08:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.113) Loss 1.0371 (0.7465) Acc@1 75.684 (83.273) Acc@5 93.652 (96.805) Mem 7379MB
[2024-08-26 08:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.101) Loss 1.3066 (0.8528) Acc@1 66.602 (80.800) Acc@5 90.430 (95.492) Mem 7379MB
[2024-08-26 08:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.2080 (0.9101) Acc@1 70.508 (79.287) Acc@5 91.211 (94.831) Mem 7379MB
[2024-08-26 08:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.888 Acc@5 94.778
[2024-08-26 08:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.9%
[2024-08-26 08:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.89%
[2024-08-26 08:46:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 08:46:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 08:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][0/1251] eta 0:13:23 lr 0.000905 wd 0.0500 time 0.6424 (0.6424) data time 0.4052 (0.4052) model time 0.0000 (0.0000) loss 3.4127 (3.4127) grad_norm 2.4189 (2.4189) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][10/1251] eta 0:05:44 lr 0.000905 wd 0.0500 time 0.2421 (0.2777) data time 0.0009 (0.0377) model time 0.0000 (0.0000) loss 3.8881 (3.4049) grad_norm 2.0608 (2.3552) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][20/1251] eta 0:05:21 lr 0.000905 wd 0.0500 time 0.2530 (0.2613) data time 0.0009 (0.0202) model time 0.0000 (0.0000) loss 2.2783 (3.3260) grad_norm 1.4791 (2.1684) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][30/1251] eta 0:05:10 lr 0.000905 wd 0.0500 time 0.2394 (0.2546) data time 0.0009 (0.0141) model time 0.0000 (0.0000) loss 3.2495 (3.3245) grad_norm 2.0614 (2.0846) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][40/1251] eta 0:05:04 lr 0.000905 wd 0.0500 time 0.2447 (0.2515) data time 0.0011 (0.0112) model time 0.0000 (0.0000) loss 3.6887 (3.3390) grad_norm 1.5819 (2.0123) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][50/1251] eta 0:04:58 lr 0.000905 wd 0.0500 time 0.2414 (0.2489) data time 0.0009 (0.0092) model time 0.0000 (0.0000) loss 3.9414 (3.2965) grad_norm 2.0568 (2.0073) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][60/1251] eta 0:04:54 lr 0.000905 wd 0.0500 time 0.2445 (0.2474) data time 0.0009 (0.0079) model time 0.2436 (0.2387) loss 
2.3370 (3.3216) grad_norm 2.3049 (2.1047) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][70/1251] eta 0:04:51 lr 0.000905 wd 0.0500 time 0.2472 (0.2471) data time 0.0008 (0.0069) model time 0.2464 (0.2414) loss 2.3219 (3.3217) grad_norm 2.5388 (2.0848) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][80/1251] eta 0:04:48 lr 0.000905 wd 0.0500 time 0.2389 (0.2463) data time 0.0008 (0.0062) model time 0.2381 (0.2408) loss 3.4607 (3.3607) grad_norm 2.0768 (2.0582) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][90/1251] eta 0:04:47 lr 0.000905 wd 0.0500 time 0.2442 (0.2480) data time 0.0010 (0.0056) model time 0.2432 (0.2457) loss 3.6293 (3.3552) grad_norm 1.7266 (2.0314) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][100/1251] eta 0:04:44 lr 0.000905 wd 0.0500 time 0.2409 (0.2472) data time 0.0010 (0.0051) model time 0.2399 (0.2444) loss 3.5341 (3.3633) grad_norm 1.5375 (2.0272) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][110/1251] eta 0:04:43 lr 0.000905 wd 0.0500 time 0.2336 (0.2483) data time 0.0013 (0.0048) model time 0.2324 (0.2468) loss 3.7252 (3.3748) grad_norm 1.8772 (2.0118) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][120/1251] eta 0:04:40 lr 0.000905 wd 0.0500 time 0.2363 (0.2476) data time 0.0009 (0.0045) model time 0.2354 (0.2457) loss 3.4043 (3.3992) grad_norm 2.2049 (2.0243) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][130/1251] eta 0:04:37 lr 
0.000905 wd 0.0500 time 0.2439 (0.2471) data time 0.0007 (0.0042) model time 0.2432 (0.2449) loss 3.5440 (3.3862) grad_norm 2.0017 (2.0271) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][140/1251] eta 0:04:34 lr 0.000905 wd 0.0500 time 0.2397 (0.2466) data time 0.0010 (0.0040) model time 0.2386 (0.2443) loss 3.9643 (3.4018) grad_norm 1.9848 (2.0085) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][150/1251] eta 0:04:32 lr 0.000905 wd 0.0500 time 0.2500 (0.2475) data time 0.0007 (0.0038) model time 0.2493 (0.2458) loss 3.9686 (3.4093) grad_norm 1.9133 (2.0014) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][160/1251] eta 0:04:29 lr 0.000905 wd 0.0500 time 0.2347 (0.2471) data time 0.0008 (0.0036) model time 0.2340 (0.2453) loss 4.1140 (3.4181) grad_norm 1.4967 (1.9871) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][170/1251] eta 0:04:26 lr 0.000905 wd 0.0500 time 0.2384 (0.2470) data time 0.0010 (0.0035) model time 0.2374 (0.2451) loss 3.7456 (3.4245) grad_norm 1.7536 (1.9737) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][180/1251] eta 0:04:24 lr 0.000905 wd 0.0500 time 0.2442 (0.2467) data time 0.0010 (0.0033) model time 0.2432 (0.2448) loss 2.3649 (3.4243) grad_norm 2.6680 (1.9703) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][190/1251] eta 0:04:21 lr 0.000905 wd 0.0500 time 0.2408 (0.2465) data time 0.0009 (0.0032) model time 0.2399 (0.2446) loss 2.3831 (3.4224) grad_norm 1.5517 (1.9755) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:47 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][200/1251] eta 0:04:18 lr 0.000905 wd 0.0500 time 0.2428 (0.2463) data time 0.0007 (0.0031) model time 0.2421 (0.2443) loss 4.1082 (3.4417) grad_norm 1.8743 (1.9727) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][210/1251] eta 0:04:16 lr 0.000905 wd 0.0500 time 0.2444 (0.2461) data time 0.0008 (0.0030) model time 0.2436 (0.2441) loss 3.9763 (3.4653) grad_norm 1.7221 (1.9825) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][220/1251] eta 0:04:13 lr 0.000905 wd 0.0500 time 0.2394 (0.2457) data time 0.0007 (0.0029) model time 0.2386 (0.2438) loss 3.7386 (3.4724) grad_norm 2.0185 (1.9920) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][230/1251] eta 0:04:10 lr 0.000905 wd 0.0500 time 0.2340 (0.2454) data time 0.0009 (0.0028) model time 0.2331 (0.2434) loss 3.0510 (3.4685) grad_norm 1.5411 (1.9793) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][240/1251] eta 0:04:08 lr 0.000905 wd 0.0500 time 0.2445 (0.2453) data time 0.0007 (0.0028) model time 0.2438 (0.2434) loss 4.0493 (3.4634) grad_norm 2.5053 (1.9759) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][250/1251] eta 0:04:05 lr 0.000905 wd 0.0500 time 0.2458 (0.2453) data time 0.0008 (0.0027) model time 0.2450 (0.2433) loss 4.0765 (3.4618) grad_norm 1.4485 (1.9708) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][260/1251] eta 0:04:02 lr 0.000905 wd 0.0500 time 0.2349 (0.2451) data time 0.0007 (0.0026) model time 0.2342 (0.2432) loss 3.6369 
(3.4631) grad_norm 1.7320 (1.9688) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][270/1251] eta 0:04:01 lr 0.000905 wd 0.0500 time 0.2440 (0.2465) data time 0.0011 (0.0026) model time 0.2429 (0.2450) loss 3.7273 (3.4651) grad_norm 1.7199 (1.9649) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][280/1251] eta 0:03:59 lr 0.000905 wd 0.0500 time 0.2422 (0.2463) data time 0.0007 (0.0025) model time 0.2415 (0.2447) loss 3.7855 (3.4711) grad_norm 1.7032 (1.9640) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][290/1251] eta 0:03:56 lr 0.000905 wd 0.0500 time 0.2403 (0.2461) data time 0.0009 (0.0025) model time 0.2394 (0.2445) loss 4.3323 (3.4767) grad_norm 2.6446 (1.9646) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][300/1251] eta 0:03:53 lr 0.000905 wd 0.0500 time 0.2389 (0.2458) data time 0.0010 (0.0024) model time 0.2379 (0.2442) loss 3.4451 (3.4847) grad_norm 1.5911 (1.9689) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][310/1251] eta 0:03:51 lr 0.000905 wd 0.0500 time 0.2385 (0.2457) data time 0.0011 (0.0024) model time 0.2373 (0.2440) loss 1.9917 (3.4788) grad_norm 2.6943 (1.9785) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][320/1251] eta 0:03:49 lr 0.000905 wd 0.0500 time 0.2340 (0.2469) data time 0.0008 (0.0023) model time 0.2331 (0.2455) loss 2.4478 (3.4756) grad_norm 1.4200 (1.9823) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][330/1251] eta 0:03:47 lr 0.000905 wd 
0.0500 time 0.2407 (0.2474) data time 0.0010 (0.0023) model time 0.2397 (0.2461) loss 3.6570 (3.4683) grad_norm 1.6735 (1.9905) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][340/1251] eta 0:03:45 lr 0.000905 wd 0.0500 time 0.2481 (0.2472) data time 0.0009 (0.0022) model time 0.2472 (0.2459) loss 3.3276 (3.4615) grad_norm 1.4954 (1.9906) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][350/1251] eta 0:03:42 lr 0.000905 wd 0.0500 time 0.2431 (0.2470) data time 0.0008 (0.0022) model time 0.2423 (0.2457) loss 2.2552 (3.4570) grad_norm 1.5515 (1.9919) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][360/1251] eta 0:03:39 lr 0.000905 wd 0.0500 time 0.2372 (0.2469) data time 0.0011 (0.0022) model time 0.2360 (0.2455) loss 3.2862 (3.4570) grad_norm 2.7876 (1.9956) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][370/1251] eta 0:03:37 lr 0.000904 wd 0.0500 time 0.2421 (0.2472) data time 0.0007 (0.0021) model time 0.2415 (0.2459) loss 4.1706 (3.4596) grad_norm 2.9172 (2.0036) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][380/1251] eta 0:03:35 lr 0.000904 wd 0.0500 time 0.2417 (0.2471) data time 0.0010 (0.0021) model time 0.2407 (0.2458) loss 3.4791 (3.4624) grad_norm 1.9261 (2.0014) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][390/1251] eta 0:03:32 lr 0.000904 wd 0.0500 time 0.2402 (0.2469) data time 0.0010 (0.0021) model time 0.2393 (0.2456) loss 3.6440 (3.4565) grad_norm 1.6839 (1.9958) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][400/1251] eta 0:03:30 lr 0.000904 wd 0.0500 time 0.2369 (0.2468) data time 0.0013 (0.0021) model time 0.2356 (0.2455) loss 3.7956 (3.4577) grad_norm 1.7392 (1.9915) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][410/1251] eta 0:03:27 lr 0.000904 wd 0.0500 time 0.2312 (0.2467) data time 0.0011 (0.0020) model time 0.2301 (0.2453) loss 2.9459 (3.4595) grad_norm 1.3865 (1.9912) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][420/1251] eta 0:03:24 lr 0.000904 wd 0.0500 time 0.2480 (0.2465) data time 0.0008 (0.0020) model time 0.2471 (0.2452) loss 3.4810 (3.4585) grad_norm 2.0584 (1.9861) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][430/1251] eta 0:03:22 lr 0.000904 wd 0.0500 time 0.2326 (0.2464) data time 0.0011 (0.0020) model time 0.2315 (0.2451) loss 3.8729 (3.4629) grad_norm 2.4928 (1.9886) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][440/1251] eta 0:03:19 lr 0.000904 wd 0.0500 time 0.2420 (0.2463) data time 0.0008 (0.0020) model time 0.2412 (0.2450) loss 3.5680 (3.4652) grad_norm 1.3267 (1.9900) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][450/1251] eta 0:03:17 lr 0.000904 wd 0.0500 time 0.2462 (0.2463) data time 0.0010 (0.0019) model time 0.2452 (0.2449) loss 3.6635 (3.4577) grad_norm 1.6221 (1.9849) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][460/1251] eta 0:03:14 lr 0.000904 wd 0.0500 time 0.2479 (0.2462) data time 0.0010 (0.0019) model time 0.2468 (0.2449) loss 4.0850 
(3.4633) grad_norm 1.4972 (1.9834) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][470/1251] eta 0:03:12 lr 0.000904 wd 0.0500 time 0.2369 (0.2461) data time 0.0011 (0.0019) model time 0.2358 (0.2447) loss 3.7085 (3.4589) grad_norm 2.1669 (1.9820) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][480/1251] eta 0:03:09 lr 0.000904 wd 0.0500 time 0.2477 (0.2460) data time 0.0007 (0.0019) model time 0.2470 (0.2446) loss 3.0908 (3.4613) grad_norm 2.1580 (1.9806) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][490/1251] eta 0:03:07 lr 0.000904 wd 0.0500 time 0.2358 (0.2459) data time 0.0008 (0.0019) model time 0.2349 (0.2446) loss 4.0079 (3.4653) grad_norm 1.3303 (1.9789) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][500/1251] eta 0:03:04 lr 0.000904 wd 0.0500 time 0.2476 (0.2459) data time 0.0009 (0.0019) model time 0.2467 (0.2445) loss 4.1614 (3.4739) grad_norm 1.4249 (1.9761) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][510/1251] eta 0:03:02 lr 0.000904 wd 0.0500 time 0.2374 (0.2458) data time 0.0008 (0.0018) model time 0.2365 (0.2445) loss 3.4682 (3.4720) grad_norm 3.4830 (1.9812) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][520/1251] eta 0:02:59 lr 0.000904 wd 0.0500 time 0.2472 (0.2457) data time 0.0009 (0.0018) model time 0.2463 (0.2444) loss 3.7210 (3.4742) grad_norm 2.8115 (1.9842) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][530/1251] eta 0:02:57 lr 0.000904 wd 
0.0500 time 0.2395 (0.2457) data time 0.0007 (0.0018) model time 0.2387 (0.2443) loss 4.0207 (3.4792) grad_norm 1.2659 (1.9821) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][540/1251] eta 0:02:54 lr 0.000904 wd 0.0500 time 0.2428 (0.2456) data time 0.0010 (0.0018) model time 0.2418 (0.2442) loss 3.9955 (3.4703) grad_norm 2.3382 (1.9820) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][550/1251] eta 0:02:52 lr 0.000904 wd 0.0500 time 0.2424 (0.2456) data time 0.0008 (0.0018) model time 0.2416 (0.2442) loss 2.1063 (3.4710) grad_norm 1.8428 (1.9846) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][560/1251] eta 0:02:49 lr 0.000904 wd 0.0500 time 0.2353 (0.2455) data time 0.0007 (0.0018) model time 0.2345 (0.2441) loss 4.2555 (3.4721) grad_norm 1.5461 (1.9842) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][570/1251] eta 0:02:47 lr 0.000904 wd 0.0500 time 0.2448 (0.2454) data time 0.0007 (0.0017) model time 0.2441 (0.2441) loss 3.1152 (3.4724) grad_norm 1.6743 (1.9813) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][580/1251] eta 0:02:44 lr 0.000904 wd 0.0500 time 0.2446 (0.2454) data time 0.0008 (0.0017) model time 0.2438 (0.2440) loss 2.5024 (3.4660) grad_norm 1.9477 (1.9806) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][590/1251] eta 0:02:42 lr 0.000904 wd 0.0500 time 0.2373 (0.2453) data time 0.0009 (0.0017) model time 0.2364 (0.2440) loss 4.1431 (3.4613) grad_norm 1.6476 (1.9824) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][600/1251] eta 0:02:39 lr 0.000904 wd 0.0500 time 0.2463 (0.2453) data time 0.0008 (0.0017) model time 0.2455 (0.2439) loss 1.9521 (3.4574) grad_norm 2.4427 (1.9797) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][610/1251] eta 0:02:37 lr 0.000904 wd 0.0500 time 0.2483 (0.2452) data time 0.0008 (0.0017) model time 0.2475 (0.2439) loss 3.1544 (3.4555) grad_norm 3.1220 (1.9792) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][620/1251] eta 0:02:34 lr 0.000904 wd 0.0500 time 0.2429 (0.2451) data time 0.0009 (0.0017) model time 0.2420 (0.2438) loss 3.0410 (3.4531) grad_norm 2.2520 (1.9792) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][630/1251] eta 0:02:32 lr 0.000904 wd 0.0500 time 0.2426 (0.2454) data time 0.0009 (0.0017) model time 0.2417 (0.2441) loss 4.4559 (3.4536) grad_norm 1.7990 (1.9814) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][640/1251] eta 0:02:29 lr 0.000904 wd 0.0500 time 0.2390 (0.2454) data time 0.0009 (0.0017) model time 0.2381 (0.2441) loss 3.2001 (3.4513) grad_norm 1.9531 (1.9775) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][650/1251] eta 0:02:27 lr 0.000904 wd 0.0500 time 0.2420 (0.2453) data time 0.0012 (0.0017) model time 0.2408 (0.2440) loss 3.3972 (3.4544) grad_norm 1.8422 (1.9791) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][660/1251] eta 0:02:24 lr 0.000904 wd 0.0500 time 0.2449 (0.2453) data time 0.0009 (0.0016) model time 0.2440 (0.2440) loss 3.2626 
(3.4570) grad_norm 1.7508 (1.9797) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][670/1251] eta 0:02:22 lr 0.000904 wd 0.0500 time 0.2502 (0.2455) data time 0.0007 (0.0016) model time 0.2494 (0.2443) loss 3.3131 (3.4578) grad_norm 1.9854 (1.9775) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][680/1251] eta 0:02:20 lr 0.000904 wd 0.0500 time 0.2513 (0.2455) data time 0.0012 (0.0016) model time 0.2501 (0.2442) loss 3.1503 (3.4549) grad_norm 1.7374 (1.9786) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][690/1251] eta 0:02:17 lr 0.000904 wd 0.0500 time 0.2354 (0.2454) data time 0.0010 (0.0016) model time 0.2344 (0.2441) loss 3.7213 (3.4591) grad_norm 2.2089 (1.9776) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][700/1251] eta 0:02:15 lr 0.000904 wd 0.0500 time 0.2379 (0.2454) data time 0.0009 (0.0016) model time 0.2371 (0.2441) loss 4.2832 (3.4622) grad_norm 2.1693 (1.9770) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][710/1251] eta 0:02:12 lr 0.000904 wd 0.0500 time 0.2454 (0.2453) data time 0.0010 (0.0016) model time 0.2444 (0.2440) loss 2.7234 (3.4616) grad_norm 2.2498 (1.9772) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][720/1251] eta 0:02:10 lr 0.000904 wd 0.0500 time 0.2400 (0.2452) data time 0.0012 (0.0016) model time 0.2388 (0.2439) loss 4.1694 (3.4642) grad_norm 2.5517 (1.9777) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][730/1251] eta 0:02:07 lr 0.000904 wd 
0.0500 time 0.2414 (0.2452) data time 0.0010 (0.0016) model time 0.2404 (0.2439) loss 3.7225 (3.4655) grad_norm 1.8273 (1.9823) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][740/1251] eta 0:02:05 lr 0.000904 wd 0.0500 time 0.2370 (0.2451) data time 0.0008 (0.0016) model time 0.2363 (0.2439) loss 4.4331 (3.4686) grad_norm 1.8666 (1.9897) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][750/1251] eta 0:02:02 lr 0.000903 wd 0.0500 time 0.2503 (0.2451) data time 0.0009 (0.0016) model time 0.2493 (0.2438) loss 3.1061 (3.4679) grad_norm 1.7718 (1.9914) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][760/1251] eta 0:02:00 lr 0.000903 wd 0.0500 time 0.2379 (0.2450) data time 0.0008 (0.0016) model time 0.2371 (0.2437) loss 2.9916 (3.4680) grad_norm 2.1444 (1.9934) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][770/1251] eta 0:01:57 lr 0.000903 wd 0.0500 time 0.2396 (0.2450) data time 0.0010 (0.0016) model time 0.2386 (0.2437) loss 3.9672 (3.4653) grad_norm 2.8438 (1.9941) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][780/1251] eta 0:01:55 lr 0.000903 wd 0.0500 time 0.2438 (0.2451) data time 0.0010 (0.0016) model time 0.2429 (0.2438) loss 2.7549 (3.4631) grad_norm 1.5903 (1.9941) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][790/1251] eta 0:01:53 lr 0.000903 wd 0.0500 time 0.2431 (0.2454) data time 0.0009 (0.0015) model time 0.2421 (0.2442) loss 2.4465 (3.4619) grad_norm 1.8203 (1.9928) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:14 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][800/1251] eta 0:01:50 lr 0.000903 wd 0.0500 time 0.2410 (0.2454) data time 0.0008 (0.0015) model time 0.2402 (0.2441) loss 3.8763 (3.4640) grad_norm 2.3271 (1.9934) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][810/1251] eta 0:01:48 lr 0.000903 wd 0.0500 time 0.2371 (0.2453) data time 0.0010 (0.0015) model time 0.2362 (0.2441) loss 3.8208 (3.4627) grad_norm 2.0324 (1.9955) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][820/1251] eta 0:01:45 lr 0.000903 wd 0.0500 time 0.2388 (0.2455) data time 0.0008 (0.0015) model time 0.2380 (0.2443) loss 3.3332 (3.4624) grad_norm 1.2411 (1.9939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][830/1251] eta 0:01:43 lr 0.000903 wd 0.0500 time 0.2457 (0.2455) data time 0.0010 (0.0015) model time 0.2447 (0.2442) loss 3.5286 (3.4624) grad_norm 1.6097 (1.9948) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][840/1251] eta 0:01:41 lr 0.000903 wd 0.0500 time 0.2400 (0.2459) data time 0.0007 (0.0015) model time 0.2393 (0.2447) loss 4.4556 (3.4634) grad_norm 3.4041 (1.9975) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][850/1251] eta 0:01:38 lr 0.000903 wd 0.0500 time 0.2417 (0.2461) data time 0.0008 (0.0015) model time 0.2409 (0.2449) loss 3.7205 (3.4623) grad_norm 1.4980 (1.9970) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][860/1251] eta 0:01:36 lr 0.000903 wd 0.0500 time 0.2364 (0.2463) data time 0.0007 (0.0015) model time 0.2357 (0.2451) loss 3.8770 
(3.4628) grad_norm 1.7957 (1.9978) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][870/1251] eta 0:01:33 lr 0.000903 wd 0.0500 time 0.2436 (0.2462) data time 0.0008 (0.0015) model time 0.2428 (0.2451) loss 2.9942 (3.4640) grad_norm 1.4789 (1.9990) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][880/1251] eta 0:01:31 lr 0.000903 wd 0.0500 time 0.2369 (0.2462) data time 0.0009 (0.0015) model time 0.2360 (0.2450) loss 4.2626 (3.4682) grad_norm 2.8845 (2.0005) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][890/1251] eta 0:01:28 lr 0.000903 wd 0.0500 time 0.2485 (0.2464) data time 0.0011 (0.0015) model time 0.2474 (0.2452) loss 2.5629 (3.4677) grad_norm 1.4778 (1.9991) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][900/1251] eta 0:01:26 lr 0.000903 wd 0.0500 time 0.2477 (0.2464) data time 0.0010 (0.0015) model time 0.2467 (0.2452) loss 3.4111 (3.4663) grad_norm 1.5630 (1.9964) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][910/1251] eta 0:01:24 lr 0.000903 wd 0.0500 time 0.2441 (0.2463) data time 0.0008 (0.0015) model time 0.2434 (0.2452) loss 3.5898 (3.4659) grad_norm 2.0130 (1.9951) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][920/1251] eta 0:01:21 lr 0.000903 wd 0.0500 time 0.2432 (0.2463) data time 0.0011 (0.0015) model time 0.2421 (0.2451) loss 3.9443 (3.4671) grad_norm 1.7672 (1.9972) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][930/1251] eta 0:01:19 lr 0.000903 wd 
0.0500 time 0.2432 (0.2462) data time 0.0008 (0.0015) model time 0.2424 (0.2450) loss 2.9892 (3.4666) grad_norm 1.5157 (1.9984) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][940/1251] eta 0:01:16 lr 0.000903 wd 0.0500 time 0.2365 (0.2462) data time 0.0009 (0.0015) model time 0.2356 (0.2450) loss 3.7243 (3.4692) grad_norm 1.6615 (1.9975) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][950/1251] eta 0:01:14 lr 0.000903 wd 0.0500 time 0.2317 (0.2461) data time 0.0009 (0.0015) model time 0.2308 (0.2449) loss 2.6715 (3.4699) grad_norm 1.8486 (1.9957) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][960/1251] eta 0:01:11 lr 0.000903 wd 0.0500 time 0.2436 (0.2461) data time 0.0009 (0.0015) model time 0.2427 (0.2449) loss 4.6774 (3.4737) grad_norm 2.7347 (1.9973) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][970/1251] eta 0:01:09 lr 0.000903 wd 0.0500 time 0.2430 (0.2460) data time 0.0008 (0.0015) model time 0.2421 (0.2448) loss 3.9041 (3.4726) grad_norm 2.3037 (2.0004) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][980/1251] eta 0:01:06 lr 0.000903 wd 0.0500 time 0.2497 (0.2460) data time 0.0010 (0.0015) model time 0.2488 (0.2448) loss 2.5375 (3.4724) grad_norm 1.8602 (1.9995) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][990/1251] eta 0:01:04 lr 0.000903 wd 0.0500 time 0.2356 (0.2460) data time 0.0009 (0.0014) model time 0.2347 (0.2448) loss 4.2086 (3.4737) grad_norm 2.0986 (2.0012) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:04 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1000/1251] eta 0:01:01 lr 0.000903 wd 0.0500 time 0.2422 (0.2459) data time 0.0009 (0.0014) model time 0.2412 (0.2447) loss 3.8068 (3.4739) grad_norm 2.1417 (2.0033) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1010/1251] eta 0:00:59 lr 0.000903 wd 0.0500 time 0.2434 (0.2459) data time 0.0011 (0.0014) model time 0.2423 (0.2447) loss 3.3484 (3.4759) grad_norm 2.3807 (2.0026) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1020/1251] eta 0:00:56 lr 0.000903 wd 0.0500 time 0.2413 (0.2459) data time 0.0010 (0.0014) model time 0.2403 (0.2447) loss 2.6994 (3.4757) grad_norm 1.9067 (2.0012) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1030/1251] eta 0:00:54 lr 0.000903 wd 0.0500 time 0.2519 (0.2458) data time 0.0008 (0.0014) model time 0.2511 (0.2446) loss 2.5870 (3.4779) grad_norm 2.1258 (2.0010) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1040/1251] eta 0:00:51 lr 0.000903 wd 0.0500 time 0.2312 (0.2460) data time 0.0009 (0.0014) model time 0.2303 (0.2448) loss 3.0750 (3.4743) grad_norm 1.5702 (2.0021) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1050/1251] eta 0:00:49 lr 0.000903 wd 0.0500 time 0.2443 (0.2459) data time 0.0011 (0.0014) model time 0.2433 (0.2448) loss 3.8923 (3.4738) grad_norm 1.7966 (2.0040) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1060/1251] eta 0:00:46 lr 0.000903 wd 0.0500 time 0.2536 (0.2459) data time 0.0010 (0.0014) model time 0.2526 (0.2447) loss 3.8697 
(3.4732) grad_norm 1.3955 (2.0076) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1070/1251] eta 0:00:44 lr 0.000903 wd 0.0500 time 0.2495 (0.2459) data time 0.0009 (0.0014) model time 0.2486 (0.2447) loss 3.3861 (3.4748) grad_norm 2.4320 (2.0070) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1080/1251] eta 0:00:42 lr 0.000903 wd 0.0500 time 0.2396 (0.2458) data time 0.0009 (0.0014) model time 0.2387 (0.2446) loss 3.6662 (3.4746) grad_norm 3.3882 (2.0087) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1090/1251] eta 0:00:39 lr 0.000903 wd 0.0500 time 0.2397 (0.2458) data time 0.0010 (0.0014) model time 0.2386 (0.2446) loss 3.1622 (3.4756) grad_norm 1.8574 (2.0108) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1100/1251] eta 0:00:37 lr 0.000903 wd 0.0500 time 0.2399 (0.2458) data time 0.0009 (0.0014) model time 0.2390 (0.2446) loss 3.5137 (3.4764) grad_norm 1.6040 (2.0112) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1110/1251] eta 0:00:34 lr 0.000903 wd 0.0500 time 0.2353 (0.2457) data time 0.0010 (0.0014) model time 0.2343 (0.2445) loss 4.1736 (3.4763) grad_norm 2.8639 (2.0137) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1120/1251] eta 0:00:32 lr 0.000903 wd 0.0500 time 0.2504 (0.2457) data time 0.0008 (0.0014) model time 0.2496 (0.2445) loss 2.9641 (3.4759) grad_norm 1.7295 (2.0127) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1130/1251] eta 0:00:29 lr 
0.000902 wd 0.0500 time 0.2414 (0.2457) data time 0.0011 (0.0014) model time 0.2402 (0.2445) loss 3.4225 (3.4751) grad_norm 1.6395 (2.0105) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1140/1251] eta 0:00:27 lr 0.000902 wd 0.0500 time 0.2412 (0.2456) data time 0.0010 (0.0014) model time 0.2402 (0.2444) loss 3.3681 (3.4744) grad_norm 1.7405 (2.0089) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1150/1251] eta 0:00:24 lr 0.000902 wd 0.0500 time 0.2367 (0.2457) data time 0.0009 (0.0014) model time 0.2358 (0.2445) loss 3.2108 (3.4741) grad_norm 1.7048 (2.0099) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1160/1251] eta 0:00:22 lr 0.000902 wd 0.0500 time 0.2393 (0.2456) data time 0.0009 (0.0014) model time 0.2384 (0.2445) loss 2.9643 (3.4724) grad_norm 1.7065 (2.0101) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1170/1251] eta 0:00:19 lr 0.000902 wd 0.0500 time 0.2388 (0.2456) data time 0.0007 (0.0014) model time 0.2380 (0.2444) loss 2.5480 (3.4720) grad_norm 2.7892 (2.0101) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1180/1251] eta 0:00:17 lr 0.000902 wd 0.0500 time 0.2373 (0.2456) data time 0.0008 (0.0014) model time 0.2365 (0.2444) loss 4.1514 (3.4718) grad_norm 1.9327 (2.0097) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1190/1251] eta 0:00:14 lr 0.000902 wd 0.0500 time 0.2386 (0.2455) data time 0.0009 (0.0014) model time 0.2378 (0.2443) loss 3.4233 (3.4718) grad_norm 2.0389 (2.0090) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 
08:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1200/1251] eta 0:00:12 lr 0.000902 wd 0.0500 time 0.2422 (0.2455) data time 0.0010 (0.0014) model time 0.2412 (0.2443) loss 4.4045 (3.4727) grad_norm 2.1471 (2.0093) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1210/1251] eta 0:00:10 lr 0.000902 wd 0.0500 time 0.2448 (0.2454) data time 0.0009 (0.0014) model time 0.2439 (0.2443) loss 2.7639 (3.4719) grad_norm 2.0328 (2.0069) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:51:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1220/1251] eta 0:00:07 lr 0.000902 wd 0.0500 time 0.2504 (0.2454) data time 0.0009 (0.0014) model time 0.2495 (0.2442) loss 2.7576 (3.4737) grad_norm 1.6982 (2.0060) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1230/1251] eta 0:00:05 lr 0.000902 wd 0.0500 time 0.2434 (0.2454) data time 0.0007 (0.0014) model time 0.2427 (0.2442) loss 3.9749 (3.4751) grad_norm 1.7660 (2.0052) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1240/1251] eta 0:00:02 lr 0.000902 wd 0.0500 time 0.2249 (0.2453) data time 0.0008 (0.0014) model time 0.2241 (0.2441) loss 3.2627 (3.4754) grad_norm 2.1972 (2.0039) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [76/300][1250/1251] eta 0:00:00 lr 0.000902 wd 0.0500 time 0.2237 (0.2451) data time 0.0005 (0.0014) model time 0.2231 (0.2439) loss 3.7389 (3.4756) grad_norm 2.0075 (2.0030) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 76 training takes 0:05:06 [2024-08-26 08:52:05 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 08:52:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 08:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.5317 (0.5317) Acc@1 90.137 (90.137) Acc@5 97.559 (97.559) Mem 7379MB [2024-08-26 08:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.109) Loss 0.8506 (0.8020) Acc@1 82.617 (82.244) Acc@5 95.996 (96.103) Mem 7379MB [2024-08-26 08:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.096) Loss 1.1943 (0.8337) Acc@1 71.875 (81.041) Acc@5 91.113 (96.070) Mem 7379MB [2024-08-26 08:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.091) Loss 1.4385 (0.9503) Acc@1 64.746 (78.538) Acc@5 88.574 (94.572) Mem 7379MB [2024-08-26 08:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3154 (1.0192) Acc@1 69.824 (76.872) Acc@5 90.430 (93.755) Mem 7379MB [2024-08-26 08:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.532 Acc@5 93.648 [2024-08-26 08:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.5% [2024-08-26 08:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 76.53% [2024-08-26 08:52:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 08:52:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 08:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.414 (0.414) Loss 0.4626 (0.4626) Acc@1 91.992 (91.992) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 08:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.108) Loss 0.7422 (0.7217) Acc@1 84.961 (84.499) Acc@5 96.387 (96.822) Mem 7379MB [2024-08-26 08:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.095) Loss 1.0322 (0.7452) Acc@1 75.684 (83.343) Acc@5 93.750 (96.810) Mem 7379MB [2024-08-26 08:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.3047 (0.8512) Acc@1 66.504 (80.847) Acc@5 90.430 (95.505) Mem 7379MB [2024-08-26 08:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.2002 (0.9082) Acc@1 70.508 (79.337) Acc@5 91.113 (94.872) Mem 7379MB [2024-08-26 08:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.934 Acc@5 94.826 [2024-08-26 08:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 78.9% [2024-08-26 08:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 78.93% [2024-08-26 08:52:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 08:52:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 08:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][0/1251] eta 0:14:50 lr 0.000902 wd 0.0500 time 0.7121 (0.7121) data time 0.4912 (0.4912) model time 0.0000 (0.0000) loss 4.2929 (4.2929) grad_norm 2.4513 (2.4513) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][10/1251] eta 0:05:54 lr 0.000902 wd 0.0500 time 0.2373 (0.2855) data time 0.0010 (0.0456) model time 0.0000 (0.0000) loss 3.4417 (3.5968) grad_norm 2.6586 (1.9917) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][20/1251] eta 0:05:25 lr 0.000902 wd 0.0500 time 0.2475 (0.2646) data time 0.0010 (0.0244) model time 0.0000 (0.0000) loss 3.1795 (3.5186) grad_norm 3.2359 (2.1148) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][30/1251] eta 0:05:14 lr 0.000902 wd 0.0500 time 0.2661 (0.2578) data time 0.0009 (0.0168) model time 0.0000 (0.0000) loss 3.7248 (3.4893) grad_norm 1.8573 (2.0386) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][40/1251] eta 0:05:07 lr 0.000902 wd 0.0500 time 0.2498 (0.2539) data time 0.0009 (0.0129) model time 0.0000 (0.0000) loss 3.6937 (3.4360) grad_norm 3.4320 (2.1634) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][50/1251] eta 0:05:01 lr 0.000902 wd 0.0500 time 0.2392 (0.2515) data time 0.0008 (0.0106) model time 0.0000 (0.0000) loss 4.1375 (3.4371) grad_norm 1.7967 (2.1549) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][60/1251] eta 0:05:07 lr 0.000902 wd 0.0500 time 0.2450 (0.2580) data time 0.0014 (0.0090) model time 0.2436 (0.2902) loss 
3.8909 (3.3938) grad_norm 3.0493 (2.1935) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][70/1251] eta 0:05:02 lr 0.000902 wd 0.0500 time 0.2430 (0.2559) data time 0.0011 (0.0079) model time 0.2419 (0.2662) loss 3.5730 (3.4021) grad_norm 1.9264 (2.1611) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][80/1251] eta 0:04:57 lr 0.000902 wd 0.0500 time 0.2352 (0.2541) data time 0.0011 (0.0071) model time 0.2341 (0.2576) loss 3.9878 (3.4349) grad_norm 2.2957 (2.1339) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][90/1251] eta 0:04:53 lr 0.000902 wd 0.0500 time 0.2402 (0.2529) data time 0.0012 (0.0064) model time 0.2390 (0.2536) loss 3.9046 (3.4711) grad_norm 1.6998 (2.0960) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][100/1251] eta 0:04:49 lr 0.000902 wd 0.0500 time 0.2354 (0.2516) data time 0.0008 (0.0059) model time 0.2345 (0.2506) loss 3.0303 (3.4594) grad_norm 1.9592 (2.0974) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][110/1251] eta 0:04:45 lr 0.000902 wd 0.0500 time 0.2347 (0.2505) data time 0.0008 (0.0054) model time 0.2339 (0.2487) loss 3.9669 (3.4934) grad_norm 1.9164 (2.0779) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][120/1251] eta 0:04:42 lr 0.000902 wd 0.0500 time 0.2455 (0.2497) data time 0.0009 (0.0051) model time 0.2446 (0.2475) loss 2.4529 (3.4907) grad_norm 1.7174 (2.0683) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][130/1251] eta 0:04:41 lr 
0.000902 wd 0.0500 time 0.4614 (0.2508) data time 0.0008 (0.0047) model time 0.4606 (0.2494) loss 2.6355 (3.5029) grad_norm 2.7216 (2.0921) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][140/1251] eta 0:04:37 lr 0.000902 wd 0.0500 time 0.2378 (0.2502) data time 0.0013 (0.0045) model time 0.2365 (0.2484) loss 3.2261 (3.4957) grad_norm 2.2015 (2.0878) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][150/1251] eta 0:04:34 lr 0.000902 wd 0.0500 time 0.2397 (0.2497) data time 0.0012 (0.0043) model time 0.2385 (0.2477) loss 3.3052 (3.4966) grad_norm 2.1780 (2.0835) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][160/1251] eta 0:04:31 lr 0.000902 wd 0.0500 time 0.2444 (0.2492) data time 0.0011 (0.0041) model time 0.2433 (0.2472) loss 3.5381 (3.4751) grad_norm 1.5856 (2.0806) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][170/1251] eta 0:04:28 lr 0.000902 wd 0.0500 time 0.2445 (0.2488) data time 0.0007 (0.0039) model time 0.2438 (0.2467) loss 3.7531 (3.4751) grad_norm 1.7928 (2.0893) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][180/1251] eta 0:04:27 lr 0.000902 wd 0.0500 time 0.2421 (0.2496) data time 0.0009 (0.0037) model time 0.2412 (0.2478) loss 3.5134 (3.4611) grad_norm 2.3318 (2.0821) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][190/1251] eta 0:04:24 lr 0.000902 wd 0.0500 time 0.2509 (0.2491) data time 0.0007 (0.0036) model time 0.2502 (0.2473) loss 2.8655 (3.4547) grad_norm 2.7308 (2.0843) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][200/1251] eta 0:04:22 lr 0.000902 wd 0.0500 time 0.2481 (0.2498) data time 0.0007 (0.0035) model time 0.2474 (0.2483) loss 3.5862 (3.4602) grad_norm 1.4687 (2.0705) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][210/1251] eta 0:04:19 lr 0.000902 wd 0.0500 time 0.2405 (0.2494) data time 0.0011 (0.0033) model time 0.2393 (0.2477) loss 3.1039 (3.4584) grad_norm 1.7289 (2.0758) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][220/1251] eta 0:04:16 lr 0.000902 wd 0.0500 time 0.2323 (0.2488) data time 0.0010 (0.0032) model time 0.2313 (0.2470) loss 3.8212 (3.4696) grad_norm 1.9710 (2.0704) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][230/1251] eta 0:04:14 lr 0.000902 wd 0.0500 time 0.2398 (0.2494) data time 0.0009 (0.0031) model time 0.2389 (0.2478) loss 4.3684 (3.4724) grad_norm 2.0434 (2.0720) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][240/1251] eta 0:04:11 lr 0.000902 wd 0.0500 time 0.2472 (0.2491) data time 0.0010 (0.0031) model time 0.2462 (0.2474) loss 3.6305 (3.4790) grad_norm 2.3576 (2.0813) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][250/1251] eta 0:04:09 lr 0.000902 wd 0.0500 time 0.2431 (0.2488) data time 0.0011 (0.0030) model time 0.2419 (0.2471) loss 2.9094 (3.4682) grad_norm 1.8702 (2.0711) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][260/1251] eta 0:04:06 lr 0.000901 wd 0.0500 time 0.2321 (0.2485) data time 0.0011 (0.0029) model time 0.2310 (0.2468) loss 3.8023 
(3.4754) grad_norm 1.8378 (2.0646) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][270/1251] eta 0:04:03 lr 0.000901 wd 0.0500 time 0.2423 (0.2483) data time 0.0010 (0.0028) model time 0.2413 (0.2465) loss 3.7201 (3.4782) grad_norm 2.9712 (2.0623) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][280/1251] eta 0:04:00 lr 0.000901 wd 0.0500 time 0.2398 (0.2480) data time 0.0014 (0.0028) model time 0.2385 (0.2462) loss 3.1651 (3.4820) grad_norm 2.0770 (2.0556) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][290/1251] eta 0:03:58 lr 0.000901 wd 0.0500 time 0.4480 (0.2485) data time 0.0009 (0.0027) model time 0.4471 (0.2468) loss 3.0223 (3.4845) grad_norm 2.3048 (2.0495) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][300/1251] eta 0:03:56 lr 0.000901 wd 0.0500 time 0.2399 (0.2489) data time 0.0010 (0.0027) model time 0.2389 (0.2473) loss 4.3989 (3.4868) grad_norm 1.3371 (2.0456) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][310/1251] eta 0:03:54 lr 0.000901 wd 0.0500 time 0.2351 (0.2487) data time 0.0008 (0.0026) model time 0.2344 (0.2471) loss 2.7076 (3.4829) grad_norm 1.6231 (2.0374) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][320/1251] eta 0:03:51 lr 0.000901 wd 0.0500 time 0.2407 (0.2485) data time 0.0010 (0.0025) model time 0.2397 (0.2469) loss 3.8299 (3.4852) grad_norm 2.2604 (2.0343) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][330/1251] eta 0:03:48 lr 0.000901 wd 
0.0500 time 0.2356 (0.2482) data time 0.0010 (0.0025) model time 0.2346 (0.2466) loss 3.7388 (3.4777) grad_norm 1.5091 (2.0271) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][340/1251] eta 0:03:45 lr 0.000901 wd 0.0500 time 0.2384 (0.2480) data time 0.0010 (0.0025) model time 0.2374 (0.2464) loss 3.3712 (3.4800) grad_norm 1.6804 (2.0245) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][350/1251] eta 0:03:43 lr 0.000901 wd 0.0500 time 0.2421 (0.2479) data time 0.0009 (0.0024) model time 0.2412 (0.2462) loss 2.8699 (3.4758) grad_norm 1.7955 (2.0208) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][360/1251] eta 0:03:40 lr 0.000901 wd 0.0500 time 0.2445 (0.2476) data time 0.0008 (0.0024) model time 0.2437 (0.2460) loss 3.6612 (3.4712) grad_norm 1.6454 (2.0203) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][370/1251] eta 0:03:37 lr 0.000901 wd 0.0500 time 0.2450 (0.2474) data time 0.0012 (0.0023) model time 0.2438 (0.2458) loss 3.9822 (3.4651) grad_norm 2.4067 (2.0270) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][380/1251] eta 0:03:35 lr 0.000901 wd 0.0500 time 0.2376 (0.2473) data time 0.0012 (0.0023) model time 0.2364 (0.2456) loss 3.8039 (3.4653) grad_norm 2.6642 (2.0376) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][390/1251] eta 0:03:32 lr 0.000901 wd 0.0500 time 0.2404 (0.2472) data time 0.0010 (0.0023) model time 0.2395 (0.2455) loss 3.6977 (3.4753) grad_norm 2.0447 (2.0407) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][400/1251] eta 0:03:30 lr 0.000901 wd 0.0500 time 0.2441 (0.2471) data time 0.0008 (0.0022) model time 0.2433 (0.2454) loss 2.0598 (3.4650) grad_norm 1.6260 (2.0313) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][410/1251] eta 0:03:27 lr 0.000901 wd 0.0500 time 0.2401 (0.2470) data time 0.0008 (0.0022) model time 0.2392 (0.2453) loss 2.5398 (3.4691) grad_norm 4.2664 (2.0346) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][420/1251] eta 0:03:25 lr 0.000901 wd 0.0500 time 0.2373 (0.2471) data time 0.0007 (0.0022) model time 0.2366 (0.2455) loss 2.4909 (3.4679) grad_norm 1.9441 (2.0315) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][430/1251] eta 0:03:22 lr 0.000901 wd 0.0500 time 0.2359 (0.2469) data time 0.0009 (0.0022) model time 0.2349 (0.2453) loss 3.4885 (3.4650) grad_norm 2.1772 (2.0310) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][440/1251] eta 0:03:20 lr 0.000901 wd 0.0500 time 0.2443 (0.2468) data time 0.0011 (0.0021) model time 0.2432 (0.2451) loss 4.2685 (3.4659) grad_norm 3.2487 (2.0320) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][450/1251] eta 0:03:17 lr 0.000901 wd 0.0500 time 0.2381 (0.2467) data time 0.0009 (0.0021) model time 0.2372 (0.2450) loss 2.7868 (3.4683) grad_norm 1.4916 (2.0386) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][460/1251] eta 0:03:15 lr 0.000901 wd 0.0500 time 0.2399 (0.2465) data time 0.0008 (0.0021) model time 0.2391 (0.2449) loss 4.3793 
(3.4715) grad_norm 2.5945 (2.0397) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][470/1251] eta 0:03:12 lr 0.000901 wd 0.0500 time 0.2471 (0.2464) data time 0.0007 (0.0021) model time 0.2464 (0.2448) loss 2.9789 (3.4705) grad_norm 2.5003 (2.0474) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][480/1251] eta 0:03:09 lr 0.000901 wd 0.0500 time 0.2452 (0.2463) data time 0.0008 (0.0020) model time 0.2444 (0.2447) loss 2.3271 (3.4672) grad_norm 1.7553 (2.0463) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][490/1251] eta 0:03:07 lr 0.000901 wd 0.0500 time 0.2399 (0.2462) data time 0.0010 (0.0020) model time 0.2389 (0.2446) loss 3.1755 (3.4659) grad_norm 2.0275 (2.0473) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][500/1251] eta 0:03:04 lr 0.000901 wd 0.0500 time 0.2489 (0.2462) data time 0.0007 (0.0020) model time 0.2482 (0.2446) loss 3.8473 (3.4644) grad_norm 2.2240 (2.0469) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 08:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][510/1251] eta 0:03:02 lr 0.000901 wd 0.0500 time 0.2450 (0.2461) data time 0.0011 (0.0020) model time 0.2439 (0.2445) loss 3.7242 (3.4631) grad_norm 2.0969 (2.0416) loss_scale 16384.0000 (8208.0313) mem 7379MB [2024-08-26 08:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][520/1251] eta 0:02:59 lr 0.000901 wd 0.0500 time 0.2364 (0.2460) data time 0.0011 (0.0020) model time 0.2353 (0.2444) loss 3.3244 (3.4570) grad_norm 1.6396 (2.0390) loss_scale 16384.0000 (8364.9597) mem 7379MB [2024-08-26 08:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][530/1251] eta 0:02:57 lr 0.000901 
wd 0.0500 time 0.2388 (0.2459) data time 0.0008 (0.0019) model time 0.2380 (0.2443) loss 4.1150 (3.4521) grad_norm 2.1425 (2.0396) loss_scale 16384.0000 (8515.9774) mem 7379MB [2024-08-26 08:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][540/1251] eta 0:02:54 lr 0.000901 wd 0.0500 time 0.2383 (0.2458) data time 0.0010 (0.0019) model time 0.2373 (0.2442) loss 3.8692 (3.4516) grad_norm 1.4649 (2.0379) loss_scale 16384.0000 (8661.4122) mem 7379MB [2024-08-26 08:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][550/1251] eta 0:02:52 lr 0.000901 wd 0.0500 time 0.2387 (0.2457) data time 0.0010 (0.0019) model time 0.2377 (0.2441) loss 3.3349 (3.4518) grad_norm 1.7759 (2.0345) loss_scale 16384.0000 (8801.5681) mem 7379MB [2024-08-26 08:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][560/1251] eta 0:02:49 lr 0.000901 wd 0.0500 time 0.2449 (0.2456) data time 0.0010 (0.0019) model time 0.2440 (0.2441) loss 3.4954 (3.4486) grad_norm 2.3027 (2.0375) loss_scale 16384.0000 (8936.7273) mem 7379MB [2024-08-26 08:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][570/1251] eta 0:02:47 lr 0.000901 wd 0.0500 time 0.2381 (0.2459) data time 0.0012 (0.0019) model time 0.2369 (0.2444) loss 2.9084 (3.4450) grad_norm 1.4942 (2.0342) loss_scale 16384.0000 (9067.1524) mem 7379MB [2024-08-26 08:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][580/1251] eta 0:02:45 lr 0.000901 wd 0.0500 time 0.4682 (0.2466) data time 0.0011 (0.0019) model time 0.4671 (0.2451) loss 3.9468 (3.4462) grad_norm 2.5172 (2.0332) loss_scale 16384.0000 (9193.0878) mem 7379MB [2024-08-26 08:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][590/1251] eta 0:02:43 lr 0.000901 wd 0.0500 time 0.2441 (0.2468) data time 0.0007 (0.0018) model time 0.2434 (0.2454) loss 4.1769 (3.4459) grad_norm 3.5624 (2.0424) loss_scale 16384.0000 (9314.7614) mem 7379MB [2024-08-26 08:54:43 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][600/1251] eta 0:02:40 lr 0.000901 wd 0.0500 time 0.2383 (0.2467) data time 0.0010 (0.0018) model time 0.2373 (0.2453) loss 3.6464 (3.4413) grad_norm 1.7751 (2.0401) loss_scale 16384.0000 (9432.3860) mem 7379MB [2024-08-26 08:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][610/1251] eta 0:02:38 lr 0.000901 wd 0.0500 time 0.2434 (0.2466) data time 0.0008 (0.0018) model time 0.2426 (0.2452) loss 4.1122 (3.4399) grad_norm 1.8361 (inf) loss_scale 8192.0000 (9425.4926) mem 7379MB [2024-08-26 08:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][620/1251] eta 0:02:35 lr 0.000901 wd 0.0500 time 0.2428 (0.2466) data time 0.0007 (0.0018) model time 0.2421 (0.2451) loss 4.1808 (3.4395) grad_norm 2.3941 (inf) loss_scale 8192.0000 (9405.6296) mem 7379MB [2024-08-26 08:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][630/1251] eta 0:02:33 lr 0.000900 wd 0.0500 time 0.2477 (0.2465) data time 0.0007 (0.0018) model time 0.2469 (0.2450) loss 3.1401 (3.4377) grad_norm 2.0296 (inf) loss_scale 8192.0000 (9386.3962) mem 7379MB [2024-08-26 08:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][640/1251] eta 0:02:30 lr 0.000900 wd 0.0500 time 0.2385 (0.2464) data time 0.0008 (0.0018) model time 0.2377 (0.2449) loss 3.9671 (3.4443) grad_norm 2.1604 (inf) loss_scale 8192.0000 (9367.7629) mem 7379MB [2024-08-26 08:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][650/1251] eta 0:02:28 lr 0.000900 wd 0.0500 time 0.4232 (0.2468) data time 0.0011 (0.0018) model time 0.4222 (0.2454) loss 3.6705 (3.4475) grad_norm 1.5647 (inf) loss_scale 8192.0000 (9349.7020) mem 7379MB [2024-08-26 08:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][660/1251] eta 0:02:25 lr 0.000900 wd 0.0500 time 0.2535 (0.2467) data time 0.0007 (0.0018) model time 0.2528 (0.2453) loss 3.4806 (3.4479) grad_norm 
1.6018 (inf) loss_scale 8192.0000 (9332.1876) mem 7379MB [2024-08-26 08:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][670/1251] eta 0:02:23 lr 0.000900 wd 0.0500 time 0.2478 (0.2467) data time 0.0011 (0.0017) model time 0.2466 (0.2453) loss 2.6077 (3.4441) grad_norm 1.5364 (inf) loss_scale 8192.0000 (9315.1952) mem 7379MB [2024-08-26 08:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][680/1251] eta 0:02:20 lr 0.000900 wd 0.0500 time 0.2438 (0.2466) data time 0.0009 (0.0017) model time 0.2429 (0.2452) loss 3.5238 (3.4439) grad_norm 2.4530 (inf) loss_scale 8192.0000 (9298.7019) mem 7379MB [2024-08-26 08:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][690/1251] eta 0:02:18 lr 0.000900 wd 0.0500 time 0.2385 (0.2465) data time 0.0009 (0.0017) model time 0.2376 (0.2451) loss 3.9376 (3.4448) grad_norm 1.6084 (inf) loss_scale 8192.0000 (9282.6860) mem 7379MB [2024-08-26 08:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][700/1251] eta 0:02:15 lr 0.000900 wd 0.0500 time 0.2399 (0.2465) data time 0.0008 (0.0017) model time 0.2390 (0.2451) loss 3.6687 (3.4481) grad_norm 1.6507 (inf) loss_scale 8192.0000 (9267.1270) mem 7379MB [2024-08-26 08:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][710/1251] eta 0:02:13 lr 0.000900 wd 0.0500 time 0.2367 (0.2464) data time 0.0011 (0.0017) model time 0.2356 (0.2450) loss 3.7355 (3.4486) grad_norm 1.7090 (inf) loss_scale 8192.0000 (9252.0056) mem 7379MB [2024-08-26 08:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][720/1251] eta 0:02:10 lr 0.000900 wd 0.0500 time 0.2380 (0.2466) data time 0.0011 (0.0017) model time 0.2369 (0.2452) loss 3.1607 (3.4507) grad_norm 2.8388 (inf) loss_scale 8192.0000 (9237.3037) mem 7379MB [2024-08-26 08:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][730/1251] eta 0:02:08 lr 0.000900 wd 0.0500 time 0.2379 (0.2465) data time 
0.0010 (0.0017) model time 0.2370 (0.2451) loss 3.3456 (3.4532) grad_norm 1.3911 (inf) loss_scale 8192.0000 (9223.0041) mem 7379MB [2024-08-26 08:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][740/1251] eta 0:02:05 lr 0.000900 wd 0.0500 time 0.2342 (0.2464) data time 0.0010 (0.0017) model time 0.2332 (0.2450) loss 4.0507 (3.4538) grad_norm 1.6989 (inf) loss_scale 8192.0000 (9209.0904) mem 7379MB [2024-08-26 08:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][750/1251] eta 0:02:03 lr 0.000900 wd 0.0500 time 0.2440 (0.2464) data time 0.0007 (0.0017) model time 0.2432 (0.2450) loss 2.7098 (3.4528) grad_norm 1.7398 (inf) loss_scale 8192.0000 (9195.5473) mem 7379MB [2024-08-26 08:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][760/1251] eta 0:02:00 lr 0.000900 wd 0.0500 time 0.2320 (0.2463) data time 0.0009 (0.0017) model time 0.2310 (0.2449) loss 2.8636 (3.4527) grad_norm 1.4180 (inf) loss_scale 8192.0000 (9182.3601) mem 7379MB [2024-08-26 08:55:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][770/1251] eta 0:01:58 lr 0.000900 wd 0.0500 time 0.2374 (0.2463) data time 0.0012 (0.0016) model time 0.2362 (0.2449) loss 3.6463 (3.4546) grad_norm 2.7685 (inf) loss_scale 8192.0000 (9169.5149) mem 7379MB [2024-08-26 08:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][780/1251] eta 0:01:55 lr 0.000900 wd 0.0500 time 0.2407 (0.2462) data time 0.0011 (0.0016) model time 0.2396 (0.2449) loss 3.7563 (3.4525) grad_norm 2.2727 (inf) loss_scale 8192.0000 (9156.9987) mem 7379MB [2024-08-26 08:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][790/1251] eta 0:01:53 lr 0.000900 wd 0.0500 time 0.2347 (0.2462) data time 0.0010 (0.0016) model time 0.2337 (0.2448) loss 3.3776 (3.4492) grad_norm 1.8012 (inf) loss_scale 8192.0000 (9144.7990) mem 7379MB [2024-08-26 08:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[77/300][800/1251] eta 0:01:50 lr 0.000900 wd 0.0500 time 0.2437 (0.2461) data time 0.0007 (0.0016) model time 0.2430 (0.2447) loss 3.5522 (3.4515) grad_norm 1.5890 (inf) loss_scale 8192.0000 (9132.9039) mem 7379MB [2024-08-26 08:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][810/1251] eta 0:01:48 lr 0.000900 wd 0.0500 time 0.2410 (0.2460) data time 0.0009 (0.0016) model time 0.2401 (0.2447) loss 3.9367 (3.4486) grad_norm 2.1893 (inf) loss_scale 8192.0000 (9121.3021) mem 7379MB [2024-08-26 08:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][820/1251] eta 0:01:46 lr 0.000900 wd 0.0500 time 0.2376 (0.2460) data time 0.0009 (0.0016) model time 0.2367 (0.2446) loss 3.3311 (3.4508) grad_norm 1.7113 (inf) loss_scale 8192.0000 (9109.9829) mem 7379MB [2024-08-26 08:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][830/1251] eta 0:01:43 lr 0.000900 wd 0.0500 time 0.2412 (0.2460) data time 0.0009 (0.0016) model time 0.2403 (0.2446) loss 2.8376 (3.4492) grad_norm 1.5429 (inf) loss_scale 8192.0000 (9098.9362) mem 7379MB [2024-08-26 08:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][840/1251] eta 0:01:41 lr 0.000900 wd 0.0500 time 0.2508 (0.2459) data time 0.0007 (0.0016) model time 0.2501 (0.2446) loss 4.2205 (3.4532) grad_norm 2.4708 (inf) loss_scale 8192.0000 (9088.1522) mem 7379MB [2024-08-26 08:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][850/1251] eta 0:01:38 lr 0.000900 wd 0.0500 time 0.2407 (0.2459) data time 0.0011 (0.0016) model time 0.2397 (0.2445) loss 3.9411 (3.4548) grad_norm 1.6719 (inf) loss_scale 8192.0000 (9077.6216) mem 7379MB [2024-08-26 08:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][860/1251] eta 0:01:36 lr 0.000900 wd 0.0500 time 0.2310 (0.2458) data time 0.0009 (0.0016) model time 0.2301 (0.2444) loss 2.6955 (3.4534) grad_norm 2.9859 (inf) loss_scale 8192.0000 (9067.3357) mem 7379MB 
[2024-08-26 08:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][870/1251] eta 0:01:33 lr 0.000900 wd 0.0500 time 0.2365 (0.2458) data time 0.0008 (0.0016) model time 0.2357 (0.2444) loss 3.3774 (3.4538) grad_norm 2.2214 (inf) loss_scale 8192.0000 (9057.2859) mem 7379MB [2024-08-26 08:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][880/1251] eta 0:01:31 lr 0.000900 wd 0.0500 time 0.2385 (0.2457) data time 0.0011 (0.0016) model time 0.2374 (0.2443) loss 3.9237 (3.4551) grad_norm 1.7401 (inf) loss_scale 8192.0000 (9047.4642) mem 7379MB [2024-08-26 08:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][890/1251] eta 0:01:28 lr 0.000900 wd 0.0500 time 0.2348 (0.2456) data time 0.0011 (0.0016) model time 0.2337 (0.2443) loss 3.8328 (3.4590) grad_norm 1.8818 (inf) loss_scale 8192.0000 (9037.8631) mem 7379MB [2024-08-26 08:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][900/1251] eta 0:01:26 lr 0.000900 wd 0.0500 time 0.2427 (0.2456) data time 0.0011 (0.0016) model time 0.2416 (0.2442) loss 2.9694 (3.4602) grad_norm 2.0620 (inf) loss_scale 8192.0000 (9028.4750) mem 7379MB [2024-08-26 08:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][910/1251] eta 0:01:23 lr 0.000900 wd 0.0500 time 0.2367 (0.2455) data time 0.0008 (0.0016) model time 0.2359 (0.2441) loss 3.7391 (3.4625) grad_norm 2.1667 (inf) loss_scale 8192.0000 (9019.2931) mem 7379MB [2024-08-26 08:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][920/1251] eta 0:01:21 lr 0.000900 wd 0.0500 time 0.2414 (0.2455) data time 0.0009 (0.0015) model time 0.2406 (0.2441) loss 2.0872 (3.4616) grad_norm 2.0328 (inf) loss_scale 8192.0000 (9010.3105) mem 7379MB [2024-08-26 08:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][930/1251] eta 0:01:18 lr 0.000900 wd 0.0500 time 0.2312 (0.2454) data time 0.0011 (0.0015) model time 0.2301 (0.2440) loss 3.8216 
(3.4622) grad_norm 2.5482 (inf) loss_scale 8192.0000 (9001.5209) mem 7379MB [2024-08-26 08:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][940/1251] eta 0:01:16 lr 0.000900 wd 0.0500 time 0.2419 (0.2456) data time 0.0008 (0.0015) model time 0.2411 (0.2443) loss 4.4800 (3.4629) grad_norm 3.1275 (inf) loss_scale 8192.0000 (8992.9182) mem 7379MB [2024-08-26 08:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][950/1251] eta 0:01:13 lr 0.000900 wd 0.0500 time 0.2496 (0.2456) data time 0.0010 (0.0015) model time 0.2486 (0.2442) loss 3.4145 (3.4622) grad_norm 1.6591 (inf) loss_scale 8192.0000 (8984.4963) mem 7379MB [2024-08-26 08:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][960/1251] eta 0:01:11 lr 0.000900 wd 0.0500 time 0.2458 (0.2455) data time 0.0007 (0.0015) model time 0.2451 (0.2442) loss 3.1105 (3.4630) grad_norm 2.5492 (inf) loss_scale 8192.0000 (8976.2497) mem 7379MB [2024-08-26 08:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][970/1251] eta 0:01:08 lr 0.000900 wd 0.0500 time 0.2437 (0.2455) data time 0.0007 (0.0015) model time 0.2430 (0.2442) loss 3.2943 (3.4647) grad_norm 2.4729 (inf) loss_scale 8192.0000 (8968.1730) mem 7379MB [2024-08-26 08:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][980/1251] eta 0:01:06 lr 0.000900 wd 0.0500 time 0.2404 (0.2455) data time 0.0007 (0.0015) model time 0.2397 (0.2441) loss 4.0674 (3.4670) grad_norm 1.5275 (inf) loss_scale 8192.0000 (8960.2610) mem 7379MB [2024-08-26 08:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][990/1251] eta 0:01:04 lr 0.000900 wd 0.0500 time 0.2431 (0.2454) data time 0.0009 (0.0015) model time 0.2422 (0.2441) loss 4.0455 (3.4706) grad_norm 1.8246 (inf) loss_scale 8192.0000 (8952.5086) mem 7379MB [2024-08-26 08:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1000/1251] eta 0:01:01 lr 0.000900 wd 0.0500 time 0.2492 
(0.2454) data time 0.0009 (0.0015) model time 0.2483 (0.2441) loss 3.9740 (3.4737) grad_norm 1.4889 (inf) loss_scale 8192.0000 (8944.9111) mem 7379MB [2024-08-26 08:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1010/1251] eta 0:00:59 lr 0.000899 wd 0.0500 time 0.2414 (0.2454) data time 0.0009 (0.0015) model time 0.2405 (0.2440) loss 3.4011 (3.4724) grad_norm 2.7225 (inf) loss_scale 8192.0000 (8937.4639) mem 7379MB [2024-08-26 08:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1020/1251] eta 0:00:56 lr 0.000899 wd 0.0500 time 0.2393 (0.2454) data time 0.0010 (0.0015) model time 0.2383 (0.2440) loss 4.2053 (3.4717) grad_norm 2.4440 (inf) loss_scale 8192.0000 (8930.1626) mem 7379MB [2024-08-26 08:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1030/1251] eta 0:00:54 lr 0.000899 wd 0.0500 time 0.2374 (0.2453) data time 0.0011 (0.0015) model time 0.2362 (0.2440) loss 3.4627 (3.4706) grad_norm 2.0968 (inf) loss_scale 8192.0000 (8923.0029) mem 7379MB [2024-08-26 08:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1040/1251] eta 0:00:51 lr 0.000899 wd 0.0500 time 0.2422 (0.2453) data time 0.0008 (0.0015) model time 0.2414 (0.2439) loss 3.9977 (3.4720) grad_norm 1.6406 (inf) loss_scale 8192.0000 (8915.9808) mem 7379MB [2024-08-26 08:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1050/1251] eta 0:00:49 lr 0.000899 wd 0.0500 time 0.2426 (0.2452) data time 0.0009 (0.0015) model time 0.2416 (0.2439) loss 2.9264 (3.4718) grad_norm 1.6203 (inf) loss_scale 8192.0000 (8909.0923) mem 7379MB [2024-08-26 08:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1060/1251] eta 0:00:46 lr 0.000899 wd 0.0500 time 0.2378 (0.2452) data time 0.0008 (0.0015) model time 0.2370 (0.2439) loss 4.4573 (3.4699) grad_norm 2.0458 (inf) loss_scale 8192.0000 (8902.3336) mem 7379MB [2024-08-26 08:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [77/300][1070/1251] eta 0:00:44 lr 0.000899 wd 0.0500 time 0.2313 (0.2451) data time 0.0009 (0.0015) model time 0.2305 (0.2438) loss 3.3158 (3.4728) grad_norm 1.6140 (inf) loss_scale 8192.0000 (8895.7012) mem 7379MB [2024-08-26 08:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1080/1251] eta 0:00:41 lr 0.000899 wd 0.0500 time 0.2445 (0.2451) data time 0.0010 (0.0015) model time 0.2435 (0.2438) loss 3.6671 (3.4733) grad_norm 1.5409 (inf) loss_scale 8192.0000 (8889.1915) mem 7379MB [2024-08-26 08:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1090/1251] eta 0:00:39 lr 0.000899 wd 0.0500 time 0.2368 (0.2451) data time 0.0009 (0.0015) model time 0.2359 (0.2438) loss 4.0325 (3.4731) grad_norm 1.6072 (inf) loss_scale 8192.0000 (8882.8011) mem 7379MB [2024-08-26 08:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1100/1251] eta 0:00:37 lr 0.000899 wd 0.0500 time 0.2447 (0.2450) data time 0.0007 (0.0015) model time 0.2439 (0.2437) loss 2.7118 (3.4743) grad_norm 1.7274 (inf) loss_scale 8192.0000 (8876.5268) mem 7379MB [2024-08-26 08:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1110/1251] eta 0:00:34 lr 0.000899 wd 0.0500 time 0.2411 (0.2452) data time 0.0010 (0.0015) model time 0.2401 (0.2439) loss 3.5320 (3.4737) grad_norm 1.4240 (inf) loss_scale 8192.0000 (8870.3654) mem 7379MB [2024-08-26 08:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1120/1251] eta 0:00:32 lr 0.000899 wd 0.0500 time 0.2422 (0.2452) data time 0.0007 (0.0014) model time 0.2415 (0.2439) loss 3.5172 (3.4730) grad_norm 2.0173 (inf) loss_scale 8192.0000 (8864.3140) mem 7379MB [2024-08-26 08:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1130/1251] eta 0:00:29 lr 0.000899 wd 0.0500 time 0.2417 (0.2454) data time 0.0007 (0.0014) model time 0.2410 (0.2441) loss 4.5033 (3.4753) grad_norm 1.6482 (inf) loss_scale 8192.0000 
(8858.3696) mem 7379MB [2024-08-26 08:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1140/1251] eta 0:00:27 lr 0.000899 wd 0.0500 time 0.2430 (0.2453) data time 0.0008 (0.0014) model time 0.2422 (0.2440) loss 2.5563 (3.4772) grad_norm 2.1783 (inf) loss_scale 8192.0000 (8852.5294) mem 7379MB [2024-08-26 08:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1150/1251] eta 0:00:24 lr 0.000899 wd 0.0500 time 0.2377 (0.2453) data time 0.0009 (0.0014) model time 0.2368 (0.2440) loss 3.6364 (3.4756) grad_norm 2.3258 (inf) loss_scale 8192.0000 (8846.7906) mem 7379MB [2024-08-26 08:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1160/1251] eta 0:00:22 lr 0.000899 wd 0.0500 time 0.2555 (0.2455) data time 0.0008 (0.0014) model time 0.2546 (0.2442) loss 4.3330 (3.4763) grad_norm 1.9093 (inf) loss_scale 8192.0000 (8841.1507) mem 7379MB [2024-08-26 08:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1170/1251] eta 0:00:19 lr 0.000899 wd 0.0500 time 0.2354 (0.2455) data time 0.0008 (0.0014) model time 0.2346 (0.2442) loss 4.4319 (3.4777) grad_norm 1.7126 (inf) loss_scale 8192.0000 (8835.6072) mem 7379MB [2024-08-26 08:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1180/1251] eta 0:00:17 lr 0.000899 wd 0.0500 time 0.2466 (0.2458) data time 0.0009 (0.0014) model time 0.2457 (0.2445) loss 3.8237 (3.4771) grad_norm 2.2560 (inf) loss_scale 8192.0000 (8830.1575) mem 7379MB [2024-08-26 08:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1190/1251] eta 0:00:14 lr 0.000899 wd 0.0500 time 0.2315 (0.2458) data time 0.0012 (0.0014) model time 0.2302 (0.2445) loss 3.9293 (3.4774) grad_norm 2.0437 (inf) loss_scale 8192.0000 (8824.7993) mem 7379MB [2024-08-26 08:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1200/1251] eta 0:00:12 lr 0.000899 wd 0.0500 time 0.2409 (0.2457) data time 0.0011 (0.0014) model time 
0.2398 (0.2445) loss 3.8439 (3.4790) grad_norm 1.6182 (inf) loss_scale 4096.0000 (8795.6570) mem 7379MB
[2024-08-26 08:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1210/1251] eta 0:00:10 lr 0.000899 wd 0.0500 time 0.2331 (0.2457) data time 0.0010 (0.0014) model time 0.2322 (0.2444) loss 3.4871 (3.4798) grad_norm 1.7638 (inf) loss_scale 4096.0000 (8756.8489) mem 7379MB
[2024-08-26 08:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1220/1251] eta 0:00:07 lr 0.000899 wd 0.0500 time 0.4406 (0.2458) data time 0.0012 (0.0014) model time 0.4394 (0.2446) loss 4.0394 (3.4809) grad_norm 2.3930 (inf) loss_scale 4096.0000 (8718.6765) mem 7379MB
[2024-08-26 08:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1230/1251] eta 0:00:05 lr 0.000899 wd 0.0500 time 0.2425 (0.2459) data time 0.0011 (0.0014) model time 0.2415 (0.2447) loss 3.3579 (3.4830) grad_norm 2.0887 (inf) loss_scale 4096.0000 (8681.1243) mem 7379MB
[2024-08-26 08:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1240/1251] eta 0:00:02 lr 0.000899 wd 0.0500 time 0.2264 (0.2460) data time 0.0005 (0.0014) model time 0.2259 (0.2447) loss 2.4130 (3.4806) grad_norm 1.2818 (inf) loss_scale 4096.0000 (8644.1773) mem 7379MB
[2024-08-26 08:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [77/300][1250/1251] eta 0:00:00 lr 0.000899 wd 0.0500 time 0.2263 (0.2458) data time 0.0005 (0.0014) model time 0.2258 (0.2446) loss 3.8558 (3.4814) grad_norm 2.5196 (inf) loss_scale 4096.0000 (8607.8209) mem 7379MB
[2024-08-26 08:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 77 training takes 0:05:07
[2024-08-26 08:57:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 08:57:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 08:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.438 (0.438) Loss 0.5596 (0.5596) Acc@1 88.867 (88.867) Acc@5 97.266 (97.266) Mem 7379MB
[2024-08-26 08:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.110) Loss 0.8481 (0.8191) Acc@1 80.176 (81.419) Acc@5 95.117 (95.969) Mem 7379MB
[2024-08-26 08:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.097) Loss 1.1436 (0.8331) Acc@1 72.656 (80.780) Acc@5 92.090 (95.991) Mem 7379MB
[2024-08-26 08:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.3828 (0.9550) Acc@1 67.773 (78.157) Acc@5 89.062 (94.440) Mem 7379MB
[2024-08-26 08:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.2842 (1.0194) Acc@1 70.410 (76.646) Acc@5 90.527 (93.726) Mem 7379MB
[2024-08-26 08:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.214 Acc@5 93.654
[2024-08-26 08:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.2%
[2024-08-26 08:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.788 (0.788) Loss 0.4600 (0.4600) Acc@1 91.895 (91.895) Acc@5 98.145 (98.145) Mem 7379MB
[2024-08-26 08:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.145) Loss 0.7393 (0.7193) Acc@1 84.863 (84.437) Acc@5 96.289 (96.813) Mem 7379MB
[2024-08-26 08:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.113) Loss 1.0293 (0.7430) Acc@1 75.684 (83.357) Acc@5 93.652 (96.791) Mem 7379MB
[2024-08-26 08:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.102) Loss 1.3037 (0.8487) Acc@1 66.504 (80.916) Acc@5 90.625 (95.502) Mem 7379MB
[2024-08-26 08:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.1934 (0.9055) Acc@1 70.801 (79.409) Acc@5 91.113 (94.848) Mem 7379MB
[2024-08-26 08:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.010 Acc@5 94.806
[2024-08-26 08:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.0%
[2024-08-26 08:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.01%
[2024-08-26 08:57:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 08:57:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 08:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][0/1251] eta 0:14:11 lr 0.000899 wd 0.0500 time 0.6803 (0.6803) data time 0.4377 (0.4377) model time 0.0000 (0.0000) loss 3.5563 (3.5563) grad_norm 1.5995 (1.5995) loss_scale 4096.0000 (4096.0000) mem 7379MB
[2024-08-26 08:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][10/1251] eta 0:05:54 lr 0.000899 wd 0.0500 time 0.2465 (0.2857) data time 0.0008 (0.0410) model time 0.0000 (0.0000) loss 4.4389 (3.5890) grad_norm 2.0926 (1.8003) loss_scale 4096.0000 (4096.0000) mem 7379MB
[2024-08-26 08:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][20/1251] eta 0:05:26 lr 0.000899 wd 0.0500 time 0.2421 (0.2648) data time 0.0007 (0.0220) model time 0.0000 (0.0000) loss 4.1802 (3.6142) grad_norm 1.3505 (1.9213) loss_scale 4096.0000 (4096.0000) mem 7379MB
[2024-08-26 08:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][30/1251] eta 0:05:14 lr 0.000899 wd 0.0500 time 0.2440 (0.2573) data time 0.0009 (0.0152) model time 0.0000 (0.0000) loss 2.4393 (3.5213) grad_norm 1.4987 (1.8647) loss_scale 4096.0000 (4096.0000) mem 7379MB
[2024-08-26 08:57:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][40/1251] eta 0:05:06
lr 0.000899 wd 0.0500 time 0.2459 (0.2534) data time 0.0008 (0.0118) model time 0.0000 (0.0000) loss 3.1178 (3.5111) grad_norm 1.4473 (1.8072) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][50/1251] eta 0:05:01 lr 0.000899 wd 0.0500 time 0.2420 (0.2514) data time 0.0010 (0.0097) model time 0.0000 (0.0000) loss 2.6769 (3.4925) grad_norm 2.3000 (1.8279) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][60/1251] eta 0:04:57 lr 0.000899 wd 0.0500 time 0.2377 (0.2498) data time 0.0007 (0.0083) model time 0.2370 (0.2404) loss 2.9021 (3.4991) grad_norm 1.9816 (1.8507) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][70/1251] eta 0:04:53 lr 0.000899 wd 0.0500 time 0.2437 (0.2488) data time 0.0011 (0.0073) model time 0.2426 (0.2411) loss 2.0617 (3.4689) grad_norm 2.0250 (1.8748) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][80/1251] eta 0:04:50 lr 0.000899 wd 0.0500 time 0.2465 (0.2483) data time 0.0012 (0.0065) model time 0.2453 (0.2418) loss 3.3280 (3.4486) grad_norm 1.7313 (1.8811) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][90/1251] eta 0:04:47 lr 0.000899 wd 0.0500 time 0.2457 (0.2475) data time 0.0007 (0.0059) model time 0.2450 (0.2415) loss 2.2544 (3.4287) grad_norm 2.2351 (1.8689) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][100/1251] eta 0:04:44 lr 0.000899 wd 0.0500 time 0.2424 (0.2469) data time 0.0007 (0.0055) model time 0.2417 (0.2411) loss 3.0023 (3.4388) grad_norm 1.9405 (1.8624) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:57:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][110/1251] eta 0:04:41 lr 0.000899 wd 0.0500 time 0.2458 (0.2464) data time 0.0010 (0.0051) model time 0.2449 (0.2409) loss 3.9307 (3.4445) grad_norm 2.6363 (1.8906) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][120/1251] eta 0:04:38 lr 0.000899 wd 0.0500 time 0.2357 (0.2459) data time 0.0011 (0.0048) model time 0.2346 (0.2407) loss 3.2969 (3.4464) grad_norm 3.5029 (1.9422) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][130/1251] eta 0:04:34 lr 0.000898 wd 0.0500 time 0.2387 (0.2453) data time 0.0009 (0.0045) model time 0.2378 (0.2402) loss 3.5957 (3.4512) grad_norm 2.5176 (1.9614) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][140/1251] eta 0:04:32 lr 0.000898 wd 0.0500 time 0.2357 (0.2453) data time 0.0012 (0.0043) model time 0.2345 (0.2406) loss 3.7870 (3.4545) grad_norm 1.7801 (1.9489) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][150/1251] eta 0:04:32 lr 0.000898 wd 0.0500 time 0.2390 (0.2478) data time 0.0007 (0.0041) model time 0.2383 (0.2448) loss 2.5444 (3.4504) grad_norm 1.5176 (1.9489) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][160/1251] eta 0:04:29 lr 0.000898 wd 0.0500 time 0.2369 (0.2475) data time 0.0011 (0.0039) model time 0.2358 (0.2445) loss 2.6804 (3.4392) grad_norm 1.9793 (1.9485) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][170/1251] eta 0:04:27 lr 0.000898 wd 0.0500 time 0.2463 (0.2472) data time 0.0010 (0.0037) model time 0.2453 (0.2442) loss 3.9171 
(3.4492) grad_norm 1.3847 (1.9447) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][180/1251] eta 0:04:24 lr 0.000898 wd 0.0500 time 0.2381 (0.2469) data time 0.0009 (0.0036) model time 0.2372 (0.2440) loss 3.8196 (3.4476) grad_norm 2.0697 (1.9677) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][190/1251] eta 0:04:21 lr 0.000898 wd 0.0500 time 0.2383 (0.2467) data time 0.0009 (0.0034) model time 0.2373 (0.2439) loss 3.0379 (3.4547) grad_norm 1.7025 (1.9628) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][200/1251] eta 0:04:19 lr 0.000898 wd 0.0500 time 0.2372 (0.2465) data time 0.0008 (0.0033) model time 0.2364 (0.2437) loss 2.2048 (3.4495) grad_norm 3.2710 (1.9650) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][210/1251] eta 0:04:16 lr 0.000898 wd 0.0500 time 0.2401 (0.2462) data time 0.0010 (0.0032) model time 0.2391 (0.2434) loss 2.3433 (3.4574) grad_norm 1.9982 (1.9721) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][220/1251] eta 0:04:13 lr 0.000898 wd 0.0500 time 0.2388 (0.2460) data time 0.0008 (0.0031) model time 0.2380 (0.2432) loss 4.2492 (3.4551) grad_norm 1.5849 (1.9781) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][230/1251] eta 0:04:10 lr 0.000898 wd 0.0500 time 0.2435 (0.2458) data time 0.0009 (0.0030) model time 0.2426 (0.2431) loss 2.5827 (3.4540) grad_norm 1.8104 (1.9771) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][240/1251] eta 0:04:08 lr 0.000898 wd 
0.0500 time 0.2397 (0.2456) data time 0.0013 (0.0029) model time 0.2384 (0.2429) loss 3.4013 (3.4650) grad_norm 1.8820 (1.9686) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][250/1251] eta 0:04:05 lr 0.000898 wd 0.0500 time 0.2385 (0.2454) data time 0.0011 (0.0029) model time 0.2374 (0.2428) loss 3.3837 (3.4581) grad_norm 2.0590 (1.9744) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][260/1251] eta 0:04:02 lr 0.000898 wd 0.0500 time 0.2418 (0.2452) data time 0.0009 (0.0028) model time 0.2409 (0.2426) loss 3.1713 (3.4656) grad_norm 2.0319 (1.9710) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][270/1251] eta 0:04:00 lr 0.000898 wd 0.0500 time 0.2470 (0.2451) data time 0.0007 (0.0027) model time 0.2463 (0.2425) loss 3.9786 (3.4683) grad_norm 2.4436 (1.9771) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][280/1251] eta 0:03:57 lr 0.000898 wd 0.0500 time 0.2381 (0.2449) data time 0.0007 (0.0027) model time 0.2374 (0.2424) loss 3.2542 (3.4673) grad_norm 3.1281 (1.9879) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][290/1251] eta 0:03:55 lr 0.000898 wd 0.0500 time 0.2463 (0.2448) data time 0.0009 (0.0026) model time 0.2453 (0.2423) loss 2.0598 (3.4664) grad_norm 2.1044 (1.9810) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][300/1251] eta 0:03:52 lr 0.000898 wd 0.0500 time 0.2351 (0.2447) data time 0.0011 (0.0026) model time 0.2340 (0.2422) loss 3.8076 (3.4683) grad_norm 1.4125 (1.9709) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][310/1251] eta 0:03:50 lr 0.000898 wd 0.0500 time 0.2398 (0.2446) data time 0.0009 (0.0025) model time 0.2388 (0.2421) loss 4.3424 (3.4596) grad_norm 1.6337 (1.9695) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][320/1251] eta 0:03:47 lr 0.000898 wd 0.0500 time 0.2373 (0.2445) data time 0.0009 (0.0025) model time 0.2364 (0.2421) loss 2.9466 (3.4601) grad_norm 1.5533 (1.9656) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][330/1251] eta 0:03:45 lr 0.000898 wd 0.0500 time 0.2443 (0.2444) data time 0.0009 (0.0024) model time 0.2435 (0.2421) loss 3.8456 (3.4601) grad_norm 1.5415 (1.9652) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][340/1251] eta 0:03:42 lr 0.000898 wd 0.0500 time 0.2340 (0.2443) data time 0.0007 (0.0024) model time 0.2333 (0.2420) loss 4.1851 (3.4617) grad_norm 1.7978 (1.9696) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][350/1251] eta 0:03:40 lr 0.000898 wd 0.0500 time 0.2397 (0.2447) data time 0.0010 (0.0023) model time 0.2387 (0.2425) loss 4.1525 (3.4577) grad_norm 1.5534 (1.9721) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][360/1251] eta 0:03:38 lr 0.000898 wd 0.0500 time 0.2376 (0.2447) data time 0.0008 (0.0023) model time 0.2367 (0.2425) loss 2.5425 (3.4620) grad_norm 2.4192 (1.9739) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][370/1251] eta 0:03:35 lr 0.000898 wd 0.0500 time 0.2400 (0.2447) data time 0.0010 (0.0023) model time 0.2390 (0.2425) loss 3.5946 
(3.4617) grad_norm 1.9794 (1.9723) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][380/1251] eta 0:03:33 lr 0.000898 wd 0.0500 time 0.2401 (0.2446) data time 0.0011 (0.0022) model time 0.2390 (0.2425) loss 3.9855 (3.4651) grad_norm 1.4386 (1.9681) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][390/1251] eta 0:03:30 lr 0.000898 wd 0.0500 time 0.2376 (0.2450) data time 0.0011 (0.0022) model time 0.2365 (0.2430) loss 3.2575 (3.4627) grad_norm 1.6019 (1.9641) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][400/1251] eta 0:03:28 lr 0.000898 wd 0.0500 time 0.2394 (0.2450) data time 0.0010 (0.0022) model time 0.2384 (0.2430) loss 3.0780 (3.4622) grad_norm 4.1416 (1.9857) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][410/1251] eta 0:03:25 lr 0.000898 wd 0.0500 time 0.2404 (0.2449) data time 0.0011 (0.0021) model time 0.2393 (0.2429) loss 3.2570 (3.4561) grad_norm 1.8635 (1.9866) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][420/1251] eta 0:03:23 lr 0.000898 wd 0.0500 time 0.2398 (0.2448) data time 0.0008 (0.0021) model time 0.2390 (0.2428) loss 4.1093 (3.4637) grad_norm 1.9312 (1.9999) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][430/1251] eta 0:03:20 lr 0.000898 wd 0.0500 time 0.2348 (0.2448) data time 0.0012 (0.0021) model time 0.2336 (0.2428) loss 2.7599 (3.4626) grad_norm 1.8110 (1.9942) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][440/1251] eta 0:03:18 lr 0.000898 wd 
0.0500 time 0.2407 (0.2446) data time 0.0008 (0.0021) model time 0.2399 (0.2427) loss 2.4893 (3.4645) grad_norm 1.8621 (1.9931) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][450/1251] eta 0:03:15 lr 0.000898 wd 0.0500 time 0.2468 (0.2446) data time 0.0010 (0.0020) model time 0.2458 (0.2426) loss 3.4051 (3.4578) grad_norm 2.5734 (2.0030) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][460/1251] eta 0:03:13 lr 0.000898 wd 0.0500 time 0.2402 (0.2445) data time 0.0007 (0.0020) model time 0.2394 (0.2426) loss 3.2799 (3.4545) grad_norm 2.1769 (2.0071) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][470/1251] eta 0:03:10 lr 0.000898 wd 0.0500 time 0.2400 (0.2444) data time 0.0009 (0.0020) model time 0.2392 (0.2425) loss 2.9666 (3.4522) grad_norm 1.5777 (2.0090) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][480/1251] eta 0:03:08 lr 0.000898 wd 0.0500 time 0.2309 (0.2447) data time 0.0008 (0.0020) model time 0.2302 (0.2429) loss 3.9200 (3.4572) grad_norm 1.8018 (2.0024) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][490/1251] eta 0:03:06 lr 0.000898 wd 0.0500 time 0.2423 (0.2451) data time 0.0009 (0.0020) model time 0.2413 (0.2433) loss 3.4814 (3.4579) grad_norm 6.2263 (2.0154) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][500/1251] eta 0:03:04 lr 0.000897 wd 0.0500 time 0.2411 (0.2458) data time 0.0010 (0.0019) model time 0.2401 (0.2441) loss 3.6623 (3.4562) grad_norm 1.8074 (2.0182) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][510/1251] eta 0:03:02 lr 0.000897 wd 0.0500 time 0.2331 (0.2456) data time 0.0010 (0.0019) model time 0.2321 (0.2439) loss 3.4084 (3.4584) grad_norm 1.7115 (2.0167) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][520/1251] eta 0:02:59 lr 0.000897 wd 0.0500 time 0.2477 (0.2456) data time 0.0011 (0.0019) model time 0.2467 (0.2439) loss 3.2050 (3.4572) grad_norm 2.3045 (2.0122) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][530/1251] eta 0:02:57 lr 0.000897 wd 0.0500 time 0.2436 (0.2455) data time 0.0010 (0.0019) model time 0.2426 (0.2438) loss 3.2201 (3.4562) grad_norm 1.4598 (2.0107) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][540/1251] eta 0:02:54 lr 0.000897 wd 0.0500 time 0.2476 (0.2455) data time 0.0012 (0.0019) model time 0.2464 (0.2438) loss 3.6631 (3.4572) grad_norm 2.3705 (2.0072) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][550/1251] eta 0:02:52 lr 0.000897 wd 0.0500 time 0.2470 (0.2455) data time 0.0007 (0.0019) model time 0.2463 (0.2438) loss 3.5923 (3.4592) grad_norm 1.7754 (2.0067) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][560/1251] eta 0:02:49 lr 0.000897 wd 0.0500 time 0.2425 (0.2454) data time 0.0007 (0.0018) model time 0.2418 (0.2438) loss 4.0096 (3.4570) grad_norm 1.8566 (2.0056) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][570/1251] eta 0:02:47 lr 0.000897 wd 0.0500 time 0.2473 (0.2454) data time 0.0011 (0.0018) model time 0.2463 (0.2437) loss 3.2137 
(3.4556) grad_norm 2.0535 (2.0085) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][580/1251] eta 0:02:44 lr 0.000897 wd 0.0500 time 0.2386 (0.2453) data time 0.0007 (0.0018) model time 0.2379 (0.2436) loss 2.7526 (3.4536) grad_norm 1.8017 (2.0085) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][590/1251] eta 0:02:42 lr 0.000897 wd 0.0500 time 0.2408 (0.2452) data time 0.0007 (0.0018) model time 0.2401 (0.2436) loss 3.8651 (3.4591) grad_norm 1.4981 (2.0052) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 08:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][600/1251] eta 0:02:39 lr 0.000897 wd 0.0500 time 0.2473 (0.2452) data time 0.0007 (0.0018) model time 0.2467 (0.2435) loss 3.9917 (3.4591) grad_norm 2.7369 (2.0054) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][610/1251] eta 0:02:37 lr 0.000897 wd 0.0500 time 0.2459 (0.2451) data time 0.0010 (0.0018) model time 0.2449 (0.2435) loss 3.0822 (3.4593) grad_norm 2.2376 (2.0070) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][620/1251] eta 0:02:34 lr 0.000897 wd 0.0500 time 0.2387 (0.2451) data time 0.0010 (0.0018) model time 0.2377 (0.2435) loss 3.4868 (3.4566) grad_norm 2.0621 (2.0064) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][630/1251] eta 0:02:32 lr 0.000897 wd 0.0500 time 0.2383 (0.2450) data time 0.0010 (0.0018) model time 0.2374 (0.2434) loss 2.9139 (3.4549) grad_norm 2.1324 (2.0119) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][640/1251] eta 0:02:29 lr 0.000897 wd 
0.0500 time 0.2397 (0.2450) data time 0.0010 (0.0017) model time 0.2387 (0.2434) loss 3.6840 (3.4505) grad_norm 2.0603 (2.0148) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][650/1251] eta 0:02:27 lr 0.000897 wd 0.0500 time 0.2410 (0.2450) data time 0.0011 (0.0017) model time 0.2399 (0.2434) loss 2.9610 (3.4519) grad_norm 1.5186 (2.0145) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][660/1251] eta 0:02:24 lr 0.000897 wd 0.0500 time 0.2416 (0.2452) data time 0.0010 (0.0017) model time 0.2406 (0.2436) loss 3.4380 (3.4538) grad_norm 2.4895 (2.0109) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][670/1251] eta 0:02:22 lr 0.000897 wd 0.0500 time 0.2387 (0.2452) data time 0.0008 (0.0017) model time 0.2379 (0.2436) loss 3.4473 (3.4554) grad_norm 2.4192 (2.0099) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][680/1251] eta 0:02:19 lr 0.000897 wd 0.0500 time 0.2402 (0.2451) data time 0.0010 (0.0017) model time 0.2392 (0.2435) loss 2.9579 (3.4548) grad_norm 2.9286 (2.0084) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][690/1251] eta 0:02:17 lr 0.000897 wd 0.0500 time 0.2475 (0.2451) data time 0.0009 (0.0017) model time 0.2466 (0.2435) loss 4.3758 (3.4533) grad_norm 2.1013 (2.0084) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][700/1251] eta 0:02:15 lr 0.000897 wd 0.0500 time 0.2379 (0.2450) data time 0.0008 (0.0017) model time 0.2371 (0.2435) loss 4.2610 (3.4497) grad_norm 2.8913 (2.0148) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:26 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][710/1251] eta 0:02:12 lr 0.000897 wd 0.0500 time 0.2462 (0.2450) data time 0.0008 (0.0017) model time 0.2454 (0.2435) loss 4.2489 (3.4486) grad_norm 2.7606 (2.0153) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][720/1251] eta 0:02:10 lr 0.000897 wd 0.0500 time 0.2401 (0.2453) data time 0.0008 (0.0017) model time 0.2393 (0.2438) loss 3.0973 (3.4499) grad_norm 1.8759 (2.0095) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][730/1251] eta 0:02:07 lr 0.000897 wd 0.0500 time 0.2390 (0.2452) data time 0.0010 (0.0017) model time 0.2380 (0.2437) loss 3.1116 (3.4496) grad_norm 1.6034 (2.0104) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][740/1251] eta 0:02:05 lr 0.000897 wd 0.0500 time 0.2371 (0.2452) data time 0.0011 (0.0016) model time 0.2360 (0.2437) loss 3.8407 (3.4514) grad_norm 2.1508 (2.0134) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][750/1251] eta 0:02:02 lr 0.000897 wd 0.0500 time 0.2403 (0.2451) data time 0.0010 (0.0016) model time 0.2392 (0.2436) loss 2.7769 (3.4534) grad_norm 1.6519 (2.0125) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][760/1251] eta 0:02:00 lr 0.000897 wd 0.0500 time 0.2396 (0.2451) data time 0.0007 (0.0016) model time 0.2389 (0.2436) loss 3.7122 (3.4524) grad_norm 1.4307 (2.0089) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][770/1251] eta 0:01:57 lr 0.000897 wd 0.0500 time 0.2411 (0.2450) data time 0.0010 (0.0016) model time 0.2401 (0.2435) loss 3.6465 
(3.4539) grad_norm 2.1640 (2.0073) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][780/1251] eta 0:01:55 lr 0.000897 wd 0.0500 time 0.2353 (0.2450) data time 0.0011 (0.0016) model time 0.2342 (0.2435) loss 3.9709 (3.4569) grad_norm 1.7536 (2.0044) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][790/1251] eta 0:01:52 lr 0.000897 wd 0.0500 time 0.2492 (0.2450) data time 0.0009 (0.0016) model time 0.2483 (0.2435) loss 3.6777 (3.4568) grad_norm 1.8283 (2.0046) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][800/1251] eta 0:01:50 lr 0.000897 wd 0.0500 time 0.2459 (0.2449) data time 0.0010 (0.0016) model time 0.2449 (0.2434) loss 2.6999 (3.4556) grad_norm 2.7670 (2.0031) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][810/1251] eta 0:01:48 lr 0.000897 wd 0.0500 time 0.2328 (0.2449) data time 0.0009 (0.0016) model time 0.2319 (0.2434) loss 3.5479 (3.4555) grad_norm 1.4163 (2.0043) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][820/1251] eta 0:01:45 lr 0.000897 wd 0.0500 time 0.2411 (0.2449) data time 0.0010 (0.0016) model time 0.2401 (0.2434) loss 3.9688 (3.4563) grad_norm 1.6690 (2.0020) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][830/1251] eta 0:01:43 lr 0.000897 wd 0.0500 time 0.2506 (0.2449) data time 0.0010 (0.0016) model time 0.2496 (0.2434) loss 3.4515 (3.4564) grad_norm 1.9335 (1.9994) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][840/1251] eta 0:01:40 lr 0.000897 wd 
0.0500 time 0.2474 (0.2448) data time 0.0007 (0.0016) model time 0.2467 (0.2434) loss 3.6452 (3.4576) grad_norm 2.1787 (2.0009) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][850/1251] eta 0:01:38 lr 0.000897 wd 0.0500 time 0.2368 (0.2448) data time 0.0011 (0.0016) model time 0.2357 (0.2433) loss 3.9485 (3.4566) grad_norm 2.0585 (2.0002) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][860/1251] eta 0:01:35 lr 0.000897 wd 0.0500 time 0.2347 (0.2448) data time 0.0007 (0.0016) model time 0.2339 (0.2433) loss 3.9056 (3.4598) grad_norm 1.9425 (1.9980) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][870/1251] eta 0:01:33 lr 0.000896 wd 0.0500 time 0.2434 (0.2448) data time 0.0008 (0.0016) model time 0.2426 (0.2433) loss 4.1920 (3.4643) grad_norm 1.4826 (1.9962) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][880/1251] eta 0:01:30 lr 0.000896 wd 0.0500 time 0.2358 (0.2447) data time 0.0010 (0.0016) model time 0.2348 (0.2432) loss 3.4136 (3.4660) grad_norm 1.5454 (1.9936) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][890/1251] eta 0:01:28 lr 0.000896 wd 0.0500 time 0.2376 (0.2447) data time 0.0009 (0.0016) model time 0.2367 (0.2432) loss 3.3840 (3.4635) grad_norm 3.1877 (1.9945) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][900/1251] eta 0:01:25 lr 0.000896 wd 0.0500 time 0.2351 (0.2449) data time 0.0010 (0.0015) model time 0.2341 (0.2434) loss 3.6534 (3.4624) grad_norm 4.3237 (1.9998) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][910/1251] eta 0:01:23 lr 0.000896 wd 0.0500 time 0.2447 (0.2450) data time 0.0011 (0.0015) model time 0.2436 (0.2436) loss 3.9025 (3.4636) grad_norm 2.1684 (2.0002) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][920/1251] eta 0:01:21 lr 0.000896 wd 0.0500 time 0.2495 (0.2450) data time 0.0008 (0.0015) model time 0.2487 (0.2436) loss 3.0511 (3.4637) grad_norm 1.7392 (1.9984) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][930/1251] eta 0:01:18 lr 0.000896 wd 0.0500 time 0.2420 (0.2450) data time 0.0009 (0.0015) model time 0.2411 (0.2435) loss 3.6877 (3.4642) grad_norm 1.7254 (2.0003) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][940/1251] eta 0:01:16 lr 0.000896 wd 0.0500 time 0.2366 (0.2449) data time 0.0010 (0.0015) model time 0.2357 (0.2435) loss 2.9351 (3.4640) grad_norm 2.8869 (2.0092) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][950/1251] eta 0:01:13 lr 0.000896 wd 0.0500 time 0.2387 (0.2449) data time 0.0011 (0.0015) model time 0.2376 (0.2435) loss 3.5394 (3.4654) grad_norm 1.9051 (2.0123) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][960/1251] eta 0:01:11 lr 0.000896 wd 0.0500 time 0.2350 (0.2449) data time 0.0008 (0.0015) model time 0.2342 (0.2434) loss 3.7256 (3.4626) grad_norm 3.6679 (2.0148) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][970/1251] eta 0:01:08 lr 0.000896 wd 0.0500 time 0.2499 (0.2448) data time 0.0010 (0.0015) model time 0.2490 (0.2434) loss 3.6626 
(3.4600) grad_norm 1.7428 (2.0140) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][980/1251] eta 0:01:06 lr 0.000896 wd 0.0500 time 0.2446 (0.2448) data time 0.0009 (0.0015) model time 0.2437 (0.2434) loss 2.4701 (3.4570) grad_norm 1.8373 (2.0152) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][990/1251] eta 0:01:03 lr 0.000896 wd 0.0500 time 0.2372 (0.2448) data time 0.0009 (0.0015) model time 0.2363 (0.2434) loss 3.5218 (3.4560) grad_norm 2.2249 (2.0185) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1000/1251] eta 0:01:01 lr 0.000896 wd 0.0500 time 0.2516 (0.2448) data time 0.0008 (0.0015) model time 0.2508 (0.2433) loss 3.0200 (3.4554) grad_norm 1.9958 (2.0177) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1010/1251] eta 0:00:58 lr 0.000896 wd 0.0500 time 0.2514 (0.2447) data time 0.0008 (0.0015) model time 0.2506 (0.2433) loss 4.0209 (3.4555) grad_norm 1.9727 (2.0163) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1020/1251] eta 0:00:56 lr 0.000896 wd 0.0500 time 0.2409 (0.2449) data time 0.0010 (0.0015) model time 0.2400 (0.2435) loss 3.9877 (3.4577) grad_norm 2.3861 (2.0160) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1030/1251] eta 0:00:54 lr 0.000896 wd 0.0500 time 0.2396 (0.2449) data time 0.0010 (0.0015) model time 0.2386 (0.2435) loss 3.5425 (3.4562) grad_norm 1.5611 (2.0164) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1040/1251] eta 0:00:51 lr 
0.000896 wd 0.0500 time 0.2385 (0.2449) data time 0.0010 (0.0015) model time 0.2376 (0.2435) loss 3.0908 (3.4564) grad_norm 1.8954 (2.0146) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1050/1251] eta 0:00:49 lr 0.000896 wd 0.0500 time 0.2486 (0.2449) data time 0.0009 (0.0015) model time 0.2477 (0.2435) loss 3.0843 (3.4556) grad_norm 2.5965 (2.0151) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1060/1251] eta 0:00:46 lr 0.000896 wd 0.0500 time 0.2395 (0.2448) data time 0.0010 (0.0015) model time 0.2385 (0.2434) loss 3.8935 (3.4565) grad_norm 3.0964 (2.0158) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1070/1251] eta 0:00:44 lr 0.000896 wd 0.0500 time 0.2432 (0.2448) data time 0.0008 (0.0015) model time 0.2423 (0.2434) loss 2.9283 (3.4567) grad_norm 1.4425 (2.0133) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1080/1251] eta 0:00:41 lr 0.000896 wd 0.0500 time 0.2534 (0.2452) data time 0.0007 (0.0015) model time 0.2527 (0.2438) loss 3.5160 (3.4584) grad_norm 1.5676 (2.0142) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1090/1251] eta 0:00:39 lr 0.000896 wd 0.0500 time 0.2372 (0.2451) data time 0.0010 (0.0015) model time 0.2362 (0.2438) loss 3.5033 (3.4598) grad_norm 2.6760 (2.0205) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1100/1251] eta 0:00:37 lr 0.000896 wd 0.0500 time 0.2398 (0.2451) data time 0.0008 (0.0015) model time 0.2389 (0.2437) loss 4.0382 (3.4612) grad_norm 2.0230 (2.0198) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
09:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1110/1251] eta 0:00:34 lr 0.000896 wd 0.0500 time 0.2450 (0.2451) data time 0.0007 (0.0015) model time 0.2442 (0.2437) loss 3.2005 (3.4648) grad_norm 2.2668 (2.0192) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1120/1251] eta 0:00:32 lr 0.000896 wd 0.0500 time 0.2425 (0.2451) data time 0.0007 (0.0015) model time 0.2418 (0.2437) loss 3.0101 (3.4635) grad_norm 2.4037 (2.0193) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1130/1251] eta 0:00:29 lr 0.000896 wd 0.0500 time 0.2539 (0.2450) data time 0.0007 (0.0015) model time 0.2533 (0.2437) loss 4.0694 (3.4652) grad_norm 2.2277 (2.0207) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1140/1251] eta 0:00:27 lr 0.000896 wd 0.0500 time 0.2411 (0.2450) data time 0.0007 (0.0015) model time 0.2404 (0.2436) loss 4.0467 (3.4668) grad_norm 1.4294 (2.0186) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1150/1251] eta 0:00:24 lr 0.000896 wd 0.0500 time 0.2396 (0.2450) data time 0.0008 (0.0014) model time 0.2388 (0.2436) loss 3.0419 (3.4659) grad_norm 2.2041 (2.0208) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1160/1251] eta 0:00:22 lr 0.000896 wd 0.0500 time 0.2484 (0.2450) data time 0.0007 (0.0014) model time 0.2477 (0.2436) loss 2.7737 (3.4649) grad_norm 3.8825 (2.0221) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1170/1251] eta 0:00:19 lr 0.000896 wd 0.0500 time 0.2460 (0.2449) data time 0.0007 (0.0014) model time 0.2453 (0.2436) 
loss 2.9653 (3.4642) grad_norm 2.1142 (2.0211) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1180/1251] eta 0:00:17 lr 0.000896 wd 0.0500 time 0.2467 (0.2449) data time 0.0009 (0.0014) model time 0.2458 (0.2435) loss 3.9828 (3.4658) grad_norm 1.5936 (2.0200) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1190/1251] eta 0:00:14 lr 0.000896 wd 0.0500 time 0.2443 (0.2449) data time 0.0010 (0.0014) model time 0.2433 (0.2435) loss 2.8873 (3.4641) grad_norm 1.7538 (2.0230) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1200/1251] eta 0:00:12 lr 0.000896 wd 0.0500 time 0.2425 (0.2450) data time 0.0010 (0.0014) model time 0.2415 (0.2437) loss 3.1638 (3.4633) grad_norm 2.2220 (2.0216) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1210/1251] eta 0:00:10 lr 0.000896 wd 0.0500 time 0.2439 (0.2450) data time 0.0007 (0.0014) model time 0.2431 (0.2436) loss 4.1867 (3.4631) grad_norm 1.7772 (2.0181) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1220/1251] eta 0:00:07 lr 0.000896 wd 0.0500 time 0.2431 (0.2450) data time 0.0010 (0.0014) model time 0.2420 (0.2436) loss 3.6504 (3.4657) grad_norm 2.0014 (2.0165) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1230/1251] eta 0:00:05 lr 0.000896 wd 0.0500 time 0.2517 (0.2449) data time 0.0009 (0.0014) model time 0.2508 (0.2436) loss 3.4346 (3.4660) grad_norm 1.9346 (2.0169) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1240/1251] eta 
0:00:02 lr 0.000895 wd 0.0500 time 0.2261 (0.2448) data time 0.0005 (0.0014) model time 0.2256 (0.2435) loss 2.9287 (3.4647) grad_norm 1.8421 (2.0170) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [78/300][1250/1251] eta 0:00:00 lr 0.000895 wd 0.0500 time 0.2237 (0.2447) data time 0.0005 (0.0014) model time 0.2233 (0.2433) loss 2.8259 (3.4654) grad_norm 1.9478 (2.0167) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 78 training takes 0:05:06 [2024-08-26 09:02:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:02:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.406 (0.406) Loss 0.5039 (0.5039) Acc@1 89.844 (89.844) Acc@5 97.461 (97.461) Mem 7379MB [2024-08-26 09:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.105) Loss 0.9336 (0.8333) Acc@1 79.785 (81.419) Acc@5 95.020 (95.854) Mem 7379MB [2024-08-26 09:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.093) Loss 1.2764 (0.8497) Acc@1 68.750 (80.738) Acc@5 91.699 (95.912) Mem 7379MB [2024-08-26 09:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.088) Loss 1.5439 (0.9709) Acc@1 62.988 (78.346) Acc@5 87.891 (94.405) Mem 7379MB [2024-08-26 09:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.4424 (1.0348) Acc@1 67.578 (76.882) Acc@5 88.867 (93.600) Mem 7379MB [2024-08-26 09:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.510 Acc@5 93.506 [2024-08-26 09:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of 
the network on the 50000 test images: 76.5% [2024-08-26 09:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.816 (0.816) Loss 0.4585 (0.4585) Acc@1 91.992 (91.992) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 09:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.149) Loss 0.7397 (0.7184) Acc@1 84.766 (84.411) Acc@5 96.191 (96.804) Mem 7379MB [2024-08-26 09:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.117) Loss 1.0283 (0.7419) Acc@1 75.391 (83.347) Acc@5 93.555 (96.773) Mem 7379MB [2024-08-26 09:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.103) Loss 1.3047 (0.8474) Acc@1 66.992 (80.932) Acc@5 90.820 (95.546) Mem 7379MB [2024-08-26 09:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.1904 (0.9040) Acc@1 71.094 (79.435) Acc@5 91.406 (94.905) Mem 7379MB [2024-08-26 09:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.058 Acc@5 94.862 [2024-08-26 09:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.1% [2024-08-26 09:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.06% [2024-08-26 09:02:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:02:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 09:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][0/1251] eta 0:14:18 lr 0.000895 wd 0.0500 time 0.6864 (0.6864) data time 0.4401 (0.4401) model time 0.0000 (0.0000) loss 2.3144 (2.3144) grad_norm 1.7631 (1.7631) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][10/1251] eta 0:06:11 lr 0.000895 wd 0.0500 time 0.2315 (0.2994) data time 0.0011 (0.0410) model time 0.0000 (0.0000) loss 2.7528 (3.3864) grad_norm 1.5495 (1.9005) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][20/1251] eta 0:05:45 lr 0.000895 wd 0.0500 time 0.2448 (0.2807) data time 0.0010 (0.0220) model time 0.0000 (0.0000) loss 2.8059 (3.3979) grad_norm 2.5167 (2.0081) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][30/1251] eta 0:05:27 lr 0.000895 wd 0.0500 time 0.2420 (0.2680) data time 0.0009 (0.0152) model time 0.0000 (0.0000) loss 3.8159 (3.4133) grad_norm 2.3873 (1.9684) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][40/1251] eta 0:05:17 lr 0.000895 wd 0.0500 time 0.2477 (0.2621) data time 0.0009 (0.0117) model time 0.0000 (0.0000) loss 3.1766 (3.3823) grad_norm 1.4684 (1.9510) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][50/1251] eta 0:05:09 lr 0.000895 wd 0.0500 time 0.2460 (0.2577) data time 0.0009 (0.0096) model time 0.0000 (0.0000) loss 3.6462 (3.3506) grad_norm 1.9203 (1.9305) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][60/1251] eta 0:05:03 lr 0.000895 wd 0.0500 time 0.2330 (0.2549) data time 0.0012 (0.0082) model time 0.2318 (0.2397) loss 
3.1906 (3.3944) grad_norm 1.9710 (1.9404) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][70/1251] eta 0:04:59 lr 0.000895 wd 0.0500 time 0.2424 (0.2533) data time 0.0007 (0.0072) model time 0.2418 (0.2410) loss 3.1509 (3.4216) grad_norm 2.7611 (1.9813) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][80/1251] eta 0:04:54 lr 0.000895 wd 0.0500 time 0.2414 (0.2517) data time 0.0009 (0.0064) model time 0.2406 (0.2404) loss 4.0531 (3.4151) grad_norm 2.0874 (1.9953) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][90/1251] eta 0:04:53 lr 0.000895 wd 0.0500 time 0.2352 (0.2526) data time 0.0010 (0.0058) model time 0.2342 (0.2451) loss 3.5050 (3.4131) grad_norm 1.7659 (1.9864) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][100/1251] eta 0:04:54 lr 0.000895 wd 0.0500 time 0.2348 (0.2555) data time 0.0007 (0.0054) model time 0.2341 (0.2523) loss 4.2133 (3.4450) grad_norm 1.5943 (1.9836) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][110/1251] eta 0:04:50 lr 0.000895 wd 0.0500 time 0.2425 (0.2545) data time 0.0009 (0.0050) model time 0.2416 (0.2507) loss 3.5093 (3.4480) grad_norm 1.5954 (1.9637) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][120/1251] eta 0:04:46 lr 0.000895 wd 0.0500 time 0.2445 (0.2535) data time 0.0008 (0.0047) model time 0.2438 (0.2494) loss 2.8950 (3.4087) grad_norm 2.1607 (1.9628) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][130/1251] eta 0:04:43 lr 
0.000895 wd 0.0500 time 0.2397 (0.2525) data time 0.0010 (0.0044) model time 0.2387 (0.2481) loss 3.6757 (3.4106) grad_norm 1.8724 (1.9666) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][140/1251] eta 0:04:39 lr 0.000895 wd 0.0500 time 0.2539 (0.2520) data time 0.0010 (0.0042) model time 0.2529 (0.2476) loss 3.2002 (3.4344) grad_norm 1.7846 (1.9870) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][150/1251] eta 0:04:36 lr 0.000895 wd 0.0500 time 0.2448 (0.2511) data time 0.0008 (0.0040) model time 0.2440 (0.2466) loss 4.1809 (3.4405) grad_norm 1.8759 (1.9773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][160/1251] eta 0:04:33 lr 0.000895 wd 0.0500 time 0.2480 (0.2506) data time 0.0007 (0.0038) model time 0.2473 (0.2463) loss 3.4424 (3.4594) grad_norm 2.5027 (1.9941) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][170/1251] eta 0:04:30 lr 0.000895 wd 0.0500 time 0.2369 (0.2502) data time 0.0008 (0.0037) model time 0.2361 (0.2459) loss 3.2288 (3.4652) grad_norm 2.0413 (2.0096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][180/1251] eta 0:04:28 lr 0.000895 wd 0.0500 time 0.2480 (0.2503) data time 0.0007 (0.0035) model time 0.2473 (0.2463) loss 2.9148 (3.4679) grad_norm 3.7883 (2.0425) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][190/1251] eta 0:04:25 lr 0.000895 wd 0.0500 time 0.2444 (0.2498) data time 0.0009 (0.0034) model time 0.2434 (0.2459) loss 3.7976 (3.4816) grad_norm 2.3231 (2.0477) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][200/1251] eta 0:04:22 lr 0.000895 wd 0.0500 time 0.2417 (0.2494) data time 0.0007 (0.0033) model time 0.2409 (0.2455) loss 3.4634 (3.4744) grad_norm 2.2350 (2.0415) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][210/1251] eta 0:04:19 lr 0.000895 wd 0.0500 time 0.2354 (0.2489) data time 0.0009 (0.0032) model time 0.2345 (0.2450) loss 2.4743 (3.4701) grad_norm 1.6844 (2.0426) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][220/1251] eta 0:04:16 lr 0.000895 wd 0.0500 time 0.2342 (0.2486) data time 0.0010 (0.0031) model time 0.2332 (0.2447) loss 4.0752 (3.4832) grad_norm 1.7128 (2.0575) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][230/1251] eta 0:04:13 lr 0.000895 wd 0.0500 time 0.2400 (0.2483) data time 0.0010 (0.0030) model time 0.2390 (0.2445) loss 3.4668 (3.4901) grad_norm 1.7266 (2.0479) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][240/1251] eta 0:04:10 lr 0.000895 wd 0.0500 time 0.2434 (0.2480) data time 0.0008 (0.0029) model time 0.2426 (0.2443) loss 4.4122 (3.5052) grad_norm 2.0530 (2.0510) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][250/1251] eta 0:04:08 lr 0.000895 wd 0.0500 time 0.2404 (0.2478) data time 0.0010 (0.0028) model time 0.2394 (0.2442) loss 3.9613 (3.4931) grad_norm 2.0147 (2.0732) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][260/1251] eta 0:04:05 lr 0.000895 wd 0.0500 time 0.2527 (0.2476) data time 0.0010 (0.0028) model time 0.2517 (0.2440) loss 3.6329 
(3.5031) grad_norm 2.4502 (2.0831) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][270/1251] eta 0:04:02 lr 0.000895 wd 0.0500 time 0.2465 (0.2474) data time 0.0007 (0.0027) model time 0.2458 (0.2439) loss 3.7562 (3.4923) grad_norm 1.7349 (2.0794) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][280/1251] eta 0:04:00 lr 0.000895 wd 0.0500 time 0.2449 (0.2472) data time 0.0007 (0.0026) model time 0.2442 (0.2438) loss 3.2176 (3.4915) grad_norm 2.9294 (2.0929) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][290/1251] eta 0:03:57 lr 0.000895 wd 0.0500 time 0.2422 (0.2470) data time 0.0008 (0.0026) model time 0.2414 (0.2436) loss 3.2273 (3.4987) grad_norm 1.8885 (2.0929) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][300/1251] eta 0:03:54 lr 0.000895 wd 0.0500 time 0.2507 (0.2469) data time 0.0010 (0.0025) model time 0.2497 (0.2436) loss 3.9482 (3.5073) grad_norm 1.5778 (2.0869) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][310/1251] eta 0:03:52 lr 0.000895 wd 0.0500 time 0.2443 (0.2467) data time 0.0009 (0.0025) model time 0.2435 (0.2435) loss 3.7897 (3.5075) grad_norm 2.5580 (2.0973) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][320/1251] eta 0:03:49 lr 0.000895 wd 0.0500 time 0.2441 (0.2465) data time 0.0008 (0.0024) model time 0.2433 (0.2433) loss 3.6132 (3.5036) grad_norm 2.0067 (2.0899) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][330/1251] eta 0:03:47 lr 0.000895 wd 
0.0500 time 0.2482 (0.2469) data time 0.0011 (0.0024) model time 0.2472 (0.2439) loss 3.7609 (3.4983) grad_norm 1.6512 (2.0777) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][340/1251] eta 0:03:44 lr 0.000895 wd 0.0500 time 0.2477 (0.2468) data time 0.0010 (0.0024) model time 0.2467 (0.2439) loss 3.3575 (3.4982) grad_norm 3.6659 (2.0866) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][350/1251] eta 0:03:42 lr 0.000894 wd 0.0500 time 0.2452 (0.2467) data time 0.0008 (0.0023) model time 0.2444 (0.2438) loss 3.9284 (3.5001) grad_norm 2.5309 (2.0897) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][360/1251] eta 0:03:39 lr 0.000894 wd 0.0500 time 0.2454 (0.2466) data time 0.0010 (0.0023) model time 0.2444 (0.2437) loss 3.9799 (3.5000) grad_norm 1.7853 (2.0803) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][370/1251] eta 0:03:37 lr 0.000894 wd 0.0500 time 0.2423 (0.2466) data time 0.0011 (0.0023) model time 0.2412 (0.2437) loss 4.2611 (3.5009) grad_norm 2.0146 (2.0731) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][380/1251] eta 0:03:34 lr 0.000894 wd 0.0500 time 0.2409 (0.2465) data time 0.0008 (0.0022) model time 0.2401 (0.2437) loss 4.2690 (3.4959) grad_norm 1.8131 (2.0702) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][390/1251] eta 0:03:32 lr 0.000894 wd 0.0500 time 0.2368 (0.2463) data time 0.0008 (0.0022) model time 0.2361 (0.2436) loss 4.2461 (3.5013) grad_norm 1.8220 (2.0708) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][400/1251] eta 0:03:30 lr 0.000894 wd 0.0500 time 0.2413 (0.2472) data time 0.0010 (0.0022) model time 0.2402 (0.2446) loss 3.4684 (3.5003) grad_norm 2.0326 (2.0689) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][410/1251] eta 0:03:27 lr 0.000894 wd 0.0500 time 0.2458 (0.2471) data time 0.0010 (0.0021) model time 0.2447 (0.2445) loss 4.0373 (3.5024) grad_norm 1.9120 (2.0696) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][420/1251] eta 0:03:25 lr 0.000894 wd 0.0500 time 0.2405 (0.2469) data time 0.0010 (0.0021) model time 0.2395 (0.2444) loss 3.7836 (3.5041) grad_norm 2.2621 (2.0669) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][430/1251] eta 0:03:22 lr 0.000894 wd 0.0500 time 0.2471 (0.2469) data time 0.0009 (0.0021) model time 0.2462 (0.2443) loss 3.8897 (3.5058) grad_norm 1.6444 (2.0707) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][440/1251] eta 0:03:20 lr 0.000894 wd 0.0500 time 0.2411 (0.2467) data time 0.0010 (0.0021) model time 0.2401 (0.2442) loss 3.5191 (3.5084) grad_norm 2.9012 (2.0672) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][450/1251] eta 0:03:17 lr 0.000894 wd 0.0500 time 0.2435 (0.2466) data time 0.0010 (0.0020) model time 0.2425 (0.2441) loss 2.6026 (3.5028) grad_norm 1.8597 (2.0739) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][460/1251] eta 0:03:14 lr 0.000894 wd 0.0500 time 0.2349 (0.2465) data time 0.0012 (0.0020) model time 0.2337 (0.2440) loss 2.8733 
(3.4981) grad_norm 1.5004 (2.0731) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][470/1251] eta 0:03:12 lr 0.000894 wd 0.0500 time 0.2424 (0.2464) data time 0.0012 (0.0020) model time 0.2412 (0.2439) loss 4.0156 (3.4952) grad_norm 1.4651 (2.0680) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][480/1251] eta 0:03:09 lr 0.000894 wd 0.0500 time 0.2422 (0.2463) data time 0.0012 (0.0020) model time 0.2410 (0.2439) loss 3.6682 (3.4930) grad_norm 1.7485 (2.0695) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][490/1251] eta 0:03:07 lr 0.000894 wd 0.0500 time 0.2445 (0.2462) data time 0.0008 (0.0020) model time 0.2438 (0.2438) loss 3.8705 (3.4938) grad_norm 1.5510 (2.0627) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][500/1251] eta 0:03:04 lr 0.000894 wd 0.0500 time 0.2513 (0.2461) data time 0.0010 (0.0019) model time 0.2504 (0.2437) loss 3.6317 (3.4896) grad_norm 2.0169 (2.0576) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][510/1251] eta 0:03:02 lr 0.000894 wd 0.0500 time 0.2362 (0.2460) data time 0.0010 (0.0019) model time 0.2352 (0.2436) loss 2.8765 (3.4849) grad_norm 2.1512 (2.0564) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][520/1251] eta 0:02:59 lr 0.000894 wd 0.0500 time 0.2369 (0.2459) data time 0.0010 (0.0019) model time 0.2359 (0.2435) loss 3.8733 (3.4869) grad_norm 3.3002 (2.0583) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][530/1251] eta 0:02:57 lr 0.000894 wd 
0.0500 time 0.2436 (0.2458) data time 0.0012 (0.0019) model time 0.2423 (0.2435) loss 3.8571 (3.4908) grad_norm 2.0404 (2.0588) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][540/1251] eta 0:02:54 lr 0.000894 wd 0.0500 time 0.2468 (0.2458) data time 0.0008 (0.0019) model time 0.2460 (0.2435) loss 4.6354 (3.4915) grad_norm 3.7008 (2.0616) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][550/1251] eta 0:02:52 lr 0.000894 wd 0.0500 time 0.2387 (0.2461) data time 0.0008 (0.0019) model time 0.2379 (0.2438) loss 2.2486 (3.4906) grad_norm 1.6058 (2.0603) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][560/1251] eta 0:02:49 lr 0.000894 wd 0.0500 time 0.2419 (0.2460) data time 0.0007 (0.0018) model time 0.2412 (0.2438) loss 3.3074 (3.4924) grad_norm 1.7651 (2.0539) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][570/1251] eta 0:02:47 lr 0.000894 wd 0.0500 time 0.2399 (0.2459) data time 0.0010 (0.0018) model time 0.2390 (0.2437) loss 3.4678 (3.4909) grad_norm 1.6665 (2.0524) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][580/1251] eta 0:02:44 lr 0.000894 wd 0.0500 time 0.2365 (0.2458) data time 0.0007 (0.0018) model time 0.2358 (0.2436) loss 3.8101 (3.4872) grad_norm 1.5793 (2.0485) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][590/1251] eta 0:02:42 lr 0.000894 wd 0.0500 time 0.2391 (0.2457) data time 0.0009 (0.0018) model time 0.2382 (0.2435) loss 3.4150 (3.4886) grad_norm 1.6124 (2.0479) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:16 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][600/1251] eta 0:02:39 lr 0.000894 wd 0.0500 time 0.2343 (0.2456) data time 0.0007 (0.0018) model time 0.2335 (0.2434) loss 3.6074 (3.4857) grad_norm 1.9537 (2.0489) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][610/1251] eta 0:02:37 lr 0.000894 wd 0.0500 time 0.2502 (0.2456) data time 0.0010 (0.0018) model time 0.2491 (0.2434) loss 3.2392 (3.4880) grad_norm 1.3989 (2.0454) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][620/1251] eta 0:02:34 lr 0.000894 wd 0.0500 time 0.2490 (0.2455) data time 0.0007 (0.0018) model time 0.2482 (0.2434) loss 3.4129 (3.4896) grad_norm 2.0968 (2.0413) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][630/1251] eta 0:02:32 lr 0.000894 wd 0.0500 time 0.2447 (0.2458) data time 0.0010 (0.0018) model time 0.2437 (0.2437) loss 3.2214 (3.4878) grad_norm 2.0188 (2.0388) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][640/1251] eta 0:02:30 lr 0.000894 wd 0.0500 time 0.2354 (0.2461) data time 0.0009 (0.0017) model time 0.2345 (0.2440) loss 2.1019 (3.4839) grad_norm 2.0689 (2.0421) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][650/1251] eta 0:02:27 lr 0.000894 wd 0.0500 time 0.2451 (0.2460) data time 0.0009 (0.0017) model time 0.2442 (0.2440) loss 3.5402 (3.4840) grad_norm 1.6459 (2.0397) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][660/1251] eta 0:02:25 lr 0.000894 wd 0.0500 time 0.2420 (0.2459) data time 0.0010 (0.0017) model time 0.2410 (0.2439) loss 2.6786 
(3.4802) grad_norm 1.5746 (2.0422) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][670/1251] eta 0:02:22 lr 0.000894 wd 0.0500 time 0.2404 (0.2459) data time 0.0010 (0.0017) model time 0.2394 (0.2438) loss 3.5188 (3.4822) grad_norm 2.7575 (2.0409) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][680/1251] eta 0:02:20 lr 0.000894 wd 0.0500 time 0.2459 (0.2458) data time 0.0008 (0.0017) model time 0.2451 (0.2438) loss 2.3341 (3.4827) grad_norm 1.7712 (2.0458) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][690/1251] eta 0:02:17 lr 0.000894 wd 0.0500 time 0.2366 (0.2458) data time 0.0010 (0.0017) model time 0.2356 (0.2438) loss 2.3556 (3.4817) grad_norm 2.1929 (2.0503) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][700/1251] eta 0:02:15 lr 0.000894 wd 0.0500 time 0.2385 (0.2457) data time 0.0011 (0.0017) model time 0.2375 (0.2438) loss 3.7453 (3.4868) grad_norm 1.6291 (2.0463) loss_scale 8192.0000 (4148.5877) mem 7379MB [2024-08-26 09:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][710/1251] eta 0:02:13 lr 0.000893 wd 0.0500 time 0.2439 (0.2460) data time 0.0012 (0.0017) model time 0.2427 (0.2440) loss 3.2002 (3.4827) grad_norm 2.2384 (2.0436) loss_scale 8192.0000 (4205.4571) mem 7379MB [2024-08-26 09:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][720/1251] eta 0:02:10 lr 0.000893 wd 0.0500 time 0.2443 (0.2459) data time 0.0007 (0.0017) model time 0.2436 (0.2440) loss 4.1525 (3.4869) grad_norm 2.1532 (2.0429) loss_scale 8192.0000 (4260.7490) mem 7379MB [2024-08-26 09:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][730/1251] eta 0:02:08 lr 0.000893 wd 
0.0500 time 0.2398 (0.2459) data time 0.0011 (0.0016) model time 0.2387 (0.2440) loss 3.4809 (3.4890) grad_norm 2.0625 (2.0477) loss_scale 8192.0000 (4314.5280) mem 7379MB [2024-08-26 09:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][740/1251] eta 0:02:05 lr 0.000893 wd 0.0500 time 0.2426 (0.2458) data time 0.0008 (0.0016) model time 0.2418 (0.2439) loss 3.8723 (3.4843) grad_norm 1.5269 (2.0522) loss_scale 8192.0000 (4366.8556) mem 7379MB [2024-08-26 09:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][750/1251] eta 0:02:03 lr 0.000893 wd 0.0500 time 0.2466 (0.2458) data time 0.0010 (0.0016) model time 0.2457 (0.2439) loss 3.4504 (3.4839) grad_norm 1.7031 (2.0517) loss_scale 8192.0000 (4417.7896) mem 7379MB [2024-08-26 09:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][760/1251] eta 0:02:00 lr 0.000893 wd 0.0500 time 0.2353 (0.2457) data time 0.0010 (0.0016) model time 0.2343 (0.2438) loss 3.3482 (3.4802) grad_norm 2.3616 (2.0520) loss_scale 8192.0000 (4467.3850) mem 7379MB [2024-08-26 09:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][770/1251] eta 0:01:58 lr 0.000893 wd 0.0500 time 0.2403 (0.2457) data time 0.0011 (0.0016) model time 0.2391 (0.2438) loss 4.1334 (3.4818) grad_norm 2.0832 (2.0510) loss_scale 8192.0000 (4515.6939) mem 7379MB [2024-08-26 09:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][780/1251] eta 0:01:55 lr 0.000893 wd 0.0500 time 0.2351 (0.2457) data time 0.0010 (0.0016) model time 0.2341 (0.2438) loss 3.0427 (3.4803) grad_norm 1.8192 (2.0528) loss_scale 8192.0000 (4562.7657) mem 7379MB [2024-08-26 09:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][790/1251] eta 0:01:53 lr 0.000893 wd 0.0500 time 0.2368 (0.2456) data time 0.0007 (0.0016) model time 0.2361 (0.2437) loss 3.0661 (3.4810) grad_norm 1.7539 (2.0556) loss_scale 8192.0000 (4608.6473) mem 7379MB [2024-08-26 09:06:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][800/1251] eta 0:01:50 lr 0.000893 wd 0.0500 time 0.2388 (0.2455) data time 0.0009 (0.0016) model time 0.2379 (0.2437) loss 3.6024 (3.4823) grad_norm 1.7470 (2.0554) loss_scale 8192.0000 (4653.3833) mem 7379MB [2024-08-26 09:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][810/1251] eta 0:01:48 lr 0.000893 wd 0.0500 time 0.2411 (0.2455) data time 0.0011 (0.0016) model time 0.2400 (0.2437) loss 3.5593 (3.4841) grad_norm 1.5031 (2.0522) loss_scale 8192.0000 (4697.0160) mem 7379MB [2024-08-26 09:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][820/1251] eta 0:01:45 lr 0.000893 wd 0.0500 time 0.2391 (0.2455) data time 0.0007 (0.0016) model time 0.2384 (0.2436) loss 3.7136 (3.4841) grad_norm 1.5200 (2.0539) loss_scale 8192.0000 (4739.5859) mem 7379MB [2024-08-26 09:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][830/1251] eta 0:01:43 lr 0.000893 wd 0.0500 time 0.2462 (0.2454) data time 0.0008 (0.0016) model time 0.2454 (0.2436) loss 3.6810 (3.4850) grad_norm 1.6094 (2.0526) loss_scale 8192.0000 (4781.1312) mem 7379MB [2024-08-26 09:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][840/1251] eta 0:01:40 lr 0.000893 wd 0.0500 time 0.2410 (0.2454) data time 0.0010 (0.0016) model time 0.2400 (0.2436) loss 2.5146 (3.4854) grad_norm 1.4626 (2.0475) loss_scale 8192.0000 (4821.6885) mem 7379MB [2024-08-26 09:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][850/1251] eta 0:01:38 lr 0.000893 wd 0.0500 time 0.2440 (0.2453) data time 0.0008 (0.0016) model time 0.2431 (0.2435) loss 2.9427 (3.4843) grad_norm 1.7059 (2.0462) loss_scale 8192.0000 (4861.2926) mem 7379MB [2024-08-26 09:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][860/1251] eta 0:01:36 lr 0.000893 wd 0.0500 time 0.4421 (0.2455) data time 0.0009 (0.0016) model time 0.4412 (0.2437) loss 3.9061 
(3.4831) grad_norm 1.5467 (2.0448) loss_scale 8192.0000 (4899.9768) mem 7379MB [2024-08-26 09:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][870/1251] eta 0:01:33 lr 0.000893 wd 0.0500 time 0.2414 (0.2455) data time 0.0011 (0.0015) model time 0.2403 (0.2437) loss 3.6239 (3.4844) grad_norm 2.0331 (2.0487) loss_scale 8192.0000 (4937.7727) mem 7379MB [2024-08-26 09:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][880/1251] eta 0:01:31 lr 0.000893 wd 0.0500 time 0.2478 (0.2455) data time 0.0012 (0.0015) model time 0.2467 (0.2437) loss 3.0006 (3.4815) grad_norm 2.1469 (2.0479) loss_scale 8192.0000 (4974.7106) mem 7379MB [2024-08-26 09:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][890/1251] eta 0:01:28 lr 0.000893 wd 0.0500 time 0.2399 (0.2455) data time 0.0010 (0.0015) model time 0.2389 (0.2437) loss 4.4481 (3.4802) grad_norm 1.8503 (2.0451) loss_scale 8192.0000 (5010.8193) mem 7379MB [2024-08-26 09:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][900/1251] eta 0:01:26 lr 0.000893 wd 0.0500 time 0.2385 (0.2454) data time 0.0009 (0.0015) model time 0.2376 (0.2436) loss 4.3036 (3.4814) grad_norm 2.7863 (2.0500) loss_scale 8192.0000 (5046.1265) mem 7379MB [2024-08-26 09:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][910/1251] eta 0:01:23 lr 0.000893 wd 0.0500 time 0.2557 (0.2454) data time 0.0010 (0.0015) model time 0.2547 (0.2436) loss 3.6574 (3.4809) grad_norm 1.4504 (2.0479) loss_scale 8192.0000 (5080.6586) mem 7379MB [2024-08-26 09:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][920/1251] eta 0:01:21 lr 0.000893 wd 0.0500 time 0.2487 (0.2455) data time 0.0014 (0.0015) model time 0.2473 (0.2438) loss 4.2219 (3.4838) grad_norm 1.5304 (2.0480) loss_scale 8192.0000 (5114.4408) mem 7379MB [2024-08-26 09:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][930/1251] eta 0:01:18 lr 0.000893 wd 
0.0500 time 0.2397 (0.2455) data time 0.0007 (0.0015) model time 0.2389 (0.2437) loss 4.4146 (3.4850) grad_norm 1.5890 (2.0451) loss_scale 8192.0000 (5147.4973) mem 7379MB [2024-08-26 09:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][940/1251] eta 0:01:16 lr 0.000893 wd 0.0500 time 0.2448 (0.2455) data time 0.0008 (0.0015) model time 0.2440 (0.2437) loss 3.8821 (3.4855) grad_norm 1.6964 (2.0437) loss_scale 8192.0000 (5179.8512) mem 7379MB [2024-08-26 09:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][950/1251] eta 0:01:13 lr 0.000893 wd 0.0500 time 0.2397 (0.2456) data time 0.0009 (0.0015) model time 0.2388 (0.2439) loss 3.9257 (3.4867) grad_norm 2.7182 (2.0446) loss_scale 8192.0000 (5211.5247) mem 7379MB [2024-08-26 09:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][960/1251] eta 0:01:11 lr 0.000893 wd 0.0500 time 0.2377 (0.2456) data time 0.0011 (0.0015) model time 0.2366 (0.2439) loss 3.4333 (3.4870) grad_norm 2.0857 (2.0443) loss_scale 8192.0000 (5242.5390) mem 7379MB [2024-08-26 09:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][970/1251] eta 0:01:09 lr 0.000893 wd 0.0500 time 0.2355 (0.2456) data time 0.0007 (0.0015) model time 0.2347 (0.2439) loss 3.8427 (3.4871) grad_norm 3.1529 (2.0459) loss_scale 8192.0000 (5272.9145) mem 7379MB [2024-08-26 09:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][980/1251] eta 0:01:06 lr 0.000893 wd 0.0500 time 0.2441 (0.2455) data time 0.0007 (0.0015) model time 0.2434 (0.2438) loss 3.0032 (3.4850) grad_norm 1.8851 (2.0460) loss_scale 8192.0000 (5302.6707) mem 7379MB [2024-08-26 09:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][990/1251] eta 0:01:04 lr 0.000893 wd 0.0500 time 0.2366 (0.2455) data time 0.0011 (0.0015) model time 0.2356 (0.2438) loss 3.4362 (3.4843) grad_norm 2.8471 (2.0478) loss_scale 8192.0000 (5331.8264) mem 7379MB [2024-08-26 09:06:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1000/1251] eta 0:01:01 lr 0.000893 wd 0.0500 time 0.2364 (0.2454) data time 0.0010 (0.0015) model time 0.2353 (0.2437) loss 3.7822 (3.4844) grad_norm 4.1850 (2.0493) loss_scale 8192.0000 (5360.3996) mem 7379MB [2024-08-26 09:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1010/1251] eta 0:00:59 lr 0.000893 wd 0.0500 time 0.2389 (0.2454) data time 0.0008 (0.0015) model time 0.2381 (0.2437) loss 3.9178 (3.4852) grad_norm 2.2399 (2.0510) loss_scale 8192.0000 (5388.4075) mem 7379MB [2024-08-26 09:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1020/1251] eta 0:00:56 lr 0.000893 wd 0.0500 time 0.2252 (0.2455) data time 0.0011 (0.0015) model time 0.2240 (0.2438) loss 3.7111 (3.4868) grad_norm 1.5608 (2.0503) loss_scale 8192.0000 (5415.8668) mem 7379MB [2024-08-26 09:07:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1030/1251] eta 0:00:54 lr 0.000893 wd 0.0500 time 0.2408 (0.2459) data time 0.0010 (0.0015) model time 0.2398 (0.2442) loss 3.4811 (3.4859) grad_norm 2.1089 (2.0502) loss_scale 8192.0000 (5442.7934) mem 7379MB [2024-08-26 09:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1040/1251] eta 0:00:51 lr 0.000893 wd 0.0500 time 0.2443 (0.2459) data time 0.0009 (0.0015) model time 0.2434 (0.2442) loss 3.6388 (3.4861) grad_norm 1.9748 (2.0476) loss_scale 8192.0000 (5469.2027) mem 7379MB [2024-08-26 09:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1050/1251] eta 0:00:49 lr 0.000893 wd 0.0500 time 0.2443 (0.2458) data time 0.0007 (0.0015) model time 0.2435 (0.2442) loss 2.6320 (3.4843) grad_norm 1.9072 (2.0485) loss_scale 8192.0000 (5495.1094) mem 7379MB [2024-08-26 09:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1060/1251] eta 0:00:46 lr 0.000893 wd 0.0500 time 0.2367 (0.2458) data time 0.0012 (0.0015) model time 0.2355 (0.2441) loss 3.5849 
(3.4838) grad_norm 2.3117 (2.0475) loss_scale 8192.0000 (5520.5278) mem 7379MB [2024-08-26 09:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1070/1251] eta 0:00:44 lr 0.000893 wd 0.0500 time 0.2384 (0.2459) data time 0.0009 (0.0014) model time 0.2375 (0.2443) loss 3.1053 (3.4851) grad_norm 2.0169 (2.0475) loss_scale 8192.0000 (5545.4715) mem 7379MB [2024-08-26 09:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1080/1251] eta 0:00:42 lr 0.000892 wd 0.0500 time 0.2439 (0.2459) data time 0.0009 (0.0014) model time 0.2430 (0.2442) loss 2.9290 (3.4872) grad_norm 1.7349 (2.0471) loss_scale 8192.0000 (5569.9537) mem 7379MB [2024-08-26 09:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1090/1251] eta 0:00:39 lr 0.000892 wd 0.0500 time 0.2395 (0.2458) data time 0.0010 (0.0014) model time 0.2385 (0.2442) loss 2.7266 (3.4891) grad_norm 1.8427 (2.0453) loss_scale 8192.0000 (5593.9872) mem 7379MB [2024-08-26 09:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1100/1251] eta 0:00:37 lr 0.000892 wd 0.0500 time 0.2463 (0.2458) data time 0.0010 (0.0014) model time 0.2453 (0.2442) loss 3.6521 (3.4889) grad_norm 1.7771 (2.0441) loss_scale 8192.0000 (5617.5840) mem 7379MB [2024-08-26 09:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1110/1251] eta 0:00:34 lr 0.000892 wd 0.0500 time 0.2457 (0.2458) data time 0.0011 (0.0014) model time 0.2446 (0.2442) loss 3.1307 (3.4888) grad_norm 1.6203 (2.0429) loss_scale 8192.0000 (5640.7561) mem 7379MB [2024-08-26 09:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1120/1251] eta 0:00:32 lr 0.000892 wd 0.0500 time 0.2429 (0.2457) data time 0.0007 (0.0014) model time 0.2422 (0.2441) loss 3.4791 (3.4906) grad_norm 1.8248 (2.0434) loss_scale 8192.0000 (5663.5147) mem 7379MB [2024-08-26 09:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1130/1251] eta 0:00:29 lr 
0.000892 wd 0.0500 time 0.2473 (0.2457) data time 0.0011 (0.0014) model time 0.2462 (0.2441) loss 3.7749 (3.4920) grad_norm 1.9376 (2.0449) loss_scale 8192.0000 (5685.8709) mem 7379MB [2024-08-26 09:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1140/1251] eta 0:00:27 lr 0.000892 wd 0.0500 time 0.2320 (0.2457) data time 0.0010 (0.0014) model time 0.2310 (0.2441) loss 4.0170 (3.4932) grad_norm 1.7528 (2.0429) loss_scale 8192.0000 (5707.8352) mem 7379MB [2024-08-26 09:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1150/1251] eta 0:00:24 lr 0.000892 wd 0.0500 time 0.2419 (0.2456) data time 0.0009 (0.0014) model time 0.2409 (0.2440) loss 4.0126 (3.4928) grad_norm 2.0883 (2.0405) loss_scale 8192.0000 (5729.4179) mem 7379MB [2024-08-26 09:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1160/1251] eta 0:00:22 lr 0.000892 wd 0.0500 time 0.2431 (0.2456) data time 0.0009 (0.0014) model time 0.2422 (0.2440) loss 4.1110 (3.4942) grad_norm 1.8341 (2.0398) loss_scale 8192.0000 (5750.6288) mem 7379MB [2024-08-26 09:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1170/1251] eta 0:00:19 lr 0.000892 wd 0.0500 time 0.2351 (0.2457) data time 0.0011 (0.0014) model time 0.2340 (0.2441) loss 3.4413 (3.4947) grad_norm 2.8223 (2.0402) loss_scale 8192.0000 (5771.4774) mem 7379MB [2024-08-26 09:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1180/1251] eta 0:00:17 lr 0.000892 wd 0.0500 time 0.2391 (0.2459) data time 0.0011 (0.0014) model time 0.2380 (0.2443) loss 3.7152 (3.4941) grad_norm 1.8263 (2.0409) loss_scale 8192.0000 (5791.9729) mem 7379MB [2024-08-26 09:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1190/1251] eta 0:00:14 lr 0.000892 wd 0.0500 time 0.2459 (0.2458) data time 0.0010 (0.0014) model time 0.2449 (0.2443) loss 3.2306 (3.4924) grad_norm 1.9794 (2.0404) loss_scale 8192.0000 (5812.1243) mem 7379MB [2024-08-26 
09:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1200/1251] eta 0:00:12 lr 0.000892 wd 0.0500 time 0.2429 (0.2458) data time 0.0010 (0.0014) model time 0.2419 (0.2442) loss 2.1872 (3.4939) grad_norm 1.7210 (2.0395) loss_scale 8192.0000 (5831.9400) mem 7379MB [2024-08-26 09:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1210/1251] eta 0:00:10 lr 0.000892 wd 0.0500 time 0.2476 (0.2458) data time 0.0009 (0.0014) model time 0.2467 (0.2442) loss 3.7683 (3.4942) grad_norm 2.4096 (2.0387) loss_scale 8192.0000 (5851.4286) mem 7379MB [2024-08-26 09:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1220/1251] eta 0:00:07 lr 0.000892 wd 0.0500 time 0.2372 (0.2457) data time 0.0012 (0.0014) model time 0.2360 (0.2442) loss 3.4420 (3.4950) grad_norm 1.4732 (2.0376) loss_scale 8192.0000 (5870.5979) mem 7379MB [2024-08-26 09:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1230/1251] eta 0:00:05 lr 0.000892 wd 0.0500 time 0.2363 (0.2457) data time 0.0013 (0.0014) model time 0.2350 (0.2441) loss 3.4855 (3.4946) grad_norm 2.2200 (2.0363) loss_scale 8192.0000 (5889.4557) mem 7379MB [2024-08-26 09:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1240/1251] eta 0:00:02 lr 0.000892 wd 0.0500 time 0.2232 (0.2456) data time 0.0005 (0.0014) model time 0.2227 (0.2440) loss 2.5834 (3.4956) grad_norm 1.5761 (2.0335) loss_scale 8192.0000 (5908.0097) mem 7379MB [2024-08-26 09:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [79/300][1250/1251] eta 0:00:00 lr 0.000892 wd 0.0500 time 0.2237 (0.2454) data time 0.0005 (0.0014) model time 0.2232 (0.2439) loss 3.6615 (3.4952) grad_norm 1.5769 (2.0346) loss_scale 8192.0000 (5926.2670) mem 7379MB [2024-08-26 09:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 79 training takes 0:05:07 [2024-08-26 09:07:55 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:07:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.443 (0.443) Loss 0.5396 (0.5396) Acc@1 89.746 (89.746) Acc@5 97.852 (97.852) Mem 7379MB [2024-08-26 09:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.114) Loss 0.8184 (0.8305) Acc@1 83.203 (81.916) Acc@5 96.191 (96.191) Mem 7379MB [2024-08-26 09:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.097) Loss 1.2246 (0.8660) Acc@1 70.215 (80.762) Acc@5 91.113 (95.945) Mem 7379MB [2024-08-26 09:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.091) Loss 1.4980 (0.9782) Acc@1 64.844 (78.160) Acc@5 86.914 (94.421) Mem 7379MB [2024-08-26 09:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.3398 (1.0411) Acc@1 69.531 (76.608) Acc@5 90.918 (93.664) Mem 7379MB [2024-08-26 09:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.370 Acc@5 93.584 [2024-08-26 09:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.4% [2024-08-26 09:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.789 (0.789) Loss 0.4578 (0.4578) Acc@1 92.090 (92.090) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 09:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.149) Loss 0.7388 (0.7166) Acc@1 85.059 (84.535) Acc@5 96.289 (96.866) Mem 7379MB [2024-08-26 09:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.116) Loss 1.0234 (0.7401) Acc@1 75.879 (83.459) Acc@5 93.848 (96.824) Mem 7379MB [2024-08-26 09:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.092 (0.104) Loss 1.3066 (0.8454) Acc@1 66.699 (81.014) Acc@5 90.723 (95.568) Mem 7379MB [2024-08-26 09:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.1855 (0.9019) Acc@1 70.996 (79.561) Acc@5 91.699 (94.934) Mem 7379MB [2024-08-26 09:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.156 Acc@5 94.878 [2024-08-26 09:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.2% [2024-08-26 09:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.16% [2024-08-26 09:08:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:08:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 09:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][0/1251] eta 0:15:00 lr 0.000892 wd 0.0500 time 0.7195 (0.7195) data time 0.4808 (0.4808) model time 0.0000 (0.0000) loss 4.1014 (4.1014) grad_norm 1.5842 (1.5842) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][10/1251] eta 0:05:52 lr 0.000892 wd 0.0500 time 0.2421 (0.2839) data time 0.0009 (0.0446) model time 0.0000 (0.0000) loss 3.4511 (3.6318) grad_norm 2.3097 (2.0935) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][20/1251] eta 0:05:23 lr 0.000892 wd 0.0500 time 0.2380 (0.2624) data time 0.0008 (0.0239) model time 0.0000 (0.0000) loss 2.3221 (3.5164) grad_norm 1.4157 (2.0960) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][30/1251] eta 0:05:10 lr 0.000892 wd 0.0500 time 0.2449 (0.2543) data time 0.0010 (0.0165) 
model time 0.0000 (0.0000) loss 3.1712 (3.4946) grad_norm 2.9197 (2.0891) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][40/1251] eta 0:05:03 lr 0.000892 wd 0.0500 time 0.2448 (0.2509) data time 0.0011 (0.0127) model time 0.0000 (0.0000) loss 3.8496 (3.4814) grad_norm 1.7048 (2.0201) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][50/1251] eta 0:05:04 lr 0.000892 wd 0.0500 time 0.2345 (0.2532) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 3.7794 (3.5194) grad_norm 1.9498 (2.1190) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][60/1251] eta 0:04:59 lr 0.000892 wd 0.0500 time 0.2414 (0.2511) data time 0.0007 (0.0089) model time 0.2407 (0.2391) loss 2.6535 (3.4481) grad_norm 2.9381 (2.1286) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][70/1251] eta 0:04:54 lr 0.000892 wd 0.0500 time 0.2390 (0.2493) data time 0.0007 (0.0078) model time 0.2383 (0.2384) loss 3.7993 (3.4383) grad_norm 1.6104 (2.1340) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][80/1251] eta 0:04:50 lr 0.000892 wd 0.0500 time 0.2415 (0.2485) data time 0.0009 (0.0069) model time 0.2406 (0.2393) loss 3.5577 (3.4448) grad_norm 1.6305 (2.1448) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][90/1251] eta 0:04:47 lr 0.000892 wd 0.0500 time 0.2354 (0.2476) data time 0.0010 (0.0063) model time 0.2344 (0.2394) loss 3.2463 (3.4515) grad_norm 1.7331 (2.1318) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[80/300][100/1251] eta 0:04:44 lr 0.000892 wd 0.0500 time 0.2408 (0.2468) data time 0.0012 (0.0058) model time 0.2395 (0.2393) loss 3.8798 (3.4627) grad_norm 1.4854 (2.0968) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][110/1251] eta 0:04:41 lr 0.000892 wd 0.0500 time 0.2412 (0.2463) data time 0.0010 (0.0054) model time 0.2402 (0.2394) loss 3.1978 (3.4512) grad_norm 2.3039 (2.1011) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][120/1251] eta 0:04:39 lr 0.000892 wd 0.0500 time 0.2441 (0.2471) data time 0.0007 (0.0050) model time 0.2434 (0.2416) loss 3.9960 (3.4530) grad_norm 1.9794 (2.1170) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][130/1251] eta 0:04:36 lr 0.000892 wd 0.0500 time 0.2421 (0.2466) data time 0.0011 (0.0047) model time 0.2410 (0.2413) loss 3.0635 (3.4362) grad_norm 2.4498 (2.1010) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][140/1251] eta 0:04:34 lr 0.000892 wd 0.0500 time 0.2395 (0.2475) data time 0.0008 (0.0044) model time 0.2387 (0.2432) loss 4.0225 (3.4292) grad_norm 1.4554 (2.0908) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][150/1251] eta 0:04:31 lr 0.000892 wd 0.0500 time 0.2447 (0.2470) data time 0.0007 (0.0042) model time 0.2440 (0.2428) loss 4.2214 (3.4244) grad_norm 2.5437 (2.0864) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][160/1251] eta 0:04:28 lr 0.000892 wd 0.0500 time 0.2354 (0.2465) data time 0.0011 (0.0040) model time 0.2344 (0.2423) loss 3.8747 (3.4256) grad_norm 1.7133 (2.0910) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 09:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][170/1251] eta 0:04:26 lr 0.000892 wd 0.0500 time 0.2418 (0.2462) data time 0.0009 (0.0038) model time 0.2409 (0.2421) loss 3.5059 (3.4246) grad_norm 2.2637 (2.0939) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][180/1251] eta 0:04:23 lr 0.000892 wd 0.0500 time 0.2359 (0.2460) data time 0.0011 (0.0037) model time 0.2348 (0.2421) loss 3.0370 (3.4199) grad_norm 2.0074 (2.0775) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][190/1251] eta 0:04:20 lr 0.000891 wd 0.0500 time 0.2374 (0.2457) data time 0.0010 (0.0035) model time 0.2364 (0.2419) loss 3.9259 (3.4290) grad_norm 2.3705 (2.0676) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][200/1251] eta 0:04:17 lr 0.000891 wd 0.0500 time 0.2357 (0.2454) data time 0.0014 (0.0034) model time 0.2343 (0.2417) loss 3.9578 (3.4349) grad_norm 1.4579 (2.0737) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][210/1251] eta 0:04:15 lr 0.000891 wd 0.0500 time 0.2381 (0.2452) data time 0.0009 (0.0033) model time 0.2372 (0.2417) loss 2.3270 (3.4303) grad_norm 1.6095 (2.0818) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][220/1251] eta 0:04:13 lr 0.000891 wd 0.0500 time 0.2439 (0.2460) data time 0.0012 (0.0032) model time 0.2427 (0.2428) loss 3.5095 (3.4315) grad_norm 1.6353 (2.0874) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][230/1251] eta 0:04:10 lr 0.000891 wd 0.0500 time 0.2384 (0.2458) data time 0.0007 (0.0031) 
model time 0.2377 (0.2427) loss 4.1078 (3.4268) grad_norm 2.3521 (2.1149) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][240/1251] eta 0:04:08 lr 0.000891 wd 0.0500 time 0.2364 (0.2455) data time 0.0008 (0.0030) model time 0.2356 (0.2424) loss 2.5263 (3.4188) grad_norm 2.2640 (2.1222) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][250/1251] eta 0:04:05 lr 0.000891 wd 0.0500 time 0.2434 (0.2453) data time 0.0011 (0.0029) model time 0.2423 (0.2423) loss 3.3367 (3.4227) grad_norm 2.0120 (2.1187) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][260/1251] eta 0:04:02 lr 0.000891 wd 0.0500 time 0.2392 (0.2451) data time 0.0009 (0.0029) model time 0.2383 (0.2421) loss 2.9578 (3.4134) grad_norm 2.2286 (2.1136) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][270/1251] eta 0:04:00 lr 0.000891 wd 0.0500 time 0.2393 (0.2450) data time 0.0011 (0.0028) model time 0.2382 (0.2420) loss 2.9826 (3.4105) grad_norm 2.4340 (2.1126) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][280/1251] eta 0:03:57 lr 0.000891 wd 0.0500 time 0.2402 (0.2448) data time 0.0012 (0.0027) model time 0.2390 (0.2419) loss 3.0924 (3.4193) grad_norm 1.4069 (inf) loss_scale 4096.0000 (8075.3879) mem 7379MB [2024-08-26 09:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][290/1251] eta 0:03:55 lr 0.000891 wd 0.0500 time 0.2385 (0.2447) data time 0.0009 (0.0027) model time 0.2376 (0.2418) loss 2.5753 (3.4142) grad_norm 1.9765 (inf) loss_scale 4096.0000 (7938.6392) mem 7379MB [2024-08-26 09:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[80/300][300/1251] eta 0:03:52 lr 0.000891 wd 0.0500 time 0.2375 (0.2446) data time 0.0009 (0.0026) model time 0.2366 (0.2418) loss 2.3835 (3.4105) grad_norm 1.5250 (inf) loss_scale 4096.0000 (7810.9767) mem 7379MB [2024-08-26 09:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][310/1251] eta 0:03:50 lr 0.000891 wd 0.0500 time 0.2431 (0.2444) data time 0.0007 (0.0026) model time 0.2423 (0.2417) loss 3.6859 (3.4115) grad_norm 2.5479 (inf) loss_scale 4096.0000 (7691.5241) mem 7379MB [2024-08-26 09:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][320/1251] eta 0:03:47 lr 0.000891 wd 0.0500 time 0.2493 (0.2443) data time 0.0007 (0.0025) model time 0.2486 (0.2417) loss 4.0451 (3.4206) grad_norm 1.9745 (inf) loss_scale 4096.0000 (7579.5140) mem 7379MB [2024-08-26 09:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][330/1251] eta 0:03:45 lr 0.000891 wd 0.0500 time 0.4035 (0.2447) data time 0.0014 (0.0025) model time 0.4021 (0.2421) loss 3.6289 (3.4267) grad_norm 1.8276 (inf) loss_scale 4096.0000 (7474.2719) mem 7379MB [2024-08-26 09:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][340/1251] eta 0:03:42 lr 0.000891 wd 0.0500 time 0.2447 (0.2446) data time 0.0007 (0.0024) model time 0.2440 (0.2421) loss 3.3152 (3.4355) grad_norm 2.3184 (inf) loss_scale 4096.0000 (7375.2023) mem 7379MB [2024-08-26 09:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][350/1251] eta 0:03:40 lr 0.000891 wd 0.0500 time 0.2408 (0.2445) data time 0.0011 (0.0024) model time 0.2397 (0.2420) loss 3.7850 (3.4342) grad_norm 3.3581 (inf) loss_scale 4096.0000 (7281.7778) mem 7379MB [2024-08-26 09:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][360/1251] eta 0:03:37 lr 0.000891 wd 0.0500 time 0.2426 (0.2444) data time 0.0011 (0.0024) model time 0.2415 (0.2419) loss 2.8880 (3.4383) grad_norm 1.7272 (inf) loss_scale 4096.0000 (7193.5291) mem 7379MB 
[2024-08-26 09:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][370/1251] eta 0:03:35 lr 0.000891 wd 0.0500 time 0.2394 (0.2443) data time 0.0010 (0.0023) model time 0.2385 (0.2418) loss 2.8236 (3.4474) grad_norm 1.8078 (inf) loss_scale 4096.0000 (7110.0377) mem 7379MB [2024-08-26 09:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][380/1251] eta 0:03:32 lr 0.000891 wd 0.0500 time 0.2435 (0.2442) data time 0.0010 (0.0023) model time 0.2425 (0.2418) loss 3.5921 (3.4497) grad_norm 1.7378 (inf) loss_scale 4096.0000 (7030.9291) mem 7379MB [2024-08-26 09:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][390/1251] eta 0:03:30 lr 0.000891 wd 0.0500 time 0.2454 (0.2442) data time 0.0011 (0.0023) model time 0.2443 (0.2418) loss 3.5328 (3.4529) grad_norm 1.9103 (inf) loss_scale 4096.0000 (6955.8670) mem 7379MB [2024-08-26 09:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][400/1251] eta 0:03:27 lr 0.000891 wd 0.0500 time 0.2387 (0.2441) data time 0.0007 (0.0022) model time 0.2380 (0.2418) loss 4.2674 (3.4555) grad_norm 1.7952 (inf) loss_scale 4096.0000 (6884.5486) mem 7379MB [2024-08-26 09:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][410/1251] eta 0:03:25 lr 0.000891 wd 0.0500 time 0.2407 (0.2445) data time 0.0010 (0.0022) model time 0.2397 (0.2422) loss 2.9026 (3.4430) grad_norm 1.5780 (inf) loss_scale 4096.0000 (6816.7007) mem 7379MB [2024-08-26 09:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][420/1251] eta 0:03:23 lr 0.000891 wd 0.0500 time 0.2401 (0.2444) data time 0.0007 (0.0022) model time 0.2394 (0.2422) loss 2.5270 (3.4435) grad_norm 1.7040 (inf) loss_scale 4096.0000 (6752.0760) mem 7379MB [2024-08-26 09:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][430/1251] eta 0:03:20 lr 0.000891 wd 0.0500 time 0.2483 (0.2444) data time 0.0009 (0.0021) model time 0.2473 (0.2422) loss 3.8684 
(3.4437) grad_norm 1.9220 (inf) loss_scale 4096.0000 (6690.4501) mem 7379MB [2024-08-26 09:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][440/1251] eta 0:03:18 lr 0.000891 wd 0.0500 time 0.2512 (0.2444) data time 0.0010 (0.0021) model time 0.2502 (0.2422) loss 3.6284 (3.4409) grad_norm 1.8521 (inf) loss_scale 4096.0000 (6631.6190) mem 7379MB [2024-08-26 09:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][450/1251] eta 0:03:16 lr 0.000891 wd 0.0500 time 0.2421 (0.2447) data time 0.0009 (0.0021) model time 0.2412 (0.2426) loss 3.9716 (3.4443) grad_norm 2.7456 (inf) loss_scale 4096.0000 (6575.3969) mem 7379MB [2024-08-26 09:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][460/1251] eta 0:03:13 lr 0.000891 wd 0.0500 time 0.2501 (0.2451) data time 0.0007 (0.0021) model time 0.2494 (0.2431) loss 4.2774 (3.4376) grad_norm 3.0088 (inf) loss_scale 4096.0000 (6521.6139) mem 7379MB [2024-08-26 09:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][470/1251] eta 0:03:11 lr 0.000891 wd 0.0500 time 0.2411 (0.2451) data time 0.0010 (0.0020) model time 0.2401 (0.2431) loss 3.7021 (3.4430) grad_norm 3.2646 (inf) loss_scale 4096.0000 (6470.1146) mem 7379MB [2024-08-26 09:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][480/1251] eta 0:03:08 lr 0.000891 wd 0.0500 time 0.2348 (0.2451) data time 0.0012 (0.0020) model time 0.2336 (0.2431) loss 3.8697 (3.4433) grad_norm 1.6174 (inf) loss_scale 4096.0000 (6420.7568) mem 7379MB [2024-08-26 09:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][490/1251] eta 0:03:06 lr 0.000891 wd 0.0500 time 0.2323 (0.2450) data time 0.0012 (0.0020) model time 0.2311 (0.2430) loss 3.5272 (3.4408) grad_norm 2.1300 (inf) loss_scale 4096.0000 (6373.4094) mem 7379MB [2024-08-26 09:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][500/1251] eta 0:03:03 lr 0.000891 wd 0.0500 time 0.2419 
(0.2449) data time 0.0008 (0.0020) model time 0.2411 (0.2429) loss 2.6141 (3.4377) grad_norm 2.2233 (inf) loss_scale 4096.0000 (6327.9521) mem 7379MB [2024-08-26 09:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][510/1251] eta 0:03:01 lr 0.000891 wd 0.0500 time 0.2410 (0.2448) data time 0.0010 (0.0020) model time 0.2400 (0.2429) loss 3.4347 (3.4366) grad_norm 2.2866 (inf) loss_scale 4096.0000 (6284.2740) mem 7379MB [2024-08-26 09:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][520/1251] eta 0:02:58 lr 0.000891 wd 0.0500 time 0.2440 (0.2448) data time 0.0010 (0.0019) model time 0.2430 (0.2428) loss 3.4303 (3.4364) grad_norm 1.9190 (inf) loss_scale 4096.0000 (6242.2726) mem 7379MB [2024-08-26 09:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][530/1251] eta 0:02:56 lr 0.000891 wd 0.0500 time 0.4345 (0.2451) data time 0.0010 (0.0019) model time 0.4335 (0.2432) loss 3.7987 (3.4403) grad_norm 3.4427 (inf) loss_scale 4096.0000 (6201.8531) mem 7379MB [2024-08-26 09:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][540/1251] eta 0:02:54 lr 0.000891 wd 0.0500 time 0.2455 (0.2454) data time 0.0010 (0.0019) model time 0.2445 (0.2436) loss 3.1779 (3.4424) grad_norm 2.1783 (inf) loss_scale 4096.0000 (6162.9279) mem 7379MB [2024-08-26 09:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][550/1251] eta 0:02:52 lr 0.000890 wd 0.0500 time 0.2483 (0.2458) data time 0.0008 (0.0019) model time 0.2475 (0.2440) loss 3.8752 (3.4393) grad_norm 1.7994 (inf) loss_scale 4096.0000 (6125.4156) mem 7379MB [2024-08-26 09:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][560/1251] eta 0:02:49 lr 0.000890 wd 0.0500 time 0.2532 (0.2458) data time 0.0009 (0.0019) model time 0.2523 (0.2440) loss 3.5624 (3.4397) grad_norm 2.3667 (inf) loss_scale 4096.0000 (6089.2406) mem 7379MB [2024-08-26 09:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [80/300][570/1251] eta 0:02:47 lr 0.000890 wd 0.0500 time 0.2362 (0.2456) data time 0.0010 (0.0019) model time 0.2352 (0.2439) loss 3.0005 (3.4364) grad_norm 3.9919 (inf) loss_scale 4096.0000 (6054.3327) mem 7379MB [2024-08-26 09:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][580/1251] eta 0:02:44 lr 0.000890 wd 0.0500 time 0.2449 (0.2455) data time 0.0009 (0.0018) model time 0.2440 (0.2438) loss 4.1791 (3.4369) grad_norm 1.9724 (inf) loss_scale 4096.0000 (6020.6265) mem 7379MB [2024-08-26 09:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][590/1251] eta 0:02:42 lr 0.000890 wd 0.0500 time 0.2379 (0.2455) data time 0.0009 (0.0018) model time 0.2371 (0.2437) loss 3.9126 (3.4367) grad_norm 1.8009 (inf) loss_scale 4096.0000 (5988.0609) mem 7379MB [2024-08-26 09:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][600/1251] eta 0:02:39 lr 0.000890 wd 0.0500 time 0.2344 (0.2454) data time 0.0010 (0.0018) model time 0.2335 (0.2436) loss 4.4160 (3.4388) grad_norm 2.1182 (inf) loss_scale 4096.0000 (5956.5790) mem 7379MB [2024-08-26 09:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][610/1251] eta 0:02:37 lr 0.000890 wd 0.0500 time 0.2400 (0.2454) data time 0.0009 (0.0018) model time 0.2391 (0.2436) loss 2.8829 (3.4367) grad_norm 1.7126 (inf) loss_scale 4096.0000 (5926.1277) mem 7379MB [2024-08-26 09:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][620/1251] eta 0:02:34 lr 0.000890 wd 0.0500 time 0.2445 (0.2453) data time 0.0011 (0.0018) model time 0.2434 (0.2436) loss 3.6835 (3.4349) grad_norm 1.7968 (inf) loss_scale 4096.0000 (5896.6570) mem 7379MB [2024-08-26 09:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][630/1251] eta 0:02:32 lr 0.000890 wd 0.0500 time 0.2486 (0.2453) data time 0.0010 (0.0018) model time 0.2476 (0.2436) loss 3.0116 (3.4340) grad_norm 2.8459 (inf) loss_scale 4096.0000 (5868.1204) mem 7379MB 
[2024-08-26 09:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][640/1251] eta 0:02:30 lr 0.000890 wd 0.0500 time 0.2382 (0.2456) data time 0.0008 (0.0018) model time 0.2374 (0.2439) loss 4.1543 (3.4308) grad_norm 1.7643 (inf) loss_scale 4096.0000 (5840.4743) mem 7379MB [2024-08-26 09:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][650/1251] eta 0:02:27 lr 0.000890 wd 0.0500 time 0.2410 (0.2455) data time 0.0007 (0.0018) model time 0.2403 (0.2439) loss 2.9417 (3.4317) grad_norm 1.6452 (inf) loss_scale 4096.0000 (5813.6774) mem 7379MB [2024-08-26 09:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][660/1251] eta 0:02:25 lr 0.000890 wd 0.0500 time 0.2340 (0.2457) data time 0.0008 (0.0017) model time 0.2332 (0.2440) loss 2.2305 (3.4336) grad_norm 1.6795 (inf) loss_scale 4096.0000 (5787.6914) mem 7379MB [2024-08-26 09:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][670/1251] eta 0:02:22 lr 0.000890 wd 0.0500 time 0.2416 (0.2456) data time 0.0009 (0.0017) model time 0.2407 (0.2440) loss 3.2084 (3.4282) grad_norm 1.7416 (inf) loss_scale 4096.0000 (5762.4799) mem 7379MB [2024-08-26 09:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][680/1251] eta 0:02:20 lr 0.000890 wd 0.0500 time 0.2402 (0.2456) data time 0.0011 (0.0017) model time 0.2392 (0.2439) loss 3.5545 (3.4303) grad_norm 1.7750 (inf) loss_scale 4096.0000 (5738.0088) mem 7379MB [2024-08-26 09:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][690/1251] eta 0:02:17 lr 0.000890 wd 0.0500 time 0.2386 (0.2455) data time 0.0008 (0.0017) model time 0.2377 (0.2439) loss 3.6703 (3.4325) grad_norm 1.6478 (inf) loss_scale 4096.0000 (5714.2460) mem 7379MB [2024-08-26 09:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][700/1251] eta 0:02:15 lr 0.000890 wd 0.0500 time 0.2426 (0.2454) data time 0.0007 (0.0017) model time 0.2420 (0.2438) loss 4.0245 
(3.4327) grad_norm 1.4684 (inf) loss_scale 4096.0000 (5691.1612) mem 7379MB [2024-08-26 09:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][710/1251] eta 0:02:12 lr 0.000890 wd 0.0500 time 0.2386 (0.2454) data time 0.0010 (0.0017) model time 0.2376 (0.2438) loss 3.0570 (3.4311) grad_norm 1.6590 (inf) loss_scale 4096.0000 (5668.7257) mem 7379MB [2024-08-26 09:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][720/1251] eta 0:02:10 lr 0.000890 wd 0.0500 time 0.2659 (0.2454) data time 0.0011 (0.0017) model time 0.2648 (0.2438) loss 3.3828 (3.4296) grad_norm 1.6703 (inf) loss_scale 4096.0000 (5646.9126) mem 7379MB [2024-08-26 09:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][730/1251] eta 0:02:07 lr 0.000890 wd 0.0500 time 0.2374 (0.2454) data time 0.0010 (0.0017) model time 0.2364 (0.2438) loss 3.8181 (3.4291) grad_norm 2.5478 (inf) loss_scale 4096.0000 (5625.6963) mem 7379MB [2024-08-26 09:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][740/1251] eta 0:02:05 lr 0.000890 wd 0.0500 time 0.2489 (0.2453) data time 0.0007 (0.0017) model time 0.2482 (0.2437) loss 3.4190 (3.4305) grad_norm 1.9749 (inf) loss_scale 4096.0000 (5605.0526) mem 7379MB [2024-08-26 09:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][750/1251] eta 0:02:02 lr 0.000890 wd 0.0500 time 0.2363 (0.2453) data time 0.0009 (0.0017) model time 0.2354 (0.2437) loss 4.1169 (3.4312) grad_norm 3.1829 (inf) loss_scale 4096.0000 (5584.9587) mem 7379MB [2024-08-26 09:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][760/1251] eta 0:02:00 lr 0.000890 wd 0.0500 time 0.2446 (0.2452) data time 0.0008 (0.0016) model time 0.2438 (0.2436) loss 4.3046 (3.4329) grad_norm 2.2453 (inf) loss_scale 4096.0000 (5565.3929) mem 7379MB [2024-08-26 09:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][770/1251] eta 0:01:57 lr 0.000890 wd 0.0500 time 0.2424 
(0.2452) data time 0.0009 (0.0016) model time 0.2415 (0.2436) loss 3.0033 (3.4307) grad_norm 2.1080 (inf) loss_scale 4096.0000 (5546.3346) mem 7379MB [2024-08-26 09:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][780/1251] eta 0:01:55 lr 0.000890 wd 0.0500 time 0.2423 (0.2451) data time 0.0010 (0.0016) model time 0.2413 (0.2435) loss 4.0997 (3.4313) grad_norm 2.0888 (inf) loss_scale 4096.0000 (5527.7644) mem 7379MB [2024-08-26 09:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][790/1251] eta 0:01:52 lr 0.000890 wd 0.0500 time 0.2383 (0.2451) data time 0.0009 (0.0016) model time 0.2374 (0.2435) loss 3.2151 (3.4357) grad_norm 1.6129 (inf) loss_scale 4096.0000 (5509.6637) mem 7379MB [2024-08-26 09:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][800/1251] eta 0:01:50 lr 0.000890 wd 0.0500 time 0.2449 (0.2450) data time 0.0009 (0.0016) model time 0.2440 (0.2435) loss 3.5792 (3.4369) grad_norm 2.0693 (inf) loss_scale 4096.0000 (5492.0150) mem 7379MB [2024-08-26 09:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][810/1251] eta 0:01:48 lr 0.000890 wd 0.0500 time 0.2459 (0.2450) data time 0.0009 (0.0016) model time 0.2450 (0.2434) loss 3.8978 (3.4348) grad_norm 1.4701 (inf) loss_scale 4096.0000 (5474.8015) mem 7379MB [2024-08-26 09:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][820/1251] eta 0:01:45 lr 0.000890 wd 0.0500 time 0.2340 (0.2449) data time 0.0013 (0.0016) model time 0.2327 (0.2434) loss 2.7901 (3.4339) grad_norm 2.3140 (inf) loss_scale 4096.0000 (5458.0073) mem 7379MB [2024-08-26 09:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][830/1251] eta 0:01:43 lr 0.000890 wd 0.0500 time 0.2484 (0.2449) data time 0.0007 (0.0016) model time 0.2477 (0.2434) loss 3.9918 (3.4366) grad_norm 1.6495 (inf) loss_scale 4096.0000 (5441.6173) mem 7379MB [2024-08-26 09:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [80/300][840/1251] eta 0:01:40 lr 0.000890 wd 0.0500 time 0.2437 (0.2449) data time 0.0010 (0.0016) model time 0.2427 (0.2433) loss 3.1367 (3.4370) grad_norm 2.1462 (inf) loss_scale 4096.0000 (5425.6171) mem 7379MB [2024-08-26 09:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][850/1251] eta 0:01:38 lr 0.000890 wd 0.0500 time 0.2408 (0.2448) data time 0.0011 (0.0016) model time 0.2397 (0.2433) loss 3.9433 (3.4379) grad_norm 2.1077 (inf) loss_scale 4096.0000 (5409.9929) mem 7379MB [2024-08-26 09:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][860/1251] eta 0:01:35 lr 0.000890 wd 0.0500 time 0.2411 (0.2451) data time 0.0011 (0.0016) model time 0.2400 (0.2436) loss 3.6980 (3.4385) grad_norm 3.2750 (inf) loss_scale 4096.0000 (5394.7317) mem 7379MB [2024-08-26 09:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][870/1251] eta 0:01:33 lr 0.000890 wd 0.0500 time 0.2408 (0.2451) data time 0.0009 (0.0016) model time 0.2398 (0.2435) loss 3.8505 (3.4382) grad_norm 1.6950 (inf) loss_scale 4096.0000 (5379.8209) mem 7379MB [2024-08-26 09:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][880/1251] eta 0:01:30 lr 0.000890 wd 0.0500 time 0.2362 (0.2450) data time 0.0011 (0.0016) model time 0.2352 (0.2435) loss 3.9631 (3.4401) grad_norm 2.0071 (inf) loss_scale 4096.0000 (5365.2486) mem 7379MB [2024-08-26 09:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][890/1251] eta 0:01:28 lr 0.000890 wd 0.0500 time 0.2401 (0.2450) data time 0.0010 (0.0016) model time 0.2391 (0.2435) loss 2.3962 (3.4393) grad_norm 1.9099 (inf) loss_scale 4096.0000 (5351.0034) mem 7379MB [2024-08-26 09:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][900/1251] eta 0:01:25 lr 0.000890 wd 0.0500 time 0.2450 (0.2450) data time 0.0010 (0.0015) model time 0.2440 (0.2434) loss 3.5079 (3.4378) grad_norm 2.4019 (inf) loss_scale 4096.0000 (5337.0744) mem 7379MB 
[2024-08-26 09:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][910/1251] eta 0:01:23 lr 0.000889 wd 0.0500 time 0.2359 (0.2449) data time 0.0008 (0.0015) model time 0.2351 (0.2434) loss 4.3862 (3.4384) grad_norm 1.9399 (inf) loss_scale 4096.0000 (5323.4512) mem 7379MB [2024-08-26 09:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][920/1251] eta 0:01:21 lr 0.000889 wd 0.0500 time 0.2370 (0.2449) data time 0.0010 (0.0015) model time 0.2360 (0.2434) loss 3.7063 (3.4367) grad_norm 1.5663 (inf) loss_scale 4096.0000 (5310.1238) mem 7379MB [2024-08-26 09:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][930/1251] eta 0:01:18 lr 0.000889 wd 0.0500 time 0.2428 (0.2449) data time 0.0009 (0.0015) model time 0.2418 (0.2434) loss 3.2816 (3.4391) grad_norm 2.0446 (inf) loss_scale 4096.0000 (5297.0827) mem 7379MB [2024-08-26 09:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][940/1251] eta 0:01:16 lr 0.000889 wd 0.0500 time 0.2332 (0.2450) data time 0.0010 (0.0015) model time 0.2322 (0.2436) loss 3.7942 (3.4386) grad_norm 2.1904 (inf) loss_scale 4096.0000 (5284.3188) mem 7379MB [2024-08-26 09:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][950/1251] eta 0:01:13 lr 0.000889 wd 0.0500 time 0.2348 (0.2450) data time 0.0012 (0.0015) model time 0.2337 (0.2435) loss 3.9911 (3.4390) grad_norm 2.0760 (inf) loss_scale 4096.0000 (5271.8233) mem 7379MB [2024-08-26 09:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][960/1251] eta 0:01:11 lr 0.000889 wd 0.0500 time 0.2350 (0.2450) data time 0.0012 (0.0015) model time 0.2338 (0.2435) loss 3.8309 (3.4381) grad_norm 3.6643 (inf) loss_scale 4096.0000 (5259.5879) mem 7379MB [2024-08-26 09:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][970/1251] eta 0:01:08 lr 0.000889 wd 0.0500 time 0.2444 (0.2451) data time 0.0007 (0.0015) model time 0.2437 (0.2436) loss 3.4716 
(3.4385) grad_norm 2.8369 (inf) loss_scale 4096.0000 (5247.6045) mem 7379MB [2024-08-26 09:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][980/1251] eta 0:01:06 lr 0.000889 wd 0.0500 time 0.2363 (0.2454) data time 0.0008 (0.0015) model time 0.2355 (0.2440) loss 4.0042 (3.4389) grad_norm 1.7092 (inf) loss_scale 4096.0000 (5235.8654) mem 7379MB [2024-08-26 09:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][990/1251] eta 0:01:04 lr 0.000889 wd 0.0500 time 0.2447 (0.2454) data time 0.0010 (0.0015) model time 0.2437 (0.2439) loss 2.9344 (3.4376) grad_norm 1.4777 (inf) loss_scale 4096.0000 (5224.3633) mem 7379MB [2024-08-26 09:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1000/1251] eta 0:01:01 lr 0.000889 wd 0.0500 time 0.2300 (0.2453) data time 0.0009 (0.0015) model time 0.2292 (0.2439) loss 4.0595 (3.4385) grad_norm inf (inf) loss_scale 2048.0000 (5211.0450) mem 7379MB [2024-08-26 09:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1010/1251] eta 0:00:59 lr 0.000889 wd 0.0500 time 0.2410 (0.2453) data time 0.0007 (0.0015) model time 0.2403 (0.2438) loss 4.2569 (3.4385) grad_norm 2.0495 (inf) loss_scale 2048.0000 (5179.7587) mem 7379MB [2024-08-26 09:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1020/1251] eta 0:00:56 lr 0.000889 wd 0.0500 time 0.2441 (0.2452) data time 0.0009 (0.0015) model time 0.2432 (0.2438) loss 3.7189 (3.4391) grad_norm 2.1578 (inf) loss_scale 2048.0000 (5149.0852) mem 7379MB [2024-08-26 09:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1030/1251] eta 0:00:54 lr 0.000889 wd 0.0500 time 0.2447 (0.2452) data time 0.0009 (0.0015) model time 0.2438 (0.2437) loss 3.1217 (3.4388) grad_norm 1.8661 (inf) loss_scale 2048.0000 (5119.0068) mem 7379MB [2024-08-26 09:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1040/1251] eta 0:00:51 lr 0.000889 wd 0.0500 time 0.2390 
(0.2451) data time 0.0009 (0.0015) model time 0.2382 (0.2437) loss 3.7564 (3.4389) grad_norm 1.5691 (inf) loss_scale 2048.0000 (5089.5062) mem 7379MB [2024-08-26 09:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1050/1251] eta 0:00:49 lr 0.000889 wd 0.0500 time 0.2420 (0.2451) data time 0.0011 (0.0015) model time 0.2409 (0.2437) loss 2.7399 (3.4394) grad_norm 1.4974 (inf) loss_scale 2048.0000 (5060.5671) mem 7379MB [2024-08-26 09:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1060/1251] eta 0:00:46 lr 0.000889 wd 0.0500 time 0.2372 (0.2451) data time 0.0010 (0.0015) model time 0.2362 (0.2436) loss 3.6136 (3.4416) grad_norm 1.8087 (inf) loss_scale 2048.0000 (5032.1734) mem 7379MB [2024-08-26 09:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1070/1251] eta 0:00:44 lr 0.000889 wd 0.0500 time 0.2388 (0.2452) data time 0.0009 (0.0015) model time 0.2379 (0.2438) loss 3.2964 (3.4433) grad_norm 1.5792 (inf) loss_scale 2048.0000 (5004.3100) mem 7379MB [2024-08-26 09:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1080/1251] eta 0:00:41 lr 0.000889 wd 0.0500 time 0.2424 (0.2454) data time 0.0009 (0.0015) model time 0.2415 (0.2440) loss 3.9155 (3.4457) grad_norm 2.1934 (inf) loss_scale 2048.0000 (4976.9621) mem 7379MB [2024-08-26 09:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1090/1251] eta 0:00:39 lr 0.000889 wd 0.0500 time 0.2358 (0.2455) data time 0.0010 (0.0015) model time 0.2348 (0.2441) loss 4.4494 (3.4487) grad_norm 1.9118 (inf) loss_scale 2048.0000 (4950.1155) mem 7379MB [2024-08-26 09:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1100/1251] eta 0:00:37 lr 0.000889 wd 0.0500 time 0.2363 (0.2455) data time 0.0011 (0.0014) model time 0.2352 (0.2441) loss 3.3653 (3.4457) grad_norm 1.8684 (inf) loss_scale 2048.0000 (4923.7566) mem 7379MB [2024-08-26 09:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [80/300][1110/1251] eta 0:00:34 lr 0.000889 wd 0.0500 time 0.2419 (0.2454) data time 0.0010 (0.0014) model time 0.2409 (0.2440) loss 3.7308 (3.4483) grad_norm 1.7960 (inf) loss_scale 2048.0000 (4897.8722) mem 7379MB [2024-08-26 09:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1120/1251] eta 0:00:32 lr 0.000889 wd 0.0500 time 0.2367 (0.2454) data time 0.0010 (0.0014) model time 0.2357 (0.2440) loss 4.1720 (3.4493) grad_norm 2.0192 (inf) loss_scale 2048.0000 (4872.4496) mem 7379MB [2024-08-26 09:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1130/1251] eta 0:00:29 lr 0.000889 wd 0.0500 time 0.2456 (0.2453) data time 0.0009 (0.0014) model time 0.2447 (0.2440) loss 3.6047 (3.4500) grad_norm 2.1520 (inf) loss_scale 2048.0000 (4847.4766) mem 7379MB [2024-08-26 09:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1140/1251] eta 0:00:27 lr 0.000889 wd 0.0500 time 0.2425 (0.2453) data time 0.0012 (0.0014) model time 0.2413 (0.2439) loss 3.6183 (3.4513) grad_norm 2.2553 (inf) loss_scale 2048.0000 (4822.9413) mem 7379MB [2024-08-26 09:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1150/1251] eta 0:00:24 lr 0.000889 wd 0.0500 time 0.2451 (0.2454) data time 0.0009 (0.0014) model time 0.2442 (0.2441) loss 3.6146 (3.4514) grad_norm 2.0436 (inf) loss_scale 2048.0000 (4798.8323) mem 7379MB [2024-08-26 09:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1160/1251] eta 0:00:22 lr 0.000889 wd 0.0500 time 0.2406 (0.2454) data time 0.0007 (0.0014) model time 0.2399 (0.2440) loss 3.6186 (3.4512) grad_norm 1.9342 (inf) loss_scale 2048.0000 (4775.1387) mem 7379MB [2024-08-26 09:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1170/1251] eta 0:00:19 lr 0.000889 wd 0.0500 time 0.2431 (0.2454) data time 0.0009 (0.0014) model time 0.2422 (0.2440) loss 3.9374 (3.4529) grad_norm 2.4324 (inf) loss_scale 2048.0000 
(4751.8497) mem 7379MB [2024-08-26 09:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1180/1251] eta 0:00:17 lr 0.000889 wd 0.0500 time 0.2435 (0.2456) data time 0.0008 (0.0014) model time 0.2428 (0.2442) loss 4.1852 (3.4536) grad_norm 1.4763 (inf) loss_scale 2048.0000 (4728.9551) mem 7379MB [2024-08-26 09:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1190/1251] eta 0:00:14 lr 0.000889 wd 0.0500 time 0.2438 (0.2456) data time 0.0009 (0.0014) model time 0.2428 (0.2442) loss 3.4843 (3.4533) grad_norm 2.1994 (inf) loss_scale 2048.0000 (4706.4450) mem 7379MB [2024-08-26 09:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1200/1251] eta 0:00:12 lr 0.000889 wd 0.0500 time 0.2405 (0.2456) data time 0.0009 (0.0014) model time 0.2395 (0.2442) loss 3.7091 (3.4535) grad_norm 2.1831 (inf) loss_scale 2048.0000 (4684.3097) mem 7379MB [2024-08-26 09:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1210/1251] eta 0:00:10 lr 0.000889 wd 0.0500 time 0.2421 (0.2456) data time 0.0012 (0.0014) model time 0.2409 (0.2442) loss 3.2121 (3.4516) grad_norm 2.1267 (inf) loss_scale 2048.0000 (4662.5400) mem 7379MB [2024-08-26 09:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1220/1251] eta 0:00:07 lr 0.000889 wd 0.0500 time 0.2377 (0.2456) data time 0.0012 (0.0014) model time 0.2365 (0.2442) loss 2.2251 (3.4508) grad_norm 2.9536 (inf) loss_scale 2048.0000 (4641.1269) mem 7379MB [2024-08-26 09:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1230/1251] eta 0:00:05 lr 0.000889 wd 0.0500 time 0.2355 (0.2456) data time 0.0010 (0.0014) model time 0.2345 (0.2442) loss 3.8287 (3.4488) grad_norm 1.7715 (inf) loss_scale 2048.0000 (4620.0617) mem 7379MB [2024-08-26 09:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1240/1251] eta 0:00:02 lr 0.000889 wd 0.0500 time 0.2245 (0.2455) data time 0.0007 (0.0014) model time 
0.2238 (0.2441) loss 3.0778 (3.4478) grad_norm 3.6672 (inf) loss_scale 2048.0000 (4599.3360) mem 7379MB [2024-08-26 09:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [80/300][1250/1251] eta 0:00:00 lr 0.000889 wd 0.0500 time 0.2234 (0.2454) data time 0.0007 (0.0014) model time 0.2227 (0.2441) loss 3.7128 (3.4489) grad_norm 1.2939 (inf) loss_scale 2048.0000 (4578.9416) mem 7379MB [2024-08-26 09:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 80 training takes 0:05:07 [2024-08-26 09:13:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:13:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.441 (0.441) Loss 0.5762 (0.5762) Acc@1 88.867 (88.867) Acc@5 97.656 (97.656) Mem 7379MB [2024-08-26 09:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.111) Loss 0.7783 (0.8161) Acc@1 83.789 (82.200) Acc@5 96.777 (96.333) Mem 7379MB [2024-08-26 09:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.096) Loss 1.1689 (0.8433) Acc@1 71.680 (81.129) Acc@5 93.262 (96.164) Mem 7379MB [2024-08-26 09:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.091) Loss 1.4609 (0.9679) Acc@1 66.211 (78.358) Acc@5 88.379 (94.591) Mem 7379MB [2024-08-26 09:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.4121 (1.0363) Acc@1 67.480 (76.805) Acc@5 89.844 (93.836) Mem 7379MB [2024-08-26 09:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.494 Acc@5 93.732 [2024-08-26 09:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.5% [2024-08-26 09:13:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.816 (0.816) Loss 0.4592 (0.4592) Acc@1 91.992 (91.992) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 09:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.147) Loss 0.7368 (0.7152) Acc@1 85.156 (84.517) Acc@5 96.387 (96.937) Mem 7379MB [2024-08-26 09:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.114) Loss 1.0205 (0.7387) Acc@1 75.781 (83.445) Acc@5 94.141 (96.875) Mem 7379MB [2024-08-26 09:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.103) Loss 1.3066 (0.8439) Acc@1 67.090 (81.029) Acc@5 90.625 (95.590) Mem 7379MB [2024-08-26 09:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.1797 (0.9000) Acc@1 71.289 (79.568) Acc@5 91.504 (94.958) Mem 7379MB [2024-08-26 09:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.164 Acc@5 94.912 [2024-08-26 09:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.2% [2024-08-26 09:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.16% [2024-08-26 09:13:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:13:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 09:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][0/1251] eta 0:14:08 lr 0.000889 wd 0.0500 time 0.6783 (0.6783) data time 0.4545 (0.4545) model time 0.0000 (0.0000) loss 4.4696 (4.4696) grad_norm 1.4732 (1.4732) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][10/1251] eta 0:05:53 lr 0.000888 wd 0.0500 time 0.2482 (0.2846) data time 0.0009 (0.0428) model time 0.0000 (0.0000) loss 3.4003 (3.5732) grad_norm 1.8726 (2.0290) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][20/1251] eta 0:05:25 lr 0.000888 wd 0.0500 time 0.2383 (0.2643) data time 0.0009 (0.0230) model time 0.0000 (0.0000) loss 3.5623 (3.5456) grad_norm 2.9393 (2.0681) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][30/1251] eta 0:05:14 lr 0.000888 wd 0.0500 time 0.2348 (0.2574) data time 0.0013 (0.0161) model time 0.0000 (0.0000) loss 2.5369 (3.5489) grad_norm 1.7050 (2.0062) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][40/1251] eta 0:05:07 lr 0.000888 wd 0.0500 time 0.2378 (0.2540) data time 0.0012 (0.0126) model time 0.0000 (0.0000) loss 2.9893 (3.5226) grad_norm 1.3086 (1.8986) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][50/1251] eta 0:05:02 lr 0.000888 wd 0.0500 time 0.2499 (0.2519) data time 0.0007 (0.0104) model time 0.0000 (0.0000) loss 4.2968 (3.5295) grad_norm 1.6841 (1.9365) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][60/1251] eta 0:04:58 lr 0.000888 wd 0.0500 time 0.2404 (0.2503) data time 0.0015 (0.0089) model time 0.2389 (0.2409) loss 
3.8832 (3.5257) grad_norm 2.3539 (1.9583) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][70/1251] eta 0:04:54 lr 0.000888 wd 0.0500 time 0.2476 (0.2490) data time 0.0009 (0.0079) model time 0.2467 (0.2403) loss 4.3282 (3.5323) grad_norm 1.5260 (1.9383) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][80/1251] eta 0:04:50 lr 0.000888 wd 0.0500 time 0.2453 (0.2482) data time 0.0011 (0.0071) model time 0.2442 (0.2406) loss 3.6503 (3.5480) grad_norm 1.6384 (1.9176) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][90/1251] eta 0:04:47 lr 0.000888 wd 0.0500 time 0.2382 (0.2475) data time 0.0011 (0.0065) model time 0.2371 (0.2405) loss 3.3238 (3.5541) grad_norm 2.0643 (1.9452) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][100/1251] eta 0:04:44 lr 0.000888 wd 0.0500 time 0.2348 (0.2470) data time 0.0009 (0.0060) model time 0.2339 (0.2406) loss 3.2008 (3.5553) grad_norm 2.3061 (1.9226) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][110/1251] eta 0:04:43 lr 0.000888 wd 0.0500 time 0.2517 (0.2487) data time 0.0011 (0.0055) model time 0.2507 (0.2446) loss 3.3893 (3.5556) grad_norm 1.9977 (1.9266) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][120/1251] eta 0:04:40 lr 0.000888 wd 0.0500 time 0.2385 (0.2483) data time 0.0011 (0.0052) model time 0.2375 (0.2442) loss 3.1393 (3.5394) grad_norm 1.8455 (1.9392) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][130/1251] eta 0:04:37 lr 
0.000888 wd 0.0500 time 0.2472 (0.2479) data time 0.0010 (0.0049) model time 0.2462 (0.2441) loss 3.6362 (3.5062) grad_norm 1.7009 (1.9533) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][140/1251] eta 0:04:35 lr 0.000888 wd 0.0500 time 0.2493 (0.2476) data time 0.0007 (0.0046) model time 0.2487 (0.2439) loss 4.1534 (3.5184) grad_norm 2.0387 (1.9668) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][150/1251] eta 0:04:32 lr 0.000888 wd 0.0500 time 0.2398 (0.2475) data time 0.0009 (0.0044) model time 0.2388 (0.2440) loss 4.0329 (3.5332) grad_norm 1.9055 (1.9618) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][160/1251] eta 0:04:29 lr 0.000888 wd 0.0500 time 0.2378 (0.2472) data time 0.0009 (0.0042) model time 0.2369 (0.2436) loss 3.6508 (3.5337) grad_norm 2.5553 (1.9746) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][170/1251] eta 0:04:26 lr 0.000888 wd 0.0500 time 0.2375 (0.2468) data time 0.0011 (0.0040) model time 0.2364 (0.2433) loss 3.8254 (3.5478) grad_norm 2.1099 (1.9878) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][180/1251] eta 0:04:23 lr 0.000888 wd 0.0500 time 0.2467 (0.2464) data time 0.0007 (0.0039) model time 0.2460 (0.2429) loss 3.8852 (3.5561) grad_norm 2.1639 (1.9870) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][190/1251] eta 0:04:21 lr 0.000888 wd 0.0500 time 0.2348 (0.2461) data time 0.0009 (0.0037) model time 0.2339 (0.2427) loss 2.9636 (3.5508) grad_norm 2.0197 (1.9999) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][200/1251] eta 0:04:18 lr 0.000888 wd 0.0500 time 0.2371 (0.2458) data time 0.0011 (0.0036) model time 0.2359 (0.2424) loss 3.7935 (3.5498) grad_norm 2.1143 (1.9980) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][210/1251] eta 0:04:15 lr 0.000888 wd 0.0500 time 0.2413 (0.2456) data time 0.0007 (0.0035) model time 0.2406 (0.2423) loss 3.7730 (3.5531) grad_norm 1.7778 (2.0004) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][220/1251] eta 0:04:13 lr 0.000888 wd 0.0500 time 0.2419 (0.2462) data time 0.0007 (0.0034) model time 0.2412 (0.2433) loss 3.7149 (3.5418) grad_norm 2.1557 (1.9894) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][230/1251] eta 0:04:11 lr 0.000888 wd 0.0500 time 0.2391 (0.2460) data time 0.0010 (0.0033) model time 0.2381 (0.2430) loss 3.2230 (3.5439) grad_norm 2.2692 (1.9892) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][240/1251] eta 0:04:09 lr 0.000888 wd 0.0500 time 0.2380 (0.2464) data time 0.0011 (0.0032) model time 0.2369 (0.2437) loss 3.7007 (3.5456) grad_norm 1.6693 (1.9977) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][250/1251] eta 0:04:06 lr 0.000888 wd 0.0500 time 0.2422 (0.2467) data time 0.0008 (0.0031) model time 0.2414 (0.2442) loss 3.5621 (3.5422) grad_norm 1.7939 (2.0003) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][260/1251] eta 0:04:04 lr 0.000888 wd 0.0500 time 0.2405 (0.2465) data time 0.0009 (0.0030) model time 0.2396 (0.2439) loss 3.9729 
(3.5408) grad_norm 1.9942 (1.9904) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][270/1251] eta 0:04:01 lr 0.000888 wd 0.0500 time 0.2310 (0.2462) data time 0.0011 (0.0029) model time 0.2299 (0.2436) loss 3.3823 (3.5340) grad_norm 2.0067 (1.9882) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][280/1251] eta 0:03:58 lr 0.000888 wd 0.0500 time 0.2354 (0.2460) data time 0.0010 (0.0029) model time 0.2344 (0.2434) loss 3.5200 (3.5332) grad_norm 1.8594 (1.9896) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][290/1251] eta 0:03:56 lr 0.000888 wd 0.0500 time 0.2427 (0.2458) data time 0.0009 (0.0028) model time 0.2418 (0.2433) loss 2.9782 (3.5265) grad_norm 2.4112 (1.9992) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][300/1251] eta 0:03:53 lr 0.000888 wd 0.0500 time 0.2414 (0.2457) data time 0.0009 (0.0028) model time 0.2405 (0.2432) loss 3.9969 (3.5253) grad_norm 2.3921 (2.0007) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][310/1251] eta 0:03:51 lr 0.000888 wd 0.0500 time 0.2426 (0.2463) data time 0.0009 (0.0027) model time 0.2417 (0.2440) loss 3.0224 (3.5233) grad_norm 3.0615 (2.0112) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][320/1251] eta 0:03:49 lr 0.000888 wd 0.0500 time 0.2348 (0.2461) data time 0.0011 (0.0026) model time 0.2337 (0.2439) loss 3.9648 (3.5277) grad_norm 1.8613 (2.0062) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][330/1251] eta 0:03:46 lr 0.000888 wd 
0.0500 time 0.2482 (0.2460) data time 0.0009 (0.0026) model time 0.2473 (0.2438) loss 3.4911 (3.5207) grad_norm 2.9102 (2.0000) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][340/1251] eta 0:03:43 lr 0.000888 wd 0.0500 time 0.2469 (0.2458) data time 0.0009 (0.0025) model time 0.2459 (0.2436) loss 3.4347 (3.5214) grad_norm 2.6571 (2.0136) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][350/1251] eta 0:03:41 lr 0.000888 wd 0.0500 time 0.2391 (0.2463) data time 0.0009 (0.0025) model time 0.2383 (0.2442) loss 3.7542 (3.5278) grad_norm 1.6198 (2.0108) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][360/1251] eta 0:03:39 lr 0.000888 wd 0.0500 time 0.2349 (0.2468) data time 0.0009 (0.0025) model time 0.2340 (0.2448) loss 2.2187 (3.5226) grad_norm 2.0364 (2.0066) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][370/1251] eta 0:03:37 lr 0.000887 wd 0.0500 time 0.2420 (0.2472) data time 0.0012 (0.0024) model time 0.2408 (0.2453) loss 3.1663 (3.5176) grad_norm 1.9291 (2.0067) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][380/1251] eta 0:03:35 lr 0.000887 wd 0.0500 time 0.2474 (0.2471) data time 0.0007 (0.0024) model time 0.2466 (0.2452) loss 3.9207 (3.5168) grad_norm 1.5531 (2.0101) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][390/1251] eta 0:03:32 lr 0.000887 wd 0.0500 time 0.2332 (0.2469) data time 0.0009 (0.0023) model time 0.2323 (0.2450) loss 4.0241 (3.5198) grad_norm 1.6715 (2.0044) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][400/1251] eta 0:03:30 lr 0.000887 wd 0.0500 time 0.2586 (0.2468) data time 0.0009 (0.0023) model time 0.2577 (0.2449) loss 3.5481 (3.5247) grad_norm 1.9202 (2.0009) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][410/1251] eta 0:03:27 lr 0.000887 wd 0.0500 time 0.2477 (0.2467) data time 0.0011 (0.0023) model time 0.2466 (0.2448) loss 3.8172 (3.5293) grad_norm 1.9689 (2.0087) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][420/1251] eta 0:03:24 lr 0.000887 wd 0.0500 time 0.2351 (0.2466) data time 0.0010 (0.0023) model time 0.2342 (0.2447) loss 2.9977 (3.5304) grad_norm 1.4903 (2.0156) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][430/1251] eta 0:03:22 lr 0.000887 wd 0.0500 time 0.2378 (0.2464) data time 0.0011 (0.0022) model time 0.2367 (0.2446) loss 2.8737 (3.5271) grad_norm 1.5319 (2.0234) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][440/1251] eta 0:03:19 lr 0.000887 wd 0.0500 time 0.2362 (0.2463) data time 0.0008 (0.0022) model time 0.2354 (0.2445) loss 3.7464 (3.5208) grad_norm 1.9229 (2.0253) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][450/1251] eta 0:03:17 lr 0.000887 wd 0.0500 time 0.2435 (0.2467) data time 0.0008 (0.0022) model time 0.2427 (0.2449) loss 3.6789 (3.5169) grad_norm 1.5304 (2.0284) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][460/1251] eta 0:03:15 lr 0.000887 wd 0.0500 time 0.2465 (0.2466) data time 0.0007 (0.0021) model time 0.2457 (0.2448) loss 3.7260 
(3.5168) grad_norm 1.6210 (2.0232) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][470/1251] eta 0:03:12 lr 0.000887 wd 0.0500 time 0.2484 (0.2465) data time 0.0011 (0.0021) model time 0.2473 (0.2448) loss 3.8728 (3.5194) grad_norm 2.1033 (2.0184) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][480/1251] eta 0:03:10 lr 0.000887 wd 0.0500 time 0.2384 (0.2464) data time 0.0009 (0.0021) model time 0.2375 (0.2447) loss 4.3525 (3.5209) grad_norm 2.2540 (2.0197) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][490/1251] eta 0:03:07 lr 0.000887 wd 0.0500 time 0.2469 (0.2467) data time 0.0010 (0.0021) model time 0.2459 (0.2450) loss 2.9321 (3.5163) grad_norm 3.0499 (2.0279) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][500/1251] eta 0:03:05 lr 0.000887 wd 0.0500 time 0.2832 (0.2468) data time 0.0013 (0.0020) model time 0.2819 (0.2451) loss 4.1673 (3.5178) grad_norm 1.8835 (2.0381) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][510/1251] eta 0:03:02 lr 0.000887 wd 0.0500 time 0.2428 (0.2467) data time 0.0009 (0.0020) model time 0.2419 (0.2450) loss 4.1542 (3.5167) grad_norm 1.6781 (2.0386) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][520/1251] eta 0:03:00 lr 0.000887 wd 0.0500 time 0.2443 (0.2466) data time 0.0010 (0.0020) model time 0.2433 (0.2449) loss 3.8334 (3.5157) grad_norm 1.9821 (2.0348) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][530/1251] eta 0:02:57 lr 0.000887 wd 
0.0500 time 0.2337 (0.2465) data time 0.0012 (0.0020) model time 0.2325 (0.2448) loss 2.4466 (3.5057) grad_norm 2.1182 (2.0374) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][540/1251] eta 0:02:55 lr 0.000887 wd 0.0500 time 0.2393 (0.2463) data time 0.0007 (0.0020) model time 0.2386 (0.2446) loss 4.1640 (3.5058) grad_norm 2.1735 (2.0369) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][550/1251] eta 0:02:52 lr 0.000887 wd 0.0500 time 0.2457 (0.2463) data time 0.0009 (0.0020) model time 0.2447 (0.2446) loss 4.2054 (3.5044) grad_norm 2.3775 (2.0365) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][560/1251] eta 0:02:50 lr 0.000887 wd 0.0500 time 0.2374 (0.2462) data time 0.0009 (0.0020) model time 0.2365 (0.2445) loss 3.4504 (3.5023) grad_norm 2.1653 (2.0381) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][570/1251] eta 0:02:47 lr 0.000887 wd 0.0500 time 0.2309 (0.2461) data time 0.0011 (0.0019) model time 0.2298 (0.2444) loss 3.0062 (3.5011) grad_norm 2.0136 (2.0383) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][580/1251] eta 0:02:45 lr 0.000887 wd 0.0500 time 0.2387 (0.2460) data time 0.0010 (0.0019) model time 0.2376 (0.2443) loss 4.5599 (3.5022) grad_norm 1.8364 (2.0401) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][590/1251] eta 0:02:42 lr 0.000887 wd 0.0500 time 0.2392 (0.2459) data time 0.0012 (0.0019) model time 0.2380 (0.2442) loss 3.5857 (3.5041) grad_norm 1.9606 (2.0376) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:50 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][600/1251] eta 0:02:40 lr 0.000887 wd 0.0500 time 0.2362 (0.2458) data time 0.0007 (0.0019) model time 0.2355 (0.2441) loss 3.3976 (3.5017) grad_norm 1.8223 (2.0361) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][610/1251] eta 0:02:37 lr 0.000887 wd 0.0500 time 0.2433 (0.2457) data time 0.0007 (0.0019) model time 0.2426 (0.2441) loss 4.5783 (3.5040) grad_norm 2.5269 (2.0323) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][620/1251] eta 0:02:35 lr 0.000887 wd 0.0500 time 0.2356 (0.2457) data time 0.0009 (0.0019) model time 0.2347 (0.2440) loss 3.3296 (3.5022) grad_norm 1.5697 (2.0300) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][630/1251] eta 0:02:32 lr 0.000887 wd 0.0500 time 0.2365 (0.2456) data time 0.0009 (0.0019) model time 0.2356 (0.2439) loss 2.2725 (3.5030) grad_norm 2.1722 (2.0281) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][640/1251] eta 0:02:30 lr 0.000887 wd 0.0500 time 0.2420 (0.2455) data time 0.0010 (0.0018) model time 0.2410 (0.2439) loss 2.0742 (3.4959) grad_norm 3.0894 (2.0314) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][650/1251] eta 0:02:27 lr 0.000887 wd 0.0500 time 0.2437 (0.2455) data time 0.0010 (0.0018) model time 0.2426 (0.2438) loss 2.5057 (3.4924) grad_norm 1.8745 (2.0324) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][660/1251] eta 0:02:25 lr 0.000887 wd 0.0500 time 0.2401 (0.2454) data time 0.0011 (0.0018) model time 0.2391 (0.2438) loss 3.6613 
(3.4944) grad_norm 1.6190 (2.0336) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][670/1251] eta 0:02:22 lr 0.000887 wd 0.0500 time 0.2413 (0.2454) data time 0.0007 (0.0018) model time 0.2406 (0.2437) loss 3.7993 (3.4948) grad_norm 1.4025 (2.0335) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][680/1251] eta 0:02:20 lr 0.000887 wd 0.0500 time 0.2416 (0.2453) data time 0.0010 (0.0018) model time 0.2406 (0.2437) loss 3.4265 (3.4949) grad_norm 2.7141 (2.0315) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][690/1251] eta 0:02:17 lr 0.000887 wd 0.0500 time 0.2382 (0.2453) data time 0.0010 (0.0018) model time 0.2372 (0.2436) loss 3.9693 (3.4872) grad_norm 3.2811 (2.0329) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][700/1251] eta 0:02:15 lr 0.000887 wd 0.0500 time 0.2381 (0.2452) data time 0.0010 (0.0018) model time 0.2371 (0.2436) loss 4.0599 (3.4867) grad_norm 2.1830 (2.0323) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][710/1251] eta 0:02:12 lr 0.000887 wd 0.0500 time 0.2448 (0.2452) data time 0.0009 (0.0018) model time 0.2440 (0.2436) loss 2.4060 (3.4810) grad_norm 1.3639 (2.0297) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][720/1251] eta 0:02:10 lr 0.000886 wd 0.0500 time 0.2482 (0.2451) data time 0.0007 (0.0018) model time 0.2474 (0.2435) loss 2.5439 (3.4805) grad_norm 3.3883 (2.0300) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][730/1251] eta 0:02:07 lr 0.000886 wd 
0.0500 time 0.2456 (0.2450) data time 0.0008 (0.0017) model time 0.2449 (0.2434) loss 2.5581 (3.4759) grad_norm 1.6595 (2.0286) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][740/1251] eta 0:02:05 lr 0.000886 wd 0.0500 time 0.2320 (0.2452) data time 0.0008 (0.0017) model time 0.2312 (0.2436) loss 3.0914 (3.4796) grad_norm 1.5655 (2.0239) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][750/1251] eta 0:02:02 lr 0.000886 wd 0.0500 time 0.2464 (0.2454) data time 0.0007 (0.0017) model time 0.2457 (0.2439) loss 3.8545 (3.4781) grad_norm 1.4879 (2.0209) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][760/1251] eta 0:02:00 lr 0.000886 wd 0.0500 time 0.2419 (0.2454) data time 0.0010 (0.0017) model time 0.2410 (0.2438) loss 4.1965 (3.4808) grad_norm 1.9960 (2.0196) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][770/1251] eta 0:01:58 lr 0.000886 wd 0.0500 time 0.4443 (0.2458) data time 0.0011 (0.0017) model time 0.4432 (0.2443) loss 4.0036 (3.4838) grad_norm 1.9470 (2.0184) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][780/1251] eta 0:01:55 lr 0.000886 wd 0.0500 time 0.2411 (0.2458) data time 0.0009 (0.0017) model time 0.2402 (0.2443) loss 3.0297 (3.4810) grad_norm 1.9961 (2.0172) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][790/1251] eta 0:01:53 lr 0.000886 wd 0.0500 time 0.2429 (0.2457) data time 0.0009 (0.0017) model time 0.2420 (0.2442) loss 3.7156 (3.4798) grad_norm 1.4387 (2.0163) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:39 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][800/1251] eta 0:01:50 lr 0.000886 wd 0.0500 time 0.2460 (0.2457) data time 0.0008 (0.0017) model time 0.2453 (0.2442) loss 3.8312 (3.4813) grad_norm 2.7187 (2.0192) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][810/1251] eta 0:01:48 lr 0.000886 wd 0.0500 time 0.2468 (0.2457) data time 0.0013 (0.0017) model time 0.2455 (0.2441) loss 3.7362 (3.4807) grad_norm 2.1177 (2.0180) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][820/1251] eta 0:01:45 lr 0.000886 wd 0.0500 time 0.2431 (0.2456) data time 0.0009 (0.0017) model time 0.2422 (0.2441) loss 2.5093 (3.4778) grad_norm 2.2670 (2.0165) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][830/1251] eta 0:01:43 lr 0.000886 wd 0.0500 time 0.2520 (0.2456) data time 0.0008 (0.0017) model time 0.2512 (0.2441) loss 2.9166 (3.4769) grad_norm 1.4757 (2.0137) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][840/1251] eta 0:01:40 lr 0.000886 wd 0.0500 time 0.2487 (0.2455) data time 0.0011 (0.0017) model time 0.2476 (0.2440) loss 3.0653 (3.4746) grad_norm 3.4015 (2.0161) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][850/1251] eta 0:01:38 lr 0.000886 wd 0.0500 time 0.2465 (0.2455) data time 0.0009 (0.0017) model time 0.2456 (0.2440) loss 4.3773 (3.4772) grad_norm 1.7484 (2.0170) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][860/1251] eta 0:01:35 lr 0.000886 wd 0.0500 time 0.2473 (0.2455) data time 0.0011 (0.0017) model time 0.2461 (0.2440) loss 3.3869 
(3.4779) grad_norm 1.8677 (2.0146) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][870/1251] eta 0:01:33 lr 0.000886 wd 0.0500 time 0.2450 (0.2457) data time 0.0007 (0.0017) model time 0.2442 (0.2442) loss 3.6715 (3.4802) grad_norm 2.2470 (2.0136) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][880/1251] eta 0:01:31 lr 0.000886 wd 0.0500 time 0.2521 (0.2459) data time 0.0007 (0.0017) model time 0.2514 (0.2444) loss 4.3372 (3.4784) grad_norm 3.3086 (2.0132) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][890/1251] eta 0:01:28 lr 0.000886 wd 0.0500 time 0.2414 (0.2460) data time 0.0010 (0.0017) model time 0.2405 (0.2445) loss 4.2626 (3.4776) grad_norm 1.7958 (2.0226) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][900/1251] eta 0:01:26 lr 0.000886 wd 0.0500 time 0.2477 (0.2460) data time 0.0008 (0.0017) model time 0.2469 (0.2445) loss 4.2577 (3.4778) grad_norm 2.3293 (2.0204) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][910/1251] eta 0:01:23 lr 0.000886 wd 0.0500 time 0.2585 (0.2460) data time 0.0010 (0.0017) model time 0.2575 (0.2445) loss 3.8049 (3.4792) grad_norm 1.9654 (2.0218) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][920/1251] eta 0:01:21 lr 0.000886 wd 0.0500 time 0.2396 (0.2459) data time 0.0009 (0.0017) model time 0.2388 (0.2444) loss 4.1959 (3.4813) grad_norm 2.1024 (2.0227) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][930/1251] eta 0:01:18 lr 0.000886 wd 
0.0500 time 0.2448 (0.2459) data time 0.0009 (0.0016) model time 0.2438 (0.2444) loss 2.5929 (3.4822) grad_norm 2.1902 (2.0235) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][940/1251] eta 0:01:16 lr 0.000886 wd 0.0500 time 0.2450 (0.2458) data time 0.0009 (0.0016) model time 0.2442 (0.2444) loss 2.5493 (3.4830) grad_norm 2.2504 (2.0269) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][950/1251] eta 0:01:13 lr 0.000886 wd 0.0500 time 0.2404 (0.2458) data time 0.0012 (0.0016) model time 0.2392 (0.2443) loss 3.1656 (3.4823) grad_norm 1.5816 (2.0259) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][960/1251] eta 0:01:11 lr 0.000886 wd 0.0500 time 0.2408 (0.2457) data time 0.0010 (0.0016) model time 0.2398 (0.2442) loss 3.6774 (3.4831) grad_norm 1.9676 (2.0267) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][970/1251] eta 0:01:09 lr 0.000886 wd 0.0500 time 0.2451 (0.2457) data time 0.0010 (0.0016) model time 0.2442 (0.2442) loss 3.4382 (3.4818) grad_norm 1.8820 (2.0246) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][980/1251] eta 0:01:06 lr 0.000886 wd 0.0500 time 0.2412 (0.2456) data time 0.0007 (0.0016) model time 0.2405 (0.2441) loss 4.4099 (3.4835) grad_norm 1.9620 (2.0230) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][990/1251] eta 0:01:04 lr 0.000886 wd 0.0500 time 0.2377 (0.2458) data time 0.0007 (0.0016) model time 0.2370 (0.2443) loss 2.5690 (3.4816) grad_norm 1.5503 (2.0212) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:28 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1000/1251] eta 0:01:01 lr 0.000886 wd 0.0500 time 0.2488 (0.2458) data time 0.0008 (0.0016) model time 0.2480 (0.2443) loss 3.3464 (3.4806) grad_norm 2.8355 (2.0229) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1010/1251] eta 0:00:59 lr 0.000886 wd 0.0500 time 0.2455 (0.2457) data time 0.0007 (0.0016) model time 0.2448 (0.2443) loss 2.8844 (3.4771) grad_norm 1.6827 (2.0258) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1020/1251] eta 0:00:56 lr 0.000886 wd 0.0500 time 0.2426 (0.2457) data time 0.0009 (0.0016) model time 0.2417 (0.2442) loss 3.6233 (3.4788) grad_norm 1.7946 (2.0250) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1030/1251] eta 0:00:54 lr 0.000886 wd 0.0500 time 0.2427 (0.2456) data time 0.0010 (0.0016) model time 0.2418 (0.2442) loss 3.0923 (3.4794) grad_norm 1.5610 (2.0254) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1040/1251] eta 0:00:51 lr 0.000886 wd 0.0500 time 0.2388 (0.2457) data time 0.0008 (0.0016) model time 0.2380 (0.2443) loss 3.0191 (3.4784) grad_norm 1.9121 (2.0254) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1050/1251] eta 0:00:49 lr 0.000886 wd 0.0500 time 0.2422 (0.2457) data time 0.0010 (0.0016) model time 0.2411 (0.2443) loss 3.9321 (3.4774) grad_norm 4.4966 (2.0325) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1060/1251] eta 0:00:46 lr 0.000886 wd 0.0500 time 0.2451 (0.2457) data time 0.0011 (0.0016) model time 0.2439 (0.2442) loss 3.5342 
(3.4788) grad_norm 2.1388 (2.0347) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1070/1251] eta 0:00:44 lr 0.000885 wd 0.0500 time 0.2475 (0.2456) data time 0.0011 (0.0016) model time 0.2464 (0.2442) loss 3.6261 (3.4769) grad_norm 1.7519 (2.0338) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1080/1251] eta 0:00:41 lr 0.000885 wd 0.0500 time 0.2408 (0.2456) data time 0.0007 (0.0016) model time 0.2401 (0.2441) loss 4.2828 (3.4786) grad_norm 2.7803 (2.0351) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1090/1251] eta 0:00:39 lr 0.000885 wd 0.0500 time 0.2431 (0.2456) data time 0.0010 (0.0016) model time 0.2420 (0.2441) loss 2.9124 (3.4765) grad_norm 1.3360 (2.0353) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1100/1251] eta 0:00:37 lr 0.000885 wd 0.0500 time 0.2460 (0.2455) data time 0.0009 (0.0015) model time 0.2450 (0.2441) loss 3.6020 (3.4781) grad_norm 2.5042 (2.0396) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1110/1251] eta 0:00:34 lr 0.000885 wd 0.0500 time 0.2474 (0.2455) data time 0.0009 (0.0015) model time 0.2465 (0.2441) loss 3.7684 (3.4777) grad_norm 1.4851 (2.0384) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1120/1251] eta 0:00:32 lr 0.000885 wd 0.0500 time 0.2475 (0.2455) data time 0.0007 (0.0015) model time 0.2468 (0.2441) loss 4.2387 (3.4776) grad_norm 1.3015 (2.0368) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1130/1251] eta 0:00:29 lr 
0.000885 wd 0.0500 time 0.2464 (0.2454) data time 0.0007 (0.0015) model time 0.2458 (0.2440) loss 4.4328 (3.4771) grad_norm 1.8157 (2.0359) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1140/1251] eta 0:00:27 lr 0.000885 wd 0.0500 time 0.2404 (0.2454) data time 0.0010 (0.0015) model time 0.2394 (0.2440) loss 2.6488 (3.4759) grad_norm 1.7307 (2.0365) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1150/1251] eta 0:00:24 lr 0.000885 wd 0.0500 time 0.2432 (0.2454) data time 0.0009 (0.0015) model time 0.2423 (0.2440) loss 3.6913 (3.4770) grad_norm 2.2410 (2.0370) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1160/1251] eta 0:00:22 lr 0.000885 wd 0.0500 time 0.2446 (0.2454) data time 0.0010 (0.0015) model time 0.2436 (0.2440) loss 3.7712 (3.4785) grad_norm 2.0576 (2.0398) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1170/1251] eta 0:00:19 lr 0.000885 wd 0.0500 time 0.2362 (0.2453) data time 0.0012 (0.0015) model time 0.2351 (0.2439) loss 3.9895 (3.4804) grad_norm 1.6620 (2.0410) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1180/1251] eta 0:00:17 lr 0.000885 wd 0.0500 time 0.2433 (0.2453) data time 0.0007 (0.0015) model time 0.2426 (0.2439) loss 2.8399 (3.4795) grad_norm 1.9038 (2.0408) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1190/1251] eta 0:00:14 lr 0.000885 wd 0.0500 time 0.2573 (0.2453) data time 0.0008 (0.0015) model time 0.2565 (0.2439) loss 4.5959 (3.4824) grad_norm 1.8826 (2.0421) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 
09:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1200/1251] eta 0:00:12 lr 0.000885 wd 0.0500 time 0.2476 (0.2453) data time 0.0010 (0.0015) model time 0.2467 (0.2439) loss 3.8524 (3.4801) grad_norm 2.0321 (2.0435) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1210/1251] eta 0:00:10 lr 0.000885 wd 0.0500 time 0.2459 (0.2453) data time 0.0010 (0.0015) model time 0.2449 (0.2439) loss 3.7321 (3.4796) grad_norm 1.9371 (2.0423) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1220/1251] eta 0:00:07 lr 0.000885 wd 0.0500 time 0.2426 (0.2452) data time 0.0009 (0.0015) model time 0.2417 (0.2439) loss 2.9724 (3.4813) grad_norm 3.2786 (2.0416) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1230/1251] eta 0:00:05 lr 0.000885 wd 0.0500 time 0.2444 (0.2452) data time 0.0010 (0.0015) model time 0.2434 (0.2438) loss 3.5681 (3.4811) grad_norm 1.7758 (2.0431) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1240/1251] eta 0:00:02 lr 0.000885 wd 0.0500 time 0.2263 (0.2453) data time 0.0005 (0.0015) model time 0.2258 (0.2439) loss 2.2625 (3.4806) grad_norm 2.0301 (2.0436) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [81/300][1250/1251] eta 0:00:00 lr 0.000885 wd 0.0500 time 0.2261 (0.2452) data time 0.0007 (0.0015) model time 0.2254 (0.2438) loss 3.9474 (3.4792) grad_norm 2.7764 (2.0436) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 81 training takes 0:05:06 [2024-08-26 09:18:29 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:18:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.427 (0.427) Loss 0.5264 (0.5264) Acc@1 89.551 (89.551) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 09:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.108) Loss 0.7964 (0.8170) Acc@1 83.691 (81.854) Acc@5 95.312 (95.996) Mem 7379MB [2024-08-26 09:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.094) Loss 1.1436 (0.8320) Acc@1 71.973 (81.157) Acc@5 93.359 (96.177) Mem 7379MB [2024-08-26 09:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.089) Loss 1.4199 (0.9599) Acc@1 65.332 (78.399) Acc@5 89.258 (94.594) Mem 7379MB [2024-08-26 09:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.3535 (1.0312) Acc@1 67.773 (76.717) Acc@5 91.211 (93.762) Mem 7379MB [2024-08-26 09:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.290 Acc@5 93.658 [2024-08-26 09:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.3% [2024-08-26 09:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.800 (0.800) Loss 0.4590 (0.4590) Acc@1 92.188 (92.188) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 09:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.148) Loss 0.7354 (0.7137) Acc@1 85.156 (84.650) Acc@5 96.289 (96.884) Mem 7379MB [2024-08-26 09:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.117) Loss 1.0176 (0.7370) Acc@1 75.684 (83.524) Acc@5 94.238 (96.870) Mem 7379MB [2024-08-26 09:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.072 (0.105) Loss 1.3066 (0.8422) Acc@1 66.895 (81.058) Acc@5 90.625 (95.615) Mem 7379MB [2024-08-26 09:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.1768 (0.8981) Acc@1 71.680 (79.602) Acc@5 91.699 (94.996) Mem 7379MB [2024-08-26 09:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.186 Acc@5 94.946 [2024-08-26 09:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.2% [2024-08-26 09:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.19% [2024-08-26 09:18:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:18:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 09:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][0/1251] eta 0:15:06 lr 0.000885 wd 0.0500 time 0.7246 (0.7246) data time 0.5027 (0.5027) model time 0.0000 (0.0000) loss 3.6851 (3.6851) grad_norm 2.6429 (2.6429) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][10/1251] eta 0:06:09 lr 0.000885 wd 0.0500 time 0.3688 (0.2981) data time 0.0010 (0.0466) model time 0.0000 (0.0000) loss 3.5508 (3.3076) grad_norm 2.3172 (2.0242) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][20/1251] eta 0:05:33 lr 0.000885 wd 0.0500 time 0.2381 (0.2709) data time 0.0011 (0.0249) model time 0.0000 (0.0000) loss 4.0016 (3.2776) grad_norm 1.8080 (1.9361) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][30/1251] eta 0:05:18 lr 0.000885 wd 0.0500 time 0.2390 (0.2612) data time 0.0008 (0.0172) 
model time 0.0000 (0.0000) loss 2.9609 (3.3752) grad_norm 1.6372 (1.9055) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][40/1251] eta 0:05:10 lr 0.000885 wd 0.0500 time 0.2387 (0.2563) data time 0.0011 (0.0133) model time 0.0000 (0.0000) loss 3.5999 (3.3897) grad_norm 1.4259 (1.8709) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][50/1251] eta 0:05:04 lr 0.000885 wd 0.0500 time 0.2447 (0.2533) data time 0.0009 (0.0108) model time 0.0000 (0.0000) loss 4.2768 (3.4296) grad_norm 1.9066 (1.8931) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][60/1251] eta 0:04:59 lr 0.000885 wd 0.0500 time 0.2406 (0.2514) data time 0.0009 (0.0092) model time 0.2396 (0.2411) loss 3.3915 (3.4514) grad_norm 2.7224 (1.9763) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][70/1251] eta 0:04:58 lr 0.000885 wd 0.0500 time 0.2563 (0.2529) data time 0.0010 (0.0081) model time 0.2553 (0.2511) loss 2.6787 (3.4512) grad_norm 1.7311 (1.9918) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:18:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][80/1251] eta 0:04:54 lr 0.000885 wd 0.0500 time 0.2429 (0.2517) data time 0.0010 (0.0072) model time 0.2418 (0.2482) loss 3.1413 (3.4346) grad_norm 1.5470 (2.0047) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][90/1251] eta 0:04:51 lr 0.000885 wd 0.0500 time 0.2487 (0.2507) data time 0.0008 (0.0065) model time 0.2479 (0.2464) loss 3.8360 (3.4437) grad_norm 1.9095 (2.0215) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[82/300][100/1251] eta 0:04:49 lr 0.000885 wd 0.0500 time 0.2472 (0.2518) data time 0.0007 (0.0060) model time 0.2464 (0.2493) loss 3.0360 (3.4027) grad_norm 2.6089 (2.0622) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][110/1251] eta 0:04:46 lr 0.000885 wd 0.0500 time 0.2435 (0.2510) data time 0.0008 (0.0055) model time 0.2426 (0.2482) loss 4.1888 (3.4349) grad_norm 2.2051 (2.0558) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][120/1251] eta 0:04:43 lr 0.000885 wd 0.0500 time 0.2450 (0.2503) data time 0.0010 (0.0051) model time 0.2441 (0.2472) loss 3.2317 (3.4236) grad_norm 2.2116 (2.0402) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][130/1251] eta 0:04:40 lr 0.000885 wd 0.0500 time 0.2593 (0.2499) data time 0.0012 (0.0048) model time 0.2581 (0.2468) loss 3.6046 (3.4221) grad_norm 2.3008 (2.0239) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][140/1251] eta 0:04:38 lr 0.000885 wd 0.0500 time 0.2381 (0.2506) data time 0.0007 (0.0046) model time 0.2374 (0.2480) loss 4.1739 (3.4109) grad_norm 1.6310 (2.0537) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][150/1251] eta 0:04:36 lr 0.000885 wd 0.0500 time 0.2422 (0.2512) data time 0.0009 (0.0043) model time 0.2413 (0.2492) loss 2.5849 (3.4062) grad_norm 2.7937 (2.0544) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][160/1251] eta 0:04:34 lr 0.000885 wd 0.0500 time 0.2492 (0.2517) data time 0.0011 (0.0041) model time 0.2481 (0.2500) loss 3.6829 (3.4208) grad_norm 1.3590 (2.0336) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 09:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][170/1251] eta 0:04:31 lr 0.000885 wd 0.0500 time 0.2496 (0.2513) data time 0.0009 (0.0039) model time 0.2487 (0.2495) loss 2.8086 (3.4092) grad_norm 1.7181 (2.0310) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][180/1251] eta 0:04:28 lr 0.000884 wd 0.0500 time 0.2367 (0.2511) data time 0.0010 (0.0038) model time 0.2357 (0.2492) loss 3.5214 (3.3807) grad_norm 3.4169 (2.0497) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][190/1251] eta 0:04:25 lr 0.000884 wd 0.0500 time 0.2442 (0.2506) data time 0.0009 (0.0036) model time 0.2433 (0.2487) loss 3.5823 (3.3854) grad_norm 1.7890 (2.0492) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][200/1251] eta 0:04:23 lr 0.000884 wd 0.0500 time 0.2334 (0.2503) data time 0.0009 (0.0035) model time 0.2325 (0.2483) loss 2.8288 (3.3880) grad_norm 1.3777 (2.0411) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][210/1251] eta 0:04:20 lr 0.000884 wd 0.0500 time 0.2474 (0.2500) data time 0.0010 (0.0034) model time 0.2465 (0.2479) loss 3.3779 (3.3915) grad_norm 2.0683 (2.0314) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][220/1251] eta 0:04:17 lr 0.000884 wd 0.0500 time 0.2544 (0.2497) data time 0.0007 (0.0033) model time 0.2537 (0.2476) loss 3.4121 (3.3840) grad_norm 2.9820 (2.0315) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][230/1251] eta 0:04:14 lr 0.000884 wd 0.0500 time 0.2405 (0.2493) data time 0.0007 (0.0032) 
model time 0.2398 (0.2472) loss 2.9163 (3.3895) grad_norm 1.7564 (2.0326) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][240/1251] eta 0:04:11 lr 0.000884 wd 0.0500 time 0.2422 (0.2489) data time 0.0010 (0.0031) model time 0.2412 (0.2468) loss 2.8384 (3.3765) grad_norm 2.6228 (2.0427) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][250/1251] eta 0:04:08 lr 0.000884 wd 0.0500 time 0.2437 (0.2486) data time 0.0010 (0.0030) model time 0.2428 (0.2464) loss 3.1536 (3.3844) grad_norm 2.0604 (2.0448) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][260/1251] eta 0:04:08 lr 0.000884 wd 0.0500 time 0.4600 (0.2509) data time 0.0013 (0.0029) model time 0.4587 (0.2494) loss 3.7612 (3.3898) grad_norm 1.8271 (2.0418) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][270/1251] eta 0:04:05 lr 0.000884 wd 0.0500 time 0.2419 (0.2505) data time 0.0010 (0.0028) model time 0.2409 (0.2489) loss 3.6876 (3.3892) grad_norm 1.9539 (2.0407) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][280/1251] eta 0:04:03 lr 0.000884 wd 0.0500 time 0.2459 (0.2503) data time 0.0007 (0.0028) model time 0.2451 (0.2487) loss 4.2665 (3.3965) grad_norm 1.7262 (2.0326) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][290/1251] eta 0:04:00 lr 0.000884 wd 0.0500 time 0.2387 (0.2501) data time 0.0008 (0.0027) model time 0.2379 (0.2484) loss 3.3821 (3.3942) grad_norm 1.7899 (2.0335) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[82/300][300/1251] eta 0:03:57 lr 0.000884 wd 0.0500 time 0.2333 (0.2498) data time 0.0009 (0.0027) model time 0.2324 (0.2481) loss 2.7768 (3.3958) grad_norm 2.3192 (2.0321) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][310/1251] eta 0:03:54 lr 0.000884 wd 0.0500 time 0.2422 (0.2496) data time 0.0011 (0.0026) model time 0.2411 (0.2479) loss 3.7821 (3.3918) grad_norm 2.1452 (2.0280) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][320/1251] eta 0:03:52 lr 0.000884 wd 0.0500 time 0.2423 (0.2494) data time 0.0007 (0.0026) model time 0.2416 (0.2477) loss 3.7419 (3.3903) grad_norm 2.1234 (2.0239) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][330/1251] eta 0:03:49 lr 0.000884 wd 0.0500 time 0.2403 (0.2493) data time 0.0010 (0.0025) model time 0.2393 (0.2475) loss 3.3620 (3.4007) grad_norm 1.5770 (2.0201) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][340/1251] eta 0:03:46 lr 0.000884 wd 0.0500 time 0.2425 (0.2491) data time 0.0008 (0.0025) model time 0.2417 (0.2473) loss 4.3324 (3.4145) grad_norm 1.7725 (2.0182) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][350/1251] eta 0:03:44 lr 0.000884 wd 0.0500 time 0.2405 (0.2489) data time 0.0010 (0.0024) model time 0.2396 (0.2471) loss 3.3770 (3.4055) grad_norm 2.1067 (2.0260) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][360/1251] eta 0:03:41 lr 0.000884 wd 0.0500 time 0.2394 (0.2487) data time 0.0008 (0.0024) model time 0.2385 (0.2469) loss 3.5552 (3.4038) grad_norm 1.8854 (2.0268) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 09:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][370/1251] eta 0:03:38 lr 0.000884 wd 0.0500 time 0.2430 (0.2484) data time 0.0009 (0.0024) model time 0.2421 (0.2466) loss 4.3975 (3.4030) grad_norm 2.5677 (2.0202) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][380/1251] eta 0:03:36 lr 0.000884 wd 0.0500 time 0.2372 (0.2483) data time 0.0012 (0.0023) model time 0.2360 (0.2465) loss 3.5025 (3.4050) grad_norm 1.5726 (2.0178) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][390/1251] eta 0:03:33 lr 0.000884 wd 0.0500 time 0.2408 (0.2481) data time 0.0009 (0.0023) model time 0.2398 (0.2463) loss 3.0477 (3.4022) grad_norm 2.2857 (2.0162) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][400/1251] eta 0:03:31 lr 0.000884 wd 0.0500 time 0.2413 (0.2480) data time 0.0008 (0.0023) model time 0.2406 (0.2462) loss 2.8233 (3.4107) grad_norm 3.3306 (2.0252) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][410/1251] eta 0:03:28 lr 0.000884 wd 0.0500 time 0.2389 (0.2478) data time 0.0009 (0.0022) model time 0.2380 (0.2460) loss 4.4293 (3.4217) grad_norm 1.8087 (2.0256) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][420/1251] eta 0:03:25 lr 0.000884 wd 0.0500 time 0.2358 (0.2477) data time 0.0007 (0.0022) model time 0.2350 (0.2459) loss 4.1274 (3.4186) grad_norm 1.6389 (2.0249) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][430/1251] eta 0:03:23 lr 0.000884 wd 0.0500 time 0.2407 (0.2476) data time 0.0007 (0.0022) 
model time 0.2400 (0.2458) loss 3.7210 (3.4241) grad_norm 2.3071 (2.0268) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][440/1251] eta 0:03:20 lr 0.000884 wd 0.0500 time 0.2520 (0.2475) data time 0.0007 (0.0021) model time 0.2513 (0.2457) loss 3.7179 (3.4220) grad_norm 1.5645 (2.0246) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][450/1251] eta 0:03:18 lr 0.000884 wd 0.0500 time 0.2491 (0.2478) data time 0.0009 (0.0021) model time 0.2481 (0.2461) loss 3.5801 (3.4260) grad_norm 1.9531 (2.0202) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][460/1251] eta 0:03:15 lr 0.000884 wd 0.0500 time 0.2438 (0.2476) data time 0.0007 (0.0021) model time 0.2431 (0.2459) loss 4.2970 (3.4281) grad_norm 2.3840 (2.0167) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][470/1251] eta 0:03:13 lr 0.000884 wd 0.0500 time 0.2439 (0.2475) data time 0.0010 (0.0021) model time 0.2429 (0.2458) loss 3.8564 (3.4304) grad_norm 1.6693 (2.0166) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][480/1251] eta 0:03:10 lr 0.000884 wd 0.0500 time 0.2326 (0.2474) data time 0.0008 (0.0020) model time 0.2319 (0.2457) loss 4.2871 (3.4319) grad_norm 1.5708 (2.0139) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][490/1251] eta 0:03:08 lr 0.000884 wd 0.0500 time 0.2322 (0.2472) data time 0.0011 (0.0020) model time 0.2311 (0.2456) loss 3.8162 (3.4310) grad_norm 1.8552 (2.0141) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[82/300][500/1251] eta 0:03:05 lr 0.000884 wd 0.0500 time 0.2464 (0.2471) data time 0.0010 (0.0020) model time 0.2455 (0.2454) loss 3.3795 (3.4307) grad_norm 1.9231 (2.0174) loss_scale 4096.0000 (2060.2635) mem 7379MB [2024-08-26 09:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][510/1251] eta 0:03:03 lr 0.000884 wd 0.0500 time 0.2388 (0.2473) data time 0.0008 (0.0020) model time 0.2381 (0.2457) loss 3.3113 (3.4319) grad_norm 1.8621 (2.0212) loss_scale 4096.0000 (2100.1018) mem 7379MB [2024-08-26 09:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][520/1251] eta 0:03:00 lr 0.000884 wd 0.0500 time 0.2423 (0.2472) data time 0.0009 (0.0020) model time 0.2414 (0.2456) loss 3.8450 (3.4314) grad_norm 2.3101 (2.0184) loss_scale 4096.0000 (2138.4107) mem 7379MB [2024-08-26 09:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][530/1251] eta 0:02:58 lr 0.000883 wd 0.0500 time 0.2441 (0.2471) data time 0.0008 (0.0019) model time 0.2433 (0.2455) loss 2.5116 (3.4269) grad_norm 1.9318 (2.0206) loss_scale 4096.0000 (2175.2768) mem 7379MB [2024-08-26 09:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][540/1251] eta 0:02:55 lr 0.000883 wd 0.0500 time 0.2404 (0.2474) data time 0.0009 (0.0019) model time 0.2394 (0.2457) loss 2.8514 (3.4260) grad_norm 1.7089 (2.0247) loss_scale 4096.0000 (2210.7800) mem 7379MB [2024-08-26 09:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][550/1251] eta 0:02:53 lr 0.000883 wd 0.0500 time 0.2333 (0.2473) data time 0.0010 (0.0019) model time 0.2323 (0.2456) loss 3.1910 (3.4214) grad_norm 1.9190 (2.0228) loss_scale 4096.0000 (2244.9946) mem 7379MB [2024-08-26 09:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][560/1251] eta 0:02:50 lr 0.000883 wd 0.0500 time 0.2448 (0.2472) data time 0.0008 (0.0019) model time 0.2440 (0.2456) loss 3.4887 (3.4243) grad_norm 1.7876 (2.0180) loss_scale 4096.0000 
(2277.9893) mem 7379MB [2024-08-26 09:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][570/1251] eta 0:02:48 lr 0.000883 wd 0.0500 time 0.2425 (0.2472) data time 0.0009 (0.0019) model time 0.2416 (0.2456) loss 4.1526 (3.4258) grad_norm 1.5991 (2.0140) loss_scale 4096.0000 (2309.8284) mem 7379MB [2024-08-26 09:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][580/1251] eta 0:02:45 lr 0.000883 wd 0.0500 time 0.2397 (0.2471) data time 0.0012 (0.0019) model time 0.2385 (0.2455) loss 3.4998 (3.4252) grad_norm 2.8187 (2.0175) loss_scale 4096.0000 (2340.5714) mem 7379MB [2024-08-26 09:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][590/1251] eta 0:02:43 lr 0.000883 wd 0.0500 time 0.2387 (0.2470) data time 0.0009 (0.0019) model time 0.2378 (0.2454) loss 3.4160 (3.4251) grad_norm 2.6916 (2.0177) loss_scale 4096.0000 (2370.2741) mem 7379MB [2024-08-26 09:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][600/1251] eta 0:02:40 lr 0.000883 wd 0.0500 time 0.2405 (0.2473) data time 0.0007 (0.0018) model time 0.2398 (0.2457) loss 3.0226 (3.4228) grad_norm 1.4617 (2.0194) loss_scale 4096.0000 (2398.9884) mem 7379MB [2024-08-26 09:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][610/1251] eta 0:02:38 lr 0.000883 wd 0.0500 time 0.2381 (0.2472) data time 0.0009 (0.0018) model time 0.2371 (0.2456) loss 3.6312 (3.4257) grad_norm 1.8796 (2.0232) loss_scale 4096.0000 (2426.7627) mem 7379MB [2024-08-26 09:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][620/1251] eta 0:02:35 lr 0.000883 wd 0.0500 time 0.2367 (0.2471) data time 0.0009 (0.0018) model time 0.2358 (0.2456) loss 3.6137 (3.4257) grad_norm 1.5710 (2.0241) loss_scale 4096.0000 (2453.6425) mem 7379MB [2024-08-26 09:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][630/1251] eta 0:02:33 lr 0.000883 wd 0.0500 time 0.2389 (0.2471) data time 0.0011 (0.0018) 
model time 0.2378 (0.2455) loss 2.2776 (3.4231) grad_norm 2.1478 (2.0244) loss_scale 4096.0000 (2479.6704) mem 7379MB [2024-08-26 09:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][640/1251] eta 0:02:30 lr 0.000883 wd 0.0500 time 0.2447 (0.2470) data time 0.0010 (0.0018) model time 0.2437 (0.2454) loss 3.9582 (3.4211) grad_norm 2.1304 (2.0233) loss_scale 4096.0000 (2504.8861) mem 7379MB [2024-08-26 09:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][650/1251] eta 0:02:28 lr 0.000883 wd 0.0500 time 0.2419 (0.2469) data time 0.0012 (0.0018) model time 0.2407 (0.2453) loss 3.2739 (3.4236) grad_norm 2.5192 (2.0216) loss_scale 4096.0000 (2529.3272) mem 7379MB [2024-08-26 09:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][660/1251] eta 0:02:26 lr 0.000883 wd 0.0500 time 0.4383 (0.2471) data time 0.0009 (0.0018) model time 0.4374 (0.2456) loss 4.0530 (3.4263) grad_norm 1.7037 (2.0190) loss_scale 4096.0000 (2553.0287) mem 7379MB [2024-08-26 09:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][670/1251] eta 0:02:23 lr 0.000883 wd 0.0500 time 0.2337 (0.2473) data time 0.0010 (0.0018) model time 0.2328 (0.2458) loss 3.3057 (3.4312) grad_norm 2.1045 (2.0177) loss_scale 4096.0000 (2576.0238) mem 7379MB [2024-08-26 09:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][680/1251] eta 0:02:21 lr 0.000883 wd 0.0500 time 0.2346 (0.2475) data time 0.0010 (0.0017) model time 0.2336 (0.2461) loss 3.3302 (3.4294) grad_norm 1.8740 (2.0191) loss_scale 4096.0000 (2598.3436) mem 7379MB [2024-08-26 09:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][690/1251] eta 0:02:18 lr 0.000883 wd 0.0500 time 0.2505 (0.2475) data time 0.0007 (0.0017) model time 0.2498 (0.2460) loss 2.7599 (3.4279) grad_norm 1.9258 (2.0199) loss_scale 4096.0000 (2620.0174) mem 7379MB [2024-08-26 09:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[82/300][700/1251] eta 0:02:16 lr 0.000883 wd 0.0500 time 0.2317 (0.2474) data time 0.0012 (0.0017) model time 0.2306 (0.2460) loss 2.9865 (3.4306) grad_norm 1.8800 (2.0210) loss_scale 4096.0000 (2641.0728) mem 7379MB [2024-08-26 09:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][710/1251] eta 0:02:13 lr 0.000883 wd 0.0500 time 0.2449 (0.2474) data time 0.0011 (0.0017) model time 0.2438 (0.2459) loss 3.2477 (3.4305) grad_norm 1.7323 (2.0167) loss_scale 4096.0000 (2661.5359) mem 7379MB [2024-08-26 09:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][720/1251] eta 0:02:11 lr 0.000883 wd 0.0500 time 0.2395 (0.2473) data time 0.0007 (0.0017) model time 0.2388 (0.2458) loss 3.1802 (3.4236) grad_norm 1.5751 (2.0162) loss_scale 4096.0000 (2681.4313) mem 7379MB [2024-08-26 09:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][730/1251] eta 0:02:08 lr 0.000883 wd 0.0500 time 0.2430 (0.2472) data time 0.0007 (0.0017) model time 0.2423 (0.2457) loss 3.4040 (3.4264) grad_norm 3.1084 (2.0185) loss_scale 4096.0000 (2700.7825) mem 7379MB [2024-08-26 09:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][740/1251] eta 0:02:06 lr 0.000883 wd 0.0500 time 0.2550 (0.2471) data time 0.0010 (0.0017) model time 0.2540 (0.2457) loss 2.4163 (3.4263) grad_norm 1.8246 (2.0167) loss_scale 4096.0000 (2719.6113) mem 7379MB [2024-08-26 09:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][750/1251] eta 0:02:03 lr 0.000883 wd 0.0500 time 0.2346 (0.2471) data time 0.0010 (0.0017) model time 0.2336 (0.2456) loss 3.6427 (3.4283) grad_norm 2.0814 (2.0126) loss_scale 4096.0000 (2737.9387) mem 7379MB [2024-08-26 09:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][760/1251] eta 0:02:01 lr 0.000883 wd 0.0500 time 0.2368 (0.2469) data time 0.0008 (0.0017) model time 0.2360 (0.2455) loss 2.5983 (3.4257) grad_norm 1.8265 (inf) loss_scale 2048.0000 (2728.8725) 
mem 7379MB [2024-08-26 09:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][770/1251] eta 0:01:58 lr 0.000883 wd 0.0500 time 0.2458 (0.2469) data time 0.0011 (0.0017) model time 0.2447 (0.2454) loss 4.0767 (3.4278) grad_norm 2.1223 (inf) loss_scale 2048.0000 (2720.0415) mem 7379MB [2024-08-26 09:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][780/1251] eta 0:01:56 lr 0.000883 wd 0.0500 time 0.4106 (0.2470) data time 0.0011 (0.0017) model time 0.4095 (0.2456) loss 3.6088 (3.4284) grad_norm 1.8635 (inf) loss_scale 2048.0000 (2711.4366) mem 7379MB [2024-08-26 09:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][790/1251] eta 0:01:53 lr 0.000883 wd 0.0500 time 0.2373 (0.2470) data time 0.0011 (0.0016) model time 0.2362 (0.2455) loss 3.6999 (3.4311) grad_norm 1.4785 (inf) loss_scale 2048.0000 (2703.0493) mem 7379MB [2024-08-26 09:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][800/1251] eta 0:01:51 lr 0.000883 wd 0.0500 time 0.2352 (0.2469) data time 0.0010 (0.0016) model time 0.2342 (0.2454) loss 3.0403 (3.4285) grad_norm 1.6325 (inf) loss_scale 2048.0000 (2694.8714) mem 7379MB [2024-08-26 09:21:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][810/1251] eta 0:01:48 lr 0.000883 wd 0.0500 time 0.2366 (0.2468) data time 0.0012 (0.0016) model time 0.2354 (0.2454) loss 3.6611 (3.4317) grad_norm 1.9597 (inf) loss_scale 2048.0000 (2686.8952) mem 7379MB [2024-08-26 09:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][820/1251] eta 0:01:46 lr 0.000883 wd 0.0500 time 0.2450 (0.2468) data time 0.0011 (0.0016) model time 0.2439 (0.2453) loss 3.7624 (3.4316) grad_norm 1.4778 (inf) loss_scale 2048.0000 (2679.1133) mem 7379MB [2024-08-26 09:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][830/1251] eta 0:01:43 lr 0.000883 wd 0.0500 time 0.2424 (0.2467) data time 0.0009 (0.0016) model time 0.2415 (0.2453) loss 
3.7779 (3.4314) grad_norm 2.5440 (inf) loss_scale 2048.0000 (2671.5187) mem 7379MB [2024-08-26 09:22:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][840/1251] eta 0:01:41 lr 0.000883 wd 0.0500 time 0.2353 (0.2466) data time 0.0008 (0.0016) model time 0.2345 (0.2452) loss 4.0780 (3.4292) grad_norm 1.4663 (inf) loss_scale 2048.0000 (2664.1046) mem 7379MB [2024-08-26 09:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][850/1251] eta 0:01:38 lr 0.000883 wd 0.0500 time 0.2492 (0.2466) data time 0.0007 (0.0016) model time 0.2484 (0.2452) loss 3.9907 (3.4258) grad_norm 2.4082 (inf) loss_scale 2048.0000 (2656.8649) mem 7379MB [2024-08-26 09:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][860/1251] eta 0:01:36 lr 0.000883 wd 0.0500 time 0.2416 (0.2466) data time 0.0010 (0.0016) model time 0.2406 (0.2451) loss 3.1658 (3.4235) grad_norm 2.4519 (inf) loss_scale 2048.0000 (2649.7933) mem 7379MB [2024-08-26 09:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][870/1251] eta 0:01:33 lr 0.000882 wd 0.0500 time 0.2425 (0.2465) data time 0.0009 (0.0016) model time 0.2416 (0.2451) loss 3.3539 (3.4250) grad_norm 1.8462 (inf) loss_scale 2048.0000 (2642.8840) mem 7379MB [2024-08-26 09:22:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][880/1251] eta 0:01:31 lr 0.000882 wd 0.0500 time 0.2332 (0.2465) data time 0.0011 (0.0016) model time 0.2321 (0.2450) loss 3.7491 (3.4273) grad_norm 1.4362 (inf) loss_scale 2048.0000 (2636.1317) mem 7379MB [2024-08-26 09:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][890/1251] eta 0:01:28 lr 0.000882 wd 0.0500 time 0.2393 (0.2464) data time 0.0009 (0.0016) model time 0.2383 (0.2450) loss 3.4405 (3.4255) grad_norm 2.0445 (inf) loss_scale 2048.0000 (2629.5309) mem 7379MB [2024-08-26 09:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][900/1251] eta 0:01:26 lr 0.000882 wd 0.0500 time 
0.2449 (0.2463) data time 0.0010 (0.0016) model time 0.2439 (0.2449) loss 3.4399 (3.4262) grad_norm 3.2928 (inf) loss_scale 2048.0000 (2623.0766) mem 7379MB [2024-08-26 09:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][910/1251] eta 0:01:23 lr 0.000882 wd 0.0500 time 0.2418 (0.2463) data time 0.0012 (0.0016) model time 0.2406 (0.2449) loss 3.4170 (3.4271) grad_norm 1.6948 (inf) loss_scale 2048.0000 (2616.7640) mem 7379MB [2024-08-26 09:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][920/1251] eta 0:01:21 lr 0.000882 wd 0.0500 time 0.2398 (0.2462) data time 0.0009 (0.0016) model time 0.2389 (0.2448) loss 3.0982 (3.4293) grad_norm 1.8863 (inf) loss_scale 2048.0000 (2610.5885) mem 7379MB [2024-08-26 09:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][930/1251] eta 0:01:19 lr 0.000882 wd 0.0500 time 0.2393 (0.2462) data time 0.0008 (0.0015) model time 0.2385 (0.2448) loss 3.9547 (3.4286) grad_norm 2.2758 (inf) loss_scale 2048.0000 (2604.5456) mem 7379MB [2024-08-26 09:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][940/1251] eta 0:01:16 lr 0.000882 wd 0.0500 time 0.2394 (0.2461) data time 0.0007 (0.0015) model time 0.2387 (0.2447) loss 3.8860 (3.4293) grad_norm 1.6806 (inf) loss_scale 2048.0000 (2598.6312) mem 7379MB [2024-08-26 09:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][950/1251] eta 0:01:14 lr 0.000882 wd 0.0500 time 0.2461 (0.2461) data time 0.0008 (0.0015) model time 0.2453 (0.2447) loss 2.7340 (3.4304) grad_norm 1.5936 (inf) loss_scale 2048.0000 (2592.8412) mem 7379MB [2024-08-26 09:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][960/1251] eta 0:01:11 lr 0.000882 wd 0.0500 time 0.2368 (0.2460) data time 0.0012 (0.0015) model time 0.2356 (0.2446) loss 3.3041 (3.4312) grad_norm 1.9576 (inf) loss_scale 2048.0000 (2587.1717) mem 7379MB [2024-08-26 09:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [82/300][970/1251] eta 0:01:09 lr 0.000882 wd 0.0500 time 0.2439 (0.2460) data time 0.0011 (0.0015) model time 0.2428 (0.2446) loss 3.6261 (3.4340) grad_norm 2.1372 (inf) loss_scale 2048.0000 (2581.6189) mem 7379MB [2024-08-26 09:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][980/1251] eta 0:01:06 lr 0.000882 wd 0.0500 time 0.2415 (0.2462) data time 0.0007 (0.0015) model time 0.2409 (0.2448) loss 4.2955 (3.4373) grad_norm 2.4683 (inf) loss_scale 2048.0000 (2576.1794) mem 7379MB [2024-08-26 09:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][990/1251] eta 0:01:04 lr 0.000882 wd 0.0500 time 0.2468 (0.2462) data time 0.0007 (0.0015) model time 0.2460 (0.2448) loss 4.1907 (3.4407) grad_norm 1.5197 (inf) loss_scale 2048.0000 (2570.8496) mem 7379MB [2024-08-26 09:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1000/1251] eta 0:01:01 lr 0.000882 wd 0.0500 time 0.2391 (0.2461) data time 0.0009 (0.0015) model time 0.2382 (0.2447) loss 3.4055 (3.4432) grad_norm 2.0134 (inf) loss_scale 2048.0000 (2565.6264) mem 7379MB [2024-08-26 09:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1010/1251] eta 0:00:59 lr 0.000882 wd 0.0500 time 0.2412 (0.2461) data time 0.0009 (0.0015) model time 0.2404 (0.2447) loss 2.7640 (3.4409) grad_norm 1.6319 (inf) loss_scale 2048.0000 (2560.5064) mem 7379MB [2024-08-26 09:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1020/1251] eta 0:00:56 lr 0.000882 wd 0.0500 time 0.2521 (0.2461) data time 0.0010 (0.0015) model time 0.2511 (0.2447) loss 3.4448 (3.4410) grad_norm 2.0883 (inf) loss_scale 2048.0000 (2555.4868) mem 7379MB [2024-08-26 09:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1030/1251] eta 0:00:54 lr 0.000882 wd 0.0500 time 0.2445 (0.2464) data time 0.0007 (0.0015) model time 0.2438 (0.2450) loss 4.1793 (3.4435) grad_norm 1.5707 (inf) loss_scale 2048.0000 
(2550.5645) mem 7379MB [2024-08-26 09:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1040/1251] eta 0:00:51 lr 0.000882 wd 0.0500 time 0.2292 (0.2464) data time 0.0011 (0.0015) model time 0.2281 (0.2450) loss 3.4061 (3.4414) grad_norm 2.0624 (inf) loss_scale 2048.0000 (2545.7368) mem 7379MB [2024-08-26 09:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1050/1251] eta 0:00:49 lr 0.000882 wd 0.0500 time 0.2452 (0.2463) data time 0.0011 (0.0015) model time 0.2441 (0.2449) loss 3.6930 (3.4413) grad_norm 1.4551 (inf) loss_scale 2048.0000 (2541.0010) mem 7379MB [2024-08-26 09:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1060/1251] eta 0:00:47 lr 0.000882 wd 0.0500 time 0.2462 (0.2463) data time 0.0007 (0.0015) model time 0.2455 (0.2449) loss 4.0845 (3.4430) grad_norm 1.9728 (inf) loss_scale 2048.0000 (2536.3544) mem 7379MB [2024-08-26 09:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1070/1251] eta 0:00:44 lr 0.000882 wd 0.0500 time 0.2437 (0.2463) data time 0.0008 (0.0015) model time 0.2430 (0.2449) loss 4.3602 (3.4440) grad_norm 2.8250 (inf) loss_scale 2048.0000 (2531.7946) mem 7379MB [2024-08-26 09:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1080/1251] eta 0:00:42 lr 0.000882 wd 0.0500 time 0.2391 (0.2462) data time 0.0007 (0.0015) model time 0.2384 (0.2448) loss 2.6360 (3.4430) grad_norm 1.8033 (inf) loss_scale 2048.0000 (2527.3191) mem 7379MB [2024-08-26 09:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1090/1251] eta 0:00:39 lr 0.000882 wd 0.0500 time 0.2405 (0.2462) data time 0.0009 (0.0015) model time 0.2396 (0.2448) loss 3.5077 (3.4432) grad_norm 1.6660 (inf) loss_scale 2048.0000 (2522.9258) mem 7379MB [2024-08-26 09:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1100/1251] eta 0:00:37 lr 0.000882 wd 0.0500 time 0.2415 (0.2461) data time 0.0010 (0.0015) model time 
0.2406 (0.2448) loss 3.4389 (3.4446) grad_norm 1.5826 (inf) loss_scale 2048.0000 (2518.6122) mem 7379MB [2024-08-26 09:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1110/1251] eta 0:00:34 lr 0.000882 wd 0.0500 time 0.2406 (0.2461) data time 0.0009 (0.0015) model time 0.2397 (0.2447) loss 3.8775 (3.4490) grad_norm 2.0705 (inf) loss_scale 2048.0000 (2514.3762) mem 7379MB [2024-08-26 09:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1120/1251] eta 0:00:32 lr 0.000882 wd 0.0500 time 0.2457 (0.2462) data time 0.0008 (0.0015) model time 0.2449 (0.2448) loss 3.2015 (3.4494) grad_norm 2.6504 (inf) loss_scale 2048.0000 (2510.2159) mem 7379MB [2024-08-26 09:23:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1130/1251] eta 0:00:29 lr 0.000882 wd 0.0500 time 0.2402 (0.2461) data time 0.0010 (0.0014) model time 0.2392 (0.2448) loss 3.3982 (3.4469) grad_norm 2.7391 (inf) loss_scale 2048.0000 (2506.1291) mem 7379MB [2024-08-26 09:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1140/1251] eta 0:00:27 lr 0.000882 wd 0.0500 time 0.2417 (0.2461) data time 0.0007 (0.0014) model time 0.2410 (0.2448) loss 3.6461 (3.4472) grad_norm 1.5673 (inf) loss_scale 2048.0000 (2502.1139) mem 7379MB [2024-08-26 09:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1150/1251] eta 0:00:24 lr 0.000882 wd 0.0500 time 0.2454 (0.2461) data time 0.0010 (0.0014) model time 0.2445 (0.2447) loss 3.2273 (3.4458) grad_norm 1.7320 (inf) loss_scale 2048.0000 (2498.1685) mem 7379MB [2024-08-26 09:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1160/1251] eta 0:00:22 lr 0.000882 wd 0.0500 time 0.2418 (0.2460) data time 0.0009 (0.0014) model time 0.2409 (0.2447) loss 4.2721 (3.4448) grad_norm 3.2916 (inf) loss_scale 2048.0000 (2494.2911) mem 7379MB [2024-08-26 09:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1170/1251] eta 0:00:19 
lr 0.000882 wd 0.0500 time 0.2376 (0.2460) data time 0.0012 (0.0014) model time 0.2364 (0.2446) loss 3.5993 (3.4467) grad_norm 2.4412 (inf) loss_scale 2048.0000 (2490.4799) mem 7379MB [2024-08-26 09:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1180/1251] eta 0:00:17 lr 0.000882 wd 0.0500 time 0.2382 (0.2459) data time 0.0011 (0.0014) model time 0.2371 (0.2446) loss 2.4302 (3.4474) grad_norm 2.2370 (inf) loss_scale 2048.0000 (2486.7333) mem 7379MB [2024-08-26 09:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1190/1251] eta 0:00:15 lr 0.000882 wd 0.0500 time 0.2392 (0.2462) data time 0.0010 (0.0014) model time 0.2383 (0.2449) loss 4.0831 (3.4493) grad_norm 2.5120 (inf) loss_scale 2048.0000 (2483.0495) mem 7379MB [2024-08-26 09:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1200/1251] eta 0:00:12 lr 0.000882 wd 0.0500 time 0.2395 (0.2462) data time 0.0011 (0.0014) model time 0.2384 (0.2449) loss 3.3949 (3.4505) grad_norm 2.3852 (inf) loss_scale 2048.0000 (2479.4271) mem 7379MB [2024-08-26 09:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1210/1251] eta 0:00:10 lr 0.000882 wd 0.0500 time 0.2418 (0.2462) data time 0.0012 (0.0014) model time 0.2407 (0.2449) loss 3.8065 (3.4506) grad_norm 1.8855 (inf) loss_scale 2048.0000 (2475.8646) mem 7379MB [2024-08-26 09:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1220/1251] eta 0:00:07 lr 0.000881 wd 0.0500 time 0.2444 (0.2461) data time 0.0009 (0.0014) model time 0.2435 (0.2448) loss 3.8422 (3.4492) grad_norm 2.0854 (inf) loss_scale 2048.0000 (2472.3604) mem 7379MB [2024-08-26 09:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1230/1251] eta 0:00:05 lr 0.000881 wd 0.0500 time 0.2382 (0.2461) data time 0.0007 (0.0014) model time 0.2375 (0.2448) loss 3.9231 (3.4513) grad_norm 2.2413 (inf) loss_scale 2048.0000 (2468.9131) mem 7379MB [2024-08-26 09:23:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1240/1251] eta 0:00:02 lr 0.000881 wd 0.0500 time 0.2292 (0.2460) data time 0.0007 (0.0014) model time 0.2285 (0.2447) loss 3.3433 (3.4507) grad_norm 2.2940 (inf) loss_scale 2048.0000 (2465.5214) mem 7379MB [2024-08-26 09:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [82/300][1250/1251] eta 0:00:00 lr 0.000881 wd 0.0500 time 0.2312 (0.2459) data time 0.0007 (0.0014) model time 0.2305 (0.2445) loss 3.5225 (3.4489) grad_norm 1.4599 (inf) loss_scale 2048.0000 (2462.1839) mem 7379MB [2024-08-26 09:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 82 training takes 0:05:07 [2024-08-26 09:23:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:23:47 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.473 (0.473) Loss 0.5625 (0.5625) Acc@1 90.137 (90.137) Acc@5 97.656 (97.656) Mem 7379MB [2024-08-26 09:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.114) Loss 0.8662 (0.8255) Acc@1 81.055 (81.596) Acc@5 95.410 (96.094) Mem 7379MB [2024-08-26 09:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.098) Loss 1.0508 (0.8394) Acc@1 74.902 (80.915) Acc@5 93.066 (96.043) Mem 7379MB [2024-08-26 09:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.091) Loss 1.4658 (0.9579) Acc@1 65.039 (78.308) Acc@5 88.672 (94.525) Mem 7379MB [2024-08-26 09:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.3848 (1.0299) Acc@1 67.383 (76.529) Acc@5 90.723 (93.667) Mem 7379MB [2024-08-26 09:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.204 Acc@5 93.598 
[2024-08-26 09:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.2% [2024-08-26 09:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.833 (0.833) Loss 0.4580 (0.4580) Acc@1 91.992 (91.992) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 09:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.150) Loss 0.7344 (0.7114) Acc@1 85.352 (84.739) Acc@5 96.191 (96.884) Mem 7379MB [2024-08-26 09:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.116) Loss 1.0146 (0.7346) Acc@1 75.879 (83.608) Acc@5 94.141 (96.866) Mem 7379MB [2024-08-26 09:23:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.103) Loss 1.3057 (0.8395) Acc@1 66.797 (81.143) Acc@5 90.625 (95.609) Mem 7379MB [2024-08-26 09:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.1709 (0.8954) Acc@1 71.387 (79.680) Acc@5 92.090 (95.010) Mem 7379MB [2024-08-26 09:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.258 Acc@5 94.960 [2024-08-26 09:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.3% [2024-08-26 09:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.26% [2024-08-26 09:23:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:23:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 09:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][0/1251] eta 0:13:17 lr 0.000881 wd 0.0500 time 0.6374 (0.6374) data time 0.4128 (0.4128) model time 0.0000 (0.0000) loss 4.3862 (4.3862) grad_norm 1.6835 (1.6835) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][10/1251] eta 0:05:47 lr 0.000881 wd 0.0500 time 0.2415 (0.2796) data time 0.0008 (0.0385) model time 0.0000 (0.0000) loss 4.0028 (3.7813) grad_norm 2.1251 (2.0889) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][20/1251] eta 0:05:22 lr 0.000881 wd 0.0500 time 0.2441 (0.2616) data time 0.0007 (0.0206) model time 0.0000 (0.0000) loss 3.0871 (3.5376) grad_norm 1.5046 (2.2626) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][30/1251] eta 0:05:11 lr 0.000881 wd 0.0500 time 0.2512 (0.2550) data time 0.0011 (0.0143) model time 0.0000 (0.0000) loss 3.7006 (3.5023) grad_norm 2.6089 (2.1836) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][40/1251] eta 0:05:04 lr 0.000881 wd 0.0500 time 0.2314 (0.2516) data time 0.0011 (0.0115) model time 0.0000 (0.0000) loss 3.3960 (3.4489) grad_norm 3.3140 (2.1794) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][50/1251] eta 0:05:00 lr 0.000881 wd 0.0500 time 0.2415 (0.2499) data time 0.0011 (0.0094) model time 0.0000 (0.0000) loss 3.7531 (3.4043) grad_norm 1.5044 (2.1737) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][60/1251] eta 0:04:59 lr 0.000881 wd 0.0500 time 0.2404 (0.2517) data time 0.0011 (0.0081) model time 0.2392 (0.2595) loss 
3.4168 (3.4588) grad_norm 1.7824 (2.1034) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][70/1251] eta 0:04:55 lr 0.000881 wd 0.0500 time 0.2336 (0.2502) data time 0.0008 (0.0071) model time 0.2328 (0.2500) loss 3.5286 (3.4644) grad_norm 1.6115 (2.0555) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][80/1251] eta 0:04:51 lr 0.000881 wd 0.0500 time 0.2399 (0.2493) data time 0.0009 (0.0063) model time 0.2391 (0.2473) loss 2.8592 (3.4396) grad_norm 1.7964 (2.0185) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][90/1251] eta 0:04:48 lr 0.000881 wd 0.0500 time 0.2507 (0.2486) data time 0.0009 (0.0057) model time 0.2498 (0.2458) loss 4.0412 (3.4354) grad_norm 1.8651 (2.0013) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][100/1251] eta 0:04:45 lr 0.000881 wd 0.0500 time 0.2455 (0.2480) data time 0.0010 (0.0053) model time 0.2445 (0.2449) loss 3.9364 (3.4216) grad_norm 1.5286 (1.9915) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][110/1251] eta 0:04:42 lr 0.000881 wd 0.0500 time 0.2417 (0.2475) data time 0.0009 (0.0049) model time 0.2408 (0.2443) loss 3.8825 (3.4236) grad_norm 1.6644 (2.0024) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][120/1251] eta 0:04:39 lr 0.000881 wd 0.0500 time 0.2476 (0.2471) data time 0.0007 (0.0046) model time 0.2469 (0.2439) loss 4.1047 (3.4341) grad_norm 2.7306 (1.9955) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][130/1251] eta 0:04:38 lr 
0.000881 wd 0.0500 time 0.2406 (0.2480) data time 0.0013 (0.0043) model time 0.2394 (0.2457) loss 4.0231 (3.4263) grad_norm 1.6152 (2.0018) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][140/1251] eta 0:04:35 lr 0.000881 wd 0.0500 time 0.2482 (0.2476) data time 0.0007 (0.0041) model time 0.2475 (0.2452) loss 3.8053 (3.4184) grad_norm 2.9608 (2.0131) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][150/1251] eta 0:04:32 lr 0.000881 wd 0.0500 time 0.2457 (0.2472) data time 0.0008 (0.0039) model time 0.2450 (0.2447) loss 2.8055 (3.4067) grad_norm 1.8828 (2.0093) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][160/1251] eta 0:04:29 lr 0.000881 wd 0.0500 time 0.2414 (0.2470) data time 0.0010 (0.0038) model time 0.2404 (0.2445) loss 3.2507 (3.4000) grad_norm 1.7887 (1.9988) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][170/1251] eta 0:04:26 lr 0.000881 wd 0.0500 time 0.2429 (0.2466) data time 0.0011 (0.0036) model time 0.2418 (0.2441) loss 3.5892 (3.3878) grad_norm 2.6051 (2.0062) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][180/1251] eta 0:04:23 lr 0.000881 wd 0.0500 time 0.2372 (0.2463) data time 0.0009 (0.0034) model time 0.2363 (0.2438) loss 3.0145 (3.3975) grad_norm 1.7832 (2.0144) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][190/1251] eta 0:04:22 lr 0.000881 wd 0.0500 time 0.2433 (0.2472) data time 0.0007 (0.0033) model time 0.2426 (0.2451) loss 3.2312 (3.4007) grad_norm 2.1269 (2.0215) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:46 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][200/1251] eta 0:04:21 lr 0.000881 wd 0.0500 time 0.2392 (0.2491) data time 0.0009 (0.0032) model time 0.2383 (0.2477) loss 3.3607 (3.3926) grad_norm 2.3900 (2.0229) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][210/1251] eta 0:04:19 lr 0.000881 wd 0.0500 time 0.2509 (0.2497) data time 0.0011 (0.0031) model time 0.2498 (0.2485) loss 3.7994 (3.3944) grad_norm 1.3527 (2.0213) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][220/1251] eta 0:04:16 lr 0.000881 wd 0.0500 time 0.2394 (0.2492) data time 0.0007 (0.0030) model time 0.2387 (0.2479) loss 3.2098 (3.3879) grad_norm 1.4531 (2.0129) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][230/1251] eta 0:04:14 lr 0.000881 wd 0.0500 time 0.2427 (0.2489) data time 0.0011 (0.0029) model time 0.2416 (0.2475) loss 3.7828 (3.3933) grad_norm 1.8263 (2.0181) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][240/1251] eta 0:04:11 lr 0.000881 wd 0.0500 time 0.2429 (0.2486) data time 0.0009 (0.0028) model time 0.2419 (0.2471) loss 3.6030 (3.4065) grad_norm 2.6041 (2.0226) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:24:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][250/1251] eta 0:04:08 lr 0.000881 wd 0.0500 time 0.2475 (0.2483) data time 0.0010 (0.0028) model time 0.2466 (0.2468) loss 3.5206 (3.4058) grad_norm 1.8338 (2.0276) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][260/1251] eta 0:04:05 lr 0.000881 wd 0.0500 time 0.2382 (0.2481) data time 0.0010 (0.0027) model time 0.2372 (0.2466) loss 3.3894 
(3.4011) grad_norm 1.4557 (2.0338) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][270/1251] eta 0:04:03 lr 0.000881 wd 0.0500 time 0.2503 (0.2479) data time 0.0010 (0.0027) model time 0.2493 (0.2464) loss 2.4573 (3.3948) grad_norm 1.6908 (2.0290) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][280/1251] eta 0:04:00 lr 0.000881 wd 0.0500 time 0.2412 (0.2476) data time 0.0009 (0.0026) model time 0.2403 (0.2461) loss 2.9181 (3.3969) grad_norm 2.1204 (2.0530) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][290/1251] eta 0:03:57 lr 0.000881 wd 0.0500 time 0.2382 (0.2474) data time 0.0011 (0.0025) model time 0.2371 (0.2458) loss 3.4532 (3.3949) grad_norm 2.5209 (2.0533) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][300/1251] eta 0:03:55 lr 0.000881 wd 0.0500 time 0.2540 (0.2473) data time 0.0007 (0.0025) model time 0.2533 (0.2457) loss 4.0842 (3.4067) grad_norm 2.1232 (2.0492) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][310/1251] eta 0:03:52 lr 0.000881 wd 0.0500 time 0.2530 (0.2473) data time 0.0011 (0.0024) model time 0.2519 (0.2457) loss 2.7708 (3.4148) grad_norm 2.1147 (2.0481) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][320/1251] eta 0:03:50 lr 0.000880 wd 0.0500 time 0.2489 (0.2471) data time 0.0010 (0.0024) model time 0.2480 (0.2455) loss 3.3409 (3.4070) grad_norm 1.9311 (2.0511) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][330/1251] eta 0:03:47 lr 0.000880 wd 
0.0500 time 0.2467 (0.2470) data time 0.0011 (0.0024) model time 0.2457 (0.2454) loss 4.1011 (3.4066) grad_norm 2.2447 (2.0556) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][340/1251] eta 0:03:44 lr 0.000880 wd 0.0500 time 0.2425 (0.2469) data time 0.0007 (0.0023) model time 0.2418 (0.2453) loss 4.0497 (3.4062) grad_norm 2.0061 (2.0530) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][350/1251] eta 0:03:42 lr 0.000880 wd 0.0500 time 0.2451 (0.2467) data time 0.0012 (0.0023) model time 0.2439 (0.2451) loss 3.3835 (3.4116) grad_norm 1.7219 (2.0449) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][360/1251] eta 0:03:39 lr 0.000880 wd 0.0500 time 0.2430 (0.2466) data time 0.0007 (0.0023) model time 0.2423 (0.2450) loss 2.6145 (3.4056) grad_norm 1.6993 (2.0403) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][370/1251] eta 0:03:37 lr 0.000880 wd 0.0500 time 0.2353 (0.2465) data time 0.0007 (0.0022) model time 0.2346 (0.2448) loss 2.9540 (3.4067) grad_norm 1.9579 (2.0353) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][380/1251] eta 0:03:34 lr 0.000880 wd 0.0500 time 0.2381 (0.2463) data time 0.0007 (0.0022) model time 0.2374 (0.2447) loss 2.5476 (3.4098) grad_norm 1.9314 (2.0336) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][390/1251] eta 0:03:32 lr 0.000880 wd 0.0500 time 0.2390 (0.2465) data time 0.0011 (0.0022) model time 0.2379 (0.2450) loss 3.5531 (3.4108) grad_norm 1.6007 (2.0326) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][400/1251] eta 0:03:30 lr 0.000880 wd 0.0500 time 0.2358 (0.2470) data time 0.0011 (0.0021) model time 0.2347 (0.2455) loss 3.4060 (3.4095) grad_norm 2.3344 (2.0324) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][410/1251] eta 0:03:27 lr 0.000880 wd 0.0500 time 0.2400 (0.2468) data time 0.0008 (0.0021) model time 0.2392 (0.2453) loss 2.6851 (3.4045) grad_norm 1.7109 (2.0265) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][420/1251] eta 0:03:24 lr 0.000880 wd 0.0500 time 0.2417 (0.2467) data time 0.0010 (0.0021) model time 0.2408 (0.2451) loss 3.7112 (3.4078) grad_norm 3.0707 (2.0308) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][430/1251] eta 0:03:22 lr 0.000880 wd 0.0500 time 0.2374 (0.2466) data time 0.0010 (0.0021) model time 0.2364 (0.2451) loss 3.7406 (3.4116) grad_norm 2.5593 (2.0311) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][440/1251] eta 0:03:19 lr 0.000880 wd 0.0500 time 0.2320 (0.2464) data time 0.0012 (0.0020) model time 0.2309 (0.2449) loss 3.3755 (3.4144) grad_norm 2.0639 (2.0337) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][450/1251] eta 0:03:17 lr 0.000880 wd 0.0500 time 0.2376 (0.2463) data time 0.0009 (0.0020) model time 0.2366 (0.2447) loss 2.2206 (3.4155) grad_norm 4.1278 (2.0395) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][460/1251] eta 0:03:14 lr 0.000880 wd 0.0500 time 0.2454 (0.2462) data time 0.0008 (0.0020) model time 0.2446 (0.2447) loss 3.9662 
(3.4221) grad_norm 1.5176 (2.0374) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][470/1251] eta 0:03:12 lr 0.000880 wd 0.0500 time 0.2459 (0.2462) data time 0.0009 (0.0020) model time 0.2450 (0.2446) loss 3.0706 (3.4248) grad_norm 1.7642 (2.0388) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][480/1251] eta 0:03:09 lr 0.000880 wd 0.0500 time 0.2456 (0.2461) data time 0.0010 (0.0020) model time 0.2446 (0.2446) loss 3.9393 (3.4276) grad_norm 2.2623 (2.0457) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][490/1251] eta 0:03:07 lr 0.000880 wd 0.0500 time 0.2468 (0.2460) data time 0.0009 (0.0019) model time 0.2459 (0.2445) loss 3.7953 (3.4273) grad_norm 1.5222 (2.0454) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:25:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][500/1251] eta 0:03:04 lr 0.000880 wd 0.0500 time 0.2373 (0.2459) data time 0.0010 (0.0019) model time 0.2363 (0.2444) loss 2.5980 (3.4255) grad_norm 1.7904 (2.0501) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][510/1251] eta 0:03:02 lr 0.000880 wd 0.0500 time 0.2342 (0.2458) data time 0.0012 (0.0019) model time 0.2330 (0.2443) loss 4.0888 (3.4277) grad_norm 2.7222 (2.0483) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][520/1251] eta 0:02:59 lr 0.000880 wd 0.0500 time 0.2378 (0.2457) data time 0.0012 (0.0019) model time 0.2367 (0.2442) loss 3.6265 (3.4300) grad_norm 1.6679 (2.0451) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][530/1251] eta 0:02:57 lr 0.000880 wd 
0.0500 time 0.2388 (0.2456) data time 0.0009 (0.0019) model time 0.2379 (0.2441) loss 3.6395 (3.4237) grad_norm 2.0848 (inf) loss_scale 1024.0000 (2042.2147) mem 7379MB [2024-08-26 09:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][540/1251] eta 0:02:54 lr 0.000880 wd 0.0500 time 0.2407 (0.2459) data time 0.0010 (0.0019) model time 0.2396 (0.2444) loss 3.1402 (3.4218) grad_norm 2.2913 (inf) loss_scale 1024.0000 (2023.3937) mem 7379MB [2024-08-26 09:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][550/1251] eta 0:02:52 lr 0.000880 wd 0.0500 time 0.2381 (0.2458) data time 0.0007 (0.0018) model time 0.2373 (0.2443) loss 3.4963 (3.4254) grad_norm 2.0474 (inf) loss_scale 1024.0000 (2005.2559) mem 7379MB [2024-08-26 09:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][560/1251] eta 0:02:49 lr 0.000880 wd 0.0500 time 0.2368 (0.2457) data time 0.0011 (0.0018) model time 0.2357 (0.2442) loss 3.6667 (3.4213) grad_norm 2.9916 (inf) loss_scale 1024.0000 (1987.7647) mem 7379MB [2024-08-26 09:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][570/1251] eta 0:02:47 lr 0.000880 wd 0.0500 time 0.2295 (0.2457) data time 0.0012 (0.0018) model time 0.2283 (0.2442) loss 4.3662 (3.4270) grad_norm 3.7531 (inf) loss_scale 1024.0000 (1970.8862) mem 7379MB [2024-08-26 09:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][580/1251] eta 0:02:45 lr 0.000880 wd 0.0500 time 0.2333 (0.2459) data time 0.0008 (0.0018) model time 0.2325 (0.2445) loss 3.2466 (3.4253) grad_norm 1.4345 (inf) loss_scale 1024.0000 (1954.5886) mem 7379MB [2024-08-26 09:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][590/1251] eta 0:02:42 lr 0.000880 wd 0.0500 time 0.2456 (0.2458) data time 0.0007 (0.0018) model time 0.2449 (0.2444) loss 2.6779 (3.4226) grad_norm 1.5755 (inf) loss_scale 1024.0000 (1938.8426) mem 7379MB [2024-08-26 09:26:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [83/300][600/1251] eta 0:02:39 lr 0.000880 wd 0.0500 time 0.2456 (0.2458) data time 0.0007 (0.0018) model time 0.2448 (0.2443) loss 4.1189 (3.4200) grad_norm 2.0255 (inf) loss_scale 1024.0000 (1923.6206) mem 7379MB [2024-08-26 09:26:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][610/1251] eta 0:02:37 lr 0.000880 wd 0.0500 time 0.2431 (0.2457) data time 0.0009 (0.0018) model time 0.2422 (0.2442) loss 3.4212 (3.4220) grad_norm 1.7066 (inf) loss_scale 1024.0000 (1908.8969) mem 7379MB [2024-08-26 09:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][620/1251] eta 0:02:34 lr 0.000880 wd 0.0500 time 0.2411 (0.2456) data time 0.0009 (0.0018) model time 0.2402 (0.2441) loss 4.2728 (3.4248) grad_norm 1.7729 (inf) loss_scale 1024.0000 (1894.6473) mem 7379MB [2024-08-26 09:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][630/1251] eta 0:02:32 lr 0.000880 wd 0.0500 time 0.2400 (0.2455) data time 0.0009 (0.0017) model time 0.2391 (0.2441) loss 3.5275 (3.4246) grad_norm 1.9032 (inf) loss_scale 1024.0000 (1880.8494) mem 7379MB [2024-08-26 09:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][640/1251] eta 0:02:29 lr 0.000880 wd 0.0500 time 0.2371 (0.2455) data time 0.0008 (0.0017) model time 0.2363 (0.2440) loss 2.4811 (3.4215) grad_norm 2.2933 (inf) loss_scale 1024.0000 (1867.4821) mem 7379MB [2024-08-26 09:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][650/1251] eta 0:02:27 lr 0.000880 wd 0.0500 time 0.2393 (0.2453) data time 0.0009 (0.0017) model time 0.2384 (0.2439) loss 4.3700 (3.4225) grad_norm 1.8874 (inf) loss_scale 1024.0000 (1854.5253) mem 7379MB [2024-08-26 09:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][660/1251] eta 0:02:24 lr 0.000879 wd 0.0500 time 0.2365 (0.2453) data time 0.0011 (0.0017) model time 0.2354 (0.2438) loss 2.8931 (3.4227) grad_norm 2.0483 (inf) loss_scale 
1024.0000 (1841.9607) mem 7379MB [2024-08-26 09:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][670/1251] eta 0:02:22 lr 0.000879 wd 0.0500 time 0.2406 (0.2452) data time 0.0010 (0.0017) model time 0.2396 (0.2438) loss 3.4651 (3.4180) grad_norm 2.3100 (inf) loss_scale 1024.0000 (1829.7705) mem 7379MB [2024-08-26 09:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][680/1251] eta 0:02:20 lr 0.000879 wd 0.0500 time 0.2446 (0.2452) data time 0.0010 (0.0017) model time 0.2435 (0.2437) loss 4.0422 (3.4158) grad_norm 1.4024 (inf) loss_scale 1024.0000 (1817.9383) mem 7379MB [2024-08-26 09:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][690/1251] eta 0:02:17 lr 0.000879 wd 0.0500 time 0.2377 (0.2451) data time 0.0008 (0.0017) model time 0.2369 (0.2437) loss 2.2622 (3.4136) grad_norm 2.5789 (inf) loss_scale 1024.0000 (1806.4486) mem 7379MB [2024-08-26 09:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][700/1251] eta 0:02:15 lr 0.000879 wd 0.0500 time 0.2467 (0.2451) data time 0.0013 (0.0017) model time 0.2455 (0.2437) loss 3.6202 (3.4154) grad_norm 2.0648 (inf) loss_scale 1024.0000 (1795.2867) mem 7379MB [2024-08-26 09:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][710/1251] eta 0:02:12 lr 0.000879 wd 0.0500 time 0.2475 (0.2453) data time 0.0010 (0.0017) model time 0.2465 (0.2439) loss 3.6205 (3.4168) grad_norm 2.0705 (inf) loss_scale 1024.0000 (1784.4388) mem 7379MB [2024-08-26 09:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][720/1251] eta 0:02:10 lr 0.000879 wd 0.0500 time 0.2390 (0.2452) data time 0.0010 (0.0017) model time 0.2380 (0.2438) loss 4.3377 (3.4172) grad_norm 1.8180 (inf) loss_scale 1024.0000 (1773.8918) mem 7379MB [2024-08-26 09:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][730/1251] eta 0:02:07 lr 0.000879 wd 0.0500 time 0.2395 (0.2452) data time 0.0009 (0.0017) model 
time 0.2387 (0.2438) loss 2.8110 (3.4188) grad_norm 1.8838 (inf) loss_scale 1024.0000 (1763.6334) mem 7379MB [2024-08-26 09:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][740/1251] eta 0:02:05 lr 0.000879 wd 0.0500 time 0.2388 (0.2452) data time 0.0009 (0.0016) model time 0.2379 (0.2437) loss 3.6130 (3.4200) grad_norm 1.5364 (inf) loss_scale 1024.0000 (1753.6518) mem 7379MB [2024-08-26 09:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][750/1251] eta 0:02:02 lr 0.000879 wd 0.0500 time 0.2409 (0.2451) data time 0.0009 (0.0016) model time 0.2400 (0.2437) loss 4.9819 (3.4247) grad_norm 1.4947 (inf) loss_scale 1024.0000 (1743.9361) mem 7379MB [2024-08-26 09:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][760/1251] eta 0:02:00 lr 0.000879 wd 0.0500 time 0.2356 (0.2451) data time 0.0011 (0.0016) model time 0.2345 (0.2437) loss 3.7650 (3.4276) grad_norm 2.0657 (inf) loss_scale 1024.0000 (1734.4757) mem 7379MB [2024-08-26 09:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][770/1251] eta 0:01:57 lr 0.000879 wd 0.0500 time 0.2481 (0.2451) data time 0.0009 (0.0016) model time 0.2472 (0.2437) loss 3.3640 (3.4269) grad_norm 2.3028 (inf) loss_scale 1024.0000 (1725.2607) mem 7379MB [2024-08-26 09:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][780/1251] eta 0:01:55 lr 0.000879 wd 0.0500 time 0.2385 (0.2451) data time 0.0010 (0.0016) model time 0.2375 (0.2437) loss 2.6806 (3.4283) grad_norm 1.9563 (inf) loss_scale 1024.0000 (1716.2817) mem 7379MB [2024-08-26 09:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][790/1251] eta 0:01:52 lr 0.000879 wd 0.0500 time 0.2390 (0.2451) data time 0.0011 (0.0016) model time 0.2380 (0.2436) loss 3.7304 (3.4302) grad_norm 3.0749 (inf) loss_scale 1024.0000 (1707.5297) mem 7379MB [2024-08-26 09:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][800/1251] eta 0:01:50 lr 
0.000879 wd 0.0500 time 0.2475 (0.2450) data time 0.0010 (0.0016) model time 0.2465 (0.2436) loss 3.2672 (3.4352) grad_norm 2.4578 (inf) loss_scale 1024.0000 (1698.9963) mem 7379MB [2024-08-26 09:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][810/1251] eta 0:01:48 lr 0.000879 wd 0.0500 time 0.2399 (0.2450) data time 0.0011 (0.0016) model time 0.2388 (0.2436) loss 3.3881 (3.4331) grad_norm 1.4081 (inf) loss_scale 1024.0000 (1690.6732) mem 7379MB [2024-08-26 09:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][820/1251] eta 0:01:45 lr 0.000879 wd 0.0500 time 0.2447 (0.2455) data time 0.0011 (0.0016) model time 0.2435 (0.2441) loss 3.4715 (3.4350) grad_norm 1.8604 (inf) loss_scale 1024.0000 (1682.5530) mem 7379MB [2024-08-26 09:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][830/1251] eta 0:01:43 lr 0.000879 wd 0.0500 time 0.2430 (0.2455) data time 0.0011 (0.0016) model time 0.2419 (0.2441) loss 2.1179 (3.4376) grad_norm 2.0620 (inf) loss_scale 1024.0000 (1674.6282) mem 7379MB [2024-08-26 09:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][840/1251] eta 0:01:40 lr 0.000879 wd 0.0500 time 0.2418 (0.2454) data time 0.0011 (0.0016) model time 0.2407 (0.2440) loss 4.0439 (3.4390) grad_norm 1.3122 (inf) loss_scale 1024.0000 (1666.8918) mem 7379MB [2024-08-26 09:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][850/1251] eta 0:01:38 lr 0.000879 wd 0.0500 time 0.2414 (0.2454) data time 0.0008 (0.0016) model time 0.2406 (0.2440) loss 2.4706 (3.4373) grad_norm 1.6211 (inf) loss_scale 1024.0000 (1659.3373) mem 7379MB [2024-08-26 09:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][860/1251] eta 0:01:35 lr 0.000879 wd 0.0500 time 0.2494 (0.2454) data time 0.0008 (0.0016) model time 0.2486 (0.2440) loss 4.4076 (3.4361) grad_norm 1.7034 (inf) loss_scale 1024.0000 (1651.9582) mem 7379MB [2024-08-26 09:27:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [83/300][870/1251] eta 0:01:33 lr 0.000879 wd 0.0500 time 0.2384 (0.2454) data time 0.0009 (0.0016) model time 0.2375 (0.2440) loss 3.8665 (3.4353) grad_norm 2.1950 (inf) loss_scale 1024.0000 (1644.7486) mem 7379MB [2024-08-26 09:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][880/1251] eta 0:01:31 lr 0.000879 wd 0.0500 time 0.2358 (0.2454) data time 0.0009 (0.0016) model time 0.2349 (0.2440) loss 4.0208 (3.4335) grad_norm 2.3558 (inf) loss_scale 1024.0000 (1637.7026) mem 7379MB [2024-08-26 09:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][890/1251] eta 0:01:28 lr 0.000879 wd 0.0500 time 0.2480 (0.2453) data time 0.0007 (0.0016) model time 0.2473 (0.2439) loss 4.0923 (3.4369) grad_norm 2.1166 (inf) loss_scale 1024.0000 (1630.8148) mem 7379MB [2024-08-26 09:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][900/1251] eta 0:01:26 lr 0.000879 wd 0.0500 time 0.2428 (0.2453) data time 0.0010 (0.0016) model time 0.2418 (0.2439) loss 3.7882 (3.4368) grad_norm 1.9054 (inf) loss_scale 1024.0000 (1624.0799) mem 7379MB [2024-08-26 09:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][910/1251] eta 0:01:23 lr 0.000879 wd 0.0500 time 0.2383 (0.2452) data time 0.0007 (0.0016) model time 0.2376 (0.2438) loss 2.6647 (3.4375) grad_norm 2.2980 (inf) loss_scale 1024.0000 (1617.4929) mem 7379MB [2024-08-26 09:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][920/1251] eta 0:01:21 lr 0.000879 wd 0.0500 time 0.2439 (0.2454) data time 0.0007 (0.0016) model time 0.2433 (0.2440) loss 2.7822 (3.4360) grad_norm 1.7105 (inf) loss_scale 1024.0000 (1611.0489) mem 7379MB [2024-08-26 09:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][930/1251] eta 0:01:18 lr 0.000879 wd 0.0500 time 0.2433 (0.2454) data time 0.0009 (0.0016) model time 0.2424 (0.2440) loss 3.4962 (3.4345) grad_norm 1.9592 (inf) loss_scale 
1024.0000 (1604.7433) mem 7379MB [2024-08-26 09:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][940/1251] eta 0:01:16 lr 0.000879 wd 0.0500 time 0.2442 (0.2454) data time 0.0011 (0.0016) model time 0.2431 (0.2440) loss 2.2921 (3.4330) grad_norm 1.8359 (inf) loss_scale 1024.0000 (1598.5717) mem 7379MB [2024-08-26 09:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][950/1251] eta 0:01:13 lr 0.000879 wd 0.0500 time 0.2395 (0.2454) data time 0.0011 (0.0016) model time 0.2384 (0.2439) loss 3.3665 (3.4315) grad_norm 1.8892 (inf) loss_scale 1024.0000 (1592.5300) mem 7379MB [2024-08-26 09:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][960/1251] eta 0:01:11 lr 0.000879 wd 0.0500 time 0.2405 (0.2454) data time 0.0010 (0.0016) model time 0.2394 (0.2439) loss 3.6787 (3.4328) grad_norm 1.7103 (inf) loss_scale 1024.0000 (1586.6139) mem 7379MB [2024-08-26 09:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][970/1251] eta 0:01:08 lr 0.000879 wd 0.0500 time 0.2434 (0.2453) data time 0.0011 (0.0016) model time 0.2423 (0.2439) loss 2.8941 (3.4323) grad_norm 1.5918 (inf) loss_scale 1024.0000 (1580.8198) mem 7379MB [2024-08-26 09:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][980/1251] eta 0:01:06 lr 0.000879 wd 0.0500 time 0.2383 (0.2453) data time 0.0008 (0.0016) model time 0.2375 (0.2438) loss 3.7742 (3.4326) grad_norm 1.7367 (inf) loss_scale 1024.0000 (1575.1437) mem 7379MB [2024-08-26 09:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][990/1251] eta 0:01:04 lr 0.000879 wd 0.0500 time 0.2415 (0.2452) data time 0.0010 (0.0016) model time 0.2405 (0.2438) loss 3.8645 (3.4326) grad_norm 1.8936 (inf) loss_scale 1024.0000 (1569.5822) mem 7379MB [2024-08-26 09:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1000/1251] eta 0:01:01 lr 0.000879 wd 0.0500 time 0.2466 (0.2452) data time 0.0010 (0.0016) model 
time 0.2457 (0.2438) loss 3.5816 (3.4312) grad_norm 3.0692 (inf) loss_scale 1024.0000 (1564.1319) mem 7379MB [2024-08-26 09:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1010/1251] eta 0:00:59 lr 0.000878 wd 0.0500 time 0.2428 (0.2452) data time 0.0010 (0.0016) model time 0.2419 (0.2437) loss 3.6083 (3.4331) grad_norm 2.0589 (inf) loss_scale 1024.0000 (1558.7893) mem 7379MB [2024-08-26 09:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1020/1251] eta 0:00:56 lr 0.000878 wd 0.0500 time 0.2322 (0.2451) data time 0.0009 (0.0016) model time 0.2313 (0.2437) loss 4.0230 (3.4339) grad_norm 1.7825 (inf) loss_scale 1024.0000 (1553.5514) mem 7379MB [2024-08-26 09:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1030/1251] eta 0:00:54 lr 0.000878 wd 0.0500 time 0.2421 (0.2451) data time 0.0009 (0.0016) model time 0.2412 (0.2437) loss 3.3793 (3.4353) grad_norm 2.5557 (inf) loss_scale 1024.0000 (1548.4151) mem 7379MB [2024-08-26 09:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1040/1251] eta 0:00:51 lr 0.000878 wd 0.0500 time 0.2384 (0.2451) data time 0.0009 (0.0016) model time 0.2376 (0.2436) loss 1.8938 (3.4323) grad_norm 2.1516 (inf) loss_scale 1024.0000 (1543.3775) mem 7379MB [2024-08-26 09:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1050/1251] eta 0:00:49 lr 0.000878 wd 0.0500 time 0.2495 (0.2451) data time 0.0010 (0.0016) model time 0.2486 (0.2436) loss 2.7922 (3.4326) grad_norm 1.5199 (inf) loss_scale 1024.0000 (1538.4358) mem 7379MB [2024-08-26 09:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1060/1251] eta 0:00:46 lr 0.000878 wd 0.0500 time 0.2405 (0.2452) data time 0.0007 (0.0016) model time 0.2398 (0.2438) loss 4.2888 (3.4350) grad_norm 1.5709 (inf) loss_scale 1024.0000 (1533.5872) mem 7379MB [2024-08-26 09:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1070/1251] eta 
0:00:44 lr 0.000878 wd 0.0500 time 0.2484 (0.2452) data time 0.0010 (0.0016) model time 0.2474 (0.2438) loss 3.3508 (3.4350) grad_norm 2.4019 (inf) loss_scale 1024.0000 (1528.8291) mem 7379MB [2024-08-26 09:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1080/1251] eta 0:00:41 lr 0.000878 wd 0.0500 time 0.2411 (0.2453) data time 0.0009 (0.0016) model time 0.2402 (0.2439) loss 3.3013 (3.4360) grad_norm 1.4386 (inf) loss_scale 1024.0000 (1524.1591) mem 7379MB [2024-08-26 09:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1090/1251] eta 0:00:39 lr 0.000878 wd 0.0500 time 0.2431 (0.2453) data time 0.0008 (0.0016) model time 0.2423 (0.2439) loss 3.9935 (3.4351) grad_norm 1.6439 (inf) loss_scale 1024.0000 (1519.5747) mem 7379MB [2024-08-26 09:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1100/1251] eta 0:00:37 lr 0.000878 wd 0.0500 time 0.2508 (0.2453) data time 0.0007 (0.0016) model time 0.2501 (0.2439) loss 4.1259 (3.4352) grad_norm 2.2666 (inf) loss_scale 1024.0000 (1515.0736) mem 7379MB [2024-08-26 09:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1110/1251] eta 0:00:34 lr 0.000878 wd 0.0500 time 0.2372 (0.2453) data time 0.0011 (0.0016) model time 0.2361 (0.2438) loss 3.8537 (3.4365) grad_norm 3.0592 (inf) loss_scale 1024.0000 (1510.6535) mem 7379MB [2024-08-26 09:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1120/1251] eta 0:00:32 lr 0.000878 wd 0.0500 time 0.2605 (0.2452) data time 0.0010 (0.0016) model time 0.2595 (0.2438) loss 3.2252 (3.4357) grad_norm 2.0639 (inf) loss_scale 1024.0000 (1506.3122) mem 7379MB [2024-08-26 09:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1130/1251] eta 0:00:29 lr 0.000878 wd 0.0500 time 0.2384 (0.2456) data time 0.0012 (0.0016) model time 0.2373 (0.2442) loss 3.7362 (3.4357) grad_norm 1.8603 (inf) loss_scale 1024.0000 (1502.0477) mem 7379MB [2024-08-26 09:28:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1140/1251] eta 0:00:27 lr 0.000878 wd 0.0500 time 0.2387 (0.2457) data time 0.0010 (0.0016) model time 0.2378 (0.2444) loss 2.6723 (3.4373) grad_norm 2.4298 (inf) loss_scale 1024.0000 (1497.8580) mem 7379MB [2024-08-26 09:28:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1150/1251] eta 0:00:24 lr 0.000878 wd 0.0500 time 0.2407 (0.2457) data time 0.0009 (0.0016) model time 0.2398 (0.2443) loss 3.4354 (3.4410) grad_norm 1.5465 (inf) loss_scale 1024.0000 (1493.7411) mem 7379MB [2024-08-26 09:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1160/1251] eta 0:00:22 lr 0.000878 wd 0.0500 time 0.2407 (0.2457) data time 0.0011 (0.0015) model time 0.2396 (0.2443) loss 2.8387 (3.4399) grad_norm 2.2286 (inf) loss_scale 1024.0000 (1489.6951) mem 7379MB [2024-08-26 09:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1170/1251] eta 0:00:19 lr 0.000878 wd 0.0500 time 0.2467 (0.2457) data time 0.0012 (0.0015) model time 0.2455 (0.2443) loss 3.2059 (3.4402) grad_norm 1.6731 (inf) loss_scale 1024.0000 (1485.7182) mem 7379MB [2024-08-26 09:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1180/1251] eta 0:00:17 lr 0.000878 wd 0.0500 time 0.2378 (0.2456) data time 0.0011 (0.0015) model time 0.2368 (0.2443) loss 3.1710 (3.4405) grad_norm 2.1749 (inf) loss_scale 1024.0000 (1481.8086) mem 7379MB [2024-08-26 09:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1190/1251] eta 0:00:14 lr 0.000878 wd 0.0500 time 0.2404 (0.2456) data time 0.0008 (0.0015) model time 0.2396 (0.2442) loss 4.1330 (3.4406) grad_norm 1.6545 (inf) loss_scale 1024.0000 (1477.9647) mem 7379MB [2024-08-26 09:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1200/1251] eta 0:00:12 lr 0.000878 wd 0.0500 time 0.2455 (0.2456) data time 0.0007 (0.0015) model time 0.2448 (0.2442) loss 2.3882 (3.4390) 
grad_norm 1.5403 (inf) loss_scale 1024.0000 (1474.1848) mem 7379MB [2024-08-26 09:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1210/1251] eta 0:00:10 lr 0.000878 wd 0.0500 time 0.2464 (0.2456) data time 0.0009 (0.0015) model time 0.2454 (0.2442) loss 3.6823 (3.4370) grad_norm 1.9962 (inf) loss_scale 1024.0000 (1470.4674) mem 7379MB [2024-08-26 09:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1220/1251] eta 0:00:07 lr 0.000878 wd 0.0500 time 0.2442 (0.2455) data time 0.0007 (0.0015) model time 0.2434 (0.2442) loss 3.4250 (3.4371) grad_norm 1.8173 (inf) loss_scale 1024.0000 (1466.8108) mem 7379MB [2024-08-26 09:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1230/1251] eta 0:00:05 lr 0.000878 wd 0.0500 time 0.2426 (0.2455) data time 0.0010 (0.0015) model time 0.2415 (0.2442) loss 3.3060 (3.4373) grad_norm 2.6200 (inf) loss_scale 1024.0000 (1463.2136) mem 7379MB [2024-08-26 09:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1240/1251] eta 0:00:02 lr 0.000878 wd 0.0500 time 0.2243 (0.2456) data time 0.0007 (0.0015) model time 0.2236 (0.2442) loss 3.8822 (3.4395) grad_norm 3.1652 (inf) loss_scale 1024.0000 (1459.6745) mem 7379MB [2024-08-26 09:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [83/300][1250/1251] eta 0:00:00 lr 0.000878 wd 0.0500 time 0.2256 (0.2454) data time 0.0005 (0.0015) model time 0.2251 (0.2441) loss 1.9750 (3.4385) grad_norm 1.7853 (inf) loss_scale 1024.0000 (1456.1918) mem 7379MB [2024-08-26 09:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 83 training takes 0:05:07 [2024-08-26 09:29:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:29:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 09:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.389 (0.389) Loss 0.5488 (0.5488) Acc@1 89.746 (89.746) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-26 09:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.107) Loss 0.8115 (0.8181) Acc@1 81.641 (81.845) Acc@5 95.703 (95.978) Mem 7379MB [2024-08-26 09:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.094) Loss 1.1953 (0.8351) Acc@1 72.266 (81.078) Acc@5 92.090 (95.908) Mem 7379MB [2024-08-26 09:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.089) Loss 1.4746 (0.9445) Acc@1 64.746 (78.613) Acc@5 88.379 (94.563) Mem 7379MB [2024-08-26 09:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3477 (1.0132) Acc@1 68.262 (76.975) Acc@5 90.625 (93.848) Mem 7379MB [2024-08-26 09:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.642 Acc@5 93.754 [2024-08-26 09:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.6% [2024-08-26 09:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 76.64% [2024-08-26 09:29:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 09:29:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 09:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.400 (0.400) Loss 0.4573 (0.4573) Acc@1 92.188 (92.188) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 09:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.112) Loss 0.7329 (0.7101) Acc@1 85.645 (84.775) Acc@5 96.387 (96.928) Mem 7379MB [2024-08-26 09:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.096) Loss 1.0127 (0.7334) Acc@1 75.293 (83.631) Acc@5 94.141 (96.903) Mem 7379MB [2024-08-26 09:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 1.3018 (0.8377) Acc@1 67.090 (81.181) Acc@5 90.820 (95.646) Mem 7379MB [2024-08-26 09:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1680 (0.8932) Acc@1 71.191 (79.683) Acc@5 92.188 (95.053) Mem 7379MB [2024-08-26 09:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.270 Acc@5 95.010 [2024-08-26 09:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.3% [2024-08-26 09:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.27% [2024-08-26 09:29:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:29:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 09:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][0/1251] eta 0:13:52 lr 0.000878 wd 0.0500 time 0.6654 (0.6654) data time 0.4325 (0.4325) model time 0.0000 (0.0000) loss 3.0628 (3.0628) grad_norm 1.6320 (1.6320) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][10/1251] eta 0:05:48 lr 0.000878 wd 0.0500 time 0.2400 (0.2806) data time 0.0012 (0.0403) model time 0.0000 (0.0000) loss 3.8388 (3.2180) grad_norm 2.1595 (1.7712) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][20/1251] eta 0:05:22 lr 0.000878 wd 0.0500 time 0.2478 (0.2623) data time 0.0009 (0.0216) model time 0.0000 (0.0000) loss 4.1369 (3.3098) grad_norm 2.1548 (1.9033) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][30/1251] eta 0:05:11 lr 0.000878 wd 0.0500 time 0.2399 (0.2555) data time 0.0008 (0.0150) model time 0.0000 (0.0000) loss 3.0385 (3.2756) grad_norm 1.5992 (1.9099) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][40/1251] eta 0:05:05 lr 0.000878 wd 0.0500 time 0.2465 (0.2524) data time 0.0009 (0.0116) model time 0.0000 (0.0000) loss 3.5835 (3.3128) grad_norm 1.5765 (1.8660) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][50/1251] eta 0:05:05 lr 0.000878 wd 0.0500 time 0.2412 (0.2543) data time 0.0009 (0.0095) model time 0.0000 (0.0000) loss 3.7007 (3.3423) grad_norm 2.3042 (1.8448) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][60/1251] eta 0:05:00 lr 0.000878 wd 0.0500 time 0.2435 (0.2525) data time 0.0010 (0.0081) model time 0.2425 (0.2422) loss 
2.8724 (3.3642) grad_norm 2.6088 (1.9593) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][70/1251] eta 0:04:56 lr 0.000878 wd 0.0500 time 0.2489 (0.2511) data time 0.0009 (0.0071) model time 0.2480 (0.2419) loss 4.3210 (3.3956) grad_norm 2.0916 (1.9778) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][80/1251] eta 0:04:52 lr 0.000878 wd 0.0500 time 0.2441 (0.2501) data time 0.0011 (0.0064) model time 0.2430 (0.2420) loss 3.7230 (3.4131) grad_norm 1.8260 (1.9707) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][90/1251] eta 0:04:49 lr 0.000878 wd 0.0500 time 0.2386 (0.2491) data time 0.0007 (0.0058) model time 0.2379 (0.2414) loss 3.7187 (3.4339) grad_norm 2.2834 (1.9971) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][100/1251] eta 0:04:45 lr 0.000877 wd 0.0500 time 0.2423 (0.2483) data time 0.0008 (0.0053) model time 0.2415 (0.2412) loss 2.3308 (3.4323) grad_norm 1.5887 (1.9853) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][110/1251] eta 0:04:42 lr 0.000877 wd 0.0500 time 0.2463 (0.2479) data time 0.0010 (0.0049) model time 0.2453 (0.2414) loss 3.9346 (3.4517) grad_norm 2.6554 (2.0194) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][120/1251] eta 0:04:39 lr 0.000877 wd 0.0500 time 0.2394 (0.2473) data time 0.0010 (0.0046) model time 0.2384 (0.2411) loss 3.4968 (3.4252) grad_norm 1.3932 (1.9909) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][130/1251] eta 0:04:36 lr 
0.000877 wd 0.0500 time 0.2453 (0.2469) data time 0.0007 (0.0043) model time 0.2445 (0.2412) loss 4.4063 (3.4515) grad_norm 2.1556 (1.9905) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][140/1251] eta 0:04:34 lr 0.000877 wd 0.0500 time 0.2425 (0.2466) data time 0.0008 (0.0041) model time 0.2417 (0.2413) loss 3.0617 (3.4613) grad_norm 2.1177 (1.9819) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][150/1251] eta 0:04:32 lr 0.000877 wd 0.0500 time 0.2449 (0.2477) data time 0.0010 (0.0039) model time 0.2439 (0.2434) loss 3.9058 (3.4649) grad_norm 2.2685 (1.9701) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][160/1251] eta 0:04:29 lr 0.000877 wd 0.0500 time 0.2455 (0.2474) data time 0.0007 (0.0037) model time 0.2447 (0.2432) loss 3.4928 (3.4736) grad_norm 1.5807 (1.9716) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][170/1251] eta 0:04:28 lr 0.000877 wd 0.0500 time 0.4820 (0.2484) data time 0.0009 (0.0036) model time 0.4811 (0.2449) loss 3.8913 (3.4806) grad_norm 2.4379 (1.9691) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][180/1251] eta 0:04:26 lr 0.000877 wd 0.0500 time 0.2293 (0.2491) data time 0.0010 (0.0034) model time 0.2283 (0.2461) loss 3.7696 (3.4881) grad_norm 1.8142 (1.9614) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][190/1251] eta 0:04:23 lr 0.000877 wd 0.0500 time 0.2439 (0.2487) data time 0.0009 (0.0033) model time 0.2430 (0.2457) loss 3.2697 (3.4862) grad_norm 1.5776 (1.9557) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][200/1251] eta 0:04:21 lr 0.000877 wd 0.0500 time 0.2338 (0.2485) data time 0.0009 (0.0032) model time 0.2330 (0.2455) loss 3.0012 (3.4988) grad_norm 1.9619 (1.9756) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][210/1251] eta 0:04:18 lr 0.000877 wd 0.0500 time 0.2417 (0.2482) data time 0.0009 (0.0031) model time 0.2409 (0.2452) loss 3.6613 (3.5049) grad_norm 1.9500 (1.9766) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][220/1251] eta 0:04:15 lr 0.000877 wd 0.0500 time 0.2540 (0.2480) data time 0.0010 (0.0030) model time 0.2530 (0.2451) loss 3.7145 (3.5092) grad_norm 1.8959 (1.9816) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][230/1251] eta 0:04:12 lr 0.000877 wd 0.0500 time 0.2277 (0.2478) data time 0.0009 (0.0029) model time 0.2269 (0.2448) loss 3.6055 (3.5077) grad_norm 1.7583 (1.9917) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][240/1251] eta 0:04:10 lr 0.000877 wd 0.0500 time 0.2346 (0.2474) data time 0.0012 (0.0028) model time 0.2335 (0.2445) loss 3.5504 (3.5093) grad_norm 2.1443 (1.9857) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][250/1251] eta 0:04:07 lr 0.000877 wd 0.0500 time 0.2489 (0.2471) data time 0.0010 (0.0028) model time 0.2479 (0.2442) loss 3.6818 (3.5103) grad_norm 1.7464 (1.9822) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][260/1251] eta 0:04:04 lr 0.000877 wd 0.0500 time 0.2364 (0.2468) data time 0.0007 (0.0028) model time 0.2357 (0.2439) loss 4.0112 
(3.5161) grad_norm 1.4788 (1.9797) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][270/1251] eta 0:04:01 lr 0.000877 wd 0.0500 time 0.2383 (0.2465) data time 0.0012 (0.0027) model time 0.2371 (0.2436) loss 2.7712 (3.4990) grad_norm 2.3126 (1.9807) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][280/1251] eta 0:03:59 lr 0.000877 wd 0.0500 time 0.2396 (0.2464) data time 0.0012 (0.0026) model time 0.2384 (0.2435) loss 3.7293 (3.5027) grad_norm 1.7118 (1.9863) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][290/1251] eta 0:03:56 lr 0.000877 wd 0.0500 time 0.2414 (0.2462) data time 0.0009 (0.0026) model time 0.2405 (0.2434) loss 3.9001 (3.5028) grad_norm 2.0521 (1.9757) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][300/1251] eta 0:03:53 lr 0.000877 wd 0.0500 time 0.2468 (0.2461) data time 0.0015 (0.0025) model time 0.2452 (0.2433) loss 3.7227 (3.4962) grad_norm 1.5505 (1.9714) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][310/1251] eta 0:03:51 lr 0.000877 wd 0.0500 time 0.2445 (0.2459) data time 0.0007 (0.0025) model time 0.2437 (0.2431) loss 2.8688 (3.4944) grad_norm 1.5034 (1.9675) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][320/1251] eta 0:03:48 lr 0.000877 wd 0.0500 time 0.2476 (0.2457) data time 0.0010 (0.0024) model time 0.2466 (0.2430) loss 4.1171 (3.5031) grad_norm 1.7002 (1.9709) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][330/1251] eta 0:03:46 lr 0.000877 wd 
0.0500 time 0.2355 (0.2456) data time 0.0009 (0.0024) model time 0.2346 (0.2429) loss 4.0556 (3.4922) grad_norm 1.8042 (1.9851) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][340/1251] eta 0:03:44 lr 0.000877 wd 0.0500 time 0.2373 (0.2460) data time 0.0011 (0.0023) model time 0.2363 (0.2434) loss 3.5042 (3.4824) grad_norm 1.7710 (1.9938) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][350/1251] eta 0:03:41 lr 0.000877 wd 0.0500 time 0.2402 (0.2459) data time 0.0009 (0.0023) model time 0.2393 (0.2433) loss 3.5902 (3.4806) grad_norm 1.9822 (1.9988) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][360/1251] eta 0:03:39 lr 0.000877 wd 0.0500 time 0.2329 (0.2463) data time 0.0011 (0.0023) model time 0.2318 (0.2439) loss 3.0330 (3.4777) grad_norm 2.7175 (2.0009) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][370/1251] eta 0:03:36 lr 0.000877 wd 0.0500 time 0.2423 (0.2461) data time 0.0011 (0.0022) model time 0.2412 (0.2437) loss 3.2534 (3.4763) grad_norm 2.0132 (2.0016) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][380/1251] eta 0:03:34 lr 0.000877 wd 0.0500 time 0.2366 (0.2459) data time 0.0008 (0.0022) model time 0.2358 (0.2436) loss 2.3239 (3.4707) grad_norm 1.8716 (2.0033) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][390/1251] eta 0:03:31 lr 0.000877 wd 0.0500 time 0.2383 (0.2458) data time 0.0007 (0.0022) model time 0.2376 (0.2435) loss 3.4939 (3.4711) grad_norm 1.7828 (2.0061) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:52 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][400/1251] eta 0:03:29 lr 0.000877 wd 0.0500 time 0.2669 (0.2459) data time 0.0009 (0.0022) model time 0.2661 (0.2436) loss 4.7118 (3.4765) grad_norm 1.6673 (2.0076) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][410/1251] eta 0:03:26 lr 0.000877 wd 0.0500 time 0.2455 (0.2459) data time 0.0010 (0.0022) model time 0.2446 (0.2435) loss 2.6781 (3.4678) grad_norm 1.7235 (2.0127) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][420/1251] eta 0:03:24 lr 0.000877 wd 0.0500 time 0.2448 (0.2458) data time 0.0009 (0.0022) model time 0.2439 (0.2435) loss 3.5591 (3.4734) grad_norm 2.3680 (2.0149) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][430/1251] eta 0:03:21 lr 0.000877 wd 0.0500 time 0.2443 (0.2457) data time 0.0012 (0.0022) model time 0.2431 (0.2434) loss 3.5079 (3.4740) grad_norm 1.8898 (2.0141) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][440/1251] eta 0:03:19 lr 0.000876 wd 0.0500 time 0.2441 (0.2457) data time 0.0010 (0.0021) model time 0.2432 (0.2434) loss 3.3675 (3.4686) grad_norm 1.8627 (2.0131) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][450/1251] eta 0:03:16 lr 0.000876 wd 0.0500 time 0.2510 (0.2456) data time 0.0009 (0.0021) model time 0.2501 (0.2433) loss 3.7679 (3.4635) grad_norm 1.4512 (2.0079) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][460/1251] eta 0:03:14 lr 0.000876 wd 0.0500 time 0.2374 (0.2455) data time 0.0010 (0.0021) model time 0.2364 (0.2432) loss 3.5555 
(3.4634) grad_norm 1.9940 (2.0205) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][470/1251] eta 0:03:11 lr 0.000876 wd 0.0500 time 0.2422 (0.2454) data time 0.0007 (0.0021) model time 0.2415 (0.2431) loss 3.1143 (3.4544) grad_norm 1.8411 (2.0171) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][480/1251] eta 0:03:09 lr 0.000876 wd 0.0500 time 0.2353 (0.2453) data time 0.0008 (0.0021) model time 0.2344 (0.2431) loss 2.8390 (3.4578) grad_norm 2.4201 (2.0175) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][490/1251] eta 0:03:06 lr 0.000876 wd 0.0500 time 0.2471 (0.2453) data time 0.0011 (0.0020) model time 0.2460 (0.2431) loss 3.0488 (3.4536) grad_norm 2.2027 (2.0206) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][500/1251] eta 0:03:04 lr 0.000876 wd 0.0500 time 0.2452 (0.2452) data time 0.0008 (0.0020) model time 0.2444 (0.2430) loss 3.5250 (3.4530) grad_norm 1.8252 (2.0169) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][510/1251] eta 0:03:01 lr 0.000876 wd 0.0500 time 0.2412 (0.2452) data time 0.0008 (0.0020) model time 0.2403 (0.2430) loss 2.3856 (3.4530) grad_norm 2.4465 (2.0112) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][520/1251] eta 0:02:59 lr 0.000876 wd 0.0500 time 0.2434 (0.2451) data time 0.0007 (0.0020) model time 0.2427 (0.2430) loss 2.6425 (3.4456) grad_norm 2.2371 (2.0092) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][530/1251] eta 0:02:56 lr 0.000876 wd 
0.0500 time 0.2438 (0.2451) data time 0.0008 (0.0020) model time 0.2430 (0.2430) loss 3.1023 (3.4472) grad_norm 1.4605 (2.0069) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][540/1251] eta 0:02:54 lr 0.000876 wd 0.0500 time 0.2397 (0.2450) data time 0.0012 (0.0019) model time 0.2386 (0.2429) loss 3.6492 (3.4483) grad_norm 1.9361 (2.0065) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][550/1251] eta 0:02:51 lr 0.000876 wd 0.0500 time 0.2315 (0.2452) data time 0.0009 (0.0019) model time 0.2307 (0.2431) loss 4.0694 (3.4502) grad_norm 1.3462 (2.0160) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][560/1251] eta 0:02:49 lr 0.000876 wd 0.0500 time 0.2462 (0.2452) data time 0.0009 (0.0019) model time 0.2453 (0.2431) loss 3.7770 (3.4473) grad_norm 1.5532 (2.0189) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][570/1251] eta 0:02:46 lr 0.000876 wd 0.0500 time 0.2488 (0.2451) data time 0.0008 (0.0019) model time 0.2481 (0.2431) loss 2.1055 (3.4455) grad_norm 2.1302 (2.0199) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][580/1251] eta 0:02:44 lr 0.000876 wd 0.0500 time 0.2445 (0.2451) data time 0.0011 (0.0019) model time 0.2434 (0.2431) loss 3.8459 (3.4475) grad_norm 2.8242 (2.0214) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][590/1251] eta 0:02:42 lr 0.000876 wd 0.0500 time 0.2392 (0.2454) data time 0.0011 (0.0019) model time 0.2381 (0.2434) loss 3.1065 (3.4446) grad_norm 2.3868 (2.0192) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:41 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][600/1251] eta 0:02:39 lr 0.000876 wd 0.0500 time 0.2385 (0.2453) data time 0.0011 (0.0019) model time 0.2375 (0.2433) loss 3.7283 (3.4466) grad_norm 2.8240 (2.0221) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][610/1251] eta 0:02:37 lr 0.000876 wd 0.0500 time 0.2350 (0.2453) data time 0.0009 (0.0018) model time 0.2341 (0.2434) loss 2.4376 (3.4461) grad_norm 2.0677 (2.0276) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][620/1251] eta 0:02:34 lr 0.000876 wd 0.0500 time 0.2455 (0.2453) data time 0.0009 (0.0018) model time 0.2446 (0.2434) loss 2.6446 (3.4476) grad_norm 1.7135 (2.0292) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][630/1251] eta 0:02:32 lr 0.000876 wd 0.0500 time 0.2413 (0.2453) data time 0.0009 (0.0018) model time 0.2403 (0.2433) loss 3.6813 (3.4466) grad_norm 2.0481 (2.0271) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][640/1251] eta 0:02:29 lr 0.000876 wd 0.0500 time 0.2431 (0.2452) data time 0.0007 (0.0018) model time 0.2424 (0.2433) loss 2.4651 (3.4428) grad_norm 1.7584 (2.0259) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][650/1251] eta 0:02:27 lr 0.000876 wd 0.0500 time 0.2409 (0.2452) data time 0.0009 (0.0018) model time 0.2400 (0.2433) loss 3.7104 (3.4422) grad_norm 2.1639 (2.0277) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][660/1251] eta 0:02:24 lr 0.000876 wd 0.0500 time 0.2370 (0.2451) data time 0.0007 (0.0018) model time 0.2363 (0.2432) loss 4.0353 
(3.4433) grad_norm 3.1224 (2.0282) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][670/1251] eta 0:02:22 lr 0.000876 wd 0.0500 time 0.2485 (0.2451) data time 0.0009 (0.0018) model time 0.2475 (0.2432) loss 3.1636 (3.4428) grad_norm 1.6969 (2.0275) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][680/1251] eta 0:02:19 lr 0.000876 wd 0.0500 time 0.2417 (0.2450) data time 0.0008 (0.0018) model time 0.2409 (0.2431) loss 2.8863 (3.4416) grad_norm 1.5452 (2.0267) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][690/1251] eta 0:02:17 lr 0.000876 wd 0.0500 time 0.2450 (0.2450) data time 0.0010 (0.0018) model time 0.2441 (0.2431) loss 3.2416 (3.4436) grad_norm 2.0168 (2.0231) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][700/1251] eta 0:02:15 lr 0.000876 wd 0.0500 time 0.2403 (0.2453) data time 0.0009 (0.0017) model time 0.2395 (0.2435) loss 4.0163 (3.4492) grad_norm 1.9832 (2.0207) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][710/1251] eta 0:02:12 lr 0.000876 wd 0.0500 time 0.2382 (0.2458) data time 0.0010 (0.0017) model time 0.2372 (0.2440) loss 3.4194 (3.4517) grad_norm 2.4658 (2.0212) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][720/1251] eta 0:02:10 lr 0.000876 wd 0.0500 time 0.2499 (0.2464) data time 0.0010 (0.0017) model time 0.2489 (0.2446) loss 2.5022 (3.4500) grad_norm 1.3822 (2.0223) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][730/1251] eta 0:02:08 lr 0.000876 wd 
0.0500 time 0.2405 (0.2466) data time 0.0009 (0.0017) model time 0.2397 (0.2449) loss 3.0753 (3.4499) grad_norm 2.5661 (2.0207) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][740/1251] eta 0:02:05 lr 0.000876 wd 0.0500 time 0.2405 (0.2465) data time 0.0012 (0.0017) model time 0.2394 (0.2448) loss 2.6494 (3.4449) grad_norm 1.9008 (2.0190) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][750/1251] eta 0:02:03 lr 0.000876 wd 0.0500 time 0.2400 (0.2464) data time 0.0011 (0.0017) model time 0.2390 (0.2447) loss 3.6168 (3.4416) grad_norm 1.7285 (2.0210) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][760/1251] eta 0:02:00 lr 0.000876 wd 0.0500 time 0.2450 (0.2464) data time 0.0007 (0.0017) model time 0.2443 (0.2447) loss 3.4883 (3.4424) grad_norm 1.3820 (2.0214) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][770/1251] eta 0:01:58 lr 0.000876 wd 0.0500 time 0.2403 (0.2463) data time 0.0010 (0.0017) model time 0.2394 (0.2446) loss 3.9919 (3.4458) grad_norm 2.1465 (2.0202) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][780/1251] eta 0:01:56 lr 0.000875 wd 0.0500 time 0.2479 (0.2463) data time 0.0009 (0.0017) model time 0.2471 (0.2446) loss 3.1397 (3.4422) grad_norm 1.7150 (2.0182) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][790/1251] eta 0:01:53 lr 0.000875 wd 0.0500 time 0.2401 (0.2463) data time 0.0011 (0.0017) model time 0.2390 (0.2446) loss 3.9352 (3.4410) grad_norm 1.8216 (2.0200) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:30 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][800/1251] eta 0:01:51 lr 0.000875 wd 0.0500 time 0.2360 (0.2462) data time 0.0011 (0.0017) model time 0.2349 (0.2446) loss 3.2674 (3.4427) grad_norm 1.6471 (2.0216) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][810/1251] eta 0:01:48 lr 0.000875 wd 0.0500 time 0.2466 (0.2462) data time 0.0007 (0.0017) model time 0.2460 (0.2445) loss 4.4116 (3.4433) grad_norm 2.7696 (2.0231) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][820/1251] eta 0:01:46 lr 0.000875 wd 0.0500 time 0.2423 (0.2461) data time 0.0010 (0.0016) model time 0.2414 (0.2445) loss 3.6739 (3.4437) grad_norm 1.6275 (2.0210) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][830/1251] eta 0:01:43 lr 0.000875 wd 0.0500 time 0.2420 (0.2461) data time 0.0008 (0.0016) model time 0.2412 (0.2444) loss 3.9868 (3.4427) grad_norm 1.7104 (2.0197) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][840/1251] eta 0:01:41 lr 0.000875 wd 0.0500 time 0.2439 (0.2460) data time 0.0009 (0.0016) model time 0.2430 (0.2444) loss 3.2353 (3.4439) grad_norm 5.6584 (2.0279) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][850/1251] eta 0:01:38 lr 0.000875 wd 0.0500 time 0.2398 (0.2460) data time 0.0011 (0.0016) model time 0.2387 (0.2443) loss 2.9993 (3.4448) grad_norm 4.2574 (2.0382) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][860/1251] eta 0:01:36 lr 0.000875 wd 0.0500 time 0.2342 (0.2459) data time 0.0012 (0.0016) model time 0.2330 (0.2443) loss 3.7669 
(3.4448) grad_norm 1.5892 (2.0371) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][870/1251] eta 0:01:33 lr 0.000875 wd 0.0500 time 0.2399 (0.2458) data time 0.0008 (0.0016) model time 0.2391 (0.2442) loss 4.0308 (3.4457) grad_norm 1.9731 (2.0357) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][880/1251] eta 0:01:31 lr 0.000875 wd 0.0500 time 0.2428 (0.2460) data time 0.0009 (0.0016) model time 0.2420 (0.2443) loss 3.6716 (3.4475) grad_norm 2.6157 (2.0350) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][890/1251] eta 0:01:28 lr 0.000875 wd 0.0500 time 0.2457 (0.2459) data time 0.0008 (0.0016) model time 0.2449 (0.2443) loss 3.7139 (3.4455) grad_norm 2.1744 (2.0333) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][900/1251] eta 0:01:26 lr 0.000875 wd 0.0500 time 0.2478 (0.2459) data time 0.0009 (0.0016) model time 0.2469 (0.2443) loss 4.1430 (3.4461) grad_norm 1.8263 (2.0358) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][910/1251] eta 0:01:23 lr 0.000875 wd 0.0500 time 0.2491 (0.2459) data time 0.0007 (0.0016) model time 0.2484 (0.2442) loss 4.0363 (3.4495) grad_norm 1.6563 (2.0379) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][920/1251] eta 0:01:21 lr 0.000875 wd 0.0500 time 0.2455 (0.2459) data time 0.0007 (0.0016) model time 0.2448 (0.2442) loss 3.1527 (3.4485) grad_norm 2.0409 (2.0365) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][930/1251] eta 0:01:18 lr 0.000875 wd 
0.0500 time 0.2484 (0.2458) data time 0.0008 (0.0016) model time 0.2476 (0.2442) loss 4.0786 (3.4506) grad_norm 2.2380 (2.0409) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][940/1251] eta 0:01:16 lr 0.000875 wd 0.0500 time 0.2413 (0.2458) data time 0.0010 (0.0016) model time 0.2403 (0.2442) loss 3.7448 (3.4503) grad_norm 1.9374 (2.0432) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][950/1251] eta 0:01:13 lr 0.000875 wd 0.0500 time 0.2462 (0.2457) data time 0.0009 (0.0016) model time 0.2453 (0.2441) loss 4.0634 (3.4508) grad_norm 2.1487 (2.0412) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][960/1251] eta 0:01:11 lr 0.000875 wd 0.0500 time 0.2419 (0.2457) data time 0.0010 (0.0016) model time 0.2409 (0.2441) loss 4.0704 (3.4499) grad_norm 1.9821 (2.0387) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][970/1251] eta 0:01:09 lr 0.000875 wd 0.0500 time 0.2347 (0.2457) data time 0.0008 (0.0016) model time 0.2338 (0.2441) loss 3.2453 (3.4497) grad_norm 1.5186 (2.0356) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][980/1251] eta 0:01:06 lr 0.000875 wd 0.0500 time 0.2448 (0.2459) data time 0.0011 (0.0016) model time 0.2437 (0.2443) loss 3.6355 (3.4527) grad_norm 1.8944 (2.0356) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][990/1251] eta 0:01:04 lr 0.000875 wd 0.0500 time 0.2452 (0.2458) data time 0.0009 (0.0016) model time 0.2442 (0.2442) loss 3.2380 (3.4518) grad_norm 2.7126 (2.0341) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1000/1251] eta 0:01:01 lr 0.000875 wd 0.0500 time 0.2444 (0.2458) data time 0.0009 (0.0016) model time 0.2434 (0.2442) loss 4.1381 (3.4487) grad_norm 1.8863 (2.0344) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1010/1251] eta 0:00:59 lr 0.000875 wd 0.0500 time 0.2431 (0.2457) data time 0.0007 (0.0016) model time 0.2424 (0.2442) loss 4.1565 (3.4489) grad_norm 2.1197 (2.0350) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1020/1251] eta 0:00:56 lr 0.000875 wd 0.0500 time 0.2428 (0.2457) data time 0.0010 (0.0016) model time 0.2418 (0.2441) loss 3.4720 (3.4482) grad_norm 2.1892 (2.0339) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1030/1251] eta 0:00:54 lr 0.000875 wd 0.0500 time 0.2442 (0.2457) data time 0.0009 (0.0016) model time 0.2432 (0.2441) loss 3.9297 (3.4487) grad_norm 1.7664 (2.0293) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1040/1251] eta 0:00:51 lr 0.000875 wd 0.0500 time 0.2407 (0.2456) data time 0.0009 (0.0015) model time 0.2398 (0.2440) loss 2.8848 (3.4476) grad_norm 2.3526 (2.0288) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1050/1251] eta 0:00:49 lr 0.000875 wd 0.0500 time 0.2377 (0.2456) data time 0.0009 (0.0015) model time 0.2368 (0.2440) loss 3.6033 (3.4478) grad_norm 6.5337 (2.0369) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1060/1251] eta 0:00:46 lr 0.000875 wd 0.0500 time 0.2402 (0.2456) data time 0.0009 (0.0015) model time 0.2392 (0.2440) loss 3.8865 
(3.4489) grad_norm 2.6187 (2.0401) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1070/1251] eta 0:00:44 lr 0.000875 wd 0.0500 time 0.2471 (0.2455) data time 0.0011 (0.0015) model time 0.2459 (0.2439) loss 3.4719 (3.4499) grad_norm 1.7266 (2.0395) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1080/1251] eta 0:00:42 lr 0.000875 wd 0.0500 time 0.2345 (0.2457) data time 0.0010 (0.0015) model time 0.2335 (0.2441) loss 4.0343 (3.4498) grad_norm 1.8751 (2.0373) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1090/1251] eta 0:00:39 lr 0.000875 wd 0.0500 time 0.2456 (0.2456) data time 0.0009 (0.0015) model time 0.2447 (0.2441) loss 3.4513 (3.4505) grad_norm 2.6326 (2.0363) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1100/1251] eta 0:00:37 lr 0.000875 wd 0.0500 time 0.2370 (0.2457) data time 0.0012 (0.0015) model time 0.2358 (0.2441) loss 3.2645 (3.4493) grad_norm 2.2507 (2.0371) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1110/1251] eta 0:00:34 lr 0.000875 wd 0.0500 time 0.2344 (0.2457) data time 0.0015 (0.0015) model time 0.2329 (0.2441) loss 3.9948 (3.4518) grad_norm 2.3618 (2.0368) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1120/1251] eta 0:00:32 lr 0.000874 wd 0.0500 time 0.2442 (0.2456) data time 0.0010 (0.0015) model time 0.2432 (0.2441) loss 3.4906 (3.4514) grad_norm 1.5319 (2.0351) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1130/1251] eta 0:00:29 lr 
0.000874 wd 0.0500 time 0.2423 (0.2458) data time 0.0007 (0.0015) model time 0.2416 (0.2442) loss 3.6620 (3.4500) grad_norm 1.5391 (2.0335) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1140/1251] eta 0:00:27 lr 0.000874 wd 0.0500 time 0.2419 (0.2458) data time 0.0010 (0.0015) model time 0.2409 (0.2442) loss 4.0430 (3.4514) grad_norm 1.9382 (2.0316) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1150/1251] eta 0:00:24 lr 0.000874 wd 0.0500 time 0.2453 (0.2457) data time 0.0008 (0.0015) model time 0.2445 (0.2442) loss 2.9709 (3.4519) grad_norm 3.3457 (2.0316) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1160/1251] eta 0:00:22 lr 0.000874 wd 0.0500 time 0.2409 (0.2457) data time 0.0008 (0.0015) model time 0.2401 (0.2442) loss 3.6728 (3.4529) grad_norm 2.0305 (2.0335) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1170/1251] eta 0:00:19 lr 0.000874 wd 0.0500 time 0.2372 (0.2457) data time 0.0012 (0.0015) model time 0.2361 (0.2441) loss 3.9139 (3.4520) grad_norm 2.1910 (2.0316) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1180/1251] eta 0:00:17 lr 0.000874 wd 0.0500 time 0.2449 (0.2457) data time 0.0009 (0.0015) model time 0.2440 (0.2441) loss 3.4923 (3.4543) grad_norm 2.0062 (2.0303) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1190/1251] eta 0:00:14 lr 0.000874 wd 0.0500 time 0.2425 (0.2456) data time 0.0012 (0.0015) model time 0.2413 (0.2441) loss 3.8360 (3.4540) grad_norm 1.6386 (2.0283) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 
09:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1200/1251] eta 0:00:12 lr 0.000874 wd 0.0500 time 0.2393 (0.2456) data time 0.0009 (0.0015) model time 0.2384 (0.2441) loss 2.5340 (3.4553) grad_norm 1.6857 (2.0330) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1210/1251] eta 0:00:10 lr 0.000874 wd 0.0500 time 0.2443 (0.2456) data time 0.0007 (0.0015) model time 0.2436 (0.2440) loss 3.9154 (3.4571) grad_norm 2.3549 (2.0347) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1220/1251] eta 0:00:07 lr 0.000874 wd 0.0500 time 0.2434 (0.2456) data time 0.0007 (0.0015) model time 0.2427 (0.2440) loss 3.2937 (3.4584) grad_norm 1.5399 (2.0341) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1230/1251] eta 0:00:05 lr 0.000874 wd 0.0500 time 0.2380 (0.2457) data time 0.0010 (0.0015) model time 0.2371 (0.2441) loss 3.8311 (3.4591) grad_norm 1.5996 (2.0344) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1240/1251] eta 0:00:02 lr 0.000874 wd 0.0500 time 0.2280 (0.2457) data time 0.0007 (0.0015) model time 0.2273 (0.2442) loss 4.2520 (3.4612) grad_norm 2.0582 (2.0398) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [84/300][1250/1251] eta 0:00:00 lr 0.000874 wd 0.0500 time 0.2282 (0.2457) data time 0.0005 (0.0015) model time 0.2277 (0.2442) loss 4.4620 (3.4625) grad_norm 2.0787 (2.0447) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 84 training takes 0:05:07 [2024-08-26 09:34:21 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:34:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.426 (0.426) Loss 0.5571 (0.5571) Acc@1 89.551 (89.551) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-26 09:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.115) Loss 0.8066 (0.8333) Acc@1 83.008 (81.960) Acc@5 95.801 (96.129) Mem 7379MB [2024-08-26 09:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.097) Loss 1.1953 (0.8551) Acc@1 72.559 (80.980) Acc@5 92.188 (96.066) Mem 7379MB [2024-08-26 09:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 1.4854 (0.9799) Acc@1 65.625 (78.157) Acc@5 87.500 (94.487) Mem 7379MB [2024-08-26 09:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3828 (1.0476) Acc@1 67.969 (76.582) Acc@5 89.746 (93.600) Mem 7379MB [2024-08-26 09:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.260 Acc@5 93.594 [2024-08-26 09:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.3% [2024-08-26 09:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.761 (0.761) Loss 0.4575 (0.4575) Acc@1 91.895 (91.895) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 09:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.148) Loss 0.7305 (0.7090) Acc@1 85.742 (84.863) Acc@5 96.387 (96.955) Mem 7379MB [2024-08-26 09:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.115) Loss 1.0107 (0.7322) Acc@1 75.391 (83.705) Acc@5 94.141 (96.908) Mem 7379MB [2024-08-26 09:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.074 (0.102) Loss 1.3008 (0.8361) Acc@1 67.383 (81.259) Acc@5 90.723 (95.650) Mem 7379MB [2024-08-26 09:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.1650 (0.8915) Acc@1 71.582 (79.757) Acc@5 92.188 (95.062) Mem 7379MB [2024-08-26 09:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.342 Acc@5 95.008 [2024-08-26 09:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.3% [2024-08-26 09:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.34% [2024-08-26 09:34:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:34:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 09:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][0/1251] eta 0:15:25 lr 0.000874 wd 0.0500 time 0.7399 (0.7399) data time 0.5182 (0.5182) model time 0.0000 (0.0000) loss 3.4919 (3.4919) grad_norm 2.0449 (2.0449) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][10/1251] eta 0:06:40 lr 0.000874 wd 0.0500 time 0.2406 (0.3226) data time 0.0008 (0.0480) model time 0.0000 (0.0000) loss 3.8427 (3.4063) grad_norm 1.7429 (1.9758) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][20/1251] eta 0:06:01 lr 0.000874 wd 0.0500 time 0.2385 (0.2936) data time 0.0007 (0.0256) model time 0.0000 (0.0000) loss 3.1225 (3.4523) grad_norm 1.4083 (1.9365) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-26 09:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][30/1251] eta 0:05:38 lr 0.000874 wd 0.0500 time 0.2355 (0.2771) data time 0.0013 (0.0177) 
model time 0.0000 (0.0000) loss 3.8591 (3.4471) grad_norm 1.5054 (1.9758) loss_scale 2048.0000 (1189.1613) mem 7379MB [2024-08-26 09:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][40/1251] eta 0:05:25 lr 0.000874 wd 0.0500 time 0.2388 (0.2691) data time 0.0013 (0.0137) model time 0.0000 (0.0000) loss 2.9950 (3.4323) grad_norm 1.6277 (1.9195) loss_scale 2048.0000 (1398.6341) mem 7379MB [2024-08-26 09:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][50/1251] eta 0:05:15 lr 0.000874 wd 0.0500 time 0.2348 (0.2631) data time 0.0009 (0.0112) model time 0.0000 (0.0000) loss 2.7987 (3.4772) grad_norm 1.4937 (1.8902) loss_scale 2048.0000 (1525.9608) mem 7379MB [2024-08-26 09:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][60/1251] eta 0:05:08 lr 0.000874 wd 0.0500 time 0.2441 (0.2594) data time 0.0009 (0.0095) model time 0.2432 (0.2398) loss 3.9643 (3.4931) grad_norm 1.6685 (1.9090) loss_scale 2048.0000 (1611.5410) mem 7379MB [2024-08-26 09:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][70/1251] eta 0:05:03 lr 0.000874 wd 0.0500 time 0.2427 (0.2568) data time 0.0007 (0.0083) model time 0.2420 (0.2399) loss 3.8500 (3.5076) grad_norm 2.2302 (1.9258) loss_scale 2048.0000 (1673.0141) mem 7379MB [2024-08-26 09:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][80/1251] eta 0:04:58 lr 0.000874 wd 0.0500 time 0.2437 (0.2549) data time 0.0009 (0.0074) model time 0.2428 (0.2400) loss 2.5777 (3.5030) grad_norm 1.7524 (1.9418) loss_scale 2048.0000 (1719.3086) mem 7379MB [2024-08-26 09:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][90/1251] eta 0:04:54 lr 0.000874 wd 0.0500 time 0.2393 (0.2535) data time 0.0007 (0.0067) model time 0.2386 (0.2402) loss 3.1740 (3.4967) grad_norm 1.9467 (1.9343) loss_scale 2048.0000 (1755.4286) mem 7379MB [2024-08-26 09:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[85/300][100/1251] eta 0:04:50 lr 0.000874 wd 0.0500 time 0.2432 (0.2522) data time 0.0009 (0.0061) model time 0.2423 (0.2401) loss 2.1001 (3.5012) grad_norm 1.4026 (1.9298) loss_scale 2048.0000 (1784.3960) mem 7379MB [2024-08-26 09:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][110/1251] eta 0:04:47 lr 0.000874 wd 0.0500 time 0.2363 (0.2516) data time 0.0009 (0.0057) model time 0.2354 (0.2408) loss 4.1839 (3.4885) grad_norm 2.7368 (1.9228) loss_scale 2048.0000 (1808.1441) mem 7379MB [2024-08-26 09:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][120/1251] eta 0:04:43 lr 0.000874 wd 0.0500 time 0.2438 (0.2508) data time 0.0011 (0.0053) model time 0.2427 (0.2408) loss 3.4828 (3.5094) grad_norm 2.4949 (1.9231) loss_scale 2048.0000 (1827.9669) mem 7379MB [2024-08-26 09:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][130/1251] eta 0:04:40 lr 0.000874 wd 0.0500 time 0.2405 (0.2501) data time 0.0010 (0.0050) model time 0.2395 (0.2408) loss 3.4217 (3.4812) grad_norm 1.6248 (1.9182) loss_scale 2048.0000 (1844.7634) mem 7379MB [2024-08-26 09:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][140/1251] eta 0:04:37 lr 0.000874 wd 0.0500 time 0.2413 (0.2496) data time 0.0009 (0.0047) model time 0.2404 (0.2409) loss 2.8042 (3.4743) grad_norm 1.7636 (1.9122) loss_scale 2048.0000 (1859.1773) mem 7379MB [2024-08-26 09:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][150/1251] eta 0:04:35 lr 0.000874 wd 0.0500 time 0.2341 (0.2498) data time 0.0009 (0.0045) model time 0.2333 (0.2420) loss 3.6197 (3.4466) grad_norm 1.8739 (1.9106) loss_scale 2048.0000 (1871.6821) mem 7379MB [2024-08-26 09:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][160/1251] eta 0:04:31 lr 0.000874 wd 0.0500 time 0.2377 (0.2493) data time 0.0009 (0.0043) model time 0.2368 (0.2418) loss 3.7857 (3.4580) grad_norm 1.6707 (1.9015) loss_scale 2048.0000 
(1882.6335) mem 7379MB [2024-08-26 09:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][170/1251] eta 0:04:28 lr 0.000874 wd 0.0500 time 0.2373 (0.2488) data time 0.0011 (0.0041) model time 0.2362 (0.2417) loss 2.1650 (3.4444) grad_norm 1.8184 (1.9107) loss_scale 2048.0000 (1892.3041) mem 7379MB [2024-08-26 09:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][180/1251] eta 0:04:26 lr 0.000874 wd 0.0500 time 0.2413 (0.2484) data time 0.0009 (0.0039) model time 0.2404 (0.2416) loss 3.7405 (3.4466) grad_norm 2.6335 (1.9239) loss_scale 2048.0000 (1900.9061) mem 7379MB [2024-08-26 09:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][190/1251] eta 0:04:23 lr 0.000874 wd 0.0500 time 0.2394 (0.2480) data time 0.0007 (0.0038) model time 0.2386 (0.2415) loss 3.8613 (3.4501) grad_norm 1.6808 (1.9279) loss_scale 2048.0000 (1908.6073) mem 7379MB [2024-08-26 09:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][200/1251] eta 0:04:20 lr 0.000874 wd 0.0500 time 0.2384 (0.2478) data time 0.0009 (0.0036) model time 0.2376 (0.2415) loss 2.8721 (3.4474) grad_norm 4.2936 (1.9366) loss_scale 2048.0000 (1915.5423) mem 7379MB [2024-08-26 09:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][210/1251] eta 0:04:17 lr 0.000873 wd 0.0500 time 0.2496 (0.2475) data time 0.0007 (0.0035) model time 0.2489 (0.2416) loss 2.3249 (3.4386) grad_norm 2.4184 (1.9397) loss_scale 2048.0000 (1921.8199) mem 7379MB [2024-08-26 09:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][220/1251] eta 0:04:14 lr 0.000873 wd 0.0500 time 0.2408 (0.2471) data time 0.0009 (0.0034) model time 0.2398 (0.2413) loss 4.0625 (3.4568) grad_norm 1.7157 (1.9514) loss_scale 2048.0000 (1927.5294) mem 7379MB [2024-08-26 09:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][230/1251] eta 0:04:12 lr 0.000873 wd 0.0500 time 0.2393 (0.2469) data time 0.0010 (0.0033) 
model time 0.2383 (0.2413) loss 4.2555 (3.4621) grad_norm 2.0105 (1.9622) loss_scale 2048.0000 (1932.7446) mem 7379MB [2024-08-26 09:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][240/1251] eta 0:04:09 lr 0.000873 wd 0.0500 time 0.2369 (0.2467) data time 0.0009 (0.0032) model time 0.2360 (0.2413) loss 2.6581 (3.4511) grad_norm 2.2402 (1.9606) loss_scale 2048.0000 (1937.5270) mem 7379MB [2024-08-26 09:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][250/1251] eta 0:04:06 lr 0.000873 wd 0.0500 time 0.2425 (0.2467) data time 0.0008 (0.0031) model time 0.2418 (0.2415) loss 3.0654 (3.4446) grad_norm 1.5436 (1.9614) loss_scale 2048.0000 (1941.9283) mem 7379MB [2024-08-26 09:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][260/1251] eta 0:04:04 lr 0.000873 wd 0.0500 time 0.2426 (0.2465) data time 0.0009 (0.0030) model time 0.2417 (0.2415) loss 4.1233 (3.4443) grad_norm 1.8732 (1.9619) loss_scale 2048.0000 (1945.9923) mem 7379MB [2024-08-26 09:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][270/1251] eta 0:04:01 lr 0.000873 wd 0.0500 time 0.2486 (0.2465) data time 0.0008 (0.0030) model time 0.2478 (0.2416) loss 3.6382 (3.4426) grad_norm 1.8526 (1.9553) loss_scale 2048.0000 (1949.7565) mem 7379MB [2024-08-26 09:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][280/1251] eta 0:03:59 lr 0.000873 wd 0.0500 time 0.2438 (0.2464) data time 0.0007 (0.0029) model time 0.2431 (0.2417) loss 4.2457 (3.4448) grad_norm 2.9612 (1.9604) loss_scale 2048.0000 (1953.2527) mem 7379MB [2024-08-26 09:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][290/1251] eta 0:03:56 lr 0.000873 wd 0.0500 time 0.2427 (0.2463) data time 0.0008 (0.0029) model time 0.2419 (0.2417) loss 2.2556 (3.4466) grad_norm 2.8524 (1.9709) loss_scale 2048.0000 (1956.5086) mem 7379MB [2024-08-26 09:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[85/300][300/1251] eta 0:03:54 lr 0.000873 wd 0.0500 time 0.2467 (0.2462) data time 0.0009 (0.0028) model time 0.2458 (0.2417) loss 2.8368 (3.4465) grad_norm 3.1566 (1.9703) loss_scale 2048.0000 (1959.5482) mem 7379MB [2024-08-26 09:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][310/1251] eta 0:03:51 lr 0.000873 wd 0.0500 time 0.2445 (0.2462) data time 0.0008 (0.0028) model time 0.2437 (0.2417) loss 2.1252 (3.4399) grad_norm 1.9461 (1.9756) loss_scale 2048.0000 (1962.3923) mem 7379MB [2024-08-26 09:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][320/1251] eta 0:03:49 lr 0.000873 wd 0.0500 time 0.2436 (0.2460) data time 0.0009 (0.0027) model time 0.2426 (0.2417) loss 3.7359 (3.4400) grad_norm 4.0955 (1.9881) loss_scale 2048.0000 (1965.0592) mem 7379MB [2024-08-26 09:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][330/1251] eta 0:03:46 lr 0.000873 wd 0.0500 time 0.2364 (0.2458) data time 0.0010 (0.0027) model time 0.2354 (0.2416) loss 2.9836 (3.4332) grad_norm 2.1945 (2.0058) loss_scale 2048.0000 (1967.5650) mem 7379MB [2024-08-26 09:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][340/1251] eta 0:03:43 lr 0.000873 wd 0.0500 time 0.2377 (0.2457) data time 0.0007 (0.0026) model time 0.2370 (0.2415) loss 4.2785 (3.4347) grad_norm 1.8000 (2.0034) loss_scale 2048.0000 (1969.9238) mem 7379MB [2024-08-26 09:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][350/1251] eta 0:03:41 lr 0.000873 wd 0.0500 time 0.2447 (0.2455) data time 0.0009 (0.0026) model time 0.2438 (0.2415) loss 2.5616 (3.4323) grad_norm 1.7393 (2.0041) loss_scale 2048.0000 (1972.1481) mem 7379MB [2024-08-26 09:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][360/1251] eta 0:03:38 lr 0.000873 wd 0.0500 time 0.2403 (0.2454) data time 0.0010 (0.0025) model time 0.2393 (0.2414) loss 2.9126 (3.4323) grad_norm 1.7140 (2.0042) loss_scale 2048.0000 
(1974.2493) mem 7379MB [2024-08-26 09:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][370/1251] eta 0:03:36 lr 0.000873 wd 0.0500 time 0.2356 (0.2453) data time 0.0010 (0.0025) model time 0.2347 (0.2414) loss 2.7151 (3.4314) grad_norm 3.0364 (2.0116) loss_scale 2048.0000 (1976.2372) mem 7379MB [2024-08-26 09:36:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][380/1251] eta 0:03:33 lr 0.000873 wd 0.0500 time 0.2416 (0.2452) data time 0.0011 (0.0025) model time 0.2405 (0.2414) loss 3.8266 (3.4359) grad_norm 1.7664 (2.0234) loss_scale 2048.0000 (1978.1207) mem 7379MB [2024-08-26 09:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][390/1251] eta 0:03:31 lr 0.000873 wd 0.0500 time 0.2377 (0.2451) data time 0.0011 (0.0024) model time 0.2366 (0.2413) loss 3.8011 (3.4247) grad_norm 1.6896 (2.0222) loss_scale 2048.0000 (1979.9079) mem 7379MB [2024-08-26 09:36:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][400/1251] eta 0:03:28 lr 0.000873 wd 0.0500 time 0.2432 (0.2450) data time 0.0008 (0.0024) model time 0.2423 (0.2413) loss 3.7610 (3.4244) grad_norm 1.9220 (2.0180) loss_scale 2048.0000 (1981.6060) mem 7379MB [2024-08-26 09:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][410/1251] eta 0:03:26 lr 0.000873 wd 0.0500 time 0.2509 (0.2454) data time 0.0007 (0.0023) model time 0.2502 (0.2418) loss 4.1883 (3.4311) grad_norm 2.0888 (2.0159) loss_scale 2048.0000 (1983.2214) mem 7379MB [2024-08-26 09:36:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][420/1251] eta 0:03:23 lr 0.000873 wd 0.0500 time 0.2423 (0.2453) data time 0.0010 (0.0023) model time 0.2413 (0.2418) loss 3.9568 (3.4320) grad_norm 2.3852 (2.0169) loss_scale 2048.0000 (1984.7601) mem 7379MB [2024-08-26 09:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][430/1251] eta 0:03:21 lr 0.000873 wd 0.0500 time 0.2312 (0.2458) data time 0.0009 (0.0023) 
model time 0.2303 (0.2423) loss 2.5239 (3.4338) grad_norm 2.5763 (2.0176) loss_scale 2048.0000 (1986.2274) mem 7379MB [2024-08-26 09:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][440/1251] eta 0:03:19 lr 0.000873 wd 0.0500 time 0.2470 (0.2457) data time 0.0010 (0.0023) model time 0.2460 (0.2423) loss 3.6425 (3.4388) grad_norm 1.5274 (2.0231) loss_scale 2048.0000 (1987.6281) mem 7379MB [2024-08-26 09:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][450/1251] eta 0:03:16 lr 0.000873 wd 0.0500 time 0.2383 (0.2456) data time 0.0010 (0.0022) model time 0.2373 (0.2423) loss 3.5591 (3.4410) grad_norm 2.0376 (2.0173) loss_scale 2048.0000 (1988.9667) mem 7379MB [2024-08-26 09:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][460/1251] eta 0:03:14 lr 0.000873 wd 0.0500 time 0.2446 (0.2460) data time 0.0010 (0.0022) model time 0.2436 (0.2428) loss 2.4760 (3.4396) grad_norm 2.4682 (2.0173) loss_scale 2048.0000 (1990.2473) mem 7379MB [2024-08-26 09:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][470/1251] eta 0:03:12 lr 0.000873 wd 0.0500 time 0.2441 (0.2459) data time 0.0011 (0.0022) model time 0.2430 (0.2427) loss 3.7126 (3.4418) grad_norm 2.6665 (2.0205) loss_scale 2048.0000 (1991.4735) mem 7379MB [2024-08-26 09:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][480/1251] eta 0:03:09 lr 0.000873 wd 0.0500 time 0.2377 (0.2458) data time 0.0010 (0.0022) model time 0.2368 (0.2426) loss 2.9862 (3.4350) grad_norm 2.3374 (2.0198) loss_scale 2048.0000 (1992.6486) mem 7379MB [2024-08-26 09:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][490/1251] eta 0:03:06 lr 0.000873 wd 0.0500 time 0.2459 (0.2457) data time 0.0007 (0.0021) model time 0.2451 (0.2426) loss 3.1807 (3.4371) grad_norm 1.5460 (2.0128) loss_scale 2048.0000 (1993.7760) mem 7379MB [2024-08-26 09:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[85/300][500/1251] eta 0:03:04 lr 0.000873 wd 0.0500 time 0.2435 (0.2456) data time 0.0010 (0.0021) model time 0.2425 (0.2425) loss 4.0722 (3.4413) grad_norm 3.1305 (2.0156) loss_scale 2048.0000 (1994.8583) mem 7379MB [2024-08-26 09:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][510/1251] eta 0:03:01 lr 0.000873 wd 0.0500 time 0.2363 (0.2455) data time 0.0010 (0.0021) model time 0.2353 (0.2425) loss 3.1828 (3.4433) grad_norm 1.7290 (2.0143) loss_scale 2048.0000 (1995.8982) mem 7379MB [2024-08-26 09:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][520/1251] eta 0:02:59 lr 0.000873 wd 0.0500 time 0.2351 (0.2457) data time 0.0009 (0.0021) model time 0.2342 (0.2427) loss 2.8039 (3.4425) grad_norm 2.1664 (2.0129) loss_scale 2048.0000 (1996.8983) mem 7379MB [2024-08-26 09:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][530/1251] eta 0:02:57 lr 0.000873 wd 0.0500 time 0.2404 (0.2456) data time 0.0010 (0.0021) model time 0.2394 (0.2426) loss 3.6534 (3.4454) grad_norm 1.8786 (2.0144) loss_scale 2048.0000 (1997.8606) mem 7379MB [2024-08-26 09:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][540/1251] eta 0:02:54 lr 0.000872 wd 0.0500 time 0.2404 (0.2455) data time 0.0012 (0.0020) model time 0.2392 (0.2426) loss 3.7410 (3.4527) grad_norm 4.2549 (2.0236) loss_scale 2048.0000 (1998.7874) mem 7379MB [2024-08-26 09:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][550/1251] eta 0:02:52 lr 0.000872 wd 0.0500 time 0.2450 (0.2458) data time 0.0009 (0.0020) model time 0.2440 (0.2430) loss 3.8259 (3.4514) grad_norm 1.6326 (2.0280) loss_scale 2048.0000 (1999.6806) mem 7379MB [2024-08-26 09:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][560/1251] eta 0:02:50 lr 0.000872 wd 0.0500 time 0.2349 (0.2461) data time 0.0008 (0.0020) model time 0.2341 (0.2433) loss 3.9003 (3.4497) grad_norm 1.4549 (2.0218) loss_scale 2048.0000 
(2000.5419) mem 7379MB [2024-08-26 09:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][570/1251] eta 0:02:47 lr 0.000872 wd 0.0500 time 0.2384 (0.2460) data time 0.0008 (0.0020) model time 0.2376 (0.2433) loss 3.7010 (3.4518) grad_norm 1.9622 (2.0180) loss_scale 2048.0000 (2001.3730) mem 7379MB [2024-08-26 09:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][580/1251] eta 0:02:45 lr 0.000872 wd 0.0500 time 0.2372 (0.2459) data time 0.0011 (0.0020) model time 0.2362 (0.2432) loss 3.9195 (3.4542) grad_norm 1.6936 (2.0156) loss_scale 2048.0000 (2002.1756) mem 7379MB [2024-08-26 09:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][590/1251] eta 0:02:42 lr 0.000872 wd 0.0500 time 0.2368 (0.2458) data time 0.0009 (0.0019) model time 0.2358 (0.2432) loss 3.5166 (3.4550) grad_norm 2.1287 (2.0177) loss_scale 2048.0000 (2002.9509) mem 7379MB [2024-08-26 09:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][600/1251] eta 0:02:39 lr 0.000872 wd 0.0500 time 0.2435 (0.2458) data time 0.0008 (0.0019) model time 0.2427 (0.2431) loss 4.1424 (3.4568) grad_norm 1.6842 (2.0169) loss_scale 2048.0000 (2003.7005) mem 7379MB [2024-08-26 09:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][610/1251] eta 0:02:37 lr 0.000872 wd 0.0500 time 0.2444 (0.2457) data time 0.0011 (0.0019) model time 0.2433 (0.2431) loss 3.5192 (3.4593) grad_norm 1.7147 (2.0144) loss_scale 2048.0000 (2004.4255) mem 7379MB [2024-08-26 09:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][620/1251] eta 0:02:34 lr 0.000872 wd 0.0500 time 0.2398 (0.2456) data time 0.0007 (0.0019) model time 0.2391 (0.2430) loss 4.1617 (3.4604) grad_norm 1.8284 (2.0163) loss_scale 2048.0000 (2005.1272) mem 7379MB [2024-08-26 09:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][630/1251] eta 0:02:32 lr 0.000872 wd 0.0500 time 0.4650 (0.2459) data time 0.0009 (0.0019) 
model time 0.4641 (0.2434) loss 3.0719 (3.4563) grad_norm 2.2509 (2.0140) loss_scale 2048.0000 (2005.8067) mem 7379MB [2024-08-26 09:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][640/1251] eta 0:02:30 lr 0.000872 wd 0.0500 time 0.2401 (0.2459) data time 0.0012 (0.0019) model time 0.2389 (0.2433) loss 3.6622 (3.4562) grad_norm 2.6101 (2.0168) loss_scale 2048.0000 (2006.4649) mem 7379MB [2024-08-26 09:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][650/1251] eta 0:02:27 lr 0.000872 wd 0.0500 time 0.2339 (0.2458) data time 0.0011 (0.0019) model time 0.2328 (0.2433) loss 3.6463 (3.4543) grad_norm 3.2416 (2.0185) loss_scale 2048.0000 (2007.1029) mem 7379MB [2024-08-26 09:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][660/1251] eta 0:02:25 lr 0.000872 wd 0.0500 time 0.2449 (0.2457) data time 0.0007 (0.0018) model time 0.2442 (0.2432) loss 2.8010 (3.4478) grad_norm 3.9268 (2.0245) loss_scale 2048.0000 (2007.7216) mem 7379MB [2024-08-26 09:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][670/1251] eta 0:02:22 lr 0.000872 wd 0.0500 time 0.4544 (0.2460) data time 0.0009 (0.0018) model time 0.4535 (0.2435) loss 3.7212 (3.4521) grad_norm 1.4387 (2.0240) loss_scale 2048.0000 (2008.3219) mem 7379MB [2024-08-26 09:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][680/1251] eta 0:02:20 lr 0.000872 wd 0.0500 time 0.2350 (0.2459) data time 0.0009 (0.0018) model time 0.2340 (0.2435) loss 4.2495 (3.4531) grad_norm 2.1210 (2.0272) loss_scale 2048.0000 (2008.9046) mem 7379MB [2024-08-26 09:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][690/1251] eta 0:02:17 lr 0.000872 wd 0.0500 time 0.2422 (0.2458) data time 0.0010 (0.0018) model time 0.2412 (0.2434) loss 3.2984 (3.4591) grad_norm 2.3239 (2.0266) loss_scale 2048.0000 (2009.4703) mem 7379MB [2024-08-26 09:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[85/300][700/1251] eta 0:02:15 lr 0.000872 wd 0.0500 time 0.2418 (0.2457) data time 0.0009 (0.0018) model time 0.2409 (0.2433) loss 3.0233 (3.4568) grad_norm 1.6704 (2.0245) loss_scale 2048.0000 (2010.0200) mem 7379MB [2024-08-26 09:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][710/1251] eta 0:02:12 lr 0.000872 wd 0.0500 time 0.2466 (0.2457) data time 0.0008 (0.0018) model time 0.2459 (0.2433) loss 2.5461 (3.4571) grad_norm 3.1971 (2.0280) loss_scale 2048.0000 (2010.5541) mem 7379MB [2024-08-26 09:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][720/1251] eta 0:02:10 lr 0.000872 wd 0.0500 time 0.2408 (0.2456) data time 0.0007 (0.0018) model time 0.2401 (0.2432) loss 4.0058 (3.4575) grad_norm 2.1402 (2.0286) loss_scale 2048.0000 (2011.0735) mem 7379MB [2024-08-26 09:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][730/1251] eta 0:02:07 lr 0.000872 wd 0.0500 time 0.2357 (0.2455) data time 0.0008 (0.0018) model time 0.2349 (0.2431) loss 2.5338 (3.4552) grad_norm 2.4092 (2.0286) loss_scale 2048.0000 (2011.5787) mem 7379MB [2024-08-26 09:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][740/1251] eta 0:02:05 lr 0.000872 wd 0.0500 time 0.2388 (0.2455) data time 0.0011 (0.0018) model time 0.2376 (0.2431) loss 3.7470 (3.4541) grad_norm 2.3381 (2.0303) loss_scale 2048.0000 (2012.0702) mem 7379MB [2024-08-26 09:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][750/1251] eta 0:02:02 lr 0.000872 wd 0.0500 time 0.2443 (0.2454) data time 0.0008 (0.0017) model time 0.2435 (0.2430) loss 4.3610 (3.4574) grad_norm 1.6236 (2.0337) loss_scale 2048.0000 (2012.5486) mem 7379MB [2024-08-26 09:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][760/1251] eta 0:02:00 lr 0.000872 wd 0.0500 time 0.2475 (0.2454) data time 0.0009 (0.0017) model time 0.2465 (0.2430) loss 3.4891 (3.4575) grad_norm 3.0145 (2.0394) loss_scale 2048.0000 
(2013.0145) mem 7379MB [2024-08-26 09:37:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][770/1251] eta 0:01:57 lr 0.000872 wd 0.0500 time 0.2451 (0.2453) data time 0.0010 (0.0017) model time 0.2441 (0.2430) loss 3.0749 (3.4582) grad_norm 2.1508 (2.0375) loss_scale 2048.0000 (2013.4682) mem 7379MB [2024-08-26 09:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][780/1251] eta 0:01:55 lr 0.000872 wd 0.0500 time 0.2462 (0.2452) data time 0.0010 (0.0017) model time 0.2452 (0.2429) loss 3.5605 (3.4571) grad_norm 1.6798 (2.0353) loss_scale 2048.0000 (2013.9104) mem 7379MB [2024-08-26 09:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][790/1251] eta 0:01:53 lr 0.000872 wd 0.0500 time 0.2479 (0.2452) data time 0.0010 (0.0017) model time 0.2468 (0.2429) loss 2.5889 (3.4567) grad_norm 2.4896 (2.0366) loss_scale 2048.0000 (2014.3413) mem 7379MB [2024-08-26 09:37:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][800/1251] eta 0:01:50 lr 0.000872 wd 0.0500 time 0.2579 (0.2452) data time 0.0007 (0.0017) model time 0.2572 (0.2429) loss 3.9039 (3.4577) grad_norm 2.3690 (2.0356) loss_scale 2048.0000 (2014.7615) mem 7379MB [2024-08-26 09:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][810/1251] eta 0:01:48 lr 0.000872 wd 0.0500 time 0.2438 (0.2451) data time 0.0011 (0.0017) model time 0.2427 (0.2429) loss 2.2508 (3.4525) grad_norm 2.4609 (2.0338) loss_scale 2048.0000 (2015.1714) mem 7379MB [2024-08-26 09:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][820/1251] eta 0:01:45 lr 0.000872 wd 0.0500 time 0.2476 (0.2451) data time 0.0008 (0.0017) model time 0.2468 (0.2429) loss 3.8139 (3.4552) grad_norm 1.6937 (2.0349) loss_scale 2048.0000 (2015.5713) mem 7379MB [2024-08-26 09:37:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][830/1251] eta 0:01:43 lr 0.000872 wd 0.0500 time 0.2419 (0.2451) data time 0.0007 (0.0017) 
model time 0.2412 (0.2428) loss 2.8225 (3.4559) grad_norm 2.1689 (2.0352) loss_scale 2048.0000 (2015.9615) mem 7379MB [2024-08-26 09:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][840/1251] eta 0:01:40 lr 0.000872 wd 0.0500 time 0.2337 (0.2450) data time 0.0008 (0.0017) model time 0.2329 (0.2428) loss 4.0758 (3.4583) grad_norm 2.7066 (2.0363) loss_scale 2048.0000 (2016.3424) mem 7379MB [2024-08-26 09:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][850/1251] eta 0:01:38 lr 0.000872 wd 0.0500 time 0.2305 (0.2450) data time 0.0009 (0.0017) model time 0.2295 (0.2428) loss 3.9291 (3.4606) grad_norm 1.8683 (2.0335) loss_scale 2048.0000 (2016.7145) mem 7379MB [2024-08-26 09:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][860/1251] eta 0:01:35 lr 0.000872 wd 0.0500 time 0.2396 (0.2449) data time 0.0007 (0.0017) model time 0.2389 (0.2427) loss 2.6111 (3.4589) grad_norm 1.7552 (2.0330) loss_scale 2048.0000 (2017.0778) mem 7379MB [2024-08-26 09:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][870/1251] eta 0:01:33 lr 0.000872 wd 0.0500 time 0.2399 (0.2449) data time 0.0010 (0.0016) model time 0.2389 (0.2427) loss 3.6670 (3.4592) grad_norm 2.4240 (2.0342) loss_scale 2048.0000 (2017.4328) mem 7379MB [2024-08-26 09:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][880/1251] eta 0:01:30 lr 0.000871 wd 0.0500 time 0.2444 (0.2449) data time 0.0011 (0.0016) model time 0.2433 (0.2427) loss 2.6471 (3.4560) grad_norm 2.3380 (2.0374) loss_scale 2048.0000 (2017.7798) mem 7379MB [2024-08-26 09:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][890/1251] eta 0:01:28 lr 0.000871 wd 0.0500 time 0.2463 (0.2451) data time 0.0007 (0.0016) model time 0.2457 (0.2429) loss 3.4466 (3.4571) grad_norm 1.4613 (2.0396) loss_scale 2048.0000 (2018.1190) mem 7379MB [2024-08-26 09:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[85/300][900/1251] eta 0:01:26 lr 0.000871 wd 0.0500 time 0.2439 (0.2453) data time 0.0010 (0.0016) model time 0.2429 (0.2432) loss 4.1181 (3.4596) grad_norm 2.1150 (2.0408) loss_scale 2048.0000 (2018.4506) mem 7379MB [2024-08-26 09:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][910/1251] eta 0:01:23 lr 0.000871 wd 0.0500 time 0.2375 (0.2452) data time 0.0011 (0.0016) model time 0.2364 (0.2431) loss 3.6506 (3.4567) grad_norm 1.8564 (2.0413) loss_scale 2048.0000 (2018.7750) mem 7379MB [2024-08-26 09:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][920/1251] eta 0:01:21 lr 0.000871 wd 0.0500 time 0.2434 (0.2452) data time 0.0007 (0.0016) model time 0.2427 (0.2431) loss 3.0947 (3.4592) grad_norm 2.3467 (2.0414) loss_scale 2048.0000 (2019.0923) mem 7379MB [2024-08-26 09:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][930/1251] eta 0:01:18 lr 0.000871 wd 0.0500 time 0.2448 (0.2453) data time 0.0010 (0.0016) model time 0.2437 (0.2432) loss 3.9835 (3.4614) grad_norm 1.9845 (2.0402) loss_scale 2048.0000 (2019.4028) mem 7379MB [2024-08-26 09:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][940/1251] eta 0:01:16 lr 0.000871 wd 0.0500 time 0.2421 (0.2455) data time 0.0009 (0.0016) model time 0.2412 (0.2435) loss 3.2430 (3.4624) grad_norm 2.5061 (2.0419) loss_scale 2048.0000 (2019.7067) mem 7379MB [2024-08-26 09:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][950/1251] eta 0:01:13 lr 0.000871 wd 0.0500 time 0.2427 (0.2455) data time 0.0008 (0.0016) model time 0.2419 (0.2434) loss 2.5428 (3.4605) grad_norm 3.0458 (2.0472) loss_scale 2048.0000 (2020.0042) mem 7379MB [2024-08-26 09:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][960/1251] eta 0:01:11 lr 0.000871 wd 0.0500 time 0.2402 (0.2454) data time 0.0011 (0.0016) model time 0.2391 (0.2434) loss 3.3976 (3.4614) grad_norm 1.8670 (2.0460) loss_scale 2048.0000 
(2020.2955) mem 7379MB [2024-08-26 09:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][970/1251] eta 0:01:08 lr 0.000871 wd 0.0500 time 0.2368 (0.2454) data time 0.0009 (0.0016) model time 0.2359 (0.2434) loss 3.1836 (3.4587) grad_norm 1.8648 (2.0476) loss_scale 2048.0000 (2020.5808) mem 7379MB [2024-08-26 09:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][980/1251] eta 0:01:06 lr 0.000871 wd 0.0500 time 0.2409 (0.2454) data time 0.0010 (0.0016) model time 0.2399 (0.2434) loss 2.3043 (3.4580) grad_norm 1.7094 (2.0454) loss_scale 2048.0000 (2020.8603) mem 7379MB [2024-08-26 09:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][990/1251] eta 0:01:04 lr 0.000871 wd 0.0500 time 0.4559 (0.2455) data time 0.0009 (0.0016) model time 0.4549 (0.2435) loss 3.7410 (3.4579) grad_norm 1.8543 (2.0464) loss_scale 2048.0000 (2021.1342) mem 7379MB [2024-08-26 09:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1000/1251] eta 0:01:01 lr 0.000871 wd 0.0500 time 0.2476 (0.2455) data time 0.0009 (0.0016) model time 0.2466 (0.2435) loss 3.5082 (3.4573) grad_norm 1.8702 (2.0428) loss_scale 2048.0000 (2021.4026) mem 7379MB [2024-08-26 09:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1010/1251] eta 0:00:59 lr 0.000871 wd 0.0500 time 0.2420 (0.2455) data time 0.0010 (0.0016) model time 0.2410 (0.2435) loss 2.3806 (3.4576) grad_norm 2.1307 (2.0426) loss_scale 2048.0000 (2021.6657) mem 7379MB [2024-08-26 09:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1020/1251] eta 0:00:56 lr 0.000871 wd 0.0500 time 0.2401 (0.2454) data time 0.0007 (0.0016) model time 0.2394 (0.2434) loss 3.7023 (3.4560) grad_norm 2.3701 (2.0410) loss_scale 2048.0000 (2021.9236) mem 7379MB [2024-08-26 09:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1030/1251] eta 0:00:54 lr 0.000871 wd 0.0500 time 0.2379 (0.2454) data time 0.0009 
(0.0016) model time 0.2370 (0.2434) loss 2.9719 (3.4549) grad_norm 2.5962 (2.0409) loss_scale 2048.0000 (2022.1765) mem 7379MB [2024-08-26 09:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1040/1251] eta 0:00:51 lr 0.000871 wd 0.0500 time 0.2438 (0.2454) data time 0.0012 (0.0015) model time 0.2426 (0.2434) loss 3.3872 (3.4549) grad_norm 1.6200 (2.0413) loss_scale 2048.0000 (2022.4246) mem 7379MB [2024-08-26 09:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1050/1251] eta 0:00:49 lr 0.000871 wd 0.0500 time 0.2440 (0.2455) data time 0.0008 (0.0015) model time 0.2432 (0.2436) loss 2.6162 (3.4510) grad_norm 2.3708 (2.0417) loss_scale 2048.0000 (2022.6679) mem 7379MB [2024-08-26 09:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1060/1251] eta 0:00:46 lr 0.000871 wd 0.0500 time 0.2355 (0.2455) data time 0.0008 (0.0015) model time 0.2347 (0.2435) loss 4.1890 (3.4523) grad_norm 1.2353 (2.0394) loss_scale 2048.0000 (2022.9067) mem 7379MB [2024-08-26 09:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1070/1251] eta 0:00:44 lr 0.000871 wd 0.0500 time 0.2386 (0.2456) data time 0.0010 (0.0015) model time 0.2376 (0.2437) loss 3.9841 (3.4525) grad_norm 1.6127 (2.0393) loss_scale 2048.0000 (2023.1410) mem 7379MB [2024-08-26 09:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1080/1251] eta 0:00:42 lr 0.000871 wd 0.0500 time 0.2417 (0.2457) data time 0.0007 (0.0015) model time 0.2410 (0.2438) loss 3.9685 (3.4530) grad_norm 2.7159 (2.0409) loss_scale 2048.0000 (2023.3710) mem 7379MB [2024-08-26 09:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1090/1251] eta 0:00:39 lr 0.000871 wd 0.0500 time 0.2391 (0.2456) data time 0.0007 (0.0015) model time 0.2384 (0.2437) loss 3.0386 (3.4527) grad_norm 1.4573 (2.0384) loss_scale 2048.0000 (2023.5967) mem 7379MB [2024-08-26 09:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [85/300][1100/1251] eta 0:00:37 lr 0.000871 wd 0.0500 time 0.2445 (0.2456) data time 0.0007 (0.0015) model time 0.2438 (0.2437) loss 3.8466 (3.4520) grad_norm 2.2030 (2.0370) loss_scale 2048.0000 (2023.8183) mem 7379MB [2024-08-26 09:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1110/1251] eta 0:00:34 lr 0.000871 wd 0.0500 time 0.2460 (0.2455) data time 0.0009 (0.0015) model time 0.2452 (0.2436) loss 4.7109 (3.4536) grad_norm 2.5877 (2.0370) loss_scale 2048.0000 (2024.0360) mem 7379MB [2024-08-26 09:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1120/1251] eta 0:00:32 lr 0.000871 wd 0.0500 time 0.2373 (0.2455) data time 0.0007 (0.0015) model time 0.2366 (0.2436) loss 3.9232 (3.4547) grad_norm 2.0716 (2.0373) loss_scale 2048.0000 (2024.2498) mem 7379MB [2024-08-26 09:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1130/1251] eta 0:00:29 lr 0.000871 wd 0.0500 time 0.2376 (0.2455) data time 0.0009 (0.0015) model time 0.2368 (0.2436) loss 3.9559 (3.4549) grad_norm 1.9190 (2.0363) loss_scale 2048.0000 (2024.4598) mem 7379MB [2024-08-26 09:39:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1140/1251] eta 0:00:27 lr 0.000871 wd 0.0500 time 0.2314 (0.2454) data time 0.0012 (0.0015) model time 0.2302 (0.2435) loss 3.9841 (3.4561) grad_norm 1.7261 (2.0383) loss_scale 2048.0000 (2024.6661) mem 7379MB [2024-08-26 09:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1150/1251] eta 0:00:24 lr 0.000871 wd 0.0500 time 0.2394 (0.2454) data time 0.0007 (0.0015) model time 0.2387 (0.2435) loss 3.8579 (3.4554) grad_norm 2.1307 (2.0381) loss_scale 2048.0000 (2024.8688) mem 7379MB [2024-08-26 09:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1160/1251] eta 0:00:22 lr 0.000871 wd 0.0500 time 0.2382 (0.2453) data time 0.0009 (0.0015) model time 0.2373 (0.2435) loss 3.6993 (3.4572) grad_norm 1.9562 (2.0465) loss_scale 
2048.0000 (2025.0680) mem 7379MB [2024-08-26 09:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1170/1251] eta 0:00:19 lr 0.000871 wd 0.0500 time 0.2440 (0.2455) data time 0.0007 (0.0015) model time 0.2433 (0.2436) loss 2.4176 (3.4555) grad_norm 1.7681 (2.0567) loss_scale 2048.0000 (2025.2639) mem 7379MB [2024-08-26 09:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1180/1251] eta 0:00:17 lr 0.000871 wd 0.0500 time 0.2459 (0.2455) data time 0.0007 (0.0015) model time 0.2452 (0.2436) loss 3.7903 (3.4549) grad_norm 1.6146 (2.0574) loss_scale 2048.0000 (2025.4564) mem 7379MB [2024-08-26 09:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1190/1251] eta 0:00:14 lr 0.000871 wd 0.0500 time 0.2393 (0.2454) data time 0.0009 (0.0015) model time 0.2384 (0.2436) loss 3.1521 (3.4548) grad_norm 1.6101 (2.0569) loss_scale 2048.0000 (2025.6457) mem 7379MB [2024-08-26 09:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1200/1251] eta 0:00:12 lr 0.000871 wd 0.0500 time 0.2445 (0.2454) data time 0.0009 (0.0015) model time 0.2435 (0.2436) loss 4.0017 (3.4563) grad_norm 2.4599 (2.0568) loss_scale 2048.0000 (2025.8318) mem 7379MB [2024-08-26 09:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1210/1251] eta 0:00:10 lr 0.000870 wd 0.0500 time 0.2402 (0.2453) data time 0.0010 (0.0015) model time 0.2392 (0.2435) loss 3.1164 (3.4584) grad_norm 1.7597 (2.0577) loss_scale 2048.0000 (2026.0149) mem 7379MB [2024-08-26 09:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1220/1251] eta 0:00:07 lr 0.000870 wd 0.0500 time 0.2479 (0.2453) data time 0.0009 (0.0015) model time 0.2470 (0.2435) loss 3.9892 (3.4592) grad_norm 2.4195 (2.0560) loss_scale 2048.0000 (2026.1949) mem 7379MB [2024-08-26 09:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1230/1251] eta 0:00:05 lr 0.000870 wd 0.0500 time 0.2474 (0.2453) data time 
0.0010 (0.0015) model time 0.2464 (0.2435) loss 2.7811 (3.4587) grad_norm 1.5215 (2.0554) loss_scale 2048.0000 (2026.3721) mem 7379MB [2024-08-26 09:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1240/1251] eta 0:00:02 lr 0.000870 wd 0.0500 time 0.2249 (0.2452) data time 0.0005 (0.0015) model time 0.2245 (0.2434) loss 3.9381 (3.4580) grad_norm 1.9344 (2.0528) loss_scale 2048.0000 (2026.5463) mem 7379MB [2024-08-26 09:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [85/300][1250/1251] eta 0:00:00 lr 0.000870 wd 0.0500 time 0.2239 (0.2450) data time 0.0007 (0.0015) model time 0.2232 (0.2432) loss 2.9696 (3.4572) grad_norm 1.6822 (2.0533) loss_scale 2048.0000 (2026.7178) mem 7379MB [2024-08-26 09:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 85 training takes 0:05:06 [2024-08-26 09:39:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:39:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 09:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.474 (0.474) Loss 0.5103 (0.5103) Acc@1 90.723 (90.723) Acc@5 97.852 (97.852) Mem 7379MB [2024-08-26 09:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.112) Loss 0.7891 (0.8205) Acc@1 82.910 (81.658) Acc@5 95.898 (95.969) Mem 7379MB [2024-08-26 09:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.096) Loss 1.1035 (0.8434) Acc@1 74.121 (80.915) Acc@5 93.164 (95.987) Mem 7379MB [2024-08-26 09:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.091) Loss 1.3975 (0.9617) Acc@1 66.699 (78.267) Acc@5 89.258 (94.437) Mem 7379MB [2024-08-26 09:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.2656 (1.0218) Acc@1 70.703 (76.851) Acc@5 90.820 (93.767) Mem 7379MB [2024-08-26 09:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.388 Acc@5 93.676 [2024-08-26 09:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.4% [2024-08-26 09:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.836 (0.836) Loss 0.4565 (0.4565) Acc@1 91.797 (91.797) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 09:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.150) Loss 0.7266 (0.7078) Acc@1 86.133 (84.908) Acc@5 96.289 (96.955) Mem 7379MB [2024-08-26 09:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.116) Loss 1.0098 (0.7308) Acc@1 75.781 (83.766) Acc@5 94.141 (96.917) Mem 7379MB [2024-08-26 09:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.104) Loss 1.2959 (0.8347) Acc@1 67.383 (81.297) Acc@5 90.723 (95.691) Mem 7379MB [2024-08-26 09:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.1641 
(0.8897) Acc@1 71.680 (79.795) Acc@5 91.992 (95.117) Mem 7379MB [2024-08-26 09:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.378 Acc@5 95.054 [2024-08-26 09:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.4% [2024-08-26 09:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.38% [2024-08-26 09:39:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:39:47 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 09:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][0/1251] eta 0:20:32 lr 0.000870 wd 0.0500 time 0.9848 (0.9848) data time 0.5291 (0.5291) model time 0.0000 (0.0000) loss 4.1641 (4.1641) grad_norm 1.8072 (1.8072) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][10/1251] eta 0:06:22 lr 0.000870 wd 0.0500 time 0.2339 (0.3081) data time 0.0012 (0.0491) model time 0.0000 (0.0000) loss 3.7157 (3.7655) grad_norm 1.6853 (1.6953) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][20/1251] eta 0:05:42 lr 0.000870 wd 0.0500 time 0.2476 (0.2781) data time 0.0007 (0.0264) model time 0.0000 (0.0000) loss 4.2463 (3.7290) grad_norm 1.9114 (1.7861) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][30/1251] eta 0:05:25 lr 0.000870 wd 0.0500 time 0.2399 (0.2663) data time 0.0011 (0.0182) model time 0.0000 (0.0000) loss 3.4600 (3.6789) grad_norm 2.8333 (1.8674) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][40/1251] eta 0:05:15 
lr 0.000870 wd 0.0500 time 0.2395 (0.2607) data time 0.0011 (0.0141) model time 0.0000 (0.0000) loss 3.6565 (3.5697) grad_norm 2.0269 (1.8998) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][50/1251] eta 0:05:09 lr 0.000870 wd 0.0500 time 0.2461 (0.2577) data time 0.0007 (0.0115) model time 0.0000 (0.0000) loss 3.8144 (3.5448) grad_norm 2.1866 (1.9985) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][60/1251] eta 0:05:04 lr 0.000870 wd 0.0500 time 0.2448 (0.2554) data time 0.0010 (0.0098) model time 0.2438 (0.2424) loss 2.3425 (3.5054) grad_norm 2.4623 (1.9862) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][70/1251] eta 0:04:59 lr 0.000870 wd 0.0500 time 0.2465 (0.2534) data time 0.0010 (0.0085) model time 0.2455 (0.2414) loss 3.3632 (3.4800) grad_norm 2.0256 (2.0458) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][80/1251] eta 0:04:54 lr 0.000870 wd 0.0500 time 0.2450 (0.2518) data time 0.0007 (0.0076) model time 0.2443 (0.2408) loss 4.2316 (3.4677) grad_norm 2.7166 (2.0606) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][90/1251] eta 0:04:50 lr 0.000870 wd 0.0500 time 0.2414 (0.2505) data time 0.0009 (0.0069) model time 0.2405 (0.2403) loss 4.1491 (3.4762) grad_norm 1.7201 (2.0585) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][100/1251] eta 0:04:47 lr 0.000870 wd 0.0500 time 0.2443 (0.2498) data time 0.0012 (0.0063) model time 0.2431 (0.2406) loss 3.6169 (3.4554) grad_norm 1.5427 (2.0623) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:14 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][110/1251] eta 0:04:44 lr 0.000870 wd 0.0500 time 0.2430 (0.2491) data time 0.0012 (0.0058) model time 0.2418 (0.2407) loss 2.4471 (3.4318) grad_norm 1.8482 (2.0288) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][120/1251] eta 0:04:40 lr 0.000870 wd 0.0500 time 0.2392 (0.2484) data time 0.0009 (0.0054) model time 0.2383 (0.2406) loss 3.6628 (3.4244) grad_norm 1.9917 (2.0216) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][130/1251] eta 0:04:37 lr 0.000870 wd 0.0500 time 0.2407 (0.2479) data time 0.0009 (0.0051) model time 0.2398 (0.2406) loss 2.9149 (3.4160) grad_norm 1.9617 (2.0471) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][140/1251] eta 0:04:34 lr 0.000870 wd 0.0500 time 0.2454 (0.2475) data time 0.0010 (0.0048) model time 0.2444 (0.2406) loss 2.4001 (3.4137) grad_norm 2.2777 (2.0442) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][150/1251] eta 0:04:32 lr 0.000870 wd 0.0500 time 0.2370 (0.2472) data time 0.0008 (0.0046) model time 0.2362 (0.2407) loss 3.3878 (3.4263) grad_norm 2.5010 (2.0534) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][160/1251] eta 0:04:29 lr 0.000870 wd 0.0500 time 0.2418 (0.2468) data time 0.0007 (0.0043) model time 0.2411 (0.2407) loss 2.6622 (3.4225) grad_norm 2.8231 (2.0702) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][170/1251] eta 0:04:26 lr 0.000870 wd 0.0500 time 0.2373 (0.2465) data time 0.0009 (0.0042) model time 0.2364 (0.2407) loss 2.8605 
(3.4204) grad_norm 1.6915 (2.0830) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][180/1251] eta 0:04:23 lr 0.000870 wd 0.0500 time 0.2290 (0.2463) data time 0.0010 (0.0040) model time 0.2281 (0.2408) loss 3.6449 (3.4082) grad_norm 2.2996 (2.0759) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][190/1251] eta 0:04:21 lr 0.000870 wd 0.0500 time 0.2412 (0.2462) data time 0.0010 (0.0038) model time 0.2401 (0.2409) loss 3.6737 (3.4143) grad_norm 2.9356 (2.0777) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][200/1251] eta 0:04:19 lr 0.000870 wd 0.0500 time 0.2378 (0.2469) data time 0.0010 (0.0037) model time 0.2368 (0.2423) loss 3.7897 (3.4308) grad_norm 1.9300 (2.0782) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][210/1251] eta 0:04:16 lr 0.000870 wd 0.0500 time 0.2294 (0.2466) data time 0.0011 (0.0035) model time 0.2283 (0.2420) loss 3.2448 (3.4425) grad_norm 1.7229 (2.1007) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][220/1251] eta 0:04:13 lr 0.000870 wd 0.0500 time 0.2470 (0.2463) data time 0.0013 (0.0034) model time 0.2457 (0.2419) loss 3.4125 (3.4457) grad_norm 2.0591 (2.0995) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][230/1251] eta 0:04:11 lr 0.000870 wd 0.0500 time 0.2308 (0.2460) data time 0.0008 (0.0033) model time 0.2300 (0.2417) loss 2.2608 (3.4545) grad_norm 1.8647 (2.0972) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][240/1251] eta 0:04:08 lr 0.000870 wd 
0.0500 time 0.2496 (0.2460) data time 0.0008 (0.0032) model time 0.2487 (0.2418) loss 3.7020 (3.4550) grad_norm 4.1600 (2.1183) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][250/1251] eta 0:04:06 lr 0.000870 wd 0.0500 time 0.2401 (0.2459) data time 0.0009 (0.0031) model time 0.2392 (0.2419) loss 3.1328 (3.4477) grad_norm 2.0643 (2.1158) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][260/1251] eta 0:04:03 lr 0.000870 wd 0.0500 time 0.2359 (0.2459) data time 0.0007 (0.0031) model time 0.2352 (0.2420) loss 3.0082 (3.4390) grad_norm 2.0096 (2.1337) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][270/1251] eta 0:04:01 lr 0.000870 wd 0.0500 time 0.4600 (0.2465) data time 0.0010 (0.0030) model time 0.4590 (0.2429) loss 3.6622 (3.4455) grad_norm 2.4342 (2.1349) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][280/1251] eta 0:03:59 lr 0.000870 wd 0.0500 time 0.2399 (0.2463) data time 0.0010 (0.0029) model time 0.2389 (0.2427) loss 2.7886 (3.4423) grad_norm 1.6875 (2.1370) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][290/1251] eta 0:03:56 lr 0.000869 wd 0.0500 time 0.2345 (0.2461) data time 0.0012 (0.0028) model time 0.2333 (0.2427) loss 2.9118 (3.4422) grad_norm 2.5881 (2.1503) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][300/1251] eta 0:03:54 lr 0.000869 wd 0.0500 time 0.2459 (0.2461) data time 0.0010 (0.0028) model time 0.2449 (0.2427) loss 3.9103 (3.4448) grad_norm 1.3933 (2.1424) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][310/1251] eta 0:03:52 lr 0.000869 wd 0.0500 time 0.2427 (0.2467) data time 0.0010 (0.0027) model time 0.2417 (0.2435) loss 2.8057 (3.4407) grad_norm 1.5078 (2.1361) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][320/1251] eta 0:03:49 lr 0.000869 wd 0.0500 time 0.2430 (0.2466) data time 0.0010 (0.0027) model time 0.2419 (0.2435) loss 3.7234 (3.4512) grad_norm 2.1436 (2.1277) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][330/1251] eta 0:03:47 lr 0.000869 wd 0.0500 time 0.2405 (0.2471) data time 0.0009 (0.0026) model time 0.2396 (0.2441) loss 4.1650 (3.4481) grad_norm 1.8718 (2.1259) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][340/1251] eta 0:03:45 lr 0.000869 wd 0.0500 time 0.3593 (0.2478) data time 0.0013 (0.0026) model time 0.3580 (0.2450) loss 2.6958 (3.4488) grad_norm 1.5613 (2.1128) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][350/1251] eta 0:03:43 lr 0.000869 wd 0.0500 time 0.2339 (0.2476) data time 0.0011 (0.0025) model time 0.2328 (0.2448) loss 3.6188 (3.4550) grad_norm 1.6150 (2.1031) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][360/1251] eta 0:03:40 lr 0.000869 wd 0.0500 time 0.2550 (0.2474) data time 0.0009 (0.0025) model time 0.2541 (0.2447) loss 3.0082 (3.4511) grad_norm 2.4498 (2.1025) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][370/1251] eta 0:03:37 lr 0.000869 wd 0.0500 time 0.2448 (0.2473) data time 0.0007 (0.0025) model time 0.2440 (0.2446) loss 4.0088 
(3.4513) grad_norm 2.6763 (2.1000) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][380/1251] eta 0:03:35 lr 0.000869 wd 0.0500 time 0.2345 (0.2472) data time 0.0010 (0.0024) model time 0.2335 (0.2445) loss 3.6176 (3.4492) grad_norm 2.3645 (2.1111) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][390/1251] eta 0:03:33 lr 0.000869 wd 0.0500 time 0.4445 (0.2476) data time 0.0007 (0.0024) model time 0.4438 (0.2450) loss 3.1378 (3.4497) grad_norm 1.7521 (2.1046) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][400/1251] eta 0:03:30 lr 0.000869 wd 0.0500 time 0.2454 (0.2475) data time 0.0010 (0.0024) model time 0.2445 (0.2449) loss 3.5263 (3.4494) grad_norm 1.9998 (2.1033) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][410/1251] eta 0:03:28 lr 0.000869 wd 0.0500 time 0.2419 (0.2478) data time 0.0010 (0.0023) model time 0.2409 (0.2453) loss 3.6344 (3.4547) grad_norm 3.2125 (2.1022) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][420/1251] eta 0:03:25 lr 0.000869 wd 0.0500 time 0.2392 (0.2477) data time 0.0010 (0.0023) model time 0.2382 (0.2453) loss 3.5306 (3.4587) grad_norm 1.9462 (2.1038) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][430/1251] eta 0:03:23 lr 0.000869 wd 0.0500 time 0.2460 (0.2476) data time 0.0010 (0.0023) model time 0.2450 (0.2452) loss 3.8787 (3.4590) grad_norm 1.4593 (2.1038) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][440/1251] eta 0:03:20 lr 0.000869 wd 
0.0500 time 0.2396 (0.2475) data time 0.0010 (0.0023) model time 0.2386 (0.2451) loss 3.6653 (3.4596) grad_norm 1.3126 (2.0980) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][450/1251] eta 0:03:18 lr 0.000869 wd 0.0500 time 0.2462 (0.2474) data time 0.0010 (0.0022) model time 0.2453 (0.2450) loss 4.2120 (3.4596) grad_norm 1.4776 (2.0893) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][460/1251] eta 0:03:15 lr 0.000869 wd 0.0500 time 0.2476 (0.2478) data time 0.0008 (0.0022) model time 0.2468 (0.2455) loss 3.7832 (3.4538) grad_norm 1.6177 (2.0843) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][470/1251] eta 0:03:13 lr 0.000869 wd 0.0500 time 0.2422 (0.2477) data time 0.0008 (0.0022) model time 0.2415 (0.2454) loss 3.8646 (3.4483) grad_norm 1.4755 (2.0863) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][480/1251] eta 0:03:10 lr 0.000869 wd 0.0500 time 0.2448 (0.2475) data time 0.0008 (0.0022) model time 0.2441 (0.2453) loss 3.5860 (3.4475) grad_norm 1.7255 (2.0860) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][490/1251] eta 0:03:08 lr 0.000869 wd 0.0500 time 0.2343 (0.2474) data time 0.0013 (0.0021) model time 0.2330 (0.2452) loss 3.8384 (3.4550) grad_norm 1.5816 (2.0863) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][500/1251] eta 0:03:05 lr 0.000869 wd 0.0500 time 0.2374 (0.2473) data time 0.0009 (0.0021) model time 0.2365 (0.2450) loss 3.8275 (3.4568) grad_norm 1.6232 (2.0832) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:53 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][510/1251] eta 0:03:03 lr 0.000869 wd 0.0500 time 0.2499 (0.2472) data time 0.0009 (0.0021) model time 0.2489 (0.2449) loss 3.1478 (3.4563) grad_norm 1.8737 (2.0834) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][520/1251] eta 0:03:00 lr 0.000869 wd 0.0500 time 0.2478 (0.2471) data time 0.0010 (0.0021) model time 0.2468 (0.2449) loss 3.2290 (3.4573) grad_norm 2.0920 (2.0880) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][530/1251] eta 0:02:58 lr 0.000869 wd 0.0500 time 0.2443 (0.2470) data time 0.0011 (0.0021) model time 0.2432 (0.2448) loss 3.2501 (3.4586) grad_norm 1.5436 (2.0907) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][540/1251] eta 0:02:55 lr 0.000869 wd 0.0500 time 0.2419 (0.2468) data time 0.0011 (0.0020) model time 0.2408 (0.2447) loss 3.6079 (3.4568) grad_norm 1.3084 (2.0865) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][550/1251] eta 0:02:52 lr 0.000869 wd 0.0500 time 0.2381 (0.2467) data time 0.0010 (0.0020) model time 0.2371 (0.2445) loss 4.1126 (3.4585) grad_norm 1.7167 (2.0815) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][560/1251] eta 0:02:50 lr 0.000869 wd 0.0500 time 0.2432 (0.2466) data time 0.0011 (0.0020) model time 0.2421 (0.2444) loss 3.2730 (3.4575) grad_norm 2.0967 (2.0767) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][570/1251] eta 0:02:47 lr 0.000869 wd 0.0500 time 0.2502 (0.2465) data time 0.0007 (0.0020) model time 0.2495 (0.2443) loss 3.6542 
(3.4591) grad_norm 1.5370 (2.0744) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][580/1251] eta 0:02:45 lr 0.000869 wd 0.0500 time 0.2537 (0.2464) data time 0.0007 (0.0020) model time 0.2530 (0.2443) loss 4.1968 (3.4590) grad_norm 1.8528 (2.0719) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][590/1251] eta 0:02:42 lr 0.000869 wd 0.0500 time 0.2388 (0.2463) data time 0.0009 (0.0019) model time 0.2379 (0.2442) loss 3.6204 (3.4603) grad_norm 1.9486 (2.0727) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][600/1251] eta 0:02:40 lr 0.000869 wd 0.0500 time 0.2426 (0.2463) data time 0.0011 (0.0019) model time 0.2415 (0.2442) loss 3.4779 (3.4576) grad_norm 1.4681 (2.0826) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][610/1251] eta 0:02:37 lr 0.000869 wd 0.0500 time 0.2392 (0.2462) data time 0.0008 (0.0019) model time 0.2384 (0.2441) loss 3.6866 (3.4564) grad_norm 1.7740 (2.0910) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][620/1251] eta 0:02:35 lr 0.000869 wd 0.0500 time 0.2358 (0.2461) data time 0.0011 (0.0019) model time 0.2347 (0.2440) loss 3.6780 (3.4633) grad_norm 1.7893 (2.0891) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][630/1251] eta 0:02:32 lr 0.000868 wd 0.0500 time 0.2444 (0.2460) data time 0.0010 (0.0019) model time 0.2434 (0.2440) loss 3.9523 (3.4639) grad_norm 1.5269 (2.0856) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][640/1251] eta 0:02:30 lr 0.000868 wd 
0.0500 time 0.2383 (0.2459) data time 0.0007 (0.0019) model time 0.2376 (0.2439) loss 3.3979 (3.4625) grad_norm 2.5071 (2.0838) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][650/1251] eta 0:02:27 lr 0.000868 wd 0.0500 time 0.2405 (0.2459) data time 0.0009 (0.0019) model time 0.2396 (0.2438) loss 2.6590 (3.4568) grad_norm 1.8665 (2.0828) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][660/1251] eta 0:02:25 lr 0.000868 wd 0.0500 time 0.2321 (0.2458) data time 0.0007 (0.0018) model time 0.2314 (0.2437) loss 3.4464 (3.4526) grad_norm 1.7266 (2.0816) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][670/1251] eta 0:02:22 lr 0.000868 wd 0.0500 time 0.2412 (0.2457) data time 0.0007 (0.0018) model time 0.2405 (0.2436) loss 2.3946 (3.4537) grad_norm 1.3651 (2.0823) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][680/1251] eta 0:02:20 lr 0.000868 wd 0.0500 time 0.2396 (0.2456) data time 0.0008 (0.0018) model time 0.2388 (0.2436) loss 3.6091 (3.4568) grad_norm 1.9384 (2.0854) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][690/1251] eta 0:02:17 lr 0.000868 wd 0.0500 time 0.2416 (0.2455) data time 0.0012 (0.0018) model time 0.2405 (0.2435) loss 2.5175 (3.4507) grad_norm 1.5981 (2.0842) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][700/1251] eta 0:02:15 lr 0.000868 wd 0.0500 time 0.2323 (0.2455) data time 0.0011 (0.0018) model time 0.2313 (0.2435) loss 3.3989 (3.4497) grad_norm 1.7117 (2.0828) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:41 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][710/1251] eta 0:02:12 lr 0.000868 wd 0.0500 time 0.2358 (0.2454) data time 0.0010 (0.0018) model time 0.2347 (0.2434) loss 3.3627 (3.4475) grad_norm 2.4015 (2.0889) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][720/1251] eta 0:02:10 lr 0.000868 wd 0.0500 time 0.2389 (0.2453) data time 0.0009 (0.0018) model time 0.2380 (0.2434) loss 3.4640 (3.4447) grad_norm 1.8631 (2.0893) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][730/1251] eta 0:02:07 lr 0.000868 wd 0.0500 time 0.2485 (0.2456) data time 0.0008 (0.0018) model time 0.2477 (0.2436) loss 3.7760 (3.4440) grad_norm 1.6749 (2.0839) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][740/1251] eta 0:02:05 lr 0.000868 wd 0.0500 time 0.2346 (0.2455) data time 0.0011 (0.0018) model time 0.2335 (0.2435) loss 3.5324 (3.4448) grad_norm 1.7054 (2.0770) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][750/1251] eta 0:02:02 lr 0.000868 wd 0.0500 time 0.2361 (0.2454) data time 0.0008 (0.0018) model time 0.2353 (0.2435) loss 4.4513 (3.4467) grad_norm 1.6841 (2.0726) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][760/1251] eta 0:02:00 lr 0.000868 wd 0.0500 time 0.2380 (0.2454) data time 0.0012 (0.0017) model time 0.2368 (0.2434) loss 4.0986 (3.4517) grad_norm 1.9111 (2.0705) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][770/1251] eta 0:01:57 lr 0.000868 wd 0.0500 time 0.2402 (0.2453) data time 0.0009 (0.0017) model time 0.2393 (0.2434) loss 3.7932 
(3.4497) grad_norm 2.0864 (2.0685) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 09:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][780/1251] eta 0:01:55 lr 0.000868 wd 0.0500 time 0.2309 (0.2452) data time 0.0011 (0.0017) model time 0.2298 (0.2433) loss 4.2059 (3.4516) grad_norm 1.7589 (2.0712) loss_scale 4096.0000 (2063.7337) mem 7379MB [2024-08-26 09:43:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][790/1251] eta 0:01:53 lr 0.000868 wd 0.0500 time 0.4095 (0.2454) data time 0.0011 (0.0017) model time 0.4084 (0.2435) loss 2.9966 (3.4529) grad_norm 2.2141 (2.0685) loss_scale 4096.0000 (2089.4260) mem 7379MB [2024-08-26 09:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][800/1251] eta 0:01:50 lr 0.000868 wd 0.0500 time 0.2426 (0.2453) data time 0.0008 (0.0017) model time 0.2418 (0.2434) loss 3.0134 (3.4518) grad_norm 2.1125 (2.0675) loss_scale 4096.0000 (2114.4769) mem 7379MB [2024-08-26 09:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][810/1251] eta 0:01:48 lr 0.000868 wd 0.0500 time 0.2274 (0.2452) data time 0.0008 (0.0017) model time 0.2265 (0.2433) loss 2.2533 (3.4486) grad_norm 2.1661 (2.0687) loss_scale 4096.0000 (2138.9100) mem 7379MB [2024-08-26 09:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][820/1251] eta 0:01:45 lr 0.000868 wd 0.0500 time 0.2434 (0.2452) data time 0.0009 (0.0017) model time 0.2425 (0.2433) loss 4.0303 (3.4512) grad_norm 2.3398 (2.0701) loss_scale 4096.0000 (2162.7479) mem 7379MB [2024-08-26 09:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][830/1251] eta 0:01:43 lr 0.000868 wd 0.0500 time 0.2342 (0.2451) data time 0.0010 (0.0017) model time 0.2333 (0.2432) loss 4.0861 (3.4527) grad_norm 2.4663 (2.0699) loss_scale 4096.0000 (2186.0120) mem 7379MB [2024-08-26 09:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][840/1251] eta 0:01:40 lr 0.000868 wd 
0.0500 time 0.2461 (0.2453) data time 0.0008 (0.0017) model time 0.2454 (0.2435) loss 4.1994 (3.4581) grad_norm 1.4737 (2.0675) loss_scale 4096.0000 (2208.7229) mem 7379MB [2024-08-26 09:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][850/1251] eta 0:01:38 lr 0.000868 wd 0.0500 time 0.2474 (0.2453) data time 0.0010 (0.0017) model time 0.2464 (0.2435) loss 2.8788 (3.4551) grad_norm 1.8548 (2.0664) loss_scale 4096.0000 (2230.9001) mem 7379MB [2024-08-26 09:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][860/1251] eta 0:01:36 lr 0.000868 wd 0.0500 time 0.4600 (0.2458) data time 0.0012 (0.0017) model time 0.4587 (0.2440) loss 3.9924 (3.4547) grad_norm 1.6270 (2.0654) loss_scale 4096.0000 (2252.5621) mem 7379MB [2024-08-26 09:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][870/1251] eta 0:01:33 lr 0.000868 wd 0.0500 time 0.2480 (0.2460) data time 0.0011 (0.0017) model time 0.2469 (0.2442) loss 3.7075 (3.4556) grad_norm 2.2804 (2.0633) loss_scale 4096.0000 (2273.7268) mem 7379MB [2024-08-26 09:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][880/1251] eta 0:01:31 lr 0.000868 wd 0.0500 time 0.2428 (0.2459) data time 0.0010 (0.0016) model time 0.2418 (0.2441) loss 2.5080 (3.4531) grad_norm 2.7927 (2.0686) loss_scale 4096.0000 (2294.4109) mem 7379MB [2024-08-26 09:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][890/1251] eta 0:01:28 lr 0.000868 wd 0.0500 time 0.2439 (0.2459) data time 0.0010 (0.0016) model time 0.2429 (0.2441) loss 3.3991 (3.4548) grad_norm 1.8889 (2.0675) loss_scale 4096.0000 (2314.6308) mem 7379MB [2024-08-26 09:43:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][900/1251] eta 0:01:26 lr 0.000868 wd 0.0500 time 0.2426 (0.2458) data time 0.0010 (0.0016) model time 0.2416 (0.2440) loss 3.8016 (3.4563) grad_norm 1.7741 (2.0638) loss_scale 4096.0000 (2334.4018) mem 7379MB [2024-08-26 09:43:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][910/1251] eta 0:01:23 lr 0.000868 wd 0.0500 time 0.2397 (0.2457) data time 0.0010 (0.0016) model time 0.2388 (0.2440) loss 3.8230 (3.4558) grad_norm 1.9660 (2.0659) loss_scale 4096.0000 (2353.7387) mem 7379MB [2024-08-26 09:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][920/1251] eta 0:01:21 lr 0.000868 wd 0.0500 time 0.2436 (0.2457) data time 0.0010 (0.0016) model time 0.2426 (0.2440) loss 3.9263 (3.4575) grad_norm 1.6065 (2.0689) loss_scale 4096.0000 (2372.6558) mem 7379MB [2024-08-26 09:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][930/1251] eta 0:01:19 lr 0.000868 wd 0.0500 time 0.4625 (0.2461) data time 0.0011 (0.0016) model time 0.4614 (0.2444) loss 2.6691 (3.4562) grad_norm 2.4237 (2.0695) loss_scale 4096.0000 (2391.1665) mem 7379MB [2024-08-26 09:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][940/1251] eta 0:01:16 lr 0.000868 wd 0.0500 time 0.2383 (0.2460) data time 0.0007 (0.0016) model time 0.2375 (0.2443) loss 3.7160 (3.4557) grad_norm 2.3936 (2.0686) loss_scale 4096.0000 (2409.2837) mem 7379MB [2024-08-26 09:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][950/1251] eta 0:01:14 lr 0.000868 wd 0.0500 time 0.2403 (0.2460) data time 0.0011 (0.0016) model time 0.2393 (0.2443) loss 2.5237 (3.4573) grad_norm 2.0579 (2.0674) loss_scale 4096.0000 (2427.0200) mem 7379MB [2024-08-26 09:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][960/1251] eta 0:01:11 lr 0.000867 wd 0.0500 time 0.2508 (0.2459) data time 0.0009 (0.0016) model time 0.2499 (0.2442) loss 3.5307 (3.4551) grad_norm 1.6505 (2.0650) loss_scale 4096.0000 (2444.3871) mem 7379MB [2024-08-26 09:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][970/1251] eta 0:01:09 lr 0.000867 wd 0.0500 time 0.2443 (0.2459) data time 0.0010 (0.0016) model time 0.2433 (0.2442) loss 3.0232 
(3.4559) grad_norm 1.6613 (2.0648) loss_scale 4096.0000 (2461.3965) mem 7379MB [2024-08-26 09:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][980/1251] eta 0:01:06 lr 0.000867 wd 0.0500 time 0.2389 (0.2460) data time 0.0007 (0.0016) model time 0.2382 (0.2443) loss 4.2761 (3.4577) grad_norm 1.8423 (2.0621) loss_scale 4096.0000 (2478.0591) mem 7379MB [2024-08-26 09:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][990/1251] eta 0:01:04 lr 0.000867 wd 0.0500 time 0.2387 (0.2459) data time 0.0010 (0.0016) model time 0.2377 (0.2443) loss 3.3275 (3.4577) grad_norm 1.9735 (2.0602) loss_scale 4096.0000 (2494.3855) mem 7379MB [2024-08-26 09:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1000/1251] eta 0:01:01 lr 0.000867 wd 0.0500 time 0.2311 (0.2459) data time 0.0009 (0.0016) model time 0.2301 (0.2442) loss 2.9694 (3.4565) grad_norm 1.9551 (2.0583) loss_scale 4096.0000 (2510.3856) mem 7379MB [2024-08-26 09:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1010/1251] eta 0:00:59 lr 0.000867 wd 0.0500 time 0.2385 (0.2459) data time 0.0011 (0.0016) model time 0.2374 (0.2442) loss 3.4095 (3.4567) grad_norm 2.6998 (2.0616) loss_scale 4096.0000 (2526.0692) mem 7379MB [2024-08-26 09:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1020/1251] eta 0:00:56 lr 0.000867 wd 0.0500 time 0.2448 (0.2458) data time 0.0009 (0.0016) model time 0.2439 (0.2442) loss 4.2738 (3.4579) grad_norm 1.5337 (2.0618) loss_scale 4096.0000 (2541.4456) mem 7379MB [2024-08-26 09:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1030/1251] eta 0:00:54 lr 0.000867 wd 0.0500 time 0.2369 (0.2458) data time 0.0009 (0.0016) model time 0.2360 (0.2441) loss 3.3360 (3.4569) grad_norm 1.8155 (2.0597) loss_scale 4096.0000 (2556.5238) mem 7379MB [2024-08-26 09:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1040/1251] eta 0:00:51 lr 
0.000867 wd 0.0500 time 0.2379 (0.2457) data time 0.0012 (0.0015) model time 0.2367 (0.2441) loss 3.6886 (3.4562) grad_norm 1.9588 (2.0574) loss_scale 4096.0000 (2571.3122) mem 7379MB [2024-08-26 09:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1050/1251] eta 0:00:49 lr 0.000867 wd 0.0500 time 0.2342 (0.2457) data time 0.0007 (0.0015) model time 0.2335 (0.2440) loss 3.6317 (3.4568) grad_norm 2.4140 (2.0580) loss_scale 4096.0000 (2585.8192) mem 7379MB [2024-08-26 09:44:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1060/1251] eta 0:00:46 lr 0.000867 wd 0.0500 time 0.2389 (0.2457) data time 0.0007 (0.0015) model time 0.2383 (0.2440) loss 3.2933 (3.4568) grad_norm 1.7139 (2.0633) loss_scale 4096.0000 (2600.0528) mem 7379MB [2024-08-26 09:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1070/1251] eta 0:00:44 lr 0.000867 wd 0.0500 time 0.2327 (0.2456) data time 0.0008 (0.0015) model time 0.2319 (0.2440) loss 4.1446 (3.4618) grad_norm 2.0656 (2.0670) loss_scale 4096.0000 (2614.0205) mem 7379MB [2024-08-26 09:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1080/1251] eta 0:00:41 lr 0.000867 wd 0.0500 time 0.2367 (0.2456) data time 0.0007 (0.0015) model time 0.2359 (0.2440) loss 2.4841 (3.4619) grad_norm 2.1149 (2.0676) loss_scale 4096.0000 (2627.7299) mem 7379MB [2024-08-26 09:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1090/1251] eta 0:00:39 lr 0.000867 wd 0.0500 time 0.2373 (0.2455) data time 0.0009 (0.0015) model time 0.2364 (0.2439) loss 3.4548 (3.4616) grad_norm 1.3783 (2.0668) loss_scale 4096.0000 (2641.1879) mem 7379MB [2024-08-26 09:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1100/1251] eta 0:00:37 lr 0.000867 wd 0.0500 time 0.2435 (0.2455) data time 0.0009 (0.0015) model time 0.2426 (0.2439) loss 3.5954 (3.4611) grad_norm 2.1668 (2.0677) loss_scale 4096.0000 (2654.4015) mem 7379MB [2024-08-26 
09:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1110/1251] eta 0:00:34 lr 0.000867 wd 0.0500 time 0.2371 (0.2455) data time 0.0010 (0.0015) model time 0.2360 (0.2439) loss 3.5811 (3.4627) grad_norm 1.5226 (2.0679) loss_scale 4096.0000 (2667.3771) mem 7379MB [2024-08-26 09:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1120/1251] eta 0:00:32 lr 0.000867 wd 0.0500 time 0.2343 (0.2454) data time 0.0010 (0.0015) model time 0.2333 (0.2438) loss 3.8670 (3.4618) grad_norm 1.7253 (2.0689) loss_scale 4096.0000 (2680.1213) mem 7379MB [2024-08-26 09:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1130/1251] eta 0:00:29 lr 0.000867 wd 0.0500 time 0.2404 (0.2456) data time 0.0007 (0.0015) model time 0.2396 (0.2440) loss 4.1712 (3.4620) grad_norm 3.8822 (2.0698) loss_scale 4096.0000 (2692.6401) mem 7379MB [2024-08-26 09:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1140/1251] eta 0:00:27 lr 0.000867 wd 0.0500 time 0.2438 (0.2456) data time 0.0011 (0.0015) model time 0.2428 (0.2439) loss 3.4855 (3.4633) grad_norm 1.6764 (2.0702) loss_scale 4096.0000 (2704.9395) mem 7379MB [2024-08-26 09:44:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1150/1251] eta 0:00:24 lr 0.000867 wd 0.0500 time 0.2329 (0.2455) data time 0.0011 (0.0015) model time 0.2318 (0.2439) loss 3.4409 (3.4618) grad_norm 2.4180 (2.0694) loss_scale 4096.0000 (2717.0252) mem 7379MB [2024-08-26 09:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1160/1251] eta 0:00:22 lr 0.000867 wd 0.0500 time 0.2370 (0.2455) data time 0.0008 (0.0015) model time 0.2362 (0.2439) loss 4.1291 (3.4641) grad_norm 2.2439 (2.0698) loss_scale 4096.0000 (2728.9027) mem 7379MB [2024-08-26 09:44:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1170/1251] eta 0:00:19 lr 0.000867 wd 0.0500 time 0.2404 (0.2454) data time 0.0007 (0.0015) model time 0.2397 (0.2438) 
loss 3.5429 (3.4632) grad_norm 1.8873 (2.0715) loss_scale 4096.0000 (2740.5773) mem 7379MB [2024-08-26 09:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1180/1251] eta 0:00:17 lr 0.000867 wd 0.0500 time 0.2441 (0.2454) data time 0.0009 (0.0015) model time 0.2432 (0.2438) loss 3.9456 (3.4615) grad_norm 2.2869 (2.0748) loss_scale 4096.0000 (2752.0542) mem 7379MB [2024-08-26 09:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1190/1251] eta 0:00:14 lr 0.000867 wd 0.0500 time 0.2394 (0.2454) data time 0.0007 (0.0015) model time 0.2387 (0.2438) loss 3.0566 (3.4607) grad_norm 1.6774 (2.0742) loss_scale 4096.0000 (2763.3384) mem 7379MB [2024-08-26 09:44:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1200/1251] eta 0:00:12 lr 0.000867 wd 0.0500 time 0.2323 (0.2453) data time 0.0012 (0.0015) model time 0.2311 (0.2437) loss 3.7452 (3.4605) grad_norm 1.6313 (2.0720) loss_scale 4096.0000 (2774.4346) mem 7379MB [2024-08-26 09:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1210/1251] eta 0:00:10 lr 0.000867 wd 0.0500 time 0.2422 (0.2453) data time 0.0008 (0.0015) model time 0.2414 (0.2437) loss 2.3780 (3.4620) grad_norm 1.7454 (2.0702) loss_scale 4096.0000 (2785.3476) mem 7379MB [2024-08-26 09:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1220/1251] eta 0:00:07 lr 0.000867 wd 0.0500 time 0.2391 (0.2452) data time 0.0008 (0.0015) model time 0.2383 (0.2436) loss 2.2057 (3.4623) grad_norm 1.5160 (2.0684) loss_scale 4096.0000 (2796.0819) mem 7379MB [2024-08-26 09:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1230/1251] eta 0:00:05 lr 0.000867 wd 0.0500 time 0.2435 (0.2452) data time 0.0009 (0.0015) model time 0.2426 (0.2436) loss 3.5127 (3.4630) grad_norm 2.2916 (2.0688) loss_scale 4096.0000 (2806.6418) mem 7379MB [2024-08-26 09:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1240/1251] eta 
0:00:02 lr 0.000867 wd 0.0500 time 0.2265 (0.2451) data time 0.0005 (0.0015) model time 0.2260 (0.2435) loss 3.4739 (3.4645) grad_norm 2.4682 (2.0673) loss_scale 4096.0000 (2817.0314) mem 7379MB
[2024-08-26 09:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [86/300][1250/1251] eta 0:00:00 lr 0.000867 wd 0.0500 time 0.2266 (0.2450) data time 0.0005 (0.0015) model time 0.2261 (0.2434) loss 3.1880 (3.4655) grad_norm 1.7662 (2.0667) loss_scale 4096.0000 (2827.2550) mem 7379MB
[2024-08-26 09:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 86 training takes 0:05:06
[2024-08-26 09:44:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 09:44:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 09:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.441 (0.441) Loss 0.5474 (0.5474) Acc@1 90.234 (90.234) Acc@5 97.656 (97.656) Mem 7379MB
[2024-08-26 09:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.112) Loss 0.8408 (0.8139) Acc@1 82.715 (82.333) Acc@5 95.801 (96.200) Mem 7379MB
[2024-08-26 09:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.095) Loss 1.2188 (0.8421) Acc@1 71.387 (81.157) Acc@5 91.992 (96.061) Mem 7379MB
[2024-08-26 09:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.089) Loss 1.4814 (0.9518) Acc@1 65.234 (78.701) Acc@5 88.574 (94.682) Mem 7379MB
[2024-08-26 09:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.063 (0.084) Loss 1.3574 (1.0209) Acc@1 68.164 (77.046) Acc@5 90.039 (93.829) Mem 7379MB
[2024-08-26 09:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.540 Acc@5 93.664
[2024-08-26 09:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.5%
[2024-08-26 09:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.750 (0.750) Loss 0.4558 (0.4558) Acc@1 91.895 (91.895) Acc@5 98.047 (98.047) Mem 7379MB
[2024-08-26 09:45:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.146) Loss 0.7256 (0.7064) Acc@1 86.328 (84.996) Acc@5 96.289 (96.902) Mem 7379MB
[2024-08-26 09:45:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.114) Loss 1.0049 (0.7296) Acc@1 75.977 (83.845) Acc@5 94.434 (96.884) Mem 7379MB
[2024-08-26 09:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.103) Loss 1.2949 (0.8333) Acc@1 67.578 (81.363) Acc@5 90.527 (95.662) Mem 7379MB
[2024-08-26 09:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.1611 (0.8880) Acc@1 71.777 (79.819) Acc@5 92.188 (95.117) Mem 7379MB
[2024-08-26 09:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.406 Acc@5 95.058
[2024-08-26 09:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.4%
[2024-08-26 09:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.41%
[2024-08-26 09:45:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 09:45:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 09:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][0/1251] eta 0:16:23 lr 0.000867 wd 0.0500 time 0.7861 (0.7861) data time 0.5589 (0.5589) model time 0.0000 (0.0000) loss 3.7825 (3.7825) grad_norm 2.1271 (2.1271) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][10/1251] eta 0:05:59 lr 0.000867 wd 0.0500 time 0.2420 (0.2900) data time 0.0010 (0.0518) model time 0.0000 (0.0000) loss 2.4981 (3.0003) grad_norm 1.7397 (2.2879) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][20/1251] eta 0:05:27 lr 0.000867 wd 0.0500 time 0.2485 (0.2663) data time 0.0010 (0.0276) model time 0.0000 (0.0000) loss 4.1213 (3.2982) grad_norm 1.8405 (2.0357) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][30/1251] eta 0:05:14 lr 0.000867 wd 0.0500 time 0.2380 (0.2580) data time 0.0010 (0.0190) model time 0.0000 (0.0000) loss 3.8306 (3.2731) grad_norm 2.1312 (2.0282) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][40/1251] eta 0:05:07 lr 0.000866 wd 0.0500 time 0.2429 (0.2539) data time 0.0009 (0.0146) model time 0.0000 (0.0000) loss 2.8236 (3.2845) grad_norm 2.2790 (1.9896) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][50/1251] eta 0:05:06 lr 0.000866 wd 0.0500 time 0.2383 (0.2552) data time 0.0011 (0.0120) model time 0.0000 (0.0000) loss 3.1713 (3.2871) grad_norm 1.6426 (1.9646) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][60/1251] eta 0:05:01 lr 0.000866 wd 0.0500 time 0.2365 (0.2528) data time 0.0014 (0.0102) model time 0.2351 (0.2393) loss 
3.5827 (3.3023) grad_norm 3.1736 (1.9630) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][70/1251] eta 0:04:59 lr 0.000866 wd 0.0500 time 0.2360 (0.2533) data time 0.0008 (0.0089) model time 0.2352 (0.2474) loss 2.4471 (3.3052) grad_norm 2.8585 (1.9714) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][80/1251] eta 0:04:54 lr 0.000866 wd 0.0500 time 0.2298 (0.2518) data time 0.0012 (0.0079) model time 0.2287 (0.2449) loss 3.4022 (3.3368) grad_norm 3.3065 (2.0353) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][90/1251] eta 0:04:51 lr 0.000866 wd 0.0500 time 0.2472 (0.2509) data time 0.0012 (0.0072) model time 0.2460 (0.2443) loss 3.5319 (3.3401) grad_norm 1.3869 (2.0169) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][100/1251] eta 0:04:47 lr 0.000866 wd 0.0500 time 0.2377 (0.2499) data time 0.0008 (0.0065) model time 0.2369 (0.2435) loss 3.4471 (3.3565) grad_norm 1.9716 (2.0066) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][110/1251] eta 0:04:46 lr 0.000866 wd 0.0500 time 0.4533 (0.2511) data time 0.0010 (0.0061) model time 0.4524 (0.2466) loss 3.7323 (3.3483) grad_norm 1.6360 (2.0110) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][120/1251] eta 0:04:43 lr 0.000866 wd 0.0500 time 0.2444 (0.2503) data time 0.0007 (0.0056) model time 0.2437 (0.2456) loss 3.9893 (3.3447) grad_norm 1.6485 (2.0106) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][130/1251] eta 0:04:39 lr 
0.000866 wd 0.0500 time 0.2384 (0.2496) data time 0.0009 (0.0053) model time 0.2375 (0.2450) loss 2.6823 (3.3617) grad_norm 2.7462 (2.0191) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][140/1251] eta 0:04:38 lr 0.000866 wd 0.0500 time 0.2479 (0.2508) data time 0.0007 (0.0050) model time 0.2471 (0.2473) loss 3.8891 (3.3754) grad_norm 2.8263 (2.0134) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][150/1251] eta 0:04:35 lr 0.000866 wd 0.0500 time 0.2369 (0.2502) data time 0.0008 (0.0047) model time 0.2361 (0.2467) loss 3.5059 (3.3774) grad_norm 1.4357 (2.0092) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][160/1251] eta 0:04:32 lr 0.000866 wd 0.0500 time 0.2403 (0.2496) data time 0.0009 (0.0045) model time 0.2394 (0.2460) loss 2.9069 (3.3869) grad_norm 1.5068 (1.9835) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][170/1251] eta 0:04:29 lr 0.000866 wd 0.0500 time 0.2379 (0.2490) data time 0.0007 (0.0043) model time 0.2372 (0.2453) loss 4.3384 (3.4223) grad_norm 2.6452 (1.9782) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][180/1251] eta 0:04:26 lr 0.000866 wd 0.0500 time 0.2399 (0.2485) data time 0.0008 (0.0041) model time 0.2391 (0.2449) loss 3.6506 (3.4341) grad_norm 1.7351 (2.0340) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][190/1251] eta 0:04:23 lr 0.000866 wd 0.0500 time 0.2305 (0.2483) data time 0.0009 (0.0039) model time 0.2297 (0.2448) loss 3.0951 (3.4283) grad_norm 1.6840 (2.0297) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:53 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][200/1251] eta 0:04:20 lr 0.000866 wd 0.0500 time 0.2515 (0.2480) data time 0.0010 (0.0038) model time 0.2504 (0.2445) loss 3.3842 (3.4380) grad_norm 1.8194 (2.0310) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][210/1251] eta 0:04:18 lr 0.000866 wd 0.0500 time 0.2324 (0.2486) data time 0.0010 (0.0037) model time 0.2314 (0.2454) loss 3.4503 (3.4524) grad_norm 1.9150 (2.0311) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][220/1251] eta 0:04:15 lr 0.000866 wd 0.0500 time 0.2410 (0.2482) data time 0.0010 (0.0035) model time 0.2400 (0.2451) loss 2.5236 (3.4437) grad_norm 2.0618 (2.0237) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][230/1251] eta 0:04:13 lr 0.000866 wd 0.0500 time 0.2444 (0.2488) data time 0.0008 (0.0034) model time 0.2436 (0.2459) loss 4.1835 (3.4542) grad_norm 2.0810 (2.0194) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][240/1251] eta 0:04:11 lr 0.000866 wd 0.0500 time 0.2404 (0.2486) data time 0.0009 (0.0033) model time 0.2394 (0.2458) loss 3.5833 (3.4447) grad_norm 2.8890 (2.0357) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][250/1251] eta 0:04:09 lr 0.000866 wd 0.0500 time 0.2437 (0.2491) data time 0.0008 (0.0032) model time 0.2429 (0.2465) loss 2.1755 (3.4448) grad_norm 1.4908 (2.0394) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][260/1251] eta 0:04:06 lr 0.000866 wd 0.0500 time 0.2407 (0.2488) data time 0.0007 (0.0031) model time 0.2400 (0.2462) loss 4.3690 
(3.4432) grad_norm 2.1428 (2.0470) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][270/1251] eta 0:04:04 lr 0.000866 wd 0.0500 time 0.2468 (0.2493) data time 0.0009 (0.0031) model time 0.2459 (0.2469) loss 3.4265 (3.4381) grad_norm 2.5108 (2.0490) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][280/1251] eta 0:04:01 lr 0.000866 wd 0.0500 time 0.2362 (0.2490) data time 0.0007 (0.0030) model time 0.2354 (0.2467) loss 4.4128 (3.4396) grad_norm 1.7031 (2.0441) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][290/1251] eta 0:03:59 lr 0.000866 wd 0.0500 time 0.2359 (0.2487) data time 0.0010 (0.0029) model time 0.2349 (0.2463) loss 4.1765 (3.4464) grad_norm 2.7636 (2.0520) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][300/1251] eta 0:03:56 lr 0.000866 wd 0.0500 time 0.2379 (0.2486) data time 0.0011 (0.0029) model time 0.2368 (0.2462) loss 3.8124 (3.4520) grad_norm 2.2760 (2.0476) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][310/1251] eta 0:03:54 lr 0.000866 wd 0.0500 time 0.2441 (0.2497) data time 0.0010 (0.0028) model time 0.2431 (0.2476) loss 2.8740 (3.4479) grad_norm 1.4705 (2.0434) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][320/1251] eta 0:03:52 lr 0.000866 wd 0.0500 time 0.2420 (0.2495) data time 0.0010 (0.0028) model time 0.2409 (0.2474) loss 3.9405 (3.4485) grad_norm 1.8854 (2.0432) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][330/1251] eta 0:03:49 lr 0.000866 wd 
0.0500 time 0.2399 (0.2494) data time 0.0009 (0.0027) model time 0.2390 (0.2473) loss 4.1912 (3.4520) grad_norm 2.2022 (2.0472) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][340/1251] eta 0:03:46 lr 0.000866 wd 0.0500 time 0.2424 (0.2492) data time 0.0010 (0.0026) model time 0.2415 (0.2471) loss 3.0189 (3.4569) grad_norm 1.7720 (2.0450) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][350/1251] eta 0:03:44 lr 0.000866 wd 0.0500 time 0.2448 (0.2490) data time 0.0010 (0.0026) model time 0.2438 (0.2469) loss 3.7804 (3.4605) grad_norm 2.0860 (2.0397) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][360/1251] eta 0:03:41 lr 0.000866 wd 0.0500 time 0.2369 (0.2488) data time 0.0008 (0.0026) model time 0.2362 (0.2467) loss 3.7339 (3.4542) grad_norm 1.9302 (2.0478) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][370/1251] eta 0:03:38 lr 0.000865 wd 0.0500 time 0.2363 (0.2486) data time 0.0009 (0.0025) model time 0.2354 (0.2465) loss 3.4967 (3.4582) grad_norm 1.9649 (2.0436) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][380/1251] eta 0:03:36 lr 0.000865 wd 0.0500 time 0.2503 (0.2484) data time 0.0007 (0.0025) model time 0.2496 (0.2464) loss 2.9926 (3.4545) grad_norm 2.1232 (2.0463) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][390/1251] eta 0:03:33 lr 0.000865 wd 0.0500 time 0.2354 (0.2482) data time 0.0009 (0.0024) model time 0.2345 (0.2462) loss 2.6183 (3.4544) grad_norm 2.5819 (2.0493) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:42 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][400/1251] eta 0:03:31 lr 0.000865 wd 0.0500 time 0.2405 (0.2481) data time 0.0009 (0.0024) model time 0.2397 (0.2461) loss 4.0795 (3.4572) grad_norm 2.7854 (2.0503) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][410/1251] eta 0:03:28 lr 0.000865 wd 0.0500 time 0.2460 (0.2480) data time 0.0007 (0.0024) model time 0.2453 (0.2459) loss 4.0706 (3.4609) grad_norm 1.6112 (2.0450) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][420/1251] eta 0:03:25 lr 0.000865 wd 0.0500 time 0.2469 (0.2478) data time 0.0008 (0.0023) model time 0.2461 (0.2458) loss 2.8820 (3.4650) grad_norm 1.6221 (2.0382) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][430/1251] eta 0:03:23 lr 0.000865 wd 0.0500 time 0.2439 (0.2477) data time 0.0011 (0.0023) model time 0.2427 (0.2457) loss 3.8103 (3.4688) grad_norm 1.6724 (2.0358) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][440/1251] eta 0:03:20 lr 0.000865 wd 0.0500 time 0.2429 (0.2476) data time 0.0009 (0.0023) model time 0.2420 (0.2456) loss 3.6512 (3.4699) grad_norm 2.0680 (2.0353) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][450/1251] eta 0:03:18 lr 0.000865 wd 0.0500 time 0.2399 (0.2475) data time 0.0007 (0.0022) model time 0.2391 (0.2455) loss 3.6709 (3.4673) grad_norm 1.4753 (2.0409) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][460/1251] eta 0:03:15 lr 0.000865 wd 0.0500 time 0.2410 (0.2474) data time 0.0008 (0.0022) model time 0.2402 (0.2454) loss 4.3572 
(3.4681) grad_norm 1.4582 (2.0385) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][470/1251] eta 0:03:13 lr 0.000865 wd 0.0500 time 0.2467 (0.2473) data time 0.0008 (0.0022) model time 0.2459 (0.2453) loss 4.1493 (3.4698) grad_norm 2.1474 (2.0426) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][480/1251] eta 0:03:10 lr 0.000865 wd 0.0500 time 0.2449 (0.2471) data time 0.0007 (0.0022) model time 0.2441 (0.2452) loss 3.3105 (3.4667) grad_norm 1.7236 (2.0424) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][490/1251] eta 0:03:07 lr 0.000865 wd 0.0500 time 0.2366 (0.2470) data time 0.0011 (0.0021) model time 0.2355 (0.2451) loss 3.5181 (3.4671) grad_norm 1.7223 (2.0375) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][500/1251] eta 0:03:05 lr 0.000865 wd 0.0500 time 0.2384 (0.2469) data time 0.0009 (0.0021) model time 0.2375 (0.2450) loss 2.5192 (3.4618) grad_norm 3.1212 (2.0394) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][510/1251] eta 0:03:02 lr 0.000865 wd 0.0500 time 0.2476 (0.2469) data time 0.0009 (0.0021) model time 0.2467 (0.2449) loss 4.0203 (3.4596) grad_norm 2.0589 (2.0330) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][520/1251] eta 0:03:00 lr 0.000865 wd 0.0500 time 0.2453 (0.2468) data time 0.0008 (0.0021) model time 0.2445 (0.2449) loss 3.4919 (3.4581) grad_norm 1.8326 (2.0358) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][530/1251] eta 0:02:57 lr 0.000865 wd 
0.0500 time 0.2431 (0.2467) data time 0.0011 (0.0021) model time 0.2419 (0.2448) loss 2.3793 (3.4599) grad_norm 2.5174 (2.0380) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][540/1251] eta 0:02:55 lr 0.000865 wd 0.0500 time 0.2412 (0.2466) data time 0.0007 (0.0020) model time 0.2405 (0.2447) loss 3.8528 (3.4592) grad_norm 1.6375 (2.0400) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][550/1251] eta 0:02:52 lr 0.000865 wd 0.0500 time 0.2401 (0.2465) data time 0.0010 (0.0020) model time 0.2391 (0.2446) loss 2.7666 (3.4550) grad_norm 1.9469 (2.0404) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][560/1251] eta 0:02:50 lr 0.000865 wd 0.0500 time 0.2493 (0.2465) data time 0.0012 (0.0020) model time 0.2481 (0.2446) loss 3.4688 (3.4523) grad_norm 1.9370 (2.0396) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][570/1251] eta 0:02:47 lr 0.000865 wd 0.0500 time 0.2437 (0.2464) data time 0.0014 (0.0020) model time 0.2424 (0.2445) loss 3.8398 (3.4533) grad_norm 1.7085 (2.0344) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][580/1251] eta 0:02:45 lr 0.000865 wd 0.0500 time 0.2455 (0.2463) data time 0.0007 (0.0020) model time 0.2448 (0.2445) loss 3.1103 (3.4470) grad_norm 2.1147 (2.0329) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][590/1251] eta 0:02:43 lr 0.000865 wd 0.0500 time 0.2436 (0.2467) data time 0.0007 (0.0020) model time 0.2429 (0.2448) loss 4.0095 (3.4497) grad_norm 1.6952 (2.0294) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][600/1251] eta 0:02:40 lr 0.000865 wd 0.0500 time 0.2414 (0.2466) data time 0.0010 (0.0019) model time 0.2404 (0.2448) loss 3.6379 (3.4537) grad_norm 4.3626 (2.0288) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][610/1251] eta 0:02:38 lr 0.000865 wd 0.0500 time 0.2428 (0.2465) data time 0.0009 (0.0019) model time 0.2419 (0.2447) loss 4.0792 (3.4510) grad_norm 3.4125 (2.0296) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][620/1251] eta 0:02:35 lr 0.000865 wd 0.0500 time 0.2506 (0.2464) data time 0.0007 (0.0019) model time 0.2499 (0.2446) loss 3.7363 (3.4498) grad_norm 2.0895 (2.0332) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][630/1251] eta 0:02:33 lr 0.000865 wd 0.0500 time 0.4197 (0.2468) data time 0.0008 (0.0019) model time 0.4189 (0.2451) loss 4.1199 (3.4535) grad_norm 1.6314 (2.0347) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][640/1251] eta 0:02:30 lr 0.000865 wd 0.0500 time 0.2524 (0.2467) data time 0.0010 (0.0019) model time 0.2514 (0.2450) loss 3.5309 (3.4556) grad_norm 1.7503 (2.0407) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][650/1251] eta 0:02:28 lr 0.000865 wd 0.0500 time 0.2383 (0.2467) data time 0.0010 (0.0019) model time 0.2373 (0.2449) loss 4.0108 (3.4566) grad_norm 1.9135 (2.0436) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][660/1251] eta 0:02:25 lr 0.000865 wd 0.0500 time 0.2421 (0.2468) data time 0.0007 (0.0019) model time 0.2414 (0.2451) loss 4.2375 
(3.4567) grad_norm 1.5291 (2.0422) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][670/1251] eta 0:02:23 lr 0.000865 wd 0.0500 time 0.2443 (0.2467) data time 0.0011 (0.0019) model time 0.2433 (0.2450) loss 3.5628 (3.4575) grad_norm 1.7934 (2.0396) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][680/1251] eta 0:02:20 lr 0.000865 wd 0.0500 time 0.2401 (0.2466) data time 0.0008 (0.0018) model time 0.2394 (0.2449) loss 3.4914 (3.4579) grad_norm 1.7865 (2.0393) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][690/1251] eta 0:02:18 lr 0.000865 wd 0.0500 time 0.2393 (0.2465) data time 0.0011 (0.0018) model time 0.2382 (0.2448) loss 4.3277 (3.4565) grad_norm 1.7661 (2.0387) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][700/1251] eta 0:02:15 lr 0.000864 wd 0.0500 time 0.2397 (0.2464) data time 0.0007 (0.0018) model time 0.2389 (0.2447) loss 3.0729 (3.4531) grad_norm 1.4444 (2.0454) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][710/1251] eta 0:02:13 lr 0.000864 wd 0.0500 time 0.2366 (0.2464) data time 0.0009 (0.0018) model time 0.2358 (0.2446) loss 3.7048 (3.4529) grad_norm 1.9327 (2.0466) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][720/1251] eta 0:02:10 lr 0.000864 wd 0.0500 time 0.2452 (0.2466) data time 0.0007 (0.0018) model time 0.2445 (0.2449) loss 4.4318 (3.4553) grad_norm 1.4673 (2.0427) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][730/1251] eta 0:02:08 lr 0.000864 wd 
0.0500 time 0.2376 (0.2466) data time 0.0009 (0.0018) model time 0.2366 (0.2450) loss 3.7100 (3.4545) grad_norm 4.4245 (2.0455) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][740/1251] eta 0:02:05 lr 0.000864 wd 0.0500 time 0.2420 (0.2466) data time 0.0008 (0.0018) model time 0.2412 (0.2449) loss 3.2616 (3.4510) grad_norm 2.2298 (2.0601) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][750/1251] eta 0:02:03 lr 0.000864 wd 0.0500 time 0.2490 (0.2465) data time 0.0009 (0.0018) model time 0.2481 (0.2449) loss 2.6586 (3.4498) grad_norm 1.8810 (2.0614) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][760/1251] eta 0:02:01 lr 0.000864 wd 0.0500 time 0.2517 (0.2465) data time 0.0010 (0.0018) model time 0.2507 (0.2448) loss 3.5959 (3.4460) grad_norm 2.2359 (2.0599) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][770/1251] eta 0:01:58 lr 0.000864 wd 0.0500 time 0.2413 (0.2467) data time 0.0010 (0.0017) model time 0.2403 (0.2451) loss 3.7950 (3.4459) grad_norm 1.5969 (2.0584) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][780/1251] eta 0:01:56 lr 0.000864 wd 0.0500 time 0.2457 (0.2466) data time 0.0008 (0.0017) model time 0.2450 (0.2450) loss 3.6046 (3.4457) grad_norm 2.5011 (2.0584) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][790/1251] eta 0:01:53 lr 0.000864 wd 0.0500 time 0.2435 (0.2466) data time 0.0007 (0.0017) model time 0.2428 (0.2450) loss 3.3481 (3.4435) grad_norm 1.8418 (2.0576) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:20 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][800/1251] eta 0:01:51 lr 0.000864 wd 0.0500 time 0.2381 (0.2465) data time 0.0009 (0.0017) model time 0.2372 (0.2449) loss 3.6242 (3.4390) grad_norm 2.4959 (2.0603) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][810/1251] eta 0:01:48 lr 0.000864 wd 0.0500 time 0.2396 (0.2467) data time 0.0011 (0.0017) model time 0.2385 (0.2451) loss 3.3427 (3.4407) grad_norm 3.2743 (2.0637) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][820/1251] eta 0:01:46 lr 0.000864 wd 0.0500 time 0.2409 (0.2466) data time 0.0010 (0.0017) model time 0.2399 (0.2450) loss 3.5774 (3.4414) grad_norm 2.0823 (2.0641) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][830/1251] eta 0:01:43 lr 0.000864 wd 0.0500 time 0.2431 (0.2466) data time 0.0010 (0.0017) model time 0.2422 (0.2450) loss 3.6026 (3.4436) grad_norm 2.2954 (2.0718) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][840/1251] eta 0:01:41 lr 0.000864 wd 0.0500 time 0.2411 (0.2465) data time 0.0009 (0.0017) model time 0.2401 (0.2449) loss 2.0943 (3.4415) grad_norm 2.8857 (2.0703) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][850/1251] eta 0:01:38 lr 0.000864 wd 0.0500 time 0.2371 (0.2465) data time 0.0009 (0.0017) model time 0.2361 (0.2449) loss 4.4979 (3.4451) grad_norm 1.8999 (2.0743) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][860/1251] eta 0:01:36 lr 0.000864 wd 0.0500 time 0.2380 (0.2464) data time 0.0010 (0.0017) model time 0.2370 (0.2448) loss 2.1644 
(3.4433) grad_norm 2.0641 (2.0756) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][870/1251] eta 0:01:33 lr 0.000864 wd 0.0500 time 0.2454 (0.2464) data time 0.0010 (0.0017) model time 0.2445 (0.2448) loss 3.7345 (3.4419) grad_norm 1.5943 (2.0760) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][880/1251] eta 0:01:31 lr 0.000864 wd 0.0500 time 0.2318 (0.2463) data time 0.0012 (0.0017) model time 0.2306 (0.2448) loss 3.6134 (3.4435) grad_norm 2.2161 (2.0760) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][890/1251] eta 0:01:28 lr 0.000864 wd 0.0500 time 0.2429 (0.2463) data time 0.0008 (0.0017) model time 0.2421 (0.2447) loss 3.4236 (3.4423) grad_norm 1.8061 (2.0723) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][900/1251] eta 0:01:26 lr 0.000864 wd 0.0500 time 0.2384 (0.2462) data time 0.0011 (0.0016) model time 0.2373 (0.2446) loss 3.8527 (3.4432) grad_norm 1.9645 (2.0724) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][910/1251] eta 0:01:23 lr 0.000864 wd 0.0500 time 0.2394 (0.2461) data time 0.0014 (0.0016) model time 0.2379 (0.2446) loss 2.6688 (3.4395) grad_norm 1.7206 (2.0721) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][920/1251] eta 0:01:21 lr 0.000864 wd 0.0500 time 0.2411 (0.2461) data time 0.0008 (0.0016) model time 0.2403 (0.2446) loss 3.7492 (3.4418) grad_norm 1.8512 (2.0685) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][930/1251] eta 0:01:18 lr 0.000864 wd 
0.0500 time 0.2459 (0.2461) data time 0.0008 (0.0016) model time 0.2451 (0.2445) loss 4.1002 (3.4457) grad_norm 2.0804 (2.0655) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][940/1251] eta 0:01:16 lr 0.000864 wd 0.0500 time 0.2399 (0.2460) data time 0.0011 (0.0016) model time 0.2388 (0.2445) loss 3.2744 (3.4463) grad_norm 2.0211 (2.0652) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][950/1251] eta 0:01:14 lr 0.000864 wd 0.0500 time 0.2380 (0.2460) data time 0.0011 (0.0016) model time 0.2370 (0.2444) loss 3.3719 (3.4457) grad_norm 2.1596 (2.0675) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][960/1251] eta 0:01:11 lr 0.000864 wd 0.0500 time 0.2408 (0.2460) data time 0.0011 (0.0016) model time 0.2397 (0.2444) loss 3.7189 (3.4469) grad_norm 2.7795 (2.0698) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][970/1251] eta 0:01:09 lr 0.000864 wd 0.0500 time 0.2400 (0.2459) data time 0.0010 (0.0016) model time 0.2390 (0.2444) loss 3.0431 (3.4454) grad_norm 2.9705 (2.0736) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][980/1251] eta 0:01:06 lr 0.000864 wd 0.0500 time 0.2388 (0.2461) data time 0.0010 (0.0016) model time 0.2378 (0.2445) loss 2.6973 (3.4472) grad_norm 1.6024 (2.0770) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][990/1251] eta 0:01:04 lr 0.000864 wd 0.0500 time 0.2388 (0.2460) data time 0.0009 (0.0016) model time 0.2379 (0.2445) loss 3.8439 (3.4465) grad_norm 1.8830 (2.0747) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:09 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1000/1251] eta 0:01:01 lr 0.000864 wd 0.0500 time 0.2349 (0.2460) data time 0.0013 (0.0016) model time 0.2336 (0.2445) loss 3.8177 (3.4430) grad_norm 2.1727 (2.0713) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1010/1251] eta 0:00:59 lr 0.000864 wd 0.0500 time 0.2478 (0.2460) data time 0.0010 (0.0016) model time 0.2468 (0.2444) loss 3.7343 (3.4437) grad_norm 2.0181 (2.0698) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1020/1251] eta 0:00:56 lr 0.000863 wd 0.0500 time 0.2375 (0.2459) data time 0.0011 (0.0016) model time 0.2365 (0.2444) loss 3.4589 (3.4442) grad_norm 1.9965 (2.0678) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1030/1251] eta 0:00:54 lr 0.000863 wd 0.0500 time 0.2365 (0.2459) data time 0.0008 (0.0016) model time 0.2356 (0.2443) loss 4.3199 (3.4437) grad_norm 1.6665 (2.0685) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1040/1251] eta 0:00:51 lr 0.000863 wd 0.0500 time 0.2369 (0.2458) data time 0.0012 (0.0016) model time 0.2358 (0.2443) loss 3.8565 (3.4431) grad_norm 1.5797 (2.0700) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1050/1251] eta 0:00:49 lr 0.000863 wd 0.0500 time 0.2466 (0.2458) data time 0.0009 (0.0016) model time 0.2457 (0.2443) loss 3.7241 (3.4464) grad_norm 1.6683 (2.0684) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1060/1251] eta 0:00:46 lr 0.000863 wd 0.0500 time 0.2321 (0.2457) data time 0.0009 (0.0016) model time 0.2312 (0.2442) loss 2.1064 
(3.4479) grad_norm 2.0884 (2.0678) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1070/1251] eta 0:00:44 lr 0.000863 wd 0.0500 time 0.2448 (0.2457) data time 0.0010 (0.0015) model time 0.2438 (0.2442) loss 3.6765 (3.4486) grad_norm 1.8610 (2.0664) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1080/1251] eta 0:00:42 lr 0.000863 wd 0.0500 time 0.2443 (0.2457) data time 0.0010 (0.0015) model time 0.2433 (0.2442) loss 4.1496 (3.4500) grad_norm 1.4238 (2.0666) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1090/1251] eta 0:00:39 lr 0.000863 wd 0.0500 time 0.2465 (0.2456) data time 0.0009 (0.0015) model time 0.2456 (0.2441) loss 4.0479 (3.4522) grad_norm 3.4164 (2.0683) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1100/1251] eta 0:00:37 lr 0.000863 wd 0.0500 time 0.2358 (0.2456) data time 0.0010 (0.0015) model time 0.2348 (0.2441) loss 2.3565 (3.4494) grad_norm 1.9448 (2.0689) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1110/1251] eta 0:00:34 lr 0.000863 wd 0.0500 time 0.2500 (0.2455) data time 0.0007 (0.0015) model time 0.2492 (0.2441) loss 3.7746 (3.4488) grad_norm 1.5240 (2.0662) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1120/1251] eta 0:00:32 lr 0.000863 wd 0.0500 time 0.2371 (0.2455) data time 0.0008 (0.0015) model time 0.2363 (0.2440) loss 4.1070 (3.4501) grad_norm 2.1569 (2.0658) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1130/1251] eta 0:00:29 lr 
0.000863 wd 0.0500 time 0.2463 (0.2455) data time 0.0012 (0.0015) model time 0.2451 (0.2440) loss 3.2211 (3.4497) grad_norm 1.5006 (2.0641) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1140/1251] eta 0:00:27 lr 0.000863 wd 0.0500 time 0.2356 (0.2454) data time 0.0012 (0.0015) model time 0.2344 (0.2439) loss 3.7635 (3.4492) grad_norm 3.1799 (2.0675) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1150/1251] eta 0:00:24 lr 0.000863 wd 0.0500 time 0.2374 (0.2454) data time 0.0010 (0.0015) model time 0.2364 (0.2439) loss 3.7925 (3.4511) grad_norm 2.2703 (2.0746) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1160/1251] eta 0:00:22 lr 0.000863 wd 0.0500 time 0.2455 (0.2458) data time 0.0010 (0.0015) model time 0.2445 (0.2444) loss 3.5271 (3.4492) grad_norm 2.7025 (2.0772) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1170/1251] eta 0:00:19 lr 0.000863 wd 0.0500 time 0.2387 (0.2458) data time 0.0007 (0.0015) model time 0.2380 (0.2443) loss 4.0916 (3.4483) grad_norm 2.1380 (2.0754) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1180/1251] eta 0:00:17 lr 0.000863 wd 0.0500 time 0.2354 (0.2459) data time 0.0009 (0.0015) model time 0.2345 (0.2445) loss 3.8175 (3.4480) grad_norm 2.1854 (2.0795) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1190/1251] eta 0:00:14 lr 0.000863 wd 0.0500 time 0.2359 (0.2459) data time 0.0007 (0.0015) model time 0.2352 (0.2444) loss 4.4725 (3.4494) grad_norm 1.3826 (2.0774) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
09:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1200/1251] eta 0:00:12 lr 0.000863 wd 0.0500 time 0.2369 (0.2458) data time 0.0009 (0.0015) model time 0.2361 (0.2444) loss 2.9535 (3.4480) grad_norm 1.8124 (2.0765) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1210/1251] eta 0:00:10 lr 0.000863 wd 0.0500 time 0.2386 (0.2458) data time 0.0008 (0.0015) model time 0.2378 (0.2444) loss 3.9284 (3.4492) grad_norm 2.6009 (2.0794) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1220/1251] eta 0:00:07 lr 0.000863 wd 0.0500 time 0.2458 (0.2458) data time 0.0011 (0.0015) model time 0.2447 (0.2443) loss 3.6260 (3.4498) grad_norm 1.8733 (2.0801) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1230/1251] eta 0:00:05 lr 0.000863 wd 0.0500 time 0.2431 (0.2457) data time 0.0011 (0.0015) model time 0.2419 (0.2443) loss 3.6941 (3.4485) grad_norm 2.0933 (2.0776) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1240/1251] eta 0:00:02 lr 0.000863 wd 0.0500 time 0.2261 (0.2460) data time 0.0007 (0.0015) model time 0.2254 (0.2446) loss 3.6581 (3.4502) grad_norm 4.3211 (2.0830) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [87/300][1250/1251] eta 0:00:00 lr 0.000863 wd 0.0500 time 0.2239 (0.2460) data time 0.0005 (0.0015) model time 0.2234 (0.2446) loss 4.0151 (3.4522) grad_norm 1.5444 (2.0812) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 87 training takes 0:05:07 [2024-08-26 09:50:11 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:50:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.430 (0.430) Loss 0.5698 (0.5698) Acc@1 90.625 (90.625) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-26 09:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.107) Loss 0.8735 (0.8356) Acc@1 82.324 (81.738) Acc@5 95.410 (96.165) Mem 7379MB [2024-08-26 09:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.094) Loss 1.2080 (0.8512) Acc@1 72.559 (81.069) Acc@5 92.090 (96.080) Mem 7379MB [2024-08-26 09:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.089) Loss 1.4062 (0.9686) Acc@1 66.992 (78.591) Acc@5 88.086 (94.616) Mem 7379MB [2024-08-26 09:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.2959 (1.0352) Acc@1 70.996 (76.982) Acc@5 90.820 (93.874) Mem 7379MB [2024-08-26 09:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.696 Acc@5 93.780 [2024-08-26 09:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.7% [2024-08-26 09:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 76.70% [2024-08-26 09:50:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 09:50:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 09:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.4546 (0.4546) Acc@1 92.090 (92.090) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 09:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.111) Loss 0.7222 (0.7046) Acc@1 86.328 (85.014) Acc@5 96.387 (96.946) Mem 7379MB [2024-08-26 09:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.095) Loss 1.0029 (0.7278) Acc@1 76.172 (83.896) Acc@5 94.531 (96.908) Mem 7379MB [2024-08-26 09:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.089) Loss 1.2939 (0.8313) Acc@1 67.676 (81.426) Acc@5 90.527 (95.694) Mem 7379MB [2024-08-26 09:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.1592 (0.8857) Acc@1 71.973 (79.888) Acc@5 92.383 (95.165) Mem 7379MB [2024-08-26 09:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.472 Acc@5 95.110 [2024-08-26 09:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.5% [2024-08-26 09:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.47% [2024-08-26 09:50:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:50:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 09:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][0/1251] eta 0:12:45 lr 0.000863 wd 0.0500 time 0.6121 (0.6121) data time 0.3887 (0.3887) model time 0.0000 (0.0000) loss 3.8782 (3.8782) grad_norm 1.5311 (1.5311) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][10/1251] eta 0:06:04 lr 0.000863 wd 0.0500 time 0.2373 (0.2933) data time 0.0008 (0.0362) model time 0.0000 (0.0000) loss 4.2638 (3.5541) grad_norm 2.0000 (2.3511) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][20/1251] eta 0:05:29 lr 0.000863 wd 0.0500 time 0.2380 (0.2680) data time 0.0011 (0.0196) model time 0.0000 (0.0000) loss 3.0962 (3.4028) grad_norm 1.9116 (2.4392) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][30/1251] eta 0:05:17 lr 0.000863 wd 0.0500 time 0.2362 (0.2599) data time 0.0010 (0.0136) model time 0.0000 (0.0000) loss 3.4993 (3.3984) grad_norm 1.6908 (2.2003) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][40/1251] eta 0:05:09 lr 0.000863 wd 0.0500 time 0.2468 (0.2556) data time 0.0009 (0.0105) model time 0.0000 (0.0000) loss 3.5364 (3.3487) grad_norm 1.8338 (2.1525) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][50/1251] eta 0:05:03 lr 0.000863 wd 0.0500 time 0.2456 (0.2531) data time 0.0008 (0.0087) model time 0.0000 (0.0000) loss 3.9647 (3.3119) grad_norm 2.8141 (2.0996) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][60/1251] eta 0:04:59 lr 0.000863 wd 0.0500 time 0.2397 (0.2515) data time 0.0011 (0.0074) model time 0.2386 (0.2423) loss 
2.0712 (3.3433) grad_norm 1.4806 (2.0757) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][70/1251] eta 0:04:55 lr 0.000863 wd 0.0500 time 0.2374 (0.2501) data time 0.0008 (0.0065) model time 0.2366 (0.2414) loss 3.2965 (3.3649) grad_norm 1.9954 (2.0682) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][80/1251] eta 0:04:51 lr 0.000863 wd 0.0500 time 0.2486 (0.2491) data time 0.0010 (0.0058) model time 0.2476 (0.2415) loss 3.7568 (3.3595) grad_norm 2.4057 (2.0610) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][90/1251] eta 0:04:48 lr 0.000863 wd 0.0500 time 0.2408 (0.2483) data time 0.0010 (0.0053) model time 0.2398 (0.2412) loss 3.6967 (3.4012) grad_norm 2.9342 (2.0578) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][100/1251] eta 0:04:44 lr 0.000862 wd 0.0500 time 0.2448 (0.2476) data time 0.0011 (0.0049) model time 0.2437 (0.2410) loss 3.5055 (3.4277) grad_norm 1.6805 (2.0636) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][110/1251] eta 0:04:41 lr 0.000862 wd 0.0500 time 0.2347 (0.2470) data time 0.0010 (0.0045) model time 0.2337 (0.2409) loss 3.1739 (3.4181) grad_norm 1.7529 (2.0419) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][120/1251] eta 0:04:39 lr 0.000862 wd 0.0500 time 0.2407 (0.2467) data time 0.0010 (0.0043) model time 0.2397 (0.2411) loss 2.9712 (3.4069) grad_norm 2.6531 (2.0326) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][130/1251] eta 0:04:36 lr 
0.000862 wd 0.0500 time 0.2455 (0.2464) data time 0.0007 (0.0040) model time 0.2448 (0.2411) loss 2.3374 (3.3836) grad_norm 1.6110 (2.0173) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][140/1251] eta 0:04:33 lr 0.000862 wd 0.0500 time 0.2435 (0.2461) data time 0.0011 (0.0038) model time 0.2424 (0.2411) loss 3.8513 (3.3844) grad_norm 1.9143 (2.0208) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][150/1251] eta 0:04:30 lr 0.000862 wd 0.0500 time 0.2389 (0.2458) data time 0.0010 (0.0036) model time 0.2379 (0.2411) loss 3.5802 (3.3921) grad_norm 2.1528 (2.0187) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][160/1251] eta 0:04:29 lr 0.000862 wd 0.0500 time 0.2530 (0.2470) data time 0.0007 (0.0034) model time 0.2523 (0.2431) loss 2.4321 (3.3817) grad_norm 1.6931 (2.0186) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][170/1251] eta 0:04:26 lr 0.000862 wd 0.0500 time 0.2452 (0.2467) data time 0.0010 (0.0033) model time 0.2442 (0.2429) loss 3.3157 (3.3727) grad_norm 1.6846 (2.0318) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][180/1251] eta 0:04:23 lr 0.000862 wd 0.0500 time 0.2334 (0.2464) data time 0.0009 (0.0032) model time 0.2325 (0.2428) loss 3.7376 (3.3761) grad_norm 2.4625 (2.0339) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][190/1251] eta 0:04:21 lr 0.000862 wd 0.0500 time 0.2395 (0.2461) data time 0.0007 (0.0031) model time 0.2387 (0.2426) loss 4.2159 (3.3847) grad_norm 1.5386 (2.0451) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][200/1251] eta 0:04:18 lr 0.000862 wd 0.0500 time 0.2329 (0.2458) data time 0.0008 (0.0030) model time 0.2321 (0.2423) loss 3.2691 (3.3854) grad_norm 1.7472 (2.0365) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][210/1251] eta 0:04:15 lr 0.000862 wd 0.0500 time 0.2441 (0.2456) data time 0.0011 (0.0029) model time 0.2430 (0.2422) loss 3.5666 (3.3855) grad_norm 2.0865 (2.0424) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][220/1251] eta 0:04:13 lr 0.000862 wd 0.0500 time 0.2381 (0.2454) data time 0.0010 (0.0028) model time 0.2371 (0.2421) loss 3.9860 (3.3976) grad_norm 2.9047 (2.0562) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][230/1251] eta 0:04:10 lr 0.000862 wd 0.0500 time 0.2395 (0.2453) data time 0.0011 (0.0027) model time 0.2384 (0.2421) loss 2.5632 (3.3985) grad_norm 2.4068 (2.0646) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][240/1251] eta 0:04:07 lr 0.000862 wd 0.0500 time 0.2448 (0.2451) data time 0.0008 (0.0027) model time 0.2439 (0.2419) loss 3.7484 (3.4042) grad_norm 1.8320 (2.0714) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][250/1251] eta 0:04:05 lr 0.000862 wd 0.0500 time 0.2357 (0.2450) data time 0.0008 (0.0026) model time 0.2349 (0.2419) loss 3.7352 (3.4066) grad_norm 1.6666 (2.0671) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][260/1251] eta 0:04:03 lr 0.000862 wd 0.0500 time 0.2521 (0.2457) data time 0.0008 (0.0025) model time 0.2513 (0.2429) loss 2.5664 
(3.4076) grad_norm 1.8606 (2.0750) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][270/1251] eta 0:04:00 lr 0.000862 wd 0.0500 time 0.2414 (0.2457) data time 0.0014 (0.0025) model time 0.2400 (0.2429) loss 3.4251 (3.4061) grad_norm 1.5244 (2.0962) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 09:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][280/1251] eta 0:03:58 lr 0.000862 wd 0.0500 time 0.2403 (0.2455) data time 0.0007 (0.0024) model time 0.2395 (0.2428) loss 2.1235 (3.4037) grad_norm 1.4682 (2.0853) loss_scale 8192.0000 (4212.6121) mem 7379MB [2024-08-26 09:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][290/1251] eta 0:03:55 lr 0.000862 wd 0.0500 time 0.2406 (0.2454) data time 0.0009 (0.0024) model time 0.2398 (0.2428) loss 3.4460 (3.3913) grad_norm 2.3453 (2.0787) loss_scale 8192.0000 (4349.3608) mem 7379MB [2024-08-26 09:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][300/1251] eta 0:03:53 lr 0.000862 wd 0.0500 time 0.2436 (0.2453) data time 0.0009 (0.0023) model time 0.2427 (0.2427) loss 3.4103 (3.3915) grad_norm 2.2139 (2.0727) loss_scale 8192.0000 (4477.0233) mem 7379MB [2024-08-26 09:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][310/1251] eta 0:03:50 lr 0.000862 wd 0.0500 time 0.2473 (0.2452) data time 0.0007 (0.0023) model time 0.2466 (0.2426) loss 2.3145 (3.3900) grad_norm 1.6262 (2.0655) loss_scale 8192.0000 (4596.4759) mem 7379MB [2024-08-26 09:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][320/1251] eta 0:03:48 lr 0.000862 wd 0.0500 time 0.2450 (0.2451) data time 0.0009 (0.0022) model time 0.2441 (0.2426) loss 3.2731 (3.3882) grad_norm 1.5470 (2.0623) loss_scale 8192.0000 (4708.4860) mem 7379MB [2024-08-26 09:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][330/1251] eta 0:03:45 lr 0.000862 wd 
0.0500 time 0.2456 (0.2450) data time 0.0007 (0.0022) model time 0.2448 (0.2425) loss 3.8390 (3.3929) grad_norm 1.4955 (2.0600) loss_scale 8192.0000 (4813.7281) mem 7379MB [2024-08-26 09:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][340/1251] eta 0:03:43 lr 0.000862 wd 0.0500 time 0.2480 (0.2450) data time 0.0007 (0.0022) model time 0.2473 (0.2426) loss 3.2464 (3.3932) grad_norm 2.6357 (2.0659) loss_scale 8192.0000 (4912.7977) mem 7379MB [2024-08-26 09:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][350/1251] eta 0:03:40 lr 0.000862 wd 0.0500 time 0.2529 (0.2450) data time 0.0009 (0.0021) model time 0.2520 (0.2426) loss 2.7847 (3.3912) grad_norm 2.1692 (2.0724) loss_scale 8192.0000 (5006.2222) mem 7379MB [2024-08-26 09:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][360/1251] eta 0:03:38 lr 0.000862 wd 0.0500 time 0.2418 (0.2449) data time 0.0010 (0.0021) model time 0.2408 (0.2426) loss 3.9009 (3.3965) grad_norm 1.7572 (2.0603) loss_scale 8192.0000 (5094.4709) mem 7379MB [2024-08-26 09:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][370/1251] eta 0:03:35 lr 0.000862 wd 0.0500 time 0.2422 (0.2448) data time 0.0008 (0.0021) model time 0.2414 (0.2425) loss 3.2332 (3.3988) grad_norm 1.7457 (2.0546) loss_scale 8192.0000 (5177.9623) mem 7379MB [2024-08-26 09:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][380/1251] eta 0:03:33 lr 0.000862 wd 0.0500 time 0.2479 (0.2448) data time 0.0008 (0.0020) model time 0.2471 (0.2425) loss 3.5575 (3.3932) grad_norm 1.8567 (2.0481) loss_scale 8192.0000 (5257.0709) mem 7379MB [2024-08-26 09:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][390/1251] eta 0:03:30 lr 0.000862 wd 0.0500 time 0.2359 (0.2447) data time 0.0012 (0.0020) model time 0.2348 (0.2424) loss 3.6809 (3.3918) grad_norm 1.7348 (2.0503) loss_scale 8192.0000 (5332.1330) mem 7379MB [2024-08-26 09:51:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][400/1251] eta 0:03:28 lr 0.000862 wd 0.0500 time 0.2397 (0.2446) data time 0.0011 (0.0020) model time 0.2386 (0.2423) loss 3.9205 (3.3961) grad_norm 2.1194 (2.0503) loss_scale 8192.0000 (5403.4514) mem 7379MB [2024-08-26 09:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][410/1251] eta 0:03:25 lr 0.000862 wd 0.0500 time 0.2368 (0.2445) data time 0.0011 (0.0020) model time 0.2357 (0.2422) loss 2.7268 (3.3891) grad_norm 1.8623 (2.0493) loss_scale 8192.0000 (5471.2993) mem 7379MB [2024-08-26 09:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][420/1251] eta 0:03:23 lr 0.000861 wd 0.0500 time 0.2495 (0.2444) data time 0.0007 (0.0019) model time 0.2488 (0.2422) loss 4.3972 (3.3946) grad_norm 2.1793 (2.0547) loss_scale 8192.0000 (5535.9240) mem 7379MB [2024-08-26 09:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][430/1251] eta 0:03:20 lr 0.000861 wd 0.0500 time 0.2360 (0.2444) data time 0.0010 (0.0019) model time 0.2350 (0.2422) loss 2.8469 (3.3912) grad_norm 2.0270 (2.0556) loss_scale 8192.0000 (5597.5499) mem 7379MB [2024-08-26 09:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][440/1251] eta 0:03:18 lr 0.000861 wd 0.0500 time 0.2486 (0.2443) data time 0.0008 (0.0019) model time 0.2477 (0.2422) loss 3.9332 (3.3986) grad_norm 1.8554 (2.0561) loss_scale 8192.0000 (5656.3810) mem 7379MB [2024-08-26 09:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][450/1251] eta 0:03:15 lr 0.000861 wd 0.0500 time 0.2413 (0.2443) data time 0.0011 (0.0019) model time 0.2401 (0.2422) loss 2.6885 (3.3995) grad_norm 3.1677 (2.0551) loss_scale 8192.0000 (5712.6031) mem 7379MB [2024-08-26 09:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][460/1251] eta 0:03:13 lr 0.000861 wd 0.0500 time 0.2435 (0.2447) data time 0.0008 (0.0019) model time 0.2428 (0.2427) loss 3.0843 
(3.3946) grad_norm 1.7065 (2.0556) loss_scale 8192.0000 (5766.3861) mem 7379MB [2024-08-26 09:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][470/1251] eta 0:03:11 lr 0.000861 wd 0.0500 time 0.2374 (0.2450) data time 0.0010 (0.0019) model time 0.2364 (0.2430) loss 3.5216 (3.3897) grad_norm 1.6913 (2.0552) loss_scale 8192.0000 (5817.8854) mem 7379MB [2024-08-26 09:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][480/1251] eta 0:03:09 lr 0.000861 wd 0.0500 time 0.2554 (0.2454) data time 0.0010 (0.0018) model time 0.2544 (0.2435) loss 3.7516 (3.3928) grad_norm 1.6523 (2.0549) loss_scale 8192.0000 (5867.2432) mem 7379MB [2024-08-26 09:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][490/1251] eta 0:03:07 lr 0.000861 wd 0.0500 time 0.2357 (0.2458) data time 0.0009 (0.0018) model time 0.2348 (0.2439) loss 2.8409 (3.3853) grad_norm 2.0893 (2.0548) loss_scale 8192.0000 (5914.5906) mem 7379MB [2024-08-26 09:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][500/1251] eta 0:03:04 lr 0.000861 wd 0.0500 time 0.2451 (0.2461) data time 0.0010 (0.0018) model time 0.2441 (0.2442) loss 3.4536 (3.3942) grad_norm 2.2428 (2.0550) loss_scale 8192.0000 (5960.0479) mem 7379MB [2024-08-26 09:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][510/1251] eta 0:03:02 lr 0.000861 wd 0.0500 time 0.2407 (0.2463) data time 0.0010 (0.0018) model time 0.2397 (0.2445) loss 3.6390 (3.3872) grad_norm 2.4027 (2.0627) loss_scale 8192.0000 (6003.7260) mem 7379MB [2024-08-26 09:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][520/1251] eta 0:02:59 lr 0.000861 wd 0.0500 time 0.2411 (0.2461) data time 0.0011 (0.0018) model time 0.2400 (0.2444) loss 3.2965 (3.3845) grad_norm 3.0623 (2.0654) loss_scale 8192.0000 (6045.7274) mem 7379MB [2024-08-26 09:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][530/1251] eta 0:02:57 lr 0.000861 wd 
0.0500 time 0.2330 (0.2460) data time 0.0012 (0.0018) model time 0.2318 (0.2442) loss 3.5511 (3.3812) grad_norm 2.2202 (2.0623) loss_scale 8192.0000 (6086.1469) mem 7379MB [2024-08-26 09:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][540/1251] eta 0:02:55 lr 0.000861 wd 0.0500 time 0.2380 (0.2463) data time 0.0008 (0.0018) model time 0.2372 (0.2446) loss 3.7211 (3.3872) grad_norm 1.6992 (2.0582) loss_scale 8192.0000 (6125.0721) mem 7379MB [2024-08-26 09:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][550/1251] eta 0:02:52 lr 0.000861 wd 0.0500 time 0.2340 (0.2467) data time 0.0010 (0.0017) model time 0.2330 (0.2450) loss 4.5873 (3.3939) grad_norm 2.0371 (2.0545) loss_scale 8192.0000 (6162.5844) mem 7379MB [2024-08-26 09:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][560/1251] eta 0:02:50 lr 0.000861 wd 0.0500 time 0.2411 (0.2466) data time 0.0011 (0.0017) model time 0.2400 (0.2449) loss 3.7029 (3.3938) grad_norm 1.6543 (2.0504) loss_scale 8192.0000 (6198.7594) mem 7379MB [2024-08-26 09:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][570/1251] eta 0:02:47 lr 0.000861 wd 0.0500 time 0.2423 (0.2465) data time 0.0010 (0.0017) model time 0.2413 (0.2448) loss 3.6887 (3.3952) grad_norm 1.5151 (2.0496) loss_scale 8192.0000 (6233.6673) mem 7379MB [2024-08-26 09:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][580/1251] eta 0:02:45 lr 0.000861 wd 0.0500 time 0.2463 (0.2464) data time 0.0007 (0.0017) model time 0.2456 (0.2448) loss 2.7144 (3.3988) grad_norm 2.3530 (2.0469) loss_scale 8192.0000 (6267.3735) mem 7379MB [2024-08-26 09:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][590/1251] eta 0:02:42 lr 0.000861 wd 0.0500 time 0.2438 (0.2463) data time 0.0008 (0.0017) model time 0.2429 (0.2447) loss 4.1437 (3.4013) grad_norm 2.9385 (2.0464) loss_scale 8192.0000 (6299.9391) mem 7379MB [2024-08-26 09:52:49 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][600/1251] eta 0:02:40 lr 0.000861 wd 0.0500 time 0.2364 (0.2463) data time 0.0008 (0.0017) model time 0.2357 (0.2446) loss 2.3616 (3.3968) grad_norm 1.8624 (2.0503) loss_scale 8192.0000 (6331.4210) mem 7379MB [2024-08-26 09:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][610/1251] eta 0:02:37 lr 0.000861 wd 0.0500 time 0.2497 (0.2462) data time 0.0007 (0.0017) model time 0.2490 (0.2446) loss 3.3302 (3.3995) grad_norm 1.6629 (2.0511) loss_scale 8192.0000 (6361.8723) mem 7379MB [2024-08-26 09:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][620/1251] eta 0:02:35 lr 0.000861 wd 0.0500 time 0.2420 (0.2462) data time 0.0007 (0.0017) model time 0.2412 (0.2445) loss 3.9974 (3.4033) grad_norm 1.8779 (2.0507) loss_scale 8192.0000 (6391.3430) mem 7379MB [2024-08-26 09:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][630/1251] eta 0:02:32 lr 0.000861 wd 0.0500 time 0.2421 (0.2461) data time 0.0009 (0.0016) model time 0.2412 (0.2444) loss 3.6426 (3.4054) grad_norm 1.8793 (2.0505) loss_scale 8192.0000 (6419.8796) mem 7379MB [2024-08-26 09:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][640/1251] eta 0:02:30 lr 0.000861 wd 0.0500 time 0.2325 (0.2460) data time 0.0009 (0.0016) model time 0.2316 (0.2443) loss 4.3870 (3.4064) grad_norm 1.7098 (2.0508) loss_scale 8192.0000 (6447.5257) mem 7379MB [2024-08-26 09:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][650/1251] eta 0:02:27 lr 0.000861 wd 0.0500 time 0.2337 (0.2459) data time 0.0009 (0.0016) model time 0.2328 (0.2443) loss 2.9104 (3.4031) grad_norm 1.6117 (2.0498) loss_scale 8192.0000 (6474.3226) mem 7379MB [2024-08-26 09:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][660/1251] eta 0:02:25 lr 0.000861 wd 0.0500 time 0.2350 (0.2459) data time 0.0011 (0.0016) model time 0.2339 (0.2442) loss 2.5726 
(3.4079) grad_norm 2.2082 (2.0479) loss_scale 8192.0000 (6500.3086) mem 7379MB [2024-08-26 09:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][670/1251] eta 0:02:22 lr 0.000861 wd 0.0500 time 0.2406 (0.2458) data time 0.0007 (0.0016) model time 0.2399 (0.2442) loss 4.4013 (3.4093) grad_norm 1.8597 (2.0519) loss_scale 8192.0000 (6525.5201) mem 7379MB [2024-08-26 09:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][680/1251] eta 0:02:20 lr 0.000861 wd 0.0500 time 0.2374 (0.2457) data time 0.0012 (0.0016) model time 0.2362 (0.2441) loss 3.6181 (3.4109) grad_norm 4.1124 (2.0520) loss_scale 8192.0000 (6549.9912) mem 7379MB [2024-08-26 09:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][690/1251] eta 0:02:17 lr 0.000861 wd 0.0500 time 0.2523 (0.2457) data time 0.0012 (0.0016) model time 0.2511 (0.2441) loss 3.6657 (3.4139) grad_norm 1.5327 (2.0539) loss_scale 8192.0000 (6573.7540) mem 7379MB [2024-08-26 09:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][700/1251] eta 0:02:15 lr 0.000861 wd 0.0500 time 0.2426 (0.2457) data time 0.0009 (0.0016) model time 0.2417 (0.2440) loss 2.7631 (3.4126) grad_norm 2.3692 (2.0594) loss_scale 8192.0000 (6596.8388) mem 7379MB [2024-08-26 09:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][710/1251] eta 0:02:12 lr 0.000861 wd 0.0500 time 0.2367 (0.2456) data time 0.0011 (0.0016) model time 0.2356 (0.2440) loss 3.5189 (3.4126) grad_norm 2.3033 (2.0569) loss_scale 8192.0000 (6619.2743) mem 7379MB [2024-08-26 09:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][720/1251] eta 0:02:10 lr 0.000861 wd 0.0500 time 0.2347 (0.2455) data time 0.0010 (0.0016) model time 0.2337 (0.2439) loss 2.4455 (3.4110) grad_norm 1.8481 (2.0547) loss_scale 8192.0000 (6641.0874) mem 7379MB [2024-08-26 09:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][730/1251] eta 0:02:07 lr 0.000861 wd 
0.0500 time 0.2397 (0.2455) data time 0.0011 (0.0016) model time 0.2386 (0.2439) loss 3.4962 (3.4099) grad_norm 1.6276 (2.0564) loss_scale 8192.0000 (6662.3037) mem 7379MB [2024-08-26 09:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][740/1251] eta 0:02:05 lr 0.000861 wd 0.0500 time 0.2376 (0.2454) data time 0.0007 (0.0016) model time 0.2369 (0.2438) loss 3.9157 (3.4045) grad_norm 2.0960 (2.0531) loss_scale 8192.0000 (6682.9474) mem 7379MB [2024-08-26 09:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][750/1251] eta 0:02:02 lr 0.000860 wd 0.0500 time 0.2374 (0.2454) data time 0.0010 (0.0016) model time 0.2364 (0.2438) loss 3.2566 (3.4045) grad_norm 1.7921 (2.0504) loss_scale 8192.0000 (6703.0413) mem 7379MB [2024-08-26 09:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][760/1251] eta 0:02:00 lr 0.000860 wd 0.0500 time 0.2393 (0.2453) data time 0.0009 (0.0016) model time 0.2383 (0.2437) loss 3.0578 (3.4050) grad_norm 2.1699 (2.0486) loss_scale 8192.0000 (6722.6071) mem 7379MB [2024-08-26 09:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][770/1251] eta 0:01:58 lr 0.000860 wd 0.0500 time 0.4378 (0.2455) data time 0.0007 (0.0015) model time 0.4371 (0.2440) loss 2.5708 (3.4038) grad_norm 2.1717 (2.0528) loss_scale 8192.0000 (6741.6654) mem 7379MB [2024-08-26 09:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][780/1251] eta 0:01:55 lr 0.000860 wd 0.0500 time 0.2446 (0.2455) data time 0.0008 (0.0015) model time 0.2438 (0.2439) loss 4.5673 (3.4060) grad_norm 2.0569 (2.0525) loss_scale 8192.0000 (6760.2356) mem 7379MB [2024-08-26 09:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][790/1251] eta 0:01:53 lr 0.000860 wd 0.0500 time 0.2388 (0.2455) data time 0.0007 (0.0015) model time 0.2380 (0.2439) loss 4.0579 (3.4074) grad_norm 1.4259 (2.0491) loss_scale 8192.0000 (6778.3363) mem 7379MB [2024-08-26 09:53:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][800/1251] eta 0:01:50 lr 0.000860 wd 0.0500 time 0.2471 (0.2454) data time 0.0007 (0.0015) model time 0.2463 (0.2439) loss 2.1664 (3.4003) grad_norm 1.9038 (2.0482) loss_scale 8192.0000 (6795.9850) mem 7379MB [2024-08-26 09:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][810/1251] eta 0:01:48 lr 0.000860 wd 0.0500 time 0.2444 (0.2454) data time 0.0007 (0.0015) model time 0.2437 (0.2438) loss 3.7308 (3.4019) grad_norm 1.7610 (2.0491) loss_scale 8192.0000 (6813.1985) mem 7379MB [2024-08-26 09:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][820/1251] eta 0:01:45 lr 0.000860 wd 0.0500 time 0.2327 (0.2453) data time 0.0011 (0.0015) model time 0.2317 (0.2438) loss 3.2952 (3.4039) grad_norm 2.0483 (2.0466) loss_scale 8192.0000 (6829.9927) mem 7379MB [2024-08-26 09:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][830/1251] eta 0:01:43 lr 0.000860 wd 0.0500 time 0.2506 (0.2453) data time 0.0011 (0.0015) model time 0.2495 (0.2438) loss 3.8995 (3.4028) grad_norm 1.9377 (2.0443) loss_scale 8192.0000 (6846.3827) mem 7379MB [2024-08-26 09:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][840/1251] eta 0:01:40 lr 0.000860 wd 0.0500 time 0.2304 (0.2453) data time 0.0010 (0.0015) model time 0.2294 (0.2437) loss 3.5557 (3.4018) grad_norm 2.0596 (2.0416) loss_scale 8192.0000 (6862.3829) mem 7379MB [2024-08-26 09:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][850/1251] eta 0:01:38 lr 0.000860 wd 0.0500 time 0.2372 (0.2452) data time 0.0012 (0.0015) model time 0.2359 (0.2437) loss 3.2359 (3.4023) grad_norm 1.7315 (2.0421) loss_scale 8192.0000 (6878.0071) mem 7379MB [2024-08-26 09:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][860/1251] eta 0:01:35 lr 0.000860 wd 0.0500 time 0.2335 (0.2452) data time 0.0008 (0.0015) model time 0.2327 (0.2436) loss 3.7189 
(3.4013) grad_norm 1.6759 (2.0388) loss_scale 8192.0000 (6893.2683) mem 7379MB [2024-08-26 09:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][870/1251] eta 0:01:33 lr 0.000860 wd 0.0500 time 0.2398 (0.2452) data time 0.0009 (0.0015) model time 0.2389 (0.2436) loss 3.3863 (3.4008) grad_norm 2.3301 (2.0401) loss_scale 8192.0000 (6908.1791) mem 7379MB [2024-08-26 09:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][880/1251] eta 0:01:30 lr 0.000860 wd 0.0500 time 0.2408 (0.2451) data time 0.0011 (0.0015) model time 0.2398 (0.2436) loss 3.7378 (3.4034) grad_norm 1.5534 (2.0391) loss_scale 8192.0000 (6922.7514) mem 7379MB [2024-08-26 09:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][890/1251] eta 0:01:28 lr 0.000860 wd 0.0500 time 0.2441 (0.2452) data time 0.0007 (0.0015) model time 0.2434 (0.2436) loss 3.1908 (3.4014) grad_norm 1.8014 (2.0391) loss_scale 8192.0000 (6936.9966) mem 7379MB [2024-08-26 09:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][900/1251] eta 0:01:26 lr 0.000860 wd 0.0500 time 0.2420 (0.2451) data time 0.0010 (0.0015) model time 0.2411 (0.2436) loss 3.6967 (3.4028) grad_norm 1.5355 (2.0418) loss_scale 8192.0000 (6950.9256) mem 7379MB [2024-08-26 09:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][910/1251] eta 0:01:23 lr 0.000860 wd 0.0500 time 0.2464 (0.2451) data time 0.0009 (0.0015) model time 0.2454 (0.2436) loss 3.7088 (3.4023) grad_norm 1.9863 (2.0408) loss_scale 8192.0000 (6964.5488) mem 7379MB [2024-08-26 09:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][920/1251] eta 0:01:21 lr 0.000860 wd 0.0500 time 0.2428 (0.2451) data time 0.0012 (0.0015) model time 0.2416 (0.2436) loss 3.1210 (3.4008) grad_norm 1.8757 (2.0401) loss_scale 8192.0000 (6977.8762) mem 7379MB [2024-08-26 09:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][930/1251] eta 0:01:18 lr 0.000860 wd 
0.0500 time 0.2429 (0.2453) data time 0.0010 (0.0015) model time 0.2418 (0.2438) loss 3.2647 (3.4001) grad_norm 2.1932 (2.0440) loss_scale 8192.0000 (6990.9173) mem 7379MB [2024-08-26 09:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][940/1251] eta 0:01:16 lr 0.000860 wd 0.0500 time 0.2411 (0.2452) data time 0.0009 (0.0015) model time 0.2402 (0.2437) loss 3.5649 (3.3972) grad_norm 2.0693 (2.0448) loss_scale 8192.0000 (7003.6812) mem 7379MB [2024-08-26 09:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][950/1251] eta 0:01:13 lr 0.000860 wd 0.0500 time 0.2521 (0.2452) data time 0.0010 (0.0015) model time 0.2511 (0.2437) loss 2.7974 (3.3984) grad_norm 2.0573 (2.0451) loss_scale 8192.0000 (7016.1767) mem 7379MB [2024-08-26 09:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][960/1251] eta 0:01:11 lr 0.000860 wd 0.0500 time 0.2381 (0.2452) data time 0.0009 (0.0015) model time 0.2372 (0.2437) loss 3.6666 (3.4004) grad_norm 1.5903 (2.0477) loss_scale 8192.0000 (7028.4121) mem 7379MB [2024-08-26 09:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][970/1251] eta 0:01:08 lr 0.000860 wd 0.0500 time 0.2378 (0.2451) data time 0.0009 (0.0015) model time 0.2369 (0.2436) loss 3.5670 (3.3999) grad_norm 2.5188 (2.0494) loss_scale 8192.0000 (7040.3955) mem 7379MB [2024-08-26 09:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][980/1251] eta 0:01:06 lr 0.000860 wd 0.0500 time 0.2423 (0.2451) data time 0.0011 (0.0014) model time 0.2412 (0.2436) loss 3.8051 (3.3987) grad_norm 1.8727 (2.0508) loss_scale 8192.0000 (7052.1346) mem 7379MB [2024-08-26 09:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][990/1251] eta 0:01:03 lr 0.000860 wd 0.0500 time 0.2325 (0.2451) data time 0.0011 (0.0014) model time 0.2314 (0.2436) loss 3.1870 (3.3997) grad_norm 2.8188 (2.0532) loss_scale 8192.0000 (7063.6367) mem 7379MB [2024-08-26 09:54:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1000/1251] eta 0:01:01 lr 0.000860 wd 0.0500 time 0.2392 (0.2452) data time 0.0010 (0.0014) model time 0.2382 (0.2438) loss 3.4609 (3.4006) grad_norm 2.5154 (2.0558) loss_scale 8192.0000 (7074.9091) mem 7379MB [2024-08-26 09:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1010/1251] eta 0:00:59 lr 0.000860 wd 0.0500 time 0.2397 (0.2452) data time 0.0008 (0.0014) model time 0.2389 (0.2437) loss 4.2451 (3.4011) grad_norm 2.0162 (2.0548) loss_scale 8192.0000 (7085.9585) mem 7379MB [2024-08-26 09:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1020/1251] eta 0:00:56 lr 0.000860 wd 0.0500 time 0.2388 (0.2451) data time 0.0010 (0.0014) model time 0.2379 (0.2437) loss 3.7320 (3.3992) grad_norm 3.0005 (2.0537) loss_scale 8192.0000 (7096.7914) mem 7379MB [2024-08-26 09:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1030/1251] eta 0:00:54 lr 0.000860 wd 0.0500 time 0.4328 (0.2455) data time 0.0008 (0.0014) model time 0.4319 (0.2441) loss 2.6170 (3.4018) grad_norm 1.4365 (2.0519) loss_scale 8192.0000 (7107.4142) mem 7379MB [2024-08-26 09:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1040/1251] eta 0:00:51 lr 0.000860 wd 0.0500 time 0.2376 (0.2455) data time 0.0010 (0.0014) model time 0.2367 (0.2440) loss 3.4936 (3.4014) grad_norm 2.3317 (2.0520) loss_scale 8192.0000 (7117.8329) mem 7379MB [2024-08-26 09:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1050/1251] eta 0:00:49 lr 0.000860 wd 0.0500 time 0.2417 (0.2454) data time 0.0008 (0.0014) model time 0.2409 (0.2440) loss 4.0258 (3.4009) grad_norm 1.9130 (2.0513) loss_scale 8192.0000 (7128.0533) mem 7379MB [2024-08-26 09:54:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1060/1251] eta 0:00:46 lr 0.000860 wd 0.0500 time 0.2459 (0.2454) data time 0.0010 (0.0014) model time 0.2449 (0.2440) loss 3.5864 
(3.4020) grad_norm 3.4146 (2.0515) loss_scale 8192.0000 (7138.0811) mem 7379MB [2024-08-26 09:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1070/1251] eta 0:00:44 lr 0.000859 wd 0.0500 time 0.2416 (0.2455) data time 0.0009 (0.0014) model time 0.2406 (0.2441) loss 3.0120 (3.4038) grad_norm 2.9893 (2.0527) loss_scale 8192.0000 (7147.9216) mem 7379MB [2024-08-26 09:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1080/1251] eta 0:00:41 lr 0.000859 wd 0.0500 time 0.2375 (0.2455) data time 0.0007 (0.0014) model time 0.2368 (0.2441) loss 4.3129 (3.4062) grad_norm 3.1717 (2.0532) loss_scale 8192.0000 (7157.5800) mem 7379MB [2024-08-26 09:54:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1090/1251] eta 0:00:39 lr 0.000859 wd 0.0500 time 0.2398 (0.2456) data time 0.0007 (0.0014) model time 0.2391 (0.2442) loss 3.0239 (3.4045) grad_norm 1.6781 (2.0528) loss_scale 8192.0000 (7167.0614) mem 7379MB [2024-08-26 09:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1100/1251] eta 0:00:37 lr 0.000859 wd 0.0500 time 0.2472 (0.2456) data time 0.0009 (0.0014) model time 0.2463 (0.2442) loss 3.6103 (3.4043) grad_norm 1.9835 (2.0524) loss_scale 8192.0000 (7176.3706) mem 7379MB [2024-08-26 09:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1110/1251] eta 0:00:34 lr 0.000859 wd 0.0500 time 0.2395 (0.2456) data time 0.0007 (0.0014) model time 0.2388 (0.2442) loss 2.7881 (3.4052) grad_norm 1.2575 (2.0530) loss_scale 8192.0000 (7185.5122) mem 7379MB [2024-08-26 09:54:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1120/1251] eta 0:00:32 lr 0.000859 wd 0.0500 time 0.2452 (0.2455) data time 0.0009 (0.0014) model time 0.2443 (0.2441) loss 3.5646 (3.4053) grad_norm 2.0792 (2.0534) loss_scale 8192.0000 (7194.4906) mem 7379MB [2024-08-26 09:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1130/1251] eta 0:00:29 lr 
0.000859 wd 0.0500 time 0.2450 (0.2455) data time 0.0010 (0.0014) model time 0.2440 (0.2441) loss 3.6299 (3.4078) grad_norm 2.3791 (2.0537) loss_scale 8192.0000 (7203.3103) mem 7379MB [2024-08-26 09:55:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1140/1251] eta 0:00:27 lr 0.000859 wd 0.0500 time 0.2444 (0.2455) data time 0.0009 (0.0014) model time 0.2436 (0.2441) loss 3.8589 (3.4088) grad_norm 1.7424 (2.0532) loss_scale 8192.0000 (7211.9755) mem 7379MB [2024-08-26 09:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1150/1251] eta 0:00:24 lr 0.000859 wd 0.0500 time 0.2385 (0.2455) data time 0.0007 (0.0014) model time 0.2377 (0.2441) loss 3.4611 (3.4137) grad_norm 3.0900 (2.0521) loss_scale 8192.0000 (7220.4900) mem 7379MB [2024-08-26 09:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1160/1251] eta 0:00:22 lr 0.000859 wd 0.0500 time 0.2548 (0.2454) data time 0.0008 (0.0014) model time 0.2540 (0.2441) loss 3.9020 (3.4141) grad_norm 1.9820 (2.0557) loss_scale 8192.0000 (7228.8579) mem 7379MB [2024-08-26 09:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1170/1251] eta 0:00:19 lr 0.000859 wd 0.0500 time 0.2368 (0.2454) data time 0.0007 (0.0014) model time 0.2361 (0.2440) loss 4.0557 (3.4157) grad_norm 1.6565 (2.0539) loss_scale 8192.0000 (7237.0828) mem 7379MB [2024-08-26 09:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1180/1251] eta 0:00:17 lr 0.000859 wd 0.0500 time 0.2433 (0.2454) data time 0.0010 (0.0014) model time 0.2423 (0.2440) loss 3.8320 (3.4168) grad_norm 1.6821 (2.0532) loss_scale 8192.0000 (7245.1685) mem 7379MB [2024-08-26 09:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1190/1251] eta 0:00:14 lr 0.000859 wd 0.0500 time 0.2359 (0.2455) data time 0.0010 (0.0014) model time 0.2349 (0.2441) loss 2.3791 (3.4141) grad_norm 1.8193 (2.0512) loss_scale 8192.0000 (7253.1184) mem 7379MB [2024-08-26 
09:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1200/1251] eta 0:00:12 lr 0.000859 wd 0.0500 time 0.2469 (0.2455) data time 0.0008 (0.0014) model time 0.2460 (0.2441) loss 2.6991 (3.4134) grad_norm 1.8679 (2.0516) loss_scale 8192.0000 (7260.9359) mem 7379MB [2024-08-26 09:55:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1210/1251] eta 0:00:10 lr 0.000859 wd 0.0500 time 0.2389 (0.2455) data time 0.0013 (0.0014) model time 0.2376 (0.2441) loss 3.0056 (3.4131) grad_norm 2.7015 (2.0515) loss_scale 8192.0000 (7268.6243) mem 7379MB [2024-08-26 09:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1220/1251] eta 0:00:07 lr 0.000859 wd 0.0500 time 0.2388 (0.2454) data time 0.0008 (0.0014) model time 0.2380 (0.2441) loss 4.2751 (3.4156) grad_norm 2.0939 (2.0517) loss_scale 8192.0000 (7276.1867) mem 7379MB [2024-08-26 09:55:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1230/1251] eta 0:00:05 lr 0.000859 wd 0.0500 time 0.2521 (0.2454) data time 0.0010 (0.0014) model time 0.2511 (0.2440) loss 3.9113 (3.4155) grad_norm 2.3261 (2.0495) loss_scale 8192.0000 (7283.6263) mem 7379MB [2024-08-26 09:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1240/1251] eta 0:00:02 lr 0.000859 wd 0.0500 time 0.2237 (0.2453) data time 0.0005 (0.0014) model time 0.2232 (0.2439) loss 3.9953 (3.4166) grad_norm 3.0455 (2.0493) loss_scale 8192.0000 (7290.9460) mem 7379MB [2024-08-26 09:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [88/300][1250/1251] eta 0:00:00 lr 0.000859 wd 0.0500 time 0.2342 (0.2452) data time 0.0007 (0.0014) model time 0.2335 (0.2438) loss 3.4594 (3.4133) grad_norm 2.3886 (2.0511) loss_scale 8192.0000 (7298.1487) mem 7379MB [2024-08-26 09:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 88 training takes 0:05:06 [2024-08-26 09:55:28 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 09:55:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 09:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.479 (0.479) Loss 0.5576 (0.5576) Acc@1 90.039 (90.039) Acc@5 97.656 (97.656) Mem 7379MB [2024-08-26 09:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.111) Loss 0.9355 (0.8229) Acc@1 81.445 (82.306) Acc@5 95.410 (96.049) Mem 7379MB [2024-08-26 09:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.094) Loss 1.1670 (0.8454) Acc@1 73.340 (81.287) Acc@5 92.285 (95.964) Mem 7379MB [2024-08-26 09:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.089) Loss 1.4922 (0.9647) Acc@1 65.234 (78.484) Acc@5 87.598 (94.522) Mem 7379MB [2024-08-26 09:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.2158 (1.0198) Acc@1 71.094 (77.089) Acc@5 91.797 (93.879) Mem 7379MB [2024-08-26 09:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.666 Acc@5 93.774 [2024-08-26 09:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.7% [2024-08-26 09:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.786 (0.786) Loss 0.4534 (0.4534) Acc@1 92.090 (92.090) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 09:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.147) Loss 0.7212 (0.7028) Acc@1 85.938 (85.023) Acc@5 96.484 (96.955) Mem 7379MB [2024-08-26 09:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.114) Loss 1.0020 (0.7262) Acc@1 76.172 (83.919) Acc@5 94.434 (96.898) Mem 7379MB [2024-08-26 09:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.076 (0.103) Loss 1.2930 (0.8298) Acc@1 67.969 (81.493) Acc@5 90.820 (95.716) Mem 7379MB [2024-08-26 09:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.1543 (0.8837) Acc@1 72.070 (79.973) Acc@5 92.480 (95.181) Mem 7379MB [2024-08-26 09:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.530 Acc@5 95.130 [2024-08-26 09:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.5% [2024-08-26 09:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.53% [2024-08-26 09:55:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 09:55:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 09:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][0/1251] eta 0:13:46 lr 0.000859 wd 0.0500 time 0.6606 (0.6606) data time 0.4358 (0.4358) model time 0.0000 (0.0000) loss 2.3907 (2.3907) grad_norm 1.9388 (1.9388) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][10/1251] eta 0:05:48 lr 0.000859 wd 0.0500 time 0.2486 (0.2809) data time 0.0007 (0.0406) model time 0.0000 (0.0000) loss 2.4278 (2.9536) grad_norm 2.4418 (2.1884) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][20/1251] eta 0:05:23 lr 0.000859 wd 0.0500 time 0.2494 (0.2627) data time 0.0010 (0.0217) model time 0.0000 (0.0000) loss 3.6496 (3.2643) grad_norm 1.5779 (2.1104) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][30/1251] eta 0:05:13 lr 0.000859 wd 0.0500 time 0.2424 (0.2566) data time 0.0010 (0.0151) 
model time 0.0000 (0.0000) loss 3.4760 (3.3981) grad_norm 2.3676 (2.2282) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][40/1251] eta 0:05:06 lr 0.000859 wd 0.0500 time 0.2354 (0.2529) data time 0.0010 (0.0117) model time 0.0000 (0.0000) loss 3.9803 (3.3495) grad_norm 2.1831 (2.2934) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][50/1251] eta 0:05:05 lr 0.000859 wd 0.0500 time 0.2467 (0.2546) data time 0.0011 (0.0096) model time 0.0000 (0.0000) loss 3.9381 (3.3494) grad_norm 1.9665 (2.2505) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][60/1251] eta 0:05:01 lr 0.000859 wd 0.0500 time 0.2446 (0.2531) data time 0.0010 (0.0082) model time 0.2435 (0.2445) loss 2.8780 (3.3597) grad_norm 2.3493 (2.2347) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][70/1251] eta 0:04:57 lr 0.000859 wd 0.0500 time 0.2455 (0.2516) data time 0.0007 (0.0072) model time 0.2448 (0.2431) loss 3.3566 (3.3417) grad_norm 1.8099 (2.2522) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][80/1251] eta 0:04:53 lr 0.000859 wd 0.0500 time 0.2417 (0.2504) data time 0.0007 (0.0064) model time 0.2410 (0.2425) loss 2.2100 (3.3199) grad_norm 1.5540 (2.2449) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][90/1251] eta 0:04:52 lr 0.000859 wd 0.0500 time 0.2397 (0.2520) data time 0.0009 (0.0059) model time 0.2388 (0.2474) loss 2.8063 (3.3581) grad_norm 2.1317 (2.2331) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[89/300][100/1251] eta 0:04:53 lr 0.000859 wd 0.0500 time 0.2389 (0.2549) data time 0.0010 (0.0054) model time 0.2380 (0.2540) loss 3.8538 (3.3494) grad_norm 2.0037 (2.2046) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][110/1251] eta 0:04:49 lr 0.000859 wd 0.0500 time 0.2416 (0.2539) data time 0.0010 (0.0050) model time 0.2406 (0.2521) loss 3.6014 (3.3727) grad_norm 1.9408 (2.1798) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][120/1251] eta 0:04:46 lr 0.000859 wd 0.0500 time 0.2475 (0.2530) data time 0.0008 (0.0047) model time 0.2467 (0.2507) loss 2.8575 (3.3727) grad_norm 1.9509 (2.1634) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][130/1251] eta 0:04:42 lr 0.000859 wd 0.0500 time 0.2449 (0.2523) data time 0.0007 (0.0044) model time 0.2441 (0.2498) loss 3.8373 (3.3626) grad_norm 2.0171 (2.1483) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][140/1251] eta 0:04:39 lr 0.000858 wd 0.0500 time 0.2408 (0.2516) data time 0.0009 (0.0042) model time 0.2398 (0.2488) loss 3.6038 (3.3665) grad_norm 1.8916 (2.1197) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][150/1251] eta 0:04:36 lr 0.000858 wd 0.0500 time 0.2502 (0.2511) data time 0.0007 (0.0040) model time 0.2496 (0.2482) loss 3.6281 (3.3668) grad_norm 2.1112 (2.1140) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][160/1251] eta 0:04:33 lr 0.000858 wd 0.0500 time 0.2390 (0.2507) data time 0.0011 (0.0038) model time 0.2379 (0.2478) loss 3.7835 (3.3629) grad_norm 2.7108 (2.1267) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 09:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][170/1251] eta 0:04:30 lr 0.000858 wd 0.0500 time 0.2400 (0.2502) data time 0.0010 (0.0036) model time 0.2390 (0.2472) loss 3.8270 (3.3842) grad_norm 2.3707 (2.1285) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][180/1251] eta 0:04:27 lr 0.000858 wd 0.0500 time 0.2492 (0.2497) data time 0.0010 (0.0035) model time 0.2482 (0.2467) loss 3.9825 (3.4028) grad_norm 2.1897 (2.1167) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][190/1251] eta 0:04:24 lr 0.000858 wd 0.0500 time 0.2474 (0.2493) data time 0.0010 (0.0033) model time 0.2464 (0.2463) loss 3.1063 (3.3957) grad_norm 3.1702 (2.1180) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][200/1251] eta 0:04:21 lr 0.000858 wd 0.0500 time 0.2348 (0.2488) data time 0.0007 (0.0032) model time 0.2340 (0.2457) loss 3.9391 (3.3915) grad_norm 1.7923 (2.1047) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][210/1251] eta 0:04:18 lr 0.000858 wd 0.0500 time 0.2427 (0.2484) data time 0.0007 (0.0031) model time 0.2420 (0.2453) loss 3.9101 (3.3844) grad_norm 2.8399 (2.0934) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][220/1251] eta 0:04:15 lr 0.000858 wd 0.0500 time 0.2482 (0.2480) data time 0.0008 (0.0030) model time 0.2474 (0.2450) loss 3.9010 (3.3974) grad_norm 2.1998 (2.0877) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][230/1251] eta 0:04:12 lr 0.000858 wd 0.0500 time 0.2454 (0.2478) data time 0.0011 (0.0029) 
model time 0.2443 (0.2448) loss 4.2655 (3.4077) grad_norm 1.4575 (2.0836) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][240/1251] eta 0:04:10 lr 0.000858 wd 0.0500 time 0.2353 (0.2476) data time 0.0010 (0.0029) model time 0.2343 (0.2447) loss 3.2518 (3.4093) grad_norm 1.8756 (2.0847) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][250/1251] eta 0:04:07 lr 0.000858 wd 0.0500 time 0.2354 (0.2474) data time 0.0010 (0.0028) model time 0.2343 (0.2445) loss 3.2067 (3.4229) grad_norm 2.0061 (2.0829) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][260/1251] eta 0:04:04 lr 0.000858 wd 0.0500 time 0.2475 (0.2472) data time 0.0007 (0.0027) model time 0.2468 (0.2444) loss 2.5089 (3.4237) grad_norm 1.9878 (2.0775) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][270/1251] eta 0:04:02 lr 0.000858 wd 0.0500 time 0.2354 (0.2470) data time 0.0008 (0.0027) model time 0.2346 (0.2442) loss 4.0514 (3.4184) grad_norm 1.8238 (2.0714) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][280/1251] eta 0:04:00 lr 0.000858 wd 0.0500 time 0.2430 (0.2475) data time 0.0007 (0.0026) model time 0.2423 (0.2449) loss 3.1339 (3.4230) grad_norm 2.0597 (2.0653) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][290/1251] eta 0:03:57 lr 0.000858 wd 0.0500 time 0.2413 (0.2473) data time 0.0012 (0.0026) model time 0.2401 (0.2447) loss 3.8223 (3.4217) grad_norm 2.4036 (2.0712) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[89/300][300/1251] eta 0:03:55 lr 0.000858 wd 0.0500 time 0.2378 (0.2471) data time 0.0008 (0.0025) model time 0.2370 (0.2446) loss 2.6709 (3.4182) grad_norm 1.3435 (2.0622) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][310/1251] eta 0:03:52 lr 0.000858 wd 0.0500 time 0.2365 (0.2469) data time 0.0008 (0.0025) model time 0.2356 (0.2444) loss 3.1948 (3.4254) grad_norm 1.3035 (2.0544) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][320/1251] eta 0:03:49 lr 0.000858 wd 0.0500 time 0.2439 (0.2468) data time 0.0007 (0.0024) model time 0.2432 (0.2443) loss 2.6073 (3.4266) grad_norm 3.4292 (2.0593) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][330/1251] eta 0:03:47 lr 0.000858 wd 0.0500 time 0.2347 (0.2470) data time 0.0007 (0.0024) model time 0.2340 (0.2446) loss 3.8051 (3.4318) grad_norm 2.9773 (2.0630) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][340/1251] eta 0:03:44 lr 0.000858 wd 0.0500 time 0.2364 (0.2469) data time 0.0011 (0.0023) model time 0.2353 (0.2445) loss 2.5434 (3.4304) grad_norm 2.4424 (2.0666) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][350/1251] eta 0:03:42 lr 0.000858 wd 0.0500 time 0.2483 (0.2468) data time 0.0007 (0.0023) model time 0.2476 (0.2444) loss 3.4573 (3.4354) grad_norm 1.5713 (2.0650) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][360/1251] eta 0:03:39 lr 0.000858 wd 0.0500 time 0.2432 (0.2466) data time 0.0010 (0.0023) model time 0.2422 (0.2443) loss 3.8364 (3.4361) grad_norm 2.2008 (2.0607) loss_scale 8192.0000 
(8192.0000) mem 7379MB [2024-08-26 09:57:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][370/1251] eta 0:03:37 lr 0.000858 wd 0.0500 time 0.2397 (0.2470) data time 0.0010 (0.0022) model time 0.2387 (0.2447) loss 3.5634 (3.4354) grad_norm 1.9184 (2.0643) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][380/1251] eta 0:03:34 lr 0.000858 wd 0.0500 time 0.2463 (0.2468) data time 0.0007 (0.0022) model time 0.2456 (0.2446) loss 3.7102 (3.4342) grad_norm 1.7781 (2.0619) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][390/1251] eta 0:03:32 lr 0.000858 wd 0.0500 time 0.2459 (0.2467) data time 0.0007 (0.0022) model time 0.2452 (0.2445) loss 3.7389 (3.4302) grad_norm 2.0320 (2.0637) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][400/1251] eta 0:03:31 lr 0.000858 wd 0.0500 time 0.4494 (0.2482) data time 0.0017 (0.0021) model time 0.4477 (0.2463) loss 3.4228 (3.4320) grad_norm 1.8732 (2.0632) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][410/1251] eta 0:03:28 lr 0.000858 wd 0.0500 time 0.2366 (0.2480) data time 0.0010 (0.0021) model time 0.2356 (0.2460) loss 3.3655 (3.4314) grad_norm 2.0984 (2.0654) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][420/1251] eta 0:03:25 lr 0.000858 wd 0.0500 time 0.2439 (0.2479) data time 0.0007 (0.0021) model time 0.2431 (0.2459) loss 3.9545 (3.4293) grad_norm 1.5668 (2.0663) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][430/1251] eta 0:03:23 lr 0.000858 wd 0.0500 time 0.2441 (0.2477) data time 0.0007 (0.0021) 
model time 0.2434 (0.2457) loss 3.9295 (3.4245) grad_norm 3.0923 (2.0616) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][440/1251] eta 0:03:20 lr 0.000858 wd 0.0500 time 0.2438 (0.2476) data time 0.0011 (0.0020) model time 0.2427 (0.2456) loss 4.2758 (3.4278) grad_norm 2.1636 (2.0588) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][450/1251] eta 0:03:18 lr 0.000858 wd 0.0500 time 0.2383 (0.2475) data time 0.0010 (0.0020) model time 0.2373 (0.2455) loss 3.3999 (3.4307) grad_norm 1.5355 (2.0527) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][460/1251] eta 0:03:15 lr 0.000857 wd 0.0500 time 0.2449 (0.2474) data time 0.0010 (0.0020) model time 0.2440 (0.2455) loss 3.7464 (3.4280) grad_norm 1.9075 (2.0444) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][470/1251] eta 0:03:13 lr 0.000857 wd 0.0500 time 0.2396 (0.2473) data time 0.0011 (0.0020) model time 0.2385 (0.2453) loss 3.4933 (3.4251) grad_norm 2.1866 (2.0399) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][480/1251] eta 0:03:10 lr 0.000857 wd 0.0500 time 0.2445 (0.2472) data time 0.0011 (0.0020) model time 0.2434 (0.2453) loss 3.7755 (3.4301) grad_norm 2.1162 (2.0460) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][490/1251] eta 0:03:08 lr 0.000857 wd 0.0500 time 0.2406 (0.2471) data time 0.0011 (0.0019) model time 0.2396 (0.2451) loss 2.6219 (3.4302) grad_norm 1.8050 (2.0437) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[89/300][500/1251] eta 0:03:05 lr 0.000857 wd 0.0500 time 0.2360 (0.2470) data time 0.0012 (0.0019) model time 0.2349 (0.2451) loss 3.5364 (3.4240) grad_norm 1.7627 (2.0449) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][510/1251] eta 0:03:02 lr 0.000857 wd 0.0500 time 0.2515 (0.2468) data time 0.0007 (0.0019) model time 0.2508 (0.2449) loss 3.4709 (3.4183) grad_norm 1.7254 (2.0460) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][520/1251] eta 0:03:00 lr 0.000857 wd 0.0500 time 0.2494 (0.2468) data time 0.0007 (0.0019) model time 0.2487 (0.2449) loss 3.5975 (3.4220) grad_norm 2.4873 (2.0499) loss_scale 8192.0000 (8192.0000) mem 7379MB [2024-08-26 09:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][530/1251] eta 0:02:57 lr 0.000857 wd 0.0500 time 0.2424 (0.2467) data time 0.0012 (0.0019) model time 0.2412 (0.2448) loss 2.3502 (3.4216) grad_norm 2.8719 (inf) loss_scale 4096.0000 (8145.7175) mem 7379MB [2024-08-26 09:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][540/1251] eta 0:02:55 lr 0.000857 wd 0.0500 time 0.2372 (0.2466) data time 0.0007 (0.0018) model time 0.2364 (0.2447) loss 3.0907 (3.4237) grad_norm 2.5903 (inf) loss_scale 4096.0000 (8070.8614) mem 7379MB [2024-08-26 09:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][550/1251] eta 0:02:52 lr 0.000857 wd 0.0500 time 0.2429 (0.2465) data time 0.0009 (0.0018) model time 0.2420 (0.2446) loss 3.8515 (3.4246) grad_norm 1.8383 (inf) loss_scale 4096.0000 (7998.7223) mem 7379MB [2024-08-26 09:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][560/1251] eta 0:02:50 lr 0.000857 wd 0.0500 time 0.2372 (0.2464) data time 0.0010 (0.0018) model time 0.2362 (0.2446) loss 3.0891 (3.4258) grad_norm 1.7947 (inf) loss_scale 4096.0000 (7929.1551) mem 7379MB 
[2024-08-26 09:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][570/1251] eta 0:02:47 lr 0.000857 wd 0.0500 time 0.2371 (0.2463) data time 0.0010 (0.0018) model time 0.2362 (0.2445) loss 2.7488 (3.4283) grad_norm 1.9714 (inf) loss_scale 4096.0000 (7862.0245) mem 7379MB [2024-08-26 09:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][580/1251] eta 0:02:45 lr 0.000857 wd 0.0500 time 0.2427 (0.2463) data time 0.0010 (0.0018) model time 0.2417 (0.2444) loss 3.4330 (3.4280) grad_norm 3.2010 (inf) loss_scale 4096.0000 (7797.2048) mem 7379MB [2024-08-26 09:58:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][590/1251] eta 0:02:42 lr 0.000857 wd 0.0500 time 0.2391 (0.2462) data time 0.0009 (0.0018) model time 0.2382 (0.2444) loss 4.1092 (3.4289) grad_norm 2.5420 (inf) loss_scale 4096.0000 (7734.5787) mem 7379MB [2024-08-26 09:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][600/1251] eta 0:02:40 lr 0.000857 wd 0.0500 time 0.2445 (0.2462) data time 0.0007 (0.0018) model time 0.2437 (0.2443) loss 3.8844 (3.4327) grad_norm 1.6944 (inf) loss_scale 4096.0000 (7674.0366) mem 7379MB [2024-08-26 09:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][610/1251] eta 0:02:37 lr 0.000857 wd 0.0500 time 0.2299 (0.2461) data time 0.0010 (0.0018) model time 0.2289 (0.2443) loss 3.7031 (3.4307) grad_norm 1.8318 (inf) loss_scale 4096.0000 (7615.4763) mem 7379MB [2024-08-26 09:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][620/1251] eta 0:02:35 lr 0.000857 wd 0.0500 time 0.2351 (0.2460) data time 0.0011 (0.0017) model time 0.2340 (0.2442) loss 3.1421 (3.4326) grad_norm 2.6662 (inf) loss_scale 4096.0000 (7558.8019) mem 7379MB [2024-08-26 09:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][630/1251] eta 0:02:32 lr 0.000857 wd 0.0500 time 0.2451 (0.2459) data time 0.0007 (0.0017) model time 0.2444 (0.2441) loss 2.6769 
(3.4302) grad_norm 1.8768 (inf) loss_scale 4096.0000 (7503.9239) mem 7379MB [2024-08-26 09:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][640/1251] eta 0:02:30 lr 0.000857 wd 0.0500 time 0.2421 (0.2465) data time 0.0009 (0.0017) model time 0.2412 (0.2448) loss 2.2286 (3.4276) grad_norm 3.1155 (inf) loss_scale 4096.0000 (7450.7582) mem 7379MB [2024-08-26 09:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][650/1251] eta 0:02:28 lr 0.000857 wd 0.0500 time 0.2587 (0.2465) data time 0.0007 (0.0017) model time 0.2580 (0.2447) loss 3.8540 (3.4293) grad_norm 1.5777 (inf) loss_scale 4096.0000 (7399.2258) mem 7379MB [2024-08-26 09:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][660/1251] eta 0:02:25 lr 0.000857 wd 0.0500 time 0.2370 (0.2464) data time 0.0009 (0.0017) model time 0.2361 (0.2447) loss 3.5529 (3.4265) grad_norm 1.9789 (inf) loss_scale 4096.0000 (7349.2526) mem 7379MB [2024-08-26 09:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][670/1251] eta 0:02:23 lr 0.000857 wd 0.0500 time 0.2419 (0.2464) data time 0.0008 (0.0017) model time 0.2411 (0.2446) loss 2.3314 (3.4241) grad_norm 1.9028 (inf) loss_scale 4096.0000 (7300.7690) mem 7379MB [2024-08-26 09:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][680/1251] eta 0:02:20 lr 0.000857 wd 0.0500 time 0.2388 (0.2463) data time 0.0010 (0.0017) model time 0.2378 (0.2446) loss 3.8179 (3.4238) grad_norm 2.5385 (inf) loss_scale 4096.0000 (7253.7093) mem 7379MB [2024-08-26 09:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][690/1251] eta 0:02:18 lr 0.000857 wd 0.0500 time 0.2367 (0.2462) data time 0.0012 (0.0017) model time 0.2355 (0.2445) loss 3.4987 (3.4256) grad_norm 4.1572 (inf) loss_scale 4096.0000 (7208.0116) mem 7379MB [2024-08-26 09:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][700/1251] eta 0:02:15 lr 0.000857 wd 0.0500 time 0.2413 
(0.2461) data time 0.0008 (0.0017) model time 0.2405 (0.2444) loss 3.6371 (3.4254) grad_norm 1.8096 (inf) loss_scale 4096.0000 (7163.6177) mem 7379MB [2024-08-26 09:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][710/1251] eta 0:02:13 lr 0.000857 wd 0.0500 time 0.2442 (0.2461) data time 0.0011 (0.0017) model time 0.2431 (0.2444) loss 3.2123 (3.4280) grad_norm 2.2005 (inf) loss_scale 4096.0000 (7120.4726) mem 7379MB [2024-08-26 09:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][720/1251] eta 0:02:10 lr 0.000857 wd 0.0500 time 0.2410 (0.2460) data time 0.0010 (0.0016) model time 0.2401 (0.2443) loss 3.5515 (3.4276) grad_norm 1.8779 (inf) loss_scale 4096.0000 (7078.5243) mem 7379MB [2024-08-26 09:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][730/1251] eta 0:02:08 lr 0.000857 wd 0.0500 time 0.2463 (0.2460) data time 0.0009 (0.0016) model time 0.2454 (0.2443) loss 3.2717 (3.4280) grad_norm 1.3025 (inf) loss_scale 4096.0000 (7037.7237) mem 7379MB [2024-08-26 09:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][740/1251] eta 0:02:05 lr 0.000857 wd 0.0500 time 0.2354 (0.2459) data time 0.0009 (0.0016) model time 0.2345 (0.2442) loss 4.1895 (3.4298) grad_norm 1.9552 (inf) loss_scale 4096.0000 (6998.0243) mem 7379MB [2024-08-26 09:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][750/1251] eta 0:02:03 lr 0.000857 wd 0.0500 time 0.2522 (0.2458) data time 0.0010 (0.0016) model time 0.2512 (0.2442) loss 3.3160 (3.4312) grad_norm 2.0531 (inf) loss_scale 4096.0000 (6959.3822) mem 7379MB [2024-08-26 09:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][760/1251] eta 0:02:00 lr 0.000857 wd 0.0500 time 0.2384 (0.2458) data time 0.0008 (0.0016) model time 0.2375 (0.2441) loss 4.2968 (3.4324) grad_norm 1.9767 (inf) loss_scale 4096.0000 (6921.7556) mem 7379MB [2024-08-26 09:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [89/300][770/1251] eta 0:01:58 lr 0.000857 wd 0.0500 time 0.2297 (0.2458) data time 0.0011 (0.0016) model time 0.2286 (0.2441) loss 4.2910 (3.4318) grad_norm 2.8703 (inf) loss_scale 4096.0000 (6885.1051) mem 7379MB [2024-08-26 09:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][780/1251] eta 0:01:55 lr 0.000856 wd 0.0500 time 0.2381 (0.2457) data time 0.0007 (0.0016) model time 0.2374 (0.2441) loss 2.7951 (3.4325) grad_norm 1.8656 (inf) loss_scale 4096.0000 (6849.3931) mem 7379MB [2024-08-26 09:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][790/1251] eta 0:01:53 lr 0.000856 wd 0.0500 time 0.2607 (0.2457) data time 0.0010 (0.0016) model time 0.2597 (0.2441) loss 3.4798 (3.4335) grad_norm 1.8598 (inf) loss_scale 4096.0000 (6814.5841) mem 7379MB [2024-08-26 09:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][800/1251] eta 0:01:50 lr 0.000856 wd 0.0500 time 0.2424 (0.2458) data time 0.0010 (0.0016) model time 0.2414 (0.2442) loss 3.0854 (3.4345) grad_norm 2.5811 (inf) loss_scale 4096.0000 (6780.6442) mem 7379MB [2024-08-26 09:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][810/1251] eta 0:01:48 lr 0.000856 wd 0.0500 time 0.2369 (0.2458) data time 0.0012 (0.0016) model time 0.2357 (0.2442) loss 3.6323 (3.4320) grad_norm 1.4983 (inf) loss_scale 4096.0000 (6747.5413) mem 7379MB [2024-08-26 09:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][820/1251] eta 0:01:46 lr 0.000856 wd 0.0500 time 0.2432 (0.2460) data time 0.0009 (0.0016) model time 0.2423 (0.2444) loss 4.1602 (3.4343) grad_norm 2.2545 (inf) loss_scale 4096.0000 (6715.2448) mem 7379MB [2024-08-26 09:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][830/1251] eta 0:01:43 lr 0.000856 wd 0.0500 time 0.2475 (0.2460) data time 0.0010 (0.0016) model time 0.2466 (0.2443) loss 2.5062 (3.4322) grad_norm 2.1335 (inf) loss_scale 4096.0000 (6683.7256) mem 7379MB 
[2024-08-26 09:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][840/1251] eta 0:01:41 lr 0.000856 wd 0.0500 time 0.2416 (0.2459) data time 0.0010 (0.0016) model time 0.2406 (0.2443) loss 4.1649 (3.4348) grad_norm 1.8891 (inf) loss_scale 4096.0000 (6652.9560) mem 7379MB [2024-08-26 09:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][850/1251] eta 0:01:38 lr 0.000856 wd 0.0500 time 0.2420 (0.2459) data time 0.0008 (0.0016) model time 0.2412 (0.2443) loss 4.2460 (3.4338) grad_norm 1.7596 (inf) loss_scale 4096.0000 (6622.9095) mem 7379MB [2024-08-26 09:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][860/1251] eta 0:01:36 lr 0.000856 wd 0.0500 time 0.2397 (0.2461) data time 0.0009 (0.0015) model time 0.2388 (0.2445) loss 3.8797 (3.4350) grad_norm 1.7431 (inf) loss_scale 4096.0000 (6593.5610) mem 7379MB [2024-08-26 09:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][870/1251] eta 0:01:33 lr 0.000856 wd 0.0500 time 0.2377 (0.2460) data time 0.0009 (0.0015) model time 0.2368 (0.2444) loss 3.8087 (3.4370) grad_norm 2.2926 (inf) loss_scale 4096.0000 (6564.8863) mem 7379MB [2024-08-26 09:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][880/1251] eta 0:01:31 lr 0.000856 wd 0.0500 time 0.2327 (0.2459) data time 0.0010 (0.0015) model time 0.2316 (0.2444) loss 3.7368 (3.4377) grad_norm 1.4854 (inf) loss_scale 4096.0000 (6536.8627) mem 7379MB [2024-08-26 09:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][890/1251] eta 0:01:28 lr 0.000856 wd 0.0500 time 0.2518 (0.2459) data time 0.0007 (0.0015) model time 0.2511 (0.2443) loss 4.1013 (3.4375) grad_norm 2.8057 (inf) loss_scale 4096.0000 (6509.4680) mem 7379MB [2024-08-26 09:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][900/1251] eta 0:01:26 lr 0.000856 wd 0.0500 time 0.2373 (0.2459) data time 0.0009 (0.0015) model time 0.2364 (0.2443) loss 3.6051 
(3.4336) grad_norm 2.3278 (inf) loss_scale 4096.0000 (6482.6815) mem 7379MB [2024-08-26 09:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][910/1251] eta 0:01:23 lr 0.000856 wd 0.0500 time 0.2456 (0.2461) data time 0.0007 (0.0015) model time 0.2449 (0.2445) loss 3.9461 (3.4335) grad_norm 1.5344 (inf) loss_scale 4096.0000 (6456.4830) mem 7379MB [2024-08-26 09:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][920/1251] eta 0:01:21 lr 0.000856 wd 0.0500 time 0.2441 (0.2461) data time 0.0011 (0.0015) model time 0.2429 (0.2445) loss 2.5393 (3.4343) grad_norm 2.5067 (inf) loss_scale 4096.0000 (6430.8534) mem 7379MB [2024-08-26 09:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][930/1251] eta 0:01:18 lr 0.000856 wd 0.0500 time 0.2395 (0.2460) data time 0.0010 (0.0015) model time 0.2385 (0.2445) loss 3.9460 (3.4334) grad_norm 1.6271 (inf) loss_scale 4096.0000 (6405.7744) mem 7379MB [2024-08-26 09:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][940/1251] eta 0:01:16 lr 0.000856 wd 0.0500 time 0.2373 (0.2460) data time 0.0007 (0.0015) model time 0.2366 (0.2444) loss 4.3988 (3.4344) grad_norm 1.5733 (inf) loss_scale 4096.0000 (6381.2285) mem 7379MB [2024-08-26 09:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][950/1251] eta 0:01:14 lr 0.000856 wd 0.0500 time 0.2365 (0.2459) data time 0.0011 (0.0015) model time 0.2354 (0.2444) loss 3.7799 (3.4368) grad_norm 2.2286 (inf) loss_scale 4096.0000 (6357.1987) mem 7379MB [2024-08-26 09:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][960/1251] eta 0:01:11 lr 0.000856 wd 0.0500 time 0.2379 (0.2459) data time 0.0011 (0.0015) model time 0.2368 (0.2443) loss 3.9362 (3.4336) grad_norm 1.3740 (inf) loss_scale 4096.0000 (6333.6691) mem 7379MB [2024-08-26 09:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][970/1251] eta 0:01:09 lr 0.000856 wd 0.0500 time 0.2426 
(0.2458) data time 0.0009 (0.0015) model time 0.2417 (0.2443) loss 3.5575 (3.4365) grad_norm 4.0772 (inf) loss_scale 4096.0000 (6310.6241) mem 7379MB [2024-08-26 09:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][980/1251] eta 0:01:06 lr 0.000856 wd 0.0500 time 0.2318 (0.2460) data time 0.0010 (0.0015) model time 0.2309 (0.2445) loss 3.7079 (3.4377) grad_norm 1.6462 (inf) loss_scale 4096.0000 (6288.0489) mem 7379MB [2024-08-26 09:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][990/1251] eta 0:01:04 lr 0.000856 wd 0.0500 time 0.2391 (0.2459) data time 0.0007 (0.0015) model time 0.2383 (0.2444) loss 3.8408 (3.4374) grad_norm 2.3351 (inf) loss_scale 4096.0000 (6265.9294) mem 7379MB [2024-08-26 09:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1000/1251] eta 0:01:01 lr 0.000856 wd 0.0500 time 0.2453 (0.2459) data time 0.0009 (0.0015) model time 0.2443 (0.2444) loss 3.3053 (3.4355) grad_norm 2.2687 (inf) loss_scale 4096.0000 (6244.2517) mem 7379MB [2024-08-26 09:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1010/1251] eta 0:00:59 lr 0.000856 wd 0.0500 time 0.2453 (0.2458) data time 0.0007 (0.0015) model time 0.2446 (0.2443) loss 3.6125 (3.4344) grad_norm 1.8098 (inf) loss_scale 4096.0000 (6223.0030) mem 7379MB [2024-08-26 09:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1020/1251] eta 0:00:56 lr 0.000856 wd 0.0500 time 0.2371 (0.2460) data time 0.0007 (0.0015) model time 0.2364 (0.2445) loss 3.7376 (3.4337) grad_norm 1.8920 (inf) loss_scale 4096.0000 (6202.1704) mem 7379MB [2024-08-26 09:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1030/1251] eta 0:00:54 lr 0.000856 wd 0.0500 time 0.2412 (0.2461) data time 0.0009 (0.0015) model time 0.2403 (0.2447) loss 3.0953 (3.4342) grad_norm 1.9175 (inf) loss_scale 4096.0000 (6181.7420) mem 7379MB [2024-08-26 09:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [89/300][1040/1251] eta 0:00:51 lr 0.000856 wd 0.0500 time 0.2427 (0.2461) data time 0.0010 (0.0015) model time 0.2417 (0.2446) loss 3.0345 (3.4350) grad_norm 2.0250 (inf) loss_scale 4096.0000 (6161.7061) mem 7379MB [2024-08-26 09:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1050/1251] eta 0:00:49 lr 0.000856 wd 0.0500 time 0.2420 (0.2461) data time 0.0008 (0.0015) model time 0.2413 (0.2446) loss 3.6441 (3.4337) grad_norm 1.9605 (inf) loss_scale 4096.0000 (6142.0514) mem 7379MB [2024-08-26 09:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1060/1251] eta 0:00:46 lr 0.000856 wd 0.0500 time 0.2420 (0.2461) data time 0.0011 (0.0015) model time 0.2409 (0.2446) loss 2.6544 (3.4338) grad_norm 3.8801 (inf) loss_scale 4096.0000 (6122.7672) mem 7379MB [2024-08-26 10:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1070/1251] eta 0:00:44 lr 0.000856 wd 0.0500 time 0.2441 (0.2460) data time 0.0008 (0.0014) model time 0.2432 (0.2445) loss 3.5314 (3.4333) grad_norm 2.0082 (inf) loss_scale 4096.0000 (6103.8431) mem 7379MB [2024-08-26 10:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1080/1251] eta 0:00:42 lr 0.000856 wd 0.0500 time 0.2545 (0.2460) data time 0.0007 (0.0014) model time 0.2537 (0.2445) loss 3.0946 (3.4317) grad_norm 2.1856 (inf) loss_scale 4096.0000 (6085.2692) mem 7379MB [2024-08-26 10:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1090/1251] eta 0:00:39 lr 0.000856 wd 0.0500 time 0.2476 (0.2459) data time 0.0010 (0.0014) model time 0.2465 (0.2445) loss 3.4726 (3.4307) grad_norm 2.8676 (inf) loss_scale 4096.0000 (6067.0357) mem 7379MB [2024-08-26 10:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1100/1251] eta 0:00:37 lr 0.000855 wd 0.0500 time 0.2515 (0.2459) data time 0.0010 (0.0014) model time 0.2505 (0.2444) loss 3.4606 (3.4322) grad_norm 1.5186 (inf) loss_scale 4096.0000 (6049.1335) 
mem 7379MB [2024-08-26 10:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1110/1251] eta 0:00:34 lr 0.000855 wd 0.0500 time 0.2464 (0.2459) data time 0.0008 (0.0014) model time 0.2456 (0.2444) loss 3.8586 (3.4315) grad_norm 2.3528 (inf) loss_scale 4096.0000 (6031.5536) mem 7379MB [2024-08-26 10:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1120/1251] eta 0:00:32 lr 0.000855 wd 0.0500 time 0.2413 (0.2458) data time 0.0007 (0.0014) model time 0.2405 (0.2444) loss 4.0906 (3.4315) grad_norm 2.1341 (inf) loss_scale 4096.0000 (6014.2872) mem 7379MB [2024-08-26 10:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1130/1251] eta 0:00:29 lr 0.000855 wd 0.0500 time 0.2398 (0.2458) data time 0.0008 (0.0014) model time 0.2391 (0.2443) loss 3.9784 (3.4292) grad_norm 2.4468 (inf) loss_scale 4096.0000 (5997.3263) mem 7379MB [2024-08-26 10:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1140/1251] eta 0:00:27 lr 0.000855 wd 0.0500 time 0.2424 (0.2458) data time 0.0007 (0.0014) model time 0.2417 (0.2443) loss 4.0307 (3.4297) grad_norm 2.2949 (inf) loss_scale 4096.0000 (5980.6626) mem 7379MB [2024-08-26 10:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1150/1251] eta 0:00:24 lr 0.000855 wd 0.0500 time 0.2377 (0.2457) data time 0.0009 (0.0014) model time 0.2368 (0.2443) loss 2.5038 (3.4286) grad_norm 2.0448 (inf) loss_scale 4096.0000 (5964.2884) mem 7379MB [2024-08-26 10:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1160/1251] eta 0:00:22 lr 0.000855 wd 0.0500 time 0.2572 (0.2458) data time 0.0008 (0.0014) model time 0.2564 (0.2444) loss 4.0461 (3.4324) grad_norm 1.6759 (inf) loss_scale 4096.0000 (5948.1964) mem 7379MB [2024-08-26 10:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1170/1251] eta 0:00:19 lr 0.000855 wd 0.0500 time 0.2381 (0.2458) data time 0.0010 (0.0014) model time 0.2371 
(0.2444) loss 3.3721 (3.4316) grad_norm 1.6683 (inf) loss_scale 4096.0000 (5932.3792) mem 7379MB [2024-08-26 10:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1180/1251] eta 0:00:17 lr 0.000855 wd 0.0500 time 0.2445 (0.2460) data time 0.0009 (0.0014) model time 0.2436 (0.2445) loss 3.4371 (3.4341) grad_norm 1.7323 (inf) loss_scale 4096.0000 (5916.8298) mem 7379MB [2024-08-26 10:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1190/1251] eta 0:00:15 lr 0.000855 wd 0.0500 time 0.2487 (0.2459) data time 0.0009 (0.0014) model time 0.2478 (0.2445) loss 3.5663 (3.4355) grad_norm 2.6774 (inf) loss_scale 4096.0000 (5901.5416) mem 7379MB [2024-08-26 10:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1200/1251] eta 0:00:12 lr 0.000855 wd 0.0500 time 0.2369 (0.2459) data time 0.0009 (0.0014) model time 0.2360 (0.2445) loss 2.1608 (3.4328) grad_norm 1.9647 (inf) loss_scale 4096.0000 (5886.5079) mem 7379MB [2024-08-26 10:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1210/1251] eta 0:00:10 lr 0.000855 wd 0.0500 time 0.2498 (0.2459) data time 0.0010 (0.0014) model time 0.2488 (0.2444) loss 2.2391 (3.4324) grad_norm 2.1304 (inf) loss_scale 4096.0000 (5871.7225) mem 7379MB [2024-08-26 10:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1220/1251] eta 0:00:07 lr 0.000855 wd 0.0500 time 0.2477 (0.2458) data time 0.0009 (0.0014) model time 0.2468 (0.2444) loss 3.7068 (3.4307) grad_norm 1.6827 (inf) loss_scale 4096.0000 (5857.1794) mem 7379MB [2024-08-26 10:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1230/1251] eta 0:00:05 lr 0.000855 wd 0.0500 time 0.2404 (0.2458) data time 0.0009 (0.0014) model time 0.2395 (0.2444) loss 3.5755 (3.4316) grad_norm 1.7213 (inf) loss_scale 4096.0000 (5842.8725) mem 7379MB [2024-08-26 10:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1240/1251] eta 0:00:02 lr 
0.000855 wd 0.0500 time 0.2292 (0.2457) data time 0.0005 (0.0014) model time 0.2287 (0.2443) loss 3.7080 (3.4323) grad_norm 1.5377 (inf) loss_scale 4096.0000 (5828.7961) mem 7379MB [2024-08-26 10:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [89/300][1250/1251] eta 0:00:00 lr 0.000855 wd 0.0500 time 0.2268 (0.2455) data time 0.0005 (0.0014) model time 0.2263 (0.2441) loss 3.9858 (3.4310) grad_norm 1.9478 (inf) loss_scale 4096.0000 (5814.9448) mem 7379MB [2024-08-26 10:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 89 training takes 0:05:07 [2024-08-26 10:00:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:00:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 10:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.454 (0.454) Loss 0.5674 (0.5674) Acc@1 89.160 (89.160) Acc@5 97.754 (97.754) Mem 7379MB [2024-08-26 10:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.112) Loss 0.8013 (0.8224) Acc@1 82.617 (81.792) Acc@5 96.191 (96.023) Mem 7379MB [2024-08-26 10:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.096) Loss 1.1670 (0.8423) Acc@1 72.070 (81.245) Acc@5 92.383 (96.047) Mem 7379MB [2024-08-26 10:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.090) Loss 1.4697 (0.9551) Acc@1 65.137 (78.635) Acc@5 89.258 (94.698) Mem 7379MB [2024-08-26 10:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.3350 (1.0146) Acc@1 68.750 (77.244) Acc@5 90.527 (93.974) Mem 7379MB [2024-08-26 10:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.930 Acc@5 93.896 [2024-08-26 10:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 
50000 test images: 76.9% [2024-08-26 10:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 76.93% [2024-08-26 10:00:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 10:00:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-26 10:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.414 (0.414) Loss 0.4517 (0.4517) Acc@1 92.188 (92.188) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.108) Loss 0.7183 (0.7012) Acc@1 85.938 (84.952) Acc@5 96.289 (96.964) Mem 7379MB [2024-08-26 10:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.094) Loss 1.0020 (0.7245) Acc@1 76.562 (83.905) Acc@5 94.238 (96.931) Mem 7379MB [2024-08-26 10:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.089) Loss 1.2900 (0.8279) Acc@1 67.871 (81.464) Acc@5 90.820 (95.763) Mem 7379MB [2024-08-26 10:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.1523 (0.8814) Acc@1 72.168 (79.966) Acc@5 92.480 (95.229) Mem 7379MB [2024-08-26 10:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.542 Acc@5 95.180 [2024-08-26 10:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.5% [2024-08-26 10:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.54% [2024-08-26 10:00:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:00:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 10:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][0/1251] eta 0:14:44 lr 0.000855 wd 0.0500 time 0.7072 (0.7072) data time 0.4688 (0.4688) model time 0.0000 (0.0000) loss 3.9164 (3.9164) grad_norm 2.7482 (2.7482) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][10/1251] eta 0:05:52 lr 0.000855 wd 0.0500 time 0.2411 (0.2839) data time 0.0009 (0.0435) model time 0.0000 (0.0000) loss 2.5802 (3.6742) grad_norm 2.0410 (2.0550) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][20/1251] eta 0:05:25 lr 0.000855 wd 0.0500 time 0.2443 (0.2643) data time 0.0008 (0.0233) model time 0.0000 (0.0000) loss 3.9119 (3.5810) grad_norm 2.1793 (2.0096) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][30/1251] eta 0:05:14 lr 0.000855 wd 0.0500 time 0.2466 (0.2575) data time 0.0009 (0.0161) model time 0.0000 (0.0000) loss 3.4133 (3.5070) grad_norm 2.8624 (2.1451) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][40/1251] eta 0:05:13 lr 0.000855 wd 0.0500 time 0.4453 (0.2589) data time 0.0011 (0.0124) model time 0.0000 (0.0000) loss 3.8055 (3.4406) grad_norm 2.1033 (2.0725) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][50/1251] eta 0:05:12 lr 0.000855 wd 0.0500 time 0.2443 (0.2600) data time 0.0011 (0.0102) model time 0.0000 (0.0000) loss 3.4760 (3.4539) grad_norm 2.2862 (2.0806) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][60/1251] eta 0:05:05 lr 0.000855 wd 0.0500 time 0.2416 (0.2568) data time 0.0010 (0.0087) model time 0.2406 (0.2396) loss 
3.6249 (3.4555) grad_norm 2.4007 (2.0519) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][70/1251] eta 0:05:02 lr 0.000855 wd 0.0500 time 0.3696 (0.2565) data time 0.0011 (0.0076) model time 0.3685 (0.2466) loss 2.9430 (3.4178) grad_norm 1.8671 (2.0284) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][80/1251] eta 0:04:58 lr 0.000855 wd 0.0500 time 0.2403 (0.2546) data time 0.0007 (0.0068) model time 0.2396 (0.2443) loss 2.7259 (3.3913) grad_norm 2.0043 (2.0249) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][90/1251] eta 0:04:54 lr 0.000855 wd 0.0500 time 0.2433 (0.2533) data time 0.0007 (0.0061) model time 0.2426 (0.2439) loss 3.4946 (3.4026) grad_norm 2.0809 (2.0343) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][100/1251] eta 0:04:50 lr 0.000855 wd 0.0500 time 0.2418 (0.2522) data time 0.0011 (0.0056) model time 0.2407 (0.2432) loss 3.7650 (3.4134) grad_norm 2.0545 (2.0505) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][110/1251] eta 0:04:46 lr 0.000855 wd 0.0500 time 0.2357 (0.2513) data time 0.0010 (0.0052) model time 0.2348 (0.2429) loss 3.5064 (3.4275) grad_norm 1.5536 (2.0644) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][120/1251] eta 0:04:43 lr 0.000855 wd 0.0500 time 0.2419 (0.2505) data time 0.0008 (0.0049) model time 0.2411 (0.2425) loss 3.4991 (3.4354) grad_norm 2.1170 (2.0529) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][130/1251] eta 0:04:40 lr 
0.000855 wd 0.0500 time 0.2369 (0.2499) data time 0.0010 (0.0046) model time 0.2360 (0.2424) loss 3.4167 (3.4420) grad_norm 2.8268 (2.0537) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][140/1251] eta 0:04:36 lr 0.000855 wd 0.0500 time 0.2433 (0.2492) data time 0.0010 (0.0043) model time 0.2422 (0.2421) loss 3.3319 (3.4637) grad_norm 3.6572 (2.0799) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][150/1251] eta 0:04:33 lr 0.000855 wd 0.0500 time 0.2417 (0.2488) data time 0.0008 (0.0041) model time 0.2409 (0.2420) loss 2.6715 (3.4575) grad_norm 1.9670 (2.0932) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][160/1251] eta 0:04:30 lr 0.000855 wd 0.0500 time 0.2378 (0.2483) data time 0.0010 (0.0039) model time 0.2368 (0.2418) loss 3.7549 (3.4659) grad_norm 1.9560 (2.0918) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][170/1251] eta 0:04:28 lr 0.000854 wd 0.0500 time 0.2502 (0.2480) data time 0.0010 (0.0038) model time 0.2492 (0.2418) loss 3.6744 (3.4726) grad_norm 2.3228 (2.0914) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][180/1251] eta 0:04:26 lr 0.000854 wd 0.0500 time 0.2433 (0.2489) data time 0.0009 (0.0036) model time 0.2425 (0.2435) loss 2.8965 (3.4618) grad_norm 2.8104 (2.1001) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][190/1251] eta 0:04:24 lr 0.000854 wd 0.0500 time 0.2436 (0.2496) data time 0.0009 (0.0035) model time 0.2427 (0.2448) loss 2.7639 (3.4682) grad_norm 3.6614 (2.1018) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][200/1251] eta 0:04:21 lr 0.000854 wd 0.0500 time 0.2425 (0.2492) data time 0.0009 (0.0033) model time 0.2417 (0.2445) loss 3.3596 (3.4706) grad_norm 2.6837 (2.1025) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][210/1251] eta 0:04:18 lr 0.000854 wd 0.0500 time 0.2432 (0.2488) data time 0.0010 (0.0032) model time 0.2422 (0.2442) loss 3.3199 (3.4720) grad_norm 2.1838 (2.1032) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][220/1251] eta 0:04:16 lr 0.000854 wd 0.0500 time 0.2510 (0.2486) data time 0.0012 (0.0031) model time 0.2498 (0.2442) loss 3.0727 (3.4749) grad_norm 2.7694 (2.1084) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][230/1251] eta 0:04:13 lr 0.000854 wd 0.0500 time 0.2440 (0.2484) data time 0.0007 (0.0030) model time 0.2432 (0.2441) loss 3.9798 (3.4676) grad_norm 2.6883 (2.1117) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][240/1251] eta 0:04:10 lr 0.000854 wd 0.0500 time 0.2413 (0.2481) data time 0.0009 (0.0030) model time 0.2404 (0.2439) loss 3.9778 (3.4697) grad_norm 2.0490 (2.1037) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][250/1251] eta 0:04:08 lr 0.000854 wd 0.0500 time 0.2429 (0.2478) data time 0.0008 (0.0029) model time 0.2422 (0.2437) loss 2.6900 (3.4651) grad_norm 2.0890 (2.0975) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][260/1251] eta 0:04:06 lr 0.000854 wd 0.0500 time 0.2418 (0.2484) data time 0.0013 (0.0028) model time 0.2405 (0.2446) loss 3.3329 
(3.4671) grad_norm 1.5683 (2.1000) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][270/1251] eta 0:04:03 lr 0.000854 wd 0.0500 time 0.2746 (0.2483) data time 0.0009 (0.0027) model time 0.2737 (0.2446) loss 3.6410 (3.4695) grad_norm 1.8782 (2.1088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][280/1251] eta 0:04:00 lr 0.000854 wd 0.0500 time 0.2372 (0.2482) data time 0.0010 (0.0027) model time 0.2362 (0.2445) loss 3.6309 (3.4690) grad_norm 1.5258 (2.1145) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][290/1251] eta 0:03:58 lr 0.000854 wd 0.0500 time 0.2355 (0.2480) data time 0.0014 (0.0026) model time 0.2341 (0.2444) loss 3.7049 (3.4695) grad_norm 2.2448 (2.1173) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][300/1251] eta 0:03:56 lr 0.000854 wd 0.0500 time 0.2387 (0.2485) data time 0.0011 (0.0026) model time 0.2376 (0.2452) loss 3.7292 (3.4695) grad_norm 1.6944 (2.1129) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][310/1251] eta 0:03:53 lr 0.000854 wd 0.0500 time 0.2390 (0.2483) data time 0.0009 (0.0025) model time 0.2381 (0.2450) loss 3.2346 (3.4673) grad_norm 1.5984 (2.1038) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][320/1251] eta 0:03:51 lr 0.000854 wd 0.0500 time 0.2381 (0.2481) data time 0.0010 (0.0025) model time 0.2371 (0.2449) loss 4.0438 (3.4682) grad_norm 2.1264 (2.1014) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][330/1251] eta 0:03:48 lr 0.000854 wd 
0.0500 time 0.2446 (0.2480) data time 0.0009 (0.0024) model time 0.2436 (0.2448) loss 3.6537 (3.4789) grad_norm 1.6855 (2.0967) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][340/1251] eta 0:03:45 lr 0.000854 wd 0.0500 time 0.2434 (0.2478) data time 0.0007 (0.0024) model time 0.2427 (0.2446) loss 3.2999 (3.4827) grad_norm 1.8503 (2.1102) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][350/1251] eta 0:03:43 lr 0.000854 wd 0.0500 time 0.2448 (0.2477) data time 0.0010 (0.0024) model time 0.2438 (0.2446) loss 4.1043 (3.4804) grad_norm 1.6124 (2.1100) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][360/1251] eta 0:03:40 lr 0.000854 wd 0.0500 time 0.2424 (0.2477) data time 0.0012 (0.0023) model time 0.2412 (0.2446) loss 3.3017 (3.4831) grad_norm 2.0235 (2.1026) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][370/1251] eta 0:03:38 lr 0.000854 wd 0.0500 time 0.2449 (0.2475) data time 0.0009 (0.0023) model time 0.2439 (0.2444) loss 3.3039 (3.4853) grad_norm 1.5054 (2.0984) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][380/1251] eta 0:03:35 lr 0.000854 wd 0.0500 time 0.2401 (0.2473) data time 0.0010 (0.0023) model time 0.2391 (0.2442) loss 3.8182 (3.4857) grad_norm 1.9743 (2.0969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][390/1251] eta 0:03:32 lr 0.000854 wd 0.0500 time 0.2403 (0.2471) data time 0.0010 (0.0023) model time 0.2392 (0.2441) loss 3.5146 (3.4883) grad_norm 1.9746 (2.0974) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:34 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][400/1251] eta 0:03:30 lr 0.000854 wd 0.0500 time 0.2390 (0.2469) data time 0.0008 (0.0022) model time 0.2382 (0.2440) loss 3.1059 (3.4897) grad_norm 1.4965 (2.0958) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][410/1251] eta 0:03:27 lr 0.000854 wd 0.0500 time 0.2490 (0.2469) data time 0.0010 (0.0022) model time 0.2480 (0.2439) loss 3.3328 (3.4908) grad_norm 1.7384 (2.1068) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][420/1251] eta 0:03:24 lr 0.000854 wd 0.0500 time 0.2392 (0.2467) data time 0.0007 (0.0022) model time 0.2385 (0.2438) loss 2.9676 (3.4856) grad_norm 2.0275 (2.1080) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][430/1251] eta 0:03:22 lr 0.000854 wd 0.0500 time 0.2422 (0.2469) data time 0.0010 (0.0021) model time 0.2412 (0.2441) loss 3.6733 (3.4885) grad_norm 2.5493 (2.1168) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][440/1251] eta 0:03:20 lr 0.000854 wd 0.0500 time 0.2389 (0.2468) data time 0.0011 (0.0021) model time 0.2378 (0.2440) loss 3.1332 (3.4871) grad_norm 1.9551 (2.1194) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][450/1251] eta 0:03:17 lr 0.000854 wd 0.0500 time 0.2428 (0.2467) data time 0.0007 (0.0021) model time 0.2420 (0.2439) loss 4.0427 (3.4855) grad_norm 2.4170 (2.1185) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][460/1251] eta 0:03:15 lr 0.000854 wd 0.0500 time 0.2329 (0.2470) data time 0.0009 (0.0021) model time 0.2320 (0.2443) loss 3.8650 
(3.4882) grad_norm 2.5602 (2.1276) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][470/1251] eta 0:03:12 lr 0.000854 wd 0.0500 time 0.2328 (0.2468) data time 0.0008 (0.0020) model time 0.2319 (0.2442) loss 3.3492 (3.4851) grad_norm 1.6954 (2.1295) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][480/1251] eta 0:03:10 lr 0.000854 wd 0.0500 time 0.2401 (0.2467) data time 0.0009 (0.0020) model time 0.2392 (0.2441) loss 3.6974 (3.4832) grad_norm 2.1226 (2.1286) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][490/1251] eta 0:03:07 lr 0.000853 wd 0.0500 time 0.2378 (0.2465) data time 0.0007 (0.0020) model time 0.2371 (0.2439) loss 3.7856 (3.4814) grad_norm 3.2404 (2.1331) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][500/1251] eta 0:03:05 lr 0.000853 wd 0.0500 time 0.2447 (0.2465) data time 0.0008 (0.0020) model time 0.2440 (0.2439) loss 3.5336 (3.4830) grad_norm 1.6717 (2.1391) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][510/1251] eta 0:03:02 lr 0.000853 wd 0.0500 time 0.2441 (0.2464) data time 0.0010 (0.0020) model time 0.2431 (0.2438) loss 3.3723 (3.4760) grad_norm 2.4946 (2.1415) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][520/1251] eta 0:03:00 lr 0.000853 wd 0.0500 time 0.2465 (0.2463) data time 0.0008 (0.0019) model time 0.2457 (0.2438) loss 4.0286 (3.4710) grad_norm 2.9897 (2.1371) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][530/1251] eta 0:02:57 lr 0.000853 wd 
0.0500 time 0.2347 (0.2466) data time 0.0012 (0.0019) model time 0.2335 (0.2442) loss 3.2334 (3.4700) grad_norm 2.0482 (2.1458) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][540/1251] eta 0:02:55 lr 0.000853 wd 0.0500 time 0.2380 (0.2469) data time 0.0009 (0.0019) model time 0.2371 (0.2445) loss 3.8091 (3.4697) grad_norm 1.3703 (2.1375) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][550/1251] eta 0:02:52 lr 0.000853 wd 0.0500 time 0.2408 (0.2468) data time 0.0010 (0.0019) model time 0.2398 (0.2444) loss 3.8982 (3.4675) grad_norm 1.3524 (2.1337) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][560/1251] eta 0:02:50 lr 0.000853 wd 0.0500 time 0.2383 (0.2467) data time 0.0011 (0.0019) model time 0.2372 (0.2443) loss 3.6476 (3.4610) grad_norm 2.1002 (2.1309) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][570/1251] eta 0:02:47 lr 0.000853 wd 0.0500 time 0.2420 (0.2466) data time 0.0008 (0.0019) model time 0.2412 (0.2442) loss 4.1834 (3.4695) grad_norm 2.6423 (2.1495) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][580/1251] eta 0:02:45 lr 0.000853 wd 0.0500 time 0.2459 (0.2465) data time 0.0007 (0.0018) model time 0.2452 (0.2442) loss 3.5254 (3.4679) grad_norm 2.4681 (2.1578) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][590/1251] eta 0:02:42 lr 0.000853 wd 0.0500 time 0.2458 (0.2464) data time 0.0011 (0.0018) model time 0.2447 (0.2441) loss 3.9149 (3.4639) grad_norm 2.1927 (2.1580) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:23 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][600/1251] eta 0:02:40 lr 0.000853 wd 0.0500 time 0.2390 (0.2467) data time 0.0007 (0.0018) model time 0.2382 (0.2444) loss 3.1380 (3.4631) grad_norm 1.4558 (2.1561) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][610/1251] eta 0:02:38 lr 0.000853 wd 0.0500 time 0.2407 (0.2466) data time 0.0007 (0.0018) model time 0.2400 (0.2444) loss 2.9975 (3.4625) grad_norm 1.4940 (2.1522) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][620/1251] eta 0:02:35 lr 0.000853 wd 0.0500 time 0.2343 (0.2465) data time 0.0007 (0.0018) model time 0.2336 (0.2442) loss 3.1705 (3.4669) grad_norm 2.2090 (2.1505) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][630/1251] eta 0:02:33 lr 0.000853 wd 0.0500 time 0.2384 (0.2464) data time 0.0009 (0.0018) model time 0.2374 (0.2442) loss 2.5621 (3.4673) grad_norm 1.6476 (2.1476) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][640/1251] eta 0:02:30 lr 0.000853 wd 0.0500 time 0.2330 (0.2463) data time 0.0012 (0.0018) model time 0.2318 (0.2441) loss 2.9566 (3.4660) grad_norm 1.7980 (2.1408) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][650/1251] eta 0:02:27 lr 0.000853 wd 0.0500 time 0.2542 (0.2463) data time 0.0007 (0.0018) model time 0.2534 (0.2440) loss 2.6006 (3.4656) grad_norm 1.6492 (2.1425) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][660/1251] eta 0:02:25 lr 0.000853 wd 0.0500 time 0.2396 (0.2462) data time 0.0010 (0.0017) model time 0.2385 (0.2440) loss 3.9664 
(3.4665) grad_norm 3.0898 (2.1411) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][670/1251] eta 0:02:23 lr 0.000853 wd 0.0500 time 0.2366 (0.2462) data time 0.0011 (0.0017) model time 0.2355 (0.2440) loss 3.9884 (3.4678) grad_norm 1.7946 (2.1360) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][680/1251] eta 0:02:20 lr 0.000853 wd 0.0500 time 0.2501 (0.2461) data time 0.0009 (0.0017) model time 0.2492 (0.2440) loss 3.6235 (3.4643) grad_norm 1.4085 (2.1338) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][690/1251] eta 0:02:18 lr 0.000853 wd 0.0500 time 0.2381 (0.2460) data time 0.0008 (0.0017) model time 0.2372 (0.2439) loss 4.0411 (3.4615) grad_norm 2.1223 (2.1378) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][700/1251] eta 0:02:15 lr 0.000853 wd 0.0500 time 0.2366 (0.2460) data time 0.0010 (0.0017) model time 0.2357 (0.2438) loss 3.2954 (3.4593) grad_norm 1.7536 (2.1358) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][710/1251] eta 0:02:13 lr 0.000853 wd 0.0500 time 0.2502 (0.2461) data time 0.0009 (0.0017) model time 0.2493 (0.2440) loss 3.2537 (3.4598) grad_norm 1.5882 (2.1427) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][720/1251] eta 0:02:10 lr 0.000853 wd 0.0500 time 0.2385 (0.2463) data time 0.0007 (0.0017) model time 0.2378 (0.2443) loss 3.9103 (3.4580) grad_norm 2.3641 (2.1405) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][730/1251] eta 0:02:08 lr 0.000853 wd 
0.0500 time 0.2458 (0.2463) data time 0.0008 (0.0017) model time 0.2450 (0.2442) loss 4.1935 (3.4562) grad_norm 1.7965 (2.1396) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][740/1251] eta 0:02:05 lr 0.000853 wd 0.0500 time 0.2387 (0.2462) data time 0.0011 (0.0017) model time 0.2375 (0.2441) loss 3.4144 (3.4585) grad_norm 1.6802 (2.1446) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][750/1251] eta 0:02:03 lr 0.000853 wd 0.0500 time 0.2405 (0.2461) data time 0.0010 (0.0017) model time 0.2396 (0.2441) loss 3.7648 (3.4550) grad_norm 2.1558 (2.1464) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][760/1251] eta 0:02:00 lr 0.000853 wd 0.0500 time 0.2344 (0.2461) data time 0.0008 (0.0016) model time 0.2336 (0.2440) loss 3.4703 (3.4501) grad_norm 1.5367 (2.1421) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][770/1251] eta 0:01:58 lr 0.000853 wd 0.0500 time 0.2389 (0.2460) data time 0.0010 (0.0016) model time 0.2379 (0.2440) loss 2.9756 (3.4528) grad_norm 1.7994 (2.1396) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][780/1251] eta 0:01:55 lr 0.000853 wd 0.0500 time 0.2438 (0.2459) data time 0.0008 (0.0016) model time 0.2430 (0.2439) loss 2.6476 (3.4516) grad_norm 1.9181 (2.1377) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][790/1251] eta 0:01:53 lr 0.000853 wd 0.0500 time 0.2420 (0.2459) data time 0.0007 (0.0016) model time 0.2412 (0.2439) loss 3.3620 (3.4506) grad_norm 1.7494 (2.1345) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:12 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][800/1251] eta 0:01:50 lr 0.000852 wd 0.0500 time 0.2364 (0.2461) data time 0.0007 (0.0016) model time 0.2357 (0.2441) loss 2.5907 (3.4510) grad_norm 2.0480 (2.1362) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][810/1251] eta 0:01:48 lr 0.000852 wd 0.0500 time 0.2474 (0.2460) data time 0.0008 (0.0016) model time 0.2467 (0.2440) loss 3.2239 (3.4467) grad_norm 1.9075 (2.1337) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][820/1251] eta 0:01:46 lr 0.000852 wd 0.0500 time 0.2445 (0.2460) data time 0.0008 (0.0016) model time 0.2437 (0.2440) loss 2.7332 (3.4457) grad_norm 2.4493 (2.1400) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][830/1251] eta 0:01:43 lr 0.000852 wd 0.0500 time 0.2334 (0.2459) data time 0.0013 (0.0016) model time 0.2321 (0.2439) loss 2.6728 (3.4471) grad_norm 1.4854 (2.1391) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][840/1251] eta 0:01:41 lr 0.000852 wd 0.0500 time 0.2412 (0.2458) data time 0.0010 (0.0016) model time 0.2403 (0.2439) loss 3.8881 (3.4415) grad_norm 3.2898 (2.1405) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][850/1251] eta 0:01:38 lr 0.000852 wd 0.0500 time 0.2450 (0.2458) data time 0.0010 (0.0016) model time 0.2440 (0.2438) loss 2.7477 (3.4417) grad_norm 2.1389 (2.1415) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][860/1251] eta 0:01:36 lr 0.000852 wd 0.0500 time 0.2407 (0.2457) data time 0.0011 (0.0016) model time 0.2396 (0.2438) loss 3.8377 
(3.4406) grad_norm 1.3256 (2.1381) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][870/1251] eta 0:01:33 lr 0.000852 wd 0.0500 time 0.2459 (0.2457) data time 0.0009 (0.0016) model time 0.2449 (0.2437) loss 3.8411 (3.4434) grad_norm 2.0504 (2.1342) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][880/1251] eta 0:01:31 lr 0.000852 wd 0.0500 time 0.2438 (0.2456) data time 0.0008 (0.0016) model time 0.2430 (0.2437) loss 4.3018 (3.4460) grad_norm 2.0897 (2.1324) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][890/1251] eta 0:01:28 lr 0.000852 wd 0.0500 time 0.2334 (0.2456) data time 0.0008 (0.0016) model time 0.2326 (0.2437) loss 3.5602 (3.4449) grad_norm 1.7357 (2.1331) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][900/1251] eta 0:01:26 lr 0.000852 wd 0.0500 time 0.2404 (0.2455) data time 0.0011 (0.0015) model time 0.2393 (0.2436) loss 3.1830 (3.4415) grad_norm 3.9712 (2.1374) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][910/1251] eta 0:01:23 lr 0.000852 wd 0.0500 time 0.2408 (0.2455) data time 0.0009 (0.0015) model time 0.2399 (0.2436) loss 3.6644 (3.4371) grad_norm 1.8304 (2.1400) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][920/1251] eta 0:01:21 lr 0.000852 wd 0.0500 time 0.2373 (0.2454) data time 0.0010 (0.0015) model time 0.2363 (0.2435) loss 3.6657 (3.4348) grad_norm 1.8249 (2.1389) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][930/1251] eta 0:01:18 lr 0.000852 wd 
0.0500 time 0.2332 (0.2454) data time 0.0008 (0.0015) model time 0.2324 (0.2435) loss 3.0600 (3.4361) grad_norm 2.7119 (2.1378) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][940/1251] eta 0:01:16 lr 0.000852 wd 0.0500 time 0.2448 (0.2453) data time 0.0009 (0.0015) model time 0.2439 (0.2434) loss 2.8372 (3.4328) grad_norm 1.9325 (2.1347) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][950/1251] eta 0:01:13 lr 0.000852 wd 0.0500 time 0.2390 (0.2455) data time 0.0010 (0.0015) model time 0.2380 (0.2436) loss 3.5212 (3.4361) grad_norm 1.9806 (2.1336) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][960/1251] eta 0:01:11 lr 0.000852 wd 0.0500 time 0.2526 (0.2455) data time 0.0009 (0.0015) model time 0.2517 (0.2436) loss 2.5427 (3.4326) grad_norm 2.7110 (2.1320) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][970/1251] eta 0:01:09 lr 0.000852 wd 0.0500 time 0.4416 (0.2456) data time 0.0010 (0.0015) model time 0.4407 (0.2438) loss 3.6739 (3.4309) grad_norm 2.4440 (2.1324) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][980/1251] eta 0:01:06 lr 0.000852 wd 0.0500 time 0.2436 (0.2460) data time 0.0011 (0.0015) model time 0.2425 (0.2442) loss 2.6611 (3.4303) grad_norm 2.8974 (2.1319) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][990/1251] eta 0:01:04 lr 0.000852 wd 0.0500 time 0.2442 (0.2459) data time 0.0008 (0.0015) model time 0.2434 (0.2441) loss 2.7796 (3.4285) grad_norm 1.8675 (2.1291) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1000/1251] eta 0:01:01 lr 0.000852 wd 0.0500 time 0.2397 (0.2459) data time 0.0011 (0.0015) model time 0.2386 (0.2441) loss 3.8420 (3.4291) grad_norm 2.3872 (2.1288) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1010/1251] eta 0:00:59 lr 0.000852 wd 0.0500 time 0.2392 (0.2458) data time 0.0009 (0.0015) model time 0.2383 (0.2440) loss 4.2020 (3.4314) grad_norm 2.8446 (2.1297) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1020/1251] eta 0:00:56 lr 0.000852 wd 0.0500 time 0.2449 (0.2458) data time 0.0010 (0.0015) model time 0.2439 (0.2440) loss 3.1015 (3.4326) grad_norm 1.7058 (2.1283) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1030/1251] eta 0:00:54 lr 0.000852 wd 0.0500 time 0.2511 (0.2457) data time 0.0012 (0.0015) model time 0.2499 (0.2440) loss 3.2948 (3.4345) grad_norm 3.0875 (2.1297) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1040/1251] eta 0:00:51 lr 0.000852 wd 0.0500 time 0.2406 (0.2457) data time 0.0012 (0.0015) model time 0.2394 (0.2440) loss 3.6699 (3.4363) grad_norm 1.7775 (2.1299) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1050/1251] eta 0:00:49 lr 0.000852 wd 0.0500 time 0.2457 (0.2457) data time 0.0008 (0.0015) model time 0.2450 (0.2439) loss 2.6772 (3.4359) grad_norm 2.4588 (2.1292) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1060/1251] eta 0:00:46 lr 0.000852 wd 0.0500 time 0.2386 (0.2458) data time 0.0007 (0.0015) model time 0.2379 (0.2441) loss 3.5003 
(3.4352) grad_norm 2.6733 (2.1279) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1070/1251] eta 0:00:44 lr 0.000852 wd 0.0500 time 0.2409 (0.2458) data time 0.0009 (0.0015) model time 0.2400 (0.2441) loss 3.8338 (3.4352) grad_norm 1.6703 (2.1256) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1080/1251] eta 0:00:42 lr 0.000852 wd 0.0500 time 0.2441 (0.2459) data time 0.0007 (0.0015) model time 0.2434 (0.2442) loss 2.3764 (3.4361) grad_norm 1.7683 (2.1238) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1090/1251] eta 0:00:39 lr 0.000852 wd 0.0500 time 0.2391 (0.2459) data time 0.0011 (0.0015) model time 0.2380 (0.2442) loss 3.0844 (3.4339) grad_norm 2.0796 (2.1215) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1100/1251] eta 0:00:37 lr 0.000852 wd 0.0500 time 0.2343 (0.2459) data time 0.0010 (0.0015) model time 0.2333 (0.2442) loss 3.7957 (3.4346) grad_norm 3.4977 (2.1203) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1110/1251] eta 0:00:34 lr 0.000852 wd 0.0500 time 0.2332 (0.2458) data time 0.0012 (0.0015) model time 0.2319 (0.2441) loss 2.6595 (3.4317) grad_norm 2.4201 (2.1214) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1120/1251] eta 0:00:32 lr 0.000851 wd 0.0500 time 0.2327 (0.2458) data time 0.0011 (0.0014) model time 0.2316 (0.2441) loss 3.5987 (3.4330) grad_norm 1.5007 (2.1184) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1130/1251] eta 0:00:29 lr 
0.000851 wd 0.0500 time 0.2389 (0.2458) data time 0.0010 (0.0014) model time 0.2379 (0.2441) loss 3.7262 (3.4336) grad_norm 2.4226 (2.1164) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1140/1251] eta 0:00:27 lr 0.000851 wd 0.0500 time 0.2541 (0.2458) data time 0.0009 (0.0014) model time 0.2533 (0.2441) loss 3.4172 (3.4294) grad_norm 2.2053 (2.1162) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1150/1251] eta 0:00:24 lr 0.000851 wd 0.0500 time 0.2408 (0.2457) data time 0.0009 (0.0014) model time 0.2400 (0.2440) loss 2.6141 (3.4312) grad_norm 2.3843 (inf) loss_scale 2048.0000 (4085.3241) mem 7379MB [2024-08-26 10:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1160/1251] eta 0:00:22 lr 0.000851 wd 0.0500 time 0.2462 (0.2457) data time 0.0008 (0.0014) model time 0.2455 (0.2440) loss 4.2612 (3.4320) grad_norm 2.5730 (inf) loss_scale 2048.0000 (4067.7761) mem 7379MB [2024-08-26 10:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1170/1251] eta 0:00:19 lr 0.000851 wd 0.0500 time 0.2414 (0.2457) data time 0.0011 (0.0014) model time 0.2402 (0.2440) loss 3.3442 (3.4347) grad_norm 2.4601 (inf) loss_scale 2048.0000 (4050.5278) mem 7379MB [2024-08-26 10:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1180/1251] eta 0:00:17 lr 0.000851 wd 0.0500 time 0.2397 (0.2456) data time 0.0009 (0.0014) model time 0.2388 (0.2440) loss 3.6949 (3.4349) grad_norm 2.7553 (inf) loss_scale 2048.0000 (4033.5715) mem 7379MB [2024-08-26 10:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1190/1251] eta 0:00:14 lr 0.000851 wd 0.0500 time 0.2457 (0.2456) data time 0.0008 (0.0014) model time 0.2449 (0.2439) loss 4.0172 (3.4368) grad_norm 1.9542 (inf) loss_scale 2048.0000 (4016.9001) mem 7379MB [2024-08-26 10:05:50 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1200/1251] eta 0:00:12 lr 0.000851 wd 0.0500 time 0.2401 (0.2455) data time 0.0008 (0.0014) model time 0.2393 (0.2439) loss 3.8943 (3.4356) grad_norm 2.8823 (inf) loss_scale 2048.0000 (4000.5062) mem 7379MB [2024-08-26 10:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1210/1251] eta 0:00:10 lr 0.000851 wd 0.0500 time 0.2418 (0.2455) data time 0.0009 (0.0014) model time 0.2409 (0.2439) loss 3.9473 (3.4368) grad_norm 1.4847 (inf) loss_scale 2048.0000 (3984.3832) mem 7379MB [2024-08-26 10:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1220/1251] eta 0:00:07 lr 0.000851 wd 0.0500 time 0.2418 (0.2455) data time 0.0009 (0.0014) model time 0.2409 (0.2438) loss 3.7292 (3.4373) grad_norm 2.4221 (inf) loss_scale 2048.0000 (3968.5242) mem 7379MB [2024-08-26 10:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1230/1251] eta 0:00:05 lr 0.000851 wd 0.0500 time 0.2270 (0.2458) data time 0.0012 (0.0014) model time 0.2259 (0.2441) loss 3.6833 (3.4371) grad_norm 2.1129 (inf) loss_scale 2048.0000 (3952.9228) mem 7379MB [2024-08-26 10:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1240/1251] eta 0:00:02 lr 0.000851 wd 0.0500 time 0.2284 (0.2458) data time 0.0006 (0.0014) model time 0.2278 (0.2442) loss 4.0524 (3.4388) grad_norm 1.5586 (inf) loss_scale 2048.0000 (3937.5729) mem 7379MB [2024-08-26 10:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [90/300][1250/1251] eta 0:00:00 lr 0.000851 wd 0.0500 time 0.2233 (0.2457) data time 0.0005 (0.0014) model time 0.2228 (0.2440) loss 2.2478 (3.4397) grad_norm 1.8074 (inf) loss_scale 2048.0000 (3922.4684) mem 7379MB [2024-08-26 10:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 90 training takes 0:05:07 [2024-08-26 10:06:02 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:06:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 10:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.398 (0.398) Loss 0.5483 (0.5483) Acc@1 90.527 (90.527) Acc@5 97.852 (97.852) Mem 7379MB [2024-08-26 10:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.107) Loss 0.8726 (0.8312) Acc@1 81.738 (82.218) Acc@5 95.801 (96.147) Mem 7379MB [2024-08-26 10:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.094) Loss 1.2197 (0.8538) Acc@1 71.289 (81.269) Acc@5 92.480 (96.043) Mem 7379MB [2024-08-26 10:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.090) Loss 1.4346 (0.9671) Acc@1 65.430 (78.645) Acc@5 88.867 (94.739) Mem 7379MB [2024-08-26 10:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.084) Loss 1.2803 (1.0350) Acc@1 70.020 (76.972) Acc@5 91.211 (93.855) Mem 7379MB [2024-08-26 10:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.700 Acc@5 93.790 [2024-08-26 10:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.7% [2024-08-26 10:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.797 (0.797) Loss 0.4500 (0.4500) Acc@1 92.285 (92.285) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.147) Loss 0.7178 (0.6996) Acc@1 85.840 (84.970) Acc@5 96.387 (96.999) Mem 7379MB [2024-08-26 10:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.115) Loss 1.0000 (0.7228) Acc@1 76.660 (83.891) Acc@5 94.141 (96.954) Mem 7379MB [2024-08-26 10:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.081 (0.103) Loss 1.2900 (0.8261) Acc@1 68.066 (81.508) Acc@5 90.723 (95.769) Mem 7379MB [2024-08-26 10:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.1484 (0.8791) Acc@1 72.070 (80.030) Acc@5 92.480 (95.262) Mem 7379MB [2024-08-26 10:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.598 Acc@5 95.238 [2024-08-26 10:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.6% [2024-08-26 10:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.60% [2024-08-26 10:06:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:06:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 10:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][0/1251] eta 0:13:32 lr 0.000851 wd 0.0500 time 0.6497 (0.6497) data time 0.4224 (0.4224) model time 0.0000 (0.0000) loss 3.6669 (3.6669) grad_norm 1.9546 (1.9546) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][10/1251] eta 0:05:46 lr 0.000851 wd 0.0500 time 0.2475 (0.2795) data time 0.0010 (0.0394) model time 0.0000 (0.0000) loss 2.8101 (3.3836) grad_norm 1.7713 (2.0101) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][20/1251] eta 0:05:22 lr 0.000851 wd 0.0500 time 0.2432 (0.2618) data time 0.0009 (0.0211) model time 0.0000 (0.0000) loss 3.6205 (3.2936) grad_norm 1.9853 (2.0900) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][30/1251] eta 0:05:11 lr 0.000851 wd 0.0500 time 0.2466 (0.2548) data time 0.0009 (0.0146) 
model time 0.0000 (0.0000) loss 3.2261 (3.3027) grad_norm 2.0247 (2.1245) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][40/1251] eta 0:05:04 lr 0.000851 wd 0.0500 time 0.2395 (0.2515) data time 0.0011 (0.0113) model time 0.0000 (0.0000) loss 3.4088 (3.3552) grad_norm 2.3182 (2.1176) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][50/1251] eta 0:04:59 lr 0.000851 wd 0.0500 time 0.2412 (0.2497) data time 0.0011 (0.0093) model time 0.0000 (0.0000) loss 3.7483 (3.3599) grad_norm 1.8901 (2.0838) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][60/1251] eta 0:04:55 lr 0.000851 wd 0.0500 time 0.2415 (0.2485) data time 0.0007 (0.0080) model time 0.2408 (0.2410) loss 3.5810 (3.3481) grad_norm 1.4440 (2.0429) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][70/1251] eta 0:04:52 lr 0.000851 wd 0.0500 time 0.2386 (0.2473) data time 0.0011 (0.0070) model time 0.2375 (0.2399) loss 3.8331 (3.3767) grad_norm 2.5394 (2.0502) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][80/1251] eta 0:04:48 lr 0.000851 wd 0.0500 time 0.2425 (0.2466) data time 0.0009 (0.0063) model time 0.2416 (0.2403) loss 3.8652 (3.4256) grad_norm 1.5650 (2.0413) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][90/1251] eta 0:04:45 lr 0.000851 wd 0.0500 time 0.2434 (0.2463) data time 0.0008 (0.0057) model time 0.2425 (0.2408) loss 3.4536 (3.4186) grad_norm 2.4429 (2.0385) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[91/300][100/1251] eta 0:04:42 lr 0.000851 wd 0.0500 time 0.2422 (0.2458) data time 0.0010 (0.0052) model time 0.2412 (0.2407) loss 3.4824 (3.4386) grad_norm 2.0286 (2.0377) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][110/1251] eta 0:04:39 lr 0.000851 wd 0.0500 time 0.2432 (0.2454) data time 0.0010 (0.0048) model time 0.2422 (0.2407) loss 2.3924 (3.4586) grad_norm 1.9413 (2.0680) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][120/1251] eta 0:04:37 lr 0.000851 wd 0.0500 time 0.2543 (0.2451) data time 0.0009 (0.0045) model time 0.2534 (0.2407) loss 2.8596 (3.4176) grad_norm 1.5132 (2.0460) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][130/1251] eta 0:04:34 lr 0.000851 wd 0.0500 time 0.2445 (0.2447) data time 0.0009 (0.0042) model time 0.2436 (0.2405) loss 3.4268 (3.4208) grad_norm 1.5216 (2.0323) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][140/1251] eta 0:04:31 lr 0.000851 wd 0.0500 time 0.2375 (0.2444) data time 0.0008 (0.0040) model time 0.2367 (0.2403) loss 2.7915 (3.4317) grad_norm 1.8225 (2.0282) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][150/1251] eta 0:04:28 lr 0.000851 wd 0.0500 time 0.2411 (0.2441) data time 0.0011 (0.0038) model time 0.2400 (0.2402) loss 3.4713 (3.4364) grad_norm 2.0113 (2.0579) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][160/1251] eta 0:04:27 lr 0.000851 wd 0.0500 time 0.2459 (0.2451) data time 0.0009 (0.0036) model time 0.2450 (0.2419) loss 4.1536 (3.4397) grad_norm 2.1402 (2.0667) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 10:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][170/1251] eta 0:04:24 lr 0.000851 wd 0.0500 time 0.2391 (0.2448) data time 0.0007 (0.0035) model time 0.2384 (0.2418) loss 3.9544 (3.4322) grad_norm 1.5463 (2.0685) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][180/1251] eta 0:04:21 lr 0.000850 wd 0.0500 time 0.2537 (0.2446) data time 0.0009 (0.0033) model time 0.2528 (0.2416) loss 3.6420 (3.4191) grad_norm 2.2007 (2.0978) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][190/1251] eta 0:04:19 lr 0.000850 wd 0.0500 time 0.2408 (0.2444) data time 0.0010 (0.0032) model time 0.2398 (0.2415) loss 3.5943 (3.4244) grad_norm 2.2096 (2.0868) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][200/1251] eta 0:04:16 lr 0.000850 wd 0.0500 time 0.2468 (0.2444) data time 0.0009 (0.0031) model time 0.2460 (0.2417) loss 3.9249 (3.4103) grad_norm 1.8187 (2.0832) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][210/1251] eta 0:04:14 lr 0.000850 wd 0.0500 time 0.2407 (0.2444) data time 0.0011 (0.0030) model time 0.2396 (0.2417) loss 3.9652 (3.4169) grad_norm 1.9684 (2.0862) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][220/1251] eta 0:04:11 lr 0.000850 wd 0.0500 time 0.2439 (0.2443) data time 0.0012 (0.0029) model time 0.2427 (0.2416) loss 2.1497 (3.4003) grad_norm 1.5461 (2.0939) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][230/1251] eta 0:04:09 lr 0.000850 wd 0.0500 time 0.2409 (0.2442) data time 0.0009 (0.0028) 
model time 0.2400 (0.2417) loss 3.4337 (3.4069) grad_norm 1.6382 (2.1098) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][240/1251] eta 0:04:06 lr 0.000850 wd 0.0500 time 0.2426 (0.2441) data time 0.0009 (0.0028) model time 0.2416 (0.2416) loss 2.8844 (3.4074) grad_norm 1.7128 (2.0961) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][250/1251] eta 0:04:04 lr 0.000850 wd 0.0500 time 0.2432 (0.2446) data time 0.0011 (0.0027) model time 0.2421 (0.2423) loss 3.8629 (3.4116) grad_norm 3.1268 (2.0964) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][260/1251] eta 0:04:02 lr 0.000850 wd 0.0500 time 0.2426 (0.2444) data time 0.0009 (0.0026) model time 0.2417 (0.2422) loss 2.4546 (3.4146) grad_norm 2.0401 (2.1150) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][270/1251] eta 0:03:59 lr 0.000850 wd 0.0500 time 0.2379 (0.2443) data time 0.0011 (0.0026) model time 0.2368 (0.2421) loss 3.5586 (3.4204) grad_norm 1.9641 (2.1168) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][280/1251] eta 0:03:57 lr 0.000850 wd 0.0500 time 0.2483 (0.2443) data time 0.0007 (0.0025) model time 0.2476 (0.2421) loss 4.2730 (3.4259) grad_norm 2.7705 (2.1207) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][290/1251] eta 0:03:54 lr 0.000850 wd 0.0500 time 0.2565 (0.2442) data time 0.0010 (0.0025) model time 0.2555 (0.2421) loss 4.2459 (3.4354) grad_norm 1.6619 (2.1256) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[91/300][300/1251] eta 0:03:52 lr 0.000850 wd 0.0500 time 0.2471 (0.2442) data time 0.0008 (0.0024) model time 0.2462 (0.2420) loss 4.0070 (3.4406) grad_norm 2.1206 (2.1211) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][310/1251] eta 0:03:49 lr 0.000850 wd 0.0500 time 0.2378 (0.2441) data time 0.0011 (0.0024) model time 0.2368 (0.2420) loss 3.2393 (3.4411) grad_norm 2.5996 (2.1195) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][320/1251] eta 0:03:47 lr 0.000850 wd 0.0500 time 0.2416 (0.2440) data time 0.0010 (0.0023) model time 0.2406 (0.2419) loss 2.7568 (3.4402) grad_norm 2.0363 (2.1101) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][330/1251] eta 0:03:44 lr 0.000850 wd 0.0500 time 0.2399 (0.2439) data time 0.0008 (0.0023) model time 0.2391 (0.2418) loss 3.8957 (3.4437) grad_norm 2.3867 (2.1058) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][340/1251] eta 0:03:42 lr 0.000850 wd 0.0500 time 0.2387 (0.2444) data time 0.0011 (0.0022) model time 0.2376 (0.2425) loss 3.7578 (3.4420) grad_norm 3.1119 (2.1034) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][350/1251] eta 0:03:40 lr 0.000850 wd 0.0500 time 0.2427 (0.2444) data time 0.0007 (0.0022) model time 0.2419 (0.2424) loss 2.8749 (3.4330) grad_norm 1.7194 (2.0988) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][360/1251] eta 0:03:38 lr 0.000850 wd 0.0500 time 0.2305 (0.2453) data time 0.0010 (0.0022) model time 0.2295 (0.2436) loss 3.2643 (3.4319) grad_norm 1.6133 (2.0952) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 10:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][370/1251] eta 0:03:36 lr 0.000850 wd 0.0500 time 0.2380 (0.2453) data time 0.0010 (0.0021) model time 0.2371 (0.2435) loss 2.3054 (3.4297) grad_norm 2.2278 (2.0910) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][380/1251] eta 0:03:33 lr 0.000850 wd 0.0500 time 0.2428 (0.2452) data time 0.0011 (0.0021) model time 0.2417 (0.2435) loss 3.5164 (3.4312) grad_norm 1.3864 (2.0877) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][390/1251] eta 0:03:31 lr 0.000850 wd 0.0500 time 0.2411 (0.2451) data time 0.0009 (0.0021) model time 0.2402 (0.2435) loss 2.8504 (3.4304) grad_norm 1.8574 (2.0924) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][400/1251] eta 0:03:28 lr 0.000850 wd 0.0500 time 0.2492 (0.2450) data time 0.0011 (0.0021) model time 0.2481 (0.2433) loss 3.1118 (3.4382) grad_norm 1.7148 (2.0903) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][410/1251] eta 0:03:26 lr 0.000850 wd 0.0500 time 0.2420 (0.2450) data time 0.0007 (0.0020) model time 0.2412 (0.2433) loss 3.0694 (3.4393) grad_norm 1.7164 (2.0864) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][420/1251] eta 0:03:23 lr 0.000850 wd 0.0500 time 0.2358 (0.2449) data time 0.0008 (0.0020) model time 0.2350 (0.2432) loss 2.5256 (3.4317) grad_norm 2.2667 (2.0814) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][430/1251] eta 0:03:21 lr 0.000850 wd 0.0500 time 0.2546 (0.2451) data time 0.0012 (0.0020) 
model time 0.2534 (0.2435) loss 3.3801 (3.4362) grad_norm 3.3043 (2.0887) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][440/1251] eta 0:03:18 lr 0.000850 wd 0.0500 time 0.2452 (0.2450) data time 0.0009 (0.0020) model time 0.2442 (0.2434) loss 2.8394 (3.4394) grad_norm 1.6858 (2.0930) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][450/1251] eta 0:03:16 lr 0.000850 wd 0.0500 time 0.2486 (0.2454) data time 0.0011 (0.0019) model time 0.2475 (0.2439) loss 3.8221 (3.4409) grad_norm 1.4844 (2.0929) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][460/1251] eta 0:03:14 lr 0.000850 wd 0.0500 time 0.2420 (0.2458) data time 0.0010 (0.0019) model time 0.2410 (0.2444) loss 4.3212 (3.4490) grad_norm 1.8027 (2.0943) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][470/1251] eta 0:03:11 lr 0.000850 wd 0.0500 time 0.2383 (0.2457) data time 0.0007 (0.0019) model time 0.2376 (0.2442) loss 3.0163 (3.4449) grad_norm 1.8299 (2.0931) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][480/1251] eta 0:03:09 lr 0.000850 wd 0.0500 time 0.2400 (0.2456) data time 0.0011 (0.0019) model time 0.2389 (0.2441) loss 2.8288 (3.4442) grad_norm 1.8028 (2.0911) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][490/1251] eta 0:03:06 lr 0.000850 wd 0.0500 time 0.2486 (0.2455) data time 0.0009 (0.0019) model time 0.2476 (0.2441) loss 3.8118 (3.4425) grad_norm 1.3763 (2.0856) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[91/300][500/1251] eta 0:03:04 lr 0.000849 wd 0.0500 time 0.2417 (0.2457) data time 0.0010 (0.0018) model time 0.2407 (0.2442) loss 3.6779 (3.4335) grad_norm 1.6227 (2.0839) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][510/1251] eta 0:03:01 lr 0.000849 wd 0.0500 time 0.2444 (0.2456) data time 0.0009 (0.0018) model time 0.2434 (0.2442) loss 3.6111 (3.4334) grad_norm 1.4190 (2.0843) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][520/1251] eta 0:02:59 lr 0.000849 wd 0.0500 time 0.2405 (0.2455) data time 0.0009 (0.0018) model time 0.2397 (0.2441) loss 3.3676 (3.4284) grad_norm 2.3541 (2.0854) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][530/1251] eta 0:02:57 lr 0.000849 wd 0.0500 time 0.2433 (0.2462) data time 0.0010 (0.0018) model time 0.2424 (0.2448) loss 3.5194 (3.4296) grad_norm 1.6638 (2.0820) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][540/1251] eta 0:02:54 lr 0.000849 wd 0.0500 time 0.2405 (0.2461) data time 0.0008 (0.0018) model time 0.2397 (0.2447) loss 2.5240 (3.4248) grad_norm 1.8774 (2.0850) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][550/1251] eta 0:02:52 lr 0.000849 wd 0.0500 time 0.2427 (0.2460) data time 0.0010 (0.0018) model time 0.2417 (0.2446) loss 3.6481 (3.4249) grad_norm 2.2396 (2.0811) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][560/1251] eta 0:02:49 lr 0.000849 wd 0.0500 time 0.2385 (0.2460) data time 0.0010 (0.0018) model time 0.2375 (0.2446) loss 3.0963 (3.4243) grad_norm 1.5746 (2.0744) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 10:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][570/1251] eta 0:02:47 lr 0.000849 wd 0.0500 time 0.2406 (0.2458) data time 0.0010 (0.0017) model time 0.2396 (0.2445) loss 3.6205 (3.4273) grad_norm 1.9014 (2.0744) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][580/1251] eta 0:02:44 lr 0.000849 wd 0.0500 time 0.2308 (0.2458) data time 0.0009 (0.0017) model time 0.2299 (0.2444) loss 2.6395 (3.4272) grad_norm 1.8470 (2.0827) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][590/1251] eta 0:02:42 lr 0.000849 wd 0.0500 time 0.2373 (0.2457) data time 0.0009 (0.0017) model time 0.2365 (0.2443) loss 4.3068 (3.4296) grad_norm 2.0573 (2.0831) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][600/1251] eta 0:02:39 lr 0.000849 wd 0.0500 time 0.2377 (0.2456) data time 0.0008 (0.0017) model time 0.2369 (0.2442) loss 2.4986 (3.4305) grad_norm 2.2246 (2.0821) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][610/1251] eta 0:02:37 lr 0.000849 wd 0.0500 time 0.2399 (0.2455) data time 0.0010 (0.0017) model time 0.2388 (0.2441) loss 3.8681 (3.4339) grad_norm 1.9848 (2.0812) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][620/1251] eta 0:02:34 lr 0.000849 wd 0.0500 time 0.2389 (0.2455) data time 0.0010 (0.0017) model time 0.2380 (0.2441) loss 3.6495 (3.4360) grad_norm 1.7628 (2.0759) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][630/1251] eta 0:02:32 lr 0.000849 wd 0.0500 time 0.2444 (0.2454) data time 0.0010 (0.0017) 
model time 0.2434 (0.2440) loss 3.3249 (3.4331) grad_norm 2.4118 (2.0751) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][640/1251] eta 0:02:29 lr 0.000849 wd 0.0500 time 0.2485 (0.2453) data time 0.0008 (0.0017) model time 0.2477 (0.2439) loss 2.1590 (3.4287) grad_norm 1.8914 (2.0694) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][650/1251] eta 0:02:27 lr 0.000849 wd 0.0500 time 0.2386 (0.2453) data time 0.0010 (0.0017) model time 0.2376 (0.2439) loss 2.3119 (3.4238) grad_norm 2.2903 (2.0652) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][660/1251] eta 0:02:24 lr 0.000849 wd 0.0500 time 0.2288 (0.2452) data time 0.0008 (0.0016) model time 0.2280 (0.2438) loss 3.5831 (3.4247) grad_norm 1.9029 (2.0700) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][670/1251] eta 0:02:22 lr 0.000849 wd 0.0500 time 0.2439 (0.2452) data time 0.0007 (0.0016) model time 0.2431 (0.2438) loss 2.6465 (3.4284) grad_norm 1.4799 (2.0719) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][680/1251] eta 0:02:19 lr 0.000849 wd 0.0500 time 0.2468 (0.2451) data time 0.0009 (0.0016) model time 0.2459 (0.2438) loss 3.8050 (3.4305) grad_norm 1.8721 (2.0740) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][690/1251] eta 0:02:17 lr 0.000849 wd 0.0500 time 0.2460 (0.2451) data time 0.0012 (0.0016) model time 0.2448 (0.2437) loss 3.6274 (3.4303) grad_norm 1.9083 (2.0710) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[91/300][700/1251] eta 0:02:15 lr 0.000849 wd 0.0500 time 0.2311 (0.2450) data time 0.0009 (0.0016) model time 0.2303 (0.2437) loss 2.6583 (3.4321) grad_norm 2.6527 (2.0749) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][710/1251] eta 0:02:12 lr 0.000849 wd 0.0500 time 0.2336 (0.2450) data time 0.0012 (0.0016) model time 0.2324 (0.2436) loss 3.4802 (3.4283) grad_norm 2.0645 (2.0718) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][720/1251] eta 0:02:10 lr 0.000849 wd 0.0500 time 0.2318 (0.2449) data time 0.0011 (0.0016) model time 0.2307 (0.2436) loss 3.1684 (3.4276) grad_norm 2.4982 (2.0730) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][730/1251] eta 0:02:07 lr 0.000849 wd 0.0500 time 0.2391 (0.2449) data time 0.0007 (0.0016) model time 0.2384 (0.2435) loss 3.0268 (3.4289) grad_norm 2.0321 (2.0730) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][740/1251] eta 0:02:05 lr 0.000849 wd 0.0500 time 0.2395 (0.2448) data time 0.0010 (0.0016) model time 0.2385 (0.2435) loss 3.5564 (3.4317) grad_norm 1.8559 (2.0721) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][750/1251] eta 0:02:02 lr 0.000849 wd 0.0500 time 0.2408 (0.2448) data time 0.0012 (0.0016) model time 0.2396 (0.2434) loss 3.4808 (3.4340) grad_norm 4.2710 (2.0772) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][760/1251] eta 0:02:00 lr 0.000849 wd 0.0500 time 0.2329 (0.2447) data time 0.0008 (0.0016) model time 0.2321 (0.2434) loss 2.8244 (3.4343) grad_norm 2.1489 (2.0866) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 10:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][770/1251] eta 0:01:57 lr 0.000849 wd 0.0500 time 0.2420 (0.2447) data time 0.0007 (0.0016) model time 0.2413 (0.2434) loss 4.2744 (3.4348) grad_norm 2.2752 (2.0863) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][780/1251] eta 0:01:55 lr 0.000849 wd 0.0500 time 0.2307 (0.2450) data time 0.0011 (0.0015) model time 0.2296 (0.2436) loss 3.2512 (3.4339) grad_norm 2.2802 (2.0861) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][790/1251] eta 0:01:52 lr 0.000849 wd 0.0500 time 0.2432 (0.2449) data time 0.0011 (0.0015) model time 0.2420 (0.2436) loss 3.2569 (3.4292) grad_norm 1.5491 (2.0833) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][800/1251] eta 0:01:50 lr 0.000849 wd 0.0500 time 0.2386 (0.2449) data time 0.0009 (0.0015) model time 0.2377 (0.2435) loss 3.8610 (3.4292) grad_norm 3.2557 (2.0819) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][810/1251] eta 0:01:47 lr 0.000848 wd 0.0500 time 0.2495 (0.2448) data time 0.0010 (0.0015) model time 0.2485 (0.2435) loss 3.7275 (3.4279) grad_norm 1.6310 (2.0817) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][820/1251] eta 0:01:45 lr 0.000848 wd 0.0500 time 0.2406 (0.2448) data time 0.0009 (0.0015) model time 0.2397 (0.2435) loss 3.9887 (3.4278) grad_norm 1.5798 (2.0836) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][830/1251] eta 0:01:43 lr 0.000848 wd 0.0500 time 0.2399 (0.2447) data time 0.0008 (0.0015) 
model time 0.2392 (0.2434) loss 3.8624 (3.4249) grad_norm 3.0090 (2.0845) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][840/1251] eta 0:01:40 lr 0.000848 wd 0.0500 time 0.2350 (0.2447) data time 0.0009 (0.0015) model time 0.2341 (0.2434) loss 3.9948 (3.4271) grad_norm 2.3552 (2.0831) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][850/1251] eta 0:01:38 lr 0.000848 wd 0.0500 time 0.2441 (0.2447) data time 0.0010 (0.0015) model time 0.2431 (0.2433) loss 3.8490 (3.4239) grad_norm 1.3097 (2.0826) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][860/1251] eta 0:01:35 lr 0.000848 wd 0.0500 time 0.2347 (0.2449) data time 0.0008 (0.0015) model time 0.2339 (0.2435) loss 3.0856 (3.4255) grad_norm 2.1889 (2.0838) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][870/1251] eta 0:01:33 lr 0.000848 wd 0.0500 time 0.2367 (0.2448) data time 0.0012 (0.0015) model time 0.2354 (0.2435) loss 2.7772 (3.4225) grad_norm 2.6193 (2.0870) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][880/1251] eta 0:01:30 lr 0.000848 wd 0.0500 time 0.2449 (0.2450) data time 0.0007 (0.0015) model time 0.2442 (0.2437) loss 2.3644 (3.4202) grad_norm 2.5483 (2.0863) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][890/1251] eta 0:01:28 lr 0.000848 wd 0.0500 time 0.2346 (0.2449) data time 0.0009 (0.0015) model time 0.2337 (0.2436) loss 3.5851 (3.4226) grad_norm 2.5548 (2.0886) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[91/300][900/1251] eta 0:01:26 lr 0.000848 wd 0.0500 time 0.2389 (0.2451) data time 0.0011 (0.0015) model time 0.2378 (0.2438) loss 3.2744 (3.4262) grad_norm 1.7877 (2.0864) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][910/1251] eta 0:01:23 lr 0.000848 wd 0.0500 time 0.2529 (0.2451) data time 0.0011 (0.0015) model time 0.2518 (0.2438) loss 3.3191 (3.4260) grad_norm 1.4858 (2.0829) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][920/1251] eta 0:01:21 lr 0.000848 wd 0.0500 time 0.2545 (0.2451) data time 0.0007 (0.0015) model time 0.2538 (0.2438) loss 2.4483 (3.4257) grad_norm 1.9471 (2.0832) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][930/1251] eta 0:01:18 lr 0.000848 wd 0.0500 time 0.2386 (0.2451) data time 0.0007 (0.0015) model time 0.2379 (0.2438) loss 4.6274 (3.4293) grad_norm 1.4820 (2.0803) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][940/1251] eta 0:01:16 lr 0.000848 wd 0.0500 time 0.2370 (0.2450) data time 0.0011 (0.0015) model time 0.2359 (0.2438) loss 3.5774 (3.4266) grad_norm 2.5634 (2.0802) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][950/1251] eta 0:01:13 lr 0.000848 wd 0.0500 time 0.2403 (0.2450) data time 0.0010 (0.0015) model time 0.2393 (0.2437) loss 3.7651 (3.4242) grad_norm 1.3295 (2.0833) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][960/1251] eta 0:01:11 lr 0.000848 wd 0.0500 time 0.2451 (0.2450) data time 0.0008 (0.0015) model time 0.2443 (0.2437) loss 3.3969 (3.4242) grad_norm 1.5746 (2.0820) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 10:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][970/1251] eta 0:01:08 lr 0.000848 wd 0.0500 time 0.2431 (0.2449) data time 0.0009 (0.0014) model time 0.2422 (0.2437) loss 4.0338 (3.4264) grad_norm 2.1711 (2.0883) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][980/1251] eta 0:01:06 lr 0.000848 wd 0.0500 time 0.2471 (0.2449) data time 0.0007 (0.0014) model time 0.2464 (0.2436) loss 3.8991 (3.4263) grad_norm 2.4111 (2.0897) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][990/1251] eta 0:01:04 lr 0.000848 wd 0.0500 time 0.2367 (0.2453) data time 0.0008 (0.0014) model time 0.2359 (0.2440) loss 4.0604 (3.4255) grad_norm 1.8678 (2.0895) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1000/1251] eta 0:01:01 lr 0.000848 wd 0.0500 time 0.2654 (0.2453) data time 0.0007 (0.0014) model time 0.2647 (0.2440) loss 2.2437 (3.4235) grad_norm 1.6293 (2.0887) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1010/1251] eta 0:00:59 lr 0.000848 wd 0.0500 time 0.2513 (0.2452) data time 0.0007 (0.0014) model time 0.2506 (0.2440) loss 4.1935 (3.4240) grad_norm 2.3175 (2.0896) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1020/1251] eta 0:00:56 lr 0.000848 wd 0.0500 time 0.2412 (0.2452) data time 0.0009 (0.0014) model time 0.2402 (0.2440) loss 3.8423 (3.4259) grad_norm 1.8450 (2.0911) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1030/1251] eta 0:00:54 lr 0.000848 wd 0.0500 time 0.2336 (0.2454) data time 0.0009 
(0.0014) model time 0.2328 (0.2441) loss 3.4447 (3.4255) grad_norm 1.3728 (2.0894) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1040/1251] eta 0:00:51 lr 0.000848 wd 0.0500 time 0.2506 (0.2453) data time 0.0010 (0.0014) model time 0.2497 (0.2441) loss 2.8901 (3.4264) grad_norm 2.4297 (2.0887) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1050/1251] eta 0:00:49 lr 0.000848 wd 0.0500 time 0.2328 (0.2455) data time 0.0009 (0.0014) model time 0.2319 (0.2443) loss 2.7326 (3.4262) grad_norm 2.7466 (2.0921) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1060/1251] eta 0:00:46 lr 0.000848 wd 0.0500 time 0.2417 (0.2455) data time 0.0007 (0.0014) model time 0.2410 (0.2443) loss 3.5163 (3.4246) grad_norm 1.9672 (2.0895) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1070/1251] eta 0:00:44 lr 0.000848 wd 0.0500 time 0.2403 (0.2454) data time 0.0010 (0.0014) model time 0.2394 (0.2442) loss 3.6166 (3.4218) grad_norm 3.0529 (2.0901) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1080/1251] eta 0:00:41 lr 0.000848 wd 0.0500 time 0.2369 (0.2454) data time 0.0009 (0.0014) model time 0.2360 (0.2441) loss 3.9085 (3.4228) grad_norm 2.1494 (2.0900) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1090/1251] eta 0:00:39 lr 0.000848 wd 0.0500 time 0.2406 (0.2455) data time 0.0010 (0.0014) model time 0.2396 (0.2443) loss 3.5607 (3.4221) grad_norm 2.1202 (2.0912) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [91/300][1100/1251] eta 0:00:37 lr 0.000848 wd 0.0500 time 0.2420 (0.2455) data time 0.0010 (0.0014) model time 0.2410 (0.2442) loss 3.3675 (3.4206) grad_norm 3.1038 (2.0901) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1110/1251] eta 0:00:34 lr 0.000848 wd 0.0500 time 0.2424 (0.2454) data time 0.0010 (0.0014) model time 0.2413 (0.2442) loss 3.4444 (3.4191) grad_norm 1.9759 (2.0897) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1120/1251] eta 0:00:32 lr 0.000847 wd 0.0500 time 0.2466 (0.2454) data time 0.0009 (0.0014) model time 0.2458 (0.2442) loss 3.6228 (3.4203) grad_norm 2.2796 (2.0888) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1130/1251] eta 0:00:29 lr 0.000847 wd 0.0500 time 0.2400 (0.2454) data time 0.0009 (0.0014) model time 0.2391 (0.2441) loss 2.4996 (3.4179) grad_norm 1.9453 (2.0871) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1140/1251] eta 0:00:27 lr 0.000847 wd 0.0500 time 0.2380 (0.2453) data time 0.0012 (0.0014) model time 0.2369 (0.2441) loss 2.8283 (3.4188) grad_norm 1.5110 (2.0844) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1150/1251] eta 0:00:24 lr 0.000847 wd 0.0500 time 0.2414 (0.2453) data time 0.0012 (0.0014) model time 0.2402 (0.2441) loss 3.3132 (3.4174) grad_norm 1.9664 (2.0842) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1160/1251] eta 0:00:22 lr 0.000847 wd 0.0500 time 0.2467 (0.2453) data time 0.0010 (0.0014) model time 0.2457 (0.2441) loss 3.3918 (3.4153) grad_norm 2.2358 (2.0850) loss_scale 
2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1170/1251] eta 0:00:19 lr 0.000847 wd 0.0500 time 0.2417 (0.2452) data time 0.0010 (0.0014) model time 0.2408 (0.2440) loss 2.8460 (3.4153) grad_norm 1.9676 (2.0834) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1180/1251] eta 0:00:17 lr 0.000847 wd 0.0500 time 0.2411 (0.2452) data time 0.0008 (0.0014) model time 0.2403 (0.2440) loss 3.9241 (3.4157) grad_norm 1.4155 (2.0797) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1190/1251] eta 0:00:14 lr 0.000847 wd 0.0500 time 0.2375 (0.2452) data time 0.0010 (0.0014) model time 0.2364 (0.2440) loss 3.8821 (3.4196) grad_norm 1.7949 (2.0801) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1200/1251] eta 0:00:12 lr 0.000847 wd 0.0500 time 0.2453 (0.2452) data time 0.0008 (0.0014) model time 0.2445 (0.2439) loss 4.1608 (3.4227) grad_norm 2.0939 (2.0825) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1210/1251] eta 0:00:10 lr 0.000847 wd 0.0500 time 0.2399 (0.2451) data time 0.0011 (0.0014) model time 0.2389 (0.2439) loss 3.5586 (3.4235) grad_norm 2.6206 (2.0830) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1220/1251] eta 0:00:07 lr 0.000847 wd 0.0500 time 0.2377 (0.2451) data time 0.0010 (0.0014) model time 0.2367 (0.2439) loss 2.5153 (3.4209) grad_norm 2.1115 (2.0824) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1230/1251] eta 0:00:05 lr 0.000847 wd 0.0500 time 0.2439 (0.2451) data time 
0.0008 (0.0014) model time 0.2431 (0.2439) loss 2.7033 (3.4215) grad_norm 1.8639 (2.0834) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1240/1251] eta 0:00:02 lr 0.000847 wd 0.0500 time 0.2219 (0.2450) data time 0.0005 (0.0014) model time 0.2214 (0.2438) loss 3.3549 (3.4202) grad_norm 1.8997 (2.0827) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [91/300][1250/1251] eta 0:00:00 lr 0.000847 wd 0.0500 time 0.2235 (0.2449) data time 0.0005 (0.0014) model time 0.2230 (0.2437) loss 3.1030 (3.4210) grad_norm 2.0158 (2.0826) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 91 training takes 0:05:06 [2024-08-26 10:11:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:11:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 10:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.434 (0.434) Loss 0.5483 (0.5483) Acc@1 90.332 (90.332) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 10:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.115) Loss 0.8364 (0.8374) Acc@1 82.227 (81.898) Acc@5 95.312 (96.174) Mem 7379MB [2024-08-26 10:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.098) Loss 1.1963 (0.8590) Acc@1 70.508 (80.827) Acc@5 92.578 (96.103) Mem 7379MB [2024-08-26 10:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.090 (0.092) Loss 1.3604 (0.9634) Acc@1 67.383 (78.339) Acc@5 90.625 (94.714) Mem 7379MB [2024-08-26 10:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.2676 (1.0216) Acc@1 70.020 (76.972) Acc@5 90.527 (93.941) Mem 7379MB [2024-08-26 10:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.632 Acc@5 93.866 [2024-08-26 10:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.6% [2024-08-26 10:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.803 (0.803) Loss 0.4482 (0.4482) Acc@1 92.480 (92.480) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.146) Loss 0.7148 (0.6981) Acc@1 85.742 (84.988) Acc@5 96.094 (96.955) Mem 7379MB [2024-08-26 10:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.113) Loss 0.9990 (0.7215) Acc@1 76.465 (83.915) Acc@5 94.238 (96.940) Mem 7379MB [2024-08-26 10:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.102) Loss 1.2861 (0.8246) Acc@1 67.969 (81.530) Acc@5 90.723 (95.769) Mem 7379MB [2024-08-26 10:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.1465 
(0.8771) Acc@1 71.777 (80.061) Acc@5 92.773 (95.267) Mem 7379MB [2024-08-26 10:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.652 Acc@5 95.228 [2024-08-26 10:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.7% [2024-08-26 10:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.65% [2024-08-26 10:11:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:11:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 10:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][0/1251] eta 0:13:01 lr 0.000847 wd 0.0500 time 0.6248 (0.6248) data time 0.3910 (0.3910) model time 0.0000 (0.0000) loss 3.5262 (3.5262) grad_norm 2.1401 (2.1401) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][10/1251] eta 0:05:41 lr 0.000847 wd 0.0500 time 0.2403 (0.2754) data time 0.0010 (0.0364) model time 0.0000 (0.0000) loss 4.1327 (3.5495) grad_norm 2.6854 (2.1582) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][20/1251] eta 0:05:18 lr 0.000847 wd 0.0500 time 0.2360 (0.2584) data time 0.0011 (0.0196) model time 0.0000 (0.0000) loss 2.7833 (3.4770) grad_norm 1.9552 (2.1679) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][30/1251] eta 0:05:17 lr 0.000847 wd 0.0500 time 0.2428 (0.2597) data time 0.0009 (0.0136) model time 0.0000 (0.0000) loss 3.2568 (3.4645) grad_norm 1.4878 (2.0572) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][40/1251] eta 0:05:09 
lr 0.000847 wd 0.0500 time 0.2438 (0.2553) data time 0.0007 (0.0105) model time 0.0000 (0.0000) loss 2.9480 (3.4323) grad_norm 1.4999 (2.0179) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][50/1251] eta 0:05:03 lr 0.000847 wd 0.0500 time 0.2332 (0.2525) data time 0.0011 (0.0087) model time 0.0000 (0.0000) loss 3.7389 (3.3782) grad_norm 1.8730 (2.1104) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][60/1251] eta 0:04:59 lr 0.000847 wd 0.0500 time 0.2395 (0.2511) data time 0.0009 (0.0074) model time 0.2386 (0.2428) loss 3.3246 (3.3937) grad_norm 1.6738 (2.0848) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][70/1251] eta 0:04:54 lr 0.000847 wd 0.0500 time 0.2411 (0.2496) data time 0.0010 (0.0065) model time 0.2401 (0.2412) loss 2.3157 (3.4077) grad_norm 1.4593 (2.0558) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][80/1251] eta 0:04:50 lr 0.000847 wd 0.0500 time 0.2369 (0.2483) data time 0.0007 (0.0058) model time 0.2362 (0.2402) loss 2.2399 (3.4026) grad_norm 3.9145 (2.0767) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][90/1251] eta 0:04:47 lr 0.000847 wd 0.0500 time 0.2414 (0.2475) data time 0.0009 (0.0053) model time 0.2405 (0.2401) loss 4.0283 (3.4182) grad_norm 1.7915 (2.0722) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][100/1251] eta 0:04:44 lr 0.000847 wd 0.0500 time 0.2375 (0.2469) data time 0.0010 (0.0049) model time 0.2365 (0.2401) loss 3.3769 (3.4368) grad_norm 2.8072 (2.1387) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:56 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][110/1251] eta 0:04:41 lr 0.000847 wd 0.0500 time 0.2478 (0.2464) data time 0.0008 (0.0045) model time 0.2470 (0.2402) loss 4.2148 (3.4523) grad_norm 2.5161 (2.1907) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][120/1251] eta 0:04:38 lr 0.000847 wd 0.0500 time 0.2430 (0.2460) data time 0.0008 (0.0042) model time 0.2422 (0.2403) loss 3.9225 (3.4723) grad_norm 1.5220 (2.1840) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][130/1251] eta 0:04:36 lr 0.000847 wd 0.0500 time 0.4059 (0.2468) data time 0.0009 (0.0040) model time 0.4050 (0.2422) loss 3.8696 (3.4665) grad_norm 1.4918 (2.1492) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][140/1251] eta 0:04:33 lr 0.000847 wd 0.0500 time 0.2371 (0.2464) data time 0.0010 (0.0038) model time 0.2360 (0.2419) loss 3.8409 (3.4578) grad_norm 2.0303 (2.1366) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][150/1251] eta 0:04:31 lr 0.000847 wd 0.0500 time 0.2431 (0.2470) data time 0.0010 (0.0036) model time 0.2421 (0.2432) loss 3.7235 (3.4542) grad_norm 1.5713 (2.1259) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][160/1251] eta 0:04:28 lr 0.000847 wd 0.0500 time 0.2396 (0.2466) data time 0.0007 (0.0034) model time 0.2389 (0.2428) loss 3.6701 (3.4649) grad_norm 2.7841 (2.1172) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][170/1251] eta 0:04:26 lr 0.000847 wd 0.0500 time 0.2442 (0.2462) data time 0.0010 (0.0033) model time 0.2433 (0.2426) loss 3.2700 
(3.4575) grad_norm 1.9775 (2.1010) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][180/1251] eta 0:04:23 lr 0.000846 wd 0.0500 time 0.2430 (0.2459) data time 0.0010 (0.0032) model time 0.2421 (0.2423) loss 3.7285 (3.4561) grad_norm 1.6744 (2.0871) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][190/1251] eta 0:04:20 lr 0.000846 wd 0.0500 time 0.2433 (0.2457) data time 0.0007 (0.0030) model time 0.2425 (0.2422) loss 4.1409 (3.4599) grad_norm 2.8008 (2.0966) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][200/1251] eta 0:04:17 lr 0.000846 wd 0.0500 time 0.2407 (0.2454) data time 0.0010 (0.0029) model time 0.2397 (0.2420) loss 3.4260 (3.4615) grad_norm 2.4915 (2.1029) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][210/1251] eta 0:04:15 lr 0.000846 wd 0.0500 time 0.2403 (0.2452) data time 0.0010 (0.0029) model time 0.2392 (0.2419) loss 3.1418 (3.4473) grad_norm 2.0261 (2.1020) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][220/1251] eta 0:04:12 lr 0.000846 wd 0.0500 time 0.2381 (0.2450) data time 0.0010 (0.0028) model time 0.2370 (0.2417) loss 3.6776 (3.4474) grad_norm 1.7608 (2.1002) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][230/1251] eta 0:04:09 lr 0.000846 wd 0.0500 time 0.2358 (0.2447) data time 0.0011 (0.0027) model time 0.2347 (0.2415) loss 3.7547 (3.4432) grad_norm 1.9163 (2.1068) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][240/1251] eta 0:04:07 lr 0.000846 wd 
0.0500 time 0.2365 (0.2445) data time 0.0012 (0.0026) model time 0.2352 (0.2414) loss 3.5041 (3.4382) grad_norm 2.2234 (2.1010) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][250/1251] eta 0:04:05 lr 0.000846 wd 0.0500 time 0.2438 (0.2450) data time 0.0009 (0.0026) model time 0.2429 (0.2421) loss 2.6329 (3.4207) grad_norm 1.7608 (2.0887) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][260/1251] eta 0:04:04 lr 0.000846 wd 0.0500 time 0.2369 (0.2465) data time 0.0007 (0.0025) model time 0.2362 (0.2441) loss 3.1272 (3.4215) grad_norm 2.2334 (2.0917) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][270/1251] eta 0:04:02 lr 0.000846 wd 0.0500 time 0.2349 (0.2472) data time 0.0010 (0.0024) model time 0.2339 (0.2449) loss 3.5883 (3.4298) grad_norm 2.7489 (2.0977) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][280/1251] eta 0:03:59 lr 0.000846 wd 0.0500 time 0.2404 (0.2469) data time 0.0012 (0.0024) model time 0.2392 (0.2447) loss 3.6551 (3.4344) grad_norm 1.7083 (2.0944) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][290/1251] eta 0:03:57 lr 0.000846 wd 0.0500 time 0.2436 (0.2467) data time 0.0009 (0.0023) model time 0.2427 (0.2445) loss 2.8710 (3.4404) grad_norm 1.6983 (2.0894) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][300/1251] eta 0:03:54 lr 0.000846 wd 0.0500 time 0.2378 (0.2465) data time 0.0007 (0.0023) model time 0.2371 (0.2443) loss 4.0679 (3.4406) grad_norm 1.5836 (2.0794) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][310/1251] eta 0:03:51 lr 0.000846 wd 0.0500 time 0.2452 (0.2463) data time 0.0007 (0.0023) model time 0.2445 (0.2441) loss 3.5031 (3.4406) grad_norm 2.7181 (2.0727) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][320/1251] eta 0:03:49 lr 0.000846 wd 0.0500 time 0.2433 (0.2466) data time 0.0007 (0.0022) model time 0.2426 (0.2444) loss 4.2503 (3.4489) grad_norm 1.6325 (2.0756) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][330/1251] eta 0:03:47 lr 0.000846 wd 0.0500 time 0.2381 (0.2470) data time 0.0012 (0.0022) model time 0.2369 (0.2450) loss 3.3272 (3.4449) grad_norm 1.6943 (2.0726) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][340/1251] eta 0:03:44 lr 0.000846 wd 0.0500 time 0.2352 (0.2468) data time 0.0009 (0.0022) model time 0.2343 (0.2448) loss 4.0083 (3.4449) grad_norm 1.6409 (2.0684) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][350/1251] eta 0:03:42 lr 0.000846 wd 0.0500 time 0.2321 (0.2466) data time 0.0011 (0.0021) model time 0.2310 (0.2446) loss 3.5451 (3.4388) grad_norm 2.2827 (2.0700) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][360/1251] eta 0:03:39 lr 0.000846 wd 0.0500 time 0.2434 (0.2465) data time 0.0007 (0.0021) model time 0.2427 (0.2445) loss 4.1667 (3.4357) grad_norm 3.0877 (2.0856) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][370/1251] eta 0:03:37 lr 0.000846 wd 0.0500 time 0.2310 (0.2463) data time 0.0009 (0.0021) model time 0.2301 (0.2443) loss 3.7671 
(3.4378) grad_norm 1.8587 (2.0975) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][380/1251] eta 0:03:34 lr 0.000846 wd 0.0500 time 0.2546 (0.2462) data time 0.0010 (0.0020) model time 0.2536 (0.2442) loss 3.9934 (3.4423) grad_norm 1.4021 (2.0925) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][390/1251] eta 0:03:31 lr 0.000846 wd 0.0500 time 0.2375 (0.2462) data time 0.0010 (0.0020) model time 0.2365 (0.2442) loss 3.5864 (3.4456) grad_norm 1.6419 (2.0942) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][400/1251] eta 0:03:29 lr 0.000846 wd 0.0500 time 0.2390 (0.2466) data time 0.0010 (0.0020) model time 0.2380 (0.2447) loss 3.6580 (3.4477) grad_norm 1.7009 (2.0951) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][410/1251] eta 0:03:27 lr 0.000846 wd 0.0500 time 0.2412 (0.2465) data time 0.0008 (0.0020) model time 0.2404 (0.2446) loss 3.4123 (3.4531) grad_norm 2.3301 (2.0962) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][420/1251] eta 0:03:24 lr 0.000846 wd 0.0500 time 0.2406 (0.2463) data time 0.0010 (0.0019) model time 0.2396 (0.2445) loss 2.3824 (3.4437) grad_norm 1.4554 (2.0909) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][430/1251] eta 0:03:22 lr 0.000846 wd 0.0500 time 0.2411 (0.2462) data time 0.0007 (0.0019) model time 0.2404 (0.2443) loss 3.4335 (3.4409) grad_norm 2.5100 (2.0967) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][440/1251] eta 0:03:19 lr 0.000846 wd 
0.0500 time 0.2439 (0.2461) data time 0.0009 (0.0019) model time 0.2430 (0.2442) loss 3.2700 (3.4435) grad_norm 1.6851 (2.0897) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][450/1251] eta 0:03:17 lr 0.000846 wd 0.0500 time 0.2507 (0.2460) data time 0.0009 (0.0019) model time 0.2498 (0.2441) loss 2.9026 (3.4416) grad_norm 2.5610 (2.0940) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][460/1251] eta 0:03:14 lr 0.000846 wd 0.0500 time 0.2390 (0.2459) data time 0.0007 (0.0019) model time 0.2382 (0.2440) loss 3.5125 (3.4403) grad_norm 2.1800 (2.0989) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][470/1251] eta 0:03:11 lr 0.000846 wd 0.0500 time 0.2433 (0.2458) data time 0.0020 (0.0019) model time 0.2413 (0.2440) loss 2.5813 (3.4385) grad_norm 2.2675 (2.0977) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][480/1251] eta 0:03:09 lr 0.000846 wd 0.0500 time 0.2438 (0.2457) data time 0.0010 (0.0018) model time 0.2428 (0.2439) loss 2.8234 (3.4385) grad_norm 2.0982 (2.0981) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][490/1251] eta 0:03:06 lr 0.000846 wd 0.0500 time 0.2395 (0.2456) data time 0.0007 (0.0018) model time 0.2387 (0.2438) loss 2.4988 (3.4379) grad_norm 1.8960 (2.1029) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][500/1251] eta 0:03:04 lr 0.000845 wd 0.0500 time 0.2419 (0.2456) data time 0.0008 (0.0018) model time 0.2410 (0.2438) loss 3.7824 (3.4445) grad_norm 2.6422 (2.1085) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:34 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][510/1251] eta 0:03:01 lr 0.000845 wd 0.0500 time 0.2419 (0.2456) data time 0.0009 (0.0018) model time 0.2411 (0.2438) loss 3.2979 (3.4484) grad_norm 1.9792 (2.1033) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][520/1251] eta 0:02:59 lr 0.000845 wd 0.0500 time 0.2423 (0.2455) data time 0.0007 (0.0018) model time 0.2417 (0.2437) loss 3.7305 (3.4467) grad_norm 2.2443 (2.1007) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][530/1251] eta 0:02:56 lr 0.000845 wd 0.0500 time 0.2445 (0.2454) data time 0.0011 (0.0018) model time 0.2434 (0.2437) loss 3.4113 (3.4431) grad_norm 2.1474 (2.0994) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][540/1251] eta 0:02:54 lr 0.000845 wd 0.0500 time 0.2367 (0.2454) data time 0.0013 (0.0018) model time 0.2354 (0.2436) loss 4.0833 (3.4499) grad_norm 1.9682 (2.1188) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][550/1251] eta 0:02:51 lr 0.000845 wd 0.0500 time 0.2408 (0.2453) data time 0.0010 (0.0017) model time 0.2398 (0.2435) loss 3.0628 (3.4495) grad_norm 1.8240 (2.1139) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][560/1251] eta 0:02:49 lr 0.000845 wd 0.0500 time 0.2383 (0.2452) data time 0.0010 (0.0017) model time 0.2373 (0.2434) loss 2.5096 (3.4494) grad_norm 1.6651 (2.1069) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][570/1251] eta 0:02:46 lr 0.000845 wd 0.0500 time 0.2477 (0.2452) data time 0.0012 (0.0017) model time 0.2465 (0.2434) loss 3.0775 
(3.4408) grad_norm 1.3593 (2.1005) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][580/1251] eta 0:02:44 lr 0.000845 wd 0.0500 time 0.2412 (0.2451) data time 0.0007 (0.0017) model time 0.2404 (0.2434) loss 4.0037 (3.4462) grad_norm 2.2684 (2.0986) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][590/1251] eta 0:02:41 lr 0.000845 wd 0.0500 time 0.2386 (0.2450) data time 0.0010 (0.0017) model time 0.2377 (0.2433) loss 3.6676 (3.4400) grad_norm 1.3543 (2.0927) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][600/1251] eta 0:02:39 lr 0.000845 wd 0.0500 time 0.2466 (0.2450) data time 0.0009 (0.0017) model time 0.2457 (0.2433) loss 3.6883 (3.4377) grad_norm 2.3044 (2.0930) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][610/1251] eta 0:02:37 lr 0.000845 wd 0.0500 time 0.2426 (0.2449) data time 0.0010 (0.0017) model time 0.2416 (0.2432) loss 3.1136 (3.4384) grad_norm 2.1107 (2.0903) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][620/1251] eta 0:02:34 lr 0.000845 wd 0.0500 time 0.2411 (0.2449) data time 0.0010 (0.0017) model time 0.2401 (0.2432) loss 3.8495 (3.4371) grad_norm 1.9706 (2.0902) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][630/1251] eta 0:02:32 lr 0.000845 wd 0.0500 time 0.2428 (0.2448) data time 0.0008 (0.0016) model time 0.2420 (0.2431) loss 2.7955 (3.4327) grad_norm 2.0009 (2.0866) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][640/1251] eta 0:02:29 lr 0.000845 wd 
0.0500 time 0.2416 (0.2448) data time 0.0009 (0.0016) model time 0.2406 (0.2431) loss 3.6816 (3.4352) grad_norm 3.0469 (2.0866) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][650/1251] eta 0:02:27 lr 0.000845 wd 0.0500 time 0.2421 (0.2450) data time 0.0010 (0.0016) model time 0.2411 (0.2434) loss 3.8284 (3.4339) grad_norm 1.8083 (2.0825) loss_scale 4096.0000 (2073.1674) mem 7379MB [2024-08-26 10:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][660/1251] eta 0:02:24 lr 0.000845 wd 0.0500 time 0.2613 (0.2453) data time 0.0007 (0.0016) model time 0.2605 (0.2437) loss 2.8627 (3.4309) grad_norm 1.6793 (2.0817) loss_scale 4096.0000 (2103.7700) mem 7379MB [2024-08-26 10:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][670/1251] eta 0:02:22 lr 0.000845 wd 0.0500 time 0.2595 (0.2453) data time 0.0010 (0.0016) model time 0.2585 (0.2437) loss 3.5757 (3.4319) grad_norm 1.5021 (2.0808) loss_scale 4096.0000 (2133.4605) mem 7379MB [2024-08-26 10:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][680/1251] eta 0:02:20 lr 0.000845 wd 0.0500 time 0.2447 (0.2456) data time 0.0010 (0.0016) model time 0.2437 (0.2440) loss 2.7460 (3.4307) grad_norm 1.7203 (2.0793) loss_scale 4096.0000 (2162.2790) mem 7379MB [2024-08-26 10:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][690/1251] eta 0:02:17 lr 0.000845 wd 0.0500 time 0.2426 (0.2456) data time 0.0010 (0.0016) model time 0.2417 (0.2439) loss 3.5048 (3.4320) grad_norm 2.2571 (2.0745) loss_scale 4096.0000 (2190.2634) mem 7379MB [2024-08-26 10:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][700/1251] eta 0:02:15 lr 0.000845 wd 0.0500 time 0.2456 (0.2455) data time 0.0008 (0.0016) model time 0.2448 (0.2439) loss 3.7803 (3.4330) grad_norm 2.1421 (2.0761) loss_scale 4096.0000 (2217.4494) mem 7379MB [2024-08-26 10:14:23 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][710/1251] eta 0:02:12 lr 0.000845 wd 0.0500 time 0.2371 (0.2455) data time 0.0007 (0.0017) model time 0.2364 (0.2438) loss 3.6690 (3.4351) grad_norm 1.5422 (2.0735) loss_scale 4096.0000 (2243.8706) mem 7379MB [2024-08-26 10:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][720/1251] eta 0:02:10 lr 0.000845 wd 0.0500 time 0.2446 (0.2455) data time 0.0009 (0.0016) model time 0.2437 (0.2438) loss 3.1250 (3.4324) grad_norm 2.0241 (2.0705) loss_scale 4096.0000 (2269.5589) mem 7379MB [2024-08-26 10:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][730/1251] eta 0:02:07 lr 0.000845 wd 0.0500 time 0.2417 (0.2454) data time 0.0010 (0.0016) model time 0.2407 (0.2438) loss 3.6337 (3.4309) grad_norm 2.0868 (2.0684) loss_scale 4096.0000 (2294.5445) mem 7379MB [2024-08-26 10:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][740/1251] eta 0:02:05 lr 0.000845 wd 0.0500 time 0.2436 (0.2454) data time 0.0010 (0.0016) model time 0.2426 (0.2437) loss 3.3752 (3.4294) grad_norm 2.4178 (2.0672) loss_scale 4096.0000 (2318.8556) mem 7379MB [2024-08-26 10:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][750/1251] eta 0:02:02 lr 0.000845 wd 0.0500 time 0.2418 (0.2453) data time 0.0008 (0.0016) model time 0.2410 (0.2437) loss 4.0345 (3.4278) grad_norm 1.7917 (2.0670) loss_scale 4096.0000 (2342.5193) mem 7379MB [2024-08-26 10:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][760/1251] eta 0:02:00 lr 0.000845 wd 0.0500 time 0.2516 (0.2453) data time 0.0009 (0.0016) model time 0.2506 (0.2437) loss 3.8345 (3.4297) grad_norm 2.6360 (2.0717) loss_scale 4096.0000 (2365.5611) mem 7379MB [2024-08-26 10:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][770/1251] eta 0:01:58 lr 0.000845 wd 0.0500 time 0.2473 (0.2456) data time 0.0007 (0.0016) model time 0.2466 (0.2439) loss 2.6002 
(3.4297) grad_norm 1.9753 (2.0726) loss_scale 4096.0000 (2388.0052) mem 7379MB [2024-08-26 10:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][780/1251] eta 0:01:55 lr 0.000845 wd 0.0500 time 0.2476 (0.2457) data time 0.0008 (0.0016) model time 0.2468 (0.2441) loss 3.1119 (3.4323) grad_norm 1.9275 (2.0772) loss_scale 4096.0000 (2409.8745) mem 7379MB [2024-08-26 10:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][790/1251] eta 0:01:53 lr 0.000845 wd 0.0500 time 0.2614 (0.2460) data time 0.0008 (0.0016) model time 0.2606 (0.2444) loss 3.1130 (3.4311) grad_norm 1.6358 (2.0736) loss_scale 4096.0000 (2431.1909) mem 7379MB [2024-08-26 10:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][800/1251] eta 0:01:50 lr 0.000845 wd 0.0500 time 0.2330 (0.2459) data time 0.0011 (0.0016) model time 0.2319 (0.2443) loss 3.7458 (3.4318) grad_norm 1.5452 (2.0718) loss_scale 4096.0000 (2451.9750) mem 7379MB [2024-08-26 10:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][810/1251] eta 0:01:48 lr 0.000844 wd 0.0500 time 0.2430 (0.2458) data time 0.0012 (0.0016) model time 0.2418 (0.2443) loss 3.1076 (3.4343) grad_norm 1.5435 (2.0693) loss_scale 4096.0000 (2472.2466) mem 7379MB [2024-08-26 10:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][820/1251] eta 0:01:45 lr 0.000844 wd 0.0500 time 0.2429 (0.2458) data time 0.0009 (0.0016) model time 0.2420 (0.2442) loss 4.1404 (3.4319) grad_norm 1.6581 (2.0732) loss_scale 4096.0000 (2492.0244) mem 7379MB [2024-08-26 10:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][830/1251] eta 0:01:43 lr 0.000844 wd 0.0500 time 0.2456 (0.2458) data time 0.0008 (0.0016) model time 0.2448 (0.2442) loss 3.9936 (3.4341) grad_norm 2.2047 (2.0780) loss_scale 4096.0000 (2511.3261) mem 7379MB [2024-08-26 10:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][840/1251] eta 0:01:41 lr 0.000844 wd 
0.0500 time 0.2496 (0.2460) data time 0.0008 (0.0016) model time 0.2488 (0.2444) loss 2.5434 (3.4313) grad_norm 3.1793 (2.0796) loss_scale 4096.0000 (2530.1688) mem 7379MB [2024-08-26 10:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][850/1251] eta 0:01:38 lr 0.000844 wd 0.0500 time 0.2457 (0.2460) data time 0.0011 (0.0016) model time 0.2446 (0.2444) loss 4.3171 (3.4339) grad_norm 1.8443 (2.0807) loss_scale 4096.0000 (2548.5687) mem 7379MB [2024-08-26 10:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][860/1251] eta 0:01:36 lr 0.000844 wd 0.0500 time 0.2469 (0.2462) data time 0.0008 (0.0016) model time 0.2462 (0.2447) loss 4.0380 (3.4330) grad_norm 1.6922 (2.0802) loss_scale 4096.0000 (2566.5412) mem 7379MB [2024-08-26 10:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][870/1251] eta 0:01:33 lr 0.000844 wd 0.0500 time 0.2453 (0.2462) data time 0.0010 (0.0016) model time 0.2443 (0.2446) loss 4.0568 (3.4351) grad_norm 2.1222 (2.0810) loss_scale 4096.0000 (2584.1010) mem 7379MB [2024-08-26 10:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][880/1251] eta 0:01:31 lr 0.000844 wd 0.0500 time 0.2416 (0.2461) data time 0.0010 (0.0016) model time 0.2407 (0.2446) loss 3.3885 (3.4312) grad_norm 3.0531 (2.0815) loss_scale 4096.0000 (2601.2622) mem 7379MB [2024-08-26 10:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][890/1251] eta 0:01:28 lr 0.000844 wd 0.0500 time 0.2423 (0.2461) data time 0.0008 (0.0016) model time 0.2415 (0.2445) loss 3.8907 (3.4339) grad_norm 1.7719 (2.0788) loss_scale 4096.0000 (2618.0382) mem 7379MB [2024-08-26 10:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][900/1251] eta 0:01:26 lr 0.000844 wd 0.0500 time 0.2428 (0.2460) data time 0.0007 (0.0016) model time 0.2420 (0.2445) loss 3.1279 (3.4340) grad_norm 3.7737 (2.0789) loss_scale 4096.0000 (2634.4417) mem 7379MB [2024-08-26 10:15:13 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][910/1251] eta 0:01:23 lr 0.000844 wd 0.0500 time 0.2458 (0.2460) data time 0.0010 (0.0016) model time 0.2449 (0.2445) loss 3.4065 (3.4306) grad_norm 2.2903 (2.0793) loss_scale 4096.0000 (2650.4852) mem 7379MB [2024-08-26 10:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][920/1251] eta 0:01:21 lr 0.000844 wd 0.0500 time 0.2427 (0.2460) data time 0.0011 (0.0016) model time 0.2416 (0.2444) loss 3.9370 (3.4320) grad_norm 1.9791 (2.0864) loss_scale 4096.0000 (2666.1802) mem 7379MB [2024-08-26 10:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][930/1251] eta 0:01:18 lr 0.000844 wd 0.0500 time 0.2498 (0.2460) data time 0.0012 (0.0016) model time 0.2486 (0.2444) loss 3.5452 (3.4316) grad_norm 2.4167 (2.0888) loss_scale 4096.0000 (2681.5381) mem 7379MB [2024-08-26 10:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][940/1251] eta 0:01:16 lr 0.000844 wd 0.0500 time 0.2434 (0.2459) data time 0.0011 (0.0016) model time 0.2423 (0.2444) loss 3.6387 (3.4330) grad_norm 1.7150 (2.0855) loss_scale 4096.0000 (2696.5696) mem 7379MB [2024-08-26 10:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][950/1251] eta 0:01:14 lr 0.000844 wd 0.0500 time 0.2387 (0.2459) data time 0.0011 (0.0016) model time 0.2376 (0.2444) loss 4.1397 (3.4333) grad_norm 2.1835 (2.0818) loss_scale 4096.0000 (2711.2850) mem 7379MB [2024-08-26 10:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][960/1251] eta 0:01:11 lr 0.000844 wd 0.0500 time 0.2434 (0.2461) data time 0.0009 (0.0016) model time 0.2425 (0.2446) loss 2.9229 (3.4313) grad_norm 1.6769 (2.0812) loss_scale 4096.0000 (2725.6941) mem 7379MB [2024-08-26 10:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][970/1251] eta 0:01:09 lr 0.000844 wd 0.0500 time 0.2480 (0.2461) data time 0.0011 (0.0016) model time 0.2469 (0.2445) loss 3.8734 
(3.4319) grad_norm 1.4955 (2.0804) loss_scale 4096.0000 (2739.8064) mem 7379MB [2024-08-26 10:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][980/1251] eta 0:01:06 lr 0.000844 wd 0.0500 time 0.2418 (0.2460) data time 0.0007 (0.0016) model time 0.2411 (0.2445) loss 4.0487 (3.4269) grad_norm 3.2295 (2.0806) loss_scale 4096.0000 (2753.6310) mem 7379MB [2024-08-26 10:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][990/1251] eta 0:01:04 lr 0.000844 wd 0.0500 time 0.2392 (0.2460) data time 0.0011 (0.0016) model time 0.2382 (0.2445) loss 3.8413 (3.4295) grad_norm 2.0843 (2.0790) loss_scale 4096.0000 (2767.1766) mem 7379MB [2024-08-26 10:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1000/1251] eta 0:01:01 lr 0.000844 wd 0.0500 time 0.2405 (0.2460) data time 0.0007 (0.0015) model time 0.2398 (0.2445) loss 3.0666 (3.4292) grad_norm 2.1520 (2.0756) loss_scale 4096.0000 (2780.4515) mem 7379MB [2024-08-26 10:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1010/1251] eta 0:00:59 lr 0.000844 wd 0.0500 time 0.2396 (0.2459) data time 0.0011 (0.0015) model time 0.2385 (0.2444) loss 2.8913 (3.4271) grad_norm 2.3580 (2.0763) loss_scale 4096.0000 (2793.4639) mem 7379MB [2024-08-26 10:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1020/1251] eta 0:00:56 lr 0.000844 wd 0.0500 time 0.2396 (0.2459) data time 0.0011 (0.0015) model time 0.2385 (0.2444) loss 2.8211 (3.4256) grad_norm 2.4177 (2.0802) loss_scale 4096.0000 (2806.2214) mem 7379MB [2024-08-26 10:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1030/1251] eta 0:00:54 lr 0.000844 wd 0.0500 time 0.2380 (0.2458) data time 0.0009 (0.0015) model time 0.2371 (0.2443) loss 3.6867 (3.4245) grad_norm 2.2380 (nan) loss_scale 2048.0000 (2798.8671) mem 7379MB [2024-08-26 10:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1040/1251] eta 0:00:51 lr 0.000844 
wd 0.0500 time 0.2449 (0.2458) data time 0.0010 (0.0015) model time 0.2439 (0.2443) loss 3.6555 (3.4223) grad_norm 3.1248 (nan) loss_scale 2048.0000 (2791.6542) mem 7379MB [2024-08-26 10:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1050/1251] eta 0:00:49 lr 0.000844 wd 0.0500 time 0.2446 (0.2457) data time 0.0008 (0.0015) model time 0.2438 (0.2442) loss 2.6583 (3.4217) grad_norm 1.7015 (nan) loss_scale 2048.0000 (2784.5785) mem 7379MB [2024-08-26 10:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1060/1251] eta 0:00:46 lr 0.000844 wd 0.0500 time 0.2331 (0.2457) data time 0.0013 (0.0015) model time 0.2318 (0.2442) loss 3.8422 (3.4190) grad_norm 2.0497 (nan) loss_scale 2048.0000 (2777.6362) mem 7379MB [2024-08-26 10:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1070/1251] eta 0:00:44 lr 0.000844 wd 0.0500 time 0.2475 (0.2457) data time 0.0008 (0.0015) model time 0.2467 (0.2442) loss 2.4438 (3.4201) grad_norm 1.4341 (nan) loss_scale 2048.0000 (2770.8235) mem 7379MB [2024-08-26 10:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1080/1251] eta 0:00:41 lr 0.000844 wd 0.0500 time 0.2323 (0.2456) data time 0.0010 (0.0015) model time 0.2312 (0.2441) loss 3.1840 (3.4189) grad_norm 1.6830 (nan) loss_scale 2048.0000 (2764.1369) mem 7379MB [2024-08-26 10:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1090/1251] eta 0:00:39 lr 0.000844 wd 0.0500 time 0.2510 (0.2456) data time 0.0010 (0.0015) model time 0.2500 (0.2441) loss 2.5362 (3.4160) grad_norm 2.7233 (nan) loss_scale 2048.0000 (2757.5729) mem 7379MB [2024-08-26 10:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1100/1251] eta 0:00:37 lr 0.000844 wd 0.0500 time 0.2474 (0.2456) data time 0.0010 (0.0015) model time 0.2463 (0.2441) loss 3.6376 (3.4155) grad_norm 1.3537 (nan) loss_scale 2048.0000 (2751.1281) mem 7379MB [2024-08-26 10:16:01 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [92/300][1110/1251] eta 0:00:34 lr 0.000843 wd 0.0500 time 0.2501 (0.2455) data time 0.0009 (0.0015) model time 0.2492 (0.2440) loss 3.7262 (3.4134) grad_norm 1.7656 (nan) loss_scale 2048.0000 (2744.7993) mem 7379MB [2024-08-26 10:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1120/1251] eta 0:00:32 lr 0.000843 wd 0.0500 time 0.2365 (0.2455) data time 0.0010 (0.0015) model time 0.2356 (0.2440) loss 3.7307 (3.4127) grad_norm 1.8922 (nan) loss_scale 2048.0000 (2738.5834) mem 7379MB [2024-08-26 10:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1130/1251] eta 0:00:29 lr 0.000843 wd 0.0500 time 0.2461 (0.2455) data time 0.0010 (0.0015) model time 0.2452 (0.2440) loss 3.0889 (3.4125) grad_norm 1.9195 (nan) loss_scale 2048.0000 (2732.4775) mem 7379MB [2024-08-26 10:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1140/1251] eta 0:00:27 lr 0.000843 wd 0.0500 time 0.2370 (0.2454) data time 0.0009 (0.0015) model time 0.2361 (0.2439) loss 2.9370 (3.4144) grad_norm 1.8209 (nan) loss_scale 2048.0000 (2726.4785) mem 7379MB [2024-08-26 10:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1150/1251] eta 0:00:24 lr 0.000843 wd 0.0500 time 0.2387 (0.2454) data time 0.0008 (0.0015) model time 0.2378 (0.2439) loss 3.5037 (3.4131) grad_norm 1.9407 (nan) loss_scale 2048.0000 (2720.5838) mem 7379MB [2024-08-26 10:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1160/1251] eta 0:00:22 lr 0.000843 wd 0.0500 time 0.2387 (0.2453) data time 0.0009 (0.0015) model time 0.2378 (0.2438) loss 2.6102 (3.4125) grad_norm 1.4138 (nan) loss_scale 2048.0000 (2714.7907) mem 7379MB [2024-08-26 10:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1170/1251] eta 0:00:19 lr 0.000843 wd 0.0500 time 0.2408 (0.2453) data time 0.0010 (0.0015) model time 0.2398 (0.2438) loss 3.9657 (3.4118) grad_norm 1.9624 (nan) 
loss_scale 2048.0000 (2709.0965) mem 7379MB [2024-08-26 10:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1180/1251] eta 0:00:17 lr 0.000843 wd 0.0500 time 0.4389 (0.2454) data time 0.0008 (0.0015) model time 0.4381 (0.2439) loss 3.4986 (3.4125) grad_norm 2.3727 (nan) loss_scale 2048.0000 (2703.4987) mem 7379MB [2024-08-26 10:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1190/1251] eta 0:00:14 lr 0.000843 wd 0.0500 time 0.2397 (0.2456) data time 0.0011 (0.0015) model time 0.2386 (0.2441) loss 3.5703 (3.4131) grad_norm 1.9290 (nan) loss_scale 2048.0000 (2697.9950) mem 7379MB [2024-08-26 10:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1200/1251] eta 0:00:12 lr 0.000843 wd 0.0500 time 0.2387 (0.2455) data time 0.0010 (0.0015) model time 0.2377 (0.2441) loss 3.8669 (3.4130) grad_norm 1.8818 (nan) loss_scale 2048.0000 (2692.5828) mem 7379MB [2024-08-26 10:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1210/1251] eta 0:00:10 lr 0.000843 wd 0.0500 time 0.2318 (0.2455) data time 0.0009 (0.0015) model time 0.2309 (0.2440) loss 3.8323 (3.4132) grad_norm 1.7446 (nan) loss_scale 2048.0000 (2687.2601) mem 7379MB [2024-08-26 10:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1220/1251] eta 0:00:07 lr 0.000843 wd 0.0500 time 0.2487 (0.2454) data time 0.0009 (0.0015) model time 0.2478 (0.2440) loss 3.8934 (3.4147) grad_norm 1.7231 (nan) loss_scale 2048.0000 (2682.0246) mem 7379MB [2024-08-26 10:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1230/1251] eta 0:00:05 lr 0.000843 wd 0.0500 time 0.2464 (0.2454) data time 0.0009 (0.0015) model time 0.2455 (0.2440) loss 2.7286 (3.4139) grad_norm 2.2089 (nan) loss_scale 2048.0000 (2676.8741) mem 7379MB [2024-08-26 10:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1240/1251] eta 0:00:02 lr 0.000843 wd 0.0500 time 0.2231 (0.2453) data time 0.0005 
(0.0015) model time 0.2226 (0.2439) loss 2.6279 (3.4148) grad_norm 1.8468 (nan) loss_scale 2048.0000 (2671.8066) mem 7379MB [2024-08-26 10:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [92/300][1250/1251] eta 0:00:00 lr 0.000843 wd 0.0500 time 0.2320 (0.2452) data time 0.0005 (0.0014) model time 0.2315 (0.2437) loss 2.5950 (3.4156) grad_norm 1.6790 (nan) loss_scale 2048.0000 (2666.8201) mem 7379MB [2024-08-26 10:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 92 training takes 0:05:06 [2024-08-26 10:16:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:16:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 10:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.481 (0.481) Loss 0.5083 (0.5083) Acc@1 89.551 (89.551) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 10:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.115) Loss 0.8535 (0.8097) Acc@1 83.105 (82.271) Acc@5 95.020 (96.005) Mem 7379MB [2024-08-26 10:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.097) Loss 1.1543 (0.8210) Acc@1 73.438 (81.506) Acc@5 92.480 (96.159) Mem 7379MB [2024-08-26 10:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.3809 (0.9287) Acc@1 66.504 (79.076) Acc@5 89.258 (94.714) Mem 7379MB [2024-08-26 10:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.2656 (0.9921) Acc@1 69.141 (77.432) Acc@5 90.527 (93.974) Mem 7379MB [2024-08-26 10:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.050 Acc@5 93.914 [2024-08-26 10:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.1% [2024-08-26 10:16:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.05% [2024-08-26 10:16:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 10:16:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-26 10:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.447 (0.447) Loss 0.4446 (0.4446) Acc@1 92.480 (92.480) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 10:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.111) Loss 0.7153 (0.6971) Acc@1 85.547 (84.996) Acc@5 96.191 (96.982) Mem 7379MB [2024-08-26 10:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.096) Loss 0.9980 (0.7210) Acc@1 76.465 (83.933) Acc@5 94.238 (96.945) Mem 7379MB [2024-08-26 10:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.090) Loss 1.2842 (0.8234) Acc@1 68.164 (81.530) Acc@5 90.918 (95.766) Mem 7379MB [2024-08-26 10:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1465 (0.8757) Acc@1 71.191 (80.042) Acc@5 92.773 (95.255) Mem 7379MB [2024-08-26 10:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.652 Acc@5 95.206 [2024-08-26 10:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.7% [2024-08-26 10:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][0/1251] eta 0:21:21 lr 0.000843 wd 0.0500 time 1.0244 (1.0244) data time 0.7570 (0.7570) model time 0.0000 (0.0000) loss 3.7575 (3.7575) grad_norm 2.3291 (2.3291) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][10/1251] eta 0:06:28 lr 0.000843 wd 0.0500 time 0.2387 (0.3127) data time 0.0009 (0.0698) model 
time 0.0000 (0.0000) loss 3.7242 (3.1609) grad_norm 1.8006 (1.9538) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][20/1251] eta 0:05:42 lr 0.000843 wd 0.0500 time 0.2348 (0.2784) data time 0.0011 (0.0370) model time 0.0000 (0.0000) loss 2.3641 (3.1312) grad_norm 2.0622 (2.1409) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][30/1251] eta 0:05:24 lr 0.000843 wd 0.0500 time 0.2358 (0.2655) data time 0.0007 (0.0254) model time 0.0000 (0.0000) loss 3.6923 (3.2888) grad_norm 2.3979 (2.1873) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][40/1251] eta 0:05:14 lr 0.000843 wd 0.0500 time 0.2426 (0.2598) data time 0.0010 (0.0194) model time 0.0000 (0.0000) loss 4.0522 (3.2613) grad_norm 1.9911 (2.2878) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][50/1251] eta 0:05:15 lr 0.000843 wd 0.0500 time 0.3807 (0.2628) data time 0.0007 (0.0158) model time 0.0000 (0.0000) loss 2.9776 (3.2895) grad_norm 1.7248 (2.2403) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][60/1251] eta 0:05:12 lr 0.000843 wd 0.0500 time 0.2387 (0.2623) data time 0.0008 (0.0134) model time 0.2380 (0.2588) loss 4.2465 (3.3254) grad_norm 1.7435 (2.2096) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][70/1251] eta 0:05:06 lr 0.000843 wd 0.0500 time 0.2423 (0.2596) data time 0.0009 (0.0116) model time 0.2414 (0.2506) loss 3.8923 (3.3266) grad_norm 2.4560 (2.1932) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][80/1251] 
eta 0:05:01 lr 0.000843 wd 0.0500 time 0.2400 (0.2574) data time 0.0008 (0.0103) model time 0.2392 (0.2473) loss 3.3575 (3.3051) grad_norm 1.4752 (2.1965) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][90/1251] eta 0:04:56 lr 0.000843 wd 0.0500 time 0.2425 (0.2558) data time 0.0007 (0.0093) model time 0.2418 (0.2459) loss 2.2256 (3.3026) grad_norm 3.0953 (2.1745) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][100/1251] eta 0:04:52 lr 0.000843 wd 0.0500 time 0.2415 (0.2542) data time 0.0009 (0.0085) model time 0.2406 (0.2444) loss 3.3945 (3.3059) grad_norm 2.1046 (2.1667) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][110/1251] eta 0:04:48 lr 0.000843 wd 0.0500 time 0.2362 (0.2531) data time 0.0012 (0.0078) model time 0.2350 (0.2438) loss 3.4856 (3.3007) grad_norm 3.0151 (2.2379) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][120/1251] eta 0:04:45 lr 0.000843 wd 0.0500 time 0.2428 (0.2522) data time 0.0009 (0.0072) model time 0.2418 (0.2436) loss 3.8784 (3.3121) grad_norm 2.5003 (2.2281) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][130/1251] eta 0:04:43 lr 0.000843 wd 0.0500 time 0.4617 (0.2532) data time 0.0009 (0.0068) model time 0.4608 (0.2461) loss 3.2745 (3.3133) grad_norm 2.2533 (2.2190) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][140/1251] eta 0:04:40 lr 0.000843 wd 0.0500 time 0.2404 (0.2522) data time 0.0014 (0.0064) model time 0.2390 (0.2453) loss 3.9100 (3.3414) grad_norm 2.2436 (2.2106) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 10:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][150/1251] eta 0:04:36 lr 0.000843 wd 0.0500 time 0.2396 (0.2514) data time 0.0012 (0.0060) model time 0.2384 (0.2447) loss 3.7095 (3.3517) grad_norm 1.9926 (2.1998) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][160/1251] eta 0:04:33 lr 0.000843 wd 0.0500 time 0.2461 (0.2508) data time 0.0008 (0.0057) model time 0.2452 (0.2443) loss 3.1134 (3.3554) grad_norm 2.4293 (2.1952) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][170/1251] eta 0:04:30 lr 0.000842 wd 0.0500 time 0.2337 (0.2502) data time 0.0010 (0.0054) model time 0.2327 (0.2438) loss 2.6460 (3.3548) grad_norm 1.9137 (2.1889) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][180/1251] eta 0:04:27 lr 0.000842 wd 0.0500 time 0.2442 (0.2497) data time 0.0007 (0.0052) model time 0.2435 (0.2436) loss 3.7748 (3.3646) grad_norm 1.6198 (2.2227) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][190/1251] eta 0:04:25 lr 0.000842 wd 0.0500 time 0.2341 (0.2506) data time 0.0011 (0.0050) model time 0.2330 (0.2451) loss 3.2919 (3.3733) grad_norm 2.7942 (2.2233) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][200/1251] eta 0:04:24 lr 0.000842 wd 0.0500 time 0.2335 (0.2513) data time 0.0008 (0.0048) model time 0.2327 (0.2463) loss 3.1915 (3.3789) grad_norm 2.4495 (2.2316) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][210/1251] eta 0:04:20 lr 0.000842 wd 0.0500 time 0.2321 (0.2507) data time 0.0011 (0.0046) model time 0.2310 
(0.2458) loss 3.7017 (3.3839) grad_norm 2.4393 (2.2258) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][220/1251] eta 0:04:18 lr 0.000842 wd 0.0500 time 0.2440 (0.2503) data time 0.0011 (0.0044) model time 0.2429 (0.2456) loss 3.5347 (3.3891) grad_norm 2.2698 (2.2172) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][230/1251] eta 0:04:15 lr 0.000842 wd 0.0500 time 0.2405 (0.2500) data time 0.0012 (0.0043) model time 0.2393 (0.2454) loss 3.1188 (3.3879) grad_norm 2.0541 (2.2101) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][240/1251] eta 0:04:12 lr 0.000842 wd 0.0500 time 0.2380 (0.2497) data time 0.0007 (0.0042) model time 0.2373 (0.2451) loss 3.9034 (3.3874) grad_norm 2.6177 (2.2063) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][250/1251] eta 0:04:09 lr 0.000842 wd 0.0500 time 0.2400 (0.2493) data time 0.0010 (0.0040) model time 0.2390 (0.2448) loss 3.6052 (3.3944) grad_norm 2.9642 (2.2085) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][260/1251] eta 0:04:06 lr 0.000842 wd 0.0500 time 0.2470 (0.2490) data time 0.0009 (0.0039) model time 0.2460 (0.2447) loss 3.4207 (3.3966) grad_norm 2.0904 (2.1985) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][270/1251] eta 0:04:04 lr 0.000842 wd 0.0500 time 0.2390 (0.2488) data time 0.0011 (0.0038) model time 0.2379 (0.2445) loss 2.9639 (3.3781) grad_norm 1.8996 (2.1926) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][280/1251] eta 
0:04:01 lr 0.000842 wd 0.0500 time 0.2389 (0.2485) data time 0.0009 (0.0037) model time 0.2380 (0.2443) loss 2.3274 (3.3708) grad_norm 3.5869 (2.2103) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][290/1251] eta 0:03:58 lr 0.000842 wd 0.0500 time 0.2514 (0.2484) data time 0.0010 (0.0036) model time 0.2505 (0.2443) loss 3.0944 (3.3672) grad_norm 1.8863 (2.2112) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][300/1251] eta 0:03:56 lr 0.000842 wd 0.0500 time 0.2404 (0.2488) data time 0.0011 (0.0035) model time 0.2394 (0.2449) loss 2.9856 (3.3670) grad_norm 1.8139 (2.1889) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][310/1251] eta 0:03:54 lr 0.000842 wd 0.0500 time 0.2372 (0.2493) data time 0.0011 (0.0035) model time 0.2361 (0.2456) loss 1.9249 (3.3684) grad_norm 1.9325 (2.1831) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][320/1251] eta 0:03:51 lr 0.000842 wd 0.0500 time 0.2362 (0.2491) data time 0.0011 (0.0034) model time 0.2351 (0.2454) loss 3.9112 (3.3737) grad_norm 1.7343 (2.1778) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][330/1251] eta 0:03:49 lr 0.000842 wd 0.0500 time 0.2415 (0.2488) data time 0.0010 (0.0033) model time 0.2405 (0.2452) loss 3.6279 (3.3744) grad_norm 2.0847 (2.1734) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][340/1251] eta 0:03:46 lr 0.000842 wd 0.0500 time 0.2396 (0.2486) data time 0.0009 (0.0032) model time 0.2387 (0.2450) loss 3.7044 (3.3804) grad_norm 2.1791 (2.1748) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 10:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][350/1251] eta 0:03:43 lr 0.000842 wd 0.0500 time 0.2377 (0.2483) data time 0.0007 (0.0032) model time 0.2370 (0.2448) loss 4.5790 (3.3851) grad_norm 2.1630 (2.1663) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][360/1251] eta 0:03:41 lr 0.000842 wd 0.0500 time 0.2396 (0.2481) data time 0.0011 (0.0031) model time 0.2386 (0.2447) loss 3.4080 (3.3769) grad_norm 1.8460 (2.1638) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][370/1251] eta 0:03:38 lr 0.000842 wd 0.0500 time 0.2470 (0.2480) data time 0.0007 (0.0031) model time 0.2463 (0.2446) loss 2.5267 (3.3722) grad_norm 1.9905 (2.1616) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][380/1251] eta 0:03:35 lr 0.000842 wd 0.0500 time 0.2412 (0.2478) data time 0.0011 (0.0030) model time 0.2401 (0.2445) loss 3.1677 (3.3741) grad_norm 1.4656 (2.1573) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][390/1251] eta 0:03:33 lr 0.000842 wd 0.0500 time 0.2408 (0.2476) data time 0.0007 (0.0030) model time 0.2400 (0.2443) loss 3.7407 (3.3799) grad_norm 3.3541 (2.1518) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][400/1251] eta 0:03:30 lr 0.000842 wd 0.0500 time 0.2433 (0.2475) data time 0.0007 (0.0029) model time 0.2426 (0.2442) loss 2.3992 (3.3777) grad_norm 1.6696 (2.1460) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][410/1251] eta 0:03:27 lr 0.000842 wd 0.0500 time 0.2394 (0.2473) data time 0.0009 (0.0029) model time 0.2385 
(0.2441) loss 3.3204 (3.3703) grad_norm 1.6217 (2.1424) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][420/1251] eta 0:03:25 lr 0.000842 wd 0.0500 time 0.2420 (0.2476) data time 0.0011 (0.0028) model time 0.2409 (0.2445) loss 3.2288 (3.3673) grad_norm 1.9997 (2.1412) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][430/1251] eta 0:03:23 lr 0.000842 wd 0.0500 time 0.2397 (0.2475) data time 0.0008 (0.0028) model time 0.2390 (0.2444) loss 4.2110 (3.3732) grad_norm 1.9554 (2.1365) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][440/1251] eta 0:03:20 lr 0.000842 wd 0.0500 time 0.2421 (0.2474) data time 0.0007 (0.0027) model time 0.2413 (0.2444) loss 4.2366 (3.3764) grad_norm 1.7291 (2.1402) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][450/1251] eta 0:03:18 lr 0.000842 wd 0.0500 time 0.2375 (0.2472) data time 0.0009 (0.0027) model time 0.2365 (0.2442) loss 3.6313 (3.3786) grad_norm 1.9102 (2.1341) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][460/1251] eta 0:03:15 lr 0.000842 wd 0.0500 time 0.2357 (0.2471) data time 0.0009 (0.0027) model time 0.2347 (0.2441) loss 3.3904 (3.3760) grad_norm 2.9953 (2.1388) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][470/1251] eta 0:03:13 lr 0.000842 wd 0.0500 time 0.2370 (0.2474) data time 0.0010 (0.0026) model time 0.2360 (0.2445) loss 3.4841 (3.3770) grad_norm 1.7600 (2.1422) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][480/1251] eta 
0:03:10 lr 0.000841 wd 0.0500 time 0.2433 (0.2473) data time 0.0008 (0.0026) model time 0.2425 (0.2445) loss 4.2560 (3.3758) grad_norm 1.7692 (2.1393) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][490/1251] eta 0:03:08 lr 0.000841 wd 0.0500 time 0.2497 (0.2472) data time 0.0008 (0.0026) model time 0.2489 (0.2444) loss 3.5357 (3.3776) grad_norm 2.6362 (2.1408) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][500/1251] eta 0:03:05 lr 0.000841 wd 0.0500 time 0.2441 (0.2472) data time 0.0008 (0.0025) model time 0.2433 (0.2444) loss 3.8798 (3.3815) grad_norm 2.3176 (2.1486) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][510/1251] eta 0:03:03 lr 0.000841 wd 0.0500 time 0.2409 (0.2470) data time 0.0009 (0.0025) model time 0.2400 (0.2443) loss 2.9825 (3.3760) grad_norm 2.1333 (2.1460) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][520/1251] eta 0:03:00 lr 0.000841 wd 0.0500 time 0.2475 (0.2470) data time 0.0007 (0.0025) model time 0.2468 (0.2443) loss 3.3229 (3.3768) grad_norm 1.7676 (2.1440) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][530/1251] eta 0:02:57 lr 0.000841 wd 0.0500 time 0.2366 (0.2469) data time 0.0007 (0.0024) model time 0.2359 (0.2442) loss 4.4687 (3.3750) grad_norm 2.0608 (2.1463) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][540/1251] eta 0:02:55 lr 0.000841 wd 0.0500 time 0.2401 (0.2468) data time 0.0008 (0.0024) model time 0.2392 (0.2441) loss 2.6084 (3.3735) grad_norm 1.8409 (2.1455) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 10:19:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][550/1251] eta 0:02:52 lr 0.000841 wd 0.0500 time 0.2508 (0.2467) data time 0.0011 (0.0024) model time 0.2497 (0.2441) loss 3.3464 (3.3774) grad_norm 1.7965 (2.1477) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][560/1251] eta 0:02:50 lr 0.000841 wd 0.0500 time 0.2444 (0.2466) data time 0.0007 (0.0024) model time 0.2437 (0.2440) loss 2.8144 (3.3777) grad_norm 2.9628 (2.1474) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][570/1251] eta 0:02:47 lr 0.000841 wd 0.0500 time 0.2449 (0.2466) data time 0.0008 (0.0023) model time 0.2442 (0.2440) loss 2.8363 (3.3761) grad_norm 1.8549 (2.1434) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][580/1251] eta 0:02:45 lr 0.000841 wd 0.0500 time 0.2431 (0.2469) data time 0.0010 (0.0023) model time 0.2421 (0.2443) loss 3.4535 (3.3695) grad_norm 3.2110 (2.1462) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][590/1251] eta 0:02:43 lr 0.000841 wd 0.0500 time 0.2467 (0.2472) data time 0.0009 (0.0023) model time 0.2458 (0.2447) loss 3.5415 (3.3671) grad_norm 2.7910 (2.1498) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][600/1251] eta 0:02:40 lr 0.000841 wd 0.0500 time 0.2430 (0.2472) data time 0.0007 (0.0023) model time 0.2423 (0.2447) loss 4.4576 (3.3695) grad_norm 2.1641 (2.1483) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][610/1251] eta 0:02:38 lr 0.000841 wd 0.0500 time 0.2376 (0.2470) data time 0.0008 (0.0022) model time 0.2367 
(0.2446) loss 3.2301 (3.3683) grad_norm 1.9356 (2.1444) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][620/1251] eta 0:02:35 lr 0.000841 wd 0.0500 time 0.2508 (0.2470) data time 0.0007 (0.0022) model time 0.2501 (0.2446) loss 3.9959 (3.3702) grad_norm 1.4231 (2.1455) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][630/1251] eta 0:02:33 lr 0.000841 wd 0.0500 time 0.2815 (0.2470) data time 0.0011 (0.0022) model time 0.2804 (0.2446) loss 2.4210 (3.3693) grad_norm 2.1695 (2.1445) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][640/1251] eta 0:02:30 lr 0.000841 wd 0.0500 time 0.2492 (0.2470) data time 0.0007 (0.0022) model time 0.2485 (0.2446) loss 2.5206 (3.3685) grad_norm 1.9614 (2.1463) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][650/1251] eta 0:02:28 lr 0.000841 wd 0.0500 time 0.4077 (0.2474) data time 0.0011 (0.0022) model time 0.4066 (0.2451) loss 3.7860 (3.3748) grad_norm 1.7901 (2.1485) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][660/1251] eta 0:02:26 lr 0.000841 wd 0.0500 time 0.2484 (0.2473) data time 0.0008 (0.0022) model time 0.2476 (0.2450) loss 2.6233 (3.3739) grad_norm 1.8785 (2.1475) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][670/1251] eta 0:02:23 lr 0.000841 wd 0.0500 time 0.2396 (0.2473) data time 0.0010 (0.0021) model time 0.2386 (0.2450) loss 3.5705 (3.3752) grad_norm 2.1049 (2.1497) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][680/1251] eta 
0:02:21 lr 0.000841 wd 0.0500 time 0.2442 (0.2472) data time 0.0011 (0.0021) model time 0.2432 (0.2449) loss 3.8761 (3.3775) grad_norm 1.6080 (2.1493) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][690/1251] eta 0:02:18 lr 0.000841 wd 0.0500 time 0.2328 (0.2471) data time 0.0012 (0.0021) model time 0.2316 (0.2448) loss 3.6376 (3.3803) grad_norm 1.7323 (2.1468) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][700/1251] eta 0:02:16 lr 0.000841 wd 0.0500 time 0.2419 (0.2470) data time 0.0010 (0.0021) model time 0.2409 (0.2448) loss 3.3974 (3.3820) grad_norm 2.0713 (2.1426) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][710/1251] eta 0:02:13 lr 0.000841 wd 0.0500 time 0.2432 (0.2470) data time 0.0010 (0.0021) model time 0.2422 (0.2448) loss 2.7708 (3.3850) grad_norm 1.9410 (2.1504) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][720/1251] eta 0:02:11 lr 0.000841 wd 0.0500 time 0.2341 (0.2469) data time 0.0007 (0.0021) model time 0.2334 (0.2447) loss 2.6895 (3.3833) grad_norm 1.9902 (2.1551) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][730/1251] eta 0:02:08 lr 0.000841 wd 0.0500 time 0.2357 (0.2468) data time 0.0009 (0.0020) model time 0.2348 (0.2446) loss 3.3883 (3.3841) grad_norm 2.2407 (2.1584) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][740/1251] eta 0:02:06 lr 0.000841 wd 0.0500 time 0.2467 (0.2468) data time 0.0011 (0.0020) model time 0.2456 (0.2446) loss 2.7757 (3.3859) grad_norm 2.0611 (2.1606) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 10:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][750/1251] eta 0:02:03 lr 0.000841 wd 0.0500 time 0.2423 (0.2467) data time 0.0008 (0.0020) model time 0.2415 (0.2445) loss 3.5871 (3.3875) grad_norm 1.6501 (2.1574) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][760/1251] eta 0:02:01 lr 0.000841 wd 0.0500 time 0.2412 (0.2467) data time 0.0010 (0.0020) model time 0.2402 (0.2445) loss 3.4192 (3.3885) grad_norm 3.4597 (2.1546) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][770/1251] eta 0:01:58 lr 0.000841 wd 0.0500 time 0.2472 (0.2466) data time 0.0010 (0.0020) model time 0.2462 (0.2445) loss 4.0870 (3.3879) grad_norm 1.3411 (2.1514) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][780/1251] eta 0:01:56 lr 0.000841 wd 0.0500 time 0.2449 (0.2466) data time 0.0008 (0.0020) model time 0.2441 (0.2444) loss 3.6157 (3.3852) grad_norm 1.5841 (2.1469) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][790/1251] eta 0:01:53 lr 0.000840 wd 0.0500 time 0.2382 (0.2465) data time 0.0010 (0.0020) model time 0.2373 (0.2444) loss 2.7664 (3.3847) grad_norm 1.5711 (2.1467) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][800/1251] eta 0:01:51 lr 0.000840 wd 0.0500 time 0.2404 (0.2465) data time 0.0007 (0.0020) model time 0.2397 (0.2444) loss 3.3252 (3.3819) grad_norm 1.9555 (2.1436) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][810/1251] eta 0:01:48 lr 0.000840 wd 0.0500 time 0.2408 (0.2464) data time 0.0007 (0.0019) model time 0.2402 
(0.2443) loss 3.6621 (3.3807) grad_norm 2.2334 (2.1456) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][820/1251] eta 0:01:46 lr 0.000840 wd 0.0500 time 0.2454 (0.2466) data time 0.0009 (0.0019) model time 0.2445 (0.2445) loss 3.4630 (3.3805) grad_norm 1.7151 (2.1422) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][830/1251] eta 0:01:43 lr 0.000840 wd 0.0500 time 0.2516 (0.2466) data time 0.0008 (0.0019) model time 0.2508 (0.2445) loss 3.2586 (3.3819) grad_norm 2.0380 (2.1413) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][840/1251] eta 0:01:41 lr 0.000840 wd 0.0500 time 0.2477 (0.2465) data time 0.0008 (0.0019) model time 0.2469 (0.2444) loss 2.7146 (3.3804) grad_norm 1.9892 (2.1399) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][850/1251] eta 0:01:38 lr 0.000840 wd 0.0500 time 0.2390 (0.2467) data time 0.0012 (0.0019) model time 0.2379 (0.2447) loss 3.7725 (3.3794) grad_norm 2.2130 (2.1378) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][860/1251] eta 0:01:36 lr 0.000840 wd 0.0500 time 0.2444 (0.2467) data time 0.0007 (0.0019) model time 0.2436 (0.2446) loss 3.6181 (3.3768) grad_norm 1.4345 (2.1347) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][870/1251] eta 0:01:33 lr 0.000840 wd 0.0500 time 0.2430 (0.2466) data time 0.0011 (0.0019) model time 0.2420 (0.2446) loss 3.5626 (3.3772) grad_norm 2.7626 (2.1379) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][880/1251] eta 
0:01:31 lr 0.000840 wd 0.0500 time 0.2436 (0.2466) data time 0.0009 (0.0019) model time 0.2427 (0.2446) loss 4.1939 (3.3805) grad_norm 2.3412 (2.1367) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][890/1251] eta 0:01:28 lr 0.000840 wd 0.0500 time 0.2343 (0.2465) data time 0.0009 (0.0019) model time 0.2334 (0.2445) loss 2.4712 (3.3804) grad_norm 1.7196 (2.1374) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][900/1251] eta 0:01:26 lr 0.000840 wd 0.0500 time 0.2587 (0.2465) data time 0.0009 (0.0018) model time 0.2577 (0.2445) loss 3.9047 (3.3829) grad_norm 2.0712 (2.1361) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][910/1251] eta 0:01:24 lr 0.000840 wd 0.0500 time 0.2418 (0.2464) data time 0.0009 (0.0018) model time 0.2409 (0.2445) loss 2.8733 (3.3838) grad_norm 2.6864 (2.1397) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][920/1251] eta 0:01:21 lr 0.000840 wd 0.0500 time 0.2387 (0.2464) data time 0.0010 (0.0018) model time 0.2377 (0.2444) loss 3.8122 (3.3848) grad_norm 1.4436 (2.1372) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][930/1251] eta 0:01:19 lr 0.000840 wd 0.0500 time 0.2360 (0.2463) data time 0.0007 (0.0018) model time 0.2353 (0.2444) loss 3.7205 (3.3891) grad_norm 1.8564 (2.1345) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][940/1251] eta 0:01:16 lr 0.000840 wd 0.0500 time 0.2486 (0.2463) data time 0.0011 (0.0018) model time 0.2475 (0.2444) loss 3.3939 (3.3927) grad_norm 2.0341 (2.1342) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 10:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][950/1251] eta 0:01:14 lr 0.000840 wd 0.0500 time 0.2429 (0.2465) data time 0.0010 (0.0018) model time 0.2419 (0.2445) loss 3.9449 (3.3982) grad_norm 1.6986 (2.1323) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][960/1251] eta 0:01:11 lr 0.000840 wd 0.0500 time 0.2329 (0.2464) data time 0.0007 (0.0018) model time 0.2322 (0.2445) loss 4.1120 (3.3965) grad_norm 1.9627 (2.1291) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][970/1251] eta 0:01:09 lr 0.000840 wd 0.0500 time 0.2378 (0.2463) data time 0.0008 (0.0018) model time 0.2369 (0.2444) loss 3.1772 (3.3974) grad_norm 2.5643 (2.1302) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][980/1251] eta 0:01:06 lr 0.000840 wd 0.0500 time 0.2396 (0.2465) data time 0.0010 (0.0018) model time 0.2386 (0.2446) loss 3.1921 (3.3961) grad_norm 2.3509 (2.1303) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][990/1251] eta 0:01:04 lr 0.000840 wd 0.0500 time 0.2402 (0.2466) data time 0.0007 (0.0018) model time 0.2395 (0.2447) loss 3.2357 (3.3969) grad_norm 1.7176 (2.1324) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1000/1251] eta 0:01:01 lr 0.000840 wd 0.0500 time 0.2391 (0.2465) data time 0.0007 (0.0018) model time 0.2384 (0.2447) loss 4.0362 (3.3993) grad_norm 2.0290 (2.1316) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1010/1251] eta 0:00:59 lr 0.000840 wd 0.0500 time 0.2459 (0.2465) data time 0.0009 (0.0018) model time 0.2449 
(0.2446) loss 2.1113 (3.4016) grad_norm 1.8037 (2.1301) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1020/1251] eta 0:00:56 lr 0.000840 wd 0.0500 time 0.2404 (0.2464) data time 0.0011 (0.0017) model time 0.2392 (0.2446) loss 3.2067 (3.4001) grad_norm 1.9730 (2.1264) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1030/1251] eta 0:00:54 lr 0.000840 wd 0.0500 time 0.2432 (0.2464) data time 0.0008 (0.0017) model time 0.2424 (0.2445) loss 4.0240 (3.4008) grad_norm 2.8715 (2.1250) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1040/1251] eta 0:00:51 lr 0.000840 wd 0.0500 time 0.2402 (0.2463) data time 0.0009 (0.0017) model time 0.2394 (0.2445) loss 2.9897 (3.4000) grad_norm 1.7465 (2.1232) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1050/1251] eta 0:00:49 lr 0.000840 wd 0.0500 time 0.2468 (0.2463) data time 0.0008 (0.0017) model time 0.2460 (0.2444) loss 2.7375 (3.4002) grad_norm 2.0949 (2.1246) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1060/1251] eta 0:00:47 lr 0.000840 wd 0.0500 time 0.2389 (0.2462) data time 0.0008 (0.0017) model time 0.2381 (0.2444) loss 3.0978 (3.3999) grad_norm 1.6388 (2.1249) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1070/1251] eta 0:00:44 lr 0.000840 wd 0.0500 time 0.2386 (0.2462) data time 0.0010 (0.0017) model time 0.2376 (0.2443) loss 3.6473 (3.3990) grad_norm 1.6599 (2.1248) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[93/300][1080/1251] eta 0:00:42 lr 0.000840 wd 0.0500 time 0.2289 (0.2461) data time 0.0011 (0.0017) model time 0.2278 (0.2443) loss 3.9000 (3.4011) grad_norm 1.8012 (2.1237) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1090/1251] eta 0:00:39 lr 0.000839 wd 0.0500 time 0.2390 (0.2461) data time 0.0011 (0.0017) model time 0.2379 (0.2442) loss 3.8277 (3.4032) grad_norm 1.4331 (2.1208) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1100/1251] eta 0:00:37 lr 0.000839 wd 0.0500 time 0.2446 (0.2460) data time 0.0008 (0.0017) model time 0.2437 (0.2442) loss 2.6100 (3.4047) grad_norm 1.5201 (2.1182) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1110/1251] eta 0:00:34 lr 0.000839 wd 0.0500 time 0.2370 (0.2460) data time 0.0010 (0.0017) model time 0.2360 (0.2442) loss 3.6923 (3.4050) grad_norm 1.4516 (2.1160) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1120/1251] eta 0:00:32 lr 0.000839 wd 0.0500 time 0.2408 (0.2461) data time 0.0009 (0.0017) model time 0.2399 (0.2443) loss 3.2340 (3.4053) grad_norm 2.0268 (2.1149) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1130/1251] eta 0:00:29 lr 0.000839 wd 0.0500 time 0.2351 (0.2463) data time 0.0010 (0.0017) model time 0.2341 (0.2445) loss 3.4942 (3.4055) grad_norm 1.5612 (2.1177) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1140/1251] eta 0:00:27 lr 0.000839 wd 0.0500 time 0.2363 (0.2463) data time 0.0012 (0.0017) model time 0.2352 (0.2445) loss 3.5976 (3.4058) grad_norm 2.2393 (2.1165) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 10:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1150/1251] eta 0:00:24 lr 0.000839 wd 0.0500 time 0.2440 (0.2462) data time 0.0008 (0.0017) model time 0.2432 (0.2445) loss 2.7879 (3.4041) grad_norm 1.9205 (2.1182) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1160/1251] eta 0:00:22 lr 0.000839 wd 0.0500 time 0.2386 (0.2462) data time 0.0007 (0.0017) model time 0.2379 (0.2444) loss 4.5228 (3.4038) grad_norm 1.9826 (2.1234) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1170/1251] eta 0:00:19 lr 0.000839 wd 0.0500 time 0.2334 (0.2462) data time 0.0009 (0.0017) model time 0.2325 (0.2444) loss 3.3804 (3.4043) grad_norm 1.4913 (2.1231) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1180/1251] eta 0:00:17 lr 0.000839 wd 0.0500 time 0.2432 (0.2465) data time 0.0007 (0.0016) model time 0.2424 (0.2447) loss 4.0392 (3.4017) grad_norm 1.5158 (2.1209) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1190/1251] eta 0:00:15 lr 0.000839 wd 0.0500 time 0.2525 (0.2464) data time 0.0009 (0.0016) model time 0.2516 (0.2447) loss 3.6013 (3.4009) grad_norm 2.4292 (2.1229) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1200/1251] eta 0:00:12 lr 0.000839 wd 0.0500 time 0.2405 (0.2464) data time 0.0011 (0.0016) model time 0.2394 (0.2446) loss 3.3882 (3.4012) grad_norm 1.6496 (2.1205) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1210/1251] eta 0:00:10 lr 0.000839 wd 0.0500 time 0.2547 (0.2463) data time 0.0009 
(0.0016) model time 0.2537 (0.2446) loss 2.2261 (3.3985) grad_norm 1.5845 (2.1179) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1220/1251] eta 0:00:07 lr 0.000839 wd 0.0500 time 0.2340 (0.2463) data time 0.0009 (0.0016) model time 0.2331 (0.2445) loss 2.4114 (3.3978) grad_norm 2.5672 (2.1169) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1230/1251] eta 0:00:05 lr 0.000839 wd 0.0500 time 0.2404 (0.2464) data time 0.0012 (0.0016) model time 0.2392 (0.2447) loss 2.3092 (3.3959) grad_norm 2.8653 (2.1166) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1240/1251] eta 0:00:02 lr 0.000839 wd 0.0500 time 0.2240 (0.2463) data time 0.0006 (0.0016) model time 0.2234 (0.2446) loss 4.1495 (3.3970) grad_norm 1.7378 (2.1143) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [93/300][1250/1251] eta 0:00:00 lr 0.000839 wd 0.0500 time 0.2267 (0.2461) data time 0.0005 (0.0016) model time 0.2262 (0.2444) loss 2.6931 (3.3976) grad_norm 2.2321 (2.1138) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:21:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 93 training takes 0:05:07 [2024-08-26 10:21:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:21:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 10:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.481 (0.481) Loss 0.5298 (0.5298) Acc@1 90.625 (90.625) Acc@5 97.852 (97.852) Mem 7379MB [2024-08-26 10:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.112) Loss 0.7993 (0.8062) Acc@1 83.887 (82.138) Acc@5 96.094 (96.236) Mem 7379MB [2024-08-26 10:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.097) Loss 1.1504 (0.8238) Acc@1 71.973 (81.273) Acc@5 92.773 (96.224) Mem 7379MB [2024-08-26 10:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.090) Loss 1.3594 (0.9421) Acc@1 67.383 (78.632) Acc@5 89.258 (94.708) Mem 7379MB [2024-08-26 10:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.2939 (1.0005) Acc@1 69.434 (77.172) Acc@5 90.137 (93.919) Mem 7379MB [2024-08-26 10:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.714 Acc@5 93.822 [2024-08-26 10:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 76.7% [2024-08-26 10:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.776 (0.776) Loss 0.4421 (0.4421) Acc@1 92.578 (92.578) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:21:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.145) Loss 0.7139 (0.6961) Acc@1 85.645 (85.032) Acc@5 96.191 (97.008) Mem 7379MB [2024-08-26 10:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.114) Loss 0.9951 (0.7198) Acc@1 76.367 (83.998) Acc@5 94.043 (96.963) Mem 7379MB [2024-08-26 10:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.102) Loss 1.2812 (0.8218) Acc@1 68.457 (81.578) Acc@5 90.820 (95.776) Mem 7379MB [2024-08-26 10:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.1445 
(0.8738) Acc@1 71.094 (80.073) Acc@5 92.773 (95.272) Mem 7379MB [2024-08-26 10:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.678 Acc@5 95.234 [2024-08-26 10:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.7% [2024-08-26 10:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.68% [2024-08-26 10:22:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:22:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 10:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][0/1251] eta 0:13:17 lr 0.000839 wd 0.0500 time 0.6378 (0.6378) data time 0.4113 (0.4113) model time 0.0000 (0.0000) loss 3.7265 (3.7265) grad_norm 1.7559 (1.7559) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][10/1251] eta 0:05:44 lr 0.000839 wd 0.0500 time 0.2698 (0.2779) data time 0.0007 (0.0384) model time 0.0000 (0.0000) loss 2.7694 (3.3609) grad_norm 2.6032 (2.0487) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][20/1251] eta 0:05:20 lr 0.000839 wd 0.0500 time 0.2380 (0.2604) data time 0.0007 (0.0205) model time 0.0000 (0.0000) loss 3.7316 (3.3784) grad_norm 4.9345 (2.3394) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][30/1251] eta 0:05:10 lr 0.000839 wd 0.0500 time 0.2488 (0.2539) data time 0.0009 (0.0142) model time 0.0000 (0.0000) loss 3.4674 (3.2976) grad_norm 2.6661 (2.3915) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][40/1251] eta 0:05:03 
lr 0.000839 wd 0.0500 time 0.2444 (0.2504) data time 0.0007 (0.0110) model time 0.0000 (0.0000) loss 2.8452 (3.3147) grad_norm 2.1649 (2.2958) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][50/1251] eta 0:05:03 lr 0.000839 wd 0.0500 time 0.2368 (0.2530) data time 0.0007 (0.0091) model time 0.0000 (0.0000) loss 4.2517 (3.3421) grad_norm 2.3722 (2.2839) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][60/1251] eta 0:04:59 lr 0.000839 wd 0.0500 time 0.2382 (0.2515) data time 0.0007 (0.0077) model time 0.2375 (0.2428) loss 2.7114 (3.3159) grad_norm 1.7610 (2.2173) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][70/1251] eta 0:04:55 lr 0.000839 wd 0.0500 time 0.2414 (0.2500) data time 0.0011 (0.0068) model time 0.2402 (0.2415) loss 3.4918 (3.3738) grad_norm 2.2757 (2.2560) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][80/1251] eta 0:04:51 lr 0.000839 wd 0.0500 time 0.2387 (0.2490) data time 0.0007 (0.0061) model time 0.2380 (0.2414) loss 4.2243 (3.3843) grad_norm 2.3676 (2.2079) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][90/1251] eta 0:04:48 lr 0.000839 wd 0.0500 time 0.2403 (0.2483) data time 0.0007 (0.0055) model time 0.2396 (0.2413) loss 2.5750 (3.3955) grad_norm 5.0458 (2.2135) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][100/1251] eta 0:04:45 lr 0.000839 wd 0.0500 time 0.2425 (0.2477) data time 0.0008 (0.0051) model time 0.2417 (0.2414) loss 3.6150 (3.3666) grad_norm 2.4939 (2.2340) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:30 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][110/1251] eta 0:04:42 lr 0.000839 wd 0.0500 time 0.2514 (0.2473) data time 0.0008 (0.0047) model time 0.2506 (0.2415) loss 3.1977 (3.3503) grad_norm 3.5717 (2.2806) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][120/1251] eta 0:04:39 lr 0.000839 wd 0.0500 time 0.2426 (0.2469) data time 0.0008 (0.0044) model time 0.2419 (0.2414) loss 3.6881 (3.3470) grad_norm 1.5528 (2.2622) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][130/1251] eta 0:04:36 lr 0.000839 wd 0.0500 time 0.2337 (0.2465) data time 0.0011 (0.0041) model time 0.2325 (0.2414) loss 3.1136 (3.3295) grad_norm 1.7831 (2.2384) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][140/1251] eta 0:04:33 lr 0.000839 wd 0.0500 time 0.2407 (0.2462) data time 0.0011 (0.0039) model time 0.2397 (0.2413) loss 3.5709 (3.3199) grad_norm 2.5159 (2.2158) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][150/1251] eta 0:04:33 lr 0.000838 wd 0.0500 time 0.2326 (0.2486) data time 0.0010 (0.0037) model time 0.2316 (0.2453) loss 3.1467 (3.3238) grad_norm 2.2605 (2.2033) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][160/1251] eta 0:04:30 lr 0.000838 wd 0.0500 time 0.2460 (0.2481) data time 0.0009 (0.0036) model time 0.2450 (0.2449) loss 2.0699 (3.3186) grad_norm 1.3548 (2.2023) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][170/1251] eta 0:04:27 lr 0.000838 wd 0.0500 time 0.2333 (0.2477) data time 0.0007 (0.0034) model time 0.2326 (0.2444) loss 3.5981 
(3.3407) grad_norm 2.6516 (2.2149) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][180/1251] eta 0:04:26 lr 0.000838 wd 0.0500 time 0.2404 (0.2486) data time 0.0010 (0.0033) model time 0.2394 (0.2459) loss 3.3086 (3.3509) grad_norm 2.0658 (2.2012) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][190/1251] eta 0:04:23 lr 0.000838 wd 0.0500 time 0.2462 (0.2482) data time 0.0010 (0.0032) model time 0.2452 (0.2455) loss 3.5507 (3.3686) grad_norm 1.6528 (2.2143) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][200/1251] eta 0:04:20 lr 0.000838 wd 0.0500 time 0.2493 (0.2479) data time 0.0008 (0.0031) model time 0.2485 (0.2451) loss 3.6522 (3.3707) grad_norm 2.4189 (2.1971) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][210/1251] eta 0:04:17 lr 0.000838 wd 0.0500 time 0.2379 (0.2476) data time 0.0010 (0.0030) model time 0.2369 (0.2449) loss 3.8243 (3.3783) grad_norm 2.4405 (2.1790) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][220/1251] eta 0:04:14 lr 0.000838 wd 0.0500 time 0.2373 (0.2473) data time 0.0009 (0.0029) model time 0.2363 (0.2446) loss 2.5555 (3.3923) grad_norm 2.1723 (2.1679) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][230/1251] eta 0:04:13 lr 0.000838 wd 0.0500 time 0.2442 (0.2479) data time 0.0010 (0.0028) model time 0.2431 (0.2454) loss 3.4648 (3.3988) grad_norm 2.2317 (2.1750) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][240/1251] eta 0:04:10 lr 0.000838 wd 
0.0500 time 0.2365 (0.2476) data time 0.0008 (0.0027) model time 0.2357 (0.2451) loss 3.1035 (3.3977) grad_norm 1.5304 (2.1687) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][250/1251] eta 0:04:07 lr 0.000838 wd 0.0500 time 0.2383 (0.2472) data time 0.0011 (0.0027) model time 0.2372 (0.2448) loss 3.5704 (3.3950) grad_norm 2.0471 (2.1596) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][260/1251] eta 0:04:05 lr 0.000838 wd 0.0500 time 0.2411 (0.2475) data time 0.0007 (0.0026) model time 0.2403 (0.2452) loss 3.7136 (3.4028) grad_norm 1.8751 (2.1551) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][270/1251] eta 0:04:02 lr 0.000838 wd 0.0500 time 0.2450 (0.2473) data time 0.0010 (0.0025) model time 0.2440 (0.2450) loss 2.6311 (3.4005) grad_norm 1.7498 (2.1585) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][280/1251] eta 0:03:59 lr 0.000838 wd 0.0500 time 0.2435 (0.2471) data time 0.0009 (0.0025) model time 0.2425 (0.2447) loss 3.3246 (3.4011) grad_norm 2.0751 (2.1648) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][290/1251] eta 0:03:57 lr 0.000838 wd 0.0500 time 0.2396 (0.2468) data time 0.0011 (0.0024) model time 0.2385 (0.2445) loss 3.0008 (3.3956) grad_norm 1.3514 (2.1537) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][300/1251] eta 0:03:54 lr 0.000838 wd 0.0500 time 0.2360 (0.2466) data time 0.0013 (0.0024) model time 0.2347 (0.2442) loss 3.6307 (3.3954) grad_norm 1.8464 (2.1590) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][310/1251] eta 0:03:51 lr 0.000838 wd 0.0500 time 0.2393 (0.2463) data time 0.0011 (0.0023) model time 0.2382 (0.2440) loss 3.2653 (3.3987) grad_norm 2.1642 (2.1511) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][320/1251] eta 0:03:49 lr 0.000838 wd 0.0500 time 0.2364 (0.2462) data time 0.0011 (0.0023) model time 0.2353 (0.2440) loss 3.9807 (3.3962) grad_norm 1.8325 (2.1414) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][330/1251] eta 0:03:46 lr 0.000838 wd 0.0500 time 0.2421 (0.2461) data time 0.0015 (0.0023) model time 0.2407 (0.2439) loss 2.8496 (3.3910) grad_norm 2.0538 (2.1368) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][340/1251] eta 0:03:44 lr 0.000838 wd 0.0500 time 0.2463 (0.2459) data time 0.0009 (0.0022) model time 0.2454 (0.2437) loss 2.6631 (3.3887) grad_norm 1.8827 (2.1448) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][350/1251] eta 0:03:41 lr 0.000838 wd 0.0500 time 0.2418 (0.2458) data time 0.0010 (0.0022) model time 0.2408 (0.2436) loss 3.5677 (3.3852) grad_norm 2.0812 (2.1633) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][360/1251] eta 0:03:38 lr 0.000838 wd 0.0500 time 0.2403 (0.2457) data time 0.0011 (0.0022) model time 0.2392 (0.2435) loss 2.8718 (3.3894) grad_norm 2.0103 (2.1698) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][370/1251] eta 0:03:36 lr 0.000838 wd 0.0500 time 0.2452 (0.2456) data time 0.0012 (0.0021) model time 0.2440 (0.2434) loss 3.9068 
(3.3930) grad_norm 2.1838 (2.1632) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][380/1251] eta 0:03:33 lr 0.000838 wd 0.0500 time 0.2347 (0.2455) data time 0.0011 (0.0021) model time 0.2336 (0.2433) loss 3.8073 (3.3984) grad_norm 1.5531 (2.1570) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][390/1251] eta 0:03:31 lr 0.000838 wd 0.0500 time 0.2338 (0.2454) data time 0.0009 (0.0021) model time 0.2329 (0.2432) loss 4.1764 (3.4043) grad_norm 1.7972 (2.1509) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][400/1251] eta 0:03:28 lr 0.000838 wd 0.0500 time 0.2437 (0.2453) data time 0.0011 (0.0020) model time 0.2427 (0.2431) loss 3.6238 (3.4082) grad_norm 3.0012 (2.1610) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][410/1251] eta 0:03:26 lr 0.000838 wd 0.0500 time 0.2489 (0.2452) data time 0.0011 (0.0020) model time 0.2479 (0.2431) loss 3.0557 (3.4043) grad_norm 1.7995 (2.1595) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][420/1251] eta 0:03:23 lr 0.000838 wd 0.0500 time 0.2490 (0.2451) data time 0.0009 (0.0020) model time 0.2481 (0.2430) loss 2.5785 (3.4070) grad_norm 3.2802 (2.1600) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][430/1251] eta 0:03:21 lr 0.000838 wd 0.0500 time 0.2481 (0.2451) data time 0.0010 (0.0020) model time 0.2472 (0.2430) loss 3.6640 (3.4071) grad_norm 2.5482 (2.1601) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][440/1251] eta 0:03:18 lr 0.000838 wd 
0.0500 time 0.2370 (0.2449) data time 0.0011 (0.0020) model time 0.2358 (0.2429) loss 3.5045 (3.4097) grad_norm 3.6638 (2.1646) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][450/1251] eta 0:03:16 lr 0.000837 wd 0.0500 time 0.2443 (0.2453) data time 0.0010 (0.0019) model time 0.2434 (0.2433) loss 2.5692 (3.4094) grad_norm 1.9879 (2.1660) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][460/1251] eta 0:03:13 lr 0.000837 wd 0.0500 time 0.2407 (0.2452) data time 0.0010 (0.0019) model time 0.2398 (0.2432) loss 4.1097 (3.4066) grad_norm 1.7670 (2.1634) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][470/1251] eta 0:03:11 lr 0.000837 wd 0.0500 time 0.2412 (0.2451) data time 0.0008 (0.0019) model time 0.2404 (0.2431) loss 3.9511 (3.4053) grad_norm 2.0398 (2.1600) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][480/1251] eta 0:03:09 lr 0.000837 wd 0.0500 time 0.2386 (0.2454) data time 0.0007 (0.0019) model time 0.2378 (0.2435) loss 4.3612 (3.4083) grad_norm 2.5243 (2.1565) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][490/1251] eta 0:03:07 lr 0.000837 wd 0.0500 time 0.3664 (0.2460) data time 0.0010 (0.0019) model time 0.3655 (0.2442) loss 3.6669 (3.4029) grad_norm 1.6678 (2.1550) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][500/1251] eta 0:03:04 lr 0.000837 wd 0.0500 time 0.2417 (0.2459) data time 0.0007 (0.0018) model time 0.2410 (0.2441) loss 2.9596 (3.4020) grad_norm 1.7928 (2.1558) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:24:08 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][510/1251] eta 0:03:02 lr 0.000837 wd 0.0500 time 0.2358 (0.2458) data time 0.0007 (0.0018) model time 0.2351 (0.2440) loss 3.3368 (3.3997) grad_norm 2.4369 (2.1593) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][520/1251] eta 0:02:59 lr 0.000837 wd 0.0500 time 0.2385 (0.2457) data time 0.0010 (0.0018) model time 0.2376 (0.2439) loss 3.7817 (3.4074) grad_norm 2.3818 (2.1565) loss_scale 4096.0000 (2055.8618) mem 7379MB [2024-08-26 10:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][530/1251] eta 0:02:57 lr 0.000837 wd 0.0500 time 0.2414 (0.2456) data time 0.0010 (0.0018) model time 0.2403 (0.2438) loss 3.0551 (3.4085) grad_norm 2.4484 (2.1569) loss_scale 4096.0000 (2094.2825) mem 7379MB [2024-08-26 10:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][540/1251] eta 0:02:54 lr 0.000837 wd 0.0500 time 0.2375 (0.2455) data time 0.0009 (0.0018) model time 0.2367 (0.2437) loss 3.6407 (3.4108) grad_norm 1.8337 (2.1673) loss_scale 4096.0000 (2131.2828) mem 7379MB [2024-08-26 10:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][550/1251] eta 0:02:52 lr 0.000837 wd 0.0500 time 0.2449 (0.2454) data time 0.0009 (0.0018) model time 0.2440 (0.2436) loss 3.8340 (3.4157) grad_norm 1.7861 (2.1715) loss_scale 4096.0000 (2166.9401) mem 7379MB [2024-08-26 10:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][560/1251] eta 0:02:49 lr 0.000837 wd 0.0500 time 0.2434 (0.2454) data time 0.0011 (0.0017) model time 0.2423 (0.2436) loss 3.5289 (3.4161) grad_norm 2.3395 (2.1642) loss_scale 4096.0000 (2201.3262) mem 7379MB [2024-08-26 10:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][570/1251] eta 0:02:47 lr 0.000837 wd 0.0500 time 0.2470 (0.2455) data time 0.0010 (0.0017) model time 0.2461 (0.2438) loss 3.5138 
(3.4144) grad_norm 2.1712 (2.1627) loss_scale 4096.0000 (2234.5079) mem 7379MB [2024-08-26 10:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][580/1251] eta 0:02:44 lr 0.000837 wd 0.0500 time 0.2380 (0.2454) data time 0.0007 (0.0017) model time 0.2373 (0.2437) loss 3.9316 (3.4136) grad_norm 1.7440 (2.1583) loss_scale 4096.0000 (2266.5473) mem 7379MB [2024-08-26 10:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][590/1251] eta 0:02:42 lr 0.000837 wd 0.0500 time 0.2358 (0.2453) data time 0.0009 (0.0017) model time 0.2349 (0.2436) loss 3.6894 (3.4153) grad_norm 1.6165 (2.1546) loss_scale 4096.0000 (2297.5025) mem 7379MB [2024-08-26 10:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][600/1251] eta 0:02:39 lr 0.000837 wd 0.0500 time 0.2439 (0.2452) data time 0.0011 (0.0017) model time 0.2428 (0.2435) loss 3.8320 (3.4175) grad_norm 1.7176 (2.1546) loss_scale 4096.0000 (2327.4276) mem 7379MB [2024-08-26 10:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][610/1251] eta 0:02:37 lr 0.000837 wd 0.0500 time 0.2425 (0.2452) data time 0.0007 (0.0017) model time 0.2418 (0.2435) loss 3.6207 (3.4176) grad_norm 2.5390 (2.1519) loss_scale 4096.0000 (2356.3732) mem 7379MB [2024-08-26 10:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][620/1251] eta 0:02:34 lr 0.000837 wd 0.0500 time 0.2371 (0.2451) data time 0.0008 (0.0017) model time 0.2363 (0.2434) loss 4.6964 (3.4199) grad_norm 1.7688 (2.1507) loss_scale 4096.0000 (2384.3865) mem 7379MB [2024-08-26 10:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][630/1251] eta 0:02:32 lr 0.000837 wd 0.0500 time 0.2405 (0.2451) data time 0.0010 (0.0017) model time 0.2395 (0.2433) loss 3.2951 (3.4094) grad_norm 2.0798 (2.1541) loss_scale 4096.0000 (2411.5119) mem 7379MB [2024-08-26 10:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][640/1251] eta 0:02:29 lr 0.000837 wd 
0.0500 time 0.2368 (0.2450) data time 0.0011 (0.0017) model time 0.2357 (0.2432) loss 3.7383 (3.4082) grad_norm 1.9715 (2.1533) loss_scale 4096.0000 (2437.7910) mem 7379MB [2024-08-26 10:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][650/1251] eta 0:02:27 lr 0.000837 wd 0.0500 time 0.2385 (0.2449) data time 0.0009 (0.0016) model time 0.2375 (0.2432) loss 4.1109 (3.4087) grad_norm 2.9504 (2.1542) loss_scale 4096.0000 (2463.2627) mem 7379MB [2024-08-26 10:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][660/1251] eta 0:02:24 lr 0.000837 wd 0.0500 time 0.2406 (0.2449) data time 0.0008 (0.0016) model time 0.2398 (0.2432) loss 3.9266 (3.4093) grad_norm 2.1137 (2.1551) loss_scale 4096.0000 (2487.9637) mem 7379MB [2024-08-26 10:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][670/1251] eta 0:02:22 lr 0.000837 wd 0.0500 time 0.2415 (0.2448) data time 0.0012 (0.0016) model time 0.2403 (0.2431) loss 3.7795 (3.4092) grad_norm 1.8242 (2.1516) loss_scale 4096.0000 (2511.9285) mem 7379MB [2024-08-26 10:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][680/1251] eta 0:02:19 lr 0.000837 wd 0.0500 time 0.2375 (0.2447) data time 0.0011 (0.0016) model time 0.2364 (0.2430) loss 2.5719 (3.4087) grad_norm 1.6725 (2.1489) loss_scale 4096.0000 (2535.1894) mem 7379MB [2024-08-26 10:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][690/1251] eta 0:02:17 lr 0.000837 wd 0.0500 time 0.2495 (0.2447) data time 0.0008 (0.0016) model time 0.2487 (0.2430) loss 3.9139 (3.4089) grad_norm 1.7392 (2.1459) loss_scale 4096.0000 (2557.7771) mem 7379MB [2024-08-26 10:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][700/1251] eta 0:02:14 lr 0.000837 wd 0.0500 time 0.4586 (0.2450) data time 0.0007 (0.0016) model time 0.4579 (0.2434) loss 4.1477 (3.4168) grad_norm 1.5600 (2.1455) loss_scale 4096.0000 (2579.7204) mem 7379MB [2024-08-26 10:24:56 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][710/1251] eta 0:02:12 lr 0.000837 wd 0.0500 time 0.2455 (0.2450) data time 0.0010 (0.0016) model time 0.2445 (0.2433) loss 3.3155 (3.4163) grad_norm 2.0956 (2.1491) loss_scale 4096.0000 (2601.0464) mem 7379MB [2024-08-26 10:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][720/1251] eta 0:02:10 lr 0.000837 wd 0.0500 time 0.2403 (0.2452) data time 0.0008 (0.0016) model time 0.2394 (0.2436) loss 4.0227 (3.4200) grad_norm 1.9907 (2.1535) loss_scale 4096.0000 (2621.7809) mem 7379MB [2024-08-26 10:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][730/1251] eta 0:02:07 lr 0.000837 wd 0.0500 time 0.2490 (0.2455) data time 0.0008 (0.0016) model time 0.2482 (0.2439) loss 3.3352 (3.4202) grad_norm 1.4830 (2.1496) loss_scale 4096.0000 (2641.9480) mem 7379MB [2024-08-26 10:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][740/1251] eta 0:02:05 lr 0.000837 wd 0.0500 time 0.2489 (0.2454) data time 0.0010 (0.0016) model time 0.2479 (0.2438) loss 2.7140 (3.4179) grad_norm 1.7563 (2.1516) loss_scale 4096.0000 (2661.5709) mem 7379MB [2024-08-26 10:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][750/1251] eta 0:02:03 lr 0.000836 wd 0.0500 time 0.2371 (0.2455) data time 0.0010 (0.0016) model time 0.2361 (0.2440) loss 3.4129 (3.4208) grad_norm 2.6153 (2.1524) loss_scale 4096.0000 (2680.6711) mem 7379MB [2024-08-26 10:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][760/1251] eta 0:02:00 lr 0.000836 wd 0.0500 time 0.2471 (0.2455) data time 0.0007 (0.0016) model time 0.2463 (0.2439) loss 4.3432 (3.4226) grad_norm 1.6238 (2.1497) loss_scale 4096.0000 (2699.2694) mem 7379MB [2024-08-26 10:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][770/1251] eta 0:01:58 lr 0.000836 wd 0.0500 time 0.2405 (0.2454) data time 0.0008 (0.0015) model time 0.2398 (0.2439) loss 3.7566 
(3.4213) grad_norm 1.8231 (2.1475) loss_scale 4096.0000 (2717.3852) mem 7379MB [2024-08-26 10:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][780/1251] eta 0:01:55 lr 0.000836 wd 0.0500 time 0.2394 (0.2456) data time 0.0009 (0.0015) model time 0.2385 (0.2441) loss 3.1766 (3.4203) grad_norm 2.1858 (2.1468) loss_scale 4096.0000 (2735.0371) mem 7379MB [2024-08-26 10:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][790/1251] eta 0:01:53 lr 0.000836 wd 0.0500 time 0.2429 (0.2456) data time 0.0009 (0.0015) model time 0.2419 (0.2441) loss 4.0573 (3.4208) grad_norm 1.8195 (2.1486) loss_scale 4096.0000 (2752.2427) mem 7379MB [2024-08-26 10:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][800/1251] eta 0:01:50 lr 0.000836 wd 0.0500 time 0.2415 (0.2455) data time 0.0008 (0.0015) model time 0.2407 (0.2440) loss 4.1339 (3.4210) grad_norm 2.3234 (2.1557) loss_scale 4096.0000 (2769.0187) mem 7379MB [2024-08-26 10:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][810/1251] eta 0:01:48 lr 0.000836 wd 0.0500 time 0.2393 (0.2454) data time 0.0010 (0.0015) model time 0.2383 (0.2439) loss 3.7468 (3.4213) grad_norm 1.4425 (2.1577) loss_scale 4096.0000 (2785.3810) mem 7379MB [2024-08-26 10:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][820/1251] eta 0:01:45 lr 0.000836 wd 0.0500 time 0.2358 (0.2454) data time 0.0009 (0.0015) model time 0.2349 (0.2439) loss 3.6264 (3.4207) grad_norm 3.4401 (inf) loss_scale 2048.0000 (2791.3666) mem 7379MB [2024-08-26 10:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][830/1251] eta 0:01:43 lr 0.000836 wd 0.0500 time 0.2437 (0.2453) data time 0.0010 (0.0015) model time 0.2427 (0.2438) loss 3.1624 (3.4184) grad_norm 1.5576 (inf) loss_scale 2048.0000 (2782.4212) mem 7379MB [2024-08-26 10:25:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][840/1251] eta 0:01:40 lr 0.000836 wd 
0.0500 time 0.2415 (0.2453) data time 0.0009 (0.0015) model time 0.2406 (0.2438) loss 3.7460 (3.4220) grad_norm 1.8098 (inf) loss_scale 2048.0000 (2773.6885) mem 7379MB [2024-08-26 10:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][850/1251] eta 0:01:38 lr 0.000836 wd 0.0500 time 0.2412 (0.2453) data time 0.0007 (0.0015) model time 0.2405 (0.2438) loss 3.3224 (3.4267) grad_norm 1.8353 (inf) loss_scale 2048.0000 (2765.1610) mem 7379MB [2024-08-26 10:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][860/1251] eta 0:01:35 lr 0.000836 wd 0.0500 time 0.2473 (0.2452) data time 0.0012 (0.0015) model time 0.2462 (0.2437) loss 3.5988 (3.4259) grad_norm 1.5765 (inf) loss_scale 2048.0000 (2756.8316) mem 7379MB [2024-08-26 10:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][870/1251] eta 0:01:33 lr 0.000836 wd 0.0500 time 0.2417 (0.2452) data time 0.0009 (0.0015) model time 0.2409 (0.2437) loss 3.7868 (3.4243) grad_norm 2.1733 (inf) loss_scale 2048.0000 (2748.6935) mem 7379MB [2024-08-26 10:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][880/1251] eta 0:01:30 lr 0.000836 wd 0.0500 time 0.2397 (0.2451) data time 0.0007 (0.0015) model time 0.2390 (0.2436) loss 2.3030 (3.4257) grad_norm 2.7025 (inf) loss_scale 2048.0000 (2740.7401) mem 7379MB [2024-08-26 10:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][890/1251] eta 0:01:28 lr 0.000836 wd 0.0500 time 0.2426 (0.2451) data time 0.0008 (0.0015) model time 0.2419 (0.2436) loss 4.9798 (3.4267) grad_norm 2.2004 (inf) loss_scale 2048.0000 (2732.9652) mem 7379MB [2024-08-26 10:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][900/1251] eta 0:01:26 lr 0.000836 wd 0.0500 time 0.2435 (0.2450) data time 0.0013 (0.0015) model time 0.2422 (0.2436) loss 3.6607 (3.4276) grad_norm 1.6577 (inf) loss_scale 2048.0000 (2725.3629) mem 7379MB [2024-08-26 10:25:46 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [94/300][910/1251] eta 0:01:23 lr 0.000836 wd 0.0500 time 0.2439 (0.2450) data time 0.0007 (0.0015) model time 0.2432 (0.2435) loss 4.0736 (3.4289) grad_norm 2.0153 (inf) loss_scale 2048.0000 (2717.9276) mem 7379MB [2024-08-26 10:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][920/1251] eta 0:01:21 lr 0.000836 wd 0.0500 time 0.2462 (0.2450) data time 0.0010 (0.0015) model time 0.2452 (0.2435) loss 3.7589 (3.4312) grad_norm 1.8382 (inf) loss_scale 2048.0000 (2710.6536) mem 7379MB [2024-08-26 10:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][930/1251] eta 0:01:18 lr 0.000836 wd 0.0500 time 0.2457 (0.2449) data time 0.0010 (0.0015) model time 0.2447 (0.2435) loss 3.7234 (3.4308) grad_norm 2.6028 (inf) loss_scale 2048.0000 (2703.5360) mem 7379MB [2024-08-26 10:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][940/1251] eta 0:01:16 lr 0.000836 wd 0.0500 time 0.2373 (0.2449) data time 0.0010 (0.0015) model time 0.2363 (0.2434) loss 3.5456 (3.4305) grad_norm 1.6108 (inf) loss_scale 2048.0000 (2696.5696) mem 7379MB [2024-08-26 10:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][950/1251] eta 0:01:13 lr 0.000836 wd 0.0500 time 0.2419 (0.2449) data time 0.0009 (0.0014) model time 0.2410 (0.2434) loss 2.6830 (3.4306) grad_norm 1.5048 (inf) loss_scale 2048.0000 (2689.7497) mem 7379MB [2024-08-26 10:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][960/1251] eta 0:01:11 lr 0.000836 wd 0.0500 time 0.2411 (0.2448) data time 0.0011 (0.0014) model time 0.2400 (0.2434) loss 3.4822 (3.4333) grad_norm 1.4500 (inf) loss_scale 2048.0000 (2683.0718) mem 7379MB [2024-08-26 10:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][970/1251] eta 0:01:08 lr 0.000836 wd 0.0500 time 0.2391 (0.2448) data time 0.0007 (0.0014) model time 0.2383 (0.2433) loss 2.4758 (3.4324) grad_norm 1.9693 (inf) loss_scale 
2048.0000 (2676.5314) mem 7379MB [2024-08-26 10:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][980/1251] eta 0:01:06 lr 0.000836 wd 0.0500 time 0.2401 (0.2448) data time 0.0009 (0.0014) model time 0.2392 (0.2433) loss 4.4805 (3.4309) grad_norm 2.9798 (inf) loss_scale 2048.0000 (2670.1244) mem 7379MB [2024-08-26 10:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][990/1251] eta 0:01:03 lr 0.000836 wd 0.0500 time 0.2330 (0.2450) data time 0.0008 (0.0014) model time 0.2322 (0.2435) loss 3.2816 (3.4305) grad_norm 2.1545 (inf) loss_scale 2048.0000 (2663.8466) mem 7379MB [2024-08-26 10:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1000/1251] eta 0:01:01 lr 0.000836 wd 0.0500 time 0.2388 (0.2449) data time 0.0007 (0.0014) model time 0.2381 (0.2435) loss 2.4147 (3.4298) grad_norm 1.8652 (inf) loss_scale 2048.0000 (2657.6943) mem 7379MB [2024-08-26 10:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1010/1251] eta 0:00:59 lr 0.000836 wd 0.0500 time 0.2384 (0.2449) data time 0.0011 (0.0014) model time 0.2373 (0.2435) loss 3.6461 (3.4315) grad_norm 2.0078 (inf) loss_scale 2048.0000 (2651.6637) mem 7379MB [2024-08-26 10:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1020/1251] eta 0:00:56 lr 0.000836 wd 0.0500 time 0.2412 (0.2450) data time 0.0008 (0.0014) model time 0.2404 (0.2436) loss 3.5945 (3.4324) grad_norm 2.1640 (inf) loss_scale 2048.0000 (2645.7512) mem 7379MB [2024-08-26 10:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1030/1251] eta 0:00:54 lr 0.000836 wd 0.0500 time 0.2392 (0.2450) data time 0.0007 (0.0014) model time 0.2384 (0.2436) loss 3.6175 (3.4340) grad_norm 1.6534 (inf) loss_scale 2048.0000 (2639.9534) mem 7379MB [2024-08-26 10:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1040/1251] eta 0:00:51 lr 0.000836 wd 0.0500 time 0.2332 (0.2449) data time 0.0008 (0.0014) 
model time 0.2324 (0.2435) loss 2.6277 (3.4313) grad_norm 1.3830 (inf) loss_scale 2048.0000 (2634.2671) mem 7379MB [2024-08-26 10:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1050/1251] eta 0:00:49 lr 0.000836 wd 0.0500 time 0.2381 (0.2449) data time 0.0011 (0.0014) model time 0.2370 (0.2435) loss 3.8916 (3.4323) grad_norm 1.7989 (inf) loss_scale 2048.0000 (2628.6889) mem 7379MB [2024-08-26 10:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1060/1251] eta 0:00:46 lr 0.000835 wd 0.0500 time 0.2404 (0.2449) data time 0.0009 (0.0014) model time 0.2395 (0.2435) loss 3.6847 (3.4332) grad_norm 2.1499 (inf) loss_scale 2048.0000 (2623.2158) mem 7379MB [2024-08-26 10:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1070/1251] eta 0:00:44 lr 0.000835 wd 0.0500 time 0.2437 (0.2449) data time 0.0010 (0.0014) model time 0.2427 (0.2434) loss 4.2100 (3.4344) grad_norm 1.8418 (inf) loss_scale 2048.0000 (2617.8450) mem 7379MB [2024-08-26 10:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1080/1251] eta 0:00:41 lr 0.000835 wd 0.0500 time 0.2389 (0.2452) data time 0.0008 (0.0014) model time 0.2382 (0.2438) loss 3.0825 (3.4333) grad_norm 1.3908 (inf) loss_scale 2048.0000 (2612.5735) mem 7379MB [2024-08-26 10:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1090/1251] eta 0:00:39 lr 0.000835 wd 0.0500 time 0.4698 (0.2454) data time 0.0010 (0.0014) model time 0.4688 (0.2440) loss 3.5219 (3.4362) grad_norm 2.1081 (inf) loss_scale 2048.0000 (2607.3987) mem 7379MB [2024-08-26 10:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1100/1251] eta 0:00:37 lr 0.000835 wd 0.0500 time 0.2456 (0.2453) data time 0.0010 (0.0014) model time 0.2446 (0.2440) loss 3.4107 (3.4380) grad_norm 2.2027 (inf) loss_scale 2048.0000 (2602.3179) mem 7379MB [2024-08-26 10:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1110/1251] 
eta 0:00:34 lr 0.000835 wd 0.0500 time 0.2334 (0.2453) data time 0.0010 (0.0014) model time 0.2324 (0.2439) loss 3.5194 (3.4361) grad_norm 1.6195 (inf) loss_scale 2048.0000 (2597.3285) mem 7379MB [2024-08-26 10:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1120/1251] eta 0:00:32 lr 0.000835 wd 0.0500 time 0.2373 (0.2453) data time 0.0008 (0.0014) model time 0.2365 (0.2439) loss 3.9575 (3.4353) grad_norm 2.3318 (inf) loss_scale 2048.0000 (2592.4282) mem 7379MB [2024-08-26 10:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1130/1251] eta 0:00:29 lr 0.000835 wd 0.0500 time 0.2468 (0.2452) data time 0.0008 (0.0014) model time 0.2461 (0.2438) loss 3.7960 (3.4373) grad_norm 2.3840 (inf) loss_scale 2048.0000 (2587.6145) mem 7379MB [2024-08-26 10:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1140/1251] eta 0:00:27 lr 0.000835 wd 0.0500 time 0.2400 (0.2452) data time 0.0009 (0.0014) model time 0.2391 (0.2438) loss 3.6992 (3.4392) grad_norm 1.7520 (inf) loss_scale 2048.0000 (2582.8852) mem 7379MB [2024-08-26 10:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1150/1251] eta 0:00:24 lr 0.000835 wd 0.0500 time 0.2346 (0.2452) data time 0.0007 (0.0014) model time 0.2338 (0.2438) loss 3.4182 (3.4406) grad_norm 2.3475 (inf) loss_scale 2048.0000 (2578.2381) mem 7379MB [2024-08-26 10:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1160/1251] eta 0:00:22 lr 0.000835 wd 0.0500 time 0.2426 (0.2451) data time 0.0007 (0.0014) model time 0.2419 (0.2437) loss 3.5915 (3.4411) grad_norm 1.8770 (inf) loss_scale 2048.0000 (2573.6710) mem 7379MB [2024-08-26 10:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1170/1251] eta 0:00:19 lr 0.000835 wd 0.0500 time 0.2461 (0.2451) data time 0.0012 (0.0014) model time 0.2450 (0.2437) loss 3.8418 (3.4432) grad_norm 1.9790 (inf) loss_scale 2048.0000 (2569.1819) mem 7379MB [2024-08-26 10:26:52 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1180/1251] eta 0:00:17 lr 0.000835 wd 0.0500 time 0.2428 (0.2451) data time 0.0010 (0.0014) model time 0.2418 (0.2437) loss 2.8477 (3.4418) grad_norm 2.1952 (inf) loss_scale 2048.0000 (2564.7688) mem 7379MB [2024-08-26 10:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1190/1251] eta 0:00:14 lr 0.000835 wd 0.0500 time 0.2426 (0.2451) data time 0.0012 (0.0014) model time 0.2414 (0.2437) loss 3.9889 (3.4426) grad_norm 2.0100 (inf) loss_scale 2048.0000 (2560.4299) mem 7379MB [2024-08-26 10:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1200/1251] eta 0:00:12 lr 0.000835 wd 0.0500 time 0.2393 (0.2450) data time 0.0007 (0.0014) model time 0.2386 (0.2437) loss 3.7266 (3.4436) grad_norm 1.7161 (inf) loss_scale 2048.0000 (2556.1632) mem 7379MB [2024-08-26 10:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1210/1251] eta 0:00:10 lr 0.000835 wd 0.0500 time 0.2443 (0.2450) data time 0.0008 (0.0014) model time 0.2435 (0.2436) loss 4.0224 (3.4447) grad_norm 1.7672 (inf) loss_scale 2048.0000 (2551.9670) mem 7379MB [2024-08-26 10:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1220/1251] eta 0:00:07 lr 0.000835 wd 0.0500 time 0.2325 (0.2450) data time 0.0011 (0.0014) model time 0.2315 (0.2436) loss 3.2799 (3.4446) grad_norm 1.8977 (inf) loss_scale 2048.0000 (2547.8395) mem 7379MB [2024-08-26 10:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1230/1251] eta 0:00:05 lr 0.000835 wd 0.0500 time 0.2397 (0.2449) data time 0.0007 (0.0013) model time 0.2390 (0.2436) loss 1.8464 (3.4433) grad_norm 1.9459 (inf) loss_scale 2048.0000 (2543.7790) mem 7379MB [2024-08-26 10:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1240/1251] eta 0:00:02 lr 0.000835 wd 0.0500 time 0.2256 (0.2452) data time 0.0005 (0.0013) model time 0.2252 (0.2438) loss 3.6162 (3.4445) 
grad_norm 2.7854 (inf) loss_scale 2048.0000 (2539.7840) mem 7379MB [2024-08-26 10:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [94/300][1250/1251] eta 0:00:00 lr 0.000835 wd 0.0500 time 0.2243 (0.2450) data time 0.0007 (0.0013) model time 0.2236 (0.2436) loss 3.5777 (3.4460) grad_norm 1.6345 (inf) loss_scale 2048.0000 (2535.8529) mem 7379MB [2024-08-26 10:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 94 training takes 0:05:06 [2024-08-26 10:27:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:27:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 10:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.449 (0.449) Loss 0.5264 (0.5264) Acc@1 91.113 (91.113) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.111) Loss 0.8574 (0.8023) Acc@1 81.738 (82.537) Acc@5 95.312 (96.360) Mem 7379MB [2024-08-26 10:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.096) Loss 1.1836 (0.8242) Acc@1 70.703 (81.589) Acc@5 92.285 (96.233) Mem 7379MB [2024-08-26 10:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.091 (0.091) Loss 1.4453 (0.9433) Acc@1 65.234 (78.831) Acc@5 88.281 (94.761) Mem 7379MB [2024-08-26 10:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.2490 (1.0069) Acc@1 70.703 (77.370) Acc@5 91.113 (93.986) Mem 7379MB [2024-08-26 10:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.000 Acc@5 93.940 [2024-08-26 10:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.0% [2024-08-26 10:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] 
Time 0.815 (0.815) Loss 0.4404 (0.4404) Acc@1 92.383 (92.383) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.149) Loss 0.7129 (0.6945) Acc@1 85.742 (85.103) Acc@5 96.191 (96.999) Mem 7379MB [2024-08-26 10:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.115) Loss 0.9922 (0.7184) Acc@1 76.074 (84.049) Acc@5 94.043 (96.949) Mem 7379MB [2024-08-26 10:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.104) Loss 1.2803 (0.8202) Acc@1 68.164 (81.628) Acc@5 90.918 (95.772) Mem 7379MB [2024-08-26 10:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.094) Loss 1.1416 (0.8723) Acc@1 71.289 (80.085) Acc@5 92.969 (95.270) Mem 7379MB [2024-08-26 10:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.678 Acc@5 95.232 [2024-08-26 10:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.7% [2024-08-26 10:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.68% [2024-08-26 10:27:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:27:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 10:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][0/1251] eta 0:14:57 lr 0.000835 wd 0.0500 time 0.7173 (0.7173) data time 0.4902 (0.4902) model time 0.0000 (0.0000) loss 3.2590 (3.2590) grad_norm 1.5525 (1.5525) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][10/1251] eta 0:05:51 lr 0.000835 wd 0.0500 time 0.2450 (0.2834) data time 0.0008 (0.0456) model time 0.0000 (0.0000) loss 3.2923 (3.3801) grad_norm 1.8448 (1.8720) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][20/1251] eta 0:05:46 lr 0.000835 wd 0.0500 time 0.2414 (0.2816) data time 0.0007 (0.0243) model time 0.0000 (0.0000) loss 2.8281 (3.2935) grad_norm 2.5158 (2.4149) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][30/1251] eta 0:05:29 lr 0.000835 wd 0.0500 time 0.2364 (0.2696) data time 0.0010 (0.0168) model time 0.0000 (0.0000) loss 3.2626 (3.3836) grad_norm 2.2266 (2.2446) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][40/1251] eta 0:05:19 lr 0.000835 wd 0.0500 time 0.2448 (0.2638) data time 0.0009 (0.0129) model time 0.0000 (0.0000) loss 3.7377 (3.3990) grad_norm 2.1375 (2.3623) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][50/1251] eta 0:05:12 lr 0.000835 wd 0.0500 time 0.2441 (0.2600) data time 0.0007 (0.0106) model time 0.0000 (0.0000) loss 2.1915 (3.3108) grad_norm 1.7551 (2.3122) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][60/1251] eta 0:05:06 lr 0.000835 wd 0.0500 time 0.2437 (0.2571) data time 0.0008 (0.0090) model time 0.2429 (0.2416) loss 
4.1041 (3.3093) grad_norm 1.6684 (2.3200) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][70/1251] eta 0:05:00 lr 0.000835 wd 0.0500 time 0.2359 (0.2547) data time 0.0008 (0.0079) model time 0.2351 (0.2402) loss 4.0023 (3.3212) grad_norm 1.6812 (2.2562) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][80/1251] eta 0:04:56 lr 0.000835 wd 0.0500 time 0.2384 (0.2530) data time 0.0007 (0.0070) model time 0.2377 (0.2403) loss 3.9180 (3.3478) grad_norm 1.4786 (2.2279) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][90/1251] eta 0:04:54 lr 0.000835 wd 0.0500 time 0.2510 (0.2541) data time 0.0011 (0.0064) model time 0.2499 (0.2455) loss 4.1221 (3.3926) grad_norm 1.5956 (2.2022) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][100/1251] eta 0:04:53 lr 0.000835 wd 0.0500 time 0.2417 (0.2549) data time 0.0007 (0.0058) model time 0.2410 (0.2487) loss 2.7398 (3.4140) grad_norm 1.3962 (2.1520) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][110/1251] eta 0:04:49 lr 0.000834 wd 0.0500 time 0.2408 (0.2538) data time 0.0011 (0.0054) model time 0.2397 (0.2476) loss 3.8212 (3.4113) grad_norm 1.8454 (2.1291) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][120/1251] eta 0:04:46 lr 0.000834 wd 0.0500 time 0.2407 (0.2530) data time 0.0009 (0.0050) model time 0.2398 (0.2469) loss 3.9331 (3.3999) grad_norm 2.7611 (2.1288) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][130/1251] eta 0:04:42 lr 
0.000834 wd 0.0500 time 0.2456 (0.2522) data time 0.0009 (0.0047) model time 0.2447 (0.2463) loss 3.0226 (3.4107) grad_norm 2.7007 (2.1380) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][140/1251] eta 0:04:39 lr 0.000834 wd 0.0500 time 0.2471 (0.2516) data time 0.0009 (0.0045) model time 0.2462 (0.2459) loss 3.7383 (3.4067) grad_norm 1.6155 (2.1237) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][150/1251] eta 0:04:36 lr 0.000834 wd 0.0500 time 0.2385 (0.2510) data time 0.0008 (0.0042) model time 0.2377 (0.2454) loss 4.2721 (3.4142) grad_norm 1.9186 (2.1118) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][160/1251] eta 0:04:33 lr 0.000834 wd 0.0500 time 0.2487 (0.2506) data time 0.0010 (0.0040) model time 0.2477 (0.2452) loss 3.7257 (3.4062) grad_norm 2.9467 (2.1220) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][170/1251] eta 0:04:30 lr 0.000834 wd 0.0500 time 0.2394 (0.2501) data time 0.0009 (0.0039) model time 0.2385 (0.2449) loss 3.4019 (3.3908) grad_norm 1.5799 (2.1149) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][180/1251] eta 0:04:27 lr 0.000834 wd 0.0500 time 0.2424 (0.2497) data time 0.0009 (0.0037) model time 0.2415 (0.2446) loss 4.0203 (3.4018) grad_norm 1.4605 (2.0989) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][190/1251] eta 0:04:24 lr 0.000834 wd 0.0500 time 0.2434 (0.2494) data time 0.0008 (0.0036) model time 0.2426 (0.2445) loss 3.8754 (3.4145) grad_norm 1.7747 (2.1066) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:09 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][200/1251] eta 0:04:21 lr 0.000834 wd 0.0500 time 0.2445 (0.2490) data time 0.0010 (0.0035) model time 0.2435 (0.2442) loss 3.7503 (3.4192) grad_norm 4.4500 (2.1165) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][210/1251] eta 0:04:18 lr 0.000834 wd 0.0500 time 0.2386 (0.2486) data time 0.0011 (0.0033) model time 0.2375 (0.2440) loss 2.8197 (3.3986) grad_norm 3.1740 (2.1503) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][220/1251] eta 0:04:15 lr 0.000834 wd 0.0500 time 0.2404 (0.2483) data time 0.0007 (0.0032) model time 0.2396 (0.2437) loss 3.3135 (3.3989) grad_norm 1.9524 (2.1401) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][230/1251] eta 0:04:13 lr 0.000834 wd 0.0500 time 0.2448 (0.2480) data time 0.0009 (0.0031) model time 0.2438 (0.2436) loss 3.8019 (3.3970) grad_norm 2.0142 (2.1324) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][240/1251] eta 0:04:10 lr 0.000834 wd 0.0500 time 0.2413 (0.2477) data time 0.0009 (0.0031) model time 0.2404 (0.2434) loss 3.9543 (3.4051) grad_norm 2.3220 (2.1352) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][250/1251] eta 0:04:07 lr 0.000834 wd 0.0500 time 0.2368 (0.2475) data time 0.0011 (0.0030) model time 0.2357 (0.2433) loss 3.6593 (3.4002) grad_norm 1.5647 (2.1298) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][260/1251] eta 0:04:06 lr 0.000834 wd 0.0500 time 0.4521 (0.2488) data time 0.0010 (0.0029) model time 0.4512 (0.2451) loss 3.3876 
(3.4127) grad_norm 2.1041 (2.1191) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][270/1251] eta 0:04:03 lr 0.000834 wd 0.0500 time 0.2438 (0.2486) data time 0.0011 (0.0028) model time 0.2427 (0.2449) loss 3.7013 (3.4234) grad_norm 2.5692 (2.1143) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][280/1251] eta 0:04:01 lr 0.000834 wd 0.0500 time 0.2457 (0.2484) data time 0.0007 (0.0028) model time 0.2450 (0.2448) loss 3.3222 (3.4235) grad_norm 2.1246 (2.1247) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][290/1251] eta 0:03:58 lr 0.000834 wd 0.0500 time 0.2373 (0.2482) data time 0.0009 (0.0027) model time 0.2365 (0.2446) loss 3.9930 (3.4225) grad_norm 2.3093 (2.1230) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][300/1251] eta 0:03:55 lr 0.000834 wd 0.0500 time 0.2428 (0.2480) data time 0.0008 (0.0026) model time 0.2420 (0.2446) loss 2.8557 (3.4225) grad_norm 2.1185 (2.1446) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][310/1251] eta 0:03:53 lr 0.000834 wd 0.0500 time 0.2459 (0.2479) data time 0.0007 (0.0026) model time 0.2453 (0.2445) loss 3.5107 (3.4227) grad_norm 2.1652 (2.1428) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][320/1251] eta 0:03:50 lr 0.000834 wd 0.0500 time 0.2441 (0.2477) data time 0.0008 (0.0025) model time 0.2434 (0.2444) loss 4.0874 (3.4251) grad_norm 3.1237 (2.1401) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][330/1251] eta 0:03:47 lr 0.000834 wd 
0.0500 time 0.2453 (0.2475) data time 0.0007 (0.0025) model time 0.2445 (0.2442) loss 3.4665 (3.4303) grad_norm 1.8842 (2.1292) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][340/1251] eta 0:03:45 lr 0.000834 wd 0.0500 time 0.2406 (0.2474) data time 0.0008 (0.0024) model time 0.2398 (0.2442) loss 3.0684 (3.4277) grad_norm 2.0160 (2.1284) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][350/1251] eta 0:03:42 lr 0.000834 wd 0.0500 time 0.2409 (0.2472) data time 0.0010 (0.0024) model time 0.2399 (0.2440) loss 3.0332 (3.4293) grad_norm 2.3969 (2.1345) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][360/1251] eta 0:03:40 lr 0.000834 wd 0.0500 time 0.2153 (0.2477) data time 0.0009 (0.0024) model time 0.2144 (0.2447) loss 3.4430 (3.4334) grad_norm 1.4401 (2.1395) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][370/1251] eta 0:03:38 lr 0.000834 wd 0.0500 time 0.2413 (0.2476) data time 0.0010 (0.0023) model time 0.2403 (0.2446) loss 3.7367 (3.4337) grad_norm 1.6100 (2.1486) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][380/1251] eta 0:03:35 lr 0.000834 wd 0.0500 time 0.2407 (0.2474) data time 0.0011 (0.0023) model time 0.2396 (0.2445) loss 3.6115 (3.4356) grad_norm 1.5065 (2.1387) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][390/1251] eta 0:03:32 lr 0.000834 wd 0.0500 time 0.2392 (0.2473) data time 0.0007 (0.0023) model time 0.2384 (0.2444) loss 3.8650 (3.4313) grad_norm 1.8637 (2.1336) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:28:58 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][400/1251] eta 0:03:30 lr 0.000834 wd 0.0500 time 0.2420 (0.2476) data time 0.0010 (0.0022) model time 0.2410 (0.2448) loss 4.1681 (3.4265) grad_norm 1.5546 (2.1284) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][410/1251] eta 0:03:28 lr 0.000833 wd 0.0500 time 0.2407 (0.2474) data time 0.0007 (0.0022) model time 0.2399 (0.2446) loss 2.5901 (3.4267) grad_norm 1.8073 (2.1189) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][420/1251] eta 0:03:25 lr 0.000833 wd 0.0500 time 0.2383 (0.2473) data time 0.0009 (0.0022) model time 0.2374 (0.2445) loss 2.9934 (3.4269) grad_norm 1.9218 (2.1172) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][430/1251] eta 0:03:22 lr 0.000833 wd 0.0500 time 0.2394 (0.2471) data time 0.0010 (0.0021) model time 0.2385 (0.2444) loss 4.0614 (3.4234) grad_norm 2.1140 (2.1107) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][440/1251] eta 0:03:20 lr 0.000833 wd 0.0500 time 0.2407 (0.2471) data time 0.0011 (0.0021) model time 0.2397 (0.2444) loss 3.6104 (3.4282) grad_norm 1.5461 (2.1061) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][450/1251] eta 0:03:17 lr 0.000833 wd 0.0500 time 0.2420 (0.2469) data time 0.0007 (0.0021) model time 0.2413 (0.2442) loss 2.0673 (3.4145) grad_norm 1.4253 (2.1001) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][460/1251] eta 0:03:15 lr 0.000833 wd 0.0500 time 0.2389 (0.2468) data time 0.0008 (0.0021) model time 0.2381 (0.2442) loss 4.3505 
(3.4161) grad_norm 2.7915 (2.1017) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][470/1251] eta 0:03:12 lr 0.000833 wd 0.0500 time 0.2372 (0.2467) data time 0.0007 (0.0021) model time 0.2365 (0.2441) loss 3.8977 (3.4131) grad_norm 3.3151 (2.1230) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][480/1251] eta 0:03:10 lr 0.000833 wd 0.0500 time 0.2382 (0.2466) data time 0.0009 (0.0020) model time 0.2373 (0.2440) loss 4.0107 (3.4149) grad_norm 2.0619 (2.1173) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][490/1251] eta 0:03:07 lr 0.000833 wd 0.0500 time 0.2326 (0.2465) data time 0.0008 (0.0020) model time 0.2318 (0.2439) loss 3.7627 (3.4174) grad_norm 1.9060 (2.1112) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][500/1251] eta 0:03:05 lr 0.000833 wd 0.0500 time 0.2343 (0.2464) data time 0.0013 (0.0020) model time 0.2330 (0.2438) loss 2.8891 (3.4136) grad_norm 1.6195 (2.1089) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][510/1251] eta 0:03:02 lr 0.000833 wd 0.0500 time 0.2549 (0.2463) data time 0.0007 (0.0020) model time 0.2542 (0.2438) loss 4.3978 (3.4121) grad_norm 2.1480 (2.1087) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][520/1251] eta 0:03:00 lr 0.000833 wd 0.0500 time 0.2423 (0.2467) data time 0.0009 (0.0020) model time 0.2414 (0.2442) loss 3.0358 (3.4085) grad_norm 1.5027 (2.1165) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][530/1251] eta 0:02:57 lr 0.000833 wd 
0.0500 time 0.2373 (0.2465) data time 0.0007 (0.0019) model time 0.2366 (0.2441) loss 4.4429 (3.4152) grad_norm 2.1796 (2.1101) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][540/1251] eta 0:02:55 lr 0.000833 wd 0.0500 time 0.2422 (0.2465) data time 0.0008 (0.0019) model time 0.2414 (0.2440) loss 3.0359 (3.4141) grad_norm 2.0136 (2.1024) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][550/1251] eta 0:02:53 lr 0.000833 wd 0.0500 time 0.4456 (0.2471) data time 0.0007 (0.0019) model time 0.4449 (0.2448) loss 4.0155 (3.4124) grad_norm 2.2024 (2.1005) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][560/1251] eta 0:02:50 lr 0.000833 wd 0.0500 time 0.2357 (0.2470) data time 0.0009 (0.0019) model time 0.2348 (0.2447) loss 2.4943 (3.4086) grad_norm 1.9525 (2.1028) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][570/1251] eta 0:02:48 lr 0.000833 wd 0.0500 time 0.2345 (0.2469) data time 0.0011 (0.0019) model time 0.2334 (0.2446) loss 3.4057 (3.4112) grad_norm 1.6408 (2.1037) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][580/1251] eta 0:02:45 lr 0.000833 wd 0.0500 time 0.2399 (0.2468) data time 0.0010 (0.0019) model time 0.2389 (0.2445) loss 3.5267 (3.4069) grad_norm 2.0768 (2.1079) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][590/1251] eta 0:02:43 lr 0.000833 wd 0.0500 time 0.2422 (0.2467) data time 0.0008 (0.0018) model time 0.2413 (0.2444) loss 4.1047 (3.4155) grad_norm 2.3120 (2.1100) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:47 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][600/1251] eta 0:02:40 lr 0.000833 wd 0.0500 time 0.2372 (0.2466) data time 0.0009 (0.0018) model time 0.2363 (0.2443) loss 2.2946 (3.4138) grad_norm 1.7899 (2.1070) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][610/1251] eta 0:02:37 lr 0.000833 wd 0.0500 time 0.2402 (0.2465) data time 0.0010 (0.0018) model time 0.2392 (0.2442) loss 2.5517 (3.4131) grad_norm 1.9514 (2.1048) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][620/1251] eta 0:02:35 lr 0.000833 wd 0.0500 time 0.2368 (0.2464) data time 0.0010 (0.0018) model time 0.2358 (0.2442) loss 3.8873 (3.4148) grad_norm 2.2644 (2.1046) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][630/1251] eta 0:02:33 lr 0.000833 wd 0.0500 time 0.2438 (0.2466) data time 0.0009 (0.0018) model time 0.2429 (0.2444) loss 2.9166 (3.4161) grad_norm 1.9614 (2.1022) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][640/1251] eta 0:02:30 lr 0.000833 wd 0.0500 time 0.2363 (0.2469) data time 0.0010 (0.0018) model time 0.2353 (0.2448) loss 3.7114 (3.4144) grad_norm 2.5237 (2.1070) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][650/1251] eta 0:02:28 lr 0.000833 wd 0.0500 time 0.2365 (0.2468) data time 0.0010 (0.0018) model time 0.2355 (0.2447) loss 3.0374 (3.4177) grad_norm 1.6607 (2.1095) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][660/1251] eta 0:02:25 lr 0.000833 wd 0.0500 time 0.2473 (0.2468) data time 0.0008 (0.0018) model time 0.2465 (0.2447) loss 3.9105 
(3.4213) grad_norm 1.6336 (2.1055) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][670/1251] eta 0:02:23 lr 0.000833 wd 0.0500 time 0.2497 (0.2467) data time 0.0007 (0.0017) model time 0.2491 (0.2446) loss 3.3929 (3.4234) grad_norm 2.0318 (2.1030) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][680/1251] eta 0:02:20 lr 0.000833 wd 0.0500 time 0.2481 (0.2467) data time 0.0009 (0.0018) model time 0.2472 (0.2446) loss 4.2988 (3.4251) grad_norm 1.8445 (2.1027) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][690/1251] eta 0:02:18 lr 0.000833 wd 0.0500 time 0.2506 (0.2467) data time 0.0009 (0.0017) model time 0.2496 (0.2446) loss 3.4980 (3.4275) grad_norm 1.7870 (2.1002) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][700/1251] eta 0:02:15 lr 0.000833 wd 0.0500 time 0.2446 (0.2466) data time 0.0009 (0.0017) model time 0.2437 (0.2445) loss 2.2956 (3.4274) grad_norm 2.4651 (2.0988) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][710/1251] eta 0:02:13 lr 0.000832 wd 0.0500 time 0.2544 (0.2466) data time 0.0010 (0.0017) model time 0.2534 (0.2445) loss 3.1800 (3.4247) grad_norm 2.2166 (2.0974) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][720/1251] eta 0:02:10 lr 0.000832 wd 0.0500 time 0.2399 (0.2465) data time 0.0012 (0.0017) model time 0.2387 (0.2445) loss 3.1289 (3.4261) grad_norm 2.0907 (2.0955) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][730/1251] eta 0:02:08 lr 0.000832 wd 
0.0500 time 0.2426 (0.2465) data time 0.0009 (0.0017) model time 0.2417 (0.2444) loss 2.2914 (3.4219) grad_norm 1.9778 (2.0956) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][740/1251] eta 0:02:05 lr 0.000832 wd 0.0500 time 0.2642 (0.2464) data time 0.0010 (0.0017) model time 0.2633 (0.2444) loss 3.6989 (3.4182) grad_norm 1.6286 (2.0977) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][750/1251] eta 0:02:03 lr 0.000832 wd 0.0500 time 0.2393 (0.2464) data time 0.0012 (0.0017) model time 0.2382 (0.2444) loss 3.3928 (3.4198) grad_norm 2.0735 (2.0966) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][760/1251] eta 0:02:00 lr 0.000832 wd 0.0500 time 0.2366 (0.2464) data time 0.0011 (0.0017) model time 0.2355 (0.2443) loss 3.0754 (3.4190) grad_norm 1.4850 (2.0928) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][770/1251] eta 0:01:58 lr 0.000832 wd 0.0500 time 0.2431 (0.2464) data time 0.0012 (0.0017) model time 0.2419 (0.2444) loss 3.7700 (3.4175) grad_norm 1.4652 (2.0904) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][780/1251] eta 0:01:56 lr 0.000832 wd 0.0500 time 0.3845 (0.2466) data time 0.0010 (0.0017) model time 0.3836 (0.2445) loss 2.7456 (3.4176) grad_norm 2.2311 (2.0964) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][790/1251] eta 0:01:53 lr 0.000832 wd 0.0500 time 0.2450 (0.2465) data time 0.0010 (0.0017) model time 0.2441 (0.2445) loss 3.1211 (3.4185) grad_norm 1.8548 (2.0950) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][800/1251] eta 0:01:51 lr 0.000832 wd 0.0500 time 0.2537 (0.2465) data time 0.0010 (0.0017) model time 0.2527 (0.2444) loss 2.4036 (3.4206) grad_norm 2.4388 (2.0917) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][810/1251] eta 0:01:48 lr 0.000832 wd 0.0500 time 0.2425 (0.2464) data time 0.0009 (0.0017) model time 0.2416 (0.2444) loss 3.1954 (3.4165) grad_norm 1.8679 (2.0890) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][820/1251] eta 0:01:46 lr 0.000832 wd 0.0500 time 0.2399 (0.2464) data time 0.0009 (0.0017) model time 0.2390 (0.2444) loss 3.2552 (3.4176) grad_norm 1.4098 (2.0875) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][830/1251] eta 0:01:43 lr 0.000832 wd 0.0500 time 0.2418 (0.2463) data time 0.0008 (0.0017) model time 0.2410 (0.2443) loss 3.6722 (3.4188) grad_norm 2.3982 (2.0899) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][840/1251] eta 0:01:41 lr 0.000832 wd 0.0500 time 0.2498 (0.2462) data time 0.0010 (0.0017) model time 0.2489 (0.2443) loss 3.5629 (3.4181) grad_norm 2.1074 (2.0907) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][850/1251] eta 0:01:38 lr 0.000832 wd 0.0500 time 0.2334 (0.2462) data time 0.0010 (0.0016) model time 0.2324 (0.2442) loss 3.5307 (3.4172) grad_norm 2.3322 (2.0915) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][860/1251] eta 0:01:36 lr 0.000832 wd 0.0500 time 0.2402 (0.2461) data time 0.0008 (0.0016) model time 0.2394 (0.2442) loss 3.1151 
(3.4144) grad_norm 2.5702 (2.0887) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][870/1251] eta 0:01:33 lr 0.000832 wd 0.0500 time 0.2393 (0.2460) data time 0.0011 (0.0016) model time 0.2382 (0.2441) loss 3.7678 (3.4158) grad_norm 1.6979 (2.0867) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][880/1251] eta 0:01:31 lr 0.000832 wd 0.0500 time 0.2455 (0.2460) data time 0.0008 (0.0016) model time 0.2447 (0.2441) loss 2.6009 (3.4191) grad_norm 2.0210 (2.0864) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][890/1251] eta 0:01:28 lr 0.000832 wd 0.0500 time 0.2410 (0.2459) data time 0.0007 (0.0016) model time 0.2403 (0.2440) loss 2.8806 (3.4175) grad_norm 2.3830 (2.0879) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][900/1251] eta 0:01:26 lr 0.000832 wd 0.0500 time 0.2378 (0.2461) data time 0.0007 (0.0016) model time 0.2371 (0.2442) loss 3.8126 (3.4171) grad_norm 1.4555 (2.0877) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][910/1251] eta 0:01:23 lr 0.000832 wd 0.0500 time 0.2344 (0.2461) data time 0.0009 (0.0016) model time 0.2335 (0.2442) loss 2.6802 (3.4183) grad_norm 1.7181 (2.0871) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][920/1251] eta 0:01:21 lr 0.000832 wd 0.0500 time 0.2341 (0.2460) data time 0.0009 (0.0016) model time 0.2332 (0.2441) loss 3.3291 (3.4195) grad_norm 2.0155 (2.0860) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][930/1251] eta 0:01:18 lr 0.000832 wd 
0.0500 time 0.2382 (0.2460) data time 0.0007 (0.0016) model time 0.2376 (0.2441) loss 3.6312 (3.4152) grad_norm 1.7812 (2.0826) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][940/1251] eta 0:01:16 lr 0.000832 wd 0.0500 time 0.2314 (0.2459) data time 0.0011 (0.0016) model time 0.2303 (0.2440) loss 3.3708 (3.4161) grad_norm 2.1832 (2.0820) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][950/1251] eta 0:01:14 lr 0.000832 wd 0.0500 time 0.2558 (0.2459) data time 0.0010 (0.0016) model time 0.2548 (0.2440) loss 3.5661 (3.4118) grad_norm 2.0151 (2.0802) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][960/1251] eta 0:01:11 lr 0.000832 wd 0.0500 time 0.2492 (0.2459) data time 0.0007 (0.0016) model time 0.2485 (0.2440) loss 3.8939 (3.4124) grad_norm 1.9935 (2.0810) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][970/1251] eta 0:01:09 lr 0.000832 wd 0.0500 time 0.2475 (0.2459) data time 0.0010 (0.0016) model time 0.2465 (0.2440) loss 2.3498 (3.4113) grad_norm 2.6521 (2.0842) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][980/1251] eta 0:01:06 lr 0.000832 wd 0.0500 time 0.2520 (0.2459) data time 0.0009 (0.0016) model time 0.2510 (0.2440) loss 4.1950 (3.4152) grad_norm 2.2183 (2.0831) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][990/1251] eta 0:01:04 lr 0.000832 wd 0.0500 time 0.2410 (0.2458) data time 0.0007 (0.0016) model time 0.2403 (0.2440) loss 3.6003 (3.4179) grad_norm 1.4583 (2.0888) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1000/1251] eta 0:01:01 lr 0.000832 wd 0.0500 time 0.2594 (0.2458) data time 0.0009 (0.0016) model time 0.2584 (0.2440) loss 3.4239 (3.4196) grad_norm 1.8225 (2.0877) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1010/1251] eta 0:00:59 lr 0.000831 wd 0.0500 time 0.2456 (0.2458) data time 0.0011 (0.0016) model time 0.2445 (0.2439) loss 4.0645 (3.4171) grad_norm 1.6083 (2.0868) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1020/1251] eta 0:00:56 lr 0.000831 wd 0.0500 time 0.2339 (0.2460) data time 0.0010 (0.0016) model time 0.2329 (0.2441) loss 3.4507 (3.4162) grad_norm 2.6612 (2.0860) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1030/1251] eta 0:00:54 lr 0.000831 wd 0.0500 time 0.2616 (0.2462) data time 0.0009 (0.0016) model time 0.2607 (0.2443) loss 3.9308 (3.4177) grad_norm 3.1982 (2.0873) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1040/1251] eta 0:00:51 lr 0.000831 wd 0.0500 time 0.2489 (0.2462) data time 0.0012 (0.0016) model time 0.2477 (0.2443) loss 3.3373 (3.4168) grad_norm 1.5177 (2.0884) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1050/1251] eta 0:00:49 lr 0.000831 wd 0.0500 time 0.2349 (0.2461) data time 0.0010 (0.0016) model time 0.2339 (0.2443) loss 2.5293 (3.4184) grad_norm 2.0005 (2.0891) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1060/1251] eta 0:00:47 lr 0.000831 wd 0.0500 time 0.2415 (0.2461) data time 0.0008 (0.0016) model time 0.2407 (0.2443) loss 3.6508 
(3.4186) grad_norm 2.3525 (2.0888) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1070/1251] eta 0:00:44 lr 0.000831 wd 0.0500 time 0.4020 (0.2462) data time 0.0007 (0.0016) model time 0.4013 (0.2444) loss 3.4290 (3.4183) grad_norm 1.6753 (2.0865) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1080/1251] eta 0:00:42 lr 0.000831 wd 0.0500 time 0.2465 (0.2462) data time 0.0007 (0.0016) model time 0.2458 (0.2444) loss 4.1265 (3.4190) grad_norm 1.5959 (2.0835) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1090/1251] eta 0:00:39 lr 0.000831 wd 0.0500 time 0.2410 (0.2462) data time 0.0008 (0.0016) model time 0.2403 (0.2444) loss 4.2110 (3.4194) grad_norm 1.9822 (2.0856) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1100/1251] eta 0:00:37 lr 0.000831 wd 0.0500 time 0.2355 (0.2461) data time 0.0011 (0.0016) model time 0.2344 (0.2443) loss 2.6809 (3.4190) grad_norm 1.5785 (2.0849) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1110/1251] eta 0:00:34 lr 0.000831 wd 0.0500 time 0.2488 (0.2461) data time 0.0007 (0.0016) model time 0.2481 (0.2443) loss 4.0174 (3.4222) grad_norm 2.0070 (2.0850) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1120/1251] eta 0:00:32 lr 0.000831 wd 0.0500 time 0.2429 (0.2461) data time 0.0011 (0.0016) model time 0.2418 (0.2443) loss 3.7033 (3.4206) grad_norm 2.1075 (2.0889) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1130/1251] eta 0:00:29 lr 
0.000831 wd 0.0500 time 0.2357 (0.2460) data time 0.0009 (0.0015) model time 0.2348 (0.2442) loss 4.1834 (3.4197) grad_norm 1.7497 (2.0893) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1140/1251] eta 0:00:27 lr 0.000831 wd 0.0500 time 0.2421 (0.2460) data time 0.0009 (0.0015) model time 0.2412 (0.2442) loss 4.2375 (3.4201) grad_norm 2.0624 (2.0872) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1150/1251] eta 0:00:24 lr 0.000831 wd 0.0500 time 0.2430 (0.2459) data time 0.0018 (0.0015) model time 0.2412 (0.2441) loss 3.5308 (3.4202) grad_norm 2.1303 (2.0856) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1160/1251] eta 0:00:22 lr 0.000831 wd 0.0500 time 0.4305 (0.2461) data time 0.0012 (0.0015) model time 0.4293 (0.2443) loss 3.6448 (3.4197) grad_norm 4.2256 (2.0921) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1170/1251] eta 0:00:19 lr 0.000831 wd 0.0500 time 0.2437 (0.2461) data time 0.0012 (0.0015) model time 0.2425 (0.2443) loss 2.2236 (3.4183) grad_norm 1.6148 (2.0938) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1180/1251] eta 0:00:17 lr 0.000831 wd 0.0500 time 0.2387 (0.2462) data time 0.0007 (0.0015) model time 0.2380 (0.2445) loss 3.9291 (3.4194) grad_norm 2.1776 (2.0910) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1190/1251] eta 0:00:15 lr 0.000831 wd 0.0500 time 0.2382 (0.2463) data time 0.0007 (0.0015) model time 0.2374 (0.2446) loss 3.7143 (3.4223) grad_norm 1.6053 (2.0885) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 
10:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1200/1251] eta 0:00:12 lr 0.000831 wd 0.0500 time 0.2410 (0.2463) data time 0.0009 (0.0015) model time 0.2401 (0.2446) loss 4.1134 (3.4214) grad_norm 1.6996 (2.0895) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1210/1251] eta 0:00:10 lr 0.000831 wd 0.0500 time 0.2385 (0.2463) data time 0.0010 (0.0015) model time 0.2374 (0.2445) loss 3.7886 (3.4228) grad_norm 2.1386 (2.0895) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1220/1251] eta 0:00:07 lr 0.000831 wd 0.0500 time 0.2395 (0.2462) data time 0.0007 (0.0015) model time 0.2388 (0.2445) loss 3.5188 (3.4257) grad_norm 1.6138 (2.0897) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1230/1251] eta 0:00:05 lr 0.000831 wd 0.0500 time 0.2417 (0.2462) data time 0.0010 (0.0015) model time 0.2407 (0.2444) loss 3.7777 (3.4260) grad_norm 2.0651 (2.0883) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1240/1251] eta 0:00:02 lr 0.000831 wd 0.0500 time 0.2236 (0.2461) data time 0.0005 (0.0015) model time 0.2231 (0.2444) loss 3.4405 (3.4291) grad_norm 1.6931 (2.0868) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [95/300][1250/1251] eta 0:00:00 lr 0.000831 wd 0.0500 time 0.2238 (0.2459) data time 0.0005 (0.0015) model time 0.2233 (0.2442) loss 3.6246 (3.4278) grad_norm 1.5880 (2.0866) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 95 training takes 0:05:07 [2024-08-26 10:32:26 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:32:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 10:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.383 (0.383) Loss 0.5054 (0.5054) Acc@1 90.430 (90.430) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 10:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.107) Loss 0.7915 (0.7865) Acc@1 83.008 (82.386) Acc@5 96.680 (96.511) Mem 7379MB [2024-08-26 10:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.092) Loss 1.1113 (0.8144) Acc@1 73.535 (81.390) Acc@5 92.285 (96.308) Mem 7379MB [2024-08-26 10:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.087) Loss 1.4766 (0.9275) Acc@1 64.453 (79.045) Acc@5 88.379 (94.793) Mem 7379MB [2024-08-26 10:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 1.3525 (0.9917) Acc@1 68.555 (77.560) Acc@5 89.258 (94.012) Mem 7379MB [2024-08-26 10:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.160 Acc@5 93.918 [2024-08-26 10:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.2% [2024-08-26 10:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.16% [2024-08-26 10:32:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 10:32:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 10:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.439 (0.439) Loss 0.4407 (0.4407) Acc@1 92.480 (92.480) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.094 (0.109) Loss 0.7114 (0.6941) Acc@1 86.328 (85.103) Acc@5 96.191 (97.008) Mem 7379MB [2024-08-26 10:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.095) Loss 0.9907 (0.7177) Acc@1 76.172 (84.068) Acc@5 94.141 (96.959) Mem 7379MB [2024-08-26 10:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.088) Loss 1.2783 (0.8190) Acc@1 67.773 (81.666) Acc@5 90.820 (95.772) Mem 7379MB [2024-08-26 10:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.1416 (0.8709) Acc@1 71.582 (80.159) Acc@5 93.164 (95.277) Mem 7379MB [2024-08-26 10:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.752 Acc@5 95.248 [2024-08-26 10:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.8% [2024-08-26 10:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.75% [2024-08-26 10:32:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:32:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 10:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][0/1251] eta 0:15:05 lr 0.000831 wd 0.0500 time 0.7238 (0.7238) data time 0.5045 (0.5045) model time 0.0000 (0.0000) loss 3.1365 (3.1365) grad_norm 1.7901 (1.7901) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][10/1251] eta 0:05:53 lr 0.000831 wd 0.0500 time 0.2397 (0.2847) data time 0.0008 (0.0469) model time 0.0000 (0.0000) loss 3.0891 (3.5133) grad_norm 2.3284 (1.9528) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][20/1251] eta 0:05:23 lr 0.000831 wd 0.0500 time 0.2397 (0.2629) data time 0.0011 (0.0251) model time 0.0000 (0.0000) loss 3.5444 (3.4794) grad_norm 1.9628 (1.8907) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][30/1251] eta 0:05:11 lr 0.000831 wd 0.0500 time 0.2336 (0.2553) data time 0.0012 (0.0173) model time 0.0000 (0.0000) loss 2.7702 (3.4541) grad_norm 1.8393 (1.8950) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][40/1251] eta 0:05:10 lr 0.000831 wd 0.0500 time 0.4422 (0.2565) data time 0.0010 (0.0134) model time 0.0000 (0.0000) loss 3.5373 (3.4285) grad_norm 2.1304 (1.9476) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][50/1251] eta 0:05:04 lr 0.000831 wd 0.0500 time 0.2545 (0.2538) data time 0.0007 (0.0109) model time 0.0000 (0.0000) loss 4.1261 (3.4204) grad_norm 2.2102 (2.0022) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][60/1251] eta 0:05:02 lr 0.000830 wd 0.0500 time 0.2424 (0.2543) data time 0.0007 (0.0093) model time 0.2417 (0.2554) loss 
3.6907 (3.4240) grad_norm 2.7042 (2.1090) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][70/1251] eta 0:04:57 lr 0.000830 wd 0.0500 time 0.2427 (0.2522) data time 0.0007 (0.0081) model time 0.2420 (0.2472) loss 4.1907 (3.4665) grad_norm 2.2639 (2.1421) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][80/1251] eta 0:04:53 lr 0.000830 wd 0.0500 time 0.2383 (0.2510) data time 0.0010 (0.0073) model time 0.2372 (0.2452) loss 3.5671 (3.4395) grad_norm 1.4321 (2.0942) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][90/1251] eta 0:04:49 lr 0.000830 wd 0.0500 time 0.2450 (0.2497) data time 0.0010 (0.0066) model time 0.2440 (0.2435) loss 3.4905 (3.4424) grad_norm 1.6397 (2.1002) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][100/1251] eta 0:04:46 lr 0.000830 wd 0.0500 time 0.2388 (0.2489) data time 0.0007 (0.0060) model time 0.2381 (0.2428) loss 3.8058 (3.4266) grad_norm 1.9271 (2.1239) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][110/1251] eta 0:04:44 lr 0.000830 wd 0.0500 time 0.2373 (0.2497) data time 0.0008 (0.0056) model time 0.2365 (0.2453) loss 2.6445 (3.4305) grad_norm 2.0237 (2.1131) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][120/1251] eta 0:04:41 lr 0.000830 wd 0.0500 time 0.2341 (0.2491) data time 0.0011 (0.0052) model time 0.2330 (0.2447) loss 3.1035 (3.4357) grad_norm 3.2489 (2.1224) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][130/1251] eta 0:04:40 lr 
0.000830 wd 0.0500 time 0.2346 (0.2499) data time 0.0009 (0.0049) model time 0.2337 (0.2464) loss 2.3301 (3.4066) grad_norm 1.7091 (2.1406) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][140/1251] eta 0:04:37 lr 0.000830 wd 0.0500 time 0.2379 (0.2494) data time 0.0012 (0.0046) model time 0.2367 (0.2459) loss 3.5335 (3.4145) grad_norm 1.9346 (2.1345) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][150/1251] eta 0:04:34 lr 0.000830 wd 0.0500 time 0.2420 (0.2489) data time 0.0011 (0.0044) model time 0.2409 (0.2453) loss 3.9765 (3.4094) grad_norm 1.7898 (2.1394) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][160/1251] eta 0:04:31 lr 0.000830 wd 0.0500 time 0.2378 (0.2485) data time 0.0008 (0.0041) model time 0.2370 (0.2451) loss 3.0821 (3.4146) grad_norm 2.7057 (2.1367) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][170/1251] eta 0:04:28 lr 0.000830 wd 0.0500 time 0.2439 (0.2482) data time 0.0007 (0.0040) model time 0.2432 (0.2448) loss 3.9652 (3.4034) grad_norm 2.5156 (2.1508) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][180/1251] eta 0:04:25 lr 0.000830 wd 0.0500 time 0.2378 (0.2478) data time 0.0008 (0.0038) model time 0.2370 (0.2445) loss 3.5285 (3.3856) grad_norm 2.2379 (2.1418) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][190/1251] eta 0:04:22 lr 0.000830 wd 0.0500 time 0.2434 (0.2476) data time 0.0007 (0.0037) model time 0.2427 (0.2443) loss 3.0355 (3.3825) grad_norm 2.0559 (2.1283) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:26 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][200/1251] eta 0:04:19 lr 0.000830 wd 0.0500 time 0.2468 (0.2473) data time 0.0007 (0.0035) model time 0.2461 (0.2441) loss 2.4797 (3.3878) grad_norm 1.5053 (2.1321) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][210/1251] eta 0:04:17 lr 0.000830 wd 0.0500 time 0.2292 (0.2470) data time 0.0009 (0.0034) model time 0.2283 (0.2439) loss 2.8376 (3.3852) grad_norm 1.7297 (2.1432) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][220/1251] eta 0:04:14 lr 0.000830 wd 0.0500 time 0.2376 (0.2468) data time 0.0007 (0.0033) model time 0.2369 (0.2437) loss 2.0260 (3.3688) grad_norm 2.3920 (2.1415) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][230/1251] eta 0:04:11 lr 0.000830 wd 0.0500 time 0.2320 (0.2465) data time 0.0011 (0.0032) model time 0.2308 (0.2434) loss 3.3577 (3.3615) grad_norm 1.8242 (2.1333) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][240/1251] eta 0:04:08 lr 0.000830 wd 0.0500 time 0.2439 (0.2463) data time 0.0012 (0.0031) model time 0.2427 (0.2432) loss 3.2742 (3.3602) grad_norm 1.7913 (2.1214) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][250/1251] eta 0:04:06 lr 0.000830 wd 0.0500 time 0.2396 (0.2461) data time 0.0013 (0.0030) model time 0.2383 (0.2432) loss 4.0223 (3.3529) grad_norm 1.7309 (2.1125) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][260/1251] eta 0:04:03 lr 0.000830 wd 0.0500 time 0.2379 (0.2460) data time 0.0012 (0.0030) model time 0.2368 (0.2431) loss 3.6174 
(3.3555) grad_norm 1.4948 (2.1064) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][270/1251] eta 0:04:01 lr 0.000830 wd 0.0500 time 0.2347 (0.2458) data time 0.0009 (0.0029) model time 0.2338 (0.2430) loss 3.0543 (3.3556) grad_norm 1.8305 (2.1203) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][280/1251] eta 0:03:58 lr 0.000830 wd 0.0500 time 0.2511 (0.2458) data time 0.0007 (0.0028) model time 0.2504 (0.2429) loss 3.5600 (3.3631) grad_norm 2.9425 (2.1256) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][290/1251] eta 0:03:56 lr 0.000830 wd 0.0500 time 0.2392 (0.2456) data time 0.0010 (0.0028) model time 0.2382 (0.2429) loss 3.5293 (3.3648) grad_norm 1.6854 (2.1155) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][300/1251] eta 0:03:53 lr 0.000830 wd 0.0500 time 0.2458 (0.2455) data time 0.0010 (0.0027) model time 0.2447 (0.2428) loss 3.4052 (3.3612) grad_norm 2.0056 (2.1146) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][310/1251] eta 0:03:50 lr 0.000830 wd 0.0500 time 0.2392 (0.2455) data time 0.0009 (0.0027) model time 0.2383 (0.2428) loss 2.2498 (3.3644) grad_norm 2.2869 (2.1108) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 10:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][320/1251] eta 0:03:48 lr 0.000830 wd 0.0500 time 0.2383 (0.2453) data time 0.0007 (0.0026) model time 0.2376 (0.2427) loss 4.0411 (3.3709) grad_norm 1.5668 (2.1005) loss_scale 4096.0000 (2086.2804) mem 7379MB [2024-08-26 10:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][330/1251] eta 0:03:46 lr 0.000830 wd 
0.0500 time 0.2372 (0.2459) data time 0.0011 (0.0026) model time 0.2361 (0.2434) loss 2.6975 (3.3685) grad_norm 2.2878 (2.0969) loss_scale 4096.0000 (2146.9970) mem 7379MB [2024-08-26 10:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][340/1251] eta 0:03:44 lr 0.000830 wd 0.0500 time 0.2405 (0.2463) data time 0.0010 (0.0025) model time 0.2395 (0.2439) loss 3.5047 (3.3619) grad_norm 1.4117 (2.1030) loss_scale 4096.0000 (2204.1525) mem 7379MB [2024-08-26 10:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][350/1251] eta 0:03:41 lr 0.000830 wd 0.0500 time 0.2393 (0.2462) data time 0.0010 (0.0025) model time 0.2384 (0.2438) loss 3.4961 (3.3682) grad_norm 2.4656 (2.1218) loss_scale 4096.0000 (2258.0513) mem 7379MB [2024-08-26 10:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][360/1251] eta 0:03:39 lr 0.000829 wd 0.0500 time 0.2413 (0.2461) data time 0.0007 (0.0024) model time 0.2405 (0.2438) loss 3.9589 (3.3713) grad_norm 2.1793 (2.1296) loss_scale 4096.0000 (2308.9640) mem 7379MB [2024-08-26 10:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][370/1251] eta 0:03:36 lr 0.000829 wd 0.0500 time 0.2396 (0.2460) data time 0.0007 (0.0024) model time 0.2389 (0.2437) loss 3.3935 (3.3730) grad_norm 2.3523 (2.1298) loss_scale 4096.0000 (2357.1321) mem 7379MB [2024-08-26 10:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][380/1251] eta 0:03:34 lr 0.000829 wd 0.0500 time 0.2373 (0.2459) data time 0.0012 (0.0024) model time 0.2361 (0.2437) loss 4.1935 (3.3774) grad_norm 1.6069 (2.1256) loss_scale 4096.0000 (2402.7717) mem 7379MB [2024-08-26 10:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][390/1251] eta 0:03:31 lr 0.000829 wd 0.0500 time 0.2418 (0.2459) data time 0.0011 (0.0023) model time 0.2407 (0.2436) loss 3.9042 (3.3813) grad_norm 1.6226 (2.1240) loss_scale 4096.0000 (2446.0767) mem 7379MB [2024-08-26 10:34:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][400/1251] eta 0:03:29 lr 0.000829 wd 0.0500 time 0.2409 (0.2457) data time 0.0011 (0.0023) model time 0.2397 (0.2435) loss 3.7516 (3.3830) grad_norm 1.8784 (2.1140) loss_scale 4096.0000 (2487.2219) mem 7379MB [2024-08-26 10:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][410/1251] eta 0:03:26 lr 0.000829 wd 0.0500 time 0.2501 (0.2457) data time 0.0008 (0.0023) model time 0.2493 (0.2435) loss 4.0559 (3.3802) grad_norm 2.0304 (2.1144) loss_scale 4096.0000 (2526.3650) mem 7379MB [2024-08-26 10:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][420/1251] eta 0:03:24 lr 0.000829 wd 0.0500 time 0.2430 (0.2456) data time 0.0009 (0.0022) model time 0.2421 (0.2435) loss 4.0318 (3.3823) grad_norm 2.1709 (2.1214) loss_scale 4096.0000 (2563.6485) mem 7379MB [2024-08-26 10:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][430/1251] eta 0:03:21 lr 0.000829 wd 0.0500 time 0.2370 (0.2455) data time 0.0011 (0.0022) model time 0.2358 (0.2434) loss 2.8951 (3.3799) grad_norm 2.0683 (2.1272) loss_scale 4096.0000 (2599.2019) mem 7379MB [2024-08-26 10:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][440/1251] eta 0:03:19 lr 0.000829 wd 0.0500 time 0.2461 (0.2454) data time 0.0010 (0.0022) model time 0.2451 (0.2433) loss 3.0602 (3.3684) grad_norm 1.4661 (2.1273) loss_scale 4096.0000 (2633.1429) mem 7379MB [2024-08-26 10:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][450/1251] eta 0:03:16 lr 0.000829 wd 0.0500 time 0.2398 (0.2458) data time 0.0009 (0.0022) model time 0.2389 (0.2437) loss 4.0664 (3.3695) grad_norm 1.6681 (2.1220) loss_scale 4096.0000 (2665.5787) mem 7379MB [2024-08-26 10:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][460/1251] eta 0:03:14 lr 0.000829 wd 0.0500 time 0.2364 (0.2461) data time 0.0009 (0.0021) model time 0.2355 (0.2441) loss 3.8532 
(3.3681) grad_norm 2.1258 (2.1226) loss_scale 4096.0000 (2696.6074) mem 7379MB [2024-08-26 10:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][470/1251] eta 0:03:12 lr 0.000829 wd 0.0500 time 0.2467 (0.2460) data time 0.0010 (0.0021) model time 0.2457 (0.2441) loss 2.7294 (3.3690) grad_norm 1.7802 (2.1229) loss_scale 4096.0000 (2726.3185) mem 7379MB [2024-08-26 10:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][480/1251] eta 0:03:09 lr 0.000829 wd 0.0500 time 0.2464 (0.2460) data time 0.0009 (0.0021) model time 0.2455 (0.2440) loss 2.8832 (3.3667) grad_norm 2.0514 (2.1180) loss_scale 4096.0000 (2754.7942) mem 7379MB [2024-08-26 10:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][490/1251] eta 0:03:07 lr 0.000829 wd 0.0500 time 0.2373 (0.2459) data time 0.0012 (0.0021) model time 0.2361 (0.2439) loss 3.0735 (3.3676) grad_norm 1.8523 (2.1126) loss_scale 4096.0000 (2782.1100) mem 7379MB [2024-08-26 10:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][500/1251] eta 0:03:04 lr 0.000829 wd 0.0500 time 0.2438 (0.2458) data time 0.0014 (0.0020) model time 0.2424 (0.2439) loss 3.3921 (3.3670) grad_norm 2.1355 (2.1121) loss_scale 4096.0000 (2808.3353) mem 7379MB [2024-08-26 10:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][510/1251] eta 0:03:02 lr 0.000829 wd 0.0500 time 0.2391 (0.2458) data time 0.0010 (0.0020) model time 0.2381 (0.2438) loss 4.0792 (3.3712) grad_norm 1.8101 (2.1095) loss_scale 4096.0000 (2833.5342) mem 7379MB [2024-08-26 10:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][520/1251] eta 0:02:59 lr 0.000829 wd 0.0500 time 0.2414 (0.2457) data time 0.0008 (0.0020) model time 0.2406 (0.2438) loss 3.8825 (3.3686) grad_norm 2.1176 (2.1079) loss_scale 4096.0000 (2857.7658) mem 7379MB [2024-08-26 10:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][530/1251] eta 0:02:57 lr 0.000829 wd 
0.0500 time 0.4241 (0.2460) data time 0.0011 (0.0020) model time 0.4230 (0.2441) loss 3.7795 (3.3705) grad_norm 1.9512 (2.1132) loss_scale 4096.0000 (2881.0847) mem 7379MB [2024-08-26 10:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][540/1251] eta 0:02:55 lr 0.000829 wd 0.0500 time 0.2348 (0.2463) data time 0.0010 (0.0020) model time 0.2339 (0.2445) loss 3.6512 (3.3703) grad_norm 2.8387 (2.1118) loss_scale 4096.0000 (2903.5416) mem 7379MB [2024-08-26 10:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][550/1251] eta 0:02:52 lr 0.000829 wd 0.0500 time 0.2409 (0.2462) data time 0.0007 (0.0019) model time 0.2402 (0.2444) loss 3.5669 (3.3671) grad_norm 2.0202 (2.1089) loss_scale 4096.0000 (2925.1833) mem 7379MB [2024-08-26 10:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][560/1251] eta 0:02:50 lr 0.000829 wd 0.0500 time 0.2335 (0.2462) data time 0.0011 (0.0019) model time 0.2324 (0.2444) loss 3.5262 (3.3644) grad_norm 1.6362 (2.1036) loss_scale 4096.0000 (2946.0535) mem 7379MB [2024-08-26 10:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][570/1251] eta 0:02:47 lr 0.000829 wd 0.0500 time 0.2547 (0.2461) data time 0.0009 (0.0019) model time 0.2538 (0.2443) loss 2.2967 (3.3650) grad_norm 2.0325 (2.1044) loss_scale 4096.0000 (2966.1926) mem 7379MB [2024-08-26 10:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][580/1251] eta 0:02:45 lr 0.000829 wd 0.0500 time 0.2317 (0.2465) data time 0.0010 (0.0019) model time 0.2307 (0.2447) loss 2.6548 (3.3617) grad_norm 1.5784 (2.1016) loss_scale 4096.0000 (2985.6386) mem 7379MB [2024-08-26 10:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][590/1251] eta 0:02:42 lr 0.000829 wd 0.0500 time 0.2351 (0.2464) data time 0.0009 (0.0019) model time 0.2342 (0.2446) loss 3.6807 (3.3611) grad_norm 1.9238 (2.1017) loss_scale 4096.0000 (3004.4264) mem 7379MB [2024-08-26 10:35:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][600/1251] eta 0:02:40 lr 0.000829 wd 0.0500 time 0.2438 (0.2463) data time 0.0009 (0.0019) model time 0.2429 (0.2445) loss 3.8941 (3.3661) grad_norm 2.4807 (2.1024) loss_scale 4096.0000 (3022.5890) mem 7379MB [2024-08-26 10:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][610/1251] eta 0:02:37 lr 0.000829 wd 0.0500 time 0.2424 (0.2462) data time 0.0010 (0.0019) model time 0.2415 (0.2444) loss 3.8269 (3.3670) grad_norm 2.1264 (2.1027) loss_scale 4096.0000 (3040.1571) mem 7379MB [2024-08-26 10:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][620/1251] eta 0:02:35 lr 0.000829 wd 0.0500 time 0.2376 (0.2461) data time 0.0008 (0.0018) model time 0.2368 (0.2444) loss 3.5756 (3.3683) grad_norm 1.4758 (2.1012) loss_scale 4096.0000 (3057.1594) mem 7379MB [2024-08-26 10:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][630/1251] eta 0:02:32 lr 0.000829 wd 0.0500 time 0.2428 (0.2460) data time 0.0009 (0.0018) model time 0.2419 (0.2443) loss 3.8485 (3.3732) grad_norm 2.4752 (2.1072) loss_scale 4096.0000 (3073.6228) mem 7379MB [2024-08-26 10:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][640/1251] eta 0:02:30 lr 0.000829 wd 0.0500 time 0.2401 (0.2459) data time 0.0008 (0.0018) model time 0.2393 (0.2442) loss 3.3771 (3.3736) grad_norm 1.4356 (2.1055) loss_scale 4096.0000 (3089.5725) mem 7379MB [2024-08-26 10:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][650/1251] eta 0:02:27 lr 0.000829 wd 0.0500 time 0.2484 (0.2459) data time 0.0007 (0.0018) model time 0.2477 (0.2442) loss 3.7295 (3.3780) grad_norm 2.2552 (2.1034) loss_scale 4096.0000 (3105.0323) mem 7379MB [2024-08-26 10:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][660/1251] eta 0:02:25 lr 0.000828 wd 0.0500 time 0.2378 (0.2458) data time 0.0012 (0.0018) model time 0.2366 (0.2441) loss 3.7594 
(3.3834) grad_norm 2.0350 (2.0994) loss_scale 4096.0000 (3120.0242) mem 7379MB [2024-08-26 10:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][670/1251] eta 0:02:22 lr 0.000828 wd 0.0500 time 0.2543 (0.2458) data time 0.0010 (0.0018) model time 0.2533 (0.2441) loss 3.7294 (3.3824) grad_norm 1.8206 (2.0943) loss_scale 4096.0000 (3134.5693) mem 7379MB [2024-08-26 10:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][680/1251] eta 0:02:20 lr 0.000828 wd 0.0500 time 0.2353 (0.2457) data time 0.0008 (0.0018) model time 0.2346 (0.2440) loss 3.0851 (3.3845) grad_norm 1.6967 (2.0928) loss_scale 4096.0000 (3148.6872) mem 7379MB [2024-08-26 10:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][690/1251] eta 0:02:17 lr 0.000828 wd 0.0500 time 0.2384 (0.2456) data time 0.0010 (0.0018) model time 0.2373 (0.2440) loss 3.2009 (3.3843) grad_norm 1.9282 (2.0903) loss_scale 4096.0000 (3162.3965) mem 7379MB [2024-08-26 10:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][700/1251] eta 0:02:15 lr 0.000828 wd 0.0500 time 0.2320 (0.2456) data time 0.0008 (0.0018) model time 0.2313 (0.2439) loss 3.2943 (3.3874) grad_norm 2.3624 (2.0911) loss_scale 4096.0000 (3175.7147) mem 7379MB [2024-08-26 10:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][710/1251] eta 0:02:12 lr 0.000828 wd 0.0500 time 0.2461 (0.2455) data time 0.0007 (0.0017) model time 0.2453 (0.2438) loss 3.1963 (3.3872) grad_norm 1.8794 (2.0920) loss_scale 4096.0000 (3188.6582) mem 7379MB [2024-08-26 10:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][720/1251] eta 0:02:10 lr 0.000828 wd 0.0500 time 0.2363 (0.2455) data time 0.0009 (0.0017) model time 0.2354 (0.2438) loss 3.5775 (3.3886) grad_norm 1.8023 (2.0914) loss_scale 4096.0000 (3201.2427) mem 7379MB [2024-08-26 10:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][730/1251] eta 0:02:07 lr 0.000828 wd 
0.0500 time 0.2432 (0.2454) data time 0.0011 (0.0017) model time 0.2421 (0.2437) loss 3.7483 (3.3919) grad_norm 2.3203 (2.0898) loss_scale 4096.0000 (3213.4829) mem 7379MB [2024-08-26 10:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][740/1251] eta 0:02:05 lr 0.000828 wd 0.0500 time 0.2376 (0.2454) data time 0.0010 (0.0017) model time 0.2367 (0.2437) loss 3.7378 (3.3919) grad_norm 2.8331 (2.1017) loss_scale 4096.0000 (3225.3927) mem 7379MB [2024-08-26 10:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][750/1251] eta 0:02:02 lr 0.000828 wd 0.0500 time 0.2547 (0.2453) data time 0.0009 (0.0017) model time 0.2538 (0.2437) loss 3.7212 (3.3956) grad_norm 2.1273 (2.1014) loss_scale 4096.0000 (3236.9854) mem 7379MB [2024-08-26 10:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][760/1251] eta 0:02:00 lr 0.000828 wd 0.0500 time 0.2419 (0.2454) data time 0.0010 (0.0017) model time 0.2409 (0.2437) loss 3.4087 (3.3925) grad_norm 2.4983 (2.0984) loss_scale 4096.0000 (3248.2733) mem 7379MB [2024-08-26 10:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][770/1251] eta 0:01:58 lr 0.000828 wd 0.0500 time 0.2415 (0.2453) data time 0.0010 (0.0017) model time 0.2405 (0.2437) loss 3.6760 (3.3937) grad_norm 1.8898 (2.0967) loss_scale 4096.0000 (3259.2685) mem 7379MB [2024-08-26 10:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][780/1251] eta 0:01:55 lr 0.000828 wd 0.0500 time 0.2364 (0.2453) data time 0.0008 (0.0017) model time 0.2356 (0.2436) loss 4.4109 (3.3926) grad_norm 1.9793 (2.0949) loss_scale 4096.0000 (3269.9821) mem 7379MB [2024-08-26 10:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][790/1251] eta 0:01:53 lr 0.000828 wd 0.0500 time 0.2453 (0.2453) data time 0.0010 (0.0017) model time 0.2444 (0.2436) loss 2.5444 (3.3921) grad_norm 1.9062 (2.0940) loss_scale 4096.0000 (3280.4248) mem 7379MB [2024-08-26 10:35:53 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][800/1251] eta 0:01:50 lr 0.000828 wd 0.0500 time 0.2431 (0.2453) data time 0.0010 (0.0017) model time 0.2421 (0.2436) loss 3.3709 (3.3946) grad_norm 2.0089 (2.0941) loss_scale 4096.0000 (3290.6067) mem 7379MB [2024-08-26 10:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][810/1251] eta 0:01:48 lr 0.000828 wd 0.0500 time 0.2463 (0.2452) data time 0.0007 (0.0017) model time 0.2456 (0.2436) loss 4.0609 (3.3932) grad_norm 1.9914 (2.0942) loss_scale 4096.0000 (3300.5376) mem 7379MB [2024-08-26 10:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][820/1251] eta 0:01:45 lr 0.000828 wd 0.0500 time 0.2419 (0.2452) data time 0.0010 (0.0017) model time 0.2409 (0.2435) loss 3.7403 (3.3958) grad_norm 2.2929 (2.0912) loss_scale 4096.0000 (3310.2266) mem 7379MB [2024-08-26 10:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][830/1251] eta 0:01:43 lr 0.000828 wd 0.0500 time 0.2413 (0.2454) data time 0.0011 (0.0017) model time 0.2402 (0.2438) loss 3.3779 (3.3963) grad_norm 1.5360 (2.0926) loss_scale 4096.0000 (3319.6823) mem 7379MB [2024-08-26 10:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][840/1251] eta 0:01:40 lr 0.000828 wd 0.0500 time 0.2315 (0.2454) data time 0.0009 (0.0017) model time 0.2306 (0.2437) loss 2.0203 (3.3974) grad_norm 2.1473 (2.0961) loss_scale 4096.0000 (3328.9132) mem 7379MB [2024-08-26 10:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][850/1251] eta 0:01:38 lr 0.000828 wd 0.0500 time 0.2391 (0.2453) data time 0.0008 (0.0017) model time 0.2384 (0.2437) loss 3.9330 (3.3969) grad_norm 1.4906 (2.0948) loss_scale 4096.0000 (3337.9271) mem 7379MB [2024-08-26 10:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][860/1251] eta 0:01:36 lr 0.000828 wd 0.0500 time 0.4624 (0.2456) data time 0.0010 (0.0017) model time 0.4613 (0.2440) loss 3.9418 
(3.3976) grad_norm 2.0142 (2.0917) loss_scale 4096.0000 (3346.7317) mem 7379MB [2024-08-26 10:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][870/1251] eta 0:01:33 lr 0.000828 wd 0.0500 time 0.2430 (0.2458) data time 0.0008 (0.0017) model time 0.2421 (0.2442) loss 3.2923 (3.3951) grad_norm 2.4363 (2.0931) loss_scale 4096.0000 (3355.3341) mem 7379MB [2024-08-26 10:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][880/1251] eta 0:01:31 lr 0.000828 wd 0.0500 time 0.2430 (0.2457) data time 0.0009 (0.0016) model time 0.2421 (0.2442) loss 3.1097 (3.3947) grad_norm 1.7348 (2.0938) loss_scale 4096.0000 (3363.7412) mem 7379MB [2024-08-26 10:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][890/1251] eta 0:01:28 lr 0.000828 wd 0.0500 time 0.2430 (0.2457) data time 0.0009 (0.0016) model time 0.2422 (0.2441) loss 3.7302 (3.3960) grad_norm 1.5447 (2.0917) loss_scale 4096.0000 (3371.9596) mem 7379MB [2024-08-26 10:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][900/1251] eta 0:01:26 lr 0.000828 wd 0.0500 time 0.2390 (0.2457) data time 0.0010 (0.0016) model time 0.2381 (0.2441) loss 3.6235 (3.4009) grad_norm 1.8332 (2.0891) loss_scale 4096.0000 (3379.9956) mem 7379MB [2024-08-26 10:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][910/1251] eta 0:01:23 lr 0.000828 wd 0.0500 time 0.2458 (0.2456) data time 0.0010 (0.0016) model time 0.2448 (0.2441) loss 3.1670 (3.3984) grad_norm 1.8407 (2.0883) loss_scale 4096.0000 (3387.8551) mem 7379MB [2024-08-26 10:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][920/1251] eta 0:01:21 lr 0.000828 wd 0.0500 time 0.2470 (0.2456) data time 0.0009 (0.0016) model time 0.2461 (0.2441) loss 2.5087 (3.3988) grad_norm 2.5683 (2.0896) loss_scale 4096.0000 (3395.5440) mem 7379MB [2024-08-26 10:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][930/1251] eta 0:01:18 lr 0.000828 wd 
0.0500 time 0.2392 (0.2456) data time 0.0010 (0.0016) model time 0.2382 (0.2440) loss 3.8320 (3.3998) grad_norm 3.4876 (2.0942) loss_scale 4096.0000 (3403.0677) mem 7379MB [2024-08-26 10:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][940/1251] eta 0:01:16 lr 0.000828 wd 0.0500 time 0.2404 (0.2456) data time 0.0009 (0.0016) model time 0.2396 (0.2440) loss 3.8921 (3.4037) grad_norm 1.8414 (2.0961) loss_scale 4096.0000 (3410.4315) mem 7379MB [2024-08-26 10:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][950/1251] eta 0:01:13 lr 0.000827 wd 0.0500 time 0.2480 (0.2456) data time 0.0010 (0.0016) model time 0.2470 (0.2440) loss 3.0431 (3.4022) grad_norm 2.2912 (2.0955) loss_scale 4096.0000 (3417.6404) mem 7379MB [2024-08-26 10:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][960/1251] eta 0:01:11 lr 0.000827 wd 0.0500 time 0.3934 (0.2457) data time 0.0009 (0.0016) model time 0.3925 (0.2442) loss 3.5180 (3.4021) grad_norm 3.3641 (2.1002) loss_scale 4096.0000 (3424.6993) mem 7379MB [2024-08-26 10:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][970/1251] eta 0:01:09 lr 0.000827 wd 0.0500 time 0.4318 (0.2458) data time 0.0007 (0.0016) model time 0.4311 (0.2443) loss 3.4018 (3.4030) grad_norm 1.7204 (2.0982) loss_scale 4096.0000 (3431.6128) mem 7379MB [2024-08-26 10:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][980/1251] eta 0:01:06 lr 0.000827 wd 0.0500 time 0.2415 (0.2461) data time 0.0007 (0.0016) model time 0.2408 (0.2445) loss 3.7358 (3.4037) grad_norm 2.4558 (2.0961) loss_scale 4096.0000 (3438.3853) mem 7379MB [2024-08-26 10:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][990/1251] eta 0:01:04 lr 0.000827 wd 0.0500 time 0.2480 (0.2460) data time 0.0007 (0.0016) model time 0.2473 (0.2445) loss 3.0077 (3.4020) grad_norm 2.0712 (2.0969) loss_scale 4096.0000 (3445.0212) mem 7379MB [2024-08-26 10:36:43 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1000/1251] eta 0:01:01 lr 0.000827 wd 0.0500 time 0.2420 (0.2460) data time 0.0010 (0.0016) model time 0.2411 (0.2445) loss 3.6124 (3.4034) grad_norm 2.2826 (2.0960) loss_scale 4096.0000 (3451.5245) mem 7379MB [2024-08-26 10:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1010/1251] eta 0:00:59 lr 0.000827 wd 0.0500 time 0.2409 (0.2459) data time 0.0011 (0.0016) model time 0.2398 (0.2444) loss 3.7821 (3.4029) grad_norm 1.4538 (2.0929) loss_scale 4096.0000 (3457.8991) mem 7379MB [2024-08-26 10:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1020/1251] eta 0:00:56 lr 0.000827 wd 0.0500 time 0.2433 (0.2459) data time 0.0009 (0.0016) model time 0.2423 (0.2444) loss 3.7066 (3.4055) grad_norm 1.9260 (2.0925) loss_scale 4096.0000 (3464.1489) mem 7379MB [2024-08-26 10:36:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1030/1251] eta 0:00:54 lr 0.000827 wd 0.0500 time 0.2398 (0.2458) data time 0.0010 (0.0016) model time 0.2388 (0.2443) loss 3.1320 (3.4041) grad_norm 2.6127 (2.0949) loss_scale 4096.0000 (3470.2774) mem 7379MB [2024-08-26 10:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1040/1251] eta 0:00:51 lr 0.000827 wd 0.0500 time 0.2408 (0.2460) data time 0.0008 (0.0016) model time 0.2400 (0.2445) loss 4.2467 (3.4056) grad_norm 3.1256 (2.0967) loss_scale 4096.0000 (3476.2882) mem 7379MB [2024-08-26 10:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1050/1251] eta 0:00:49 lr 0.000827 wd 0.0500 time 0.2408 (0.2460) data time 0.0012 (0.0016) model time 0.2397 (0.2445) loss 3.5168 (3.4056) grad_norm 2.2273 (2.0975) loss_scale 4096.0000 (3482.1846) mem 7379MB [2024-08-26 10:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1060/1251] eta 0:00:47 lr 0.000827 wd 0.0500 time 0.2417 (0.2461) data time 0.0009 (0.0016) model time 0.2408 (0.2446) loss 3.2588 
(3.4062) grad_norm 1.9246 (2.1014) loss_scale 4096.0000 (3487.9698) mem 7379MB [2024-08-26 10:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1070/1251] eta 0:00:44 lr 0.000827 wd 0.0500 time 0.2457 (0.2462) data time 0.0010 (0.0016) model time 0.2447 (0.2448) loss 2.7077 (3.4053) grad_norm 1.6621 (2.0981) loss_scale 4096.0000 (3493.6471) mem 7379MB [2024-08-26 10:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1080/1251] eta 0:00:42 lr 0.000827 wd 0.0500 time 0.2462 (0.2464) data time 0.0010 (0.0016) model time 0.2452 (0.2449) loss 3.2858 (3.4043) grad_norm 2.1916 (2.0949) loss_scale 4096.0000 (3499.2192) mem 7379MB [2024-08-26 10:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1090/1251] eta 0:00:39 lr 0.000827 wd 0.0500 time 0.2417 (0.2463) data time 0.0011 (0.0016) model time 0.2406 (0.2449) loss 3.8791 (3.4035) grad_norm 3.3702 (2.0941) loss_scale 4096.0000 (3504.6893) mem 7379MB [2024-08-26 10:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1100/1251] eta 0:00:37 lr 0.000827 wd 0.0500 time 0.2430 (0.2463) data time 0.0011 (0.0016) model time 0.2420 (0.2448) loss 3.7243 (3.4035) grad_norm 2.2927 (2.0956) loss_scale 4096.0000 (3510.0599) mem 7379MB [2024-08-26 10:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1110/1251] eta 0:00:34 lr 0.000827 wd 0.0500 time 0.2470 (0.2463) data time 0.0010 (0.0015) model time 0.2460 (0.2448) loss 4.0457 (3.4049) grad_norm 1.7931 (2.0957) loss_scale 4096.0000 (3515.3339) mem 7379MB [2024-08-26 10:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1120/1251] eta 0:00:32 lr 0.000827 wd 0.0500 time 0.2471 (0.2463) data time 0.0009 (0.0015) model time 0.2462 (0.2448) loss 3.4738 (3.4066) grad_norm 1.4486 (2.0965) loss_scale 4096.0000 (3520.5138) mem 7379MB [2024-08-26 10:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1130/1251] eta 0:00:29 lr 
0.000827 wd 0.0500 time 0.2396 (0.2462) data time 0.0010 (0.0015) model time 0.2386 (0.2448) loss 2.8095 (3.4046) grad_norm 2.1909 (2.0953) loss_scale 4096.0000 (3525.6021) mem 7379MB [2024-08-26 10:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1140/1251] eta 0:00:27 lr 0.000827 wd 0.0500 time 0.2432 (0.2462) data time 0.0007 (0.0015) model time 0.2425 (0.2447) loss 3.7687 (3.4054) grad_norm 1.9021 (2.0948) loss_scale 4096.0000 (3530.6012) mem 7379MB [2024-08-26 10:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1150/1251] eta 0:00:24 lr 0.000827 wd 0.0500 time 0.2458 (0.2461) data time 0.0007 (0.0015) model time 0.2451 (0.2447) loss 4.0788 (3.4056) grad_norm 1.7073 (2.0931) loss_scale 4096.0000 (3535.5135) mem 7379MB [2024-08-26 10:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1160/1251] eta 0:00:22 lr 0.000827 wd 0.0500 time 0.2394 (0.2461) data time 0.0010 (0.0015) model time 0.2383 (0.2447) loss 3.1632 (3.4063) grad_norm 2.6827 (2.0937) loss_scale 4096.0000 (3540.3411) mem 7379MB [2024-08-26 10:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1170/1251] eta 0:00:19 lr 0.000827 wd 0.0500 time 0.2355 (0.2461) data time 0.0013 (0.0015) model time 0.2342 (0.2446) loss 2.9878 (3.4054) grad_norm 2.0584 (2.0922) loss_scale 4096.0000 (3545.0863) mem 7379MB [2024-08-26 10:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1180/1251] eta 0:00:17 lr 0.000827 wd 0.0500 time 0.2375 (0.2460) data time 0.0009 (0.0015) model time 0.2366 (0.2446) loss 3.1368 (3.4049) grad_norm 1.7147 (2.0933) loss_scale 4096.0000 (3549.7511) mem 7379MB [2024-08-26 10:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1190/1251] eta 0:00:15 lr 0.000827 wd 0.0500 time 0.2358 (0.2460) data time 0.0007 (0.0015) model time 0.2351 (0.2445) loss 2.6053 (3.4045) grad_norm 1.9861 (2.0920) loss_scale 4096.0000 (3554.3375) mem 7379MB [2024-08-26 
10:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1200/1251] eta 0:00:12 lr 0.000827 wd 0.0500 time 0.2330 (0.2460) data time 0.0008 (0.0015) model time 0.2321 (0.2445) loss 3.4865 (3.4043) grad_norm 3.0213 (2.0909) loss_scale 4096.0000 (3558.8476) mem 7379MB [2024-08-26 10:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1210/1251] eta 0:00:10 lr 0.000827 wd 0.0500 time 0.2453 (0.2459) data time 0.0010 (0.0015) model time 0.2443 (0.2445) loss 3.7554 (3.4050) grad_norm 1.9372 (2.0899) loss_scale 4096.0000 (3563.2832) mem 7379MB [2024-08-26 10:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1220/1251] eta 0:00:07 lr 0.000827 wd 0.0500 time 0.2471 (0.2459) data time 0.0010 (0.0015) model time 0.2461 (0.2445) loss 3.2942 (3.4031) grad_norm 1.9884 (2.0945) loss_scale 4096.0000 (3567.6462) mem 7379MB [2024-08-26 10:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1230/1251] eta 0:00:05 lr 0.000827 wd 0.0500 time 0.2408 (0.2459) data time 0.0008 (0.0015) model time 0.2400 (0.2445) loss 3.3897 (3.4026) grad_norm 1.4296 (2.0936) loss_scale 4096.0000 (3571.9383) mem 7379MB [2024-08-26 10:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1240/1251] eta 0:00:02 lr 0.000827 wd 0.0500 time 0.2296 (0.2458) data time 0.0005 (0.0015) model time 0.2291 (0.2444) loss 3.8002 (3.4047) grad_norm 2.2536 (2.0930) loss_scale 4096.0000 (3576.1612) mem 7379MB [2024-08-26 10:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [96/300][1250/1251] eta 0:00:00 lr 0.000826 wd 0.0500 time 0.2242 (0.2457) data time 0.0007 (0.0015) model time 0.2235 (0.2442) loss 3.4510 (3.4053) grad_norm 1.6843 (2.0922) loss_scale 4096.0000 (3580.3165) mem 7379MB [2024-08-26 10:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 96 training takes 0:05:07 [2024-08-26 10:37:44 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:37:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 10:37:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.431 (0.431) Loss 0.5264 (0.5264) Acc@1 89.648 (89.648) Acc@5 98.535 (98.535) Mem 7379MB [2024-08-26 10:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.112) Loss 0.7671 (0.8033) Acc@1 84.570 (82.333) Acc@5 95.996 (96.307) Mem 7379MB [2024-08-26 10:37:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.095) Loss 1.2256 (0.8306) Acc@1 70.996 (81.427) Acc@5 91.992 (96.191) Mem 7379MB [2024-08-26 10:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.090) Loss 1.4678 (0.9467) Acc@1 65.625 (78.941) Acc@5 88.672 (94.758) Mem 7379MB [2024-08-26 10:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.3691 (1.0092) Acc@1 68.848 (77.389) Acc@5 90.039 (93.993) Mem 7379MB [2024-08-26 10:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.022 Acc@5 93.916 [2024-08-26 10:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.0% [2024-08-26 10:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.755 (0.755) Loss 0.4404 (0.4404) Acc@1 92.578 (92.578) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.143) Loss 0.7100 (0.6930) Acc@1 85.938 (85.147) Acc@5 95.996 (96.999) Mem 7379MB [2024-08-26 10:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.113) Loss 0.9878 (0.7165) Acc@1 76.660 (84.096) Acc@5 94.141 (96.959) Mem 7379MB [2024-08-26 10:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.079 (0.101) Loss 1.2783 (0.8177) Acc@1 67.773 (81.707) Acc@5 90.723 (95.782) Mem 7379MB [2024-08-26 10:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.1416 (0.8697) Acc@1 71.582 (80.212) Acc@5 93.164 (95.284) Mem 7379MB [2024-08-26 10:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.790 Acc@5 95.256 [2024-08-26 10:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.8% [2024-08-26 10:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.79% [2024-08-26 10:37:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:37:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 10:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][0/1251] eta 0:15:49 lr 0.000826 wd 0.0500 time 0.7591 (0.7591) data time 0.5356 (0.5356) model time 0.0000 (0.0000) loss 2.3734 (2.3734) grad_norm 2.2457 (2.2457) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][10/1251] eta 0:06:01 lr 0.000826 wd 0.0500 time 0.2404 (0.2914) data time 0.0008 (0.0496) model time 0.0000 (0.0000) loss 3.6026 (3.5024) grad_norm 2.0136 (2.0272) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][20/1251] eta 0:05:29 lr 0.000826 wd 0.0500 time 0.2415 (0.2678) data time 0.0009 (0.0265) model time 0.0000 (0.0000) loss 3.5302 (3.4589) grad_norm 1.8405 (2.1190) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][30/1251] eta 0:05:17 lr 0.000826 wd 0.0500 time 0.2589 (0.2600) data time 0.0008 (0.0182) 
model time 0.0000 (0.0000) loss 2.9513 (3.3587) grad_norm 1.6952 (2.0898) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][40/1251] eta 0:05:10 lr 0.000826 wd 0.0500 time 0.2436 (0.2561) data time 0.0010 (0.0143) model time 0.0000 (0.0000) loss 4.0568 (3.3880) grad_norm 2.2807 (2.1041) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][50/1251] eta 0:05:04 lr 0.000826 wd 0.0500 time 0.2291 (0.2533) data time 0.0007 (0.0117) model time 0.0000 (0.0000) loss 3.4042 (3.4330) grad_norm 2.2469 (2.0964) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][60/1251] eta 0:04:59 lr 0.000826 wd 0.0500 time 0.2488 (0.2515) data time 0.0007 (0.0100) model time 0.2481 (0.2411) loss 2.9847 (3.4173) grad_norm 1.7857 (2.1336) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][70/1251] eta 0:04:55 lr 0.000826 wd 0.0500 time 0.2449 (0.2499) data time 0.0011 (0.0087) model time 0.2438 (0.2402) loss 3.2780 (3.4060) grad_norm 1.8392 (2.1534) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][80/1251] eta 0:04:51 lr 0.000826 wd 0.0500 time 0.2373 (0.2491) data time 0.0011 (0.0078) model time 0.2362 (0.2407) loss 2.8076 (3.4024) grad_norm 1.7404 (2.1115) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][90/1251] eta 0:04:48 lr 0.000826 wd 0.0500 time 0.2465 (0.2487) data time 0.0011 (0.0071) model time 0.2454 (0.2416) loss 3.5182 (3.4280) grad_norm 2.2128 (2.0995) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[97/300][100/1251] eta 0:04:45 lr 0.000826 wd 0.0500 time 0.2438 (0.2478) data time 0.0012 (0.0065) model time 0.2426 (0.2410) loss 3.1393 (3.4217) grad_norm 2.0068 (2.1439) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][110/1251] eta 0:04:42 lr 0.000826 wd 0.0500 time 0.2439 (0.2475) data time 0.0010 (0.0060) model time 0.2430 (0.2413) loss 3.6289 (3.4185) grad_norm 2.1046 (2.1349) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][120/1251] eta 0:04:41 lr 0.000826 wd 0.0500 time 0.2404 (0.2490) data time 0.0013 (0.0056) model time 0.2392 (0.2446) loss 3.1902 (3.4092) grad_norm 1.4839 (2.1093) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][130/1251] eta 0:04:38 lr 0.000826 wd 0.0500 time 0.2377 (0.2484) data time 0.0011 (0.0053) model time 0.2366 (0.2442) loss 3.9387 (3.4078) grad_norm 2.7436 (2.1053) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][140/1251] eta 0:04:37 lr 0.000826 wd 0.0500 time 0.2370 (0.2495) data time 0.0009 (0.0050) model time 0.2361 (0.2461) loss 3.0008 (3.4072) grad_norm 1.8301 (2.1050) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][150/1251] eta 0:04:35 lr 0.000826 wd 0.0500 time 0.2397 (0.2503) data time 0.0011 (0.0047) model time 0.2386 (0.2476) loss 3.8254 (3.4027) grad_norm 1.7975 (2.0999) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][160/1251] eta 0:04:32 lr 0.000826 wd 0.0500 time 0.2364 (0.2497) data time 0.0011 (0.0045) model time 0.2353 (0.2469) loss 3.0006 (3.4052) grad_norm 1.3865 (2.0921) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][170/1251] eta 0:04:29 lr 0.000826 wd 0.0500 time 0.2405 (0.2493) data time 0.0011 (0.0043) model time 0.2394 (0.2464) loss 3.6220 (3.4080) grad_norm 1.6467 (2.0907) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][180/1251] eta 0:04:26 lr 0.000826 wd 0.0500 time 0.2392 (0.2487) data time 0.0008 (0.0041) model time 0.2384 (0.2458) loss 2.2548 (3.3988) grad_norm 2.0863 (2.0983) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][190/1251] eta 0:04:24 lr 0.000826 wd 0.0500 time 0.2498 (0.2494) data time 0.0007 (0.0040) model time 0.2491 (0.2468) loss 4.5034 (3.4040) grad_norm 2.0684 (2.0999) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][200/1251] eta 0:04:21 lr 0.000826 wd 0.0500 time 0.2398 (0.2489) data time 0.0010 (0.0038) model time 0.2388 (0.2463) loss 3.5690 (3.4025) grad_norm 1.6257 (2.1015) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][210/1251] eta 0:04:18 lr 0.000826 wd 0.0500 time 0.2363 (0.2485) data time 0.0010 (0.0037) model time 0.2353 (0.2459) loss 3.5654 (3.4008) grad_norm 1.9782 (2.0924) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][220/1251] eta 0:04:16 lr 0.000826 wd 0.0500 time 0.2414 (0.2484) data time 0.0009 (0.0036) model time 0.2405 (0.2458) loss 4.1033 (3.4107) grad_norm 3.0901 (2.1121) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][230/1251] eta 0:04:13 lr 0.000826 wd 0.0500 time 0.2388 (0.2484) data time 0.0008 (0.0036) 
model time 0.2381 (0.2457) loss 4.0031 (3.4041) grad_norm 1.7598 (2.1039) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][240/1251] eta 0:04:11 lr 0.000826 wd 0.0500 time 0.2466 (0.2489) data time 0.0008 (0.0035) model time 0.2457 (0.2464) loss 3.5822 (3.4087) grad_norm 1.5933 (2.1036) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][250/1251] eta 0:04:09 lr 0.000826 wd 0.0500 time 0.2531 (0.2493) data time 0.0009 (0.0034) model time 0.2522 (0.2470) loss 3.7119 (3.4049) grad_norm 1.6987 (2.0949) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][260/1251] eta 0:04:06 lr 0.000826 wd 0.0500 time 0.2443 (0.2491) data time 0.0008 (0.0033) model time 0.2435 (0.2468) loss 2.8298 (3.4003) grad_norm 2.8924 (2.0908) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][270/1251] eta 0:04:04 lr 0.000826 wd 0.0500 time 0.2530 (0.2489) data time 0.0010 (0.0033) model time 0.2520 (0.2465) loss 3.7327 (3.4062) grad_norm 2.0552 (2.0843) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][280/1251] eta 0:04:01 lr 0.000826 wd 0.0500 time 0.2511 (0.2487) data time 0.0008 (0.0032) model time 0.2503 (0.2463) loss 4.1305 (3.4192) grad_norm 2.3379 (2.0778) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][290/1251] eta 0:03:58 lr 0.000825 wd 0.0500 time 0.2512 (0.2485) data time 0.0010 (0.0031) model time 0.2502 (0.2461) loss 3.0445 (3.4218) grad_norm 2.3623 (2.0765) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[97/300][300/1251] eta 0:03:56 lr 0.000825 wd 0.0500 time 0.2460 (0.2483) data time 0.0008 (0.0030) model time 0.2452 (0.2460) loss 2.7045 (3.4166) grad_norm 3.1977 (2.0773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][310/1251] eta 0:03:54 lr 0.000825 wd 0.0500 time 0.2511 (0.2489) data time 0.0009 (0.0030) model time 0.2502 (0.2467) loss 3.6879 (3.4133) grad_norm 1.6332 (2.0840) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][320/1251] eta 0:03:51 lr 0.000825 wd 0.0500 time 0.2353 (0.2486) data time 0.0014 (0.0029) model time 0.2339 (0.2464) loss 3.9441 (3.4154) grad_norm 2.0662 (2.0888) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][330/1251] eta 0:03:48 lr 0.000825 wd 0.0500 time 0.2434 (0.2483) data time 0.0010 (0.0029) model time 0.2424 (0.2461) loss 3.9831 (3.4229) grad_norm 2.0375 (2.0876) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][340/1251] eta 0:03:46 lr 0.000825 wd 0.0500 time 0.2514 (0.2481) data time 0.0011 (0.0028) model time 0.2503 (0.2459) loss 3.5208 (3.4310) grad_norm 1.8955 (2.0903) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][350/1251] eta 0:03:43 lr 0.000825 wd 0.0500 time 0.2361 (0.2484) data time 0.0010 (0.0028) model time 0.2350 (0.2464) loss 4.2955 (3.4303) grad_norm 2.7273 (2.0826) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][360/1251] eta 0:03:41 lr 0.000825 wd 0.0500 time 0.2438 (0.2490) data time 0.0008 (0.0027) model time 0.2429 (0.2470) loss 2.4187 (3.4245) grad_norm 2.4025 (2.0862) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][370/1251] eta 0:03:39 lr 0.000825 wd 0.0500 time 0.2447 (0.2487) data time 0.0010 (0.0027) model time 0.2437 (0.2467) loss 3.6951 (3.4271) grad_norm 1.5328 (2.0868) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][380/1251] eta 0:03:36 lr 0.000825 wd 0.0500 time 0.2414 (0.2485) data time 0.0010 (0.0026) model time 0.2404 (0.2465) loss 2.9573 (3.4304) grad_norm 2.2799 (2.0840) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][390/1251] eta 0:03:33 lr 0.000825 wd 0.0500 time 0.2373 (0.2483) data time 0.0010 (0.0026) model time 0.2363 (0.2463) loss 2.7215 (3.4254) grad_norm 1.8291 (2.0827) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][400/1251] eta 0:03:31 lr 0.000825 wd 0.0500 time 0.2373 (0.2480) data time 0.0009 (0.0026) model time 0.2364 (0.2460) loss 3.4664 (3.4245) grad_norm 1.7024 (2.0801) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][410/1251] eta 0:03:28 lr 0.000825 wd 0.0500 time 0.2447 (0.2479) data time 0.0008 (0.0025) model time 0.2439 (0.2459) loss 4.0808 (3.4235) grad_norm 1.7490 (2.0767) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][420/1251] eta 0:03:25 lr 0.000825 wd 0.0500 time 0.2410 (0.2478) data time 0.0008 (0.0025) model time 0.2402 (0.2458) loss 3.8597 (3.4281) grad_norm 1.9860 (2.0777) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][430/1251] eta 0:03:23 lr 0.000825 wd 0.0500 time 0.2403 (0.2477) data time 0.0011 (0.0024) 
model time 0.2393 (0.2458) loss 2.5100 (3.4193) grad_norm 1.7800 (2.0864) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][440/1251] eta 0:03:20 lr 0.000825 wd 0.0500 time 0.2428 (0.2476) data time 0.0012 (0.0024) model time 0.2416 (0.2457) loss 3.1598 (3.4188) grad_norm 2.2593 (2.1009) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][450/1251] eta 0:03:18 lr 0.000825 wd 0.0500 time 0.2409 (0.2480) data time 0.0008 (0.0024) model time 0.2401 (0.2461) loss 3.9162 (3.4147) grad_norm 1.7290 (2.1015) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][460/1251] eta 0:03:16 lr 0.000825 wd 0.0500 time 0.2451 (0.2478) data time 0.0012 (0.0024) model time 0.2439 (0.2460) loss 3.5502 (3.4108) grad_norm 1.6755 (2.1005) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][470/1251] eta 0:03:13 lr 0.000825 wd 0.0500 time 0.2522 (0.2477) data time 0.0009 (0.0023) model time 0.2513 (0.2459) loss 3.8170 (3.4057) grad_norm 2.1676 (2.0986) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][480/1251] eta 0:03:10 lr 0.000825 wd 0.0500 time 0.2306 (0.2475) data time 0.0008 (0.0023) model time 0.2298 (0.2457) loss 3.5281 (3.3949) grad_norm 1.9296 (2.0935) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][490/1251] eta 0:03:08 lr 0.000825 wd 0.0500 time 0.2387 (0.2474) data time 0.0011 (0.0023) model time 0.2376 (0.2456) loss 2.7053 (3.3889) grad_norm 1.4469 (2.0879) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[97/300][500/1251] eta 0:03:05 lr 0.000825 wd 0.0500 time 0.2458 (0.2473) data time 0.0009 (0.0022) model time 0.2450 (0.2455) loss 3.4950 (3.3884) grad_norm 2.1129 (2.0868) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][510/1251] eta 0:03:03 lr 0.000825 wd 0.0500 time 0.2391 (0.2473) data time 0.0009 (0.0022) model time 0.2382 (0.2454) loss 4.4642 (3.3898) grad_norm 2.6257 (2.0869) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][520/1251] eta 0:03:00 lr 0.000825 wd 0.0500 time 0.2340 (0.2471) data time 0.0009 (0.0022) model time 0.2331 (0.2453) loss 3.7768 (3.3882) grad_norm 2.2946 (2.0821) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][530/1251] eta 0:02:58 lr 0.000825 wd 0.0500 time 0.2525 (0.2470) data time 0.0009 (0.0022) model time 0.2516 (0.2452) loss 3.6142 (3.3924) grad_norm 1.3158 (2.0759) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][540/1251] eta 0:02:55 lr 0.000825 wd 0.0500 time 0.2415 (0.2469) data time 0.0010 (0.0022) model time 0.2405 (0.2451) loss 2.2911 (3.3892) grad_norm 2.3081 (2.0714) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][550/1251] eta 0:02:53 lr 0.000825 wd 0.0500 time 0.2467 (0.2469) data time 0.0013 (0.0021) model time 0.2453 (0.2451) loss 3.9545 (3.3914) grad_norm 1.9267 (2.0724) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][560/1251] eta 0:02:50 lr 0.000825 wd 0.0500 time 0.2445 (0.2472) data time 0.0010 (0.0021) model time 0.2435 (0.2454) loss 3.9069 (3.3952) grad_norm 2.2453 (2.0702) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][570/1251] eta 0:02:48 lr 0.000825 wd 0.0500 time 0.2413 (0.2471) data time 0.0009 (0.0021) model time 0.2404 (0.2454) loss 4.0880 (3.4017) grad_norm 1.8813 (2.0696) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][580/1251] eta 0:02:45 lr 0.000825 wd 0.0500 time 0.2505 (0.2471) data time 0.0008 (0.0021) model time 0.2497 (0.2453) loss 3.6612 (3.4055) grad_norm 1.9913 (2.0708) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][590/1251] eta 0:02:43 lr 0.000824 wd 0.0500 time 0.2432 (0.2470) data time 0.0008 (0.0021) model time 0.2423 (0.2452) loss 3.0604 (3.4035) grad_norm 1.8388 (2.0696) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][600/1251] eta 0:02:40 lr 0.000824 wd 0.0500 time 0.2545 (0.2472) data time 0.0007 (0.0020) model time 0.2537 (0.2455) loss 3.4491 (3.4071) grad_norm 1.9739 (2.0707) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][610/1251] eta 0:02:38 lr 0.000824 wd 0.0500 time 0.2665 (0.2472) data time 0.0010 (0.0020) model time 0.2655 (0.2455) loss 2.7783 (3.4007) grad_norm 1.8370 (2.0700) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][620/1251] eta 0:02:35 lr 0.000824 wd 0.0500 time 0.2446 (0.2471) data time 0.0010 (0.0020) model time 0.2436 (0.2454) loss 3.6874 (3.3969) grad_norm 3.1847 (2.0764) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][630/1251] eta 0:02:33 lr 0.000824 wd 0.0500 time 0.2410 (0.2471) data time 0.0007 (0.0020) 
model time 0.2403 (0.2454) loss 3.4577 (3.3947) grad_norm 2.1842 (2.0740) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][640/1251] eta 0:02:30 lr 0.000824 wd 0.0500 time 0.2429 (0.2470) data time 0.0007 (0.0020) model time 0.2421 (0.2453) loss 4.0818 (3.3958) grad_norm 2.4059 (2.0721) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][650/1251] eta 0:02:28 lr 0.000824 wd 0.0500 time 0.2441 (0.2469) data time 0.0009 (0.0020) model time 0.2432 (0.2452) loss 3.9040 (3.3960) grad_norm 1.7508 (2.0736) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][660/1251] eta 0:02:26 lr 0.000824 wd 0.0500 time 0.2476 (0.2475) data time 0.0007 (0.0020) model time 0.2469 (0.2458) loss 2.6909 (3.3996) grad_norm 2.4074 (2.0758) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][670/1251] eta 0:02:23 lr 0.000824 wd 0.0500 time 0.2413 (0.2474) data time 0.0007 (0.0019) model time 0.2405 (0.2457) loss 4.3315 (3.3984) grad_norm 2.2015 (2.0798) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][680/1251] eta 0:02:21 lr 0.000824 wd 0.0500 time 0.2625 (0.2474) data time 0.0015 (0.0019) model time 0.2610 (0.2458) loss 3.3929 (3.3999) grad_norm 1.6388 (2.0777) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][690/1251] eta 0:02:18 lr 0.000824 wd 0.0500 time 0.2474 (0.2473) data time 0.0007 (0.0019) model time 0.2467 (0.2457) loss 4.3172 (3.4002) grad_norm 2.3277 (2.0794) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[97/300][700/1251] eta 0:02:16 lr 0.000824 wd 0.0500 time 0.2388 (0.2473) data time 0.0007 (0.0019) model time 0.2381 (0.2457) loss 2.2478 (3.3965) grad_norm 1.8878 (2.0838) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][710/1251] eta 0:02:13 lr 0.000824 wd 0.0500 time 0.2370 (0.2472) data time 0.0009 (0.0019) model time 0.2361 (0.2456) loss 4.1509 (3.3993) grad_norm 6.0356 (2.0920) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][720/1251] eta 0:02:11 lr 0.000824 wd 0.0500 time 0.2468 (0.2471) data time 0.0007 (0.0019) model time 0.2461 (0.2455) loss 4.0591 (3.4026) grad_norm 2.1705 (2.0961) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][730/1251] eta 0:02:08 lr 0.000824 wd 0.0500 time 0.2435 (0.2473) data time 0.0011 (0.0019) model time 0.2424 (0.2457) loss 3.4204 (3.3985) grad_norm 1.8598 (2.0948) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][740/1251] eta 0:02:06 lr 0.000824 wd 0.0500 time 0.2420 (0.2473) data time 0.0012 (0.0019) model time 0.2407 (0.2457) loss 2.7359 (3.3962) grad_norm 2.3342 (2.0982) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][750/1251] eta 0:02:03 lr 0.000824 wd 0.0500 time 0.2412 (0.2473) data time 0.0012 (0.0018) model time 0.2400 (0.2457) loss 3.5741 (3.3917) grad_norm 2.2836 (2.1011) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][760/1251] eta 0:02:01 lr 0.000824 wd 0.0500 time 0.2423 (0.2474) data time 0.0012 (0.0018) model time 0.2411 (0.2459) loss 3.3589 (3.3929) grad_norm 2.0349 (2.0990) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:41:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][770/1251] eta 0:01:59 lr 0.000824 wd 0.0500 time 0.2497 (0.2474) data time 0.0010 (0.0018) model time 0.2487 (0.2459) loss 3.7956 (3.3912) grad_norm 2.1394 (2.0972) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][780/1251] eta 0:01:56 lr 0.000824 wd 0.0500 time 0.2449 (0.2477) data time 0.0010 (0.0018) model time 0.2439 (0.2461) loss 3.4361 (3.3932) grad_norm 2.2971 (2.0952) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][790/1251] eta 0:01:54 lr 0.000824 wd 0.0500 time 0.2464 (0.2476) data time 0.0007 (0.0018) model time 0.2456 (0.2461) loss 2.9877 (3.3902) grad_norm 2.6470 (2.0996) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][800/1251] eta 0:01:51 lr 0.000824 wd 0.0500 time 0.2458 (0.2476) data time 0.0010 (0.0018) model time 0.2447 (0.2461) loss 3.6930 (3.3898) grad_norm 2.2887 (2.0980) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][810/1251] eta 0:01:49 lr 0.000824 wd 0.0500 time 0.2386 (0.2475) data time 0.0009 (0.0018) model time 0.2377 (0.2460) loss 2.6847 (3.3868) grad_norm 2.1614 (2.0969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][820/1251] eta 0:01:46 lr 0.000824 wd 0.0500 time 0.2413 (0.2474) data time 0.0008 (0.0018) model time 0.2405 (0.2459) loss 4.4465 (3.3861) grad_norm 1.6251 (2.0949) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][830/1251] eta 0:01:44 lr 0.000824 wd 0.0500 time 0.2478 (0.2473) data time 0.0010 (0.0018) 
model time 0.2468 (0.2458) loss 2.5845 (3.3857) grad_norm 2.2167 (2.0943) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][840/1251] eta 0:01:41 lr 0.000824 wd 0.0500 time 0.2363 (0.2473) data time 0.0008 (0.0018) model time 0.2355 (0.2458) loss 4.1001 (3.3892) grad_norm 2.4616 (2.0915) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][850/1251] eta 0:01:39 lr 0.000824 wd 0.0500 time 0.2450 (0.2472) data time 0.0008 (0.0018) model time 0.2443 (0.2457) loss 2.6134 (3.3872) grad_norm 1.4540 (2.0896) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][860/1251] eta 0:01:36 lr 0.000824 wd 0.0500 time 0.2433 (0.2471) data time 0.0010 (0.0017) model time 0.2423 (0.2456) loss 3.1202 (3.3875) grad_norm 1.5150 (2.0939) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][870/1251] eta 0:01:34 lr 0.000824 wd 0.0500 time 0.3763 (0.2474) data time 0.0010 (0.0017) model time 0.3753 (0.2459) loss 3.3399 (3.3872) grad_norm 2.3891 (2.1013) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][880/1251] eta 0:01:31 lr 0.000823 wd 0.0500 time 0.2398 (0.2475) data time 0.0008 (0.0017) model time 0.2390 (0.2460) loss 2.4812 (3.3857) grad_norm 2.1503 (2.1027) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][890/1251] eta 0:01:29 lr 0.000823 wd 0.0500 time 0.2415 (0.2475) data time 0.0011 (0.0017) model time 0.2404 (0.2460) loss 3.8674 (3.3842) grad_norm 1.7574 (2.1027) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[97/300][900/1251] eta 0:01:26 lr 0.000823 wd 0.0500 time 0.2332 (0.2474) data time 0.0009 (0.0017) model time 0.2324 (0.2459) loss 2.1807 (3.3821) grad_norm 2.4787 (2.1035) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][910/1251] eta 0:01:24 lr 0.000823 wd 0.0500 time 0.2391 (0.2473) data time 0.0008 (0.0017) model time 0.2383 (0.2458) loss 3.3510 (3.3829) grad_norm 2.9375 (2.1109) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][920/1251] eta 0:01:21 lr 0.000823 wd 0.0500 time 0.2362 (0.2472) data time 0.0009 (0.0017) model time 0.2353 (0.2457) loss 3.3412 (3.3852) grad_norm 2.4375 (2.1119) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][930/1251] eta 0:01:19 lr 0.000823 wd 0.0500 time 0.2394 (0.2471) data time 0.0011 (0.0017) model time 0.2383 (0.2457) loss 3.7632 (3.3893) grad_norm 2.2428 (2.1158) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][940/1251] eta 0:01:16 lr 0.000823 wd 0.0500 time 0.2424 (0.2471) data time 0.0010 (0.0017) model time 0.2414 (0.2456) loss 3.2452 (3.3882) grad_norm 1.3511 (2.1160) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][950/1251] eta 0:01:14 lr 0.000823 wd 0.0500 time 0.2390 (0.2470) data time 0.0010 (0.0017) model time 0.2380 (0.2455) loss 3.7448 (3.3878) grad_norm 2.0933 (2.1136) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][960/1251] eta 0:01:11 lr 0.000823 wd 0.0500 time 0.2474 (0.2469) data time 0.0007 (0.0017) model time 0.2467 (0.2455) loss 3.8344 (3.3878) grad_norm 2.9059 (2.1156) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][970/1251] eta 0:01:09 lr 0.000823 wd 0.0500 time 0.2444 (0.2469) data time 0.0011 (0.0017) model time 0.2433 (0.2454) loss 3.5933 (3.3884) grad_norm 2.4163 (2.1130) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][980/1251] eta 0:01:06 lr 0.000823 wd 0.0500 time 0.2437 (0.2468) data time 0.0011 (0.0017) model time 0.2426 (0.2453) loss 2.8531 (3.3900) grad_norm 3.0224 (2.1154) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][990/1251] eta 0:01:04 lr 0.000823 wd 0.0500 time 0.2434 (0.2470) data time 0.0007 (0.0016) model time 0.2427 (0.2455) loss 3.5252 (3.3903) grad_norm 1.9272 (2.1152) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1000/1251] eta 0:01:01 lr 0.000823 wd 0.0500 time 0.2434 (0.2469) data time 0.0010 (0.0016) model time 0.2424 (0.2455) loss 3.8433 (3.3904) grad_norm 1.6798 (2.1123) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1010/1251] eta 0:00:59 lr 0.000823 wd 0.0500 time 0.2386 (0.2468) data time 0.0009 (0.0016) model time 0.2377 (0.2454) loss 3.2703 (3.3895) grad_norm 2.1279 (2.1173) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1020/1251] eta 0:00:57 lr 0.000823 wd 0.0500 time 0.2443 (0.2468) data time 0.0010 (0.0016) model time 0.2433 (0.2454) loss 3.3361 (3.3897) grad_norm 2.2261 (2.1178) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1030/1251] eta 0:00:54 lr 0.000823 wd 0.0500 time 0.2333 (0.2468) data time 0.0012 
(0.0016) model time 0.2322 (0.2453) loss 3.6168 (3.3886) grad_norm 1.6767 (2.1152) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1040/1251] eta 0:00:52 lr 0.000823 wd 0.0500 time 0.2390 (0.2467) data time 0.0007 (0.0016) model time 0.2382 (0.2453) loss 3.7804 (3.3890) grad_norm 1.7442 (2.1142) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1050/1251] eta 0:00:49 lr 0.000823 wd 0.0500 time 0.2448 (0.2467) data time 0.0009 (0.0016) model time 0.2439 (0.2452) loss 3.2940 (3.3901) grad_norm 1.8900 (2.1114) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1060/1251] eta 0:00:47 lr 0.000823 wd 0.0500 time 0.2477 (0.2466) data time 0.0010 (0.0016) model time 0.2467 (0.2452) loss 3.6672 (3.3921) grad_norm 2.2528 (2.1113) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1070/1251] eta 0:00:44 lr 0.000823 wd 0.0500 time 0.2442 (0.2466) data time 0.0009 (0.0016) model time 0.2433 (0.2451) loss 3.3267 (3.3911) grad_norm 1.9394 (inf) loss_scale 4096.0000 (4115.1223) mem 7379MB [2024-08-26 10:42:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1080/1251] eta 0:00:42 lr 0.000823 wd 0.0500 time 0.2328 (0.2467) data time 0.0012 (0.0016) model time 0.2317 (0.2453) loss 2.5912 (3.3897) grad_norm 2.0562 (inf) loss_scale 4096.0000 (4114.9454) mem 7379MB [2024-08-26 10:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1090/1251] eta 0:00:39 lr 0.000823 wd 0.0500 time 0.2374 (0.2466) data time 0.0010 (0.0016) model time 0.2365 (0.2452) loss 3.4647 (3.3897) grad_norm 1.3859 (inf) loss_scale 4096.0000 (4114.7718) mem 7379MB [2024-08-26 10:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[97/300][1100/1251] eta 0:00:37 lr 0.000823 wd 0.0500 time 0.2347 (0.2468) data time 0.0013 (0.0016) model time 0.2334 (0.2454) loss 3.3100 (3.3918) grad_norm 1.6682 (inf) loss_scale 4096.0000 (4114.6013) mem 7379MB [2024-08-26 10:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1110/1251] eta 0:00:34 lr 0.000823 wd 0.0500 time 0.2435 (0.2468) data time 0.0011 (0.0016) model time 0.2423 (0.2454) loss 3.1054 (3.3868) grad_norm 1.7637 (inf) loss_scale 4096.0000 (4114.4338) mem 7379MB [2024-08-26 10:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1120/1251] eta 0:00:32 lr 0.000823 wd 0.0500 time 0.2409 (0.2467) data time 0.0010 (0.0016) model time 0.2399 (0.2453) loss 3.3228 (3.3860) grad_norm 1.8505 (inf) loss_scale 4096.0000 (4114.2694) mem 7379MB [2024-08-26 10:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1130/1251] eta 0:00:29 lr 0.000823 wd 0.0500 time 0.2359 (0.2468) data time 0.0010 (0.0016) model time 0.2349 (0.2454) loss 3.5034 (3.3861) grad_norm 1.8920 (inf) loss_scale 4096.0000 (4114.1079) mem 7379MB [2024-08-26 10:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1140/1251] eta 0:00:27 lr 0.000823 wd 0.0500 time 0.2437 (0.2468) data time 0.0011 (0.0016) model time 0.2426 (0.2454) loss 3.6487 (3.3876) grad_norm 2.5637 (inf) loss_scale 4096.0000 (4113.9492) mem 7379MB [2024-08-26 10:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1150/1251] eta 0:00:24 lr 0.000823 wd 0.0500 time 0.2340 (0.2468) data time 0.0009 (0.0016) model time 0.2330 (0.2454) loss 3.3952 (3.3883) grad_norm 1.7823 (inf) loss_scale 4096.0000 (4113.7932) mem 7379MB [2024-08-26 10:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1160/1251] eta 0:00:22 lr 0.000823 wd 0.0500 time 0.2429 (0.2468) data time 0.0010 (0.0016) model time 0.2419 (0.2454) loss 3.3796 (3.3848) grad_norm 1.7988 (inf) loss_scale 4096.0000 (4113.6400) mem 7379MB 
[2024-08-26 10:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1170/1251] eta 0:00:19 lr 0.000823 wd 0.0500 time 0.2381 (0.2467) data time 0.0010 (0.0016) model time 0.2371 (0.2453) loss 3.7660 (3.3851) grad_norm 1.9011 (inf) loss_scale 4096.0000 (4113.4893) mem 7379MB [2024-08-26 10:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1180/1251] eta 0:00:17 lr 0.000822 wd 0.0500 time 0.2371 (0.2470) data time 0.0012 (0.0016) model time 0.2359 (0.2456) loss 2.8902 (3.3842) grad_norm 1.8787 (inf) loss_scale 4096.0000 (4113.3412) mem 7379MB [2024-08-26 10:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1190/1251] eta 0:00:15 lr 0.000822 wd 0.0500 time 0.2457 (0.2469) data time 0.0007 (0.0016) model time 0.2451 (0.2455) loss 2.7765 (3.3832) grad_norm 1.6506 (inf) loss_scale 4096.0000 (4113.1956) mem 7379MB [2024-08-26 10:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1200/1251] eta 0:00:12 lr 0.000822 wd 0.0500 time 0.2443 (0.2469) data time 0.0010 (0.0015) model time 0.2432 (0.2455) loss 3.5860 (3.3834) grad_norm 2.0970 (inf) loss_scale 4096.0000 (4113.0525) mem 7379MB [2024-08-26 10:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1210/1251] eta 0:00:10 lr 0.000822 wd 0.0500 time 0.2370 (0.2468) data time 0.0011 (0.0015) model time 0.2359 (0.2454) loss 3.6461 (3.3850) grad_norm 1.5948 (inf) loss_scale 4096.0000 (4112.9116) mem 7379MB [2024-08-26 10:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1220/1251] eta 0:00:07 lr 0.000822 wd 0.0500 time 0.2361 (0.2468) data time 0.0010 (0.0015) model time 0.2351 (0.2454) loss 4.1690 (3.3846) grad_norm 1.8959 (inf) loss_scale 4096.0000 (4112.7731) mem 7379MB [2024-08-26 10:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1230/1251] eta 0:00:05 lr 0.000822 wd 0.0500 time 0.2501 (0.2468) data time 0.0010 (0.0015) model time 0.2492 (0.2454) loss 
2.8537 (3.3849) grad_norm 2.3213 (inf) loss_scale 4096.0000 (4112.6369) mem 7379MB
[2024-08-26 10:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1240/1251] eta 0:00:02 lr 0.000822 wd 0.0500 time 0.2244 (0.2468) data time 0.0005 (0.0015) model time 0.2239 (0.2454) loss 4.2329 (3.3879) grad_norm 3.2363 (inf) loss_scale 4096.0000 (4112.5028) mem 7379MB
[2024-08-26 10:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [97/300][1250/1251] eta 0:00:00 lr 0.000822 wd 0.0500 time 0.2260 (0.2468) data time 0.0005 (0.0015) model time 0.2256 (0.2454) loss 2.7369 (3.3881) grad_norm 2.0193 (inf) loss_scale 4096.0000 (4112.3709) mem 7379MB
[2024-08-26 10:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 97 training takes 0:05:08
[2024-08-26 10:43:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 10:43:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 10:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.462 (0.462) Loss 0.5225 (0.5225) Acc@1 90.039 (90.039) Acc@5 98.047 (98.047) Mem 7379MB
[2024-08-26 10:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.111) Loss 0.8389 (0.7986) Acc@1 82.031 (82.662) Acc@5 96.289 (96.271) Mem 7379MB
[2024-08-26 10:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.095) Loss 1.1621 (0.8143) Acc@1 71.777 (81.789) Acc@5 92.480 (96.229) Mem 7379MB
[2024-08-26 10:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.090) Loss 1.4521 (0.9314) Acc@1 65.430 (79.265) Acc@5 88.770 (94.790) Mem 7379MB
[2024-08-26 10:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3008 (0.9945) Acc@1 69.824 (77.711) Acc@5 90.625 (94.122) Mem 7379MB
[2024-08-26 10:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.322 Acc@5 94.060
[2024-08-26 10:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.3%
[2024-08-26 10:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.32%
[2024-08-26 10:43:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 10:43:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 10:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.432 (0.432) Loss 0.4399 (0.4399) Acc@1 92.480 (92.480) Acc@5 98.340 (98.340) Mem 7379MB
[2024-08-26 10:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.112) Loss 0.7104 (0.6920) Acc@1 85.938 (85.130) Acc@5 96.289 (97.035) Mem 7379MB
[2024-08-26 10:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.096) Loss 0.9873 (0.7152) Acc@1 76.074 (84.101) Acc@5 94.434 (97.005) Mem 7379MB
[2024-08-26 10:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.090) Loss 1.2764 (0.8162) Acc@1 67.676 (81.732) Acc@5 91.113 (95.832) Mem 7379MB
[2024-08-26 10:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1387 (0.8683) Acc@1 71.191 (80.223) Acc@5 93.359 (95.320) Mem 7379MB
[2024-08-26 10:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.800 Acc@5 95.282
[2024-08-26 10:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.8%
[2024-08-26 10:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.80%
[2024-08-26 10:43:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 10:43:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 10:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][0/1251] eta 0:13:49 lr 0.000822 wd 0.0500 time 0.6634 (0.6634) data time 0.4317 (0.4317) model time 0.0000 (0.0000) loss 3.2239 (3.2239) grad_norm 2.3130 (2.3130) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][10/1251] eta 0:05:49 lr 0.000822 wd 0.0500 time 0.2493 (0.2815) data time 0.0009 (0.0404) model time 0.0000 (0.0000) loss 2.6604 (3.5085) grad_norm 1.7807 (1.8273) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][20/1251] eta 0:05:25 lr 0.000822 wd 0.0500 time 0.2393 (0.2644) data time 0.0010 (0.0216) model time 0.0000 (0.0000) loss 3.0435 (3.5516) grad_norm 1.6670 (1.8903) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][30/1251] eta 0:05:14 lr 0.000822 wd 0.0500 time 0.2471 (0.2580) data time 0.0010 (0.0150) model time 0.0000 (0.0000) loss 2.5632 (3.5226) grad_norm 3.9521 (2.0210) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][40/1251] eta 0:05:08 lr 0.000822 wd 0.0500 time 0.2450 (0.2546) data time 0.0009 (0.0119) model time 0.0000 (0.0000) loss 2.3618 (3.4888) grad_norm 2.4103 (2.1262) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][50/1251] eta 0:05:03 lr 0.000822 wd 0.0500 time 0.2429 (0.2525) data time 0.0011 (0.0097) model time 0.0000 (0.0000) loss 3.8967 (3.5134) grad_norm 1.7876 (2.1375) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][60/1251] eta 0:04:58 lr 0.000822 wd 0.0500 time 0.2432 (0.2509) data time 0.0007 (0.0083) model time 0.2424 (0.2417) loss 
3.5247 (3.5036) grad_norm 2.3077 (2.1565) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][70/1251] eta 0:04:54 lr 0.000822 wd 0.0500 time 0.2459 (0.2497) data time 0.0010 (0.0073) model time 0.2449 (0.2415) loss 2.3489 (3.4772) grad_norm 2.2130 (2.1300) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][80/1251] eta 0:04:51 lr 0.000822 wd 0.0500 time 0.2392 (0.2486) data time 0.0011 (0.0065) model time 0.2382 (0.2410) loss 3.3618 (3.4736) grad_norm 2.0812 (2.1298) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][90/1251] eta 0:04:47 lr 0.000822 wd 0.0500 time 0.2403 (0.2478) data time 0.0007 (0.0059) model time 0.2395 (0.2408) loss 2.6159 (3.4688) grad_norm 1.4153 (2.1121) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][100/1251] eta 0:04:44 lr 0.000822 wd 0.0500 time 0.2426 (0.2474) data time 0.0011 (0.0054) model time 0.2415 (0.2412) loss 3.5512 (3.4916) grad_norm 1.5033 (2.0943) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][110/1251] eta 0:04:41 lr 0.000822 wd 0.0500 time 0.2467 (0.2470) data time 0.0007 (0.0050) model time 0.2460 (0.2413) loss 2.5798 (3.4866) grad_norm 1.7145 (2.0893) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][120/1251] eta 0:04:38 lr 0.000822 wd 0.0500 time 0.2453 (0.2466) data time 0.0007 (0.0047) model time 0.2446 (0.2413) loss 3.7807 (3.4697) grad_norm 1.5108 (2.0758) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][130/1251] eta 0:04:35 lr 
0.000822 wd 0.0500 time 0.2356 (0.2461) data time 0.0010 (0.0044) model time 0.2347 (0.2409) loss 3.8745 (3.4625) grad_norm 2.1546 (2.0634) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][140/1251] eta 0:04:34 lr 0.000822 wd 0.0500 time 0.2396 (0.2468) data time 0.0008 (0.0042) model time 0.2388 (0.2425) loss 3.0152 (3.4659) grad_norm 2.5619 (2.0533) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][150/1251] eta 0:04:33 lr 0.000822 wd 0.0500 time 0.2452 (0.2485) data time 0.0008 (0.0040) model time 0.2444 (0.2454) loss 2.6166 (3.4376) grad_norm 2.6337 (2.0648) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][160/1251] eta 0:04:30 lr 0.000822 wd 0.0500 time 0.2465 (0.2482) data time 0.0008 (0.0038) model time 0.2456 (0.2452) loss 4.1173 (3.4436) grad_norm 1.6820 (2.0590) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][170/1251] eta 0:04:27 lr 0.000822 wd 0.0500 time 0.2368 (0.2478) data time 0.0010 (0.0036) model time 0.2358 (0.2448) loss 3.8734 (3.4625) grad_norm 2.5296 (2.0649) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][180/1251] eta 0:04:25 lr 0.000822 wd 0.0500 time 0.2537 (0.2475) data time 0.0008 (0.0035) model time 0.2529 (0.2446) loss 4.2132 (3.4582) grad_norm 1.7837 (2.0606) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][190/1251] eta 0:04:22 lr 0.000822 wd 0.0500 time 0.2477 (0.2471) data time 0.0007 (0.0033) model time 0.2470 (0.2442) loss 2.8706 (3.4502) grad_norm 2.1589 (2.0550) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][200/1251] eta 0:04:19 lr 0.000822 wd 0.0500 time 0.2392 (0.2468) data time 0.0009 (0.0032) model time 0.2383 (0.2438) loss 3.2255 (3.4475) grad_norm 1.8296 (2.0470) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][210/1251] eta 0:04:16 lr 0.000822 wd 0.0500 time 0.2454 (0.2468) data time 0.0009 (0.0031) model time 0.2444 (0.2439) loss 3.4022 (3.4445) grad_norm 1.9587 (2.0578) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][220/1251] eta 0:04:14 lr 0.000821 wd 0.0500 time 0.2369 (0.2465) data time 0.0008 (0.0030) model time 0.2361 (0.2437) loss 3.6083 (3.4463) grad_norm 1.5924 (2.0597) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][230/1251] eta 0:04:11 lr 0.000821 wd 0.0500 time 0.2476 (0.2462) data time 0.0010 (0.0029) model time 0.2466 (0.2434) loss 3.6856 (3.4456) grad_norm 1.9280 (2.0498) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][240/1251] eta 0:04:08 lr 0.000821 wd 0.0500 time 0.2408 (0.2460) data time 0.0011 (0.0029) model time 0.2397 (0.2432) loss 3.7346 (3.4374) grad_norm 2.9495 (2.0576) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][250/1251] eta 0:04:06 lr 0.000821 wd 0.0500 time 0.2342 (0.2465) data time 0.0010 (0.0028) model time 0.2331 (0.2440) loss 3.6618 (3.4415) grad_norm 1.8114 (2.0573) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][260/1251] eta 0:04:05 lr 0.000821 wd 0.0500 time 0.2492 (0.2479) data time 0.0010 (0.0027) model time 0.2482 (0.2458) loss 3.0070 
(3.4396) grad_norm 3.3516 (2.0779) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][270/1251] eta 0:04:02 lr 0.000821 wd 0.0500 time 0.2446 (0.2476) data time 0.0011 (0.0027) model time 0.2435 (0.2454) loss 3.2471 (3.4408) grad_norm 1.6296 (2.0823) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][280/1251] eta 0:04:00 lr 0.000821 wd 0.0500 time 0.2439 (0.2475) data time 0.0009 (0.0026) model time 0.2430 (0.2453) loss 3.9347 (3.4405) grad_norm 1.6684 (2.0693) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][290/1251] eta 0:03:57 lr 0.000821 wd 0.0500 time 0.2363 (0.2473) data time 0.0009 (0.0026) model time 0.2355 (0.2451) loss 4.1195 (3.4421) grad_norm 2.1006 (2.0717) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][300/1251] eta 0:03:55 lr 0.000821 wd 0.0500 time 0.2434 (0.2472) data time 0.0011 (0.0025) model time 0.2423 (0.2450) loss 3.6203 (3.4484) grad_norm 2.1287 (2.0808) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][310/1251] eta 0:03:52 lr 0.000821 wd 0.0500 time 0.2448 (0.2471) data time 0.0012 (0.0025) model time 0.2436 (0.2449) loss 3.1910 (3.4344) grad_norm 2.1081 (2.0778) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][320/1251] eta 0:03:49 lr 0.000821 wd 0.0500 time 0.2456 (0.2469) data time 0.0007 (0.0024) model time 0.2449 (0.2448) loss 3.8730 (3.4303) grad_norm 2.1056 (2.0753) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][330/1251] eta 0:03:47 lr 0.000821 wd 
0.0500 time 0.2420 (0.2468) data time 0.0011 (0.0024) model time 0.2410 (0.2447) loss 3.5771 (3.4274) grad_norm 2.8354 (2.0743) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][340/1251] eta 0:03:44 lr 0.000821 wd 0.0500 time 0.2336 (0.2466) data time 0.0011 (0.0024) model time 0.2325 (0.2445) loss 2.3393 (3.4284) grad_norm 3.0675 (2.0753) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][350/1251] eta 0:03:42 lr 0.000821 wd 0.0500 time 0.2475 (0.2466) data time 0.0010 (0.0023) model time 0.2465 (0.2445) loss 3.8402 (3.4377) grad_norm 2.0567 (2.0747) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][360/1251] eta 0:03:39 lr 0.000821 wd 0.0500 time 0.2437 (0.2469) data time 0.0007 (0.0023) model time 0.2430 (0.2449) loss 3.0078 (3.4393) grad_norm 1.3308 (2.0676) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][370/1251] eta 0:03:37 lr 0.000821 wd 0.0500 time 0.2363 (0.2468) data time 0.0008 (0.0023) model time 0.2355 (0.2448) loss 3.0143 (3.4439) grad_norm 2.1474 (2.0614) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][380/1251] eta 0:03:35 lr 0.000821 wd 0.0500 time 0.2320 (0.2472) data time 0.0011 (0.0022) model time 0.2309 (0.2453) loss 3.7001 (3.4443) grad_norm 2.3861 (2.0735) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][390/1251] eta 0:03:32 lr 0.000821 wd 0.0500 time 0.2455 (0.2471) data time 0.0012 (0.0022) model time 0.2444 (0.2452) loss 3.4110 (3.4377) grad_norm 1.8241 (2.0863) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:52 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][400/1251] eta 0:03:30 lr 0.000821 wd 0.0500 time 0.2436 (0.2470) data time 0.0009 (0.0022) model time 0.2427 (0.2451) loss 3.5469 (3.4319) grad_norm 2.5936 (2.0967) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][410/1251] eta 0:03:27 lr 0.000821 wd 0.0500 time 0.2394 (0.2468) data time 0.0010 (0.0021) model time 0.2384 (0.2450) loss 3.4908 (3.4325) grad_norm 2.3581 (2.1000) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][420/1251] eta 0:03:25 lr 0.000821 wd 0.0500 time 0.2417 (0.2472) data time 0.0007 (0.0021) model time 0.2409 (0.2454) loss 4.0666 (3.4385) grad_norm 1.5052 (2.0943) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][430/1251] eta 0:03:22 lr 0.000821 wd 0.0500 time 0.2425 (0.2471) data time 0.0010 (0.0021) model time 0.2415 (0.2453) loss 3.1051 (3.4355) grad_norm 1.8881 (2.0885) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][440/1251] eta 0:03:20 lr 0.000821 wd 0.0500 time 0.2386 (0.2470) data time 0.0009 (0.0021) model time 0.2377 (0.2453) loss 2.1659 (3.4321) grad_norm 2.1017 (2.0907) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][450/1251] eta 0:03:17 lr 0.000821 wd 0.0500 time 0.2465 (0.2469) data time 0.0010 (0.0020) model time 0.2455 (0.2451) loss 2.9394 (3.4298) grad_norm 2.8208 (2.0949) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][460/1251] eta 0:03:15 lr 0.000821 wd 0.0500 time 0.2316 (0.2468) data time 0.0007 (0.0020) model time 0.2308 (0.2450) loss 2.3883 
(3.4276) grad_norm 2.8604 (2.0917) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][470/1251] eta 0:03:12 lr 0.000821 wd 0.0500 time 0.2444 (0.2466) data time 0.0008 (0.0020) model time 0.2437 (0.2449) loss 4.3258 (3.4293) grad_norm 2.6217 (2.0935) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][480/1251] eta 0:03:10 lr 0.000821 wd 0.0500 time 0.2579 (0.2466) data time 0.0010 (0.0020) model time 0.2570 (0.2448) loss 3.6584 (3.4331) grad_norm 1.8484 (2.0941) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][490/1251] eta 0:03:07 lr 0.000821 wd 0.0500 time 0.2412 (0.2469) data time 0.0009 (0.0019) model time 0.2403 (0.2452) loss 3.5387 (3.4330) grad_norm 1.9834 (2.0971) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][500/1251] eta 0:03:05 lr 0.000821 wd 0.0500 time 0.2325 (0.2468) data time 0.0011 (0.0019) model time 0.2315 (0.2451) loss 3.7473 (3.4337) grad_norm 2.1472 (2.1032) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][510/1251] eta 0:03:03 lr 0.000820 wd 0.0500 time 0.2381 (0.2470) data time 0.0010 (0.0019) model time 0.2371 (0.2454) loss 3.4837 (3.4328) grad_norm 3.0856 (2.1088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][520/1251] eta 0:03:00 lr 0.000820 wd 0.0500 time 0.2390 (0.2469) data time 0.0011 (0.0019) model time 0.2380 (0.2453) loss 3.7102 (3.4317) grad_norm 1.8244 (2.1089) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][530/1251] eta 0:02:57 lr 0.000820 wd 
0.0500 time 0.2391 (0.2468) data time 0.0010 (0.0019) model time 0.2381 (0.2452) loss 3.4980 (3.4326) grad_norm 5.2466 (2.1181) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][540/1251] eta 0:02:55 lr 0.000820 wd 0.0500 time 0.2422 (0.2468) data time 0.0009 (0.0019) model time 0.2412 (0.2451) loss 3.7430 (3.4324) grad_norm 2.4656 (2.1180) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][550/1251] eta 0:02:52 lr 0.000820 wd 0.0500 time 0.2533 (0.2467) data time 0.0010 (0.0019) model time 0.2523 (0.2451) loss 3.8128 (3.4277) grad_norm 2.5318 (2.1154) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][560/1251] eta 0:02:50 lr 0.000820 wd 0.0500 time 0.2408 (0.2466) data time 0.0010 (0.0018) model time 0.2397 (0.2450) loss 3.2247 (3.4277) grad_norm 1.8764 (2.1152) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][570/1251] eta 0:02:47 lr 0.000820 wd 0.0500 time 0.2469 (0.2465) data time 0.0012 (0.0018) model time 0.2457 (0.2449) loss 3.1579 (3.4259) grad_norm 1.8729 (2.1119) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][580/1251] eta 0:02:45 lr 0.000820 wd 0.0500 time 0.2408 (0.2464) data time 0.0009 (0.0018) model time 0.2399 (0.2447) loss 3.3388 (3.4255) grad_norm 2.4109 (2.1104) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][590/1251] eta 0:02:42 lr 0.000820 wd 0.0500 time 0.2506 (0.2463) data time 0.0009 (0.0018) model time 0.2496 (0.2447) loss 3.5260 (3.4257) grad_norm 1.8126 (2.1061) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:41 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][600/1251] eta 0:02:40 lr 0.000820 wd 0.0500 time 0.2477 (0.2463) data time 0.0010 (0.0018) model time 0.2468 (0.2446) loss 3.1995 (3.4217) grad_norm 2.6757 (2.1053) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][610/1251] eta 0:02:37 lr 0.000820 wd 0.0500 time 0.2422 (0.2462) data time 0.0008 (0.0018) model time 0.2413 (0.2446) loss 2.5465 (3.4198) grad_norm 1.6141 (2.1021) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][620/1251] eta 0:02:35 lr 0.000820 wd 0.0500 time 0.2432 (0.2462) data time 0.0012 (0.0018) model time 0.2420 (0.2446) loss 3.5889 (3.4172) grad_norm 2.0185 (2.1023) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][630/1251] eta 0:02:33 lr 0.000820 wd 0.0500 time 0.4420 (0.2465) data time 0.0011 (0.0018) model time 0.4409 (0.2449) loss 3.5272 (3.4145) grad_norm 1.9344 (2.1015) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][640/1251] eta 0:02:30 lr 0.000820 wd 0.0500 time 0.2488 (0.2464) data time 0.0007 (0.0017) model time 0.2482 (0.2448) loss 2.6057 (3.4130) grad_norm 2.0154 (2.0983) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][650/1251] eta 0:02:28 lr 0.000820 wd 0.0500 time 0.2364 (0.2464) data time 0.0011 (0.0017) model time 0.2353 (0.2448) loss 3.7353 (3.4126) grad_norm 2.7633 (2.1015) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][660/1251] eta 0:02:25 lr 0.000820 wd 0.0500 time 0.2444 (0.2463) data time 0.0010 (0.0017) model time 0.2434 (0.2447) loss 3.8864 
(3.4105) grad_norm 2.6901 (2.1014) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][670/1251] eta 0:02:23 lr 0.000820 wd 0.0500 time 0.2413 (0.2471) data time 0.0009 (0.0017) model time 0.2404 (0.2456) loss 2.8791 (3.4102) grad_norm 1.9896 (2.1021) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][680/1251] eta 0:02:21 lr 0.000820 wd 0.0500 time 0.2394 (0.2470) data time 0.0010 (0.0017) model time 0.2384 (0.2455) loss 3.5064 (3.4118) grad_norm 2.0347 (2.1005) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][690/1251] eta 0:02:18 lr 0.000820 wd 0.0500 time 0.2410 (0.2469) data time 0.0011 (0.0017) model time 0.2399 (0.2455) loss 2.2169 (3.4141) grad_norm 1.7850 (2.0979) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][700/1251] eta 0:02:16 lr 0.000820 wd 0.0500 time 0.2427 (0.2469) data time 0.0010 (0.0017) model time 0.2417 (0.2454) loss 3.7193 (3.4136) grad_norm 1.7480 (2.0987) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][710/1251] eta 0:02:13 lr 0.000820 wd 0.0500 time 0.2392 (0.2468) data time 0.0007 (0.0017) model time 0.2385 (0.2453) loss 4.3149 (3.4158) grad_norm 3.9970 (2.1076) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][720/1251] eta 0:02:11 lr 0.000820 wd 0.0500 time 0.2361 (0.2467) data time 0.0010 (0.0017) model time 0.2351 (0.2453) loss 2.7729 (3.4166) grad_norm 2.5741 (2.1078) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][730/1251] eta 0:02:08 lr 0.000820 wd 
0.0500 time 0.2501 (0.2467) data time 0.0011 (0.0017) model time 0.2490 (0.2452) loss 3.0750 (3.4152) grad_norm 1.8940 (2.1046) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][740/1251] eta 0:02:05 lr 0.000820 wd 0.0500 time 0.2450 (0.2466) data time 0.0009 (0.0016) model time 0.2441 (0.2451) loss 3.2712 (3.4174) grad_norm 1.7460 (2.1012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][750/1251] eta 0:02:03 lr 0.000820 wd 0.0500 time 0.2464 (0.2465) data time 0.0012 (0.0016) model time 0.2453 (0.2450) loss 3.5003 (3.4171) grad_norm 2.1729 (2.0990) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][760/1251] eta 0:02:01 lr 0.000820 wd 0.0500 time 0.2482 (0.2465) data time 0.0010 (0.0016) model time 0.2472 (0.2450) loss 3.3695 (3.4170) grad_norm 1.9064 (2.0969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][770/1251] eta 0:01:58 lr 0.000820 wd 0.0500 time 0.2361 (0.2464) data time 0.0007 (0.0016) model time 0.2354 (0.2449) loss 2.7172 (3.4142) grad_norm 2.6037 (2.0983) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][780/1251] eta 0:01:56 lr 0.000820 wd 0.0500 time 0.2422 (0.2465) data time 0.0009 (0.0016) model time 0.2413 (0.2450) loss 3.7151 (3.4178) grad_norm 1.5304 (2.0978) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][790/1251] eta 0:01:53 lr 0.000820 wd 0.0500 time 0.2447 (0.2465) data time 0.0010 (0.0016) model time 0.2437 (0.2451) loss 3.1806 (3.4140) grad_norm 2.5463 (2.0968) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:30 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][800/1251] eta 0:01:51 lr 0.000819 wd 0.0500 time 0.2411 (0.2465) data time 0.0008 (0.0016) model time 0.2403 (0.2450) loss 3.9415 (3.4137) grad_norm 1.2544 (2.0948) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][810/1251] eta 0:01:48 lr 0.000819 wd 0.0500 time 0.2452 (0.2465) data time 0.0009 (0.0016) model time 0.2443 (0.2450) loss 3.1203 (3.4090) grad_norm 1.8121 (2.0959) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][820/1251] eta 0:01:46 lr 0.000819 wd 0.0500 time 0.2403 (0.2464) data time 0.0011 (0.0016) model time 0.2392 (0.2449) loss 3.9258 (3.4097) grad_norm 3.3388 (2.0998) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][830/1251] eta 0:01:43 lr 0.000819 wd 0.0500 time 0.2367 (0.2464) data time 0.0009 (0.0016) model time 0.2359 (0.2449) loss 4.0348 (3.4121) grad_norm 2.5938 (2.0990) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][840/1251] eta 0:01:41 lr 0.000819 wd 0.0500 time 0.2378 (0.2463) data time 0.0007 (0.0016) model time 0.2371 (0.2448) loss 3.5167 (3.4104) grad_norm 1.8836 (2.1012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][850/1251] eta 0:01:38 lr 0.000819 wd 0.0500 time 0.2496 (0.2463) data time 0.0008 (0.0016) model time 0.2487 (0.2448) loss 2.7548 (3.4093) grad_norm 2.5726 (2.0992) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][860/1251] eta 0:01:36 lr 0.000819 wd 0.0500 time 0.2362 (0.2462) data time 0.0012 (0.0016) model time 0.2350 (0.2447) loss 3.2345 
(3.4112) grad_norm 2.9488 (2.1024) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][870/1251] eta 0:01:33 lr 0.000819 wd 0.0500 time 0.2358 (0.2462) data time 0.0011 (0.0016) model time 0.2347 (0.2447) loss 2.7999 (3.4135) grad_norm 1.6129 (2.1012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][880/1251] eta 0:01:31 lr 0.000819 wd 0.0500 time 0.2215 (0.2463) data time 0.0008 (0.0016) model time 0.2206 (0.2449) loss 4.5040 (3.4124) grad_norm 1.7569 (2.1001) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][890/1251] eta 0:01:28 lr 0.000819 wd 0.0500 time 0.2404 (0.2463) data time 0.0010 (0.0016) model time 0.2394 (0.2448) loss 4.2704 (3.4112) grad_norm 1.8137 (2.0998) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][900/1251] eta 0:01:26 lr 0.000819 wd 0.0500 time 0.2403 (0.2466) data time 0.0008 (0.0016) model time 0.2394 (0.2451) loss 2.8687 (3.4095) grad_norm 1.9647 (2.0988) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][910/1251] eta 0:01:24 lr 0.000819 wd 0.0500 time 0.2473 (0.2466) data time 0.0011 (0.0016) model time 0.2463 (0.2451) loss 3.8343 (3.4097) grad_norm 2.1116 (2.0982) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][920/1251] eta 0:01:21 lr 0.000819 wd 0.0500 time 0.2473 (0.2467) data time 0.0009 (0.0016) model time 0.2464 (0.2453) loss 4.0154 (3.4139) grad_norm 2.0557 (2.0972) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][930/1251] eta 0:01:19 lr 0.000819 wd 
0.0500 time 0.2363 (0.2467) data time 0.0011 (0.0016) model time 0.2352 (0.2452) loss 3.1999 (3.4118) grad_norm 2.0541 (2.0969) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][940/1251] eta 0:01:16 lr 0.000819 wd 0.0500 time 0.2468 (0.2468) data time 0.0008 (0.0015) model time 0.2461 (0.2454) loss 3.6333 (3.4122) grad_norm 1.4841 (2.0957) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][950/1251] eta 0:01:14 lr 0.000819 wd 0.0500 time 0.2455 (0.2467) data time 0.0007 (0.0015) model time 0.2447 (0.2453) loss 2.1319 (3.4136) grad_norm 1.4367 (2.0926) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][960/1251] eta 0:01:11 lr 0.000819 wd 0.0500 time 0.2408 (0.2467) data time 0.0011 (0.0015) model time 0.2397 (0.2453) loss 3.7097 (3.4101) grad_norm 1.7795 (2.0924) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][970/1251] eta 0:01:09 lr 0.000819 wd 0.0500 time 0.2356 (0.2466) data time 0.0009 (0.0015) model time 0.2347 (0.2452) loss 4.3637 (3.4155) grad_norm 2.2601 (2.0963) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][980/1251] eta 0:01:06 lr 0.000819 wd 0.0500 time 0.2425 (0.2466) data time 0.0008 (0.0015) model time 0.2417 (0.2452) loss 3.9957 (3.4179) grad_norm 1.5899 (2.0942) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][990/1251] eta 0:01:04 lr 0.000819 wd 0.0500 time 0.2390 (0.2465) data time 0.0009 (0.0015) model time 0.2381 (0.2451) loss 4.3247 (3.4217) grad_norm 2.7673 (2.0926) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:20 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1000/1251] eta 0:01:01 lr 0.000819 wd 0.0500 time 0.2468 (0.2465) data time 0.0010 (0.0015) model time 0.2458 (0.2451) loss 2.7680 (3.4219) grad_norm 1.4333 (2.0926) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1010/1251] eta 0:00:59 lr 0.000819 wd 0.0500 time 0.2362 (0.2465) data time 0.0009 (0.0015) model time 0.2353 (0.2451) loss 3.1193 (3.4220) grad_norm 1.9387 (2.0901) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1020/1251] eta 0:00:56 lr 0.000819 wd 0.0500 time 0.2396 (0.2465) data time 0.0007 (0.0015) model time 0.2388 (0.2451) loss 4.1032 (3.4230) grad_norm 2.0950 (2.0890) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1030/1251] eta 0:00:54 lr 0.000819 wd 0.0500 time 0.2435 (0.2466) data time 0.0009 (0.0015) model time 0.2426 (0.2452) loss 3.4374 (3.4221) grad_norm 1.9218 (2.0895) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1040/1251] eta 0:00:52 lr 0.000819 wd 0.0500 time 0.2385 (0.2466) data time 0.0012 (0.0015) model time 0.2372 (0.2452) loss 2.3755 (3.4220) grad_norm 1.7684 (2.0882) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1050/1251] eta 0:00:49 lr 0.000819 wd 0.0500 time 0.2344 (0.2465) data time 0.0009 (0.0015) model time 0.2335 (0.2451) loss 3.5808 (3.4235) grad_norm 2.8627 (2.0877) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1060/1251] eta 0:00:47 lr 0.000819 wd 0.0500 time 0.2429 (0.2465) data time 0.0012 (0.0015) model time 0.2418 (0.2451) loss 3.3790 
(3.4210) grad_norm 2.2963 (2.0873) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1070/1251] eta 0:00:44 lr 0.000819 wd 0.0500 time 0.2432 (0.2464) data time 0.0008 (0.0015) model time 0.2424 (0.2451) loss 2.3948 (3.4188) grad_norm 1.9037 (2.0856) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1080/1251] eta 0:00:42 lr 0.000819 wd 0.0500 time 0.2367 (0.2464) data time 0.0007 (0.0015) model time 0.2360 (0.2450) loss 4.1567 (3.4187) grad_norm 1.4880 (2.0840) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1090/1251] eta 0:00:39 lr 0.000819 wd 0.0500 time 0.2449 (0.2463) data time 0.0010 (0.0015) model time 0.2440 (0.2450) loss 3.6382 (3.4200) grad_norm 1.6654 (2.0825) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1100/1251] eta 0:00:37 lr 0.000818 wd 0.0500 time 0.2365 (0.2463) data time 0.0008 (0.0015) model time 0.2357 (0.2450) loss 3.8519 (3.4211) grad_norm 2.4290 (2.0832) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1110/1251] eta 0:00:34 lr 0.000818 wd 0.0500 time 0.2448 (0.2463) data time 0.0010 (0.0015) model time 0.2438 (0.2449) loss 3.6855 (3.4232) grad_norm 3.7250 (2.0846) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1120/1251] eta 0:00:32 lr 0.000818 wd 0.0500 time 0.2353 (0.2463) data time 0.0010 (0.0015) model time 0.2343 (0.2449) loss 3.6733 (3.4231) grad_norm 2.2839 (2.0893) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1130/1251] eta 0:00:29 lr 
0.000818 wd 0.0500 time 0.2345 (0.2462) data time 0.0011 (0.0015) model time 0.2334 (0.2448) loss 3.7840 (3.4237) grad_norm 2.3832 (2.0900) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1140/1251] eta 0:00:27 lr 0.000818 wd 0.0500 time 0.2481 (0.2462) data time 0.0008 (0.0015) model time 0.2473 (0.2448) loss 2.1467 (3.4231) grad_norm 1.3807 (2.0888) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1150/1251] eta 0:00:24 lr 0.000818 wd 0.0500 time 0.2364 (0.2461) data time 0.0007 (0.0015) model time 0.2357 (0.2447) loss 4.1563 (3.4264) grad_norm 1.9337 (2.0867) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1160/1251] eta 0:00:22 lr 0.000818 wd 0.0500 time 0.2417 (0.2461) data time 0.0009 (0.0015) model time 0.2408 (0.2447) loss 4.4286 (3.4272) grad_norm 1.9175 (2.0859) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1170/1251] eta 0:00:19 lr 0.000818 wd 0.0500 time 0.2387 (0.2462) data time 0.0007 (0.0014) model time 0.2380 (0.2449) loss 3.3534 (3.4285) grad_norm 1.9379 (2.0867) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1180/1251] eta 0:00:17 lr 0.000818 wd 0.0500 time 0.2443 (0.2463) data time 0.0008 (0.0014) model time 0.2435 (0.2450) loss 3.7947 (3.4310) grad_norm 2.1430 (2.0895) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1190/1251] eta 0:00:15 lr 0.000818 wd 0.0500 time 0.2430 (0.2465) data time 0.0010 (0.0014) model time 0.2420 (0.2451) loss 3.6621 (3.4343) grad_norm 1.5920 (2.0891) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
10:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1200/1251] eta 0:00:12 lr 0.000818 wd 0.0500 time 0.2398 (0.2464) data time 0.0009 (0.0014) model time 0.2389 (0.2451) loss 2.5031 (3.4335) grad_norm 1.4341 (2.0919) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1210/1251] eta 0:00:10 lr 0.000818 wd 0.0500 time 0.2449 (0.2464) data time 0.0008 (0.0014) model time 0.2441 (0.2451) loss 4.0014 (3.4319) grad_norm 1.8055 (2.0953) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1220/1251] eta 0:00:07 lr 0.000818 wd 0.0500 time 0.2449 (0.2464) data time 0.0009 (0.0014) model time 0.2440 (0.2450) loss 4.0304 (3.4315) grad_norm 1.7824 (2.0937) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1230/1251] eta 0:00:05 lr 0.000818 wd 0.0500 time 0.2462 (0.2463) data time 0.0010 (0.0014) model time 0.2452 (0.2450) loss 3.7687 (3.4327) grad_norm 1.8971 (2.0955) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1240/1251] eta 0:00:02 lr 0.000818 wd 0.0500 time 0.2237 (0.2462) data time 0.0007 (0.0014) model time 0.2230 (0.2449) loss 3.9173 (3.4324) grad_norm 1.5895 (2.0954) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [98/300][1250/1251] eta 0:00:00 lr 0.000818 wd 0.0500 time 0.2251 (0.2461) data time 0.0007 (0.0014) model time 0.2244 (0.2447) loss 3.8917 (3.4334) grad_norm 1.4186 (2.0928) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 98 training takes 0:05:07 [2024-08-26 10:48:21 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:48:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 10:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.477 (0.477) Loss 0.5127 (0.5127) Acc@1 90.234 (90.234) Acc@5 97.754 (97.754) Mem 7379MB [2024-08-26 10:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.117) Loss 0.7886 (0.7934) Acc@1 84.375 (82.395) Acc@5 96.680 (96.378) Mem 7379MB [2024-08-26 10:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.099) Loss 1.1416 (0.8210) Acc@1 73.730 (81.655) Acc@5 93.262 (96.261) Mem 7379MB [2024-08-26 10:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.092) Loss 1.3691 (0.9365) Acc@1 67.871 (79.089) Acc@5 89.355 (94.827) Mem 7379MB [2024-08-26 10:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.2754 (1.0018) Acc@1 70.605 (77.368) Acc@5 90.918 (94.095) Mem 7379MB [2024-08-26 10:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.958 Acc@5 94.004 [2024-08-26 10:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.0% [2024-08-26 10:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.798 (0.798) Loss 0.4404 (0.4404) Acc@1 92.676 (92.676) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 10:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.147) Loss 0.7109 (0.6910) Acc@1 85.840 (85.183) Acc@5 96.191 (97.088) Mem 7379MB [2024-08-26 10:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.115) Loss 0.9873 (0.7145) Acc@1 76.172 (84.166) Acc@5 94.238 (97.015) Mem 7379MB [2024-08-26 10:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.073 (0.102) Loss 1.2734 (0.8150) Acc@1 67.480 (81.792) Acc@5 90.723 (95.851) Mem 7379MB [2024-08-26 10:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.1387 (0.8669) Acc@1 71.484 (80.302) Acc@5 93.457 (95.358) Mem 7379MB [2024-08-26 10:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.872 Acc@5 95.316 [2024-08-26 10:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.9% [2024-08-26 10:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.87% [2024-08-26 10:48:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:48:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 10:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][0/1251] eta 0:14:48 lr 0.000818 wd 0.0500 time 0.7102 (0.7102) data time 0.4905 (0.4905) model time 0.0000 (0.0000) loss 2.3579 (2.3579) grad_norm 1.3336 (1.3336) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][10/1251] eta 0:05:52 lr 0.000818 wd 0.0500 time 0.2502 (0.2843) data time 0.0008 (0.0455) model time 0.0000 (0.0000) loss 3.0574 (3.1394) grad_norm 1.8127 (2.3134) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][20/1251] eta 0:05:25 lr 0.000818 wd 0.0500 time 0.2540 (0.2646) data time 0.0009 (0.0243) model time 0.0000 (0.0000) loss 3.4523 (3.3240) grad_norm 1.9425 (2.2275) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][30/1251] eta 0:05:14 lr 0.000818 wd 0.0500 time 0.2662 (0.2579) data time 0.0012 (0.0168) 
model time 0.0000 (0.0000) loss 3.6024 (3.3811) grad_norm 2.0417 (2.2513) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][40/1251] eta 0:05:07 lr 0.000818 wd 0.0500 time 0.2450 (0.2537) data time 0.0007 (0.0130) model time 0.0000 (0.0000) loss 3.7130 (3.3369) grad_norm 2.2690 (2.1794) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][50/1251] eta 0:05:05 lr 0.000818 wd 0.0500 time 0.3994 (0.2542) data time 0.0011 (0.0106) model time 0.0000 (0.0000) loss 3.8404 (3.3535) grad_norm 1.7775 (2.1731) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][60/1251] eta 0:04:59 lr 0.000818 wd 0.0500 time 0.2284 (0.2516) data time 0.0010 (0.0090) model time 0.2274 (0.2376) loss 3.5438 (3.3599) grad_norm 1.5940 (2.1184) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][70/1251] eta 0:04:55 lr 0.000818 wd 0.0500 time 0.2341 (0.2501) data time 0.0007 (0.0079) model time 0.2334 (0.2388) loss 4.1069 (3.3608) grad_norm 1.5926 (2.0924) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][80/1251] eta 0:04:52 lr 0.000818 wd 0.0500 time 0.2452 (0.2494) data time 0.0011 (0.0071) model time 0.2440 (0.2403) loss 3.5611 (3.3392) grad_norm 1.8155 (2.0928) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][90/1251] eta 0:04:48 lr 0.000818 wd 0.0500 time 0.2346 (0.2485) data time 0.0011 (0.0064) model time 0.2335 (0.2402) loss 3.6657 (3.3202) grad_norm 1.9471 (2.0926) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[99/300][100/1251] eta 0:04:47 lr 0.000818 wd 0.0500 time 0.2467 (0.2502) data time 0.0009 (0.0059) model time 0.2458 (0.2450) loss 2.3600 (3.3157) grad_norm 1.7355 (2.0647) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][110/1251] eta 0:04:44 lr 0.000818 wd 0.0500 time 0.2431 (0.2493) data time 0.0007 (0.0055) model time 0.2424 (0.2441) loss 3.5695 (3.3129) grad_norm 2.7538 (2.0581) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][120/1251] eta 0:04:41 lr 0.000818 wd 0.0500 time 0.2417 (0.2486) data time 0.0009 (0.0051) model time 0.2408 (0.2434) loss 4.1012 (3.3538) grad_norm 1.9817 (2.0550) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][130/1251] eta 0:04:38 lr 0.000818 wd 0.0500 time 0.2386 (0.2481) data time 0.0007 (0.0048) model time 0.2379 (0.2431) loss 2.7373 (3.3498) grad_norm 2.0422 (2.0504) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][140/1251] eta 0:04:35 lr 0.000817 wd 0.0500 time 0.2472 (0.2477) data time 0.0007 (0.0045) model time 0.2465 (0.2429) loss 4.4513 (3.3662) grad_norm 1.8922 (2.0347) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][150/1251] eta 0:04:32 lr 0.000817 wd 0.0500 time 0.2468 (0.2473) data time 0.0009 (0.0043) model time 0.2459 (0.2427) loss 1.9173 (3.3606) grad_norm 1.9379 (2.0414) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][160/1251] eta 0:04:29 lr 0.000817 wd 0.0500 time 0.2398 (0.2469) data time 0.0011 (0.0041) model time 0.2387 (0.2424) loss 2.3203 (3.3613) grad_norm 1.6451 (2.0344) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][170/1251] eta 0:04:28 lr 0.000817 wd 0.0500 time 0.2426 (0.2484) data time 0.0010 (0.0039) model time 0.2416 (0.2449) loss 2.7724 (3.3599) grad_norm 2.2113 (2.0357) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][180/1251] eta 0:04:25 lr 0.000817 wd 0.0500 time 0.2422 (0.2482) data time 0.0011 (0.0037) model time 0.2410 (0.2448) loss 3.7343 (3.3630) grad_norm 2.1389 (2.0456) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][190/1251] eta 0:04:23 lr 0.000817 wd 0.0500 time 0.2453 (0.2480) data time 0.0010 (0.0036) model time 0.2443 (0.2446) loss 3.3695 (3.3431) grad_norm 2.0877 (2.0461) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][200/1251] eta 0:04:22 lr 0.000817 wd 0.0500 time 0.2472 (0.2498) data time 0.0010 (0.0035) model time 0.2462 (0.2473) loss 3.5549 (3.3363) grad_norm 2.2721 (2.0412) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][210/1251] eta 0:04:20 lr 0.000817 wd 0.0500 time 0.2497 (0.2504) data time 0.0010 (0.0034) model time 0.2488 (0.2481) loss 4.1243 (3.3554) grad_norm 2.3903 (2.0407) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][220/1251] eta 0:04:17 lr 0.000817 wd 0.0500 time 0.2480 (0.2501) data time 0.0008 (0.0033) model time 0.2472 (0.2478) loss 3.0484 (3.3576) grad_norm 1.6084 (2.0449) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][230/1251] eta 0:04:15 lr 0.000817 wd 0.0500 time 0.2415 (0.2498) data time 0.0010 (0.0032) 
model time 0.2404 (0.2475) loss 3.4656 (3.3579) grad_norm 1.6302 (2.0440) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][240/1251] eta 0:04:13 lr 0.000817 wd 0.0500 time 0.2424 (0.2503) data time 0.0011 (0.0031) model time 0.2413 (0.2482) loss 3.1680 (3.3491) grad_norm 1.9240 (2.0332) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][250/1251] eta 0:04:10 lr 0.000817 wd 0.0500 time 0.2467 (0.2499) data time 0.0007 (0.0030) model time 0.2460 (0.2478) loss 2.2755 (3.3361) grad_norm 2.2520 (2.0344) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][260/1251] eta 0:04:07 lr 0.000817 wd 0.0500 time 0.2437 (0.2497) data time 0.0009 (0.0029) model time 0.2428 (0.2476) loss 3.7711 (3.3497) grad_norm 1.7490 (2.0358) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][270/1251] eta 0:04:04 lr 0.000817 wd 0.0500 time 0.2421 (0.2494) data time 0.0012 (0.0028) model time 0.2409 (0.2472) loss 3.8241 (3.3551) grad_norm 2.0742 (2.0395) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][280/1251] eta 0:04:01 lr 0.000817 wd 0.0500 time 0.2427 (0.2491) data time 0.0008 (0.0028) model time 0.2419 (0.2469) loss 3.5978 (3.3473) grad_norm 2.7538 (2.0443) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][290/1251] eta 0:03:59 lr 0.000817 wd 0.0500 time 0.2454 (0.2489) data time 0.0009 (0.0027) model time 0.2445 (0.2467) loss 3.5640 (3.3529) grad_norm 2.2984 (2.0441) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[99/300][300/1251] eta 0:03:56 lr 0.000817 wd 0.0500 time 0.2563 (0.2487) data time 0.0007 (0.0027) model time 0.2556 (0.2465) loss 4.0277 (3.3628) grad_norm 1.7259 (2.0501) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][310/1251] eta 0:03:53 lr 0.000817 wd 0.0500 time 0.2439 (0.2485) data time 0.0011 (0.0026) model time 0.2428 (0.2464) loss 3.2619 (3.3669) grad_norm 2.4741 (2.0643) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][320/1251] eta 0:03:51 lr 0.000817 wd 0.0500 time 0.2494 (0.2483) data time 0.0009 (0.0026) model time 0.2485 (0.2462) loss 3.3917 (3.3691) grad_norm 2.3723 (2.0759) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][330/1251] eta 0:03:48 lr 0.000817 wd 0.0500 time 0.2364 (0.2481) data time 0.0011 (0.0025) model time 0.2353 (0.2460) loss 3.0657 (3.3643) grad_norm 1.8134 (2.0724) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][340/1251] eta 0:03:45 lr 0.000817 wd 0.0500 time 0.2435 (0.2479) data time 0.0010 (0.0025) model time 0.2424 (0.2458) loss 3.8534 (3.3688) grad_norm 1.9536 (2.0680) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][350/1251] eta 0:03:43 lr 0.000817 wd 0.0500 time 0.2390 (0.2483) data time 0.0009 (0.0024) model time 0.2381 (0.2463) loss 3.7250 (3.3678) grad_norm 2.0692 (2.0615) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][360/1251] eta 0:03:41 lr 0.000817 wd 0.0500 time 0.2337 (0.2481) data time 0.0007 (0.0024) model time 0.2329 (0.2460) loss 3.5644 (3.3731) grad_norm 1.8753 (2.0704) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][370/1251] eta 0:03:38 lr 0.000817 wd 0.0500 time 0.2442 (0.2479) data time 0.0010 (0.0023) model time 0.2433 (0.2459) loss 2.8428 (3.3772) grad_norm 1.7982 (2.0710) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][380/1251] eta 0:03:35 lr 0.000817 wd 0.0500 time 0.2417 (0.2477) data time 0.0009 (0.0023) model time 0.2407 (0.2457) loss 3.4010 (3.3761) grad_norm 2.6363 (2.0726) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][390/1251] eta 0:03:33 lr 0.000817 wd 0.0500 time 0.2399 (0.2476) data time 0.0008 (0.0023) model time 0.2391 (0.2456) loss 4.2839 (3.3860) grad_norm 3.1890 (2.0757) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][400/1251] eta 0:03:31 lr 0.000817 wd 0.0500 time 0.2379 (0.2480) data time 0.0009 (0.0022) model time 0.2370 (0.2460) loss 2.3287 (3.3798) grad_norm 2.0516 (2.0732) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][410/1251] eta 0:03:28 lr 0.000817 wd 0.0500 time 0.2453 (0.2478) data time 0.0007 (0.0022) model time 0.2445 (0.2459) loss 2.1551 (3.3804) grad_norm 2.7533 (2.0785) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][420/1251] eta 0:03:25 lr 0.000817 wd 0.0500 time 0.2390 (0.2477) data time 0.0011 (0.0022) model time 0.2379 (0.2457) loss 3.9543 (3.3806) grad_norm 1.5361 (2.0761) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][430/1251] eta 0:03:23 lr 0.000816 wd 0.0500 time 0.2359 (0.2475) data time 0.0010 (0.0022) 
model time 0.2348 (0.2456) loss 3.8020 (3.3839) grad_norm 3.4464 (2.0796) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][440/1251] eta 0:03:20 lr 0.000816 wd 0.0500 time 0.2418 (0.2474) data time 0.0011 (0.0021) model time 0.2407 (0.2455) loss 3.6878 (3.3821) grad_norm 2.6002 (2.0806) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][450/1251] eta 0:03:18 lr 0.000816 wd 0.0500 time 0.2348 (0.2472) data time 0.0013 (0.0021) model time 0.2334 (0.2453) loss 3.3135 (3.3887) grad_norm 2.1909 (2.0819) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][460/1251] eta 0:03:15 lr 0.000816 wd 0.0500 time 0.2454 (0.2476) data time 0.0011 (0.0021) model time 0.2443 (0.2457) loss 2.7425 (3.3911) grad_norm 1.7918 (2.0895) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][470/1251] eta 0:03:13 lr 0.000816 wd 0.0500 time 0.2402 (0.2475) data time 0.0010 (0.0021) model time 0.2393 (0.2457) loss 3.4033 (3.3927) grad_norm 2.8092 (2.1077) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][480/1251] eta 0:03:10 lr 0.000816 wd 0.0500 time 0.2533 (0.2474) data time 0.0011 (0.0020) model time 0.2522 (0.2455) loss 3.6895 (3.3949) grad_norm 2.1110 (2.1150) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][490/1251] eta 0:03:08 lr 0.000816 wd 0.0500 time 0.2500 (0.2473) data time 0.0010 (0.0020) model time 0.2490 (0.2454) loss 3.0224 (3.3918) grad_norm 1.8875 (2.1127) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[99/300][500/1251] eta 0:03:05 lr 0.000816 wd 0.0500 time 0.2447 (0.2471) data time 0.0009 (0.0020) model time 0.2438 (0.2453) loss 3.7476 (3.3909) grad_norm 2.1925 (2.1124) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][510/1251] eta 0:03:03 lr 0.000816 wd 0.0500 time 0.2466 (0.2471) data time 0.0007 (0.0020) model time 0.2459 (0.2452) loss 3.9914 (3.3918) grad_norm 1.8352 (2.1085) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][520/1251] eta 0:03:00 lr 0.000816 wd 0.0500 time 0.2434 (0.2470) data time 0.0009 (0.0020) model time 0.2425 (0.2452) loss 2.8707 (3.3879) grad_norm 2.2078 (2.1064) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][530/1251] eta 0:02:58 lr 0.000816 wd 0.0500 time 0.2329 (0.2469) data time 0.0009 (0.0020) model time 0.2319 (0.2451) loss 3.3968 (3.3882) grad_norm 2.0670 (2.1074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][540/1251] eta 0:02:55 lr 0.000816 wd 0.0500 time 0.2446 (0.2468) data time 0.0009 (0.0019) model time 0.2437 (0.2450) loss 3.7085 (3.3819) grad_norm 3.2303 (2.1068) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][550/1251] eta 0:02:52 lr 0.000816 wd 0.0500 time 0.2435 (0.2467) data time 0.0007 (0.0019) model time 0.2428 (0.2449) loss 2.2787 (3.3788) grad_norm 1.9680 (2.1084) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][560/1251] eta 0:02:50 lr 0.000816 wd 0.0500 time 0.2358 (0.2467) data time 0.0008 (0.0019) model time 0.2351 (0.2449) loss 3.1153 (3.3802) grad_norm 1.7387 (2.1066) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][570/1251] eta 0:02:47 lr 0.000816 wd 0.0500 time 0.2519 (0.2466) data time 0.0007 (0.0019) model time 0.2511 (0.2448) loss 4.2563 (3.3802) grad_norm 2.5218 (2.1070) loss_scale 8192.0000 (4124.6935) mem 7379MB [2024-08-26 10:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][580/1251] eta 0:02:45 lr 0.000816 wd 0.0500 time 0.2358 (0.2468) data time 0.0010 (0.0019) model time 0.2349 (0.2450) loss 3.6837 (3.3785) grad_norm 1.9841 (2.1074) loss_scale 8192.0000 (4194.6988) mem 7379MB [2024-08-26 10:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][590/1251] eta 0:02:43 lr 0.000816 wd 0.0500 time 0.2422 (0.2467) data time 0.0007 (0.0019) model time 0.2415 (0.2449) loss 4.0377 (3.3778) grad_norm 1.5320 (2.1051) loss_scale 8192.0000 (4262.3350) mem 7379MB [2024-08-26 10:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][600/1251] eta 0:02:40 lr 0.000816 wd 0.0500 time 0.2450 (0.2466) data time 0.0010 (0.0018) model time 0.2441 (0.2449) loss 2.5891 (3.3772) grad_norm 1.4923 (2.1061) loss_scale 8192.0000 (4327.7205) mem 7379MB [2024-08-26 10:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][610/1251] eta 0:02:37 lr 0.000816 wd 0.0500 time 0.2391 (0.2465) data time 0.0008 (0.0018) model time 0.2383 (0.2447) loss 3.9378 (3.3772) grad_norm 3.9912 (2.1055) loss_scale 8192.0000 (4390.9656) mem 7379MB [2024-08-26 10:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][620/1251] eta 0:02:35 lr 0.000816 wd 0.0500 time 0.2465 (0.2464) data time 0.0010 (0.0018) model time 0.2455 (0.2447) loss 2.9320 (3.3761) grad_norm 2.8339 (2.1094) loss_scale 8192.0000 (4452.1739) mem 7379MB [2024-08-26 10:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][630/1251] eta 0:02:33 lr 0.000816 wd 0.0500 time 0.2445 (0.2464) data time 0.0008 (0.0018) 
model time 0.2437 (0.2446) loss 3.6487 (3.3743) grad_norm 2.6386 (2.1118) loss_scale 8192.0000 (4511.4422) mem 7379MB [2024-08-26 10:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][640/1251] eta 0:02:30 lr 0.000816 wd 0.0500 time 0.2380 (0.2463) data time 0.0008 (0.0018) model time 0.2373 (0.2446) loss 4.2811 (3.3793) grad_norm 2.2482 (2.1131) loss_scale 8192.0000 (4568.8612) mem 7379MB [2024-08-26 10:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][650/1251] eta 0:02:27 lr 0.000816 wd 0.0500 time 0.2366 (0.2462) data time 0.0011 (0.0018) model time 0.2355 (0.2445) loss 3.8057 (3.3827) grad_norm 3.2614 (2.1145) loss_scale 8192.0000 (4624.5161) mem 7379MB [2024-08-26 10:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][660/1251] eta 0:02:25 lr 0.000816 wd 0.0500 time 0.2452 (0.2461) data time 0.0007 (0.0018) model time 0.2445 (0.2444) loss 3.7822 (3.3840) grad_norm 1.6964 (2.1107) loss_scale 8192.0000 (4678.4871) mem 7379MB [2024-08-26 10:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][670/1251] eta 0:02:22 lr 0.000816 wd 0.0500 time 0.2426 (0.2461) data time 0.0009 (0.0018) model time 0.2417 (0.2444) loss 3.4552 (3.3824) grad_norm 2.0701 (2.1075) loss_scale 8192.0000 (4730.8495) mem 7379MB [2024-08-26 10:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][680/1251] eta 0:02:20 lr 0.000816 wd 0.0500 time 0.2331 (0.2460) data time 0.0010 (0.0018) model time 0.2322 (0.2443) loss 3.6835 (3.3886) grad_norm 1.5158 (2.1074) loss_scale 8192.0000 (4781.6740) mem 7379MB [2024-08-26 10:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][690/1251] eta 0:02:18 lr 0.000816 wd 0.0500 time 0.2400 (0.2463) data time 0.0010 (0.0017) model time 0.2390 (0.2446) loss 2.6425 (3.3889) grad_norm 1.5403 (2.1070) loss_scale 8192.0000 (4831.0275) mem 7379MB [2024-08-26 10:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[99/300][700/1251] eta 0:02:15 lr 0.000816 wd 0.0500 time 0.2387 (0.2465) data time 0.0008 (0.0018) model time 0.2380 (0.2449) loss 3.7293 (3.3914) grad_norm 1.7847 (2.1080) loss_scale 8192.0000 (4878.9729) mem 7379MB [2024-08-26 10:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][710/1251] eta 0:02:13 lr 0.000816 wd 0.0500 time 0.2426 (0.2464) data time 0.0010 (0.0017) model time 0.2416 (0.2448) loss 3.3600 (3.3900) grad_norm 2.7814 (2.1112) loss_scale 8192.0000 (4925.5696) mem 7379MB [2024-08-26 10:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][720/1251] eta 0:02:10 lr 0.000815 wd 0.0500 time 0.2402 (0.2464) data time 0.0007 (0.0017) model time 0.2395 (0.2447) loss 3.8802 (3.3891) grad_norm 1.5491 (2.1107) loss_scale 8192.0000 (4970.8738) mem 7379MB [2024-08-26 10:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][730/1251] eta 0:02:08 lr 0.000815 wd 0.0500 time 0.2351 (0.2466) data time 0.0009 (0.0017) model time 0.2342 (0.2449) loss 2.5436 (3.3913) grad_norm 1.9322 (2.1089) loss_scale 8192.0000 (5014.9384) mem 7379MB [2024-08-26 10:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][740/1251] eta 0:02:05 lr 0.000815 wd 0.0500 time 0.2437 (0.2465) data time 0.0009 (0.0017) model time 0.2428 (0.2449) loss 2.8083 (3.3903) grad_norm 1.8893 (2.1078) loss_scale 8192.0000 (5057.8138) mem 7379MB [2024-08-26 10:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][750/1251] eta 0:02:03 lr 0.000815 wd 0.0500 time 0.2516 (0.2465) data time 0.0007 (0.0017) model time 0.2509 (0.2448) loss 3.7991 (3.3912) grad_norm 1.8381 (2.1043) loss_scale 8192.0000 (5099.5473) mem 7379MB [2024-08-26 10:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][760/1251] eta 0:02:00 lr 0.000815 wd 0.0500 time 0.2436 (0.2464) data time 0.0009 (0.0017) model time 0.2427 (0.2448) loss 3.8486 (3.3944) grad_norm 1.9448 (2.1044) loss_scale 8192.0000 
(5140.1840) mem 7379MB [2024-08-26 10:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][770/1251] eta 0:01:58 lr 0.000815 wd 0.0500 time 0.2310 (0.2463) data time 0.0009 (0.0017) model time 0.2302 (0.2447) loss 2.5850 (3.3910) grad_norm 2.2548 (2.1019) loss_scale 8192.0000 (5179.7665) mem 7379MB [2024-08-26 10:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][780/1251] eta 0:01:55 lr 0.000815 wd 0.0500 time 0.2459 (0.2463) data time 0.0009 (0.0017) model time 0.2450 (0.2447) loss 2.4975 (3.3898) grad_norm 2.3603 (2.1033) loss_scale 8192.0000 (5218.3355) mem 7379MB [2024-08-26 10:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][790/1251] eta 0:01:53 lr 0.000815 wd 0.0500 time 0.2362 (0.2462) data time 0.0009 (0.0017) model time 0.2353 (0.2446) loss 4.1665 (3.3927) grad_norm 2.8003 (2.1071) loss_scale 8192.0000 (5255.9292) mem 7379MB [2024-08-26 10:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][800/1251] eta 0:01:51 lr 0.000815 wd 0.0500 time 0.2503 (0.2462) data time 0.0011 (0.0017) model time 0.2492 (0.2446) loss 3.3972 (3.3965) grad_norm 2.9518 (2.1078) loss_scale 8192.0000 (5292.5843) mem 7379MB [2024-08-26 10:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][810/1251] eta 0:01:48 lr 0.000815 wd 0.0500 time 0.2427 (0.2464) data time 0.0007 (0.0017) model time 0.2420 (0.2448) loss 3.0876 (3.3972) grad_norm 2.1911 (2.1095) loss_scale 8192.0000 (5328.3354) mem 7379MB [2024-08-26 10:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][820/1251] eta 0:01:46 lr 0.000815 wd 0.0500 time 0.2345 (0.2466) data time 0.0009 (0.0017) model time 0.2336 (0.2450) loss 3.9801 (3.3963) grad_norm 2.5923 (2.1083) loss_scale 8192.0000 (5363.2156) mem 7379MB [2024-08-26 10:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][830/1251] eta 0:01:43 lr 0.000815 wd 0.0500 time 0.2436 (0.2466) data time 0.0009 (0.0017) 
model time 0.2427 (0.2450) loss 4.0558 (3.3965) grad_norm 1.5658 (2.1044) loss_scale 8192.0000 (5397.2563) mem 7379MB [2024-08-26 10:51:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][840/1251] eta 0:01:41 lr 0.000815 wd 0.0500 time 0.2436 (0.2465) data time 0.0007 (0.0017) model time 0.2429 (0.2449) loss 3.9701 (3.3992) grad_norm 1.7559 (2.1037) loss_scale 8192.0000 (5430.4875) mem 7379MB [2024-08-26 10:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][850/1251] eta 0:01:38 lr 0.000815 wd 0.0500 time 0.2404 (0.2464) data time 0.0007 (0.0017) model time 0.2397 (0.2448) loss 4.3696 (3.3988) grad_norm 2.6381 (2.1035) loss_scale 8192.0000 (5462.9377) mem 7379MB [2024-08-26 10:52:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][860/1251] eta 0:01:36 lr 0.000815 wd 0.0500 time 0.2415 (0.2464) data time 0.0011 (0.0017) model time 0.2403 (0.2448) loss 2.9049 (3.3973) grad_norm 2.4029 (2.1028) loss_scale 8192.0000 (5494.6341) mem 7379MB [2024-08-26 10:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][870/1251] eta 0:01:33 lr 0.000815 wd 0.0500 time 0.2425 (0.2463) data time 0.0009 (0.0016) model time 0.2416 (0.2447) loss 3.9729 (3.3964) grad_norm 1.4716 (2.1034) loss_scale 8192.0000 (5525.6028) mem 7379MB [2024-08-26 10:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][880/1251] eta 0:01:31 lr 0.000815 wd 0.0500 time 0.2453 (0.2463) data time 0.0009 (0.0016) model time 0.2444 (0.2447) loss 3.6867 (3.3970) grad_norm 1.8255 (2.1027) loss_scale 8192.0000 (5555.8683) mem 7379MB [2024-08-26 10:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][890/1251] eta 0:01:28 lr 0.000815 wd 0.0500 time 0.2380 (0.2464) data time 0.0007 (0.0016) model time 0.2372 (0.2449) loss 3.5347 (3.3983) grad_norm 2.2998 (2.1033) loss_scale 8192.0000 (5585.4545) mem 7379MB [2024-08-26 10:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[99/300][900/1251] eta 0:01:26 lr 0.000815 wd 0.0500 time 0.2453 (0.2464) data time 0.0009 (0.0016) model time 0.2444 (0.2448) loss 3.3133 (3.3942) grad_norm 1.9606 (2.1067) loss_scale 8192.0000 (5614.3840) mem 7379MB [2024-08-26 10:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][910/1251] eta 0:01:24 lr 0.000815 wd 0.0500 time 0.2551 (0.2464) data time 0.0013 (0.0016) model time 0.2539 (0.2448) loss 3.6731 (3.3941) grad_norm 2.7876 (2.1071) loss_scale 8192.0000 (5642.6784) mem 7379MB [2024-08-26 10:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][920/1251] eta 0:01:21 lr 0.000815 wd 0.0500 time 0.2460 (0.2463) data time 0.0012 (0.0016) model time 0.2448 (0.2448) loss 3.2589 (3.3944) grad_norm 2.3022 (2.1060) loss_scale 8192.0000 (5670.3583) mem 7379MB [2024-08-26 10:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][930/1251] eta 0:01:19 lr 0.000815 wd 0.0500 time 0.2460 (0.2462) data time 0.0009 (0.0016) model time 0.2450 (0.2447) loss 3.2929 (3.3918) grad_norm 1.9118 (2.1066) loss_scale 8192.0000 (5697.4436) mem 7379MB [2024-08-26 10:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][940/1251] eta 0:01:16 lr 0.000815 wd 0.0500 time 0.2364 (0.2462) data time 0.0009 (0.0016) model time 0.2355 (0.2447) loss 3.0673 (3.3938) grad_norm 1.6815 (2.1038) loss_scale 8192.0000 (5723.9532) mem 7379MB [2024-08-26 10:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][950/1251] eta 0:01:14 lr 0.000815 wd 0.0500 time 0.2393 (0.2461) data time 0.0011 (0.0016) model time 0.2383 (0.2446) loss 3.7481 (3.3926) grad_norm 1.9833 (2.1024) loss_scale 8192.0000 (5749.9054) mem 7379MB [2024-08-26 10:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][960/1251] eta 0:01:11 lr 0.000815 wd 0.0500 time 0.2585 (0.2461) data time 0.0011 (0.0016) model time 0.2574 (0.2446) loss 3.4824 (3.3914) grad_norm 3.0472 (2.1041) loss_scale 8192.0000 
(5775.3174) mem 7379MB [2024-08-26 10:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][970/1251] eta 0:01:09 lr 0.000815 wd 0.0500 time 0.4130 (0.2462) data time 0.0010 (0.0016) model time 0.4120 (0.2447) loss 3.2974 (3.3901) grad_norm 2.4307 (2.1088) loss_scale 8192.0000 (5800.2060) mem 7379MB [2024-08-26 10:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][980/1251] eta 0:01:06 lr 0.000815 wd 0.0500 time 0.2384 (0.2462) data time 0.0010 (0.0016) model time 0.2373 (0.2446) loss 3.8132 (3.3900) grad_norm 4.1186 (2.1137) loss_scale 8192.0000 (5824.5872) mem 7379MB [2024-08-26 10:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][990/1251] eta 0:01:04 lr 0.000815 wd 0.0500 time 0.2399 (0.2461) data time 0.0010 (0.0016) model time 0.2389 (0.2446) loss 4.0073 (3.3899) grad_norm 1.5626 (2.1120) loss_scale 8192.0000 (5848.4763) mem 7379MB [2024-08-26 10:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1000/1251] eta 0:01:01 lr 0.000814 wd 0.0500 time 0.2337 (0.2460) data time 0.0010 (0.0016) model time 0.2327 (0.2445) loss 3.2059 (3.3918) grad_norm 2.4535 (2.1123) loss_scale 8192.0000 (5871.8881) mem 7379MB [2024-08-26 10:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1010/1251] eta 0:00:59 lr 0.000814 wd 0.0500 time 0.2450 (0.2460) data time 0.0008 (0.0016) model time 0.2441 (0.2445) loss 4.2818 (3.3931) grad_norm 1.7316 (2.1107) loss_scale 8192.0000 (5894.8368) mem 7379MB [2024-08-26 10:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1020/1251] eta 0:00:56 lr 0.000814 wd 0.0500 time 0.2401 (0.2459) data time 0.0009 (0.0015) model time 0.2391 (0.2444) loss 3.5933 (3.3942) grad_norm 1.9631 (2.1096) loss_scale 8192.0000 (5917.3359) mem 7379MB [2024-08-26 10:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1030/1251] eta 0:00:54 lr 0.000814 wd 0.0500 time 0.2444 (0.2461) data time 0.0011 
(0.0015) model time 0.2433 (0.2446) loss 3.6188 (3.3922) grad_norm 3.5905 (2.1133) loss_scale 8192.0000 (5939.3986) mem 7379MB [2024-08-26 10:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1040/1251] eta 0:00:51 lr 0.000814 wd 0.0500 time 0.2407 (0.2461) data time 0.0009 (0.0015) model time 0.2398 (0.2446) loss 3.4727 (3.3922) grad_norm 1.8156 (2.1135) loss_scale 8192.0000 (5961.0375) mem 7379MB [2024-08-26 10:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1050/1251] eta 0:00:49 lr 0.000814 wd 0.0500 time 0.2492 (0.2460) data time 0.0009 (0.0015) model time 0.2482 (0.2446) loss 3.1522 (3.3921) grad_norm 1.4826 (2.1111) loss_scale 8192.0000 (5982.2645) mem 7379MB [2024-08-26 10:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1060/1251] eta 0:00:46 lr 0.000814 wd 0.0500 time 0.2407 (0.2460) data time 0.0011 (0.0015) model time 0.2396 (0.2445) loss 3.7335 (3.3936) grad_norm 2.4683 (2.1102) loss_scale 8192.0000 (6003.0914) mem 7379MB [2024-08-26 10:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1070/1251] eta 0:00:44 lr 0.000814 wd 0.0500 time 0.2408 (0.2459) data time 0.0010 (0.0015) model time 0.2398 (0.2445) loss 3.7067 (3.3947) grad_norm 1.9844 (2.1087) loss_scale 8192.0000 (6023.5294) mem 7379MB [2024-08-26 10:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1080/1251] eta 0:00:42 lr 0.000814 wd 0.0500 time 0.2397 (0.2459) data time 0.0009 (0.0015) model time 0.2389 (0.2445) loss 1.9047 (3.3919) grad_norm 4.1646 (2.1110) loss_scale 8192.0000 (6043.5893) mem 7379MB [2024-08-26 10:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1090/1251] eta 0:00:39 lr 0.000814 wd 0.0500 time 0.2411 (0.2459) data time 0.0011 (0.0015) model time 0.2399 (0.2445) loss 3.0839 (3.3925) grad_norm 2.5204 (2.1183) loss_scale 8192.0000 (6063.2814) mem 7379MB [2024-08-26 10:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [99/300][1100/1251] eta 0:00:37 lr 0.000814 wd 0.0500 time 0.2325 (0.2459) data time 0.0011 (0.0015) model time 0.2314 (0.2444) loss 3.6387 (3.3898) grad_norm 1.7397 (2.1200) loss_scale 8192.0000 (6082.6158) mem 7379MB [2024-08-26 10:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1110/1251] eta 0:00:34 lr 0.000814 wd 0.0500 time 0.2372 (0.2458) data time 0.0008 (0.0015) model time 0.2365 (0.2443) loss 4.6400 (3.3907) grad_norm 1.7221 (2.1195) loss_scale 8192.0000 (6101.6022) mem 7379MB [2024-08-26 10:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1120/1251] eta 0:00:32 lr 0.000814 wd 0.0500 time 0.2407 (0.2458) data time 0.0008 (0.0015) model time 0.2399 (0.2443) loss 4.1777 (3.3915) grad_norm 2.4565 (2.1188) loss_scale 8192.0000 (6120.2498) mem 7379MB [2024-08-26 10:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1130/1251] eta 0:00:29 lr 0.000814 wd 0.0500 time 0.2421 (0.2461) data time 0.0011 (0.0015) model time 0.2410 (0.2447) loss 3.0294 (3.3919) grad_norm 2.4743 (2.1186) loss_scale 8192.0000 (6138.5676) mem 7379MB [2024-08-26 10:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1140/1251] eta 0:00:27 lr 0.000814 wd 0.0500 time 0.2390 (0.2461) data time 0.0011 (0.0015) model time 0.2379 (0.2446) loss 3.4692 (3.3920) grad_norm 1.7011 (2.1178) loss_scale 8192.0000 (6156.5644) mem 7379MB [2024-08-26 10:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1150/1251] eta 0:00:24 lr 0.000814 wd 0.0500 time 0.2550 (0.2460) data time 0.0008 (0.0015) model time 0.2542 (0.2446) loss 2.9286 (3.3942) grad_norm 2.8317 (2.1180) loss_scale 8192.0000 (6174.2485) mem 7379MB [2024-08-26 10:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1160/1251] eta 0:00:22 lr 0.000814 wd 0.0500 time 0.2341 (0.2460) data time 0.0011 (0.0015) model time 0.2330 (0.2445) loss 2.5207 (3.3910) grad_norm 1.7833 (2.1158) loss_scale 
8192.0000 (6191.6279) mem 7379MB [2024-08-26 10:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1170/1251] eta 0:00:19 lr 0.000814 wd 0.0500 time 0.2377 (0.2461) data time 0.0007 (0.0015) model time 0.2370 (0.2447) loss 2.9758 (3.3900) grad_norm 2.6818 (2.1165) loss_scale 8192.0000 (6208.7105) mem 7379MB [2024-08-26 10:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1180/1251] eta 0:00:17 lr 0.000814 wd 0.0500 time 0.2435 (0.2461) data time 0.0008 (0.0015) model time 0.2427 (0.2446) loss 3.9580 (3.3909) grad_norm 2.0177 (2.1196) loss_scale 8192.0000 (6225.5038) mem 7379MB [2024-08-26 10:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1190/1251] eta 0:00:15 lr 0.000814 wd 0.0500 time 0.2443 (0.2460) data time 0.0009 (0.0015) model time 0.2434 (0.2446) loss 4.2106 (3.3909) grad_norm 2.1104 (2.1178) loss_scale 8192.0000 (6242.0151) mem 7379MB [2024-08-26 10:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1200/1251] eta 0:00:12 lr 0.000814 wd 0.0500 time 0.2488 (0.2460) data time 0.0010 (0.0015) model time 0.2478 (0.2446) loss 3.8985 (3.3910) grad_norm 1.9072 (2.1164) loss_scale 8192.0000 (6258.2515) mem 7379MB [2024-08-26 10:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1210/1251] eta 0:00:10 lr 0.000814 wd 0.0500 time 0.2366 (0.2460) data time 0.0009 (0.0015) model time 0.2358 (0.2446) loss 4.4982 (3.3940) grad_norm 2.8015 (2.1156) loss_scale 8192.0000 (6274.2197) mem 7379MB [2024-08-26 10:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1220/1251] eta 0:00:07 lr 0.000814 wd 0.0500 time 0.2528 (0.2460) data time 0.0012 (0.0015) model time 0.2515 (0.2445) loss 3.1179 (3.3935) grad_norm 2.1992 (2.1166) loss_scale 8192.0000 (6289.9263) mem 7379MB [2024-08-26 10:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1230/1251] eta 0:00:05 lr 0.000814 wd 0.0500 time 0.2415 (0.2460) data time 
0.0010 (0.0015) model time 0.2405 (0.2445) loss 2.3897 (3.3925) grad_norm 1.7820 (2.1171) loss_scale 8192.0000 (6305.3777) mem 7379MB [2024-08-26 10:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1240/1251] eta 0:00:02 lr 0.000814 wd 0.0500 time 0.2257 (0.2459) data time 0.0007 (0.0015) model time 0.2250 (0.2444) loss 4.0955 (3.3941) grad_norm 2.7007 (2.1156) loss_scale 8192.0000 (6320.5802) mem 7379MB [2024-08-26 10:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [99/300][1250/1251] eta 0:00:00 lr 0.000814 wd 0.0500 time 0.2246 (0.2457) data time 0.0007 (0.0015) model time 0.2239 (0.2443) loss 3.5879 (3.3938) grad_norm 2.8740 (inf) loss_scale 4096.0000 (6306.0719) mem 7379MB [2024-08-26 10:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 99 training takes 0:05:07 [2024-08-26 10:53:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:53:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 10:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.398 (0.398) Loss 0.5083 (0.5083) Acc@1 90.137 (90.137) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 10:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.109) Loss 0.7773 (0.8091) Acc@1 84.668 (82.422) Acc@5 96.680 (96.431) Mem 7379MB [2024-08-26 10:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.096) Loss 1.1875 (0.8340) Acc@1 71.973 (81.534) Acc@5 93.164 (96.373) Mem 7379MB [2024-08-26 10:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.090) Loss 1.4531 (0.9407) Acc@1 64.648 (78.944) Acc@5 88.770 (94.849) Mem 7379MB [2024-08-26 10:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.2900 (1.0053) Acc@1 70.117 (77.260) Acc@5 90.430 (94.072) Mem 7379MB [2024-08-26 10:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 76.952 Acc@5 94.032 [2024-08-26 10:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.0% [2024-08-26 10:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.808 (0.808) Loss 0.4404 (0.4404) Acc@1 92.383 (92.383) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 10:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.150) Loss 0.7075 (0.6901) Acc@1 86.133 (85.183) Acc@5 96.191 (97.070) Mem 7379MB [2024-08-26 10:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.116) Loss 0.9858 (0.7136) Acc@1 76.562 (84.152) Acc@5 94.141 (97.015) Mem 7379MB [2024-08-26 10:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.103) Loss 1.2744 (0.8138) Acc@1 67.090 (81.792) Acc@5 90.918 (95.839) Mem 7379MB [2024-08-26 10:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.1377 
(0.8655) Acc@1 71.484 (80.302) Acc@5 93.359 (95.343) Mem 7379MB [2024-08-26 10:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.886 Acc@5 95.308 [2024-08-26 10:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.9% [2024-08-26 10:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.89% [2024-08-26 10:53:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 10:53:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 10:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][0/1251] eta 0:16:03 lr 0.000814 wd 0.0500 time 0.7705 (0.7705) data time 0.5459 (0.5459) model time 0.0000 (0.0000) loss 3.3998 (3.3998) grad_norm 2.8424 (2.8424) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][10/1251] eta 0:05:58 lr 0.000814 wd 0.0500 time 0.2386 (0.2893) data time 0.0007 (0.0507) model time 0.0000 (0.0000) loss 2.7443 (3.3662) grad_norm 4.1212 (2.4242) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][20/1251] eta 0:05:29 lr 0.000814 wd 0.0500 time 0.2466 (0.2677) data time 0.0008 (0.0270) model time 0.0000 (0.0000) loss 1.9382 (3.4344) grad_norm 2.1512 (2.4046) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][30/1251] eta 0:05:16 lr 0.000814 wd 0.0500 time 0.2341 (0.2590) data time 0.0011 (0.0186) model time 0.0000 (0.0000) loss 3.4045 (3.3610) grad_norm 1.7351 (2.2562) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][40/1251] eta 
0:05:08 lr 0.000813 wd 0.0500 time 0.2405 (0.2548) data time 0.0011 (0.0144) model time 0.0000 (0.0000) loss 3.2434 (3.3295) grad_norm 1.7427 (2.1515) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][50/1251] eta 0:05:08 lr 0.000813 wd 0.0500 time 0.2470 (0.2566) data time 0.0010 (0.0117) model time 0.0000 (0.0000) loss 3.6119 (3.3205) grad_norm 2.3524 (2.1963) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][60/1251] eta 0:05:02 lr 0.000813 wd 0.0500 time 0.2380 (0.2542) data time 0.0011 (0.0101) model time 0.2369 (0.2405) loss 3.6749 (3.3393) grad_norm 2.2228 (2.1589) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][70/1251] eta 0:04:58 lr 0.000813 wd 0.0500 time 0.2427 (0.2525) data time 0.0009 (0.0088) model time 0.2418 (0.2407) loss 3.4373 (3.3660) grad_norm 2.4729 (2.1808) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][80/1251] eta 0:04:53 lr 0.000813 wd 0.0500 time 0.2380 (0.2510) data time 0.0009 (0.0079) model time 0.2371 (0.2401) loss 3.7328 (3.3889) grad_norm 1.9012 (2.1998) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][90/1251] eta 0:04:50 lr 0.000813 wd 0.0500 time 0.2397 (0.2501) data time 0.0007 (0.0071) model time 0.2390 (0.2406) loss 2.3370 (3.3957) grad_norm 2.2493 (2.1829) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][100/1251] eta 0:04:46 lr 0.000813 wd 0.0500 time 0.2445 (0.2493) data time 0.0007 (0.0065) model time 0.2438 (0.2408) loss 3.4413 (3.3938) grad_norm 2.9594 (2.1820) loss_scale 4096.0000 (4096.0000) mem 7379MB 
[2024-08-26 10:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][110/1251] eta 0:04:43 lr 0.000813 wd 0.0500 time 0.2444 (0.2486) data time 0.0010 (0.0060) model time 0.2435 (0.2407) loss 2.8468 (3.3879) grad_norm 1.7302 (2.1963) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][120/1251] eta 0:04:40 lr 0.000813 wd 0.0500 time 0.2365 (0.2480) data time 0.0008 (0.0056) model time 0.2357 (0.2407) loss 2.6547 (3.3862) grad_norm 3.0849 (2.1954) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][130/1251] eta 0:04:37 lr 0.000813 wd 0.0500 time 0.2451 (0.2475) data time 0.0007 (0.0053) model time 0.2444 (0.2406) loss 2.2568 (3.3595) grad_norm 1.7096 (2.1686) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][140/1251] eta 0:04:34 lr 0.000813 wd 0.0500 time 0.2462 (0.2472) data time 0.0012 (0.0050) model time 0.2451 (0.2407) loss 3.5432 (3.3507) grad_norm 1.6171 (2.1406) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][150/1251] eta 0:04:34 lr 0.000813 wd 0.0500 time 0.4453 (0.2494) data time 0.0008 (0.0048) model time 0.4446 (0.2446) loss 3.6285 (3.3454) grad_norm 2.6790 (2.1232) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][160/1251] eta 0:04:32 lr 0.000813 wd 0.0500 time 0.2463 (0.2501) data time 0.0009 (0.0045) model time 0.2453 (0.2459) loss 3.5120 (3.3486) grad_norm 2.1264 (2.1210) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][170/1251] eta 0:04:29 lr 0.000813 wd 0.0500 time 0.2393 (0.2496) data time 0.0010 (0.0043) model time 0.2383 
(0.2455) loss 3.9209 (3.3418) grad_norm 2.7882 (2.1237) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][180/1251] eta 0:04:28 lr 0.000813 wd 0.0500 time 0.2428 (0.2506) data time 0.0007 (0.0041) model time 0.2421 (0.2470) loss 3.2805 (3.3303) grad_norm 1.8339 (2.1313) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][190/1251] eta 0:04:25 lr 0.000813 wd 0.0500 time 0.2491 (0.2501) data time 0.0007 (0.0040) model time 0.2484 (0.2466) loss 4.2721 (3.3375) grad_norm 1.6306 (2.1298) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][200/1251] eta 0:04:22 lr 0.000813 wd 0.0500 time 0.2395 (0.2498) data time 0.0008 (0.0038) model time 0.2387 (0.2463) loss 2.7193 (3.3410) grad_norm 1.8326 (2.1207) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][210/1251] eta 0:04:21 lr 0.000813 wd 0.0500 time 0.2414 (0.2514) data time 0.0008 (0.0037) model time 0.2406 (0.2486) loss 2.6188 (3.3446) grad_norm 1.7921 (2.1206) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][220/1251] eta 0:04:18 lr 0.000813 wd 0.0500 time 0.2430 (0.2509) data time 0.0010 (0.0036) model time 0.2419 (0.2481) loss 3.2090 (3.3535) grad_norm 1.9384 (2.1094) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][230/1251] eta 0:04:16 lr 0.000813 wd 0.0500 time 0.4341 (0.2514) data time 0.0010 (0.0035) model time 0.4332 (0.2488) loss 4.1089 (3.3605) grad_norm 2.4018 (2.1233) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[100/300][240/1251] eta 0:04:13 lr 0.000813 wd 0.0500 time 0.2403 (0.2509) data time 0.0011 (0.0034) model time 0.2392 (0.2483) loss 3.7655 (3.3653) grad_norm 3.1914 (2.1245) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][250/1251] eta 0:04:11 lr 0.000813 wd 0.0500 time 0.2429 (0.2511) data time 0.0010 (0.0033) model time 0.2419 (0.2486) loss 4.0335 (3.3673) grad_norm 2.2267 (2.1338) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][260/1251] eta 0:04:08 lr 0.000813 wd 0.0500 time 0.2342 (0.2509) data time 0.0019 (0.0032) model time 0.2322 (0.2484) loss 2.9164 (3.3568) grad_norm 1.8594 (2.1368) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][270/1251] eta 0:04:05 lr 0.000813 wd 0.0500 time 0.2501 (0.2506) data time 0.0011 (0.0031) model time 0.2490 (0.2481) loss 3.2555 (3.3636) grad_norm 1.8329 (2.1320) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][280/1251] eta 0:04:02 lr 0.000813 wd 0.0500 time 0.2405 (0.2502) data time 0.0010 (0.0030) model time 0.2395 (0.2477) loss 3.4509 (3.3607) grad_norm 2.3587 (2.1287) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][290/1251] eta 0:04:00 lr 0.000813 wd 0.0500 time 0.2402 (0.2499) data time 0.0008 (0.0030) model time 0.2394 (0.2474) loss 2.9646 (3.3642) grad_norm 2.3932 (2.1334) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][300/1251] eta 0:03:57 lr 0.000813 wd 0.0500 time 0.2322 (0.2496) data time 0.0010 (0.0029) model time 0.2313 (0.2470) loss 3.0208 (3.3633) grad_norm 1.7671 (2.1339) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 10:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][310/1251] eta 0:03:54 lr 0.000813 wd 0.0500 time 0.2361 (0.2493) data time 0.0010 (0.0029) model time 0.2351 (0.2468) loss 3.5026 (3.3631) grad_norm 2.0856 (2.1379) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][320/1251] eta 0:03:51 lr 0.000813 wd 0.0500 time 0.2469 (0.2491) data time 0.0009 (0.0028) model time 0.2460 (0.2465) loss 2.8652 (3.3565) grad_norm 2.0945 (2.1306) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][330/1251] eta 0:03:49 lr 0.000812 wd 0.0500 time 0.2448 (0.2488) data time 0.0010 (0.0027) model time 0.2438 (0.2463) loss 2.9671 (3.3662) grad_norm 1.5656 (2.1197) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][340/1251] eta 0:03:46 lr 0.000812 wd 0.0500 time 0.2403 (0.2486) data time 0.0011 (0.0027) model time 0.2392 (0.2460) loss 3.6227 (3.3678) grad_norm 1.9460 (2.1173) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][350/1251] eta 0:03:43 lr 0.000812 wd 0.0500 time 0.2389 (0.2483) data time 0.0013 (0.0027) model time 0.2376 (0.2458) loss 3.8404 (3.3625) grad_norm 2.0133 (2.1166) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][360/1251] eta 0:03:41 lr 0.000812 wd 0.0500 time 0.2430 (0.2482) data time 0.0010 (0.0026) model time 0.2421 (0.2457) loss 2.9055 (3.3558) grad_norm 2.1837 (2.1187) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][370/1251] eta 0:03:38 lr 0.000812 wd 0.0500 time 0.2410 (0.2481) data time 0.0010 
(0.0026) model time 0.2400 (0.2456) loss 3.3630 (3.3547) grad_norm 2.1985 (2.1132) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][380/1251] eta 0:03:35 lr 0.000812 wd 0.0500 time 0.2403 (0.2478) data time 0.0011 (0.0025) model time 0.2392 (0.2454) loss 3.4345 (3.3602) grad_norm 2.7786 (2.1099) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][390/1251] eta 0:03:33 lr 0.000812 wd 0.0500 time 0.2366 (0.2477) data time 0.0010 (0.0025) model time 0.2355 (0.2453) loss 2.5064 (3.3655) grad_norm 2.0624 (2.1259) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][400/1251] eta 0:03:30 lr 0.000812 wd 0.0500 time 0.2392 (0.2475) data time 0.0008 (0.0025) model time 0.2385 (0.2451) loss 2.8428 (3.3659) grad_norm 1.7978 (2.1233) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][410/1251] eta 0:03:28 lr 0.000812 wd 0.0500 time 0.2417 (0.2474) data time 0.0011 (0.0024) model time 0.2406 (0.2450) loss 3.3986 (3.3664) grad_norm 1.8942 (2.1157) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][420/1251] eta 0:03:25 lr 0.000812 wd 0.0500 time 0.2458 (0.2473) data time 0.0009 (0.0024) model time 0.2449 (0.2449) loss 2.7997 (3.3580) grad_norm 1.8716 (2.1183) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][430/1251] eta 0:03:22 lr 0.000812 wd 0.0500 time 0.2392 (0.2471) data time 0.0011 (0.0024) model time 0.2381 (0.2448) loss 3.9952 (3.3553) grad_norm 2.1544 (2.1148) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [100/300][440/1251] eta 0:03:20 lr 0.000812 wd 0.0500 time 0.2433 (0.2470) data time 0.0009 (0.0023) model time 0.2424 (0.2447) loss 3.6719 (3.3592) grad_norm 2.1208 (2.1097) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][450/1251] eta 0:03:17 lr 0.000812 wd 0.0500 time 0.2402 (0.2469) data time 0.0010 (0.0023) model time 0.2393 (0.2446) loss 3.5147 (3.3588) grad_norm 1.7829 (2.1020) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][460/1251] eta 0:03:15 lr 0.000812 wd 0.0500 time 0.2462 (0.2467) data time 0.0010 (0.0023) model time 0.2451 (0.2444) loss 3.2438 (3.3568) grad_norm 1.6552 (2.1000) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][470/1251] eta 0:03:12 lr 0.000812 wd 0.0500 time 0.2379 (0.2466) data time 0.0010 (0.0022) model time 0.2370 (0.2443) loss 2.9252 (3.3564) grad_norm 2.4311 (2.0976) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][480/1251] eta 0:03:10 lr 0.000812 wd 0.0500 time 0.2451 (0.2465) data time 0.0008 (0.0022) model time 0.2444 (0.2442) loss 3.7272 (3.3597) grad_norm 2.1582 (2.0995) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][490/1251] eta 0:03:07 lr 0.000812 wd 0.0500 time 0.2501 (0.2464) data time 0.0010 (0.0022) model time 0.2491 (0.2442) loss 2.5306 (3.3572) grad_norm 3.5452 (2.0988) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][500/1251] eta 0:03:05 lr 0.000812 wd 0.0500 time 0.2438 (0.2464) data time 0.0007 (0.0022) model time 0.2430 (0.2441) loss 3.3439 (3.3616) grad_norm 1.8380 (2.1031) loss_scale 
4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][510/1251] eta 0:03:02 lr 0.000812 wd 0.0500 time 0.2439 (0.2463) data time 0.0007 (0.0022) model time 0.2432 (0.2441) loss 2.7318 (3.3618) grad_norm 1.7687 (2.0955) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][520/1251] eta 0:02:59 lr 0.000812 wd 0.0500 time 0.2467 (0.2462) data time 0.0007 (0.0022) model time 0.2459 (0.2440) loss 4.4692 (3.3657) grad_norm 2.2638 (2.0973) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][530/1251] eta 0:02:57 lr 0.000812 wd 0.0500 time 0.2296 (0.2461) data time 0.0013 (0.0021) model time 0.2284 (0.2439) loss 3.7103 (3.3660) grad_norm 1.7981 (2.1076) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][540/1251] eta 0:02:55 lr 0.000812 wd 0.0500 time 0.2451 (0.2465) data time 0.0011 (0.0021) model time 0.2440 (0.2443) loss 4.0179 (3.3735) grad_norm 1.6223 (2.1030) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][550/1251] eta 0:02:52 lr 0.000812 wd 0.0500 time 0.2460 (0.2464) data time 0.0009 (0.0021) model time 0.2450 (0.2443) loss 3.6747 (3.3786) grad_norm 1.7871 (2.1012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][560/1251] eta 0:02:50 lr 0.000812 wd 0.0500 time 0.2468 (0.2464) data time 0.0007 (0.0021) model time 0.2460 (0.2442) loss 3.5823 (3.3810) grad_norm 2.0457 (2.1038) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][570/1251] eta 0:02:47 lr 0.000812 wd 0.0500 time 0.2359 (0.2463) data time 
0.0007 (0.0021) model time 0.2352 (0.2441) loss 3.7172 (3.3791) grad_norm 1.9382 (2.1061) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][580/1251] eta 0:02:45 lr 0.000812 wd 0.0500 time 0.2391 (0.2462) data time 0.0008 (0.0021) model time 0.2383 (0.2440) loss 2.8170 (3.3755) grad_norm 1.4922 (2.1043) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][590/1251] eta 0:02:42 lr 0.000812 wd 0.0500 time 0.2390 (0.2461) data time 0.0010 (0.0021) model time 0.2380 (0.2439) loss 3.1158 (3.3766) grad_norm 1.7676 (2.1033) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][600/1251] eta 0:02:40 lr 0.000812 wd 0.0500 time 0.2373 (0.2460) data time 0.0009 (0.0021) model time 0.2364 (0.2439) loss 3.1932 (3.3755) grad_norm 1.6484 (2.0999) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][610/1251] eta 0:02:37 lr 0.000812 wd 0.0500 time 0.2497 (0.2460) data time 0.0008 (0.0020) model time 0.2490 (0.2438) loss 4.1293 (3.3800) grad_norm 2.2082 (2.1003) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][620/1251] eta 0:02:35 lr 0.000811 wd 0.0500 time 0.2410 (0.2459) data time 0.0013 (0.0020) model time 0.2397 (0.2438) loss 3.7006 (3.3767) grad_norm 1.7755 (2.1012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][630/1251] eta 0:02:32 lr 0.000811 wd 0.0500 time 0.2418 (0.2458) data time 0.0010 (0.0020) model time 0.2408 (0.2437) loss 3.5489 (3.3801) grad_norm 1.6871 (2.1040) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [100/300][640/1251] eta 0:02:30 lr 0.000811 wd 0.0500 time 0.2426 (0.2458) data time 0.0008 (0.0020) model time 0.2418 (0.2437) loss 2.6966 (3.3821) grad_norm 2.7791 (2.1065) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][650/1251] eta 0:02:27 lr 0.000811 wd 0.0500 time 0.2356 (0.2460) data time 0.0011 (0.0020) model time 0.2345 (0.2440) loss 3.4684 (3.3841) grad_norm 3.4492 (2.1120) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][660/1251] eta 0:02:25 lr 0.000811 wd 0.0500 time 0.2416 (0.2460) data time 0.0013 (0.0020) model time 0.2403 (0.2440) loss 2.0097 (3.3790) grad_norm 1.5496 (2.1085) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][670/1251] eta 0:02:22 lr 0.000811 wd 0.0500 time 0.2381 (0.2460) data time 0.0010 (0.0020) model time 0.2371 (0.2440) loss 4.1105 (3.3758) grad_norm 1.5290 (2.1097) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][680/1251] eta 0:02:20 lr 0.000811 wd 0.0500 time 0.2367 (0.2461) data time 0.0013 (0.0019) model time 0.2354 (0.2441) loss 3.6587 (3.3790) grad_norm 2.7177 (2.1148) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][690/1251] eta 0:02:18 lr 0.000811 wd 0.0500 time 0.2432 (0.2463) data time 0.0013 (0.0019) model time 0.2420 (0.2444) loss 3.6692 (3.3793) grad_norm 2.4465 (2.1160) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][700/1251] eta 0:02:15 lr 0.000811 wd 0.0500 time 0.2379 (0.2462) data time 0.0010 (0.0019) model time 0.2369 (0.2443) loss 2.4765 (3.3779) grad_norm 1.6203 (2.1134) 
loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][710/1251] eta 0:02:13 lr 0.000811 wd 0.0500 time 0.2441 (0.2462) data time 0.0008 (0.0019) model time 0.2433 (0.2443) loss 2.4606 (3.3810) grad_norm 1.9047 (2.1127) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][720/1251] eta 0:02:11 lr 0.000811 wd 0.0500 time 0.2439 (0.2468) data time 0.0011 (0.0019) model time 0.2428 (0.2449) loss 4.2090 (3.3822) grad_norm 1.3359 (2.1111) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][730/1251] eta 0:02:08 lr 0.000811 wd 0.0500 time 0.2476 (0.2468) data time 0.0010 (0.0019) model time 0.2467 (0.2449) loss 2.3994 (3.3844) grad_norm 1.8148 (2.1100) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][740/1251] eta 0:02:06 lr 0.000811 wd 0.0500 time 0.2436 (0.2467) data time 0.0009 (0.0019) model time 0.2427 (0.2448) loss 4.5577 (3.3884) grad_norm 1.8556 (2.1086) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][750/1251] eta 0:02:03 lr 0.000811 wd 0.0500 time 0.2651 (0.2467) data time 0.0010 (0.0019) model time 0.2641 (0.2448) loss 2.8166 (3.3859) grad_norm 2.7365 (2.1099) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][760/1251] eta 0:02:01 lr 0.000811 wd 0.0500 time 0.2390 (0.2466) data time 0.0010 (0.0019) model time 0.2380 (0.2448) loss 3.6549 (3.3854) grad_norm 1.6583 (2.1091) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][770/1251] eta 0:01:58 lr 0.000811 wd 0.0500 time 0.2548 (0.2469) 
data time 0.0007 (0.0019) model time 0.2541 (0.2450) loss 2.5129 (3.3872) grad_norm 1.5594 (2.1094) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][780/1251] eta 0:01:56 lr 0.000811 wd 0.0500 time 0.2425 (0.2468) data time 0.0010 (0.0018) model time 0.2415 (0.2450) loss 3.6329 (3.3882) grad_norm 2.3785 (2.1088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][790/1251] eta 0:01:53 lr 0.000811 wd 0.0500 time 0.2374 (0.2467) data time 0.0009 (0.0018) model time 0.2365 (0.2449) loss 3.3332 (3.3856) grad_norm 2.1374 (2.1097) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][800/1251] eta 0:01:51 lr 0.000811 wd 0.0500 time 0.2390 (0.2467) data time 0.0008 (0.0018) model time 0.2382 (0.2449) loss 3.3100 (3.3876) grad_norm 2.0783 (2.1099) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][810/1251] eta 0:01:48 lr 0.000811 wd 0.0500 time 0.2432 (0.2466) data time 0.0009 (0.0018) model time 0.2424 (0.2448) loss 3.9428 (3.3857) grad_norm 1.6928 (2.1088) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][820/1251] eta 0:01:46 lr 0.000811 wd 0.0500 time 0.2348 (0.2466) data time 0.0010 (0.0018) model time 0.2338 (0.2448) loss 3.6699 (3.3864) grad_norm 1.5183 (2.1058) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][830/1251] eta 0:01:43 lr 0.000811 wd 0.0500 time 0.2474 (0.2465) data time 0.0007 (0.0018) model time 0.2467 (0.2447) loss 2.3463 (3.3856) grad_norm 1.8044 (2.1026) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:15 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [100/300][840/1251] eta 0:01:41 lr 0.000811 wd 0.0500 time 0.2421 (0.2465) data time 0.0008 (0.0018) model time 0.2414 (0.2447) loss 4.2945 (3.3874) grad_norm 3.8902 (2.1029) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][850/1251] eta 0:01:38 lr 0.000811 wd 0.0500 time 0.2464 (0.2464) data time 0.0008 (0.0018) model time 0.2456 (0.2446) loss 3.9643 (3.3862) grad_norm 2.1220 (2.1087) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][860/1251] eta 0:01:36 lr 0.000811 wd 0.0500 time 0.2318 (0.2464) data time 0.0009 (0.0018) model time 0.2309 (0.2446) loss 3.1028 (3.3883) grad_norm 1.8244 (2.1094) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][870/1251] eta 0:01:33 lr 0.000811 wd 0.0500 time 0.2370 (0.2463) data time 0.0010 (0.0018) model time 0.2360 (0.2445) loss 3.4044 (3.3886) grad_norm 1.5284 (2.1078) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][880/1251] eta 0:01:31 lr 0.000811 wd 0.0500 time 0.2421 (0.2463) data time 0.0010 (0.0018) model time 0.2411 (0.2445) loss 3.4949 (3.3900) grad_norm 1.9314 (2.1058) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][890/1251] eta 0:01:28 lr 0.000811 wd 0.0500 time 0.2373 (0.2462) data time 0.0010 (0.0018) model time 0.2364 (0.2444) loss 2.9523 (3.3894) grad_norm 2.8008 (2.1081) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][900/1251] eta 0:01:26 lr 0.000810 wd 0.0500 time 0.2426 (0.2462) data time 0.0012 (0.0018) model time 0.2415 (0.2444) loss 3.1125 (3.3888) grad_norm 
2.7586 (2.1167) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][910/1251] eta 0:01:23 lr 0.000810 wd 0.0500 time 0.2419 (0.2461) data time 0.0011 (0.0018) model time 0.2408 (0.2444) loss 2.0548 (3.3883) grad_norm 1.4606 (2.1200) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][920/1251] eta 0:01:21 lr 0.000810 wd 0.0500 time 0.2440 (0.2461) data time 0.0010 (0.0017) model time 0.2430 (0.2443) loss 3.7447 (3.3917) grad_norm 2.0668 (2.1179) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][930/1251] eta 0:01:18 lr 0.000810 wd 0.0500 time 0.2335 (0.2461) data time 0.0012 (0.0017) model time 0.2323 (0.2443) loss 2.1150 (3.3897) grad_norm 1.6557 (2.1142) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][940/1251] eta 0:01:16 lr 0.000810 wd 0.0500 time 0.2345 (0.2460) data time 0.0008 (0.0017) model time 0.2337 (0.2442) loss 4.2493 (3.3900) grad_norm 1.9538 (2.1135) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][950/1251] eta 0:01:14 lr 0.000810 wd 0.0500 time 0.2418 (0.2461) data time 0.0009 (0.0017) model time 0.2408 (0.2443) loss 4.2718 (3.3918) grad_norm 1.9233 (2.1146) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][960/1251] eta 0:01:11 lr 0.000810 wd 0.0500 time 0.2490 (0.2461) data time 0.0011 (0.0017) model time 0.2479 (0.2443) loss 3.9717 (3.3950) grad_norm 1.4545 (2.1107) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][970/1251] eta 0:01:09 lr 0.000810 wd 0.0500 time 
0.2469 (0.2460) data time 0.0009 (0.0017) model time 0.2461 (0.2443) loss 3.7577 (3.3944) grad_norm 2.6206 (2.1095) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][980/1251] eta 0:01:06 lr 0.000810 wd 0.0500 time 0.2561 (0.2462) data time 0.0011 (0.0017) model time 0.2551 (0.2445) loss 2.8143 (3.3951) grad_norm 1.8766 (2.1077) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][990/1251] eta 0:01:04 lr 0.000810 wd 0.0500 time 0.2417 (0.2462) data time 0.0008 (0.0017) model time 0.2409 (0.2444) loss 3.4546 (3.3954) grad_norm 2.5727 (2.1131) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1000/1251] eta 0:01:01 lr 0.000810 wd 0.0500 time 0.2425 (0.2461) data time 0.0008 (0.0017) model time 0.2416 (0.2444) loss 3.9233 (3.3969) grad_norm 2.0989 (2.1135) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1010/1251] eta 0:00:59 lr 0.000810 wd 0.0500 time 0.2378 (0.2461) data time 0.0010 (0.0017) model time 0.2368 (0.2444) loss 3.7041 (3.3971) grad_norm 2.5719 (2.1129) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1020/1251] eta 0:00:56 lr 0.000810 wd 0.0500 time 0.2484 (0.2461) data time 0.0010 (0.0017) model time 0.2474 (0.2443) loss 3.5169 (3.3989) grad_norm 2.3006 (2.1134) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1030/1251] eta 0:00:54 lr 0.000810 wd 0.0500 time 0.2477 (0.2460) data time 0.0008 (0.0017) model time 0.2470 (0.2443) loss 3.9780 (3.3999) grad_norm 1.6702 (2.1129) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:04 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1040/1251] eta 0:00:51 lr 0.000810 wd 0.0500 time 0.2454 (0.2460) data time 0.0009 (0.0017) model time 0.2445 (0.2443) loss 3.7386 (3.3995) grad_norm 1.7762 (2.1126) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1050/1251] eta 0:00:49 lr 0.000810 wd 0.0500 time 0.2496 (0.2460) data time 0.0011 (0.0017) model time 0.2485 (0.2443) loss 2.8412 (3.3985) grad_norm 2.4721 (2.1126) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1060/1251] eta 0:00:46 lr 0.000810 wd 0.0500 time 0.2451 (0.2460) data time 0.0008 (0.0017) model time 0.2443 (0.2442) loss 2.4617 (3.3995) grad_norm 2.2177 (2.1115) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1070/1251] eta 0:00:44 lr 0.000810 wd 0.0500 time 0.2453 (0.2459) data time 0.0011 (0.0017) model time 0.2441 (0.2442) loss 3.3876 (3.3988) grad_norm 2.0886 (2.1134) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1080/1251] eta 0:00:42 lr 0.000810 wd 0.0500 time 0.2405 (0.2463) data time 0.0013 (0.0017) model time 0.2392 (0.2446) loss 3.5202 (3.3971) grad_norm 2.0231 (2.1132) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1090/1251] eta 0:00:39 lr 0.000810 wd 0.0500 time 0.2420 (0.2462) data time 0.0013 (0.0017) model time 0.2407 (0.2446) loss 3.6016 (3.3955) grad_norm 1.5161 (2.1125) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1100/1251] eta 0:00:37 lr 0.000810 wd 0.0500 time 0.2398 (0.2462) data time 0.0009 (0.0017) model time 0.2389 (0.2445) loss 
3.6661 (3.3941) grad_norm 2.0734 (2.1143) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1110/1251] eta 0:00:34 lr 0.000810 wd 0.0500 time 0.2436 (0.2462) data time 0.0010 (0.0017) model time 0.2426 (0.2445) loss 3.4294 (3.3956) grad_norm 2.4229 (2.1165) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1120/1251] eta 0:00:32 lr 0.000810 wd 0.0500 time 0.2465 (0.2462) data time 0.0011 (0.0017) model time 0.2455 (0.2445) loss 3.3849 (3.3957) grad_norm 1.9893 (2.1189) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1130/1251] eta 0:00:29 lr 0.000810 wd 0.0500 time 0.2416 (0.2462) data time 0.0009 (0.0017) model time 0.2407 (0.2445) loss 3.1342 (3.3952) grad_norm 2.1233 (2.1169) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1140/1251] eta 0:00:27 lr 0.000810 wd 0.0500 time 0.2445 (0.2465) data time 0.0008 (0.0017) model time 0.2437 (0.2448) loss 3.4468 (3.3978) grad_norm 1.7423 (2.1161) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1150/1251] eta 0:00:24 lr 0.000810 wd 0.0500 time 0.2549 (0.2465) data time 0.0010 (0.0017) model time 0.2540 (0.2449) loss 3.4270 (3.3971) grad_norm 2.9181 (2.1187) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1160/1251] eta 0:00:22 lr 0.000810 wd 0.0500 time 0.4386 (0.2467) data time 0.0012 (0.0017) model time 0.4374 (0.2450) loss 2.8424 (3.3973) grad_norm 1.9226 (2.1218) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1170/1251] eta 
0:00:19 lr 0.000810 wd 0.0500 time 0.2487 (0.2466) data time 0.0009 (0.0017) model time 0.2477 (0.2450) loss 3.4645 (3.3983) grad_norm 2.6385 (2.1261) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1180/1251] eta 0:00:17 lr 0.000810 wd 0.0500 time 0.2513 (0.2466) data time 0.0009 (0.0017) model time 0.2504 (0.2450) loss 2.5215 (3.3974) grad_norm 2.5402 (2.1278) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1190/1251] eta 0:00:15 lr 0.000809 wd 0.0500 time 0.2482 (0.2467) data time 0.0007 (0.0017) model time 0.2475 (0.2451) loss 3.9815 (3.4009) grad_norm 2.1651 (2.1293) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1200/1251] eta 0:00:12 lr 0.000809 wd 0.0500 time 0.4465 (0.2469) data time 0.0007 (0.0017) model time 0.4457 (0.2452) loss 2.3959 (3.4004) grad_norm 2.3065 (2.1276) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1210/1251] eta 0:00:10 lr 0.000809 wd 0.0500 time 0.2452 (0.2470) data time 0.0008 (0.0017) model time 0.2443 (0.2453) loss 4.0367 (3.4016) grad_norm 1.7328 (2.1269) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1220/1251] eta 0:00:07 lr 0.000809 wd 0.0500 time 0.2384 (0.2469) data time 0.0011 (0.0017) model time 0.2373 (0.2453) loss 3.4162 (3.4021) grad_norm 2.5472 (2.1273) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1230/1251] eta 0:00:05 lr 0.000809 wd 0.0500 time 0.2451 (0.2469) data time 0.0008 (0.0017) model time 0.2443 (0.2453) loss 4.3313 (3.4033) grad_norm 2.0418 (2.1283) loss_scale 4096.0000 (4096.0000) mem 
7379MB [2024-08-26 10:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1240/1251] eta 0:00:02 lr 0.000809 wd 0.0500 time 0.2244 (0.2471) data time 0.0007 (0.0016) model time 0.2237 (0.2454) loss 3.3565 (3.4040) grad_norm 2.0155 (2.1277) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [100/300][1250/1251] eta 0:00:00 lr 0.000809 wd 0.0500 time 0.2245 (0.2469) data time 0.0005 (0.0016) model time 0.2240 (0.2453) loss 2.1916 (3.4027) grad_norm 1.7916 (2.1269) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 100 training takes 0:05:08 [2024-08-26 10:58:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 10:58:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 10:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.448 (0.448) Loss 0.5195 (0.5195) Acc@1 89.746 (89.746) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-26 10:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.111) Loss 0.8594 (0.8028) Acc@1 82.031 (82.253) Acc@5 96.094 (96.236) Mem 7379MB [2024-08-26 10:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.095) Loss 1.1934 (0.8191) Acc@1 71.582 (81.543) Acc@5 92.383 (96.215) Mem 7379MB [2024-08-26 10:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.091) Loss 1.4287 (0.9324) Acc@1 65.918 (79.035) Acc@5 87.988 (94.824) Mem 7379MB [2024-08-26 10:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.3379 (0.9993) Acc@1 69.141 (77.491) Acc@5 89.746 (94.005) Mem 7379MB [2024-08-26 10:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.088 Acc@5 93.950 [2024-08-26 10:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.1% [2024-08-26 10:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.817 (0.817) Loss 0.4407 (0.4407) Acc@1 92.188 (92.188) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 10:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.149) Loss 0.7070 (0.6887) Acc@1 86.328 (85.112) Acc@5 96.094 (97.070) Mem 7379MB [2024-08-26 10:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.116) Loss 0.9839 (0.7123) Acc@1 76.855 (84.115) Acc@5 94.043 (97.010) Mem 7379MB [2024-08-26 10:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.104) Loss 1.2725 (0.8119) Acc@1 66.992 (81.751) Acc@5 90.918 (95.867) Mem 7379MB [2024-08-26 10:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.1348 
(0.8636) Acc@1 71.582 (80.273) Acc@5 93.262 (95.372) Mem 7379MB [2024-08-26 10:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.852 Acc@5 95.338 [2024-08-26 10:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.9% [2024-08-26 10:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][0/1251] eta 0:22:36 lr 0.000809 wd 0.0500 time 1.0843 (1.0843) data time 0.6958 (0.6958) model time 0.0000 (0.0000) loss 2.3960 (2.3960) grad_norm 2.7705 (2.7705) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][10/1251] eta 0:07:28 lr 0.000809 wd 0.0500 time 0.2464 (0.3614) data time 0.0010 (0.0645) model time 0.0000 (0.0000) loss 2.1583 (3.2062) grad_norm 1.5307 (2.1833) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][20/1251] eta 0:06:16 lr 0.000809 wd 0.0500 time 0.2455 (0.3055) data time 0.0007 (0.0345) model time 0.0000 (0.0000) loss 3.6995 (3.2267) grad_norm 2.0015 (1.9759) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][30/1251] eta 0:05:47 lr 0.000809 wd 0.0500 time 0.2349 (0.2844) data time 0.0011 (0.0237) model time 0.0000 (0.0000) loss 3.7846 (3.3246) grad_norm 1.5664 (1.9774) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][40/1251] eta 0:05:31 lr 0.000809 wd 0.0500 time 0.2448 (0.2741) data time 0.0009 (0.0181) model time 0.0000 (0.0000) loss 3.9289 (3.3209) grad_norm 2.3931 (2.0074) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][50/1251] eta 0:05:21 lr 0.000809 wd 0.0500 time 0.2444 (0.2680) data time 0.0008 (0.0149) model time 0.0000 (0.0000) 
loss 2.4579 (3.2633) grad_norm 1.7540 (2.0100) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][60/1251] eta 0:05:14 lr 0.000809 wd 0.0500 time 0.2478 (0.2637) data time 0.0007 (0.0126) model time 0.2471 (0.2410) loss 2.1378 (3.2887) grad_norm 1.7602 (1.9978) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][70/1251] eta 0:05:08 lr 0.000809 wd 0.0500 time 0.2457 (0.2611) data time 0.0007 (0.0110) model time 0.2450 (0.2423) loss 2.8061 (3.3111) grad_norm 1.9311 (2.0089) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][80/1251] eta 0:05:03 lr 0.000809 wd 0.0500 time 0.2388 (0.2590) data time 0.0010 (0.0098) model time 0.2378 (0.2425) loss 3.1939 (3.3016) grad_norm 1.9263 (2.0963) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][90/1251] eta 0:04:58 lr 0.000809 wd 0.0500 time 0.2416 (0.2570) data time 0.0011 (0.0088) model time 0.2405 (0.2420) loss 3.8004 (3.2726) grad_norm 1.7000 (2.0755) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][100/1251] eta 0:04:54 lr 0.000809 wd 0.0500 time 0.2465 (0.2558) data time 0.0007 (0.0080) model time 0.2458 (0.2423) loss 3.5542 (3.2986) grad_norm 3.1528 (2.0818) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][110/1251] eta 0:04:50 lr 0.000809 wd 0.0500 time 0.2421 (0.2546) data time 0.0012 (0.0074) model time 0.2409 (0.2421) loss 3.5531 (3.2967) grad_norm 1.7323 (2.0671) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][120/1251] eta 0:04:46 
lr 0.000809 wd 0.0500 time 0.2331 (0.2535) data time 0.0010 (0.0069) model time 0.2321 (0.2419) loss 3.8161 (3.3394) grad_norm 2.0892 (2.0628) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][130/1251] eta 0:04:43 lr 0.000809 wd 0.0500 time 0.2438 (0.2527) data time 0.0009 (0.0064) model time 0.2429 (0.2418) loss 2.7467 (3.3523) grad_norm 1.6617 (2.0661) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][140/1251] eta 0:04:39 lr 0.000809 wd 0.0500 time 0.2347 (0.2519) data time 0.0010 (0.0061) model time 0.2337 (0.2416) loss 3.1040 (3.3556) grad_norm 1.7477 (2.0652) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][150/1251] eta 0:04:36 lr 0.000809 wd 0.0500 time 0.2419 (0.2513) data time 0.0007 (0.0057) model time 0.2412 (0.2417) loss 3.8196 (3.3852) grad_norm 2.2659 (2.0680) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][160/1251] eta 0:04:33 lr 0.000809 wd 0.0500 time 0.2506 (0.2507) data time 0.0009 (0.0055) model time 0.2498 (0.2416) loss 2.3438 (3.3761) grad_norm 2.6951 (2.0830) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][170/1251] eta 0:04:30 lr 0.000809 wd 0.0500 time 0.2467 (0.2504) data time 0.0008 (0.0052) model time 0.2460 (0.2418) loss 3.7744 (3.3855) grad_norm 1.9052 (2.0816) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][180/1251] eta 0:04:27 lr 0.000809 wd 0.0500 time 0.2478 (0.2501) data time 0.0007 (0.0050) model time 0.2471 (0.2420) loss 3.7518 (3.3775) grad_norm 3.3262 (2.0965) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
10:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][190/1251] eta 0:04:24 lr 0.000809 wd 0.0500 time 0.2429 (0.2496) data time 0.0010 (0.0048) model time 0.2419 (0.2418) loss 3.4389 (3.3726) grad_norm 1.7955 (2.0972) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][200/1251] eta 0:04:22 lr 0.000809 wd 0.0500 time 0.2489 (0.2493) data time 0.0010 (0.0046) model time 0.2479 (0.2419) loss 3.6693 (3.3762) grad_norm 1.8244 (2.0874) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 10:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][210/1251] eta 0:04:19 lr 0.000809 wd 0.0500 time 0.2526 (0.2489) data time 0.0009 (0.0044) model time 0.2518 (0.2418) loss 2.4127 (3.3650) grad_norm 2.0996 (2.0774) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][220/1251] eta 0:04:16 lr 0.000808 wd 0.0500 time 0.2464 (0.2487) data time 0.0009 (0.0043) model time 0.2454 (0.2418) loss 2.7762 (3.3639) grad_norm 2.2917 (2.0815) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][230/1251] eta 0:04:13 lr 0.000808 wd 0.0500 time 0.2492 (0.2484) data time 0.0011 (0.0041) model time 0.2481 (0.2418) loss 3.7169 (3.3562) grad_norm 2.0651 (2.0809) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][240/1251] eta 0:04:10 lr 0.000808 wd 0.0500 time 0.2497 (0.2482) data time 0.0007 (0.0040) model time 0.2489 (0.2418) loss 3.4635 (3.3679) grad_norm 4.0008 (2.0813) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][250/1251] eta 0:04:08 lr 0.000808 wd 0.0500 time 0.4488 (0.2487) data time 0.0009 (0.0039) model time 0.4479 (0.2428) 
loss 2.6381 (3.3641) grad_norm 3.0542 (2.0847) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][260/1251] eta 0:04:06 lr 0.000808 wd 0.0500 time 0.2491 (0.2486) data time 0.0010 (0.0038) model time 0.2481 (0.2428) loss 3.1819 (3.3714) grad_norm 2.0062 (2.0954) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][270/1251] eta 0:04:03 lr 0.000808 wd 0.0500 time 0.2351 (0.2483) data time 0.0011 (0.0037) model time 0.2340 (0.2427) loss 3.2873 (3.3666) grad_norm 2.7212 (2.0949) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][280/1251] eta 0:04:00 lr 0.000808 wd 0.0500 time 0.2363 (0.2481) data time 0.0008 (0.0036) model time 0.2355 (0.2426) loss 3.1698 (3.3753) grad_norm 2.0576 (2.0936) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][290/1251] eta 0:03:58 lr 0.000808 wd 0.0500 time 0.2356 (0.2479) data time 0.0007 (0.0035) model time 0.2348 (0.2425) loss 4.6531 (3.3784) grad_norm 2.0551 (2.0868) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][300/1251] eta 0:03:55 lr 0.000808 wd 0.0500 time 0.2292 (0.2477) data time 0.0011 (0.0034) model time 0.2281 (0.2425) loss 3.6403 (3.3812) grad_norm 1.9172 (2.0891) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][310/1251] eta 0:03:52 lr 0.000808 wd 0.0500 time 0.2375 (0.2475) data time 0.0012 (0.0034) model time 0.2364 (0.2424) loss 3.6789 (3.3765) grad_norm 2.2539 (2.1170) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][320/1251] eta 
0:03:50 lr 0.000808 wd 0.0500 time 0.2462 (0.2473) data time 0.0009 (0.0033) model time 0.2453 (0.2423) loss 3.1028 (3.3798) grad_norm 2.0373 (2.1210) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][330/1251] eta 0:03:47 lr 0.000808 wd 0.0500 time 0.2439 (0.2472) data time 0.0009 (0.0032) model time 0.2430 (0.2423) loss 4.1455 (3.3786) grad_norm 1.5615 (2.1227) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][340/1251] eta 0:03:45 lr 0.000808 wd 0.0500 time 0.2446 (0.2470) data time 0.0010 (0.0031) model time 0.2436 (0.2423) loss 3.0179 (3.3780) grad_norm 2.5099 (2.1201) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][350/1251] eta 0:03:42 lr 0.000808 wd 0.0500 time 0.2380 (0.2469) data time 0.0008 (0.0031) model time 0.2371 (0.2423) loss 4.1013 (3.3757) grad_norm 2.0832 (2.1168) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][360/1251] eta 0:03:40 lr 0.000808 wd 0.0500 time 0.2480 (0.2474) data time 0.0010 (0.0030) model time 0.2470 (0.2430) loss 3.7382 (3.3799) grad_norm 1.5554 (2.1094) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][370/1251] eta 0:03:37 lr 0.000808 wd 0.0500 time 0.2326 (0.2473) data time 0.0009 (0.0030) model time 0.2316 (0.2429) loss 2.2328 (3.3719) grad_norm 1.6181 (2.0973) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][380/1251] eta 0:03:35 lr 0.000808 wd 0.0500 time 0.2408 (0.2472) data time 0.0011 (0.0029) model time 0.2397 (0.2429) loss 3.5565 (3.3590) grad_norm 2.3708 (2.0982) loss_scale 4096.0000 (4096.0000) mem 7379MB 
[2024-08-26 11:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][390/1251] eta 0:03:32 lr 0.000808 wd 0.0500 time 0.2384 (0.2470) data time 0.0009 (0.0029) model time 0.2375 (0.2428) loss 3.1287 (3.3585) grad_norm 3.0052 (2.1024) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][400/1251] eta 0:03:30 lr 0.000808 wd 0.0500 time 0.2386 (0.2468) data time 0.0009 (0.0028) model time 0.2377 (0.2427) loss 3.4699 (3.3590) grad_norm 1.5689 (2.1046) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][410/1251] eta 0:03:27 lr 0.000808 wd 0.0500 time 0.4315 (0.2472) data time 0.0010 (0.0028) model time 0.4305 (0.2432) loss 3.8567 (3.3529) grad_norm 2.2657 (2.1038) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][420/1251] eta 0:03:25 lr 0.000808 wd 0.0500 time 0.2484 (0.2470) data time 0.0010 (0.0027) model time 0.2474 (0.2431) loss 2.1714 (3.3475) grad_norm 1.4824 (2.1067) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][430/1251] eta 0:03:22 lr 0.000808 wd 0.0500 time 0.2452 (0.2470) data time 0.0007 (0.0027) model time 0.2445 (0.2431) loss 2.6779 (3.3432) grad_norm 2.0421 (2.1033) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][440/1251] eta 0:03:20 lr 0.000808 wd 0.0500 time 0.2425 (0.2468) data time 0.0008 (0.0027) model time 0.2417 (0.2430) loss 3.7785 (3.3488) grad_norm 2.3406 (2.1013) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][450/1251] eta 0:03:17 lr 0.000808 wd 0.0500 time 0.2401 (0.2467) data time 0.0010 (0.0026) model time 0.2392 
(0.2429) loss 3.6787 (3.3487) grad_norm 1.9565 (2.0973) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][460/1251] eta 0:03:15 lr 0.000808 wd 0.0500 time 0.2489 (0.2472) data time 0.0008 (0.0027) model time 0.2481 (0.2434) loss 2.1019 (3.3489) grad_norm 1.5135 (2.0925) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][470/1251] eta 0:03:12 lr 0.000808 wd 0.0500 time 0.2456 (0.2471) data time 0.0011 (0.0027) model time 0.2446 (0.2433) loss 3.5180 (3.3477) grad_norm 1.8536 (2.0976) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][480/1251] eta 0:03:10 lr 0.000808 wd 0.0500 time 0.2477 (0.2470) data time 0.0007 (0.0026) model time 0.2470 (0.2433) loss 3.5601 (3.3528) grad_norm 1.5916 (2.0989) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][490/1251] eta 0:03:08 lr 0.000808 wd 0.0500 time 0.2361 (0.2472) data time 0.0013 (0.0026) model time 0.2348 (0.2436) loss 2.9486 (3.3531) grad_norm 1.4928 (2.0932) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][500/1251] eta 0:03:05 lr 0.000808 wd 0.0500 time 0.2500 (0.2475) data time 0.0007 (0.0026) model time 0.2492 (0.2440) loss 3.3053 (3.3541) grad_norm 1.9053 (2.0890) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][510/1251] eta 0:03:03 lr 0.000807 wd 0.0500 time 0.2403 (0.2474) data time 0.0008 (0.0025) model time 0.2395 (0.2439) loss 2.7014 (3.3577) grad_norm 2.0049 (2.0866) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[101/300][520/1251] eta 0:03:00 lr 0.000807 wd 0.0500 time 0.2396 (0.2473) data time 0.0009 (0.0025) model time 0.2387 (0.2439) loss 3.1180 (3.3552) grad_norm 1.9169 (2.0814) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][530/1251] eta 0:02:58 lr 0.000807 wd 0.0500 time 0.2367 (0.2472) data time 0.0009 (0.0025) model time 0.2358 (0.2439) loss 2.5513 (3.3539) grad_norm 1.9816 (2.0870) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][540/1251] eta 0:02:56 lr 0.000807 wd 0.0500 time 0.4414 (0.2477) data time 0.0012 (0.0024) model time 0.4402 (0.2444) loss 3.5425 (3.3553) grad_norm 1.7496 (2.0840) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][550/1251] eta 0:02:53 lr 0.000807 wd 0.0500 time 0.2508 (0.2480) data time 0.0007 (0.0024) model time 0.2502 (0.2448) loss 3.9411 (3.3533) grad_norm 2.3867 (2.0813) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][560/1251] eta 0:02:51 lr 0.000807 wd 0.0500 time 0.2455 (0.2480) data time 0.0007 (0.0024) model time 0.2448 (0.2448) loss 3.2921 (3.3565) grad_norm 2.1770 (2.0801) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][570/1251] eta 0:02:48 lr 0.000807 wd 0.0500 time 0.2394 (0.2478) data time 0.0009 (0.0024) model time 0.2385 (0.2447) loss 2.9417 (3.3586) grad_norm 2.3918 (2.0823) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][580/1251] eta 0:02:46 lr 0.000807 wd 0.0500 time 0.2413 (0.2477) data time 0.0010 (0.0024) model time 0.2402 (0.2446) loss 3.7254 (3.3606) grad_norm 1.5599 (2.0766) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 11:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][590/1251] eta 0:02:43 lr 0.000807 wd 0.0500 time 0.2395 (0.2476) data time 0.0011 (0.0023) model time 0.2385 (0.2445) loss 3.6970 (3.3558) grad_norm 2.2682 (2.0773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][600/1251] eta 0:02:41 lr 0.000807 wd 0.0500 time 0.2389 (0.2475) data time 0.0011 (0.0023) model time 0.2377 (0.2445) loss 3.5485 (3.3629) grad_norm 2.4483 (2.0813) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][610/1251] eta 0:02:38 lr 0.000807 wd 0.0500 time 0.2339 (0.2474) data time 0.0009 (0.0023) model time 0.2330 (0.2444) loss 3.1168 (3.3643) grad_norm 1.6344 (2.0816) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][620/1251] eta 0:02:36 lr 0.000807 wd 0.0500 time 0.2499 (0.2473) data time 0.0010 (0.0023) model time 0.2489 (0.2443) loss 3.2009 (3.3649) grad_norm 2.3790 (2.0812) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][630/1251] eta 0:02:33 lr 0.000807 wd 0.0500 time 0.2467 (0.2475) data time 0.0007 (0.0023) model time 0.2460 (0.2446) loss 4.2986 (3.3680) grad_norm 2.1811 (2.0886) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][640/1251] eta 0:02:31 lr 0.000807 wd 0.0500 time 0.2512 (0.2475) data time 0.0009 (0.0022) model time 0.2503 (0.2446) loss 3.8194 (3.3655) grad_norm 1.7091 (2.0906) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][650/1251] eta 0:02:28 lr 0.000807 wd 0.0500 time 0.2429 (0.2474) data time 0.0010 
(0.0022) model time 0.2419 (0.2445) loss 3.4159 (3.3674) grad_norm 3.1331 (2.0898) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][660/1251] eta 0:02:26 lr 0.000807 wd 0.0500 time 0.2389 (0.2473) data time 0.0008 (0.0022) model time 0.2382 (0.2445) loss 4.1908 (3.3687) grad_norm 1.9501 (2.0917) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][670/1251] eta 0:02:23 lr 0.000807 wd 0.0500 time 0.2382 (0.2473) data time 0.0008 (0.0022) model time 0.2374 (0.2444) loss 4.1145 (3.3708) grad_norm 1.9107 (2.0994) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][680/1251] eta 0:02:21 lr 0.000807 wd 0.0500 time 0.2422 (0.2472) data time 0.0008 (0.0022) model time 0.2414 (0.2443) loss 2.6981 (3.3698) grad_norm 2.2170 (2.1024) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][690/1251] eta 0:02:18 lr 0.000807 wd 0.0500 time 0.2335 (0.2471) data time 0.0012 (0.0022) model time 0.2323 (0.2443) loss 3.6064 (3.3699) grad_norm 3.0711 (2.1025) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][700/1251] eta 0:02:16 lr 0.000807 wd 0.0500 time 0.2374 (0.2470) data time 0.0010 (0.0021) model time 0.2365 (0.2442) loss 3.3104 (3.3735) grad_norm 2.5635 (2.1063) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][710/1251] eta 0:02:13 lr 0.000807 wd 0.0500 time 0.2466 (0.2469) data time 0.0008 (0.0021) model time 0.2458 (0.2442) loss 3.5288 (3.3755) grad_norm 1.8768 (2.1032) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [101/300][720/1251] eta 0:02:11 lr 0.000807 wd 0.0500 time 0.2400 (0.2470) data time 0.0009 (0.0021) model time 0.2391 (0.2442) loss 3.7202 (3.3775) grad_norm 1.9392 (2.1038) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][730/1251] eta 0:02:08 lr 0.000807 wd 0.0500 time 0.2430 (0.2469) data time 0.0009 (0.0021) model time 0.2420 (0.2442) loss 3.6520 (3.3753) grad_norm 4.7646 (2.1093) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][740/1251] eta 0:02:06 lr 0.000807 wd 0.0500 time 0.2491 (0.2474) data time 0.0007 (0.0021) model time 0.2485 (0.2447) loss 3.8415 (3.3781) grad_norm 2.1441 (2.1149) loss_scale 8192.0000 (4101.5277) mem 7379MB [2024-08-26 11:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][750/1251] eta 0:02:03 lr 0.000807 wd 0.0500 time 0.2387 (0.2473) data time 0.0008 (0.0021) model time 0.2379 (0.2447) loss 2.0731 (3.3795) grad_norm 2.4496 (2.1199) loss_scale 8192.0000 (4155.9947) mem 7379MB [2024-08-26 11:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][760/1251] eta 0:02:01 lr 0.000807 wd 0.0500 time 0.2395 (0.2473) data time 0.0010 (0.0021) model time 0.2384 (0.2447) loss 3.2877 (3.3802) grad_norm 1.6588 (2.1173) loss_scale 8192.0000 (4209.0302) mem 7379MB [2024-08-26 11:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][770/1251] eta 0:01:58 lr 0.000807 wd 0.0500 time 0.2439 (0.2472) data time 0.0008 (0.0020) model time 0.2432 (0.2446) loss 2.6109 (3.3801) grad_norm 2.5965 (2.1157) loss_scale 8192.0000 (4260.6900) mem 7379MB [2024-08-26 11:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][780/1251] eta 0:01:56 lr 0.000807 wd 0.0500 time 0.2405 (0.2474) data time 0.0007 (0.0020) model time 0.2398 (0.2448) loss 3.1660 (3.3783) grad_norm 3.1589 (2.1134) loss_scale 
8192.0000 (4311.0269) mem 7379MB [2024-08-26 11:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][790/1251] eta 0:01:54 lr 0.000806 wd 0.0500 time 0.2402 (0.2473) data time 0.0009 (0.0020) model time 0.2393 (0.2448) loss 3.0569 (3.3806) grad_norm 1.5631 (2.1155) loss_scale 8192.0000 (4360.0910) mem 7379MB [2024-08-26 11:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][800/1251] eta 0:01:51 lr 0.000806 wd 0.0500 time 0.2395 (0.2473) data time 0.0008 (0.0020) model time 0.2387 (0.2447) loss 2.6567 (3.3805) grad_norm 2.3849 (2.1147) loss_scale 8192.0000 (4407.9301) mem 7379MB [2024-08-26 11:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][810/1251] eta 0:01:49 lr 0.000806 wd 0.0500 time 0.2388 (0.2472) data time 0.0010 (0.0020) model time 0.2378 (0.2447) loss 3.8674 (3.3820) grad_norm 1.7713 (2.1123) loss_scale 8192.0000 (4454.5894) mem 7379MB [2024-08-26 11:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][820/1251] eta 0:01:46 lr 0.000806 wd 0.0500 time 0.2408 (0.2471) data time 0.0009 (0.0020) model time 0.2398 (0.2446) loss 3.6087 (3.3857) grad_norm 2.9445 (2.1102) loss_scale 8192.0000 (4500.1121) mem 7379MB [2024-08-26 11:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][830/1251] eta 0:01:44 lr 0.000806 wd 0.0500 time 0.2473 (0.2471) data time 0.0008 (0.0020) model time 0.2465 (0.2446) loss 4.0157 (3.3850) grad_norm 2.8004 (2.1091) loss_scale 8192.0000 (4544.5391) mem 7379MB [2024-08-26 11:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][840/1251] eta 0:01:41 lr 0.000806 wd 0.0500 time 0.2382 (0.2471) data time 0.0009 (0.0020) model time 0.2373 (0.2446) loss 3.9560 (3.3861) grad_norm 3.5766 (2.1163) loss_scale 8192.0000 (4587.9096) mem 7379MB [2024-08-26 11:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][850/1251] eta 0:01:39 lr 0.000806 wd 0.0500 time 0.2436 (0.2470) data time 
0.0011 (0.0019) model time 0.2425 (0.2446) loss 3.3637 (3.3877) grad_norm 1.7566 (2.1192) loss_scale 8192.0000 (4630.2609) mem 7379MB [2024-08-26 11:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][860/1251] eta 0:01:36 lr 0.000806 wd 0.0500 time 0.2474 (0.2470) data time 0.0012 (0.0019) model time 0.2462 (0.2446) loss 3.3282 (3.3863) grad_norm 1.9993 (2.1196) loss_scale 8192.0000 (4671.6283) mem 7379MB [2024-08-26 11:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][870/1251] eta 0:01:34 lr 0.000806 wd 0.0500 time 0.2417 (0.2470) data time 0.0009 (0.0019) model time 0.2408 (0.2445) loss 3.4338 (3.3856) grad_norm 2.4938 (2.1195) loss_scale 8192.0000 (4712.0459) mem 7379MB [2024-08-26 11:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][880/1251] eta 0:01:31 lr 0.000806 wd 0.0500 time 0.2381 (0.2471) data time 0.0009 (0.0019) model time 0.2372 (0.2447) loss 3.8764 (3.3838) grad_norm 2.3092 (2.1178) loss_scale 8192.0000 (4751.5460) mem 7379MB [2024-08-26 11:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][890/1251] eta 0:01:29 lr 0.000806 wd 0.0500 time 0.2419 (0.2471) data time 0.0009 (0.0019) model time 0.2410 (0.2447) loss 3.9229 (3.3855) grad_norm 2.6956 (2.1172) loss_scale 8192.0000 (4790.1594) mem 7379MB [2024-08-26 11:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][900/1251] eta 0:01:26 lr 0.000806 wd 0.0500 time 0.2399 (0.2472) data time 0.0012 (0.0019) model time 0.2387 (0.2449) loss 3.3550 (3.3870) grad_norm 1.7252 (inf) loss_scale 4096.0000 (4809.7314) mem 7379MB [2024-08-26 11:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][910/1251] eta 0:01:24 lr 0.000806 wd 0.0500 time 0.2384 (0.2474) data time 0.0009 (0.0019) model time 0.2375 (0.2451) loss 3.3101 (3.3885) grad_norm 1.5554 (inf) loss_scale 4096.0000 (4801.8968) mem 7379MB [2024-08-26 11:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [101/300][920/1251] eta 0:01:21 lr 0.000806 wd 0.0500 time 0.2432 (0.2474) data time 0.0010 (0.0019) model time 0.2423 (0.2450) loss 3.5876 (3.3898) grad_norm 1.4914 (inf) loss_scale 4096.0000 (4794.2324) mem 7379MB [2024-08-26 11:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][930/1251] eta 0:01:19 lr 0.000806 wd 0.0500 time 0.2394 (0.2473) data time 0.0007 (0.0019) model time 0.2387 (0.2450) loss 4.6044 (3.3925) grad_norm 2.5178 (inf) loss_scale 4096.0000 (4786.7325) mem 7379MB [2024-08-26 11:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][940/1251] eta 0:01:16 lr 0.000806 wd 0.0500 time 0.2453 (0.2475) data time 0.0007 (0.0019) model time 0.2445 (0.2452) loss 2.7897 (3.3904) grad_norm 1.8821 (inf) loss_scale 4096.0000 (4779.3921) mem 7379MB [2024-08-26 11:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][950/1251] eta 0:01:14 lr 0.000806 wd 0.0500 time 0.2521 (0.2475) data time 0.0009 (0.0019) model time 0.2512 (0.2452) loss 3.8991 (3.3928) grad_norm 3.2200 (inf) loss_scale 4096.0000 (4772.2061) mem 7379MB [2024-08-26 11:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][960/1251] eta 0:01:12 lr 0.000806 wd 0.0500 time 0.2375 (0.2475) data time 0.0009 (0.0018) model time 0.2365 (0.2452) loss 3.2228 (3.3907) grad_norm 2.5635 (inf) loss_scale 4096.0000 (4765.1696) mem 7379MB [2024-08-26 11:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][970/1251] eta 0:01:09 lr 0.000806 wd 0.0500 time 0.2410 (0.2474) data time 0.0009 (0.0018) model time 0.2401 (0.2452) loss 2.3382 (3.3914) grad_norm 2.4201 (inf) loss_scale 4096.0000 (4758.2781) mem 7379MB [2024-08-26 11:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][980/1251] eta 0:01:07 lr 0.000806 wd 0.0500 time 0.2406 (0.2473) data time 0.0010 (0.0018) model time 0.2396 (0.2451) loss 2.9596 (3.3921) grad_norm 1.7355 (inf) loss_scale 4096.0000 (4751.5270) 
mem 7379MB [2024-08-26 11:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][990/1251] eta 0:01:04 lr 0.000806 wd 0.0500 time 0.2382 (0.2475) data time 0.0010 (0.0018) model time 0.2372 (0.2453) loss 3.6503 (3.3911) grad_norm 1.8083 (inf) loss_scale 4096.0000 (4744.9122) mem 7379MB [2024-08-26 11:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1000/1251] eta 0:01:02 lr 0.000806 wd 0.0500 time 0.2504 (0.2475) data time 0.0013 (0.0018) model time 0.2491 (0.2453) loss 2.9175 (3.3898) grad_norm 1.7299 (inf) loss_scale 4096.0000 (4738.4296) mem 7379MB [2024-08-26 11:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1010/1251] eta 0:00:59 lr 0.000806 wd 0.0500 time 0.2495 (0.2474) data time 0.0012 (0.0018) model time 0.2483 (0.2452) loss 3.7321 (3.3905) grad_norm 1.8257 (inf) loss_scale 4096.0000 (4732.0752) mem 7379MB [2024-08-26 11:03:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1020/1251] eta 0:00:57 lr 0.000806 wd 0.0500 time 0.2361 (0.2477) data time 0.0008 (0.0018) model time 0.2352 (0.2455) loss 1.9234 (3.3886) grad_norm 2.1719 (inf) loss_scale 4096.0000 (4725.8452) mem 7379MB [2024-08-26 11:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1030/1251] eta 0:00:54 lr 0.000806 wd 0.0500 time 0.2333 (0.2477) data time 0.0007 (0.0018) model time 0.2325 (0.2455) loss 3.0355 (3.3851) grad_norm 1.9184 (inf) loss_scale 4096.0000 (4719.7362) mem 7379MB [2024-08-26 11:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1040/1251] eta 0:00:52 lr 0.000806 wd 0.0500 time 0.2398 (0.2476) data time 0.0009 (0.0018) model time 0.2389 (0.2455) loss 3.4789 (3.3850) grad_norm 2.4710 (inf) loss_scale 4096.0000 (4713.7445) mem 7379MB [2024-08-26 11:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1050/1251] eta 0:00:49 lr 0.000806 wd 0.0500 time 0.2480 (0.2476) data time 0.0009 (0.0018) model time 0.2471 
(0.2455) loss 4.0021 (3.3870) grad_norm 1.5277 (inf) loss_scale 4096.0000 (4707.8668) mem 7379MB [2024-08-26 11:03:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1060/1251] eta 0:00:47 lr 0.000806 wd 0.0500 time 0.3926 (0.2479) data time 0.0007 (0.0018) model time 0.3919 (0.2457) loss 3.9576 (3.3867) grad_norm 2.9494 (inf) loss_scale 4096.0000 (4702.0999) mem 7379MB [2024-08-26 11:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1070/1251] eta 0:00:44 lr 0.000806 wd 0.0500 time 0.2565 (0.2480) data time 0.0010 (0.0018) model time 0.2555 (0.2459) loss 2.3636 (3.3843) grad_norm 1.7415 (inf) loss_scale 4096.0000 (4696.4407) mem 7379MB [2024-08-26 11:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1080/1251] eta 0:00:42 lr 0.000805 wd 0.0500 time 0.2486 (0.2480) data time 0.0009 (0.0018) model time 0.2476 (0.2458) loss 4.1083 (3.3853) grad_norm 2.4301 (inf) loss_scale 4096.0000 (4690.8862) mem 7379MB [2024-08-26 11:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1090/1251] eta 0:00:39 lr 0.000805 wd 0.0500 time 0.2505 (0.2480) data time 0.0008 (0.0018) model time 0.2498 (0.2459) loss 3.5741 (3.3883) grad_norm 1.8425 (inf) loss_scale 4096.0000 (4685.4335) mem 7379MB [2024-08-26 11:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1100/1251] eta 0:00:37 lr 0.000805 wd 0.0500 time 0.2526 (0.2480) data time 0.0011 (0.0018) model time 0.2515 (0.2458) loss 3.4359 (3.3871) grad_norm 1.9259 (inf) loss_scale 4096.0000 (4680.0799) mem 7379MB [2024-08-26 11:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1110/1251] eta 0:00:34 lr 0.000805 wd 0.0500 time 0.2367 (0.2480) data time 0.0008 (0.0018) model time 0.2359 (0.2458) loss 3.2975 (3.3899) grad_norm 2.5979 (inf) loss_scale 4096.0000 (4674.8227) mem 7379MB [2024-08-26 11:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1120/1251] eta 0:00:32 
lr 0.000805 wd 0.0500 time 0.2427 (0.2479) data time 0.0007 (0.0018) model time 0.2420 (0.2458) loss 2.4983 (3.3886) grad_norm 2.7369 (inf) loss_scale 4096.0000 (4669.6592) mem 7379MB [2024-08-26 11:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1130/1251] eta 0:00:30 lr 0.000805 wd 0.0500 time 0.2408 (0.2479) data time 0.0010 (0.0018) model time 0.2398 (0.2458) loss 3.5664 (3.3889) grad_norm 2.6056 (inf) loss_scale 4096.0000 (4664.5871) mem 7379MB [2024-08-26 11:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1140/1251] eta 0:00:27 lr 0.000805 wd 0.0500 time 0.2425 (0.2479) data time 0.0008 (0.0018) model time 0.2417 (0.2458) loss 2.7687 (3.3896) grad_norm 2.1456 (inf) loss_scale 4096.0000 (4659.6039) mem 7379MB [2024-08-26 11:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1150/1251] eta 0:00:25 lr 0.000805 wd 0.0500 time 0.2437 (0.2479) data time 0.0011 (0.0018) model time 0.2426 (0.2457) loss 3.7142 (3.3903) grad_norm 2.4844 (inf) loss_scale 4096.0000 (4654.7072) mem 7379MB [2024-08-26 11:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1160/1251] eta 0:00:22 lr 0.000805 wd 0.0500 time 0.4494 (0.2480) data time 0.0010 (0.0018) model time 0.4484 (0.2459) loss 3.8202 (3.3902) grad_norm 1.8241 (inf) loss_scale 4096.0000 (4649.8949) mem 7379MB [2024-08-26 11:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1170/1251] eta 0:00:20 lr 0.000805 wd 0.0500 time 0.2359 (0.2480) data time 0.0009 (0.0018) model time 0.2349 (0.2459) loss 4.1580 (3.3927) grad_norm 1.3520 (inf) loss_scale 4096.0000 (4645.1648) mem 7379MB [2024-08-26 11:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1180/1251] eta 0:00:17 lr 0.000805 wd 0.0500 time 0.4445 (0.2481) data time 0.0012 (0.0018) model time 0.4433 (0.2460) loss 2.5384 (3.3915) grad_norm 2.1911 (inf) loss_scale 4096.0000 (4640.5148) mem 7379MB [2024-08-26 11:04:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1190/1251] eta 0:00:15 lr 0.000805 wd 0.0500 time 0.2495 (0.2481) data time 0.0009 (0.0018) model time 0.2486 (0.2460) loss 2.4293 (3.3924) grad_norm 2.0204 (inf) loss_scale 4096.0000 (4635.9429) mem 7379MB [2024-08-26 11:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1200/1251] eta 0:00:12 lr 0.000805 wd 0.0500 time 0.2419 (0.2481) data time 0.0012 (0.0018) model time 0.2407 (0.2459) loss 3.9722 (3.3932) grad_norm 4.1985 (inf) loss_scale 4096.0000 (4631.4471) mem 7379MB [2024-08-26 11:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1210/1251] eta 0:00:10 lr 0.000805 wd 0.0500 time 0.2476 (0.2480) data time 0.0007 (0.0018) model time 0.2469 (0.2459) loss 3.5977 (3.3939) grad_norm 1.6490 (inf) loss_scale 4096.0000 (4627.0256) mem 7379MB [2024-08-26 11:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1220/1251] eta 0:00:07 lr 0.000805 wd 0.0500 time 0.2427 (0.2479) data time 0.0010 (0.0018) model time 0.2417 (0.2458) loss 2.8547 (3.3936) grad_norm 2.1003 (inf) loss_scale 4096.0000 (4622.6765) mem 7379MB [2024-08-26 11:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1230/1251] eta 0:00:05 lr 0.000805 wd 0.0500 time 0.2397 (0.2479) data time 0.0009 (0.0018) model time 0.2388 (0.2458) loss 3.3645 (3.3918) grad_norm 1.8053 (inf) loss_scale 4096.0000 (4618.3981) mem 7379MB [2024-08-26 11:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1240/1251] eta 0:00:02 lr 0.000805 wd 0.0500 time 0.2216 (0.2478) data time 0.0004 (0.0018) model time 0.2211 (0.2457) loss 3.5622 (3.3922) grad_norm 1.7455 (inf) loss_scale 4096.0000 (4614.1886) mem 7379MB [2024-08-26 11:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [101/300][1250/1251] eta 0:00:00 lr 0.000805 wd 0.0500 time 0.2323 (0.2477) data time 0.0005 (0.0018) model time 0.2319 (0.2456) loss 4.1039 (3.3936) 
grad_norm 2.0137 (inf) loss_scale 4096.0000 (4610.0464) mem 7379MB [2024-08-26 11:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 101 training takes 0:05:09 [2024-08-26 11:04:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 11:04:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 11:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.485 (0.485) Loss 0.5137 (0.5137) Acc@1 89.746 (89.746) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 11:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.115) Loss 0.8008 (0.7862) Acc@1 83.887 (83.097) Acc@5 95.996 (96.325) Mem 7379MB [2024-08-26 11:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.098) Loss 1.0781 (0.8144) Acc@1 74.609 (82.110) Acc@5 93.750 (96.284) Mem 7379MB [2024-08-26 11:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.092) Loss 1.4443 (0.9344) Acc@1 65.430 (79.275) Acc@5 87.207 (94.815) Mem 7379MB [2024-08-26 11:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.086) Loss 1.3037 (0.9950) Acc@1 68.555 (77.784) Acc@5 89.746 (94.081) Mem 7379MB [2024-08-26 11:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.440 Acc@5 94.024 [2024-08-26 11:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.4% [2024-08-26 11:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.44% [2024-08-26 11:04:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-08-26 11:04:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-26 11:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.476 (0.476) Loss 0.4390 (0.4390) Acc@1 92.188 (92.188) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 11:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.116) Loss 0.7056 (0.6873) Acc@1 86.426 (85.156) Acc@5 96.387 (97.061) Mem 7379MB [2024-08-26 11:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.098) Loss 0.9819 (0.7112) Acc@1 76.855 (84.138) Acc@5 94.043 (97.019) Mem 7379MB [2024-08-26 11:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.091) Loss 1.2686 (0.8103) Acc@1 67.383 (81.770) Acc@5 90.820 (95.861) Mem 7379MB [2024-08-26 11:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.1299 (0.8619) Acc@1 72.070 (80.314) Acc@5 92.871 (95.348) Mem 7379MB [2024-08-26 11:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.898 Acc@5 95.302 [2024-08-26 11:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.9% [2024-08-26 11:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.90% [2024-08-26 11:04:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 11:04:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 11:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][0/1251] eta 0:14:00 lr 0.000805 wd 0.0500 time 0.6719 (0.6719) data time 0.4378 (0.4378) model time 0.0000 (0.0000) loss 2.5403 (2.5403) grad_norm 2.6843 (2.6843) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][10/1251] eta 0:05:50 lr 0.000805 wd 0.0500 time 0.2647 (0.2825) data time 0.0012 (0.0411) model time 0.0000 (0.0000) loss 3.0739 (3.5929) grad_norm 1.5202 (1.9787) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][20/1251] eta 0:05:24 lr 0.000805 wd 0.0500 time 0.2392 (0.2633) data time 0.0010 (0.0220) model time 0.0000 (0.0000) loss 2.7569 (3.4354) grad_norm 2.5676 (1.9265) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][30/1251] eta 0:05:13 lr 0.000805 wd 0.0500 time 0.2389 (0.2565) data time 0.0007 (0.0153) model time 0.0000 (0.0000) loss 2.9527 (3.3881) grad_norm 2.2641 (1.9133) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][40/1251] eta 0:05:12 lr 0.000805 wd 0.0500 time 0.2487 (0.2583) data time 0.0009 (0.0118) model time 0.0000 (0.0000) loss 3.0045 (3.3439) grad_norm 1.5751 (2.0375) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][50/1251] eta 0:05:10 lr 0.000805 wd 0.0500 time 0.2414 (0.2589) data time 0.0009 (0.0097) model time 0.0000 (0.0000) loss 3.2310 (3.3546) grad_norm 2.3224 (2.0888) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][60/1251] eta 0:05:04 lr 0.000805 wd 0.0500 time 0.2357 (0.2559) data time 0.0010 (0.0083) model time 0.2347 
(0.2396) loss 2.4896 (3.3473) grad_norm 1.4369 (2.0727) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][70/1251] eta 0:04:59 lr 0.000805 wd 0.0500 time 0.2438 (0.2539) data time 0.0007 (0.0072) model time 0.2431 (0.2402) loss 3.6478 (3.3699) grad_norm 2.4822 (2.0958) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][80/1251] eta 0:04:55 lr 0.000805 wd 0.0500 time 0.2414 (0.2523) data time 0.0009 (0.0065) model time 0.2405 (0.2400) loss 3.2988 (3.3458) grad_norm 2.1860 (2.1138) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][90/1251] eta 0:04:51 lr 0.000805 wd 0.0500 time 0.2451 (0.2509) data time 0.0007 (0.0059) model time 0.2445 (0.2398) loss 3.8236 (3.3463) grad_norm 2.0651 (2.1295) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][100/1251] eta 0:04:47 lr 0.000805 wd 0.0500 time 0.2423 (0.2502) data time 0.0010 (0.0054) model time 0.2413 (0.2403) loss 4.1842 (3.3805) grad_norm 2.4265 (2.2022) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][110/1251] eta 0:04:46 lr 0.000804 wd 0.0500 time 0.2355 (0.2511) data time 0.0010 (0.0050) model time 0.2344 (0.2434) loss 3.5361 (3.3735) grad_norm 2.6257 (2.1886) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][120/1251] eta 0:04:43 lr 0.000804 wd 0.0500 time 0.2293 (0.2506) data time 0.0009 (0.0047) model time 0.2284 (0.2435) loss 3.7496 (3.3891) grad_norm 1.5408 (2.1541) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][130/1251] 
eta 0:04:40 lr 0.000804 wd 0.0500 time 0.2502 (0.2502) data time 0.0007 (0.0044) model time 0.2495 (0.2436) loss 4.2450 (3.3983) grad_norm 1.9383 (2.1298) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][140/1251] eta 0:04:37 lr 0.000804 wd 0.0500 time 0.2449 (0.2501) data time 0.0008 (0.0042) model time 0.2441 (0.2441) loss 3.7881 (3.4053) grad_norm 1.8863 (2.1292) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][150/1251] eta 0:04:36 lr 0.000804 wd 0.0500 time 0.2418 (0.2510) data time 0.0011 (0.0042) model time 0.2407 (0.2456) loss 3.5888 (3.4164) grad_norm 2.0743 (2.1530) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][160/1251] eta 0:04:33 lr 0.000804 wd 0.0500 time 0.2392 (0.2506) data time 0.0007 (0.0041) model time 0.2385 (0.2452) loss 3.6519 (3.4175) grad_norm 1.8913 (2.1643) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][170/1251] eta 0:04:30 lr 0.000804 wd 0.0500 time 0.2408 (0.2500) data time 0.0008 (0.0039) model time 0.2400 (0.2447) loss 2.2866 (3.3888) grad_norm 1.4005 (2.1596) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][180/1251] eta 0:04:27 lr 0.000804 wd 0.0500 time 0.2460 (0.2496) data time 0.0008 (0.0037) model time 0.2452 (0.2446) loss 4.1129 (3.3830) grad_norm 2.2511 (2.1482) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][190/1251] eta 0:04:24 lr 0.000804 wd 0.0500 time 0.2409 (0.2492) data time 0.0009 (0.0036) model time 0.2400 (0.2443) loss 4.3400 (3.3761) grad_norm 1.9077 (2.1514) loss_scale 4096.0000 (4096.0000) mem 7379MB 
[2024-08-26 11:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][200/1251] eta 0:04:21 lr 0.000804 wd 0.0500 time 0.2406 (0.2491) data time 0.0011 (0.0035) model time 0.2395 (0.2443) loss 3.1588 (3.3797) grad_norm 1.7443 (2.1455) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][210/1251] eta 0:04:19 lr 0.000804 wd 0.0500 time 0.2507 (0.2490) data time 0.0011 (0.0034) model time 0.2496 (0.2444) loss 3.2571 (3.3791) grad_norm 1.5591 (2.1288) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][220/1251] eta 0:04:16 lr 0.000804 wd 0.0500 time 0.2369 (0.2488) data time 0.0009 (0.0033) model time 0.2360 (0.2444) loss 3.6977 (3.3701) grad_norm 1.7229 (2.1429) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][230/1251] eta 0:04:13 lr 0.000804 wd 0.0500 time 0.2345 (0.2486) data time 0.0008 (0.0033) model time 0.2337 (0.2443) loss 3.3287 (3.3778) grad_norm 2.3143 (2.1421) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][240/1251] eta 0:04:11 lr 0.000804 wd 0.0500 time 0.2373 (0.2485) data time 0.0011 (0.0032) model time 0.2362 (0.2443) loss 3.6475 (3.3755) grad_norm 2.4734 (2.1485) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][250/1251] eta 0:04:09 lr 0.000804 wd 0.0500 time 0.2424 (0.2490) data time 0.0009 (0.0031) model time 0.2415 (0.2451) loss 3.4104 (3.3844) grad_norm 1.5967 (2.1590) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][260/1251] eta 0:04:06 lr 0.000804 wd 0.0500 time 0.2381 (0.2488) data time 0.0009 (0.0030) model time 0.2372 
(0.2450) loss 3.5103 (3.3850) grad_norm 1.4702 (2.1554) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][270/1251] eta 0:04:04 lr 0.000804 wd 0.0500 time 0.2353 (0.2495) data time 0.0011 (0.0029) model time 0.2342 (0.2459) loss 3.9672 (3.3863) grad_norm 2.3670 (2.1524) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][280/1251] eta 0:04:01 lr 0.000804 wd 0.0500 time 0.2435 (0.2492) data time 0.0010 (0.0029) model time 0.2426 (0.2457) loss 2.9209 (3.3699) grad_norm 1.8574 (2.1431) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][290/1251] eta 0:03:59 lr 0.000804 wd 0.0500 time 0.2379 (0.2495) data time 0.0010 (0.0028) model time 0.2369 (0.2462) loss 3.6670 (3.3772) grad_norm 2.5948 (2.1409) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][300/1251] eta 0:03:56 lr 0.000804 wd 0.0500 time 0.2372 (0.2492) data time 0.0011 (0.0027) model time 0.2361 (0.2459) loss 3.0230 (3.3880) grad_norm 1.5841 (2.1422) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][310/1251] eta 0:03:54 lr 0.000804 wd 0.0500 time 0.2369 (0.2490) data time 0.0009 (0.0027) model time 0.2360 (0.2457) loss 3.0497 (3.3819) grad_norm 2.8654 (2.1367) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][320/1251] eta 0:03:52 lr 0.000804 wd 0.0500 time 0.2405 (0.2495) data time 0.0007 (0.0026) model time 0.2398 (0.2464) loss 3.8596 (3.3825) grad_norm 3.3265 (2.1367) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[102/300][330/1251] eta 0:03:50 lr 0.000804 wd 0.0500 time 0.4722 (0.2507) data time 0.0009 (0.0026) model time 0.4714 (0.2479) loss 4.2724 (3.3777) grad_norm 2.3434 (2.1430) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][340/1251] eta 0:03:48 lr 0.000804 wd 0.0500 time 0.2428 (0.2504) data time 0.0007 (0.0026) model time 0.2421 (0.2476) loss 4.3624 (3.3726) grad_norm 3.2897 (2.1548) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][350/1251] eta 0:03:45 lr 0.000804 wd 0.0500 time 0.2430 (0.2502) data time 0.0008 (0.0025) model time 0.2422 (0.2474) loss 3.5829 (3.3703) grad_norm 1.9709 (2.1628) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][360/1251] eta 0:03:42 lr 0.000804 wd 0.0500 time 0.2421 (0.2499) data time 0.0009 (0.0025) model time 0.2412 (0.2472) loss 3.0672 (3.3654) grad_norm 2.4990 (2.1613) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][370/1251] eta 0:03:39 lr 0.000804 wd 0.0500 time 0.2411 (0.2497) data time 0.0010 (0.0024) model time 0.2400 (0.2469) loss 3.8411 (3.3682) grad_norm 1.9948 (2.1555) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][380/1251] eta 0:03:37 lr 0.000804 wd 0.0500 time 0.2391 (0.2495) data time 0.0014 (0.0024) model time 0.2377 (0.2468) loss 3.5081 (3.3746) grad_norm 1.5944 (2.1524) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][390/1251] eta 0:03:35 lr 0.000803 wd 0.0500 time 0.4302 (0.2497) data time 0.0011 (0.0024) model time 0.4291 (0.2471) loss 3.4630 (3.3682) grad_norm 3.1899 (inf) loss_scale 2048.0000 
(4085.5243) mem 7379MB [2024-08-26 11:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][400/1251] eta 0:03:32 lr 0.000803 wd 0.0500 time 0.2355 (0.2495) data time 0.0007 (0.0023) model time 0.2348 (0.2468) loss 2.3200 (3.3701) grad_norm 1.6125 (inf) loss_scale 2048.0000 (4034.7132) mem 7379MB [2024-08-26 11:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][410/1251] eta 0:03:29 lr 0.000803 wd 0.0500 time 0.2386 (0.2493) data time 0.0009 (0.0023) model time 0.2377 (0.2466) loss 3.1095 (3.3687) grad_norm 1.9814 (inf) loss_scale 2048.0000 (3986.3747) mem 7379MB [2024-08-26 11:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][420/1251] eta 0:03:27 lr 0.000803 wd 0.0500 time 0.2455 (0.2492) data time 0.0007 (0.0023) model time 0.2448 (0.2466) loss 3.5618 (3.3732) grad_norm 2.4486 (inf) loss_scale 2048.0000 (3940.3325) mem 7379MB [2024-08-26 11:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][430/1251] eta 0:03:24 lr 0.000803 wd 0.0500 time 0.2440 (0.2491) data time 0.0009 (0.0023) model time 0.2432 (0.2465) loss 2.4422 (3.3709) grad_norm 1.6990 (inf) loss_scale 2048.0000 (3896.4269) mem 7379MB [2024-08-26 11:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][440/1251] eta 0:03:21 lr 0.000803 wd 0.0500 time 0.2423 (0.2490) data time 0.0009 (0.0022) model time 0.2413 (0.2464) loss 3.7473 (3.3737) grad_norm 2.1380 (inf) loss_scale 2048.0000 (3854.5125) mem 7379MB [2024-08-26 11:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][450/1251] eta 0:03:19 lr 0.000803 wd 0.0500 time 0.2524 (0.2494) data time 0.0009 (0.0022) model time 0.2515 (0.2469) loss 4.0859 (3.3752) grad_norm 1.7868 (inf) loss_scale 2048.0000 (3814.4568) mem 7379MB [2024-08-26 11:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][460/1251] eta 0:03:17 lr 0.000803 wd 0.0500 time 0.2433 (0.2493) data time 0.0009 (0.0022) model time 
0.2424 (0.2468) loss 2.8750 (3.3713) grad_norm 1.6323 (inf) loss_scale 2048.0000 (3776.1388) mem 7379MB [2024-08-26 11:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][470/1251] eta 0:03:14 lr 0.000803 wd 0.0500 time 0.2399 (0.2492) data time 0.0008 (0.0022) model time 0.2391 (0.2467) loss 3.6642 (3.3640) grad_norm 1.8841 (inf) loss_scale 2048.0000 (3739.4480) mem 7379MB [2024-08-26 11:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][480/1251] eta 0:03:12 lr 0.000803 wd 0.0500 time 0.2470 (0.2491) data time 0.0009 (0.0022) model time 0.2461 (0.2466) loss 2.7038 (3.3603) grad_norm 3.4856 (inf) loss_scale 2048.0000 (3704.2827) mem 7379MB [2024-08-26 11:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][490/1251] eta 0:03:09 lr 0.000803 wd 0.0500 time 0.2481 (0.2490) data time 0.0011 (0.0022) model time 0.2470 (0.2465) loss 3.4864 (3.3556) grad_norm 3.0740 (inf) loss_scale 2048.0000 (3670.5499) mem 7379MB [2024-08-26 11:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][500/1251] eta 0:03:06 lr 0.000803 wd 0.0500 time 0.2393 (0.2489) data time 0.0011 (0.0022) model time 0.2382 (0.2465) loss 3.8431 (3.3557) grad_norm 1.6914 (inf) loss_scale 2048.0000 (3638.1637) mem 7379MB [2024-08-26 11:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][510/1251] eta 0:03:04 lr 0.000803 wd 0.0500 time 0.2394 (0.2489) data time 0.0012 (0.0022) model time 0.2382 (0.2464) loss 3.2855 (3.3550) grad_norm 1.6747 (inf) loss_scale 2048.0000 (3607.0450) mem 7379MB [2024-08-26 11:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][520/1251] eta 0:03:01 lr 0.000803 wd 0.0500 time 0.2513 (0.2488) data time 0.0010 (0.0022) model time 0.2504 (0.2464) loss 3.4731 (3.3577) grad_norm 2.1429 (inf) loss_scale 2048.0000 (3577.1209) mem 7379MB [2024-08-26 11:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][530/1251] eta 0:02:59 
lr 0.000803 wd 0.0500 time 0.2867 (0.2489) data time 0.0009 (0.0022) model time 0.2858 (0.2465) loss 2.9346 (3.3523) grad_norm 2.9631 (inf) loss_scale 2048.0000 (3548.3239) mem 7379MB [2024-08-26 11:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][540/1251] eta 0:02:56 lr 0.000803 wd 0.0500 time 0.2382 (0.2488) data time 0.0013 (0.0022) model time 0.2369 (0.2464) loss 2.7741 (3.3567) grad_norm 2.3992 (inf) loss_scale 2048.0000 (3520.5915) mem 7379MB [2024-08-26 11:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][550/1251] eta 0:02:54 lr 0.000803 wd 0.0500 time 0.2355 (0.2487) data time 0.0012 (0.0022) model time 0.2343 (0.2463) loss 2.5024 (3.3556) grad_norm 2.3317 (inf) loss_scale 2048.0000 (3493.8657) mem 7379MB [2024-08-26 11:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][560/1251] eta 0:02:51 lr 0.000803 wd 0.0500 time 0.2367 (0.2486) data time 0.0008 (0.0022) model time 0.2359 (0.2462) loss 3.2070 (3.3578) grad_norm 1.5473 (inf) loss_scale 2048.0000 (3468.0927) mem 7379MB [2024-08-26 11:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][570/1251] eta 0:02:49 lr 0.000803 wd 0.0500 time 0.2377 (0.2485) data time 0.0012 (0.0021) model time 0.2365 (0.2461) loss 2.8254 (3.3568) grad_norm 1.7695 (inf) loss_scale 2048.0000 (3443.2224) mem 7379MB [2024-08-26 11:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][580/1251] eta 0:02:47 lr 0.000803 wd 0.0500 time 0.2470 (0.2491) data time 0.0008 (0.0021) model time 0.2462 (0.2467) loss 4.1688 (3.3615) grad_norm 1.3913 (inf) loss_scale 2048.0000 (3419.2083) mem 7379MB [2024-08-26 11:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][590/1251] eta 0:02:44 lr 0.000803 wd 0.0500 time 0.2425 (0.2490) data time 0.0007 (0.0021) model time 0.2418 (0.2466) loss 2.6511 (3.3611) grad_norm 3.3232 (inf) loss_scale 2048.0000 (3396.0068) mem 7379MB [2024-08-26 11:06:55 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][600/1251] eta 0:02:42 lr 0.000803 wd 0.0500 time 0.2548 (0.2489) data time 0.0009 (0.0021) model time 0.2539 (0.2466) loss 3.7932 (3.3634) grad_norm 2.1768 (inf) loss_scale 2048.0000 (3373.5774) mem 7379MB [2024-08-26 11:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][610/1251] eta 0:02:39 lr 0.000803 wd 0.0500 time 0.2405 (0.2489) data time 0.0011 (0.0021) model time 0.2394 (0.2465) loss 3.3398 (3.3662) grad_norm 2.2832 (inf) loss_scale 2048.0000 (3351.8822) mem 7379MB [2024-08-26 11:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][620/1251] eta 0:02:36 lr 0.000803 wd 0.0500 time 0.2379 (0.2488) data time 0.0011 (0.0021) model time 0.2368 (0.2465) loss 3.7247 (3.3700) grad_norm 2.9484 (inf) loss_scale 2048.0000 (3330.8857) mem 7379MB [2024-08-26 11:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][630/1251] eta 0:02:34 lr 0.000803 wd 0.0500 time 0.2467 (0.2487) data time 0.0010 (0.0021) model time 0.2458 (0.2464) loss 3.6736 (3.3699) grad_norm 1.6846 (inf) loss_scale 2048.0000 (3310.5547) mem 7379MB [2024-08-26 11:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][640/1251] eta 0:02:31 lr 0.000803 wd 0.0500 time 0.2422 (0.2486) data time 0.0009 (0.0021) model time 0.2413 (0.2463) loss 3.1810 (3.3736) grad_norm 1.5944 (inf) loss_scale 2048.0000 (3290.8580) mem 7379MB [2024-08-26 11:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][650/1251] eta 0:02:29 lr 0.000803 wd 0.0500 time 0.2390 (0.2488) data time 0.0012 (0.0021) model time 0.2379 (0.2465) loss 3.6485 (3.3738) grad_norm 2.4608 (inf) loss_scale 2048.0000 (3271.7665) mem 7379MB [2024-08-26 11:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][660/1251] eta 0:02:26 lr 0.000803 wd 0.0500 time 0.2448 (0.2487) data time 0.0010 (0.0020) model time 0.2437 (0.2464) loss 4.0651 (3.3789) 
grad_norm 2.1456 (inf) loss_scale 2048.0000 (3253.2526) mem 7379MB [2024-08-26 11:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][670/1251] eta 0:02:24 lr 0.000802 wd 0.0500 time 0.2406 (0.2486) data time 0.0010 (0.0020) model time 0.2396 (0.2463) loss 4.0933 (3.3802) grad_norm 2.5475 (inf) loss_scale 2048.0000 (3235.2906) mem 7379MB [2024-08-26 11:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][680/1251] eta 0:02:22 lr 0.000802 wd 0.0500 time 0.2399 (0.2488) data time 0.0011 (0.0020) model time 0.2388 (0.2466) loss 2.6453 (3.3821) grad_norm 1.8206 (inf) loss_scale 2048.0000 (3217.8561) mem 7379MB [2024-08-26 11:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][690/1251] eta 0:02:19 lr 0.000802 wd 0.0500 time 0.2425 (0.2487) data time 0.0008 (0.0020) model time 0.2417 (0.2465) loss 3.0925 (3.3795) grad_norm 1.7648 (inf) loss_scale 2048.0000 (3200.9262) mem 7379MB [2024-08-26 11:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][700/1251] eta 0:02:16 lr 0.000802 wd 0.0500 time 0.2316 (0.2486) data time 0.0012 (0.0020) model time 0.2304 (0.2464) loss 3.6781 (3.3807) grad_norm 1.7649 (inf) loss_scale 2048.0000 (3184.4793) mem 7379MB [2024-08-26 11:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][710/1251] eta 0:02:14 lr 0.000802 wd 0.0500 time 0.2444 (0.2485) data time 0.0009 (0.0020) model time 0.2435 (0.2463) loss 3.8711 (3.3792) grad_norm 1.9971 (inf) loss_scale 2048.0000 (3168.4951) mem 7379MB [2024-08-26 11:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][720/1251] eta 0:02:11 lr 0.000802 wd 0.0500 time 0.2460 (0.2484) data time 0.0007 (0.0020) model time 0.2453 (0.2462) loss 2.1023 (3.3763) grad_norm 1.7451 (inf) loss_scale 2048.0000 (3152.9542) mem 7379MB [2024-08-26 11:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][730/1251] eta 0:02:09 lr 0.000802 wd 0.0500 time 0.2495 
(0.2484) data time 0.0009 (0.0020) model time 0.2487 (0.2462) loss 3.3925 (3.3771) grad_norm 2.9185 (inf) loss_scale 2048.0000 (3137.8386) mem 7379MB [2024-08-26 11:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][740/1251] eta 0:02:06 lr 0.000802 wd 0.0500 time 0.2425 (0.2483) data time 0.0008 (0.0020) model time 0.2418 (0.2462) loss 3.6830 (3.3782) grad_norm 1.5306 (inf) loss_scale 2048.0000 (3123.1309) mem 7379MB [2024-08-26 11:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][750/1251] eta 0:02:04 lr 0.000802 wd 0.0500 time 0.2494 (0.2483) data time 0.0009 (0.0020) model time 0.2485 (0.2461) loss 3.3989 (3.3778) grad_norm 2.6984 (inf) loss_scale 2048.0000 (3108.8149) mem 7379MB [2024-08-26 11:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][760/1251] eta 0:02:01 lr 0.000802 wd 0.0500 time 0.2540 (0.2483) data time 0.0007 (0.0019) model time 0.2533 (0.2461) loss 3.3262 (3.3784) grad_norm 3.5422 (inf) loss_scale 2048.0000 (3094.8752) mem 7379MB [2024-08-26 11:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][770/1251] eta 0:01:59 lr 0.000802 wd 0.0500 time 0.2458 (0.2483) data time 0.0007 (0.0020) model time 0.2451 (0.2462) loss 3.4867 (3.3798) grad_norm 2.4359 (inf) loss_scale 2048.0000 (3081.2970) mem 7379MB [2024-08-26 11:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][780/1251] eta 0:01:56 lr 0.000802 wd 0.0500 time 0.2388 (0.2484) data time 0.0010 (0.0020) model time 0.2377 (0.2462) loss 2.7953 (3.3808) grad_norm 2.5467 (inf) loss_scale 2048.0000 (3068.0666) mem 7379MB [2024-08-26 11:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][790/1251] eta 0:01:54 lr 0.000802 wd 0.0500 time 0.2416 (0.2487) data time 0.0011 (0.0020) model time 0.2404 (0.2466) loss 3.7322 (3.3827) grad_norm 1.8137 (inf) loss_scale 2048.0000 (3055.1707) mem 7379MB [2024-08-26 11:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [102/300][800/1251] eta 0:01:52 lr 0.000802 wd 0.0500 time 0.2410 (0.2487) data time 0.0009 (0.0020) model time 0.2402 (0.2466) loss 2.1539 (3.3816) grad_norm 2.2691 (inf) loss_scale 2048.0000 (3042.5968) mem 7379MB [2024-08-26 11:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][810/1251] eta 0:01:49 lr 0.000802 wd 0.0500 time 0.2653 (0.2491) data time 0.0011 (0.0020) model time 0.2642 (0.2469) loss 4.0430 (3.3811) grad_norm 1.5351 (inf) loss_scale 2048.0000 (3030.3329) mem 7379MB [2024-08-26 11:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][820/1251] eta 0:01:47 lr 0.000802 wd 0.0500 time 0.2399 (0.2491) data time 0.0012 (0.0020) model time 0.2386 (0.2469) loss 3.7534 (3.3805) grad_norm 2.3549 (inf) loss_scale 2048.0000 (3018.3678) mem 7379MB [2024-08-26 11:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][830/1251] eta 0:01:44 lr 0.000802 wd 0.0500 time 0.2321 (0.2491) data time 0.0009 (0.0020) model time 0.2312 (0.2470) loss 3.8302 (3.3801) grad_norm 1.8532 (inf) loss_scale 2048.0000 (3006.6907) mem 7379MB [2024-08-26 11:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][840/1251] eta 0:01:42 lr 0.000802 wd 0.0500 time 0.2387 (0.2491) data time 0.0009 (0.0020) model time 0.2378 (0.2469) loss 3.3336 (3.3788) grad_norm 1.5436 (inf) loss_scale 2048.0000 (2995.2913) mem 7379MB [2024-08-26 11:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][850/1251] eta 0:01:39 lr 0.000802 wd 0.0500 time 0.4561 (0.2494) data time 0.0011 (0.0020) model time 0.4550 (0.2472) loss 3.4854 (3.3797) grad_norm 2.1072 (inf) loss_scale 2048.0000 (2984.1598) mem 7379MB [2024-08-26 11:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][860/1251] eta 0:01:37 lr 0.000802 wd 0.0500 time 0.2482 (0.2500) data time 0.0010 (0.0020) model time 0.2473 (0.2479) loss 3.4858 (3.3806) grad_norm 1.8733 (inf) loss_scale 2048.0000 
(2973.2869) mem 7379MB [2024-08-26 11:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][870/1251] eta 0:01:35 lr 0.000802 wd 0.0500 time 0.2427 (0.2503) data time 0.0011 (0.0020) model time 0.2415 (0.2482) loss 3.4643 (3.3750) grad_norm 1.6127 (inf) loss_scale 2048.0000 (2962.6636) mem 7379MB [2024-08-26 11:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][880/1251] eta 0:01:32 lr 0.000802 wd 0.0500 time 0.2444 (0.2502) data time 0.0010 (0.0020) model time 0.2434 (0.2482) loss 2.9031 (3.3734) grad_norm 2.2704 (inf) loss_scale 2048.0000 (2952.2815) mem 7379MB [2024-08-26 11:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][890/1251] eta 0:01:30 lr 0.000802 wd 0.0500 time 0.2702 (0.2501) data time 0.0007 (0.0020) model time 0.2695 (0.2481) loss 3.8543 (3.3729) grad_norm 2.4007 (inf) loss_scale 2048.0000 (2942.1324) mem 7379MB [2024-08-26 11:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][900/1251] eta 0:01:27 lr 0.000802 wd 0.0500 time 0.2466 (0.2500) data time 0.0007 (0.0020) model time 0.2459 (0.2480) loss 3.5052 (3.3748) grad_norm 3.7594 (inf) loss_scale 2048.0000 (2932.2087) mem 7379MB [2024-08-26 11:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][910/1251] eta 0:01:25 lr 0.000802 wd 0.0500 time 0.2509 (0.2500) data time 0.0008 (0.0020) model time 0.2501 (0.2479) loss 3.4192 (3.3750) grad_norm 1.8612 (inf) loss_scale 2048.0000 (2922.5027) mem 7379MB [2024-08-26 11:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][920/1251] eta 0:01:22 lr 0.000802 wd 0.0500 time 0.2351 (0.2499) data time 0.0010 (0.0020) model time 0.2341 (0.2478) loss 2.8341 (3.3742) grad_norm 2.4584 (inf) loss_scale 2048.0000 (2913.0076) mem 7379MB [2024-08-26 11:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][930/1251] eta 0:01:20 lr 0.000802 wd 0.0500 time 0.2462 (0.2498) data time 0.0011 (0.0020) model time 
0.2451 (0.2477) loss 3.1507 (3.3743) grad_norm 2.5160 (inf) loss_scale 2048.0000 (2903.7164) mem 7379MB [2024-08-26 11:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][940/1251] eta 0:01:17 lr 0.000802 wd 0.0500 time 0.2407 (0.2496) data time 0.0009 (0.0019) model time 0.2398 (0.2476) loss 4.1077 (3.3744) grad_norm 1.8260 (inf) loss_scale 2048.0000 (2894.6227) mem 7379MB [2024-08-26 11:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][950/1251] eta 0:01:15 lr 0.000801 wd 0.0500 time 0.2342 (0.2495) data time 0.0011 (0.0019) model time 0.2332 (0.2475) loss 3.7986 (3.3758) grad_norm 2.0505 (inf) loss_scale 2048.0000 (2885.7203) mem 7379MB [2024-08-26 11:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][960/1251] eta 0:01:12 lr 0.000801 wd 0.0500 time 0.4013 (0.2496) data time 0.0010 (0.0019) model time 0.4004 (0.2476) loss 4.5647 (3.3747) grad_norm 2.1441 (inf) loss_scale 2048.0000 (2877.0031) mem 7379MB [2024-08-26 11:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][970/1251] eta 0:01:10 lr 0.000801 wd 0.0500 time 0.2471 (0.2496) data time 0.0008 (0.0019) model time 0.2463 (0.2476) loss 2.3637 (3.3753) grad_norm 1.9519 (inf) loss_scale 2048.0000 (2868.4655) mem 7379MB [2024-08-26 11:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][980/1251] eta 0:01:07 lr 0.000801 wd 0.0500 time 0.2408 (0.2495) data time 0.0010 (0.0019) model time 0.2398 (0.2475) loss 3.6278 (3.3756) grad_norm 1.7937 (inf) loss_scale 2048.0000 (2860.1019) mem 7379MB [2024-08-26 11:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][990/1251] eta 0:01:05 lr 0.000801 wd 0.0500 time 0.2438 (0.2494) data time 0.0009 (0.0019) model time 0.2429 (0.2474) loss 3.4663 (3.3745) grad_norm 2.0161 (inf) loss_scale 2048.0000 (2851.9072) mem 7379MB [2024-08-26 11:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1000/1251] eta 0:01:02 
lr 0.000801 wd 0.0500 time 0.2472 (0.2494) data time 0.0007 (0.0019) model time 0.2465 (0.2474) loss 2.9000 (3.3737) grad_norm 1.7224 (inf) loss_scale 2048.0000 (2843.8761) mem 7379MB [2024-08-26 11:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1010/1251] eta 0:01:00 lr 0.000801 wd 0.0500 time 0.2408 (0.2493) data time 0.0008 (0.0019) model time 0.2400 (0.2473) loss 2.6848 (3.3721) grad_norm 1.7450 (inf) loss_scale 2048.0000 (2836.0040) mem 7379MB [2024-08-26 11:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1020/1251] eta 0:00:57 lr 0.000801 wd 0.0500 time 0.2402 (0.2492) data time 0.0010 (0.0019) model time 0.2393 (0.2473) loss 3.6728 (3.3704) grad_norm 1.9265 (inf) loss_scale 2048.0000 (2828.2860) mem 7379MB [2024-08-26 11:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1030/1251] eta 0:00:55 lr 0.000801 wd 0.0500 time 0.2411 (0.2492) data time 0.0007 (0.0019) model time 0.2404 (0.2472) loss 4.6017 (3.3704) grad_norm 2.0201 (inf) loss_scale 2048.0000 (2820.7177) mem 7379MB [2024-08-26 11:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1040/1251] eta 0:00:52 lr 0.000801 wd 0.0500 time 0.2427 (0.2492) data time 0.0007 (0.0019) model time 0.2420 (0.2472) loss 2.9735 (3.3718) grad_norm 3.5252 (inf) loss_scale 2048.0000 (2813.2949) mem 7379MB [2024-08-26 11:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1050/1251] eta 0:00:50 lr 0.000801 wd 0.0500 time 0.2453 (0.2492) data time 0.0009 (0.0019) model time 0.2443 (0.2472) loss 3.0687 (3.3726) grad_norm 2.0041 (inf) loss_scale 2048.0000 (2806.0133) mem 7379MB [2024-08-26 11:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1060/1251] eta 0:00:47 lr 0.000801 wd 0.0500 time 0.2490 (0.2492) data time 0.0011 (0.0019) model time 0.2479 (0.2472) loss 2.2438 (3.3693) grad_norm 2.5476 (inf) loss_scale 2048.0000 (2798.8690) mem 7379MB [2024-08-26 11:08:53 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1070/1251] eta 0:00:45 lr 0.000801 wd 0.0500 time 0.2353 (0.2492) data time 0.0008 (0.0019) model time 0.2346 (0.2472) loss 3.7983 (3.3720) grad_norm 1.6200 (inf) loss_scale 2048.0000 (2791.8581) mem 7379MB [2024-08-26 11:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1080/1251] eta 0:00:42 lr 0.000801 wd 0.0500 time 0.2428 (0.2492) data time 0.0011 (0.0019) model time 0.2417 (0.2472) loss 2.8647 (3.3701) grad_norm 1.2988 (inf) loss_scale 2048.0000 (2784.9769) mem 7379MB [2024-08-26 11:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1090/1251] eta 0:00:40 lr 0.000801 wd 0.0500 time 0.2541 (0.2492) data time 0.0010 (0.0019) model time 0.2531 (0.2472) loss 4.1180 (3.3685) grad_norm 4.0775 (inf) loss_scale 2048.0000 (2778.2218) mem 7379MB [2024-08-26 11:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1100/1251] eta 0:00:37 lr 0.000801 wd 0.0500 time 0.2478 (0.2495) data time 0.0010 (0.0019) model time 0.2468 (0.2475) loss 3.3437 (3.3695) grad_norm 3.5654 (inf) loss_scale 2048.0000 (2771.5895) mem 7379MB [2024-08-26 11:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1110/1251] eta 0:00:35 lr 0.000801 wd 0.0500 time 0.2393 (0.2494) data time 0.0009 (0.0019) model time 0.2384 (0.2475) loss 3.3824 (3.3698) grad_norm 1.9999 (inf) loss_scale 2048.0000 (2765.0765) mem 7379MB [2024-08-26 11:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1120/1251] eta 0:00:32 lr 0.000801 wd 0.0500 time 0.2434 (0.2494) data time 0.0008 (0.0019) model time 0.2425 (0.2474) loss 4.1335 (3.3714) grad_norm 3.0756 (inf) loss_scale 2048.0000 (2758.6798) mem 7379MB [2024-08-26 11:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1130/1251] eta 0:00:30 lr 0.000801 wd 0.0500 time 0.2401 (0.2493) data time 0.0010 (0.0019) model time 0.2391 (0.2474) loss 3.3397 (3.3737) 
grad_norm 2.0839 (inf) loss_scale 2048.0000 (2752.3961) mem 7379MB [2024-08-26 11:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1140/1251] eta 0:00:27 lr 0.000801 wd 0.0500 time 0.2452 (0.2493) data time 0.0008 (0.0019) model time 0.2444 (0.2474) loss 2.7111 (3.3703) grad_norm 1.5944 (inf) loss_scale 2048.0000 (2746.2226) mem 7379MB [2024-08-26 11:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1150/1251] eta 0:00:25 lr 0.000801 wd 0.0500 time 0.2402 (0.2493) data time 0.0010 (0.0019) model time 0.2392 (0.2473) loss 3.1411 (3.3698) grad_norm 2.0861 (inf) loss_scale 2048.0000 (2740.1564) mem 7379MB [2024-08-26 11:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1160/1251] eta 0:00:22 lr 0.000801 wd 0.0500 time 0.2400 (0.2493) data time 0.0012 (0.0019) model time 0.2388 (0.2473) loss 2.6290 (3.3670) grad_norm 2.0029 (inf) loss_scale 2048.0000 (2734.1947) mem 7379MB [2024-08-26 11:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1170/1251] eta 0:00:20 lr 0.000801 wd 0.0500 time 0.2433 (0.2493) data time 0.0010 (0.0019) model time 0.2423 (0.2473) loss 3.4055 (3.3693) grad_norm 1.7114 (inf) loss_scale 2048.0000 (2728.3348) mem 7379MB [2024-08-26 11:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1180/1251] eta 0:00:17 lr 0.000801 wd 0.0500 time 0.2413 (0.2494) data time 0.0008 (0.0019) model time 0.2406 (0.2475) loss 1.9828 (3.3684) grad_norm 2.7086 (inf) loss_scale 2048.0000 (2722.5741) mem 7379MB [2024-08-26 11:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1190/1251] eta 0:00:15 lr 0.000801 wd 0.0500 time 0.2401 (0.2494) data time 0.0011 (0.0019) model time 0.2390 (0.2474) loss 3.6319 (3.3697) grad_norm 1.8705 (inf) loss_scale 2048.0000 (2716.9102) mem 7379MB [2024-08-26 11:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1200/1251] eta 0:00:12 lr 0.000801 wd 0.0500 time 
0.2371 (0.2493) data time 0.0007 (0.0019) model time 0.2363 (0.2473) loss 3.5720 (3.3706) grad_norm 2.2882 (inf) loss_scale 2048.0000 (2711.3405) mem 7379MB [2024-08-26 11:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1210/1251] eta 0:00:10 lr 0.000801 wd 0.0500 time 0.2361 (0.2492) data time 0.0010 (0.0018) model time 0.2351 (0.2473) loss 3.4420 (3.3706) grad_norm 1.4218 (inf) loss_scale 2048.0000 (2705.8629) mem 7379MB [2024-08-26 11:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1220/1251] eta 0:00:07 lr 0.000801 wd 0.0500 time 0.2329 (0.2491) data time 0.0011 (0.0018) model time 0.2318 (0.2472) loss 3.6243 (3.3718) grad_norm 2.1564 (inf) loss_scale 2048.0000 (2700.4750) mem 7379MB [2024-08-26 11:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1230/1251] eta 0:00:05 lr 0.000801 wd 0.0500 time 0.2454 (0.2491) data time 0.0008 (0.0018) model time 0.2446 (0.2472) loss 2.4656 (3.3716) grad_norm 2.8962 (inf) loss_scale 2048.0000 (2695.1747) mem 7379MB [2024-08-26 11:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1240/1251] eta 0:00:02 lr 0.000800 wd 0.0500 time 0.2250 (0.2490) data time 0.0007 (0.0018) model time 0.2243 (0.2471) loss 2.5063 (3.3689) grad_norm 1.9086 (inf) loss_scale 2048.0000 (2689.9597) mem 7379MB [2024-08-26 11:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [102/300][1250/1251] eta 0:00:00 lr 0.000800 wd 0.0500 time 0.2253 (0.2488) data time 0.0005 (0.0018) model time 0.2248 (0.2469) loss 4.0666 (3.3702) grad_norm 2.6452 (inf) loss_scale 2048.0000 (2684.8281) mem 7379MB [2024-08-26 11:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 102 training takes 0:05:11 [2024-08-26 11:09:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 11:09:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 11:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.422 (0.422) Loss 0.4888 (0.4888) Acc@1 89.844 (89.844) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 11:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.112) Loss 0.8291 (0.7726) Acc@1 81.152 (83.043) Acc@5 95.996 (96.520) Mem 7379MB [2024-08-26 11:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.096) Loss 1.1357 (0.8050) Acc@1 72.949 (81.915) Acc@5 92.871 (96.336) Mem 7379MB [2024-08-26 11:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.3545 (0.9153) Acc@1 67.969 (79.328) Acc@5 89.355 (94.975) Mem 7379MB [2024-08-26 11:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.2354 (0.9771) Acc@1 70.410 (77.730) Acc@5 91.309 (94.281) Mem 7379MB [2024-08-26 11:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.328 Acc@5 94.214 [2024-08-26 11:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.3% [2024-08-26 11:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.769 (0.769) Loss 0.4382 (0.4382) Acc@1 92.285 (92.285) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 11:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.148) Loss 0.7051 (0.6869) Acc@1 86.426 (85.165) Acc@5 96.484 (97.061) Mem 7379MB [2024-08-26 11:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.115) Loss 0.9810 (0.7105) Acc@1 76.855 (84.138) Acc@5 94.141 (97.038) Mem 7379MB [2024-08-26 11:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.104) Loss 1.2656 (0.8092) Acc@1 67.383 (81.792) Acc@5 
90.625 (95.873) Mem 7379MB [2024-08-26 11:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.095) Loss 1.1240 (0.8605) Acc@1 72.266 (80.357) Acc@5 93.066 (95.372) Mem 7379MB [2024-08-26 11:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.930 Acc@5 95.316 [2024-08-26 11:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 79.9% [2024-08-26 11:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 79.93% [2024-08-26 11:09:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 11:09:47 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 11:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][0/1251] eta 0:13:39 lr 0.000800 wd 0.0500 time 0.6551 (0.6551) data time 0.4159 (0.4159) model time 0.0000 (0.0000) loss 3.4875 (3.4875) grad_norm 1.8968 (1.8968) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][10/1251] eta 0:05:45 lr 0.000800 wd 0.0500 time 0.2396 (0.2784) data time 0.0011 (0.0388) model time 0.0000 (0.0000) loss 3.4209 (3.2829) grad_norm 2.5870 (2.2646) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][20/1251] eta 0:05:20 lr 0.000800 wd 0.0500 time 0.2407 (0.2607) data time 0.0012 (0.0208) model time 0.0000 (0.0000) loss 3.6659 (3.3127) grad_norm 2.2037 (2.2106) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][30/1251] eta 0:05:11 lr 0.000800 wd 0.0500 time 0.2477 (0.2552) data time 0.0010 (0.0144) model time 0.0000 (0.0000) loss 3.6480 (3.2705) grad_norm 2.4689 (2.3223) 
loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][40/1251] eta 0:05:10 lr 0.000800 wd 0.0500 time 0.2398 (0.2561) data time 0.0009 (0.0112) model time 0.0000 (0.0000) loss 3.1176 (3.2699) grad_norm 2.7339 (2.2520) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][50/1251] eta 0:05:05 lr 0.000800 wd 0.0500 time 0.2488 (0.2540) data time 0.0009 (0.0092) model time 0.0000 (0.0000) loss 3.0204 (3.2990) grad_norm 1.7929 (2.1884) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][60/1251] eta 0:05:06 lr 0.000800 wd 0.0500 time 0.3966 (0.2572) data time 0.0009 (0.0078) model time 0.3956 (0.2730) loss 3.5824 (3.3238) grad_norm 2.6101 (2.1832) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][70/1251] eta 0:05:01 lr 0.000800 wd 0.0500 time 0.2465 (0.2552) data time 0.0007 (0.0069) model time 0.2457 (0.2574) loss 2.2531 (3.2918) grad_norm 2.1946 (2.1618) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][80/1251] eta 0:04:57 lr 0.000800 wd 0.0500 time 0.2451 (0.2539) data time 0.0008 (0.0062) model time 0.2443 (0.2528) loss 3.7383 (3.2697) grad_norm 1.5505 (2.1375) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][90/1251] eta 0:04:53 lr 0.000800 wd 0.0500 time 0.2429 (0.2526) data time 0.0010 (0.0056) model time 0.2420 (0.2498) loss 3.3506 (3.3115) grad_norm 1.9592 (2.1358) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][100/1251] eta 0:04:49 lr 0.000800 wd 0.0500 time 0.2437 (0.2518) data 
time 0.0008 (0.0052) model time 0.2429 (0.2486) loss 3.6074 (3.3253) grad_norm 1.9026 (2.1441) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][110/1251] eta 0:04:46 lr 0.000800 wd 0.0500 time 0.2442 (0.2509) data time 0.0009 (0.0048) model time 0.2433 (0.2473) loss 3.4293 (3.3014) grad_norm 2.2300 (2.1609) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][120/1251] eta 0:04:42 lr 0.000800 wd 0.0500 time 0.2410 (0.2501) data time 0.0010 (0.0045) model time 0.2400 (0.2463) loss 3.6275 (3.2861) grad_norm 2.4738 (2.1645) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][130/1251] eta 0:04:41 lr 0.000800 wd 0.0500 time 0.2370 (0.2508) data time 0.0008 (0.0042) model time 0.2361 (0.2478) loss 2.9686 (3.2866) grad_norm 2.2654 (2.1569) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][140/1251] eta 0:04:38 lr 0.000800 wd 0.0500 time 0.2401 (0.2502) data time 0.0010 (0.0040) model time 0.2390 (0.2471) loss 2.7143 (3.2801) grad_norm 1.7884 (2.1560) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][150/1251] eta 0:04:36 lr 0.000800 wd 0.0500 time 0.2415 (0.2512) data time 0.0009 (0.0038) model time 0.2407 (0.2488) loss 3.3432 (3.2907) grad_norm 2.0376 (2.1601) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][160/1251] eta 0:04:33 lr 0.000800 wd 0.0500 time 0.2381 (0.2508) data time 0.0011 (0.0036) model time 0.2370 (0.2482) loss 3.1558 (3.2914) grad_norm 1.8618 (2.1387) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [103/300][170/1251] eta 0:04:31 lr 0.000800 wd 0.0500 time 0.2421 (0.2515) data time 0.0011 (0.0035) model time 0.2411 (0.2494) loss 3.5373 (3.2918) grad_norm 2.0375 (2.1359) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][180/1251] eta 0:04:28 lr 0.000800 wd 0.0500 time 0.2385 (0.2508) data time 0.0009 (0.0033) model time 0.2376 (0.2485) loss 4.5119 (3.3215) grad_norm 1.8912 (2.1192) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][190/1251] eta 0:04:25 lr 0.000800 wd 0.0500 time 0.2400 (0.2503) data time 0.0008 (0.0032) model time 0.2391 (0.2479) loss 2.4912 (3.3118) grad_norm 2.5420 (2.1193) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][200/1251] eta 0:04:23 lr 0.000800 wd 0.0500 time 0.2368 (0.2508) data time 0.0007 (0.0031) model time 0.2361 (0.2487) loss 3.9937 (3.3247) grad_norm 2.6425 (2.1302) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][210/1251] eta 0:04:20 lr 0.000800 wd 0.0500 time 0.2383 (0.2504) data time 0.0010 (0.0030) model time 0.2373 (0.2482) loss 3.9022 (3.3349) grad_norm 2.5935 (2.1503) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][220/1251] eta 0:04:17 lr 0.000800 wd 0.0500 time 0.2404 (0.2502) data time 0.0009 (0.0029) model time 0.2395 (0.2480) loss 3.7278 (3.3403) grad_norm 1.6558 (2.1536) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][230/1251] eta 0:04:15 lr 0.000800 wd 0.0500 time 0.2504 (0.2499) data time 0.0008 (0.0029) model time 0.2496 (0.2476) loss 3.7869 (3.3330) grad_norm 
1.9173 (2.1364) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][240/1251] eta 0:04:12 lr 0.000800 wd 0.0500 time 0.2490 (0.2502) data time 0.0008 (0.0028) model time 0.2482 (0.2482) loss 4.0774 (3.3391) grad_norm 2.0208 (2.1301) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][250/1251] eta 0:04:10 lr 0.000800 wd 0.0500 time 0.2421 (0.2498) data time 0.0009 (0.0027) model time 0.2412 (0.2477) loss 3.3538 (3.3452) grad_norm 2.5142 (2.1277) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][260/1251] eta 0:04:07 lr 0.000800 wd 0.0500 time 0.2377 (0.2495) data time 0.0008 (0.0026) model time 0.2370 (0.2473) loss 3.6330 (3.3486) grad_norm 1.5056 (2.1304) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][270/1251] eta 0:04:04 lr 0.000799 wd 0.0500 time 0.2340 (0.2493) data time 0.0012 (0.0026) model time 0.2327 (0.2471) loss 3.5576 (3.3478) grad_norm 1.7930 (2.1256) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][280/1251] eta 0:04:02 lr 0.000799 wd 0.0500 time 0.2418 (0.2497) data time 0.0009 (0.0025) model time 0.2408 (0.2477) loss 3.5888 (3.3496) grad_norm 1.9751 (2.1203) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][290/1251] eta 0:03:59 lr 0.000799 wd 0.0500 time 0.2449 (0.2496) data time 0.0012 (0.0026) model time 0.2436 (0.2474) loss 3.5415 (3.3503) grad_norm 1.5587 (2.1168) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][300/1251] eta 0:03:57 lr 0.000799 wd 0.0500 time 
0.2372 (0.2499) data time 0.0010 (0.0025) model time 0.2361 (0.2479) loss 2.1706 (3.3417) grad_norm 1.7889 (2.1175) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][310/1251] eta 0:03:55 lr 0.000799 wd 0.0500 time 0.2407 (0.2504) data time 0.0008 (0.0025) model time 0.2399 (0.2485) loss 2.4335 (3.3414) grad_norm 2.0492 (2.1310) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][320/1251] eta 0:03:52 lr 0.000799 wd 0.0500 time 0.2527 (0.2501) data time 0.0010 (0.0025) model time 0.2517 (0.2482) loss 3.6246 (3.3433) grad_norm 1.7907 (2.1495) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][330/1251] eta 0:03:50 lr 0.000799 wd 0.0500 time 0.2383 (0.2499) data time 0.0010 (0.0024) model time 0.2372 (0.2480) loss 3.5647 (3.3412) grad_norm 2.3679 (2.1445) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][340/1251] eta 0:03:47 lr 0.000799 wd 0.0500 time 0.2429 (0.2497) data time 0.0009 (0.0024) model time 0.2419 (0.2477) loss 3.6770 (3.3275) grad_norm 2.0257 (2.1437) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][350/1251] eta 0:03:44 lr 0.000799 wd 0.0500 time 0.2475 (0.2494) data time 0.0009 (0.0024) model time 0.2466 (0.2475) loss 2.6461 (3.3295) grad_norm 1.6995 (2.1649) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][360/1251] eta 0:03:42 lr 0.000799 wd 0.0500 time 0.4245 (0.2497) data time 0.0010 (0.0023) model time 0.4235 (0.2478) loss 3.7962 (3.3329) grad_norm 2.3286 (2.1649) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:20 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][370/1251] eta 0:03:40 lr 0.000799 wd 0.0500 time 0.2361 (0.2498) data time 0.0010 (0.0023) model time 0.2351 (0.2480) loss 3.6912 (3.3381) grad_norm 1.6431 (2.1614) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][380/1251] eta 0:03:37 lr 0.000799 wd 0.0500 time 0.2533 (0.2497) data time 0.0010 (0.0023) model time 0.2523 (0.2479) loss 4.3085 (3.3464) grad_norm 2.4409 (2.1569) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][390/1251] eta 0:03:34 lr 0.000799 wd 0.0500 time 0.2515 (0.2496) data time 0.0010 (0.0022) model time 0.2506 (0.2477) loss 2.9181 (3.3401) grad_norm 1.4666 (2.1489) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][400/1251] eta 0:03:32 lr 0.000799 wd 0.0500 time 0.2438 (0.2494) data time 0.0010 (0.0022) model time 0.2428 (0.2475) loss 3.4203 (3.3441) grad_norm 1.9153 (2.1491) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][410/1251] eta 0:03:29 lr 0.000799 wd 0.0500 time 0.2385 (0.2492) data time 0.0010 (0.0022) model time 0.2375 (0.2473) loss 3.9168 (3.3430) grad_norm 1.6738 (2.1415) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][420/1251] eta 0:03:26 lr 0.000799 wd 0.0500 time 0.2377 (0.2490) data time 0.0011 (0.0021) model time 0.2366 (0.2471) loss 3.5260 (3.3466) grad_norm 1.9627 (2.1407) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][430/1251] eta 0:03:24 lr 0.000799 wd 0.0500 time 0.2385 (0.2491) data time 0.0008 (0.0021) model time 0.2377 (0.2473) loss 3.1855 
(3.3439) grad_norm 2.0647 (2.1389) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][440/1251] eta 0:03:21 lr 0.000799 wd 0.0500 time 0.2406 (0.2490) data time 0.0008 (0.0021) model time 0.2398 (0.2472) loss 3.6895 (3.3410) grad_norm 1.8782 (2.1356) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][450/1251] eta 0:03:19 lr 0.000799 wd 0.0500 time 0.2380 (0.2488) data time 0.0012 (0.0021) model time 0.2368 (0.2470) loss 3.4993 (3.3431) grad_norm 1.8945 (2.1365) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][460/1251] eta 0:03:16 lr 0.000799 wd 0.0500 time 0.2426 (0.2487) data time 0.0010 (0.0020) model time 0.2416 (0.2469) loss 2.8294 (3.3417) grad_norm 2.0922 (2.1343) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][470/1251] eta 0:03:14 lr 0.000799 wd 0.0500 time 0.2417 (0.2486) data time 0.0010 (0.0020) model time 0.2407 (0.2468) loss 3.5943 (3.3500) grad_norm 1.9305 (2.1288) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][480/1251] eta 0:03:11 lr 0.000799 wd 0.0500 time 0.2431 (0.2485) data time 0.0008 (0.0020) model time 0.2423 (0.2467) loss 3.3867 (3.3512) grad_norm 1.8952 (2.1334) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][490/1251] eta 0:03:09 lr 0.000799 wd 0.0500 time 0.2462 (0.2484) data time 0.0010 (0.0020) model time 0.2453 (0.2466) loss 3.4602 (3.3534) grad_norm 1.9551 (2.1312) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][500/1251] eta 0:03:06 lr 
0.000799 wd 0.0500 time 0.2486 (0.2482) data time 0.0010 (0.0020) model time 0.2476 (0.2464) loss 3.3755 (3.3530) grad_norm 2.0392 (2.1301) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][510/1251] eta 0:03:03 lr 0.000799 wd 0.0500 time 0.2492 (0.2481) data time 0.0010 (0.0019) model time 0.2483 (0.2463) loss 3.6771 (3.3519) grad_norm 1.8590 (2.1294) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][520/1251] eta 0:03:01 lr 0.000799 wd 0.0500 time 0.2378 (0.2480) data time 0.0008 (0.0019) model time 0.2371 (0.2462) loss 3.7445 (3.3566) grad_norm 2.5512 (2.1306) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][530/1251] eta 0:02:58 lr 0.000799 wd 0.0500 time 0.2429 (0.2478) data time 0.0009 (0.0019) model time 0.2420 (0.2461) loss 4.1270 (3.3537) grad_norm 2.2972 (2.1404) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][540/1251] eta 0:02:56 lr 0.000799 wd 0.0500 time 0.2446 (0.2478) data time 0.0007 (0.0019) model time 0.2439 (0.2460) loss 2.2357 (3.3531) grad_norm 1.7839 (2.1508) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][550/1251] eta 0:02:53 lr 0.000798 wd 0.0500 time 0.2390 (0.2477) data time 0.0010 (0.0019) model time 0.2380 (0.2459) loss 3.0867 (3.3563) grad_norm 2.2421 (2.1518) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][560/1251] eta 0:02:51 lr 0.000798 wd 0.0500 time 0.2428 (0.2475) data time 0.0007 (0.0019) model time 0.2420 (0.2458) loss 2.9981 (3.3523) grad_norm 2.8202 (2.1541) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 
11:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][570/1251] eta 0:02:48 lr 0.000798 wd 0.0500 time 0.2478 (0.2475) data time 0.0011 (0.0018) model time 0.2467 (0.2457) loss 2.5098 (3.3529) grad_norm 1.6162 (2.1512) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][580/1251] eta 0:02:46 lr 0.000798 wd 0.0500 time 0.2406 (0.2474) data time 0.0009 (0.0018) model time 0.2398 (0.2457) loss 4.0058 (3.3570) grad_norm 2.8335 (2.1530) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][590/1251] eta 0:02:43 lr 0.000798 wd 0.0500 time 0.2493 (0.2480) data time 0.0010 (0.0018) model time 0.2483 (0.2464) loss 2.7050 (3.3547) grad_norm 1.6440 (2.1532) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][600/1251] eta 0:02:41 lr 0.000798 wd 0.0500 time 0.2405 (0.2479) data time 0.0011 (0.0018) model time 0.2394 (0.2463) loss 3.4676 (3.3611) grad_norm 1.9878 (2.1569) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][610/1251] eta 0:02:38 lr 0.000798 wd 0.0500 time 0.2419 (0.2479) data time 0.0008 (0.0018) model time 0.2411 (0.2462) loss 3.9411 (3.3630) grad_norm 1.7012 (2.1590) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][620/1251] eta 0:02:36 lr 0.000798 wd 0.0500 time 0.2416 (0.2478) data time 0.0007 (0.0018) model time 0.2409 (0.2461) loss 4.0611 (3.3666) grad_norm 2.8430 (2.1585) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][630/1251] eta 0:02:33 lr 0.000798 wd 0.0500 time 0.2336 (0.2477) data time 0.0009 (0.0018) model time 0.2327 (0.2460) 
loss 3.8990 (3.3647) grad_norm 2.0126 (2.1675) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][640/1251] eta 0:02:31 lr 0.000798 wd 0.0500 time 0.2423 (0.2476) data time 0.0008 (0.0018) model time 0.2415 (0.2459) loss 3.9626 (3.3654) grad_norm 2.5206 (2.1708) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][650/1251] eta 0:02:28 lr 0.000798 wd 0.0500 time 0.2431 (0.2477) data time 0.0010 (0.0017) model time 0.2421 (0.2461) loss 3.0633 (3.3653) grad_norm 2.5523 (2.1696) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][660/1251] eta 0:02:26 lr 0.000798 wd 0.0500 time 0.2377 (0.2476) data time 0.0010 (0.0017) model time 0.2367 (0.2460) loss 2.9166 (3.3651) grad_norm 1.6651 (2.1659) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][670/1251] eta 0:02:23 lr 0.000798 wd 0.0500 time 0.2417 (0.2478) data time 0.0008 (0.0017) model time 0.2409 (0.2462) loss 3.2265 (3.3644) grad_norm 2.1347 (2.1693) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][680/1251] eta 0:02:21 lr 0.000798 wd 0.0500 time 0.2326 (0.2478) data time 0.0011 (0.0017) model time 0.2315 (0.2462) loss 3.4996 (3.3649) grad_norm 2.2697 (2.1712) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][690/1251] eta 0:02:19 lr 0.000798 wd 0.0500 time 0.2467 (0.2480) data time 0.0010 (0.0017) model time 0.2457 (0.2464) loss 3.5699 (3.3707) grad_norm 1.7601 (2.1664) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][700/1251] eta 
0:02:16 lr 0.000798 wd 0.0500 time 0.2398 (0.2480) data time 0.0010 (0.0017) model time 0.2389 (0.2464) loss 3.5737 (3.3701) grad_norm 1.8510 (2.1603) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][710/1251] eta 0:02:14 lr 0.000798 wd 0.0500 time 0.2469 (0.2482) data time 0.0010 (0.0017) model time 0.2460 (0.2467) loss 2.6125 (3.3696) grad_norm 1.9520 (2.1612) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][720/1251] eta 0:02:11 lr 0.000798 wd 0.0500 time 0.2408 (0.2482) data time 0.0011 (0.0017) model time 0.2397 (0.2466) loss 3.7316 (3.3694) grad_norm 2.2472 (2.1655) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][730/1251] eta 0:02:09 lr 0.000798 wd 0.0500 time 0.2442 (0.2481) data time 0.0008 (0.0017) model time 0.2435 (0.2465) loss 4.0014 (3.3697) grad_norm 1.5334 (2.1622) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][740/1251] eta 0:02:06 lr 0.000798 wd 0.0500 time 0.3053 (0.2480) data time 0.0008 (0.0017) model time 0.3045 (0.2465) loss 3.4805 (3.3657) grad_norm 2.3613 (2.1605) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][750/1251] eta 0:02:04 lr 0.000798 wd 0.0500 time 0.2558 (0.2480) data time 0.0012 (0.0017) model time 0.2546 (0.2465) loss 3.5924 (3.3662) grad_norm 2.3830 (2.1607) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][760/1251] eta 0:02:01 lr 0.000798 wd 0.0500 time 0.2449 (0.2482) data time 0.0009 (0.0017) model time 0.2440 (0.2467) loss 2.8764 (3.3652) grad_norm 1.4958 (2.1560) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 11:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][770/1251] eta 0:01:59 lr 0.000798 wd 0.0500 time 0.2463 (0.2481) data time 0.0008 (0.0016) model time 0.2456 (0.2466) loss 4.1662 (3.3639) grad_norm 1.7773 (2.1555) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][780/1251] eta 0:01:56 lr 0.000798 wd 0.0500 time 0.2380 (0.2480) data time 0.0010 (0.0016) model time 0.2370 (0.2465) loss 2.9966 (3.3619) grad_norm 2.0378 (2.1529) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][790/1251] eta 0:01:54 lr 0.000798 wd 0.0500 time 0.2451 (0.2480) data time 0.0010 (0.0016) model time 0.2441 (0.2465) loss 3.3939 (3.3610) grad_norm 2.4317 (2.1516) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][800/1251] eta 0:01:51 lr 0.000798 wd 0.0500 time 0.2339 (0.2479) data time 0.0008 (0.0016) model time 0.2331 (0.2464) loss 2.3212 (3.3613) grad_norm 1.6965 (2.1513) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][810/1251] eta 0:01:49 lr 0.000798 wd 0.0500 time 0.2397 (0.2478) data time 0.0007 (0.0016) model time 0.2390 (0.2463) loss 3.7511 (3.3623) grad_norm 1.6032 (2.1462) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][820/1251] eta 0:01:46 lr 0.000797 wd 0.0500 time 0.2562 (0.2478) data time 0.0009 (0.0016) model time 0.2553 (0.2463) loss 2.8953 (3.3625) grad_norm 2.0063 (2.1460) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][830/1251] eta 0:01:44 lr 0.000797 wd 0.0500 time 0.2455 (0.2477) data time 0.0012 (0.0016) model time 0.2443 
(0.2462) loss 3.5102 (3.3645) grad_norm 2.1348 (2.1424) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][840/1251] eta 0:01:41 lr 0.000797 wd 0.0500 time 0.2407 (0.2477) data time 0.0009 (0.0016) model time 0.2399 (0.2462) loss 3.7220 (3.3627) grad_norm 2.5296 (2.1419) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][850/1251] eta 0:01:39 lr 0.000797 wd 0.0500 time 0.2454 (0.2476) data time 0.0008 (0.0016) model time 0.2446 (0.2461) loss 2.8641 (3.3630) grad_norm 1.8440 (2.1431) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][860/1251] eta 0:01:36 lr 0.000797 wd 0.0500 time 0.2467 (0.2476) data time 0.0007 (0.0016) model time 0.2460 (0.2461) loss 3.5423 (3.3637) grad_norm 1.7516 (2.1404) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][870/1251] eta 0:01:34 lr 0.000797 wd 0.0500 time 0.2437 (0.2475) data time 0.0009 (0.0016) model time 0.2428 (0.2460) loss 3.6110 (3.3670) grad_norm 2.2236 (2.1388) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][880/1251] eta 0:01:31 lr 0.000797 wd 0.0500 time 0.2381 (0.2475) data time 0.0012 (0.0016) model time 0.2369 (0.2460) loss 3.8834 (3.3677) grad_norm 1.9445 (2.1368) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][890/1251] eta 0:01:29 lr 0.000797 wd 0.0500 time 0.2361 (0.2477) data time 0.0007 (0.0016) model time 0.2353 (0.2462) loss 3.3178 (3.3693) grad_norm 1.4840 (2.1338) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[103/300][900/1251] eta 0:01:26 lr 0.000797 wd 0.0500 time 0.2440 (0.2478) data time 0.0010 (0.0016) model time 0.2431 (0.2463) loss 3.5047 (3.3681) grad_norm 2.5584 (2.1359) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][910/1251] eta 0:01:24 lr 0.000797 wd 0.0500 time 0.2418 (0.2478) data time 0.0012 (0.0016) model time 0.2406 (0.2463) loss 3.7783 (3.3687) grad_norm 2.9939 (2.1364) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][920/1251] eta 0:01:21 lr 0.000797 wd 0.0500 time 0.2516 (0.2477) data time 0.0013 (0.0016) model time 0.2503 (0.2462) loss 3.3540 (3.3703) grad_norm 1.8998 (2.1359) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][930/1251] eta 0:01:19 lr 0.000797 wd 0.0500 time 0.2396 (0.2477) data time 0.0007 (0.0016) model time 0.2389 (0.2462) loss 3.9591 (3.3705) grad_norm 1.9016 (2.1314) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][940/1251] eta 0:01:17 lr 0.000797 wd 0.0500 time 0.2471 (0.2476) data time 0.0008 (0.0016) model time 0.2463 (0.2461) loss 3.8600 (3.3725) grad_norm 1.7046 (2.1279) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][950/1251] eta 0:01:14 lr 0.000797 wd 0.0500 time 0.2424 (0.2475) data time 0.0010 (0.0016) model time 0.2414 (0.2460) loss 2.6205 (3.3708) grad_norm 1.8526 (2.1273) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][960/1251] eta 0:01:12 lr 0.000797 wd 0.0500 time 0.2350 (0.2477) data time 0.0008 (0.0016) model time 0.2343 (0.2462) loss 3.7575 (3.3723) grad_norm 2.3719 (2.1264) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 11:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][970/1251] eta 0:01:09 lr 0.000797 wd 0.0500 time 0.2405 (0.2478) data time 0.0009 (0.0016) model time 0.2396 (0.2463) loss 3.6138 (3.3698) grad_norm 2.3710 (2.1278) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][980/1251] eta 0:01:07 lr 0.000797 wd 0.0500 time 0.2490 (0.2478) data time 0.0008 (0.0016) model time 0.2482 (0.2463) loss 2.9623 (3.3698) grad_norm 2.3251 (2.1295) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][990/1251] eta 0:01:04 lr 0.000797 wd 0.0500 time 0.2372 (0.2477) data time 0.0008 (0.0015) model time 0.2364 (0.2462) loss 4.0255 (3.3722) grad_norm 2.1757 (2.1308) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1000/1251] eta 0:01:02 lr 0.000797 wd 0.0500 time 0.2446 (0.2476) data time 0.0009 (0.0015) model time 0.2437 (0.2462) loss 4.3983 (3.3741) grad_norm 2.2376 (2.1303) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1010/1251] eta 0:00:59 lr 0.000797 wd 0.0500 time 0.2395 (0.2476) data time 0.0009 (0.0015) model time 0.2385 (0.2461) loss 3.3533 (3.3740) grad_norm 2.0150 (2.1281) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1020/1251] eta 0:00:57 lr 0.000797 wd 0.0500 time 0.2400 (0.2476) data time 0.0009 (0.0015) model time 0.2390 (0.2461) loss 3.9598 (3.3757) grad_norm 2.0840 (2.1278) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1030/1251] eta 0:00:54 lr 0.000797 wd 0.0500 time 0.2374 (0.2475) data time 0.0007 
(0.0015) model time 0.2367 (0.2461) loss 3.3109 (3.3779) grad_norm 2.0069 (2.1276) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1040/1251] eta 0:00:52 lr 0.000797 wd 0.0500 time 0.2440 (0.2475) data time 0.0008 (0.0015) model time 0.2432 (0.2460) loss 4.2117 (3.3791) grad_norm 3.3728 (2.1288) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1050/1251] eta 0:00:49 lr 0.000797 wd 0.0500 time 0.2330 (0.2474) data time 0.0011 (0.0015) model time 0.2319 (0.2460) loss 2.7795 (3.3782) grad_norm 3.1482 (2.1305) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1060/1251] eta 0:00:47 lr 0.000797 wd 0.0500 time 0.2375 (0.2474) data time 0.0010 (0.0015) model time 0.2365 (0.2459) loss 2.7339 (3.3769) grad_norm 1.9507 (2.1302) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1070/1251] eta 0:00:44 lr 0.000797 wd 0.0500 time 0.2443 (0.2473) data time 0.0009 (0.0015) model time 0.2434 (0.2459) loss 2.9019 (3.3761) grad_norm 2.0596 (2.1297) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1080/1251] eta 0:00:42 lr 0.000797 wd 0.0500 time 0.2371 (0.2473) data time 0.0011 (0.0015) model time 0.2360 (0.2458) loss 3.4199 (3.3772) grad_norm 1.6681 (2.1291) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1090/1251] eta 0:00:39 lr 0.000797 wd 0.0500 time 0.2335 (0.2472) data time 0.0010 (0.0015) model time 0.2325 (0.2457) loss 3.4770 (3.3770) grad_norm 1.7011 (2.1290) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [103/300][1100/1251] eta 0:00:37 lr 0.000796 wd 0.0500 time 0.2465 (0.2472) data time 0.0008 (0.0015) model time 0.2457 (0.2457) loss 3.1922 (3.3770) grad_norm 2.3239 (2.1272) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1110/1251] eta 0:00:34 lr 0.000796 wd 0.0500 time 0.2427 (0.2471) data time 0.0007 (0.0015) model time 0.2419 (0.2457) loss 2.5925 (3.3750) grad_norm 1.7864 (2.1258) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1120/1251] eta 0:00:32 lr 0.000796 wd 0.0500 time 0.2514 (0.2471) data time 0.0009 (0.0015) model time 0.2504 (0.2457) loss 3.9706 (3.3764) grad_norm 1.6207 (2.1221) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1130/1251] eta 0:00:29 lr 0.000796 wd 0.0500 time 0.2376 (0.2473) data time 0.0010 (0.0015) model time 0.2366 (0.2458) loss 3.1706 (3.3743) grad_norm 2.4394 (2.1209) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1140/1251] eta 0:00:27 lr 0.000796 wd 0.0500 time 0.2447 (0.2472) data time 0.0010 (0.0015) model time 0.2437 (0.2458) loss 3.1076 (3.3733) grad_norm 1.9085 (2.1227) loss_scale 4096.0000 (2053.3848) mem 7379MB [2024-08-26 11:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1150/1251] eta 0:00:24 lr 0.000796 wd 0.0500 time 0.2357 (0.2472) data time 0.0008 (0.0015) model time 0.2349 (0.2457) loss 3.8105 (3.3754) grad_norm 1.7735 (2.1234) loss_scale 4096.0000 (2071.1312) mem 7379MB [2024-08-26 11:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1160/1251] eta 0:00:22 lr 0.000796 wd 0.0500 time 0.2392 (0.2471) data time 0.0007 (0.0015) model time 0.2385 (0.2457) loss 3.8632 (3.3752) grad_norm 1.9274 (2.1234) 
loss_scale 4096.0000 (2088.5719) mem 7379MB [2024-08-26 11:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1170/1251] eta 0:00:20 lr 0.000796 wd 0.0500 time 0.2464 (0.2471) data time 0.0008 (0.0015) model time 0.2456 (0.2456) loss 1.8772 (3.3726) grad_norm 1.9492 (2.1224) loss_scale 4096.0000 (2105.7148) mem 7379MB [2024-08-26 11:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1180/1251] eta 0:00:17 lr 0.000796 wd 0.0500 time 0.2473 (0.2472) data time 0.0012 (0.0015) model time 0.2462 (0.2458) loss 3.1226 (3.3747) grad_norm 2.5912 (2.1227) loss_scale 4096.0000 (2122.5673) mem 7379MB [2024-08-26 11:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1190/1251] eta 0:00:15 lr 0.000796 wd 0.0500 time 0.2419 (0.2474) data time 0.0010 (0.0015) model time 0.2409 (0.2459) loss 3.6189 (3.3751) grad_norm 2.5794 (2.1233) loss_scale 4096.0000 (2139.1369) mem 7379MB [2024-08-26 11:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1200/1251] eta 0:00:12 lr 0.000796 wd 0.0500 time 0.2406 (0.2473) data time 0.0011 (0.0015) model time 0.2395 (0.2459) loss 2.4862 (3.3755) grad_norm 1.9761 (2.1229) loss_scale 4096.0000 (2155.4305) mem 7379MB [2024-08-26 11:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1210/1251] eta 0:00:10 lr 0.000796 wd 0.0500 time 0.2439 (0.2474) data time 0.0007 (0.0015) model time 0.2432 (0.2460) loss 3.4307 (3.3752) grad_norm 1.9759 (2.1207) loss_scale 4096.0000 (2171.4550) mem 7379MB [2024-08-26 11:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1220/1251] eta 0:00:07 lr 0.000796 wd 0.0500 time 0.2378 (0.2474) data time 0.0012 (0.0015) model time 0.2365 (0.2460) loss 4.1012 (3.3767) grad_norm 2.2890 (2.1195) loss_scale 4096.0000 (2187.2170) mem 7379MB [2024-08-26 11:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1230/1251] eta 0:00:05 lr 0.000796 wd 0.0500 time 0.2412 
(0.2478) data time 0.0007 (0.0015) model time 0.2405 (0.2464) loss 4.2331 (3.3792) grad_norm 1.8323 (2.1194) loss_scale 4096.0000 (2202.7230) mem 7379MB [2024-08-26 11:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1240/1251] eta 0:00:02 lr 0.000796 wd 0.0500 time 0.2228 (0.2478) data time 0.0005 (0.0015) model time 0.2224 (0.2465) loss 3.5520 (3.3784) grad_norm 1.8623 (2.1201) loss_scale 4096.0000 (2217.9790) mem 7379MB [2024-08-26 11:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [103/300][1250/1251] eta 0:00:00 lr 0.000796 wd 0.0500 time 0.2246 (0.2477) data time 0.0007 (0.0015) model time 0.2239 (0.2463) loss 3.4658 (3.3790) grad_norm 2.0494 (2.1205) loss_scale 4096.0000 (2232.9912) mem 7379MB [2024-08-26 11:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 103 training takes 0:05:09 [2024-08-26 11:14:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 11:14:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 11:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.400 (0.400) Loss 0.4993 (0.4993) Acc@1 89.648 (89.648) Acc@5 97.754 (97.754) Mem 7379MB
[2024-08-26 11:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.109) Loss 0.7617 (0.7554) Acc@1 83.203 (83.150) Acc@5 96.387 (96.520) Mem 7379MB
[2024-08-26 11:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.095) Loss 1.2324 (0.7881) Acc@1 70.215 (81.878) Acc@5 91.504 (96.373) Mem 7379MB
[2024-08-26 11:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.089) Loss 1.3398 (0.9013) Acc@1 68.262 (79.347) Acc@5 89.258 (94.969) Mem 7379MB
[2024-08-26 11:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3018 (0.9641) Acc@1 69.336 (77.903) Acc@5 90.234 (94.288) Mem 7379MB
[2024-08-26 11:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.506 Acc@5 94.178
[2024-08-26 11:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.5%
[2024-08-26 11:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.51%
[2024-08-26 11:15:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 11:15:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 11:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.402 (0.402) Loss 0.4377 (0.4377) Acc@1 92.188 (92.188) Acc@5 98.340 (98.340) Mem 7379MB
[2024-08-26 11:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.106) Loss 0.7046 (0.6858) Acc@1 86.230 (85.192) Acc@5 96.387 (97.035) Mem 7379MB
[2024-08-26 11:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.095) Loss 0.9814 (0.7096) Acc@1 76.465 (84.142) Acc@5 94.043 (97.001) Mem 7379MB
[2024-08-26 11:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.089) Loss 1.2617 (0.8079) Acc@1 67.480 (81.842) Acc@5 90.723 (95.857) Mem 7379MB
[2024-08-26 11:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.1221 (0.8589) Acc@1 72.070 (80.433) Acc@5 93.262 (95.353) Mem 7379MB
[2024-08-26 11:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.000 Acc@5 95.316
[2024-08-26 11:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.0%
[2024-08-26 11:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.00%
[2024-08-26 11:15:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 11:15:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 11:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][0/1251] eta 0:14:03 lr 0.000796 wd 0.0500 time 0.6745 (0.6745) data time 0.4357 (0.4357) model time 0.0000 (0.0000) loss 3.4286 (3.4286) grad_norm 2.8998 (2.8998) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][10/1251] eta 0:05:46 lr 0.000796 wd 0.0500 time 0.2395 (0.2792) data time 0.0009 (0.0406) model time 0.0000 (0.0000) loss 3.9914 (3.5491) grad_norm 1.5144 (2.4599) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][20/1251] eta 0:05:23 lr 0.000796 wd 0.0500 time 0.2531 (0.2624) data time 0.0012 (0.0218) model time 0.0000 (0.0000) loss 2.4672 (3.4239) grad_norm 2.2221 (2.5055) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][30/1251] eta 0:05:12 lr 0.000796 wd 0.0500 time 0.2400 (0.2559) data time 0.0008 (0.0155) model time 0.0000 (0.0000) loss 3.4735 (3.4199) grad_norm 3.9020 (2.4666) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][40/1251] eta 0:05:05 lr 0.000796 wd 0.0500 time 0.2425 (0.2523) data time 0.0007 (0.0120) model time 0.0000 (0.0000) loss 2.2441 (3.3938) grad_norm 1.8924 (2.3796) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][50/1251] eta 0:05:00 lr 0.000796 wd 0.0500 time 0.2483 (0.2506) data time 0.0011 (0.0099) model time 0.0000 (0.0000) loss 3.7466 (3.4939) grad_norm 3.0365 (2.4210) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][60/1251] eta 0:04:56 lr 0.000796 wd 0.0500 time 0.2450 (0.2490) data time 0.0009 (0.0085) model time 0.2441 
(0.2398) loss 3.5982 (3.4961) grad_norm 2.2933 (2.3684) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][70/1251] eta 0:04:52 lr 0.000796 wd 0.0500 time 0.2439 (0.2480) data time 0.0007 (0.0074) model time 0.2431 (0.2405) loss 2.4866 (3.4590) grad_norm 1.9560 (2.3381) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][80/1251] eta 0:04:49 lr 0.000796 wd 0.0500 time 0.2454 (0.2473) data time 0.0009 (0.0066) model time 0.2446 (0.2406) loss 3.6931 (3.4391) grad_norm 1.9526 (2.3484) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][90/1251] eta 0:04:46 lr 0.000796 wd 0.0500 time 0.2615 (0.2469) data time 0.0009 (0.0060) model time 0.2606 (0.2411) loss 2.8677 (3.4145) grad_norm 2.2664 (2.3289) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][100/1251] eta 0:04:43 lr 0.000796 wd 0.0500 time 0.2375 (0.2466) data time 0.0009 (0.0057) model time 0.2367 (0.2410) loss 3.1320 (3.4375) grad_norm 1.5833 (2.3028) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][110/1251] eta 0:04:40 lr 0.000796 wd 0.0500 time 0.2503 (0.2463) data time 0.0011 (0.0053) model time 0.2492 (0.2412) loss 3.4166 (3.4270) grad_norm 3.1732 (2.3089) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][120/1251] eta 0:04:38 lr 0.000796 wd 0.0500 time 0.2493 (0.2459) data time 0.0007 (0.0050) model time 0.2486 (0.2411) loss 2.9819 (3.4376) grad_norm 2.9008 (2.3406) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][130/1251] 
eta 0:04:35 lr 0.000795 wd 0.0500 time 0.2471 (0.2453) data time 0.0011 (0.0047) model time 0.2460 (0.2407) loss 3.4142 (3.4405) grad_norm 2.0114 (2.3262) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][140/1251] eta 0:04:32 lr 0.000795 wd 0.0500 time 0.2410 (0.2451) data time 0.0010 (0.0044) model time 0.2400 (0.2407) loss 3.4650 (3.4380) grad_norm 1.7645 (2.3172) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][150/1251] eta 0:04:32 lr 0.000795 wd 0.0500 time 0.4572 (0.2477) data time 0.0009 (0.0042) model time 0.4563 (0.2450) loss 3.4985 (3.4372) grad_norm 1.4667 (2.3325) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][160/1251] eta 0:04:30 lr 0.000795 wd 0.0500 time 0.2460 (0.2476) data time 0.0007 (0.0040) model time 0.2453 (0.2450) loss 3.5310 (3.4449) grad_norm 1.9476 (2.3453) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][170/1251] eta 0:04:27 lr 0.000795 wd 0.0500 time 0.2420 (0.2475) data time 0.0009 (0.0038) model time 0.2412 (0.2449) loss 3.6862 (3.4448) grad_norm 1.4806 (2.3368) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][180/1251] eta 0:04:24 lr 0.000795 wd 0.0500 time 0.2408 (0.2472) data time 0.0007 (0.0037) model time 0.2401 (0.2446) loss 4.2209 (3.4394) grad_norm 1.7537 (2.3127) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][190/1251] eta 0:04:22 lr 0.000795 wd 0.0500 time 0.2416 (0.2470) data time 0.0008 (0.0035) model time 0.2408 (0.2445) loss 2.7369 (3.4238) grad_norm 1.9395 (2.3013) loss_scale 4096.0000 (4096.0000) mem 7379MB 
[2024-08-26 11:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][200/1251] eta 0:04:19 lr 0.000795 wd 0.0500 time 0.2346 (0.2467) data time 0.0008 (0.0034) model time 0.2338 (0.2442) loss 3.4905 (3.4220) grad_norm 2.6095 (2.2962) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][210/1251] eta 0:04:16 lr 0.000795 wd 0.0500 time 0.2472 (0.2465) data time 0.0007 (0.0033) model time 0.2465 (0.2440) loss 2.6109 (3.4152) grad_norm 2.4228 (2.2946) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][220/1251] eta 0:04:13 lr 0.000795 wd 0.0500 time 0.2402 (0.2463) data time 0.0009 (0.0032) model time 0.2392 (0.2438) loss 4.1510 (3.4280) grad_norm 1.8430 (2.2789) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][230/1251] eta 0:04:11 lr 0.000795 wd 0.0500 time 0.2407 (0.2462) data time 0.0009 (0.0031) model time 0.2398 (0.2437) loss 3.8059 (3.4267) grad_norm 1.8560 (2.2681) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][240/1251] eta 0:04:08 lr 0.000795 wd 0.0500 time 0.2315 (0.2459) data time 0.0007 (0.0030) model time 0.2308 (0.2435) loss 3.2963 (3.4146) grad_norm 2.1730 (2.2784) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][250/1251] eta 0:04:06 lr 0.000795 wd 0.0500 time 0.2432 (0.2465) data time 0.0007 (0.0030) model time 0.2425 (0.2443) loss 4.4381 (3.4167) grad_norm 2.7161 (2.2744) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][260/1251] eta 0:04:04 lr 0.000795 wd 0.0500 time 0.2419 (0.2465) data time 0.0011 (0.0029) model time 0.2408 
(0.2443) loss 3.7197 (3.4098) grad_norm 1.7449 (2.2739) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][270/1251] eta 0:04:01 lr 0.000795 wd 0.0500 time 0.2388 (0.2463) data time 0.0010 (0.0028) model time 0.2378 (0.2441) loss 3.2575 (3.4182) grad_norm 2.2224 (2.2641) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][280/1251] eta 0:03:58 lr 0.000795 wd 0.0500 time 0.2439 (0.2461) data time 0.0008 (0.0028) model time 0.2431 (0.2440) loss 2.8611 (3.4079) grad_norm 1.5185 (2.2513) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][290/1251] eta 0:03:56 lr 0.000795 wd 0.0500 time 0.2366 (0.2460) data time 0.0010 (0.0027) model time 0.2356 (0.2439) loss 3.0209 (3.4111) grad_norm 1.8554 (2.2461) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][300/1251] eta 0:03:53 lr 0.000795 wd 0.0500 time 0.2587 (0.2460) data time 0.0007 (0.0026) model time 0.2580 (0.2439) loss 3.6110 (3.4120) grad_norm 2.9255 (2.2458) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][310/1251] eta 0:03:51 lr 0.000795 wd 0.0500 time 0.2452 (0.2459) data time 0.0010 (0.0026) model time 0.2443 (0.2438) loss 3.0889 (3.4123) grad_norm 1.5263 (2.2416) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][320/1251] eta 0:03:49 lr 0.000795 wd 0.0500 time 0.2498 (0.2464) data time 0.0007 (0.0025) model time 0.2490 (0.2445) loss 4.0797 (3.4156) grad_norm 2.4241 (2.2331) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[104/300][330/1251] eta 0:03:47 lr 0.000795 wd 0.0500 time 0.2451 (0.2469) data time 0.0007 (0.0025) model time 0.2443 (0.2451) loss 3.3351 (3.4255) grad_norm 1.7546 (2.2281) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][340/1251] eta 0:03:44 lr 0.000795 wd 0.0500 time 0.2483 (0.2468) data time 0.0008 (0.0024) model time 0.2475 (0.2450) loss 2.2769 (3.4207) grad_norm 2.2420 (2.2273) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][350/1251] eta 0:03:42 lr 0.000795 wd 0.0500 time 0.2446 (0.2466) data time 0.0010 (0.0024) model time 0.2437 (0.2448) loss 2.3593 (3.4223) grad_norm 4.4608 (2.2398) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][360/1251] eta 0:03:40 lr 0.000795 wd 0.0500 time 0.2424 (0.2470) data time 0.0008 (0.0024) model time 0.2415 (0.2453) loss 2.5212 (3.4173) grad_norm 2.1110 (2.2426) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][370/1251] eta 0:03:37 lr 0.000795 wd 0.0500 time 0.2437 (0.2468) data time 0.0007 (0.0023) model time 0.2429 (0.2451) loss 4.0445 (3.4174) grad_norm 1.7547 (2.2304) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][380/1251] eta 0:03:34 lr 0.000795 wd 0.0500 time 0.2394 (0.2467) data time 0.0007 (0.0023) model time 0.2386 (0.2450) loss 4.3392 (3.4132) grad_norm 3.4330 (2.2284) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][390/1251] eta 0:03:32 lr 0.000795 wd 0.0500 time 0.2292 (0.2465) data time 0.0009 (0.0023) model time 0.2283 (0.2448) loss 4.0444 (3.4113) grad_norm 2.1405 (2.2319) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 11:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][400/1251] eta 0:03:29 lr 0.000795 wd 0.0500 time 0.2410 (0.2464) data time 0.0013 (0.0022) model time 0.2398 (0.2447) loss 3.7220 (3.4105) grad_norm 1.3385 (2.2224) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][410/1251] eta 0:03:27 lr 0.000794 wd 0.0500 time 0.2385 (0.2463) data time 0.0012 (0.0022) model time 0.2374 (0.2446) loss 2.7310 (3.4054) grad_norm 3.3816 (2.2180) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][420/1251] eta 0:03:25 lr 0.000794 wd 0.0500 time 0.2459 (0.2467) data time 0.0009 (0.0022) model time 0.2450 (0.2451) loss 2.9842 (3.4041) grad_norm 1.5413 (nan) loss_scale 2048.0000 (4061.9477) mem 7379MB [2024-08-26 11:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][430/1251] eta 0:03:22 lr 0.000794 wd 0.0500 time 0.2418 (0.2466) data time 0.0009 (0.0022) model time 0.2408 (0.2450) loss 2.4084 (3.4037) grad_norm 2.6992 (nan) loss_scale 2048.0000 (4015.2204) mem 7379MB [2024-08-26 11:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][440/1251] eta 0:03:19 lr 0.000794 wd 0.0500 time 0.2446 (0.2465) data time 0.0012 (0.0021) model time 0.2434 (0.2449) loss 3.8640 (3.4069) grad_norm 1.4222 (nan) loss_scale 2048.0000 (3970.6122) mem 7379MB [2024-08-26 11:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][450/1251] eta 0:03:17 lr 0.000794 wd 0.0500 time 0.2388 (0.2468) data time 0.0009 (0.0021) model time 0.2379 (0.2452) loss 3.6113 (3.4111) grad_norm 3.0998 (nan) loss_scale 2048.0000 (3927.9823) mem 7379MB [2024-08-26 11:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][460/1251] eta 0:03:15 lr 0.000794 wd 0.0500 time 0.2424 (0.2467) data time 0.0008 (0.0021) model 
time 0.2416 (0.2451) loss 3.4196 (3.4105) grad_norm 2.5582 (nan) loss_scale 2048.0000 (3887.2017) mem 7379MB [2024-08-26 11:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][470/1251] eta 0:03:12 lr 0.000794 wd 0.0500 time 0.2430 (0.2466) data time 0.0009 (0.0021) model time 0.2421 (0.2450) loss 2.8776 (3.4099) grad_norm 2.5472 (nan) loss_scale 2048.0000 (3848.1529) mem 7379MB [2024-08-26 11:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][480/1251] eta 0:03:10 lr 0.000794 wd 0.0500 time 0.2419 (0.2470) data time 0.0009 (0.0020) model time 0.2409 (0.2454) loss 3.4137 (3.4144) grad_norm 1.6547 (nan) loss_scale 2048.0000 (3810.7277) mem 7379MB [2024-08-26 11:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][490/1251] eta 0:03:08 lr 0.000794 wd 0.0500 time 0.2559 (0.2478) data time 0.0007 (0.0020) model time 0.2552 (0.2463) loss 2.1126 (3.4118) grad_norm 2.0738 (nan) loss_scale 2048.0000 (3774.8269) mem 7379MB [2024-08-26 11:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][500/1251] eta 0:03:06 lr 0.000794 wd 0.0500 time 0.2406 (0.2482) data time 0.0011 (0.0020) model time 0.2395 (0.2469) loss 2.3818 (3.4071) grad_norm 1.6523 (nan) loss_scale 2048.0000 (3740.3593) mem 7379MB [2024-08-26 11:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][510/1251] eta 0:03:03 lr 0.000794 wd 0.0500 time 0.2372 (0.2481) data time 0.0008 (0.0020) model time 0.2365 (0.2467) loss 3.1081 (3.3982) grad_norm 2.1144 (nan) loss_scale 2048.0000 (3707.2407) mem 7379MB [2024-08-26 11:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][520/1251] eta 0:03:01 lr 0.000794 wd 0.0500 time 0.2879 (0.2481) data time 0.0009 (0.0020) model time 0.2870 (0.2467) loss 3.1731 (3.4003) grad_norm 2.3186 (nan) loss_scale 2048.0000 (3675.3935) mem 7379MB [2024-08-26 11:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][530/1251] eta 
0:02:58 lr 0.000794 wd 0.0500 time 0.2348 (0.2480) data time 0.0008 (0.0020) model time 0.2340 (0.2466) loss 4.2349 (3.4009) grad_norm 2.0682 (nan) loss_scale 2048.0000 (3644.7458) mem 7379MB [2024-08-26 11:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][540/1251] eta 0:02:56 lr 0.000794 wd 0.0500 time 0.2398 (0.2478) data time 0.0008 (0.0020) model time 0.2390 (0.2464) loss 3.7545 (3.3976) grad_norm 1.5349 (nan) loss_scale 2048.0000 (3615.2311) mem 7379MB [2024-08-26 11:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][550/1251] eta 0:02:53 lr 0.000794 wd 0.0500 time 0.2370 (0.2477) data time 0.0013 (0.0019) model time 0.2357 (0.2463) loss 3.6367 (3.4034) grad_norm 3.4788 (nan) loss_scale 2048.0000 (3586.7877) mem 7379MB [2024-08-26 11:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][560/1251] eta 0:02:51 lr 0.000794 wd 0.0500 time 0.2424 (0.2476) data time 0.0008 (0.0019) model time 0.2416 (0.2462) loss 2.6505 (3.3968) grad_norm 1.9243 (nan) loss_scale 2048.0000 (3559.3583) mem 7379MB [2024-08-26 11:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][570/1251] eta 0:02:48 lr 0.000794 wd 0.0500 time 0.2325 (0.2475) data time 0.0010 (0.0019) model time 0.2315 (0.2461) loss 3.4963 (3.3959) grad_norm 1.8445 (nan) loss_scale 2048.0000 (3532.8897) mem 7379MB [2024-08-26 11:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][580/1251] eta 0:02:46 lr 0.000794 wd 0.0500 time 0.2437 (0.2475) data time 0.0008 (0.0019) model time 0.2429 (0.2461) loss 3.6319 (3.3988) grad_norm 1.9696 (nan) loss_scale 2048.0000 (3507.3322) mem 7379MB [2024-08-26 11:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][590/1251] eta 0:02:43 lr 0.000794 wd 0.0500 time 0.2428 (0.2474) data time 0.0009 (0.0019) model time 0.2419 (0.2459) loss 3.4739 (3.3998) grad_norm 2.5135 (nan) loss_scale 2048.0000 (3482.6396) mem 7379MB [2024-08-26 11:17:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][600/1251] eta 0:02:40 lr 0.000794 wd 0.0500 time 0.2429 (0.2473) data time 0.0008 (0.0019) model time 0.2421 (0.2458) loss 3.3773 (3.4003) grad_norm 1.5534 (nan) loss_scale 2048.0000 (3458.7687) mem 7379MB [2024-08-26 11:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][610/1251] eta 0:02:38 lr 0.000794 wd 0.0500 time 0.2425 (0.2472) data time 0.0008 (0.0019) model time 0.2418 (0.2458) loss 3.8013 (3.3991) grad_norm 1.7280 (nan) loss_scale 2048.0000 (3435.6792) mem 7379MB [2024-08-26 11:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][620/1251] eta 0:02:35 lr 0.000794 wd 0.0500 time 0.2423 (0.2471) data time 0.0008 (0.0019) model time 0.2415 (0.2457) loss 3.8779 (3.3997) grad_norm 1.9178 (nan) loss_scale 2048.0000 (3413.3333) mem 7379MB [2024-08-26 11:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][630/1251] eta 0:02:33 lr 0.000794 wd 0.0500 time 0.2447 (0.2471) data time 0.0007 (0.0018) model time 0.2439 (0.2456) loss 3.8349 (3.4015) grad_norm 1.8801 (nan) loss_scale 2048.0000 (3391.6957) mem 7379MB [2024-08-26 11:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][640/1251] eta 0:02:30 lr 0.000794 wd 0.0500 time 0.2431 (0.2470) data time 0.0008 (0.0018) model time 0.2423 (0.2455) loss 3.1691 (3.4000) grad_norm 1.9248 (nan) loss_scale 2048.0000 (3370.7332) mem 7379MB [2024-08-26 11:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][650/1251] eta 0:02:28 lr 0.000794 wd 0.0500 time 0.2445 (0.2469) data time 0.0012 (0.0018) model time 0.2433 (0.2455) loss 3.4495 (3.3987) grad_norm 1.8859 (nan) loss_scale 2048.0000 (3350.4147) mem 7379MB [2024-08-26 11:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][660/1251] eta 0:02:25 lr 0.000794 wd 0.0500 time 0.2415 (0.2468) data time 0.0010 (0.0018) model time 0.2406 (0.2454) loss 4.0824 (3.3983) 
grad_norm 2.5007 (nan) loss_scale 2048.0000 (3330.7110) mem 7379MB [2024-08-26 11:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][670/1251] eta 0:02:23 lr 0.000794 wd 0.0500 time 0.2373 (0.2467) data time 0.0009 (0.0018) model time 0.2364 (0.2453) loss 3.4332 (3.3983) grad_norm 1.7782 (nan) loss_scale 2048.0000 (3311.5946) mem 7379MB [2024-08-26 11:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][680/1251] eta 0:02:20 lr 0.000794 wd 0.0500 time 0.2412 (0.2467) data time 0.0013 (0.0018) model time 0.2399 (0.2452) loss 4.0067 (3.3987) grad_norm 1.8649 (nan) loss_scale 2048.0000 (3293.0396) mem 7379MB [2024-08-26 11:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][690/1251] eta 0:02:18 lr 0.000793 wd 0.0500 time 0.2406 (0.2466) data time 0.0007 (0.0018) model time 0.2399 (0.2452) loss 2.5727 (3.3965) grad_norm 1.6943 (nan) loss_scale 2048.0000 (3275.0217) mem 7379MB [2024-08-26 11:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][700/1251] eta 0:02:15 lr 0.000793 wd 0.0500 time 0.2451 (0.2465) data time 0.0008 (0.0018) model time 0.2443 (0.2451) loss 3.3031 (3.3912) grad_norm 2.4882 (nan) loss_scale 2048.0000 (3257.5178) mem 7379MB [2024-08-26 11:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][710/1251] eta 0:02:13 lr 0.000793 wd 0.0500 time 0.2368 (0.2465) data time 0.0009 (0.0018) model time 0.2358 (0.2450) loss 4.1202 (3.3925) grad_norm 1.7737 (nan) loss_scale 2048.0000 (3240.5063) mem 7379MB [2024-08-26 11:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][720/1251] eta 0:02:10 lr 0.000793 wd 0.0500 time 0.2411 (0.2464) data time 0.0011 (0.0017) model time 0.2401 (0.2450) loss 2.1716 (3.3891) grad_norm 1.8848 (nan) loss_scale 2048.0000 (3223.9667) mem 7379MB [2024-08-26 11:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][730/1251] eta 0:02:08 lr 0.000793 wd 0.0500 time 0.2408 
(0.2467) data time 0.0009 (0.0017) model time 0.2398 (0.2453) loss 3.5273 (3.3932) grad_norm 1.9245 (nan) loss_scale 2048.0000 (3207.8796) mem 7379MB [2024-08-26 11:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][740/1251] eta 0:02:06 lr 0.000793 wd 0.0500 time 0.2313 (0.2466) data time 0.0009 (0.0017) model time 0.2304 (0.2452) loss 3.5973 (3.3931) grad_norm 1.7726 (nan) loss_scale 2048.0000 (3192.2267) mem 7379MB [2024-08-26 11:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][750/1251] eta 0:02:03 lr 0.000793 wd 0.0500 time 0.2359 (0.2465) data time 0.0009 (0.0017) model time 0.2350 (0.2451) loss 3.7775 (3.3930) grad_norm 2.9435 (nan) loss_scale 2048.0000 (3176.9907) mem 7379MB [2024-08-26 11:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][760/1251] eta 0:02:01 lr 0.000793 wd 0.0500 time 0.2344 (0.2464) data time 0.0012 (0.0017) model time 0.2332 (0.2450) loss 3.6174 (3.3936) grad_norm 1.8653 (nan) loss_scale 2048.0000 (3162.1551) mem 7379MB [2024-08-26 11:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][770/1251] eta 0:01:58 lr 0.000793 wd 0.0500 time 0.2397 (0.2464) data time 0.0010 (0.0017) model time 0.2387 (0.2450) loss 3.2294 (3.3902) grad_norm 1.9955 (nan) loss_scale 2048.0000 (3147.7043) mem 7379MB [2024-08-26 11:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][780/1251] eta 0:01:56 lr 0.000793 wd 0.0500 time 0.2416 (0.2463) data time 0.0010 (0.0017) model time 0.2406 (0.2449) loss 3.2063 (3.3910) grad_norm 2.9160 (nan) loss_scale 2048.0000 (3133.6236) mem 7379MB [2024-08-26 11:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][790/1251] eta 0:01:53 lr 0.000793 wd 0.0500 time 0.2408 (0.2463) data time 0.0007 (0.0017) model time 0.2401 (0.2448) loss 2.7558 (3.3877) grad_norm 1.7061 (nan) loss_scale 2048.0000 (3119.8989) mem 7379MB [2024-08-26 11:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [104/300][800/1251] eta 0:01:51 lr 0.000793 wd 0.0500 time 0.2395 (0.2462) data time 0.0012 (0.0017) model time 0.2384 (0.2448) loss 3.1994 (3.3875) grad_norm 2.0528 (nan) loss_scale 2048.0000 (3106.5169) mem 7379MB [2024-08-26 11:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][810/1251] eta 0:01:48 lr 0.000793 wd 0.0500 time 0.2416 (0.2462) data time 0.0009 (0.0017) model time 0.2407 (0.2448) loss 2.7807 (3.3895) grad_norm 1.6193 (nan) loss_scale 2048.0000 (3093.4649) mem 7379MB [2024-08-26 11:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][820/1251] eta 0:01:46 lr 0.000793 wd 0.0500 time 0.2406 (0.2461) data time 0.0008 (0.0017) model time 0.2398 (0.2447) loss 3.5791 (3.3890) grad_norm 2.6582 (nan) loss_scale 2048.0000 (3080.7308) mem 7379MB [2024-08-26 11:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][830/1251] eta 0:01:43 lr 0.000793 wd 0.0500 time 0.2341 (0.2461) data time 0.0009 (0.0017) model time 0.2332 (0.2447) loss 3.7770 (3.3904) grad_norm 2.1514 (nan) loss_scale 2048.0000 (3068.3032) mem 7379MB [2024-08-26 11:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][840/1251] eta 0:01:41 lr 0.000793 wd 0.0500 time 0.2442 (0.2461) data time 0.0012 (0.0016) model time 0.2430 (0.2447) loss 3.9990 (3.3931) grad_norm 3.0899 (nan) loss_scale 2048.0000 (3056.1712) mem 7379MB [2024-08-26 11:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][850/1251] eta 0:01:38 lr 0.000793 wd 0.0500 time 0.2411 (0.2460) data time 0.0007 (0.0016) model time 0.2404 (0.2446) loss 4.4191 (3.3974) grad_norm 2.3325 (nan) loss_scale 2048.0000 (3044.3243) mem 7379MB [2024-08-26 11:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][860/1251] eta 0:01:36 lr 0.000793 wd 0.0500 time 0.2378 (0.2460) data time 0.0011 (0.0016) model time 0.2367 (0.2446) loss 3.0083 (3.3987) grad_norm 1.8186 (nan) loss_scale 2048.0000 
(3032.7526) mem 7379MB [2024-08-26 11:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][870/1251] eta 0:01:33 lr 0.000793 wd 0.0500 time 0.2389 (0.2459) data time 0.0008 (0.0016) model time 0.2381 (0.2445) loss 2.7082 (3.3976) grad_norm 1.6580 (nan) loss_scale 2048.0000 (3021.4466) mem 7379MB [2024-08-26 11:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][880/1251] eta 0:01:31 lr 0.000793 wd 0.0500 time 0.4387 (0.2461) data time 0.0007 (0.0016) model time 0.4380 (0.2447) loss 3.8774 (3.3989) grad_norm 2.4764 (nan) loss_scale 2048.0000 (3010.3973) mem 7379MB [2024-08-26 11:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][890/1251] eta 0:01:28 lr 0.000793 wd 0.0500 time 0.2421 (0.2461) data time 0.0009 (0.0016) model time 0.2412 (0.2447) loss 4.1168 (3.4007) grad_norm 2.2674 (nan) loss_scale 2048.0000 (2999.5960) mem 7379MB [2024-08-26 11:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][900/1251] eta 0:01:26 lr 0.000793 wd 0.0500 time 0.2414 (0.2460) data time 0.0012 (0.0016) model time 0.2402 (0.2447) loss 2.7979 (3.4005) grad_norm 2.3159 (nan) loss_scale 2048.0000 (2989.0344) mem 7379MB [2024-08-26 11:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][910/1251] eta 0:01:23 lr 0.000793 wd 0.0500 time 0.2447 (0.2460) data time 0.0007 (0.0016) model time 0.2440 (0.2446) loss 2.5751 (3.4012) grad_norm 1.8244 (nan) loss_scale 2048.0000 (2978.7047) mem 7379MB [2024-08-26 11:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][920/1251] eta 0:01:21 lr 0.000793 wd 0.0500 time 0.2505 (0.2460) data time 0.0010 (0.0016) model time 0.2495 (0.2446) loss 2.5908 (3.4002) grad_norm 1.9773 (nan) loss_scale 2048.0000 (2968.5993) mem 7379MB [2024-08-26 11:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][930/1251] eta 0:01:18 lr 0.000793 wd 0.0500 time 0.2500 (0.2460) data time 0.0007 (0.0016) model time 
0.2493 (0.2446) loss 2.8835 (3.4008) grad_norm 2.5141 (nan) loss_scale 2048.0000 (2958.7111) mem 7379MB [2024-08-26 11:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][940/1251] eta 0:01:16 lr 0.000793 wd 0.0500 time 0.2366 (0.2459) data time 0.0011 (0.0016) model time 0.2356 (0.2445) loss 3.1477 (3.4032) grad_norm 1.8246 (nan) loss_scale 2048.0000 (2949.0329) mem 7379MB [2024-08-26 11:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][950/1251] eta 0:01:14 lr 0.000793 wd 0.0500 time 0.2410 (0.2459) data time 0.0008 (0.0016) model time 0.2402 (0.2445) loss 2.8971 (3.4008) grad_norm 2.2201 (nan) loss_scale 2048.0000 (2939.5584) mem 7379MB [2024-08-26 11:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][960/1251] eta 0:01:11 lr 0.000792 wd 0.0500 time 0.2443 (0.2458) data time 0.0007 (0.0016) model time 0.2436 (0.2444) loss 3.3375 (3.4007) grad_norm 1.5348 (nan) loss_scale 2048.0000 (2930.2810) mem 7379MB [2024-08-26 11:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][970/1251] eta 0:01:09 lr 0.000792 wd 0.0500 time 0.2449 (0.2458) data time 0.0009 (0.0016) model time 0.2439 (0.2444) loss 3.4013 (3.4022) grad_norm 1.5667 (nan) loss_scale 2048.0000 (2921.1946) mem 7379MB [2024-08-26 11:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][980/1251] eta 0:01:06 lr 0.000792 wd 0.0500 time 0.2497 (0.2457) data time 0.0011 (0.0016) model time 0.2486 (0.2444) loss 3.8152 (3.4028) grad_norm 1.6728 (nan) loss_scale 2048.0000 (2912.2936) mem 7379MB [2024-08-26 11:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][990/1251] eta 0:01:04 lr 0.000792 wd 0.0500 time 0.2370 (0.2460) data time 0.0009 (0.0015) model time 0.2361 (0.2446) loss 3.7746 (3.4005) grad_norm 3.4786 (nan) loss_scale 2048.0000 (2903.5721) mem 7379MB [2024-08-26 11:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1000/1251] eta 0:01:01 
lr 0.000792 wd 0.0500 time 0.2380 (0.2460) data time 0.0009 (0.0016) model time 0.2370 (0.2446) loss 3.7170 (3.4011) grad_norm 2.1849 (nan) loss_scale 2048.0000 (2895.0250) mem 7379MB [2024-08-26 11:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1010/1251] eta 0:00:59 lr 0.000792 wd 0.0500 time 0.2378 (0.2459) data time 0.0009 (0.0016) model time 0.2369 (0.2446) loss 3.8761 (3.4024) grad_norm 2.5659 (nan) loss_scale 2048.0000 (2886.6469) mem 7379MB [2024-08-26 11:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1020/1251] eta 0:00:56 lr 0.000792 wd 0.0500 time 0.2403 (0.2459) data time 0.0009 (0.0015) model time 0.2393 (0.2445) loss 3.7732 (3.4007) grad_norm 1.5920 (nan) loss_scale 2048.0000 (2878.4329) mem 7379MB [2024-08-26 11:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1030/1251] eta 0:00:54 lr 0.000792 wd 0.0500 time 0.2402 (0.2463) data time 0.0009 (0.0015) model time 0.2393 (0.2450) loss 3.1415 (3.3988) grad_norm 2.2253 (nan) loss_scale 2048.0000 (2870.3783) mem 7379MB [2024-08-26 11:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1040/1251] eta 0:00:51 lr 0.000792 wd 0.0500 time 0.2381 (0.2462) data time 0.0011 (0.0015) model time 0.2370 (0.2449) loss 3.6044 (3.3967) grad_norm 2.0117 (nan) loss_scale 2048.0000 (2862.4784) mem 7379MB [2024-08-26 11:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1050/1251] eta 0:00:49 lr 0.000792 wd 0.0500 time 0.2349 (0.2462) data time 0.0009 (0.0015) model time 0.2339 (0.2449) loss 3.7500 (3.3971) grad_norm 2.3140 (nan) loss_scale 2048.0000 (2854.7288) mem 7379MB [2024-08-26 11:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1060/1251] eta 0:00:47 lr 0.000792 wd 0.0500 time 0.2395 (0.2462) data time 0.0007 (0.0015) model time 0.2387 (0.2448) loss 3.3375 (3.3992) grad_norm 1.5672 (nan) loss_scale 2048.0000 (2847.1254) mem 7379MB [2024-08-26 11:19:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1070/1251] eta 0:00:44 lr 0.000792 wd 0.0500 time 0.2399 (0.2462) data time 0.0009 (0.0015) model time 0.2390 (0.2448) loss 2.9443 (3.3976) grad_norm 1.8191 (nan) loss_scale 2048.0000 (2839.6639) mem 7379MB [2024-08-26 11:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1080/1251] eta 0:00:42 lr 0.000792 wd 0.0500 time 0.4414 (0.2465) data time 0.0008 (0.0015) model time 0.4405 (0.2452) loss 2.7721 (3.3976) grad_norm 2.5253 (nan) loss_scale 2048.0000 (2832.3404) mem 7379MB [2024-08-26 11:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1090/1251] eta 0:00:39 lr 0.000792 wd 0.0500 time 0.2405 (0.2465) data time 0.0008 (0.0015) model time 0.2397 (0.2452) loss 3.6336 (3.3955) grad_norm 1.7200 (nan) loss_scale 2048.0000 (2825.1512) mem 7379MB [2024-08-26 11:19:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1100/1251] eta 0:00:37 lr 0.000792 wd 0.0500 time 0.2436 (0.2464) data time 0.0007 (0.0015) model time 0.2429 (0.2451) loss 2.5958 (3.3938) grad_norm 1.9379 (nan) loss_scale 2048.0000 (2818.0926) mem 7379MB [2024-08-26 11:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1110/1251] eta 0:00:34 lr 0.000792 wd 0.0500 time 0.2383 (0.2464) data time 0.0010 (0.0015) model time 0.2373 (0.2451) loss 3.8098 (3.3935) grad_norm 1.9971 (nan) loss_scale 2048.0000 (2811.1611) mem 7379MB [2024-08-26 11:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1120/1251] eta 0:00:32 lr 0.000792 wd 0.0500 time 0.2442 (0.2464) data time 0.0010 (0.0015) model time 0.2432 (0.2451) loss 2.9142 (3.3927) grad_norm 1.6930 (nan) loss_scale 2048.0000 (2804.3533) mem 7379MB [2024-08-26 11:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1130/1251] eta 0:00:29 lr 0.000792 wd 0.0500 time 0.2402 (0.2464) data time 0.0009 (0.0015) model time 0.2393 (0.2450) loss 4.2027 (3.3964) 
grad_norm 3.1770 (nan) loss_scale 2048.0000 (2797.6658) mem 7379MB [2024-08-26 11:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1140/1251] eta 0:00:27 lr 0.000792 wd 0.0500 time 0.2424 (0.2463) data time 0.0010 (0.0015) model time 0.2413 (0.2450) loss 4.0554 (3.3992) grad_norm 2.1040 (nan) loss_scale 2048.0000 (2791.0955) mem 7379MB [2024-08-26 11:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1150/1251] eta 0:00:24 lr 0.000792 wd 0.0500 time 0.2378 (0.2463) data time 0.0010 (0.0015) model time 0.2368 (0.2450) loss 3.1156 (3.3976) grad_norm 1.9241 (nan) loss_scale 2048.0000 (2784.6394) mem 7379MB [2024-08-26 11:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1160/1251] eta 0:00:22 lr 0.000792 wd 0.0500 time 0.2400 (0.2462) data time 0.0009 (0.0015) model time 0.2391 (0.2449) loss 3.3957 (3.3970) grad_norm 2.4813 (nan) loss_scale 2048.0000 (2778.2946) mem 7379MB [2024-08-26 11:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1170/1251] eta 0:00:19 lr 0.000792 wd 0.0500 time 0.2403 (0.2462) data time 0.0013 (0.0015) model time 0.2391 (0.2449) loss 3.2584 (3.3962) grad_norm 1.5389 (nan) loss_scale 2048.0000 (2772.0581) mem 7379MB [2024-08-26 11:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1180/1251] eta 0:00:17 lr 0.000792 wd 0.0500 time 0.2440 (0.2463) data time 0.0010 (0.0015) model time 0.2429 (0.2450) loss 3.1195 (3.3982) grad_norm 3.0594 (nan) loss_scale 2048.0000 (2765.9272) mem 7379MB [2024-08-26 11:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1190/1251] eta 0:00:15 lr 0.000792 wd 0.0500 time 0.2413 (0.2463) data time 0.0010 (0.0015) model time 0.2403 (0.2450) loss 3.8915 (3.3970) grad_norm 1.6964 (nan) loss_scale 2048.0000 (2759.8992) mem 7379MB [2024-08-26 11:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1200/1251] eta 0:00:12 lr 0.000792 wd 0.0500 time 
0.2437 (0.2463) data time 0.0011 (0.0015) model time 0.2426 (0.2450) loss 3.2792 (3.3991) grad_norm 1.7585 (nan) loss_scale 2048.0000 (2753.9717) mem 7379MB [2024-08-26 11:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1210/1251] eta 0:00:10 lr 0.000792 wd 0.0500 time 0.2387 (0.2462) data time 0.0008 (0.0015) model time 0.2379 (0.2449) loss 3.2415 (3.3970) grad_norm 1.6299 (nan) loss_scale 2048.0000 (2748.1420) mem 7379MB [2024-08-26 11:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1220/1251] eta 0:00:07 lr 0.000792 wd 0.0500 time 0.2347 (0.2462) data time 0.0009 (0.0015) model time 0.2337 (0.2449) loss 3.6794 (3.3958) grad_norm 7.5752 (nan) loss_scale 2048.0000 (2742.4079) mem 7379MB [2024-08-26 11:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1230/1251] eta 0:00:05 lr 0.000792 wd 0.0500 time 0.2345 (0.2462) data time 0.0010 (0.0015) model time 0.2335 (0.2449) loss 3.1929 (3.3955) grad_norm 2.2489 (nan) loss_scale 2048.0000 (2736.7669) mem 7379MB [2024-08-26 11:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1240/1251] eta 0:00:02 lr 0.000791 wd 0.0500 time 0.2233 (0.2461) data time 0.0007 (0.0015) model time 0.2226 (0.2448) loss 3.6566 (3.3956) grad_norm 2.6769 (nan) loss_scale 2048.0000 (2731.2168) mem 7379MB [2024-08-26 11:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [104/300][1250/1251] eta 0:00:00 lr 0.000791 wd 0.0500 time 0.2276 (0.2461) data time 0.0005 (0.0015) model time 0.2272 (0.2448) loss 3.6182 (3.3968) grad_norm 3.6118 (nan) loss_scale 2048.0000 (2725.7554) mem 7379MB [2024-08-26 11:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 104 training takes 0:05:07 [2024-08-26 11:20:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 11:20:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 11:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.449 (0.449) Loss 0.5522 (0.5522) Acc@1 89.746 (89.746) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-26 11:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.113) Loss 0.8096 (0.8085) Acc@1 83.496 (82.244) Acc@5 95.898 (96.325) Mem 7379MB [2024-08-26 11:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.096) Loss 1.1895 (0.8276) Acc@1 72.266 (81.515) Acc@5 92.773 (96.303) Mem 7379MB [2024-08-26 11:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.090) Loss 1.4062 (0.9402) Acc@1 66.602 (79.098) Acc@5 88.086 (94.903) Mem 7379MB [2024-08-26 11:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3467 (1.0065) Acc@1 68.848 (77.622) Acc@5 90.430 (94.119) Mem 7379MB [2024-08-26 11:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.342 Acc@5 94.074 [2024-08-26 11:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.3% [2024-08-26 11:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.778 (0.778) Loss 0.4373 (0.4373) Acc@1 91.992 (91.992) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 11:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.145) Loss 0.7041 (0.6851) Acc@1 86.230 (85.218) Acc@5 96.387 (97.061) Mem 7379MB [2024-08-26 11:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.113) Loss 0.9819 (0.7085) Acc@1 76.562 (84.208) Acc@5 94.043 (97.028) Mem 7379MB [2024-08-26 11:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.102) Loss 1.2578 (0.8066) Acc@1 67.969 (81.883) Acc@5 
90.723 (95.854) Mem 7379MB [2024-08-26 11:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.1201 (0.8574) Acc@1 72.266 (80.483) Acc@5 93.359 (95.365) Mem 7379MB [2024-08-26 11:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.038 Acc@5 95.314 [2024-08-26 11:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.0% [2024-08-26 11:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.04% [2024-08-26 11:20:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 11:20:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 11:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][0/1251] eta 0:14:12 lr 0.000791 wd 0.0500 time 0.6815 (0.6815) data time 0.4593 (0.4593) model time 0.0000 (0.0000) loss 2.3509 (2.3509) grad_norm 2.0330 (2.0330) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][10/1251] eta 0:06:08 lr 0.000791 wd 0.0500 time 0.2409 (0.2968) data time 0.0007 (0.0426) model time 0.0000 (0.0000) loss 3.6523 (3.1305) grad_norm 2.4919 (2.2113) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][20/1251] eta 0:05:46 lr 0.000791 wd 0.0500 time 0.2479 (0.2818) data time 0.0010 (0.0228) model time 0.0000 (0.0000) loss 3.1882 (3.1830) grad_norm 1.9785 (2.2639) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][30/1251] eta 0:05:35 lr 0.000791 wd 0.0500 time 0.2387 (0.2747) data time 0.0007 (0.0158) model time 0.0000 (0.0000) loss 2.9746 (3.1898) grad_norm 2.0831 (2.2122) 
loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][40/1251] eta 0:05:23 lr 0.000791 wd 0.0500 time 0.2426 (0.2675) data time 0.0009 (0.0123) model time 0.0000 (0.0000) loss 4.2207 (3.2528) grad_norm 1.6336 (2.0880) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][50/1251] eta 0:05:15 lr 0.000791 wd 0.0500 time 0.2355 (0.2630) data time 0.0009 (0.0104) model time 0.0000 (0.0000) loss 3.9328 (3.2729) grad_norm 1.9343 (2.0576) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][60/1251] eta 0:05:09 lr 0.000791 wd 0.0500 time 0.2377 (0.2597) data time 0.0009 (0.0089) model time 0.2369 (0.2412) loss 3.0225 (3.3298) grad_norm 2.1033 (2.0773) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][70/1251] eta 0:05:03 lr 0.000791 wd 0.0500 time 0.2413 (0.2571) data time 0.0008 (0.0078) model time 0.2406 (0.2405) loss 2.0940 (3.3332) grad_norm 3.3924 (2.1452) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][80/1251] eta 0:04:58 lr 0.000791 wd 0.0500 time 0.2354 (0.2552) data time 0.0012 (0.0071) model time 0.2342 (0.2404) loss 3.3021 (3.3142) grad_norm 2.2739 (2.1763) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][90/1251] eta 0:04:57 lr 0.000791 wd 0.0500 time 0.2441 (0.2560) data time 0.0009 (0.0065) model time 0.2431 (0.2457) loss 3.3711 (3.2941) grad_norm 1.8403 (2.1846) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][100/1251] eta 0:04:57 lr 0.000791 wd 0.0500 time 0.2387 (0.2588) data 
time 0.0008 (0.0059) model time 0.2379 (0.2532) loss 3.2529 (3.3020) grad_norm 1.5704 (2.1560) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][110/1251] eta 0:04:53 lr 0.000791 wd 0.0500 time 0.2541 (0.2573) data time 0.0007 (0.0055) model time 0.2533 (0.2510) loss 3.0868 (3.2928) grad_norm 2.3848 (2.1465) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][120/1251] eta 0:04:49 lr 0.000791 wd 0.0500 time 0.2390 (0.2559) data time 0.0009 (0.0052) model time 0.2381 (0.2493) loss 2.7865 (3.2742) grad_norm 1.7867 (2.1329) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][130/1251] eta 0:04:45 lr 0.000791 wd 0.0500 time 0.2451 (0.2548) data time 0.0008 (0.0049) model time 0.2443 (0.2482) loss 3.2042 (3.2793) grad_norm 1.4552 (2.1191) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][140/1251] eta 0:04:42 lr 0.000791 wd 0.0500 time 0.2359 (0.2539) data time 0.0011 (0.0046) model time 0.2348 (0.2473) loss 3.4246 (3.2949) grad_norm 2.2916 (2.1324) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][150/1251] eta 0:04:38 lr 0.000791 wd 0.0500 time 0.2431 (0.2530) data time 0.0009 (0.0044) model time 0.2422 (0.2465) loss 3.5981 (3.2941) grad_norm 1.5836 (2.1275) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][160/1251] eta 0:04:35 lr 0.000791 wd 0.0500 time 0.2435 (0.2523) data time 0.0009 (0.0042) model time 0.2426 (0.2461) loss 2.7818 (3.2791) grad_norm 3.0298 (2.1200) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:08 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [105/300][170/1251] eta 0:04:32 lr 0.000791 wd 0.0500 time 0.2371 (0.2517) data time 0.0011 (0.0040) model time 0.2361 (0.2457) loss 2.5178 (3.2727) grad_norm 1.9611 (2.1205) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][180/1251] eta 0:04:29 lr 0.000791 wd 0.0500 time 0.2425 (0.2512) data time 0.0010 (0.0039) model time 0.2415 (0.2453) loss 3.8413 (3.2682) grad_norm 2.4194 (2.1158) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][190/1251] eta 0:04:26 lr 0.000791 wd 0.0500 time 0.2561 (0.2508) data time 0.0007 (0.0037) model time 0.2554 (0.2451) loss 3.9177 (3.2711) grad_norm 1.8494 (2.1022) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][200/1251] eta 0:04:23 lr 0.000791 wd 0.0500 time 0.2512 (0.2505) data time 0.0007 (0.0036) model time 0.2505 (0.2450) loss 3.7059 (3.2859) grad_norm 1.5737 (2.0980) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][210/1251] eta 0:04:20 lr 0.000791 wd 0.0500 time 0.2533 (0.2503) data time 0.0011 (0.0035) model time 0.2522 (0.2449) loss 3.7804 (3.2862) grad_norm 1.4996 (2.1019) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][220/1251] eta 0:04:17 lr 0.000791 wd 0.0500 time 0.2424 (0.2502) data time 0.0011 (0.0034) model time 0.2413 (0.2451) loss 3.3818 (3.2986) grad_norm 2.7735 (2.1202) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][230/1251] eta 0:04:15 lr 0.000791 wd 0.0500 time 0.2453 (0.2500) data time 0.0010 (0.0033) model time 0.2442 (0.2450) loss 3.0402 (3.3106) grad_norm 
1.6350 (2.1194) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][240/1251] eta 0:04:12 lr 0.000791 wd 0.0500 time 0.2385 (0.2496) data time 0.0010 (0.0032) model time 0.2375 (0.2447) loss 3.2894 (3.3072) grad_norm 1.6578 (2.1087) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][250/1251] eta 0:04:09 lr 0.000791 wd 0.0500 time 0.2482 (0.2493) data time 0.0008 (0.0031) model time 0.2474 (0.2446) loss 2.5343 (3.3010) grad_norm 1.8669 (2.1034) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][260/1251] eta 0:04:06 lr 0.000791 wd 0.0500 time 0.2380 (0.2491) data time 0.0009 (0.0031) model time 0.2371 (0.2445) loss 4.1127 (3.3089) grad_norm 3.2769 (2.1189) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][270/1251] eta 0:04:04 lr 0.000790 wd 0.0500 time 0.2403 (0.2495) data time 0.0011 (0.0030) model time 0.2391 (0.2452) loss 3.6338 (3.3063) grad_norm 2.2405 (2.1210) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][280/1251] eta 0:04:02 lr 0.000790 wd 0.0500 time 0.2494 (0.2494) data time 0.0007 (0.0029) model time 0.2487 (0.2451) loss 3.9179 (3.3149) grad_norm 1.6873 (2.1165) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][290/1251] eta 0:04:00 lr 0.000790 wd 0.0500 time 0.2312 (0.2498) data time 0.0008 (0.0029) model time 0.2303 (0.2458) loss 3.4760 (3.3207) grad_norm 2.0046 (2.1246) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][300/1251] eta 0:03:57 lr 0.000790 wd 0.0500 time 
0.2453 (0.2495) data time 0.0012 (0.0028) model time 0.2441 (0.2455) loss 3.2131 (3.3218) grad_norm 2.0410 (2.1259) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][310/1251] eta 0:03:54 lr 0.000790 wd 0.0500 time 0.2440 (0.2493) data time 0.0009 (0.0027) model time 0.2431 (0.2454) loss 4.4162 (3.3251) grad_norm 1.6638 (2.1196) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][320/1251] eta 0:03:51 lr 0.000790 wd 0.0500 time 0.2471 (0.2491) data time 0.0010 (0.0027) model time 0.2461 (0.2452) loss 3.1524 (3.3237) grad_norm 1.6175 (2.1102) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][330/1251] eta 0:03:49 lr 0.000790 wd 0.0500 time 0.2414 (0.2489) data time 0.0010 (0.0026) model time 0.2404 (0.2451) loss 3.3991 (3.3308) grad_norm 2.9475 (2.1164) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][340/1251] eta 0:03:46 lr 0.000790 wd 0.0500 time 0.2532 (0.2487) data time 0.0007 (0.0026) model time 0.2525 (0.2449) loss 2.0082 (3.3221) grad_norm 2.7334 (2.1130) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][350/1251] eta 0:03:43 lr 0.000790 wd 0.0500 time 0.2408 (0.2485) data time 0.0007 (0.0026) model time 0.2401 (0.2448) loss 4.3874 (3.3271) grad_norm 2.2504 (2.1163) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][360/1251] eta 0:03:41 lr 0.000790 wd 0.0500 time 0.2411 (0.2483) data time 0.0010 (0.0025) model time 0.2401 (0.2447) loss 3.8453 (3.3316) grad_norm 2.5185 (2.1136) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:57 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][370/1251] eta 0:03:38 lr 0.000790 wd 0.0500 time 0.2445 (0.2481) data time 0.0013 (0.0025) model time 0.2433 (0.2445) loss 3.2151 (3.3355) grad_norm 2.9806 (2.1193) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:21:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][380/1251] eta 0:03:35 lr 0.000790 wd 0.0500 time 0.2396 (0.2480) data time 0.0009 (0.0024) model time 0.2387 (0.2444) loss 3.5213 (3.3366) grad_norm 1.9197 (2.1181) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][390/1251] eta 0:03:33 lr 0.000790 wd 0.0500 time 0.2401 (0.2478) data time 0.0011 (0.0024) model time 0.2390 (0.2443) loss 3.8680 (3.3328) grad_norm 2.2961 (2.1435) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][400/1251] eta 0:03:31 lr 0.000790 wd 0.0500 time 0.2378 (0.2482) data time 0.0009 (0.0024) model time 0.2369 (0.2448) loss 3.8831 (3.3316) grad_norm 1.5165 (2.1396) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][410/1251] eta 0:03:28 lr 0.000790 wd 0.0500 time 0.2448 (0.2480) data time 0.0009 (0.0023) model time 0.2439 (0.2447) loss 3.4942 (3.3321) grad_norm 2.6140 (2.1339) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][420/1251] eta 0:03:25 lr 0.000790 wd 0.0500 time 0.2423 (0.2478) data time 0.0010 (0.0023) model time 0.2413 (0.2446) loss 4.1342 (3.3363) grad_norm 1.8922 (2.1287) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][430/1251] eta 0:03:23 lr 0.000790 wd 0.0500 time 0.2410 (0.2477) data time 0.0009 (0.0023) model time 0.2400 (0.2445) loss 3.4399 
(3.3409) grad_norm 1.8031 (2.1223) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][440/1251] eta 0:03:20 lr 0.000790 wd 0.0500 time 0.2424 (0.2476) data time 0.0007 (0.0022) model time 0.2417 (0.2445) loss 4.2343 (3.3431) grad_norm 2.3188 (2.1207) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][450/1251] eta 0:03:18 lr 0.000790 wd 0.0500 time 0.2364 (0.2475) data time 0.0013 (0.0022) model time 0.2351 (0.2443) loss 3.3122 (3.3446) grad_norm 1.7082 (2.1134) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][460/1251] eta 0:03:15 lr 0.000790 wd 0.0500 time 0.2434 (0.2473) data time 0.0007 (0.0022) model time 0.2427 (0.2442) loss 2.7031 (3.3463) grad_norm 2.0631 (2.1082) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][470/1251] eta 0:03:13 lr 0.000790 wd 0.0500 time 0.2466 (0.2472) data time 0.0009 (0.0022) model time 0.2457 (0.2441) loss 2.8295 (3.3440) grad_norm 2.4483 (2.1053) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][480/1251] eta 0:03:10 lr 0.000790 wd 0.0500 time 0.2447 (0.2471) data time 0.0010 (0.0021) model time 0.2436 (0.2440) loss 3.6695 (3.3454) grad_norm 1.7605 (2.1072) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][490/1251] eta 0:03:07 lr 0.000790 wd 0.0500 time 0.2379 (0.2470) data time 0.0012 (0.0021) model time 0.2367 (0.2439) loss 4.1526 (3.3445) grad_norm 2.1323 (2.1131) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][500/1251] eta 0:03:05 lr 
0.000790 wd 0.0500 time 0.2472 (0.2469) data time 0.0009 (0.0021) model time 0.2464 (0.2439) loss 2.2888 (3.3451) grad_norm 4.0649 (2.1245) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][510/1251] eta 0:03:02 lr 0.000790 wd 0.0500 time 0.2352 (0.2468) data time 0.0011 (0.0021) model time 0.2341 (0.2438) loss 3.2955 (3.3451) grad_norm 2.1788 (2.1293) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][520/1251] eta 0:03:00 lr 0.000790 wd 0.0500 time 0.2502 (0.2467) data time 0.0009 (0.0021) model time 0.2492 (0.2438) loss 3.5696 (3.3451) grad_norm 1.9497 (2.1272) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][530/1251] eta 0:02:57 lr 0.000790 wd 0.0500 time 0.2357 (0.2466) data time 0.0009 (0.0020) model time 0.2348 (0.2437) loss 3.6377 (3.3484) grad_norm 1.7756 (2.1264) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][540/1251] eta 0:02:55 lr 0.000789 wd 0.0500 time 0.2286 (0.2469) data time 0.0009 (0.0020) model time 0.2277 (0.2440) loss 3.7501 (3.3495) grad_norm 1.9541 (2.1225) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][550/1251] eta 0:02:53 lr 0.000789 wd 0.0500 time 0.4571 (0.2472) data time 0.0007 (0.0020) model time 0.4564 (0.2444) loss 2.4716 (3.3510) grad_norm 2.6313 (2.1262) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][560/1251] eta 0:02:50 lr 0.000789 wd 0.0500 time 0.2345 (0.2471) data time 0.0009 (0.0020) model time 0.2336 (0.2443) loss 4.1226 (3.3540) grad_norm 2.3059 (2.1259) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 
11:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][570/1251] eta 0:02:48 lr 0.000789 wd 0.0500 time 0.2308 (0.2470) data time 0.0011 (0.0020) model time 0.2297 (0.2442) loss 3.8798 (3.3571) grad_norm 2.2555 (2.1250) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][580/1251] eta 0:02:45 lr 0.000789 wd 0.0500 time 0.2413 (0.2469) data time 0.0010 (0.0020) model time 0.2403 (0.2442) loss 3.0722 (3.3573) grad_norm 2.0602 (2.1231) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][590/1251] eta 0:02:43 lr 0.000789 wd 0.0500 time 0.2472 (0.2468) data time 0.0011 (0.0019) model time 0.2461 (0.2441) loss 2.8437 (3.3607) grad_norm 2.2543 (2.1217) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][600/1251] eta 0:02:40 lr 0.000789 wd 0.0500 time 0.2410 (0.2467) data time 0.0011 (0.0019) model time 0.2399 (0.2440) loss 3.5407 (3.3618) grad_norm 1.5163 (2.1153) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][610/1251] eta 0:02:38 lr 0.000789 wd 0.0500 time 0.2437 (0.2466) data time 0.0008 (0.0019) model time 0.2430 (0.2440) loss 3.8669 (3.3661) grad_norm 1.5531 (2.1151) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][620/1251] eta 0:02:35 lr 0.000789 wd 0.0500 time 0.2420 (0.2465) data time 0.0011 (0.0019) model time 0.2409 (0.2439) loss 3.2116 (3.3601) grad_norm 2.1202 (2.1136) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][630/1251] eta 0:02:33 lr 0.000789 wd 0.0500 time 0.4450 (0.2468) data time 0.0010 (0.0019) model time 0.4441 (0.2443) 
loss 3.2820 (3.3614) grad_norm 2.1746 (2.1119) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][640/1251] eta 0:02:30 lr 0.000789 wd 0.0500 time 0.2342 (0.2470) data time 0.0009 (0.0019) model time 0.2333 (0.2445) loss 3.8960 (3.3630) grad_norm 1.8168 (2.1113) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][650/1251] eta 0:02:28 lr 0.000789 wd 0.0500 time 0.2337 (0.2470) data time 0.0011 (0.0019) model time 0.2325 (0.2445) loss 2.7399 (3.3628) grad_norm 1.5266 (2.1091) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][660/1251] eta 0:02:25 lr 0.000789 wd 0.0500 time 0.2473 (0.2469) data time 0.0010 (0.0019) model time 0.2463 (0.2445) loss 3.7622 (3.3650) grad_norm 1.7900 (2.1095) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][670/1251] eta 0:02:23 lr 0.000789 wd 0.0500 time 0.2469 (0.2469) data time 0.0011 (0.0018) model time 0.2458 (0.2445) loss 3.2562 (3.3652) grad_norm 3.1718 (2.1159) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][680/1251] eta 0:02:20 lr 0.000789 wd 0.0500 time 0.2339 (0.2469) data time 0.0009 (0.0018) model time 0.2330 (0.2444) loss 3.8848 (3.3662) grad_norm 2.0659 (2.1172) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][690/1251] eta 0:02:18 lr 0.000789 wd 0.0500 time 0.2375 (0.2468) data time 0.0012 (0.0018) model time 0.2363 (0.2444) loss 3.2690 (3.3674) grad_norm 1.7654 (2.1203) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][700/1251] eta 
0:02:15 lr 0.000789 wd 0.0500 time 0.2414 (0.2467) data time 0.0009 (0.0018) model time 0.2404 (0.2443) loss 3.8742 (3.3691) grad_norm 2.5580 (2.1221) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][710/1251] eta 0:02:13 lr 0.000789 wd 0.0500 time 0.2362 (0.2467) data time 0.0010 (0.0018) model time 0.2352 (0.2443) loss 2.9991 (3.3638) grad_norm 1.5324 (2.1229) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][720/1251] eta 0:02:10 lr 0.000789 wd 0.0500 time 0.2380 (0.2466) data time 0.0011 (0.0018) model time 0.2369 (0.2442) loss 3.1224 (3.3633) grad_norm 1.8978 (2.1201) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][730/1251] eta 0:02:08 lr 0.000789 wd 0.0500 time 0.2401 (0.2465) data time 0.0011 (0.0018) model time 0.2390 (0.2442) loss 3.5162 (3.3647) grad_norm 3.6517 (2.1268) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][740/1251] eta 0:02:05 lr 0.000789 wd 0.0500 time 0.2438 (0.2465) data time 0.0010 (0.0018) model time 0.2428 (0.2441) loss 2.8034 (3.3687) grad_norm 1.7637 (2.1300) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][750/1251] eta 0:02:03 lr 0.000789 wd 0.0500 time 0.2490 (0.2464) data time 0.0010 (0.0018) model time 0.2480 (0.2441) loss 3.6978 (3.3679) grad_norm 2.1502 (2.1335) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][760/1251] eta 0:02:00 lr 0.000789 wd 0.0500 time 0.2402 (0.2464) data time 0.0008 (0.0017) model time 0.2394 (0.2441) loss 3.9931 (3.3707) grad_norm 1.9810 (2.1383) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 11:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][770/1251] eta 0:01:58 lr 0.000789 wd 0.0500 time 0.2376 (0.2463) data time 0.0009 (0.0017) model time 0.2367 (0.2440) loss 4.1666 (3.3723) grad_norm 1.8905 (2.1408) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][780/1251] eta 0:01:55 lr 0.000789 wd 0.0500 time 0.2403 (0.2462) data time 0.0008 (0.0017) model time 0.2394 (0.2439) loss 3.8393 (3.3729) grad_norm 1.6071 (2.1396) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][790/1251] eta 0:01:53 lr 0.000789 wd 0.0500 time 0.2428 (0.2464) data time 0.0010 (0.0017) model time 0.2418 (0.2442) loss 3.6526 (3.3728) grad_norm 1.8358 (2.1369) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][800/1251] eta 0:01:51 lr 0.000789 wd 0.0500 time 0.2423 (0.2464) data time 0.0007 (0.0017) model time 0.2415 (0.2441) loss 3.3902 (3.3758) grad_norm 2.4413 (2.1346) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][810/1251] eta 0:01:48 lr 0.000789 wd 0.0500 time 0.2420 (0.2466) data time 0.0007 (0.0017) model time 0.2413 (0.2443) loss 2.3635 (3.3780) grad_norm 1.5874 (2.1362) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][820/1251] eta 0:01:46 lr 0.000788 wd 0.0500 time 0.2378 (0.2465) data time 0.0011 (0.0017) model time 0.2367 (0.2443) loss 3.8053 (3.3815) grad_norm 1.6746 (2.1385) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][830/1251] eta 0:01:43 lr 0.000788 wd 0.0500 time 0.2407 (0.2467) data time 0.0009 (0.0017) model time 0.2398 
(0.2445) loss 3.7226 (3.3819) grad_norm 2.7085 (2.1377) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][840/1251] eta 0:01:41 lr 0.000788 wd 0.0500 time 0.2392 (0.2467) data time 0.0012 (0.0017) model time 0.2380 (0.2445) loss 3.6118 (3.3844) grad_norm 1.4589 (2.1351) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][850/1251] eta 0:01:38 lr 0.000788 wd 0.0500 time 0.2493 (0.2467) data time 0.0010 (0.0017) model time 0.2483 (0.2445) loss 2.3968 (3.3852) grad_norm 1.6225 (2.1401) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][860/1251] eta 0:01:36 lr 0.000788 wd 0.0500 time 0.2399 (0.2466) data time 0.0008 (0.0017) model time 0.2392 (0.2445) loss 2.7310 (3.3876) grad_norm 1.7686 (2.1491) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][870/1251] eta 0:01:33 lr 0.000788 wd 0.0500 time 0.2384 (0.2466) data time 0.0008 (0.0017) model time 0.2377 (0.2444) loss 2.3637 (3.3854) grad_norm 2.6847 (2.1535) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][880/1251] eta 0:01:31 lr 0.000788 wd 0.0500 time 0.2362 (0.2465) data time 0.0009 (0.0017) model time 0.2353 (0.2444) loss 3.5589 (3.3856) grad_norm 1.8895 (2.1528) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][890/1251] eta 0:01:28 lr 0.000788 wd 0.0500 time 0.2398 (0.2465) data time 0.0009 (0.0017) model time 0.2389 (0.2443) loss 3.3310 (3.3846) grad_norm 1.7059 (2.1593) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[105/300][900/1251] eta 0:01:26 lr 0.000788 wd 0.0500 time 0.2413 (0.2464) data time 0.0010 (0.0017) model time 0.2403 (0.2443) loss 3.2809 (3.3866) grad_norm 1.8985 (2.1582) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][910/1251] eta 0:01:24 lr 0.000788 wd 0.0500 time 0.2356 (0.2464) data time 0.0011 (0.0016) model time 0.2345 (0.2443) loss 3.4437 (3.3879) grad_norm 1.8312 (2.1564) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][920/1251] eta 0:01:21 lr 0.000788 wd 0.0500 time 0.2436 (0.2464) data time 0.0007 (0.0016) model time 0.2429 (0.2442) loss 3.7496 (3.3852) grad_norm 2.1025 (2.1550) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][930/1251] eta 0:01:19 lr 0.000788 wd 0.0500 time 0.2402 (0.2463) data time 0.0010 (0.0016) model time 0.2392 (0.2442) loss 3.8965 (3.3868) grad_norm 3.1424 (2.1597) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][940/1251] eta 0:01:16 lr 0.000788 wd 0.0500 time 0.2351 (0.2465) data time 0.0009 (0.0016) model time 0.2342 (0.2444) loss 3.7625 (3.3868) grad_norm 1.4325 (2.1583) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][950/1251] eta 0:01:14 lr 0.000788 wd 0.0500 time 0.2421 (0.2467) data time 0.0008 (0.0016) model time 0.2413 (0.2446) loss 4.0537 (3.3881) grad_norm 2.0362 (2.1594) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][960/1251] eta 0:01:11 lr 0.000788 wd 0.0500 time 0.2472 (0.2468) data time 0.0007 (0.0016) model time 0.2464 (0.2448) loss 3.5273 (3.3883) grad_norm 1.7218 (2.1583) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 11:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][970/1251] eta 0:01:09 lr 0.000788 wd 0.0500 time 0.2441 (0.2468) data time 0.0009 (0.0016) model time 0.2432 (0.2448) loss 3.3765 (3.3883) grad_norm 2.8449 (2.1625) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][980/1251] eta 0:01:06 lr 0.000788 wd 0.0500 time 0.2403 (0.2468) data time 0.0007 (0.0016) model time 0.2396 (0.2447) loss 3.7705 (3.3868) grad_norm 2.0529 (2.1646) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][990/1251] eta 0:01:04 lr 0.000788 wd 0.0500 time 0.2394 (0.2467) data time 0.0007 (0.0016) model time 0.2387 (0.2447) loss 2.2497 (3.3861) grad_norm 2.0350 (2.1644) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1000/1251] eta 0:01:01 lr 0.000788 wd 0.0500 time 0.2449 (0.2467) data time 0.0007 (0.0016) model time 0.2441 (0.2447) loss 3.6936 (3.3858) grad_norm 2.2059 (2.1646) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1010/1251] eta 0:00:59 lr 0.000788 wd 0.0500 time 0.2447 (0.2467) data time 0.0009 (0.0016) model time 0.2438 (0.2447) loss 3.5357 (3.3858) grad_norm 1.7971 (2.1635) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1020/1251] eta 0:00:57 lr 0.000788 wd 0.0500 time 0.2451 (0.2469) data time 0.0008 (0.0016) model time 0.2444 (0.2449) loss 2.7067 (3.3824) grad_norm 2.5788 (2.1632) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1030/1251] eta 0:00:54 lr 0.000788 wd 0.0500 time 0.2401 (0.2473) data time 0.0009 
(0.0016) model time 0.2392 (0.2453) loss 4.2497 (3.3830) grad_norm 1.6992 (2.1629) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1040/1251] eta 0:00:52 lr 0.000788 wd 0.0500 time 0.2389 (0.2472) data time 0.0007 (0.0016) model time 0.2382 (0.2453) loss 3.0448 (3.3839) grad_norm 1.6789 (2.1619) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1050/1251] eta 0:00:49 lr 0.000788 wd 0.0500 time 0.2412 (0.2472) data time 0.0008 (0.0016) model time 0.2404 (0.2453) loss 2.6604 (3.3817) grad_norm 1.6976 (2.1629) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1060/1251] eta 0:00:47 lr 0.000788 wd 0.0500 time 0.2423 (0.2473) data time 0.0009 (0.0016) model time 0.2414 (0.2454) loss 3.8409 (3.3818) grad_norm 1.7654 (2.1643) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1070/1251] eta 0:00:44 lr 0.000788 wd 0.0500 time 0.4416 (0.2474) data time 0.0009 (0.0016) model time 0.4406 (0.2455) loss 4.0244 (3.3829) grad_norm 2.5065 (2.1641) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1080/1251] eta 0:00:42 lr 0.000788 wd 0.0500 time 0.2440 (0.2474) data time 0.0009 (0.0016) model time 0.2431 (0.2455) loss 3.8527 (3.3837) grad_norm 2.1942 (2.1681) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1090/1251] eta 0:00:39 lr 0.000787 wd 0.0500 time 0.2460 (0.2473) data time 0.0007 (0.0016) model time 0.2453 (0.2454) loss 2.6876 (3.3831) grad_norm 2.5954 (2.1671) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [105/300][1100/1251] eta 0:00:37 lr 0.000787 wd 0.0500 time 0.2393 (0.2473) data time 0.0008 (0.0016) model time 0.2385 (0.2454) loss 3.5026 (3.3827) grad_norm 1.6733 (2.1645) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1110/1251] eta 0:00:34 lr 0.000787 wd 0.0500 time 0.2558 (0.2472) data time 0.0009 (0.0015) model time 0.2549 (0.2454) loss 3.2791 (3.3819) grad_norm 1.5863 (2.1623) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1120/1251] eta 0:00:32 lr 0.000787 wd 0.0500 time 0.2486 (0.2472) data time 0.0011 (0.0015) model time 0.2476 (0.2453) loss 3.5714 (3.3814) grad_norm 2.9349 (2.1624) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1130/1251] eta 0:00:29 lr 0.000787 wd 0.0500 time 0.2476 (0.2472) data time 0.0007 (0.0015) model time 0.2468 (0.2453) loss 3.7019 (3.3845) grad_norm 2.7379 (2.1631) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1140/1251] eta 0:00:27 lr 0.000787 wd 0.0500 time 0.2336 (0.2471) data time 0.0010 (0.0015) model time 0.2326 (0.2452) loss 3.6840 (3.3839) grad_norm 2.4387 (2.1632) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1150/1251] eta 0:00:24 lr 0.000787 wd 0.0500 time 0.2413 (0.2471) data time 0.0011 (0.0015) model time 0.2402 (0.2452) loss 3.9934 (3.3847) grad_norm 1.7212 (2.1617) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1160/1251] eta 0:00:22 lr 0.000787 wd 0.0500 time 0.2477 (0.2470) data time 0.0009 (0.0015) model time 0.2468 (0.2452) loss 4.1244 (3.3843) grad_norm 2.3551 (2.1660) 
loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1170/1251] eta 0:00:20 lr 0.000787 wd 0.0500 time 0.2364 (0.2472) data time 0.0007 (0.0015) model time 0.2357 (0.2453) loss 2.3314 (3.3856) grad_norm 2.0192 (2.1657) loss_scale 4096.0000 (2061.9915) mem 7379MB [2024-08-26 11:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1180/1251] eta 0:00:17 lr 0.000787 wd 0.0500 time 0.2362 (0.2473) data time 0.0010 (0.0015) model time 0.2352 (0.2454) loss 3.6616 (3.3874) grad_norm 2.0598 (2.1646) loss_scale 4096.0000 (2079.2142) mem 7379MB [2024-08-26 11:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1190/1251] eta 0:00:15 lr 0.000787 wd 0.0500 time 0.2485 (0.2472) data time 0.0011 (0.0015) model time 0.2474 (0.2454) loss 3.4776 (3.3875) grad_norm 2.4225 (2.1660) loss_scale 4096.0000 (2096.1478) mem 7379MB [2024-08-26 11:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1200/1251] eta 0:00:12 lr 0.000787 wd 0.0500 time 0.2437 (0.2472) data time 0.0011 (0.0015) model time 0.2427 (0.2454) loss 3.4851 (3.3905) grad_norm 1.9952 (2.1714) loss_scale 4096.0000 (2112.7993) mem 7379MB [2024-08-26 11:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1210/1251] eta 0:00:10 lr 0.000787 wd 0.0500 time 0.2433 (0.2471) data time 0.0009 (0.0015) model time 0.2424 (0.2453) loss 2.2389 (3.3884) grad_norm 2.1839 (2.1718) loss_scale 4096.0000 (2129.1759) mem 7379MB [2024-08-26 11:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1220/1251] eta 0:00:07 lr 0.000787 wd 0.0500 time 0.2382 (0.2471) data time 0.0008 (0.0015) model time 0.2374 (0.2453) loss 4.1080 (3.3882) grad_norm 1.5337 (2.1683) loss_scale 4096.0000 (2145.2842) mem 7379MB [2024-08-26 11:25:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1230/1251] eta 0:00:05 lr 0.000787 wd 0.0500 time 0.2478 
(0.2470) data time 0.0007 (0.0015) model time 0.2471 (0.2452) loss 3.7614 (3.3880) grad_norm 2.5605 (2.1675) loss_scale 4096.0000 (2161.1308) mem 7379MB [2024-08-26 11:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1240/1251] eta 0:00:02 lr 0.000787 wd 0.0500 time 0.2303 (0.2469) data time 0.0007 (0.0015) model time 0.2295 (0.2451) loss 3.0779 (3.3879) grad_norm 3.1700 (2.1662) loss_scale 4096.0000 (2176.7220) mem 7379MB [2024-08-26 11:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [105/300][1250/1251] eta 0:00:00 lr 0.000787 wd 0.0500 time 0.2229 (0.2468) data time 0.0007 (0.0015) model time 0.2222 (0.2450) loss 2.6656 (3.3876) grad_norm 2.2466 (2.1650) loss_scale 4096.0000 (2192.0639) mem 7379MB [2024-08-26 11:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 105 training takes 0:05:08 [2024-08-26 11:25:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 11:25:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 11:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.415 (0.415) Loss 0.5576 (0.5576) Acc@1 89.941 (89.941) Acc@5 97.363 (97.363) Mem 7379MB [2024-08-26 11:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.109) Loss 0.7930 (0.7907) Acc@1 82.227 (82.946) Acc@5 96.094 (96.502) Mem 7379MB [2024-08-26 11:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.094) Loss 1.1113 (0.8112) Acc@1 73.145 (81.817) Acc@5 93.359 (96.401) Mem 7379MB [2024-08-26 11:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.090) Loss 1.4541 (0.9235) Acc@1 64.355 (79.429) Acc@5 88.672 (94.950) Mem 7379MB [2024-08-26 11:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.2939 (0.9885) Acc@1 71.191 (77.906) Acc@5 90.527 (94.217) Mem 7379MB [2024-08-26 11:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.402 Acc@5 94.146 [2024-08-26 11:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.4% [2024-08-26 11:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.794 (0.794) Loss 0.4360 (0.4360) Acc@1 91.992 (91.992) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-26 11:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.148) Loss 0.7012 (0.6839) Acc@1 86.133 (85.210) Acc@5 96.387 (97.070) Mem 7379MB [2024-08-26 11:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.117) Loss 0.9819 (0.7069) Acc@1 76.660 (84.198) Acc@5 93.945 (97.042) Mem 7379MB [2024-08-26 11:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.104) Loss 1.2539 (0.8049) Acc@1 68.164 (81.902) Acc@5 90.820 (95.867) Mem 7379MB [2024-08-26 11:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.1172 
(0.8556) Acc@1 72.266 (80.497) Acc@5 92.969 (95.372) Mem 7379MB [2024-08-26 11:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.050 Acc@5 95.322 [2024-08-26 11:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.1% [2024-08-26 11:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.05% [2024-08-26 11:25:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 11:25:43 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 11:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][0/1251] eta 0:13:49 lr 0.000787 wd 0.0500 time 0.6633 (0.6633) data time 0.4433 (0.4433) model time 0.0000 (0.0000) loss 3.5149 (3.5149) grad_norm 1.8339 (1.8339) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][10/1251] eta 0:05:47 lr 0.000787 wd 0.0500 time 0.2434 (0.2803) data time 0.0010 (0.0412) model time 0.0000 (0.0000) loss 3.1033 (3.2393) grad_norm 2.7917 (2.2098) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][20/1251] eta 0:05:22 lr 0.000787 wd 0.0500 time 0.2414 (0.2621) data time 0.0008 (0.0221) model time 0.0000 (0.0000) loss 3.9216 (3.2858) grad_norm 1.7534 (2.4714) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][30/1251] eta 0:05:12 lr 0.000787 wd 0.0500 time 0.2415 (0.2562) data time 0.0009 (0.0153) model time 0.0000 (0.0000) loss 3.0810 (3.3584) grad_norm 1.4611 (2.3455) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:25:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][40/1251] eta 
0:05:13 lr 0.000787 wd 0.0500 time 0.4811 (0.2592) data time 0.0010 (0.0118) model time 0.0000 (0.0000) loss 3.3052 (3.3812) grad_norm 1.6323 (2.2332) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][50/1251] eta 0:05:07 lr 0.000787 wd 0.0500 time 0.2501 (0.2562) data time 0.0008 (0.0099) model time 0.0000 (0.0000) loss 4.1266 (3.3809) grad_norm 2.3279 (2.2620) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:25:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][60/1251] eta 0:05:05 lr 0.000787 wd 0.0500 time 0.2406 (0.2563) data time 0.0007 (0.0085) model time 0.2399 (0.2555) loss 3.1638 (3.3439) grad_norm 1.9608 (2.2809) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][70/1251] eta 0:05:00 lr 0.000787 wd 0.0500 time 0.2414 (0.2545) data time 0.0011 (0.0074) model time 0.2403 (0.2489) loss 3.3296 (3.3063) grad_norm 2.2294 (2.2832) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][80/1251] eta 0:04:56 lr 0.000787 wd 0.0500 time 0.2451 (0.2532) data time 0.0008 (0.0067) model time 0.2443 (0.2469) loss 3.0665 (3.3361) grad_norm 2.1914 (2.2686) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][90/1251] eta 0:04:53 lr 0.000787 wd 0.0500 time 0.2515 (0.2527) data time 0.0008 (0.0061) model time 0.2507 (0.2469) loss 3.8801 (3.3324) grad_norm 1.7220 (2.2416) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][100/1251] eta 0:04:52 lr 0.000787 wd 0.0500 time 0.2395 (0.2538) data time 0.0011 (0.0057) model time 0.2384 (0.2499) loss 4.3739 (3.3462) grad_norm 1.7996 (2.2189) loss_scale 4096.0000 (4096.0000) mem 7379MB 
[2024-08-26 11:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][110/1251] eta 0:04:48 lr 0.000786 wd 0.0500 time 0.2501 (0.2529) data time 0.0008 (0.0053) model time 0.2493 (0.2487) loss 3.7277 (3.3775) grad_norm 1.6940 (2.1820) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][120/1251] eta 0:04:45 lr 0.000786 wd 0.0500 time 0.2445 (0.2524) data time 0.0008 (0.0050) model time 0.2437 (0.2480) loss 3.8972 (3.3823) grad_norm 2.8333 (2.2106) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][130/1251] eta 0:04:42 lr 0.000786 wd 0.0500 time 0.2454 (0.2516) data time 0.0013 (0.0048) model time 0.2442 (0.2471) loss 3.6171 (3.3968) grad_norm 2.8395 (2.2029) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][140/1251] eta 0:04:39 lr 0.000786 wd 0.0500 time 0.2521 (0.2511) data time 0.0010 (0.0045) model time 0.2510 (0.2468) loss 2.9343 (3.3845) grad_norm 1.9892 (2.2132) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][150/1251] eta 0:04:36 lr 0.000786 wd 0.0500 time 0.2508 (0.2510) data time 0.0009 (0.0043) model time 0.2498 (0.2468) loss 2.2307 (3.3694) grad_norm 2.2468 (2.2093) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][160/1251] eta 0:04:34 lr 0.000786 wd 0.0500 time 0.2418 (0.2518) data time 0.0008 (0.0041) model time 0.2409 (0.2483) loss 4.3543 (3.3827) grad_norm 1.5271 (2.2030) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][170/1251] eta 0:04:31 lr 0.000786 wd 0.0500 time 0.2400 (0.2514) data time 0.0011 (0.0039) model time 0.2389 
(0.2480) loss 4.0224 (3.3781) grad_norm 1.7885 (2.1941) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][180/1251] eta 0:04:28 lr 0.000786 wd 0.0500 time 0.2332 (0.2511) data time 0.0007 (0.0038) model time 0.2325 (0.2477) loss 3.6524 (3.3823) grad_norm 1.4377 (2.1769) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][190/1251] eta 0:04:26 lr 0.000786 wd 0.0500 time 0.2422 (0.2508) data time 0.0007 (0.0037) model time 0.2415 (0.2475) loss 3.8889 (3.3850) grad_norm 2.0550 (2.1664) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][200/1251] eta 0:04:23 lr 0.000786 wd 0.0500 time 0.2467 (0.2505) data time 0.0007 (0.0035) model time 0.2460 (0.2472) loss 2.5969 (3.3651) grad_norm 1.6534 (2.1477) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][210/1251] eta 0:04:20 lr 0.000786 wd 0.0500 time 0.2439 (0.2502) data time 0.0011 (0.0034) model time 0.2429 (0.2469) loss 3.2701 (3.3645) grad_norm 2.7525 (2.1459) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][220/1251] eta 0:04:17 lr 0.000786 wd 0.0500 time 0.2534 (0.2500) data time 0.0007 (0.0033) model time 0.2526 (0.2468) loss 2.9677 (3.3727) grad_norm 1.9598 (2.1635) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][230/1251] eta 0:04:15 lr 0.000786 wd 0.0500 time 0.2415 (0.2498) data time 0.0009 (0.0034) model time 0.2406 (0.2465) loss 4.2824 (3.3763) grad_norm 1.8023 (2.1602) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[106/300][240/1251] eta 0:04:12 lr 0.000786 wd 0.0500 time 0.2476 (0.2496) data time 0.0010 (0.0033) model time 0.2466 (0.2463) loss 3.9135 (3.3738) grad_norm 2.1807 (2.1517) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][250/1251] eta 0:04:09 lr 0.000786 wd 0.0500 time 0.2326 (0.2493) data time 0.0009 (0.0032) model time 0.2317 (0.2460) loss 3.3985 (3.3717) grad_norm 2.2397 (2.1446) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][260/1251] eta 0:04:06 lr 0.000786 wd 0.0500 time 0.2491 (0.2491) data time 0.0011 (0.0031) model time 0.2479 (0.2458) loss 3.9586 (3.3672) grad_norm 2.1099 (2.1344) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][270/1251] eta 0:04:04 lr 0.000786 wd 0.0500 time 0.2438 (0.2489) data time 0.0010 (0.0031) model time 0.2428 (0.2458) loss 3.7608 (3.3721) grad_norm 2.0566 (2.1327) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][280/1251] eta 0:04:01 lr 0.000786 wd 0.0500 time 0.2463 (0.2488) data time 0.0009 (0.0030) model time 0.2453 (0.2457) loss 3.4303 (3.3765) grad_norm 1.8379 (2.1327) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][290/1251] eta 0:03:59 lr 0.000786 wd 0.0500 time 0.2215 (0.2493) data time 0.0008 (0.0030) model time 0.2207 (0.2464) loss 2.4969 (3.3664) grad_norm 1.8095 (2.1273) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][300/1251] eta 0:03:56 lr 0.000786 wd 0.0500 time 0.2438 (0.2492) data time 0.0007 (0.0029) model time 0.2431 (0.2463) loss 1.8244 (3.3467) grad_norm 1.6708 (2.1265) loss_scale 4096.0000 
(4096.0000) mem 7379MB [2024-08-26 11:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][310/1251] eta 0:03:54 lr 0.000786 wd 0.0500 time 0.2553 (0.2491) data time 0.0009 (0.0029) model time 0.2544 (0.2462) loss 2.6791 (3.3421) grad_norm 2.0286 (2.1333) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][320/1251] eta 0:03:51 lr 0.000786 wd 0.0500 time 0.2409 (0.2491) data time 0.0007 (0.0029) model time 0.2402 (0.2462) loss 3.6845 (3.3404) grad_norm 2.1216 (2.1333) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][330/1251] eta 0:03:49 lr 0.000786 wd 0.0500 time 0.2393 (0.2494) data time 0.0010 (0.0028) model time 0.2383 (0.2467) loss 2.9336 (3.3375) grad_norm 3.1969 (2.1397) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][340/1251] eta 0:03:47 lr 0.000786 wd 0.0500 time 0.2370 (0.2498) data time 0.0007 (0.0028) model time 0.2363 (0.2472) loss 3.9546 (3.3375) grad_norm 1.9604 (2.1454) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][350/1251] eta 0:03:44 lr 0.000786 wd 0.0500 time 0.2489 (0.2497) data time 0.0013 (0.0027) model time 0.2476 (0.2471) loss 3.4190 (3.3419) grad_norm 1.5445 (2.1449) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][360/1251] eta 0:03:42 lr 0.000786 wd 0.0500 time 0.2530 (0.2496) data time 0.0008 (0.0027) model time 0.2522 (0.2470) loss 4.2037 (3.3408) grad_norm 2.9746 (2.1425) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][370/1251] eta 0:03:39 lr 0.000786 wd 0.0500 time 0.2353 (0.2495) data time 0.0011 
(0.0026) model time 0.2342 (0.2470) loss 3.3077 (3.3468) grad_norm 2.3090 (2.1693) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][380/1251] eta 0:03:37 lr 0.000786 wd 0.0500 time 0.2379 (0.2494) data time 0.0011 (0.0026) model time 0.2368 (0.2469) loss 3.1827 (3.3441) grad_norm 1.9379 (2.1707) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][390/1251] eta 0:03:35 lr 0.000785 wd 0.0500 time 0.2374 (0.2497) data time 0.0007 (0.0025) model time 0.2367 (0.2473) loss 4.2642 (3.3458) grad_norm 1.7272 (2.1632) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][400/1251] eta 0:03:32 lr 0.000785 wd 0.0500 time 0.2415 (0.2495) data time 0.0012 (0.0025) model time 0.2403 (0.2471) loss 3.6575 (3.3400) grad_norm 1.8538 (2.1613) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][410/1251] eta 0:03:29 lr 0.000785 wd 0.0500 time 0.2393 (0.2494) data time 0.0009 (0.0025) model time 0.2385 (0.2470) loss 3.7890 (3.3486) grad_norm 1.7402 (2.1543) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][420/1251] eta 0:03:27 lr 0.000785 wd 0.0500 time 0.2359 (0.2497) data time 0.0007 (0.0025) model time 0.2352 (0.2474) loss 3.7132 (3.3467) grad_norm 1.7334 (2.1513) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][430/1251] eta 0:03:24 lr 0.000785 wd 0.0500 time 0.2430 (0.2496) data time 0.0008 (0.0024) model time 0.2422 (0.2473) loss 2.9787 (3.3489) grad_norm 2.8865 (2.1717) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [106/300][440/1251] eta 0:03:22 lr 0.000785 wd 0.0500 time 0.2430 (0.2495) data time 0.0009 (0.0024) model time 0.2420 (0.2472) loss 3.2315 (3.3528) grad_norm 1.8236 (2.1797) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][450/1251] eta 0:03:19 lr 0.000785 wd 0.0500 time 0.2328 (0.2493) data time 0.0009 (0.0024) model time 0.2320 (0.2469) loss 3.6037 (3.3542) grad_norm 1.6229 (2.1749) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][460/1251] eta 0:03:17 lr 0.000785 wd 0.0500 time 0.2398 (0.2500) data time 0.0010 (0.0024) model time 0.2389 (0.2478) loss 3.0212 (3.3586) grad_norm 1.4774 (inf) loss_scale 2048.0000 (4056.0174) mem 7379MB [2024-08-26 11:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][470/1251] eta 0:03:15 lr 0.000785 wd 0.0500 time 0.2442 (0.2501) data time 0.0009 (0.0023) model time 0.2433 (0.2479) loss 3.0650 (3.3552) grad_norm 2.7582 (inf) loss_scale 2048.0000 (4013.3843) mem 7379MB [2024-08-26 11:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][480/1251] eta 0:03:12 lr 0.000785 wd 0.0500 time 0.2435 (0.2501) data time 0.0008 (0.0023) model time 0.2428 (0.2479) loss 2.4430 (3.3491) grad_norm 2.2505 (inf) loss_scale 2048.0000 (3972.5239) mem 7379MB [2024-08-26 11:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][490/1251] eta 0:03:10 lr 0.000785 wd 0.0500 time 0.2417 (0.2500) data time 0.0009 (0.0023) model time 0.2408 (0.2478) loss 3.0717 (3.3406) grad_norm 1.9182 (inf) loss_scale 2048.0000 (3933.3279) mem 7379MB [2024-08-26 11:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][500/1251] eta 0:03:07 lr 0.000785 wd 0.0500 time 0.2436 (0.2499) data time 0.0011 (0.0023) model time 0.2425 (0.2477) loss 3.2583 (3.3463) grad_norm 2.1723 (inf) loss_scale 2048.0000 
(3895.6966) mem 7379MB [2024-08-26 11:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][510/1251] eta 0:03:05 lr 0.000785 wd 0.0500 time 0.2348 (0.2497) data time 0.0009 (0.0023) model time 0.2339 (0.2476) loss 3.4362 (3.3412) grad_norm 2.4768 (inf) loss_scale 2048.0000 (3859.5382) mem 7379MB [2024-08-26 11:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][520/1251] eta 0:03:02 lr 0.000785 wd 0.0500 time 0.2351 (0.2496) data time 0.0009 (0.0022) model time 0.2342 (0.2475) loss 4.2177 (3.3405) grad_norm 1.4214 (inf) loss_scale 2048.0000 (3824.7678) mem 7379MB [2024-08-26 11:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][530/1251] eta 0:03:00 lr 0.000785 wd 0.0500 time 0.2361 (0.2499) data time 0.0008 (0.0022) model time 0.2353 (0.2478) loss 3.4882 (3.3387) grad_norm 1.7803 (inf) loss_scale 2048.0000 (3791.3070) mem 7379MB [2024-08-26 11:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][540/1251] eta 0:02:58 lr 0.000785 wd 0.0500 time 0.2404 (0.2505) data time 0.0009 (0.0022) model time 0.2395 (0.2485) loss 3.4624 (3.3423) grad_norm 2.1865 (inf) loss_scale 2048.0000 (3759.0832) mem 7379MB [2024-08-26 11:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][550/1251] eta 0:02:55 lr 0.000785 wd 0.0500 time 0.2368 (0.2504) data time 0.0010 (0.0022) model time 0.2359 (0.2484) loss 3.6711 (3.3413) grad_norm 1.9826 (inf) loss_scale 2048.0000 (3728.0290) mem 7379MB [2024-08-26 11:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][560/1251] eta 0:02:52 lr 0.000785 wd 0.0500 time 0.2405 (0.2503) data time 0.0007 (0.0022) model time 0.2398 (0.2483) loss 3.9489 (3.3441) grad_norm 3.1439 (inf) loss_scale 2048.0000 (3698.0820) mem 7379MB [2024-08-26 11:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][570/1251] eta 0:02:50 lr 0.000785 wd 0.0500 time 0.2445 (0.2501) data time 0.0011 (0.0022) model time 
0.2434 (0.2481) loss 2.7368 (3.3413) grad_norm 1.9829 (inf) loss_scale 2048.0000 (3669.1839) mem 7379MB [2024-08-26 11:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][580/1251] eta 0:02:47 lr 0.000785 wd 0.0500 time 0.4504 (0.2503) data time 0.0010 (0.0021) model time 0.4495 (0.2483) loss 2.5249 (3.3406) grad_norm 1.9213 (inf) loss_scale 2048.0000 (3641.2806) mem 7379MB [2024-08-26 11:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][590/1251] eta 0:02:45 lr 0.000785 wd 0.0500 time 0.2460 (0.2502) data time 0.0010 (0.0021) model time 0.2450 (0.2482) loss 3.9518 (3.3428) grad_norm 1.5470 (inf) loss_scale 2048.0000 (3614.3215) mem 7379MB [2024-08-26 11:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][600/1251] eta 0:02:42 lr 0.000785 wd 0.0500 time 0.2430 (0.2500) data time 0.0010 (0.0021) model time 0.2420 (0.2480) loss 2.9322 (3.3398) grad_norm 2.2677 (inf) loss_scale 2048.0000 (3588.2596) mem 7379MB [2024-08-26 11:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][610/1251] eta 0:02:40 lr 0.000785 wd 0.0500 time 0.2377 (0.2498) data time 0.0008 (0.0021) model time 0.2369 (0.2478) loss 4.1545 (3.3412) grad_norm 3.4629 (inf) loss_scale 2048.0000 (3563.0507) mem 7379MB [2024-08-26 11:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][620/1251] eta 0:02:37 lr 0.000785 wd 0.0500 time 0.2367 (0.2498) data time 0.0010 (0.0021) model time 0.2358 (0.2479) loss 3.0463 (3.3418) grad_norm 2.0663 (inf) loss_scale 2048.0000 (3538.6538) mem 7379MB [2024-08-26 11:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][630/1251] eta 0:02:35 lr 0.000785 wd 0.0500 time 0.2403 (0.2497) data time 0.0011 (0.0021) model time 0.2392 (0.2477) loss 3.1787 (3.3465) grad_norm 1.7786 (inf) loss_scale 2048.0000 (3515.0301) mem 7379MB [2024-08-26 11:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][640/1251] eta 0:02:32 
lr 0.000785 wd 0.0500 time 0.2478 (0.2496) data time 0.0009 (0.0021) model time 0.2469 (0.2476) loss 3.3702 (3.3456) grad_norm 1.9439 (inf) loss_scale 2048.0000 (3492.1435) mem 7379MB [2024-08-26 11:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][650/1251] eta 0:02:29 lr 0.000785 wd 0.0500 time 0.2375 (0.2494) data time 0.0009 (0.0020) model time 0.2366 (0.2475) loss 3.7707 (3.3450) grad_norm 1.5547 (inf) loss_scale 2048.0000 (3469.9601) mem 7379MB [2024-08-26 11:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][660/1251] eta 0:02:27 lr 0.000784 wd 0.0500 time 0.2389 (0.2493) data time 0.0010 (0.0020) model time 0.2379 (0.2474) loss 3.6027 (3.3471) grad_norm 1.9506 (inf) loss_scale 2048.0000 (3448.4478) mem 7379MB [2024-08-26 11:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][670/1251] eta 0:02:24 lr 0.000784 wd 0.0500 time 0.2368 (0.2493) data time 0.0007 (0.0020) model time 0.2361 (0.2473) loss 2.4421 (3.3448) grad_norm 2.0913 (inf) loss_scale 2048.0000 (3427.5768) mem 7379MB [2024-08-26 11:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][680/1251] eta 0:02:22 lr 0.000784 wd 0.0500 time 0.2408 (0.2492) data time 0.0007 (0.0020) model time 0.2400 (0.2472) loss 3.1728 (3.3395) grad_norm 2.7965 (inf) loss_scale 2048.0000 (3407.3186) mem 7379MB [2024-08-26 11:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][690/1251] eta 0:02:19 lr 0.000784 wd 0.0500 time 0.4323 (0.2494) data time 0.0007 (0.0020) model time 0.4315 (0.2475) loss 4.0475 (3.3381) grad_norm 2.7233 (inf) loss_scale 2048.0000 (3387.6469) mem 7379MB [2024-08-26 11:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][700/1251] eta 0:02:17 lr 0.000784 wd 0.0500 time 0.2426 (0.2493) data time 0.0010 (0.0020) model time 0.2416 (0.2474) loss 2.7031 (3.3337) grad_norm 3.1578 (inf) loss_scale 2048.0000 (3368.5364) mem 7379MB [2024-08-26 11:28:41 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][710/1251] eta 0:02:14 lr 0.000784 wd 0.0500 time 0.2471 (0.2492) data time 0.0007 (0.0020) model time 0.2463 (0.2473) loss 4.3035 (3.3350) grad_norm 1.7518 (inf) loss_scale 2048.0000 (3349.9634) mem 7379MB [2024-08-26 11:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][720/1251] eta 0:02:12 lr 0.000784 wd 0.0500 time 0.2406 (0.2491) data time 0.0011 (0.0020) model time 0.2395 (0.2472) loss 2.3084 (3.3338) grad_norm 1.8319 (inf) loss_scale 2048.0000 (3331.9057) mem 7379MB [2024-08-26 11:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][730/1251] eta 0:02:09 lr 0.000784 wd 0.0500 time 0.2407 (0.2491) data time 0.0009 (0.0020) model time 0.2398 (0.2471) loss 3.5005 (3.3323) grad_norm 1.7631 (inf) loss_scale 2048.0000 (3314.3420) mem 7379MB [2024-08-26 11:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][740/1251] eta 0:02:07 lr 0.000784 wd 0.0500 time 0.2370 (0.2490) data time 0.0007 (0.0020) model time 0.2363 (0.2471) loss 4.3939 (3.3308) grad_norm 2.6862 (inf) loss_scale 2048.0000 (3297.2524) mem 7379MB [2024-08-26 11:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][750/1251] eta 0:02:04 lr 0.000784 wd 0.0500 time 0.2423 (0.2489) data time 0.0008 (0.0019) model time 0.2415 (0.2470) loss 3.4307 (3.3310) grad_norm 3.1566 (inf) loss_scale 2048.0000 (3280.6178) mem 7379MB [2024-08-26 11:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][760/1251] eta 0:02:02 lr 0.000784 wd 0.0500 time 0.2480 (0.2488) data time 0.0007 (0.0019) model time 0.2473 (0.2469) loss 3.8896 (3.3303) grad_norm 2.5095 (inf) loss_scale 2048.0000 (3264.4205) mem 7379MB [2024-08-26 11:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][770/1251] eta 0:01:59 lr 0.000784 wd 0.0500 time 0.2428 (0.2487) data time 0.0010 (0.0019) model time 0.2418 (0.2468) loss 3.7171 (3.3279) 
grad_norm 2.0096 (inf) loss_scale 2048.0000 (3248.6433) mem 7379MB [2024-08-26 11:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][780/1251] eta 0:01:57 lr 0.000784 wd 0.0500 time 0.2419 (0.2487) data time 0.0009 (0.0019) model time 0.2409 (0.2468) loss 3.2104 (3.3305) grad_norm 2.0681 (inf) loss_scale 2048.0000 (3233.2702) mem 7379MB [2024-08-26 11:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][790/1251] eta 0:01:54 lr 0.000784 wd 0.0500 time 0.2479 (0.2486) data time 0.0007 (0.0019) model time 0.2472 (0.2467) loss 2.3409 (3.3303) grad_norm 2.4972 (inf) loss_scale 2048.0000 (3218.2857) mem 7379MB [2024-08-26 11:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][800/1251] eta 0:01:52 lr 0.000784 wd 0.0500 time 0.2392 (0.2485) data time 0.0007 (0.0019) model time 0.2385 (0.2467) loss 3.6328 (3.3325) grad_norm 1.3795 (inf) loss_scale 2048.0000 (3203.6754) mem 7379MB [2024-08-26 11:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][810/1251] eta 0:01:49 lr 0.000784 wd 0.0500 time 0.2548 (0.2485) data time 0.0008 (0.0019) model time 0.2540 (0.2467) loss 3.2217 (3.3332) grad_norm 1.9794 (inf) loss_scale 2048.0000 (3189.4254) mem 7379MB [2024-08-26 11:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][820/1251] eta 0:01:47 lr 0.000784 wd 0.0500 time 0.2389 (0.2485) data time 0.0011 (0.0019) model time 0.2379 (0.2466) loss 3.4973 (3.3298) grad_norm 1.7715 (inf) loss_scale 2048.0000 (3175.5225) mem 7379MB [2024-08-26 11:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][830/1251] eta 0:01:44 lr 0.000784 wd 0.0500 time 0.2422 (0.2487) data time 0.0009 (0.0019) model time 0.2413 (0.2469) loss 3.5844 (3.3318) grad_norm 2.5503 (inf) loss_scale 2048.0000 (3161.9543) mem 7379MB [2024-08-26 11:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][840/1251] eta 0:01:42 lr 0.000784 wd 0.0500 time 0.2486 
(0.2487) data time 0.0007 (0.0019) model time 0.2479 (0.2468) loss 4.3604 (3.3348) grad_norm 2.1602 (inf) loss_scale 2048.0000 (3148.7087) mem 7379MB [2024-08-26 11:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][850/1251] eta 0:01:39 lr 0.000784 wd 0.0500 time 0.2343 (0.2489) data time 0.0009 (0.0018) model time 0.2334 (0.2470) loss 2.6344 (3.3363) grad_norm 1.5688 (inf) loss_scale 2048.0000 (3135.7744) mem 7379MB [2024-08-26 11:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][860/1251] eta 0:01:37 lr 0.000784 wd 0.0500 time 0.2452 (0.2488) data time 0.0009 (0.0018) model time 0.2443 (0.2470) loss 3.5739 (3.3372) grad_norm 2.1095 (inf) loss_scale 2048.0000 (3123.1405) mem 7379MB [2024-08-26 11:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][870/1251] eta 0:01:34 lr 0.000784 wd 0.0500 time 0.2373 (0.2490) data time 0.0010 (0.0018) model time 0.2362 (0.2472) loss 3.2414 (3.3349) grad_norm 1.4273 (inf) loss_scale 2048.0000 (3110.7968) mem 7379MB [2024-08-26 11:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][880/1251] eta 0:01:32 lr 0.000784 wd 0.0500 time 0.2453 (0.2489) data time 0.0010 (0.0018) model time 0.2442 (0.2471) loss 2.8684 (3.3354) grad_norm 1.9665 (inf) loss_scale 2048.0000 (3098.7333) mem 7379MB [2024-08-26 11:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][890/1251] eta 0:01:29 lr 0.000784 wd 0.0500 time 0.2411 (0.2488) data time 0.0007 (0.0018) model time 0.2404 (0.2471) loss 2.5774 (3.3324) grad_norm 2.2280 (inf) loss_scale 2048.0000 (3086.9405) mem 7379MB [2024-08-26 11:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][900/1251] eta 0:01:27 lr 0.000784 wd 0.0500 time 0.2407 (0.2488) data time 0.0006 (0.0018) model time 0.2401 (0.2470) loss 3.6629 (3.3351) grad_norm 2.7532 (inf) loss_scale 2048.0000 (3075.4095) mem 7379MB [2024-08-26 11:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [106/300][910/1251] eta 0:01:24 lr 0.000784 wd 0.0500 time 0.2351 (0.2487) data time 0.0009 (0.0018) model time 0.2342 (0.2470) loss 2.5370 (3.3333) grad_norm 1.9485 (inf) loss_scale 2048.0000 (3064.1317) mem 7379MB [2024-08-26 11:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][920/1251] eta 0:01:22 lr 0.000784 wd 0.0500 time 0.2343 (0.2489) data time 0.0011 (0.0018) model time 0.2333 (0.2471) loss 3.6472 (3.3343) grad_norm 2.5493 (inf) loss_scale 2048.0000 (3053.0988) mem 7379MB [2024-08-26 11:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][930/1251] eta 0:01:19 lr 0.000783 wd 0.0500 time 0.2422 (0.2488) data time 0.0008 (0.0018) model time 0.2414 (0.2470) loss 2.8367 (3.3350) grad_norm 2.0405 (inf) loss_scale 2048.0000 (3042.3029) mem 7379MB [2024-08-26 11:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][940/1251] eta 0:01:17 lr 0.000783 wd 0.0500 time 0.2400 (0.2487) data time 0.0008 (0.0018) model time 0.2391 (0.2470) loss 3.9086 (3.3371) grad_norm 2.4037 (inf) loss_scale 2048.0000 (3031.7365) mem 7379MB [2024-08-26 11:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][950/1251] eta 0:01:14 lr 0.000783 wd 0.0500 time 0.2393 (0.2489) data time 0.0011 (0.0018) model time 0.2382 (0.2471) loss 3.9102 (3.3401) grad_norm 2.0983 (inf) loss_scale 2048.0000 (3021.3922) mem 7379MB [2024-08-26 11:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][960/1251] eta 0:01:12 lr 0.000783 wd 0.0500 time 0.2440 (0.2488) data time 0.0010 (0.0017) model time 0.2431 (0.2471) loss 3.8843 (3.3367) grad_norm 2.0556 (inf) loss_scale 2048.0000 (3011.2633) mem 7379MB [2024-08-26 11:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][970/1251] eta 0:01:09 lr 0.000783 wd 0.0500 time 0.4539 (0.2489) data time 0.0011 (0.0017) model time 0.4529 (0.2472) loss 2.9003 (3.3339) grad_norm 1.9827 (inf) loss_scale 2048.0000 
(3001.3429) mem 7379MB [2024-08-26 11:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][980/1251] eta 0:01:07 lr 0.000783 wd 0.0500 time 0.2411 (0.2490) data time 0.0007 (0.0017) model time 0.2404 (0.2473) loss 4.3098 (3.3348) grad_norm 2.3430 (inf) loss_scale 2048.0000 (2991.6249) mem 7379MB [2024-08-26 11:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][990/1251] eta 0:01:04 lr 0.000783 wd 0.0500 time 0.2380 (0.2490) data time 0.0008 (0.0017) model time 0.2372 (0.2473) loss 2.6858 (3.3350) grad_norm 2.2507 (inf) loss_scale 2048.0000 (2982.1029) mem 7379MB [2024-08-26 11:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1000/1251] eta 0:01:02 lr 0.000783 wd 0.0500 time 0.2466 (0.2489) data time 0.0008 (0.0017) model time 0.2458 (0.2472) loss 3.4536 (3.3356) grad_norm 2.3588 (inf) loss_scale 2048.0000 (2972.7712) mem 7379MB [2024-08-26 11:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1010/1251] eta 0:00:59 lr 0.000783 wd 0.0500 time 0.2397 (0.2488) data time 0.0012 (0.0017) model time 0.2385 (0.2471) loss 3.7160 (3.3366) grad_norm 1.8267 (inf) loss_scale 2048.0000 (2963.6241) mem 7379MB [2024-08-26 11:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1020/1251] eta 0:00:57 lr 0.000783 wd 0.0500 time 0.2425 (0.2487) data time 0.0010 (0.0017) model time 0.2415 (0.2471) loss 3.8725 (3.3381) grad_norm 2.5435 (inf) loss_scale 2048.0000 (2954.6562) mem 7379MB [2024-08-26 11:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1030/1251] eta 0:00:54 lr 0.000783 wd 0.0500 time 0.2364 (0.2487) data time 0.0008 (0.0017) model time 0.2356 (0.2470) loss 4.2099 (3.3399) grad_norm 3.0706 (inf) loss_scale 2048.0000 (2945.8623) mem 7379MB [2024-08-26 11:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1040/1251] eta 0:00:52 lr 0.000783 wd 0.0500 time 0.2585 (0.2487) data time 0.0008 (0.0017) model 
time 0.2577 (0.2470) loss 3.3059 (3.3422) grad_norm 2.8378 (inf) loss_scale 2048.0000 (2937.2373) mem 7379MB [2024-08-26 11:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1050/1251] eta 0:00:49 lr 0.000783 wd 0.0500 time 0.2352 (0.2486) data time 0.0009 (0.0017) model time 0.2343 (0.2469) loss 3.4734 (3.3429) grad_norm 1.7342 (inf) loss_scale 2048.0000 (2928.7764) mem 7379MB [2024-08-26 11:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1060/1251] eta 0:00:47 lr 0.000783 wd 0.0500 time 0.2379 (0.2486) data time 0.0012 (0.0017) model time 0.2367 (0.2469) loss 3.8112 (3.3442) grad_norm 2.4989 (inf) loss_scale 2048.0000 (2920.4750) mem 7379MB [2024-08-26 11:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1070/1251] eta 0:00:45 lr 0.000783 wd 0.0500 time 0.2367 (0.2487) data time 0.0009 (0.0017) model time 0.2358 (0.2470) loss 3.9069 (3.3438) grad_norm 2.0530 (inf) loss_scale 2048.0000 (2912.3287) mem 7379MB [2024-08-26 11:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1080/1251] eta 0:00:42 lr 0.000783 wd 0.0500 time 0.2480 (0.2490) data time 0.0009 (0.0017) model time 0.2471 (0.2474) loss 3.4305 (3.3434) grad_norm 1.7198 (inf) loss_scale 2048.0000 (2904.3330) mem 7379MB [2024-08-26 11:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1090/1251] eta 0:00:40 lr 0.000783 wd 0.0500 time 0.2348 (0.2489) data time 0.0010 (0.0017) model time 0.2339 (0.2473) loss 3.4000 (3.3448) grad_norm 1.8086 (inf) loss_scale 2048.0000 (2896.4840) mem 7379MB [2024-08-26 11:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1100/1251] eta 0:00:37 lr 0.000783 wd 0.0500 time 0.2559 (0.2489) data time 0.0007 (0.0017) model time 0.2552 (0.2472) loss 2.4309 (3.3438) grad_norm 1.5507 (inf) loss_scale 2048.0000 (2888.7775) mem 7379MB [2024-08-26 11:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1110/1251] 
eta 0:00:35 lr 0.000783 wd 0.0500 time 0.2418 (0.2488) data time 0.0007 (0.0017) model time 0.2410 (0.2472) loss 4.1769 (3.3427) grad_norm 2.0393 (inf) loss_scale 2048.0000 (2881.2097) mem 7379MB [2024-08-26 11:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1120/1251] eta 0:00:32 lr 0.000783 wd 0.0500 time 0.2432 (0.2488) data time 0.0008 (0.0017) model time 0.2424 (0.2471) loss 3.8171 (3.3443) grad_norm 2.5198 (inf) loss_scale 2048.0000 (2873.7770) mem 7379MB [2024-08-26 11:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1130/1251] eta 0:00:30 lr 0.000783 wd 0.0500 time 0.2512 (0.2487) data time 0.0010 (0.0017) model time 0.2502 (0.2471) loss 3.1268 (3.3437) grad_norm 1.7687 (inf) loss_scale 2048.0000 (2866.4757) mem 7379MB [2024-08-26 11:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1140/1251] eta 0:00:27 lr 0.000783 wd 0.0500 time 0.2416 (0.2488) data time 0.0010 (0.0016) model time 0.2406 (0.2472) loss 3.5085 (3.3454) grad_norm 3.6590 (inf) loss_scale 2048.0000 (2859.3024) mem 7379MB [2024-08-26 11:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1150/1251] eta 0:00:25 lr 0.000783 wd 0.0500 time 0.2435 (0.2488) data time 0.0008 (0.0016) model time 0.2428 (0.2472) loss 3.2855 (3.3454) grad_norm 2.1018 (inf) loss_scale 2048.0000 (2852.2537) mem 7379MB [2024-08-26 11:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1160/1251] eta 0:00:22 lr 0.000783 wd 0.0500 time 0.2386 (0.2488) data time 0.0011 (0.0016) model time 0.2375 (0.2471) loss 3.3733 (3.3459) grad_norm 3.0526 (inf) loss_scale 2048.0000 (2845.3264) mem 7379MB [2024-08-26 11:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1170/1251] eta 0:00:20 lr 0.000783 wd 0.0500 time 0.2366 (0.2487) data time 0.0007 (0.0016) model time 0.2359 (0.2471) loss 3.8471 (3.3468) grad_norm 2.6831 (inf) loss_scale 2048.0000 (2838.5175) mem 7379MB [2024-08-26 
11:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1180/1251] eta 0:00:17 lr 0.000783 wd 0.0500 time 0.2400 (0.2486) data time 0.0010 (0.0016) model time 0.2390 (0.2470) loss 3.8715 (3.3464) grad_norm 2.1012 (inf) loss_scale 2048.0000 (2831.8239) mem 7379MB [2024-08-26 11:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1190/1251] eta 0:00:15 lr 0.000783 wd 0.0500 time 0.2375 (0.2486) data time 0.0008 (0.0016) model time 0.2368 (0.2470) loss 3.7788 (3.3466) grad_norm 1.8989 (inf) loss_scale 2048.0000 (2825.2427) mem 7379MB [2024-08-26 11:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1200/1251] eta 0:00:12 lr 0.000782 wd 0.0500 time 0.2428 (0.2485) data time 0.0008 (0.0016) model time 0.2420 (0.2469) loss 3.5357 (3.3477) grad_norm 1.6827 (inf) loss_scale 2048.0000 (2818.7710) mem 7379MB [2024-08-26 11:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1210/1251] eta 0:00:10 lr 0.000782 wd 0.0500 time 0.3950 (0.2486) data time 0.0009 (0.0016) model time 0.3941 (0.2470) loss 2.5385 (3.3494) grad_norm 1.9179 (inf) loss_scale 2048.0000 (2812.4063) mem 7379MB [2024-08-26 11:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1220/1251] eta 0:00:07 lr 0.000782 wd 0.0500 time 0.2416 (0.2485) data time 0.0011 (0.0016) model time 0.2405 (0.2469) loss 3.4629 (3.3502) grad_norm 1.9431 (inf) loss_scale 2048.0000 (2806.1458) mem 7379MB [2024-08-26 11:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1230/1251] eta 0:00:05 lr 0.000782 wd 0.0500 time 0.2475 (0.2485) data time 0.0007 (0.0016) model time 0.2468 (0.2469) loss 2.3095 (3.3488) grad_norm 1.6086 (inf) loss_scale 2048.0000 (2799.9870) mem 7379MB [2024-08-26 11:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1240/1251] eta 0:00:02 lr 0.000782 wd 0.0500 time 0.2302 (0.2483) data time 0.0007 (0.0016) model time 0.2295 (0.2468) loss 3.5837 
(3.3492) grad_norm 2.6077 (inf) loss_scale 2048.0000 (2793.9275) mem 7379MB
[2024-08-26 11:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [106/300][1250/1251] eta 0:00:00 lr 0.000782 wd 0.0500 time 0.2262 (0.2482) data time 0.0005 (0.0016) model time 0.2258 (0.2466) loss 3.9834 (3.3489) grad_norm 2.0351 (inf) loss_scale 2048.0000 (2787.9648) mem 7379MB
[2024-08-26 11:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 106 training takes 0:05:10
[2024-08-26 11:30:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 11:30:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 11:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.406 (0.406) Loss 0.5107 (0.5107) Acc@1 90.039 (90.039) Acc@5 98.047 (98.047) Mem 7379MB
[2024-08-26 11:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.111) Loss 0.7407 (0.7621) Acc@1 84.668 (82.821) Acc@5 96.680 (96.600) Mem 7379MB
[2024-08-26 11:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.095) Loss 1.1338 (0.7874) Acc@1 73.242 (81.948) Acc@5 92.871 (96.466) Mem 7379MB
[2024-08-26 11:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.089) Loss 1.4258 (0.9079) Acc@1 65.625 (79.316) Acc@5 88.379 (94.985) Mem 7379MB
[2024-08-26 11:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.3779 (0.9764) Acc@1 68.262 (77.787) Acc@5 90.039 (94.238) Mem 7379MB
[2024-08-26 11:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.484 Acc@5 94.142
[2024-08-26 11:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.5%
[2024-08-26 11:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.842 (0.842) Loss 0.4351 (0.4351) Acc@1 91.895 (91.895) Acc@5 98.340 (98.340) Mem 7379MB
[2024-08-26 11:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.151) Loss 0.6978 (0.6823) Acc@1 86.230 (85.281) Acc@5 96.387 (97.079) Mem 7379MB
[2024-08-26 11:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.117) Loss 0.9810 (0.7054) Acc@1 76.465 (84.263) Acc@5 93.945 (97.056) Mem 7379MB
[2024-08-26 11:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.104) Loss 1.2520 (0.8032) Acc@1 68.457 (81.971) Acc@5 90.820 (95.880) Mem 7379MB
[2024-08-26 11:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.1143 (0.8540) Acc@1 72.266 (80.543) Acc@5 92.773 (95.365) Mem 7379MB
[2024-08-26 11:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.096 Acc@5 95.322
[2024-08-26 11:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.1%
[2024-08-26 11:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.10%
[2024-08-26 11:31:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 11:31:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 11:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][0/1251] eta 0:14:11 lr 0.000782 wd 0.0500 time 0.6805 (0.6805) data time 0.4537 (0.4537) model time 0.0000 (0.0000) loss 3.2211 (3.2211) grad_norm 1.7671 (1.7671) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][10/1251] eta 0:05:49 lr 0.000782 wd 0.0500 time 0.2390 (0.2815) data time 0.0007 (0.0422) model time 0.0000 (0.0000) loss 2.6926 (3.3925) grad_norm 1.8085 (1.8086) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][20/1251] eta 0:05:36 lr 0.000782 wd 0.0500 time 0.2412 (0.2735) data time 0.0007 (0.0226) model time 0.0000 (0.0000) loss 3.0256 (3.4349) grad_norm 3.3027 (2.3372) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][30/1251] eta 0:05:21 lr 0.000782 wd 0.0500 time 0.2421 (0.2629) data time 0.0009 (0.0156) model time 0.0000 (0.0000) loss 2.6087 (3.4609) grad_norm 1.8804 (2.2509) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][40/1251] eta 0:05:12 lr 0.000782 wd 0.0500 time 0.2432 (0.2581) data time 0.0010 (0.0121) model time 0.0000 (0.0000) loss 3.3863 (3.4522) grad_norm 2.0930 (2.2175) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][50/1251] eta 0:05:06 lr 0.000782 wd 0.0500 time 0.2387 (0.2551) data time 0.0008 (0.0100) model time 0.0000 (0.0000) loss 3.3266 (3.4060) grad_norm 2.8018 (2.2418) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][60/1251] eta 0:05:01 lr 0.000782 wd 0.0500 time 0.2394 (0.2531) data time 0.0007 (0.0085) model time 0.2386 
(0.2419) loss 2.9488 (3.4015) grad_norm 1.8909 (2.2158) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][70/1251] eta 0:04:57 lr 0.000782 wd 0.0500 time 0.2367 (0.2516) data time 0.0009 (0.0075) model time 0.2358 (0.2417) loss 3.6899 (3.4423) grad_norm 2.2140 (2.2036) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][80/1251] eta 0:04:53 lr 0.000782 wd 0.0500 time 0.2461 (0.2509) data time 0.0009 (0.0067) model time 0.2452 (0.2427) loss 3.8143 (3.4495) grad_norm 1.8718 (2.2236) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][90/1251] eta 0:04:49 lr 0.000782 wd 0.0500 time 0.2341 (0.2497) data time 0.0010 (0.0061) model time 0.2332 (0.2419) loss 2.9217 (3.4347) grad_norm 1.9417 (2.2561) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][100/1251] eta 0:04:46 lr 0.000782 wd 0.0500 time 0.2475 (0.2491) data time 0.0007 (0.0056) model time 0.2468 (0.2419) loss 4.1300 (3.4087) grad_norm 1.6810 (2.2187) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][110/1251] eta 0:04:43 lr 0.000782 wd 0.0500 time 0.2443 (0.2486) data time 0.0007 (0.0051) model time 0.2436 (0.2420) loss 2.3304 (3.4023) grad_norm 1.3376 (2.1892) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][120/1251] eta 0:04:40 lr 0.000782 wd 0.0500 time 0.2422 (0.2481) data time 0.0007 (0.0048) model time 0.2415 (0.2419) loss 2.2889 (3.3670) grad_norm 2.1448 (2.1910) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][130/1251] 
eta 0:04:37 lr 0.000782 wd 0.0500 time 0.2403 (0.2477) data time 0.0014 (0.0045) model time 0.2389 (0.2419) loss 3.1050 (3.3526) grad_norm 1.6356 (2.1817) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][140/1251] eta 0:04:34 lr 0.000782 wd 0.0500 time 0.2331 (0.2472) data time 0.0010 (0.0043) model time 0.2321 (0.2417) loss 3.4215 (3.3454) grad_norm 3.6978 (2.2126) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][150/1251] eta 0:04:33 lr 0.000782 wd 0.0500 time 0.2436 (0.2480) data time 0.0009 (0.0041) model time 0.2426 (0.2434) loss 3.6249 (3.3494) grad_norm 1.9458 (2.1979) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][160/1251] eta 0:04:30 lr 0.000782 wd 0.0500 time 0.2457 (0.2476) data time 0.0011 (0.0039) model time 0.2446 (0.2431) loss 3.9559 (3.3433) grad_norm 2.5846 (2.2094) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][170/1251] eta 0:04:27 lr 0.000782 wd 0.0500 time 0.2434 (0.2472) data time 0.0008 (0.0037) model time 0.2427 (0.2428) loss 2.1750 (3.3448) grad_norm 3.0394 (2.2339) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][180/1251] eta 0:04:24 lr 0.000782 wd 0.0500 time 0.2394 (0.2468) data time 0.0010 (0.0035) model time 0.2384 (0.2426) loss 3.8206 (3.3615) grad_norm 2.3957 (2.2438) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][190/1251] eta 0:04:21 lr 0.000782 wd 0.0500 time 0.2439 (0.2466) data time 0.0012 (0.0034) model time 0.2427 (0.2425) loss 3.0246 (3.3391) grad_norm 1.6070 (2.2232) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 11:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][200/1251] eta 0:04:19 lr 0.000782 wd 0.0500 time 0.2391 (0.2473) data time 0.0012 (0.0033) model time 0.2379 (0.2436) loss 3.7321 (3.3395) grad_norm 1.9411 (2.2092) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][210/1251] eta 0:04:17 lr 0.000782 wd 0.0500 time 0.2429 (0.2470) data time 0.0007 (0.0032) model time 0.2421 (0.2434) loss 3.4731 (3.3447) grad_norm 1.7916 (2.2046) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][220/1251] eta 0:04:14 lr 0.000782 wd 0.0500 time 0.2396 (0.2467) data time 0.0009 (0.0031) model time 0.2387 (0.2432) loss 2.8766 (3.3366) grad_norm 2.8890 (2.2008) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][230/1251] eta 0:04:12 lr 0.000781 wd 0.0500 time 0.2404 (0.2473) data time 0.0009 (0.0030) model time 0.2394 (0.2441) loss 3.2272 (3.3283) grad_norm 4.1133 (2.2125) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][240/1251] eta 0:04:10 lr 0.000781 wd 0.0500 time 0.2519 (0.2473) data time 0.0007 (0.0029) model time 0.2513 (0.2442) loss 4.0194 (3.3228) grad_norm 2.1040 (2.2054) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][250/1251] eta 0:04:08 lr 0.000781 wd 0.0500 time 0.2516 (0.2483) data time 0.0007 (0.0029) model time 0.2508 (0.2455) loss 3.4768 (3.3281) grad_norm 1.9809 (2.1963) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][260/1251] eta 0:04:05 lr 0.000781 wd 0.0500 time 0.2428 (0.2480) data time 0.0008 (0.0028) model time 0.2421 
(0.2452) loss 3.3486 (3.3379) grad_norm 1.7993 (2.1876) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][270/1251] eta 0:04:03 lr 0.000781 wd 0.0500 time 0.2427 (0.2478) data time 0.0009 (0.0027) model time 0.2418 (0.2451) loss 3.8798 (3.3380) grad_norm 1.6961 (2.1798) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][280/1251] eta 0:04:00 lr 0.000781 wd 0.0500 time 0.2465 (0.2478) data time 0.0011 (0.0027) model time 0.2454 (0.2451) loss 3.6710 (3.3326) grad_norm 2.1045 (2.1704) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][290/1251] eta 0:03:57 lr 0.000781 wd 0.0500 time 0.2436 (0.2476) data time 0.0007 (0.0026) model time 0.2429 (0.2449) loss 3.9814 (3.3362) grad_norm 1.7065 (2.1584) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][300/1251] eta 0:03:55 lr 0.000781 wd 0.0500 time 0.2350 (0.2481) data time 0.0008 (0.0026) model time 0.2341 (0.2457) loss 3.2282 (3.3358) grad_norm 1.8649 (2.1584) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][310/1251] eta 0:03:53 lr 0.000781 wd 0.0500 time 0.2419 (0.2486) data time 0.0010 (0.0025) model time 0.2409 (0.2463) loss 3.5824 (3.3378) grad_norm 1.6336 (2.1575) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][320/1251] eta 0:03:51 lr 0.000781 wd 0.0500 time 0.2376 (0.2484) data time 0.0009 (0.0025) model time 0.2366 (0.2461) loss 3.6612 (3.3454) grad_norm 2.5872 (2.1526) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[107/300][330/1251] eta 0:03:48 lr 0.000781 wd 0.0500 time 0.2417 (0.2483) data time 0.0010 (0.0024) model time 0.2407 (0.2459) loss 3.4701 (3.3409) grad_norm 2.2170 (2.1539) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][340/1251] eta 0:03:46 lr 0.000781 wd 0.0500 time 0.2448 (0.2481) data time 0.0007 (0.0024) model time 0.2441 (0.2458) loss 3.5293 (3.3416) grad_norm 1.9633 (2.1582) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][350/1251] eta 0:03:43 lr 0.000781 wd 0.0500 time 0.2420 (0.2485) data time 0.0011 (0.0024) model time 0.2408 (0.2463) loss 3.6950 (3.3491) grad_norm 1.4834 (2.1543) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][360/1251] eta 0:03:42 lr 0.000781 wd 0.0500 time 0.2343 (0.2494) data time 0.0010 (0.0023) model time 0.2333 (0.2474) loss 3.7091 (3.3533) grad_norm 2.4756 (2.1587) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][370/1251] eta 0:03:39 lr 0.000781 wd 0.0500 time 0.2474 (0.2493) data time 0.0007 (0.0023) model time 0.2466 (0.2473) loss 2.4407 (3.3471) grad_norm 2.0766 (2.1491) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][380/1251] eta 0:03:37 lr 0.000781 wd 0.0500 time 0.2407 (0.2491) data time 0.0007 (0.0023) model time 0.2400 (0.2472) loss 3.0942 (3.3508) grad_norm 2.7739 (2.1515) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][390/1251] eta 0:03:34 lr 0.000781 wd 0.0500 time 0.2383 (0.2489) data time 0.0011 (0.0022) model time 0.2372 (0.2469) loss 3.6423 (3.3468) grad_norm 1.7506 (2.1674) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 11:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][400/1251] eta 0:03:31 lr 0.000781 wd 0.0500 time 0.2381 (0.2488) data time 0.0011 (0.0022) model time 0.2369 (0.2467) loss 3.5868 (3.3477) grad_norm 2.3232 (2.1630) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][410/1251] eta 0:03:29 lr 0.000781 wd 0.0500 time 0.2359 (0.2485) data time 0.0008 (0.0022) model time 0.2352 (0.2465) loss 4.1097 (3.3524) grad_norm 1.9970 (2.1579) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][420/1251] eta 0:03:26 lr 0.000781 wd 0.0500 time 0.2389 (0.2483) data time 0.0010 (0.0022) model time 0.2379 (0.2463) loss 3.3934 (3.3521) grad_norm 2.0103 (2.1621) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][430/1251] eta 0:03:23 lr 0.000781 wd 0.0500 time 0.2406 (0.2482) data time 0.0009 (0.0021) model time 0.2397 (0.2462) loss 3.9150 (3.3558) grad_norm 2.0428 (2.1621) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][440/1251] eta 0:03:21 lr 0.000781 wd 0.0500 time 0.2441 (0.2480) data time 0.0007 (0.0021) model time 0.2433 (0.2460) loss 3.1025 (3.3574) grad_norm 1.7335 (2.1664) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][450/1251] eta 0:03:18 lr 0.000781 wd 0.0500 time 0.2470 (0.2484) data time 0.0009 (0.0021) model time 0.2461 (0.2464) loss 3.7617 (3.3554) grad_norm 2.0225 (2.1649) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][460/1251] eta 0:03:16 lr 0.000781 wd 0.0500 time 0.2476 (0.2482) data time 0.0009 
(0.0021) model time 0.2468 (0.2463) loss 3.4304 (3.3517) grad_norm 1.6758 (2.1640) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][470/1251] eta 0:03:14 lr 0.000781 wd 0.0500 time 0.2479 (0.2485) data time 0.0007 (0.0020) model time 0.2471 (0.2466) loss 2.9422 (3.3553) grad_norm 1.7993 (2.1598) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][480/1251] eta 0:03:11 lr 0.000781 wd 0.0500 time 0.2436 (0.2484) data time 0.0010 (0.0020) model time 0.2426 (0.2465) loss 2.6733 (3.3614) grad_norm 3.1433 (2.1660) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][490/1251] eta 0:03:08 lr 0.000781 wd 0.0500 time 0.2510 (0.2483) data time 0.0007 (0.0020) model time 0.2502 (0.2464) loss 3.1215 (3.3563) grad_norm 1.9702 (2.1673) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][500/1251] eta 0:03:06 lr 0.000780 wd 0.0500 time 0.2321 (0.2485) data time 0.0011 (0.0020) model time 0.2309 (0.2467) loss 3.1515 (3.3585) grad_norm 2.0779 (2.1704) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][510/1251] eta 0:03:04 lr 0.000780 wd 0.0500 time 0.2395 (0.2484) data time 0.0008 (0.0020) model time 0.2388 (0.2465) loss 3.3320 (3.3553) grad_norm 3.6503 (2.1720) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][520/1251] eta 0:03:01 lr 0.000780 wd 0.0500 time 0.2468 (0.2483) data time 0.0011 (0.0019) model time 0.2457 (0.2464) loss 3.5564 (3.3545) grad_norm 1.9570 (2.1648) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [107/300][530/1251] eta 0:02:58 lr 0.000780 wd 0.0500 time 0.2539 (0.2482) data time 0.0009 (0.0019) model time 0.2530 (0.2464) loss 3.3697 (3.3556) grad_norm 2.0011 (2.1590) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][540/1251] eta 0:02:56 lr 0.000780 wd 0.0500 time 0.2397 (0.2483) data time 0.0011 (0.0019) model time 0.2387 (0.2466) loss 3.8245 (3.3609) grad_norm 1.7109 (2.1525) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][550/1251] eta 0:02:54 lr 0.000780 wd 0.0500 time 0.2377 (0.2483) data time 0.0011 (0.0019) model time 0.2367 (0.2465) loss 4.1960 (3.3595) grad_norm 1.5868 (2.1477) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][560/1251] eta 0:02:51 lr 0.000780 wd 0.0500 time 0.2389 (0.2482) data time 0.0010 (0.0019) model time 0.2379 (0.2464) loss 3.2751 (3.3562) grad_norm 2.5356 (2.1455) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][570/1251] eta 0:02:48 lr 0.000780 wd 0.0500 time 0.2486 (0.2481) data time 0.0007 (0.0019) model time 0.2479 (0.2463) loss 3.4003 (3.3525) grad_norm 3.1478 (2.1482) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][580/1251] eta 0:02:46 lr 0.000780 wd 0.0500 time 0.2353 (0.2480) data time 0.0011 (0.0019) model time 0.2342 (0.2462) loss 2.9849 (3.3504) grad_norm 3.4324 (2.1482) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][590/1251] eta 0:02:43 lr 0.000780 wd 0.0500 time 0.2417 (0.2479) data time 0.0009 (0.0019) model time 0.2408 (0.2461) loss 3.6624 (3.3476) grad_norm 1.9208 (2.1471) loss_scale 
2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][600/1251] eta 0:02:41 lr 0.000780 wd 0.0500 time 0.2452 (0.2479) data time 0.0010 (0.0019) model time 0.2442 (0.2461) loss 3.3655 (3.3510) grad_norm 1.6698 (2.1454) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][610/1251] eta 0:02:38 lr 0.000780 wd 0.0500 time 0.2584 (0.2479) data time 0.0008 (0.0018) model time 0.2576 (0.2461) loss 2.2672 (3.3538) grad_norm 2.3036 (2.1456) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][620/1251] eta 0:02:36 lr 0.000780 wd 0.0500 time 0.2356 (0.2477) data time 0.0008 (0.0018) model time 0.2348 (0.2460) loss 2.0154 (3.3547) grad_norm 1.9238 (2.1451) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][630/1251] eta 0:02:33 lr 0.000780 wd 0.0500 time 0.2348 (0.2476) data time 0.0011 (0.0018) model time 0.2337 (0.2459) loss 2.4542 (3.3503) grad_norm 2.1065 (2.1462) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][640/1251] eta 0:02:31 lr 0.000780 wd 0.0500 time 0.2442 (0.2476) data time 0.0028 (0.0018) model time 0.2415 (0.2458) loss 3.0592 (3.3516) grad_norm 1.8641 (2.1463) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][650/1251] eta 0:02:28 lr 0.000780 wd 0.0500 time 0.2477 (0.2475) data time 0.0007 (0.0018) model time 0.2470 (0.2457) loss 2.9884 (3.3544) grad_norm 1.5077 (2.1422) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][660/1251] eta 0:02:26 lr 0.000780 wd 0.0500 time 0.2390 (0.2474) data time 
0.0009 (0.0018) model time 0.2381 (0.2457) loss 3.8824 (3.3543) grad_norm 1.5723 (2.1406) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][670/1251] eta 0:02:23 lr 0.000780 wd 0.0500 time 0.2416 (0.2473) data time 0.0007 (0.0018) model time 0.2409 (0.2456) loss 4.2203 (3.3560) grad_norm 1.5429 (2.1398) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][680/1251] eta 0:02:21 lr 0.000780 wd 0.0500 time 0.2339 (0.2473) data time 0.0011 (0.0018) model time 0.2328 (0.2455) loss 2.4960 (3.3532) grad_norm 2.0333 (2.1479) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][690/1251] eta 0:02:18 lr 0.000780 wd 0.0500 time 0.2508 (0.2472) data time 0.0007 (0.0018) model time 0.2501 (0.2455) loss 3.6293 (3.3531) grad_norm 1.9787 (2.1465) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][700/1251] eta 0:02:16 lr 0.000780 wd 0.0500 time 0.2467 (0.2472) data time 0.0011 (0.0018) model time 0.2456 (0.2455) loss 3.8016 (3.3556) grad_norm 2.1837 (2.1441) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][710/1251] eta 0:02:13 lr 0.000780 wd 0.0500 time 0.2453 (0.2471) data time 0.0007 (0.0017) model time 0.2446 (0.2454) loss 2.3475 (3.3533) grad_norm 2.5900 (2.1507) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][720/1251] eta 0:02:11 lr 0.000780 wd 0.0500 time 0.2418 (0.2472) data time 0.0010 (0.0017) model time 0.2409 (0.2455) loss 2.4154 (3.3529) grad_norm 1.7186 (2.1489) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [107/300][730/1251] eta 0:02:08 lr 0.000780 wd 0.0500 time 0.2372 (0.2472) data time 0.0010 (0.0017) model time 0.2362 (0.2455) loss 3.0629 (3.3505) grad_norm 3.5337 (2.1546) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][740/1251] eta 0:02:06 lr 0.000780 wd 0.0500 time 0.2412 (0.2471) data time 0.0009 (0.0017) model time 0.2404 (0.2454) loss 4.0456 (3.3508) grad_norm 1.5706 (2.1546) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][750/1251] eta 0:02:03 lr 0.000780 wd 0.0500 time 0.2592 (0.2472) data time 0.0008 (0.0017) model time 0.2585 (0.2455) loss 3.8935 (3.3511) grad_norm 2.9847 (2.1550) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][760/1251] eta 0:02:01 lr 0.000780 wd 0.0500 time 0.2454 (0.2472) data time 0.0007 (0.0017) model time 0.2447 (0.2455) loss 4.1202 (3.3521) grad_norm 2.3326 (2.1586) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][770/1251] eta 0:01:59 lr 0.000779 wd 0.0500 time 0.2403 (0.2474) data time 0.0008 (0.0017) model time 0.2396 (0.2457) loss 3.9707 (3.3535) grad_norm 2.1252 (2.1590) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][780/1251] eta 0:01:56 lr 0.000779 wd 0.0500 time 0.2397 (0.2476) data time 0.0007 (0.0017) model time 0.2390 (0.2459) loss 3.5942 (3.3528) grad_norm 2.4338 (2.1566) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][790/1251] eta 0:01:54 lr 0.000779 wd 0.0500 time 0.2370 (0.2476) data time 0.0011 (0.0017) model time 0.2359 (0.2459) loss 2.9210 (3.3526) grad_norm 1.3668 (2.1531) 
loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][800/1251] eta 0:01:51 lr 0.000779 wd 0.0500 time 0.2439 (0.2475) data time 0.0009 (0.0017) model time 0.2430 (0.2458) loss 3.1896 (3.3489) grad_norm 1.7779 (2.1513) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][810/1251] eta 0:01:49 lr 0.000779 wd 0.0500 time 0.2394 (0.2474) data time 0.0009 (0.0017) model time 0.2385 (0.2458) loss 4.1654 (3.3518) grad_norm 1.9419 (2.1531) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][820/1251] eta 0:01:46 lr 0.000779 wd 0.0500 time 0.2417 (0.2474) data time 0.0009 (0.0017) model time 0.2407 (0.2457) loss 3.9331 (3.3532) grad_norm 1.8516 (2.1520) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][830/1251] eta 0:01:44 lr 0.000779 wd 0.0500 time 0.2411 (0.2473) data time 0.0009 (0.0017) model time 0.2401 (0.2457) loss 3.3601 (3.3533) grad_norm 3.5735 (2.1558) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][840/1251] eta 0:01:41 lr 0.000779 wd 0.0500 time 0.2482 (0.2473) data time 0.0008 (0.0017) model time 0.2474 (0.2456) loss 2.6066 (3.3534) grad_norm 2.1084 (2.1581) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][850/1251] eta 0:01:39 lr 0.000779 wd 0.0500 time 0.2350 (0.2472) data time 0.0009 (0.0017) model time 0.2341 (0.2455) loss 3.4454 (3.3523) grad_norm 1.7589 (2.1578) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][860/1251] eta 0:01:36 lr 0.000779 wd 0.0500 time 0.2484 (0.2471) 
data time 0.0010 (0.0017) model time 0.2474 (0.2455) loss 3.3345 (3.3512) grad_norm 1.7246 (2.1553) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][870/1251] eta 0:01:34 lr 0.000779 wd 0.0500 time 0.2438 (0.2472) data time 0.0012 (0.0017) model time 0.2426 (0.2456) loss 2.3316 (3.3488) grad_norm 1.8115 (2.1559) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][880/1251] eta 0:01:31 lr 0.000779 wd 0.0500 time 0.2403 (0.2475) data time 0.0007 (0.0017) model time 0.2395 (0.2459) loss 2.3156 (3.3464) grad_norm 1.6417 (2.1548) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][890/1251] eta 0:01:29 lr 0.000779 wd 0.0500 time 0.2372 (0.2475) data time 0.0009 (0.0017) model time 0.2363 (0.2459) loss 4.1720 (3.3491) grad_norm 2.0712 (2.1518) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][900/1251] eta 0:01:26 lr 0.000779 wd 0.0500 time 0.2409 (0.2474) data time 0.0010 (0.0016) model time 0.2399 (0.2458) loss 3.4267 (3.3493) grad_norm 1.9024 (2.1575) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][910/1251] eta 0:01:24 lr 0.000779 wd 0.0500 time 0.2430 (0.2474) data time 0.0007 (0.0016) model time 0.2423 (0.2458) loss 4.1900 (3.3487) grad_norm 1.9613 (2.1578) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][920/1251] eta 0:01:21 lr 0.000779 wd 0.0500 time 0.2354 (0.2473) data time 0.0009 (0.0016) model time 0.2345 (0.2457) loss 3.3580 (3.3490) grad_norm 1.6956 (2.1554) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [107/300][930/1251] eta 0:01:19 lr 0.000779 wd 0.0500 time 0.2488 (0.2473) data time 0.0010 (0.0016) model time 0.2478 (0.2457) loss 3.5246 (3.3500) grad_norm 1.5059 (2.1568) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][940/1251] eta 0:01:16 lr 0.000779 wd 0.0500 time 0.2479 (0.2473) data time 0.0008 (0.0016) model time 0.2471 (0.2457) loss 3.5422 (3.3533) grad_norm 2.0412 (2.1623) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][950/1251] eta 0:01:14 lr 0.000779 wd 0.0500 time 0.2368 (0.2472) data time 0.0010 (0.0016) model time 0.2358 (0.2456) loss 3.3150 (3.3529) grad_norm 1.6453 (2.1596) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][960/1251] eta 0:01:11 lr 0.000779 wd 0.0500 time 0.2435 (0.2472) data time 0.0007 (0.0016) model time 0.2428 (0.2456) loss 4.1997 (3.3517) grad_norm 2.9725 (2.1632) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][970/1251] eta 0:01:09 lr 0.000779 wd 0.0500 time 0.2428 (0.2471) data time 0.0012 (0.0016) model time 0.2416 (0.2455) loss 3.6786 (3.3536) grad_norm 2.0327 (2.1637) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][980/1251] eta 0:01:06 lr 0.000779 wd 0.0500 time 0.2452 (0.2470) data time 0.0008 (0.0016) model time 0.2443 (0.2455) loss 3.1003 (3.3556) grad_norm 1.4590 (2.1619) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][990/1251] eta 0:01:04 lr 0.000779 wd 0.0500 time 0.2447 (0.2473) data time 0.0009 (0.0016) model time 0.2438 (0.2457) loss 3.8014 (3.3563) grad_norm 
1.5545 (2.1606) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1000/1251] eta 0:01:02 lr 0.000779 wd 0.0500 time 0.2521 (0.2472) data time 0.0012 (0.0016) model time 0.2509 (0.2457) loss 3.6734 (3.3558) grad_norm 2.7509 (2.1576) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1010/1251] eta 0:00:59 lr 0.000779 wd 0.0500 time 0.2435 (0.2472) data time 0.0009 (0.0016) model time 0.2426 (0.2456) loss 3.1511 (3.3577) grad_norm 1.7817 (2.1571) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1020/1251] eta 0:00:57 lr 0.000779 wd 0.0500 time 0.2470 (0.2473) data time 0.0007 (0.0016) model time 0.2463 (0.2457) loss 3.0579 (3.3559) grad_norm 1.9722 (2.1565) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1030/1251] eta 0:00:54 lr 0.000779 wd 0.0500 time 0.2417 (0.2472) data time 0.0010 (0.0016) model time 0.2406 (0.2457) loss 3.4551 (3.3578) grad_norm 2.2315 (2.1571) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1040/1251] eta 0:00:52 lr 0.000778 wd 0.0500 time 0.2391 (0.2472) data time 0.0007 (0.0016) model time 0.2384 (0.2456) loss 4.0046 (3.3577) grad_norm 1.7668 (2.1608) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1050/1251] eta 0:00:49 lr 0.000778 wd 0.0500 time 0.2358 (0.2471) data time 0.0009 (0.0016) model time 0.2349 (0.2456) loss 3.1348 (3.3590) grad_norm 2.2601 (2.1625) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1060/1251] eta 0:00:47 lr 0.000778 wd 
0.0500 time 0.2556 (0.2473) data time 0.0009 (0.0016) model time 0.2547 (0.2458) loss 4.4313 (3.3573) grad_norm 2.5353 (2.1695) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1070/1251] eta 0:00:44 lr 0.000778 wd 0.0500 time 0.2416 (0.2472) data time 0.0009 (0.0016) model time 0.2406 (0.2457) loss 3.5241 (3.3560) grad_norm 2.0733 (2.1700) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1080/1251] eta 0:00:42 lr 0.000778 wd 0.0500 time 0.2424 (0.2474) data time 0.0009 (0.0016) model time 0.2415 (0.2458) loss 3.2689 (3.3556) grad_norm 1.9982 (2.1693) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1090/1251] eta 0:00:39 lr 0.000778 wd 0.0500 time 0.2415 (0.2473) data time 0.0008 (0.0016) model time 0.2407 (0.2458) loss 3.5805 (3.3566) grad_norm 2.9355 (2.1702) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1100/1251] eta 0:00:37 lr 0.000778 wd 0.0500 time 0.2364 (0.2473) data time 0.0009 (0.0016) model time 0.2355 (0.2457) loss 3.8691 (3.3580) grad_norm 1.5670 (2.1700) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1110/1251] eta 0:00:34 lr 0.000778 wd 0.0500 time 0.2434 (0.2472) data time 0.0011 (0.0016) model time 0.2424 (0.2457) loss 3.6854 (3.3575) grad_norm 1.7233 (2.1718) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1120/1251] eta 0:00:32 lr 0.000778 wd 0.0500 time 0.2504 (0.2472) data time 0.0008 (0.0016) model time 0.2496 (0.2456) loss 3.0094 (3.3570) grad_norm 1.8383 (2.1709) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:43 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1130/1251] eta 0:00:29 lr 0.000778 wd 0.0500 time 0.2478 (0.2471) data time 0.0009 (0.0016) model time 0.2469 (0.2456) loss 4.1693 (3.3559) grad_norm 2.2380 (2.1711) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1140/1251] eta 0:00:27 lr 0.000778 wd 0.0500 time 0.2559 (0.2471) data time 0.0007 (0.0016) model time 0.2552 (0.2455) loss 3.6972 (3.3578) grad_norm 2.2623 (2.1702) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1150/1251] eta 0:00:24 lr 0.000778 wd 0.0500 time 0.2447 (0.2471) data time 0.0008 (0.0016) model time 0.2438 (0.2455) loss 4.1895 (3.3592) grad_norm 1.4366 (2.1681) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1160/1251] eta 0:00:22 lr 0.000778 wd 0.0500 time 0.2510 (0.2470) data time 0.0009 (0.0016) model time 0.2501 (0.2455) loss 2.9527 (3.3579) grad_norm 1.5998 (2.1668) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1170/1251] eta 0:00:20 lr 0.000778 wd 0.0500 time 0.2483 (0.2470) data time 0.0010 (0.0016) model time 0.2473 (0.2454) loss 3.9427 (3.3584) grad_norm 2.2770 (2.1653) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1180/1251] eta 0:00:17 lr 0.000778 wd 0.0500 time 0.2499 (0.2469) data time 0.0010 (0.0016) model time 0.2489 (0.2454) loss 3.8025 (3.3557) grad_norm 2.1620 (2.1674) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1190/1251] eta 0:00:15 lr 0.000778 wd 0.0500 time 0.2438 (0.2469) data time 0.0007 (0.0016) model time 0.2431 (0.2454) loss 
3.9063 (3.3547) grad_norm 1.9807 (2.1664) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1200/1251] eta 0:00:12 lr 0.000778 wd 0.0500 time 0.2400 (0.2469) data time 0.0011 (0.0015) model time 0.2390 (0.2453) loss 3.4102 (3.3530) grad_norm 1.7124 (2.1649) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1210/1251] eta 0:00:10 lr 0.000778 wd 0.0500 time 0.2428 (0.2469) data time 0.0008 (0.0015) model time 0.2420 (0.2453) loss 2.5439 (3.3540) grad_norm 1.4596 (2.1631) loss_scale 4096.0000 (2064.9116) mem 7379MB [2024-08-26 11:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1220/1251] eta 0:00:07 lr 0.000778 wd 0.0500 time 0.2320 (0.2468) data time 0.0008 (0.0015) model time 0.2311 (0.2453) loss 4.5346 (3.3530) grad_norm 1.6874 (2.1603) loss_scale 4096.0000 (2081.5463) mem 7379MB [2024-08-26 11:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1230/1251] eta 0:00:05 lr 0.000778 wd 0.0500 time 0.2312 (0.2469) data time 0.0013 (0.0015) model time 0.2299 (0.2454) loss 4.0018 (3.3542) grad_norm 3.0621 (2.1604) loss_scale 4096.0000 (2097.9106) mem 7379MB [2024-08-26 11:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1240/1251] eta 0:00:02 lr 0.000778 wd 0.0500 time 0.4160 (0.2472) data time 0.0007 (0.0015) model time 0.4153 (0.2456) loss 3.7159 (3.3553) grad_norm 1.8286 (2.1603) loss_scale 4096.0000 (2114.0113) mem 7379MB [2024-08-26 11:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [107/300][1250/1251] eta 0:00:00 lr 0.000778 wd 0.0500 time 0.2250 (0.2470) data time 0.0007 (0.0015) model time 0.2243 (0.2455) loss 3.6332 (3.3557) grad_norm 1.7565 (2.1595) loss_scale 4096.0000 (2129.8545) mem 7379MB [2024-08-26 11:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 107 training takes 0:05:08 
[2024-08-26 11:36:13 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 11:36:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 11:36:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.480 (0.480) Loss 0.4897 (0.4897) Acc@1 90.918 (90.918) Acc@5 98.535 (98.535) Mem 7379MB [2024-08-26 11:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.113) Loss 0.8594 (0.7945) Acc@1 84.082 (82.795) Acc@5 95.898 (96.387) Mem 7379MB [2024-08-26 11:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.096) Loss 1.1064 (0.8132) Acc@1 73.438 (81.715) Acc@5 92.383 (96.257) Mem 7379MB [2024-08-26 11:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.090) Loss 1.3926 (0.9227) Acc@1 66.504 (79.202) Acc@5 88.574 (94.793) Mem 7379MB [2024-08-26 11:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.2627 (0.9806) Acc@1 70.605 (77.763) Acc@5 91.016 (94.148) Mem 7379MB [2024-08-26 11:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.438 Acc@5 94.088 [2024-08-26 11:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.4% [2024-08-26 11:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.813 (0.813) Loss 0.4341 (0.4341) Acc@1 92.090 (92.090) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 11:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.150) Loss 0.6948 (0.6804) Acc@1 86.133 (85.387) Acc@5 96.484 (97.115) Mem 7379MB [2024-08-26 11:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.116) Loss 0.9795 (0.7041) Acc@1 76.562 (84.352) Acc@5 94.043 (97.098) Mem 7379MB 
[2024-08-26 11:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.104) Loss 1.2510 (0.8017) Acc@1 68.457 (82.028) Acc@5 90.625 (95.930) Mem 7379MB [2024-08-26 11:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.1113 (0.8523) Acc@1 72.363 (80.602) Acc@5 92.871 (95.398) Mem 7379MB [2024-08-26 11:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.144 Acc@5 95.354 [2024-08-26 11:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.1% [2024-08-26 11:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.14% [2024-08-26 11:36:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 11:36:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 11:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][0/1251] eta 0:14:47 lr 0.000778 wd 0.0500 time 0.7094 (0.7094) data time 0.4873 (0.4873) model time 0.0000 (0.0000) loss 3.4165 (3.4165) grad_norm 2.1208 (2.1208) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][10/1251] eta 0:05:51 lr 0.000778 wd 0.0500 time 0.2400 (0.2833) data time 0.0013 (0.0453) model time 0.0000 (0.0000) loss 3.1650 (3.3138) grad_norm 1.6491 (2.6541) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][20/1251] eta 0:05:32 lr 0.000778 wd 0.0500 time 0.2424 (0.2698) data time 0.0010 (0.0242) model time 0.0000 (0.0000) loss 3.5165 (3.3704) grad_norm 1.9505 (2.6229) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][30/1251] eta 
0:05:18 lr 0.000778 wd 0.0500 time 0.2391 (0.2605) data time 0.0009 (0.0168) model time 0.0000 (0.0000) loss 2.5563 (3.3613) grad_norm 1.6395 (2.4159) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][40/1251] eta 0:05:10 lr 0.000778 wd 0.0500 time 0.2425 (0.2565) data time 0.0007 (0.0136) model time 0.0000 (0.0000) loss 3.4345 (3.3229) grad_norm 1.4648 (2.2793) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][50/1251] eta 0:05:04 lr 0.000778 wd 0.0500 time 0.2331 (0.2533) data time 0.0009 (0.0112) model time 0.0000 (0.0000) loss 3.7967 (3.3004) grad_norm 3.3123 (2.2498) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][60/1251] eta 0:04:59 lr 0.000777 wd 0.0500 time 0.2364 (0.2512) data time 0.0008 (0.0095) model time 0.2356 (0.2395) loss 2.6675 (3.3120) grad_norm 2.0300 (2.2447) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][70/1251] eta 0:04:55 lr 0.000777 wd 0.0500 time 0.2396 (0.2500) data time 0.0008 (0.0083) model time 0.2388 (0.2406) loss 3.1442 (3.3264) grad_norm 2.8938 (2.2753) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][80/1251] eta 0:04:54 lr 0.000777 wd 0.0500 time 0.4495 (0.2514) data time 0.0009 (0.0074) model time 0.4486 (0.2472) loss 3.6463 (3.3493) grad_norm 1.9532 (2.2671) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][90/1251] eta 0:04:50 lr 0.000777 wd 0.0500 time 0.2403 (0.2503) data time 0.0010 (0.0067) model time 0.2393 (0.2455) loss 3.4503 (3.3802) grad_norm 2.5650 (2.2720) loss_scale 4096.0000 (4096.0000) mem 7379MB 
[2024-08-26 11:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][100/1251] eta 0:04:47 lr 0.000777 wd 0.0500 time 0.2408 (0.2496) data time 0.0012 (0.0061) model time 0.2395 (0.2449) loss 3.5897 (3.3943) grad_norm 1.7640 (2.2403) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][110/1251] eta 0:04:44 lr 0.000777 wd 0.0500 time 0.2545 (0.2496) data time 0.0007 (0.0059) model time 0.2538 (0.2450) loss 3.9322 (3.4042) grad_norm 2.3990 (2.2212) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][120/1251] eta 0:04:41 lr 0.000777 wd 0.0500 time 0.2448 (0.2492) data time 0.0007 (0.0055) model time 0.2441 (0.2448) loss 2.8709 (3.4073) grad_norm 1.7578 (2.2021) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][130/1251] eta 0:04:38 lr 0.000777 wd 0.0500 time 0.2366 (0.2486) data time 0.0008 (0.0052) model time 0.2358 (0.2443) loss 3.5496 (3.4101) grad_norm 2.3356 (2.1976) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][140/1251] eta 0:04:37 lr 0.000777 wd 0.0500 time 0.2568 (0.2494) data time 0.0009 (0.0049) model time 0.2560 (0.2459) loss 3.2868 (3.3973) grad_norm 1.5393 (2.2027) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][150/1251] eta 0:04:36 lr 0.000777 wd 0.0500 time 0.2416 (0.2509) data time 0.0010 (0.0046) model time 0.2405 (0.2484) loss 3.1080 (3.3921) grad_norm 2.6496 (2.2091) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][160/1251] eta 0:04:33 lr 0.000777 wd 0.0500 time 0.2372 (0.2504) data time 0.0008 (0.0045) model time 0.2364 
(0.2477) loss 3.8987 (3.4000) grad_norm 2.5191 (2.2182) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][170/1251] eta 0:04:30 lr 0.000777 wd 0.0500 time 0.2395 (0.2498) data time 0.0011 (0.0043) model time 0.2383 (0.2471) loss 3.7171 (3.4009) grad_norm 2.4258 (2.2170) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][180/1251] eta 0:04:27 lr 0.000777 wd 0.0500 time 0.2510 (0.2495) data time 0.0008 (0.0041) model time 0.2501 (0.2467) loss 3.5026 (3.4042) grad_norm 1.3584 (2.2081) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][190/1251] eta 0:04:24 lr 0.000777 wd 0.0500 time 0.2417 (0.2491) data time 0.0007 (0.0039) model time 0.2409 (0.2462) loss 3.0345 (3.3975) grad_norm 1.8237 (2.2020) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][200/1251] eta 0:04:21 lr 0.000777 wd 0.0500 time 0.2420 (0.2489) data time 0.0008 (0.0038) model time 0.2412 (0.2461) loss 3.7279 (3.4101) grad_norm 2.4235 (2.2069) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][210/1251] eta 0:04:18 lr 0.000777 wd 0.0500 time 0.2419 (0.2486) data time 0.0008 (0.0037) model time 0.2411 (0.2458) loss 3.9788 (3.4131) grad_norm 1.8919 (2.1975) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][220/1251] eta 0:04:16 lr 0.000777 wd 0.0500 time 0.2461 (0.2485) data time 0.0008 (0.0035) model time 0.2453 (0.2458) loss 2.7223 (3.4153) grad_norm 2.0294 (2.1902) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[108/300][230/1251] eta 0:04:13 lr 0.000777 wd 0.0500 time 0.2489 (0.2482) data time 0.0007 (0.0034) model time 0.2481 (0.2456) loss 3.5860 (3.4172) grad_norm 1.9958 (2.1820) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][240/1251] eta 0:04:10 lr 0.000777 wd 0.0500 time 0.2434 (0.2480) data time 0.0011 (0.0033) model time 0.2423 (0.2453) loss 3.7406 (3.4152) grad_norm 1.9192 (2.1847) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][250/1251] eta 0:04:08 lr 0.000777 wd 0.0500 time 0.4470 (0.2485) data time 0.0008 (0.0032) model time 0.4463 (0.2461) loss 2.8702 (3.4110) grad_norm 1.9540 (2.1766) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][260/1251] eta 0:04:07 lr 0.000777 wd 0.0500 time 0.2474 (0.2498) data time 0.0007 (0.0032) model time 0.2467 (0.2478) loss 4.0182 (3.4142) grad_norm 1.3613 (inf) loss_scale 2048.0000 (4080.3065) mem 7379MB [2024-08-26 11:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][270/1251] eta 0:04:04 lr 0.000777 wd 0.0500 time 0.2499 (0.2496) data time 0.0009 (0.0031) model time 0.2490 (0.2475) loss 3.6703 (3.4173) grad_norm 1.7413 (inf) loss_scale 2048.0000 (4005.3137) mem 7379MB [2024-08-26 11:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][280/1251] eta 0:04:02 lr 0.000777 wd 0.0500 time 0.2438 (0.2493) data time 0.0013 (0.0030) model time 0.2425 (0.2472) loss 3.6104 (3.4165) grad_norm 3.4416 (inf) loss_scale 2048.0000 (3935.6584) mem 7379MB [2024-08-26 11:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][290/1251] eta 0:03:59 lr 0.000777 wd 0.0500 time 0.2384 (0.2495) data time 0.0009 (0.0029) model time 0.2375 (0.2475) loss 3.3961 (3.4235) grad_norm 1.8607 (inf) loss_scale 2048.0000 (3870.7904) mem 
7379MB [2024-08-26 11:37:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][300/1251] eta 0:03:57 lr 0.000777 wd 0.0500 time 0.2507 (0.2493) data time 0.0011 (0.0029) model time 0.2496 (0.2473) loss 3.5860 (3.4281) grad_norm 2.0032 (inf) loss_scale 2048.0000 (3810.2326) mem 7379MB [2024-08-26 11:37:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][310/1251] eta 0:03:54 lr 0.000777 wd 0.0500 time 0.2452 (0.2492) data time 0.0009 (0.0028) model time 0.2443 (0.2472) loss 3.1846 (3.4243) grad_norm 3.1047 (inf) loss_scale 2048.0000 (3753.5691) mem 7379MB [2024-08-26 11:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][320/1251] eta 0:03:51 lr 0.000777 wd 0.0500 time 0.2377 (0.2490) data time 0.0013 (0.0028) model time 0.2364 (0.2470) loss 2.5846 (3.4172) grad_norm 1.6797 (inf) loss_scale 2048.0000 (3700.4361) mem 7379MB [2024-08-26 11:37:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][330/1251] eta 0:03:49 lr 0.000776 wd 0.0500 time 0.2391 (0.2488) data time 0.0007 (0.0027) model time 0.2384 (0.2468) loss 3.7861 (3.4213) grad_norm 2.3180 (inf) loss_scale 2048.0000 (3650.5136) mem 7379MB [2024-08-26 11:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][340/1251] eta 0:03:46 lr 0.000776 wd 0.0500 time 0.2419 (0.2491) data time 0.0008 (0.0027) model time 0.2411 (0.2472) loss 2.0760 (3.4174) grad_norm 1.9541 (inf) loss_scale 2048.0000 (3603.5191) mem 7379MB [2024-08-26 11:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][350/1251] eta 0:03:44 lr 0.000776 wd 0.0500 time 0.2398 (0.2489) data time 0.0012 (0.0026) model time 0.2386 (0.2470) loss 3.0341 (3.4204) grad_norm 1.7836 (inf) loss_scale 2048.0000 (3559.2023) mem 7379MB [2024-08-26 11:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][360/1251] eta 0:03:41 lr 0.000776 wd 0.0500 time 0.2385 (0.2487) data time 0.0012 (0.0026) model time 0.2373 (0.2468) 
loss 3.7538 (3.4221) grad_norm 1.8945 (inf) loss_scale 2048.0000 (3517.3407) mem 7379MB [2024-08-26 11:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][370/1251] eta 0:03:38 lr 0.000776 wd 0.0500 time 0.2346 (0.2485) data time 0.0010 (0.0025) model time 0.2336 (0.2465) loss 3.3260 (3.4292) grad_norm 1.8519 (inf) loss_scale 2048.0000 (3477.7358) mem 7379MB [2024-08-26 11:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][380/1251] eta 0:03:36 lr 0.000776 wd 0.0500 time 0.2429 (0.2483) data time 0.0010 (0.0025) model time 0.2419 (0.2464) loss 3.0814 (3.4260) grad_norm 2.2189 (inf) loss_scale 2048.0000 (3440.2100) mem 7379MB [2024-08-26 11:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][390/1251] eta 0:03:33 lr 0.000776 wd 0.0500 time 0.2412 (0.2481) data time 0.0010 (0.0025) model time 0.2402 (0.2462) loss 3.7372 (3.4261) grad_norm 1.6791 (inf) loss_scale 2048.0000 (3404.6036) mem 7379MB [2024-08-26 11:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][400/1251] eta 0:03:31 lr 0.000776 wd 0.0500 time 0.2433 (0.2480) data time 0.0011 (0.0024) model time 0.2421 (0.2460) loss 3.0733 (3.4305) grad_norm 2.7402 (inf) loss_scale 2048.0000 (3370.7731) mem 7379MB [2024-08-26 11:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][410/1251] eta 0:03:28 lr 0.000776 wd 0.0500 time 0.2396 (0.2478) data time 0.0009 (0.0024) model time 0.2387 (0.2459) loss 2.7506 (3.4212) grad_norm 2.6953 (inf) loss_scale 2048.0000 (3338.5888) mem 7379MB [2024-08-26 11:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][420/1251] eta 0:03:25 lr 0.000776 wd 0.0500 time 0.2460 (0.2477) data time 0.0007 (0.0024) model time 0.2453 (0.2458) loss 2.1274 (3.4158) grad_norm 2.2503 (inf) loss_scale 2048.0000 (3307.9335) mem 7379MB [2024-08-26 11:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][430/1251] eta 0:03:23 lr 0.000776 wd 
0.0500 time 0.2363 (0.2476) data time 0.0007 (0.0023) model time 0.2356 (0.2457) loss 3.5725 (3.4123) grad_norm 1.8678 (inf) loss_scale 2048.0000 (3278.7007) mem 7379MB [2024-08-26 11:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][440/1251] eta 0:03:20 lr 0.000776 wd 0.0500 time 0.2383 (0.2475) data time 0.0011 (0.0023) model time 0.2372 (0.2456) loss 3.4493 (3.4123) grad_norm 2.0491 (inf) loss_scale 2048.0000 (3250.7937) mem 7379MB [2024-08-26 11:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][450/1251] eta 0:03:18 lr 0.000776 wd 0.0500 time 0.2374 (0.2473) data time 0.0007 (0.0023) model time 0.2366 (0.2454) loss 4.0673 (3.4120) grad_norm 2.1857 (inf) loss_scale 2048.0000 (3224.1242) mem 7379MB [2024-08-26 11:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][460/1251] eta 0:03:15 lr 0.000776 wd 0.0500 time 0.2403 (0.2473) data time 0.0009 (0.0022) model time 0.2395 (0.2454) loss 2.1946 (3.4135) grad_norm 2.1066 (inf) loss_scale 2048.0000 (3198.6117) mem 7379MB [2024-08-26 11:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][470/1251] eta 0:03:12 lr 0.000776 wd 0.0500 time 0.2406 (0.2471) data time 0.0007 (0.0022) model time 0.2399 (0.2452) loss 3.5390 (3.4141) grad_norm 2.5177 (inf) loss_scale 2048.0000 (3174.1826) mem 7379MB [2024-08-26 11:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][480/1251] eta 0:03:10 lr 0.000776 wd 0.0500 time 0.2322 (0.2469) data time 0.0011 (0.0022) model time 0.2311 (0.2451) loss 3.3847 (3.4158) grad_norm 2.4042 (inf) loss_scale 2048.0000 (3150.7692) mem 7379MB [2024-08-26 11:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][490/1251] eta 0:03:07 lr 0.000776 wd 0.0500 time 0.2435 (0.2469) data time 0.0008 (0.0022) model time 0.2427 (0.2450) loss 3.6631 (3.4109) grad_norm 1.6715 (inf) loss_scale 2048.0000 (3128.3096) mem 7379MB [2024-08-26 11:38:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [108/300][500/1251] eta 0:03:05 lr 0.000776 wd 0.0500 time 0.2413 (0.2471) data time 0.0009 (0.0022) model time 0.2404 (0.2453) loss 2.3347 (3.4102) grad_norm 1.5764 (inf) loss_scale 2048.0000 (3106.7465) mem 7379MB [2024-08-26 11:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][510/1251] eta 0:03:03 lr 0.000776 wd 0.0500 time 0.2426 (0.2473) data time 0.0012 (0.0021) model time 0.2414 (0.2455) loss 3.2588 (3.4061) grad_norm 2.1816 (inf) loss_scale 2048.0000 (3086.0274) mem 7379MB [2024-08-26 11:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][520/1251] eta 0:03:00 lr 0.000776 wd 0.0500 time 0.2513 (0.2472) data time 0.0010 (0.0021) model time 0.2503 (0.2454) loss 3.6860 (3.4032) grad_norm 2.7101 (inf) loss_scale 2048.0000 (3066.1036) mem 7379MB [2024-08-26 11:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][530/1251] eta 0:02:58 lr 0.000776 wd 0.0500 time 0.2425 (0.2471) data time 0.0009 (0.0021) model time 0.2415 (0.2453) loss 3.0675 (3.4031) grad_norm 1.9766 (inf) loss_scale 2048.0000 (3046.9303) mem 7379MB [2024-08-26 11:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][540/1251] eta 0:02:55 lr 0.000776 wd 0.0500 time 0.2460 (0.2474) data time 0.0010 (0.0021) model time 0.2450 (0.2456) loss 3.7128 (3.3991) grad_norm 2.2950 (inf) loss_scale 2048.0000 (3028.4658) mem 7379MB [2024-08-26 11:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][550/1251] eta 0:02:53 lr 0.000776 wd 0.0500 time 0.2396 (0.2476) data time 0.0012 (0.0020) model time 0.2384 (0.2459) loss 3.2909 (3.3967) grad_norm 1.9194 (inf) loss_scale 2048.0000 (3010.6715) mem 7379MB [2024-08-26 11:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][560/1251] eta 0:02:51 lr 0.000776 wd 0.0500 time 0.2375 (0.2475) data time 0.0009 (0.0020) model time 0.2365 (0.2458) loss 3.0707 (3.3947) grad_norm 1.8466 (inf) 
loss_scale 2048.0000 (2993.5116) mem 7379MB [2024-08-26 11:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][570/1251] eta 0:02:48 lr 0.000776 wd 0.0500 time 0.2440 (0.2474) data time 0.0007 (0.0020) model time 0.2433 (0.2458) loss 2.4749 (3.3944) grad_norm 2.0054 (inf) loss_scale 2048.0000 (2976.9527) mem 7379MB [2024-08-26 11:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][580/1251] eta 0:02:45 lr 0.000776 wd 0.0500 time 0.2474 (0.2473) data time 0.0010 (0.0020) model time 0.2464 (0.2456) loss 3.0684 (3.3933) grad_norm 1.9863 (inf) loss_scale 2048.0000 (2960.9639) mem 7379MB [2024-08-26 11:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][590/1251] eta 0:02:43 lr 0.000776 wd 0.0500 time 0.2417 (0.2472) data time 0.0008 (0.0020) model time 0.2410 (0.2456) loss 2.1951 (3.3909) grad_norm 1.6010 (inf) loss_scale 2048.0000 (2945.5161) mem 7379MB [2024-08-26 11:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][600/1251] eta 0:02:40 lr 0.000775 wd 0.0500 time 0.2446 (0.2472) data time 0.0010 (0.0020) model time 0.2436 (0.2455) loss 3.2437 (3.3917) grad_norm 1.6964 (inf) loss_scale 2048.0000 (2930.5824) mem 7379MB [2024-08-26 11:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][610/1251] eta 0:02:38 lr 0.000775 wd 0.0500 time 0.2413 (0.2471) data time 0.0011 (0.0020) model time 0.2401 (0.2454) loss 2.9920 (3.3882) grad_norm 1.9459 (inf) loss_scale 2048.0000 (2916.1375) mem 7379MB [2024-08-26 11:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][620/1251] eta 0:02:35 lr 0.000775 wd 0.0500 time 0.2524 (0.2470) data time 0.0011 (0.0019) model time 0.2513 (0.2453) loss 3.9192 (3.3880) grad_norm 2.3043 (inf) loss_scale 2048.0000 (2902.1578) mem 7379MB [2024-08-26 11:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][630/1251] eta 0:02:33 lr 0.000775 wd 0.0500 time 0.2367 (0.2472) data time 0.0010 
(0.0019) model time 0.2357 (0.2456) loss 3.8700 (3.3863) grad_norm 3.6446 (inf) loss_scale 2048.0000 (2888.6212) mem 7379MB [2024-08-26 11:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][640/1251] eta 0:02:30 lr 0.000775 wd 0.0500 time 0.2476 (0.2471) data time 0.0010 (0.0019) model time 0.2466 (0.2455) loss 3.6346 (3.3823) grad_norm 1.6241 (inf) loss_scale 2048.0000 (2875.5070) mem 7379MB [2024-08-26 11:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][650/1251] eta 0:02:28 lr 0.000775 wd 0.0500 time 0.2356 (0.2470) data time 0.0010 (0.0019) model time 0.2346 (0.2454) loss 3.8067 (3.3790) grad_norm 1.5281 (inf) loss_scale 2048.0000 (2862.7957) mem 7379MB [2024-08-26 11:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][660/1251] eta 0:02:25 lr 0.000775 wd 0.0500 time 0.2434 (0.2469) data time 0.0010 (0.0019) model time 0.2424 (0.2453) loss 3.9399 (3.3795) grad_norm 2.6392 (inf) loss_scale 2048.0000 (2850.4690) mem 7379MB [2024-08-26 11:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][670/1251] eta 0:02:23 lr 0.000775 wd 0.0500 time 0.2480 (0.2475) data time 0.0011 (0.0019) model time 0.2468 (0.2459) loss 3.5972 (3.3790) grad_norm 2.9095 (inf) loss_scale 2048.0000 (2838.5097) mem 7379MB [2024-08-26 11:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][680/1251] eta 0:02:21 lr 0.000775 wd 0.0500 time 0.2447 (0.2477) data time 0.0008 (0.0019) model time 0.2440 (0.2461) loss 3.7558 (3.3767) grad_norm 1.4690 (inf) loss_scale 2048.0000 (2826.9016) mem 7379MB [2024-08-26 11:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][690/1251] eta 0:02:18 lr 0.000775 wd 0.0500 time 0.2409 (0.2476) data time 0.0010 (0.0019) model time 0.2399 (0.2460) loss 3.7482 (3.3771) grad_norm 1.7842 (inf) loss_scale 2048.0000 (2815.6295) mem 7379MB [2024-08-26 11:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[108/300][700/1251] eta 0:02:16 lr 0.000775 wd 0.0500 time 0.2435 (0.2475) data time 0.0008 (0.0018) model time 0.2427 (0.2459) loss 3.6586 (3.3750) grad_norm 2.2467 (inf) loss_scale 2048.0000 (2804.6790) mem 7379MB [2024-08-26 11:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][710/1251] eta 0:02:13 lr 0.000775 wd 0.0500 time 0.2428 (0.2474) data time 0.0007 (0.0018) model time 0.2421 (0.2459) loss 3.1756 (3.3746) grad_norm 2.7436 (inf) loss_scale 2048.0000 (2794.0366) mem 7379MB [2024-08-26 11:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][720/1251] eta 0:02:11 lr 0.000775 wd 0.0500 time 0.2442 (0.2474) data time 0.0008 (0.0018) model time 0.2433 (0.2458) loss 2.8424 (3.3762) grad_norm 4.3022 (inf) loss_scale 2048.0000 (2783.6893) mem 7379MB [2024-08-26 11:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][730/1251] eta 0:02:08 lr 0.000775 wd 0.0500 time 0.2458 (0.2473) data time 0.0010 (0.0018) model time 0.2448 (0.2457) loss 3.4634 (3.3744) grad_norm 2.6680 (inf) loss_scale 2048.0000 (2773.6252) mem 7379MB [2024-08-26 11:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][740/1251] eta 0:02:06 lr 0.000775 wd 0.0500 time 0.2406 (0.2472) data time 0.0011 (0.0018) model time 0.2396 (0.2457) loss 3.9262 (3.3756) grad_norm 2.3629 (inf) loss_scale 2048.0000 (2763.8327) mem 7379MB [2024-08-26 11:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][750/1251] eta 0:02:03 lr 0.000775 wd 0.0500 time 0.2368 (0.2471) data time 0.0007 (0.0018) model time 0.2360 (0.2456) loss 3.2140 (3.3763) grad_norm 1.8518 (inf) loss_scale 2048.0000 (2754.3009) mem 7379MB [2024-08-26 11:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][760/1251] eta 0:02:01 lr 0.000775 wd 0.0500 time 0.2496 (0.2471) data time 0.0011 (0.0018) model time 0.2485 (0.2455) loss 3.8538 (3.3762) grad_norm 2.4670 (inf) loss_scale 2048.0000 (2745.0197) mem 7379MB 
[2024-08-26 11:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][770/1251] eta 0:01:58 lr 0.000775 wd 0.0500 time 0.2323 (0.2470) data time 0.0009 (0.0018) model time 0.2314 (0.2454) loss 2.7754 (3.3767) grad_norm 1.7483 (inf) loss_scale 2048.0000 (2735.9792) mem 7379MB [2024-08-26 11:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][780/1251] eta 0:01:56 lr 0.000775 wd 0.0500 time 0.2328 (0.2471) data time 0.0011 (0.0018) model time 0.2317 (0.2456) loss 3.3635 (3.3780) grad_norm 1.5161 (inf) loss_scale 2048.0000 (2727.1703) mem 7379MB [2024-08-26 11:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][790/1251] eta 0:01:53 lr 0.000775 wd 0.0500 time 0.2380 (0.2471) data time 0.0010 (0.0018) model time 0.2370 (0.2455) loss 2.5074 (3.3780) grad_norm 1.7217 (inf) loss_scale 2048.0000 (2718.5841) mem 7379MB [2024-08-26 11:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][800/1251] eta 0:01:51 lr 0.000775 wd 0.0500 time 0.2426 (0.2470) data time 0.0010 (0.0017) model time 0.2416 (0.2455) loss 3.8022 (3.3768) grad_norm 1.8354 (inf) loss_scale 2048.0000 (2710.2122) mem 7379MB [2024-08-26 11:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][810/1251] eta 0:01:48 lr 0.000775 wd 0.0500 time 0.2438 (0.2470) data time 0.0012 (0.0017) model time 0.2427 (0.2454) loss 3.5365 (3.3780) grad_norm 2.0210 (inf) loss_scale 2048.0000 (2702.0469) mem 7379MB [2024-08-26 11:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][820/1251] eta 0:01:46 lr 0.000775 wd 0.0500 time 0.2485 (0.2472) data time 0.0007 (0.0017) model time 0.2478 (0.2457) loss 3.7712 (3.3808) grad_norm 2.9059 (inf) loss_scale 2048.0000 (2694.0804) mem 7379MB [2024-08-26 11:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][830/1251] eta 0:01:44 lr 0.000775 wd 0.0500 time 0.2475 (0.2471) data time 0.0010 (0.0017) model time 0.2466 (0.2456) loss 
3.5902 (3.3785) grad_norm 2.2103 (inf) loss_scale 2048.0000 (2686.3057) mem 7379MB [2024-08-26 11:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][840/1251] eta 0:01:41 lr 0.000775 wd 0.0500 time 0.2409 (0.2470) data time 0.0007 (0.0017) model time 0.2402 (0.2455) loss 3.5462 (3.3774) grad_norm 2.5525 (inf) loss_scale 2048.0000 (2678.7158) mem 7379MB [2024-08-26 11:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][850/1251] eta 0:01:39 lr 0.000775 wd 0.0500 time 0.2433 (0.2470) data time 0.0010 (0.0017) model time 0.2423 (0.2455) loss 4.1848 (3.3801) grad_norm 2.0173 (inf) loss_scale 2048.0000 (2671.3043) mem 7379MB [2024-08-26 11:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][860/1251] eta 0:01:36 lr 0.000774 wd 0.0500 time 0.2460 (0.2469) data time 0.0011 (0.0017) model time 0.2449 (0.2454) loss 3.7585 (3.3819) grad_norm 2.0471 (inf) loss_scale 2048.0000 (2664.0650) mem 7379MB [2024-08-26 11:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][870/1251] eta 0:01:34 lr 0.000774 wd 0.0500 time 0.2413 (0.2469) data time 0.0008 (0.0017) model time 0.2405 (0.2454) loss 4.3780 (3.3844) grad_norm 1.4855 (inf) loss_scale 2048.0000 (2656.9920) mem 7379MB [2024-08-26 11:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][880/1251] eta 0:01:31 lr 0.000774 wd 0.0500 time 0.2440 (0.2468) data time 0.0008 (0.0017) model time 0.2432 (0.2453) loss 3.8349 (3.3815) grad_norm 2.2725 (inf) loss_scale 2048.0000 (2650.0795) mem 7379MB [2024-08-26 11:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][890/1251] eta 0:01:29 lr 0.000774 wd 0.0500 time 0.2441 (0.2468) data time 0.0008 (0.0017) model time 0.2434 (0.2453) loss 3.0872 (3.3802) grad_norm 1.7206 (inf) loss_scale 2048.0000 (2643.3221) mem 7379MB [2024-08-26 11:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][900/1251] eta 0:01:26 lr 0.000774 wd 0.0500 
time 0.2404 (0.2467) data time 0.0009 (0.0017) model time 0.2395 (0.2452) loss 3.9603 (3.3803) grad_norm 2.1030 (inf) loss_scale 2048.0000 (2636.7148) mem 7379MB [2024-08-26 11:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][910/1251] eta 0:01:24 lr 0.000774 wd 0.0500 time 0.2454 (0.2467) data time 0.0008 (0.0017) model time 0.2446 (0.2452) loss 2.6851 (3.3825) grad_norm 2.2020 (inf) loss_scale 2048.0000 (2630.2525) mem 7379MB [2024-08-26 11:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][920/1251] eta 0:01:21 lr 0.000774 wd 0.0500 time 0.2403 (0.2467) data time 0.0013 (0.0017) model time 0.2390 (0.2452) loss 3.3788 (3.3832) grad_norm 1.9825 (inf) loss_scale 2048.0000 (2623.9305) mem 7379MB [2024-08-26 11:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][930/1251] eta 0:01:19 lr 0.000774 wd 0.0500 time 0.2447 (0.2467) data time 0.0008 (0.0017) model time 0.2439 (0.2452) loss 2.7999 (3.3798) grad_norm 2.3904 (inf) loss_scale 2048.0000 (2617.7444) mem 7379MB [2024-08-26 11:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][940/1251] eta 0:01:16 lr 0.000774 wd 0.0500 time 0.2413 (0.2467) data time 0.0011 (0.0017) model time 0.2402 (0.2452) loss 3.8033 (3.3779) grad_norm 1.8299 (inf) loss_scale 2048.0000 (2611.6897) mem 7379MB [2024-08-26 11:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][950/1251] eta 0:01:14 lr 0.000774 wd 0.0500 time 0.2392 (0.2466) data time 0.0011 (0.0017) model time 0.2380 (0.2451) loss 3.6471 (3.3746) grad_norm 1.7012 (inf) loss_scale 2048.0000 (2605.7624) mem 7379MB [2024-08-26 11:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][960/1251] eta 0:01:11 lr 0.000774 wd 0.0500 time 0.2396 (0.2466) data time 0.0007 (0.0017) model time 0.2389 (0.2451) loss 4.3512 (3.3771) grad_norm 1.6683 (inf) loss_scale 2048.0000 (2599.9584) mem 7379MB [2024-08-26 11:40:22 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [108/300][970/1251] eta 0:01:09 lr 0.000774 wd 0.0500 time 0.2432 (0.2465) data time 0.0010 (0.0017) model time 0.2422 (0.2450) loss 3.6542 (3.3761) grad_norm 1.7184 (inf) loss_scale 2048.0000 (2594.2739) mem 7379MB [2024-08-26 11:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][980/1251] eta 0:01:06 lr 0.000774 wd 0.0500 time 0.2411 (0.2465) data time 0.0009 (0.0017) model time 0.2402 (0.2450) loss 2.4734 (3.3758) grad_norm 1.9145 (inf) loss_scale 2048.0000 (2588.7054) mem 7379MB [2024-08-26 11:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][990/1251] eta 0:01:04 lr 0.000774 wd 0.0500 time 0.2468 (0.2465) data time 0.0011 (0.0016) model time 0.2457 (0.2450) loss 3.0204 (3.3745) grad_norm 1.7755 (inf) loss_scale 2048.0000 (2583.2492) mem 7379MB [2024-08-26 11:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1000/1251] eta 0:01:01 lr 0.000774 wd 0.0500 time 0.2417 (0.2464) data time 0.0009 (0.0016) model time 0.2408 (0.2449) loss 3.7171 (3.3745) grad_norm 2.6797 (inf) loss_scale 2048.0000 (2577.9021) mem 7379MB [2024-08-26 11:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1010/1251] eta 0:00:59 lr 0.000774 wd 0.0500 time 0.4497 (0.2466) data time 0.0009 (0.0016) model time 0.4488 (0.2451) loss 3.5200 (3.3745) grad_norm 3.6394 (inf) loss_scale 2048.0000 (2572.6607) mem 7379MB [2024-08-26 11:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1020/1251] eta 0:00:57 lr 0.000774 wd 0.0500 time 0.4736 (0.2468) data time 0.0011 (0.0016) model time 0.4725 (0.2453) loss 3.5298 (3.3748) grad_norm 1.8573 (inf) loss_scale 2048.0000 (2567.5220) mem 7379MB [2024-08-26 11:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1030/1251] eta 0:00:54 lr 0.000774 wd 0.0500 time 0.2375 (0.2470) data time 0.0012 (0.0016) model time 0.2363 (0.2455) loss 3.9330 (3.3757) grad_norm 2.3165 (inf) 
loss_scale 2048.0000 (2562.4830) mem 7379MB [2024-08-26 11:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1040/1251] eta 0:00:52 lr 0.000774 wd 0.0500 time 0.2505 (0.2469) data time 0.0010 (0.0016) model time 0.2495 (0.2455) loss 3.6864 (3.3742) grad_norm 3.7811 (inf) loss_scale 2048.0000 (2557.5408) mem 7379MB [2024-08-26 11:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1050/1251] eta 0:00:49 lr 0.000774 wd 0.0500 time 0.2406 (0.2469) data time 0.0007 (0.0016) model time 0.2399 (0.2454) loss 3.9776 (3.3743) grad_norm 2.1653 (inf) loss_scale 2048.0000 (2552.6927) mem 7379MB [2024-08-26 11:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1060/1251] eta 0:00:47 lr 0.000774 wd 0.0500 time 0.2529 (0.2470) data time 0.0011 (0.0016) model time 0.2519 (0.2455) loss 3.4423 (3.3733) grad_norm 1.8722 (inf) loss_scale 2048.0000 (2547.9359) mem 7379MB [2024-08-26 11:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1070/1251] eta 0:00:44 lr 0.000774 wd 0.0500 time 0.2401 (0.2469) data time 0.0012 (0.0016) model time 0.2389 (0.2455) loss 3.9507 (3.3713) grad_norm 2.4919 (inf) loss_scale 2048.0000 (2543.2680) mem 7379MB [2024-08-26 11:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1080/1251] eta 0:00:42 lr 0.000774 wd 0.0500 time 0.2340 (0.2469) data time 0.0010 (0.0016) model time 0.2330 (0.2454) loss 3.6680 (3.3700) grad_norm 2.2685 (inf) loss_scale 2048.0000 (2538.6864) mem 7379MB [2024-08-26 11:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1090/1251] eta 0:00:39 lr 0.000774 wd 0.0500 time 0.2476 (0.2468) data time 0.0008 (0.0016) model time 0.2469 (0.2454) loss 3.2632 (3.3699) grad_norm 2.7886 (inf) loss_scale 2048.0000 (2534.1888) mem 7379MB [2024-08-26 11:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1100/1251] eta 0:00:37 lr 0.000774 wd 0.0500 time 0.2334 (0.2468) data time 
0.0010 (0.0016) model time 0.2324 (0.2453) loss 2.1406 (3.3712) grad_norm 1.7082 (inf) loss_scale 2048.0000 (2529.7729) mem 7379MB [2024-08-26 11:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1110/1251] eta 0:00:34 lr 0.000774 wd 0.0500 time 0.2433 (0.2467) data time 0.0012 (0.0016) model time 0.2421 (0.2453) loss 3.3633 (3.3687) grad_norm 2.1995 (inf) loss_scale 2048.0000 (2525.4365) mem 7379MB [2024-08-26 11:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1120/1251] eta 0:00:32 lr 0.000774 wd 0.0500 time 0.2315 (0.2467) data time 0.0009 (0.0016) model time 0.2307 (0.2452) loss 3.4832 (3.3679) grad_norm 1.8741 (inf) loss_scale 2048.0000 (2521.1775) mem 7379MB [2024-08-26 11:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1130/1251] eta 0:00:29 lr 0.000773 wd 0.0500 time 0.2398 (0.2466) data time 0.0007 (0.0016) model time 0.2391 (0.2452) loss 4.0438 (3.3682) grad_norm 2.7408 (inf) loss_scale 2048.0000 (2516.9938) mem 7379MB [2024-08-26 11:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1140/1251] eta 0:00:27 lr 0.000773 wd 0.0500 time 0.2402 (0.2466) data time 0.0007 (0.0016) model time 0.2395 (0.2451) loss 3.5355 (3.3666) grad_norm 2.8379 (inf) loss_scale 2048.0000 (2512.8834) mem 7379MB [2024-08-26 11:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1150/1251] eta 0:00:24 lr 0.000773 wd 0.0500 time 0.2411 (0.2465) data time 0.0009 (0.0016) model time 0.2402 (0.2451) loss 3.9771 (3.3678) grad_norm 1.7464 (inf) loss_scale 2048.0000 (2508.8445) mem 7379MB [2024-08-26 11:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1160/1251] eta 0:00:22 lr 0.000773 wd 0.0500 time 0.2440 (0.2465) data time 0.0010 (0.0016) model time 0.2430 (0.2451) loss 3.0055 (3.3689) grad_norm 1.7585 (inf) loss_scale 2048.0000 (2504.8751) mem 7379MB [2024-08-26 11:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[108/300][1170/1251] eta 0:00:19 lr 0.000773 wd 0.0500 time 0.2378 (0.2466) data time 0.0011 (0.0016) model time 0.2368 (0.2452) loss 3.3897 (3.3688) grad_norm 1.9532 (inf) loss_scale 2048.0000 (2500.9735) mem 7379MB [2024-08-26 11:41:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1180/1251] eta 0:00:17 lr 0.000773 wd 0.0500 time 0.4501 (0.2468) data time 0.0010 (0.0016) model time 0.4491 (0.2453) loss 3.4702 (3.3684) grad_norm 2.9521 (inf) loss_scale 2048.0000 (2497.1380) mem 7379MB [2024-08-26 11:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1190/1251] eta 0:00:15 lr 0.000773 wd 0.0500 time 0.2444 (0.2469) data time 0.0010 (0.0016) model time 0.2434 (0.2455) loss 3.6395 (3.3672) grad_norm 2.0046 (inf) loss_scale 2048.0000 (2493.3669) mem 7379MB [2024-08-26 11:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1200/1251] eta 0:00:12 lr 0.000773 wd 0.0500 time 0.2415 (0.2469) data time 0.0011 (0.0016) model time 0.2404 (0.2454) loss 2.7422 (3.3662) grad_norm 1.6709 (inf) loss_scale 2048.0000 (2489.6586) mem 7379MB [2024-08-26 11:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1210/1251] eta 0:00:10 lr 0.000773 wd 0.0500 time 0.2384 (0.2468) data time 0.0011 (0.0016) model time 0.2374 (0.2454) loss 2.6011 (3.3651) grad_norm 2.1848 (inf) loss_scale 2048.0000 (2486.0116) mem 7379MB [2024-08-26 11:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1220/1251] eta 0:00:07 lr 0.000773 wd 0.0500 time 0.2395 (0.2468) data time 0.0007 (0.0016) model time 0.2388 (0.2454) loss 3.6923 (3.3671) grad_norm 1.6396 (inf) loss_scale 2048.0000 (2482.4242) mem 7379MB [2024-08-26 11:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1230/1251] eta 0:00:05 lr 0.000773 wd 0.0500 time 0.2357 (0.2467) data time 0.0010 (0.0016) model time 0.2347 (0.2453) loss 4.1758 (3.3696) grad_norm 2.1947 (inf) loss_scale 2048.0000 (2478.8952) mem 
7379MB [2024-08-26 11:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1240/1251] eta 0:00:02 lr 0.000773 wd 0.0500 time 0.2274 (0.2466) data time 0.0007 (0.0016) model time 0.2267 (0.2452) loss 3.2611 (3.3716) grad_norm 2.5798 (inf) loss_scale 2048.0000 (2475.4230) mem 7379MB [2024-08-26 11:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [108/300][1250/1251] eta 0:00:00 lr 0.000773 wd 0.0500 time 0.2257 (0.2465) data time 0.0007 (0.0015) model time 0.2250 (0.2451) loss 3.8615 (3.3736) grad_norm 2.6520 (inf) loss_scale 2048.0000 (2472.0064) mem 7379MB [2024-08-26 11:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 108 training takes 0:05:08 [2024-08-26 11:41:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 11:41:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 11:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.417 (0.417) Loss 0.5244 (0.5244) Acc@1 90.234 (90.234) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 11:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.107) Loss 0.8193 (0.7845) Acc@1 83.301 (83.398) Acc@5 95.703 (96.458) Mem 7379MB [2024-08-26 11:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.095) Loss 1.1553 (0.8068) Acc@1 73.828 (82.301) Acc@5 92.676 (96.470) Mem 7379MB [2024-08-26 11:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.090) Loss 1.3926 (0.9215) Acc@1 66.895 (79.580) Acc@5 89.062 (95.108) Mem 7379MB [2024-08-26 11:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.2852 (0.9851) Acc@1 70.312 (78.058) Acc@5 90.918 (94.341) Mem 7379MB [2024-08-26 11:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.612 Acc@5 94.254 [2024-08-26 11:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.6% [2024-08-26 11:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.61% [2024-08-26 11:41:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 11:41:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 11:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.422 (0.422) Loss 0.4333 (0.4333) Acc@1 91.895 (91.895) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 11:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.110) Loss 0.6938 (0.6792) Acc@1 86.426 (85.476) Acc@5 96.582 (97.124) Mem 7379MB [2024-08-26 11:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.096) Loss 0.9746 (0.7026) Acc@1 76.465 (84.384) Acc@5 94.238 (97.108) Mem 7379MB [2024-08-26 11:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.089) Loss 1.2490 (0.8001) Acc@1 68.750 (82.097) Acc@5 90.723 (95.933) Mem 7379MB [2024-08-26 11:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1104 (0.8506) Acc@1 72.461 (80.659) Acc@5 92.773 (95.405) Mem 7379MB [2024-08-26 11:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.186 Acc@5 95.364 [2024-08-26 11:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.2% [2024-08-26 11:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.19% [2024-08-26 11:41:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 11:41:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 11:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][0/1251] eta 0:14:10 lr 0.000773 wd 0.0500 time 0.6796 (0.6796) data time 0.4432 (0.4432) model time 0.0000 (0.0000) loss 4.1717 (4.1717) grad_norm 2.7682 (2.7682) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][10/1251] eta 0:05:50 lr 0.000773 wd 0.0500 time 0.2393 (0.2825) data time 0.0009 (0.0413) model time 0.0000 (0.0000) loss 3.9981 (3.5015) grad_norm 8.6943 (3.1780) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][20/1251] eta 0:05:24 lr 0.000773 wd 0.0500 time 0.2540 (0.2640) data time 0.0011 (0.0221) model time 0.0000 (0.0000) loss 3.3085 (3.4894) grad_norm 1.4775 (2.6566) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][30/1251] eta 0:05:14 lr 0.000773 wd 0.0500 time 0.2444 (0.2574) data time 0.0009 (0.0157) model time 0.0000 (0.0000) loss 3.5328 (3.5088) grad_norm 3.5432 (2.4841) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][40/1251] eta 0:05:07 lr 0.000773 wd 0.0500 time 0.2355 (0.2535) data time 0.0012 (0.0121) model time 0.0000 (0.0000) loss 2.9833 (3.4943) grad_norm 2.6246 (2.4306) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][50/1251] eta 0:05:01 lr 0.000773 wd 0.0500 time 0.2397 (0.2514) data time 0.0007 (0.0100) model time 0.0000 (0.0000) loss 2.6864 (3.4240) grad_norm 1.9820 (2.3371) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][60/1251] eta 0:05:00 lr 0.000773 wd 0.0500 time 0.2340 (0.2526) data time 0.0010 (0.0085) model time 0.2330 
(0.2576) loss 3.1111 (3.3571) grad_norm 1.5372 (2.2830) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][70/1251] eta 0:04:56 lr 0.000773 wd 0.0500 time 0.2423 (0.2513) data time 0.0007 (0.0074) model time 0.2415 (0.2499) loss 3.9852 (3.3931) grad_norm 2.1144 (2.3163) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][80/1251] eta 0:04:53 lr 0.000773 wd 0.0500 time 0.2488 (0.2506) data time 0.0008 (0.0066) model time 0.2480 (0.2481) loss 2.8227 (3.3965) grad_norm 2.1114 (2.2994) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][90/1251] eta 0:04:49 lr 0.000773 wd 0.0500 time 0.2445 (0.2496) data time 0.0009 (0.0060) model time 0.2436 (0.2462) loss 3.5068 (3.4028) grad_norm 1.8211 (2.2884) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][100/1251] eta 0:04:46 lr 0.000773 wd 0.0500 time 0.2450 (0.2488) data time 0.0009 (0.0055) model time 0.2442 (0.2452) loss 2.8573 (3.3938) grad_norm 1.7257 (2.2864) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][110/1251] eta 0:04:43 lr 0.000773 wd 0.0500 time 0.2510 (0.2484) data time 0.0007 (0.0051) model time 0.2503 (0.2448) loss 3.7946 (3.3887) grad_norm 2.0311 (2.2947) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][120/1251] eta 0:04:40 lr 0.000773 wd 0.0500 time 0.2442 (0.2478) data time 0.0007 (0.0048) model time 0.2435 (0.2442) loss 2.6950 (3.4061) grad_norm 1.8642 (2.2710) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][130/1251] 
eta 0:04:38 lr 0.000773 wd 0.0500 time 0.2415 (0.2488) data time 0.0010 (0.0045) model time 0.2405 (0.2462) loss 2.9996 (3.3977) grad_norm 3.1241 (2.2723) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][140/1251] eta 0:04:36 lr 0.000773 wd 0.0500 time 0.2395 (0.2486) data time 0.0015 (0.0043) model time 0.2381 (0.2459) loss 3.0799 (3.4019) grad_norm 2.5550 (2.2718) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][150/1251] eta 0:04:33 lr 0.000772 wd 0.0500 time 0.2432 (0.2482) data time 0.0009 (0.0040) model time 0.2423 (0.2456) loss 3.9733 (3.3918) grad_norm 1.4866 (2.2620) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][160/1251] eta 0:04:30 lr 0.000772 wd 0.0500 time 0.2448 (0.2479) data time 0.0010 (0.0038) model time 0.2438 (0.2453) loss 3.9758 (3.4012) grad_norm 1.3877 (2.2541) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][170/1251] eta 0:04:28 lr 0.000772 wd 0.0500 time 0.2482 (0.2480) data time 0.0008 (0.0037) model time 0.2475 (0.2454) loss 3.3467 (3.3891) grad_norm 2.1369 (2.2375) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][180/1251] eta 0:04:25 lr 0.000772 wd 0.0500 time 0.2441 (0.2477) data time 0.0011 (0.0036) model time 0.2430 (0.2452) loss 3.5895 (3.3871) grad_norm 2.2003 (2.2278) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][190/1251] eta 0:04:22 lr 0.000772 wd 0.0500 time 0.2371 (0.2476) data time 0.0007 (0.0035) model time 0.2364 (0.2450) loss 3.6619 (3.3787) grad_norm 2.7432 (2.2300) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-26 11:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][200/1251] eta 0:04:22 lr 0.000772 wd 0.0500 time 0.2455 (0.2494) data time 0.0009 (0.0034) model time 0.2446 (0.2475) loss 3.9350 (3.3804) grad_norm 2.9037 (2.2495) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][210/1251] eta 0:04:19 lr 0.000772 wd 0.0500 time 0.2399 (0.2490) data time 0.0010 (0.0033) model time 0.2389 (0.2471) loss 3.6219 (3.3953) grad_norm 1.9056 (2.2501) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][220/1251] eta 0:04:16 lr 0.000772 wd 0.0500 time 0.2508 (0.2488) data time 0.0011 (0.0032) model time 0.2497 (0.2469) loss 3.3675 (3.3882) grad_norm 1.4860 (2.2444) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][230/1251] eta 0:04:13 lr 0.000772 wd 0.0500 time 0.2399 (0.2486) data time 0.0009 (0.0031) model time 0.2390 (0.2466) loss 4.0276 (3.4015) grad_norm 1.9557 (2.2393) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][240/1251] eta 0:04:11 lr 0.000772 wd 0.0500 time 0.2496 (0.2483) data time 0.0009 (0.0030) model time 0.2487 (0.2464) loss 3.1943 (3.4009) grad_norm 1.7173 (2.2330) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][250/1251] eta 0:04:08 lr 0.000772 wd 0.0500 time 0.2408 (0.2480) data time 0.0008 (0.0029) model time 0.2401 (0.2460) loss 3.8539 (3.4100) grad_norm 1.5984 (2.2175) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][260/1251] eta 0:04:05 lr 0.000772 wd 0.0500 time 0.2420 (0.2477) data time 0.0008 (0.0029) model time 0.2413 
(0.2456) loss 2.4660 (3.3970) grad_norm 1.7167 (2.2048) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][270/1251] eta 0:04:02 lr 0.000772 wd 0.0500 time 0.2394 (0.2474) data time 0.0012 (0.0028) model time 0.2382 (0.2453) loss 2.8645 (3.3891) grad_norm 1.9696 (2.1987) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][280/1251] eta 0:04:00 lr 0.000772 wd 0.0500 time 0.2427 (0.2479) data time 0.0011 (0.0027) model time 0.2416 (0.2460) loss 2.5324 (3.3831) grad_norm 1.7282 (2.1980) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][290/1251] eta 0:03:58 lr 0.000772 wd 0.0500 time 0.2431 (0.2478) data time 0.0009 (0.0027) model time 0.2422 (0.2458) loss 2.6483 (3.3823) grad_norm 1.7592 (2.1978) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][300/1251] eta 0:03:55 lr 0.000772 wd 0.0500 time 0.2416 (0.2475) data time 0.0009 (0.0026) model time 0.2406 (0.2456) loss 3.9722 (3.3808) grad_norm 1.9672 (2.1947) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:42:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][310/1251] eta 0:03:52 lr 0.000772 wd 0.0500 time 0.2425 (0.2474) data time 0.0009 (0.0026) model time 0.2416 (0.2455) loss 3.8817 (3.3756) grad_norm 2.0806 (2.1886) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][320/1251] eta 0:03:50 lr 0.000772 wd 0.0500 time 0.2422 (0.2477) data time 0.0008 (0.0025) model time 0.2415 (0.2459) loss 2.2545 (3.3692) grad_norm 1.9641 (2.1860) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[109/300][330/1251] eta 0:03:48 lr 0.000772 wd 0.0500 time 0.2362 (0.2476) data time 0.0009 (0.0025) model time 0.2352 (0.2458) loss 2.8965 (3.3741) grad_norm 1.8222 (2.1812) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][340/1251] eta 0:03:45 lr 0.000772 wd 0.0500 time 0.2349 (0.2475) data time 0.0011 (0.0025) model time 0.2338 (0.2457) loss 3.7696 (3.3756) grad_norm 1.7071 (2.1721) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][350/1251] eta 0:03:42 lr 0.000772 wd 0.0500 time 0.2491 (0.2474) data time 0.0010 (0.0024) model time 0.2481 (0.2455) loss 3.6625 (3.3731) grad_norm 1.4951 (2.1663) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][360/1251] eta 0:03:40 lr 0.000772 wd 0.0500 time 0.2575 (0.2472) data time 0.0010 (0.0024) model time 0.2565 (0.2453) loss 3.6112 (3.3733) grad_norm 1.9519 (2.1601) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][370/1251] eta 0:03:37 lr 0.000772 wd 0.0500 time 0.2350 (0.2471) data time 0.0013 (0.0024) model time 0.2337 (0.2452) loss 3.1877 (3.3740) grad_norm 1.6920 (2.1595) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][380/1251] eta 0:03:35 lr 0.000772 wd 0.0500 time 0.2426 (0.2470) data time 0.0007 (0.0023) model time 0.2419 (0.2451) loss 3.1522 (3.3803) grad_norm 3.2601 (2.1615) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][390/1251] eta 0:03:32 lr 0.000772 wd 0.0500 time 0.2398 (0.2469) data time 0.0011 (0.0023) model time 0.2386 (0.2450) loss 3.1770 (3.3760) grad_norm 2.7392 (2.1662) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 11:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][400/1251] eta 0:03:30 lr 0.000772 wd 0.0500 time 0.2356 (0.2478) data time 0.0009 (0.0023) model time 0.2347 (0.2461) loss 3.2311 (3.3753) grad_norm 2.4874 (2.1724) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][410/1251] eta 0:03:28 lr 0.000772 wd 0.0500 time 0.2413 (0.2477) data time 0.0008 (0.0022) model time 0.2405 (0.2460) loss 3.1418 (3.3733) grad_norm 2.8570 (2.1728) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][420/1251] eta 0:03:25 lr 0.000771 wd 0.0500 time 0.2447 (0.2476) data time 0.0008 (0.0022) model time 0.2438 (0.2459) loss 3.1221 (3.3712) grad_norm 1.4216 (2.1681) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][430/1251] eta 0:03:23 lr 0.000771 wd 0.0500 time 0.2418 (0.2475) data time 0.0009 (0.0022) model time 0.2408 (0.2458) loss 3.6229 (3.3697) grad_norm 1.8585 (2.1631) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][440/1251] eta 0:03:20 lr 0.000771 wd 0.0500 time 0.2455 (0.2473) data time 0.0009 (0.0022) model time 0.2445 (0.2457) loss 3.5429 (3.3728) grad_norm 1.6759 (2.1594) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][450/1251] eta 0:03:18 lr 0.000771 wd 0.0500 time 0.2414 (0.2476) data time 0.0007 (0.0021) model time 0.2407 (0.2460) loss 2.3472 (3.3656) grad_norm 1.8794 (2.1595) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][460/1251] eta 0:03:15 lr 0.000771 wd 0.0500 time 0.2442 (0.2476) data time 0.0008 
(0.0021) model time 0.2434 (0.2460) loss 4.1842 (3.3644) grad_norm 1.8573 (2.1579) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][470/1251] eta 0:03:13 lr 0.000771 wd 0.0500 time 0.2379 (0.2475) data time 0.0011 (0.0021) model time 0.2368 (0.2459) loss 3.4103 (3.3639) grad_norm 1.5231 (2.1536) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][480/1251] eta 0:03:10 lr 0.000771 wd 0.0500 time 0.2396 (0.2474) data time 0.0010 (0.0021) model time 0.2386 (0.2458) loss 3.8771 (3.3632) grad_norm 1.7474 (2.1538) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][490/1251] eta 0:03:08 lr 0.000771 wd 0.0500 time 0.2360 (0.2473) data time 0.0010 (0.0021) model time 0.2350 (0.2456) loss 3.6435 (3.3624) grad_norm 2.1090 (2.1538) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][500/1251] eta 0:03:05 lr 0.000771 wd 0.0500 time 0.2453 (0.2472) data time 0.0011 (0.0021) model time 0.2442 (0.2455) loss 2.9636 (3.3599) grad_norm 1.5797 (2.1488) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][510/1251] eta 0:03:03 lr 0.000771 wd 0.0500 time 0.2419 (0.2472) data time 0.0007 (0.0021) model time 0.2411 (0.2455) loss 3.8687 (3.3600) grad_norm 2.2530 (2.1504) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][520/1251] eta 0:03:00 lr 0.000771 wd 0.0500 time 0.2461 (0.2475) data time 0.0007 (0.0021) model time 0.2454 (0.2459) loss 3.8947 (3.3614) grad_norm 1.6684 (2.1515) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [109/300][530/1251] eta 0:02:58 lr 0.000771 wd 0.0500 time 0.2450 (0.2474) data time 0.0007 (0.0020) model time 0.2443 (0.2458) loss 2.1244 (3.3606) grad_norm 2.7970 (2.1487) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][540/1251] eta 0:02:55 lr 0.000771 wd 0.0500 time 0.2456 (0.2474) data time 0.0011 (0.0020) model time 0.2445 (0.2458) loss 3.3580 (3.3591) grad_norm 2.3503 (2.1546) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][550/1251] eta 0:02:53 lr 0.000771 wd 0.0500 time 0.2442 (0.2473) data time 0.0007 (0.0020) model time 0.2434 (0.2457) loss 3.2987 (3.3563) grad_norm 1.9520 (2.1621) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][560/1251] eta 0:02:50 lr 0.000771 wd 0.0500 time 0.2476 (0.2473) data time 0.0013 (0.0020) model time 0.2463 (0.2457) loss 3.8219 (3.3548) grad_norm 2.6183 (2.1593) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][570/1251] eta 0:02:48 lr 0.000771 wd 0.0500 time 0.2419 (0.2473) data time 0.0010 (0.0020) model time 0.2410 (0.2457) loss 3.8995 (3.3582) grad_norm 3.3352 (2.1684) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][580/1251] eta 0:02:46 lr 0.000771 wd 0.0500 time 0.2391 (0.2476) data time 0.0007 (0.0020) model time 0.2384 (0.2460) loss 3.7928 (3.3599) grad_norm 2.9001 (2.1699) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][590/1251] eta 0:02:43 lr 0.000771 wd 0.0500 time 0.2476 (0.2476) data time 0.0010 (0.0020) model time 0.2466 (0.2460) loss 3.3979 (3.3591) grad_norm 1.9060 (2.1678) loss_scale 
2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][600/1251] eta 0:02:41 lr 0.000771 wd 0.0500 time 0.2476 (0.2475) data time 0.0010 (0.0020) model time 0.2467 (0.2459) loss 2.4759 (3.3566) grad_norm 2.9720 (2.1675) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][610/1251] eta 0:02:38 lr 0.000771 wd 0.0500 time 0.2421 (0.2474) data time 0.0009 (0.0020) model time 0.2413 (0.2458) loss 3.7852 (3.3608) grad_norm 2.4932 (2.1685) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][620/1251] eta 0:02:36 lr 0.000771 wd 0.0500 time 0.2452 (0.2474) data time 0.0010 (0.0019) model time 0.2442 (0.2458) loss 3.1309 (3.3607) grad_norm 1.3837 (2.1618) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][630/1251] eta 0:02:33 lr 0.000771 wd 0.0500 time 0.2433 (0.2473) data time 0.0010 (0.0019) model time 0.2424 (0.2457) loss 3.4302 (3.3609) grad_norm 1.8518 (2.1572) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][640/1251] eta 0:02:31 lr 0.000771 wd 0.0500 time 0.2470 (0.2473) data time 0.0009 (0.0020) model time 0.2461 (0.2457) loss 2.2939 (3.3619) grad_norm 2.2164 (2.1556) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][650/1251] eta 0:02:28 lr 0.000771 wd 0.0500 time 0.2407 (0.2472) data time 0.0010 (0.0020) model time 0.2398 (0.2456) loss 4.1268 (3.3627) grad_norm 2.7589 (2.1583) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][660/1251] eta 0:02:26 lr 0.000771 wd 0.0500 time 0.2444 (0.2472) data time 
0.0010 (0.0019) model time 0.2433 (0.2455) loss 2.7165 (3.3607) grad_norm 1.3983 (2.1601) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][670/1251] eta 0:02:23 lr 0.000771 wd 0.0500 time 0.2379 (0.2471) data time 0.0008 (0.0019) model time 0.2371 (0.2454) loss 3.7406 (3.3622) grad_norm 1.7688 (2.1563) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][680/1251] eta 0:02:21 lr 0.000770 wd 0.0500 time 0.2384 (0.2470) data time 0.0012 (0.0019) model time 0.2372 (0.2453) loss 2.4795 (3.3609) grad_norm 2.0888 (2.1516) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][690/1251] eta 0:02:18 lr 0.000770 wd 0.0500 time 0.2465 (0.2469) data time 0.0008 (0.0019) model time 0.2457 (0.2453) loss 3.1929 (3.3615) grad_norm 1.8084 (2.1524) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][700/1251] eta 0:02:15 lr 0.000770 wd 0.0500 time 0.2430 (0.2468) data time 0.0009 (0.0019) model time 0.2421 (0.2452) loss 3.2887 (3.3597) grad_norm 1.7707 (2.1510) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][710/1251] eta 0:02:13 lr 0.000770 wd 0.0500 time 0.2400 (0.2467) data time 0.0009 (0.0019) model time 0.2390 (0.2451) loss 3.8290 (3.3566) grad_norm 2.7210 (2.1508) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][720/1251] eta 0:02:10 lr 0.000770 wd 0.0500 time 0.2340 (0.2466) data time 0.0008 (0.0019) model time 0.2332 (0.2450) loss 3.3814 (3.3591) grad_norm 2.1707 (2.1523) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [109/300][730/1251] eta 0:02:08 lr 0.000770 wd 0.0500 time 0.2351 (0.2466) data time 0.0010 (0.0019) model time 0.2341 (0.2449) loss 3.6308 (3.3608) grad_norm 6.4223 (2.1667) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][740/1251] eta 0:02:06 lr 0.000770 wd 0.0500 time 0.2438 (0.2466) data time 0.0010 (0.0020) model time 0.2428 (0.2449) loss 2.8968 (3.3560) grad_norm 2.2009 (2.1697) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][750/1251] eta 0:02:03 lr 0.000770 wd 0.0500 time 0.2442 (0.2465) data time 0.0007 (0.0020) model time 0.2435 (0.2448) loss 3.5847 (3.3528) grad_norm 1.4466 (2.1666) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][760/1251] eta 0:02:01 lr 0.000770 wd 0.0500 time 0.2458 (0.2465) data time 0.0007 (0.0019) model time 0.2451 (0.2448) loss 4.1990 (3.3548) grad_norm 1.7842 (2.1634) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][770/1251] eta 0:01:58 lr 0.000770 wd 0.0500 time 0.2478 (0.2464) data time 0.0011 (0.0019) model time 0.2468 (0.2447) loss 3.5537 (3.3546) grad_norm 2.0068 (2.1657) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][780/1251] eta 0:01:56 lr 0.000770 wd 0.0500 time 0.2420 (0.2464) data time 0.0010 (0.0019) model time 0.2410 (0.2447) loss 3.8002 (3.3569) grad_norm 2.8185 (2.1641) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][790/1251] eta 0:01:53 lr 0.000770 wd 0.0500 time 0.2515 (0.2466) data time 0.0008 (0.0019) model time 0.2507 (0.2449) loss 2.8964 (3.3554) grad_norm 1.8362 (2.1651) 
loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][800/1251] eta 0:01:51 lr 0.000770 wd 0.0500 time 0.2435 (0.2465) data time 0.0010 (0.0019) model time 0.2425 (0.2448) loss 3.2377 (3.3568) grad_norm 1.8615 (2.1706) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][810/1251] eta 0:01:48 lr 0.000770 wd 0.0500 time 0.2431 (0.2468) data time 0.0011 (0.0019) model time 0.2421 (0.2451) loss 3.0601 (3.3568) grad_norm 2.9450 (2.1797) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][820/1251] eta 0:01:46 lr 0.000770 wd 0.0500 time 0.2495 (0.2470) data time 0.0008 (0.0019) model time 0.2487 (0.2453) loss 3.9226 (3.3560) grad_norm 1.8402 (2.1836) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][830/1251] eta 0:01:43 lr 0.000770 wd 0.0500 time 0.2402 (0.2469) data time 0.0010 (0.0019) model time 0.2392 (0.2453) loss 3.6820 (3.3569) grad_norm 1.8388 (2.1862) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][840/1251] eta 0:01:41 lr 0.000770 wd 0.0500 time 0.2512 (0.2468) data time 0.0010 (0.0019) model time 0.2502 (0.2452) loss 3.7914 (3.3550) grad_norm 2.0814 (2.1849) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][850/1251] eta 0:01:39 lr 0.000770 wd 0.0500 time 0.2468 (0.2470) data time 0.0010 (0.0019) model time 0.2458 (0.2453) loss 3.5156 (3.3589) grad_norm 1.9522 (2.1833) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][860/1251] eta 0:01:36 lr 0.000770 wd 0.0500 time 0.2408 (0.2469) 
data time 0.0010 (0.0019) model time 0.2398 (0.2453) loss 3.1158 (3.3586) grad_norm 2.2148 (2.1824) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][870/1251] eta 0:01:34 lr 0.000770 wd 0.0500 time 0.2429 (0.2469) data time 0.0007 (0.0019) model time 0.2422 (0.2452) loss 4.0597 (3.3604) grad_norm 1.7284 (2.1805) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][880/1251] eta 0:01:31 lr 0.000770 wd 0.0500 time 0.2350 (0.2468) data time 0.0009 (0.0018) model time 0.2341 (0.2451) loss 4.1675 (3.3631) grad_norm 2.2834 (2.1809) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][890/1251] eta 0:01:29 lr 0.000770 wd 0.0500 time 0.2454 (0.2467) data time 0.0008 (0.0018) model time 0.2446 (0.2451) loss 2.7994 (3.3641) grad_norm 1.7653 (2.1801) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][900/1251] eta 0:01:26 lr 0.000770 wd 0.0500 time 0.2413 (0.2466) data time 0.0007 (0.0018) model time 0.2406 (0.2450) loss 2.2706 (3.3617) grad_norm 4.3855 (2.1907) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][910/1251] eta 0:01:24 lr 0.000770 wd 0.0500 time 0.2342 (0.2466) data time 0.0011 (0.0018) model time 0.2331 (0.2449) loss 3.5688 (3.3624) grad_norm 1.8732 (2.1949) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][920/1251] eta 0:01:21 lr 0.000770 wd 0.0500 time 0.2403 (0.2465) data time 0.0009 (0.0018) model time 0.2394 (0.2449) loss 2.1212 (3.3597) grad_norm 1.9449 (2.1931) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:31 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [109/300][930/1251] eta 0:01:19 lr 0.000770 wd 0.0500 time 0.2417 (0.2465) data time 0.0010 (0.0019) model time 0.2407 (0.2448) loss 3.7410 (3.3601) grad_norm 1.4573 (2.1914) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][940/1251] eta 0:01:16 lr 0.000770 wd 0.0500 time 0.2350 (0.2464) data time 0.0014 (0.0018) model time 0.2336 (0.2448) loss 3.7778 (3.3619) grad_norm 1.9250 (2.1866) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][950/1251] eta 0:01:14 lr 0.000769 wd 0.0500 time 0.2364 (0.2464) data time 0.0010 (0.0018) model time 0.2354 (0.2448) loss 3.0651 (3.3638) grad_norm 2.5301 (2.1871) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][960/1251] eta 0:01:11 lr 0.000769 wd 0.0500 time 0.2366 (0.2464) data time 0.0009 (0.0018) model time 0.2357 (0.2447) loss 3.4608 (3.3644) grad_norm 1.3429 (2.1852) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][970/1251] eta 0:01:09 lr 0.000769 wd 0.0500 time 0.2402 (0.2465) data time 0.0010 (0.0018) model time 0.2392 (0.2448) loss 3.6990 (3.3636) grad_norm 1.7654 (2.1860) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][980/1251] eta 0:01:06 lr 0.000769 wd 0.0500 time 0.2372 (0.2464) data time 0.0011 (0.0018) model time 0.2361 (0.2448) loss 3.0363 (3.3632) grad_norm 1.9688 (2.1827) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][990/1251] eta 0:01:04 lr 0.000769 wd 0.0500 time 0.2307 (0.2464) data time 0.0009 (0.0018) model time 0.2298 (0.2447) loss 4.3965 (3.3625) grad_norm 
1.7728 (2.1813) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1000/1251] eta 0:01:01 lr 0.000769 wd 0.0500 time 0.2408 (0.2463) data time 0.0010 (0.0018) model time 0.2398 (0.2447) loss 3.3804 (3.3648) grad_norm 2.1808 (2.1829) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1010/1251] eta 0:00:59 lr 0.000769 wd 0.0500 time 0.2431 (0.2463) data time 0.0010 (0.0018) model time 0.2421 (0.2447) loss 3.1651 (3.3666) grad_norm 1.6268 (2.1829) loss_scale 4096.0000 (2054.0772) mem 7379MB [2024-08-26 11:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1020/1251] eta 0:00:56 lr 0.000769 wd 0.0500 time 0.2403 (0.2463) data time 0.0007 (0.0018) model time 0.2396 (0.2446) loss 4.2339 (3.3675) grad_norm 1.8146 (2.1875) loss_scale 4096.0000 (2074.0764) mem 7379MB [2024-08-26 11:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1030/1251] eta 0:00:54 lr 0.000769 wd 0.0500 time 0.2473 (0.2462) data time 0.0009 (0.0018) model time 0.2463 (0.2446) loss 4.0684 (3.3674) grad_norm 1.6827 (2.1894) loss_scale 4096.0000 (2093.6877) mem 7379MB [2024-08-26 11:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1040/1251] eta 0:00:51 lr 0.000769 wd 0.0500 time 0.2375 (0.2462) data time 0.0009 (0.0018) model time 0.2366 (0.2445) loss 3.7227 (3.3677) grad_norm 1.8846 (2.1880) loss_scale 4096.0000 (2112.9222) mem 7379MB [2024-08-26 11:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1050/1251] eta 0:00:49 lr 0.000769 wd 0.0500 time 0.2420 (0.2461) data time 0.0009 (0.0018) model time 0.2410 (0.2445) loss 3.5283 (3.3660) grad_norm 2.4723 (2.1866) loss_scale 4096.0000 (2131.7907) mem 7379MB [2024-08-26 11:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1060/1251] eta 0:00:47 lr 0.000769 wd 
0.0500 time 0.2482 (0.2465) data time 0.0007 (0.0017) model time 0.2475 (0.2449) loss 4.5087 (3.3643) grad_norm 2.3410 (2.1893) loss_scale 4096.0000 (2150.3035) mem 7379MB [2024-08-26 11:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1070/1251] eta 0:00:44 lr 0.000769 wd 0.0500 time 0.2378 (0.2465) data time 0.0010 (0.0017) model time 0.2367 (0.2449) loss 3.4548 (3.3660) grad_norm 2.2550 (2.1933) loss_scale 4096.0000 (2168.4706) mem 7379MB [2024-08-26 11:46:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1080/1251] eta 0:00:42 lr 0.000769 wd 0.0500 time 0.2441 (0.2464) data time 0.0009 (0.0017) model time 0.2432 (0.2448) loss 3.5589 (3.3672) grad_norm 1.8356 (2.1919) loss_scale 4096.0000 (2186.3016) mem 7379MB [2024-08-26 11:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1090/1251] eta 0:00:39 lr 0.000769 wd 0.0500 time 0.2496 (0.2464) data time 0.0009 (0.0017) model time 0.2486 (0.2448) loss 3.4474 (3.3659) grad_norm 2.9464 (2.1934) loss_scale 4096.0000 (2203.8057) mem 7379MB [2024-08-26 11:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1100/1251] eta 0:00:37 lr 0.000769 wd 0.0500 time 0.2457 (0.2464) data time 0.0012 (0.0017) model time 0.2445 (0.2448) loss 2.8614 (3.3635) grad_norm 2.7558 (2.1915) loss_scale 4096.0000 (2220.9918) mem 7379MB [2024-08-26 11:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1110/1251] eta 0:00:34 lr 0.000769 wd 0.0500 time 0.2446 (0.2463) data time 0.0009 (0.0017) model time 0.2437 (0.2447) loss 3.0120 (3.3610) grad_norm 2.2429 (2.1948) loss_scale 4096.0000 (2237.8686) mem 7379MB [2024-08-26 11:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1120/1251] eta 0:00:32 lr 0.000769 wd 0.0500 time 0.2429 (0.2463) data time 0.0008 (0.0017) model time 0.2421 (0.2447) loss 4.2859 (3.3605) grad_norm 2.0562 (2.1924) loss_scale 4096.0000 (2254.4442) mem 7379MB [2024-08-26 11:46:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1130/1251] eta 0:00:29 lr 0.000769 wd 0.0500 time 0.2479 (0.2466) data time 0.0009 (0.0017) model time 0.2470 (0.2451) loss 3.1589 (3.3583) grad_norm 2.8025 (2.1923) loss_scale 4096.0000 (2270.7268) mem 7379MB [2024-08-26 11:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1140/1251] eta 0:00:27 lr 0.000769 wd 0.0500 time 0.2518 (0.2466) data time 0.0011 (0.0017) model time 0.2507 (0.2450) loss 2.7886 (3.3578) grad_norm 2.0710 (2.1907) loss_scale 4096.0000 (2286.7239) mem 7379MB [2024-08-26 11:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1150/1251] eta 0:00:24 lr 0.000769 wd 0.0500 time 0.2433 (0.2465) data time 0.0011 (0.0017) model time 0.2422 (0.2450) loss 3.7413 (3.3592) grad_norm 2.3150 (2.1922) loss_scale 4096.0000 (2302.4431) mem 7379MB [2024-08-26 11:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1160/1251] eta 0:00:22 lr 0.000769 wd 0.0500 time 0.2413 (0.2465) data time 0.0010 (0.0017) model time 0.2403 (0.2450) loss 3.4983 (3.3596) grad_norm 1.7384 (2.1915) loss_scale 4096.0000 (2317.8915) mem 7379MB [2024-08-26 11:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1170/1251] eta 0:00:19 lr 0.000769 wd 0.0500 time 0.2405 (0.2465) data time 0.0009 (0.0017) model time 0.2396 (0.2449) loss 4.3152 (3.3608) grad_norm 1.6875 (2.1889) loss_scale 4096.0000 (2333.0760) mem 7379MB [2024-08-26 11:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1180/1251] eta 0:00:17 lr 0.000769 wd 0.0500 time 0.2461 (0.2464) data time 0.0010 (0.0017) model time 0.2452 (0.2449) loss 4.0378 (3.3626) grad_norm 1.5924 (2.1873) loss_scale 4096.0000 (2348.0034) mem 7379MB [2024-08-26 11:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1190/1251] eta 0:00:15 lr 0.000769 wd 0.0500 time 0.2502 (0.2464) data time 0.0007 (0.0017) model time 0.2495 (0.2449) loss 
3.2956 (3.3621) grad_norm 6.3730 (2.1886) loss_scale 4096.0000 (2362.6801) mem 7379MB [2024-08-26 11:46:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1200/1251] eta 0:00:12 lr 0.000769 wd 0.0500 time 0.2304 (0.2464) data time 0.0009 (0.0017) model time 0.2295 (0.2449) loss 3.7543 (3.3603) grad_norm 2.2755 (2.1868) loss_scale 4096.0000 (2377.1124) mem 7379MB [2024-08-26 11:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1210/1251] eta 0:00:10 lr 0.000769 wd 0.0500 time 0.2383 (0.2465) data time 0.0007 (0.0017) model time 0.2376 (0.2450) loss 3.6988 (3.3606) grad_norm 1.8589 (2.1851) loss_scale 4096.0000 (2391.3064) mem 7379MB [2024-08-26 11:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1220/1251] eta 0:00:07 lr 0.000768 wd 0.0500 time 0.2399 (0.2465) data time 0.0013 (0.0017) model time 0.2386 (0.2450) loss 3.5682 (3.3613) grad_norm 1.6483 (2.1824) loss_scale 4096.0000 (2405.2678) mem 7379MB [2024-08-26 11:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1230/1251] eta 0:00:05 lr 0.000768 wd 0.0500 time 0.2408 (0.2465) data time 0.0010 (0.0017) model time 0.2398 (0.2449) loss 3.7426 (3.3611) grad_norm 2.2591 (2.1846) loss_scale 4096.0000 (2419.0024) mem 7379MB [2024-08-26 11:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1240/1251] eta 0:00:02 lr 0.000768 wd 0.0500 time 0.2226 (0.2463) data time 0.0007 (0.0016) model time 0.2219 (0.2448) loss 3.3530 (3.3600) grad_norm 1.6672 (2.1853) loss_scale 4096.0000 (2432.5157) mem 7379MB [2024-08-26 11:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [109/300][1250/1251] eta 0:00:00 lr 0.000768 wd 0.0500 time 0.2279 (0.2462) data time 0.0007 (0.0016) model time 0.2272 (0.2447) loss 3.6994 (3.3603) grad_norm 2.1786 (2.1835) loss_scale 4096.0000 (2445.8129) mem 7379MB [2024-08-26 11:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 109 training takes 0:05:07 
[2024-08-26 11:46:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 11:46:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 11:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.487 (0.487) Loss 0.5093 (0.5093) Acc@1 90.137 (90.137) Acc@5 97.949 (97.949) Mem 7379MB
[2024-08-26 11:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.114) Loss 0.8052 (0.7786) Acc@1 83.594 (83.114) Acc@5 96.191 (96.360) Mem 7379MB
[2024-08-26 11:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.098) Loss 1.1309 (0.7985) Acc@1 75.586 (82.361) Acc@5 92.480 (96.419) Mem 7379MB
[2024-08-26 11:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.091) Loss 1.3975 (0.9124) Acc@1 66.504 (79.807) Acc@5 89.062 (95.079) Mem 7379MB
[2024-08-26 11:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.2656 (0.9745) Acc@1 69.531 (78.275) Acc@5 91.211 (94.391) Mem 7379MB
[2024-08-26 11:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.878 Acc@5 94.256
[2024-08-26 11:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.9%
[2024-08-26 11:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.88%
[2024-08-26 11:46:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 11:46:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 11:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.434 (0.434) Loss 0.4326 (0.4326) Acc@1 91.895 (91.895) Acc@5 98.340 (98.340) Mem 7379MB
[2024-08-26 11:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.112) Loss 0.6924 (0.6779) Acc@1 86.328 (85.387) Acc@5 96.680 (97.177) Mem 7379MB
[2024-08-26 11:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.096) Loss 0.9692 (0.7009) Acc@1 76.660 (84.375) Acc@5 94.336 (97.121) Mem 7379MB
[2024-08-26 11:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.091) Loss 1.2471 (0.7985) Acc@1 69.141 (82.056) Acc@5 90.723 (95.933) Mem 7379MB
[2024-08-26 11:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1084 (0.8489) Acc@1 72.559 (80.659) Acc@5 93.066 (95.432) Mem 7379MB
[2024-08-26 11:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.198 Acc@5 95.392
[2024-08-26 11:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.2%
[2024-08-26 11:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.20%
[2024-08-26 11:46:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 11:47:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 11:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][0/1251] eta 0:14:16 lr 0.000768 wd 0.0500 time 0.6846 (0.6846) data time 0.4660 (0.4660) model time 0.0000 (0.0000) loss 3.8232 (3.8232) grad_norm 2.0170 (2.0170) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][10/1251] eta 0:05:47 lr 0.000768 wd 0.0500 time 0.2421 (0.2803) data time 0.0009 (0.0433) model time 0.0000 (0.0000) loss 4.2191 (3.4300) grad_norm 2.7406 (2.0928) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][20/1251] eta 0:05:22 lr 0.000768 wd 0.0500 time 0.2459 (0.2617) data time 0.0012 (0.0234) model time 0.0000 (0.0000) loss 3.6050 (3.3086) grad_norm 1.5689 (2.2256) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][30/1251] eta 0:05:12 lr 0.000768 wd 0.0500 time 0.2436 (0.2562) data time 0.0009 (0.0162) model time 0.0000 (0.0000) loss 2.5365 (3.2883) grad_norm 2.7974 (2.2075) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][40/1251] eta 0:05:06 lr 0.000768 wd 0.0500 time 0.2433 (0.2527) data time 0.0007 (0.0125) model time 0.0000 (0.0000) loss 3.8942 (3.3659) grad_norm 1.8228 (2.2063) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][50/1251] eta 0:05:10 lr 0.000768 wd 0.0500 time 0.2461 (0.2589) data time 0.0010 (0.0102) model time 0.0000 (0.0000) loss 3.6426 (3.3667) grad_norm 1.7069 (2.1448) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][60/1251] eta 0:05:05 lr 0.000768 wd 0.0500 time 0.2498 (0.2565) data time 0.0011 (0.0087) model time 0.2486 
(0.2428) loss 2.9020 (3.3651) grad_norm 2.1080 (2.1236) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][70/1251] eta 0:05:00 lr 0.000768 wd 0.0500 time 0.2454 (0.2544) data time 0.0011 (0.0076) model time 0.2443 (0.2419) loss 3.0484 (3.3270) grad_norm 1.7518 (2.1791) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][80/1251] eta 0:04:56 lr 0.000768 wd 0.0500 time 0.2477 (0.2529) data time 0.0011 (0.0068) model time 0.2466 (0.2414) loss 3.7851 (3.3152) grad_norm 2.0007 (2.1628) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][90/1251] eta 0:04:52 lr 0.000768 wd 0.0500 time 0.2430 (0.2518) data time 0.0009 (0.0062) model time 0.2421 (0.2417) loss 3.4466 (3.3163) grad_norm 2.8175 (2.1896) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][100/1251] eta 0:04:48 lr 0.000768 wd 0.0500 time 0.2414 (0.2507) data time 0.0009 (0.0057) model time 0.2405 (0.2413) loss 3.3909 (3.3059) grad_norm 3.3316 (2.3077) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][110/1251] eta 0:04:45 lr 0.000768 wd 0.0500 time 0.2460 (0.2499) data time 0.0011 (0.0053) model time 0.2450 (0.2412) loss 2.8142 (3.2982) grad_norm 1.8714 (2.3285) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][120/1251] eta 0:04:41 lr 0.000768 wd 0.0500 time 0.2424 (0.2492) data time 0.0008 (0.0049) model time 0.2415 (0.2411) loss 3.8033 (3.3213) grad_norm 2.0508 (2.3007) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][130/1251] 
eta 0:04:38 lr 0.000768 wd 0.0500 time 0.2395 (0.2486) data time 0.0014 (0.0046) model time 0.2381 (0.2409) loss 3.1559 (3.3099) grad_norm 2.2727 (2.2897) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][140/1251] eta 0:04:35 lr 0.000768 wd 0.0500 time 0.2425 (0.2481) data time 0.0011 (0.0044) model time 0.2414 (0.2410) loss 3.0433 (3.3215) grad_norm 2.4553 (2.3445) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][150/1251] eta 0:04:34 lr 0.000768 wd 0.0500 time 0.2434 (0.2492) data time 0.0011 (0.0041) model time 0.2423 (0.2432) loss 2.8241 (3.3291) grad_norm 1.8964 (2.3322) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][160/1251] eta 0:04:31 lr 0.000768 wd 0.0500 time 0.2446 (0.2488) data time 0.0007 (0.0039) model time 0.2439 (0.2430) loss 4.1494 (3.3286) grad_norm 1.9530 (2.3093) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][170/1251] eta 0:04:29 lr 0.000768 wd 0.0500 time 0.2392 (0.2497) data time 0.0009 (0.0038) model time 0.2383 (0.2447) loss 3.6973 (3.3165) grad_norm 1.8313 (2.2877) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][180/1251] eta 0:04:28 lr 0.000768 wd 0.0500 time 0.2380 (0.2503) data time 0.0011 (0.0036) model time 0.2368 (0.2458) loss 3.3688 (3.3214) grad_norm 2.3369 (2.2773) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][190/1251] eta 0:04:25 lr 0.000768 wd 0.0500 time 0.2395 (0.2501) data time 0.0008 (0.0035) model time 0.2387 (0.2459) loss 2.3789 (3.3225) grad_norm 2.4097 (2.2748) loss_scale 4096.0000 (4096.0000) mem 7379MB 
[2024-08-26 11:47:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][200/1251] eta 0:04:22 lr 0.000768 wd 0.0500 time 0.2308 (0.2497) data time 0.0011 (0.0034) model time 0.2297 (0.2455) loss 3.6052 (3.3305) grad_norm 1.6872 (2.2671) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][210/1251] eta 0:04:19 lr 0.000768 wd 0.0500 time 0.2398 (0.2493) data time 0.0010 (0.0033) model time 0.2388 (0.2452) loss 3.7804 (3.3396) grad_norm 1.8589 (2.2513) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][220/1251] eta 0:04:16 lr 0.000768 wd 0.0500 time 0.2557 (0.2492) data time 0.0007 (0.0032) model time 0.2550 (0.2452) loss 2.8875 (3.3345) grad_norm 2.0976 (2.2530) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][230/1251] eta 0:04:14 lr 0.000767 wd 0.0500 time 0.2450 (0.2490) data time 0.0010 (0.0031) model time 0.2440 (0.2451) loss 3.7450 (3.3351) grad_norm 2.6130 (2.2636) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:48:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][240/1251] eta 0:04:11 lr 0.000767 wd 0.0500 time 0.2359 (0.2492) data time 0.0010 (0.0030) model time 0.2349 (0.2456) loss 3.6162 (3.3393) grad_norm 2.2685 (2.2577) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][250/1251] eta 0:04:09 lr 0.000767 wd 0.0500 time 0.2441 (0.2490) data time 0.0010 (0.0029) model time 0.2431 (0.2454) loss 3.1971 (3.3403) grad_norm 1.8808 (2.2661) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][260/1251] eta 0:04:06 lr 0.000767 wd 0.0500 time 0.2431 (0.2487) data time 0.0010 (0.0028) model time 0.2421 
(0.2452) loss 3.3124 (3.3481) grad_norm 1.8681 (2.2571) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 11:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][270/1251] eta 0:04:03 lr 0.000767 wd 0.0500 time 0.2417 (0.2485) data time 0.0007 (0.0028) model time 0.2409 (0.2450) loss 3.1986 (3.3401) grad_norm 1.4417 (inf) loss_scale 2048.0000 (4050.6568) mem 7379MB [2024-08-26 11:48:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][280/1251] eta 0:04:01 lr 0.000767 wd 0.0500 time 0.2440 (0.2484) data time 0.0007 (0.0027) model time 0.2433 (0.2450) loss 4.2629 (3.3467) grad_norm 2.0528 (inf) loss_scale 2048.0000 (3979.3879) mem 7379MB [2024-08-26 11:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][290/1251] eta 0:03:58 lr 0.000767 wd 0.0500 time 0.2461 (0.2483) data time 0.0010 (0.0027) model time 0.2451 (0.2450) loss 3.3273 (3.3387) grad_norm 1.9783 (inf) loss_scale 2048.0000 (3913.0172) mem 7379MB [2024-08-26 11:48:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][300/1251] eta 0:03:56 lr 0.000767 wd 0.0500 time 0.2437 (0.2488) data time 0.0008 (0.0026) model time 0.2428 (0.2456) loss 3.7721 (3.3412) grad_norm 1.9474 (inf) loss_scale 2048.0000 (3851.0565) mem 7379MB [2024-08-26 11:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][310/1251] eta 0:03:53 lr 0.000767 wd 0.0500 time 0.2468 (0.2486) data time 0.0012 (0.0025) model time 0.2455 (0.2455) loss 2.4054 (3.3443) grad_norm 2.5894 (inf) loss_scale 2048.0000 (3793.0804) mem 7379MB [2024-08-26 11:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][320/1251] eta 0:03:51 lr 0.000767 wd 0.0500 time 0.2378 (0.2484) data time 0.0010 (0.0025) model time 0.2368 (0.2453) loss 4.0177 (3.3411) grad_norm 1.8285 (inf) loss_scale 2048.0000 (3738.7165) mem 7379MB [2024-08-26 11:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][330/1251] eta 0:03:48 lr 
0.000767 wd 0.0500 time 0.2346 (0.2482) data time 0.0009 (0.0025) model time 0.2337 (0.2452) loss 3.5826 (3.3407) grad_norm 2.2701 (inf) loss_scale 2048.0000 (3687.6375) mem 7379MB [2024-08-26 11:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][340/1251] eta 0:03:46 lr 0.000767 wd 0.0500 time 0.2478 (0.2486) data time 0.0007 (0.0024) model time 0.2471 (0.2458) loss 3.3384 (3.3340) grad_norm 1.9440 (inf) loss_scale 2048.0000 (3639.5543) mem 7379MB [2024-08-26 11:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][350/1251] eta 0:03:43 lr 0.000767 wd 0.0500 time 0.2465 (0.2485) data time 0.0013 (0.0024) model time 0.2453 (0.2456) loss 3.0742 (3.3404) grad_norm 2.2718 (inf) loss_scale 2048.0000 (3594.2108) mem 7379MB [2024-08-26 11:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][360/1251] eta 0:03:41 lr 0.000767 wd 0.0500 time 0.2455 (0.2483) data time 0.0008 (0.0023) model time 0.2447 (0.2455) loss 3.6045 (3.3432) grad_norm 1.9311 (inf) loss_scale 2048.0000 (3551.3795) mem 7379MB [2024-08-26 11:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][370/1251] eta 0:03:38 lr 0.000767 wd 0.0500 time 0.2470 (0.2481) data time 0.0011 (0.0023) model time 0.2460 (0.2453) loss 3.9680 (3.3457) grad_norm 1.7324 (inf) loss_scale 2048.0000 (3510.8571) mem 7379MB [2024-08-26 11:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][380/1251] eta 0:03:35 lr 0.000767 wd 0.0500 time 0.2519 (0.2479) data time 0.0007 (0.0023) model time 0.2512 (0.2451) loss 3.1240 (3.3481) grad_norm 2.0708 (inf) loss_scale 2048.0000 (3472.4619) mem 7379MB [2024-08-26 11:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][390/1251] eta 0:03:33 lr 0.000767 wd 0.0500 time 0.2441 (0.2478) data time 0.0007 (0.0023) model time 0.2434 (0.2450) loss 3.4261 (3.3442) grad_norm 2.1501 (inf) loss_scale 2048.0000 (3436.0307) mem 7379MB [2024-08-26 11:48:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][400/1251] eta 0:03:30 lr 0.000767 wd 0.0500 time 0.2456 (0.2477) data time 0.0010 (0.0022) model time 0.2445 (0.2450) loss 3.0775 (3.3503) grad_norm 2.7207 (inf) loss_scale 2048.0000 (3401.4165) mem 7379MB [2024-08-26 11:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][410/1251] eta 0:03:28 lr 0.000767 wd 0.0500 time 0.2427 (0.2475) data time 0.0009 (0.0022) model time 0.2419 (0.2448) loss 3.4523 (3.3472) grad_norm 2.1193 (inf) loss_scale 2048.0000 (3368.4866) mem 7379MB [2024-08-26 11:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][420/1251] eta 0:03:25 lr 0.000767 wd 0.0500 time 0.2449 (0.2474) data time 0.0009 (0.0022) model time 0.2440 (0.2447) loss 3.1787 (3.3520) grad_norm 1.9416 (inf) loss_scale 2048.0000 (3337.1211) mem 7379MB [2024-08-26 11:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][430/1251] eta 0:03:22 lr 0.000767 wd 0.0500 time 0.2375 (0.2472) data time 0.0010 (0.0021) model time 0.2366 (0.2446) loss 3.2055 (3.3464) grad_norm 1.9799 (inf) loss_scale 2048.0000 (3307.2111) mem 7379MB [2024-08-26 11:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][440/1251] eta 0:03:20 lr 0.000767 wd 0.0500 time 0.2399 (0.2472) data time 0.0009 (0.0021) model time 0.2390 (0.2446) loss 3.5577 (3.3407) grad_norm 1.8903 (inf) loss_scale 2048.0000 (3278.6576) mem 7379MB [2024-08-26 11:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][450/1251] eta 0:03:17 lr 0.000767 wd 0.0500 time 0.2342 (0.2471) data time 0.0009 (0.0021) model time 0.2333 (0.2445) loss 3.9008 (3.3411) grad_norm 1.8549 (inf) loss_scale 2048.0000 (3251.3703) mem 7379MB [2024-08-26 11:48:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][460/1251] eta 0:03:15 lr 0.000767 wd 0.0500 time 0.2336 (0.2470) data time 0.0008 (0.0021) model time 0.2329 (0.2444) loss 4.2851 (3.3382) 
grad_norm 2.2238 (inf) loss_scale 2048.0000 (3225.2668) mem 7379MB [2024-08-26 11:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][470/1251] eta 0:03:12 lr 0.000767 wd 0.0500 time 0.2387 (0.2469) data time 0.0010 (0.0021) model time 0.2377 (0.2444) loss 3.3385 (3.3376) grad_norm 1.6656 (inf) loss_scale 2048.0000 (3200.2718) mem 7379MB [2024-08-26 11:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][480/1251] eta 0:03:10 lr 0.000767 wd 0.0500 time 0.2371 (0.2469) data time 0.0009 (0.0021) model time 0.2362 (0.2443) loss 3.6033 (3.3387) grad_norm 1.9925 (inf) loss_scale 2048.0000 (3176.3160) mem 7379MB [2024-08-26 11:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][490/1251] eta 0:03:08 lr 0.000767 wd 0.0500 time 0.2465 (0.2472) data time 0.0010 (0.0021) model time 0.2455 (0.2448) loss 3.7492 (3.3370) grad_norm 2.5681 (inf) loss_scale 2048.0000 (3153.3360) mem 7379MB [2024-08-26 11:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][500/1251] eta 0:03:05 lr 0.000766 wd 0.0500 time 0.2429 (0.2471) data time 0.0012 (0.0020) model time 0.2417 (0.2447) loss 3.4325 (3.3392) grad_norm 2.4923 (inf) loss_scale 2048.0000 (3131.2735) mem 7379MB [2024-08-26 11:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][510/1251] eta 0:03:03 lr 0.000766 wd 0.0500 time 0.2385 (0.2474) data time 0.0007 (0.0020) model time 0.2377 (0.2450) loss 3.8228 (3.3362) grad_norm 1.8059 (inf) loss_scale 2048.0000 (3110.0744) mem 7379MB [2024-08-26 11:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][520/1251] eta 0:03:00 lr 0.000766 wd 0.0500 time 0.2459 (0.2473) data time 0.0009 (0.0020) model time 0.2450 (0.2449) loss 2.6631 (3.3357) grad_norm 1.8358 (inf) loss_scale 2048.0000 (3089.6891) mem 7379MB [2024-08-26 11:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][530/1251] eta 0:02:58 lr 0.000766 wd 0.0500 time 0.2422 
(0.2472) data time 0.0011 (0.0020) model time 0.2410 (0.2449) loss 3.3187 (3.3363) grad_norm 2.1440 (inf) loss_scale 2048.0000 (3070.0716) mem 7379MB [2024-08-26 11:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][540/1251] eta 0:02:55 lr 0.000766 wd 0.0500 time 0.2420 (0.2471) data time 0.0010 (0.0020) model time 0.2410 (0.2448) loss 3.3435 (3.3347) grad_norm 2.1083 (inf) loss_scale 2048.0000 (3051.1793) mem 7379MB [2024-08-26 11:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][550/1251] eta 0:02:53 lr 0.000766 wd 0.0500 time 0.2460 (0.2470) data time 0.0009 (0.0020) model time 0.2451 (0.2447) loss 3.0660 (3.3355) grad_norm 1.4946 (inf) loss_scale 2048.0000 (3032.9728) mem 7379MB [2024-08-26 11:49:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][560/1251] eta 0:02:50 lr 0.000766 wd 0.0500 time 0.2383 (0.2470) data time 0.0009 (0.0020) model time 0.2374 (0.2446) loss 2.9249 (3.3358) grad_norm 2.1743 (inf) loss_scale 2048.0000 (3015.4153) mem 7379MB [2024-08-26 11:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][570/1251] eta 0:02:48 lr 0.000766 wd 0.0500 time 0.2704 (0.2470) data time 0.0010 (0.0020) model time 0.2694 (0.2447) loss 3.3738 (3.3383) grad_norm 2.7235 (inf) loss_scale 2048.0000 (2998.4729) mem 7379MB [2024-08-26 11:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][580/1251] eta 0:02:45 lr 0.000766 wd 0.0500 time 0.2651 (0.2470) data time 0.0011 (0.0019) model time 0.2640 (0.2447) loss 3.6031 (3.3397) grad_norm 2.2048 (inf) loss_scale 2048.0000 (2982.1136) mem 7379MB [2024-08-26 11:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][590/1251] eta 0:02:43 lr 0.000766 wd 0.0500 time 0.2454 (0.2469) data time 0.0010 (0.0019) model time 0.2443 (0.2447) loss 2.8208 (3.3433) grad_norm 1.4424 (inf) loss_scale 2048.0000 (2966.3080) mem 7379MB [2024-08-26 11:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [110/300][600/1251] eta 0:02:40 lr 0.000766 wd 0.0500 time 0.2387 (0.2473) data time 0.0011 (0.0019) model time 0.2377 (0.2450) loss 3.6271 (3.3421) grad_norm 1.7662 (inf) loss_scale 2048.0000 (2951.0283) mem 7379MB [2024-08-26 11:49:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][610/1251] eta 0:02:38 lr 0.000766 wd 0.0500 time 0.2366 (0.2472) data time 0.0009 (0.0019) model time 0.2358 (0.2450) loss 4.1743 (3.3476) grad_norm 1.4976 (inf) loss_scale 2048.0000 (2936.2488) mem 7379MB [2024-08-26 11:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][620/1251] eta 0:02:35 lr 0.000766 wd 0.0500 time 0.2439 (0.2471) data time 0.0011 (0.0019) model time 0.2428 (0.2449) loss 2.3840 (3.3506) grad_norm 1.3232 (inf) loss_scale 2048.0000 (2921.9452) mem 7379MB [2024-08-26 11:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][630/1251] eta 0:02:33 lr 0.000766 wd 0.0500 time 0.2352 (0.2470) data time 0.0009 (0.0019) model time 0.2344 (0.2448) loss 4.0529 (3.3527) grad_norm 3.6366 (inf) loss_scale 2048.0000 (2908.0951) mem 7379MB [2024-08-26 11:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][640/1251] eta 0:02:30 lr 0.000766 wd 0.0500 time 0.2423 (0.2469) data time 0.0011 (0.0019) model time 0.2412 (0.2448) loss 3.7767 (3.3524) grad_norm 1.7113 (inf) loss_scale 2048.0000 (2894.6771) mem 7379MB [2024-08-26 11:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][650/1251] eta 0:02:28 lr 0.000766 wd 0.0500 time 0.2745 (0.2469) data time 0.0011 (0.0018) model time 0.2734 (0.2447) loss 4.0481 (3.3512) grad_norm 2.9239 (inf) loss_scale 2048.0000 (2881.6713) mem 7379MB [2024-08-26 11:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][660/1251] eta 0:02:25 lr 0.000766 wd 0.0500 time 0.2331 (0.2468) data time 0.0011 (0.0018) model time 0.2320 (0.2447) loss 2.5331 (3.3471) grad_norm 1.6852 (inf) loss_scale 2048.0000 
(2869.0590) mem 7379MB [2024-08-26 11:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][670/1251] eta 0:02:23 lr 0.000766 wd 0.0500 time 0.2450 (0.2467) data time 0.0012 (0.0018) model time 0.2438 (0.2446) loss 2.5458 (3.3504) grad_norm 1.4558 (inf) loss_scale 2048.0000 (2856.8227) mem 7379MB [2024-08-26 11:49:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][680/1251] eta 0:02:20 lr 0.000766 wd 0.0500 time 0.2471 (0.2467) data time 0.0011 (0.0018) model time 0.2460 (0.2445) loss 3.2310 (3.3482) grad_norm 2.0308 (inf) loss_scale 2048.0000 (2844.9457) mem 7379MB [2024-08-26 11:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][690/1251] eta 0:02:18 lr 0.000766 wd 0.0500 time 0.2357 (0.2466) data time 0.0010 (0.0018) model time 0.2348 (0.2445) loss 3.7578 (3.3498) grad_norm 2.0049 (inf) loss_scale 2048.0000 (2833.4124) mem 7379MB [2024-08-26 11:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][700/1251] eta 0:02:15 lr 0.000766 wd 0.0500 time 0.2388 (0.2465) data time 0.0011 (0.0018) model time 0.2377 (0.2444) loss 3.4513 (3.3490) grad_norm 1.9090 (inf) loss_scale 2048.0000 (2822.2083) mem 7379MB [2024-08-26 11:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][710/1251] eta 0:02:13 lr 0.000766 wd 0.0500 time 0.2417 (0.2470) data time 0.0010 (0.0018) model time 0.2407 (0.2449) loss 3.0963 (3.3466) grad_norm 1.7301 (inf) loss_scale 2048.0000 (2811.3193) mem 7379MB [2024-08-26 11:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][720/1251] eta 0:02:11 lr 0.000766 wd 0.0500 time 0.2403 (0.2475) data time 0.0007 (0.0018) model time 0.2396 (0.2455) loss 3.3993 (3.3461) grad_norm 2.0053 (inf) loss_scale 2048.0000 (2800.7323) mem 7379MB [2024-08-26 11:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][730/1251] eta 0:02:08 lr 0.000766 wd 0.0500 time 0.2338 (0.2475) data time 0.0011 (0.0018) model time 
0.2328 (0.2455) loss 3.5212 (3.3491) grad_norm 3.4250 (inf) loss_scale 2048.0000 (2790.4350) mem 7379MB [2024-08-26 11:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][740/1251] eta 0:02:06 lr 0.000766 wd 0.0500 time 0.2433 (0.2474) data time 0.0007 (0.0017) model time 0.2426 (0.2454) loss 3.6734 (3.3529) grad_norm 1.8610 (inf) loss_scale 2048.0000 (2780.4157) mem 7379MB [2024-08-26 11:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][750/1251] eta 0:02:03 lr 0.000766 wd 0.0500 time 0.2385 (0.2474) data time 0.0009 (0.0017) model time 0.2376 (0.2454) loss 4.2532 (3.3498) grad_norm 1.4882 (inf) loss_scale 2048.0000 (2770.6631) mem 7379MB [2024-08-26 11:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][760/1251] eta 0:02:01 lr 0.000765 wd 0.0500 time 0.2502 (0.2473) data time 0.0010 (0.0017) model time 0.2493 (0.2454) loss 3.6669 (3.3536) grad_norm 2.1386 (inf) loss_scale 2048.0000 (2761.1669) mem 7379MB [2024-08-26 11:50:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][770/1251] eta 0:01:59 lr 0.000765 wd 0.0500 time 0.2458 (0.2475) data time 0.0009 (0.0017) model time 0.2449 (0.2456) loss 2.7403 (3.3555) grad_norm 2.9674 (inf) loss_scale 2048.0000 (2751.9170) mem 7379MB [2024-08-26 11:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][780/1251] eta 0:01:56 lr 0.000765 wd 0.0500 time 0.2444 (0.2474) data time 0.0009 (0.0017) model time 0.2435 (0.2455) loss 3.0909 (3.3497) grad_norm 2.9434 (inf) loss_scale 2048.0000 (2742.9040) mem 7379MB [2024-08-26 11:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][790/1251] eta 0:01:54 lr 0.000765 wd 0.0500 time 0.2415 (0.2474) data time 0.0010 (0.0017) model time 0.2405 (0.2454) loss 3.7883 (3.3493) grad_norm 1.5991 (inf) loss_scale 2048.0000 (2734.1188) mem 7379MB [2024-08-26 11:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][800/1251] eta 0:01:51 
lr 0.000765 wd 0.0500 time 0.2419 (0.2473) data time 0.0011 (0.0017) model time 0.2408 (0.2454) loss 3.7053 (3.3533) grad_norm 2.1053 (inf) loss_scale 2048.0000 (2725.5531) mem 7379MB [2024-08-26 11:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][810/1251] eta 0:01:49 lr 0.000765 wd 0.0500 time 0.2406 (0.2472) data time 0.0009 (0.0017) model time 0.2397 (0.2453) loss 4.4090 (3.3534) grad_norm 2.5979 (inf) loss_scale 2048.0000 (2717.1985) mem 7379MB [2024-08-26 11:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][820/1251] eta 0:01:46 lr 0.000765 wd 0.0500 time 0.2401 (0.2472) data time 0.0010 (0.0017) model time 0.2391 (0.2453) loss 3.9111 (3.3543) grad_norm 1.6710 (inf) loss_scale 2048.0000 (2709.0475) mem 7379MB [2024-08-26 11:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][830/1251] eta 0:01:44 lr 0.000765 wd 0.0500 time 0.2328 (0.2471) data time 0.0011 (0.0017) model time 0.2318 (0.2452) loss 3.9125 (3.3543) grad_norm 2.9411 (inf) loss_scale 2048.0000 (2701.0927) mem 7379MB [2024-08-26 11:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][840/1251] eta 0:01:41 lr 0.000765 wd 0.0500 time 0.2438 (0.2470) data time 0.0009 (0.0017) model time 0.2429 (0.2451) loss 3.9406 (3.3551) grad_norm 2.6816 (inf) loss_scale 2048.0000 (2693.3270) mem 7379MB [2024-08-26 11:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][850/1251] eta 0:01:39 lr 0.000765 wd 0.0500 time 0.2423 (0.2470) data time 0.0009 (0.0017) model time 0.2415 (0.2451) loss 3.5808 (3.3563) grad_norm 2.0328 (inf) loss_scale 2048.0000 (2685.7438) mem 7379MB [2024-08-26 11:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][860/1251] eta 0:01:36 lr 0.000765 wd 0.0500 time 0.2479 (0.2471) data time 0.0008 (0.0017) model time 0.2471 (0.2452) loss 3.4876 (3.3559) grad_norm 1.8099 (inf) loss_scale 2048.0000 (2678.3368) mem 7379MB [2024-08-26 11:50:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][870/1251] eta 0:01:34 lr 0.000765 wd 0.0500 time 0.2470 (0.2470) data time 0.0009 (0.0016) model time 0.2461 (0.2452) loss 2.9533 (3.3562) grad_norm 1.5084 (inf) loss_scale 2048.0000 (2671.0999) mem 7379MB [2024-08-26 11:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][880/1251] eta 0:01:31 lr 0.000765 wd 0.0500 time 0.2483 (0.2469) data time 0.0007 (0.0016) model time 0.2476 (0.2451) loss 3.0459 (3.3573) grad_norm 1.9166 (inf) loss_scale 2048.0000 (2664.0272) mem 7379MB [2024-08-26 11:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][890/1251] eta 0:01:29 lr 0.000765 wd 0.0500 time 0.2455 (0.2469) data time 0.0008 (0.0016) model time 0.2448 (0.2450) loss 3.1536 (3.3580) grad_norm 2.1191 (inf) loss_scale 2048.0000 (2657.1134) mem 7379MB [2024-08-26 11:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][900/1251] eta 0:01:26 lr 0.000765 wd 0.0500 time 0.2438 (0.2468) data time 0.0010 (0.0016) model time 0.2428 (0.2450) loss 3.7763 (3.3576) grad_norm 3.5967 (inf) loss_scale 2048.0000 (2650.3529) mem 7379MB [2024-08-26 11:50:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][910/1251] eta 0:01:24 lr 0.000765 wd 0.0500 time 0.2411 (0.2468) data time 0.0010 (0.0016) model time 0.2401 (0.2449) loss 3.4221 (3.3572) grad_norm 2.7582 (inf) loss_scale 2048.0000 (2643.7409) mem 7379MB [2024-08-26 11:50:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][920/1251] eta 0:01:21 lr 0.000765 wd 0.0500 time 0.2452 (0.2468) data time 0.0007 (0.0016) model time 0.2445 (0.2449) loss 4.0392 (3.3573) grad_norm 2.6915 (inf) loss_scale 2048.0000 (2637.2725) mem 7379MB [2024-08-26 11:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][930/1251] eta 0:01:19 lr 0.000765 wd 0.0500 time 0.2498 (0.2467) data time 0.0012 (0.0016) model time 0.2486 (0.2449) loss 3.1893 (3.3570) 
grad_norm 1.8657 (inf) loss_scale 2048.0000 (2630.9431) mem 7379MB [2024-08-26 11:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][940/1251] eta 0:01:16 lr 0.000765 wd 0.0500 time 0.2396 (0.2467) data time 0.0012 (0.0016) model time 0.2385 (0.2449) loss 3.5110 (3.3607) grad_norm 1.7292 (inf) loss_scale 2048.0000 (2624.7481) mem 7379MB [2024-08-26 11:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][950/1251] eta 0:01:14 lr 0.000765 wd 0.0500 time 0.2438 (0.2467) data time 0.0009 (0.0016) model time 0.2429 (0.2448) loss 3.5435 (3.3614) grad_norm 2.2047 (inf) loss_scale 2048.0000 (2618.6835) mem 7379MB [2024-08-26 11:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][960/1251] eta 0:01:11 lr 0.000765 wd 0.0500 time 0.2415 (0.2466) data time 0.0007 (0.0016) model time 0.2408 (0.2448) loss 3.8775 (3.3621) grad_norm 2.3653 (inf) loss_scale 2048.0000 (2612.7451) mem 7379MB [2024-08-26 11:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][970/1251] eta 0:01:09 lr 0.000765 wd 0.0500 time 0.2547 (0.2465) data time 0.0011 (0.0016) model time 0.2537 (0.2447) loss 2.7895 (3.3599) grad_norm 1.5808 (inf) loss_scale 2048.0000 (2606.9289) mem 7379MB [2024-08-26 11:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][980/1251] eta 0:01:06 lr 0.000765 wd 0.0500 time 0.2392 (0.2469) data time 0.0009 (0.0016) model time 0.2383 (0.2452) loss 3.0014 (3.3589) grad_norm 1.8649 (inf) loss_scale 2048.0000 (2601.2314) mem 7379MB [2024-08-26 11:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][990/1251] eta 0:01:04 lr 0.000765 wd 0.0500 time 0.2355 (0.2469) data time 0.0012 (0.0016) model time 0.2343 (0.2451) loss 3.2557 (3.3601) grad_norm 1.5932 (inf) loss_scale 2048.0000 (2595.6488) mem 7379MB [2024-08-26 11:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1000/1251] eta 0:01:01 lr 0.000765 wd 0.0500 time 0.2431 
(0.2468) data time 0.0007 (0.0016) model time 0.2424 (0.2451) loss 2.3997 (3.3595) grad_norm 2.1593 (inf) loss_scale 2048.0000 (2590.1778) mem 7379MB [2024-08-26 11:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1010/1251] eta 0:00:59 lr 0.000765 wd 0.0500 time 0.2373 (0.2468) data time 0.0010 (0.0016) model time 0.2363 (0.2450) loss 3.8913 (3.3591) grad_norm 2.5484 (inf) loss_scale 2048.0000 (2584.8150) mem 7379MB [2024-08-26 11:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1020/1251] eta 0:00:57 lr 0.000765 wd 0.0500 time 0.2448 (0.2468) data time 0.0010 (0.0016) model time 0.2439 (0.2450) loss 3.7717 (3.3619) grad_norm 1.6656 (inf) loss_scale 2048.0000 (2579.5573) mem 7379MB [2024-08-26 11:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1030/1251] eta 0:00:54 lr 0.000764 wd 0.0500 time 0.2413 (0.2467) data time 0.0007 (0.0016) model time 0.2406 (0.2450) loss 4.2365 (3.3625) grad_norm 2.0370 (inf) loss_scale 2048.0000 (2574.4016) mem 7379MB [2024-08-26 11:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1040/1251] eta 0:00:52 lr 0.000764 wd 0.0500 time 0.2467 (0.2467) data time 0.0008 (0.0016) model time 0.2459 (0.2449) loss 2.4861 (3.3617) grad_norm 1.7537 (inf) loss_scale 2048.0000 (2569.3449) mem 7379MB [2024-08-26 11:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1050/1251] eta 0:00:49 lr 0.000764 wd 0.0500 time 0.2409 (0.2467) data time 0.0009 (0.0016) model time 0.2400 (0.2449) loss 4.2608 (3.3640) grad_norm 2.1474 (inf) loss_scale 2048.0000 (2564.3844) mem 7379MB [2024-08-26 11:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1060/1251] eta 0:00:47 lr 0.000764 wd 0.0500 time 0.2425 (0.2466) data time 0.0009 (0.0016) model time 0.2415 (0.2449) loss 3.3704 (3.3650) grad_norm 1.9098 (inf) loss_scale 2048.0000 (2559.5174) mem 7379MB [2024-08-26 11:51:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [110/300][1070/1251] eta 0:00:44 lr 0.000764 wd 0.0500 time 0.2333 (0.2465) data time 0.0013 (0.0016) model time 0.2321 (0.2448) loss 3.7008 (3.3666) grad_norm 2.1651 (inf) loss_scale 2048.0000 (2554.7414) mem 7379MB [2024-08-26 11:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1080/1251] eta 0:00:42 lr 0.000764 wd 0.0500 time 0.2379 (0.2467) data time 0.0010 (0.0015) model time 0.2369 (0.2450) loss 3.4298 (3.3681) grad_norm 2.3985 (inf) loss_scale 2048.0000 (2550.0537) mem 7379MB [2024-08-26 11:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1090/1251] eta 0:00:39 lr 0.000764 wd 0.0500 time 0.2343 (0.2467) data time 0.0009 (0.0015) model time 0.2334 (0.2449) loss 4.2079 (3.3697) grad_norm 2.6544 (inf) loss_scale 2048.0000 (2545.4519) mem 7379MB [2024-08-26 11:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1100/1251] eta 0:00:37 lr 0.000764 wd 0.0500 time 0.2463 (0.2466) data time 0.0007 (0.0015) model time 0.2455 (0.2449) loss 2.5478 (3.3707) grad_norm 2.0509 (inf) loss_scale 2048.0000 (2540.9337) mem 7379MB [2024-08-26 11:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1110/1251] eta 0:00:34 lr 0.000764 wd 0.0500 time 0.2396 (0.2466) data time 0.0011 (0.0015) model time 0.2384 (0.2449) loss 3.1440 (3.3705) grad_norm 1.8503 (inf) loss_scale 2048.0000 (2536.4968) mem 7379MB [2024-08-26 11:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1120/1251] eta 0:00:32 lr 0.000764 wd 0.0500 time 0.2465 (0.2465) data time 0.0011 (0.0015) model time 0.2454 (0.2448) loss 3.8421 (3.3700) grad_norm 3.9519 (inf) loss_scale 2048.0000 (2532.1392) mem 7379MB [2024-08-26 11:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1130/1251] eta 0:00:29 lr 0.000764 wd 0.0500 time 0.2330 (0.2465) data time 0.0009 (0.0015) model time 0.2321 (0.2448) loss 3.6730 (3.3685) grad_norm 1.8838 (inf) 
loss_scale 2048.0000 (2527.8585) mem 7379MB [2024-08-26 11:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1140/1251] eta 0:00:27 lr 0.000764 wd 0.0500 time 0.2774 (0.2467) data time 0.0008 (0.0015) model time 0.2766 (0.2450) loss 3.3874 (3.3681) grad_norm 2.7296 (inf) loss_scale 2048.0000 (2523.6529) mem 7379MB [2024-08-26 11:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1150/1251] eta 0:00:24 lr 0.000764 wd 0.0500 time 0.2392 (0.2466) data time 0.0009 (0.0015) model time 0.2383 (0.2449) loss 3.3976 (3.3695) grad_norm 1.8714 (inf) loss_scale 2048.0000 (2519.5204) mem 7379MB [2024-08-26 11:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1160/1251] eta 0:00:22 lr 0.000764 wd 0.0500 time 0.2454 (0.2466) data time 0.0012 (0.0015) model time 0.2442 (0.2449) loss 3.8966 (3.3704) grad_norm 1.9329 (inf) loss_scale 2048.0000 (2515.4591) mem 7379MB [2024-08-26 11:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1170/1251] eta 0:00:19 lr 0.000764 wd 0.0500 time 0.2424 (0.2465) data time 0.0007 (0.0015) model time 0.2417 (0.2448) loss 4.3786 (3.3699) grad_norm 2.0719 (inf) loss_scale 2048.0000 (2511.4671) mem 7379MB [2024-08-26 11:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1180/1251] eta 0:00:17 lr 0.000764 wd 0.0500 time 0.2382 (0.2465) data time 0.0008 (0.0015) model time 0.2374 (0.2448) loss 2.8287 (3.3687) grad_norm 2.7665 (inf) loss_scale 2048.0000 (2507.5428) mem 7379MB [2024-08-26 11:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1190/1251] eta 0:00:15 lr 0.000764 wd 0.0500 time 0.2357 (0.2464) data time 0.0009 (0.0015) model time 0.2347 (0.2447) loss 4.0583 (3.3703) grad_norm 2.1920 (inf) loss_scale 2048.0000 (2503.6843) mem 7379MB [2024-08-26 11:51:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1200/1251] eta 0:00:12 lr 0.000764 wd 0.0500 time 0.2468 (0.2464) data time 
0.0010 (0.0015) model time 0.2458 (0.2447) loss 3.5671 (3.3725) grad_norm 1.4774 (inf) loss_scale 2048.0000 (2499.8901) mem 7379MB [2024-08-26 11:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1210/1251] eta 0:00:10 lr 0.000764 wd 0.0500 time 0.2445 (0.2463) data time 0.0010 (0.0015) model time 0.2435 (0.2447) loss 3.7883 (3.3744) grad_norm 2.7308 (inf) loss_scale 2048.0000 (2496.1585) mem 7379MB [2024-08-26 11:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1220/1251] eta 0:00:07 lr 0.000764 wd 0.0500 time 0.2478 (0.2463) data time 0.0010 (0.0015) model time 0.2468 (0.2446) loss 3.6138 (3.3746) grad_norm 2.5263 (inf) loss_scale 2048.0000 (2492.4881) mem 7379MB [2024-08-26 11:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1230/1251] eta 0:00:05 lr 0.000764 wd 0.0500 time 0.2448 (0.2465) data time 0.0010 (0.0015) model time 0.2437 (0.2449) loss 3.7843 (3.3759) grad_norm 1.7936 (inf) loss_scale 2048.0000 (2488.8773) mem 7379MB [2024-08-26 11:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1240/1251] eta 0:00:02 lr 0.000764 wd 0.0500 time 0.2251 (0.2465) data time 0.0007 (0.0015) model time 0.2244 (0.2449) loss 3.3664 (3.3744) grad_norm 2.4138 (inf) loss_scale 2048.0000 (2485.3247) mem 7379MB [2024-08-26 11:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [110/300][1250/1251] eta 0:00:00 lr 0.000764 wd 0.0500 time 0.2242 (0.2465) data time 0.0005 (0.0015) model time 0.2237 (0.2449) loss 4.1096 (3.3758) grad_norm 1.8345 (inf) loss_scale 2048.0000 (2481.8289) mem 7379MB [2024-08-26 11:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 110 training takes 0:05:08 [2024-08-26 11:52:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 11:52:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 11:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.474 (0.474) Loss 0.5063 (0.5063) Acc@1 89.648 (89.648) Acc@5 97.656 (97.656) Mem 7379MB [2024-08-26 11:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.114) Loss 0.8325 (0.7848) Acc@1 82.227 (83.230) Acc@5 95.996 (96.387) Mem 7379MB [2024-08-26 11:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.098) Loss 1.1006 (0.8017) Acc@1 74.707 (82.450) Acc@5 93.848 (96.415) Mem 7379MB [2024-08-26 11:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.092) Loss 1.4248 (0.9097) Acc@1 66.699 (79.719) Acc@5 89.160 (95.133) Mem 7379MB [2024-08-26 11:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.2422 (0.9669) Acc@1 69.434 (78.187) Acc@5 91.309 (94.422) Mem 7379MB [2024-08-26 11:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.728 Acc@5 94.302 [2024-08-26 11:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.7% [2024-08-26 11:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.815 (0.815) Loss 0.4321 (0.4321) Acc@1 91.992 (91.992) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-26 11:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.150) Loss 0.6919 (0.6765) Acc@1 86.133 (85.352) Acc@5 96.777 (97.203) Mem 7379MB [2024-08-26 11:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.116) Loss 0.9658 (0.6993) Acc@1 77.246 (84.352) Acc@5 94.336 (97.163) Mem 7379MB [2024-08-26 11:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.103) Loss 1.2432 (0.7966) Acc@1 69.043 (82.091) Acc@5 
90.527 (95.961) Mem 7379MB [2024-08-26 11:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.1064 (0.8471) Acc@1 72.754 (80.702) Acc@5 93.164 (95.441) Mem 7379MB [2024-08-26 11:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.258 Acc@5 95.384 [2024-08-26 11:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.3% [2024-08-26 11:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.26% [2024-08-26 11:52:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 11:52:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 11:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][0/1251] eta 0:15:13 lr 0.000764 wd 0.0500 time 0.7300 (0.7300) data time 0.4925 (0.4925) model time 0.0000 (0.0000) loss 3.2166 (3.2166) grad_norm 2.6240 (2.6240) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][10/1251] eta 0:06:17 lr 0.000764 wd 0.0500 time 0.2452 (0.3046) data time 0.0010 (0.0457) model time 0.0000 (0.0000) loss 3.5219 (3.3998) grad_norm 2.9912 (3.1111) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][20/1251] eta 0:05:41 lr 0.000764 wd 0.0500 time 0.2748 (0.2772) data time 0.0008 (0.0244) model time 0.0000 (0.0000) loss 4.3214 (3.4513) grad_norm 2.2388 (2.8572) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][30/1251] eta 0:05:24 lr 0.000764 wd 0.0500 time 0.2365 (0.2657) data time 0.0012 (0.0169) model time 0.0000 (0.0000) loss 2.2067 (3.3743) grad_norm 2.5357 (2.6628) 
loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][40/1251] eta 0:05:15 lr 0.000763 wd 0.0500 time 0.2468 (0.2603) data time 0.0010 (0.0131) model time 0.0000 (0.0000) loss 3.4231 (3.3517) grad_norm 1.5451 (2.4670) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][50/1251] eta 0:05:13 lr 0.000763 wd 0.0500 time 0.2369 (0.2608) data time 0.0009 (0.0107) model time 0.0000 (0.0000) loss 2.1702 (3.4078) grad_norm 2.4373 (2.3988) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][60/1251] eta 0:05:07 lr 0.000763 wd 0.0500 time 0.2424 (0.2578) data time 0.0008 (0.0092) model time 0.2416 (0.2416) loss 4.1263 (3.3989) grad_norm 1.8774 (2.3132) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][70/1251] eta 0:05:01 lr 0.000763 wd 0.0500 time 0.2428 (0.2556) data time 0.0010 (0.0080) model time 0.2418 (0.2413) loss 3.5310 (3.4074) grad_norm 1.6920 (2.3756) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][80/1251] eta 0:04:57 lr 0.000763 wd 0.0500 time 0.2406 (0.2538) data time 0.0011 (0.0072) model time 0.2395 (0.2406) loss 3.9551 (3.4330) grad_norm 1.7857 (2.3282) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][90/1251] eta 0:04:53 lr 0.000763 wd 0.0500 time 0.2502 (0.2527) data time 0.0012 (0.0066) model time 0.2490 (0.2411) loss 3.2923 (3.4402) grad_norm 1.8919 (2.2665) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][100/1251] eta 0:04:51 lr 0.000763 wd 0.0500 time 0.2347 (0.2534) data 
time 0.0010 (0.0060) model time 0.2336 (0.2446) loss 3.6693 (3.4243) grad_norm 2.7176 (2.2297) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][110/1251] eta 0:04:47 lr 0.000763 wd 0.0500 time 0.2434 (0.2523) data time 0.0011 (0.0056) model time 0.2423 (0.2439) loss 3.9716 (3.4287) grad_norm 2.4956 (2.2269) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][120/1251] eta 0:04:44 lr 0.000763 wd 0.0500 time 0.2325 (0.2514) data time 0.0008 (0.0052) model time 0.2317 (0.2434) loss 3.2220 (3.4244) grad_norm 1.9130 (2.2117) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][130/1251] eta 0:04:42 lr 0.000763 wd 0.0500 time 0.2413 (0.2518) data time 0.0010 (0.0049) model time 0.2403 (0.2449) loss 3.6621 (3.4140) grad_norm 1.7815 (2.1822) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][140/1251] eta 0:04:39 lr 0.000763 wd 0.0500 time 0.2470 (0.2513) data time 0.0011 (0.0046) model time 0.2459 (0.2447) loss 3.7017 (3.4077) grad_norm 2.8490 (2.1706) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][150/1251] eta 0:04:35 lr 0.000763 wd 0.0500 time 0.2343 (0.2505) data time 0.0009 (0.0044) model time 0.2335 (0.2441) loss 3.5924 (3.4140) grad_norm 2.3046 (2.1823) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][160/1251] eta 0:04:32 lr 0.000763 wd 0.0500 time 0.2359 (0.2499) data time 0.0011 (0.0042) model time 0.2348 (0.2437) loss 3.8609 (3.4195) grad_norm 1.8452 (2.1690) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:01 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [111/300][170/1251] eta 0:04:29 lr 0.000763 wd 0.0500 time 0.2334 (0.2495) data time 0.0010 (0.0040) model time 0.2324 (0.2436) loss 3.3865 (3.4090) grad_norm 1.7150 (2.1609) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][180/1251] eta 0:04:26 lr 0.000763 wd 0.0500 time 0.2396 (0.2491) data time 0.0009 (0.0039) model time 0.2387 (0.2433) loss 2.7909 (3.4027) grad_norm 1.7056 (2.1587) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][190/1251] eta 0:04:24 lr 0.000763 wd 0.0500 time 0.2388 (0.2488) data time 0.0011 (0.0037) model time 0.2377 (0.2433) loss 3.4535 (3.4074) grad_norm 1.6063 (2.1497) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][200/1251] eta 0:04:21 lr 0.000763 wd 0.0500 time 0.2384 (0.2485) data time 0.0009 (0.0036) model time 0.2375 (0.2432) loss 3.9113 (3.4189) grad_norm 2.0833 (2.2010) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][210/1251] eta 0:04:18 lr 0.000763 wd 0.0500 time 0.2352 (0.2482) data time 0.0009 (0.0035) model time 0.2343 (0.2430) loss 4.0680 (3.4100) grad_norm 1.5597 (2.1917) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][220/1251] eta 0:04:15 lr 0.000763 wd 0.0500 time 0.2384 (0.2480) data time 0.0008 (0.0034) model time 0.2376 (0.2430) loss 3.4747 (3.4084) grad_norm 2.0242 (2.1778) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][230/1251] eta 0:04:12 lr 0.000763 wd 0.0500 time 0.2396 (0.2477) data time 0.0007 (0.0033) model time 0.2389 (0.2428) loss 2.2413 (3.4017) grad_norm 
2.3179 (2.1768) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][240/1251] eta 0:04:10 lr 0.000763 wd 0.0500 time 0.2410 (0.2475) data time 0.0010 (0.0032) model time 0.2400 (0.2427) loss 3.8599 (3.4058) grad_norm 1.4686 (2.1706) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][250/1251] eta 0:04:07 lr 0.000763 wd 0.0500 time 0.2350 (0.2473) data time 0.0009 (0.0031) model time 0.2341 (0.2427) loss 3.2333 (3.3943) grad_norm 2.0591 (2.1615) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][260/1251] eta 0:04:05 lr 0.000763 wd 0.0500 time 0.2392 (0.2478) data time 0.0010 (0.0031) model time 0.2383 (0.2434) loss 3.8281 (3.3997) grad_norm 2.7921 (2.1565) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][270/1251] eta 0:04:02 lr 0.000763 wd 0.0500 time 0.2488 (0.2476) data time 0.0010 (0.0030) model time 0.2478 (0.2433) loss 2.8805 (3.4045) grad_norm 3.2921 (2.1585) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][280/1251] eta 0:04:00 lr 0.000763 wd 0.0500 time 0.2345 (0.2474) data time 0.0007 (0.0029) model time 0.2338 (0.2432) loss 2.7788 (3.4125) grad_norm 1.8554 (2.1630) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][290/1251] eta 0:03:57 lr 0.000763 wd 0.0500 time 0.2378 (0.2472) data time 0.0011 (0.0029) model time 0.2368 (0.2431) loss 4.0145 (3.4095) grad_norm 2.4252 (2.1629) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][300/1251] eta 0:03:54 lr 0.000763 wd 0.0500 time 
0.2449 (0.2471) data time 0.0009 (0.0028) model time 0.2440 (0.2431) loss 3.5293 (3.4149) grad_norm 1.9263 (2.1638) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][310/1251] eta 0:03:52 lr 0.000762 wd 0.0500 time 0.2363 (0.2469) data time 0.0010 (0.0028) model time 0.2353 (0.2430) loss 3.7394 (3.4144) grad_norm 2.1958 (2.1672) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][320/1251] eta 0:03:49 lr 0.000762 wd 0.0500 time 0.2403 (0.2468) data time 0.0010 (0.0027) model time 0.2393 (0.2430) loss 3.3557 (3.4176) grad_norm 2.0634 (2.1667) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][330/1251] eta 0:03:47 lr 0.000762 wd 0.0500 time 0.2473 (0.2467) data time 0.0010 (0.0027) model time 0.2463 (0.2430) loss 3.2368 (3.4137) grad_norm 2.7844 (2.1681) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][340/1251] eta 0:03:44 lr 0.000762 wd 0.0500 time 0.2493 (0.2466) data time 0.0009 (0.0026) model time 0.2484 (0.2429) loss 2.1448 (3.4129) grad_norm 1.6097 (2.1674) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][350/1251] eta 0:03:42 lr 0.000762 wd 0.0500 time 0.2391 (0.2465) data time 0.0009 (0.0026) model time 0.2382 (0.2428) loss 2.2776 (3.4031) grad_norm 1.9620 (2.1774) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][360/1251] eta 0:03:39 lr 0.000762 wd 0.0500 time 0.2395 (0.2463) data time 0.0010 (0.0025) model time 0.2385 (0.2428) loss 2.8583 (3.4050) grad_norm 3.1349 (2.1933) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 11:53:49 
msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 11:53:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 11:53:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 11:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 11:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 11:55:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 11:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 11:55:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 11:55:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 11:55:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 11:55:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 111) [2024-08-26 11:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 11:56:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 11:56:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 11:56:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 12:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 12:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 12:30:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 12:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 12:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 12:34:09 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 12:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 12:34:19 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 12:34:20 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 12:34:21 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 12:34:21 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 111) [2024-08-26 12:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 12:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][370/1251] eta 2:44:53 lr 0.000762 wd 0.0500 time 11.2300 (11.2300) data time 0.6844 (0.6844) model time 10.5456 (10.5456) loss 4.1687 (4.1687) grad_norm 2.1449 (2.1449) loss_scale 2048.0000 (2048.0000) mem 20033MB [2024-08-26 12:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][380/1251] eta 0:17:57 lr 0.000762 wd 0.0500 time 0.2219 (1.2374) data time 0.0010 (0.0630) model time 0.2209 (1.1744) loss 2.8492 (3.7511) grad_norm 2.0003 (2.3641) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][390/1251] eta 0:10:50 lr 0.000762 wd 0.0500 time 0.2220 (0.7556) data time 0.0009 (0.0336) model time 0.2211 (0.7220) loss 3.7010 (3.6645) grad_norm 2.3577 (2.1881) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][400/1251] eta 0:08:19 lr 0.000762 wd 0.0500 time 0.2265 (0.5871) data time 0.0006 (0.0231) model time 0.2259 (0.5641) loss 2.5906 (3.6701) grad_norm 1.5898 (2.0608) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][410/1251] eta 0:07:01 lr 0.000762 wd 0.0500 time 0.2248 (0.5006) data time 0.0009 (0.0177) model time 0.2240 (0.4829) loss 3.2321 (3.6033) grad_norm 1.7540 (2.1175) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [111/300][420/1251] eta 0:06:11 lr 0.000762 wd 0.0500 time 0.2272 (0.4469) data time 0.0007 (0.0144) model time 0.2266 (0.4325) loss 3.7258 (3.5811) grad_norm 1.6619 (2.0478) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][430/1251] eta 0:05:37 lr 0.000762 wd 0.0500 time 0.2212 (0.4106) data time 0.0009 (0.0122) model time 0.2204 (0.3984) loss 3.5265 (3.5483) grad_norm 2.2223 (2.0998) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][440/1251] eta 0:05:12 lr 0.000762 wd 0.0500 time 0.2296 (0.3849) data time 0.0009 (0.0106) model time 0.2287 (0.3743) loss 3.1862 (3.5063) grad_norm 1.8464 (2.1488) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][450/1251] eta 0:04:52 lr 0.000762 wd 0.0500 time 0.2244 (0.3653) data time 0.0009 (0.0094) model time 0.2235 (0.3558) loss 3.0699 (3.4818) grad_norm 1.4639 (2.1568) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][460/1251] eta 0:04:37 lr 0.000762 wd 0.0500 time 0.2246 (0.3504) data time 0.0008 (0.0086) model time 0.2238 (0.3419) loss 4.0236 (3.4758) grad_norm 1.7683 (2.1186) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][470/1251] eta 0:04:24 lr 0.000762 wd 0.0500 time 0.2235 (0.3383) data time 0.0008 (0.0078) model time 0.2228 (0.3305) loss 3.5550 (3.4781) grad_norm 2.5044 (2.1126) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][480/1251] eta 0:04:13 lr 0.000762 wd 0.0500 time 0.2278 (0.3283) data time 0.0008 (0.0072) model time 0.2269 (0.3211) loss 2.7755 (3.4852) grad_norm 1.5146 (2.0903) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][490/1251] eta 0:04:03 lr 0.000762 wd 0.0500 time 0.2293 (0.3199) data time 0.0006 (0.0067) model time 0.2287 (0.3133) loss 1.9989 (3.4835) grad_norm 2.1648 (2.0846) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][500/1251] eta 0:03:54 lr 0.000762 wd 0.0500 time 0.2307 (0.3128) data time 0.0008 (0.0062) model time 0.2299 (0.3066) loss 3.5797 (3.4761) grad_norm 1.8445 (2.0812) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][510/1251] eta 0:03:47 lr 0.000762 wd 0.0500 time 0.2234 (0.3068) data time 0.0007 (0.0059) model time 0.2227 (0.3009) loss 3.7901 (3.4684) grad_norm 1.9091 (2.0829) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][520/1251] eta 0:03:40 lr 0.000762 wd 0.0500 time 0.2190 (0.3015) data time 0.0010 (0.0055) model time 0.2180 (0.2960) loss 2.5147 (3.4537) grad_norm 2.0823 (2.0667) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][530/1251] eta 0:03:34 lr 0.000762 wd 0.0500 time 0.2243 (0.2972) data time 0.0011 (0.0054) model time 0.2231 (0.2919) loss 3.8307 (3.4552) grad_norm 2.4072 (2.0745) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][540/1251] eta 0:03:28 lr 0.000762 wd 0.0500 time 0.2275 (0.2932) data time 0.0011 (0.0051) model time 0.2265 (0.2880) loss 3.1711 (3.4477) grad_norm 4.0455 (2.0734) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][550/1251] eta 0:03:23 lr 0.000762 wd 0.0500 time 0.2251 (0.2897) data time 
0.0008 (0.0049) model time 0.2243 (0.2848) loss 3.3721 (3.4241) grad_norm 1.9630 (2.0918) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][560/1251] eta 0:03:17 lr 0.000762 wd 0.0500 time 0.2275 (0.2865) data time 0.0010 (0.0047) model time 0.2265 (0.2818) loss 2.8783 (3.4223) grad_norm 1.7249 (2.0898) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 12:35:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 12:35:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 12:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 12:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 12:45:01 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 12:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 12:45:17 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 12:45:18 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 12:45:19 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 12:45:19 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 111) [2024-08-26 12:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 12:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 12:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 12:47:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 12:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 12:47:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 12:47:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 12:47:42 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 12:47:42 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 111) [2024-08-26 12:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 12:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][570/1251] eta 0:45:43 lr 0.000761 wd 0.0500 time 0.2203 (4.0290) data time 0.0010 (0.3800) model time 0.2193 (3.6489) loss 4.4152 (3.8476) grad_norm 4.2927 (2.9801) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][580/1251] eta 0:14:55 lr 0.000761 wd 0.0500 time 0.2365 (1.3346) data time 0.0007 (0.1094) model time 0.2358 (1.2251) loss 3.9304 (3.5965) grad_norm 1.4838 (2.6149) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][590/1251] eta 0:09:43 lr 0.000761 wd 0.0500 time 0.2359 (0.8832) data time 0.0010 (0.0643) model time 0.2349 (0.8189) loss 3.5658 (3.6232) grad_norm 1.7679 (2.4573) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][600/1251] eta 0:07:29 lr 0.000761 wd 0.0500 time 0.2275 (0.6900) data time 0.0006 (0.0457) model time 0.2268 (0.6444) loss 2.5925 (3.6455) grad_norm 2.0746 (2.2874) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][610/1251] eta 0:06:14 lr 0.000761 wd 0.0500 time 0.2265 (0.5850) data time 0.0008 (0.0355) model time 0.2257 (0.5494) loss 3.5544 (3.5993) grad_norm 2.5178 (2.2989) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [111/300][620/1251] eta 0:05:29 lr 0.000761 wd 0.0500 time 0.2333 (0.5223) data time 0.0009 (0.0292) model time 0.2324 (0.4931) loss 3.9018 (3.5980) grad_norm 2.1981 (2.2797) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][630/1251] eta 0:04:57 lr 0.000761 wd 0.0500 time 0.2284 (0.4786) data time 0.0007 (0.0248) model time 0.2277 (0.4538) loss 3.6447 (3.5547) grad_norm 2.0633 (2.2249) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][640/1251] eta 0:04:33 lr 0.000761 wd 0.0500 time 0.2235 (0.4481) data time 0.0009 (0.0217) model time 0.2227 (0.4264) loss 3.6287 (3.5139) grad_norm 1.9872 (2.2286) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][650/1251] eta 0:04:13 lr 0.000761 wd 0.0500 time 0.2295 (0.4223) data time 0.0009 (0.0192) model time 0.2286 (0.4030) loss 3.7994 (3.4919) grad_norm 2.1184 (2.2424) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][660/1251] eta 0:03:58 lr 0.000761 wd 0.0500 time 0.2906 (0.4032) data time 0.0017 (0.0173) model time 0.2889 (0.3859) loss 3.0688 (3.4840) grad_norm 2.0442 (2.2206) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][670/1251] eta 0:03:45 lr 0.000761 wd 0.0500 time 0.2238 (0.3882) data time 0.0016 (0.0158) model time 0.2222 (0.3725) loss 3.1373 (3.5044) grad_norm 2.0220 (2.1952) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][680/1251] eta 0:03:33 lr 0.000761 wd 0.0500 time 0.2234 (0.3742) data time 0.0010 (0.0145) model time 0.2224 (0.3597) loss 3.4348 (3.4869) grad_norm 1.8947 (2.1782) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][690/1251] eta 0:03:23 lr 0.000761 wd 0.0500 time 0.2983 (0.3631) data time 0.0015 (0.0134) model time 0.2968 (0.3497) loss 2.7632 (3.4802) grad_norm 2.6337 (2.1744) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][700/1251] eta 0:03:15 lr 0.000761 wd 0.0500 time 0.2482 (0.3542) data time 0.0010 (0.0125) model time 0.2472 (0.3416) loss 3.8444 (3.4737) grad_norm 1.6264 (2.1625) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][710/1251] eta 0:03:08 lr 0.000761 wd 0.0500 time 0.2751 (0.3482) data time 0.0014 (0.0118) model time 0.2737 (0.3364) loss 3.1111 (3.4623) grad_norm 1.3831 (2.1408) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][720/1251] eta 0:03:00 lr 0.000761 wd 0.0500 time 0.2305 (0.3405) data time 0.0007 (0.0111) model time 0.2298 (0.3294) loss 3.1843 (3.4579) grad_norm 1.6164 (2.1306) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][730/1251] eta 0:02:53 lr 0.000761 wd 0.0500 time 0.2211 (0.3337) data time 0.0008 (0.0104) model time 0.2202 (0.3232) loss 3.0094 (3.4446) grad_norm 2.0734 (2.1223) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][740/1251] eta 0:02:47 lr 0.000761 wd 0.0500 time 0.2260 (0.3276) data time 0.0011 (0.0099) model time 0.2249 (0.3177) loss 2.4848 (3.4374) grad_norm 2.3029 (2.1199) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][750/1251] eta 0:02:42 lr 0.000761 wd 0.0500 time 0.2236 (0.3248) data time 
0.0010 (0.0095) model time 0.2226 (0.3154) loss 3.0795 (3.4333) grad_norm 2.4223 (2.1210) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][760/1251] eta 0:02:37 lr 0.000761 wd 0.0500 time 0.2306 (0.3202) data time 0.0010 (0.0091) model time 0.2296 (0.3112) loss 3.8185 (3.4373) grad_norm 1.7346 (2.1021) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][770/1251] eta 0:02:32 lr 0.000761 wd 0.0500 time 0.2210 (0.3165) data time 0.0012 (0.0087) model time 0.2198 (0.3079) loss 3.7223 (3.4170) grad_norm 2.9028 (2.0933) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][780/1251] eta 0:02:27 lr 0.000761 wd 0.0500 time 0.2272 (0.3125) data time 0.0012 (0.0083) model time 0.2260 (0.3042) loss 3.5293 (3.4057) grad_norm 2.3221 (2.1132) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][790/1251] eta 0:02:23 lr 0.000761 wd 0.0500 time 0.2545 (0.3106) data time 0.0016 (0.0080) model time 0.2529 (0.3026) loss 3.6538 (3.4046) grad_norm 2.3978 (2.1232) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][800/1251] eta 0:02:18 lr 0.000761 wd 0.0500 time 0.2317 (0.3072) data time 0.0009 (0.0077) model time 0.2308 (0.2994) loss 2.5275 (3.3923) grad_norm 2.5949 (2.1227) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][810/1251] eta 0:02:14 lr 0.000761 wd 0.0500 time 0.2305 (0.3040) data time 0.0007 (0.0074) model time 0.2298 (0.2966) loss 2.4821 (3.3955) grad_norm 1.4210 (2.1240) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [111/300][820/1251] eta 0:02:10 lr 0.000761 wd 0.0500 time 0.2314 (0.3019) data time 0.0009 (0.0072) model time 0.2305 (0.2947) loss 2.5249 (3.3856) grad_norm 1.9557 (2.1628) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][830/1251] eta 0:02:06 lr 0.000760 wd 0.0500 time 0.2318 (0.3002) data time 0.0010 (0.0070) model time 0.2308 (0.2933) loss 3.4531 (3.3766) grad_norm 2.0435 (2.1651) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][840/1251] eta 0:02:02 lr 0.000760 wd 0.0500 time 0.2268 (0.2983) data time 0.0010 (0.0069) model time 0.2258 (0.2915) loss 3.5761 (3.3752) grad_norm 2.3857 (2.1589) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][850/1251] eta 0:01:58 lr 0.000760 wd 0.0500 time 0.2266 (0.2959) data time 0.0008 (0.0067) model time 0.2258 (0.2892) loss 2.6728 (3.3709) grad_norm 2.0081 (2.1769) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][860/1251] eta 0:01:55 lr 0.000760 wd 0.0500 time 0.2272 (0.2943) data time 0.0008 (0.0065) model time 0.2265 (0.2879) loss 2.9019 (3.3645) grad_norm 2.0687 (2.1750) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][870/1251] eta 0:01:51 lr 0.000760 wd 0.0500 time 0.2921 (0.2933) data time 0.0014 (0.0063) model time 0.2907 (0.2870) loss 3.6163 (3.3592) grad_norm 1.6655 (2.1724) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][880/1251] eta 0:01:48 lr 0.000760 wd 0.0500 time 0.2245 (0.2921) data time 0.0009 (0.0062) model time 0.2236 (0.2859) loss 3.4114 (3.3600) grad_norm 1.7861 (2.1657) 
loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][890/1251] eta 0:01:44 lr 0.000760 wd 0.0500 time 0.2357 (0.2907) data time 0.0009 (0.0060) model time 0.2348 (0.2847) loss 3.6147 (3.3681) grad_norm 1.9273 (2.1575) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][900/1251] eta 0:01:41 lr 0.000760 wd 0.0500 time 0.2229 (0.2888) data time 0.0009 (0.0058) model time 0.2221 (0.2829) loss 3.0065 (3.3648) grad_norm 1.4757 (2.1558) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][910/1251] eta 0:01:38 lr 0.000760 wd 0.0500 time 0.2613 (0.2879) data time 0.0010 (0.0057) model time 0.2603 (0.2822) loss 3.9516 (3.3688) grad_norm 1.9513 (2.1530) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][920/1251] eta 0:01:34 lr 0.000760 wd 0.0500 time 0.2259 (0.2869) data time 0.0011 (0.0056) model time 0.2248 (0.2813) loss 3.5647 (3.3693) grad_norm 2.9758 (2.1479) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][930/1251] eta 0:01:31 lr 0.000760 wd 0.0500 time 0.2255 (0.2853) data time 0.0008 (0.0055) model time 0.2247 (0.2798) loss 2.5174 (3.3671) grad_norm 1.7996 (2.1537) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][940/1251] eta 0:01:28 lr 0.000760 wd 0.0500 time 0.2258 (0.2838) data time 0.0011 (0.0053) model time 0.2246 (0.2784) loss 3.4644 (3.3654) grad_norm 2.7384 (2.1495) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][950/1251] eta 0:01:25 lr 0.000760 wd 0.0500 time 0.3087 (0.2831) 
data time 0.0012 (0.0052) model time 0.3074 (0.2779) loss 2.5690 (3.3585) grad_norm 3.0495 (2.1529) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][960/1251] eta 0:01:22 lr 0.000760 wd 0.0500 time 0.2419 (0.2819) data time 0.0011 (0.0051) model time 0.2408 (0.2768) loss 3.6149 (3.3577) grad_norm 4.3673 (2.1586) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][970/1251] eta 0:01:18 lr 0.000760 wd 0.0500 time 0.2239 (0.2811) data time 0.0008 (0.0050) model time 0.2232 (0.2761) loss 3.8018 (3.3644) grad_norm 1.7257 (2.1606) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][980/1251] eta 0:01:15 lr 0.000760 wd 0.0500 time 0.2279 (0.2799) data time 0.0010 (0.0050) model time 0.2270 (0.2749) loss 3.0043 (3.3677) grad_norm 2.0647 (2.1565) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][990/1251] eta 0:01:12 lr 0.000760 wd 0.0500 time 0.2347 (0.2792) data time 0.0007 (0.0049) model time 0.2340 (0.2743) loss 4.3910 (3.3680) grad_norm 1.9033 (2.1467) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1000/1251] eta 0:01:09 lr 0.000760 wd 0.0500 time 0.2292 (0.2785) data time 0.0008 (0.0048) model time 0.2284 (0.2738) loss 3.9470 (3.3773) grad_norm 2.1677 (2.1503) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1010/1251] eta 0:01:06 lr 0.000760 wd 0.0500 time 0.2228 (0.2775) data time 0.0009 (0.0047) model time 0.2219 (0.2728) loss 3.5662 (3.3783) grad_norm 2.1217 (2.1505) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 12:49:53 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [111/300][1020/1251] eta 0:01:03 lr 0.000760 wd 0.0500 time 0.2609 (0.2769) data time 0.0008 (0.0046) model time 0.2601 (0.2723) loss 3.9680 (3.3752) grad_norm 1.6690 (2.1458) loss_scale 4096.0000 (2079.5771) mem 7377MB [2024-08-26 12:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1030/1251] eta 0:01:01 lr 0.000760 wd 0.0500 time 0.3300 (0.2765) data time 0.0015 (0.0045) model time 0.3285 (0.2719) loss 3.9782 (3.3721) grad_norm 2.3281 (2.1416) loss_scale 4096.0000 (2123.0345) mem 7377MB [2024-08-26 12:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1040/1251] eta 0:00:58 lr 0.000760 wd 0.0500 time 0.2285 (0.2760) data time 0.0011 (0.0045) model time 0.2274 (0.2716) loss 3.4066 (3.3635) grad_norm 1.7870 (2.1367) loss_scale 4096.0000 (2164.6582) mem 7377MB [2024-08-26 12:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1050/1251] eta 0:00:55 lr 0.000760 wd 0.0500 time 0.2254 (0.2751) data time 0.0018 (0.0044) model time 0.2237 (0.2706) loss 3.2879 (3.3640) grad_norm 2.9287 (2.1472) loss_scale 4096.0000 (2204.5620) mem 7377MB [2024-08-26 12:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1060/1251] eta 0:00:52 lr 0.000760 wd 0.0500 time 0.2361 (0.2741) data time 0.0012 (0.0044) model time 0.2349 (0.2698) loss 3.2135 (3.3633) grad_norm 2.5359 (2.1569) loss_scale 4096.0000 (2242.8502) mem 7377MB [2024-08-26 12:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1070/1251] eta 0:00:49 lr 0.000760 wd 0.0500 time 0.3048 (0.2736) data time 0.0016 (0.0043) model time 0.3032 (0.2693) loss 3.7430 (3.3620) grad_norm 1.5607 (2.1550) loss_scale 4096.0000 (2279.6190) mem 7377MB [2024-08-26 12:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1080/1251] eta 0:00:46 lr 0.000760 wd 0.0500 time 0.2282 (0.2731) data time 0.0009 (0.0042) model time 0.2272 (0.2688) loss 3.5773 (3.3718) 
grad_norm 2.8647 (2.1507) loss_scale 4096.0000 (2314.9572) mem 7377MB [2024-08-26 12:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1090/1251] eta 0:00:43 lr 0.000759 wd 0.0500 time 0.2323 (0.2726) data time 0.0007 (0.0042) model time 0.2316 (0.2684) loss 2.4081 (3.3640) grad_norm 1.6733 (2.1493) loss_scale 4096.0000 (2348.9466) mem 7377MB [2024-08-26 12:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1100/1251] eta 0:00:41 lr 0.000759 wd 0.0500 time 0.2348 (0.2717) data time 0.0006 (0.0041) model time 0.2342 (0.2676) loss 2.8000 (3.3610) grad_norm 1.4004 (2.1481) loss_scale 4096.0000 (2381.6629) mem 7377MB [2024-08-26 12:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1110/1251] eta 0:00:38 lr 0.000759 wd 0.0500 time 0.2737 (0.2710) data time 0.0008 (0.0041) model time 0.2729 (0.2669) loss 4.5087 (3.3611) grad_norm 1.3420 (2.1447) loss_scale 4096.0000 (2413.1765) mem 7377MB [2024-08-26 12:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1120/1251] eta 0:00:35 lr 0.000759 wd 0.0500 time 0.2383 (0.2709) data time 0.0008 (0.0040) model time 0.2375 (0.2669) loss 3.7901 (3.3636) grad_norm 1.7638 (2.1398) loss_scale 4096.0000 (2443.5523) mem 7377MB [2024-08-26 12:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1130/1251] eta 0:00:32 lr 0.000759 wd 0.0500 time 0.2251 (0.2702) data time 0.0010 (0.0040) model time 0.2241 (0.2662) loss 3.4759 (3.3656) grad_norm 1.9765 (2.1423) loss_scale 4096.0000 (2472.8511) mem 7377MB [2024-08-26 12:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 12:50:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 12:50:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 12:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 12:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 12:52:32 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 12:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 12:52:43 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 12:52:44 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 12:52:45 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 12:52:46 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 111) [2024-08-26 12:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 12:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1140/1251] eta 0:06:35 lr 0.000759 wd 0.0500 time 0.2470 (3.5667) data time 0.0007 (0.1670) model time 0.2463 (3.3997) loss 3.7885 (3.7503) grad_norm 2.1995 (2.8757) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1150/1251] eta 0:02:00 lr 0.000759 wd 0.0500 time 0.2474 (1.1926) data time 0.0007 (0.0484) model time 0.2467 (1.1442) loss 3.6516 (3.6771) grad_norm 2.5808 (2.7874) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1160/1251] eta 0:01:12 lr 0.000759 wd 0.0500 time 0.2430 (0.7967) data time 0.0010 (0.0287) model time 0.2419 (0.7680) loss 
3.5458 (3.6338) grad_norm 1.7163 (2.5631) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1170/1251] eta 0:00:51 lr 0.000759 wd 0.0500 time 0.2369 (0.6336) data time 0.0008 (0.0205) model time 0.2360 (0.6131) loss 2.8584 (3.6076) grad_norm 1.3683 (2.4126) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1180/1251] eta 0:00:38 lr 0.000759 wd 0.0500 time 0.2450 (0.5449) data time 0.0008 (0.0161) model time 0.2442 (0.5288) loss 3.2853 (3.5524) grad_norm 3.3190 (2.3138) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1190/1251] eta 0:00:29 lr 0.000759 wd 0.0500 time 0.2325 (0.4886) data time 0.0007 (0.0133) model time 0.2318 (0.4752) loss 4.0015 (3.5477) grad_norm 2.7030 (2.3334) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1200/1251] eta 0:00:22 lr 0.000759 wd 0.0500 time 0.2568 (0.4503) data time 0.0007 (0.0114) model time 0.2561 (0.4389) loss 3.4747 (3.5285) grad_norm 1.7802 (2.2881) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1210/1251] eta 0:00:17 lr 0.000759 wd 0.0500 time 0.2449 (0.4220) data time 0.0007 (0.0100) model time 0.2442 (0.4119) loss 3.6887 (3.5014) grad_norm 1.8039 (2.2458) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1220/1251] eta 0:00:12 lr 0.000759 wd 0.0500 time 0.2363 (0.4009) data time 0.0011 (0.0090) model time 0.2352 (0.3919) loss 3.8205 (3.4757) grad_norm 2.0201 (2.2375) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1230/1251] eta 
0:00:08 lr 0.000759 wd 0.0500 time 0.2402 (0.3840) data time 0.0011 (0.0081) model time 0.2391 (0.3759) loss 3.2566 (3.4648) grad_norm 2.2675 (2.2282) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1240/1251] eta 0:00:04 lr 0.000759 wd 0.0500 time 0.2248 (0.3698) data time 0.0007 (0.0076) model time 0.2240 (0.3622) loss 3.4363 (3.4785) grad_norm 3.0407 (2.2150) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [111/300][1250/1251] eta 0:00:00 lr 0.000759 wd 0.0500 time 0.2250 (0.3571) data time 0.0007 (0.0070) model time 0.2243 (0.3502) loss 3.8012 (3.4653) grad_norm 1.9665 (2.1852) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 12:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 111 training takes 0:00:40 [2024-08-26 12:53:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 12:53:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 12:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.478 (0.478) Loss 0.5557 (0.5557) Acc@1 89.844 (89.844) Acc@5 97.949 (97.949) Mem 7377MB [2024-08-26 12:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.111) Loss 0.8174 (0.8172) Acc@1 83.496 (82.928) Acc@5 96.582 (96.413) Mem 7377MB [2024-08-26 12:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.096) Loss 1.1133 (0.8347) Acc@1 73.633 (81.966) Acc@5 93.750 (96.410) Mem 7377MB [2024-08-26 12:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 1.4053 (0.9426) Acc@1 66.992 (79.483) Acc@5 89.551 (94.957) Mem 7377MB [2024-08-26 12:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.4072 (1.0008) Acc@1 68.750 (77.958) Acc@5 89.160 (94.253) Mem 7377MB [2024-08-26 12:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.560 Acc@5 94.146 [2024-08-26 12:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.6% [2024-08-26 12:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.830 (0.830) Loss 0.4309 (0.4309) Acc@1 91.797 (91.797) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-26 12:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.151) Loss 0.6899 (0.6745) Acc@1 86.523 (85.423) Acc@5 96.875 (97.230) Mem 7377MB [2024-08-26 12:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.115) Loss 0.9624 (0.6974) Acc@1 77.441 (84.422) Acc@5 94.434 (97.196) Mem 7377MB [2024-08-26 12:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.103) Loss 1.2383 (0.7946) Acc@1 68.652 (82.126) Acc@5 90.918 (96.037) Mem 7377MB [2024-08-26 12:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.094) Loss 1.1055 
(0.8452) Acc@1 72.754 (80.726) Acc@5 93.164 (95.513) Mem 7377MB [2024-08-26 12:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.288 Acc@5 95.450 [2024-08-26 12:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.3% [2024-08-26 12:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.29% [2024-08-26 12:53:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 12:53:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 12:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][0/1251] eta 0:16:07 lr 0.000759 wd 0.0500 time 0.7730 (0.7730) data time 0.5119 (0.5119) model time 0.0000 (0.0000) loss 2.3762 (2.3762) grad_norm 1.7611 (1.7611) loss_scale 4096.0000 (4096.0000) mem 7380MB [2024-08-26 12:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][10/1251] eta 0:06:04 lr 0.000759 wd 0.0500 time 0.2461 (0.2936) data time 0.0009 (0.0475) model time 0.0000 (0.0000) loss 2.9797 (3.3987) grad_norm 1.9068 (2.1529) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][20/1251] eta 0:05:32 lr 0.000759 wd 0.0500 time 0.2478 (0.2699) data time 0.0009 (0.0254) model time 0.0000 (0.0000) loss 3.1904 (3.4338) grad_norm 2.1887 (2.2002) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][30/1251] eta 0:05:20 lr 0.000759 wd 0.0500 time 0.2568 (0.2625) data time 0.0009 (0.0175) model time 0.0000 (0.0000) loss 3.7837 (3.3807) grad_norm 1.9386 (2.1307) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][40/1251] eta 
0:05:11 lr 0.000759 wd 0.0500 time 0.2402 (0.2576) data time 0.0010 (0.0135) model time 0.0000 (0.0000) loss 3.7119 (3.3699) grad_norm 2.1792 (2.1721) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][50/1251] eta 0:05:06 lr 0.000759 wd 0.0500 time 0.2470 (0.2554) data time 0.0011 (0.0113) model time 0.0000 (0.0000) loss 3.5684 (3.3666) grad_norm 2.8216 (2.2539) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][60/1251] eta 0:05:02 lr 0.000759 wd 0.0500 time 0.2476 (0.2538) data time 0.0009 (0.0096) model time 0.2467 (0.2447) loss 3.1528 (3.3428) grad_norm 2.0238 (2.2585) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][70/1251] eta 0:04:58 lr 0.000759 wd 0.0500 time 0.2436 (0.2526) data time 0.0011 (0.0084) model time 0.2425 (0.2443) loss 3.6445 (3.3391) grad_norm 2.2393 (2.2349) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][80/1251] eta 0:04:54 lr 0.000759 wd 0.0500 time 0.2381 (0.2513) data time 0.0010 (0.0075) model time 0.2371 (0.2432) loss 2.5636 (3.3320) grad_norm 2.0399 (2.2666) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][90/1251] eta 0:04:50 lr 0.000759 wd 0.0500 time 0.2360 (0.2504) data time 0.0011 (0.0068) model time 0.2349 (0.2429) loss 2.9679 (3.3064) grad_norm 1.4038 (2.2486) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][100/1251] eta 0:04:47 lr 0.000759 wd 0.0500 time 0.2427 (0.2498) data time 0.0011 (0.0062) model time 0.2416 (0.2430) loss 2.9809 (3.2988) grad_norm 2.6053 (2.2542) loss_scale 4096.0000 (4096.0000) mem 7381MB 
[2024-08-26 12:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][110/1251] eta 0:04:44 lr 0.000758 wd 0.0500 time 0.2458 (0.2495) data time 0.0008 (0.0058) model time 0.2450 (0.2434) loss 3.2157 (3.3059) grad_norm 1.6187 (2.2326) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][120/1251] eta 0:04:41 lr 0.000758 wd 0.0500 time 0.2512 (0.2491) data time 0.0011 (0.0054) model time 0.2501 (0.2435) loss 3.5846 (3.2936) grad_norm 3.1285 (2.2348) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][130/1251] eta 0:04:38 lr 0.000758 wd 0.0500 time 0.2459 (0.2486) data time 0.0009 (0.0050) model time 0.2450 (0.2432) loss 3.1042 (3.3072) grad_norm 2.2134 (2.2201) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][140/1251] eta 0:04:35 lr 0.000758 wd 0.0500 time 0.2351 (0.2483) data time 0.0009 (0.0047) model time 0.2342 (0.2432) loss 2.6469 (3.2892) grad_norm 1.8203 (2.2091) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][150/1251] eta 0:04:33 lr 0.000758 wd 0.0500 time 0.2514 (0.2481) data time 0.0008 (0.0045) model time 0.2506 (0.2433) loss 2.5305 (3.2735) grad_norm 2.6705 (2.2185) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][160/1251] eta 0:04:30 lr 0.000758 wd 0.0500 time 0.2473 (0.2478) data time 0.0009 (0.0043) model time 0.2463 (0.2431) loss 3.6056 (3.2893) grad_norm 1.7866 (2.2408) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][170/1251] eta 0:04:27 lr 0.000758 wd 0.0500 time 0.2403 (0.2474) data time 0.0013 (0.0041) model time 0.2389 
(0.2429) loss 3.6173 (3.2907) grad_norm 1.6527 (2.2283) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][180/1251] eta 0:04:26 lr 0.000758 wd 0.0500 time 0.5165 (0.2487) data time 0.0011 (0.0039) model time 0.5154 (0.2450) loss 3.3284 (3.2895) grad_norm 2.4513 (2.2161) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][190/1251] eta 0:04:23 lr 0.000758 wd 0.0500 time 0.2494 (0.2484) data time 0.0010 (0.0038) model time 0.2484 (0.2447) loss 2.5166 (3.2741) grad_norm 1.8605 (2.2120) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 12:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 12:54:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 12:54:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 12:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 12:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 12:57:44 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 12:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 12:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 12:59:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 12:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 12:59:44 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 12:59:45 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 12:59:46 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 12:59:46 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 112) [2024-08-26 12:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 13:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][200/1251] eta 0:32:37 lr 0.000758 wd 0.0500 time 0.2381 (1.8624) data time 0.0008 (0.1002) model time 0.2373 (1.7622) loss 3.9735 (3.7996) grad_norm 1.9246 (2.3337) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][210/1251] eta 0:17:29 lr 0.000758 wd 0.0500 time 0.2390 (1.0086) data time 0.0010 (0.0480) model time 0.2380 (0.9606) loss 3.2787 (3.5914) grad_norm 2.0858 (2.2890) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][220/1251] eta 0:12:46 lr 0.000758 wd 0.0500 time 0.2422 (0.7433) data time 0.0008 (0.0318) model time 0.2413 (0.7115) loss 4.1326 (3.6546) grad_norm 1.6420 (2.1886) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][230/1251] eta 0:10:27 lr 0.000758 wd 0.0500 time 0.2383 (0.6147) data time 0.0010 (0.0239) model time 0.2373 (0.5907) loss 3.1897 (3.6030) grad_norm 2.8155 (2.2416) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][240/1251] eta 0:09:04 lr 0.000758 wd 0.0500 time 0.2428 (0.5385) data time 0.0010 (0.0193) model time 0.2418 (0.5193) loss 3.2910 (3.5680) grad_norm 3.1766 (2.2961) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][250/1251] eta 0:08:08 lr 0.000758 wd 0.0500 time 0.2461 (0.4882) data time 0.0008 (0.0162) model time 0.2454 (0.4720) loss 2.5708 (3.5070) grad_norm 1.7451 (2.2299) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][260/1251] eta 0:07:28 lr 0.000758 wd 0.0500 time 0.2518 (0.4526) data time 0.0010 (0.0140) model time 0.2508 (0.4386) loss 3.4356 (3.4710) grad_norm 1.8920 (2.2214) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][270/1251] eta 0:06:57 lr 0.000758 wd 0.0500 time 0.2408 (0.4257) data time 0.0009 (0.0124) model time 0.2399 (0.4133) loss 3.5834 (3.4426) grad_norm 2.2645 (2.1808) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][280/1251] eta 0:06:33 lr 0.000758 wd 0.0500 time 0.2374 (0.4052) data time 0.0007 (0.0111) model time 0.2366 (0.3940) loss 3.3608 (3.4133) grad_norm 2.0883 (2.1631) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][290/1251] eta 0:06:13 lr 0.000758 wd 0.0500 time 0.2365 (0.3884) data time 0.0010 (0.0101) model time 0.2355 (0.3783) loss 3.4940 (3.4234) grad_norm 1.5976 (2.1756) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][300/1251] eta 0:05:56 lr 0.000758 wd 0.0500 time 0.2381 (0.3748) data time 0.0008 (0.0093) model time 0.2372 (0.3655) loss 3.8235 
(3.4339) grad_norm 1.9376 (2.2012) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][310/1251] eta 0:05:42 lr 0.000758 wd 0.0500 time 0.2414 (0.3637) data time 0.0011 (0.0086) model time 0.2403 (0.3551) loss 3.3520 (3.4309) grad_norm 2.4094 (2.2064) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][320/1251] eta 0:05:29 lr 0.000758 wd 0.0500 time 0.2396 (0.3542) data time 0.0008 (0.0080) model time 0.2388 (0.3462) loss 3.0767 (3.4060) grad_norm 1.8505 (2.2204) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][330/1251] eta 0:05:18 lr 0.000758 wd 0.0500 time 0.2414 (0.3460) data time 0.0008 (0.0075) model time 0.2406 (0.3385) loss 4.0980 (3.4035) grad_norm 1.5400 (2.2095) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][340/1251] eta 0:05:08 lr 0.000758 wd 0.0500 time 0.2396 (0.3391) data time 0.0007 (0.0071) model time 0.2389 (0.3320) loss 2.5900 (3.3877) grad_norm 1.8841 (2.2014) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][350/1251] eta 0:04:59 lr 0.000758 wd 0.0500 time 0.2389 (0.3329) data time 0.0008 (0.0068) model time 0.2382 (0.3261) loss 4.3486 (3.3871) grad_norm 2.0592 (2.1811) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][360/1251] eta 0:04:51 lr 0.000758 wd 0.0500 time 0.2275 (0.3274) data time 0.0012 (0.0064) model time 0.2263 (0.3210) loss 3.5498 (3.3936) grad_norm 2.5407 (2.1780) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][370/1251] eta 0:04:44 lr 
0.000757 wd 0.0500 time 0.2423 (0.3225) data time 0.0008 (0.0061) model time 0.2415 (0.3163) loss 3.1357 (3.3738) grad_norm 1.5131 (2.1669) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][380/1251] eta 0:04:36 lr 0.000757 wd 0.0500 time 0.2295 (0.3180) data time 0.0008 (0.0059) model time 0.2286 (0.3121) loss 4.3514 (3.3783) grad_norm 2.0236 (2.1544) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][390/1251] eta 0:04:30 lr 0.000757 wd 0.0500 time 0.2464 (0.3141) data time 0.0010 (0.0056) model time 0.2454 (0.3084) loss 2.5358 (3.3613) grad_norm 3.1424 (2.1490) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][400/1251] eta 0:04:24 lr 0.000757 wd 0.0500 time 0.2369 (0.3104) data time 0.0009 (0.0054) model time 0.2360 (0.3050) loss 4.0237 (3.3540) grad_norm 2.4409 (2.1372) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][410/1251] eta 0:04:18 lr 0.000757 wd 0.0500 time 0.2462 (0.3071) data time 0.0007 (0.0052) model time 0.2455 (0.3019) loss 3.8482 (3.3487) grad_norm 2.2623 (2.1501) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][420/1251] eta 0:04:12 lr 0.000757 wd 0.0500 time 0.2331 (0.3041) data time 0.0007 (0.0051) model time 0.2324 (0.2991) loss 2.7896 (3.3501) grad_norm 2.3128 (2.1506) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][430/1251] eta 0:04:07 lr 0.000757 wd 0.0500 time 0.2375 (0.3017) data time 0.0009 (0.0049) model time 0.2366 (0.2968) loss 2.4698 (3.3403) grad_norm 2.2222 (2.1509) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 
13:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][440/1251] eta 0:04:02 lr 0.000757 wd 0.0500 time 0.2351 (0.2993) data time 0.0009 (0.0047) model time 0.2342 (0.2946) loss 3.3034 (3.3395) grad_norm 1.7712 (2.1470) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][450/1251] eta 0:03:57 lr 0.000757 wd 0.0500 time 0.2374 (0.2970) data time 0.0010 (0.0046) model time 0.2365 (0.2924) loss 3.9880 (3.3363) grad_norm 2.0057 (2.1567) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][460/1251] eta 0:03:53 lr 0.000757 wd 0.0500 time 0.2362 (0.2950) data time 0.0008 (0.0045) model time 0.2354 (0.2905) loss 2.1013 (3.3224) grad_norm 2.2883 (2.1518) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][470/1251] eta 0:03:48 lr 0.000757 wd 0.0500 time 0.2349 (0.2930) data time 0.0009 (0.0044) model time 0.2340 (0.2886) loss 3.8504 (3.3330) grad_norm 2.2800 (2.1415) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][480/1251] eta 0:03:45 lr 0.000757 wd 0.0500 time 0.2371 (0.2920) data time 0.0009 (0.0042) model time 0.2361 (0.2878) loss 3.4983 (3.3357) grad_norm 2.0105 (2.1368) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][490/1251] eta 0:03:40 lr 0.000757 wd 0.0500 time 0.2440 (0.2904) data time 0.0010 (0.0041) model time 0.2430 (0.2862) loss 3.6772 (3.3198) grad_norm 2.4568 (2.1308) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][500/1251] eta 0:03:37 lr 0.000757 wd 0.0500 time 0.2438 (0.2894) data time 0.0010 (0.0040) model time 0.2428 (0.2854) 
loss 3.6737 (3.3181) grad_norm 1.7219 (2.1273) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][510/1251] eta 0:03:33 lr 0.000757 wd 0.0500 time 0.2328 (0.2878) data time 0.0009 (0.0040) model time 0.2320 (0.2839) loss 4.0353 (3.3281) grad_norm 1.9514 (2.1223) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][520/1251] eta 0:03:29 lr 0.000757 wd 0.0500 time 0.2364 (0.2864) data time 0.0010 (0.0039) model time 0.2354 (0.2826) loss 2.2081 (3.3309) grad_norm 2.1558 (2.1265) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 13:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][530/1251] eta 0:03:25 lr 0.000757 wd 0.0500 time 0.2346 (0.2850) data time 0.0011 (0.0038) model time 0.2334 (0.2812) loss 3.1025 (3.3315) grad_norm 1.9797 (nan) loss_scale 2048.0000 (4071.8348) mem 7379MB [2024-08-26 13:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][540/1251] eta 0:03:21 lr 0.000757 wd 0.0500 time 0.2408 (0.2837) data time 0.0008 (0.0037) model time 0.2401 (0.2800) loss 3.6358 (3.3316) grad_norm 2.4921 (nan) loss_scale 2048.0000 (4013.8453) mem 7379MB [2024-08-26 13:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][550/1251] eta 0:03:18 lr 0.000757 wd 0.0500 time 0.2406 (0.2825) data time 0.0007 (0.0036) model time 0.2398 (0.2789) loss 2.8556 (3.3321) grad_norm 2.5113 (nan) loss_scale 2048.0000 (3959.0864) mem 7379MB [2024-08-26 13:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][560/1251] eta 0:03:14 lr 0.000757 wd 0.0500 time 0.2371 (0.2813) data time 0.0010 (0.0036) model time 0.2361 (0.2778) loss 3.5728 (3.3330) grad_norm 2.2797 (nan) loss_scale 2048.0000 (3907.2954) mem 7379MB [2024-08-26 13:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][570/1251] eta 0:03:10 lr 
0.000757 wd 0.0500 time 0.2430 (0.2802) data time 0.0008 (0.0035) model time 0.2422 (0.2767) loss 3.9565 (3.3349) grad_norm 1.9157 (nan) loss_scale 2048.0000 (3858.2375) mem 7379MB [2024-08-26 13:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][580/1251] eta 0:03:07 lr 0.000757 wd 0.0500 time 0.2389 (0.2792) data time 0.0008 (0.0035) model time 0.2382 (0.2757) loss 3.8244 (3.3267) grad_norm 2.0991 (nan) loss_scale 2048.0000 (3811.7018) mem 7379MB [2024-08-26 13:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][590/1251] eta 0:03:03 lr 0.000757 wd 0.0500 time 0.2470 (0.2782) data time 0.0009 (0.0034) model time 0.2462 (0.2748) loss 3.7898 (3.3287) grad_norm 2.2977 (nan) loss_scale 2048.0000 (3767.4987) mem 7379MB [2024-08-26 13:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][600/1251] eta 0:03:00 lr 0.000757 wd 0.0500 time 0.2431 (0.2774) data time 0.0009 (0.0033) model time 0.2422 (0.2740) loss 3.6212 (3.3340) grad_norm 1.8111 (nan) loss_scale 2048.0000 (3725.4572) mem 7379MB [2024-08-26 13:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][610/1251] eta 0:02:57 lr 0.000757 wd 0.0500 time 0.2395 (0.2765) data time 0.0008 (0.0033) model time 0.2387 (0.2732) loss 2.0477 (3.3331) grad_norm 2.7845 (nan) loss_scale 2048.0000 (3685.4224) mem 7379MB [2024-08-26 13:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][620/1251] eta 0:02:53 lr 0.000757 wd 0.0500 time 0.2407 (0.2757) data time 0.0009 (0.0032) model time 0.2398 (0.2724) loss 3.4804 (3.3419) grad_norm 2.0052 (nan) loss_scale 2048.0000 (3647.2541) mem 7379MB [2024-08-26 13:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][630/1251] eta 0:02:50 lr 0.000756 wd 0.0500 time 0.2406 (0.2750) data time 0.0009 (0.0032) model time 0.2397 (0.2718) loss 3.4703 (3.3477) grad_norm 2.3143 (nan) loss_scale 2048.0000 (3610.8246) mem 7379MB [2024-08-26 13:01:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][640/1251] eta 0:02:47 lr 0.000756 wd 0.0500 time 0.2355 (0.2742) data time 0.0011 (0.0031) model time 0.2344 (0.2711) loss 3.2547 (3.3473) grad_norm 1.6989 (nan) loss_scale 2048.0000 (3576.0178) mem 7379MB [2024-08-26 13:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][650/1251] eta 0:02:44 lr 0.000756 wd 0.0500 time 0.2450 (0.2735) data time 0.0009 (0.0031) model time 0.2440 (0.2704) loss 2.6642 (3.3423) grad_norm 2.2992 (nan) loss_scale 2048.0000 (3542.7277) mem 7379MB [2024-08-26 13:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][660/1251] eta 0:02:41 lr 0.000756 wd 0.0500 time 0.2450 (0.2728) data time 0.0007 (0.0030) model time 0.2443 (0.2698) loss 2.1102 (3.3364) grad_norm 2.5284 (nan) loss_scale 2048.0000 (3510.8571) mem 7379MB [2024-08-26 13:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][670/1251] eta 0:02:38 lr 0.000756 wd 0.0500 time 0.2490 (0.2722) data time 0.0010 (0.0030) model time 0.2480 (0.2692) loss 3.8595 (3.3337) grad_norm 1.6501 (nan) loss_scale 2048.0000 (3480.3173) mem 7379MB [2024-08-26 13:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][680/1251] eta 0:02:35 lr 0.000756 wd 0.0500 time 0.2437 (0.2716) data time 0.0012 (0.0030) model time 0.2424 (0.2687) loss 3.3828 (3.3390) grad_norm 2.4355 (nan) loss_scale 2048.0000 (3451.0266) mem 7379MB [2024-08-26 13:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][690/1251] eta 0:02:32 lr 0.000756 wd 0.0500 time 0.2420 (0.2710) data time 0.0008 (0.0029) model time 0.2413 (0.2681) loss 3.3682 (3.3401) grad_norm 2.0331 (nan) loss_scale 2048.0000 (3422.9098) mem 7379MB [2024-08-26 13:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][700/1251] eta 0:02:28 lr 0.000756 wd 0.0500 time 0.2329 (0.2704) data time 0.0009 (0.0029) model time 0.2320 (0.2675) loss 4.2452 (3.3409) 
grad_norm 1.9651 (nan) loss_scale 2048.0000 (3395.8978) mem 7379MB [2024-08-26 13:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][710/1251] eta 0:02:25 lr 0.000756 wd 0.0500 time 0.2528 (0.2698) data time 0.0011 (0.0029) model time 0.2516 (0.2669) loss 2.5602 (3.3452) grad_norm 2.8063 (nan) loss_scale 2048.0000 (3369.9268) mem 7379MB [2024-08-26 13:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][720/1251] eta 0:02:22 lr 0.000756 wd 0.0500 time 0.2477 (0.2693) data time 0.0008 (0.0028) model time 0.2469 (0.2665) loss 3.7879 (3.3405) grad_norm 1.7655 (nan) loss_scale 2048.0000 (3344.9376) mem 7379MB [2024-08-26 13:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][730/1251] eta 0:02:20 lr 0.000756 wd 0.0500 time 0.2322 (0.2687) data time 0.0009 (0.0028) model time 0.2313 (0.2659) loss 3.7583 (3.3393) grad_norm 2.4706 (nan) loss_scale 2048.0000 (3320.8757) mem 7379MB [2024-08-26 13:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][740/1251] eta 0:02:17 lr 0.000756 wd 0.0500 time 0.2427 (0.2682) data time 0.0010 (0.0028) model time 0.2417 (0.2654) loss 3.9779 (3.3422) grad_norm 1.8959 (nan) loss_scale 2048.0000 (3297.6903) mem 7379MB [2024-08-26 13:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][750/1251] eta 0:02:14 lr 0.000756 wd 0.0500 time 0.2307 (0.2677) data time 0.0007 (0.0027) model time 0.2300 (0.2650) loss 4.0511 (3.3443) grad_norm 2.4071 (nan) loss_scale 2048.0000 (3275.3345) mem 7379MB [2024-08-26 13:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][760/1251] eta 0:02:11 lr 0.000756 wd 0.0500 time 0.2409 (0.2673) data time 0.0009 (0.0027) model time 0.2400 (0.2645) loss 3.1712 (3.3469) grad_norm 1.5132 (nan) loss_scale 2048.0000 (3253.7645) mem 7379MB [2024-08-26 13:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][770/1251] eta 0:02:08 lr 0.000756 wd 0.0500 time 0.2391 
(0.2668) data time 0.0007 (0.0027) model time 0.2384 (0.2641) loss 3.8253 (3.3490) grad_norm 1.8715 (nan) loss_scale 2048.0000 (3232.9396) mem 7379MB [2024-08-26 13:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][780/1251] eta 0:02:05 lr 0.000756 wd 0.0500 time 0.2409 (0.2663) data time 0.0009 (0.0027) model time 0.2400 (0.2636) loss 2.5454 (3.3494) grad_norm 2.0406 (nan) loss_scale 2048.0000 (3212.8217) mem 7379MB [2024-08-26 13:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][790/1251] eta 0:02:02 lr 0.000756 wd 0.0500 time 0.2496 (0.2659) data time 0.0010 (0.0026) model time 0.2486 (0.2633) loss 3.1107 (3.3532) grad_norm 1.6744 (nan) loss_scale 2048.0000 (3193.3756) mem 7379MB [2024-08-26 13:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][800/1251] eta 0:01:59 lr 0.000756 wd 0.0500 time 0.2497 (0.2655) data time 0.0007 (0.0026) model time 0.2490 (0.2629) loss 3.7951 (3.3536) grad_norm 1.6815 (nan) loss_scale 2048.0000 (3174.5681) mem 7379MB [2024-08-26 13:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][810/1251] eta 0:01:56 lr 0.000756 wd 0.0500 time 0.2424 (0.2651) data time 0.0007 (0.0026) model time 0.2416 (0.2625) loss 3.7263 (3.3537) grad_norm 1.8566 (nan) loss_scale 2048.0000 (3156.3683) mem 7379MB [2024-08-26 13:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][820/1251] eta 0:01:54 lr 0.000756 wd 0.0500 time 0.2387 (0.2647) data time 0.0012 (0.0026) model time 0.2375 (0.2621) loss 3.2559 (3.3555) grad_norm 3.1979 (nan) loss_scale 2048.0000 (3138.7472) mem 7379MB [2024-08-26 13:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][830/1251] eta 0:01:51 lr 0.000756 wd 0.0500 time 0.2428 (0.2643) data time 0.0010 (0.0025) model time 0.2418 (0.2618) loss 4.1494 (3.3590) grad_norm 2.0086 (nan) loss_scale 2048.0000 (3121.6776) mem 7379MB [2024-08-26 13:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [112/300][840/1251] eta 0:01:48 lr 0.000756 wd 0.0500 time 0.2439 (0.2640) data time 0.0009 (0.0025) model time 0.2430 (0.2615) loss 3.2142 (3.3527) grad_norm 1.9642 (nan) loss_scale 2048.0000 (3105.1341) mem 7379MB [2024-08-26 13:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][850/1251] eta 0:01:45 lr 0.000756 wd 0.0500 time 0.2462 (0.2637) data time 0.0011 (0.0025) model time 0.2452 (0.2612) loss 3.6144 (3.3515) grad_norm 2.1200 (nan) loss_scale 2048.0000 (3089.0926) mem 7379MB [2024-08-26 13:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][860/1251] eta 0:01:42 lr 0.000756 wd 0.0500 time 0.2295 (0.2633) data time 0.0009 (0.0025) model time 0.2287 (0.2609) loss 3.7541 (3.3530) grad_norm 2.3615 (nan) loss_scale 2048.0000 (3073.5306) mem 7379MB [2024-08-26 13:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][870/1251] eta 0:01:40 lr 0.000756 wd 0.0500 time 0.2414 (0.2630) data time 0.0008 (0.0025) model time 0.2405 (0.2605) loss 3.3716 (3.3573) grad_norm 1.8672 (nan) loss_scale 2048.0000 (3058.4271) mem 7379MB [2024-08-26 13:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][880/1251] eta 0:01:37 lr 0.000756 wd 0.0500 time 0.2379 (0.2627) data time 0.0010 (0.0024) model time 0.2369 (0.2603) loss 2.4702 (3.3565) grad_norm 1.9657 (nan) loss_scale 2048.0000 (3043.7620) mem 7379MB [2024-08-26 13:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][890/1251] eta 0:01:34 lr 0.000755 wd 0.0500 time 0.2546 (0.2625) data time 0.0010 (0.0024) model time 0.2536 (0.2600) loss 2.6208 (3.3536) grad_norm 1.6815 (nan) loss_scale 2048.0000 (3029.5165) mem 7379MB [2024-08-26 13:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][900/1251] eta 0:01:32 lr 0.000755 wd 0.0500 time 0.2299 (0.2622) data time 0.0012 (0.0024) model time 0.2287 (0.2598) loss 2.4170 (3.3533) grad_norm 2.1783 (nan) loss_scale 2048.0000 
(3015.6728) mem 7379MB [2024-08-26 13:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][910/1251] eta 0:01:29 lr 0.000755 wd 0.0500 time 0.2351 (0.2619) data time 0.0008 (0.0024) model time 0.2342 (0.2595) loss 4.0242 (3.3498) grad_norm 2.2437 (nan) loss_scale 2048.0000 (3002.2142) mem 7379MB [2024-08-26 13:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][920/1251] eta 0:01:26 lr 0.000755 wd 0.0500 time 0.2413 (0.2616) data time 0.0009 (0.0024) model time 0.2404 (0.2593) loss 4.4837 (3.3504) grad_norm 1.9874 (nan) loss_scale 2048.0000 (2989.1248) mem 7379MB [2024-08-26 13:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][930/1251] eta 0:01:23 lr 0.000755 wd 0.0500 time 0.2346 (0.2613) data time 0.0011 (0.0023) model time 0.2335 (0.2590) loss 3.3243 (3.3541) grad_norm 2.4106 (nan) loss_scale 2048.0000 (2976.3897) mem 7379MB [2024-08-26 13:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][940/1251] eta 0:01:21 lr 0.000755 wd 0.0500 time 0.2334 (0.2610) data time 0.0011 (0.0023) model time 0.2323 (0.2587) loss 3.1106 (3.3531) grad_norm 2.0547 (nan) loss_scale 2048.0000 (2963.9947) mem 7379MB [2024-08-26 13:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][950/1251] eta 0:01:18 lr 0.000755 wd 0.0500 time 0.2383 (0.2607) data time 0.0010 (0.0023) model time 0.2373 (0.2584) loss 3.7227 (3.3517) grad_norm 2.1937 (nan) loss_scale 2048.0000 (2951.9262) mem 7379MB [2024-08-26 13:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][960/1251] eta 0:01:15 lr 0.000755 wd 0.0500 time 0.2328 (0.2605) data time 0.0010 (0.0023) model time 0.2318 (0.2582) loss 3.2680 (3.3542) grad_norm 3.3087 (nan) loss_scale 2048.0000 (2940.1717) mem 7379MB [2024-08-26 13:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][970/1251] eta 0:01:13 lr 0.000755 wd 0.0500 time 0.2416 (0.2603) data time 0.0009 (0.0023) model time 
0.2407 (0.2580) loss 2.7273 (3.3564) grad_norm 2.2431 (nan) loss_scale 2048.0000 (2928.7189) mem 7379MB [2024-08-26 13:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][980/1251] eta 0:01:10 lr 0.000755 wd 0.0500 time 0.2442 (0.2601) data time 0.0011 (0.0023) model time 0.2431 (0.2578) loss 3.3655 (3.3576) grad_norm 1.8270 (nan) loss_scale 2048.0000 (2917.5564) mem 7379MB [2024-08-26 13:03:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][990/1251] eta 0:01:07 lr 0.000755 wd 0.0500 time 0.2323 (0.2598) data time 0.0011 (0.0023) model time 0.2312 (0.2576) loss 3.8231 (3.3577) grad_norm 1.9114 (nan) loss_scale 2048.0000 (2906.6733) mem 7379MB [2024-08-26 13:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1000/1251] eta 0:01:05 lr 0.000755 wd 0.0500 time 0.2450 (0.2596) data time 0.0008 (0.0022) model time 0.2442 (0.2574) loss 2.1599 (3.3527) grad_norm 1.7283 (nan) loss_scale 2048.0000 (2896.0593) mem 7379MB [2024-08-26 13:03:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1010/1251] eta 0:01:02 lr 0.000755 wd 0.0500 time 0.2320 (0.2596) data time 0.0011 (0.0022) model time 0.2309 (0.2574) loss 3.6296 (3.3532) grad_norm 1.6546 (nan) loss_scale 2048.0000 (2885.7045) mem 7379MB [2024-08-26 13:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1020/1251] eta 0:00:59 lr 0.000755 wd 0.0500 time 0.2550 (0.2596) data time 0.0007 (0.0022) model time 0.2543 (0.2573) loss 3.7302 (3.3501) grad_norm 1.5742 (nan) loss_scale 2048.0000 (2875.5995) mem 7379MB [2024-08-26 13:03:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1030/1251] eta 0:00:57 lr 0.000755 wd 0.0500 time 0.2530 (0.2593) data time 0.0007 (0.0022) model time 0.2524 (0.2571) loss 3.9026 (3.3511) grad_norm 1.6976 (nan) loss_scale 2048.0000 (2865.7354) mem 7379MB [2024-08-26 13:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1040/1251] eta 
0:00:54 lr 0.000755 wd 0.0500 time 0.2427 (0.2591) data time 0.0012 (0.0022) model time 0.2415 (0.2569) loss 3.6251 (3.3484) grad_norm 2.3450 (nan) loss_scale 2048.0000 (2856.1037) mem 7379MB [2024-08-26 13:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1050/1251] eta 0:00:52 lr 0.000755 wd 0.0500 time 0.2319 (0.2589) data time 0.0008 (0.0022) model time 0.2310 (0.2567) loss 3.7144 (3.3510) grad_norm 2.3633 (nan) loss_scale 2048.0000 (2846.6962) mem 7379MB [2024-08-26 13:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1060/1251] eta 0:00:49 lr 0.000755 wd 0.0500 time 0.2389 (0.2587) data time 0.0011 (0.0022) model time 0.2378 (0.2565) loss 3.2942 (3.3498) grad_norm 1.4176 (nan) loss_scale 2048.0000 (2837.5052) mem 7379MB [2024-08-26 13:03:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1070/1251] eta 0:00:46 lr 0.000755 wd 0.0500 time 0.2381 (0.2585) data time 0.0009 (0.0022) model time 0.2373 (0.2563) loss 3.4756 (3.3488) grad_norm 2.5057 (nan) loss_scale 2048.0000 (2828.5233) mem 7379MB [2024-08-26 13:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1080/1251] eta 0:00:44 lr 0.000755 wd 0.0500 time 0.2528 (0.2583) data time 0.0008 (0.0022) model time 0.2519 (0.2562) loss 3.3678 (3.3486) grad_norm 1.6858 (nan) loss_scale 2048.0000 (2819.7435) mem 7379MB [2024-08-26 13:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1090/1251] eta 0:00:41 lr 0.000755 wd 0.0500 time 0.2370 (0.2582) data time 0.0011 (0.0021) model time 0.2358 (0.2560) loss 2.8674 (3.3465) grad_norm 2.1804 (nan) loss_scale 2048.0000 (2811.1591) mem 7379MB [2024-08-26 13:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1100/1251] eta 0:00:38 lr 0.000755 wd 0.0500 time 0.2415 (0.2580) data time 0.0011 (0.0021) model time 0.2404 (0.2559) loss 3.6325 (3.3481) grad_norm 1.9454 (nan) loss_scale 2048.0000 (2802.7635) mem 7379MB [2024-08-26 
13:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1110/1251] eta 0:00:36 lr 0.000755 wd 0.0500 time 0.2460 (0.2578) data time 0.0007 (0.0021) model time 0.2454 (0.2557) loss 3.9261 (3.3486) grad_norm 1.7397 (nan) loss_scale 2048.0000 (2794.5506) mem 7379MB [2024-08-26 13:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1120/1251] eta 0:00:33 lr 0.000755 wd 0.0500 time 0.2436 (0.2576) data time 0.0009 (0.0021) model time 0.2428 (0.2555) loss 3.8672 (3.3529) grad_norm 3.8626 (nan) loss_scale 2048.0000 (2786.5145) mem 7379MB [2024-08-26 13:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1130/1251] eta 0:00:31 lr 0.000755 wd 0.0500 time 0.2408 (0.2575) data time 0.0009 (0.0021) model time 0.2399 (0.2554) loss 3.5782 (3.3517) grad_norm 2.3448 (nan) loss_scale 2048.0000 (2778.6496) mem 7379MB [2024-08-26 13:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1140/1251] eta 0:00:28 lr 0.000755 wd 0.0500 time 0.2405 (0.2574) data time 0.0009 (0.0021) model time 0.2397 (0.2553) loss 2.6532 (3.3498) grad_norm 2.0831 (nan) loss_scale 2048.0000 (2770.9505) mem 7379MB [2024-08-26 13:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1150/1251] eta 0:00:25 lr 0.000754 wd 0.0500 time 0.2372 (0.2572) data time 0.0009 (0.0021) model time 0.2363 (0.2551) loss 2.4015 (3.3479) grad_norm 2.2263 (nan) loss_scale 2048.0000 (2763.4119) mem 7379MB [2024-08-26 13:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1160/1251] eta 0:00:23 lr 0.000754 wd 0.0500 time 0.2332 (0.2570) data time 0.0010 (0.0021) model time 0.2322 (0.2550) loss 3.5700 (3.3490) grad_norm 2.2694 (nan) loss_scale 2048.0000 (2756.0289) mem 7379MB [2024-08-26 13:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1170/1251] eta 0:00:20 lr 0.000754 wd 0.0500 time 0.2297 (0.2569) data time 0.0010 (0.0021) model time 0.2287 (0.2548) loss 3.0257 
(3.3482) grad_norm 1.6993 (nan) loss_scale 2048.0000 (2748.7967) mem 7379MB [2024-08-26 13:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1180/1251] eta 0:00:18 lr 0.000754 wd 0.0500 time 0.2448 (0.2567) data time 0.0008 (0.0020) model time 0.2440 (0.2547) loss 3.9479 (3.3503) grad_norm 2.5819 (nan) loss_scale 2048.0000 (2741.7108) mem 7379MB [2024-08-26 13:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1190/1251] eta 0:00:15 lr 0.000754 wd 0.0500 time 0.2434 (0.2566) data time 0.0009 (0.0020) model time 0.2424 (0.2545) loss 3.3600 (3.3499) grad_norm 1.7537 (nan) loss_scale 2048.0000 (2734.7668) mem 7379MB [2024-08-26 13:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1200/1251] eta 0:00:13 lr 0.000754 wd 0.0500 time 0.2369 (0.2564) data time 0.0008 (0.0020) model time 0.2360 (0.2544) loss 3.9358 (3.3511) grad_norm 4.1205 (nan) loss_scale 2048.0000 (2727.9604) mem 7379MB [2024-08-26 13:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1210/1251] eta 0:00:10 lr 0.000754 wd 0.0500 time 0.2388 (0.2563) data time 0.0008 (0.0020) model time 0.2380 (0.2543) loss 3.0473 (3.3502) grad_norm 1.8967 (nan) loss_scale 2048.0000 (2721.2875) mem 7379MB [2024-08-26 13:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1220/1251] eta 0:00:07 lr 0.000754 wd 0.0500 time 0.2373 (0.2561) data time 0.0012 (0.0020) model time 0.2361 (0.2541) loss 3.7705 (3.3482) grad_norm 4.8606 (nan) loss_scale 2048.0000 (2714.7444) mem 7379MB [2024-08-26 13:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1230/1251] eta 0:00:05 lr 0.000754 wd 0.0500 time 0.2419 (0.2560) data time 0.0010 (0.0020) model time 0.2409 (0.2540) loss 3.5300 (3.3491) grad_norm 1.8739 (nan) loss_scale 2048.0000 (2708.3272) mem 7379MB [2024-08-26 13:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1240/1251] eta 0:00:02 lr 0.000754 wd 0.0500 
time 0.2217 (0.2557) data time 0.0005 (0.0020) model time 0.2213 (0.2537) loss 4.3372 (3.3493) grad_norm 2.3127 (nan) loss_scale 2048.0000 (2702.0324) mem 7379MB [2024-08-26 13:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [112/300][1250/1251] eta 0:00:00 lr 0.000754 wd 0.0500 time 0.2302 (0.2555) data time 0.0007 (0.0020) model time 0.2295 (0.2535) loss 3.4512 (3.3508) grad_norm 2.3900 (nan) loss_scale 2048.0000 (2695.8565) mem 7379MB [2024-08-26 13:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 112 training takes 0:04:30 [2024-08-26 13:04:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 13:04:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 13:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.409 (0.409) Loss 0.5039 (0.5039) Acc@1 91.406 (91.406) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-26 13:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.108) Loss 0.8662 (0.7863) Acc@1 81.055 (82.546) Acc@5 95.508 (96.218) Mem 7379MB [2024-08-26 13:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.094) Loss 1.1211 (0.8046) Acc@1 73.730 (81.622) Acc@5 93.457 (96.289) Mem 7379MB [2024-08-26 13:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.4014 (0.9120) Acc@1 65.723 (79.439) Acc@5 89.062 (94.919) Mem 7379MB [2024-08-26 13:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.083) Loss 1.1836 (0.9699) Acc@1 71.973 (77.977) Acc@5 92.090 (94.300) Mem 7379MB [2024-08-26 13:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.658 Acc@5 94.244 [2024-08-26 13:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test 
images: 77.7% [2024-08-26 13:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.834 (0.834) Loss 0.4307 (0.4307) Acc@1 91.992 (91.992) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 13:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.151) Loss 0.6895 (0.6736) Acc@1 86.816 (85.502) Acc@5 96.875 (97.221) Mem 7379MB [2024-08-26 13:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.118) Loss 0.9595 (0.6961) Acc@1 77.441 (84.482) Acc@5 94.336 (97.182) Mem 7379MB [2024-08-26 13:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.105) Loss 1.2354 (0.7931) Acc@1 68.945 (82.186) Acc@5 91.016 (96.046) Mem 7379MB [2024-08-26 13:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.096) Loss 1.1074 (0.8437) Acc@1 72.461 (80.771) Acc@5 93.164 (95.524) Mem 7379MB [2024-08-26 13:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.346 Acc@5 95.472 [2024-08-26 13:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.3% [2024-08-26 13:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.35% [2024-08-26 13:04:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 13:04:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 13:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][0/1251] eta 0:14:12 lr 0.000754 wd 0.0500 time 0.6813 (0.6813) data time 0.3878 (0.3878) model time 0.0000 (0.0000) loss 3.2215 (3.2215) grad_norm 2.2692 (2.2692) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-26 13:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][10/1251] eta 0:05:51 lr 0.000754 wd 0.0500 time 0.2370 (0.2836) data time 0.0007 (0.0362) model time 0.0000 (0.0000) loss 2.9404 (3.2396) grad_norm 2.7073 (2.1520) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][20/1251] eta 0:05:22 lr 0.000754 wd 0.0500 time 0.2413 (0.2623) data time 0.0007 (0.0195) model time 0.0000 (0.0000) loss 3.8685 (3.3163) grad_norm 2.4401 (2.1752) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][30/1251] eta 0:05:14 lr 0.000754 wd 0.0500 time 0.2476 (0.2576) data time 0.0009 (0.0151) model time 0.0000 (0.0000) loss 3.9173 (3.2421) grad_norm 2.2254 (2.3084) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][40/1251] eta 0:05:06 lr 0.000754 wd 0.0500 time 0.2354 (0.2531) data time 0.0010 (0.0116) model time 0.0000 (0.0000) loss 3.5111 (3.2432) grad_norm 2.6171 (2.3961) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][50/1251] eta 0:05:01 lr 0.000754 wd 0.0500 time 0.2409 (0.2513) data time 0.0008 (0.0105) model time 0.0000 (0.0000) loss 4.1568 (3.2849) grad_norm 2.0807 (2.3447) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][60/1251] eta 0:04:57 lr 0.000754 wd 0.0500 time 0.2346 (0.2498) data time 0.0009 (0.0090) model time 0.2337 
(0.2408) loss 3.1190 (3.3568) grad_norm 2.2270 (2.4332) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][70/1251] eta 0:04:53 lr 0.000754 wd 0.0500 time 0.2363 (0.2486) data time 0.0013 (0.0079) model time 0.2351 (0.2405) loss 3.3243 (3.3653) grad_norm 2.0020 (2.4452) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][80/1251] eta 0:04:50 lr 0.000754 wd 0.0500 time 0.2450 (0.2477) data time 0.0010 (0.0071) model time 0.2440 (0.2404) loss 4.1130 (3.3697) grad_norm 2.2010 (2.4269) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][90/1251] eta 0:04:47 lr 0.000754 wd 0.0500 time 0.2372 (0.2473) data time 0.0010 (0.0065) model time 0.2362 (0.2410) loss 2.8048 (3.3556) grad_norm 2.0785 (2.3780) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][100/1251] eta 0:04:44 lr 0.000754 wd 0.0500 time 0.2366 (0.2471) data time 0.0008 (0.0059) model time 0.2357 (0.2416) loss 2.4596 (3.3455) grad_norm 1.9458 (2.3659) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][110/1251] eta 0:04:41 lr 0.000754 wd 0.0500 time 0.2345 (0.2466) data time 0.0010 (0.0055) model time 0.2335 (0.2414) loss 3.6173 (3.3459) grad_norm 2.9728 (2.3525) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][120/1251] eta 0:04:38 lr 0.000754 wd 0.0500 time 0.2397 (0.2464) data time 0.0009 (0.0052) model time 0.2388 (0.2416) loss 3.5006 (3.3663) grad_norm 1.5417 (2.3102) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][130/1251] 
eta 0:04:35 lr 0.000754 wd 0.0500 time 0.2365 (0.2462) data time 0.0007 (0.0049) model time 0.2358 (0.2416) loss 3.9207 (3.3905) grad_norm 2.2393 (2.2981) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][140/1251] eta 0:04:33 lr 0.000754 wd 0.0500 time 0.2428 (0.2460) data time 0.0009 (0.0046) model time 0.2419 (0.2417) loss 4.0717 (3.3836) grad_norm 1.5774 (2.2974) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][150/1251] eta 0:04:30 lr 0.000754 wd 0.0500 time 0.2459 (0.2456) data time 0.0009 (0.0044) model time 0.2450 (0.2415) loss 3.5873 (3.3893) grad_norm 1.5982 (2.2844) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][160/1251] eta 0:04:27 lr 0.000753 wd 0.0500 time 0.2326 (0.2455) data time 0.0010 (0.0042) model time 0.2316 (0.2416) loss 3.2354 (3.3774) grad_norm 2.5279 (2.2668) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][170/1251] eta 0:04:25 lr 0.000753 wd 0.0500 time 0.2379 (0.2452) data time 0.0010 (0.0040) model time 0.2368 (0.2414) loss 3.3317 (3.3696) grad_norm 2.3342 (2.2544) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][180/1251] eta 0:04:22 lr 0.000753 wd 0.0500 time 0.2394 (0.2451) data time 0.0009 (0.0038) model time 0.2385 (0.2415) loss 3.3468 (3.3660) grad_norm 1.6392 (2.2350) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][190/1251] eta 0:04:19 lr 0.000753 wd 0.0500 time 0.2343 (0.2450) data time 0.0010 (0.0037) model time 0.2333 (0.2415) loss 2.0036 (3.3594) grad_norm 2.1597 (2.2284) loss_scale 2048.0000 (2048.0000) mem 7380MB 
[2024-08-26 13:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][200/1251] eta 0:04:17 lr 0.000753 wd 0.0500 time 0.2562 (0.2452) data time 0.0009 (0.0036) model time 0.2553 (0.2419) loss 3.6404 (3.3743) grad_norm 2.1780 (2.2302) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][210/1251] eta 0:04:15 lr 0.000753 wd 0.0500 time 0.2478 (0.2451) data time 0.0011 (0.0034) model time 0.2467 (0.2419) loss 3.2846 (3.3835) grad_norm 2.3396 (2.2294) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][220/1251] eta 0:04:12 lr 0.000753 wd 0.0500 time 0.2615 (0.2452) data time 0.0008 (0.0034) model time 0.2607 (0.2422) loss 3.4051 (3.3856) grad_norm 2.3399 (2.2250) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][230/1251] eta 0:04:10 lr 0.000753 wd 0.0500 time 0.2374 (0.2451) data time 0.0009 (0.0033) model time 0.2364 (0.2421) loss 3.2521 (3.3864) grad_norm 1.8347 (2.2244) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][240/1251] eta 0:04:07 lr 0.000753 wd 0.0500 time 0.2414 (0.2451) data time 0.0010 (0.0032) model time 0.2404 (0.2422) loss 3.6350 (3.3880) grad_norm 2.2654 (2.2113) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][250/1251] eta 0:04:05 lr 0.000753 wd 0.0500 time 0.2377 (0.2449) data time 0.0009 (0.0031) model time 0.2369 (0.2421) loss 4.4534 (3.3854) grad_norm 2.1121 (2.2141) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][260/1251] eta 0:04:02 lr 0.000753 wd 0.0500 time 0.2431 (0.2449) data time 0.0007 (0.0030) model time 0.2423 
(0.2421) loss 3.6529 (3.3720) grad_norm 1.8158 (2.2078) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][270/1251] eta 0:04:00 lr 0.000753 wd 0.0500 time 0.2494 (0.2448) data time 0.0010 (0.0029) model time 0.2484 (0.2421) loss 3.5151 (3.3697) grad_norm 1.9331 (2.2060) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][280/1251] eta 0:03:57 lr 0.000753 wd 0.0500 time 0.2476 (0.2447) data time 0.0010 (0.0029) model time 0.2466 (0.2421) loss 3.4501 (3.3674) grad_norm 2.0998 (2.2159) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][290/1251] eta 0:03:55 lr 0.000753 wd 0.0500 time 0.2400 (0.2446) data time 0.0008 (0.0028) model time 0.2392 (0.2420) loss 4.2545 (3.3694) grad_norm 2.8991 (2.2068) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][300/1251] eta 0:03:52 lr 0.000753 wd 0.0500 time 0.2414 (0.2444) data time 0.0009 (0.0028) model time 0.2405 (0.2419) loss 3.7209 (3.3748) grad_norm 3.3067 (2.2144) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][310/1251] eta 0:03:50 lr 0.000753 wd 0.0500 time 0.2435 (0.2445) data time 0.0011 (0.0027) model time 0.2424 (0.2420) loss 3.4619 (3.3788) grad_norm 1.4787 (2.2117) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][320/1251] eta 0:03:47 lr 0.000753 wd 0.0500 time 0.2423 (0.2444) data time 0.0010 (0.0027) model time 0.2414 (0.2419) loss 3.2223 (3.3812) grad_norm 2.1148 (2.2163) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[113/300][330/1251] eta 0:03:45 lr 0.000753 wd 0.0500 time 0.2369 (0.2444) data time 0.0007 (0.0026) model time 0.2362 (0.2420) loss 3.6849 (3.3818) grad_norm 2.4478 (2.2112) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][340/1251] eta 0:03:42 lr 0.000753 wd 0.0500 time 0.2450 (0.2443) data time 0.0011 (0.0026) model time 0.2439 (0.2420) loss 3.3109 (3.3813) grad_norm 2.1426 (2.2142) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][350/1251] eta 0:03:40 lr 0.000753 wd 0.0500 time 0.2458 (0.2443) data time 0.0011 (0.0025) model time 0.2446 (0.2420) loss 3.3841 (3.3908) grad_norm 2.3703 (2.2171) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][360/1251] eta 0:03:37 lr 0.000753 wd 0.0500 time 0.2427 (0.2443) data time 0.0010 (0.0025) model time 0.2417 (0.2420) loss 3.4470 (3.3908) grad_norm 2.4931 (2.2193) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][370/1251] eta 0:03:35 lr 0.000753 wd 0.0500 time 0.2423 (0.2443) data time 0.0010 (0.0025) model time 0.2413 (0.2420) loss 3.4790 (3.3874) grad_norm 2.1060 (2.2119) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][380/1251] eta 0:03:32 lr 0.000753 wd 0.0500 time 0.2431 (0.2443) data time 0.0010 (0.0024) model time 0.2420 (0.2420) loss 3.4049 (3.3867) grad_norm 1.5723 (2.2062) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][390/1251] eta 0:03:30 lr 0.000753 wd 0.0500 time 0.2461 (0.2448) data time 0.0009 (0.0024) model time 0.2452 (0.2426) loss 3.2426 (3.3849) grad_norm 1.6261 (2.1974) loss_scale 2048.0000 
(2048.0000) mem 7380MB [2024-08-26 13:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][400/1251] eta 0:03:28 lr 0.000753 wd 0.0500 time 0.2469 (0.2447) data time 0.0009 (0.0024) model time 0.2460 (0.2426) loss 3.9248 (3.3883) grad_norm 1.8386 (2.1887) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][410/1251] eta 0:03:26 lr 0.000753 wd 0.0500 time 0.2457 (0.2452) data time 0.0009 (0.0024) model time 0.2447 (0.2432) loss 3.6190 (3.3881) grad_norm 3.1345 (2.1997) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][420/1251] eta 0:03:23 lr 0.000752 wd 0.0500 time 0.2432 (0.2452) data time 0.0009 (0.0023) model time 0.2422 (0.2432) loss 2.9383 (3.3772) grad_norm 2.6340 (2.2113) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][430/1251] eta 0:03:21 lr 0.000752 wd 0.0500 time 0.2430 (0.2451) data time 0.0011 (0.0023) model time 0.2419 (0.2431) loss 2.5783 (3.3697) grad_norm 1.8857 (2.2081) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][440/1251] eta 0:03:18 lr 0.000752 wd 0.0500 time 0.2421 (0.2451) data time 0.0009 (0.0023) model time 0.2412 (0.2430) loss 3.7793 (3.3673) grad_norm 2.5794 (2.2024) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][450/1251] eta 0:03:16 lr 0.000752 wd 0.0500 time 0.2356 (0.2450) data time 0.0012 (0.0022) model time 0.2343 (0.2430) loss 2.5472 (3.3676) grad_norm 1.8945 (2.1962) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][460/1251] eta 0:03:13 lr 0.000752 wd 0.0500 time 0.2344 (0.2449) data time 0.0009 
(0.0022) model time 0.2334 (0.2429) loss 3.1738 (3.3698) grad_norm 1.4402 (2.1891) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][470/1251] eta 0:03:11 lr 0.000752 wd 0.0500 time 0.2419 (0.2450) data time 0.0010 (0.0022) model time 0.2409 (0.2430) loss 3.1164 (3.3705) grad_norm 2.1504 (2.1913) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][480/1251] eta 0:03:08 lr 0.000752 wd 0.0500 time 0.2371 (0.2449) data time 0.0008 (0.0022) model time 0.2364 (0.2429) loss 4.1999 (3.3712) grad_norm 1.3914 (2.1857) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][490/1251] eta 0:03:06 lr 0.000752 wd 0.0500 time 0.2382 (0.2448) data time 0.0009 (0.0022) model time 0.2372 (0.2429) loss 3.3471 (3.3669) grad_norm 2.1152 (2.1819) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][500/1251] eta 0:03:03 lr 0.000752 wd 0.0500 time 0.2376 (0.2448) data time 0.0009 (0.0021) model time 0.2366 (0.2428) loss 2.9568 (3.3666) grad_norm 2.6502 (2.1829) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][510/1251] eta 0:03:01 lr 0.000752 wd 0.0500 time 0.2371 (0.2447) data time 0.0009 (0.0021) model time 0.2362 (0.2428) loss 2.7024 (3.3601) grad_norm 1.9930 (2.1802) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][520/1251] eta 0:02:58 lr 0.000752 wd 0.0500 time 0.2371 (0.2447) data time 0.0009 (0.0021) model time 0.2361 (0.2428) loss 2.6288 (3.3602) grad_norm 3.8578 (2.1834) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [113/300][530/1251] eta 0:02:56 lr 0.000752 wd 0.0500 time 0.2354 (0.2446) data time 0.0008 (0.0021) model time 0.2346 (0.2427) loss 3.8090 (3.3628) grad_norm 1.7033 (2.1818) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][540/1251] eta 0:02:53 lr 0.000752 wd 0.0500 time 0.2423 (0.2446) data time 0.0010 (0.0021) model time 0.2413 (0.2427) loss 4.2694 (3.3669) grad_norm 2.4440 (2.1819) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][550/1251] eta 0:02:51 lr 0.000752 wd 0.0500 time 0.2358 (0.2445) data time 0.0008 (0.0021) model time 0.2350 (0.2426) loss 3.6230 (3.3694) grad_norm 1.9168 (2.1869) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][560/1251] eta 0:02:48 lr 0.000752 wd 0.0500 time 0.2428 (0.2444) data time 0.0010 (0.0020) model time 0.2418 (0.2426) loss 3.5917 (3.3690) grad_norm 2.6858 (2.1870) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][570/1251] eta 0:02:46 lr 0.000752 wd 0.0500 time 0.2365 (0.2444) data time 0.0008 (0.0020) model time 0.2358 (0.2425) loss 3.4389 (3.3687) grad_norm 2.1810 (2.1821) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][580/1251] eta 0:02:43 lr 0.000752 wd 0.0500 time 0.2428 (0.2444) data time 0.0009 (0.0020) model time 0.2419 (0.2425) loss 2.3911 (3.3704) grad_norm 1.8106 (2.1846) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][590/1251] eta 0:02:41 lr 0.000752 wd 0.0500 time 0.2394 (0.2443) data time 0.0010 (0.0020) model time 0.2384 (0.2424) loss 3.6443 (3.3632) grad_norm 3.2186 (2.1859) loss_scale 
2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][600/1251] eta 0:02:39 lr 0.000752 wd 0.0500 time 0.2379 (0.2442) data time 0.0009 (0.0020) model time 0.2370 (0.2424) loss 3.7704 (3.3561) grad_norm 1.8782 (2.1870) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][610/1251] eta 0:02:36 lr 0.000752 wd 0.0500 time 0.2391 (0.2442) data time 0.0008 (0.0020) model time 0.2383 (0.2423) loss 3.6038 (3.3564) grad_norm 1.8014 (2.1887) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][620/1251] eta 0:02:34 lr 0.000752 wd 0.0500 time 0.2403 (0.2442) data time 0.0007 (0.0019) model time 0.2396 (0.2423) loss 3.9090 (3.3547) grad_norm 2.8210 (2.1977) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][630/1251] eta 0:02:31 lr 0.000752 wd 0.0500 time 0.2309 (0.2441) data time 0.0008 (0.0019) model time 0.2301 (0.2423) loss 2.7289 (3.3546) grad_norm 2.0240 (2.2010) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][640/1251] eta 0:02:29 lr 0.000752 wd 0.0500 time 0.2488 (0.2441) data time 0.0009 (0.0019) model time 0.2479 (0.2423) loss 3.4054 (3.3591) grad_norm 1.6361 (2.1977) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][650/1251] eta 0:02:26 lr 0.000752 wd 0.0500 time 0.2382 (0.2441) data time 0.0011 (0.0019) model time 0.2371 (0.2423) loss 3.0236 (3.3591) grad_norm 2.4212 (2.1924) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][660/1251] eta 0:02:24 lr 0.000752 wd 0.0500 time 0.2504 (0.2440) data time 
0.0009 (0.0019) model time 0.2495 (0.2422) loss 2.9903 (3.3570) grad_norm 1.6579 (2.1861) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][670/1251] eta 0:02:21 lr 0.000752 wd 0.0500 time 0.2395 (0.2440) data time 0.0008 (0.0019) model time 0.2387 (0.2422) loss 3.9199 (3.3565) grad_norm 3.4217 (2.1867) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][680/1251] eta 0:02:19 lr 0.000751 wd 0.0500 time 0.2342 (0.2440) data time 0.0011 (0.0019) model time 0.2331 (0.2422) loss 3.7710 (3.3571) grad_norm 2.1012 (2.1935) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][690/1251] eta 0:02:16 lr 0.000751 wd 0.0500 time 0.2420 (0.2439) data time 0.0010 (0.0019) model time 0.2411 (0.2422) loss 2.3871 (3.3547) grad_norm 1.8358 (2.1984) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][700/1251] eta 0:02:14 lr 0.000751 wd 0.0500 time 0.2442 (0.2439) data time 0.0007 (0.0019) model time 0.2435 (0.2421) loss 4.2087 (3.3530) grad_norm 1.7134 (2.1949) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][710/1251] eta 0:02:11 lr 0.000751 wd 0.0500 time 0.2403 (0.2439) data time 0.0007 (0.0018) model time 0.2396 (0.2421) loss 3.1085 (3.3554) grad_norm 1.8845 (2.1911) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][720/1251] eta 0:02:09 lr 0.000751 wd 0.0500 time 0.2399 (0.2439) data time 0.0010 (0.0018) model time 0.2389 (0.2421) loss 3.8773 (3.3554) grad_norm 2.2456 (2.1925) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [113/300][730/1251] eta 0:02:07 lr 0.000751 wd 0.0500 time 0.2433 (0.2439) data time 0.0009 (0.0018) model time 0.2424 (0.2421) loss 3.8110 (3.3544) grad_norm 1.9699 (2.1949) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][740/1251] eta 0:02:04 lr 0.000751 wd 0.0500 time 0.2358 (0.2438) data time 0.0008 (0.0018) model time 0.2350 (0.2421) loss 4.3662 (3.3586) grad_norm 1.7292 (2.1958) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][750/1251] eta 0:02:02 lr 0.000751 wd 0.0500 time 0.2402 (0.2438) data time 0.0010 (0.0018) model time 0.2392 (0.2420) loss 2.5332 (3.3571) grad_norm 1.5111 (2.1968) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][760/1251] eta 0:01:59 lr 0.000751 wd 0.0500 time 0.2391 (0.2438) data time 0.0010 (0.0018) model time 0.2381 (0.2420) loss 3.8825 (3.3565) grad_norm 1.9655 (2.1928) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][770/1251] eta 0:01:57 lr 0.000751 wd 0.0500 time 0.2362 (0.2437) data time 0.0008 (0.0018) model time 0.2354 (0.2420) loss 3.9017 (3.3580) grad_norm 1.6652 (2.1889) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][780/1251] eta 0:01:54 lr 0.000751 wd 0.0500 time 0.2370 (0.2437) data time 0.0007 (0.0018) model time 0.2363 (0.2419) loss 4.0030 (3.3562) grad_norm 1.8564 (2.1930) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][790/1251] eta 0:01:52 lr 0.000751 wd 0.0500 time 0.2388 (0.2437) data time 0.0010 (0.0018) model time 0.2379 (0.2419) loss 3.5447 (3.3546) grad_norm 2.9709 (2.1937) 
loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][800/1251] eta 0:01:49 lr 0.000751 wd 0.0500 time 0.2474 (0.2437) data time 0.0011 (0.0018) model time 0.2463 (0.2419) loss 3.4673 (3.3527) grad_norm 1.9674 (2.1927) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][810/1251] eta 0:01:47 lr 0.000751 wd 0.0500 time 0.2351 (0.2436) data time 0.0010 (0.0018) model time 0.2341 (0.2419) loss 4.3377 (3.3496) grad_norm 1.3984 (2.1952) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][820/1251] eta 0:01:44 lr 0.000751 wd 0.0500 time 0.2423 (0.2436) data time 0.0011 (0.0018) model time 0.2412 (0.2419) loss 3.5442 (3.3534) grad_norm 3.7958 (2.1949) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][830/1251] eta 0:01:42 lr 0.000751 wd 0.0500 time 0.2354 (0.2436) data time 0.0007 (0.0018) model time 0.2347 (0.2419) loss 2.6978 (3.3523) grad_norm 1.5316 (2.1925) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][840/1251] eta 0:01:40 lr 0.000751 wd 0.0500 time 0.2531 (0.2436) data time 0.0007 (0.0018) model time 0.2524 (0.2419) loss 3.4363 (3.3502) grad_norm 1.6885 (2.1962) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][850/1251] eta 0:01:37 lr 0.000751 wd 0.0500 time 0.2398 (0.2436) data time 0.0010 (0.0018) model time 0.2388 (0.2419) loss 3.2755 (3.3516) grad_norm 1.8929 (2.1972) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][860/1251] eta 0:01:35 lr 0.000751 wd 0.0500 time 0.2444 (0.2436) 
data time 0.0008 (0.0017) model time 0.2436 (0.2419) loss 3.2876 (3.3478) grad_norm 2.7631 (2.2050) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][870/1251] eta 0:01:32 lr 0.000751 wd 0.0500 time 0.2341 (0.2436) data time 0.0009 (0.0017) model time 0.2332 (0.2419) loss 3.8685 (3.3469) grad_norm 1.9518 (2.2088) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][880/1251] eta 0:01:30 lr 0.000751 wd 0.0500 time 0.2398 (0.2436) data time 0.0010 (0.0017) model time 0.2388 (0.2419) loss 3.3276 (3.3448) grad_norm 3.3470 (2.2075) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][890/1251] eta 0:01:27 lr 0.000751 wd 0.0500 time 0.2424 (0.2435) data time 0.0010 (0.0017) model time 0.2414 (0.2419) loss 3.4407 (3.3444) grad_norm 1.7007 (2.2119) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][900/1251] eta 0:01:25 lr 0.000751 wd 0.0500 time 0.2355 (0.2435) data time 0.0008 (0.0017) model time 0.2347 (0.2419) loss 2.4988 (3.3428) grad_norm 1.8908 (2.2104) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][910/1251] eta 0:01:23 lr 0.000751 wd 0.0500 time 0.2500 (0.2435) data time 0.0009 (0.0017) model time 0.2491 (0.2418) loss 3.6804 (3.3399) grad_norm 2.5354 (2.2097) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][920/1251] eta 0:01:20 lr 0.000751 wd 0.0500 time 0.2482 (0.2435) data time 0.0008 (0.0017) model time 0.2474 (0.2418) loss 2.2752 (3.3403) grad_norm 1.7100 (2.2094) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:23 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [113/300][930/1251] eta 0:01:18 lr 0.000751 wd 0.0500 time 0.2487 (0.2435) data time 0.0007 (0.0017) model time 0.2480 (0.2418) loss 3.4808 (3.3389) grad_norm 2.0131 (2.2059) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][940/1251] eta 0:01:15 lr 0.000750 wd 0.0500 time 0.2364 (0.2434) data time 0.0007 (0.0017) model time 0.2357 (0.2418) loss 3.4600 (3.3388) grad_norm 2.0264 (2.2049) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][950/1251] eta 0:01:13 lr 0.000750 wd 0.0500 time 0.2410 (0.2436) data time 0.0010 (0.0017) model time 0.2400 (0.2420) loss 3.2558 (3.3425) grad_norm 2.2983 (2.2031) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][960/1251] eta 0:01:10 lr 0.000750 wd 0.0500 time 0.2369 (0.2436) data time 0.0009 (0.0017) model time 0.2359 (0.2420) loss 3.6345 (3.3423) grad_norm 1.8812 (2.2038) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][970/1251] eta 0:01:08 lr 0.000750 wd 0.0500 time 0.2445 (0.2436) data time 0.0008 (0.0017) model time 0.2437 (0.2420) loss 3.8627 (3.3427) grad_norm 2.4830 (2.2030) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][980/1251] eta 0:01:06 lr 0.000750 wd 0.0500 time 0.2421 (0.2436) data time 0.0007 (0.0017) model time 0.2414 (0.2420) loss 3.4838 (3.3436) grad_norm 2.8120 (2.2018) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][990/1251] eta 0:01:03 lr 0.000750 wd 0.0500 time 0.2415 (0.2436) data time 0.0007 (0.0017) model time 0.2407 (0.2420) loss 3.1769 (3.3420) grad_norm 
2.0912 (2.2021) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1000/1251] eta 0:01:01 lr 0.000750 wd 0.0500 time 0.2355 (0.2436) data time 0.0008 (0.0017) model time 0.2347 (0.2420) loss 3.1369 (3.3420) grad_norm 1.5145 (2.2010) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1010/1251] eta 0:00:58 lr 0.000750 wd 0.0500 time 0.2321 (0.2436) data time 0.0010 (0.0017) model time 0.2311 (0.2420) loss 3.4948 (3.3422) grad_norm 2.0107 (2.2001) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1020/1251] eta 0:00:56 lr 0.000750 wd 0.0500 time 0.2461 (0.2435) data time 0.0009 (0.0016) model time 0.2452 (0.2419) loss 2.8381 (3.3425) grad_norm 1.7017 (2.1984) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1030/1251] eta 0:00:53 lr 0.000750 wd 0.0500 time 0.2375 (0.2435) data time 0.0009 (0.0016) model time 0.2366 (0.2419) loss 3.1299 (3.3431) grad_norm 1.8440 (2.1951) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1040/1251] eta 0:00:51 lr 0.000750 wd 0.0500 time 0.2526 (0.2435) data time 0.0008 (0.0016) model time 0.2517 (0.2419) loss 2.2259 (3.3412) grad_norm 1.8304 (2.1935) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1050/1251] eta 0:00:48 lr 0.000750 wd 0.0500 time 0.2431 (0.2435) data time 0.0008 (0.0016) model time 0.2423 (0.2419) loss 3.2717 (3.3430) grad_norm 1.7701 (2.1920) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1060/1251] eta 0:00:46 lr 0.000750 wd 
0.0500 time 0.2350 (0.2435) data time 0.0009 (0.0016) model time 0.2341 (0.2419) loss 2.4042 (3.3415) grad_norm 2.4132 (2.1901) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1070/1251] eta 0:00:44 lr 0.000750 wd 0.0500 time 0.2406 (0.2435) data time 0.0009 (0.0016) model time 0.2397 (0.2419) loss 3.9291 (3.3418) grad_norm 2.6849 (2.1899) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1080/1251] eta 0:00:41 lr 0.000750 wd 0.0500 time 0.2302 (0.2435) data time 0.0011 (0.0016) model time 0.2291 (0.2419) loss 3.7248 (3.3447) grad_norm 2.4019 (2.1960) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1090/1251] eta 0:00:39 lr 0.000750 wd 0.0500 time 0.2389 (0.2435) data time 0.0012 (0.0016) model time 0.2378 (0.2419) loss 2.5895 (3.3435) grad_norm 3.0341 (2.2046) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1100/1251] eta 0:00:36 lr 0.000750 wd 0.0500 time 0.2415 (0.2435) data time 0.0007 (0.0016) model time 0.2408 (0.2419) loss 2.8185 (3.3457) grad_norm 3.0765 (2.2114) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1110/1251] eta 0:00:34 lr 0.000750 wd 0.0500 time 0.2420 (0.2435) data time 0.0007 (0.0016) model time 0.2412 (0.2419) loss 3.8174 (3.3481) grad_norm 1.5231 (2.2090) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1120/1251] eta 0:00:31 lr 0.000750 wd 0.0500 time 0.2400 (0.2434) data time 0.0010 (0.0016) model time 0.2390 (0.2419) loss 3.5416 (3.3480) grad_norm 1.6262 (2.2079) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1130/1251] eta 0:00:29 lr 0.000750 wd 0.0500 time 0.2359 (0.2434) data time 0.0008 (0.0016) model time 0.2351 (0.2419) loss 3.6255 (3.3480) grad_norm 1.8408 (2.2059) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1140/1251] eta 0:00:27 lr 0.000750 wd 0.0500 time 0.2465 (0.2434) data time 0.0009 (0.0016) model time 0.2456 (0.2419) loss 2.7620 (3.3460) grad_norm 2.2923 (2.2068) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1150/1251] eta 0:00:24 lr 0.000750 wd 0.0500 time 0.2328 (0.2433) data time 0.0007 (0.0016) model time 0.2321 (0.2418) loss 4.3844 (3.3469) grad_norm 2.3666 (2.2065) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1160/1251] eta 0:00:22 lr 0.000750 wd 0.0500 time 0.2342 (0.2433) data time 0.0010 (0.0016) model time 0.2332 (0.2418) loss 3.7595 (3.3451) grad_norm 4.7419 (2.2091) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1170/1251] eta 0:00:19 lr 0.000750 wd 0.0500 time 0.2344 (0.2433) data time 0.0008 (0.0016) model time 0.2336 (0.2418) loss 3.5349 (3.3443) grad_norm 1.8994 (2.2070) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1180/1251] eta 0:00:17 lr 0.000750 wd 0.0500 time 0.2440 (0.2433) data time 0.0011 (0.0016) model time 0.2430 (0.2418) loss 3.7275 (3.3423) grad_norm 2.2064 (2.2043) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1190/1251] eta 0:00:14 lr 0.000750 wd 0.0500 time 0.2375 (0.2433) data time 0.0010 (0.0016) model time 0.2365 (0.2418) loss 
2.7812 (3.3409) grad_norm 2.4138 (2.2072) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1200/1251] eta 0:00:12 lr 0.000749 wd 0.0500 time 0.2340 (0.2433) data time 0.0010 (0.0016) model time 0.2331 (0.2417) loss 2.5494 (3.3408) grad_norm 2.1735 (2.2053) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1210/1251] eta 0:00:09 lr 0.000749 wd 0.0500 time 0.2424 (0.2433) data time 0.0008 (0.0016) model time 0.2415 (0.2417) loss 2.4420 (3.3421) grad_norm 1.8601 (2.2069) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1220/1251] eta 0:00:07 lr 0.000749 wd 0.0500 time 0.2399 (0.2433) data time 0.0011 (0.0016) model time 0.2388 (0.2418) loss 3.6189 (3.3422) grad_norm 2.6755 (2.2065) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1230/1251] eta 0:00:05 lr 0.000749 wd 0.0500 time 0.2463 (0.2433) data time 0.0009 (0.0016) model time 0.2454 (0.2418) loss 3.8172 (3.3408) grad_norm 2.8571 (2.2096) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1240/1251] eta 0:00:02 lr 0.000749 wd 0.0500 time 0.2235 (0.2432) data time 0.0007 (0.0016) model time 0.2228 (0.2417) loss 2.6736 (3.3401) grad_norm 2.7502 (2.2093) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [113/300][1250/1251] eta 0:00:00 lr 0.000749 wd 0.0500 time 0.2224 (0.2431) data time 0.0005 (0.0015) model time 0.2219 (0.2415) loss 2.4834 (3.3394) grad_norm 4.7131 (2.2103) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 113 training takes 0:05:04 
[2024-08-26 13:09:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 13:09:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 13:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.417 (0.417) Loss 0.4854 (0.4854) Acc@1 91.211 (91.211) Acc@5 97.754 (97.754) Mem 7380MB [2024-08-26 13:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.113) Loss 0.7646 (0.7624) Acc@1 84.375 (83.203) Acc@5 96.094 (96.378) Mem 7380MB [2024-08-26 13:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.097) Loss 1.0537 (0.7867) Acc@1 75.488 (82.282) Acc@5 94.629 (96.484) Mem 7380MB [2024-08-26 13:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.091) Loss 1.3652 (0.8995) Acc@1 67.188 (79.798) Acc@5 89.062 (95.073) Mem 7380MB [2024-08-26 13:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.2783 (0.9648) Acc@1 70.410 (78.201) Acc@5 91.504 (94.322) Mem 7380MB [2024-08-26 13:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.906 Acc@5 94.274 [2024-08-26 13:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.9% [2024-08-26 13:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.91% [2024-08-26 13:09:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 13:09:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 13:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.478 (0.478) Loss 0.4282 (0.4282) Acc@1 92.188 (92.188) Acc@5 98.242 (98.242) Mem 7380MB [2024-08-26 13:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.115) Loss 0.6880 (0.6727) Acc@1 86.816 (85.511) Acc@5 96.875 (97.239) Mem 7380MB [2024-08-26 13:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.098) Loss 0.9565 (0.6946) Acc@1 78.027 (84.552) Acc@5 94.531 (97.214) Mem 7380MB [2024-08-26 13:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.091) Loss 1.2295 (0.7914) Acc@1 68.555 (82.223) Acc@5 91.211 (96.078) Mem 7380MB [2024-08-26 13:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.1074 (0.8421) Acc@1 72.168 (80.807) Acc@5 93.262 (95.551) Mem 7380MB [2024-08-26 13:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.382 Acc@5 95.492 [2024-08-26 13:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.4% [2024-08-26 13:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.38% [2024-08-26 13:09:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 13:09:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 13:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][0/1251] eta 0:15:19 lr 0.000749 wd 0.0500 time 0.7350 (0.7350) data time 0.5017 (0.5017) model time 0.0000 (0.0000) loss 2.9648 (2.9648) grad_norm 3.0925 (3.0925) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][10/1251] eta 0:05:51 lr 0.000749 wd 0.0500 time 0.2345 (0.2832) data time 0.0009 (0.0466) model time 0.0000 (0.0000) loss 2.6761 (3.4455) grad_norm 2.8924 (2.6626) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][20/1251] eta 0:05:23 lr 0.000749 wd 0.0500 time 0.2304 (0.2630) data time 0.0010 (0.0250) model time 0.0000 (0.0000) loss 3.5869 (3.4555) grad_norm 2.0708 (2.4981) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 13:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][30/1251] eta 0:05:12 lr 0.000749 wd 0.0500 time 0.2395 (0.2556) data time 0.0009 (0.0172) model time 0.0000 (0.0000) loss 3.5052 (3.4654) grad_norm 1.7729 (2.3459) loss_scale 4096.0000 (2444.3871) mem 7380MB [2024-08-26 13:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][40/1251] eta 0:05:11 lr 0.000749 wd 0.0500 time 0.2418 (0.2574) data time 0.0007 (0.0133) model time 0.0000 (0.0000) loss 2.0207 (3.4253) grad_norm 2.4894 (2.2884) loss_scale 4096.0000 (2847.2195) mem 7380MB [2024-08-26 13:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][50/1251] eta 0:05:04 lr 0.000749 wd 0.0500 time 0.2360 (0.2537) data time 0.0007 (0.0109) model time 0.0000 (0.0000) loss 3.3437 (3.3842) grad_norm 1.9391 (nan) loss_scale 2048.0000 (2730.6667) mem 7380MB [2024-08-26 13:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][60/1251] eta 0:04:59 lr 0.000749 wd 0.0500 time 0.2403 (0.2517) data time 0.0010 (0.0093) model time 0.2393 (0.2403) 
loss 3.1237 (3.3643) grad_norm 1.4970 (nan) loss_scale 2048.0000 (2618.7541) mem 7380MB [2024-08-26 13:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][70/1251] eta 0:04:55 lr 0.000749 wd 0.0500 time 0.2302 (0.2503) data time 0.0009 (0.0083) model time 0.2294 (0.2400) loss 4.1176 (3.3868) grad_norm 1.9346 (nan) loss_scale 2048.0000 (2538.3662) mem 7380MB [2024-08-26 13:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][80/1251] eta 0:04:51 lr 0.000749 wd 0.0500 time 0.2328 (0.2493) data time 0.0008 (0.0074) model time 0.2320 (0.2406) loss 3.3042 (3.3680) grad_norm 2.2990 (nan) loss_scale 2048.0000 (2477.8272) mem 7380MB [2024-08-26 13:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][90/1251] eta 0:04:48 lr 0.000749 wd 0.0500 time 0.2435 (0.2483) data time 0.0010 (0.0067) model time 0.2424 (0.2400) loss 3.4479 (3.3552) grad_norm 1.8398 (nan) loss_scale 2048.0000 (2430.5934) mem 7380MB [2024-08-26 13:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][100/1251] eta 0:04:44 lr 0.000749 wd 0.0500 time 0.2464 (0.2472) data time 0.0009 (0.0062) model time 0.2455 (0.2393) loss 2.3209 (3.3745) grad_norm 2.0911 (nan) loss_scale 1024.0000 (2362.2970) mem 7380MB [2024-08-26 13:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][110/1251] eta 0:04:41 lr 0.000749 wd 0.0500 time 0.2438 (0.2466) data time 0.0009 (0.0057) model time 0.2429 (0.2392) loss 3.9432 (3.3615) grad_norm 2.7643 (nan) loss_scale 1024.0000 (2241.7297) mem 7380MB [2024-08-26 13:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][120/1251] eta 0:04:38 lr 0.000749 wd 0.0500 time 0.2445 (0.2462) data time 0.0009 (0.0053) model time 0.2435 (0.2394) loss 3.7556 (3.3694) grad_norm 1.9081 (nan) loss_scale 1024.0000 (2141.0909) mem 7380MB [2024-08-26 13:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][130/1251] eta 0:04:37 lr 0.000749 wd 
0.0500 time 0.2416 (0.2473) data time 0.0007 (0.0050) model time 0.2408 (0.2420) loss 3.3965 (3.3745) grad_norm 2.4190 (nan) loss_scale 1024.0000 (2055.8168) mem 7380MB [2024-08-26 13:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][140/1251] eta 0:04:34 lr 0.000749 wd 0.0500 time 0.2398 (0.2471) data time 0.0010 (0.0047) model time 0.2387 (0.2421) loss 3.5761 (3.3930) grad_norm 1.5714 (nan) loss_scale 1024.0000 (1982.6383) mem 7380MB [2024-08-26 13:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][150/1251] eta 0:04:31 lr 0.000749 wd 0.0500 time 0.2417 (0.2466) data time 0.0010 (0.0045) model time 0.2407 (0.2418) loss 3.6060 (3.3935) grad_norm 1.6108 (nan) loss_scale 1024.0000 (1919.1523) mem 7380MB [2024-08-26 13:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][160/1251] eta 0:04:28 lr 0.000749 wd 0.0500 time 0.2405 (0.2464) data time 0.0010 (0.0043) model time 0.2395 (0.2418) loss 3.7081 (3.3843) grad_norm 2.0219 (nan) loss_scale 1024.0000 (1863.5528) mem 7380MB [2024-08-26 13:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][170/1251] eta 0:04:25 lr 0.000749 wd 0.0500 time 0.2395 (0.2460) data time 0.0007 (0.0041) model time 0.2388 (0.2415) loss 3.8570 (3.3750) grad_norm 3.0735 (nan) loss_scale 1024.0000 (1814.4561) mem 7380MB [2024-08-26 13:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][180/1251] eta 0:04:23 lr 0.000749 wd 0.0500 time 0.2503 (0.2457) data time 0.0008 (0.0039) model time 0.2495 (0.2414) loss 3.5895 (3.3908) grad_norm 2.2307 (nan) loss_scale 1024.0000 (1770.7845) mem 7380MB [2024-08-26 13:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][190/1251] eta 0:04:20 lr 0.000749 wd 0.0500 time 0.2465 (0.2455) data time 0.0008 (0.0038) model time 0.2457 (0.2414) loss 3.8696 (3.3767) grad_norm 3.4396 (nan) loss_scale 1024.0000 (1731.6859) mem 7380MB [2024-08-26 13:10:40 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [114/300][200/1251] eta 0:04:17 lr 0.000749 wd 0.0500 time 0.2450 (0.2455) data time 0.0007 (0.0036) model time 0.2443 (0.2415) loss 3.9156 (3.3671) grad_norm 1.6122 (nan) loss_scale 1024.0000 (1696.4776) mem 7380MB [2024-08-26 13:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][210/1251] eta 0:04:15 lr 0.000748 wd 0.0500 time 0.2371 (0.2453) data time 0.0009 (0.0035) model time 0.2362 (0.2414) loss 2.8556 (3.3662) grad_norm 1.5512 (nan) loss_scale 1024.0000 (1664.6066) mem 7380MB [2024-08-26 13:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][220/1251] eta 0:04:12 lr 0.000748 wd 0.0500 time 0.2318 (0.2450) data time 0.0010 (0.0034) model time 0.2308 (0.2412) loss 3.5852 (3.3759) grad_norm 2.0364 (nan) loss_scale 1024.0000 (1635.6199) mem 7380MB [2024-08-26 13:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][230/1251] eta 0:04:09 lr 0.000748 wd 0.0500 time 0.2333 (0.2448) data time 0.0008 (0.0033) model time 0.2325 (0.2411) loss 2.7499 (3.3673) grad_norm 1.8954 (nan) loss_scale 1024.0000 (1609.1429) mem 7380MB [2024-08-26 13:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][240/1251] eta 0:04:07 lr 0.000748 wd 0.0500 time 0.2444 (0.2447) data time 0.0009 (0.0032) model time 0.2434 (0.2411) loss 3.8151 (3.3722) grad_norm 1.6405 (nan) loss_scale 1024.0000 (1584.8631) mem 7380MB [2024-08-26 13:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][250/1251] eta 0:04:04 lr 0.000748 wd 0.0500 time 0.2493 (0.2445) data time 0.0007 (0.0031) model time 0.2486 (0.2410) loss 3.9850 (3.3874) grad_norm 1.5855 (nan) loss_scale 1024.0000 (1562.5179) mem 7380MB [2024-08-26 13:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][260/1251] eta 0:04:02 lr 0.000748 wd 0.0500 time 0.2426 (0.2444) data time 0.0007 (0.0030) model time 0.2418 (0.2410) loss 3.8256 (3.3894) grad_norm 4.9371 (nan) 
loss_scale 1024.0000 (1541.8851) mem 7380MB [2024-08-26 13:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][270/1251] eta 0:03:59 lr 0.000748 wd 0.0500 time 0.2379 (0.2442) data time 0.0011 (0.0030) model time 0.2368 (0.2409) loss 3.4151 (3.3883) grad_norm 1.5397 (nan) loss_scale 1024.0000 (1522.7749) mem 7380MB [2024-08-26 13:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][280/1251] eta 0:03:57 lr 0.000748 wd 0.0500 time 0.2436 (0.2441) data time 0.0007 (0.0029) model time 0.2429 (0.2408) loss 3.5942 (3.3829) grad_norm 2.3208 (nan) loss_scale 1024.0000 (1505.0249) mem 7380MB [2024-08-26 13:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][290/1251] eta 0:03:54 lr 0.000748 wd 0.0500 time 0.2489 (0.2441) data time 0.0007 (0.0030) model time 0.2481 (0.2408) loss 3.8107 (3.3852) grad_norm 2.0110 (nan) loss_scale 1024.0000 (1488.4948) mem 7380MB [2024-08-26 13:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][300/1251] eta 0:03:52 lr 0.000748 wd 0.0500 time 0.2404 (0.2441) data time 0.0007 (0.0029) model time 0.2397 (0.2409) loss 3.7447 (3.3768) grad_norm 1.8796 (nan) loss_scale 1024.0000 (1473.0631) mem 7380MB [2024-08-26 13:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][310/1251] eta 0:03:49 lr 0.000748 wd 0.0500 time 0.2487 (0.2441) data time 0.0007 (0.0029) model time 0.2479 (0.2409) loss 3.2622 (3.3815) grad_norm 2.3342 (nan) loss_scale 1024.0000 (1458.6238) mem 7380MB [2024-08-26 13:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][320/1251] eta 0:03:47 lr 0.000748 wd 0.0500 time 0.2380 (0.2441) data time 0.0011 (0.0028) model time 0.2368 (0.2410) loss 3.2418 (3.3782) grad_norm 1.6136 (nan) loss_scale 1024.0000 (1445.0841) mem 7380MB [2024-08-26 13:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][330/1251] eta 0:03:44 lr 0.000748 wd 0.0500 time 0.2420 (0.2440) data time 0.0010 
(0.0028) model time 0.2410 (0.2410) loss 4.0477 (3.3869) grad_norm 1.8956 (nan) loss_scale 1024.0000 (1432.3625) mem 7380MB [2024-08-26 13:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][340/1251] eta 0:03:42 lr 0.000748 wd 0.0500 time 0.2492 (0.2440) data time 0.0010 (0.0027) model time 0.2482 (0.2410) loss 2.2751 (3.3898) grad_norm 2.5987 (nan) loss_scale 1024.0000 (1420.3871) mem 7380MB [2024-08-26 13:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][350/1251] eta 0:03:39 lr 0.000748 wd 0.0500 time 0.2392 (0.2440) data time 0.0010 (0.0027) model time 0.2382 (0.2410) loss 3.3742 (3.3919) grad_norm 2.7380 (nan) loss_scale 1024.0000 (1409.0940) mem 7380MB [2024-08-26 13:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][360/1251] eta 0:03:37 lr 0.000748 wd 0.0500 time 0.2426 (0.2439) data time 0.0010 (0.0026) model time 0.2417 (0.2410) loss 3.3119 (3.3924) grad_norm 3.3735 (nan) loss_scale 1024.0000 (1398.4266) mem 7380MB [2024-08-26 13:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][370/1251] eta 0:03:34 lr 0.000748 wd 0.0500 time 0.2456 (0.2438) data time 0.0008 (0.0026) model time 0.2448 (0.2410) loss 3.6466 (3.3964) grad_norm 1.6990 (nan) loss_scale 1024.0000 (1388.3342) mem 7380MB [2024-08-26 13:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][380/1251] eta 0:03:32 lr 0.000748 wd 0.0500 time 0.2387 (0.2438) data time 0.0011 (0.0025) model time 0.2376 (0.2410) loss 3.6004 (3.3960) grad_norm 1.6115 (nan) loss_scale 1024.0000 (1378.7717) mem 7380MB [2024-08-26 13:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][390/1251] eta 0:03:29 lr 0.000748 wd 0.0500 time 0.2493 (0.2437) data time 0.0010 (0.0025) model time 0.2483 (0.2409) loss 3.7682 (3.3978) grad_norm 1.7407 (nan) loss_scale 1024.0000 (1369.6982) mem 7380MB [2024-08-26 13:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[114/300][400/1251] eta 0:03:27 lr 0.000748 wd 0.0500 time 0.2411 (0.2437) data time 0.0011 (0.0025) model time 0.2401 (0.2410) loss 3.0289 (3.4036) grad_norm 2.0632 (nan) loss_scale 1024.0000 (1361.0773) mem 7380MB [2024-08-26 13:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][410/1251] eta 0:03:24 lr 0.000748 wd 0.0500 time 0.2355 (0.2436) data time 0.0010 (0.0024) model time 0.2345 (0.2409) loss 3.1619 (3.4029) grad_norm 2.1744 (nan) loss_scale 1024.0000 (1352.8759) mem 7380MB [2024-08-26 13:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][420/1251] eta 0:03:22 lr 0.000748 wd 0.0500 time 0.2336 (0.2436) data time 0.0009 (0.0024) model time 0.2327 (0.2409) loss 3.3907 (3.4101) grad_norm 1.9796 (nan) loss_scale 1024.0000 (1345.0641) mem 7380MB [2024-08-26 13:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][430/1251] eta 0:03:19 lr 0.000748 wd 0.0500 time 0.2494 (0.2435) data time 0.0009 (0.0024) model time 0.2484 (0.2409) loss 2.9429 (3.4000) grad_norm 2.1015 (nan) loss_scale 1024.0000 (1337.6148) mem 7380MB [2024-08-26 13:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][440/1251] eta 0:03:17 lr 0.000748 wd 0.0500 time 0.2381 (0.2435) data time 0.0008 (0.0024) model time 0.2373 (0.2409) loss 3.0009 (3.3986) grad_norm 1.6076 (nan) loss_scale 1024.0000 (1330.5034) mem 7380MB [2024-08-26 13:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][450/1251] eta 0:03:15 lr 0.000748 wd 0.0500 time 0.2417 (0.2435) data time 0.0011 (0.0023) model time 0.2407 (0.2410) loss 2.3136 (3.3932) grad_norm 1.8034 (nan) loss_scale 1024.0000 (1323.7073) mem 7380MB [2024-08-26 13:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][460/1251] eta 0:03:12 lr 0.000748 wd 0.0500 time 0.2384 (0.2435) data time 0.0012 (0.0023) model time 0.2372 (0.2409) loss 3.8063 (3.4006) grad_norm 1.8044 (nan) loss_scale 1024.0000 (1317.2061) mem 7380MB 
[2024-08-26 13:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][470/1251] eta 0:03:10 lr 0.000747 wd 0.0500 time 0.2475 (0.2435) data time 0.0010 (0.0023) model time 0.2466 (0.2410) loss 3.6251 (3.3966) grad_norm 1.8470 (nan) loss_scale 1024.0000 (1310.9809) mem 7380MB [2024-08-26 13:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][480/1251] eta 0:03:07 lr 0.000747 wd 0.0500 time 0.2334 (0.2434) data time 0.0013 (0.0023) model time 0.2321 (0.2410) loss 3.8084 (3.3933) grad_norm 2.8855 (nan) loss_scale 1024.0000 (1305.0146) mem 7380MB [2024-08-26 13:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][490/1251] eta 0:03:05 lr 0.000747 wd 0.0500 time 0.2415 (0.2434) data time 0.0010 (0.0022) model time 0.2406 (0.2409) loss 3.7003 (3.3916) grad_norm 3.1916 (nan) loss_scale 1024.0000 (1299.2912) mem 7380MB [2024-08-26 13:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][500/1251] eta 0:03:02 lr 0.000747 wd 0.0500 time 0.2393 (0.2434) data time 0.0008 (0.0022) model time 0.2384 (0.2409) loss 3.8990 (3.3925) grad_norm 2.2984 (nan) loss_scale 1024.0000 (1293.7964) mem 7380MB [2024-08-26 13:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 13:11:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 13:11:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 13:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 13:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 13:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 13:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 13:43:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 13:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 13:43:48 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 13:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 13:43:55 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 13:43:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 13:43:58 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 13:43:58 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 114) [2024-08-26 13:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 13:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][510/1251] eta 0:20:33 lr 0.000747 wd 0.0500 time 0.2212 (1.6652) data time 0.0008 (0.0644) model time 0.2203 (1.6008) loss 4.1436 (3.7780) grad_norm 1.9685 (2.4368) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 13:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 13:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 13:48:58 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 13:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 13:49:08 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 13:49:09 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 13:49:10 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 13:49:10 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 114) [2024-08-26 13:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 13:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][510/1251] eta 0:19:04 lr 0.000747 wd 0.0500 time 0.2259 (1.5440) data time 0.0008 (0.0910) model time 0.2251 (1.4529) loss 4.1423 (3.7783) grad_norm 1.8351 (2.4162) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 13:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][520/1251] eta 0:10:20 lr 0.000747 wd 0.0500 time 0.2208 (0.8492) data time 0.0010 (0.0437) model time 0.2198 (0.8055) loss 3.5623 (3.5386) grad_norm 2.6488 (2.2582) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 13:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][530/1251] eta 0:07:36 lr 0.000747 wd 0.0500 time 0.2286 (0.6331) data time 0.0007 (0.0289) model time 0.2279 (0.6042) loss 3.9211 (3.6024) grad_norm 2.6158 (2.2539) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 13:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][540/1251] eta 0:06:15 lr 0.000747 wd 0.0500 time 0.2228 (0.5284) data time 0.0010 (0.0218) model time 0.2218 (0.5065) loss 3.6410 (3.5217) grad_norm 1.7414 (2.2333) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 13:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 13:49:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 13:49:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 13:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 13:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 14:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 14:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 14:08:09 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 14:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 14:08:18 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 14:08:20 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 14:08:21 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 14:08:21 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 114) [2024-08-26 14:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 14:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][550/1251] eta 0:28:28 lr 0.000747 wd 0.0500 time 0.2366 (2.4376) data time 0.0009 (0.1523) model time 0.2357 (2.2853) loss 4.3789 (4.0653) grad_norm 2.3058 (1.8665) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][560/1251] eta 0:12:22 lr 0.000747 wd 0.0500 time 0.2283 (1.0742) data time 0.0010 (0.0578) model time 0.2272 (1.0164) loss 3.6324 (3.7682) grad_norm 1.5968 (1.8688) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][570/1251] eta 0:08:33 lr 0.000747 wd 0.0500 time 0.2239 (0.7547) data time 0.0007 (0.0361) model time 0.2232 (0.7187) loss 3.5462 (3.6805) grad_norm 1.7081 (2.0395) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][580/1251] eta 0:06:51 lr 0.000747 wd 0.0500 time 0.2253 (0.6136) data time 0.0011 (0.0263) model time 0.2242 (0.5873) loss 3.9936 (3.6821) grad_norm 2.1179 (2.0907) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][590/1251] eta 0:05:50 lr 0.000747 wd 0.0500 time 0.2246 (0.5296) data time 0.0013 (0.0209) model time 0.2233 (0.5087) loss 2.7803 (3.5954) grad_norm 1.5223 (2.0199) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [114/300][600/1251] eta 0:05:09 lr 0.000747 wd 0.0500 time 0.2221 (0.4759) data time 0.0008 (0.0173) model time 0.2213 (0.4585) loss 3.7338 (3.5736) grad_norm 2.1962 (2.0414) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][610/1251] eta 0:04:46 lr 0.000747 wd 0.0500 time 0.2577 (0.4471) data time 0.0009 (0.0149) model time 0.2568 (0.4322) loss 2.7765 (3.5293) grad_norm 2.3235 (2.0427) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][620/1251] eta 0:04:23 lr 0.000747 wd 0.0500 time 0.2264 (0.4180) data time 0.0010 (0.0131) model time 0.2254 (0.4050) loss 3.8248 (3.4941) grad_norm 2.4449 (2.0525) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][630/1251] eta 0:04:07 lr 0.000747 wd 0.0500 time 0.2247 (0.3991) data time 0.0007 (0.0117) model time 0.2240 (0.3874) loss 2.5215 (3.4565) grad_norm 2.2814 (2.0575) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][640/1251] eta 0:03:52 lr 0.000747 wd 0.0500 time 0.2328 (0.3812) data time 0.0008 (0.0106) model time 0.2320 (0.3706) loss 3.9836 (3.4646) grad_norm 2.0599 (2.2203) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][650/1251] eta 0:03:43 lr 0.000747 wd 0.0500 time 0.3877 (0.3718) data time 0.0009 (0.0097) model time 0.3868 (0.3621) loss 3.5971 (3.4852) grad_norm 2.4991 (2.2172) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][660/1251] eta 0:03:32 lr 0.000747 wd 0.0500 time 0.2309 (0.3604) data time 0.0008 (0.0089) model time 0.2301 (0.3514) loss 4.1719 (3.4719) grad_norm 2.4380 (2.2474) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][670/1251] eta 0:03:23 lr 0.000747 wd 0.0500 time 0.2339 (0.3498) data time 0.0007 (0.0083) model time 0.2332 (0.3415) loss 2.1466 (3.4627) grad_norm 1.9474 (2.2322) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][680/1251] eta 0:03:14 lr 0.000747 wd 0.0500 time 0.2553 (0.3410) data time 0.0009 (0.0078) model time 0.2544 (0.3332) loss 3.1714 (3.4629) grad_norm 2.2664 (2.2328) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][690/1251] eta 0:03:08 lr 0.000747 wd 0.0500 time 0.2241 (0.3365) data time 0.0009 (0.0073) model time 0.2233 (0.3292) loss 3.0219 (3.4457) grad_norm 2.2300 (2.2699) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][700/1251] eta 0:03:02 lr 0.000747 wd 0.0500 time 0.2264 (0.3313) data time 0.0010 (0.0070) model time 0.2254 (0.3244) loss 3.0437 (3.4417) grad_norm 2.4172 (2.2586) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][710/1251] eta 0:02:55 lr 0.000747 wd 0.0500 time 0.2304 (0.3253) data time 0.0010 (0.0066) model time 0.2293 (0.3186) loss 3.3442 (3.4390) grad_norm 2.2098 (2.2609) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][720/1251] eta 0:02:49 lr 0.000747 wd 0.0500 time 0.2275 (0.3198) data time 0.0008 (0.0063) model time 0.2266 (0.3135) loss 3.1927 (3.4220) grad_norm 1.9284 (2.2499) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][730/1251] eta 0:02:45 lr 0.000746 wd 0.0500 time 0.3377 (0.3169) data time 
0.0011 (0.0061) model time 0.3366 (0.3109) loss 2.8348 (3.4097) grad_norm 2.0267 (2.2380) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][740/1251] eta 0:02:39 lr 0.000746 wd 0.0500 time 0.2335 (0.3130) data time 0.0007 (0.0058) model time 0.2327 (0.3072) loss 3.1766 (3.4048) grad_norm 1.8502 (2.2327) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][750/1251] eta 0:02:34 lr 0.000746 wd 0.0500 time 0.2493 (0.3089) data time 0.0007 (0.0056) model time 0.2485 (0.3034) loss 2.5461 (3.3947) grad_norm 1.9423 (2.2195) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][760/1251] eta 0:02:30 lr 0.000746 wd 0.0500 time 0.2286 (0.3060) data time 0.0007 (0.0054) model time 0.2279 (0.3006) loss 2.2613 (3.3856) grad_norm 2.1123 (2.2115) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][770/1251] eta 0:02:26 lr 0.000746 wd 0.0500 time 0.2356 (0.3037) data time 0.0007 (0.0052) model time 0.2348 (0.2985) loss 3.3114 (3.3893) grad_norm 2.7958 (2.2107) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][780/1251] eta 0:02:22 lr 0.000746 wd 0.0500 time 0.2334 (0.3019) data time 0.0007 (0.0050) model time 0.2327 (0.2969) loss 3.6339 (3.3763) grad_norm 1.7279 (2.2000) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][790/1251] eta 0:02:17 lr 0.000746 wd 0.0500 time 0.2446 (0.2990) data time 0.0009 (0.0049) model time 0.2436 (0.2942) loss 2.1112 (3.3720) grad_norm 2.1119 (2.1952) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [114/300][800/1251] eta 0:02:13 lr 0.000746 wd 0.0500 time 0.2260 (0.2962) data time 0.0013 (0.0047) model time 0.2247 (0.2915) loss 2.6826 (3.3646) grad_norm 1.5497 (2.1915) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][810/1251] eta 0:02:10 lr 0.000746 wd 0.0500 time 0.3303 (0.2949) data time 0.0008 (0.0046) model time 0.3295 (0.2903) loss 2.6268 (3.3536) grad_norm 1.8371 (2.1948) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][820/1251] eta 0:02:06 lr 0.000746 wd 0.0500 time 0.2255 (0.2929) data time 0.0010 (0.0044) model time 0.2246 (0.2884) loss 3.2943 (3.3554) grad_norm 2.1772 (2.1966) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][830/1251] eta 0:02:02 lr 0.000746 wd 0.0500 time 0.2352 (0.2917) data time 0.0010 (0.0043) model time 0.2342 (0.2873) loss 2.7127 (3.3500) grad_norm 1.5719 (2.1914) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][840/1251] eta 0:01:59 lr 0.000746 wd 0.0500 time 0.2334 (0.2903) data time 0.0007 (0.0042) model time 0.2327 (0.2860) loss 2.8595 (3.3436) grad_norm 1.7754 (2.1950) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][850/1251] eta 0:01:55 lr 0.000746 wd 0.0500 time 0.2246 (0.2891) data time 0.0008 (0.0041) model time 0.2239 (0.2850) loss 2.5493 (3.3376) grad_norm 2.2531 (2.1942) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][860/1251] eta 0:01:52 lr 0.000746 wd 0.0500 time 0.2282 (0.2884) data time 0.0015 (0.0040) model time 0.2267 (0.2843) loss 4.1541 (3.3473) grad_norm 2.4447 (2.1999) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][870/1251] eta 0:01:49 lr 0.000746 wd 0.0500 time 0.2294 (0.2866) data time 0.0009 (0.0039) model time 0.2285 (0.2826) loss 3.6377 (3.3583) grad_norm 2.3985 (2.2020) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][880/1251] eta 0:01:45 lr 0.000746 wd 0.0500 time 0.2323 (0.2853) data time 0.0016 (0.0039) model time 0.2308 (0.2814) loss 3.4979 (3.3544) grad_norm 1.9273 (2.1947) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][890/1251] eta 0:01:42 lr 0.000746 wd 0.0500 time 0.2656 (0.2844) data time 0.0009 (0.0038) model time 0.2647 (0.2807) loss 3.9256 (3.3563) grad_norm 1.8006 (2.2033) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][900/1251] eta 0:01:39 lr 0.000746 wd 0.0500 time 0.2232 (0.2837) data time 0.0009 (0.0037) model time 0.2223 (0.2800) loss 2.4654 (3.3602) grad_norm 2.0093 (2.2023) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][910/1251] eta 0:01:36 lr 0.000746 wd 0.0500 time 0.2281 (0.2822) data time 0.0009 (0.0036) model time 0.2272 (0.2786) loss 3.5099 (3.3600) grad_norm 1.5782 (2.1978) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][920/1251] eta 0:01:32 lr 0.000746 wd 0.0500 time 0.2314 (0.2808) data time 0.0008 (0.0036) model time 0.2306 (0.2772) loss 2.3630 (3.3534) grad_norm 2.1492 (2.2000) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][930/1251] eta 0:01:29 lr 0.000746 wd 0.0500 time 0.2563 (0.2799) 
data time 0.0010 (0.0035) model time 0.2553 (0.2764) loss 3.9669 (3.3498) grad_norm 2.3593 (2.2013) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][940/1251] eta 0:01:26 lr 0.000746 wd 0.0500 time 0.2310 (0.2793) data time 0.0010 (0.0034) model time 0.2300 (0.2758) loss 3.7711 (3.3506) grad_norm 1.9012 (2.2009) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][950/1251] eta 0:01:23 lr 0.000746 wd 0.0500 time 0.2312 (0.2788) data time 0.0007 (0.0034) model time 0.2304 (0.2754) loss 3.7067 (3.3565) grad_norm 2.1518 (2.1954) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][960/1251] eta 0:01:20 lr 0.000746 wd 0.0500 time 0.2222 (0.2776) data time 0.0009 (0.0033) model time 0.2213 (0.2743) loss 3.0068 (3.3559) grad_norm 1.6509 (2.1905) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][970/1251] eta 0:01:17 lr 0.000746 wd 0.0500 time 0.3358 (0.2769) data time 0.0010 (0.0033) model time 0.3349 (0.2737) loss 3.8224 (3.3564) grad_norm 2.4614 (2.1955) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][980/1251] eta 0:01:14 lr 0.000745 wd 0.0500 time 0.2392 (0.2767) data time 0.0009 (0.0032) model time 0.2383 (0.2734) loss 3.0070 (3.3610) grad_norm 1.9709 (2.1923) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][990/1251] eta 0:01:11 lr 0.000745 wd 0.0500 time 0.2196 (0.2756) data time 0.0012 (0.0032) model time 0.2184 (0.2724) loss 3.5110 (3.3645) grad_norm 2.5312 (2.1924) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:33 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [114/300][1000/1251] eta 0:01:08 lr 0.000745 wd 0.0500 time 0.2256 (0.2746) data time 0.0008 (0.0031) model time 0.2248 (0.2714) loss 3.2801 (3.3626) grad_norm 2.5354 (2.1945) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1010/1251] eta 0:01:06 lr 0.000745 wd 0.0500 time 0.3072 (0.2743) data time 0.0008 (0.0031) model time 0.3064 (0.2712) loss 2.8400 (3.3550) grad_norm 2.0938 (2.1947) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1020/1251] eta 0:01:03 lr 0.000745 wd 0.0500 time 0.3423 (0.2741) data time 0.0011 (0.0030) model time 0.3412 (0.2710) loss 2.8766 (3.3512) grad_norm 2.8437 (2.1974) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1030/1251] eta 0:01:00 lr 0.000745 wd 0.0500 time 0.2295 (0.2734) data time 0.0010 (0.0030) model time 0.2286 (0.2704) loss 3.9193 (3.3566) grad_norm 3.5112 (2.2073) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1040/1251] eta 0:00:57 lr 0.000745 wd 0.0500 time 0.2348 (0.2725) data time 0.0007 (0.0030) model time 0.2341 (0.2695) loss 3.2038 (3.3556) grad_norm 3.4611 (2.2287) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1050/1251] eta 0:00:54 lr 0.000745 wd 0.0500 time 0.3310 (0.2718) data time 0.0009 (0.0029) model time 0.3301 (0.2689) loss 3.2041 (3.3545) grad_norm 2.4231 (2.2247) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1060/1251] eta 0:00:51 lr 0.000745 wd 0.0500 time 0.2279 (0.2716) data time 0.0010 (0.0029) model time 0.2269 (0.2687) loss 2.8629 (3.3620) 
grad_norm 1.8265 (2.2210) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1070/1251] eta 0:00:49 lr 0.000745 wd 0.0500 time 0.2260 (0.2708) data time 0.0008 (0.0029) model time 0.2252 (0.2679) loss 2.0729 (3.3555) grad_norm 4.2242 (2.2197) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1080/1251] eta 0:00:46 lr 0.000745 wd 0.0500 time 0.2389 (0.2705) data time 0.0009 (0.0028) model time 0.2379 (0.2677) loss 3.9637 (3.3536) grad_norm 2.2312 (2.2169) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1090/1251] eta 0:00:43 lr 0.000745 wd 0.0500 time 0.2254 (0.2697) data time 0.0007 (0.0028) model time 0.2247 (0.2669) loss 3.2353 (3.3549) grad_norm 2.1637 (2.2138) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1100/1251] eta 0:00:40 lr 0.000745 wd 0.0500 time 0.2269 (0.2700) data time 0.0011 (0.0028) model time 0.2258 (0.2672) loss 3.8522 (3.3605) grad_norm 1.7172 (2.2102) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1110/1251] eta 0:00:37 lr 0.000745 wd 0.0500 time 0.2276 (0.2693) data time 0.0007 (0.0027) model time 0.2269 (0.2665) loss 2.9753 (3.3622) grad_norm 4.1819 (2.2178) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1120/1251] eta 0:00:35 lr 0.000745 wd 0.0500 time 0.2314 (0.2686) data time 0.0009 (0.0027) model time 0.2305 (0.2659) loss 2.5997 (3.3630) grad_norm 2.0719 (2.2255) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1130/1251] eta 0:00:32 lr 
0.000745 wd 0.0500 time 0.3438 (0.2683) data time 0.0010 (0.0027) model time 0.3428 (0.2656) loss 3.6047 (3.3647) grad_norm 1.6577 (2.2282) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1140/1251] eta 0:00:29 lr 0.000745 wd 0.0500 time 0.2265 (0.2680) data time 0.0009 (0.0027) model time 0.2256 (0.2654) loss 2.2409 (3.3650) grad_norm 2.6145 (2.2261) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1150/1251] eta 0:00:27 lr 0.000745 wd 0.0500 time 0.2214 (0.2678) data time 0.0008 (0.0026) model time 0.2207 (0.2652) loss 2.9523 (3.3621) grad_norm 2.4759 (2.2234) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1160/1251] eta 0:00:24 lr 0.000745 wd 0.0500 time 0.2391 (0.2673) data time 0.0011 (0.0026) model time 0.2381 (0.2646) loss 3.4167 (3.3634) grad_norm 2.1990 (2.2198) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1170/1251] eta 0:00:21 lr 0.000745 wd 0.0500 time 0.2325 (0.2666) data time 0.0009 (0.0026) model time 0.2316 (0.2641) loss 2.4125 (3.3655) grad_norm 2.6288 (2.2192) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1180/1251] eta 0:00:18 lr 0.000745 wd 0.0500 time 0.2197 (0.2668) data time 0.0007 (0.0026) model time 0.2190 (0.2642) loss 3.1255 (3.3654) grad_norm 2.0661 (2.2138) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1190/1251] eta 0:00:16 lr 0.000745 wd 0.0500 time 0.2274 (0.2662) data time 0.0011 (0.0025) model time 0.2263 (0.2637) loss 3.8781 (3.3625) grad_norm 2.1763 (2.2105) loss_scale 1024.0000 (1024.0000) mem 7377MB 
[2024-08-26 14:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1200/1251] eta 0:00:13 lr 0.000745 wd 0.0500 time 0.2305 (0.2657) data time 0.0009 (0.0025) model time 0.2296 (0.2632) loss 2.7961 (3.3634) grad_norm 2.8853 (2.2124) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1210/1251] eta 0:00:10 lr 0.000745 wd 0.0500 time 0.2232 (0.2654) data time 0.0008 (0.0025) model time 0.2223 (0.2629) loss 3.8128 (3.3603) grad_norm 2.9286 (2.2124) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1220/1251] eta 0:00:08 lr 0.000745 wd 0.0500 time 0.2321 (0.2652) data time 0.0007 (0.0025) model time 0.2315 (0.2627) loss 2.8324 (3.3639) grad_norm 2.8121 (2.2166) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1230/1251] eta 0:00:05 lr 0.000745 wd 0.0500 time 0.2310 (0.2650) data time 0.0009 (0.0025) model time 0.2302 (0.2625) loss 2.8099 (3.3640) grad_norm 2.6902 (2.2177) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1240/1251] eta 0:00:02 lr 0.000744 wd 0.0500 time 0.2112 (0.2643) data time 0.0005 (0.0024) model time 0.2108 (0.2619) loss 2.4263 (3.3614) grad_norm 1.7319 (2.2129) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [114/300][1250/1251] eta 0:00:00 lr 0.000744 wd 0.0500 time 0.2121 (0.2635) data time 0.0006 (0.0024) model time 0.2115 (0.2611) loss 3.6640 (3.3624) grad_norm 2.5713 (2.2084) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-26 14:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 114 training takes 0:03:06 [2024-08-26 14:11:34 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 14:11:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 14:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.415 (0.415) Loss 0.5205 (0.5205) Acc@1 90.918 (90.918) Acc@5 97.852 (97.852) Mem 7377MB [2024-08-26 14:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.096 (0.119) Loss 0.8613 (0.8046) Acc@1 82.520 (83.114) Acc@5 95.898 (96.493) Mem 7377MB [2024-08-26 14:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.106) Loss 1.0918 (0.8300) Acc@1 73.340 (82.092) Acc@5 93.262 (96.461) Mem 7377MB [2024-08-26 14:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.098) Loss 1.4492 (0.9396) Acc@1 64.941 (79.480) Acc@5 87.988 (95.073) Mem 7377MB [2024-08-26 14:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.091) Loss 1.2695 (1.0016) Acc@1 71.875 (77.980) Acc@5 90.625 (94.284) Mem 7377MB [2024-08-26 14:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.710 Acc@5 94.210 [2024-08-26 14:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.7% [2024-08-26 14:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.982 (0.982) Loss 0.4258 (0.4258) Acc@1 92.383 (92.383) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-26 14:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.167) Loss 0.6875 (0.6713) Acc@1 86.816 (85.565) Acc@5 96.875 (97.212) Mem 7377MB [2024-08-26 14:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.127) Loss 0.9531 (0.6930) Acc@1 78.027 (84.626) Acc@5 94.727 (97.191) Mem 7377MB [2024-08-26 14:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.093 (0.116) Loss 1.2266 (0.7899) Acc@1 68.555 (82.293) Acc@5 91.406 (96.059) Mem 7377MB [2024-08-26 14:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.105) Loss 1.1064 (0.8405) Acc@1 72.559 (80.862) Acc@5 93.164 (95.534) Mem 7377MB [2024-08-26 14:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.430 Acc@5 95.476 [2024-08-26 14:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.4% [2024-08-26 14:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.43% [2024-08-26 14:11:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 14:11:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 14:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][0/1251] eta 0:18:31 lr 0.000744 wd 0.0500 time 0.8883 (0.8883) data time 0.4672 (0.4672) model time 0.0000 (0.0000) loss 2.3579 (2.3579) grad_norm 2.3442 (2.3442) loss_scale 1024.0000 (1024.0000) mem 7383MB [2024-08-26 14:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][10/1251] eta 0:05:58 lr 0.000744 wd 0.0500 time 0.2317 (0.2892) data time 0.0011 (0.0435) model time 0.0000 (0.0000) loss 2.3767 (2.9596) grad_norm 1.7473 (2.0168) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][20/1251] eta 0:05:19 lr 0.000744 wd 0.0500 time 0.2271 (0.2592) data time 0.0008 (0.0232) model time 0.0000 (0.0000) loss 2.6047 (3.1065) grad_norm 1.8880 (2.0203) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][30/1251] eta 0:05:06 lr 0.000744 wd 0.0500 time 0.2591 (0.2508) data time 0.0008 
(0.0161) model time 0.0000 (0.0000) loss 4.0716 (3.2940) grad_norm 2.2819 (2.1600) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][40/1251] eta 0:05:03 lr 0.000744 wd 0.0500 time 0.2180 (0.2510) data time 0.0008 (0.0125) model time 0.0000 (0.0000) loss 3.8680 (3.2950) grad_norm 3.6015 (2.1798) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][50/1251] eta 0:05:00 lr 0.000744 wd 0.0500 time 0.2253 (0.2504) data time 0.0011 (0.0103) model time 0.0000 (0.0000) loss 3.3541 (3.2840) grad_norm 1.8656 (2.1451) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][60/1251] eta 0:04:53 lr 0.000744 wd 0.0500 time 0.2336 (0.2468) data time 0.0009 (0.0088) model time 0.2327 (0.2272) loss 2.6256 (3.3252) grad_norm 1.6709 (2.1104) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][70/1251] eta 0:04:48 lr 0.000744 wd 0.0500 time 0.2453 (0.2447) data time 0.0014 (0.0077) model time 0.2438 (0.2290) loss 2.4614 (3.3386) grad_norm 1.9616 (2.1263) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][80/1251] eta 0:04:50 lr 0.000744 wd 0.0500 time 0.2342 (0.2481) data time 0.0010 (0.0069) model time 0.2332 (0.2430) loss 3.4382 (3.3130) grad_norm 1.7573 (2.1133) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][90/1251] eta 0:04:48 lr 0.000744 wd 0.0500 time 0.2353 (0.2481) data time 0.0007 (0.0063) model time 0.2346 (0.2441) loss 3.9664 (3.3100) grad_norm 3.5165 (2.1705) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [115/300][100/1251] eta 0:04:43 lr 0.000744 wd 0.0500 time 0.2273 (0.2462) data time 0.0008 (0.0057) model time 0.2265 (0.2407) loss 1.9898 (3.2841) grad_norm 3.2787 (2.2163) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][110/1251] eta 0:04:42 lr 0.000744 wd 0.0500 time 0.2476 (0.2474) data time 0.0010 (0.0053) model time 0.2466 (0.2438) loss 3.7288 (3.2807) grad_norm 4.3376 (2.2741) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][120/1251] eta 0:04:42 lr 0.000744 wd 0.0500 time 0.2335 (0.2494) data time 0.0009 (0.0050) model time 0.2326 (0.2475) loss 2.6842 (3.2611) grad_norm 2.2522 (2.2683) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][130/1251] eta 0:04:37 lr 0.000744 wd 0.0500 time 0.2251 (0.2478) data time 0.0011 (0.0047) model time 0.2240 (0.2451) loss 2.0722 (3.2537) grad_norm 1.9450 (2.2514) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][140/1251] eta 0:04:33 lr 0.000744 wd 0.0500 time 0.2289 (0.2465) data time 0.0011 (0.0044) model time 0.2278 (0.2432) loss 3.4862 (3.2535) grad_norm 1.7888 (2.2427) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][150/1251] eta 0:04:32 lr 0.000744 wd 0.0500 time 0.2249 (0.2472) data time 0.0013 (0.0042) model time 0.2236 (0.2445) loss 3.5914 (3.2681) grad_norm 1.7264 (2.2554) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][160/1251] eta 0:04:31 lr 0.000744 wd 0.0500 time 0.3274 (0.2484) data time 0.0007 (0.0040) model time 0.3267 (0.2464) loss 2.7489 (3.2605) grad_norm 2.3097 (2.2473) loss_scale 
1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][170/1251] eta 0:04:27 lr 0.000744 wd 0.0500 time 0.2273 (0.2478) data time 0.0008 (0.0038) model time 0.2265 (0.2457) loss 2.6312 (3.2683) grad_norm 2.8969 (2.2634) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][180/1251] eta 0:04:24 lr 0.000744 wd 0.0500 time 0.2257 (0.2469) data time 0.0008 (0.0037) model time 0.2249 (0.2444) loss 3.0960 (3.2707) grad_norm 2.6727 (2.2812) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][190/1251] eta 0:04:20 lr 0.000744 wd 0.0500 time 0.2224 (0.2459) data time 0.0007 (0.0036) model time 0.2217 (0.2431) loss 2.8811 (3.2708) grad_norm 2.2517 (2.2829) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][200/1251] eta 0:04:19 lr 0.000744 wd 0.0500 time 0.2304 (0.2472) data time 0.0007 (0.0034) model time 0.2297 (0.2451) loss 2.2714 (3.2747) grad_norm 2.2341 (2.2740) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][210/1251] eta 0:04:16 lr 0.000744 wd 0.0500 time 0.2362 (0.2464) data time 0.0010 (0.0033) model time 0.2352 (0.2440) loss 3.3235 (3.2857) grad_norm 2.4536 (2.2706) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][220/1251] eta 0:04:14 lr 0.000744 wd 0.0500 time 0.2285 (0.2468) data time 0.0010 (0.0032) model time 0.2275 (0.2446) loss 3.4707 (3.3006) grad_norm 1.6558 (2.2582) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][230/1251] eta 0:04:11 lr 0.000744 wd 0.0500 time 0.2302 (0.2460) data time 
0.0009 (0.0031) model time 0.2293 (0.2436) loss 3.6086 (3.3065) grad_norm 2.5791 (2.2514) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][240/1251] eta 0:04:09 lr 0.000744 wd 0.0500 time 0.3305 (0.2473) data time 0.0007 (0.0030) model time 0.3298 (0.2454) loss 2.9119 (3.2999) grad_norm 2.0075 (2.2445) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][250/1251] eta 0:04:06 lr 0.000743 wd 0.0500 time 0.2276 (0.2467) data time 0.0011 (0.0030) model time 0.2265 (0.2447) loss 3.2270 (3.2952) grad_norm 2.7431 (2.2396) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][260/1251] eta 0:04:03 lr 0.000743 wd 0.0500 time 0.2228 (0.2460) data time 0.0010 (0.0029) model time 0.2218 (0.2439) loss 3.3934 (3.3006) grad_norm 2.2039 (2.2454) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][270/1251] eta 0:04:01 lr 0.000743 wd 0.0500 time 0.3366 (0.2461) data time 0.0011 (0.0028) model time 0.3355 (0.2440) loss 1.7937 (3.3035) grad_norm 2.1042 (2.2414) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][280/1251] eta 0:03:59 lr 0.000743 wd 0.0500 time 0.2223 (0.2471) data time 0.0008 (0.0028) model time 0.2215 (0.2452) loss 4.2526 (3.3023) grad_norm 3.2587 (2.2506) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][290/1251] eta 0:03:57 lr 0.000743 wd 0.0500 time 0.2287 (0.2470) data time 0.0011 (0.0027) model time 0.2276 (0.2451) loss 2.2888 (3.3008) grad_norm 1.8804 (2.2552) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [115/300][300/1251] eta 0:03:54 lr 0.000743 wd 0.0500 time 0.2238 (0.2464) data time 0.0011 (0.0027) model time 0.2228 (0.2445) loss 3.1355 (3.3018) grad_norm 1.6920 (2.2617) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][310/1251] eta 0:03:51 lr 0.000743 wd 0.0500 time 0.2260 (0.2458) data time 0.0011 (0.0026) model time 0.2250 (0.2439) loss 2.8612 (3.3094) grad_norm 2.5507 (2.2609) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][320/1251] eta 0:03:49 lr 0.000743 wd 0.0500 time 0.3603 (0.2467) data time 0.0009 (0.0026) model time 0.3594 (0.2449) loss 2.9764 (3.2998) grad_norm 1.9472 (2.2629) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][330/1251] eta 0:03:46 lr 0.000743 wd 0.0500 time 0.2257 (0.2463) data time 0.0010 (0.0025) model time 0.2247 (0.2445) loss 3.9380 (3.3045) grad_norm 2.2748 (2.2745) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][340/1251] eta 0:03:43 lr 0.000743 wd 0.0500 time 0.2356 (0.2458) data time 0.0007 (0.0025) model time 0.2349 (0.2439) loss 3.0404 (3.3067) grad_norm 2.1312 (2.2622) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][350/1251] eta 0:03:41 lr 0.000743 wd 0.0500 time 0.2322 (0.2459) data time 0.0009 (0.0024) model time 0.2313 (0.2440) loss 3.3222 (3.3094) grad_norm 1.5841 (2.2566) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][360/1251] eta 0:03:39 lr 0.000743 wd 0.0500 time 0.2332 (0.2462) data time 0.0010 (0.0024) model time 0.2322 (0.2444) loss 3.3075 (3.3140) grad_norm 2.1173 (2.2495) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][370/1251] eta 0:03:37 lr 0.000743 wd 0.0500 time 0.2257 (0.2464) data time 0.0009 (0.0024) model time 0.2248 (0.2447) loss 3.4281 (3.3065) grad_norm 1.6706 (2.2477) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][380/1251] eta 0:03:34 lr 0.000743 wd 0.0500 time 0.2351 (0.2460) data time 0.0008 (0.0023) model time 0.2343 (0.2442) loss 2.9486 (3.3018) grad_norm 2.2039 (2.2538) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][390/1251] eta 0:03:31 lr 0.000743 wd 0.0500 time 0.2288 (0.2455) data time 0.0008 (0.0023) model time 0.2281 (0.2437) loss 4.0658 (3.3027) grad_norm 2.8234 (2.2596) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][400/1251] eta 0:03:29 lr 0.000743 wd 0.0500 time 0.2326 (0.2458) data time 0.0008 (0.0023) model time 0.2318 (0.2440) loss 4.1476 (3.3041) grad_norm 1.9635 (2.2562) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][410/1251] eta 0:03:26 lr 0.000743 wd 0.0500 time 0.2307 (0.2459) data time 0.0010 (0.0022) model time 0.2298 (0.2442) loss 3.9197 (3.3082) grad_norm 2.3842 (2.2570) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][420/1251] eta 0:03:24 lr 0.000743 wd 0.0500 time 0.2374 (0.2463) data time 0.0011 (0.0022) model time 0.2363 (0.2446) loss 3.7463 (3.3112) grad_norm 2.1518 (2.2520) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][430/1251] eta 0:03:21 lr 0.000743 wd 0.0500 time 0.2277 (0.2458) 
data time 0.0009 (0.0022) model time 0.2268 (0.2441) loss 3.0341 (3.3106) grad_norm 1.9669 (2.2412) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][440/1251] eta 0:03:19 lr 0.000743 wd 0.0500 time 0.3060 (0.2461) data time 0.0008 (0.0022) model time 0.3052 (0.2444) loss 4.1225 (3.3129) grad_norm 3.0441 (2.2451) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][450/1251] eta 0:03:17 lr 0.000743 wd 0.0500 time 0.2390 (0.2462) data time 0.0007 (0.0021) model time 0.2383 (0.2446) loss 3.2938 (3.3118) grad_norm 3.7485 (2.2479) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][460/1251] eta 0:03:14 lr 0.000743 wd 0.0500 time 0.2282 (0.2459) data time 0.0010 (0.0021) model time 0.2272 (0.2442) loss 3.3138 (3.3085) grad_norm 2.0906 (2.2522) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][470/1251] eta 0:03:11 lr 0.000743 wd 0.0500 time 0.2280 (0.2455) data time 0.0008 (0.0021) model time 0.2272 (0.2438) loss 4.0727 (3.3167) grad_norm 2.6701 (2.2518) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][480/1251] eta 0:03:09 lr 0.000743 wd 0.0500 time 0.3278 (0.2460) data time 0.0009 (0.0021) model time 0.3268 (0.2444) loss 3.7279 (3.3213) grad_norm 1.6601 (2.2456) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][490/1251] eta 0:03:07 lr 0.000743 wd 0.0500 time 0.2257 (0.2461) data time 0.0010 (0.0021) model time 0.2247 (0.2445) loss 3.9002 (3.3219) grad_norm 1.8182 (2.2404) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:51 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [115/300][500/1251] eta 0:03:04 lr 0.000742 wd 0.0500 time 0.2346 (0.2457) data time 0.0008 (0.0020) model time 0.2339 (0.2441) loss 3.6521 (3.3303) grad_norm 1.5169 (2.2327) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][510/1251] eta 0:03:01 lr 0.000742 wd 0.0500 time 0.2272 (0.2454) data time 0.0008 (0.0020) model time 0.2264 (0.2437) loss 4.6965 (3.3332) grad_norm 1.5907 (2.2286) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][520/1251] eta 0:02:59 lr 0.000742 wd 0.0500 time 0.3461 (0.2454) data time 0.0010 (0.0020) model time 0.3451 (0.2437) loss 3.2041 (3.3301) grad_norm 2.8563 (2.2381) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][530/1251] eta 0:02:57 lr 0.000742 wd 0.0500 time 0.2311 (0.2457) data time 0.0009 (0.0020) model time 0.2302 (0.2441) loss 3.4241 (3.3292) grad_norm 1.7311 (2.2370) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][540/1251] eta 0:02:54 lr 0.000742 wd 0.0500 time 0.3095 (0.2455) data time 0.0009 (0.0020) model time 0.3086 (0.2439) loss 3.7366 (3.3341) grad_norm 2.1050 (2.2388) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][550/1251] eta 0:02:52 lr 0.000742 wd 0.0500 time 0.2339 (0.2455) data time 0.0010 (0.0020) model time 0.2329 (0.2439) loss 3.0250 (3.3305) grad_norm 2.2229 (2.2382) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][560/1251] eta 0:02:49 lr 0.000742 wd 0.0500 time 0.3101 (0.2454) data time 0.0009 (0.0019) model time 0.3092 (0.2438) loss 3.5351 (3.3366) grad_norm 
1.7290 (2.2372) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][570/1251] eta 0:02:47 lr 0.000742 wd 0.0500 time 0.2290 (0.2461) data time 0.0008 (0.0019) model time 0.2282 (0.2446) loss 3.8925 (3.3393) grad_norm 2.1543 (2.2350) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][580/1251] eta 0:02:44 lr 0.000742 wd 0.0500 time 0.2247 (0.2459) data time 0.0010 (0.0019) model time 0.2237 (0.2443) loss 3.2645 (3.3435) grad_norm 2.9376 (2.2372) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][590/1251] eta 0:02:42 lr 0.000742 wd 0.0500 time 0.2219 (0.2457) data time 0.0007 (0.0019) model time 0.2212 (0.2441) loss 3.1067 (3.3452) grad_norm 2.4131 (2.2397) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][600/1251] eta 0:02:39 lr 0.000742 wd 0.0500 time 0.3269 (0.2457) data time 0.0012 (0.0019) model time 0.3257 (0.2442) loss 3.2420 (3.3442) grad_norm 2.3636 (2.2354) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][610/1251] eta 0:02:37 lr 0.000742 wd 0.0500 time 0.2296 (0.2458) data time 0.0009 (0.0019) model time 0.2287 (0.2443) loss 3.1437 (3.3399) grad_norm 2.5171 (2.2320) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][620/1251] eta 0:02:35 lr 0.000742 wd 0.0500 time 0.2323 (0.2459) data time 0.0009 (0.0019) model time 0.2314 (0.2444) loss 3.1939 (3.3377) grad_norm 1.5467 (2.2313) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][630/1251] eta 0:02:32 lr 0.000742 wd 0.0500 time 
0.4379 (0.2463) data time 0.0009 (0.0018) model time 0.4370 (0.2448) loss 3.6572 (3.3386) grad_norm 2.9055 (2.2325) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][640/1251] eta 0:02:30 lr 0.000742 wd 0.0500 time 0.3158 (0.2462) data time 0.0014 (0.0018) model time 0.3144 (0.2448) loss 3.1697 (3.3387) grad_norm 1.5379 (2.2319) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][650/1251] eta 0:02:28 lr 0.000742 wd 0.0500 time 0.2219 (0.2465) data time 0.0013 (0.0018) model time 0.2207 (0.2451) loss 4.3023 (3.3436) grad_norm 1.9480 (2.2343) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][660/1251] eta 0:02:25 lr 0.000742 wd 0.0500 time 0.2320 (0.2462) data time 0.0009 (0.0018) model time 0.2310 (0.2448) loss 3.5546 (3.3448) grad_norm 2.0621 (2.2326) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][670/1251] eta 0:02:23 lr 0.000742 wd 0.0500 time 0.2260 (0.2462) data time 0.0013 (0.0018) model time 0.2247 (0.2448) loss 2.5969 (3.3438) grad_norm 1.8645 (2.2267) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][680/1251] eta 0:02:20 lr 0.000742 wd 0.0500 time 0.2310 (0.2460) data time 0.0009 (0.0018) model time 0.2301 (0.2445) loss 3.6176 (3.3460) grad_norm 5.8271 (2.2771) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][690/1251] eta 0:02:18 lr 0.000742 wd 0.0500 time 0.2402 (0.2465) data time 0.0010 (0.0018) model time 0.2392 (0.2450) loss 3.5881 (3.3460) grad_norm 2.1702 (2.2826) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][700/1251] eta 0:02:15 lr 0.000742 wd 0.0500 time 0.2231 (0.2462) data time 0.0010 (0.0018) model time 0.2221 (0.2448) loss 4.0184 (3.3506) grad_norm 2.0609 (2.2834) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][710/1251] eta 0:02:13 lr 0.000742 wd 0.0500 time 0.2334 (0.2460) data time 0.0011 (0.0018) model time 0.2323 (0.2445) loss 2.6427 (3.3482) grad_norm 2.0080 (2.2838) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][720/1251] eta 0:02:10 lr 0.000742 wd 0.0500 time 0.3267 (0.2459) data time 0.0008 (0.0018) model time 0.3259 (0.2444) loss 2.5227 (3.3485) grad_norm 1.7958 (2.2789) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][730/1251] eta 0:02:08 lr 0.000742 wd 0.0500 time 0.2357 (0.2462) data time 0.0008 (0.0017) model time 0.2349 (0.2448) loss 3.2393 (3.3487) grad_norm 2.0653 (2.2819) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][740/1251] eta 0:02:05 lr 0.000742 wd 0.0500 time 0.2675 (0.2465) data time 0.0007 (0.0017) model time 0.2668 (0.2451) loss 4.2039 (3.3517) grad_norm 3.4420 (2.2853) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][750/1251] eta 0:02:03 lr 0.000742 wd 0.0500 time 0.2372 (0.2462) data time 0.0009 (0.0017) model time 0.2363 (0.2448) loss 3.7430 (3.3506) grad_norm 2.3688 (2.2840) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][760/1251] eta 0:02:00 lr 0.000741 wd 0.0500 time 0.2346 (0.2460) data time 0.0009 (0.0017) model time 0.2337 (0.2446) loss 2.9202 
(3.3508) grad_norm 2.5204 (2.2831) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][770/1251] eta 0:01:58 lr 0.000741 wd 0.0500 time 0.2357 (0.2465) data time 0.0009 (0.0017) model time 0.2348 (0.2452) loss 2.1937 (3.3483) grad_norm 1.6889 (2.2816) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][780/1251] eta 0:01:56 lr 0.000741 wd 0.0500 time 0.2260 (0.2463) data time 0.0009 (0.0017) model time 0.2250 (0.2449) loss 2.4692 (3.3436) grad_norm 2.3430 (2.2804) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][790/1251] eta 0:01:53 lr 0.000741 wd 0.0500 time 0.2248 (0.2461) data time 0.0009 (0.0017) model time 0.2239 (0.2447) loss 2.8449 (3.3414) grad_norm 2.1927 (2.2791) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][800/1251] eta 0:01:51 lr 0.000741 wd 0.0500 time 0.2312 (0.2461) data time 0.0008 (0.0017) model time 0.2305 (0.2447) loss 3.9787 (3.3427) grad_norm 1.8409 (2.2763) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][810/1251] eta 0:01:48 lr 0.000741 wd 0.0500 time 0.3250 (0.2463) data time 0.0009 (0.0017) model time 0.3241 (0.2449) loss 3.5700 (3.3430) grad_norm 2.0450 (2.2731) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][820/1251] eta 0:01:46 lr 0.000741 wd 0.0500 time 0.2294 (0.2462) data time 0.0011 (0.0017) model time 0.2282 (0.2449) loss 2.9783 (3.3461) grad_norm 3.6836 (2.2750) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][830/1251] eta 0:01:43 lr 
0.000741 wd 0.0500 time 0.2228 (0.2460) data time 0.0010 (0.0017) model time 0.2218 (0.2447) loss 3.2064 (3.3438) grad_norm 1.7001 (2.2717) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][840/1251] eta 0:01:41 lr 0.000741 wd 0.0500 time 0.2214 (0.2458) data time 0.0012 (0.0017) model time 0.2201 (0.2444) loss 3.2981 (3.3420) grad_norm 1.9198 (2.2692) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-26 14:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][850/1251] eta 0:01:38 lr 0.000741 wd 0.0500 time 0.2215 (0.2460) data time 0.0009 (0.0017) model time 0.2206 (0.2447) loss 3.6302 (3.3422) grad_norm 2.3029 (2.2667) loss_scale 2048.0000 (1028.8132) mem 7381MB [2024-08-26 14:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][860/1251] eta 0:01:36 lr 0.000741 wd 0.0500 time 0.2300 (0.2458) data time 0.0009 (0.0016) model time 0.2291 (0.2444) loss 3.6432 (3.3370) grad_norm 2.4090 (2.2647) loss_scale 2048.0000 (1040.6504) mem 7381MB [2024-08-26 14:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][870/1251] eta 0:01:33 lr 0.000741 wd 0.0500 time 0.2385 (0.2459) data time 0.0007 (0.0016) model time 0.2378 (0.2445) loss 2.2805 (3.3346) grad_norm 3.4637 (2.2657) loss_scale 2048.0000 (1052.2158) mem 7381MB [2024-08-26 14:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][880/1251] eta 0:01:31 lr 0.000741 wd 0.0500 time 0.2339 (0.2457) data time 0.0007 (0.0016) model time 0.2332 (0.2443) loss 3.6972 (3.3357) grad_norm 2.4480 (2.2669) loss_scale 2048.0000 (1063.5187) mem 7381MB [2024-08-26 14:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][890/1251] eta 0:01:28 lr 0.000741 wd 0.0500 time 0.3317 (0.2459) data time 0.0007 (0.0016) model time 0.3310 (0.2445) loss 3.4767 (3.3387) grad_norm 1.6337 (2.2678) loss_scale 2048.0000 (1074.5679) mem 7381MB [2024-08-26 
14:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][900/1251] eta 0:01:26 lr 0.000741 wd 0.0500 time 0.2283 (0.2458) data time 0.0007 (0.0016) model time 0.2276 (0.2445) loss 4.6054 (3.3441) grad_norm 1.9385 (2.2623) loss_scale 2048.0000 (1085.3718) mem 7381MB [2024-08-26 14:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][910/1251] eta 0:01:23 lr 0.000741 wd 0.0500 time 0.2384 (0.2457) data time 0.0007 (0.0016) model time 0.2377 (0.2443) loss 3.4149 (3.3441) grad_norm 4.2361 (2.2602) loss_scale 2048.0000 (1095.9385) mem 7381MB [2024-08-26 14:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][920/1251] eta 0:01:21 lr 0.000741 wd 0.0500 time 0.2287 (0.2456) data time 0.0016 (0.0016) model time 0.2271 (0.2443) loss 3.9340 (3.3449) grad_norm 2.0540 (2.2575) loss_scale 2048.0000 (1106.2758) mem 7381MB [2024-08-26 14:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][930/1251] eta 0:01:18 lr 0.000741 wd 0.0500 time 0.2223 (0.2457) data time 0.0012 (0.0016) model time 0.2212 (0.2443) loss 3.5936 (3.3478) grad_norm 2.2568 (2.2584) loss_scale 2048.0000 (1116.3910) mem 7381MB [2024-08-26 14:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][940/1251] eta 0:01:16 lr 0.000741 wd 0.0500 time 0.2286 (0.2457) data time 0.0009 (0.0016) model time 0.2277 (0.2443) loss 2.5383 (3.3427) grad_norm 1.6432 (2.2564) loss_scale 2048.0000 (1126.2912) mem 7381MB [2024-08-26 14:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][950/1251] eta 0:01:13 lr 0.000741 wd 0.0500 time 0.2324 (0.2455) data time 0.0010 (0.0016) model time 0.2314 (0.2441) loss 3.1555 (3.3394) grad_norm 2.7369 (2.2561) loss_scale 2048.0000 (1135.9832) mem 7381MB [2024-08-26 14:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][960/1251] eta 0:01:11 lr 0.000741 wd 0.0500 time 0.2317 (0.2454) data time 0.0007 (0.0016) model time 0.2310 (0.2440) 
loss 3.7763 (3.3393) grad_norm 2.2811 (2.2565) loss_scale 2048.0000 (1145.4735) mem 7381MB [2024-08-26 14:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][970/1251] eta 0:01:08 lr 0.000741 wd 0.0500 time 0.2328 (0.2454) data time 0.0009 (0.0016) model time 0.2319 (0.2441) loss 2.1996 (3.3387) grad_norm 1.6995 (2.2586) loss_scale 2048.0000 (1154.7683) mem 7381MB [2024-08-26 14:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][980/1251] eta 0:01:06 lr 0.000741 wd 0.0500 time 0.2287 (0.2456) data time 0.0009 (0.0016) model time 0.2278 (0.2442) loss 3.5510 (3.3398) grad_norm 2.1413 (2.2564) loss_scale 2048.0000 (1163.8736) mem 7381MB [2024-08-26 14:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][990/1251] eta 0:01:04 lr 0.000741 wd 0.0500 time 0.2283 (0.2457) data time 0.0012 (0.0016) model time 0.2272 (0.2444) loss 3.2749 (3.3405) grad_norm 5.4431 (2.2602) loss_scale 2048.0000 (1172.7952) mem 7381MB [2024-08-26 14:15:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1000/1251] eta 0:01:01 lr 0.000741 wd 0.0500 time 0.2315 (0.2455) data time 0.0007 (0.0016) model time 0.2307 (0.2442) loss 2.7982 (3.3438) grad_norm 1.5385 (2.2655) loss_scale 2048.0000 (1181.5385) mem 7381MB [2024-08-26 14:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1010/1251] eta 0:00:59 lr 0.000741 wd 0.0500 time 0.2764 (0.2455) data time 0.0009 (0.0016) model time 0.2755 (0.2442) loss 3.9538 (3.3442) grad_norm 2.2070 (2.2696) loss_scale 2048.0000 (1190.1088) mem 7381MB [2024-08-26 14:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1020/1251] eta 0:00:56 lr 0.000740 wd 0.0500 time 0.2297 (0.2456) data time 0.0009 (0.0016) model time 0.2287 (0.2442) loss 2.6560 (3.3429) grad_norm 2.3848 (2.2704) loss_scale 2048.0000 (1198.5113) mem 7381MB [2024-08-26 14:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1030/1251] eta 
0:00:54 lr 0.000740 wd 0.0500 time 0.2269 (0.2454) data time 0.0009 (0.0015) model time 0.2260 (0.2440) loss 3.3848 (3.3434) grad_norm 2.4116 (2.2701) loss_scale 2048.0000 (1206.7507) mem 7381MB [2024-08-26 14:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1040/1251] eta 0:00:51 lr 0.000740 wd 0.0500 time 0.2422 (0.2453) data time 0.0014 (0.0015) model time 0.2408 (0.2439) loss 2.9859 (3.3421) grad_norm 2.2932 (2.2689) loss_scale 2048.0000 (1214.8319) mem 7381MB [2024-08-26 14:16:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1050/1251] eta 0:00:49 lr 0.000740 wd 0.0500 time 0.3264 (0.2454) data time 0.0010 (0.0015) model time 0.3255 (0.2440) loss 3.6541 (3.3402) grad_norm 2.5986 (2.2685) loss_scale 2048.0000 (1222.7593) mem 7381MB [2024-08-26 14:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1060/1251] eta 0:00:46 lr 0.000740 wd 0.0500 time 0.2267 (0.2456) data time 0.0007 (0.0015) model time 0.2259 (0.2442) loss 2.8032 (3.3419) grad_norm 2.3197 (2.2668) loss_scale 2048.0000 (1230.5372) mem 7381MB [2024-08-26 14:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1070/1251] eta 0:00:44 lr 0.000740 wd 0.0500 time 0.2341 (0.2454) data time 0.0009 (0.0015) model time 0.2332 (0.2441) loss 3.3910 (3.3431) grad_norm 2.4488 (2.2682) loss_scale 2048.0000 (1238.1699) mem 7381MB [2024-08-26 14:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1080/1251] eta 0:00:41 lr 0.000740 wd 0.0500 time 0.2273 (0.2452) data time 0.0009 (0.0015) model time 0.2264 (0.2439) loss 3.0265 (3.3417) grad_norm 2.0798 (2.2715) loss_scale 2048.0000 (1245.6614) mem 7381MB [2024-08-26 14:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1090/1251] eta 0:00:39 lr 0.000740 wd 0.0500 time 0.2597 (0.2451) data time 0.0007 (0.0015) model time 0.2590 (0.2438) loss 3.9573 (3.3440) grad_norm 1.6333 (2.2731) loss_scale 2048.0000 (1253.0156) mem 
7381MB [2024-08-26 14:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1100/1251] eta 0:00:37 lr 0.000740 wd 0.0500 time 0.2296 (0.2453) data time 0.0012 (0.0015) model time 0.2284 (0.2439) loss 3.3861 (3.3433) grad_norm 2.1016 (2.2752) loss_scale 2048.0000 (1260.2361) mem 7381MB [2024-08-26 14:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1110/1251] eta 0:00:34 lr 0.000740 wd 0.0500 time 0.2346 (0.2451) data time 0.0008 (0.0015) model time 0.2338 (0.2438) loss 3.9418 (3.3436) grad_norm 2.1991 (2.2718) loss_scale 2048.0000 (1267.3267) mem 7381MB [2024-08-26 14:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1120/1251] eta 0:00:32 lr 0.000740 wd 0.0500 time 0.2270 (0.2452) data time 0.0008 (0.0015) model time 0.2262 (0.2439) loss 3.0387 (3.3442) grad_norm 2.1758 (2.2690) loss_scale 2048.0000 (1274.2908) mem 7381MB [2024-08-26 14:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1130/1251] eta 0:00:29 lr 0.000740 wd 0.0500 time 0.2230 (0.2451) data time 0.0009 (0.0015) model time 0.2221 (0.2437) loss 3.4222 (3.3452) grad_norm 1.7605 (2.2664) loss_scale 2048.0000 (1281.1317) mem 7381MB [2024-08-26 14:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1140/1251] eta 0:00:27 lr 0.000740 wd 0.0500 time 0.2607 (0.2454) data time 0.0007 (0.0015) model time 0.2601 (0.2441) loss 2.2607 (3.3438) grad_norm 1.6373 (2.2657) loss_scale 2048.0000 (1287.8528) mem 7381MB [2024-08-26 14:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1150/1251] eta 0:00:24 lr 0.000740 wd 0.0500 time 0.2284 (0.2453) data time 0.0011 (0.0015) model time 0.2273 (0.2439) loss 3.4325 (3.3429) grad_norm 2.4525 (2.2652) loss_scale 2048.0000 (1294.4570) mem 7381MB [2024-08-26 14:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1160/1251] eta 0:00:22 lr 0.000740 wd 0.0500 time 0.2207 (0.2451) data time 0.0009 (0.0015) 
model time 0.2197 (0.2438) loss 2.2544 (3.3405) grad_norm 1.4992 (2.2681) loss_scale 2048.0000 (1300.9475) mem 7381MB [2024-08-26 14:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1170/1251] eta 0:00:19 lr 0.000740 wd 0.0500 time 0.2653 (0.2450) data time 0.0007 (0.0015) model time 0.2646 (0.2437) loss 3.3383 (3.3409) grad_norm 2.1135 (2.2652) loss_scale 2048.0000 (1307.3271) mem 7381MB [2024-08-26 14:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1180/1251] eta 0:00:17 lr 0.000740 wd 0.0500 time 0.2215 (0.2453) data time 0.0015 (0.0015) model time 0.2201 (0.2439) loss 3.2099 (3.3418) grad_norm 1.9899 (2.2653) loss_scale 2048.0000 (1313.5986) mem 7381MB [2024-08-26 14:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1190/1251] eta 0:00:14 lr 0.000740 wd 0.0500 time 0.2329 (0.2452) data time 0.0009 (0.0015) model time 0.2320 (0.2439) loss 2.3406 (3.3395) grad_norm 3.5823 (2.2663) loss_scale 2048.0000 (1319.7649) mem 7381MB [2024-08-26 14:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1200/1251] eta 0:00:12 lr 0.000740 wd 0.0500 time 0.2222 (0.2451) data time 0.0009 (0.0015) model time 0.2213 (0.2438) loss 2.8637 (3.3398) grad_norm 1.9452 (2.2667) loss_scale 2048.0000 (1325.8285) mem 7381MB [2024-08-26 14:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1210/1251] eta 0:00:10 lr 0.000740 wd 0.0500 time 0.2370 (0.2450) data time 0.0009 (0.0015) model time 0.2361 (0.2436) loss 3.0699 (3.3371) grad_norm 2.5826 (2.2641) loss_scale 2048.0000 (1331.7919) mem 7381MB [2024-08-26 14:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1220/1251] eta 0:00:07 lr 0.000740 wd 0.0500 time 0.3052 (0.2452) data time 0.0007 (0.0015) model time 0.3045 (0.2439) loss 2.9219 (3.3354) grad_norm 2.1318 (2.2635) loss_scale 2048.0000 (1337.6577) mem 7381MB [2024-08-26 14:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [115/300][1230/1251] eta 0:00:05 lr 0.000740 wd 0.0500 time 0.2268 (0.2450) data time 0.0008 (0.0015) model time 0.2260 (0.2437) loss 3.6031 (3.3358) grad_norm 3.2067 (2.2644) loss_scale 2048.0000 (1343.4281) mem 7381MB [2024-08-26 14:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1240/1251] eta 0:00:02 lr 0.000740 wd 0.0500 time 0.3045 (0.2449) data time 0.0008 (0.0015) model time 0.3036 (0.2436) loss 2.7443 (3.3345) grad_norm 3.0120 (2.2670) loss_scale 2048.0000 (1349.1056) mem 7381MB [2024-08-26 14:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [115/300][1250/1251] eta 0:00:00 lr 0.000740 wd 0.0500 time 0.2165 (0.2448) data time 0.0007 (0.0015) model time 0.2158 (0.2435) loss 3.5888 (3.3327) grad_norm 2.1265 (2.2647) loss_scale 2048.0000 (1354.6922) mem 7381MB [2024-08-26 14:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 115 training takes 0:05:06 [2024-08-26 14:16:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 14:16:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 14:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.550 (0.550) Loss 0.5288 (0.5288) Acc@1 89.648 (89.648) Acc@5 97.852 (97.852) Mem 7381MB [2024-08-26 14:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.111 (0.134) Loss 0.7539 (0.7776) Acc@1 83.496 (83.132) Acc@5 96.191 (96.724) Mem 7381MB [2024-08-26 14:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.093 (0.114) Loss 1.1191 (0.8027) Acc@1 73.828 (82.203) Acc@5 93.164 (96.689) Mem 7381MB [2024-08-26 14:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.105) Loss 1.3789 (0.9106) Acc@1 67.969 (79.694) Acc@5 88.770 (95.227) Mem 7381MB [2024-08-26 14:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.096) Loss 1.2412 (0.9754) Acc@1 71.387 (78.127) Acc@5 91.504 (94.491) Mem 7381MB [2024-08-26 14:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.766 Acc@5 94.422 [2024-08-26 14:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.8% [2024-08-26 14:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.999 (0.999) Loss 0.4246 (0.4246) Acc@1 91.992 (91.992) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-26 14:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.168) Loss 0.6860 (0.6702) Acc@1 86.523 (85.502) Acc@5 96.875 (97.221) Mem 7381MB [2024-08-26 14:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.129) Loss 0.9531 (0.6920) Acc@1 77.930 (84.594) Acc@5 94.727 (97.238) Mem 7381MB [2024-08-26 14:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.115) Loss 1.2246 (0.7887) Acc@1 68.262 (82.280) Acc@5 91.504 (96.100) Mem 7381MB [2024-08-26 14:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.103) Loss 1.1074 
(0.8391) Acc@1 72.363 (80.852) Acc@5 93.262 (95.575) Mem 7381MB [2024-08-26 14:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.440 Acc@5 95.502 [2024-08-26 14:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.4% [2024-08-26 14:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.44% [2024-08-26 14:17:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 14:17:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 14:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][0/1251] eta 0:15:51 lr 0.000740 wd 0.0500 time 0.7607 (0.7607) data time 0.4818 (0.4818) model time 0.0000 (0.0000) loss 3.5424 (3.5424) grad_norm 1.7693 (1.7693) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][10/1251] eta 0:06:14 lr 0.000740 wd 0.0500 time 0.2275 (0.3021) data time 0.0007 (0.0448) model time 0.0000 (0.0000) loss 2.5668 (3.2349) grad_norm 2.1212 (1.8401) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][20/1251] eta 0:05:46 lr 0.000739 wd 0.0500 time 0.2237 (0.2818) data time 0.0009 (0.0241) model time 0.0000 (0.0000) loss 4.0019 (3.3364) grad_norm 2.0687 (1.9348) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][30/1251] eta 0:05:23 lr 0.000739 wd 0.0500 time 0.2340 (0.2649) data time 0.0007 (0.0166) model time 0.0000 (0.0000) loss 4.2521 (3.1741) grad_norm 1.8094 (2.0263) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][40/1251] eta 
0:05:10 lr 0.000739 wd 0.0500 time 0.2337 (0.2560) data time 0.0007 (0.0128) model time 0.0000 (0.0000) loss 2.1525 (3.1833) grad_norm 1.4905 (2.0066) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][50/1251] eta 0:05:12 lr 0.000739 wd 0.0500 time 0.2277 (0.2600) data time 0.0010 (0.0105) model time 0.0000 (0.0000) loss 3.5906 (3.2790) grad_norm 2.0800 (2.0362) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][60/1251] eta 0:05:03 lr 0.000739 wd 0.0500 time 0.2228 (0.2545) data time 0.0009 (0.0089) model time 0.2218 (0.2255) loss 3.6426 (3.2819) grad_norm 1.8568 (2.0082) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][70/1251] eta 0:04:56 lr 0.000739 wd 0.0500 time 0.2327 (0.2511) data time 0.0009 (0.0078) model time 0.2318 (0.2272) loss 3.1506 (3.3139) grad_norm 1.6366 (2.0615) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][80/1251] eta 0:04:53 lr 0.000739 wd 0.0500 time 0.2271 (0.2506) data time 0.0010 (0.0070) model time 0.2261 (0.2334) loss 3.0809 (3.3287) grad_norm 2.1563 (2.0543) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][90/1251] eta 0:04:51 lr 0.000739 wd 0.0500 time 0.2390 (0.2509) data time 0.0007 (0.0063) model time 0.2382 (0.2383) loss 3.6227 (3.3291) grad_norm 2.2526 (2.0861) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][100/1251] eta 0:04:48 lr 0.000739 wd 0.0500 time 0.2291 (0.2507) data time 0.0012 (0.0058) model time 0.2279 (0.2401) loss 3.3873 (3.3303) grad_norm 2.3581 (2.1236) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-26 14:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][110/1251] eta 0:04:43 lr 0.000739 wd 0.0500 time 0.2294 (0.2489) data time 0.0014 (0.0054) model time 0.2281 (0.2383) loss 3.5072 (3.3151) grad_norm 2.4101 (2.1975) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][120/1251] eta 0:04:39 lr 0.000739 wd 0.0500 time 0.2207 (0.2471) data time 0.0010 (0.0050) model time 0.2197 (0.2366) loss 3.1447 (3.3376) grad_norm 2.7040 (2.2003) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][130/1251] eta 0:04:39 lr 0.000739 wd 0.0500 time 0.2878 (0.2490) data time 0.0009 (0.0047) model time 0.2869 (0.2409) loss 3.5858 (3.3363) grad_norm 2.4161 (2.1965) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][140/1251] eta 0:04:34 lr 0.000739 wd 0.0500 time 0.2209 (0.2474) data time 0.0009 (0.0045) model time 0.2200 (0.2392) loss 3.1720 (3.3360) grad_norm 2.7457 (2.1977) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][150/1251] eta 0:04:32 lr 0.000739 wd 0.0500 time 0.2231 (0.2479) data time 0.0010 (0.0042) model time 0.2221 (0.2406) loss 2.3611 (3.3355) grad_norm 2.2320 (2.2074) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][160/1251] eta 0:04:29 lr 0.000739 wd 0.0500 time 0.2231 (0.2467) data time 0.0012 (0.0040) model time 0.2219 (0.2395) loss 2.7175 (3.3321) grad_norm 1.6364 (2.1954) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][170/1251] eta 0:04:28 lr 0.000739 wd 0.0500 time 0.3583 (0.2480) data time 0.0009 (0.0039) model time 0.3575 
(0.2418) loss 2.8682 (3.3333) grad_norm 2.3166 (2.1807) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][180/1251] eta 0:04:26 lr 0.000739 wd 0.0500 time 0.2480 (0.2486) data time 0.0009 (0.0037) model time 0.2471 (0.2431) loss 3.1857 (3.3534) grad_norm 2.1644 (2.1841) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][190/1251] eta 0:04:22 lr 0.000739 wd 0.0500 time 0.2287 (0.2476) data time 0.0009 (0.0036) model time 0.2278 (0.2420) loss 3.3639 (3.3520) grad_norm 1.4799 (2.1733) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][200/1251] eta 0:04:20 lr 0.000739 wd 0.0500 time 0.2720 (0.2478) data time 0.0009 (0.0035) model time 0.2711 (0.2426) loss 3.9107 (3.3546) grad_norm 2.2325 (2.1747) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][210/1251] eta 0:04:18 lr 0.000739 wd 0.0500 time 0.2353 (0.2481) data time 0.0009 (0.0033) model time 0.2344 (0.2432) loss 3.6228 (3.3547) grad_norm 5.0463 (2.2294) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][220/1251] eta 0:04:15 lr 0.000739 wd 0.0500 time 0.2346 (0.2482) data time 0.0009 (0.0032) model time 0.2337 (0.2436) loss 3.7562 (3.3481) grad_norm 1.6386 (2.2355) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][230/1251] eta 0:04:12 lr 0.000739 wd 0.0500 time 0.2308 (0.2474) data time 0.0011 (0.0031) model time 0.2297 (0.2427) loss 3.5390 (3.3539) grad_norm 2.0303 (2.2270) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[116/300][240/1251] eta 0:04:09 lr 0.000739 wd 0.0500 time 0.2265 (0.2466) data time 0.0010 (0.0031) model time 0.2255 (0.2419) loss 3.4962 (3.3557) grad_norm 1.9569 (2.2337) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][250/1251] eta 0:04:07 lr 0.000739 wd 0.0500 time 0.3263 (0.2472) data time 0.0009 (0.0030) model time 0.3253 (0.2428) loss 2.5543 (3.3440) grad_norm 2.1374 (2.2221) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][260/1251] eta 0:04:04 lr 0.000739 wd 0.0500 time 0.2403 (0.2469) data time 0.0011 (0.0029) model time 0.2392 (0.2427) loss 3.3253 (3.3488) grad_norm 1.6883 (2.2107) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][270/1251] eta 0:04:02 lr 0.000739 wd 0.0500 time 0.3856 (0.2473) data time 0.0007 (0.0028) model time 0.3849 (0.2433) loss 3.2191 (3.3378) grad_norm 2.2016 (2.2129) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][280/1251] eta 0:03:59 lr 0.000738 wd 0.0500 time 0.2277 (0.2467) data time 0.0008 (0.0028) model time 0.2269 (0.2427) loss 2.9800 (3.3284) grad_norm 1.7664 (2.2045) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][290/1251] eta 0:03:57 lr 0.000738 wd 0.0500 time 0.2409 (0.2469) data time 0.0011 (0.0027) model time 0.2398 (0.2430) loss 2.1515 (3.3206) grad_norm 4.2261 (2.2115) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][300/1251] eta 0:03:55 lr 0.000738 wd 0.0500 time 0.2291 (0.2472) data time 0.0009 (0.0027) model time 0.2282 (0.2436) loss 2.4267 (3.3202) grad_norm 1.5194 (2.2166) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-26 14:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][310/1251] eta 0:03:52 lr 0.000738 wd 0.0500 time 0.2313 (0.2466) data time 0.0007 (0.0026) model time 0.2306 (0.2429) loss 4.0620 (3.3230) grad_norm 1.7376 (2.2196) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][320/1251] eta 0:03:49 lr 0.000738 wd 0.0500 time 0.2277 (0.2461) data time 0.0011 (0.0026) model time 0.2266 (0.2423) loss 3.0392 (3.3259) grad_norm 1.8441 (2.2075) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][330/1251] eta 0:03:47 lr 0.000738 wd 0.0500 time 0.3374 (0.2467) data time 0.0029 (0.0025) model time 0.3345 (0.2432) loss 3.4268 (3.3244) grad_norm 2.7159 (2.2043) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][340/1251] eta 0:03:45 lr 0.000738 wd 0.0500 time 0.2693 (0.2473) data time 0.0011 (0.0025) model time 0.2682 (0.2440) loss 3.2813 (3.3258) grad_norm 2.3187 (2.2083) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][350/1251] eta 0:03:43 lr 0.000738 wd 0.0500 time 0.2260 (0.2475) data time 0.0012 (0.0025) model time 0.2248 (0.2443) loss 3.7717 (3.3239) grad_norm 2.8579 (2.2049) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][360/1251] eta 0:03:40 lr 0.000738 wd 0.0500 time 0.2377 (0.2471) data time 0.0007 (0.0024) model time 0.2370 (0.2439) loss 3.8123 (3.3246) grad_norm 10.5627 (2.2232) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][370/1251] eta 0:03:37 lr 0.000738 wd 0.0500 time 0.2446 (0.2473) data time 0.0012 
(0.0024) model time 0.2434 (0.2442) loss 2.7894 (3.3283) grad_norm 1.6426 (2.2302) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][380/1251] eta 0:03:35 lr 0.000738 wd 0.0500 time 0.2277 (0.2475) data time 0.0010 (0.0024) model time 0.2267 (0.2444) loss 2.3552 (3.3285) grad_norm 2.1759 (2.2241) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][390/1251] eta 0:03:32 lr 0.000738 wd 0.0500 time 0.2323 (0.2470) data time 0.0018 (0.0023) model time 0.2306 (0.2440) loss 3.8277 (3.3292) grad_norm 2.3990 (2.2283) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][400/1251] eta 0:03:30 lr 0.000738 wd 0.0500 time 0.2281 (0.2473) data time 0.0008 (0.0023) model time 0.2273 (0.2443) loss 4.2948 (3.3306) grad_norm 2.4360 (2.2281) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][410/1251] eta 0:03:28 lr 0.000738 wd 0.0500 time 0.2260 (0.2474) data time 0.0008 (0.0023) model time 0.2252 (0.2445) loss 2.4477 (3.3264) grad_norm 2.9403 (2.2341) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][420/1251] eta 0:03:25 lr 0.000738 wd 0.0500 time 0.2349 (0.2474) data time 0.0010 (0.0022) model time 0.2339 (0.2446) loss 3.0014 (3.3270) grad_norm 1.9871 (2.2319) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][430/1251] eta 0:03:22 lr 0.000738 wd 0.0500 time 0.2418 (0.2470) data time 0.0006 (0.0022) model time 0.2412 (0.2442) loss 2.3481 (3.3266) grad_norm 2.1704 (2.2263) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [116/300][440/1251] eta 0:03:20 lr 0.000738 wd 0.0500 time 0.2289 (0.2467) data time 0.0010 (0.0022) model time 0.2279 (0.2438) loss 3.3579 (3.3253) grad_norm 2.1648 (2.2312) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][450/1251] eta 0:03:17 lr 0.000738 wd 0.0500 time 0.2981 (0.2469) data time 0.0010 (0.0022) model time 0.2970 (0.2441) loss 3.7206 (3.3349) grad_norm 1.6571 (2.2285) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][460/1251] eta 0:03:15 lr 0.000738 wd 0.0500 time 0.2432 (0.2471) data time 0.0011 (0.0021) model time 0.2421 (0.2443) loss 3.2014 (3.3267) grad_norm 2.9404 (2.2384) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][470/1251] eta 0:03:12 lr 0.000738 wd 0.0500 time 0.2247 (0.2471) data time 0.0009 (0.0021) model time 0.2238 (0.2444) loss 4.1153 (3.3303) grad_norm 2.0642 (2.2340) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][480/1251] eta 0:03:10 lr 0.000738 wd 0.0500 time 0.2276 (0.2467) data time 0.0009 (0.0021) model time 0.2268 (0.2440) loss 3.7644 (3.3318) grad_norm 2.0820 (2.2359) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][490/1251] eta 0:03:07 lr 0.000738 wd 0.0500 time 0.3255 (0.2467) data time 0.0010 (0.0021) model time 0.3245 (0.2440) loss 3.5390 (3.3315) grad_norm 2.3751 (2.2334) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][500/1251] eta 0:03:05 lr 0.000738 wd 0.0500 time 0.2340 (0.2470) data time 0.0009 (0.0020) model time 0.2331 (0.2445) loss 3.2599 (3.3406) grad_norm 1.9082 (2.2302) loss_scale 
2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][510/1251] eta 0:03:02 lr 0.000738 wd 0.0500 time 0.2357 (0.2467) data time 0.0010 (0.0020) model time 0.2346 (0.2441) loss 3.5794 (3.3343) grad_norm 1.9520 (2.2265) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 14:19:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 14:19:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 14:19:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 14:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 14:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 14:44:04 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 15:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 15:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 15:20:43 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 15:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 15:20:52 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 15:20:53 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 15:20:54 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 15:20:54 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 116) [2024-08-26 15:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 15:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][520/1251] eta 0:39:36 lr 0.000738 wd 0.0500 time 0.2279 (3.2504) data time 0.0007 (0.1424) model time 0.2272 (3.1080) loss 3.8596 (3.9311) grad_norm 1.9305 (2.0407) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][530/1251] eta 0:14:49 lr 0.000737 wd 0.0500 time 0.2240 (1.2337) data time 0.0009 (0.0484) model time 0.2231 (1.1853) loss 3.5643 (3.6898) grad_norm 2.0470 (2.1223) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][540/1251] eta 0:09:50 lr 0.000737 wd 0.0500 time 0.2231 (0.8303) data time 0.0009 (0.0298) model time 0.2222 (0.8005) loss 3.6986 (3.6401) grad_norm 2.3444 (2.1309) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][550/1251] eta 0:07:40 lr 0.000737 wd 0.0500 time 0.2203 (0.6574) data time 0.0009 (0.0216) model time 0.2194 (0.6357) loss 3.3212 (3.5895) grad_norm 1.5865 (2.1096) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][560/1251] eta 0:06:27 lr 0.000737 wd 0.0500 time 0.2219 (0.5610) data time 0.0009 (0.0171) model time 0.2209 (0.5439) loss 3.4001 (3.5419) grad_norm 3.3540 (2.1287) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [116/300][570/1251] eta 0:05:40 lr 0.000737 wd 0.0500 time 0.2236 (0.5000) data time 0.0007 (0.0143) model time 0.2229 (0.4857) loss 2.4208 (3.5309) grad_norm 4.1206 (2.1599) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][580/1251] eta 0:05:06 lr 0.000737 wd 0.0500 time 0.2309 (0.4575) data time 0.0008 (0.0122) model time 0.2301 (0.4453) loss 3.9932 (3.5228) grad_norm 1.9296 (2.1681) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][590/1251] eta 0:04:41 lr 0.000737 wd 0.0500 time 0.2278 (0.4263) data time 0.0010 (0.0107) model time 0.2268 (0.4156) loss 2.7222 (3.4920) grad_norm 2.5042 (2.1855) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][600/1251] eta 0:04:22 lr 0.000737 wd 0.0500 time 0.2213 (0.4030) data time 0.0010 (0.0098) model time 0.2203 (0.3932) loss 3.1454 (3.4516) grad_norm 2.2898 (2.1539) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][610/1251] eta 0:04:06 lr 0.000737 wd 0.0500 time 0.2258 (0.3845) data time 0.0008 (0.0090) model time 0.2250 (0.3755) loss 3.2919 (3.4383) grad_norm 1.9804 (2.1652) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][620/1251] eta 0:03:53 lr 0.000737 wd 0.0500 time 0.2229 (0.3693) data time 0.0008 (0.0082) model time 0.2221 (0.3611) loss 3.1313 (3.4580) grad_norm 1.6743 (2.1429) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][630/1251] eta 0:03:41 lr 0.000737 wd 0.0500 time 0.2227 (0.3568) data time 0.0006 (0.0076) model time 0.2221 (0.3492) loss 3.0161 (3.4612) grad_norm 1.8166 (2.1423) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 15:21:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 15:21:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 15:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 15:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 15:23:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 15:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 15:23:56 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 15:23:58 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 15:23:59 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 15:23:59 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 116) [2024-08-26 15:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 15:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][640/1251] eta 1:01:11 lr 0.000737 wd 0.0500 time 0.2530 (6.0085) data time 0.0007 (0.6686) model time 0.2523 (5.3399) loss 2.7761 (3.4823) grad_norm 2.1574 (2.7549) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 15:24:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 15:24:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 15:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 15:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 15:29:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 15:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 15:30:04 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 15:30:05 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 15:30:06 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 15:30:07 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 116) [2024-08-26 15:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 15:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][650/1251] eta 0:26:59 lr 0.000737 wd 0.0500 time 0.2211 (2.6952) data time 0.0007 (0.1794) model time 0.2204 (2.5158) loss 4.2486 (3.9127) grad_norm 1.6865 (2.2921) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][660/1251] eta 0:10:20 lr 0.000737 wd 0.0500 time 0.2213 (1.0504) data time 0.0009 (0.0604) model time 0.2204 (0.9900) loss 3.7965 (3.6997) grad_norm 2.0009 (2.4914) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][670/1251] eta 0:06:58 lr 0.000737 wd 0.0500 time 0.2231 (0.7206) data time 0.0010 (0.0366) model time 0.2221 (0.6840) loss 4.1160 (3.7027) grad_norm 14.4937 (2.9099) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][680/1251] eta 0:05:30 lr 0.000737 wd 0.0500 time 0.2226 (0.5787) data time 0.0009 (0.0264) model time 0.2217 (0.5523) loss 3.5705 (3.6959) grad_norm 2.0997 (2.6670) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][690/1251] eta 0:04:40 lr 0.000737 wd 0.0500 time 0.2213 (0.5006) data time 0.0008 (0.0208) model time 0.2205 (0.4798) loss 3.6176 (3.6211) grad_norm 1.9147 (2.5248) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [116/300][700/1251] eta 0:04:08 lr 0.000737 wd 0.0500 time 0.2323 (0.4513) data time 0.0007 (0.0172) model time 0.2316 (0.4341) loss 2.7003 (3.5866) grad_norm 1.8779 (2.4182) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][710/1251] eta 0:03:45 lr 0.000737 wd 0.0500 time 0.2291 (0.4166) data time 0.0009 (0.0147) model time 0.2283 (0.4019) loss 3.5726 (3.5508) grad_norm 2.4555 (2.4478) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][720/1251] eta 0:03:27 lr 0.000737 wd 0.0500 time 0.2247 (0.3914) data time 0.0009 (0.0129) model time 0.2239 (0.3785) loss 2.7025 (3.5015) grad_norm 2.0400 (2.4227) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][730/1251] eta 0:03:13 lr 0.000737 wd 0.0500 time 0.2211 (0.3717) data time 0.0008 (0.0115) model time 0.2203 (0.3602) loss 3.3262 (3.4707) grad_norm 2.3819 (2.3742) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][740/1251] eta 0:03:02 lr 0.000737 wd 0.0500 time 0.2273 (0.3564) data time 0.0009 (0.0104) model time 0.2264 (0.3460) loss 3.4866 (3.4703) grad_norm 2.4651 (2.4023) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 15:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 15:30:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 15:30:47 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 15:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 15:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 15:32:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 15:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 15:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 15:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 15:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 15:36:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 15:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 15:36:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 15:36:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 15:36:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 15:36:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 116) [2024-08-26 15:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 15:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][750/1251] eta 0:40:54 lr 0.000737 wd 0.0500 time 0.2200 (4.8986) data time 0.0007 (0.3025) model time 0.2193 (4.5961) loss 2.9994 (3.6430) grad_norm 2.0389 (2.2877) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][760/1251] eta 0:10:41 lr 0.000737 wd 0.0500 time 0.2290 (1.3061) data time 0.0010 (0.0707) model time 0.2280 (1.2355) loss 3.6727 (3.5802) grad_norm 2.1817 (2.0163) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][770/1251] eta 0:06:43 lr 0.000737 wd 0.0500 time 0.2364 (0.8380) data time 0.0008 (0.0404) model time 0.2356 (0.7976) loss 4.0881 (3.5911) grad_norm 2.9302 (2.0397) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][780/1251] eta 0:05:07 lr 0.000737 wd 0.0500 time 0.2325 (0.6536) data time 0.0009 (0.0285) model time 0.2316 (0.6251) loss 4.0714 (3.6044) grad_norm 3.6069 (2.1974) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][790/1251] eta 0:04:15 lr 0.000736 wd 0.0500 time 0.2311 (0.5544) data time 0.0014 (0.0221) model time 0.2297 (0.5323) loss 3.4195 (3.5390) grad_norm 1.9792 (2.2241) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [116/300][800/1251] eta 0:03:42 lr 0.000736 wd 0.0500 time 0.2322 (0.4930) data time 0.0013 (0.0182) model time 0.2309 (0.4749) loss 3.4334 (3.5055) grad_norm 1.7543 (2.1993) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][810/1251] eta 0:03:19 lr 0.000736 wd 0.0500 time 0.2283 (0.4513) data time 0.0011 (0.0154) model time 0.2272 (0.4358) loss 3.3684 (3.4732) grad_norm 2.8919 (2.2037) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][820/1251] eta 0:03:01 lr 0.000736 wd 0.0500 time 0.2260 (0.4205) data time 0.0010 (0.0135) model time 0.2250 (0.4070) loss 3.5422 (3.4537) grad_norm 2.7468 (2.2362) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][830/1251] eta 0:02:47 lr 0.000736 wd 0.0500 time 0.2307 (0.3977) data time 0.0008 (0.0120) model time 0.2299 (0.3858) loss 2.7724 (3.4185) grad_norm 3.0085 (2.2217) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][840/1251] eta 0:02:35 lr 0.000736 wd 0.0500 time 0.2268 (0.3795) data time 0.0007 (0.0108) model time 0.2261 (0.3687) loss 4.1384 (3.4172) grad_norm 1.7741 (2.2099) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][850/1251] eta 0:02:26 lr 0.000736 wd 0.0500 time 0.2238 (0.3649) data time 0.0007 (0.0099) model time 0.2230 (0.3551) loss 4.0045 (3.4493) grad_norm 1.8520 (2.1811) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][860/1251] eta 0:02:17 lr 0.000736 wd 0.0500 time 0.2247 (0.3529) data time 0.0009 (0.0091) model time 0.2238 (0.3438) loss 3.6564 (3.4377) grad_norm 2.1486 (2.1859) loss_scale 
2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][870/1251] eta 0:02:10 lr 0.000736 wd 0.0500 time 0.2282 (0.3428) data time 0.0009 (0.0084) model time 0.2273 (0.3343) loss 3.6804 (3.4396) grad_norm 2.3175 (2.2226) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][880/1251] eta 0:02:03 lr 0.000736 wd 0.0500 time 0.2267 (0.3342) data time 0.0010 (0.0079) model time 0.2257 (0.3263) loss 3.6207 (3.4309) grad_norm 1.9410 (2.2111) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][890/1251] eta 0:01:57 lr 0.000736 wd 0.0500 time 0.2240 (0.3268) data time 0.0016 (0.0074) model time 0.2224 (0.3193) loss 3.6942 (3.4233) grad_norm 2.0269 (2.1958) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][900/1251] eta 0:01:52 lr 0.000736 wd 0.0500 time 0.2348 (0.3205) data time 0.0008 (0.0070) model time 0.2341 (0.3135) loss 3.4314 (3.4156) grad_norm 2.9970 (2.2181) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][910/1251] eta 0:01:47 lr 0.000736 wd 0.0500 time 0.2298 (0.3147) data time 0.0012 (0.0067) model time 0.2286 (0.3081) loss 2.8999 (3.4176) grad_norm 1.8064 (2.2054) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][920/1251] eta 0:01:42 lr 0.000736 wd 0.0500 time 0.2238 (0.3097) data time 0.0011 (0.0063) model time 0.2227 (0.3034) loss 3.8590 (3.4132) grad_norm 2.3163 (2.1925) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][930/1251] eta 0:01:38 lr 0.000736 wd 0.0500 time 0.2270 (0.3054) data time 
0.0009 (0.0060) model time 0.2261 (0.2994) loss 3.6974 (3.3970) grad_norm 2.1466 (2.1784) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][940/1251] eta 0:01:33 lr 0.000736 wd 0.0500 time 0.2312 (0.3014) data time 0.0009 (0.0058) model time 0.2302 (0.2956) loss 3.2282 (3.3936) grad_norm 2.4378 (2.1682) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][950/1251] eta 0:01:29 lr 0.000736 wd 0.0500 time 0.2258 (0.2978) data time 0.0009 (0.0056) model time 0.2249 (0.2922) loss 2.6117 (3.3764) grad_norm 2.0763 (2.1693) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][960/1251] eta 0:01:25 lr 0.000736 wd 0.0500 time 0.2272 (0.2946) data time 0.0010 (0.0054) model time 0.2262 (0.2892) loss 2.1274 (3.3683) grad_norm 2.1140 (2.1694) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][970/1251] eta 0:01:21 lr 0.000736 wd 0.0500 time 0.2325 (0.2917) data time 0.0009 (0.0052) model time 0.2316 (0.2865) loss 2.3768 (3.3679) grad_norm 3.0352 (2.1886) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][980/1251] eta 0:01:18 lr 0.000736 wd 0.0500 time 0.2204 (0.2889) data time 0.0010 (0.0050) model time 0.2194 (0.2840) loss 3.0386 (3.3634) grad_norm 2.6035 (2.1852) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][990/1251] eta 0:01:14 lr 0.000736 wd 0.0500 time 0.2280 (0.2864) data time 0.0009 (0.0048) model time 0.2271 (0.2816) loss 3.5703 (3.3660) grad_norm 2.1099 (2.1779) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [116/300][1000/1251] eta 0:01:11 lr 0.000736 wd 0.0500 time 0.2265 (0.2842) data time 0.0011 (0.0047) model time 0.2253 (0.2795) loss 3.4623 (3.3571) grad_norm 3.9932 (2.1772) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1010/1251] eta 0:01:07 lr 0.000736 wd 0.0500 time 0.2261 (0.2820) data time 0.0008 (0.0045) model time 0.2254 (0.2775) loss 2.8842 (3.3457) grad_norm 2.0799 (2.1714) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1020/1251] eta 0:01:04 lr 0.000736 wd 0.0500 time 0.2239 (0.2802) data time 0.0010 (0.0044) model time 0.2229 (0.2758) loss 4.3215 (3.3437) grad_norm 2.0713 (2.1714) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1030/1251] eta 0:01:01 lr 0.000736 wd 0.0500 time 0.2290 (0.2784) data time 0.0008 (0.0043) model time 0.2281 (0.2741) loss 3.7661 (3.3418) grad_norm 1.9635 (2.1772) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1040/1251] eta 0:00:58 lr 0.000735 wd 0.0500 time 0.2510 (0.2777) data time 0.0011 (0.0043) model time 0.2499 (0.2734) loss 3.4636 (3.3369) grad_norm 1.4324 (2.2005) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1050/1251] eta 0:00:55 lr 0.000735 wd 0.0500 time 0.2391 (0.2763) data time 0.0012 (0.0042) model time 0.2379 (0.2721) loss 3.4345 (3.3287) grad_norm 2.8763 (2.2089) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1060/1251] eta 0:00:52 lr 0.000735 wd 0.0500 time 0.2363 (0.2757) data time 0.0007 (0.0042) model time 0.2356 (0.2715) loss 3.6727 (3.3262) grad_norm 2.4803 (2.2045) 
loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1070/1251] eta 0:00:49 lr 0.000735 wd 0.0500 time 0.2250 (0.2742) data time 0.0009 (0.0041) model time 0.2241 (0.2701) loss 3.6421 (3.3348) grad_norm 2.4468 (2.1986) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1080/1251] eta 0:00:46 lr 0.000735 wd 0.0500 time 0.2253 (0.2728) data time 0.0011 (0.0040) model time 0.2242 (0.2688) loss 3.8031 (3.3369) grad_norm 2.6166 (2.2164) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1090/1251] eta 0:00:43 lr 0.000735 wd 0.0500 time 0.2325 (0.2715) data time 0.0010 (0.0039) model time 0.2315 (0.2676) loss 3.2110 (3.3379) grad_norm 2.4261 (2.2265) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1100/1251] eta 0:00:40 lr 0.000735 wd 0.0500 time 0.2303 (0.2704) data time 0.0007 (0.0038) model time 0.2296 (0.2665) loss 4.0974 (3.3426) grad_norm 1.8382 (2.2295) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1110/1251] eta 0:00:37 lr 0.000735 wd 0.0500 time 0.2246 (0.2692) data time 0.0010 (0.0037) model time 0.2236 (0.2655) loss 3.4999 (3.3428) grad_norm 1.9023 (2.2234) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1120/1251] eta 0:00:35 lr 0.000735 wd 0.0500 time 0.2200 (0.2681) data time 0.0007 (0.0037) model time 0.2193 (0.2645) loss 3.8056 (3.3386) grad_norm 2.1323 (2.2161) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1130/1251] eta 0:00:32 lr 0.000735 wd 0.0500 time 0.2387 
(0.2671) data time 0.0009 (0.0036) model time 0.2378 (0.2635) loss 2.3732 (3.3362) grad_norm 2.0646 (2.2140) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1140/1251] eta 0:00:29 lr 0.000735 wd 0.0500 time 0.2224 (0.2661) data time 0.0009 (0.0035) model time 0.2215 (0.2626) loss 3.7252 (3.3339) grad_norm 1.9748 (2.2125) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1150/1251] eta 0:00:26 lr 0.000735 wd 0.0500 time 0.2231 (0.2652) data time 0.0010 (0.0035) model time 0.2221 (0.2617) loss 3.0587 (3.3400) grad_norm 2.0961 (2.2104) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1160/1251] eta 0:00:24 lr 0.000735 wd 0.0500 time 0.2190 (0.2644) data time 0.0011 (0.0035) model time 0.2179 (0.2609) loss 3.6721 (3.3459) grad_norm 1.5638 (2.2084) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1170/1251] eta 0:00:21 lr 0.000735 wd 0.0500 time 0.2279 (0.2636) data time 0.0008 (0.0034) model time 0.2271 (0.2602) loss 2.6477 (3.3427) grad_norm 1.4990 (2.1996) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1180/1251] eta 0:00:18 lr 0.000735 wd 0.0500 time 0.2269 (0.2629) data time 0.0007 (0.0033) model time 0.2263 (0.2595) loss 4.0111 (3.3512) grad_norm 1.6542 (2.1964) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1190/1251] eta 0:00:15 lr 0.000735 wd 0.0500 time 0.2280 (0.2621) data time 0.0007 (0.0033) model time 0.2273 (0.2588) loss 3.5541 (3.3547) grad_norm 2.2550 (2.1895) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1200/1251] eta 0:00:13 lr 0.000735 wd 0.0500 time 0.2251 (0.2614) data time 0.0008 (0.0033) model time 0.2243 (0.2582) loss 2.4398 (3.3512) grad_norm 1.8785 (2.1917) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1210/1251] eta 0:00:10 lr 0.000735 wd 0.0500 time 0.2277 (0.2608) data time 0.0006 (0.0032) model time 0.2270 (0.2575) loss 3.1647 (3.3473) grad_norm 2.0336 (2.1896) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1220/1251] eta 0:00:08 lr 0.000735 wd 0.0500 time 0.2187 (0.2601) data time 0.0009 (0.0032) model time 0.2179 (0.2569) loss 2.9589 (3.3417) grad_norm 1.8899 (2.1856) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1230/1251] eta 0:00:05 lr 0.000735 wd 0.0500 time 0.2302 (0.2596) data time 0.0009 (0.0031) model time 0.2294 (0.2565) loss 3.9193 (3.3422) grad_norm 2.0449 (2.1838) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1240/1251] eta 0:00:02 lr 0.000735 wd 0.0500 time 0.2115 (0.2588) data time 0.0004 (0.0031) model time 0.2111 (0.2557) loss 2.9314 (3.3438) grad_norm 2.6876 (2.1835) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [116/300][1250/1251] eta 0:00:00 lr 0.000735 wd 0.0500 time 0.2118 (0.2580) data time 0.0004 (0.0031) model time 0.2114 (0.2549) loss 2.7762 (3.3425) grad_norm 1.6787 (2.1818) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 15:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 116 training takes 0:02:09 [2024-08-26 15:38:55 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 15:38:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 15:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.509 (0.509) Loss 0.4800 (0.4800) Acc@1 90.625 (90.625) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-26 15:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.119) Loss 0.7891 (0.7665) Acc@1 82.812 (83.088) Acc@5 95.508 (96.378) Mem 7379MB [2024-08-26 15:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.101) Loss 1.2236 (0.7924) Acc@1 71.484 (82.171) Acc@5 92.676 (96.401) Mem 7379MB [2024-08-26 15:39:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.094) Loss 1.3555 (0.9022) Acc@1 67.578 (79.672) Acc@5 89.844 (95.048) Mem 7379MB [2024-08-26 15:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.2109 (0.9633) Acc@1 72.754 (78.285) Acc@5 91.113 (94.372) Mem 7379MB [2024-08-26 15:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.942 Acc@5 94.310 [2024-08-26 15:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.9% [2024-08-26 15:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.94% [2024-08-26 15:39:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 15:39:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 15:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.032 (1.032) Loss 0.4231 (0.4231) Acc@1 91.992 (91.992) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-26 15:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.171) Loss 0.6836 (0.6692) Acc@1 86.523 (85.511) Acc@5 96.680 (97.203) Mem 7379MB [2024-08-26 15:39:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.129) Loss 0.9526 (0.6909) Acc@1 78.027 (84.584) Acc@5 94.824 (97.247) Mem 7379MB [2024-08-26 15:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.112) Loss 1.2246 (0.7876) Acc@1 68.848 (82.343) Acc@5 91.309 (96.081) Mem 7379MB [2024-08-26 15:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.101) Loss 1.1074 (0.8379) Acc@1 72.363 (80.912) Acc@5 93.164 (95.551) Mem 7379MB [2024-08-26 15:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.532 Acc@5 95.494 [2024-08-26 15:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.5% [2024-08-26 15:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.53% [2024-08-26 15:39:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 15:39:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 15:39:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][0/1251] eta 0:18:13 lr 0.000735 wd 0.0500 time 0.8744 (0.8744) data time 0.5975 (0.5975) model time 0.0000 (0.0000) loss 3.4198 (3.4198) grad_norm 1.7481 (1.7481) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-26 15:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][10/1251] eta 0:05:53 lr 0.000735 wd 0.0500 time 0.2217 (0.2852) data time 0.0009 (0.0553) model time 0.0000 (0.0000) loss 3.2643 (3.6959) grad_norm 2.2377 (2.0887) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][20/1251] eta 0:05:17 lr 0.000735 wd 0.0500 time 0.2223 (0.2575) data time 0.0007 (0.0295) model time 0.0000 (0.0000) loss 2.5493 (3.3964) grad_norm 4.1006 (2.1561) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][30/1251] eta 0:05:03 lr 0.000735 wd 0.0500 time 0.2299 (0.2485) data time 0.0010 (0.0203) model time 0.0000 (0.0000) loss 2.4364 (3.3410) grad_norm 1.9938 (2.2292) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][40/1251] eta 0:04:55 lr 0.000735 wd 0.0500 time 0.2294 (0.2437) data time 0.0008 (0.0156) model time 0.0000 (0.0000) loss 3.4806 (3.2928) grad_norm 1.5491 (2.1513) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][50/1251] eta 0:04:49 lr 0.000734 wd 0.0500 time 0.2275 (0.2407) data time 0.0008 (0.0127) model time 0.0000 (0.0000) loss 3.8846 (3.3560) grad_norm 2.3450 (2.1593) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][60/1251] eta 0:04:44 lr 0.000734 wd 0.0500 time 0.2417 (0.2388) data time 0.0009 (0.0108) model time 0.2407 
(0.2279) loss 3.5672 (3.3706) grad_norm 1.9564 (2.1598) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][70/1251] eta 0:04:40 lr 0.000734 wd 0.0500 time 0.2397 (0.2376) data time 0.0009 (0.0095) model time 0.2389 (0.2287) loss 3.6002 (3.3815) grad_norm 3.2617 (2.2133) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][80/1251] eta 0:04:37 lr 0.000734 wd 0.0500 time 0.2297 (0.2367) data time 0.0006 (0.0084) model time 0.2291 (0.2288) loss 2.5501 (3.3938) grad_norm 2.4210 (2.2392) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][90/1251] eta 0:04:33 lr 0.000734 wd 0.0500 time 0.2243 (0.2357) data time 0.0008 (0.0076) model time 0.2235 (0.2281) loss 3.4780 (3.4024) grad_norm 2.0335 (2.2213) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][100/1251] eta 0:04:30 lr 0.000734 wd 0.0500 time 0.2323 (0.2350) data time 0.0011 (0.0070) model time 0.2312 (0.2280) loss 3.2391 (3.3789) grad_norm 2.2747 (2.1976) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][110/1251] eta 0:04:27 lr 0.000734 wd 0.0500 time 0.2234 (0.2345) data time 0.0011 (0.0064) model time 0.2223 (0.2281) loss 3.4951 (3.3777) grad_norm 1.9921 (2.1912) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][120/1251] eta 0:04:24 lr 0.000734 wd 0.0500 time 0.2314 (0.2341) data time 0.0008 (0.0060) model time 0.2307 (0.2282) loss 3.6663 (3.3960) grad_norm 1.9048 (2.1967) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][130/1251] 
eta 0:04:22 lr 0.000734 wd 0.0500 time 0.2361 (0.2337) data time 0.0011 (0.0056) model time 0.2351 (0.2281) loss 3.1248 (3.3946) grad_norm 1.9283 (2.1907) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][140/1251] eta 0:04:19 lr 0.000734 wd 0.0500 time 0.2349 (0.2333) data time 0.0007 (0.0053) model time 0.2342 (0.2280) loss 3.3399 (3.3754) grad_norm 1.9276 (2.1699) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][150/1251] eta 0:04:16 lr 0.000734 wd 0.0500 time 0.2206 (0.2333) data time 0.0010 (0.0053) model time 0.2195 (0.2279) loss 3.4351 (3.3696) grad_norm 1.9876 (2.1702) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][160/1251] eta 0:04:14 lr 0.000734 wd 0.0500 time 0.2315 (0.2331) data time 0.0010 (0.0051) model time 0.2305 (0.2280) loss 3.7327 (3.3459) grad_norm 2.9983 (2.1903) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][170/1251] eta 0:04:11 lr 0.000734 wd 0.0500 time 0.2314 (0.2329) data time 0.0006 (0.0048) model time 0.2308 (0.2281) loss 4.0704 (3.3665) grad_norm 1.6711 (2.1854) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][180/1251] eta 0:04:09 lr 0.000734 wd 0.0500 time 0.2306 (0.2326) data time 0.0007 (0.0046) model time 0.2299 (0.2280) loss 3.3416 (3.3637) grad_norm 1.8372 (2.1678) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][190/1251] eta 0:04:06 lr 0.000734 wd 0.0500 time 0.2553 (0.2327) data time 0.0010 (0.0044) model time 0.2543 (0.2284) loss 3.4737 (3.3531) grad_norm 2.0614 (2.1686) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-26 15:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][200/1251] eta 0:04:04 lr 0.000734 wd 0.0500 time 0.2261 (0.2325) data time 0.0011 (0.0043) model time 0.2249 (0.2283) loss 3.5538 (3.3408) grad_norm 1.7380 (2.1826) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][210/1251] eta 0:04:01 lr 0.000734 wd 0.0500 time 0.2320 (0.2324) data time 0.0009 (0.0041) model time 0.2311 (0.2283) loss 2.3137 (3.3218) grad_norm 1.8997 (2.1694) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][220/1251] eta 0:03:59 lr 0.000734 wd 0.0500 time 0.2448 (0.2322) data time 0.0007 (0.0040) model time 0.2441 (0.2282) loss 3.2829 (3.3251) grad_norm 1.9304 (2.1650) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][230/1251] eta 0:03:57 lr 0.000734 wd 0.0500 time 0.2182 (0.2330) data time 0.0009 (0.0039) model time 0.2174 (0.2294) loss 3.8474 (3.3311) grad_norm 1.9441 (2.1535) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][240/1251] eta 0:03:55 lr 0.000734 wd 0.0500 time 0.2323 (0.2329) data time 0.0015 (0.0037) model time 0.2308 (0.2295) loss 2.7168 (3.3318) grad_norm 2.6098 (2.1567) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][250/1251] eta 0:03:53 lr 0.000734 wd 0.0500 time 0.2229 (0.2329) data time 0.0013 (0.0036) model time 0.2216 (0.2296) loss 2.7041 (3.3321) grad_norm 2.7970 (2.1970) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][260/1251] eta 0:03:50 lr 0.000734 wd 0.0500 time 0.2270 (0.2327) data time 0.0009 (0.0035) model time 0.2261 
(0.2295) loss 3.2798 (3.3376) grad_norm 1.8750 (2.1996) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][270/1251] eta 0:03:48 lr 0.000734 wd 0.0500 time 0.2281 (0.2326) data time 0.0010 (0.0034) model time 0.2271 (0.2295) loss 3.3904 (3.3483) grad_norm 2.3608 (2.2049) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][280/1251] eta 0:03:45 lr 0.000734 wd 0.0500 time 0.2280 (0.2325) data time 0.0007 (0.0034) model time 0.2273 (0.2294) loss 2.7107 (3.3454) grad_norm 1.8345 (2.2011) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][290/1251] eta 0:03:43 lr 0.000734 wd 0.0500 time 0.2285 (0.2324) data time 0.0009 (0.0033) model time 0.2276 (0.2293) loss 3.9336 (3.3475) grad_norm 2.1027 (2.2111) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][300/1251] eta 0:03:40 lr 0.000733 wd 0.0500 time 0.2265 (0.2322) data time 0.0009 (0.0032) model time 0.2256 (0.2292) loss 2.8472 (3.3408) grad_norm 2.8881 (2.2137) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][310/1251] eta 0:03:39 lr 0.000733 wd 0.0500 time 0.2303 (0.2332) data time 0.0009 (0.0031) model time 0.2293 (0.2305) loss 3.9599 (3.3302) grad_norm 2.9446 (2.2453) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][320/1251] eta 0:03:37 lr 0.000733 wd 0.0500 time 0.2408 (0.2331) data time 0.0008 (0.0031) model time 0.2400 (0.2304) loss 3.7028 (3.3297) grad_norm 1.8675 (2.2353) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[117/300][330/1251] eta 0:03:34 lr 0.000733 wd 0.0500 time 0.2279 (0.2330) data time 0.0008 (0.0030) model time 0.2271 (0.2303) loss 3.0034 (3.3197) grad_norm 1.7561 (2.2349) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][340/1251] eta 0:03:32 lr 0.000733 wd 0.0500 time 0.2281 (0.2330) data time 0.0007 (0.0030) model time 0.2274 (0.2303) loss 2.4500 (3.3166) grad_norm 1.6852 (2.2306) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 15:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][350/1251] eta 0:03:29 lr 0.000733 wd 0.0500 time 0.2290 (0.2330) data time 0.0010 (0.0030) model time 0.2280 (0.2304) loss 3.5657 (3.3178) grad_norm 2.4778 (2.2317) loss_scale 4096.0000 (2083.0085) mem 7381MB [2024-08-26 15:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][360/1251] eta 0:03:27 lr 0.000733 wd 0.0500 time 0.2328 (0.2330) data time 0.0009 (0.0029) model time 0.2319 (0.2304) loss 3.5142 (3.3169) grad_norm 2.5482 (2.2325) loss_scale 4096.0000 (2138.7701) mem 7381MB [2024-08-26 15:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][370/1251] eta 0:03:25 lr 0.000733 wd 0.0500 time 0.2347 (0.2329) data time 0.0009 (0.0029) model time 0.2338 (0.2303) loss 3.8002 (3.3228) grad_norm 2.0845 (2.2271) loss_scale 4096.0000 (2191.5256) mem 7381MB [2024-08-26 15:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 15:40:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 15:40:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 15:43:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 16:01:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 16:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 16:28:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 16:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 16:33:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 16:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 16:33:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 16:33:42 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 16:33:43 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 16:33:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 117) [2024-08-26 16:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 16:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 16:34:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 16:34:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 16:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 16:36:10 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 16:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 16:45:40 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 16:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 16:45:49 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 16:45:50 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 16:45:51 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 16:45:51 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 117) [2024-08-26 16:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 16:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 16:50:24 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 16:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 16:50:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 16:50:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 16:50:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 16:50:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 117) [2024-08-26 16:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 16:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][380/1251] eta 1:28:37 lr 0.000733 wd 0.0500 time 0.3518 (6.1054) data time 0.0007 (0.3577) model time 0.3510 (5.7477) loss 3.9451 (3.9707) grad_norm 3.0641 (2.5679) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 16:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][390/1251] eta 0:17:18 lr 0.000733 wd 0.0500 time 0.2260 (1.2065) data time 0.0006 (0.0607) model time 0.2254 (1.1458) loss 2.8402 (3.6824) grad_norm 2.4375 (2.3405) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][400/1251] eta 0:10:48 lr 0.000733 wd 0.0500 time 0.2267 (0.7620) data time 0.0010 (0.0336) model time 0.2257 (0.7285) loss 3.4911 (3.6302) grad_norm 2.2379 (2.3455) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][410/1251] eta 0:08:20 lr 0.000733 wd 0.0500 time 0.2307 (0.5950) data time 0.0007 (0.0235) model time 0.2300 (0.5715) loss 3.2661 (3.5991) grad_norm 2.3025 (2.3679) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][420/1251] eta 0:07:02 lr 0.000733 wd 0.0500 time 0.2294 (0.5081) data time 0.0008 (0.0183) model time 0.2285 (0.4898) loss 3.6806 (3.5493) grad_norm 2.0805 (2.4229) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [117/300][430/1251] eta 0:06:12 lr 0.000733 wd 0.0500 time 0.2216 (0.4535) data time 0.0007 (0.0150) model time 0.2209 (0.4385) loss 3.4660 (3.5230) grad_norm 1.5857 (2.3697) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][440/1251] eta 0:05:38 lr 0.000733 wd 0.0500 time 0.2245 (0.4174) data time 0.0007 (0.0127) model time 0.2238 (0.4047) loss 3.8700 (3.5046) grad_norm 2.6737 (2.4200) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][450/1251] eta 0:05:12 lr 0.000733 wd 0.0500 time 0.2201 (0.3905) data time 0.0008 (0.0111) model time 0.2193 (0.3795) loss 3.7589 (3.4565) grad_norm 1.7074 (2.4069) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][460/1251] eta 0:04:53 lr 0.000733 wd 0.0500 time 0.2180 (0.3705) data time 0.0010 (0.0099) model time 0.2169 (0.3606) loss 3.2899 (3.4376) grad_norm 1.6534 (2.3428) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][470/1251] eta 0:04:36 lr 0.000733 wd 0.0500 time 0.2169 (0.3546) data time 0.0007 (0.0090) model time 0.2163 (0.3457) loss 2.3133 (3.4128) grad_norm 2.0788 (2.3133) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][480/1251] eta 0:04:23 lr 0.000733 wd 0.0500 time 0.2232 (0.3421) data time 0.0007 (0.0083) model time 0.2225 (0.3338) loss 3.6019 (3.4388) grad_norm 1.8554 (2.2694) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][490/1251] eta 0:04:12 lr 0.000733 wd 0.0500 time 0.2242 (0.3315) data time 0.0009 (0.0076) model time 0.2233 (0.3239) loss 3.7611 (3.4279) grad_norm 1.8992 (2.2674) loss_scale 
4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][500/1251] eta 0:04:02 lr 0.000733 wd 0.0500 time 0.2235 (0.3228) data time 0.0007 (0.0071) model time 0.2228 (0.3157) loss 3.4862 (3.4352) grad_norm 1.6734 (2.2349) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][510/1251] eta 0:03:53 lr 0.000733 wd 0.0500 time 0.2209 (0.3151) data time 0.0009 (0.0066) model time 0.2200 (0.3085) loss 3.2946 (3.4211) grad_norm 2.0028 (2.2192) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][520/1251] eta 0:03:45 lr 0.000733 wd 0.0500 time 0.2282 (0.3088) data time 0.0010 (0.0062) model time 0.2272 (0.3026) loss 3.4102 (3.4103) grad_norm 4.4171 (2.2188) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][530/1251] eta 0:03:38 lr 0.000733 wd 0.0500 time 0.2173 (0.3032) data time 0.0010 (0.0059) model time 0.2163 (0.2973) loss 3.7595 (3.4114) grad_norm 1.8771 (2.2312) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][540/1251] eta 0:03:32 lr 0.000733 wd 0.0500 time 0.2251 (0.2984) data time 0.0009 (0.0056) model time 0.2242 (0.2928) loss 3.2511 (3.4108) grad_norm 1.5968 (2.2167) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][550/1251] eta 0:03:26 lr 0.000732 wd 0.0500 time 0.2226 (0.2939) data time 0.0008 (0.0053) model time 0.2219 (0.2886) loss 2.8926 (3.3985) grad_norm 2.3549 (2.2017) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][560/1251] eta 0:03:20 lr 0.000732 wd 0.0500 time 0.2238 (0.2901) data time 
0.0009 (0.0051) model time 0.2229 (0.2850) loss 3.7935 (3.3900) grad_norm 2.1303 (2.2024) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][570/1251] eta 0:03:15 lr 0.000732 wd 0.0500 time 0.2239 (0.2868) data time 0.0008 (0.0049) model time 0.2231 (0.2819) loss 4.0908 (3.3927) grad_norm 1.6543 (2.1936) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 16:51:42 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 16:51:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 16:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 16:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 16:53:43 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 16:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 16:53:56 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 16:53:57 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 16:53:58 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 16:53:58 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 117) [2024-08-26 16:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 16:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][580/1251] eta 3:05:17 lr 0.000732 wd 0.0500 time 16.5678 (16.5678) data time 0.7221 (0.7221) model time 15.8458 (15.8458) loss 4.1021 (4.1021) grad_norm 2.3032 (2.3032) loss_scale 4096.0000 (4096.0000) mem 20033MB [2024-08-26 16:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][590/1251] eta 0:19:07 lr 0.000732 wd 0.0500 time 0.2399 (1.7366) data time 0.0012 (0.0666) model time 0.2387 (1.6700) loss 2.8949 (3.7245) grad_norm 1.8392 (2.3424) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][600/1251] eta 0:11:06 lr 0.000732 wd 0.0500 time 0.2491 (1.0243) data time 0.0010 (0.0355) model time 0.2481 (0.9888) loss 3.7008 (3.6380) grad_norm 2.8306 (2.6195) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][610/1251] eta 0:08:14 lr 0.000732 wd 0.0500 time 0.2365 (0.7712) data time 0.0008 (0.0244) model time 0.2357 (0.7468) loss 2.6003 (3.6294) grad_norm 1.5749 (2.5331) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][620/1251] eta 0:06:44 lr 0.000732 wd 0.0500 time 0.2268 (0.6417) data time 0.0011 (0.0187) model time 0.2256 (0.6230) loss 3.1134 (3.5905) grad_norm 1.9235 (2.4698) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [117/300][630/1251] eta 0:05:49 lr 0.000732 wd 0.0500 time 0.2425 (0.5630) data time 0.0008 (0.0153) model time 0.2417 (0.5477) loss 3.7509 (3.5649) grad_norm 2.6063 (2.3807) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 16:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 16:54:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 16:54:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 17:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 17:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 17:10:03 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 17:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 17:10:14 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 17:10:16 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 17:10:17 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 17:10:17 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 117) [2024-08-26 17:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 17:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][640/1251] eta 0:49:16 lr 0.000732 wd 0.0500 time 0.2290 (4.8394) data time 0.0007 (0.4974) model time 0.2283 (4.3420) loss 2.9223 (3.7282) grad_norm 1.8316 (2.3163) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][650/1251] eta 0:12:58 lr 0.000732 wd 0.0500 time 0.2413 (1.2959) data time 0.0010 (0.1157) model time 0.2403 (1.1801) loss 3.4577 (3.6109) grad_norm 1.9108 (2.1707) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][660/1251] eta 0:08:11 lr 0.000732 wd 0.0500 time 0.2240 (0.8321) data time 0.0009 (0.0661) model time 0.2230 (0.7661) loss 3.7937 (3.5710) grad_norm 2.1043 (2.1998) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][670/1251] eta 0:06:17 lr 0.000732 wd 0.0500 time 0.2259 (0.6498) data time 0.0010 (0.0464) model time 0.2249 (0.6035) loss 4.0989 (3.6242) grad_norm 2.0787 (2.1710) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][680/1251] eta 0:05:15 lr 0.000732 wd 0.0500 time 0.2319 (0.5525) data time 0.0014 (0.0358) model time 0.2305 (0.5166) loss 3.4687 (3.5684) grad_norm 1.9035 (2.1376) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [117/300][690/1251] eta 0:04:36 lr 0.000732 wd 0.0500 time 0.2246 (0.4925) data time 0.0010 (0.0298) model time 0.2236 (0.4628) loss 3.5543 (3.5682) grad_norm 1.8308 (2.0947) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][700/1251] eta 0:04:08 lr 0.000732 wd 0.0500 time 0.2363 (0.4508) data time 0.0009 (0.0252) model time 0.2354 (0.4257) loss 3.0692 (3.5435) grad_norm 1.7606 (2.1190) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][710/1251] eta 0:03:47 lr 0.000732 wd 0.0500 time 0.2211 (0.4212) data time 0.0013 (0.0219) model time 0.2199 (0.3993) loss 3.6952 (3.5121) grad_norm 2.3585 (2.1165) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][720/1251] eta 0:03:31 lr 0.000732 wd 0.0500 time 0.2255 (0.3983) data time 0.0008 (0.0194) model time 0.2247 (0.3789) loss 2.8842 (3.4860) grad_norm 2.2402 (2.1028) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][730/1251] eta 0:03:18 lr 0.000732 wd 0.0500 time 0.2326 (0.3802) data time 0.0007 (0.0174) model time 0.2319 (0.3628) loss 3.8453 (3.4680) grad_norm 2.2471 (2.1251) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][740/1251] eta 0:03:06 lr 0.000732 wd 0.0500 time 0.2325 (0.3656) data time 0.0008 (0.0158) model time 0.2317 (0.3498) loss 3.8698 (3.4825) grad_norm 1.7245 (2.1344) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][750/1251] eta 0:02:57 lr 0.000732 wd 0.0500 time 0.2321 (0.3535) data time 0.0010 (0.0145) model time 0.2311 (0.3390) loss 3.1168 (3.4546) grad_norm 2.0523 (2.1173) loss_scale 
4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][760/1251] eta 0:02:48 lr 0.000732 wd 0.0500 time 0.2279 (0.3434) data time 0.0010 (0.0135) model time 0.2269 (0.3300) loss 3.1567 (3.4550) grad_norm 1.7743 (2.1449) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][770/1251] eta 0:02:41 lr 0.000732 wd 0.0500 time 0.2359 (0.3349) data time 0.0010 (0.0125) model time 0.2349 (0.3224) loss 3.8570 (3.4451) grad_norm 2.1883 (2.1471) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][780/1251] eta 0:02:34 lr 0.000732 wd 0.0500 time 0.2251 (0.3277) data time 0.0009 (0.0117) model time 0.2242 (0.3159) loss 3.6139 (3.4235) grad_norm 4.1441 (2.1890) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][790/1251] eta 0:02:28 lr 0.000732 wd 0.0500 time 0.2272 (0.3213) data time 0.0007 (0.0111) model time 0.2264 (0.3102) loss 3.2002 (3.4121) grad_norm 4.3606 (2.2576) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][800/1251] eta 0:02:22 lr 0.000732 wd 0.0500 time 0.2340 (0.3157) data time 0.0008 (0.0104) model time 0.2331 (0.3052) loss 2.9530 (3.4106) grad_norm 2.0029 (2.2509) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][810/1251] eta 0:02:16 lr 0.000731 wd 0.0500 time 0.2253 (0.3106) data time 0.0012 (0.0099) model time 0.2241 (0.3007) loss 3.8983 (3.4075) grad_norm 2.0296 (2.2383) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][820/1251] eta 0:02:11 lr 0.000731 wd 0.0500 time 0.2239 (0.3062) data time 
0.0012 (0.0094) model time 0.2227 (0.2968) loss 3.7174 (3.3924) grad_norm 1.7125 (2.2260) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][830/1251] eta 0:02:07 lr 0.000731 wd 0.0500 time 0.2367 (0.3023) data time 0.0008 (0.0090) model time 0.2359 (0.2933) loss 3.2699 (3.3905) grad_norm 1.8007 (2.2140) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][840/1251] eta 0:02:02 lr 0.000731 wd 0.0500 time 0.2243 (0.2989) data time 0.0010 (0.0086) model time 0.2234 (0.2903) loss 2.5672 (3.3730) grad_norm 2.6715 (2.2062) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][850/1251] eta 0:01:58 lr 0.000731 wd 0.0500 time 0.2314 (0.2957) data time 0.0009 (0.0083) model time 0.2305 (0.2874) loss 2.3346 (3.3638) grad_norm 1.8161 (2.2019) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][860/1251] eta 0:01:54 lr 0.000731 wd 0.0500 time 0.2252 (0.2926) data time 0.0007 (0.0079) model time 0.2245 (0.2847) loss 2.8340 (3.3611) grad_norm 2.1210 (2.1992) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][870/1251] eta 0:01:50 lr 0.000731 wd 0.0500 time 0.2282 (0.2899) data time 0.0010 (0.0076) model time 0.2272 (0.2822) loss 2.6235 (3.3543) grad_norm 2.4297 (2.1956) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][880/1251] eta 0:01:46 lr 0.000731 wd 0.0500 time 0.2316 (0.2874) data time 0.0010 (0.0074) model time 0.2306 (0.2801) loss 3.6053 (3.3573) grad_norm 2.7107 (2.2330) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [117/300][890/1251] eta 0:01:42 lr 0.000731 wd 0.0500 time 0.2314 (0.2852) data time 0.0011 (0.0071) model time 0.2304 (0.2781) loss 3.6128 (3.3490) grad_norm 1.3057 (2.2179) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][900/1251] eta 0:01:39 lr 0.000731 wd 0.0500 time 0.2252 (0.2830) data time 0.0012 (0.0069) model time 0.2240 (0.2761) loss 3.0262 (3.3395) grad_norm 2.4508 (2.2098) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][910/1251] eta 0:01:35 lr 0.000731 wd 0.0500 time 0.2253 (0.2811) data time 0.0009 (0.0067) model time 0.2244 (0.2745) loss 4.2888 (3.3342) grad_norm 1.4403 (2.2061) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][920/1251] eta 0:01:32 lr 0.000731 wd 0.0500 time 0.2301 (0.2793) data time 0.0012 (0.0065) model time 0.2289 (0.2729) loss 3.5459 (3.3323) grad_norm 2.4366 (2.2079) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][930/1251] eta 0:01:29 lr 0.000731 wd 0.0500 time 0.2259 (0.2785) data time 0.0009 (0.0063) model time 0.2250 (0.2722) loss 2.6435 (3.3256) grad_norm 2.2067 (2.2057) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][940/1251] eta 0:01:26 lr 0.000731 wd 0.0500 time 0.2306 (0.2770) data time 0.0007 (0.0062) model time 0.2299 (0.2708) loss 3.3077 (3.3203) grad_norm 1.5274 (2.2089) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][950/1251] eta 0:01:23 lr 0.000731 wd 0.0500 time 0.2296 (0.2763) data time 0.0007 (0.0060) model time 0.2288 (0.2703) loss 4.2213 (3.3228) grad_norm 2.4641 (2.2055) 
loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][960/1251] eta 0:01:19 lr 0.000731 wd 0.0500 time 0.2252 (0.2749) data time 0.0009 (0.0059) model time 0.2243 (0.2690) loss 3.6437 (3.3312) grad_norm 1.9778 (2.2044) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][970/1251] eta 0:01:16 lr 0.000731 wd 0.0500 time 0.2349 (0.2736) data time 0.0008 (0.0057) model time 0.2340 (0.2679) loss 3.3841 (3.3317) grad_norm 1.7572 (2.2039) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][980/1251] eta 0:01:13 lr 0.000731 wd 0.0500 time 0.2292 (0.2724) data time 0.0011 (0.0056) model time 0.2281 (0.2668) loss 3.3231 (3.3351) grad_norm 2.1627 (2.2129) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][990/1251] eta 0:01:10 lr 0.000731 wd 0.0500 time 0.2341 (0.2712) data time 0.0008 (0.0055) model time 0.2333 (0.2657) loss 4.1062 (3.3381) grad_norm 3.7666 (2.2128) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1000/1251] eta 0:01:07 lr 0.000731 wd 0.0500 time 0.2348 (0.2702) data time 0.0006 (0.0053) model time 0.2342 (0.2648) loss 3.7494 (3.3372) grad_norm 2.0112 (2.2139) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1010/1251] eta 0:01:04 lr 0.000731 wd 0.0500 time 0.2266 (0.2691) data time 0.0009 (0.0052) model time 0.2256 (0.2638) loss 3.4353 (3.3331) grad_norm 1.8628 (2.2141) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1020/1251] eta 0:01:01 lr 0.000731 wd 0.0500 time 0.2273 
(0.2680) data time 0.0010 (0.0051) model time 0.2262 (0.2629) loss 2.6096 (3.3282) grad_norm 1.8691 (2.2119) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1030/1251] eta 0:00:59 lr 0.000731 wd 0.0500 time 0.2289 (0.2670) data time 0.0009 (0.0050) model time 0.2280 (0.2620) loss 4.0009 (3.3271) grad_norm 2.2097 (2.2123) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1040/1251] eta 0:00:56 lr 0.000731 wd 0.0500 time 0.2301 (0.2661) data time 0.0012 (0.0049) model time 0.2289 (0.2612) loss 2.9799 (3.3353) grad_norm 2.3633 (2.2082) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1050/1251] eta 0:00:53 lr 0.000731 wd 0.0500 time 0.2262 (0.2652) data time 0.0011 (0.0048) model time 0.2251 (0.2604) loss 3.5840 (3.3410) grad_norm 1.8698 (2.2051) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1060/1251] eta 0:00:50 lr 0.000730 wd 0.0500 time 0.2279 (0.2645) data time 0.0008 (0.0048) model time 0.2271 (0.2597) loss 2.4704 (3.3365) grad_norm 1.6277 (2.1986) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1070/1251] eta 0:00:47 lr 0.000730 wd 0.0500 time 0.2231 (0.2637) data time 0.0008 (0.0047) model time 0.2223 (0.2591) loss 4.3102 (3.3451) grad_norm 2.3232 (2.1990) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1080/1251] eta 0:00:44 lr 0.000730 wd 0.0500 time 0.2233 (0.2630) data time 0.0009 (0.0046) model time 0.2224 (0.2584) loss 3.2659 (3.3478) grad_norm 2.0267 (2.1985) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:20 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1090/1251] eta 0:00:42 lr 0.000730 wd 0.0500 time 0.2236 (0.2622) data time 0.0008 (0.0045) model time 0.2228 (0.2577) loss 2.5018 (3.3438) grad_norm 1.6655 (2.1940) loss_scale 4096.0000 (4096.0000) mem 7373MB [2024-08-26 17:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 17:12:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 17:12:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 17:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 17:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 17:14:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 17:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 17:14:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 17:14:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 17:14:40 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 17:14:40 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 117) [2024-08-26 17:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 17:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1100/1251] eta 0:07:37 lr 0.000730 wd 0.0500 time 0.2338 (3.0313) data time 0.0007 (0.1889) model time 0.2331 (2.8424) loss 3.9086 (3.9655) grad_norm 2.1663 (1.9871) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1110/1251] eta 0:02:44 lr 0.000730 wd 0.0500 time 0.2242 (1.1635) data time 0.0010 (0.0637) model time 0.2232 (1.0998) loss 3.9329 (3.7181) grad_norm 2.1767 (2.1783) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1120/1251] eta 0:01:43 lr 0.000730 wd 0.0500 time 0.2204 (0.7889) data time 0.0009 (0.0386) model time 0.2195 (0.7503) loss 3.6296 (3.6576) grad_norm 2.4106 (2.1343) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1130/1251] eta 0:01:16 lr 0.000730 wd 0.0500 time 0.2361 (0.6287) data time 0.0009 (0.0278) model time 0.2352 (0.6008) loss 3.5431 (3.6281) grad_norm 2.4939 (2.1219) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1140/1251] eta 0:00:59 lr 0.000730 wd 0.0500 time 0.2333 (0.5401) data time 0.0009 (0.0219) model time 0.2324 (0.5182) loss 3.3739 (3.5635) grad_norm 1.6175 (2.0789) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [117/300][1150/1251] eta 0:00:48 lr 0.000730 wd 0.0500 time 0.2304 (0.4832) data time 0.0007 (0.0181) model time 0.2296 (0.4651) loss 2.6673 (3.5364) grad_norm 1.5598 (2.0606) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1160/1251] eta 0:00:40 lr 0.000730 wd 0.0500 time 0.2264 (0.4441) data time 0.0013 (0.0155) model time 0.2251 (0.4286) loss 3.6867 (3.5107) grad_norm 2.4815 (2.0832) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1170/1251] eta 0:00:33 lr 0.000730 wd 0.0500 time 0.2317 (0.4153) data time 0.0009 (0.0135) model time 0.2308 (0.4017) loss 2.2960 (3.4519) grad_norm 2.7064 (2.0920) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1180/1251] eta 0:00:27 lr 0.000730 wd 0.0500 time 0.2248 (0.3933) data time 0.0007 (0.0121) model time 0.2241 (0.3813) loss 3.0218 (3.4184) grad_norm 1.6536 (2.0931) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1190/1251] eta 0:00:22 lr 0.000730 wd 0.0500 time 0.2241 (0.3760) data time 0.0012 (0.0109) model time 0.2229 (0.3651) loss 3.7419 (3.4183) grad_norm 1.9874 (2.0996) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1200/1251] eta 0:00:18 lr 0.000730 wd 0.0500 time 0.2370 (0.3622) data time 0.0009 (0.0100) model time 0.2361 (0.3522) loss 2.9989 (3.4468) grad_norm 2.7596 (2.1052) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1210/1251] eta 0:00:14 lr 0.000730 wd 0.0500 time 0.2282 (0.3505) data time 0.0007 (0.0092) model time 0.2276 (0.3413) loss 2.5926 (3.4313) grad_norm 2.4697 (2.1128) loss_scale 
4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1220/1251] eta 0:00:10 lr 0.000730 wd 0.0500 time 0.2318 (0.3408) data time 0.0007 (0.0085) model time 0.2311 (0.3323) loss 3.1975 (3.4232) grad_norm 2.8446 (2.1363) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1230/1251] eta 0:00:06 lr 0.000730 wd 0.0500 time 0.2373 (0.3324) data time 0.0008 (0.0080) model time 0.2365 (0.3245) loss 3.2609 (3.4166) grad_norm 2.1023 (2.1250) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1240/1251] eta 0:00:03 lr 0.000730 wd 0.0500 time 0.2114 (0.3245) data time 0.0007 (0.0075) model time 0.2107 (0.3170) loss 3.2629 (3.4048) grad_norm 1.8412 (2.1255) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [117/300][1250/1251] eta 0:00:00 lr 0.000730 wd 0.0500 time 0.2127 (0.3173) data time 0.0007 (0.0071) model time 0.2120 (0.3102) loss 3.4152 (3.3943) grad_norm 1.6086 (2.1242) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 17:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 117 training takes 0:00:49 [2024-08-26 17:15:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 17:15:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 17:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.468 (0.468) Loss 0.5254 (0.5254) Acc@1 89.551 (89.551) Acc@5 98.145 (98.145) Mem 7377MB [2024-08-26 17:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.114) Loss 0.7808 (0.7939) Acc@1 83.398 (83.416) Acc@5 96.680 (96.689) Mem 7377MB [2024-08-26 17:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.100) Loss 1.1738 (0.8318) Acc@1 72.656 (82.171) Acc@5 92.676 (96.540) Mem 7377MB [2024-08-26 17:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.094) Loss 1.4023 (0.9372) Acc@1 65.820 (79.631) Acc@5 89.551 (95.202) Mem 7377MB [2024-08-26 17:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.2686 (0.9940) Acc@1 71.875 (78.256) Acc@5 90.527 (94.493) Mem 7377MB [2024-08-26 17:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.874 Acc@5 94.386 [2024-08-26 17:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.9% [2024-08-26 17:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.935 (0.935) Loss 0.4219 (0.4219) Acc@1 91.895 (91.895) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-26 17:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.162) Loss 0.6836 (0.6683) Acc@1 86.133 (85.458) Acc@5 96.680 (97.203) Mem 7377MB [2024-08-26 17:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.124) Loss 0.9521 (0.6903) Acc@1 78.320 (84.566) Acc@5 94.727 (97.224) Mem 7377MB [2024-08-26 17:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.109) Loss 1.2227 (0.7868) Acc@1 68.750 (82.299) Acc@5 91.406 (96.103) Mem 7377MB [2024-08-26 17:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.098) Loss 1.1055 
(0.8370) Acc@1 72.754 (80.900) Acc@5 92.969 (95.563) Mem 7377MB [2024-08-26 17:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.524 Acc@5 95.498 [2024-08-26 17:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.5% [2024-08-26 17:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.52% [2024-08-26 17:15:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 17:15:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 17:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][0/1251] eta 0:18:54 lr 0.000730 wd 0.0500 time 0.9070 (0.9070) data time 0.6372 (0.6372) model time 0.0000 (0.0000) loss 3.3439 (3.3439) grad_norm 3.4007 (3.4007) loss_scale 4096.0000 (4096.0000) mem 7383MB [2024-08-26 17:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][10/1251] eta 0:05:59 lr 0.000730 wd 0.0500 time 0.2305 (0.2895) data time 0.0011 (0.0591) model time 0.0000 (0.0000) loss 3.3315 (3.3779) grad_norm 2.3919 (2.5967) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][20/1251] eta 0:05:19 lr 0.000730 wd 0.0500 time 0.2236 (0.2596) data time 0.0007 (0.0314) model time 0.0000 (0.0000) loss 3.3043 (3.2691) grad_norm 1.8441 (2.5140) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][30/1251] eta 0:05:04 lr 0.000730 wd 0.0500 time 0.2393 (0.2497) data time 0.0009 (0.0216) model time 0.0000 (0.0000) loss 2.9054 (3.2548) grad_norm 1.8430 (2.3380) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][40/1251] eta 
0:04:55 lr 0.000730 wd 0.0500 time 0.2451 (0.2442) data time 0.0007 (0.0166) model time 0.0000 (0.0000) loss 3.0600 (3.2575) grad_norm 1.5462 (2.2642) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][50/1251] eta 0:04:49 lr 0.000730 wd 0.0500 time 0.2452 (0.2412) data time 0.0010 (0.0135) model time 0.0000 (0.0000) loss 2.0523 (3.2192) grad_norm 1.7233 (2.2301) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][60/1251] eta 0:04:44 lr 0.000729 wd 0.0500 time 0.2319 (0.2390) data time 0.0007 (0.0115) model time 0.2313 (0.2271) loss 2.6048 (3.2393) grad_norm 1.5416 (2.2101) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][70/1251] eta 0:04:40 lr 0.000729 wd 0.0500 time 0.2281 (0.2374) data time 0.0008 (0.0100) model time 0.2273 (0.2266) loss 3.2863 (3.2721) grad_norm 2.0572 (2.2174) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][80/1251] eta 0:04:36 lr 0.000729 wd 0.0500 time 0.2288 (0.2362) data time 0.0008 (0.0089) model time 0.2281 (0.2268) loss 3.2927 (3.2612) grad_norm 2.2702 (2.1852) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][90/1251] eta 0:04:33 lr 0.000729 wd 0.0500 time 0.2320 (0.2354) data time 0.0009 (0.0080) model time 0.2311 (0.2270) loss 2.1362 (3.2555) grad_norm 2.5344 (2.2076) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][100/1251] eta 0:04:29 lr 0.000729 wd 0.0500 time 0.2220 (0.2345) data time 0.0010 (0.0073) model time 0.2211 (0.2268) loss 2.4468 (3.2269) grad_norm 2.2109 (2.2323) loss_scale 4096.0000 (4096.0000) mem 7382MB 
[2024-08-26 17:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][110/1251] eta 0:04:26 lr 0.000729 wd 0.0500 time 0.2313 (0.2339) data time 0.0008 (0.0067) model time 0.2305 (0.2268) loss 2.8620 (3.2089) grad_norm 1.6022 (2.2580) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][120/1251] eta 0:04:24 lr 0.000729 wd 0.0500 time 0.2337 (0.2336) data time 0.0009 (0.0063) model time 0.2327 (0.2271) loss 3.6512 (3.2311) grad_norm 1.9405 (2.2547) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][130/1251] eta 0:04:21 lr 0.000729 wd 0.0500 time 0.2267 (0.2333) data time 0.0010 (0.0059) model time 0.2258 (0.2272) loss 2.6065 (3.2209) grad_norm 1.8509 (2.2643) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][140/1251] eta 0:04:20 lr 0.000729 wd 0.0500 time 0.2270 (0.2348) data time 0.0007 (0.0056) model time 0.2263 (0.2302) loss 2.4667 (3.2133) grad_norm 1.8506 (2.2607) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][150/1251] eta 0:04:18 lr 0.000729 wd 0.0500 time 0.2226 (0.2344) data time 0.0010 (0.0053) model time 0.2216 (0.2299) loss 2.1414 (3.1971) grad_norm 2.5294 (2.2626) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][160/1251] eta 0:04:16 lr 0.000729 wd 0.0500 time 0.2350 (0.2356) data time 0.0009 (0.0050) model time 0.2341 (0.2319) loss 3.8594 (3.2167) grad_norm 2.4431 (2.2837) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][170/1251] eta 0:04:14 lr 0.000729 wd 0.0500 time 0.2254 (0.2350) data time 0.0012 (0.0048) model time 0.2242 
(0.2314) loss 3.7959 (3.2427) grad_norm 1.8752 (2.2883) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][180/1251] eta 0:04:11 lr 0.000729 wd 0.0500 time 0.2206 (0.2345) data time 0.0009 (0.0046) model time 0.2198 (0.2309) loss 3.4962 (3.2416) grad_norm 2.3345 (2.2857) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][190/1251] eta 0:04:08 lr 0.000729 wd 0.0500 time 0.2232 (0.2342) data time 0.0014 (0.0044) model time 0.2218 (0.2306) loss 3.9150 (3.2511) grad_norm 1.8416 (2.2714) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][200/1251] eta 0:04:05 lr 0.000729 wd 0.0500 time 0.2248 (0.2338) data time 0.0009 (0.0042) model time 0.2239 (0.2302) loss 2.1883 (3.2548) grad_norm 1.8370 (2.2513) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][210/1251] eta 0:04:03 lr 0.000729 wd 0.0500 time 0.2360 (0.2335) data time 0.0008 (0.0041) model time 0.2352 (0.2299) loss 3.3918 (3.2594) grad_norm 1.9559 (2.2723) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][220/1251] eta 0:04:00 lr 0.000729 wd 0.0500 time 0.2294 (0.2332) data time 0.0008 (0.0040) model time 0.2286 (0.2298) loss 1.8849 (3.2511) grad_norm 2.9283 (2.2890) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][230/1251] eta 0:03:57 lr 0.000729 wd 0.0500 time 0.2314 (0.2330) data time 0.0010 (0.0038) model time 0.2304 (0.2297) loss 4.0837 (3.2465) grad_norm 1.7717 (2.2935) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[118/300][240/1251] eta 0:03:55 lr 0.000729 wd 0.0500 time 0.2288 (0.2329) data time 0.0008 (0.0037) model time 0.2281 (0.2296) loss 4.0709 (3.2558) grad_norm 2.0272 (2.2831) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][250/1251] eta 0:03:52 lr 0.000729 wd 0.0500 time 0.2319 (0.2327) data time 0.0007 (0.0036) model time 0.2312 (0.2295) loss 3.6872 (3.2657) grad_norm 2.6720 (2.2679) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][260/1251] eta 0:03:50 lr 0.000729 wd 0.0500 time 0.2390 (0.2326) data time 0.0009 (0.0035) model time 0.2381 (0.2295) loss 2.9904 (3.2684) grad_norm 2.7008 (2.2665) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][270/1251] eta 0:03:48 lr 0.000729 wd 0.0500 time 0.2211 (0.2325) data time 0.0011 (0.0034) model time 0.2200 (0.2294) loss 3.5604 (3.2712) grad_norm 2.0365 (2.2633) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][280/1251] eta 0:03:45 lr 0.000729 wd 0.0500 time 0.2207 (0.2324) data time 0.0010 (0.0033) model time 0.2197 (0.2293) loss 3.3173 (3.2857) grad_norm 2.0109 (2.2700) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][290/1251] eta 0:03:43 lr 0.000729 wd 0.0500 time 0.2275 (0.2322) data time 0.0007 (0.0032) model time 0.2268 (0.2293) loss 3.3842 (3.2937) grad_norm 1.5075 (2.2719) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][300/1251] eta 0:03:40 lr 0.000729 wd 0.0500 time 0.2389 (0.2321) data time 0.0006 (0.0032) model time 0.2382 (0.2292) loss 3.2752 (3.2876) grad_norm 2.4121 (2.2628) loss_scale 4096.0000 
(4096.0000) mem 7382MB [2024-08-26 17:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][310/1251] eta 0:03:38 lr 0.000728 wd 0.0500 time 0.2263 (0.2320) data time 0.0009 (0.0031) model time 0.2253 (0.2292) loss 2.5753 (3.2797) grad_norm 1.8110 (2.2592) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][320/1251] eta 0:03:35 lr 0.000728 wd 0.0500 time 0.2268 (0.2319) data time 0.0007 (0.0030) model time 0.2261 (0.2291) loss 2.8856 (3.2736) grad_norm 1.9745 (2.2494) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][330/1251] eta 0:03:33 lr 0.000728 wd 0.0500 time 0.2253 (0.2317) data time 0.0010 (0.0030) model time 0.2243 (0.2290) loss 3.6530 (3.2841) grad_norm 2.2093 (2.2602) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][340/1251] eta 0:03:31 lr 0.000728 wd 0.0500 time 0.2244 (0.2316) data time 0.0008 (0.0029) model time 0.2236 (0.2289) loss 3.1672 (3.2832) grad_norm 2.0882 (2.2566) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][350/1251] eta 0:03:28 lr 0.000728 wd 0.0500 time 0.2225 (0.2315) data time 0.0009 (0.0029) model time 0.2217 (0.2288) loss 3.1683 (3.2856) grad_norm 2.0751 (2.2503) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][360/1251] eta 0:03:26 lr 0.000728 wd 0.0500 time 0.2238 (0.2314) data time 0.0010 (0.0028) model time 0.2229 (0.2288) loss 3.1387 (3.2958) grad_norm 1.4997 (2.2458) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][370/1251] eta 0:03:23 lr 0.000728 wd 0.0500 time 0.2306 (0.2314) data time 0.0011 
(0.0028) model time 0.2295 (0.2288) loss 2.1785 (3.2858) grad_norm 2.3452 (2.2450) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][380/1251] eta 0:03:21 lr 0.000728 wd 0.0500 time 0.2316 (0.2313) data time 0.0009 (0.0027) model time 0.2307 (0.2287) loss 3.4176 (3.2837) grad_norm 2.0343 (2.2419) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][390/1251] eta 0:03:19 lr 0.000728 wd 0.0500 time 0.2328 (0.2312) data time 0.0007 (0.0027) model time 0.2321 (0.2287) loss 2.6334 (3.2828) grad_norm 1.7233 (2.2426) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][400/1251] eta 0:03:16 lr 0.000728 wd 0.0500 time 0.2273 (0.2312) data time 0.0011 (0.0026) model time 0.2262 (0.2287) loss 3.7963 (3.2915) grad_norm 1.7286 (2.2335) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][410/1251] eta 0:03:14 lr 0.000728 wd 0.0500 time 0.2290 (0.2311) data time 0.0007 (0.0026) model time 0.2282 (0.2286) loss 2.3758 (3.2922) grad_norm 2.4018 (2.2262) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][420/1251] eta 0:03:12 lr 0.000728 wd 0.0500 time 0.2325 (0.2311) data time 0.0009 (0.0025) model time 0.2316 (0.2286) loss 2.4491 (3.2928) grad_norm 1.4973 (2.2280) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][430/1251] eta 0:03:09 lr 0.000728 wd 0.0500 time 0.2279 (0.2310) data time 0.0011 (0.0025) model time 0.2268 (0.2286) loss 3.4696 (3.2975) grad_norm 2.2365 (2.2333) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [118/300][440/1251] eta 0:03:07 lr 0.000728 wd 0.0500 time 0.2322 (0.2310) data time 0.0008 (0.0025) model time 0.2314 (0.2286) loss 2.7322 (3.3015) grad_norm 2.0190 (2.2374) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][450/1251] eta 0:03:04 lr 0.000728 wd 0.0500 time 0.2283 (0.2309) data time 0.0007 (0.0024) model time 0.2276 (0.2285) loss 3.4276 (3.3035) grad_norm 2.3506 (2.2393) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][460/1251] eta 0:03:02 lr 0.000728 wd 0.0500 time 0.2324 (0.2308) data time 0.0011 (0.0024) model time 0.2312 (0.2285) loss 3.0356 (3.3049) grad_norm 2.1040 (2.2359) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][470/1251] eta 0:03:00 lr 0.000728 wd 0.0500 time 0.2211 (0.2308) data time 0.0009 (0.0024) model time 0.2202 (0.2285) loss 2.5782 (3.3094) grad_norm 1.8159 (2.2345) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][480/1251] eta 0:02:57 lr 0.000728 wd 0.0500 time 0.2270 (0.2308) data time 0.0009 (0.0024) model time 0.2262 (0.2285) loss 3.4047 (3.3139) grad_norm 1.7272 (2.2303) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][490/1251] eta 0:02:55 lr 0.000728 wd 0.0500 time 0.2313 (0.2308) data time 0.0011 (0.0023) model time 0.2302 (0.2286) loss 3.6641 (3.3085) grad_norm 2.0709 (2.2361) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][500/1251] eta 0:02:53 lr 0.000728 wd 0.0500 time 0.2377 (0.2307) data time 0.0009 (0.0023) model time 0.2368 (0.2285) loss 2.8538 (3.3082) grad_norm 1.8151 (2.2416) loss_scale 
4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][510/1251] eta 0:02:50 lr 0.000728 wd 0.0500 time 0.2282 (0.2306) data time 0.0007 (0.0023) model time 0.2275 (0.2285) loss 3.8755 (3.3024) grad_norm 1.9250 (2.2383) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][520/1251] eta 0:02:48 lr 0.000728 wd 0.0500 time 0.2406 (0.2306) data time 0.0007 (0.0022) model time 0.2399 (0.2284) loss 3.1773 (3.3087) grad_norm 2.4435 (2.2394) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][530/1251] eta 0:02:46 lr 0.000728 wd 0.0500 time 0.2253 (0.2305) data time 0.0007 (0.0022) model time 0.2246 (0.2283) loss 2.9119 (3.3110) grad_norm 1.9790 (2.2434) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][540/1251] eta 0:02:43 lr 0.000728 wd 0.0500 time 0.2302 (0.2305) data time 0.0007 (0.0022) model time 0.2295 (0.2283) loss 2.6679 (3.3086) grad_norm 2.1012 (2.2453) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][550/1251] eta 0:02:41 lr 0.000728 wd 0.0500 time 0.2277 (0.2305) data time 0.0009 (0.0022) model time 0.2268 (0.2284) loss 3.9479 (3.3109) grad_norm 1.5411 (2.2464) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][560/1251] eta 0:02:39 lr 0.000728 wd 0.0500 time 0.2251 (0.2305) data time 0.0011 (0.0022) model time 0.2240 (0.2284) loss 3.4351 (3.3054) grad_norm 1.8769 (2.2439) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][570/1251] eta 0:02:36 lr 0.000727 wd 0.0500 time 0.2271 (0.2305) data time 
0.0007 (0.0022) model time 0.2264 (0.2284) loss 3.6937 (3.3033) grad_norm 1.8115 (2.2373) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][580/1251] eta 0:02:34 lr 0.000727 wd 0.0500 time 0.2333 (0.2305) data time 0.0007 (0.0022) model time 0.2326 (0.2284) loss 4.3740 (3.3083) grad_norm 2.2946 (2.2356) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][590/1251] eta 0:02:32 lr 0.000727 wd 0.0500 time 0.2318 (0.2304) data time 0.0006 (0.0021) model time 0.2312 (0.2283) loss 4.2285 (3.3112) grad_norm 2.2642 (2.2404) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][600/1251] eta 0:02:29 lr 0.000727 wd 0.0500 time 0.2251 (0.2304) data time 0.0011 (0.0021) model time 0.2240 (0.2283) loss 3.8762 (3.3102) grad_norm 2.3908 (2.2383) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][610/1251] eta 0:02:27 lr 0.000727 wd 0.0500 time 0.2275 (0.2304) data time 0.0009 (0.0021) model time 0.2266 (0.2283) loss 3.6263 (3.3147) grad_norm 2.5104 (2.2426) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][620/1251] eta 0:02:25 lr 0.000727 wd 0.0500 time 0.2302 (0.2303) data time 0.0009 (0.0021) model time 0.2293 (0.2283) loss 3.7074 (3.3165) grad_norm 4.5944 (2.2561) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][630/1251] eta 0:02:22 lr 0.000727 wd 0.0500 time 0.2317 (0.2303) data time 0.0009 (0.0021) model time 0.2308 (0.2283) loss 3.1394 (3.3157) grad_norm 2.6463 (2.2583) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [118/300][640/1251] eta 0:02:20 lr 0.000727 wd 0.0500 time 0.2294 (0.2303) data time 0.0007 (0.0020) model time 0.2287 (0.2283) loss 2.3747 (3.3166) grad_norm 1.4536 (2.2550) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][650/1251] eta 0:02:18 lr 0.000727 wd 0.0500 time 0.2248 (0.2302) data time 0.0007 (0.0020) model time 0.2241 (0.2283) loss 2.1265 (3.3145) grad_norm 1.6966 (2.2516) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][660/1251] eta 0:02:16 lr 0.000727 wd 0.0500 time 0.2215 (0.2307) data time 0.0008 (0.0020) model time 0.2207 (0.2288) loss 3.9723 (3.3106) grad_norm 2.2138 (2.2602) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-26 17:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][670/1251] eta 0:02:14 lr 0.000727 wd 0.0500 time 0.2287 (0.2307) data time 0.0010 (0.0020) model time 0.2277 (0.2287) loss 2.0729 (3.3080) grad_norm 2.4855 (inf) loss_scale 2048.0000 (4065.4784) mem 7382MB [2024-08-26 17:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][680/1251] eta 0:02:11 lr 0.000727 wd 0.0500 time 0.2221 (0.2306) data time 0.0012 (0.0020) model time 0.2208 (0.2287) loss 3.7134 (3.3067) grad_norm 1.9427 (inf) loss_scale 2048.0000 (4035.8532) mem 7382MB [2024-08-26 17:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][690/1251] eta 0:02:09 lr 0.000727 wd 0.0500 time 0.2331 (0.2306) data time 0.0007 (0.0020) model time 0.2325 (0.2287) loss 3.9916 (3.3057) grad_norm 2.6831 (inf) loss_scale 2048.0000 (4007.0854) mem 7382MB [2024-08-26 17:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][700/1251] eta 0:02:07 lr 0.000727 wd 0.0500 time 0.2274 (0.2309) data time 0.0012 (0.0020) model time 0.2262 (0.2290) loss 3.8490 (3.3082) grad_norm 2.4302 (inf) loss_scale 
2048.0000 (3979.1384) mem 7382MB [2024-08-26 17:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][710/1251] eta 0:02:04 lr 0.000727 wd 0.0500 time 0.2253 (0.2308) data time 0.0010 (0.0019) model time 0.2242 (0.2290) loss 3.5121 (3.3070) grad_norm 2.2822 (inf) loss_scale 2048.0000 (3951.9775) mem 7382MB [2024-08-26 17:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][720/1251] eta 0:02:02 lr 0.000727 wd 0.0500 time 0.2321 (0.2308) data time 0.0009 (0.0019) model time 0.2312 (0.2289) loss 2.9112 (3.3077) grad_norm 2.1508 (inf) loss_scale 2048.0000 (3925.5700) mem 7382MB [2024-08-26 17:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][730/1251] eta 0:02:00 lr 0.000727 wd 0.0500 time 0.2290 (0.2308) data time 0.0007 (0.0019) model time 0.2284 (0.2289) loss 4.2230 (3.3061) grad_norm 2.3935 (inf) loss_scale 2048.0000 (3899.8851) mem 7382MB [2024-08-26 17:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][740/1251] eta 0:01:57 lr 0.000727 wd 0.0500 time 0.2296 (0.2307) data time 0.0010 (0.0019) model time 0.2286 (0.2289) loss 3.6695 (3.3066) grad_norm 2.1822 (inf) loss_scale 2048.0000 (3874.8934) mem 7382MB [2024-08-26 17:18:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][750/1251] eta 0:01:55 lr 0.000727 wd 0.0500 time 0.2337 (0.2307) data time 0.0011 (0.0019) model time 0.2326 (0.2289) loss 3.9978 (3.3067) grad_norm 1.7835 (inf) loss_scale 2048.0000 (3850.5672) mem 7382MB [2024-08-26 17:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][760/1251] eta 0:01:53 lr 0.000727 wd 0.0500 time 0.2302 (0.2306) data time 0.0008 (0.0019) model time 0.2293 (0.2288) loss 3.9113 (3.3057) grad_norm 2.9975 (inf) loss_scale 2048.0000 (3826.8804) mem 7382MB [2024-08-26 17:18:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][770/1251] eta 0:01:50 lr 0.000727 wd 0.0500 time 0.2254 (0.2306) data time 0.0010 (0.0019) 
model time 0.2244 (0.2288) loss 3.4163 (3.3100) grad_norm 2.0325 (inf) loss_scale 2048.0000 (3803.8080) mem 7382MB [2024-08-26 17:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][780/1251] eta 0:01:48 lr 0.000727 wd 0.0500 time 0.2236 (0.2306) data time 0.0007 (0.0019) model time 0.2229 (0.2288) loss 2.2460 (3.3099) grad_norm 2.5604 (inf) loss_scale 2048.0000 (3781.3265) mem 7382MB [2024-08-26 17:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][790/1251] eta 0:01:46 lr 0.000727 wd 0.0500 time 0.2265 (0.2305) data time 0.0009 (0.0018) model time 0.2256 (0.2287) loss 3.5644 (3.3086) grad_norm 2.1095 (inf) loss_scale 2048.0000 (3759.4134) mem 7382MB [2024-08-26 17:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][800/1251] eta 0:01:43 lr 0.000727 wd 0.0500 time 0.2278 (0.2305) data time 0.0008 (0.0018) model time 0.2270 (0.2287) loss 3.3604 (3.3053) grad_norm 2.1337 (inf) loss_scale 2048.0000 (3738.0474) mem 7382MB [2024-08-26 17:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][810/1251] eta 0:01:41 lr 0.000727 wd 0.0500 time 0.2288 (0.2305) data time 0.0011 (0.0018) model time 0.2277 (0.2287) loss 3.6355 (3.3072) grad_norm 1.5814 (inf) loss_scale 2048.0000 (3717.2084) mem 7382MB [2024-08-26 17:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][820/1251] eta 0:01:39 lr 0.000726 wd 0.0500 time 0.2296 (0.2305) data time 0.0012 (0.0018) model time 0.2284 (0.2287) loss 3.1776 (3.3094) grad_norm 1.8062 (inf) loss_scale 2048.0000 (3696.8770) mem 7382MB [2024-08-26 17:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][830/1251] eta 0:01:37 lr 0.000726 wd 0.0500 time 0.2272 (0.2305) data time 0.0009 (0.0018) model time 0.2263 (0.2287) loss 2.4453 (3.3070) grad_norm 1.7732 (inf) loss_scale 2048.0000 (3677.0349) mem 7382MB [2024-08-26 17:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][840/1251] 
eta 0:01:34 lr 0.000726 wd 0.0500 time 0.2247 (0.2304) data time 0.0009 (0.0018) model time 0.2238 (0.2287) loss 3.3382 (3.3102) grad_norm 1.9870 (inf) loss_scale 2048.0000 (3657.6647) mem 7382MB [2024-08-26 17:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 17:19:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 17:19:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 17:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 17:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 17:22:01 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 17:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 17:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 17:25:28 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 17:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 17:25:37 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 17:25:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 17:25:40 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 17:25:40 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 118) [2024-08-26 17:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 17:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][850/1251] eta 0:15:47 lr 0.000726 wd 0.0500 time 0.2282 (2.3636) data time 0.0014 (0.1281) model time 0.2268 (2.2356) loss 3.6456 (3.8658) grad_norm 2.1869 (1.9197) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][860/1251] eta 0:07:14 lr 0.000726 wd 0.0500 time 0.2282 (1.1104) data time 0.0012 (0.0534) model time 0.2271 (1.0570) loss 2.9799 (3.6005) grad_norm 2.3656 (2.0656) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][870/1251] eta 0:04:58 lr 0.000726 wd 0.0500 time 0.2298 (0.7839) data time 0.0008 (0.0340) model time 0.2290 (0.7499) loss 3.9378 (3.6941) grad_norm 2.0065 (2.2825) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][880/1251] eta 0:03:55 lr 0.000726 wd 0.0500 time 0.2289 (0.6339) data time 0.0009 (0.0251) model time 0.2280 (0.6088) loss 3.0138 (3.6213) grad_norm 2.8267 (2.2460) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][890/1251] eta 0:03:17 lr 0.000726 wd 0.0500 time 0.2275 (0.5474) data time 0.0007 (0.0200) model time 0.2269 (0.5274) loss 3.7627 (3.5696) grad_norm 3.0624 (2.3681) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [118/300][900/1251] eta 0:02:52 lr 0.000726 wd 0.0500 time 0.2305 (0.4922) data time 0.0012 (0.0168) model time 0.2293 (0.4754) loss 3.0469 (3.5416) grad_norm 1.9276 (2.3741) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][910/1251] eta 0:02:34 lr 0.000726 wd 0.0500 time 0.2241 (0.4534) data time 0.0014 (0.0145) model time 0.2227 (0.4388) loss 3.4417 (3.4991) grad_norm 3.3888 (2.3213) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][920/1251] eta 0:02:20 lr 0.000726 wd 0.0500 time 0.2263 (0.4241) data time 0.0009 (0.0128) model time 0.2254 (0.4113) loss 3.4565 (3.4789) grad_norm 2.0725 (2.2780) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][930/1251] eta 0:02:08 lr 0.000726 wd 0.0500 time 0.2347 (0.4019) data time 0.0011 (0.0114) model time 0.2336 (0.3904) loss 3.5909 (3.4426) grad_norm 1.4165 (2.2604) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][940/1251] eta 0:01:59 lr 0.000726 wd 0.0500 time 0.2311 (0.3840) data time 0.0009 (0.0104) model time 0.2302 (0.3737) loss 3.5224 (3.4489) grad_norm 2.3026 (2.2265) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][950/1251] eta 0:01:51 lr 0.000726 wd 0.0500 time 0.2299 (0.3695) data time 0.0007 (0.0095) model time 0.2292 (0.3600) loss 3.8040 (3.4656) grad_norm 3.1109 (2.2741) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][960/1251] eta 0:01:44 lr 0.000726 wd 0.0500 time 0.2299 (0.3574) data time 0.0010 (0.0088) model time 0.2289 (0.3486) loss 3.7496 (3.4557) grad_norm 1.7027 (2.2436) loss_scale 
2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][970/1251] eta 0:01:37 lr 0.000726 wd 0.0500 time 0.2404 (0.3473) data time 0.0014 (0.0082) model time 0.2390 (0.3391) loss 3.8425 (3.4483) grad_norm 1.4638 (2.2159) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][980/1251] eta 0:01:31 lr 0.000726 wd 0.0500 time 0.2281 (0.3388) data time 0.0008 (0.0076) model time 0.2273 (0.3311) loss 2.8488 (3.4305) grad_norm 1.5664 (2.1961) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][990/1251] eta 0:01:26 lr 0.000726 wd 0.0500 time 0.2329 (0.3313) data time 0.0009 (0.0072) model time 0.2320 (0.3241) loss 2.8879 (3.4181) grad_norm 2.6989 (2.2042) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1000/1251] eta 0:01:21 lr 0.000726 wd 0.0500 time 0.2267 (0.3247) data time 0.0012 (0.0068) model time 0.2256 (0.3178) loss 3.1605 (3.4148) grad_norm 1.8017 (2.2055) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1010/1251] eta 0:01:16 lr 0.000726 wd 0.0500 time 0.2212 (0.3191) data time 0.0014 (0.0065) model time 0.2197 (0.3126) loss 3.2178 (3.4203) grad_norm 1.9058 (2.1898) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1020/1251] eta 0:01:12 lr 0.000726 wd 0.0500 time 0.2210 (0.3138) data time 0.0010 (0.0062) model time 0.2200 (0.3076) loss 3.3478 (3.4014) grad_norm 1.7708 (2.1754) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1030/1251] eta 0:01:08 lr 0.000726 wd 0.0500 time 0.2236 (0.3093) data 
time 0.0008 (0.0059) model time 0.2228 (0.3034) loss 3.4445 (3.3972) grad_norm 2.1952 (2.1626) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1040/1251] eta 0:01:04 lr 0.000726 wd 0.0500 time 0.2342 (0.3052) data time 0.0007 (0.0057) model time 0.2335 (0.2995) loss 3.8182 (3.3928) grad_norm 2.2464 (2.1837) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1050/1251] eta 0:01:00 lr 0.000726 wd 0.0500 time 0.2354 (0.3015) data time 0.0010 (0.0054) model time 0.2344 (0.2961) loss 3.7638 (3.3795) grad_norm 1.9043 (2.2040) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1060/1251] eta 0:00:56 lr 0.000726 wd 0.0500 time 0.2230 (0.2981) data time 0.0007 (0.0052) model time 0.2222 (0.2929) loss 4.0270 (3.3754) grad_norm 2.6880 (2.2069) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1070/1251] eta 0:00:53 lr 0.000725 wd 0.0500 time 0.2324 (0.2951) data time 0.0009 (0.0051) model time 0.2315 (0.2900) loss 3.5886 (3.3820) grad_norm 1.2401 (2.2047) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1080/1251] eta 0:00:49 lr 0.000725 wd 0.0500 time 0.2272 (0.2923) data time 0.0007 (0.0049) model time 0.2265 (0.2874) loss 3.2275 (3.3749) grad_norm 2.2548 (2.2091) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1090/1251] eta 0:00:46 lr 0.000725 wd 0.0500 time 0.2255 (0.2897) data time 0.0009 (0.0047) model time 0.2246 (0.2850) loss 3.5569 (3.3728) grad_norm 1.9832 (2.1926) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:26:59 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [118/300][1100/1251] eta 0:00:43 lr 0.000725 wd 0.0500 time 0.2297 (0.2874) data time 0.0007 (0.0046) model time 0.2290 (0.2828) loss 2.2442 (3.3612) grad_norm 2.2161 (2.1950) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1110/1251] eta 0:00:40 lr 0.000725 wd 0.0500 time 0.2214 (0.2852) data time 0.0008 (0.0044) model time 0.2206 (0.2807) loss 2.1735 (3.3501) grad_norm 2.9301 (2.2074) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1120/1251] eta 0:00:37 lr 0.000725 wd 0.0500 time 0.2309 (0.2832) data time 0.0009 (0.0043) model time 0.2300 (0.2789) loss 3.4212 (3.3572) grad_norm 2.6135 (2.2010) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1130/1251] eta 0:00:34 lr 0.000725 wd 0.0500 time 0.4639 (0.2821) data time 0.0007 (0.0042) model time 0.4632 (0.2779) loss 4.2493 (3.3492) grad_norm 2.0347 (2.1989) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1140/1251] eta 0:00:31 lr 0.000725 wd 0.0500 time 0.2398 (0.2803) data time 0.0010 (0.0041) model time 0.2389 (0.2762) loss 2.3742 (3.3378) grad_norm 1.7291 (2.1976) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1150/1251] eta 0:00:28 lr 0.000725 wd 0.0500 time 0.4817 (0.2796) data time 0.0007 (0.0040) model time 0.4810 (0.2756) loss 3.8894 (3.3341) grad_norm 2.7817 (2.2032) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1160/1251] eta 0:00:25 lr 0.000725 wd 0.0500 time 0.2262 (0.2780) data time 0.0011 (0.0039) model time 0.2252 (0.2741) loss 3.9334 (3.3428) 
grad_norm 2.4651 (2.1931) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1170/1251] eta 0:00:22 lr 0.000725 wd 0.0500 time 0.2323 (0.2766) data time 0.0007 (0.0038) model time 0.2316 (0.2727) loss 2.8104 (3.3482) grad_norm 2.5662 (2.1996) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1180/1251] eta 0:00:19 lr 0.000725 wd 0.0500 time 0.2325 (0.2752) data time 0.0009 (0.0037) model time 0.2316 (0.2714) loss 2.9615 (3.3457) grad_norm 2.4749 (2.2230) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1190/1251] eta 0:00:16 lr 0.000725 wd 0.0500 time 0.2244 (0.2738) data time 0.0010 (0.0037) model time 0.2233 (0.2702) loss 3.1888 (3.3506) grad_norm 1.6853 (2.2231) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1200/1251] eta 0:00:13 lr 0.000725 wd 0.0500 time 0.2311 (0.2725) data time 0.0011 (0.0036) model time 0.2301 (0.2689) loss 2.9205 (3.3524) grad_norm 2.3925 (2.2202) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1210/1251] eta 0:00:11 lr 0.000725 wd 0.0500 time 0.2303 (0.2714) data time 0.0008 (0.0035) model time 0.2296 (0.2678) loss 2.5265 (3.3531) grad_norm 2.3050 (2.2111) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1220/1251] eta 0:00:08 lr 0.000725 wd 0.0500 time 0.2309 (0.2703) data time 0.0011 (0.0035) model time 0.2298 (0.2668) loss 3.5192 (3.3541) grad_norm 2.3155 (2.2182) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1230/1251] eta 0:00:05 lr 
0.000725 wd 0.0500 time 0.2493 (0.2693) data time 0.0011 (0.0034) model time 0.2482 (0.2659) loss 3.7929 (3.3508) grad_norm 2.9567 (2.2284) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1240/1251] eta 0:00:02 lr 0.000725 wd 0.0500 time 0.2204 (0.2681) data time 0.0005 (0.0034) model time 0.2200 (0.2647) loss 3.2792 (3.3523) grad_norm 1.6992 (2.2240) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [118/300][1250/1251] eta 0:00:00 lr 0.000725 wd 0.0500 time 0.2213 (0.2667) data time 0.0007 (0.0033) model time 0.2206 (0.2635) loss 3.1726 (3.3554) grad_norm 1.8588 (2.2298) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 17:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 118 training takes 0:01:48 [2024-08-26 17:27:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 17:27:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 17:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.488 (0.488) Loss 0.5254 (0.5254) Acc@1 90.723 (90.723) Acc@5 97.754 (97.754) Mem 7378MB [2024-08-26 17:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.115) Loss 0.7910 (0.7892) Acc@1 84.473 (83.336) Acc@5 95.898 (96.644) Mem 7378MB [2024-08-26 17:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.101) Loss 1.1055 (0.8109) Acc@1 72.754 (82.329) Acc@5 93.066 (96.591) Mem 7378MB [2024-08-26 17:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.094) Loss 1.3223 (0.9173) Acc@1 69.336 (79.883) Acc@5 90.625 (95.265) Mem 7378MB [2024-08-26 17:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.2549 (0.9807) Acc@1 69.922 (78.242) Acc@5 92.090 (94.507) Mem 7378MB [2024-08-26 17:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.824 Acc@5 94.394 [2024-08-26 17:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.8% [2024-08-26 17:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.972 (0.972) Loss 0.4224 (0.4224) Acc@1 91.992 (91.992) Acc@5 98.340 (98.340) Mem 7378MB [2024-08-26 17:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.167) Loss 0.6831 (0.6674) Acc@1 86.133 (85.529) Acc@5 96.777 (97.221) Mem 7378MB [2024-08-26 17:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.127) Loss 0.9502 (0.6895) Acc@1 78.613 (84.626) Acc@5 95.020 (97.238) Mem 7378MB [2024-08-26 17:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.111) Loss 1.2188 (0.7855) Acc@1 68.848 (82.334) Acc@5 91.602 (96.128) Mem 7378MB [2024-08-26 17:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.100) Loss 1.1035 
(0.8356) Acc@1 72.266 (80.909) Acc@5 93.066 (95.596) Mem 7378MB [2024-08-26 17:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.532 Acc@5 95.554 [2024-08-26 17:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.5% [2024-08-26 17:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.53% [2024-08-26 17:27:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 17:27:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 17:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][0/1251] eta 0:14:55 lr 0.000725 wd 0.0500 time 0.7156 (0.7156) data time 0.4330 (0.4330) model time 0.0000 (0.0000) loss 3.9883 (3.9883) grad_norm 1.7415 (1.7415) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 17:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][10/1251] eta 0:05:39 lr 0.000725 wd 0.0500 time 0.2279 (0.2735) data time 0.0011 (0.0405) model time 0.0000 (0.0000) loss 3.9850 (3.4402) grad_norm 2.7281 (2.2741) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][20/1251] eta 0:05:09 lr 0.000725 wd 0.0500 time 0.2328 (0.2516) data time 0.0006 (0.0217) model time 0.0000 (0.0000) loss 4.2447 (3.4973) grad_norm 2.1304 (2.1397) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][30/1251] eta 0:04:58 lr 0.000725 wd 0.0500 time 0.2230 (0.2443) data time 0.0010 (0.0150) model time 0.0000 (0.0000) loss 3.4694 (3.4903) grad_norm 2.5227 (2.1155) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][40/1251] eta 
0:04:51 lr 0.000725 wd 0.0500 time 0.2297 (0.2409) data time 0.0011 (0.0117) model time 0.0000 (0.0000) loss 2.8047 (3.4547) grad_norm 1.8522 (2.0842) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][50/1251] eta 0:04:46 lr 0.000725 wd 0.0500 time 0.2166 (0.2387) data time 0.0010 (0.0096) model time 0.0000 (0.0000) loss 3.2062 (3.3964) grad_norm 1.4098 (2.1171) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][60/1251] eta 0:04:42 lr 0.000725 wd 0.0500 time 0.2296 (0.2368) data time 0.0009 (0.0082) model time 0.2287 (0.2263) loss 2.6602 (3.3308) grad_norm 2.1734 (2.1866) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][70/1251] eta 0:04:38 lr 0.000724 wd 0.0500 time 0.2308 (0.2358) data time 0.0010 (0.0072) model time 0.2298 (0.2275) loss 2.2781 (3.3113) grad_norm 2.0789 (2.2960) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][80/1251] eta 0:04:35 lr 0.000724 wd 0.0500 time 0.2265 (0.2352) data time 0.0009 (0.0064) model time 0.2256 (0.2282) loss 3.3024 (3.3390) grad_norm 2.3663 (2.2912) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][90/1251] eta 0:04:32 lr 0.000724 wd 0.0500 time 0.2232 (0.2349) data time 0.0008 (0.0059) model time 0.2224 (0.2290) loss 3.8242 (3.3461) grad_norm 2.7986 (2.3073) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][100/1251] eta 0:04:29 lr 0.000724 wd 0.0500 time 0.2297 (0.2345) data time 0.0007 (0.0054) model time 0.2290 (0.2291) loss 3.9068 (3.3379) grad_norm 2.1440 (2.2935) loss_scale 2048.0000 (2048.0000) mem 7380MB 
[2024-08-26 17:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][110/1251] eta 0:04:26 lr 0.000724 wd 0.0500 time 0.2293 (0.2339) data time 0.0009 (0.0050) model time 0.2283 (0.2288) loss 3.3340 (3.3644) grad_norm 1.9094 (2.2577) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][120/1251] eta 0:04:24 lr 0.000724 wd 0.0500 time 0.2341 (0.2338) data time 0.0009 (0.0047) model time 0.2331 (0.2291) loss 2.0511 (3.3199) grad_norm 1.8130 (2.2369) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][130/1251] eta 0:04:21 lr 0.000724 wd 0.0500 time 0.2342 (0.2335) data time 0.0007 (0.0044) model time 0.2335 (0.2290) loss 3.7368 (3.3122) grad_norm 1.3760 (2.2375) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][140/1251] eta 0:04:18 lr 0.000724 wd 0.0500 time 0.2265 (0.2331) data time 0.0007 (0.0042) model time 0.2258 (0.2288) loss 3.5082 (3.3195) grad_norm 1.9911 (2.2183) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][150/1251] eta 0:04:16 lr 0.000724 wd 0.0500 time 0.2258 (0.2329) data time 0.0010 (0.0040) model time 0.2248 (0.2289) loss 3.3215 (3.3286) grad_norm 2.5875 (2.2100) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][160/1251] eta 0:04:13 lr 0.000724 wd 0.0500 time 0.2298 (0.2326) data time 0.0006 (0.0038) model time 0.2292 (0.2286) loss 3.9452 (3.3421) grad_norm 1.9523 (2.2098) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][170/1251] eta 0:04:11 lr 0.000724 wd 0.0500 time 0.2229 (0.2322) data time 0.0009 (0.0037) model time 0.2220 
(0.2284) loss 3.7524 (3.3407) grad_norm 4.9244 (2.2308) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][180/1251] eta 0:04:08 lr 0.000724 wd 0.0500 time 0.2273 (0.2322) data time 0.0009 (0.0035) model time 0.2265 (0.2285) loss 4.1197 (3.3507) grad_norm 1.6959 (2.2379) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][190/1251] eta 0:04:06 lr 0.000724 wd 0.0500 time 0.2508 (0.2323) data time 0.0009 (0.0034) model time 0.2499 (0.2289) loss 2.5557 (3.3497) grad_norm 1.9029 (2.2245) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][200/1251] eta 0:04:04 lr 0.000724 wd 0.0500 time 0.2269 (0.2322) data time 0.0009 (0.0033) model time 0.2260 (0.2289) loss 2.9599 (3.3386) grad_norm 2.7651 (2.2275) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][210/1251] eta 0:04:01 lr 0.000724 wd 0.0500 time 0.2351 (0.2322) data time 0.0011 (0.0032) model time 0.2340 (0.2290) loss 2.8622 (3.3411) grad_norm 1.9935 (2.2207) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][220/1251] eta 0:03:59 lr 0.000724 wd 0.0500 time 0.2227 (0.2321) data time 0.0007 (0.0031) model time 0.2221 (0.2290) loss 2.9616 (3.3463) grad_norm 1.7433 (2.2228) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][230/1251] eta 0:03:56 lr 0.000724 wd 0.0500 time 0.2289 (0.2321) data time 0.0009 (0.0030) model time 0.2280 (0.2291) loss 3.8312 (3.3528) grad_norm 4.1297 (2.2556) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[119/300][240/1251] eta 0:03:54 lr 0.000724 wd 0.0500 time 0.2253 (0.2320) data time 0.0009 (0.0029) model time 0.2244 (0.2291) loss 3.4702 (3.3430) grad_norm 2.4744 (2.2665) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][250/1251] eta 0:03:52 lr 0.000724 wd 0.0500 time 0.2299 (0.2319) data time 0.0011 (0.0028) model time 0.2288 (0.2291) loss 2.5509 (3.3390) grad_norm 1.9081 (2.2609) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][260/1251] eta 0:03:49 lr 0.000724 wd 0.0500 time 0.2284 (0.2319) data time 0.0008 (0.0028) model time 0.2276 (0.2291) loss 3.8005 (3.3331) grad_norm 1.7586 (2.2452) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][270/1251] eta 0:03:47 lr 0.000724 wd 0.0500 time 0.2292 (0.2319) data time 0.0010 (0.0027) model time 0.2282 (0.2292) loss 3.3754 (3.3419) grad_norm 2.0177 (2.2286) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][280/1251] eta 0:03:45 lr 0.000724 wd 0.0500 time 0.2314 (0.2318) data time 0.0010 (0.0027) model time 0.2305 (0.2292) loss 2.9133 (3.3460) grad_norm 1.6952 (2.2267) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][290/1251] eta 0:03:42 lr 0.000724 wd 0.0500 time 0.2255 (0.2317) data time 0.0009 (0.0026) model time 0.2246 (0.2291) loss 3.4009 (3.3348) grad_norm 2.4938 (2.2258) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][300/1251] eta 0:03:40 lr 0.000724 wd 0.0500 time 0.2291 (0.2318) data time 0.0011 (0.0026) model time 0.2279 (0.2292) loss 3.4211 (3.3322) grad_norm 1.5838 (2.2160) loss_scale 2048.0000 
(2048.0000) mem 7380MB [2024-08-26 17:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][310/1251] eta 0:03:38 lr 0.000724 wd 0.0500 time 0.2288 (0.2318) data time 0.0007 (0.0025) model time 0.2281 (0.2293) loss 3.7718 (3.3170) grad_norm 1.3753 (2.2073) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][320/1251] eta 0:03:35 lr 0.000723 wd 0.0500 time 0.2262 (0.2317) data time 0.0013 (0.0025) model time 0.2250 (0.2292) loss 3.6009 (3.3134) grad_norm 1.8238 (2.2051) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][330/1251] eta 0:03:33 lr 0.000723 wd 0.0500 time 0.2373 (0.2317) data time 0.0009 (0.0024) model time 0.2364 (0.2293) loss 3.6732 (3.3259) grad_norm 1.8764 (2.2034) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][340/1251] eta 0:03:31 lr 0.000723 wd 0.0500 time 0.2262 (0.2316) data time 0.0008 (0.0024) model time 0.2254 (0.2292) loss 2.0145 (3.3253) grad_norm 2.4060 (2.2170) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][350/1251] eta 0:03:29 lr 0.000723 wd 0.0500 time 0.2288 (0.2322) data time 0.0006 (0.0024) model time 0.2282 (0.2299) loss 3.7729 (3.3228) grad_norm 1.4686 (2.2230) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][360/1251] eta 0:03:26 lr 0.000723 wd 0.0500 time 0.2423 (0.2321) data time 0.0009 (0.0023) model time 0.2415 (0.2299) loss 3.5414 (3.3270) grad_norm 2.8375 (2.2296) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][370/1251] eta 0:03:24 lr 0.000723 wd 0.0500 time 0.2312 (0.2320) data time 0.0009 
(0.0023) model time 0.2303 (0.2298) loss 3.9155 (3.3322) grad_norm 2.5901 (2.2396) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][380/1251] eta 0:03:22 lr 0.000723 wd 0.0500 time 0.2334 (0.2320) data time 0.0007 (0.0023) model time 0.2327 (0.2298) loss 3.2306 (3.3297) grad_norm 1.9593 (2.2381) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][390/1251] eta 0:03:19 lr 0.000723 wd 0.0500 time 0.2280 (0.2319) data time 0.0009 (0.0023) model time 0.2271 (0.2298) loss 3.6630 (3.3317) grad_norm 3.6352 (2.2436) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][400/1251] eta 0:03:17 lr 0.000723 wd 0.0500 time 0.2284 (0.2322) data time 0.0010 (0.0022) model time 0.2274 (0.2302) loss 3.6683 (3.3262) grad_norm 1.9867 (2.2415) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][410/1251] eta 0:03:15 lr 0.000723 wd 0.0500 time 0.2435 (0.2323) data time 0.0008 (0.0022) model time 0.2427 (0.2302) loss 3.4061 (3.3212) grad_norm 2.3458 (2.2396) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][420/1251] eta 0:03:12 lr 0.000723 wd 0.0500 time 0.2326 (0.2321) data time 0.0012 (0.0022) model time 0.2314 (0.2301) loss 3.8492 (3.3125) grad_norm 1.6773 (2.2355) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][430/1251] eta 0:03:10 lr 0.000723 wd 0.0500 time 0.2310 (0.2321) data time 0.0009 (0.0021) model time 0.2301 (0.2300) loss 4.0742 (3.3116) grad_norm 1.6864 (2.2254) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [119/300][440/1251] eta 0:03:08 lr 0.000723 wd 0.0500 time 0.2300 (0.2320) data time 0.0009 (0.0021) model time 0.2291 (0.2300) loss 1.9538 (3.3069) grad_norm 2.4099 (2.2260) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][450/1251] eta 0:03:05 lr 0.000723 wd 0.0500 time 0.2322 (0.2319) data time 0.0008 (0.0021) model time 0.2313 (0.2299) loss 2.8666 (3.3107) grad_norm 1.6098 (2.2226) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][460/1251] eta 0:03:03 lr 0.000723 wd 0.0500 time 0.2567 (0.2319) data time 0.0007 (0.0021) model time 0.2560 (0.2299) loss 4.2823 (3.3113) grad_norm 3.5014 (2.2268) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][470/1251] eta 0:03:01 lr 0.000723 wd 0.0500 time 0.2335 (0.2319) data time 0.0009 (0.0021) model time 0.2326 (0.2299) loss 3.3443 (3.3084) grad_norm 1.8647 (2.2293) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][480/1251] eta 0:02:58 lr 0.000723 wd 0.0500 time 0.2480 (0.2319) data time 0.0011 (0.0020) model time 0.2469 (0.2299) loss 3.9747 (3.3066) grad_norm 1.9192 (2.2307) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][490/1251] eta 0:02:56 lr 0.000723 wd 0.0500 time 0.2305 (0.2318) data time 0.0008 (0.0020) model time 0.2297 (0.2299) loss 2.6398 (3.3066) grad_norm 1.8051 (2.2324) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][500/1251] eta 0:02:54 lr 0.000723 wd 0.0500 time 0.2267 (0.2318) data time 0.0009 (0.0020) model time 0.2258 (0.2299) loss 3.0897 (3.3080) grad_norm 1.8968 (2.2282) loss_scale 
2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][510/1251] eta 0:02:51 lr 0.000723 wd 0.0500 time 0.2391 (0.2318) data time 0.0009 (0.0020) model time 0.2383 (0.2299) loss 2.9780 (3.3104) grad_norm 2.2730 (2.2209) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][520/1251] eta 0:02:49 lr 0.000723 wd 0.0500 time 0.2267 (0.2318) data time 0.0008 (0.0020) model time 0.2259 (0.2299) loss 3.4458 (3.3176) grad_norm 2.0751 (2.2257) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][530/1251] eta 0:02:47 lr 0.000723 wd 0.0500 time 0.2257 (0.2318) data time 0.0009 (0.0020) model time 0.2248 (0.2299) loss 3.0332 (3.3169) grad_norm 2.7953 (2.2286) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][540/1251] eta 0:02:44 lr 0.000723 wd 0.0500 time 0.2241 (0.2317) data time 0.0007 (0.0019) model time 0.2234 (0.2298) loss 3.2739 (3.3114) grad_norm 2.5573 (2.2246) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][550/1251] eta 0:02:42 lr 0.000723 wd 0.0500 time 0.2309 (0.2316) data time 0.0009 (0.0019) model time 0.2300 (0.2298) loss 3.7334 (3.3077) grad_norm 1.3940 (2.2192) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][560/1251] eta 0:02:40 lr 0.000723 wd 0.0500 time 0.2298 (0.2316) data time 0.0011 (0.0019) model time 0.2287 (0.2297) loss 3.6284 (3.3090) grad_norm 2.4768 (2.2208) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][570/1251] eta 0:02:37 lr 0.000722 wd 0.0500 time 0.2489 (0.2316) data time 
0.0013 (0.0019) model time 0.2476 (0.2298) loss 2.7411 (3.3105) grad_norm 2.5866 (2.2222) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][580/1251] eta 0:02:35 lr 0.000722 wd 0.0500 time 0.2223 (0.2316) data time 0.0010 (0.0019) model time 0.2213 (0.2297) loss 2.3124 (3.3107) grad_norm 2.0621 (2.2223) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][590/1251] eta 0:02:33 lr 0.000722 wd 0.0500 time 0.2331 (0.2315) data time 0.0010 (0.0019) model time 0.2321 (0.2297) loss 3.2941 (3.3114) grad_norm 2.0413 (2.2208) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][600/1251] eta 0:02:30 lr 0.000722 wd 0.0500 time 0.2283 (0.2315) data time 0.0007 (0.0019) model time 0.2276 (0.2297) loss 4.2112 (3.3128) grad_norm 2.0429 (2.2205) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][610/1251] eta 0:02:28 lr 0.000722 wd 0.0500 time 0.2317 (0.2314) data time 0.0007 (0.0019) model time 0.2310 (0.2296) loss 3.0948 (3.3133) grad_norm 2.4663 (2.2226) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][620/1251] eta 0:02:26 lr 0.000722 wd 0.0500 time 0.2296 (0.2314) data time 0.0009 (0.0018) model time 0.2287 (0.2296) loss 2.8684 (3.3117) grad_norm 1.7148 (2.2185) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][630/1251] eta 0:02:23 lr 0.000722 wd 0.0500 time 0.2311 (0.2314) data time 0.0007 (0.0018) model time 0.2304 (0.2296) loss 3.6458 (3.3136) grad_norm 2.1123 (2.2214) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [119/300][640/1251] eta 0:02:21 lr 0.000722 wd 0.0500 time 0.2221 (0.2313) data time 0.0012 (0.0018) model time 0.2208 (0.2295) loss 2.7307 (3.3141) grad_norm 1.9602 (2.2169) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][650/1251] eta 0:02:19 lr 0.000722 wd 0.0500 time 0.2349 (0.2313) data time 0.0007 (0.0018) model time 0.2342 (0.2295) loss 4.3121 (3.3193) grad_norm 2.0107 (2.2130) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][660/1251] eta 0:02:16 lr 0.000722 wd 0.0500 time 0.2370 (0.2313) data time 0.0010 (0.0018) model time 0.2360 (0.2295) loss 3.4938 (3.3186) grad_norm 1.8683 (2.2122) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][670/1251] eta 0:02:14 lr 0.000722 wd 0.0500 time 0.2259 (0.2313) data time 0.0010 (0.0018) model time 0.2248 (0.2295) loss 3.6852 (3.3171) grad_norm 1.9239 (2.2135) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][680/1251] eta 0:02:12 lr 0.000722 wd 0.0500 time 0.2279 (0.2312) data time 0.0007 (0.0018) model time 0.2272 (0.2295) loss 2.3894 (3.3158) grad_norm 2.8144 (2.2126) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][690/1251] eta 0:02:09 lr 0.000722 wd 0.0500 time 0.2345 (0.2312) data time 0.0007 (0.0018) model time 0.2338 (0.2295) loss 2.7842 (3.3167) grad_norm 2.2919 (2.2193) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][700/1251] eta 0:02:07 lr 0.000722 wd 0.0500 time 0.2277 (0.2312) data time 0.0008 (0.0018) model time 0.2269 (0.2295) loss 2.9817 (3.3153) grad_norm 1.9634 (2.2190) 
loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 17:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 17:30:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 17:30:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 17:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 17:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 17:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 17:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 17:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 18:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 18:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 18:00:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 18:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 18:00:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 18:00:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 18:00:42 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 18:00:42 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 119) [2024-08-26 18:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 18:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][710/1251] eta 0:33:43 lr 0.000722 wd 0.0500 time 0.2348 (3.7405) data time 0.0009 (0.2371) model time 0.2340 (3.5035) loss 3.7434 (3.7672) grad_norm 1.9561 (1.9941) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][720/1251] eta 0:10:58 lr 0.000722 wd 0.0500 time 0.2411 (1.2409) data time 0.0008 (0.0685) model time 0.2404 (1.1725) loss 3.9548 (3.7143) grad_norm 2.6435 (1.9460) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][730/1251] eta 0:07:08 lr 0.000722 wd 0.0500 time 0.2354 (0.8223) data time 0.0011 (0.0404) model time 0.2343 (0.7819) loss 3.5864 (3.6838) grad_norm 2.1119 (2.0239) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][740/1251] eta 0:05:32 lr 0.000722 wd 0.0500 time 0.2340 (0.6506) data time 0.0008 (0.0288) model time 0.2332 (0.6218) loss 2.8387 (3.6583) grad_norm 2.9587 (2.0348) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][750/1251] eta 0:04:39 lr 0.000722 wd 0.0500 time 0.2439 (0.5571) data time 0.0008 (0.0226) model time 0.2431 (0.5344) loss 3.6946 (3.5997) grad_norm 2.4394 (2.1779) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [119/300][760/1251] eta 0:04:04 lr 0.000722 wd 0.0500 time 0.2382 (0.4983) data time 0.0009 (0.0186) model time 0.2373 (0.4797) loss 3.7433 (3.5639) grad_norm 2.6396 (2.1603) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][770/1251] eta 0:03:40 lr 0.000722 wd 0.0500 time 0.2430 (0.4583) data time 0.0008 (0.0159) model time 0.2422 (0.4424) loss 3.5500 (3.5141) grad_norm 1.5545 (2.1299) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][780/1251] eta 0:03:21 lr 0.000722 wd 0.0500 time 0.2420 (0.4288) data time 0.0007 (0.0139) model time 0.2413 (0.4149) loss 3.4190 (3.4753) grad_norm 2.6694 (2.1242) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][790/1251] eta 0:03:07 lr 0.000722 wd 0.0500 time 0.2397 (0.4064) data time 0.0011 (0.0124) model time 0.2386 (0.3940) loss 3.6371 (3.4412) grad_norm 2.4571 (2.1201) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][800/1251] eta 0:02:55 lr 0.000722 wd 0.0500 time 0.2387 (0.3888) data time 0.0010 (0.0111) model time 0.2377 (0.3776) loss 3.0965 (3.4338) grad_norm 3.1763 (2.1604) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][810/1251] eta 0:02:45 lr 0.000722 wd 0.0500 time 0.2430 (0.3758) data time 0.0010 (0.0103) model time 0.2420 (0.3655) loss 3.4800 (3.4578) grad_norm 3.3480 (2.1856) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][820/1251] eta 0:02:36 lr 0.000721 wd 0.0500 time 0.2378 (0.3640) data time 0.0009 (0.0095) model time 0.2369 (0.3545) loss 3.8234 (3.4530) grad_norm 1.7150 (2.1906) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][830/1251] eta 0:02:29 lr 0.000721 wd 0.0500 time 0.2519 (0.3541) data time 0.0010 (0.0088) model time 0.2509 (0.3453) loss 3.1055 (3.4516) grad_norm 2.4068 (2.1803) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][840/1251] eta 0:02:21 lr 0.000721 wd 0.0500 time 0.2335 (0.3454) data time 0.0011 (0.0082) model time 0.2324 (0.3372) loss 3.6215 (3.4491) grad_norm 2.1804 (2.1654) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][850/1251] eta 0:02:15 lr 0.000721 wd 0.0500 time 0.2337 (0.3381) data time 0.0008 (0.0077) model time 0.2329 (0.3304) loss 3.0410 (3.4232) grad_norm 1.5468 (2.1481) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][860/1251] eta 0:02:09 lr 0.000721 wd 0.0500 time 0.2651 (0.3319) data time 0.0007 (0.0073) model time 0.2644 (0.3246) loss 3.1547 (3.4095) grad_norm 2.3482 (2.1594) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][870/1251] eta 0:02:04 lr 0.000721 wd 0.0500 time 0.2495 (0.3264) data time 0.0011 (0.0069) model time 0.2484 (0.3194) loss 3.9105 (3.4076) grad_norm 1.6768 (2.1661) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][880/1251] eta 0:01:59 lr 0.000721 wd 0.0500 time 0.2404 (0.3214) data time 0.0009 (0.0066) model time 0.2395 (0.3148) loss 2.4900 (3.3979) grad_norm 1.9959 (2.1739) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][890/1251] eta 0:01:54 lr 0.000721 wd 0.0500 time 0.2382 (0.3170) data time 
0.0008 (0.0063) model time 0.2374 (0.3107) loss 3.4291 (3.3949) grad_norm 1.8389 (2.1679) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][900/1251] eta 0:01:49 lr 0.000721 wd 0.0500 time 0.2377 (0.3130) data time 0.0008 (0.0060) model time 0.2369 (0.3070) loss 3.1272 (3.3910) grad_norm 1.3445 (2.1614) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][910/1251] eta 0:01:45 lr 0.000721 wd 0.0500 time 0.2400 (0.3096) data time 0.0011 (0.0058) model time 0.2389 (0.3038) loss 3.4356 (3.3752) grad_norm 2.5606 (2.1718) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][920/1251] eta 0:01:41 lr 0.000721 wd 0.0500 time 0.2480 (0.3064) data time 0.0009 (0.0056) model time 0.2471 (0.3008) loss 3.3014 (3.3642) grad_norm 2.3657 (2.2171) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][930/1251] eta 0:01:37 lr 0.000721 wd 0.0500 time 0.2468 (0.3036) data time 0.0009 (0.0054) model time 0.2459 (0.2982) loss 3.9294 (3.3633) grad_norm 2.4926 (2.2160) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][940/1251] eta 0:01:33 lr 0.000721 wd 0.0500 time 0.2422 (0.3010) data time 0.0010 (0.0052) model time 0.2412 (0.2958) loss 2.4314 (3.3556) grad_norm 1.4320 (2.2180) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][950/1251] eta 0:01:29 lr 0.000721 wd 0.0500 time 0.2354 (0.2985) data time 0.0007 (0.0050) model time 0.2347 (0.2934) loss 2.3961 (3.3584) grad_norm 1.9411 (2.2092) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [119/300][960/1251] eta 0:01:26 lr 0.000721 wd 0.0500 time 0.2440 (0.2962) data time 0.0008 (0.0048) model time 0.2432 (0.2913) loss 2.5692 (3.3465) grad_norm 2.5760 (2.2031) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][970/1251] eta 0:01:22 lr 0.000721 wd 0.0500 time 0.2384 (0.2941) data time 0.0008 (0.0047) model time 0.2376 (0.2894) loss 3.2690 (3.3357) grad_norm 2.8877 (2.2056) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][980/1251] eta 0:01:19 lr 0.000721 wd 0.0500 time 0.2355 (0.2921) data time 0.0010 (0.0046) model time 0.2345 (0.2876) loss 3.3390 (3.3290) grad_norm 2.4938 (2.2045) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][990/1251] eta 0:01:15 lr 0.000721 wd 0.0500 time 0.2443 (0.2904) data time 0.0008 (0.0045) model time 0.2435 (0.2859) loss 2.8412 (3.3259) grad_norm 2.0123 (2.2239) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1000/1251] eta 0:01:12 lr 0.000721 wd 0.0500 time 0.2351 (0.2895) data time 0.0009 (0.0043) model time 0.2342 (0.2852) loss 2.5155 (3.3227) grad_norm 1.5985 (2.2157) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1010/1251] eta 0:01:09 lr 0.000721 wd 0.0500 time 0.2466 (0.2880) data time 0.0011 (0.0042) model time 0.2455 (0.2837) loss 3.9306 (3.3170) grad_norm 2.8888 (2.2211) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1020/1251] eta 0:01:06 lr 0.000721 wd 0.0500 time 0.2412 (0.2874) data time 0.0011 (0.0041) model time 0.2401 (0.2832) loss 3.5560 (3.3186) grad_norm 2.4488 (2.2286) 
loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1030/1251] eta 0:01:03 lr 0.000721 wd 0.0500 time 0.2288 (0.2860) data time 0.0009 (0.0040) model time 0.2279 (0.2819) loss 3.7911 (3.3298) grad_norm 2.0568 (2.2222) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1040/1251] eta 0:01:00 lr 0.000721 wd 0.0500 time 0.2351 (0.2846) data time 0.0007 (0.0040) model time 0.2344 (0.2807) loss 3.4956 (3.3280) grad_norm 2.4136 (2.2136) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1050/1251] eta 0:00:56 lr 0.000721 wd 0.0500 time 0.2459 (0.2834) data time 0.0009 (0.0039) model time 0.2450 (0.2795) loss 3.7189 (3.3319) grad_norm 2.0899 (2.2124) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1060/1251] eta 0:00:53 lr 0.000721 wd 0.0500 time 0.2399 (0.2822) data time 0.0010 (0.0038) model time 0.2390 (0.2784) loss 3.8444 (3.3350) grad_norm 3.7427 (2.2125) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1070/1251] eta 0:00:50 lr 0.000720 wd 0.0500 time 0.2381 (0.2810) data time 0.0008 (0.0037) model time 0.2373 (0.2773) loss 2.8464 (3.3344) grad_norm 1.9573 (2.2092) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1080/1251] eta 0:00:47 lr 0.000720 wd 0.0500 time 0.2430 (0.2799) data time 0.0010 (0.0037) model time 0.2420 (0.2763) loss 3.3784 (3.3336) grad_norm 2.2896 (2.2111) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1090/1251] eta 0:00:44 lr 0.000720 wd 0.0500 time 0.2382 
(0.2790) data time 0.0009 (0.0037) model time 0.2373 (0.2753) loss 2.6382 (3.3265) grad_norm 2.1946 (2.2066) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1100/1251] eta 0:00:41 lr 0.000720 wd 0.0500 time 0.2321 (0.2780) data time 0.0011 (0.0036) model time 0.2311 (0.2744) loss 3.6165 (3.3259) grad_norm 1.8297 (2.2035) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1110/1251] eta 0:00:39 lr 0.000720 wd 0.0500 time 0.2421 (0.2772) data time 0.0008 (0.0036) model time 0.2413 (0.2736) loss 3.8698 (3.3309) grad_norm 2.2428 (2.2057) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1120/1251] eta 0:00:36 lr 0.000720 wd 0.0500 time 0.2427 (0.2763) data time 0.0009 (0.0036) model time 0.2418 (0.2728) loss 2.9441 (3.3331) grad_norm 2.8083 (2.2072) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1130/1251] eta 0:00:33 lr 0.000720 wd 0.0500 time 0.2433 (0.2756) data time 0.0008 (0.0035) model time 0.2425 (0.2721) loss 3.8251 (3.3310) grad_norm 1.7185 (2.1974) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1140/1251] eta 0:00:30 lr 0.000720 wd 0.0500 time 0.2439 (0.2748) data time 0.0010 (0.0034) model time 0.2429 (0.2713) loss 3.6626 (3.3395) grad_norm 1.8544 (2.1967) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1150/1251] eta 0:00:27 lr 0.000720 wd 0.0500 time 0.2344 (0.2740) data time 0.0009 (0.0034) model time 0.2335 (0.2706) loss 3.5722 (3.3449) grad_norm 1.9155 (2.1936) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:50 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1160/1251] eta 0:00:24 lr 0.000720 wd 0.0500 time 0.2328 (0.2732) data time 0.0008 (0.0033) model time 0.2320 (0.2699) loss 4.2407 (3.3416) grad_norm 1.7896 (2.1998) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1170/1251] eta 0:00:22 lr 0.000720 wd 0.0500 time 0.2429 (0.2726) data time 0.0009 (0.0033) model time 0.2420 (0.2693) loss 3.6366 (3.3357) grad_norm 1.6989 (2.1972) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1180/1251] eta 0:00:19 lr 0.000720 wd 0.0500 time 0.2416 (0.2719) data time 0.0009 (0.0033) model time 0.2407 (0.2686) loss 3.5909 (3.3287) grad_norm 1.8895 (2.1954) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1190/1251] eta 0:00:16 lr 0.000720 wd 0.0500 time 0.2324 (0.2713) data time 0.0010 (0.0032) model time 0.2314 (0.2680) loss 3.3818 (3.3318) grad_norm 3.0868 (2.2032) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1200/1251] eta 0:00:13 lr 0.000720 wd 0.0500 time 0.2396 (0.2707) data time 0.0011 (0.0032) model time 0.2385 (0.2675) loss 2.9397 (3.3344) grad_norm 1.8375 (2.2062) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1210/1251] eta 0:00:11 lr 0.000720 wd 0.0500 time 0.2401 (0.2701) data time 0.0010 (0.0032) model time 0.2391 (0.2669) loss 3.5116 (3.3326) grad_norm 2.1103 (2.2003) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1220/1251] eta 0:00:08 lr 0.000720 wd 0.0500 time 0.2306 (0.2696) data time 0.0010 (0.0032) model time 0.2296 (0.2664) loss 
3.4692 (3.3401) grad_norm 2.3235 (2.2015) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 18:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 18:03:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 18:03:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 18:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 18:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 18:07:00 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 18:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 18:07:09 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 18:07:10 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-26 18:07:12 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-26 18:07:12 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 119)
[2024-08-26 18:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-26 18:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 18:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 18:09:36 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 18:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-26 18:09:44 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-26 18:09:45 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-26 18:09:47 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-26 18:09:47 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 119)
[2024-08-26 18:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-26 18:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1230/1251] eta 0:02:24 lr 0.000720 wd 0.0500 time 0.3803 (6.9004) data time 0.0009 (0.4179) model time 0.3793 (6.4824) loss 4.3301 (4.3947) grad_norm 2.1689 (2.2383) loss_scale 2048.0000 (2048.0000) mem 7378MB
[2024-08-26 18:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1240/1251] eta 0:00:14 lr 0.000720 wd 0.0500 time 0.2218 (1.3427) data time 0.0004 (0.0707) model time 0.2214 (1.2721) loss 2.6998 (3.6784) grad_norm 1.6874 (1.9071) loss_scale 2048.0000 (2048.0000) mem 7377MB
[2024-08-26 18:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [119/300][1250/1251] eta 0:00:00 lr 0.000720 wd 0.0500 time 0.2219 (0.8333) data time 0.0005 (0.0388) model time 0.2214 (0.7944) loss 3.7604 (3.6214) grad_norm 3.5810 (2.0583) loss_scale 2048.0000 (2048.0000) mem 7377MB
[2024-08-26 18:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 119 training takes 0:00:18
[2024-08-26 18:10:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 18:10:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 18:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.408 (0.408) Loss 0.4924 (0.4924) Acc@1 90.625 (90.625) Acc@5 97.949 (97.949) Mem 7377MB [2024-08-26 18:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.101) Loss 0.8145 (0.7494) Acc@1 82.422 (83.327) Acc@5 95.801 (96.529) Mem 7377MB [2024-08-26 18:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.088) Loss 1.1045 (0.7760) Acc@1 72.656 (82.250) Acc@5 93.066 (96.466) Mem 7377MB [2024-08-26 18:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.084) Loss 1.3799 (0.8872) Acc@1 65.723 (79.791) Acc@5 89.355 (95.086) Mem 7377MB [2024-08-26 18:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.079) Loss 1.1768 (0.9457) Acc@1 71.582 (78.268) Acc@5 91.504 (94.431) Mem 7377MB [2024-08-26 18:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.958 Acc@5 94.396 [2024-08-26 18:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.0% [2024-08-26 18:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.96% [2024-08-26 18:10:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 18:10:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 18:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.497 (0.497) Loss 0.4226 (0.4226) Acc@1 91.992 (91.992) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-26 18:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.109) Loss 0.6821 (0.6664) Acc@1 86.133 (85.609) Acc@5 96.680 (97.195) Mem 7377MB [2024-08-26 18:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.091) Loss 0.9487 (0.6885) Acc@1 78.320 (84.696) Acc@5 94.922 (97.205) Mem 7377MB [2024-08-26 18:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.084) Loss 1.2168 (0.7844) Acc@1 68.457 (82.371) Acc@5 91.699 (96.103) Mem 7377MB [2024-08-26 18:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.080) Loss 1.1016 (0.8346) Acc@1 72.949 (81.007) Acc@5 93.066 (95.572) Mem 7377MB [2024-08-26 18:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.608 Acc@5 95.536 [2024-08-26 18:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.6% [2024-08-26 18:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.61% [2024-08-26 18:10:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 18:10:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 18:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][0/1251] eta 0:12:43 lr 0.000720 wd 0.0500 time 0.6102 (0.6102) data time 0.3632 (0.3632) model time 0.0000 (0.0000) loss 4.0520 (4.0520) grad_norm 2.1410 (2.1410) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 18:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][10/1251] eta 0:05:35 lr 0.000720 wd 0.0500 time 0.2363 (0.2707) data time 0.0010 (0.0340) model time 0.0000 (0.0000) loss 3.7106 (3.5975) grad_norm 2.0484 (2.3851) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 18:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][20/1251] eta 0:05:15 lr 0.000720 wd 0.0500 time 0.2336 (0.2564) data time 0.0010 (0.0184) model time 0.0000 (0.0000) loss 3.4215 (3.3827) grad_norm 2.6892 (2.1702) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 18:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][30/1251] eta 0:05:06 lr 0.000720 wd 0.0500 time 0.2487 (0.2508) data time 0.0009 (0.0128) model time 0.0000 (0.0000) loss 3.2865 (3.4126) grad_norm 1.6518 (2.2344) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 18:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][40/1251] eta 0:05:00 lr 0.000720 wd 0.0500 time 0.2345 (0.2481) data time 0.0013 (0.0100) model time 0.0000 (0.0000) loss 2.9276 (3.3751) grad_norm 1.9833 (2.2979) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 18:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][50/1251] eta 0:04:56 lr 0.000720 wd 0.0500 time 0.2368 (0.2467) data time 0.0010 (0.0082) model time 0.0000 (0.0000) loss 4.0206 (3.3780) grad_norm 2.0161 (2.3196) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 18:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][60/1251] eta 0:04:52 lr 0.000720 wd 0.0500 time 0.2314 (0.2452) data time 0.0011 (0.0071) model time 0.2304 
(0.2362) loss 2.3346 (3.3303) grad_norm 1.9111 (2.2977) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 18:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][70/1251] eta 0:04:48 lr 0.000719 wd 0.0500 time 0.2327 (0.2439) data time 0.0009 (0.0063) model time 0.2319 (0.2354) loss 4.0896 (3.3462) grad_norm 1.5902 (2.2481) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 18:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 18:10:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 18:10:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 19:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 19:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 19:34:47 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 19:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 19:34:57 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 19:34:59 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 19:35:00 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 19:35:00 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 120) [2024-08-26 19:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 19:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][80/1251] eta 2:35:27 lr 0.000719 wd 0.0500 time 0.4951 (7.9656) data time 0.0009 (0.7262) model time 0.4943 (7.2394) loss 4.0550 (4.0606) grad_norm 1.8664 (1.9718) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 19:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][90/1251] eta 0:29:20 lr 0.000719 wd 0.0500 time 0.2263 (1.5159) data time 0.0007 (0.1219) model time 0.2256 (1.3940) loss 2.7748 (3.6107) grad_norm 2.2919 (2.2586) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][100/1251] eta 0:17:50 lr 0.000719 wd 0.0500 time 0.2199 (0.9298) data time 0.0014 (0.0670) model time 0.2186 (0.8628) loss 3.8905 (3.6241) grad_norm 2.0788 (2.2112) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][110/1251] eta 0:13:29 lr 0.000719 wd 0.0500 time 0.2272 (0.7099) data time 0.0007 (0.0464) model time 0.2265 (0.6635) loss 3.5708 (3.5978) grad_norm 2.3195 (2.2106) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][120/1251] eta 0:11:12 lr 0.000719 wd 0.0500 time 0.2254 (0.5950) data time 0.0009 (0.0356) model time 0.2245 (0.5594) loss 3.2266 (3.5290) grad_norm 2.5156 (2.1702) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[120/300][130/1251] eta 0:09:48 lr 0.000719 wd 0.0500 time 0.2250 (0.5246) data time 0.0008 (0.0290) model time 0.2242 (0.4956) loss 3.2801 (3.5141) grad_norm 1.8882 (2.1545) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][140/1251] eta 0:08:50 lr 0.000719 wd 0.0500 time 0.2224 (0.4771) data time 0.0010 (0.0245) model time 0.2214 (0.4527) loss 3.9331 (3.4951) grad_norm 2.0295 (2.2938) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][150/1251] eta 0:08:07 lr 0.000719 wd 0.0500 time 0.2293 (0.4428) data time 0.0009 (0.0212) model time 0.2284 (0.4216) loss 3.4662 (3.4446) grad_norm 2.0498 (2.3609) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][160/1251] eta 0:07:34 lr 0.000719 wd 0.0500 time 0.2296 (0.4168) data time 0.0010 (0.0188) model time 0.2286 (0.3980) loss 3.6494 (3.4382) grad_norm 3.6940 (2.3368) loss_scale 4096.0000 (2097.9512) mem 7377MB [2024-08-26 19:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][170/1251] eta 0:07:08 lr 0.000719 wd 0.0500 time 0.2266 (0.3963) data time 0.0007 (0.0169) model time 0.2259 (0.3794) loss 2.0892 (3.4123) grad_norm 1.6965 (2.2747) loss_scale 4096.0000 (2315.1304) mem 7377MB [2024-08-26 19:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][180/1251] eta 0:06:46 lr 0.000719 wd 0.0500 time 0.2307 (0.3799) data time 0.0007 (0.0153) model time 0.2301 (0.3645) loss 4.0639 (3.4281) grad_norm 2.1221 (2.2849) loss_scale 4096.0000 (2489.7255) mem 7377MB [2024-08-26 19:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][190/1251] eta 0:06:28 lr 0.000719 wd 0.0500 time 0.2323 (0.3664) data time 0.0009 (0.0141) model time 0.2313 (0.3524) loss 3.7024 (3.4263) grad_norm 2.2718 (nan) loss_scale 2048.0000 
(2450.2857) mem 7377MB [2024-08-26 19:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][200/1251] eta 0:06:13 lr 0.000719 wd 0.0500 time 0.2300 (0.3553) data time 0.0010 (0.0130) model time 0.2290 (0.3423) loss 3.5477 (3.4320) grad_norm 1.7344 (nan) loss_scale 2048.0000 (2417.3115) mem 7377MB [2024-08-26 19:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][210/1251] eta 0:05:59 lr 0.000719 wd 0.0500 time 0.2241 (0.3455) data time 0.0013 (0.0121) model time 0.2227 (0.3334) loss 3.7958 (3.4319) grad_norm 2.1053 (nan) loss_scale 2048.0000 (2389.3333) mem 7377MB [2024-08-26 19:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][220/1251] eta 0:05:47 lr 0.000719 wd 0.0500 time 0.2330 (0.3372) data time 0.0011 (0.0113) model time 0.2319 (0.3259) loss 3.8428 (3.4252) grad_norm 2.2500 (nan) loss_scale 2048.0000 (2365.2958) mem 7377MB [2024-08-26 19:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][230/1251] eta 0:05:36 lr 0.000719 wd 0.0500 time 0.2251 (0.3300) data time 0.0009 (0.0107) model time 0.2241 (0.3193) loss 3.3859 (3.4185) grad_norm 3.1786 (nan) loss_scale 2048.0000 (2344.4211) mem 7377MB [2024-08-26 19:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][240/1251] eta 0:05:27 lr 0.000719 wd 0.0500 time 0.2166 (0.3235) data time 0.0012 (0.0101) model time 0.2154 (0.3134) loss 3.8632 (3.4247) grad_norm 2.2823 (nan) loss_scale 2048.0000 (2326.1235) mem 7377MB [2024-08-26 19:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][250/1251] eta 0:05:18 lr 0.000719 wd 0.0500 time 0.2239 (0.3182) data time 0.0010 (0.0096) model time 0.2230 (0.3086) loss 2.8615 (3.4097) grad_norm 1.8097 (nan) loss_scale 2048.0000 (2309.9535) mem 7377MB [2024-08-26 19:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][260/1251] eta 0:05:10 lr 0.000719 wd 0.0500 time 0.2208 (0.3131) data time 0.0011 (0.0091) model time 
0.2197 (0.3040) loss 3.9281 (3.3965) grad_norm 1.9479 (nan) loss_scale 2048.0000 (2295.5604) mem 7377MB [2024-08-26 19:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][270/1251] eta 0:05:02 lr 0.000719 wd 0.0500 time 0.2278 (0.3087) data time 0.0009 (0.0087) model time 0.2269 (0.3000) loss 3.0792 (3.3892) grad_norm 2.4229 (nan) loss_scale 2048.0000 (2282.6667) mem 7377MB [2024-08-26 19:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][280/1251] eta 0:04:55 lr 0.000719 wd 0.0500 time 0.2266 (0.3047) data time 0.0007 (0.0083) model time 0.2258 (0.2964) loss 3.6887 (3.3756) grad_norm 2.3652 (nan) loss_scale 2048.0000 (2271.0495) mem 7377MB [2024-08-26 19:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][290/1251] eta 0:04:49 lr 0.000719 wd 0.0500 time 0.2253 (0.3009) data time 0.0009 (0.0080) model time 0.2243 (0.2930) loss 3.4316 (3.3703) grad_norm 1.6961 (nan) loss_scale 2048.0000 (2260.5283) mem 7377MB [2024-08-26 19:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][300/1251] eta 0:04:43 lr 0.000719 wd 0.0500 time 0.2303 (0.2977) data time 0.0008 (0.0077) model time 0.2295 (0.2900) loss 4.1013 (3.3685) grad_norm 2.0533 (nan) loss_scale 2048.0000 (2250.9550) mem 7377MB [2024-08-26 19:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][310/1251] eta 0:04:37 lr 0.000719 wd 0.0500 time 0.2267 (0.2946) data time 0.0007 (0.0074) model time 0.2260 (0.2872) loss 3.3435 (3.3585) grad_norm 2.0823 (nan) loss_scale 2048.0000 (2242.2069) mem 7377MB [2024-08-26 19:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][320/1251] eta 0:04:31 lr 0.000718 wd 0.0500 time 0.2275 (0.2919) data time 0.0007 (0.0072) model time 0.2269 (0.2847) loss 3.8326 (3.3579) grad_norm 2.9163 (nan) loss_scale 2048.0000 (2234.1818) mem 7377MB [2024-08-26 19:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][330/1251] eta 0:04:26 
lr 0.000718 wd 0.0500 time 0.2264 (0.2895) data time 0.0010 (0.0069) model time 0.2254 (0.2825) loss 3.9526 (3.3503) grad_norm 2.0568 (nan) loss_scale 2048.0000 (2226.7937) mem 7377MB [2024-08-26 19:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][340/1251] eta 0:04:21 lr 0.000718 wd 0.0500 time 0.2275 (0.2871) data time 0.0010 (0.0067) model time 0.2265 (0.2804) loss 3.3447 (3.3406) grad_norm 2.7779 (nan) loss_scale 2048.0000 (2219.9695) mem 7377MB [2024-08-26 19:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][350/1251] eta 0:04:16 lr 0.000718 wd 0.0500 time 0.2295 (0.2850) data time 0.0009 (0.0065) model time 0.2286 (0.2785) loss 3.1758 (3.3310) grad_norm 2.3998 (nan) loss_scale 2048.0000 (2213.6471) mem 7377MB [2024-08-26 19:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][360/1251] eta 0:04:12 lr 0.000718 wd 0.0500 time 0.2370 (0.2831) data time 0.0009 (0.0063) model time 0.2362 (0.2768) loss 2.1648 (3.3285) grad_norm 1.5963 (nan) loss_scale 2048.0000 (2207.7730) mem 7377MB [2024-08-26 19:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][370/1251] eta 0:04:08 lr 0.000718 wd 0.0500 time 0.2255 (0.2820) data time 0.0012 (0.0061) model time 0.2243 (0.2759) loss 3.0893 (3.3256) grad_norm 3.2240 (nan) loss_scale 2048.0000 (2202.3014) mem 7377MB [2024-08-26 19:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][380/1251] eta 0:04:04 lr 0.000718 wd 0.0500 time 0.2245 (0.2803) data time 0.0011 (0.0060) model time 0.2234 (0.2743) loss 3.6685 (3.3168) grad_norm 2.5421 (nan) loss_scale 2048.0000 (2197.1921) mem 7377MB [2024-08-26 19:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][390/1251] eta 0:04:00 lr 0.000718 wd 0.0500 time 0.2253 (0.2794) data time 0.0009 (0.0058) model time 0.2244 (0.2736) loss 3.7966 (3.3184) grad_norm 2.7315 (nan) loss_scale 2048.0000 (2192.4103) mem 7377MB [2024-08-26 19:36:33 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][400/1251] eta 0:03:56 lr 0.000718 wd 0.0500 time 0.2298 (0.2778) data time 0.0011 (0.0057) model time 0.2287 (0.2722) loss 3.5791 (3.3306) grad_norm 2.2358 (nan) loss_scale 2048.0000 (2187.9255) mem 7377MB [2024-08-26 19:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][410/1251] eta 0:03:52 lr 0.000718 wd 0.0500 time 0.2213 (0.2764) data time 0.0010 (0.0056) model time 0.2203 (0.2708) loss 3.7052 (3.3283) grad_norm 2.4753 (nan) loss_scale 2048.0000 (2183.7108) mem 7377MB [2024-08-26 19:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][420/1251] eta 0:03:48 lr 0.000718 wd 0.0500 time 0.2250 (0.2750) data time 0.0007 (0.0054) model time 0.2243 (0.2696) loss 3.5672 (3.3314) grad_norm 1.8082 (nan) loss_scale 2048.0000 (2179.7427) mem 7377MB [2024-08-26 19:36:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][430/1251] eta 0:03:44 lr 0.000718 wd 0.0500 time 0.2257 (0.2737) data time 0.0007 (0.0053) model time 0.2250 (0.2684) loss 3.9041 (3.3324) grad_norm 3.1962 (nan) loss_scale 2048.0000 (2176.0000) mem 7377MB [2024-08-26 19:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][440/1251] eta 0:03:40 lr 0.000718 wd 0.0500 time 0.2293 (0.2725) data time 0.0008 (0.0052) model time 0.2284 (0.2673) loss 3.9551 (3.3359) grad_norm 3.7352 (nan) loss_scale 2048.0000 (2172.4641) mem 7377MB [2024-08-26 19:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][450/1251] eta 0:03:37 lr 0.000718 wd 0.0500 time 0.2197 (0.2713) data time 0.0011 (0.0051) model time 0.2186 (0.2662) loss 3.3120 (3.3324) grad_norm 2.2084 (nan) loss_scale 2048.0000 (2169.1183) mem 7377MB [2024-08-26 19:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][460/1251] eta 0:03:33 lr 0.000718 wd 0.0500 time 0.2366 (0.2703) data time 0.0007 (0.0050) model time 0.2359 (0.2653) loss 3.6526 (3.3320) 
grad_norm 2.3410 (nan) loss_scale 2048.0000 (2165.9476) mem 7377MB [2024-08-26 19:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][470/1251] eta 0:03:30 lr 0.000718 wd 0.0500 time 0.2234 (0.2691) data time 0.0008 (0.0049) model time 0.2226 (0.2642) loss 3.3183 (3.3280) grad_norm 2.5726 (nan) loss_scale 2048.0000 (2162.9388) mem 7377MB [2024-08-26 19:36:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][480/1251] eta 0:03:26 lr 0.000718 wd 0.0500 time 0.2317 (0.2681) data time 0.0008 (0.0048) model time 0.2309 (0.2633) loss 3.9952 (3.3351) grad_norm 1.9484 (nan) loss_scale 2048.0000 (2160.0796) mem 7377MB [2024-08-26 19:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][490/1251] eta 0:03:23 lr 0.000718 wd 0.0500 time 0.2239 (0.2671) data time 0.0013 (0.0047) model time 0.2226 (0.2624) loss 3.5972 (3.3380) grad_norm 2.0604 (nan) loss_scale 2048.0000 (2157.3592) mem 7377MB [2024-08-26 19:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][500/1251] eta 0:03:19 lr 0.000718 wd 0.0500 time 0.2276 (0.2662) data time 0.0008 (0.0046) model time 0.2268 (0.2616) loss 3.4922 (3.3359) grad_norm 3.5075 (nan) loss_scale 2048.0000 (2154.7678) mem 7377MB [2024-08-26 19:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][510/1251] eta 0:03:16 lr 0.000718 wd 0.0500 time 0.2298 (0.2654) data time 0.0008 (0.0045) model time 0.2290 (0.2609) loss 3.3165 (3.3419) grad_norm 2.1453 (nan) loss_scale 2048.0000 (2152.2963) mem 7377MB [2024-08-26 19:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][520/1251] eta 0:03:13 lr 0.000718 wd 0.0500 time 0.2240 (0.2646) data time 0.0007 (0.0044) model time 0.2233 (0.2601) loss 3.3722 (3.3462) grad_norm 4.3613 (nan) loss_scale 2048.0000 (2149.9367) mem 7377MB [2024-08-26 19:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][530/1251] eta 0:03:10 lr 0.000718 wd 0.0500 time 0.2320 
(0.2638) data time 0.0013 (0.0044) model time 0.2308 (0.2594) loss 2.5498 (3.3454) grad_norm 1.8385 (nan) loss_scale 2048.0000 (2147.6814) mem 7377MB [2024-08-26 19:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][540/1251] eta 0:03:07 lr 0.000718 wd 0.0500 time 0.2248 (0.2630) data time 0.0007 (0.0043) model time 0.2242 (0.2587) loss 2.0619 (3.3381) grad_norm 1.9261 (nan) loss_scale 2048.0000 (2145.5238) mem 7377MB [2024-08-26 19:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][550/1251] eta 0:03:03 lr 0.000718 wd 0.0500 time 0.2286 (0.2623) data time 0.0009 (0.0042) model time 0.2277 (0.2581) loss 3.5155 (3.3318) grad_norm 1.8178 (nan) loss_scale 2048.0000 (2143.4576) mem 7377MB [2024-08-26 19:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][560/1251] eta 0:03:00 lr 0.000718 wd 0.0500 time 0.2231 (0.2615) data time 0.0009 (0.0042) model time 0.2221 (0.2574) loss 4.2590 (3.3322) grad_norm 2.0387 (nan) loss_scale 2048.0000 (2141.4772) mem 7377MB [2024-08-26 19:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][570/1251] eta 0:02:57 lr 0.000717 wd 0.0500 time 0.2244 (0.2609) data time 0.0010 (0.0041) model time 0.2233 (0.2568) loss 3.9422 (3.3365) grad_norm 2.9450 (nan) loss_scale 2048.0000 (2139.5772) mem 7377MB [2024-08-26 19:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][580/1251] eta 0:02:54 lr 0.000717 wd 0.0500 time 0.2240 (0.2602) data time 0.0008 (0.0040) model time 0.2232 (0.2561) loss 3.5645 (3.3349) grad_norm 1.8657 (nan) loss_scale 2048.0000 (2137.7530) mem 7377MB [2024-08-26 19:37:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][590/1251] eta 0:02:51 lr 0.000717 wd 0.0500 time 0.2231 (0.2595) data time 0.0007 (0.0040) model time 0.2224 (0.2555) loss 3.9623 (3.3410) grad_norm 1.5163 (nan) loss_scale 2048.0000 (2136.0000) mem 7377MB [2024-08-26 19:37:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [120/300][600/1251] eta 0:02:48 lr 0.000717 wd 0.0500 time 0.2249 (0.2589) data time 0.0007 (0.0039) model time 0.2242 (0.2549) loss 2.6430 (3.3392) grad_norm 1.7142 (nan) loss_scale 2048.0000 (2134.3142) mem 7377MB [2024-08-26 19:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][610/1251] eta 0:02:45 lr 0.000717 wd 0.0500 time 0.2256 (0.2582) data time 0.0009 (0.0039) model time 0.2247 (0.2544) loss 3.3408 (3.3359) grad_norm 2.0951 (nan) loss_scale 2048.0000 (2132.6917) mem 7377MB [2024-08-26 19:37:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][620/1251] eta 0:02:42 lr 0.000717 wd 0.0500 time 0.2339 (0.2577) data time 0.0007 (0.0039) model time 0.2332 (0.2538) loss 2.9569 (3.3340) grad_norm 1.4415 (nan) loss_scale 2048.0000 (2131.1292) mem 7377MB [2024-08-26 19:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][630/1251] eta 0:02:39 lr 0.000717 wd 0.0500 time 0.2218 (0.2571) data time 0.0008 (0.0038) model time 0.2210 (0.2533) loss 4.1148 (3.3374) grad_norm 2.2542 (nan) loss_scale 2048.0000 (2129.6232) mem 7377MB [2024-08-26 19:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][640/1251] eta 0:02:36 lr 0.000717 wd 0.0500 time 0.2225 (0.2566) data time 0.0009 (0.0038) model time 0.2217 (0.2528) loss 3.8509 (3.3419) grad_norm 1.7609 (nan) loss_scale 2048.0000 (2128.1708) mem 7377MB [2024-08-26 19:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][650/1251] eta 0:02:33 lr 0.000717 wd 0.0500 time 0.2244 (0.2561) data time 0.0010 (0.0037) model time 0.2234 (0.2524) loss 3.2212 (3.3433) grad_norm 1.9004 (nan) loss_scale 2048.0000 (2126.7692) mem 7377MB [2024-08-26 19:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][660/1251] eta 0:02:31 lr 0.000717 wd 0.0500 time 0.2247 (0.2556) data time 0.0007 (0.0037) model time 0.2240 (0.2519) loss 3.4290 (3.3451) grad_norm 1.6398 (nan) loss_scale 2048.0000 
(2125.4158) mem 7377MB [2024-08-26 19:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][670/1251] eta 0:02:28 lr 0.000717 wd 0.0500 time 0.2239 (0.2551) data time 0.0010 (0.0036) model time 0.2229 (0.2514) loss 3.3521 (3.3454) grad_norm 1.7108 (nan) loss_scale 2048.0000 (2124.1081) mem 7377MB [2024-08-26 19:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][680/1251] eta 0:02:25 lr 0.000717 wd 0.0500 time 0.2220 (0.2546) data time 0.0010 (0.0036) model time 0.2210 (0.2510) loss 3.3631 (3.3435) grad_norm 1.7559 (nan) loss_scale 2048.0000 (2122.8439) mem 7377MB [2024-08-26 19:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][690/1251] eta 0:02:22 lr 0.000717 wd 0.0500 time 0.2297 (0.2541) data time 0.0010 (0.0035) model time 0.2287 (0.2506) loss 2.3858 (3.3430) grad_norm 1.6094 (nan) loss_scale 2048.0000 (2121.6209) mem 7377MB [2024-08-26 19:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][700/1251] eta 0:02:19 lr 0.000717 wd 0.0500 time 0.2362 (0.2538) data time 0.0008 (0.0035) model time 0.2354 (0.2503) loss 3.3566 (3.3465) grad_norm 2.2828 (nan) loss_scale 2048.0000 (2120.4373) mem 7377MB [2024-08-26 19:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][710/1251] eta 0:02:17 lr 0.000717 wd 0.0500 time 0.2305 (0.2535) data time 0.0015 (0.0035) model time 0.2289 (0.2500) loss 3.6848 (3.3499) grad_norm 2.1531 (nan) loss_scale 2048.0000 (2119.2911) mem 7377MB [2024-08-26 19:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][720/1251] eta 0:02:14 lr 0.000717 wd 0.0500 time 0.2260 (0.2531) data time 0.0011 (0.0035) model time 0.2249 (0.2497) loss 2.6187 (3.3471) grad_norm 2.2548 (nan) loss_scale 2048.0000 (2118.1807) mem 7377MB [2024-08-26 19:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][730/1251] eta 0:02:11 lr 0.000717 wd 0.0500 time 0.2222 (0.2528) data time 0.0014 (0.0034) model time 
0.2208 (0.2494) loss 2.3971 (3.3472) grad_norm 1.8557 (nan) loss_scale 2048.0000 (2117.1043) mem 7377MB [2024-08-26 19:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][740/1251] eta 0:02:09 lr 0.000717 wd 0.0500 time 0.2322 (0.2525) data time 0.0013 (0.0034) model time 0.2308 (0.2491) loss 3.4409 (3.3432) grad_norm 2.0505 (nan) loss_scale 2048.0000 (2116.0604) mem 7377MB [2024-08-26 19:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][750/1251] eta 0:02:06 lr 0.000717 wd 0.0500 time 0.2216 (0.2521) data time 0.0010 (0.0034) model time 0.2207 (0.2488) loss 3.4553 (3.3462) grad_norm 1.7499 (nan) loss_scale 2048.0000 (2115.0476) mem 7377MB [2024-08-26 19:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][760/1251] eta 0:02:03 lr 0.000717 wd 0.0500 time 0.2184 (0.2518) data time 0.0011 (0.0033) model time 0.2173 (0.2484) loss 3.0379 (3.3479) grad_norm 1.8197 (nan) loss_scale 2048.0000 (2114.0645) mem 7377MB [2024-08-26 19:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][770/1251] eta 0:02:00 lr 0.000717 wd 0.0500 time 0.2270 (0.2514) data time 0.0010 (0.0033) model time 0.2260 (0.2482) loss 3.8668 (3.3463) grad_norm 1.9012 (nan) loss_scale 2048.0000 (2113.1098) mem 7377MB [2024-08-26 19:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][780/1251] eta 0:01:58 lr 0.000717 wd 0.0500 time 0.2249 (0.2511) data time 0.0007 (0.0033) model time 0.2242 (0.2478) loss 2.5160 (3.3432) grad_norm 1.8035 (nan) loss_scale 2048.0000 (2112.1823) mem 7377MB [2024-08-26 19:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][790/1251] eta 0:01:55 lr 0.000717 wd 0.0500 time 0.2208 (0.2508) data time 0.0011 (0.0032) model time 0.2197 (0.2475) loss 2.5059 (3.3417) grad_norm 1.7568 (nan) loss_scale 2048.0000 (2111.2809) mem 7377MB [2024-08-26 19:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][800/1251] eta 0:01:52 
lr 0.000717 wd 0.0500 time 0.2210 (0.2505) data time 0.0008 (0.0032) model time 0.2202 (0.2473) loss 3.1178 (3.3400) grad_norm 2.7478 (nan) loss_scale 2048.0000 (2110.4044) mem 7377MB [2024-08-26 19:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][810/1251] eta 0:01:50 lr 0.000717 wd 0.0500 time 0.2336 (0.2502) data time 0.0011 (0.0032) model time 0.2326 (0.2470) loss 3.5139 (3.3416) grad_norm 2.1488 (nan) loss_scale 2048.0000 (2109.5519) mem 7377MB [2024-08-26 19:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][820/1251] eta 0:01:47 lr 0.000716 wd 0.0500 time 0.2314 (0.2499) data time 0.0009 (0.0031) model time 0.2305 (0.2468) loss 2.8083 (3.3441) grad_norm 1.6583 (nan) loss_scale 2048.0000 (2108.7224) mem 7377MB [2024-08-26 19:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][830/1251] eta 0:01:45 lr 0.000716 wd 0.0500 time 0.2198 (0.2496) data time 0.0008 (0.0031) model time 0.2190 (0.2465) loss 4.0444 (3.3439) grad_norm 1.9249 (nan) loss_scale 2048.0000 (2107.9149) mem 7377MB [2024-08-26 19:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][840/1251] eta 0:01:42 lr 0.000716 wd 0.0500 time 0.2230 (0.2494) data time 0.0008 (0.0031) model time 0.2222 (0.2463) loss 4.0368 (3.3436) grad_norm 3.7423 (nan) loss_scale 2048.0000 (2107.1286) mem 7377MB [2024-08-26 19:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][850/1251] eta 0:01:39 lr 0.000716 wd 0.0500 time 0.2347 (0.2491) data time 0.0008 (0.0031) model time 0.2340 (0.2460) loss 3.1842 (3.3444) grad_norm 2.7669 (nan) loss_scale 2048.0000 (2106.3627) mem 7377MB [2024-08-26 19:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][860/1251] eta 0:01:37 lr 0.000716 wd 0.0500 time 0.2240 (0.2489) data time 0.0007 (0.0031) model time 0.2233 (0.2458) loss 3.4091 (3.3447) grad_norm 1.9370 (nan) loss_scale 2048.0000 (2105.6164) mem 7377MB [2024-08-26 19:38:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][870/1251] eta 0:01:34 lr 0.000716 wd 0.0500 time 0.2403 (0.2486) data time 0.0010 (0.0030) model time 0.2393 (0.2456) loss 3.8287 (3.3433) grad_norm 2.9509 (nan) loss_scale 2048.0000 (2104.8889) mem 7377MB [2024-08-26 19:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][880/1251] eta 0:01:32 lr 0.000716 wd 0.0500 time 0.2324 (0.2484) data time 0.0007 (0.0030) model time 0.2317 (0.2454) loss 2.4758 (3.3430) grad_norm 2.3713 (nan) loss_scale 2048.0000 (2104.1796) mem 7377MB [2024-08-26 19:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][890/1251] eta 0:01:29 lr 0.000716 wd 0.0500 time 0.4388 (0.2484) data time 0.0010 (0.0030) model time 0.4378 (0.2454) loss 3.6681 (3.3365) grad_norm 2.2222 (nan) loss_scale 2048.0000 (2103.4877) mem 7377MB [2024-08-26 19:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][900/1251] eta 0:01:27 lr 0.000716 wd 0.0500 time 0.2461 (0.2482) data time 0.0007 (0.0030) model time 0.2454 (0.2453) loss 2.8656 (3.3360) grad_norm 2.3861 (nan) loss_scale 2048.0000 (2102.8127) mem 7377MB [2024-08-26 19:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][910/1251] eta 0:01:24 lr 0.000716 wd 0.0500 time 0.2286 (0.2482) data time 0.0008 (0.0029) model time 0.2279 (0.2452) loss 3.3209 (3.3325) grad_norm 2.1563 (nan) loss_scale 2048.0000 (2102.1538) mem 7377MB [2024-08-26 19:38:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][920/1251] eta 0:01:22 lr 0.000716 wd 0.0500 time 0.2290 (0.2480) data time 0.0009 (0.0029) model time 0.2281 (0.2451) loss 2.4242 (3.3298) grad_norm 2.9059 (nan) loss_scale 2048.0000 (2101.5107) mem 7377MB [2024-08-26 19:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][930/1251] eta 0:01:19 lr 0.000716 wd 0.0500 time 0.2228 (0.2478) data time 0.0007 (0.0029) model time 0.2221 (0.2449) loss 2.7511 (3.3274) 
grad_norm 1.6516 (nan) loss_scale 2048.0000 (2100.8826) mem 7377MB [2024-08-26 19:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][940/1251] eta 0:01:16 lr 0.000716 wd 0.0500 time 0.2288 (0.2475) data time 0.0009 (0.0029) model time 0.2279 (0.2447) loss 3.2942 (3.3270) grad_norm 2.4363 (nan) loss_scale 2048.0000 (2100.2691) mem 7377MB [2024-08-26 19:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][950/1251] eta 0:01:14 lr 0.000716 wd 0.0500 time 0.2195 (0.2473) data time 0.0007 (0.0029) model time 0.2188 (0.2445) loss 2.3322 (3.3267) grad_norm 2.5774 (nan) loss_scale 2048.0000 (2099.6697) mem 7377MB [2024-08-26 19:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][960/1251] eta 0:01:11 lr 0.000716 wd 0.0500 time 0.2209 (0.2471) data time 0.0010 (0.0028) model time 0.2199 (0.2443) loss 1.9505 (3.3240) grad_norm 2.3720 (nan) loss_scale 2048.0000 (2099.0839) mem 7377MB [2024-08-26 19:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][970/1251] eta 0:01:09 lr 0.000716 wd 0.0500 time 0.2229 (0.2469) data time 0.0007 (0.0028) model time 0.2222 (0.2440) loss 3.7801 (3.3235) grad_norm 2.1587 (nan) loss_scale 2048.0000 (2098.5112) mem 7377MB [2024-08-26 19:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][980/1251] eta 0:01:06 lr 0.000716 wd 0.0500 time 0.2305 (0.2467) data time 0.0009 (0.0028) model time 0.2295 (0.2439) loss 3.7576 (3.3255) grad_norm 2.6311 (nan) loss_scale 2048.0000 (2097.9512) mem 7377MB [2024-08-26 19:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][990/1251] eta 0:01:04 lr 0.000716 wd 0.0500 time 0.2296 (0.2465) data time 0.0007 (0.0028) model time 0.2290 (0.2437) loss 2.4334 (3.3243) grad_norm 1.7294 (nan) loss_scale 2048.0000 (2097.4035) mem 7377MB [2024-08-26 19:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1000/1251] eta 0:01:01 lr 0.000716 wd 0.0500 time 0.2306 
(0.2463) data time 0.0009 (0.0028) model time 0.2297 (0.2435) loss 3.5085 (3.3279) grad_norm 1.8801 (nan) loss_scale 2048.0000 (2096.8677) mem 7377MB [2024-08-26 19:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1010/1251] eta 0:00:59 lr 0.000716 wd 0.0500 time 0.2283 (0.2461) data time 0.0007 (0.0027) model time 0.2276 (0.2434) loss 4.0529 (3.3306) grad_norm 1.6565 (nan) loss_scale 2048.0000 (2096.3433) mem 7377MB [2024-08-26 19:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1020/1251] eta 0:00:56 lr 0.000716 wd 0.0500 time 0.2158 (0.2459) data time 0.0012 (0.0027) model time 0.2147 (0.2432) loss 2.2077 (3.3274) grad_norm 2.2567 (nan) loss_scale 2048.0000 (2095.8301) mem 7377MB [2024-08-26 19:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1030/1251] eta 0:00:54 lr 0.000716 wd 0.0500 time 0.2368 (0.2458) data time 0.0011 (0.0027) model time 0.2358 (0.2430) loss 3.5306 (3.3238) grad_norm 2.4966 (nan) loss_scale 2048.0000 (2095.3277) mem 7377MB [2024-08-26 19:39:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1040/1251] eta 0:00:51 lr 0.000716 wd 0.0500 time 0.2310 (0.2456) data time 0.0009 (0.0027) model time 0.2301 (0.2429) loss 3.4574 (3.3234) grad_norm 2.3178 (nan) loss_scale 2048.0000 (2094.8358) mem 7377MB [2024-08-26 19:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1050/1251] eta 0:00:49 lr 0.000716 wd 0.0500 time 0.2243 (0.2454) data time 0.0008 (0.0027) model time 0.2235 (0.2427) loss 4.0625 (3.3248) grad_norm 2.4386 (nan) loss_scale 2048.0000 (2094.3539) mem 7377MB [2024-08-26 19:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1060/1251] eta 0:00:46 lr 0.000716 wd 0.0500 time 0.2321 (0.2452) data time 0.0008 (0.0027) model time 0.2313 (0.2426) loss 4.2906 (3.3266) grad_norm 1.7451 (nan) loss_scale 2048.0000 (2093.8819) mem 7377MB [2024-08-26 19:39:07 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [120/300][1070/1251] eta 0:00:44 lr 0.000715 wd 0.0500 time 0.2257 (0.2451) data time 0.0014 (0.0027) model time 0.2243 (0.2424) loss 3.8073 (3.3273) grad_norm 1.9734 (nan) loss_scale 2048.0000 (2093.4194) mem 7377MB [2024-08-26 19:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1080/1251] eta 0:00:41 lr 0.000715 wd 0.0500 time 0.2267 (0.2450) data time 0.0009 (0.0026) model time 0.2259 (0.2423) loss 3.2751 (3.3256) grad_norm 2.0633 (nan) loss_scale 2048.0000 (2092.9661) mem 7377MB [2024-08-26 19:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1090/1251] eta 0:00:39 lr 0.000715 wd 0.0500 time 0.2311 (0.2448) data time 0.0009 (0.0026) model time 0.2302 (0.2422) loss 3.5794 (3.3275) grad_norm 2.7194 (nan) loss_scale 2048.0000 (2092.5217) mem 7377MB [2024-08-26 19:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1100/1251] eta 0:00:36 lr 0.000715 wd 0.0500 time 0.2308 (0.2446) data time 0.0010 (0.0026) model time 0.2298 (0.2420) loss 3.4924 (3.3276) grad_norm 2.1021 (nan) loss_scale 2048.0000 (2092.0861) mem 7377MB [2024-08-26 19:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1110/1251] eta 0:00:34 lr 0.000715 wd 0.0500 time 0.2252 (0.2445) data time 0.0007 (0.0026) model time 0.2245 (0.2419) loss 3.6647 (3.3246) grad_norm 2.7956 (nan) loss_scale 2048.0000 (2091.6589) mem 7377MB [2024-08-26 19:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1120/1251] eta 0:00:32 lr 0.000715 wd 0.0500 time 0.2208 (0.2443) data time 0.0007 (0.0026) model time 0.2201 (0.2418) loss 3.9622 (3.3275) grad_norm 1.9167 (nan) loss_scale 2048.0000 (2091.2399) mem 7377MB [2024-08-26 19:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1130/1251] eta 0:00:29 lr 0.000715 wd 0.0500 time 0.2256 (0.2442) data time 0.0010 (0.0026) model time 0.2246 (0.2416) loss 3.8215 (3.3268) grad_norm 2.4785 (nan) 
loss_scale 2048.0000 (2090.8289) mem 7377MB [2024-08-26 19:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1140/1251] eta 0:00:27 lr 0.000715 wd 0.0500 time 0.2262 (0.2441) data time 0.0008 (0.0026) model time 0.2254 (0.2415) loss 3.9379 (3.3292) grad_norm 2.8646 (nan) loss_scale 2048.0000 (2090.4256) mem 7377MB [2024-08-26 19:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1150/1251] eta 0:00:24 lr 0.000715 wd 0.0500 time 0.2211 (0.2439) data time 0.0009 (0.0025) model time 0.2202 (0.2413) loss 2.4247 (3.3280) grad_norm 1.4748 (nan) loss_scale 2048.0000 (2090.0299) mem 7377MB [2024-08-26 19:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1160/1251] eta 0:00:22 lr 0.000715 wd 0.0500 time 0.2238 (0.2438) data time 0.0009 (0.0025) model time 0.2229 (0.2412) loss 3.5818 (3.3287) grad_norm 3.1784 (nan) loss_scale 2048.0000 (2089.6414) mem 7377MB [2024-08-26 19:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1170/1251] eta 0:00:19 lr 0.000715 wd 0.0500 time 0.2251 (0.2436) data time 0.0007 (0.0025) model time 0.2244 (0.2411) loss 2.9583 (3.3269) grad_norm 2.2011 (nan) loss_scale 2048.0000 (2089.2601) mem 7377MB [2024-08-26 19:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1180/1251] eta 0:00:17 lr 0.000715 wd 0.0500 time 0.2279 (0.2435) data time 0.0010 (0.0025) model time 0.2269 (0.2410) loss 2.9446 (3.3266) grad_norm 1.9407 (nan) loss_scale 2048.0000 (2088.8857) mem 7377MB [2024-08-26 19:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1190/1251] eta 0:00:14 lr 0.000715 wd 0.0500 time 0.2219 (0.2434) data time 0.0011 (0.0025) model time 0.2208 (0.2409) loss 3.6728 (3.3282) grad_norm 2.1511 (nan) loss_scale 2048.0000 (2088.5180) mem 7377MB [2024-08-26 19:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1200/1251] eta 0:00:12 lr 0.000715 wd 0.0500 time 0.2290 (0.2432) data time 
0.0007 (0.0025) model time 0.2283 (0.2408) loss 3.3010 (3.3304) grad_norm 2.3579 (nan) loss_scale 2048.0000 (2088.1569) mem 7377MB [2024-08-26 19:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1210/1251] eta 0:00:09 lr 0.000715 wd 0.0500 time 0.2239 (0.2431) data time 0.0010 (0.0025) model time 0.2229 (0.2406) loss 2.2497 (3.3313) grad_norm 2.2765 (nan) loss_scale 2048.0000 (2087.8021) mem 7377MB [2024-08-26 19:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1220/1251] eta 0:00:07 lr 0.000715 wd 0.0500 time 0.2217 (0.2430) data time 0.0012 (0.0025) model time 0.2206 (0.2405) loss 3.5449 (3.3318) grad_norm 1.6457 (nan) loss_scale 2048.0000 (2087.4536) mem 7377MB [2024-08-26 19:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1230/1251] eta 0:00:05 lr 0.000715 wd 0.0500 time 0.2312 (0.2428) data time 0.0009 (0.0025) model time 0.2303 (0.2404) loss 3.8443 (3.3316) grad_norm 1.7332 (nan) loss_scale 2048.0000 (2087.1111) mem 7377MB [2024-08-26 19:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1240/1251] eta 0:00:02 lr 0.000715 wd 0.0500 time 0.2103 (0.2426) data time 0.0004 (0.0024) model time 0.2099 (0.2402) loss 2.7907 (3.3305) grad_norm 2.9463 (nan) loss_scale 2048.0000 (2086.7745) mem 7377MB [2024-08-26 19:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [120/300][1250/1251] eta 0:00:00 lr 0.000715 wd 0.0500 time 0.2104 (0.2424) data time 0.0005 (0.0024) model time 0.2099 (0.2400) loss 3.1734 (3.3306) grad_norm 1.9030 (nan) loss_scale 2048.0000 (2086.4437) mem 7377MB [2024-08-26 19:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 120 training takes 0:04:44 [2024-08-26 19:39:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 19:39:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 19:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.449 (0.449) Loss 0.4941 (0.4941) Acc@1 90.625 (90.625) Acc@5 97.852 (97.852) Mem 7377MB [2024-08-26 19:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.116) Loss 0.7725 (0.7544) Acc@1 82.715 (83.017) Acc@5 95.996 (96.493) Mem 7377MB [2024-08-26 19:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.099) Loss 1.0498 (0.7765) Acc@1 74.707 (82.273) Acc@5 94.043 (96.522) Mem 7377MB [2024-08-26 19:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.094) Loss 1.3213 (0.8873) Acc@1 67.676 (79.892) Acc@5 88.965 (95.079) Mem 7377MB [2024-08-26 19:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.2129 (0.9494) Acc@1 71.289 (78.285) Acc@5 91.992 (94.417) Mem 7377MB [2024-08-26 19:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.854 Acc@5 94.340 [2024-08-26 19:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.9% [2024-08-26 19:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.932 (0.932) Loss 0.4233 (0.4233) Acc@1 92.188 (92.188) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-26 19:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.163) Loss 0.6787 (0.6651) Acc@1 86.328 (85.645) Acc@5 96.777 (97.212) Mem 7377MB [2024-08-26 19:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.125) Loss 0.9478 (0.6875) Acc@1 78.223 (84.724) Acc@5 94.922 (97.219) Mem 7377MB [2024-08-26 19:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.110) Loss 1.2129 (0.7830) Acc@1 69.141 (82.409) Acc@5 
91.602 (96.106) Mem 7377MB [2024-08-26 19:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.099) Loss 1.0996 (0.8333) Acc@1 72.559 (81.012) Acc@5 93.164 (95.582) Mem 7377MB [2024-08-26 19:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.620 Acc@5 95.546 [2024-08-26 19:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.6% [2024-08-26 19:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.62% [2024-08-26 19:40:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 19:40:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 19:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][0/1251] eta 0:20:13 lr 0.000715 wd 0.0500 time 0.9704 (0.9704) data time 0.6511 (0.6511) model time 0.0000 (0.0000) loss 3.5278 (3.5278) grad_norm 2.4845 (2.4845) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 19:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][10/1251] eta 0:06:06 lr 0.000715 wd 0.0500 time 0.2303 (0.2955) data time 0.0007 (0.0602) model time 0.0000 (0.0000) loss 3.9691 (3.7820) grad_norm 1.9024 (2.0612) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][20/1251] eta 0:05:24 lr 0.000715 wd 0.0500 time 0.2259 (0.2638) data time 0.0010 (0.0321) model time 0.0000 (0.0000) loss 3.2519 (3.5430) grad_norm 5.0770 (2.3209) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][30/1251] eta 0:05:07 lr 0.000715 wd 0.0500 time 0.2412 (0.2521) data time 0.0007 (0.0222) model time 0.0000 (0.0000) loss 3.5789 (3.4919) grad_norm 2.5953 (2.2617) 
loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][40/1251] eta 0:04:57 lr 0.000715 wd 0.0500 time 0.2307 (0.2460) data time 0.0009 (0.0170) model time 0.0000 (0.0000) loss 3.4601 (3.4680) grad_norm 1.5524 (2.1728) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][50/1251] eta 0:04:51 lr 0.000715 wd 0.0500 time 0.2304 (0.2425) data time 0.0009 (0.0139) model time 0.0000 (0.0000) loss 3.4931 (3.4344) grad_norm 1.7409 (2.1225) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][60/1251] eta 0:04:46 lr 0.000715 wd 0.0500 time 0.2323 (0.2405) data time 0.0007 (0.0118) model time 0.2316 (0.2290) loss 3.8374 (3.3938) grad_norm 1.9973 (2.0923) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][70/1251] eta 0:04:42 lr 0.000714 wd 0.0500 time 0.2331 (0.2388) data time 0.0010 (0.0103) model time 0.2321 (0.2284) loss 3.7941 (3.3909) grad_norm 2.2127 (2.0916) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][80/1251] eta 0:04:38 lr 0.000714 wd 0.0500 time 0.2276 (0.2374) data time 0.0010 (0.0092) model time 0.2266 (0.2277) loss 3.3704 (3.3460) grad_norm 2.3260 (2.1428) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][90/1251] eta 0:04:34 lr 0.000714 wd 0.0500 time 0.2315 (0.2369) data time 0.0010 (0.0084) model time 0.2305 (0.2283) loss 3.1893 (3.3546) grad_norm 2.6269 (2.1626) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][100/1251] eta 0:04:31 lr 0.000714 wd 0.0500 time 0.2477 (0.2363) data 
time 0.0009 (0.0077) model time 0.2468 (0.2286) loss 3.6099 (3.3690) grad_norm 1.8548 (2.1369) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][110/1251] eta 0:04:28 lr 0.000714 wd 0.0500 time 0.2369 (0.2357) data time 0.0010 (0.0071) model time 0.2359 (0.2286) loss 3.4778 (3.3997) grad_norm 2.3017 (2.1458) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][120/1251] eta 0:04:25 lr 0.000714 wd 0.0500 time 0.2266 (0.2351) data time 0.0016 (0.0066) model time 0.2250 (0.2284) loss 4.0195 (3.4037) grad_norm 2.0031 (2.1474) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][130/1251] eta 0:04:22 lr 0.000714 wd 0.0500 time 0.2279 (0.2345) data time 0.0009 (0.0062) model time 0.2270 (0.2282) loss 3.7093 (3.4019) grad_norm 1.8039 (2.1410) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][140/1251] eta 0:04:20 lr 0.000714 wd 0.0500 time 0.2407 (0.2341) data time 0.0009 (0.0059) model time 0.2398 (0.2280) loss 2.4412 (3.3723) grad_norm 2.0076 (2.1827) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][150/1251] eta 0:04:17 lr 0.000714 wd 0.0500 time 0.2319 (0.2338) data time 0.0010 (0.0056) model time 0.2309 (0.2281) loss 3.3473 (3.3522) grad_norm 3.0824 (2.1933) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][160/1251] eta 0:04:14 lr 0.000714 wd 0.0500 time 0.2492 (0.2335) data time 0.0009 (0.0053) model time 0.2483 (0.2281) loss 3.3898 (3.3559) grad_norm 2.0012 (2.1726) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:41 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [121/300][170/1251] eta 0:04:12 lr 0.000714 wd 0.0500 time 0.2466 (0.2335) data time 0.0011 (0.0050) model time 0.2455 (0.2284) loss 2.8765 (3.3511) grad_norm 3.4718 (2.1925) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][180/1251] eta 0:04:09 lr 0.000714 wd 0.0500 time 0.2331 (0.2332) data time 0.0007 (0.0048) model time 0.2324 (0.2283) loss 3.6132 (3.3616) grad_norm 3.3951 (2.2111) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][190/1251] eta 0:04:06 lr 0.000714 wd 0.0500 time 0.2202 (0.2328) data time 0.0010 (0.0046) model time 0.2192 (0.2280) loss 3.7171 (3.3718) grad_norm 2.1293 (2.2151) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][200/1251] eta 0:04:04 lr 0.000714 wd 0.0500 time 0.2294 (0.2326) data time 0.0007 (0.0045) model time 0.2287 (0.2279) loss 3.7982 (3.3830) grad_norm 2.0429 (2.2061) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][210/1251] eta 0:04:02 lr 0.000714 wd 0.0500 time 0.2227 (0.2334) data time 0.0007 (0.0043) model time 0.2220 (0.2292) loss 4.0295 (3.3829) grad_norm 2.3215 (2.2118) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][220/1251] eta 0:04:00 lr 0.000714 wd 0.0500 time 0.2272 (0.2333) data time 0.0006 (0.0042) model time 0.2266 (0.2293) loss 3.0977 (3.3688) grad_norm 1.7071 (2.1968) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][230/1251] eta 0:03:57 lr 0.000714 wd 0.0500 time 0.2299 (0.2331) data time 0.0007 (0.0040) model time 0.2292 (0.2292) loss 3.9868 (3.3780) grad_norm 
3.1791 (2.2037) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][240/1251] eta 0:03:55 lr 0.000714 wd 0.0500 time 0.2227 (0.2328) data time 0.0013 (0.0039) model time 0.2214 (0.2290) loss 3.3286 (3.3806) grad_norm 2.3987 (2.1935) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][250/1251] eta 0:03:52 lr 0.000714 wd 0.0500 time 0.2365 (0.2327) data time 0.0009 (0.0038) model time 0.2356 (0.2290) loss 3.6085 (3.3749) grad_norm 2.0348 (2.1908) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][260/1251] eta 0:03:50 lr 0.000714 wd 0.0500 time 0.2377 (0.2326) data time 0.0009 (0.0037) model time 0.2368 (0.2289) loss 3.6765 (3.3733) grad_norm 2.7340 (2.1944) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][270/1251] eta 0:03:47 lr 0.000714 wd 0.0500 time 0.2305 (0.2324) data time 0.0009 (0.0036) model time 0.2296 (0.2288) loss 3.6434 (3.3725) grad_norm 3.8249 (2.2176) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][280/1251] eta 0:03:45 lr 0.000714 wd 0.0500 time 0.2180 (0.2321) data time 0.0007 (0.0035) model time 0.2173 (0.2286) loss 2.7933 (3.3743) grad_norm 2.3358 (2.2159) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][290/1251] eta 0:03:42 lr 0.000714 wd 0.0500 time 0.2253 (0.2320) data time 0.0010 (0.0035) model time 0.2243 (0.2286) loss 3.0788 (3.3794) grad_norm 1.8201 (2.2126) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][300/1251] eta 0:03:40 lr 0.000714 wd 0.0500 time 
0.2266 (0.2319) data time 0.0008 (0.0034) model time 0.2258 (0.2284) loss 3.0516 (3.3783) grad_norm 1.9392 (2.2051) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][310/1251] eta 0:03:37 lr 0.000714 wd 0.0500 time 0.2223 (0.2317) data time 0.0010 (0.0033) model time 0.2213 (0.2283) loss 2.5631 (3.3648) grad_norm 2.0702 (2.2018) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][320/1251] eta 0:03:36 lr 0.000713 wd 0.0500 time 0.4298 (0.2321) data time 0.0011 (0.0032) model time 0.4287 (0.2289) loss 3.1652 (3.3545) grad_norm 1.6312 (2.1946) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][330/1251] eta 0:03:33 lr 0.000713 wd 0.0500 time 0.2311 (0.2320) data time 0.0013 (0.0032) model time 0.2298 (0.2289) loss 3.4630 (3.3510) grad_norm 1.9092 (2.1860) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][340/1251] eta 0:03:31 lr 0.000713 wd 0.0500 time 0.2205 (0.2319) data time 0.0007 (0.0031) model time 0.2198 (0.2287) loss 3.1897 (3.3557) grad_norm 1.8678 (2.2012) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][350/1251] eta 0:03:28 lr 0.000713 wd 0.0500 time 0.2335 (0.2319) data time 0.0009 (0.0031) model time 0.2327 (0.2288) loss 3.3413 (3.3600) grad_norm 2.7249 (2.2116) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][360/1251] eta 0:03:26 lr 0.000713 wd 0.0500 time 0.2279 (0.2318) data time 0.0011 (0.0031) model time 0.2268 (0.2287) loss 3.2353 (3.3549) grad_norm 2.0050 (2.2091) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][370/1251] eta 0:03:24 lr 0.000713 wd 0.0500 time 0.2250 (0.2317) data time 0.0008 (0.0030) model time 0.2241 (0.2287) loss 3.3657 (3.3567) grad_norm 1.6935 (2.2094) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][380/1251] eta 0:03:21 lr 0.000713 wd 0.0500 time 0.2285 (0.2316) data time 0.0010 (0.0030) model time 0.2275 (0.2286) loss 3.8352 (3.3494) grad_norm 3.9733 (2.2112) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][390/1251] eta 0:03:19 lr 0.000713 wd 0.0500 time 0.2231 (0.2316) data time 0.0009 (0.0029) model time 0.2222 (0.2286) loss 2.4839 (3.3416) grad_norm 1.7280 (2.2137) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][400/1251] eta 0:03:17 lr 0.000713 wd 0.0500 time 0.2304 (0.2316) data time 0.0009 (0.0029) model time 0.2295 (0.2287) loss 3.9856 (3.3424) grad_norm 1.5526 (2.2217) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][410/1251] eta 0:03:14 lr 0.000713 wd 0.0500 time 0.2249 (0.2316) data time 0.0010 (0.0029) model time 0.2239 (0.2287) loss 3.2170 (3.3377) grad_norm 2.1829 (2.2224) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][420/1251] eta 0:03:12 lr 0.000713 wd 0.0500 time 0.2278 (0.2315) data time 0.0013 (0.0028) model time 0.2265 (0.2287) loss 3.7421 (3.3430) grad_norm 2.8050 (2.2281) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][430/1251] eta 0:03:09 lr 0.000713 wd 0.0500 time 0.2197 (0.2314) data time 0.0010 (0.0028) model time 0.2187 (0.2286) loss 3.8518 
(3.3488) grad_norm 1.9759 (2.2309) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][440/1251] eta 0:03:07 lr 0.000713 wd 0.0500 time 0.2301 (0.2314) data time 0.0007 (0.0027) model time 0.2294 (0.2286) loss 3.0068 (3.3509) grad_norm 2.2186 (2.2299) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][450/1251] eta 0:03:05 lr 0.000713 wd 0.0500 time 0.2328 (0.2313) data time 0.0007 (0.0027) model time 0.2321 (0.2286) loss 3.0658 (3.3497) grad_norm 1.8690 (2.2340) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][460/1251] eta 0:03:02 lr 0.000713 wd 0.0500 time 0.2201 (0.2313) data time 0.0010 (0.0027) model time 0.2191 (0.2286) loss 3.6087 (3.3517) grad_norm 2.2797 (2.2417) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][470/1251] eta 0:03:00 lr 0.000713 wd 0.0500 time 0.2197 (0.2312) data time 0.0007 (0.0026) model time 0.2190 (0.2285) loss 1.8848 (3.3475) grad_norm 2.7518 (2.2462) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][480/1251] eta 0:02:58 lr 0.000713 wd 0.0500 time 0.2228 (0.2311) data time 0.0011 (0.0026) model time 0.2217 (0.2285) loss 2.8044 (3.3384) grad_norm 2.1916 (2.2431) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][490/1251] eta 0:02:55 lr 0.000713 wd 0.0500 time 0.2223 (0.2311) data time 0.0008 (0.0026) model time 0.2215 (0.2285) loss 3.7407 (3.3310) grad_norm 3.1290 (2.2445) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][500/1251] eta 0:02:53 lr 
0.000713 wd 0.0500 time 0.2275 (0.2310) data time 0.0007 (0.0025) model time 0.2268 (0.2284) loss 4.0223 (3.3324) grad_norm 2.6817 (2.2454) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][510/1251] eta 0:02:51 lr 0.000713 wd 0.0500 time 0.2229 (0.2310) data time 0.0008 (0.0025) model time 0.2221 (0.2285) loss 2.8050 (3.3320) grad_norm 1.6396 (2.2443) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][520/1251] eta 0:02:48 lr 0.000713 wd 0.0500 time 0.2258 (0.2309) data time 0.0012 (0.0025) model time 0.2247 (0.2284) loss 2.9273 (3.3336) grad_norm 1.5920 (2.2384) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][530/1251] eta 0:02:46 lr 0.000713 wd 0.0500 time 0.2206 (0.2308) data time 0.0012 (0.0025) model time 0.2194 (0.2284) loss 3.0382 (3.3408) grad_norm 1.9813 (2.2413) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][540/1251] eta 0:02:44 lr 0.000713 wd 0.0500 time 0.2181 (0.2308) data time 0.0011 (0.0024) model time 0.2170 (0.2284) loss 3.3769 (3.3395) grad_norm 2.7668 (2.2377) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][550/1251] eta 0:02:41 lr 0.000713 wd 0.0500 time 0.2261 (0.2309) data time 0.0010 (0.0024) model time 0.2251 (0.2285) loss 3.6440 (3.3368) grad_norm 1.8050 (2.2320) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][560/1251] eta 0:02:39 lr 0.000713 wd 0.0500 time 0.2295 (0.2309) data time 0.0007 (0.0024) model time 0.2289 (0.2284) loss 3.5494 (3.3401) grad_norm 2.4712 (2.2315) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 
19:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][570/1251] eta 0:02:37 lr 0.000712 wd 0.0500 time 0.2353 (0.2309) data time 0.0006 (0.0024) model time 0.2347 (0.2285) loss 4.0197 (3.3388) grad_norm 4.5606 (2.2367) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][580/1251] eta 0:02:34 lr 0.000712 wd 0.0500 time 0.2197 (0.2308) data time 0.0008 (0.0024) model time 0.2190 (0.2284) loss 2.5769 (3.3334) grad_norm 3.0514 (2.2392) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][590/1251] eta 0:02:32 lr 0.000712 wd 0.0500 time 0.2318 (0.2308) data time 0.0008 (0.0023) model time 0.2310 (0.2285) loss 3.9895 (3.3357) grad_norm 1.6967 (2.2349) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][600/1251] eta 0:02:30 lr 0.000712 wd 0.0500 time 0.2252 (0.2308) data time 0.0009 (0.0023) model time 0.2243 (0.2284) loss 3.8618 (3.3354) grad_norm 3.7132 (2.2326) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][610/1251] eta 0:02:27 lr 0.000712 wd 0.0500 time 0.2346 (0.2307) data time 0.0007 (0.0023) model time 0.2340 (0.2284) loss 2.8018 (3.3352) grad_norm 2.0011 (2.2307) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][620/1251] eta 0:02:25 lr 0.000712 wd 0.0500 time 0.2181 (0.2307) data time 0.0010 (0.0023) model time 0.2171 (0.2284) loss 3.4743 (3.3341) grad_norm 1.9200 (2.2290) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][630/1251] eta 0:02:23 lr 0.000712 wd 0.0500 time 0.2287 (0.2307) data time 0.0007 (0.0023) model time 0.2281 (0.2284) 
loss 3.0109 (3.3370) grad_norm 1.3964 (2.2232) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][640/1251] eta 0:02:20 lr 0.000712 wd 0.0500 time 0.2236 (0.2306) data time 0.0007 (0.0023) model time 0.2229 (0.2283) loss 3.5165 (3.3342) grad_norm 2.1940 (2.2213) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][650/1251] eta 0:02:18 lr 0.000712 wd 0.0500 time 0.2406 (0.2306) data time 0.0013 (0.0022) model time 0.2393 (0.2283) loss 3.6701 (3.3365) grad_norm 2.1414 (2.2219) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][660/1251] eta 0:02:16 lr 0.000712 wd 0.0500 time 0.2304 (0.2305) data time 0.0010 (0.0022) model time 0.2294 (0.2283) loss 3.7449 (3.3401) grad_norm 1.5800 (2.2213) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][670/1251] eta 0:02:13 lr 0.000712 wd 0.0500 time 0.2213 (0.2305) data time 0.0012 (0.0022) model time 0.2201 (0.2282) loss 3.2430 (3.3373) grad_norm 2.3655 (2.2160) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][680/1251] eta 0:02:11 lr 0.000712 wd 0.0500 time 0.2285 (0.2304) data time 0.0008 (0.0022) model time 0.2276 (0.2282) loss 4.2753 (3.3354) grad_norm 1.7082 (2.2122) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][690/1251] eta 0:02:09 lr 0.000712 wd 0.0500 time 0.2304 (0.2304) data time 0.0007 (0.0022) model time 0.2297 (0.2282) loss 2.7354 (3.3324) grad_norm 1.7141 (2.2107) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][700/1251] eta 
0:02:06 lr 0.000712 wd 0.0500 time 0.2307 (0.2304) data time 0.0009 (0.0022) model time 0.2298 (0.2282) loss 4.1258 (3.3293) grad_norm 2.7825 (2.2089) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][710/1251] eta 0:02:04 lr 0.000712 wd 0.0500 time 0.2261 (0.2304) data time 0.0007 (0.0022) model time 0.2254 (0.2282) loss 2.1868 (3.3321) grad_norm 1.4912 (2.2062) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][720/1251] eta 0:02:02 lr 0.000712 wd 0.0500 time 0.2265 (0.2303) data time 0.0014 (0.0022) model time 0.2251 (0.2282) loss 3.7191 (3.3318) grad_norm 2.1971 (2.2083) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][730/1251] eta 0:02:00 lr 0.000712 wd 0.0500 time 0.2250 (0.2303) data time 0.0007 (0.0021) model time 0.2243 (0.2282) loss 3.7439 (3.3324) grad_norm 2.2791 (2.2165) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][740/1251] eta 0:01:57 lr 0.000712 wd 0.0500 time 0.2225 (0.2303) data time 0.0010 (0.0022) model time 0.2215 (0.2281) loss 2.4725 (3.3321) grad_norm 2.2286 (2.2231) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][750/1251] eta 0:01:55 lr 0.000712 wd 0.0500 time 0.2286 (0.2305) data time 0.0009 (0.0021) model time 0.2277 (0.2284) loss 3.0346 (3.3263) grad_norm 2.2874 (2.2245) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][760/1251] eta 0:01:53 lr 0.000712 wd 0.0500 time 0.2254 (0.2305) data time 0.0010 (0.0021) model time 0.2245 (0.2284) loss 3.3296 (3.3230) grad_norm 2.8911 (2.2250) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-26 19:42:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][770/1251] eta 0:01:50 lr 0.000712 wd 0.0500 time 0.2260 (0.2305) data time 0.0010 (0.0021) model time 0.2250 (0.2284) loss 3.4688 (3.3228) grad_norm 2.6454 (2.2258) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][780/1251] eta 0:01:48 lr 0.000712 wd 0.0500 time 0.2303 (0.2305) data time 0.0007 (0.0021) model time 0.2296 (0.2284) loss 2.3641 (3.3197) grad_norm 2.4801 (2.2289) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][790/1251] eta 0:01:46 lr 0.000712 wd 0.0500 time 0.2256 (0.2305) data time 0.0014 (0.0021) model time 0.2242 (0.2284) loss 2.4463 (3.3194) grad_norm 3.5769 (2.2312) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][800/1251] eta 0:01:43 lr 0.000712 wd 0.0500 time 0.2219 (0.2304) data time 0.0011 (0.0021) model time 0.2209 (0.2283) loss 3.4576 (3.3162) grad_norm 1.7607 (2.2323) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][810/1251] eta 0:01:41 lr 0.000711 wd 0.0500 time 0.2220 (0.2304) data time 0.0012 (0.0021) model time 0.2208 (0.2283) loss 3.8208 (3.3140) grad_norm 2.6564 (2.2356) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][820/1251] eta 0:01:39 lr 0.000711 wd 0.0500 time 0.2263 (0.2304) data time 0.0009 (0.0021) model time 0.2253 (0.2283) loss 2.8126 (3.3130) grad_norm 1.6019 (2.2343) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][830/1251] eta 0:01:37 lr 0.000711 wd 0.0500 time 0.2317 (0.2304) data time 0.0014 (0.0020) model time 0.2303 
(0.2284) loss 3.4494 (3.3124) grad_norm 3.5830 (2.2442) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][840/1251] eta 0:01:34 lr 0.000711 wd 0.0500 time 0.2200 (0.2304) data time 0.0011 (0.0020) model time 0.2189 (0.2284) loss 2.5593 (3.3157) grad_norm 1.7591 (2.2507) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][850/1251] eta 0:01:32 lr 0.000711 wd 0.0500 time 0.2258 (0.2304) data time 0.0009 (0.0021) model time 0.2249 (0.2284) loss 3.5670 (3.3164) grad_norm 2.5675 (2.2504) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 19:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 19:43:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 19:43:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 19:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 19:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 19:46:15 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 19:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 19:46:24 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 19:46:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 19:46:27 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 19:46:27 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 121) [2024-08-26 19:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 19:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][860/1251] eta 0:10:58 lr 0.000711 wd 0.0500 time 0.2351 (1.6829) data time 0.0012 (0.0825) model time 0.2339 (1.6005) loss 3.9143 (3.8412) grad_norm 1.5875 (2.1514) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][870/1251] eta 0:05:53 lr 0.000711 wd 0.0500 time 0.2462 (0.9265) data time 0.0011 (0.0397) model time 0.2451 (0.8869) loss 3.1716 (3.5564) grad_norm 2.3238 (2.2233) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][880/1251] eta 0:04:16 lr 0.000711 wd 0.0500 time 0.2424 (0.6913) data time 0.0008 (0.0263) model time 0.2416 (0.6649) loss 4.1481 (3.5985) grad_norm 1.9297 (2.1072) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][890/1251] eta 0:03:28 lr 0.000711 wd 0.0500 time 0.2485 (0.5765) data time 0.0010 (0.0199) model time 0.2474 (0.5567) loss 3.2501 (3.5428) grad_norm 1.7558 (2.1061) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][900/1251] eta 0:02:58 lr 0.000711 wd 0.0500 time 0.2509 (0.5090) data time 0.0010 (0.0160) model time 0.2500 (0.4930) loss 3.2090 (3.5062) grad_norm 1.5917 (2.1015) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [121/300][910/1251] eta 0:02:38 lr 0.000711 wd 0.0500 time 0.2415 (0.4642) data time 0.0007 (0.0135) model time 0.2408 (0.4507) loss 2.7358 (3.4779) grad_norm 3.8497 (2.1067) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][920/1251] eta 0:02:23 lr 0.000711 wd 0.0500 time 0.2412 (0.4322) data time 0.0011 (0.0117) model time 0.2401 (0.4206) loss 3.4152 (3.4506) grad_norm 3.0719 (2.1502) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 19:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][930/1251] eta 0:02:11 lr 0.000711 wd 0.0500 time 0.2443 (0.4085) data time 0.0013 (0.0103) model time 0.2431 (0.3982) loss 2.9253 (3.4362) grad_norm 2.1606 (2.2390) loss_scale 4096.0000 (2073.9241) mem 7377MB [2024-08-26 19:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][940/1251] eta 0:02:01 lr 0.000711 wd 0.0500 time 0.2492 (0.3901) data time 0.0008 (0.0093) model time 0.2484 (0.3808) loss 3.6630 (3.4114) grad_norm 2.1635 (2.2457) loss_scale 4096.0000 (2301.1236) mem 7377MB [2024-08-26 19:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][950/1251] eta 0:01:53 lr 0.000711 wd 0.0500 time 0.2424 (0.3758) data time 0.0010 (0.0084) model time 0.2413 (0.3674) loss 3.3915 (3.4183) grad_norm 2.2858 (2.2485) loss_scale 4096.0000 (2482.4242) mem 7377MB [2024-08-26 19:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][960/1251] eta 0:01:45 lr 0.000711 wd 0.0500 time 0.2479 (0.3640) data time 0.0007 (0.0078) model time 0.2472 (0.3562) loss 4.0642 (3.4238) grad_norm 2.2592 (2.2320) loss_scale 4096.0000 (2630.4587) mem 7377MB [2024-08-26 19:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][970/1251] eta 0:01:39 lr 0.000711 wd 0.0500 time 0.2472 (0.3542) data time 0.0010 (0.0072) model time 0.2462 (0.3470) loss 3.9230 (3.4235) grad_norm 2.3069 (2.2243) loss_scale 
4096.0000 (2753.6134) mem 7377MB [2024-08-26 19:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 19:47:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 19:47:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 19:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 19:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 19:52:16 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 19:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 19:52:31 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 19:52:33 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 19:52:34 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 19:52:34 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 121) [2024-08-26 19:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 19:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][980/1251] eta 0:08:07 lr 0.000711 wd 0.0500 time 0.2428 (1.7990) data time 0.0007 (0.0902) model time 0.2420 (1.7088) loss 3.8680 (3.7569) grad_norm 1.9813 (1.8869) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 19:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][990/1251] eta 0:04:16 lr 0.000711 wd 0.0500 time 0.2441 (0.9809) data time 0.0009 (0.0433) model time 0.2431 (0.9376) loss 3.3195 (3.5617) grad_norm 3.1651 (2.2061) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 19:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 19:52:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 19:53:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 19:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 19:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 19:59:36 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 19:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 19:59:45 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 19:59:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 19:59:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 19:59:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 121) [2024-08-26 19:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1000/1251] eta 0:08:52 lr 0.000711 wd 0.0500 time 0.2523 (2.1212) data time 0.0011 (0.0918) model time 0.2512 (2.0294) loss 3.7422 (3.8391) grad_norm 2.2543 (1.9689) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 20:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1010/1251] eta 0:04:19 lr 0.000711 wd 0.0500 time 0.2463 (1.0787) data time 0.0008 (0.0416) model time 0.2456 (1.0371) loss 4.1749 (3.6488) grad_norm 2.4996 (2.0901) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 20:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1020/1251] eta 0:03:00 lr 0.000711 wd 0.0500 time 0.2437 (0.7806) data time 0.0012 (0.0271) model time 0.2425 (0.7534) loss 3.7057 (3.6449) grad_norm 2.2724 (2.1441) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 20:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1030/1251] eta 0:02:21 lr 0.000711 wd 0.0500 time 0.2446 (0.6394) data time 0.0014 (0.0203) model time 0.2433 (0.6191) loss 3.3401 (3.5681) grad_norm 1.4464 (2.1317) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 20:00:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1040/1251] eta 0:01:57 lr 0.000711 wd 0.0500 time 0.2489 (0.5572) data time 0.0007 (0.0163) model time 0.2481 (0.5409) loss 3.5451 (3.5310) grad_norm 1.8318 (2.1424) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 20:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1050/1251] eta 0:01:41 lr 0.000711 wd 0.0500 time 0.2599 (0.5031) data time 0.0009 (0.0137) model time 0.2590 (0.4895) loss 2.9837 (3.4905) grad_norm 2.1673 (inf) loss_scale 2048.0000 (3778.2069) mem 7376MB [2024-08-26 20:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1060/1251] eta 0:01:28 lr 0.000710 wd 0.0500 time 0.2332 (0.4656) data time 0.0009 (0.0118) model time 0.2323 (0.4538) loss 2.2775 (3.4627) grad_norm 1.9520 (inf) loss_scale 2048.0000 (3523.7647) mem 7376MB [2024-08-26 20:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1070/1251] eta 0:01:19 lr 0.000710 wd 0.0500 time 0.2406 (0.4374) data time 0.0008 (0.0106) model time 0.2398 (0.4268) loss 2.7739 (3.4312) grad_norm 2.5825 (inf) loss_scale 2048.0000 (3334.5641) mem 7376MB [2024-08-26 20:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1080/1251] eta 0:01:11 lr 0.000710 wd 0.0500 time 0.2406 (0.4156) data time 0.0012 (0.0096) model time 0.2394 (0.4061) loss 3.9188 (3.4182) grad_norm 1.7531 (inf) loss_scale 2048.0000 (3188.3636) mem 7376MB [2024-08-26 20:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1090/1251] eta 0:01:04 lr 0.000710 wd 0.0500 time 0.2456 (0.3982) data time 0.0009 (0.0087) model time 0.2447 (0.3895) loss 4.0850 (3.4257) grad_norm 2.6452 (inf) loss_scale 2048.0000 (3072.0000) mem 7376MB [2024-08-26 20:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1100/1251] eta 0:00:57 lr 0.000710 wd 0.0500 time 0.2512 (0.3841) data time 0.0008 (0.0080) model time 0.2504 (0.3760) loss 2.6440 
(3.4341) grad_norm 1.8324 (inf) loss_scale 2048.0000 (2977.1852) mem 7376MB [2024-08-26 20:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1110/1251] eta 0:00:52 lr 0.000710 wd 0.0500 time 0.2434 (0.3723) data time 0.0010 (0.0074) model time 0.2424 (0.3649) loss 3.2497 (3.4372) grad_norm 2.1077 (inf) loss_scale 2048.0000 (2898.4407) mem 7376MB [2024-08-26 20:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1120/1251] eta 0:00:47 lr 0.000710 wd 0.0500 time 0.2436 (0.3625) data time 0.0009 (0.0069) model time 0.2427 (0.3556) loss 3.4381 (3.4210) grad_norm 2.7294 (inf) loss_scale 2048.0000 (2832.0000) mem 7376MB [2024-08-26 20:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1130/1251] eta 0:00:42 lr 0.000710 wd 0.0500 time 0.2395 (0.3538) data time 0.0012 (0.0065) model time 0.2383 (0.3472) loss 3.2728 (3.4115) grad_norm 2.6610 (inf) loss_scale 2048.0000 (2775.1884) mem 7376MB [2024-08-26 20:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1140/1251] eta 0:00:38 lr 0.000710 wd 0.0500 time 0.2490 (0.3465) data time 0.0007 (0.0062) model time 0.2483 (0.3404) loss 4.0641 (3.4061) grad_norm 2.4458 (inf) loss_scale 2048.0000 (2726.0541) mem 7376MB [2024-08-26 20:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1150/1251] eta 0:00:34 lr 0.000710 wd 0.0500 time 0.2406 (0.3401) data time 0.0007 (0.0058) model time 0.2399 (0.3342) loss 2.6589 (3.3971) grad_norm 2.3969 (inf) loss_scale 2048.0000 (2683.1392) mem 7376MB [2024-08-26 20:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1160/1251] eta 0:00:30 lr 0.000710 wd 0.0500 time 0.2495 (0.3344) data time 0.0010 (0.0056) model time 0.2485 (0.3289) loss 3.8188 (3.3985) grad_norm 1.7638 (inf) loss_scale 2048.0000 (2645.3333) mem 7376MB [2024-08-26 20:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1170/1251] eta 0:00:26 lr 0.000710 wd 0.0500 
time 0.2394 (0.3294) data time 0.0010 (0.0053) model time 0.2385 (0.3241) loss 2.8272 (3.3786) grad_norm 1.9771 (inf) loss_scale 2048.0000 (2611.7753) mem 7376MB [2024-08-26 20:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1180/1251] eta 0:00:23 lr 0.000710 wd 0.0500 time 0.2489 (0.3248) data time 0.0008 (0.0051) model time 0.2481 (0.3197) loss 3.4902 (3.3818) grad_norm 2.0352 (inf) loss_scale 2048.0000 (2581.7872) mem 7376MB [2024-08-26 20:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1190/1251] eta 0:00:19 lr 0.000710 wd 0.0500 time 0.3315 (0.3211) data time 0.0011 (0.0049) model time 0.3304 (0.3161) loss 2.6514 (3.3743) grad_norm 3.2993 (inf) loss_scale 2048.0000 (2554.8283) mem 7376MB [2024-08-26 20:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1200/1251] eta 0:00:16 lr 0.000710 wd 0.0500 time 0.2380 (0.3174) data time 0.0012 (0.0047) model time 0.2368 (0.3126) loss 3.4165 (3.3650) grad_norm 2.4444 (inf) loss_scale 2048.0000 (2530.4615) mem 7376MB [2024-08-26 20:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1210/1251] eta 0:00:12 lr 0.000710 wd 0.0500 time 0.2489 (0.3140) data time 0.0010 (0.0046) model time 0.2479 (0.3094) loss 3.0567 (3.3568) grad_norm 1.9357 (inf) loss_scale 2048.0000 (2508.3303) mem 7376MB [2024-08-26 20:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1220/1251] eta 0:00:09 lr 0.000710 wd 0.0500 time 0.2306 (0.3108) data time 0.0012 (0.0044) model time 0.2294 (0.3064) loss 3.4089 (3.3589) grad_norm 2.0121 (inf) loss_scale 2048.0000 (2488.1404) mem 7376MB [2024-08-26 20:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1230/1251] eta 0:00:06 lr 0.000710 wd 0.0500 time 0.2372 (0.3080) data time 0.0009 (0.0043) model time 0.2363 (0.3037) loss 4.3745 (3.3558) grad_norm 1.9500 (inf) loss_scale 2048.0000 (2469.6471) mem 7376MB [2024-08-26 20:01:08 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [121/300][1240/1251] eta 0:00:03 lr 0.000710 wd 0.0500 time 0.2315 (0.3050) data time 0.0007 (0.0042) model time 0.2308 (0.3009) loss 2.6725 (3.3456) grad_norm 2.0871 (inf) loss_scale 2048.0000 (2452.6452) mem 7376MB [2024-08-26 20:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [121/300][1250/1251] eta 0:00:00 lr 0.000710 wd 0.0500 time 0.2352 (0.3021) data time 0.0007 (0.0040) model time 0.2345 (0.2980) loss 3.4578 (3.3380) grad_norm 2.0333 (inf) loss_scale 2048.0000 (2436.9612) mem 7376MB [2024-08-26 20:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 121 training takes 0:01:17 [2024-08-26 20:01:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 20:01:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 20:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.411 (0.411) Loss 0.5073 (0.5073) Acc@1 89.844 (89.844) Acc@5 97.949 (97.949) Mem 7376MB [2024-08-26 20:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.108) Loss 0.7437 (0.7555) Acc@1 83.887 (83.159) Acc@5 96.191 (96.493) Mem 7376MB [2024-08-26 20:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.095) Loss 1.1211 (0.7751) Acc@1 72.559 (82.361) Acc@5 92.480 (96.549) Mem 7376MB [2024-08-26 20:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.089) Loss 1.3691 (0.8845) Acc@1 67.676 (79.905) Acc@5 88.477 (95.246) Mem 7376MB [2024-08-26 20:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.1699 (0.9455) Acc@1 71.875 (78.344) Acc@5 91.797 (94.572) Mem 7376MB [2024-08-26 20:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.852 Acc@5 94.480 [2024-08-26 20:01:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 77.9% [2024-08-26 20:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.812 (0.812) Loss 0.4238 (0.4238) Acc@1 91.992 (91.992) Acc@5 98.340 (98.340) Mem 7376MB [2024-08-26 20:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.153) Loss 0.6768 (0.6650) Acc@1 86.426 (85.662) Acc@5 96.875 (97.212) Mem 7376MB [2024-08-26 20:01:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.118) Loss 0.9448 (0.6872) Acc@1 78.027 (84.728) Acc@5 94.824 (97.214) Mem 7376MB [2024-08-26 20:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.105) Loss 1.2090 (0.7822) Acc@1 69.336 (82.412) Acc@5 91.602 (96.100) Mem 7376MB [2024-08-26 20:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.096) Loss 1.0986 (0.8324) Acc@1 72.266 (81.019) Acc@5 92.969 (95.575) Mem 7376MB [2024-08-26 20:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.622 Acc@5 95.538 [2024-08-26 20:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.6% [2024-08-26 20:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.62% [2024-08-26 20:01:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... 
[2024-08-26 20:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:07:21 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 20:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 20:07:28 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 20:07:29 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 20:07:31 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 20:07:31 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 121) [2024-08-26 20:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][0/1251] eta 4:11:27 lr 0.000710 wd 0.0500 time 12.0607 (12.0607) data time 0.9746 (0.9746) model time 0.0000 (0.0000) loss 4.4812 (4.4812) grad_norm 2.1113 (2.1113) loss_scale 2048.0000 (2048.0000) mem 20033MB [2024-08-26 20:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][10/1251] eta 0:27:10 lr 0.000710 wd 0.0500 time 0.2236 (1.3140) data time 0.0009 (0.0894) model time 0.0000 (0.0000) loss 2.9205 (3.7508) grad_norm 3.0025 (2.5470) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][20/1251] eta 0:16:20 lr 0.000710 wd 0.0500 time 0.2264 (0.7968) data time 0.0008 (0.0473) model time 0.0000 (0.0000) loss 3.3970 (3.6413) grad_norm 1.5735 (2.6803) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][30/1251] eta 0:12:28 lr 0.000710 wd 0.0500 time 0.2291 (0.6128) data time 0.0006 (0.0323) model time 0.0000 (0.0000) loss 2.4954 (3.6251) grad_norm 1.8377 (2.5505) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][40/1251] eta 0:10:27 lr 0.000710 wd 0.0500 time 0.2253 (0.5185) data time 0.0008 (0.0247) model time 0.0000 (0.0000) loss 3.3404 (3.5680) grad_norm 2.8494 (2.5320) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[122/300][50/1251] eta 0:09:14 lr 0.000710 wd 0.0500 time 0.2318 (0.4616) data time 0.0006 (0.0200) model time 0.0000 (0.0000) loss 4.1545 (3.5478) grad_norm 2.3428 (2.4929) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][60/1251] eta 0:08:24 lr 0.000709 wd 0.0500 time 0.2198 (0.4233) data time 0.0009 (0.0169) model time 0.2189 (0.2275) loss 3.5623 (3.5050) grad_norm 1.9929 (2.4174) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][70/1251] eta 0:07:47 lr 0.000709 wd 0.0500 time 0.2286 (0.3957) data time 0.0009 (0.0146) model time 0.2278 (0.2270) loss 3.2465 (3.4564) grad_norm 1.9623 (2.3605) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][80/1251] eta 0:07:19 lr 0.000709 wd 0.0500 time 0.2247 (0.3749) data time 0.0008 (0.0130) model time 0.2239 (0.2267) loss 3.4832 (3.4303) grad_norm 4.3494 (2.3629) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][90/1251] eta 0:06:56 lr 0.000709 wd 0.0500 time 0.2229 (0.3585) data time 0.0006 (0.0116) model time 0.2223 (0.2262) loss 3.7283 (3.4199) grad_norm 2.7885 (2.3327) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][100/1251] eta 0:06:37 lr 0.000709 wd 0.0500 time 0.2225 (0.3453) data time 0.0006 (0.0106) model time 0.2218 (0.2258) loss 3.3096 (3.4303) grad_norm 2.4235 (2.3403) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][110/1251] eta 0:06:21 lr 0.000709 wd 0.0500 time 0.2256 (0.3344) data time 0.0009 (0.0097) model time 0.2247 (0.2253) loss 2.7845 (3.4360) grad_norm 2.2820 (2.3418) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-26 20:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][120/1251] eta 0:06:07 lr 0.000709 wd 0.0500 time 0.2220 (0.3253) data time 0.0007 (0.0090) model time 0.2213 (0.2250) loss 2.2373 (3.4350) grad_norm 2.8478 (2.3207) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][130/1251] eta 0:05:56 lr 0.000709 wd 0.0500 time 0.2188 (0.3177) data time 0.0009 (0.0083) model time 0.2179 (0.2250) loss 3.7742 (3.4213) grad_norm 1.5746 (2.2927) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 20:08:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 20:08:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 20:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:10:52 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 20:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 20:11:00 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 20:11:01 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 20:11:02 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 20:11:02 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 20:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][140/1251] eta 1:00:38 lr 0.000709 wd 0.0500 time 0.2380 (3.2746) data time 0.0006 (0.1908) model time 0.2373 (3.0838) loss 4.3252 (3.6998) grad_norm 1.9369 (2.0665) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][150/1251] eta 0:20:08 lr 0.000709 wd 0.0500 time 0.2246 (1.0973) data time 0.0006 (0.0552) model time 0.2240 (1.0421) loss 3.8251 (3.5950) grad_norm 1.5961 (2.1084) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][160/1251] eta 0:13:20 lr 0.000709 wd 0.0500 time 0.2224 (0.7335) data time 0.0008 (0.0326) model time 0.2215 (0.7009) loss 3.3716 (3.5511) grad_norm 2.0704 (2.2460) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][170/1251] eta 0:10:31 lr 0.000709 wd 0.0500 time 0.2176 (0.5838) data time 0.0007 (0.0233) model time 0.2169 (0.5605) loss 2.6489 (3.5534) grad_norm 2.2635 (2.2023) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][180/1251] eta 0:08:59 lr 0.000709 wd 0.0500 time 0.2360 (0.5036) data time 0.0007 (0.0182) model time 0.2353 (0.4854) loss 3.5036 (3.4946) grad_norm 1.9054 (2.1728) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [122/300][190/1251] eta 0:08:00 lr 0.000709 wd 0.0500 time 0.2248 (0.4525) data time 0.0006 (0.0150) model time 0.2242 (0.4375) loss 3.6417 (3.4739) grad_norm 3.0652 (2.1789) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][200/1251] eta 0:07:18 lr 0.000709 wd 0.0500 time 0.2199 (0.4170) data time 0.0008 (0.0128) model time 0.2191 (0.4043) loss 3.2398 (3.4422) grad_norm 2.4381 (2.1843) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][210/1251] eta 0:06:47 lr 0.000709 wd 0.0500 time 0.2288 (0.3915) data time 0.0007 (0.0112) model time 0.2281 (0.3803) loss 3.5380 (3.4137) grad_norm 1.9339 (2.2050) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][220/1251] eta 0:06:23 lr 0.000709 wd 0.0500 time 0.2273 (0.3719) data time 0.0009 (0.0100) model time 0.2264 (0.3619) loss 3.5727 (3.3777) grad_norm 2.2701 (2.2356) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][230/1251] eta 0:06:03 lr 0.000709 wd 0.0500 time 0.2219 (0.3564) data time 0.0010 (0.0091) model time 0.2209 (0.3473) loss 3.2266 (3.3751) grad_norm 1.9799 (2.2832) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][240/1251] eta 0:05:47 lr 0.000709 wd 0.0500 time 0.2186 (0.3439) data time 0.0010 (0.0083) model time 0.2176 (0.3357) loss 3.2480 (3.4022) grad_norm 2.2461 (2.2880) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][250/1251] eta 0:05:34 lr 0.000709 wd 0.0500 time 0.2255 (0.3337) data time 0.0009 (0.0077) model time 0.2246 (0.3260) loss 3.5955 (3.3932) grad_norm 3.2264 (2.2892) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][260/1251] eta 0:05:22 lr 0.000709 wd 0.0500 time 0.2241 (0.3250) data time 0.0010 (0.0071) model time 0.2231 (0.3179) loss 2.8631 (3.3803) grad_norm 1.7168 (2.2953) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][270/1251] eta 0:05:11 lr 0.000709 wd 0.0500 time 0.2260 (0.3178) data time 0.0008 (0.0067) model time 0.2252 (0.3111) loss 3.5032 (3.3754) grad_norm 2.4200 (2.3684) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][280/1251] eta 0:05:02 lr 0.000709 wd 0.0500 time 0.2238 (0.3115) data time 0.0007 (0.0063) model time 0.2230 (0.3052) loss 3.1861 (3.3610) grad_norm 1.9099 (2.3668) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][290/1251] eta 0:04:53 lr 0.000709 wd 0.0500 time 0.2233 (0.3059) data time 0.0007 (0.0060) model time 0.2227 (0.2999) loss 3.2847 (3.3521) grad_norm 1.8913 (2.3318) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][300/1251] eta 0:04:46 lr 0.000708 wd 0.0500 time 0.2167 (0.3010) data time 0.0007 (0.0057) model time 0.2159 (0.2953) loss 3.1272 (3.3483) grad_norm 2.1386 (2.3231) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][310/1251] eta 0:04:39 lr 0.000708 wd 0.0500 time 0.2260 (0.2966) data time 0.0010 (0.0054) model time 0.2250 (0.2912) loss 2.5772 (3.3368) grad_norm 2.3371 (2.3032) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][320/1251] eta 0:04:32 lr 0.000708 wd 0.0500 time 0.2219 (0.2928) data time 
0.0006 (0.0052) model time 0.2213 (0.2876) loss 3.2030 (3.3435) grad_norm 2.7237 (2.2955) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][330/1251] eta 0:04:26 lr 0.000708 wd 0.0500 time 0.2288 (0.2894) data time 0.0008 (0.0050) model time 0.2281 (0.2844) loss 3.2802 (3.3394) grad_norm 3.5235 (2.3172) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][340/1251] eta 0:04:20 lr 0.000708 wd 0.0500 time 0.2231 (0.2864) data time 0.0007 (0.0048) model time 0.2223 (0.2816) loss 3.1490 (3.3204) grad_norm 2.8967 (2.3201) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][350/1251] eta 0:04:15 lr 0.000708 wd 0.0500 time 0.2230 (0.2836) data time 0.0008 (0.0046) model time 0.2222 (0.2790) loss 3.3075 (3.3117) grad_norm 2.8337 (2.3174) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][360/1251] eta 0:04:10 lr 0.000708 wd 0.0500 time 0.2315 (0.2811) data time 0.0010 (0.0045) model time 0.2306 (0.2766) loss 3.9110 (3.3146) grad_norm 1.9144 (2.3059) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][370/1251] eta 0:04:05 lr 0.000708 wd 0.0500 time 0.2250 (0.2787) data time 0.0007 (0.0043) model time 0.2243 (0.2743) loss 2.3226 (3.3070) grad_norm 1.5418 (2.2983) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][380/1251] eta 0:04:00 lr 0.000708 wd 0.0500 time 0.2277 (0.2765) data time 0.0006 (0.0042) model time 0.2271 (0.2723) loss 2.4261 (3.3072) grad_norm 1.9044 (2.2847) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [122/300][390/1251] eta 0:03:56 lr 0.000708 wd 0.0500 time 0.2214 (0.2745) data time 0.0008 (0.0041) model time 0.2206 (0.2705) loss 2.7584 (3.3031) grad_norm 2.3011 (2.2817) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][400/1251] eta 0:03:52 lr 0.000708 wd 0.0500 time 0.2313 (0.2728) data time 0.0006 (0.0040) model time 0.2308 (0.2689) loss 3.6275 (3.2975) grad_norm 2.6767 (2.2782) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][410/1251] eta 0:03:48 lr 0.000708 wd 0.0500 time 0.2284 (0.2711) data time 0.0009 (0.0038) model time 0.2275 (0.2673) loss 3.5626 (3.2997) grad_norm 2.3535 (2.3139) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 20:12:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 20:12:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 20:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:27:51 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 20:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 20:28:01 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 20:28:02 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 20:28:03 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 20:28:03 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 20:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:29:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 20:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 20:29:58 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 20:29:59 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 20:30:00 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 20:30:00 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 20:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][420/1251] eta 0:29:31 lr 0.000708 wd 0.0500 time 0.2245 (2.1323) data time 0.0006 (0.1415) model time 0.2238 (1.9908) loss 3.3273 (3.7208) grad_norm 1.9643 (2.0885) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][430/1251] eta 0:14:41 lr 0.000708 wd 0.0500 time 0.2205 (1.0733) data time 0.0006 (0.0634) model time 0.2199 (1.0099) loss 3.7339 (3.6295) grad_norm 2.3011 (2.1566) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][440/1251] eta 0:10:25 lr 0.000708 wd 0.0500 time 0.2475 (0.7709) data time 0.0010 (0.0411) model time 0.2466 (0.7299) loss 3.6078 (3.6690) grad_norm 2.5386 (2.1222) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][450/1251] eta 0:08:22 lr 0.000708 wd 0.0500 time 0.2211 (0.6272) data time 0.0009 (0.0305) model time 0.2202 (0.5967) loss 3.3853 (3.6014) grad_norm 2.3182 (2.1241) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][460/1251] eta 0:07:09 lr 0.000708 wd 0.0500 time 0.2217 (0.5432) data time 0.0008 (0.0244) model time 0.2209 (0.5188) loss 3.6160 (3.5458) grad_norm 1.8438 (2.1313) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [122/300][470/1251] eta 0:06:21 lr 0.000708 wd 0.0500 time 0.2239 (0.4885) data time 0.0007 (0.0205) model time 0.2232 (0.4681) loss 2.7489 (3.5093) grad_norm 2.2130 (2.1983) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][480/1251] eta 0:05:46 lr 0.000708 wd 0.0500 time 0.2265 (0.4496) data time 0.0006 (0.0176) model time 0.2258 (0.4320) loss 2.7123 (3.4903) grad_norm 2.8266 (2.2754) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 20:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 20:30:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 20:30:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 20:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:36:33 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 20:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 20:36:42 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 20:36:44 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 20:36:45 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 20:36:45 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 20:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][490/1251] eta 0:21:59 lr 0.000708 wd 0.0500 time 0.2468 (1.7337) data time 0.0012 (0.0584) model time 0.2456 (1.6753) loss 3.5727 (3.7978) grad_norm 1.9027 (2.0358) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 20:37:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 20:37:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 20:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:41:24 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 20:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 20:41:31 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 20:41:32 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 20:41:33 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 20:41:33 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 20:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][500/1251] eta 0:54:33 lr 0.000708 wd 0.0500 time 0.2222 (4.3587) data time 0.0008 (0.3143) model time 0.2214 (4.0444) loss 2.4802 (3.6528) grad_norm 1.7968 (2.1192) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][510/1251] eta 0:14:34 lr 0.000708 wd 0.0500 time 0.2375 (1.1800) data time 0.0009 (0.0733) model time 0.2366 (1.1067) loss 3.4827 (3.5724) grad_norm 1.8006 (1.9494) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 20:41:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 20:41:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 20:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 20:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 20:51:49 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 20:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 20:51:59 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 20:52:00 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 20:52:01 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 20:52:01 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 20:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 20:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][520/1251] eta 2:43:24 lr 0.000708 wd 0.0500 time 13.4118 (13.4118) data time 0.8497 (0.8497) model time 12.5622 (12.5622) loss 4.0429 (4.0429) grad_norm 2.7277 (2.7277) loss_scale 2048.0000 (2048.0000) mem 20033MB [2024-08-26 20:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][530/1251] eta 0:17:14 lr 0.000708 wd 0.0500 time 0.2254 (1.4350) data time 0.0010 (0.0781) model time 0.2245 (1.3569) loss 2.8236 (3.6925) grad_norm 1.5325 (2.0989) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][540/1251] eta 0:10:10 lr 0.000708 wd 0.0500 time 0.2249 (0.8593) data time 0.0009 (0.0415) model time 0.2240 (0.8179) loss 3.4678 (3.5831) grad_norm 2.1836 (2.2273) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][550/1251] eta 0:07:38 lr 0.000707 wd 0.0500 time 0.2213 (0.6547) data time 0.0007 (0.0284) model time 0.2206 (0.6263) loss 2.4682 (3.5930) grad_norm 2.2619 (2.1136) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][560/1251] eta 0:06:20 lr 0.000707 wd 0.0500 time 0.2273 (0.5500) data time 0.0009 (0.0217) model time 0.2264 (0.5283) loss 3.3331 (3.5246) grad_norm 2.5687 (2.2071) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][570/1251] eta 0:05:31 lr 0.000707 wd 0.0500 time 0.2236 (0.4863) data time 0.0006 (0.0176) model time 0.2230 (0.4686) loss 3.6808 (3.5178) grad_norm 2.0059 (2.2378) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][580/1251] eta 0:04:57 lr 0.000707 wd 0.0500 time 0.2224 (0.4433) data time 0.0008 (0.0149) model time 0.2216 (0.4285) loss 3.4283 (3.4911) grad_norm 3.0134 (2.2330) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][590/1251] eta 0:04:32 lr 0.000707 wd 0.0500 time 0.2243 (0.4127) data time 0.0008 (0.0129) model time 0.2236 (0.3997) loss 3.4314 (3.4473) grad_norm 1.7872 (2.2044) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][600/1251] eta 0:04:13 lr 0.000707 wd 0.0500 time 0.2238 (0.3893) data time 0.0008 (0.0115) model time 0.2229 (0.3779) loss 2.7546 (3.4316) grad_norm 1.6357 (2.1773) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][610/1251] eta 0:03:58 lr 0.000707 wd 0.0500 time 0.2292 (0.3714) data time 0.0006 (0.0103) model time 0.2286 (0.3611) loss 3.9699 (3.4280) grad_norm 2.1081 (2.1604) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][620/1251] eta 0:03:45 lr 0.000707 wd 0.0500 time 0.2283 (0.3570) data time 0.0006 (0.0094) model time 0.2277 (0.3476) loss 3.4712 
(3.4428) grad_norm 1.8357 (2.1361) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][630/1251] eta 0:03:34 lr 0.000707 wd 0.0500 time 0.2281 (0.3451) data time 0.0008 (0.0086) model time 0.2273 (0.3365) loss 2.7959 (3.4345) grad_norm 2.5553 (2.1396) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][640/1251] eta 0:03:24 lr 0.000707 wd 0.0500 time 0.2301 (0.3353) data time 0.0007 (0.0080) model time 0.2295 (0.3274) loss 2.1542 (3.4381) grad_norm 1.6205 (2.1197) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][650/1251] eta 0:03:16 lr 0.000707 wd 0.0500 time 0.2303 (0.3272) data time 0.0007 (0.0075) model time 0.2296 (0.3197) loss 3.5308 (3.4294) grad_norm 2.5800 (2.1207) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][660/1251] eta 0:03:09 lr 0.000707 wd 0.0500 time 0.2289 (0.3202) data time 0.0007 (0.0070) model time 0.2282 (0.3132) loss 3.1742 (3.4128) grad_norm 2.7854 (2.1541) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][670/1251] eta 0:03:02 lr 0.000707 wd 0.0500 time 0.2227 (0.3138) data time 0.0008 (0.0066) model time 0.2220 (0.3073) loss 2.4568 (3.4016) grad_norm 1.8066 (2.1474) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][680/1251] eta 0:02:56 lr 0.000707 wd 0.0500 time 0.2201 (0.3084) data time 0.0008 (0.0062) model time 0.2193 (0.3021) loss 3.6063 (3.4049) grad_norm 2.0833 (2.1455) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][690/1251] eta 0:02:50 lr 
0.000707 wd 0.0500 time 0.2267 (0.3035) data time 0.0007 (0.0059) model time 0.2260 (0.2976) loss 3.1647 (3.3956) grad_norm 1.6177 (2.1536) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][700/1251] eta 0:02:44 lr 0.000707 wd 0.0500 time 0.2254 (0.2992) data time 0.0007 (0.0057) model time 0.2247 (0.2935) loss 3.4995 (3.3792) grad_norm 1.5232 (2.1608) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][710/1251] eta 0:02:39 lr 0.000707 wd 0.0500 time 0.2298 (0.2952) data time 0.0008 (0.0054) model time 0.2290 (0.2898) loss 2.7890 (3.3738) grad_norm 2.9650 (2.1886) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][720/1251] eta 0:02:34 lr 0.000707 wd 0.0500 time 0.2239 (0.2919) data time 0.0009 (0.0052) model time 0.2230 (0.2867) loss 3.1682 (3.3595) grad_norm 1.9407 (2.1832) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][730/1251] eta 0:02:30 lr 0.000707 wd 0.0500 time 0.2330 (0.2888) data time 0.0007 (0.0050) model time 0.2323 (0.2838) loss 3.9068 (3.3552) grad_norm 2.0422 (2.1733) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][740/1251] eta 0:02:26 lr 0.000707 wd 0.0500 time 0.2398 (0.2860) data time 0.0006 (0.0048) model time 0.2392 (0.2812) loss 3.2598 (3.3512) grad_norm 1.8977 (2.1824) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][750/1251] eta 0:02:21 lr 0.000707 wd 0.0500 time 0.2265 (0.2834) data time 0.0006 (0.0047) model time 0.2259 (0.2787) loss 2.0478 (3.3493) grad_norm 2.3364 (2.1870) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 
20:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][760/1251] eta 0:02:17 lr 0.000707 wd 0.0500 time 0.2258 (0.2810) data time 0.0006 (0.0045) model time 0.2252 (0.2765) loss 3.4217 (3.3464) grad_norm 1.4724 (2.1824) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][770/1251] eta 0:02:14 lr 0.000707 wd 0.0500 time 0.2219 (0.2787) data time 0.0007 (0.0044) model time 0.2212 (0.2743) loss 3.2546 (3.3387) grad_norm 2.3450 (2.1852) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 20:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 20:53:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 20:53:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 21:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 21:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 21:06:36 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 21:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:06:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 21:06:52 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 21:06:54 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 21:06:54 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 21:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 21:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][780/1251] eta 1:51:56 lr 0.000707 wd 0.0500 time 14.2611 (14.2611) data time 0.9286 (0.9286) model time 13.3325 (13.3325) loss 4.2120 (4.2120) grad_norm 2.9739 (2.9739) loss_scale 2048.0000 (2048.0000) mem 20033MB [2024-08-26 21:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][790/1251] eta 0:12:02 lr 0.000707 wd 0.0500 time 0.3043 (1.5671) data time 0.0009 (0.0853) model time 0.3034 (1.4818) loss 3.0291 (3.6228) grad_norm 3.7608 (2.4461) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][800/1251] eta 0:06:59 lr 0.000706 wd 0.0500 time 0.2276 (0.9303) data time 0.0011 (0.0453) model time 0.2264 (0.8849) loss 3.4637 (3.5968) grad_norm 3.0127 (2.4337) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][810/1251] eta 0:05:10 lr 0.000706 wd 0.0500 time 0.2187 (0.7039) data time 0.0007 (0.0310) model time 0.2181 (0.6728) loss 2.5431 (3.5989) grad_norm 2.5266 (2.3295) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][820/1251] eta 0:04:15 lr 0.000706 wd 0.0500 time 0.2295 (0.5920) data time 0.0010 (0.0237) model time 0.2285 (0.5683) loss 3.5188 (3.5149) grad_norm 2.6758 (2.2950) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [122/300][830/1251] eta 0:03:40 lr 0.000706 wd 0.0500 time 0.2314 (0.5246) data time 0.0007 (0.0193) model time 0.2306 (0.5052) loss 3.7137 (3.5002) grad_norm 1.8325 (2.2301) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][840/1251] eta 0:03:16 lr 0.000706 wd 0.0500 time 0.2289 (0.4777) data time 0.0012 (0.0163) model time 0.2276 (0.4614) loss 3.3286 (3.4582) grad_norm 2.2265 (2.1937) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][850/1251] eta 0:02:57 lr 0.000706 wd 0.0500 time 0.2225 (0.4430) data time 0.0009 (0.0142) model time 0.2216 (0.4288) loss 3.2228 (3.4256) grad_norm 2.7225 (2.2025) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][860/1251] eta 0:02:42 lr 0.000706 wd 0.0500 time 0.2139 (0.4168) data time 0.0012 (0.0126) model time 0.2127 (0.4042) loss 2.9567 (3.4160) grad_norm 2.3203 (2.1769) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][870/1251] eta 0:02:32 lr 0.000706 wd 0.0500 time 0.2991 (0.3999) data time 0.0010 (0.0117) model time 0.2981 (0.3882) loss 4.0626 (3.4032) grad_norm 1.8820 (2.1647) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][880/1251] eta 0:02:22 lr 0.000706 wd 0.0500 time 0.2358 (0.3843) data time 0.0008 (0.0106) model time 0.2351 (0.3737) loss 3.5939 (3.4056) grad_norm 2.7399 (2.2311) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][890/1251] eta 0:02:14 lr 0.000706 wd 0.0500 time 0.2356 (0.3717) data time 0.0009 (0.0099) model time 0.2347 (0.3618) loss 2.8269 (3.4055) grad_norm 3.2195 (2.2917) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][900/1251] eta 0:02:06 lr 0.000706 wd 0.0500 time 0.2375 (0.3600) data time 0.0017 (0.0091) model time 0.2358 (0.3509) loss 2.2141 (3.4143) grad_norm 2.8054 (2.2873) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][910/1251] eta 0:01:59 lr 0.000706 wd 0.0500 time 0.2328 (0.3513) data time 0.0010 (0.0085) model time 0.2318 (0.3428) loss 3.4549 (3.4039) grad_norm 2.0529 (2.2852) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][920/1251] eta 0:01:53 lr 0.000706 wd 0.0500 time 0.2331 (0.3441) data time 0.0007 (0.0080) model time 0.2323 (0.3362) loss 3.3620 (3.3938) grad_norm 1.8888 (2.2593) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 21:07:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 21:07:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 21:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 21:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 21:12:34 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 21:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:12:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 21:12:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 21:12:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 21:12:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 21:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 21:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][930/1251] eta 0:36:11 lr 0.000706 wd 0.0500 time 0.4452 (6.7651) data time 0.0008 (0.5929) model time 0.4444 (6.1721) loss 3.7882 (4.1498) grad_norm 1.7820 (1.7080) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-26 21:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][940/1251] eta 0:06:50 lr 0.000706 wd 0.0500 time 0.2271 (1.3192) data time 0.0010 (0.0998) model time 0.2261 (1.2193) loss 2.6896 (3.5698) grad_norm 1.7902 (2.1440) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][950/1251] eta 0:04:11 lr 0.000706 wd 0.0500 time 0.2362 (0.8345) data time 0.0009 (0.0551) model time 0.2354 (0.7794) loss 3.7863 (3.5429) grad_norm 3.3382 (2.1159) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][960/1251] eta 0:03:09 lr 0.000706 wd 0.0500 time 0.2283 (0.6529) data time 0.0007 (0.0382) model time 0.2275 (0.6147) loss 3.6590 (3.5810) grad_norm 3.6649 (2.1790) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][970/1251] eta 0:02:37 lr 0.000706 wd 0.0500 time 0.2271 (0.5590) data time 0.0008 (0.0294) model time 0.2263 (0.5297) loss 3.6835 (3.5287) grad_norm 1.8227 (2.2156) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][980/1251] eta 0:02:14 lr 0.000706 wd 0.0500 time 0.2360 (0.4968) data time 0.0010 (0.0242) model time 0.2349 (0.4726) loss 3.6240 (3.5253) grad_norm 3.7759 (2.1886) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][990/1251] eta 0:01:58 lr 0.000706 wd 0.0500 time 0.2255 (0.4535) data time 0.0008 (0.0205) model time 0.2247 (0.4330) loss 4.1331 (3.5061) grad_norm 1.9434 (2.2084) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1000/1251] eta 0:01:46 lr 0.000706 wd 0.0500 time 0.2420 (0.4245) data time 0.0010 (0.0178) model time 0.2410 (0.4067) loss 3.3867 (3.4433) grad_norm 2.3982 (2.2160) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1010/1251] eta 0:01:37 lr 0.000706 wd 0.0500 time 0.2293 (0.4035) data time 0.0009 (0.0157) model time 0.2283 (0.3878) loss 3.6659 (3.4407) grad_norm 3.8299 (2.2423) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1020/1251] eta 0:01:29 lr 0.000706 wd 0.0500 time 0.2228 (0.3870) data time 0.0008 (0.0142) model time 0.2220 (0.3728) loss 2.5789 (3.4249) grad_norm 2.2191 (2.2398) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1030/1251] eta 0:01:22 lr 0.000706 wd 0.0500 time 0.2235 (0.3719) data time 0.0010 (0.0129) model time 0.2225 (0.3589) loss 
3.9500 (3.4450) grad_norm 2.4430 (2.2097) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1040/1251] eta 0:01:16 lr 0.000705 wd 0.0500 time 0.2927 (0.3607) data time 0.0009 (0.0119) model time 0.2919 (0.3489) loss 3.7135 (3.4414) grad_norm 2.7169 (2.2029) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1050/1251] eta 0:01:10 lr 0.000705 wd 0.0500 time 0.2393 (0.3516) data time 0.0008 (0.0110) model time 0.2386 (0.3406) loss 3.5901 (3.4345) grad_norm 2.3328 (2.1927) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1060/1251] eta 0:01:05 lr 0.000705 wd 0.0500 time 0.2268 (0.3424) data time 0.0010 (0.0102) model time 0.2258 (0.3322) loss 3.4620 (3.4285) grad_norm 2.4480 (2.1807) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1070/1251] eta 0:01:00 lr 0.000705 wd 0.0500 time 0.2614 (0.3348) data time 0.0013 (0.0096) model time 0.2600 (0.3252) loss 3.4617 (3.4166) grad_norm 3.9067 (2.1849) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1080/1251] eta 0:00:56 lr 0.000705 wd 0.0500 time 0.3345 (0.3301) data time 0.0010 (0.0091) model time 0.3335 (0.3211) loss 3.9582 (3.4105) grad_norm 2.4470 (2.1938) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1090/1251] eta 0:00:52 lr 0.000705 wd 0.0500 time 0.2211 (0.3254) data time 0.0008 (0.0086) model time 0.2202 (0.3169) loss 3.5090 (3.4154) grad_norm 1.6108 (2.1865) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1100/1251] eta 
0:00:48 lr 0.000705 wd 0.0500 time 0.2269 (0.3199) data time 0.0009 (0.0081) model time 0.2260 (0.3118) loss 2.9516 (3.4048) grad_norm 2.3977 (2.1786) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1110/1251] eta 0:00:44 lr 0.000705 wd 0.0500 time 0.2228 (0.3149) data time 0.0013 (0.0077) model time 0.2215 (0.3072) loss 3.6178 (3.3905) grad_norm 2.3447 (2.1817) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1120/1251] eta 0:00:40 lr 0.000705 wd 0.0500 time 0.2778 (0.3107) data time 0.0009 (0.0074) model time 0.2769 (0.3033) loss 3.4001 (3.3845) grad_norm 1.7303 (2.1671) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1130/1251] eta 0:00:37 lr 0.000705 wd 0.0500 time 0.2356 (0.3087) data time 0.0006 (0.0072) model time 0.2349 (0.3015) loss 3.8715 (3.3719) grad_norm 2.0125 (2.1611) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1140/1251] eta 0:00:33 lr 0.000705 wd 0.0500 time 0.3134 (0.3059) data time 0.0010 (0.0069) model time 0.3124 (0.2990) loss 3.0393 (3.3648) grad_norm 2.4825 (2.1641) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1150/1251] eta 0:00:30 lr 0.000705 wd 0.0500 time 0.2210 (0.3029) data time 0.0007 (0.0066) model time 0.2203 (0.2963) loss 4.2671 (3.3608) grad_norm 1.7556 (2.1780) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 21:14:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 21:14:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 21:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 21:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 21:15:55 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 21:16:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-26 21:16:05 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-26 21:16:06 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-26 21:16:08 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-26 21:16:08 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122)
[2024-08-26 21:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-26 21:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 21:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 21:18:15 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 21:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:18:25 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 21:18:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 21:18:27 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 21:18:28 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 122) [2024-08-26 21:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 21:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1160/1251] eta 0:03:18 lr 0.000705 wd 0.0500 time 0.2460 (2.1865) data time 0.0013 (0.1149) model time 0.2447 (2.0716) loss 3.5316 (3.8634) grad_norm 1.8371 (2.4031) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:18:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1170/1251] eta 0:01:24 lr 0.000705 wd 0.0500 time 0.3576 (1.0468) data time 0.0011 (0.0480) model time 0.3565 (0.9988) loss 3.5091 (3.5294) grad_norm 2.2576 (2.2665) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1180/1251] eta 0:00:52 lr 0.000705 wd 0.0500 time 0.2379 (0.7452) data time 0.0007 (0.0307) model time 0.2372 (0.7145) loss 3.6750 (3.5867) grad_norm 2.0563 (2.3159) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:18:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1190/1251] eta 0:00:37 lr 0.000705 wd 0.0500 time 0.2887 (0.6139) data time 0.0011 (0.0227) model time 0.2876 (0.5912) loss 3.0862 (3.5312) grad_norm 1.9294 (2.3804) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:19:00 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1200/1251] eta 0:00:27 lr 0.000705 wd 0.0500 time 0.2305 (0.5355) data time 0.0007 (0.0181) model time 0.2298 (0.5174) loss 3.5080 (3.4916) grad_norm 1.9740 (2.5041) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:19:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1210/1251] eta 0:00:19 lr 0.000705 wd 0.0500 time 0.2290 (0.4819) data time 0.0011 (0.0151) model time 0.2279 (0.4668) loss 3.4150 (3.4935) grad_norm 2.0232 (2.5526) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1220/1251] eta 0:00:13 lr 0.000705 wd 0.0500 time 0.2339 (0.4444) data time 0.0009 (0.0130) model time 0.2330 (0.4314) loss 4.0027 (3.4707) grad_norm 2.1604 (2.4815) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:19:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1230/1251] eta 0:00:08 lr 0.000705 wd 0.0500 time 0.2898 (0.4207) data time 0.0012 (0.0115) model time 0.2886 (0.4092) loss 3.3963 (3.4200) grad_norm 1.5940 (2.4327) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1240/1251] eta 0:00:04 lr 0.000705 wd 0.0500 time 0.2591 (0.3982) data time 0.0007 (0.0104) model time 0.2584 (0.3878) loss 3.3700 (3.3958) grad_norm 1.4901 (2.4094) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [122/300][1250/1251] eta 0:00:00 lr 0.000705 wd 0.0500 time 0.2262 (0.3803) data time 0.0007 (0.0094) model time 0.2255 (0.3710) loss 3.8158 (3.4146) grad_norm 2.4598 (2.4062) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 21:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 122 training takes 0:00:36 [2024-08-26 21:19:11 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 21:19:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 21:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.484 (0.484) Loss 0.5059 (0.5059) Acc@1 90.820 (90.820) Acc@5 97.754 (97.754) Mem 7377MB
[2024-08-26 21:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.128) Loss 0.8174 (0.7803) Acc@1 83.789 (83.416) Acc@5 95.312 (96.467) Mem 7377MB
[2024-08-26 21:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.100 (0.115) Loss 1.1797 (0.8138) Acc@1 71.875 (82.352) Acc@5 92.480 (96.447) Mem 7377MB
[2024-08-26 21:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.106) Loss 1.3281 (0.9182) Acc@1 68.359 (79.892) Acc@5 89.355 (95.133) Mem 7377MB
[2024-08-26 21:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.097) Loss 1.2695 (0.9745) Acc@1 69.531 (78.370) Acc@5 91.016 (94.438) Mem 7377MB
[2024-08-26 21:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.966 Acc@5 94.380
[2024-08-26 21:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.0%
[2024-08-26 21:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 77.97%
[2024-08-26 21:19:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-26 21:19:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-26 21:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.493 (0.493) Loss 0.4233 (0.4233) Acc@1 92.090 (92.090) Acc@5 98.340 (98.340) Mem 7377MB
[2024-08-26 21:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.093 (0.127) Loss 0.6748 (0.6644) Acc@1 86.719 (85.671) Acc@5 96.973 (97.212) Mem 7377MB
[2024-08-26 21:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.109) Loss 0.9448 (0.6866) Acc@1 77.637 (84.756) Acc@5 94.922 (97.214) Mem 7377MB
[2024-08-26 21:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.099) Loss 1.2051 (0.7812) Acc@1 69.531 (82.419) Acc@5 91.895 (96.122) Mem 7377MB
[2024-08-26 21:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.0938 (0.8312) Acc@1 72.461 (81.019) Acc@5 93.262 (95.594) Mem 7377MB
[2024-08-26 21:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.654 Acc@5 95.566
[2024-08-26 21:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.7%
[2024-08-26 21:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.65%
[2024-08-26 21:19:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-26 21:19:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-26 21:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][0/1251] eta 0:19:31 lr 0.000705 wd 0.0500 time 0.9368 (0.9368) data time 0.4777 (0.4777) model time 0.0000 (0.0000) loss 3.6059 (3.6059) grad_norm 2.4497 (2.4497) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 21:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][10/1251] eta 0:06:02 lr 0.000705 wd 0.0500 time 0.2402 (0.2921) data time 0.0007 (0.0445) model time 0.0000 (0.0000) loss 2.6303 (3.5733) grad_norm 1.6623 (2.1078) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][20/1251] eta 0:05:22 lr 0.000705 wd 0.0500 time 0.2245 (0.2623) data time 0.0011 (0.0238) model time 0.0000 (0.0000) loss 3.3604 (3.4885) grad_norm 2.0279 (2.1336) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][30/1251] eta 0:05:06 lr 0.000705 wd 0.0500 time 0.2320 (0.2513) data time 0.0007 (0.0165) model time 0.0000 (0.0000) loss 3.5498 (3.4295) grad_norm 3.0674 (2.1742) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][40/1251] eta 0:05:08 lr 0.000704 wd 0.0500 time 0.2295 (0.2546) data time 0.0009 (0.0129) model time 0.0000 (0.0000) loss 3.3913 (3.4134) grad_norm 1.4437 (2.1452) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][50/1251] eta 0:04:59 lr 0.000704 wd 0.0500 time 0.2287 (0.2495) data time 0.0011 (0.0106) model time 0.0000 (0.0000) loss 3.5630 (3.3845) grad_norm 2.1864 (2.1215) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][60/1251] eta 0:04:55 lr 0.000704 wd 0.0500 time 0.2245 (0.2480) data time 0.0007 (0.0091) model time 0.2237 
(0.2386) loss 2.6863 (3.3515) grad_norm 1.9476 (2.2083) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][70/1251] eta 0:04:49 lr 0.000704 wd 0.0500 time 0.2330 (0.2453) data time 0.0009 (0.0080) model time 0.2321 (0.2330) loss 3.3765 (3.3784) grad_norm 1.9696 (2.2404) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][80/1251] eta 0:04:49 lr 0.000704 wd 0.0500 time 0.2816 (0.2474) data time 0.0008 (0.0072) model time 0.2809 (0.2424) loss 2.9918 (3.3462) grad_norm 2.9046 (2.2265) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][90/1251] eta 0:04:44 lr 0.000704 wd 0.0500 time 0.2257 (0.2453) data time 0.0007 (0.0065) model time 0.2249 (0.2386) loss 3.4139 (3.3377) grad_norm 1.8710 (2.2329) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][100/1251] eta 0:04:40 lr 0.000704 wd 0.0500 time 0.2198 (0.2439) data time 0.0011 (0.0060) model time 0.2186 (0.2370) loss 2.6743 (3.3143) grad_norm 2.9408 (2.2402) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][110/1251] eta 0:04:38 lr 0.000704 wd 0.0500 time 0.2827 (0.2440) data time 0.0012 (0.0056) model time 0.2815 (0.2379) loss 3.4983 (3.2902) grad_norm 2.1609 (2.2309) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][120/1251] eta 0:04:36 lr 0.000704 wd 0.0500 time 0.2271 (0.2446) data time 0.0011 (0.0053) model time 0.2260 (0.2397) loss 3.1559 (3.2788) grad_norm 1.9716 (2.2397) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][130/1251] 
eta 0:04:33 lr 0.000704 wd 0.0500 time 0.2255 (0.2443) data time 0.0010 (0.0049) model time 0.2245 (0.2396) loss 3.6282 (3.2998) grad_norm 2.0013 (2.2840) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][140/1251] eta 0:04:30 lr 0.000704 wd 0.0500 time 0.2252 (0.2433) data time 0.0009 (0.0048) model time 0.2243 (0.2384) loss 3.6884 (3.2887) grad_norm 1.8093 (2.2954) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][150/1251] eta 0:04:26 lr 0.000704 wd 0.0500 time 0.2250 (0.2425) data time 0.0009 (0.0045) model time 0.2240 (0.2375) loss 2.3894 (3.2793) grad_norm 2.0674 (2.2693) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][160/1251] eta 0:04:24 lr 0.000704 wd 0.0500 time 0.2330 (0.2428) data time 0.0009 (0.0043) model time 0.2321 (0.2384) loss 3.3352 (3.2611) grad_norm 1.7504 (2.2762) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][170/1251] eta 0:04:22 lr 0.000704 wd 0.0500 time 0.2312 (0.2429) data time 0.0007 (0.0041) model time 0.2306 (0.2387) loss 3.8215 (3.2578) grad_norm 2.3909 (2.2809) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][180/1251] eta 0:04:19 lr 0.000704 wd 0.0500 time 0.3011 (0.2426) data time 0.0010 (0.0040) model time 0.3001 (0.2386) loss 2.6594 (3.2671) grad_norm 2.0367 (2.2905) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][190/1251] eta 0:04:17 lr 0.000704 wd 0.0500 time 0.2255 (0.2424) data time 0.0007 (0.0038) model time 0.2248 (0.2384) loss 4.2514 (3.2712) grad_norm 3.5025 (2.2939) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-26 21:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][200/1251] eta 0:04:16 lr 0.000704 wd 0.0500 time 0.2337 (0.2442) data time 0.0009 (0.0037) model time 0.2329 (0.2410) loss 2.9330 (3.2483) grad_norm 1.2690 (2.2870) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][210/1251] eta 0:04:14 lr 0.000704 wd 0.0500 time 0.2242 (0.2446) data time 0.0011 (0.0036) model time 0.2231 (0.2417) loss 2.6176 (3.2435) grad_norm 1.6716 (2.2952) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][220/1251] eta 0:04:11 lr 0.000704 wd 0.0500 time 0.2323 (0.2441) data time 0.0009 (0.0035) model time 0.2314 (0.2411) loss 3.1595 (3.2554) grad_norm 2.2468 (2.2839) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][230/1251] eta 0:04:09 lr 0.000704 wd 0.0500 time 0.2243 (0.2445) data time 0.0007 (0.0034) model time 0.2235 (0.2418) loss 3.4093 (3.2697) grad_norm 1.7371 (2.2691) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][240/1251] eta 0:04:07 lr 0.000704 wd 0.0500 time 0.2855 (0.2447) data time 0.0010 (0.0033) model time 0.2845 (0.2421) loss 3.8103 (3.2660) grad_norm 1.4348 (2.2669) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][250/1251] eta 0:04:04 lr 0.000704 wd 0.0500 time 0.2394 (0.2445) data time 0.0007 (0.0032) model time 0.2388 (0.2420) loss 3.0995 (3.2697) grad_norm 1.6855 (2.2618) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][260/1251] eta 0:04:02 lr 0.000704 wd 0.0500 time 0.2391 (0.2445) data time 0.0008 (0.0031) model time 0.2383 
(0.2420) loss 3.4074 (3.2723) grad_norm 2.0713 (2.2896) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][270/1251] eta 0:03:59 lr 0.000704 wd 0.0500 time 0.2235 (0.2439) data time 0.0010 (0.0030) model time 0.2225 (0.2414) loss 2.4124 (3.2780) grad_norm 1.8588 (2.2840) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][280/1251] eta 0:03:56 lr 0.000704 wd 0.0500 time 0.2907 (0.2438) data time 0.0009 (0.0030) model time 0.2899 (0.2413) loss 3.2640 (3.2767) grad_norm 1.7832 (2.2696) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][290/1251] eta 0:03:54 lr 0.000703 wd 0.0500 time 0.2336 (0.2441) data time 0.0007 (0.0029) model time 0.2329 (0.2416) loss 2.9372 (3.2685) grad_norm 1.8155 (2.2674) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][300/1251] eta 0:03:51 lr 0.000703 wd 0.0500 time 0.2307 (0.2436) data time 0.0007 (0.0029) model time 0.2300 (0.2411) loss 3.4195 (3.2724) grad_norm 2.0254 (2.2552) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][310/1251] eta 0:03:49 lr 0.000703 wd 0.0500 time 0.2534 (0.2435) data time 0.0009 (0.0028) model time 0.2524 (0.2410) loss 3.8776 (3.2814) grad_norm 2.1965 (2.2461) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][320/1251] eta 0:03:46 lr 0.000703 wd 0.0500 time 0.2331 (0.2430) data time 0.0007 (0.0028) model time 0.2324 (0.2405) loss 4.1133 (3.2813) grad_norm 2.6767 (2.2422) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[123/300][330/1251] eta 0:03:44 lr 0.000703 wd 0.0500 time 0.2414 (0.2438) data time 0.0007 (0.0027) model time 0.2407 (0.2414) loss 4.0813 (3.2865) grad_norm 1.9123 (2.2350) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][340/1251] eta 0:03:41 lr 0.000703 wd 0.0500 time 0.2199 (0.2434) data time 0.0011 (0.0027) model time 0.2188 (0.2410) loss 3.3585 (3.2901) grad_norm 1.9988 (2.2369) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][350/1251] eta 0:03:38 lr 0.000703 wd 0.0500 time 0.2303 (0.2430) data time 0.0009 (0.0026) model time 0.2294 (0.2407) loss 2.9660 (3.2903) grad_norm 1.6766 (2.2323) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][360/1251] eta 0:03:36 lr 0.000703 wd 0.0500 time 0.2277 (0.2432) data time 0.0006 (0.0026) model time 0.2271 (0.2409) loss 3.3894 (3.2871) grad_norm 1.7623 (2.2275) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][370/1251] eta 0:03:34 lr 0.000703 wd 0.0500 time 0.2319 (0.2434) data time 0.0010 (0.0025) model time 0.2309 (0.2412) loss 2.9643 (3.2809) grad_norm 2.2272 (2.2241) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][380/1251] eta 0:03:31 lr 0.000703 wd 0.0500 time 0.2253 (0.2434) data time 0.0009 (0.0025) model time 0.2243 (0.2412) loss 2.5558 (3.2770) grad_norm 2.7599 (2.2336) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][390/1251] eta 0:03:29 lr 0.000703 wd 0.0500 time 0.2356 (0.2430) data time 0.0014 (0.0025) model time 0.2342 (0.2408) loss 4.0857 (3.2888) grad_norm 3.0685 (2.2382) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-26 21:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][400/1251] eta 0:03:26 lr 0.000703 wd 0.0500 time 0.2264 (0.2427) data time 0.0007 (0.0024) model time 0.2257 (0.2405) loss 3.5067 (3.2903) grad_norm 1.9827 (2.2389) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][410/1251] eta 0:03:24 lr 0.000703 wd 0.0500 time 0.2879 (0.2432) data time 0.0011 (0.0024) model time 0.2867 (0.2411) loss 3.5065 (3.2928) grad_norm 2.0113 (2.2349) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][420/1251] eta 0:03:21 lr 0.000703 wd 0.0500 time 0.2262 (0.2430) data time 0.0011 (0.0024) model time 0.2251 (0.2409) loss 3.6713 (3.2981) grad_norm 1.8720 (2.2346) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][430/1251] eta 0:03:19 lr 0.000703 wd 0.0500 time 0.2627 (0.2428) data time 0.0007 (0.0023) model time 0.2620 (0.2407) loss 2.4382 (3.2885) grad_norm 1.5756 (2.2290) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][440/1251] eta 0:03:16 lr 0.000703 wd 0.0500 time 0.2259 (0.2427) data time 0.0007 (0.0023) model time 0.2252 (0.2405) loss 3.8136 (3.2878) grad_norm 2.1969 (2.2342) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][450/1251] eta 0:03:14 lr 0.000703 wd 0.0500 time 0.2285 (0.2426) data time 0.0008 (0.0023) model time 0.2277 (0.2405) loss 4.0135 (3.2903) grad_norm 2.1864 (2.2333) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][460/1251] eta 0:03:12 lr 0.000703 wd 0.0500 time 0.2224 (0.2428) data time 0.0012 
(0.0023) model time 0.2212 (0.2407) loss 3.5439 (3.2942) grad_norm 1.6042 (2.2299) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][470/1251] eta 0:03:09 lr 0.000703 wd 0.0500 time 0.2396 (0.2425) data time 0.0007 (0.0022) model time 0.2389 (0.2404) loss 3.8716 (3.2967) grad_norm 2.3848 (2.2332) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][480/1251] eta 0:03:06 lr 0.000703 wd 0.0500 time 0.2227 (0.2422) data time 0.0011 (0.0022) model time 0.2216 (0.2401) loss 3.3110 (3.2976) grad_norm 2.1287 (2.2387) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][490/1251] eta 0:03:04 lr 0.000703 wd 0.0500 time 0.2412 (0.2424) data time 0.0009 (0.0022) model time 0.2403 (0.2404) loss 3.5604 (3.3013) grad_norm 2.2344 (2.2463) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][500/1251] eta 0:03:02 lr 0.000703 wd 0.0500 time 0.2244 (0.2424) data time 0.0010 (0.0022) model time 0.2234 (0.2404) loss 3.1200 (3.3027) grad_norm 2.4964 (2.2468) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][510/1251] eta 0:02:59 lr 0.000703 wd 0.0500 time 0.2415 (0.2424) data time 0.0010 (0.0021) model time 0.2405 (0.2404) loss 3.0955 (3.3011) grad_norm 3.1351 (2.2497) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][520/1251] eta 0:02:57 lr 0.000703 wd 0.0500 time 0.2316 (0.2422) data time 0.0009 (0.0021) model time 0.2307 (0.2402) loss 3.0544 (3.3028) grad_norm 1.8812 (2.2485) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [123/300][530/1251] eta 0:02:54 lr 0.000702 wd 0.0500 time 0.3006 (0.2423) data time 0.0007 (0.0021) model time 0.2999 (0.2403) loss 3.1566 (3.3064) grad_norm 2.3983 (2.2513) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-26 21:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][540/1251] eta 0:02:52 lr 0.000702 wd 0.0500 time 0.2249 (0.2425) data time 0.0010 (0.0021) model time 0.2238 (0.2405) loss 3.4990 (3.3084) grad_norm 1.9936 (2.2514) loss_scale 4096.0000 (2051.7856) mem 7381MB [2024-08-26 21:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][550/1251] eta 0:02:49 lr 0.000702 wd 0.0500 time 0.2189 (0.2422) data time 0.0012 (0.0021) model time 0.2177 (0.2403) loss 3.0152 (3.3035) grad_norm 1.8039 (2.2442) loss_scale 4096.0000 (2088.8857) mem 7381MB [2024-08-26 21:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][560/1251] eta 0:02:47 lr 0.000702 wd 0.0500 time 0.2462 (0.2421) data time 0.0010 (0.0021) model time 0.2451 (0.2401) loss 2.8200 (3.3009) grad_norm 1.7868 (2.2416) loss_scale 4096.0000 (2124.6631) mem 7381MB [2024-08-26 21:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][570/1251] eta 0:02:44 lr 0.000702 wd 0.0500 time 0.2651 (0.2422) data time 0.0007 (0.0020) model time 0.2644 (0.2403) loss 3.8697 (3.3027) grad_norm 2.5670 (2.2396) loss_scale 4096.0000 (2159.1874) mem 7381MB [2024-08-26 21:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][580/1251] eta 0:02:42 lr 0.000702 wd 0.0500 time 0.3211 (0.2426) data time 0.0009 (0.0020) model time 0.3202 (0.2407) loss 3.6093 (3.3080) grad_norm 2.7468 (2.2498) loss_scale 4096.0000 (2192.5232) mem 7381MB [2024-08-26 21:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][590/1251] eta 0:02:40 lr 0.000702 wd 0.0500 time 0.2362 (0.2423) data time 0.0007 (0.0020) model time 0.2355 (0.2405) loss 2.8108 (3.3068) grad_norm 1.6235 (2.2525) loss_scale 
4096.0000 (2224.7310) mem 7381MB [2024-08-26 21:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][600/1251] eta 0:02:37 lr 0.000702 wd 0.0500 time 0.2323 (0.2421) data time 0.0007 (0.0020) model time 0.2316 (0.2402) loss 3.7980 (3.3019) grad_norm 2.4262 (2.2584) loss_scale 4096.0000 (2255.8669) mem 7381MB [2024-08-26 21:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][610/1251] eta 0:02:35 lr 0.000702 wd 0.0500 time 0.2383 (0.2419) data time 0.0011 (0.0020) model time 0.2372 (0.2400) loss 3.3174 (3.3022) grad_norm 1.7920 (2.2590) loss_scale 4096.0000 (2285.9836) mem 7381MB [2024-08-26 21:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][620/1251] eta 0:02:32 lr 0.000702 wd 0.0500 time 0.2224 (0.2423) data time 0.0007 (0.0020) model time 0.2217 (0.2404) loss 4.1085 (3.2965) grad_norm 1.9054 (2.2566) loss_scale 4096.0000 (2315.1304) mem 7381MB [2024-08-26 21:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][630/1251] eta 0:02:30 lr 0.000702 wd 0.0500 time 0.2822 (0.2426) data time 0.0008 (0.0020) model time 0.2813 (0.2407) loss 3.7010 (3.2961) grad_norm 2.0844 (2.2539) loss_scale 4096.0000 (2343.3534) mem 7381MB [2024-08-26 21:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][640/1251] eta 0:02:28 lr 0.000702 wd 0.0500 time 0.2322 (0.2428) data time 0.0011 (0.0020) model time 0.2311 (0.2409) loss 3.8451 (3.3029) grad_norm 2.4098 (2.2560) loss_scale 4096.0000 (2370.6958) mem 7381MB [2024-08-26 21:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 21:22:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 21:22:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 21:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 21:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 21:27:02 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 21:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:27:12 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 21:27:14 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 21:27:15 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 21:27:15 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 123) [2024-08-26 21:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 21:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 21:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 21:30:12 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 21:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:30:22 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 21:30:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 21:30:25 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 21:30:25 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 123) [2024-08-26 21:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 21:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][650/1251] eta 0:25:25 lr 0.000702 wd 0.0500 time 0.2249 (2.5379) data time 0.0010 (0.2093) model time 0.2239 (2.3287) loss 3.4756 (3.9133) grad_norm 2.4060 (2.2373) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][660/1251] eta 0:11:41 lr 0.000702 wd 0.0500 time 0.2303 (1.1868) data time 0.0010 (0.0869) model time 0.2293 (1.0999) loss 3.0460 (3.6634) grad_norm 1.8791 (2.2506) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][670/1251] eta 0:08:03 lr 0.000702 wd 0.0500 time 0.2283 (0.8318) data time 0.0009 (0.0551) model time 0.2274 (0.7767) loss 3.7980 (3.6863) grad_norm 2.3882 (2.3028) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][680/1251] eta 0:06:22 lr 0.000702 wd 0.0500 time 0.2234 (0.6690) data time 0.0011 (0.0405) model time 0.2223 (0.6286) loss 2.8011 (3.6090) grad_norm 2.3309 (2.2040) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][690/1251] eta 0:05:25 lr 0.000702 wd 0.0500 time 0.3041 (0.5809) data time 0.0011 (0.0321) model time 0.3031 (0.5488) loss 4.0141 (3.5550) grad_norm 1.9540 (2.2388) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [123/300][700/1251] eta 0:04:46 lr 0.000702 wd 0.0500 time 0.2297 (0.5207) data time 0.0008 (0.0267) model time 0.2289 (0.4940) loss 3.8647 (3.5602) grad_norm 1.8438 (2.2090) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][710/1251] eta 0:04:19 lr 0.000702 wd 0.0500 time 0.2491 (0.4795) data time 0.0018 (0.0229) model time 0.2473 (0.4566) loss 3.2673 (3.5018) grad_norm 3.4959 (2.2503) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][720/1251] eta 0:03:57 lr 0.000702 wd 0.0500 time 0.2369 (0.4473) data time 0.0009 (0.0200) model time 0.2360 (0.4272) loss 3.3429 (3.4633) grad_norm 2.0926 (2.2430) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][730/1251] eta 0:03:40 lr 0.000702 wd 0.0500 time 0.2474 (0.4235) data time 0.0011 (0.0179) model time 0.2463 (0.4057) loss 2.9944 (3.4211) grad_norm 2.5240 (2.2422) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][740/1251] eta 0:03:27 lr 0.000702 wd 0.0500 time 0.2302 (0.4054) data time 0.0009 (0.0162) model time 0.2293 (0.3892) loss 3.5050 (3.4259) grad_norm 2.0031 (2.2428) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][750/1251] eta 0:03:14 lr 0.000702 wd 0.0500 time 0.2297 (0.3888) data time 0.0007 (0.0148) model time 0.2290 (0.3740) loss 3.5953 (3.4465) grad_norm 2.1630 (2.2401) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][760/1251] eta 0:03:04 lr 0.000702 wd 0.0500 time 0.2205 (0.3750) data time 0.0010 (0.0136) model time 0.2195 (0.3614) loss 3.5079 (3.4346) grad_norm 2.3892 (2.2440) loss_scale 
4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][770/1251] eta 0:02:56 lr 0.000702 wd 0.0500 time 0.3026 (0.3666) data time 0.0014 (0.0127) model time 0.3012 (0.3538) loss 3.7309 (3.4181) grad_norm 2.0779 (2.2331) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][780/1251] eta 0:02:48 lr 0.000701 wd 0.0500 time 0.2897 (0.3571) data time 0.0013 (0.0119) model time 0.2884 (0.3452) loss 2.6760 (3.4076) grad_norm 2.4513 (2.2242) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][790/1251] eta 0:02:40 lr 0.000701 wd 0.0500 time 0.2294 (0.3487) data time 0.0010 (0.0112) model time 0.2284 (0.3376) loss 3.2581 (3.4014) grad_norm 3.1665 (2.2392) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][800/1251] eta 0:02:33 lr 0.000701 wd 0.0500 time 0.2267 (0.3411) data time 0.0009 (0.0105) model time 0.2258 (0.3306) loss 3.2316 (3.3983) grad_norm 2.6777 (2.2768) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][810/1251] eta 0:02:27 lr 0.000701 wd 0.0500 time 0.3131 (0.3348) data time 0.0011 (0.0100) model time 0.3121 (0.3249) loss 3.0764 (3.3999) grad_norm 2.1903 (2.2737) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][820/1251] eta 0:02:22 lr 0.000701 wd 0.0500 time 0.2259 (0.3303) data time 0.0009 (0.0095) model time 0.2250 (0.3208) loss 3.4743 (3.3878) grad_norm 1.8588 (2.2639) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][830/1251] eta 0:02:16 lr 0.000701 wd 0.0500 time 0.2332 (0.3248) data time 
0.0006 (0.0091) model time 0.2326 (0.3157) loss 2.9453 (3.3753) grad_norm 1.6664 (2.2533) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][840/1251] eta 0:02:11 lr 0.000701 wd 0.0500 time 0.2229 (0.3205) data time 0.0007 (0.0087) model time 0.2222 (0.3118) loss 3.9703 (3.3678) grad_norm 1.7879 (2.2480) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][850/1251] eta 0:02:06 lr 0.000701 wd 0.0500 time 0.2361 (0.3161) data time 0.0007 (0.0083) model time 0.2353 (0.3078) loss 3.2322 (3.3493) grad_norm 2.0736 (2.2451) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][860/1251] eta 0:02:02 lr 0.000701 wd 0.0500 time 0.2886 (0.3136) data time 0.0013 (0.0080) model time 0.2873 (0.3056) loss 3.8360 (3.3447) grad_norm 2.3855 (2.2336) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][870/1251] eta 0:01:58 lr 0.000701 wd 0.0500 time 0.2298 (0.3106) data time 0.0008 (0.0077) model time 0.2290 (0.3029) loss 3.2275 (3.3512) grad_norm 2.0353 (2.2288) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][880/1251] eta 0:01:54 lr 0.000701 wd 0.0500 time 0.2302 (0.3078) data time 0.0010 (0.0074) model time 0.2291 (0.3004) loss 3.2967 (3.3451) grad_norm 1.8902 (2.2181) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][890/1251] eta 0:01:50 lr 0.000701 wd 0.0500 time 0.2678 (0.3048) data time 0.0009 (0.0072) model time 0.2669 (0.2976) loss 3.9712 (3.3396) grad_norm 2.6200 (2.2158) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [123/300][900/1251] eta 0:01:46 lr 0.000701 wd 0.0500 time 0.2498 (0.3033) data time 0.0007 (0.0070) model time 0.2491 (0.2964) loss 2.3977 (3.3258) grad_norm 1.7766 (2.2255) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][910/1251] eta 0:01:42 lr 0.000701 wd 0.0500 time 0.2281 (0.3009) data time 0.0008 (0.0067) model time 0.2273 (0.2942) loss 2.4869 (3.3178) grad_norm 2.5984 (2.2391) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][920/1251] eta 0:01:38 lr 0.000701 wd 0.0500 time 0.2335 (0.2983) data time 0.0010 (0.0065) model time 0.2324 (0.2918) loss 3.3056 (3.3242) grad_norm 5.7061 (2.2536) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][930/1251] eta 0:01:35 lr 0.000701 wd 0.0500 time 0.4450 (0.2966) data time 0.0007 (0.0063) model time 0.4443 (0.2902) loss 3.8803 (3.3215) grad_norm 1.9737 (2.2561) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][940/1251] eta 0:01:31 lr 0.000701 wd 0.0500 time 0.2734 (0.2955) data time 0.0016 (0.0062) model time 0.2718 (0.2893) loss 2.6444 (3.3100) grad_norm 2.4380 (2.2508) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][950/1251] eta 0:01:28 lr 0.000701 wd 0.0500 time 0.4801 (0.2941) data time 0.0009 (0.0060) model time 0.4792 (0.2880) loss 3.7736 (3.3089) grad_norm 3.0591 (2.2627) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][960/1251] eta 0:01:25 lr 0.000701 wd 0.0500 time 0.2755 (0.2922) data time 0.0014 (0.0059) model time 0.2741 (0.2864) loss 3.5931 (3.3174) grad_norm 2.6071 (2.2642) 
loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][970/1251] eta 0:01:21 lr 0.000701 wd 0.0500 time 0.2288 (0.2903) data time 0.0007 (0.0057) model time 0.2281 (0.2845) loss 2.7343 (3.3247) grad_norm 1.9933 (2.2669) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][980/1251] eta 0:01:18 lr 0.000701 wd 0.0500 time 0.2248 (0.2890) data time 0.0008 (0.0056) model time 0.2240 (0.2834) loss 3.1065 (3.3224) grad_norm 1.9784 (2.2647) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][990/1251] eta 0:01:15 lr 0.000701 wd 0.0500 time 0.2249 (0.2878) data time 0.0007 (0.0055) model time 0.2242 (0.2823) loss 3.1470 (3.3243) grad_norm 1.6416 (2.2501) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1000/1251] eta 0:01:11 lr 0.000701 wd 0.0500 time 0.2275 (0.2861) data time 0.0009 (0.0053) model time 0.2266 (0.2808) loss 2.9370 (3.3260) grad_norm 3.3039 (2.2476) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1010/1251] eta 0:01:08 lr 0.000701 wd 0.0500 time 0.2334 (0.2846) data time 0.0009 (0.0052) model time 0.2325 (0.2793) loss 2.8852 (3.3274) grad_norm 2.0333 (2.2546) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1020/1251] eta 0:01:05 lr 0.000700 wd 0.0500 time 0.2886 (0.2838) data time 0.0015 (0.0052) model time 0.2871 (0.2786) loss 3.2561 (3.3241) grad_norm 2.8779 (2.2579) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1030/1251] eta 0:01:02 lr 0.000700 wd 0.0500 time 0.2246 
(0.2825) data time 0.0006 (0.0051) model time 0.2240 (0.2774) loss 3.7514 (3.3201) grad_norm 2.6091 (2.2610) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1040/1251] eta 0:00:59 lr 0.000700 wd 0.0500 time 0.2289 (0.2817) data time 0.0008 (0.0050) model time 0.2281 (0.2767) loss 3.4352 (3.3254) grad_norm 2.0771 (2.2578) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1050/1251] eta 0:00:56 lr 0.000700 wd 0.0500 time 0.2367 (0.2804) data time 0.0011 (0.0049) model time 0.2355 (0.2755) loss 3.4266 (3.3326) grad_norm 1.6249 (2.2578) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1060/1251] eta 0:00:53 lr 0.000700 wd 0.0500 time 0.2270 (0.2797) data time 0.0007 (0.0048) model time 0.2263 (0.2749) loss 2.5042 (3.3356) grad_norm 2.4265 (2.2571) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1070/1251] eta 0:00:50 lr 0.000700 wd 0.0500 time 0.2304 (0.2788) data time 0.0010 (0.0047) model time 0.2294 (0.2741) loss 3.5255 (3.3409) grad_norm 2.9436 (2.2629) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1080/1251] eta 0:00:47 lr 0.000700 wd 0.0500 time 0.2251 (0.2777) data time 0.0010 (0.0047) model time 0.2241 (0.2730) loss 4.0338 (3.3475) grad_norm 2.3183 (2.2683) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1090/1251] eta 0:00:44 lr 0.000700 wd 0.0500 time 0.2761 (0.2767) data time 0.0014 (0.0046) model time 0.2747 (0.2721) loss 3.3862 (3.3487) grad_norm 2.1169 (2.2646) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1100/1251] eta 0:00:41 lr 0.000700 wd 0.0500 time 0.2840 (0.2760) data time 0.0015 (0.0045) model time 0.2826 (0.2715) loss 2.4682 (3.3405) grad_norm 2.2795 (2.2607) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1110/1251] eta 0:00:38 lr 0.000700 wd 0.0500 time 0.2658 (0.2755) data time 0.0013 (0.0044) model time 0.2645 (0.2711) loss 3.0993 (3.3359) grad_norm 2.3747 (2.2718) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1120/1251] eta 0:00:35 lr 0.000700 wd 0.0500 time 0.2277 (0.2745) data time 0.0008 (0.0044) model time 0.2269 (0.2701) loss 3.8699 (3.3349) grad_norm 2.6508 (2.2687) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1130/1251] eta 0:00:33 lr 0.000700 wd 0.0500 time 0.2447 (0.2736) data time 0.0009 (0.0043) model time 0.2438 (0.2693) loss 3.1959 (3.3369) grad_norm 2.8061 (2.2696) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1140/1251] eta 0:00:30 lr 0.000700 wd 0.0500 time 0.2342 (0.2727) data time 0.0010 (0.0042) model time 0.2333 (0.2684) loss 3.4696 (3.3369) grad_norm 2.8425 (2.2732) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1150/1251] eta 0:00:27 lr 0.000700 wd 0.0500 time 0.2280 (0.2724) data time 0.0009 (0.0042) model time 0.2271 (0.2682) loss 3.4577 (3.3388) grad_norm 1.5566 (2.2735) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1160/1251] eta 0:00:24 lr 0.000700 wd 0.0500 time 0.2239 (0.2715) data time 0.0010 (0.0041) model time 0.2230 (0.2674) loss 
3.6291 (3.3447) grad_norm 1.4962 (2.2788) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1170/1251] eta 0:00:21 lr 0.000700 wd 0.0500 time 0.2268 (0.2711) data time 0.0011 (0.0041) model time 0.2256 (0.2670) loss 3.3008 (3.3372) grad_norm 2.9735 (2.2855) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1180/1251] eta 0:00:19 lr 0.000700 wd 0.0500 time 0.2334 (0.2703) data time 0.0009 (0.0041) model time 0.2326 (0.2662) loss 2.2336 (3.3327) grad_norm 2.1194 (2.2884) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1190/1251] eta 0:00:16 lr 0.000700 wd 0.0500 time 0.2895 (0.2701) data time 0.0010 (0.0040) model time 0.2885 (0.2661) loss 3.5878 (3.3332) grad_norm 1.7022 (2.2830) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1200/1251] eta 0:00:13 lr 0.000700 wd 0.0500 time 0.2350 (0.2695) data time 0.0007 (0.0039) model time 0.2343 (0.2655) loss 2.4226 (3.3374) grad_norm 3.0590 (2.2830) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1210/1251] eta 0:00:11 lr 0.000700 wd 0.0500 time 0.2319 (0.2688) data time 0.0009 (0.0039) model time 0.2310 (0.2649) loss 3.7891 (3.3399) grad_norm 2.3587 (2.2877) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1220/1251] eta 0:00:08 lr 0.000700 wd 0.0500 time 0.2468 (0.2684) data time 0.0008 (0.0039) model time 0.2460 (0.2646) loss 3.8679 (3.3396) grad_norm 1.8229 (2.2940) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1230/1251] eta 
0:00:05 lr 0.000700 wd 0.0500 time 0.2238 (0.2682) data time 0.0008 (0.0038) model time 0.2230 (0.2643) loss 3.9616 (3.3421) grad_norm 2.5683 (2.2983) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1240/1251] eta 0:00:02 lr 0.000700 wd 0.0500 time 0.2110 (0.2676) data time 0.0004 (0.0038) model time 0.2106 (0.2638) loss 4.1234 (3.3441) grad_norm 2.1917 (2.2954) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [123/300][1250/1251] eta 0:00:00 lr 0.000700 wd 0.0500 time 0.2105 (0.2667) data time 0.0006 (0.0037) model time 0.2099 (0.2629) loss 3.5477 (3.3423) grad_norm 2.5332 (2.2968) loss_scale 4096.0000 (4096.0000) mem 7378MB [2024-08-26 21:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 123 training takes 0:02:41 [2024-08-26 21:33:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 21:33:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 21:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.441 (0.441) Loss 0.4526 (0.4526) Acc@1 91.602 (91.602) Acc@5 98.145 (98.145) Mem 7378MB [2024-08-26 21:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.093 (0.131) Loss 0.7534 (0.7456) Acc@1 83.691 (83.336) Acc@5 95.703 (96.529) Mem 7378MB [2024-08-26 21:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.091 (0.111) Loss 1.0859 (0.7688) Acc@1 72.461 (82.431) Acc@5 93.359 (96.563) Mem 7378MB [2024-08-26 21:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.098 (0.103) Loss 1.3486 (0.8777) Acc@1 66.406 (79.917) Acc@5 90.918 (95.253) Mem 7378MB [2024-08-26 21:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.2305 (0.9368) Acc@1 70.215 (78.480) Acc@5 90.625 (94.543) Mem 7378MB [2024-08-26 21:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.156 Acc@5 94.448 [2024-08-26 21:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.2% [2024-08-26 21:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 78.16% [2024-08-26 21:33:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-26 21:33:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-26 21:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.481 (0.481) Loss 0.4219 (0.4219) Acc@1 92.090 (92.090) Acc@5 98.438 (98.438) Mem 7378MB [2024-08-26 21:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.116) Loss 0.6729 (0.6635) Acc@1 86.816 (85.716) Acc@5 96.973 (97.221) Mem 7378MB [2024-08-26 21:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.094 (0.101) Loss 0.9424 (0.6856) Acc@1 77.734 (84.794) Acc@5 95.020 (97.233) Mem 7378MB [2024-08-26 21:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.097) Loss 1.2031 (0.7799) Acc@1 68.945 (82.475) Acc@5 91.797 (96.173) Mem 7378MB [2024-08-26 21:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.090) Loss 1.0918 (0.8298) Acc@1 72.656 (81.069) Acc@5 93.066 (95.644) Mem 7378MB [2024-08-26 21:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.686 Acc@5 95.600 [2024-08-26 21:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-26 21:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.69% [2024-08-26 21:33:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 21:33:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 21:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][0/1251] eta 0:17:21 lr 0.000700 wd 0.0500 time 0.8324 (0.8324) data time 0.4734 (0.4734) model time 0.0000 (0.0000) loss 3.0167 (3.0167) grad_norm 2.4819 (2.4819) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 21:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 21:33:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 21:33:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 21:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 21:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 21:35:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 21:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:35:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 21:35:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 21:35:43 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 21:35:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 124) [2024-08-26 21:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 21:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][10/1251] eta 0:33:15 lr 0.000700 wd 0.0500 time 0.2421 (1.6076) data time 0.0010 (0.1077) model time 0.0000 (0.0000) loss 3.2352 (3.8397) grad_norm 2.0391 (2.1226) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][20/1251] eta 0:18:56 lr 0.000699 wd 0.0500 time 0.2370 (0.9233) data time 0.0009 (0.0544) model time 0.0000 (0.0000) loss 3.6595 (3.5998) grad_norm 2.0851 (1.9670) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][30/1251] eta 0:14:09 lr 0.000699 wd 0.0500 time 0.2414 (0.6954) data time 0.0011 (0.0366) model time 0.0000 (0.0000) loss 3.2048 (3.6009) grad_norm 1.4116 (1.9952) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][40/1251] eta 0:11:44 lr 0.000699 wd 0.0500 time 0.2348 (0.5816) data time 0.0008 (0.0277) model time 0.0000 (0.0000) loss 2.8573 (3.5167) grad_norm 2.0880 (2.0784) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][50/1251] eta 0:10:16 lr 0.000699 wd 0.0500 time 0.2459 (0.5135) data time 0.0010 (0.0224) model time 0.0000 (0.0000) loss 3.0230 (3.4819) grad_norm 1.9151 (2.0477) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[124/300][60/1251] eta 0:09:17 lr 0.000699 wd 0.0500 time 0.2320 (0.4683) data time 0.0010 (0.0188) model time 0.2310 (0.2416) loss 3.4611 (3.4598) grad_norm 2.9592 (2.1043) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][70/1251] eta 0:08:34 lr 0.000699 wd 0.0500 time 0.2338 (0.4356) data time 0.0009 (0.0163) model time 0.2329 (0.2399) loss 2.1799 (3.4156) grad_norm 1.9566 (2.1490) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][80/1251] eta 0:08:01 lr 0.000699 wd 0.0500 time 0.2387 (0.4112) data time 0.0010 (0.0144) model time 0.2377 (0.2396) loss 3.9856 (3.4166) grad_norm 4.2886 (2.1814) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][90/1251] eta 0:07:35 lr 0.000699 wd 0.0500 time 0.2499 (0.3923) data time 0.0008 (0.0129) model time 0.2490 (0.2397) loss 3.9313 (3.3960) grad_norm 1.9526 (2.1890) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][100/1251] eta 0:07:14 lr 0.000699 wd 0.0500 time 0.2418 (0.3771) data time 0.0010 (0.0117) model time 0.2408 (0.2397) loss 3.8737 (3.4069) grad_norm 1.6414 (2.2150) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][110/1251] eta 0:06:56 lr 0.000699 wd 0.0500 time 0.2405 (0.3647) data time 0.0011 (0.0107) model time 0.2395 (0.2397) loss 3.3336 (3.4137) grad_norm 2.3550 (2.2474) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][120/1251] eta 0:06:40 lr 0.000699 wd 0.0500 time 0.2340 (0.3543) data time 0.0008 (0.0099) model time 0.2332 (0.2396) loss 4.0093 (3.4215) grad_norm 2.4952 (2.2505) loss_scale 4096.0000 
(4096.0000) mem 7374MB [2024-08-26 21:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][130/1251] eta 0:06:27 lr 0.000699 wd 0.0500 time 0.2396 (0.3457) data time 0.0008 (0.0093) model time 0.2389 (0.2397) loss 3.5839 (3.3987) grad_norm 1.9340 (2.2506) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][140/1251] eta 0:06:15 lr 0.000699 wd 0.0500 time 0.2364 (0.3381) data time 0.0009 (0.0087) model time 0.2354 (0.2396) loss 1.8512 (3.3815) grad_norm 1.5629 (2.2617) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][150/1251] eta 0:06:05 lr 0.000699 wd 0.0500 time 0.2361 (0.3316) data time 0.0014 (0.0082) model time 0.2347 (0.2395) loss 3.7382 (3.3840) grad_norm 1.9371 (2.2655) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][160/1251] eta 0:05:55 lr 0.000699 wd 0.0500 time 0.2373 (0.3258) data time 0.0010 (0.0077) model time 0.2363 (0.2394) loss 3.3140 (3.3765) grad_norm 2.2187 (2.2781) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][170/1251] eta 0:05:46 lr 0.000699 wd 0.0500 time 0.2422 (0.3207) data time 0.0008 (0.0073) model time 0.2414 (0.2394) loss 2.3786 (3.3695) grad_norm 1.7774 (2.2672) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][180/1251] eta 0:05:38 lr 0.000699 wd 0.0500 time 0.2378 (0.3163) data time 0.0009 (0.0070) model time 0.2369 (0.2393) loss 2.7061 (3.3517) grad_norm 1.7200 (2.2545) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][190/1251] eta 0:05:31 lr 0.000699 wd 0.0500 time 0.2501 (0.3123) data time 0.0008 
(0.0067) model time 0.2492 (0.2394) loss 2.6472 (3.3542) grad_norm 1.9844 (2.2630) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][200/1251] eta 0:05:24 lr 0.000699 wd 0.0500 time 0.2414 (0.3088) data time 0.0011 (0.0064) model time 0.2404 (0.2394) loss 3.4866 (3.3431) grad_norm 2.3283 (2.2681) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][210/1251] eta 0:05:17 lr 0.000699 wd 0.0500 time 0.2409 (0.3054) data time 0.0008 (0.0062) model time 0.2401 (0.2393) loss 3.4130 (3.3358) grad_norm 1.6268 (2.2580) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][220/1251] eta 0:05:11 lr 0.000699 wd 0.0500 time 0.2358 (0.3025) data time 0.0010 (0.0059) model time 0.2348 (0.2393) loss 2.8523 (3.3279) grad_norm 3.1663 (2.2584) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][230/1251] eta 0:05:06 lr 0.000699 wd 0.0500 time 0.2429 (0.2999) data time 0.0011 (0.0057) model time 0.2419 (0.2395) loss 3.7276 (3.3342) grad_norm 2.4291 (2.2547) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][240/1251] eta 0:05:00 lr 0.000699 wd 0.0500 time 0.2425 (0.2975) data time 0.0010 (0.0055) model time 0.2415 (0.2396) loss 3.3405 (3.3255) grad_norm 1.7965 (2.2592) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][250/1251] eta 0:04:55 lr 0.000699 wd 0.0500 time 0.2386 (0.2953) data time 0.0008 (0.0053) model time 0.2379 (0.2397) loss 2.3808 (3.3175) grad_norm 2.4244 (2.2699) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [124/300][260/1251] eta 0:04:50 lr 0.000698 wd 0.0500 time 0.2342 (0.2931) data time 0.0011 (0.0052) model time 0.2331 (0.2396) loss 2.7502 (3.3078) grad_norm 1.5873 (2.2949) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][270/1251] eta 0:04:45 lr 0.000698 wd 0.0500 time 0.2373 (0.2913) data time 0.0008 (0.0050) model time 0.2365 (0.2397) loss 4.2617 (3.3032) grad_norm 2.5667 (2.2899) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][280/1251] eta 0:04:40 lr 0.000698 wd 0.0500 time 0.2338 (0.2894) data time 0.0010 (0.0049) model time 0.2327 (0.2396) loss 3.4354 (3.3057) grad_norm 2.4768 (2.2867) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][290/1251] eta 0:04:37 lr 0.000698 wd 0.0500 time 0.2405 (0.2885) data time 0.0010 (0.0047) model time 0.2395 (0.2406) loss 3.1049 (3.3071) grad_norm 1.9216 (2.2916) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][300/1251] eta 0:04:32 lr 0.000698 wd 0.0500 time 0.2400 (0.2869) data time 0.0009 (0.0046) model time 0.2390 (0.2405) loss 2.9853 (3.2935) grad_norm 2.0819 (2.2881) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][310/1251] eta 0:04:29 lr 0.000698 wd 0.0500 time 0.2398 (0.2863) data time 0.0010 (0.0045) model time 0.2388 (0.2415) loss 3.2998 (3.2930) grad_norm 1.8020 (2.2820) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][320/1251] eta 0:04:25 lr 0.000698 wd 0.0500 time 0.2397 (0.2848) data time 0.0011 (0.0044) model time 0.2386 (0.2414) loss 3.2900 (3.3024) grad_norm 2.9177 (2.2810) loss_scale 
4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][330/1251] eta 0:04:21 lr 0.000698 wd 0.0500 time 0.2436 (0.2835) data time 0.0008 (0.0043) model time 0.2429 (0.2413) loss 3.5809 (3.3072) grad_norm 2.0110 (2.2812) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][340/1251] eta 0:04:17 lr 0.000698 wd 0.0500 time 0.2344 (0.2822) data time 0.0011 (0.0042) model time 0.2333 (0.2412) loss 3.1280 (3.3055) grad_norm 1.9521 (2.2793) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][350/1251] eta 0:04:13 lr 0.000698 wd 0.0500 time 0.2450 (0.2810) data time 0.0010 (0.0041) model time 0.2440 (0.2412) loss 3.1066 (3.3064) grad_norm 1.9729 (2.2790) loss_scale 4096.0000 (4096.0000) mem 7374MB [2024-08-26 21:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 21:37:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 21:37:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 21:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 21:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 21:39:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 21:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:39:45 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 21:39:46 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 21:39:47 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 21:39:47 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 124) [2024-08-26 21:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 21:40:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][360/1251] eta 0:28:36 lr 0.000698 wd 0.0500 time 0.2372 (1.9260) data time 0.0011 (0.1632) model time 0.2361 (1.7628) loss 4.0135 (3.7428) grad_norm 1.7843 (2.6863) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][370/1251] eta 0:15:09 lr 0.000698 wd 0.0500 time 0.2240 (1.0328) data time 0.0010 (0.0779) model time 0.2230 (0.9549) loss 3.5217 (3.5989) grad_norm 2.5212 (2.7470) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][380/1251] eta 0:10:58 lr 0.000698 wd 0.0500 time 0.2209 (0.7555) data time 0.0009 (0.0514) model time 0.2200 (0.7040) loss 3.6256 (3.5957) grad_norm 1.8777 (2.6561) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][390/1251] eta 0:08:54 lr 0.000698 wd 0.0500 time 0.2261 (0.6207) data time 0.0009 (0.0386) model time 0.2252 (0.5821) loss 3.4226 (3.5256) grad_norm 1.7750 (2.6062) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][400/1251] eta 0:07:40 lr 0.000698 wd 0.0500 time 0.2306 (0.5407) data time 0.0009 (0.0309) model time 0.2297 (0.5098) loss 3.1872 (3.4921) grad_norm 2.1569 (2.5558) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][410/1251] eta 0:06:50 lr 0.000698 wd 0.0500 time 0.2330 (0.4881) data time 0.0011 (0.0259) model time 0.2319 (0.4622) loss 2.5130 (3.4450) grad_norm 2.5158 (2.4838) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][420/1251] eta 0:06:14 lr 0.000698 wd 0.0500 time 0.2262 (0.4504) data time 0.0009 (0.0223) model time 0.2253 (0.4281) loss 3.2487 (3.4341) grad_norm 2.9559 (2.4806) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][430/1251] eta 0:05:46 lr 0.000698 wd 0.0500 time 0.2257 (0.4226) data time 0.0011 (0.0197) model time 0.2246 (0.4029) loss 3.3027 (3.4079) grad_norm 2.6620 (2.4331) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][440/1251] eta 0:05:25 lr 0.000698 wd 0.0500 time 0.2283 (0.4009) data time 0.0007 (0.0177) model time 0.2276 (0.3833) loss 3.2246 (3.3715) grad_norm 2.1555 (2.3996) loss_scale 4096.0000 (4096.0000) mem 7376MB [2024-08-26 21:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][450/1251] eta 0:05:07 lr 0.000698 wd 0.0500 time 0.2214 (0.3835) data time 0.0013 (0.0160) model time 0.2201 (0.3675) loss 3.3327 (3.3894) grad_norm 1.5273 (inf) loss_scale 2048.0000 (3930.5051) mem 7376MB [2024-08-26 21:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][460/1251] eta 0:04:52 lr 0.000698 wd 0.0500 time 0.2302 (0.3692) data time 0.0009 (0.0146) model time 0.2293 (0.3546) loss 3.8392 
(3.4133) grad_norm 1.7837 (inf) loss_scale 2048.0000 (3757.7982) mem 7376MB [2024-08-26 21:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][470/1251] eta 0:04:39 lr 0.000698 wd 0.0500 time 0.2259 (0.3574) data time 0.0009 (0.0135) model time 0.2250 (0.3439) loss 3.6675 (3.4137) grad_norm 2.3680 (inf) loss_scale 2048.0000 (3614.1176) mem 7376MB [2024-08-26 21:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][480/1251] eta 0:04:28 lr 0.000698 wd 0.0500 time 0.2310 (0.3476) data time 0.0008 (0.0126) model time 0.2302 (0.3351) loss 3.0112 (3.4016) grad_norm 2.1125 (inf) loss_scale 2048.0000 (3492.7132) mem 7376MB [2024-08-26 21:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][490/1251] eta 0:04:18 lr 0.000698 wd 0.0500 time 0.2350 (0.3394) data time 0.0009 (0.0118) model time 0.2340 (0.3276) loss 3.9765 (3.4058) grad_norm 2.1024 (inf) loss_scale 2048.0000 (3388.7770) mem 7376MB [2024-08-26 21:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][500/1251] eta 0:04:09 lr 0.000698 wd 0.0500 time 0.2261 (0.3321) data time 0.0007 (0.0110) model time 0.2254 (0.3210) loss 2.8103 (3.3959) grad_norm 1.6738 (inf) loss_scale 2048.0000 (3298.7919) mem 7376MB [2024-08-26 21:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][510/1251] eta 0:04:01 lr 0.000697 wd 0.0500 time 0.2324 (0.3258) data time 0.0007 (0.0104) model time 0.2317 (0.3154) loss 4.1410 (3.3909) grad_norm 3.6181 (inf) loss_scale 2048.0000 (3220.1258) mem 7376MB [2024-08-26 21:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][520/1251] eta 0:03:54 lr 0.000697 wd 0.0500 time 0.2251 (0.3201) data time 0.0010 (0.0099) model time 0.2241 (0.3102) loss 3.0170 (3.3992) grad_norm 2.1396 (inf) loss_scale 2048.0000 (3150.7692) mem 7376MB [2024-08-26 21:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][530/1251] eta 0:03:47 lr 0.000697 wd 0.0500 time 
0.2274 (0.3150) data time 0.0008 (0.0094) model time 0.2265 (0.3056) loss 3.3792 (3.3849) grad_norm 2.6032 (inf) loss_scale 2048.0000 (3089.1620) mem 7376MB [2024-08-26 21:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][540/1251] eta 0:03:40 lr 0.000697 wd 0.0500 time 0.2277 (0.3105) data time 0.0010 (0.0090) model time 0.2267 (0.3015) loss 4.0281 (3.3804) grad_norm 2.0637 (inf) loss_scale 2048.0000 (3034.0741) mem 7376MB [2024-08-26 21:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][550/1251] eta 0:03:34 lr 0.000697 wd 0.0500 time 0.2298 (0.3064) data time 0.0010 (0.0086) model time 0.2288 (0.2978) loss 2.4678 (3.3684) grad_norm 1.9347 (inf) loss_scale 2048.0000 (2984.5226) mem 7376MB [2024-08-26 21:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][560/1251] eta 0:03:29 lr 0.000697 wd 0.0500 time 0.2269 (0.3027) data time 0.0007 (0.0082) model time 0.2262 (0.2945) loss 3.5907 (3.3674) grad_norm 2.4397 (inf) loss_scale 2048.0000 (2939.7129) mem 7376MB [2024-08-26 21:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][570/1251] eta 0:03:23 lr 0.000697 wd 0.0500 time 0.2296 (0.2994) data time 0.0007 (0.0079) model time 0.2289 (0.2915) loss 4.2757 (3.3638) grad_norm 1.7281 (inf) loss_scale 2048.0000 (2898.9954) mem 7376MB [2024-08-26 21:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][580/1251] eta 0:03:18 lr 0.000697 wd 0.0500 time 0.2317 (0.2963) data time 0.0009 (0.0076) model time 0.2309 (0.2887) loss 2.5991 (3.3622) grad_norm 1.9429 (inf) loss_scale 2048.0000 (2861.8341) mem 7376MB [2024-08-26 21:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][590/1251] eta 0:03:14 lr 0.000697 wd 0.0500 time 0.2296 (0.2936) data time 0.0009 (0.0073) model time 0.2287 (0.2863) loss 2.2545 (3.3519) grad_norm 1.6456 (inf) loss_scale 2048.0000 (2827.7824) mem 7376MB [2024-08-26 21:41:04 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [124/300][600/1251] eta 0:03:09 lr 0.000697 wd 0.0500 time 0.2290 (0.2912) data time 0.0008 (0.0071) model time 0.2282 (0.2841) loss 3.3271 (3.3477) grad_norm 1.5861 (inf) loss_scale 2048.0000 (2796.4659) mem 7376MB [2024-08-26 21:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][610/1251] eta 0:03:05 lr 0.000697 wd 0.0500 time 0.2288 (0.2888) data time 0.0009 (0.0069) model time 0.2278 (0.2820) loss 3.4227 (3.3422) grad_norm 2.5672 (inf) loss_scale 2048.0000 (2767.5676) mem 7376MB [2024-08-26 21:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][620/1251] eta 0:03:00 lr 0.000697 wd 0.0500 time 0.2319 (0.2867) data time 0.0008 (0.0066) model time 0.2311 (0.2801) loss 1.8676 (3.3256) grad_norm 1.9101 (inf) loss_scale 2048.0000 (2740.8178) mem 7376MB [2024-08-26 21:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][630/1251] eta 0:02:56 lr 0.000697 wd 0.0500 time 0.2307 (0.2846) data time 0.0014 (0.0064) model time 0.2293 (0.2782) loss 3.6542 (3.3317) grad_norm 2.0060 (inf) loss_scale 2048.0000 (2715.9857) mem 7376MB [2024-08-26 21:41:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][640/1251] eta 0:02:53 lr 0.000697 wd 0.0500 time 0.2312 (0.2835) data time 0.0010 (0.0063) model time 0.2302 (0.2772) loss 3.1474 (3.3285) grad_norm 2.8145 (inf) loss_scale 2048.0000 (2692.8720) mem 7376MB [2024-08-26 21:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][650/1251] eta 0:02:49 lr 0.000697 wd 0.0500 time 0.2267 (0.2816) data time 0.0014 (0.0061) model time 0.2253 (0.2755) loss 4.0434 (3.3173) grad_norm 2.6586 (inf) loss_scale 2048.0000 (2671.3043) mem 7376MB [2024-08-26 21:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][660/1251] eta 0:02:45 lr 0.000697 wd 0.0500 time 0.2393 (0.2808) data time 0.0011 (0.0059) model time 0.2382 (0.2749) loss 3.5930 (3.3114) grad_norm 2.4715 (inf) 
loss_scale 2048.0000 (2651.1327) mem 7376MB [2024-08-26 21:41:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][670/1251] eta 0:02:42 lr 0.000697 wd 0.0500 time 0.2313 (0.2791) data time 0.0009 (0.0058) model time 0.2304 (0.2734) loss 3.9178 (3.3226) grad_norm 1.9948 (inf) loss_scale 2048.0000 (2632.2257) mem 7376MB [2024-08-26 21:41:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][680/1251] eta 0:02:38 lr 0.000697 wd 0.0500 time 0.2323 (0.2777) data time 0.0011 (0.0056) model time 0.2313 (0.2720) loss 2.3580 (3.3250) grad_norm 2.2248 (inf) loss_scale 2048.0000 (2614.4681) mem 7376MB [2024-08-26 21:41:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][690/1251] eta 0:02:34 lr 0.000697 wd 0.0500 time 0.2259 (0.2763) data time 0.0010 (0.0055) model time 0.2250 (0.2707) loss 2.8434 (3.3220) grad_norm 2.2075 (inf) loss_scale 2048.0000 (2597.7581) mem 7376MB [2024-08-26 21:41:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][700/1251] eta 0:02:31 lr 0.000697 wd 0.0500 time 0.2302 (0.2749) data time 0.0007 (0.0054) model time 0.2295 (0.2695) loss 4.0084 (3.3256) grad_norm 2.0251 (inf) loss_scale 2048.0000 (2582.0057) mem 7376MB [2024-08-26 21:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][710/1251] eta 0:02:28 lr 0.000697 wd 0.0500 time 0.2271 (0.2736) data time 0.0011 (0.0053) model time 0.2260 (0.2683) loss 2.9434 (3.3219) grad_norm 2.1876 (inf) loss_scale 2048.0000 (2567.1309) mem 7376MB [2024-08-26 21:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][720/1251] eta 0:02:24 lr 0.000697 wd 0.0500 time 0.2273 (0.2725) data time 0.0014 (0.0052) model time 0.2259 (0.2673) loss 3.8328 (3.3166) grad_norm 2.2177 (inf) loss_scale 2048.0000 (2553.0623) mem 7376MB [2024-08-26 21:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][730/1251] eta 0:02:21 lr 0.000697 wd 0.0500 time 0.2200 (0.2713) data time 0.0009 
(0.0051) model time 0.2191 (0.2663) loss 3.9972 (3.3179) grad_norm 1.4865 (inf) loss_scale 2048.0000 (2539.7361) mem 7376MB [2024-08-26 21:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][740/1251] eta 0:02:18 lr 0.000697 wd 0.0500 time 0.2371 (0.2703) data time 0.0008 (0.0050) model time 0.2364 (0.2653) loss 3.6644 (3.3125) grad_norm 2.6363 (inf) loss_scale 2048.0000 (2527.0951) mem 7376MB [2024-08-26 21:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][750/1251] eta 0:02:14 lr 0.000696 wd 0.0500 time 0.2251 (0.2693) data time 0.0009 (0.0049) model time 0.2242 (0.2645) loss 4.0009 (3.3149) grad_norm 1.7724 (inf) loss_scale 2048.0000 (2515.0877) mem 7376MB [2024-08-26 21:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][760/1251] eta 0:02:11 lr 0.000696 wd 0.0500 time 0.2219 (0.2683) data time 0.0012 (0.0048) model time 0.2208 (0.2635) loss 3.4605 (3.3185) grad_norm 2.4742 (inf) loss_scale 2048.0000 (2503.6675) mem 7376MB [2024-08-26 21:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][770/1251] eta 0:02:08 lr 0.000696 wd 0.0500 time 0.2328 (0.2674) data time 0.0008 (0.0047) model time 0.2319 (0.2627) loss 2.3275 (3.3175) grad_norm 2.1193 (inf) loss_scale 2048.0000 (2492.7924) mem 7376MB [2024-08-26 21:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][780/1251] eta 0:02:05 lr 0.000696 wd 0.0500 time 0.2276 (0.2664) data time 0.0008 (0.0046) model time 0.2268 (0.2618) loss 3.9853 (3.3241) grad_norm 1.9156 (inf) loss_scale 2048.0000 (2482.4242) mem 7376MB [2024-08-26 21:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][790/1251] eta 0:02:02 lr 0.000696 wd 0.0500 time 0.2328 (0.2656) data time 0.0011 (0.0045) model time 0.2317 (0.2611) loss 3.1770 (3.3275) grad_norm 1.9556 (inf) loss_scale 2048.0000 (2472.5285) mem 7376MB [2024-08-26 21:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[124/300][800/1251] eta 0:01:59 lr 0.000696 wd 0.0500 time 0.2295 (0.2648) data time 0.0015 (0.0044) model time 0.2281 (0.2604) loss 3.4545 (3.3286) grad_norm 1.4933 (inf) loss_scale 2048.0000 (2463.0735) mem 7376MB [2024-08-26 21:41:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][810/1251] eta 0:01:56 lr 0.000696 wd 0.0500 time 0.2277 (0.2640) data time 0.0009 (0.0044) model time 0.2267 (0.2596) loss 3.3537 (3.3242) grad_norm 3.2365 (inf) loss_scale 2048.0000 (2454.0305) mem 7376MB [2024-08-26 21:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][820/1251] eta 0:01:53 lr 0.000696 wd 0.0500 time 0.2303 (0.2632) data time 0.0008 (0.0043) model time 0.2295 (0.2589) loss 2.1427 (3.3169) grad_norm 2.5452 (inf) loss_scale 2048.0000 (2445.3731) mem 7376MB [2024-08-26 21:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][830/1251] eta 0:01:50 lr 0.000696 wd 0.0500 time 0.2228 (0.2625) data time 0.0010 (0.0043) model time 0.2218 (0.2583) loss 3.9151 (3.3156) grad_norm 1.7180 (inf) loss_scale 2048.0000 (2437.0772) mem 7376MB [2024-08-26 21:42:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][840/1251] eta 0:01:47 lr 0.000696 wd 0.0500 time 0.2363 (0.2619) data time 0.0009 (0.0042) model time 0.2354 (0.2577) loss 3.4789 (3.3210) grad_norm 2.2282 (inf) loss_scale 2048.0000 (2429.1207) mem 7376MB [2024-08-26 21:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][850/1251] eta 0:01:44 lr 0.000696 wd 0.0500 time 0.2252 (0.2612) data time 0.0013 (0.0041) model time 0.2238 (0.2571) loss 3.2918 (3.3207) grad_norm 3.2347 (inf) loss_scale 2048.0000 (2421.4830) mem 7376MB [2024-08-26 21:42:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][860/1251] eta 0:01:41 lr 0.000696 wd 0.0500 time 0.2270 (0.2606) data time 0.0008 (0.0041) model time 0.2263 (0.2565) loss 4.0649 (3.3233) grad_norm 1.6844 (inf) loss_scale 2048.0000 (2414.1454) mem 7376MB 
[2024-08-26 21:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][870/1251] eta 0:01:39 lr 0.000696 wd 0.0500 time 0.2308 (0.2600) data time 0.0010 (0.0040) model time 0.2298 (0.2560) loss 2.4400 (3.3252) grad_norm 2.1929 (inf) loss_scale 2048.0000 (2407.0906) mem 7376MB [2024-08-26 21:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][880/1251] eta 0:01:36 lr 0.000696 wd 0.0500 time 0.2297 (0.2594) data time 0.0007 (0.0040) model time 0.2290 (0.2554) loss 3.7451 (3.3185) grad_norm 1.4918 (inf) loss_scale 2048.0000 (2400.3025) mem 7376MB [2024-08-26 21:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][890/1251] eta 0:01:33 lr 0.000696 wd 0.0500 time 0.2310 (0.2588) data time 0.0017 (0.0039) model time 0.2293 (0.2549) loss 3.6268 (3.3174) grad_norm 2.0114 (inf) loss_scale 2048.0000 (2393.7662) mem 7376MB [2024-08-26 21:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][900/1251] eta 0:01:30 lr 0.000696 wd 0.0500 time 0.2298 (0.2583) data time 0.0007 (0.0039) model time 0.2291 (0.2544) loss 4.0081 (3.3188) grad_norm 1.9847 (inf) loss_scale 2048.0000 (2387.4681) mem 7376MB [2024-08-26 21:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][910/1251] eta 0:01:27 lr 0.000696 wd 0.0500 time 0.2276 (0.2577) data time 0.0013 (0.0038) model time 0.2263 (0.2539) loss 4.1121 (3.3236) grad_norm 2.4875 (inf) loss_scale 2048.0000 (2381.3953) mem 7376MB [2024-08-26 21:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][920/1251] eta 0:01:25 lr 0.000696 wd 0.0500 time 0.2262 (0.2572) data time 0.0010 (0.0038) model time 0.2253 (0.2534) loss 3.2980 (3.3274) grad_norm 1.9033 (inf) loss_scale 2048.0000 (2375.5360) mem 7376MB [2024-08-26 21:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][930/1251] eta 0:01:22 lr 0.000696 wd 0.0500 time 0.2254 (0.2568) data time 0.0009 (0.0037) model time 0.2245 (0.2530) loss 
4.0801 (3.3292) grad_norm 2.0855 (inf) loss_scale 2048.0000 (2369.8791) mem 7376MB [2024-08-26 21:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][940/1251] eta 0:01:19 lr 0.000696 wd 0.0500 time 0.2350 (0.2563) data time 0.0009 (0.0037) model time 0.2341 (0.2526) loss 2.6110 (3.3293) grad_norm 1.6883 (inf) loss_scale 2048.0000 (2364.4143) mem 7376MB [2024-08-26 21:42:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][950/1251] eta 0:01:16 lr 0.000696 wd 0.0500 time 0.2228 (0.2558) data time 0.0010 (0.0037) model time 0.2218 (0.2521) loss 2.8952 (3.3282) grad_norm 2.0738 (inf) loss_scale 2048.0000 (2359.1319) mem 7376MB [2024-08-26 21:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][960/1251] eta 0:01:14 lr 0.000696 wd 0.0500 time 0.2301 (0.2554) data time 0.0007 (0.0036) model time 0.2294 (0.2518) loss 4.0617 (3.3287) grad_norm 1.8601 (inf) loss_scale 2048.0000 (2354.0230) mem 7376MB [2024-08-26 21:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][970/1251] eta 0:01:11 lr 0.000696 wd 0.0500 time 0.2314 (0.2550) data time 0.0007 (0.0036) model time 0.2308 (0.2514) loss 3.7321 (3.3281) grad_norm 2.1350 (inf) loss_scale 2048.0000 (2349.0792) mem 7376MB [2024-08-26 21:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][980/1251] eta 0:01:08 lr 0.000696 wd 0.0500 time 0.2208 (0.2545) data time 0.0010 (0.0035) model time 0.2198 (0.2510) loss 2.9281 (3.3298) grad_norm 3.2308 (inf) loss_scale 2048.0000 (2344.2925) mem 7376MB [2024-08-26 21:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][990/1251] eta 0:01:06 lr 0.000695 wd 0.0500 time 0.2292 (0.2541) data time 0.0009 (0.0035) model time 0.2283 (0.2506) loss 4.1009 (3.3327) grad_norm 2.2587 (inf) loss_scale 2048.0000 (2339.6557) mem 7376MB [2024-08-26 21:42:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1000/1251] eta 0:01:03 lr 0.000695 wd 
0.0500 time 0.2323 (0.2538) data time 0.0010 (0.0035) model time 0.2314 (0.2503) loss 3.0088 (3.3279) grad_norm 2.4042 (inf) loss_scale 2048.0000 (2335.1618) mem 7376MB [2024-08-26 21:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1010/1251] eta 0:01:01 lr 0.000695 wd 0.0500 time 0.2248 (0.2534) data time 0.0009 (0.0035) model time 0.2239 (0.2500) loss 3.5540 (3.3266) grad_norm 2.0245 (inf) loss_scale 2048.0000 (2330.8042) mem 7376MB [2024-08-26 21:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1020/1251] eta 0:00:58 lr 0.000695 wd 0.0500 time 0.2281 (0.2531) data time 0.0011 (0.0034) model time 0.2270 (0.2497) loss 3.7275 (3.3267) grad_norm 2.2181 (inf) loss_scale 2048.0000 (2326.5770) mem 7376MB [2024-08-26 21:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1030/1251] eta 0:00:55 lr 0.000695 wd 0.0500 time 0.2269 (0.2527) data time 0.0008 (0.0034) model time 0.2261 (0.2493) loss 3.3105 (3.3289) grad_norm 1.7243 (inf) loss_scale 2048.0000 (2322.4742) mem 7376MB [2024-08-26 21:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1040/1251] eta 0:00:53 lr 0.000695 wd 0.0500 time 0.2305 (0.2524) data time 0.0013 (0.0034) model time 0.2292 (0.2490) loss 2.4316 (3.3289) grad_norm 2.6239 (inf) loss_scale 2048.0000 (2318.4906) mem 7376MB [2024-08-26 21:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1050/1251] eta 0:00:50 lr 0.000695 wd 0.0500 time 0.2310 (0.2521) data time 0.0014 (0.0033) model time 0.2296 (0.2487) loss 2.5635 (3.3245) grad_norm 2.2268 (inf) loss_scale 2048.0000 (2314.6209) mem 7376MB [2024-08-26 21:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1060/1251] eta 0:00:48 lr 0.000695 wd 0.0500 time 0.2213 (0.2518) data time 0.0010 (0.0033) model time 0.2203 (0.2485) loss 2.3851 (3.3242) grad_norm 2.2169 (inf) loss_scale 2048.0000 (2310.8604) mem 7376MB [2024-08-26 21:42:53 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [124/300][1070/1251] eta 0:00:45 lr 0.000695 wd 0.0500 time 0.2335 (0.2515) data time 0.0008 (0.0033) model time 0.2327 (0.2482) loss 3.3761 (3.3204) grad_norm 1.7181 (inf) loss_scale 2048.0000 (2307.2045) mem 7376MB [2024-08-26 21:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1080/1251] eta 0:00:42 lr 0.000695 wd 0.0500 time 0.2245 (0.2512) data time 0.0007 (0.0033) model time 0.2238 (0.2479) loss 4.0598 (3.3205) grad_norm 1.8023 (inf) loss_scale 2048.0000 (2303.6488) mem 7376MB [2024-08-26 21:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1090/1251] eta 0:00:40 lr 0.000695 wd 0.0500 time 0.2237 (0.2509) data time 0.0010 (0.0032) model time 0.2228 (0.2476) loss 3.1890 (3.3241) grad_norm 2.5469 (inf) loss_scale 2048.0000 (2300.1894) mem 7376MB [2024-08-26 21:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1100/1251] eta 0:00:37 lr 0.000695 wd 0.0500 time 0.2280 (0.2506) data time 0.0013 (0.0032) model time 0.2266 (0.2474) loss 3.6401 (3.3238) grad_norm 2.2680 (inf) loss_scale 2048.0000 (2296.8224) mem 7376MB [2024-08-26 21:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1110/1251] eta 0:00:35 lr 0.000695 wd 0.0500 time 0.2245 (0.2503) data time 0.0009 (0.0032) model time 0.2236 (0.2472) loss 3.5120 (3.3246) grad_norm 2.5402 (inf) loss_scale 2048.0000 (2293.5441) mem 7376MB [2024-08-26 21:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1120/1251] eta 0:00:32 lr 0.000695 wd 0.0500 time 0.2240 (0.2500) data time 0.0009 (0.0031) model time 0.2231 (0.2469) loss 3.1765 (3.3245) grad_norm 9.2004 (inf) loss_scale 2048.0000 (2290.3511) mem 7376MB [2024-08-26 21:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1130/1251] eta 0:00:30 lr 0.000695 wd 0.0500 time 0.2324 (0.2498) data time 0.0011 (0.0031) model time 0.2313 (0.2467) loss 2.5755 (3.3250) grad_norm 2.2613 (inf) 
loss_scale 2048.0000 (2287.2401) mem 7376MB [2024-08-26 21:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1140/1251] eta 0:00:27 lr 0.000695 wd 0.0500 time 0.2330 (0.2496) data time 0.0014 (0.0031) model time 0.2316 (0.2464) loss 3.2780 (3.3273) grad_norm 2.1815 (inf) loss_scale 2048.0000 (2284.2079) mem 7376MB [2024-08-26 21:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1150/1251] eta 0:00:25 lr 0.000695 wd 0.0500 time 0.2266 (0.2493) data time 0.0010 (0.0031) model time 0.2256 (0.2462) loss 3.6385 (3.3271) grad_norm 2.6473 (inf) loss_scale 2048.0000 (2281.2516) mem 7376MB [2024-08-26 21:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1160/1251] eta 0:00:22 lr 0.000695 wd 0.0500 time 0.2214 (0.2490) data time 0.0011 (0.0031) model time 0.2203 (0.2460) loss 2.2195 (3.3223) grad_norm 2.8194 (inf) loss_scale 2048.0000 (2278.3684) mem 7376MB [2024-08-26 21:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1170/1251] eta 0:00:20 lr 0.000695 wd 0.0500 time 0.2341 (0.2491) data time 0.0014 (0.0030) model time 0.2327 (0.2460) loss 3.7246 (3.3228) grad_norm 2.2030 (inf) loss_scale 2048.0000 (2275.5556) mem 7376MB [2024-08-26 21:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1180/1251] eta 0:00:17 lr 0.000695 wd 0.0500 time 0.2232 (0.2490) data time 0.0008 (0.0030) model time 0.2224 (0.2460) loss 3.4132 (3.3192) grad_norm 1.8106 (inf) loss_scale 2048.0000 (2272.8106) mem 7376MB [2024-08-26 21:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1190/1251] eta 0:00:15 lr 0.000695 wd 0.0500 time 0.2258 (0.2488) data time 0.0012 (0.0030) model time 0.2247 (0.2458) loss 3.8884 (3.3196) grad_norm 1.7071 (inf) loss_scale 2048.0000 (2270.1311) mem 7376MB [2024-08-26 21:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1200/1251] eta 0:00:12 lr 0.000695 wd 0.0500 time 0.2254 (0.2486) data time 
0.0015 (0.0030) model time 0.2239 (0.2456) loss 3.3037 (3.3163) grad_norm 3.7853 (inf) loss_scale 2048.0000 (2267.5147) mem 7376MB [2024-08-26 21:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1210/1251] eta 0:00:10 lr 0.000695 wd 0.0500 time 0.2377 (0.2484) data time 0.0008 (0.0030) model time 0.2370 (0.2454) loss 3.7919 (3.3181) grad_norm 1.9829 (inf) loss_scale 2048.0000 (2264.9593) mem 7376MB [2024-08-26 21:43:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1220/1251] eta 0:00:07 lr 0.000695 wd 0.0500 time 0.2291 (0.2482) data time 0.0009 (0.0029) model time 0.2282 (0.2452) loss 3.3321 (3.3177) grad_norm 1.8182 (inf) loss_scale 2048.0000 (2262.4626) mem 7376MB [2024-08-26 21:43:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1230/1251] eta 0:00:05 lr 0.000695 wd 0.0500 time 0.2285 (0.2480) data time 0.0007 (0.0029) model time 0.2279 (0.2450) loss 3.1285 (3.3157) grad_norm 2.6101 (inf) loss_scale 2048.0000 (2260.0228) mem 7376MB [2024-08-26 21:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1240/1251] eta 0:00:02 lr 0.000694 wd 0.0500 time 0.2120 (0.2476) data time 0.0004 (0.0029) model time 0.2116 (0.2447) loss 3.2697 (3.3147) grad_norm 2.5946 (inf) loss_scale 2048.0000 (2257.6378) mem 7376MB [2024-08-26 21:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [124/300][1250/1251] eta 0:00:00 lr 0.000694 wd 0.0500 time 0.2128 (0.2473) data time 0.0007 (0.0029) model time 0.2121 (0.2444) loss 2.1696 (3.3117) grad_norm 1.7202 (inf) loss_scale 2048.0000 (2255.3059) mem 7376MB [2024-08-26 21:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 124 training takes 0:03:42 [2024-08-26 21:43:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-26 21:43:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 21:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.465 (0.465) Loss 0.4844 (0.4844) Acc@1 90.430 (90.430) Acc@5 98.242 (98.242) Mem 7376MB [2024-08-26 21:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.114) Loss 0.7861 (0.7451) Acc@1 83.008 (82.999) Acc@5 96.191 (96.573) Mem 7376MB [2024-08-26 21:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.099) Loss 1.1094 (0.7707) Acc@1 73.145 (82.073) Acc@5 93.164 (96.549) Mem 7376MB [2024-08-26 21:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.094) Loss 1.3711 (0.8807) Acc@1 66.895 (79.782) Acc@5 89.453 (95.281) Mem 7376MB [2024-08-26 21:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.087) Loss 1.2402 (0.9405) Acc@1 70.605 (78.335) Acc@5 91.406 (94.588) Mem 7376MB [2024-08-26 21:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 77.992 Acc@5 94.488 [2024-08-26 21:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.0% [2024-08-26 21:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.029 (1.029) Loss 0.4204 (0.4204) Acc@1 92.090 (92.090) Acc@5 98.438 (98.438) Mem 7376MB [2024-08-26 21:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.087 (0.170) Loss 0.6699 (0.6624) Acc@1 86.816 (85.769) Acc@5 96.973 (97.230) Mem 7376MB [2024-08-26 21:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.130) Loss 0.9404 (0.6847) Acc@1 78.027 (84.835) Acc@5 95.020 (97.247) Mem 7376MB [2024-08-26 21:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.113) Loss 1.2012 (0.7790) Acc@1 69.141 (82.482) Acc@5 
91.699 (96.185) Mem 7376MB [2024-08-26 21:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.101) Loss 1.0898 (0.8287) Acc@1 72.754 (81.105) Acc@5 92.871 (95.629) Mem 7376MB [2024-08-26 21:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.714 Acc@5 95.598 [2024-08-26 21:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-26 21:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.71% [2024-08-26 21:43:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 21:43:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-26 21:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][0/1251] eta 0:15:16 lr 0.000694 wd 0.0500 time 0.7327 (0.7327) data time 0.4356 (0.4356) model time 0.0000 (0.0000) loss 3.9063 (3.9063) grad_norm 2.4580 (2.4580) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-26 21:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][10/1251] eta 0:05:43 lr 0.000694 wd 0.0500 time 0.2289 (0.2769) data time 0.0007 (0.0406) model time 0.0000 (0.0000) loss 4.1137 (3.4680) grad_norm 2.3362 (2.2083) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-26 21:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 21:43:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 21:43:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 21:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 21:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 21:54:40 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 21:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 21:54:50 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 22:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 22:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 22:11:00 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 22:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 22:11:09 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 22:11:11 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 22:11:12 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 22:11:12 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 125) [2024-08-26 22:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 22:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][20/1251] eta 4:41:31 lr 0.000694 wd 0.0500 time 13.7215 (13.7215) data time 0.9045 (0.9045) model time 0.0000 (0.0000) loss 4.2184 (4.2184) grad_norm 1.9337 (1.9337) loss_scale 2048.0000 (2048.0000) mem 20033MB [2024-08-26 22:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][30/1251] eta 0:30:09 lr 0.000694 wd 0.0500 time 0.2466 (1.4816) data time 0.0010 (0.0835) model time 0.0000 (0.0000) loss 2.4977 (3.6558) grad_norm 2.0709 (2.1178) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][40/1251] eta 0:17:59 lr 0.000694 wd 0.0500 time 0.2407 (0.8913) data time 0.0010 (0.0443) model time 0.0000 (0.0000) loss 3.3109 (3.5479) grad_norm 1.6652 (2.2018) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][50/1251] eta 0:13:38 lr 0.000694 wd 0.0500 time 0.2409 (0.6817) data time 0.0007 (0.0305) model time 0.0000 (0.0000) loss 2.2804 (3.6154) grad_norm 2.7202 (2.3220) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][60/1251] eta 0:11:23 lr 0.000694 wd 0.0500 time 0.2394 (0.5740) data time 0.0012 (0.0233) model time 0.2381 (0.2392) loss 3.1556 (3.5437) grad_norm 3.2289 (2.3872) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO 
Suspend command received, saving checkpoint and exiting [2024-08-26 22:11:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 22:11:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 22:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 22:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 22:13:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 22:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 22:13:54 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 22:13:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 22:13:57 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 22:13:57 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 125) [2024-08-26 22:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 22:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][70/1251] eta 0:48:40 lr 0.000694 wd 0.0500 time 0.2278 (2.4726) data time 0.0008 (0.1449) model time 0.2270 (2.3277) loss 4.2993 (3.7439) grad_norm 1.5419 (2.0724) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][80/1251] eta 0:20:51 lr 0.000694 wd 0.0500 time 0.2196 (1.0684) data time 0.0009 (0.0551) model time 0.2186 (1.0133) loss 3.1690 (3.5567) grad_norm 2.0651 (2.0089) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][90/1251] eta 0:14:25 lr 0.000694 wd 0.0500 time 0.2193 (0.7451) data time 0.0012 (0.0343) model time 0.2182 (0.7107) loss 3.2409 (3.5254) grad_norm 2.8115 (2.2108) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][100/1251] eta 0:11:39 lr 0.000694 wd 0.0500 time 0.3192 (0.6077) data time 0.0008 (0.0251) model time 0.3184 (0.5827) loss 3.4026 (3.5709) grad_norm 2.1921 (2.1806) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][110/1251] eta 0:10:02 lr 0.000694 wd 0.0500 time 0.2277 (0.5282) data time 0.0009 (0.0199) model time 0.2267 (0.5083) loss 3.0744 (3.4971) grad_norm 1.8083 (2.1624) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[125/300][120/1251] eta 0:08:59 lr 0.000694 wd 0.0500 time 0.2318 (0.4771) data time 0.0010 (0.0165) model time 0.2308 (0.4606) loss 3.8832 (3.5021) grad_norm 3.9802 (2.3043) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][130/1251] eta 0:08:12 lr 0.000694 wd 0.0500 time 0.2253 (0.4395) data time 0.0009 (0.0142) model time 0.2245 (0.4253) loss 3.1214 (3.4832) grad_norm 1.8692 (2.2940) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][140/1251] eta 0:07:38 lr 0.000694 wd 0.0500 time 0.2917 (0.4130) data time 0.0010 (0.0125) model time 0.2907 (0.4005) loss 3.3186 (3.4452) grad_norm 1.8308 (2.2363) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][150/1251] eta 0:07:14 lr 0.000694 wd 0.0500 time 0.2323 (0.3943) data time 0.0010 (0.0112) model time 0.2313 (0.3831) loss 2.3196 (3.4076) grad_norm 2.3824 (2.2577) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][160/1251] eta 0:06:51 lr 0.000694 wd 0.0500 time 0.2334 (0.3771) data time 0.0007 (0.0101) model time 0.2327 (0.3670) loss 3.0904 (3.4114) grad_norm 2.6749 (2.2458) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][170/1251] eta 0:06:33 lr 0.000694 wd 0.0500 time 0.3067 (0.3643) data time 0.0015 (0.0093) model time 0.3052 (0.3550) loss 3.6360 (3.4251) grad_norm 1.9987 (2.2650) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][180/1251] eta 0:06:18 lr 0.000694 wd 0.0500 time 0.2267 (0.3531) data time 0.0007 (0.0086) model time 0.2260 (0.3445) loss 4.1909 (3.4170) grad_norm 1.6240 (2.2682) loss_scale 2048.0000 
(2048.0000) mem 7377MB [2024-08-26 22:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][190/1251] eta 0:06:06 lr 0.000694 wd 0.0500 time 0.3002 (0.3455) data time 0.0008 (0.0080) model time 0.2994 (0.3375) loss 2.0807 (3.4004) grad_norm 1.9207 (2.2781) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][200/1251] eta 0:05:53 lr 0.000694 wd 0.0500 time 0.2366 (0.3368) data time 0.0009 (0.0075) model time 0.2356 (0.3293) loss 2.7916 (3.3982) grad_norm 2.6574 (2.2758) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][210/1251] eta 0:05:42 lr 0.000694 wd 0.0500 time 0.2286 (0.3294) data time 0.0007 (0.0070) model time 0.2279 (0.3224) loss 2.7313 (3.3822) grad_norm 2.1843 (2.2715) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][220/1251] eta 0:05:32 lr 0.000694 wd 0.0500 time 0.2247 (0.3228) data time 0.0009 (0.0066) model time 0.2238 (0.3162) loss 3.2063 (3.3770) grad_norm 2.0898 (2.2720) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][230/1251] eta 0:05:26 lr 0.000693 wd 0.0500 time 0.2214 (0.3196) data time 0.0009 (0.0063) model time 0.2205 (0.3133) loss 3.6007 (3.3746) grad_norm 2.0230 (2.2702) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][240/1251] eta 0:05:17 lr 0.000693 wd 0.0500 time 0.2334 (0.3144) data time 0.0007 (0.0060) model time 0.2327 (0.3084) loss 3.2711 (3.3582) grad_norm 2.2727 (2.2730) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][250/1251] eta 0:05:11 lr 0.000693 wd 0.0500 time 0.2258 (0.3111) data time 0.0013 
(0.0057) model time 0.2245 (0.3053) loss 2.7472 (3.3459) grad_norm 1.6407 (2.2627) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][260/1251] eta 0:05:04 lr 0.000693 wd 0.0500 time 0.2271 (0.3069) data time 0.0010 (0.0055) model time 0.2261 (0.3014) loss 2.9016 (3.3368) grad_norm 2.1029 (2.2662) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][270/1251] eta 0:04:58 lr 0.000693 wd 0.0500 time 0.2587 (0.3039) data time 0.0007 (0.0053) model time 0.2580 (0.2986) loss 2.4504 (3.3280) grad_norm 1.5945 (2.2584) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][280/1251] eta 0:04:52 lr 0.000693 wd 0.0500 time 0.2263 (0.3010) data time 0.0007 (0.0051) model time 0.2256 (0.2960) loss 2.4272 (3.3173) grad_norm 1.9482 (2.2645) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][290/1251] eta 0:04:46 lr 0.000693 wd 0.0500 time 0.2295 (0.2979) data time 0.0009 (0.0049) model time 0.2286 (0.2930) loss 3.0320 (3.3206) grad_norm 3.0689 (2.2911) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][300/1251] eta 0:04:40 lr 0.000693 wd 0.0500 time 0.2841 (0.2951) data time 0.0009 (0.0048) model time 0.2832 (0.2904) loss 4.0808 (3.3176) grad_norm 1.8755 (2.2827) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][310/1251] eta 0:04:36 lr 0.000693 wd 0.0500 time 0.2291 (0.2938) data time 0.0008 (0.0046) model time 0.2283 (0.2892) loss 2.4643 (3.3140) grad_norm 1.4924 (2.2793) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [125/300][320/1251] eta 0:04:31 lr 0.000693 wd 0.0500 time 0.2300 (0.2919) data time 0.0010 (0.0045) model time 0.2291 (0.2874) loss 2.4772 (3.3079) grad_norm 2.5728 (2.2854) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][330/1251] eta 0:04:26 lr 0.000693 wd 0.0500 time 0.2327 (0.2895) data time 0.0009 (0.0043) model time 0.2318 (0.2852) loss 2.9994 (3.3031) grad_norm 2.0245 (2.2718) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 22:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 22:15:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 22:15:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 22:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 22:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 22:22:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 22:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 22:22:48 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 22:22:49 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 22:22:50 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 22:22:50 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 125) [2024-08-26 22:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 22:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][340/1251] eta 0:23:59 lr 0.000693 wd 0.0500 time 0.2258 (1.5797) data time 0.0007 (0.0747) model time 0.2251 (1.5050) loss 3.9466 (3.9717) grad_norm 3.0435 (2.7090) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][350/1251] eta 0:12:59 lr 0.000693 wd 0.0500 time 0.2195 (0.8655) data time 0.0009 (0.0359) model time 0.2186 (0.8296) loss 3.8031 (3.6939) grad_norm 1.9584 (2.4710) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][360/1251] eta 0:09:34 lr 0.000693 wd 0.0500 time 0.2230 (0.6450) data time 0.0007 (0.0238) model time 0.2223 (0.6211) loss 3.7206 (3.7041) grad_norm 2.4653 (2.4158) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][370/1251] eta 0:07:53 lr 0.000693 wd 0.0500 time 0.2284 (0.5372) data time 0.0009 (0.0180) model time 0.2275 (0.5192) loss 3.5076 (3.6256) grad_norm 1.5671 (2.3577) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][380/1251] eta 0:06:52 lr 0.000693 wd 0.0500 time 0.2275 (0.4738) data time 0.0008 (0.0145) model time 0.2267 (0.4593) loss 3.3242 (3.5714) grad_norm 1.9363 (2.2799) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [125/300][390/1251] eta 0:06:11 lr 0.000693 wd 0.0500 time 0.2282 (0.4316) data time 0.0007 (0.0122) model time 0.2275 (0.4194) loss 2.9990 (3.5299) grad_norm 3.3435 (2.2912) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][400/1251] eta 0:05:41 lr 0.000693 wd 0.0500 time 0.2209 (0.4016) data time 0.0009 (0.0105) model time 0.2201 (0.3911) loss 3.4557 (3.5051) grad_norm 2.1167 (2.3270) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][410/1251] eta 0:05:19 lr 0.000693 wd 0.0500 time 0.2248 (0.3794) data time 0.0010 (0.0093) model time 0.2238 (0.3701) loss 3.0862 (3.4645) grad_norm 2.2354 (2.3775) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][420/1251] eta 0:05:00 lr 0.000693 wd 0.0500 time 0.2210 (0.3621) data time 0.0008 (0.0084) model time 0.2202 (0.3538) loss 3.4535 (3.4308) grad_norm 2.2535 (2.3503) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][430/1251] eta 0:04:45 lr 0.000693 wd 0.0500 time 0.2240 (0.3482) data time 0.0008 (0.0076) model time 0.2231 (0.3406) loss 3.4478 (3.4419) grad_norm 2.1319 (2.3575) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][440/1251] eta 0:04:33 lr 0.000693 wd 0.0500 time 0.2214 (0.3367) data time 0.0006 (0.0070) model time 0.2208 (0.3297) loss 4.0377 (3.4617) grad_norm 2.7286 (2.3462) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][450/1251] eta 0:04:22 lr 0.000693 wd 0.0500 time 0.2227 (0.3273) data time 0.0010 (0.0065) model time 0.2217 (0.3208) loss 3.7594 (3.4543) grad_norm 2.0664 (2.3144) loss_scale 
2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][460/1251] eta 0:04:12 lr 0.000693 wd 0.0500 time 0.2248 (0.3193) data time 0.0007 (0.0061) model time 0.2241 (0.3133) loss 2.9332 (3.4290) grad_norm 2.1924 (2.2931) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][470/1251] eta 0:04:04 lr 0.000692 wd 0.0500 time 0.2328 (0.3126) data time 0.0007 (0.0057) model time 0.2321 (0.3069) loss 3.9923 (3.4231) grad_norm 2.0821 (2.2946) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][480/1251] eta 0:03:56 lr 0.000692 wd 0.0500 time 0.2414 (0.3070) data time 0.0013 (0.0054) model time 0.2401 (0.3016) loss 3.1683 (3.4076) grad_norm 2.4773 (2.2802) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][490/1251] eta 0:03:49 lr 0.000692 wd 0.0500 time 0.2245 (0.3019) data time 0.0009 (0.0051) model time 0.2236 (0.2968) loss 3.9352 (3.4048) grad_norm 2.1456 (2.2649) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][500/1251] eta 0:03:43 lr 0.000692 wd 0.0500 time 0.2312 (0.2974) data time 0.0009 (0.0049) model time 0.2303 (0.2925) loss 3.3978 (3.4044) grad_norm 1.7970 (2.2521) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][510/1251] eta 0:03:37 lr 0.000692 wd 0.0500 time 0.2248 (0.2934) data time 0.0007 (0.0047) model time 0.2241 (0.2886) loss 3.1522 (3.3869) grad_norm 1.8515 (2.2617) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][520/1251] eta 0:03:31 lr 0.000692 wd 0.0500 time 0.2224 (0.2897) data time 
0.0007 (0.0045) model time 0.2218 (0.2852) loss 3.9819 (3.3849) grad_norm 1.9845 (2.2485) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][530/1251] eta 0:03:26 lr 0.000692 wd 0.0500 time 0.2248 (0.2865) data time 0.0009 (0.0044) model time 0.2239 (0.2821) loss 2.5586 (3.3705) grad_norm 1.8972 (2.2380) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][540/1251] eta 0:03:21 lr 0.000692 wd 0.0500 time 0.2306 (0.2835) data time 0.0007 (0.0042) model time 0.2299 (0.2793) loss 3.3381 (3.3646) grad_norm 1.9666 (2.2236) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][550/1251] eta 0:03:16 lr 0.000692 wd 0.0500 time 0.2276 (0.2810) data time 0.0009 (0.0040) model time 0.2267 (0.2769) loss 3.7818 (3.3572) grad_norm 2.1719 (2.2216) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][560/1251] eta 0:03:12 lr 0.000692 wd 0.0500 time 0.2277 (0.2785) data time 0.0006 (0.0039) model time 0.2271 (0.2746) loss 2.7302 (3.3558) grad_norm 2.2749 (2.2229) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][570/1251] eta 0:03:08 lr 0.000692 wd 0.0500 time 0.2304 (0.2764) data time 0.0007 (0.0038) model time 0.2297 (0.2726) loss 2.6350 (3.3488) grad_norm 1.9888 (2.2177) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][580/1251] eta 0:03:04 lr 0.000692 wd 0.0500 time 0.2410 (0.2744) data time 0.0008 (0.0037) model time 0.2402 (0.2707) loss 3.6709 (3.3442) grad_norm 3.2682 (2.2107) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [125/300][590/1251] eta 0:03:00 lr 0.000692 wd 0.0500 time 0.2263 (0.2724) data time 0.0009 (0.0036) model time 0.2254 (0.2689) loss 3.7140 (3.3367) grad_norm 1.9726 (2.2182) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-26 22:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 22:24:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 22:24:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 22:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 22:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 22:26:07 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 22:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 22:26:15 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-26 22:26:17 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 22:26:18 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 22:26:18 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 125) [2024-08-26 22:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 22:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][600/1251] eta 0:22:30 lr 0.000692 wd 0.0500 time 0.2322 (2.0739) data time 0.0009 (0.1166) model time 0.2312 (1.9573) loss 3.3909 (3.7599) grad_norm 2.1874 (2.1266) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][610/1251] eta 0:11:46 lr 0.000692 wd 0.0500 time 0.2239 (1.1017) data time 0.0012 (0.0558) model time 0.2227 (1.0459) loss 2.9553 (3.5025) grad_norm 2.0690 (2.2529) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][620/1251] eta 0:08:30 lr 0.000692 wd 0.0500 time 0.2599 (0.8085) data time 0.0011 (0.0370) model time 0.2587 (0.7715) loss 4.0313 (3.5513) grad_norm 1.8514 (2.1594) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][630/1251] eta 0:06:51 lr 0.000692 wd 0.0500 time 0.2253 (0.6632) data time 0.0009 (0.0278) model time 0.2244 (0.6354) loss 3.4298 (3.4717) grad_norm 3.3823 (2.1692) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][640/1251] eta 0:05:50 lr 0.000692 wd 0.0500 time 0.2275 (0.5742) data time 0.0009 (0.0223) model time 0.2267 (0.5519) loss 3.3606 (3.4478) grad_norm 1.9659 (2.1675) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [125/300][650/1251] eta 0:05:10 lr 0.000692 wd 0.0500 time 0.2250 (0.5174) data time 0.0008 (0.0187) model time 0.2242 (0.4987) loss 2.5493 (3.4505) grad_norm 1.7072 (2.1170) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][660/1251] eta 0:04:41 lr 0.000692 wd 0.0500 time 0.2630 (0.4757) data time 0.0010 (0.0161) model time 0.2620 (0.4596) loss 3.3901 (3.4245) grad_norm 2.0636 (2.1249) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][670/1251] eta 0:04:21 lr 0.000692 wd 0.0500 time 0.2282 (0.4496) data time 0.0009 (0.0143) model time 0.2273 (0.4354) loss 3.4332 (3.3932) grad_norm 1.9300 (2.1429) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][680/1251] eta 0:04:02 lr 0.000692 wd 0.0500 time 0.2290 (0.4251) data time 0.0007 (0.0128) model time 0.2283 (0.4124) loss 3.3475 (3.3550) grad_norm 2.0247 (2.1381) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][690/1251] eta 0:03:47 lr 0.000692 wd 0.0500 time 0.2243 (0.4054) data time 0.0010 (0.0116) model time 0.2234 (0.3939) loss 3.7003 (3.3796) grad_norm 1.8351 (2.1483) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][700/1251] eta 0:03:35 lr 0.000692 wd 0.0500 time 0.3216 (0.3904) data time 0.0012 (0.0106) model time 0.3204 (0.3798) loss 4.3242 (3.4077) grad_norm 2.1607 (2.2009) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][710/1251] eta 0:03:24 lr 0.000692 wd 0.0500 time 0.2243 (0.3787) data time 0.0011 (0.0099) model time 0.2232 (0.3688) loss 3.6549 (3.4094) grad_norm 3.0323 (2.2692) loss_scale 
2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][720/1251] eta 0:03:15 lr 0.000691 wd 0.0500 time 0.2179 (0.3685) data time 0.0007 (0.0092) model time 0.2172 (0.3593) loss 3.3515 (3.3930) grad_norm 3.0122 (2.3463) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][730/1251] eta 0:03:06 lr 0.000691 wd 0.0500 time 0.2317 (0.3584) data time 0.0007 (0.0086) model time 0.2310 (0.3499) loss 3.9998 (3.3922) grad_norm 4.5390 (2.3442) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][740/1251] eta 0:02:58 lr 0.000691 wd 0.0500 time 0.2359 (0.3501) data time 0.0007 (0.0081) model time 0.2352 (0.3420) loss 3.3372 (3.3801) grad_norm 3.2119 (2.3372) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][750/1251] eta 0:02:52 lr 0.000691 wd 0.0500 time 0.2279 (0.3443) data time 0.0006 (0.0076) model time 0.2273 (0.3367) loss 3.8055 (3.3779) grad_norm 2.3709 (2.3397) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][760/1251] eta 0:02:45 lr 0.000691 wd 0.0500 time 0.2283 (0.3375) data time 0.0013 (0.0073) model time 0.2270 (0.3303) loss 3.4933 (3.3881) grad_norm 2.2185 (2.3348) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][770/1251] eta 0:02:39 lr 0.000691 wd 0.0500 time 0.2405 (0.3316) data time 0.0010 (0.0069) model time 0.2395 (0.3247) loss 3.3413 (3.3686) grad_norm 1.8349 (2.3035) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][780/1251] eta 0:02:34 lr 0.000691 wd 0.0500 time 0.2269 (0.3272) data time 
0.0009 (0.0066) model time 0.2260 (0.3206) loss 4.3487 (3.3687) grad_norm 1.8584 (2.2939) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][790/1251] eta 0:02:28 lr 0.000691 wd 0.0500 time 0.2268 (0.3230) data time 0.0010 (0.0063) model time 0.2258 (0.3167) loss 2.8188 (3.3576) grad_norm 2.3356 (2.3038) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][800/1251] eta 0:02:24 lr 0.000691 wd 0.0500 time 0.2253 (0.3197) data time 0.0008 (0.0061) model time 0.2245 (0.3136) loss 3.4935 (3.3460) grad_norm 2.3329 (2.3023) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][810/1251] eta 0:02:19 lr 0.000691 wd 0.0500 time 0.2329 (0.3155) data time 0.0007 (0.0058) model time 0.2322 (0.3097) loss 3.8956 (3.3400) grad_norm 3.4854 (2.2966) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][820/1251] eta 0:02:14 lr 0.000691 wd 0.0500 time 0.2276 (0.3117) data time 0.0008 (0.0056) model time 0.2268 (0.3060) loss 2.7074 (3.3422) grad_norm 2.1532 (2.2987) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][830/1251] eta 0:02:10 lr 0.000691 wd 0.0500 time 0.3022 (0.3091) data time 0.0008 (0.0055) model time 0.3014 (0.3037) loss 2.5052 (3.3352) grad_norm 2.2653 (2.3068) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][840/1251] eta 0:02:05 lr 0.000691 wd 0.0500 time 0.2266 (0.3063) data time 0.0007 (0.0053) model time 0.2259 (0.3010) loss 3.1931 (3.3287) grad_norm 2.7794 (2.3071) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [125/300][850/1251] eta 0:02:01 lr 0.000691 wd 0.0500 time 0.2331 (0.3037) data time 0.0010 (0.0051) model time 0.2321 (0.2986) loss 3.2989 (3.3163) grad_norm 2.5883 (2.3049) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][860/1251] eta 0:01:57 lr 0.000691 wd 0.0500 time 0.2308 (0.3010) data time 0.0009 (0.0050) model time 0.2299 (0.2960) loss 2.2321 (3.3059) grad_norm 2.6519 (2.2977) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][870/1251] eta 0:01:53 lr 0.000691 wd 0.0500 time 0.2626 (0.2990) data time 0.0013 (0.0048) model time 0.2613 (0.2942) loss 3.6503 (3.3177) grad_norm 2.0806 (2.3016) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][880/1251] eta 0:01:50 lr 0.000691 wd 0.0500 time 0.2246 (0.2980) data time 0.0009 (0.0047) model time 0.2237 (0.2933) loss 3.3083 (3.3210) grad_norm 2.2869 (2.3006) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][890/1251] eta 0:01:46 lr 0.000691 wd 0.0500 time 0.2347 (0.2956) data time 0.0009 (0.0046) model time 0.2338 (0.2910) loss 3.8790 (3.3047) grad_norm 2.2476 (2.2942) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][900/1251] eta 0:01:43 lr 0.000691 wd 0.0500 time 0.2245 (0.2946) data time 0.0009 (0.0045) model time 0.2236 (0.2901) loss 3.6491 (3.2996) grad_norm 2.2405 (2.2868) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][910/1251] eta 0:01:39 lr 0.000691 wd 0.0500 time 0.2482 (0.2929) data time 0.0010 (0.0044) model time 0.2472 (0.2885) loss 3.9649 (3.3102) grad_norm 2.1596 (2.2868) 
loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][920/1251] eta 0:01:36 lr 0.000691 wd 0.0500 time 0.2272 (0.2915) data time 0.0009 (0.0043) model time 0.2262 (0.2872) loss 2.2858 (3.3129) grad_norm 2.2829 (2.2935) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][930/1251] eta 0:01:32 lr 0.000691 wd 0.0500 time 0.2224 (0.2897) data time 0.0009 (0.0042) model time 0.2215 (0.2855) loss 3.0547 (3.3134) grad_norm 2.0950 (2.2888) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][940/1251] eta 0:01:29 lr 0.000691 wd 0.0500 time 0.2211 (0.2879) data time 0.0010 (0.0041) model time 0.2201 (0.2838) loss 3.8358 (3.3141) grad_norm 2.3441 (2.2889) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][950/1251] eta 0:01:26 lr 0.000691 wd 0.0500 time 0.3147 (0.2868) data time 0.0014 (0.0040) model time 0.3132 (0.2828) loss 3.0037 (3.3109) grad_norm 1.9994 (2.2871) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][960/1251] eta 0:01:23 lr 0.000690 wd 0.0500 time 0.2341 (0.2859) data time 0.0012 (0.0039) model time 0.2329 (0.2819) loss 3.6137 (3.3108) grad_norm 2.0554 (2.2894) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][970/1251] eta 0:01:20 lr 0.000690 wd 0.0500 time 0.2228 (0.2848) data time 0.0007 (0.0039) model time 0.2221 (0.2810) loss 3.7024 (3.3104) grad_norm 2.6524 (2.2957) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][980/1251] eta 0:01:16 lr 0.000690 wd 0.0500 time 0.2219 (0.2834) 
data time 0.0008 (0.0038) model time 0.2212 (0.2796) loss 3.5241 (3.3039) grad_norm 2.1410 (2.2994) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][990/1251] eta 0:01:13 lr 0.000690 wd 0.0500 time 0.2335 (0.2821) data time 0.0008 (0.0037) model time 0.2328 (0.2783) loss 4.0276 (3.3086) grad_norm 2.4879 (2.3136) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1000/1251] eta 0:01:10 lr 0.000690 wd 0.0500 time 0.2280 (0.2815) data time 0.0010 (0.0037) model time 0.2270 (0.2778) loss 3.7856 (3.3160) grad_norm 2.0797 (2.3164) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1010/1251] eta 0:01:07 lr 0.000690 wd 0.0500 time 0.2339 (0.2802) data time 0.0008 (0.0036) model time 0.2331 (0.2766) loss 2.4307 (3.3162) grad_norm 2.2851 (2.3119) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1020/1251] eta 0:01:04 lr 0.000690 wd 0.0500 time 0.2232 (0.2791) data time 0.0006 (0.0035) model time 0.2226 (0.2756) loss 3.5204 (3.3230) grad_norm 1.8501 (2.3067) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1030/1251] eta 0:01:01 lr 0.000690 wd 0.0500 time 0.2284 (0.2785) data time 0.0009 (0.0035) model time 0.2275 (0.2750) loss 3.4661 (3.3263) grad_norm 3.8653 (2.3142) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1040/1251] eta 0:00:58 lr 0.000690 wd 0.0500 time 0.2253 (0.2779) data time 0.0010 (0.0034) model time 0.2243 (0.2744) loss 3.3418 (3.3253) grad_norm 2.9409 (2.3244) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [125/300][1050/1251] eta 0:00:55 lr 0.000690 wd 0.0500 time 0.2312 (0.2771) data time 0.0010 (0.0034) model time 0.2302 (0.2737) loss 2.9820 (3.3180) grad_norm 2.0616 (2.3302) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1060/1251] eta 0:00:52 lr 0.000690 wd 0.0500 time 0.2302 (0.2761) data time 0.0007 (0.0033) model time 0.2294 (0.2728) loss 1.8507 (3.3120) grad_norm 2.6244 (2.3276) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1070/1251] eta 0:00:49 lr 0.000690 wd 0.0500 time 0.2234 (0.2751) data time 0.0009 (0.0033) model time 0.2225 (0.2718) loss 3.6478 (3.3112) grad_norm 2.1589 (2.3200) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1080/1251] eta 0:00:46 lr 0.000690 wd 0.0500 time 0.2917 (0.2748) data time 0.0014 (0.0033) model time 0.2904 (0.2716) loss 3.3343 (3.3165) grad_norm 2.6007 (2.3159) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1090/1251] eta 0:00:44 lr 0.000690 wd 0.0500 time 0.2245 (0.2740) data time 0.0007 (0.0032) model time 0.2238 (0.2707) loss 3.6749 (3.3177) grad_norm 2.9848 (2.3156) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1100/1251] eta 0:00:41 lr 0.000690 wd 0.0500 time 0.2282 (0.2733) data time 0.0007 (0.0032) model time 0.2275 (0.2701) loss 3.8711 (3.3180) grad_norm 2.2067 (2.3069) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1110/1251] eta 0:00:38 lr 0.000690 wd 0.0500 time 0.2300 (0.2725) data time 0.0010 (0.0031) model time 0.2290 (0.2693) loss 2.1678 (3.3197) 
grad_norm 2.2573 (2.3117) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1120/1251] eta 0:00:35 lr 0.000690 wd 0.0500 time 0.2274 (0.2721) data time 0.0008 (0.0031) model time 0.2267 (0.2690) loss 3.5642 (3.3123) grad_norm 1.8762 (2.3045) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1130/1251] eta 0:00:32 lr 0.000690 wd 0.0500 time 0.2372 (0.2717) data time 0.0010 (0.0031) model time 0.2362 (0.2687) loss 3.6832 (3.3103) grad_norm 1.7437 (2.3054) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1140/1251] eta 0:00:30 lr 0.000690 wd 0.0500 time 0.2314 (0.2709) data time 0.0006 (0.0030) model time 0.2308 (0.2679) loss 4.0213 (3.3125) grad_norm 1.8068 (2.3107) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1150/1251] eta 0:00:27 lr 0.000690 wd 0.0500 time 0.2286 (0.2702) data time 0.0007 (0.0030) model time 0.2279 (0.2672) loss 4.3489 (3.3170) grad_norm 2.4872 (2.3099) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1160/1251] eta 0:00:24 lr 0.000690 wd 0.0500 time 0.2921 (0.2702) data time 0.0016 (0.0030) model time 0.2905 (0.2672) loss 3.0543 (3.3196) grad_norm 1.9422 (2.3049) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1170/1251] eta 0:00:21 lr 0.000690 wd 0.0500 time 0.2266 (0.2695) data time 0.0007 (0.0029) model time 0.2259 (0.2666) loss 4.2165 (3.3216) grad_norm 1.9763 (2.2968) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1180/1251] eta 0:00:19 lr 
0.000690 wd 0.0500 time 0.2286 (0.2689) data time 0.0009 (0.0029) model time 0.2276 (0.2660) loss 2.6350 (3.3222) grad_norm 2.3246 (2.2913) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1190/1251] eta 0:00:16 lr 0.000690 wd 0.0500 time 0.2327 (0.2683) data time 0.0009 (0.0029) model time 0.2319 (0.2654) loss 3.1702 (3.3219) grad_norm 2.3797 (2.2896) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-26 22:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1200/1251] eta 0:00:13 lr 0.000689 wd 0.0500 time 0.2927 (0.2678) data time 0.0011 (0.0029) model time 0.2916 (0.2650) loss 4.0016 (3.3227) grad_norm 2.7338 (2.2888) loss_scale 4096.0000 (2078.2660) mem 7373MB [2024-08-26 22:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1210/1251] eta 0:00:10 lr 0.000689 wd 0.0500 time 0.2322 (0.2675) data time 0.0007 (0.0028) model time 0.2315 (0.2647) loss 3.7974 (3.3228) grad_norm 2.0748 (2.2890) loss_scale 4096.0000 (2110.8627) mem 7373MB [2024-08-26 22:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1220/1251] eta 0:00:08 lr 0.000689 wd 0.0500 time 0.2343 (0.2669) data time 0.0010 (0.0028) model time 0.2334 (0.2641) loss 2.6545 (3.3227) grad_norm 2.2555 (2.2895) loss_scale 4096.0000 (2142.4229) mem 7373MB [2024-08-26 22:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1230/1251] eta 0:00:05 lr 0.000689 wd 0.0500 time 0.3114 (0.2665) data time 0.0012 (0.0028) model time 0.3103 (0.2637) loss 4.1567 (3.3260) grad_norm 2.1728 (2.2977) loss_scale 4096.0000 (2172.9953) mem 7373MB [2024-08-26 22:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1240/1251] eta 0:00:02 lr 0.000689 wd 0.0500 time 0.2113 (0.2660) data time 0.0005 (0.0028) model time 0.2109 (0.2632) loss 3.3378 (3.3216) grad_norm 2.5762 (2.3014) loss_scale 4096.0000 (2202.6256) mem 7373MB 
[2024-08-26 22:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [125/300][1250/1251] eta 0:00:00 lr 0.000689 wd 0.0500 time 0.2782 (0.2657) data time 0.0010 (0.0028) model time 0.2772 (0.2629) loss 3.4468 (3.3188) grad_norm 2.5456 (2.2991) loss_scale 4096.0000 (2231.3566) mem 7373MB [2024-08-26 22:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 125 training takes 0:02:55 [2024-08-26 22:29:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 22:29:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 22:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.466 (0.466) Loss 0.5244 (0.5244) Acc@1 90.625 (90.625) Acc@5 97.949 (97.949) Mem 7373MB [2024-08-26 22:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.116) Loss 0.7178 (0.7569) Acc@1 85.156 (83.709) Acc@5 96.875 (96.644) Mem 7373MB [2024-08-26 22:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.100) Loss 1.1240 (0.7944) Acc@1 74.023 (82.385) Acc@5 93.848 (96.591) Mem 7373MB [2024-08-26 22:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.094) Loss 1.3984 (0.9035) Acc@1 67.578 (80.006) Acc@5 90.039 (95.312) Mem 7373MB [2024-08-26 22:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.2178 (0.9639) Acc@1 71.289 (78.461) Acc@5 91.992 (94.569) Mem 7373MB [2024-08-26 22:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.070 Acc@5 94.518 [2024-08-26 22:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.1% [2024-08-26 22:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.103 (1.103) Loss 0.4194 (0.4194) Acc@1 92.285 (92.285) 
Acc@5 98.438 (98.438) Mem 7373MB [2024-08-26 22:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.135 (0.180) Loss 0.6685 (0.6613) Acc@1 86.426 (85.778) Acc@5 96.973 (97.212) Mem 7373MB [2024-08-26 22:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.090 (0.131) Loss 0.9375 (0.6839) Acc@1 77.930 (84.826) Acc@5 95.215 (97.247) Mem 7373MB [2024-08-26 22:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.116) Loss 1.2002 (0.7782) Acc@1 69.336 (82.475) Acc@5 91.504 (96.195) Mem 7373MB [2024-08-26 22:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.067 (0.105) Loss 1.0869 (0.8278) Acc@1 72.852 (81.114) Acc@5 93.066 (95.648) Mem 7373MB [2024-08-26 22:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.734 Acc@5 95.622 [2024-08-26 22:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-26 22:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.73% [2024-08-26 22:29:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-26 22:29:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-26 22:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][0/1251] eta 0:20:39 lr 0.000689 wd 0.0500 time 0.9907 (0.9907) data time 0.7440 (0.7440) model time 0.0000 (0.0000) loss 3.4598 (3.4598) grad_norm 1.8244 (1.8244) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-26 22:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][10/1251] eta 0:06:26 lr 0.000689 wd 0.0500 time 0.2539 (0.3118) data time 0.0010 (0.0689) model time 0.0000 (0.0000) loss 3.9256 (3.3528) grad_norm 1.6386 (2.0971) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][20/1251] eta 0:05:43 lr 0.000689 wd 0.0500 time 0.2199 (0.2787) data time 0.0010 (0.0366) model time 0.0000 (0.0000) loss 2.8928 (3.4286) grad_norm 3.1996 (2.3460) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][30/1251] eta 0:05:19 lr 0.000689 wd 0.0500 time 0.2253 (0.2619) data time 0.0008 (0.0251) model time 0.0000 (0.0000) loss 2.3105 (3.3719) grad_norm 1.9682 (2.5884) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][40/1251] eta 0:05:06 lr 0.000689 wd 0.0500 time 0.2352 (0.2531) data time 0.0007 (0.0193) model time 0.0000 (0.0000) loss 2.8519 (3.2994) grad_norm 1.7757 (2.5545) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][50/1251] eta 0:05:03 lr 0.000689 wd 0.0500 time 0.2957 (0.2524) data time 0.0014 (0.0157) model time 0.0000 (0.0000) loss 3.4304 (3.3219) grad_norm 2.7914 (2.4899) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][60/1251] eta 0:05:01 lr 0.000689 wd 0.0500 time 0.2909 (0.2534) data time 0.0015 (0.0133) model time 0.2894 
(0.2570) loss 2.4547 (3.2664) grad_norm 2.0243 (2.4689) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][70/1251] eta 0:04:55 lr 0.000689 wd 0.0500 time 0.2186 (0.2503) data time 0.0011 (0.0116) model time 0.2175 (0.2439) loss 3.7027 (3.2741) grad_norm 2.3821 (2.4644) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][80/1251] eta 0:04:50 lr 0.000689 wd 0.0500 time 0.2214 (0.2477) data time 0.0009 (0.0103) model time 0.2205 (0.2386) loss 3.6173 (3.3132) grad_norm 4.9980 (2.5036) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][90/1251] eta 0:04:45 lr 0.000689 wd 0.0500 time 0.2497 (0.2457) data time 0.0009 (0.0093) model time 0.2488 (0.2360) loss 3.2580 (3.3111) grad_norm 2.3264 (2.5276) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][100/1251] eta 0:04:46 lr 0.000689 wd 0.0500 time 0.2277 (0.2491) data time 0.0010 (0.0085) model time 0.2267 (0.2446) loss 3.6698 (3.3102) grad_norm 2.0899 (2.5174) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-26 22:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 22:29:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 22:30:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 22:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 22:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 22:34:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 22:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 22:34:47 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 22:34:48 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 22:34:49 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 22:34:49 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126) [2024-08-26 22:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 22:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 22:35:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 22:35:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 22:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 22:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 22:39:44 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 22:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 22:39:57 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 22:39:59 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 22:40:00 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 22:40:00 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126) [2024-08-26 22:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 22:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 22:40:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 22:40:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-26 22:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 22:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 22:42:28 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-26 22:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-26 22:42:44 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-26 22:42:45 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 22:42:47 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 22:42:47 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126) [2024-08-26 22:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 22:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][110/1251] eta 4:34:15 lr 0.000689 wd 0.0500 time 14.4220 (14.4220) data time 1.0336 (1.0336) model time 13.3884 (13.3884) loss 4.0546 (4.0546) grad_norm 1.7444 (1.7444) loss_scale 4096.0000 (4096.0000) mem 20033MB [2024-08-26 22:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][120/1251] eta 0:29:04 lr 0.000689 wd 0.0500 time 0.2255 (1.5427) data time 0.0008 (0.0949) model time 0.2246 (1.4478) loss 2.7006 (3.7140) grad_norm 1.8413 (2.2994) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][130/1251] eta 0:17:12 lr 0.000689 wd 0.0500 time 0.2356 (0.9207) data time 0.0009 (0.0503) model time 0.2347 (0.8705) loss 
3.4151 (3.5829) grad_norm 3.9826 (2.2677) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][140/1251] eta 0:12:56 lr 0.000689 wd 0.0500 time 0.2271 (0.6989) data time 0.0007 (0.0345) model time 0.2264 (0.6644) loss 2.6100 (3.5887) grad_norm 2.2044 (2.2538) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][150/1251] eta 0:10:44 lr 0.000689 wd 0.0500 time 0.2306 (0.5858) data time 0.0009 (0.0263) model time 0.2297 (0.5595) loss 3.6947 (3.5394) grad_norm 1.8965 (2.1945) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][160/1251] eta 0:09:24 lr 0.000689 wd 0.0500 time 0.2436 (0.5172) data time 0.0006 (0.0213) model time 0.2430 (0.4958) loss 3.8760 (3.5069) grad_norm 1.6475 (2.2003) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][170/1251] eta 0:08:28 lr 0.000689 wd 0.0500 time 0.2299 (0.4708) data time 0.0010 (0.0180) model time 0.2289 (0.4529) loss 3.3645 (3.4604) grad_norm 2.3893 (2.1986) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][180/1251] eta 0:07:48 lr 0.000689 wd 0.0500 time 0.2392 (0.4379) data time 0.0011 (0.0156) model time 0.2381 (0.4223) loss 3.5647 (3.4374) grad_norm 2.4603 (2.2121) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][190/1251] eta 0:07:18 lr 0.000689 wd 0.0500 time 0.2305 (0.4130) data time 0.0010 (0.0138) model time 0.2295 (0.3992) loss 2.8307 (3.4251) grad_norm 2.3091 (2.2593) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][200/1251] eta 0:06:53 
lr 0.000688 wd 0.0500 time 0.2374 (0.3938) data time 0.0008 (0.0124) model time 0.2366 (0.3814) loss 3.7563 (3.4148) grad_norm 4.7044 (2.3322) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][210/1251] eta 0:06:33 lr 0.000688 wd 0.0500 time 0.2380 (0.3783) data time 0.0008 (0.0113) model time 0.2372 (0.3670) loss 3.3454 (3.4115) grad_norm 2.2968 (2.3083) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][220/1251] eta 0:06:16 lr 0.000688 wd 0.0500 time 0.2325 (0.3653) data time 0.0011 (0.0103) model time 0.2315 (0.3550) loss 2.9498 (3.4020) grad_norm 2.2254 (2.3073) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][230/1251] eta 0:06:01 lr 0.000688 wd 0.0500 time 0.2297 (0.3545) data time 0.0008 (0.0096) model time 0.2289 (0.3449) loss 2.6620 (3.4033) grad_norm 3.2835 (2.3283) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][240/1251] eta 0:05:49 lr 0.000688 wd 0.0500 time 0.2282 (0.3453) data time 0.0009 (0.0089) model time 0.2273 (0.3364) loss 3.6943 (3.4007) grad_norm 1.6228 (2.3194) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][250/1251] eta 0:05:37 lr 0.000688 wd 0.0500 time 0.2239 (0.3373) data time 0.0008 (0.0084) model time 0.2231 (0.3289) loss 3.6519 (3.3790) grad_norm 1.6197 (2.2928) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][260/1251] eta 0:05:27 lr 0.000688 wd 0.0500 time 0.2345 (0.3308) data time 0.0011 (0.0079) model time 0.2334 (0.3229) loss 2.5592 (3.3791) grad_norm 2.2582 (2.2867) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 
22:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][270/1251] eta 0:05:18 lr 0.000688 wd 0.0500 time 0.2294 (0.3250) data time 0.0012 (0.0074) model time 0.2283 (0.3175) loss 3.6446 (3.3857) grad_norm 2.0787 (2.3336) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][280/1251] eta 0:05:10 lr 0.000688 wd 0.0500 time 0.2384 (0.3198) data time 0.0012 (0.0071) model time 0.2373 (0.3127) loss 3.5188 (3.3814) grad_norm 1.7883 (2.3505) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][290/1251] eta 0:05:02 lr 0.000688 wd 0.0500 time 0.2362 (0.3153) data time 0.0011 (0.0067) model time 0.2351 (0.3085) loss 3.8111 (3.3649) grad_norm 2.2115 (2.3352) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][300/1251] eta 0:04:55 lr 0.000688 wd 0.0500 time 0.2479 (0.3112) data time 0.0011 (0.0064) model time 0.2469 (0.3047) loss 3.0488 (3.3610) grad_norm 2.1688 (2.3139) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][310/1251] eta 0:04:49 lr 0.000688 wd 0.0500 time 0.2285 (0.3074) data time 0.0012 (0.0062) model time 0.2274 (0.3012) loss 3.2689 (3.3486) grad_norm 2.4232 (2.3100) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][320/1251] eta 0:04:43 lr 0.000688 wd 0.0500 time 0.2399 (0.3040) data time 0.0009 (0.0059) model time 0.2390 (0.2981) loss 3.6057 (3.3454) grad_norm 2.0544 (2.3107) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][330/1251] eta 0:04:37 lr 0.000688 wd 0.0500 time 0.2357 (0.3008) data time 0.0008 (0.0057) model time 0.2349 (0.2951) 
loss 3.4021 (3.3358) grad_norm 1.5753 (2.2922) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][340/1251] eta 0:04:31 lr 0.000688 wd 0.0500 time 0.2379 (0.2980) data time 0.0007 (0.0055) model time 0.2372 (0.2924) loss 2.4743 (3.3298) grad_norm 2.0303 (2.2802) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][350/1251] eta 0:04:26 lr 0.000688 wd 0.0500 time 0.2324 (0.2954) data time 0.0008 (0.0053) model time 0.2316 (0.2900) loss 3.2765 (3.3274) grad_norm 2.5289 (2.2726) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][360/1251] eta 0:04:21 lr 0.000688 wd 0.0500 time 0.2373 (0.2930) data time 0.0008 (0.0052) model time 0.2364 (0.2878) loss 3.5806 (3.3221) grad_norm 2.0462 (2.2637) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][370/1251] eta 0:04:16 lr 0.000688 wd 0.0500 time 0.2393 (0.2909) data time 0.0008 (0.0050) model time 0.2385 (0.2859) loss 3.0494 (3.3123) grad_norm 2.2630 (2.2652) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][380/1251] eta 0:04:11 lr 0.000688 wd 0.0500 time 0.2275 (0.2889) data time 0.0007 (0.0048) model time 0.2268 (0.2841) loss 4.2470 (3.3101) grad_norm 2.6515 (2.2724) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][390/1251] eta 0:04:07 lr 0.000688 wd 0.0500 time 0.2397 (0.2871) data time 0.0009 (0.0047) model time 0.2388 (0.2824) loss 3.2321 (3.3123) grad_norm 1.7345 (2.2970) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][400/1251] eta 
0:04:03 lr 0.000688 wd 0.0500 time 0.2361 (0.2861) data time 0.0006 (0.0046) model time 0.2354 (0.2815) loss 2.4509 (3.3050) grad_norm 1.8308 (2.2864) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][410/1251] eta 0:03:59 lr 0.000688 wd 0.0500 time 0.2418 (0.2845) data time 0.0010 (0.0045) model time 0.2408 (0.2800) loss 3.2090 (3.2984) grad_norm 2.3547 (2.2877) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][420/1251] eta 0:03:55 lr 0.000688 wd 0.0500 time 0.2321 (0.2836) data time 0.0008 (0.0043) model time 0.2313 (0.2793) loss 3.9779 (3.2983) grad_norm 1.7831 (2.2921) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][430/1251] eta 0:03:51 lr 0.000688 wd 0.0500 time 0.2366 (0.2822) data time 0.0006 (0.0042) model time 0.2359 (0.2779) loss 3.8044 (3.3090) grad_norm 1.4601 (2.2846) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][440/1251] eta 0:03:47 lr 0.000687 wd 0.0500 time 0.2466 (0.2808) data time 0.0006 (0.0041) model time 0.2460 (0.2767) loss 2.2583 (3.3098) grad_norm 2.5839 (2.2791) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][450/1251] eta 0:03:43 lr 0.000687 wd 0.0500 time 0.2302 (0.2794) data time 0.0009 (0.0040) model time 0.2293 (0.2754) loss 3.6352 (3.3126) grad_norm 1.6442 (2.2689) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-26 22:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][460/1251] eta 0:03:40 lr 0.000687 wd 0.0500 time 0.2353 (0.2782) data time 0.0008 (0.0040) model time 0.2345 (0.2742) loss 3.8872 (3.3144) grad_norm 2.3048 (2.2698) loss_scale 4096.0000 (4096.0000) mem 7377MB 
[2024-08-26 22:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting
[2024-08-26 22:44:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-26 22:44:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-26 22:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 22:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 22:53:25 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 22:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-26 22:53:35 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-26 22:53:36 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-26 22:53:37 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-26 22:53:37 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126)
[2024-08-26 22:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-26 23:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 23:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 23:00:37 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 23:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-26 23:00:47 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-26 23:00:49 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-26 23:00:50 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-26 23:00:50 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126)
[2024-08-26 23:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-26 23:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 23:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 23:19:05 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 23:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 23:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 23:23:51 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 23:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-26 23:24:06 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-26 23:24:07 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-26 23:24:08 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-26 23:24:08 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126) [2024-08-26 23:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-26 23:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][470/1251] eta 3:02:07 lr 0.000687 wd 0.0500 time 13.9911 (13.9911) data time 0.6788 (0.6788) model time 13.3123 (13.3123) loss 4.2528 (4.2528) grad_norm 2.3809 (2.3809) loss_scale 2048.0000 (2048.0000) mem 20033MB [2024-08-26 23:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][480/1251] eta 0:19:15 lr 0.000687 wd 0.0500 time 0.2297 (1.4985) data time 0.0009 (0.0627) model time 0.2288 (1.4358) loss 3.0125 (3.6387) grad_norm 1.5987 (2.3572) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 23:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][490/1251] eta 0:11:19 lr 0.000687 wd 0.0500 time 0.2280 (0.8934) data time 0.0011 (0.0334) model time 0.2269 (0.8601) loss 2.9764 (3.5117) grad_norm 1.5981 (2.4282) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 23:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][500/1251] eta 0:08:29 lr 0.000687 wd 0.0500 time 0.2256 (0.6789) data time 0.0007 (0.0230) model time 0.2249 (0.6559) loss 2.8473 (3.5245) grad_norm 2.3602 (2.3162) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 23:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][510/1251] eta 0:07:01 lr 0.000687 wd 0.0500 time 0.2364 (0.5693) data time 0.0009 (0.0177) model time 0.2355 (0.5516) loss 3.4402 (3.4762) grad_norm 1.7842 (2.3221) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 23:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [126/300][520/1251] eta 0:06:07 lr 0.000687 wd 0.0500 time 0.2279 (0.5027) data time 0.0009 (0.0144) model time 0.2270 (0.4883) loss 3.8599 (3.4541) grad_norm 2.6102 (2.3185) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 23:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][530/1251] eta 0:05:30 lr 0.000687 wd 0.0500 time 0.2197 (0.4578) data time 0.0009 (0.0122) model time 0.2188 (0.4456) loss 3.4629 (3.4123) grad_norm 2.0608 (2.3024) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 23:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][540/1251] eta 0:05:02 lr 0.000687 wd 0.0500 time 0.2258 (0.4254) data time 0.0010 (0.0106) model time 0.2249 (0.4148) loss 3.2504 (3.3728) grad_norm 2.2664 (2.2851) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-26 23:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-26 23:24:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-26 23:24:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-26 23:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-26 23:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-26 23:51:27 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-26 23:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-26 23:51:37 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-26 23:51:38 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-26 23:51:39 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-26 23:51:39 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126)
[2024-08-26 23:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-26 23:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-26 23:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-26 23:54:17 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-26 23:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-26 23:54:29 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-26 23:54:30 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-26 23:54:31 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-26 23:54:31 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126)
[2024-08-26 23:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-27 00:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 00:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 00:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 00:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 00:07:06 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-27 00:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-27 00:07:13 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-27 00:07:15 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-27 00:07:16 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-27 00:07:16 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126)
[2024-08-27 00:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-27 00:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 00:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 00:21:35 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-27 00:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 00:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 00:57:11 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-27 00:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-27 00:57:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-27 00:57:25 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 00:57:26 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 00:57:26 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 126) [2024-08-27 00:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 00:57:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][550/1251] eta 2:40:14 lr 0.000687 wd 0.0500 time 13.7152 (13.7152) data time 0.7708 (0.7708) model time 12.9443 (12.9443) loss 3.9514 (3.9514) grad_norm 2.7415 (2.7415) loss_scale 2048.0000 (2048.0000) mem 20033MB [2024-08-27 00:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][560/1251] eta 0:17:05 lr 0.000687 wd 0.0500 time 0.2341 (1.4834) data time 0.0012 (0.0713) model time 0.2329 (1.4121) loss 2.9099 (3.7172) grad_norm 1.8520 (2.3593) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][570/1251] eta 0:10:08 lr 0.000687 wd 0.0500 time 0.2304 (0.8932) data time 0.0010 (0.0381) model time 0.2294 (0.8551) loss 3.5438 (3.6084) grad_norm 1.5398 (2.1585) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][580/1251] eta 0:07:39 lr 0.000687 wd 0.0500 time 0.2317 (0.6845) data time 0.0008 (0.0263) model time 0.2309 (0.6581) loss 2.3450 (3.5747) grad_norm 2.1017 (2.1187) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][590/1251] eta 0:06:21 lr 0.000687 wd 0.0500 time 0.2392 (0.5766) data time 0.0010 (0.0202) model time 0.2382 (0.5563) loss 3.2717 (3.5320) grad_norm 1.5583 (2.1277) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [126/300][600/1251] eta 0:05:32 lr 0.000687 wd 0.0500 time 0.2294 (0.5105) data time 0.0007 (0.0165) model time 0.2288 (0.4941) loss 3.4378 (3.4931) grad_norm 2.0394 (2.1963) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][610/1251] eta 0:04:58 lr 0.000687 wd 0.0500 time 0.2365 (0.4663) data time 0.0010 (0.0141) model time 0.2355 (0.4522) loss 3.5210 (3.4634) grad_norm 2.9215 (2.2684) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][620/1251] eta 0:04:34 lr 0.000687 wd 0.0500 time 0.2309 (0.4349) data time 0.0010 (0.0123) model time 0.2299 (0.4226) loss 3.1387 (3.4325) grad_norm 1.9999 (2.2812) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][630/1251] eta 0:04:15 lr 0.000687 wd 0.0500 time 0.2301 (0.4107) data time 0.0011 (0.0110) model time 0.2290 (0.3997) loss 2.8990 (3.4232) grad_norm 2.0718 (2.2315) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][640/1251] eta 0:03:59 lr 0.000687 wd 0.0500 time 0.2357 (0.3919) data time 0.0009 (0.0099) model time 0.2348 (0.3820) loss 3.6799 (3.4159) grad_norm 2.1765 (2.2082) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][650/1251] eta 0:03:46 lr 0.000687 wd 0.0500 time 0.2239 (0.3767) data time 0.0010 (0.0091) model time 0.2229 (0.3676) loss 3.5519 (3.4110) grad_norm 2.2945 (2.2135) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][660/1251] eta 0:03:35 lr 0.000687 wd 0.0500 time 0.2408 (0.3644) data time 0.0009 (0.0084) model time 0.2399 (0.3561) loss 2.5904 (3.4118) grad_norm 1.5439 (2.2149) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][670/1251] eta 0:03:25 lr 0.000687 wd 0.0500 time 0.2347 (0.3543) data time 0.0008 (0.0078) model time 0.2339 (0.3466) loss 2.0845 (3.4082) grad_norm 2.0367 (2.2097) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][680/1251] eta 0:03:17 lr 0.000686 wd 0.0500 time 0.2338 (0.3460) data time 0.0010 (0.0072) model time 0.2328 (0.3387) loss 3.3839 (3.3983) grad_norm 1.9137 (2.1965) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][690/1251] eta 0:03:10 lr 0.000686 wd 0.0500 time 0.2376 (0.3387) data time 0.0009 (0.0069) model time 0.2367 (0.3318) loss 3.6662 (3.3916) grad_norm 2.5673 (2.1994) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][700/1251] eta 0:03:03 lr 0.000686 wd 0.0500 time 0.2576 (0.3327) data time 0.0009 (0.0065) model time 0.2567 (0.3262) loss 2.4335 (3.3844) grad_norm 2.3219 (2.2377) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][710/1251] eta 0:02:57 lr 0.000686 wd 0.0500 time 0.2417 (0.3273) data time 0.0009 (0.0062) model time 0.2408 (0.3211) loss 3.5197 (3.3890) grad_norm 1.6246 (2.2455) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][720/1251] eta 0:02:51 lr 0.000686 wd 0.0500 time 0.2429 (0.3222) data time 0.0010 (0.0059) model time 0.2419 (0.3163) loss 3.4804 (3.3852) grad_norm 1.6824 (2.2510) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][730/1251] eta 0:02:45 lr 0.000686 wd 0.0500 time 0.2435 (0.3177) data time 
0.0010 (0.0056) model time 0.2426 (0.3121) loss 3.2726 (3.3684) grad_norm 1.8799 (2.2561) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][740/1251] eta 0:02:40 lr 0.000686 wd 0.0500 time 0.2381 (0.3137) data time 0.0012 (0.0054) model time 0.2369 (0.3084) loss 3.2296 (3.3761) grad_norm 1.7081 (2.2555) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][750/1251] eta 0:02:35 lr 0.000686 wd 0.0500 time 0.2380 (0.3101) data time 0.0010 (0.0052) model time 0.2370 (0.3049) loss 3.0403 (3.3625) grad_norm 1.8577 (2.2500) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][760/1251] eta 0:02:30 lr 0.000686 wd 0.0500 time 0.2376 (0.3071) data time 0.0010 (0.0050) model time 0.2366 (0.3021) loss 3.4317 (3.3550) grad_norm 3.3837 (2.2452) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][770/1251] eta 0:02:26 lr 0.000686 wd 0.0500 time 0.2390 (0.3042) data time 0.0009 (0.0048) model time 0.2381 (0.2993) loss 3.7355 (3.3517) grad_norm 1.7919 (2.2522) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][780/1251] eta 0:02:22 lr 0.000686 wd 0.0500 time 0.2322 (0.3015) data time 0.0009 (0.0047) model time 0.2312 (0.2968) loss 2.0519 (3.3475) grad_norm 2.5530 (2.2535) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][790/1251] eta 0:02:17 lr 0.000686 wd 0.0500 time 0.2483 (0.2990) data time 0.0007 (0.0045) model time 0.2476 (0.2945) loss 3.4053 (3.3370) grad_norm 1.8097 (2.2525) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [126/300][800/1251] eta 0:02:13 lr 0.000686 wd 0.0500 time 0.2472 (0.2969) data time 0.0008 (0.0044) model time 0.2464 (0.2925) loss 3.6315 (3.3299) grad_norm 2.3877 (2.2483) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][810/1251] eta 0:02:10 lr 0.000686 wd 0.0500 time 0.2392 (0.2949) data time 0.0007 (0.0043) model time 0.2385 (0.2906) loss 3.0788 (3.3255) grad_norm 2.1156 (2.2471) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][820/1251] eta 0:02:06 lr 0.000686 wd 0.0500 time 0.2384 (0.2928) data time 0.0007 (0.0042) model time 0.2377 (0.2887) loss 3.3827 (3.3197) grad_norm 2.9544 (2.2498) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][830/1251] eta 0:02:02 lr 0.000686 wd 0.0500 time 0.2381 (0.2911) data time 0.0010 (0.0041) model time 0.2370 (0.2870) loss 3.1573 (3.3241) grad_norm 3.3360 (2.2629) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][840/1251] eta 0:01:59 lr 0.000686 wd 0.0500 time 0.2581 (0.2902) data time 0.0009 (0.0040) model time 0.2572 (0.2862) loss 2.0713 (3.3146) grad_norm 2.0254 (2.2532) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][850/1251] eta 0:01:55 lr 0.000686 wd 0.0500 time 0.2434 (0.2887) data time 0.0012 (0.0039) model time 0.2422 (0.2847) loss 3.5442 (3.3046) grad_norm 1.9352 (2.2496) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][860/1251] eta 0:01:52 lr 0.000686 wd 0.0500 time 0.2348 (0.2880) data time 0.0010 (0.0038) model time 0.2337 (0.2841) loss 3.3057 (3.3048) grad_norm 2.2784 (2.2550) 
loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][870/1251] eta 0:01:49 lr 0.000686 wd 0.0500 time 0.2303 (0.2867) data time 0.0008 (0.0038) model time 0.2295 (0.2830) loss 4.1422 (3.3180) grad_norm 1.9666 (2.2576) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][880/1251] eta 0:01:45 lr 0.000686 wd 0.0500 time 0.2445 (0.2854) data time 0.0008 (0.0037) model time 0.2436 (0.2817) loss 2.4992 (3.3179) grad_norm 2.0355 (2.2598) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][890/1251] eta 0:01:42 lr 0.000686 wd 0.0500 time 0.2356 (0.2841) data time 0.0010 (0.0036) model time 0.2346 (0.2805) loss 3.3230 (3.3184) grad_norm 1.9176 (2.2534) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][900/1251] eta 0:01:39 lr 0.000686 wd 0.0500 time 0.2457 (0.2828) data time 0.0008 (0.0035) model time 0.2449 (0.2793) loss 3.9035 (3.3211) grad_norm 4.3286 (2.2542) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][910/1251] eta 0:01:36 lr 0.000686 wd 0.0500 time 0.2391 (0.2816) data time 0.0008 (0.0035) model time 0.2383 (0.2782) loss 3.2004 (3.3216) grad_norm 2.3318 (2.2626) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][920/1251] eta 0:01:32 lr 0.000685 wd 0.0500 time 0.2426 (0.2806) data time 0.0009 (0.0034) model time 0.2417 (0.2772) loss 2.8144 (3.3184) grad_norm 2.0525 (2.2662) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][930/1251] eta 0:01:29 lr 0.000685 wd 0.0500 time 0.2366 (0.2796) 
data time 0.0007 (0.0034) model time 0.2359 (0.2762) loss 2.1805 (3.3173) grad_norm 2.3068 (2.2597) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][940/1251] eta 0:01:26 lr 0.000685 wd 0.0500 time 0.2551 (0.2790) data time 0.0009 (0.0034) model time 0.2542 (0.2756) loss 3.9044 (3.3131) grad_norm 2.5543 (2.2567) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][950/1251] eta 0:01:23 lr 0.000685 wd 0.0500 time 0.2324 (0.2780) data time 0.0010 (0.0033) model time 0.2313 (0.2747) loss 3.5206 (3.3142) grad_norm 3.0250 (2.2569) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][960/1251] eta 0:01:20 lr 0.000685 wd 0.0500 time 0.2557 (0.2771) data time 0.0011 (0.0032) model time 0.2546 (0.2739) loss 3.5250 (3.3188) grad_norm 2.2841 (2.2620) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][970/1251] eta 0:01:17 lr 0.000685 wd 0.0500 time 0.2369 (0.2763) data time 0.0009 (0.0032) model time 0.2361 (0.2731) loss 3.5940 (3.3161) grad_norm 1.8587 (2.2695) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][980/1251] eta 0:01:14 lr 0.000685 wd 0.0500 time 0.2362 (0.2755) data time 0.0009 (0.0032) model time 0.2352 (0.2723) loss 3.8267 (3.3242) grad_norm 1.4861 (2.2653) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][990/1251] eta 0:01:11 lr 0.000685 wd 0.0500 time 0.2338 (0.2747) data time 0.0008 (0.0031) model time 0.2330 (0.2715) loss 4.3651 (3.3283) grad_norm 2.0292 (2.2563) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:33 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [126/300][1000/1251] eta 0:01:08 lr 0.000685 wd 0.0500 time 0.2372 (0.2739) data time 0.0008 (0.0031) model time 0.2364 (0.2708) loss 3.2986 (3.3273) grad_norm 3.0854 (2.2519) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1010/1251] eta 0:01:05 lr 0.000685 wd 0.0500 time 0.2354 (0.2732) data time 0.0009 (0.0031) model time 0.2345 (0.2702) loss 3.3892 (3.3196) grad_norm 4.6627 (2.2596) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1020/1251] eta 0:01:02 lr 0.000685 wd 0.0500 time 0.2454 (0.2727) data time 0.0010 (0.0030) model time 0.2444 (0.2696) loss 3.1729 (3.3110) grad_norm 2.1175 (2.2609) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1030/1251] eta 0:01:00 lr 0.000685 wd 0.0500 time 0.2328 (0.2720) data time 0.0008 (0.0030) model time 0.2320 (0.2690) loss 3.5602 (3.3075) grad_norm 2.1512 (2.2602) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1040/1251] eta 0:00:57 lr 0.000685 wd 0.0500 time 0.2501 (0.2714) data time 0.0010 (0.0030) model time 0.2491 (0.2685) loss 3.7066 (3.3128) grad_norm 1.9118 (2.2681) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1050/1251] eta 0:00:54 lr 0.000685 wd 0.0500 time 0.2396 (0.2709) data time 0.0009 (0.0029) model time 0.2387 (0.2680) loss 3.8796 (3.3134) grad_norm 1.7909 (2.2642) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1060/1251] eta 0:00:51 lr 0.000685 wd 0.0500 time 0.2339 (0.2704) data time 0.0007 (0.0029) model time 0.2332 (0.2675) loss 4.1421 (3.3195) 
grad_norm 2.2451 (2.2621) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1070/1251] eta 0:00:48 lr 0.000685 wd 0.0500 time 0.2400 (0.2698) data time 0.0007 (0.0029) model time 0.2392 (0.2670) loss 2.9337 (3.3199) grad_norm 2.3430 (2.2660) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1080/1251] eta 0:00:46 lr 0.000685 wd 0.0500 time 0.2357 (0.2693) data time 0.0010 (0.0028) model time 0.2347 (0.2665) loss 3.6634 (3.3151) grad_norm 2.1206 (2.2693) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1090/1251] eta 0:00:43 lr 0.000685 wd 0.0500 time 0.2358 (0.2688) data time 0.0010 (0.0028) model time 0.2348 (0.2660) loss 3.6333 (3.3135) grad_norm 2.8635 (2.2693) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 00:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1100/1251] eta 0:00:40 lr 0.000685 wd 0.0500 time 0.2392 (0.2683) data time 0.0007 (0.0028) model time 0.2386 (0.2656) loss 4.0100 (3.3139) grad_norm 2.3446 (2.2714) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1110/1251] eta 0:00:37 lr 0.000685 wd 0.0500 time 0.2456 (0.2679) data time 0.0010 (0.0027) model time 0.2446 (0.2652) loss 3.4364 (3.3180) grad_norm 1.6549 (2.2679) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1120/1251] eta 0:00:35 lr 0.000685 wd 0.0500 time 0.2370 (0.2676) data time 0.0011 (0.0027) model time 0.2359 (0.2649) loss 3.4041 (3.3192) grad_norm 2.1075 (2.2673) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1130/1251] eta 0:00:32 lr 
0.000685 wd 0.0500 time 0.2473 (0.2672) data time 0.0007 (0.0027) model time 0.2466 (0.2644) loss 4.3028 (3.3204) grad_norm 2.6399 (2.2690) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1140/1251] eta 0:00:29 lr 0.000685 wd 0.0500 time 0.2431 (0.2667) data time 0.0010 (0.0027) model time 0.2421 (0.2641) loss 3.5550 (3.3225) grad_norm 2.6893 (2.2683) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1150/1251] eta 0:00:26 lr 0.000685 wd 0.0500 time 0.2341 (0.2663) data time 0.0011 (0.0027) model time 0.2330 (0.2637) loss 2.2840 (3.3210) grad_norm 2.5737 (2.2653) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1160/1251] eta 0:00:24 lr 0.000684 wd 0.0500 time 0.2633 (0.2660) data time 0.0009 (0.0026) model time 0.2623 (0.2633) loss 3.4907 (3.3230) grad_norm 2.0434 (2.2608) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1170/1251] eta 0:00:21 lr 0.000684 wd 0.0500 time 0.2474 (0.2656) data time 0.0009 (0.0026) model time 0.2465 (0.2630) loss 4.3017 (3.3259) grad_norm 2.6263 (2.2639) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1180/1251] eta 0:00:18 lr 0.000684 wd 0.0500 time 0.2324 (0.2652) data time 0.0010 (0.0026) model time 0.2314 (0.2626) loss 3.7849 (3.3278) grad_norm 2.7768 (2.2610) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1190/1251] eta 0:00:16 lr 0.000684 wd 0.0500 time 0.2433 (0.2649) data time 0.0007 (0.0026) model time 0.2426 (0.2623) loss 2.3145 (3.3267) grad_norm 2.4188 (2.2609) loss_scale 2048.0000 (2048.0000) mem 7377MB 
[2024-08-27 01:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1200/1251] eta 0:00:13 lr 0.000684 wd 0.0500 time 0.2670 (0.2646) data time 0.0007 (0.0026) model time 0.2662 (0.2621) loss 3.9806 (3.3270) grad_norm 2.3844 (2.2611) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1210/1251] eta 0:00:10 lr 0.000684 wd 0.0500 time 0.2421 (0.2643) data time 0.0009 (0.0026) model time 0.2413 (0.2617) loss 2.1525 (3.3223) grad_norm 1.8740 (2.2579) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1220/1251] eta 0:00:08 lr 0.000684 wd 0.0500 time 0.2467 (0.2640) data time 0.0010 (0.0025) model time 0.2457 (0.2615) loss 4.3115 (3.3260) grad_norm 2.7968 (2.2583) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1230/1251] eta 0:00:05 lr 0.000684 wd 0.0500 time 0.2474 (0.2638) data time 0.0008 (0.0025) model time 0.2465 (0.2612) loss 3.9701 (3.3278) grad_norm 2.8990 (2.2583) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1240/1251] eta 0:00:02 lr 0.000684 wd 0.0500 time 0.2238 (0.2634) data time 0.0007 (0.0025) model time 0.2231 (0.2609) loss 2.6312 (3.3251) grad_norm 1.8079 (2.2554) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [126/300][1250/1251] eta 0:00:00 lr 0.000684 wd 0.0500 time 0.2227 (0.2629) data time 0.0007 (0.0025) model time 0.2220 (0.2604) loss 3.4624 (3.3224) grad_norm 2.5742 (2.2542) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 01:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 126 training takes 0:03:04 [2024-08-27 01:00:34 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 01:00:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 01:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.379 (0.379) Loss 0.4778 (0.4778) Acc@1 90.723 (90.723) Acc@5 98.145 (98.145) Mem 7377MB [2024-08-27 01:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.111) Loss 0.7246 (0.7380) Acc@1 85.059 (83.461) Acc@5 96.484 (96.564) Mem 7377MB [2024-08-27 01:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.094) Loss 1.1084 (0.7712) Acc@1 74.316 (82.366) Acc@5 93.164 (96.563) Mem 7377MB [2024-08-27 01:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.088) Loss 1.3984 (0.8744) Acc@1 66.504 (80.037) Acc@5 88.672 (95.246) Mem 7377MB [2024-08-27 01:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.1631 (0.9341) Acc@1 72.363 (78.454) Acc@5 92.188 (94.562) Mem 7377MB [2024-08-27 01:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.092 Acc@5 94.492 [2024-08-27 01:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.1% [2024-08-27 01:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.736 (0.736) Loss 0.4187 (0.4187) Acc@1 92.188 (92.188) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-27 01:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.141) Loss 0.6675 (0.6604) Acc@1 86.621 (85.769) Acc@5 96.973 (97.230) Mem 7377MB [2024-08-27 01:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.111) Loss 0.9351 (0.6830) Acc@1 77.832 (84.821) Acc@5 95.215 (97.261) Mem 7377MB [2024-08-27 01:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.084 (0.100) Loss 1.1982 (0.7771) Acc@1 69.141 (82.450) Acc@5 91.504 (96.217) Mem 7377MB [2024-08-27 01:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 1.0820 (0.8266) Acc@1 72.754 (81.071) Acc@5 92.969 (95.646) Mem 7377MB [2024-08-27 01:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.680 Acc@5 95.616 [2024-08-27 01:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-27 01:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.68% [2024-08-27 01:00:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 01:00:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 01:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][0/1251] eta 0:18:01 lr 0.000684 wd 0.0500 time 0.8644 (0.8644) data time 0.6009 (0.6009) model time 0.0000 (0.0000) loss 2.5822 (2.5822) grad_norm 2.0301 (2.0301) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-27 01:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][10/1251] eta 0:06:08 lr 0.000684 wd 0.0500 time 0.2518 (0.2973) data time 0.0008 (0.0556) model time 0.0000 (0.0000) loss 2.1768 (3.1459) grad_norm 2.4002 (2.4521) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 01:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][20/1251] eta 0:05:33 lr 0.000684 wd 0.0500 time 0.2465 (0.2707) data time 0.0008 (0.0298) model time 0.0000 (0.0000) loss 3.3579 (3.2107) grad_norm 2.2016 (2.4303) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 01:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][30/1251] eta 0:05:19 lr 0.000684 wd 0.0500 time 0.2447 (0.2620) data time 0.0012 
(0.0208) model time 0.0000 (0.0000) loss 3.5987 (3.2550) grad_norm 2.8826 (2.3097) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 01:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][40/1251] eta 0:05:11 lr 0.000684 wd 0.0500 time 0.2598 (0.2573) data time 0.0010 (0.0162) model time 0.0000 (0.0000) loss 2.5803 (3.3140) grad_norm 1.7233 (2.2871) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 01:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][50/1251] eta 0:05:11 lr 0.000684 wd 0.0500 time 0.2557 (0.2590) data time 0.0007 (0.0136) model time 0.0000 (0.0000) loss 3.9402 (3.3032) grad_norm 1.7293 (2.2754) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 01:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][60/1251] eta 0:05:05 lr 0.000684 wd 0.0500 time 0.2515 (0.2567) data time 0.0007 (0.0116) model time 0.2508 (0.2436) loss 3.3743 (3.2907) grad_norm 2.0456 (2.2548) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 01:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 01:01:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 01:01:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 02:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 02:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 02:48:21 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 02:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 02:48:32 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 02:48:34 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 02:48:35 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 02:48:35 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 127) [2024-08-27 02:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 02:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][70/1251] eta 0:33:22 lr 0.000684 wd 0.0500 time 0.2168 (1.6954) data time 0.0007 (0.0728) model time 0.2161 (1.6226) loss 3.5866 (3.7019) grad_norm 2.7947 (2.5711) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][80/1251] eta 0:17:10 lr 0.000684 wd 0.0500 time 0.2146 (0.8801) data time 0.0007 (0.0328) model time 0.2139 (0.8472) loss 3.7322 (3.5144) grad_norm 2.7712 (2.6640) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][90/1251] eta 0:12:32 lr 0.000684 wd 0.0500 time 0.2198 (0.6481) data time 0.0008 (0.0215) model time 0.2190 (0.6266) loss 3.5260 (3.5183) grad_norm 2.5653 (2.6500) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][100/1251] eta 0:10:18 lr 0.000684 wd 0.0500 time 0.2203 (0.5377) data time 0.0010 (0.0161) model time 0.2193 (0.5217) loss 3.4370 (3.4798) grad_norm 2.1051 (2.5867) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][110/1251] eta 0:08:59 lr 0.000684 wd 0.0500 time 0.2196 (0.4730) data time 0.0009 (0.0129) model time 0.2188 (0.4601) loss 3.8371 (3.4743) grad_norm 2.0264 (2.5242) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][120/1251] eta 0:08:07 lr 0.000684 wd 0.0500 time 0.2251 (0.4313) data time 0.0007 (0.0109) model time 0.2244 (0.4204) loss 2.8296 (3.4654) grad_norm 3.4764 (2.5370) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][130/1251] eta 0:07:29 lr 0.000684 wd 0.0500 time 0.2226 (0.4012) data time 0.0007 (0.0095) model time 0.2220 (0.3917) loss 2.5939 (3.4331) grad_norm 2.1976 (2.5687) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][140/1251] eta 0:07:01 lr 0.000684 wd 0.0500 time 0.2272 (0.3793) data time 0.0007 (0.0084) model time 0.2265 (0.3709) loss 2.6902 (3.4057) grad_norm 2.1597 (2.5038) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][150/1251] eta 0:06:39 lr 0.000683 wd 0.0500 time 0.2290 (0.3624) data time 0.0009 (0.0075) model time 0.2282 (0.3549) loss 3.3140 (3.3684) grad_norm 3.1047 (2.4752) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][160/1251] eta 0:06:20 lr 0.000683 wd 0.0500 time 0.2280 (0.3489) data time 0.0006 (0.0069) model time 0.2274 (0.3420) loss 3.9458 (3.3756) grad_norm 2.9365 (2.4755) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][170/1251] eta 0:06:05 lr 0.000683 wd 0.0500 time 0.2249 (0.3378) data time 0.0007 (0.0063) model time 0.2242 (0.3315) loss 2.3648 
(3.3846) grad_norm 1.7255 (2.4822) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][180/1251] eta 0:05:51 lr 0.000683 wd 0.0500 time 0.2300 (0.3286) data time 0.0009 (0.0059) model time 0.2291 (0.3227) loss 3.3726 (3.3786) grad_norm 2.9410 (2.4676) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][190/1251] eta 0:05:40 lr 0.000683 wd 0.0500 time 0.2283 (0.3207) data time 0.0007 (0.0055) model time 0.2277 (0.3152) loss 3.3756 (3.3705) grad_norm 1.7830 (2.4316) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][200/1251] eta 0:05:30 lr 0.000683 wd 0.0500 time 0.2188 (0.3140) data time 0.0009 (0.0052) model time 0.2178 (0.3088) loss 3.3947 (3.3664) grad_norm 1.6327 (2.4110) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][210/1251] eta 0:05:20 lr 0.000683 wd 0.0500 time 0.2223 (0.3082) data time 0.0007 (0.0049) model time 0.2216 (0.3033) loss 3.5572 (3.3607) grad_norm 1.7338 (2.4000) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][220/1251] eta 0:05:12 lr 0.000683 wd 0.0500 time 0.2244 (0.3035) data time 0.0008 (0.0049) model time 0.2237 (0.2986) loss 2.9096 (3.3502) grad_norm 2.4819 (2.3857) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][230/1251] eta 0:05:05 lr 0.000683 wd 0.0500 time 0.2216 (0.2991) data time 0.0009 (0.0048) model time 0.2207 (0.2944) loss 3.8662 (3.3579) grad_norm 1.9792 (2.3628) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][240/1251] eta 0:04:58 lr 
0.000683 wd 0.0500 time 0.2248 (0.2951) data time 0.0009 (0.0045) model time 0.2239 (0.2906) loss 2.6776 (3.3405) grad_norm 2.9111 (2.3680) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][250/1251] eta 0:04:51 lr 0.000683 wd 0.0500 time 0.2216 (0.2916) data time 0.0007 (0.0044) model time 0.2210 (0.2872) loss 3.7063 (3.3334) grad_norm 2.1325 (2.3555) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][260/1251] eta 0:04:45 lr 0.000683 wd 0.0500 time 0.2222 (0.2885) data time 0.0009 (0.0042) model time 0.2212 (0.2842) loss 2.8031 (3.3228) grad_norm 2.3330 (2.3506) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][270/1251] eta 0:04:40 lr 0.000683 wd 0.0500 time 0.2259 (0.2857) data time 0.0009 (0.0040) model time 0.2250 (0.2817) loss 3.5351 (3.3092) grad_norm 1.5892 (2.3455) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][280/1251] eta 0:04:34 lr 0.000683 wd 0.0500 time 0.2232 (0.2830) data time 0.0008 (0.0039) model time 0.2224 (0.2791) loss 3.0540 (3.3055) grad_norm 1.7918 (2.3449) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 02:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 02:49:42 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 02:49:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 04:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 04:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 04:17:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 04:17:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 04:17:42 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-27 04:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-27 04:17:56 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-27 04:17:57 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-27 04:17:59 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-27 04:17:59 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 127)
[2024-08-27 04:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-27 05:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 05:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 05:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 05:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 05:27:34 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-27 05:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-27 05:27:44 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-27 05:27:45 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 05:27:46 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 05:27:46 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 127) [2024-08-27 05:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 05:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][290/1251] eta 0:40:19 lr 0.000683 wd 0.0500 time 0.2475 (2.5175) data time 0.0007 (0.1517) model time 0.2468 (2.3658) loss 4.2619 (3.8686) grad_norm 1.9160 (2.6320) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][300/1251] eta 0:17:23 lr 0.000683 wd 0.0500 time 0.2502 (1.0970) data time 0.0009 (0.0575) model time 0.2493 (1.0394) loss 3.5725 (3.6325) grad_norm 1.8305 (2.1203) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][310/1251] eta 0:12:03 lr 0.000683 wd 0.0500 time 0.2367 (0.7685) data time 0.0008 (0.0358) model time 0.2359 (0.7327) loss 3.2285 (3.6040) grad_norm 1.8566 (2.1836) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][320/1251] eta 0:09:39 lr 0.000683 wd 0.0500 time 0.2426 (0.6222) data time 0.0010 (0.0261) model time 0.2416 (0.5961) loss 3.4371 (3.5573) grad_norm 2.4879 (2.1836) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][330/1251] eta 0:08:16 lr 0.000683 wd 0.0500 time 0.2306 (0.5396) data time 0.0010 (0.0208) model time 0.2296 (0.5187) loss 3.2177 (3.4810) grad_norm 1.6436 (2.2311) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [127/300][340/1251] eta 0:07:23 lr 0.000683 wd 0.0500 time 0.2477 (0.4865) data time 0.0008 (0.0173) model time 0.2469 (0.4693) loss 3.7716 (3.4872) grad_norm 1.7362 (2.2238) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][350/1251] eta 0:06:46 lr 0.000683 wd 0.0500 time 0.2609 (0.4507) data time 0.0007 (0.0148) model time 0.2602 (0.4359) loss 3.0468 (3.4580) grad_norm 2.9879 (2.2439) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][360/1251] eta 0:06:17 lr 0.000683 wd 0.0500 time 0.2483 (0.4233) data time 0.0010 (0.0130) model time 0.2473 (0.4103) loss 3.7508 (3.4285) grad_norm 1.8201 (2.2149) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][370/1251] eta 0:05:54 lr 0.000683 wd 0.0500 time 0.2394 (0.4026) data time 0.0008 (0.0116) model time 0.2386 (0.3910) loss 2.6389 (3.3943) grad_norm 1.9306 (2.2193) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][380/1251] eta 0:05:36 lr 0.000683 wd 0.0500 time 0.2468 (0.3860) data time 0.0008 (0.0105) model time 0.2460 (0.3755) loss 3.4792 (3.4043) grad_norm 2.6259 (2.2006) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][390/1251] eta 0:05:20 lr 0.000683 wd 0.0500 time 0.2524 (0.3727) data time 0.0009 (0.0096) model time 0.2515 (0.3631) loss 3.5397 (3.4318) grad_norm 1.8545 (2.1852) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][400/1251] eta 0:05:07 lr 0.000682 wd 0.0500 time 0.2458 (0.3617) data time 0.0007 (0.0088) model time 0.2451 (0.3528) loss 3.9818 (3.4097) grad_norm 2.2172 (2.1785) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][410/1251] eta 0:04:56 lr 0.000682 wd 0.0500 time 0.2458 (0.3525) data time 0.0007 (0.0082) model time 0.2451 (0.3443) loss 2.4545 (3.3929) grad_norm 1.8473 (2.2147) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][420/1251] eta 0:04:46 lr 0.000682 wd 0.0500 time 0.2393 (0.3445) data time 0.0012 (0.0077) model time 0.2381 (0.3368) loss 2.8905 (3.3969) grad_norm 1.5966 (2.2341) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][430/1251] eta 0:04:37 lr 0.000682 wd 0.0500 time 0.2460 (0.3378) data time 0.0010 (0.0073) model time 0.2450 (0.3305) loss 2.4073 (3.3808) grad_norm 2.5980 (2.2726) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][440/1251] eta 0:04:29 lr 0.000682 wd 0.0500 time 0.2360 (0.3318) data time 0.0009 (0.0069) model time 0.2351 (0.3249) loss 3.6334 (3.3881) grad_norm 1.5330 (2.2986) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][450/1251] eta 0:04:21 lr 0.000682 wd 0.0500 time 0.2391 (0.3264) data time 0.0010 (0.0065) model time 0.2381 (0.3199) loss 3.6230 (3.3859) grad_norm 3.3596 (2.3130) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 05:28:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 05:28:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 05:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 05:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 05:32:37 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 05:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 05:32:47 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 05:32:48 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 05:32:49 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 05:32:49 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 127) [2024-08-27 05:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 05:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][460/1251] eta 0:31:28 lr 0.000682 wd 0.0500 time 0.2376 (2.3875) data time 0.0012 (0.1588) model time 0.2364 (2.2287) loss 3.3884 (3.7662) grad_norm 2.9009 (2.2078) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][470/1251] eta 0:14:38 lr 0.000682 wd 0.0500 time 0.2449 (1.1246) data time 0.0011 (0.0660) model time 0.2438 (1.0586) loss 3.0991 (3.4960) grad_norm 4.5086 (2.3720) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][480/1251] eta 0:10:12 lr 0.000682 wd 0.0500 time 0.2310 (0.7949) data time 0.0010 (0.0419) model time 0.2300 (0.7529) loss 3.7677 
(3.5419) grad_norm 2.4610 (2.3912) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][490/1251] eta 0:08:09 lr 0.000682 wd 0.0500 time 0.2343 (0.6437) data time 0.0010 (0.0309) model time 0.2333 (0.6128) loss 3.4690 (3.5154) grad_norm 1.8817 (2.3256) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][500/1251] eta 0:06:58 lr 0.000682 wd 0.0500 time 0.2339 (0.5573) data time 0.0009 (0.0245) model time 0.2330 (0.5328) loss 3.5880 (3.4717) grad_norm 1.9605 (2.2706) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][510/1251] eta 0:06:11 lr 0.000682 wd 0.0500 time 0.2434 (0.5015) data time 0.0011 (0.0204) model time 0.2423 (0.4811) loss 3.4327 (3.4378) grad_norm 1.7634 (2.3470) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][520/1251] eta 0:05:38 lr 0.000682 wd 0.0500 time 0.2468 (0.4626) data time 0.0010 (0.0177) model time 0.2458 (0.4449) loss 3.4224 (3.4014) grad_norm 2.3093 (2.3558) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][530/1251] eta 0:05:12 lr 0.000682 wd 0.0500 time 0.2489 (0.4332) data time 0.0010 (0.0155) model time 0.2479 (0.4177) loss 3.5841 (3.3854) grad_norm 2.2603 (2.3132) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][540/1251] eta 0:04:51 lr 0.000682 wd 0.0500 time 0.2308 (0.4104) data time 0.0011 (0.0139) model time 0.2298 (0.3965) loss 3.4830 (3.3592) grad_norm 2.6510 (2.2960) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][550/1251] eta 0:04:35 lr 
0.000682 wd 0.0500 time 0.2451 (0.3924) data time 0.0012 (0.0126) model time 0.2439 (0.3799) loss 3.6236 (3.3781) grad_norm 3.7240 (2.3080) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][560/1251] eta 0:04:21 lr 0.000682 wd 0.0500 time 0.2351 (0.3778) data time 0.0007 (0.0115) model time 0.2344 (0.3663) loss 3.4858 (3.4036) grad_norm 2.4907 (2.2962) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][570/1251] eta 0:04:09 lr 0.000682 wd 0.0500 time 0.2448 (0.3658) data time 0.0010 (0.0106) model time 0.2439 (0.3552) loss 3.5395 (3.3874) grad_norm 2.0390 (2.2948) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][580/1251] eta 0:03:58 lr 0.000682 wd 0.0500 time 0.2512 (0.3558) data time 0.0010 (0.0098) model time 0.2503 (0.3460) loss 3.6954 (3.3742) grad_norm 1.6742 (2.2816) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][590/1251] eta 0:03:49 lr 0.000682 wd 0.0500 time 0.2361 (0.3473) data time 0.0009 (0.0092) model time 0.2352 (0.3381) loss 2.7252 (3.3678) grad_norm 2.4107 (2.2611) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][600/1251] eta 0:03:41 lr 0.000682 wd 0.0500 time 0.2357 (0.3395) data time 0.0012 (0.0086) model time 0.2345 (0.3309) loss 3.3172 (3.3582) grad_norm 2.0143 (2.2680) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][610/1251] eta 0:03:33 lr 0.000682 wd 0.0500 time 0.2588 (0.3331) data time 0.0011 (0.0082) model time 0.2578 (0.3249) loss 3.0496 (3.3620) grad_norm 2.0187 (2.3186) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 
05:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][620/1251] eta 0:03:26 lr 0.000682 wd 0.0500 time 0.2290 (0.3271) data time 0.0011 (0.0077) model time 0.2280 (0.3194) loss 3.2895 (3.3640) grad_norm 3.3632 (2.3417) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][630/1251] eta 0:03:19 lr 0.000682 wd 0.0500 time 0.2371 (0.3220) data time 0.0011 (0.0074) model time 0.2360 (0.3146) loss 3.3427 (3.3572) grad_norm 1.8617 (2.3466) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][640/1251] eta 0:03:13 lr 0.000681 wd 0.0500 time 0.2472 (0.3174) data time 0.0007 (0.0070) model time 0.2464 (0.3103) loss 3.1432 (3.3531) grad_norm 2.1775 (2.3455) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][650/1251] eta 0:03:08 lr 0.000681 wd 0.0500 time 0.2342 (0.3132) data time 0.0007 (0.0067) model time 0.2334 (0.3065) loss 3.5647 (3.3488) grad_norm 2.8308 (2.3422) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][660/1251] eta 0:03:02 lr 0.000681 wd 0.0500 time 0.2439 (0.3096) data time 0.0010 (0.0064) model time 0.2430 (0.3031) loss 3.4919 (3.3320) grad_norm 2.1273 (2.3283) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][670/1251] eta 0:02:57 lr 0.000681 wd 0.0500 time 0.2365 (0.3061) data time 0.0009 (0.0062) model time 0.2356 (0.2999) loss 3.6546 (3.3289) grad_norm 2.4930 (2.3422) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][680/1251] eta 0:02:53 lr 0.000681 wd 0.0500 time 0.2338 (0.3031) data time 0.0011 (0.0060) model time 0.2327 (0.2971) 
loss 3.2056 (3.3300) grad_norm 2.7802 (2.3463) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][690/1251] eta 0:02:48 lr 0.000681 wd 0.0500 time 0.2338 (0.3001) data time 0.0008 (0.0058) model time 0.2330 (0.2944) loss 3.3223 (3.3265) grad_norm 2.6558 (2.3491) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][700/1251] eta 0:02:44 lr 0.000681 wd 0.0500 time 0.2367 (0.2977) data time 0.0010 (0.0056) model time 0.2358 (0.2921) loss 3.3648 (3.3193) grad_norm 2.9821 (2.3475) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][710/1251] eta 0:02:39 lr 0.000681 wd 0.0500 time 0.2317 (0.2952) data time 0.0009 (0.0054) model time 0.2308 (0.2898) loss 2.2627 (3.3050) grad_norm 2.2527 (2.3394) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][720/1251] eta 0:02:35 lr 0.000681 wd 0.0500 time 0.2360 (0.2930) data time 0.0008 (0.0052) model time 0.2352 (0.2878) loss 2.6419 (3.2990) grad_norm 1.6529 (2.3312) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][730/1251] eta 0:02:31 lr 0.000681 wd 0.0500 time 0.2339 (0.2910) data time 0.0011 (0.0051) model time 0.2328 (0.2859) loss 3.4937 (3.3063) grad_norm 2.1413 (2.3301) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][740/1251] eta 0:02:28 lr 0.000681 wd 0.0500 time 0.4718 (0.2899) data time 0.0008 (0.0049) model time 0.4711 (0.2849) loss 4.0523 (3.3041) grad_norm 2.8737 (2.3343) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][750/1251] eta 
0:02:24 lr 0.000681 wd 0.0500 time 0.2325 (0.2881) data time 0.0010 (0.0048) model time 0.2315 (0.2833) loss 2.5940 (3.2966) grad_norm 1.9075 (2.3350) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][760/1251] eta 0:02:20 lr 0.000681 wd 0.0500 time 0.4870 (0.2872) data time 0.0007 (0.0047) model time 0.4862 (0.2825) loss 4.2976 (3.2967) grad_norm 2.6767 (2.3305) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][770/1251] eta 0:02:17 lr 0.000681 wd 0.0500 time 0.2506 (0.2856) data time 0.0011 (0.0046) model time 0.2495 (0.2810) loss 3.5833 (3.3066) grad_norm 2.7393 (2.3290) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][780/1251] eta 0:02:13 lr 0.000681 wd 0.0500 time 0.2312 (0.2842) data time 0.0008 (0.0045) model time 0.2304 (0.2797) loss 2.3872 (3.3121) grad_norm 2.3709 (2.3272) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][790/1251] eta 0:02:10 lr 0.000681 wd 0.0500 time 0.2478 (0.2828) data time 0.0012 (0.0044) model time 0.2466 (0.2784) loss 2.7348 (3.3091) grad_norm 2.1435 (2.3268) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][800/1251] eta 0:02:07 lr 0.000681 wd 0.0500 time 0.2383 (0.2816) data time 0.0008 (0.0043) model time 0.2376 (0.2773) loss 3.5274 (3.3152) grad_norm 2.4269 (2.3197) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][810/1251] eta 0:02:03 lr 0.000681 wd 0.0500 time 0.2330 (0.2804) data time 0.0010 (0.0042) model time 0.2321 (0.2762) loss 2.6047 (3.3165) grad_norm 1.5141 (2.3209) loss_scale 2048.0000 (2048.0000) mem 7377MB 
[2024-08-27 05:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][820/1251] eta 0:02:00 lr 0.000681 wd 0.0500 time 0.2377 (0.2792) data time 0.0007 (0.0041) model time 0.2370 (0.2751) loss 2.9340 (3.3172) grad_norm 2.2077 (2.3195) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][830/1251] eta 0:01:57 lr 0.000681 wd 0.0500 time 0.2360 (0.2781) data time 0.0010 (0.0040) model time 0.2350 (0.2741) loss 3.2804 (3.3142) grad_norm 1.8482 (2.3292) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][840/1251] eta 0:01:53 lr 0.000681 wd 0.0500 time 0.2313 (0.2770) data time 0.0010 (0.0039) model time 0.2303 (0.2731) loss 3.4626 (3.3100) grad_norm 2.5118 (2.3278) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 05:34:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 05:34:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 05:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 05:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 05:38:26 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 05:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 05:38:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 05:38:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 05:38:40 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 05:38:40 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 127) [2024-08-27 05:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 05:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][850/1251] eta 0:22:22 lr 0.000681 wd 0.0500 time 0.2204 (3.3485) data time 0.0007 (0.1215) model time 0.2198 (3.2270) loss 3.7937 (3.6680) grad_norm 2.3133 (2.1101) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][860/1251] eta 0:07:16 lr 0.000681 wd 0.0500 time 0.2253 (1.1161) data time 0.0007 (0.0353) model time 0.2246 (1.0808) loss 3.9177 (3.5527) grad_norm 2.0250 (2.2630) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][870/1251] eta 0:04:43 lr 0.000681 wd 0.0500 time 0.2150 (0.7447) data time 0.0010 (0.0210) model time 0.2140 (0.7236) loss 3.3023 (3.5519) grad_norm 1.9214 (2.1427) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][880/1251] eta 0:03:39 lr 0.000680 wd 0.0500 time 0.2247 (0.5913) data time 0.0008 (0.0151) model time 0.2240 (0.5762) loss 2.6547 (3.5406) grad_norm 2.0463 (2.2747) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:06 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][890/1251] eta 0:03:03 lr 0.000680 wd 0.0500 time 0.2195 (0.5073) data time 0.0007 (0.0119) model time 0.2188 (0.4955) loss 3.4750 (3.5016) grad_norm 1.7477 (2.3067) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][900/1251] eta 0:02:39 lr 0.000680 wd 0.0500 time 0.2240 (0.4551) data time 0.0007 (0.0098) model time 0.2234 (0.4452) loss 4.1725 (3.4951) grad_norm 2.3965 (2.2825) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][910/1251] eta 0:02:22 lr 0.000680 wd 0.0500 time 0.2202 (0.4193) data time 0.0006 (0.0084) model time 0.2196 (0.4109) loss 3.4822 (3.4343) grad_norm 4.6824 (2.3096) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][920/1251] eta 0:02:10 lr 0.000680 wd 0.0500 time 0.2256 (0.3930) data time 0.0008 (0.0074) model time 0.2249 (0.3856) loss 3.5892 (3.4004) grad_norm 1.6077 (2.3196) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][930/1251] eta 0:01:59 lr 0.000680 wd 0.0500 time 0.2202 (0.3729) data time 0.0009 (0.0066) model time 0.2193 (0.3663) loss 3.5840 (3.3671) grad_norm 1.6867 (2.2879) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][940/1251] eta 0:01:51 lr 0.000680 wd 0.0500 time 0.2323 (0.3572) data time 0.0009 (0.0060) model time 0.2314 (0.3512) loss 3.2392 (3.3640) grad_norm 2.5823 (2.2825) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 05:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 05:39:19 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 05:39:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 05:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 05:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 05:44:02 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 05:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 05:49:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 05:49:26 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 05:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 05:49:36 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 05:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 05:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 05:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 05:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 05:57:54 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 05:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 05:58:08 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 05:58:09 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 05:58:10 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 05:58:10 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 127) [2024-08-27 05:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 05:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 05:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 05:59:44 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 05:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 05:59:55 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 05:59:57 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 05:59:58 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 05:59:58 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 127) [2024-08-27 05:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][950/1251] eta 0:16:27 lr 0.000680 wd 0.0500 time 0.2242 (3.2791) data time 0.0007 (0.3161) model time 0.2235 (2.9630) loss 4.0201 (3.8328) grad_norm 2.7232 (2.1040) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][960/1251] eta 0:06:01 lr 0.000680 wd 0.0500 time 0.2206 (1.2422) data time 0.0009 (0.1060) model time 0.2196 (1.1361) loss 3.8086 (3.5965) grad_norm 3.3465 (2.3006) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][970/1251] eta 0:03:54 lr 0.000680 wd 0.0500 time 0.2244 (0.8347) data time 0.0009 (0.0640) model time 0.2234 (0.7706) loss 3.7236 (3.6203) grad_norm 2.1514 (2.3137) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][980/1251] eta 0:02:58 lr 0.000680 wd 0.0500 time 0.2170 (0.6596) data time 0.0009 (0.0460) model time 0.2161 (0.6136) loss 3.4108 (3.5991) grad_norm 2.8964 (2.3259) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][990/1251] eta 0:02:26 lr 0.000680 wd 0.0500 time 0.2248 (0.5626) data time 0.0008 (0.0360) model time 0.2240 (0.5267) loss 3.7833 (3.5540) grad_norm 2.6606 (2.3527) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1000/1251] eta 0:02:05 lr 0.000680 wd 0.0500 time 0.2257 (0.5009) data time 0.0007 (0.0296) model time 0.2250 (0.4713) loss 2.5076 (3.5240) grad_norm 1.5865 (2.3303) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1010/1251] eta 0:01:50 lr 0.000680 wd 0.0500 time 0.2274 (0.4583) data time 0.0009 (0.0252) model time 0.2266 (0.4332) loss 3.9407 (3.5012) grad_norm 1.8687 (2.4763) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1020/1251] eta 0:01:38 lr 0.000680 wd 0.0500 time 0.2219 (0.4271) data time 0.0010 (0.0219) model time 0.2209 (0.4052) loss 2.9449 (3.4464) grad_norm 2.0792 (2.4270) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1030/1251] eta 0:01:29 lr 0.000680 wd 0.0500 time 0.2206 (0.4031) data time 0.0006 (0.0195) model time 0.2201 (0.3836) loss 3.2025 (3.4256) grad_norm 2.0636 (2.3593) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1040/1251] eta 0:01:21 lr 0.000680 wd 0.0500 time 0.2262 (0.3843) data time 0.0008 (0.0175) model time 0.2254 (0.3668) loss 3.4528 (3.4121) grad_norm 1.4599 (2.3214) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1050/1251] eta 0:01:14 lr 0.000680 wd 0.0500 time 0.2198 (0.3690) data time 0.0008 (0.0159) model time 0.2189 (0.3531) loss 
3.1885 (3.4350) grad_norm 2.6737 (2.3171) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1060/1251] eta 0:01:08 lr 0.000680 wd 0.0500 time 0.2202 (0.3563) data time 0.0008 (0.0146) model time 0.2194 (0.3416) loss 2.6370 (3.4180) grad_norm 2.7961 (2.2949) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1070/1251] eta 0:01:02 lr 0.000680 wd 0.0500 time 0.2251 (0.3456) data time 0.0006 (0.0135) model time 0.2245 (0.3321) loss 3.0802 (3.4137) grad_norm 2.3904 (2.2954) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1080/1251] eta 0:00:57 lr 0.000680 wd 0.0500 time 0.2270 (0.3367) data time 0.0006 (0.0126) model time 0.2263 (0.3241) loss 2.8297 (3.4043) grad_norm 1.9912 (2.2716) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1090/1251] eta 0:00:52 lr 0.000680 wd 0.0500 time 0.2242 (0.3289) data time 0.0010 (0.0118) model time 0.2232 (0.3171) loss 3.7174 (3.3905) grad_norm 2.1663 (2.2488) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1100/1251] eta 0:00:48 lr 0.000680 wd 0.0500 time 0.2205 (0.3222) data time 0.0010 (0.0111) model time 0.2194 (0.3111) loss 3.6232 (3.3854) grad_norm 1.6565 (2.2496) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1110/1251] eta 0:00:44 lr 0.000680 wd 0.0500 time 0.2246 (0.3161) data time 0.0009 (0.0105) model time 0.2236 (0.3057) loss 3.8009 (3.3829) grad_norm 1.8041 (2.2665) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1120/1251] eta 
0:00:40 lr 0.000679 wd 0.0500 time 0.2168 (0.3108) data time 0.0007 (0.0099) model time 0.2161 (0.3008) loss 3.2679 (3.3679) grad_norm 2.4054 (2.2754) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1130/1251] eta 0:00:37 lr 0.000679 wd 0.0500 time 0.2189 (0.3060) data time 0.0009 (0.0094) model time 0.2179 (0.2965) loss 3.2203 (3.3597) grad_norm 2.2376 (2.2684) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1140/1251] eta 0:00:33 lr 0.000679 wd 0.0500 time 0.2273 (0.3017) data time 0.0008 (0.0090) model time 0.2265 (0.2927) loss 2.3579 (3.3538) grad_norm 1.6148 (2.2477) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1150/1251] eta 0:00:30 lr 0.000679 wd 0.0500 time 0.2234 (0.2980) data time 0.0008 (0.0086) model time 0.2226 (0.2894) loss 2.8721 (3.3423) grad_norm 2.3062 (2.2337) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1160/1251] eta 0:00:26 lr 0.000679 wd 0.0500 time 0.2214 (0.2945) data time 0.0010 (0.0083) model time 0.2204 (0.2863) loss 3.2125 (3.3322) grad_norm 1.8003 (2.2299) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1170/1251] eta 0:00:23 lr 0.000679 wd 0.0500 time 0.2253 (0.2914) data time 0.0007 (0.0079) model time 0.2246 (0.2835) loss 3.4039 (3.3358) grad_norm 2.9175 (2.2237) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1180/1251] eta 0:00:20 lr 0.000679 wd 0.0500 time 0.2270 (0.2886) data time 0.0008 (0.0076) model time 0.2263 (0.2810) loss 3.5033 (3.3254) grad_norm 2.3463 (2.2254) loss_scale 2048.0000 (2048.0000) mem 
7377MB [2024-08-27 06:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1190/1251] eta 0:00:17 lr 0.000679 wd 0.0500 time 0.2201 (0.2860) data time 0.0011 (0.0073) model time 0.2190 (0.2786) loss 3.3264 (3.3278) grad_norm 1.9401 (2.2157) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1200/1251] eta 0:00:14 lr 0.000679 wd 0.0500 time 0.2233 (0.2835) data time 0.0007 (0.0071) model time 0.2226 (0.2765) loss 3.0785 (3.3221) grad_norm 1.8998 (2.2270) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1210/1251] eta 0:00:11 lr 0.000679 wd 0.0500 time 0.2197 (0.2813) data time 0.0006 (0.0069) model time 0.2191 (0.2744) loss 2.6497 (3.3101) grad_norm 2.9367 (2.2360) loss_scale 4096.0000 (2055.7283) mem 7377MB [2024-08-27 06:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1220/1251] eta 0:00:08 lr 0.000679 wd 0.0500 time 0.2200 (0.2792) data time 0.0009 (0.0066) model time 0.2191 (0.2726) loss 3.6456 (3.3133) grad_norm 1.7142 (2.2322) loss_scale 4096.0000 (2129.9200) mem 7377MB [2024-08-27 06:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1230/1251] eta 0:00:05 lr 0.000679 wd 0.0500 time 0.2173 (0.2772) data time 0.0009 (0.0064) model time 0.2164 (0.2708) loss 3.2643 (3.3070) grad_norm 1.6373 (2.2266) loss_scale 4096.0000 (2198.9053) mem 7377MB [2024-08-27 06:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1240/1251] eta 0:00:03 lr 0.000679 wd 0.0500 time 0.2159 (0.2759) data time 0.0006 (0.0063) model time 0.2153 (0.2696) loss 3.2722 (3.3033) grad_norm 1.6183 (2.2222) loss_scale 4096.0000 (2263.2136) mem 7377MB [2024-08-27 06:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [127/300][1250/1251] eta 0:00:00 lr 0.000679 wd 0.0500 time 0.2078 (0.2739) data time 0.0006 (0.0061) 
model time 0.2072 (0.2678) loss 2.8905 (3.2972) grad_norm nan (nan) loss_scale 2048.0000 (2316.5902) mem 7377MB [2024-08-27 06:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 127 training takes 0:01:23 [2024-08-27 06:01:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:01:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.389 (0.389) Loss 0.4998 (0.4998) Acc@1 91.797 (91.797) Acc@5 97.949 (97.949) Mem 7377MB [2024-08-27 06:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.067 (0.099) Loss 0.7354 (0.7530) Acc@1 85.449 (83.620) Acc@5 96.973 (96.626) Mem 7377MB [2024-08-27 06:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.084) Loss 1.0781 (0.7899) Acc@1 71.777 (82.310) Acc@5 93.652 (96.563) Mem 7377MB [2024-08-27 06:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.064 (0.079) Loss 1.4131 (0.8940) Acc@1 65.430 (79.940) Acc@5 89.160 (95.316) Mem 7377MB [2024-08-27 06:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.075) Loss 1.2178 (0.9509) Acc@1 70.508 (78.456) Acc@5 92.480 (94.622) Mem 7377MB [2024-08-27 06:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.146 Acc@5 94.536 [2024-08-27 06:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.1% [2024-08-27 06:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.745 (0.745) Loss 0.4192 (0.4192) Acc@1 92.090 (92.090) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-27 06:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.135) Loss 0.6670 (0.6603) Acc@1 86.621 (85.813) Acc@5 
96.777 (97.212) Mem 7377MB [2024-08-27 06:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.063 (0.104) Loss 0.9351 (0.6827) Acc@1 78.027 (84.863) Acc@5 95.117 (97.266) Mem 7377MB [2024-08-27 06:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.068 (0.092) Loss 1.1963 (0.7764) Acc@1 69.434 (82.491) Acc@5 91.406 (96.207) Mem 7377MB [2024-08-27 06:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.085) Loss 1.0762 (0.8257) Acc@1 72.949 (81.086) Acc@5 93.262 (95.660) Mem 7377MB [2024-08-27 06:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.708 Acc@5 95.618 [2024-08-27 06:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-27 06:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.71% [2024-08-27 06:01:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 06:01:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 06:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][0/1251] eta 0:15:51 lr 0.000679 wd 0.0500 time 0.7604 (0.7604) data time 0.5297 (0.5297) model time 0.0000 (0.0000) loss 1.9096 (1.9096) grad_norm 3.2996 (3.2996) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-27 06:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][10/1251] eta 0:05:38 lr 0.000679 wd 0.0500 time 0.2251 (0.2731) data time 0.0008 (0.0490) model time 0.0000 (0.0000) loss 3.9961 (3.4284) grad_norm 2.2023 (2.7587) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][20/1251] eta 0:05:08 lr 0.000679 wd 0.0500 time 0.2262 (0.2507) data time 0.0009 (0.0261) model time 0.0000 (0.0000) loss 3.8125 (3.4673) grad_norm 2.9992 (2.5347) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][30/1251] eta 0:04:55 lr 0.000679 wd 0.0500 time 0.2335 (0.2423) data time 0.0011 (0.0181) model time 0.0000 (0.0000) loss 3.1079 (3.3441) grad_norm 2.6802 (2.4335) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][40/1251] eta 0:04:48 lr 0.000679 wd 0.0500 time 0.2187 (0.2379) data time 0.0009 (0.0139) model time 0.0000 (0.0000) loss 3.6360 (3.3497) grad_norm 2.9119 (2.4177) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][50/1251] eta 0:04:43 lr 0.000679 wd 0.0500 time 0.2305 (0.2357) data time 0.0006 (0.0113) model time 0.0000 (0.0000) loss 2.4867 (3.3287) grad_norm 3.0507 (2.5035) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][60/1251] eta 0:04:38 lr 0.000679 wd 0.0500 time 0.2187 (0.2336) data time 0.0008 (0.0096) model time 0.2179 
(0.2222) loss 3.5198 (3.3141) grad_norm 1.8004 (2.6024) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][70/1251] eta 0:04:33 lr 0.000679 wd 0.0500 time 0.2227 (0.2320) data time 0.0006 (0.0084) model time 0.2221 (0.2217) loss 2.7731 (3.2907) grad_norm 1.7814 (2.5151) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][80/1251] eta 0:04:30 lr 0.000679 wd 0.0500 time 0.2254 (0.2309) data time 0.0009 (0.0074) model time 0.2245 (0.2218) loss 3.3274 (3.2646) grad_norm 1.9648 (2.4662) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][90/1251] eta 0:04:27 lr 0.000679 wd 0.0500 time 0.2214 (0.2301) data time 0.0006 (0.0067) model time 0.2208 (0.2220) loss 3.9446 (3.2833) grad_norm 2.6088 (2.4477) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][100/1251] eta 0:04:24 lr 0.000679 wd 0.0500 time 0.2189 (0.2294) data time 0.0008 (0.0062) model time 0.2181 (0.2221) loss 3.8150 (3.3136) grad_norm 1.8501 (2.4403) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][110/1251] eta 0:04:21 lr 0.000678 wd 0.0500 time 0.2222 (0.2288) data time 0.0006 (0.0057) model time 0.2216 (0.2221) loss 3.1962 (3.3237) grad_norm 3.1147 (2.4391) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][120/1251] eta 0:04:18 lr 0.000678 wd 0.0500 time 0.2231 (0.2284) data time 0.0009 (0.0053) model time 0.2223 (0.2222) loss 3.7704 (3.3301) grad_norm 1.9986 (2.4072) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][130/1251] 
eta 0:04:15 lr 0.000678 wd 0.0500 time 0.2326 (0.2281) data time 0.0007 (0.0050) model time 0.2319 (0.2223) loss 2.7491 (3.3449) grad_norm 2.9812 (2.3959) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][140/1251] eta 0:04:13 lr 0.000678 wd 0.0500 time 0.2283 (0.2278) data time 0.0008 (0.0047) model time 0.2275 (0.2225) loss 3.4457 (3.3547) grad_norm 2.5072 (2.3676) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][150/1251] eta 0:04:10 lr 0.000678 wd 0.0500 time 0.2227 (0.2275) data time 0.0006 (0.0044) model time 0.2221 (0.2224) loss 3.2255 (3.3458) grad_norm 2.5238 (2.3742) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][160/1251] eta 0:04:07 lr 0.000678 wd 0.0500 time 0.2244 (0.2273) data time 0.0006 (0.0042) model time 0.2238 (0.2225) loss 2.7309 (3.3268) grad_norm 2.1906 (2.3907) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][170/1251] eta 0:04:05 lr 0.000678 wd 0.0500 time 0.2229 (0.2271) data time 0.0007 (0.0040) model time 0.2222 (0.2226) loss 2.7889 (3.3136) grad_norm 1.7608 (2.3674) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][180/1251] eta 0:04:03 lr 0.000678 wd 0.0500 time 0.2237 (0.2269) data time 0.0009 (0.0038) model time 0.2228 (0.2226) loss 3.7965 (3.3283) grad_norm 2.0968 (2.3536) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][190/1251] eta 0:04:00 lr 0.000678 wd 0.0500 time 0.2215 (0.2267) data time 0.0007 (0.0037) model time 0.2208 (0.2225) loss 3.0215 (3.3234) grad_norm 1.6360 (2.3339) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-27 06:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][200/1251] eta 0:03:58 lr 0.000678 wd 0.0500 time 0.2292 (0.2265) data time 0.0009 (0.0035) model time 0.2284 (0.2225) loss 3.6177 (3.3281) grad_norm 2.2925 (2.3317) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][210/1251] eta 0:03:55 lr 0.000678 wd 0.0500 time 0.2239 (0.2264) data time 0.0007 (0.0034) model time 0.2232 (0.2226) loss 2.9669 (3.3404) grad_norm 2.5638 (2.3282) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][220/1251] eta 0:03:53 lr 0.000678 wd 0.0500 time 0.2290 (0.2264) data time 0.0008 (0.0033) model time 0.2282 (0.2227) loss 1.6401 (3.3216) grad_norm 2.0162 (2.3239) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][230/1251] eta 0:03:50 lr 0.000678 wd 0.0500 time 0.2273 (0.2262) data time 0.0008 (0.0032) model time 0.2266 (0.2226) loss 3.3323 (3.3145) grad_norm 6.2537 (2.3359) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][240/1251] eta 0:03:48 lr 0.000678 wd 0.0500 time 0.2222 (0.2260) data time 0.0007 (0.0031) model time 0.2215 (0.2225) loss 2.6182 (3.3095) grad_norm 3.3418 (2.3312) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][250/1251] eta 0:03:46 lr 0.000678 wd 0.0500 time 0.2225 (0.2259) data time 0.0008 (0.0030) model time 0.2217 (0.2225) loss 3.4626 (3.3212) grad_norm 2.2750 (2.3213) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][260/1251] eta 0:03:43 lr 0.000678 wd 0.0500 time 0.2233 (0.2257) data time 0.0007 (0.0030) model time 0.2226 
(0.2224) loss 2.7948 (3.3256) grad_norm 2.2697 (2.3187) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][270/1251] eta 0:03:41 lr 0.000678 wd 0.0500 time 0.2254 (0.2257) data time 0.0008 (0.0029) model time 0.2246 (0.2224) loss 2.7273 (3.3275) grad_norm 2.4309 (2.3206) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][280/1251] eta 0:03:39 lr 0.000678 wd 0.0500 time 0.2266 (0.2256) data time 0.0012 (0.0028) model time 0.2254 (0.2225) loss 4.0372 (3.3363) grad_norm 2.1726 (2.3173) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][290/1251] eta 0:03:36 lr 0.000678 wd 0.0500 time 0.2225 (0.2256) data time 0.0009 (0.0027) model time 0.2215 (0.2225) loss 2.3599 (3.3354) grad_norm 1.7739 (2.3077) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][300/1251] eta 0:03:34 lr 0.000678 wd 0.0500 time 0.2243 (0.2255) data time 0.0007 (0.0027) model time 0.2236 (0.2225) loss 3.1349 (3.3314) grad_norm 1.7576 (2.3109) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][310/1251] eta 0:03:32 lr 0.000678 wd 0.0500 time 0.2248 (0.2254) data time 0.0010 (0.0026) model time 0.2237 (0.2224) loss 3.0941 (3.3310) grad_norm 1.7913 (2.3183) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][320/1251] eta 0:03:29 lr 0.000678 wd 0.0500 time 0.2250 (0.2254) data time 0.0011 (0.0026) model time 0.2239 (0.2225) loss 2.8706 (3.3387) grad_norm 2.4290 (2.3160) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[128/300][330/1251] eta 0:03:27 lr 0.000678 wd 0.0500 time 0.2198 (0.2253) data time 0.0007 (0.0025) model time 0.2191 (0.2224) loss 3.1851 (3.3387) grad_norm 2.1189 (2.3030) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][340/1251] eta 0:03:25 lr 0.000678 wd 0.0500 time 0.2285 (0.2252) data time 0.0009 (0.0025) model time 0.2276 (0.2224) loss 3.4443 (3.3276) grad_norm 2.6861 (2.3110) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][350/1251] eta 0:03:22 lr 0.000677 wd 0.0500 time 0.2246 (0.2252) data time 0.0009 (0.0024) model time 0.2238 (0.2225) loss 2.5985 (3.3267) grad_norm 4.4853 (2.3146) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][360/1251] eta 0:03:20 lr 0.000677 wd 0.0500 time 0.2395 (0.2252) data time 0.0007 (0.0024) model time 0.2388 (0.2225) loss 3.5316 (3.3182) grad_norm 2.7066 (2.3143) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][370/1251] eta 0:03:18 lr 0.000677 wd 0.0500 time 0.2265 (0.2252) data time 0.0008 (0.0024) model time 0.2258 (0.2226) loss 3.0214 (3.3281) grad_norm 1.6297 (2.3091) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][380/1251] eta 0:03:16 lr 0.000677 wd 0.0500 time 0.2336 (0.2253) data time 0.0006 (0.0023) model time 0.2330 (0.2227) loss 2.8824 (3.3267) grad_norm 2.1369 (2.3145) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][390/1251] eta 0:03:13 lr 0.000677 wd 0.0500 time 0.2251 (0.2252) data time 0.0007 (0.0023) model time 0.2243 (0.2227) loss 2.8585 (3.3261) grad_norm 3.4295 (2.3194) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-27 06:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][400/1251] eta 0:03:11 lr 0.000677 wd 0.0500 time 0.2274 (0.2253) data time 0.0008 (0.0022) model time 0.2266 (0.2228) loss 3.9768 (3.3269) grad_norm 2.0029 (2.3203) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][410/1251] eta 0:03:09 lr 0.000677 wd 0.0500 time 0.2238 (0.2252) data time 0.0010 (0.0022) model time 0.2227 (0.2228) loss 4.0271 (3.3172) grad_norm 2.2720 (2.3257) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][420/1251] eta 0:03:07 lr 0.000677 wd 0.0500 time 0.2270 (0.2252) data time 0.0007 (0.0022) model time 0.2262 (0.2228) loss 3.4987 (3.3170) grad_norm 2.4341 (2.3220) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][430/1251] eta 0:03:04 lr 0.000677 wd 0.0500 time 0.2245 (0.2252) data time 0.0007 (0.0022) model time 0.2237 (0.2228) loss 3.7834 (3.3200) grad_norm 1.7837 (2.3186) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][440/1251] eta 0:03:02 lr 0.000677 wd 0.0500 time 0.2251 (0.2251) data time 0.0006 (0.0021) model time 0.2245 (0.2228) loss 3.7557 (3.3205) grad_norm 1.9566 (2.3149) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][450/1251] eta 0:03:00 lr 0.000677 wd 0.0500 time 0.2313 (0.2251) data time 0.0007 (0.0021) model time 0.2306 (0.2228) loss 3.9126 (3.3162) grad_norm 1.9319 (2.3145) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][460/1251] eta 0:02:57 lr 0.000677 wd 0.0500 time 0.2213 (0.2250) data time 0.0008 
(0.0021) model time 0.2205 (0.2228) loss 3.6751 (3.3226) grad_norm 1.7204 (2.3081) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][470/1251] eta 0:02:55 lr 0.000677 wd 0.0500 time 0.2196 (0.2250) data time 0.0009 (0.0020) model time 0.2187 (0.2227) loss 3.7309 (3.3220) grad_norm 2.3845 (2.3067) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][480/1251] eta 0:02:53 lr 0.000677 wd 0.0500 time 0.2214 (0.2249) data time 0.0008 (0.0020) model time 0.2206 (0.2227) loss 3.0573 (3.3185) grad_norm 1.7962 (2.3090) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][490/1251] eta 0:02:51 lr 0.000677 wd 0.0500 time 0.3898 (0.2252) data time 0.0008 (0.0020) model time 0.3890 (0.2231) loss 2.3288 (3.3187) grad_norm 1.7884 (2.3088) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][500/1251] eta 0:02:49 lr 0.000677 wd 0.0500 time 0.2414 (0.2253) data time 0.0006 (0.0020) model time 0.2409 (0.2232) loss 2.1607 (3.3152) grad_norm 2.2414 (2.3070) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][510/1251] eta 0:02:46 lr 0.000677 wd 0.0500 time 0.2220 (0.2252) data time 0.0008 (0.0020) model time 0.2212 (0.2231) loss 3.8451 (3.3099) grad_norm 1.7785 (2.3000) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][520/1251] eta 0:02:44 lr 0.000677 wd 0.0500 time 0.2197 (0.2252) data time 0.0008 (0.0019) model time 0.2189 (0.2231) loss 2.5801 (3.3040) grad_norm 2.3818 (2.3034) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [128/300][530/1251] eta 0:02:42 lr 0.000677 wd 0.0500 time 0.2210 (0.2252) data time 0.0007 (0.0019) model time 0.2203 (0.2231) loss 3.7525 (3.3034) grad_norm 2.3722 (2.3072) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][540/1251] eta 0:02:40 lr 0.000677 wd 0.0500 time 0.2240 (0.2252) data time 0.0006 (0.0019) model time 0.2234 (0.2232) loss 3.6393 (3.2991) grad_norm 2.1618 (2.3093) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][550/1251] eta 0:02:37 lr 0.000677 wd 0.0500 time 0.2213 (0.2251) data time 0.0009 (0.0019) model time 0.2204 (0.2231) loss 3.9866 (3.3007) grad_norm 1.8586 (2.3025) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][560/1251] eta 0:02:35 lr 0.000677 wd 0.0500 time 0.2159 (0.2251) data time 0.0010 (0.0019) model time 0.2150 (0.2231) loss 3.3509 (3.2992) grad_norm 2.6677 (2.2990) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][570/1251] eta 0:02:33 lr 0.000677 wd 0.0500 time 0.2250 (0.2250) data time 0.0010 (0.0018) model time 0.2240 (0.2231) loss 3.4319 (3.2989) grad_norm 2.0748 (2.2945) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][580/1251] eta 0:02:31 lr 0.000677 wd 0.0500 time 0.2305 (0.2251) data time 0.0007 (0.0018) model time 0.2299 (0.2231) loss 4.2357 (3.2995) grad_norm 3.4476 (2.2952) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][590/1251] eta 0:02:28 lr 0.000676 wd 0.0500 time 0.2233 (0.2251) data time 0.0009 (0.0018) model time 0.2224 (0.2231) loss 3.7337 (3.3011) grad_norm 1.9355 (2.2984) loss_scale 
2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][600/1251] eta 0:02:26 lr 0.000676 wd 0.0500 time 0.2171 (0.2250) data time 0.0011 (0.0018) model time 0.2160 (0.2232) loss 3.6903 (3.3002) grad_norm 1.9371 (2.3028) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][610/1251] eta 0:02:24 lr 0.000676 wd 0.0500 time 0.2219 (0.2250) data time 0.0009 (0.0018) model time 0.2210 (0.2231) loss 3.8595 (3.3005) grad_norm 1.8335 (2.2997) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][620/1251] eta 0:02:21 lr 0.000676 wd 0.0500 time 0.2166 (0.2250) data time 0.0009 (0.0018) model time 0.2157 (0.2231) loss 3.2932 (3.3010) grad_norm 1.8815 (2.2986) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][630/1251] eta 0:02:19 lr 0.000676 wd 0.0500 time 0.2209 (0.2250) data time 0.0008 (0.0017) model time 0.2201 (0.2231) loss 2.3401 (3.3001) grad_norm 2.0463 (2.2949) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][640/1251] eta 0:02:17 lr 0.000676 wd 0.0500 time 0.2184 (0.2250) data time 0.0011 (0.0017) model time 0.2173 (0.2231) loss 3.8051 (3.2980) grad_norm 2.6284 (2.2961) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][650/1251] eta 0:02:15 lr 0.000676 wd 0.0500 time 0.2279 (0.2250) data time 0.0006 (0.0017) model time 0.2273 (0.2231) loss 3.6024 (3.2926) grad_norm 2.3690 (2.2982) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][660/1251] eta 0:02:12 lr 0.000676 wd 0.0500 time 0.2199 (0.2249) data time 
0.0009 (0.0017) model time 0.2191 (0.2231) loss 3.4360 (3.2922) grad_norm 2.2635 (2.2994) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][670/1251] eta 0:02:10 lr 0.000676 wd 0.0500 time 0.2257 (0.2250) data time 0.0007 (0.0017) model time 0.2250 (0.2232) loss 3.4522 (3.2970) grad_norm 3.0241 (2.2980) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][680/1251] eta 0:02:08 lr 0.000676 wd 0.0500 time 0.2215 (0.2249) data time 0.0009 (0.0017) model time 0.2206 (0.2232) loss 2.2935 (3.2935) grad_norm 3.0859 (2.2997) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][690/1251] eta 0:02:06 lr 0.000676 wd 0.0500 time 0.2254 (0.2249) data time 0.0007 (0.0017) model time 0.2247 (0.2231) loss 3.3766 (3.2945) grad_norm 2.2484 (2.2978) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][700/1251] eta 0:02:03 lr 0.000676 wd 0.0500 time 0.2260 (0.2249) data time 0.0008 (0.0017) model time 0.2252 (0.2232) loss 4.0590 (3.2937) grad_norm 3.2306 (2.3033) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][710/1251] eta 0:02:01 lr 0.000676 wd 0.0500 time 0.2244 (0.2249) data time 0.0006 (0.0016) model time 0.2238 (0.2231) loss 2.9877 (3.2974) grad_norm 2.3382 (2.3025) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][720/1251] eta 0:01:59 lr 0.000676 wd 0.0500 time 0.2164 (0.2249) data time 0.0011 (0.0016) model time 0.2154 (0.2231) loss 3.1697 (3.2932) grad_norm 2.1202 (2.3041) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [128/300][730/1251] eta 0:01:57 lr 0.000676 wd 0.0500 time 0.2175 (0.2249) data time 0.0006 (0.0016) model time 0.2169 (0.2231) loss 4.1951 (3.2927) grad_norm 2.1281 (2.3037) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][740/1251] eta 0:01:54 lr 0.000676 wd 0.0500 time 0.2243 (0.2248) data time 0.0010 (0.0016) model time 0.2233 (0.2231) loss 3.6166 (3.2972) grad_norm 2.3973 (2.2984) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][750/1251] eta 0:01:52 lr 0.000676 wd 0.0500 time 0.2226 (0.2248) data time 0.0008 (0.0016) model time 0.2219 (0.2231) loss 3.1244 (3.2975) grad_norm 1.8117 (2.2991) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][760/1251] eta 0:01:50 lr 0.000676 wd 0.0500 time 0.2277 (0.2248) data time 0.0008 (0.0016) model time 0.2270 (0.2231) loss 2.8884 (3.2995) grad_norm 1.8880 (2.2975) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][770/1251] eta 0:01:48 lr 0.000676 wd 0.0500 time 0.2221 (0.2248) data time 0.0007 (0.0016) model time 0.2213 (0.2231) loss 3.1832 (3.2968) grad_norm 2.5189 (2.2989) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][780/1251] eta 0:01:45 lr 0.000676 wd 0.0500 time 0.2214 (0.2248) data time 0.0007 (0.0016) model time 0.2207 (0.2231) loss 3.5460 (3.2975) grad_norm 2.1151 (2.2984) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][790/1251] eta 0:01:43 lr 0.000676 wd 0.0500 time 0.2364 (0.2248) data time 0.0006 (0.0016) model time 0.2358 (0.2231) loss 3.9137 (3.2972) grad_norm 1.7284 (2.2976) 
loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][800/1251] eta 0:01:41 lr 0.000676 wd 0.0500 time 0.2283 (0.2248) data time 0.0008 (0.0016) model time 0.2275 (0.2231) loss 3.3034 (3.2975) grad_norm 2.4072 (2.2982) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][810/1251] eta 0:01:39 lr 0.000676 wd 0.0500 time 0.2249 (0.2247) data time 0.0009 (0.0015) model time 0.2241 (0.2231) loss 3.8016 (3.2986) grad_norm 1.7616 (2.2983) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][820/1251] eta 0:01:36 lr 0.000676 wd 0.0500 time 0.2322 (0.2247) data time 0.0009 (0.0015) model time 0.2313 (0.2231) loss 3.9776 (3.3005) grad_norm 1.9332 (2.2989) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][830/1251] eta 0:01:34 lr 0.000675 wd 0.0500 time 0.2252 (0.2247) data time 0.0008 (0.0015) model time 0.2244 (0.2231) loss 4.2837 (3.3027) grad_norm 2.2439 (2.3042) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][840/1251] eta 0:01:32 lr 0.000675 wd 0.0500 time 0.2243 (0.2247) data time 0.0006 (0.0015) model time 0.2237 (0.2231) loss 2.5861 (3.3011) grad_norm 2.5497 (2.3004) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][850/1251] eta 0:01:30 lr 0.000675 wd 0.0500 time 0.2222 (0.2247) data time 0.0008 (0.0015) model time 0.2214 (0.2231) loss 3.9568 (3.3015) grad_norm 2.5151 (2.2966) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][860/1251] eta 0:01:27 lr 0.000675 wd 0.0500 time 0.2205 (0.2247) 
data time 0.0006 (0.0015) model time 0.2199 (0.2231) loss 3.7563 (3.2991) grad_norm 1.6067 (2.2957) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][870/1251] eta 0:01:25 lr 0.000675 wd 0.0500 time 0.2260 (0.2247) data time 0.0007 (0.0015) model time 0.2253 (0.2231) loss 3.4986 (3.3022) grad_norm 2.2152 (2.2916) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][880/1251] eta 0:01:23 lr 0.000675 wd 0.0500 time 0.2190 (0.2247) data time 0.0008 (0.0015) model time 0.2182 (0.2231) loss 3.4358 (3.3067) grad_norm 2.3390 (2.2882) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][890/1251] eta 0:01:21 lr 0.000675 wd 0.0500 time 0.2384 (0.2247) data time 0.0009 (0.0015) model time 0.2375 (0.2231) loss 3.3355 (3.3060) grad_norm 2.4491 (2.2901) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][900/1251] eta 0:01:18 lr 0.000675 wd 0.0500 time 0.2204 (0.2249) data time 0.0006 (0.0015) model time 0.2198 (0.2234) loss 3.1872 (3.3098) grad_norm 1.9161 (2.2884) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][910/1251] eta 0:01:16 lr 0.000675 wd 0.0500 time 0.2241 (0.2249) data time 0.0006 (0.0015) model time 0.2235 (0.2234) loss 2.4162 (3.3095) grad_norm 3.5108 (2.3005) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][920/1251] eta 0:01:14 lr 0.000675 wd 0.0500 time 0.2151 (0.2249) data time 0.0007 (0.0015) model time 0.2144 (0.2234) loss 2.7031 (3.3085) grad_norm 1.9697 (2.3006) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [128/300][930/1251] eta 0:01:12 lr 0.000675 wd 0.0500 time 0.2265 (0.2249) data time 0.0007 (0.0015) model time 0.2257 (0.2234) loss 2.0555 (3.3059) grad_norm 2.6430 (2.2984) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][940/1251] eta 0:01:09 lr 0.000675 wd 0.0500 time 0.2212 (0.2249) data time 0.0011 (0.0015) model time 0.2201 (0.2233) loss 3.3557 (3.3070) grad_norm 3.0189 (2.2986) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][950/1251] eta 0:01:07 lr 0.000675 wd 0.0500 time 0.2237 (0.2249) data time 0.0009 (0.0014) model time 0.2228 (0.2234) loss 3.6770 (3.3055) grad_norm 2.7555 (2.2970) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][960/1251] eta 0:01:05 lr 0.000675 wd 0.0500 time 0.2165 (0.2248) data time 0.0010 (0.0014) model time 0.2156 (0.2233) loss 3.8959 (3.3078) grad_norm 3.0691 (2.2994) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][970/1251] eta 0:01:03 lr 0.000675 wd 0.0500 time 0.2198 (0.2248) data time 0.0009 (0.0014) model time 0.2189 (0.2233) loss 3.5760 (3.3089) grad_norm 2.5781 (2.2992) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][980/1251] eta 0:01:00 lr 0.000675 wd 0.0500 time 0.2211 (0.2248) data time 0.0008 (0.0014) model time 0.2203 (0.2234) loss 3.7063 (3.3119) grad_norm 2.1722 (2.3034) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][990/1251] eta 0:00:58 lr 0.000675 wd 0.0500 time 0.2241 (0.2248) data time 0.0009 (0.0014) model time 0.2232 (0.2234) loss 3.4382 (3.3133) grad_norm 
2.8288 (2.3069) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1000/1251] eta 0:00:56 lr 0.000675 wd 0.0500 time 0.2171 (0.2248) data time 0.0008 (0.0014) model time 0.2163 (0.2234) loss 3.4451 (3.3135) grad_norm 1.6442 (2.3048) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1010/1251] eta 0:00:54 lr 0.000675 wd 0.0500 time 0.2024 (0.2250) data time 0.0008 (0.0014) model time 0.2016 (0.2235) loss 3.3963 (3.3111) grad_norm 2.7011 (2.3037) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1020/1251] eta 0:00:51 lr 0.000675 wd 0.0500 time 0.2195 (0.2250) data time 0.0010 (0.0014) model time 0.2184 (0.2236) loss 3.3815 (3.3089) grad_norm 1.8790 (2.3011) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1030/1251] eta 0:00:49 lr 0.000675 wd 0.0500 time 0.2262 (0.2250) data time 0.0010 (0.0014) model time 0.2252 (0.2236) loss 2.6743 (3.3088) grad_norm 2.3026 (2.3005) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1040/1251] eta 0:00:47 lr 0.000675 wd 0.0500 time 0.2188 (0.2250) data time 0.0010 (0.0014) model time 0.2179 (0.2236) loss 3.4707 (3.3095) grad_norm 2.1774 (2.3000) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1050/1251] eta 0:00:45 lr 0.000675 wd 0.0500 time 0.2261 (0.2250) data time 0.0016 (0.0014) model time 0.2245 (0.2236) loss 3.4600 (3.3118) grad_norm 3.5566 (2.2996) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1060/1251] eta 0:00:42 lr 0.000675 wd 
0.0500 time 0.2287 (0.2250) data time 0.0010 (0.0014) model time 0.2277 (0.2236) loss 3.4916 (3.3139) grad_norm 1.7191 (2.2971) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1070/1251] eta 0:00:40 lr 0.000674 wd 0.0500 time 0.2206 (0.2250) data time 0.0008 (0.0014) model time 0.2198 (0.2236) loss 3.0504 (3.3159) grad_norm 1.7332 (2.2952) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1080/1251] eta 0:00:38 lr 0.000674 wd 0.0500 time 0.2214 (0.2250) data time 0.0008 (0.0014) model time 0.2206 (0.2236) loss 3.6028 (3.3163) grad_norm 2.6420 (2.2947) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 06:05:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:05:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:11:47 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 06:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 06:11:56 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 06:11:57 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 06:11:59 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 06:11:59 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 128) [2024-08-27 06:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1090/1251] eta 0:11:22 lr 0.000674 wd 0.0500 time 0.2347 (4.2389) data time 0.0007 (0.1458) model time 0.2340 (4.0931) loss 3.9131 (3.7246) grad_norm 3.2828 (2.9349) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1100/1251] eta 0:03:29 lr 0.000674 wd 0.0500 time 0.2427 (1.3857) data time 0.0008 (0.0424) model time 0.2419 (1.3433) loss 3.7114 (3.5673) grad_norm 3.1947 (2.5144) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1110/1251] eta 0:02:08 lr 0.000674 wd 0.0500 time 0.2429 (0.9098) data time 0.0010 (0.0252) model time 0.2419 (0.8846) loss 3.6286 (3.5676) grad_norm 3.0660 (2.4859) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1120/1251] eta 0:01:33 lr 0.000674 wd 0.0500 time 0.2399 (0.7133) data time 0.0007 (0.0181) model time 0.2392 (0.6952) loss 2.5931 (3.5671) grad_norm 1.9308 (2.4467) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:30 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1130/1251] eta 0:01:13 lr 0.000674 wd 0.0500 time 0.2463 (0.6053) data time 0.0008 (0.0142) model time 0.2455 (0.5911) loss 3.4162 (3.5323) grad_norm 3.0210 (2.5041) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1140/1251] eta 0:00:59 lr 0.000674 wd 0.0500 time 0.2431 (0.5382) data time 0.0008 (0.0118) model time 0.2423 (0.5264) loss 3.7949 (3.5233) grad_norm 2.4567 (2.4926) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1150/1251] eta 0:00:49 lr 0.000674 wd 0.0500 time 0.2441 (0.4917) data time 0.0008 (0.0101) model time 0.2433 (0.4816) loss 3.3529 (3.4633) grad_norm 1.9240 (2.4736) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1160/1251] eta 0:00:41 lr 0.000674 wd 0.0500 time 0.2354 (0.4576) data time 0.0009 (0.0089) model time 0.2345 (0.4488) loss 3.6362 (3.4400) grad_norm 2.1371 (2.4278) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1170/1251] eta 0:00:34 lr 0.000674 wd 0.0500 time 0.2442 (0.4318) data time 0.0012 (0.0080) model time 0.2430 (0.4238) loss 3.6244 (3.4043) grad_norm 1.3587 (2.3794) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1180/1251] eta 0:00:29 lr 0.000674 wd 0.0500 time 0.2383 (0.4113) data time 0.0011 (0.0072) model time 0.2372 (0.4041) loss 3.2532 (3.3926) grad_norm 2.2072 (2.3612) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1190/1251] eta 0:00:24 lr 0.000674 wd 0.0500 time 0.2404 (0.3952) data time 0.0010 (0.0066) model time 0.2394 (0.3886) loss 
3.6910 (3.4133) grad_norm 2.8230 (2.3873) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1200/1251] eta 0:00:19 lr 0.000674 wd 0.0500 time 0.2399 (0.3818) data time 0.0011 (0.0061) model time 0.2388 (0.3757) loss 3.6089 (3.3914) grad_norm 1.8435 (2.3534) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1210/1251] eta 0:00:15 lr 0.000674 wd 0.0500 time 0.2380 (0.3706) data time 0.0012 (0.0057) model time 0.2368 (0.3649) loss 3.1681 (3.3936) grad_norm 2.0098 (2.3307) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1220/1251] eta 0:00:11 lr 0.000674 wd 0.0500 time 0.2438 (0.3611) data time 0.0012 (0.0054) model time 0.2426 (0.3557) loss 3.8424 (3.3983) grad_norm 1.3962 (2.3253) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1230/1251] eta 0:00:07 lr 0.000674 wd 0.0500 time 0.2380 (0.3529) data time 0.0008 (0.0051) model time 0.2373 (0.3478) loss 2.9500 (3.3825) grad_norm 1.5152 (2.3396) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1240/1251] eta 0:00:03 lr 0.000674 wd 0.0500 time 0.2252 (0.3452) data time 0.0006 (0.0048) model time 0.2246 (0.3404) loss 3.1056 (3.3692) grad_norm 1.6989 (2.3269) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [128/300][1250/1251] eta 0:00:00 lr 0.000674 wd 0.0500 time 0.2236 (0.3379) data time 0.0006 (0.0046) model time 0.2230 (0.3333) loss 3.5099 (3.3749) grad_norm 2.8902 (2.3104) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 128 training takes 0:00:55 
[2024-08-27 06:12:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:13:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.389 (0.389) Loss 0.4583 (0.4583) Acc@1 90.527 (90.527) Acc@5 97.949 (97.949) Mem 7377MB [2024-08-27 06:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.104) Loss 0.7715 (0.7408) Acc@1 83.105 (83.381) Acc@5 95.996 (96.680) Mem 7377MB [2024-08-27 06:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.092) Loss 1.0703 (0.7635) Acc@1 74.121 (82.524) Acc@5 93.262 (96.582) Mem 7377MB [2024-08-27 06:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.087) Loss 1.3184 (0.8640) Acc@1 67.676 (80.314) Acc@5 90.137 (95.353) Mem 7377MB [2024-08-27 06:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.081) Loss 1.1709 (0.9255) Acc@1 71.191 (78.713) Acc@5 92.285 (94.655) Mem 7377MB [2024-08-27 06:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.346 Acc@5 94.618 [2024-08-27 06:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.3% [2024-08-27 06:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 78.35% [2024-08-27 06:13:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 06:13:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 06:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.396 (0.396) Loss 0.4172 (0.4172) Acc@1 92.383 (92.383) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-27 06:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.106) Loss 0.6670 (0.6591) Acc@1 86.816 (85.875) Acc@5 96.777 (97.212) Mem 7377MB [2024-08-27 06:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.093) Loss 0.9351 (0.6818) Acc@1 77.832 (84.924) Acc@5 95.117 (97.242) Mem 7377MB [2024-08-27 06:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.088) Loss 1.1943 (0.7755) Acc@1 69.824 (82.567) Acc@5 91.406 (96.217) Mem 7377MB [2024-08-27 06:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0723 (0.8245) Acc@1 73.047 (81.157) Acc@5 93.359 (95.665) Mem 7377MB [2024-08-27 06:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.742 Acc@5 95.620 [2024-08-27 06:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-27 06:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.74% [2024-08-27 06:13:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 06:13:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 06:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][0/1251] eta 0:13:28 lr 0.000674 wd 0.0500 time 0.6459 (0.6459) data time 0.3785 (0.3785) model time 0.0000 (0.0000) loss 3.8390 (3.8390) grad_norm 3.8448 (3.8448) loss_scale 2048.0000 (2048.0000) mem 7383MB [2024-08-27 06:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][10/1251] eta 0:05:42 lr 0.000674 wd 0.0500 time 0.2390 (0.2763) data time 0.0008 (0.0354) model time 0.0000 (0.0000) loss 3.4281 (3.2463) grad_norm 1.8114 (2.4891) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][20/1251] eta 0:05:20 lr 0.000674 wd 0.0500 time 0.2454 (0.2606) data time 0.0012 (0.0191) model time 0.0000 (0.0000) loss 3.3255 (3.2752) grad_norm 2.1833 (2.3051) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][30/1251] eta 0:05:10 lr 0.000674 wd 0.0500 time 0.2346 (0.2544) data time 0.0011 (0.0133) model time 0.0000 (0.0000) loss 1.7100 (3.2167) grad_norm 2.2671 (2.2289) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][40/1251] eta 0:05:04 lr 0.000674 wd 0.0500 time 0.2390 (0.2514) data time 0.0010 (0.0103) model time 0.0000 (0.0000) loss 2.9169 (3.2014) grad_norm 1.7740 (2.2267) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][50/1251] eta 0:05:00 lr 0.000674 wd 0.0500 time 0.2402 (0.2499) data time 0.0011 (0.0085) model time 0.0000 (0.0000) loss 3.5761 (3.1733) grad_norm 2.6191 (2.2198) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][60/1251] eta 0:05:00 lr 0.000673 wd 0.0500 time 0.2346 (0.2525) data time 0.0010 (0.0073) model time 0.2336 
(0.2649) loss 3.5491 (3.2180) grad_norm 1.9485 (2.2154) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][70/1251] eta 0:04:56 lr 0.000673 wd 0.0500 time 0.2400 (0.2511) data time 0.0012 (0.0064) model time 0.2388 (0.2529) loss 3.2374 (3.1980) grad_norm 1.9584 (2.2678) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][80/1251] eta 0:04:52 lr 0.000673 wd 0.0500 time 0.2468 (0.2501) data time 0.0012 (0.0057) model time 0.2456 (0.2495) loss 2.8636 (3.2055) grad_norm 1.6434 (2.2422) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][90/1251] eta 0:04:49 lr 0.000673 wd 0.0500 time 0.2487 (0.2492) data time 0.0008 (0.0052) model time 0.2479 (0.2472) loss 3.3472 (3.1946) grad_norm 2.2363 (2.2300) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][100/1251] eta 0:04:46 lr 0.000673 wd 0.0500 time 0.2400 (0.2486) data time 0.0009 (0.0048) model time 0.2390 (0.2462) loss 2.5318 (3.1789) grad_norm 1.7620 (2.2066) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][110/1251] eta 0:04:42 lr 0.000673 wd 0.0500 time 0.2375 (0.2480) data time 0.0011 (0.0045) model time 0.2365 (0.2453) loss 3.2765 (3.1946) grad_norm 5.1084 (2.2441) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][120/1251] eta 0:04:39 lr 0.000673 wd 0.0500 time 0.2394 (0.2475) data time 0.0011 (0.0042) model time 0.2384 (0.2448) loss 3.3409 (3.2037) grad_norm 2.2247 (2.2808) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][130/1251] 
eta 0:04:39 lr 0.000673 wd 0.0500 time 0.2472 (0.2493) data time 0.0011 (0.0039) model time 0.2461 (0.2478) loss 3.2048 (3.1985) grad_norm 1.5676 (2.2605) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][140/1251] eta 0:04:36 lr 0.000673 wd 0.0500 time 0.2434 (0.2487) data time 0.0011 (0.0037) model time 0.2423 (0.2470) loss 2.4915 (3.1820) grad_norm 1.7750 (2.2649) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][150/1251] eta 0:04:33 lr 0.000673 wd 0.0500 time 0.2577 (0.2483) data time 0.0010 (0.0036) model time 0.2566 (0.2464) loss 3.7964 (3.1886) grad_norm 2.3070 (2.2919) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][160/1251] eta 0:04:30 lr 0.000673 wd 0.0500 time 0.2493 (0.2480) data time 0.0010 (0.0034) model time 0.2483 (0.2460) loss 4.0739 (3.2180) grad_norm 1.8970 (2.3820) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][170/1251] eta 0:04:27 lr 0.000673 wd 0.0500 time 0.2387 (0.2477) data time 0.0010 (0.0033) model time 0.2377 (0.2458) loss 3.8457 (3.2203) grad_norm 2.9783 (2.3645) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][180/1251] eta 0:04:25 lr 0.000673 wd 0.0500 time 0.2460 (0.2475) data time 0.0009 (0.0031) model time 0.2451 (0.2455) loss 2.6521 (3.2300) grad_norm 2.4355 (2.3567) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][190/1251] eta 0:04:22 lr 0.000673 wd 0.0500 time 0.2494 (0.2472) data time 0.0011 (0.0030) model time 0.2484 (0.2451) loss 3.0369 (3.2426) grad_norm 2.1192 (2.3433) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 06:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][200/1251] eta 0:04:19 lr 0.000673 wd 0.0500 time 0.2501 (0.2470) data time 0.0011 (0.0029) model time 0.2491 (0.2450) loss 3.0125 (3.2393) grad_norm 2.3832 (2.3248) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][210/1251] eta 0:04:16 lr 0.000673 wd 0.0500 time 0.2406 (0.2467) data time 0.0007 (0.0029) model time 0.2399 (0.2447) loss 2.6631 (3.2384) grad_norm 2.9263 (2.3319) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][220/1251] eta 0:04:14 lr 0.000673 wd 0.0500 time 0.2380 (0.2465) data time 0.0008 (0.0028) model time 0.2373 (0.2445) loss 2.3242 (3.2219) grad_norm 2.1381 (2.3212) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][230/1251] eta 0:04:11 lr 0.000673 wd 0.0500 time 0.2409 (0.2464) data time 0.0010 (0.0027) model time 0.2399 (0.2443) loss 3.7131 (3.2304) grad_norm 2.5891 (2.3041) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][240/1251] eta 0:04:08 lr 0.000673 wd 0.0500 time 0.2403 (0.2461) data time 0.0010 (0.0026) model time 0.2394 (0.2441) loss 3.2816 (3.2368) grad_norm 1.9582 (2.3024) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][250/1251] eta 0:04:06 lr 0.000673 wd 0.0500 time 0.2432 (0.2459) data time 0.0011 (0.0026) model time 0.2421 (0.2438) loss 3.5725 (3.2443) grad_norm 2.1442 (2.3030) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][260/1251] eta 0:04:03 lr 0.000673 wd 0.0500 time 0.2433 (0.2458) data time 0.0008 (0.0025) model time 0.2425 
(0.2437) loss 3.6901 (3.2508) grad_norm 2.1786 (2.2893) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][270/1251] eta 0:04:00 lr 0.000673 wd 0.0500 time 0.2416 (0.2455) data time 0.0008 (0.0025) model time 0.2409 (0.2434) loss 3.9394 (3.2663) grad_norm 10.9239 (2.3298) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][280/1251] eta 0:03:58 lr 0.000673 wd 0.0500 time 0.2358 (0.2453) data time 0.0012 (0.0024) model time 0.2346 (0.2432) loss 3.2243 (3.2720) grad_norm 2.3458 (2.3306) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][290/1251] eta 0:03:55 lr 0.000673 wd 0.0500 time 0.2465 (0.2453) data time 0.0007 (0.0024) model time 0.2458 (0.2433) loss 3.4246 (3.2717) grad_norm 2.5276 (2.3325) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][300/1251] eta 0:03:53 lr 0.000672 wd 0.0500 time 0.2405 (0.2451) data time 0.0008 (0.0023) model time 0.2397 (0.2431) loss 2.6803 (3.2620) grad_norm 2.0005 (2.3282) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][310/1251] eta 0:03:50 lr 0.000672 wd 0.0500 time 0.2434 (0.2450) data time 0.0009 (0.0023) model time 0.2425 (0.2430) loss 3.2366 (3.2545) grad_norm 2.0586 (2.3175) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][320/1251] eta 0:03:48 lr 0.000672 wd 0.0500 time 0.2311 (0.2450) data time 0.0010 (0.0022) model time 0.2301 (0.2430) loss 3.8685 (3.2564) grad_norm 2.1701 (2.3112) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[129/300][330/1251] eta 0:03:45 lr 0.000672 wd 0.0500 time 0.2461 (0.2449) data time 0.0008 (0.0022) model time 0.2454 (0.2430) loss 3.2928 (3.2577) grad_norm 1.8606 (2.3122) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][340/1251] eta 0:03:43 lr 0.000672 wd 0.0500 time 0.2394 (0.2448) data time 0.0011 (0.0022) model time 0.2383 (0.2429) loss 3.5416 (3.2562) grad_norm 2.1014 (2.3063) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][350/1251] eta 0:03:40 lr 0.000672 wd 0.0500 time 0.2411 (0.2448) data time 0.0010 (0.0021) model time 0.2401 (0.2429) loss 3.3191 (3.2720) grad_norm 1.7914 (2.3017) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][360/1251] eta 0:03:38 lr 0.000672 wd 0.0500 time 0.2493 (0.2448) data time 0.0007 (0.0021) model time 0.2486 (0.2429) loss 4.0584 (3.2665) grad_norm 3.0819 (2.3048) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][370/1251] eta 0:03:35 lr 0.000672 wd 0.0500 time 0.2363 (0.2447) data time 0.0009 (0.0021) model time 0.2354 (0.2428) loss 2.9069 (3.2645) grad_norm 2.0379 (2.3068) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][380/1251] eta 0:03:33 lr 0.000672 wd 0.0500 time 0.2427 (0.2446) data time 0.0009 (0.0020) model time 0.2417 (0.2428) loss 3.1671 (3.2641) grad_norm 1.9486 (2.3034) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][390/1251] eta 0:03:30 lr 0.000672 wd 0.0500 time 0.2454 (0.2446) data time 0.0010 (0.0020) model time 0.2444 (0.2427) loss 3.5020 (3.2739) grad_norm 2.9569 (2.3042) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 06:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][400/1251] eta 0:03:28 lr 0.000672 wd 0.0500 time 0.2490 (0.2445) data time 0.0011 (0.0020) model time 0.2479 (0.2427) loss 3.1753 (3.2760) grad_norm 1.6383 (2.3377) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][410/1251] eta 0:03:25 lr 0.000672 wd 0.0500 time 0.2403 (0.2444) data time 0.0011 (0.0020) model time 0.2392 (0.2426) loss 3.4091 (3.2773) grad_norm 2.2644 (2.3389) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][420/1251] eta 0:03:23 lr 0.000672 wd 0.0500 time 0.2512 (0.2444) data time 0.0010 (0.0019) model time 0.2502 (0.2426) loss 3.6674 (3.2820) grad_norm 2.2049 (2.3347) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][430/1251] eta 0:03:20 lr 0.000672 wd 0.0500 time 0.2491 (0.2444) data time 0.0008 (0.0019) model time 0.2483 (0.2426) loss 3.7375 (3.2850) grad_norm 2.5664 (2.3367) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][440/1251] eta 0:03:18 lr 0.000672 wd 0.0500 time 0.2455 (0.2443) data time 0.0009 (0.0019) model time 0.2446 (0.2426) loss 3.9162 (3.2811) grad_norm 1.8873 (2.3372) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][450/1251] eta 0:03:15 lr 0.000672 wd 0.0500 time 0.2487 (0.2443) data time 0.0008 (0.0019) model time 0.2479 (0.2425) loss 3.9695 (3.2852) grad_norm 2.1696 (2.3384) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][460/1251] eta 0:03:13 lr 0.000672 wd 0.0500 time 0.2323 (0.2442) data time 0.0011 
(0.0019) model time 0.2312 (0.2425) loss 3.1464 (3.2929) grad_norm 2.3997 (2.3362) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][470/1251] eta 0:03:10 lr 0.000672 wd 0.0500 time 0.2475 (0.2441) data time 0.0010 (0.0019) model time 0.2465 (0.2424) loss 3.0477 (3.2922) grad_norm 1.8028 (2.3292) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][480/1251] eta 0:03:08 lr 0.000672 wd 0.0500 time 0.2362 (0.2440) data time 0.0008 (0.0018) model time 0.2355 (0.2423) loss 2.6662 (3.2879) grad_norm 2.6452 (2.3333) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][490/1251] eta 0:03:05 lr 0.000672 wd 0.0500 time 0.2409 (0.2440) data time 0.0008 (0.0018) model time 0.2401 (0.2423) loss 3.0522 (3.2909) grad_norm 2.4037 (2.3278) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][500/1251] eta 0:03:03 lr 0.000672 wd 0.0500 time 0.2313 (0.2439) data time 0.0009 (0.0018) model time 0.2304 (0.2421) loss 2.9905 (3.2847) grad_norm 1.9383 (2.3329) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][510/1251] eta 0:03:00 lr 0.000672 wd 0.0500 time 0.2468 (0.2438) data time 0.0009 (0.0018) model time 0.2459 (0.2421) loss 3.4902 (3.2938) grad_norm 1.5792 (2.3314) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][520/1251] eta 0:02:58 lr 0.000672 wd 0.0500 time 0.2299 (0.2438) data time 0.0010 (0.0018) model time 0.2289 (0.2421) loss 3.8690 (3.2951) grad_norm 2.2840 (2.3345) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [129/300][530/1251] eta 0:02:55 lr 0.000672 wd 0.0500 time 0.2456 (0.2438) data time 0.0010 (0.0018) model time 0.2446 (0.2421) loss 3.5999 (3.2928) grad_norm 2.3626 (2.3340) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][540/1251] eta 0:02:53 lr 0.000671 wd 0.0500 time 0.2390 (0.2438) data time 0.0009 (0.0017) model time 0.2381 (0.2421) loss 3.7899 (3.2902) grad_norm 2.4237 (2.3308) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][550/1251] eta 0:02:50 lr 0.000671 wd 0.0500 time 0.2493 (0.2437) data time 0.0011 (0.0017) model time 0.2482 (0.2421) loss 3.3365 (3.2823) grad_norm 1.9242 (2.3333) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][560/1251] eta 0:02:48 lr 0.000671 wd 0.0500 time 0.2399 (0.2438) data time 0.0008 (0.0017) model time 0.2391 (0.2421) loss 3.2027 (3.2814) grad_norm 2.3403 (2.3425) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][570/1251] eta 0:02:46 lr 0.000671 wd 0.0500 time 0.2433 (0.2438) data time 0.0008 (0.0017) model time 0.2424 (0.2421) loss 3.5713 (3.2834) grad_norm 2.1575 (2.3396) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][580/1251] eta 0:02:43 lr 0.000671 wd 0.0500 time 0.2348 (0.2438) data time 0.0011 (0.0017) model time 0.2337 (0.2421) loss 3.3732 (3.2862) grad_norm 2.0024 (2.3420) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][590/1251] eta 0:02:41 lr 0.000671 wd 0.0500 time 0.2419 (0.2437) data time 0.0011 (0.0017) model time 0.2408 (0.2421) loss 2.6165 (3.2835) grad_norm 2.0959 (2.3451) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][600/1251] eta 0:02:38 lr 0.000671 wd 0.0500 time 0.2402 (0.2440) data time 0.0007 (0.0017) model time 0.2395 (0.2424) loss 3.5582 (3.2882) grad_norm 1.9185 (2.3425) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][610/1251] eta 0:02:36 lr 0.000671 wd 0.0500 time 0.2487 (0.2440) data time 0.0009 (0.0017) model time 0.2478 (0.2424) loss 2.7131 (3.2913) grad_norm 14.8291 (2.3770) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][620/1251] eta 0:02:33 lr 0.000671 wd 0.0500 time 0.2372 (0.2439) data time 0.0010 (0.0017) model time 0.2362 (0.2424) loss 2.2475 (3.2899) grad_norm 1.7367 (2.3763) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][630/1251] eta 0:02:31 lr 0.000671 wd 0.0500 time 0.2426 (0.2439) data time 0.0008 (0.0016) model time 0.2418 (0.2423) loss 3.3499 (3.2906) grad_norm 2.0081 (2.3728) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][640/1251] eta 0:02:28 lr 0.000671 wd 0.0500 time 0.2342 (0.2439) data time 0.0009 (0.0016) model time 0.2333 (0.2423) loss 4.0538 (3.2896) grad_norm 1.8853 (2.3655) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][650/1251] eta 0:02:26 lr 0.000671 wd 0.0500 time 0.2472 (0.2443) data time 0.0008 (0.0016) model time 0.2464 (0.2428) loss 3.9482 (3.2856) grad_norm 1.8329 (2.3618) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][660/1251] eta 0:02:24 lr 0.000671 wd 0.0500 time 0.2277 (0.2443) data time 
0.0008 (0.0016) model time 0.2269 (0.2428) loss 2.4899 (3.2870) grad_norm 2.0406 (2.3573) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][670/1251] eta 0:02:21 lr 0.000671 wd 0.0500 time 0.2435 (0.2443) data time 0.0010 (0.0016) model time 0.2424 (0.2428) loss 2.4346 (3.2840) grad_norm 2.1060 (2.3543) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][680/1251] eta 0:02:19 lr 0.000671 wd 0.0500 time 0.2494 (0.2443) data time 0.0008 (0.0016) model time 0.2485 (0.2428) loss 2.4131 (3.2819) grad_norm 2.1567 (2.3531) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][690/1251] eta 0:02:17 lr 0.000671 wd 0.0500 time 0.2404 (0.2443) data time 0.0008 (0.0016) model time 0.2396 (0.2428) loss 3.2337 (3.2855) grad_norm 1.6448 (2.3471) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][700/1251] eta 0:02:14 lr 0.000671 wd 0.0500 time 0.2476 (0.2442) data time 0.0011 (0.0016) model time 0.2466 (0.2428) loss 3.2013 (3.2857) grad_norm 1.8441 (2.3443) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][710/1251] eta 0:02:12 lr 0.000671 wd 0.0500 time 0.2578 (0.2443) data time 0.0007 (0.0016) model time 0.2571 (0.2428) loss 2.9004 (3.2874) grad_norm 2.6325 (2.3419) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][720/1251] eta 0:02:09 lr 0.000671 wd 0.0500 time 0.2426 (0.2442) data time 0.0007 (0.0016) model time 0.2419 (0.2427) loss 2.9388 (3.2867) grad_norm 2.4866 (2.3435) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [129/300][730/1251] eta 0:02:07 lr 0.000671 wd 0.0500 time 0.2484 (0.2442) data time 0.0011 (0.0016) model time 0.2473 (0.2427) loss 3.2016 (3.2857) grad_norm 1.9615 (2.3482) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][740/1251] eta 0:02:04 lr 0.000671 wd 0.0500 time 0.2426 (0.2442) data time 0.0007 (0.0015) model time 0.2418 (0.2427) loss 3.6544 (3.2868) grad_norm 2.2119 (2.3445) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 06:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][750/1251] eta 0:02:02 lr 0.000671 wd 0.0500 time 0.2405 (0.2442) data time 0.0009 (0.0015) model time 0.2396 (0.2427) loss 3.8099 (3.2860) grad_norm 2.7698 (2.3478) loss_scale 4096.0000 (2056.1811) mem 7382MB [2024-08-27 06:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][760/1251] eta 0:01:59 lr 0.000671 wd 0.0500 time 0.2468 (0.2441) data time 0.0007 (0.0015) model time 0.2461 (0.2427) loss 3.6259 (3.2909) grad_norm 2.5555 (2.3463) loss_scale 4096.0000 (2082.9855) mem 7382MB [2024-08-27 06:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][770/1251] eta 0:01:57 lr 0.000671 wd 0.0500 time 0.2405 (0.2441) data time 0.0008 (0.0015) model time 0.2397 (0.2427) loss 2.2383 (3.2915) grad_norm 1.8411 (2.3445) loss_scale 4096.0000 (2109.0947) mem 7382MB [2024-08-27 06:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][780/1251] eta 0:01:54 lr 0.000670 wd 0.0500 time 0.2366 (0.2441) data time 0.0008 (0.0015) model time 0.2358 (0.2426) loss 4.1129 (3.2893) grad_norm 2.0203 (2.3459) loss_scale 4096.0000 (2134.5352) mem 7382MB [2024-08-27 06:16:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][790/1251] eta 0:01:52 lr 0.000670 wd 0.0500 time 0.2469 (0.2441) data time 0.0010 (0.0015) model time 0.2458 (0.2426) loss 3.2995 (3.2845) grad_norm 3.1000 (2.3486) 
loss_scale 4096.0000 (2159.3325) mem 7382MB [2024-08-27 06:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][800/1251] eta 0:01:50 lr 0.000670 wd 0.0500 time 0.2445 (0.2440) data time 0.0010 (0.0015) model time 0.2435 (0.2426) loss 3.2155 (3.2843) grad_norm 1.9103 (2.3452) loss_scale 4096.0000 (2183.5106) mem 7382MB [2024-08-27 06:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][810/1251] eta 0:01:47 lr 0.000670 wd 0.0500 time 0.2408 (0.2440) data time 0.0008 (0.0015) model time 0.2399 (0.2426) loss 4.0676 (3.2876) grad_norm 1.6879 (2.3441) loss_scale 4096.0000 (2207.0925) mem 7382MB [2024-08-27 06:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][820/1251] eta 0:01:45 lr 0.000670 wd 0.0500 time 0.2476 (0.2440) data time 0.0010 (0.0015) model time 0.2465 (0.2426) loss 3.7703 (3.2859) grad_norm 2.3921 (2.3438) loss_scale 4096.0000 (2230.0999) mem 7382MB [2024-08-27 06:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][830/1251] eta 0:01:42 lr 0.000670 wd 0.0500 time 0.2540 (0.2440) data time 0.0008 (0.0015) model time 0.2532 (0.2426) loss 3.5307 (3.2865) grad_norm 1.6734 (2.3403) loss_scale 4096.0000 (2252.5535) mem 7382MB [2024-08-27 06:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][840/1251] eta 0:01:40 lr 0.000670 wd 0.0500 time 0.2425 (0.2440) data time 0.0007 (0.0015) model time 0.2418 (0.2425) loss 2.4694 (3.2835) grad_norm 2.0984 (2.3367) loss_scale 4096.0000 (2274.4732) mem 7382MB [2024-08-27 06:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][850/1251] eta 0:01:37 lr 0.000670 wd 0.0500 time 0.2362 (0.2439) data time 0.0009 (0.0015) model time 0.2353 (0.2425) loss 3.6121 (3.2877) grad_norm 3.8441 (2.3411) loss_scale 4096.0000 (2295.8778) mem 7382MB [2024-08-27 06:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][860/1251] eta 0:01:35 lr 0.000670 wd 0.0500 time 0.2384 (0.2439) 
data time 0.0008 (0.0015) model time 0.2377 (0.2425) loss 2.2920 (3.2827) grad_norm 2.4627 (2.3399) loss_scale 4096.0000 (2316.7851) mem 7382MB [2024-08-27 06:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][870/1251] eta 0:01:32 lr 0.000670 wd 0.0500 time 0.2339 (0.2439) data time 0.0008 (0.0015) model time 0.2332 (0.2425) loss 4.0199 (3.2820) grad_norm 3.5366 (2.3422) loss_scale 4096.0000 (2337.2124) mem 7382MB [2024-08-27 06:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][880/1251] eta 0:01:30 lr 0.000670 wd 0.0500 time 0.2424 (0.2439) data time 0.0011 (0.0015) model time 0.2412 (0.2424) loss 3.1143 (3.2857) grad_norm 2.2981 (2.3412) loss_scale 4096.0000 (2357.1759) mem 7382MB [2024-08-27 06:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][890/1251] eta 0:01:28 lr 0.000670 wd 0.0500 time 0.2423 (0.2438) data time 0.0009 (0.0015) model time 0.2414 (0.2424) loss 3.7152 (3.2856) grad_norm 2.3459 (2.3412) loss_scale 4096.0000 (2376.6914) mem 7382MB [2024-08-27 06:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][900/1251] eta 0:01:25 lr 0.000670 wd 0.0500 time 0.2414 (0.2438) data time 0.0008 (0.0015) model time 0.2406 (0.2424) loss 4.0125 (3.2886) grad_norm 1.8254 (2.3397) loss_scale 4096.0000 (2395.7736) mem 7382MB [2024-08-27 06:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][910/1251] eta 0:01:23 lr 0.000670 wd 0.0500 time 0.2466 (0.2438) data time 0.0012 (0.0015) model time 0.2454 (0.2424) loss 3.3966 (3.2864) grad_norm 1.6899 (2.3387) loss_scale 4096.0000 (2414.4369) mem 7382MB [2024-08-27 06:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][920/1251] eta 0:01:20 lr 0.000670 wd 0.0500 time 0.2420 (0.2438) data time 0.0008 (0.0015) model time 0.2412 (0.2424) loss 2.1731 (3.2854) grad_norm 2.1584 (2.3351) loss_scale 4096.0000 (2432.6949) mem 7382MB [2024-08-27 06:17:01 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [129/300][930/1251] eta 0:01:18 lr 0.000670 wd 0.0500 time 0.2401 (0.2438) data time 0.0009 (0.0015) model time 0.2392 (0.2424) loss 2.6989 (3.2833) grad_norm 1.8450 (2.3326) loss_scale 4096.0000 (2450.5607) mem 7382MB [2024-08-27 06:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][940/1251] eta 0:01:15 lr 0.000670 wd 0.0500 time 0.2437 (0.2438) data time 0.0007 (0.0014) model time 0.2430 (0.2424) loss 3.0591 (3.2838) grad_norm 3.0751 (2.3345) loss_scale 4096.0000 (2468.0468) mem 7382MB [2024-08-27 06:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][950/1251] eta 0:01:13 lr 0.000670 wd 0.0500 time 0.2323 (0.2438) data time 0.0009 (0.0014) model time 0.2314 (0.2424) loss 3.9581 (3.2852) grad_norm inf (inf) loss_scale 2048.0000 (2483.0116) mem 7382MB [2024-08-27 06:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][960/1251] eta 0:01:10 lr 0.000670 wd 0.0500 time 0.2450 (0.2438) data time 0.0008 (0.0014) model time 0.2442 (0.2424) loss 3.9453 (3.2862) grad_norm 1.6167 (inf) loss_scale 2048.0000 (2478.4849) mem 7382MB [2024-08-27 06:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][970/1251] eta 0:01:08 lr 0.000670 wd 0.0500 time 0.2315 (0.2437) data time 0.0010 (0.0014) model time 0.2304 (0.2423) loss 3.2006 (3.2859) grad_norm 2.8820 (inf) loss_scale 2048.0000 (2474.0515) mem 7382MB [2024-08-27 06:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][980/1251] eta 0:01:06 lr 0.000670 wd 0.0500 time 0.2457 (0.2437) data time 0.0010 (0.0014) model time 0.2447 (0.2423) loss 3.3403 (3.2872) grad_norm 1.8120 (inf) loss_scale 2048.0000 (2469.7085) mem 7382MB [2024-08-27 06:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 06:17:14 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:17:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:22:36 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 06:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 06:22:49 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 06:22:50 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 06:22:51 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 06:22:51 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 129) [2024-08-27 06:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][990/1251] eta 0:12:34 lr 0.000670 wd 0.0500 time 0.2399 (2.8921) data time 0.0007 (0.1449) model time 0.2393 (2.7471) loss 3.7167 (3.6270) grad_norm 1.8488 (2.1548) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1000/1251] eta 0:04:39 lr 0.000670 wd 0.0500 time 0.2233 (1.1144) data time 0.0009 (0.0490) model time 0.2224 (1.0654) loss 3.2112 (3.4958) grad_norm 2.0391 (2.0401) loss_scale 2048.0000 (2048.0000) 
mem 7377MB [2024-08-27 06:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1010/1251] eta 0:03:02 lr 0.000669 wd 0.0500 time 0.2239 (0.7578) data time 0.0010 (0.0298) model time 0.2229 (0.7280) loss 3.3750 (3.5313) grad_norm 2.7084 (2.0604) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1020/1251] eta 0:02:19 lr 0.000669 wd 0.0500 time 0.2194 (0.6049) data time 0.0010 (0.0215) model time 0.2184 (0.5834) loss 3.3895 (3.5287) grad_norm 2.2718 (2.1110) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1030/1251] eta 0:01:55 lr 0.000669 wd 0.0500 time 0.2264 (0.5211) data time 0.0009 (0.0170) model time 0.2255 (0.5041) loss 3.5181 (3.5025) grad_norm 1.7915 (2.2096) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1040/1251] eta 0:01:38 lr 0.000669 wd 0.0500 time 0.2250 (0.4674) data time 0.0007 (0.0141) model time 0.2244 (0.4533) loss 2.3569 (3.4927) grad_norm 2.5476 (2.2510) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1050/1251] eta 0:01:26 lr 0.000669 wd 0.0500 time 0.2269 (0.4301) data time 0.0011 (0.0120) model time 0.2259 (0.4180) loss 3.4520 (3.4605) grad_norm 2.7130 (2.2665) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 06:23:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:23:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 06:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:27:59 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 06:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 06:28:09 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 06:28:11 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 06:28:12 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 06:28:12 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 129) [2024-08-27 06:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 06:28:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:28:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 06:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:30:17 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 06:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 06:30:26 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 06:30:27 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 06:30:29 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 06:30:29 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 129) [2024-08-27 06:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1060/1251] eta 0:06:03 lr 0.000669 wd 0.0500 time 0.2413 (1.9013) data time 0.0007 (0.0791) model time 0.2406 (1.8222) loss 3.7657 (3.9480) grad_norm 1.7633 (2.2025) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1070/1251] eta 0:02:57 lr 0.000669 wd 0.0500 time 0.2439 (0.9795) data time 0.0007 (0.0357) model time 0.2432 (0.9438) loss 3.7518 (3.6135) grad_norm 1.5481 (2.3116) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1080/1251] eta 0:02:02 lr 0.000669 wd 0.0500 time 0.2354 (0.7159) data time 0.0010 (0.0233) model time 0.2344 (0.6926) loss 
3.8817 (3.5972) grad_norm 2.2596 (2.3241) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1090/1251] eta 0:01:35 lr 0.000669 wd 0.0500 time 0.2392 (0.5919) data time 0.0011 (0.0175) model time 0.2381 (0.5744) loss 3.5611 (3.5703) grad_norm 1.8288 (2.3371) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1100/1251] eta 0:01:18 lr 0.000669 wd 0.0500 time 0.2537 (0.5188) data time 0.0008 (0.0140) model time 0.2529 (0.5048) loss 3.5455 (3.5419) grad_norm 5.1554 (2.3928) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1110/1251] eta 0:01:06 lr 0.000669 wd 0.0500 time 0.2370 (0.4707) data time 0.0008 (0.0118) model time 0.2361 (0.4589) loss 2.8823 (3.5054) grad_norm 2.8317 (2.4467) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1120/1251] eta 0:00:57 lr 0.000669 wd 0.0500 time 0.2363 (0.4369) data time 0.0008 (0.0102) model time 0.2355 (0.4267) loss 2.3477 (3.4631) grad_norm 2.1992 (2.4328) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1130/1251] eta 0:00:49 lr 0.000669 wd 0.0500 time 0.2382 (0.4118) data time 0.0008 (0.0090) model time 0.2374 (0.4028) loss 2.6162 (3.4352) grad_norm 2.4718 (2.4541) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1140/1251] eta 0:00:43 lr 0.000669 wd 0.0500 time 0.2460 (0.3925) data time 0.0011 (0.0081) model time 0.2448 (0.3844) loss 3.6707 (3.4109) grad_norm 1.7382 (2.3906) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1150/1251] eta 
0:00:38 lr 0.000669 wd 0.0500 time 0.2376 (0.3770) data time 0.0008 (0.0074) model time 0.2368 (0.3696) loss 4.0355 (3.4295) grad_norm 1.7454 (2.3505) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1160/1251] eta 0:00:33 lr 0.000669 wd 0.0500 time 0.2442 (0.3646) data time 0.0009 (0.0068) model time 0.2433 (0.3578) loss 3.0345 (3.4389) grad_norm 2.0017 (2.3219) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1170/1251] eta 0:00:28 lr 0.000669 wd 0.0500 time 0.2484 (0.3542) data time 0.0011 (0.0063) model time 0.2473 (0.3479) loss 3.2973 (3.4323) grad_norm 2.8576 (2.3173) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1180/1251] eta 0:00:24 lr 0.000669 wd 0.0500 time 0.2319 (0.3453) data time 0.0009 (0.0059) model time 0.2310 (0.3394) loss 3.4956 (3.4099) grad_norm 2.2586 (2.3117) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1190/1251] eta 0:00:20 lr 0.000669 wd 0.0500 time 0.2312 (0.3377) data time 0.0012 (0.0056) model time 0.2300 (0.3321) loss 3.1603 (3.4032) grad_norm 1.9673 (2.3008) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1200/1251] eta 0:00:16 lr 0.000669 wd 0.0500 time 0.2319 (0.3310) data time 0.0007 (0.0052) model time 0.2311 (0.3258) loss 3.4631 (3.3968) grad_norm 2.4552 (2.3071) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1210/1251] eta 0:00:13 lr 0.000669 wd 0.0500 time 0.2393 (0.3254) data time 0.0008 (0.0050) model time 0.2385 (0.3204) loss 2.7554 (3.3844) grad_norm 3.1304 (2.3129) loss_scale 2048.0000 (2048.0000) mem 
7377MB [2024-08-27 06:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1220/1251] eta 0:00:09 lr 0.000669 wd 0.0500 time 0.2435 (0.3205) data time 0.0010 (0.0047) model time 0.2425 (0.3157) loss 3.6036 (3.3900) grad_norm 1.3721 (2.3073) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1230/1251] eta 0:00:06 lr 0.000669 wd 0.0500 time 0.2368 (0.3161) data time 0.0009 (0.0045) model time 0.2358 (0.3116) loss 2.8500 (3.3742) grad_norm 2.6071 (2.2974) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1240/1251] eta 0:00:03 lr 0.000669 wd 0.0500 time 0.2241 (0.3117) data time 0.0005 (0.0044) model time 0.2236 (0.3073) loss 4.0506 (3.3785) grad_norm 2.2026 (2.2887) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [129/300][1250/1251] eta 0:00:00 lr 0.000668 wd 0.0500 time 0.2247 (0.3073) data time 0.0009 (0.0042) model time 0.2238 (0.3031) loss 2.5300 (3.3674) grad_norm 2.4618 (2.2753) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 129 training takes 0:01:00 [2024-08-27 06:31:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:31:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 06:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.418 (0.418) Loss 0.5386 (0.5386) Acc@1 90.820 (90.820) Acc@5 97.656 (97.656) Mem 7377MB [2024-08-27 06:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.109) Loss 0.7578 (0.7728) Acc@1 85.547 (84.011) Acc@5 96.680 (96.662) Mem 7377MB [2024-08-27 06:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.095) Loss 1.1230 (0.8000) Acc@1 71.191 (82.789) Acc@5 93.848 (96.666) Mem 7377MB [2024-08-27 06:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.088) Loss 1.3379 (0.9050) Acc@1 67.578 (80.267) Acc@5 90.625 (95.429) Mem 7377MB [2024-08-27 06:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.2666 (0.9615) Acc@1 71.289 (78.916) Acc@5 90.918 (94.758) Mem 7377MB [2024-08-27 06:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.394 Acc@5 94.688 [2024-08-27 06:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.4% [2024-08-27 06:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 78.39% [2024-08-27 06:31:42 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 06:31:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 06:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.410 (0.410) Loss 0.4175 (0.4175) Acc@1 92.578 (92.578) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-27 06:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.106) Loss 0.6670 (0.6582) Acc@1 87.012 (85.858) Acc@5 96.875 (97.275) Mem 7377MB [2024-08-27 06:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.092) Loss 0.9316 (0.6808) Acc@1 77.637 (84.914) Acc@5 95.020 (97.294) Mem 7377MB [2024-08-27 06:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.087) Loss 1.1904 (0.7740) Acc@1 69.727 (82.586) Acc@5 91.699 (96.276) Mem 7377MB [2024-08-27 06:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 1.0703 (0.8227) Acc@1 73.145 (81.207) Acc@5 93.555 (95.715) Mem 7377MB [2024-08-27 06:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.806 Acc@5 95.662 [2024-08-27 06:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.8% [2024-08-27 06:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.81% [2024-08-27 06:31:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 06:31:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 06:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][0/1251] eta 0:14:20 lr 0.000668 wd 0.0500 time 0.6880 (0.6880) data time 0.4252 (0.4252) model time 0.0000 (0.0000) loss 2.5496 (2.5496) grad_norm 1.7796 (1.7796) loss_scale 2048.0000 (2048.0000) mem 7383MB [2024-08-27 06:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 06:31:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:31:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:36:25 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 06:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 06:36:33 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 06:36:35 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 06:36:36 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 06:36:36 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 130) [2024-08-27 06:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][10/1251] eta 0:27:07 lr 0.000668 wd 0.0500 time 0.2246 (1.3115) data time 0.0008 (0.0643) model time 0.0000 (0.0000) loss 3.4506 (3.7782) grad_norm 2.4858 (2.0798) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][20/1251] eta 0:15:46 lr 0.000668 wd 0.0500 time 0.2287 (0.7688) data time 0.0008 (0.0326) model time 0.0000 (0.0000) loss 3.7125 (3.6127) grad_norm 2.5623 (2.0837) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][30/1251] eta 0:11:57 lr 0.000668 wd 0.0500 time 0.2214 (0.5876) data time 0.0009 (0.0220) model time 0.0000 (0.0000) loss 3.6672 (3.6351) grad_norm 2.6696 (2.1874) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][40/1251] eta 0:10:01 lr 0.000668 wd 0.0500 time 0.2328 (0.4970) data time 0.0007 (0.0168) model time 0.0000 (0.0000) loss 2.5362 (3.5144) grad_norm 2.7367 (2.1906) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][50/1251] eta 0:08:51 lr 0.000668 wd 0.0500 time 0.2232 (0.4427) data time 0.0010 (0.0136) model time 0.0000 (0.0000) loss 2.9134 (3.4673) grad_norm 2.3821 (2.3368) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[130/300][60/1251] eta 0:08:04 lr 0.000668 wd 0.0500 time 0.2270 (0.4066) data time 0.0007 (0.0115) model time 0.2263 (0.2252) loss 3.6211 (3.4433) grad_norm 1.5898 (2.2543) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][70/1251] eta 0:07:29 lr 0.000668 wd 0.0500 time 0.2308 (0.3806) data time 0.0007 (0.0099) model time 0.2301 (0.2245) loss 2.6306 (3.3970) grad_norm 2.4231 (2.2700) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][80/1251] eta 0:07:02 lr 0.000668 wd 0.0500 time 0.2231 (0.3609) data time 0.0008 (0.0088) model time 0.2222 (0.2237) loss 3.5160 (3.3814) grad_norm 1.7097 (2.2805) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][90/1251] eta 0:06:41 lr 0.000668 wd 0.0500 time 0.2258 (0.3456) data time 0.0006 (0.0079) model time 0.2251 (0.2233) loss 3.8756 (3.3662) grad_norm 2.6006 (2.2701) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][100/1251] eta 0:06:23 lr 0.000668 wd 0.0500 time 0.2172 (0.3336) data time 0.0010 (0.0073) model time 0.2162 (0.2235) loss 3.7029 (3.3846) grad_norm 5.7267 (2.2995) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][110/1251] eta 0:06:09 lr 0.000668 wd 0.0500 time 0.2291 (0.3236) data time 0.0008 (0.0067) model time 0.2283 (0.2234) loss 2.9214 (3.3962) grad_norm 1.6590 (2.2887) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][120/1251] eta 0:05:56 lr 0.000668 wd 0.0500 time 0.2241 (0.3152) data time 0.0007 (0.0062) model time 0.2233 (0.2232) loss 3.7743 (3.3915) grad_norm 1.8529 (2.2643) loss_scale 2048.0000 
(2048.0000) mem 7377MB [2024-08-27 06:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][130/1251] eta 0:05:45 lr 0.000668 wd 0.0500 time 0.2238 (0.3082) data time 0.0007 (0.0058) model time 0.2231 (0.2233) loss 3.0154 (3.3691) grad_norm 1.7950 (2.3258) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 06:37:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:37:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:44:18 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 06:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 06:44:26 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 06:44:28 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 06:44:29 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 06:44:29 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 130) [2024-08-27 06:44:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][140/1251] eta 0:32:06 lr 0.000668 wd 0.0500 time 0.2233 (1.7338) data time 0.0006 (0.1162) model time 0.2227 (1.6175) loss 3.5165 (3.7191) grad_norm 1.7130 (2.2227) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][150/1251] eta 0:16:28 lr 0.000668 wd 0.0500 time 0.2262 (0.8975) data time 0.0006 (0.0522) model time 0.2255 (0.8452) loss 3.5623 (3.5134) grad_norm 2.0987 (2.6136) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][160/1251] eta 0:11:58 lr 0.000668 wd 0.0500 time 0.2400 (0.6590) data time 0.0009 (0.0341) model time 0.2391 (0.6249) loss 3.8921 (3.5497) grad_norm 1.9051 (2.4011) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][170/1251] eta 0:09:49 lr 0.000668 wd 0.0500 time 0.2428 (0.5456) data time 0.0011 (0.0254) model time 0.2417 (0.5202) loss 3.4117 (3.5396) grad_norm 2.6381 (2.3278) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][180/1251] eta 0:08:33 lr 0.000668 wd 0.0500 time 0.2446 (0.4796) data time 0.0007 (0.0203) model time 0.2439 (0.4593) loss 3.8597 (3.4969) grad_norm 2.8752 (2.3115) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [130/300][190/1251] eta 0:07:42 lr 0.000668 wd 0.0500 time 0.2289 (0.4357) data time 0.0007 (0.0170) model time 0.2282 (0.4187) loss 2.8315 (3.4786) grad_norm 1.7771 (2.3194) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][200/1251] eta 0:07:05 lr 0.000668 wd 0.0500 time 0.2306 (0.4050) data time 0.0008 (0.0146) model time 0.2297 (0.3904) loss 2.2880 (3.4345) grad_norm 2.2622 (2.3150) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][210/1251] eta 0:06:37 lr 0.000668 wd 0.0500 time 0.2372 (0.3822) data time 0.0010 (0.0129) model time 0.2362 (0.3693) loss 2.9159 (3.4052) grad_norm 2.3789 (2.3024) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][220/1251] eta 0:06:15 lr 0.000668 wd 0.0500 time 0.2417 (0.3644) data time 0.0010 (0.0116) model time 0.2406 (0.3529) loss 3.3516 (3.3834) grad_norm 2.2335 (2.2807) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][230/1251] eta 0:05:57 lr 0.000668 wd 0.0500 time 0.2257 (0.3502) data time 0.0009 (0.0105) model time 0.2249 (0.3398) loss 3.7340 (3.3946) grad_norm 2.1500 (2.2583) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][240/1251] eta 0:05:42 lr 0.000667 wd 0.0500 time 0.2240 (0.3390) data time 0.0007 (0.0096) model time 0.2233 (0.3294) loss 2.2653 (3.3941) grad_norm 2.7955 (2.2549) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][250/1251] eta 0:05:29 lr 0.000667 wd 0.0500 time 0.2337 (0.3295) data time 0.0008 (0.0090) model time 0.2330 (0.3206) loss 3.1011 (3.3840) grad_norm 1.9699 (2.2858) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][260/1251] eta 0:05:18 lr 0.000667 wd 0.0500 time 0.2281 (0.3214) data time 0.0008 (0.0083) model time 0.2273 (0.3130) loss 3.2573 (3.3707) grad_norm 1.7650 (2.2692) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][270/1251] eta 0:05:08 lr 0.000667 wd 0.0500 time 0.2450 (0.3146) data time 0.0009 (0.0078) model time 0.2441 (0.3068) loss 3.4009 (3.3642) grad_norm 2.1367 (2.2684) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][280/1251] eta 0:04:59 lr 0.000667 wd 0.0500 time 0.2241 (0.3086) data time 0.0007 (0.0073) model time 0.2234 (0.3012) loss 3.6889 (3.3633) grad_norm 2.2862 (2.2856) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][290/1251] eta 0:04:51 lr 0.000667 wd 0.0500 time 0.2261 (0.3033) data time 0.0009 (0.0069) model time 0.2253 (0.2963) loss 2.3823 (3.3603) grad_norm 2.2888 (2.2935) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][300/1251] eta 0:04:44 lr 0.000667 wd 0.0500 time 0.2209 (0.2988) data time 0.0009 (0.0066) model time 0.2200 (0.2922) loss 3.4322 (3.3668) grad_norm 2.2796 (2.2775) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][310/1251] eta 0:04:37 lr 0.000667 wd 0.0500 time 0.2269 (0.2948) data time 0.0007 (0.0063) model time 0.2262 (0.2885) loss 2.8482 (3.3534) grad_norm 1.9544 (2.2753) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][320/1251] eta 0:04:31 lr 0.000667 wd 0.0500 time 0.2329 (0.2917) data time 
0.0008 (0.0061) model time 0.2321 (0.2856) loss 3.9369 (3.3532) grad_norm 2.5409 (2.2636) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][330/1251] eta 0:04:25 lr 0.000667 wd 0.0500 time 0.2269 (0.2883) data time 0.0008 (0.0059) model time 0.2261 (0.2825) loss 2.5771 (3.3446) grad_norm 1.9508 (2.2484) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][340/1251] eta 0:04:19 lr 0.000667 wd 0.0500 time 0.2377 (0.2853) data time 0.0010 (0.0056) model time 0.2367 (0.2797) loss 3.4563 (3.3285) grad_norm 2.5868 (2.2533) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][350/1251] eta 0:04:14 lr 0.000667 wd 0.0500 time 0.2231 (0.2826) data time 0.0009 (0.0054) model time 0.2222 (0.2772) loss 2.9778 (3.3222) grad_norm 2.2836 (2.2599) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][360/1251] eta 0:04:09 lr 0.000667 wd 0.0500 time 0.2227 (0.2801) data time 0.0010 (0.0052) model time 0.2217 (0.2749) loss 3.4326 (3.3266) grad_norm 2.1838 (2.2537) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][370/1251] eta 0:04:04 lr 0.000667 wd 0.0500 time 0.2340 (0.2778) data time 0.0006 (0.0050) model time 0.2334 (0.2728) loss 3.7315 (3.3209) grad_norm 1.6344 (2.2613) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][380/1251] eta 0:04:00 lr 0.000667 wd 0.0500 time 0.2405 (0.2758) data time 0.0008 (0.0049) model time 0.2397 (0.2709) loss 2.4784 (3.3096) grad_norm 1.5991 (2.2695) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [130/300][390/1251] eta 0:03:55 lr 0.000667 wd 0.0500 time 0.2225 (0.2739) data time 0.0009 (0.0048) model time 0.2216 (0.2692) loss 3.3246 (3.2961) grad_norm 2.0600 (2.2633) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][400/1251] eta 0:03:51 lr 0.000667 wd 0.0500 time 0.2355 (0.2723) data time 0.0008 (0.0046) model time 0.2348 (0.2677) loss 3.8883 (3.2891) grad_norm 1.9532 (2.2599) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][410/1251] eta 0:03:47 lr 0.000667 wd 0.0500 time 0.2289 (0.2707) data time 0.0008 (0.0045) model time 0.2281 (0.2661) loss 2.4831 (3.2911) grad_norm 1.9139 (2.2600) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 06:45:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:45:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 06:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 06:48:28 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 06:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 06:48:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 06:48:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 06:48:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 06:48:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 130) [2024-08-27 06:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 06:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][420/1251] eta 0:47:12 lr 0.000667 wd 0.0500 time 0.2529 (3.4084) data time 0.0006 (0.2233) model time 0.2523 (3.1851) loss 3.7002 (3.5824) grad_norm 1.8101 (2.0668) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][430/1251] eta 0:15:35 lr 0.000667 wd 0.0500 time 0.2330 (1.1393) data time 0.0007 (0.0651) model time 0.2323 (1.0743) loss 3.6999 (3.5579) grad_norm 1.5982 (1.9872) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][440/1251] eta 0:10:16 lr 0.000667 wd 0.0500 time 0.2300 (0.7596) data time 0.0007 (0.0383) model time 0.2293 (0.7213) loss 3.5712 (3.5390) grad_norm 2.1834 (2.0518) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][450/1251] eta 0:08:04 lr 0.000667 wd 0.0500 time 0.2303 (0.6045) data time 0.0012 (0.0274) model time 0.2292 (0.5771) loss 2.6726 (3.5343) grad_norm 2.1827 (2.2150) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:07 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][460/1251] eta 0:06:50 lr 0.000667 wd 0.0500 time 0.2245 (0.5187) data time 0.0012 (0.0214) model time 0.2233 (0.4973) loss 3.0482 (3.4837) grad_norm 4.4906 (2.3301) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][470/1251] eta 0:06:02 lr 0.000667 wd 0.0500 time 0.2266 (0.4645) data time 0.0006 (0.0176) model time 0.2260 (0.4469) loss 3.5922 (3.5021) grad_norm 1.8432 (2.3233) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][480/1251] eta 0:05:29 lr 0.000666 wd 0.0500 time 0.2402 (0.4275) data time 0.0006 (0.0150) model time 0.2397 (0.4125) loss 3.5536 (3.4535) grad_norm 2.1051 (2.3110) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][490/1251] eta 0:05:04 lr 0.000666 wd 0.0500 time 0.2233 (0.4000) data time 0.0006 (0.0131) model time 0.2226 (0.3869) loss 3.3840 (3.4318) grad_norm 1.8351 (2.2461) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][500/1251] eta 0:04:45 lr 0.000666 wd 0.0500 time 0.2445 (0.3798) data time 0.0009 (0.0116) model time 0.2436 (0.3682) loss 3.5073 (3.4083) grad_norm 2.7742 (2.2575) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][510/1251] eta 0:04:29 lr 0.000666 wd 0.0500 time 0.2348 (0.3638) data time 0.0008 (0.0105) model time 0.2341 (0.3533) loss 3.2352 (3.3918) grad_norm 2.1117 (2.2783) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][520/1251] eta 0:04:16 lr 0.000666 wd 0.0500 time 0.2292 (0.3509) data time 0.0007 (0.0097) model time 0.2284 (0.3412) loss 3.3869 
(3.4197) grad_norm 1.6749 (2.2684) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][530/1251] eta 0:04:05 lr 0.000666 wd 0.0500 time 0.2312 (0.3403) data time 0.0009 (0.0089) model time 0.2303 (0.3313) loss 3.8202 (3.4188) grad_norm 1.5024 (2.2525) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][540/1251] eta 0:03:55 lr 0.000666 wd 0.0500 time 0.2192 (0.3311) data time 0.0008 (0.0083) model time 0.2184 (0.3228) loss 2.4700 (3.4110) grad_norm 5.0326 (2.2831) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][550/1251] eta 0:03:46 lr 0.000666 wd 0.0500 time 0.2241 (0.3235) data time 0.0010 (0.0078) model time 0.2232 (0.3157) loss 3.6103 (3.4086) grad_norm 1.6834 (2.3614) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][560/1251] eta 0:03:39 lr 0.000666 wd 0.0500 time 0.2246 (0.3170) data time 0.0007 (0.0074) model time 0.2239 (0.3096) loss 3.0972 (3.3907) grad_norm 1.7059 (2.3481) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][570/1251] eta 0:03:32 lr 0.000666 wd 0.0500 time 0.2709 (0.3115) data time 0.0005 (0.0069) model time 0.2703 (0.3046) loss 3.2665 (3.3811) grad_norm 3.2338 (2.3403) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][580/1251] eta 0:03:25 lr 0.000666 wd 0.0500 time 0.2614 (0.3067) data time 0.0007 (0.0066) model time 0.2607 (0.3000) loss 3.2689 (3.3738) grad_norm 4.5108 (2.3649) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][590/1251] eta 0:03:19 lr 
0.000666 wd 0.0500 time 0.2279 (0.3022) data time 0.0008 (0.0063) model time 0.2271 (0.2959) loss 2.5357 (3.3621) grad_norm 2.4188 (2.3718) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][600/1251] eta 0:03:14 lr 0.000666 wd 0.0500 time 0.2286 (0.2982) data time 0.0007 (0.0060) model time 0.2278 (0.2921) loss 3.1130 (3.3595) grad_norm 1.7361 (2.3664) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][610/1251] eta 0:03:08 lr 0.000666 wd 0.0500 time 0.2332 (0.2945) data time 0.0007 (0.0058) model time 0.2325 (0.2887) loss 3.1905 (3.3577) grad_norm 1.9118 (2.3353) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][620/1251] eta 0:03:03 lr 0.000666 wd 0.0500 time 0.2375 (0.2911) data time 0.0008 (0.0055) model time 0.2367 (0.2856) loss 3.5266 (3.3397) grad_norm 1.8163 (2.3341) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][630/1251] eta 0:02:58 lr 0.000666 wd 0.0500 time 0.2255 (0.2882) data time 0.0006 (0.0054) model time 0.2249 (0.2829) loss 3.6692 (3.3334) grad_norm 2.1552 (2.3326) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][640/1251] eta 0:02:54 lr 0.000666 wd 0.0500 time 0.2391 (0.2855) data time 0.0009 (0.0051) model time 0.2382 (0.2803) loss 3.8420 (3.3316) grad_norm 1.6071 (2.3234) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][650/1251] eta 0:02:50 lr 0.000666 wd 0.0500 time 0.2191 (0.2832) data time 0.0008 (0.0050) model time 0.2183 (0.2782) loss 2.4876 (3.3182) grad_norm 2.1434 (2.3077) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 
06:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][660/1251] eta 0:02:46 lr 0.000666 wd 0.0500 time 0.2363 (0.2810) data time 0.0006 (0.0048) model time 0.2357 (0.2762) loss 2.5634 (3.3263) grad_norm 2.0915 (2.3009) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][670/1251] eta 0:02:42 lr 0.000666 wd 0.0500 time 0.2356 (0.2790) data time 0.0006 (0.0047) model time 0.2350 (0.2743) loss 2.3652 (3.3167) grad_norm 1.9068 (2.2948) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][680/1251] eta 0:02:38 lr 0.000666 wd 0.0500 time 0.2388 (0.2771) data time 0.0006 (0.0046) model time 0.2383 (0.2725) loss 3.2773 (3.3087) grad_norm 2.2980 (2.2882) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][690/1251] eta 0:02:34 lr 0.000666 wd 0.0500 time 0.2385 (0.2753) data time 0.0011 (0.0045) model time 0.2374 (0.2708) loss 3.1609 (3.3048) grad_norm 3.2183 (2.2870) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][700/1251] eta 0:02:30 lr 0.000666 wd 0.0500 time 0.2367 (0.2737) data time 0.0007 (0.0044) model time 0.2360 (0.2694) loss 2.2872 (3.3031) grad_norm 1.7290 (2.3189) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][710/1251] eta 0:02:27 lr 0.000666 wd 0.0500 time 0.2194 (0.2731) data time 0.0008 (0.0042) model time 0.2186 (0.2689) loss 2.7791 (3.2993) grad_norm 3.5621 (2.3216) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][720/1251] eta 0:02:24 lr 0.000665 wd 0.0500 time 0.2296 (0.2717) data time 0.0007 (0.0041) model time 0.2289 (0.2675) 
loss 3.6201 (3.2956) grad_norm 1.8486 (2.3138) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][730/1251] eta 0:02:21 lr 0.000665 wd 0.0500 time 0.2386 (0.2712) data time 0.0007 (0.0041) model time 0.2379 (0.2671) loss 3.8568 (3.3001) grad_norm 3.0173 (2.3131) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][740/1251] eta 0:02:17 lr 0.000665 wd 0.0500 time 0.2390 (0.2700) data time 0.0008 (0.0040) model time 0.2382 (0.2660) loss 3.8879 (3.3109) grad_norm 2.4504 (2.3045) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][750/1251] eta 0:02:14 lr 0.000665 wd 0.0500 time 0.2414 (0.2688) data time 0.0008 (0.0039) model time 0.2406 (0.2649) loss 3.2786 (3.3099) grad_norm 2.4783 (2.2996) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][760/1251] eta 0:02:11 lr 0.000665 wd 0.0500 time 0.2329 (0.2676) data time 0.0010 (0.0038) model time 0.2319 (0.2638) loss 3.9396 (3.3145) grad_norm 1.5954 (2.2919) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][770/1251] eta 0:02:08 lr 0.000665 wd 0.0500 time 0.2401 (0.2665) data time 0.0009 (0.0037) model time 0.2392 (0.2628) loss 3.6150 (3.3170) grad_norm 1.9266 (2.3021) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][780/1251] eta 0:02:05 lr 0.000665 wd 0.0500 time 0.2516 (0.2656) data time 0.0006 (0.0037) model time 0.2509 (0.2619) loss 2.3146 (3.3139) grad_norm 3.0255 (2.3129) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][790/1251] eta 
0:02:02 lr 0.000665 wd 0.0500 time 0.2465 (0.2647) data time 0.0009 (0.0036) model time 0.2456 (0.2611) loss 3.6356 (3.3129) grad_norm 1.8464 (2.3137) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][800/1251] eta 0:01:58 lr 0.000665 wd 0.0500 time 0.2364 (0.2638) data time 0.0009 (0.0036) model time 0.2355 (0.2602) loss 2.5561 (3.3042) grad_norm 2.6519 (2.3058) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][810/1251] eta 0:01:56 lr 0.000665 wd 0.0500 time 0.2307 (0.2630) data time 0.0010 (0.0035) model time 0.2298 (0.2595) loss 3.1929 (3.3048) grad_norm 2.3123 (2.3056) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][820/1251] eta 0:01:53 lr 0.000665 wd 0.0500 time 0.2246 (0.2623) data time 0.0006 (0.0035) model time 0.2239 (0.2588) loss 3.6883 (3.3123) grad_norm 2.3582 (2.3043) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][830/1251] eta 0:01:50 lr 0.000665 wd 0.0500 time 0.2314 (0.2616) data time 0.0007 (0.0034) model time 0.2307 (0.2582) loss 2.8323 (3.3150) grad_norm 2.3083 (2.3020) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][840/1251] eta 0:01:47 lr 0.000665 wd 0.0500 time 0.2355 (0.2609) data time 0.0006 (0.0034) model time 0.2349 (0.2575) loss 3.9663 (3.3147) grad_norm 2.1890 (2.2989) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][850/1251] eta 0:01:44 lr 0.000665 wd 0.0500 time 0.2477 (0.2602) data time 0.0009 (0.0034) model time 0.2468 (0.2569) loss 4.0552 (3.3214) grad_norm 2.7829 (2.2982) loss_scale 2048.0000 (2048.0000) mem 7377MB 
[2024-08-27 06:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][860/1251] eta 0:01:41 lr 0.000665 wd 0.0500 time 0.2389 (0.2596) data time 0.0010 (0.0033) model time 0.2379 (0.2562) loss 3.8684 (3.3241) grad_norm 3.6335 (2.3026) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][870/1251] eta 0:01:38 lr 0.000665 wd 0.0500 time 0.2359 (0.2589) data time 0.0009 (0.0033) model time 0.2350 (0.2556) loss 3.9223 (3.3212) grad_norm 1.6495 (2.2968) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][880/1251] eta 0:01:35 lr 0.000665 wd 0.0500 time 0.2453 (0.2584) data time 0.0006 (0.0032) model time 0.2447 (0.2551) loss 3.8075 (3.3179) grad_norm 2.5526 (2.2940) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][890/1251] eta 0:01:33 lr 0.000665 wd 0.0500 time 0.2301 (0.2578) data time 0.0010 (0.0032) model time 0.2291 (0.2546) loss 3.2120 (3.3102) grad_norm 2.5925 (2.2951) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][900/1251] eta 0:01:30 lr 0.000665 wd 0.0500 time 0.2313 (0.2571) data time 0.0009 (0.0031) model time 0.2303 (0.2540) loss 3.3045 (3.3086) grad_norm 1.6591 (2.2992) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][910/1251] eta 0:01:27 lr 0.000665 wd 0.0500 time 0.2370 (0.2565) data time 0.0009 (0.0031) model time 0.2361 (0.2534) loss 2.9903 (3.3089) grad_norm 1.9715 (2.2983) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][920/1251] eta 0:01:24 lr 0.000665 wd 0.0500 time 0.2479 (0.2561) data time 0.0007 (0.0031) model time 0.2472 
(0.2530) loss 3.3637 (3.3049) grad_norm 2.0911 (2.3006) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][930/1251] eta 0:01:22 lr 0.000665 wd 0.0500 time 0.2487 (0.2556) data time 0.0009 (0.0030) model time 0.2478 (0.2526) loss 3.5839 (3.3116) grad_norm 2.0770 (2.3063) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][940/1251] eta 0:01:19 lr 0.000665 wd 0.0500 time 0.2239 (0.2551) data time 0.0007 (0.0030) model time 0.2232 (0.2521) loss 2.3059 (3.3063) grad_norm 2.4886 (2.3033) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][950/1251] eta 0:01:16 lr 0.000665 wd 0.0500 time 0.2226 (0.2546) data time 0.0007 (0.0030) model time 0.2220 (0.2516) loss 2.3207 (3.3043) grad_norm 2.1710 (2.2990) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][960/1251] eta 0:01:13 lr 0.000664 wd 0.0500 time 0.2429 (0.2541) data time 0.0007 (0.0029) model time 0.2422 (0.2511) loss 3.9444 (3.3055) grad_norm 2.2270 (2.3007) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][970/1251] eta 0:01:11 lr 0.000664 wd 0.0500 time 0.2220 (0.2537) data time 0.0007 (0.0029) model time 0.2213 (0.2508) loss 4.1773 (3.3090) grad_norm 3.3587 (2.3004) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][980/1251] eta 0:01:08 lr 0.000664 wd 0.0500 time 0.2304 (0.2533) data time 0.0011 (0.0029) model time 0.2293 (0.2504) loss 3.8704 (3.3116) grad_norm 1.8988 (2.3001) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[130/300][990/1251] eta 0:01:06 lr 0.000664 wd 0.0500 time 0.2312 (0.2529) data time 0.0006 (0.0029) model time 0.2306 (0.2500) loss 3.5732 (3.3105) grad_norm 1.9308 (2.3018) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1000/1251] eta 0:01:03 lr 0.000664 wd 0.0500 time 0.2265 (0.2525) data time 0.0006 (0.0028) model time 0.2259 (0.2497) loss 2.4743 (3.3138) grad_norm 1.9794 (2.3003) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1010/1251] eta 0:01:00 lr 0.000664 wd 0.0500 time 0.2307 (0.2521) data time 0.0007 (0.0028) model time 0.2299 (0.2493) loss 3.0238 (3.3155) grad_norm 2.4299 (2.2997) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1020/1251] eta 0:00:58 lr 0.000664 wd 0.0500 time 0.2481 (0.2517) data time 0.0008 (0.0028) model time 0.2474 (0.2490) loss 3.4454 (3.3128) grad_norm 1.8460 (2.2977) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1030/1251] eta 0:00:55 lr 0.000664 wd 0.0500 time 0.2349 (0.2514) data time 0.0006 (0.0027) model time 0.2343 (0.2486) loss 3.2269 (3.3134) grad_norm 2.1369 (2.2996) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1040/1251] eta 0:00:52 lr 0.000664 wd 0.0500 time 0.2419 (0.2510) data time 0.0005 (0.0027) model time 0.2414 (0.2483) loss 3.3179 (3.3183) grad_norm 2.1278 (2.2981) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1050/1251] eta 0:00:50 lr 0.000664 wd 0.0500 time 0.2292 (0.2507) data time 0.0010 (0.0027) model time 0.2282 (0.2480) loss 3.1210 (3.3188) grad_norm 1.8149 (2.2970) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1060/1251] eta 0:00:47 lr 0.000664 wd 0.0500 time 0.2300 (0.2503) data time 0.0006 (0.0027) model time 0.2293 (0.2476) loss 3.3254 (3.3142) grad_norm 1.6037 (2.2959) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1070/1251] eta 0:00:45 lr 0.000664 wd 0.0500 time 0.2311 (0.2500) data time 0.0008 (0.0026) model time 0.2302 (0.2473) loss 4.1492 (3.3143) grad_norm 2.6509 (2.2917) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1080/1251] eta 0:00:42 lr 0.000664 wd 0.0500 time 0.2243 (0.2496) data time 0.0011 (0.0026) model time 0.2232 (0.2470) loss 3.4330 (3.3103) grad_norm 2.2309 (2.2902) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1090/1251] eta 0:00:40 lr 0.000664 wd 0.0500 time 0.2322 (0.2493) data time 0.0005 (0.0026) model time 0.2317 (0.2467) loss 3.5883 (3.3142) grad_norm 1.8239 (2.2915) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1100/1251] eta 0:00:37 lr 0.000664 wd 0.0500 time 0.2232 (0.2490) data time 0.0006 (0.0026) model time 0.2226 (0.2464) loss 3.5912 (3.3146) grad_norm 1.7449 (2.2958) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1110/1251] eta 0:00:35 lr 0.000664 wd 0.0500 time 0.2263 (0.2486) data time 0.0008 (0.0025) model time 0.2255 (0.2461) loss 3.6322 (3.3117) grad_norm 2.1335 (2.2923) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1120/1251] eta 0:00:32 lr 0.000664 wd 0.0500 time 0.2260 (0.2483) 
data time 0.0008 (0.0025) model time 0.2253 (0.2458) loss 3.7848 (3.3105) grad_norm 2.2243 (2.2904) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1130/1251] eta 0:00:30 lr 0.000664 wd 0.0500 time 0.2215 (0.2480) data time 0.0008 (0.0025) model time 0.2207 (0.2455) loss 3.0077 (3.3062) grad_norm 1.6638 (2.2882) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1140/1251] eta 0:00:27 lr 0.000664 wd 0.0500 time 0.2258 (0.2477) data time 0.0007 (0.0025) model time 0.2251 (0.2452) loss 3.2006 (3.3053) grad_norm 3.0104 (2.2896) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1150/1251] eta 0:00:24 lr 0.000664 wd 0.0500 time 0.2277 (0.2474) data time 0.0009 (0.0025) model time 0.2268 (0.2449) loss 3.6361 (3.3082) grad_norm 2.7273 (2.2928) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1160/1251] eta 0:00:22 lr 0.000664 wd 0.0500 time 0.2237 (0.2471) data time 0.0007 (0.0024) model time 0.2230 (0.2447) loss 2.2371 (3.3101) grad_norm 1.8904 (2.2935) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1170/1251] eta 0:00:19 lr 0.000664 wd 0.0500 time 0.2276 (0.2468) data time 0.0009 (0.0024) model time 0.2267 (0.2444) loss 2.0636 (3.3087) grad_norm 1.9384 (2.2995) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1180/1251] eta 0:00:17 lr 0.000664 wd 0.0500 time 0.2329 (0.2466) data time 0.0009 (0.0024) model time 0.2320 (0.2442) loss 2.8297 (3.3101) grad_norm 2.3852 (2.3034) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:55 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [130/300][1190/1251] eta 0:00:15 lr 0.000663 wd 0.0500 time 0.2267 (0.2463) data time 0.0009 (0.0024) model time 0.2258 (0.2439) loss 3.4537 (3.3128) grad_norm 2.5472 (2.3066) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:51:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1200/1251] eta 0:00:12 lr 0.000663 wd 0.0500 time 0.2297 (0.2461) data time 0.0007 (0.0024) model time 0.2290 (0.2437) loss 2.7170 (3.3113) grad_norm 1.6238 (2.3044) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1210/1251] eta 0:00:10 lr 0.000663 wd 0.0500 time 0.2290 (0.2458) data time 0.0008 (0.0023) model time 0.2281 (0.2435) loss 3.3516 (3.3119) grad_norm 2.2638 (2.3030) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1220/1251] eta 0:00:07 lr 0.000663 wd 0.0500 time 0.2275 (0.2456) data time 0.0008 (0.0023) model time 0.2267 (0.2432) loss 3.1706 (3.3107) grad_norm 2.4437 (2.3006) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1230/1251] eta 0:00:05 lr 0.000663 wd 0.0500 time 0.2237 (0.2456) data time 0.0009 (0.0023) model time 0.2228 (0.2433) loss 3.5666 (3.3053) grad_norm 2.4120 (2.3049) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1240/1251] eta 0:00:02 lr 0.000663 wd 0.0500 time 0.2190 (0.2453) data time 0.0005 (0.0023) model time 0.2185 (0.2430) loss 3.3508 (3.3057) grad_norm 1.7681 (2.3047) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [130/300][1250/1251] eta 0:00:00 lr 0.000663 wd 0.0500 time 0.2131 (0.2451) data time 0.0004 (0.0023) model time 0.2127 (0.2429) loss 3.1308 (3.3027) 
grad_norm 3.3242 (2.3100) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 06:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 130 training takes 0:03:24 [2024-08-27 06:52:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 06:52:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.382 (0.382) Loss 0.4597 (0.4597) Acc@1 91.406 (91.406) Acc@5 97.949 (97.949) Mem 7377MB [2024-08-27 06:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.099) Loss 0.6860 (0.7278) Acc@1 85.547 (83.825) Acc@5 97.656 (96.839) Mem 7377MB [2024-08-27 06:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.086) Loss 1.0801 (0.7628) Acc@1 74.707 (82.910) Acc@5 92.773 (96.642) Mem 7377MB [2024-08-27 06:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.081) Loss 1.2842 (0.8757) Acc@1 69.727 (80.525) Acc@5 90.918 (95.265) Mem 7377MB [2024-08-27 06:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.2324 (0.9340) Acc@1 70.508 (79.104) Acc@5 91.406 (94.636) Mem 7377MB [2024-08-27 06:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.664 Acc@5 94.524 [2024-08-27 06:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.7% [2024-08-27 06:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 78.66% [2024-08-27 06:52:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-08-27 06:52:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 06:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.373 (0.373) Loss 0.4165 (0.4165) Acc@1 92.578 (92.578) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-27 06:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.098) Loss 0.6650 (0.6567) Acc@1 87.109 (85.920) Acc@5 96.875 (97.310) Mem 7377MB [2024-08-27 06:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.085) Loss 0.9297 (0.6795) Acc@1 77.539 (84.942) Acc@5 94.922 (97.307) Mem 7377MB [2024-08-27 06:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.081) Loss 1.1875 (0.7726) Acc@1 69.922 (82.636) Acc@5 91.602 (96.261) Mem 7377MB [2024-08-27 06:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.077) Loss 1.0684 (0.8210) Acc@1 72.754 (81.236) Acc@5 93.555 (95.694) Mem 7377MB [2024-08-27 06:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.836 Acc@5 95.650 [2024-08-27 06:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.8% [2024-08-27 06:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.84% [2024-08-27 06:52:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 06:52:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 06:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][0/1251] eta 0:12:36 lr 0.000663 wd 0.0500 time 0.6046 (0.6046) data time 0.3774 (0.3774) model time 0.0000 (0.0000) loss 2.7527 (2.7527) grad_norm 1.9485 (1.9485) loss_scale 2048.0000 (2048.0000) mem 7380MB [2024-08-27 06:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][10/1251] eta 0:05:26 lr 0.000663 wd 0.0500 time 0.2252 (0.2628) data time 0.0007 (0.0364) model time 0.0000 (0.0000) loss 2.2170 (3.0846) grad_norm 1.5882 (2.0457) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][20/1251] eta 0:05:03 lr 0.000663 wd 0.0500 time 0.2264 (0.2468) data time 0.0006 (0.0195) model time 0.0000 (0.0000) loss 3.6986 (3.1791) grad_norm 3.3378 (2.0881) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][30/1251] eta 0:04:54 lr 0.000663 wd 0.0500 time 0.2299 (0.2411) data time 0.0008 (0.0139) model time 0.0000 (0.0000) loss 2.7528 (3.1501) grad_norm 1.9207 (2.0984) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][40/1251] eta 0:04:48 lr 0.000663 wd 0.0500 time 0.2249 (0.2378) data time 0.0006 (0.0108) model time 0.0000 (0.0000) loss 3.1399 (3.2009) grad_norm 2.7496 (2.1025) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][50/1251] eta 0:04:43 lr 0.000663 wd 0.0500 time 0.2223 (0.2360) data time 0.0007 (0.0089) model time 0.0000 (0.0000) loss 2.7354 (3.1707) grad_norm 2.2469 (2.1390) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][60/1251] eta 0:04:39 lr 0.000663 wd 0.0500 time 0.2227 (0.2345) data time 0.0008 (0.0077) model time 0.2219 
(0.2253) loss 3.2781 (3.1711) grad_norm 3.3608 (2.3370) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][70/1251] eta 0:04:36 lr 0.000663 wd 0.0500 time 0.2327 (0.2340) data time 0.0006 (0.0068) model time 0.2322 (0.2274) loss 3.6019 (3.1872) grad_norm 1.8255 (2.3532) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][80/1251] eta 0:04:32 lr 0.000663 wd 0.0500 time 0.2224 (0.2331) data time 0.0006 (0.0061) model time 0.2219 (0.2269) loss 4.2290 (3.1964) grad_norm 2.0984 (2.3441) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][90/1251] eta 0:04:30 lr 0.000663 wd 0.0500 time 0.2302 (0.2328) data time 0.0006 (0.0055) model time 0.2296 (0.2274) loss 3.5537 (3.2185) grad_norm 1.8894 (2.3296) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][100/1251] eta 0:04:27 lr 0.000663 wd 0.0500 time 0.2237 (0.2328) data time 0.0006 (0.0051) model time 0.2231 (0.2284) loss 2.0673 (3.2235) grad_norm 2.2606 (2.2859) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][110/1251] eta 0:04:25 lr 0.000663 wd 0.0500 time 0.2240 (0.2326) data time 0.0006 (0.0047) model time 0.2235 (0.2285) loss 3.4451 (3.2067) grad_norm 2.4595 (2.2611) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][120/1251] eta 0:04:22 lr 0.000663 wd 0.0500 time 0.2267 (0.2325) data time 0.0009 (0.0045) model time 0.2257 (0.2288) loss 3.5741 (3.1846) grad_norm 2.3418 (2.2536) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][130/1251] 
eta 0:04:20 lr 0.000663 wd 0.0500 time 0.2287 (0.2323) data time 0.0009 (0.0042) model time 0.2278 (0.2287) loss 3.1676 (3.1893) grad_norm 2.2018 (2.2355) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][140/1251] eta 0:04:17 lr 0.000663 wd 0.0500 time 0.2230 (0.2322) data time 0.0006 (0.0040) model time 0.2224 (0.2287) loss 4.1386 (3.2201) grad_norm 2.2626 (2.2419) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][150/1251] eta 0:04:15 lr 0.000663 wd 0.0500 time 0.2204 (0.2318) data time 0.0007 (0.0038) model time 0.2197 (0.2285) loss 3.6905 (3.2182) grad_norm 1.7583 (2.2552) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][160/1251] eta 0:04:12 lr 0.000663 wd 0.0500 time 0.2251 (0.2317) data time 0.0007 (0.0037) model time 0.2244 (0.2285) loss 3.8596 (3.2242) grad_norm 2.9655 (2.2646) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][170/1251] eta 0:04:10 lr 0.000663 wd 0.0500 time 0.2221 (0.2315) data time 0.0007 (0.0035) model time 0.2215 (0.2284) loss 2.4180 (3.2213) grad_norm 3.6528 (2.2845) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][180/1251] eta 0:04:07 lr 0.000662 wd 0.0500 time 0.2302 (0.2312) data time 0.0007 (0.0034) model time 0.2295 (0.2281) loss 3.9387 (3.2437) grad_norm 2.4815 (2.2905) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][190/1251] eta 0:04:05 lr 0.000662 wd 0.0500 time 0.2201 (0.2309) data time 0.0007 (0.0033) model time 0.2194 (0.2279) loss 2.4646 (3.2224) grad_norm 2.2942 (2.3037) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-27 06:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][200/1251] eta 0:04:02 lr 0.000662 wd 0.0500 time 0.2226 (0.2307) data time 0.0006 (0.0031) model time 0.2220 (0.2277) loss 3.8722 (3.2217) grad_norm 1.8891 (2.2902) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][210/1251] eta 0:03:59 lr 0.000662 wd 0.0500 time 0.2238 (0.2304) data time 0.0008 (0.0030) model time 0.2229 (0.2275) loss 3.2330 (3.2364) grad_norm 1.6587 (2.3109) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][220/1251] eta 0:03:57 lr 0.000662 wd 0.0500 time 0.2275 (0.2302) data time 0.0007 (0.0029) model time 0.2268 (0.2274) loss 3.7105 (3.2402) grad_norm 2.0842 (2.2983) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][230/1251] eta 0:03:54 lr 0.000662 wd 0.0500 time 0.2344 (0.2302) data time 0.0007 (0.0029) model time 0.2337 (0.2274) loss 3.8807 (3.2505) grad_norm 1.7395 (2.2875) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][240/1251] eta 0:03:52 lr 0.000662 wd 0.0500 time 0.2321 (0.2301) data time 0.0013 (0.0028) model time 0.2308 (0.2273) loss 3.0085 (3.2386) grad_norm 2.0999 (2.2947) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][250/1251] eta 0:03:50 lr 0.000662 wd 0.0500 time 0.2277 (0.2300) data time 0.0006 (0.0027) model time 0.2271 (0.2273) loss 2.4089 (3.2379) grad_norm 2.7904 (2.2900) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][260/1251] eta 0:03:47 lr 0.000662 wd 0.0500 time 0.2372 (0.2299) data time 0.0007 (0.0026) model time 0.2365 
(0.2273) loss 2.6196 (3.2347) grad_norm 2.4735 (2.2749) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][270/1251] eta 0:03:45 lr 0.000662 wd 0.0500 time 0.2308 (0.2298) data time 0.0007 (0.0026) model time 0.2301 (0.2272) loss 3.2779 (3.2415) grad_norm 1.7502 (2.2818) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][280/1251] eta 0:03:43 lr 0.000662 wd 0.0500 time 0.2301 (0.2297) data time 0.0006 (0.0025) model time 0.2295 (0.2272) loss 4.2691 (3.2474) grad_norm 6.0052 (2.3241) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][290/1251] eta 0:03:40 lr 0.000662 wd 0.0500 time 0.2227 (0.2295) data time 0.0005 (0.0025) model time 0.2222 (0.2270) loss 3.8981 (3.2487) grad_norm 2.9460 (2.3500) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][300/1251] eta 0:03:38 lr 0.000662 wd 0.0500 time 0.2191 (0.2293) data time 0.0008 (0.0024) model time 0.2183 (0.2269) loss 2.7938 (3.2497) grad_norm 2.0213 (2.3455) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][310/1251] eta 0:03:35 lr 0.000662 wd 0.0500 time 0.2265 (0.2292) data time 0.0008 (0.0024) model time 0.2257 (0.2268) loss 3.1447 (3.2561) grad_norm 2.3549 (2.3451) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][320/1251] eta 0:03:33 lr 0.000662 wd 0.0500 time 0.2233 (0.2292) data time 0.0008 (0.0023) model time 0.2224 (0.2269) loss 3.7439 (3.2561) grad_norm 2.3632 (2.3452) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[131/300][330/1251] eta 0:03:31 lr 0.000662 wd 0.0500 time 0.2298 (0.2292) data time 0.0007 (0.0023) model time 0.2292 (0.2269) loss 2.9356 (3.2510) grad_norm 2.0543 (2.3489) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][340/1251] eta 0:03:28 lr 0.000662 wd 0.0500 time 0.2236 (0.2291) data time 0.0006 (0.0022) model time 0.2231 (0.2268) loss 4.1484 (3.2633) grad_norm 2.4239 (2.3767) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][350/1251] eta 0:03:26 lr 0.000662 wd 0.0500 time 0.2264 (0.2290) data time 0.0005 (0.0022) model time 0.2259 (0.2268) loss 3.7707 (3.2759) grad_norm 2.1188 (2.3771) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][360/1251] eta 0:03:23 lr 0.000662 wd 0.0500 time 0.2257 (0.2289) data time 0.0005 (0.0022) model time 0.2252 (0.2267) loss 3.0397 (3.2757) grad_norm 4.4328 (2.3782) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][370/1251] eta 0:03:21 lr 0.000662 wd 0.0500 time 0.2267 (0.2289) data time 0.0005 (0.0021) model time 0.2261 (0.2267) loss 3.7684 (3.2820) grad_norm 2.0017 (2.3754) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][380/1251] eta 0:03:19 lr 0.000662 wd 0.0500 time 0.2245 (0.2288) data time 0.0006 (0.0021) model time 0.2239 (0.2267) loss 4.2637 (3.2811) grad_norm 4.1680 (2.3748) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][390/1251] eta 0:03:16 lr 0.000662 wd 0.0500 time 0.2344 (0.2288) data time 0.0008 (0.0021) model time 0.2336 (0.2267) loss 3.6894 (3.2791) grad_norm 2.0942 (2.3812) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-27 06:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][400/1251] eta 0:03:14 lr 0.000662 wd 0.0500 time 0.2260 (0.2287) data time 0.0008 (0.0020) model time 0.2252 (0.2266) loss 3.7982 (3.2753) grad_norm 2.4826 (2.3782) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][410/1251] eta 0:03:12 lr 0.000662 wd 0.0500 time 0.2234 (0.2286) data time 0.0008 (0.0020) model time 0.2226 (0.2266) loss 3.0368 (3.2758) grad_norm 2.6638 (2.3809) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][420/1251] eta 0:03:10 lr 0.000661 wd 0.0500 time 0.2208 (0.2287) data time 0.0007 (0.0020) model time 0.2202 (0.2266) loss 2.7637 (3.2706) grad_norm 1.8476 (2.3717) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][430/1251] eta 0:03:07 lr 0.000661 wd 0.0500 time 0.2259 (0.2286) data time 0.0006 (0.0020) model time 0.2253 (0.2266) loss 3.7816 (3.2756) grad_norm 2.3017 (2.3729) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][440/1251] eta 0:03:05 lr 0.000661 wd 0.0500 time 0.2266 (0.2286) data time 0.0007 (0.0019) model time 0.2259 (0.2266) loss 3.7826 (3.2767) grad_norm 3.0283 (2.3814) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][450/1251] eta 0:03:03 lr 0.000661 wd 0.0500 time 0.2256 (0.2286) data time 0.0008 (0.0019) model time 0.2247 (0.2266) loss 3.2927 (3.2807) grad_norm 2.9025 (2.3826) loss_scale 4096.0000 (2061.6231) mem 7381MB [2024-08-27 06:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][460/1251] eta 0:03:00 lr 0.000661 wd 0.0500 time 0.2210 (0.2286) data time 0.0006 
(0.0019) model time 0.2204 (0.2266) loss 3.6916 (3.2858) grad_norm 1.9590 (2.3756) loss_scale 4096.0000 (2105.7527) mem 7381MB [2024-08-27 06:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][470/1251] eta 0:02:58 lr 0.000661 wd 0.0500 time 0.2303 (0.2286) data time 0.0008 (0.0019) model time 0.2296 (0.2267) loss 1.9984 (3.2881) grad_norm 2.7535 (2.3724) loss_scale 4096.0000 (2148.0085) mem 7381MB [2024-08-27 06:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][480/1251] eta 0:02:56 lr 0.000661 wd 0.0500 time 0.2232 (0.2285) data time 0.0006 (0.0018) model time 0.2226 (0.2266) loss 2.2042 (3.2830) grad_norm 2.8214 (2.3684) loss_scale 4096.0000 (2188.5073) mem 7381MB [2024-08-27 06:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][490/1251] eta 0:02:53 lr 0.000661 wd 0.0500 time 0.2275 (0.2284) data time 0.0006 (0.0018) model time 0.2269 (0.2266) loss 3.8918 (3.2792) grad_norm 2.1001 (2.3622) loss_scale 4096.0000 (2227.3564) mem 7381MB [2024-08-27 06:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][500/1251] eta 0:02:51 lr 0.000661 wd 0.0500 time 0.2253 (0.2284) data time 0.0007 (0.0018) model time 0.2246 (0.2265) loss 2.4134 (3.2797) grad_norm 2.1344 (2.3628) loss_scale 4096.0000 (2264.6547) mem 7381MB [2024-08-27 06:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][510/1251] eta 0:02:49 lr 0.000661 wd 0.0500 time 0.2228 (0.2284) data time 0.0008 (0.0018) model time 0.2220 (0.2265) loss 2.9514 (3.2827) grad_norm 1.8914 (2.3724) loss_scale 4096.0000 (2300.4932) mem 7381MB [2024-08-27 06:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][520/1251] eta 0:02:46 lr 0.000661 wd 0.0500 time 0.2233 (0.2283) data time 0.0007 (0.0018) model time 0.2226 (0.2265) loss 2.6084 (3.2852) grad_norm 2.0837 (2.3702) loss_scale 4096.0000 (2334.9559) mem 7381MB [2024-08-27 06:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [131/300][530/1251] eta 0:02:44 lr 0.000661 wd 0.0500 time 0.2231 (0.2288) data time 0.0009 (0.0018) model time 0.2222 (0.2270) loss 3.6347 (3.2918) grad_norm 4.0079 (2.3788) loss_scale 4096.0000 (2368.1205) mem 7381MB [2024-08-27 06:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][540/1251] eta 0:02:42 lr 0.000661 wd 0.0500 time 0.2309 (0.2287) data time 0.0007 (0.0017) model time 0.2302 (0.2270) loss 3.8233 (3.2947) grad_norm 2.4760 (2.3804) loss_scale 4096.0000 (2400.0591) mem 7381MB [2024-08-27 06:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][550/1251] eta 0:02:40 lr 0.000661 wd 0.0500 time 0.2248 (0.2287) data time 0.0006 (0.0017) model time 0.2242 (0.2269) loss 4.0363 (3.2957) grad_norm 1.9153 (inf) loss_scale 2048.0000 (2412.2541) mem 7381MB [2024-08-27 06:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][560/1251] eta 0:02:38 lr 0.000661 wd 0.0500 time 0.2338 (0.2287) data time 0.0007 (0.0017) model time 0.2331 (0.2270) loss 3.6693 (3.2965) grad_norm 2.5055 (inf) loss_scale 2048.0000 (2405.7611) mem 7381MB [2024-08-27 06:54:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][570/1251] eta 0:02:35 lr 0.000661 wd 0.0500 time 0.2269 (0.2287) data time 0.0008 (0.0017) model time 0.2261 (0.2270) loss 3.7858 (3.3007) grad_norm 2.4899 (inf) loss_scale 2048.0000 (2399.4956) mem 7381MB [2024-08-27 06:54:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][580/1251] eta 0:02:33 lr 0.000661 wd 0.0500 time 0.2272 (0.2288) data time 0.0007 (0.0017) model time 0.2265 (0.2271) loss 3.3421 (3.3039) grad_norm 1.9723 (inf) loss_scale 2048.0000 (2393.4458) mem 7381MB [2024-08-27 06:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][590/1251] eta 0:02:31 lr 0.000661 wd 0.0500 time 0.2279 (0.2288) data time 0.0007 (0.0017) model time 0.2272 (0.2271) loss 2.9733 (3.3039) grad_norm 2.5860 (inf) loss_scale 2048.0000 
(2387.6007) mem 7381MB [2024-08-27 06:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][600/1251] eta 0:02:28 lr 0.000661 wd 0.0500 time 0.2235 (0.2288) data time 0.0009 (0.0017) model time 0.2225 (0.2271) loss 3.4475 (3.3035) grad_norm 2.4624 (inf) loss_scale 2048.0000 (2381.9501) mem 7381MB [2024-08-27 06:54:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][610/1251] eta 0:02:26 lr 0.000661 wd 0.0500 time 0.2227 (0.2288) data time 0.0006 (0.0017) model time 0.2221 (0.2271) loss 2.9292 (3.3053) grad_norm 2.2596 (inf) loss_scale 2048.0000 (2376.4845) mem 7381MB [2024-08-27 06:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][620/1251] eta 0:02:24 lr 0.000661 wd 0.0500 time 0.2235 (0.2288) data time 0.0007 (0.0017) model time 0.2228 (0.2271) loss 3.3944 (3.3063) grad_norm 2.4486 (inf) loss_scale 2048.0000 (2371.1948) mem 7381MB [2024-08-27 06:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][630/1251] eta 0:02:22 lr 0.000661 wd 0.0500 time 0.2252 (0.2288) data time 0.0009 (0.0017) model time 0.2243 (0.2271) loss 3.0931 (3.3090) grad_norm 2.2270 (inf) loss_scale 2048.0000 (2366.0729) mem 7381MB [2024-08-27 06:54:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][640/1251] eta 0:02:19 lr 0.000661 wd 0.0500 time 0.2250 (0.2288) data time 0.0009 (0.0016) model time 0.2241 (0.2271) loss 2.3651 (3.3056) grad_norm 2.1522 (inf) loss_scale 2048.0000 (2361.1108) mem 7381MB [2024-08-27 06:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][650/1251] eta 0:02:17 lr 0.000660 wd 0.0500 time 0.2223 (0.2288) data time 0.0009 (0.0016) model time 0.2214 (0.2271) loss 3.5365 (3.2999) grad_norm 2.3116 (inf) loss_scale 2048.0000 (2356.3011) mem 7381MB [2024-08-27 06:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][660/1251] eta 0:02:15 lr 0.000660 wd 0.0500 time 0.2345 (0.2288) data time 0.0006 (0.0016) model time 
0.2339 (0.2271) loss 2.3973 (3.2959) grad_norm 2.7993 (inf) loss_scale 2048.0000 (2351.6369) mem 7381MB [2024-08-27 06:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][670/1251] eta 0:02:12 lr 0.000660 wd 0.0500 time 0.2266 (0.2288) data time 0.0010 (0.0016) model time 0.2257 (0.2272) loss 3.7744 (3.2984) grad_norm 1.8883 (inf) loss_scale 2048.0000 (2347.1118) mem 7381MB [2024-08-27 06:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][680/1251] eta 0:02:10 lr 0.000660 wd 0.0500 time 0.2202 (0.2287) data time 0.0007 (0.0016) model time 0.2195 (0.2272) loss 3.4253 (3.2994) grad_norm 2.2134 (inf) loss_scale 2048.0000 (2342.7195) mem 7381MB [2024-08-27 06:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][690/1251] eta 0:02:08 lr 0.000660 wd 0.0500 time 0.2301 (0.2288) data time 0.0007 (0.0016) model time 0.2294 (0.2272) loss 3.8473 (3.3019) grad_norm 2.7925 (inf) loss_scale 2048.0000 (2338.4544) mem 7381MB [2024-08-27 06:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][700/1251] eta 0:02:06 lr 0.000660 wd 0.0500 time 0.2396 (0.2288) data time 0.0008 (0.0016) model time 0.2389 (0.2272) loss 3.3924 (3.2995) grad_norm 2.3838 (inf) loss_scale 2048.0000 (2334.3110) mem 7381MB [2024-08-27 06:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][710/1251] eta 0:02:03 lr 0.000660 wd 0.0500 time 0.2282 (0.2288) data time 0.0007 (0.0016) model time 0.2275 (0.2272) loss 3.1812 (3.2985) grad_norm 5.0352 (inf) loss_scale 2048.0000 (2330.2841) mem 7381MB [2024-08-27 06:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][720/1251] eta 0:02:01 lr 0.000660 wd 0.0500 time 0.2268 (0.2288) data time 0.0006 (0.0016) model time 0.2262 (0.2272) loss 3.8935 (3.2968) grad_norm 1.4849 (inf) loss_scale 2048.0000 (2326.3689) mem 7381MB [2024-08-27 06:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][730/1251] eta 0:01:59 
lr 0.000660 wd 0.0500 time 0.2246 (0.2288) data time 0.0006 (0.0016) model time 0.2240 (0.2272) loss 3.3372 (3.2929) grad_norm 2.2083 (inf) loss_scale 2048.0000 (2322.5609) mem 7381MB [2024-08-27 06:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][740/1251] eta 0:01:56 lr 0.000660 wd 0.0500 time 0.2354 (0.2288) data time 0.0012 (0.0016) model time 0.2342 (0.2272) loss 3.7683 (3.2943) grad_norm 2.2243 (inf) loss_scale 2048.0000 (2318.8556) mem 7381MB [2024-08-27 06:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][750/1251] eta 0:01:54 lr 0.000660 wd 0.0500 time 0.2272 (0.2288) data time 0.0006 (0.0015) model time 0.2266 (0.2272) loss 3.4362 (3.2920) grad_norm 2.3560 (inf) loss_scale 2048.0000 (2315.2490) mem 7381MB [2024-08-27 06:55:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][760/1251] eta 0:01:52 lr 0.000660 wd 0.0500 time 0.2227 (0.2290) data time 0.0009 (0.0015) model time 0.2218 (0.2274) loss 3.2524 (3.2951) grad_norm 1.3362 (inf) loss_scale 2048.0000 (2311.7372) mem 7381MB [2024-08-27 06:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][770/1251] eta 0:01:50 lr 0.000660 wd 0.0500 time 0.2287 (0.2289) data time 0.0008 (0.0015) model time 0.2280 (0.2274) loss 3.5824 (3.2982) grad_norm 1.8900 (inf) loss_scale 2048.0000 (2308.3165) mem 7381MB [2024-08-27 06:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][780/1251] eta 0:01:47 lr 0.000660 wd 0.0500 time 0.2329 (0.2289) data time 0.0006 (0.0015) model time 0.2323 (0.2274) loss 3.4050 (3.2992) grad_norm 1.9675 (inf) loss_scale 2048.0000 (2304.9834) mem 7381MB [2024-08-27 06:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][790/1251] eta 0:01:45 lr 0.000660 wd 0.0500 time 0.2270 (0.2289) data time 0.0008 (0.0015) model time 0.2262 (0.2274) loss 2.2308 (3.2988) grad_norm 2.7541 (inf) loss_scale 2048.0000 (2301.7345) mem 7381MB [2024-08-27 06:55:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][800/1251] eta 0:01:43 lr 0.000660 wd 0.0500 time 0.2346 (0.2288) data time 0.0008 (0.0015) model time 0.2339 (0.2273) loss 3.4175 (3.3019) grad_norm 1.4147 (inf) loss_scale 2048.0000 (2298.5668) mem 7381MB [2024-08-27 06:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][810/1251] eta 0:01:40 lr 0.000660 wd 0.0500 time 0.2345 (0.2288) data time 0.0006 (0.0015) model time 0.2339 (0.2273) loss 3.3716 (3.2985) grad_norm 2.2323 (inf) loss_scale 2048.0000 (2295.4772) mem 7381MB [2024-08-27 06:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][820/1251] eta 0:01:38 lr 0.000660 wd 0.0500 time 0.2239 (0.2288) data time 0.0005 (0.0015) model time 0.2234 (0.2274) loss 2.5679 (3.2918) grad_norm 1.5591 (inf) loss_scale 2048.0000 (2292.4629) mem 7381MB [2024-08-27 06:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][830/1251] eta 0:01:36 lr 0.000660 wd 0.0500 time 0.2324 (0.2288) data time 0.0006 (0.0015) model time 0.2318 (0.2273) loss 3.1468 (3.2890) grad_norm 2.1437 (inf) loss_scale 2048.0000 (2289.5211) mem 7381MB [2024-08-27 06:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][840/1251] eta 0:01:34 lr 0.000660 wd 0.0500 time 0.2261 (0.2288) data time 0.0006 (0.0015) model time 0.2255 (0.2273) loss 3.5020 (3.2914) grad_norm 2.4231 (inf) loss_scale 2048.0000 (2286.6492) mem 7381MB [2024-08-27 06:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][850/1251] eta 0:01:31 lr 0.000660 wd 0.0500 time 0.2230 (0.2287) data time 0.0009 (0.0015) model time 0.2221 (0.2273) loss 3.2139 (3.2924) grad_norm 2.0726 (inf) loss_scale 2048.0000 (2283.8449) mem 7381MB [2024-08-27 06:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][860/1251] eta 0:01:29 lr 0.000660 wd 0.0500 time 0.2362 (0.2287) data time 0.0009 (0.0015) model time 0.2354 (0.2273) loss 3.6419 (3.2951) 
grad_norm 1.8045 (inf) loss_scale 2048.0000 (2281.1057) mem 7381MB [2024-08-27 06:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][870/1251] eta 0:01:27 lr 0.000660 wd 0.0500 time 0.2259 (0.2287) data time 0.0005 (0.0015) model time 0.2254 (0.2272) loss 3.9690 (3.2984) grad_norm 2.5762 (inf) loss_scale 2048.0000 (2278.4294) mem 7381MB [2024-08-27 06:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][880/1251] eta 0:01:24 lr 0.000660 wd 0.0500 time 0.2456 (0.2287) data time 0.0006 (0.0014) model time 0.2451 (0.2272) loss 2.2726 (3.2965) grad_norm 1.5463 (inf) loss_scale 2048.0000 (2275.8138) mem 7381MB [2024-08-27 06:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][890/1251] eta 0:01:22 lr 0.000659 wd 0.0500 time 0.2243 (0.2287) data time 0.0006 (0.0014) model time 0.2237 (0.2273) loss 4.0194 (3.2958) grad_norm 5.6457 (inf) loss_scale 2048.0000 (2273.2570) mem 7381MB [2024-08-27 06:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][900/1251] eta 0:01:20 lr 0.000659 wd 0.0500 time 0.2218 (0.2287) data time 0.0008 (0.0014) model time 0.2210 (0.2272) loss 2.2357 (3.2958) grad_norm 2.1502 (inf) loss_scale 2048.0000 (2270.7569) mem 7381MB [2024-08-27 06:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][910/1251] eta 0:01:17 lr 0.000659 wd 0.0500 time 0.2283 (0.2287) data time 0.0009 (0.0014) model time 0.2273 (0.2273) loss 3.4741 (3.2960) grad_norm 2.1189 (inf) loss_scale 2048.0000 (2268.3117) mem 7381MB [2024-08-27 06:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][920/1251] eta 0:01:15 lr 0.000659 wd 0.0500 time 0.2282 (0.2286) data time 0.0006 (0.0014) model time 0.2277 (0.2272) loss 2.6340 (3.2933) grad_norm 1.7989 (inf) loss_scale 2048.0000 (2265.9197) mem 7381MB [2024-08-27 06:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][930/1251] eta 0:01:13 lr 0.000659 wd 0.0500 time 0.2236 
(0.2286) data time 0.0009 (0.0014) model time 0.2227 (0.2272) loss 2.8562 (3.2953) grad_norm 1.6531 (inf) loss_scale 2048.0000 (2263.5789) mem 7381MB [2024-08-27 06:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][940/1251] eta 0:01:11 lr 0.000659 wd 0.0500 time 0.2266 (0.2285) data time 0.0007 (0.0014) model time 0.2259 (0.2271) loss 3.5726 (3.2960) grad_norm 1.8465 (inf) loss_scale 2048.0000 (2261.2880) mem 7381MB [2024-08-27 06:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][950/1251] eta 0:01:08 lr 0.000659 wd 0.0500 time 0.2306 (0.2285) data time 0.0008 (0.0014) model time 0.2298 (0.2271) loss 3.1397 (3.2956) grad_norm 2.2611 (inf) loss_scale 2048.0000 (2259.0452) mem 7381MB [2024-08-27 06:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][960/1251] eta 0:01:06 lr 0.000659 wd 0.0500 time 0.2354 (0.2285) data time 0.0006 (0.0014) model time 0.2348 (0.2271) loss 3.6897 (3.2979) grad_norm 2.5384 (inf) loss_scale 2048.0000 (2256.8491) mem 7381MB [2024-08-27 06:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][970/1251] eta 0:01:04 lr 0.000659 wd 0.0500 time 0.2369 (0.2285) data time 0.0006 (0.0014) model time 0.2363 (0.2271) loss 2.9993 (3.2983) grad_norm 2.1613 (inf) loss_scale 2048.0000 (2254.6982) mem 7381MB [2024-08-27 06:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][980/1251] eta 0:01:01 lr 0.000659 wd 0.0500 time 0.2298 (0.2285) data time 0.0007 (0.0014) model time 0.2291 (0.2271) loss 3.4990 (3.2976) grad_norm 2.0507 (inf) loss_scale 2048.0000 (2252.5912) mem 7381MB [2024-08-27 06:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][990/1251] eta 0:00:59 lr 0.000659 wd 0.0500 time 0.2259 (0.2285) data time 0.0007 (0.0014) model time 0.2252 (0.2271) loss 3.7297 (3.2987) grad_norm 2.4406 (inf) loss_scale 2048.0000 (2250.5267) mem 7381MB [2024-08-27 06:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [131/300][1000/1251] eta 0:00:57 lr 0.000659 wd 0.0500 time 0.2334 (0.2285) data time 0.0008 (0.0014) model time 0.2326 (0.2271) loss 3.3314 (3.3007) grad_norm 2.6798 (inf) loss_scale 2048.0000 (2248.5035) mem 7381MB [2024-08-27 06:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1010/1251] eta 0:00:55 lr 0.000659 wd 0.0500 time 0.2336 (0.2285) data time 0.0006 (0.0014) model time 0.2330 (0.2271) loss 2.7707 (3.2990) grad_norm 2.2317 (inf) loss_scale 2048.0000 (2246.5203) mem 7381MB [2024-08-27 06:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1020/1251] eta 0:00:52 lr 0.000659 wd 0.0500 time 0.2275 (0.2285) data time 0.0006 (0.0014) model time 0.2270 (0.2271) loss 2.4326 (3.2959) grad_norm 2.2841 (inf) loss_scale 2048.0000 (2244.5759) mem 7381MB [2024-08-27 06:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1030/1251] eta 0:00:50 lr 0.000659 wd 0.0500 time 0.2269 (0.2285) data time 0.0006 (0.0014) model time 0.2264 (0.2271) loss 3.7915 (3.2963) grad_norm 2.1117 (inf) loss_scale 2048.0000 (2242.6693) mem 7381MB [2024-08-27 06:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1040/1251] eta 0:00:48 lr 0.000659 wd 0.0500 time 0.2264 (0.2285) data time 0.0009 (0.0014) model time 0.2255 (0.2271) loss 3.7788 (3.2952) grad_norm 2.4817 (inf) loss_scale 2048.0000 (2240.7992) mem 7381MB [2024-08-27 06:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1050/1251] eta 0:00:45 lr 0.000659 wd 0.0500 time 0.2247 (0.2286) data time 0.0006 (0.0013) model time 0.2241 (0.2272) loss 4.0469 (3.2971) grad_norm 3.4357 (inf) loss_scale 2048.0000 (2238.9648) mem 7381MB [2024-08-27 06:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1060/1251] eta 0:00:43 lr 0.000659 wd 0.0500 time 0.2284 (0.2286) data time 0.0009 (0.0013) model time 0.2275 (0.2272) loss 3.4575 (3.2953) grad_norm 3.1817 (inf) loss_scale 2048.0000 
(2237.1649) mem 7381MB [2024-08-27 06:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1070/1251] eta 0:00:41 lr 0.000659 wd 0.0500 time 0.2269 (0.2285) data time 0.0009 (0.0013) model time 0.2260 (0.2272) loss 3.5186 (3.2966) grad_norm 1.7622 (inf) loss_scale 2048.0000 (2235.3987) mem 7381MB [2024-08-27 06:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1080/1251] eta 0:00:39 lr 0.000659 wd 0.0500 time 0.2203 (0.2285) data time 0.0009 (0.0013) model time 0.2193 (0.2272) loss 2.0885 (3.2940) grad_norm 1.7438 (inf) loss_scale 2048.0000 (2233.6651) mem 7381MB [2024-08-27 06:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1090/1251] eta 0:00:36 lr 0.000659 wd 0.0500 time 0.2234 (0.2285) data time 0.0007 (0.0013) model time 0.2227 (0.2272) loss 3.5474 (3.2930) grad_norm 1.5997 (inf) loss_scale 2048.0000 (2231.9633) mem 7381MB [2024-08-27 06:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1100/1251] eta 0:00:34 lr 0.000659 wd 0.0500 time 0.2236 (0.2285) data time 0.0009 (0.0013) model time 0.2226 (0.2272) loss 4.0395 (3.2922) grad_norm 2.1327 (inf) loss_scale 2048.0000 (2230.2925) mem 7381MB [2024-08-27 06:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1110/1251] eta 0:00:32 lr 0.000659 wd 0.0500 time 0.2298 (0.2285) data time 0.0006 (0.0013) model time 0.2292 (0.2272) loss 3.6521 (3.2920) grad_norm 2.6349 (inf) loss_scale 2048.0000 (2228.6517) mem 7381MB [2024-08-27 06:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1120/1251] eta 0:00:29 lr 0.000659 wd 0.0500 time 0.2243 (0.2284) data time 0.0007 (0.0013) model time 0.2237 (0.2271) loss 3.0238 (3.2883) grad_norm 2.6272 (inf) loss_scale 2048.0000 (2227.0401) mem 7381MB [2024-08-27 06:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1130/1251] eta 0:00:27 lr 0.000658 wd 0.0500 time 0.2296 (0.2284) data time 0.0006 (0.0013) model 
time 0.2291 (0.2271) loss 3.8060 (3.2896) grad_norm 2.1390 (inf) loss_scale 2048.0000 (2225.4571) mem 7381MB [2024-08-27 06:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1140/1251] eta 0:00:25 lr 0.000658 wd 0.0500 time 0.2236 (0.2284) data time 0.0008 (0.0013) model time 0.2228 (0.2271) loss 2.7802 (3.2892) grad_norm 2.1052 (inf) loss_scale 2048.0000 (2223.9018) mem 7381MB [2024-08-27 06:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1150/1251] eta 0:00:23 lr 0.000658 wd 0.0500 time 0.2289 (0.2284) data time 0.0007 (0.0013) model time 0.2283 (0.2271) loss 3.6648 (3.2880) grad_norm 1.8675 (inf) loss_scale 2048.0000 (2222.3736) mem 7381MB [2024-08-27 06:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1160/1251] eta 0:00:20 lr 0.000658 wd 0.0500 time 0.2211 (0.2284) data time 0.0006 (0.0013) model time 0.2205 (0.2271) loss 2.4927 (3.2870) grad_norm 2.1268 (inf) loss_scale 2048.0000 (2220.8717) mem 7381MB [2024-08-27 06:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1170/1251] eta 0:00:18 lr 0.000658 wd 0.0500 time 0.2249 (0.2284) data time 0.0006 (0.0013) model time 0.2243 (0.2271) loss 3.7186 (3.2886) grad_norm 2.0278 (inf) loss_scale 2048.0000 (2219.3954) mem 7381MB [2024-08-27 06:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1180/1251] eta 0:00:16 lr 0.000658 wd 0.0500 time 0.2318 (0.2283) data time 0.0005 (0.0013) model time 0.2313 (0.2270) loss 2.7003 (3.2895) grad_norm 1.9794 (inf) loss_scale 2048.0000 (2217.9441) mem 7381MB [2024-08-27 06:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1190/1251] eta 0:00:13 lr 0.000658 wd 0.0500 time 0.2285 (0.2283) data time 0.0008 (0.0013) model time 0.2277 (0.2270) loss 3.5645 (3.2894) grad_norm 1.9068 (inf) loss_scale 2048.0000 (2216.5172) mem 7381MB [2024-08-27 06:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1200/1251] 
eta 0:00:11 lr 0.000658 wd 0.0500 time 0.2306 (0.2283) data time 0.0007 (0.0013) model time 0.2298 (0.2270) loss 2.3127 (3.2900) grad_norm 2.4265 (inf) loss_scale 2048.0000 (2215.1141) mem 7381MB [2024-08-27 06:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1210/1251] eta 0:00:09 lr 0.000658 wd 0.0500 time 0.2286 (0.2283) data time 0.0008 (0.0013) model time 0.2278 (0.2270) loss 3.6703 (3.2910) grad_norm 2.5729 (inf) loss_scale 2048.0000 (2213.7341) mem 7381MB [2024-08-27 06:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1220/1251] eta 0:00:07 lr 0.000658 wd 0.0500 time 0.2253 (0.2283) data time 0.0006 (0.0013) model time 0.2247 (0.2270) loss 2.8922 (3.2916) grad_norm 2.1421 (inf) loss_scale 2048.0000 (2212.3767) mem 7381MB [2024-08-27 06:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1230/1251] eta 0:00:04 lr 0.000658 wd 0.0500 time 0.2349 (0.2283) data time 0.0006 (0.0013) model time 0.2342 (0.2270) loss 3.8346 (3.2911) grad_norm 1.7128 (inf) loss_scale 2048.0000 (2211.0414) mem 7381MB [2024-08-27 06:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1240/1251] eta 0:00:02 lr 0.000658 wd 0.0500 time 0.2161 (0.2282) data time 0.0003 (0.0013) model time 0.2158 (0.2269) loss 3.4101 (3.2923) grad_norm 2.7590 (inf) loss_scale 2048.0000 (2209.7276) mem 7381MB [2024-08-27 06:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [131/300][1250/1251] eta 0:00:00 lr 0.000658 wd 0.0500 time 0.2130 (0.2281) data time 0.0003 (0.0013) model time 0.2127 (0.2268) loss 3.8313 (3.2918) grad_norm 2.7559 (inf) loss_scale 2048.0000 (2208.4349) mem 7381MB [2024-08-27 06:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 131 training takes 0:04:45 [2024-08-27 06:57:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 06:57:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 06:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.358 (0.358) Loss 0.5132 (0.5132) Acc@1 90.625 (90.625) Acc@5 97.949 (97.949) Mem 7381MB [2024-08-27 06:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.099) Loss 0.7827 (0.7383) Acc@1 82.617 (83.896) Acc@5 96.582 (96.848) Mem 7381MB [2024-08-27 06:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.067 (0.085) Loss 1.0576 (0.7651) Acc@1 74.707 (83.003) Acc@5 94.043 (96.805) Mem 7381MB [2024-08-27 06:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.080) Loss 1.3115 (0.8695) Acc@1 67.480 (80.541) Acc@5 90.137 (95.552) Mem 7381MB [2024-08-27 06:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.2139 (0.9344) Acc@1 72.461 (78.973) Acc@5 91.602 (94.843) Mem 7381MB [2024-08-27 06:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.578 Acc@5 94.754 [2024-08-27 06:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.6% [2024-08-27 06:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.769 (0.769) Loss 0.4167 (0.4167) Acc@1 92.578 (92.578) Acc@5 98.730 (98.730) Mem 7381MB [2024-08-27 06:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.066 (0.135) Loss 0.6636 (0.6550) Acc@1 87.500 (85.982) Acc@5 96.875 (97.346) Mem 7381MB [2024-08-27 06:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.104) Loss 0.9277 (0.6781) Acc@1 77.441 (84.989) Acc@5 94.824 (97.312) Mem 7381MB [2024-08-27 06:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.093) Loss 1.1826 (0.7710) Acc@1 70.020 (82.696) Acc@5 
91.699 (96.276) Mem 7381MB [2024-08-27 06:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.0693 (0.8195) Acc@1 72.949 (81.293) Acc@5 93.652 (95.729) Mem 7381MB [2024-08-27 06:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.886 Acc@5 95.694 [2024-08-27 06:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.9% [2024-08-27 06:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.89% [2024-08-27 06:57:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 06:57:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 06:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][0/1251] eta 0:13:15 lr 0.000658 wd 0.0500 time 0.6362 (0.6362) data time 0.4231 (0.4231) model time 0.0000 (0.0000) loss 3.4781 (3.4781) grad_norm 2.4449 (2.4449) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][10/1251] eta 0:05:28 lr 0.000658 wd 0.0500 time 0.2379 (0.2644) data time 0.0006 (0.0392) model time 0.0000 (0.0000) loss 3.5658 (3.3173) grad_norm 1.9637 (2.2041) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][20/1251] eta 0:05:01 lr 0.000658 wd 0.0500 time 0.2268 (0.2451) data time 0.0006 (0.0209) model time 0.0000 (0.0000) loss 3.4809 (3.2199) grad_norm 1.9056 (2.2372) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][30/1251] eta 0:04:51 lr 0.000658 wd 0.0500 time 0.2219 (0.2388) data time 0.0006 (0.0144) model time 0.0000 (0.0000) loss 3.6260 (3.2565) grad_norm 1.9923 (2.3221) 
loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][40/1251] eta 0:04:44 lr 0.000658 wd 0.0500 time 0.2220 (0.2352) data time 0.0009 (0.0111) model time 0.0000 (0.0000) loss 3.6979 (3.2615) grad_norm 1.7106 (2.2876) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][50/1251] eta 0:04:40 lr 0.000658 wd 0.0500 time 0.2259 (0.2333) data time 0.0009 (0.0091) model time 0.0000 (0.0000) loss 3.2649 (3.2813) grad_norm 1.9707 (2.2695) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][60/1251] eta 0:04:37 lr 0.000658 wd 0.0500 time 0.2461 (0.2326) data time 0.0008 (0.0077) model time 0.2453 (0.2281) loss 3.8203 (3.2938) grad_norm 2.2148 (2.3089) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][70/1251] eta 0:04:33 lr 0.000658 wd 0.0500 time 0.2319 (0.2316) data time 0.0008 (0.0068) model time 0.2311 (0.2264) loss 3.6089 (3.2926) grad_norm 2.4154 (2.3089) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][80/1251] eta 0:04:30 lr 0.000658 wd 0.0500 time 0.2198 (0.2309) data time 0.0009 (0.0060) model time 0.2189 (0.2259) loss 3.9858 (3.3021) grad_norm 2.6921 (2.3061) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][90/1251] eta 0:04:27 lr 0.000658 wd 0.0500 time 0.2241 (0.2303) data time 0.0007 (0.0055) model time 0.2233 (0.2255) loss 3.5015 (3.3076) grad_norm 2.9440 (2.4107) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][100/1251] eta 0:04:24 lr 0.000658 wd 0.0500 time 0.2274 (0.2298) data 
time 0.0006 (0.0050) model time 0.2268 (0.2254) loss 3.7269 (3.3130) grad_norm 1.4411 (2.3685) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][110/1251] eta 0:04:21 lr 0.000657 wd 0.0500 time 0.2270 (0.2296) data time 0.0006 (0.0046) model time 0.2264 (0.2255) loss 1.7164 (3.3118) grad_norm 2.4384 (2.3747) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][120/1251] eta 0:04:19 lr 0.000657 wd 0.0500 time 0.2344 (0.2294) data time 0.0008 (0.0043) model time 0.2337 (0.2256) loss 2.7931 (3.2922) grad_norm 2.4830 (2.3835) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][130/1251] eta 0:04:16 lr 0.000657 wd 0.0500 time 0.2322 (0.2291) data time 0.0008 (0.0041) model time 0.2314 (0.2256) loss 2.8174 (3.2956) grad_norm 2.0215 (2.3907) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][140/1251] eta 0:04:14 lr 0.000657 wd 0.0500 time 0.2358 (0.2290) data time 0.0006 (0.0038) model time 0.2353 (0.2257) loss 3.8955 (3.2762) grad_norm 2.0283 (2.3868) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][150/1251] eta 0:04:11 lr 0.000657 wd 0.0500 time 0.2321 (0.2288) data time 0.0007 (0.0036) model time 0.2313 (0.2257) loss 3.4286 (3.2617) grad_norm 2.0316 (2.3874) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][160/1251] eta 0:04:09 lr 0.000657 wd 0.0500 time 0.2335 (0.2289) data time 0.0008 (0.0035) model time 0.2326 (0.2260) loss 3.6295 (3.2637) grad_norm 2.1583 (2.3573) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:03 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [132/300][170/1251] eta 0:04:07 lr 0.000657 wd 0.0500 time 0.2290 (0.2288) data time 0.0006 (0.0033) model time 0.2284 (0.2260) loss 3.5554 (3.2622) grad_norm 1.6163 (2.3555) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][180/1251] eta 0:04:04 lr 0.000657 wd 0.0500 time 0.2263 (0.2287) data time 0.0008 (0.0032) model time 0.2255 (0.2261) loss 3.1287 (3.2713) grad_norm 2.7302 (2.3442) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][190/1251] eta 0:04:02 lr 0.000657 wd 0.0500 time 0.2290 (0.2286) data time 0.0007 (0.0030) model time 0.2284 (0.2261) loss 3.2123 (3.2772) grad_norm 2.3499 (2.3440) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][200/1251] eta 0:04:00 lr 0.000657 wd 0.0500 time 0.2235 (0.2285) data time 0.0007 (0.0029) model time 0.2228 (0.2261) loss 4.1646 (3.2729) grad_norm 2.9134 (2.3504) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][210/1251] eta 0:03:57 lr 0.000657 wd 0.0500 time 0.2285 (0.2284) data time 0.0009 (0.0028) model time 0.2276 (0.2260) loss 3.5167 (3.2646) grad_norm 1.9900 (2.3528) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][220/1251] eta 0:03:55 lr 0.000657 wd 0.0500 time 0.2240 (0.2283) data time 0.0009 (0.0027) model time 0.2231 (0.2260) loss 3.3014 (3.2602) grad_norm 1.8894 (2.3409) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][230/1251] eta 0:03:53 lr 0.000657 wd 0.0500 time 0.2245 (0.2283) data time 0.0007 (0.0027) model time 0.2237 (0.2260) loss 3.5779 (3.2578) grad_norm 
2.4091 (2.3372) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][240/1251] eta 0:03:50 lr 0.000657 wd 0.0500 time 0.2237 (0.2282) data time 0.0009 (0.0026) model time 0.2228 (0.2260) loss 3.2347 (3.2655) grad_norm 1.4770 (2.3443) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][250/1251] eta 0:03:48 lr 0.000657 wd 0.0500 time 0.2255 (0.2280) data time 0.0006 (0.0025) model time 0.2248 (0.2258) loss 3.7823 (3.2673) grad_norm 1.7988 (2.3477) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][260/1251] eta 0:03:45 lr 0.000657 wd 0.0500 time 0.2227 (0.2279) data time 0.0008 (0.0025) model time 0.2219 (0.2257) loss 3.2769 (3.2660) grad_norm 2.3491 (2.3415) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][270/1251] eta 0:03:43 lr 0.000657 wd 0.0500 time 0.2265 (0.2279) data time 0.0009 (0.0024) model time 0.2256 (0.2257) loss 2.9409 (3.2639) grad_norm 2.1649 (2.3455) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][280/1251] eta 0:03:41 lr 0.000657 wd 0.0500 time 0.2303 (0.2279) data time 0.0013 (0.0023) model time 0.2290 (0.2258) loss 3.9905 (3.2629) grad_norm 2.3495 (2.3441) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][290/1251] eta 0:03:38 lr 0.000657 wd 0.0500 time 0.2275 (0.2279) data time 0.0006 (0.0023) model time 0.2269 (0.2258) loss 3.3531 (3.2552) grad_norm 2.3571 (2.3550) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][300/1251] eta 0:03:36 lr 0.000657 wd 0.0500 time 
0.2294 (0.2278) data time 0.0006 (0.0022) model time 0.2288 (0.2258) loss 3.3812 (3.2575) grad_norm 2.4291 (2.3622) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][310/1251] eta 0:03:34 lr 0.000657 wd 0.0500 time 0.2384 (0.2279) data time 0.0009 (0.0022) model time 0.2375 (0.2259) loss 3.5950 (3.2518) grad_norm 1.8147 (2.3581) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][320/1251] eta 0:03:32 lr 0.000657 wd 0.0500 time 0.2251 (0.2287) data time 0.0006 (0.0022) model time 0.2246 (0.2270) loss 3.6426 (3.2528) grad_norm 1.7394 (2.3514) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][330/1251] eta 0:03:30 lr 0.000657 wd 0.0500 time 0.2225 (0.2286) data time 0.0006 (0.0021) model time 0.2219 (0.2269) loss 2.1234 (3.2487) grad_norm 1.9477 (2.3556) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][340/1251] eta 0:03:28 lr 0.000657 wd 0.0500 time 0.2253 (0.2285) data time 0.0007 (0.0021) model time 0.2246 (0.2268) loss 2.2418 (3.2425) grad_norm 1.9217 (2.3521) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][350/1251] eta 0:03:25 lr 0.000656 wd 0.0500 time 0.2203 (0.2284) data time 0.0009 (0.0020) model time 0.2194 (0.2268) loss 3.4760 (3.2484) grad_norm 2.1697 (2.3545) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][360/1251] eta 0:03:23 lr 0.000656 wd 0.0500 time 0.2232 (0.2283) data time 0.0007 (0.0020) model time 0.2224 (0.2267) loss 2.9575 (3.2489) grad_norm 1.9483 (2.3540) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][370/1251] eta 0:03:21 lr 0.000656 wd 0.0500 time 0.2190 (0.2282) data time 0.0007 (0.0020) model time 0.2182 (0.2266) loss 3.6797 (3.2603) grad_norm 2.3381 (2.3534) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][380/1251] eta 0:03:18 lr 0.000656 wd 0.0500 time 0.2240 (0.2283) data time 0.0006 (0.0019) model time 0.2233 (0.2266) loss 2.1169 (3.2494) grad_norm 3.3817 (2.3552) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][390/1251] eta 0:03:16 lr 0.000656 wd 0.0500 time 0.2220 (0.2281) data time 0.0006 (0.0019) model time 0.2214 (0.2265) loss 2.9296 (3.2500) grad_norm 2.4927 (2.3628) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][400/1251] eta 0:03:14 lr 0.000656 wd 0.0500 time 0.2236 (0.2280) data time 0.0006 (0.0019) model time 0.2230 (0.2264) loss 3.8502 (3.2546) grad_norm 2.0948 (2.3629) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][410/1251] eta 0:03:11 lr 0.000656 wd 0.0500 time 0.2250 (0.2280) data time 0.0009 (0.0019) model time 0.2241 (0.2263) loss 2.9045 (3.2540) grad_norm 2.3908 (2.3620) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][420/1251] eta 0:03:09 lr 0.000656 wd 0.0500 time 0.2264 (0.2279) data time 0.0005 (0.0019) model time 0.2259 (0.2263) loss 2.9163 (3.2532) grad_norm 1.9549 (2.3569) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][430/1251] eta 0:03:07 lr 0.000656 wd 0.0500 time 0.2295 (0.2278) data time 0.0007 (0.0019) model time 0.2288 (0.2262) loss 3.0126 
(3.2557) grad_norm 2.4997 (2.3612) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][440/1251] eta 0:03:04 lr 0.000656 wd 0.0500 time 0.2265 (0.2278) data time 0.0006 (0.0018) model time 0.2259 (0.2261) loss 3.9966 (3.2609) grad_norm 2.4746 (2.3664) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][450/1251] eta 0:03:02 lr 0.000656 wd 0.0500 time 0.2243 (0.2277) data time 0.0008 (0.0018) model time 0.2235 (0.2260) loss 3.0351 (3.2619) grad_norm 2.9236 (2.3740) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][460/1251] eta 0:03:00 lr 0.000656 wd 0.0500 time 0.2231 (0.2276) data time 0.0007 (0.0018) model time 0.2224 (0.2260) loss 2.7403 (3.2609) grad_norm 2.3938 (2.3787) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][470/1251] eta 0:02:57 lr 0.000656 wd 0.0500 time 0.2253 (0.2275) data time 0.0008 (0.0018) model time 0.2245 (0.2259) loss 3.7821 (3.2706) grad_norm 2.2349 (2.3921) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][480/1251] eta 0:02:55 lr 0.000656 wd 0.0500 time 0.2285 (0.2275) data time 0.0008 (0.0018) model time 0.2276 (0.2258) loss 3.5867 (3.2780) grad_norm 2.0576 (2.4026) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][490/1251] eta 0:02:53 lr 0.000656 wd 0.0500 time 0.2219 (0.2274) data time 0.0009 (0.0018) model time 0.2210 (0.2258) loss 3.1730 (3.2782) grad_norm 1.9470 (2.3970) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][500/1251] eta 0:02:50 lr 
0.000656 wd 0.0500 time 0.2245 (0.2274) data time 0.0008 (0.0018) model time 0.2237 (0.2257) loss 3.5086 (3.2777) grad_norm 2.1141 (2.3965) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][510/1251] eta 0:02:48 lr 0.000656 wd 0.0500 time 0.2300 (0.2273) data time 0.0007 (0.0017) model time 0.2293 (0.2257) loss 3.4777 (3.2766) grad_norm 1.8029 (2.3999) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][520/1251] eta 0:02:46 lr 0.000656 wd 0.0500 time 0.2246 (0.2276) data time 0.0010 (0.0017) model time 0.2236 (0.2261) loss 2.8376 (3.2796) grad_norm 2.3749 (2.3956) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][530/1251] eta 0:02:44 lr 0.000656 wd 0.0500 time 0.2256 (0.2276) data time 0.0009 (0.0017) model time 0.2247 (0.2261) loss 2.6968 (3.2788) grad_norm 1.7374 (2.3925) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][540/1251] eta 0:02:41 lr 0.000656 wd 0.0500 time 0.2329 (0.2276) data time 0.0009 (0.0017) model time 0.2321 (0.2261) loss 3.7167 (3.2802) grad_norm 1.9686 (2.3941) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][550/1251] eta 0:02:39 lr 0.000656 wd 0.0500 time 0.2262 (0.2276) data time 0.0006 (0.0017) model time 0.2256 (0.2261) loss 4.1506 (3.2849) grad_norm 4.2755 (2.4008) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][560/1251] eta 0:02:37 lr 0.000656 wd 0.0500 time 0.2295 (0.2276) data time 0.0007 (0.0017) model time 0.2288 (0.2261) loss 2.5341 (3.2881) grad_norm 2.6925 (2.4023) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 
06:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][570/1251] eta 0:02:34 lr 0.000656 wd 0.0500 time 0.2319 (0.2275) data time 0.0006 (0.0017) model time 0.2313 (0.2261) loss 3.3492 (3.2851) grad_norm 2.2655 (2.3977) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][580/1251] eta 0:02:32 lr 0.000656 wd 0.0500 time 0.2221 (0.2275) data time 0.0010 (0.0016) model time 0.2211 (0.2261) loss 3.6103 (3.2869) grad_norm 2.0630 (2.3924) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][590/1251] eta 0:02:30 lr 0.000655 wd 0.0500 time 0.2243 (0.2275) data time 0.0009 (0.0016) model time 0.2233 (0.2261) loss 3.4431 (3.2876) grad_norm 1.8170 (2.3875) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][600/1251] eta 0:02:28 lr 0.000655 wd 0.0500 time 0.2262 (0.2275) data time 0.0008 (0.0016) model time 0.2254 (0.2260) loss 4.5213 (3.2900) grad_norm 2.1595 (2.3825) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][610/1251] eta 0:02:25 lr 0.000655 wd 0.0500 time 0.2307 (0.2275) data time 0.0007 (0.0016) model time 0.2300 (0.2260) loss 2.9782 (3.2927) grad_norm 1.5815 (2.3805) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][620/1251] eta 0:02:23 lr 0.000655 wd 0.0500 time 0.2278 (0.2275) data time 0.0006 (0.0016) model time 0.2272 (0.2260) loss 3.6323 (3.2987) grad_norm 2.4436 (2.3759) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][630/1251] eta 0:02:21 lr 0.000655 wd 0.0500 time 0.2234 (0.2275) data time 0.0007 (0.0016) model time 0.2227 (0.2260) 
loss 3.8474 (3.3024) grad_norm 1.9480 (2.3707) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][640/1251] eta 0:02:18 lr 0.000655 wd 0.0500 time 0.2280 (0.2274) data time 0.0008 (0.0016) model time 0.2272 (0.2260) loss 3.2642 (3.3032) grad_norm 3.4735 (2.3709) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][650/1251] eta 0:02:16 lr 0.000655 wd 0.0500 time 0.2325 (0.2274) data time 0.0008 (0.0016) model time 0.2318 (0.2260) loss 2.1697 (3.3007) grad_norm 2.3550 (2.3696) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][660/1251] eta 0:02:14 lr 0.000655 wd 0.0500 time 0.2279 (0.2274) data time 0.0009 (0.0016) model time 0.2270 (0.2260) loss 3.8956 (3.3002) grad_norm 1.4811 (2.3668) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][670/1251] eta 0:02:12 lr 0.000655 wd 0.0500 time 0.2216 (0.2274) data time 0.0008 (0.0015) model time 0.2208 (0.2260) loss 3.4781 (3.2967) grad_norm 1.9890 (2.3689) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 06:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][680/1251] eta 0:02:09 lr 0.000655 wd 0.0500 time 0.2282 (0.2274) data time 0.0006 (0.0015) model time 0.2276 (0.2260) loss 3.7315 (3.2981) grad_norm 1.7656 (2.3643) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][690/1251] eta 0:02:07 lr 0.000655 wd 0.0500 time 0.2238 (0.2274) data time 0.0006 (0.0015) model time 0.2233 (0.2260) loss 3.0365 (3.2990) grad_norm 1.9124 (2.3623) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][700/1251] eta 
0:02:05 lr 0.000655 wd 0.0500 time 0.2247 (0.2274) data time 0.0008 (0.0015) model time 0.2239 (0.2260) loss 2.8326 (3.2981) grad_norm 2.2764 (2.3623) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][710/1251] eta 0:02:02 lr 0.000655 wd 0.0500 time 0.2215 (0.2273) data time 0.0009 (0.0015) model time 0.2206 (0.2260) loss 2.6254 (3.2950) grad_norm 1.9856 (2.3623) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][720/1251] eta 0:02:00 lr 0.000655 wd 0.0500 time 0.2271 (0.2274) data time 0.0006 (0.0015) model time 0.2265 (0.2260) loss 3.5946 (3.2941) grad_norm 2.3371 (2.3616) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][730/1251] eta 0:01:58 lr 0.000655 wd 0.0500 time 0.2275 (0.2274) data time 0.0010 (0.0015) model time 0.2266 (0.2260) loss 3.7016 (3.2950) grad_norm 2.5023 (2.3683) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][740/1251] eta 0:01:56 lr 0.000655 wd 0.0500 time 0.2244 (0.2274) data time 0.0008 (0.0015) model time 0.2235 (0.2260) loss 4.4539 (3.2962) grad_norm 1.9397 (2.3681) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][750/1251] eta 0:01:53 lr 0.000655 wd 0.0500 time 0.2691 (0.2275) data time 0.0008 (0.0015) model time 0.2683 (0.2262) loss 3.6826 (3.2963) grad_norm 2.2781 (2.3712) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][760/1251] eta 0:01:51 lr 0.000655 wd 0.0500 time 0.2304 (0.2276) data time 0.0008 (0.0015) model time 0.2296 (0.2263) loss 3.1494 (3.2942) grad_norm 1.9896 (2.3746) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-27 07:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][770/1251] eta 0:01:49 lr 0.000655 wd 0.0500 time 0.2221 (0.2276) data time 0.0007 (0.0015) model time 0.2214 (0.2262) loss 2.5931 (3.2915) grad_norm 1.8196 (2.3764) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][780/1251] eta 0:01:47 lr 0.000655 wd 0.0500 time 0.2279 (0.2275) data time 0.0008 (0.0015) model time 0.2270 (0.2262) loss 3.4705 (3.2873) grad_norm 3.2459 (2.3859) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][790/1251] eta 0:01:44 lr 0.000655 wd 0.0500 time 0.2306 (0.2275) data time 0.0007 (0.0014) model time 0.2299 (0.2262) loss 3.6158 (3.2869) grad_norm 2.4143 (2.3847) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][800/1251] eta 0:01:42 lr 0.000655 wd 0.0500 time 0.2231 (0.2275) data time 0.0010 (0.0014) model time 0.2221 (0.2262) loss 2.7212 (3.2838) grad_norm 2.1188 (2.3827) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][810/1251] eta 0:01:40 lr 0.000655 wd 0.0500 time 0.2190 (0.2275) data time 0.0008 (0.0014) model time 0.2182 (0.2262) loss 3.0305 (3.2835) grad_norm 2.9897 (2.3817) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][820/1251] eta 0:01:38 lr 0.000654 wd 0.0500 time 0.2217 (0.2275) data time 0.0007 (0.0014) model time 0.2210 (0.2262) loss 2.7299 (3.2828) grad_norm 2.2158 (2.3789) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][830/1251] eta 0:01:35 lr 0.000654 wd 0.0500 time 0.2280 (0.2275) data time 0.0005 (0.0014) model time 0.2275 
(0.2262) loss 3.9183 (3.2836) grad_norm 1.6771 (2.3817) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][840/1251] eta 0:01:33 lr 0.000654 wd 0.0500 time 0.2193 (0.2277) data time 0.0007 (0.0014) model time 0.2186 (0.2264) loss 3.5187 (3.2837) grad_norm 2.5208 (2.3859) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][850/1251] eta 0:01:31 lr 0.000654 wd 0.0500 time 0.2208 (0.2279) data time 0.0008 (0.0014) model time 0.2200 (0.2266) loss 3.5590 (3.2884) grad_norm 1.7356 (2.3860) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][860/1251] eta 0:01:29 lr 0.000654 wd 0.0500 time 0.2214 (0.2279) data time 0.0007 (0.0014) model time 0.2207 (0.2266) loss 3.5190 (3.2893) grad_norm 2.0329 (2.3849) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][870/1251] eta 0:01:26 lr 0.000654 wd 0.0500 time 0.2275 (0.2278) data time 0.0005 (0.0014) model time 0.2270 (0.2266) loss 3.7950 (3.2865) grad_norm 2.8834 (2.3807) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][880/1251] eta 0:01:24 lr 0.000654 wd 0.0500 time 0.2191 (0.2278) data time 0.0008 (0.0014) model time 0.2183 (0.2266) loss 3.7428 (3.2863) grad_norm 1.9754 (2.3806) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][890/1251] eta 0:01:22 lr 0.000654 wd 0.0500 time 0.2273 (0.2278) data time 0.0008 (0.0014) model time 0.2265 (0.2266) loss 3.4079 (3.2890) grad_norm 1.8337 (2.3823) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[132/300][900/1251] eta 0:01:19 lr 0.000654 wd 0.0500 time 0.2274 (0.2278) data time 0.0008 (0.0014) model time 0.2266 (0.2265) loss 3.5178 (3.2869) grad_norm 1.7332 (2.3796) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][910/1251] eta 0:01:17 lr 0.000654 wd 0.0500 time 0.2243 (0.2277) data time 0.0007 (0.0014) model time 0.2236 (0.2265) loss 3.3076 (3.2888) grad_norm 1.7486 (2.3789) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][920/1251] eta 0:01:15 lr 0.000654 wd 0.0500 time 0.2351 (0.2277) data time 0.0008 (0.0014) model time 0.2343 (0.2265) loss 3.1907 (3.2901) grad_norm 2.8476 (2.3774) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][930/1251] eta 0:01:13 lr 0.000654 wd 0.0500 time 0.2153 (0.2277) data time 0.0006 (0.0014) model time 0.2147 (0.2264) loss 4.2434 (3.2907) grad_norm 4.3507 (2.3815) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][940/1251] eta 0:01:11 lr 0.000654 wd 0.0500 time 0.2582 (0.2290) data time 0.0009 (0.0013) model time 0.2573 (0.2278) loss 2.7698 (3.2898) grad_norm 2.3969 (2.3805) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][950/1251] eta 0:01:08 lr 0.000654 wd 0.0500 time 0.2287 (0.2290) data time 0.0007 (0.0013) model time 0.2281 (0.2278) loss 3.5128 (3.2900) grad_norm 2.0100 (2.3793) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][960/1251] eta 0:01:06 lr 0.000654 wd 0.0500 time 0.2215 (0.2289) data time 0.0009 (0.0013) model time 0.2205 (0.2278) loss 3.1878 (3.2892) grad_norm 4.3391 (2.3766) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-27 07:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][970/1251] eta 0:01:04 lr 0.000654 wd 0.0500 time 0.2204 (0.2289) data time 0.0007 (0.0013) model time 0.2197 (0.2277) loss 2.3418 (3.2881) grad_norm 1.6180 (2.3787) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][980/1251] eta 0:01:02 lr 0.000654 wd 0.0500 time 0.2244 (0.2289) data time 0.0006 (0.0013) model time 0.2238 (0.2277) loss 3.0685 (3.2860) grad_norm 2.4104 (2.3797) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][990/1251] eta 0:00:59 lr 0.000654 wd 0.0500 time 0.2250 (0.2288) data time 0.0008 (0.0013) model time 0.2242 (0.2277) loss 3.0019 (3.2856) grad_norm 1.7644 (2.3809) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1000/1251] eta 0:00:57 lr 0.000654 wd 0.0500 time 0.2214 (0.2288) data time 0.0006 (0.0013) model time 0.2208 (0.2277) loss 2.6231 (3.2823) grad_norm 1.5806 (2.3821) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1010/1251] eta 0:00:55 lr 0.000654 wd 0.0500 time 0.2391 (0.2288) data time 0.0007 (0.0013) model time 0.2384 (0.2277) loss 3.3786 (3.2846) grad_norm 1.7560 (2.3806) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1020/1251] eta 0:00:52 lr 0.000654 wd 0.0500 time 0.2215 (0.2288) data time 0.0007 (0.0013) model time 0.2208 (0.2276) loss 3.6430 (3.2845) grad_norm 2.2995 (2.3788) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1030/1251] eta 0:00:50 lr 0.000654 wd 0.0500 time 0.2303 (0.2287) data time 0.0006 
(0.0013) model time 0.2297 (0.2276) loss 4.2253 (3.2858) grad_norm 1.9148 (2.3802) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1040/1251] eta 0:00:48 lr 0.000654 wd 0.0500 time 0.2226 (0.2302) data time 0.0008 (0.0013) model time 0.2218 (0.2292) loss 2.7093 (3.2864) grad_norm 2.8320 (2.3803) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1050/1251] eta 0:00:46 lr 0.000654 wd 0.0500 time 0.2135 (0.2302) data time 0.0008 (0.0013) model time 0.2127 (0.2291) loss 3.8635 (3.2875) grad_norm 2.7472 (2.3799) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1060/1251] eta 0:00:43 lr 0.000653 wd 0.0500 time 0.2159 (0.2301) data time 0.0005 (0.0013) model time 0.2154 (0.2290) loss 2.7945 (3.2835) grad_norm 2.0965 (2.3805) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1070/1251] eta 0:00:41 lr 0.000653 wd 0.0500 time 0.2364 (0.2313) data time 0.0007 (0.0013) model time 0.2358 (0.2303) loss 3.5828 (3.2845) grad_norm 2.0160 (2.3779) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1080/1251] eta 0:00:39 lr 0.000653 wd 0.0500 time 0.2207 (0.2312) data time 0.0006 (0.0013) model time 0.2201 (0.2302) loss 3.1029 (3.2828) grad_norm 2.7767 (2.3789) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1090/1251] eta 0:00:37 lr 0.000653 wd 0.0500 time 0.2272 (0.2312) data time 0.0006 (0.0013) model time 0.2266 (0.2302) loss 2.9831 (3.2817) grad_norm 1.6660 (2.3800) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [132/300][1100/1251] eta 0:00:34 lr 0.000653 wd 0.0500 time 0.2155 (0.2311) data time 0.0006 (0.0013) model time 0.2149 (0.2301) loss 4.1834 (3.2822) grad_norm 2.5150 (2.3818) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1110/1251] eta 0:00:32 lr 0.000653 wd 0.0500 time 0.2264 (0.2337) data time 0.0006 (0.0013) model time 0.2258 (0.2328) loss 2.6316 (3.2812) grad_norm 1.8740 (2.3803) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1120/1251] eta 0:00:30 lr 0.000653 wd 0.0500 time 0.2309 (0.2336) data time 0.0006 (0.0013) model time 0.2303 (0.2328) loss 3.9270 (3.2834) grad_norm 2.6450 (2.3806) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1130/1251] eta 0:00:28 lr 0.000653 wd 0.0500 time 0.2118 (0.2335) data time 0.0008 (0.0013) model time 0.2111 (0.2326) loss 2.8662 (3.2833) grad_norm 2.0460 (2.3798) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1140/1251] eta 0:00:26 lr 0.000653 wd 0.0500 time 0.8635 (0.2379) data time 0.0009 (0.0012) model time 0.8626 (0.2372) loss 3.7686 (3.2833) grad_norm 2.4736 (2.3769) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1150/1251] eta 0:00:24 lr 0.000653 wd 0.0500 time 0.2179 (0.2393) data time 0.0008 (0.0012) model time 0.2171 (0.2387) loss 4.0364 (3.2861) grad_norm 8.2166 (2.3808) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1160/1251] eta 0:00:21 lr 0.000653 wd 0.0500 time 0.2222 (0.2391) data time 0.0006 (0.0012) model time 0.2216 (0.2385) loss 2.1108 (3.2865) grad_norm 2.2172 (2.3811) 
loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1170/1251] eta 0:00:19 lr 0.000653 wd 0.0500 time 0.2279 (0.2390) data time 0.0008 (0.0012) model time 0.2271 (0.2384) loss 3.0951 (3.2872) grad_norm 2.5259 (2.3814) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1180/1251] eta 0:00:16 lr 0.000653 wd 0.0500 time 0.2234 (0.2389) data time 0.0006 (0.0012) model time 0.2228 (0.2383) loss 3.5161 (3.2890) grad_norm 2.4705 (2.3786) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1190/1251] eta 0:00:14 lr 0.000653 wd 0.0500 time 0.2269 (0.2388) data time 0.0006 (0.0012) model time 0.2263 (0.2381) loss 3.2478 (3.2881) grad_norm 2.0056 (2.3765) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1200/1251] eta 0:00:12 lr 0.000653 wd 0.0500 time 0.2256 (0.2387) data time 0.0006 (0.0012) model time 0.2250 (0.2380) loss 3.5298 (3.2869) grad_norm 3.4466 (2.3770) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1210/1251] eta 0:00:09 lr 0.000653 wd 0.0500 time 0.2217 (0.2386) data time 0.0006 (0.0012) model time 0.2211 (0.2379) loss 4.2433 (3.2897) grad_norm 1.8205 (2.3786) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1220/1251] eta 0:00:07 lr 0.000653 wd 0.0500 time 0.2189 (0.2384) data time 0.0007 (0.0012) model time 0.2181 (0.2378) loss 3.5591 (3.2906) grad_norm 2.0862 (2.3810) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1230/1251] eta 0:00:05 lr 0.000653 wd 0.0500 time 0.2191 
(0.2383) data time 0.0006 (0.0012) model time 0.2185 (0.2376) loss 2.7413 (3.2926) grad_norm 2.2448 (2.3805) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1240/1251] eta 0:00:02 lr 0.000653 wd 0.0500 time 0.2140 (0.2383) data time 0.0006 (0.0012) model time 0.2135 (0.2376) loss 2.4803 (3.2921) grad_norm 2.8423 (2.3817) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [132/300][1250/1251] eta 0:00:00 lr 0.000653 wd 0.0500 time 0.2177 (0.2381) data time 0.0005 (0.0012) model time 0.2172 (0.2375) loss 2.8987 (3.2912) grad_norm 2.2065 (2.3820) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 132 training takes 0:04:57 [2024-08-27 07:02:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 07:02:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 07:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.430 (0.430) Loss 0.4927 (0.4927) Acc@1 90.332 (90.332) Acc@5 98.047 (98.047) Mem 7381MB [2024-08-27 07:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.061 (0.100) Loss 0.8008 (0.7602) Acc@1 83.301 (83.425) Acc@5 95.996 (96.564) Mem 7381MB [2024-08-27 07:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.082) Loss 1.0283 (0.7772) Acc@1 75.488 (82.626) Acc@5 94.043 (96.577) Mem 7381MB [2024-08-27 07:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.104) Loss 1.3213 (0.8869) Acc@1 68.848 (80.160) Acc@5 89.941 (95.243) Mem 7381MB [2024-08-27 07:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.095) Loss 1.2617 (0.9431) Acc@1 70.312 (78.680) Acc@5 91.113 (94.648) Mem 7381MB [2024-08-27 07:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.338 Acc@5 94.614 [2024-08-27 07:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.3% [2024-08-27 07:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.905 (0.905) Loss 0.4172 (0.4172) Acc@1 92.383 (92.383) Acc@5 98.730 (98.730) Mem 7381MB [2024-08-27 07:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.067 (0.149) Loss 0.6621 (0.6538) Acc@1 87.402 (85.938) Acc@5 96.973 (97.372) Mem 7381MB [2024-08-27 07:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.112) Loss 0.9248 (0.6773) Acc@1 78.027 (84.956) Acc@5 94.922 (97.340) Mem 7381MB [2024-08-27 07:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.100) Loss 1.1777 (0.7695) Acc@1 70.215 (82.686) Acc@5 91.699 (96.308) Mem 7381MB [2024-08-27 07:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.091) Loss 1.0684 
(0.8180) Acc@1 72.656 (81.274) Acc@5 93.555 (95.758) Mem 7381MB [2024-08-27 07:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.888 Acc@5 95.708 [2024-08-27 07:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.9% [2024-08-27 07:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.89% [2024-08-27 07:02:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 07:02:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 07:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][0/1251] eta 0:35:03 lr 0.000653 wd 0.0500 time 1.6818 (1.6818) data time 1.4179 (1.4179) model time 0.0000 (0.0000) loss 2.2374 (2.2374) grad_norm 2.1781 (2.1781) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][10/1251] eta 0:07:25 lr 0.000653 wd 0.0500 time 0.2266 (0.3592) data time 0.0005 (0.1297) model time 0.0000 (0.0000) loss 3.8314 (3.1927) grad_norm 1.8241 (2.0464) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][20/1251] eta 0:06:04 lr 0.000653 wd 0.0500 time 0.2341 (0.2961) data time 0.0006 (0.0684) model time 0.0000 (0.0000) loss 4.0466 (3.2323) grad_norm 2.0369 (2.2512) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][30/1251] eta 0:05:35 lr 0.000653 wd 0.0500 time 0.2310 (0.2748) data time 0.0006 (0.0466) model time 0.0000 (0.0000) loss 3.7765 (3.2568) grad_norm 2.0499 (2.3233) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][40/1251] eta 
0:05:18 lr 0.000653 wd 0.0500 time 0.2330 (0.2633) data time 0.0006 (0.0354) model time 0.0000 (0.0000) loss 4.0142 (3.2858) grad_norm 1.9585 (2.2628) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][50/1251] eta 0:05:07 lr 0.000652 wd 0.0500 time 0.2237 (0.2562) data time 0.0008 (0.0287) model time 0.0000 (0.0000) loss 3.5649 (3.3248) grad_norm 2.1320 (2.2955) loss_scale 4096.0000 (2329.0980) mem 7381MB [2024-08-27 07:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][60/1251] eta 0:04:59 lr 0.000652 wd 0.0500 time 0.2213 (0.2514) data time 0.0006 (0.0241) model time 0.2207 (0.2263) loss 2.5325 (3.3022) grad_norm 2.2755 (2.2898) loss_scale 4096.0000 (2618.7541) mem 7381MB [2024-08-27 07:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][70/1251] eta 0:04:52 lr 0.000652 wd 0.0500 time 0.2228 (0.2474) data time 0.0007 (0.0208) model time 0.2221 (0.2240) loss 3.9630 (3.2864) grad_norm 2.1779 (2.2806) loss_scale 4096.0000 (2826.8169) mem 7381MB [2024-08-27 07:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][80/1251] eta 0:04:45 lr 0.000652 wd 0.0500 time 0.2212 (0.2440) data time 0.0007 (0.0184) model time 0.2205 (0.2225) loss 3.8956 (3.2865) grad_norm 2.3719 (2.2523) loss_scale 4096.0000 (2983.5062) mem 7381MB [2024-08-27 07:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][90/1251] eta 0:04:57 lr 0.000652 wd 0.0500 time 0.2337 (0.2564) data time 0.0007 (0.0164) model time 0.2331 (0.2557) loss 3.0291 (3.2723) grad_norm 2.8820 (2.2366) loss_scale 4096.0000 (3105.7582) mem 7381MB [2024-08-27 07:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][100/1251] eta 0:04:52 lr 0.000652 wd 0.0500 time 0.2219 (0.2543) data time 0.0008 (0.0149) model time 0.2211 (0.2514) loss 3.0040 (3.2581) grad_norm 4.7583 (2.2820) loss_scale 4096.0000 (3203.8020) mem 7381MB 
[2024-08-27 07:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][110/1251] eta 0:04:47 lr 0.000652 wd 0.0500 time 0.2297 (0.2518) data time 0.0008 (0.0136) model time 0.2289 (0.2473) loss 3.4121 (3.2611) grad_norm 1.3995 (2.3392) loss_scale 4096.0000 (3284.1802) mem 7381MB [2024-08-27 07:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][120/1251] eta 0:04:42 lr 0.000652 wd 0.0500 time 0.2295 (0.2500) data time 0.0006 (0.0126) model time 0.2289 (0.2447) loss 4.4026 (3.2696) grad_norm 1.8284 (2.3474) loss_scale 4096.0000 (3351.2727) mem 7381MB [2024-08-27 07:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][130/1251] eta 0:04:38 lr 0.000652 wd 0.0500 time 0.2255 (0.2482) data time 0.0007 (0.0117) model time 0.2247 (0.2423) loss 3.8781 (3.2588) grad_norm 2.0626 (2.3313) loss_scale 4096.0000 (3408.1221) mem 7381MB [2024-08-27 07:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][140/1251] eta 0:04:34 lr 0.000652 wd 0.0500 time 0.2293 (0.2467) data time 0.0007 (0.0109) model time 0.2286 (0.2405) loss 2.3596 (3.2401) grad_norm 1.5445 (2.3225) loss_scale 4096.0000 (3456.9078) mem 7381MB [2024-08-27 07:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][150/1251] eta 0:04:29 lr 0.000652 wd 0.0500 time 0.2271 (0.2452) data time 0.0006 (0.0102) model time 0.2265 (0.2387) loss 3.9876 (3.2451) grad_norm 1.9098 (2.3255) loss_scale 4096.0000 (3499.2318) mem 7381MB [2024-08-27 07:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][160/1251] eta 0:04:26 lr 0.000652 wd 0.0500 time 0.2212 (0.2439) data time 0.0009 (0.0096) model time 0.2203 (0.2374) loss 2.7155 (3.2449) grad_norm 1.8138 (2.3280) loss_scale 4096.0000 (3536.2981) mem 7381MB [2024-08-27 07:03:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][170/1251] eta 0:04:47 lr 0.000652 wd 0.0500 time 0.2211 (0.2660) data time 0.0008 (0.0196) model time 0.2204 
(0.2545) loss 3.3184 (3.2567) grad_norm 1.7550 (2.3259) loss_scale 4096.0000 (3569.0292) mem 7381MB [2024-08-27 07:03:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][180/1251] eta 0:04:42 lr 0.000652 wd 0.0500 time 0.2227 (0.2640) data time 0.0007 (0.0186) model time 0.2221 (0.2524) loss 2.0568 (3.2451) grad_norm 1.8860 (2.3276) loss_scale 4096.0000 (3598.1436) mem 7381MB [2024-08-27 07:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][190/1251] eta 0:04:37 lr 0.000652 wd 0.0500 time 0.2288 (0.2619) data time 0.0008 (0.0176) model time 0.2280 (0.2504) loss 3.6727 (3.2471) grad_norm 1.7207 (2.3316) loss_scale 4096.0000 (3624.2094) mem 7381MB [2024-08-27 07:03:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][200/1251] eta 0:04:33 lr 0.000652 wd 0.0500 time 0.2188 (0.2600) data time 0.0007 (0.0168) model time 0.2182 (0.2486) loss 3.7836 (3.2480) grad_norm 2.8201 (2.3168) loss_scale 4096.0000 (3647.6816) mem 7381MB [2024-08-27 07:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][210/1251] eta 0:04:29 lr 0.000652 wd 0.0500 time 0.2302 (0.2585) data time 0.0007 (0.0160) model time 0.2295 (0.2472) loss 2.1452 (3.2333) grad_norm 2.2454 (2.3199) loss_scale 4096.0000 (3668.9289) mem 7381MB [2024-08-27 07:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][220/1251] eta 0:04:24 lr 0.000652 wd 0.0500 time 0.2207 (0.2568) data time 0.0007 (0.0153) model time 0.2199 (0.2457) loss 3.2988 (3.2400) grad_norm 2.5801 (2.3145) loss_scale 4096.0000 (3688.2534) mem 7381MB [2024-08-27 07:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][230/1251] eta 0:04:31 lr 0.000652 wd 0.0500 time 0.2294 (0.2659) data time 0.0008 (0.0147) model time 0.2286 (0.2579) loss 3.5079 (3.2461) grad_norm 2.6458 (2.3162) loss_scale 4096.0000 (3705.9048) mem 7381MB [2024-08-27 07:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[133/300][240/1251] eta 0:04:27 lr 0.000652 wd 0.0500 time 0.2175 (0.2646) data time 0.0007 (0.0141) model time 0.2167 (0.2567) loss 3.6282 (3.2495) grad_norm 1.6021 (2.3100) loss_scale 4096.0000 (3722.0913) mem 7381MB [2024-08-27 07:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][250/1251] eta 0:04:23 lr 0.000652 wd 0.0500 time 0.2377 (0.2632) data time 0.0007 (0.0136) model time 0.2370 (0.2552) loss 4.1056 (3.2612) grad_norm 1.9205 (2.2997) loss_scale 4096.0000 (3736.9880) mem 7381MB [2024-08-27 07:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][260/1251] eta 0:04:19 lr 0.000652 wd 0.0500 time 0.2172 (0.2616) data time 0.0008 (0.0131) model time 0.2164 (0.2535) loss 3.4761 (3.2624) grad_norm 2.3608 (2.2978) loss_scale 4096.0000 (3750.7433) mem 7381MB [2024-08-27 07:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][270/1251] eta 0:04:15 lr 0.000652 wd 0.0500 time 0.2407 (0.2603) data time 0.0008 (0.0127) model time 0.2399 (0.2523) loss 3.3332 (3.2631) grad_norm 1.9084 (2.2985) loss_scale 4096.0000 (3763.4834) mem 7381MB [2024-08-27 07:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][280/1251] eta 0:04:11 lr 0.000651 wd 0.0500 time 0.2232 (0.2592) data time 0.0008 (0.0122) model time 0.2224 (0.2512) loss 3.2827 (3.2658) grad_norm 1.7005 (2.2966) loss_scale 4096.0000 (3775.3167) mem 7381MB [2024-08-27 07:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][290/1251] eta 0:04:08 lr 0.000651 wd 0.0500 time 0.2266 (0.2581) data time 0.0006 (0.0119) model time 0.2259 (0.2503) loss 3.0681 (3.2667) grad_norm 3.0377 (2.3097) loss_scale 4096.0000 (3786.3368) mem 7381MB [2024-08-27 07:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 07:03:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 07:03:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 07:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 07:06:48 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 07:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 07:06:57 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 07:06:59 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 07:07:00 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 07:07:00 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 133) [2024-08-27 07:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 07:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][300/1251] eta 0:57:20 lr 0.000651 wd 0.0500 time 0.2188 (3.6174) data time 0.0007 (0.5213) model time 0.2181 (3.0961) loss 3.8945 (3.6136) grad_norm 2.3969 (2.4994) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][310/1251] eta 0:18:44 lr 0.000651 wd 0.0500 time 0.2242 (1.1955) data time 0.0008 (0.1496) model time 0.2234 (1.0458) loss 3.6557 (3.5771) grad_norm 2.4429 (2.3189) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [133/300][320/1251] eta 0:15:17 lr 0.000651 wd 0.0500 time 0.2312 (0.9853) data time 0.0009 (0.0877) model time 0.2303 (0.8976) loss 3.3528 (3.5907) grad_norm 3.6884 (2.6162) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][330/1251] eta 0:11:55 lr 0.000651 wd 0.0500 time 0.7579 (0.7774) data time 0.0006 (0.0622) model time 0.7573 (0.7152) loss 2.7863 (3.5456) grad_norm 2.5916 (2.7160) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][340/1251] eta 0:09:54 lr 0.000651 wd 0.0500 time 0.2250 (0.6530) data time 0.0007 (0.0483) model time 0.2243 (0.6047) loss 3.2697 (3.4760) grad_norm 1.9464 (2.5709) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][350/1251] eta 0:08:37 lr 0.000651 wd 0.0500 time 0.2292 (0.5740) data time 0.0008 (0.0395) model time 0.2285 (0.5345) loss 3.6677 (3.4786) grad_norm 2.1762 (2.4646) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][360/1251] eta 0:07:43 lr 0.000651 wd 0.0500 time 0.2334 (0.5201) data time 0.0007 (0.0335) model time 0.2327 (0.4866) loss 3.4468 (3.4450) grad_norm 1.8158 (2.4245) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][370/1251] eta 0:07:03 lr 0.000651 wd 0.0500 time 0.2212 (0.4801) data time 0.0007 (0.0291) model time 0.2205 (0.4511) loss 3.7519 (3.4085) grad_norm 2.7385 (2.4240) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][380/1251] eta 0:06:31 lr 0.000651 wd 0.0500 time 0.2249 (0.4495) data time 0.0009 (0.0257) model time 0.2240 (0.4238) loss 3.4929 (3.3652) grad_norm 1.8205 (2.3609) loss_scale 
4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][390/1251] eta 0:06:14 lr 0.000651 wd 0.0500 time 0.2554 (0.4348) data time 0.0009 (0.0231) model time 0.2545 (0.4117) loss 3.2388 (3.3485) grad_norm 2.6347 (2.3275) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][400/1251] eta 0:05:53 lr 0.000651 wd 0.0500 time 0.2325 (0.4150) data time 0.0009 (0.0210) model time 0.2316 (0.3940) loss 3.0991 (3.3760) grad_norm 2.3824 (2.3439) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][410/1251] eta 0:05:35 lr 0.000651 wd 0.0500 time 0.2351 (0.3990) data time 0.0011 (0.0192) model time 0.2341 (0.3798) loss 3.4694 (3.3795) grad_norm 2.0522 (2.3390) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][420/1251] eta 0:05:20 lr 0.000651 wd 0.0500 time 0.2294 (0.3854) data time 0.0008 (0.0177) model time 0.2286 (0.3677) loss 2.8385 (3.3789) grad_norm 1.8913 (2.3643) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][430/1251] eta 0:05:06 lr 0.000651 wd 0.0500 time 0.2223 (0.3737) data time 0.0009 (0.0165) model time 0.2214 (0.3573) loss 3.4125 (3.3769) grad_norm 2.2464 (2.3393) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][440/1251] eta 0:05:23 lr 0.000651 wd 0.0500 time 0.2362 (0.3987) data time 0.0010 (0.0438) model time 0.2352 (0.3549) loss 2.8319 (3.3606) grad_norm 2.2558 (2.3307) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][450/1251] eta 0:05:10 lr 0.000651 wd 0.0500 time 0.2228 (0.3874) data time 
0.0008 (0.0410) model time 0.2220 (0.3464) loss 2.9590 (3.3471) grad_norm 2.6909 (2.3431) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][460/1251] eta 0:04:58 lr 0.000651 wd 0.0500 time 0.2235 (0.3776) data time 0.0008 (0.0386) model time 0.2226 (0.3391) loss 3.5030 (3.3428) grad_norm 2.2624 (2.3346) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][470/1251] eta 0:04:48 lr 0.000651 wd 0.0500 time 0.2277 (0.3691) data time 0.0009 (0.0364) model time 0.2268 (0.3327) loss 2.2555 (3.3336) grad_norm 1.9467 (2.3104) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][480/1251] eta 0:04:38 lr 0.000651 wd 0.0500 time 0.2231 (0.3613) data time 0.0007 (0.0345) model time 0.2223 (0.3268) loss 3.2005 (3.3299) grad_norm 2.2108 (2.2976) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][490/1251] eta 0:04:29 lr 0.000651 wd 0.0500 time 0.2279 (0.3542) data time 0.0007 (0.0327) model time 0.2272 (0.3215) loss 3.3157 (3.3315) grad_norm 1.8129 (2.3166) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][500/1251] eta 0:04:21 lr 0.000651 wd 0.0500 time 0.2231 (0.3481) data time 0.0010 (0.0312) model time 0.2221 (0.3170) loss 3.4888 (3.3168) grad_norm 2.2910 (2.2991) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][510/1251] eta 0:04:20 lr 0.000651 wd 0.0500 time 0.2280 (0.3518) data time 0.0008 (0.0298) model time 0.2272 (0.3220) loss 3.3326 (3.3132) grad_norm 1.8933 (2.3165) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [133/300][520/1251] eta 0:04:13 lr 0.000650 wd 0.0500 time 0.2266 (0.3462) data time 0.0008 (0.0285) model time 0.2257 (0.3177) loss 4.0436 (3.3137) grad_norm 2.0528 (2.3185) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][530/1251] eta 0:04:05 lr 0.000650 wd 0.0500 time 0.2202 (0.3410) data time 0.0006 (0.0273) model time 0.2196 (0.3137) loss 2.7135 (3.3008) grad_norm 3.8777 (2.3209) loss_scale 4096.0000 (4096.0000) mem 7377MB [2024-08-27 07:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][540/1251] eta 0:03:59 lr 0.000650 wd 0.0500 time 0.2414 (0.3365) data time 0.0011 (0.0262) model time 0.2403 (0.3102) loss 2.2743 (3.2987) grad_norm 3.3816 (nan) loss_scale 2048.0000 (4079.2131) mem 7377MB [2024-08-27 07:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][550/1251] eta 0:03:52 lr 0.000650 wd 0.0500 time 0.2166 (0.3321) data time 0.0007 (0.0252) model time 0.2159 (0.3068) loss 2.7824 (3.2932) grad_norm 3.0689 (nan) loss_scale 2048.0000 (3999.2441) mem 7377MB [2024-08-27 07:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][560/1251] eta 0:03:46 lr 0.000650 wd 0.0500 time 0.2216 (0.3280) data time 0.0010 (0.0243) model time 0.2206 (0.3037) loss 3.5001 (3.2878) grad_norm 2.9947 (nan) loss_scale 2048.0000 (3925.3333) mem 7377MB [2024-08-27 07:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][570/1251] eta 0:03:40 lr 0.000650 wd 0.0500 time 0.2192 (0.3243) data time 0.0009 (0.0235) model time 0.2183 (0.3009) loss 3.6308 (3.2859) grad_norm 1.9779 (nan) loss_scale 2048.0000 (3856.8175) mem 7377MB [2024-08-27 07:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][580/1251] eta 0:03:45 lr 0.000650 wd 0.0500 time 0.2205 (0.3367) data time 0.0009 (0.0234) model time 0.2196 (0.3132) loss 2.1509 (3.2798) grad_norm 2.5131 (nan) loss_scale 2048.0000 
(3793.1268) mem 7377MB [2024-08-27 07:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][590/1251] eta 0:03:40 lr 0.000650 wd 0.0500 time 0.2259 (0.3337) data time 0.0006 (0.0227) model time 0.2253 (0.3110) loss 2.6874 (3.2726) grad_norm 2.2578 (nan) loss_scale 2048.0000 (3733.7687) mem 7377MB [2024-08-27 07:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][600/1251] eta 0:03:35 lr 0.000650 wd 0.0500 time 0.2278 (0.3303) data time 0.0009 (0.0220) model time 0.2269 (0.3083) loss 3.9203 (3.2645) grad_norm 2.1438 (nan) loss_scale 2048.0000 (3678.3158) mem 7377MB [2024-08-27 07:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][610/1251] eta 0:03:30 lr 0.000650 wd 0.0500 time 0.2220 (0.3278) data time 0.0009 (0.0213) model time 0.2211 (0.3065) loss 3.7001 (3.2630) grad_norm 2.0976 (nan) loss_scale 2048.0000 (3626.3949) mem 7377MB [2024-08-27 07:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][620/1251] eta 0:03:24 lr 0.000650 wd 0.0500 time 0.2387 (0.3248) data time 0.0008 (0.0207) model time 0.2379 (0.3041) loss 4.0716 (3.2739) grad_norm 3.4200 (nan) loss_scale 2048.0000 (3577.6790) mem 7377MB [2024-08-27 07:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][630/1251] eta 0:03:19 lr 0.000650 wd 0.0500 time 0.2254 (0.3218) data time 0.0008 (0.0201) model time 0.2246 (0.3017) loss 3.1907 (3.2709) grad_norm 2.1269 (nan) loss_scale 2048.0000 (3531.8802) mem 7377MB [2024-08-27 07:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][640/1251] eta 0:03:14 lr 0.000650 wd 0.0500 time 0.2266 (0.3191) data time 0.0009 (0.0195) model time 0.2257 (0.2996) loss 3.5439 (3.2778) grad_norm 2.8427 (nan) loss_scale 2048.0000 (3488.7442) mem 7377MB [2024-08-27 07:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][650/1251] eta 0:03:10 lr 0.000650 wd 0.0500 time 0.2274 (0.3166) data time 0.0011 (0.0190) model time 
0.2263 (0.2976) loss 3.7193 (3.2820) grad_norm 2.9774 (nan) loss_scale 2048.0000 (3448.0452) mem 7377MB [2024-08-27 07:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][660/1251] eta 0:03:05 lr 0.000650 wd 0.0500 time 0.2326 (0.3141) data time 0.0008 (0.0185) model time 0.2318 (0.2956) loss 2.1542 (3.2793) grad_norm 2.5651 (nan) loss_scale 2048.0000 (3409.5824) mem 7377MB [2024-08-27 07:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][670/1251] eta 0:03:01 lr 0.000650 wd 0.0500 time 0.2269 (0.3118) data time 0.0010 (0.0180) model time 0.2259 (0.2938) loss 3.5062 (3.2797) grad_norm 1.6188 (nan) loss_scale 2048.0000 (3373.1765) mem 7377MB [2024-08-27 07:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][680/1251] eta 0:02:56 lr 0.000650 wd 0.0500 time 0.2211 (0.3096) data time 0.0007 (0.0176) model time 0.2203 (0.2920) loss 2.9363 (3.2729) grad_norm 1.7128 (nan) loss_scale 2048.0000 (3338.6667) mem 7377MB [2024-08-27 07:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][690/1251] eta 0:02:52 lr 0.000650 wd 0.0500 time 0.2311 (0.3076) data time 0.0009 (0.0172) model time 0.2302 (0.2904) loss 3.1107 (3.2741) grad_norm 3.1088 (nan) loss_scale 2048.0000 (3305.9086) mem 7377MB [2024-08-27 07:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][700/1251] eta 0:02:48 lr 0.000650 wd 0.0500 time 0.2187 (0.3056) data time 0.0008 (0.0168) model time 0.2179 (0.2888) loss 3.3460 (3.2795) grad_norm 3.5115 (nan) loss_scale 2048.0000 (3274.7723) mem 7377MB [2024-08-27 07:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][710/1251] eta 0:02:44 lr 0.000650 wd 0.0500 time 0.2257 (0.3036) data time 0.0011 (0.0164) model time 0.2247 (0.2873) loss 2.8972 (3.2843) grad_norm 3.2512 (nan) loss_scale 2048.0000 (3245.1401) mem 7377MB [2024-08-27 07:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][720/1251] eta 0:02:40 
lr 0.000650 wd 0.0500 time 0.2305 (0.3019) data time 0.0007 (0.0161) model time 0.2298 (0.2858) loss 4.0862 (3.2845) grad_norm 2.2225 (nan) loss_scale 2048.0000 (3216.9057) mem 7377MB [2024-08-27 07:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][730/1251] eta 0:02:36 lr 0.000650 wd 0.0500 time 0.2345 (0.3002) data time 0.0008 (0.0157) model time 0.2336 (0.2845) loss 3.6333 (3.2945) grad_norm 2.0552 (nan) loss_scale 2048.0000 (3189.9724) mem 7377MB [2024-08-27 07:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][740/1251] eta 0:02:32 lr 0.000650 wd 0.0500 time 0.2275 (0.2986) data time 0.0009 (0.0154) model time 0.2266 (0.2832) loss 3.5252 (3.2963) grad_norm 2.3115 (nan) loss_scale 2048.0000 (3164.2523) mem 7377MB [2024-08-27 07:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][750/1251] eta 0:02:33 lr 0.000649 wd 0.0500 time 0.3115 (0.3071) data time 0.0010 (0.0151) model time 0.3106 (0.2921) loss 3.8340 (3.2903) grad_norm 1.5476 (nan) loss_scale 2048.0000 (3139.6652) mem 7377MB [2024-08-27 07:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][760/1251] eta 0:02:29 lr 0.000649 wd 0.0500 time 0.2359 (0.3055) data time 0.0007 (0.0148) model time 0.2353 (0.2907) loss 3.8461 (3.2866) grad_norm 1.8252 (nan) loss_scale 2048.0000 (3116.1379) mem 7377MB [2024-08-27 07:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][770/1251] eta 0:02:26 lr 0.000649 wd 0.0500 time 0.2212 (0.3037) data time 0.0008 (0.0145) model time 0.2204 (0.2893) loss 3.2703 (3.2784) grad_norm 2.1221 (nan) loss_scale 2048.0000 (3093.6034) mem 7377MB [2024-08-27 07:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][780/1251] eta 0:02:22 lr 0.000649 wd 0.0500 time 0.2239 (0.3022) data time 0.0008 (0.0142) model time 0.2231 (0.2880) loss 3.1218 (3.2801) grad_norm 1.7724 (nan) loss_scale 2048.0000 (3072.0000) mem 7377MB [2024-08-27 07:09:34 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][790/1251] eta 0:02:18 lr 0.000649 wd 0.0500 time 0.2229 (0.3006) data time 0.0009 (0.0139) model time 0.2220 (0.2867) loss 3.1720 (3.2836) grad_norm 1.8285 (nan) loss_scale 2048.0000 (3051.2713) mem 7377MB [2024-08-27 07:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][800/1251] eta 0:02:14 lr 0.000649 wd 0.0500 time 0.2329 (0.2991) data time 0.0007 (0.0137) model time 0.2322 (0.2855) loss 3.7566 (3.2835) grad_norm 2.2819 (nan) loss_scale 2048.0000 (3031.3651) mem 7377MB [2024-08-27 07:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][810/1251] eta 0:02:11 lr 0.000649 wd 0.0500 time 0.2186 (0.2977) data time 0.0011 (0.0134) model time 0.2175 (0.2843) loss 3.5530 (3.2919) grad_norm 1.9588 (nan) loss_scale 2048.0000 (3012.2335) mem 7377MB [2024-08-27 07:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][820/1251] eta 0:02:07 lr 0.000649 wd 0.0500 time 0.2229 (0.2964) data time 0.0008 (0.0132) model time 0.2221 (0.2832) loss 2.0478 (3.2862) grad_norm 1.7780 (nan) loss_scale 2048.0000 (2993.8321) mem 7377MB [2024-08-27 07:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][830/1251] eta 0:02:04 lr 0.000649 wd 0.0500 time 0.2253 (0.2951) data time 0.0007 (0.0129) model time 0.2246 (0.2822) loss 2.5750 (3.2849) grad_norm 2.9285 (nan) loss_scale 2048.0000 (2976.1199) mem 7377MB [2024-08-27 07:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][840/1251] eta 0:02:00 lr 0.000649 wd 0.0500 time 0.2219 (0.2938) data time 0.0007 (0.0127) model time 0.2212 (0.2811) loss 3.7126 (3.2858) grad_norm 2.1068 (nan) loss_scale 2048.0000 (2959.0588) mem 7377MB [2024-08-27 07:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][850/1251] eta 0:01:57 lr 0.000649 wd 0.0500 time 0.2207 (0.2927) data time 0.0007 (0.0125) model time 0.2200 (0.2802) loss 3.9366 (3.2912) 
grad_norm 2.6165 (nan) loss_scale 2048.0000 (2942.6137) mem 7377MB [2024-08-27 07:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][860/1251] eta 0:01:54 lr 0.000649 wd 0.0500 time 0.2335 (0.2931) data time 0.0008 (0.0135) model time 0.2327 (0.2796) loss 3.6627 (3.2936) grad_norm 1.9948 (nan) loss_scale 2048.0000 (2926.7518) mem 7377MB [2024-08-27 07:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][870/1251] eta 0:01:51 lr 0.000649 wd 0.0500 time 0.2259 (0.2919) data time 0.0007 (0.0133) model time 0.2252 (0.2786) loss 3.5810 (3.2956) grad_norm 2.7920 (nan) loss_scale 2048.0000 (2911.4425) mem 7377MB [2024-08-27 07:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][880/1251] eta 0:01:47 lr 0.000649 wd 0.0500 time 0.2250 (0.2908) data time 0.0007 (0.0131) model time 0.2243 (0.2777) loss 2.3408 (3.2955) grad_norm 2.4868 (nan) loss_scale 2048.0000 (2896.6575) mem 7377MB [2024-08-27 07:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][890/1251] eta 0:01:44 lr 0.000649 wd 0.0500 time 0.2209 (0.2898) data time 0.0007 (0.0129) model time 0.2202 (0.2769) loss 3.4240 (3.2955) grad_norm 2.1436 (nan) loss_scale 2048.0000 (2882.3704) mem 7377MB [2024-08-27 07:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][900/1251] eta 0:01:41 lr 0.000649 wd 0.0500 time 0.2269 (0.2887) data time 0.0009 (0.0127) model time 0.2260 (0.2760) loss 3.6843 (3.2936) grad_norm 3.1859 (nan) loss_scale 2048.0000 (2868.5563) mem 7377MB [2024-08-27 07:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][910/1251] eta 0:01:38 lr 0.000649 wd 0.0500 time 0.2266 (0.2878) data time 0.0007 (0.0126) model time 0.2259 (0.2752) loss 3.1876 (3.2933) grad_norm 2.1886 (nan) loss_scale 2048.0000 (2855.1922) mem 7377MB [2024-08-27 07:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][920/1251] eta 0:01:34 lr 0.000649 wd 0.0500 time 0.2229 
(0.2868) data time 0.0009 (0.0124) model time 0.2220 (0.2744) loss 3.1940 (3.2981) grad_norm 2.5202 (nan) loss_scale 2048.0000 (2842.2564) mem 7377MB [2024-08-27 07:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][930/1251] eta 0:01:31 lr 0.000649 wd 0.0500 time 0.2353 (0.2858) data time 0.0010 (0.0122) model time 0.2344 (0.2736) loss 2.9831 (3.2977) grad_norm 3.2197 (nan) loss_scale 2048.0000 (2829.7287) mem 7377MB [2024-08-27 07:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 07:10:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 07:10:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 07:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 07:11:58 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 07:12:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 07:12:07 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 07:12:08 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 07:12:09 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 07:12:10 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 133) [2024-08-27 07:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 07:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][940/1251] eta 0:08:07 lr 0.000649 wd 0.0500 time 0.2279 (1.5687) data time 0.0007 (0.1022) model time 0.2272 (1.4665) loss 3.7933 (3.7680) grad_norm 1.7162 (2.3829) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][950/1251] eta 0:04:19 lr 0.000649 wd 0.0500 time 0.2208 (0.8617) data time 0.0008 (0.0490) model time 0.2200 (0.8127) loss 3.5950 (3.5404) grad_norm 2.6373 (2.2376) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][960/1251] eta 0:03:06 lr 0.000649 wd 0.0500 time 0.2227 (0.6424) data time 0.0007 (0.0324) model time 0.2221 (0.6100) loss 3.7464 (3.6052) grad_norm 2.6849 (2.2978) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][970/1251] eta 0:02:30 lr 0.000649 wd 0.0500 time 0.2281 (0.5362) data time 0.0009 (0.0243) model time 0.2273 (0.5119) loss 3.0194 (3.5191) grad_norm 2.2529 (2.2863) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][980/1251] eta 0:02:08 lr 0.000649 wd 0.0500 time 0.2256 (0.4727) data time 0.0009 (0.0195) model time 0.2247 (0.4531) loss 3.2474 (3.4881) grad_norm 2.4198 (2.3520) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [133/300][990/1251] eta 0:01:52 lr 0.000648 wd 0.0500 time 0.2303 (0.4312) data time 0.0008 (0.0164) model time 0.2295 (0.4148) loss 2.7710 (3.4651) grad_norm 1.7555 (2.3398) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1000/1251] eta 0:01:40 lr 0.000648 wd 0.0500 time 0.2225 (0.4020) data time 0.0010 (0.0141) model time 0.2215 (0.3879) loss 3.6629 (3.4363) grad_norm 2.3722 (2.3250) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1010/1251] eta 0:01:31 lr 0.000648 wd 0.0500 time 0.2249 (0.3799) data time 0.0008 (0.0125) model time 0.2242 (0.3675) loss 3.2760 (3.4155) grad_norm 2.3262 (2.3207) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1020/1251] eta 0:01:23 lr 0.000648 wd 0.0500 time 0.2236 (0.3627) data time 0.0008 (0.0112) model time 0.2229 (0.3515) loss 3.1925 (3.3884) grad_norm 1.9527 (2.3197) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1030/1251] eta 0:01:17 lr 0.000648 wd 0.0500 time 0.2219 (0.3495) data time 0.0011 (0.0101) model time 0.2208 (0.3393) loss 3.6992 (3.4041) grad_norm 2.3484 (2.3174) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1040/1251] eta 0:01:11 lr 0.000648 wd 0.0500 time 0.2316 (0.3383) data time 0.0007 (0.0093) model time 0.2309 (0.3290) loss 3.9348 (3.4166) grad_norm 1.4009 (2.3127) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1050/1251] eta 0:01:06 lr 0.000648 wd 0.0500 time 0.2229 (0.3288) data time 0.0008 (0.0086) model time 0.2222 (0.3202) loss 3.7759 (3.4087) grad_norm 2.3046 (2.3271) loss_scale 
2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1060/1251] eta 0:01:01 lr 0.000648 wd 0.0500 time 0.2317 (0.3209) data time 0.0007 (0.0080) model time 0.2310 (0.3130) loss 3.1280 (3.3953) grad_norm 4.0061 (2.3985) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1070/1251] eta 0:00:56 lr 0.000648 wd 0.0500 time 0.2228 (0.3141) data time 0.0008 (0.0075) model time 0.2221 (0.3067) loss 4.1215 (3.3890) grad_norm 1.5160 (2.3784) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1080/1251] eta 0:00:52 lr 0.000648 wd 0.0500 time 0.2284 (0.3084) data time 0.0007 (0.0070) model time 0.2277 (0.3014) loss 3.1369 (3.3718) grad_norm 1.8859 (2.3591) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1090/1251] eta 0:00:48 lr 0.000648 wd 0.0500 time 0.2256 (0.3034) data time 0.0007 (0.0066) model time 0.2249 (0.2968) loss 4.0235 (3.3642) grad_norm 2.9194 (2.3443) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1100/1251] eta 0:00:45 lr 0.000648 wd 0.0500 time 0.2297 (0.2989) data time 0.0009 (0.0063) model time 0.2288 (0.2926) loss 3.3265 (3.3629) grad_norm 2.2141 (2.3296) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1110/1251] eta 0:00:41 lr 0.000648 wd 0.0500 time 0.2277 (0.2950) data time 0.0007 (0.0060) model time 0.2270 (0.2890) loss 3.5746 (3.3349) grad_norm 2.3634 (2.3186) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1120/1251] eta 0:00:38 lr 0.000648 wd 0.0500 time 0.2272 (0.2916) 
data time 0.0008 (0.0057) model time 0.2265 (0.2858) loss 4.1443 (3.3326) grad_norm 2.0145 (2.3305) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1130/1251] eta 0:00:34 lr 0.000648 wd 0.0500 time 0.2339 (0.2883) data time 0.0008 (0.0055) model time 0.2331 (0.2828) loss 2.6921 (3.3188) grad_norm 2.2466 (2.3449) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1140/1251] eta 0:00:31 lr 0.000648 wd 0.0500 time 0.2251 (0.2854) data time 0.0007 (0.0053) model time 0.2244 (0.2801) loss 3.3132 (3.3128) grad_norm 3.2841 (2.3467) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1150/1251] eta 0:00:28 lr 0.000648 wd 0.0500 time 0.2361 (0.2829) data time 0.0007 (0.0051) model time 0.2355 (0.2778) loss 3.6798 (3.3029) grad_norm 2.7528 (2.3461) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1160/1251] eta 0:00:25 lr 0.000648 wd 0.0500 time 0.2262 (0.2804) data time 0.0008 (0.0049) model time 0.2254 (0.2755) loss 2.7824 (3.3026) grad_norm 1.8433 (2.3596) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1170/1251] eta 0:00:22 lr 0.000648 wd 0.0500 time 0.2315 (0.2782) data time 0.0007 (0.0047) model time 0.2308 (0.2735) loss 2.3014 (3.2924) grad_norm 2.3783 (2.3627) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1180/1251] eta 0:00:19 lr 0.000648 wd 0.0500 time 0.2245 (0.2760) data time 0.0008 (0.0046) model time 0.2237 (0.2714) loss 2.9804 (3.2856) grad_norm 2.5181 (2.3688) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [133/300][1190/1251] eta 0:00:16 lr 0.000648 wd 0.0500 time 0.2357 (0.2741) data time 0.0008 (0.0044) model time 0.2349 (0.2697) loss 3.3495 (3.2721) grad_norm 2.0731 (2.3582) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1200/1251] eta 0:00:13 lr 0.000648 wd 0.0500 time 0.2218 (0.2723) data time 0.0007 (0.0043) model time 0.2211 (0.2681) loss 2.0468 (3.2641) grad_norm 2.0996 (2.3479) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1210/1251] eta 0:00:11 lr 0.000648 wd 0.0500 time 0.2287 (0.2708) data time 0.0009 (0.0042) model time 0.2278 (0.2666) loss 3.6374 (3.2719) grad_norm 2.7578 (2.3421) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1220/1251] eta 0:00:08 lr 0.000647 wd 0.0500 time 0.2314 (0.2700) data time 0.0009 (0.0041) model time 0.2305 (0.2660) loss 3.0156 (3.2752) grad_norm 1.8408 (2.3363) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1230/1251] eta 0:00:05 lr 0.000647 wd 0.0500 time 0.2372 (0.2686) data time 0.0010 (0.0040) model time 0.2362 (0.2646) loss 3.6349 (3.2638) grad_norm 2.9192 (2.3380) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1240/1251] eta 0:00:02 lr 0.000647 wd 0.0500 time 0.2187 (0.2677) data time 0.0006 (0.0039) model time 0.2182 (0.2638) loss 3.5882 (3.2580) grad_norm 4.8979 (2.3790) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [133/300][1250/1251] eta 0:00:00 lr 0.000647 wd 0.0500 time 0.2171 (0.2661) data time 0.0003 (0.0038) model time 0.2167 (0.2623) loss 3.5684 (3.2682) 
grad_norm 1.9684 (2.4030) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 07:13:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 133 training takes 0:01:24 [2024-08-27 07:13:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 07:13:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.581 (0.581) Loss 0.4609 (0.4609) Acc@1 91.113 (91.113) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-27 07:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.060 (0.400) Loss 0.7324 (0.7400) Acc@1 85.742 (83.620) Acc@5 96.680 (96.529) Mem 7379MB [2024-08-27 07:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.338) Loss 1.0684 (0.7618) Acc@1 74.121 (82.831) Acc@5 93.652 (96.591) Mem 7379MB [2024-08-27 07:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.259) Loss 1.4023 (0.8664) Acc@1 65.430 (80.434) Acc@5 89.648 (95.353) Mem 7379MB [2024-08-27 07:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.212) Loss 1.1660 (0.9275) Acc@1 73.340 (79.049) Acc@5 91.797 (94.665) Mem 7379MB [2024-08-27 07:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.668 Acc@5 94.652 [2024-08-27 07:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.7% [2024-08-27 07:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 78.67% [2024-08-27 07:13:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-08-27 07:13:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 07:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.434 (0.434) Loss 0.4175 (0.4175) Acc@1 92.480 (92.480) Acc@5 98.730 (98.730) Mem 7379MB [2024-08-27 07:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.063 (0.125) Loss 0.6616 (0.6525) Acc@1 87.207 (85.964) Acc@5 96.973 (97.328) Mem 7379MB [2024-08-27 07:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.104) Loss 0.9209 (0.6763) Acc@1 78.223 (84.993) Acc@5 94.922 (97.298) Mem 7379MB [2024-08-27 07:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.095) Loss 1.1719 (0.7684) Acc@1 70.508 (82.746) Acc@5 92.090 (96.276) Mem 7379MB [2024-08-27 07:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.087) Loss 1.0664 (0.8166) Acc@1 72.656 (81.338) Acc@5 93.555 (95.734) Mem 7379MB [2024-08-27 07:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.918 Acc@5 95.692 [2024-08-27 07:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 80.9% [2024-08-27 07:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.92% [2024-08-27 07:13:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 07:13:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 07:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][0/1251] eta 0:14:18 lr 0.000647 wd 0.0500 time 0.6864 (0.6864) data time 0.3674 (0.3674) model time 0.0000 (0.0000) loss 2.9846 (2.9846) grad_norm 3.0833 (3.0833) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][10/1251] eta 0:05:32 lr 0.000647 wd 0.0500 time 0.2256 (0.2682) data time 0.0007 (0.0341) model time 0.0000 (0.0000) loss 3.6309 (3.3152) grad_norm 2.0305 (2.4320) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][20/1251] eta 0:05:06 lr 0.000647 wd 0.0500 time 0.2261 (0.2489) data time 0.0009 (0.0183) model time 0.0000 (0.0000) loss 3.3407 (3.2813) grad_norm 1.9663 (2.2608) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][30/1251] eta 0:04:55 lr 0.000647 wd 0.0500 time 0.2277 (0.2421) data time 0.0009 (0.0127) model time 0.0000 (0.0000) loss 2.4825 (3.2517) grad_norm 2.6761 (2.2505) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][40/1251] eta 0:04:48 lr 0.000647 wd 0.0500 time 0.2281 (0.2382) data time 0.0007 (0.0098) model time 0.0000 (0.0000) loss 4.0320 (3.2742) grad_norm 2.8515 (2.3358) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][50/1251] eta 0:04:43 lr 0.000647 wd 0.0500 time 0.2285 (0.2358) data time 0.0008 (0.0080) model time 0.0000 (0.0000) loss 3.8224 (3.2640) grad_norm 1.6996 (2.3241) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][60/1251] eta 0:04:39 lr 0.000647 wd 0.0500 time 0.2278 (0.2347) data time 0.0007 (0.0069) model time 0.2270 
(0.2278) loss 3.0472 (3.2488) grad_norm 1.9125 (2.2686) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][70/1251] eta 0:04:36 lr 0.000647 wd 0.0500 time 0.2284 (0.2337) data time 0.0010 (0.0060) model time 0.2275 (0.2275) loss 2.1551 (3.2160) grad_norm 2.2181 (2.2582) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][80/1251] eta 0:04:32 lr 0.000647 wd 0.0500 time 0.2216 (0.2323) data time 0.0008 (0.0054) model time 0.2208 (0.2256) loss 3.1461 (3.2435) grad_norm 2.8225 (2.2945) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][90/1251] eta 0:04:28 lr 0.000647 wd 0.0500 time 0.2202 (0.2314) data time 0.0006 (0.0049) model time 0.2196 (0.2250) loss 3.5268 (3.2808) grad_norm 2.0029 (2.2592) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][100/1251] eta 0:04:25 lr 0.000647 wd 0.0500 time 0.2288 (0.2307) data time 0.0008 (0.0045) model time 0.2280 (0.2246) loss 3.5882 (3.2706) grad_norm 1.7166 (2.2686) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][110/1251] eta 0:04:22 lr 0.000647 wd 0.0500 time 0.2248 (0.2301) data time 0.0009 (0.0042) model time 0.2240 (0.2244) loss 3.3981 (3.2955) grad_norm 1.7976 (2.2494) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][120/1251] eta 0:04:20 lr 0.000647 wd 0.0500 time 0.2331 (0.2300) data time 0.0009 (0.0039) model time 0.2322 (0.2249) loss 2.8437 (3.2990) grad_norm 2.9640 (2.2747) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][130/1251] 
eta 0:04:17 lr 0.000647 wd 0.0500 time 0.2243 (0.2297) data time 0.0006 (0.0036) model time 0.2236 (0.2249) loss 3.6090 (3.3038) grad_norm 2.8061 (2.3111) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 07:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 07:14:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 07:14:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 07:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 07:17:45 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 07:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 07:17:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 07:17:53 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 07:17:54 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 07:17:54 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 134) [2024-08-27 07:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 07:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 07:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 07:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 07:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 07:26:14 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 07:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 07:26:25 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 07:26:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 07:26:28 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 07:26:28 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 134) [2024-08-27 07:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 07:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][140/1251] eta 2:04:01 lr 0.000647 wd 0.0500 time 0.4109 (6.6980) data time 0.0008 (0.4882) model time 0.4101 (6.2099) loss 4.1598 (4.1755) grad_norm 4.1959 (3.2317) loss_scale 2048.0000 (2048.0000) mem 7378MB [2024-08-27 07:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][150/1251] eta 0:24:06 lr 0.000647 wd 0.0500 time 0.2343 (1.3137) data time 0.0008 (0.0825) model time 0.2334 (1.2312) loss 3.1891 (3.6095) grad_norm 1.8374 (2.3771) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][160/1251] eta 0:14:59 lr 0.000647 wd 0.0500 time 0.2345 (0.8248) data time 0.0010 (0.0454) model time 0.2335 (0.7794) loss 3.4421 (3.5450) grad_norm 2.4856 (2.2122) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][170/1251] eta 0:11:33 lr 0.000647 wd 0.0500 time 0.2348 (0.6416) data time 0.0008 (0.0315) model time 0.2340 (0.6100) loss 3.6511 (3.5490) grad_norm 2.2500 (2.2665) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][180/1251] eta 0:09:45 lr 0.000647 wd 0.0500 time 0.2415 (0.5463) data time 0.0009 (0.0243) model time 0.2405 (0.5220) loss 3.3041 (3.4600) grad_norm 2.0911 (2.3242) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [134/300][190/1251] eta 0:08:36 lr 0.000647 wd 0.0500 time 0.2402 (0.4872) data time 0.0009 (0.0198) model time 0.2393 (0.4674) loss 3.9108 (3.4725) grad_norm 1.7777 (2.3473) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][200/1251] eta 0:07:49 lr 0.000647 wd 0.0500 time 0.2347 (0.4472) data time 0.0009 (0.0167) model time 0.2338 (0.4304) loss 3.9163 (3.4458) grad_norm 1.9507 (2.3743) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][210/1251] eta 0:07:15 lr 0.000646 wd 0.0500 time 0.2412 (0.4182) data time 0.0010 (0.0146) model time 0.2402 (0.4037) loss 3.2630 (3.3991) grad_norm 1.8807 (2.3461) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][220/1251] eta 0:06:48 lr 0.000646 wd 0.0500 time 0.2371 (0.3961) data time 0.0010 (0.0129) model time 0.2361 (0.3832) loss 3.2371 (3.3795) grad_norm 2.5639 (2.3427) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][230/1251] eta 0:06:26 lr 0.000646 wd 0.0500 time 0.2369 (0.3789) data time 0.0009 (0.0116) model time 0.2360 (0.3673) loss 2.0873 (3.3518) grad_norm 2.3593 (2.3530) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][240/1251] eta 0:06:09 lr 0.000646 wd 0.0500 time 0.2394 (0.3653) data time 0.0008 (0.0106) model time 0.2386 (0.3547) loss 4.3199 (3.3815) grad_norm 1.9571 (2.3461) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][250/1251] eta 0:05:54 lr 0.000646 wd 0.0500 time 0.2395 (0.3539) data time 0.0011 (0.0097) model time 0.2384 (0.3442) loss 3.6453 (3.3800) grad_norm 1.8901 (2.3326) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][260/1251] eta 0:05:41 lr 0.000646 wd 0.0500 time 0.2387 (0.3444) data time 0.0009 (0.0090) model time 0.2378 (0.3354) loss 3.2958 (3.3697) grad_norm 1.5966 (2.3091) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][270/1251] eta 0:05:29 lr 0.000646 wd 0.0500 time 0.2376 (0.3363) data time 0.0009 (0.0084) model time 0.2367 (0.3279) loss 3.7218 (3.3669) grad_norm 1.7414 (2.3225) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][280/1251] eta 0:05:19 lr 0.000646 wd 0.0500 time 0.2326 (0.3293) data time 0.0010 (0.0079) model time 0.2316 (0.3214) loss 3.2581 (3.3610) grad_norm 2.4509 (2.3658) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][290/1251] eta 0:05:10 lr 0.000646 wd 0.0500 time 0.2377 (0.3233) data time 0.0011 (0.0074) model time 0.2366 (0.3159) loss 3.6319 (3.3539) grad_norm 2.3974 (2.3834) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][300/1251] eta 0:05:02 lr 0.000646 wd 0.0500 time 0.2428 (0.3181) data time 0.0010 (0.0070) model time 0.2418 (0.3111) loss 3.4339 (3.3565) grad_norm 2.8978 (2.3656) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][310/1251] eta 0:04:54 lr 0.000646 wd 0.0500 time 0.2465 (0.3134) data time 0.0009 (0.0067) model time 0.2456 (0.3067) loss 2.5781 (3.3480) grad_norm 1.9027 (2.3482) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][320/1251] eta 0:04:48 lr 0.000646 wd 0.0500 time 0.2338 (0.3094) data time 
0.0009 (0.0064) model time 0.2329 (0.3030) loss 3.6766 (3.3371) grad_norm 2.0299 (2.3455) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][330/1251] eta 0:04:41 lr 0.000646 wd 0.0500 time 0.2385 (0.3057) data time 0.0013 (0.0061) model time 0.2372 (0.2996) loss 3.6581 (3.3369) grad_norm 1.6154 (2.3421) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 07:27:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 07:27:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 07:40:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 07:40:33 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 07:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 07:40:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 07:40:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 07:40:43 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 07:40:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 134) [2024-08-27 07:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 07:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][340/1251] eta 0:27:57 lr 0.000646 wd 0.0500 time 0.2239 (1.8410) data time 0.0013 (0.1104) model time 0.2225 (1.7306) loss 3.5539 (3.7741) grad_norm 1.8132 (2.0172) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][350/1251] eta 0:13:21 lr 0.000646 wd 0.0500 time 0.2231 (0.8899) data time 0.0009 (0.0460) model time 0.2221 (0.8439) loss 2.7707 (3.5059) grad_norm 1.7558 (2.1046) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][360/1251] eta 0:09:34 lr 0.000646 wd 0.0500 time 0.2217 (0.6450) data time 0.0024 (0.0294) model time 0.2194 (0.6157) loss 4.3937 (3.6130) grad_norm 4.0382 (2.4810) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][370/1251] eta 0:07:49 lr 0.000646 wd 0.0500 time 0.2339 (0.5331) data time 0.0008 (0.0217) model time 0.2330 (0.5114) loss 3.0129 (3.5727) grad_norm 1.7926 (2.4359) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][380/1251] eta 0:06:47 lr 0.000646 wd 0.0500 time 0.2250 (0.4678) data time 0.0008 (0.0173) model time 0.2242 (0.4505) loss 3.7654 (3.5172) grad_norm 1.8567 (2.3808) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [134/300][390/1251] eta 0:06:06 lr 0.000646 wd 0.0500 time 0.2186 (0.4253) data time 0.0010 (0.0144) model time 0.2176 (0.4109) loss 3.5756 (3.5011) grad_norm 2.2031 (2.3472) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][400/1251] eta 0:05:36 lr 0.000646 wd 0.0500 time 0.2202 (0.3952) data time 0.0009 (0.0124) model time 0.2193 (0.3828) loss 3.4141 (3.4764) grad_norm 1.7748 (2.3765) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][410/1251] eta 0:05:13 lr 0.000646 wd 0.0500 time 0.2191 (0.3726) data time 0.0010 (0.0109) model time 0.2182 (0.3617) loss 3.5162 (3.4459) grad_norm 2.0049 (2.4099) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][420/1251] eta 0:04:55 lr 0.000646 wd 0.0500 time 0.2218 (0.3553) data time 0.0011 (0.0098) model time 0.2207 (0.3455) loss 3.1435 (3.4095) grad_norm 2.0016 (2.3700) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][430/1251] eta 0:04:40 lr 0.000646 wd 0.0500 time 0.2245 (0.3418) data time 0.0010 (0.0089) model time 0.2235 (0.3329) loss 3.8877 (3.4102) grad_norm 2.3355 (2.3598) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][440/1251] eta 0:04:28 lr 0.000645 wd 0.0500 time 0.2211 (0.3307) data time 0.0007 (0.0081) model time 0.2204 (0.3226) loss 3.5252 (3.4266) grad_norm 1.8928 (2.3507) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][450/1251] eta 0:04:17 lr 0.000645 wd 0.0500 time 0.2219 (0.3215) data time 0.0008 (0.0075) model time 0.2211 (0.3140) loss 3.7220 (3.4140) grad_norm 2.7503 (2.3488) loss_scale 
2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][460/1251] eta 0:04:08 lr 0.000645 wd 0.0500 time 0.2230 (0.3142) data time 0.0010 (0.0070) model time 0.2220 (0.3072) loss 3.5102 (3.3905) grad_norm 1.7883 (2.3375) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][470/1251] eta 0:04:00 lr 0.000645 wd 0.0500 time 0.2304 (0.3077) data time 0.0007 (0.0066) model time 0.2297 (0.3011) loss 2.9600 (3.3882) grad_norm 1.9870 (2.3397) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][480/1251] eta 0:03:52 lr 0.000645 wd 0.0500 time 0.2203 (0.3021) data time 0.0008 (0.0062) model time 0.2194 (0.2959) loss 3.2995 (3.3837) grad_norm 2.6346 (2.3320) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][490/1251] eta 0:03:46 lr 0.000645 wd 0.0500 time 0.2175 (0.2971) data time 0.0009 (0.0058) model time 0.2166 (0.2912) loss 2.8679 (3.3799) grad_norm 3.0347 (2.3433) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][500/1251] eta 0:03:39 lr 0.000645 wd 0.0500 time 0.2283 (0.2929) data time 0.0009 (0.0056) model time 0.2274 (0.2873) loss 3.0958 (3.3780) grad_norm 2.2812 (2.3440) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][510/1251] eta 0:03:34 lr 0.000645 wd 0.0500 time 0.2209 (0.2890) data time 0.0010 (0.0053) model time 0.2199 (0.2837) loss 3.5096 (3.3689) grad_norm 2.3950 (2.3508) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][520/1251] eta 0:03:28 lr 0.000645 wd 0.0500 time 0.2177 (0.2856) data time 
0.0006 (0.0051) model time 0.2171 (0.2805) loss 3.1008 (3.3589) grad_norm 1.8241 (2.3696) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][530/1251] eta 0:03:23 lr 0.000645 wd 0.0500 time 0.2287 (0.2825) data time 0.0007 (0.0048) model time 0.2280 (0.2777) loss 4.0867 (3.3581) grad_norm 2.9177 (2.3718) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][540/1251] eta 0:03:18 lr 0.000645 wd 0.0500 time 0.2222 (0.2797) data time 0.0007 (0.0047) model time 0.2215 (0.2751) loss 3.2411 (3.3397) grad_norm 2.0206 (2.3636) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][550/1251] eta 0:03:14 lr 0.000645 wd 0.0500 time 0.2252 (0.2772) data time 0.0006 (0.0045) model time 0.2245 (0.2727) loss 3.8623 (3.3345) grad_norm 1.8948 (2.3644) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][560/1251] eta 0:03:09 lr 0.000645 wd 0.0500 time 0.2198 (0.2748) data time 0.0010 (0.0043) model time 0.2188 (0.2704) loss 3.0822 (3.3353) grad_norm 1.9583 (2.3635) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][570/1251] eta 0:03:05 lr 0.000645 wd 0.0500 time 0.2183 (0.2725) data time 0.0008 (0.0042) model time 0.2175 (0.2684) loss 3.4610 (3.3292) grad_norm 1.8283 (2.3778) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][580/1251] eta 0:03:01 lr 0.000645 wd 0.0500 time 0.2288 (0.2706) data time 0.0008 (0.0041) model time 0.2280 (0.2665) loss 3.7940 (3.3224) grad_norm 1.9921 (2.3928) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [134/300][590/1251] eta 0:02:57 lr 0.000645 wd 0.0500 time 0.2230 (0.2688) data time 0.0007 (0.0039) model time 0.2223 (0.2649) loss 2.5037 (3.3105) grad_norm 1.8249 (2.3863) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][600/1251] eta 0:02:54 lr 0.000645 wd 0.0500 time 0.2362 (0.2673) data time 0.0008 (0.0038) model time 0.2354 (0.2635) loss 2.9397 (3.3023) grad_norm 1.6336 (2.3931) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][610/1251] eta 0:02:50 lr 0.000645 wd 0.0500 time 0.2265 (0.2659) data time 0.0009 (0.0037) model time 0.2256 (0.2621) loss 3.6003 (3.3106) grad_norm 1.5293 (2.3760) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][620/1251] eta 0:02:47 lr 0.000645 wd 0.0500 time 0.4570 (0.2652) data time 0.0008 (0.0036) model time 0.4562 (0.2616) loss 3.8474 (3.3039) grad_norm 2.0945 (2.3645) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][630/1251] eta 0:02:43 lr 0.000645 wd 0.0500 time 0.2216 (0.2639) data time 0.0008 (0.0035) model time 0.2207 (0.2604) loss 2.5000 (3.2960) grad_norm 1.7657 (2.3583) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][640/1251] eta 0:02:40 lr 0.000645 wd 0.0500 time 0.4549 (0.2634) data time 0.0006 (0.0034) model time 0.4543 (0.2599) loss 3.9306 (3.2921) grad_norm 3.1376 (2.3720) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][650/1251] eta 0:02:37 lr 0.000645 wd 0.0500 time 0.2324 (0.2622) data time 0.0009 (0.0034) model time 0.2315 (0.2589) loss 3.6373 (3.2989) grad_norm 2.2876 (2.3761) 
loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][660/1251] eta 0:02:34 lr 0.000645 wd 0.0500 time 0.2290 (0.2611) data time 0.0006 (0.0033) model time 0.2285 (0.2578) loss 2.6215 (3.3043) grad_norm 3.5334 (2.3888) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][670/1251] eta 0:02:31 lr 0.000645 wd 0.0500 time 0.2245 (0.2601) data time 0.0008 (0.0032) model time 0.2237 (0.2568) loss 2.7417 (3.3005) grad_norm 1.7911 (2.3946) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][680/1251] eta 0:02:27 lr 0.000644 wd 0.0500 time 0.2209 (0.2591) data time 0.0007 (0.0032) model time 0.2202 (0.2560) loss 2.7644 (3.3027) grad_norm 2.2991 (2.3967) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][690/1251] eta 0:02:24 lr 0.000644 wd 0.0500 time 0.2227 (0.2583) data time 0.0007 (0.0031) model time 0.2219 (0.2552) loss 2.8429 (3.3019) grad_norm 2.2041 (2.4016) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][700/1251] eta 0:02:21 lr 0.000644 wd 0.0500 time 0.2207 (0.2575) data time 0.0006 (0.0030) model time 0.2201 (0.2545) loss 2.7036 (3.3009) grad_norm 1.9297 (2.4033) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][710/1251] eta 0:02:18 lr 0.000644 wd 0.0500 time 0.2317 (0.2567) data time 0.0008 (0.0030) model time 0.2309 (0.2537) loss 3.1024 (3.2975) grad_norm 2.5572 (2.4015) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][720/1251] eta 0:02:15 lr 0.000644 wd 0.0500 time 0.2210 (0.2559) 
data time 0.0007 (0.0029) model time 0.2202 (0.2529) loss 3.6605 (3.2929) grad_norm 2.1905 (2.3943) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][730/1251] eta 0:02:12 lr 0.000644 wd 0.0500 time 0.2256 (0.2552) data time 0.0006 (0.0029) model time 0.2250 (0.2523) loss 3.4425 (3.2935) grad_norm 2.5413 (2.3881) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][740/1251] eta 0:02:10 lr 0.000644 wd 0.0500 time 0.2254 (0.2545) data time 0.0011 (0.0028) model time 0.2243 (0.2517) loss 3.3070 (3.2987) grad_norm 2.4467 (2.3842) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][750/1251] eta 0:02:07 lr 0.000644 wd 0.0500 time 0.2242 (0.2539) data time 0.0007 (0.0028) model time 0.2235 (0.2511) loss 1.8501 (3.2960) grad_norm 3.1006 (2.3812) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][760/1251] eta 0:02:04 lr 0.000644 wd 0.0500 time 0.2286 (0.2532) data time 0.0009 (0.0028) model time 0.2277 (0.2505) loss 3.4101 (3.3016) grad_norm 3.0971 (2.3753) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][770/1251] eta 0:02:01 lr 0.000644 wd 0.0500 time 0.2300 (0.2526) data time 0.0010 (0.0027) model time 0.2289 (0.2499) loss 3.5480 (3.3076) grad_norm 2.1727 (2.3740) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][780/1251] eta 0:01:58 lr 0.000644 wd 0.0500 time 0.2216 (0.2519) data time 0.0009 (0.0027) model time 0.2207 (0.2493) loss 2.7972 (3.3064) grad_norm 2.2927 (2.3802) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:42 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [134/300][790/1251] eta 0:01:55 lr 0.000644 wd 0.0500 time 0.2253 (0.2513) data time 0.0008 (0.0026) model time 0.2244 (0.2487) loss 2.4318 (3.3014) grad_norm 2.0031 (2.3863) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][800/1251] eta 0:01:53 lr 0.000644 wd 0.0500 time 0.2310 (0.2507) data time 0.0008 (0.0026) model time 0.2302 (0.2482) loss 2.8378 (3.2966) grad_norm 3.0981 (2.3863) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][810/1251] eta 0:01:50 lr 0.000644 wd 0.0500 time 0.2271 (0.2502) data time 0.0006 (0.0026) model time 0.2264 (0.2476) loss 3.5079 (3.2921) grad_norm 3.3063 (2.3899) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][820/1251] eta 0:01:47 lr 0.000644 wd 0.0500 time 0.2301 (0.2496) data time 0.0008 (0.0025) model time 0.2293 (0.2471) loss 3.1069 (3.2922) grad_norm 2.7620 (2.3877) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][830/1251] eta 0:01:44 lr 0.000644 wd 0.0500 time 0.2246 (0.2492) data time 0.0008 (0.0025) model time 0.2238 (0.2467) loss 3.3138 (3.2922) grad_norm 4.8208 (2.3914) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][840/1251] eta 0:01:42 lr 0.000644 wd 0.0500 time 0.2226 (0.2487) data time 0.0009 (0.0025) model time 0.2216 (0.2462) loss 3.5010 (3.2912) grad_norm 2.1194 (2.3942) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][850/1251] eta 0:01:39 lr 0.000644 wd 0.0500 time 0.2281 (0.2483) data time 0.0009 (0.0024) model time 0.2271 (0.2459) loss 3.4929 (3.3001) grad_norm 
1.7412 (2.3847) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][860/1251] eta 0:01:36 lr 0.000644 wd 0.0500 time 0.2316 (0.2480) data time 0.0008 (0.0024) model time 0.2308 (0.2456) loss 4.0731 (3.2942) grad_norm 1.9240 (2.3817) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][870/1251] eta 0:01:34 lr 0.000644 wd 0.0500 time 0.2284 (0.2476) data time 0.0006 (0.0024) model time 0.2277 (0.2452) loss 2.3722 (3.2905) grad_norm 3.4154 (2.3801) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][880/1251] eta 0:01:31 lr 0.000644 wd 0.0500 time 0.2258 (0.2472) data time 0.0006 (0.0024) model time 0.2251 (0.2449) loss 4.0061 (3.2925) grad_norm 2.6410 (2.3832) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][890/1251] eta 0:01:29 lr 0.000644 wd 0.0500 time 0.2270 (0.2469) data time 0.0008 (0.0023) model time 0.2262 (0.2445) loss 2.0634 (3.2960) grad_norm 2.2610 (2.3878) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][900/1251] eta 0:01:26 lr 0.000644 wd 0.0500 time 0.2300 (0.2466) data time 0.0009 (0.0023) model time 0.2292 (0.2443) loss 3.8516 (3.3001) grad_norm 2.2493 (2.3866) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][910/1251] eta 0:01:23 lr 0.000643 wd 0.0500 time 0.2260 (0.2463) data time 0.0007 (0.0023) model time 0.2254 (0.2440) loss 3.7379 (3.3009) grad_norm 1.9278 (2.3907) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][920/1251] eta 0:01:21 lr 0.000643 wd 0.0500 time 
0.2314 (0.2460) data time 0.0007 (0.0023) model time 0.2307 (0.2437) loss 3.8442 (3.3029) grad_norm 2.0527 (2.3902) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][930/1251] eta 0:01:18 lr 0.000643 wd 0.0500 time 0.2318 (0.2456) data time 0.0006 (0.0022) model time 0.2312 (0.2434) loss 4.3774 (3.3057) grad_norm 2.2827 (2.3907) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][940/1251] eta 0:01:16 lr 0.000643 wd 0.0500 time 0.2251 (0.2453) data time 0.0010 (0.0022) model time 0.2241 (0.2431) loss 3.6584 (3.3033) grad_norm 2.5666 (2.3921) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][950/1251] eta 0:01:13 lr 0.000643 wd 0.0500 time 0.2273 (0.2450) data time 0.0009 (0.0022) model time 0.2265 (0.2428) loss 3.8294 (3.3054) grad_norm 1.8239 (2.3869) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][960/1251] eta 0:01:11 lr 0.000643 wd 0.0500 time 0.2294 (0.2446) data time 0.0007 (0.0022) model time 0.2287 (0.2425) loss 3.1649 (3.3097) grad_norm 3.0277 (2.4038) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][970/1251] eta 0:01:08 lr 0.000643 wd 0.0500 time 0.2310 (0.2443) data time 0.0006 (0.0022) model time 0.2304 (0.2421) loss 3.0151 (3.3105) grad_norm 1.7376 (2.3964) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][980/1251] eta 0:01:06 lr 0.000643 wd 0.0500 time 0.2191 (0.2440) data time 0.0008 (0.0021) model time 0.2184 (0.2418) loss 3.2510 (3.3071) grad_norm 2.2196 (2.3908) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][990/1251] eta 0:01:03 lr 0.000643 wd 0.0500 time 0.2259 (0.2437) data time 0.0008 (0.0021) model time 0.2251 (0.2416) loss 2.5566 (3.3054) grad_norm 2.2294 (2.3890) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1000/1251] eta 0:01:01 lr 0.000643 wd 0.0500 time 0.2233 (0.2434) data time 0.0009 (0.0021) model time 0.2224 (0.2413) loss 3.7096 (3.3037) grad_norm 2.0245 (2.3839) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1010/1251] eta 0:00:58 lr 0.000643 wd 0.0500 time 0.2231 (0.2432) data time 0.0006 (0.0021) model time 0.2225 (0.2411) loss 3.2890 (3.3071) grad_norm 1.9523 (2.3832) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1020/1251] eta 0:00:56 lr 0.000643 wd 0.0500 time 0.2374 (0.2430) data time 0.0006 (0.0021) model time 0.2368 (0.2409) loss 3.8207 (3.3082) grad_norm 1.7283 (2.3805) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1030/1251] eta 0:00:53 lr 0.000643 wd 0.0500 time 0.2258 (0.2427) data time 0.0008 (0.0021) model time 0.2250 (0.2407) loss 1.9148 (3.3020) grad_norm 1.5988 (2.3896) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1040/1251] eta 0:00:51 lr 0.000643 wd 0.0500 time 0.2373 (0.2425) data time 0.0008 (0.0020) model time 0.2365 (0.2405) loss 2.1904 (3.3031) grad_norm 1.8961 (2.3855) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1050/1251] eta 0:00:48 lr 0.000643 wd 0.0500 time 0.2350 (0.2423) data time 0.0009 (0.0020) model time 0.2341 (0.2402) loss 
2.4532 (3.2983) grad_norm 1.8532 (2.3812) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1060/1251] eta 0:00:46 lr 0.000643 wd 0.0500 time 0.2429 (0.2421) data time 0.0007 (0.0020) model time 0.2422 (0.2401) loss 2.4014 (3.2965) grad_norm 2.7792 (2.3807) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1070/1251] eta 0:00:43 lr 0.000643 wd 0.0500 time 0.2261 (0.2419) data time 0.0006 (0.0020) model time 0.2255 (0.2399) loss 3.8175 (3.3010) grad_norm 2.7896 (2.3805) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1080/1251] eta 0:00:41 lr 0.000643 wd 0.0500 time 0.2220 (0.2417) data time 0.0007 (0.0020) model time 0.2213 (0.2398) loss 4.1716 (3.3021) grad_norm 2.9095 (2.3793) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1090/1251] eta 0:00:38 lr 0.000643 wd 0.0500 time 0.2228 (0.2415) data time 0.0009 (0.0020) model time 0.2219 (0.2396) loss 3.2235 (3.2979) grad_norm 1.8372 (2.3770) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1100/1251] eta 0:00:36 lr 0.000643 wd 0.0500 time 0.2300 (0.2413) data time 0.0009 (0.0020) model time 0.2291 (0.2394) loss 2.7646 (3.2998) grad_norm 2.0435 (2.3751) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1110/1251] eta 0:00:34 lr 0.000643 wd 0.0500 time 0.2253 (0.2412) data time 0.0009 (0.0019) model time 0.2244 (0.2392) loss 2.8264 (3.3028) grad_norm 2.8934 (2.3764) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1120/1251] eta 
0:00:31 lr 0.000643 wd 0.0500 time 0.2219 (0.2410) data time 0.0007 (0.0019) model time 0.2212 (0.2391) loss 3.8146 (3.3010) grad_norm 2.8057 (2.3848) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1130/1251] eta 0:00:29 lr 0.000643 wd 0.0500 time 0.2299 (0.2408) data time 0.0007 (0.0019) model time 0.2293 (0.2389) loss 3.6420 (3.3001) grad_norm 3.1397 (2.3891) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1140/1251] eta 0:00:26 lr 0.000643 wd 0.0500 time 0.2267 (0.2407) data time 0.0006 (0.0019) model time 0.2261 (0.2387) loss 2.3807 (3.2958) grad_norm 2.0902 (2.3950) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1150/1251] eta 0:00:24 lr 0.000642 wd 0.0500 time 0.2236 (0.2407) data time 0.0009 (0.0019) model time 0.2227 (0.2388) loss 3.5304 (3.2945) grad_norm 2.4782 (2.3919) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1160/1251] eta 0:00:21 lr 0.000642 wd 0.0500 time 0.2036 (0.2407) data time 0.0007 (0.0019) model time 0.2029 (0.2388) loss 2.6344 (3.2902) grad_norm 1.8195 (2.3992) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1170/1251] eta 0:00:19 lr 0.000642 wd 0.0500 time 0.2266 (0.2405) data time 0.0009 (0.0019) model time 0.2257 (0.2386) loss 2.2209 (3.2887) grad_norm 2.9217 (2.4016) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1180/1251] eta 0:00:17 lr 0.000642 wd 0.0500 time 0.2364 (0.2404) data time 0.0012 (0.0019) model time 0.2352 (0.2385) loss 3.2444 (3.2896) grad_norm 2.7626 (2.4035) loss_scale 2048.0000 (2048.0000) mem 
7377MB [2024-08-27 07:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1190/1251] eta 0:00:14 lr 0.000642 wd 0.0500 time 0.2416 (0.2403) data time 0.0009 (0.0019) model time 0.2407 (0.2385) loss 3.2792 (3.2908) grad_norm 3.6018 (2.4012) loss_scale 2048.0000 (2048.0000) mem 7377MB [2024-08-27 07:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1200/1251] eta 0:00:12 lr 0.000642 wd 0.0500 time 0.2325 (0.2402) data time 0.0007 (0.0018) model time 0.2318 (0.2384) loss 2.6929 (3.2882) grad_norm 2.4387 (nan) loss_scale 1024.0000 (2044.4567) mem 7377MB [2024-08-27 07:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1210/1251] eta 0:00:09 lr 0.000642 wd 0.0500 time 0.2242 (0.2401) data time 0.0007 (0.0018) model time 0.2235 (0.2382) loss 2.1351 (3.2872) grad_norm 1.9199 (nan) loss_scale 1024.0000 (2032.8210) mem 7377MB [2024-08-27 07:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1220/1251] eta 0:00:07 lr 0.000642 wd 0.0500 time 0.2274 (0.2399) data time 0.0006 (0.0018) model time 0.2268 (0.2381) loss 3.3995 (3.2872) grad_norm 3.4836 (nan) loss_scale 1024.0000 (2021.4476) mem 7377MB [2024-08-27 07:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1230/1251] eta 0:00:05 lr 0.000642 wd 0.0500 time 0.2234 (0.2398) data time 0.0007 (0.0018) model time 0.2227 (0.2380) loss 2.9968 (3.2872) grad_norm 2.1987 (nan) loss_scale 1024.0000 (2010.3278) mem 7377MB [2024-08-27 07:44:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1240/1251] eta 0:00:02 lr 0.000642 wd 0.0500 time 0.2116 (0.2396) data time 0.0004 (0.0018) model time 0.2112 (0.2378) loss 2.2718 (3.2877) grad_norm 1.8522 (nan) loss_scale 1024.0000 (1999.4531) mem 7377MB [2024-08-27 07:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [134/300][1250/1251] eta 0:00:00 lr 0.000642 wd 0.0500 time 0.2179 (0.2394) data time 0.0006 (0.0018) model time 0.2173 
(0.2376) loss 2.9551 (3.2871) grad_norm 2.1867 (nan) loss_scale 1024.0000 (1988.8157) mem 7377MB
[2024-08-27 07:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 134 training takes 0:03:39
[2024-08-27 07:44:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-27 07:44:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-27 07:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.367 (0.367) Loss 0.5151 (0.5151) Acc@1 90.625 (90.625) Acc@5 97.363 (97.363) Mem 7377MB
[2024-08-27 07:44:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.065 (0.097) Loss 0.7969 (0.7802) Acc@1 83.496 (83.576) Acc@5 96.973 (96.564) Mem 7377MB
[2024-08-27 07:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.085) Loss 1.0898 (0.8099) Acc@1 74.707 (82.450) Acc@5 94.043 (96.456) Mem 7377MB
[2024-08-27 07:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.080) Loss 1.3340 (0.9169) Acc@1 68.359 (80.012) Acc@5 90.234 (95.218) Mem 7377MB
[2024-08-27 07:44:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.2090 (0.9700) Acc@1 71.484 (78.644) Acc@5 91.992 (94.593) Mem 7377MB
[2024-08-27 07:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.182 Acc@5 94.560
[2024-08-27 07:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.2%
[2024-08-27 07:44:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.765 (0.765) Loss 0.4180 (0.4180) Acc@1 92.480 (92.480) Acc@5 98.633 (98.633) Mem 7377MB
[2024-08-27 07:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.067 (0.136) Loss 0.6616 (0.6521) Acc@1 86.914 (85.982) Acc@5 97.070 (97.292) Mem 7377MB
[2024-08-27 07:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.106) Loss 0.9185 (0.6759) Acc@1 78.418 (85.026) Acc@5 94.922 (97.261) Mem 7377MB
[2024-08-27 07:44:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.094) Loss 1.1689 (0.7675) Acc@1 70.703 (82.787) Acc@5 92.090 (96.276) Mem 7377MB
[2024-08-27 07:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.087) Loss 1.0654 (0.8159) Acc@1 73.047 (81.407) Acc@5 93.750 (95.753) Mem 7377MB
[2024-08-27 07:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.984 Acc@5 95.708
[2024-08-27 07:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.0%
[2024-08-27 07:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.98%
[2024-08-27 07:44:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 07:44:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 07:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][0/1251] eta 0:12:13 lr 0.000642 wd 0.0500 time 0.5867 (0.5867) data time 0.3449 (0.3449) model time 0.0000 (0.0000) loss 2.7398 (2.7398) grad_norm 2.0172 (2.0172) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 07:44:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][10/1251] eta 0:05:19 lr 0.000642 wd 0.0500 time 0.2240 (0.2573) data time 0.0007 (0.0322) model time 0.0000 (0.0000) loss 3.5204 (3.4932) grad_norm 2.6684 (2.1062) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][20/1251] eta 0:04:59 lr 0.000642 wd 0.0500 time 0.2234 (0.2432) data time 0.0007 (0.0173) model time 0.0000 (0.0000) loss 3.3841 (3.3367) grad_norm 2.9542 (2.2252) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][30/1251] eta 0:04:49 lr 0.000642 wd 0.0500 time 0.2211 (0.2374) data time 0.0008 (0.0120) model time 0.0000 (0.0000) loss 3.0949 (3.2511) grad_norm 2.4042 (2.4417) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:44:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][40/1251] eta 0:04:44 lr 0.000642 wd 0.0500 time 0.2188 (0.2347) data time 0.0006 (0.0093) model time 0.0000 (0.0000) loss 3.8723 (3.2101) grad_norm 2.2832 (2.4290) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][50/1251] eta 0:04:39 lr 0.000642 wd 0.0500 time 0.2241 (0.2328) data time 0.0008 (0.0077) model time 0.0000 (0.0000) loss 3.1327 (3.2349) grad_norm 1.5225 (2.3515) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][60/1251] eta 0:04:36 lr 0.000642 wd 0.0500 time 0.2288 (0.2320) data time 0.0009 (0.0066) model time 0.2279 
(0.2268) loss 2.2331 (3.2554) grad_norm 2.0053 (2.3399) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][70/1251] eta 0:04:32 lr 0.000642 wd 0.0500 time 0.2364 (0.2311) data time 0.0009 (0.0058) model time 0.2355 (0.2258) loss 1.9390 (3.2467) grad_norm 3.4647 (2.4223) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][80/1251] eta 0:04:29 lr 0.000642 wd 0.0500 time 0.2214 (0.2303) data time 0.0009 (0.0052) model time 0.2205 (0.2249) loss 3.4197 (3.2658) grad_norm 3.4629 (2.4706) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][90/1251] eta 0:04:26 lr 0.000642 wd 0.0500 time 0.2328 (0.2298) data time 0.0006 (0.0047) model time 0.2322 (0.2249) loss 4.0021 (3.2699) grad_norm 2.6796 (2.4654) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][100/1251] eta 0:04:23 lr 0.000642 wd 0.0500 time 0.2215 (0.2293) data time 0.0007 (0.0044) model time 0.2208 (0.2247) loss 2.7996 (3.2781) grad_norm 2.2316 (2.4504) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][110/1251] eta 0:04:21 lr 0.000642 wd 0.0500 time 0.2221 (0.2289) data time 0.0010 (0.0041) model time 0.2212 (0.2245) loss 3.1920 (3.2502) grad_norm 1.5689 (2.4693) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][120/1251] eta 0:04:18 lr 0.000642 wd 0.0500 time 0.2234 (0.2289) data time 0.0007 (0.0038) model time 0.2226 (0.2250) loss 3.7079 (3.2580) grad_norm 2.5464 (2.4437) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][130/1251] 
eta 0:04:16 lr 0.000641 wd 0.0500 time 0.2278 (0.2289) data time 0.0010 (0.0036) model time 0.2268 (0.2254) loss 2.4561 (3.2576) grad_norm 2.0186 (2.4357) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][140/1251] eta 0:04:14 lr 0.000641 wd 0.0500 time 0.2156 (0.2288) data time 0.0007 (0.0034) model time 0.2149 (0.2256) loss 3.9823 (3.2716) grad_norm 2.0813 (2.4348) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][150/1251] eta 0:04:11 lr 0.000641 wd 0.0500 time 0.2244 (0.2287) data time 0.0010 (0.0032) model time 0.2235 (0.2257) loss 3.2424 (3.2672) grad_norm 1.7614 (2.4232) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][160/1251] eta 0:04:09 lr 0.000641 wd 0.0500 time 0.2323 (0.2289) data time 0.0010 (0.0031) model time 0.2314 (0.2262) loss 3.5436 (3.2541) grad_norm 3.1439 (2.4239) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][170/1251] eta 0:04:07 lr 0.000641 wd 0.0500 time 0.2251 (0.2288) data time 0.0006 (0.0030) model time 0.2246 (0.2262) loss 2.1942 (3.2505) grad_norm 2.6048 (2.4217) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][180/1251] eta 0:04:04 lr 0.000641 wd 0.0500 time 0.2263 (0.2285) data time 0.0006 (0.0028) model time 0.2257 (0.2259) loss 2.9814 (3.2562) grad_norm 2.7356 (2.4033) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][190/1251] eta 0:04:02 lr 0.000641 wd 0.0500 time 0.2227 (0.2283) data time 0.0007 (0.0027) model time 0.2219 (0.2257) loss 3.1525 (3.2572) grad_norm 1.7947 (2.3972) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 07:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][200/1251] eta 0:03:59 lr 0.000641 wd 0.0500 time 0.2246 (0.2283) data time 0.0007 (0.0026) model time 0.2239 (0.2259) loss 3.6996 (3.2667) grad_norm 2.5331 (2.3899) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][210/1251] eta 0:03:57 lr 0.000641 wd 0.0500 time 0.2310 (0.2282) data time 0.0008 (0.0026) model time 0.2302 (0.2258) loss 3.5570 (3.2612) grad_norm 1.8551 (2.3709) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][220/1251] eta 0:03:55 lr 0.000641 wd 0.0500 time 0.2256 (0.2281) data time 0.0008 (0.0025) model time 0.2248 (0.2257) loss 3.5693 (3.2622) grad_norm 1.6766 (2.3677) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][230/1251] eta 0:03:52 lr 0.000641 wd 0.0500 time 0.2275 (0.2280) data time 0.0009 (0.0024) model time 0.2266 (0.2257) loss 3.1459 (3.2638) grad_norm 2.4040 (2.3673) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][240/1251] eta 0:03:50 lr 0.000641 wd 0.0500 time 0.2289 (0.2281) data time 0.0009 (0.0024) model time 0.2280 (0.2259) loss 2.3088 (3.2593) grad_norm 2.6284 (2.3687) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][250/1251] eta 0:03:48 lr 0.000641 wd 0.0500 time 0.2256 (0.2281) data time 0.0006 (0.0023) model time 0.2250 (0.2260) loss 3.7753 (3.2553) grad_norm 2.2016 (2.3600) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][260/1251] eta 0:03:45 lr 0.000641 wd 0.0500 time 0.2213 (0.2280) data time 0.0009 (0.0023) model time 0.2204 
(0.2259) loss 3.4697 (3.2665) grad_norm 2.0319 (2.3837) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][270/1251] eta 0:03:43 lr 0.000641 wd 0.0500 time 0.2220 (0.2278) data time 0.0008 (0.0022) model time 0.2212 (0.2257) loss 3.4164 (3.2817) grad_norm 2.5910 (2.4060) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][280/1251] eta 0:03:40 lr 0.000641 wd 0.0500 time 0.2208 (0.2276) data time 0.0006 (0.0022) model time 0.2202 (0.2255) loss 3.7106 (3.2811) grad_norm 2.0815 (2.4061) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][290/1251] eta 0:03:38 lr 0.000641 wd 0.0500 time 0.2375 (0.2275) data time 0.0009 (0.0021) model time 0.2367 (0.2255) loss 3.2727 (3.2880) grad_norm 1.8630 (2.3933) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][300/1251] eta 0:03:36 lr 0.000641 wd 0.0500 time 0.2182 (0.2275) data time 0.0007 (0.0021) model time 0.2175 (0.2254) loss 2.7245 (3.2863) grad_norm 1.9580 (2.3840) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][310/1251] eta 0:03:33 lr 0.000641 wd 0.0500 time 0.2211 (0.2274) data time 0.0007 (0.0020) model time 0.2204 (0.2254) loss 2.5350 (3.2801) grad_norm 2.1844 (2.3915) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][320/1251] eta 0:03:31 lr 0.000641 wd 0.0500 time 0.2269 (0.2274) data time 0.0008 (0.0020) model time 0.2260 (0.2254) loss 3.3220 (3.2782) grad_norm 2.5397 (2.3863) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[135/300][330/1251] eta 0:03:29 lr 0.000641 wd 0.0500 time 0.2326 (0.2275) data time 0.0008 (0.0020) model time 0.2318 (0.2256) loss 2.6793 (3.2768) grad_norm 1.7667 (2.3776) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][340/1251] eta 0:03:27 lr 0.000641 wd 0.0500 time 0.2186 (0.2273) data time 0.0009 (0.0019) model time 0.2177 (0.2254) loss 3.6428 (3.2716) grad_norm 2.6494 (2.3849) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][350/1251] eta 0:03:24 lr 0.000641 wd 0.0500 time 0.2212 (0.2272) data time 0.0007 (0.0019) model time 0.2206 (0.2253) loss 2.6687 (3.2765) grad_norm 1.7191 (2.3778) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][360/1251] eta 0:03:22 lr 0.000640 wd 0.0500 time 0.2193 (0.2271) data time 0.0007 (0.0019) model time 0.2186 (0.2253) loss 3.8069 (3.2831) grad_norm 2.4017 (2.3749) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][370/1251] eta 0:03:20 lr 0.000640 wd 0.0500 time 0.2292 (0.2271) data time 0.0008 (0.0019) model time 0.2284 (0.2253) loss 3.3300 (3.2851) grad_norm 1.8778 (2.3633) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][380/1251] eta 0:03:17 lr 0.000640 wd 0.0500 time 0.2358 (0.2270) data time 0.0006 (0.0018) model time 0.2352 (0.2252) loss 2.9849 (3.2849) grad_norm 2.5522 (2.3766) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][390/1251] eta 0:03:15 lr 0.000640 wd 0.0500 time 0.2201 (0.2271) data time 0.0009 (0.0018) model time 0.2192 (0.2253) loss 3.3171 (3.2859) grad_norm 2.5164 (2.3734) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 07:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][400/1251] eta 0:03:13 lr 0.000640 wd 0.0500 time 0.2251 (0.2270) data time 0.0008 (0.0018) model time 0.2242 (0.2253) loss 3.0286 (3.2796) grad_norm 1.9326 (2.3689) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][410/1251] eta 0:03:10 lr 0.000640 wd 0.0500 time 0.2301 (0.2271) data time 0.0006 (0.0018) model time 0.2296 (0.2253) loss 3.9317 (3.2798) grad_norm 2.5509 (2.3739) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][420/1251] eta 0:03:08 lr 0.000640 wd 0.0500 time 0.2293 (0.2271) data time 0.0008 (0.0017) model time 0.2285 (0.2254) loss 3.6820 (3.2801) grad_norm 1.6828 (2.3708) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][430/1251] eta 0:03:06 lr 0.000640 wd 0.0500 time 0.2165 (0.2272) data time 0.0009 (0.0017) model time 0.2156 (0.2255) loss 3.5877 (3.2828) grad_norm 1.7150 (2.3698) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][440/1251] eta 0:03:04 lr 0.000640 wd 0.0500 time 0.2237 (0.2272) data time 0.0009 (0.0017) model time 0.2228 (0.2255) loss 3.7493 (3.2908) grad_norm 3.0889 (2.3783) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][450/1251] eta 0:03:01 lr 0.000640 wd 0.0500 time 0.2286 (0.2272) data time 0.0006 (0.0017) model time 0.2280 (0.2255) loss 3.2126 (3.2961) grad_norm 2.0977 (2.3836) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][460/1251] eta 0:02:59 lr 0.000640 wd 0.0500 time 0.2261 (0.2271) data time 0.0008 
(0.0017) model time 0.2253 (0.2254) loss 3.2278 (3.2942) grad_norm 2.1995 (2.3763) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][470/1251] eta 0:02:57 lr 0.000640 wd 0.0500 time 0.2219 (0.2275) data time 0.0006 (0.0017) model time 0.2213 (0.2259) loss 2.2250 (3.2956) grad_norm 2.0062 (2.3799) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][480/1251] eta 0:02:55 lr 0.000640 wd 0.0500 time 0.2153 (0.2275) data time 0.0011 (0.0016) model time 0.2142 (0.2259) loss 3.8252 (3.2985) grad_norm 1.8492 (2.3795) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][490/1251] eta 0:02:53 lr 0.000640 wd 0.0500 time 0.2351 (0.2275) data time 0.0007 (0.0016) model time 0.2344 (0.2259) loss 2.8458 (3.3037) grad_norm 2.2081 (2.3832) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][500/1251] eta 0:02:51 lr 0.000640 wd 0.0500 time 0.2207 (0.2288) data time 0.0008 (0.0016) model time 0.2199 (0.2275) loss 3.6854 (3.3010) grad_norm 3.3827 (2.3870) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][510/1251] eta 0:02:51 lr 0.000640 wd 0.0500 time 0.2179 (0.2310) data time 0.0008 (0.0020) model time 0.2171 (0.2294) loss 2.7861 (3.3002) grad_norm 2.1645 (2.3930) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][520/1251] eta 0:02:50 lr 0.000640 wd 0.0500 time 0.3442 (0.2327) data time 0.0008 (0.0020) model time 0.3434 (0.2313) loss 3.1362 (3.3017) grad_norm 2.0316 (2.4319) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [135/300][530/1251] eta 0:02:47 lr 0.000640 wd 0.0500 time 0.2300 (0.2326) data time 0.0009 (0.0020) model time 0.2291 (0.2312) loss 2.4195 (3.3036) grad_norm 3.2969 (2.4499) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][540/1251] eta 0:02:45 lr 0.000640 wd 0.0500 time 0.2238 (0.2325) data time 0.0008 (0.0020) model time 0.2230 (0.2311) loss 4.0263 (3.3060) grad_norm 1.6755 (2.4425) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][550/1251] eta 0:02:42 lr 0.000640 wd 0.0500 time 0.2245 (0.2324) data time 0.0009 (0.0019) model time 0.2236 (0.2310) loss 3.5997 (3.3054) grad_norm 1.9779 (2.4349) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][560/1251] eta 0:02:40 lr 0.000640 wd 0.0500 time 0.2205 (0.2323) data time 0.0008 (0.0019) model time 0.2197 (0.2309) loss 3.2426 (3.3032) grad_norm 2.2066 (2.4299) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][570/1251] eta 0:02:38 lr 0.000640 wd 0.0500 time 0.2228 (0.2322) data time 0.0007 (0.0019) model time 0.2221 (0.2308) loss 3.0172 (3.2968) grad_norm 1.7252 (2.4294) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][580/1251] eta 0:02:35 lr 0.000640 wd 0.0500 time 0.2278 (0.2320) data time 0.0006 (0.0019) model time 0.2271 (0.2306) loss 3.1851 (3.2918) grad_norm 1.7475 (2.4296) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][590/1251] eta 0:02:33 lr 0.000640 wd 0.0500 time 0.2262 (0.2319) data time 0.0008 (0.0019) model time 0.2254 (0.2304) loss 3.4365 (3.2959) grad_norm 1.9508 (2.4267) loss_scale 
1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][600/1251] eta 0:02:30 lr 0.000639 wd 0.0500 time 0.2214 (0.2317) data time 0.0009 (0.0019) model time 0.2205 (0.2303) loss 3.3066 (3.2966) grad_norm 2.2175 (2.4278) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][610/1251] eta 0:02:28 lr 0.000639 wd 0.0500 time 0.2192 (0.2316) data time 0.0007 (0.0018) model time 0.2185 (0.2302) loss 2.1848 (3.2955) grad_norm 2.0054 (2.4258) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][620/1251] eta 0:02:26 lr 0.000639 wd 0.0500 time 0.2263 (0.2315) data time 0.0008 (0.0018) model time 0.2256 (0.2300) loss 3.5089 (3.2953) grad_norm 1.8118 (2.4237) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][630/1251] eta 0:02:23 lr 0.000639 wd 0.0500 time 0.2181 (0.2314) data time 0.0005 (0.0018) model time 0.2176 (0.2300) loss 3.5797 (3.2940) grad_norm 2.3546 (2.4232) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][640/1251] eta 0:02:21 lr 0.000639 wd 0.0500 time 0.2179 (0.2313) data time 0.0007 (0.0018) model time 0.2172 (0.2298) loss 2.8476 (3.2928) grad_norm 2.0337 (2.4260) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][650/1251] eta 0:02:18 lr 0.000639 wd 0.0500 time 0.2245 (0.2312) data time 0.0006 (0.0018) model time 0.2239 (0.2297) loss 2.5516 (3.2871) grad_norm 2.1607 (2.4250) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][660/1251] eta 0:02:16 lr 0.000639 wd 0.0500 time 0.2184 (0.2314) data time 
0.0008 (0.0018) model time 0.2176 (0.2300) loss 3.8807 (3.2892) grad_norm 2.7270 (2.4216) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][670/1251] eta 0:02:14 lr 0.000639 wd 0.0500 time 0.2286 (0.2312) data time 0.0007 (0.0018) model time 0.2279 (0.2298) loss 3.5873 (3.2909) grad_norm 1.7508 (2.4189) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][680/1251] eta 0:02:13 lr 0.000639 wd 0.0500 time 0.2159 (0.2336) data time 0.0007 (0.0017) model time 0.2152 (0.2324) loss 3.2084 (3.2923) grad_norm 1.6613 (2.4184) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][690/1251] eta 0:02:10 lr 0.000639 wd 0.0500 time 0.2302 (0.2335) data time 0.0006 (0.0017) model time 0.2295 (0.2323) loss 2.5235 (3.2966) grad_norm 1.6433 (2.4187) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][700/1251] eta 0:02:10 lr 0.000639 wd 0.0500 time 0.2182 (0.2360) data time 0.0006 (0.0017) model time 0.2176 (0.2350) loss 3.9797 (3.2980) grad_norm 2.1963 (2.4178) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][710/1251] eta 0:02:07 lr 0.000639 wd 0.0500 time 0.2277 (0.2358) data time 0.0007 (0.0017) model time 0.2270 (0.2348) loss 2.7163 (3.2948) grad_norm 2.6997 (2.4180) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][720/1251] eta 0:02:05 lr 0.000639 wd 0.0500 time 0.2188 (0.2356) data time 0.0006 (0.0017) model time 0.2183 (0.2346) loss 3.3690 (3.2984) grad_norm 2.1588 (2.4215) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [135/300][730/1251] eta 0:02:02 lr 0.000639 wd 0.0500 time 0.2189 (0.2355) data time 0.0008 (0.0017) model time 0.2181 (0.2345) loss 2.3525 (3.2909) grad_norm 4.9884 (2.4321) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][740/1251] eta 0:02:00 lr 0.000639 wd 0.0500 time 0.2141 (0.2354) data time 0.0007 (0.0017) model time 0.2134 (0.2343) loss 2.2268 (3.2858) grad_norm 3.7802 (2.4315) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][750/1251] eta 0:01:57 lr 0.000639 wd 0.0500 time 0.2255 (0.2353) data time 0.0009 (0.0017) model time 0.2246 (0.2342) loss 2.6862 (3.2846) grad_norm 2.1488 (2.4272) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][760/1251] eta 0:01:55 lr 0.000639 wd 0.0500 time 0.2270 (0.2352) data time 0.0007 (0.0017) model time 0.2262 (0.2341) loss 3.2861 (3.2848) grad_norm 3.0167 (2.4278) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][770/1251] eta 0:01:53 lr 0.000639 wd 0.0500 time 0.2307 (0.2350) data time 0.0007 (0.0016) model time 0.2300 (0.2340) loss 3.1753 (3.2872) grad_norm 2.3009 (2.4282) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][780/1251] eta 0:01:50 lr 0.000639 wd 0.0500 time 0.2266 (0.2349) data time 0.0007 (0.0016) model time 0.2259 (0.2338) loss 4.0124 (3.2891) grad_norm 2.9188 (2.4273) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][790/1251] eta 0:01:48 lr 0.000639 wd 0.0500 time 0.2204 (0.2347) data time 0.0007 (0.0016) model time 0.2197 (0.2337) loss 3.2835 (3.2909) grad_norm 1.7091 (2.4255) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][800/1251] eta 0:01:45 lr 0.000639 wd 0.0500 time 0.2183 (0.2346) data time 0.0007 (0.0016) model time 0.2176 (0.2335) loss 2.5201 (3.2895) grad_norm 2.2216 (2.4252) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][810/1251] eta 0:01:43 lr 0.000639 wd 0.0500 time 0.2241 (0.2345) data time 0.0010 (0.0016) model time 0.2232 (0.2334) loss 3.7944 (3.2903) grad_norm 2.0250 (2.4270) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][820/1251] eta 0:01:41 lr 0.000639 wd 0.0500 time 0.2234 (0.2344) data time 0.0010 (0.0016) model time 0.2225 (0.2333) loss 3.8550 (3.2900) grad_norm 2.8868 (2.4249) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][830/1251] eta 0:01:38 lr 0.000638 wd 0.0500 time 0.2300 (0.2343) data time 0.0006 (0.0016) model time 0.2294 (0.2332) loss 3.7040 (3.2876) grad_norm 1.8792 (2.4237) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][840/1251] eta 0:01:36 lr 0.000638 wd 0.0500 time 0.2331 (0.2343) data time 0.0007 (0.0016) model time 0.2324 (0.2332) loss 2.4184 (3.2816) grad_norm 7.5365 (2.4290) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][850/1251] eta 0:01:33 lr 0.000638 wd 0.0500 time 0.2262 (0.2342) data time 0.0007 (0.0016) model time 0.2255 (0.2331) loss 3.5036 (3.2845) grad_norm 1.9591 (2.4296) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][860/1251] eta 0:01:31 lr 0.000638 wd 0.0500 time 0.2287 (0.2341) 
data time 0.0008 (0.0016) model time 0.2278 (0.2330) loss 2.6075 (3.2848) grad_norm 1.9466 (2.4257) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][870/1251] eta 0:01:29 lr 0.000638 wd 0.0500 time 0.2215 (0.2340) data time 0.0009 (0.0016) model time 0.2207 (0.2329) loss 3.5647 (3.2840) grad_norm 2.2253 (2.4217) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][880/1251] eta 0:01:26 lr 0.000638 wd 0.0500 time 0.2236 (0.2339) data time 0.0007 (0.0016) model time 0.2230 (0.2328) loss 3.0931 (3.2866) grad_norm 2.7557 (2.4205) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][890/1251] eta 0:01:24 lr 0.000638 wd 0.0500 time 0.2233 (0.2338) data time 0.0010 (0.0015) model time 0.2224 (0.2327) loss 3.2144 (3.2852) grad_norm 2.1675 (2.4197) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][900/1251] eta 0:01:22 lr 0.000638 wd 0.0500 time 0.2231 (0.2337) data time 0.0008 (0.0015) model time 0.2223 (0.2326) loss 2.8795 (3.2847) grad_norm 1.4802 (2.4206) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][910/1251] eta 0:01:19 lr 0.000638 wd 0.0500 time 0.2227 (0.2336) data time 0.0009 (0.0015) model time 0.2217 (0.2325) loss 3.6052 (3.2857) grad_norm 2.2188 (2.4184) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][920/1251] eta 0:01:17 lr 0.000638 wd 0.0500 time 0.2265 (0.2335) data time 0.0006 (0.0015) model time 0.2259 (0.2324) loss 2.2228 (3.2855) grad_norm 2.1365 (2.4154) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:16 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [135/300][930/1251] eta 0:01:14 lr 0.000638 wd 0.0500 time 0.2274 (0.2335) data time 0.0010 (0.0015) model time 0.2264 (0.2323) loss 3.1992 (3.2848) grad_norm 1.9231 (2.4117) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][940/1251] eta 0:01:12 lr 0.000638 wd 0.0500 time 0.2233 (0.2334) data time 0.0011 (0.0015) model time 0.2222 (0.2323) loss 3.2477 (3.2825) grad_norm 2.0982 (2.4138) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][950/1251] eta 0:01:10 lr 0.000638 wd 0.0500 time 0.2277 (0.2333) data time 0.0009 (0.0015) model time 0.2268 (0.2321) loss 2.9525 (3.2807) grad_norm 1.9811 (2.4123) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][960/1251] eta 0:01:07 lr 0.000638 wd 0.0500 time 0.2233 (0.2332) data time 0.0009 (0.0015) model time 0.2224 (0.2321) loss 3.0599 (3.2820) grad_norm 2.3039 (2.4078) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][970/1251] eta 0:01:05 lr 0.000638 wd 0.0500 time 0.2205 (0.2331) data time 0.0007 (0.0015) model time 0.2198 (0.2320) loss 3.3635 (3.2837) grad_norm 2.8112 (2.4047) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][980/1251] eta 0:01:03 lr 0.000638 wd 0.0500 time 0.2221 (0.2330) data time 0.0008 (0.0015) model time 0.2213 (0.2319) loss 2.7392 (3.2813) grad_norm 3.5397 (2.4065) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][990/1251] eta 0:01:00 lr 0.000638 wd 0.0500 time 0.2183 (0.2330) data time 0.0008 (0.0015) model time 0.2175 (0.2318) loss 3.0126 (3.2823) grad_norm 
3.6397 (2.4079) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1000/1251] eta 0:00:58 lr 0.000638 wd 0.0500 time 0.2232 (0.2329) data time 0.0008 (0.0015) model time 0.2225 (0.2317) loss 2.2522 (3.2782) grad_norm 2.5820 (2.4077) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1010/1251] eta 0:00:56 lr 0.000638 wd 0.0500 time 0.2273 (0.2328) data time 0.0009 (0.0015) model time 0.2264 (0.2317) loss 3.6677 (3.2755) grad_norm 1.7856 (2.4068) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1020/1251] eta 0:00:53 lr 0.000638 wd 0.0500 time 0.2280 (0.2328) data time 0.0006 (0.0015) model time 0.2274 (0.2316) loss 2.8474 (3.2749) grad_norm 2.2215 (2.4016) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1030/1251] eta 0:00:51 lr 0.000638 wd 0.0500 time 0.2259 (0.2327) data time 0.0008 (0.0015) model time 0.2251 (0.2316) loss 1.9184 (3.2723) grad_norm 1.9830 (2.4007) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1040/1251] eta 0:00:49 lr 0.000638 wd 0.0500 time 0.2195 (0.2327) data time 0.0009 (0.0015) model time 0.2186 (0.2315) loss 3.2856 (3.2728) grad_norm 1.3864 (2.3990) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1050/1251] eta 0:00:46 lr 0.000638 wd 0.0500 time 0.2242 (0.2326) data time 0.0008 (0.0015) model time 0.2235 (0.2315) loss 2.2146 (3.2702) grad_norm 2.2858 (2.3967) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1060/1251] eta 0:00:44 lr 0.000638 wd 
0.0500 time 0.2342 (0.2326) data time 0.0006 (0.0014) model time 0.2336 (0.2314) loss 4.0870 (3.2716) grad_norm 2.6311 (2.4168) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1070/1251] eta 0:00:42 lr 0.000637 wd 0.0500 time 0.2258 (0.2325) data time 0.0006 (0.0014) model time 0.2253 (0.2314) loss 3.9962 (3.2686) grad_norm 2.1969 (2.4212) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1080/1251] eta 0:00:39 lr 0.000637 wd 0.0500 time 0.2240 (0.2324) data time 0.0007 (0.0014) model time 0.2233 (0.2313) loss 2.1440 (3.2669) grad_norm 1.4679 (2.4187) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1090/1251] eta 0:00:37 lr 0.000637 wd 0.0500 time 0.2249 (0.2324) data time 0.0008 (0.0014) model time 0.2241 (0.2312) loss 3.8217 (3.2694) grad_norm 2.3543 (2.4162) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1100/1251] eta 0:00:35 lr 0.000637 wd 0.0500 time 0.2406 (0.2323) data time 0.0008 (0.0014) model time 0.2399 (0.2312) loss 3.5610 (3.2703) grad_norm 3.2612 (2.4192) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1110/1251] eta 0:00:32 lr 0.000637 wd 0.0500 time 0.2248 (0.2323) data time 0.0009 (0.0014) model time 0.2239 (0.2311) loss 3.3314 (3.2702) grad_norm 2.1120 (2.4212) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1120/1251] eta 0:00:30 lr 0.000637 wd 0.0500 time 0.2231 (0.2322) data time 0.0009 (0.0014) model time 0.2222 (0.2311) loss 2.8257 (3.2706) grad_norm 2.0080 (2.4200) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1130/1251] eta 0:00:28 lr 0.000637 wd 0.0500 time 0.2248 (0.2322) data time 0.0007 (0.0014) model time 0.2241 (0.2310) loss 4.0190 (3.2704) grad_norm 3.0550 (2.4189) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1140/1251] eta 0:00:25 lr 0.000637 wd 0.0500 time 0.2310 (0.2321) data time 0.0008 (0.0014) model time 0.2302 (0.2309) loss 3.0903 (3.2704) grad_norm 2.9572 (2.4217) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1150/1251] eta 0:00:23 lr 0.000637 wd 0.0500 time 0.2222 (0.2320) data time 0.0009 (0.0014) model time 0.2213 (0.2309) loss 3.3577 (3.2696) grad_norm 2.1740 (2.4219) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1160/1251] eta 0:00:21 lr 0.000637 wd 0.0500 time 0.2271 (0.2320) data time 0.0008 (0.0014) model time 0.2263 (0.2308) loss 3.5783 (3.2720) grad_norm 1.8685 (2.4187) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1170/1251] eta 0:00:18 lr 0.000637 wd 0.0500 time 0.2239 (0.2320) data time 0.0009 (0.0014) model time 0.2230 (0.2308) loss 3.6286 (3.2723) grad_norm 1.6047 (2.4151) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1180/1251] eta 0:00:16 lr 0.000637 wd 0.0500 time 0.2239 (0.2319) data time 0.0006 (0.0014) model time 0.2234 (0.2308) loss 3.1656 (3.2725) grad_norm 2.1038 (2.4159) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1190/1251] eta 0:00:14 lr 0.000637 wd 0.0500 time 0.2238 (0.2319) data time 0.0006 (0.0014) model time 0.2231 (0.2307) loss 
2.4662 (3.2729) grad_norm 2.4362 (2.4166) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1200/1251] eta 0:00:11 lr 0.000637 wd 0.0500 time 0.2283 (0.2320) data time 0.0006 (0.0014) model time 0.2278 (0.2309) loss 2.9985 (3.2732) grad_norm 1.8002 (2.4157) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1210/1251] eta 0:00:09 lr 0.000637 wd 0.0500 time 0.2280 (0.2320) data time 0.0008 (0.0014) model time 0.2272 (0.2308) loss 2.7682 (3.2731) grad_norm 2.8942 (2.4174) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1220/1251] eta 0:00:07 lr 0.000637 wd 0.0500 time 0.2319 (0.2319) data time 0.0008 (0.0014) model time 0.2311 (0.2308) loss 3.6729 (3.2765) grad_norm 2.2070 (2.4161) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1230/1251] eta 0:00:04 lr 0.000637 wd 0.0500 time 0.2190 (0.2319) data time 0.0009 (0.0014) model time 0.2181 (0.2307) loss 3.6939 (3.2761) grad_norm 2.2491 (2.4145) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1240/1251] eta 0:00:02 lr 0.000637 wd 0.0500 time 0.2194 (0.2318) data time 0.0004 (0.0014) model time 0.2190 (0.2306) loss 3.8076 (3.2772) grad_norm 2.5052 (2.4152) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [135/300][1250/1251] eta 0:00:00 lr 0.000637 wd 0.0500 time 0.2211 (0.2317) data time 0.0004 (0.0014) model time 0.2207 (0.2305) loss 3.6126 (3.2789) grad_norm 2.2846 (2.4159) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 135 training takes 0:04:49 
[2024-08-27 07:49:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 07:49:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.429 (0.429) Loss 0.4668 (0.4668) Acc@1 91.309 (91.309) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 07:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.104) Loss 0.7231 (0.7605) Acc@1 85.938 (83.736) Acc@5 96.680 (96.635) Mem 7381MB [2024-08-27 07:49:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.088) Loss 1.2012 (0.7886) Acc@1 71.289 (82.747) Acc@5 92.578 (96.540) Mem 7381MB [2024-08-27 07:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.083) Loss 1.3408 (0.8904) Acc@1 67.969 (80.415) Acc@5 91.406 (95.369) Mem 7381MB [2024-08-27 07:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.078) Loss 1.1885 (0.9534) Acc@1 71.777 (78.937) Acc@5 93.066 (94.619) Mem 7381MB [2024-08-27 07:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.580 Acc@5 94.608 [2024-08-27 07:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.6% [2024-08-27 07:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.817 (0.817) Loss 0.4160 (0.4160) Acc@1 92.383 (92.383) Acc@5 98.633 (98.633) Mem 7381MB [2024-08-27 07:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.143) Loss 0.6616 (0.6513) Acc@1 86.719 (85.991) Acc@5 96.973 (97.292) Mem 7381MB [2024-08-27 07:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.109) Loss 0.9180 (0.6753) Acc@1 78.320 (85.012) Acc@5 94.922 (97.284) Mem 7381MB 
[2024-08-27 07:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.096) Loss 1.1650 (0.7665) Acc@1 70.898 (82.772) Acc@5 92.285 (96.302) Mem 7381MB [2024-08-27 07:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.088) Loss 1.0645 (0.8149) Acc@1 73.242 (81.405) Acc@5 93.652 (95.767) Mem 7381MB [2024-08-27 07:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.992 Acc@5 95.724 [2024-08-27 07:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.0% [2024-08-27 07:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 80.99% [2024-08-27 07:49:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 07:49:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 07:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][0/1251] eta 0:13:28 lr 0.000637 wd 0.0500 time 0.6462 (0.6462) data time 0.4190 (0.4190) model time 0.0000 (0.0000) loss 2.8463 (2.8463) grad_norm 2.1579 (2.1579) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][10/1251] eta 0:05:27 lr 0.000637 wd 0.0500 time 0.2187 (0.2637) data time 0.0007 (0.0389) model time 0.0000 (0.0000) loss 3.6836 (3.2673) grad_norm 5.3495 (2.4925) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][20/1251] eta 0:05:03 lr 0.000637 wd 0.0500 time 0.2315 (0.2463) data time 0.0007 (0.0209) model time 0.0000 (0.0000) loss 3.5515 (3.3100) grad_norm 2.6364 (2.5348) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][30/1251] eta 
0:04:52 lr 0.000637 wd 0.0500 time 0.2257 (0.2396) data time 0.0006 (0.0145) model time 0.0000 (0.0000) loss 2.1165 (3.2531) grad_norm 3.3016 (2.6490) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][40/1251] eta 0:04:46 lr 0.000637 wd 0.0500 time 0.2271 (0.2366) data time 0.0008 (0.0112) model time 0.0000 (0.0000) loss 3.5841 (3.2354) grad_norm 3.2425 (2.6108) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][50/1251] eta 0:04:41 lr 0.000636 wd 0.0500 time 0.2352 (0.2342) data time 0.0006 (0.0092) model time 0.0000 (0.0000) loss 2.6412 (3.2328) grad_norm 1.9876 (2.5439) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][60/1251] eta 0:04:37 lr 0.000636 wd 0.0500 time 0.2244 (0.2330) data time 0.0007 (0.0078) model time 0.2237 (0.2258) loss 3.5725 (3.2121) grad_norm 2.2898 (2.5278) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][70/1251] eta 0:04:34 lr 0.000636 wd 0.0500 time 0.2278 (0.2320) data time 0.0006 (0.0069) model time 0.2272 (0.2255) loss 2.9481 (3.1891) grad_norm 1.6529 (2.7396) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][80/1251] eta 0:04:34 lr 0.000636 wd 0.0500 time 0.4380 (0.2342) data time 0.0009 (0.0061) model time 0.4371 (0.2334) loss 2.6601 (3.1796) grad_norm 2.2583 (2.6503) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][90/1251] eta 0:04:33 lr 0.000636 wd 0.0500 time 0.2195 (0.2356) data time 0.0008 (0.0056) model time 0.2187 (0.2365) loss 2.5925 (3.1847) grad_norm 2.1198 (2.5870) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 07:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][100/1251] eta 0:04:29 lr 0.000636 wd 0.0500 time 0.2221 (0.2344) data time 0.0008 (0.0051) model time 0.2213 (0.2338) loss 3.3179 (3.2100) grad_norm 2.0007 (2.5427) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][110/1251] eta 0:04:26 lr 0.000636 wd 0.0500 time 0.2294 (0.2336) data time 0.0009 (0.0047) model time 0.2285 (0.2323) loss 1.5983 (3.2199) grad_norm 2.1600 (2.5576) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][120/1251] eta 0:04:23 lr 0.000636 wd 0.0500 time 0.2212 (0.2328) data time 0.0007 (0.0044) model time 0.2205 (0.2309) loss 2.4934 (3.2166) grad_norm 2.4953 (2.5477) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][130/1251] eta 0:04:20 lr 0.000636 wd 0.0500 time 0.2211 (0.2322) data time 0.0010 (0.0041) model time 0.2201 (0.2300) loss 3.7693 (3.2215) grad_norm 1.9151 (2.5225) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][140/1251] eta 0:04:17 lr 0.000636 wd 0.0500 time 0.2331 (0.2317) data time 0.0010 (0.0039) model time 0.2321 (0.2293) loss 3.6201 (3.2140) grad_norm 2.7348 (2.5117) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][150/1251] eta 0:04:14 lr 0.000636 wd 0.0500 time 0.2220 (0.2311) data time 0.0007 (0.0037) model time 0.2213 (0.2286) loss 3.0024 (3.2243) grad_norm 2.7217 (2.5220) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][160/1251] eta 0:04:11 lr 0.000636 wd 0.0500 time 0.2189 (0.2308) data time 0.0008 (0.0036) model time 0.2182 
(0.2282) loss 3.3240 (3.2385) grad_norm 2.9584 (2.5089) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][170/1251] eta 0:04:09 lr 0.000636 wd 0.0500 time 0.2196 (0.2306) data time 0.0009 (0.0034) model time 0.2188 (0.2281) loss 3.8997 (3.2431) grad_norm 2.1921 (2.4847) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][180/1251] eta 0:04:06 lr 0.000636 wd 0.0500 time 0.2342 (0.2304) data time 0.0008 (0.0033) model time 0.2334 (0.2279) loss 2.7910 (3.2481) grad_norm 2.2033 (2.5372) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][190/1251] eta 0:04:04 lr 0.000636 wd 0.0500 time 0.2203 (0.2301) data time 0.0009 (0.0031) model time 0.2194 (0.2277) loss 2.4582 (3.2501) grad_norm 2.1557 (2.5180) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][200/1251] eta 0:04:01 lr 0.000636 wd 0.0500 time 0.2168 (0.2299) data time 0.0009 (0.0030) model time 0.2159 (0.2275) loss 2.4090 (3.2428) grad_norm 1.5649 (2.5011) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][210/1251] eta 0:03:59 lr 0.000636 wd 0.0500 time 0.2263 (0.2297) data time 0.0006 (0.0029) model time 0.2257 (0.2274) loss 3.0196 (3.2452) grad_norm 2.2063 (2.4951) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][220/1251] eta 0:03:56 lr 0.000636 wd 0.0500 time 0.2222 (0.2295) data time 0.0008 (0.0028) model time 0.2215 (0.2272) loss 3.6254 (3.2454) grad_norm 2.2330 (2.4857) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[136/300][230/1251] eta 0:03:54 lr 0.000636 wd 0.0500 time 0.2194 (0.2293) data time 0.0007 (0.0028) model time 0.2187 (0.2270) loss 3.5475 (3.2378) grad_norm 2.5901 (2.4983) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][240/1251] eta 0:03:51 lr 0.000636 wd 0.0500 time 0.2153 (0.2291) data time 0.0009 (0.0027) model time 0.2144 (0.2267) loss 2.9994 (3.2529) grad_norm 1.7071 (2.4963) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][250/1251] eta 0:03:49 lr 0.000636 wd 0.0500 time 0.2238 (0.2290) data time 0.0008 (0.0026) model time 0.2230 (0.2267) loss 2.9209 (3.2360) grad_norm 2.6430 (2.5102) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][260/1251] eta 0:03:46 lr 0.000636 wd 0.0500 time 0.2225 (0.2289) data time 0.0006 (0.0026) model time 0.2218 (0.2267) loss 3.6589 (3.2436) grad_norm 1.9387 (2.5275) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][270/1251] eta 0:03:44 lr 0.000636 wd 0.0500 time 0.2190 (0.2287) data time 0.0006 (0.0025) model time 0.2183 (0.2265) loss 3.6552 (3.2503) grad_norm 2.9306 (2.5346) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][280/1251] eta 0:03:41 lr 0.000635 wd 0.0500 time 0.2309 (0.2286) data time 0.0006 (0.0024) model time 0.2303 (0.2264) loss 4.1709 (3.2583) grad_norm 3.0343 (2.5300) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][290/1251] eta 0:03:39 lr 0.000635 wd 0.0500 time 0.2293 (0.2284) data time 0.0009 (0.0024) model time 0.2284 (0.2263) loss 2.5400 (3.2654) grad_norm 1.5939 (2.5136) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 07:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][300/1251] eta 0:03:37 lr 0.000635 wd 0.0500 time 0.2231 (0.2283) data time 0.0008 (0.0023) model time 0.2224 (0.2262) loss 3.7636 (3.2614) grad_norm 2.4754 (2.5015) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][310/1251] eta 0:03:34 lr 0.000635 wd 0.0500 time 0.2255 (0.2283) data time 0.0009 (0.0023) model time 0.2246 (0.2262) loss 3.5926 (3.2610) grad_norm 3.0337 (2.4940) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][320/1251] eta 0:03:32 lr 0.000635 wd 0.0500 time 0.2290 (0.2282) data time 0.0008 (0.0023) model time 0.2282 (0.2261) loss 3.3932 (3.2654) grad_norm 2.4296 (2.5044) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][330/1251] eta 0:03:30 lr 0.000635 wd 0.0500 time 0.2223 (0.2281) data time 0.0009 (0.0022) model time 0.2214 (0.2260) loss 3.6846 (3.2630) grad_norm 2.0022 (2.5041) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][340/1251] eta 0:03:27 lr 0.000635 wd 0.0500 time 0.2266 (0.2279) data time 0.0008 (0.0022) model time 0.2258 (0.2259) loss 3.5943 (3.2588) grad_norm 2.2429 (2.5090) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][350/1251] eta 0:03:25 lr 0.000635 wd 0.0500 time 0.2236 (0.2279) data time 0.0009 (0.0021) model time 0.2227 (0.2258) loss 3.6527 (3.2629) grad_norm 1.8797 (2.5104) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][360/1251] eta 0:03:22 lr 0.000635 wd 0.0500 time 0.2239 (0.2278) data time 0.0010 
(0.0021) model time 0.2229 (0.2258) loss 3.1541 (3.2674) grad_norm 2.0289 (2.5072) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][370/1251] eta 0:03:20 lr 0.000635 wd 0.0500 time 0.2255 (0.2277) data time 0.0006 (0.0021) model time 0.2249 (0.2257) loss 2.6830 (3.2677) grad_norm 2.3995 (2.4985) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][380/1251] eta 0:03:18 lr 0.000635 wd 0.0500 time 0.2195 (0.2275) data time 0.0009 (0.0020) model time 0.2185 (0.2256) loss 3.5245 (3.2734) grad_norm 2.2784 (2.4937) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][390/1251] eta 0:03:15 lr 0.000635 wd 0.0500 time 0.2275 (0.2275) data time 0.0005 (0.0020) model time 0.2270 (0.2255) loss 3.8847 (3.2804) grad_norm 1.8315 (2.4967) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][400/1251] eta 0:03:13 lr 0.000635 wd 0.0500 time 0.2235 (0.2279) data time 0.0009 (0.0020) model time 0.2226 (0.2260) loss 3.4769 (3.2853) grad_norm 2.9277 (2.5045) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][410/1251] eta 0:03:11 lr 0.000635 wd 0.0500 time 0.2178 (0.2278) data time 0.0008 (0.0020) model time 0.2170 (0.2259) loss 3.9586 (3.2882) grad_norm 2.8431 (2.5094) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][420/1251] eta 0:03:09 lr 0.000635 wd 0.0500 time 0.2254 (0.2277) data time 0.0008 (0.0019) model time 0.2246 (0.2259) loss 3.0844 (3.2873) grad_norm 2.1115 (2.5082) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [136/300][430/1251] eta 0:03:06 lr 0.000635 wd 0.0500 time 0.2225 (0.2276) data time 0.0007 (0.0019) model time 0.2218 (0.2258) loss 3.7463 (3.2868) grad_norm 4.1924 (nan) loss_scale 512.0000 (1013.3086) mem 7381MB [2024-08-27 07:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][440/1251] eta 0:03:04 lr 0.000635 wd 0.0500 time 0.2295 (0.2276) data time 0.0006 (0.0019) model time 0.2289 (0.2258) loss 3.1333 (3.2824) grad_norm 3.5160 (nan) loss_scale 512.0000 (1001.9410) mem 7381MB [2024-08-27 07:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][450/1251] eta 0:03:02 lr 0.000635 wd 0.0500 time 0.2256 (0.2275) data time 0.0008 (0.0019) model time 0.2248 (0.2257) loss 3.2519 (3.2820) grad_norm 2.3466 (nan) loss_scale 512.0000 (991.0776) mem 7381MB [2024-08-27 07:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][460/1251] eta 0:02:59 lr 0.000635 wd 0.0500 time 0.2221 (0.2274) data time 0.0007 (0.0018) model time 0.2214 (0.2257) loss 2.8889 (3.2833) grad_norm 2.8885 (nan) loss_scale 512.0000 (980.6855) mem 7381MB [2024-08-27 07:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][470/1251] eta 0:02:57 lr 0.000635 wd 0.0500 time 0.2306 (0.2274) data time 0.0009 (0.0018) model time 0.2296 (0.2256) loss 3.6548 (3.2858) grad_norm 2.3325 (nan) loss_scale 512.0000 (970.7346) mem 7381MB [2024-08-27 07:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][480/1251] eta 0:02:55 lr 0.000635 wd 0.0500 time 0.2205 (0.2273) data time 0.0007 (0.0018) model time 0.2198 (0.2256) loss 3.8711 (3.2920) grad_norm 4.1701 (nan) loss_scale 512.0000 (961.1975) mem 7381MB [2024-08-27 07:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][490/1251] eta 0:02:52 lr 0.000635 wd 0.0500 time 0.2253 (0.2273) data time 0.0010 (0.0018) model time 0.2243 (0.2255) loss 3.0217 (3.2872) grad_norm 2.4148 (nan) loss_scale 512.0000 (952.0489) mem 7381MB 
[2024-08-27 07:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][500/1251] eta 0:02:50 lr 0.000635 wd 0.0500 time 0.2350 (0.2272) data time 0.0008 (0.0018) model time 0.2341 (0.2255) loss 3.8068 (3.2899) grad_norm 2.1158 (nan) loss_scale 512.0000 (943.2655) mem 7381MB [2024-08-27 07:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][510/1251] eta 0:02:48 lr 0.000635 wd 0.0500 time 0.2164 (0.2272) data time 0.0007 (0.0018) model time 0.2157 (0.2255) loss 4.1926 (3.2958) grad_norm 2.3266 (nan) loss_scale 512.0000 (934.8258) mem 7381MB [2024-08-27 07:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][520/1251] eta 0:02:46 lr 0.000634 wd 0.0500 time 0.2232 (0.2272) data time 0.0011 (0.0017) model time 0.2221 (0.2255) loss 3.3175 (3.2974) grad_norm 2.3596 (nan) loss_scale 512.0000 (926.7102) mem 7381MB [2024-08-27 07:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][530/1251] eta 0:02:43 lr 0.000634 wd 0.0500 time 0.2259 (0.2271) data time 0.0006 (0.0017) model time 0.2253 (0.2255) loss 3.7289 (3.2995) grad_norm 2.4828 (nan) loss_scale 512.0000 (918.9002) mem 7381MB [2024-08-27 07:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][540/1251] eta 0:02:41 lr 0.000634 wd 0.0500 time 0.2214 (0.2272) data time 0.0009 (0.0017) model time 0.2205 (0.2255) loss 3.2575 (3.3042) grad_norm 2.4186 (nan) loss_scale 512.0000 (911.3789) mem 7381MB [2024-08-27 07:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][550/1251] eta 0:02:39 lr 0.000634 wd 0.0500 time 0.2170 (0.2272) data time 0.0009 (0.0017) model time 0.2161 (0.2255) loss 3.4489 (3.3076) grad_norm 2.3977 (nan) loss_scale 512.0000 (904.1307) mem 7381MB [2024-08-27 07:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][560/1251] eta 0:02:36 lr 0.000634 wd 0.0500 time 0.2283 (0.2272) data time 0.0008 (0.0017) model time 0.2275 (0.2255) loss 3.6128 (3.3100) 
grad_norm 2.7552 (nan) loss_scale 512.0000 (897.1408) mem 7381MB [2024-08-27 07:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][570/1251] eta 0:02:34 lr 0.000634 wd 0.0500 time 0.2241 (0.2271) data time 0.0008 (0.0017) model time 0.2233 (0.2255) loss 2.9575 (3.3039) grad_norm 1.8704 (nan) loss_scale 512.0000 (890.3958) mem 7381MB [2024-08-27 07:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][580/1251] eta 0:02:32 lr 0.000634 wd 0.0500 time 0.2219 (0.2271) data time 0.0009 (0.0017) model time 0.2210 (0.2255) loss 3.0232 (3.3013) grad_norm 2.9958 (nan) loss_scale 512.0000 (883.8830) mem 7381MB [2024-08-27 07:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][590/1251] eta 0:02:30 lr 0.000634 wd 0.0500 time 0.2210 (0.2271) data time 0.0009 (0.0016) model time 0.2200 (0.2255) loss 2.3545 (3.2976) grad_norm 4.2960 (nan) loss_scale 512.0000 (877.5905) mem 7381MB [2024-08-27 07:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][600/1251] eta 0:02:27 lr 0.000634 wd 0.0500 time 0.2240 (0.2271) data time 0.0007 (0.0016) model time 0.2233 (0.2255) loss 2.4269 (3.3002) grad_norm 2.1663 (nan) loss_scale 512.0000 (871.5075) mem 7381MB [2024-08-27 07:51:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][610/1251] eta 0:02:25 lr 0.000634 wd 0.0500 time 0.2195 (0.2270) data time 0.0008 (0.0016) model time 0.2187 (0.2254) loss 3.0120 (3.3017) grad_norm 1.9681 (nan) loss_scale 512.0000 (865.6236) mem 7381MB [2024-08-27 07:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][620/1251] eta 0:02:23 lr 0.000634 wd 0.0500 time 0.2216 (0.2269) data time 0.0008 (0.0016) model time 0.2207 (0.2254) loss 2.5081 (3.2980) grad_norm 1.8417 (nan) loss_scale 512.0000 (859.9291) mem 7381MB [2024-08-27 07:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][630/1251] eta 0:02:20 lr 0.000634 wd 0.0500 time 0.2298 (0.2269) data 
time 0.0009 (0.0016) model time 0.2289 (0.2253) loss 2.8464 (3.2979) grad_norm 2.4062 (nan) loss_scale 512.0000 (854.4152) mem 7381MB [2024-08-27 07:52:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][640/1251] eta 0:02:18 lr 0.000634 wd 0.0500 time 0.2212 (0.2269) data time 0.0009 (0.0016) model time 0.2203 (0.2253) loss 3.3435 (3.2979) grad_norm 1.6453 (nan) loss_scale 512.0000 (849.0733) mem 7381MB [2024-08-27 07:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][650/1251] eta 0:02:16 lr 0.000634 wd 0.0500 time 0.2259 (0.2269) data time 0.0009 (0.0016) model time 0.2249 (0.2253) loss 3.0492 (3.2985) grad_norm 2.1474 (nan) loss_scale 512.0000 (843.8955) mem 7381MB [2024-08-27 07:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][660/1251] eta 0:02:14 lr 0.000634 wd 0.0500 time 0.2230 (0.2269) data time 0.0009 (0.0016) model time 0.2221 (0.2253) loss 3.6793 (3.2987) grad_norm 2.0597 (nan) loss_scale 512.0000 (838.8744) mem 7381MB [2024-08-27 07:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][670/1251] eta 0:02:11 lr 0.000634 wd 0.0500 time 0.2193 (0.2269) data time 0.0008 (0.0016) model time 0.2186 (0.2253) loss 3.2302 (3.2977) grad_norm 2.3712 (nan) loss_scale 512.0000 (834.0030) mem 7381MB [2024-08-27 07:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][680/1251] eta 0:02:09 lr 0.000634 wd 0.0500 time 0.2233 (0.2268) data time 0.0008 (0.0016) model time 0.2225 (0.2253) loss 3.2168 (3.2943) grad_norm 2.2794 (nan) loss_scale 512.0000 (829.2746) mem 7381MB [2024-08-27 07:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][690/1251] eta 0:02:07 lr 0.000634 wd 0.0500 time 0.2291 (0.2268) data time 0.0007 (0.0016) model time 0.2285 (0.2253) loss 2.2223 (3.2879) grad_norm 3.4462 (nan) loss_scale 512.0000 (824.6831) mem 7381MB [2024-08-27 07:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[136/300][700/1251] eta 0:02:04 lr 0.000634 wd 0.0500 time 0.2220 (0.2268) data time 0.0007 (0.0015) model time 0.2213 (0.2253) loss 3.5696 (3.2829) grad_norm 1.9245 (nan) loss_scale 512.0000 (820.2225) mem 7381MB [2024-08-27 07:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][710/1251] eta 0:02:02 lr 0.000634 wd 0.0500 time 0.2269 (0.2268) data time 0.0008 (0.0015) model time 0.2260 (0.2253) loss 2.3184 (3.2819) grad_norm 1.5245 (nan) loss_scale 512.0000 (815.8875) mem 7381MB [2024-08-27 07:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][720/1251] eta 0:02:00 lr 0.000634 wd 0.0500 time 0.2199 (0.2268) data time 0.0008 (0.0015) model time 0.2191 (0.2252) loss 3.4308 (3.2799) grad_norm 1.9828 (nan) loss_scale 512.0000 (811.6727) mem 7381MB [2024-08-27 07:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][730/1251] eta 0:01:58 lr 0.000634 wd 0.0500 time 0.2274 (0.2267) data time 0.0008 (0.0015) model time 0.2266 (0.2252) loss 3.3159 (3.2832) grad_norm 2.1247 (nan) loss_scale 512.0000 (807.5732) mem 7381MB [2024-08-27 07:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][740/1251] eta 0:01:55 lr 0.000634 wd 0.0500 time 0.2186 (0.2267) data time 0.0007 (0.0015) model time 0.2179 (0.2252) loss 2.0997 (3.2806) grad_norm 1.9591 (nan) loss_scale 512.0000 (803.5843) mem 7381MB [2024-08-27 07:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][750/1251] eta 0:01:53 lr 0.000633 wd 0.0500 time 0.2223 (0.2267) data time 0.0006 (0.0015) model time 0.2217 (0.2252) loss 3.0179 (3.2813) grad_norm 2.0173 (nan) loss_scale 512.0000 (799.7017) mem 7381MB [2024-08-27 07:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][760/1251] eta 0:01:51 lr 0.000633 wd 0.0500 time 0.2217 (0.2266) data time 0.0007 (0.0015) model time 0.2210 (0.2252) loss 3.8123 (3.2823) grad_norm 3.1917 (nan) loss_scale 512.0000 (795.9212) mem 7381MB [2024-08-27 
07:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][770/1251] eta 0:01:49 lr 0.000633 wd 0.0500 time 0.2260 (0.2267) data time 0.0008 (0.0015) model time 0.2252 (0.2252) loss 3.5462 (3.2861) grad_norm 3.2411 (nan) loss_scale 512.0000 (792.2387) mem 7381MB [2024-08-27 07:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][780/1251] eta 0:01:46 lr 0.000633 wd 0.0500 time 0.2213 (0.2267) data time 0.0008 (0.0015) model time 0.2205 (0.2252) loss 2.6225 (3.2851) grad_norm 1.9813 (nan) loss_scale 512.0000 (788.6504) mem 7381MB [2024-08-27 07:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][790/1251] eta 0:01:44 lr 0.000633 wd 0.0500 time 0.2199 (0.2266) data time 0.0008 (0.0015) model time 0.2192 (0.2252) loss 3.0369 (3.2856) grad_norm 1.5422 (nan) loss_scale 512.0000 (785.1530) mem 7381MB [2024-08-27 07:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][800/1251] eta 0:01:42 lr 0.000633 wd 0.0500 time 0.2207 (0.2266) data time 0.0006 (0.0015) model time 0.2201 (0.2251) loss 3.7311 (3.2849) grad_norm 3.0889 (nan) loss_scale 512.0000 (781.7428) mem 7381MB [2024-08-27 07:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][810/1251] eta 0:01:39 lr 0.000633 wd 0.0500 time 0.2247 (0.2266) data time 0.0008 (0.0015) model time 0.2239 (0.2251) loss 3.0499 (3.2877) grad_norm 1.8692 (nan) loss_scale 512.0000 (778.4168) mem 7381MB [2024-08-27 07:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][820/1251] eta 0:01:37 lr 0.000633 wd 0.0500 time 0.2241 (0.2266) data time 0.0006 (0.0015) model time 0.2234 (0.2251) loss 3.3093 (3.2853) grad_norm 6.8374 (nan) loss_scale 512.0000 (775.1717) mem 7381MB [2024-08-27 07:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][830/1251] eta 0:01:35 lr 0.000633 wd 0.0500 time 0.2217 (0.2265) data time 0.0006 (0.0014) model time 0.2210 (0.2251) loss 2.4540 (3.2856) grad_norm 
1.5977 (nan) loss_scale 512.0000 (772.0048) mem 7381MB [2024-08-27 07:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][840/1251] eta 0:01:33 lr 0.000633 wd 0.0500 time 0.2223 (0.2265) data time 0.0008 (0.0014) model time 0.2215 (0.2251) loss 3.1124 (3.2855) grad_norm 1.7657 (nan) loss_scale 512.0000 (768.9132) mem 7381MB [2024-08-27 07:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][850/1251] eta 0:01:30 lr 0.000633 wd 0.0500 time 0.2279 (0.2265) data time 0.0009 (0.0014) model time 0.2270 (0.2250) loss 3.7366 (3.2879) grad_norm 2.1818 (nan) loss_scale 512.0000 (765.8942) mem 7381MB [2024-08-27 07:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][860/1251] eta 0:01:28 lr 0.000633 wd 0.0500 time 0.2331 (0.2264) data time 0.0006 (0.0014) model time 0.2325 (0.2250) loss 3.7217 (3.2858) grad_norm 1.9886 (nan) loss_scale 512.0000 (762.9454) mem 7381MB [2024-08-27 07:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][870/1251] eta 0:01:26 lr 0.000633 wd 0.0500 time 0.2189 (0.2264) data time 0.0009 (0.0014) model time 0.2180 (0.2250) loss 3.6371 (3.2876) grad_norm 2.0955 (nan) loss_scale 512.0000 (760.0643) mem 7381MB [2024-08-27 07:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][880/1251] eta 0:01:24 lr 0.000633 wd 0.0500 time 0.2281 (0.2264) data time 0.0007 (0.0014) model time 0.2274 (0.2250) loss 3.6276 (3.2849) grad_norm 2.2719 (nan) loss_scale 512.0000 (757.2486) mem 7381MB [2024-08-27 07:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][890/1251] eta 0:01:21 lr 0.000633 wd 0.0500 time 0.2241 (0.2264) data time 0.0008 (0.0014) model time 0.2233 (0.2250) loss 2.7945 (3.2852) grad_norm 2.0513 (nan) loss_scale 512.0000 (754.4961) mem 7381MB [2024-08-27 07:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][900/1251] eta 0:01:19 lr 0.000633 wd 0.0500 time 0.2226 (0.2264) data time 0.0009 
(0.0014) model time 0.2217 (0.2250) loss 4.1242 (3.2839) grad_norm 4.1128 (nan) loss_scale 512.0000 (751.8047) mem 7381MB [2024-08-27 07:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][910/1251] eta 0:01:17 lr 0.000633 wd 0.0500 time 0.2253 (0.2264) data time 0.0008 (0.0014) model time 0.2245 (0.2250) loss 3.8293 (3.2827) grad_norm 2.4355 (nan) loss_scale 512.0000 (749.1723) mem 7381MB [2024-08-27 07:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][920/1251] eta 0:01:14 lr 0.000633 wd 0.0500 time 0.2235 (0.2265) data time 0.0008 (0.0014) model time 0.2228 (0.2251) loss 2.3555 (3.2793) grad_norm 1.9693 (nan) loss_scale 512.0000 (746.5972) mem 7381MB [2024-08-27 07:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][930/1251] eta 0:01:12 lr 0.000633 wd 0.0500 time 0.2257 (0.2265) data time 0.0008 (0.0014) model time 0.2249 (0.2251) loss 2.3462 (3.2791) grad_norm 2.7036 (nan) loss_scale 512.0000 (744.0773) mem 7381MB [2024-08-27 07:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][940/1251] eta 0:01:10 lr 0.000633 wd 0.0500 time 0.2220 (0.2265) data time 0.0008 (0.0014) model time 0.2212 (0.2251) loss 2.8623 (3.2786) grad_norm 2.2668 (nan) loss_scale 512.0000 (741.6111) mem 7381MB [2024-08-27 07:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][950/1251] eta 0:01:08 lr 0.000633 wd 0.0500 time 0.2278 (0.2264) data time 0.0009 (0.0014) model time 0.2270 (0.2251) loss 3.0827 (3.2789) grad_norm 2.4682 (nan) loss_scale 512.0000 (739.1966) mem 7381MB [2024-08-27 07:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][960/1251] eta 0:01:05 lr 0.000633 wd 0.0500 time 0.2231 (0.2264) data time 0.0006 (0.0014) model time 0.2225 (0.2250) loss 2.1137 (3.2785) grad_norm 3.7805 (nan) loss_scale 512.0000 (736.8325) mem 7381MB [2024-08-27 07:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][970/1251] eta 
0:01:03 lr 0.000633 wd 0.0500 time 0.2215 (0.2264) data time 0.0009 (0.0014) model time 0.2206 (0.2250) loss 2.5861 (3.2780) grad_norm 2.4116 (nan) loss_scale 512.0000 (734.5170) mem 7381MB [2024-08-27 07:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][980/1251] eta 0:01:01 lr 0.000632 wd 0.0500 time 0.2247 (0.2264) data time 0.0010 (0.0014) model time 0.2237 (0.2250) loss 3.4347 (3.2784) grad_norm 1.6692 (nan) loss_scale 512.0000 (732.2487) mem 7381MB [2024-08-27 07:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][990/1251] eta 0:00:59 lr 0.000632 wd 0.0500 time 0.2264 (0.2264) data time 0.0008 (0.0014) model time 0.2256 (0.2250) loss 3.8368 (3.2770) grad_norm 2.3640 (nan) loss_scale 512.0000 (730.0262) mem 7381MB [2024-08-27 07:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1000/1251] eta 0:00:56 lr 0.000632 wd 0.0500 time 0.2239 (0.2264) data time 0.0010 (0.0014) model time 0.2229 (0.2251) loss 3.9253 (3.2773) grad_norm 4.7069 (nan) loss_scale 512.0000 (727.8482) mem 7381MB [2024-08-27 07:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1010/1251] eta 0:00:54 lr 0.000632 wd 0.0500 time 0.4368 (0.2266) data time 0.0008 (0.0014) model time 0.4360 (0.2253) loss 3.0162 (3.2763) grad_norm 1.9975 (nan) loss_scale 512.0000 (725.7132) mem 7381MB [2024-08-27 07:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1020/1251] eta 0:00:52 lr 0.000632 wd 0.0500 time 0.2240 (0.2268) data time 0.0008 (0.0013) model time 0.2233 (0.2255) loss 3.3442 (3.2756) grad_norm 2.5168 (nan) loss_scale 512.0000 (723.6200) mem 7381MB [2024-08-27 07:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1030/1251] eta 0:00:50 lr 0.000632 wd 0.0500 time 0.2264 (0.2268) data time 0.0009 (0.0013) model time 0.2254 (0.2255) loss 3.8817 (3.2750) grad_norm 2.7267 (nan) loss_scale 512.0000 (721.5674) mem 7381MB [2024-08-27 07:53:34 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1040/1251] eta 0:00:47 lr 0.000632 wd 0.0500 time 0.2260 (0.2268) data time 0.0007 (0.0013) model time 0.2253 (0.2255) loss 2.6917 (3.2761) grad_norm 3.1669 (nan) loss_scale 512.0000 (719.5543) mem 7381MB [2024-08-27 07:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1050/1251] eta 0:00:45 lr 0.000632 wd 0.0500 time 0.2210 (0.2268) data time 0.0008 (0.0013) model time 0.2202 (0.2254) loss 3.7845 (3.2777) grad_norm 2.1232 (nan) loss_scale 512.0000 (717.5794) mem 7381MB [2024-08-27 07:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1060/1251] eta 0:00:43 lr 0.000632 wd 0.0500 time 0.2299 (0.2267) data time 0.0008 (0.0013) model time 0.2290 (0.2254) loss 3.0485 (3.2775) grad_norm 3.1958 (nan) loss_scale 512.0000 (715.6418) mem 7381MB [2024-08-27 07:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1070/1251] eta 0:00:41 lr 0.000632 wd 0.0500 time 0.2230 (0.2267) data time 0.0009 (0.0013) model time 0.2221 (0.2254) loss 3.0433 (3.2819) grad_norm 2.0650 (nan) loss_scale 512.0000 (713.7404) mem 7381MB [2024-08-27 07:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1080/1251] eta 0:00:38 lr 0.000632 wd 0.0500 time 0.2246 (0.2267) data time 0.0007 (0.0013) model time 0.2240 (0.2254) loss 2.9724 (3.2828) grad_norm 2.1491 (nan) loss_scale 512.0000 (711.8742) mem 7381MB [2024-08-27 07:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1090/1251] eta 0:00:36 lr 0.000632 wd 0.0500 time 0.2276 (0.2267) data time 0.0006 (0.0013) model time 0.2269 (0.2254) loss 3.7437 (3.2843) grad_norm 2.4705 (nan) loss_scale 512.0000 (710.0422) mem 7381MB [2024-08-27 07:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1100/1251] eta 0:00:34 lr 0.000632 wd 0.0500 time 0.2223 (0.2267) data time 0.0006 (0.0013) model time 0.2216 (0.2254) loss 4.0178 (3.2853) grad_norm 
2.2871 (nan) loss_scale 512.0000 (708.2434) mem 7381MB [2024-08-27 07:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1110/1251] eta 0:00:31 lr 0.000632 wd 0.0500 time 0.2235 (0.2267) data time 0.0009 (0.0013) model time 0.2226 (0.2254) loss 3.2062 (3.2841) grad_norm 2.3634 (nan) loss_scale 512.0000 (706.4770) mem 7381MB [2024-08-27 07:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1120/1251] eta 0:00:29 lr 0.000632 wd 0.0500 time 0.2301 (0.2267) data time 0.0007 (0.0013) model time 0.2294 (0.2254) loss 2.8457 (3.2835) grad_norm 3.3835 (nan) loss_scale 512.0000 (704.7422) mem 7381MB [2024-08-27 07:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1130/1251] eta 0:00:27 lr 0.000632 wd 0.0500 time 0.2344 (0.2267) data time 0.0009 (0.0013) model time 0.2335 (0.2254) loss 2.5335 (3.2855) grad_norm 2.2317 (nan) loss_scale 512.0000 (703.0380) mem 7381MB [2024-08-27 07:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1140/1251] eta 0:00:25 lr 0.000632 wd 0.0500 time 0.2268 (0.2267) data time 0.0008 (0.0013) model time 0.2260 (0.2254) loss 3.7060 (3.2872) grad_norm 2.7237 (nan) loss_scale 512.0000 (701.3637) mem 7381MB [2024-08-27 07:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1150/1251] eta 0:00:22 lr 0.000632 wd 0.0500 time 0.2326 (0.2267) data time 0.0009 (0.0013) model time 0.2316 (0.2254) loss 3.2248 (3.2891) grad_norm 2.6059 (nan) loss_scale 512.0000 (699.7185) mem 7381MB [2024-08-27 07:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1160/1251] eta 0:00:20 lr 0.000632 wd 0.0500 time 0.2229 (0.2267) data time 0.0008 (0.0013) model time 0.2221 (0.2254) loss 3.6031 (3.2869) grad_norm 1.7283 (nan) loss_scale 512.0000 (698.1016) mem 7381MB [2024-08-27 07:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1170/1251] eta 0:00:18 lr 0.000632 wd 0.0500 time 0.2347 (0.2267) data time 
0.0009 (0.0013) model time 0.2338 (0.2254) loss 3.1015 (3.2846) grad_norm 3.1042 (nan) loss_scale 512.0000 (696.5124) mem 7381MB [2024-08-27 07:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1180/1251] eta 0:00:16 lr 0.000632 wd 0.0500 time 0.2308 (0.2266) data time 0.0006 (0.0013) model time 0.2302 (0.2254) loss 3.7355 (3.2846) grad_norm 2.1304 (nan) loss_scale 512.0000 (694.9500) mem 7381MB [2024-08-27 07:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1190/1251] eta 0:00:13 lr 0.000632 wd 0.0500 time 0.2318 (0.2266) data time 0.0010 (0.0013) model time 0.2309 (0.2254) loss 3.3138 (3.2854) grad_norm 2.7636 (nan) loss_scale 512.0000 (693.4139) mem 7381MB [2024-08-27 07:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1200/1251] eta 0:00:11 lr 0.000632 wd 0.0500 time 0.2253 (0.2266) data time 0.0009 (0.0013) model time 0.2244 (0.2254) loss 4.1383 (3.2869) grad_norm 3.5739 (nan) loss_scale 512.0000 (691.9034) mem 7381MB [2024-08-27 07:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1210/1251] eta 0:00:09 lr 0.000632 wd 0.0500 time 0.2283 (0.2266) data time 0.0007 (0.0013) model time 0.2276 (0.2254) loss 4.4345 (3.2890) grad_norm 2.1232 (nan) loss_scale 512.0000 (690.4178) mem 7381MB [2024-08-27 07:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1220/1251] eta 0:00:07 lr 0.000631 wd 0.0500 time 0.2227 (0.2266) data time 0.0006 (0.0013) model time 0.2221 (0.2253) loss 2.6687 (3.2889) grad_norm 1.5803 (nan) loss_scale 512.0000 (688.9566) mem 7381MB [2024-08-27 07:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1230/1251] eta 0:00:04 lr 0.000631 wd 0.0500 time 0.2279 (0.2266) data time 0.0007 (0.0013) model time 0.2272 (0.2253) loss 2.3480 (3.2880) grad_norm 2.6156 (nan) loss_scale 512.0000 (687.5191) mem 7381MB [2024-08-27 07:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[136/300][1240/1251] eta 0:00:02 lr 0.000631 wd 0.0500 time 0.2140 (0.2265) data time 0.0003 (0.0013) model time 0.2136 (0.2253) loss 2.4572 (3.2879) grad_norm 2.9378 (nan) loss_scale 512.0000 (686.1048) mem 7381MB [2024-08-27 07:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [136/300][1250/1251] eta 0:00:00 lr 0.000631 wd 0.0500 time 0.2157 (0.2264) data time 0.0005 (0.0013) model time 0.2152 (0.2252) loss 3.5741 (3.2887) grad_norm 2.0152 (nan) loss_scale 512.0000 (684.7130) mem 7381MB [2024-08-27 07:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 136 training takes 0:04:43 [2024-08-27 07:54:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 07:54:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.457 (0.457) Loss 0.4929 (0.4929) Acc@1 91.211 (91.211) Acc@5 98.047 (98.047) Mem 7381MB [2024-08-27 07:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.106) Loss 0.7729 (0.7650) Acc@1 83.594 (83.576) Acc@5 96.289 (96.520) Mem 7381MB [2024-08-27 07:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.088) Loss 1.0889 (0.7839) Acc@1 75.000 (82.678) Acc@5 93.457 (96.633) Mem 7381MB [2024-08-27 07:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.066 (0.082) Loss 1.3350 (0.8896) Acc@1 67.871 (80.236) Acc@5 90.039 (95.391) Mem 7381MB [2024-08-27 07:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.077) Loss 1.2266 (0.9470) Acc@1 72.852 (78.770) Acc@5 91.895 (94.696) Mem 7381MB [2024-08-27 07:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.422 Acc@5 94.664 [2024-08-27 07:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): 
INFO Accuracy of the network on the 50000 test images: 78.4% [2024-08-27 07:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.838 (0.838) Loss 0.4155 (0.4155) Acc@1 92.383 (92.383) Acc@5 98.730 (98.730) Mem 7381MB [2024-08-27 07:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.140) Loss 0.6621 (0.6506) Acc@1 86.816 (86.017) Acc@5 96.973 (97.283) Mem 7381MB [2024-08-27 07:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.108) Loss 0.9199 (0.6747) Acc@1 78.418 (85.021) Acc@5 95.020 (97.284) Mem 7381MB [2024-08-27 07:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.095) Loss 1.1670 (0.7658) Acc@1 70.996 (82.787) Acc@5 92.285 (96.295) Mem 7381MB [2024-08-27 07:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.087) Loss 1.0605 (0.8139) Acc@1 73.047 (81.429) Acc@5 93.457 (95.760) Mem 7381MB [2024-08-27 07:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.030 Acc@5 95.726 [2024-08-27 07:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.0% [2024-08-27 07:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.03% [2024-08-27 07:54:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 07:54:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 07:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][0/1251] eta 0:12:21 lr 0.000631 wd 0.0500 time 0.5927 (0.5927) data time 0.3820 (0.3820) model time 0.0000 (0.0000) loss 3.4667 (3.4667) grad_norm 2.3947 (2.3947) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][10/1251] eta 0:05:17 lr 0.000631 wd 0.0500 time 0.2192 (0.2560) data time 0.0009 (0.0356) model time 0.0000 (0.0000) loss 3.3667 (3.1652) grad_norm 1.8695 (2.2310) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][20/1251] eta 0:04:58 lr 0.000631 wd 0.0500 time 0.2339 (0.2424) data time 0.0008 (0.0190) model time 0.0000 (0.0000) loss 3.1877 (3.1530) grad_norm 2.5335 (2.4192) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][30/1251] eta 0:04:50 lr 0.000631 wd 0.0500 time 0.2364 (0.2376) data time 0.0008 (0.0132) model time 0.0000 (0.0000) loss 3.8626 (3.2021) grad_norm 2.0922 (2.4673) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][40/1251] eta 0:04:44 lr 0.000631 wd 0.0500 time 0.2420 (0.2346) data time 0.0007 (0.0102) model time 0.0000 (0.0000) loss 2.9532 (3.2128) grad_norm 1.7854 (2.4006) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][50/1251] eta 0:04:39 lr 0.000631 wd 0.0500 time 0.2226 (0.2323) data time 0.0009 (0.0084) model time 0.0000 (0.0000) loss 3.2983 (3.1627) grad_norm 3.1697 (2.3998) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][60/1251] eta 0:04:34 lr 0.000631 wd 0.0500 time 0.2222 (0.2308) data time 0.0007 (0.0072) model time 0.2215 (0.2221) loss 
3.3930 (3.1883) grad_norm 1.9998 (2.4764) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][70/1251] eta 0:04:31 lr 0.000631 wd 0.0500 time 0.2201 (0.2298) data time 0.0007 (0.0063) model time 0.2194 (0.2225) loss 3.0361 (3.2047) grad_norm 2.2189 (2.4895) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][80/1251] eta 0:04:28 lr 0.000631 wd 0.0500 time 0.2223 (0.2292) data time 0.0010 (0.0056) model time 0.2213 (0.2229) loss 3.7815 (3.2066) grad_norm 1.9849 (2.4357) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][90/1251] eta 0:04:25 lr 0.000631 wd 0.0500 time 0.2212 (0.2287) data time 0.0011 (0.0051) model time 0.2201 (0.2231) loss 2.2586 (3.2244) grad_norm 2.4861 (2.4395) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][100/1251] eta 0:04:22 lr 0.000631 wd 0.0500 time 0.2227 (0.2282) data time 0.0009 (0.0047) model time 0.2217 (0.2231) loss 3.4053 (3.2158) grad_norm 2.6445 (2.4666) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][110/1251] eta 0:04:20 lr 0.000631 wd 0.0500 time 0.2223 (0.2279) data time 0.0008 (0.0044) model time 0.2215 (0.2232) loss 2.9256 (3.2221) grad_norm 2.2055 (2.4819) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][120/1251] eta 0:04:18 lr 0.000631 wd 0.0500 time 0.2194 (0.2285) data time 0.0009 (0.0041) model time 0.2185 (0.2248) loss 2.6992 (3.2186) grad_norm 2.0685 (2.4472) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][130/1251] eta 0:04:15 lr 0.000631 wd 
0.0500 time 0.2183 (0.2281) data time 0.0008 (0.0038) model time 0.2175 (0.2245) loss 3.8323 (3.2163) grad_norm 2.5766 (2.4251) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][140/1251] eta 0:04:13 lr 0.000631 wd 0.0500 time 0.2209 (0.2279) data time 0.0008 (0.0036) model time 0.2200 (0.2244) loss 4.3530 (3.2344) grad_norm 2.1354 (2.4546) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][150/1251] eta 0:04:10 lr 0.000631 wd 0.0500 time 0.2205 (0.2276) data time 0.0008 (0.0035) model time 0.2197 (0.2243) loss 2.6836 (3.2298) grad_norm 2.1815 (2.4360) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][160/1251] eta 0:04:08 lr 0.000631 wd 0.0500 time 0.2230 (0.2275) data time 0.0009 (0.0033) model time 0.2221 (0.2243) loss 3.3779 (3.2401) grad_norm 1.5418 (2.4152) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][170/1251] eta 0:04:05 lr 0.000631 wd 0.0500 time 0.2245 (0.2274) data time 0.0006 (0.0032) model time 0.2239 (0.2243) loss 2.6550 (3.2541) grad_norm 1.9097 (2.3920) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][180/1251] eta 0:04:03 lr 0.000631 wd 0.0500 time 0.2240 (0.2272) data time 0.0008 (0.0030) model time 0.2232 (0.2243) loss 3.4973 (3.2508) grad_norm 3.2700 (2.4204) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][190/1251] eta 0:04:01 lr 0.000631 wd 0.0500 time 0.2255 (0.2272) data time 0.0008 (0.0029) model time 0.2248 (0.2244) loss 3.3476 (3.2405) grad_norm 1.8250 (2.4234) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:16 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [137/300][200/1251] eta 0:03:58 lr 0.000630 wd 0.0500 time 0.2249 (0.2270) data time 0.0008 (0.0028) model time 0.2241 (0.2243) loss 3.4184 (3.2506) grad_norm 2.5697 (2.4091) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][210/1251] eta 0:03:56 lr 0.000630 wd 0.0500 time 0.2245 (0.2269) data time 0.0007 (0.0027) model time 0.2238 (0.2242) loss 3.4272 (3.2581) grad_norm 2.9127 (2.4021) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][220/1251] eta 0:03:53 lr 0.000630 wd 0.0500 time 0.2255 (0.2268) data time 0.0008 (0.0027) model time 0.2247 (0.2242) loss 2.5388 (3.2543) grad_norm 4.0579 (2.4078) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][230/1251] eta 0:03:51 lr 0.000630 wd 0.0500 time 0.2206 (0.2266) data time 0.0009 (0.0026) model time 0.2197 (0.2241) loss 3.2315 (3.2587) grad_norm 3.3771 (2.4163) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][240/1251] eta 0:03:49 lr 0.000630 wd 0.0500 time 0.2259 (0.2265) data time 0.0008 (0.0025) model time 0.2251 (0.2240) loss 3.6322 (3.2551) grad_norm 2.0248 (2.4170) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][250/1251] eta 0:03:46 lr 0.000630 wd 0.0500 time 0.2201 (0.2264) data time 0.0006 (0.0024) model time 0.2195 (0.2239) loss 3.5024 (3.2609) grad_norm 1.7680 (2.4102) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][260/1251] eta 0:03:44 lr 0.000630 wd 0.0500 time 0.2227 (0.2263) data time 0.0009 (0.0024) model time 0.2219 (0.2240) loss 3.6682 (3.2631) grad_norm 1.9345 
(2.4085) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][270/1251] eta 0:03:41 lr 0.000630 wd 0.0500 time 0.2255 (0.2263) data time 0.0009 (0.0023) model time 0.2246 (0.2240) loss 3.7623 (3.2704) grad_norm 2.1314 (2.3950) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][280/1251] eta 0:03:39 lr 0.000630 wd 0.0500 time 0.2201 (0.2263) data time 0.0008 (0.0023) model time 0.2193 (0.2240) loss 3.4300 (3.2631) grad_norm 2.2596 (2.3916) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][290/1251] eta 0:03:37 lr 0.000630 wd 0.0500 time 0.2247 (0.2263) data time 0.0006 (0.0022) model time 0.2241 (0.2241) loss 3.2952 (3.2694) grad_norm 2.1589 (2.3923) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][300/1251] eta 0:03:35 lr 0.000630 wd 0.0500 time 0.2232 (0.2263) data time 0.0007 (0.0022) model time 0.2224 (0.2242) loss 2.8168 (3.2721) grad_norm 1.6398 (2.3770) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][310/1251] eta 0:03:32 lr 0.000630 wd 0.0500 time 0.2238 (0.2263) data time 0.0008 (0.0021) model time 0.2230 (0.2242) loss 3.1832 (3.2791) grad_norm 3.7419 (2.3726) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][320/1251] eta 0:03:30 lr 0.000630 wd 0.0500 time 0.2190 (0.2263) data time 0.0009 (0.0021) model time 0.2181 (0.2243) loss 3.0264 (3.2785) grad_norm 1.8331 (2.3703) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][330/1251] eta 0:03:28 lr 0.000630 wd 0.0500 time 0.2279 (0.2263) data 
time 0.0010 (0.0021) model time 0.2269 (0.2243) loss 3.0294 (3.2775) grad_norm 3.6039 (2.3830) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][340/1251] eta 0:03:26 lr 0.000630 wd 0.0500 time 0.2200 (0.2263) data time 0.0010 (0.0020) model time 0.2191 (0.2243) loss 3.2123 (3.2726) grad_norm 3.0820 (2.3926) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][350/1251] eta 0:03:23 lr 0.000630 wd 0.0500 time 0.2272 (0.2262) data time 0.0007 (0.0020) model time 0.2265 (0.2243) loss 3.9438 (3.2813) grad_norm 2.1022 (2.3928) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][360/1251] eta 0:03:21 lr 0.000630 wd 0.0500 time 0.2241 (0.2262) data time 0.0007 (0.0020) model time 0.2234 (0.2243) loss 2.5090 (3.2777) grad_norm 2.3451 (2.3953) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][370/1251] eta 0:03:19 lr 0.000630 wd 0.0500 time 0.2294 (0.2262) data time 0.0007 (0.0020) model time 0.2287 (0.2243) loss 3.3672 (3.2838) grad_norm 3.1129 (2.3905) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][380/1251] eta 0:03:17 lr 0.000630 wd 0.0500 time 0.2240 (0.2262) data time 0.0007 (0.0019) model time 0.2233 (0.2243) loss 3.4395 (3.2856) grad_norm 3.6542 (2.3876) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][390/1251] eta 0:03:14 lr 0.000630 wd 0.0500 time 0.2198 (0.2262) data time 0.0009 (0.0019) model time 0.2188 (0.2243) loss 3.1887 (3.2832) grad_norm 1.9031 (2.3756) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [137/300][400/1251] eta 0:03:12 lr 0.000630 wd 0.0500 time 0.2216 (0.2261) data time 0.0009 (0.0019) model time 0.2207 (0.2243) loss 3.0487 (3.2789) grad_norm 1.9195 (2.3908) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][410/1251] eta 0:03:10 lr 0.000630 wd 0.0500 time 0.2332 (0.2261) data time 0.0008 (0.0019) model time 0.2325 (0.2243) loss 3.9917 (3.2866) grad_norm 1.7982 (2.3953) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][420/1251] eta 0:03:07 lr 0.000630 wd 0.0500 time 0.2302 (0.2261) data time 0.0008 (0.0018) model time 0.2294 (0.2243) loss 3.5310 (3.2789) grad_norm 1.9309 (2.3898) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][430/1251] eta 0:03:05 lr 0.000629 wd 0.0500 time 0.2257 (0.2261) data time 0.0007 (0.0018) model time 0.2250 (0.2243) loss 4.1214 (3.2808) grad_norm 1.7253 (2.3875) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][440/1251] eta 0:03:03 lr 0.000629 wd 0.0500 time 0.2214 (0.2261) data time 0.0009 (0.0018) model time 0.2205 (0.2243) loss 2.1927 (3.2789) grad_norm 1.9981 (2.3821) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][450/1251] eta 0:03:01 lr 0.000629 wd 0.0500 time 0.2299 (0.2261) data time 0.0008 (0.0018) model time 0.2290 (0.2243) loss 3.7758 (3.2783) grad_norm 1.7111 (2.3779) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][460/1251] eta 0:02:58 lr 0.000629 wd 0.0500 time 0.2207 (0.2260) data time 0.0008 (0.0018) model time 0.2199 (0.2243) loss 3.0111 (3.2725) grad_norm 1.9202 (2.3774) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-27 07:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][470/1251] eta 0:02:56 lr 0.000629 wd 0.0500 time 0.2249 (0.2260) data time 0.0008 (0.0017) model time 0.2242 (0.2244) loss 3.2080 (3.2723) grad_norm 3.0882 (2.3756) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][480/1251] eta 0:02:54 lr 0.000629 wd 0.0500 time 0.2259 (0.2260) data time 0.0007 (0.0017) model time 0.2252 (0.2244) loss 2.2411 (3.2684) grad_norm 2.0457 (2.3739) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][490/1251] eta 0:02:52 lr 0.000629 wd 0.0500 time 0.2285 (0.2260) data time 0.0008 (0.0017) model time 0.2277 (0.2244) loss 3.6136 (3.2691) grad_norm 1.6651 (2.3682) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][500/1251] eta 0:02:49 lr 0.000629 wd 0.0500 time 0.2262 (0.2261) data time 0.0008 (0.0017) model time 0.2255 (0.2244) loss 3.0610 (3.2704) grad_norm 1.9512 (2.3769) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][510/1251] eta 0:02:47 lr 0.000629 wd 0.0500 time 0.2246 (0.2261) data time 0.0007 (0.0017) model time 0.2239 (0.2244) loss 2.5715 (3.2676) grad_norm 3.3211 (2.3863) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][520/1251] eta 0:02:45 lr 0.000629 wd 0.0500 time 0.2263 (0.2268) data time 0.0008 (0.0017) model time 0.2255 (0.2253) loss 2.4921 (3.2631) grad_norm 1.9732 (2.3806) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][530/1251] eta 0:02:43 lr 0.000629 wd 0.0500 time 0.2296 (0.2272) data time 0.0006 (0.0016) model 
time 0.2290 (0.2258) loss 3.8558 (3.2606) grad_norm 2.3447 (2.3847) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][540/1251] eta 0:02:41 lr 0.000629 wd 0.0500 time 0.2248 (0.2273) data time 0.0009 (0.0016) model time 0.2239 (0.2258) loss 3.1863 (3.2632) grad_norm 2.4761 (2.3901) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][550/1251] eta 0:02:39 lr 0.000629 wd 0.0500 time 0.2218 (0.2272) data time 0.0008 (0.0016) model time 0.2211 (0.2258) loss 3.7563 (3.2679) grad_norm 3.0865 (2.3869) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][560/1251] eta 0:02:36 lr 0.000629 wd 0.0500 time 0.2232 (0.2272) data time 0.0008 (0.0016) model time 0.2224 (0.2258) loss 3.7152 (3.2708) grad_norm 2.4155 (2.3918) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][570/1251] eta 0:02:34 lr 0.000629 wd 0.0500 time 0.2206 (0.2271) data time 0.0009 (0.0016) model time 0.2197 (0.2257) loss 3.5972 (3.2692) grad_norm 2.3214 (2.4011) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][580/1251] eta 0:02:32 lr 0.000629 wd 0.0500 time 0.2226 (0.2271) data time 0.0009 (0.0016) model time 0.2217 (0.2257) loss 2.4104 (3.2678) grad_norm 1.5857 (2.4019) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][590/1251] eta 0:02:30 lr 0.000629 wd 0.0500 time 0.2380 (0.2271) data time 0.0006 (0.0016) model time 0.2374 (0.2257) loss 3.6806 (3.2648) grad_norm 2.8111 (2.3992) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][600/1251] 
eta 0:02:27 lr 0.000629 wd 0.0500 time 0.2170 (0.2271) data time 0.0008 (0.0016) model time 0.2162 (0.2257) loss 3.4801 (3.2678) grad_norm 2.8364 (2.4014) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][610/1251] eta 0:02:25 lr 0.000629 wd 0.0500 time 0.2248 (0.2271) data time 0.0008 (0.0015) model time 0.2241 (0.2257) loss 3.3169 (3.2664) grad_norm 2.4624 (2.4047) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][620/1251] eta 0:02:23 lr 0.000629 wd 0.0500 time 0.2270 (0.2270) data time 0.0010 (0.0015) model time 0.2260 (0.2257) loss 3.7062 (3.2707) grad_norm 1.9694 (2.4006) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][630/1251] eta 0:02:20 lr 0.000629 wd 0.0500 time 0.2300 (0.2271) data time 0.0008 (0.0015) model time 0.2292 (0.2257) loss 3.6262 (3.2759) grad_norm 3.0030 (2.3941) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][640/1251] eta 0:02:18 lr 0.000629 wd 0.0500 time 0.4502 (0.2274) data time 0.0008 (0.0015) model time 0.4494 (0.2260) loss 3.1090 (3.2766) grad_norm 2.0722 (2.3894) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][650/1251] eta 0:02:16 lr 0.000629 wd 0.0500 time 0.2294 (0.2273) data time 0.0008 (0.0015) model time 0.2286 (0.2260) loss 2.3594 (3.2725) grad_norm 3.0118 (2.3884) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][660/1251] eta 0:02:14 lr 0.000628 wd 0.0500 time 0.2228 (0.2273) data time 0.0006 (0.0015) model time 0.2223 (0.2260) loss 3.2712 (3.2711) grad_norm 1.9994 (2.3865) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 
07:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][670/1251] eta 0:02:12 lr 0.000628 wd 0.0500 time 0.2303 (0.2273) data time 0.0007 (0.0015) model time 0.2296 (0.2260) loss 2.3613 (3.2670) grad_norm 2.1569 (2.3862) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][680/1251] eta 0:02:09 lr 0.000628 wd 0.0500 time 0.2206 (0.2273) data time 0.0007 (0.0015) model time 0.2199 (0.2259) loss 2.9125 (3.2669) grad_norm 1.7297 (2.3859) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][690/1251] eta 0:02:07 lr 0.000628 wd 0.0500 time 0.2274 (0.2273) data time 0.0007 (0.0015) model time 0.2267 (0.2259) loss 3.8245 (3.2644) grad_norm 1.9143 (2.3859) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][700/1251] eta 0:02:05 lr 0.000628 wd 0.0500 time 0.2166 (0.2272) data time 0.0011 (0.0015) model time 0.2154 (0.2259) loss 2.0631 (3.2638) grad_norm 1.8595 (2.3853) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][710/1251] eta 0:02:02 lr 0.000628 wd 0.0500 time 0.2286 (0.2272) data time 0.0007 (0.0015) model time 0.2279 (0.2259) loss 2.8827 (3.2643) grad_norm 2.7053 (2.3868) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][720/1251] eta 0:02:00 lr 0.000628 wd 0.0500 time 0.2264 (0.2272) data time 0.0007 (0.0014) model time 0.2257 (0.2259) loss 4.2179 (3.2657) grad_norm 2.5962 (2.3889) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][730/1251] eta 0:01:58 lr 0.000628 wd 0.0500 time 0.2217 (0.2272) data time 0.0006 (0.0014) model time 0.2211 (0.2259) loss 4.1573 
(3.2641) grad_norm 2.4345 (2.3968) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][740/1251] eta 0:01:56 lr 0.000628 wd 0.0500 time 0.2307 (0.2272) data time 0.0009 (0.0014) model time 0.2297 (0.2259) loss 3.7549 (3.2658) grad_norm 1.6688 (2.4033) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][750/1251] eta 0:01:53 lr 0.000628 wd 0.0500 time 0.2202 (0.2272) data time 0.0010 (0.0014) model time 0.2192 (0.2259) loss 3.1593 (3.2661) grad_norm 2.8645 (2.4057) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][760/1251] eta 0:01:51 lr 0.000628 wd 0.0500 time 0.2331 (0.2272) data time 0.0007 (0.0014) model time 0.2323 (0.2259) loss 3.0821 (3.2675) grad_norm 1.8593 (2.4017) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][770/1251] eta 0:01:49 lr 0.000628 wd 0.0500 time 0.2220 (0.2272) data time 0.0008 (0.0014) model time 0.2213 (0.2259) loss 2.6315 (3.2667) grad_norm 2.7201 (2.4003) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][780/1251] eta 0:01:46 lr 0.000628 wd 0.0500 time 0.2201 (0.2272) data time 0.0007 (0.0014) model time 0.2194 (0.2259) loss 2.9650 (3.2651) grad_norm 1.6542 (2.3990) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][790/1251] eta 0:01:44 lr 0.000628 wd 0.0500 time 0.2322 (0.2272) data time 0.0009 (0.0014) model time 0.2313 (0.2259) loss 3.7553 (3.2676) grad_norm 1.7215 (2.3991) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][800/1251] eta 0:01:42 lr 0.000628 wd 0.0500 
time 0.2302 (0.2271) data time 0.0007 (0.0014) model time 0.2295 (0.2259) loss 3.4325 (3.2699) grad_norm 1.8857 (2.3948) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][810/1251] eta 0:01:40 lr 0.000628 wd 0.0500 time 0.2244 (0.2271) data time 0.0006 (0.0014) model time 0.2237 (0.2259) loss 4.2019 (3.2688) grad_norm 2.0930 (2.3906) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][820/1251] eta 0:01:37 lr 0.000628 wd 0.0500 time 0.2208 (0.2271) data time 0.0010 (0.0014) model time 0.2198 (0.2258) loss 3.7684 (3.2686) grad_norm 2.7887 (2.3969) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][830/1251] eta 0:01:35 lr 0.000628 wd 0.0500 time 0.2243 (0.2271) data time 0.0007 (0.0014) model time 0.2236 (0.2258) loss 3.5048 (3.2694) grad_norm 3.3043 (2.3990) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][840/1251] eta 0:01:33 lr 0.000628 wd 0.0500 time 0.2331 (0.2271) data time 0.0007 (0.0014) model time 0.2324 (0.2258) loss 3.6128 (3.2718) grad_norm 2.0848 (2.3965) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][850/1251] eta 0:01:31 lr 0.000628 wd 0.0500 time 0.2211 (0.2271) data time 0.0008 (0.0014) model time 0.2203 (0.2258) loss 3.8071 (3.2682) grad_norm 2.1533 (2.3937) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][860/1251] eta 0:01:28 lr 0.000628 wd 0.0500 time 0.2345 (0.2271) data time 0.0008 (0.0014) model time 0.2337 (0.2258) loss 3.7047 (3.2683) grad_norm 1.6954 (2.3909) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:48 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [137/300][870/1251] eta 0:01:26 lr 0.000628 wd 0.0500 time 0.2376 (0.2271) data time 0.0009 (0.0014) model time 0.2366 (0.2258) loss 2.4749 (3.2656) grad_norm 2.6035 (2.3913) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][880/1251] eta 0:01:24 lr 0.000628 wd 0.0500 time 0.2238 (0.2271) data time 0.0010 (0.0014) model time 0.2228 (0.2258) loss 3.6196 (3.2668) grad_norm 2.3519 (2.3917) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][890/1251] eta 0:01:21 lr 0.000628 wd 0.0500 time 0.2224 (0.2271) data time 0.0009 (0.0014) model time 0.2215 (0.2258) loss 3.3337 (3.2650) grad_norm 1.8757 (2.3889) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][900/1251] eta 0:01:19 lr 0.000627 wd 0.0500 time 0.2231 (0.2270) data time 0.0008 (0.0013) model time 0.2223 (0.2258) loss 3.3271 (3.2680) grad_norm 1.9642 (2.3859) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][910/1251] eta 0:01:17 lr 0.000627 wd 0.0500 time 0.2282 (0.2270) data time 0.0007 (0.0013) model time 0.2275 (0.2258) loss 3.7825 (3.2674) grad_norm 1.7022 (2.3835) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][920/1251] eta 0:01:15 lr 0.000627 wd 0.0500 time 0.2290 (0.2270) data time 0.0007 (0.0013) model time 0.2283 (0.2258) loss 3.7070 (3.2669) grad_norm 2.2879 (2.3843) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][930/1251] eta 0:01:12 lr 0.000627 wd 0.0500 time 0.2277 (0.2270) data time 0.0009 (0.0013) model time 0.2268 (0.2258) loss 2.6519 (3.2659) grad_norm 2.1153 
(2.3821) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][940/1251] eta 0:01:10 lr 0.000627 wd 0.0500 time 0.2219 (0.2271) data time 0.0009 (0.0013) model time 0.2210 (0.2258) loss 3.4869 (3.2669) grad_norm 2.8842 (2.3846) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][950/1251] eta 0:01:08 lr 0.000627 wd 0.0500 time 0.2245 (0.2271) data time 0.0009 (0.0013) model time 0.2236 (0.2258) loss 3.6034 (3.2659) grad_norm 1.7565 (2.3924) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][960/1251] eta 0:01:06 lr 0.000627 wd 0.0500 time 0.2287 (0.2271) data time 0.0007 (0.0013) model time 0.2280 (0.2258) loss 3.5624 (3.2654) grad_norm 4.1426 (2.3950) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][970/1251] eta 0:01:03 lr 0.000627 wd 0.0500 time 0.2175 (0.2271) data time 0.0007 (0.0013) model time 0.2168 (0.2258) loss 3.9021 (3.2664) grad_norm 2.8807 (2.3969) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][980/1251] eta 0:01:01 lr 0.000627 wd 0.0500 time 0.2228 (0.2271) data time 0.0008 (0.0013) model time 0.2220 (0.2258) loss 3.4411 (3.2678) grad_norm 2.5501 (2.4002) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][990/1251] eta 0:00:59 lr 0.000627 wd 0.0500 time 0.2225 (0.2270) data time 0.0008 (0.0013) model time 0.2218 (0.2258) loss 2.5916 (3.2686) grad_norm 2.1318 (2.3986) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1000/1251] eta 0:00:56 lr 0.000627 wd 0.0500 time 0.2285 (0.2270) 
data time 0.0007 (0.0013) model time 0.2278 (0.2258) loss 3.4538 (3.2670) grad_norm 1.9648 (2.3956) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1010/1251] eta 0:00:54 lr 0.000627 wd 0.0500 time 0.2251 (0.2270) data time 0.0008 (0.0013) model time 0.2243 (0.2258) loss 2.8071 (3.2656) grad_norm 1.7775 (2.3936) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1020/1251] eta 0:00:52 lr 0.000627 wd 0.0500 time 0.2221 (0.2270) data time 0.0009 (0.0013) model time 0.2212 (0.2258) loss 3.5427 (3.2675) grad_norm 4.4752 (2.3985) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1030/1251] eta 0:00:50 lr 0.000627 wd 0.0500 time 0.2188 (0.2270) data time 0.0007 (0.0013) model time 0.2181 (0.2258) loss 3.5532 (3.2682) grad_norm 1.9100 (2.4002) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1040/1251] eta 0:00:47 lr 0.000627 wd 0.0500 time 0.2249 (0.2270) data time 0.0005 (0.0013) model time 0.2243 (0.2258) loss 3.9055 (3.2664) grad_norm 1.6403 (2.3966) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1050/1251] eta 0:00:45 lr 0.000627 wd 0.0500 time 0.2211 (0.2270) data time 0.0009 (0.0013) model time 0.2203 (0.2258) loss 3.6030 (3.2682) grad_norm 3.1080 (2.3965) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1060/1251] eta 0:00:43 lr 0.000627 wd 0.0500 time 0.2196 (0.2275) data time 0.0007 (0.0013) model time 0.2190 (0.2264) loss 3.7474 (3.2689) grad_norm 1.7017 (2.3937) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [137/300][1070/1251] eta 0:00:41 lr 0.000627 wd 0.0500 time 0.2262 (0.2275) data time 0.0007 (0.0013) model time 0.2255 (0.2264) loss 2.3040 (3.2676) grad_norm 1.8372 (2.3913) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1080/1251] eta 0:00:38 lr 0.000627 wd 0.0500 time 0.2280 (0.2275) data time 0.0009 (0.0013) model time 0.2272 (0.2263) loss 2.3067 (3.2665) grad_norm 1.6722 (2.3916) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1090/1251] eta 0:00:36 lr 0.000627 wd 0.0500 time 0.2225 (0.2275) data time 0.0007 (0.0013) model time 0.2218 (0.2263) loss 3.7901 (3.2686) grad_norm 2.1270 (2.3950) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1100/1251] eta 0:00:34 lr 0.000627 wd 0.0500 time 0.2190 (0.2274) data time 0.0007 (0.0013) model time 0.2183 (0.2263) loss 2.1570 (3.2676) grad_norm 2.4532 (2.3937) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1110/1251] eta 0:00:32 lr 0.000627 wd 0.0500 time 0.2221 (0.2274) data time 0.0008 (0.0013) model time 0.2213 (0.2263) loss 3.3902 (3.2695) grad_norm 2.6261 (2.3926) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1120/1251] eta 0:00:29 lr 0.000627 wd 0.0500 time 0.2206 (0.2274) data time 0.0009 (0.0013) model time 0.2196 (0.2262) loss 3.5984 (3.2680) grad_norm 1.6623 (2.3898) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1130/1251] eta 0:00:27 lr 0.000626 wd 0.0500 time 0.2272 (0.2274) data time 0.0010 (0.0013) model time 0.2262 (0.2262) loss 3.0378 (3.2696) grad_norm 2.4239 (2.3926) loss_scale 
512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1140/1251] eta 0:00:25 lr 0.000626 wd 0.0500 time 0.2248 (0.2274) data time 0.0007 (0.0013) model time 0.2241 (0.2262) loss 2.5141 (3.2707) grad_norm 1.8796 (2.3914) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1150/1251] eta 0:00:22 lr 0.000626 wd 0.0500 time 0.2184 (0.2274) data time 0.0010 (0.0013) model time 0.2174 (0.2262) loss 3.3791 (3.2711) grad_norm 2.1448 (2.3918) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1160/1251] eta 0:00:20 lr 0.000626 wd 0.0500 time 0.2244 (0.2273) data time 0.0009 (0.0013) model time 0.2234 (0.2262) loss 3.7868 (3.2714) grad_norm 2.3783 (2.3966) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1170/1251] eta 0:00:18 lr 0.000626 wd 0.0500 time 0.2222 (0.2273) data time 0.0006 (0.0013) model time 0.2215 (0.2262) loss 3.8624 (3.2726) grad_norm 2.1563 (2.4001) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 07:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1180/1251] eta 0:00:16 lr 0.000626 wd 0.0500 time 0.2298 (0.2273) data time 0.0008 (0.0012) model time 0.2290 (0.2261) loss 2.7083 (3.2722) grad_norm 3.6915 (2.4049) loss_scale 1024.0000 (516.3353) mem 7381MB [2024-08-27 07:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1190/1251] eta 0:00:13 lr 0.000626 wd 0.0500 time 0.2260 (0.2273) data time 0.0007 (0.0012) model time 0.2253 (0.2261) loss 3.9221 (3.2709) grad_norm 2.4097 (2.4058) loss_scale 1024.0000 (520.5978) mem 7381MB [2024-08-27 07:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1200/1251] eta 0:00:11 lr 0.000626 wd 0.0500 time 0.2239 (0.2273) data time 
0.0007 (0.0012) model time 0.2231 (0.2261) loss 3.9260 (3.2724) grad_norm 6.4687 (2.4079) loss_scale 1024.0000 (524.7893) mem 7381MB [2024-08-27 07:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1210/1251] eta 0:00:09 lr 0.000626 wd 0.0500 time 0.2323 (0.2273) data time 0.0008 (0.0012) model time 0.2315 (0.2261) loss 3.8334 (3.2728) grad_norm 2.4173 (2.4054) loss_scale 1024.0000 (528.9116) mem 7381MB [2024-08-27 07:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1220/1251] eta 0:00:07 lr 0.000626 wd 0.0500 time 0.2281 (0.2273) data time 0.0006 (0.0012) model time 0.2275 (0.2261) loss 3.3922 (3.2730) grad_norm 2.5572 (2.4034) loss_scale 1024.0000 (532.9664) mem 7381MB [2024-08-27 07:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1230/1251] eta 0:00:04 lr 0.000626 wd 0.0500 time 0.2249 (0.2273) data time 0.0006 (0.0012) model time 0.2243 (0.2261) loss 4.1157 (3.2744) grad_norm 3.0280 (2.4010) loss_scale 1024.0000 (536.9553) mem 7381MB [2024-08-27 07:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1240/1251] eta 0:00:02 lr 0.000626 wd 0.0500 time 0.2135 (0.2272) data time 0.0004 (0.0012) model time 0.2131 (0.2261) loss 3.8044 (3.2758) grad_norm 1.8051 (2.3997) loss_scale 1024.0000 (540.8799) mem 7381MB [2024-08-27 07:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [137/300][1250/1251] eta 0:00:00 lr 0.000626 wd 0.0500 time 0.2123 (0.2271) data time 0.0004 (0.0012) model time 0.2120 (0.2260) loss 3.7504 (3.2763) grad_norm 2.3477 (2.4030) loss_scale 1024.0000 (544.7418) mem 7381MB [2024-08-27 07:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 137 training takes 0:04:44 [2024-08-27 07:59:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 07:59:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 07:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.417 (0.417) Loss 0.5381 (0.5381) Acc@1 89.453 (89.453) Acc@5 97.949 (97.949) Mem 7381MB [2024-08-27 07:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.067 (0.101) Loss 0.7886 (0.7583) Acc@1 83.105 (83.549) Acc@5 95.996 (96.564) Mem 7381MB [2024-08-27 07:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.086) Loss 1.0166 (0.7756) Acc@1 74.512 (82.864) Acc@5 94.238 (96.689) Mem 7381MB [2024-08-27 07:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.081) Loss 1.3369 (0.8752) Acc@1 69.141 (80.522) Acc@5 89.844 (95.420) Mem 7381MB [2024-08-27 07:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.1807 (0.9280) Acc@1 72.461 (79.102) Acc@5 91.602 (94.767) Mem 7381MB [2024-08-27 07:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.732 Acc@5 94.712 [2024-08-27 07:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.7% [2024-08-27 07:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 78.73% [2024-08-27 07:59:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 07:59:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 07:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.372 (0.372) Loss 0.4153 (0.4153) Acc@1 92.480 (92.480) Acc@5 98.633 (98.633) Mem 7381MB [2024-08-27 07:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.065 (0.095) Loss 0.6621 (0.6500) Acc@1 86.816 (85.973) Acc@5 96.973 (97.266) Mem 7381MB [2024-08-27 07:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.083) Loss 0.9209 (0.6738) Acc@1 78.516 (85.040) Acc@5 95.117 (97.280) Mem 7381MB [2024-08-27 07:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.068 (0.079) Loss 1.1670 (0.7646) Acc@1 70.898 (82.812) Acc@5 92.285 (96.292) Mem 7381MB [2024-08-27 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.074) Loss 1.0576 (0.8126) Acc@1 73.535 (81.455) Acc@5 93.066 (95.746) Mem 7381MB [2024-08-27 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.056 Acc@5 95.710 [2024-08-27 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.1% [2024-08-27 07:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.06% [2024-08-27 07:59:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 07:59:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 07:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][0/1251] eta 0:13:31 lr 0.000626 wd 0.0500 time 0.6487 (0.6487) data time 0.4406 (0.4406) model time 0.0000 (0.0000) loss 3.8055 (3.8055) grad_norm 1.8207 (1.8207) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][10/1251] eta 0:05:27 lr 0.000626 wd 0.0500 time 0.2330 (0.2641) data time 0.0006 (0.0409) model time 0.0000 (0.0000) loss 3.3349 (3.2646) grad_norm 2.8697 (2.1669) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][20/1251] eta 0:05:02 lr 0.000626 wd 0.0500 time 0.2299 (0.2455) data time 0.0007 (0.0219) model time 0.0000 (0.0000) loss 3.4734 (3.3050) grad_norm 3.2265 (2.3319) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][30/1251] eta 0:04:52 lr 0.000626 wd 0.0500 time 0.2268 (0.2392) data time 0.0008 (0.0151) model time 0.0000 (0.0000) loss 3.8164 (3.3841) grad_norm 1.8591 (2.2087) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][40/1251] eta 0:04:45 lr 0.000626 wd 0.0500 time 0.2220 (0.2356) data time 0.0007 (0.0116) model time 0.0000 (0.0000) loss 3.3302 (3.3181) grad_norm 3.2415 (2.2019) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][50/1251] eta 0:04:40 lr 0.000626 wd 0.0500 time 0.2400 (0.2338) data time 0.0009 (0.0096) model time 0.0000 (0.0000) loss 3.3889 (3.3043) grad_norm 2.3407 (2.2353) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][60/1251] eta 0:04:37 lr 0.000626 wd 0.0500 time 0.2266 (0.2327) data time 0.0007 (0.0081) model time 0.2259 
(0.2261) loss 2.4984 (3.2705) grad_norm 2.5993 (2.2664) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][70/1251] eta 0:04:33 lr 0.000626 wd 0.0500 time 0.2246 (0.2316) data time 0.0008 (0.0071) model time 0.2238 (0.2250) loss 3.3690 (3.2848) grad_norm 2.2597 (2.3257) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][80/1251] eta 0:04:32 lr 0.000626 wd 0.0500 time 0.2204 (0.2330) data time 0.0007 (0.0064) model time 0.2197 (0.2307) loss 2.8903 (3.2669) grad_norm 2.2405 (2.3350) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][90/1251] eta 0:04:29 lr 0.000626 wd 0.0500 time 0.2303 (0.2323) data time 0.0009 (0.0058) model time 0.2294 (0.2295) loss 2.7614 (3.2466) grad_norm 1.8077 (2.3105) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][100/1251] eta 0:04:26 lr 0.000626 wd 0.0500 time 0.2277 (0.2318) data time 0.0005 (0.0053) model time 0.2272 (0.2289) loss 3.8145 (3.2453) grad_norm 1.8704 (2.3151) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][110/1251] eta 0:04:23 lr 0.000625 wd 0.0500 time 0.2218 (0.2312) data time 0.0008 (0.0049) model time 0.2211 (0.2281) loss 3.5926 (3.2399) grad_norm 1.7288 (2.2830) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][120/1251] eta 0:04:20 lr 0.000625 wd 0.0500 time 0.2215 (0.2307) data time 0.0007 (0.0046) model time 0.2207 (0.2275) loss 3.8837 (3.2393) grad_norm 2.3740 (2.2719) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][130/1251] 
eta 0:04:18 lr 0.000625 wd 0.0500 time 0.2242 (0.2305) data time 0.0008 (0.0043) model time 0.2234 (0.2274) loss 3.6126 (3.2326) grad_norm 1.9481 (2.2836) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][140/1251] eta 0:04:15 lr 0.000625 wd 0.0500 time 0.2232 (0.2301) data time 0.0006 (0.0040) model time 0.2226 (0.2271) loss 2.9654 (3.2362) grad_norm 2.5375 (2.3302) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 07:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][150/1251] eta 0:04:13 lr 0.000625 wd 0.0500 time 0.2278 (0.2299) data time 0.0006 (0.0038) model time 0.2272 (0.2269) loss 2.1444 (3.2365) grad_norm 4.7507 (2.3543) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][160/1251] eta 0:04:10 lr 0.000625 wd 0.0500 time 0.2446 (0.2297) data time 0.0009 (0.0037) model time 0.2438 (0.2268) loss 3.5145 (3.2314) grad_norm 1.5882 (2.3521) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][170/1251] eta 0:04:07 lr 0.000625 wd 0.0500 time 0.2248 (0.2293) data time 0.0008 (0.0035) model time 0.2240 (0.2265) loss 3.3237 (3.2326) grad_norm 1.7241 (2.3228) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][180/1251] eta 0:04:05 lr 0.000625 wd 0.0500 time 0.2293 (0.2291) data time 0.0008 (0.0034) model time 0.2285 (0.2264) loss 3.7462 (3.2278) grad_norm 2.6354 (2.3101) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][190/1251] eta 0:04:02 lr 0.000625 wd 0.0500 time 0.2264 (0.2290) data time 0.0007 (0.0032) model time 0.2256 (0.2263) loss 3.4033 (3.2209) grad_norm 3.0221 (2.3108) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 08:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][200/1251] eta 0:04:00 lr 0.000625 wd 0.0500 time 0.2241 (0.2288) data time 0.0008 (0.0031) model time 0.2234 (0.2261) loss 2.2516 (3.2080) grad_norm 1.7382 (2.3105) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][210/1251] eta 0:03:58 lr 0.000625 wd 0.0500 time 0.2378 (0.2288) data time 0.0007 (0.0030) model time 0.2371 (0.2262) loss 4.1008 (3.2101) grad_norm 2.4379 (2.3161) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][220/1251] eta 0:03:55 lr 0.000625 wd 0.0500 time 0.2292 (0.2287) data time 0.0008 (0.0029) model time 0.2284 (0.2262) loss 3.4849 (3.2113) grad_norm 2.2412 (2.3048) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][230/1251] eta 0:03:53 lr 0.000625 wd 0.0500 time 0.2236 (0.2286) data time 0.0008 (0.0028) model time 0.2228 (0.2262) loss 2.1584 (3.2014) grad_norm 2.5344 (2.2943) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][240/1251] eta 0:03:51 lr 0.000625 wd 0.0500 time 0.2292 (0.2285) data time 0.0008 (0.0028) model time 0.2284 (0.2261) loss 2.4478 (3.2051) grad_norm 2.7021 (2.2955) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][250/1251] eta 0:03:48 lr 0.000625 wd 0.0500 time 0.2357 (0.2284) data time 0.0007 (0.0027) model time 0.2350 (0.2261) loss 2.7435 (3.2118) grad_norm 2.3246 (2.2916) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][260/1251] eta 0:03:46 lr 0.000625 wd 0.0500 time 0.2234 (0.2283) data time 0.0008 (0.0026) model time 0.2226 
(0.2261) loss 3.4562 (3.2146) grad_norm 1.6512 (2.2898) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][270/1251] eta 0:03:43 lr 0.000625 wd 0.0500 time 0.2219 (0.2283) data time 0.0007 (0.0025) model time 0.2212 (0.2261) loss 3.9610 (3.2237) grad_norm 4.5175 (2.3055) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][280/1251] eta 0:03:41 lr 0.000625 wd 0.0500 time 0.2260 (0.2282) data time 0.0008 (0.0025) model time 0.2253 (0.2260) loss 3.2474 (3.2240) grad_norm 2.3478 (2.3007) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][290/1251] eta 0:03:39 lr 0.000625 wd 0.0500 time 0.2258 (0.2281) data time 0.0009 (0.0024) model time 0.2248 (0.2260) loss 2.5284 (3.2258) grad_norm 1.6605 (2.3021) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][300/1251] eta 0:03:36 lr 0.000625 wd 0.0500 time 0.2234 (0.2280) data time 0.0008 (0.0024) model time 0.2226 (0.2258) loss 3.8493 (3.2222) grad_norm 2.0264 (2.3032) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][310/1251] eta 0:03:34 lr 0.000625 wd 0.0500 time 0.2249 (0.2278) data time 0.0009 (0.0023) model time 0.2241 (0.2258) loss 3.5512 (3.2324) grad_norm 1.6469 (2.2983) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][320/1251] eta 0:03:32 lr 0.000625 wd 0.0500 time 0.2193 (0.2277) data time 0.0009 (0.0023) model time 0.2184 (0.2257) loss 2.9091 (3.2350) grad_norm 2.4648 (2.2953) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[138/300][330/1251] eta 0:03:29 lr 0.000625 wd 0.0500 time 0.2240 (0.2276) data time 0.0008 (0.0022) model time 0.2232 (0.2256) loss 3.3861 (3.2411) grad_norm 2.0494 (2.2902) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][340/1251] eta 0:03:29 lr 0.000624 wd 0.0500 time 0.2298 (0.2295) data time 0.0010 (0.0022) model time 0.2288 (0.2278) loss 3.0709 (3.2436) grad_norm 1.8249 (2.2920) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][350/1251] eta 0:03:26 lr 0.000624 wd 0.0500 time 0.2253 (0.2294) data time 0.0007 (0.0022) model time 0.2246 (0.2278) loss 2.1821 (3.2491) grad_norm 2.1823 (2.2877) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][360/1251] eta 0:03:24 lr 0.000624 wd 0.0500 time 0.2297 (0.2294) data time 0.0007 (0.0021) model time 0.2291 (0.2277) loss 2.6402 (3.2433) grad_norm 2.1011 (2.2806) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][370/1251] eta 0:03:22 lr 0.000624 wd 0.0500 time 0.2276 (0.2293) data time 0.0006 (0.0021) model time 0.2270 (0.2277) loss 3.0851 (3.2426) grad_norm 1.9451 (2.2826) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][380/1251] eta 0:03:19 lr 0.000624 wd 0.0500 time 0.2359 (0.2293) data time 0.0006 (0.0021) model time 0.2353 (0.2277) loss 2.2091 (3.2427) grad_norm 1.8051 (2.2863) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][390/1251] eta 0:03:17 lr 0.000624 wd 0.0500 time 0.2236 (0.2292) data time 0.0009 (0.0020) model time 0.2227 (0.2276) loss 3.6702 (3.2471) grad_norm 3.5402 (2.2836) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 08:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][400/1251] eta 0:03:15 lr 0.000624 wd 0.0500 time 0.2258 (0.2292) data time 0.0006 (0.0020) model time 0.2252 (0.2276) loss 2.8234 (3.2417) grad_norm 2.4743 (2.2796) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][410/1251] eta 0:03:12 lr 0.000624 wd 0.0500 time 0.2339 (0.2291) data time 0.0009 (0.0020) model time 0.2329 (0.2275) loss 3.7612 (3.2407) grad_norm 2.4014 (2.2827) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][420/1251] eta 0:03:10 lr 0.000624 wd 0.0500 time 0.2257 (0.2291) data time 0.0008 (0.0020) model time 0.2248 (0.2275) loss 3.7370 (3.2405) grad_norm 1.9435 (2.2860) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][430/1251] eta 0:03:08 lr 0.000624 wd 0.0500 time 0.2206 (0.2290) data time 0.0008 (0.0019) model time 0.2198 (0.2274) loss 3.1745 (3.2348) grad_norm 2.7338 (2.2977) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][440/1251] eta 0:03:05 lr 0.000624 wd 0.0500 time 0.2300 (0.2289) data time 0.0006 (0.0019) model time 0.2294 (0.2274) loss 2.1393 (3.2368) grad_norm 2.6916 (2.3007) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][450/1251] eta 0:03:03 lr 0.000624 wd 0.0500 time 0.2220 (0.2289) data time 0.0007 (0.0019) model time 0.2213 (0.2274) loss 3.2130 (3.2371) grad_norm 3.7852 (2.3018) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][460/1251] eta 0:03:01 lr 0.000624 wd 0.0500 time 0.2247 (0.2288) data time 0.0008 
(0.0019) model time 0.2239 (0.2273) loss 3.5985 (3.2445) grad_norm 1.9263 (2.2972) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][470/1251] eta 0:02:58 lr 0.000624 wd 0.0500 time 0.2214 (0.2288) data time 0.0009 (0.0019) model time 0.2205 (0.2272) loss 3.6164 (3.2468) grad_norm 1.7978 (2.2965) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][480/1251] eta 0:02:56 lr 0.000624 wd 0.0500 time 0.2241 (0.2287) data time 0.0007 (0.0018) model time 0.2234 (0.2271) loss 3.2667 (3.2462) grad_norm 2.8709 (2.2947) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][490/1251] eta 0:02:53 lr 0.000624 wd 0.0500 time 0.2267 (0.2286) data time 0.0009 (0.0018) model time 0.2258 (0.2271) loss 3.4083 (3.2462) grad_norm 1.8673 (2.2870) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][500/1251] eta 0:02:51 lr 0.000624 wd 0.0500 time 0.2302 (0.2286) data time 0.0006 (0.0018) model time 0.2296 (0.2271) loss 2.7707 (3.2453) grad_norm 4.1582 (2.2919) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][510/1251] eta 0:02:49 lr 0.000624 wd 0.0500 time 0.2210 (0.2285) data time 0.0007 (0.0018) model time 0.2203 (0.2270) loss 3.7295 (3.2504) grad_norm 2.5702 (2.3037) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][520/1251] eta 0:02:46 lr 0.000624 wd 0.0500 time 0.2294 (0.2284) data time 0.0008 (0.0018) model time 0.2286 (0.2269) loss 3.2348 (3.2487) grad_norm 2.2878 (2.3028) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [138/300][530/1251] eta 0:02:44 lr 0.000624 wd 0.0500 time 0.2240 (0.2284) data time 0.0016 (0.0017) model time 0.2223 (0.2269) loss 4.0387 (3.2495) grad_norm 4.4668 (2.3121) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][540/1251] eta 0:02:42 lr 0.000624 wd 0.0500 time 0.2259 (0.2284) data time 0.0008 (0.0017) model time 0.2251 (0.2270) loss 3.5331 (3.2562) grad_norm 3.2074 (2.3214) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][550/1251] eta 0:02:40 lr 0.000624 wd 0.0500 time 0.2252 (0.2284) data time 0.0006 (0.0017) model time 0.2246 (0.2269) loss 3.0164 (3.2580) grad_norm 2.5760 (2.3288) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][560/1251] eta 0:02:37 lr 0.000624 wd 0.0500 time 0.2259 (0.2283) data time 0.0008 (0.0017) model time 0.2252 (0.2268) loss 3.5317 (3.2579) grad_norm 1.7401 (2.3274) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][570/1251] eta 0:02:35 lr 0.000623 wd 0.0500 time 0.2211 (0.2282) data time 0.0008 (0.0017) model time 0.2203 (0.2267) loss 3.7502 (3.2649) grad_norm 4.3288 (2.3435) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][580/1251] eta 0:02:33 lr 0.000623 wd 0.0500 time 0.2208 (0.2281) data time 0.0006 (0.0017) model time 0.2202 (0.2266) loss 3.7451 (3.2655) grad_norm 2.0995 (2.3434) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][590/1251] eta 0:02:30 lr 0.000623 wd 0.0500 time 0.2326 (0.2280) data time 0.0006 (0.0017) model time 0.2320 (0.2266) loss 3.5441 (3.2669) grad_norm 2.5868 (2.3444) loss_scale 
1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][600/1251] eta 0:02:28 lr 0.000623 wd 0.0500 time 0.2234 (0.2280) data time 0.0009 (0.0017) model time 0.2225 (0.2265) loss 3.5288 (3.2674) grad_norm 2.7676 (2.3450) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][610/1251] eta 0:02:26 lr 0.000623 wd 0.0500 time 0.2230 (0.2279) data time 0.0009 (0.0017) model time 0.2221 (0.2264) loss 3.6843 (3.2684) grad_norm 2.4900 (2.3447) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][620/1251] eta 0:02:23 lr 0.000623 wd 0.0500 time 0.2238 (0.2279) data time 0.0008 (0.0016) model time 0.2230 (0.2264) loss 3.5828 (3.2708) grad_norm 2.0252 (2.3542) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][630/1251] eta 0:02:21 lr 0.000623 wd 0.0500 time 0.2192 (0.2278) data time 0.0010 (0.0016) model time 0.2182 (0.2264) loss 3.7249 (3.2740) grad_norm 2.9796 (2.3657) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][640/1251] eta 0:02:19 lr 0.000623 wd 0.0500 time 0.2309 (0.2278) data time 0.0010 (0.0016) model time 0.2299 (0.2263) loss 3.1344 (3.2746) grad_norm 2.3112 (2.3656) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][650/1251] eta 0:02:16 lr 0.000623 wd 0.0500 time 0.2303 (0.2277) data time 0.0009 (0.0016) model time 0.2294 (0.2263) loss 2.5614 (3.2693) grad_norm 1.5731 (2.3635) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][660/1251] eta 0:02:14 lr 0.000623 wd 0.0500 time 0.2203 (0.2276) data time 
0.0007 (0.0016) model time 0.2196 (0.2262) loss 3.5722 (3.2703) grad_norm 1.9023 (2.3587) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][670/1251] eta 0:02:12 lr 0.000623 wd 0.0500 time 0.2212 (0.2276) data time 0.0009 (0.0016) model time 0.2204 (0.2262) loss 3.7282 (3.2707) grad_norm 2.1154 (2.3581) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][680/1251] eta 0:02:09 lr 0.000623 wd 0.0500 time 0.2304 (0.2276) data time 0.0010 (0.0016) model time 0.2294 (0.2262) loss 3.4938 (3.2732) grad_norm 2.5115 (2.3586) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][690/1251] eta 0:02:07 lr 0.000623 wd 0.0500 time 0.2289 (0.2276) data time 0.0010 (0.0016) model time 0.2279 (0.2262) loss 2.9112 (3.2689) grad_norm 1.9255 (2.3551) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][700/1251] eta 0:02:05 lr 0.000623 wd 0.0500 time 0.2255 (0.2276) data time 0.0009 (0.0016) model time 0.2245 (0.2262) loss 3.8117 (3.2679) grad_norm 2.2710 (2.3593) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][710/1251] eta 0:02:03 lr 0.000623 wd 0.0500 time 0.2199 (0.2276) data time 0.0009 (0.0016) model time 0.2190 (0.2262) loss 2.8961 (3.2675) grad_norm 1.8120 (2.3598) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][720/1251] eta 0:02:00 lr 0.000623 wd 0.0500 time 0.2277 (0.2275) data time 0.0009 (0.0015) model time 0.2268 (0.2261) loss 2.9364 (3.2650) grad_norm 1.4955 (2.3584) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [138/300][730/1251] eta 0:01:58 lr 0.000623 wd 0.0500 time 0.2243 (0.2275) data time 0.0006 (0.0015) model time 0.2237 (0.2261) loss 3.6264 (3.2690) grad_norm 2.7141 (2.3538) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][740/1251] eta 0:01:56 lr 0.000623 wd 0.0500 time 0.2246 (0.2275) data time 0.0008 (0.0015) model time 0.2238 (0.2261) loss 2.7103 (3.2694) grad_norm 2.4465 (2.3512) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][750/1251] eta 0:01:53 lr 0.000623 wd 0.0500 time 0.2270 (0.2274) data time 0.0007 (0.0015) model time 0.2263 (0.2260) loss 2.7835 (3.2691) grad_norm 2.6168 (2.3570) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][760/1251] eta 0:01:51 lr 0.000623 wd 0.0500 time 0.2254 (0.2274) data time 0.0007 (0.0015) model time 0.2247 (0.2260) loss 3.8151 (3.2662) grad_norm 1.9336 (2.3588) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][770/1251] eta 0:01:49 lr 0.000623 wd 0.0500 time 0.2376 (0.2273) data time 0.0008 (0.0015) model time 0.2368 (0.2259) loss 3.8027 (3.2654) grad_norm 1.7916 (2.3617) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][780/1251] eta 0:01:47 lr 0.000623 wd 0.0500 time 0.2225 (0.2273) data time 0.0008 (0.0015) model time 0.2217 (0.2259) loss 3.1410 (3.2673) grad_norm 1.5348 (2.3600) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][790/1251] eta 0:01:44 lr 0.000623 wd 0.0500 time 0.2238 (0.2273) data time 0.0007 (0.0015) model time 0.2231 (0.2259) loss 4.0849 (3.2684) grad_norm 3.0016 (2.3621) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][800/1251] eta 0:01:42 lr 0.000623 wd 0.0500 time 0.2311 (0.2273) data time 0.0009 (0.0015) model time 0.2302 (0.2259) loss 2.6508 (3.2728) grad_norm 1.7060 (2.3654) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][810/1251] eta 0:01:40 lr 0.000622 wd 0.0500 time 0.2172 (0.2272) data time 0.0009 (0.0015) model time 0.2164 (0.2259) loss 3.3088 (3.2716) grad_norm 2.4033 (2.3657) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][820/1251] eta 0:01:37 lr 0.000622 wd 0.0500 time 0.2284 (0.2272) data time 0.0009 (0.0015) model time 0.2275 (0.2259) loss 3.5861 (3.2709) grad_norm 2.3962 (2.3657) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][830/1251] eta 0:01:35 lr 0.000622 wd 0.0500 time 0.2271 (0.2272) data time 0.0006 (0.0015) model time 0.2264 (0.2259) loss 2.9130 (3.2696) grad_norm 1.9847 (2.3631) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][840/1251] eta 0:01:33 lr 0.000622 wd 0.0500 time 0.2202 (0.2272) data time 0.0008 (0.0015) model time 0.2194 (0.2258) loss 3.5295 (3.2712) grad_norm 2.3832 (2.3686) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][850/1251] eta 0:01:31 lr 0.000622 wd 0.0500 time 0.2217 (0.2272) data time 0.0009 (0.0014) model time 0.2208 (0.2258) loss 3.5488 (3.2710) grad_norm 2.7112 (2.3691) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][860/1251] eta 0:01:29 lr 0.000622 wd 0.0500 time 0.2238 (0.2276) 
data time 0.0007 (0.0014) model time 0.2231 (0.2263) loss 3.2697 (3.2707) grad_norm 1.9478 (2.3675) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][870/1251] eta 0:01:26 lr 0.000622 wd 0.0500 time 0.2356 (0.2276) data time 0.0006 (0.0014) model time 0.2351 (0.2263) loss 3.2802 (3.2689) grad_norm 2.2827 (2.3654) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][880/1251] eta 0:01:24 lr 0.000622 wd 0.0500 time 0.2262 (0.2276) data time 0.0009 (0.0014) model time 0.2254 (0.2263) loss 3.3226 (3.2660) grad_norm 1.9643 (2.3656) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][890/1251] eta 0:01:22 lr 0.000622 wd 0.0500 time 0.2228 (0.2275) data time 0.0007 (0.0014) model time 0.2222 (0.2262) loss 2.6149 (3.2652) grad_norm 2.0490 (2.3626) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][900/1251] eta 0:01:19 lr 0.000622 wd 0.0500 time 0.2238 (0.2275) data time 0.0006 (0.0014) model time 0.2232 (0.2262) loss 2.5135 (3.2672) grad_norm 2.1521 (2.3601) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][910/1251] eta 0:01:17 lr 0.000622 wd 0.0500 time 0.2304 (0.2275) data time 0.0006 (0.0014) model time 0.2298 (0.2262) loss 3.8519 (3.2646) grad_norm 1.9459 (2.3563) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][920/1251] eta 0:01:15 lr 0.000622 wd 0.0500 time 0.2260 (0.2274) data time 0.0007 (0.0014) model time 0.2253 (0.2261) loss 3.4743 (3.2664) grad_norm 1.5567 (2.3600) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:56 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [138/300][930/1251] eta 0:01:12 lr 0.000622 wd 0.0500 time 0.2300 (0.2274) data time 0.0006 (0.0014) model time 0.2294 (0.2261) loss 2.2299 (3.2655) grad_norm 2.8771 (2.3605) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][940/1251] eta 0:01:10 lr 0.000622 wd 0.0500 time 0.2248 (0.2274) data time 0.0009 (0.0014) model time 0.2238 (0.2261) loss 2.3455 (3.2650) grad_norm 2.0901 (2.3623) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][950/1251] eta 0:01:08 lr 0.000622 wd 0.0500 time 0.2310 (0.2273) data time 0.0006 (0.0014) model time 0.2304 (0.2260) loss 2.6976 (3.2645) grad_norm 2.9195 (2.3637) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][960/1251] eta 0:01:06 lr 0.000622 wd 0.0500 time 0.2353 (0.2273) data time 0.0007 (0.0014) model time 0.2346 (0.2260) loss 3.7655 (3.2673) grad_norm 2.5010 (2.3632) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][970/1251] eta 0:01:03 lr 0.000622 wd 0.0500 time 0.2220 (0.2273) data time 0.0008 (0.0014) model time 0.2212 (0.2260) loss 3.7067 (3.2695) grad_norm 1.8843 (2.3654) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][980/1251] eta 0:01:01 lr 0.000622 wd 0.0500 time 0.2214 (0.2273) data time 0.0006 (0.0014) model time 0.2208 (0.2260) loss 3.7247 (3.2669) grad_norm 2.4227 (2.3687) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][990/1251] eta 0:00:59 lr 0.000622 wd 0.0500 time 0.2366 (0.2273) data time 0.0009 (0.0014) model time 0.2358 (0.2260) loss 2.8830 (3.2674) grad_norm 
1.7684 (2.3695) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1000/1251] eta 0:00:57 lr 0.000622 wd 0.0500 time 0.2231 (0.2273) data time 0.0009 (0.0014) model time 0.2221 (0.2260) loss 3.8013 (3.2650) grad_norm 2.3199 (2.3708) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1010/1251] eta 0:00:54 lr 0.000622 wd 0.0500 time 0.2340 (0.2275) data time 0.0008 (0.0014) model time 0.2332 (0.2262) loss 2.9621 (3.2652) grad_norm 2.8137 (2.3742) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1020/1251] eta 0:00:52 lr 0.000622 wd 0.0500 time 0.2243 (0.2274) data time 0.0006 (0.0014) model time 0.2237 (0.2262) loss 3.7442 (3.2665) grad_norm 2.1172 (2.3784) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1030/1251] eta 0:00:50 lr 0.000622 wd 0.0500 time 0.2337 (0.2275) data time 0.0007 (0.0014) model time 0.2330 (0.2262) loss 3.8217 (3.2678) grad_norm 3.7054 (2.3792) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1040/1251] eta 0:00:47 lr 0.000621 wd 0.0500 time 0.2261 (0.2274) data time 0.0007 (0.0014) model time 0.2253 (0.2262) loss 3.2040 (3.2669) grad_norm 2.0903 (2.3793) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1050/1251] eta 0:00:45 lr 0.000621 wd 0.0500 time 0.2233 (0.2274) data time 0.0009 (0.0013) model time 0.2224 (0.2262) loss 3.4357 (3.2666) grad_norm 2.6712 (2.3796) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1060/1251] eta 0:00:43 lr 0.000621 wd 
0.0500 time 0.2255 (0.2274) data time 0.0008 (0.0013) model time 0.2247 (0.2261) loss 2.8632 (3.2653) grad_norm 2.4213 (2.3811) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1070/1251] eta 0:00:41 lr 0.000621 wd 0.0500 time 0.2198 (0.2274) data time 0.0007 (0.0013) model time 0.2190 (0.2261) loss 3.5875 (3.2661) grad_norm 1.7480 (2.3804) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1080/1251] eta 0:00:38 lr 0.000621 wd 0.0500 time 0.2210 (0.2274) data time 0.0008 (0.0013) model time 0.2203 (0.2261) loss 3.2843 (3.2653) grad_norm 2.5475 (2.3871) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1090/1251] eta 0:00:36 lr 0.000621 wd 0.0500 time 0.2248 (0.2274) data time 0.0010 (0.0013) model time 0.2238 (0.2261) loss 3.3010 (3.2670) grad_norm 1.7933 (2.3867) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1100/1251] eta 0:00:34 lr 0.000621 wd 0.0500 time 0.2214 (0.2274) data time 0.0007 (0.0013) model time 0.2207 (0.2261) loss 2.3297 (3.2670) grad_norm 1.7332 (2.3867) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1110/1251] eta 0:00:32 lr 0.000621 wd 0.0500 time 0.2201 (0.2273) data time 0.0013 (0.0013) model time 0.2187 (0.2261) loss 3.5440 (3.2661) grad_norm 2.0044 (2.3886) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1120/1251] eta 0:00:29 lr 0.000621 wd 0.0500 time 0.2257 (0.2273) data time 0.0006 (0.0013) model time 0.2251 (0.2261) loss 3.4892 (3.2663) grad_norm 2.4668 (2.3893) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:41 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1130/1251] eta 0:00:27 lr 0.000621 wd 0.0500 time 0.2216 (0.2273) data time 0.0007 (0.0013) model time 0.2209 (0.2260) loss 3.9940 (3.2660) grad_norm 2.3418 (2.3862) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1140/1251] eta 0:00:25 lr 0.000621 wd 0.0500 time 0.2269 (0.2273) data time 0.0010 (0.0013) model time 0.2259 (0.2260) loss 3.6708 (3.2679) grad_norm 1.8068 (2.3854) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1150/1251] eta 0:00:22 lr 0.000621 wd 0.0500 time 0.2296 (0.2273) data time 0.0009 (0.0013) model time 0.2288 (0.2260) loss 3.7517 (3.2663) grad_norm 2.0129 (2.3852) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1160/1251] eta 0:00:20 lr 0.000621 wd 0.0500 time 0.2240 (0.2272) data time 0.0009 (0.0013) model time 0.2232 (0.2260) loss 3.0532 (3.2687) grad_norm 2.3208 (2.3897) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1170/1251] eta 0:00:18 lr 0.000621 wd 0.0500 time 0.2264 (0.2272) data time 0.0009 (0.0013) model time 0.2255 (0.2260) loss 2.6687 (3.2678) grad_norm 1.8948 (2.3867) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1180/1251] eta 0:00:16 lr 0.000621 wd 0.0500 time 0.2178 (0.2272) data time 0.0008 (0.0013) model time 0.2170 (0.2260) loss 2.7753 (3.2674) grad_norm 2.1978 (2.3858) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1190/1251] eta 0:00:13 lr 0.000621 wd 0.0500 time 0.2244 (0.2272) data time 0.0009 (0.0013) model time 0.2235 (0.2260) loss 
3.1615 (3.2685) grad_norm 2.6039 (2.3843) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1200/1251] eta 0:00:11 lr 0.000621 wd 0.0500 time 0.2234 (0.2272) data time 0.0007 (0.0013) model time 0.2227 (0.2260) loss 3.6815 (3.2711) grad_norm 2.0529 (2.3840) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1210/1251] eta 0:00:09 lr 0.000621 wd 0.0500 time 0.2262 (0.2272) data time 0.0009 (0.0013) model time 0.2253 (0.2260) loss 3.4677 (3.2740) grad_norm 2.5591 (2.3830) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1220/1251] eta 0:00:07 lr 0.000621 wd 0.0500 time 0.2238 (0.2272) data time 0.0009 (0.0013) model time 0.2229 (0.2260) loss 3.3648 (3.2749) grad_norm 2.2009 (2.3827) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1230/1251] eta 0:00:04 lr 0.000621 wd 0.0500 time 0.2281 (0.2272) data time 0.0007 (0.0013) model time 0.2274 (0.2259) loss 3.9526 (3.2713) grad_norm 2.0871 (2.3815) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1240/1251] eta 0:00:02 lr 0.000621 wd 0.0500 time 0.2144 (0.2271) data time 0.0003 (0.0013) model time 0.2141 (0.2259) loss 3.5198 (3.2738) grad_norm 2.1806 (2.3795) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [138/300][1250/1251] eta 0:00:00 lr 0.000621 wd 0.0500 time 0.2133 (0.2270) data time 0.0004 (0.0013) model time 0.2129 (0.2258) loss 4.0615 (3.2746) grad_norm 3.5686 (2.3809) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 138 training takes 0:04:44 
[2024-08-27 08:04:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:04:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.375 (0.375) Loss 0.4937 (0.4937) Acc@1 91.016 (91.016) Acc@5 98.047 (98.047) Mem 7381MB [2024-08-27 08:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.098) Loss 0.7690 (0.7531) Acc@1 83.887 (83.540) Acc@5 96.387 (96.626) Mem 7381MB [2024-08-27 08:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.064 (0.086) Loss 1.1104 (0.7710) Acc@1 74.414 (82.822) Acc@5 94.141 (96.698) Mem 7381MB [2024-08-27 08:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.080) Loss 1.2607 (0.8693) Acc@1 70.898 (80.472) Acc@5 91.016 (95.546) Mem 7381MB [2024-08-27 08:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.1953 (0.9252) Acc@1 70.898 (79.030) Acc@5 91.895 (94.872) Mem 7381MB [2024-08-27 08:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.654 Acc@5 94.834 [2024-08-27 08:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.7% [2024-08-27 08:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.748 (0.748) Loss 0.4148 (0.4148) Acc@1 92.285 (92.285) Acc@5 98.633 (98.633) Mem 7381MB [2024-08-27 08:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.141) Loss 0.6606 (0.6491) Acc@1 86.914 (85.946) Acc@5 96.973 (97.275) Mem 7381MB [2024-08-27 08:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.108) Loss 0.9194 (0.6730) Acc@1 78.613 (85.035) Acc@5 94.922 (97.294) Mem 7381MB 
[2024-08-27 08:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.066 (0.095) Loss 1.1660 (0.7636) Acc@1 70.508 (82.784) Acc@5 92.383 (96.314) Mem 7381MB [2024-08-27 08:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.087) Loss 1.0557 (0.8115) Acc@1 73.633 (81.431) Acc@5 93.555 (95.789) Mem 7381MB [2024-08-27 08:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.016 Acc@5 95.758 [2024-08-27 08:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.0% [2024-08-27 08:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][0/1251] eta 0:22:14 lr 0.000621 wd 0.0500 time 1.0671 (1.0671) data time 0.6260 (0.6260) model time 0.0000 (0.0000) loss 3.7042 (3.7042) grad_norm 2.4073 (2.4073) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 08:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 08:04:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:04:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 08:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 08:05:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 08:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-27 08:06:12 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-27 08:06:13 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-27 08:06:14 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-27 08:06:14 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 139)
[2024-08-27 08:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-27 08:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting
[2024-08-27 08:06:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-27 08:06:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-27 08:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 08:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 08:08:08 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-27 08:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-27 08:08:19 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-27 08:08:21 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 08:08:22 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 08:08:22 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 139) [2024-08-27 08:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 08:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][10/1251] eta 0:33:10 lr 0.000621 wd 0.0500 time 0.2251 (1.6036) data time 0.0007 (0.1161) model time 0.0000 (0.0000) loss 3.5558 (3.8230) grad_norm 2.5889 (2.5807) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][20/1251] eta 0:17:59 lr 0.000620 wd 0.0500 time 0.2234 (0.8769) data time 0.0009 (0.0555) model time 0.0000 (0.0000) loss 3.6308 (3.6196) grad_norm 1.8085 (2.5915) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][30/1251] eta 0:13:15 lr 0.000620 wd 0.0500 time 0.2207 (0.6516) data time 0.0008 (0.0367) model time 0.0000 (0.0000) loss 3.7825 (3.5995) grad_norm 1.6046 (2.3554) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][40/1251] eta 0:10:55 lr 0.000620 wd 0.0500 time 0.2206 (0.5415) data time 0.0010 (0.0276) model time 0.0000 (0.0000) loss 3.1108 (3.5090) grad_norm 1.8680 (2.3402) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][50/1251] eta 0:09:32 lr 0.000620 wd 0.0500 time 0.2263 (0.4766) data time 0.0011 (0.0221) model time 0.0000 (0.0000) loss 3.3013 (3.4588) grad_norm 1.7346 (2.4733) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[139/300][60/1251] eta 0:08:36 lr 0.000620 wd 0.0500 time 0.2215 (0.4335) data time 0.0007 (0.0185) model time 0.2207 (0.2215) loss 2.5017 (3.4199) grad_norm 2.0112 (2.4758) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][70/1251] eta 0:07:55 lr 0.000620 wd 0.0500 time 0.2252 (0.4030) data time 0.0008 (0.0160) model time 0.2244 (0.2217) loss 3.3310 (3.3920) grad_norm 1.7388 (2.4322) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][80/1251] eta 0:07:25 lr 0.000620 wd 0.0500 time 0.2209 (0.3802) data time 0.0008 (0.0141) model time 0.2201 (0.2219) loss 3.5365 (3.3710) grad_norm 2.4203 (2.3907) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][90/1251] eta 0:07:01 lr 0.000620 wd 0.0500 time 0.2177 (0.3627) data time 0.0008 (0.0126) model time 0.2169 (0.2222) loss 2.9788 (3.3344) grad_norm 2.8551 (2.3707) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][100/1251] eta 0:06:41 lr 0.000620 wd 0.0500 time 0.2247 (0.3489) data time 0.0010 (0.0114) model time 0.2236 (0.2228) loss 3.5843 (3.3549) grad_norm 2.5116 (2.4385) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][110/1251] eta 0:06:25 lr 0.000620 wd 0.0500 time 0.2340 (0.3377) data time 0.0007 (0.0105) model time 0.2333 (0.2233) loss 4.0615 (3.3683) grad_norm 2.8642 (2.4605) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][120/1251] eta 0:06:11 lr 0.000620 wd 0.0500 time 0.2288 (0.3281) data time 0.0008 (0.0097) model time 0.2279 (0.2232) loss 3.7256 (3.3614) grad_norm 1.6894 (2.4243) loss_scale 1024.0000 
(1024.0000) mem 7377MB [2024-08-27 08:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][130/1251] eta 0:05:58 lr 0.000620 wd 0.0500 time 0.2228 (0.3202) data time 0.0008 (0.0090) model time 0.2220 (0.2233) loss 3.2456 (3.3485) grad_norm 1.7487 (2.4199) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][140/1251] eta 0:05:48 lr 0.000620 wd 0.0500 time 0.2209 (0.3133) data time 0.0008 (0.0084) model time 0.2201 (0.2233) loss 4.0436 (3.3431) grad_norm 2.2569 (2.4103) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][150/1251] eta 0:05:38 lr 0.000620 wd 0.0500 time 0.2222 (0.3073) data time 0.0007 (0.0079) model time 0.2214 (0.2233) loss 3.1589 (3.3308) grad_norm 1.9092 (2.4024) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][160/1251] eta 0:05:29 lr 0.000620 wd 0.0500 time 0.2216 (0.3020) data time 0.0007 (0.0075) model time 0.2210 (0.2232) loss 4.0028 (3.3281) grad_norm 1.8855 (2.3844) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][170/1251] eta 0:05:21 lr 0.000620 wd 0.0500 time 0.2253 (0.2974) data time 0.0008 (0.0071) model time 0.2245 (0.2232) loss 3.2587 (3.3259) grad_norm 2.4095 (2.3720) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][180/1251] eta 0:05:14 lr 0.000620 wd 0.0500 time 0.2255 (0.2932) data time 0.0006 (0.0068) model time 0.2249 (0.2231) loss 3.2135 (3.3108) grad_norm 2.0593 (2.3623) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][190/1251] eta 0:05:07 lr 0.000620 wd 0.0500 time 0.2255 (0.2895) data time 0.0007 
(0.0065) model time 0.2249 (0.2230) loss 4.0611 (3.3071) grad_norm 2.3308 (2.3615) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][200/1251] eta 0:05:00 lr 0.000620 wd 0.0500 time 0.2316 (0.2862) data time 0.0008 (0.0062) model time 0.2308 (0.2230) loss 2.3224 (3.2849) grad_norm 2.3213 (2.3622) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][210/1251] eta 0:04:54 lr 0.000620 wd 0.0500 time 0.2280 (0.2833) data time 0.0007 (0.0059) model time 0.2273 (0.2231) loss 3.3006 (3.2740) grad_norm 3.5532 (2.3568) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][220/1251] eta 0:04:49 lr 0.000620 wd 0.0500 time 0.2256 (0.2806) data time 0.0006 (0.0057) model time 0.2251 (0.2231) loss 3.8055 (3.2692) grad_norm 2.9734 (2.3682) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][230/1251] eta 0:04:44 lr 0.000620 wd 0.0500 time 0.2281 (0.2782) data time 0.0007 (0.0055) model time 0.2274 (0.2232) loss 2.2795 (3.2676) grad_norm 1.8183 (2.3665) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][240/1251] eta 0:04:38 lr 0.000620 wd 0.0500 time 0.2353 (0.2759) data time 0.0006 (0.0053) model time 0.2347 (0.2232) loss 2.3656 (3.2611) grad_norm 2.0438 (2.3716) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][250/1251] eta 0:04:34 lr 0.000619 wd 0.0500 time 0.2285 (0.2738) data time 0.0006 (0.0051) model time 0.2279 (0.2231) loss 3.2984 (3.2574) grad_norm 4.0263 (2.3949) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [139/300][260/1251] eta 0:04:29 lr 0.000619 wd 0.0500 time 0.2252 (0.2719) data time 0.0008 (0.0050) model time 0.2244 (0.2232) loss 3.5565 (3.2479) grad_norm 2.6366 (2.3879) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][270/1251] eta 0:04:25 lr 0.000619 wd 0.0500 time 0.2233 (0.2702) data time 0.0008 (0.0048) model time 0.2225 (0.2232) loss 2.0414 (3.2411) grad_norm 1.8103 (2.3808) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][280/1251] eta 0:04:20 lr 0.000619 wd 0.0500 time 0.2293 (0.2685) data time 0.0008 (0.0047) model time 0.2284 (0.2232) loss 3.5353 (3.2503) grad_norm 2.5732 (2.3864) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][290/1251] eta 0:04:17 lr 0.000619 wd 0.0500 time 0.2252 (0.2678) data time 0.0008 (0.0046) model time 0.2244 (0.2242) loss 3.1195 (3.2528) grad_norm 2.1462 (2.3861) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][300/1251] eta 0:04:13 lr 0.000619 wd 0.0500 time 0.2370 (0.2663) data time 0.0008 (0.0044) model time 0.2362 (0.2241) loss 3.4011 (3.2434) grad_norm 1.9792 (2.3785) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][310/1251] eta 0:04:10 lr 0.000619 wd 0.0500 time 0.2193 (0.2658) data time 0.0008 (0.0043) model time 0.2185 (0.2251) loss 3.6135 (3.2405) grad_norm 2.7386 (2.3772) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][320/1251] eta 0:04:06 lr 0.000619 wd 0.0500 time 0.2177 (0.2645) data time 0.0007 (0.0042) model time 0.2170 (0.2250) loss 3.7069 (3.2513) grad_norm 1.8531 (2.3700) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][330/1251] eta 0:04:02 lr 0.000619 wd 0.0500 time 0.2213 (0.2633) data time 0.0009 (0.0041) model time 0.2204 (0.2250) loss 2.2964 (3.2542) grad_norm 1.5879 (2.3680) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][340/1251] eta 0:03:58 lr 0.000619 wd 0.0500 time 0.2244 (0.2621) data time 0.0009 (0.0040) model time 0.2235 (0.2249) loss 3.1324 (3.2554) grad_norm 2.5635 (2.3839) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][350/1251] eta 0:03:55 lr 0.000619 wd 0.0500 time 0.2256 (0.2609) data time 0.0006 (0.0039) model time 0.2250 (0.2247) loss 3.7759 (3.2579) grad_norm 3.2732 (2.3884) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][360/1251] eta 0:03:51 lr 0.000619 wd 0.0500 time 0.2262 (0.2599) data time 0.0006 (0.0039) model time 0.2256 (0.2246) loss 2.6667 (3.2586) grad_norm 2.1274 (2.3904) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][370/1251] eta 0:03:48 lr 0.000619 wd 0.0500 time 0.2155 (0.2589) data time 0.0009 (0.0038) model time 0.2146 (0.2246) loss 3.3275 (3.2599) grad_norm 2.2990 (2.3922) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][380/1251] eta 0:03:44 lr 0.000619 wd 0.0500 time 0.2251 (0.2581) data time 0.0007 (0.0037) model time 0.2244 (0.2247) loss 3.6448 (3.2601) grad_norm 1.8612 (2.3822) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][390/1251] eta 0:03:41 lr 0.000619 wd 0.0500 time 0.2236 (0.2572) data time 
0.0007 (0.0036) model time 0.2228 (0.2246) loss 3.4408 (3.2528) grad_norm 2.1176 (2.3793) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][400/1251] eta 0:03:38 lr 0.000619 wd 0.0500 time 0.2174 (0.2564) data time 0.0007 (0.0036) model time 0.2166 (0.2245) loss 3.5644 (3.2586) grad_norm 1.9716 (2.3824) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][410/1251] eta 0:03:34 lr 0.000619 wd 0.0500 time 0.2189 (0.2556) data time 0.0009 (0.0035) model time 0.2180 (0.2245) loss 3.7108 (3.2623) grad_norm 1.9771 (2.3715) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][420/1251] eta 0:03:31 lr 0.000619 wd 0.0500 time 0.2207 (0.2548) data time 0.0006 (0.0034) model time 0.2201 (0.2245) loss 2.2540 (3.2607) grad_norm 1.6596 (2.3684) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][430/1251] eta 0:03:28 lr 0.000619 wd 0.0500 time 0.2185 (0.2541) data time 0.0007 (0.0034) model time 0.2177 (0.2244) loss 3.6284 (3.2689) grad_norm 2.0394 (2.3678) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][440/1251] eta 0:03:25 lr 0.000619 wd 0.0500 time 0.2162 (0.2534) data time 0.0011 (0.0033) model time 0.2151 (0.2243) loss 3.2846 (3.2723) grad_norm 2.1068 (2.3616) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][450/1251] eta 0:03:22 lr 0.000619 wd 0.0500 time 0.2221 (0.2527) data time 0.0008 (0.0033) model time 0.2213 (0.2243) loss 3.5350 (3.2739) grad_norm 2.5157 (2.3611) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [139/300][460/1251] eta 0:03:19 lr 0.000619 wd 0.0500 time 0.2231 (0.2520) data time 0.0008 (0.0032) model time 0.2223 (0.2242) loss 2.8487 (3.2669) grad_norm 1.7501 (2.3602) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][470/1251] eta 0:03:16 lr 0.000619 wd 0.0500 time 0.2201 (0.2514) data time 0.0007 (0.0032) model time 0.2194 (0.2242) loss 1.8205 (3.2575) grad_norm 1.7018 (2.3626) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][480/1251] eta 0:03:13 lr 0.000618 wd 0.0500 time 0.2278 (0.2508) data time 0.0010 (0.0031) model time 0.2268 (0.2241) loss 3.6123 (3.2562) grad_norm 3.1879 (2.3730) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][490/1251] eta 0:03:10 lr 0.000618 wd 0.0500 time 0.2305 (0.2503) data time 0.0008 (0.0031) model time 0.2297 (0.2241) loss 2.8358 (3.2594) grad_norm 2.7117 (2.3733) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][500/1251] eta 0:03:07 lr 0.000618 wd 0.0500 time 0.2252 (0.2497) data time 0.0005 (0.0030) model time 0.2246 (0.2241) loss 3.4837 (3.2613) grad_norm 2.1437 (2.3760) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][510/1251] eta 0:03:04 lr 0.000618 wd 0.0500 time 0.2270 (0.2493) data time 0.0006 (0.0030) model time 0.2264 (0.2241) loss 4.1000 (3.2647) grad_norm 3.0206 (2.3740) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][520/1251] eta 0:03:01 lr 0.000618 wd 0.0500 time 0.2231 (0.2488) data time 0.0011 (0.0030) model time 0.2221 (0.2241) loss 2.3683 (3.2675) grad_norm 1.8961 (2.3735) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][530/1251] eta 0:02:59 lr 0.000618 wd 0.0500 time 0.2217 (0.2483) data time 0.0007 (0.0029) model time 0.2210 (0.2241) loss 3.7724 (3.2601) grad_norm 1.7482 (2.3705) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][540/1251] eta 0:02:56 lr 0.000618 wd 0.0500 time 0.2225 (0.2479) data time 0.0012 (0.0029) model time 0.2213 (0.2241) loss 3.3071 (3.2589) grad_norm 3.7553 (2.3795) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][550/1251] eta 0:02:53 lr 0.000618 wd 0.0500 time 0.2216 (0.2475) data time 0.0006 (0.0029) model time 0.2210 (0.2241) loss 3.8962 (3.2617) grad_norm 2.0854 (2.3939) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][560/1251] eta 0:02:50 lr 0.000618 wd 0.0500 time 0.2227 (0.2471) data time 0.0007 (0.0028) model time 0.2220 (0.2240) loss 3.9656 (3.2634) grad_norm 2.2212 (2.3946) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][570/1251] eta 0:02:47 lr 0.000618 wd 0.0500 time 0.2237 (0.2467) data time 0.0009 (0.0028) model time 0.2228 (0.2241) loss 3.1666 (3.2631) grad_norm 1.9943 (2.3918) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][580/1251] eta 0:02:45 lr 0.000618 wd 0.0500 time 0.2256 (0.2463) data time 0.0006 (0.0028) model time 0.2251 (0.2240) loss 3.9245 (3.2656) grad_norm 1.8205 (2.3863) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][590/1251] eta 0:02:42 lr 0.000618 wd 0.0500 time 0.2237 (0.2459) 
data time 0.0009 (0.0027) model time 0.2227 (0.2241) loss 2.7227 (3.2666) grad_norm 1.8818 (2.3832) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][600/1251] eta 0:02:39 lr 0.000618 wd 0.0500 time 0.2310 (0.2456) data time 0.0009 (0.0027) model time 0.2301 (0.2240) loss 2.9797 (3.2696) grad_norm 4.6253 (2.3952) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][610/1251] eta 0:02:37 lr 0.000618 wd 0.0500 time 0.2258 (0.2452) data time 0.0007 (0.0027) model time 0.2251 (0.2240) loss 3.6304 (3.2670) grad_norm 2.9037 (2.4040) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][620/1251] eta 0:02:34 lr 0.000618 wd 0.0500 time 0.2243 (0.2449) data time 0.0006 (0.0026) model time 0.2237 (0.2240) loss 3.0723 (3.2670) grad_norm 2.4171 (2.4036) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][630/1251] eta 0:02:31 lr 0.000618 wd 0.0500 time 0.2236 (0.2446) data time 0.0008 (0.0026) model time 0.2228 (0.2240) loss 2.8694 (3.2694) grad_norm 3.5469 (2.4048) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][640/1251] eta 0:02:29 lr 0.000618 wd 0.0500 time 0.2296 (0.2443) data time 0.0007 (0.0026) model time 0.2289 (0.2240) loss 3.8237 (3.2727) grad_norm 1.7939 (2.4071) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][650/1251] eta 0:02:26 lr 0.000618 wd 0.0500 time 0.2269 (0.2439) data time 0.0005 (0.0026) model time 0.2264 (0.2240) loss 3.5101 (3.2685) grad_norm 1.6193 (2.4014) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:11:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [139/300][660/1251] eta 0:02:24 lr 0.000618 wd 0.0500 time 0.2253 (0.2437) data time 0.0008 (0.0025) model time 0.2245 (0.2240) loss 3.2924 (3.2673) grad_norm 2.1956 (2.3992) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 08:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][670/1251] eta 0:02:21 lr 0.000618 wd 0.0500 time 0.2216 (0.2434) data time 0.0006 (0.0025) model time 0.2210 (0.2240) loss 3.6639 (3.2680) grad_norm 2.4093 (2.3987) loss_scale 2048.0000 (1027.0613) mem 7377MB [2024-08-27 08:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 08:11:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:11:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 08:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 08:13:05 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 08:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 08:13:20 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 08:13:21 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-27 08:13:22 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-27 08:13:22 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 139)
[2024-08-27 08:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-27 08:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-27 08:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-27 08:15:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-27 08:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-27 08:15:49 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-27 08:15:50 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 08:15:51 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 08:15:51 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 139) [2024-08-27 08:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 08:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][680/1251] eta 0:29:26 lr 0.000618 wd 0.0500 time 0.2341 (3.0940) data time 0.0009 (0.2275) model time 0.2332 (2.8665) loss 3.5516 (3.6000) grad_norm 3.3476 (2.3101) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][690/1251] eta 0:11:08 lr 0.000618 wd 0.0500 time 0.2360 (1.1913) data time 0.0011 (0.0765) model time 0.2349 (1.1147) loss 3.3550 (3.4770) grad_norm 1.6597 (2.2650) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][700/1251] eta 0:07:26 lr 0.000618 wd 0.0500 time 0.2440 (0.8096) data time 0.0011 (0.0463) model time 0.2429 (0.7632) loss 3.5917 (3.4976) grad_norm 2.2547 (2.3218) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][710/1251] eta 0:05:49 lr 0.000617 wd 0.0500 time 0.2278 (0.6464) data time 0.0014 (0.0334) model time 0.2264 (0.6131) loss 3.3767 (3.4573) grad_norm 2.3563 (2.3356) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][720/1251] eta 0:04:55 lr 0.000617 wd 0.0500 time 0.2361 (0.5562) data time 0.0010 (0.0262) model time 0.2351 (0.5300) loss 3.4692 (3.4257) grad_norm 2.2780 (2.4066) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [139/300][730/1251] eta 0:04:19 lr 0.000617 wd 0.0500 time 0.2395 (0.4986) data time 0.0007 (0.0216) model time 0.2388 (0.4770) loss 2.4547 (3.4158) grad_norm 2.5292 (2.4433) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][740/1251] eta 0:03:54 lr 0.000617 wd 0.0500 time 0.2356 (0.4589) data time 0.0012 (0.0184) model time 0.2344 (0.4404) loss 3.1351 (3.3984) grad_norm 1.7181 (2.4220) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][750/1251] eta 0:03:35 lr 0.000617 wd 0.0500 time 0.2383 (0.4296) data time 0.0011 (0.0161) model time 0.2372 (0.4135) loss 2.6368 (3.3569) grad_norm 1.6391 (2.3932) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][760/1251] eta 0:03:20 lr 0.000617 wd 0.0500 time 0.2381 (0.4074) data time 0.0008 (0.0144) model time 0.2373 (0.3930) loss 3.1990 (3.3319) grad_norm 1.9564 (2.3600) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][770/1251] eta 0:03:07 lr 0.000617 wd 0.0500 time 0.2344 (0.3896) data time 0.0011 (0.0130) model time 0.2333 (0.3766) loss 3.9032 (3.3266) grad_norm 2.1285 (2.3212) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][780/1251] eta 0:02:56 lr 0.000617 wd 0.0500 time 0.2352 (0.3750) data time 0.0010 (0.0118) model time 0.2343 (0.3631) loss 3.5094 (3.3571) grad_norm 2.3680 (2.3655) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][790/1251] eta 0:02:47 lr 0.000617 wd 0.0500 time 0.2334 (0.3630) data time 0.0009 (0.0109) model time 0.2325 (0.3521) loss 2.4194 (3.3416) grad_norm 2.2352 (2.3982) loss_scale 
2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][800/1251] eta 0:02:39 lr 0.000617 wd 0.0500 time 0.2314 (0.3531) data time 0.0008 (0.0101) model time 0.2306 (0.3430) loss 2.9397 (3.3413) grad_norm 3.1027 (2.4557) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][810/1251] eta 0:02:31 lr 0.000617 wd 0.0500 time 0.2304 (0.3445) data time 0.0009 (0.0094) model time 0.2295 (0.3351) loss 3.3019 (3.3452) grad_norm 1.6902 (2.4442) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][820/1251] eta 0:02:25 lr 0.000617 wd 0.0500 time 0.2360 (0.3372) data time 0.0010 (0.0089) model time 0.2350 (0.3283) loss 3.6149 (3.3419) grad_norm 2.0232 (2.4382) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][830/1251] eta 0:02:19 lr 0.000617 wd 0.0500 time 0.2290 (0.3306) data time 0.0010 (0.0083) model time 0.2280 (0.3223) loss 3.8455 (3.3421) grad_norm 2.2422 (2.4235) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][840/1251] eta 0:02:13 lr 0.000617 wd 0.0500 time 0.2318 (0.3251) data time 0.0010 (0.0079) model time 0.2308 (0.3172) loss 3.5847 (3.3424) grad_norm 1.6185 (2.4258) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][850/1251] eta 0:02:08 lr 0.000617 wd 0.0500 time 0.2384 (0.3201) data time 0.0007 (0.0075) model time 0.2376 (0.3126) loss 3.5745 (3.3321) grad_norm 2.6420 (2.4190) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][860/1251] eta 0:02:03 lr 0.000617 wd 0.0500 time 0.2482 (0.3159) data time 
0.0010 (0.0072) model time 0.2472 (0.3088) loss 3.0012 (3.3251) grad_norm 2.1968 (2.3958) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][870/1251] eta 0:01:58 lr 0.000617 wd 0.0500 time 0.2430 (0.3119) data time 0.0012 (0.0068) model time 0.2418 (0.3051) loss 2.1885 (3.3163) grad_norm 1.7466 (2.3877) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][880/1251] eta 0:01:54 lr 0.000617 wd 0.0500 time 0.2505 (0.3083) data time 0.0009 (0.0066) model time 0.2496 (0.3017) loss 2.5593 (3.3018) grad_norm 1.7783 (2.3845) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][890/1251] eta 0:01:50 lr 0.000617 wd 0.0500 time 0.2404 (0.3050) data time 0.0011 (0.0063) model time 0.2392 (0.2987) loss 2.9854 (3.2921) grad_norm 1.7391 (2.3710) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][900/1251] eta 0:01:46 lr 0.000617 wd 0.0500 time 0.2423 (0.3021) data time 0.0007 (0.0061) model time 0.2416 (0.2960) loss 3.2098 (3.2882) grad_norm 2.3884 (2.3784) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][910/1251] eta 0:01:42 lr 0.000617 wd 0.0500 time 0.2408 (0.2993) data time 0.0010 (0.0058) model time 0.2397 (0.2934) loss 3.8480 (3.2789) grad_norm 1.9458 (2.3672) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][920/1251] eta 0:01:38 lr 0.000617 wd 0.0500 time 0.2374 (0.2968) data time 0.0009 (0.0056) model time 0.2365 (0.2911) loss 3.5242 (3.2848) grad_norm 2.1413 (2.3659) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [139/300][930/1251] eta 0:01:34 lr 0.000617 wd 0.0500 time 0.2387 (0.2944) data time 0.0007 (0.0055) model time 0.2380 (0.2889) loss 2.8793 (3.2739) grad_norm 2.7938 (2.3799) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][940/1251] eta 0:01:30 lr 0.000616 wd 0.0500 time 0.2424 (0.2923) data time 0.0007 (0.0053) model time 0.2416 (0.2870) loss 2.3983 (3.2614) grad_norm 2.5970 (2.3818) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][950/1251] eta 0:01:27 lr 0.000616 wd 0.0500 time 0.2353 (0.2904) data time 0.0012 (0.0051) model time 0.2341 (0.2853) loss 3.8360 (3.2675) grad_norm 2.0965 (2.4290) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][960/1251] eta 0:01:23 lr 0.000616 wd 0.0500 time 0.2392 (0.2886) data time 0.0010 (0.0050) model time 0.2382 (0.2836) loss 3.4035 (3.2633) grad_norm 2.9364 (2.4390) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][970/1251] eta 0:01:20 lr 0.000616 wd 0.0500 time 0.2333 (0.2878) data time 0.0011 (0.0049) model time 0.2321 (0.2829) loss 3.6980 (3.2564) grad_norm 2.0655 (2.4383) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][980/1251] eta 0:01:17 lr 0.000616 wd 0.0500 time 0.2415 (0.2862) data time 0.0011 (0.0047) model time 0.2404 (0.2814) loss 2.7484 (3.2471) grad_norm 3.0291 (2.4544) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][990/1251] eta 0:01:14 lr 0.000616 wd 0.0500 time 0.2431 (0.2856) data time 0.0010 (0.0046) model time 0.2421 (0.2809) loss 3.1957 (3.2500) grad_norm 3.1852 (2.4566) 
loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1000/1251] eta 0:01:11 lr 0.000616 wd 0.0500 time 0.2361 (0.2841) data time 0.0011 (0.0045) model time 0.2349 (0.2796) loss 4.1044 (3.2605) grad_norm 2.8323 (2.4490) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1010/1251] eta 0:01:08 lr 0.000616 wd 0.0500 time 0.2358 (0.2827) data time 0.0008 (0.0044) model time 0.2350 (0.2783) loss 3.7116 (3.2588) grad_norm 1.7391 (2.4545) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1020/1251] eta 0:01:05 lr 0.000616 wd 0.0500 time 0.2324 (0.2815) data time 0.0009 (0.0043) model time 0.2315 (0.2772) loss 2.5769 (3.2593) grad_norm 2.8358 (2.4606) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1030/1251] eta 0:01:01 lr 0.000616 wd 0.0500 time 0.2372 (0.2803) data time 0.0012 (0.0042) model time 0.2360 (0.2761) loss 3.0541 (3.2647) grad_norm 2.1049 (2.4529) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1040/1251] eta 0:00:58 lr 0.000616 wd 0.0500 time 0.2378 (0.2792) data time 0.0010 (0.0041) model time 0.2369 (0.2750) loss 3.1524 (3.2615) grad_norm 2.5001 (2.4413) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1050/1251] eta 0:00:55 lr 0.000616 wd 0.0500 time 0.2457 (0.2781) data time 0.0006 (0.0040) model time 0.2450 (0.2741) loss 2.7303 (3.2589) grad_norm 1.6770 (2.4289) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1060/1251] eta 0:00:52 lr 0.000616 wd 0.0500 time 0.2366 
(0.2771) data time 0.0009 (0.0040) model time 0.2357 (0.2731) loss 2.9858 (3.2512) grad_norm 2.1060 (2.4287) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1070/1251] eta 0:00:49 lr 0.000616 wd 0.0500 time 0.2411 (0.2761) data time 0.0012 (0.0039) model time 0.2399 (0.2722) loss 3.9012 (3.2551) grad_norm 3.2545 (2.4276) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1080/1251] eta 0:00:47 lr 0.000616 wd 0.0500 time 0.2491 (0.2753) data time 0.0009 (0.0038) model time 0.2482 (0.2715) loss 3.5355 (3.2606) grad_norm 1.6521 (2.4289) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1090/1251] eta 0:00:44 lr 0.000616 wd 0.0500 time 0.2342 (0.2745) data time 0.0010 (0.0037) model time 0.2332 (0.2707) loss 3.3578 (3.2640) grad_norm 1.9324 (2.4287) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1100/1251] eta 0:00:41 lr 0.000616 wd 0.0500 time 0.2441 (0.2736) data time 0.0007 (0.0037) model time 0.2434 (0.2700) loss 3.6805 (3.2599) grad_norm 1.9736 (2.4254) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1110/1251] eta 0:00:38 lr 0.000616 wd 0.0500 time 0.2404 (0.2728) data time 0.0009 (0.0036) model time 0.2395 (0.2692) loss 3.3634 (3.2669) grad_norm 2.8218 (2.4372) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1120/1251] eta 0:00:35 lr 0.000616 wd 0.0500 time 0.2397 (0.2720) data time 0.0011 (0.0036) model time 0.2387 (0.2685) loss 3.2723 (3.2691) grad_norm 2.0207 (2.4328) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:17:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1130/1251] eta 0:00:32 lr 0.000616 wd 0.0500 time 0.2385 (0.2713) data time 0.0008 (0.0035) model time 0.2376 (0.2678) loss 3.3392 (3.2656) grad_norm 2.5480 (2.4257) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1140/1251] eta 0:00:30 lr 0.000616 wd 0.0500 time 0.2400 (0.2706) data time 0.0008 (0.0035) model time 0.2392 (0.2671) loss 2.8237 (3.2629) grad_norm 2.6418 (2.4272) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1150/1251] eta 0:00:27 lr 0.000616 wd 0.0500 time 0.2309 (0.2699) data time 0.0009 (0.0034) model time 0.2301 (0.2665) loss 2.8935 (3.2585) grad_norm 1.8540 (2.4230) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1160/1251] eta 0:00:24 lr 0.000616 wd 0.0500 time 0.2444 (0.2692) data time 0.0007 (0.0033) model time 0.2437 (0.2659) loss 4.3689 (3.2631) grad_norm 2.2362 (2.4273) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1170/1251] eta 0:00:21 lr 0.000615 wd 0.0500 time 0.2409 (0.2686) data time 0.0008 (0.0033) model time 0.2401 (0.2653) loss 3.0516 (3.2622) grad_norm 1.9001 (2.4233) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1180/1251] eta 0:00:19 lr 0.000615 wd 0.0500 time 0.2438 (0.2680) data time 0.0011 (0.0033) model time 0.2427 (0.2648) loss 3.5513 (3.2610) grad_norm 4.8314 (2.4307) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1190/1251] eta 0:00:16 lr 0.000615 wd 0.0500 time 0.2383 (0.2675) data time 0.0010 (0.0032) model time 0.2373 (0.2643) loss 
3.5166 (3.2703) grad_norm 1.8983 (2.4439) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1200/1251] eta 0:00:13 lr 0.000615 wd 0.0500 time 0.2368 (0.2669) data time 0.0007 (0.0032) model time 0.2361 (0.2638) loss 4.3063 (3.2656) grad_norm 1.7353 (2.4462) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1210/1251] eta 0:00:10 lr 0.000615 wd 0.0500 time 0.2381 (0.2664) data time 0.0007 (0.0031) model time 0.2374 (0.2633) loss 2.8918 (3.2624) grad_norm 2.7618 (2.4455) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1220/1251] eta 0:00:08 lr 0.000615 wd 0.0500 time 0.2415 (0.2659) data time 0.0007 (0.0031) model time 0.2408 (0.2628) loss 2.8501 (3.2627) grad_norm 2.1583 (2.4427) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1230/1251] eta 0:00:05 lr 0.000615 wd 0.0500 time 0.2420 (0.2654) data time 0.0009 (0.0031) model time 0.2411 (0.2623) loss 3.4160 (3.2673) grad_norm 2.7474 (2.4427) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1240/1251] eta 0:00:02 lr 0.000615 wd 0.0500 time 0.2273 (0.2647) data time 0.0007 (0.0030) model time 0.2266 (0.2617) loss 3.3853 (3.2725) grad_norm 1.9510 (2.4352) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [139/300][1250/1251] eta 0:00:00 lr 0.000615 wd 0.0500 time 0.2249 (0.2641) data time 0.0007 (0.0030) model time 0.2242 (0.2611) loss 3.2575 (3.2718) grad_norm 3.5710 (2.4339) loss_scale 2048.0000 (2048.0000) mem 7373MB [2024-08-27 08:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 139 training takes 0:02:31 
[2024-08-27 08:18:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:18:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.425 (0.425) Loss 0.4468 (0.4468) Acc@1 90.723 (90.723) Acc@5 98.047 (98.047) Mem 7373MB [2024-08-27 08:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.107) Loss 0.7485 (0.7329) Acc@1 84.473 (84.091) Acc@5 96.387 (96.751) Mem 7373MB [2024-08-27 08:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.094) Loss 1.0645 (0.7587) Acc@1 74.512 (83.026) Acc@5 93.652 (96.675) Mem 7373MB [2024-08-27 08:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.3330 (0.8626) Acc@1 68.262 (80.570) Acc@5 90.918 (95.479) Mem 7373MB [2024-08-27 08:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.1484 (0.9213) Acc@1 71.191 (79.032) Acc@5 92.773 (94.803) Mem 7373MB [2024-08-27 08:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.636 Acc@5 94.646 [2024-08-27 08:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.6% [2024-08-27 08:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.774 (0.774) Loss 0.4131 (0.4131) Acc@1 92.578 (92.578) Acc@5 98.633 (98.633) Mem 7373MB [2024-08-27 08:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.146) Loss 0.6602 (0.6479) Acc@1 87.012 (86.115) Acc@5 96.973 (97.257) Mem 7373MB [2024-08-27 08:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.115) Loss 0.9204 (0.6724) Acc@1 78.613 (85.161) Acc@5 94.922 (97.275) Mem 7373MB 
[2024-08-27 08:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.102) Loss 1.1660 (0.7627) Acc@1 70.703 (82.872) Acc@5 92.285 (96.305) Mem 7373MB [2024-08-27 08:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0527 (0.8104) Acc@1 73.828 (81.543) Acc@5 93.457 (95.784) Mem 7373MB [2024-08-27 08:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.110 Acc@5 95.762 [2024-08-27 08:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.1% [2024-08-27 08:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.11% [2024-08-27 08:18:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 08:18:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 08:18:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][0/1251] eta 0:14:30 lr 0.000615 wd 0.0500 time 0.6961 (0.6961) data time 0.4400 (0.4400) model time 0.0000 (0.0000) loss 2.6983 (2.6983) grad_norm 1.7453 (1.7453) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 08:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][10/1251] eta 0:05:49 lr 0.000615 wd 0.0500 time 0.2355 (0.2816) data time 0.0011 (0.0409) model time 0.0000 (0.0000) loss 3.5937 (3.2860) grad_norm 1.7265 (2.0226) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:18:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][20/1251] eta 0:05:22 lr 0.000615 wd 0.0500 time 0.2355 (0.2617) data time 0.0007 (0.0219) model time 0.0000 (0.0000) loss 2.4958 (3.2615) grad_norm 1.8603 (2.1154) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][30/1251] eta 
0:05:10 lr 0.000615 wd 0.0500 time 0.2389 (0.2545) data time 0.0007 (0.0152) model time 0.0000 (0.0000) loss 2.8533 (3.2120) grad_norm 2.0594 (2.1846) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:18:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][40/1251] eta 0:05:03 lr 0.000615 wd 0.0500 time 0.2396 (0.2507) data time 0.0012 (0.0117) model time 0.0000 (0.0000) loss 3.1585 (3.2454) grad_norm 2.5237 (2.2687) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][50/1251] eta 0:04:58 lr 0.000615 wd 0.0500 time 0.2347 (0.2487) data time 0.0009 (0.0096) model time 0.0000 (0.0000) loss 2.7614 (3.2884) grad_norm 3.2758 (2.3632) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][60/1251] eta 0:04:54 lr 0.000615 wd 0.0500 time 0.2459 (0.2473) data time 0.0008 (0.0082) model time 0.2451 (0.2391) loss 3.1566 (3.3096) grad_norm 1.7527 (2.4771) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:18:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][70/1251] eta 0:04:50 lr 0.000615 wd 0.0500 time 0.2410 (0.2462) data time 0.0010 (0.0072) model time 0.2400 (0.2388) loss 3.7023 (3.2787) grad_norm 2.6215 (2.4694) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][80/1251] eta 0:04:47 lr 0.000615 wd 0.0500 time 0.2416 (0.2454) data time 0.0009 (0.0064) model time 0.2407 (0.2386) loss 2.5439 (3.2836) grad_norm 2.2931 (2.4108) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][90/1251] eta 0:04:44 lr 0.000615 wd 0.0500 time 0.2472 (0.2448) data time 0.0008 (0.0059) model time 0.2465 (0.2386) loss 3.9880 (3.2644) grad_norm 2.0843 (2.3977) loss_scale 2048.0000 (2048.0000) mem 7379MB 
[2024-08-27 08:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][100/1251] eta 0:04:40 lr 0.000615 wd 0.0500 time 0.2278 (0.2440) data time 0.0008 (0.0054) model time 0.2270 (0.2382) loss 3.2433 (3.2951) grad_norm 1.9183 (2.3827) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][110/1251] eta 0:04:38 lr 0.000615 wd 0.0500 time 0.2374 (0.2438) data time 0.0008 (0.0050) model time 0.2366 (0.2385) loss 2.9222 (3.2925) grad_norm 2.9466 (2.4249) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][120/1251] eta 0:04:35 lr 0.000615 wd 0.0500 time 0.2336 (0.2435) data time 0.0008 (0.0047) model time 0.2328 (0.2386) loss 2.6997 (3.2749) grad_norm 2.7630 (2.4667) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][130/1251] eta 0:04:32 lr 0.000615 wd 0.0500 time 0.2455 (0.2432) data time 0.0007 (0.0044) model time 0.2448 (0.2386) loss 3.7918 (3.2826) grad_norm 1.8397 (2.4608) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][140/1251] eta 0:04:29 lr 0.000615 wd 0.0500 time 0.2351 (0.2429) data time 0.0011 (0.0041) model time 0.2340 (0.2386) loss 3.5155 (3.2557) grad_norm 2.3066 (2.4680) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][150/1251] eta 0:04:27 lr 0.000614 wd 0.0500 time 0.2406 (0.2426) data time 0.0007 (0.0039) model time 0.2399 (0.2385) loss 3.2268 (3.2469) grad_norm 1.5379 (2.4445) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][160/1251] eta 0:04:24 lr 0.000614 wd 0.0500 time 0.2355 (0.2424) data time 0.0007 (0.0037) model time 0.2347 
(0.2384) loss 3.5864 (3.2552) grad_norm 5.0724 (2.4509) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][170/1251] eta 0:04:21 lr 0.000614 wd 0.0500 time 0.2394 (0.2421) data time 0.0007 (0.0036) model time 0.2387 (0.2382) loss 3.6428 (3.2570) grad_norm 1.9420 (2.4526) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][180/1251] eta 0:04:20 lr 0.000614 wd 0.0500 time 0.2388 (0.2431) data time 0.0009 (0.0034) model time 0.2379 (0.2399) loss 3.8224 (3.2497) grad_norm 1.8070 (2.4534) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][190/1251] eta 0:04:17 lr 0.000614 wd 0.0500 time 0.2367 (0.2427) data time 0.0010 (0.0033) model time 0.2357 (0.2395) loss 3.7144 (3.2715) grad_norm 1.9972 (2.4457) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][200/1251] eta 0:04:14 lr 0.000614 wd 0.0500 time 0.2384 (0.2425) data time 0.0010 (0.0032) model time 0.2374 (0.2394) loss 3.3460 (3.2727) grad_norm 2.2207 (2.4444) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][210/1251] eta 0:04:12 lr 0.000614 wd 0.0500 time 0.2376 (0.2425) data time 0.0014 (0.0031) model time 0.2362 (0.2394) loss 3.0576 (3.2696) grad_norm 1.8559 (2.4373) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][220/1251] eta 0:04:09 lr 0.000614 wd 0.0500 time 0.2410 (0.2424) data time 0.0008 (0.0030) model time 0.2402 (0.2395) loss 1.7940 (3.2700) grad_norm 2.0315 (2.4506) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[140/300][230/1251] eta 0:04:07 lr 0.000614 wd 0.0500 time 0.2458 (0.2423) data time 0.0009 (0.0029) model time 0.2449 (0.2395) loss 2.1022 (3.2683) grad_norm 2.1873 (2.4555) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][240/1251] eta 0:04:05 lr 0.000614 wd 0.0500 time 0.2409 (0.2430) data time 0.0008 (0.0028) model time 0.2400 (0.2405) loss 4.0780 (3.2603) grad_norm 2.2986 (2.4424) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][250/1251] eta 0:04:03 lr 0.000614 wd 0.0500 time 0.2425 (0.2429) data time 0.0009 (0.0028) model time 0.2416 (0.2404) loss 2.4025 (3.2494) grad_norm 1.9025 (2.4353) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][260/1251] eta 0:04:00 lr 0.000614 wd 0.0500 time 0.2464 (0.2428) data time 0.0006 (0.0027) model time 0.2457 (0.2404) loss 4.2416 (3.2511) grad_norm 2.9242 (2.4269) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][270/1251] eta 0:03:58 lr 0.000614 wd 0.0500 time 0.2385 (0.2427) data time 0.0008 (0.0026) model time 0.2377 (0.2402) loss 4.0584 (3.2465) grad_norm 2.2325 (2.4457) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][280/1251] eta 0:03:55 lr 0.000614 wd 0.0500 time 0.2430 (0.2426) data time 0.0012 (0.0026) model time 0.2418 (0.2402) loss 3.8985 (3.2492) grad_norm 2.7025 (2.4579) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][290/1251] eta 0:03:52 lr 0.000614 wd 0.0500 time 0.2419 (0.2424) data time 0.0010 (0.0025) model time 0.2409 (0.2401) loss 3.9021 (3.2479) grad_norm 1.8614 (2.4440) loss_scale 2048.0000 
(2048.0000) mem 7379MB [2024-08-27 08:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][300/1251] eta 0:03:50 lr 0.000614 wd 0.0500 time 0.2446 (0.2424) data time 0.0011 (0.0025) model time 0.2434 (0.2401) loss 3.0691 (3.2496) grad_norm 2.5119 (2.4404) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][310/1251] eta 0:03:48 lr 0.000614 wd 0.0500 time 0.2377 (0.2424) data time 0.0009 (0.0024) model time 0.2368 (0.2401) loss 3.8355 (3.2451) grad_norm 1.7711 (2.4323) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][320/1251] eta 0:03:45 lr 0.000614 wd 0.0500 time 0.2448 (0.2423) data time 0.0009 (0.0024) model time 0.2439 (0.2401) loss 3.8321 (3.2468) grad_norm 1.9155 (2.4317) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][330/1251] eta 0:03:43 lr 0.000614 wd 0.0500 time 0.2360 (0.2423) data time 0.0012 (0.0024) model time 0.2348 (0.2400) loss 3.6539 (3.2505) grad_norm 2.8880 (2.4783) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][340/1251] eta 0:03:40 lr 0.000614 wd 0.0500 time 0.2433 (0.2422) data time 0.0008 (0.0023) model time 0.2425 (0.2400) loss 3.7665 (3.2485) grad_norm 2.6360 (2.4750) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][350/1251] eta 0:03:38 lr 0.000614 wd 0.0500 time 0.2399 (0.2421) data time 0.0011 (0.0023) model time 0.2388 (0.2400) loss 3.1217 (3.2559) grad_norm 1.7550 (2.4715) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][360/1251] eta 0:03:35 lr 0.000614 wd 0.0500 time 0.2443 (0.2422) data time 0.0008 
(0.0023) model time 0.2435 (0.2400) loss 2.2614 (3.2560) grad_norm 2.8188 (2.4695) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][370/1251] eta 0:03:33 lr 0.000614 wd 0.0500 time 0.2337 (0.2421) data time 0.0010 (0.0022) model time 0.2327 (0.2400) loss 3.6458 (3.2536) grad_norm 2.0548 (2.4627) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][380/1251] eta 0:03:30 lr 0.000614 wd 0.0500 time 0.2441 (0.2422) data time 0.0009 (0.0022) model time 0.2432 (0.2401) loss 3.8583 (3.2502) grad_norm 2.0456 (2.4581) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][390/1251] eta 0:03:28 lr 0.000613 wd 0.0500 time 0.2363 (0.2421) data time 0.0011 (0.0022) model time 0.2351 (0.2401) loss 3.6375 (3.2543) grad_norm 2.8078 (2.4733) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][400/1251] eta 0:03:25 lr 0.000613 wd 0.0500 time 0.2368 (0.2421) data time 0.0010 (0.0021) model time 0.2358 (0.2400) loss 3.3412 (3.2621) grad_norm 2.1031 (2.4669) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][410/1251] eta 0:03:23 lr 0.000613 wd 0.0500 time 0.2436 (0.2420) data time 0.0010 (0.0021) model time 0.2425 (0.2400) loss 2.6442 (3.2581) grad_norm 2.1245 (2.4709) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][420/1251] eta 0:03:20 lr 0.000613 wd 0.0500 time 0.2365 (0.2419) data time 0.0010 (0.0021) model time 0.2355 (0.2399) loss 3.3098 (3.2621) grad_norm 2.0194 (2.4773) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [140/300][430/1251] eta 0:03:18 lr 0.000613 wd 0.0500 time 0.2470 (0.2418) data time 0.0010 (0.0021) model time 0.2460 (0.2398) loss 3.7110 (3.2624) grad_norm 1.8602 (2.4740) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][440/1251] eta 0:03:16 lr 0.000613 wd 0.0500 time 0.2443 (0.2418) data time 0.0009 (0.0020) model time 0.2435 (0.2398) loss 2.8923 (3.2676) grad_norm 2.7471 (2.4751) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][450/1251] eta 0:03:13 lr 0.000613 wd 0.0500 time 0.2377 (0.2417) data time 0.0011 (0.0020) model time 0.2366 (0.2398) loss 3.0939 (3.2591) grad_norm 2.3867 (2.4880) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][460/1251] eta 0:03:11 lr 0.000613 wd 0.0500 time 0.2347 (0.2417) data time 0.0008 (0.0020) model time 0.2339 (0.2398) loss 3.8967 (3.2582) grad_norm 1.9155 (2.4798) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][470/1251] eta 0:03:08 lr 0.000613 wd 0.0500 time 0.2338 (0.2417) data time 0.0010 (0.0020) model time 0.2329 (0.2398) loss 3.4512 (3.2620) grad_norm 3.0872 (2.4748) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][480/1251] eta 0:03:06 lr 0.000613 wd 0.0500 time 0.2364 (0.2417) data time 0.0011 (0.0019) model time 0.2353 (0.2398) loss 3.5881 (3.2636) grad_norm 1.7329 (2.4712) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][490/1251] eta 0:03:03 lr 0.000613 wd 0.0500 time 0.2396 (0.2416) data time 0.0009 (0.0019) model time 0.2387 (0.2398) loss 2.4649 (3.2637) grad_norm 2.0425 (2.4733) loss_scale 
2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][500/1251] eta 0:03:01 lr 0.000613 wd 0.0500 time 0.2387 (0.2415) data time 0.0009 (0.0019) model time 0.2378 (0.2397) loss 3.1240 (3.2588) grad_norm 2.2109 (2.4673) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][510/1251] eta 0:02:58 lr 0.000613 wd 0.0500 time 0.2416 (0.2415) data time 0.0009 (0.0019) model time 0.2407 (0.2396) loss 3.1717 (3.2600) grad_norm 1.9546 (2.4648) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][520/1251] eta 0:02:56 lr 0.000613 wd 0.0500 time 0.2414 (0.2414) data time 0.0008 (0.0019) model time 0.2406 (0.2396) loss 3.6621 (3.2588) grad_norm 2.9980 (2.4617) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][530/1251] eta 0:02:54 lr 0.000613 wd 0.0500 time 0.2460 (0.2414) data time 0.0010 (0.0019) model time 0.2450 (0.2396) loss 3.4857 (3.2585) grad_norm 1.5672 (2.4562) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][540/1251] eta 0:02:51 lr 0.000613 wd 0.0500 time 0.2350 (0.2413) data time 0.0011 (0.0018) model time 0.2339 (0.2395) loss 3.3329 (3.2588) grad_norm 1.8458 (2.4563) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][550/1251] eta 0:02:49 lr 0.000613 wd 0.0500 time 0.2388 (0.2413) data time 0.0009 (0.0018) model time 0.2378 (0.2395) loss 3.6056 (3.2613) grad_norm 2.0026 (2.4569) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][560/1251] eta 0:02:46 lr 0.000613 wd 0.0500 time 0.2366 (0.2413) data time 
0.0009 (0.0018) model time 0.2356 (0.2395) loss 3.6705 (3.2608) grad_norm 2.1599 (2.4580) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][570/1251] eta 0:02:44 lr 0.000613 wd 0.0500 time 0.2344 (0.2412) data time 0.0007 (0.0018) model time 0.2337 (0.2394) loss 3.2924 (3.2616) grad_norm 1.8548 (2.4570) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][580/1251] eta 0:02:41 lr 0.000613 wd 0.0500 time 0.2445 (0.2412) data time 0.0007 (0.0018) model time 0.2438 (0.2394) loss 3.9985 (3.2632) grad_norm 2.3961 (2.4544) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][590/1251] eta 0:02:39 lr 0.000613 wd 0.0500 time 0.2446 (0.2411) data time 0.0007 (0.0018) model time 0.2439 (0.2394) loss 3.9637 (3.2598) grad_norm 2.3917 (2.4497) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][600/1251] eta 0:02:36 lr 0.000613 wd 0.0500 time 0.2404 (0.2411) data time 0.0010 (0.0018) model time 0.2394 (0.2394) loss 3.6244 (3.2667) grad_norm 1.6203 (2.4478) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][610/1251] eta 0:02:34 lr 0.000613 wd 0.0500 time 0.2386 (0.2411) data time 0.0009 (0.0017) model time 0.2377 (0.2394) loss 3.2252 (3.2715) grad_norm 2.0495 (2.4491) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][620/1251] eta 0:02:32 lr 0.000612 wd 0.0500 time 0.2503 (0.2411) data time 0.0009 (0.0017) model time 0.2494 (0.2394) loss 3.4253 (3.2719) grad_norm 2.0849 (2.4500) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [140/300][630/1251] eta 0:02:29 lr 0.000612 wd 0.0500 time 0.2433 (0.2411) data time 0.0007 (0.0017) model time 0.2425 (0.2394) loss 3.4261 (3.2748) grad_norm 2.0987 (2.4499) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][640/1251] eta 0:02:27 lr 0.000612 wd 0.0500 time 0.2390 (0.2411) data time 0.0007 (0.0017) model time 0.2382 (0.2394) loss 2.5113 (3.2751) grad_norm 1.5939 (2.4441) loss_scale 2048.0000 (2048.0000) mem 7379MB [2024-08-27 08:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][650/1251] eta 0:02:24 lr 0.000612 wd 0.0500 time 0.2407 (0.2410) data time 0.0007 (0.0017) model time 0.2400 (0.2394) loss 2.8860 (3.2731) grad_norm 2.0717 (inf) loss_scale 1024.0000 (2038.5622) mem 7379MB [2024-08-27 08:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][660/1251] eta 0:02:22 lr 0.000612 wd 0.0500 time 0.2500 (0.2410) data time 0.0007 (0.0017) model time 0.2493 (0.2393) loss 2.2591 (3.2708) grad_norm 2.0187 (inf) loss_scale 1024.0000 (2023.2133) mem 7379MB [2024-08-27 08:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 08:21:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:21:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 08:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 08:22:58 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 08:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 08:23:07 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 08:23:08 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 08:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 08:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 08:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 08:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 08:27:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 08:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 08:27:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 08:27:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 08:27:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 08:27:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 140) [2024-08-27 08:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 08:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][670/1251] eta 0:20:32 lr 0.000612 wd 0.0500 time 0.2353 (2.1221) data time 0.0012 (0.0779) model time 0.2341 (2.0443) loss 3.2946 (3.8267) grad_norm 2.6321 (2.5774) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][680/1251] eta 0:09:38 lr 0.000612 wd 0.0500 time 0.2343 (1.0137) data time 0.0010 (0.0327) model time 0.2332 (0.9810) loss 2.8900 (3.4662) grad_norm 1.6998 (2.3195) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][690/1251] eta 0:06:47 lr 0.000612 wd 0.0500 time 0.2393 (0.7269) data time 0.0010 (0.0211) model time 0.2383 (0.7058) loss 3.6779 (3.5259) grad_norm 1.6243 (2.3201) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][700/1251] eta 0:05:28 lr 0.000612 wd 0.0500 time 0.2418 (0.5956) data time 0.0011 (0.0157) model time 0.2406 (0.5800) loss 3.1632 (3.4601) grad_norm 3.1325 (2.4100) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][710/1251] eta 0:04:41 lr 0.000612 wd 0.0500 time 0.2397 (0.5202) data time 0.0008 (0.0126) model time 0.2389 (0.5076) loss 3.9501 (3.4349) grad_norm 2.1108 (2.4059) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [140/300][720/1251] eta 0:04:09 lr 0.000612 wd 0.0500 time 0.2375 (0.4708) data time 0.0010 (0.0106) model time 0.2365 (0.4602) loss 3.0764 (3.4136) grad_norm 2.4705 (2.3771) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][730/1251] eta 0:03:47 lr 0.000612 wd 0.0500 time 0.2434 (0.4358) data time 0.0011 (0.0092) model time 0.2423 (0.4266) loss 3.1051 (3.3771) grad_norm 2.4111 (2.3722) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][740/1251] eta 0:03:29 lr 0.000612 wd 0.0500 time 0.2279 (0.4101) data time 0.0013 (0.0081) model time 0.2266 (0.4020) loss 3.4624 (3.3557) grad_norm 1.6285 (2.3657) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][750/1251] eta 0:03:15 lr 0.000612 wd 0.0500 time 0.2408 (0.3904) data time 0.0012 (0.0073) model time 0.2397 (0.3831) loss 3.0739 (3.3145) grad_norm 1.7262 (2.3420) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][760/1251] eta 0:03:04 lr 0.000612 wd 0.0500 time 0.2339 (0.3748) data time 0.0010 (0.0067) model time 0.2329 (0.3681) loss 3.4129 (3.3204) grad_norm 1.8502 (2.3451) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][770/1251] eta 0:02:54 lr 0.000612 wd 0.0500 time 0.2327 (0.3620) data time 0.0009 (0.0062) model time 0.2318 (0.3558) loss 3.1849 (3.3520) grad_norm 2.1099 (2.3388) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][780/1251] eta 0:02:45 lr 0.000612 wd 0.0500 time 0.2399 (0.3515) data time 0.0012 (0.0057) model time 0.2387 (0.3457) loss 3.5673 (3.3414) grad_norm 2.1455 (2.3210) loss_scale 
1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][790/1251] eta 0:02:37 lr 0.000612 wd 0.0500 time 0.2357 (0.3425) data time 0.0012 (0.0054) model time 0.2346 (0.3371) loss 3.5636 (3.3288) grad_norm 1.6207 (2.3187) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][800/1251] eta 0:02:31 lr 0.000612 wd 0.0500 time 0.2377 (0.3349) data time 0.0009 (0.0051) model time 0.2367 (0.3298) loss 2.6356 (3.3328) grad_norm 2.3376 (2.3387) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][810/1251] eta 0:02:24 lr 0.000612 wd 0.0500 time 0.2404 (0.3284) data time 0.0011 (0.0048) model time 0.2393 (0.3236) loss 3.4394 (3.3188) grad_norm 2.2037 (2.3676) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][820/1251] eta 0:02:19 lr 0.000612 wd 0.0500 time 0.2404 (0.3226) data time 0.0011 (0.0046) model time 0.2392 (0.3181) loss 2.9331 (3.3159) grad_norm 3.0636 (2.3782) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][830/1251] eta 0:02:13 lr 0.000612 wd 0.0500 time 0.2377 (0.3176) data time 0.0012 (0.0044) model time 0.2365 (0.3133) loss 3.3898 (3.3176) grad_norm 2.2129 (2.3901) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][840/1251] eta 0:02:08 lr 0.000612 wd 0.0500 time 0.2376 (0.3131) data time 0.0012 (0.0042) model time 0.2364 (0.3089) loss 3.5418 (3.3073) grad_norm 2.4499 (2.4240) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][850/1251] eta 0:02:04 lr 0.000611 wd 0.0500 time 0.2339 (0.3092) data time 
0.0009 (0.0040) model time 0.2330 (0.3052) loss 3.1647 (3.2994) grad_norm 1.8368 (2.4254) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][860/1251] eta 0:01:59 lr 0.000611 wd 0.0500 time 0.2365 (0.3056) data time 0.0008 (0.0039) model time 0.2356 (0.3017) loss 3.8211 (3.2958) grad_norm 2.6662 (2.4244) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][870/1251] eta 0:01:55 lr 0.000611 wd 0.0500 time 0.2297 (0.3024) data time 0.0009 (0.0037) model time 0.2289 (0.2986) loss 3.2147 (3.2797) grad_norm 2.6329 (2.4203) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][880/1251] eta 0:01:51 lr 0.000611 wd 0.0500 time 0.2452 (0.2996) data time 0.0008 (0.0036) model time 0.2443 (0.2960) loss 3.3787 (3.2718) grad_norm 2.9008 (2.4129) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][890/1251] eta 0:01:47 lr 0.000611 wd 0.0500 time 0.2423 (0.2969) data time 0.0011 (0.0035) model time 0.2413 (0.2934) loss 3.4081 (3.2757) grad_norm 2.7425 (2.4103) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][900/1251] eta 0:01:43 lr 0.000611 wd 0.0500 time 0.2364 (0.2944) data time 0.0010 (0.0034) model time 0.2354 (0.2910) loss 3.1512 (3.2664) grad_norm 1.6098 (2.3975) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][910/1251] eta 0:01:39 lr 0.000611 wd 0.0500 time 0.2407 (0.2922) data time 0.0011 (0.0033) model time 0.2396 (0.2889) loss 3.4756 (3.2646) grad_norm 5.1595 (2.3945) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [140/300][920/1251] eta 0:01:35 lr 0.000611 wd 0.0500 time 0.2280 (0.2899) data time 0.0009 (0.0032) model time 0.2270 (0.2867) loss 2.4559 (3.2536) grad_norm 1.7562 (2.3987) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][930/1251] eta 0:01:32 lr 0.000611 wd 0.0500 time 0.2467 (0.2880) data time 0.0009 (0.0031) model time 0.2458 (0.2849) loss 2.8533 (3.2477) grad_norm 2.6556 (2.4032) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][940/1251] eta 0:01:29 lr 0.000611 wd 0.0500 time 0.2417 (0.2862) data time 0.0010 (0.0031) model time 0.2407 (0.2831) loss 3.5458 (3.2562) grad_norm 2.5208 (2.4309) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][950/1251] eta 0:01:25 lr 0.000611 wd 0.0500 time 0.4678 (0.2853) data time 0.0008 (0.0030) model time 0.4670 (0.2823) loss 3.8897 (3.2477) grad_norm 2.2953 (2.4232) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][960/1251] eta 0:01:22 lr 0.000611 wd 0.0500 time 0.2398 (0.2838) data time 0.0013 (0.0030) model time 0.2386 (0.2808) loss 2.0777 (3.2375) grad_norm 2.6362 (2.4282) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][970/1251] eta 0:01:19 lr 0.000611 wd 0.0500 time 0.4945 (0.2830) data time 0.0008 (0.0029) model time 0.4937 (0.2801) loss 4.0635 (3.2354) grad_norm 2.4135 (2.4200) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][980/1251] eta 0:01:16 lr 0.000611 wd 0.0500 time 0.2399 (0.2816) data time 0.0011 (0.0028) model time 0.2388 (0.2788) loss 3.2932 (3.2405) grad_norm 2.3748 (2.4083) 
loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][990/1251] eta 0:01:13 lr 0.000611 wd 0.0500 time 0.2372 (0.2803) data time 0.0008 (0.0028) model time 0.2364 (0.2775) loss 2.3443 (3.2447) grad_norm 2.5948 (2.4341) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1000/1251] eta 0:01:10 lr 0.000611 wd 0.0500 time 0.2411 (0.2791) data time 0.0010 (0.0027) model time 0.2401 (0.2763) loss 2.9132 (3.2443) grad_norm 2.0450 (2.4420) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1010/1251] eta 0:01:06 lr 0.000611 wd 0.0500 time 0.2432 (0.2779) data time 0.0008 (0.0027) model time 0.2425 (0.2752) loss 3.0530 (3.2479) grad_norm 1.6738 (2.4347) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1020/1251] eta 0:01:03 lr 0.000611 wd 0.0500 time 0.2474 (0.2768) data time 0.0010 (0.0026) model time 0.2464 (0.2741) loss 2.6554 (3.2510) grad_norm 2.9596 (2.4288) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1030/1251] eta 0:01:00 lr 0.000611 wd 0.0500 time 0.2338 (0.2758) data time 0.0010 (0.0026) model time 0.2329 (0.2732) loss 3.0100 (3.2494) grad_norm 2.6310 (2.4243) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1040/1251] eta 0:00:57 lr 0.000611 wd 0.0500 time 0.2367 (0.2748) data time 0.0010 (0.0026) model time 0.2357 (0.2723) loss 3.2831 (3.2460) grad_norm 1.3708 (2.4194) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1050/1251] eta 0:00:55 lr 0.000611 wd 0.0500 time 0.2401 
(0.2739) data time 0.0008 (0.0025) model time 0.2393 (0.2714) loss 3.5745 (3.2437) grad_norm 2.5125 (2.4121) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1060/1251] eta 0:00:52 lr 0.000611 wd 0.0500 time 0.2335 (0.2729) data time 0.0009 (0.0025) model time 0.2327 (0.2705) loss 3.0831 (3.2467) grad_norm 2.5240 (2.4093) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1070/1251] eta 0:00:49 lr 0.000611 wd 0.0500 time 0.2303 (0.2721) data time 0.0010 (0.0025) model time 0.2293 (0.2696) loss 3.0225 (3.2519) grad_norm 2.6001 (2.4103) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1080/1251] eta 0:00:46 lr 0.000610 wd 0.0500 time 0.2387 (0.2714) data time 0.0009 (0.0024) model time 0.2378 (0.2689) loss 2.1888 (3.2531) grad_norm 2.0852 (2.4046) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1090/1251] eta 0:00:43 lr 0.000610 wd 0.0500 time 0.2432 (0.2706) data time 0.0011 (0.0024) model time 0.2421 (0.2682) loss 3.7736 (3.2570) grad_norm 1.8736 (2.3982) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1100/1251] eta 0:00:40 lr 0.000610 wd 0.0500 time 0.2506 (0.2699) data time 0.0012 (0.0024) model time 0.2494 (0.2675) loss 3.4316 (3.2625) grad_norm 1.8855 (2.3958) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1110/1251] eta 0:00:37 lr 0.000610 wd 0.0500 time 0.2426 (0.2692) data time 0.0012 (0.0023) model time 0.2414 (0.2668) loss 2.7927 (3.2629) grad_norm 1.6095 (2.4001) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:47 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1120/1251] eta 0:00:35 lr 0.000610 wd 0.0500 time 0.2347 (0.2685) data time 0.0010 (0.0023) model time 0.2337 (0.2662) loss 2.1149 (3.2588) grad_norm 3.7212 (2.3986) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1130/1251] eta 0:00:32 lr 0.000610 wd 0.0500 time 0.2392 (0.2678) data time 0.0011 (0.0023) model time 0.2381 (0.2655) loss 2.8485 (3.2528) grad_norm 2.4539 (2.4030) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1140/1251] eta 0:00:29 lr 0.000610 wd 0.0500 time 0.2281 (0.2672) data time 0.0010 (0.0023) model time 0.2271 (0.2649) loss 3.9964 (3.2515) grad_norm 2.0420 (2.3995) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1150/1251] eta 0:00:26 lr 0.000610 wd 0.0500 time 0.2384 (0.2666) data time 0.0009 (0.0022) model time 0.2375 (0.2643) loss 3.0806 (3.2545) grad_norm 2.5440 (2.4010) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1160/1251] eta 0:00:24 lr 0.000610 wd 0.0500 time 0.2382 (0.2660) data time 0.0009 (0.0022) model time 0.2373 (0.2637) loss 3.2896 (3.2536) grad_norm 2.3189 (2.4002) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1170/1251] eta 0:00:21 lr 0.000610 wd 0.0500 time 0.2371 (0.2654) data time 0.0012 (0.0022) model time 0.2359 (0.2632) loss 3.3955 (3.2543) grad_norm 2.5645 (2.3980) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1180/1251] eta 0:00:18 lr 0.000610 wd 0.0500 time 0.2348 (0.2649) data time 0.0010 (0.0022) model time 0.2338 (0.2627) loss 
3.4187 (3.2616) grad_norm 1.9983 (2.4016) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1190/1251] eta 0:00:16 lr 0.000610 wd 0.0500 time 0.2371 (0.2644) data time 0.0012 (0.0022) model time 0.2359 (0.2623) loss 3.3899 (3.2560) grad_norm 2.4184 (2.4075) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1200/1251] eta 0:00:13 lr 0.000610 wd 0.0500 time 0.2333 (0.2639) data time 0.0007 (0.0021) model time 0.2326 (0.2618) loss 2.3688 (3.2526) grad_norm 1.9803 (2.4069) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1210/1251] eta 0:00:10 lr 0.000610 wd 0.0500 time 0.2431 (0.2635) data time 0.0008 (0.0021) model time 0.2424 (0.2614) loss 3.5542 (3.2548) grad_norm 1.6533 (2.4000) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1220/1251] eta 0:00:08 lr 0.000610 wd 0.0500 time 0.2308 (0.2631) data time 0.0009 (0.0021) model time 0.2299 (0.2610) loss 2.0553 (3.2611) grad_norm 2.0927 (2.3949) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1230/1251] eta 0:00:05 lr 0.000610 wd 0.0500 time 0.2340 (0.2626) data time 0.0010 (0.0021) model time 0.2330 (0.2605) loss 3.7058 (3.2636) grad_norm 3.1814 (2.4047) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1240/1251] eta 0:00:02 lr 0.000610 wd 0.0500 time 0.2235 (0.2621) data time 0.0005 (0.0021) model time 0.2230 (0.2600) loss 4.0505 (3.2655) grad_norm 4.0085 (2.4125) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [140/300][1250/1251] eta 
0:00:00 lr 0.000610 wd 0.0500 time 0.2232 (0.2615) data time 0.0006 (0.0020) model time 0.2227 (0.2594) loss 3.7331 (3.2685) grad_norm 3.3687 (2.4228) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 08:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 140 training takes 0:02:33 [2024-08-27 08:30:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:30:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.451 (0.451) Loss 0.4509 (0.4509) Acc@1 91.699 (91.699) Acc@5 97.852 (97.852) Mem 7380MB [2024-08-27 08:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.110) Loss 0.7354 (0.7277) Acc@1 84.961 (84.499) Acc@5 96.777 (96.751) Mem 7380MB [2024-08-27 08:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.093) Loss 1.0732 (0.7556) Acc@1 75.684 (83.282) Acc@5 93.555 (96.698) Mem 7380MB [2024-08-27 08:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.088) Loss 1.2842 (0.8586) Acc@1 69.922 (80.903) Acc@5 90.430 (95.476) Mem 7380MB [2024-08-27 08:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 1.1611 (0.9136) Acc@1 72.070 (79.466) Acc@5 92.285 (94.891) Mem 7380MB [2024-08-27 08:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.004 Acc@5 94.802 [2024-08-27 08:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.0% [2024-08-27 08:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.00% [2024-08-27 08:30:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-08-27 08:30:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 08:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.437 (0.437) Loss 0.4106 (0.4106) Acc@1 92.578 (92.578) Acc@5 98.633 (98.633) Mem 7380MB [2024-08-27 08:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.108) Loss 0.6587 (0.6470) Acc@1 86.914 (86.133) Acc@5 97.168 (97.292) Mem 7380MB [2024-08-27 08:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.092) Loss 0.9194 (0.6712) Acc@1 79.004 (85.180) Acc@5 95.020 (97.298) Mem 7380MB [2024-08-27 08:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.087) Loss 1.1641 (0.7617) Acc@1 70.801 (82.894) Acc@5 92.090 (96.314) Mem 7380MB [2024-08-27 08:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.081) Loss 1.0537 (0.8095) Acc@1 74.609 (81.576) Acc@5 93.262 (95.794) Mem 7380MB [2024-08-27 08:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.160 Acc@5 95.760 [2024-08-27 08:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-27 08:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.16% [2024-08-27 08:30:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 08:30:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 08:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][0/1251] eta 0:17:00 lr 0.000610 wd 0.0500 time 0.8160 (0.8160) data time 0.4593 (0.4593) model time 0.0000 (0.0000) loss 3.2539 (3.2539) grad_norm 2.9023 (2.9023) loss_scale 1024.0000 (1024.0000) mem 7383MB [2024-08-27 08:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][10/1251] eta 0:06:00 lr 0.000610 wd 0.0500 time 0.2305 (0.2909) data time 0.0010 (0.0428) model time 0.0000 (0.0000) loss 2.8114 (3.2205) grad_norm 3.3837 (2.6296) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][20/1251] eta 0:05:27 lr 0.000610 wd 0.0500 time 0.2374 (0.2659) data time 0.0011 (0.0230) model time 0.0000 (0.0000) loss 3.0593 (3.2074) grad_norm 4.0335 (2.7721) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][30/1251] eta 0:05:13 lr 0.000610 wd 0.0500 time 0.2325 (0.2564) data time 0.0013 (0.0159) model time 0.0000 (0.0000) loss 3.0033 (3.2642) grad_norm 1.5491 (2.5909) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][40/1251] eta 0:05:05 lr 0.000610 wd 0.0500 time 0.2375 (0.2525) data time 0.0008 (0.0123) model time 0.0000 (0.0000) loss 2.7989 (3.2846) grad_norm 2.1035 (2.4935) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][50/1251] eta 0:04:59 lr 0.000610 wd 0.0500 time 0.2376 (0.2492) data time 0.0010 (0.0101) model time 0.0000 (0.0000) loss 3.6225 (3.2998) grad_norm 2.0743 (2.4530) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][60/1251] eta 0:04:54 lr 0.000609 wd 0.0500 time 0.2338 (0.2471) data time 0.0013 (0.0086) model time 0.2325 
(0.2352) loss 3.1371 (3.2590) grad_norm 2.6705 (2.4854) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][70/1251] eta 0:04:50 lr 0.000609 wd 0.0500 time 0.2374 (0.2458) data time 0.0008 (0.0076) model time 0.2365 (0.2360) loss 3.0165 (3.2450) grad_norm 5.4077 (2.5679) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][80/1251] eta 0:04:46 lr 0.000609 wd 0.0500 time 0.2335 (0.2447) data time 0.0008 (0.0068) model time 0.2328 (0.2359) loss 3.7792 (3.2506) grad_norm 2.0587 (2.6092) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][90/1251] eta 0:04:43 lr 0.000609 wd 0.0500 time 0.2426 (0.2442) data time 0.0010 (0.0061) model time 0.2417 (0.2368) loss 3.4943 (3.2679) grad_norm 2.1617 (2.5936) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][100/1251] eta 0:04:40 lr 0.000609 wd 0.0500 time 0.2391 (0.2435) data time 0.0008 (0.0056) model time 0.2383 (0.2365) loss 2.7868 (3.2781) grad_norm 2.0476 (2.5584) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][110/1251] eta 0:04:37 lr 0.000609 wd 0.0500 time 0.2382 (0.2431) data time 0.0008 (0.0052) model time 0.2374 (0.2368) loss 3.3263 (3.2485) grad_norm 2.4847 (2.5185) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][120/1251] eta 0:04:34 lr 0.000609 wd 0.0500 time 0.2346 (0.2428) data time 0.0011 (0.0049) model time 0.2335 (0.2370) loss 3.2165 (3.2457) grad_norm 2.6592 (2.5881) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][130/1251] 
eta 0:04:31 lr 0.000609 wd 0.0500 time 0.2538 (0.2425) data time 0.0009 (0.0046) model time 0.2528 (0.2371) loss 4.0829 (3.2309) grad_norm 1.4430 (2.5425) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][140/1251] eta 0:04:29 lr 0.000609 wd 0.0500 time 0.2540 (0.2422) data time 0.0010 (0.0044) model time 0.2530 (0.2372) loss 3.1375 (3.2243) grad_norm 2.0311 (2.5253) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][150/1251] eta 0:04:28 lr 0.000609 wd 0.0500 time 0.2399 (0.2434) data time 0.0011 (0.0042) model time 0.2387 (0.2394) loss 3.2847 (3.2460) grad_norm 2.4121 (2.5147) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][160/1251] eta 0:04:25 lr 0.000609 wd 0.0500 time 0.2330 (0.2432) data time 0.0008 (0.0040) model time 0.2322 (0.2393) loss 1.9008 (3.2409) grad_norm 3.8958 (2.5388) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][170/1251] eta 0:04:22 lr 0.000609 wd 0.0500 time 0.2359 (0.2429) data time 0.0011 (0.0038) model time 0.2348 (0.2391) loss 4.1898 (3.2377) grad_norm 2.1554 (2.5624) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][180/1251] eta 0:04:19 lr 0.000609 wd 0.0500 time 0.2356 (0.2427) data time 0.0010 (0.0036) model time 0.2346 (0.2390) loss 3.2180 (3.2439) grad_norm 1.9076 (2.5380) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][190/1251] eta 0:04:17 lr 0.000609 wd 0.0500 time 0.2469 (0.2425) data time 0.0009 (0.0035) model time 0.2460 (0.2389) loss 4.0698 (3.2522) grad_norm 2.4285 (2.5183) loss_scale 1024.0000 (1024.0000) mem 7382MB 
[2024-08-27 08:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][200/1251] eta 0:04:14 lr 0.000609 wd 0.0500 time 0.2449 (0.2422) data time 0.0008 (0.0034) model time 0.2442 (0.2388) loss 3.4651 (3.2494) grad_norm 2.4256 (2.5002) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][210/1251] eta 0:04:11 lr 0.000609 wd 0.0500 time 0.2376 (0.2420) data time 0.0008 (0.0033) model time 0.2368 (0.2386) loss 3.3760 (3.2478) grad_norm 1.9505 (2.4756) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][220/1251] eta 0:04:09 lr 0.000609 wd 0.0500 time 0.2387 (0.2419) data time 0.0010 (0.0032) model time 0.2377 (0.2385) loss 3.6268 (3.2374) grad_norm 1.6297 (2.4605) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][230/1251] eta 0:04:07 lr 0.000609 wd 0.0500 time 0.2446 (0.2425) data time 0.0010 (0.0031) model time 0.2436 (0.2395) loss 3.4618 (3.2319) grad_norm 2.2709 (2.4553) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][240/1251] eta 0:04:05 lr 0.000609 wd 0.0500 time 0.2403 (0.2424) data time 0.0010 (0.0030) model time 0.2393 (0.2394) loss 3.6333 (3.2183) grad_norm 2.2426 (2.4517) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][250/1251] eta 0:04:02 lr 0.000609 wd 0.0500 time 0.2348 (0.2421) data time 0.0009 (0.0029) model time 0.2339 (0.2392) loss 4.3970 (3.2163) grad_norm 2.2265 (2.4435) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][260/1251] eta 0:03:59 lr 0.000609 wd 0.0500 time 0.2346 (0.2420) data time 0.0008 (0.0029) model time 0.2338 
(0.2392) loss 1.7833 (3.2094) grad_norm 3.3883 (2.4697) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][270/1251] eta 0:03:57 lr 0.000609 wd 0.0500 time 0.2376 (0.2419) data time 0.0010 (0.0028) model time 0.2366 (0.2391) loss 2.6446 (3.2148) grad_norm 2.8604 (2.4799) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][280/1251] eta 0:03:54 lr 0.000609 wd 0.0500 time 0.2362 (0.2418) data time 0.0009 (0.0027) model time 0.2353 (0.2390) loss 3.8844 (3.2151) grad_norm 2.4434 (2.4850) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][290/1251] eta 0:03:52 lr 0.000608 wd 0.0500 time 0.2322 (0.2416) data time 0.0011 (0.0027) model time 0.2311 (0.2389) loss 3.3151 (3.2140) grad_norm 2.1054 (2.4849) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][300/1251] eta 0:03:49 lr 0.000608 wd 0.0500 time 0.2339 (0.2416) data time 0.0011 (0.0026) model time 0.2327 (0.2389) loss 3.7617 (3.2139) grad_norm 2.6542 (2.4799) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][310/1251] eta 0:03:47 lr 0.000608 wd 0.0500 time 0.2400 (0.2415) data time 0.0009 (0.0026) model time 0.2391 (0.2389) loss 2.9045 (3.2115) grad_norm 2.2147 (2.4674) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][320/1251] eta 0:03:44 lr 0.000608 wd 0.0500 time 0.2413 (0.2414) data time 0.0011 (0.0025) model time 0.2402 (0.2388) loss 2.8665 (3.2120) grad_norm 3.0902 (2.4630) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[141/300][330/1251] eta 0:03:42 lr 0.000608 wd 0.0500 time 0.2417 (0.2413) data time 0.0011 (0.0025) model time 0.2406 (0.2388) loss 3.1772 (3.2141) grad_norm 3.6655 (2.4685) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][340/1251] eta 0:03:39 lr 0.000608 wd 0.0500 time 0.2503 (0.2413) data time 0.0008 (0.0024) model time 0.2495 (0.2388) loss 3.6738 (3.2238) grad_norm 2.0752 (2.4617) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][350/1251] eta 0:03:37 lr 0.000608 wd 0.0500 time 0.2394 (0.2413) data time 0.0009 (0.0024) model time 0.2385 (0.2389) loss 2.8045 (3.2214) grad_norm 2.3401 (2.4545) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][360/1251] eta 0:03:34 lr 0.000608 wd 0.0500 time 0.2414 (0.2413) data time 0.0008 (0.0024) model time 0.2406 (0.2389) loss 3.3359 (3.2139) grad_norm 1.9060 (2.4511) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][370/1251] eta 0:03:32 lr 0.000608 wd 0.0500 time 0.2493 (0.2413) data time 0.0008 (0.0023) model time 0.2485 (0.2389) loss 3.4992 (3.2082) grad_norm 1.9065 (2.4430) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][380/1251] eta 0:03:30 lr 0.000608 wd 0.0500 time 0.2378 (0.2412) data time 0.0010 (0.0023) model time 0.2368 (0.2389) loss 3.3514 (3.2119) grad_norm 2.1965 (2.4439) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][390/1251] eta 0:03:27 lr 0.000608 wd 0.0500 time 0.2385 (0.2412) data time 0.0011 (0.0023) model time 0.2374 (0.2389) loss 2.5190 (3.2118) grad_norm 2.1226 (2.4349) loss_scale 1024.0000 
(1024.0000) mem 7382MB [2024-08-27 08:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][400/1251] eta 0:03:25 lr 0.000608 wd 0.0500 time 0.2374 (0.2412) data time 0.0012 (0.0023) model time 0.2362 (0.2389) loss 2.0659 (3.2099) grad_norm 2.3936 (2.4349) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][410/1251] eta 0:03:22 lr 0.000608 wd 0.0500 time 0.2459 (0.2412) data time 0.0011 (0.0022) model time 0.2448 (0.2389) loss 3.5290 (3.2118) grad_norm 2.2854 (2.4352) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][420/1251] eta 0:03:20 lr 0.000608 wd 0.0500 time 0.2318 (0.2411) data time 0.0009 (0.0022) model time 0.2310 (0.2389) loss 3.9355 (3.2128) grad_norm 1.6843 (2.4353) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][430/1251] eta 0:03:17 lr 0.000608 wd 0.0500 time 0.2379 (0.2410) data time 0.0009 (0.0022) model time 0.2371 (0.2388) loss 2.9712 (3.2156) grad_norm 4.1390 (2.4385) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][440/1251] eta 0:03:15 lr 0.000608 wd 0.0500 time 0.2351 (0.2409) data time 0.0009 (0.0021) model time 0.2342 (0.2388) loss 2.9993 (3.2125) grad_norm 2.9391 (2.4435) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][450/1251] eta 0:03:12 lr 0.000608 wd 0.0500 time 0.2393 (0.2409) data time 0.0008 (0.0021) model time 0.2385 (0.2388) loss 4.0099 (3.2163) grad_norm 1.8920 (2.4373) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][460/1251] eta 0:03:10 lr 0.000608 wd 0.0500 time 0.2316 (0.2409) data time 0.0010 
(0.0021) model time 0.2307 (0.2387) loss 2.7016 (3.2173) grad_norm 2.0497 (2.4319) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][470/1251] eta 0:03:08 lr 0.000608 wd 0.0500 time 0.2344 (0.2408) data time 0.0009 (0.0021) model time 0.2335 (0.2387) loss 4.2702 (3.2242) grad_norm 2.6650 (2.4349) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][480/1251] eta 0:03:05 lr 0.000608 wd 0.0500 time 0.2387 (0.2408) data time 0.0011 (0.0021) model time 0.2376 (0.2387) loss 3.5634 (3.2248) grad_norm 2.0621 (2.4283) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][490/1251] eta 0:03:03 lr 0.000608 wd 0.0500 time 0.2422 (0.2408) data time 0.0013 (0.0020) model time 0.2409 (0.2387) loss 3.2341 (3.2208) grad_norm 1.7150 (2.4261) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][500/1251] eta 0:03:00 lr 0.000608 wd 0.0500 time 0.2317 (0.2407) data time 0.0007 (0.0020) model time 0.2310 (0.2386) loss 2.4917 (3.2202) grad_norm 2.7218 (2.4274) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][510/1251] eta 0:02:58 lr 0.000608 wd 0.0500 time 0.2335 (0.2407) data time 0.0009 (0.0020) model time 0.2326 (0.2387) loss 2.6765 (3.2230) grad_norm 3.0032 (2.4250) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][520/1251] eta 0:02:55 lr 0.000607 wd 0.0500 time 0.2418 (0.2407) data time 0.0008 (0.0020) model time 0.2409 (0.2387) loss 2.6461 (3.2234) grad_norm 2.1514 (2.4346) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [141/300][530/1251] eta 0:02:53 lr 0.000607 wd 0.0500 time 0.2373 (0.2407) data time 0.0009 (0.0020) model time 0.2364 (0.2387) loss 3.7880 (3.2298) grad_norm 2.9898 (2.4452) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][540/1251] eta 0:02:51 lr 0.000607 wd 0.0500 time 0.2425 (0.2407) data time 0.0010 (0.0020) model time 0.2415 (0.2386) loss 3.3468 (3.2305) grad_norm 2.3301 (2.4416) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][550/1251] eta 0:02:48 lr 0.000607 wd 0.0500 time 0.2405 (0.2406) data time 0.0010 (0.0019) model time 0.2394 (0.2386) loss 3.5524 (3.2303) grad_norm 3.7105 (2.4411) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][560/1251] eta 0:02:46 lr 0.000607 wd 0.0500 time 0.2380 (0.2405) data time 0.0010 (0.0019) model time 0.2369 (0.2386) loss 3.3492 (3.2325) grad_norm 2.1923 (2.4381) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][570/1251] eta 0:02:43 lr 0.000607 wd 0.0500 time 0.2324 (0.2405) data time 0.0011 (0.0019) model time 0.2313 (0.2385) loss 2.6298 (3.2325) grad_norm 1.8336 (2.4389) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][580/1251] eta 0:02:41 lr 0.000607 wd 0.0500 time 0.2422 (0.2405) data time 0.0008 (0.0019) model time 0.2414 (0.2385) loss 3.7975 (3.2320) grad_norm 1.6836 (2.4618) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][590/1251] eta 0:02:38 lr 0.000607 wd 0.0500 time 0.2422 (0.2405) data time 0.0011 (0.0019) model time 0.2412 (0.2386) loss 3.5661 (3.2392) grad_norm 1.8719 (2.4599) loss_scale 
1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][600/1251] eta 0:02:36 lr 0.000607 wd 0.0500 time 0.2396 (0.2405) data time 0.0008 (0.0019) model time 0.2389 (0.2385) loss 3.1211 (3.2428) grad_norm 3.0749 (2.4630) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][610/1251] eta 0:02:34 lr 0.000607 wd 0.0500 time 0.2359 (0.2404) data time 0.0007 (0.0019) model time 0.2352 (0.2385) loss 3.8565 (3.2416) grad_norm 1.8797 (2.4547) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][620/1251] eta 0:02:31 lr 0.000607 wd 0.0500 time 0.2391 (0.2404) data time 0.0010 (0.0019) model time 0.2380 (0.2385) loss 3.4195 (3.2442) grad_norm 2.0585 (2.4499) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][630/1251] eta 0:02:29 lr 0.000607 wd 0.0500 time 0.2387 (0.2404) data time 0.0009 (0.0018) model time 0.2378 (0.2385) loss 2.8316 (3.2452) grad_norm 2.5032 (2.4459) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][640/1251] eta 0:02:26 lr 0.000607 wd 0.0500 time 0.2347 (0.2404) data time 0.0008 (0.0018) model time 0.2339 (0.2385) loss 2.3639 (3.2421) grad_norm 3.4111 (2.4474) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][650/1251] eta 0:02:24 lr 0.000607 wd 0.0500 time 0.2357 (0.2403) data time 0.0011 (0.0018) model time 0.2346 (0.2385) loss 3.4697 (3.2426) grad_norm 2.6966 (2.4488) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][660/1251] eta 0:02:22 lr 0.000607 wd 0.0500 time 0.2358 (0.2403) data time 
0.0012 (0.0018) model time 0.2346 (0.2384) loss 2.4071 (3.2431) grad_norm 1.9291 (2.4481) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][670/1251] eta 0:02:19 lr 0.000607 wd 0.0500 time 0.2377 (0.2402) data time 0.0010 (0.0018) model time 0.2368 (0.2384) loss 3.8613 (3.2426) grad_norm 3.6443 (2.4536) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][680/1251] eta 0:02:17 lr 0.000607 wd 0.0500 time 0.2384 (0.2402) data time 0.0008 (0.0018) model time 0.2376 (0.2383) loss 3.0498 (3.2462) grad_norm 1.6202 (2.4555) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][690/1251] eta 0:02:14 lr 0.000607 wd 0.0500 time 0.2394 (0.2402) data time 0.0009 (0.0018) model time 0.2385 (0.2383) loss 4.1970 (3.2487) grad_norm 2.1481 (2.4560) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][700/1251] eta 0:02:12 lr 0.000607 wd 0.0500 time 0.2420 (0.2402) data time 0.0009 (0.0018) model time 0.2411 (0.2383) loss 3.4346 (3.2511) grad_norm 3.1852 (2.4574) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][710/1251] eta 0:02:09 lr 0.000607 wd 0.0500 time 0.2382 (0.2401) data time 0.0009 (0.0018) model time 0.2373 (0.2383) loss 2.6869 (3.2520) grad_norm 2.1069 (2.4542) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][720/1251] eta 0:02:07 lr 0.000607 wd 0.0500 time 0.2369 (0.2401) data time 0.0010 (0.0017) model time 0.2359 (0.2383) loss 3.3218 (3.2534) grad_norm 2.0503 (2.4532) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [141/300][730/1251] eta 0:02:05 lr 0.000607 wd 0.0500 time 0.2362 (0.2402) data time 0.0011 (0.0017) model time 0.2351 (0.2384) loss 2.6228 (3.2485) grad_norm 2.0588 (2.4533) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][740/1251] eta 0:02:02 lr 0.000607 wd 0.0500 time 0.2312 (0.2401) data time 0.0008 (0.0017) model time 0.2304 (0.2384) loss 4.0365 (3.2482) grad_norm 2.4563 (2.4533) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][750/1251] eta 0:02:00 lr 0.000606 wd 0.0500 time 0.2403 (0.2404) data time 0.0010 (0.0017) model time 0.2393 (0.2387) loss 3.5175 (3.2471) grad_norm 3.5256 (2.4562) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][760/1251] eta 0:01:58 lr 0.000606 wd 0.0500 time 0.2293 (0.2404) data time 0.0013 (0.0017) model time 0.2280 (0.2387) loss 3.5509 (3.2478) grad_norm 1.8725 (2.4592) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][770/1251] eta 0:01:55 lr 0.000606 wd 0.0500 time 0.2369 (0.2404) data time 0.0011 (0.0017) model time 0.2358 (0.2387) loss 3.5329 (3.2512) grad_norm 2.2781 (2.4582) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][780/1251] eta 0:01:53 lr 0.000606 wd 0.0500 time 0.2374 (0.2404) data time 0.0007 (0.0017) model time 0.2367 (0.2387) loss 3.0802 (3.2542) grad_norm 2.2461 (2.4590) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][790/1251] eta 0:01:50 lr 0.000606 wd 0.0500 time 0.2411 (0.2404) data time 0.0011 (0.0017) model time 0.2400 (0.2387) loss 3.6636 (3.2556) grad_norm 2.1300 (2.4583) 
loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][800/1251] eta 0:01:48 lr 0.000606 wd 0.0500 time 0.2349 (0.2403) data time 0.0009 (0.0017) model time 0.2340 (0.2386) loss 2.4765 (3.2554) grad_norm 2.4761 (2.4551) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][810/1251] eta 0:01:45 lr 0.000606 wd 0.0500 time 0.2381 (0.2403) data time 0.0010 (0.0017) model time 0.2371 (0.2386) loss 3.7019 (3.2565) grad_norm 2.1235 (2.4544) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][820/1251] eta 0:01:43 lr 0.000606 wd 0.0500 time 0.2407 (0.2403) data time 0.0008 (0.0017) model time 0.2399 (0.2386) loss 3.2581 (3.2606) grad_norm 2.0010 (2.4520) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][830/1251] eta 0:01:41 lr 0.000606 wd 0.0500 time 0.2449 (0.2403) data time 0.0010 (0.0017) model time 0.2439 (0.2386) loss 3.2332 (3.2606) grad_norm 1.8347 (2.4478) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][840/1251] eta 0:01:38 lr 0.000606 wd 0.0500 time 0.2395 (0.2403) data time 0.0009 (0.0017) model time 0.2386 (0.2386) loss 2.8241 (3.2606) grad_norm 2.2113 (2.4476) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][850/1251] eta 0:01:36 lr 0.000606 wd 0.0500 time 0.2289 (0.2403) data time 0.0010 (0.0017) model time 0.2279 (0.2386) loss 3.0265 (3.2623) grad_norm 2.0688 (2.4463) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][860/1251] eta 0:01:33 lr 0.000606 wd 0.0500 time 0.2381 (0.2403) 
data time 0.0011 (0.0016) model time 0.2370 (0.2386) loss 2.8606 (3.2643) grad_norm 2.4306 (2.4495) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][870/1251] eta 0:01:31 lr 0.000606 wd 0.0500 time 0.2483 (0.2402) data time 0.0008 (0.0016) model time 0.2476 (0.2386) loss 3.9646 (3.2654) grad_norm 2.5352 (2.4503) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][880/1251] eta 0:01:29 lr 0.000606 wd 0.0500 time 0.2421 (0.2402) data time 0.0011 (0.0016) model time 0.2410 (0.2385) loss 3.7143 (3.2639) grad_norm 2.4578 (2.4485) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][890/1251] eta 0:01:26 lr 0.000606 wd 0.0500 time 0.2319 (0.2402) data time 0.0011 (0.0016) model time 0.2309 (0.2385) loss 3.0530 (3.2605) grad_norm 2.5854 (2.4499) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][900/1251] eta 0:01:24 lr 0.000606 wd 0.0500 time 0.2333 (0.2401) data time 0.0009 (0.0016) model time 0.2324 (0.2385) loss 3.1194 (3.2557) grad_norm 3.0400 (2.4525) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][910/1251] eta 0:01:21 lr 0.000606 wd 0.0500 time 0.2440 (0.2401) data time 0.0009 (0.0016) model time 0.2430 (0.2385) loss 3.2571 (3.2542) grad_norm 2.8773 (2.4531) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][920/1251] eta 0:01:19 lr 0.000606 wd 0.0500 time 0.2365 (0.2401) data time 0.0011 (0.0016) model time 0.2355 (0.2385) loss 3.6435 (3.2573) grad_norm 2.1170 (2.4713) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [141/300][930/1251] eta 0:01:17 lr 0.000606 wd 0.0500 time 0.2411 (0.2401) data time 0.0009 (0.0016) model time 0.2402 (0.2385) loss 3.4714 (3.2573) grad_norm 2.1185 (2.4739) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][940/1251] eta 0:01:14 lr 0.000606 wd 0.0500 time 0.2413 (0.2401) data time 0.0007 (0.0016) model time 0.2406 (0.2385) loss 2.1912 (3.2573) grad_norm 1.7034 (2.4738) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][950/1251] eta 0:01:12 lr 0.000606 wd 0.0500 time 0.2499 (0.2401) data time 0.0011 (0.0016) model time 0.2488 (0.2385) loss 3.7766 (3.2575) grad_norm 1.8149 (2.4747) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][960/1251] eta 0:01:09 lr 0.000606 wd 0.0500 time 0.2312 (0.2401) data time 0.0009 (0.0016) model time 0.2304 (0.2385) loss 3.8557 (3.2573) grad_norm 3.7547 (2.4814) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][970/1251] eta 0:01:07 lr 0.000606 wd 0.0500 time 0.2388 (0.2401) data time 0.0007 (0.0016) model time 0.2380 (0.2385) loss 3.1450 (3.2575) grad_norm 2.3274 (2.4856) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][980/1251] eta 0:01:05 lr 0.000605 wd 0.0500 time 0.2394 (0.2401) data time 0.0009 (0.0016) model time 0.2385 (0.2385) loss 2.4582 (3.2537) grad_norm 1.9342 (2.4901) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][990/1251] eta 0:01:02 lr 0.000605 wd 0.0500 time 0.2348 (0.2401) data time 0.0013 (0.0016) model time 0.2335 (0.2385) loss 4.0333 (3.2550) grad_norm 
1.9580 (2.4890) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1000/1251] eta 0:01:00 lr 0.000605 wd 0.0500 time 0.2376 (0.2400) data time 0.0009 (0.0016) model time 0.2367 (0.2384) loss 3.2839 (3.2551) grad_norm 2.7238 (2.4878) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1010/1251] eta 0:00:57 lr 0.000605 wd 0.0500 time 0.2510 (0.2400) data time 0.0013 (0.0016) model time 0.2497 (0.2384) loss 3.4128 (3.2579) grad_norm 3.7043 (2.4881) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1020/1251] eta 0:00:55 lr 0.000605 wd 0.0500 time 0.2362 (0.2400) data time 0.0008 (0.0016) model time 0.2355 (0.2384) loss 2.0863 (3.2606) grad_norm 2.0144 (2.4883) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1030/1251] eta 0:00:53 lr 0.000605 wd 0.0500 time 0.2374 (0.2400) data time 0.0009 (0.0016) model time 0.2366 (0.2384) loss 3.9324 (3.2616) grad_norm 1.9685 (2.4875) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1040/1251] eta 0:00:50 lr 0.000605 wd 0.0500 time 0.2391 (0.2400) data time 0.0009 (0.0016) model time 0.2382 (0.2384) loss 2.3392 (3.2603) grad_norm 1.6567 (2.4851) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1050/1251] eta 0:00:48 lr 0.000605 wd 0.0500 time 0.2301 (0.2400) data time 0.0009 (0.0016) model time 0.2292 (0.2384) loss 3.1515 (3.2636) grad_norm 1.7690 (2.4817) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1060/1251] eta 0:00:45 lr 0.000605 wd 
0.0500 time 0.2350 (0.2400) data time 0.0011 (0.0016) model time 0.2339 (0.2384) loss 2.2104 (3.2582) grad_norm 2.0737 (2.4774) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1070/1251] eta 0:00:43 lr 0.000605 wd 0.0500 time 0.2277 (0.2399) data time 0.0009 (0.0015) model time 0.2268 (0.2384) loss 2.1951 (3.2547) grad_norm 2.5222 (2.4760) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1080/1251] eta 0:00:41 lr 0.000605 wd 0.0500 time 0.2374 (0.2401) data time 0.0010 (0.0015) model time 0.2364 (0.2386) loss 2.3909 (3.2542) grad_norm 3.5847 (2.4816) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1090/1251] eta 0:00:38 lr 0.000605 wd 0.0500 time 0.2311 (0.2401) data time 0.0012 (0.0015) model time 0.2299 (0.2385) loss 3.6729 (3.2553) grad_norm 2.2811 (2.4819) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1100/1251] eta 0:00:36 lr 0.000605 wd 0.0500 time 0.2412 (0.2401) data time 0.0009 (0.0015) model time 0.2403 (0.2385) loss 3.2144 (3.2564) grad_norm 1.9525 (2.4786) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1110/1251] eta 0:00:33 lr 0.000605 wd 0.0500 time 0.2424 (0.2401) data time 0.0008 (0.0015) model time 0.2416 (0.2385) loss 4.2407 (3.2584) grad_norm 2.1255 (2.4764) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1120/1251] eta 0:00:31 lr 0.000605 wd 0.0500 time 0.2625 (0.2402) data time 0.0008 (0.0015) model time 0.2617 (0.2386) loss 3.4977 (3.2597) grad_norm 2.5185 (2.4758) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:06 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1130/1251] eta 0:00:29 lr 0.000605 wd 0.0500 time 0.2361 (0.2402) data time 0.0009 (0.0015) model time 0.2352 (0.2386) loss 2.2602 (3.2576) grad_norm 1.7791 (2.4735) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1140/1251] eta 0:00:26 lr 0.000605 wd 0.0500 time 0.2390 (0.2402) data time 0.0010 (0.0015) model time 0.2380 (0.2386) loss 3.3898 (3.2582) grad_norm 2.2513 (2.4730) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1150/1251] eta 0:00:24 lr 0.000605 wd 0.0500 time 0.2394 (0.2402) data time 0.0011 (0.0015) model time 0.2384 (0.2386) loss 3.5402 (3.2584) grad_norm 2.2547 (2.4725) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1160/1251] eta 0:00:21 lr 0.000605 wd 0.0500 time 0.2424 (0.2401) data time 0.0008 (0.0015) model time 0.2416 (0.2386) loss 4.0503 (3.2577) grad_norm 8.2357 (2.4744) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1170/1251] eta 0:00:19 lr 0.000605 wd 0.0500 time 0.2369 (0.2401) data time 0.0007 (0.0015) model time 0.2362 (0.2386) loss 2.9182 (3.2549) grad_norm 2.3390 (2.4722) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1180/1251] eta 0:00:17 lr 0.000605 wd 0.0500 time 0.2394 (0.2401) data time 0.0008 (0.0015) model time 0.2386 (0.2386) loss 3.3797 (3.2571) grad_norm 2.5185 (2.4713) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1190/1251] eta 0:00:14 lr 0.000605 wd 0.0500 time 0.2335 (0.2401) data time 0.0011 (0.0015) model time 0.2325 (0.2386) loss 
2.4019 (3.2576) grad_norm 2.7685 (2.4700) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1200/1251] eta 0:00:12 lr 0.000605 wd 0.0500 time 0.2416 (0.2401) data time 0.0010 (0.0015) model time 0.2406 (0.2385) loss 3.1222 (3.2563) grad_norm 2.3596 (2.4681) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1210/1251] eta 0:00:09 lr 0.000604 wd 0.0500 time 0.2316 (0.2400) data time 0.0009 (0.0015) model time 0.2307 (0.2385) loss 3.4982 (3.2592) grad_norm 2.1543 (2.4678) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1220/1251] eta 0:00:07 lr 0.000604 wd 0.0500 time 0.2419 (0.2400) data time 0.0011 (0.0015) model time 0.2408 (0.2385) loss 3.3097 (3.2589) grad_norm 2.0696 (2.4652) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1230/1251] eta 0:00:05 lr 0.000604 wd 0.0500 time 0.2419 (0.2400) data time 0.0010 (0.0015) model time 0.2409 (0.2385) loss 2.8998 (3.2583) grad_norm 2.3571 (2.4646) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1240/1251] eta 0:00:02 lr 0.000604 wd 0.0500 time 0.2264 (0.2399) data time 0.0007 (0.0015) model time 0.2257 (0.2384) loss 3.7743 (3.2597) grad_norm 2.2246 (2.4666) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [141/300][1250/1251] eta 0:00:00 lr 0.000604 wd 0.0500 time 0.2240 (0.2398) data time 0.0005 (0.0015) model time 0.2236 (0.2383) loss 2.6598 (3.2604) grad_norm 2.1699 (2.4654) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 141 training takes 0:05:00 
[2024-08-27 08:35:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:35:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.416 (0.416) Loss 0.5166 (0.5166) Acc@1 91.406 (91.406) Acc@5 98.242 (98.242) Mem 7382MB [2024-08-27 08:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.108) Loss 0.7778 (0.7754) Acc@1 86.035 (84.020) Acc@5 96.094 (96.600) Mem 7382MB [2024-08-27 08:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.094) Loss 1.0879 (0.8041) Acc@1 74.414 (83.045) Acc@5 94.336 (96.652) Mem 7382MB [2024-08-27 08:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.087) Loss 1.3633 (0.9011) Acc@1 67.188 (80.708) Acc@5 90.332 (95.486) Mem 7382MB [2024-08-27 08:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 1.2529 (0.9568) Acc@1 70.703 (79.216) Acc@5 91.895 (94.858) Mem 7382MB [2024-08-27 08:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.778 Acc@5 94.754 [2024-08-27 08:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.8% [2024-08-27 08:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.785 (0.785) Loss 0.4106 (0.4106) Acc@1 92.676 (92.676) Acc@5 98.438 (98.438) Mem 7382MB [2024-08-27 08:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.145) Loss 0.6572 (0.6465) Acc@1 86.914 (86.151) Acc@5 97.070 (97.275) Mem 7382MB [2024-08-27 08:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.112) Loss 0.9180 (0.6707) Acc@1 79.102 (85.198) Acc@5 95.020 (97.289) Mem 7382MB 
[2024-08-27 08:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.100) Loss 1.1621 (0.7610) Acc@1 70.996 (82.904) Acc@5 92.090 (96.327) Mem 7382MB [2024-08-27 08:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.091) Loss 1.0537 (0.8086) Acc@1 74.609 (81.562) Acc@5 93.555 (95.801) Mem 7382MB [2024-08-27 08:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.148 Acc@5 95.768 [2024-08-27 08:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.1% [2024-08-27 08:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][0/1251] eta 0:23:49 lr 0.000604 wd 0.0500 time 1.1425 (1.1425) data time 0.6109 (0.6109) model time 0.0000 (0.0000) loss 3.4080 (3.4080) grad_norm 1.6934 (1.6934) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][10/1251] eta 0:06:39 lr 0.000604 wd 0.0500 time 0.2379 (0.3217) data time 0.0008 (0.0565) model time 0.0000 (0.0000) loss 4.0847 (3.1270) grad_norm 3.2634 (2.5999) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][20/1251] eta 0:05:47 lr 0.000604 wd 0.0500 time 0.2423 (0.2820) data time 0.0009 (0.0302) model time 0.0000 (0.0000) loss 3.6384 (3.1152) grad_norm 2.5532 (2.5754) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][30/1251] eta 0:05:27 lr 0.000604 wd 0.0500 time 0.2471 (0.2679) data time 0.0009 (0.0208) model time 0.0000 (0.0000) loss 2.1972 (3.0575) grad_norm 1.9638 (2.5342) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][40/1251] eta 0:05:16 lr 0.000604 wd 0.0500 time 0.2384 (0.2612) data time 0.0009 (0.0160) model time 0.0000 
(0.0000) loss 3.7625 (3.1989) grad_norm 3.0525 (2.4775) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][50/1251] eta 0:05:08 lr 0.000604 wd 0.0500 time 0.2455 (0.2569) data time 0.0008 (0.0131) model time 0.0000 (0.0000) loss 2.3132 (3.2131) grad_norm 2.8361 (2.5733) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][60/1251] eta 0:05:02 lr 0.000604 wd 0.0500 time 0.2464 (0.2540) data time 0.0008 (0.0111) model time 0.2456 (0.2384) loss 3.2367 (3.1773) grad_norm 3.0032 (2.5840) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][70/1251] eta 0:04:57 lr 0.000604 wd 0.0500 time 0.2536 (0.2518) data time 0.0010 (0.0097) model time 0.2526 (0.2378) loss 3.3292 (3.1913) grad_norm 2.7057 (2.5378) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][80/1251] eta 0:04:52 lr 0.000604 wd 0.0500 time 0.2388 (0.2501) data time 0.0011 (0.0086) model time 0.2376 (0.2375) loss 3.3344 (3.1431) grad_norm 1.7787 (2.4836) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][90/1251] eta 0:04:48 lr 0.000604 wd 0.0500 time 0.2386 (0.2486) data time 0.0009 (0.0078) model time 0.2377 (0.2371) loss 3.5562 (3.1376) grad_norm 2.0098 (2.4442) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][100/1251] eta 0:04:45 lr 0.000604 wd 0.0500 time 0.2461 (0.2478) data time 0.0008 (0.0071) model time 0.2452 (0.2375) loss 2.9375 (3.1264) grad_norm 1.9397 (2.4257) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][110/1251] eta 
0:04:41 lr 0.000604 wd 0.0500 time 0.2516 (0.2469) data time 0.0010 (0.0066) model time 0.2506 (0.2373) loss 3.6146 (3.1138) grad_norm 2.1998 (2.4263) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][120/1251] eta 0:04:38 lr 0.000604 wd 0.0500 time 0.2518 (0.2462) data time 0.0008 (0.0062) model time 0.2511 (0.2374) loss 3.2975 (3.1278) grad_norm 2.6201 (2.4214) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][130/1251] eta 0:04:35 lr 0.000604 wd 0.0500 time 0.2435 (0.2457) data time 0.0010 (0.0058) model time 0.2424 (0.2374) loss 2.6786 (3.1063) grad_norm 2.1197 (2.4407) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][140/1251] eta 0:04:32 lr 0.000604 wd 0.0500 time 0.2464 (0.2451) data time 0.0008 (0.0054) model time 0.2456 (0.2374) loss 2.1022 (3.1267) grad_norm 2.4642 (2.4514) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 08:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][150/1251] eta 0:04:29 lr 0.000604 wd 0.0500 time 0.2376 (0.2447) data time 0.0011 (0.0052) model time 0.2366 (0.2373) loss 3.4253 (3.1121) grad_norm 1.8177 (2.4663) loss_scale 2048.0000 (1078.2517) mem 7382MB [2024-08-27 08:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][160/1251] eta 0:04:26 lr 0.000604 wd 0.0500 time 0.2484 (0.2444) data time 0.0011 (0.0049) model time 0.2474 (0.2375) loss 3.3585 (3.1161) grad_norm 1.9955 (2.4539) loss_scale 2048.0000 (1138.4845) mem 7382MB [2024-08-27 08:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][170/1251] eta 0:04:23 lr 0.000604 wd 0.0500 time 0.2423 (0.2441) data time 0.0007 (0.0047) model time 0.2416 (0.2376) loss 3.4228 (3.1425) grad_norm 2.4019 (2.4375) loss_scale 2048.0000 (1191.6725) mem 7382MB 
[2024-08-27 08:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][180/1251] eta 0:04:21 lr 0.000604 wd 0.0500 time 0.2446 (0.2440) data time 0.0010 (0.0045) model time 0.2436 (0.2378) loss 3.0536 (3.1444) grad_norm 2.8505 (2.4457) loss_scale 2048.0000 (1238.9834) mem 7382MB [2024-08-27 08:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][190/1251] eta 0:04:18 lr 0.000603 wd 0.0500 time 0.2353 (0.2439) data time 0.0011 (0.0043) model time 0.2342 (0.2380) loss 3.8697 (3.1539) grad_norm 3.4470 (2.4533) loss_scale 2048.0000 (1281.3403) mem 7382MB [2024-08-27 08:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][200/1251] eta 0:04:16 lr 0.000603 wd 0.0500 time 0.2562 (0.2438) data time 0.0008 (0.0042) model time 0.2554 (0.2382) loss 3.7828 (3.1617) grad_norm 2.8167 (2.4649) loss_scale 2048.0000 (1319.4826) mem 7382MB [2024-08-27 08:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][210/1251] eta 0:04:13 lr 0.000603 wd 0.0500 time 0.2371 (0.2436) data time 0.0013 (0.0040) model time 0.2357 (0.2382) loss 2.6053 (3.1546) grad_norm 2.3116 (2.4626) loss_scale 2048.0000 (1354.0095) mem 7382MB [2024-08-27 08:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][220/1251] eta 0:04:10 lr 0.000603 wd 0.0500 time 0.2469 (0.2434) data time 0.0007 (0.0039) model time 0.2463 (0.2383) loss 3.2146 (3.1581) grad_norm 2.5686 (2.4662) loss_scale 2048.0000 (1385.4118) mem 7382MB [2024-08-27 08:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][230/1251] eta 0:04:08 lr 0.000603 wd 0.0500 time 0.2358 (0.2433) data time 0.0008 (0.0038) model time 0.2351 (0.2383) loss 3.2306 (3.1553) grad_norm 1.7459 (2.4532) loss_scale 2048.0000 (1414.0952) mem 7382MB [2024-08-27 08:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][240/1251] eta 0:04:06 lr 0.000603 wd 0.0500 time 0.2430 (0.2440) data time 0.0011 (0.0037) model time 0.2419 
(0.2395) loss 3.4067 (3.1616) grad_norm 1.7765 (2.4453) loss_scale 2048.0000 (1440.3983) mem 7382MB [2024-08-27 08:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][250/1251] eta 0:04:04 lr 0.000603 wd 0.0500 time 0.2430 (0.2440) data time 0.0011 (0.0036) model time 0.2419 (0.2396) loss 3.4763 (3.1691) grad_norm 2.1232 (2.4353) loss_scale 2048.0000 (1464.6056) mem 7382MB [2024-08-27 08:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][260/1251] eta 0:04:01 lr 0.000603 wd 0.0500 time 0.2369 (0.2440) data time 0.0010 (0.0035) model time 0.2359 (0.2397) loss 3.8013 (3.1729) grad_norm 2.9934 (2.4301) loss_scale 2048.0000 (1486.9579) mem 7382MB [2024-08-27 08:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][270/1251] eta 0:03:59 lr 0.000603 wd 0.0500 time 0.2385 (0.2438) data time 0.0010 (0.0034) model time 0.2376 (0.2396) loss 3.6555 (3.1795) grad_norm 2.1439 (2.4257) loss_scale 2048.0000 (1507.6605) mem 7382MB [2024-08-27 08:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][280/1251] eta 0:03:56 lr 0.000603 wd 0.0500 time 0.2454 (0.2436) data time 0.0010 (0.0033) model time 0.2445 (0.2396) loss 2.9623 (3.1814) grad_norm 3.0223 (2.4169) loss_scale 2048.0000 (1526.8897) mem 7382MB [2024-08-27 08:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][290/1251] eta 0:03:54 lr 0.000603 wd 0.0500 time 0.2371 (0.2435) data time 0.0012 (0.0032) model time 0.2359 (0.2395) loss 2.7842 (3.1796) grad_norm 2.6798 (2.4161) loss_scale 2048.0000 (1544.7973) mem 7382MB [2024-08-27 08:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][300/1251] eta 0:03:51 lr 0.000603 wd 0.0500 time 0.2328 (0.2434) data time 0.0008 (0.0031) model time 0.2320 (0.2395) loss 2.3450 (3.1902) grad_norm 1.8756 (2.4058) loss_scale 2048.0000 (1561.5150) mem 7382MB [2024-08-27 08:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[142/300][310/1251] eta 0:03:48 lr 0.000603 wd 0.0500 time 0.2386 (0.2434) data time 0.0011 (0.0031) model time 0.2376 (0.2396) loss 2.8730 (3.1973) grad_norm 1.6766 (2.3952) loss_scale 2048.0000 (1577.1576) mem 7382MB [2024-08-27 08:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][320/1251] eta 0:03:46 lr 0.000603 wd 0.0500 time 0.2362 (0.2433) data time 0.0010 (0.0030) model time 0.2352 (0.2396) loss 2.6115 (3.2030) grad_norm 2.0834 (2.3881) loss_scale 2048.0000 (1591.8255) mem 7382MB [2024-08-27 08:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][330/1251] eta 0:03:44 lr 0.000603 wd 0.0500 time 0.2401 (0.2432) data time 0.0010 (0.0030) model time 0.2391 (0.2396) loss 2.9307 (3.2057) grad_norm 2.3539 (2.3940) loss_scale 2048.0000 (1605.6073) mem 7382MB [2024-08-27 08:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][340/1251] eta 0:03:42 lr 0.000603 wd 0.0500 time 0.2355 (0.2437) data time 0.0009 (0.0029) model time 0.2346 (0.2403) loss 3.6775 (3.2076) grad_norm 2.3793 (2.4008) loss_scale 2048.0000 (1618.5806) mem 7382MB [2024-08-27 08:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][350/1251] eta 0:03:39 lr 0.000603 wd 0.0500 time 0.2407 (0.2436) data time 0.0010 (0.0029) model time 0.2397 (0.2402) loss 3.5680 (3.2140) grad_norm 2.0490 (2.4096) loss_scale 2048.0000 (1630.8148) mem 7382MB [2024-08-27 08:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][360/1251] eta 0:03:36 lr 0.000603 wd 0.0500 time 0.2409 (0.2435) data time 0.0008 (0.0028) model time 0.2401 (0.2401) loss 2.3584 (3.2110) grad_norm 2.9142 (2.4165) loss_scale 2048.0000 (1642.3712) mem 7382MB [2024-08-27 08:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][370/1251] eta 0:03:34 lr 0.000603 wd 0.0500 time 0.2298 (0.2434) data time 0.0009 (0.0028) model time 0.2290 (0.2401) loss 3.6166 (3.2091) grad_norm 2.3021 (2.4233) loss_scale 2048.0000 
(1653.3046) mem 7382MB [2024-08-27 08:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][380/1251] eta 0:03:31 lr 0.000603 wd 0.0500 time 0.2465 (0.2433) data time 0.0008 (0.0027) model time 0.2456 (0.2401) loss 2.0579 (3.2097) grad_norm 2.4553 (2.4219) loss_scale 2048.0000 (1663.6640) mem 7382MB [2024-08-27 08:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][390/1251] eta 0:03:29 lr 0.000603 wd 0.0500 time 0.2396 (0.2433) data time 0.0009 (0.0027) model time 0.2387 (0.2402) loss 3.8188 (3.2063) grad_norm 2.0220 (2.4244) loss_scale 2048.0000 (1673.4936) mem 7382MB [2024-08-27 08:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][400/1251] eta 0:03:26 lr 0.000603 wd 0.0500 time 0.2375 (0.2432) data time 0.0009 (0.0026) model time 0.2366 (0.2401) loss 3.1411 (3.2021) grad_norm 4.4956 (2.4272) loss_scale 2048.0000 (1682.8329) mem 7382MB [2024-08-27 08:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][410/1251] eta 0:03:24 lr 0.000603 wd 0.0500 time 0.2412 (0.2431) data time 0.0010 (0.0026) model time 0.2402 (0.2400) loss 2.6905 (3.2004) grad_norm 4.0787 (2.4434) loss_scale 2048.0000 (1691.7178) mem 7382MB [2024-08-27 08:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][420/1251] eta 0:03:21 lr 0.000602 wd 0.0500 time 0.2375 (0.2430) data time 0.0010 (0.0026) model time 0.2365 (0.2400) loss 2.6992 (3.2034) grad_norm 1.7205 (2.4415) loss_scale 2048.0000 (1700.1805) mem 7382MB [2024-08-27 08:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][430/1251] eta 0:03:19 lr 0.000602 wd 0.0500 time 0.2442 (0.2429) data time 0.0011 (0.0025) model time 0.2432 (0.2399) loss 3.0339 (3.2080) grad_norm 1.9398 (2.4324) loss_scale 2048.0000 (1708.2506) mem 7382MB [2024-08-27 08:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][440/1251] eta 0:03:17 lr 0.000602 wd 0.0500 time 0.2408 (0.2429) data time 0.0011 
(0.0025) model time 0.2396 (0.2400) loss 2.1717 (3.2079) grad_norm 2.0221 (2.4318) loss_scale 2048.0000 (1715.9546) mem 7382MB [2024-08-27 08:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][450/1251] eta 0:03:14 lr 0.000602 wd 0.0500 time 0.2402 (0.2428) data time 0.0008 (0.0025) model time 0.2394 (0.2399) loss 2.8237 (3.2092) grad_norm 2.2708 (2.4310) loss_scale 2048.0000 (1723.3171) mem 7382MB [2024-08-27 08:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][460/1251] eta 0:03:12 lr 0.000602 wd 0.0500 time 0.2364 (0.2428) data time 0.0010 (0.0024) model time 0.2354 (0.2399) loss 3.7467 (3.2137) grad_norm 3.4994 (2.4371) loss_scale 2048.0000 (1730.3601) mem 7382MB [2024-08-27 08:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][470/1251] eta 0:03:09 lr 0.000602 wd 0.0500 time 0.2376 (0.2428) data time 0.0010 (0.0024) model time 0.2366 (0.2399) loss 3.6236 (3.2108) grad_norm 1.9854 (2.4342) loss_scale 2048.0000 (1737.1040) mem 7382MB [2024-08-27 08:37:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][480/1251] eta 0:03:07 lr 0.000602 wd 0.0500 time 0.2300 (0.2427) data time 0.0011 (0.0024) model time 0.2289 (0.2399) loss 3.3263 (3.2145) grad_norm 1.9735 (2.4296) loss_scale 2048.0000 (1743.5676) mem 7382MB [2024-08-27 08:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][490/1251] eta 0:03:04 lr 0.000602 wd 0.0500 time 0.2355 (0.2426) data time 0.0008 (0.0024) model time 0.2347 (0.2398) loss 3.3954 (3.2188) grad_norm 2.6594 (2.4275) loss_scale 2048.0000 (1749.7678) mem 7382MB [2024-08-27 08:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][500/1251] eta 0:03:02 lr 0.000602 wd 0.0500 time 0.2341 (0.2425) data time 0.0009 (0.0023) model time 0.2332 (0.2397) loss 4.1201 (3.2249) grad_norm 2.3525 (2.4400) loss_scale 2048.0000 (1755.7206) mem 7382MB [2024-08-27 08:37:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [142/300][510/1251] eta 0:02:59 lr 0.000602 wd 0.0500 time 0.2365 (0.2424) data time 0.0010 (0.0023) model time 0.2355 (0.2397) loss 2.6751 (3.2275) grad_norm 1.8834 (2.4365) loss_scale 2048.0000 (1761.4403) mem 7382MB [2024-08-27 08:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][520/1251] eta 0:02:57 lr 0.000602 wd 0.0500 time 0.2390 (0.2424) data time 0.0013 (0.0023) model time 0.2377 (0.2397) loss 2.1931 (3.2260) grad_norm 2.1413 (2.4370) loss_scale 2048.0000 (1766.9405) mem 7382MB [2024-08-27 08:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][530/1251] eta 0:02:54 lr 0.000602 wd 0.0500 time 0.2351 (0.2424) data time 0.0007 (0.0023) model time 0.2344 (0.2397) loss 2.3284 (3.2227) grad_norm 2.3084 (2.4430) loss_scale 2048.0000 (1772.2335) mem 7382MB [2024-08-27 08:37:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][540/1251] eta 0:02:52 lr 0.000602 wd 0.0500 time 0.2364 (0.2423) data time 0.0009 (0.0022) model time 0.2356 (0.2397) loss 2.8275 (3.2236) grad_norm 1.8165 (2.4441) loss_scale 2048.0000 (1777.3309) mem 7382MB [2024-08-27 08:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][550/1251] eta 0:02:49 lr 0.000602 wd 0.0500 time 0.2332 (0.2423) data time 0.0011 (0.0022) model time 0.2322 (0.2397) loss 3.1046 (3.2225) grad_norm 2.3577 (2.4416) loss_scale 2048.0000 (1782.2432) mem 7382MB [2024-08-27 08:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][560/1251] eta 0:02:47 lr 0.000602 wd 0.0500 time 0.2407 (0.2422) data time 0.0008 (0.0022) model time 0.2399 (0.2396) loss 4.0045 (3.2193) grad_norm 2.3772 (2.4475) loss_scale 2048.0000 (1786.9804) mem 7382MB [2024-08-27 08:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][570/1251] eta 0:02:44 lr 0.000602 wd 0.0500 time 0.2365 (0.2422) data time 0.0010 (0.0022) model time 0.2354 (0.2396) loss 3.0727 (3.2257) grad_norm 2.3632 (2.4437) loss_scale 
2048.0000 (1791.5517) mem 7382MB [2024-08-27 08:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][580/1251] eta 0:02:42 lr 0.000602 wd 0.0500 time 0.2413 (0.2421) data time 0.0010 (0.0022) model time 0.2403 (0.2396) loss 3.0460 (3.2204) grad_norm 3.5859 (2.4437) loss_scale 2048.0000 (1795.9656) mem 7382MB [2024-08-27 08:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][590/1251] eta 0:02:39 lr 0.000602 wd 0.0500 time 0.2358 (0.2420) data time 0.0007 (0.0021) model time 0.2351 (0.2395) loss 3.5727 (3.2235) grad_norm 2.1690 (2.4412) loss_scale 2048.0000 (1800.2301) mem 7382MB [2024-08-27 08:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][600/1251] eta 0:02:37 lr 0.000602 wd 0.0500 time 0.2266 (0.2419) data time 0.0009 (0.0021) model time 0.2257 (0.2394) loss 3.3852 (3.2271) grad_norm 1.7476 (2.4373) loss_scale 2048.0000 (1804.3527) mem 7382MB [2024-08-27 08:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][610/1251] eta 0:02:35 lr 0.000602 wd 0.0500 time 0.2405 (0.2419) data time 0.0007 (0.0021) model time 0.2397 (0.2394) loss 4.0910 (3.2291) grad_norm 2.6477 (2.4354) loss_scale 2048.0000 (1808.3404) mem 7382MB [2024-08-27 08:38:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][620/1251] eta 0:02:32 lr 0.000602 wd 0.0500 time 0.2402 (0.2419) data time 0.0010 (0.0021) model time 0.2392 (0.2395) loss 3.0267 (3.2313) grad_norm 1.7475 (2.4320) loss_scale 2048.0000 (1812.1997) mem 7382MB [2024-08-27 08:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][630/1251] eta 0:02:30 lr 0.000602 wd 0.0500 time 0.2441 (0.2419) data time 0.0007 (0.0021) model time 0.2434 (0.2395) loss 3.8849 (3.2287) grad_norm 2.1401 (2.4264) loss_scale 2048.0000 (1815.9366) mem 7382MB [2024-08-27 08:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][640/1251] eta 0:02:27 lr 0.000602 wd 0.0500 time 0.2305 (0.2419) data time 
0.0011 (0.0021) model time 0.2294 (0.2395) loss 3.7302 (3.2296) grad_norm 3.2877 (2.4259) loss_scale 2048.0000 (1819.5569) mem 7382MB [2024-08-27 08:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][650/1251] eta 0:02:25 lr 0.000601 wd 0.0500 time 0.2435 (0.2419) data time 0.0008 (0.0021) model time 0.2427 (0.2395) loss 3.3180 (3.2314) grad_norm 7.5603 (2.4377) loss_scale 2048.0000 (1823.0661) mem 7382MB [2024-08-27 08:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][660/1251] eta 0:02:22 lr 0.000601 wd 0.0500 time 0.2401 (0.2419) data time 0.0009 (0.0020) model time 0.2392 (0.2395) loss 3.2665 (3.2290) grad_norm 2.1095 (2.4418) loss_scale 2048.0000 (1826.4690) mem 7382MB [2024-08-27 08:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][670/1251] eta 0:02:20 lr 0.000601 wd 0.0500 time 0.2409 (0.2418) data time 0.0010 (0.0020) model time 0.2399 (0.2394) loss 3.6550 (3.2270) grad_norm 3.3471 (2.4421) loss_scale 2048.0000 (1829.7705) mem 7382MB [2024-08-27 08:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][680/1251] eta 0:02:18 lr 0.000601 wd 0.0500 time 0.2355 (0.2418) data time 0.0011 (0.0020) model time 0.2345 (0.2394) loss 3.8299 (3.2301) grad_norm 1.9654 (2.4431) loss_scale 2048.0000 (1832.9750) mem 7382MB [2024-08-27 08:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][690/1251] eta 0:02:15 lr 0.000601 wd 0.0500 time 0.2403 (0.2417) data time 0.0011 (0.0020) model time 0.2392 (0.2394) loss 3.5083 (3.2339) grad_norm 2.0913 (2.4419) loss_scale 2048.0000 (1836.0868) mem 7382MB [2024-08-27 08:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][700/1251] eta 0:02:13 lr 0.000601 wd 0.0500 time 0.2338 (0.2417) data time 0.0007 (0.0020) model time 0.2330 (0.2394) loss 2.2575 (3.2342) grad_norm 2.2143 (2.4378) loss_scale 2048.0000 (1839.1098) mem 7382MB [2024-08-27 08:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [142/300][710/1251] eta 0:02:10 lr 0.000601 wd 0.0500 time 0.2344 (0.2417) data time 0.0010 (0.0020) model time 0.2334 (0.2394) loss 3.3765 (3.2354) grad_norm 2.3259 (2.4325) loss_scale 2048.0000 (1842.0478) mem 7382MB [2024-08-27 08:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][720/1251] eta 0:02:08 lr 0.000601 wd 0.0500 time 0.2378 (0.2417) data time 0.0008 (0.0020) model time 0.2371 (0.2394) loss 3.5309 (3.2394) grad_norm 4.0523 (2.4367) loss_scale 2048.0000 (1844.9043) mem 7382MB [2024-08-27 08:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][730/1251] eta 0:02:05 lr 0.000601 wd 0.0500 time 0.2451 (0.2416) data time 0.0012 (0.0019) model time 0.2438 (0.2394) loss 3.5479 (3.2436) grad_norm 4.4627 (2.4445) loss_scale 2048.0000 (1847.6826) mem 7382MB [2024-08-27 08:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][740/1251] eta 0:02:03 lr 0.000601 wd 0.0500 time 0.2402 (0.2416) data time 0.0009 (0.0019) model time 0.2393 (0.2394) loss 4.3995 (3.2455) grad_norm 4.3466 (2.4507) loss_scale 2048.0000 (1850.3860) mem 7382MB [2024-08-27 08:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][750/1251] eta 0:02:01 lr 0.000601 wd 0.0500 time 0.2397 (0.2416) data time 0.0008 (0.0019) model time 0.2389 (0.2394) loss 2.8281 (3.2445) grad_norm 3.2032 (2.4590) loss_scale 2048.0000 (1853.0173) mem 7382MB [2024-08-27 08:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][760/1251] eta 0:01:58 lr 0.000601 wd 0.0500 time 0.2413 (0.2416) data time 0.0008 (0.0019) model time 0.2405 (0.2394) loss 3.5311 (3.2451) grad_norm 3.6733 (2.4655) loss_scale 2048.0000 (1855.5795) mem 7382MB [2024-08-27 08:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][770/1251] eta 0:01:56 lr 0.000601 wd 0.0500 time 0.2401 (0.2415) data time 0.0009 (0.0019) model time 0.2393 (0.2393) loss 2.9235 (3.2433) grad_norm 1.7003 (2.4610) 
loss_scale 2048.0000 (1858.0752) mem 7382MB [2024-08-27 08:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][780/1251] eta 0:01:53 lr 0.000601 wd 0.0500 time 0.2321 (0.2415) data time 0.0011 (0.0019) model time 0.2310 (0.2393) loss 3.4606 (3.2441) grad_norm 1.9818 (2.4578) loss_scale 2048.0000 (1860.5070) mem 7382MB [2024-08-27 08:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][790/1251] eta 0:01:51 lr 0.000601 wd 0.0500 time 0.2370 (0.2415) data time 0.0008 (0.0019) model time 0.2362 (0.2393) loss 2.8010 (3.2444) grad_norm 2.6490 (2.4580) loss_scale 2048.0000 (1862.8774) mem 7382MB [2024-08-27 08:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][800/1251] eta 0:01:48 lr 0.000601 wd 0.0500 time 0.2465 (0.2415) data time 0.0010 (0.0019) model time 0.2455 (0.2393) loss 3.0496 (3.2458) grad_norm 2.5134 (2.4609) loss_scale 2048.0000 (1865.1885) mem 7382MB [2024-08-27 08:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][810/1251] eta 0:01:46 lr 0.000601 wd 0.0500 time 0.2486 (0.2414) data time 0.0007 (0.0019) model time 0.2479 (0.2393) loss 3.9252 (3.2505) grad_norm 3.0413 (2.4636) loss_scale 2048.0000 (1867.4427) mem 7382MB [2024-08-27 08:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][820/1251] eta 0:01:44 lr 0.000601 wd 0.0500 time 0.2467 (0.2414) data time 0.0011 (0.0019) model time 0.2456 (0.2393) loss 2.9311 (3.2482) grad_norm 2.9543 (2.4602) loss_scale 2048.0000 (1869.6419) mem 7382MB [2024-08-27 08:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][830/1251] eta 0:01:41 lr 0.000601 wd 0.0500 time 0.2286 (0.2414) data time 0.0011 (0.0019) model time 0.2275 (0.2393) loss 4.0386 (3.2492) grad_norm 1.8193 (2.4573) loss_scale 2048.0000 (1871.7882) mem 7382MB [2024-08-27 08:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][840/1251] eta 0:01:39 lr 0.000601 wd 0.0500 time 0.2402 (0.2414) 
data time 0.0008 (0.0018) model time 0.2393 (0.2393) loss 4.1748 (3.2524) grad_norm 4.4446 (2.4547) loss_scale 2048.0000 (1873.8835) mem 7382MB [2024-08-27 08:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][850/1251] eta 0:01:36 lr 0.000601 wd 0.0500 time 0.2199 (0.2415) data time 0.0011 (0.0018) model time 0.2188 (0.2394) loss 3.5815 (3.2537) grad_norm 2.9957 (2.4669) loss_scale 2048.0000 (1875.9295) mem 7382MB [2024-08-27 08:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][860/1251] eta 0:01:34 lr 0.000601 wd 0.0500 time 0.2381 (0.2415) data time 0.0008 (0.0018) model time 0.2374 (0.2394) loss 3.3447 (3.2554) grad_norm 2.5695 (2.4732) loss_scale 2048.0000 (1877.9280) mem 7382MB [2024-08-27 08:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][870/1251] eta 0:01:32 lr 0.000601 wd 0.0500 time 0.2352 (0.2415) data time 0.0010 (0.0018) model time 0.2342 (0.2394) loss 3.1581 (3.2585) grad_norm 2.6667 (2.4717) loss_scale 2048.0000 (1879.8806) mem 7382MB [2024-08-27 08:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][880/1251] eta 0:01:29 lr 0.000600 wd 0.0500 time 0.2318 (0.2415) data time 0.0011 (0.0018) model time 0.2307 (0.2394) loss 3.2100 (3.2579) grad_norm 1.9732 (2.4663) loss_scale 2048.0000 (1881.7889) mem 7382MB [2024-08-27 08:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][890/1251] eta 0:01:27 lr 0.000600 wd 0.0500 time 0.2356 (0.2415) data time 0.0010 (0.0018) model time 0.2346 (0.2394) loss 3.7079 (3.2582) grad_norm 2.6420 (2.4617) loss_scale 2048.0000 (1883.6543) mem 7382MB [2024-08-27 08:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][900/1251] eta 0:01:24 lr 0.000600 wd 0.0500 time 0.2393 (0.2414) data time 0.0012 (0.0018) model time 0.2381 (0.2394) loss 2.7695 (3.2549) grad_norm 2.9128 (2.4608) loss_scale 2048.0000 (1885.4784) mem 7382MB [2024-08-27 08:39:23 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [142/300][910/1251] eta 0:01:22 lr 0.000600 wd 0.0500 time 0.2337 (0.2414) data time 0.0011 (0.0018) model time 0.2326 (0.2394) loss 2.5836 (3.2528) grad_norm 3.3780 (2.4585) loss_scale 2048.0000 (1887.2623) mem 7382MB [2024-08-27 08:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][920/1251] eta 0:01:19 lr 0.000600 wd 0.0500 time 0.2347 (0.2414) data time 0.0009 (0.0018) model time 0.2337 (0.2393) loss 2.5089 (3.2510) grad_norm 2.3625 (2.4559) loss_scale 2048.0000 (1889.0076) mem 7382MB [2024-08-27 08:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][930/1251] eta 0:01:17 lr 0.000600 wd 0.0500 time 0.2344 (0.2413) data time 0.0009 (0.0018) model time 0.2335 (0.2393) loss 2.2352 (3.2524) grad_norm 1.8975 (2.4546) loss_scale 2048.0000 (1890.7154) mem 7382MB [2024-08-27 08:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][940/1251] eta 0:01:15 lr 0.000600 wd 0.0500 time 0.2347 (0.2413) data time 0.0011 (0.0018) model time 0.2336 (0.2393) loss 3.1980 (3.2521) grad_norm 1.8857 (2.4526) loss_scale 2048.0000 (1892.3868) mem 7382MB [2024-08-27 08:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][950/1251] eta 0:01:12 lr 0.000600 wd 0.0500 time 0.2347 (0.2413) data time 0.0011 (0.0018) model time 0.2335 (0.2393) loss 2.2066 (3.2488) grad_norm 2.0380 (2.4547) loss_scale 2048.0000 (1894.0231) mem 7382MB [2024-08-27 08:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][960/1251] eta 0:01:10 lr 0.000600 wd 0.0500 time 0.2289 (0.2413) data time 0.0011 (0.0018) model time 0.2278 (0.2393) loss 2.7852 (3.2476) grad_norm 2.4547 (2.4550) loss_scale 2048.0000 (1895.6254) mem 7382MB [2024-08-27 08:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][970/1251] eta 0:01:07 lr 0.000600 wd 0.0500 time 0.2389 (0.2412) data time 0.0010 (0.0017) model time 0.2379 (0.2393) loss 3.6469 (3.2498) grad_norm 
2.2104 (2.4530) loss_scale 2048.0000 (1897.1946) mem 7382MB [2024-08-27 08:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][980/1251] eta 0:01:05 lr 0.000600 wd 0.0500 time 0.2374 (0.2412) data time 0.0011 (0.0017) model time 0.2363 (0.2392) loss 3.1625 (3.2497) grad_norm 2.5483 (2.4517) loss_scale 2048.0000 (1898.7319) mem 7382MB [2024-08-27 08:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][990/1251] eta 0:01:02 lr 0.000600 wd 0.0500 time 0.2281 (0.2412) data time 0.0010 (0.0017) model time 0.2271 (0.2392) loss 3.2548 (3.2495) grad_norm 2.1543 (2.4513) loss_scale 2048.0000 (1900.2381) mem 7382MB [2024-08-27 08:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1000/1251] eta 0:01:00 lr 0.000600 wd 0.0500 time 0.2402 (0.2411) data time 0.0008 (0.0017) model time 0.2394 (0.2392) loss 3.4590 (3.2486) grad_norm 3.0554 (2.4513) loss_scale 2048.0000 (1901.7143) mem 7382MB [2024-08-27 08:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1010/1251] eta 0:00:58 lr 0.000600 wd 0.0500 time 0.2340 (0.2411) data time 0.0008 (0.0017) model time 0.2332 (0.2392) loss 3.4388 (3.2474) grad_norm 3.8315 (2.4565) loss_scale 2048.0000 (1903.1612) mem 7382MB [2024-08-27 08:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1020/1251] eta 0:00:55 lr 0.000600 wd 0.0500 time 0.2360 (0.2411) data time 0.0008 (0.0017) model time 0.2352 (0.2392) loss 2.3259 (3.2432) grad_norm 2.1989 (2.4615) loss_scale 2048.0000 (1904.5798) mem 7382MB [2024-08-27 08:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1030/1251] eta 0:00:53 lr 0.000600 wd 0.0500 time 0.2433 (0.2411) data time 0.0007 (0.0017) model time 0.2425 (0.2391) loss 3.3265 (3.2410) grad_norm 2.5486 (2.4588) loss_scale 2048.0000 (1905.9709) mem 7382MB [2024-08-27 08:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1040/1251] eta 0:00:50 lr 0.000600 wd 0.0500 
time 0.2358 (0.2411) data time 0.0010 (0.0017) model time 0.2348 (0.2391) loss 2.2631 (3.2407) grad_norm 1.8409 (2.4588) loss_scale 2048.0000 (1907.3353) mem 7382MB [2024-08-27 08:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1050/1251] eta 0:00:48 lr 0.000600 wd 0.0500 time 0.2360 (0.2410) data time 0.0010 (0.0017) model time 0.2350 (0.2391) loss 3.5505 (3.2390) grad_norm 2.7899 (2.4568) loss_scale 2048.0000 (1908.6736) mem 7382MB [2024-08-27 08:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1060/1251] eta 0:00:46 lr 0.000600 wd 0.0500 time 0.2346 (0.2411) data time 0.0010 (0.0017) model time 0.2335 (0.2391) loss 3.3911 (3.2414) grad_norm 1.9866 (2.4537) loss_scale 2048.0000 (1909.9868) mem 7382MB [2024-08-27 08:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1070/1251] eta 0:00:43 lr 0.000600 wd 0.0500 time 0.2368 (0.2410) data time 0.0008 (0.0017) model time 0.2360 (0.2391) loss 2.2404 (3.2402) grad_norm 2.5204 (2.4544) loss_scale 2048.0000 (1911.2754) mem 7382MB [2024-08-27 08:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1080/1251] eta 0:00:41 lr 0.000600 wd 0.0500 time 0.2436 (0.2410) data time 0.0008 (0.0017) model time 0.2428 (0.2391) loss 2.3787 (3.2405) grad_norm 2.2996 (2.4560) loss_scale 2048.0000 (1912.5402) mem 7382MB [2024-08-27 08:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1090/1251] eta 0:00:38 lr 0.000600 wd 0.0500 time 0.2452 (0.2410) data time 0.0007 (0.0017) model time 0.2445 (0.2391) loss 4.2208 (3.2415) grad_norm 1.6636 (2.4548) loss_scale 2048.0000 (1913.7819) mem 7382MB [2024-08-27 08:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1100/1251] eta 0:00:36 lr 0.000600 wd 0.0500 time 0.2355 (0.2410) data time 0.0011 (0.0017) model time 0.2344 (0.2391) loss 3.4690 (3.2441) grad_norm 2.0143 (2.4511) loss_scale 2048.0000 (1915.0009) mem 7382MB [2024-08-27 08:40:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1110/1251] eta 0:00:33 lr 0.000599 wd 0.0500 time 0.2398 (0.2410) data time 0.0012 (0.0017) model time 0.2386 (0.2391) loss 2.6350 (3.2428) grad_norm 2.3926 (2.4478) loss_scale 2048.0000 (1916.1980) mem 7382MB [2024-08-27 08:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1120/1251] eta 0:00:31 lr 0.000599 wd 0.0500 time 0.2371 (0.2409) data time 0.0007 (0.0017) model time 0.2364 (0.2391) loss 2.7696 (3.2421) grad_norm 1.8678 (2.4461) loss_scale 2048.0000 (1917.3738) mem 7382MB [2024-08-27 08:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1130/1251] eta 0:00:29 lr 0.000599 wd 0.0500 time 0.2425 (0.2409) data time 0.0007 (0.0017) model time 0.2418 (0.2391) loss 3.6999 (3.2423) grad_norm 2.5866 (2.4456) loss_scale 2048.0000 (1918.5287) mem 7382MB [2024-08-27 08:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1140/1251] eta 0:00:26 lr 0.000599 wd 0.0500 time 0.2371 (0.2409) data time 0.0012 (0.0017) model time 0.2359 (0.2391) loss 2.9296 (3.2438) grad_norm 1.5291 (2.4437) loss_scale 2048.0000 (1919.6635) mem 7382MB [2024-08-27 08:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1150/1251] eta 0:00:24 lr 0.000599 wd 0.0500 time 0.2410 (0.2409) data time 0.0008 (0.0017) model time 0.2402 (0.2391) loss 3.5923 (3.2426) grad_norm 2.7068 (2.4468) loss_scale 2048.0000 (1920.7785) mem 7382MB [2024-08-27 08:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1160/1251] eta 0:00:21 lr 0.000599 wd 0.0500 time 0.2432 (0.2409) data time 0.0010 (0.0016) model time 0.2422 (0.2390) loss 2.3554 (3.2435) grad_norm 1.8402 (2.4456) loss_scale 2048.0000 (1921.8742) mem 7382MB [2024-08-27 08:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1170/1251] eta 0:00:19 lr 0.000599 wd 0.0500 time 0.2554 (0.2411) data time 0.0012 (0.0016) model time 0.2542 (0.2393) loss 
2.9602 (3.2429) grad_norm 2.2048 (2.4431) loss_scale 2048.0000 (1922.9513) mem 7382MB [2024-08-27 08:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1180/1251] eta 0:00:17 lr 0.000599 wd 0.0500 time 0.2355 (0.2411) data time 0.0011 (0.0016) model time 0.2343 (0.2393) loss 3.3317 (3.2447) grad_norm 1.7149 (2.4453) loss_scale 2048.0000 (1924.0102) mem 7382MB [2024-08-27 08:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1190/1251] eta 0:00:14 lr 0.000599 wd 0.0500 time 0.2470 (0.2411) data time 0.0008 (0.0016) model time 0.2462 (0.2392) loss 3.6791 (3.2433) grad_norm 2.6822 (2.4521) loss_scale 2048.0000 (1925.0512) mem 7382MB [2024-08-27 08:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1200/1251] eta 0:00:12 lr 0.000599 wd 0.0500 time 0.2374 (0.2411) data time 0.0011 (0.0016) model time 0.2363 (0.2392) loss 3.2845 (3.2446) grad_norm 2.6040 (2.4513) loss_scale 2048.0000 (1926.0749) mem 7382MB [2024-08-27 08:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1210/1251] eta 0:00:09 lr 0.000599 wd 0.0500 time 0.2317 (0.2411) data time 0.0008 (0.0016) model time 0.2308 (0.2392) loss 3.9512 (3.2440) grad_norm 2.0911 (2.4481) loss_scale 2048.0000 (1927.0818) mem 7382MB [2024-08-27 08:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1220/1251] eta 0:00:07 lr 0.000599 wd 0.0500 time 0.2399 (0.2410) data time 0.0011 (0.0016) model time 0.2389 (0.2392) loss 2.7280 (3.2452) grad_norm 1.8368 (2.4459) loss_scale 2048.0000 (1928.0721) mem 7382MB [2024-08-27 08:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1230/1251] eta 0:00:05 lr 0.000599 wd 0.0500 time 0.2356 (0.2410) data time 0.0012 (0.0016) model time 0.2344 (0.2392) loss 3.7593 (3.2442) grad_norm 2.2048 (2.4476) loss_scale 2048.0000 (1929.0463) mem 7382MB [2024-08-27 08:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1240/1251] eta 
0:00:02 lr 0.000599 wd 0.0500 time 0.2298 (0.2410) data time 0.0007 (0.0016) model time 0.2291 (0.2391) loss 3.3050 (3.2431) grad_norm 3.8748 (2.4475) loss_scale 2048.0000 (1930.0048) mem 7382MB [2024-08-27 08:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [142/300][1250/1251] eta 0:00:00 lr 0.000599 wd 0.0500 time 0.2242 (0.2408) data time 0.0007 (0.0016) model time 0.2234 (0.2390) loss 2.6247 (3.2413) grad_norm 3.1492 (2.4495) loss_scale 2048.0000 (1930.9480) mem 7382MB [2024-08-27 08:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 142 training takes 0:05:01 [2024-08-27 08:40:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:40:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 08:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.403 (0.403) Loss 0.5000 (0.5000) Acc@1 90.625 (90.625) Acc@5 98.730 (98.730) Mem 7382MB [2024-08-27 08:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.116) Loss 0.7856 (0.7612) Acc@1 83.984 (84.144) Acc@5 96.484 (96.928) Mem 7382MB [2024-08-27 08:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.096) Loss 1.0771 (0.7868) Acc@1 75.586 (83.185) Acc@5 93.848 (96.791) Mem 7382MB [2024-08-27 08:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.089) Loss 1.3125 (0.8915) Acc@1 69.629 (80.897) Acc@5 90.625 (95.432) Mem 7382MB [2024-08-27 08:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.2354 (0.9487) Acc@1 71.973 (79.385) Acc@5 91.992 (94.836) Mem 7382MB [2024-08-27 08:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.044 Acc@5 94.774 [2024-08-27 08:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of 
the network on the 50000 test images: 79.0% [2024-08-27 08:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.04% [2024-08-27 08:40:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 08:40:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 08:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.446 (0.446) Loss 0.4102 (0.4102) Acc@1 92.773 (92.773) Acc@5 98.242 (98.242) Mem 7382MB [2024-08-27 08:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.111) Loss 0.6582 (0.6456) Acc@1 86.719 (86.151) Acc@5 96.875 (97.257) Mem 7382MB [2024-08-27 08:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.095) Loss 0.9160 (0.6701) Acc@1 79.004 (85.198) Acc@5 95.215 (97.284) Mem 7382MB [2024-08-27 08:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.089) Loss 1.1602 (0.7603) Acc@1 70.996 (82.916) Acc@5 92.188 (96.305) Mem 7382MB [2024-08-27 08:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.083) Loss 1.0508 (0.8077) Acc@1 74.707 (81.562) Acc@5 93.457 (95.782) Mem 7382MB [2024-08-27 08:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.156 Acc@5 95.744 [2024-08-27 08:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-27 08:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][0/1251] eta 0:21:46 lr 0.000599 wd 0.0500 time 1.0445 (1.0445) data time 0.6617 (0.6617) model time 0.0000 (0.0000) loss 3.7970 (3.7970) grad_norm 2.4450 (2.4450) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][10/1251] eta 0:06:31 lr 
0.000599 wd 0.0500 time 0.2524 (0.3151) data time 0.0010 (0.0612) model time 0.0000 (0.0000) loss 3.3840 (3.4411) grad_norm 2.0084 (2.1310) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][20/1251] eta 0:05:42 lr 0.000599 wd 0.0500 time 0.2407 (0.2781) data time 0.0008 (0.0326) model time 0.0000 (0.0000) loss 3.9268 (3.3374) grad_norm 2.6354 (2.3076) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][30/1251] eta 0:05:23 lr 0.000599 wd 0.0500 time 0.2468 (0.2648) data time 0.0011 (0.0225) model time 0.0000 (0.0000) loss 3.6171 (3.3486) grad_norm 1.9917 (2.4349) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][40/1251] eta 0:05:13 lr 0.000599 wd 0.0500 time 0.2358 (0.2587) data time 0.0007 (0.0173) model time 0.0000 (0.0000) loss 3.8972 (3.3277) grad_norm 3.1795 (2.4485) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][50/1251] eta 0:05:13 lr 0.000599 wd 0.0500 time 0.2330 (0.2611) data time 0.0009 (0.0143) model time 0.0000 (0.0000) loss 2.2397 (3.2487) grad_norm 1.8432 (2.4008) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][60/1251] eta 0:05:07 lr 0.000599 wd 0.0500 time 0.2417 (0.2579) data time 0.0008 (0.0124) model time 0.2409 (0.2389) loss 2.5617 (3.2519) grad_norm 2.3742 (2.4360) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][70/1251] eta 0:05:01 lr 0.000599 wd 0.0500 time 0.2378 (0.2554) data time 0.0010 (0.0108) model time 0.2367 (0.2388) loss 2.9842 (3.2493) grad_norm 2.0491 (2.4079) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:14 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][80/1251] eta 0:04:56 lr 0.000598 wd 0.0500 time 0.2433 (0.2532) data time 0.0011 (0.0096) model time 0.2423 (0.2382) loss 3.3737 (3.2473) grad_norm 2.0464 (2.3655) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][90/1251] eta 0:04:52 lr 0.000598 wd 0.0500 time 0.2409 (0.2516) data time 0.0010 (0.0087) model time 0.2399 (0.2380) loss 3.6551 (3.2415) grad_norm 2.2284 (2.3692) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][100/1251] eta 0:04:48 lr 0.000598 wd 0.0500 time 0.2387 (0.2507) data time 0.0009 (0.0080) model time 0.2377 (0.2387) loss 3.1802 (3.2237) grad_norm 2.8631 (2.3792) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][110/1251] eta 0:04:45 lr 0.000598 wd 0.0500 time 0.2470 (0.2499) data time 0.0010 (0.0073) model time 0.2459 (0.2389) loss 3.5249 (3.2311) grad_norm 3.1480 (2.3988) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][120/1251] eta 0:04:41 lr 0.000598 wd 0.0500 time 0.2374 (0.2489) data time 0.0010 (0.0068) model time 0.2364 (0.2386) loss 3.3670 (3.2361) grad_norm 7.3665 (2.4617) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][130/1251] eta 0:04:38 lr 0.000598 wd 0.0500 time 0.2460 (0.2482) data time 0.0007 (0.0064) model time 0.2452 (0.2385) loss 4.1565 (3.2530) grad_norm 2.4211 (2.4723) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][140/1251] eta 0:04:34 lr 0.000598 wd 0.0500 time 0.2424 (0.2475) data time 0.0009 (0.0060) model time 0.2415 (0.2384) loss 4.0559 
(3.2653) grad_norm 2.6199 (2.4714) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][150/1251] eta 0:04:31 lr 0.000598 wd 0.0500 time 0.2434 (0.2468) data time 0.0011 (0.0057) model time 0.2423 (0.2382) loss 3.5353 (3.2912) grad_norm 2.0890 (2.4569) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][160/1251] eta 0:04:28 lr 0.000598 wd 0.0500 time 0.2413 (0.2463) data time 0.0007 (0.0054) model time 0.2406 (0.2381) loss 3.1477 (3.2861) grad_norm 2.4763 (2.4799) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][170/1251] eta 0:04:25 lr 0.000598 wd 0.0500 time 0.2402 (0.2458) data time 0.0010 (0.0052) model time 0.2392 (0.2379) loss 3.6602 (3.2918) grad_norm 3.1522 (2.5164) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][180/1251] eta 0:04:22 lr 0.000598 wd 0.0500 time 0.2353 (0.2454) data time 0.0009 (0.0049) model time 0.2344 (0.2380) loss 2.5382 (3.2877) grad_norm 2.6058 (2.5672) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][190/1251] eta 0:04:20 lr 0.000598 wd 0.0500 time 0.2383 (0.2451) data time 0.0009 (0.0047) model time 0.2374 (0.2380) loss 2.4279 (3.2780) grad_norm 2.8324 (2.5637) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][200/1251] eta 0:04:17 lr 0.000598 wd 0.0500 time 0.2391 (0.2449) data time 0.0009 (0.0046) model time 0.2382 (0.2382) loss 3.8355 (3.2768) grad_norm 1.9260 (2.5457) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][210/1251] eta 0:04:14 lr 
0.000598 wd 0.0500 time 0.2340 (0.2447) data time 0.0008 (0.0044) model time 0.2332 (0.2382) loss 3.9424 (3.2846) grad_norm 1.8090 (2.5263) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][220/1251] eta 0:04:12 lr 0.000598 wd 0.0500 time 0.2443 (0.2444) data time 0.0011 (0.0042) model time 0.2431 (0.2382) loss 3.5050 (3.2826) grad_norm 3.1021 (2.5189) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][230/1251] eta 0:04:09 lr 0.000598 wd 0.0500 time 0.2377 (0.2441) data time 0.0009 (0.0041) model time 0.2368 (0.2381) loss 3.1718 (3.2894) grad_norm 3.6136 (2.5224) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][240/1251] eta 0:04:06 lr 0.000598 wd 0.0500 time 0.2393 (0.2439) data time 0.0008 (0.0040) model time 0.2385 (0.2380) loss 3.4984 (3.2792) grad_norm 2.8329 (2.5130) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][250/1251] eta 0:04:04 lr 0.000598 wd 0.0500 time 0.2430 (0.2438) data time 0.0011 (0.0039) model time 0.2419 (0.2381) loss 2.4376 (3.2663) grad_norm 2.3526 (2.5003) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][260/1251] eta 0:04:01 lr 0.000598 wd 0.0500 time 0.2292 (0.2435) data time 0.0012 (0.0038) model time 0.2281 (0.2380) loss 3.4838 (3.2725) grad_norm 2.6295 (2.5037) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][270/1251] eta 0:03:58 lr 0.000598 wd 0.0500 time 0.2420 (0.2433) data time 0.0009 (0.0037) model time 0.2412 (0.2380) loss 3.5247 (3.2736) grad_norm 3.0106 (2.4962) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 
08:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][280/1251] eta 0:03:55 lr 0.000598 wd 0.0500 time 0.2416 (0.2430) data time 0.0008 (0.0036) model time 0.2408 (0.2378) loss 2.9859 (3.2733) grad_norm 2.2207 (2.5100) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][290/1251] eta 0:03:53 lr 0.000598 wd 0.0500 time 0.2417 (0.2429) data time 0.0009 (0.0035) model time 0.2409 (0.2378) loss 2.4109 (3.2760) grad_norm 2.5392 (2.5000) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][300/1251] eta 0:03:50 lr 0.000598 wd 0.0500 time 0.2372 (0.2427) data time 0.0013 (0.0034) model time 0.2359 (0.2378) loss 3.2953 (3.2794) grad_norm 2.1978 (2.4876) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][310/1251] eta 0:03:48 lr 0.000597 wd 0.0500 time 0.2412 (0.2426) data time 0.0010 (0.0033) model time 0.2402 (0.2377) loss 3.5572 (3.2717) grad_norm 1.8238 (2.4825) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][320/1251] eta 0:03:45 lr 0.000597 wd 0.0500 time 0.2410 (0.2424) data time 0.0011 (0.0033) model time 0.2398 (0.2377) loss 3.2173 (3.2740) grad_norm 1.9944 (2.4788) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][330/1251] eta 0:03:43 lr 0.000597 wd 0.0500 time 0.2364 (0.2423) data time 0.0010 (0.0032) model time 0.2353 (0.2377) loss 3.3756 (3.2736) grad_norm 2.4205 (2.4709) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][340/1251] eta 0:03:40 lr 0.000597 wd 0.0500 time 0.2385 (0.2421) data time 0.0010 (0.0031) model time 0.2375 (0.2376) 
loss 3.1311 (3.2682) grad_norm 1.9490 (2.4764) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][350/1251] eta 0:03:38 lr 0.000597 wd 0.0500 time 0.2396 (0.2421) data time 0.0011 (0.0031) model time 0.2385 (0.2376) loss 3.2897 (3.2639) grad_norm 1.7074 (2.4700) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][360/1251] eta 0:03:35 lr 0.000597 wd 0.0500 time 0.2421 (0.2420) data time 0.0009 (0.0030) model time 0.2412 (0.2376) loss 3.4593 (3.2643) grad_norm 2.4040 (2.4581) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][370/1251] eta 0:03:33 lr 0.000597 wd 0.0500 time 0.2342 (0.2419) data time 0.0008 (0.0030) model time 0.2334 (0.2377) loss 3.2864 (3.2647) grad_norm 1.9891 (2.4560) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][380/1251] eta 0:03:30 lr 0.000597 wd 0.0500 time 0.2324 (0.2418) data time 0.0011 (0.0029) model time 0.2313 (0.2376) loss 3.1171 (3.2609) grad_norm 3.2388 (2.4701) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][390/1251] eta 0:03:28 lr 0.000597 wd 0.0500 time 0.2444 (0.2416) data time 0.0009 (0.0029) model time 0.2435 (0.2375) loss 3.3568 (3.2597) grad_norm 1.6095 (2.4682) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][400/1251] eta 0:03:25 lr 0.000597 wd 0.0500 time 0.2341 (0.2414) data time 0.0008 (0.0029) model time 0.2333 (0.2374) loss 2.8421 (3.2601) grad_norm 2.1248 (2.4666) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][410/1251] eta 
0:03:22 lr 0.000597 wd 0.0500 time 0.2391 (0.2414) data time 0.0012 (0.0028) model time 0.2379 (0.2374) loss 3.4385 (3.2567) grad_norm 2.2764 (2.4607) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][420/1251] eta 0:03:20 lr 0.000597 wd 0.0500 time 0.2399 (0.2413) data time 0.0010 (0.0028) model time 0.2389 (0.2374) loss 2.5361 (3.2581) grad_norm 3.4366 (2.4524) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][430/1251] eta 0:03:18 lr 0.000597 wd 0.0500 time 0.2448 (0.2412) data time 0.0010 (0.0027) model time 0.2438 (0.2374) loss 3.2212 (3.2535) grad_norm 3.6827 (2.4561) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][440/1251] eta 0:03:15 lr 0.000597 wd 0.0500 time 0.2370 (0.2411) data time 0.0010 (0.0027) model time 0.2360 (0.2374) loss 3.3213 (3.2540) grad_norm 2.2527 (2.4591) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][450/1251] eta 0:03:13 lr 0.000597 wd 0.0500 time 0.2379 (0.2411) data time 0.0008 (0.0027) model time 0.2370 (0.2373) loss 2.4990 (3.2524) grad_norm 2.2825 (2.4643) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][460/1251] eta 0:03:10 lr 0.000597 wd 0.0500 time 0.2433 (0.2411) data time 0.0008 (0.0026) model time 0.2424 (0.2374) loss 3.9852 (3.2493) grad_norm 2.2354 (2.4729) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][470/1251] eta 0:03:08 lr 0.000597 wd 0.0500 time 0.2398 (0.2410) data time 0.0008 (0.0026) model time 0.2390 (0.2374) loss 3.9500 (3.2531) grad_norm 2.5099 (2.4679) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 08:42:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][480/1251] eta 0:03:05 lr 0.000597 wd 0.0500 time 0.2417 (0.2409) data time 0.0008 (0.0026) model time 0.2409 (0.2374) loss 2.5143 (3.2546) grad_norm 2.3472 (2.4711) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][490/1251] eta 0:03:03 lr 0.000597 wd 0.0500 time 0.2406 (0.2409) data time 0.0011 (0.0025) model time 0.2395 (0.2374) loss 3.4045 (3.2578) grad_norm 2.6091 (2.4750) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][500/1251] eta 0:03:00 lr 0.000597 wd 0.0500 time 0.2397 (0.2408) data time 0.0008 (0.0025) model time 0.2389 (0.2373) loss 2.8059 (3.2621) grad_norm 2.2457 (2.4730) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][510/1251] eta 0:02:58 lr 0.000597 wd 0.0500 time 0.2403 (0.2408) data time 0.0011 (0.0025) model time 0.2392 (0.2374) loss 3.3709 (3.2595) grad_norm 2.5867 (2.4684) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:42:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][520/1251] eta 0:02:55 lr 0.000597 wd 0.0500 time 0.2352 (0.2407) data time 0.0009 (0.0024) model time 0.2343 (0.2373) loss 3.5436 (3.2555) grad_norm 2.1561 (2.4720) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][530/1251] eta 0:02:53 lr 0.000597 wd 0.0500 time 0.2424 (0.2407) data time 0.0009 (0.0024) model time 0.2415 (0.2373) loss 3.5477 (3.2569) grad_norm 1.6411 (2.4668) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][540/1251] eta 0:02:51 lr 0.000596 wd 0.0500 time 0.2392 (0.2406) data time 0.0013 (0.0024) model time 0.2379 
(0.2373) loss 3.5420 (3.2608) grad_norm 2.5966 (2.4648) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][550/1251] eta 0:02:48 lr 0.000596 wd 0.0500 time 0.2414 (0.2406) data time 0.0010 (0.0024) model time 0.2403 (0.2374) loss 2.3167 (3.2593) grad_norm 2.5669 (2.4650) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][560/1251] eta 0:02:46 lr 0.000596 wd 0.0500 time 0.2433 (0.2406) data time 0.0012 (0.0024) model time 0.2422 (0.2374) loss 3.5341 (3.2641) grad_norm 1.7526 (2.4615) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][570/1251] eta 0:02:43 lr 0.000596 wd 0.0500 time 0.2426 (0.2406) data time 0.0011 (0.0023) model time 0.2415 (0.2374) loss 3.9075 (3.2633) grad_norm 2.1983 (2.4571) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][580/1251] eta 0:02:41 lr 0.000596 wd 0.0500 time 0.2475 (0.2406) data time 0.0008 (0.0023) model time 0.2467 (0.2375) loss 3.6494 (3.2655) grad_norm 2.4381 (2.4613) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][590/1251] eta 0:02:39 lr 0.000596 wd 0.0500 time 0.2453 (0.2405) data time 0.0010 (0.0023) model time 0.2443 (0.2374) loss 3.6927 (3.2648) grad_norm 3.0282 (2.4560) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][600/1251] eta 0:02:36 lr 0.000596 wd 0.0500 time 0.2354 (0.2405) data time 0.0010 (0.0023) model time 0.2344 (0.2374) loss 3.2102 (3.2682) grad_norm 2.6208 (2.4635) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[143/300][610/1251] eta 0:02:34 lr 0.000596 wd 0.0500 time 0.2488 (0.2405) data time 0.0009 (0.0023) model time 0.2479 (0.2374) loss 3.2531 (3.2644) grad_norm 2.7149 (2.4610) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][620/1251] eta 0:02:31 lr 0.000596 wd 0.0500 time 0.2484 (0.2405) data time 0.0009 (0.0022) model time 0.2475 (0.2375) loss 3.0677 (3.2645) grad_norm 1.5606 (2.4563) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][630/1251] eta 0:02:29 lr 0.000596 wd 0.0500 time 0.2382 (0.2405) data time 0.0010 (0.0022) model time 0.2372 (0.2375) loss 2.8022 (3.2658) grad_norm 2.0549 (2.4496) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][640/1251] eta 0:02:26 lr 0.000596 wd 0.0500 time 0.2375 (0.2405) data time 0.0011 (0.0022) model time 0.2364 (0.2375) loss 2.7302 (3.2672) grad_norm 2.4922 (2.4478) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][650/1251] eta 0:02:24 lr 0.000596 wd 0.0500 time 0.2357 (0.2404) data time 0.0012 (0.0022) model time 0.2345 (0.2375) loss 3.4498 (3.2694) grad_norm 3.9113 (2.4679) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][660/1251] eta 0:02:22 lr 0.000596 wd 0.0500 time 0.2498 (0.2404) data time 0.0012 (0.0022) model time 0.2486 (0.2375) loss 3.2950 (3.2690) grad_norm 5.2718 (2.4849) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][670/1251] eta 0:02:19 lr 0.000596 wd 0.0500 time 0.2424 (0.2404) data time 0.0012 (0.0022) model time 0.2413 (0.2375) loss 3.6127 (3.2687) grad_norm 1.8174 (2.4851) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 08:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][680/1251] eta 0:02:17 lr 0.000596 wd 0.0500 time 0.2406 (0.2404) data time 0.0008 (0.0021) model time 0.2398 (0.2376) loss 3.9695 (3.2708) grad_norm 1.8517 (2.4795) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][690/1251] eta 0:02:14 lr 0.000596 wd 0.0500 time 0.2363 (0.2404) data time 0.0007 (0.0021) model time 0.2356 (0.2376) loss 2.3284 (3.2695) grad_norm 1.8258 (2.4737) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][700/1251] eta 0:02:12 lr 0.000596 wd 0.0500 time 0.2428 (0.2404) data time 0.0010 (0.0021) model time 0.2418 (0.2376) loss 3.6286 (3.2724) grad_norm 3.1171 (2.4768) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][710/1251] eta 0:02:10 lr 0.000596 wd 0.0500 time 0.2395 (0.2404) data time 0.0008 (0.0021) model time 0.2387 (0.2377) loss 3.2195 (3.2728) grad_norm 2.4552 (2.4828) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][720/1251] eta 0:02:07 lr 0.000596 wd 0.0500 time 0.2392 (0.2404) data time 0.0011 (0.0021) model time 0.2382 (0.2377) loss 2.9387 (3.2704) grad_norm 2.0223 (2.4805) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][730/1251] eta 0:02:05 lr 0.000596 wd 0.0500 time 0.2355 (0.2404) data time 0.0010 (0.0021) model time 0.2345 (0.2377) loss 3.1579 (3.2678) grad_norm 1.8418 (2.4771) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][740/1251] eta 0:02:02 lr 0.000596 wd 0.0500 time 0.2443 (0.2404) data time 0.0007 
(0.0021) model time 0.2436 (0.2377) loss 3.8010 (3.2712) grad_norm 3.0237 (2.4751) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][750/1251] eta 0:02:00 lr 0.000596 wd 0.0500 time 0.2328 (0.2404) data time 0.0010 (0.0021) model time 0.2318 (0.2377) loss 3.7465 (3.2672) grad_norm 1.9326 (2.4745) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][760/1251] eta 0:01:58 lr 0.000596 wd 0.0500 time 0.2437 (0.2403) data time 0.0007 (0.0020) model time 0.2430 (0.2377) loss 3.8550 (3.2672) grad_norm 3.2482 (2.4779) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][770/1251] eta 0:01:55 lr 0.000595 wd 0.0500 time 0.2348 (0.2403) data time 0.0011 (0.0020) model time 0.2337 (0.2377) loss 2.8197 (3.2663) grad_norm 1.9099 (2.4820) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][780/1251] eta 0:01:53 lr 0.000595 wd 0.0500 time 0.2456 (0.2403) data time 0.0011 (0.0020) model time 0.2445 (0.2377) loss 3.6610 (3.2658) grad_norm 1.7305 (2.4755) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][790/1251] eta 0:01:50 lr 0.000595 wd 0.0500 time 0.2462 (0.2406) data time 0.0011 (0.0020) model time 0.2450 (0.2380) loss 3.3696 (3.2643) grad_norm 3.0470 (2.4718) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][800/1251] eta 0:01:48 lr 0.000595 wd 0.0500 time 0.2355 (0.2408) data time 0.0009 (0.0020) model time 0.2346 (0.2383) loss 2.9375 (3.2641) grad_norm 1.7610 (2.4659) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [143/300][810/1251] eta 0:01:46 lr 0.000595 wd 0.0500 time 0.2328 (0.2408) data time 0.0008 (0.0020) model time 0.2319 (0.2383) loss 2.5878 (3.2638) grad_norm 1.9286 (2.4620) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][820/1251] eta 0:01:43 lr 0.000595 wd 0.0500 time 0.2335 (0.2408) data time 0.0008 (0.0020) model time 0.2327 (0.2383) loss 3.6212 (3.2645) grad_norm 2.0102 (2.4571) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][830/1251] eta 0:01:41 lr 0.000595 wd 0.0500 time 0.2366 (0.2408) data time 0.0008 (0.0020) model time 0.2358 (0.2383) loss 2.8838 (3.2651) grad_norm 2.4733 (2.4542) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][840/1251] eta 0:01:38 lr 0.000595 wd 0.0500 time 0.2313 (0.2407) data time 0.0008 (0.0019) model time 0.2305 (0.2383) loss 2.2275 (3.2653) grad_norm 1.8066 (2.4493) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][850/1251] eta 0:01:36 lr 0.000595 wd 0.0500 time 0.2407 (0.2407) data time 0.0011 (0.0019) model time 0.2397 (0.2383) loss 2.9010 (3.2638) grad_norm 2.3018 (2.4465) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][860/1251] eta 0:01:34 lr 0.000595 wd 0.0500 time 0.2426 (0.2407) data time 0.0008 (0.0019) model time 0.2418 (0.2382) loss 3.3199 (3.2628) grad_norm 2.4235 (2.4477) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][870/1251] eta 0:01:31 lr 0.000595 wd 0.0500 time 0.2512 (0.2407) data time 0.0008 (0.0019) model time 0.2503 (0.2383) loss 2.9069 (3.2632) grad_norm 2.3754 (2.4481) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][880/1251] eta 0:01:29 lr 0.000595 wd 0.0500 time 0.2420 (0.2407) data time 0.0009 (0.0019) model time 0.2411 (0.2383) loss 3.6757 (3.2665) grad_norm 2.1154 (2.4463) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][890/1251] eta 0:01:26 lr 0.000595 wd 0.0500 time 0.2475 (0.2407) data time 0.0010 (0.0019) model time 0.2465 (0.2383) loss 3.5970 (3.2680) grad_norm 3.0710 (2.4461) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][900/1251] eta 0:01:24 lr 0.000595 wd 0.0500 time 0.2376 (0.2407) data time 0.0010 (0.0019) model time 0.2365 (0.2383) loss 3.5559 (3.2666) grad_norm 2.5649 (2.4518) loss_scale 4096.0000 (2068.4573) mem 7382MB [2024-08-27 08:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][910/1251] eta 0:01:22 lr 0.000595 wd 0.0500 time 0.2459 (0.2407) data time 0.0010 (0.0019) model time 0.2449 (0.2383) loss 2.7928 (3.2642) grad_norm 1.7372 (2.4489) loss_scale 4096.0000 (2090.7135) mem 7382MB [2024-08-27 08:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][920/1251] eta 0:01:19 lr 0.000595 wd 0.0500 time 0.2406 (0.2407) data time 0.0008 (0.0019) model time 0.2398 (0.2383) loss 3.5956 (3.2625) grad_norm 2.8779 (2.4534) loss_scale 4096.0000 (2112.4864) mem 7382MB [2024-08-27 08:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][930/1251] eta 0:01:17 lr 0.000595 wd 0.0500 time 0.2366 (0.2407) data time 0.0010 (0.0019) model time 0.2356 (0.2384) loss 3.4494 (3.2645) grad_norm 2.0668 (2.4517) loss_scale 4096.0000 (2133.7916) mem 7382MB [2024-08-27 08:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][940/1251] eta 0:01:14 lr 0.000595 wd 0.0500 time 0.2384 (0.2407) data time 
0.0009 (0.0019) model time 0.2375 (0.2383) loss 3.1589 (3.2627) grad_norm 2.2590 (2.4499) loss_scale 4096.0000 (2154.6440) mem 7382MB [2024-08-27 08:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][950/1251] eta 0:01:12 lr 0.000595 wd 0.0500 time 0.2465 (0.2406) data time 0.0010 (0.0019) model time 0.2455 (0.2383) loss 3.6135 (3.2646) grad_norm 2.7131 (2.4471) loss_scale 4096.0000 (2175.0578) mem 7382MB [2024-08-27 08:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][960/1251] eta 0:01:10 lr 0.000595 wd 0.0500 time 0.2380 (0.2406) data time 0.0010 (0.0018) model time 0.2370 (0.2383) loss 3.5140 (3.2686) grad_norm 2.1962 (2.4531) loss_scale 4096.0000 (2195.0468) mem 7382MB [2024-08-27 08:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][970/1251] eta 0:01:07 lr 0.000595 wd 0.0500 time 0.2458 (0.2406) data time 0.0010 (0.0018) model time 0.2447 (0.2383) loss 2.9512 (3.2692) grad_norm 2.5854 (2.4514) loss_scale 4096.0000 (2214.6241) mem 7382MB [2024-08-27 08:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][980/1251] eta 0:01:05 lr 0.000595 wd 0.0500 time 0.2146 (0.2408) data time 0.0011 (0.0018) model time 0.2135 (0.2386) loss 2.5246 (3.2660) grad_norm 1.8816 (2.4482) loss_scale 4096.0000 (2233.8022) mem 7382MB [2024-08-27 08:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][990/1251] eta 0:01:02 lr 0.000595 wd 0.0500 time 0.2288 (0.2408) data time 0.0009 (0.0018) model time 0.2279 (0.2385) loss 3.3997 (3.2649) grad_norm 2.6071 (2.4481) loss_scale 4096.0000 (2252.5933) mem 7382MB [2024-08-27 08:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1000/1251] eta 0:01:00 lr 0.000594 wd 0.0500 time 0.2427 (0.2407) data time 0.0007 (0.0018) model time 0.2419 (0.2385) loss 2.4015 (3.2618) grad_norm 1.7079 (2.4483) loss_scale 4096.0000 (2271.0090) mem 7382MB [2024-08-27 08:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [143/300][1010/1251] eta 0:00:58 lr 0.000594 wd 0.0500 time 0.2419 (0.2407) data time 0.0009 (0.0018) model time 0.2409 (0.2385) loss 2.9535 (3.2614) grad_norm 2.9799 (2.4481) loss_scale 4096.0000 (2289.0603) mem 7382MB [2024-08-27 08:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1020/1251] eta 0:00:55 lr 0.000594 wd 0.0500 time 0.2472 (0.2407) data time 0.0007 (0.0018) model time 0.2465 (0.2385) loss 3.5830 (3.2592) grad_norm 2.8815 (2.4492) loss_scale 4096.0000 (2306.7581) mem 7382MB [2024-08-27 08:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1030/1251] eta 0:00:53 lr 0.000594 wd 0.0500 time 0.2358 (0.2407) data time 0.0010 (0.0018) model time 0.2348 (0.2385) loss 2.0743 (3.2591) grad_norm 1.5751 (2.4457) loss_scale 4096.0000 (2324.1125) mem 7382MB [2024-08-27 08:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1040/1251] eta 0:00:50 lr 0.000594 wd 0.0500 time 0.2380 (0.2407) data time 0.0009 (0.0018) model time 0.2371 (0.2385) loss 3.0462 (3.2606) grad_norm 2.1110 (2.4416) loss_scale 4096.0000 (2341.1335) mem 7382MB [2024-08-27 08:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1050/1251] eta 0:00:48 lr 0.000594 wd 0.0500 time 0.2396 (0.2407) data time 0.0009 (0.0018) model time 0.2388 (0.2385) loss 4.2350 (3.2607) grad_norm 2.7477 (2.4403) loss_scale 4096.0000 (2357.8306) mem 7382MB [2024-08-27 08:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1060/1251] eta 0:00:45 lr 0.000594 wd 0.0500 time 0.2405 (0.2407) data time 0.0007 (0.0018) model time 0.2398 (0.2385) loss 3.9027 (3.2585) grad_norm 1.8947 (2.4378) loss_scale 4096.0000 (2374.2130) mem 7382MB [2024-08-27 08:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1070/1251] eta 0:00:43 lr 0.000594 wd 0.0500 time 0.2407 (0.2407) data time 0.0012 (0.0018) model time 0.2395 (0.2385) loss 3.3296 (3.2589) grad_norm 1.7671 (2.4377) 
loss_scale 4096.0000 (2390.2894) mem 7382MB [2024-08-27 08:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1080/1251] eta 0:00:41 lr 0.000594 wd 0.0500 time 0.2437 (0.2407) data time 0.0010 (0.0018) model time 0.2427 (0.2385) loss 3.1694 (3.2589) grad_norm 2.9590 (2.4434) loss_scale 4096.0000 (2406.0685) mem 7382MB [2024-08-27 08:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1090/1251] eta 0:00:38 lr 0.000594 wd 0.0500 time 0.2411 (0.2407) data time 0.0010 (0.0018) model time 0.2402 (0.2385) loss 2.9767 (3.2600) grad_norm 2.7494 (2.4420) loss_scale 4096.0000 (2421.5582) mem 7382MB [2024-08-27 08:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1100/1251] eta 0:00:36 lr 0.000594 wd 0.0500 time 0.2363 (0.2406) data time 0.0010 (0.0018) model time 0.2353 (0.2385) loss 2.7220 (3.2589) grad_norm 2.0053 (2.4422) loss_scale 4096.0000 (2436.7666) mem 7382MB [2024-08-27 08:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1110/1251] eta 0:00:33 lr 0.000594 wd 0.0500 time 0.2370 (0.2406) data time 0.0009 (0.0018) model time 0.2360 (0.2385) loss 3.0910 (3.2578) grad_norm 2.3251 (2.4401) loss_scale 4096.0000 (2451.7012) mem 7382MB [2024-08-27 08:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1120/1251] eta 0:00:31 lr 0.000594 wd 0.0500 time 0.2411 (0.2406) data time 0.0011 (0.0018) model time 0.2400 (0.2385) loss 3.4256 (3.2586) grad_norm 3.4370 (2.4429) loss_scale 4096.0000 (2466.3693) mem 7382MB [2024-08-27 08:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1130/1251] eta 0:00:29 lr 0.000594 wd 0.0500 time 0.2372 (0.2406) data time 0.0012 (0.0017) model time 0.2361 (0.2385) loss 3.7031 (3.2590) grad_norm 2.6820 (2.4461) loss_scale 4096.0000 (2480.7781) mem 7382MB [2024-08-27 08:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1140/1251] eta 0:00:26 lr 0.000594 wd 0.0500 time 0.2408 
(0.2405) data time 0.0008 (0.0017) model time 0.2400 (0.2384) loss 3.7447 (3.2577) grad_norm 1.9994 (2.4465) loss_scale 4096.0000 (2494.9343) mem 7382MB [2024-08-27 08:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1150/1251] eta 0:00:24 lr 0.000594 wd 0.0500 time 0.2380 (0.2405) data time 0.0011 (0.0017) model time 0.2369 (0.2384) loss 3.3521 (3.2572) grad_norm 2.2418 (2.4447) loss_scale 4096.0000 (2508.8445) mem 7382MB [2024-08-27 08:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1160/1251] eta 0:00:21 lr 0.000594 wd 0.0500 time 0.2354 (0.2405) data time 0.0008 (0.0017) model time 0.2346 (0.2384) loss 3.4077 (3.2573) grad_norm 2.2949 (2.4442) loss_scale 4096.0000 (2522.5151) mem 7382MB [2024-08-27 08:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1170/1251] eta 0:00:19 lr 0.000594 wd 0.0500 time 0.2384 (0.2405) data time 0.0009 (0.0017) model time 0.2375 (0.2384) loss 4.0670 (3.2598) grad_norm 2.5251 (2.4436) loss_scale 4096.0000 (2535.9522) mem 7382MB [2024-08-27 08:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1180/1251] eta 0:00:17 lr 0.000594 wd 0.0500 time 0.2410 (0.2405) data time 0.0007 (0.0017) model time 0.2402 (0.2384) loss 3.7493 (3.2572) grad_norm 2.1385 (inf) loss_scale 2048.0000 (2531.8205) mem 7382MB [2024-08-27 08:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1190/1251] eta 0:00:14 lr 0.000594 wd 0.0500 time 0.2355 (0.2405) data time 0.0009 (0.0017) model time 0.2345 (0.2384) loss 3.3455 (3.2572) grad_norm 2.8586 (inf) loss_scale 2048.0000 (2527.7582) mem 7382MB [2024-08-27 08:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1200/1251] eta 0:00:12 lr 0.000594 wd 0.0500 time 0.2364 (0.2404) data time 0.0010 (0.0017) model time 0.2354 (0.2384) loss 2.6049 (3.2567) grad_norm 2.1918 (inf) loss_scale 2048.0000 (2523.7635) mem 7382MB [2024-08-27 08:45:45 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [143/300][1210/1251] eta 0:00:09 lr 0.000594 wd 0.0500 time 0.2393 (0.2404) data time 0.0010 (0.0017) model time 0.2383 (0.2384) loss 3.2650 (3.2567) grad_norm 2.1629 (inf) loss_scale 2048.0000 (2519.8348) mem 7382MB [2024-08-27 08:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1220/1251] eta 0:00:07 lr 0.000594 wd 0.0500 time 0.2370 (0.2404) data time 0.0012 (0.0017) model time 0.2358 (0.2383) loss 3.3135 (3.2552) grad_norm 2.1641 (inf) loss_scale 2048.0000 (2515.9705) mem 7382MB [2024-08-27 08:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1230/1251] eta 0:00:05 lr 0.000593 wd 0.0500 time 0.2338 (0.2404) data time 0.0011 (0.0017) model time 0.2327 (0.2383) loss 3.4390 (3.2561) grad_norm 3.3389 (inf) loss_scale 2048.0000 (2512.1690) mem 7382MB [2024-08-27 08:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1240/1251] eta 0:00:02 lr 0.000593 wd 0.0500 time 0.2253 (0.2403) data time 0.0007 (0.0017) model time 0.2246 (0.2383) loss 3.4681 (3.2563) grad_norm 2.2217 (inf) loss_scale 2048.0000 (2508.4287) mem 7382MB [2024-08-27 08:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [143/300][1250/1251] eta 0:00:00 lr 0.000593 wd 0.0500 time 0.2260 (0.2402) data time 0.0005 (0.0017) model time 0.2256 (0.2381) loss 3.6039 (3.2556) grad_norm 1.8927 (inf) loss_scale 2048.0000 (2504.7482) mem 7382MB [2024-08-27 08:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 143 training takes 0:05:00 [2024-08-27 08:45:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:45:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 08:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.465 (0.465) Loss 0.4700 (0.4700) Acc@1 92.480 (92.480) Acc@5 97.949 (97.949) Mem 7382MB [2024-08-27 08:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.114) Loss 0.7471 (0.7354) Acc@1 83.984 (84.055) Acc@5 96.484 (96.715) Mem 7382MB [2024-08-27 08:45:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.097) Loss 1.1084 (0.7617) Acc@1 73.926 (83.138) Acc@5 93.750 (96.722) Mem 7382MB [2024-08-27 08:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.090) Loss 1.2568 (0.8690) Acc@1 71.094 (80.828) Acc@5 90.723 (95.398) Mem 7382MB [2024-08-27 08:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.1650 (0.9245) Acc@1 71.777 (79.423) Acc@5 92.773 (94.784) Mem 7382MB [2024-08-27 08:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.984 Acc@5 94.710 [2024-08-27 08:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.0% [2024-08-27 08:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.778 (0.778) Loss 0.4080 (0.4080) Acc@1 92.871 (92.871) Acc@5 98.340 (98.340) Mem 7382MB [2024-08-27 08:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.143) Loss 0.6572 (0.6448) Acc@1 86.816 (86.177) Acc@5 97.070 (97.310) Mem 7382MB [2024-08-27 08:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.112) Loss 0.9136 (0.6688) Acc@1 78.906 (85.235) Acc@5 95.117 (97.307) Mem 7382MB [2024-08-27 08:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.099) Loss 1.1582 (0.7590) Acc@1 71.094 (82.989) Acc@5 92.188 (96.324) Mem 7382MB [2024-08-27 08:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.091) Loss 1.0449 
(0.8063) Acc@1 74.414 (81.572) Acc@5 93.652 (95.827) Mem 7382MB [2024-08-27 08:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.188 Acc@5 95.788 [2024-08-27 08:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-27 08:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.19% [2024-08-27 08:46:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 08:46:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 08:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][0/1251] eta 0:15:07 lr 0.000593 wd 0.0500 time 0.7254 (0.7254) data time 0.4988 (0.4988) model time 0.0000 (0.0000) loss 3.2105 (3.2105) grad_norm 3.4393 (3.4393) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][10/1251] eta 0:05:49 lr 0.000593 wd 0.0500 time 0.2380 (0.2815) data time 0.0009 (0.0464) model time 0.0000 (0.0000) loss 3.4555 (2.9489) grad_norm 2.2685 (2.4259) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][20/1251] eta 0:05:18 lr 0.000593 wd 0.0500 time 0.2309 (0.2591) data time 0.0009 (0.0249) model time 0.0000 (0.0000) loss 2.3286 (3.1295) grad_norm 2.0227 (2.3542) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][30/1251] eta 0:05:08 lr 0.000593 wd 0.0500 time 0.2407 (0.2529) data time 0.0010 (0.0172) model time 0.0000 (0.0000) loss 3.4197 (3.1563) grad_norm 2.8691 (2.3412) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][40/1251] eta 
0:05:03 lr 0.000593 wd 0.0500 time 0.2415 (0.2507) data time 0.0008 (0.0138) model time 0.0000 (0.0000) loss 4.2056 (3.1573) grad_norm 2.7559 (2.4231) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][50/1251] eta 0:04:58 lr 0.000593 wd 0.0500 time 0.2292 (0.2482) data time 0.0013 (0.0113) model time 0.0000 (0.0000) loss 3.3436 (3.1607) grad_norm 1.9044 (2.4484) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][60/1251] eta 0:04:53 lr 0.000593 wd 0.0500 time 0.2414 (0.2466) data time 0.0011 (0.0096) model time 0.2402 (0.2371) loss 3.2638 (3.1895) grad_norm 2.9689 (2.4216) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][70/1251] eta 0:04:50 lr 0.000593 wd 0.0500 time 0.2445 (0.2456) data time 0.0010 (0.0085) model time 0.2434 (0.2377) loss 3.8844 (3.2141) grad_norm 1.9882 (2.3842) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][80/1251] eta 0:04:46 lr 0.000593 wd 0.0500 time 0.2445 (0.2448) data time 0.0007 (0.0075) model time 0.2438 (0.2379) loss 3.7011 (3.2063) grad_norm 1.7230 (2.3514) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][90/1251] eta 0:04:43 lr 0.000593 wd 0.0500 time 0.2440 (0.2444) data time 0.0008 (0.0068) model time 0.2432 (0.2384) loss 3.7576 (3.1987) grad_norm 10.8340 (2.4660) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][100/1251] eta 0:04:40 lr 0.000593 wd 0.0500 time 0.2402 (0.2437) data time 0.0010 (0.0063) model time 0.2392 (0.2380) loss 3.8386 (3.2227) grad_norm 2.0655 (2.4798) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 08:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][110/1251] eta 0:04:37 lr 0.000593 wd 0.0500 time 0.2421 (0.2433) data time 0.0008 (0.0058) model time 0.2413 (0.2381) loss 2.0464 (3.2281) grad_norm 2.5526 (2.4581) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][120/1251] eta 0:04:34 lr 0.000593 wd 0.0500 time 0.2441 (0.2431) data time 0.0010 (0.0054) model time 0.2431 (0.2382) loss 2.0084 (3.2088) grad_norm 2.1092 (2.4310) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][130/1251] eta 0:04:34 lr 0.000593 wd 0.0500 time 0.2417 (0.2445) data time 0.0010 (0.0051) model time 0.2407 (0.2410) loss 3.8360 (3.2433) grad_norm 2.0621 (2.4092) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][140/1251] eta 0:04:34 lr 0.000593 wd 0.0500 time 0.2399 (0.2472) data time 0.0010 (0.0048) model time 0.2389 (0.2455) loss 3.8367 (3.2509) grad_norm 2.9340 (2.3970) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][150/1251] eta 0:04:31 lr 0.000593 wd 0.0500 time 0.2368 (0.2467) data time 0.0007 (0.0045) model time 0.2361 (0.2449) loss 2.4867 (3.2291) grad_norm 4.0890 (2.4340) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][160/1251] eta 0:04:28 lr 0.000593 wd 0.0500 time 0.2422 (0.2463) data time 0.0008 (0.0043) model time 0.2414 (0.2443) loss 3.4420 (3.2354) grad_norm 2.1951 (2.4339) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][170/1251] eta 0:04:25 lr 0.000593 wd 0.0500 time 0.2376 (0.2460) data time 0.0010 (0.0042) model time 0.2365 
(0.2439) loss 3.6531 (3.2452) grad_norm 2.2049 (2.4182) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][180/1251] eta 0:04:23 lr 0.000593 wd 0.0500 time 0.2433 (0.2457) data time 0.0007 (0.0040) model time 0.2426 (0.2436) loss 3.7567 (3.2380) grad_norm 2.8078 (2.4040) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][190/1251] eta 0:04:20 lr 0.000593 wd 0.0500 time 0.2368 (0.2454) data time 0.0008 (0.0038) model time 0.2359 (0.2433) loss 2.7047 (3.2366) grad_norm 3.3559 (2.4141) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][200/1251] eta 0:04:17 lr 0.000593 wd 0.0500 time 0.2338 (0.2451) data time 0.0011 (0.0037) model time 0.2327 (0.2429) loss 3.4572 (3.2321) grad_norm 2.6627 (2.4205) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][210/1251] eta 0:04:14 lr 0.000592 wd 0.0500 time 0.2381 (0.2448) data time 0.0009 (0.0036) model time 0.2372 (0.2426) loss 3.3942 (3.2303) grad_norm 2.0443 (2.4240) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][220/1251] eta 0:04:12 lr 0.000592 wd 0.0500 time 0.2325 (0.2446) data time 0.0011 (0.0035) model time 0.2313 (0.2423) loss 3.4253 (3.2302) grad_norm 1.9359 (2.4168) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][230/1251] eta 0:04:09 lr 0.000592 wd 0.0500 time 0.2398 (0.2444) data time 0.0011 (0.0034) model time 0.2387 (0.2422) loss 2.1300 (3.2265) grad_norm 1.6797 (2.3939) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[144/300][240/1251] eta 0:04:06 lr 0.000592 wd 0.0500 time 0.2393 (0.2442) data time 0.0011 (0.0033) model time 0.2382 (0.2419) loss 3.3710 (3.2353) grad_norm 2.5032 (2.3892) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][250/1251] eta 0:04:04 lr 0.000592 wd 0.0500 time 0.2385 (0.2439) data time 0.0009 (0.0032) model time 0.2376 (0.2417) loss 3.7194 (3.2452) grad_norm 2.7523 (2.3971) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][260/1251] eta 0:04:01 lr 0.000592 wd 0.0500 time 0.2354 (0.2438) data time 0.0010 (0.0031) model time 0.2344 (0.2415) loss 3.0139 (3.2402) grad_norm 1.6696 (2.4061) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][270/1251] eta 0:03:59 lr 0.000592 wd 0.0500 time 0.2417 (0.2436) data time 0.0009 (0.0030) model time 0.2408 (0.2415) loss 3.4160 (3.2345) grad_norm 3.0635 (2.4104) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][280/1251] eta 0:03:56 lr 0.000592 wd 0.0500 time 0.2330 (0.2434) data time 0.0010 (0.0030) model time 0.2320 (0.2413) loss 2.6298 (3.2374) grad_norm 3.4539 (2.4165) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][290/1251] eta 0:03:53 lr 0.000592 wd 0.0500 time 0.2379 (0.2433) data time 0.0009 (0.0029) model time 0.2370 (0.2411) loss 2.0858 (3.2367) grad_norm 1.7031 (2.4151) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][300/1251] eta 0:03:51 lr 0.000592 wd 0.0500 time 0.2349 (0.2432) data time 0.0010 (0.0029) model time 0.2338 (0.2410) loss 3.4052 (3.2418) grad_norm 1.6346 (2.4084) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 08:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][310/1251] eta 0:03:48 lr 0.000592 wd 0.0500 time 0.2459 (0.2431) data time 0.0007 (0.0028) model time 0.2451 (0.2409) loss 3.2549 (3.2438) grad_norm 6.2949 (2.4182) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][320/1251] eta 0:03:46 lr 0.000592 wd 0.0500 time 0.4702 (0.2437) data time 0.0014 (0.0027) model time 0.4688 (0.2417) loss 3.0758 (3.2510) grad_norm 2.8745 (2.4474) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][330/1251] eta 0:03:44 lr 0.000592 wd 0.0500 time 0.2415 (0.2435) data time 0.0009 (0.0027) model time 0.2406 (0.2415) loss 3.4565 (3.2530) grad_norm 2.7214 (2.4467) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][340/1251] eta 0:03:41 lr 0.000592 wd 0.0500 time 0.2456 (0.2435) data time 0.0009 (0.0027) model time 0.2447 (0.2415) loss 3.4310 (3.2531) grad_norm 1.8933 (2.4461) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][350/1251] eta 0:03:39 lr 0.000592 wd 0.0500 time 0.2446 (0.2433) data time 0.0010 (0.0026) model time 0.2436 (0.2413) loss 3.3717 (3.2576) grad_norm 2.0573 (2.4402) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][360/1251] eta 0:03:36 lr 0.000592 wd 0.0500 time 0.2437 (0.2432) data time 0.0007 (0.0026) model time 0.2430 (0.2413) loss 3.9056 (3.2640) grad_norm 2.1354 (2.4393) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][370/1251] eta 0:03:34 lr 0.000592 wd 0.0500 time 0.2285 (0.2431) data time 0.0011 
(0.0025) model time 0.2274 (0.2412) loss 3.4047 (3.2639) grad_norm 2.5749 (2.4366) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][380/1251] eta 0:03:31 lr 0.000592 wd 0.0500 time 0.2429 (0.2430) data time 0.0010 (0.0025) model time 0.2419 (0.2411) loss 3.6691 (3.2626) grad_norm 2.6588 (2.4345) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][390/1251] eta 0:03:29 lr 0.000592 wd 0.0500 time 0.2354 (0.2429) data time 0.0011 (0.0025) model time 0.2342 (0.2410) loss 2.0136 (3.2579) grad_norm 2.7942 (2.4485) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][400/1251] eta 0:03:26 lr 0.000592 wd 0.0500 time 0.2434 (0.2428) data time 0.0010 (0.0024) model time 0.2425 (0.2409) loss 3.7160 (3.2606) grad_norm 1.6882 (2.4543) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][410/1251] eta 0:03:24 lr 0.000592 wd 0.0500 time 0.2405 (0.2427) data time 0.0008 (0.0024) model time 0.2397 (0.2408) loss 3.2267 (3.2603) grad_norm 2.4799 (2.4524) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][420/1251] eta 0:03:21 lr 0.000592 wd 0.0500 time 0.2378 (0.2426) data time 0.0012 (0.0024) model time 0.2366 (0.2407) loss 2.8634 (3.2552) grad_norm 3.0887 (2.4430) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][430/1251] eta 0:03:19 lr 0.000592 wd 0.0500 time 0.2325 (0.2424) data time 0.0008 (0.0023) model time 0.2317 (0.2405) loss 3.5233 (3.2556) grad_norm 3.1093 (2.4464) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [144/300][440/1251] eta 0:03:16 lr 0.000591 wd 0.0500 time 0.2403 (0.2423) data time 0.0013 (0.0023) model time 0.2390 (0.2404) loss 4.0291 (3.2568) grad_norm 2.1394 (2.4387) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][450/1251] eta 0:03:14 lr 0.000591 wd 0.0500 time 0.2389 (0.2423) data time 0.0009 (0.0023) model time 0.2380 (0.2404) loss 3.7222 (3.2569) grad_norm 3.3962 (2.4459) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][460/1251] eta 0:03:11 lr 0.000591 wd 0.0500 time 0.2362 (0.2422) data time 0.0008 (0.0023) model time 0.2354 (0.2403) loss 3.6535 (3.2586) grad_norm 2.8047 (2.4464) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][470/1251] eta 0:03:09 lr 0.000591 wd 0.0500 time 0.2387 (0.2421) data time 0.0008 (0.0022) model time 0.2379 (0.2402) loss 2.9545 (3.2570) grad_norm 2.4421 (2.4505) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][480/1251] eta 0:03:06 lr 0.000591 wd 0.0500 time 0.2494 (0.2420) data time 0.0007 (0.0022) model time 0.2487 (0.2402) loss 1.9543 (3.2584) grad_norm 2.2684 (2.4483) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][490/1251] eta 0:03:04 lr 0.000591 wd 0.0500 time 0.2273 (0.2419) data time 0.0011 (0.0022) model time 0.2262 (0.2400) loss 3.5121 (3.2556) grad_norm 1.6505 (2.4421) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][500/1251] eta 0:03:01 lr 0.000591 wd 0.0500 time 0.2403 (0.2418) data time 0.0010 (0.0022) model time 0.2393 (0.2399) loss 3.3443 (3.2562) grad_norm 1.6962 (2.4440) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][510/1251] eta 0:02:59 lr 0.000591 wd 0.0500 time 0.2345 (0.2417) data time 0.0012 (0.0021) model time 0.2333 (0.2398) loss 3.5099 (3.2549) grad_norm 3.0110 (2.4398) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][520/1251] eta 0:02:56 lr 0.000591 wd 0.0500 time 0.2364 (0.2416) data time 0.0008 (0.0021) model time 0.2356 (0.2398) loss 3.0933 (3.2522) grad_norm 2.3979 (2.4356) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][530/1251] eta 0:02:54 lr 0.000591 wd 0.0500 time 0.2443 (0.2415) data time 0.0008 (0.0021) model time 0.2435 (0.2397) loss 2.3764 (3.2447) grad_norm 1.6392 (2.4341) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][540/1251] eta 0:02:51 lr 0.000591 wd 0.0500 time 0.2352 (0.2415) data time 0.0009 (0.0021) model time 0.2343 (0.2396) loss 4.4026 (3.2441) grad_norm 1.8530 (2.4375) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][550/1251] eta 0:02:49 lr 0.000591 wd 0.0500 time 0.2380 (0.2414) data time 0.0010 (0.0021) model time 0.2370 (0.2396) loss 3.6743 (3.2436) grad_norm 2.9775 (2.4366) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][560/1251] eta 0:02:46 lr 0.000591 wd 0.0500 time 0.2358 (0.2414) data time 0.0008 (0.0021) model time 0.2350 (0.2396) loss 2.4089 (3.2400) grad_norm 1.9551 (2.4327) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][570/1251] eta 0:02:44 lr 0.000591 wd 0.0500 time 0.2391 (0.2414) data time 
0.0009 (0.0020) model time 0.2382 (0.2396) loss 2.0285 (3.2398) grad_norm 2.0102 (2.4326) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][580/1251] eta 0:02:41 lr 0.000591 wd 0.0500 time 0.2347 (0.2413) data time 0.0011 (0.0020) model time 0.2337 (0.2395) loss 3.2304 (3.2425) grad_norm 2.2644 (2.4286) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][590/1251] eta 0:02:39 lr 0.000591 wd 0.0500 time 0.2351 (0.2413) data time 0.0012 (0.0020) model time 0.2339 (0.2395) loss 3.8869 (3.2429) grad_norm 3.2822 (2.4332) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][600/1251] eta 0:02:37 lr 0.000591 wd 0.0500 time 0.2326 (0.2413) data time 0.0009 (0.0020) model time 0.2317 (0.2395) loss 3.8655 (3.2469) grad_norm 3.8633 (2.4388) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][610/1251] eta 0:02:34 lr 0.000591 wd 0.0500 time 0.2414 (0.2412) data time 0.0008 (0.0020) model time 0.2406 (0.2395) loss 3.0439 (3.2469) grad_norm 2.9682 (2.4427) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][620/1251] eta 0:02:32 lr 0.000591 wd 0.0500 time 0.2437 (0.2412) data time 0.0011 (0.0020) model time 0.2426 (0.2394) loss 2.5214 (3.2464) grad_norm 2.5371 (2.4413) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][630/1251] eta 0:02:29 lr 0.000591 wd 0.0500 time 0.2395 (0.2411) data time 0.0007 (0.0020) model time 0.2388 (0.2394) loss 3.8050 (3.2430) grad_norm 2.0842 (2.4340) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [144/300][640/1251] eta 0:02:27 lr 0.000591 wd 0.0500 time 0.2440 (0.2411) data time 0.0010 (0.0019) model time 0.2430 (0.2394) loss 3.7354 (3.2483) grad_norm 2.2598 (2.4301) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][650/1251] eta 0:02:24 lr 0.000591 wd 0.0500 time 0.2324 (0.2411) data time 0.0012 (0.0019) model time 0.2311 (0.2394) loss 2.9562 (3.2486) grad_norm 2.0576 (2.4299) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][660/1251] eta 0:02:22 lr 0.000591 wd 0.0500 time 0.2402 (0.2411) data time 0.0010 (0.0019) model time 0.2392 (0.2394) loss 3.2777 (3.2525) grad_norm 3.1287 (2.4378) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][670/1251] eta 0:02:20 lr 0.000590 wd 0.0500 time 0.2386 (0.2417) data time 0.0010 (0.0019) model time 0.2376 (0.2400) loss 3.2841 (3.2527) grad_norm 3.2730 (2.4489) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][680/1251] eta 0:02:18 lr 0.000590 wd 0.0500 time 0.2313 (0.2420) data time 0.0010 (0.0019) model time 0.2303 (0.2403) loss 1.9332 (3.2532) grad_norm 2.4211 (2.4546) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][690/1251] eta 0:02:15 lr 0.000590 wd 0.0500 time 0.2315 (0.2419) data time 0.0008 (0.0019) model time 0.2307 (0.2403) loss 2.5573 (3.2508) grad_norm 1.9495 (2.4511) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][700/1251] eta 0:02:13 lr 0.000590 wd 0.0500 time 0.2384 (0.2419) data time 0.0009 (0.0019) model time 0.2375 (0.2403) loss 3.3215 (3.2506) grad_norm 1.8952 (2.4476) 
loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][710/1251] eta 0:02:10 lr 0.000590 wd 0.0500 time 0.2358 (0.2418) data time 0.0008 (0.0019) model time 0.2350 (0.2402) loss 2.3813 (3.2495) grad_norm 2.1678 (2.4511) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][720/1251] eta 0:02:08 lr 0.000590 wd 0.0500 time 0.2360 (0.2418) data time 0.0012 (0.0019) model time 0.2348 (0.2402) loss 4.0102 (3.2523) grad_norm 2.4322 (2.4535) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][730/1251] eta 0:02:05 lr 0.000590 wd 0.0500 time 0.2357 (0.2418) data time 0.0009 (0.0018) model time 0.2349 (0.2402) loss 2.2974 (3.2492) grad_norm 2.6204 (2.4504) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][740/1251] eta 0:02:03 lr 0.000590 wd 0.0500 time 0.2333 (0.2417) data time 0.0011 (0.0018) model time 0.2322 (0.2401) loss 3.8422 (3.2480) grad_norm 1.8258 (2.4552) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][750/1251] eta 0:02:01 lr 0.000590 wd 0.0500 time 0.2401 (0.2417) data time 0.0010 (0.0018) model time 0.2391 (0.2401) loss 3.3597 (3.2471) grad_norm 2.1746 (2.4509) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][760/1251] eta 0:01:58 lr 0.000590 wd 0.0500 time 0.2427 (0.2417) data time 0.0011 (0.0018) model time 0.2417 (0.2401) loss 2.8187 (3.2435) grad_norm 4.6940 (2.4553) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][770/1251] eta 0:01:56 lr 0.000590 wd 0.0500 time 0.2374 (0.2416) 
data time 0.0008 (0.0018) model time 0.2367 (0.2400) loss 2.7053 (3.2450) grad_norm 3.0719 (2.4542) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][780/1251] eta 0:01:53 lr 0.000590 wd 0.0500 time 0.2337 (0.2416) data time 0.0009 (0.0018) model time 0.2329 (0.2400) loss 3.1990 (3.2457) grad_norm 1.9753 (2.4533) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][790/1251] eta 0:01:51 lr 0.000590 wd 0.0500 time 0.2283 (0.2415) data time 0.0012 (0.0018) model time 0.2271 (0.2399) loss 3.5139 (3.2491) grad_norm 2.4275 (2.4556) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][800/1251] eta 0:01:48 lr 0.000590 wd 0.0500 time 0.2346 (0.2414) data time 0.0011 (0.0018) model time 0.2335 (0.2399) loss 3.6239 (3.2514) grad_norm 2.8447 (2.4625) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][810/1251] eta 0:01:46 lr 0.000590 wd 0.0500 time 0.2315 (0.2414) data time 0.0011 (0.0018) model time 0.2304 (0.2398) loss 3.2144 (3.2510) grad_norm 2.2431 (2.4661) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][820/1251] eta 0:01:44 lr 0.000590 wd 0.0500 time 0.2334 (0.2414) data time 0.0011 (0.0018) model time 0.2322 (0.2398) loss 3.7671 (3.2528) grad_norm 2.4168 (2.4692) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][830/1251] eta 0:01:41 lr 0.000590 wd 0.0500 time 0.2316 (0.2413) data time 0.0008 (0.0018) model time 0.2309 (0.2397) loss 2.9363 (3.2540) grad_norm 3.7632 (2.4676) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [144/300][840/1251] eta 0:01:39 lr 0.000590 wd 0.0500 time 0.2350 (0.2413) data time 0.0008 (0.0018) model time 0.2342 (0.2397) loss 4.1181 (3.2578) grad_norm 2.3470 (2.4686) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][850/1251] eta 0:01:36 lr 0.000590 wd 0.0500 time 0.2429 (0.2413) data time 0.0009 (0.0017) model time 0.2420 (0.2397) loss 3.5646 (3.2564) grad_norm 2.3483 (2.4649) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][860/1251] eta 0:01:34 lr 0.000590 wd 0.0500 time 0.2307 (0.2415) data time 0.0008 (0.0017) model time 0.2299 (0.2399) loss 3.6856 (3.2588) grad_norm 4.1759 (2.4639) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][870/1251] eta 0:01:31 lr 0.000590 wd 0.0500 time 0.2413 (0.2415) data time 0.0010 (0.0017) model time 0.2403 (0.2399) loss 3.2985 (3.2609) grad_norm 2.1277 (2.4637) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][880/1251] eta 0:01:29 lr 0.000590 wd 0.0500 time 0.2365 (0.2414) data time 0.0008 (0.0017) model time 0.2357 (0.2399) loss 2.8934 (3.2618) grad_norm 1.7470 (2.4686) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][890/1251] eta 0:01:27 lr 0.000589 wd 0.0500 time 0.2480 (0.2414) data time 0.0010 (0.0017) model time 0.2471 (0.2399) loss 3.4936 (3.2608) grad_norm 2.2411 (2.4713) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][900/1251] eta 0:01:24 lr 0.000589 wd 0.0500 time 0.2299 (0.2414) data time 0.0010 (0.0017) model time 0.2289 (0.2399) loss 3.4196 (3.2648) grad_norm 
2.9444 (2.4700) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][910/1251] eta 0:01:22 lr 0.000589 wd 0.0500 time 0.2339 (0.2414) data time 0.0009 (0.0017) model time 0.2330 (0.2398) loss 3.8467 (3.2647) grad_norm 2.4260 (2.4713) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][920/1251] eta 0:01:19 lr 0.000589 wd 0.0500 time 0.2409 (0.2414) data time 0.0009 (0.0017) model time 0.2400 (0.2399) loss 4.0506 (3.2657) grad_norm 2.3942 (2.4702) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][930/1251] eta 0:01:17 lr 0.000589 wd 0.0500 time 0.2417 (0.2414) data time 0.0009 (0.0017) model time 0.2408 (0.2398) loss 3.6942 (3.2664) grad_norm 1.8709 (2.4692) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][940/1251] eta 0:01:15 lr 0.000589 wd 0.0500 time 0.2416 (0.2414) data time 0.0011 (0.0017) model time 0.2405 (0.2398) loss 3.3235 (3.2663) grad_norm 1.7721 (2.4663) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][950/1251] eta 0:01:12 lr 0.000589 wd 0.0500 time 0.2370 (0.2413) data time 0.0010 (0.0017) model time 0.2360 (0.2398) loss 3.0163 (3.2666) grad_norm 3.1005 (2.4647) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][960/1251] eta 0:01:10 lr 0.000589 wd 0.0500 time 0.2364 (0.2413) data time 0.0010 (0.0017) model time 0.2354 (0.2398) loss 3.1849 (3.2662) grad_norm 3.1035 (2.4666) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][970/1251] eta 0:01:07 lr 0.000589 wd 0.0500 time 
0.2528 (0.2413) data time 0.0013 (0.0017) model time 0.2515 (0.2398) loss 2.9549 (3.2670) grad_norm 2.3778 (2.4634) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][980/1251] eta 0:01:05 lr 0.000589 wd 0.0500 time 0.2321 (0.2413) data time 0.0012 (0.0017) model time 0.2309 (0.2398) loss 2.8112 (3.2631) grad_norm 1.6457 (2.4603) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][990/1251] eta 0:01:02 lr 0.000589 wd 0.0500 time 0.2354 (0.2412) data time 0.0009 (0.0017) model time 0.2345 (0.2397) loss 3.2202 (3.2642) grad_norm 3.0774 (2.4625) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1000/1251] eta 0:01:00 lr 0.000589 wd 0.0500 time 0.2400 (0.2412) data time 0.0012 (0.0017) model time 0.2388 (0.2397) loss 3.1578 (3.2642) grad_norm 2.0641 (2.4675) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1010/1251] eta 0:00:58 lr 0.000589 wd 0.0500 time 0.2356 (0.2413) data time 0.0011 (0.0017) model time 0.2345 (0.2397) loss 3.5063 (3.2657) grad_norm 2.5102 (2.4677) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1020/1251] eta 0:00:55 lr 0.000589 wd 0.0500 time 0.2399 (0.2412) data time 0.0010 (0.0016) model time 0.2388 (0.2397) loss 2.9919 (3.2627) grad_norm 1.9390 (2.4657) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1030/1251] eta 0:00:53 lr 0.000589 wd 0.0500 time 0.2399 (0.2412) data time 0.0010 (0.0016) model time 0.2390 (0.2397) loss 3.5400 (3.2620) grad_norm 2.0651 (2.4639) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1040/1251] eta 0:00:50 lr 0.000589 wd 0.0500 time 0.2366 (0.2412) data time 0.0012 (0.0016) model time 0.2354 (0.2397) loss 2.6190 (3.2623) grad_norm 2.2478 (2.4625) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1050/1251] eta 0:00:48 lr 0.000589 wd 0.0500 time 0.2444 (0.2412) data time 0.0010 (0.0016) model time 0.2434 (0.2397) loss 2.5629 (3.2599) grad_norm 2.3328 (2.4636) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1060/1251] eta 0:00:46 lr 0.000589 wd 0.0500 time 0.2474 (0.2412) data time 0.0008 (0.0016) model time 0.2467 (0.2397) loss 3.5448 (3.2631) grad_norm 1.9157 (2.4609) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1070/1251] eta 0:00:43 lr 0.000589 wd 0.0500 time 0.2439 (0.2412) data time 0.0010 (0.0016) model time 0.2429 (0.2397) loss 2.9660 (3.2626) grad_norm 2.0464 (2.4591) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1080/1251] eta 0:00:41 lr 0.000589 wd 0.0500 time 0.2385 (0.2411) data time 0.0009 (0.0016) model time 0.2376 (0.2396) loss 2.8190 (3.2630) grad_norm 3.1812 (2.4599) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1090/1251] eta 0:00:38 lr 0.000589 wd 0.0500 time 0.2383 (0.2411) data time 0.0009 (0.0016) model time 0.2375 (0.2396) loss 3.9073 (3.2620) grad_norm 3.2767 (2.4588) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1100/1251] eta 0:00:36 lr 0.000589 wd 0.0500 time 0.2417 (0.2411) data time 0.0010 (0.0016) model time 0.2406 (0.2397) loss 
3.4280 (3.2605) grad_norm 2.4870 (2.4587) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1110/1251] eta 0:00:33 lr 0.000589 wd 0.0500 time 0.2440 (0.2411) data time 0.0009 (0.0016) model time 0.2432 (0.2396) loss 3.2138 (3.2606) grad_norm 2.9655 (2.4583) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1120/1251] eta 0:00:31 lr 0.000588 wd 0.0500 time 0.2385 (0.2411) data time 0.0007 (0.0016) model time 0.2378 (0.2396) loss 3.9782 (3.2608) grad_norm 2.5191 (2.4600) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1130/1251] eta 0:00:29 lr 0.000588 wd 0.0500 time 0.2438 (0.2411) data time 0.0009 (0.0016) model time 0.2428 (0.2396) loss 2.8779 (3.2628) grad_norm 3.0890 (2.4581) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1140/1251] eta 0:00:26 lr 0.000588 wd 0.0500 time 0.2318 (0.2411) data time 0.0010 (0.0016) model time 0.2308 (0.2396) loss 3.3257 (3.2622) grad_norm 2.1669 (2.4628) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1150/1251] eta 0:00:24 lr 0.000588 wd 0.0500 time 0.2335 (0.2411) data time 0.0011 (0.0016) model time 0.2324 (0.2396) loss 3.1292 (3.2618) grad_norm 1.8675 (2.4633) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1160/1251] eta 0:00:21 lr 0.000588 wd 0.0500 time 0.2425 (0.2411) data time 0.0007 (0.0016) model time 0.2418 (0.2396) loss 2.9080 (3.2612) grad_norm 2.8303 (2.4631) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1170/1251] eta 
0:00:19 lr 0.000588 wd 0.0500 time 0.2465 (0.2411) data time 0.0008 (0.0016) model time 0.2457 (0.2396) loss 3.2853 (3.2623) grad_norm 2.1252 (2.4620) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1180/1251] eta 0:00:17 lr 0.000588 wd 0.0500 time 0.2432 (0.2411) data time 0.0010 (0.0016) model time 0.2422 (0.2396) loss 3.7199 (3.2622) grad_norm 2.3129 (2.4615) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1190/1251] eta 0:00:14 lr 0.000588 wd 0.0500 time 0.2263 (0.2413) data time 0.0011 (0.0016) model time 0.2252 (0.2399) loss 3.4206 (3.2619) grad_norm 1.7723 (2.4571) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1200/1251] eta 0:00:12 lr 0.000588 wd 0.0500 time 0.2393 (0.2415) data time 0.0009 (0.0016) model time 0.2384 (0.2401) loss 3.1525 (3.2597) grad_norm 2.0970 (2.4567) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1210/1251] eta 0:00:09 lr 0.000588 wd 0.0500 time 0.2387 (0.2415) data time 0.0012 (0.0016) model time 0.2375 (0.2400) loss 3.0200 (3.2571) grad_norm 2.4610 (2.4566) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1220/1251] eta 0:00:07 lr 0.000588 wd 0.0500 time 0.2375 (0.2415) data time 0.0008 (0.0016) model time 0.2367 (0.2400) loss 2.4220 (3.2564) grad_norm 2.0310 (2.4562) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1230/1251] eta 0:00:05 lr 0.000588 wd 0.0500 time 0.2416 (0.2415) data time 0.0008 (0.0016) model time 0.2408 (0.2400) loss 2.2982 (3.2580) grad_norm 2.1783 (2.4559) loss_scale 2048.0000 (2048.0000) mem 
7382MB [2024-08-27 08:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1240/1251] eta 0:00:02 lr 0.000588 wd 0.0500 time 0.2237 (0.2415) data time 0.0005 (0.0016) model time 0.2232 (0.2400) loss 3.9276 (3.2569) grad_norm 2.8188 (2.4522) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [144/300][1250/1251] eta 0:00:00 lr 0.000588 wd 0.0500 time 0.2294 (0.2414) data time 0.0005 (0.0016) model time 0.2289 (0.2399) loss 3.8068 (3.2582) grad_norm 3.5403 (2.4595) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 144 training takes 0:05:01 [2024-08-27 08:51:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 08:51:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 08:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.427 (0.427) Loss 0.4756 (0.4756) Acc@1 90.918 (90.918) Acc@5 98.242 (98.242) Mem 7382MB [2024-08-27 08:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.107) Loss 0.7568 (0.7250) Acc@1 83.789 (84.224) Acc@5 96.289 (96.804) Mem 7382MB [2024-08-27 08:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.093) Loss 1.0234 (0.7527) Acc@1 75.488 (83.231) Acc@5 93.848 (96.708) Mem 7382MB [2024-08-27 08:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.089) Loss 1.3770 (0.8623) Acc@1 66.602 (80.803) Acc@5 90.527 (95.426) Mem 7382MB [2024-08-27 08:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.2051 (0.9232) Acc@1 71.680 (79.299) Acc@5 91.895 (94.731) Mem 7382MB [2024-08-27 08:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.834 Acc@5 94.724 [2024-08-27 08:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.8% [2024-08-27 08:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.817 (0.817) Loss 0.4067 (0.4067) Acc@1 92.871 (92.871) Acc@5 98.340 (98.340) Mem 7382MB [2024-08-27 08:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.148) Loss 0.6567 (0.6435) Acc@1 86.719 (86.248) Acc@5 97.070 (97.319) Mem 7382MB [2024-08-27 08:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.114) Loss 0.9106 (0.6673) Acc@1 78.809 (85.273) Acc@5 95.117 (97.331) Mem 7382MB [2024-08-27 08:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.102) Loss 1.1553 (0.7575) Acc@1 71.094 (83.017) Acc@5 92.383 (96.365) Mem 7382MB [2024-08-27 08:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0459 
(0.8049) Acc@1 74.512 (81.614) Acc@5 93.652 (95.858) Mem 7382MB [2024-08-27 08:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.228 Acc@5 95.804 [2024-08-27 08:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-27 08:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.23% [2024-08-27 08:51:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 08:51:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 08:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][0/1251] eta 0:14:24 lr 0.000588 wd 0.0500 time 0.6912 (0.6912) data time 0.4551 (0.4551) model time 0.0000 (0.0000) loss 2.4710 (2.4710) grad_norm 2.3603 (2.3603) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][10/1251] eta 0:05:47 lr 0.000588 wd 0.0500 time 0.2437 (0.2801) data time 0.0009 (0.0424) model time 0.0000 (0.0000) loss 3.8614 (3.1680) grad_norm 2.2877 (2.1822) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][20/1251] eta 0:05:21 lr 0.000588 wd 0.0500 time 0.2384 (0.2612) data time 0.0011 (0.0228) model time 0.0000 (0.0000) loss 3.1413 (3.1334) grad_norm 2.1685 (2.1628) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][30/1251] eta 0:05:10 lr 0.000588 wd 0.0500 time 0.2399 (0.2540) data time 0.0011 (0.0158) model time 0.0000 (0.0000) loss 3.5110 (3.1612) grad_norm 1.9110 (2.1358) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][40/1251] eta 
0:05:03 lr 0.000588 wd 0.0500 time 0.2481 (0.2506) data time 0.0007 (0.0122) model time 0.0000 (0.0000) loss 4.2497 (3.2519) grad_norm 2.9822 (2.3931) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][50/1251] eta 0:04:57 lr 0.000588 wd 0.0500 time 0.2402 (0.2481) data time 0.0010 (0.0101) model time 0.0000 (0.0000) loss 3.2286 (3.3116) grad_norm 2.6470 (2.4683) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][60/1251] eta 0:04:53 lr 0.000588 wd 0.0500 time 0.2391 (0.2465) data time 0.0010 (0.0086) model time 0.2381 (0.2370) loss 3.3416 (3.2423) grad_norm 2.4836 (2.4072) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][70/1251] eta 0:04:53 lr 0.000588 wd 0.0500 time 0.2318 (0.2482) data time 0.0008 (0.0076) model time 0.2310 (0.2472) loss 3.1166 (3.2521) grad_norm 3.3033 (2.4497) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][80/1251] eta 0:04:49 lr 0.000588 wd 0.0500 time 0.2376 (0.2472) data time 0.0010 (0.0068) model time 0.2366 (0.2445) loss 2.4910 (3.2202) grad_norm 2.4278 (2.4308) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][90/1251] eta 0:04:45 lr 0.000588 wd 0.0500 time 0.2372 (0.2462) data time 0.0010 (0.0062) model time 0.2362 (0.2426) loss 3.5064 (3.2258) grad_norm 2.6344 (2.4325) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][100/1251] eta 0:04:42 lr 0.000587 wd 0.0500 time 0.2389 (0.2455) data time 0.0008 (0.0057) model time 0.2381 (0.2416) loss 3.3315 (3.2395) grad_norm 1.9782 (2.4025) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 08:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][110/1251] eta 0:04:39 lr 0.000587 wd 0.0500 time 0.2378 (0.2448) data time 0.0008 (0.0053) model time 0.2370 (0.2409) loss 4.3360 (3.2556) grad_norm 2.0213 (2.4160) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][120/1251] eta 0:04:36 lr 0.000587 wd 0.0500 time 0.2411 (0.2444) data time 0.0009 (0.0049) model time 0.2402 (0.2405) loss 3.3088 (3.2306) grad_norm 1.4719 (2.4065) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][130/1251] eta 0:04:33 lr 0.000587 wd 0.0500 time 0.2418 (0.2438) data time 0.0008 (0.0046) model time 0.2410 (0.2400) loss 2.9459 (3.2224) grad_norm 2.1918 (2.4037) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][140/1251] eta 0:04:30 lr 0.000587 wd 0.0500 time 0.2364 (0.2435) data time 0.0010 (0.0044) model time 0.2354 (0.2397) loss 3.2576 (3.2063) grad_norm 1.8815 (2.3960) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][150/1251] eta 0:04:27 lr 0.000587 wd 0.0500 time 0.2371 (0.2432) data time 0.0009 (0.0042) model time 0.2362 (0.2396) loss 3.2080 (3.2149) grad_norm 2.1259 (2.3878) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][160/1251] eta 0:04:25 lr 0.000587 wd 0.0500 time 0.2388 (0.2430) data time 0.0010 (0.0040) model time 0.2378 (0.2396) loss 3.8223 (3.2211) grad_norm 1.5589 (2.3634) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][170/1251] eta 0:04:22 lr 0.000587 wd 0.0500 time 0.2395 (0.2426) data time 0.0011 (0.0038) model time 0.2384 
(0.2392) loss 3.3152 (3.2336) grad_norm 3.7693 (2.3528) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][180/1251] eta 0:04:19 lr 0.000587 wd 0.0500 time 0.2372 (0.2425) data time 0.0009 (0.0036) model time 0.2363 (0.2391) loss 3.6929 (3.2346) grad_norm 2.5375 (2.3682) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][190/1251] eta 0:04:17 lr 0.000587 wd 0.0500 time 0.2385 (0.2424) data time 0.0010 (0.0035) model time 0.2375 (0.2391) loss 3.4750 (3.2316) grad_norm 1.9595 (2.3671) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][200/1251] eta 0:04:14 lr 0.000587 wd 0.0500 time 0.2400 (0.2422) data time 0.0008 (0.0034) model time 0.2393 (0.2391) loss 2.2945 (3.2231) grad_norm 1.9615 (2.3580) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][210/1251] eta 0:04:12 lr 0.000587 wd 0.0500 time 0.2417 (0.2423) data time 0.0010 (0.0033) model time 0.2407 (0.2393) loss 3.1440 (3.2271) grad_norm 3.5821 (2.3668) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][220/1251] eta 0:04:09 lr 0.000587 wd 0.0500 time 0.2408 (0.2422) data time 0.0009 (0.0033) model time 0.2399 (0.2393) loss 2.5396 (3.2260) grad_norm 2.4819 (2.3720) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][230/1251] eta 0:04:07 lr 0.000587 wd 0.0500 time 0.2373 (0.2421) data time 0.0010 (0.0032) model time 0.2363 (0.2392) loss 3.4400 (3.2213) grad_norm 5.0713 (2.3912) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[145/300][240/1251] eta 0:04:04 lr 0.000587 wd 0.0500 time 0.2453 (0.2420) data time 0.0010 (0.0031) model time 0.2443 (0.2392) loss 2.8979 (3.2311) grad_norm 2.8905 (2.4234) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][250/1251] eta 0:04:02 lr 0.000587 wd 0.0500 time 0.2485 (0.2418) data time 0.0010 (0.0030) model time 0.2475 (0.2390) loss 2.7415 (3.2232) grad_norm 1.6437 (2.4152) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][260/1251] eta 0:03:59 lr 0.000587 wd 0.0500 time 0.2346 (0.2417) data time 0.0010 (0.0029) model time 0.2337 (0.2389) loss 3.5358 (3.2251) grad_norm 1.5608 (2.3976) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][270/1251] eta 0:03:57 lr 0.000587 wd 0.0500 time 0.2466 (0.2416) data time 0.0008 (0.0029) model time 0.2459 (0.2390) loss 3.6966 (3.2330) grad_norm 1.8775 (2.3852) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][280/1251] eta 0:03:54 lr 0.000587 wd 0.0500 time 0.2390 (0.2417) data time 0.0009 (0.0028) model time 0.2382 (0.2391) loss 3.3510 (3.2456) grad_norm 3.0166 (2.3950) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][290/1251] eta 0:03:52 lr 0.000587 wd 0.0500 time 0.2325 (0.2416) data time 0.0007 (0.0028) model time 0.2317 (0.2390) loss 3.4693 (3.2569) grad_norm 2.4180 (2.4153) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][300/1251] eta 0:03:49 lr 0.000587 wd 0.0500 time 0.2393 (0.2416) data time 0.0009 (0.0028) model time 0.2384 (0.2390) loss 2.2147 (3.2567) grad_norm 2.3559 (2.4113) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 08:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][310/1251] eta 0:03:47 lr 0.000587 wd 0.0500 time 0.2345 (0.2417) data time 0.0011 (0.0027) model time 0.2334 (0.2392) loss 3.3249 (3.2440) grad_norm 3.6081 (2.4205) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][320/1251] eta 0:03:45 lr 0.000587 wd 0.0500 time 0.2443 (0.2418) data time 0.0010 (0.0027) model time 0.2432 (0.2393) loss 3.0340 (3.2513) grad_norm 1.9681 (2.4202) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][330/1251] eta 0:03:42 lr 0.000586 wd 0.0500 time 0.2372 (0.2417) data time 0.0011 (0.0026) model time 0.2361 (0.2393) loss 3.2350 (3.2545) grad_norm 2.2579 (2.4155) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][340/1251] eta 0:03:40 lr 0.000586 wd 0.0500 time 0.2359 (0.2416) data time 0.0010 (0.0026) model time 0.2349 (0.2392) loss 3.5744 (3.2544) grad_norm 2.5948 (2.4116) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][350/1251] eta 0:03:37 lr 0.000586 wd 0.0500 time 0.2371 (0.2415) data time 0.0007 (0.0025) model time 0.2363 (0.2391) loss 3.2685 (3.2570) grad_norm 2.3140 (2.4050) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][360/1251] eta 0:03:35 lr 0.000586 wd 0.0500 time 0.2452 (0.2415) data time 0.0010 (0.0025) model time 0.2442 (0.2391) loss 2.2288 (3.2545) grad_norm 2.9356 (2.3970) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][370/1251] eta 0:03:32 lr 0.000586 wd 0.0500 time 0.2398 (0.2415) data time 0.0011 
(0.0025) model time 0.2388 (0.2391) loss 2.4064 (3.2454) grad_norm 1.9252 (2.4141) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][380/1251] eta 0:03:30 lr 0.000586 wd 0.0500 time 0.2348 (0.2414) data time 0.0009 (0.0024) model time 0.2340 (0.2391) loss 3.7205 (3.2510) grad_norm 4.2114 (2.4545) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][390/1251] eta 0:03:27 lr 0.000586 wd 0.0500 time 0.2401 (0.2414) data time 0.0009 (0.0024) model time 0.2392 (0.2391) loss 3.6869 (3.2627) grad_norm 2.6239 (2.4489) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][400/1251] eta 0:03:25 lr 0.000586 wd 0.0500 time 0.2409 (0.2413) data time 0.0010 (0.0024) model time 0.2399 (0.2390) loss 3.5805 (3.2619) grad_norm 1.7993 (2.4495) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][410/1251] eta 0:03:22 lr 0.000586 wd 0.0500 time 0.2382 (0.2412) data time 0.0011 (0.0024) model time 0.2371 (0.2390) loss 3.1135 (3.2602) grad_norm 2.2217 (2.4468) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][420/1251] eta 0:03:20 lr 0.000586 wd 0.0500 time 0.2319 (0.2412) data time 0.0008 (0.0023) model time 0.2312 (0.2390) loss 3.7465 (3.2618) grad_norm 1.9025 (2.4388) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][430/1251] eta 0:03:17 lr 0.000586 wd 0.0500 time 0.2374 (0.2412) data time 0.0009 (0.0023) model time 0.2365 (0.2390) loss 2.6393 (3.2589) grad_norm 3.4891 (2.4395) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [145/300][440/1251] eta 0:03:15 lr 0.000586 wd 0.0500 time 0.2428 (0.2412) data time 0.0008 (0.0023) model time 0.2420 (0.2390) loss 2.6152 (3.2550) grad_norm 3.0058 (2.4432) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][450/1251] eta 0:03:13 lr 0.000586 wd 0.0500 time 0.2372 (0.2412) data time 0.0012 (0.0023) model time 0.2360 (0.2391) loss 2.8460 (3.2486) grad_norm 2.4656 (2.4372) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][460/1251] eta 0:03:11 lr 0.000586 wd 0.0500 time 0.2279 (0.2423) data time 0.0009 (0.0022) model time 0.2270 (0.2403) loss 3.6291 (3.2501) grad_norm 2.1582 (2.4327) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][470/1251] eta 0:03:09 lr 0.000586 wd 0.0500 time 0.2353 (0.2422) data time 0.0008 (0.0022) model time 0.2344 (0.2403) loss 3.2773 (3.2498) grad_norm 4.1671 (2.4426) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][480/1251] eta 0:03:06 lr 0.000586 wd 0.0500 time 0.2408 (0.2422) data time 0.0007 (0.0022) model time 0.2401 (0.2403) loss 2.7519 (3.2529) grad_norm 2.2834 (2.4451) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][490/1251] eta 0:03:04 lr 0.000586 wd 0.0500 time 0.2429 (0.2423) data time 0.0010 (0.0022) model time 0.2418 (0.2403) loss 3.0369 (3.2529) grad_norm 2.9691 (2.4435) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][500/1251] eta 0:03:01 lr 0.000586 wd 0.0500 time 0.2405 (0.2423) data time 0.0010 (0.0022) model time 0.2395 (0.2403) loss 3.0460 (3.2550) grad_norm 2.1618 (2.4404) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][510/1251] eta 0:02:59 lr 0.000586 wd 0.0500 time 0.2427 (0.2423) data time 0.0008 (0.0022) model time 0.2419 (0.2403) loss 2.9204 (3.2498) grad_norm 1.9991 (2.4380) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][520/1251] eta 0:02:57 lr 0.000586 wd 0.0500 time 0.2585 (0.2422) data time 0.0012 (0.0022) model time 0.2573 (0.2403) loss 2.9037 (3.2440) grad_norm 3.5372 (2.4390) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][530/1251] eta 0:02:54 lr 0.000586 wd 0.0500 time 0.2369 (0.2422) data time 0.0009 (0.0022) model time 0.2360 (0.2402) loss 2.5183 (3.2452) grad_norm 1.9601 (2.4408) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][540/1251] eta 0:02:52 lr 0.000586 wd 0.0500 time 0.2360 (0.2421) data time 0.0008 (0.0022) model time 0.2352 (0.2401) loss 3.3789 (3.2452) grad_norm 2.3338 (2.4400) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][550/1251] eta 0:02:49 lr 0.000586 wd 0.0500 time 0.2374 (0.2421) data time 0.0012 (0.0021) model time 0.2362 (0.2401) loss 2.6630 (3.2406) grad_norm 2.9124 (2.4351) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][560/1251] eta 0:02:47 lr 0.000585 wd 0.0500 time 0.2382 (0.2421) data time 0.0012 (0.0021) model time 0.2370 (0.2401) loss 2.8556 (3.2411) grad_norm 2.4540 (2.4328) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][570/1251] eta 0:02:44 lr 0.000585 wd 0.0500 time 0.2299 (0.2421) data time 
0.0011 (0.0021) model time 0.2288 (0.2401) loss 3.3381 (3.2422) grad_norm 2.6139 (2.4322) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][580/1251] eta 0:02:42 lr 0.000585 wd 0.0500 time 0.4355 (0.2424) data time 0.0008 (0.0021) model time 0.4348 (0.2405) loss 3.8911 (3.2454) grad_norm 3.3976 (2.4368) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][590/1251] eta 0:02:40 lr 0.000585 wd 0.0500 time 0.2510 (0.2424) data time 0.0010 (0.0021) model time 0.2500 (0.2405) loss 3.4160 (3.2493) grad_norm 3.0833 (2.4423) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][600/1251] eta 0:02:37 lr 0.000585 wd 0.0500 time 0.2446 (0.2423) data time 0.0010 (0.0021) model time 0.2436 (0.2405) loss 3.6966 (3.2497) grad_norm 2.2493 (2.4435) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][610/1251] eta 0:02:35 lr 0.000585 wd 0.0500 time 0.2434 (0.2424) data time 0.0009 (0.0020) model time 0.2425 (0.2405) loss 4.1436 (3.2505) grad_norm 1.6532 (2.4456) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][620/1251] eta 0:02:32 lr 0.000585 wd 0.0500 time 0.2396 (0.2423) data time 0.0010 (0.0020) model time 0.2386 (0.2405) loss 2.3639 (3.2442) grad_norm 2.1041 (2.4489) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][630/1251] eta 0:02:30 lr 0.000585 wd 0.0500 time 0.2417 (0.2424) data time 0.0008 (0.0020) model time 0.2409 (0.2405) loss 3.6501 (3.2456) grad_norm 2.8211 (2.4557) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [145/300][640/1251] eta 0:02:28 lr 0.000585 wd 0.0500 time 0.2623 (0.2424) data time 0.0008 (0.0020) model time 0.2615 (0.2406) loss 2.5374 (3.2410) grad_norm 1.9502 (2.4506) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][650/1251] eta 0:02:25 lr 0.000585 wd 0.0500 time 0.2365 (0.2424) data time 0.0012 (0.0020) model time 0.2353 (0.2406) loss 3.6060 (3.2462) grad_norm 1.7303 (2.4436) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][660/1251] eta 0:02:23 lr 0.000585 wd 0.0500 time 0.2328 (0.2423) data time 0.0010 (0.0020) model time 0.2318 (0.2405) loss 3.5119 (3.2509) grad_norm 2.4471 (2.4398) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][670/1251] eta 0:02:20 lr 0.000585 wd 0.0500 time 0.2453 (0.2424) data time 0.0009 (0.0020) model time 0.2444 (0.2406) loss 3.7076 (3.2516) grad_norm 2.5175 (2.4425) loss_scale 4096.0000 (2054.1043) mem 7382MB [2024-08-27 08:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][680/1251] eta 0:02:18 lr 0.000585 wd 0.0500 time 0.2410 (0.2423) data time 0.0011 (0.0020) model time 0.2399 (0.2406) loss 2.5377 (3.2534) grad_norm 2.5574 (2.4443) loss_scale 4096.0000 (2084.0881) mem 7382MB [2024-08-27 08:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][690/1251] eta 0:02:15 lr 0.000585 wd 0.0500 time 0.2416 (0.2423) data time 0.0011 (0.0020) model time 0.2405 (0.2405) loss 2.2036 (3.2524) grad_norm 2.2077 (2.4453) loss_scale 4096.0000 (2113.2041) mem 7382MB [2024-08-27 08:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][700/1251] eta 0:02:13 lr 0.000585 wd 0.0500 time 0.2382 (0.2423) data time 0.0012 (0.0020) model time 0.2371 (0.2405) loss 3.4687 (3.2566) grad_norm 2.0409 (2.4485) 
loss_scale 4096.0000 (2141.4893) mem 7382MB [2024-08-27 08:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][710/1251] eta 0:02:11 lr 0.000585 wd 0.0500 time 0.2355 (0.2424) data time 0.0009 (0.0020) model time 0.2346 (0.2405) loss 3.1574 (3.2607) grad_norm 1.8481 (2.4452) loss_scale 4096.0000 (2168.9789) mem 7382MB [2024-08-27 08:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][720/1251] eta 0:02:08 lr 0.000585 wd 0.0500 time 0.2451 (0.2423) data time 0.0009 (0.0020) model time 0.2442 (0.2405) loss 2.9541 (3.2654) grad_norm 4.2382 (2.4498) loss_scale 4096.0000 (2195.7060) mem 7382MB [2024-08-27 08:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][730/1251] eta 0:02:06 lr 0.000585 wd 0.0500 time 0.2471 (0.2423) data time 0.0008 (0.0020) model time 0.2463 (0.2405) loss 3.9310 (3.2668) grad_norm 1.8687 (2.4605) loss_scale 4096.0000 (2221.7018) mem 7382MB [2024-08-27 08:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][740/1251] eta 0:02:03 lr 0.000585 wd 0.0500 time 0.2371 (0.2423) data time 0.0009 (0.0020) model time 0.2362 (0.2405) loss 3.0508 (3.2667) grad_norm 1.8085 (2.4619) loss_scale 4096.0000 (2246.9960) mem 7382MB [2024-08-27 08:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][750/1251] eta 0:02:01 lr 0.000585 wd 0.0500 time 0.2376 (0.2423) data time 0.0011 (0.0020) model time 0.2365 (0.2405) loss 3.6510 (3.2673) grad_norm 2.6270 (2.4618) loss_scale 4096.0000 (2271.6165) mem 7382MB [2024-08-27 08:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][760/1251] eta 0:01:58 lr 0.000585 wd 0.0500 time 0.2397 (0.2423) data time 0.0009 (0.0020) model time 0.2388 (0.2405) loss 2.1413 (3.2656) grad_norm 2.4985 (2.4687) loss_scale 4096.0000 (2295.5900) mem 7382MB [2024-08-27 08:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][770/1251] eta 0:01:56 lr 0.000585 wd 0.0500 time 0.2332 (0.2422) 
data time 0.0009 (0.0020) model time 0.2323 (0.2404) loss 3.6013 (3.2678) grad_norm 1.9410 (2.4683) loss_scale 4096.0000 (2318.9416) mem 7382MB [2024-08-27 08:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][780/1251] eta 0:01:54 lr 0.000584 wd 0.0500 time 0.2332 (0.2422) data time 0.0009 (0.0019) model time 0.2323 (0.2404) loss 3.7641 (3.2696) grad_norm 3.8327 (inf) loss_scale 2048.0000 (2318.0948) mem 7382MB [2024-08-27 08:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][790/1251] eta 0:01:51 lr 0.000584 wd 0.0500 time 0.2489 (0.2422) data time 0.0009 (0.0019) model time 0.2479 (0.2404) loss 3.1407 (3.2709) grad_norm 2.0993 (inf) loss_scale 2048.0000 (2314.6802) mem 7382MB [2024-08-27 08:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][800/1251] eta 0:01:49 lr 0.000584 wd 0.0500 time 0.2383 (0.2421) data time 0.0008 (0.0019) model time 0.2375 (0.2404) loss 4.1366 (3.2705) grad_norm 2.0921 (inf) loss_scale 2048.0000 (2311.3508) mem 7382MB [2024-08-27 08:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][810/1251] eta 0:01:46 lr 0.000584 wd 0.0500 time 0.2455 (0.2421) data time 0.0008 (0.0019) model time 0.2448 (0.2404) loss 3.7805 (3.2702) grad_norm 2.2694 (inf) loss_scale 2048.0000 (2308.1036) mem 7382MB [2024-08-27 08:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][820/1251] eta 0:01:44 lr 0.000584 wd 0.0500 time 0.2447 (0.2421) data time 0.0007 (0.0019) model time 0.2440 (0.2403) loss 4.0650 (3.2700) grad_norm 1.8924 (inf) loss_scale 2048.0000 (2304.9354) mem 7382MB [2024-08-27 08:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][830/1251] eta 0:01:41 lr 0.000584 wd 0.0500 time 0.2400 (0.2421) data time 0.0009 (0.0019) model time 0.2391 (0.2403) loss 2.7357 (3.2724) grad_norm 3.4558 (inf) loss_scale 2048.0000 (2301.8436) mem 7382MB [2024-08-27 08:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [145/300][840/1251] eta 0:01:39 lr 0.000584 wd 0.0500 time 0.2395 (0.2421) data time 0.0011 (0.0019) model time 0.2384 (0.2403) loss 2.6781 (3.2700) grad_norm 2.9724 (inf) loss_scale 2048.0000 (2298.8252) mem 7382MB [2024-08-27 08:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][850/1251] eta 0:01:37 lr 0.000584 wd 0.0500 time 0.2360 (0.2421) data time 0.0010 (0.0019) model time 0.2350 (0.2403) loss 3.3909 (3.2710) grad_norm 5.0895 (inf) loss_scale 2048.0000 (2295.8778) mem 7382MB [2024-08-27 08:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][860/1251] eta 0:01:34 lr 0.000584 wd 0.0500 time 0.2363 (0.2421) data time 0.0007 (0.0019) model time 0.2356 (0.2404) loss 3.5322 (3.2676) grad_norm 2.0254 (inf) loss_scale 2048.0000 (2292.9988) mem 7382MB [2024-08-27 08:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][870/1251] eta 0:01:32 lr 0.000584 wd 0.0500 time 0.2375 (0.2421) data time 0.0010 (0.0019) model time 0.2365 (0.2404) loss 3.3423 (3.2652) grad_norm 1.8899 (inf) loss_scale 2048.0000 (2290.1860) mem 7382MB [2024-08-27 08:54:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][880/1251] eta 0:01:29 lr 0.000584 wd 0.0500 time 0.2307 (0.2421) data time 0.0011 (0.0019) model time 0.2297 (0.2404) loss 2.5290 (3.2619) grad_norm 2.3535 (inf) loss_scale 2048.0000 (2287.4370) mem 7382MB [2024-08-27 08:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][890/1251] eta 0:01:27 lr 0.000584 wd 0.0500 time 0.2485 (0.2421) data time 0.0011 (0.0019) model time 0.2474 (0.2404) loss 2.9351 (3.2598) grad_norm 2.1398 (inf) loss_scale 2048.0000 (2284.7497) mem 7382MB [2024-08-27 08:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][900/1251] eta 0:01:24 lr 0.000584 wd 0.0500 time 0.2327 (0.2421) data time 0.0010 (0.0019) model time 0.2317 (0.2404) loss 3.6750 (3.2615) grad_norm 1.7753 (inf) loss_scale 2048.0000 (2282.1221) mem 
7382MB [2024-08-27 08:54:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][910/1251] eta 0:01:22 lr 0.000584 wd 0.0500 time 0.2424 (0.2421) data time 0.0011 (0.0019) model time 0.2413 (0.2404) loss 3.6672 (3.2591) grad_norm 2.5541 (inf) loss_scale 2048.0000 (2279.5521) mem 7382MB [2024-08-27 08:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][920/1251] eta 0:01:20 lr 0.000584 wd 0.0500 time 0.2439 (0.2421) data time 0.0012 (0.0018) model time 0.2427 (0.2404) loss 2.0832 (3.2598) grad_norm 1.7868 (inf) loss_scale 2048.0000 (2277.0380) mem 7382MB [2024-08-27 08:55:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][930/1251] eta 0:01:17 lr 0.000584 wd 0.0500 time 0.2360 (0.2421) data time 0.0008 (0.0018) model time 0.2352 (0.2403) loss 3.9335 (3.2606) grad_norm 2.4039 (inf) loss_scale 2048.0000 (2274.5779) mem 7382MB [2024-08-27 08:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][940/1251] eta 0:01:15 lr 0.000584 wd 0.0500 time 0.2510 (0.2421) data time 0.0007 (0.0018) model time 0.2503 (0.2404) loss 2.4971 (3.2585) grad_norm 1.6656 (inf) loss_scale 2048.0000 (2272.1700) mem 7382MB [2024-08-27 08:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][950/1251] eta 0:01:12 lr 0.000584 wd 0.0500 time 0.2375 (0.2421) data time 0.0011 (0.0018) model time 0.2365 (0.2404) loss 3.3987 (3.2615) grad_norm 2.1359 (inf) loss_scale 2048.0000 (2269.8128) mem 7382MB [2024-08-27 08:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][960/1251] eta 0:01:10 lr 0.000584 wd 0.0500 time 0.2384 (0.2421) data time 0.0007 (0.0018) model time 0.2376 (0.2404) loss 3.6912 (3.2630) grad_norm 2.6829 (inf) loss_scale 2048.0000 (2267.5047) mem 7382MB [2024-08-27 08:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][970/1251] eta 0:01:08 lr 0.000584 wd 0.0500 time 0.2397 (0.2420) data time 0.0008 (0.0018) model time 0.2390 (0.2403) 
loss 3.9398 (3.2622) grad_norm 2.4535 (inf) loss_scale 2048.0000 (2265.2441) mem 7382MB [2024-08-27 08:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][980/1251] eta 0:01:05 lr 0.000584 wd 0.0500 time 0.2457 (0.2422) data time 0.0009 (0.0018) model time 0.2448 (0.2405) loss 3.6650 (3.2626) grad_norm 2.6570 (inf) loss_scale 2048.0000 (2263.0296) mem 7382MB [2024-08-27 08:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][990/1251] eta 0:01:03 lr 0.000584 wd 0.0500 time 0.2375 (0.2426) data time 0.0012 (0.0018) model time 0.2363 (0.2410) loss 3.3479 (3.2603) grad_norm 3.4434 (inf) loss_scale 2048.0000 (2260.8597) mem 7382MB [2024-08-27 08:55:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1000/1251] eta 0:01:00 lr 0.000584 wd 0.0500 time 0.2353 (0.2426) data time 0.0007 (0.0018) model time 0.2346 (0.2409) loss 2.6872 (3.2606) grad_norm 1.7632 (inf) loss_scale 2048.0000 (2258.7333) mem 7382MB [2024-08-27 08:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1010/1251] eta 0:00:58 lr 0.000583 wd 0.0500 time 0.2384 (0.2426) data time 0.0011 (0.0018) model time 0.2373 (0.2409) loss 3.2145 (3.2607) grad_norm 1.9378 (inf) loss_scale 2048.0000 (2256.6489) mem 7382MB [2024-08-27 08:55:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1020/1251] eta 0:00:56 lr 0.000583 wd 0.0500 time 0.2369 (0.2426) data time 0.0010 (0.0018) model time 0.2359 (0.2409) loss 3.7871 (3.2586) grad_norm 2.4358 (inf) loss_scale 2048.0000 (2254.6053) mem 7382MB [2024-08-27 08:55:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1030/1251] eta 0:00:53 lr 0.000583 wd 0.0500 time 0.2394 (0.2426) data time 0.0011 (0.0018) model time 0.2383 (0.2409) loss 3.6727 (3.2604) grad_norm 2.0967 (inf) loss_scale 2048.0000 (2252.6014) mem 7382MB [2024-08-27 08:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1040/1251] eta 0:00:51 lr 0.000583 
wd 0.0500 time 0.2465 (0.2426) data time 0.0010 (0.0018) model time 0.2455 (0.2409) loss 3.2408 (3.2578) grad_norm 1.9030 (inf) loss_scale 2048.0000 (2250.6359) mem 7382MB [2024-08-27 08:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1050/1251] eta 0:00:48 lr 0.000583 wd 0.0500 time 0.2346 (0.2425) data time 0.0009 (0.0018) model time 0.2338 (0.2409) loss 2.8819 (3.2574) grad_norm 2.2853 (inf) loss_scale 2048.0000 (2248.7079) mem 7382MB [2024-08-27 08:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1060/1251] eta 0:00:46 lr 0.000583 wd 0.0500 time 0.2473 (0.2425) data time 0.0009 (0.0018) model time 0.2464 (0.2409) loss 3.2824 (3.2577) grad_norm 2.2938 (inf) loss_scale 2048.0000 (2246.8162) mem 7382MB [2024-08-27 08:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1070/1251] eta 0:00:43 lr 0.000583 wd 0.0500 time 0.2368 (0.2425) data time 0.0011 (0.0018) model time 0.2357 (0.2409) loss 3.2179 (3.2576) grad_norm 3.4382 (inf) loss_scale 2048.0000 (2244.9599) mem 7382MB [2024-08-27 08:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1080/1251] eta 0:00:41 lr 0.000583 wd 0.0500 time 0.2402 (0.2425) data time 0.0008 (0.0018) model time 0.2394 (0.2409) loss 2.8882 (3.2542) grad_norm 2.0336 (inf) loss_scale 2048.0000 (2243.1378) mem 7382MB [2024-08-27 08:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1090/1251] eta 0:00:39 lr 0.000583 wd 0.0500 time 0.2294 (0.2425) data time 0.0011 (0.0018) model time 0.2283 (0.2408) loss 3.4501 (3.2527) grad_norm 2.3079 (inf) loss_scale 2048.0000 (2241.3492) mem 7382MB [2024-08-27 08:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1100/1251] eta 0:00:36 lr 0.000583 wd 0.0500 time 0.2348 (0.2425) data time 0.0010 (0.0018) model time 0.2339 (0.2408) loss 3.6626 (3.2516) grad_norm 2.8588 (inf) loss_scale 2048.0000 (2239.5931) mem 7382MB [2024-08-27 08:55:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1110/1251] eta 0:00:34 lr 0.000583 wd 0.0500 time 0.2425 (0.2426) data time 0.0012 (0.0018) model time 0.2413 (0.2410) loss 3.6736 (3.2533) grad_norm 1.9215 (inf) loss_scale 2048.0000 (2237.8686) mem 7382MB [2024-08-27 08:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1120/1251] eta 0:00:31 lr 0.000583 wd 0.0500 time 0.2515 (0.2426) data time 0.0008 (0.0018) model time 0.2507 (0.2410) loss 3.1200 (3.2542) grad_norm 2.7126 (inf) loss_scale 2048.0000 (2236.1748) mem 7382MB [2024-08-27 08:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1130/1251] eta 0:00:29 lr 0.000583 wd 0.0500 time 0.2444 (0.2426) data time 0.0010 (0.0018) model time 0.2434 (0.2410) loss 3.4012 (3.2570) grad_norm 4.0501 (inf) loss_scale 2048.0000 (2234.5111) mem 7382MB [2024-08-27 08:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1140/1251] eta 0:00:26 lr 0.000583 wd 0.0500 time 0.2349 (0.2426) data time 0.0010 (0.0018) model time 0.2339 (0.2410) loss 3.8694 (3.2582) grad_norm 2.4737 (inf) loss_scale 2048.0000 (2232.8764) mem 7382MB [2024-08-27 08:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1150/1251] eta 0:00:24 lr 0.000583 wd 0.0500 time 0.2310 (0.2426) data time 0.0008 (0.0018) model time 0.2302 (0.2410) loss 2.7247 (3.2562) grad_norm 1.6832 (inf) loss_scale 2048.0000 (2231.2702) mem 7382MB [2024-08-27 08:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1160/1251] eta 0:00:22 lr 0.000583 wd 0.0500 time 0.2385 (0.2426) data time 0.0010 (0.0018) model time 0.2375 (0.2410) loss 3.2143 (3.2532) grad_norm 2.2630 (inf) loss_scale 2048.0000 (2229.6916) mem 7382MB [2024-08-27 08:55:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1170/1251] eta 0:00:19 lr 0.000583 wd 0.0500 time 0.2401 (0.2426) data time 0.0007 (0.0018) model time 0.2394 (0.2410) loss 3.4999 (3.2524) 
grad_norm 2.0980 (inf) loss_scale 2048.0000 (2228.1401) mem 7382MB [2024-08-27 08:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1180/1251] eta 0:00:17 lr 0.000583 wd 0.0500 time 0.2293 (0.2426) data time 0.0008 (0.0018) model time 0.2285 (0.2410) loss 4.3536 (3.2512) grad_norm 2.0494 (inf) loss_scale 2048.0000 (2226.6147) mem 7382MB [2024-08-27 08:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1190/1251] eta 0:00:14 lr 0.000583 wd 0.0500 time 0.2414 (0.2426) data time 0.0008 (0.0018) model time 0.2406 (0.2410) loss 3.2810 (3.2525) grad_norm 1.8672 (inf) loss_scale 2048.0000 (2225.1150) mem 7382MB [2024-08-27 08:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1200/1251] eta 0:00:12 lr 0.000583 wd 0.0500 time 0.2343 (0.2426) data time 0.0011 (0.0017) model time 0.2332 (0.2410) loss 3.5057 (3.2549) grad_norm 2.1088 (inf) loss_scale 2048.0000 (2223.6403) mem 7382MB [2024-08-27 08:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1210/1251] eta 0:00:09 lr 0.000583 wd 0.0500 time 0.2359 (0.2425) data time 0.0010 (0.0017) model time 0.2349 (0.2409) loss 3.2620 (3.2542) grad_norm 2.6844 (inf) loss_scale 2048.0000 (2222.1899) mem 7382MB [2024-08-27 08:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1220/1251] eta 0:00:07 lr 0.000583 wd 0.0500 time 0.2339 (0.2425) data time 0.0011 (0.0017) model time 0.2328 (0.2409) loss 3.5330 (3.2544) grad_norm 1.7924 (inf) loss_scale 2048.0000 (2220.7633) mem 7382MB [2024-08-27 08:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1230/1251] eta 0:00:05 lr 0.000583 wd 0.0500 time 0.2362 (0.2426) data time 0.0011 (0.0017) model time 0.2352 (0.2409) loss 3.6617 (3.2557) grad_norm 2.5694 (inf) loss_scale 2048.0000 (2219.3599) mem 7382MB [2024-08-27 08:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1240/1251] eta 0:00:02 lr 0.000582 wd 0.0500 time 
0.2291 (0.2425) data time 0.0007 (0.0017) model time 0.2284 (0.2409) loss 2.9539 (3.2552) grad_norm 2.7435 (inf) loss_scale 2048.0000 (2217.9790) mem 7382MB
[2024-08-27 08:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [145/300][1250/1251] eta 0:00:00 lr 0.000582 wd 0.0500 time 0.2275 (0.2424) data time 0.0007 (0.0017) model time 0.2268 (0.2408) loss 3.9954 (3.2557) grad_norm 2.1591 (inf) loss_scale 2048.0000 (2216.6203) mem 7382MB
[2024-08-27 08:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 145 training takes 0:05:03
[2024-08-27 08:56:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-27 08:56:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-27 08:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.433 (0.433) Loss 0.4592 (0.4592) Acc@1 91.504 (91.504) Acc@5 98.340 (98.340) Mem 7382MB
[2024-08-27 08:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.110) Loss 0.7378 (0.7391) Acc@1 84.473 (83.860) Acc@5 95.996 (96.742) Mem 7382MB
[2024-08-27 08:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.094) Loss 1.0732 (0.7669) Acc@1 74.609 (83.110) Acc@5 94.141 (96.740) Mem 7382MB
[2024-08-27 08:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.089) Loss 1.2998 (0.8682) Acc@1 68.750 (80.721) Acc@5 90.820 (95.467) Mem 7382MB
[2024-08-27 08:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.2012 (0.9290) Acc@1 72.070 (79.206) Acc@5 92.188 (94.800) Mem 7382MB
[2024-08-27 08:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.846 Acc@5 94.736
[2024-08-27 08:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.8%
[2024-08-27 08:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.724 (0.724) Loss 0.4072 (0.4072) Acc@1 92.871 (92.871) Acc@5 98.535 (98.535) Mem 7382MB
[2024-08-27 08:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.143) Loss 0.6558 (0.6431) Acc@1 86.719 (86.266) Acc@5 97.070 (97.319) Mem 7382MB
[2024-08-27 08:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.111) Loss 0.9082 (0.6668) Acc@1 79.297 (85.245) Acc@5 95.117 (97.349) Mem 7382MB
[2024-08-27 08:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.101) Loss 1.1533 (0.7568) Acc@1 70.898 (83.002) Acc@5 92.480 (96.406) Mem 7382MB
[2024-08-27 08:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 1.0420 (0.8043) Acc@1 74.707 (81.631) Acc@5 93.555 (95.910) Mem 7382MB
[2024-08-27 08:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.248 Acc@5 95.848
[2024-08-27 08:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.2%
[2024-08-27 08:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.25%
[2024-08-27 08:56:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 08:56:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 08:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][0/1251] eta 0:15:23 lr 0.000582 wd 0.0500 time 0.7382 (0.7382) data time 0.5062 (0.5062) model time 0.0000 (0.0000) loss 3.9200 (3.9200) grad_norm 2.1115 (2.1115) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][10/1251] eta 0:05:55 lr 0.000582 wd 0.0500 time 0.2367 (0.2863) data time 0.0010 (0.0476) model time 0.0000 (0.0000) loss 2.9805 (3.0672) grad_norm 1.8347 (2.0268) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][20/1251] eta 0:05:25 lr 0.000582 wd 0.0500 time 0.2409 (0.2641) data time 0.0009 (0.0257) model time 0.0000 (0.0000) loss 3.7602 (3.1114) grad_norm 1.9773 (2.0475) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][30/1251] eta 0:05:13 lr 0.000582 wd 0.0500 time 0.2392 (0.2567) data time 0.0009 (0.0178) model time 0.0000 (0.0000) loss 3.7580 (3.1390) grad_norm 1.7611 (2.1409) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][40/1251] eta 0:05:05 lr 0.000582 wd 0.0500 time 0.2392 (0.2521) data time 0.0008 (0.0137) model time 0.0000 (0.0000) loss 3.2408 (3.1333) grad_norm 3.9328 (2.2650) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][50/1251] eta 0:04:59 lr 0.000582 wd 0.0500 time 0.2384 (0.2492) data time 0.0012 (0.0112) model time 0.0000 (0.0000) loss 3.2084 (3.1392) grad_norm 1.9101 (2.2546) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][60/1251] eta 0:04:55 lr 0.000582 wd 0.0500 time 0.2377 (0.2478) data time 0.0012 (0.0096) model time 0.2366 
(0.2391) loss 3.3336 (3.1463) grad_norm 3.7754 (2.4708) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][70/1251] eta 0:04:51 lr 0.000582 wd 0.0500 time 0.2373 (0.2467) data time 0.0008 (0.0084) model time 0.2365 (0.2389) loss 3.9555 (3.1111) grad_norm 2.1295 (2.5487) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][80/1251] eta 0:04:47 lr 0.000582 wd 0.0500 time 0.2426 (0.2454) data time 0.0008 (0.0075) model time 0.2418 (0.2377) loss 2.9692 (3.1309) grad_norm 3.1881 (2.5862) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][90/1251] eta 0:04:44 lr 0.000582 wd 0.0500 time 0.2409 (0.2448) data time 0.0009 (0.0068) model time 0.2400 (0.2379) loss 2.5922 (3.1612) grad_norm 3.0423 (2.6132) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][100/1251] eta 0:04:40 lr 0.000582 wd 0.0500 time 0.2318 (0.2441) data time 0.0012 (0.0063) model time 0.2306 (0.2377) loss 3.1185 (3.1506) grad_norm 2.0818 (2.6023) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][110/1251] eta 0:04:38 lr 0.000582 wd 0.0500 time 0.2435 (0.2438) data time 0.0009 (0.0058) model time 0.2426 (0.2380) loss 2.4271 (3.1499) grad_norm 2.4127 (2.5900) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][120/1251] eta 0:04:35 lr 0.000582 wd 0.0500 time 0.2414 (0.2433) data time 0.0011 (0.0054) model time 0.2402 (0.2378) loss 3.7054 (3.1445) grad_norm 1.7529 (2.5628) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][130/1251] 
eta 0:04:32 lr 0.000582 wd 0.0500 time 0.2379 (0.2431) data time 0.0009 (0.0051) model time 0.2370 (0.2379) loss 3.3165 (3.1492) grad_norm 2.2924 (2.5315) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][140/1251] eta 0:04:30 lr 0.000582 wd 0.0500 time 0.2389 (0.2430) data time 0.0010 (0.0048) model time 0.2379 (0.2384) loss 3.0162 (3.1560) grad_norm 2.1310 (2.5104) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][150/1251] eta 0:04:27 lr 0.000582 wd 0.0500 time 0.2511 (0.2429) data time 0.0010 (0.0046) model time 0.2502 (0.2385) loss 3.8003 (3.1533) grad_norm 1.9708 (2.4981) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][160/1251] eta 0:04:24 lr 0.000582 wd 0.0500 time 0.2425 (0.2428) data time 0.0008 (0.0043) model time 0.2417 (0.2387) loss 3.1760 (3.1404) grad_norm 2.8374 (2.4799) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][170/1251] eta 0:04:22 lr 0.000582 wd 0.0500 time 0.2413 (0.2427) data time 0.0010 (0.0041) model time 0.2403 (0.2388) loss 2.7261 (3.1342) grad_norm 3.0006 (2.4736) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][180/1251] eta 0:04:19 lr 0.000582 wd 0.0500 time 0.2498 (0.2426) data time 0.0009 (0.0040) model time 0.2490 (0.2387) loss 3.8143 (3.1410) grad_norm 1.8098 (2.4712) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][190/1251] eta 0:04:17 lr 0.000582 wd 0.0500 time 0.2454 (0.2427) data time 0.0011 (0.0039) model time 0.2442 (0.2390) loss 3.5626 (3.1486) grad_norm 4.3450 (2.4811) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 08:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][200/1251] eta 0:04:14 lr 0.000582 wd 0.0500 time 0.2455 (0.2425) data time 0.0009 (0.0038) model time 0.2446 (0.2390) loss 2.0136 (3.1408) grad_norm 1.9713 (2.4697) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][210/1251] eta 0:04:12 lr 0.000582 wd 0.0500 time 0.2460 (0.2425) data time 0.0011 (0.0036) model time 0.2448 (0.2391) loss 2.7745 (3.1413) grad_norm 2.5861 (2.4585) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][220/1251] eta 0:04:09 lr 0.000581 wd 0.0500 time 0.2432 (0.2424) data time 0.0010 (0.0035) model time 0.2422 (0.2391) loss 3.6667 (3.1521) grad_norm 2.8432 (2.4601) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][230/1251] eta 0:04:07 lr 0.000581 wd 0.0500 time 0.2488 (0.2424) data time 0.0010 (0.0035) model time 0.2477 (0.2392) loss 3.5987 (3.1579) grad_norm 3.0254 (2.4499) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][240/1251] eta 0:04:05 lr 0.000581 wd 0.0500 time 0.2496 (0.2424) data time 0.0010 (0.0034) model time 0.2486 (0.2392) loss 2.7223 (3.1471) grad_norm 2.5128 (2.4374) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][250/1251] eta 0:04:02 lr 0.000581 wd 0.0500 time 0.2597 (0.2423) data time 0.0008 (0.0033) model time 0.2588 (0.2393) loss 4.1384 (3.1607) grad_norm 2.5596 (2.4445) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][260/1251] eta 0:04:00 lr 0.000581 wd 0.0500 time 0.2409 (0.2422) data time 0.0008 (0.0032) model time 0.2402 
(0.2392) loss 2.9733 (3.1597) grad_norm 2.2316 (2.4478) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][270/1251] eta 0:03:57 lr 0.000581 wd 0.0500 time 0.2471 (0.2423) data time 0.0008 (0.0032) model time 0.2463 (0.2394) loss 2.6550 (3.1515) grad_norm 1.6609 (2.4464) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][280/1251] eta 0:03:55 lr 0.000581 wd 0.0500 time 0.2475 (0.2423) data time 0.0008 (0.0031) model time 0.2467 (0.2394) loss 2.4836 (3.1524) grad_norm 2.7407 (2.4773) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][290/1251] eta 0:03:52 lr 0.000581 wd 0.0500 time 0.2551 (0.2422) data time 0.0010 (0.0031) model time 0.2541 (0.2394) loss 3.4033 (3.1604) grad_norm 1.7343 (2.4644) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][300/1251] eta 0:03:50 lr 0.000581 wd 0.0500 time 0.2541 (0.2422) data time 0.0010 (0.0030) model time 0.2531 (0.2395) loss 3.2877 (3.1629) grad_norm 4.1491 (2.4854) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][310/1251] eta 0:03:48 lr 0.000581 wd 0.0500 time 0.2573 (0.2423) data time 0.0010 (0.0029) model time 0.2563 (0.2396) loss 3.4781 (3.1769) grad_norm 2.4410 (2.4820) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][320/1251] eta 0:03:45 lr 0.000581 wd 0.0500 time 0.2456 (0.2422) data time 0.0009 (0.0029) model time 0.2446 (0.2396) loss 3.3244 (3.1955) grad_norm 2.7303 (2.4973) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[146/300][330/1251] eta 0:03:43 lr 0.000581 wd 0.0500 time 0.2449 (0.2422) data time 0.0011 (0.0029) model time 0.2438 (0.2396) loss 2.2280 (3.1971) grad_norm 2.3013 (2.4839) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][340/1251] eta 0:03:40 lr 0.000581 wd 0.0500 time 0.2485 (0.2422) data time 0.0007 (0.0028) model time 0.2479 (0.2396) loss 3.4620 (3.2002) grad_norm 1.6189 (2.4642) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][350/1251] eta 0:03:38 lr 0.000581 wd 0.0500 time 0.2464 (0.2422) data time 0.0008 (0.0028) model time 0.2457 (0.2397) loss 2.1810 (3.1959) grad_norm 2.0151 (2.4601) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][360/1251] eta 0:03:35 lr 0.000581 wd 0.0500 time 0.2459 (0.2422) data time 0.0010 (0.0028) model time 0.2449 (0.2396) loss 3.0248 (3.2048) grad_norm 2.3842 (2.4769) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][370/1251] eta 0:03:34 lr 0.000581 wd 0.0500 time 0.2361 (0.2433) data time 0.0011 (0.0027) model time 0.2351 (0.2410) loss 2.6450 (3.2010) grad_norm 2.0668 (2.4785) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][380/1251] eta 0:03:32 lr 0.000581 wd 0.0500 time 0.2385 (0.2444) data time 0.0011 (0.0027) model time 0.2374 (0.2422) loss 3.6225 (3.2069) grad_norm 3.8718 (2.4769) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][390/1251] eta 0:03:30 lr 0.000581 wd 0.0500 time 0.2450 (0.2444) data time 0.0011 (0.0027) model time 0.2439 (0.2423) loss 2.9541 (3.2036) grad_norm 2.8955 (2.4793) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 08:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][400/1251] eta 0:03:27 lr 0.000581 wd 0.0500 time 0.2392 (0.2442) data time 0.0007 (0.0026) model time 0.2384 (0.2421) loss 3.0905 (3.2081) grad_norm 2.8938 (2.4736) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][410/1251] eta 0:03:25 lr 0.000581 wd 0.0500 time 0.2319 (0.2442) data time 0.0009 (0.0026) model time 0.2310 (0.2421) loss 2.5787 (3.2063) grad_norm 1.5546 (2.4848) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][420/1251] eta 0:03:22 lr 0.000581 wd 0.0500 time 0.2311 (0.2441) data time 0.0008 (0.0026) model time 0.2303 (0.2420) loss 3.7956 (3.2133) grad_norm 2.7413 (2.4808) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][430/1251] eta 0:03:20 lr 0.000581 wd 0.0500 time 0.2368 (0.2440) data time 0.0007 (0.0026) model time 0.2361 (0.2419) loss 3.7276 (3.2169) grad_norm 2.7546 (2.4799) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][440/1251] eta 0:03:17 lr 0.000581 wd 0.0500 time 0.2308 (0.2439) data time 0.0007 (0.0025) model time 0.2301 (0.2418) loss 3.2365 (3.2118) grad_norm 2.6557 (2.4776) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][450/1251] eta 0:03:15 lr 0.000580 wd 0.0500 time 0.2371 (0.2438) data time 0.0012 (0.0025) model time 0.2359 (0.2418) loss 2.8775 (3.2127) grad_norm 1.7555 (2.4718) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][460/1251] eta 0:03:12 lr 0.000580 wd 0.0500 time 0.2407 (0.2437) data time 0.0009 
(0.0025) model time 0.2398 (0.2416) loss 3.1382 (3.2143) grad_norm 2.4682 (2.4684) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][470/1251] eta 0:03:10 lr 0.000580 wd 0.0500 time 0.2364 (0.2436) data time 0.0008 (0.0025) model time 0.2356 (0.2416) loss 2.6340 (3.2151) grad_norm 2.7994 (2.4684) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][480/1251] eta 0:03:07 lr 0.000580 wd 0.0500 time 0.2355 (0.2436) data time 0.0009 (0.0024) model time 0.2346 (0.2415) loss 3.5772 (3.2154) grad_norm 2.3586 (2.4728) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][490/1251] eta 0:03:05 lr 0.000580 wd 0.0500 time 0.2360 (0.2436) data time 0.0012 (0.0024) model time 0.2348 (0.2415) loss 3.1175 (3.2167) grad_norm 1.7328 (2.4692) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][500/1251] eta 0:03:02 lr 0.000580 wd 0.0500 time 0.2364 (0.2436) data time 0.0012 (0.0024) model time 0.2353 (0.2416) loss 3.4252 (3.2207) grad_norm 2.1166 (2.4599) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][510/1251] eta 0:03:00 lr 0.000580 wd 0.0500 time 0.2452 (0.2436) data time 0.0007 (0.0024) model time 0.2446 (0.2415) loss 3.1944 (3.2207) grad_norm 1.8720 (2.4543) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][520/1251] eta 0:02:58 lr 0.000580 wd 0.0500 time 0.2391 (0.2435) data time 0.0012 (0.0024) model time 0.2379 (0.2415) loss 2.9141 (3.2244) grad_norm 2.6297 (2.4497) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [146/300][530/1251] eta 0:02:55 lr 0.000580 wd 0.0500 time 0.2353 (0.2435) data time 0.0010 (0.0023) model time 0.2343 (0.2415) loss 2.7239 (3.2206) grad_norm 1.9530 (2.4436) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][540/1251] eta 0:02:53 lr 0.000580 wd 0.0500 time 0.2389 (0.2435) data time 0.0009 (0.0023) model time 0.2380 (0.2415) loss 3.1745 (3.2227) grad_norm 2.2783 (2.4417) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][550/1251] eta 0:02:50 lr 0.000580 wd 0.0500 time 0.2394 (0.2434) data time 0.0007 (0.0023) model time 0.2386 (0.2414) loss 3.5313 (3.2257) grad_norm 2.2149 (2.4521) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][560/1251] eta 0:02:48 lr 0.000580 wd 0.0500 time 0.2339 (0.2434) data time 0.0011 (0.0023) model time 0.2328 (0.2414) loss 2.5996 (3.2193) grad_norm 2.2744 (2.4605) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][570/1251] eta 0:02:45 lr 0.000580 wd 0.0500 time 0.2347 (0.2433) data time 0.0008 (0.0023) model time 0.2339 (0.2413) loss 3.6351 (3.2212) grad_norm 2.3887 (2.4628) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][580/1251] eta 0:02:43 lr 0.000580 wd 0.0500 time 0.2363 (0.2433) data time 0.0008 (0.0023) model time 0.2355 (0.2413) loss 2.8959 (3.2204) grad_norm 1.8877 (2.4625) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][590/1251] eta 0:02:40 lr 0.000580 wd 0.0500 time 0.2391 (0.2432) data time 0.0011 (0.0023) model time 0.2380 (0.2412) loss 3.4907 (3.2212) grad_norm 2.1083 (2.4739) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][600/1251] eta 0:02:38 lr 0.000580 wd 0.0500 time 0.2374 (0.2431) data time 0.0007 (0.0022) model time 0.2367 (0.2412) loss 2.3813 (3.2177) grad_norm 2.9296 (2.4696) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][610/1251] eta 0:02:35 lr 0.000580 wd 0.0500 time 0.2364 (0.2430) data time 0.0008 (0.0022) model time 0.2356 (0.2411) loss 3.9573 (3.2167) grad_norm 2.0598 (2.4685) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][620/1251] eta 0:02:33 lr 0.000580 wd 0.0500 time 0.2411 (0.2429) data time 0.0010 (0.0022) model time 0.2401 (0.2410) loss 3.5119 (3.2209) grad_norm 2.1402 (2.4687) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][630/1251] eta 0:02:30 lr 0.000580 wd 0.0500 time 0.2331 (0.2429) data time 0.0009 (0.0022) model time 0.2321 (0.2409) loss 3.0162 (3.2245) grad_norm 2.7941 (2.4674) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][640/1251] eta 0:02:28 lr 0.000580 wd 0.0500 time 0.2322 (0.2428) data time 0.0014 (0.0022) model time 0.2308 (0.2408) loss 3.1235 (3.2250) grad_norm 1.8568 (2.4666) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][650/1251] eta 0:02:25 lr 0.000580 wd 0.0500 time 0.2406 (0.2427) data time 0.0010 (0.0022) model time 0.2395 (0.2407) loss 3.1635 (3.2225) grad_norm 3.4508 (2.4721) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][660/1251] eta 0:02:23 lr 0.000580 wd 0.0500 time 0.2294 (0.2426) data time 
0.0011 (0.0021) model time 0.2283 (0.2406) loss 3.5046 (3.2240) grad_norm 2.6947 (2.4688) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][670/1251] eta 0:02:20 lr 0.000579 wd 0.0500 time 0.2301 (0.2425) data time 0.0009 (0.0021) model time 0.2292 (0.2406) loss 3.1103 (3.2212) grad_norm 2.0236 (2.4722) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][680/1251] eta 0:02:18 lr 0.000579 wd 0.0500 time 0.2375 (0.2425) data time 0.0009 (0.0021) model time 0.2366 (0.2406) loss 3.3643 (3.2178) grad_norm 1.8788 (2.4725) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][690/1251] eta 0:02:16 lr 0.000579 wd 0.0500 time 0.2448 (0.2425) data time 0.0010 (0.0021) model time 0.2438 (0.2406) loss 2.7835 (3.2192) grad_norm 3.3413 (2.4949) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][700/1251] eta 0:02:13 lr 0.000579 wd 0.0500 time 0.2392 (0.2425) data time 0.0010 (0.0021) model time 0.2382 (0.2405) loss 3.5375 (3.2203) grad_norm 2.3773 (2.4929) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][710/1251] eta 0:02:11 lr 0.000579 wd 0.0500 time 0.2306 (0.2424) data time 0.0011 (0.0021) model time 0.2295 (0.2405) loss 3.3594 (3.2228) grad_norm 1.7433 (2.4905) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][720/1251] eta 0:02:08 lr 0.000579 wd 0.0500 time 0.2416 (0.2424) data time 0.0010 (0.0021) model time 0.2406 (0.2405) loss 2.7852 (3.2160) grad_norm 2.5871 (2.4932) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [146/300][730/1251] eta 0:02:06 lr 0.000579 wd 0.0500 time 0.2381 (0.2424) data time 0.0011 (0.0021) model time 0.2369 (0.2405) loss 3.6965 (3.2163) grad_norm 2.1547 (2.4921) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][740/1251] eta 0:02:03 lr 0.000579 wd 0.0500 time 0.2430 (0.2424) data time 0.0010 (0.0021) model time 0.2420 (0.2406) loss 3.5846 (3.2191) grad_norm 2.3820 (2.4881) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][750/1251] eta 0:02:01 lr 0.000579 wd 0.0500 time 0.2396 (0.2424) data time 0.0007 (0.0021) model time 0.2389 (0.2405) loss 2.8268 (3.2182) grad_norm 2.9197 (2.4924) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][760/1251] eta 0:01:59 lr 0.000579 wd 0.0500 time 0.2419 (0.2424) data time 0.0009 (0.0020) model time 0.2410 (0.2405) loss 2.1998 (3.2182) grad_norm 4.3498 (2.4989) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][770/1251] eta 0:01:56 lr 0.000579 wd 0.0500 time 0.2459 (0.2424) data time 0.0007 (0.0020) model time 0.2452 (0.2405) loss 2.4368 (3.2182) grad_norm 2.3916 (2.4975) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][780/1251] eta 0:01:54 lr 0.000579 wd 0.0500 time 0.2316 (0.2423) data time 0.0011 (0.0020) model time 0.2305 (0.2405) loss 3.2484 (3.2191) grad_norm 1.9841 (2.5012) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][790/1251] eta 0:01:51 lr 0.000579 wd 0.0500 time 0.2392 (0.2423) data time 0.0012 (0.0020) model time 0.2380 (0.2405) loss 3.6006 (3.2221) grad_norm 2.3888 (2.4991) 
loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][800/1251] eta 0:01:49 lr 0.000579 wd 0.0500 time 0.2342 (0.2422) data time 0.0008 (0.0020) model time 0.2334 (0.2404) loss 3.0753 (3.2200) grad_norm 2.3593 (2.4971) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][810/1251] eta 0:01:46 lr 0.000579 wd 0.0500 time 0.2371 (0.2423) data time 0.0010 (0.0020) model time 0.2360 (0.2404) loss 2.9644 (3.2192) grad_norm 2.1227 (2.4976) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][820/1251] eta 0:01:44 lr 0.000579 wd 0.0500 time 0.2373 (0.2422) data time 0.0008 (0.0020) model time 0.2365 (0.2404) loss 3.7071 (3.2212) grad_norm 2.3989 (2.4941) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][830/1251] eta 0:01:41 lr 0.000579 wd 0.0500 time 0.2293 (0.2422) data time 0.0008 (0.0020) model time 0.2285 (0.2403) loss 2.4450 (3.2215) grad_norm 2.3000 (2.4911) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][840/1251] eta 0:01:39 lr 0.000579 wd 0.0500 time 0.2391 (0.2422) data time 0.0008 (0.0020) model time 0.2383 (0.2403) loss 2.9708 (3.2199) grad_norm 2.0054 (2.4865) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][850/1251] eta 0:01:37 lr 0.000579 wd 0.0500 time 0.2449 (0.2421) data time 0.0009 (0.0020) model time 0.2441 (0.2403) loss 3.0849 (3.2227) grad_norm 2.3940 (2.4836) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][860/1251] eta 0:01:34 lr 0.000579 wd 0.0500 time 0.2469 (0.2421) 
data time 0.0008 (0.0020) model time 0.2461 (0.2403) loss 3.8004 (3.2269) grad_norm 3.2837 (2.4931) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 08:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][870/1251] eta 0:01:32 lr 0.000579 wd 0.0500 time 0.2341 (0.2421) data time 0.0011 (0.0020) model time 0.2330 (0.2403) loss 3.0110 (3.2254) grad_norm 2.3081 (2.5007) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][880/1251] eta 0:01:29 lr 0.000579 wd 0.0500 time 0.2398 (0.2421) data time 0.0008 (0.0019) model time 0.2390 (0.2403) loss 3.0575 (3.2196) grad_norm 2.5913 (2.4985) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][890/1251] eta 0:01:27 lr 0.000579 wd 0.0500 time 0.2365 (0.2421) data time 0.0009 (0.0019) model time 0.2357 (0.2403) loss 2.3704 (3.2196) grad_norm 2.9099 (2.5023) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][900/1251] eta 0:01:24 lr 0.000578 wd 0.0500 time 0.2466 (0.2421) data time 0.0010 (0.0019) model time 0.2456 (0.2403) loss 3.3963 (3.2196) grad_norm 2.4806 (2.5011) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][910/1251] eta 0:01:22 lr 0.000578 wd 0.0500 time 0.2376 (0.2421) data time 0.0010 (0.0019) model time 0.2366 (0.2403) loss 3.2095 (3.2192) grad_norm 2.3810 (2.5023) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][920/1251] eta 0:01:20 lr 0.000578 wd 0.0500 time 0.2387 (0.2420) data time 0.0010 (0.0019) model time 0.2377 (0.2402) loss 1.7912 (3.2179) grad_norm 2.1659 (2.4999) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [146/300][930/1251] eta 0:01:17 lr 0.000578 wd 0.0500 time 0.2365 (0.2420) data time 0.0009 (0.0019) model time 0.2356 (0.2402) loss 2.3523 (3.2172) grad_norm 2.5761 (2.4994) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][940/1251] eta 0:01:15 lr 0.000578 wd 0.0500 time 0.2476 (0.2421) data time 0.0010 (0.0019) model time 0.2466 (0.2403) loss 3.3370 (3.2174) grad_norm 2.0930 (2.4979) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][950/1251] eta 0:01:12 lr 0.000578 wd 0.0500 time 0.2434 (0.2421) data time 0.0009 (0.0019) model time 0.2425 (0.2403) loss 2.9002 (3.2191) grad_norm 2.1816 (2.4950) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][960/1251] eta 0:01:10 lr 0.000578 wd 0.0500 time 0.2355 (0.2421) data time 0.0010 (0.0019) model time 0.2344 (0.2403) loss 3.4323 (3.2182) grad_norm 2.2352 (2.4907) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][970/1251] eta 0:01:08 lr 0.000578 wd 0.0500 time 0.2373 (0.2421) data time 0.0009 (0.0019) model time 0.2364 (0.2403) loss 3.7000 (3.2197) grad_norm 2.7918 (2.4891) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][980/1251] eta 0:01:05 lr 0.000578 wd 0.0500 time 0.2418 (0.2421) data time 0.0008 (0.0019) model time 0.2409 (0.2403) loss 3.2181 (3.2213) grad_norm 2.2794 (2.4879) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][990/1251] eta 0:01:03 lr 0.000578 wd 0.0500 time 0.2405 (0.2421) data time 0.0009 (0.0019) model time 0.2397 (0.2403) loss 2.7302 (3.2202) grad_norm 
2.0349 (2.4844) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1000/1251] eta 0:01:00 lr 0.000578 wd 0.0500 time 0.2395 (0.2422) data time 0.0010 (0.0019) model time 0.2385 (0.2404) loss 3.6234 (3.2236) grad_norm 2.0776 (2.4832) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1010/1251] eta 0:00:58 lr 0.000578 wd 0.0500 time 0.2360 (0.2421) data time 0.0008 (0.0019) model time 0.2353 (0.2404) loss 2.9929 (3.2236) grad_norm 2.4415 (2.4805) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1020/1251] eta 0:00:55 lr 0.000578 wd 0.0500 time 0.2371 (0.2422) data time 0.0008 (0.0019) model time 0.2364 (0.2404) loss 3.5124 (3.2230) grad_norm 2.3003 (2.4792) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1030/1251] eta 0:00:53 lr 0.000578 wd 0.0500 time 0.2417 (0.2422) data time 0.0009 (0.0019) model time 0.2408 (0.2404) loss 2.8070 (3.2259) grad_norm 3.1736 (2.4785) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1040/1251] eta 0:00:51 lr 0.000578 wd 0.0500 time 0.2378 (0.2422) data time 0.0011 (0.0019) model time 0.2367 (0.2404) loss 3.4755 (3.2257) grad_norm 3.0312 (2.4790) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1050/1251] eta 0:00:48 lr 0.000578 wd 0.0500 time 0.2399 (0.2422) data time 0.0009 (0.0019) model time 0.2390 (0.2405) loss 3.2756 (3.2263) grad_norm 2.8360 (2.4791) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1060/1251] eta 0:00:46 lr 0.000578 wd 
0.0500 time 0.2387 (0.2422) data time 0.0011 (0.0019) model time 0.2375 (0.2404) loss 3.4472 (3.2282) grad_norm 3.1542 (2.4814) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1070/1251] eta 0:00:43 lr 0.000578 wd 0.0500 time 0.2370 (0.2422) data time 0.0010 (0.0019) model time 0.2360 (0.2404) loss 2.9514 (3.2283) grad_norm 2.7103 (2.4845) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1080/1251] eta 0:00:41 lr 0.000578 wd 0.0500 time 0.2420 (0.2422) data time 0.0008 (0.0019) model time 0.2413 (0.2405) loss 2.0936 (3.2267) grad_norm 2.4269 (2.4863) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1090/1251] eta 0:00:38 lr 0.000578 wd 0.0500 time 0.2280 (0.2422) data time 0.0010 (0.0019) model time 0.2271 (0.2404) loss 3.9581 (3.2279) grad_norm 1.9062 (2.4847) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1100/1251] eta 0:00:36 lr 0.000578 wd 0.0500 time 0.2385 (0.2422) data time 0.0010 (0.0019) model time 0.2374 (0.2405) loss 3.5146 (3.2284) grad_norm 2.7753 (2.4838) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1110/1251] eta 0:00:34 lr 0.000578 wd 0.0500 time 0.2361 (0.2422) data time 0.0009 (0.0019) model time 0.2353 (0.2404) loss 2.7032 (3.2281) grad_norm 2.4761 (2.4835) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1120/1251] eta 0:00:31 lr 0.000578 wd 0.0500 time 0.2287 (0.2422) data time 0.0013 (0.0019) model time 0.2274 (0.2404) loss 3.4755 (3.2291) grad_norm 2.7714 (2.4854) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1130/1251] eta 0:00:29 lr 0.000577 wd 0.0500 time 0.2392 (0.2422) data time 0.0007 (0.0019) model time 0.2385 (0.2404) loss 3.4790 (3.2270) grad_norm 2.5572 (2.4833) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1140/1251] eta 0:00:26 lr 0.000577 wd 0.0500 time 0.2379 (0.2422) data time 0.0008 (0.0019) model time 0.2371 (0.2404) loss 3.1815 (3.2282) grad_norm 2.9555 (2.4841) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1150/1251] eta 0:00:24 lr 0.000577 wd 0.0500 time 0.2328 (0.2422) data time 0.0009 (0.0019) model time 0.2319 (0.2404) loss 2.3101 (3.2264) grad_norm 2.1417 (2.4860) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1160/1251] eta 0:00:22 lr 0.000577 wd 0.0500 time 0.2397 (0.2421) data time 0.0015 (0.0018) model time 0.2382 (0.2404) loss 3.2489 (3.2275) grad_norm 1.6757 (2.4855) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1170/1251] eta 0:00:19 lr 0.000577 wd 0.0500 time 0.2401 (0.2421) data time 0.0007 (0.0018) model time 0.2394 (0.2404) loss 2.9473 (3.2262) grad_norm 3.7651 (2.4869) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1180/1251] eta 0:00:17 lr 0.000577 wd 0.0500 time 0.2391 (0.2421) data time 0.0009 (0.0018) model time 0.2382 (0.2404) loss 2.7457 (3.2259) grad_norm 2.5054 (2.4878) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1190/1251] eta 0:00:14 lr 0.000577 wd 0.0500 time 0.2351 (0.2421) data time 0.0011 (0.0018) model time 0.2340 (0.2404) loss 
3.2884 (3.2262) grad_norm 1.6448 (2.4850) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1200/1251] eta 0:00:12 lr 0.000577 wd 0.0500 time 0.2386 (0.2421) data time 0.0008 (0.0018) model time 0.2378 (0.2404) loss 2.3527 (3.2255) grad_norm 1.7207 (2.4815) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1210/1251] eta 0:00:09 lr 0.000577 wd 0.0500 time 0.2398 (0.2421) data time 0.0009 (0.0018) model time 0.2388 (0.2404) loss 3.5414 (3.2233) grad_norm 1.8084 (2.4799) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1220/1251] eta 0:00:07 lr 0.000577 wd 0.0500 time 0.2440 (0.2421) data time 0.0013 (0.0018) model time 0.2427 (0.2403) loss 3.3790 (3.2228) grad_norm 1.9327 (2.4792) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1230/1251] eta 0:00:05 lr 0.000577 wd 0.0500 time 0.2438 (0.2420) data time 0.0009 (0.0018) model time 0.2429 (0.2403) loss 3.8790 (3.2231) grad_norm 2.5273 (2.4794) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1240/1251] eta 0:00:02 lr 0.000577 wd 0.0500 time 0.2259 (0.2420) data time 0.0005 (0.0018) model time 0.2254 (0.2402) loss 2.7040 (3.2238) grad_norm 2.8402 (2.4791) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [146/300][1250/1251] eta 0:00:00 lr 0.000577 wd 0.0500 time 0.2300 (0.2418) data time 0.0005 (0.0018) model time 0.2295 (0.2401) loss 3.4909 (3.2265) grad_norm 2.2831 (2.4772) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 146 training takes 0:05:02 
[2024-08-27 09:01:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:01:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.445 (0.445) Loss 0.4209 (0.4209) Acc@1 91.797 (91.797) Acc@5 98.535 (98.535) Mem 7382MB [2024-08-27 09:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.109) Loss 0.7471 (0.7268) Acc@1 85.449 (83.975) Acc@5 96.191 (96.795) Mem 7382MB [2024-08-27 09:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.094) Loss 1.0625 (0.7450) Acc@1 73.633 (83.082) Acc@5 93.652 (96.740) Mem 7382MB [2024-08-27 09:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.088) Loss 1.3037 (0.8547) Acc@1 69.434 (80.611) Acc@5 90.137 (95.435) Mem 7382MB [2024-08-27 09:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 1.1602 (0.9046) Acc@1 72.754 (79.361) Acc@5 92.383 (94.884) Mem 7382MB [2024-08-27 09:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.984 Acc@5 94.850 [2024-08-27 09:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.0% [2024-08-27 09:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.770 (0.770) Loss 0.4050 (0.4050) Acc@1 92.969 (92.969) Acc@5 98.438 (98.438) Mem 7382MB [2024-08-27 09:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.144) Loss 0.6519 (0.6421) Acc@1 86.816 (86.284) Acc@5 97.070 (97.319) Mem 7382MB [2024-08-27 09:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.112) Loss 0.9062 (0.6659) Acc@1 79.199 (85.286) Acc@5 95.215 (97.368) Mem 7382MB 
[2024-08-27 09:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.100) Loss 1.1543 (0.7561) Acc@1 70.898 (83.024) Acc@5 92.480 (96.402) Mem 7382MB [2024-08-27 09:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.091) Loss 1.0381 (0.8034) Acc@1 74.609 (81.650) Acc@5 93.457 (95.910) Mem 7382MB [2024-08-27 09:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.240 Acc@5 95.854 [2024-08-27 09:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-27 09:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][0/1251] eta 0:20:46 lr 0.000577 wd 0.0500 time 0.9966 (0.9966) data time 0.6353 (0.6353) model time 0.0000 (0.0000) loss 3.7801 (3.7801) grad_norm 2.1165 (2.1165) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][10/1251] eta 0:06:24 lr 0.000577 wd 0.0500 time 0.2350 (0.3099) data time 0.0011 (0.0589) model time 0.0000 (0.0000) loss 3.9759 (3.0507) grad_norm 3.0568 (2.4966) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][20/1251] eta 0:05:40 lr 0.000577 wd 0.0500 time 0.2333 (0.2769) data time 0.0009 (0.0314) model time 0.0000 (0.0000) loss 2.3139 (3.1018) grad_norm 2.3157 (2.4925) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][30/1251] eta 0:05:40 lr 0.000577 wd 0.0500 time 0.4582 (0.2788) data time 0.0008 (0.0216) model time 0.0000 (0.0000) loss 2.6582 (3.1249) grad_norm 1.5422 (2.4543) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][40/1251] eta 0:05:32 lr 0.000577 wd 0.0500 time 0.2375 (0.2750) data time 0.0011 (0.0166) model time 0.0000 
(0.0000) loss 3.2033 (3.1774) grad_norm 1.7439 (2.4424) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][50/1251] eta 0:05:22 lr 0.000577 wd 0.0500 time 0.2408 (0.2682) data time 0.0011 (0.0137) model time 0.0000 (0.0000) loss 3.1483 (3.1770) grad_norm 1.8217 (2.3822) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][60/1251] eta 0:05:13 lr 0.000577 wd 0.0500 time 0.2366 (0.2635) data time 0.0011 (0.0117) model time 0.2356 (0.2383) loss 2.9325 (3.1596) grad_norm 2.8789 (2.4072) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][70/1251] eta 0:05:07 lr 0.000577 wd 0.0500 time 0.2416 (0.2600) data time 0.0009 (0.0102) model time 0.2407 (0.2378) loss 3.1291 (3.1445) grad_norm 1.7620 (2.4148) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][80/1251] eta 0:05:01 lr 0.000577 wd 0.0500 time 0.2321 (0.2574) data time 0.0007 (0.0091) model time 0.2313 (0.2380) loss 3.4267 (3.1709) grad_norm 4.4019 (2.5026) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][90/1251] eta 0:04:56 lr 0.000577 wd 0.0500 time 0.2387 (0.2554) data time 0.0009 (0.0082) model time 0.2378 (0.2379) loss 3.9597 (3.1778) grad_norm 3.8398 (2.5484) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][100/1251] eta 0:04:52 lr 0.000577 wd 0.0500 time 0.2337 (0.2537) data time 0.0008 (0.0075) model time 0.2329 (0.2378) loss 3.3474 (3.1741) grad_norm 2.4663 (2.5490) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][110/1251] eta 
0:04:48 lr 0.000576 wd 0.0500 time 0.2412 (0.2525) data time 0.0009 (0.0069) model time 0.2403 (0.2379) loss 3.7779 (3.1769) grad_norm 3.0434 (2.5244) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][120/1251] eta 0:04:44 lr 0.000576 wd 0.0500 time 0.2341 (0.2515) data time 0.0008 (0.0064) model time 0.2333 (0.2382) loss 2.8677 (3.1673) grad_norm 2.2455 (2.5278) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][130/1251] eta 0:04:40 lr 0.000576 wd 0.0500 time 0.2367 (0.2506) data time 0.0008 (0.0061) model time 0.2360 (0.2382) loss 4.2169 (3.1774) grad_norm 2.0819 (2.5045) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][140/1251] eta 0:04:37 lr 0.000576 wd 0.0500 time 0.2354 (0.2498) data time 0.0010 (0.0057) model time 0.2345 (0.2381) loss 3.5254 (3.1513) grad_norm 1.7756 (2.4938) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][150/1251] eta 0:04:34 lr 0.000576 wd 0.0500 time 0.2288 (0.2492) data time 0.0011 (0.0054) model time 0.2277 (0.2382) loss 2.9751 (3.1483) grad_norm 2.5720 (2.4739) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][160/1251] eta 0:04:31 lr 0.000576 wd 0.0500 time 0.2326 (0.2485) data time 0.0008 (0.0052) model time 0.2318 (0.2381) loss 3.3476 (3.1628) grad_norm 2.2430 (2.4575) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][170/1251] eta 0:04:28 lr 0.000576 wd 0.0500 time 0.2408 (0.2480) data time 0.0008 (0.0049) model time 0.2400 (0.2382) loss 3.7310 (3.1863) grad_norm 1.9869 (2.4536) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 09:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][180/1251] eta 0:04:25 lr 0.000576 wd 0.0500 time 0.2367 (0.2476) data time 0.0008 (0.0047) model time 0.2359 (0.2383) loss 2.1345 (3.1971) grad_norm 1.9827 (2.4437) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][190/1251] eta 0:04:22 lr 0.000576 wd 0.0500 time 0.2432 (0.2471) data time 0.0010 (0.0045) model time 0.2422 (0.2382) loss 3.3843 (3.2051) grad_norm 2.4196 (2.4264) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][200/1251] eta 0:04:19 lr 0.000576 wd 0.0500 time 0.2340 (0.2466) data time 0.0010 (0.0044) model time 0.2329 (0.2381) loss 3.0024 (3.2114) grad_norm 2.9584 (2.4223) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][210/1251] eta 0:04:16 lr 0.000576 wd 0.0500 time 0.2329 (0.2463) data time 0.0012 (0.0042) model time 0.2318 (0.2381) loss 3.5113 (3.2126) grad_norm 2.3255 (2.4108) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][220/1251] eta 0:04:13 lr 0.000576 wd 0.0500 time 0.2387 (0.2461) data time 0.0011 (0.0041) model time 0.2377 (0.2382) loss 3.0727 (3.2081) grad_norm 2.5787 (2.4021) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][230/1251] eta 0:04:10 lr 0.000576 wd 0.0500 time 0.2495 (0.2458) data time 0.0011 (0.0040) model time 0.2483 (0.2381) loss 3.6277 (3.2152) grad_norm 2.7795 (2.3985) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][240/1251] eta 0:04:08 lr 0.000576 wd 0.0500 time 0.2369 (0.2454) data time 0.0009 (0.0039) model time 0.2360 
(0.2380) loss 4.1283 (3.2111) grad_norm 4.1457 (2.4017) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][250/1251] eta 0:04:05 lr 0.000576 wd 0.0500 time 0.2451 (0.2454) data time 0.0019 (0.0038) model time 0.2432 (0.2383) loss 3.1116 (3.2112) grad_norm 2.2610 (2.4067) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][260/1251] eta 0:04:02 lr 0.000576 wd 0.0500 time 0.2387 (0.2452) data time 0.0010 (0.0037) model time 0.2377 (0.2383) loss 3.8059 (3.2126) grad_norm 8.5704 (2.4303) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][270/1251] eta 0:04:00 lr 0.000576 wd 0.0500 time 0.2428 (0.2450) data time 0.0011 (0.0036) model time 0.2417 (0.2383) loss 2.4644 (3.2205) grad_norm 2.0171 (2.4371) loss_scale 4096.0000 (2055.5572) mem 7382MB [2024-08-27 09:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][280/1251] eta 0:03:57 lr 0.000576 wd 0.0500 time 0.2391 (0.2448) data time 0.0012 (0.0035) model time 0.2379 (0.2384) loss 3.6440 (3.2177) grad_norm 1.9498 (2.4270) loss_scale 4096.0000 (2128.1708) mem 7382MB [2024-08-27 09:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][290/1251] eta 0:03:55 lr 0.000576 wd 0.0500 time 0.2386 (0.2447) data time 0.0008 (0.0035) model time 0.2378 (0.2384) loss 3.8984 (3.2207) grad_norm 2.1261 (2.4153) loss_scale 4096.0000 (2195.7938) mem 7382MB [2024-08-27 09:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][300/1251] eta 0:03:52 lr 0.000576 wd 0.0500 time 0.2348 (0.2446) data time 0.0011 (0.0034) model time 0.2337 (0.2385) loss 3.3215 (3.2264) grad_norm 2.0620 (2.4080) loss_scale 4096.0000 (2258.9236) mem 7382MB [2024-08-27 09:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[147/300][310/1251] eta 0:03:50 lr 0.000576 wd 0.0500 time 0.2390 (0.2445) data time 0.0011 (0.0033) model time 0.2379 (0.2385) loss 3.4005 (3.2301) grad_norm 2.3794 (2.4121) loss_scale 4096.0000 (2317.9936) mem 7382MB [2024-08-27 09:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][320/1251] eta 0:03:47 lr 0.000576 wd 0.0500 time 0.2349 (0.2444) data time 0.0013 (0.0033) model time 0.2337 (0.2386) loss 1.7709 (3.2270) grad_norm 2.3389 (2.4089) loss_scale 4096.0000 (2373.3832) mem 7382MB [2024-08-27 09:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][330/1251] eta 0:03:44 lr 0.000575 wd 0.0500 time 0.2438 (0.2442) data time 0.0011 (0.0032) model time 0.2428 (0.2386) loss 3.4283 (3.2251) grad_norm 2.8061 (2.4167) loss_scale 4096.0000 (2425.4260) mem 7382MB [2024-08-27 09:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][340/1251] eta 0:03:42 lr 0.000575 wd 0.0500 time 0.2361 (0.2441) data time 0.0007 (0.0032) model time 0.2354 (0.2386) loss 2.2506 (3.2193) grad_norm 1.5111 (2.4081) loss_scale 4096.0000 (2474.4164) mem 7382MB [2024-08-27 09:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][350/1251] eta 0:03:39 lr 0.000575 wd 0.0500 time 0.2377 (0.2440) data time 0.0011 (0.0031) model time 0.2367 (0.2386) loss 3.6286 (3.2246) grad_norm 2.4743 (2.4135) loss_scale 4096.0000 (2520.6154) mem 7382MB [2024-08-27 09:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][360/1251] eta 0:03:37 lr 0.000575 wd 0.0500 time 0.2418 (0.2439) data time 0.0009 (0.0031) model time 0.2409 (0.2386) loss 3.2060 (3.2298) grad_norm 2.2214 (2.4137) loss_scale 4096.0000 (2564.2548) mem 7382MB [2024-08-27 09:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][370/1251] eta 0:03:34 lr 0.000575 wd 0.0500 time 0.2405 (0.2439) data time 0.0009 (0.0030) model time 0.2396 (0.2387) loss 3.7785 (3.2345) grad_norm 3.0424 (2.4215) loss_scale 4096.0000 
(2605.5418) mem 7382MB [2024-08-27 09:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][380/1251] eta 0:03:32 lr 0.000575 wd 0.0500 time 0.2361 (0.2438) data time 0.0010 (0.0030) model time 0.2352 (0.2386) loss 2.3739 (3.2260) grad_norm 2.0422 (2.4250) loss_scale 4096.0000 (2644.6614) mem 7382MB [2024-08-27 09:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][390/1251] eta 0:03:29 lr 0.000575 wd 0.0500 time 0.2368 (0.2436) data time 0.0008 (0.0029) model time 0.2360 (0.2386) loss 3.2080 (3.2231) grad_norm 2.3506 (2.4234) loss_scale 4096.0000 (2681.7801) mem 7382MB [2024-08-27 09:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][400/1251] eta 0:03:27 lr 0.000575 wd 0.0500 time 0.2371 (0.2435) data time 0.0008 (0.0029) model time 0.2362 (0.2386) loss 3.0690 (3.2259) grad_norm 2.6818 (2.4244) loss_scale 4096.0000 (2717.0474) mem 7382MB [2024-08-27 09:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][410/1251] eta 0:03:24 lr 0.000575 wd 0.0500 time 0.2393 (0.2434) data time 0.0009 (0.0029) model time 0.2384 (0.2386) loss 3.3169 (3.2293) grad_norm 1.8783 (2.4199) loss_scale 4096.0000 (2750.5985) mem 7382MB [2024-08-27 09:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][420/1251] eta 0:03:22 lr 0.000575 wd 0.0500 time 0.2410 (0.2433) data time 0.0008 (0.0028) model time 0.2403 (0.2386) loss 3.6332 (3.2352) grad_norm 2.1876 (2.4184) loss_scale 4096.0000 (2782.5558) mem 7382MB [2024-08-27 09:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][430/1251] eta 0:03:19 lr 0.000575 wd 0.0500 time 0.2377 (0.2433) data time 0.0009 (0.0028) model time 0.2368 (0.2386) loss 3.7848 (3.2328) grad_norm 2.6327 (2.4322) loss_scale 4096.0000 (2813.0302) mem 7382MB [2024-08-27 09:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][440/1251] eta 0:03:17 lr 0.000575 wd 0.0500 time 0.2323 (0.2432) data time 0.0008 
(0.0027) model time 0.2315 (0.2386) loss 3.0779 (3.2362) grad_norm 2.7690 (2.4316) loss_scale 4096.0000 (2842.1224) mem 7382MB [2024-08-27 09:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][450/1251] eta 0:03:14 lr 0.000575 wd 0.0500 time 0.2443 (0.2431) data time 0.0009 (0.0027) model time 0.2434 (0.2386) loss 3.4674 (3.2405) grad_norm 2.3130 (2.4324) loss_scale 4096.0000 (2869.9246) mem 7382MB [2024-08-27 09:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][460/1251] eta 0:03:12 lr 0.000575 wd 0.0500 time 0.2442 (0.2431) data time 0.0008 (0.0027) model time 0.2435 (0.2387) loss 3.5836 (3.2427) grad_norm 1.6890 (2.4301) loss_scale 4096.0000 (2896.5206) mem 7382MB [2024-08-27 09:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][470/1251] eta 0:03:09 lr 0.000575 wd 0.0500 time 0.2384 (0.2430) data time 0.0007 (0.0026) model time 0.2377 (0.2387) loss 4.1201 (3.2424) grad_norm 2.4022 (2.4263) loss_scale 4096.0000 (2921.9873) mem 7382MB [2024-08-27 09:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][480/1251] eta 0:03:07 lr 0.000575 wd 0.0500 time 0.2408 (0.2429) data time 0.0010 (0.0026) model time 0.2398 (0.2386) loss 3.0447 (3.2427) grad_norm 2.1876 (2.4233) loss_scale 4096.0000 (2946.3950) mem 7382MB [2024-08-27 09:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][490/1251] eta 0:03:04 lr 0.000575 wd 0.0500 time 0.2416 (0.2429) data time 0.0011 (0.0026) model time 0.2405 (0.2386) loss 3.4448 (3.2426) grad_norm 2.0981 (2.4284) loss_scale 4096.0000 (2969.8086) mem 7382MB [2024-08-27 09:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][500/1251] eta 0:03:02 lr 0.000575 wd 0.0500 time 0.2330 (0.2428) data time 0.0011 (0.0025) model time 0.2319 (0.2386) loss 3.2905 (3.2452) grad_norm 3.9771 (2.4418) loss_scale 4096.0000 (2992.2874) mem 7382MB [2024-08-27 09:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [147/300][510/1251] eta 0:02:59 lr 0.000575 wd 0.0500 time 0.2384 (0.2427) data time 0.0009 (0.0025) model time 0.2375 (0.2386) loss 3.6142 (3.2482) grad_norm 2.5737 (2.4401) loss_scale 4096.0000 (3013.8865) mem 7382MB [2024-08-27 09:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][520/1251] eta 0:02:57 lr 0.000575 wd 0.0500 time 0.2392 (0.2427) data time 0.0010 (0.0025) model time 0.2381 (0.2386) loss 2.9066 (3.2470) grad_norm 2.0310 (2.4409) loss_scale 4096.0000 (3034.6564) mem 7382MB [2024-08-27 09:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][530/1251] eta 0:02:54 lr 0.000575 wd 0.0500 time 0.2388 (0.2426) data time 0.0010 (0.0025) model time 0.2378 (0.2386) loss 3.1965 (3.2527) grad_norm 2.9003 (2.4384) loss_scale 4096.0000 (3054.6441) mem 7382MB [2024-08-27 09:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][540/1251] eta 0:02:52 lr 0.000575 wd 0.0500 time 0.2340 (0.2429) data time 0.0011 (0.0024) model time 0.2328 (0.2390) loss 3.1733 (3.2500) grad_norm 2.6548 (2.4395) loss_scale 4096.0000 (3073.8928) mem 7382MB [2024-08-27 09:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][550/1251] eta 0:02:50 lr 0.000575 wd 0.0500 time 0.2412 (0.2429) data time 0.0011 (0.0024) model time 0.2401 (0.2390) loss 3.7047 (3.2508) grad_norm 2.0252 (2.4564) loss_scale 4096.0000 (3092.4428) mem 7382MB [2024-08-27 09:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][560/1251] eta 0:02:47 lr 0.000574 wd 0.0500 time 0.2358 (0.2428) data time 0.0008 (0.0024) model time 0.2350 (0.2390) loss 3.4415 (3.2492) grad_norm 4.4418 (2.4592) loss_scale 4096.0000 (3110.3316) mem 7382MB [2024-08-27 09:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][570/1251] eta 0:02:45 lr 0.000574 wd 0.0500 time 0.2368 (0.2428) data time 0.0010 (0.0024) model time 0.2358 (0.2390) loss 3.4879 (3.2465) grad_norm 2.0657 (2.4573) loss_scale 
4096.0000 (3127.5937) mem 7382MB [2024-08-27 09:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][580/1251] eta 0:02:42 lr 0.000574 wd 0.0500 time 0.2401 (0.2428) data time 0.0010 (0.0024) model time 0.2391 (0.2391) loss 3.5165 (3.2457) grad_norm 2.8844 (2.4609) loss_scale 4096.0000 (3144.2616) mem 7382MB [2024-08-27 09:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][590/1251] eta 0:02:40 lr 0.000574 wd 0.0500 time 0.2423 (0.2428) data time 0.0008 (0.0023) model time 0.2415 (0.2391) loss 2.0879 (3.2399) grad_norm 2.0408 (2.4564) loss_scale 4096.0000 (3160.3655) mem 7382MB [2024-08-27 09:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][600/1251] eta 0:02:38 lr 0.000574 wd 0.0500 time 0.2409 (0.2427) data time 0.0007 (0.0023) model time 0.2402 (0.2391) loss 3.8700 (3.2398) grad_norm 3.8043 (2.4564) loss_scale 4096.0000 (3175.9334) mem 7382MB [2024-08-27 09:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][610/1251] eta 0:02:35 lr 0.000574 wd 0.0500 time 0.2325 (0.2427) data time 0.0012 (0.0023) model time 0.2313 (0.2391) loss 3.5975 (3.2401) grad_norm 3.1379 (2.4656) loss_scale 4096.0000 (3190.9918) mem 7382MB [2024-08-27 09:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][620/1251] eta 0:02:33 lr 0.000574 wd 0.0500 time 0.2365 (0.2426) data time 0.0007 (0.0023) model time 0.2358 (0.2390) loss 3.2144 (3.2394) grad_norm 1.9638 (2.4612) loss_scale 4096.0000 (3205.5652) mem 7382MB [2024-08-27 09:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][630/1251] eta 0:02:30 lr 0.000574 wd 0.0500 time 0.2396 (0.2426) data time 0.0010 (0.0023) model time 0.2386 (0.2390) loss 3.0843 (3.2403) grad_norm 2.1901 (2.4633) loss_scale 4096.0000 (3219.6767) mem 7382MB [2024-08-27 09:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][640/1251] eta 0:02:28 lr 0.000574 wd 0.0500 time 0.2403 (0.2425) data time 
0.0007 (0.0022) model time 0.2396 (0.2390) loss 3.9111 (3.2423) grad_norm 2.9446 (2.4621) loss_scale 4096.0000 (3233.3479) mem 7382MB [2024-08-27 09:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][650/1251] eta 0:02:25 lr 0.000574 wd 0.0500 time 0.2383 (0.2424) data time 0.0008 (0.0022) model time 0.2375 (0.2389) loss 2.7870 (3.2445) grad_norm 3.3254 (2.4618) loss_scale 4096.0000 (3246.5991) mem 7382MB [2024-08-27 09:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][660/1251] eta 0:02:23 lr 0.000574 wd 0.0500 time 0.2416 (0.2423) data time 0.0008 (0.0022) model time 0.2408 (0.2389) loss 3.5442 (3.2475) grad_norm 1.7028 (inf) loss_scale 2048.0000 (3240.8593) mem 7382MB [2024-08-27 09:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][670/1251] eta 0:02:20 lr 0.000574 wd 0.0500 time 0.2336 (0.2422) data time 0.0012 (0.0022) model time 0.2325 (0.2388) loss 3.0500 (3.2489) grad_norm 2.7564 (inf) loss_scale 2048.0000 (3223.0820) mem 7382MB [2024-08-27 09:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][680/1251] eta 0:02:18 lr 0.000574 wd 0.0500 time 0.2477 (0.2421) data time 0.0011 (0.0022) model time 0.2466 (0.2388) loss 3.6707 (3.2500) grad_norm 1.9241 (inf) loss_scale 2048.0000 (3205.8267) mem 7382MB [2024-08-27 09:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][690/1251] eta 0:02:15 lr 0.000574 wd 0.0500 time 0.2331 (0.2421) data time 0.0009 (0.0022) model time 0.2322 (0.2387) loss 2.5773 (3.2510) grad_norm 1.8847 (inf) loss_scale 2048.0000 (3189.0709) mem 7382MB [2024-08-27 09:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][700/1251] eta 0:02:13 lr 0.000574 wd 0.0500 time 0.2442 (0.2420) data time 0.0010 (0.0022) model time 0.2432 (0.2387) loss 2.5150 (3.2533) grad_norm 2.0832 (inf) loss_scale 2048.0000 (3172.7932) mem 7382MB [2024-08-27 09:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[147/300][710/1251] eta 0:02:10 lr 0.000574 wd 0.0500 time 0.2385 (0.2419) data time 0.0010 (0.0022) model time 0.2375 (0.2387) loss 3.6526 (3.2517) grad_norm 2.0947 (inf) loss_scale 2048.0000 (3156.9733) mem 7382MB [2024-08-27 09:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][720/1251] eta 0:02:08 lr 0.000574 wd 0.0500 time 0.2401 (0.2419) data time 0.0010 (0.0021) model time 0.2391 (0.2386) loss 3.4860 (3.2504) grad_norm 2.0060 (inf) loss_scale 2048.0000 (3141.5922) mem 7382MB [2024-08-27 09:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][730/1251] eta 0:02:06 lr 0.000574 wd 0.0500 time 0.2376 (0.2419) data time 0.0008 (0.0021) model time 0.2368 (0.2387) loss 3.6989 (3.2524) grad_norm 2.0368 (inf) loss_scale 2048.0000 (3126.6320) mem 7382MB [2024-08-27 09:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][740/1251] eta 0:02:03 lr 0.000574 wd 0.0500 time 0.2400 (0.2419) data time 0.0008 (0.0021) model time 0.2392 (0.2387) loss 3.3751 (3.2547) grad_norm 2.9416 (inf) loss_scale 2048.0000 (3112.0756) mem 7382MB [2024-08-27 09:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][750/1251] eta 0:02:01 lr 0.000574 wd 0.0500 time 0.2461 (0.2419) data time 0.0007 (0.0021) model time 0.2453 (0.2387) loss 3.6216 (3.2556) grad_norm 3.1355 (inf) loss_scale 2048.0000 (3097.9068) mem 7382MB [2024-08-27 09:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][760/1251] eta 0:01:58 lr 0.000574 wd 0.0500 time 0.2497 (0.2419) data time 0.0011 (0.0021) model time 0.2486 (0.2388) loss 3.5515 (3.2561) grad_norm 4.0182 (inf) loss_scale 2048.0000 (3084.1104) mem 7382MB [2024-08-27 09:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][770/1251] eta 0:01:56 lr 0.000574 wd 0.0500 time 0.2336 (0.2419) data time 0.0009 (0.0021) model time 0.2327 (0.2387) loss 3.1413 (3.2580) grad_norm 2.7050 (inf) loss_scale 2048.0000 (3070.6719) mem 7382MB 
[2024-08-27 09:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][780/1251] eta 0:01:53 lr 0.000574 wd 0.0500 time 0.2383 (0.2419) data time 0.0008 (0.0021) model time 0.2375 (0.2387) loss 3.6203 (3.2559) grad_norm 1.9892 (inf) loss_scale 2048.0000 (3057.5775) mem 7382MB [2024-08-27 09:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][790/1251] eta 0:01:51 lr 0.000573 wd 0.0500 time 0.2384 (0.2418) data time 0.0011 (0.0021) model time 0.2373 (0.2388) loss 3.7914 (3.2577) grad_norm 1.8429 (inf) loss_scale 2048.0000 (3044.8142) mem 7382MB [2024-08-27 09:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][800/1251] eta 0:01:49 lr 0.000573 wd 0.0500 time 0.2351 (0.2418) data time 0.0010 (0.0021) model time 0.2340 (0.2387) loss 3.6274 (3.2550) grad_norm 1.6438 (inf) loss_scale 2048.0000 (3032.3695) mem 7382MB [2024-08-27 09:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][810/1251] eta 0:01:46 lr 0.000573 wd 0.0500 time 0.2337 (0.2417) data time 0.0012 (0.0021) model time 0.2325 (0.2387) loss 3.1043 (3.2540) grad_norm 2.0803 (inf) loss_scale 2048.0000 (3020.2318) mem 7382MB [2024-08-27 09:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][820/1251] eta 0:01:44 lr 0.000573 wd 0.0500 time 0.2442 (0.2418) data time 0.0009 (0.0020) model time 0.2433 (0.2387) loss 4.1374 (3.2571) grad_norm 2.1997 (inf) loss_scale 2048.0000 (3008.3898) mem 7382MB [2024-08-27 09:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][830/1251] eta 0:01:41 lr 0.000573 wd 0.0500 time 0.2338 (0.2417) data time 0.0009 (0.0020) model time 0.2328 (0.2387) loss 2.9095 (3.2553) grad_norm 2.6025 (inf) loss_scale 2048.0000 (2996.8327) mem 7382MB [2024-08-27 09:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][840/1251] eta 0:01:39 lr 0.000573 wd 0.0500 time 0.2503 (0.2418) data time 0.0010 (0.0020) model time 0.2493 (0.2388) loss 
4.0270 (3.2582) grad_norm 2.0397 (inf) loss_scale 2048.0000 (2985.5505) mem 7382MB [2024-08-27 09:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][850/1251] eta 0:01:36 lr 0.000573 wd 0.0500 time 0.2459 (0.2418) data time 0.0010 (0.0020) model time 0.2449 (0.2388) loss 3.3610 (3.2572) grad_norm 1.7717 (inf) loss_scale 2048.0000 (2974.5335) mem 7382MB [2024-08-27 09:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][860/1251] eta 0:01:34 lr 0.000573 wd 0.0500 time 0.2384 (0.2418) data time 0.0008 (0.0020) model time 0.2376 (0.2388) loss 2.9536 (3.2551) grad_norm 2.3369 (inf) loss_scale 2048.0000 (2963.7724) mem 7382MB [2024-08-27 09:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][870/1251] eta 0:01:32 lr 0.000573 wd 0.0500 time 0.2454 (0.2417) data time 0.0011 (0.0020) model time 0.2444 (0.2388) loss 3.6169 (3.2579) grad_norm 2.0174 (inf) loss_scale 2048.0000 (2953.2583) mem 7382MB [2024-08-27 09:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][880/1251] eta 0:01:29 lr 0.000573 wd 0.0500 time 0.2410 (0.2417) data time 0.0007 (0.0020) model time 0.2403 (0.2388) loss 2.7795 (3.2601) grad_norm 1.8767 (inf) loss_scale 2048.0000 (2942.9830) mem 7382MB [2024-08-27 09:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][890/1251] eta 0:01:27 lr 0.000573 wd 0.0500 time 0.2393 (0.2417) data time 0.0008 (0.0020) model time 0.2385 (0.2388) loss 3.4479 (3.2580) grad_norm 2.2530 (inf) loss_scale 2048.0000 (2932.9383) mem 7382MB [2024-08-27 09:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][900/1251] eta 0:01:24 lr 0.000573 wd 0.0500 time 0.2401 (0.2417) data time 0.0011 (0.0020) model time 0.2390 (0.2388) loss 3.4580 (3.2601) grad_norm 2.5188 (inf) loss_scale 2048.0000 (2923.1165) mem 7382MB [2024-08-27 09:05:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][910/1251] eta 0:01:22 lr 0.000573 wd 0.0500 
time 0.2300 (0.2417) data time 0.0010 (0.0020) model time 0.2290 (0.2388) loss 2.6142 (3.2545) grad_norm 2.4249 (inf) loss_scale 2048.0000 (2913.5104) mem 7382MB [2024-08-27 09:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][920/1251] eta 0:01:19 lr 0.000573 wd 0.0500 time 0.2386 (0.2416) data time 0.0010 (0.0020) model time 0.2376 (0.2388) loss 3.0448 (3.2523) grad_norm 1.6126 (inf) loss_scale 2048.0000 (2904.1129) mem 7382MB [2024-08-27 09:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][930/1251] eta 0:01:17 lr 0.000573 wd 0.0500 time 0.2358 (0.2417) data time 0.0010 (0.0020) model time 0.2348 (0.2388) loss 2.0095 (3.2550) grad_norm 2.3230 (inf) loss_scale 2048.0000 (2894.9173) mem 7382MB [2024-08-27 09:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][940/1251] eta 0:01:15 lr 0.000573 wd 0.0500 time 0.2370 (0.2417) data time 0.0009 (0.0020) model time 0.2360 (0.2388) loss 3.2307 (3.2566) grad_norm 2.8316 (inf) loss_scale 2048.0000 (2885.9171) mem 7382MB [2024-08-27 09:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][950/1251] eta 0:01:12 lr 0.000573 wd 0.0500 time 0.2290 (0.2416) data time 0.0009 (0.0020) model time 0.2281 (0.2388) loss 4.2032 (3.2563) grad_norm 2.5060 (inf) loss_scale 2048.0000 (2877.1062) mem 7382MB [2024-08-27 09:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][960/1251] eta 0:01:10 lr 0.000573 wd 0.0500 time 0.4666 (0.2421) data time 0.0009 (0.0020) model time 0.4657 (0.2393) loss 2.8262 (3.2528) grad_norm 2.0170 (inf) loss_scale 2048.0000 (2868.4787) mem 7382MB [2024-08-27 09:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][970/1251] eta 0:01:08 lr 0.000573 wd 0.0500 time 0.2688 (0.2423) data time 0.0009 (0.0020) model time 0.2679 (0.2396) loss 2.7295 (3.2525) grad_norm 2.7340 (inf) loss_scale 2048.0000 (2860.0288) mem 7382MB [2024-08-27 09:05:38 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [147/300][980/1251] eta 0:01:05 lr 0.000573 wd 0.0500 time 0.2462 (0.2423) data time 0.0007 (0.0020) model time 0.2454 (0.2396) loss 3.5379 (3.2517) grad_norm 2.8739 (inf) loss_scale 2048.0000 (2851.7513) mem 7382MB [2024-08-27 09:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][990/1251] eta 0:01:03 lr 0.000573 wd 0.0500 time 0.2514 (0.2423) data time 0.0010 (0.0019) model time 0.2504 (0.2396) loss 3.1883 (3.2530) grad_norm 2.2445 (inf) loss_scale 2048.0000 (2843.6408) mem 7382MB [2024-08-27 09:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1000/1251] eta 0:01:00 lr 0.000573 wd 0.0500 time 0.2392 (0.2422) data time 0.0010 (0.0019) model time 0.2382 (0.2395) loss 3.1871 (3.2538) grad_norm 2.3822 (inf) loss_scale 2048.0000 (2835.6923) mem 7382MB [2024-08-27 09:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1010/1251] eta 0:00:58 lr 0.000573 wd 0.0500 time 0.2423 (0.2422) data time 0.0010 (0.0019) model time 0.2413 (0.2395) loss 3.2327 (3.2522) grad_norm 2.4891 (inf) loss_scale 2048.0000 (2827.9011) mem 7382MB [2024-08-27 09:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1020/1251] eta 0:00:55 lr 0.000572 wd 0.0500 time 0.2401 (0.2422) data time 0.0010 (0.0019) model time 0.2391 (0.2395) loss 3.2358 (3.2539) grad_norm 3.6937 (inf) loss_scale 2048.0000 (2820.2625) mem 7382MB [2024-08-27 09:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1030/1251] eta 0:00:53 lr 0.000572 wd 0.0500 time 0.2444 (0.2421) data time 0.0010 (0.0019) model time 0.2433 (0.2395) loss 2.5006 (3.2503) grad_norm 2.2384 (inf) loss_scale 2048.0000 (2812.7721) mem 7382MB [2024-08-27 09:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1040/1251] eta 0:00:51 lr 0.000572 wd 0.0500 time 0.2415 (0.2422) data time 0.0011 (0.0019) model time 0.2404 (0.2395) loss 2.7480 (3.2502) grad_norm 1.6528 (inf) 
loss_scale 2048.0000 (2805.4256) mem 7382MB [2024-08-27 09:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1050/1251] eta 0:00:48 lr 0.000572 wd 0.0500 time 0.2371 (0.2421) data time 0.0009 (0.0019) model time 0.2362 (0.2395) loss 2.7449 (3.2495) grad_norm 2.6103 (inf) loss_scale 2048.0000 (2798.2188) mem 7382MB [2024-08-27 09:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1060/1251] eta 0:00:46 lr 0.000572 wd 0.0500 time 0.2377 (0.2421) data time 0.0010 (0.0019) model time 0.2367 (0.2394) loss 3.0475 (3.2482) grad_norm 3.7604 (inf) loss_scale 2048.0000 (2791.1480) mem 7382MB [2024-08-27 09:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1070/1251] eta 0:00:43 lr 0.000572 wd 0.0500 time 0.2327 (0.2420) data time 0.0011 (0.0019) model time 0.2316 (0.2394) loss 3.5281 (3.2491) grad_norm 5.0285 (inf) loss_scale 2048.0000 (2784.2092) mem 7382MB [2024-08-27 09:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1080/1251] eta 0:00:41 lr 0.000572 wd 0.0500 time 0.2435 (0.2420) data time 0.0011 (0.0019) model time 0.2424 (0.2394) loss 3.2988 (3.2474) grad_norm 2.1019 (inf) loss_scale 2048.0000 (2777.3987) mem 7382MB [2024-08-27 09:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1090/1251] eta 0:00:38 lr 0.000572 wd 0.0500 time 0.2457 (0.2420) data time 0.0008 (0.0019) model time 0.2449 (0.2394) loss 3.8938 (3.2459) grad_norm 2.9919 (inf) loss_scale 2048.0000 (2770.7131) mem 7382MB [2024-08-27 09:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1100/1251] eta 0:00:36 lr 0.000572 wd 0.0500 time 0.2552 (0.2420) data time 0.0010 (0.0019) model time 0.2542 (0.2394) loss 2.9038 (3.2437) grad_norm 1.6793 (inf) loss_scale 2048.0000 (2764.1490) mem 7382MB [2024-08-27 09:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1110/1251] eta 0:00:34 lr 0.000572 wd 0.0500 time 0.2470 (0.2419) data time 
0.0010 (0.0019) model time 0.2460 (0.2394) loss 3.4746 (3.2445) grad_norm 2.4249 (inf) loss_scale 2048.0000 (2757.7030) mem 7382MB [2024-08-27 09:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1120/1251] eta 0:00:31 lr 0.000572 wd 0.0500 time 0.2375 (0.2419) data time 0.0014 (0.0019) model time 0.2361 (0.2394) loss 2.9671 (3.2446) grad_norm 2.6212 (inf) loss_scale 2048.0000 (2751.3720) mem 7382MB [2024-08-27 09:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1130/1251] eta 0:00:29 lr 0.000572 wd 0.0500 time 0.2315 (0.2419) data time 0.0009 (0.0019) model time 0.2307 (0.2393) loss 2.7498 (3.2477) grad_norm 2.8273 (inf) loss_scale 2048.0000 (2745.1530) mem 7382MB [2024-08-27 09:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1140/1251] eta 0:00:26 lr 0.000572 wd 0.0500 time 0.2388 (0.2418) data time 0.0011 (0.0019) model time 0.2377 (0.2393) loss 2.6521 (3.2459) grad_norm 2.0310 (inf) loss_scale 2048.0000 (2739.0429) mem 7382MB [2024-08-27 09:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1150/1251] eta 0:00:24 lr 0.000572 wd 0.0500 time 0.2505 (0.2418) data time 0.0009 (0.0019) model time 0.2496 (0.2393) loss 2.2049 (3.2456) grad_norm 3.6577 (inf) loss_scale 2048.0000 (2733.0391) mem 7382MB [2024-08-27 09:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1160/1251] eta 0:00:22 lr 0.000572 wd 0.0500 time 0.2309 (0.2418) data time 0.0010 (0.0019) model time 0.2299 (0.2393) loss 2.3354 (3.2460) grad_norm 2.4059 (inf) loss_scale 2048.0000 (2727.1387) mem 7382MB [2024-08-27 09:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1170/1251] eta 0:00:19 lr 0.000572 wd 0.0500 time 0.2301 (0.2417) data time 0.0007 (0.0018) model time 0.2294 (0.2392) loss 3.9267 (3.2467) grad_norm 2.4442 (inf) loss_scale 2048.0000 (2721.3390) mem 7382MB [2024-08-27 09:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[147/300][1180/1251] eta 0:00:17 lr 0.000572 wd 0.0500 time 0.2428 (0.2417) data time 0.0008 (0.0018) model time 0.2420 (0.2392) loss 3.0480 (3.2469) grad_norm 3.4690 (inf) loss_scale 2048.0000 (2715.6376) mem 7382MB [2024-08-27 09:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1190/1251] eta 0:00:14 lr 0.000572 wd 0.0500 time 0.2455 (0.2417) data time 0.0011 (0.0018) model time 0.2445 (0.2392) loss 3.2163 (3.2479) grad_norm 2.3380 (inf) loss_scale 2048.0000 (2710.0319) mem 7382MB [2024-08-27 09:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1200/1251] eta 0:00:12 lr 0.000572 wd 0.0500 time 0.2442 (0.2417) data time 0.0007 (0.0018) model time 0.2434 (0.2392) loss 4.0403 (3.2465) grad_norm 2.6210 (inf) loss_scale 2048.0000 (2704.5196) mem 7382MB [2024-08-27 09:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1210/1251] eta 0:00:09 lr 0.000572 wd 0.0500 time 0.2455 (0.2417) data time 0.0008 (0.0018) model time 0.2447 (0.2392) loss 3.4802 (3.2448) grad_norm 2.4899 (inf) loss_scale 2048.0000 (2699.0983) mem 7382MB [2024-08-27 09:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1220/1251] eta 0:00:07 lr 0.000572 wd 0.0500 time 0.2492 (0.2416) data time 0.0008 (0.0018) model time 0.2484 (0.2392) loss 3.0075 (3.2439) grad_norm 1.6242 (inf) loss_scale 2048.0000 (2693.7658) mem 7382MB [2024-08-27 09:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1230/1251] eta 0:00:05 lr 0.000572 wd 0.0500 time 0.2363 (0.2416) data time 0.0009 (0.0018) model time 0.2354 (0.2392) loss 2.4583 (3.2450) grad_norm 2.9259 (inf) loss_scale 2048.0000 (2688.5199) mem 7382MB [2024-08-27 09:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1240/1251] eta 0:00:02 lr 0.000571 wd 0.0500 time 0.2248 (0.2415) data time 0.0007 (0.0018) model time 0.2241 (0.2391) loss 2.6813 (3.2423) grad_norm 2.1020 (inf) loss_scale 2048.0000 (2683.3586) mem 
7382MB [2024-08-27 09:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [147/300][1250/1251] eta 0:00:00 lr 0.000571 wd 0.0500 time 0.2366 (0.2414) data time 0.0005 (0.0018) model time 0.2362 (0.2390) loss 3.1881 (3.2416) grad_norm 1.8188 (inf) loss_scale 2048.0000 (2678.2798) mem 7382MB [2024-08-27 09:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 147 training takes 0:05:02 [2024-08-27 09:06:42 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:06:43 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.429 (0.429) Loss 0.4426 (0.4426) Acc@1 91.113 (91.113) Acc@5 98.340 (98.340) Mem 7382MB [2024-08-27 09:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.110) Loss 0.7192 (0.7138) Acc@1 84.570 (83.931) Acc@5 96.191 (96.715) Mem 7382MB [2024-08-27 09:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.094) Loss 1.0322 (0.7343) Acc@1 73.145 (83.082) Acc@5 94.141 (96.810) Mem 7382MB [2024-08-27 09:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.089) Loss 1.2217 (0.8361) Acc@1 70.898 (80.885) Acc@5 91.699 (95.602) Mem 7382MB [2024-08-27 09:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.083) Loss 1.1758 (0.8952) Acc@1 71.680 (79.366) Acc@5 91.992 (94.950) Mem 7382MB [2024-08-27 09:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.910 Acc@5 94.870 [2024-08-27 09:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.9% [2024-08-27 09:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.837 (0.837) Loss 0.4036 (0.4036) Acc@1 92.969 
(92.969) Acc@5 98.633 (98.633) Mem 7382MB [2024-08-27 09:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.146) Loss 0.6499 (0.6414) Acc@1 86.914 (86.284) Acc@5 96.973 (97.328) Mem 7382MB [2024-08-27 09:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.090 (0.114) Loss 0.9058 (0.6654) Acc@1 79.004 (85.282) Acc@5 95.312 (97.354) Mem 7382MB [2024-08-27 09:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.102) Loss 1.1553 (0.7557) Acc@1 70.996 (83.008) Acc@5 92.676 (96.380) Mem 7382MB [2024-08-27 09:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0371 (0.8028) Acc@1 74.707 (81.638) Acc@5 93.555 (95.903) Mem 7382MB [2024-08-27 09:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.234 Acc@5 95.846 [2024-08-27 09:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-27 09:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][0/1251] eta 0:23:25 lr 0.000571 wd 0.0500 time 1.1233 (1.1233) data time 0.7487 (0.7487) model time 0.0000 (0.0000) loss 2.8793 (2.8793) grad_norm 2.2378 (2.2378) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:06:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][10/1251] eta 0:06:38 lr 0.000571 wd 0.0500 time 0.2572 (0.3215) data time 0.0011 (0.0692) model time 0.0000 (0.0000) loss 3.1842 (3.1977) grad_norm 2.7936 (2.2660) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][20/1251] eta 0:05:48 lr 0.000571 wd 0.0500 time 0.2463 (0.2833) data time 0.0008 (0.0369) model time 0.0000 (0.0000) loss 3.3778 (3.3304) grad_norm 2.5479 (2.2971) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[148/300][30/1251] eta 0:05:29 lr 0.000571 wd 0.0500 time 0.2431 (0.2695) data time 0.0013 (0.0255) model time 0.0000 (0.0000) loss 3.0791 (3.3439) grad_norm 2.1755 (2.2940) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][40/1251] eta 0:05:17 lr 0.000571 wd 0.0500 time 0.2461 (0.2622) data time 0.0008 (0.0195) model time 0.0000 (0.0000) loss 2.6786 (3.3200) grad_norm 3.5659 (2.6312) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][50/1251] eta 0:05:10 lr 0.000571 wd 0.0500 time 0.2431 (0.2582) data time 0.0009 (0.0159) model time 0.0000 (0.0000) loss 3.5375 (3.3455) grad_norm 2.3803 (2.6402) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][60/1251] eta 0:05:04 lr 0.000571 wd 0.0500 time 0.2536 (0.2556) data time 0.0010 (0.0135) model time 0.2525 (0.2416) loss 2.6563 (3.2939) grad_norm 3.7853 (2.6363) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][70/1251] eta 0:04:59 lr 0.000571 wd 0.0500 time 0.2454 (0.2533) data time 0.0009 (0.0117) model time 0.2445 (0.2399) loss 4.1505 (3.3356) grad_norm 1.9223 (2.5506) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][80/1251] eta 0:04:55 lr 0.000571 wd 0.0500 time 0.2555 (0.2520) data time 0.0007 (0.0104) model time 0.2548 (0.2405) loss 3.7506 (3.3629) grad_norm 2.2931 (2.5149) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][90/1251] eta 0:04:50 lr 0.000571 wd 0.0500 time 0.2391 (0.2504) data time 0.0010 (0.0094) model time 0.2381 (0.2395) loss 3.4089 (3.3382) grad_norm 2.4936 (2.4944) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 09:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][100/1251] eta 0:04:47 lr 0.000571 wd 0.0500 time 0.2325 (0.2495) data time 0.0011 (0.0086) model time 0.2313 (0.2395) loss 3.0562 (3.3332) grad_norm 2.1429 (2.5040) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][110/1251] eta 0:04:43 lr 0.000571 wd 0.0500 time 0.2369 (0.2486) data time 0.0010 (0.0079) model time 0.2358 (0.2393) loss 2.9888 (3.3307) grad_norm 1.8889 (2.5099) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][120/1251] eta 0:04:40 lr 0.000571 wd 0.0500 time 0.2413 (0.2481) data time 0.0010 (0.0074) model time 0.2403 (0.2397) loss 3.3090 (3.3173) grad_norm 3.2286 (2.4966) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][130/1251] eta 0:04:37 lr 0.000571 wd 0.0500 time 0.2342 (0.2474) data time 0.0009 (0.0069) model time 0.2333 (0.2394) loss 3.2262 (3.3272) grad_norm 3.2941 (2.4937) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][140/1251] eta 0:04:34 lr 0.000571 wd 0.0500 time 0.2370 (0.2469) data time 0.0009 (0.0065) model time 0.2361 (0.2394) loss 4.1636 (3.3324) grad_norm 2.1473 (2.4771) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][150/1251] eta 0:04:32 lr 0.000571 wd 0.0500 time 0.4881 (0.2478) data time 0.0008 (0.0061) model time 0.4873 (0.2414) loss 3.6414 (3.3253) grad_norm 2.3408 (2.4629) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][160/1251] eta 0:04:29 lr 0.000571 wd 0.0500 time 0.2383 (0.2474) data time 0.0007 
(0.0059) model time 0.2375 (0.2411) loss 2.0931 (3.2980) grad_norm 1.8282 (2.4591) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][170/1251] eta 0:04:26 lr 0.000571 wd 0.0500 time 0.2294 (0.2467) data time 0.0010 (0.0056) model time 0.2284 (0.2407) loss 2.6914 (3.2735) grad_norm nan (nan) loss_scale 1024.0000 (2042.0117) mem 7382MB [2024-08-27 09:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][180/1251] eta 0:04:23 lr 0.000571 wd 0.0500 time 0.2419 (0.2464) data time 0.0010 (0.0053) model time 0.2409 (0.2406) loss 3.7974 (3.2743) grad_norm 2.2606 (nan) loss_scale 1024.0000 (1985.7680) mem 7382MB [2024-08-27 09:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][190/1251] eta 0:04:20 lr 0.000571 wd 0.0500 time 0.2336 (0.2460) data time 0.0008 (0.0051) model time 0.2328 (0.2403) loss 4.0597 (3.2870) grad_norm 2.3848 (nan) loss_scale 1024.0000 (1935.4136) mem 7382MB [2024-08-27 09:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][200/1251] eta 0:04:18 lr 0.000571 wd 0.0500 time 0.2456 (0.2456) data time 0.0009 (0.0049) model time 0.2446 (0.2402) loss 3.0501 (3.2849) grad_norm 2.4854 (nan) loss_scale 1024.0000 (1890.0697) mem 7382MB [2024-08-27 09:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][210/1251] eta 0:04:15 lr 0.000571 wd 0.0500 time 0.2429 (0.2453) data time 0.0007 (0.0047) model time 0.2422 (0.2401) loss 3.7177 (3.2746) grad_norm 1.7873 (nan) loss_scale 1024.0000 (1849.0237) mem 7382MB [2024-08-27 09:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][220/1251] eta 0:04:12 lr 0.000570 wd 0.0500 time 0.2341 (0.2450) data time 0.0008 (0.0046) model time 0.2333 (0.2398) loss 3.1999 (3.2710) grad_norm 2.7365 (nan) loss_scale 1024.0000 (1811.6923) mem 7382MB [2024-08-27 09:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[148/300][230/1251] eta 0:04:09 lr 0.000570 wd 0.0500 time 0.2486 (0.2448) data time 0.0007 (0.0045) model time 0.2479 (0.2398) loss 3.5522 (3.2748) grad_norm 2.0976 (nan) loss_scale 1024.0000 (1777.5931) mem 7382MB [2024-08-27 09:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][240/1251] eta 0:04:07 lr 0.000570 wd 0.0500 time 0.2382 (0.2444) data time 0.0011 (0.0043) model time 0.2371 (0.2395) loss 2.4912 (3.2649) grad_norm 2.4282 (nan) loss_scale 1024.0000 (1746.3237) mem 7382MB [2024-08-27 09:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][250/1251] eta 0:04:04 lr 0.000570 wd 0.0500 time 0.2434 (0.2442) data time 0.0010 (0.0042) model time 0.2423 (0.2395) loss 2.2836 (3.2656) grad_norm 1.8647 (nan) loss_scale 1024.0000 (1717.5458) mem 7382MB [2024-08-27 09:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][260/1251] eta 0:04:01 lr 0.000570 wd 0.0500 time 0.2455 (0.2441) data time 0.0010 (0.0041) model time 0.2445 (0.2395) loss 3.4724 (3.2683) grad_norm 2.4827 (nan) loss_scale 1024.0000 (1690.9732) mem 7382MB [2024-08-27 09:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][270/1251] eta 0:03:59 lr 0.000570 wd 0.0500 time 0.2362 (0.2439) data time 0.0010 (0.0040) model time 0.2352 (0.2394) loss 2.8182 (3.2652) grad_norm 3.0130 (nan) loss_scale 1024.0000 (1666.3616) mem 7382MB [2024-08-27 09:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][280/1251] eta 0:03:56 lr 0.000570 wd 0.0500 time 0.2363 (0.2439) data time 0.0012 (0.0039) model time 0.2351 (0.2395) loss 2.6453 (3.2590) grad_norm 3.2341 (nan) loss_scale 1024.0000 (1643.5018) mem 7382MB [2024-08-27 09:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][290/1251] eta 0:03:54 lr 0.000570 wd 0.0500 time 0.2403 (0.2438) data time 0.0011 (0.0038) model time 0.2392 (0.2395) loss 2.4480 (3.2598) grad_norm 2.0037 (nan) loss_scale 1024.0000 (1622.2131) mem 7382MB 
[2024-08-27 09:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][300/1251] eta 0:03:51 lr 0.000570 wd 0.0500 time 0.2378 (0.2436) data time 0.0011 (0.0037) model time 0.2366 (0.2394) loss 3.1824 (3.2473) grad_norm 1.8446 (nan) loss_scale 1024.0000 (1602.3389) mem 7382MB [2024-08-27 09:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][310/1251] eta 0:03:49 lr 0.000570 wd 0.0500 time 0.2458 (0.2435) data time 0.0012 (0.0036) model time 0.2447 (0.2394) loss 3.1455 (3.2546) grad_norm 1.9017 (nan) loss_scale 1024.0000 (1583.7428) mem 7382MB [2024-08-27 09:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][320/1251] eta 0:03:46 lr 0.000570 wd 0.0500 time 0.2457 (0.2435) data time 0.0007 (0.0035) model time 0.2450 (0.2395) loss 3.2185 (3.2506) grad_norm 2.1371 (nan) loss_scale 1024.0000 (1566.3053) mem 7382MB [2024-08-27 09:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][330/1251] eta 0:03:44 lr 0.000570 wd 0.0500 time 0.2449 (0.2434) data time 0.0013 (0.0035) model time 0.2436 (0.2395) loss 4.0872 (3.2588) grad_norm 2.0597 (nan) loss_scale 1024.0000 (1549.9215) mem 7382MB [2024-08-27 09:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][340/1251] eta 0:03:41 lr 0.000570 wd 0.0500 time 0.2304 (0.2433) data time 0.0009 (0.0034) model time 0.2295 (0.2395) loss 3.0604 (3.2608) grad_norm 1.7465 (nan) loss_scale 1024.0000 (1534.4985) mem 7382MB [2024-08-27 09:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][350/1251] eta 0:03:39 lr 0.000570 wd 0.0500 time 0.2385 (0.2432) data time 0.0008 (0.0034) model time 0.2377 (0.2394) loss 2.9434 (3.2616) grad_norm 1.9833 (nan) loss_scale 1024.0000 (1519.9544) mem 7382MB [2024-08-27 09:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][360/1251] eta 0:03:36 lr 0.000570 wd 0.0500 time 0.2387 (0.2431) data time 0.0009 (0.0033) model time 0.2379 (0.2393) loss 
3.1759 (3.2641) grad_norm 2.7219 (nan) loss_scale 1024.0000 (1506.2161) mem 7382MB [2024-08-27 09:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][370/1251] eta 0:03:34 lr 0.000570 wd 0.0500 time 0.2396 (0.2430) data time 0.0008 (0.0032) model time 0.2388 (0.2393) loss 3.4149 (3.2658) grad_norm 2.0020 (nan) loss_scale 1024.0000 (1493.2183) mem 7382MB [2024-08-27 09:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][380/1251] eta 0:03:31 lr 0.000570 wd 0.0500 time 0.2313 (0.2429) data time 0.0010 (0.0032) model time 0.2303 (0.2393) loss 3.2847 (3.2601) grad_norm 2.3813 (nan) loss_scale 1024.0000 (1480.9029) mem 7382MB [2024-08-27 09:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][390/1251] eta 0:03:29 lr 0.000570 wd 0.0500 time 0.2409 (0.2429) data time 0.0008 (0.0031) model time 0.2401 (0.2394) loss 2.4657 (3.2609) grad_norm 2.7892 (nan) loss_scale 1024.0000 (1469.2174) mem 7382MB [2024-08-27 09:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][400/1251] eta 0:03:26 lr 0.000570 wd 0.0500 time 0.2418 (0.2429) data time 0.0011 (0.0031) model time 0.2408 (0.2395) loss 2.4937 (3.2581) grad_norm 1.5835 (nan) loss_scale 1024.0000 (1458.1147) mem 7382MB [2024-08-27 09:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][410/1251] eta 0:03:24 lr 0.000570 wd 0.0500 time 0.2413 (0.2429) data time 0.0010 (0.0030) model time 0.2403 (0.2395) loss 2.6698 (3.2521) grad_norm 1.5577 (nan) loss_scale 1024.0000 (1447.5523) mem 7382MB [2024-08-27 09:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][420/1251] eta 0:03:21 lr 0.000570 wd 0.0500 time 0.2399 (0.2429) data time 0.0008 (0.0030) model time 0.2391 (0.2396) loss 1.9658 (3.2515) grad_norm 2.0682 (nan) loss_scale 1024.0000 (1437.4917) mem 7382MB [2024-08-27 09:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][430/1251] eta 0:03:20 lr 0.000570 wd 0.0500 
time 0.2369 (0.2439) data time 0.0011 (0.0030) model time 0.2358 (0.2407) loss 3.5769 (3.2472) grad_norm 2.2792 (nan) loss_scale 1024.0000 (1427.8979) mem 7382MB [2024-08-27 09:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][440/1251] eta 0:03:18 lr 0.000570 wd 0.0500 time 0.2386 (0.2443) data time 0.0011 (0.0029) model time 0.2375 (0.2413) loss 2.9488 (3.2457) grad_norm 1.9193 (nan) loss_scale 1024.0000 (1418.7392) mem 7382MB [2024-08-27 09:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][450/1251] eta 0:03:15 lr 0.000569 wd 0.0500 time 0.2422 (0.2442) data time 0.0010 (0.0029) model time 0.2412 (0.2412) loss 3.1825 (3.2463) grad_norm 2.1620 (nan) loss_scale 1024.0000 (1409.9867) mem 7382MB [2024-08-27 09:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][460/1251] eta 0:03:13 lr 0.000569 wd 0.0500 time 0.2438 (0.2441) data time 0.0010 (0.0029) model time 0.2428 (0.2411) loss 3.0421 (3.2456) grad_norm 2.3835 (nan) loss_scale 1024.0000 (1401.6139) mem 7382MB [2024-08-27 09:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][470/1251] eta 0:03:10 lr 0.000569 wd 0.0500 time 0.2413 (0.2441) data time 0.0010 (0.0028) model time 0.2403 (0.2411) loss 3.5484 (3.2437) grad_norm 2.7693 (nan) loss_scale 1024.0000 (1393.5966) mem 7382MB [2024-08-27 09:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][480/1251] eta 0:03:08 lr 0.000569 wd 0.0500 time 0.2359 (0.2440) data time 0.0010 (0.0028) model time 0.2349 (0.2411) loss 3.3807 (3.2420) grad_norm 2.5171 (nan) loss_scale 1024.0000 (1385.9127) mem 7382MB [2024-08-27 09:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][490/1251] eta 0:03:05 lr 0.000569 wd 0.0500 time 0.2435 (0.2440) data time 0.0011 (0.0028) model time 0.2424 (0.2411) loss 2.7332 (3.2404) grad_norm 2.0949 (nan) loss_scale 1024.0000 (1378.5418) mem 7382MB [2024-08-27 09:08:53 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [148/300][500/1251] eta 0:03:03 lr 0.000569 wd 0.0500 time 0.2389 (0.2439) data time 0.0010 (0.0028) model time 0.2379 (0.2411) loss 3.1684 (3.2384) grad_norm 2.7971 (nan) loss_scale 1024.0000 (1371.4651) mem 7382MB [2024-08-27 09:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][510/1251] eta 0:03:00 lr 0.000569 wd 0.0500 time 0.2448 (0.2440) data time 0.0007 (0.0028) model time 0.2441 (0.2411) loss 3.3494 (3.2305) grad_norm 2.3033 (nan) loss_scale 1024.0000 (1364.6654) mem 7382MB [2024-08-27 09:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][520/1251] eta 0:02:58 lr 0.000569 wd 0.0500 time 0.2363 (0.2440) data time 0.0010 (0.0027) model time 0.2353 (0.2411) loss 3.1018 (3.2344) grad_norm 2.3691 (nan) loss_scale 1024.0000 (1358.1267) mem 7382MB [2024-08-27 09:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][530/1251] eta 0:02:55 lr 0.000569 wd 0.0500 time 0.2401 (0.2439) data time 0.0012 (0.0027) model time 0.2389 (0.2411) loss 3.7370 (3.2360) grad_norm 2.4196 (nan) loss_scale 1024.0000 (1351.8343) mem 7382MB [2024-08-27 09:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][540/1251] eta 0:02:53 lr 0.000569 wd 0.0500 time 0.2391 (0.2439) data time 0.0008 (0.0027) model time 0.2383 (0.2411) loss 3.6416 (3.2384) grad_norm 1.8598 (nan) loss_scale 1024.0000 (1345.7745) mem 7382MB [2024-08-27 09:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][550/1251] eta 0:02:50 lr 0.000569 wd 0.0500 time 0.2392 (0.2438) data time 0.0009 (0.0027) model time 0.2384 (0.2410) loss 2.7884 (3.2381) grad_norm 1.5748 (nan) loss_scale 1024.0000 (1339.9347) mem 7382MB [2024-08-27 09:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][560/1251] eta 0:02:48 lr 0.000569 wd 0.0500 time 0.2390 (0.2438) data time 0.0010 (0.0026) model time 0.2380 (0.2410) loss 3.4527 (3.2375) grad_norm 1.7717 (nan) 
loss_scale 1024.0000 (1334.3030) mem 7382MB [2024-08-27 09:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][570/1251] eta 0:02:45 lr 0.000569 wd 0.0500 time 0.2357 (0.2437) data time 0.0008 (0.0026) model time 0.2349 (0.2410) loss 3.6440 (3.2386) grad_norm 4.8653 (nan) loss_scale 1024.0000 (1328.8687) mem 7382MB [2024-08-27 09:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][580/1251] eta 0:02:43 lr 0.000569 wd 0.0500 time 0.2388 (0.2437) data time 0.0007 (0.0026) model time 0.2381 (0.2410) loss 3.9224 (3.2408) grad_norm 2.2718 (nan) loss_scale 1024.0000 (1323.6213) mem 7382MB [2024-08-27 09:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][590/1251] eta 0:02:41 lr 0.000569 wd 0.0500 time 0.2403 (0.2436) data time 0.0010 (0.0026) model time 0.2393 (0.2409) loss 3.3798 (3.2400) grad_norm 2.2023 (nan) loss_scale 1024.0000 (1318.5516) mem 7382MB [2024-08-27 09:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][600/1251] eta 0:02:38 lr 0.000569 wd 0.0500 time 0.2361 (0.2436) data time 0.0011 (0.0026) model time 0.2350 (0.2409) loss 3.6113 (3.2411) grad_norm 2.9186 (nan) loss_scale 1024.0000 (1313.6506) mem 7382MB [2024-08-27 09:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][610/1251] eta 0:02:36 lr 0.000569 wd 0.0500 time 0.2386 (0.2436) data time 0.0010 (0.0025) model time 0.2375 (0.2409) loss 2.8924 (3.2376) grad_norm 2.4759 (nan) loss_scale 1024.0000 (1308.9100) mem 7382MB [2024-08-27 09:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][620/1251] eta 0:02:33 lr 0.000569 wd 0.0500 time 0.2448 (0.2435) data time 0.0007 (0.0025) model time 0.2441 (0.2409) loss 3.7257 (3.2343) grad_norm 1.9920 (nan) loss_scale 1024.0000 (1304.3221) mem 7382MB [2024-08-27 09:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][630/1251] eta 0:02:31 lr 0.000569 wd 0.0500 time 0.2342 (0.2435) data time 0.0007 
(0.0025) model time 0.2335 (0.2409) loss 2.8910 (3.2347) grad_norm 1.8187 (nan) loss_scale 1024.0000 (1299.8796) mem 7382MB [2024-08-27 09:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][640/1251] eta 0:02:28 lr 0.000569 wd 0.0500 time 0.2375 (0.2434) data time 0.0010 (0.0025) model time 0.2365 (0.2408) loss 2.9425 (3.2349) grad_norm 3.0579 (nan) loss_scale 1024.0000 (1295.5757) mem 7382MB [2024-08-27 09:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][650/1251] eta 0:02:26 lr 0.000569 wd 0.0500 time 0.2422 (0.2434) data time 0.0008 (0.0025) model time 0.2414 (0.2408) loss 2.7201 (3.2337) grad_norm 2.0675 (nan) loss_scale 1024.0000 (1291.4040) mem 7382MB [2024-08-27 09:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][660/1251] eta 0:02:23 lr 0.000569 wd 0.0500 time 0.2336 (0.2434) data time 0.0008 (0.0025) model time 0.2328 (0.2408) loss 3.8903 (3.2349) grad_norm 1.8076 (nan) loss_scale 1024.0000 (1287.3585) mem 7382MB [2024-08-27 09:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][670/1251] eta 0:02:21 lr 0.000568 wd 0.0500 time 0.2418 (0.2433) data time 0.0010 (0.0025) model time 0.2408 (0.2408) loss 3.8026 (3.2368) grad_norm 2.2591 (nan) loss_scale 1024.0000 (1283.4337) mem 7382MB [2024-08-27 09:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][680/1251] eta 0:02:18 lr 0.000568 wd 0.0500 time 0.2404 (0.2433) data time 0.0008 (0.0024) model time 0.2396 (0.2408) loss 3.5255 (3.2378) grad_norm 2.7815 (nan) loss_scale 1024.0000 (1279.6241) mem 7382MB [2024-08-27 09:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][690/1251] eta 0:02:16 lr 0.000568 wd 0.0500 time 0.2436 (0.2436) data time 0.0009 (0.0024) model time 0.2427 (0.2411) loss 3.2687 (3.2414) grad_norm 2.0282 (nan) loss_scale 1024.0000 (1275.9247) mem 7382MB [2024-08-27 09:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[148/300][700/1251] eta 0:02:14 lr 0.000568 wd 0.0500 time 0.2553 (0.2435) data time 0.0008 (0.0024) model time 0.2545 (0.2411) loss 3.0373 (3.2411) grad_norm 2.1515 (nan) loss_scale 1024.0000 (1272.3310) mem 7382MB [2024-08-27 09:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][710/1251] eta 0:02:11 lr 0.000568 wd 0.0500 time 0.2358 (0.2435) data time 0.0008 (0.0024) model time 0.2351 (0.2410) loss 3.7239 (3.2415) grad_norm 2.6475 (nan) loss_scale 1024.0000 (1268.8383) mem 7382MB [2024-08-27 09:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][720/1251] eta 0:02:09 lr 0.000568 wd 0.0500 time 0.2382 (0.2434) data time 0.0010 (0.0024) model time 0.2372 (0.2409) loss 3.6433 (3.2456) grad_norm 2.3055 (nan) loss_scale 1024.0000 (1265.4424) mem 7382MB [2024-08-27 09:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][730/1251] eta 0:02:06 lr 0.000568 wd 0.0500 time 0.2383 (0.2433) data time 0.0009 (0.0023) model time 0.2374 (0.2409) loss 4.0205 (3.2464) grad_norm 2.8968 (nan) loss_scale 1024.0000 (1262.1395) mem 7382MB [2024-08-27 09:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][740/1251] eta 0:02:04 lr 0.000568 wd 0.0500 time 0.2423 (0.2432) data time 0.0012 (0.0023) model time 0.2411 (0.2408) loss 2.9963 (3.2493) grad_norm 3.2065 (nan) loss_scale 1024.0000 (1258.9258) mem 7382MB [2024-08-27 09:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][750/1251] eta 0:02:01 lr 0.000568 wd 0.0500 time 0.2369 (0.2431) data time 0.0009 (0.0023) model time 0.2361 (0.2407) loss 3.7674 (3.2501) grad_norm 2.9671 (nan) loss_scale 1024.0000 (1255.7976) mem 7382MB [2024-08-27 09:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][760/1251] eta 0:01:59 lr 0.000568 wd 0.0500 time 0.2433 (0.2431) data time 0.0009 (0.0023) model time 0.2424 (0.2407) loss 3.6614 (3.2472) grad_norm 2.2787 (nan) loss_scale 1024.0000 (1252.7516) mem 7382MB 
[2024-08-27 09:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][770/1251] eta 0:01:56 lr 0.000568 wd 0.0500 time 0.2428 (0.2430) data time 0.0008 (0.0023) model time 0.2420 (0.2406) loss 3.3589 (3.2487) grad_norm 3.6761 (nan) loss_scale 1024.0000 (1249.7847) mem 7382MB [2024-08-27 09:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][780/1251] eta 0:01:54 lr 0.000568 wd 0.0500 time 0.2385 (0.2430) data time 0.0007 (0.0023) model time 0.2378 (0.2406) loss 3.7925 (3.2497) grad_norm 2.4366 (nan) loss_scale 1024.0000 (1246.8937) mem 7382MB [2024-08-27 09:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][790/1251] eta 0:01:51 lr 0.000568 wd 0.0500 time 0.2508 (0.2429) data time 0.0008 (0.0022) model time 0.2500 (0.2406) loss 3.1314 (3.2479) grad_norm 2.3541 (nan) loss_scale 1024.0000 (1244.0759) mem 7382MB [2024-08-27 09:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][800/1251] eta 0:01:49 lr 0.000568 wd 0.0500 time 0.2489 (0.2429) data time 0.0007 (0.0022) model time 0.2481 (0.2406) loss 4.0883 (3.2472) grad_norm 1.9455 (nan) loss_scale 1024.0000 (1241.3283) mem 7382MB [2024-08-27 09:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][810/1251] eta 0:01:47 lr 0.000568 wd 0.0500 time 0.2331 (0.2428) data time 0.0009 (0.0022) model time 0.2322 (0.2405) loss 2.4027 (3.2469) grad_norm 1.7459 (nan) loss_scale 1024.0000 (1238.6486) mem 7382MB [2024-08-27 09:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][820/1251] eta 0:01:44 lr 0.000568 wd 0.0500 time 0.2435 (0.2428) data time 0.0010 (0.0022) model time 0.2425 (0.2405) loss 3.6132 (3.2440) grad_norm 2.6356 (nan) loss_scale 1024.0000 (1236.0341) mem 7382MB [2024-08-27 09:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][830/1251] eta 0:01:42 lr 0.000568 wd 0.0500 time 0.2373 (0.2427) data time 0.0010 (0.0022) model time 0.2363 (0.2404) loss 
3.9562 (3.2459) grad_norm 2.8594 (nan) loss_scale 1024.0000 (1233.4826) mem 7382MB [2024-08-27 09:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][840/1251] eta 0:01:39 lr 0.000568 wd 0.0500 time 0.2426 (0.2427) data time 0.0010 (0.0022) model time 0.2417 (0.2404) loss 3.5564 (3.2463) grad_norm 2.3016 (nan) loss_scale 1024.0000 (1230.9917) mem 7382MB [2024-08-27 09:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][850/1251] eta 0:01:37 lr 0.000568 wd 0.0500 time 0.2404 (0.2427) data time 0.0011 (0.0022) model time 0.2393 (0.2404) loss 2.9105 (3.2473) grad_norm 1.8692 (nan) loss_scale 1024.0000 (1228.5593) mem 7382MB [2024-08-27 09:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][860/1251] eta 0:01:34 lr 0.000568 wd 0.0500 time 0.2405 (0.2426) data time 0.0009 (0.0022) model time 0.2396 (0.2404) loss 2.9890 (3.2493) grad_norm 2.3667 (nan) loss_scale 1024.0000 (1226.1835) mem 7382MB [2024-08-27 09:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][870/1251] eta 0:01:32 lr 0.000568 wd 0.0500 time 0.2391 (0.2426) data time 0.0008 (0.0022) model time 0.2383 (0.2403) loss 2.0810 (3.2480) grad_norm 2.1737 (nan) loss_scale 1024.0000 (1223.8622) mem 7382MB [2024-08-27 09:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][880/1251] eta 0:01:29 lr 0.000568 wd 0.0500 time 0.2408 (0.2426) data time 0.0010 (0.0021) model time 0.2399 (0.2403) loss 3.6278 (3.2469) grad_norm 2.2591 (nan) loss_scale 1024.0000 (1221.5936) mem 7382MB [2024-08-27 09:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][890/1251] eta 0:01:27 lr 0.000568 wd 0.0500 time 0.2445 (0.2425) data time 0.0012 (0.0021) model time 0.2433 (0.2403) loss 3.5878 (3.2451) grad_norm 2.8271 (nan) loss_scale 1024.0000 (1219.3760) mem 7382MB [2024-08-27 09:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][900/1251] eta 0:01:25 lr 0.000567 wd 0.0500 
time 0.2377 (0.2425) data time 0.0008 (0.0021) model time 0.2370 (0.2402) loss 4.0983 (3.2438) grad_norm 2.2200 (nan) loss_scale 1024.0000 (1217.2075) mem 7382MB [2024-08-27 09:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][910/1251] eta 0:01:22 lr 0.000567 wd 0.0500 time 0.2446 (0.2424) data time 0.0010 (0.0021) model time 0.2436 (0.2402) loss 3.2070 (3.2500) grad_norm 2.1041 (nan) loss_scale 1024.0000 (1215.0867) mem 7382MB [2024-08-27 09:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][920/1251] eta 0:01:20 lr 0.000567 wd 0.0500 time 0.2508 (0.2424) data time 0.0009 (0.0021) model time 0.2499 (0.2402) loss 3.5671 (3.2525) grad_norm 2.3594 (nan) loss_scale 1024.0000 (1213.0119) mem 7382MB [2024-08-27 09:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][930/1251] eta 0:01:17 lr 0.000567 wd 0.0500 time 0.2529 (0.2424) data time 0.0010 (0.0021) model time 0.2519 (0.2402) loss 2.8139 (3.2486) grad_norm 3.0349 (nan) loss_scale 1024.0000 (1210.9817) mem 7382MB [2024-08-27 09:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][940/1251] eta 0:01:15 lr 0.000567 wd 0.0500 time 0.2392 (0.2424) data time 0.0011 (0.0021) model time 0.2381 (0.2402) loss 3.5639 (3.2499) grad_norm 5.3392 (nan) loss_scale 1024.0000 (1208.9947) mem 7382MB [2024-08-27 09:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][950/1251] eta 0:01:12 lr 0.000567 wd 0.0500 time 0.2604 (0.2424) data time 0.0012 (0.0021) model time 0.2591 (0.2402) loss 2.5050 (3.2488) grad_norm 1.6013 (nan) loss_scale 1024.0000 (1207.0494) mem 7382MB [2024-08-27 09:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][960/1251] eta 0:01:10 lr 0.000567 wd 0.0500 time 0.4486 (0.2426) data time 0.0010 (0.0021) model time 0.4476 (0.2404) loss 3.3976 (3.2509) grad_norm 2.1928 (nan) loss_scale 1024.0000 (1205.1446) mem 7382MB [2024-08-27 09:10:47 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [148/300][970/1251] eta 0:01:08 lr 0.000567 wd 0.0500 time 0.2372 (0.2430) data time 0.0007 (0.0021) model time 0.2365 (0.2409) loss 3.8627 (3.2502) grad_norm 2.6125 (nan) loss_scale 1024.0000 (1203.2791) mem 7382MB [2024-08-27 09:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][980/1251] eta 0:01:05 lr 0.000567 wd 0.0500 time 0.2577 (0.2430) data time 0.0008 (0.0021) model time 0.2570 (0.2409) loss 3.9854 (3.2479) grad_norm 2.7547 (nan) loss_scale 1024.0000 (1201.4516) mem 7382MB [2024-08-27 09:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][990/1251] eta 0:01:03 lr 0.000567 wd 0.0500 time 0.2429 (0.2430) data time 0.0007 (0.0021) model time 0.2422 (0.2409) loss 3.0625 (3.2482) grad_norm 1.7445 (nan) loss_scale 1024.0000 (1199.6609) mem 7382MB [2024-08-27 09:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1000/1251] eta 0:01:00 lr 0.000567 wd 0.0500 time 0.2492 (0.2430) data time 0.0010 (0.0021) model time 0.2482 (0.2409) loss 3.5308 (3.2498) grad_norm 1.8790 (nan) loss_scale 1024.0000 (1197.9061) mem 7382MB [2024-08-27 09:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1010/1251] eta 0:00:58 lr 0.000567 wd 0.0500 time 0.2381 (0.2429) data time 0.0008 (0.0021) model time 0.2373 (0.2408) loss 3.7933 (3.2506) grad_norm 2.5804 (nan) loss_scale 1024.0000 (1196.1860) mem 7382MB [2024-08-27 09:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1020/1251] eta 0:00:56 lr 0.000567 wd 0.0500 time 0.2388 (0.2429) data time 0.0010 (0.0020) model time 0.2377 (0.2408) loss 3.2569 (3.2497) grad_norm 7.4261 (nan) loss_scale 1024.0000 (1194.4995) mem 7382MB [2024-08-27 09:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1030/1251] eta 0:00:53 lr 0.000567 wd 0.0500 time 0.2395 (0.2429) data time 0.0009 (0.0020) model time 0.2385 (0.2408) loss 3.7153 (3.2483) grad_norm 2.2385 (nan) 
loss_scale 1024.0000 (1192.8458) mem 7382MB [2024-08-27 09:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1040/1251] eta 0:00:51 lr 0.000567 wd 0.0500 time 0.2364 (0.2429) data time 0.0009 (0.0020) model time 0.2355 (0.2408) loss 3.8243 (3.2464) grad_norm 2.1291 (nan) loss_scale 1024.0000 (1191.2238) mem 7382MB [2024-08-27 09:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1050/1251] eta 0:00:48 lr 0.000567 wd 0.0500 time 0.2445 (0.2429) data time 0.0008 (0.0020) model time 0.2436 (0.2408) loss 2.6128 (3.2460) grad_norm 2.2888 (nan) loss_scale 1024.0000 (1189.6327) mem 7382MB [2024-08-27 09:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1060/1251] eta 0:00:46 lr 0.000567 wd 0.0500 time 0.2406 (0.2428) data time 0.0011 (0.0020) model time 0.2394 (0.2407) loss 2.7226 (3.2434) grad_norm 2.2369 (nan) loss_scale 1024.0000 (1188.0716) mem 7382MB [2024-08-27 09:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1070/1251] eta 0:00:43 lr 0.000567 wd 0.0500 time 0.2319 (0.2428) data time 0.0009 (0.0020) model time 0.2310 (0.2407) loss 3.0597 (3.2450) grad_norm 3.2323 (nan) loss_scale 1024.0000 (1186.5397) mem 7382MB [2024-08-27 09:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1080/1251] eta 0:00:41 lr 0.000567 wd 0.0500 time 0.2422 (0.2428) data time 0.0008 (0.0020) model time 0.2414 (0.2407) loss 3.7140 (3.2457) grad_norm 3.5360 (nan) loss_scale 1024.0000 (1185.0361) mem 7382MB [2024-08-27 09:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1090/1251] eta 0:00:39 lr 0.000567 wd 0.0500 time 0.2330 (0.2428) data time 0.0008 (0.0020) model time 0.2322 (0.2407) loss 3.1251 (3.2455) grad_norm 2.2421 (nan) loss_scale 1024.0000 (1183.5600) mem 7382MB [2024-08-27 09:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1100/1251] eta 0:00:36 lr 0.000567 wd 0.0500 time 0.2426 (0.2428) data time 
0.0007 (0.0020) model time 0.2419 (0.2407) loss 3.5022 (3.2438) grad_norm 2.0226 (nan) loss_scale 1024.0000 (1182.1108) mem 7382MB [2024-08-27 09:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1110/1251] eta 0:00:34 lr 0.000567 wd 0.0500 time 0.2420 (0.2428) data time 0.0008 (0.0020) model time 0.2412 (0.2407) loss 3.6601 (3.2451) grad_norm 2.3084 (nan) loss_scale 1024.0000 (1180.6877) mem 7382MB [2024-08-27 09:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1120/1251] eta 0:00:31 lr 0.000567 wd 0.0500 time 0.2481 (0.2428) data time 0.0007 (0.0020) model time 0.2473 (0.2407) loss 3.4401 (3.2420) grad_norm 1.9423 (nan) loss_scale 1024.0000 (1179.2899) mem 7382MB [2024-08-27 09:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1130/1251] eta 0:00:29 lr 0.000566 wd 0.0500 time 0.2358 (0.2428) data time 0.0008 (0.0020) model time 0.2350 (0.2407) loss 3.4648 (3.2451) grad_norm 2.2286 (nan) loss_scale 1024.0000 (1177.9169) mem 7382MB [2024-08-27 09:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1140/1251] eta 0:00:26 lr 0.000566 wd 0.0500 time 0.2431 (0.2427) data time 0.0010 (0.0020) model time 0.2421 (0.2407) loss 3.1404 (3.2456) grad_norm 2.1898 (nan) loss_scale 1024.0000 (1176.5679) mem 7382MB [2024-08-27 09:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1150/1251] eta 0:00:24 lr 0.000566 wd 0.0500 time 0.2295 (0.2427) data time 0.0008 (0.0020) model time 0.2287 (0.2407) loss 2.1838 (3.2453) grad_norm 2.1466 (nan) loss_scale 1024.0000 (1175.2424) mem 7382MB [2024-08-27 09:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1160/1251] eta 0:00:22 lr 0.000566 wd 0.0500 time 0.2321 (0.2427) data time 0.0009 (0.0020) model time 0.2312 (0.2406) loss 4.1178 (3.2473) grad_norm 2.3932 (nan) loss_scale 1024.0000 (1173.9397) mem 7382MB [2024-08-27 09:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[148/300][1170/1251] eta 0:00:19 lr 0.000566 wd 0.0500 time 0.2369 (0.2427) data time 0.0010 (0.0020) model time 0.2359 (0.2406) loss 2.9752 (3.2476) grad_norm 2.6591 (nan) loss_scale 1024.0000 (1172.6593) mem 7382MB [2024-08-27 09:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1180/1251] eta 0:00:17 lr 0.000566 wd 0.0500 time 0.2372 (0.2426) data time 0.0010 (0.0020) model time 0.2362 (0.2406) loss 3.2424 (3.2470) grad_norm 1.9749 (nan) loss_scale 1024.0000 (1171.4005) mem 7382MB [2024-08-27 09:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1190/1251] eta 0:00:14 lr 0.000566 wd 0.0500 time 0.2473 (0.2426) data time 0.0011 (0.0019) model time 0.2462 (0.2406) loss 2.9335 (3.2473) grad_norm 1.9773 (nan) loss_scale 1024.0000 (1170.1629) mem 7382MB [2024-08-27 09:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1200/1251] eta 0:00:12 lr 0.000566 wd 0.0500 time 0.2406 (0.2426) data time 0.0011 (0.0019) model time 0.2395 (0.2406) loss 3.3663 (3.2470) grad_norm 2.5781 (nan) loss_scale 1024.0000 (1168.9459) mem 7382MB [2024-08-27 09:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1210/1251] eta 0:00:09 lr 0.000566 wd 0.0500 time 0.2377 (0.2426) data time 0.0007 (0.0019) model time 0.2370 (0.2407) loss 2.7096 (3.2470) grad_norm 1.6651 (nan) loss_scale 1024.0000 (1167.7490) mem 7382MB [2024-08-27 09:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1220/1251] eta 0:00:07 lr 0.000566 wd 0.0500 time 0.2322 (0.2426) data time 0.0007 (0.0019) model time 0.2315 (0.2406) loss 2.0268 (3.2440) grad_norm 2.6834 (nan) loss_scale 1024.0000 (1166.5717) mem 7382MB [2024-08-27 09:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1230/1251] eta 0:00:05 lr 0.000566 wd 0.0500 time 0.2441 (0.2426) data time 0.0010 (0.0019) model time 0.2431 (0.2406) loss 3.0272 (3.2437) grad_norm 2.0332 (nan) loss_scale 1024.0000 (1165.4135) mem 
7382MB [2024-08-27 09:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1240/1251] eta 0:00:02 lr 0.000566 wd 0.0500 time 0.2257 (0.2425) data time 0.0005 (0.0019) model time 0.2252 (0.2405) loss 3.5646 (3.2453) grad_norm 2.6073 (nan) loss_scale 1024.0000 (1164.2740) mem 7382MB [2024-08-27 09:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [148/300][1250/1251] eta 0:00:00 lr 0.000566 wd 0.0500 time 0.2309 (0.2424) data time 0.0005 (0.0019) model time 0.2303 (0.2404) loss 2.5398 (3.2432) grad_norm 3.0483 (nan) loss_scale 1024.0000 (1163.1527) mem 7382MB [2024-08-27 09:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 148 training takes 0:05:03 [2024-08-27 09:11:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:11:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 09:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.436 (0.436) Loss 0.4636 (0.4636) Acc@1 91.309 (91.309) Acc@5 98.242 (98.242) Mem 7382MB [2024-08-27 09:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.116) Loss 0.7090 (0.7333) Acc@1 84.961 (84.144) Acc@5 97.266 (96.902) Mem 7382MB [2024-08-27 09:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.099) Loss 1.0654 (0.7604) Acc@1 74.219 (83.203) Acc@5 93.555 (96.805) Mem 7382MB [2024-08-27 09:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.091) Loss 1.2461 (0.8655) Acc@1 70.996 (80.825) Acc@5 91.992 (95.643) Mem 7382MB [2024-08-27 09:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.2178 (0.9243) Acc@1 70.020 (79.335) Acc@5 92.285 (94.958) Mem 7382MB [2024-08-27 09:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.894 Acc@5 94.858 [2024-08-27 09:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.9% [2024-08-27 09:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.742 (0.742) Loss 0.4011 (0.4011) Acc@1 92.969 (92.969) Acc@5 98.730 (98.730) Mem 7382MB [2024-08-27 09:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.145) Loss 0.6470 (0.6402) Acc@1 86.719 (86.266) Acc@5 96.875 (97.283) Mem 7382MB [2024-08-27 09:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.112) Loss 0.9048 (0.6643) Acc@1 79.102 (85.328) Acc@5 95.508 (97.340) Mem 7382MB [2024-08-27 09:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.100) Loss 1.1562 (0.7547) Acc@1 71.094 (83.042) Acc@5 92.871 (96.390) Mem 7382MB [2024-08-27 09:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.092) Loss 1.0391 
(0.8018) Acc@1 74.609 (81.683) Acc@5 93.457 (95.889) Mem 7382MB [2024-08-27 09:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.276 Acc@5 95.846 [2024-08-27 09:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-27 09:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.28% [2024-08-27 09:12:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 09:12:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 09:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][0/1251] eta 0:14:59 lr 0.000566 wd 0.0500 time 0.7190 (0.7190) data time 0.4967 (0.4967) model time 0.0000 (0.0000) loss 2.7483 (2.7483) grad_norm 2.2697 (2.2697) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][10/1251] eta 0:05:50 lr 0.000566 wd 0.0500 time 0.2411 (0.2827) data time 0.0007 (0.0462) model time 0.0000 (0.0000) loss 2.2662 (3.0025) grad_norm 3.5148 (2.8307) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][20/1251] eta 0:05:22 lr 0.000566 wd 0.0500 time 0.2471 (0.2619) data time 0.0007 (0.0247) model time 0.0000 (0.0000) loss 3.3419 (3.1018) grad_norm 2.0781 (2.8139) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][30/1251] eta 0:05:10 lr 0.000566 wd 0.0500 time 0.2390 (0.2542) data time 0.0011 (0.0171) model time 0.0000 (0.0000) loss 2.6276 (3.1043) grad_norm 2.5929 (2.6467) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][40/1251] eta 
0:05:02 lr 0.000566 wd 0.0500 time 0.2370 (0.2499) data time 0.0009 (0.0132) model time 0.0000 (0.0000) loss 3.7304 (3.2249) grad_norm 2.1238 (2.5205) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][50/1251] eta 0:04:57 lr 0.000566 wd 0.0500 time 0.2378 (0.2481) data time 0.0008 (0.0108) model time 0.0000 (0.0000) loss 2.5734 (3.1982) grad_norm 2.0889 (2.5523) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][60/1251] eta 0:04:54 lr 0.000566 wd 0.0500 time 0.2358 (0.2471) data time 0.0010 (0.0092) model time 0.2348 (0.2410) loss 3.4012 (3.2033) grad_norm 1.9014 (2.4908) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][70/1251] eta 0:04:49 lr 0.000566 wd 0.0500 time 0.2395 (0.2455) data time 0.0009 (0.0080) model time 0.2386 (0.2380) loss 3.5021 (3.1813) grad_norm 2.7504 (2.4725) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][80/1251] eta 0:04:46 lr 0.000566 wd 0.0500 time 0.2388 (0.2446) data time 0.0010 (0.0072) model time 0.2378 (0.2376) loss 2.9609 (3.1865) grad_norm 5.4072 (2.5919) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][90/1251] eta 0:04:42 lr 0.000566 wd 0.0500 time 0.2385 (0.2437) data time 0.0007 (0.0065) model time 0.2378 (0.2370) loss 3.7360 (3.2146) grad_norm 1.6989 (2.6211) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][100/1251] eta 0:04:39 lr 0.000565 wd 0.0500 time 0.2504 (0.2432) data time 0.0010 (0.0059) model time 0.2494 (0.2372) loss 3.8328 (3.2208) grad_norm 2.8461 (2.5982) loss_scale 1024.0000 (1024.0000) mem 7382MB 
[2024-08-27 09:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][110/1251] eta 0:04:37 lr 0.000565 wd 0.0500 time 0.2370 (0.2429) data time 0.0012 (0.0055) model time 0.2358 (0.2374) loss 3.4649 (3.2448) grad_norm 2.1131 (2.5601) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][120/1251] eta 0:04:34 lr 0.000565 wd 0.0500 time 0.2383 (0.2427) data time 0.0010 (0.0051) model time 0.2373 (0.2377) loss 2.8142 (3.2421) grad_norm 2.6054 (2.5407) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][130/1251] eta 0:04:31 lr 0.000565 wd 0.0500 time 0.2393 (0.2424) data time 0.0007 (0.0048) model time 0.2386 (0.2377) loss 2.1092 (3.2109) grad_norm 2.6307 (2.5809) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][140/1251] eta 0:04:28 lr 0.000565 wd 0.0500 time 0.2346 (0.2420) data time 0.0007 (0.0046) model time 0.2339 (0.2375) loss 3.6717 (3.2249) grad_norm 2.9068 (2.5650) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][150/1251] eta 0:04:26 lr 0.000565 wd 0.0500 time 0.2322 (0.2420) data time 0.0010 (0.0043) model time 0.2312 (0.2379) loss 3.6219 (3.2198) grad_norm 2.5819 (2.5459) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][160/1251] eta 0:04:23 lr 0.000565 wd 0.0500 time 0.2418 (0.2419) data time 0.0007 (0.0041) model time 0.2410 (0.2381) loss 3.8021 (3.2199) grad_norm 2.2044 (2.5200) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][170/1251] eta 0:04:21 lr 0.000565 wd 0.0500 time 0.2370 (0.2418) data time 0.0007 (0.0039) model time 0.2362 
(0.2382) loss 3.9008 (3.2195) grad_norm 2.9050 (2.5306) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][180/1251] eta 0:04:18 lr 0.000565 wd 0.0500 time 0.2371 (0.2415) data time 0.0010 (0.0038) model time 0.2361 (0.2379) loss 2.7231 (3.2196) grad_norm 2.6414 (2.5163) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][190/1251] eta 0:04:16 lr 0.000565 wd 0.0500 time 0.2419 (0.2415) data time 0.0009 (0.0036) model time 0.2410 (0.2381) loss 3.6767 (3.2133) grad_norm 2.0770 (2.4998) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][200/1251] eta 0:04:13 lr 0.000565 wd 0.0500 time 0.2403 (0.2414) data time 0.0009 (0.0035) model time 0.2394 (0.2381) loss 3.0312 (3.2132) grad_norm 2.1264 (2.5103) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][210/1251] eta 0:04:11 lr 0.000565 wd 0.0500 time 0.2386 (0.2413) data time 0.0007 (0.0034) model time 0.2378 (0.2381) loss 2.8387 (3.2036) grad_norm 2.1512 (2.5127) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][220/1251] eta 0:04:08 lr 0.000565 wd 0.0500 time 0.2378 (0.2412) data time 0.0011 (0.0033) model time 0.2367 (0.2380) loss 2.3355 (3.2073) grad_norm 2.6028 (2.5085) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][230/1251] eta 0:04:06 lr 0.000565 wd 0.0500 time 0.2445 (0.2412) data time 0.0010 (0.0032) model time 0.2436 (0.2381) loss 3.5218 (3.2024) grad_norm 3.8846 (2.5050) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[149/300][240/1251] eta 0:04:05 lr 0.000565 wd 0.0500 time 0.4691 (0.2428) data time 0.0010 (0.0031) model time 0.4682 (0.2403) loss 3.5545 (3.2010) grad_norm 2.5416 (2.5031) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][250/1251] eta 0:04:03 lr 0.000565 wd 0.0500 time 0.2453 (0.2435) data time 0.0011 (0.0030) model time 0.2442 (0.2413) loss 3.4339 (3.2066) grad_norm 2.1320 (2.5111) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][260/1251] eta 0:04:01 lr 0.000565 wd 0.0500 time 0.2419 (0.2434) data time 0.0007 (0.0030) model time 0.2412 (0.2412) loss 2.9669 (3.2093) grad_norm 2.0861 (2.4986) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][270/1251] eta 0:03:58 lr 0.000565 wd 0.0500 time 0.2441 (0.2433) data time 0.0007 (0.0029) model time 0.2434 (0.2412) loss 3.1077 (3.2149) grad_norm 1.9289 (2.4791) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][280/1251] eta 0:03:56 lr 0.000565 wd 0.0500 time 0.2358 (0.2432) data time 0.0010 (0.0028) model time 0.2348 (0.2410) loss 3.2340 (3.2204) grad_norm 2.3918 (2.4914) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][290/1251] eta 0:03:53 lr 0.000565 wd 0.0500 time 0.2396 (0.2430) data time 0.0011 (0.0028) model time 0.2385 (0.2408) loss 3.2279 (3.2221) grad_norm 2.3608 (2.5018) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][300/1251] eta 0:03:50 lr 0.000565 wd 0.0500 time 0.2393 (0.2429) data time 0.0009 (0.0027) model time 0.2384 (0.2408) loss 2.5092 (3.2158) grad_norm 2.0185 (2.5135) loss_scale 1024.0000 
(1024.0000) mem 7382MB [2024-08-27 09:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][310/1251] eta 0:03:48 lr 0.000565 wd 0.0500 time 0.2334 (0.2427) data time 0.0010 (0.0027) model time 0.2324 (0.2406) loss 3.4330 (3.2149) grad_norm 1.9886 (2.5117) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][320/1251] eta 0:03:45 lr 0.000565 wd 0.0500 time 0.2397 (0.2426) data time 0.0009 (0.0026) model time 0.2388 (0.2405) loss 2.7576 (3.2142) grad_norm 2.4849 (2.5055) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][330/1251] eta 0:03:43 lr 0.000564 wd 0.0500 time 0.2412 (0.2425) data time 0.0008 (0.0026) model time 0.2405 (0.2404) loss 4.0169 (3.2058) grad_norm 2.5222 (2.5100) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][340/1251] eta 0:03:40 lr 0.000564 wd 0.0500 time 0.2364 (0.2425) data time 0.0010 (0.0025) model time 0.2354 (0.2405) loss 3.3550 (3.2063) grad_norm 2.3293 (2.5099) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][350/1251] eta 0:03:38 lr 0.000564 wd 0.0500 time 0.2398 (0.2425) data time 0.0010 (0.0025) model time 0.2389 (0.2405) loss 3.4099 (3.2066) grad_norm 4.2968 (2.5321) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][360/1251] eta 0:03:35 lr 0.000564 wd 0.0500 time 0.2403 (0.2424) data time 0.0011 (0.0024) model time 0.2393 (0.2403) loss 2.9953 (3.2056) grad_norm 7.0325 (2.5656) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][370/1251] eta 0:03:33 lr 0.000564 wd 0.0500 time 0.2504 (0.2422) data time 0.0008 
(0.0024) model time 0.2496 (0.2402) loss 3.5744 (3.2097) grad_norm 3.8481 (2.5620) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][380/1251] eta 0:03:30 lr 0.000564 wd 0.0500 time 0.2414 (0.2421) data time 0.0010 (0.0024) model time 0.2404 (0.2401) loss 3.3389 (3.2105) grad_norm 2.3048 (2.5536) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][390/1251] eta 0:03:28 lr 0.000564 wd 0.0500 time 0.2438 (0.2420) data time 0.0008 (0.0023) model time 0.2430 (0.2400) loss 3.5013 (3.2135) grad_norm 1.9057 (2.5404) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][400/1251] eta 0:03:25 lr 0.000564 wd 0.0500 time 0.2388 (0.2420) data time 0.0011 (0.0023) model time 0.2377 (0.2400) loss 3.4960 (3.2156) grad_norm 2.0996 (2.5346) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][410/1251] eta 0:03:23 lr 0.000564 wd 0.0500 time 0.2375 (0.2419) data time 0.0010 (0.0023) model time 0.2365 (0.2400) loss 3.2817 (3.2178) grad_norm 2.2121 (2.5243) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][420/1251] eta 0:03:21 lr 0.000564 wd 0.0500 time 0.2374 (0.2419) data time 0.0010 (0.0022) model time 0.2364 (0.2400) loss 3.4696 (3.2204) grad_norm 3.1203 (2.5341) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][430/1251] eta 0:03:18 lr 0.000564 wd 0.0500 time 0.2440 (0.2423) data time 0.0011 (0.0022) model time 0.2429 (0.2404) loss 2.9333 (3.2224) grad_norm 2.1174 (2.5298) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [149/300][440/1251] eta 0:03:16 lr 0.000564 wd 0.0500 time 0.2441 (0.2422) data time 0.0008 (0.0022) model time 0.2433 (0.2403) loss 2.9151 (3.2202) grad_norm 2.2496 (2.5307) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][450/1251] eta 0:03:13 lr 0.000564 wd 0.0500 time 0.2393 (0.2421) data time 0.0009 (0.0022) model time 0.2384 (0.2402) loss 3.9815 (3.2261) grad_norm 2.4093 (2.5328) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][460/1251] eta 0:03:11 lr 0.000564 wd 0.0500 time 0.2429 (0.2420) data time 0.0010 (0.0021) model time 0.2419 (0.2402) loss 2.9936 (3.2186) grad_norm 1.9695 (2.5221) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][470/1251] eta 0:03:08 lr 0.000564 wd 0.0500 time 0.2421 (0.2419) data time 0.0007 (0.0021) model time 0.2414 (0.2401) loss 3.4968 (3.2225) grad_norm 2.4664 (2.5176) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][480/1251] eta 0:03:06 lr 0.000564 wd 0.0500 time 0.2385 (0.2418) data time 0.0010 (0.0021) model time 0.2375 (0.2400) loss 2.7092 (3.2221) grad_norm 1.8179 (2.5188) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][490/1251] eta 0:03:03 lr 0.000564 wd 0.0500 time 0.2346 (0.2417) data time 0.0008 (0.0021) model time 0.2337 (0.2399) loss 2.9956 (3.2209) grad_norm 2.1982 (2.5260) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][500/1251] eta 0:03:01 lr 0.000564 wd 0.0500 time 0.2464 (0.2416) data time 0.0010 (0.0021) model time 0.2455 (0.2398) loss 3.3898 (3.2241) grad_norm 2.4280 (2.5344) loss_scale 
1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][510/1251] eta 0:02:59 lr 0.000564 wd 0.0500 time 0.2529 (0.2416) data time 0.0011 (0.0020) model time 0.2518 (0.2398) loss 2.2989 (3.2242) grad_norm 2.1311 (2.5268) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][520/1251] eta 0:02:56 lr 0.000564 wd 0.0500 time 0.2390 (0.2415) data time 0.0010 (0.0020) model time 0.2380 (0.2398) loss 3.2105 (3.2223) grad_norm 1.7330 (2.5219) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][530/1251] eta 0:02:54 lr 0.000564 wd 0.0500 time 0.2405 (0.2415) data time 0.0012 (0.0020) model time 0.2393 (0.2397) loss 3.1204 (3.2211) grad_norm 2.5642 (2.5187) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][540/1251] eta 0:02:51 lr 0.000564 wd 0.0500 time 0.2377 (0.2415) data time 0.0010 (0.0020) model time 0.2367 (0.2397) loss 2.6188 (3.2210) grad_norm 1.9626 (2.5152) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][550/1251] eta 0:02:49 lr 0.000564 wd 0.0500 time 0.2291 (0.2414) data time 0.0009 (0.0020) model time 0.2283 (0.2396) loss 3.7562 (3.2260) grad_norm 2.1158 (2.5201) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][560/1251] eta 0:02:46 lr 0.000563 wd 0.0500 time 0.2468 (0.2414) data time 0.0009 (0.0020) model time 0.2459 (0.2396) loss 2.9941 (3.2260) grad_norm 4.5118 (2.5325) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][570/1251] eta 0:02:44 lr 0.000563 wd 0.0500 time 0.2282 (0.2413) data time 
0.0008 (0.0019) model time 0.2273 (0.2395) loss 3.7249 (3.2269) grad_norm 2.6871 (2.5355) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][580/1251] eta 0:02:41 lr 0.000563 wd 0.0500 time 0.2443 (0.2412) data time 0.0011 (0.0019) model time 0.2431 (0.2395) loss 3.4187 (3.2232) grad_norm 3.0584 (2.5353) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][590/1251] eta 0:02:39 lr 0.000563 wd 0.0500 time 0.2336 (0.2412) data time 0.0009 (0.0019) model time 0.2327 (0.2394) loss 2.4436 (3.2217) grad_norm 3.0802 (2.5385) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][600/1251] eta 0:02:36 lr 0.000563 wd 0.0500 time 0.2445 (0.2411) data time 0.0008 (0.0019) model time 0.2437 (0.2394) loss 3.6674 (3.2203) grad_norm 2.1179 (2.5378) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][610/1251] eta 0:02:34 lr 0.000563 wd 0.0500 time 0.2367 (0.2410) data time 0.0011 (0.0019) model time 0.2356 (0.2393) loss 1.9775 (3.2206) grad_norm 2.6906 (2.5330) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][620/1251] eta 0:02:32 lr 0.000563 wd 0.0500 time 0.2333 (0.2410) data time 0.0008 (0.0019) model time 0.2325 (0.2393) loss 2.6044 (3.2215) grad_norm 2.2572 (2.5339) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][630/1251] eta 0:02:29 lr 0.000563 wd 0.0500 time 0.2384 (0.2410) data time 0.0008 (0.0019) model time 0.2376 (0.2393) loss 3.5538 (3.2179) grad_norm 2.7289 (2.5295) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [149/300][640/1251] eta 0:02:27 lr 0.000563 wd 0.0500 time 0.2345 (0.2409) data time 0.0011 (0.0018) model time 0.2333 (0.2392) loss 2.8276 (3.2229) grad_norm 1.9196 (2.5211) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][650/1251] eta 0:02:24 lr 0.000563 wd 0.0500 time 0.2479 (0.2408) data time 0.0010 (0.0018) model time 0.2469 (0.2391) loss 3.0554 (3.2194) grad_norm 1.9181 (2.5168) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][660/1251] eta 0:02:22 lr 0.000563 wd 0.0500 time 0.2321 (0.2408) data time 0.0010 (0.0018) model time 0.2311 (0.2391) loss 2.4247 (3.2173) grad_norm 2.9004 (2.5134) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][670/1251] eta 0:02:19 lr 0.000563 wd 0.0500 time 0.2401 (0.2407) data time 0.0010 (0.0018) model time 0.2391 (0.2390) loss 3.4095 (3.2174) grad_norm 3.2503 (2.5123) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][680/1251] eta 0:02:17 lr 0.000563 wd 0.0500 time 0.2460 (0.2407) data time 0.0007 (0.0018) model time 0.2453 (0.2390) loss 2.6598 (3.2181) grad_norm 2.8809 (2.5116) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][690/1251] eta 0:02:15 lr 0.000563 wd 0.0500 time 0.2416 (0.2407) data time 0.0011 (0.0018) model time 0.2405 (0.2390) loss 3.4149 (3.2159) grad_norm 2.2620 (2.5084) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][700/1251] eta 0:02:12 lr 0.000563 wd 0.0500 time 0.2429 (0.2407) data time 0.0010 (0.0018) model time 0.2419 (0.2390) loss 3.3310 (3.2173) grad_norm 2.0560 (2.5006) 
loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][710/1251] eta 0:02:10 lr 0.000563 wd 0.0500 time 0.2433 (0.2406) data time 0.0008 (0.0018) model time 0.2425 (0.2390) loss 3.3445 (3.2178) grad_norm 1.7061 (2.4975) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][720/1251] eta 0:02:07 lr 0.000563 wd 0.0500 time 0.2371 (0.2406) data time 0.0011 (0.0018) model time 0.2360 (0.2390) loss 3.1003 (3.2172) grad_norm 2.6937 (2.4943) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][730/1251] eta 0:02:05 lr 0.000563 wd 0.0500 time 0.2414 (0.2406) data time 0.0010 (0.0017) model time 0.2404 (0.2390) loss 3.1894 (3.2168) grad_norm 1.9628 (2.4933) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][740/1251] eta 0:02:02 lr 0.000563 wd 0.0500 time 0.2507 (0.2406) data time 0.0010 (0.0017) model time 0.2497 (0.2390) loss 3.0597 (3.2175) grad_norm 2.1795 (2.4923) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][750/1251] eta 0:02:00 lr 0.000563 wd 0.0500 time 0.2324 (0.2406) data time 0.0010 (0.0017) model time 0.2314 (0.2390) loss 3.6407 (3.2216) grad_norm 2.1668 (2.4926) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][760/1251] eta 0:01:58 lr 0.000563 wd 0.0500 time 0.4122 (0.2412) data time 0.0010 (0.0017) model time 0.4112 (0.2396) loss 3.5382 (3.2227) grad_norm 1.6248 (2.4909) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][770/1251] eta 0:01:56 lr 0.000563 wd 0.0500 time 0.2370 (0.2416) 
data time 0.0009 (0.0017) model time 0.2361 (0.2401) loss 3.3387 (3.2242) grad_norm 2.0271 (2.4921) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][780/1251] eta 0:01:53 lr 0.000562 wd 0.0500 time 0.2462 (0.2415) data time 0.0009 (0.0017) model time 0.2453 (0.2400) loss 3.6698 (3.2233) grad_norm 1.9213 (2.4886) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][790/1251] eta 0:01:51 lr 0.000562 wd 0.0500 time 0.2327 (0.2415) data time 0.0008 (0.0017) model time 0.2319 (0.2399) loss 2.9464 (3.2222) grad_norm 2.4478 (2.5028) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][800/1251] eta 0:01:48 lr 0.000562 wd 0.0500 time 0.2342 (0.2414) data time 0.0008 (0.0017) model time 0.2334 (0.2399) loss 2.9362 (3.2209) grad_norm 1.7349 (2.5010) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][810/1251] eta 0:01:46 lr 0.000562 wd 0.0500 time 0.2334 (0.2414) data time 0.0008 (0.0017) model time 0.2326 (0.2399) loss 2.5667 (3.2209) grad_norm 3.5768 (2.4993) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][820/1251] eta 0:01:44 lr 0.000562 wd 0.0500 time 0.2376 (0.2414) data time 0.0009 (0.0017) model time 0.2368 (0.2399) loss 4.0960 (3.2242) grad_norm 3.0171 (2.5042) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][830/1251] eta 0:01:41 lr 0.000562 wd 0.0500 time 0.2346 (0.2413) data time 0.0009 (0.0017) model time 0.2337 (0.2398) loss 3.5578 (3.2265) grad_norm 2.8464 (2.5035) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [149/300][840/1251] eta 0:01:39 lr 0.000562 wd 0.0500 time 0.2465 (0.2413) data time 0.0008 (0.0016) model time 0.2458 (0.2398) loss 2.8655 (3.2270) grad_norm 3.2029 (2.5107) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][850/1251] eta 0:01:36 lr 0.000562 wd 0.0500 time 0.2372 (0.2412) data time 0.0012 (0.0016) model time 0.2361 (0.2397) loss 3.5532 (3.2243) grad_norm 2.8303 (2.5082) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][860/1251] eta 0:01:34 lr 0.000562 wd 0.0500 time 0.2372 (0.2412) data time 0.0008 (0.0016) model time 0.2364 (0.2397) loss 2.9170 (3.2242) grad_norm 3.5123 (2.5086) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][870/1251] eta 0:01:31 lr 0.000562 wd 0.0500 time 0.2413 (0.2412) data time 0.0010 (0.0016) model time 0.2403 (0.2397) loss 3.7885 (3.2276) grad_norm 1.8663 (2.5129) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][880/1251] eta 0:01:29 lr 0.000562 wd 0.0500 time 0.2398 (0.2411) data time 0.0010 (0.0016) model time 0.2388 (0.2396) loss 3.1048 (3.2272) grad_norm 2.4586 (2.5107) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][890/1251] eta 0:01:27 lr 0.000562 wd 0.0500 time 0.2451 (0.2411) data time 0.0008 (0.0016) model time 0.2442 (0.2396) loss 4.0057 (3.2292) grad_norm 1.9667 (2.5094) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][900/1251] eta 0:01:24 lr 0.000562 wd 0.0500 time 0.2355 (0.2411) data time 0.0009 (0.0016) model time 0.2347 (0.2396) loss 4.0403 (3.2321) grad_norm 
2.4518 (2.5081) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][910/1251] eta 0:01:22 lr 0.000562 wd 0.0500 time 0.2440 (0.2410) data time 0.0008 (0.0016) model time 0.2432 (0.2396) loss 4.0350 (3.2371) grad_norm 2.0969 (2.5089) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][920/1251] eta 0:01:19 lr 0.000562 wd 0.0500 time 0.2396 (0.2410) data time 0.0011 (0.0016) model time 0.2386 (0.2395) loss 3.3527 (3.2378) grad_norm 1.9734 (2.5065) loss_scale 2048.0000 (1026.2237) mem 7382MB [2024-08-27 09:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][930/1251] eta 0:01:17 lr 0.000562 wd 0.0500 time 0.2354 (0.2410) data time 0.0011 (0.0016) model time 0.2343 (0.2395) loss 3.0607 (3.2369) grad_norm 2.3339 (2.5093) loss_scale 2048.0000 (1037.1987) mem 7382MB [2024-08-27 09:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][940/1251] eta 0:01:14 lr 0.000562 wd 0.0500 time 0.2390 (0.2410) data time 0.0011 (0.0016) model time 0.2378 (0.2395) loss 3.1887 (3.2369) grad_norm 2.4584 (2.5112) loss_scale 2048.0000 (1047.9405) mem 7382MB [2024-08-27 09:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][950/1251] eta 0:01:12 lr 0.000562 wd 0.0500 time 0.2479 (0.2412) data time 0.0014 (0.0016) model time 0.2465 (0.2397) loss 3.5222 (3.2373) grad_norm 2.1542 (2.5099) loss_scale 2048.0000 (1058.4564) mem 7382MB [2024-08-27 09:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][960/1251] eta 0:01:10 lr 0.000562 wd 0.0500 time 0.2473 (0.2411) data time 0.0007 (0.0016) model time 0.2466 (0.2397) loss 3.2680 (3.2382) grad_norm 2.1211 (2.5118) loss_scale 2048.0000 (1068.7534) mem 7382MB [2024-08-27 09:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][970/1251] eta 0:01:07 lr 0.000562 wd 0.0500 time 
0.2441 (0.2411) data time 0.0007 (0.0016) model time 0.2434 (0.2397) loss 3.5278 (3.2374) grad_norm 2.2064 (2.5127) loss_scale 2048.0000 (1078.8383) mem 7382MB [2024-08-27 09:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][980/1251] eta 0:01:05 lr 0.000562 wd 0.0500 time 0.2349 (0.2411) data time 0.0011 (0.0016) model time 0.2338 (0.2396) loss 2.9111 (3.2361) grad_norm 2.1389 (2.5121) loss_scale 2048.0000 (1088.7176) mem 7382MB [2024-08-27 09:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][990/1251] eta 0:01:02 lr 0.000562 wd 0.0500 time 0.2391 (0.2411) data time 0.0010 (0.0016) model time 0.2381 (0.2396) loss 3.0457 (3.2330) grad_norm 2.3635 (2.5126) loss_scale 2048.0000 (1098.3976) mem 7382MB [2024-08-27 09:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1000/1251] eta 0:01:00 lr 0.000562 wd 0.0500 time 0.2329 (0.2411) data time 0.0008 (0.0016) model time 0.2321 (0.2396) loss 4.4616 (3.2339) grad_norm 2.4862 (2.5135) loss_scale 2048.0000 (1107.8841) mem 7382MB [2024-08-27 09:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1010/1251] eta 0:00:58 lr 0.000561 wd 0.0500 time 0.2413 (0.2410) data time 0.0011 (0.0016) model time 0.2402 (0.2396) loss 3.3879 (3.2344) grad_norm 2.4617 (2.5101) loss_scale 2048.0000 (1117.1830) mem 7382MB [2024-08-27 09:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1020/1251] eta 0:00:55 lr 0.000561 wd 0.0500 time 0.2321 (0.2410) data time 0.0010 (0.0016) model time 0.2311 (0.2396) loss 3.1159 (3.2338) grad_norm 2.2967 (2.5076) loss_scale 2048.0000 (1126.2997) mem 7382MB [2024-08-27 09:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1030/1251] eta 0:00:53 lr 0.000561 wd 0.0500 time 0.2392 (0.2410) data time 0.0010 (0.0015) model time 0.2382 (0.2396) loss 3.1091 (3.2328) grad_norm 2.5333 (2.5097) loss_scale 2048.0000 (1135.2396) mem 7382MB [2024-08-27 09:16:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1040/1251] eta 0:00:50 lr 0.000561 wd 0.0500 time 0.2363 (0.2410) data time 0.0011 (0.0015) model time 0.2352 (0.2395) loss 3.3160 (3.2332) grad_norm 2.6882 (2.5116) loss_scale 2048.0000 (1144.0077) mem 7382MB [2024-08-27 09:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1050/1251] eta 0:00:48 lr 0.000561 wd 0.0500 time 0.2414 (0.2410) data time 0.0009 (0.0015) model time 0.2404 (0.2395) loss 3.7953 (3.2334) grad_norm 1.9551 (2.5089) loss_scale 2048.0000 (1152.6089) mem 7382MB [2024-08-27 09:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1060/1251] eta 0:00:46 lr 0.000561 wd 0.0500 time 0.2453 (0.2410) data time 0.0010 (0.0015) model time 0.2444 (0.2395) loss 3.1483 (3.2320) grad_norm 2.8692 (2.5084) loss_scale 2048.0000 (1161.0481) mem 7382MB [2024-08-27 09:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1070/1251] eta 0:00:43 lr 0.000561 wd 0.0500 time 0.2364 (0.2409) data time 0.0011 (0.0015) model time 0.2354 (0.2395) loss 2.2695 (3.2348) grad_norm 1.5824 (2.5094) loss_scale 2048.0000 (1169.3296) mem 7382MB [2024-08-27 09:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1080/1251] eta 0:00:41 lr 0.000561 wd 0.0500 time 0.2349 (0.2409) data time 0.0009 (0.0015) model time 0.2340 (0.2395) loss 3.2667 (3.2341) grad_norm 2.6181 (2.5119) loss_scale 2048.0000 (1177.4579) mem 7382MB [2024-08-27 09:16:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1090/1251] eta 0:00:38 lr 0.000561 wd 0.0500 time 0.2405 (0.2409) data time 0.0011 (0.0015) model time 0.2394 (0.2395) loss 2.8491 (3.2329) grad_norm 4.2236 (2.5150) loss_scale 2048.0000 (1185.4372) mem 7382MB [2024-08-27 09:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1100/1251] eta 0:00:36 lr 0.000561 wd 0.0500 time 0.2343 (0.2409) data time 0.0010 (0.0015) model time 0.2333 (0.2395) loss 
3.5420 (3.2349) grad_norm 2.8304 (2.5171) loss_scale 2048.0000 (1193.2716) mem 7382MB [2024-08-27 09:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1110/1251] eta 0:00:33 lr 0.000561 wd 0.0500 time 0.2356 (0.2408) data time 0.0010 (0.0015) model time 0.2346 (0.2394) loss 3.3355 (3.2355) grad_norm 2.2695 (2.5165) loss_scale 2048.0000 (1200.9649) mem 7382MB [2024-08-27 09:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1120/1251] eta 0:00:31 lr 0.000561 wd 0.0500 time 0.2410 (0.2408) data time 0.0007 (0.0015) model time 0.2402 (0.2394) loss 2.1078 (3.2344) grad_norm 1.9446 (2.5178) loss_scale 2048.0000 (1208.5210) mem 7382MB [2024-08-27 09:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1130/1251] eta 0:00:29 lr 0.000561 wd 0.0500 time 0.2396 (0.2408) data time 0.0009 (0.0015) model time 0.2387 (0.2394) loss 2.6854 (3.2330) grad_norm 3.5237 (2.5201) loss_scale 2048.0000 (1215.9434) mem 7382MB [2024-08-27 09:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1140/1251] eta 0:00:26 lr 0.000561 wd 0.0500 time 0.2383 (0.2408) data time 0.0009 (0.0015) model time 0.2374 (0.2394) loss 3.7323 (3.2344) grad_norm 2.2394 (2.5185) loss_scale 2048.0000 (1223.2358) mem 7382MB [2024-08-27 09:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1150/1251] eta 0:00:24 lr 0.000561 wd 0.0500 time 0.2415 (0.2408) data time 0.0010 (0.0015) model time 0.2406 (0.2394) loss 3.0080 (3.2331) grad_norm 2.4141 (2.5145) loss_scale 2048.0000 (1230.4014) mem 7382MB [2024-08-27 09:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1160/1251] eta 0:00:21 lr 0.000561 wd 0.0500 time 0.2401 (0.2408) data time 0.0009 (0.0015) model time 0.2392 (0.2394) loss 3.7968 (3.2317) grad_norm 1.9505 (2.5123) loss_scale 2048.0000 (1237.4436) mem 7382MB [2024-08-27 09:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1170/1251] eta 
0:00:19 lr 0.000561 wd 0.0500 time 0.2377 (0.2408) data time 0.0010 (0.0015) model time 0.2367 (0.2394) loss 3.6700 (3.2321) grad_norm 3.6867 (2.5143) loss_scale 2048.0000 (1244.3655) mem 7382MB [2024-08-27 09:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1180/1251] eta 0:00:17 lr 0.000561 wd 0.0500 time 0.2339 (0.2407) data time 0.0012 (0.0015) model time 0.2327 (0.2393) loss 3.4121 (3.2314) grad_norm 2.1851 (2.5135) loss_scale 2048.0000 (1251.1702) mem 7382MB [2024-08-27 09:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1190/1251] eta 0:00:14 lr 0.000561 wd 0.0500 time 0.2393 (0.2407) data time 0.0009 (0.0015) model time 0.2384 (0.2393) loss 3.1323 (3.2322) grad_norm 2.4259 (2.5108) loss_scale 2048.0000 (1257.8606) mem 7382MB [2024-08-27 09:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1200/1251] eta 0:00:12 lr 0.000561 wd 0.0500 time 0.2371 (0.2407) data time 0.0007 (0.0015) model time 0.2364 (0.2393) loss 2.9954 (3.2302) grad_norm 2.5074 (2.5093) loss_scale 2048.0000 (1264.4396) mem 7382MB [2024-08-27 09:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1210/1251] eta 0:00:09 lr 0.000561 wd 0.0500 time 0.2501 (0.2407) data time 0.0008 (0.0015) model time 0.2492 (0.2393) loss 3.7319 (3.2315) grad_norm 1.9546 (2.5067) loss_scale 2048.0000 (1270.9100) mem 7382MB [2024-08-27 09:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1220/1251] eta 0:00:07 lr 0.000561 wd 0.0500 time 0.2414 (0.2407) data time 0.0011 (0.0015) model time 0.2404 (0.2393) loss 3.3718 (3.2331) grad_norm 2.6021 (2.5066) loss_scale 2048.0000 (1277.2744) mem 7382MB [2024-08-27 09:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1230/1251] eta 0:00:05 lr 0.000561 wd 0.0500 time 0.2371 (0.2406) data time 0.0008 (0.0015) model time 0.2363 (0.2392) loss 2.9604 (3.2321) grad_norm 1.8454 (2.5080) loss_scale 2048.0000 (1283.5353) mem 
7382MB [2024-08-27 09:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1240/1251] eta 0:00:02 lr 0.000560 wd 0.0500 time 0.2239 (0.2406) data time 0.0007 (0.0015) model time 0.2232 (0.2392) loss 3.3941 (3.2307) grad_norm 2.4291 (2.5149) loss_scale 2048.0000 (1289.6954) mem 7382MB [2024-08-27 09:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [149/300][1250/1251] eta 0:00:00 lr 0.000560 wd 0.0500 time 0.2249 (0.2404) data time 0.0005 (0.0015) model time 0.2244 (0.2390) loss 4.2517 (3.2313) grad_norm 2.5088 (2.5152) loss_scale 2048.0000 (1295.7570) mem 7382MB [2024-08-27 09:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 149 training takes 0:05:00 [2024-08-27 09:17:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:17:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 09:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.442 (0.442) Loss 0.4497 (0.4497) Acc@1 91.992 (91.992) Acc@5 98.340 (98.340) Mem 7382MB [2024-08-27 09:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.112) Loss 0.7202 (0.7108) Acc@1 84.766 (84.828) Acc@5 96.875 (96.875) Mem 7382MB [2024-08-27 09:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.096) Loss 1.0391 (0.7445) Acc@1 74.512 (83.691) Acc@5 93.750 (96.731) Mem 7382MB [2024-08-27 09:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.3184 (0.8493) Acc@1 68.848 (81.300) Acc@5 90.039 (95.530) Mem 7382MB [2024-08-27 09:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1611 (0.9085) Acc@1 72.168 (79.683) Acc@5 92.090 (94.879) Mem 7382MB [2024-08-27 09:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.210 Acc@5 94.836 [2024-08-27 09:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.2% [2024-08-27 09:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.21% [2024-08-27 09:17:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 09:17:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 09:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.413 (0.413) Loss 0.4006 (0.4006) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7382MB [2024-08-27 09:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.107) Loss 0.6450 (0.6392) Acc@1 86.621 (86.301) Acc@5 97.070 (97.292) Mem 7382MB [2024-08-27 09:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.094) Loss 0.9062 (0.6636) Acc@1 78.613 (85.333) Acc@5 95.410 (97.345) Mem 7382MB [2024-08-27 09:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.088) Loss 1.1562 (0.7540) Acc@1 70.801 (83.071) Acc@5 92.969 (96.402) Mem 7382MB [2024-08-27 09:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0381 (0.8012) Acc@1 74.219 (81.710) Acc@5 93.457 (95.887) Mem 7382MB [2024-08-27 09:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.278 Acc@5 95.846 [2024-08-27 09:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-27 09:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.28% [2024-08-27 09:17:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 09:17:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 09:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][0/1251] eta 0:13:28 lr 0.000560 wd 0.0500 time 0.6465 (0.6465) data time 0.4170 (0.4170) model time 0.0000 (0.0000) loss 2.4829 (2.4829) grad_norm 2.2738 (2.2738) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][10/1251] eta 0:05:42 lr 0.000560 wd 0.0500 time 0.2363 (0.2757) data time 0.0008 (0.0388) model time 0.0000 (0.0000) loss 2.4235 (3.2873) grad_norm 2.0048 (2.4386) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][20/1251] eta 0:05:19 lr 0.000560 wd 0.0500 time 0.2457 (0.2599) data time 0.0008 (0.0208) model time 0.0000 (0.0000) loss 2.2685 (3.1659) grad_norm 1.8061 (2.3460) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][30/1251] eta 0:05:13 lr 0.000560 wd 0.0500 time 0.2393 (0.2571) data time 0.0009 (0.0144) model time 0.0000 (0.0000) loss 2.4419 (3.1975) grad_norm 3.0741 (2.4648) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][40/1251] eta 0:05:24 lr 0.000560 wd 0.0500 time 0.3770 (0.2678) data time 0.0007 (0.0111) model time 0.0000 (0.0000) loss 2.5977 (3.1673) grad_norm 2.0648 (2.4497) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][50/1251] eta 0:05:14 lr 0.000560 wd 0.0500 time 0.2401 (0.2617) data time 0.0010 (0.0092) model time 0.0000 (0.0000) loss 3.1710 (3.2014) grad_norm 3.2586 (2.4473) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][60/1251] eta 0:05:07 lr 0.000560 wd 0.0500 time 0.2407 (0.2579) data time 0.0008 (0.0078) model time 0.2399 
(0.2374) loss 2.6755 (3.2481) grad_norm 2.0481 (2.4833) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][70/1251] eta 0:05:01 lr 0.000560 wd 0.0500 time 0.2353 (0.2552) data time 0.0010 (0.0069) model time 0.2343 (0.2377) loss 3.8810 (3.2371) grad_norm 2.0271 (2.4628) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][80/1251] eta 0:04:56 lr 0.000560 wd 0.0500 time 0.2319 (0.2535) data time 0.0014 (0.0062) model time 0.2305 (0.2384) loss 3.4546 (3.2574) grad_norm 3.5603 (2.5100) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][90/1251] eta 0:04:52 lr 0.000560 wd 0.0500 time 0.2361 (0.2519) data time 0.0012 (0.0056) model time 0.2349 (0.2383) loss 2.4382 (3.2195) grad_norm 2.9950 (2.5300) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][100/1251] eta 0:04:48 lr 0.000560 wd 0.0500 time 0.2370 (0.2505) data time 0.0010 (0.0052) model time 0.2360 (0.2380) loss 3.2712 (3.2304) grad_norm 2.6350 (2.5250) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][110/1251] eta 0:04:44 lr 0.000560 wd 0.0500 time 0.2418 (0.2494) data time 0.0008 (0.0048) model time 0.2411 (0.2378) loss 2.4438 (3.2184) grad_norm 1.9309 (2.5520) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][120/1251] eta 0:04:42 lr 0.000560 wd 0.0500 time 0.2219 (0.2499) data time 0.0010 (0.0045) model time 0.2209 (0.2402) loss 3.2895 (3.2309) grad_norm 2.3856 (2.5750) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][130/1251] 
eta 0:04:39 lr 0.000560 wd 0.0500 time 0.2357 (0.2491) data time 0.0010 (0.0042) model time 0.2347 (0.2399) loss 2.4877 (3.2347) grad_norm 2.7648 (2.6145) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][140/1251] eta 0:04:35 lr 0.000560 wd 0.0500 time 0.2375 (0.2483) data time 0.0011 (0.0040) model time 0.2365 (0.2396) loss 3.2252 (3.2113) grad_norm 2.3839 (2.6226) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][150/1251] eta 0:04:32 lr 0.000560 wd 0.0500 time 0.2429 (0.2477) data time 0.0008 (0.0038) model time 0.2421 (0.2395) loss 3.7890 (3.2044) grad_norm 5.3515 (2.6167) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][160/1251] eta 0:04:29 lr 0.000560 wd 0.0500 time 0.2352 (0.2472) data time 0.0009 (0.0036) model time 0.2342 (0.2394) loss 2.1800 (3.2184) grad_norm 3.2092 (2.6608) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][170/1251] eta 0:04:26 lr 0.000560 wd 0.0500 time 0.2304 (0.2466) data time 0.0009 (0.0035) model time 0.2295 (0.2392) loss 3.7145 (3.2252) grad_norm 2.3796 (2.6783) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][180/1251] eta 0:04:23 lr 0.000560 wd 0.0500 time 0.2358 (0.2461) data time 0.0011 (0.0033) model time 0.2347 (0.2389) loss 3.4822 (3.2297) grad_norm 1.7171 (2.6700) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][190/1251] eta 0:04:20 lr 0.000560 wd 0.0500 time 0.2339 (0.2455) data time 0.0012 (0.0032) model time 0.2327 (0.2386) loss 3.7297 (3.2304) grad_norm 2.2893 (2.6483) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 09:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][200/1251] eta 0:04:17 lr 0.000560 wd 0.0500 time 0.2340 (0.2452) data time 0.0008 (0.0031) model time 0.2332 (0.2385) loss 2.8697 (3.2325) grad_norm 1.8247 (2.6152) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][210/1251] eta 0:04:14 lr 0.000559 wd 0.0500 time 0.2438 (0.2448) data time 0.0010 (0.0030) model time 0.2427 (0.2384) loss 3.2579 (3.2313) grad_norm 2.6007 (2.5945) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][220/1251] eta 0:04:12 lr 0.000559 wd 0.0500 time 0.2306 (0.2446) data time 0.0009 (0.0029) model time 0.2297 (0.2384) loss 2.9673 (3.2311) grad_norm 2.9174 (2.5835) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][230/1251] eta 0:04:09 lr 0.000559 wd 0.0500 time 0.2379 (0.2445) data time 0.0008 (0.0028) model time 0.2371 (0.2385) loss 2.5772 (3.2383) grad_norm 2.0427 (2.5853) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][240/1251] eta 0:04:06 lr 0.000559 wd 0.0500 time 0.2329 (0.2442) data time 0.0009 (0.0028) model time 0.2320 (0.2384) loss 3.0883 (3.2345) grad_norm 2.3002 (2.5804) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][250/1251] eta 0:04:04 lr 0.000559 wd 0.0500 time 0.2551 (0.2441) data time 0.0010 (0.0027) model time 0.2541 (0.2385) loss 1.9874 (3.2196) grad_norm 2.8336 (2.5658) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][260/1251] eta 0:04:01 lr 0.000559 wd 0.0500 time 0.2335 (0.2439) data time 0.0010 (0.0026) model time 0.2325 
(0.2385) loss 3.2582 (3.2117) grad_norm 3.2613 (2.5660) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][270/1251] eta 0:03:58 lr 0.000559 wd 0.0500 time 0.2385 (0.2436) data time 0.0007 (0.0026) model time 0.2378 (0.2384) loss 2.1750 (3.2124) grad_norm 2.2436 (2.5469) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][280/1251] eta 0:03:56 lr 0.000559 wd 0.0500 time 0.2289 (0.2434) data time 0.0011 (0.0025) model time 0.2278 (0.2383) loss 2.4802 (3.2054) grad_norm 2.1178 (2.5299) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][290/1251] eta 0:03:53 lr 0.000559 wd 0.0500 time 0.2449 (0.2432) data time 0.0007 (0.0025) model time 0.2441 (0.2383) loss 2.5187 (3.1964) grad_norm 2.6321 (2.5228) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][300/1251] eta 0:03:51 lr 0.000559 wd 0.0500 time 0.2339 (0.2430) data time 0.0011 (0.0024) model time 0.2328 (0.2381) loss 3.5635 (3.1996) grad_norm 2.0915 (2.5407) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][310/1251] eta 0:03:48 lr 0.000559 wd 0.0500 time 0.2369 (0.2429) data time 0.0011 (0.0024) model time 0.2358 (0.2381) loss 3.4652 (3.2007) grad_norm 2.6908 (2.5441) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][320/1251] eta 0:03:45 lr 0.000559 wd 0.0500 time 0.2330 (0.2427) data time 0.0010 (0.0023) model time 0.2320 (0.2381) loss 3.3719 (3.2044) grad_norm 1.9411 (2.5312) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[150/300][330/1251] eta 0:03:43 lr 0.000559 wd 0.0500 time 0.2395 (0.2427) data time 0.0010 (0.0023) model time 0.2385 (0.2381) loss 2.7234 (3.2025) grad_norm 2.3167 (2.5244) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][340/1251] eta 0:03:40 lr 0.000559 wd 0.0500 time 0.2363 (0.2426) data time 0.0008 (0.0023) model time 0.2355 (0.2382) loss 3.5542 (3.1985) grad_norm 2.5042 (2.5223) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][350/1251] eta 0:03:38 lr 0.000559 wd 0.0500 time 0.2344 (0.2424) data time 0.0007 (0.0022) model time 0.2337 (0.2381) loss 2.8032 (3.1932) grad_norm 2.9233 (2.5404) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][360/1251] eta 0:03:35 lr 0.000559 wd 0.0500 time 0.2381 (0.2423) data time 0.0009 (0.0022) model time 0.2372 (0.2381) loss 2.8379 (3.1897) grad_norm 2.2392 (2.5464) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][370/1251] eta 0:03:33 lr 0.000559 wd 0.0500 time 0.2430 (0.2422) data time 0.0008 (0.0021) model time 0.2423 (0.2380) loss 3.2362 (3.1848) grad_norm 2.0632 (2.5706) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][380/1251] eta 0:03:30 lr 0.000559 wd 0.0500 time 0.2298 (0.2420) data time 0.0007 (0.0021) model time 0.2291 (0.2380) loss 3.8356 (3.1821) grad_norm 2.2848 (2.5685) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][390/1251] eta 0:03:28 lr 0.000559 wd 0.0500 time 0.2409 (0.2420) data time 0.0010 (0.0021) model time 0.2399 (0.2380) loss 3.4481 (3.1868) grad_norm 2.4546 (2.5603) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 09:18:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][400/1251] eta 0:03:25 lr 0.000559 wd 0.0500 time 0.2402 (0.2419) data time 0.0010 (0.0021) model time 0.2391 (0.2380) loss 3.1050 (3.1842) grad_norm 2.1552 (2.5588) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][410/1251] eta 0:03:23 lr 0.000559 wd 0.0500 time 0.2364 (0.2419) data time 0.0010 (0.0020) model time 0.2354 (0.2380) loss 3.6761 (3.1883) grad_norm 1.9876 (2.5545) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][420/1251] eta 0:03:20 lr 0.000559 wd 0.0500 time 0.2339 (0.2418) data time 0.0008 (0.0020) model time 0.2331 (0.2380) loss 2.9177 (3.1930) grad_norm 2.2613 (2.5511) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:18:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][430/1251] eta 0:03:18 lr 0.000559 wd 0.0500 time 0.2357 (0.2417) data time 0.0008 (0.0020) model time 0.2349 (0.2380) loss 4.4340 (3.1880) grad_norm 2.2564 (2.5443) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][440/1251] eta 0:03:15 lr 0.000558 wd 0.0500 time 0.2372 (0.2417) data time 0.0010 (0.0020) model time 0.2363 (0.2380) loss 3.8606 (3.1949) grad_norm 1.9589 (2.5405) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][450/1251] eta 0:03:13 lr 0.000558 wd 0.0500 time 0.2361 (0.2416) data time 0.0009 (0.0020) model time 0.2352 (0.2380) loss 3.9458 (3.1958) grad_norm 1.5557 (2.5401) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][460/1251] eta 0:03:11 lr 0.000558 wd 0.0500 time 0.2301 (0.2416) data time 0.0014 
(0.0019) model time 0.2287 (0.2380) loss 3.0645 (3.2007) grad_norm 2.4982 (2.5411) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][470/1251] eta 0:03:08 lr 0.000558 wd 0.0500 time 0.2310 (0.2415) data time 0.0007 (0.0019) model time 0.2303 (0.2380) loss 2.5862 (3.2020) grad_norm 1.9946 (2.5346) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][480/1251] eta 0:03:06 lr 0.000558 wd 0.0500 time 0.2320 (0.2414) data time 0.0009 (0.0019) model time 0.2311 (0.2380) loss 3.7477 (3.2049) grad_norm 2.5931 (2.5305) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][490/1251] eta 0:03:03 lr 0.000558 wd 0.0500 time 0.2393 (0.2413) data time 0.0008 (0.0019) model time 0.2384 (0.2380) loss 4.0967 (3.2010) grad_norm 1.8786 (2.5208) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][500/1251] eta 0:03:01 lr 0.000558 wd 0.0500 time 0.2477 (0.2413) data time 0.0009 (0.0019) model time 0.2468 (0.2380) loss 2.7170 (3.2047) grad_norm 2.9178 (2.5168) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][510/1251] eta 0:02:58 lr 0.000558 wd 0.0500 time 0.2432 (0.2413) data time 0.0012 (0.0018) model time 0.2421 (0.2380) loss 3.7836 (3.2037) grad_norm 1.8410 (2.5139) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][520/1251] eta 0:02:56 lr 0.000558 wd 0.0500 time 0.2411 (0.2413) data time 0.0012 (0.0018) model time 0.2399 (0.2380) loss 3.2284 (3.1973) grad_norm 1.8708 (2.5090) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [150/300][530/1251] eta 0:02:53 lr 0.000558 wd 0.0500 time 0.2428 (0.2413) data time 0.0012 (0.0018) model time 0.2416 (0.2381) loss 3.3966 (3.1927) grad_norm 1.8169 (2.5133) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][540/1251] eta 0:02:51 lr 0.000558 wd 0.0500 time 0.2405 (0.2413) data time 0.0011 (0.0018) model time 0.2394 (0.2381) loss 3.0694 (3.1933) grad_norm 1.8147 (2.5060) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][550/1251] eta 0:02:49 lr 0.000558 wd 0.0500 time 0.2360 (0.2412) data time 0.0011 (0.0018) model time 0.2349 (0.2381) loss 2.1358 (3.1903) grad_norm 2.5221 (2.5027) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][560/1251] eta 0:02:47 lr 0.000558 wd 0.0500 time 0.2348 (0.2423) data time 0.0011 (0.0018) model time 0.2337 (0.2393) loss 3.8502 (3.1926) grad_norm 2.4226 (2.5060) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][570/1251] eta 0:02:45 lr 0.000558 wd 0.0500 time 0.2400 (0.2430) data time 0.0007 (0.0018) model time 0.2393 (0.2401) loss 1.9784 (3.1909) grad_norm 1.7120 (2.4973) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][580/1251] eta 0:02:42 lr 0.000558 wd 0.0500 time 0.2419 (0.2429) data time 0.0010 (0.0018) model time 0.2409 (0.2401) loss 2.9612 (3.1916) grad_norm 1.8766 (2.4880) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][590/1251] eta 0:02:40 lr 0.000558 wd 0.0500 time 0.2355 (0.2428) data time 0.0008 (0.0017) model time 0.2347 (0.2400) loss 2.7338 (3.1876) grad_norm 2.5615 (2.4856) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][600/1251] eta 0:02:38 lr 0.000558 wd 0.0500 time 0.2404 (0.2428) data time 0.0011 (0.0017) model time 0.2393 (0.2400) loss 3.2415 (3.1867) grad_norm 1.8703 (2.4848) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][610/1251] eta 0:02:35 lr 0.000558 wd 0.0500 time 0.2448 (0.2428) data time 0.0011 (0.0017) model time 0.2436 (0.2400) loss 3.2239 (3.1849) grad_norm 2.8627 (2.4834) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][620/1251] eta 0:02:33 lr 0.000558 wd 0.0500 time 0.2423 (0.2427) data time 0.0009 (0.0017) model time 0.2415 (0.2400) loss 2.9715 (3.1893) grad_norm 2.9430 (2.4876) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][630/1251] eta 0:02:30 lr 0.000558 wd 0.0500 time 0.2334 (0.2427) data time 0.0008 (0.0017) model time 0.2326 (0.2400) loss 2.6225 (3.1907) grad_norm 7.7933 (2.5051) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][640/1251] eta 0:02:28 lr 0.000558 wd 0.0500 time 0.2399 (0.2426) data time 0.0009 (0.0017) model time 0.2390 (0.2399) loss 3.7537 (3.1911) grad_norm 2.3785 (2.5082) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][650/1251] eta 0:02:25 lr 0.000558 wd 0.0500 time 0.2447 (0.2429) data time 0.0009 (0.0017) model time 0.2438 (0.2403) loss 3.8613 (3.1945) grad_norm 3.4002 (2.5092) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][660/1251] eta 0:02:23 lr 0.000558 wd 0.0500 time 0.2325 (0.2429) data time 
0.0012 (0.0017) model time 0.2312 (0.2402) loss 3.1699 (3.1939) grad_norm 2.0661 (2.5051) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][670/1251] eta 0:02:21 lr 0.000557 wd 0.0500 time 0.2440 (0.2428) data time 0.0007 (0.0017) model time 0.2432 (0.2402) loss 4.2079 (3.1951) grad_norm 1.7483 (2.5027) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][680/1251] eta 0:02:18 lr 0.000557 wd 0.0500 time 0.2333 (0.2428) data time 0.0011 (0.0017) model time 0.2322 (0.2402) loss 3.4321 (3.1999) grad_norm 2.6113 (2.5007) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][690/1251] eta 0:02:16 lr 0.000557 wd 0.0500 time 0.2448 (0.2427) data time 0.0007 (0.0016) model time 0.2441 (0.2402) loss 2.6174 (3.1974) grad_norm 2.4965 (2.4952) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][700/1251] eta 0:02:13 lr 0.000557 wd 0.0500 time 0.2360 (0.2426) data time 0.0012 (0.0016) model time 0.2348 (0.2401) loss 3.3660 (3.1994) grad_norm 2.2935 (2.4902) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][710/1251] eta 0:02:11 lr 0.000557 wd 0.0500 time 0.2445 (0.2426) data time 0.0011 (0.0016) model time 0.2434 (0.2401) loss 2.6119 (3.1971) grad_norm 2.5808 (2.4850) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][720/1251] eta 0:02:08 lr 0.000557 wd 0.0500 time 0.2448 (0.2426) data time 0.0012 (0.0016) model time 0.2437 (0.2400) loss 2.7962 (3.1969) grad_norm 2.5402 (2.4803) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [150/300][730/1251] eta 0:02:06 lr 0.000557 wd 0.0500 time 0.2404 (0.2425) data time 0.0008 (0.0016) model time 0.2396 (0.2400) loss 3.0288 (3.1983) grad_norm 1.9617 (2.4817) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][740/1251] eta 0:02:03 lr 0.000557 wd 0.0500 time 0.2468 (0.2425) data time 0.0007 (0.0016) model time 0.2461 (0.2400) loss 3.7828 (3.1998) grad_norm 2.1846 (2.4854) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][750/1251] eta 0:02:01 lr 0.000557 wd 0.0500 time 0.2395 (0.2424) data time 0.0010 (0.0016) model time 0.2385 (0.2400) loss 2.8365 (3.2011) grad_norm 3.2625 (2.4868) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][760/1251] eta 0:01:59 lr 0.000557 wd 0.0500 time 0.2348 (0.2424) data time 0.0009 (0.0016) model time 0.2340 (0.2400) loss 3.7710 (3.2017) grad_norm 1.9015 (2.4810) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][770/1251] eta 0:01:56 lr 0.000557 wd 0.0500 time 0.2486 (0.2424) data time 0.0009 (0.0016) model time 0.2478 (0.2400) loss 3.9098 (3.2012) grad_norm 2.3606 (2.4800) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][780/1251] eta 0:01:54 lr 0.000557 wd 0.0500 time 0.2392 (0.2424) data time 0.0012 (0.0016) model time 0.2380 (0.2400) loss 3.2641 (3.2034) grad_norm 2.0599 (2.4827) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][790/1251] eta 0:01:51 lr 0.000557 wd 0.0500 time 0.2385 (0.2423) data time 0.0010 (0.0016) model time 0.2376 (0.2400) loss 2.5574 (3.2030) grad_norm 2.2861 (2.4799) 
loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][800/1251] eta 0:01:49 lr 0.000557 wd 0.0500 time 0.2512 (0.2423) data time 0.0011 (0.0016) model time 0.2501 (0.2399) loss 2.3018 (3.2050) grad_norm 2.0990 (2.4740) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][810/1251] eta 0:01:46 lr 0.000557 wd 0.0500 time 0.2397 (0.2423) data time 0.0010 (0.0016) model time 0.2387 (0.2399) loss 3.3098 (3.2063) grad_norm 2.2992 (2.4755) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][820/1251] eta 0:01:44 lr 0.000557 wd 0.0500 time 0.2376 (0.2422) data time 0.0011 (0.0016) model time 0.2365 (0.2399) loss 3.4825 (3.2065) grad_norm 2.1868 (2.4703) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][830/1251] eta 0:01:41 lr 0.000557 wd 0.0500 time 0.2435 (0.2422) data time 0.0007 (0.0015) model time 0.2428 (0.2399) loss 2.0537 (3.2080) grad_norm 2.5231 (2.4714) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][840/1251] eta 0:01:39 lr 0.000557 wd 0.0500 time 0.2435 (0.2421) data time 0.0012 (0.0015) model time 0.2422 (0.2398) loss 3.3051 (3.2083) grad_norm 2.4946 (2.4692) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][850/1251] eta 0:01:37 lr 0.000557 wd 0.0500 time 0.2387 (0.2421) data time 0.0011 (0.0015) model time 0.2375 (0.2398) loss 3.3160 (3.2109) grad_norm 1.9758 (2.4695) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][860/1251] eta 0:01:34 lr 0.000557 wd 0.0500 time 0.2298 (0.2421) 
data time 0.0010 (0.0015) model time 0.2288 (0.2398) loss 2.2108 (3.2126) grad_norm 2.2609 (2.4738) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][870/1251] eta 0:01:32 lr 0.000557 wd 0.0500 time 0.2406 (0.2421) data time 0.0009 (0.0015) model time 0.2397 (0.2398) loss 3.6879 (3.2144) grad_norm 2.3272 (2.4772) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][880/1251] eta 0:01:29 lr 0.000557 wd 0.0500 time 0.2325 (0.2420) data time 0.0012 (0.0015) model time 0.2313 (0.2398) loss 3.3377 (3.2151) grad_norm 2.8187 (2.4795) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][890/1251] eta 0:01:27 lr 0.000556 wd 0.0500 time 0.2365 (0.2420) data time 0.0010 (0.0015) model time 0.2355 (0.2398) loss 2.4090 (3.2144) grad_norm 2.0071 (2.4794) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][900/1251] eta 0:01:24 lr 0.000556 wd 0.0500 time 0.2525 (0.2420) data time 0.0011 (0.0015) model time 0.2514 (0.2398) loss 3.9011 (3.2124) grad_norm 2.9261 (2.4814) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][910/1251] eta 0:01:22 lr 0.000556 wd 0.0500 time 0.2338 (0.2420) data time 0.0015 (0.0015) model time 0.2323 (0.2397) loss 2.6335 (3.2131) grad_norm 1.8179 (2.4819) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][920/1251] eta 0:01:20 lr 0.000556 wd 0.0500 time 0.2461 (0.2420) data time 0.0007 (0.0015) model time 0.2454 (0.2397) loss 3.8056 (3.2149) grad_norm 1.7493 (2.4852) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [150/300][930/1251] eta 0:01:17 lr 0.000556 wd 0.0500 time 0.2388 (0.2419) data time 0.0010 (0.0015) model time 0.2378 (0.2397) loss 3.9181 (3.2152) grad_norm 2.2606 (2.4833) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][940/1251] eta 0:01:15 lr 0.000556 wd 0.0500 time 0.2374 (0.2419) data time 0.0009 (0.0015) model time 0.2365 (0.2397) loss 2.9679 (3.2135) grad_norm 1.7447 (2.4814) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][950/1251] eta 0:01:12 lr 0.000556 wd 0.0500 time 0.2401 (0.2419) data time 0.0009 (0.0015) model time 0.2392 (0.2397) loss 2.1594 (3.2111) grad_norm 2.6146 (2.4816) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][960/1251] eta 0:01:10 lr 0.000556 wd 0.0500 time 0.2454 (0.2419) data time 0.0009 (0.0015) model time 0.2445 (0.2397) loss 3.7106 (3.2094) grad_norm 1.8795 (2.4812) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][970/1251] eta 0:01:07 lr 0.000556 wd 0.0500 time 0.2482 (0.2419) data time 0.0009 (0.0015) model time 0.2473 (0.2397) loss 3.0304 (3.2111) grad_norm 3.3681 (2.4818) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][980/1251] eta 0:01:05 lr 0.000556 wd 0.0500 time 0.2580 (0.2419) data time 0.0011 (0.0015) model time 0.2570 (0.2397) loss 3.3792 (3.2113) grad_norm 2.5452 (2.4792) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][990/1251] eta 0:01:03 lr 0.000556 wd 0.0500 time 0.2350 (0.2418) data time 0.0008 (0.0015) model time 0.2342 (0.2397) loss 3.4912 (3.2126) grad_norm 
2.0080 (2.4803) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1000/1251] eta 0:01:00 lr 0.000556 wd 0.0500 time 0.2415 (0.2418) data time 0.0012 (0.0015) model time 0.2403 (0.2397) loss 3.2679 (3.2127) grad_norm 1.9688 (2.4805) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1010/1251] eta 0:00:58 lr 0.000556 wd 0.0500 time 0.2463 (0.2418) data time 0.0008 (0.0015) model time 0.2455 (0.2397) loss 4.0735 (3.2130) grad_norm 2.1260 (2.4778) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1020/1251] eta 0:00:55 lr 0.000556 wd 0.0500 time 0.2518 (0.2418) data time 0.0008 (0.0015) model time 0.2510 (0.2397) loss 3.7376 (3.2116) grad_norm 2.9986 (2.4846) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1030/1251] eta 0:00:53 lr 0.000556 wd 0.0500 time 0.2360 (0.2418) data time 0.0011 (0.0015) model time 0.2349 (0.2397) loss 3.2871 (3.2111) grad_norm 1.7261 (2.4861) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1040/1251] eta 0:00:51 lr 0.000556 wd 0.0500 time 0.2405 (0.2418) data time 0.0010 (0.0015) model time 0.2395 (0.2397) loss 3.1677 (3.2129) grad_norm 1.7708 (2.4810) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1050/1251] eta 0:00:48 lr 0.000556 wd 0.0500 time 0.2446 (0.2418) data time 0.0005 (0.0015) model time 0.2441 (0.2397) loss 4.0814 (3.2129) grad_norm 2.4594 (2.4779) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1060/1251] eta 0:00:46 lr 0.000556 wd 
0.0500 time 0.2390 (0.2418) data time 0.0008 (0.0014) model time 0.2381 (0.2397) loss 3.6247 (3.2156) grad_norm 2.9548 (2.4756) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1070/1251] eta 0:00:43 lr 0.000556 wd 0.0500 time 0.2365 (0.2418) data time 0.0010 (0.0014) model time 0.2354 (0.2397) loss 3.3077 (3.2165) grad_norm 1.8773 (2.4756) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1080/1251] eta 0:00:41 lr 0.000556 wd 0.0500 time 0.2491 (0.2418) data time 0.0009 (0.0014) model time 0.2482 (0.2397) loss 3.6332 (3.2190) grad_norm 2.3249 (2.4750) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1090/1251] eta 0:00:38 lr 0.000556 wd 0.0500 time 0.2421 (0.2418) data time 0.0008 (0.0014) model time 0.2413 (0.2398) loss 3.9503 (3.2185) grad_norm 2.0390 (2.4743) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1100/1251] eta 0:00:36 lr 0.000556 wd 0.0500 time 0.2376 (0.2418) data time 0.0009 (0.0014) model time 0.2368 (0.2398) loss 3.2559 (3.2174) grad_norm 2.5082 (2.4743) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1110/1251] eta 0:00:34 lr 0.000556 wd 0.0500 time 0.2500 (0.2418) data time 0.0009 (0.0014) model time 0.2491 (0.2398) loss 3.2516 (3.2185) grad_norm 2.3806 (2.4725) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1120/1251] eta 0:00:31 lr 0.000555 wd 0.0500 time 0.2377 (0.2418) data time 0.0010 (0.0014) model time 0.2367 (0.2397) loss 3.2811 (3.2199) grad_norm 1.6186 (2.4715) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1130/1251] eta 0:00:29 lr 0.000555 wd 0.0500 time 0.2461 (0.2417) data time 0.0010 (0.0014) model time 0.2451 (0.2397) loss 4.2493 (3.2203) grad_norm 2.8198 (2.4825) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1140/1251] eta 0:00:26 lr 0.000555 wd 0.0500 time 0.2416 (0.2417) data time 0.0010 (0.0014) model time 0.2406 (0.2397) loss 3.1372 (3.2188) grad_norm 3.1491 (2.4835) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1150/1251] eta 0:00:24 lr 0.000555 wd 0.0500 time 0.2419 (0.2417) data time 0.0008 (0.0014) model time 0.2411 (0.2397) loss 2.4000 (3.2160) grad_norm 2.3201 (2.4848) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1160/1251] eta 0:00:21 lr 0.000555 wd 0.0500 time 0.2465 (0.2417) data time 0.0010 (0.0014) model time 0.2455 (0.2397) loss 3.8734 (3.2172) grad_norm 2.1084 (2.4846) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1170/1251] eta 0:00:19 lr 0.000555 wd 0.0500 time 0.2394 (0.2417) data time 0.0010 (0.0014) model time 0.2384 (0.2397) loss 4.0172 (3.2209) grad_norm 1.6845 (2.4836) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1180/1251] eta 0:00:17 lr 0.000555 wd 0.0500 time 0.2510 (0.2417) data time 0.0008 (0.0014) model time 0.2502 (0.2397) loss 3.3838 (3.2236) grad_norm 1.9134 (2.4800) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1190/1251] eta 0:00:14 lr 0.000555 wd 0.0500 time 0.2383 (0.2417) data time 0.0009 (0.0014) model time 0.2374 (0.2397) loss 
3.2955 (3.2226) grad_norm 2.0310 (2.4821) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1200/1251] eta 0:00:12 lr 0.000555 wd 0.0500 time 0.2688 (0.2417) data time 0.0009 (0.0014) model time 0.2680 (0.2397) loss 3.6135 (3.2236) grad_norm 3.5445 (2.4822) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1210/1251] eta 0:00:09 lr 0.000555 wd 0.0500 time 0.2339 (0.2417) data time 0.0013 (0.0014) model time 0.2326 (0.2397) loss 3.3206 (3.2251) grad_norm 3.1915 (2.4860) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1220/1251] eta 0:00:07 lr 0.000555 wd 0.0500 time 0.2363 (0.2417) data time 0.0008 (0.0014) model time 0.2355 (0.2397) loss 3.9335 (3.2221) grad_norm 2.6911 (2.4889) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1230/1251] eta 0:00:05 lr 0.000555 wd 0.0500 time 0.2353 (0.2416) data time 0.0011 (0.0014) model time 0.2342 (0.2397) loss 3.2497 (3.2215) grad_norm 3.4504 (2.4895) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1240/1251] eta 0:00:02 lr 0.000555 wd 0.0500 time 0.2275 (0.2416) data time 0.0005 (0.0014) model time 0.2270 (0.2396) loss 3.6306 (3.2199) grad_norm 2.9453 (2.4919) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [150/300][1250/1251] eta 0:00:00 lr 0.000555 wd 0.0500 time 0.2255 (0.2415) data time 0.0005 (0.0014) model time 0.2250 (0.2395) loss 2.7729 (3.2203) grad_norm 2.2476 (2.4892) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 150 training takes 0:05:02 
[2024-08-27 09:22:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:22:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.463 (0.463) Loss 0.4856 (0.4856) Acc@1 90.234 (90.234) Acc@5 98.242 (98.242) Mem 7382MB [2024-08-27 09:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.111) Loss 0.7769 (0.7252) Acc@1 82.812 (84.082) Acc@5 96.289 (96.875) Mem 7382MB [2024-08-27 09:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.097) Loss 1.0332 (0.7448) Acc@1 75.000 (83.203) Acc@5 93.262 (96.819) Mem 7382MB [2024-08-27 09:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.2305 (0.8476) Acc@1 68.945 (80.822) Acc@5 91.699 (95.637) Mem 7382MB [2024-08-27 09:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1484 (0.9037) Acc@1 72.461 (79.342) Acc@5 92.285 (95.024) Mem 7382MB [2024-08-27 09:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.860 Acc@5 94.944 [2024-08-27 09:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.9% [2024-08-27 09:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.742 (0.742) Loss 0.4023 (0.4023) Acc@1 93.066 (93.066) Acc@5 98.633 (98.633) Mem 7382MB [2024-08-27 09:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.142) Loss 0.6445 (0.6386) Acc@1 86.621 (86.337) Acc@5 97.070 (97.301) Mem 7382MB [2024-08-27 09:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.111) Loss 0.9053 (0.6628) Acc@1 78.711 (85.412) Acc@5 95.410 (97.335) Mem 7382MB 
[2024-08-27 09:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.100) Loss 1.1543 (0.7532) Acc@1 70.508 (83.099) Acc@5 93.164 (96.371) Mem 7382MB [2024-08-27 09:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.092) Loss 1.0371 (0.8004) Acc@1 74.512 (81.724) Acc@5 93.750 (95.865) Mem 7382MB [2024-08-27 09:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.278 Acc@5 95.826 [2024-08-27 09:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-27 09:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][0/1251] eta 0:22:34 lr 0.000555 wd 0.0500 time 1.0827 (1.0827) data time 0.6866 (0.6866) model time 0.0000 (0.0000) loss 3.5136 (3.5136) grad_norm 1.8744 (1.8744) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][10/1251] eta 0:06:31 lr 0.000555 wd 0.0500 time 0.2370 (0.3151) data time 0.0007 (0.0633) model time 0.0000 (0.0000) loss 2.9153 (3.4129) grad_norm 2.7245 (2.2192) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][20/1251] eta 0:05:43 lr 0.000555 wd 0.0500 time 0.2401 (0.2787) data time 0.0011 (0.0336) model time 0.0000 (0.0000) loss 2.8197 (3.1219) grad_norm 3.0193 (2.2124) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][30/1251] eta 0:05:24 lr 0.000555 wd 0.0500 time 0.2454 (0.2660) data time 0.0010 (0.0231) model time 0.0000 (0.0000) loss 3.1784 (3.1819) grad_norm 2.5795 (2.3599) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][40/1251] eta 0:05:13 lr 0.000555 wd 0.0500 time 0.2439 (0.2593) data time 0.0008 (0.0177) model time 0.0000 
(0.0000) loss 2.8483 (3.1829) grad_norm 2.0791 (2.4803) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][50/1251] eta 0:05:06 lr 0.000555 wd 0.0500 time 0.2532 (0.2555) data time 0.0007 (0.0145) model time 0.0000 (0.0000) loss 3.8048 (3.2175) grad_norm 1.9605 (2.4517) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][60/1251] eta 0:05:01 lr 0.000555 wd 0.0500 time 0.2378 (0.2527) data time 0.0009 (0.0123) model time 0.2369 (0.2376) loss 2.5365 (3.2251) grad_norm 2.3255 (2.4356) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][70/1251] eta 0:04:56 lr 0.000555 wd 0.0500 time 0.2384 (0.2507) data time 0.0008 (0.0107) model time 0.2377 (0.2375) loss 3.6858 (3.2087) grad_norm 3.7887 (2.4702) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][80/1251] eta 0:04:51 lr 0.000555 wd 0.0500 time 0.2422 (0.2491) data time 0.0008 (0.0095) model time 0.2414 (0.2372) loss 3.8823 (3.2120) grad_norm 2.3949 (2.4610) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][90/1251] eta 0:04:48 lr 0.000555 wd 0.0500 time 0.2522 (0.2483) data time 0.0010 (0.0086) model time 0.2512 (0.2380) loss 3.1354 (3.2204) grad_norm 2.2173 (2.4802) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][100/1251] eta 0:04:44 lr 0.000554 wd 0.0500 time 0.2378 (0.2474) data time 0.0009 (0.0078) model time 0.2370 (0.2381) loss 2.5685 (3.1998) grad_norm 1.9272 (2.4854) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][110/1251] eta 
0:04:41 lr 0.000554 wd 0.0500 time 0.2371 (0.2467) data time 0.0009 (0.0072) model time 0.2362 (0.2382) loss 3.2500 (3.2175) grad_norm 1.8107 (2.4977) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][120/1251] eta 0:04:38 lr 0.000554 wd 0.0500 time 0.2402 (0.2462) data time 0.0008 (0.0067) model time 0.2394 (0.2384) loss 3.5586 (3.2497) grad_norm 1.9353 (2.4988) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][130/1251] eta 0:04:37 lr 0.000554 wd 0.0500 time 0.4462 (0.2472) data time 0.0009 (0.0063) model time 0.4453 (0.2408) loss 3.8271 (3.2512) grad_norm 3.9810 (2.5288) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][140/1251] eta 0:04:37 lr 0.000554 wd 0.0500 time 0.2289 (0.2496) data time 0.0011 (0.0059) model time 0.2278 (0.2452) loss 2.7454 (3.2375) grad_norm 3.3209 (2.5763) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][150/1251] eta 0:04:33 lr 0.000554 wd 0.0500 time 0.2462 (0.2488) data time 0.0009 (0.0056) model time 0.2453 (0.2443) loss 2.5766 (3.2166) grad_norm 2.3159 (2.5948) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][160/1251] eta 0:04:30 lr 0.000554 wd 0.0500 time 0.2522 (0.2482) data time 0.0007 (0.0053) model time 0.2514 (0.2438) loss 2.3107 (3.2190) grad_norm 1.9787 (2.5644) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][170/1251] eta 0:04:27 lr 0.000554 wd 0.0500 time 0.2436 (0.2476) data time 0.0008 (0.0050) model time 0.2429 (0.2432) loss 2.9665 (3.2185) grad_norm 2.2142 (2.5485) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 09:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][180/1251] eta 0:04:24 lr 0.000554 wd 0.0500 time 0.2289 (0.2472) data time 0.0011 (0.0048) model time 0.2278 (0.2429) loss 3.4326 (3.2147) grad_norm 1.6651 (2.5298) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][190/1251] eta 0:04:21 lr 0.000554 wd 0.0500 time 0.2439 (0.2469) data time 0.0007 (0.0046) model time 0.2432 (0.2427) loss 4.0863 (3.2185) grad_norm 3.6258 (2.5715) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][200/1251] eta 0:04:19 lr 0.000554 wd 0.0500 time 0.2374 (0.2465) data time 0.0010 (0.0044) model time 0.2364 (0.2424) loss 3.2112 (3.2244) grad_norm 2.6312 (2.5648) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][210/1251] eta 0:04:16 lr 0.000554 wd 0.0500 time 0.2512 (0.2463) data time 0.0008 (0.0043) model time 0.2504 (0.2423) loss 2.5695 (3.2077) grad_norm 2.8077 (2.5632) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][220/1251] eta 0:04:13 lr 0.000554 wd 0.0500 time 0.2410 (0.2459) data time 0.0010 (0.0041) model time 0.2400 (0.2420) loss 3.4664 (3.2007) grad_norm 3.2720 (2.5505) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][230/1251] eta 0:04:10 lr 0.000554 wd 0.0500 time 0.2451 (0.2456) data time 0.0011 (0.0040) model time 0.2440 (0.2418) loss 3.1612 (3.1989) grad_norm 2.7184 (2.5448) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][240/1251] eta 0:04:08 lr 0.000554 wd 0.0500 time 0.2356 (0.2453) data time 0.0009 (0.0039) model time 0.2347 
(0.2415) loss 3.6470 (3.1985) grad_norm 2.6874 (2.5530) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][250/1251] eta 0:04:06 lr 0.000554 wd 0.0500 time 0.4865 (0.2462) data time 0.0010 (0.0038) model time 0.4855 (0.2428) loss 3.5664 (3.2073) grad_norm 1.8763 (2.5340) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][260/1251] eta 0:04:03 lr 0.000554 wd 0.0500 time 0.2326 (0.2458) data time 0.0012 (0.0037) model time 0.2314 (0.2424) loss 3.5247 (3.2122) grad_norm 2.0630 (2.5250) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][270/1251] eta 0:04:00 lr 0.000554 wd 0.0500 time 0.2366 (0.2456) data time 0.0009 (0.0036) model time 0.2357 (0.2423) loss 3.8500 (3.2121) grad_norm 1.7155 (2.5189) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][280/1251] eta 0:03:58 lr 0.000554 wd 0.0500 time 0.2395 (0.2454) data time 0.0011 (0.0035) model time 0.2384 (0.2421) loss 3.6212 (3.2186) grad_norm 2.0653 (2.5097) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][290/1251] eta 0:03:55 lr 0.000554 wd 0.0500 time 0.2488 (0.2452) data time 0.0011 (0.0034) model time 0.2477 (0.2420) loss 3.3672 (3.2233) grad_norm 2.1946 (2.5058) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][300/1251] eta 0:03:53 lr 0.000554 wd 0.0500 time 0.2396 (0.2450) data time 0.0010 (0.0033) model time 0.2386 (0.2419) loss 4.1989 (3.2163) grad_norm 2.9819 (2.5030) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[151/300][310/1251] eta 0:03:50 lr 0.000554 wd 0.0500 time 0.2456 (0.2448) data time 0.0010 (0.0032) model time 0.2445 (0.2417) loss 3.4525 (3.1993) grad_norm 2.3742 (2.5145) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][320/1251] eta 0:03:47 lr 0.000553 wd 0.0500 time 0.2468 (0.2447) data time 0.0009 (0.0032) model time 0.2458 (0.2416) loss 2.9729 (3.2000) grad_norm 1.9386 (2.5052) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][330/1251] eta 0:03:45 lr 0.000553 wd 0.0500 time 0.2409 (0.2446) data time 0.0011 (0.0031) model time 0.2398 (0.2416) loss 3.6225 (3.1967) grad_norm 1.9883 (2.4886) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][340/1251] eta 0:03:42 lr 0.000553 wd 0.0500 time 0.2419 (0.2445) data time 0.0008 (0.0031) model time 0.2412 (0.2415) loss 4.0272 (3.2113) grad_norm 5.4817 (2.5017) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][350/1251] eta 0:03:40 lr 0.000553 wd 0.0500 time 0.2432 (0.2444) data time 0.0012 (0.0030) model time 0.2421 (0.2415) loss 2.4567 (3.2137) grad_norm 1.9666 (2.5078) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][360/1251] eta 0:03:37 lr 0.000553 wd 0.0500 time 0.2390 (0.2442) data time 0.0011 (0.0029) model time 0.2379 (0.2413) loss 2.9635 (3.2145) grad_norm 2.1360 (2.5044) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][370/1251] eta 0:03:35 lr 0.000553 wd 0.0500 time 0.2346 (0.2441) data time 0.0009 (0.0029) model time 0.2338 (0.2412) loss 2.9379 (3.2184) grad_norm 2.2523 (2.5146) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 09:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][380/1251] eta 0:03:32 lr 0.000553 wd 0.0500 time 0.2348 (0.2440) data time 0.0008 (0.0028) model time 0.2340 (0.2412) loss 2.4520 (3.2135) grad_norm 2.2594 (2.5083) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][390/1251] eta 0:03:30 lr 0.000553 wd 0.0500 time 0.2405 (0.2439) data time 0.0012 (0.0028) model time 0.2392 (0.2411) loss 3.2584 (3.2071) grad_norm 2.1150 (2.4951) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][400/1251] eta 0:03:27 lr 0.000553 wd 0.0500 time 0.2409 (0.2438) data time 0.0010 (0.0028) model time 0.2399 (0.2411) loss 3.6660 (3.2091) grad_norm 3.0853 (2.4895) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][410/1251] eta 0:03:24 lr 0.000553 wd 0.0500 time 0.2333 (0.2437) data time 0.0013 (0.0027) model time 0.2319 (0.2410) loss 2.8604 (3.2054) grad_norm 2.5666 (2.5007) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][420/1251] eta 0:03:22 lr 0.000553 wd 0.0500 time 0.2373 (0.2436) data time 0.0009 (0.0027) model time 0.2364 (0.2409) loss 2.0942 (3.2027) grad_norm 2.2325 (2.5108) loss_scale 4096.0000 (2067.4584) mem 7382MB [2024-08-27 09:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][430/1251] eta 0:03:19 lr 0.000553 wd 0.0500 time 0.2482 (0.2435) data time 0.0010 (0.0027) model time 0.2472 (0.2408) loss 3.3803 (3.1933) grad_norm 2.1360 (2.5040) loss_scale 4096.0000 (2114.5244) mem 7382MB [2024-08-27 09:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][440/1251] eta 0:03:17 lr 0.000553 wd 0.0500 time 0.2373 (0.2434) data time 0.0010 
(0.0026) model time 0.2363 (0.2408) loss 3.2409 (3.1975) grad_norm 2.5542 (2.5166) loss_scale 4096.0000 (2159.4558) mem 7382MB [2024-08-27 09:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][450/1251] eta 0:03:14 lr 0.000553 wd 0.0500 time 0.2364 (0.2434) data time 0.0011 (0.0026) model time 0.2353 (0.2407) loss 3.1880 (3.1952) grad_norm 2.7308 (2.5360) loss_scale 4096.0000 (2202.3947) mem 7382MB [2024-08-27 09:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][460/1251] eta 0:03:12 lr 0.000553 wd 0.0500 time 0.2431 (0.2433) data time 0.0007 (0.0025) model time 0.2424 (0.2407) loss 2.3516 (3.1960) grad_norm 1.5903 (2.5248) loss_scale 4096.0000 (2243.4707) mem 7382MB [2024-08-27 09:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][470/1251] eta 0:03:10 lr 0.000553 wd 0.0500 time 0.2453 (0.2433) data time 0.0008 (0.0025) model time 0.2445 (0.2407) loss 3.1031 (3.1937) grad_norm 1.9301 (2.5217) loss_scale 4096.0000 (2282.8025) mem 7382MB [2024-08-27 09:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][480/1251] eta 0:03:07 lr 0.000553 wd 0.0500 time 0.2434 (0.2432) data time 0.0009 (0.0025) model time 0.2425 (0.2406) loss 2.1613 (3.1921) grad_norm 2.1994 (2.5181) loss_scale 4096.0000 (2320.4990) mem 7382MB [2024-08-27 09:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][490/1251] eta 0:03:04 lr 0.000553 wd 0.0500 time 0.2436 (0.2431) data time 0.0011 (0.0025) model time 0.2425 (0.2406) loss 2.7575 (3.1946) grad_norm 3.3377 (2.5186) loss_scale 4096.0000 (2356.6599) mem 7382MB [2024-08-27 09:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][500/1251] eta 0:03:02 lr 0.000553 wd 0.0500 time 0.2475 (0.2430) data time 0.0007 (0.0024) model time 0.2468 (0.2405) loss 3.1570 (3.1913) grad_norm 4.8693 (2.5330) loss_scale 4096.0000 (2391.3772) mem 7382MB [2024-08-27 09:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [151/300][510/1251] eta 0:03:00 lr 0.000553 wd 0.0500 time 0.2404 (0.2429) data time 0.0010 (0.0024) model time 0.2395 (0.2405) loss 3.6645 (3.1928) grad_norm 1.8122 (2.5243) loss_scale 4096.0000 (2424.7358) mem 7382MB [2024-08-27 09:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][520/1251] eta 0:02:57 lr 0.000553 wd 0.0500 time 0.2359 (0.2428) data time 0.0011 (0.0024) model time 0.2348 (0.2404) loss 3.4854 (3.1944) grad_norm 2.1442 (2.5183) loss_scale 4096.0000 (2456.8138) mem 7382MB [2024-08-27 09:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][530/1251] eta 0:02:55 lr 0.000553 wd 0.0500 time 0.2328 (0.2427) data time 0.0011 (0.0023) model time 0.2317 (0.2403) loss 3.5694 (3.1999) grad_norm 2.5379 (2.5105) loss_scale 4096.0000 (2487.6836) mem 7382MB [2024-08-27 09:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][540/1251] eta 0:02:52 lr 0.000553 wd 0.0500 time 0.2318 (0.2427) data time 0.0009 (0.0023) model time 0.2308 (0.2403) loss 2.8390 (3.1971) grad_norm 2.2456 (2.5133) loss_scale 4096.0000 (2517.4122) mem 7382MB [2024-08-27 09:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][550/1251] eta 0:02:50 lr 0.000552 wd 0.0500 time 0.2363 (0.2426) data time 0.0008 (0.0023) model time 0.2355 (0.2402) loss 1.9365 (3.1893) grad_norm 2.1645 (2.5064) loss_scale 4096.0000 (2546.0617) mem 7382MB [2024-08-27 09:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][560/1251] eta 0:02:47 lr 0.000552 wd 0.0500 time 0.2453 (0.2425) data time 0.0010 (0.0023) model time 0.2443 (0.2402) loss 3.0213 (3.1847) grad_norm 1.8415 (2.5040) loss_scale 4096.0000 (2573.6898) mem 7382MB [2024-08-27 09:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][570/1251] eta 0:02:45 lr 0.000552 wd 0.0500 time 0.2388 (0.2425) data time 0.0010 (0.0023) model time 0.2378 (0.2401) loss 3.5146 (3.1867) grad_norm 2.1133 (2.4998) loss_scale 
4096.0000 (2600.3503) mem 7382MB [2024-08-27 09:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][580/1251] eta 0:02:42 lr 0.000552 wd 0.0500 time 0.2390 (0.2424) data time 0.0010 (0.0022) model time 0.2381 (0.2401) loss 3.6259 (3.1888) grad_norm 1.5944 (2.4910) loss_scale 4096.0000 (2626.0929) mem 7382MB [2024-08-27 09:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][590/1251] eta 0:02:40 lr 0.000552 wd 0.0500 time 0.2372 (0.2423) data time 0.0010 (0.0022) model time 0.2361 (0.2400) loss 3.1507 (3.1875) grad_norm 2.7284 (2.4952) loss_scale 4096.0000 (2650.9645) mem 7382MB [2024-08-27 09:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][600/1251] eta 0:02:37 lr 0.000552 wd 0.0500 time 0.2423 (0.2423) data time 0.0009 (0.0022) model time 0.2414 (0.2400) loss 3.1121 (3.1859) grad_norm 3.0165 (2.4979) loss_scale 4096.0000 (2675.0083) mem 7382MB [2024-08-27 09:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][610/1251] eta 0:02:35 lr 0.000552 wd 0.0500 time 0.2304 (0.2422) data time 0.0009 (0.0022) model time 0.2295 (0.2399) loss 3.4651 (3.1843) grad_norm 2.0243 (2.4963) loss_scale 4096.0000 (2698.2651) mem 7382MB [2024-08-27 09:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][620/1251] eta 0:02:32 lr 0.000552 wd 0.0500 time 0.2577 (0.2422) data time 0.0008 (0.0022) model time 0.2568 (0.2399) loss 3.1656 (3.1862) grad_norm 1.8914 (2.5005) loss_scale 4096.0000 (2720.7729) mem 7382MB [2024-08-27 09:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][630/1251] eta 0:02:30 lr 0.000552 wd 0.0500 time 0.2353 (0.2421) data time 0.0011 (0.0021) model time 0.2342 (0.2399) loss 2.2417 (3.1869) grad_norm 2.4389 (2.4959) loss_scale 4096.0000 (2742.5674) mem 7382MB [2024-08-27 09:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][640/1251] eta 0:02:27 lr 0.000552 wd 0.0500 time 0.2434 (0.2421) data time 
0.0009 (0.0021) model time 0.2425 (0.2398) loss 2.9784 (3.1877) grad_norm 2.1589 (2.4943) loss_scale 4096.0000 (2763.6817) mem 7382MB [2024-08-27 09:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][650/1251] eta 0:02:25 lr 0.000552 wd 0.0500 time 0.2398 (0.2420) data time 0.0011 (0.0021) model time 0.2388 (0.2398) loss 3.4085 (3.1899) grad_norm 1.7639 (2.4953) loss_scale 4096.0000 (2784.1475) mem 7382MB [2024-08-27 09:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][660/1251] eta 0:02:23 lr 0.000552 wd 0.0500 time 0.2440 (0.2420) data time 0.0011 (0.0021) model time 0.2429 (0.2398) loss 3.0339 (3.1914) grad_norm 2.4267 (2.4926) loss_scale 4096.0000 (2803.9939) mem 7382MB [2024-08-27 09:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][670/1251] eta 0:02:20 lr 0.000552 wd 0.0500 time 0.2388 (0.2419) data time 0.0010 (0.0021) model time 0.2378 (0.2397) loss 3.0977 (3.1928) grad_norm 2.5077 (2.4999) loss_scale 4096.0000 (2823.2489) mem 7382MB [2024-08-27 09:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][680/1251] eta 0:02:18 lr 0.000552 wd 0.0500 time 0.2424 (0.2419) data time 0.0008 (0.0021) model time 0.2416 (0.2397) loss 3.7596 (3.1965) grad_norm 2.3064 (2.4998) loss_scale 4096.0000 (2841.9383) mem 7382MB [2024-08-27 09:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][690/1251] eta 0:02:15 lr 0.000552 wd 0.0500 time 0.2365 (0.2418) data time 0.0010 (0.0020) model time 0.2355 (0.2397) loss 3.3200 (3.1955) grad_norm 2.2149 (2.4970) loss_scale 4096.0000 (2860.0868) mem 7382MB [2024-08-27 09:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][700/1251] eta 0:02:13 lr 0.000552 wd 0.0500 time 0.2352 (0.2418) data time 0.0009 (0.0020) model time 0.2343 (0.2396) loss 3.3952 (3.1968) grad_norm 2.9255 (2.4971) loss_scale 4096.0000 (2877.7175) mem 7382MB [2024-08-27 09:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [151/300][710/1251] eta 0:02:10 lr 0.000552 wd 0.0500 time 0.2377 (0.2418) data time 0.0011 (0.0020) model time 0.2366 (0.2396) loss 2.0424 (3.1973) grad_norm 2.0714 (2.4955) loss_scale 4096.0000 (2894.8523) mem 7382MB [2024-08-27 09:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][720/1251] eta 0:02:08 lr 0.000552 wd 0.0500 time 0.2406 (0.2417) data time 0.0011 (0.0020) model time 0.2395 (0.2396) loss 3.1972 (3.1927) grad_norm 2.9093 (inf) loss_scale 2048.0000 (2891.6283) mem 7382MB [2024-08-27 09:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][730/1251] eta 0:02:05 lr 0.000552 wd 0.0500 time 0.2368 (0.2416) data time 0.0011 (0.0020) model time 0.2357 (0.2395) loss 2.9138 (3.1931) grad_norm 2.0231 (inf) loss_scale 2048.0000 (2880.0876) mem 7382MB [2024-08-27 09:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][740/1251] eta 0:02:03 lr 0.000552 wd 0.0500 time 0.2348 (0.2416) data time 0.0012 (0.0020) model time 0.2336 (0.2395) loss 3.5982 (3.1933) grad_norm 2.1143 (inf) loss_scale 2048.0000 (2868.8583) mem 7382MB [2024-08-27 09:25:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][750/1251] eta 0:02:01 lr 0.000552 wd 0.0500 time 0.2392 (0.2415) data time 0.0008 (0.0020) model time 0.2383 (0.2395) loss 2.9510 (3.1944) grad_norm 2.3936 (inf) loss_scale 2048.0000 (2857.9281) mem 7382MB [2024-08-27 09:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][760/1251] eta 0:01:58 lr 0.000552 wd 0.0500 time 0.2332 (0.2415) data time 0.0010 (0.0020) model time 0.2323 (0.2395) loss 3.2670 (3.1972) grad_norm 2.3920 (inf) loss_scale 2048.0000 (2847.2852) mem 7382MB [2024-08-27 09:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][770/1251] eta 0:01:56 lr 0.000551 wd 0.0500 time 0.2421 (0.2415) data time 0.0010 (0.0019) model time 0.2411 (0.2394) loss 3.1526 (3.1959) grad_norm 3.2373 (inf) loss_scale 2048.0000 
(2836.9183) mem 7382MB [2024-08-27 09:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][780/1251] eta 0:01:53 lr 0.000551 wd 0.0500 time 0.2393 (0.2415) data time 0.0008 (0.0019) model time 0.2385 (0.2394) loss 3.0131 (3.1987) grad_norm 1.9507 (inf) loss_scale 2048.0000 (2826.8169) mem 7382MB [2024-08-27 09:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][790/1251] eta 0:01:51 lr 0.000551 wd 0.0500 time 0.2391 (0.2417) data time 0.0010 (0.0019) model time 0.2381 (0.2397) loss 3.2687 (3.1979) grad_norm 1.8808 (inf) loss_scale 1024.0000 (2813.0872) mem 7382MB [2024-08-27 09:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][800/1251] eta 0:01:48 lr 0.000551 wd 0.0500 time 0.2349 (0.2416) data time 0.0011 (0.0019) model time 0.2338 (0.2396) loss 3.1517 (3.1999) grad_norm 3.6348 (inf) loss_scale 1024.0000 (2790.7516) mem 7382MB [2024-08-27 09:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][810/1251] eta 0:01:46 lr 0.000551 wd 0.0500 time 0.2493 (0.2416) data time 0.0012 (0.0019) model time 0.2482 (0.2396) loss 3.4534 (3.1986) grad_norm 3.3606 (inf) loss_scale 1024.0000 (2768.9667) mem 7382MB [2024-08-27 09:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][820/1251] eta 0:01:44 lr 0.000551 wd 0.0500 time 0.2367 (0.2416) data time 0.0011 (0.0019) model time 0.2356 (0.2396) loss 3.5248 (3.1989) grad_norm 3.3478 (inf) loss_scale 1024.0000 (2747.7125) mem 7382MB [2024-08-27 09:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][830/1251] eta 0:01:41 lr 0.000551 wd 0.0500 time 0.2376 (0.2416) data time 0.0010 (0.0019) model time 0.2366 (0.2396) loss 3.5369 (3.1975) grad_norm 2.6151 (inf) loss_scale 1024.0000 (2726.9699) mem 7382MB [2024-08-27 09:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][840/1251] eta 0:01:39 lr 0.000551 wd 0.0500 time 0.2435 (0.2415) data time 0.0010 (0.0019) model time 
0.2425 (0.2396) loss 3.5410 (3.1969) grad_norm 2.3187 (inf) loss_scale 1024.0000 (2706.7206) mem 7382MB [2024-08-27 09:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][850/1251] eta 0:01:36 lr 0.000551 wd 0.0500 time 0.2342 (0.2415) data time 0.0010 (0.0019) model time 0.2332 (0.2396) loss 2.0598 (3.1957) grad_norm 2.6641 (inf) loss_scale 1024.0000 (2686.9471) mem 7382MB [2024-08-27 09:25:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][860/1251] eta 0:01:34 lr 0.000551 wd 0.0500 time 0.2390 (0.2415) data time 0.0009 (0.0019) model time 0.2380 (0.2395) loss 3.1282 (3.1948) grad_norm 2.0341 (inf) loss_scale 1024.0000 (2667.6330) mem 7382MB [2024-08-27 09:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][870/1251] eta 0:01:31 lr 0.000551 wd 0.0500 time 0.2403 (0.2414) data time 0.0008 (0.0018) model time 0.2395 (0.2395) loss 3.3199 (3.1943) grad_norm 3.6833 (inf) loss_scale 1024.0000 (2648.7623) mem 7382MB [2024-08-27 09:25:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][880/1251] eta 0:01:29 lr 0.000551 wd 0.0500 time 0.2338 (0.2414) data time 0.0009 (0.0018) model time 0.2328 (0.2395) loss 3.7412 (3.1941) grad_norm 1.6285 (inf) loss_scale 1024.0000 (2630.3201) mem 7382MB [2024-08-27 09:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][890/1251] eta 0:01:27 lr 0.000551 wd 0.0500 time 0.2375 (0.2414) data time 0.0011 (0.0018) model time 0.2364 (0.2395) loss 3.1903 (3.1973) grad_norm 2.5632 (inf) loss_scale 1024.0000 (2612.2918) mem 7382MB [2024-08-27 09:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][900/1251] eta 0:01:24 lr 0.000551 wd 0.0500 time 0.2349 (0.2414) data time 0.0012 (0.0018) model time 0.2337 (0.2395) loss 4.2093 (3.2011) grad_norm 2.7729 (inf) loss_scale 1024.0000 (2594.6637) mem 7382MB [2024-08-27 09:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][910/1251] eta 0:01:22 
lr 0.000551 wd 0.0500 time 0.2424 (0.2414) data time 0.0010 (0.0018) model time 0.2414 (0.2395) loss 2.6076 (3.2023) grad_norm 3.5997 (inf) loss_scale 1024.0000 (2577.4226) mem 7382MB [2024-08-27 09:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][920/1251] eta 0:01:19 lr 0.000551 wd 0.0500 time 0.2346 (0.2414) data time 0.0008 (0.0018) model time 0.2338 (0.2395) loss 3.2667 (3.2040) grad_norm 2.0264 (inf) loss_scale 1024.0000 (2560.5559) mem 7382MB [2024-08-27 09:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][930/1251] eta 0:01:17 lr 0.000551 wd 0.0500 time 0.2473 (0.2414) data time 0.0008 (0.0018) model time 0.2465 (0.2395) loss 3.4428 (3.2057) grad_norm 2.6481 (inf) loss_scale 1024.0000 (2544.0516) mem 7382MB [2024-08-27 09:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][940/1251] eta 0:01:15 lr 0.000551 wd 0.0500 time 0.2458 (0.2414) data time 0.0008 (0.0018) model time 0.2450 (0.2395) loss 2.7802 (3.2061) grad_norm 2.1436 (inf) loss_scale 1024.0000 (2527.8980) mem 7382MB [2024-08-27 09:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][950/1251] eta 0:01:12 lr 0.000551 wd 0.0500 time 0.2379 (0.2413) data time 0.0012 (0.0018) model time 0.2367 (0.2395) loss 2.6258 (3.2062) grad_norm 2.1844 (inf) loss_scale 1024.0000 (2512.0841) mem 7382MB [2024-08-27 09:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][960/1251] eta 0:01:10 lr 0.000551 wd 0.0500 time 0.2445 (0.2413) data time 0.0008 (0.0018) model time 0.2437 (0.2395) loss 3.2267 (3.2048) grad_norm 1.8424 (inf) loss_scale 1024.0000 (2496.5994) mem 7382MB [2024-08-27 09:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][970/1251] eta 0:01:07 lr 0.000551 wd 0.0500 time 0.2351 (0.2413) data time 0.0009 (0.0018) model time 0.2342 (0.2395) loss 3.5178 (3.2063) grad_norm 3.2651 (inf) loss_scale 1024.0000 (2481.4336) mem 7382MB [2024-08-27 09:26:23 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][980/1251] eta 0:01:05 lr 0.000551 wd 0.0500 time 0.2439 (0.2413) data time 0.0010 (0.0018) model time 0.2429 (0.2394) loss 3.3412 (3.2072) grad_norm 3.0184 (inf) loss_scale 1024.0000 (2466.5770) mem 7382MB [2024-08-27 09:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][990/1251] eta 0:01:02 lr 0.000551 wd 0.0500 time 0.2360 (0.2412) data time 0.0010 (0.0017) model time 0.2350 (0.2394) loss 3.4700 (3.2103) grad_norm 2.1995 (inf) loss_scale 1024.0000 (2452.0202) mem 7382MB [2024-08-27 09:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1000/1251] eta 0:01:00 lr 0.000550 wd 0.0500 time 0.2526 (0.2412) data time 0.0008 (0.0017) model time 0.2518 (0.2394) loss 4.0576 (3.2117) grad_norm 2.4838 (inf) loss_scale 1024.0000 (2437.7542) mem 7382MB [2024-08-27 09:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1010/1251] eta 0:00:58 lr 0.000550 wd 0.0500 time 0.2430 (0.2412) data time 0.0008 (0.0017) model time 0.2422 (0.2394) loss 2.7176 (3.2104) grad_norm 2.6770 (inf) loss_scale 1024.0000 (2423.7705) mem 7382MB [2024-08-27 09:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1020/1251] eta 0:00:55 lr 0.000550 wd 0.0500 time 0.2401 (0.2412) data time 0.0009 (0.0017) model time 0.2391 (0.2394) loss 2.9574 (3.2099) grad_norm 2.6016 (inf) loss_scale 1024.0000 (2410.0607) mem 7382MB [2024-08-27 09:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1030/1251] eta 0:00:53 lr 0.000550 wd 0.0500 time 0.2401 (0.2412) data time 0.0012 (0.0017) model time 0.2389 (0.2394) loss 2.9501 (3.2093) grad_norm 1.8336 (inf) loss_scale 1024.0000 (2396.6169) mem 7382MB [2024-08-27 09:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1040/1251] eta 0:00:50 lr 0.000550 wd 0.0500 time 0.2371 (0.2412) data time 0.0012 (0.0017) model time 0.2359 (0.2394) loss 3.0159 (3.2093) 
grad_norm 2.1864 (inf) loss_scale 1024.0000 (2383.4313) mem 7382MB [2024-08-27 09:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1050/1251] eta 0:00:48 lr 0.000550 wd 0.0500 time 0.2392 (0.2412) data time 0.0007 (0.0017) model time 0.2385 (0.2394) loss 3.9776 (3.2130) grad_norm 3.4534 (inf) loss_scale 1024.0000 (2370.4967) mem 7382MB [2024-08-27 09:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1060/1251] eta 0:00:46 lr 0.000550 wd 0.0500 time 0.4425 (0.2414) data time 0.0010 (0.0017) model time 0.4415 (0.2396) loss 3.3861 (3.2127) grad_norm 3.0840 (inf) loss_scale 1024.0000 (2357.8058) mem 7382MB [2024-08-27 09:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1070/1251] eta 0:00:43 lr 0.000550 wd 0.0500 time 0.2467 (0.2418) data time 0.0011 (0.0017) model time 0.2456 (0.2400) loss 2.4142 (3.2130) grad_norm 3.4284 (inf) loss_scale 1024.0000 (2345.3520) mem 7382MB [2024-08-27 09:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1080/1251] eta 0:00:41 lr 0.000550 wd 0.0500 time 0.2283 (0.2418) data time 0.0010 (0.0017) model time 0.2274 (0.2400) loss 2.5522 (3.2109) grad_norm 2.3662 (inf) loss_scale 1024.0000 (2333.1286) mem 7382MB [2024-08-27 09:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1090/1251] eta 0:00:38 lr 0.000550 wd 0.0500 time 0.2361 (0.2417) data time 0.0009 (0.0017) model time 0.2352 (0.2400) loss 3.7405 (3.2128) grad_norm 2.5760 (inf) loss_scale 1024.0000 (2321.1292) mem 7382MB [2024-08-27 09:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1100/1251] eta 0:00:36 lr 0.000550 wd 0.0500 time 0.2319 (0.2417) data time 0.0011 (0.0017) model time 0.2307 (0.2400) loss 3.3030 (3.2148) grad_norm 4.8040 (inf) loss_scale 1024.0000 (2309.3479) mem 7382MB [2024-08-27 09:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1110/1251] eta 0:00:34 lr 0.000550 wd 0.0500 time 
0.2427 (0.2417) data time 0.0007 (0.0017) model time 0.2420 (0.2400) loss 2.1080 (3.2122) grad_norm 2.0942 (inf) loss_scale 1024.0000 (2297.7786) mem 7382MB [2024-08-27 09:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1120/1251] eta 0:00:31 lr 0.000550 wd 0.0500 time 0.2401 (0.2417) data time 0.0010 (0.0017) model time 0.2391 (0.2399) loss 1.9756 (3.2101) grad_norm 2.2932 (inf) loss_scale 1024.0000 (2286.4157) mem 7382MB [2024-08-27 09:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1130/1251] eta 0:00:29 lr 0.000550 wd 0.0500 time 0.2323 (0.2416) data time 0.0009 (0.0017) model time 0.2314 (0.2399) loss 3.2669 (3.2084) grad_norm 1.7055 (inf) loss_scale 1024.0000 (2275.2538) mem 7382MB [2024-08-27 09:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1140/1251] eta 0:00:26 lr 0.000550 wd 0.0500 time 0.2373 (0.2416) data time 0.0011 (0.0017) model time 0.2362 (0.2399) loss 3.8905 (3.2087) grad_norm 2.3904 (inf) loss_scale 1024.0000 (2264.2875) mem 7382MB [2024-08-27 09:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1150/1251] eta 0:00:24 lr 0.000550 wd 0.0500 time 0.2398 (0.2416) data time 0.0012 (0.0017) model time 0.2386 (0.2399) loss 3.3478 (3.2088) grad_norm 2.1094 (inf) loss_scale 1024.0000 (2253.5117) mem 7382MB [2024-08-27 09:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1160/1251] eta 0:00:21 lr 0.000550 wd 0.0500 time 0.2475 (0.2415) data time 0.0010 (0.0017) model time 0.2464 (0.2398) loss 3.3564 (3.2083) grad_norm 1.9705 (inf) loss_scale 1024.0000 (2242.9216) mem 7382MB [2024-08-27 09:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1170/1251] eta 0:00:19 lr 0.000550 wd 0.0500 time 0.2424 (0.2415) data time 0.0008 (0.0016) model time 0.2416 (0.2398) loss 2.9959 (3.2079) grad_norm 2.3183 (inf) loss_scale 1024.0000 (2232.5124) mem 7382MB [2024-08-27 09:27:11 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [151/300][1180/1251] eta 0:00:17 lr 0.000550 wd 0.0500 time 0.2385 (0.2415) data time 0.0008 (0.0016) model time 0.2377 (0.2398) loss 2.4266 (3.2085) grad_norm 1.8870 (inf) loss_scale 1024.0000 (2222.2794) mem 7382MB [2024-08-27 09:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1190/1251] eta 0:00:14 lr 0.000550 wd 0.0500 time 0.2390 (0.2415) data time 0.0010 (0.0016) model time 0.2380 (0.2398) loss 4.1304 (3.2090) grad_norm 4.2048 (inf) loss_scale 1024.0000 (2212.2183) mem 7382MB [2024-08-27 09:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1200/1251] eta 0:00:12 lr 0.000550 wd 0.0500 time 0.2362 (0.2414) data time 0.0009 (0.0016) model time 0.2352 (0.2398) loss 3.7473 (3.2095) grad_norm 1.7585 (inf) loss_scale 1024.0000 (2202.3247) mem 7382MB [2024-08-27 09:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1210/1251] eta 0:00:09 lr 0.000550 wd 0.0500 time 0.2457 (0.2414) data time 0.0012 (0.0016) model time 0.2446 (0.2397) loss 2.3436 (3.2075) grad_norm 1.9709 (inf) loss_scale 1024.0000 (2192.5945) mem 7382MB [2024-08-27 09:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1220/1251] eta 0:00:07 lr 0.000550 wd 0.0500 time 0.2386 (0.2414) data time 0.0010 (0.0016) model time 0.2376 (0.2397) loss 3.4615 (3.2079) grad_norm 2.6383 (inf) loss_scale 1024.0000 (2183.0238) mem 7382MB [2024-08-27 09:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1230/1251] eta 0:00:05 lr 0.000549 wd 0.0500 time 0.2420 (0.2414) data time 0.0011 (0.0016) model time 0.2409 (0.2397) loss 3.0558 (3.2085) grad_norm 2.2332 (inf) loss_scale 1024.0000 (2173.6084) mem 7382MB [2024-08-27 09:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1240/1251] eta 0:00:02 lr 0.000549 wd 0.0500 time 0.2249 (0.2413) data time 0.0007 (0.0016) model time 0.2242 (0.2396) loss 3.0767 (3.2089) grad_norm 3.6910 (inf) 
loss_scale 1024.0000 (2164.3449) mem 7382MB [2024-08-27 09:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [151/300][1250/1251] eta 0:00:00 lr 0.000549 wd 0.0500 time 0.2261 (0.2412) data time 0.0005 (0.0016) model time 0.2257 (0.2395) loss 3.1063 (3.2093) grad_norm 2.3341 (inf) loss_scale 1024.0000 (2155.2294) mem 7382MB [2024-08-27 09:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 151 training takes 0:05:01 [2024-08-27 09:27:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:27:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.426 (0.426) Loss 0.4946 (0.4946) Acc@1 90.723 (90.723) Acc@5 98.145 (98.145) Mem 7382MB [2024-08-27 09:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.112) Loss 0.7217 (0.7143) Acc@1 84.570 (84.162) Acc@5 95.898 (96.839) Mem 7382MB [2024-08-27 09:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.097) Loss 1.0645 (0.7435) Acc@1 75.586 (83.143) Acc@5 93.359 (96.805) Mem 7382MB [2024-08-27 09:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.090) Loss 1.2871 (0.8519) Acc@1 67.871 (80.740) Acc@5 91.113 (95.609) Mem 7382MB [2024-08-27 09:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.1543 (0.9105) Acc@1 71.777 (79.321) Acc@5 92.285 (94.962) Mem 7382MB [2024-08-27 09:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.914 Acc@5 94.930 [2024-08-27 09:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 78.9% [2024-08-27 09:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.772 (0.772) 
Loss 0.4031 (0.4031) Acc@1 92.969 (92.969) Acc@5 98.633 (98.633) Mem 7382MB [2024-08-27 09:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.144) Loss 0.6436 (0.6379) Acc@1 86.816 (86.426) Acc@5 96.875 (97.301) Mem 7382MB [2024-08-27 09:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.112) Loss 0.9033 (0.6619) Acc@1 78.711 (85.459) Acc@5 95.508 (97.340) Mem 7382MB [2024-08-27 09:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.099) Loss 1.1504 (0.7521) Acc@1 70.703 (83.181) Acc@5 92.871 (96.365) Mem 7382MB [2024-08-27 09:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.091) Loss 1.0361 (0.7995) Acc@1 74.121 (81.776) Acc@5 93.555 (95.851) Mem 7382MB [2024-08-27 09:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.328 Acc@5 95.830 [2024-08-27 09:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-27 09:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.33% [2024-08-27 09:27:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 09:27:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 09:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][0/1251] eta 0:19:31 lr 0.000549 wd 0.0500 time 0.9365 (0.9365) data time 0.4821 (0.4821) model time 0.0000 (0.0000) loss 3.3297 (3.3297) grad_norm 3.4216 (3.4216) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][10/1251] eta 0:06:14 lr 0.000549 wd 0.0500 time 0.2330 (0.3015) data time 0.0008 (0.0447) model time 0.0000 (0.0000) loss 3.4129 (3.1945) grad_norm 2.2784 (2.4080) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][20/1251] eta 0:05:34 lr 0.000549 wd 0.0500 time 0.2572 (0.2717) data time 0.0008 (0.0239) model time 0.0000 (0.0000) loss 3.9208 (3.2870) grad_norm 2.6107 (2.4825) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][30/1251] eta 0:05:19 lr 0.000549 wd 0.0500 time 0.2409 (0.2616) data time 0.0012 (0.0170) model time 0.0000 (0.0000) loss 3.7010 (3.2690) grad_norm 2.3907 (2.4475) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][40/1251] eta 0:05:09 lr 0.000549 wd 0.0500 time 0.2333 (0.2558) data time 0.0008 (0.0131) model time 0.0000 (0.0000) loss 3.8157 (3.2388) grad_norm 2.8352 (2.5453) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][50/1251] eta 0:05:03 lr 0.000549 wd 0.0500 time 0.2340 (0.2529) data time 0.0008 (0.0108) model time 0.0000 (0.0000) loss 3.9944 (3.2760) grad_norm 2.9191 (2.5845) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][60/1251] eta 0:04:58 lr 0.000549 wd 0.0500 time 0.2336 (0.2502) data time 0.0009 (0.0092) model time 0.2327 
(0.2358) loss 3.3962 (3.2752) grad_norm 2.4355 (2.5911) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][70/1251] eta 0:04:53 lr 0.000549 wd 0.0500 time 0.2363 (0.2487) data time 0.0008 (0.0080) model time 0.2356 (0.2370) loss 3.7689 (3.2650) grad_norm 2.3080 (2.5474) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][80/1251] eta 0:04:49 lr 0.000549 wd 0.0500 time 0.2347 (0.2474) data time 0.0010 (0.0072) model time 0.2337 (0.2372) loss 3.4732 (3.2845) grad_norm 2.1390 (2.5248) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][90/1251] eta 0:04:46 lr 0.000549 wd 0.0500 time 0.2430 (0.2464) data time 0.0010 (0.0065) model time 0.2420 (0.2372) loss 3.5414 (3.2918) grad_norm 2.5573 (2.6122) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][100/1251] eta 0:04:42 lr 0.000549 wd 0.0500 time 0.2384 (0.2457) data time 0.0010 (0.0059) model time 0.2374 (0.2374) loss 4.0966 (3.2697) grad_norm 1.8919 (2.6404) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][110/1251] eta 0:04:39 lr 0.000549 wd 0.0500 time 0.2361 (0.2449) data time 0.0012 (0.0055) model time 0.2349 (0.2372) loss 3.3518 (3.2862) grad_norm 2.0867 (2.5953) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][120/1251] eta 0:04:36 lr 0.000549 wd 0.0500 time 0.2408 (0.2445) data time 0.0008 (0.0051) model time 0.2400 (0.2374) loss 3.3853 (3.2841) grad_norm 2.2559 (2.5731) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][130/1251] 
eta 0:04:33 lr 0.000549 wd 0.0500 time 0.2426 (0.2441) data time 0.0010 (0.0048) model time 0.2416 (0.2374) loss 2.6341 (3.2791) grad_norm 1.6853 (2.5744) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][140/1251] eta 0:04:30 lr 0.000549 wd 0.0500 time 0.2450 (0.2438) data time 0.0010 (0.0046) model time 0.2440 (0.2376) loss 4.3895 (3.2953) grad_norm 1.8293 (2.5363) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][150/1251] eta 0:04:28 lr 0.000549 wd 0.0500 time 0.2436 (0.2434) data time 0.0009 (0.0043) model time 0.2427 (0.2376) loss 3.0624 (3.2880) grad_norm 2.3007 (2.5422) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][160/1251] eta 0:04:25 lr 0.000549 wd 0.0500 time 0.2426 (0.2431) data time 0.0008 (0.0041) model time 0.2419 (0.2376) loss 3.0749 (3.2807) grad_norm 2.8306 (2.5225) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][170/1251] eta 0:04:22 lr 0.000549 wd 0.0500 time 0.2353 (0.2428) data time 0.0008 (0.0039) model time 0.2346 (0.2375) loss 4.0987 (3.2893) grad_norm 2.6941 (2.5217) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][180/1251] eta 0:04:19 lr 0.000549 wd 0.0500 time 0.2319 (0.2426) data time 0.0009 (0.0038) model time 0.2310 (0.2376) loss 2.1652 (3.2817) grad_norm 2.3175 (2.5620) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][190/1251] eta 0:04:17 lr 0.000549 wd 0.0500 time 0.2398 (0.2423) data time 0.0010 (0.0036) model time 0.2388 (0.2375) loss 3.4063 (3.2857) grad_norm 1.8572 (2.5629) loss_scale 1024.0000 (1024.0000) mem 7382MB 
[2024-08-27 09:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][200/1251] eta 0:04:14 lr 0.000548 wd 0.0500 time 0.2356 (0.2422) data time 0.0011 (0.0035) model time 0.2345 (0.2375) loss 2.8176 (3.2700) grad_norm 2.8288 (2.5494) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][210/1251] eta 0:04:11 lr 0.000548 wd 0.0500 time 0.2409 (0.2419) data time 0.0009 (0.0034) model time 0.2400 (0.2374) loss 2.6030 (3.2770) grad_norm 2.8351 (2.5421) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][220/1251] eta 0:04:09 lr 0.000548 wd 0.0500 time 0.2498 (0.2418) data time 0.0009 (0.0033) model time 0.2489 (0.2374) loss 2.5132 (3.2641) grad_norm 1.9993 (2.5239) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][230/1251] eta 0:04:06 lr 0.000548 wd 0.0500 time 0.2351 (0.2417) data time 0.0013 (0.0032) model time 0.2339 (0.2374) loss 3.4201 (3.2654) grad_norm 2.7954 (2.5324) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][240/1251] eta 0:04:04 lr 0.000548 wd 0.0500 time 0.2363 (0.2417) data time 0.0009 (0.0031) model time 0.2355 (0.2376) loss 3.9279 (3.2669) grad_norm 1.8613 (2.5483) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][250/1251] eta 0:04:01 lr 0.000548 wd 0.0500 time 0.2387 (0.2416) data time 0.0012 (0.0030) model time 0.2375 (0.2377) loss 3.6305 (3.2660) grad_norm 2.9606 (2.5586) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][260/1251] eta 0:03:59 lr 0.000548 wd 0.0500 time 0.2431 (0.2415) data time 0.0010 (0.0029) model time 0.2421 
(0.2377) loss 3.8241 (3.2681) grad_norm 2.0275 (2.5592) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][270/1251] eta 0:03:56 lr 0.000548 wd 0.0500 time 0.2375 (0.2414) data time 0.0010 (0.0029) model time 0.2365 (0.2377) loss 3.4562 (3.2767) grad_norm 2.0619 (2.5599) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][280/1251] eta 0:03:54 lr 0.000548 wd 0.0500 time 0.2383 (0.2414) data time 0.0011 (0.0028) model time 0.2372 (0.2378) loss 3.3008 (3.2736) grad_norm 1.9489 (2.5457) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][290/1251] eta 0:03:51 lr 0.000548 wd 0.0500 time 0.2422 (0.2413) data time 0.0007 (0.0027) model time 0.2415 (0.2379) loss 2.5899 (3.2684) grad_norm 2.0256 (2.5464) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][300/1251] eta 0:03:49 lr 0.000548 wd 0.0500 time 0.2431 (0.2413) data time 0.0010 (0.0027) model time 0.2422 (0.2379) loss 3.7722 (3.2608) grad_norm 1.7897 (2.5484) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][310/1251] eta 0:03:46 lr 0.000548 wd 0.0500 time 0.2363 (0.2412) data time 0.0008 (0.0026) model time 0.2356 (0.2378) loss 3.5417 (3.2663) grad_norm 2.4367 (2.5385) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][320/1251] eta 0:03:44 lr 0.000548 wd 0.0500 time 0.2412 (0.2411) data time 0.0007 (0.0026) model time 0.2405 (0.2378) loss 3.1710 (3.2601) grad_norm 1.6223 (2.5308) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[152/300][330/1251] eta 0:03:41 lr 0.000548 wd 0.0500 time 0.2449 (0.2410) data time 0.0009 (0.0025) model time 0.2439 (0.2378) loss 3.5504 (3.2528) grad_norm 2.2373 (2.5298) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][340/1251] eta 0:03:39 lr 0.000548 wd 0.0500 time 0.2397 (0.2410) data time 0.0012 (0.0025) model time 0.2386 (0.2378) loss 3.3942 (3.2503) grad_norm 2.3002 (2.5299) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][350/1251] eta 0:03:37 lr 0.000548 wd 0.0500 time 0.2447 (0.2409) data time 0.0011 (0.0025) model time 0.2436 (0.2379) loss 3.3508 (3.2324) grad_norm 1.7055 (2.5242) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][360/1251] eta 0:03:34 lr 0.000548 wd 0.0500 time 0.2501 (0.2408) data time 0.0011 (0.0024) model time 0.2490 (0.2378) loss 3.4398 (3.2336) grad_norm 2.0601 (2.5272) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][370/1251] eta 0:03:32 lr 0.000548 wd 0.0500 time 0.2429 (0.2408) data time 0.0007 (0.0024) model time 0.2422 (0.2379) loss 2.5403 (3.2271) grad_norm 1.7302 (2.5261) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][380/1251] eta 0:03:29 lr 0.000548 wd 0.0500 time 0.2323 (0.2407) data time 0.0010 (0.0023) model time 0.2312 (0.2378) loss 3.4663 (3.2218) grad_norm 1.6682 (2.5333) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][390/1251] eta 0:03:27 lr 0.000548 wd 0.0500 time 0.2475 (0.2407) data time 0.0012 (0.0023) model time 0.2463 (0.2378) loss 3.0442 (3.2230) grad_norm 2.9999 (2.5299) loss_scale 1024.0000 
(1024.0000) mem 7382MB [2024-08-27 09:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][400/1251] eta 0:03:24 lr 0.000548 wd 0.0500 time 0.2390 (0.2406) data time 0.0012 (0.0023) model time 0.2379 (0.2378) loss 3.4690 (3.2212) grad_norm 2.4754 (2.5323) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][410/1251] eta 0:03:22 lr 0.000548 wd 0.0500 time 0.2428 (0.2406) data time 0.0009 (0.0022) model time 0.2419 (0.2378) loss 4.0675 (3.2228) grad_norm 2.4726 (2.5482) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][420/1251] eta 0:03:19 lr 0.000548 wd 0.0500 time 0.2355 (0.2405) data time 0.0009 (0.0022) model time 0.2347 (0.2378) loss 3.4312 (3.2220) grad_norm 1.7972 (2.5413) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][430/1251] eta 0:03:17 lr 0.000547 wd 0.0500 time 0.2408 (0.2405) data time 0.0011 (0.0022) model time 0.2397 (0.2378) loss 3.1382 (3.2170) grad_norm 2.3751 (2.5395) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][440/1251] eta 0:03:14 lr 0.000547 wd 0.0500 time 0.2394 (0.2404) data time 0.0007 (0.0022) model time 0.2387 (0.2378) loss 3.0908 (3.2184) grad_norm 3.0130 (2.5358) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][450/1251] eta 0:03:12 lr 0.000547 wd 0.0500 time 0.2389 (0.2404) data time 0.0010 (0.0021) model time 0.2379 (0.2377) loss 3.8303 (3.2212) grad_norm 2.9968 (2.5379) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][460/1251] eta 0:03:10 lr 0.000547 wd 0.0500 time 0.2461 (0.2404) data time 0.0008 
(0.0021) model time 0.2453 (0.2378) loss 4.1583 (3.2221) grad_norm 2.0665 (2.5430) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][470/1251] eta 0:03:07 lr 0.000547 wd 0.0500 time 0.2361 (0.2403) data time 0.0010 (0.0021) model time 0.2350 (0.2378) loss 3.1127 (3.2174) grad_norm 1.6868 (2.5293) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][480/1251] eta 0:03:05 lr 0.000547 wd 0.0500 time 0.2426 (0.2404) data time 0.0008 (0.0021) model time 0.2418 (0.2378) loss 3.2752 (3.2165) grad_norm 4.5287 (2.5287) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][490/1251] eta 0:03:02 lr 0.000547 wd 0.0500 time 0.2334 (0.2403) data time 0.0008 (0.0020) model time 0.2326 (0.2378) loss 3.4833 (3.2190) grad_norm 3.3545 (2.5375) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][500/1251] eta 0:03:00 lr 0.000547 wd 0.0500 time 0.2414 (0.2403) data time 0.0010 (0.0020) model time 0.2404 (0.2378) loss 4.3649 (3.2223) grad_norm 2.2874 (2.5380) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][510/1251] eta 0:02:58 lr 0.000547 wd 0.0500 time 0.2382 (0.2403) data time 0.0009 (0.0020) model time 0.2373 (0.2379) loss 3.7144 (3.2198) grad_norm 2.3363 (2.5413) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][520/1251] eta 0:02:55 lr 0.000547 wd 0.0500 time 0.2374 (0.2405) data time 0.0011 (0.0020) model time 0.2363 (0.2382) loss 3.4204 (3.2213) grad_norm 2.4172 (2.5484) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [152/300][530/1251] eta 0:02:53 lr 0.000547 wd 0.0500 time 0.2380 (0.2405) data time 0.0008 (0.0020) model time 0.2372 (0.2382) loss 3.5212 (3.2219) grad_norm 2.1785 (2.5456) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][540/1251] eta 0:02:50 lr 0.000547 wd 0.0500 time 0.2395 (0.2405) data time 0.0010 (0.0020) model time 0.2385 (0.2382) loss 3.4332 (3.2209) grad_norm 2.2615 (2.5472) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][550/1251] eta 0:02:48 lr 0.000547 wd 0.0500 time 0.2373 (0.2405) data time 0.0011 (0.0019) model time 0.2362 (0.2382) loss 3.3170 (3.2181) grad_norm 4.6254 (2.5477) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][560/1251] eta 0:02:46 lr 0.000547 wd 0.0500 time 0.2476 (0.2404) data time 0.0010 (0.0019) model time 0.2466 (0.2381) loss 3.8352 (3.2203) grad_norm 3.2720 (2.5512) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][570/1251] eta 0:02:43 lr 0.000547 wd 0.0500 time 0.2403 (0.2404) data time 0.0010 (0.0019) model time 0.2393 (0.2381) loss 3.7223 (3.2207) grad_norm 2.6671 (2.5544) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][580/1251] eta 0:02:41 lr 0.000547 wd 0.0500 time 0.2398 (0.2404) data time 0.0010 (0.0019) model time 0.2387 (0.2381) loss 2.3671 (3.2188) grad_norm 3.0235 (2.5511) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][590/1251] eta 0:02:38 lr 0.000547 wd 0.0500 time 0.2399 (0.2403) data time 0.0008 (0.0019) model time 0.2392 (0.2381) loss 3.8635 (3.2179) grad_norm 2.3615 (2.5470) loss_scale 
1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][600/1251] eta 0:02:36 lr 0.000547 wd 0.0500 time 0.2390 (0.2404) data time 0.0011 (0.0019) model time 0.2379 (0.2382) loss 3.1109 (3.2170) grad_norm 2.2512 (2.5438) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][610/1251] eta 0:02:34 lr 0.000547 wd 0.0500 time 0.2444 (0.2410) data time 0.0010 (0.0018) model time 0.2434 (0.2389) loss 3.3915 (3.2181) grad_norm 4.1287 (2.5419) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][620/1251] eta 0:02:32 lr 0.000547 wd 0.0500 time 0.2357 (0.2414) data time 0.0010 (0.0018) model time 0.2347 (0.2393) loss 3.0093 (3.2175) grad_norm 2.9218 (2.5406) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][630/1251] eta 0:02:29 lr 0.000547 wd 0.0500 time 0.2463 (0.2414) data time 0.0009 (0.0018) model time 0.2453 (0.2393) loss 3.8742 (3.2175) grad_norm 1.7469 (2.5381) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][640/1251] eta 0:02:27 lr 0.000547 wd 0.0500 time 0.2517 (0.2413) data time 0.0010 (0.0018) model time 0.2507 (0.2393) loss 3.5425 (3.2188) grad_norm 2.3617 (2.5411) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][650/1251] eta 0:02:25 lr 0.000546 wd 0.0500 time 0.2401 (0.2413) data time 0.0008 (0.0018) model time 0.2392 (0.2393) loss 3.6490 (3.2205) grad_norm 2.0934 (2.5371) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][660/1251] eta 0:02:22 lr 0.000546 wd 0.0500 time 0.2324 (0.2413) data time 
0.0008 (0.0018) model time 0.2316 (0.2393) loss 3.8723 (3.2231) grad_norm 2.6304 (2.5365) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][670/1251] eta 0:02:20 lr 0.000546 wd 0.0500 time 0.2365 (0.2412) data time 0.0010 (0.0018) model time 0.2354 (0.2392) loss 2.5585 (3.2210) grad_norm 3.0905 (2.5420) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][680/1251] eta 0:02:17 lr 0.000546 wd 0.0500 time 0.2409 (0.2412) data time 0.0009 (0.0018) model time 0.2400 (0.2392) loss 2.4115 (3.2201) grad_norm 2.9055 (2.5519) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][690/1251] eta 0:02:15 lr 0.000546 wd 0.0500 time 0.2475 (0.2412) data time 0.0010 (0.0018) model time 0.2465 (0.2392) loss 2.0289 (3.2199) grad_norm 1.7074 (2.5481) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][700/1251] eta 0:02:12 lr 0.000546 wd 0.0500 time 0.2367 (0.2412) data time 0.0009 (0.0018) model time 0.2358 (0.2392) loss 3.0500 (3.2190) grad_norm 1.9302 (2.5467) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][710/1251] eta 0:02:10 lr 0.000546 wd 0.0500 time 0.2434 (0.2411) data time 0.0008 (0.0017) model time 0.2426 (0.2392) loss 3.9481 (3.2190) grad_norm 2.4321 (2.5455) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][720/1251] eta 0:02:08 lr 0.000546 wd 0.0500 time 0.2405 (0.2411) data time 0.0011 (0.0017) model time 0.2394 (0.2392) loss 3.2077 (3.2137) grad_norm 3.0008 (2.5447) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [152/300][730/1251] eta 0:02:05 lr 0.000546 wd 0.0500 time 0.2495 (0.2412) data time 0.0007 (0.0017) model time 0.2487 (0.2392) loss 3.9533 (3.2167) grad_norm 2.0393 (2.5398) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][740/1251] eta 0:02:03 lr 0.000546 wd 0.0500 time 0.2452 (0.2412) data time 0.0008 (0.0017) model time 0.2444 (0.2393) loss 3.9608 (3.2161) grad_norm 1.9527 (2.5399) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][750/1251] eta 0:02:00 lr 0.000546 wd 0.0500 time 0.2331 (0.2411) data time 0.0009 (0.0017) model time 0.2322 (0.2392) loss 3.6202 (3.2190) grad_norm 2.6132 (2.5422) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][760/1251] eta 0:01:58 lr 0.000546 wd 0.0500 time 0.2383 (0.2411) data time 0.0010 (0.0017) model time 0.2374 (0.2392) loss 3.1594 (3.2176) grad_norm 3.0768 (2.5480) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][770/1251] eta 0:01:55 lr 0.000546 wd 0.0500 time 0.2367 (0.2410) data time 0.0009 (0.0017) model time 0.2359 (0.2392) loss 2.7376 (3.2204) grad_norm 1.9655 (2.5462) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][780/1251] eta 0:01:53 lr 0.000546 wd 0.0500 time 0.2395 (0.2410) data time 0.0008 (0.0017) model time 0.2387 (0.2391) loss 2.6753 (3.2201) grad_norm 1.9651 (2.5455) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][790/1251] eta 0:01:51 lr 0.000546 wd 0.0500 time 0.2379 (0.2409) data time 0.0010 (0.0017) model time 0.2369 (0.2391) loss 3.2610 (3.2193) grad_norm 2.1923 (2.5425) 
loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][800/1251] eta 0:01:48 lr 0.000546 wd 0.0500 time 0.2445 (0.2409) data time 0.0012 (0.0017) model time 0.2433 (0.2391) loss 2.9556 (3.2217) grad_norm 2.3056 (2.5422) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][810/1251] eta 0:01:46 lr 0.000546 wd 0.0500 time 0.2373 (0.2409) data time 0.0009 (0.0017) model time 0.2364 (0.2391) loss 2.9393 (3.2244) grad_norm 1.8526 (2.5396) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][820/1251] eta 0:01:43 lr 0.000546 wd 0.0500 time 0.2397 (0.2409) data time 0.0012 (0.0016) model time 0.2385 (0.2390) loss 2.8127 (3.2250) grad_norm 2.0904 (2.5433) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][830/1251] eta 0:01:41 lr 0.000546 wd 0.0500 time 0.2434 (0.2408) data time 0.0012 (0.0016) model time 0.2423 (0.2390) loss 3.4061 (3.2263) grad_norm 2.0593 (2.5383) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][840/1251] eta 0:01:38 lr 0.000546 wd 0.0500 time 0.2327 (0.2408) data time 0.0010 (0.0016) model time 0.2317 (0.2390) loss 3.7324 (3.2268) grad_norm 2.2428 (2.5360) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][850/1251] eta 0:01:36 lr 0.000546 wd 0.0500 time 0.2399 (0.2408) data time 0.0011 (0.0016) model time 0.2388 (0.2390) loss 3.5751 (3.2262) grad_norm 3.4343 (2.5363) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][860/1251] eta 0:01:34 lr 0.000546 wd 0.0500 time 0.2323 (0.2408) 
data time 0.0010 (0.0016) model time 0.2313 (0.2390) loss 2.7460 (3.2244) grad_norm 2.0520 (2.5342) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][870/1251] eta 0:01:31 lr 0.000546 wd 0.0500 time 0.2412 (0.2408) data time 0.0007 (0.0016) model time 0.2405 (0.2389) loss 3.3925 (3.2247) grad_norm 1.8465 (2.5316) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][880/1251] eta 0:01:29 lr 0.000545 wd 0.0500 time 0.2360 (0.2407) data time 0.0011 (0.0016) model time 0.2349 (0.2389) loss 3.5842 (3.2260) grad_norm 2.0791 (2.5292) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][890/1251] eta 0:01:26 lr 0.000545 wd 0.0500 time 0.2396 (0.2407) data time 0.0008 (0.0016) model time 0.2388 (0.2389) loss 2.1739 (3.2238) grad_norm 2.1940 (2.5242) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][900/1251] eta 0:01:24 lr 0.000545 wd 0.0500 time 0.2369 (0.2407) data time 0.0010 (0.0016) model time 0.2359 (0.2389) loss 3.5224 (3.2267) grad_norm 2.4628 (2.5208) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][910/1251] eta 0:01:22 lr 0.000545 wd 0.0500 time 0.2397 (0.2407) data time 0.0007 (0.0016) model time 0.2390 (0.2389) loss 2.5685 (3.2252) grad_norm 2.8280 (2.5193) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][920/1251] eta 0:01:19 lr 0.000545 wd 0.0500 time 0.2417 (0.2407) data time 0.0013 (0.0016) model time 0.2405 (0.2389) loss 3.3437 (3.2247) grad_norm 2.5683 (2.5211) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:22 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [152/300][930/1251] eta 0:01:17 lr 0.000545 wd 0.0500 time 0.2431 (0.2407) data time 0.0008 (0.0016) model time 0.2423 (0.2389) loss 3.2024 (3.2273) grad_norm 2.2669 (2.5266) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][940/1251] eta 0:01:14 lr 0.000545 wd 0.0500 time 0.2344 (0.2406) data time 0.0010 (0.0016) model time 0.2333 (0.2389) loss 3.5752 (3.2272) grad_norm 1.8990 (2.5285) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][950/1251] eta 0:01:12 lr 0.000545 wd 0.0500 time 0.2383 (0.2406) data time 0.0012 (0.0016) model time 0.2371 (0.2389) loss 2.6562 (3.2295) grad_norm 2.8780 (2.5337) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][960/1251] eta 0:01:10 lr 0.000545 wd 0.0500 time 0.2420 (0.2406) data time 0.0010 (0.0016) model time 0.2410 (0.2388) loss 2.2018 (3.2261) grad_norm 1.7748 (2.5303) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][970/1251] eta 0:01:07 lr 0.000545 wd 0.0500 time 0.2461 (0.2406) data time 0.0007 (0.0016) model time 0.2453 (0.2388) loss 1.9732 (3.2274) grad_norm 2.2543 (2.5299) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][980/1251] eta 0:01:05 lr 0.000545 wd 0.0500 time 0.2342 (0.2406) data time 0.0011 (0.0016) model time 0.2331 (0.2388) loss 3.2876 (3.2247) grad_norm 1.9481 (2.5259) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][990/1251] eta 0:01:02 lr 0.000545 wd 0.0500 time 0.2419 (0.2405) data time 0.0011 (0.0016) model time 0.2407 (0.2388) loss 3.3015 (3.2242) grad_norm 
1.6916 (2.5220) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1000/1251] eta 0:01:00 lr 0.000545 wd 0.0500 time 0.2340 (0.2405) data time 0.0012 (0.0016) model time 0.2328 (0.2388) loss 2.2512 (3.2228) grad_norm 2.1727 (2.5225) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1010/1251] eta 0:00:57 lr 0.000545 wd 0.0500 time 0.2311 (0.2405) data time 0.0012 (0.0015) model time 0.2299 (0.2387) loss 3.6326 (3.2254) grad_norm 2.5041 (2.5209) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1020/1251] eta 0:00:55 lr 0.000545 wd 0.0500 time 0.2452 (0.2405) data time 0.0011 (0.0015) model time 0.2441 (0.2387) loss 3.5912 (3.2256) grad_norm 2.1525 (2.5178) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1030/1251] eta 0:00:53 lr 0.000545 wd 0.0500 time 0.2382 (0.2404) data time 0.0010 (0.0015) model time 0.2372 (0.2387) loss 2.5784 (3.2276) grad_norm 2.5560 (2.5182) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1040/1251] eta 0:00:50 lr 0.000545 wd 0.0500 time 0.2455 (0.2404) data time 0.0009 (0.0015) model time 0.2446 (0.2387) loss 3.1171 (3.2272) grad_norm 3.0451 (2.5271) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1050/1251] eta 0:00:48 lr 0.000545 wd 0.0500 time 0.2529 (0.2406) data time 0.0008 (0.0015) model time 0.2522 (0.2389) loss 3.9267 (3.2247) grad_norm 2.5554 (2.5248) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1060/1251] eta 0:00:45 lr 0.000545 wd 
0.0500 time 0.2444 (0.2406) data time 0.0007 (0.0015) model time 0.2437 (0.2389) loss 3.4133 (3.2233) grad_norm 1.9043 (2.5243) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1070/1251] eta 0:00:43 lr 0.000545 wd 0.0500 time 0.2419 (0.2406) data time 0.0008 (0.0015) model time 0.2411 (0.2389) loss 3.9279 (3.2238) grad_norm 2.0940 (2.5236) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1080/1251] eta 0:00:41 lr 0.000545 wd 0.0500 time 0.2452 (0.2406) data time 0.0010 (0.0015) model time 0.2442 (0.2389) loss 3.3172 (3.2227) grad_norm 2.0198 (2.5258) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1090/1251] eta 0:00:38 lr 0.000545 wd 0.0500 time 0.2402 (0.2405) data time 0.0007 (0.0015) model time 0.2394 (0.2389) loss 2.7438 (3.2216) grad_norm 2.6099 (2.5244) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1100/1251] eta 0:00:36 lr 0.000545 wd 0.0500 time 0.2361 (0.2405) data time 0.0010 (0.0015) model time 0.2351 (0.2389) loss 3.1041 (3.2223) grad_norm 2.9020 (2.5248) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1110/1251] eta 0:00:33 lr 0.000544 wd 0.0500 time 0.2382 (0.2405) data time 0.0010 (0.0015) model time 0.2371 (0.2388) loss 2.6071 (3.2213) grad_norm 2.1711 (2.5232) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1120/1251] eta 0:00:31 lr 0.000544 wd 0.0500 time 0.2371 (0.2404) data time 0.0012 (0.0015) model time 0.2359 (0.2388) loss 3.0093 (3.2198) grad_norm 2.8331 (2.5234) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1130/1251] eta 0:00:29 lr 0.000544 wd 0.0500 time 0.2469 (0.2404) data time 0.0009 (0.0015) model time 0.2460 (0.2388) loss 2.5944 (3.2180) grad_norm 1.8021 (2.5209) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1140/1251] eta 0:00:26 lr 0.000544 wd 0.0500 time 0.2437 (0.2406) data time 0.0008 (0.0015) model time 0.2429 (0.2390) loss 3.7342 (3.2200) grad_norm 2.4410 (2.5207) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1150/1251] eta 0:00:24 lr 0.000544 wd 0.0500 time 0.4587 (0.2410) data time 0.0012 (0.0015) model time 0.4575 (0.2394) loss 3.2254 (3.2221) grad_norm 1.8482 (2.5179) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1160/1251] eta 0:00:21 lr 0.000544 wd 0.0500 time 0.2448 (0.2410) data time 0.0010 (0.0015) model time 0.2437 (0.2394) loss 2.3590 (3.2206) grad_norm 2.1799 (2.5169) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1170/1251] eta 0:00:19 lr 0.000544 wd 0.0500 time 0.2366 (0.2410) data time 0.0011 (0.0015) model time 0.2354 (0.2394) loss 3.3106 (3.2218) grad_norm 2.7159 (2.5167) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1180/1251] eta 0:00:17 lr 0.000544 wd 0.0500 time 0.2415 (0.2410) data time 0.0008 (0.0015) model time 0.2406 (0.2394) loss 2.6728 (3.2235) grad_norm 2.7958 (2.5153) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1190/1251] eta 0:00:14 lr 0.000544 wd 0.0500 time 0.2428 (0.2410) data time 0.0008 (0.0015) model time 0.2420 (0.2394) loss 
2.4482 (3.2222) grad_norm 2.7337 (2.5135) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1200/1251] eta 0:00:12 lr 0.000544 wd 0.0500 time 0.2349 (0.2409) data time 0.0012 (0.0015) model time 0.2336 (0.2393) loss 3.1570 (3.2235) grad_norm 1.9501 (2.5151) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1210/1251] eta 0:00:09 lr 0.000544 wd 0.0500 time 0.2387 (0.2409) data time 0.0007 (0.0015) model time 0.2380 (0.2393) loss 2.2270 (3.2233) grad_norm 2.8101 (2.5135) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1220/1251] eta 0:00:07 lr 0.000544 wd 0.0500 time 0.2384 (0.2409) data time 0.0011 (0.0015) model time 0.2373 (0.2393) loss 3.6145 (3.2251) grad_norm 2.3191 (2.5135) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1230/1251] eta 0:00:05 lr 0.000544 wd 0.0500 time 0.2345 (0.2409) data time 0.0010 (0.0015) model time 0.2334 (0.2393) loss 3.5664 (3.2269) grad_norm 2.5783 (2.5135) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1240/1251] eta 0:00:02 lr 0.000544 wd 0.0500 time 0.2263 (0.2408) data time 0.0005 (0.0015) model time 0.2258 (0.2392) loss 3.3604 (3.2251) grad_norm 2.0719 (2.5096) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [152/300][1250/1251] eta 0:00:00 lr 0.000544 wd 0.0500 time 0.2231 (0.2407) data time 0.0007 (0.0015) model time 0.2224 (0.2391) loss 3.3824 (3.2233) grad_norm 2.1598 (2.5088) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 152 training takes 0:05:01 
[2024-08-27 09:32:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:32:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.473 (0.473) Loss 0.4897 (0.4897) Acc@1 91.504 (91.504) Acc@5 98.242 (98.242) Mem 7382MB [2024-08-27 09:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.113) Loss 0.8213 (0.7571) Acc@1 83.984 (84.171) Acc@5 95.996 (96.724) Mem 7382MB [2024-08-27 09:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.097) Loss 1.0693 (0.7785) Acc@1 76.172 (83.343) Acc@5 94.434 (96.684) Mem 7382MB [2024-08-27 09:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.093 (0.091) Loss 1.2520 (0.8730) Acc@1 69.824 (81.140) Acc@5 91.113 (95.561) Mem 7382MB [2024-08-27 09:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.1846 (0.9278) Acc@1 70.898 (79.671) Acc@5 91.602 (94.929) Mem 7382MB [2024-08-27 09:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.212 Acc@5 94.870 [2024-08-27 09:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.2% [2024-08-27 09:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.21% [2024-08-27 09:32:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 09:32:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 09:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.412 (0.412) Loss 0.4041 (0.4041) Acc@1 92.871 (92.871) Acc@5 98.730 (98.730) Mem 7382MB [2024-08-27 09:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.108) Loss 0.6436 (0.6369) Acc@1 86.719 (86.301) Acc@5 96.973 (97.319) Mem 7382MB [2024-08-27 09:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.092) Loss 0.9019 (0.6610) Acc@1 78.613 (85.398) Acc@5 95.605 (97.354) Mem 7382MB [2024-08-27 09:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.087) Loss 1.1514 (0.7513) Acc@1 70.605 (83.159) Acc@5 92.969 (96.374) Mem 7382MB [2024-08-27 09:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.081) Loss 1.0361 (0.7986) Acc@1 74.414 (81.762) Acc@5 93.750 (95.882) Mem 7382MB [2024-08-27 09:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.300 Acc@5 95.860 [2024-08-27 09:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-27 09:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][0/1251] eta 0:20:58 lr 0.000544 wd 0.0500 time 1.0059 (1.0059) data time 0.6844 (0.6844) model time 0.0000 (0.0000) loss 3.2097 (3.2097) grad_norm 2.2239 (2.2239) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][10/1251] eta 0:06:26 lr 0.000544 wd 0.0500 time 0.2349 (0.3115) data time 0.0007 (0.0631) model time 0.0000 (0.0000) loss 3.5628 (3.3324) grad_norm 2.7079 (2.7102) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][20/1251] eta 0:05:39 lr 0.000544 wd 0.0500 time 0.2376 (0.2759) data time 0.0010 (0.0335) model time 0.0000 (0.0000) loss 2.8790 (3.3362) grad_norm 3.7124 (2.5400) 
loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][30/1251] eta 0:05:22 lr 0.000544 wd 0.0500 time 0.2412 (0.2641) data time 0.0011 (0.0230) model time 0.0000 (0.0000) loss 2.5859 (3.3855) grad_norm 2.0730 (2.6822) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][40/1251] eta 0:05:12 lr 0.000544 wd 0.0500 time 0.2345 (0.2583) data time 0.0010 (0.0177) model time 0.0000 (0.0000) loss 2.6386 (3.2949) grad_norm 3.2023 (2.5972) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][50/1251] eta 0:05:05 lr 0.000544 wd 0.0500 time 0.2387 (0.2544) data time 0.0010 (0.0144) model time 0.0000 (0.0000) loss 2.7121 (3.2446) grad_norm 2.8089 (2.5884) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][60/1251] eta 0:05:00 lr 0.000544 wd 0.0500 time 0.2357 (0.2526) data time 0.0009 (0.0124) model time 0.2349 (0.2411) loss 2.9125 (3.2606) grad_norm 1.9471 (2.5442) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][70/1251] eta 0:04:56 lr 0.000544 wd 0.0500 time 0.2396 (0.2511) data time 0.0009 (0.0110) model time 0.2387 (0.2404) loss 3.3739 (3.2445) grad_norm 2.2440 (2.5411) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][80/1251] eta 0:04:52 lr 0.000543 wd 0.0500 time 0.2389 (0.2495) data time 0.0011 (0.0097) model time 0.2378 (0.2392) loss 3.7471 (3.2569) grad_norm 2.0600 (2.5180) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][90/1251] eta 0:04:48 lr 0.000543 wd 0.0500 time 0.2340 (0.2483) data 
time 0.0010 (0.0088) model time 0.2330 (0.2388) loss 3.5121 (3.2176) grad_norm 2.3236 (2.5583) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][100/1251] eta 0:04:44 lr 0.000543 wd 0.0500 time 0.2433 (0.2473) data time 0.0009 (0.0080) model time 0.2424 (0.2386) loss 2.2489 (3.2084) grad_norm 2.4465 (2.5182) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][110/1251] eta 0:04:41 lr 0.000543 wd 0.0500 time 0.2419 (0.2465) data time 0.0011 (0.0074) model time 0.2408 (0.2383) loss 3.2158 (3.2032) grad_norm 3.5230 (2.5116) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][120/1251] eta 0:04:37 lr 0.000543 wd 0.0500 time 0.2375 (0.2458) data time 0.0009 (0.0069) model time 0.2366 (0.2381) loss 3.9528 (3.2161) grad_norm 2.3584 (2.5118) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][130/1251] eta 0:04:34 lr 0.000543 wd 0.0500 time 0.2296 (0.2452) data time 0.0009 (0.0064) model time 0.2287 (0.2380) loss 3.1176 (3.2090) grad_norm 2.8666 (2.5045) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][140/1251] eta 0:04:31 lr 0.000543 wd 0.0500 time 0.2392 (0.2448) data time 0.0012 (0.0060) model time 0.2380 (0.2380) loss 3.8035 (3.2144) grad_norm 2.1356 (2.4829) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][150/1251] eta 0:04:29 lr 0.000543 wd 0.0500 time 0.2420 (0.2444) data time 0.0010 (0.0057) model time 0.2409 (0.2380) loss 3.5591 (3.2201) grad_norm 2.2399 (2.4889) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [153/300][160/1251] eta 0:04:26 lr 0.000543 wd 0.0500 time 0.2319 (0.2441) data time 0.0010 (0.0054) model time 0.2309 (0.2380) loss 3.4878 (3.2189) grad_norm 2.1851 (2.5042) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][170/1251] eta 0:04:23 lr 0.000543 wd 0.0500 time 0.2449 (0.2438) data time 0.0008 (0.0052) model time 0.2441 (0.2381) loss 3.9496 (3.2081) grad_norm 2.0520 (2.5258) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][180/1251] eta 0:04:20 lr 0.000543 wd 0.0500 time 0.2345 (0.2435) data time 0.0009 (0.0049) model time 0.2336 (0.2380) loss 1.9367 (3.2026) grad_norm 2.6953 (2.5294) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][190/1251] eta 0:04:18 lr 0.000543 wd 0.0500 time 0.2319 (0.2432) data time 0.0008 (0.0047) model time 0.2310 (0.2380) loss 4.0850 (3.2104) grad_norm 2.1868 (2.5414) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][200/1251] eta 0:04:15 lr 0.000543 wd 0.0500 time 0.2372 (0.2431) data time 0.0008 (0.0046) model time 0.2364 (0.2380) loss 3.4353 (3.2167) grad_norm 2.4174 (2.5327) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][210/1251] eta 0:04:12 lr 0.000543 wd 0.0500 time 0.2388 (0.2430) data time 0.0007 (0.0044) model time 0.2381 (0.2382) loss 3.4588 (3.2180) grad_norm 2.1159 (2.5190) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][220/1251] eta 0:04:10 lr 0.000543 wd 0.0500 time 0.2327 (0.2428) data time 0.0014 (0.0042) model time 0.2313 (0.2381) loss 2.7911 (3.2164) grad_norm 
2.0515 (2.4995) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][230/1251] eta 0:04:07 lr 0.000543 wd 0.0500 time 0.2407 (0.2427) data time 0.0009 (0.0041) model time 0.2398 (0.2382) loss 3.7814 (3.2209) grad_norm 2.8330 (2.4924) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][240/1251] eta 0:04:05 lr 0.000543 wd 0.0500 time 0.2389 (0.2426) data time 0.0012 (0.0040) model time 0.2377 (0.2383) loss 3.5308 (3.2217) grad_norm 2.4621 (2.4912) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][250/1251] eta 0:04:02 lr 0.000543 wd 0.0500 time 0.2304 (0.2425) data time 0.0007 (0.0039) model time 0.2296 (0.2382) loss 1.9463 (3.2219) grad_norm 2.0066 (2.4962) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][260/1251] eta 0:04:00 lr 0.000543 wd 0.0500 time 0.2339 (0.2424) data time 0.0009 (0.0038) model time 0.2330 (0.2383) loss 3.4667 (3.2273) grad_norm 2.7039 (2.5234) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][270/1251] eta 0:03:57 lr 0.000543 wd 0.0500 time 0.2447 (0.2423) data time 0.0011 (0.0037) model time 0.2437 (0.2383) loss 3.5767 (3.2311) grad_norm 2.5016 (2.5244) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][280/1251] eta 0:03:55 lr 0.000543 wd 0.0500 time 0.2439 (0.2422) data time 0.0010 (0.0036) model time 0.2429 (0.2383) loss 4.3234 (3.2411) grad_norm 2.0170 (2.5205) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][290/1251] eta 0:03:52 lr 0.000543 wd 0.0500 time 
0.2380 (0.2421) data time 0.0012 (0.0035) model time 0.2369 (0.2383) loss 3.3225 (3.2468) grad_norm 2.5859 (2.5253) loss_scale 2048.0000 (1041.5945) mem 7382MB [2024-08-27 09:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][300/1251] eta 0:03:50 lr 0.000543 wd 0.0500 time 0.2448 (0.2421) data time 0.0007 (0.0034) model time 0.2441 (0.2384) loss 3.2094 (3.2499) grad_norm 3.1485 (2.5290) loss_scale 2048.0000 (1075.0299) mem 7382MB [2024-08-27 09:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][310/1251] eta 0:03:47 lr 0.000542 wd 0.0500 time 0.2419 (0.2421) data time 0.0010 (0.0033) model time 0.2409 (0.2385) loss 3.6497 (3.2440) grad_norm 2.5976 (2.5417) loss_scale 2048.0000 (1106.3151) mem 7382MB [2024-08-27 09:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][320/1251] eta 0:03:45 lr 0.000542 wd 0.0500 time 0.2379 (0.2420) data time 0.0009 (0.0033) model time 0.2370 (0.2385) loss 3.5567 (3.2383) grad_norm 1.9043 (2.5401) loss_scale 2048.0000 (1135.6511) mem 7382MB [2024-08-27 09:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][330/1251] eta 0:03:42 lr 0.000542 wd 0.0500 time 0.2419 (0.2419) data time 0.0010 (0.0032) model time 0.2409 (0.2385) loss 4.0190 (3.2269) grad_norm 2.2700 (2.5296) loss_scale 2048.0000 (1163.2145) mem 7382MB [2024-08-27 09:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][340/1251] eta 0:03:40 lr 0.000542 wd 0.0500 time 0.2330 (0.2418) data time 0.0010 (0.0031) model time 0.2319 (0.2385) loss 3.3411 (3.2235) grad_norm 2.4493 (2.5205) loss_scale 2048.0000 (1189.1613) mem 7382MB [2024-08-27 09:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][350/1251] eta 0:03:37 lr 0.000542 wd 0.0500 time 0.2423 (0.2418) data time 0.0008 (0.0031) model time 0.2415 (0.2385) loss 2.5243 (3.2253) grad_norm 2.2846 (2.5072) loss_scale 2048.0000 (1213.6296) mem 7382MB [2024-08-27 09:34:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][360/1251] eta 0:03:35 lr 0.000542 wd 0.0500 time 0.2369 (0.2417) data time 0.0010 (0.0030) model time 0.2358 (0.2385) loss 3.3732 (3.2208) grad_norm 2.4821 (2.5066) loss_scale 2048.0000 (1236.7424) mem 7382MB [2024-08-27 09:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][370/1251] eta 0:03:32 lr 0.000542 wd 0.0500 time 0.2403 (0.2417) data time 0.0008 (0.0030) model time 0.2395 (0.2385) loss 3.0383 (3.2244) grad_norm 2.1019 (2.5019) loss_scale 2048.0000 (1258.6092) mem 7382MB [2024-08-27 09:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][380/1251] eta 0:03:30 lr 0.000542 wd 0.0500 time 0.2479 (0.2416) data time 0.0008 (0.0029) model time 0.2471 (0.2385) loss 2.3771 (3.2203) grad_norm 1.6906 (2.5048) loss_scale 2048.0000 (1279.3281) mem 7382MB [2024-08-27 09:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][390/1251] eta 0:03:28 lr 0.000542 wd 0.0500 time 0.2376 (0.2416) data time 0.0011 (0.0029) model time 0.2365 (0.2385) loss 3.2226 (3.2118) grad_norm 3.2544 (inf) loss_scale 1024.0000 (1293.7494) mem 7382MB [2024-08-27 09:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][400/1251] eta 0:03:25 lr 0.000542 wd 0.0500 time 0.2320 (0.2415) data time 0.0011 (0.0028) model time 0.2310 (0.2385) loss 3.5082 (3.2147) grad_norm 2.5134 (inf) loss_scale 1024.0000 (1287.0224) mem 7382MB [2024-08-27 09:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][410/1251] eta 0:03:23 lr 0.000542 wd 0.0500 time 0.2477 (0.2416) data time 0.0008 (0.0028) model time 0.2469 (0.2386) loss 3.2904 (3.2194) grad_norm 2.6367 (inf) loss_scale 1024.0000 (1280.6229) mem 7382MB [2024-08-27 09:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][420/1251] eta 0:03:20 lr 0.000542 wd 0.0500 time 0.2313 (0.2415) data time 0.0009 (0.0027) model time 0.2304 (0.2385) loss 3.6604 (3.2144) 
grad_norm 2.9698 (inf) loss_scale 1024.0000 (1274.5273) mem 7382MB [2024-08-27 09:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][430/1251] eta 0:03:19 lr 0.000542 wd 0.0500 time 0.2367 (0.2424) data time 0.0008 (0.0027) model time 0.2359 (0.2397) loss 3.7676 (3.2111) grad_norm 2.7633 (inf) loss_scale 1024.0000 (1268.7146) mem 7382MB [2024-08-27 09:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][440/1251] eta 0:03:17 lr 0.000542 wd 0.0500 time 0.2316 (0.2429) data time 0.0009 (0.0027) model time 0.2307 (0.2403) loss 4.0464 (3.2118) grad_norm 3.0074 (inf) loss_scale 1024.0000 (1263.1655) mem 7382MB [2024-08-27 09:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][450/1251] eta 0:03:14 lr 0.000542 wd 0.0500 time 0.2464 (0.2429) data time 0.0012 (0.0026) model time 0.2452 (0.2403) loss 3.6666 (3.2092) grad_norm 2.8710 (inf) loss_scale 1024.0000 (1257.8625) mem 7382MB [2024-08-27 09:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][460/1251] eta 0:03:12 lr 0.000542 wd 0.0500 time 0.2388 (0.2427) data time 0.0009 (0.0026) model time 0.2379 (0.2402) loss 4.1221 (3.2049) grad_norm 2.1703 (inf) loss_scale 1024.0000 (1252.7896) mem 7382MB [2024-08-27 09:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][470/1251] eta 0:03:09 lr 0.000542 wd 0.0500 time 0.2496 (0.2427) data time 0.0009 (0.0026) model time 0.2487 (0.2402) loss 3.0316 (3.2037) grad_norm 3.1620 (inf) loss_scale 1024.0000 (1247.9321) mem 7382MB [2024-08-27 09:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][480/1251] eta 0:03:07 lr 0.000542 wd 0.0500 time 0.2358 (0.2426) data time 0.0010 (0.0025) model time 0.2348 (0.2401) loss 2.9827 (3.2095) grad_norm 1.7549 (inf) loss_scale 1024.0000 (1243.2765) mem 7382MB [2024-08-27 09:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][490/1251] eta 0:03:04 lr 0.000542 wd 0.0500 time 0.2373 
(0.2426) data time 0.0008 (0.0025) model time 0.2365 (0.2401) loss 2.8077 (3.2094) grad_norm 2.0901 (inf) loss_scale 1024.0000 (1238.8106) mem 7382MB [2024-08-27 09:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][500/1251] eta 0:03:02 lr 0.000542 wd 0.0500 time 0.2390 (0.2424) data time 0.0011 (0.0025) model time 0.2379 (0.2400) loss 2.6492 (3.2064) grad_norm 1.9574 (inf) loss_scale 1024.0000 (1234.5230) mem 7382MB [2024-08-27 09:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][510/1251] eta 0:02:59 lr 0.000542 wd 0.0500 time 0.2446 (0.2425) data time 0.0012 (0.0024) model time 0.2434 (0.2400) loss 3.9314 (3.2084) grad_norm 3.2044 (inf) loss_scale 1024.0000 (1230.4031) mem 7382MB [2024-08-27 09:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][520/1251] eta 0:02:57 lr 0.000542 wd 0.0500 time 0.2458 (0.2425) data time 0.0008 (0.0024) model time 0.2450 (0.2401) loss 3.7091 (3.2083) grad_norm 2.6812 (inf) loss_scale 1024.0000 (1226.4415) mem 7382MB [2024-08-27 09:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][530/1251] eta 0:02:54 lr 0.000541 wd 0.0500 time 0.2423 (0.2424) data time 0.0008 (0.0024) model time 0.2416 (0.2400) loss 4.2149 (3.2080) grad_norm 2.1886 (inf) loss_scale 1024.0000 (1222.6290) mem 7382MB [2024-08-27 09:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][540/1251] eta 0:02:52 lr 0.000541 wd 0.0500 time 0.2497 (0.2424) data time 0.0008 (0.0024) model time 0.2489 (0.2400) loss 3.6413 (3.2119) grad_norm 2.4993 (inf) loss_scale 1024.0000 (1218.9575) mem 7382MB [2024-08-27 09:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][550/1251] eta 0:02:49 lr 0.000541 wd 0.0500 time 0.2309 (0.2423) data time 0.0012 (0.0023) model time 0.2297 (0.2400) loss 3.4935 (3.2153) grad_norm 7.3956 (inf) loss_scale 1024.0000 (1215.4192) mem 7382MB [2024-08-27 09:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [153/300][560/1251] eta 0:02:47 lr 0.000541 wd 0.0500 time 0.2381 (0.2423) data time 0.0012 (0.0023) model time 0.2369 (0.2400) loss 3.1210 (3.2156) grad_norm 2.3383 (inf) loss_scale 1024.0000 (1212.0071) mem 7382MB [2024-08-27 09:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][570/1251] eta 0:02:44 lr 0.000541 wd 0.0500 time 0.2397 (0.2423) data time 0.0009 (0.0023) model time 0.2389 (0.2400) loss 3.0626 (3.2157) grad_norm 2.0159 (inf) loss_scale 1024.0000 (1208.7145) mem 7382MB [2024-08-27 09:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][580/1251] eta 0:02:42 lr 0.000541 wd 0.0500 time 0.4871 (0.2426) data time 0.0008 (0.0023) model time 0.4863 (0.2404) loss 2.9853 (3.2122) grad_norm 4.0244 (inf) loss_scale 1024.0000 (1205.5353) mem 7382MB [2024-08-27 09:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][590/1251] eta 0:02:40 lr 0.000541 wd 0.0500 time 0.2407 (0.2425) data time 0.0008 (0.0023) model time 0.2399 (0.2403) loss 3.1249 (3.2092) grad_norm 2.1863 (inf) loss_scale 1024.0000 (1202.4636) mem 7382MB [2024-08-27 09:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][600/1251] eta 0:02:37 lr 0.000541 wd 0.0500 time 0.2363 (0.2425) data time 0.0009 (0.0022) model time 0.2354 (0.2403) loss 4.0106 (3.2105) grad_norm 4.5643 (inf) loss_scale 1024.0000 (1199.4942) mem 7382MB [2024-08-27 09:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][610/1251] eta 0:02:35 lr 0.000541 wd 0.0500 time 0.2305 (0.2425) data time 0.0009 (0.0022) model time 0.2296 (0.2403) loss 3.5509 (3.2160) grad_norm 3.1360 (inf) loss_scale 1024.0000 (1196.6219) mem 7382MB [2024-08-27 09:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][620/1251] eta 0:02:32 lr 0.000541 wd 0.0500 time 0.2450 (0.2424) data time 0.0009 (0.0022) model time 0.2440 (0.2402) loss 3.4747 (3.2130) grad_norm 1.7844 (inf) loss_scale 1024.0000 
(1193.8422) mem 7382MB [2024-08-27 09:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][630/1251] eta 0:02:30 lr 0.000541 wd 0.0500 time 0.2310 (0.2423) data time 0.0009 (0.0022) model time 0.2301 (0.2401) loss 3.6332 (3.2149) grad_norm 1.8452 (inf) loss_scale 1024.0000 (1191.1506) mem 7382MB [2024-08-27 09:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][640/1251] eta 0:02:28 lr 0.000541 wd 0.0500 time 0.2466 (0.2423) data time 0.0009 (0.0022) model time 0.2457 (0.2401) loss 3.3343 (3.2110) grad_norm 2.3703 (inf) loss_scale 1024.0000 (1188.5429) mem 7382MB [2024-08-27 09:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][650/1251] eta 0:02:25 lr 0.000541 wd 0.0500 time 0.2348 (0.2422) data time 0.0010 (0.0021) model time 0.2338 (0.2401) loss 2.7524 (3.2079) grad_norm 2.1574 (inf) loss_scale 1024.0000 (1186.0154) mem 7382MB [2024-08-27 09:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][660/1251] eta 0:02:23 lr 0.000541 wd 0.0500 time 0.2462 (0.2422) data time 0.0008 (0.0021) model time 0.2454 (0.2401) loss 3.9786 (3.2089) grad_norm 4.8064 (inf) loss_scale 1024.0000 (1183.5643) mem 7382MB [2024-08-27 09:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][670/1251] eta 0:02:20 lr 0.000541 wd 0.0500 time 0.2380 (0.2421) data time 0.0008 (0.0021) model time 0.2372 (0.2400) loss 3.1886 (3.2096) grad_norm 3.1645 (inf) loss_scale 1024.0000 (1181.1863) mem 7382MB [2024-08-27 09:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][680/1251] eta 0:02:18 lr 0.000541 wd 0.0500 time 0.2376 (0.2421) data time 0.0008 (0.0021) model time 0.2368 (0.2400) loss 3.3545 (3.2062) grad_norm 2.8466 (inf) loss_scale 1024.0000 (1178.8781) mem 7382MB [2024-08-27 09:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][690/1251] eta 0:02:15 lr 0.000541 wd 0.0500 time 0.2367 (0.2421) data time 0.0011 (0.0021) model time 
0.2356 (0.2400) loss 3.7939 (3.2102) grad_norm 3.1033 (inf) loss_scale 1024.0000 (1176.6368) mem 7382MB [2024-08-27 09:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][700/1251] eta 0:02:13 lr 0.000541 wd 0.0500 time 0.2413 (0.2420) data time 0.0009 (0.0021) model time 0.2403 (0.2400) loss 2.5835 (3.2077) grad_norm 3.1202 (inf) loss_scale 1024.0000 (1174.4593) mem 7382MB [2024-08-27 09:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][710/1251] eta 0:02:10 lr 0.000541 wd 0.0500 time 0.2332 (0.2420) data time 0.0012 (0.0021) model time 0.2320 (0.2399) loss 3.1447 (3.2086) grad_norm 2.6580 (inf) loss_scale 1024.0000 (1172.3432) mem 7382MB [2024-08-27 09:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][720/1251] eta 0:02:08 lr 0.000541 wd 0.0500 time 0.2367 (0.2419) data time 0.0007 (0.0020) model time 0.2360 (0.2399) loss 3.5492 (3.2096) grad_norm 2.0466 (inf) loss_scale 1024.0000 (1170.2857) mem 7382MB [2024-08-27 09:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][730/1251] eta 0:02:06 lr 0.000541 wd 0.0500 time 0.2346 (0.2419) data time 0.0009 (0.0020) model time 0.2337 (0.2399) loss 2.9854 (3.2109) grad_norm 2.2765 (inf) loss_scale 1024.0000 (1168.2845) mem 7382MB [2024-08-27 09:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][740/1251] eta 0:02:03 lr 0.000541 wd 0.0500 time 0.2461 (0.2418) data time 0.0009 (0.0020) model time 0.2452 (0.2398) loss 3.2808 (3.2109) grad_norm 2.0039 (inf) loss_scale 1024.0000 (1166.3374) mem 7382MB [2024-08-27 09:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][750/1251] eta 0:02:01 lr 0.000541 wd 0.0500 time 0.2406 (0.2418) data time 0.0007 (0.0020) model time 0.2398 (0.2398) loss 2.7536 (3.2082) grad_norm 2.8548 (inf) loss_scale 1024.0000 (1164.4421) mem 7382MB [2024-08-27 09:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][760/1251] eta 0:01:58 
lr 0.000540 wd 0.0500 time 0.2410 (0.2418) data time 0.0010 (0.0020) model time 0.2399 (0.2398) loss 3.3054 (3.2060) grad_norm 2.5266 (inf) loss_scale 1024.0000 (1162.5966) mem 7382MB [2024-08-27 09:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][770/1251] eta 0:01:56 lr 0.000540 wd 0.0500 time 0.2322 (0.2417) data time 0.0010 (0.0020) model time 0.2312 (0.2397) loss 3.6485 (3.2060) grad_norm 3.6664 (inf) loss_scale 1024.0000 (1160.7990) mem 7382MB [2024-08-27 09:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][780/1251] eta 0:01:53 lr 0.000540 wd 0.0500 time 0.2527 (0.2417) data time 0.0010 (0.0020) model time 0.2517 (0.2398) loss 3.4161 (3.2062) grad_norm 2.4573 (inf) loss_scale 1024.0000 (1159.0474) mem 7382MB [2024-08-27 09:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][790/1251] eta 0:01:51 lr 0.000540 wd 0.0500 time 0.2389 (0.2417) data time 0.0007 (0.0020) model time 0.2382 (0.2397) loss 3.4501 (3.2046) grad_norm 7.9812 (inf) loss_scale 1024.0000 (1157.3401) mem 7382MB [2024-08-27 09:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][800/1251] eta 0:01:48 lr 0.000540 wd 0.0500 time 0.2351 (0.2417) data time 0.0010 (0.0020) model time 0.2341 (0.2397) loss 2.4826 (3.2048) grad_norm 3.0876 (inf) loss_scale 1024.0000 (1155.6754) mem 7382MB [2024-08-27 09:36:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][810/1251] eta 0:01:46 lr 0.000540 wd 0.0500 time 0.2416 (0.2417) data time 0.0007 (0.0019) model time 0.2409 (0.2397) loss 3.3833 (3.2057) grad_norm 2.8804 (inf) loss_scale 1024.0000 (1154.0518) mem 7382MB [2024-08-27 09:36:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][820/1251] eta 0:01:44 lr 0.000540 wd 0.0500 time 0.2391 (0.2416) data time 0.0008 (0.0019) model time 0.2384 (0.2396) loss 3.3187 (3.2046) grad_norm 2.9141 (inf) loss_scale 1024.0000 (1152.4677) mem 7382MB [2024-08-27 09:36:09 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][830/1251] eta 0:01:41 lr 0.000540 wd 0.0500 time 0.2263 (0.2415) data time 0.0011 (0.0019) model time 0.2253 (0.2396) loss 3.1529 (3.2052) grad_norm 1.6671 (inf) loss_scale 1024.0000 (1150.9218) mem 7382MB [2024-08-27 09:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][840/1251] eta 0:01:39 lr 0.000540 wd 0.0500 time 0.2314 (0.2415) data time 0.0008 (0.0019) model time 0.2305 (0.2396) loss 3.7470 (3.2074) grad_norm 2.3128 (inf) loss_scale 1024.0000 (1149.4126) mem 7382MB [2024-08-27 09:36:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][850/1251] eta 0:01:36 lr 0.000540 wd 0.0500 time 0.2367 (0.2415) data time 0.0011 (0.0019) model time 0.2355 (0.2396) loss 3.0624 (3.2076) grad_norm 2.3751 (inf) loss_scale 1024.0000 (1147.9389) mem 7382MB [2024-08-27 09:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][860/1251] eta 0:01:34 lr 0.000540 wd 0.0500 time 0.2385 (0.2415) data time 0.0008 (0.0019) model time 0.2377 (0.2396) loss 2.1136 (3.2054) grad_norm 4.1199 (inf) loss_scale 1024.0000 (1146.4994) mem 7382MB [2024-08-27 09:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][870/1251] eta 0:01:31 lr 0.000540 wd 0.0500 time 0.2380 (0.2414) data time 0.0010 (0.0019) model time 0.2370 (0.2395) loss 3.4604 (3.2056) grad_norm 2.2199 (inf) loss_scale 1024.0000 (1145.0930) mem 7382MB [2024-08-27 09:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][880/1251] eta 0:01:29 lr 0.000540 wd 0.0500 time 0.2395 (0.2414) data time 0.0010 (0.0019) model time 0.2385 (0.2395) loss 3.0804 (3.2049) grad_norm 2.0287 (inf) loss_scale 1024.0000 (1143.7185) mem 7382MB [2024-08-27 09:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][890/1251] eta 0:01:27 lr 0.000540 wd 0.0500 time 0.2341 (0.2414) data time 0.0008 (0.0019) model time 0.2334 (0.2395) loss 1.9609 (3.2037) 
grad_norm 2.5792 (inf) loss_scale 1024.0000 (1142.3749) mem 7382MB [2024-08-27 09:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][900/1251] eta 0:01:24 lr 0.000540 wd 0.0500 time 0.2361 (0.2414) data time 0.0007 (0.0019) model time 0.2354 (0.2395) loss 3.8271 (3.2055) grad_norm 2.2631 (inf) loss_scale 1024.0000 (1141.0610) mem 7382MB [2024-08-27 09:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][910/1251] eta 0:01:22 lr 0.000540 wd 0.0500 time 0.2393 (0.2414) data time 0.0010 (0.0018) model time 0.2383 (0.2395) loss 2.8057 (3.2097) grad_norm 2.7213 (inf) loss_scale 1024.0000 (1139.7761) mem 7382MB [2024-08-27 09:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][920/1251] eta 0:01:19 lr 0.000540 wd 0.0500 time 0.2512 (0.2414) data time 0.0008 (0.0018) model time 0.2505 (0.2395) loss 2.5032 (3.2098) grad_norm 1.9438 (inf) loss_scale 1024.0000 (1138.5190) mem 7382MB [2024-08-27 09:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][930/1251] eta 0:01:17 lr 0.000540 wd 0.0500 time 0.2322 (0.2413) data time 0.0011 (0.0018) model time 0.2311 (0.2395) loss 3.3241 (3.2102) grad_norm 2.7370 (inf) loss_scale 1024.0000 (1137.2889) mem 7382MB [2024-08-27 09:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][940/1251] eta 0:01:15 lr 0.000540 wd 0.0500 time 0.2322 (0.2413) data time 0.0008 (0.0018) model time 0.2314 (0.2394) loss 2.6298 (3.2084) grad_norm 2.1204 (inf) loss_scale 1024.0000 (1136.0850) mem 7382MB [2024-08-27 09:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][950/1251] eta 0:01:12 lr 0.000540 wd 0.0500 time 0.3777 (0.2419) data time 0.0012 (0.0018) model time 0.3765 (0.2401) loss 2.4483 (3.2071) grad_norm 2.2902 (inf) loss_scale 1024.0000 (1134.9064) mem 7382MB [2024-08-27 09:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][960/1251] eta 0:01:10 lr 0.000540 wd 0.0500 time 0.2463 
(0.2421) data time 0.0008 (0.0018) model time 0.2456 (0.2403) loss 3.7353 (3.2045) grad_norm 1.9707 (inf) loss_scale 1024.0000 (1133.7523) mem 7382MB [2024-08-27 09:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][970/1251] eta 0:01:08 lr 0.000540 wd 0.0500 time 0.2414 (0.2420) data time 0.0010 (0.0018) model time 0.2404 (0.2402) loss 3.6934 (3.2044) grad_norm 2.0708 (inf) loss_scale 1024.0000 (1132.6220) mem 7382MB [2024-08-27 09:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][980/1251] eta 0:01:05 lr 0.000539 wd 0.0500 time 0.2355 (0.2420) data time 0.0009 (0.0018) model time 0.2347 (0.2402) loss 3.6385 (3.2062) grad_norm 2.0228 (inf) loss_scale 1024.0000 (1131.5148) mem 7382MB [2024-08-27 09:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][990/1251] eta 0:01:03 lr 0.000539 wd 0.0500 time 0.2332 (0.2420) data time 0.0008 (0.0018) model time 0.2325 (0.2402) loss 3.7116 (3.2085) grad_norm 3.3497 (inf) loss_scale 1024.0000 (1130.4299) mem 7382MB [2024-08-27 09:36:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1000/1251] eta 0:01:00 lr 0.000539 wd 0.0500 time 0.2412 (0.2419) data time 0.0010 (0.0018) model time 0.2402 (0.2402) loss 3.7306 (3.2075) grad_norm 2.3820 (inf) loss_scale 1024.0000 (1129.3666) mem 7382MB [2024-08-27 09:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1010/1251] eta 0:00:58 lr 0.000539 wd 0.0500 time 0.2467 (0.2419) data time 0.0008 (0.0018) model time 0.2459 (0.2401) loss 2.3905 (3.2084) grad_norm 1.9858 (inf) loss_scale 1024.0000 (1128.3244) mem 7382MB [2024-08-27 09:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1020/1251] eta 0:00:55 lr 0.000539 wd 0.0500 time 0.2418 (0.2418) data time 0.0008 (0.0018) model time 0.2410 (0.2401) loss 2.7796 (3.2082) grad_norm 2.0415 (inf) loss_scale 1024.0000 (1127.3026) mem 7382MB [2024-08-27 09:36:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [153/300][1030/1251] eta 0:00:53 lr 0.000539 wd 0.0500 time 0.2354 (0.2418) data time 0.0010 (0.0017) model time 0.2344 (0.2401) loss 3.5108 (3.2094) grad_norm 1.9954 (inf) loss_scale 1024.0000 (1126.3007) mem 7382MB [2024-08-27 09:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1040/1251] eta 0:00:51 lr 0.000539 wd 0.0500 time 0.2431 (0.2418) data time 0.0007 (0.0017) model time 0.2424 (0.2400) loss 3.5472 (3.2108) grad_norm 2.8614 (inf) loss_scale 1024.0000 (1125.3180) mem 7382MB [2024-08-27 09:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1050/1251] eta 0:00:48 lr 0.000539 wd 0.0500 time 0.2414 (0.2418) data time 0.0011 (0.0017) model time 0.2402 (0.2400) loss 3.4129 (3.2100) grad_norm 2.9070 (inf) loss_scale 1024.0000 (1124.3539) mem 7382MB [2024-08-27 09:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1060/1251] eta 0:00:46 lr 0.000539 wd 0.0500 time 0.2383 (0.2417) data time 0.0008 (0.0017) model time 0.2375 (0.2400) loss 2.4203 (3.2089) grad_norm 2.7109 (inf) loss_scale 1024.0000 (1123.4081) mem 7382MB [2024-08-27 09:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1070/1251] eta 0:00:43 lr 0.000539 wd 0.0500 time 0.2343 (0.2417) data time 0.0007 (0.0017) model time 0.2336 (0.2400) loss 4.1667 (3.2104) grad_norm 1.9260 (inf) loss_scale 1024.0000 (1122.4799) mem 7382MB [2024-08-27 09:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1080/1251] eta 0:00:41 lr 0.000539 wd 0.0500 time 0.2391 (0.2417) data time 0.0008 (0.0017) model time 0.2383 (0.2400) loss 2.6172 (3.2103) grad_norm 1.8397 (inf) loss_scale 1024.0000 (1121.5689) mem 7382MB [2024-08-27 09:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1090/1251] eta 0:00:38 lr 0.000539 wd 0.0500 time 0.2398 (0.2416) data time 0.0008 (0.0017) model time 0.2390 (0.2399) loss 3.7991 (3.2123) grad_norm 2.5574 (inf) loss_scale 1024.0000 
(1120.6746) mem 7382MB [2024-08-27 09:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1100/1251] eta 0:00:36 lr 0.000539 wd 0.0500 time 0.2355 (0.2417) data time 0.0011 (0.0017) model time 0.2344 (0.2399) loss 3.1119 (3.2136) grad_norm 2.5684 (inf) loss_scale 1024.0000 (1119.7965) mem 7382MB [2024-08-27 09:37:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1110/1251] eta 0:00:34 lr 0.000539 wd 0.0500 time 0.2415 (0.2417) data time 0.0010 (0.0017) model time 0.2405 (0.2400) loss 3.1331 (3.2145) grad_norm 15.8732 (inf) loss_scale 1024.0000 (1118.9343) mem 7382MB [2024-08-27 09:37:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1120/1251] eta 0:00:31 lr 0.000539 wd 0.0500 time 0.2442 (0.2419) data time 0.0009 (0.0017) model time 0.2433 (0.2402) loss 2.3021 (3.2127) grad_norm 3.3709 (inf) loss_scale 1024.0000 (1118.0874) mem 7382MB [2024-08-27 09:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1130/1251] eta 0:00:29 lr 0.000539 wd 0.0500 time 0.2483 (0.2418) data time 0.0010 (0.0017) model time 0.2473 (0.2402) loss 3.3291 (3.2125) grad_norm 2.1423 (inf) loss_scale 1024.0000 (1117.2555) mem 7382MB [2024-08-27 09:37:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1140/1251] eta 0:00:26 lr 0.000539 wd 0.0500 time 0.2339 (0.2418) data time 0.0011 (0.0017) model time 0.2329 (0.2401) loss 3.5214 (3.2123) grad_norm 2.1453 (inf) loss_scale 1024.0000 (1116.4382) mem 7382MB [2024-08-27 09:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1150/1251] eta 0:00:24 lr 0.000539 wd 0.0500 time 0.2385 (0.2418) data time 0.0008 (0.0017) model time 0.2377 (0.2401) loss 2.1898 (3.2137) grad_norm 3.0866 (inf) loss_scale 1024.0000 (1115.6351) mem 7382MB [2024-08-27 09:37:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1160/1251] eta 0:00:21 lr 0.000539 wd 0.0500 time 0.2352 (0.2418) data time 0.0010 (0.0017) 
model time 0.2342 (0.2401) loss 2.9125 (3.2157) grad_norm 1.9363 (inf) loss_scale 1024.0000 (1114.8458) mem 7382MB [2024-08-27 09:37:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1170/1251] eta 0:00:19 lr 0.000539 wd 0.0500 time 0.2360 (0.2417) data time 0.0010 (0.0017) model time 0.2351 (0.2401) loss 3.3913 (3.2183) grad_norm 1.7376 (inf) loss_scale 1024.0000 (1114.0700) mem 7382MB [2024-08-27 09:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1180/1251] eta 0:00:17 lr 0.000539 wd 0.0500 time 0.2382 (0.2417) data time 0.0010 (0.0017) model time 0.2372 (0.2401) loss 3.0235 (3.2192) grad_norm 2.2826 (inf) loss_scale 1024.0000 (1113.3074) mem 7382MB [2024-08-27 09:37:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1190/1251] eta 0:00:14 lr 0.000539 wd 0.0500 time 0.2435 (0.2417) data time 0.0010 (0.0017) model time 0.2425 (0.2400) loss 2.5103 (3.2167) grad_norm 1.8452 (inf) loss_scale 1024.0000 (1112.5575) mem 7382MB [2024-08-27 09:37:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1200/1251] eta 0:00:12 lr 0.000539 wd 0.0500 time 0.2440 (0.2417) data time 0.0009 (0.0017) model time 0.2432 (0.2400) loss 3.5975 (3.2158) grad_norm 2.3990 (inf) loss_scale 1024.0000 (1111.8201) mem 7382MB [2024-08-27 09:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1210/1251] eta 0:00:09 lr 0.000538 wd 0.0500 time 0.2344 (0.2416) data time 0.0007 (0.0017) model time 0.2336 (0.2400) loss 2.4093 (3.2153) grad_norm 2.3448 (inf) loss_scale 1024.0000 (1111.0950) mem 7382MB [2024-08-27 09:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1220/1251] eta 0:00:07 lr 0.000538 wd 0.0500 time 0.2422 (0.2416) data time 0.0008 (0.0016) model time 0.2414 (0.2400) loss 3.9360 (3.2155) grad_norm 2.6030 (inf) loss_scale 1024.0000 (1110.3817) mem 7382MB [2024-08-27 09:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[153/300][1230/1251] eta 0:00:05 lr 0.000538 wd 0.0500 time 0.2363 (0.2416) data time 0.0010 (0.0016) model time 0.2353 (0.2400) loss 3.4300 (3.2161) grad_norm 1.9104 (inf) loss_scale 1024.0000 (1109.6799) mem 7382MB [2024-08-27 09:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1240/1251] eta 0:00:02 lr 0.000538 wd 0.0500 time 0.2235 (0.2415) data time 0.0005 (0.0016) model time 0.2230 (0.2399) loss 3.7925 (3.2181) grad_norm 2.5981 (inf) loss_scale 1024.0000 (1108.9895) mem 7382MB [2024-08-27 09:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [153/300][1250/1251] eta 0:00:00 lr 0.000538 wd 0.0500 time 0.2272 (0.2414) data time 0.0005 (0.0016) model time 0.2268 (0.2398) loss 3.9729 (3.2204) grad_norm 1.9927 (inf) loss_scale 1024.0000 (1108.3102) mem 7382MB [2024-08-27 09:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 153 training takes 0:05:01 [2024-08-27 09:37:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:37:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 09:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.425 (0.425) Loss 0.4895 (0.4895) Acc@1 91.309 (91.309) Acc@5 97.852 (97.852) Mem 7382MB [2024-08-27 09:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.108) Loss 0.7490 (0.7556) Acc@1 84.375 (84.082) Acc@5 96.387 (96.697) Mem 7382MB [2024-08-27 09:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.093) Loss 1.0820 (0.7707) Acc@1 75.684 (83.324) Acc@5 94.141 (96.782) Mem 7382MB [2024-08-27 09:37:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.087) Loss 1.2920 (0.8769) Acc@1 69.922 (80.844) Acc@5 90.820 (95.599) Mem 7382MB [2024-08-27 09:37:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 1.2070 (0.9297) Acc@1 71.387 (79.473) Acc@5 92.676 (95.034) Mem 7382MB [2024-08-27 09:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.084 Acc@5 94.956 [2024-08-27 09:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.1% [2024-08-27 09:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.798 (0.798) Loss 0.4043 (0.4043) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7382MB [2024-08-27 09:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.147) Loss 0.6411 (0.6366) Acc@1 87.012 (86.364) Acc@5 96.973 (97.283) Mem 7382MB [2024-08-27 09:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.113) Loss 0.9028 (0.6606) Acc@1 78.516 (85.398) Acc@5 95.703 (97.331) Mem 7382MB [2024-08-27 09:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.102) Loss 1.1504 (0.7505) Acc@1 71.094 (83.191) Acc@5 92.969 (96.368) Mem 7382MB [2024-08-27 09:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.0381 
(0.7979) Acc@1 74.512 (81.795) Acc@5 93.750 (95.877) Mem 7382MB [2024-08-27 09:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.348 Acc@5 95.862 [2024-08-27 09:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-27 09:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.35% [2024-08-27 09:37:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 09:38:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 09:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][0/1251] eta 0:14:18 lr 0.000538 wd 0.0500 time 0.6863 (0.6863) data time 0.4576 (0.4576) model time 0.0000 (0.0000) loss 2.9336 (2.9336) grad_norm 3.6490 (3.6490) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][10/1251] eta 0:05:42 lr 0.000538 wd 0.0500 time 0.2335 (0.2763) data time 0.0009 (0.0425) model time 0.0000 (0.0000) loss 4.2275 (2.9284) grad_norm 1.7030 (2.5622) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][20/1251] eta 0:05:17 lr 0.000538 wd 0.0500 time 0.2371 (0.2582) data time 0.0010 (0.0228) model time 0.0000 (0.0000) loss 2.2864 (3.0587) grad_norm 1.7942 (2.4276) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][30/1251] eta 0:05:07 lr 0.000538 wd 0.0500 time 0.2480 (0.2521) data time 0.0010 (0.0158) model time 0.0000 (0.0000) loss 3.3622 (3.1395) grad_norm 3.8951 (2.4613) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][40/1251] eta 
0:05:01 lr 0.000538 wd 0.0500 time 0.2435 (0.2488) data time 0.0009 (0.0122) model time 0.0000 (0.0000) loss 2.1877 (3.1543) grad_norm 2.6446 (2.4488) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][50/1251] eta 0:04:55 lr 0.000538 wd 0.0500 time 0.2397 (0.2464) data time 0.0010 (0.0100) model time 0.0000 (0.0000) loss 2.7087 (3.1309) grad_norm 1.7272 (2.4100) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][60/1251] eta 0:04:52 lr 0.000538 wd 0.0500 time 0.2378 (0.2454) data time 0.0008 (0.0085) model time 0.2370 (0.2393) loss 3.6540 (3.1428) grad_norm 1.8390 (2.4343) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][70/1251] eta 0:04:48 lr 0.000538 wd 0.0500 time 0.2363 (0.2445) data time 0.0008 (0.0074) model time 0.2355 (0.2386) loss 3.3168 (3.1730) grad_norm 2.3369 (2.4482) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][80/1251] eta 0:04:45 lr 0.000538 wd 0.0500 time 0.2493 (0.2439) data time 0.0011 (0.0066) model time 0.2483 (0.2387) loss 3.1789 (3.1479) grad_norm 1.8552 (2.5115) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][90/1251] eta 0:04:42 lr 0.000538 wd 0.0500 time 0.2414 (0.2432) data time 0.0010 (0.0060) model time 0.2404 (0.2380) loss 3.7241 (3.1440) grad_norm 1.7231 (2.5195) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][100/1251] eta 0:04:39 lr 0.000538 wd 0.0500 time 0.2305 (0.2425) data time 0.0009 (0.0055) model time 0.2296 (0.2375) loss 3.7176 (3.1698) grad_norm 2.3687 (2.5392) loss_scale 1024.0000 (1024.0000) mem 7382MB 
[2024-08-27 09:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][110/1251] eta 0:04:36 lr 0.000538 wd 0.0500 time 0.2380 (0.2421) data time 0.0010 (0.0051) model time 0.2370 (0.2374) loss 3.6813 (3.1640) grad_norm 2.9596 (2.5716) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][120/1251] eta 0:04:33 lr 0.000538 wd 0.0500 time 0.2454 (0.2419) data time 0.0011 (0.0048) model time 0.2443 (0.2376) loss 3.3916 (3.1562) grad_norm 2.3893 (2.5433) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][130/1251] eta 0:04:30 lr 0.000538 wd 0.0500 time 0.2503 (0.2415) data time 0.0008 (0.0045) model time 0.2496 (0.2374) loss 1.8681 (3.1562) grad_norm 2.7618 (2.5207) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][140/1251] eta 0:04:28 lr 0.000538 wd 0.0500 time 0.2388 (0.2414) data time 0.0010 (0.0043) model time 0.2378 (0.2375) loss 3.4908 (3.1562) grad_norm 2.2346 (2.5281) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][150/1251] eta 0:04:25 lr 0.000538 wd 0.0500 time 0.2331 (0.2413) data time 0.0011 (0.0040) model time 0.2320 (0.2377) loss 3.8735 (3.1590) grad_norm 2.1732 (2.5152) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][160/1251] eta 0:04:23 lr 0.000538 wd 0.0500 time 0.2562 (0.2412) data time 0.0010 (0.0039) model time 0.2552 (0.2377) loss 3.1724 (3.1604) grad_norm 1.9753 (2.4973) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][170/1251] eta 0:04:20 lr 0.000538 wd 0.0500 time 0.2343 (0.2410) data time 0.0012 (0.0037) model time 0.2331 
(0.2376) loss 2.5974 (3.1467) grad_norm 2.6742 (2.4776) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][180/1251] eta 0:04:17 lr 0.000538 wd 0.0500 time 0.2463 (0.2409) data time 0.0009 (0.0035) model time 0.2454 (0.2376) loss 3.2874 (3.1404) grad_norm 2.1726 (2.4557) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][190/1251] eta 0:04:15 lr 0.000537 wd 0.0500 time 0.2389 (0.2408) data time 0.0008 (0.0034) model time 0.2381 (0.2377) loss 4.1516 (3.1394) grad_norm 2.6750 (2.4569) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][200/1251] eta 0:04:12 lr 0.000537 wd 0.0500 time 0.2341 (0.2407) data time 0.0011 (0.0033) model time 0.2329 (0.2377) loss 3.0465 (3.1310) grad_norm 3.8610 (2.4736) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][210/1251] eta 0:04:10 lr 0.000537 wd 0.0500 time 0.2421 (0.2406) data time 0.0009 (0.0032) model time 0.2412 (0.2377) loss 3.5631 (3.1359) grad_norm 2.9964 (2.4808) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][220/1251] eta 0:04:10 lr 0.000537 wd 0.0500 time 0.2295 (0.2427) data time 0.0012 (0.0031) model time 0.2284 (0.2405) loss 3.2671 (3.1253) grad_norm 2.9258 (2.4908) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][230/1251] eta 0:04:09 lr 0.000537 wd 0.0500 time 0.2412 (0.2440) data time 0.0009 (0.0030) model time 0.2403 (0.2422) loss 3.5101 (3.1306) grad_norm 2.1336 (2.4901) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[154/300][240/1251] eta 0:04:06 lr 0.000537 wd 0.0500 time 0.2448 (0.2438) data time 0.0008 (0.0029) model time 0.2440 (0.2420) loss 2.7584 (3.1346) grad_norm 1.8083 (2.4830) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][250/1251] eta 0:04:03 lr 0.000537 wd 0.0500 time 0.2394 (0.2435) data time 0.0011 (0.0028) model time 0.2383 (0.2418) loss 3.3421 (3.1440) grad_norm 2.1874 (2.4885) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][260/1251] eta 0:04:01 lr 0.000537 wd 0.0500 time 0.2560 (0.2434) data time 0.0008 (0.0028) model time 0.2552 (0.2416) loss 3.4128 (3.1429) grad_norm 2.2428 (2.4884) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][270/1251] eta 0:03:58 lr 0.000537 wd 0.0500 time 0.2341 (0.2431) data time 0.0008 (0.0027) model time 0.2333 (0.2413) loss 3.0267 (3.1523) grad_norm 2.5106 (2.4893) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][280/1251] eta 0:03:55 lr 0.000537 wd 0.0500 time 0.2361 (0.2428) data time 0.0010 (0.0027) model time 0.2351 (0.2410) loss 3.5135 (3.1499) grad_norm 1.7357 (2.4758) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][290/1251] eta 0:03:53 lr 0.000537 wd 0.0500 time 0.2386 (0.2427) data time 0.0010 (0.0026) model time 0.2376 (0.2408) loss 3.5638 (3.1546) grad_norm 2.3541 (2.4691) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][300/1251] eta 0:03:50 lr 0.000537 wd 0.0500 time 0.2421 (0.2425) data time 0.0007 (0.0025) model time 0.2414 (0.2407) loss 2.1892 (3.1550) grad_norm 2.5790 (2.4552) loss_scale 1024.0000 
(1024.0000) mem 7382MB [2024-08-27 09:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][310/1251] eta 0:03:48 lr 0.000537 wd 0.0500 time 0.2374 (0.2424) data time 0.0009 (0.0025) model time 0.2365 (0.2406) loss 3.8339 (3.1686) grad_norm 1.6842 (2.4460) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][320/1251] eta 0:03:46 lr 0.000537 wd 0.0500 time 0.2367 (0.2430) data time 0.0010 (0.0024) model time 0.2357 (0.2413) loss 3.2870 (3.1677) grad_norm 2.1114 (2.4375) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][330/1251] eta 0:03:43 lr 0.000537 wd 0.0500 time 0.2453 (0.2429) data time 0.0008 (0.0024) model time 0.2445 (0.2413) loss 2.1456 (3.1652) grad_norm 2.0775 (2.4374) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][340/1251] eta 0:03:41 lr 0.000537 wd 0.0500 time 0.2420 (0.2428) data time 0.0010 (0.0024) model time 0.2410 (0.2412) loss 3.1568 (3.1561) grad_norm 2.3946 (2.4446) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][350/1251] eta 0:03:38 lr 0.000537 wd 0.0500 time 0.2387 (0.2427) data time 0.0010 (0.0023) model time 0.2377 (0.2410) loss 3.2860 (3.1572) grad_norm 2.0525 (2.4435) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][360/1251] eta 0:03:36 lr 0.000537 wd 0.0500 time 0.2384 (0.2425) data time 0.0011 (0.0023) model time 0.2373 (0.2408) loss 3.3309 (3.1603) grad_norm 2.2016 (2.4507) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][370/1251] eta 0:03:33 lr 0.000537 wd 0.0500 time 0.2361 (0.2424) data time 0.0009 
(0.0023) model time 0.2352 (0.2407) loss 3.3172 (3.1590) grad_norm 2.8582 (2.4471) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][380/1251] eta 0:03:31 lr 0.000537 wd 0.0500 time 0.2390 (0.2423) data time 0.0011 (0.0022) model time 0.2379 (0.2407) loss 2.8071 (3.1571) grad_norm 2.7364 (2.4457) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][390/1251] eta 0:03:28 lr 0.000537 wd 0.0500 time 0.2349 (0.2422) data time 0.0010 (0.0022) model time 0.2339 (0.2406) loss 2.7092 (3.1607) grad_norm 2.0340 (2.4408) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][400/1251] eta 0:03:26 lr 0.000537 wd 0.0500 time 0.2482 (0.2422) data time 0.0007 (0.0022) model time 0.2475 (0.2405) loss 3.0837 (3.1691) grad_norm 2.2746 (2.4389) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][410/1251] eta 0:03:23 lr 0.000536 wd 0.0500 time 0.2492 (0.2421) data time 0.0008 (0.0021) model time 0.2484 (0.2405) loss 4.0719 (3.1720) grad_norm 3.4598 (2.4549) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][420/1251] eta 0:03:21 lr 0.000536 wd 0.0500 time 0.2396 (0.2420) data time 0.0008 (0.0021) model time 0.2387 (0.2404) loss 1.9662 (3.1706) grad_norm 2.8477 (2.4620) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][430/1251] eta 0:03:18 lr 0.000536 wd 0.0500 time 0.2334 (0.2419) data time 0.0010 (0.0021) model time 0.2323 (0.2403) loss 3.4605 (3.1670) grad_norm 2.7304 (2.4589) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [154/300][440/1251] eta 0:03:16 lr 0.000536 wd 0.0500 time 0.2455 (0.2419) data time 0.0007 (0.0021) model time 0.2448 (0.2403) loss 3.2522 (3.1727) grad_norm 3.0819 (2.4624) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][450/1251] eta 0:03:13 lr 0.000536 wd 0.0500 time 0.2445 (0.2419) data time 0.0010 (0.0020) model time 0.2435 (0.2402) loss 3.4090 (3.1736) grad_norm 2.1392 (2.4550) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][460/1251] eta 0:03:11 lr 0.000536 wd 0.0500 time 0.2441 (0.2418) data time 0.0010 (0.0020) model time 0.2431 (0.2402) loss 3.3121 (3.1789) grad_norm 2.0330 (2.4499) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][470/1251] eta 0:03:08 lr 0.000536 wd 0.0500 time 0.2352 (0.2417) data time 0.0011 (0.0020) model time 0.2341 (0.2401) loss 3.6809 (3.1791) grad_norm 4.1321 (2.4606) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][480/1251] eta 0:03:06 lr 0.000536 wd 0.0500 time 0.2342 (0.2416) data time 0.0011 (0.0020) model time 0.2332 (0.2400) loss 3.5004 (3.1827) grad_norm 2.6928 (2.4837) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][490/1251] eta 0:03:03 lr 0.000536 wd 0.0500 time 0.2414 (0.2415) data time 0.0010 (0.0020) model time 0.2404 (0.2399) loss 2.4806 (3.1795) grad_norm 1.9148 (2.4788) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][500/1251] eta 0:03:01 lr 0.000536 wd 0.0500 time 0.2348 (0.2414) data time 0.0009 (0.0019) model time 0.2339 (0.2398) loss 3.7879 (3.1772) grad_norm 2.4987 (2.4695) loss_scale 
1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][510/1251] eta 0:02:58 lr 0.000536 wd 0.0500 time 0.2468 (0.2414) data time 0.0010 (0.0019) model time 0.2459 (0.2398) loss 3.1850 (3.1750) grad_norm 3.1113 (2.4672) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][520/1251] eta 0:02:56 lr 0.000536 wd 0.0500 time 0.2458 (0.2414) data time 0.0009 (0.0019) model time 0.2448 (0.2399) loss 3.8816 (3.1753) grad_norm 1.8520 (2.4605) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][530/1251] eta 0:02:54 lr 0.000536 wd 0.0500 time 0.2499 (0.2414) data time 0.0008 (0.0019) model time 0.2492 (0.2399) loss 1.9548 (3.1698) grad_norm 1.9119 (2.4571) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][540/1251] eta 0:02:51 lr 0.000536 wd 0.0500 time 0.2471 (0.2415) data time 0.0008 (0.0019) model time 0.2464 (0.2399) loss 3.8376 (3.1702) grad_norm 3.4598 (2.4585) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][550/1251] eta 0:02:49 lr 0.000536 wd 0.0500 time 0.2373 (0.2414) data time 0.0008 (0.0019) model time 0.2365 (0.2399) loss 3.3787 (3.1722) grad_norm 2.2950 (2.4680) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][560/1251] eta 0:02:46 lr 0.000536 wd 0.0500 time 0.2431 (0.2413) data time 0.0011 (0.0019) model time 0.2420 (0.2398) loss 3.1074 (3.1753) grad_norm 2.1726 (2.4666) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][570/1251] eta 0:02:44 lr 0.000536 wd 0.0500 time 0.2373 (0.2413) data time 
0.0008 (0.0018) model time 0.2366 (0.2398) loss 2.9587 (3.1792) grad_norm 4.8139 (2.4767) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][580/1251] eta 0:02:41 lr 0.000536 wd 0.0500 time 0.2393 (0.2412) data time 0.0008 (0.0018) model time 0.2385 (0.2397) loss 3.7982 (3.1797) grad_norm 2.2328 (2.4769) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][590/1251] eta 0:02:39 lr 0.000536 wd 0.0500 time 0.2504 (0.2412) data time 0.0011 (0.0018) model time 0.2492 (0.2397) loss 3.2411 (3.1817) grad_norm 4.0073 (2.4852) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][600/1251] eta 0:02:36 lr 0.000536 wd 0.0500 time 0.2409 (0.2412) data time 0.0010 (0.0018) model time 0.2398 (0.2396) loss 2.8360 (3.1773) grad_norm 1.7316 (2.4860) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][610/1251] eta 0:02:34 lr 0.000536 wd 0.0500 time 0.2309 (0.2411) data time 0.0011 (0.0018) model time 0.2298 (0.2396) loss 3.5747 (3.1731) grad_norm 1.9129 (2.4803) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][620/1251] eta 0:02:32 lr 0.000536 wd 0.0500 time 0.2338 (0.2411) data time 0.0009 (0.0018) model time 0.2329 (0.2396) loss 4.0087 (3.1763) grad_norm 2.8251 (2.4776) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][630/1251] eta 0:02:29 lr 0.000536 wd 0.0500 time 0.2408 (0.2411) data time 0.0011 (0.0018) model time 0.2397 (0.2395) loss 3.5121 (3.1724) grad_norm 1.8395 (2.4756) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [154/300][640/1251] eta 0:02:27 lr 0.000535 wd 0.0500 time 0.2502 (0.2411) data time 0.0009 (0.0018) model time 0.2494 (0.2395) loss 3.4936 (3.1759) grad_norm 3.0121 (2.4759) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][650/1251] eta 0:02:24 lr 0.000535 wd 0.0500 time 0.2358 (0.2410) data time 0.0007 (0.0018) model time 0.2350 (0.2395) loss 2.2394 (3.1718) grad_norm 3.2352 (2.4747) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][660/1251] eta 0:02:22 lr 0.000535 wd 0.0500 time 0.2414 (0.2410) data time 0.0008 (0.0017) model time 0.2406 (0.2395) loss 3.7876 (3.1754) grad_norm 2.5190 (2.4847) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][670/1251] eta 0:02:20 lr 0.000535 wd 0.0500 time 0.2447 (0.2410) data time 0.0009 (0.0017) model time 0.2438 (0.2395) loss 3.4315 (3.1768) grad_norm 2.2802 (2.4808) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][680/1251] eta 0:02:17 lr 0.000535 wd 0.0500 time 0.2398 (0.2409) data time 0.0008 (0.0017) model time 0.2389 (0.2394) loss 3.4948 (3.1753) grad_norm 2.1018 (2.4760) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][690/1251] eta 0:02:15 lr 0.000535 wd 0.0500 time 0.2570 (0.2409) data time 0.0010 (0.0017) model time 0.2561 (0.2394) loss 4.1564 (3.1775) grad_norm 2.4798 (2.4779) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][700/1251] eta 0:02:12 lr 0.000535 wd 0.0500 time 0.2442 (0.2409) data time 0.0009 (0.0017) model time 0.2432 (0.2394) loss 2.3219 (3.1754) grad_norm 2.0149 (2.4781) 
loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][710/1251] eta 0:02:10 lr 0.000535 wd 0.0500 time 0.2508 (0.2409) data time 0.0010 (0.0017) model time 0.2498 (0.2394) loss 3.7042 (3.1719) grad_norm 4.1276 (2.4833) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][720/1251] eta 0:02:07 lr 0.000535 wd 0.0500 time 0.2405 (0.2409) data time 0.0008 (0.0017) model time 0.2398 (0.2394) loss 3.6538 (3.1747) grad_norm 2.1839 (2.4825) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][730/1251] eta 0:02:05 lr 0.000535 wd 0.0500 time 0.2380 (0.2409) data time 0.0010 (0.0017) model time 0.2370 (0.2394) loss 3.4464 (3.1774) grad_norm 1.7405 (2.4774) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][740/1251] eta 0:02:03 lr 0.000535 wd 0.0500 time 0.4775 (0.2415) data time 0.0007 (0.0017) model time 0.4768 (0.2400) loss 3.0029 (3.1771) grad_norm 2.4763 (2.4741) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][750/1251] eta 0:02:01 lr 0.000535 wd 0.0500 time 0.2480 (0.2423) data time 0.0010 (0.0017) model time 0.2470 (0.2410) loss 3.6467 (3.1780) grad_norm 2.6105 (2.4733) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][760/1251] eta 0:01:58 lr 0.000535 wd 0.0500 time 0.2432 (0.2423) data time 0.0009 (0.0017) model time 0.2423 (0.2409) loss 2.9086 (3.1790) grad_norm 2.0607 (2.4739) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][770/1251] eta 0:01:56 lr 0.000535 wd 0.0500 time 0.2350 (0.2423) 
data time 0.0010 (0.0017) model time 0.2340 (0.2409) loss 3.3911 (3.1771) grad_norm 2.7149 (2.4746) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][780/1251] eta 0:01:54 lr 0.000535 wd 0.0500 time 0.2377 (0.2423) data time 0.0013 (0.0017) model time 0.2364 (0.2409) loss 3.0239 (3.1749) grad_norm 2.9499 (2.4757) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][790/1251] eta 0:01:51 lr 0.000535 wd 0.0500 time 0.2431 (0.2423) data time 0.0010 (0.0017) model time 0.2422 (0.2410) loss 4.1022 (3.1792) grad_norm 2.4635 (2.4776) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][800/1251] eta 0:01:49 lr 0.000535 wd 0.0500 time 0.2348 (0.2423) data time 0.0008 (0.0017) model time 0.2340 (0.2410) loss 3.1189 (3.1769) grad_norm 2.1495 (2.4857) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][810/1251] eta 0:01:46 lr 0.000535 wd 0.0500 time 0.2400 (0.2423) data time 0.0008 (0.0016) model time 0.2392 (0.2410) loss 2.8117 (3.1782) grad_norm 2.5698 (2.4926) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][820/1251] eta 0:01:44 lr 0.000535 wd 0.0500 time 0.2385 (0.2423) data time 0.0011 (0.0017) model time 0.2374 (0.2409) loss 3.5493 (3.1785) grad_norm 4.2689 (2.4983) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][830/1251] eta 0:01:42 lr 0.000535 wd 0.0500 time 0.2390 (0.2423) data time 0.0009 (0.0016) model time 0.2382 (0.2409) loss 3.1518 (3.1794) grad_norm 2.3615 (2.5018) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [154/300][840/1251] eta 0:01:39 lr 0.000535 wd 0.0500 time 0.2345 (0.2425) data time 0.0009 (0.0016) model time 0.2337 (0.2411) loss 2.1287 (3.1756) grad_norm 2.6194 (2.4963) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][850/1251] eta 0:01:37 lr 0.000535 wd 0.0500 time 0.2400 (0.2424) data time 0.0011 (0.0016) model time 0.2389 (0.2411) loss 3.7535 (3.1756) grad_norm 2.4590 (2.4941) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][860/1251] eta 0:01:34 lr 0.000534 wd 0.0500 time 0.2413 (0.2425) data time 0.0011 (0.0016) model time 0.2403 (0.2411) loss 3.3400 (3.1762) grad_norm 2.5662 (2.5003) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][870/1251] eta 0:01:32 lr 0.000534 wd 0.0500 time 0.2448 (0.2424) data time 0.0011 (0.0016) model time 0.2438 (0.2411) loss 2.8728 (3.1768) grad_norm 1.8834 (2.5065) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][880/1251] eta 0:01:29 lr 0.000534 wd 0.0500 time 0.2679 (0.2424) data time 0.0008 (0.0016) model time 0.2671 (0.2411) loss 2.9021 (3.1761) grad_norm 2.8201 (2.5118) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][890/1251] eta 0:01:27 lr 0.000534 wd 0.0500 time 0.2772 (0.2425) data time 0.0010 (0.0016) model time 0.2762 (0.2411) loss 2.4383 (3.1761) grad_norm 4.7234 (2.5235) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][900/1251] eta 0:01:25 lr 0.000534 wd 0.0500 time 0.2561 (0.2425) data time 0.0008 (0.0016) model time 0.2553 (0.2411) loss 2.5944 (3.1766) grad_norm 
3.1876 (2.5277) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][910/1251] eta 0:01:22 lr 0.000534 wd 0.0500 time 0.2536 (0.2425) data time 0.0009 (0.0016) model time 0.2527 (0.2411) loss 3.3601 (3.1791) grad_norm 1.8949 (2.5252) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][920/1251] eta 0:01:20 lr 0.000534 wd 0.0500 time 0.2405 (0.2425) data time 0.0007 (0.0016) model time 0.2398 (0.2411) loss 3.5299 (3.1795) grad_norm 2.0554 (2.5223) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][930/1251] eta 0:01:17 lr 0.000534 wd 0.0500 time 0.2555 (0.2424) data time 0.0009 (0.0016) model time 0.2547 (0.2411) loss 3.2359 (3.1772) grad_norm 2.3673 (2.5203) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][940/1251] eta 0:01:15 lr 0.000534 wd 0.0500 time 0.2483 (0.2424) data time 0.0012 (0.0016) model time 0.2471 (0.2411) loss 3.6687 (3.1759) grad_norm 4.7368 (2.5261) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][950/1251] eta 0:01:12 lr 0.000534 wd 0.0500 time 0.2587 (0.2424) data time 0.0011 (0.0016) model time 0.2576 (0.2411) loss 2.9764 (3.1762) grad_norm 1.6785 (2.5431) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][960/1251] eta 0:01:10 lr 0.000534 wd 0.0500 time 0.2408 (0.2424) data time 0.0009 (0.0016) model time 0.2400 (0.2411) loss 4.1641 (3.1795) grad_norm 2.0733 (2.5453) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][970/1251] eta 0:01:08 lr 0.000534 wd 0.0500 time 
0.2454 (0.2424) data time 0.0010 (0.0016) model time 0.2445 (0.2410) loss 3.3074 (3.1776) grad_norm 2.6581 (2.5438) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][980/1251] eta 0:01:05 lr 0.000534 wd 0.0500 time 0.2356 (0.2423) data time 0.0008 (0.0016) model time 0.2348 (0.2410) loss 3.8375 (3.1764) grad_norm 2.2633 (2.5438) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][990/1251] eta 0:01:03 lr 0.000534 wd 0.0500 time 0.2479 (0.2423) data time 0.0010 (0.0016) model time 0.2469 (0.2410) loss 2.4138 (3.1762) grad_norm 2.7252 (2.5443) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1000/1251] eta 0:01:00 lr 0.000534 wd 0.0500 time 0.2377 (0.2422) data time 0.0008 (0.0016) model time 0.2369 (0.2409) loss 4.2331 (3.1775) grad_norm 2.7669 (2.5446) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1010/1251] eta 0:00:58 lr 0.000534 wd 0.0500 time 0.2351 (0.2422) data time 0.0013 (0.0016) model time 0.2338 (0.2409) loss 2.0899 (3.1789) grad_norm 1.5795 (2.5438) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1020/1251] eta 0:00:55 lr 0.000534 wd 0.0500 time 0.2402 (0.2422) data time 0.0009 (0.0016) model time 0.2393 (0.2409) loss 3.4002 (3.1804) grad_norm 2.4207 (2.5434) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1030/1251] eta 0:00:53 lr 0.000534 wd 0.0500 time 0.2386 (0.2422) data time 0.0010 (0.0016) model time 0.2376 (0.2408) loss 3.0674 (3.1809) grad_norm 2.2472 (2.5412) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:12 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1040/1251] eta 0:00:51 lr 0.000534 wd 0.0500 time 0.2336 (0.2421) data time 0.0011 (0.0015) model time 0.2325 (0.2408) loss 2.6340 (3.1801) grad_norm 2.3389 (2.5384) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1050/1251] eta 0:00:48 lr 0.000534 wd 0.0500 time 0.2397 (0.2421) data time 0.0011 (0.0015) model time 0.2386 (0.2407) loss 3.5099 (3.1808) grad_norm 2.1111 (2.5363) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1060/1251] eta 0:00:46 lr 0.000534 wd 0.0500 time 0.2337 (0.2420) data time 0.0011 (0.0015) model time 0.2327 (0.2407) loss 3.3436 (3.1821) grad_norm 2.1279 (2.5338) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1070/1251] eta 0:00:43 lr 0.000534 wd 0.0500 time 0.2492 (0.2420) data time 0.0007 (0.0015) model time 0.2485 (0.2407) loss 2.5781 (3.1822) grad_norm 2.3529 (2.5330) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1080/1251] eta 0:00:41 lr 0.000534 wd 0.0500 time 0.2344 (0.2420) data time 0.0008 (0.0015) model time 0.2337 (0.2407) loss 2.9110 (3.1851) grad_norm 3.2105 (2.5338) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1090/1251] eta 0:00:38 lr 0.000533 wd 0.0500 time 0.2354 (0.2420) data time 0.0011 (0.0015) model time 0.2343 (0.2406) loss 3.1424 (3.1832) grad_norm 2.1467 (2.5342) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1100/1251] eta 0:00:36 lr 0.000533 wd 0.0500 time 0.2450 (0.2419) data time 0.0008 (0.0015) model time 0.2442 (0.2406) loss 
3.5852 (3.1831) grad_norm 2.0323 (2.5360) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1110/1251] eta 0:00:34 lr 0.000533 wd 0.0500 time 0.2375 (0.2419) data time 0.0011 (0.0015) model time 0.2364 (0.2406) loss 3.4286 (3.1835) grad_norm 2.4162 (2.5428) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1120/1251] eta 0:00:31 lr 0.000533 wd 0.0500 time 0.2333 (0.2419) data time 0.0013 (0.0015) model time 0.2320 (0.2405) loss 3.7638 (3.1850) grad_norm 2.7273 (2.5442) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1130/1251] eta 0:00:29 lr 0.000533 wd 0.0500 time 0.2438 (0.2419) data time 0.0010 (0.0015) model time 0.2427 (0.2405) loss 2.3122 (3.1842) grad_norm 2.8129 (2.5419) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1140/1251] eta 0:00:26 lr 0.000533 wd 0.0500 time 0.2398 (0.2418) data time 0.0009 (0.0015) model time 0.2389 (0.2405) loss 3.1874 (3.1825) grad_norm 2.7813 (2.5417) loss_scale 2048.0000 (1026.6924) mem 7382MB [2024-08-27 09:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1150/1251] eta 0:00:24 lr 0.000533 wd 0.0500 time 0.2495 (0.2418) data time 0.0011 (0.0015) model time 0.2484 (0.2405) loss 3.4561 (3.1833) grad_norm 2.4482 (2.5424) loss_scale 2048.0000 (1035.5656) mem 7382MB [2024-08-27 09:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1160/1251] eta 0:00:22 lr 0.000533 wd 0.0500 time 0.2350 (0.2418) data time 0.0010 (0.0015) model time 0.2340 (0.2405) loss 3.0196 (3.1838) grad_norm 2.4393 (2.5425) loss_scale 2048.0000 (1044.2860) mem 7382MB [2024-08-27 09:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1170/1251] eta 
0:00:19 lr 0.000533 wd 0.0500 time 0.2419 (0.2418) data time 0.0008 (0.0015) model time 0.2410 (0.2404) loss 4.0332 (3.1853) grad_norm 2.9572 (2.5448) loss_scale 2048.0000 (1052.8574) mem 7382MB [2024-08-27 09:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1180/1251] eta 0:00:17 lr 0.000533 wd 0.0500 time 0.2490 (0.2417) data time 0.0009 (0.0015) model time 0.2480 (0.2404) loss 3.0779 (3.1842) grad_norm 2.9385 (2.5436) loss_scale 2048.0000 (1061.2837) mem 7382MB [2024-08-27 09:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1190/1251] eta 0:00:14 lr 0.000533 wd 0.0500 time 0.2427 (0.2417) data time 0.0008 (0.0015) model time 0.2419 (0.2404) loss 3.7527 (3.1844) grad_norm 2.4256 (2.5411) loss_scale 2048.0000 (1069.5684) mem 7382MB [2024-08-27 09:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1200/1251] eta 0:00:12 lr 0.000533 wd 0.0500 time 0.2327 (0.2417) data time 0.0008 (0.0015) model time 0.2319 (0.2404) loss 3.8656 (3.1850) grad_norm 3.5980 (2.5423) loss_scale 2048.0000 (1077.7152) mem 7382MB [2024-08-27 09:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1210/1251] eta 0:00:09 lr 0.000533 wd 0.0500 time 0.2395 (0.2417) data time 0.0012 (0.0015) model time 0.2383 (0.2404) loss 3.2772 (3.1847) grad_norm 2.7610 (2.5416) loss_scale 2048.0000 (1085.7275) mem 7382MB [2024-08-27 09:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1220/1251] eta 0:00:07 lr 0.000533 wd 0.0500 time 0.2526 (0.2417) data time 0.0010 (0.0015) model time 0.2516 (0.2403) loss 2.7974 (3.1849) grad_norm 2.9409 (2.5399) loss_scale 2048.0000 (1093.6085) mem 7382MB [2024-08-27 09:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1230/1251] eta 0:00:05 lr 0.000533 wd 0.0500 time 0.2471 (0.2417) data time 0.0008 (0.0015) model time 0.2463 (0.2403) loss 2.3775 (3.1867) grad_norm 2.1245 (2.5380) loss_scale 2048.0000 (1101.3615) mem 
7382MB [2024-08-27 09:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1240/1251] eta 0:00:02 lr 0.000533 wd 0.0500 time 0.2278 (0.2416) data time 0.0005 (0.0015) model time 0.2274 (0.2402) loss 4.3287 (3.1878) grad_norm 2.2663 (2.5345) loss_scale 2048.0000 (1108.9895) mem 7382MB [2024-08-27 09:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [154/300][1250/1251] eta 0:00:00 lr 0.000533 wd 0.0500 time 0.2249 (0.2414) data time 0.0005 (0.0015) model time 0.2244 (0.2401) loss 3.8859 (3.1911) grad_norm 3.8160 (2.5341) loss_scale 2048.0000 (1116.4956) mem 7382MB [2024-08-27 09:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 154 training takes 0:05:02 [2024-08-27 09:43:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:43:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 09:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.408 (0.408) Loss 0.4873 (0.4873) Acc@1 91.016 (91.016) Acc@5 98.145 (98.145) Mem 7382MB [2024-08-27 09:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.112) Loss 0.6792 (0.7369) Acc@1 86.719 (84.215) Acc@5 97.266 (97.008) Mem 7382MB [2024-08-27 09:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.096) Loss 1.0039 (0.7634) Acc@1 75.391 (83.222) Acc@5 94.238 (96.898) Mem 7382MB [2024-08-27 09:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.2529 (0.8527) Acc@1 69.531 (81.102) Acc@5 91.211 (95.851) Mem 7382MB [2024-08-27 09:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.2246 (0.9120) Acc@1 71.191 (79.611) Acc@5 92.188 (95.134) Mem 7382MB [2024-08-27 09:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.208 Acc@5 95.096 [2024-08-27 09:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.2% [2024-08-27 09:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.816 (0.816) Loss 0.4045 (0.4045) Acc@1 92.773 (92.773) Acc@5 98.633 (98.633) Mem 7382MB [2024-08-27 09:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.147) Loss 0.6387 (0.6357) Acc@1 87.109 (86.346) Acc@5 97.070 (97.319) Mem 7382MB [2024-08-27 09:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.113) Loss 0.9028 (0.6600) Acc@1 78.223 (85.412) Acc@5 95.508 (97.349) Mem 7382MB [2024-08-27 09:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.100) Loss 1.1475 (0.7497) Acc@1 71.582 (83.235) Acc@5 92.871 (96.393) Mem 7382MB [2024-08-27 09:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 1.0391 
(0.7970) Acc@1 74.609 (81.853) Acc@5 93.750 (95.889) Mem 7382MB [2024-08-27 09:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.406 Acc@5 95.862 [2024-08-27 09:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.4% [2024-08-27 09:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.41% [2024-08-27 09:43:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 09:43:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 09:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][0/1251] eta 0:13:52 lr 0.000533 wd 0.0500 time 0.6658 (0.6658) data time 0.4403 (0.4403) model time 0.0000 (0.0000) loss 3.4396 (3.4396) grad_norm 2.1093 (2.1093) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][10/1251] eta 0:05:42 lr 0.000533 wd 0.0500 time 0.2452 (0.2762) data time 0.0008 (0.0410) model time 0.0000 (0.0000) loss 3.8569 (3.2820) grad_norm 3.4497 (3.0366) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][20/1251] eta 0:05:17 lr 0.000533 wd 0.0500 time 0.2488 (0.2579) data time 0.0008 (0.0219) model time 0.0000 (0.0000) loss 2.4731 (3.1233) grad_norm 6.2536 (3.1433) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][30/1251] eta 0:05:07 lr 0.000533 wd 0.0500 time 0.2350 (0.2516) data time 0.0011 (0.0152) model time 0.0000 (0.0000) loss 3.1331 (3.0995) grad_norm 2.0300 (2.9420) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][40/1251] eta 
0:05:05 lr 0.000533 wd 0.0500 time 0.2382 (0.2521) data time 0.0010 (0.0118) model time 0.0000 (0.0000) loss 3.4235 (3.1076) grad_norm 2.1565 (2.8410) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][50/1251] eta 0:04:59 lr 0.000533 wd 0.0500 time 0.2389 (0.2496) data time 0.0008 (0.0097) model time 0.0000 (0.0000) loss 2.1843 (3.1152) grad_norm 2.9736 (2.7325) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][60/1251] eta 0:04:55 lr 0.000532 wd 0.0500 time 0.2379 (0.2478) data time 0.0010 (0.0083) model time 0.2369 (0.2375) loss 2.7452 (3.0931) grad_norm 3.1321 (2.7164) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][70/1251] eta 0:04:51 lr 0.000532 wd 0.0500 time 0.2366 (0.2468) data time 0.0008 (0.0072) model time 0.2358 (0.2388) loss 3.1485 (3.1390) grad_norm 3.7731 (2.6946) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][80/1251] eta 0:04:48 lr 0.000532 wd 0.0500 time 0.2348 (0.2460) data time 0.0012 (0.0065) model time 0.2336 (0.2389) loss 3.2582 (3.1274) grad_norm 2.1090 (2.6619) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][90/1251] eta 0:04:44 lr 0.000532 wd 0.0500 time 0.2416 (0.2453) data time 0.0007 (0.0059) model time 0.2408 (0.2388) loss 2.6189 (3.1347) grad_norm 1.9346 (2.6468) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][100/1251] eta 0:04:41 lr 0.000532 wd 0.0500 time 0.2442 (0.2447) data time 0.0012 (0.0054) model time 0.2430 (0.2386) loss 3.2489 (3.1655) grad_norm 1.9188 (2.6234) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 09:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][110/1251] eta 0:04:38 lr 0.000532 wd 0.0500 time 0.2387 (0.2443) data time 0.0007 (0.0050) model time 0.2380 (0.2387) loss 3.7195 (3.1809) grad_norm 2.2391 (2.6293) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][120/1251] eta 0:04:35 lr 0.000532 wd 0.0500 time 0.2295 (0.2439) data time 0.0010 (0.0047) model time 0.2285 (0.2387) loss 3.8720 (3.1923) grad_norm 3.7438 (2.6340) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][130/1251] eta 0:04:33 lr 0.000532 wd 0.0500 time 0.2467 (0.2435) data time 0.0010 (0.0044) model time 0.2457 (0.2386) loss 3.5839 (3.1924) grad_norm 2.1607 (2.7253) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][140/1251] eta 0:04:30 lr 0.000532 wd 0.0500 time 0.2389 (0.2432) data time 0.0011 (0.0042) model time 0.2378 (0.2385) loss 2.1364 (3.2076) grad_norm 2.6057 (2.7065) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][150/1251] eta 0:04:27 lr 0.000532 wd 0.0500 time 0.2333 (0.2427) data time 0.0011 (0.0039) model time 0.2322 (0.2382) loss 3.6119 (3.2185) grad_norm 1.9804 (2.6934) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][160/1251] eta 0:04:24 lr 0.000532 wd 0.0500 time 0.2365 (0.2426) data time 0.0010 (0.0038) model time 0.2354 (0.2383) loss 2.8627 (3.2384) grad_norm 2.1906 (2.6830) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][170/1251] eta 0:04:21 lr 0.000532 wd 0.0500 time 0.2320 (0.2423) data time 0.0012 (0.0036) model time 0.2308 
(0.2382) loss 3.6133 (3.2460) grad_norm 3.4148 (2.7158) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][180/1251] eta 0:04:19 lr 0.000532 wd 0.0500 time 0.2430 (0.2421) data time 0.0009 (0.0035) model time 0.2421 (0.2381) loss 3.9432 (3.2580) grad_norm 2.1307 (2.6923) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][190/1251] eta 0:04:16 lr 0.000532 wd 0.0500 time 0.2389 (0.2420) data time 0.0010 (0.0033) model time 0.2380 (0.2382) loss 3.9776 (3.2587) grad_norm 2.4155 (2.6593) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][200/1251] eta 0:04:14 lr 0.000532 wd 0.0500 time 0.2313 (0.2420) data time 0.0008 (0.0033) model time 0.2305 (0.2383) loss 3.2664 (3.2584) grad_norm 2.5020 (2.6383) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][210/1251] eta 0:04:11 lr 0.000532 wd 0.0500 time 0.2315 (0.2418) data time 0.0009 (0.0031) model time 0.2306 (0.2382) loss 3.1738 (3.2534) grad_norm 2.3878 (2.6233) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][220/1251] eta 0:04:09 lr 0.000532 wd 0.0500 time 0.2347 (0.2416) data time 0.0009 (0.0030) model time 0.2338 (0.2381) loss 3.8067 (3.2497) grad_norm 2.8565 (2.6183) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][230/1251] eta 0:04:06 lr 0.000532 wd 0.0500 time 0.2432 (0.2416) data time 0.0011 (0.0030) model time 0.2422 (0.2382) loss 3.6492 (3.2514) grad_norm 2.1957 (2.6254) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[155/300][240/1251] eta 0:04:05 lr 0.000532 wd 0.0500 time 0.2315 (0.2433) data time 0.0008 (0.0029) model time 0.2306 (0.2405) loss 2.2556 (3.2427) grad_norm 2.6957 (2.6186) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][250/1251] eta 0:04:04 lr 0.000532 wd 0.0500 time 0.2373 (0.2439) data time 0.0007 (0.0028) model time 0.2366 (0.2414) loss 3.7855 (3.2409) grad_norm 1.9976 (2.6038) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][260/1251] eta 0:04:01 lr 0.000532 wd 0.0500 time 0.2395 (0.2437) data time 0.0010 (0.0027) model time 0.2385 (0.2412) loss 3.4520 (3.2324) grad_norm 2.3883 (2.5871) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][270/1251] eta 0:03:58 lr 0.000532 wd 0.0500 time 0.2374 (0.2435) data time 0.0008 (0.0027) model time 0.2366 (0.2410) loss 3.1298 (3.2290) grad_norm 2.7256 (2.5998) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][280/1251] eta 0:03:56 lr 0.000532 wd 0.0500 time 0.2282 (0.2432) data time 0.0010 (0.0026) model time 0.2272 (0.2408) loss 4.3484 (3.2370) grad_norm 3.8505 (2.6093) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][290/1251] eta 0:03:53 lr 0.000531 wd 0.0500 time 0.2302 (0.2431) data time 0.0009 (0.0026) model time 0.2292 (0.2407) loss 2.2060 (3.2427) grad_norm 2.8903 (2.6019) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][300/1251] eta 0:03:51 lr 0.000531 wd 0.0500 time 0.2394 (0.2430) data time 0.0010 (0.0025) model time 0.2385 (0.2406) loss 3.3075 (3.2409) grad_norm 2.0509 (2.6172) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 09:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][310/1251] eta 0:03:48 lr 0.000531 wd 0.0500 time 0.2399 (0.2429) data time 0.0010 (0.0025) model time 0.2389 (0.2405) loss 3.4497 (3.2486) grad_norm 2.4005 (2.6051) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][320/1251] eta 0:03:46 lr 0.000531 wd 0.0500 time 0.2422 (0.2428) data time 0.0008 (0.0024) model time 0.2414 (0.2405) loss 2.8937 (3.2463) grad_norm 2.0547 (2.6131) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][330/1251] eta 0:03:43 lr 0.000531 wd 0.0500 time 0.2362 (0.2427) data time 0.0010 (0.0024) model time 0.2352 (0.2404) loss 3.6737 (3.2375) grad_norm 2.8351 (2.6182) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][340/1251] eta 0:03:41 lr 0.000531 wd 0.0500 time 0.2503 (0.2426) data time 0.0010 (0.0023) model time 0.2493 (0.2403) loss 3.1018 (3.2340) grad_norm 1.8480 (2.5997) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][350/1251] eta 0:03:38 lr 0.000531 wd 0.0500 time 0.2364 (0.2425) data time 0.0007 (0.0023) model time 0.2357 (0.2403) loss 3.6083 (3.2257) grad_norm 2.1877 (2.6043) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][360/1251] eta 0:03:35 lr 0.000531 wd 0.0500 time 0.2359 (0.2424) data time 0.0011 (0.0023) model time 0.2348 (0.2402) loss 2.4883 (3.2252) grad_norm 2.5467 (2.6091) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][370/1251] eta 0:03:33 lr 0.000531 wd 0.0500 time 0.2343 (0.2423) data time 0.0010 
(0.0022) model time 0.2334 (0.2401) loss 3.2950 (3.2232) grad_norm 2.5150 (2.6139) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][380/1251] eta 0:03:30 lr 0.000531 wd 0.0500 time 0.2381 (0.2421) data time 0.0010 (0.0022) model time 0.2371 (0.2399) loss 2.2540 (3.2273) grad_norm 2.5676 (2.6148) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][390/1251] eta 0:03:28 lr 0.000531 wd 0.0500 time 0.2374 (0.2420) data time 0.0010 (0.0022) model time 0.2364 (0.2398) loss 2.4529 (3.2275) grad_norm 2.0709 (2.6158) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][400/1251] eta 0:03:25 lr 0.000531 wd 0.0500 time 0.2391 (0.2420) data time 0.0008 (0.0021) model time 0.2383 (0.2398) loss 3.2696 (3.2310) grad_norm 2.7686 (2.6203) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][410/1251] eta 0:03:23 lr 0.000531 wd 0.0500 time 0.2298 (0.2419) data time 0.0009 (0.0021) model time 0.2288 (0.2398) loss 3.6546 (3.2310) grad_norm 1.7401 (2.6280) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][420/1251] eta 0:03:21 lr 0.000531 wd 0.0500 time 0.2371 (0.2420) data time 0.0008 (0.0021) model time 0.2364 (0.2399) loss 3.8640 (3.2349) grad_norm 2.0883 (2.6274) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][430/1251] eta 0:03:18 lr 0.000531 wd 0.0500 time 0.2358 (0.2419) data time 0.0008 (0.0021) model time 0.2350 (0.2398) loss 3.7631 (3.2370) grad_norm 1.9377 (2.6270) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [155/300][440/1251] eta 0:03:16 lr 0.000531 wd 0.0500 time 0.2400 (0.2418) data time 0.0009 (0.0021) model time 0.2391 (0.2397) loss 2.5157 (3.2264) grad_norm 3.1191 (2.6266) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][450/1251] eta 0:03:13 lr 0.000531 wd 0.0500 time 0.2341 (0.2417) data time 0.0011 (0.0020) model time 0.2330 (0.2396) loss 2.9504 (3.2219) grad_norm 1.7718 (2.6150) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][460/1251] eta 0:03:11 lr 0.000531 wd 0.0500 time 0.2397 (0.2417) data time 0.0008 (0.0020) model time 0.2389 (0.2396) loss 2.9100 (3.2222) grad_norm 2.0015 (2.6072) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][470/1251] eta 0:03:08 lr 0.000531 wd 0.0500 time 0.2509 (0.2416) data time 0.0010 (0.0020) model time 0.2499 (0.2396) loss 4.4823 (3.2228) grad_norm 2.4527 (2.6045) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][480/1251] eta 0:03:06 lr 0.000531 wd 0.0500 time 0.2438 (0.2415) data time 0.0011 (0.0020) model time 0.2427 (0.2395) loss 3.4942 (3.2200) grad_norm 2.3643 (2.6151) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][490/1251] eta 0:03:03 lr 0.000531 wd 0.0500 time 0.2315 (0.2415) data time 0.0010 (0.0020) model time 0.2306 (0.2394) loss 4.1680 (3.2193) grad_norm 3.1278 (2.6232) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][500/1251] eta 0:03:01 lr 0.000531 wd 0.0500 time 0.2361 (0.2414) data time 0.0012 (0.0020) model time 0.2349 (0.2394) loss 3.4034 (3.2183) grad_norm 2.4781 (2.6208) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][510/1251] eta 0:02:58 lr 0.000530 wd 0.0500 time 0.2356 (0.2413) data time 0.0012 (0.0019) model time 0.2344 (0.2393) loss 3.5143 (3.2211) grad_norm 2.4646 (2.6177) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][520/1251] eta 0:02:56 lr 0.000530 wd 0.0500 time 0.2350 (0.2413) data time 0.0007 (0.0019) model time 0.2343 (0.2393) loss 2.3395 (3.2174) grad_norm 1.7971 (2.6141) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][530/1251] eta 0:02:53 lr 0.000530 wd 0.0500 time 0.2384 (0.2412) data time 0.0009 (0.0019) model time 0.2375 (0.2393) loss 3.4026 (3.2091) grad_norm 1.9313 (2.6079) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][540/1251] eta 0:02:51 lr 0.000530 wd 0.0500 time 0.2434 (0.2412) data time 0.0010 (0.0019) model time 0.2424 (0.2392) loss 3.7812 (3.2135) grad_norm 1.7384 (2.6013) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][550/1251] eta 0:02:49 lr 0.000530 wd 0.0500 time 0.2354 (0.2412) data time 0.0009 (0.0019) model time 0.2345 (0.2392) loss 2.6628 (3.2116) grad_norm 2.4834 (2.5982) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][560/1251] eta 0:02:46 lr 0.000530 wd 0.0500 time 0.2144 (0.2415) data time 0.0009 (0.0019) model time 0.2135 (0.2396) loss 2.2647 (3.2117) grad_norm 3.4703 (2.6206) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][570/1251] eta 0:02:44 lr 0.000530 wd 0.0500 time 0.2356 (0.2414) data time 
0.0008 (0.0018) model time 0.2348 (0.2395) loss 2.7286 (3.2124) grad_norm 2.6031 (2.6175) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][580/1251] eta 0:02:41 lr 0.000530 wd 0.0500 time 0.2369 (0.2413) data time 0.0010 (0.0018) model time 0.2359 (0.2395) loss 3.3574 (3.2111) grad_norm 2.0508 (2.6118) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][590/1251] eta 0:02:39 lr 0.000530 wd 0.0500 time 0.2377 (0.2413) data time 0.0012 (0.0018) model time 0.2365 (0.2394) loss 2.7790 (3.2116) grad_norm 2.0148 (2.6091) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][600/1251] eta 0:02:37 lr 0.000530 wd 0.0500 time 0.2362 (0.2413) data time 0.0009 (0.0018) model time 0.2353 (0.2394) loss 3.4644 (3.2133) grad_norm 2.3307 (2.6062) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][610/1251] eta 0:02:34 lr 0.000530 wd 0.0500 time 0.2351 (0.2412) data time 0.0011 (0.0018) model time 0.2339 (0.2394) loss 3.0109 (3.2078) grad_norm 2.2755 (2.6020) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][620/1251] eta 0:02:32 lr 0.000530 wd 0.0500 time 0.2430 (0.2412) data time 0.0007 (0.0018) model time 0.2423 (0.2394) loss 2.6166 (3.2040) grad_norm 1.9996 (2.6026) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][630/1251] eta 0:02:29 lr 0.000530 wd 0.0500 time 0.2374 (0.2411) data time 0.0007 (0.0018) model time 0.2366 (0.2393) loss 2.1967 (3.2025) grad_norm 2.6395 (2.6074) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [155/300][640/1251] eta 0:02:27 lr 0.000530 wd 0.0500 time 0.2397 (0.2411) data time 0.0011 (0.0017) model time 0.2387 (0.2393) loss 3.5325 (3.2011) grad_norm 2.4571 (2.6097) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][650/1251] eta 0:02:24 lr 0.000530 wd 0.0500 time 0.2448 (0.2411) data time 0.0010 (0.0017) model time 0.2439 (0.2393) loss 2.8008 (3.2002) grad_norm 2.7883 (2.6063) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][660/1251] eta 0:02:22 lr 0.000530 wd 0.0500 time 0.2380 (0.2411) data time 0.0008 (0.0017) model time 0.2372 (0.2393) loss 2.3774 (3.1963) grad_norm 3.3182 (2.6054) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][670/1251] eta 0:02:20 lr 0.000530 wd 0.0500 time 0.2359 (0.2410) data time 0.0008 (0.0017) model time 0.2351 (0.2392) loss 3.0933 (3.1980) grad_norm 2.4912 (2.6053) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][680/1251] eta 0:02:17 lr 0.000530 wd 0.0500 time 0.2426 (0.2410) data time 0.0010 (0.0017) model time 0.2416 (0.2392) loss 3.1517 (3.1987) grad_norm 2.5273 (2.6061) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][690/1251] eta 0:02:15 lr 0.000530 wd 0.0500 time 0.2422 (0.2410) data time 0.0011 (0.0017) model time 0.2411 (0.2392) loss 3.5351 (3.1997) grad_norm 2.2289 (2.6076) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][700/1251] eta 0:02:12 lr 0.000530 wd 0.0500 time 0.2368 (0.2410) data time 0.0010 (0.0017) model time 0.2358 (0.2392) loss 3.2406 (3.2009) grad_norm 3.3764 (2.6105) 
loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][710/1251] eta 0:02:10 lr 0.000530 wd 0.0500 time 0.2555 (0.2410) data time 0.0010 (0.0017) model time 0.2545 (0.2393) loss 3.2758 (3.2015) grad_norm 2.1733 (2.6110) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][720/1251] eta 0:02:07 lr 0.000530 wd 0.0500 time 0.2320 (0.2409) data time 0.0011 (0.0017) model time 0.2310 (0.2392) loss 3.7663 (3.2014) grad_norm 2.5625 (2.6122) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][730/1251] eta 0:02:05 lr 0.000530 wd 0.0500 time 0.2379 (0.2409) data time 0.0008 (0.0017) model time 0.2372 (0.2392) loss 3.3154 (3.2024) grad_norm 1.6908 (2.6066) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][740/1251] eta 0:02:03 lr 0.000529 wd 0.0500 time 0.2433 (0.2409) data time 0.0007 (0.0017) model time 0.2427 (0.2392) loss 2.1275 (3.2009) grad_norm 2.2334 (2.6013) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][750/1251] eta 0:02:00 lr 0.000529 wd 0.0500 time 0.2342 (0.2409) data time 0.0011 (0.0016) model time 0.2330 (0.2392) loss 2.7845 (3.1973) grad_norm 3.1577 (2.6038) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][760/1251] eta 0:01:58 lr 0.000529 wd 0.0500 time 0.2397 (0.2408) data time 0.0008 (0.0016) model time 0.2389 (0.2392) loss 4.0592 (3.1979) grad_norm 2.4263 (2.6038) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][770/1251] eta 0:01:55 lr 0.000529 wd 0.0500 time 0.2407 (0.2408) 
data time 0.0011 (0.0016) model time 0.2396 (0.2391) loss 3.0261 (3.1985) grad_norm 3.9681 (2.6033) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][780/1251] eta 0:01:53 lr 0.000529 wd 0.0500 time 0.2360 (0.2408) data time 0.0011 (0.0016) model time 0.2349 (0.2391) loss 3.6002 (3.1985) grad_norm 2.5110 (2.6154) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][790/1251] eta 0:01:50 lr 0.000529 wd 0.0500 time 0.2388 (0.2408) data time 0.0011 (0.0016) model time 0.2377 (0.2391) loss 3.6905 (3.2038) grad_norm 5.7152 (2.6180) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][800/1251] eta 0:01:48 lr 0.000529 wd 0.0500 time 0.2393 (0.2407) data time 0.0008 (0.0016) model time 0.2385 (0.2391) loss 3.0115 (3.2053) grad_norm 2.1141 (2.6231) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][810/1251] eta 0:01:46 lr 0.000529 wd 0.0500 time 0.2480 (0.2407) data time 0.0010 (0.0016) model time 0.2470 (0.2391) loss 2.9048 (3.2038) grad_norm 2.4226 (2.6234) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][820/1251] eta 0:01:43 lr 0.000529 wd 0.0500 time 0.2407 (0.2407) data time 0.0008 (0.0016) model time 0.2399 (0.2390) loss 3.0369 (3.2028) grad_norm 2.2030 (2.6234) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][830/1251] eta 0:01:41 lr 0.000529 wd 0.0500 time 0.2388 (0.2406) data time 0.0010 (0.0016) model time 0.2378 (0.2390) loss 3.4031 (3.2044) grad_norm 2.4169 (2.6266) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:34 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [155/300][840/1251] eta 0:01:38 lr 0.000529 wd 0.0500 time 0.2421 (0.2406) data time 0.0008 (0.0016) model time 0.2413 (0.2390) loss 3.1303 (3.2020) grad_norm 1.7902 (2.6224) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][850/1251] eta 0:01:36 lr 0.000529 wd 0.0500 time 0.2380 (0.2406) data time 0.0011 (0.0016) model time 0.2370 (0.2389) loss 3.0096 (3.2023) grad_norm 1.9031 (2.6162) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][860/1251] eta 0:01:34 lr 0.000529 wd 0.0500 time 0.2313 (0.2406) data time 0.0010 (0.0016) model time 0.2303 (0.2389) loss 2.3620 (3.1984) grad_norm 2.8995 (2.6141) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][870/1251] eta 0:01:31 lr 0.000529 wd 0.0500 time 0.2299 (0.2405) data time 0.0009 (0.0016) model time 0.2290 (0.2389) loss 3.8306 (3.2007) grad_norm 1.6429 (2.6138) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][880/1251] eta 0:01:29 lr 0.000529 wd 0.0500 time 0.2416 (0.2405) data time 0.0010 (0.0016) model time 0.2406 (0.2389) loss 3.4073 (3.2009) grad_norm 2.8907 (2.6172) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][890/1251] eta 0:01:26 lr 0.000529 wd 0.0500 time 0.2408 (0.2405) data time 0.0010 (0.0015) model time 0.2398 (0.2389) loss 2.1510 (3.1973) grad_norm 2.3056 (2.6279) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][900/1251] eta 0:01:24 lr 0.000529 wd 0.0500 time 0.2442 (0.2405) data time 0.0009 (0.0015) model time 0.2433 (0.2388) loss 3.2874 (3.1966) grad_norm 
2.1880 (2.6251) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][910/1251] eta 0:01:22 lr 0.000529 wd 0.0500 time 0.2376 (0.2405) data time 0.0008 (0.0015) model time 0.2369 (0.2389) loss 1.8761 (3.1975) grad_norm 2.7781 (2.6267) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 09:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][920/1251] eta 0:01:19 lr 0.000529 wd 0.0500 time 0.2348 (0.2404) data time 0.0010 (0.0015) model time 0.2338 (0.2388) loss 3.2946 (3.1987) grad_norm 2.1132 (inf) loss_scale 1024.0000 (2042.4408) mem 7382MB [2024-08-27 09:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][930/1251] eta 0:01:17 lr 0.000529 wd 0.0500 time 0.2375 (0.2404) data time 0.0012 (0.0015) model time 0.2363 (0.2388) loss 3.0592 (3.1970) grad_norm 1.8231 (inf) loss_scale 1024.0000 (2031.5016) mem 7382MB [2024-08-27 09:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][940/1251] eta 0:01:14 lr 0.000529 wd 0.0500 time 0.2322 (0.2403) data time 0.0012 (0.0015) model time 0.2310 (0.2388) loss 3.0760 (3.1959) grad_norm 2.3450 (inf) loss_scale 1024.0000 (2020.7949) mem 7382MB [2024-08-27 09:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][950/1251] eta 0:01:12 lr 0.000529 wd 0.0500 time 0.2295 (0.2403) data time 0.0009 (0.0015) model time 0.2286 (0.2387) loss 3.4995 (3.1947) grad_norm 2.9556 (inf) loss_scale 1024.0000 (2010.3134) mem 7382MB [2024-08-27 09:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][960/1251] eta 0:01:09 lr 0.000528 wd 0.0500 time 0.2291 (0.2403) data time 0.0010 (0.0015) model time 0.2281 (0.2387) loss 3.5344 (3.1975) grad_norm 1.9756 (inf) loss_scale 1024.0000 (2000.0499) mem 7382MB [2024-08-27 09:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][970/1251] eta 0:01:07 lr 0.000528 wd 0.0500 time 0.2517 
(0.2403) data time 0.0007 (0.0015) model time 0.2510 (0.2387) loss 3.4865 (3.1997) grad_norm 3.0959 (inf) loss_scale 1024.0000 (1989.9979) mem 7382MB [2024-08-27 09:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][980/1251] eta 0:01:05 lr 0.000528 wd 0.0500 time 0.2354 (0.2403) data time 0.0008 (0.0015) model time 0.2346 (0.2387) loss 3.2195 (3.2008) grad_norm 1.9489 (inf) loss_scale 1024.0000 (1980.1509) mem 7382MB [2024-08-27 09:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][990/1251] eta 0:01:02 lr 0.000528 wd 0.0500 time 0.2349 (0.2402) data time 0.0008 (0.0015) model time 0.2342 (0.2387) loss 3.6067 (3.2000) grad_norm 2.6976 (inf) loss_scale 1024.0000 (1970.5025) mem 7382MB [2024-08-27 09:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1000/1251] eta 0:01:00 lr 0.000528 wd 0.0500 time 0.2485 (0.2402) data time 0.0011 (0.0015) model time 0.2474 (0.2387) loss 3.2305 (3.2038) grad_norm 1.8138 (inf) loss_scale 1024.0000 (1961.0470) mem 7382MB [2024-08-27 09:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1010/1251] eta 0:00:57 lr 0.000528 wd 0.0500 time 0.2305 (0.2402) data time 0.0010 (0.0015) model time 0.2295 (0.2386) loss 3.4373 (3.2067) grad_norm 4.4207 (inf) loss_scale 1024.0000 (1951.7784) mem 7382MB [2024-08-27 09:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1020/1251] eta 0:00:55 lr 0.000528 wd 0.0500 time 0.2437 (0.2402) data time 0.0007 (0.0015) model time 0.2430 (0.2386) loss 3.3739 (3.2058) grad_norm 1.9941 (inf) loss_scale 1024.0000 (1942.6915) mem 7382MB [2024-08-27 09:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1030/1251] eta 0:00:53 lr 0.000528 wd 0.0500 time 0.2321 (0.2401) data time 0.0008 (0.0015) model time 0.2313 (0.2386) loss 2.7885 (3.2039) grad_norm 1.9677 (inf) loss_scale 1024.0000 (1933.7808) mem 7382MB [2024-08-27 09:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [155/300][1040/1251] eta 0:00:50 lr 0.000528 wd 0.0500 time 0.2386 (0.2401) data time 0.0010 (0.0015) model time 0.2376 (0.2385) loss 3.8200 (3.2014) grad_norm 1.9174 (inf) loss_scale 1024.0000 (1925.0413) mem 7382MB [2024-08-27 09:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1050/1251] eta 0:00:48 lr 0.000528 wd 0.0500 time 0.2412 (0.2401) data time 0.0007 (0.0015) model time 0.2404 (0.2385) loss 3.7506 (3.2017) grad_norm 2.3528 (inf) loss_scale 1024.0000 (1916.4681) mem 7382MB [2024-08-27 09:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1060/1251] eta 0:00:45 lr 0.000528 wd 0.0500 time 0.2411 (0.2401) data time 0.0010 (0.0015) model time 0.2400 (0.2385) loss 3.4924 (3.2008) grad_norm 1.7590 (inf) loss_scale 1024.0000 (1908.0566) mem 7382MB [2024-08-27 09:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1070/1251] eta 0:00:43 lr 0.000528 wd 0.0500 time 0.2413 (0.2401) data time 0.0010 (0.0015) model time 0.2404 (0.2385) loss 2.7686 (3.2031) grad_norm 2.3667 (inf) loss_scale 1024.0000 (1899.8021) mem 7382MB [2024-08-27 09:47:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1080/1251] eta 0:00:41 lr 0.000528 wd 0.0500 time 0.2377 (0.2400) data time 0.0007 (0.0015) model time 0.2369 (0.2385) loss 2.3379 (3.2033) grad_norm 3.0276 (inf) loss_scale 1024.0000 (1891.7003) mem 7382MB [2024-08-27 09:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1090/1251] eta 0:00:38 lr 0.000528 wd 0.0500 time 0.2364 (0.2401) data time 0.0009 (0.0015) model time 0.2355 (0.2385) loss 1.9518 (3.2037) grad_norm 2.4290 (inf) loss_scale 1024.0000 (1883.7470) mem 7382MB [2024-08-27 09:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1100/1251] eta 0:00:36 lr 0.000528 wd 0.0500 time 0.2349 (0.2401) data time 0.0012 (0.0015) model time 0.2337 (0.2385) loss 3.5702 (3.2044) grad_norm 3.2017 (inf) loss_scale 1024.0000 
(1875.9382) mem 7382MB [2024-08-27 09:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1110/1251] eta 0:00:33 lr 0.000528 wd 0.0500 time 0.2330 (0.2401) data time 0.0010 (0.0015) model time 0.2321 (0.2385) loss 3.3736 (3.2058) grad_norm 1.8894 (inf) loss_scale 1024.0000 (1868.2700) mem 7382MB [2024-08-27 09:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1120/1251] eta 0:00:31 lr 0.000528 wd 0.0500 time 0.2427 (0.2400) data time 0.0007 (0.0015) model time 0.2420 (0.2385) loss 3.6965 (3.2076) grad_norm 2.2403 (inf) loss_scale 1024.0000 (1860.7386) mem 7382MB [2024-08-27 09:47:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1130/1251] eta 0:00:29 lr 0.000528 wd 0.0500 time 0.2371 (0.2400) data time 0.0008 (0.0014) model time 0.2363 (0.2385) loss 3.3008 (3.2100) grad_norm 4.0616 (inf) loss_scale 1024.0000 (1853.3404) mem 7382MB [2024-08-27 09:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1140/1251] eta 0:00:26 lr 0.000528 wd 0.0500 time 0.2292 (0.2400) data time 0.0011 (0.0014) model time 0.2281 (0.2385) loss 3.3394 (3.2110) grad_norm 2.7252 (inf) loss_scale 1024.0000 (1846.0719) mem 7382MB [2024-08-27 09:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1150/1251] eta 0:00:24 lr 0.000528 wd 0.0500 time 0.2401 (0.2400) data time 0.0007 (0.0014) model time 0.2394 (0.2385) loss 2.3076 (3.2081) grad_norm 2.3675 (inf) loss_scale 1024.0000 (1838.9296) mem 7382MB [2024-08-27 09:47:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1160/1251] eta 0:00:21 lr 0.000528 wd 0.0500 time 0.2398 (0.2399) data time 0.0010 (0.0014) model time 0.2388 (0.2384) loss 3.7842 (3.2068) grad_norm 2.6234 (inf) loss_scale 1024.0000 (1831.9104) mem 7382MB [2024-08-27 09:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1170/1251] eta 0:00:19 lr 0.000528 wd 0.0500 time 0.2367 (0.2403) data time 0.0011 (0.0014) model 
time 0.2356 (0.2388) loss 3.5413 (3.2072) grad_norm 2.0692 (inf) loss_scale 1024.0000 (1825.0111) mem 7382MB [2024-08-27 09:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1180/1251] eta 0:00:17 lr 0.000528 wd 0.0500 time 0.2275 (0.2405) data time 0.0008 (0.0014) model time 0.2267 (0.2390) loss 3.5710 (3.2054) grad_norm 2.6551 (inf) loss_scale 1024.0000 (1818.2286) mem 7382MB [2024-08-27 09:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1190/1251] eta 0:00:14 lr 0.000527 wd 0.0500 time 0.2526 (0.2405) data time 0.0011 (0.0014) model time 0.2514 (0.2390) loss 3.0342 (3.2058) grad_norm 2.7274 (inf) loss_scale 1024.0000 (1811.5600) mem 7382MB [2024-08-27 09:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1200/1251] eta 0:00:12 lr 0.000527 wd 0.0500 time 0.2440 (0.2405) data time 0.0011 (0.0014) model time 0.2428 (0.2390) loss 3.2333 (3.2056) grad_norm 2.4760 (inf) loss_scale 1024.0000 (1805.0025) mem 7382MB [2024-08-27 09:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1210/1251] eta 0:00:09 lr 0.000527 wd 0.0500 time 0.2288 (0.2405) data time 0.0009 (0.0014) model time 0.2279 (0.2390) loss 3.1951 (3.2063) grad_norm 2.3724 (inf) loss_scale 1024.0000 (1798.5533) mem 7382MB [2024-08-27 09:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1220/1251] eta 0:00:07 lr 0.000527 wd 0.0500 time 0.2412 (0.2404) data time 0.0008 (0.0014) model time 0.2405 (0.2390) loss 2.9530 (3.2057) grad_norm 2.4280 (inf) loss_scale 1024.0000 (1792.2097) mem 7382MB [2024-08-27 09:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1230/1251] eta 0:00:05 lr 0.000527 wd 0.0500 time 0.2370 (0.2404) data time 0.0012 (0.0014) model time 0.2358 (0.2390) loss 3.4798 (3.2045) grad_norm 2.2740 (inf) loss_scale 1024.0000 (1785.9691) mem 7382MB [2024-08-27 09:48:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1240/1251] 
eta 0:00:02 lr 0.000527 wd 0.0500 time 0.2224 (0.2404) data time 0.0007 (0.0014) model time 0.2217 (0.2389) loss 3.4620 (3.2049) grad_norm 3.3026 (inf) loss_scale 1024.0000 (1779.8292) mem 7382MB [2024-08-27 09:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [155/300][1250/1251] eta 0:00:00 lr 0.000527 wd 0.0500 time 0.2244 (0.2402) data time 0.0007 (0.0014) model time 0.2237 (0.2388) loss 3.5841 (3.2059) grad_norm 2.4998 (inf) loss_scale 1024.0000 (1773.7874) mem 7382MB [2024-08-27 09:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 155 training takes 0:05:00 [2024-08-27 09:48:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:48:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.460 (0.460) Loss 0.4736 (0.4736) Acc@1 91.211 (91.211) Acc@5 98.047 (98.047) Mem 7382MB [2024-08-27 09:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.111) Loss 0.7432 (0.7331) Acc@1 84.375 (84.224) Acc@5 97.070 (97.053) Mem 7382MB [2024-08-27 09:48:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.095) Loss 1.0459 (0.7604) Acc@1 75.391 (83.212) Acc@5 93.457 (96.894) Mem 7382MB [2024-08-27 09:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.090) Loss 1.2979 (0.8676) Acc@1 69.727 (80.730) Acc@5 91.309 (95.637) Mem 7382MB [2024-08-27 09:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.1895 (0.9213) Acc@1 71.680 (79.335) Acc@5 92.188 (95.015) Mem 7382MB [2024-08-27 09:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 78.990 Acc@5 94.992 [2024-08-27 09:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of 
the network on the 50000 test images: 79.0% [2024-08-27 09:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.781 (0.781) Loss 0.4041 (0.4041) Acc@1 92.969 (92.969) Acc@5 98.633 (98.633) Mem 7382MB [2024-08-27 09:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.146) Loss 0.6377 (0.6348) Acc@1 86.816 (86.239) Acc@5 97.168 (97.390) Mem 7382MB [2024-08-27 09:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.114) Loss 0.9053 (0.6594) Acc@1 78.027 (85.379) Acc@5 95.508 (97.382) Mem 7382MB [2024-08-27 09:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.102) Loss 1.1455 (0.7488) Acc@1 71.191 (83.216) Acc@5 92.871 (96.409) Mem 7382MB [2024-08-27 09:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0361 (0.7961) Acc@1 74.219 (81.845) Acc@5 93.848 (95.925) Mem 7382MB [2024-08-27 09:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.402 Acc@5 95.892 [2024-08-27 09:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.4% [2024-08-27 09:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][0/1251] eta 0:21:45 lr 0.000527 wd 0.0500 time 1.0432 (1.0432) data time 0.4877 (0.4877) model time 0.0000 (0.0000) loss 3.0291 (3.0291) grad_norm 3.5434 (3.5434) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][10/1251] eta 0:06:27 lr 0.000527 wd 0.0500 time 0.2406 (0.3123) data time 0.0010 (0.0454) model time 0.0000 (0.0000) loss 3.7457 (3.1816) grad_norm 2.2233 (2.4199) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][20/1251] eta 0:05:41 lr 0.000527 wd 0.0500 time 0.2470 (0.2776) data time 0.0011 (0.0243) model time 0.0000 (0.0000) loss 
2.4281 (3.0777) grad_norm 2.2816 (2.3813) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][30/1251] eta 0:05:23 lr 0.000527 wd 0.0500 time 0.2367 (0.2653) data time 0.0009 (0.0168) model time 0.0000 (0.0000) loss 3.6694 (3.0667) grad_norm 2.4329 (2.5313) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][40/1251] eta 0:05:13 lr 0.000527 wd 0.0500 time 0.2359 (0.2590) data time 0.0013 (0.0129) model time 0.0000 (0.0000) loss 2.9259 (3.0324) grad_norm 1.7256 (2.5282) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][50/1251] eta 0:05:05 lr 0.000527 wd 0.0500 time 0.2367 (0.2547) data time 0.0010 (0.0106) model time 0.0000 (0.0000) loss 3.5782 (3.0516) grad_norm 2.1687 (2.5240) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][60/1251] eta 0:05:00 lr 0.000527 wd 0.0500 time 0.2389 (0.2523) data time 0.0011 (0.0090) model time 0.2378 (0.2389) loss 3.6553 (3.1123) grad_norm 1.9655 (2.4967) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][70/1251] eta 0:04:55 lr 0.000527 wd 0.0500 time 0.2342 (0.2505) data time 0.0009 (0.0079) model time 0.2333 (0.2387) loss 3.7455 (3.1340) grad_norm 2.2783 (2.4664) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][80/1251] eta 0:04:51 lr 0.000527 wd 0.0500 time 0.2395 (0.2489) data time 0.0008 (0.0070) model time 0.2387 (0.2379) loss 2.7856 (3.1397) grad_norm 2.8214 (2.5739) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][90/1251] eta 0:04:47 lr 
0.000527 wd 0.0500 time 0.2407 (0.2479) data time 0.0011 (0.0064) model time 0.2396 (0.2381) loss 2.9682 (3.1700) grad_norm 2.1391 (2.5854) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][100/1251] eta 0:04:44 lr 0.000527 wd 0.0500 time 0.2356 (0.2469) data time 0.0008 (0.0059) model time 0.2348 (0.2378) loss 3.3223 (3.1859) grad_norm 2.6604 (2.5701) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][110/1251] eta 0:04:41 lr 0.000527 wd 0.0500 time 0.2473 (0.2463) data time 0.0008 (0.0054) model time 0.2465 (0.2381) loss 3.5033 (3.2000) grad_norm 2.0564 (2.5572) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][120/1251] eta 0:04:37 lr 0.000527 wd 0.0500 time 0.2401 (0.2456) data time 0.0010 (0.0051) model time 0.2391 (0.2379) loss 3.4541 (3.2028) grad_norm 2.6694 (2.5721) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][130/1251] eta 0:04:34 lr 0.000527 wd 0.0500 time 0.2377 (0.2452) data time 0.0010 (0.0048) model time 0.2367 (0.2380) loss 3.5206 (3.2222) grad_norm 2.1991 (2.5487) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][140/1251] eta 0:04:31 lr 0.000527 wd 0.0500 time 0.2433 (0.2446) data time 0.0009 (0.0045) model time 0.2424 (0.2378) loss 3.1729 (3.2170) grad_norm 2.8665 (2.5752) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][150/1251] eta 0:04:28 lr 0.000527 wd 0.0500 time 0.2391 (0.2443) data time 0.0012 (0.0043) model time 0.2379 (0.2378) loss 3.4456 (3.2190) grad_norm 2.5515 (2.5852) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 
09:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][160/1251] eta 0:04:26 lr 0.000526 wd 0.0500 time 0.2436 (0.2439) data time 0.0010 (0.0041) model time 0.2426 (0.2378) loss 3.2469 (3.1947) grad_norm 1.8433 (2.5709) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][170/1251] eta 0:04:23 lr 0.000526 wd 0.0500 time 0.2450 (0.2438) data time 0.0009 (0.0039) model time 0.2442 (0.2381) loss 1.9081 (3.1945) grad_norm 3.2509 (2.5749) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][180/1251] eta 0:04:22 lr 0.000526 wd 0.0500 time 0.2359 (0.2447) data time 0.0009 (0.0038) model time 0.2350 (0.2397) loss 4.2522 (3.2046) grad_norm 2.1678 (inf) loss_scale 512.0000 (1004.1989) mem 7382MB [2024-08-27 09:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][190/1251] eta 0:04:19 lr 0.000526 wd 0.0500 time 0.2445 (0.2444) data time 0.0008 (0.0036) model time 0.2438 (0.2396) loss 4.0183 (3.2226) grad_norm 2.6511 (inf) loss_scale 512.0000 (978.4293) mem 7382MB [2024-08-27 09:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][200/1251] eta 0:04:16 lr 0.000526 wd 0.0500 time 0.2357 (0.2441) data time 0.0010 (0.0035) model time 0.2347 (0.2394) loss 3.3389 (3.2121) grad_norm 2.4072 (inf) loss_scale 512.0000 (955.2239) mem 7382MB [2024-08-27 09:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][210/1251] eta 0:04:13 lr 0.000526 wd 0.0500 time 0.2375 (0.2439) data time 0.0008 (0.0034) model time 0.2368 (0.2394) loss 3.2230 (3.2103) grad_norm 3.0363 (inf) loss_scale 512.0000 (934.2180) mem 7382MB [2024-08-27 09:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][220/1251] eta 0:04:11 lr 0.000526 wd 0.0500 time 0.2523 (0.2437) data time 0.0010 (0.0033) model time 0.2514 (0.2393) loss 3.6511 (3.2295) 
grad_norm 3.0999 (inf) loss_scale 512.0000 (915.1131) mem 7382MB [2024-08-27 09:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][230/1251] eta 0:04:08 lr 0.000526 wd 0.0500 time 0.2339 (0.2435) data time 0.0009 (0.0032) model time 0.2330 (0.2392) loss 3.8566 (3.2294) grad_norm 2.4171 (inf) loss_scale 512.0000 (897.6623) mem 7382MB [2024-08-27 09:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][240/1251] eta 0:04:05 lr 0.000526 wd 0.0500 time 0.2369 (0.2433) data time 0.0011 (0.0031) model time 0.2358 (0.2391) loss 2.8539 (3.2225) grad_norm 3.2710 (inf) loss_scale 512.0000 (881.6598) mem 7382MB [2024-08-27 09:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][250/1251] eta 0:04:03 lr 0.000526 wd 0.0500 time 0.2429 (0.2431) data time 0.0009 (0.0030) model time 0.2421 (0.2391) loss 3.4635 (3.2145) grad_norm 2.1521 (inf) loss_scale 512.0000 (866.9323) mem 7382MB [2024-08-27 09:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][260/1251] eta 0:04:00 lr 0.000526 wd 0.0500 time 0.2410 (0.2429) data time 0.0010 (0.0029) model time 0.2401 (0.2390) loss 3.5836 (3.2153) grad_norm 3.0668 (inf) loss_scale 512.0000 (853.3333) mem 7382MB [2024-08-27 09:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][270/1251] eta 0:03:58 lr 0.000526 wd 0.0500 time 0.2431 (0.2427) data time 0.0010 (0.0029) model time 0.2420 (0.2388) loss 2.9364 (3.2126) grad_norm 1.9099 (inf) loss_scale 512.0000 (840.7380) mem 7382MB [2024-08-27 09:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][280/1251] eta 0:03:55 lr 0.000526 wd 0.0500 time 0.2375 (0.2426) data time 0.0010 (0.0028) model time 0.2365 (0.2389) loss 2.2046 (3.2179) grad_norm 2.3639 (inf) loss_scale 512.0000 (829.0391) mem 7382MB [2024-08-27 09:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][290/1251] eta 0:03:53 lr 0.000526 wd 0.0500 time 0.2490 (0.2425) data 
time 0.0008 (0.0027) model time 0.2483 (0.2388) loss 2.5635 (3.2173) grad_norm 2.1198 (inf) loss_scale 512.0000 (818.1443) mem 7382MB [2024-08-27 09:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][300/1251] eta 0:03:50 lr 0.000526 wd 0.0500 time 0.2412 (0.2425) data time 0.0010 (0.0027) model time 0.2402 (0.2389) loss 2.1903 (3.2068) grad_norm 2.4439 (inf) loss_scale 512.0000 (807.9734) mem 7382MB [2024-08-27 09:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][310/1251] eta 0:03:48 lr 0.000526 wd 0.0500 time 0.2479 (0.2424) data time 0.0010 (0.0026) model time 0.2468 (0.2389) loss 3.4499 (3.2085) grad_norm 2.5496 (inf) loss_scale 512.0000 (798.4566) mem 7382MB [2024-08-27 09:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][320/1251] eta 0:03:45 lr 0.000526 wd 0.0500 time 0.2379 (0.2423) data time 0.0009 (0.0026) model time 0.2370 (0.2389) loss 3.8298 (3.2024) grad_norm 3.0914 (inf) loss_scale 512.0000 (789.5327) mem 7382MB [2024-08-27 09:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][330/1251] eta 0:03:43 lr 0.000526 wd 0.0500 time 0.2441 (0.2422) data time 0.0008 (0.0025) model time 0.2433 (0.2389) loss 4.1242 (3.1995) grad_norm 1.8153 (inf) loss_scale 512.0000 (781.1480) mem 7382MB [2024-08-27 09:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][340/1251] eta 0:03:40 lr 0.000526 wd 0.0500 time 0.2368 (0.2421) data time 0.0010 (0.0025) model time 0.2358 (0.2388) loss 2.5407 (3.1974) grad_norm 1.9237 (inf) loss_scale 512.0000 (773.2551) mem 7382MB [2024-08-27 09:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][350/1251] eta 0:03:38 lr 0.000526 wd 0.0500 time 0.2422 (0.2420) data time 0.0009 (0.0025) model time 0.2413 (0.2388) loss 3.2608 (3.2065) grad_norm 2.5410 (inf) loss_scale 512.0000 (765.8120) mem 7382MB [2024-08-27 09:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[156/300][360/1251] eta 0:03:35 lr 0.000526 wd 0.0500 time 0.2407 (0.2420) data time 0.0010 (0.0024) model time 0.2397 (0.2388) loss 3.0902 (3.2073) grad_norm 2.0869 (inf) loss_scale 512.0000 (758.7812) mem 7382MB [2024-08-27 09:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][370/1251] eta 0:03:33 lr 0.000526 wd 0.0500 time 0.2402 (0.2419) data time 0.0008 (0.0024) model time 0.2394 (0.2388) loss 2.8004 (3.2047) grad_norm 1.7320 (inf) loss_scale 512.0000 (752.1294) mem 7382MB [2024-08-27 09:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][380/1251] eta 0:03:30 lr 0.000526 wd 0.0500 time 0.2413 (0.2418) data time 0.0008 (0.0024) model time 0.2405 (0.2388) loss 2.3177 (3.2105) grad_norm 2.1315 (inf) loss_scale 512.0000 (745.8268) mem 7382MB [2024-08-27 09:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][390/1251] eta 0:03:28 lr 0.000525 wd 0.0500 time 0.2391 (0.2418) data time 0.0009 (0.0023) model time 0.2382 (0.2388) loss 2.6060 (3.2121) grad_norm 1.8332 (inf) loss_scale 512.0000 (739.8465) mem 7382MB [2024-08-27 09:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][400/1251] eta 0:03:25 lr 0.000525 wd 0.0500 time 0.2372 (0.2417) data time 0.0013 (0.0023) model time 0.2360 (0.2388) loss 3.5029 (3.2101) grad_norm 2.4044 (inf) loss_scale 512.0000 (734.1646) mem 7382MB [2024-08-27 09:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][410/1251] eta 0:03:23 lr 0.000525 wd 0.0500 time 0.2440 (0.2417) data time 0.0009 (0.0023) model time 0.2431 (0.2388) loss 3.2654 (3.2090) grad_norm 2.6478 (inf) loss_scale 512.0000 (728.7591) mem 7382MB [2024-08-27 09:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][420/1251] eta 0:03:20 lr 0.000525 wd 0.0500 time 0.2329 (0.2416) data time 0.0010 (0.0022) model time 0.2320 (0.2387) loss 3.1346 (3.2120) grad_norm 2.8776 (inf) loss_scale 512.0000 (723.6105) mem 7382MB [2024-08-27 
09:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][430/1251] eta 0:03:18 lr 0.000525 wd 0.0500 time 0.2359 (0.2416) data time 0.0010 (0.0022) model time 0.2349 (0.2387) loss 3.1571 (3.2091) grad_norm 2.2409 (inf) loss_scale 512.0000 (718.7007) mem 7382MB [2024-08-27 09:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][440/1251] eta 0:03:15 lr 0.000525 wd 0.0500 time 0.2308 (0.2415) data time 0.0009 (0.0022) model time 0.2299 (0.2387) loss 2.6563 (3.2085) grad_norm 3.1388 (inf) loss_scale 512.0000 (714.0136) mem 7382MB [2024-08-27 09:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][450/1251] eta 0:03:13 lr 0.000525 wd 0.0500 time 0.2507 (0.2415) data time 0.0010 (0.0022) model time 0.2497 (0.2387) loss 3.0591 (3.2080) grad_norm 2.0763 (inf) loss_scale 512.0000 (709.5344) mem 7382MB [2024-08-27 09:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][460/1251] eta 0:03:10 lr 0.000525 wd 0.0500 time 0.2320 (0.2414) data time 0.0010 (0.0021) model time 0.2310 (0.2386) loss 3.0316 (3.2007) grad_norm 1.6810 (inf) loss_scale 512.0000 (705.2495) mem 7382MB [2024-08-27 09:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][470/1251] eta 0:03:08 lr 0.000525 wd 0.0500 time 0.2398 (0.2413) data time 0.0009 (0.0021) model time 0.2389 (0.2386) loss 4.0372 (3.2009) grad_norm 3.5932 (inf) loss_scale 512.0000 (701.1465) mem 7382MB [2024-08-27 09:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][480/1251] eta 0:03:06 lr 0.000525 wd 0.0500 time 0.2351 (0.2413) data time 0.0009 (0.0021) model time 0.2342 (0.2386) loss 4.4484 (3.1993) grad_norm 2.6410 (inf) loss_scale 512.0000 (697.2141) mem 7382MB [2024-08-27 09:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][490/1251] eta 0:03:03 lr 0.000525 wd 0.0500 time 0.2376 (0.2412) data time 0.0012 (0.0021) model time 0.2364 (0.2386) loss 3.8896 (3.1978) grad_norm 
3.1008 (inf) loss_scale 512.0000 (693.4420) mem 7382MB [2024-08-27 09:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][500/1251] eta 0:03:01 lr 0.000525 wd 0.0500 time 0.2394 (0.2412) data time 0.0008 (0.0020) model time 0.2386 (0.2386) loss 2.7318 (3.1898) grad_norm 1.9378 (inf) loss_scale 512.0000 (689.8204) mem 7382MB [2024-08-27 09:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][510/1251] eta 0:02:58 lr 0.000525 wd 0.0500 time 0.2384 (0.2411) data time 0.0010 (0.0020) model time 0.2374 (0.2385) loss 3.0707 (3.1914) grad_norm 2.5393 (inf) loss_scale 512.0000 (686.3405) mem 7382MB [2024-08-27 09:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][520/1251] eta 0:02:56 lr 0.000525 wd 0.0500 time 0.2391 (0.2411) data time 0.0012 (0.0020) model time 0.2379 (0.2385) loss 3.3069 (3.1918) grad_norm 2.3792 (inf) loss_scale 512.0000 (682.9942) mem 7382MB [2024-08-27 09:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][530/1251] eta 0:02:53 lr 0.000525 wd 0.0500 time 0.2457 (0.2410) data time 0.0007 (0.0020) model time 0.2451 (0.2385) loss 3.8169 (3.1938) grad_norm 1.9272 (inf) loss_scale 512.0000 (679.7740) mem 7382MB [2024-08-27 09:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][540/1251] eta 0:02:51 lr 0.000525 wd 0.0500 time 0.2389 (0.2409) data time 0.0008 (0.0020) model time 0.2381 (0.2384) loss 4.0429 (3.1950) grad_norm 2.4546 (inf) loss_scale 512.0000 (676.6728) mem 7382MB [2024-08-27 09:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][550/1251] eta 0:02:48 lr 0.000525 wd 0.0500 time 0.2403 (0.2409) data time 0.0008 (0.0019) model time 0.2395 (0.2385) loss 3.8453 (3.1963) grad_norm 2.4842 (inf) loss_scale 512.0000 (673.6842) mem 7382MB [2024-08-27 09:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][560/1251] eta 0:02:46 lr 0.000525 wd 0.0500 time 0.2379 (0.2409) data time 0.0012 
(0.0019) model time 0.2368 (0.2384) loss 3.5544 (3.1914) grad_norm 4.1220 (inf) loss_scale 512.0000 (670.8021) mem 7382MB [2024-08-27 09:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][570/1251] eta 0:02:44 lr 0.000525 wd 0.0500 time 0.2484 (0.2409) data time 0.0010 (0.0019) model time 0.2474 (0.2385) loss 3.3075 (3.1940) grad_norm 2.4901 (inf) loss_scale 512.0000 (668.0210) mem 7382MB [2024-08-27 09:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][580/1251] eta 0:02:41 lr 0.000525 wd 0.0500 time 0.2456 (0.2409) data time 0.0009 (0.0019) model time 0.2447 (0.2385) loss 3.8577 (3.1959) grad_norm 3.1140 (inf) loss_scale 512.0000 (665.3356) mem 7382MB [2024-08-27 09:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][590/1251] eta 0:02:39 lr 0.000525 wd 0.0500 time 0.2353 (0.2408) data time 0.0008 (0.0019) model time 0.2345 (0.2385) loss 2.6636 (3.1977) grad_norm 2.8965 (inf) loss_scale 512.0000 (662.7411) mem 7382MB [2024-08-27 09:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][600/1251] eta 0:02:36 lr 0.000525 wd 0.0500 time 0.2373 (0.2408) data time 0.0011 (0.0019) model time 0.2362 (0.2384) loss 3.1629 (3.1988) grad_norm 2.1374 (inf) loss_scale 512.0000 (660.2329) mem 7382MB [2024-08-27 09:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][610/1251] eta 0:02:34 lr 0.000525 wd 0.0500 time 0.2373 (0.2408) data time 0.0007 (0.0019) model time 0.2366 (0.2384) loss 3.7062 (3.1958) grad_norm 3.7043 (inf) loss_scale 512.0000 (657.8069) mem 7382MB [2024-08-27 09:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][620/1251] eta 0:02:31 lr 0.000524 wd 0.0500 time 0.2415 (0.2408) data time 0.0009 (0.0018) model time 0.2406 (0.2384) loss 2.7431 (3.1937) grad_norm 2.7084 (inf) loss_scale 512.0000 (655.4589) mem 7382MB [2024-08-27 09:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][630/1251] eta 
0:02:29 lr 0.000524 wd 0.0500 time 0.2474 (0.2407) data time 0.0010 (0.0018) model time 0.2464 (0.2384) loss 3.7151 (3.1962) grad_norm 2.0679 (inf) loss_scale 512.0000 (653.1854) mem 7382MB [2024-08-27 09:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][640/1251] eta 0:02:27 lr 0.000524 wd 0.0500 time 0.2418 (0.2406) data time 0.0010 (0.0018) model time 0.2408 (0.2384) loss 3.0410 (3.1935) grad_norm 2.8943 (inf) loss_scale 512.0000 (650.9828) mem 7382MB [2024-08-27 09:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][650/1251] eta 0:02:24 lr 0.000524 wd 0.0500 time 0.2403 (0.2406) data time 0.0007 (0.0018) model time 0.2396 (0.2384) loss 3.2862 (3.1928) grad_norm 1.8420 (inf) loss_scale 512.0000 (648.8479) mem 7382MB [2024-08-27 09:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][660/1251] eta 0:02:22 lr 0.000524 wd 0.0500 time 0.2478 (0.2406) data time 0.0009 (0.0018) model time 0.2469 (0.2384) loss 3.8012 (3.1917) grad_norm 1.9784 (inf) loss_scale 512.0000 (646.7776) mem 7382MB [2024-08-27 09:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][670/1251] eta 0:02:19 lr 0.000524 wd 0.0500 time 0.2418 (0.2406) data time 0.0009 (0.0018) model time 0.2409 (0.2384) loss 1.8530 (3.1922) grad_norm 2.0475 (inf) loss_scale 512.0000 (644.7690) mem 7382MB [2024-08-27 09:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][680/1251] eta 0:02:17 lr 0.000524 wd 0.0500 time 0.2464 (0.2406) data time 0.0007 (0.0018) model time 0.2457 (0.2384) loss 3.4015 (3.1939) grad_norm 2.4343 (inf) loss_scale 512.0000 (642.8194) mem 7382MB [2024-08-27 09:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][690/1251] eta 0:02:14 lr 0.000524 wd 0.0500 time 0.2419 (0.2406) data time 0.0010 (0.0018) model time 0.2409 (0.2384) loss 3.3748 (3.1956) grad_norm 2.0014 (inf) loss_scale 512.0000 (640.9262) mem 7382MB [2024-08-27 09:51:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][700/1251] eta 0:02:12 lr 0.000524 wd 0.0500 time 0.2402 (0.2406) data time 0.0008 (0.0018) model time 0.2394 (0.2384) loss 3.3182 (3.1966) grad_norm 2.3498 (inf) loss_scale 512.0000 (639.0870) mem 7382MB [2024-08-27 09:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][710/1251] eta 0:02:10 lr 0.000524 wd 0.0500 time 0.2602 (0.2406) data time 0.0010 (0.0017) model time 0.2593 (0.2385) loss 3.7337 (3.1981) grad_norm 1.9493 (inf) loss_scale 512.0000 (637.2996) mem 7382MB [2024-08-27 09:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][720/1251] eta 0:02:07 lr 0.000524 wd 0.0500 time 0.2348 (0.2409) data time 0.0008 (0.0017) model time 0.2339 (0.2388) loss 3.0115 (3.2007) grad_norm 2.7136 (inf) loss_scale 512.0000 (635.5617) mem 7382MB [2024-08-27 09:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][730/1251] eta 0:02:05 lr 0.000524 wd 0.0500 time 0.2358 (0.2409) data time 0.0010 (0.0017) model time 0.2347 (0.2388) loss 4.0129 (3.2032) grad_norm 2.3680 (inf) loss_scale 512.0000 (633.8714) mem 7382MB [2024-08-27 09:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][740/1251] eta 0:02:03 lr 0.000524 wd 0.0500 time 0.2420 (0.2409) data time 0.0008 (0.0017) model time 0.2412 (0.2388) loss 3.6773 (3.2030) grad_norm 2.0246 (inf) loss_scale 512.0000 (632.2267) mem 7382MB [2024-08-27 09:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][750/1251] eta 0:02:00 lr 0.000524 wd 0.0500 time 0.2485 (0.2409) data time 0.0009 (0.0017) model time 0.2476 (0.2388) loss 2.3994 (3.2045) grad_norm 3.4829 (inf) loss_scale 512.0000 (630.6258) mem 7382MB [2024-08-27 09:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][760/1251] eta 0:01:58 lr 0.000524 wd 0.0500 time 0.2546 (0.2409) data time 0.0011 (0.0017) model time 0.2535 (0.2389) loss 2.4451 (3.2041) grad_norm 3.6386 
(inf) loss_scale 512.0000 (629.0670) mem 7382MB [2024-08-27 09:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][770/1251] eta 0:01:55 lr 0.000524 wd 0.0500 time 0.2368 (0.2409) data time 0.0009 (0.0017) model time 0.2359 (0.2388) loss 2.2832 (3.1990) grad_norm 1.8357 (inf) loss_scale 512.0000 (627.5486) mem 7382MB [2024-08-27 09:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][780/1251] eta 0:01:53 lr 0.000524 wd 0.0500 time 0.2382 (0.2408) data time 0.0010 (0.0017) model time 0.2372 (0.2388) loss 2.9800 (3.2003) grad_norm 2.0208 (inf) loss_scale 512.0000 (626.0691) mem 7382MB [2024-08-27 09:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][790/1251] eta 0:01:51 lr 0.000524 wd 0.0500 time 0.2420 (0.2414) data time 0.0008 (0.0017) model time 0.2412 (0.2394) loss 3.8141 (3.1993) grad_norm 2.4416 (inf) loss_scale 512.0000 (624.6271) mem 7382MB [2024-08-27 09:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][800/1251] eta 0:01:49 lr 0.000524 wd 0.0500 time 0.2394 (0.2419) data time 0.0009 (0.0017) model time 0.2385 (0.2399) loss 3.2743 (3.1993) grad_norm 2.8068 (inf) loss_scale 512.0000 (623.2210) mem 7382MB [2024-08-27 09:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][810/1251] eta 0:01:46 lr 0.000524 wd 0.0500 time 0.2392 (0.2418) data time 0.0011 (0.0017) model time 0.2382 (0.2399) loss 3.4347 (3.2001) grad_norm 2.1673 (inf) loss_scale 512.0000 (621.8496) mem 7382MB [2024-08-27 09:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][820/1251] eta 0:01:44 lr 0.000524 wd 0.0500 time 0.2386 (0.2418) data time 0.0012 (0.0017) model time 0.2374 (0.2399) loss 3.1772 (3.2002) grad_norm 5.1046 (inf) loss_scale 512.0000 (620.5116) mem 7382MB [2024-08-27 09:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][830/1251] eta 0:01:41 lr 0.000524 wd 0.0500 time 0.2301 (0.2417) data time 0.0014 (0.0016) 
model time 0.2288 (0.2398) loss 3.6090 (3.1961) grad_norm 2.1330 (inf) loss_scale 512.0000 (619.2058) mem 7382MB [2024-08-27 09:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][840/1251] eta 0:01:39 lr 0.000523 wd 0.0500 time 0.2427 (0.2417) data time 0.0008 (0.0016) model time 0.2419 (0.2398) loss 3.9728 (3.1946) grad_norm 2.1154 (inf) loss_scale 512.0000 (617.9310) mem 7382MB [2024-08-27 09:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][850/1251] eta 0:01:36 lr 0.000523 wd 0.0500 time 0.2305 (0.2416) data time 0.0009 (0.0016) model time 0.2296 (0.2397) loss 3.5640 (3.1944) grad_norm 2.5730 (inf) loss_scale 512.0000 (616.6863) mem 7382MB [2024-08-27 09:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][860/1251] eta 0:01:34 lr 0.000523 wd 0.0500 time 0.2470 (0.2416) data time 0.0008 (0.0016) model time 0.2461 (0.2397) loss 3.1947 (3.1938) grad_norm 2.0712 (inf) loss_scale 512.0000 (615.4704) mem 7382MB [2024-08-27 09:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][870/1251] eta 0:01:32 lr 0.000523 wd 0.0500 time 0.2462 (0.2416) data time 0.0009 (0.0016) model time 0.2453 (0.2397) loss 3.6530 (3.1942) grad_norm 2.6076 (inf) loss_scale 512.0000 (614.2824) mem 7382MB [2024-08-27 09:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][880/1251] eta 0:01:29 lr 0.000523 wd 0.0500 time 0.2371 (0.2416) data time 0.0012 (0.0016) model time 0.2360 (0.2397) loss 3.2154 (3.1914) grad_norm 3.4130 (inf) loss_scale 512.0000 (613.1215) mem 7382MB [2024-08-27 09:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][890/1251] eta 0:01:27 lr 0.000523 wd 0.0500 time 0.2475 (0.2415) data time 0.0010 (0.0016) model time 0.2465 (0.2397) loss 3.4712 (3.1930) grad_norm 2.6101 (inf) loss_scale 512.0000 (611.9865) mem 7382MB [2024-08-27 09:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][900/1251] eta 0:01:24 lr 
0.000523 wd 0.0500 time 0.2426 (0.2415) data time 0.0007 (0.0016) model time 0.2419 (0.2397) loss 3.1025 (3.1918) grad_norm 3.1976 (inf) loss_scale 512.0000 (610.8768) mem 7382MB [2024-08-27 09:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][910/1251] eta 0:01:22 lr 0.000523 wd 0.0500 time 0.2375 (0.2415) data time 0.0012 (0.0016) model time 0.2364 (0.2396) loss 3.8990 (3.1912) grad_norm 2.6674 (inf) loss_scale 512.0000 (609.7914) mem 7382MB [2024-08-27 09:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][920/1251] eta 0:01:19 lr 0.000523 wd 0.0500 time 0.2364 (0.2414) data time 0.0007 (0.0016) model time 0.2357 (0.2396) loss 3.4427 (3.1913) grad_norm 1.8077 (inf) loss_scale 512.0000 (608.7296) mem 7382MB [2024-08-27 09:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][930/1251] eta 0:01:17 lr 0.000523 wd 0.0500 time 0.2369 (0.2414) data time 0.0009 (0.0016) model time 0.2360 (0.2396) loss 3.6423 (3.1939) grad_norm 1.9460 (inf) loss_scale 512.0000 (607.6907) mem 7382MB [2024-08-27 09:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][940/1251] eta 0:01:15 lr 0.000523 wd 0.0500 time 0.2441 (0.2414) data time 0.0010 (0.0016) model time 0.2431 (0.2395) loss 3.5769 (3.1948) grad_norm 2.0347 (inf) loss_scale 512.0000 (606.6738) mem 7382MB [2024-08-27 09:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][950/1251] eta 0:01:12 lr 0.000523 wd 0.0500 time 0.2360 (0.2413) data time 0.0011 (0.0016) model time 0.2350 (0.2395) loss 3.2826 (3.1976) grad_norm 2.3371 (inf) loss_scale 512.0000 (605.6782) mem 7382MB [2024-08-27 09:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][960/1251] eta 0:01:10 lr 0.000523 wd 0.0500 time 0.2409 (0.2413) data time 0.0011 (0.0016) model time 0.2399 (0.2395) loss 3.1713 (3.1958) grad_norm 1.7903 (inf) loss_scale 512.0000 (604.7034) mem 7382MB [2024-08-27 09:52:16 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [156/300][970/1251] eta 0:01:07 lr 0.000523 wd 0.0500 time 0.2447 (0.2413) data time 0.0008 (0.0016) model time 0.2440 (0.2395) loss 3.8633 (3.1960) grad_norm 2.1361 (inf) loss_scale 512.0000 (603.7487) mem 7382MB [2024-08-27 09:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][980/1251] eta 0:01:05 lr 0.000523 wd 0.0500 time 0.2421 (0.2412) data time 0.0007 (0.0016) model time 0.2414 (0.2394) loss 3.7615 (3.1946) grad_norm 3.3727 (inf) loss_scale 512.0000 (602.8135) mem 7382MB [2024-08-27 09:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][990/1251] eta 0:01:02 lr 0.000523 wd 0.0500 time 0.2426 (0.2412) data time 0.0010 (0.0015) model time 0.2416 (0.2394) loss 3.1136 (3.1960) grad_norm 2.2213 (inf) loss_scale 512.0000 (601.8971) mem 7382MB [2024-08-27 09:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1000/1251] eta 0:01:00 lr 0.000523 wd 0.0500 time 0.2328 (0.2412) data time 0.0012 (0.0015) model time 0.2316 (0.2394) loss 2.4220 (3.1939) grad_norm 2.5611 (inf) loss_scale 512.0000 (600.9990) mem 7382MB [2024-08-27 09:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1010/1251] eta 0:00:58 lr 0.000523 wd 0.0500 time 0.2409 (0.2411) data time 0.0012 (0.0015) model time 0.2398 (0.2394) loss 2.7408 (3.1911) grad_norm 2.4618 (inf) loss_scale 512.0000 (600.1187) mem 7382MB [2024-08-27 09:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1020/1251] eta 0:00:55 lr 0.000523 wd 0.0500 time 0.2349 (0.2411) data time 0.0009 (0.0015) model time 0.2340 (0.2393) loss 3.5528 (3.1929) grad_norm 2.0853 (inf) loss_scale 512.0000 (599.2556) mem 7382MB [2024-08-27 09:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1030/1251] eta 0:00:53 lr 0.000523 wd 0.0500 time 0.2556 (0.2411) data time 0.0007 (0.0015) model time 0.2549 (0.2393) loss 3.8453 (3.1945) grad_norm 3.3347 (inf) loss_scale 
512.0000 (598.4093) mem 7382MB [2024-08-27 09:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1040/1251] eta 0:00:50 lr 0.000523 wd 0.0500 time 0.2389 (0.2411) data time 0.0011 (0.0015) model time 0.2378 (0.2393) loss 2.6490 (3.1927) grad_norm 3.0654 (inf) loss_scale 512.0000 (597.5793) mem 7382MB [2024-08-27 09:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1050/1251] eta 0:00:48 lr 0.000523 wd 0.0500 time 0.2419 (0.2411) data time 0.0009 (0.0015) model time 0.2410 (0.2393) loss 3.0756 (3.1915) grad_norm 1.8907 (inf) loss_scale 512.0000 (596.7650) mem 7382MB [2024-08-27 09:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1060/1251] eta 0:00:46 lr 0.000523 wd 0.0500 time 0.2370 (0.2411) data time 0.0007 (0.0015) model time 0.2362 (0.2393) loss 3.3172 (3.1899) grad_norm 1.6877 (inf) loss_scale 512.0000 (595.9661) mem 7382MB [2024-08-27 09:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1070/1251] eta 0:00:43 lr 0.000522 wd 0.0500 time 0.2362 (0.2410) data time 0.0010 (0.0015) model time 0.2352 (0.2393) loss 3.4231 (3.1920) grad_norm 2.7637 (inf) loss_scale 512.0000 (595.1821) mem 7382MB [2024-08-27 09:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1080/1251] eta 0:00:41 lr 0.000522 wd 0.0500 time 0.2326 (0.2410) data time 0.0010 (0.0015) model time 0.2316 (0.2393) loss 2.7691 (3.1922) grad_norm 2.6072 (inf) loss_scale 512.0000 (594.4126) mem 7382MB [2024-08-27 09:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1090/1251] eta 0:00:38 lr 0.000522 wd 0.0500 time 0.2346 (0.2410) data time 0.0008 (0.0015) model time 0.2337 (0.2393) loss 2.1093 (3.1907) grad_norm 1.6641 (inf) loss_scale 512.0000 (593.6572) mem 7382MB [2024-08-27 09:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1100/1251] eta 0:00:36 lr 0.000522 wd 0.0500 time 0.2449 (0.2410) data time 0.0010 (0.0015) model 
time 0.2438 (0.2393) loss 3.1438 (3.1913) grad_norm 3.5189 (inf) loss_scale 512.0000 (592.9155) mem 7382MB [2024-08-27 09:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1110/1251] eta 0:00:33 lr 0.000522 wd 0.0500 time 0.2430 (0.2410) data time 0.0010 (0.0015) model time 0.2420 (0.2393) loss 4.0811 (3.1929) grad_norm 1.8706 (inf) loss_scale 512.0000 (592.1872) mem 7382MB [2024-08-27 09:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1120/1251] eta 0:00:31 lr 0.000522 wd 0.0500 time 0.2369 (0.2410) data time 0.0010 (0.0015) model time 0.2359 (0.2393) loss 3.2522 (3.1937) grad_norm 1.7407 (inf) loss_scale 512.0000 (591.4719) mem 7382MB [2024-08-27 09:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1130/1251] eta 0:00:29 lr 0.000522 wd 0.0500 time 0.2393 (0.2410) data time 0.0008 (0.0015) model time 0.2384 (0.2393) loss 3.4206 (3.1945) grad_norm 2.8180 (inf) loss_scale 512.0000 (590.7692) mem 7382MB [2024-08-27 09:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1140/1251] eta 0:00:26 lr 0.000522 wd 0.0500 time 0.2370 (0.2410) data time 0.0011 (0.0015) model time 0.2359 (0.2393) loss 2.8250 (3.1939) grad_norm 1.7880 (inf) loss_scale 512.0000 (590.0789) mem 7382MB [2024-08-27 09:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1150/1251] eta 0:00:24 lr 0.000522 wd 0.0500 time 0.2407 (0.2410) data time 0.0010 (0.0015) model time 0.2397 (0.2393) loss 3.5240 (3.1921) grad_norm 1.7451 (inf) loss_scale 512.0000 (589.4005) mem 7382MB [2024-08-27 09:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1160/1251] eta 0:00:21 lr 0.000522 wd 0.0500 time 0.2338 (0.2409) data time 0.0008 (0.0015) model time 0.2331 (0.2392) loss 3.7615 (3.1938) grad_norm 2.9085 (inf) loss_scale 512.0000 (588.7339) mem 7382MB [2024-08-27 09:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1170/1251] eta 0:00:19 lr 
0.000522 wd 0.0500 time 0.2440 (0.2409) data time 0.0009 (0.0015) model time 0.2431 (0.2392) loss 3.5577 (3.1926) grad_norm 1.8883 (inf) loss_scale 512.0000 (588.0786) mem 7382MB [2024-08-27 09:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1180/1251] eta 0:00:17 lr 0.000522 wd 0.0500 time 0.2320 (0.2409) data time 0.0010 (0.0015) model time 0.2310 (0.2392) loss 3.7313 (3.1939) grad_norm 1.9455 (inf) loss_scale 512.0000 (587.4344) mem 7382MB [2024-08-27 09:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1190/1251] eta 0:00:14 lr 0.000522 wd 0.0500 time 0.2363 (0.2408) data time 0.0012 (0.0015) model time 0.2352 (0.2392) loss 2.5188 (3.1950) grad_norm 2.0950 (inf) loss_scale 512.0000 (586.8010) mem 7382MB [2024-08-27 09:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1200/1251] eta 0:00:12 lr 0.000522 wd 0.0500 time 0.2337 (0.2408) data time 0.0007 (0.0015) model time 0.2330 (0.2391) loss 2.6429 (3.1940) grad_norm 1.8903 (inf) loss_scale 512.0000 (586.1782) mem 7382MB [2024-08-27 09:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1210/1251] eta 0:00:09 lr 0.000522 wd 0.0500 time 0.2536 (0.2408) data time 0.0010 (0.0015) model time 0.2525 (0.2391) loss 3.7329 (3.1945) grad_norm 2.0839 (inf) loss_scale 512.0000 (585.5656) mem 7382MB [2024-08-27 09:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1220/1251] eta 0:00:07 lr 0.000522 wd 0.0500 time 0.2399 (0.2408) data time 0.0009 (0.0015) model time 0.2390 (0.2392) loss 3.2776 (3.1961) grad_norm 3.1504 (inf) loss_scale 512.0000 (584.9631) mem 7382MB [2024-08-27 09:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1230/1251] eta 0:00:05 lr 0.000522 wd 0.0500 time 0.2360 (0.2408) data time 0.0008 (0.0015) model time 0.2352 (0.2391) loss 3.6526 (3.1960) grad_norm 1.9278 (inf) loss_scale 512.0000 (584.3704) mem 7382MB [2024-08-27 09:53:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [156/300][1240/1251] eta 0:00:02 lr 0.000522 wd 0.0500 time 0.2213 (0.2408) data time 0.0007 (0.0015) model time 0.2206 (0.2392) loss 3.0593 (3.1965) grad_norm 2.2921 (inf) loss_scale 512.0000 (583.7873) mem 7382MB [2024-08-27 09:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [156/300][1250/1251] eta 0:00:00 lr 0.000522 wd 0.0500 time 0.2249 (0.2407) data time 0.0005 (0.0015) model time 0.2244 (0.2391) loss 3.6360 (3.1986) grad_norm 1.9412 (inf) loss_scale 512.0000 (583.2134) mem 7382MB [2024-08-27 09:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 156 training takes 0:05:01 [2024-08-27 09:53:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:53:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.450 (0.450) Loss 0.4758 (0.4758) Acc@1 91.016 (91.016) Acc@5 98.242 (98.242) Mem 7382MB [2024-08-27 09:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.114) Loss 0.7954 (0.7210) Acc@1 83.496 (84.384) Acc@5 96.484 (96.831) Mem 7382MB [2024-08-27 09:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.096) Loss 1.0859 (0.7468) Acc@1 74.805 (83.594) Acc@5 93.652 (96.833) Mem 7382MB [2024-08-27 09:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.089) Loss 1.3086 (0.8582) Acc@1 69.922 (80.989) Acc@5 91.309 (95.536) Mem 7382MB [2024-08-27 09:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.1279 (0.9053) Acc@1 73.145 (79.816) Acc@5 93.066 (95.005) Mem 7382MB [2024-08-27 09:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.414 Acc@5 94.938 [2024-08-27 09:53:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.4% [2024-08-27 09:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.41% [2024-08-27 09:53:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 09:53:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 09:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.517 (0.517) Loss 0.4038 (0.4038) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7382MB [2024-08-27 09:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.118) Loss 0.6362 (0.6334) Acc@1 86.816 (86.319) Acc@5 97.168 (97.399) Mem 7382MB [2024-08-27 09:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.098) Loss 0.9043 (0.6582) Acc@1 78.125 (85.426) Acc@5 95.508 (97.396) Mem 7382MB [2024-08-27 09:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.091) Loss 1.1436 (0.7477) Acc@1 71.387 (83.238) Acc@5 92.871 (96.402) Mem 7382MB [2024-08-27 09:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0342 (0.7949) Acc@1 74.707 (81.898) Acc@5 93.848 (95.920) Mem 7382MB [2024-08-27 09:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.458 Acc@5 95.894 [2024-08-27 09:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.5% [2024-08-27 09:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.46% [2024-08-27 09:53:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... 
[2024-08-27 09:53:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 09:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][0/1251] eta 0:13:36 lr 0.000522 wd 0.0500 time 0.6529 (0.6529) data time 0.4153 (0.4153) model time 0.0000 (0.0000) loss 3.0786 (3.0786) grad_norm 2.4887 (2.4887) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][10/1251] eta 0:05:44 lr 0.000522 wd 0.0500 time 0.2446 (0.2773) data time 0.0009 (0.0387) model time 0.0000 (0.0000) loss 3.2990 (3.4132) grad_norm 2.3935 (2.3535) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][20/1251] eta 0:05:19 lr 0.000522 wd 0.0500 time 0.2393 (0.2597) data time 0.0007 (0.0208) model time 0.0000 (0.0000) loss 3.5491 (3.3446) grad_norm 2.9127 (2.3729) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][30/1251] eta 0:05:08 lr 0.000522 wd 0.0500 time 0.2427 (0.2527) data time 0.0008 (0.0144) model time 0.0000 (0.0000) loss 3.4084 (3.2455) grad_norm 3.4027 (2.8063) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][40/1251] eta 0:05:01 lr 0.000521 wd 0.0500 time 0.2376 (0.2492) data time 0.0011 (0.0112) model time 0.0000 (0.0000) loss 3.5627 (3.2128) grad_norm 2.6594 (2.8541) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][50/1251] eta 0:04:56 lr 0.000521 wd 0.0500 time 0.2363 (0.2470) data time 0.0011 (0.0092) model time 0.0000 (0.0000) loss 3.3421 (3.2366) grad_norm 1.7913 (3.4454) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [157/300][60/1251] eta 0:04:52 lr 0.000521 wd 0.0500 time 0.2327 (0.2454) data time 0.0009 (0.0078) model time 0.2317 (0.2364) loss 3.4154 (3.2415) grad_norm 2.0032 (3.2606) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][70/1251] eta 0:04:49 lr 0.000521 wd 0.0500 time 0.2468 (0.2448) data time 0.0008 (0.0069) model time 0.2460 (0.2383) loss 3.1284 (3.2033) grad_norm 3.1638 (3.1485) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][80/1251] eta 0:04:45 lr 0.000521 wd 0.0500 time 0.2322 (0.2442) data time 0.0011 (0.0062) model time 0.2311 (0.2384) loss 3.2274 (3.2339) grad_norm 3.3935 (3.1482) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][90/1251] eta 0:04:43 lr 0.000521 wd 0.0500 time 0.2706 (0.2439) data time 0.0009 (0.0056) model time 0.2696 (0.2389) loss 3.5795 (3.2271) grad_norm 1.9117 (3.0706) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][100/1251] eta 0:04:39 lr 0.000521 wd 0.0500 time 0.2404 (0.2432) data time 0.0011 (0.0052) model time 0.2393 (0.2384) loss 3.3838 (3.2261) grad_norm 2.9697 (2.9872) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][110/1251] eta 0:04:37 lr 0.000521 wd 0.0500 time 0.2367 (0.2428) data time 0.0011 (0.0048) model time 0.2356 (0.2382) loss 3.6673 (3.2209) grad_norm 1.7787 (2.9294) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][120/1251] eta 0:04:34 lr 0.000521 wd 0.0500 time 0.2392 (0.2426) data time 0.0008 (0.0045) model time 0.2384 (0.2384) loss 3.6696 (3.2101) grad_norm 3.3515 (2.8838) loss_scale 512.0000 
(512.0000) mem 7382MB [2024-08-27 09:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][130/1251] eta 0:04:33 lr 0.000521 wd 0.0500 time 0.2400 (0.2440) data time 0.0009 (0.0042) model time 0.2391 (0.2411) loss 3.4439 (3.2079) grad_norm 2.4796 (2.8599) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][140/1251] eta 0:04:35 lr 0.000521 wd 0.0500 time 0.4515 (0.2483) data time 0.0009 (0.0040) model time 0.4506 (0.2480) loss 3.4783 (3.1942) grad_norm 3.0179 (2.8288) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][150/1251] eta 0:04:32 lr 0.000521 wd 0.0500 time 0.2348 (0.2477) data time 0.0008 (0.0038) model time 0.2340 (0.2471) loss 3.6304 (3.1997) grad_norm 4.4198 (2.8227) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][160/1251] eta 0:04:29 lr 0.000521 wd 0.0500 time 0.2325 (0.2471) data time 0.0011 (0.0036) model time 0.2313 (0.2461) loss 3.5768 (3.2044) grad_norm 2.1416 (2.8446) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][170/1251] eta 0:04:26 lr 0.000521 wd 0.0500 time 0.2331 (0.2465) data time 0.0010 (0.0035) model time 0.2321 (0.2452) loss 2.7992 (3.2011) grad_norm 2.1279 (2.8084) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][180/1251] eta 0:04:23 lr 0.000521 wd 0.0500 time 0.2311 (0.2461) data time 0.0011 (0.0033) model time 0.2300 (0.2447) loss 3.4480 (3.2151) grad_norm 1.7923 (2.7851) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][190/1251] eta 0:04:20 lr 0.000521 wd 0.0500 time 0.2346 (0.2456) data time 0.0008 (0.0032) model 
time 0.2338 (0.2441) loss 2.7068 (3.2109) grad_norm 3.0906 (2.7833) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][200/1251] eta 0:04:17 lr 0.000521 wd 0.0500 time 0.2412 (0.2454) data time 0.0012 (0.0031) model time 0.2400 (0.2438) loss 3.4705 (3.1935) grad_norm 2.0261 (2.7684) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][210/1251] eta 0:04:15 lr 0.000521 wd 0.0500 time 0.2526 (0.2451) data time 0.0010 (0.0030) model time 0.2516 (0.2435) loss 2.9843 (3.1974) grad_norm 2.0238 (2.7581) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][220/1251] eta 0:04:12 lr 0.000521 wd 0.0500 time 0.2511 (0.2448) data time 0.0008 (0.0029) model time 0.2504 (0.2431) loss 2.5753 (3.2007) grad_norm 2.1648 (2.7628) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][230/1251] eta 0:04:09 lr 0.000521 wd 0.0500 time 0.2379 (0.2446) data time 0.0010 (0.0028) model time 0.2369 (0.2428) loss 2.6224 (3.2004) grad_norm 4.5867 (2.7633) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][240/1251] eta 0:04:06 lr 0.000521 wd 0.0500 time 0.2364 (0.2443) data time 0.0010 (0.0028) model time 0.2354 (0.2425) loss 3.4902 (3.2095) grad_norm 2.1516 (2.7684) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][250/1251] eta 0:04:04 lr 0.000521 wd 0.0500 time 0.2363 (0.2440) data time 0.0011 (0.0027) model time 0.2351 (0.2422) loss 2.8522 (3.2005) grad_norm 2.2829 (2.7721) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][260/1251] 
eta 0:04:01 lr 0.000521 wd 0.0500 time 0.2330 (0.2437) data time 0.0009 (0.0026) model time 0.2321 (0.2419) loss 2.9743 (3.1976) grad_norm 3.2697 (2.7744) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][270/1251] eta 0:03:58 lr 0.000520 wd 0.0500 time 0.2337 (0.2435) data time 0.0008 (0.0026) model time 0.2330 (0.2416) loss 3.6813 (3.1986) grad_norm 2.4524 (2.8725) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][280/1251] eta 0:03:56 lr 0.000520 wd 0.0500 time 0.2409 (0.2434) data time 0.0010 (0.0025) model time 0.2399 (0.2415) loss 3.0214 (3.1940) grad_norm 2.9498 (2.8660) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][290/1251] eta 0:03:53 lr 0.000520 wd 0.0500 time 0.2353 (0.2433) data time 0.0009 (0.0025) model time 0.2344 (0.2415) loss 2.7082 (3.1949) grad_norm 3.3455 (2.8614) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][300/1251] eta 0:03:51 lr 0.000520 wd 0.0500 time 0.2310 (0.2432) data time 0.0010 (0.0024) model time 0.2300 (0.2413) loss 3.5131 (3.1972) grad_norm 2.0576 (2.8457) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][310/1251] eta 0:03:48 lr 0.000520 wd 0.0500 time 0.2454 (0.2430) data time 0.0010 (0.0024) model time 0.2444 (0.2412) loss 2.6287 (3.1905) grad_norm 4.2272 (2.8306) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][320/1251] eta 0:03:46 lr 0.000520 wd 0.0500 time 0.2306 (0.2430) data time 0.0012 (0.0023) model time 0.2293 (0.2412) loss 3.4094 (3.1927) grad_norm 2.0721 (2.8097) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 
09:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][330/1251] eta 0:03:43 lr 0.000520 wd 0.0500 time 0.2392 (0.2428) data time 0.0009 (0.0023) model time 0.2384 (0.2410) loss 3.1615 (3.1852) grad_norm 2.9237 (2.7929) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][340/1251] eta 0:03:41 lr 0.000520 wd 0.0500 time 0.2341 (0.2427) data time 0.0010 (0.0023) model time 0.2331 (0.2409) loss 3.1725 (3.1784) grad_norm 2.2710 (2.7809) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][350/1251] eta 0:03:38 lr 0.000520 wd 0.0500 time 0.2372 (0.2426) data time 0.0010 (0.0022) model time 0.2362 (0.2408) loss 3.1158 (3.1806) grad_norm 2.3324 (2.7840) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][360/1251] eta 0:03:35 lr 0.000520 wd 0.0500 time 0.2313 (0.2424) data time 0.0007 (0.0022) model time 0.2306 (0.2406) loss 3.8144 (3.1842) grad_norm 2.9010 (2.7781) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][370/1251] eta 0:03:33 lr 0.000520 wd 0.0500 time 0.2333 (0.2423) data time 0.0009 (0.0022) model time 0.2324 (0.2405) loss 2.8025 (3.1783) grad_norm 3.1338 (2.7796) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][380/1251] eta 0:03:30 lr 0.000520 wd 0.0500 time 0.2326 (0.2421) data time 0.0008 (0.0021) model time 0.2318 (0.2404) loss 3.7298 (3.1754) grad_norm 2.7580 (2.7777) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][390/1251] eta 0:03:28 lr 0.000520 wd 0.0500 time 0.2369 (0.2421) data time 0.0010 (0.0021) model time 0.2359 (0.2403) loss 3.3618 
(3.1763) grad_norm 2.1133 (2.7658) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][400/1251] eta 0:03:25 lr 0.000520 wd 0.0500 time 0.2404 (0.2421) data time 0.0012 (0.0021) model time 0.2392 (0.2403) loss 3.2666 (3.1778) grad_norm 2.4396 (2.7696) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][410/1251] eta 0:03:23 lr 0.000520 wd 0.0500 time 0.2402 (0.2420) data time 0.0009 (0.0021) model time 0.2393 (0.2402) loss 2.6168 (3.1769) grad_norm 2.7219 (2.7611) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][420/1251] eta 0:03:20 lr 0.000520 wd 0.0500 time 0.2319 (0.2418) data time 0.0009 (0.0020) model time 0.2310 (0.2401) loss 3.4521 (3.1711) grad_norm 2.7565 (2.7520) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][430/1251] eta 0:03:18 lr 0.000520 wd 0.0500 time 0.2414 (0.2422) data time 0.0007 (0.0020) model time 0.2407 (0.2406) loss 2.3447 (3.1719) grad_norm 4.2752 (2.7487) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][440/1251] eta 0:03:16 lr 0.000520 wd 0.0500 time 0.2446 (0.2422) data time 0.0011 (0.0020) model time 0.2435 (0.2405) loss 3.0263 (3.1746) grad_norm 2.6073 (2.7488) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][450/1251] eta 0:03:13 lr 0.000520 wd 0.0500 time 0.2479 (0.2422) data time 0.0010 (0.0020) model time 0.2468 (0.2405) loss 2.5834 (3.1715) grad_norm 1.9056 (2.7360) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][460/1251] eta 0:03:11 lr 0.000520 wd 0.0500 
time 0.2400 (0.2420) data time 0.0008 (0.0019) model time 0.2392 (0.2404) loss 2.7591 (3.1735) grad_norm 3.1924 (2.7317) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][470/1251] eta 0:03:08 lr 0.000520 wd 0.0500 time 0.2332 (0.2420) data time 0.0008 (0.0019) model time 0.2324 (0.2403) loss 2.6395 (3.1721) grad_norm 2.0382 (2.7208) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][480/1251] eta 0:03:06 lr 0.000520 wd 0.0500 time 0.2515 (0.2419) data time 0.0011 (0.0019) model time 0.2505 (0.2403) loss 3.7394 (3.1695) grad_norm 3.7779 (2.7238) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][490/1251] eta 0:03:04 lr 0.000519 wd 0.0500 time 0.2387 (0.2419) data time 0.0007 (0.0019) model time 0.2379 (0.2402) loss 2.1701 (3.1690) grad_norm 2.2281 (2.7193) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][500/1251] eta 0:03:01 lr 0.000519 wd 0.0500 time 0.2322 (0.2418) data time 0.0008 (0.0019) model time 0.2314 (0.2402) loss 4.0692 (3.1671) grad_norm 2.2936 (2.7158) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][510/1251] eta 0:02:59 lr 0.000519 wd 0.0500 time 0.2363 (0.2417) data time 0.0010 (0.0019) model time 0.2353 (0.2401) loss 3.1205 (3.1625) grad_norm 2.3296 (2.7123) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][520/1251] eta 0:02:56 lr 0.000519 wd 0.0500 time 0.2304 (0.2416) data time 0.0011 (0.0018) model time 0.2293 (0.2400) loss 3.1420 (3.1584) grad_norm 1.5191 (2.7012) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:41 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [157/300][530/1251] eta 0:02:54 lr 0.000519 wd 0.0500 time 0.2356 (0.2416) data time 0.0011 (0.0018) model time 0.2345 (0.2400) loss 3.1349 (3.1593) grad_norm 2.4888 (2.6982) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][540/1251] eta 0:02:51 lr 0.000519 wd 0.0500 time 0.2383 (0.2416) data time 0.0010 (0.0018) model time 0.2373 (0.2400) loss 3.1588 (3.1621) grad_norm 2.2328 (2.6956) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][550/1251] eta 0:02:49 lr 0.000519 wd 0.0500 time 0.2357 (0.2415) data time 0.0010 (0.0018) model time 0.2347 (0.2399) loss 3.7941 (3.1690) grad_norm 2.2191 (2.6976) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][560/1251] eta 0:02:46 lr 0.000519 wd 0.0500 time 0.2353 (0.2415) data time 0.0011 (0.0018) model time 0.2343 (0.2399) loss 3.2683 (3.1686) grad_norm 3.1300 (2.6955) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][570/1251] eta 0:02:44 lr 0.000519 wd 0.0500 time 0.2328 (0.2414) data time 0.0008 (0.0018) model time 0.2320 (0.2398) loss 4.2772 (3.1700) grad_norm 1.9469 (2.6884) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][580/1251] eta 0:02:41 lr 0.000519 wd 0.0500 time 0.2370 (0.2413) data time 0.0007 (0.0018) model time 0.2363 (0.2397) loss 2.2765 (3.1674) grad_norm 2.4175 (2.6809) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][590/1251] eta 0:02:39 lr 0.000519 wd 0.0500 time 0.2350 (0.2413) data time 0.0010 (0.0017) model time 0.2339 (0.2397) loss 3.9015 (3.1712) grad_norm 2.3664 
(2.6758) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][600/1251] eta 0:02:36 lr 0.000519 wd 0.0500 time 0.2332 (0.2412) data time 0.0009 (0.0017) model time 0.2323 (0.2396) loss 2.9034 (3.1736) grad_norm 2.0830 (2.6701) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][610/1251] eta 0:02:34 lr 0.000519 wd 0.0500 time 0.2384 (0.2411) data time 0.0009 (0.0017) model time 0.2375 (0.2395) loss 2.0904 (3.1754) grad_norm 2.8396 (2.6635) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][620/1251] eta 0:02:32 lr 0.000519 wd 0.0500 time 0.2416 (0.2410) data time 0.0011 (0.0017) model time 0.2405 (0.2394) loss 3.2517 (3.1732) grad_norm 1.8285 (2.6607) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][630/1251] eta 0:02:29 lr 0.000519 wd 0.0500 time 0.2380 (0.2410) data time 0.0010 (0.0017) model time 0.2371 (0.2394) loss 3.6123 (3.1756) grad_norm 1.9652 (2.6568) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][640/1251] eta 0:02:27 lr 0.000519 wd 0.0500 time 0.2318 (0.2409) data time 0.0009 (0.0017) model time 0.2309 (0.2394) loss 3.6019 (3.1782) grad_norm 2.6043 (2.6539) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][650/1251] eta 0:02:24 lr 0.000519 wd 0.0500 time 0.2357 (0.2409) data time 0.0010 (0.0017) model time 0.2347 (0.2393) loss 2.4019 (3.1788) grad_norm 1.8632 (2.6531) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][660/1251] eta 0:02:22 lr 0.000519 wd 0.0500 time 0.2385 (0.2408) data 
time 0.0010 (0.0017) model time 0.2375 (0.2392) loss 2.4023 (3.1788) grad_norm 2.6405 (2.6516) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][670/1251] eta 0:02:20 lr 0.000519 wd 0.0500 time 0.2414 (0.2414) data time 0.0009 (0.0017) model time 0.2404 (0.2399) loss 3.0057 (3.1813) grad_norm 2.8337 (2.6604) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][680/1251] eta 0:02:18 lr 0.000519 wd 0.0500 time 0.2324 (0.2420) data time 0.0011 (0.0016) model time 0.2312 (0.2405) loss 3.3685 (3.1833) grad_norm 2.7357 (2.6643) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][690/1251] eta 0:02:15 lr 0.000519 wd 0.0500 time 0.2344 (0.2419) data time 0.0009 (0.0016) model time 0.2334 (0.2405) loss 2.5202 (3.1797) grad_norm 2.0444 (2.6569) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][700/1251] eta 0:02:13 lr 0.000519 wd 0.0500 time 0.2307 (0.2419) data time 0.0009 (0.0016) model time 0.2298 (0.2404) loss 2.0631 (3.1792) grad_norm 2.3430 (2.6533) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][710/1251] eta 0:02:10 lr 0.000519 wd 0.0500 time 0.2448 (0.2419) data time 0.0009 (0.0016) model time 0.2439 (0.2405) loss 2.0745 (3.1703) grad_norm 2.3656 (2.6551) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][720/1251] eta 0:02:08 lr 0.000518 wd 0.0500 time 0.2401 (0.2418) data time 0.0010 (0.0016) model time 0.2391 (0.2404) loss 3.8754 (3.1706) grad_norm 3.7501 (2.6607) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [157/300][730/1251] eta 0:02:05 lr 0.000518 wd 0.0500 time 0.2382 (0.2418) data time 0.0011 (0.0016) model time 0.2372 (0.2404) loss 3.3316 (3.1734) grad_norm 1.7499 (2.6570) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][740/1251] eta 0:02:03 lr 0.000518 wd 0.0500 time 0.2397 (0.2418) data time 0.0011 (0.0016) model time 0.2386 (0.2404) loss 3.2050 (3.1737) grad_norm 1.9673 (2.6496) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][750/1251] eta 0:02:01 lr 0.000518 wd 0.0500 time 0.2307 (0.2417) data time 0.0011 (0.0016) model time 0.2295 (0.2403) loss 3.4985 (3.1764) grad_norm 3.6836 (2.6479) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][760/1251] eta 0:01:58 lr 0.000518 wd 0.0500 time 0.2391 (0.2417) data time 0.0012 (0.0016) model time 0.2378 (0.2403) loss 3.5300 (3.1790) grad_norm 4.0587 (2.6469) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][770/1251] eta 0:01:56 lr 0.000518 wd 0.0500 time 0.2317 (0.2416) data time 0.0011 (0.0016) model time 0.2306 (0.2402) loss 2.3585 (3.1789) grad_norm 2.2863 (2.6492) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][780/1251] eta 0:01:53 lr 0.000518 wd 0.0500 time 0.2330 (0.2416) data time 0.0011 (0.0016) model time 0.2319 (0.2402) loss 2.9298 (3.1765) grad_norm 2.1630 (2.6452) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][790/1251] eta 0:01:51 lr 0.000518 wd 0.0500 time 0.2421 (0.2416) data time 0.0010 (0.0016) model time 0.2411 (0.2402) loss 3.4940 (3.1784) grad_norm 2.6229 (2.6406) loss_scale 512.0000 
(512.0000) mem 7382MB [2024-08-27 09:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][800/1251] eta 0:01:48 lr 0.000518 wd 0.0500 time 0.2414 (0.2415) data time 0.0008 (0.0016) model time 0.2405 (0.2401) loss 3.1931 (3.1813) grad_norm 2.8496 (2.6423) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][810/1251] eta 0:01:46 lr 0.000518 wd 0.0500 time 0.2402 (0.2415) data time 0.0010 (0.0016) model time 0.2391 (0.2401) loss 2.9666 (3.1785) grad_norm 2.5949 (2.6429) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][820/1251] eta 0:01:44 lr 0.000518 wd 0.0500 time 0.2358 (0.2414) data time 0.0010 (0.0015) model time 0.2348 (0.2400) loss 2.9130 (3.1788) grad_norm 2.3584 (2.6417) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][830/1251] eta 0:01:41 lr 0.000518 wd 0.0500 time 0.2282 (0.2414) data time 0.0009 (0.0015) model time 0.2273 (0.2400) loss 3.0908 (3.1772) grad_norm 2.9235 (2.6365) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][840/1251] eta 0:01:39 lr 0.000518 wd 0.0500 time 0.2431 (0.2414) data time 0.0011 (0.0015) model time 0.2420 (0.2400) loss 3.0779 (3.1747) grad_norm 1.8302 (2.6322) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][850/1251] eta 0:01:36 lr 0.000518 wd 0.0500 time 0.2415 (0.2414) data time 0.0009 (0.0015) model time 0.2406 (0.2400) loss 2.7348 (3.1749) grad_norm 2.2350 (2.6260) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][860/1251] eta 0:01:34 lr 0.000518 wd 0.0500 time 0.2367 (0.2414) data time 0.0012 (0.0015) model 
time 0.2355 (0.2400) loss 1.9253 (3.1701) grad_norm 2.2470 (2.6210) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][870/1251] eta 0:01:31 lr 0.000518 wd 0.0500 time 0.2410 (0.2413) data time 0.0011 (0.0015) model time 0.2399 (0.2399) loss 2.8602 (3.1720) grad_norm 2.0236 (2.6218) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][880/1251] eta 0:01:29 lr 0.000518 wd 0.0500 time 0.2386 (0.2413) data time 0.0010 (0.0015) model time 0.2376 (0.2399) loss 4.1138 (3.1731) grad_norm 2.1665 (2.6258) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][890/1251] eta 0:01:27 lr 0.000518 wd 0.0500 time 0.2444 (0.2412) data time 0.0010 (0.0015) model time 0.2435 (0.2399) loss 3.3194 (3.1712) grad_norm 1.6792 (2.6263) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][900/1251] eta 0:01:24 lr 0.000518 wd 0.0500 time 0.2511 (0.2412) data time 0.0010 (0.0015) model time 0.2501 (0.2398) loss 3.5242 (3.1685) grad_norm 2.0990 (2.6248) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][910/1251] eta 0:01:22 lr 0.000518 wd 0.0500 time 0.2437 (0.2412) data time 0.0011 (0.0015) model time 0.2427 (0.2398) loss 3.1903 (3.1674) grad_norm 1.8259 (2.6234) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][920/1251] eta 0:01:19 lr 0.000518 wd 0.0500 time 0.2449 (0.2412) data time 0.0010 (0.0015) model time 0.2439 (0.2398) loss 3.5821 (3.1680) grad_norm 1.9560 (2.6176) loss_scale 512.0000 (512.0000) mem 7382MB [2024-08-27 09:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][930/1251] 
eta 0:01:17 lr 0.000518 wd 0.0500 time 0.2440 (0.2412) data time 0.0008 (0.0015) model time 0.2432 (0.2398) loss 3.5902 (3.1692) grad_norm 2.0455 (2.6147) loss_scale 1024.0000 (516.3996) mem 7382MB [2024-08-27 09:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][940/1251] eta 0:01:15 lr 0.000517 wd 0.0500 time 0.2436 (0.2412) data time 0.0011 (0.0015) model time 0.2424 (0.2398) loss 3.5569 (3.1652) grad_norm 2.6911 (2.6127) loss_scale 1024.0000 (521.7938) mem 7382MB [2024-08-27 09:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][950/1251] eta 0:01:12 lr 0.000517 wd 0.0500 time 0.2520 (0.2412) data time 0.0007 (0.0015) model time 0.2513 (0.2398) loss 2.2760 (3.1641) grad_norm 2.0249 (2.6195) loss_scale 1024.0000 (527.0747) mem 7382MB [2024-08-27 09:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][960/1251] eta 0:01:10 lr 0.000517 wd 0.0500 time 0.2362 (0.2411) data time 0.0010 (0.0015) model time 0.2352 (0.2397) loss 3.2626 (3.1628) grad_norm 3.0729 (2.6183) loss_scale 1024.0000 (532.2456) mem 7382MB [2024-08-27 09:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][970/1251] eta 0:01:07 lr 0.000517 wd 0.0500 time 0.2316 (0.2411) data time 0.0011 (0.0015) model time 0.2306 (0.2397) loss 3.2979 (3.1644) grad_norm 1.9435 (2.6181) loss_scale 1024.0000 (537.3100) mem 7382MB [2024-08-27 09:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][980/1251] eta 0:01:05 lr 0.000517 wd 0.0500 time 0.2445 (0.2411) data time 0.0011 (0.0015) model time 0.2434 (0.2397) loss 2.3344 (3.1658) grad_norm 2.7948 (2.6174) loss_scale 1024.0000 (542.2712) mem 7382MB [2024-08-27 09:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][990/1251] eta 0:01:02 lr 0.000517 wd 0.0500 time 0.2401 (0.2410) data time 0.0007 (0.0015) model time 0.2394 (0.2397) loss 2.3943 (3.1644) grad_norm 2.6654 (2.6148) loss_scale 1024.0000 (547.1322) mem 7382MB 
[2024-08-27 09:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1000/1251] eta 0:01:00 lr 0.000517 wd 0.0500 time 0.2372 (0.2410) data time 0.0008 (0.0015) model time 0.2364 (0.2396) loss 2.2987 (3.1605) grad_norm 2.7739 (2.6137) loss_scale 1024.0000 (551.8961) mem 7382MB [2024-08-27 09:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1010/1251] eta 0:00:58 lr 0.000517 wd 0.0500 time 0.2436 (0.2410) data time 0.0008 (0.0015) model time 0.2428 (0.2396) loss 3.6043 (3.1590) grad_norm 2.6470 (2.6134) loss_scale 1024.0000 (556.5658) mem 7382MB [2024-08-27 09:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1020/1251] eta 0:00:55 lr 0.000517 wd 0.0500 time 0.2400 (0.2409) data time 0.0011 (0.0014) model time 0.2389 (0.2396) loss 3.2989 (3.1593) grad_norm 2.5879 (2.6103) loss_scale 1024.0000 (561.1440) mem 7382MB [2024-08-27 09:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1030/1251] eta 0:00:53 lr 0.000517 wd 0.0500 time 0.2382 (0.2409) data time 0.0009 (0.0014) model time 0.2373 (0.2395) loss 3.6417 (3.1619) grad_norm 4.7803 (2.6109) loss_scale 1024.0000 (565.6334) mem 7382MB [2024-08-27 09:57:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1040/1251] eta 0:00:50 lr 0.000517 wd 0.0500 time 0.2407 (0.2409) data time 0.0008 (0.0014) model time 0.2399 (0.2396) loss 2.4647 (3.1616) grad_norm 2.4367 (2.6128) loss_scale 1024.0000 (570.0365) mem 7382MB [2024-08-27 09:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1050/1251] eta 0:00:48 lr 0.000517 wd 0.0500 time 0.2377 (0.2409) data time 0.0007 (0.0014) model time 0.2370 (0.2395) loss 2.8159 (3.1629) grad_norm 2.3885 (2.6107) loss_scale 1024.0000 (574.3559) mem 7382MB [2024-08-27 09:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1060/1251] eta 0:00:46 lr 0.000517 wd 0.0500 time 0.2428 (0.2409) data time 0.0014 (0.0014) model time 
0.2415 (0.2395) loss 2.7832 (3.1612) grad_norm 2.9488 (2.6109) loss_scale 1024.0000 (578.5938) mem 7382MB [2024-08-27 09:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1070/1251] eta 0:00:43 lr 0.000517 wd 0.0500 time 0.2424 (0.2408) data time 0.0008 (0.0014) model time 0.2416 (0.2395) loss 3.6530 (3.1633) grad_norm 2.0392 (2.6054) loss_scale 1024.0000 (582.7526) mem 7382MB [2024-08-27 09:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1080/1251] eta 0:00:41 lr 0.000517 wd 0.0500 time 0.2403 (0.2408) data time 0.0011 (0.0014) model time 0.2392 (0.2394) loss 2.9370 (3.1629) grad_norm 1.8974 (2.6053) loss_scale 1024.0000 (586.8344) mem 7382MB [2024-08-27 09:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1090/1251] eta 0:00:38 lr 0.000517 wd 0.0500 time 0.2433 (0.2408) data time 0.0009 (0.0014) model time 0.2423 (0.2394) loss 3.2807 (3.1645) grad_norm 2.0787 (2.6021) loss_scale 1024.0000 (590.8414) mem 7382MB [2024-08-27 09:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1100/1251] eta 0:00:36 lr 0.000517 wd 0.0500 time 0.2404 (0.2408) data time 0.0010 (0.0014) model time 0.2394 (0.2394) loss 3.1322 (3.1642) grad_norm 2.0635 (2.6007) loss_scale 1024.0000 (594.7757) mem 7382MB [2024-08-27 09:58:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1110/1251] eta 0:00:33 lr 0.000517 wd 0.0500 time 0.2366 (0.2408) data time 0.0007 (0.0014) model time 0.2358 (0.2394) loss 3.2149 (3.1664) grad_norm 2.0981 (2.5981) loss_scale 1024.0000 (598.6391) mem 7382MB [2024-08-27 09:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1120/1251] eta 0:00:31 lr 0.000517 wd 0.0500 time 0.2353 (0.2407) data time 0.0008 (0.0014) model time 0.2345 (0.2394) loss 3.9631 (3.1677) grad_norm 2.6853 (2.5971) loss_scale 1024.0000 (602.4335) mem 7382MB [2024-08-27 09:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[157/300][1130/1251] eta 0:00:29 lr 0.000517 wd 0.0500 time 0.2312 (0.2407) data time 0.0010 (0.0014) model time 0.2302 (0.2394) loss 3.5374 (3.1674) grad_norm 2.5453 (2.5972) loss_scale 1024.0000 (606.1609) mem 7382MB [2024-08-27 09:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1140/1251] eta 0:00:26 lr 0.000517 wd 0.0500 time 0.2417 (0.2407) data time 0.0010 (0.0014) model time 0.2408 (0.2393) loss 2.7041 (3.1671) grad_norm 2.1111 (2.5960) loss_scale 1024.0000 (609.8230) mem 7382MB [2024-08-27 09:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1150/1251] eta 0:00:24 lr 0.000517 wd 0.0500 time 0.2348 (0.2406) data time 0.0010 (0.0014) model time 0.2338 (0.2393) loss 3.6493 (3.1671) grad_norm 2.8630 (2.5964) loss_scale 1024.0000 (613.4214) mem 7382MB [2024-08-27 09:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1160/1251] eta 0:00:21 lr 0.000517 wd 0.0500 time 0.2393 (0.2406) data time 0.0010 (0.0014) model time 0.2382 (0.2393) loss 3.3566 (3.1692) grad_norm 2.0391 (2.5987) loss_scale 1024.0000 (616.9578) mem 7382MB [2024-08-27 09:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1170/1251] eta 0:00:19 lr 0.000516 wd 0.0500 time 0.2305 (0.2406) data time 0.0009 (0.0014) model time 0.2296 (0.2393) loss 2.2213 (3.1692) grad_norm 3.5682 (2.6024) loss_scale 1024.0000 (620.4338) mem 7382MB [2024-08-27 09:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1180/1251] eta 0:00:17 lr 0.000516 wd 0.0500 time 0.2325 (0.2406) data time 0.0008 (0.0014) model time 0.2317 (0.2392) loss 3.4251 (3.1696) grad_norm 2.5338 (2.6032) loss_scale 1024.0000 (623.8510) mem 7382MB [2024-08-27 09:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1190/1251] eta 0:00:14 lr 0.000516 wd 0.0500 time 0.2373 (0.2408) data time 0.0010 (0.0014) model time 0.2363 (0.2395) loss 3.3952 (3.1687) grad_norm 2.6617 (2.6017) loss_scale 1024.0000 
(627.2107) mem 7382MB [2024-08-27 09:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1200/1251] eta 0:00:12 lr 0.000516 wd 0.0500 time 0.2398 (0.2411) data time 0.0009 (0.0014) model time 0.2390 (0.2398) loss 3.8807 (3.1709) grad_norm 1.9293 (2.6015) loss_scale 1024.0000 (630.5146) mem 7382MB [2024-08-27 09:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1210/1251] eta 0:00:09 lr 0.000516 wd 0.0500 time 0.2364 (0.2411) data time 0.0010 (0.0014) model time 0.2354 (0.2398) loss 3.6167 (3.1717) grad_norm 2.4710 (2.6002) loss_scale 1024.0000 (633.7638) mem 7382MB [2024-08-27 09:58:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1220/1251] eta 0:00:07 lr 0.000516 wd 0.0500 time 0.2346 (0.2410) data time 0.0008 (0.0014) model time 0.2338 (0.2397) loss 3.7029 (3.1727) grad_norm 2.4184 (2.5971) loss_scale 1024.0000 (636.9599) mem 7382MB [2024-08-27 09:58:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1230/1251] eta 0:00:05 lr 0.000516 wd 0.0500 time 0.2259 (0.2410) data time 0.0008 (0.0014) model time 0.2251 (0.2397) loss 2.5992 (3.1746) grad_norm 2.4134 (2.5957) loss_scale 1024.0000 (640.1040) mem 7382MB [2024-08-27 09:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1240/1251] eta 0:00:02 lr 0.000516 wd 0.0500 time 0.2229 (0.2409) data time 0.0006 (0.0014) model time 0.2223 (0.2396) loss 2.6545 (3.1730) grad_norm 2.8175 (2.5945) loss_scale 1024.0000 (643.1974) mem 7382MB [2024-08-27 09:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [157/300][1250/1251] eta 0:00:00 lr 0.000516 wd 0.0500 time 0.2244 (0.2408) data time 0.0005 (0.0014) model time 0.2239 (0.2395) loss 3.0191 (3.1722) grad_norm 2.7165 (2.5914) loss_scale 1024.0000 (646.2414) mem 7382MB [2024-08-27 09:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 157 training takes 0:05:01 [2024-08-27 09:58:34 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 09:58:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 09:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.507 (0.507) Loss 0.4473 (0.4473) Acc@1 92.090 (92.090) Acc@5 98.438 (98.438) Mem 7382MB [2024-08-27 09:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.113) Loss 0.7520 (0.7230) Acc@1 84.668 (84.215) Acc@5 96.777 (97.053) Mem 7382MB [2024-08-27 09:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.095) Loss 1.0723 (0.7527) Acc@1 74.414 (83.185) Acc@5 93.066 (96.908) Mem 7382MB [2024-08-27 09:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.068 (0.089) Loss 1.2705 (0.8533) Acc@1 69.336 (80.853) Acc@5 91.016 (95.713) Mem 7382MB [2024-08-27 09:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.1465 (0.9088) Acc@1 73.047 (79.552) Acc@5 92.676 (95.034) Mem 7382MB [2024-08-27 09:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.210 Acc@5 94.996 [2024-08-27 09:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.2% [2024-08-27 09:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.777 (0.777) Loss 0.4023 (0.4023) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7382MB [2024-08-27 09:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.143) Loss 0.6348 (0.6330) Acc@1 86.816 (86.355) Acc@5 97.168 (97.390) Mem 7382MB [2024-08-27 09:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.113) Loss 0.9038 (0.6578) Acc@1 78.516 (85.482) Acc@5 95.410 (97.396) Mem 7382MB [2024-08-27 09:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.078 (0.100) Loss 1.1396 (0.7470) Acc@1 72.266 (83.339) Acc@5 92.871 (96.425) Mem 7382MB [2024-08-27 09:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 1.0322 (0.7940) Acc@1 74.707 (81.969) Acc@5 93.750 (95.956) Mem 7382MB [2024-08-27 09:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.524 Acc@5 95.924 [2024-08-27 09:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.5% [2024-08-27 09:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.52% [2024-08-27 09:58:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 09:58:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 09:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][0/1251] eta 0:16:30 lr 0.000516 wd 0.0500 time 0.7920 (0.7920) data time 0.5681 (0.5681) model time 0.0000 (0.0000) loss 1.9974 (1.9974) grad_norm 3.7099 (3.7099) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][10/1251] eta 0:05:58 lr 0.000516 wd 0.0500 time 0.2427 (0.2886) data time 0.0008 (0.0526) model time 0.0000 (0.0000) loss 3.6290 (3.1206) grad_norm 3.6383 (2.5635) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][20/1251] eta 0:05:25 lr 0.000516 wd 0.0500 time 0.2325 (0.2648) data time 0.0012 (0.0280) model time 0.0000 (0.0000) loss 3.1248 (3.2983) grad_norm 2.6003 (2.6696) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][30/1251] eta 0:05:12 lr 0.000516 wd 0.0500 time 0.2358 (0.2562) data time 0.0011 
(0.0193) model time 0.0000 (0.0000) loss 2.1428 (3.1772) grad_norm 2.1135 (2.5243) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][40/1251] eta 0:05:06 lr 0.000516 wd 0.0500 time 0.2600 (0.2532) data time 0.0011 (0.0148) model time 0.0000 (0.0000) loss 3.4416 (3.2283) grad_norm 2.1258 (2.4592) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][50/1251] eta 0:05:00 lr 0.000516 wd 0.0500 time 0.2354 (0.2504) data time 0.0008 (0.0121) model time 0.0000 (0.0000) loss 3.8107 (3.2602) grad_norm 2.7947 (2.4571) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][60/1251] eta 0:04:55 lr 0.000516 wd 0.0500 time 0.2408 (0.2485) data time 0.0011 (0.0103) model time 0.2397 (0.2375) loss 3.7986 (3.2098) grad_norm 2.5908 (2.4469) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][70/1251] eta 0:04:51 lr 0.000516 wd 0.0500 time 0.2351 (0.2471) data time 0.0011 (0.0090) model time 0.2340 (0.2378) loss 3.4092 (3.2074) grad_norm 2.6739 (2.4733) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][80/1251] eta 0:04:48 lr 0.000516 wd 0.0500 time 0.2387 (0.2463) data time 0.0010 (0.0080) model time 0.2377 (0.2384) loss 3.3161 (3.2282) grad_norm 1.8698 (2.4446) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][90/1251] eta 0:04:44 lr 0.000516 wd 0.0500 time 0.2436 (0.2453) data time 0.0009 (0.0073) model time 0.2428 (0.2378) loss 3.1302 (3.2181) grad_norm 2.7572 (2.4141) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [158/300][100/1251] eta 0:04:41 lr 0.000516 wd 0.0500 time 0.2312 (0.2448) data time 0.0010 (0.0066) model time 0.2303 (0.2380) loss 2.6833 (3.2057) grad_norm 2.6832 (2.4079) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][110/1251] eta 0:04:38 lr 0.000516 wd 0.0500 time 0.2447 (0.2443) data time 0.0007 (0.0061) model time 0.2439 (0.2381) loss 2.9872 (3.1776) grad_norm 2.0234 (2.5898) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][120/1251] eta 0:04:35 lr 0.000516 wd 0.0500 time 0.2444 (0.2438) data time 0.0008 (0.0057) model time 0.2436 (0.2380) loss 3.5205 (3.1847) grad_norm 2.9185 (2.5597) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][130/1251] eta 0:04:33 lr 0.000516 wd 0.0500 time 0.2392 (0.2436) data time 0.0008 (0.0054) model time 0.2384 (0.2383) loss 3.6495 (3.1984) grad_norm 3.0943 (2.5509) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][140/1251] eta 0:04:30 lr 0.000515 wd 0.0500 time 0.2432 (0.2434) data time 0.0011 (0.0051) model time 0.2421 (0.2384) loss 2.7221 (3.1950) grad_norm 2.6015 (2.5345) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][150/1251] eta 0:04:27 lr 0.000515 wd 0.0500 time 0.2299 (0.2433) data time 0.0011 (0.0048) model time 0.2288 (0.2386) loss 3.4855 (3.2052) grad_norm 2.7905 (2.5589) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][160/1251] eta 0:04:25 lr 0.000515 wd 0.0500 time 0.2370 (0.2430) data time 0.0011 (0.0046) model time 0.2359 (0.2385) loss 2.9589 (3.2098) grad_norm 2.4948 (2.5545) loss_scale 
1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][170/1251] eta 0:04:22 lr 0.000515 wd 0.0500 time 0.2394 (0.2427) data time 0.0007 (0.0043) model time 0.2387 (0.2385) loss 2.4388 (3.2011) grad_norm 2.6921 (2.5515) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][180/1251] eta 0:04:19 lr 0.000515 wd 0.0500 time 0.2390 (0.2426) data time 0.0007 (0.0042) model time 0.2383 (0.2385) loss 3.4599 (3.1900) grad_norm 2.4371 (2.5504) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][190/1251] eta 0:04:17 lr 0.000515 wd 0.0500 time 0.2356 (0.2425) data time 0.0010 (0.0040) model time 0.2346 (0.2385) loss 3.0820 (3.1911) grad_norm 3.4689 (2.5600) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][200/1251] eta 0:04:14 lr 0.000515 wd 0.0500 time 0.2376 (0.2423) data time 0.0007 (0.0038) model time 0.2368 (0.2385) loss 3.2669 (3.1846) grad_norm 1.8793 (2.5501) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][210/1251] eta 0:04:12 lr 0.000515 wd 0.0500 time 0.2432 (0.2422) data time 0.0009 (0.0037) model time 0.2422 (0.2386) loss 3.4294 (3.1886) grad_norm 2.3151 (2.5489) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][220/1251] eta 0:04:09 lr 0.000515 wd 0.0500 time 0.2455 (0.2421) data time 0.0007 (0.0036) model time 0.2448 (0.2386) loss 2.3748 (3.1944) grad_norm 2.4313 (2.5356) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][230/1251] eta 0:04:07 lr 0.000515 wd 0.0500 time 0.2366 (0.2420) data time 
0.0007 (0.0035) model time 0.2359 (0.2387) loss 3.1500 (3.1972) grad_norm 3.1162 (2.5449) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][240/1251] eta 0:04:04 lr 0.000515 wd 0.0500 time 0.2382 (0.2419) data time 0.0011 (0.0034) model time 0.2371 (0.2387) loss 3.0854 (3.1839) grad_norm 2.5440 (2.5480) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][250/1251] eta 0:04:02 lr 0.000515 wd 0.0500 time 0.2372 (0.2419) data time 0.0010 (0.0033) model time 0.2363 (0.2387) loss 3.3529 (3.1875) grad_norm 1.7847 (2.5423) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][260/1251] eta 0:03:59 lr 0.000515 wd 0.0500 time 0.2420 (0.2417) data time 0.0010 (0.0032) model time 0.2410 (0.2386) loss 2.8741 (3.1817) grad_norm 2.3192 (2.5564) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][270/1251] eta 0:03:57 lr 0.000515 wd 0.0500 time 0.2342 (0.2417) data time 0.0010 (0.0031) model time 0.2331 (0.2386) loss 3.2495 (3.1736) grad_norm 2.6450 (2.5712) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][280/1251] eta 0:03:54 lr 0.000515 wd 0.0500 time 0.2453 (0.2416) data time 0.0011 (0.0030) model time 0.2442 (0.2387) loss 3.5353 (3.1782) grad_norm 2.2781 (2.5650) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][290/1251] eta 0:03:52 lr 0.000515 wd 0.0500 time 0.2332 (0.2415) data time 0.0011 (0.0030) model time 0.2321 (0.2386) loss 3.1834 (3.1713) grad_norm 2.4005 (2.5637) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [158/300][300/1251] eta 0:03:49 lr 0.000515 wd 0.0500 time 0.2381 (0.2415) data time 0.0009 (0.0029) model time 0.2372 (0.2386) loss 2.5533 (3.1696) grad_norm 3.1929 (2.5671) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 09:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][310/1251] eta 0:03:47 lr 0.000515 wd 0.0500 time 0.2340 (0.2414) data time 0.0011 (0.0028) model time 0.2328 (0.2386) loss 2.9183 (3.1713) grad_norm 2.3243 (2.5674) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][320/1251] eta 0:03:44 lr 0.000515 wd 0.0500 time 0.2446 (0.2413) data time 0.0007 (0.0028) model time 0.2439 (0.2385) loss 3.8265 (3.1832) grad_norm 2.5218 (2.5647) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][330/1251] eta 0:03:42 lr 0.000515 wd 0.0500 time 0.2315 (0.2412) data time 0.0009 (0.0027) model time 0.2307 (0.2385) loss 4.0983 (3.1906) grad_norm 3.1200 (2.5647) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][340/1251] eta 0:03:39 lr 0.000515 wd 0.0500 time 0.2419 (0.2412) data time 0.0009 (0.0027) model time 0.2410 (0.2385) loss 3.9114 (3.1855) grad_norm 2.9229 (2.5739) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][350/1251] eta 0:03:37 lr 0.000515 wd 0.0500 time 0.2313 (0.2411) data time 0.0012 (0.0026) model time 0.2301 (0.2385) loss 3.3519 (3.1817) grad_norm 2.7186 (2.5953) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][360/1251] eta 0:03:34 lr 0.000515 wd 0.0500 time 0.2408 (0.2410) data time 0.0009 (0.0026) model time 0.2399 (0.2385) loss 2.2407 (3.1754) grad_norm 3.1621 (2.6033) 
loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][370/1251] eta 0:03:32 lr 0.000514 wd 0.0500 time 0.2442 (0.2410) data time 0.0008 (0.0025) model time 0.2435 (0.2385) loss 2.6098 (3.1834) grad_norm 2.0424 (2.5930) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][380/1251] eta 0:03:29 lr 0.000514 wd 0.0500 time 0.2366 (0.2410) data time 0.0010 (0.0025) model time 0.2355 (0.2385) loss 2.9882 (3.1826) grad_norm 1.7761 (2.5824) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][390/1251] eta 0:03:27 lr 0.000514 wd 0.0500 time 0.2422 (0.2410) data time 0.0008 (0.0025) model time 0.2414 (0.2385) loss 3.7986 (3.1828) grad_norm 2.4879 (2.5814) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][400/1251] eta 0:03:25 lr 0.000514 wd 0.0500 time 0.2412 (0.2409) data time 0.0011 (0.0024) model time 0.2402 (0.2385) loss 3.0343 (3.1841) grad_norm 2.0909 (2.5768) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][410/1251] eta 0:03:22 lr 0.000514 wd 0.0500 time 0.2371 (0.2409) data time 0.0011 (0.0024) model time 0.2361 (0.2386) loss 3.3259 (3.1902) grad_norm 1.9895 (2.5775) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][420/1251] eta 0:03:20 lr 0.000514 wd 0.0500 time 0.2368 (0.2409) data time 0.0010 (0.0024) model time 0.2357 (0.2385) loss 3.4742 (3.1971) grad_norm 2.0635 (2.5743) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][430/1251] eta 0:03:17 lr 0.000514 wd 0.0500 time 0.2510 (0.2408) 
data time 0.0009 (0.0023) model time 0.2501 (0.2385) loss 3.0006 (3.2005) grad_norm 2.3136 (2.5793) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][440/1251] eta 0:03:15 lr 0.000514 wd 0.0500 time 0.2532 (0.2408) data time 0.0009 (0.0023) model time 0.2523 (0.2385) loss 2.3314 (3.1949) grad_norm 2.3407 (2.5814) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][450/1251] eta 0:03:13 lr 0.000514 wd 0.0500 time 0.4212 (0.2412) data time 0.0008 (0.0023) model time 0.4205 (0.2390) loss 2.8890 (3.1927) grad_norm 3.4132 (2.5828) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][460/1251] eta 0:03:11 lr 0.000514 wd 0.0500 time 0.2360 (0.2419) data time 0.0008 (0.0023) model time 0.2352 (0.2398) loss 3.8988 (3.1979) grad_norm 2.8915 (2.5791) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][470/1251] eta 0:03:08 lr 0.000514 wd 0.0500 time 0.2471 (0.2418) data time 0.0010 (0.0022) model time 0.2461 (0.2397) loss 3.1565 (3.1936) grad_norm 2.4896 (2.5731) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][480/1251] eta 0:03:06 lr 0.000514 wd 0.0500 time 0.2366 (0.2417) data time 0.0008 (0.0022) model time 0.2359 (0.2396) loss 3.6891 (3.2030) grad_norm 2.9082 (2.5700) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][490/1251] eta 0:03:04 lr 0.000514 wd 0.0500 time 0.2377 (0.2420) data time 0.0009 (0.0022) model time 0.2369 (0.2400) loss 3.3857 (3.2051) grad_norm 2.2188 (2.5714) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:45 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [158/300][500/1251] eta 0:03:01 lr 0.000514 wd 0.0500 time 0.2315 (0.2419) data time 0.0011 (0.0022) model time 0.2304 (0.2399) loss 2.9376 (3.2038) grad_norm 4.0536 (2.5726) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][510/1251] eta 0:02:59 lr 0.000514 wd 0.0500 time 0.2427 (0.2419) data time 0.0011 (0.0021) model time 0.2416 (0.2399) loss 3.1418 (3.2052) grad_norm 3.8452 (2.5738) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][520/1251] eta 0:02:56 lr 0.000514 wd 0.0500 time 0.2400 (0.2419) data time 0.0010 (0.0021) model time 0.2390 (0.2399) loss 3.3302 (3.2052) grad_norm 2.5416 (2.5800) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][530/1251] eta 0:02:54 lr 0.000514 wd 0.0500 time 0.2375 (0.2418) data time 0.0012 (0.0021) model time 0.2363 (0.2399) loss 3.1719 (3.2064) grad_norm 2.3755 (2.5780) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][540/1251] eta 0:02:51 lr 0.000514 wd 0.0500 time 0.2327 (0.2417) data time 0.0009 (0.0021) model time 0.2318 (0.2398) loss 2.9403 (3.2103) grad_norm 2.6544 (2.5885) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][550/1251] eta 0:02:49 lr 0.000514 wd 0.0500 time 0.2368 (0.2416) data time 0.0027 (0.0021) model time 0.2341 (0.2397) loss 3.0885 (3.2125) grad_norm 2.4636 (2.5864) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][560/1251] eta 0:02:46 lr 0.000514 wd 0.0500 time 0.2391 (0.2416) data time 0.0011 (0.0020) model time 0.2380 (0.2397) loss 3.3278 (3.2112) grad_norm 
3.0357 (2.5899) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][570/1251] eta 0:02:44 lr 0.000514 wd 0.0500 time 0.2376 (0.2415) data time 0.0011 (0.0020) model time 0.2365 (0.2396) loss 3.2995 (3.2118) grad_norm 2.2485 (2.5957) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][580/1251] eta 0:02:42 lr 0.000514 wd 0.0500 time 0.2385 (0.2415) data time 0.0007 (0.0020) model time 0.2379 (0.2396) loss 3.3814 (3.2124) grad_norm 3.7837 (2.6005) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][590/1251] eta 0:02:39 lr 0.000513 wd 0.0500 time 0.2386 (0.2414) data time 0.0010 (0.0020) model time 0.2376 (0.2395) loss 3.2287 (3.2169) grad_norm 3.1310 (2.6105) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][600/1251] eta 0:02:37 lr 0.000513 wd 0.0500 time 0.2349 (0.2413) data time 0.0011 (0.0020) model time 0.2338 (0.2394) loss 3.3069 (3.2173) grad_norm 2.7698 (2.6212) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][610/1251] eta 0:02:34 lr 0.000513 wd 0.0500 time 0.2431 (0.2413) data time 0.0008 (0.0020) model time 0.2423 (0.2394) loss 2.8162 (3.2208) grad_norm 3.0945 (2.6174) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][620/1251] eta 0:02:32 lr 0.000513 wd 0.0500 time 0.2371 (0.2413) data time 0.0011 (0.0020) model time 0.2359 (0.2394) loss 3.3625 (3.2199) grad_norm 1.7028 (2.6111) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][630/1251] eta 0:02:29 lr 0.000513 wd 0.0500 time 
0.2383 (0.2412) data time 0.0008 (0.0019) model time 0.2375 (0.2394) loss 4.0446 (3.2200) grad_norm 2.1386 (2.6086) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][640/1251] eta 0:02:27 lr 0.000513 wd 0.0500 time 0.2342 (0.2412) data time 0.0010 (0.0019) model time 0.2332 (0.2393) loss 3.2837 (3.2184) grad_norm 2.5143 (2.6072) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][650/1251] eta 0:02:24 lr 0.000513 wd 0.0500 time 0.2366 (0.2411) data time 0.0010 (0.0019) model time 0.2356 (0.2393) loss 3.5673 (3.2208) grad_norm 2.6238 (2.6046) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][660/1251] eta 0:02:22 lr 0.000513 wd 0.0500 time 0.2353 (0.2411) data time 0.0011 (0.0019) model time 0.2342 (0.2393) loss 3.3889 (3.2263) grad_norm 2.8281 (2.6097) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][670/1251] eta 0:02:20 lr 0.000513 wd 0.0500 time 0.2419 (0.2411) data time 0.0010 (0.0019) model time 0.2409 (0.2393) loss 3.6040 (3.2268) grad_norm 1.9212 (2.6214) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][680/1251] eta 0:02:17 lr 0.000513 wd 0.0500 time 0.2381 (0.2410) data time 0.0007 (0.0019) model time 0.2374 (0.2392) loss 2.1485 (3.2268) grad_norm 2.5071 (2.6260) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][690/1251] eta 0:02:15 lr 0.000513 wd 0.0500 time 0.2314 (0.2410) data time 0.0008 (0.0019) model time 0.2306 (0.2392) loss 2.0444 (3.2248) grad_norm 1.6935 (2.6202) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][700/1251] eta 0:02:12 lr 0.000513 wd 0.0500 time 0.2398 (0.2409) data time 0.0007 (0.0018) model time 0.2391 (0.2392) loss 4.2106 (3.2238) grad_norm 2.5144 (2.6191) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][710/1251] eta 0:02:10 lr 0.000513 wd 0.0500 time 0.2359 (0.2409) data time 0.0010 (0.0018) model time 0.2349 (0.2391) loss 3.4290 (3.2200) grad_norm 2.8936 (2.6148) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][720/1251] eta 0:02:07 lr 0.000513 wd 0.0500 time 0.2553 (0.2409) data time 0.0007 (0.0018) model time 0.2545 (0.2391) loss 3.6740 (3.2203) grad_norm 2.4217 (2.6167) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][730/1251] eta 0:02:05 lr 0.000513 wd 0.0500 time 0.2333 (0.2408) data time 0.0009 (0.0018) model time 0.2324 (0.2391) loss 3.5282 (3.2195) grad_norm 3.0226 (2.6165) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][740/1251] eta 0:02:03 lr 0.000513 wd 0.0500 time 0.2328 (0.2408) data time 0.0010 (0.0018) model time 0.2318 (0.2390) loss 3.7407 (3.2143) grad_norm 1.9935 (2.6131) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][750/1251] eta 0:02:00 lr 0.000513 wd 0.0500 time 0.2382 (0.2407) data time 0.0007 (0.0018) model time 0.2375 (0.2390) loss 3.0290 (3.2163) grad_norm 2.5981 (2.6084) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][760/1251] eta 0:01:58 lr 0.000513 wd 0.0500 time 0.2407 (0.2407) data time 0.0008 (0.0018) model time 0.2399 (0.2389) loss 4.4324 
(3.2165) grad_norm 3.6454 (2.6061) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][770/1251] eta 0:01:55 lr 0.000513 wd 0.0500 time 0.2396 (0.2406) data time 0.0007 (0.0018) model time 0.2389 (0.2389) loss 2.9425 (3.2155) grad_norm 3.0948 (2.6040) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][780/1251] eta 0:01:53 lr 0.000513 wd 0.0500 time 0.2320 (0.2406) data time 0.0011 (0.0018) model time 0.2309 (0.2389) loss 3.2749 (3.2164) grad_norm 2.6670 (2.6107) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][790/1251] eta 0:01:50 lr 0.000513 wd 0.0500 time 0.2356 (0.2406) data time 0.0007 (0.0018) model time 0.2348 (0.2388) loss 3.1178 (3.2185) grad_norm 1.9800 (2.6091) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][800/1251] eta 0:01:48 lr 0.000513 wd 0.0500 time 0.2433 (0.2405) data time 0.0010 (0.0017) model time 0.2423 (0.2388) loss 3.5108 (3.2209) grad_norm 2.1584 (2.6043) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][810/1251] eta 0:01:46 lr 0.000513 wd 0.0500 time 0.2410 (0.2405) data time 0.0008 (0.0017) model time 0.2402 (0.2388) loss 2.5261 (3.2203) grad_norm 2.4626 (2.6025) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][820/1251] eta 0:01:43 lr 0.000512 wd 0.0500 time 0.2452 (0.2405) data time 0.0009 (0.0017) model time 0.2442 (0.2388) loss 2.6533 (3.2176) grad_norm 2.3310 (2.5975) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][830/1251] eta 0:01:41 lr 
0.000512 wd 0.0500 time 0.2364 (0.2404) data time 0.0011 (0.0017) model time 0.2353 (0.2388) loss 3.3037 (3.2149) grad_norm 3.3413 (2.6002) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][840/1251] eta 0:01:38 lr 0.000512 wd 0.0500 time 0.2398 (0.2404) data time 0.0010 (0.0017) model time 0.2388 (0.2387) loss 2.9494 (3.2096) grad_norm 2.3867 (2.6006) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][850/1251] eta 0:01:36 lr 0.000512 wd 0.0500 time 0.2439 (0.2404) data time 0.0010 (0.0017) model time 0.2429 (0.2387) loss 2.5726 (3.2102) grad_norm 3.0370 (2.6010) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][860/1251] eta 0:01:33 lr 0.000512 wd 0.0500 time 0.2420 (0.2404) data time 0.0010 (0.0017) model time 0.2410 (0.2387) loss 2.4232 (3.2090) grad_norm 3.0603 (2.6039) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][870/1251] eta 0:01:31 lr 0.000512 wd 0.0500 time 0.2422 (0.2404) data time 0.0008 (0.0017) model time 0.2414 (0.2387) loss 3.9855 (3.2085) grad_norm 2.5364 (2.6028) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][880/1251] eta 0:01:29 lr 0.000512 wd 0.0500 time 0.2430 (0.2404) data time 0.0008 (0.0017) model time 0.2421 (0.2387) loss 2.3332 (3.2082) grad_norm 2.2304 (2.6100) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][890/1251] eta 0:01:26 lr 0.000512 wd 0.0500 time 0.2334 (0.2404) data time 0.0008 (0.0017) model time 0.2326 (0.2387) loss 3.1920 (3.2095) grad_norm 2.2388 (2.6090) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 
10:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][900/1251] eta 0:01:24 lr 0.000512 wd 0.0500 time 0.2406 (0.2403) data time 0.0011 (0.0017) model time 0.2395 (0.2387) loss 3.5078 (3.2108) grad_norm 2.3842 (2.6082) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][910/1251] eta 0:01:21 lr 0.000512 wd 0.0500 time 0.2406 (0.2403) data time 0.0011 (0.0017) model time 0.2394 (0.2387) loss 2.8336 (3.2119) grad_norm 2.0340 (2.6049) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][920/1251] eta 0:01:19 lr 0.000512 wd 0.0500 time 0.2376 (0.2403) data time 0.0010 (0.0016) model time 0.2366 (0.2387) loss 3.1456 (3.2083) grad_norm 2.6657 (2.6021) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][930/1251] eta 0:01:17 lr 0.000512 wd 0.0500 time 0.2401 (0.2403) data time 0.0007 (0.0016) model time 0.2394 (0.2387) loss 3.1219 (3.2113) grad_norm 1.9954 (2.6039) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][940/1251] eta 0:01:14 lr 0.000512 wd 0.0500 time 0.2183 (0.2405) data time 0.0009 (0.0016) model time 0.2174 (0.2389) loss 2.3501 (3.2099) grad_norm 2.2280 (2.6039) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][950/1251] eta 0:01:12 lr 0.000512 wd 0.0500 time 0.2408 (0.2405) data time 0.0012 (0.0016) model time 0.2396 (0.2389) loss 3.7723 (3.2132) grad_norm 2.0020 (2.6039) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][960/1251] eta 0:01:09 lr 0.000512 wd 0.0500 time 0.2394 (0.2405) data time 0.0007 (0.0016) model time 0.2387 (0.2389) 
loss 3.6777 (3.2123) grad_norm 4.8605 (2.6043) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][970/1251] eta 0:01:07 lr 0.000512 wd 0.0500 time 0.2375 (0.2404) data time 0.0008 (0.0016) model time 0.2367 (0.2389) loss 2.5166 (3.2102) grad_norm 1.9462 (2.6022) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][980/1251] eta 0:01:05 lr 0.000512 wd 0.0500 time 0.2359 (0.2406) data time 0.0010 (0.0016) model time 0.2349 (0.2391) loss 3.5406 (3.2102) grad_norm 2.1070 (2.6061) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][990/1251] eta 0:01:02 lr 0.000512 wd 0.0500 time 0.2371 (0.2411) data time 0.0010 (0.0016) model time 0.2361 (0.2396) loss 3.8320 (3.2103) grad_norm 4.7461 (2.6080) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1000/1251] eta 0:01:00 lr 0.000512 wd 0.0500 time 0.2416 (0.2411) data time 0.0007 (0.0016) model time 0.2409 (0.2396) loss 3.9125 (3.2104) grad_norm 2.1083 (2.6078) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1010/1251] eta 0:00:58 lr 0.000512 wd 0.0500 time 0.2398 (0.2412) data time 0.0009 (0.0016) model time 0.2389 (0.2397) loss 3.4694 (3.2087) grad_norm 2.2630 (2.6047) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1020/1251] eta 0:00:55 lr 0.000512 wd 0.0500 time 0.2432 (0.2412) data time 0.0012 (0.0016) model time 0.2420 (0.2397) loss 2.9970 (3.2071) grad_norm 6.5573 (2.6081) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1030/1251] eta 
0:00:53 lr 0.000512 wd 0.0500 time 0.2426 (0.2412) data time 0.0010 (0.0016) model time 0.2415 (0.2397) loss 3.4848 (3.2036) grad_norm 2.9383 (2.6116) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1040/1251] eta 0:00:50 lr 0.000511 wd 0.0500 time 0.2313 (0.2412) data time 0.0009 (0.0016) model time 0.2303 (0.2397) loss 3.9757 (3.2051) grad_norm 2.9711 (2.6104) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1050/1251] eta 0:00:48 lr 0.000511 wd 0.0500 time 0.2345 (0.2411) data time 0.0007 (0.0016) model time 0.2338 (0.2396) loss 2.2497 (3.2042) grad_norm 3.9838 (2.6170) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1060/1251] eta 0:00:46 lr 0.000511 wd 0.0500 time 0.2341 (0.2411) data time 0.0010 (0.0016) model time 0.2331 (0.2396) loss 3.6343 (3.2053) grad_norm 2.5043 (2.6189) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1070/1251] eta 0:00:43 lr 0.000511 wd 0.0500 time 0.2427 (0.2411) data time 0.0008 (0.0016) model time 0.2419 (0.2396) loss 2.2576 (3.2045) grad_norm 2.0260 (2.6150) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1080/1251] eta 0:00:41 lr 0.000511 wd 0.0500 time 0.2276 (0.2411) data time 0.0010 (0.0016) model time 0.2266 (0.2396) loss 2.5808 (3.2046) grad_norm 2.1118 (2.6189) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1090/1251] eta 0:00:38 lr 0.000511 wd 0.0500 time 0.2372 (0.2410) data time 0.0010 (0.0016) model time 0.2361 (0.2396) loss 3.7633 (3.2056) grad_norm 1.9903 (2.6192) loss_scale 1024.0000 (1024.0000) mem 
7382MB [2024-08-27 10:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1100/1251] eta 0:00:36 lr 0.000511 wd 0.0500 time 0.2274 (0.2410) data time 0.0009 (0.0015) model time 0.2266 (0.2395) loss 3.1881 (3.2069) grad_norm 2.6579 (2.6223) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1110/1251] eta 0:00:33 lr 0.000511 wd 0.0500 time 0.2401 (0.2410) data time 0.0012 (0.0015) model time 0.2389 (0.2395) loss 3.9293 (3.2079) grad_norm 3.0777 (2.6273) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1120/1251] eta 0:00:31 lr 0.000511 wd 0.0500 time 0.2420 (0.2409) data time 0.0009 (0.0015) model time 0.2412 (0.2395) loss 3.9518 (3.2080) grad_norm 2.6901 (2.6294) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1130/1251] eta 0:00:29 lr 0.000511 wd 0.0500 time 0.2388 (0.2409) data time 0.0010 (0.0015) model time 0.2378 (0.2395) loss 2.5159 (3.2074) grad_norm 2.4573 (2.6273) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1140/1251] eta 0:00:26 lr 0.000511 wd 0.0500 time 0.2414 (0.2409) data time 0.0010 (0.0015) model time 0.2404 (0.2394) loss 3.4574 (3.2073) grad_norm 2.1861 (2.6362) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1150/1251] eta 0:00:24 lr 0.000511 wd 0.0500 time 0.2428 (0.2409) data time 0.0010 (0.0015) model time 0.2418 (0.2394) loss 3.5571 (3.2075) grad_norm 2.7984 (2.6361) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1160/1251] eta 0:00:21 lr 0.000511 wd 0.0500 time 0.2377 (0.2409) data time 0.0008 (0.0015) 
model time 0.2369 (0.2394) loss 2.3939 (3.2091) grad_norm 2.2683 (2.6324) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1170/1251] eta 0:00:19 lr 0.000511 wd 0.0500 time 0.2389 (0.2408) data time 0.0010 (0.0015) model time 0.2379 (0.2394) loss 2.7706 (3.2084) grad_norm 2.0778 (2.6306) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1180/1251] eta 0:00:17 lr 0.000511 wd 0.0500 time 0.2372 (0.2408) data time 0.0010 (0.0015) model time 0.2362 (0.2394) loss 2.6049 (3.2068) grad_norm 2.6718 (2.6290) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1190/1251] eta 0:00:14 lr 0.000511 wd 0.0500 time 0.2412 (0.2408) data time 0.0009 (0.0015) model time 0.2403 (0.2393) loss 2.9752 (3.2060) grad_norm 1.9825 (2.6284) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1200/1251] eta 0:00:12 lr 0.000511 wd 0.0500 time 0.2402 (0.2408) data time 0.0010 (0.0015) model time 0.2392 (0.2393) loss 2.4032 (3.2044) grad_norm 2.5294 (2.6288) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1210/1251] eta 0:00:09 lr 0.000511 wd 0.0500 time 0.2313 (0.2408) data time 0.0008 (0.0015) model time 0.2305 (0.2393) loss 2.6386 (3.2020) grad_norm 3.0801 (2.6312) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1220/1251] eta 0:00:07 lr 0.000511 wd 0.0500 time 0.2342 (0.2407) data time 0.0008 (0.0015) model time 0.2335 (0.2393) loss 3.5551 (3.2021) grad_norm 2.1974 (2.6353) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [158/300][1230/1251] eta 0:00:05 lr 0.000511 wd 0.0500 time 0.2380 (0.2407) data time 0.0009 (0.0015) model time 0.2372 (0.2393) loss 3.7672 (3.2020) grad_norm 2.3732 (2.6340) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1240/1251] eta 0:00:02 lr 0.000511 wd 0.0500 time 0.2224 (0.2407) data time 0.0007 (0.0015) model time 0.2217 (0.2392) loss 3.5476 (3.2014) grad_norm 2.0184 (2.6301) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [158/300][1250/1251] eta 0:00:00 lr 0.000511 wd 0.0500 time 0.2234 (0.2405) data time 0.0005 (0.0015) model time 0.2229 (0.2391) loss 3.8411 (3.2022) grad_norm 2.0658 (2.6268) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 158 training takes 0:05:00 [2024-08-27 10:03:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:03:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 10:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.441 (0.441) Loss 0.4434 (0.4434) Acc@1 91.797 (91.797) Acc@5 98.242 (98.242) Mem 7382MB
[2024-08-27 10:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.110) Loss 0.6768 (0.7067) Acc@1 86.621 (84.677) Acc@5 97.070 (96.973) Mem 7382MB
[2024-08-27 10:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.095) Loss 1.0234 (0.7375) Acc@1 75.391 (83.608) Acc@5 93.945 (96.922) Mem 7382MB
[2024-08-27 10:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 1.2285 (0.8393) Acc@1 70.703 (81.250) Acc@5 91.602 (95.766) Mem 7382MB
[2024-08-27 10:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.1406 (0.8910) Acc@1 72.461 (79.899) Acc@5 92.188 (95.184) Mem 7382MB
[2024-08-27 10:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.474 Acc@5 95.114
[2024-08-27 10:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.5%
[2024-08-27 10:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.47%
[2024-08-27 10:03:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-27 10:03:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-27 10:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.421 (0.421) Loss 0.4028 (0.4028) Acc@1 92.871 (92.871) Acc@5 98.535 (98.535) Mem 7382MB
[2024-08-27 10:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.108) Loss 0.6338 (0.6322) Acc@1 87.207 (86.373) Acc@5 97.168 (97.354) Mem 7382MB
[2024-08-27 10:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.095) Loss 0.9019 (0.6571) Acc@1 78.418 (85.491) Acc@5 95.410 (97.373) Mem 7382MB
[2024-08-27 10:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.1367 (0.7462) Acc@1 72.461 (83.345) Acc@5 92.969 (96.421) Mem 7382MB
[2024-08-27 10:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0312 (0.7934) Acc@1 74.707 (81.988) Acc@5 93.555 (95.939) Mem 7382MB
[2024-08-27 10:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.550 Acc@5 95.910
[2024-08-27 10:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.6%
[2024-08-27 10:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.55%
[2024-08-27 10:03:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 10:03:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 10:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][0/1251] eta 0:13:41 lr 0.000511 wd 0.0500 time 0.6564 (0.6564) data time 0.4148 (0.4148) model time 0.0000 (0.0000) loss 2.2775 (2.2775) grad_norm 2.3800 (2.3800) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][10/1251] eta 0:05:41 lr 0.000511 wd 0.0500 time 0.2352 (0.2754) data time 0.0010 (0.0387) model time 0.0000 (0.0000) loss 3.2109 (2.9359) grad_norm 3.5745 (3.7346) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][20/1251] eta 0:05:18 lr 0.000510 wd 0.0500 time 0.2507 (0.2586) data time 0.0009 (0.0207) model time 0.0000 (0.0000) loss 3.6283 (3.1038) grad_norm 2.2529 (3.4822) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][30/1251] eta 0:05:07 lr 0.000510 wd 0.0500 time 0.2397 (0.2517) data time 0.0008 (0.0144) model time 0.0000 (0.0000) loss 3.6981 (3.1142) grad_norm 3.0066 (3.1140) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][40/1251] eta 0:05:01 lr 0.000510 wd 0.0500 time 0.2433 (0.2487) data time 0.0012 (0.0111) model time 0.0000 (0.0000) loss 3.0576 (3.1483) grad_norm 1.6559 (2.9132) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][50/1251] eta 0:04:56 lr 0.000510 wd 0.0500 time 0.2331 (0.2466) data time 0.0011 (0.0092) model time 0.0000 (0.0000) loss 3.4131 (3.1796) grad_norm 1.9448 (2.8217) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][60/1251] eta 0:04:52 lr 0.000510 wd 0.0500 time 0.2478 (0.2453) data time 0.0008 (0.0078) model time 0.2470 
(0.2378) loss 3.2007 (3.1847) grad_norm 2.3352 (2.7338) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][70/1251] eta 0:04:48 lr 0.000510 wd 0.0500 time 0.2325 (0.2441) data time 0.0011 (0.0069) model time 0.2315 (0.2367) loss 2.7894 (3.1469) grad_norm 2.8213 (2.8152) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][80/1251] eta 0:04:45 lr 0.000510 wd 0.0500 time 0.2439 (0.2438) data time 0.0010 (0.0062) model time 0.2429 (0.2380) loss 3.2855 (3.1348) grad_norm 2.2228 (2.7834) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][90/1251] eta 0:04:42 lr 0.000510 wd 0.0500 time 0.2315 (0.2435) data time 0.0011 (0.0056) model time 0.2304 (0.2386) loss 1.8788 (3.1222) grad_norm 2.2559 (2.7437) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][100/1251] eta 0:04:39 lr 0.000510 wd 0.0500 time 0.2481 (0.2429) data time 0.0007 (0.0052) model time 0.2474 (0.2382) loss 2.0626 (3.1163) grad_norm 2.1224 (2.6859) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][110/1251] eta 0:04:36 lr 0.000510 wd 0.0500 time 0.2485 (0.2427) data time 0.0009 (0.0049) model time 0.2476 (0.2382) loss 3.6690 (3.1339) grad_norm 2.1270 (2.7357) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][120/1251] eta 0:04:34 lr 0.000510 wd 0.0500 time 0.2400 (0.2424) data time 0.0011 (0.0046) model time 0.2389 (0.2381) loss 3.5338 (3.1508) grad_norm 2.5150 (2.6977) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][130/1251] 
eta 0:04:31 lr 0.000510 wd 0.0500 time 0.2433 (0.2421) data time 0.0009 (0.0043) model time 0.2423 (0.2380) loss 3.0749 (3.1567) grad_norm 2.3160 (2.7111) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][140/1251] eta 0:04:28 lr 0.000510 wd 0.0500 time 0.2415 (0.2418) data time 0.0010 (0.0041) model time 0.2406 (0.2380) loss 3.0425 (3.1553) grad_norm 2.6387 (2.7395) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][150/1251] eta 0:04:25 lr 0.000510 wd 0.0500 time 0.2330 (0.2415) data time 0.0012 (0.0039) model time 0.2318 (0.2378) loss 3.4815 (3.1548) grad_norm 2.8286 (2.7252) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][160/1251] eta 0:04:23 lr 0.000510 wd 0.0500 time 0.2389 (0.2413) data time 0.0010 (0.0037) model time 0.2379 (0.2377) loss 3.0515 (3.1695) grad_norm 2.7030 (2.7106) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][170/1251] eta 0:04:20 lr 0.000510 wd 0.0500 time 0.2377 (0.2412) data time 0.0009 (0.0035) model time 0.2368 (0.2378) loss 2.6751 (3.1727) grad_norm 2.3356 (2.6894) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][180/1251] eta 0:04:18 lr 0.000510 wd 0.0500 time 0.2628 (0.2411) data time 0.0010 (0.0034) model time 0.2618 (0.2378) loss 2.2275 (3.1593) grad_norm 2.5273 (2.6760) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][190/1251] eta 0:04:15 lr 0.000510 wd 0.0500 time 0.2381 (0.2409) data time 0.0011 (0.0033) model time 0.2370 (0.2377) loss 3.4641 (3.1590) grad_norm 2.3658 (2.6765) loss_scale 1024.0000 (1024.0000) mem 7382MB 
[2024-08-27 10:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][200/1251] eta 0:04:13 lr 0.000510 wd 0.0500 time 0.2452 (0.2408) data time 0.0007 (0.0032) model time 0.2444 (0.2378) loss 3.8833 (3.1575) grad_norm 2.8015 (2.6844) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][210/1251] eta 0:04:10 lr 0.000510 wd 0.0500 time 0.2423 (0.2409) data time 0.0010 (0.0031) model time 0.2413 (0.2380) loss 2.7889 (3.1676) grad_norm 2.1649 (2.6866) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][220/1251] eta 0:04:08 lr 0.000510 wd 0.0500 time 0.2476 (0.2409) data time 0.0008 (0.0030) model time 0.2469 (0.2381) loss 3.0771 (3.1679) grad_norm 1.7318 (2.7262) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][230/1251] eta 0:04:05 lr 0.000510 wd 0.0500 time 0.2418 (0.2409) data time 0.0010 (0.0029) model time 0.2407 (0.2382) loss 2.9934 (3.1623) grad_norm 2.3580 (2.7193) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][240/1251] eta 0:04:03 lr 0.000509 wd 0.0500 time 0.2405 (0.2408) data time 0.0009 (0.0028) model time 0.2396 (0.2382) loss 3.3896 (3.1620) grad_norm 2.2927 (2.7021) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][250/1251] eta 0:04:01 lr 0.000509 wd 0.0500 time 0.2402 (0.2417) data time 0.0008 (0.0027) model time 0.2394 (0.2393) loss 3.9451 (3.1685) grad_norm 2.4613 (2.7015) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][260/1251] eta 0:03:59 lr 0.000509 wd 0.0500 time 0.2317 (0.2416) data time 0.0010 (0.0027) model time 0.2307 
(0.2393) loss 3.2843 (3.1686) grad_norm 4.2394 (2.7004) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][270/1251] eta 0:03:57 lr 0.000509 wd 0.0500 time 0.2405 (0.2416) data time 0.0009 (0.0026) model time 0.2396 (0.2394) loss 2.7838 (3.1597) grad_norm 3.7221 (2.6932) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][280/1251] eta 0:03:55 lr 0.000509 wd 0.0500 time 0.2407 (0.2422) data time 0.0009 (0.0026) model time 0.2398 (0.2401) loss 3.7921 (3.1580) grad_norm 1.6782 (2.6717) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][290/1251] eta 0:03:52 lr 0.000509 wd 0.0500 time 0.2406 (0.2421) data time 0.0009 (0.0025) model time 0.2397 (0.2401) loss 2.0380 (3.1496) grad_norm 2.9933 (2.6692) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][300/1251] eta 0:03:50 lr 0.000509 wd 0.0500 time 0.2276 (0.2420) data time 0.0011 (0.0025) model time 0.2265 (0.2399) loss 3.5245 (3.1431) grad_norm 3.7358 (2.6723) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][310/1251] eta 0:03:47 lr 0.000509 wd 0.0500 time 0.2420 (0.2419) data time 0.0008 (0.0024) model time 0.2412 (0.2399) loss 3.8470 (3.1460) grad_norm 2.2142 (2.6645) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][320/1251] eta 0:03:45 lr 0.000509 wd 0.0500 time 0.2387 (0.2418) data time 0.0010 (0.0024) model time 0.2377 (0.2398) loss 2.5341 (3.1377) grad_norm 1.7479 (2.6525) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[159/300][330/1251] eta 0:03:42 lr 0.000509 wd 0.0500 time 0.2367 (0.2417) data time 0.0008 (0.0024) model time 0.2359 (0.2397) loss 3.5354 (3.1448) grad_norm 1.8759 (2.6506) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][340/1251] eta 0:03:40 lr 0.000509 wd 0.0500 time 0.2368 (0.2416) data time 0.0011 (0.0023) model time 0.2357 (0.2396) loss 3.0588 (3.1435) grad_norm 2.2212 (2.6495) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][350/1251] eta 0:03:37 lr 0.000509 wd 0.0500 time 0.2305 (0.2415) data time 0.0009 (0.0023) model time 0.2296 (0.2395) loss 2.5149 (3.1508) grad_norm 2.7791 (2.6430) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][360/1251] eta 0:03:35 lr 0.000509 wd 0.0500 time 0.2379 (0.2414) data time 0.0011 (0.0022) model time 0.2368 (0.2394) loss 2.6599 (3.1526) grad_norm 3.2520 (2.6432) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][370/1251] eta 0:03:33 lr 0.000509 wd 0.0500 time 0.2341 (0.2419) data time 0.0009 (0.0022) model time 0.2332 (0.2400) loss 3.1646 (3.1569) grad_norm 3.3306 (2.6449) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][380/1251] eta 0:03:31 lr 0.000509 wd 0.0500 time 0.2356 (0.2429) data time 0.0010 (0.0022) model time 0.2347 (0.2412) loss 1.9801 (3.1567) grad_norm 2.6181 (2.6598) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][390/1251] eta 0:03:29 lr 0.000509 wd 0.0500 time 0.2334 (0.2428) data time 0.0007 (0.0021) model time 0.2327 (0.2412) loss 3.8908 (3.1502) grad_norm 2.9545 (2.6719) loss_scale 1024.0000 
(1024.0000) mem 7382MB [2024-08-27 10:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][400/1251] eta 0:03:26 lr 0.000509 wd 0.0500 time 0.2362 (0.2427) data time 0.0009 (0.0021) model time 0.2353 (0.2411) loss 3.8339 (3.1488) grad_norm 2.2799 (2.6733) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][410/1251] eta 0:03:24 lr 0.000509 wd 0.0500 time 0.2438 (0.2426) data time 0.0010 (0.0021) model time 0.2429 (0.2410) loss 3.0101 (3.1509) grad_norm 1.9521 (2.6729) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][420/1251] eta 0:03:21 lr 0.000509 wd 0.0500 time 0.2417 (0.2426) data time 0.0008 (0.0021) model time 0.2409 (0.2409) loss 2.8990 (3.1505) grad_norm 2.5734 (2.6725) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][430/1251] eta 0:03:19 lr 0.000509 wd 0.0500 time 0.2387 (0.2425) data time 0.0011 (0.0020) model time 0.2376 (0.2409) loss 2.6314 (3.1530) grad_norm 2.0624 (2.6730) loss_scale 2048.0000 (1047.7587) mem 7382MB [2024-08-27 10:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][440/1251] eta 0:03:16 lr 0.000509 wd 0.0500 time 0.2374 (0.2424) data time 0.0010 (0.0020) model time 0.2365 (0.2407) loss 2.7465 (3.1503) grad_norm 2.0845 (2.6667) loss_scale 2048.0000 (1070.4399) mem 7382MB [2024-08-27 10:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][450/1251] eta 0:03:14 lr 0.000509 wd 0.0500 time 0.2450 (0.2423) data time 0.0009 (0.0020) model time 0.2441 (0.2407) loss 3.6566 (3.1543) grad_norm 1.9022 (2.6603) loss_scale 2048.0000 (1092.1153) mem 7382MB [2024-08-27 10:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][460/1251] eta 0:03:11 lr 0.000509 wd 0.0500 time 0.2426 (0.2422) data time 0.0011 
(0.0020) model time 0.2414 (0.2406) loss 3.7421 (3.1577) grad_norm 2.4311 (2.6637) loss_scale 2048.0000 (1112.8503) mem 7382MB [2024-08-27 10:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][470/1251] eta 0:03:09 lr 0.000508 wd 0.0500 time 0.2305 (0.2421) data time 0.0008 (0.0020) model time 0.2296 (0.2405) loss 3.8548 (3.1607) grad_norm 2.6459 (2.6641) loss_scale 2048.0000 (1132.7049) mem 7382MB [2024-08-27 10:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][480/1251] eta 0:03:06 lr 0.000508 wd 0.0500 time 0.2398 (0.2421) data time 0.0007 (0.0019) model time 0.2391 (0.2404) loss 2.4569 (3.1605) grad_norm 2.1993 (2.6556) loss_scale 2048.0000 (1151.7339) mem 7382MB [2024-08-27 10:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][490/1251] eta 0:03:04 lr 0.000508 wd 0.0500 time 0.2341 (0.2420) data time 0.0007 (0.0019) model time 0.2334 (0.2404) loss 3.0688 (3.1648) grad_norm 2.3271 (2.6561) loss_scale 2048.0000 (1169.9878) mem 7382MB [2024-08-27 10:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][500/1251] eta 0:03:01 lr 0.000508 wd 0.0500 time 0.2385 (0.2419) data time 0.0010 (0.0019) model time 0.2374 (0.2403) loss 2.4631 (3.1663) grad_norm 2.2962 (2.6509) loss_scale 2048.0000 (1187.5130) mem 7382MB [2024-08-27 10:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][510/1251] eta 0:02:59 lr 0.000508 wd 0.0500 time 0.2318 (0.2418) data time 0.0009 (0.0019) model time 0.2310 (0.2402) loss 2.6540 (3.1731) grad_norm 3.1229 (2.6532) loss_scale 2048.0000 (1204.3523) mem 7382MB [2024-08-27 10:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][520/1251] eta 0:02:56 lr 0.000508 wd 0.0500 time 0.2375 (0.2418) data time 0.0010 (0.0019) model time 0.2366 (0.2402) loss 3.0419 (3.1763) grad_norm 2.7137 (2.6585) loss_scale 2048.0000 (1220.5451) mem 7382MB [2024-08-27 10:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [159/300][530/1251] eta 0:02:54 lr 0.000508 wd 0.0500 time 0.2403 (0.2417) data time 0.0010 (0.0019) model time 0.2393 (0.2402) loss 2.8631 (3.1753) grad_norm 2.2104 (2.6661) loss_scale 2048.0000 (1236.1281) mem 7382MB [2024-08-27 10:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][540/1251] eta 0:02:51 lr 0.000508 wd 0.0500 time 0.2299 (0.2417) data time 0.0011 (0.0018) model time 0.2288 (0.2401) loss 3.2414 (3.1769) grad_norm 2.3900 (2.6747) loss_scale 2048.0000 (1251.1349) mem 7382MB [2024-08-27 10:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][550/1251] eta 0:02:49 lr 0.000508 wd 0.0500 time 0.2401 (0.2417) data time 0.0007 (0.0018) model time 0.2394 (0.2401) loss 2.2254 (3.1724) grad_norm 3.1888 (2.6785) loss_scale 2048.0000 (1265.5971) mem 7382MB [2024-08-27 10:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][560/1251] eta 0:02:46 lr 0.000508 wd 0.0500 time 0.2315 (0.2417) data time 0.0010 (0.0018) model time 0.2305 (0.2401) loss 3.3290 (3.1740) grad_norm 2.3099 (2.6739) loss_scale 2048.0000 (1279.5437) mem 7382MB [2024-08-27 10:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][570/1251] eta 0:02:44 lr 0.000508 wd 0.0500 time 0.2412 (0.2416) data time 0.0007 (0.0018) model time 0.2405 (0.2400) loss 3.7039 (3.1775) grad_norm 2.8957 (2.6785) loss_scale 2048.0000 (1293.0018) mem 7382MB [2024-08-27 10:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][580/1251] eta 0:02:42 lr 0.000508 wd 0.0500 time 0.2441 (0.2415) data time 0.0009 (0.0018) model time 0.2432 (0.2400) loss 4.1773 (3.1768) grad_norm 3.0765 (2.6720) loss_scale 2048.0000 (1305.9966) mem 7382MB [2024-08-27 10:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][590/1251] eta 0:02:39 lr 0.000508 wd 0.0500 time 0.2409 (0.2415) data time 0.0011 (0.0018) model time 0.2397 (0.2399) loss 2.7113 (3.1770) grad_norm 2.5804 (2.6669) loss_scale 
2048.0000 (1318.5516) mem 7382MB [2024-08-27 10:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][600/1251] eta 0:02:37 lr 0.000508 wd 0.0500 time 0.2320 (0.2415) data time 0.0011 (0.0018) model time 0.2309 (0.2399) loss 3.1884 (3.1736) grad_norm 3.1829 (2.6723) loss_scale 2048.0000 (1330.6889) mem 7382MB [2024-08-27 10:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][610/1251] eta 0:02:34 lr 0.000508 wd 0.0500 time 0.2369 (0.2414) data time 0.0007 (0.0018) model time 0.2362 (0.2399) loss 3.9919 (3.1763) grad_norm 2.9889 (2.6742) loss_scale 2048.0000 (1342.4288) mem 7382MB [2024-08-27 10:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][620/1251] eta 0:02:32 lr 0.000508 wd 0.0500 time 0.2512 (0.2414) data time 0.0008 (0.0018) model time 0.2504 (0.2398) loss 4.0935 (3.1811) grad_norm 3.4584 (2.6748) loss_scale 2048.0000 (1353.7907) mem 7382MB [2024-08-27 10:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][630/1251] eta 0:02:29 lr 0.000508 wd 0.0500 time 0.2424 (0.2413) data time 0.0010 (0.0017) model time 0.2413 (0.2398) loss 3.6702 (3.1800) grad_norm 3.2338 (2.6749) loss_scale 2048.0000 (1364.7924) mem 7382MB [2024-08-27 10:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][640/1251] eta 0:02:27 lr 0.000508 wd 0.0500 time 0.2465 (0.2413) data time 0.0010 (0.0017) model time 0.2455 (0.2398) loss 3.1981 (3.1779) grad_norm 2.0837 (2.6832) loss_scale 2048.0000 (1375.4509) mem 7382MB [2024-08-27 10:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][650/1251] eta 0:02:24 lr 0.000508 wd 0.0500 time 0.2378 (0.2413) data time 0.0010 (0.0017) model time 0.2368 (0.2397) loss 2.4297 (3.1755) grad_norm 3.5285 (2.6823) loss_scale 2048.0000 (1385.7819) mem 7382MB [2024-08-27 10:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][660/1251] eta 0:02:22 lr 0.000508 wd 0.0500 time 0.2458 (0.2412) data time 
0.0010 (0.0017) model time 0.2448 (0.2397) loss 3.0143 (3.1776) grad_norm 2.7071 (2.6784) loss_scale 2048.0000 (1395.8003) mem 7382MB [2024-08-27 10:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][670/1251] eta 0:02:20 lr 0.000508 wd 0.0500 time 0.2333 (0.2412) data time 0.0009 (0.0017) model time 0.2324 (0.2397) loss 3.4106 (3.1748) grad_norm 3.1137 (2.6811) loss_scale 2048.0000 (1405.5201) mem 7382MB [2024-08-27 10:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][680/1251] eta 0:02:17 lr 0.000508 wd 0.0500 time 0.2413 (0.2412) data time 0.0010 (0.0017) model time 0.2403 (0.2396) loss 3.5649 (3.1741) grad_norm 2.5202 (2.6803) loss_scale 2048.0000 (1414.9545) mem 7382MB [2024-08-27 10:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][690/1251] eta 0:02:15 lr 0.000507 wd 0.0500 time 0.2373 (0.2411) data time 0.0009 (0.0017) model time 0.2364 (0.2396) loss 3.7211 (3.1711) grad_norm 2.2405 (2.6807) loss_scale 2048.0000 (1424.1158) mem 7382MB [2024-08-27 10:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][700/1251] eta 0:02:12 lr 0.000507 wd 0.0500 time 0.2335 (0.2411) data time 0.0010 (0.0017) model time 0.2325 (0.2396) loss 2.9423 (3.1704) grad_norm 2.1289 (2.6775) loss_scale 2048.0000 (1433.0157) mem 7382MB [2024-08-27 10:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][710/1251] eta 0:02:10 lr 0.000507 wd 0.0500 time 0.2439 (0.2411) data time 0.0008 (0.0017) model time 0.2431 (0.2396) loss 2.8054 (3.1710) grad_norm 1.9826 (2.6786) loss_scale 2048.0000 (1441.6653) mem 7382MB [2024-08-27 10:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][720/1251] eta 0:02:07 lr 0.000507 wd 0.0500 time 0.2332 (0.2410) data time 0.0010 (0.0017) model time 0.2322 (0.2395) loss 1.9531 (3.1685) grad_norm 2.9606 (2.6788) loss_scale 2048.0000 (1450.0749) mem 7382MB [2024-08-27 10:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [159/300][730/1251] eta 0:02:05 lr 0.000507 wd 0.0500 time 0.2363 (0.2410) data time 0.0011 (0.0016) model time 0.2352 (0.2395) loss 3.3691 (3.1710) grad_norm 8.3196 (2.6887) loss_scale 2048.0000 (1458.2544) mem 7382MB [2024-08-27 10:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][740/1251] eta 0:02:03 lr 0.000507 wd 0.0500 time 0.2436 (0.2410) data time 0.0008 (0.0016) model time 0.2428 (0.2395) loss 2.3026 (3.1680) grad_norm 2.7371 (2.6911) loss_scale 2048.0000 (1466.2132) mem 7382MB [2024-08-27 10:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][750/1251] eta 0:02:00 lr 0.000507 wd 0.0500 time 0.2338 (0.2410) data time 0.0008 (0.0016) model time 0.2330 (0.2395) loss 3.3160 (3.1683) grad_norm 2.1559 (2.6952) loss_scale 2048.0000 (1473.9601) mem 7382MB [2024-08-27 10:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][760/1251] eta 0:01:58 lr 0.000507 wd 0.0500 time 0.2338 (0.2410) data time 0.0008 (0.0016) model time 0.2330 (0.2395) loss 3.9532 (3.1724) grad_norm 2.4930 (2.6940) loss_scale 2048.0000 (1481.5033) mem 7382MB [2024-08-27 10:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][770/1251] eta 0:01:55 lr 0.000507 wd 0.0500 time 0.2298 (0.2409) data time 0.0008 (0.0016) model time 0.2291 (0.2395) loss 2.0213 (3.1717) grad_norm 2.2953 (2.6882) loss_scale 2048.0000 (1488.8508) mem 7382MB [2024-08-27 10:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][780/1251] eta 0:01:53 lr 0.000507 wd 0.0500 time 0.2399 (0.2409) data time 0.0010 (0.0016) model time 0.2389 (0.2394) loss 3.1135 (3.1718) grad_norm 2.1060 (2.6841) loss_scale 2048.0000 (1496.0102) mem 7382MB [2024-08-27 10:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][790/1251] eta 0:01:51 lr 0.000507 wd 0.0500 time 0.2329 (0.2412) data time 0.0009 (0.0016) model time 0.2320 (0.2397) loss 2.4552 (3.1715) grad_norm 2.4042 (2.6939) 
loss_scale 2048.0000 (1502.9886) mem 7382MB [2024-08-27 10:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][800/1251] eta 0:01:48 lr 0.000507 wd 0.0500 time 0.2373 (0.2414) data time 0.0008 (0.0016) model time 0.2365 (0.2400) loss 3.4353 (3.1721) grad_norm 2.6738 (2.6981) loss_scale 2048.0000 (1509.7928) mem 7382MB [2024-08-27 10:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][810/1251] eta 0:01:46 lr 0.000507 wd 0.0500 time 0.2369 (0.2413) data time 0.0011 (0.0016) model time 0.2358 (0.2399) loss 4.1164 (3.1725) grad_norm 1.8027 (2.6896) loss_scale 2048.0000 (1516.4291) mem 7382MB [2024-08-27 10:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][820/1251] eta 0:01:43 lr 0.000507 wd 0.0500 time 0.2271 (0.2413) data time 0.0010 (0.0016) model time 0.2260 (0.2399) loss 2.5774 (3.1708) grad_norm 3.5043 (2.6862) loss_scale 2048.0000 (1522.9038) mem 7382MB [2024-08-27 10:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][830/1251] eta 0:01:41 lr 0.000507 wd 0.0500 time 0.2340 (0.2412) data time 0.0010 (0.0016) model time 0.2330 (0.2398) loss 3.1713 (3.1749) grad_norm 2.4371 (2.6814) loss_scale 2048.0000 (1529.2226) mem 7382MB [2024-08-27 10:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][840/1251] eta 0:01:39 lr 0.000507 wd 0.0500 time 0.2381 (0.2412) data time 0.0010 (0.0016) model time 0.2370 (0.2398) loss 3.3620 (3.1712) grad_norm 2.9084 (2.6812) loss_scale 2048.0000 (1535.3912) mem 7382MB [2024-08-27 10:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][850/1251] eta 0:01:36 lr 0.000507 wd 0.0500 time 0.2328 (0.2411) data time 0.0010 (0.0016) model time 0.2318 (0.2397) loss 3.5376 (3.1716) grad_norm 1.9877 (2.6783) loss_scale 2048.0000 (1541.4148) mem 7382MB [2024-08-27 10:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][860/1251] eta 0:01:34 lr 0.000507 wd 0.0500 time 0.2341 (0.2411) 
data time 0.0007 (0.0015) model time 0.2334 (0.2397) loss 3.7713 (3.1746) grad_norm 2.6850 (2.6838) loss_scale 2048.0000 (1547.2985) mem 7382MB [2024-08-27 10:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][870/1251] eta 0:01:31 lr 0.000507 wd 0.0500 time 0.2324 (0.2410) data time 0.0008 (0.0015) model time 0.2316 (0.2396) loss 3.7558 (3.1754) grad_norm 4.7141 (2.6905) loss_scale 2048.0000 (1553.0471) mem 7382MB [2024-08-27 10:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][880/1251] eta 0:01:29 lr 0.000507 wd 0.0500 time 0.2342 (0.2410) data time 0.0010 (0.0015) model time 0.2332 (0.2396) loss 3.5493 (3.1757) grad_norm 1.7222 (2.6874) loss_scale 2048.0000 (1558.6652) mem 7382MB [2024-08-27 10:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][890/1251] eta 0:01:26 lr 0.000507 wd 0.0500 time 0.2329 (0.2410) data time 0.0010 (0.0015) model time 0.2319 (0.2395) loss 2.7548 (3.1753) grad_norm 2.2029 (2.6844) loss_scale 2048.0000 (1564.1571) mem 7382MB [2024-08-27 10:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][900/1251] eta 0:01:24 lr 0.000507 wd 0.0500 time 0.2481 (0.2410) data time 0.0010 (0.0015) model time 0.2472 (0.2396) loss 3.4684 (3.1773) grad_norm 2.1449 (2.6844) loss_scale 2048.0000 (1569.5272) mem 7382MB [2024-08-27 10:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][910/1251] eta 0:01:22 lr 0.000507 wd 0.0500 time 0.2302 (0.2409) data time 0.0009 (0.0015) model time 0.2293 (0.2395) loss 2.6658 (3.1801) grad_norm 3.0387 (2.6898) loss_scale 2048.0000 (1574.7794) mem 7382MB [2024-08-27 10:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][920/1251] eta 0:01:19 lr 0.000506 wd 0.0500 time 0.2403 (0.2409) data time 0.0013 (0.0015) model time 0.2390 (0.2395) loss 3.3748 (3.1804) grad_norm 4.9891 (2.6984) loss_scale 2048.0000 (1579.9175) mem 7382MB [2024-08-27 10:07:39 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [159/300][930/1251] eta 0:01:17 lr 0.000506 wd 0.0500 time 0.2394 (0.2409) data time 0.0010 (0.0015) model time 0.2383 (0.2395) loss 2.7688 (3.1761) grad_norm 2.2667 (2.7013) loss_scale 2048.0000 (1584.9452) mem 7382MB [2024-08-27 10:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][940/1251] eta 0:01:14 lr 0.000506 wd 0.0500 time 0.2421 (0.2409) data time 0.0010 (0.0015) model time 0.2411 (0.2395) loss 3.1289 (3.1772) grad_norm 1.9340 (2.7025) loss_scale 2048.0000 (1589.8661) mem 7382MB [2024-08-27 10:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][950/1251] eta 0:01:12 lr 0.000506 wd 0.0500 time 0.2360 (0.2409) data time 0.0009 (0.0015) model time 0.2351 (0.2395) loss 2.2320 (3.1793) grad_norm 2.2452 (2.6996) loss_scale 2048.0000 (1594.6835) mem 7382MB [2024-08-27 10:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][960/1251] eta 0:01:10 lr 0.000506 wd 0.0500 time 0.2365 (0.2408) data time 0.0010 (0.0015) model time 0.2356 (0.2395) loss 2.8854 (3.1787) grad_norm 2.0381 (2.6968) loss_scale 2048.0000 (1599.4006) mem 7382MB [2024-08-27 10:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][970/1251] eta 0:01:07 lr 0.000506 wd 0.0500 time 0.2399 (0.2408) data time 0.0007 (0.0015) model time 0.2393 (0.2394) loss 3.3193 (3.1797) grad_norm 1.9268 (2.6927) loss_scale 2048.0000 (1604.0206) mem 7382MB [2024-08-27 10:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][980/1251] eta 0:01:05 lr 0.000506 wd 0.0500 time 0.2433 (0.2408) data time 0.0010 (0.0015) model time 0.2423 (0.2394) loss 3.0116 (3.1780) grad_norm 2.3914 (2.6920) loss_scale 2048.0000 (1608.5464) mem 7382MB [2024-08-27 10:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][990/1251] eta 0:01:02 lr 0.000506 wd 0.0500 time 0.2390 (0.2408) data time 0.0008 (0.0015) model time 0.2383 (0.2394) loss 3.3207 (3.1810) grad_norm 
2.0703 (2.6896) loss_scale 2048.0000 (1612.9808) mem 7382MB [2024-08-27 10:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1000/1251] eta 0:01:00 lr 0.000506 wd 0.0500 time 0.2361 (0.2408) data time 0.0010 (0.0015) model time 0.2351 (0.2394) loss 3.5117 (3.1806) grad_norm 2.8175 (2.6867) loss_scale 2048.0000 (1617.3267) mem 7382MB [2024-08-27 10:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1010/1251] eta 0:00:58 lr 0.000506 wd 0.0500 time 0.2370 (0.2407) data time 0.0010 (0.0015) model time 0.2360 (0.2393) loss 3.1756 (3.1801) grad_norm 2.0697 (2.6827) loss_scale 2048.0000 (1621.5865) mem 7382MB [2024-08-27 10:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1020/1251] eta 0:00:55 lr 0.000506 wd 0.0500 time 0.2408 (0.2407) data time 0.0007 (0.0015) model time 0.2400 (0.2393) loss 2.4620 (3.1772) grad_norm 2.6774 (2.6799) loss_scale 2048.0000 (1625.7630) mem 7382MB [2024-08-27 10:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1030/1251] eta 0:00:53 lr 0.000506 wd 0.0500 time 0.2395 (0.2407) data time 0.0008 (0.0015) model time 0.2387 (0.2393) loss 3.7129 (3.1785) grad_norm 2.8994 (2.6773) loss_scale 2048.0000 (1629.8584) mem 7382MB [2024-08-27 10:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1040/1251] eta 0:00:50 lr 0.000506 wd 0.0500 time 0.2311 (0.2407) data time 0.0011 (0.0015) model time 0.2300 (0.2393) loss 3.2925 (3.1801) grad_norm 4.8866 (2.6764) loss_scale 2048.0000 (1633.8751) mem 7382MB [2024-08-27 10:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1050/1251] eta 0:00:48 lr 0.000506 wd 0.0500 time 0.2367 (0.2406) data time 0.0010 (0.0015) model time 0.2357 (0.2392) loss 3.5294 (3.1799) grad_norm 2.2575 (2.6726) loss_scale 2048.0000 (1637.8154) mem 7382MB [2024-08-27 10:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1060/1251] eta 0:00:45 lr 0.000506 wd 
0.0500 time 0.2339 (0.2406) data time 0.0013 (0.0014) model time 0.2326 (0.2392) loss 3.2127 (3.1829) grad_norm 2.3864 (2.6698) loss_scale 2048.0000 (1641.6814) mem 7382MB [2024-08-27 10:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1070/1251] eta 0:00:43 lr 0.000506 wd 0.0500 time 0.2312 (0.2406) data time 0.0011 (0.0014) model time 0.2301 (0.2392) loss 2.8922 (3.1800) grad_norm 2.4537 (2.6770) loss_scale 2048.0000 (1645.4753) mem 7382MB [2024-08-27 10:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1080/1251] eta 0:00:41 lr 0.000506 wd 0.0500 time 0.2372 (0.2406) data time 0.0010 (0.0014) model time 0.2362 (0.2392) loss 2.2856 (3.1813) grad_norm 2.0462 (2.6732) loss_scale 2048.0000 (1649.1989) mem 7382MB [2024-08-27 10:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1090/1251] eta 0:00:38 lr 0.000506 wd 0.0500 time 0.2366 (0.2406) data time 0.0009 (0.0014) model time 0.2357 (0.2392) loss 3.6546 (3.1824) grad_norm 4.0458 (2.6724) loss_scale 2048.0000 (1652.8543) mem 7382MB [2024-08-27 10:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1100/1251] eta 0:00:36 lr 0.000506 wd 0.0500 time 0.2416 (0.2405) data time 0.0010 (0.0014) model time 0.2406 (0.2392) loss 3.6610 (3.1832) grad_norm 1.8423 (2.6677) loss_scale 2048.0000 (1656.4432) mem 7382MB [2024-08-27 10:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1110/1251] eta 0:00:33 lr 0.000506 wd 0.0500 time 0.2376 (0.2405) data time 0.0009 (0.0014) model time 0.2368 (0.2392) loss 4.0072 (3.1842) grad_norm 1.9566 (2.6654) loss_scale 2048.0000 (1659.9676) mem 7382MB [2024-08-27 10:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1120/1251] eta 0:00:31 lr 0.000506 wd 0.0500 time 0.2377 (0.2405) data time 0.0007 (0.0014) model time 0.2369 (0.2392) loss 2.1715 (3.1833) grad_norm 1.7233 (2.6621) loss_scale 2048.0000 (1663.4291) mem 7382MB [2024-08-27 10:08:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1130/1251] eta 0:00:29 lr 0.000506 wd 0.0500 time 0.2385 (0.2405) data time 0.0010 (0.0014) model time 0.2374 (0.2392) loss 3.5262 (3.1833) grad_norm 3.0198 (2.6651) loss_scale 2048.0000 (1666.8294) mem 7382MB [2024-08-27 10:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1140/1251] eta 0:00:26 lr 0.000505 wd 0.0500 time 0.2435 (0.2405) data time 0.0010 (0.0014) model time 0.2425 (0.2392) loss 3.7035 (3.1836) grad_norm 2.3964 (2.6687) loss_scale 2048.0000 (1670.1700) mem 7382MB [2024-08-27 10:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1150/1251] eta 0:00:24 lr 0.000505 wd 0.0500 time 0.2440 (0.2405) data time 0.0010 (0.0014) model time 0.2430 (0.2392) loss 3.4810 (3.1843) grad_norm 2.2059 (2.6750) loss_scale 2048.0000 (1673.4526) mem 7382MB [2024-08-27 10:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1160/1251] eta 0:00:21 lr 0.000505 wd 0.0500 time 0.2442 (0.2405) data time 0.0008 (0.0014) model time 0.2434 (0.2392) loss 2.8038 (3.1845) grad_norm 2.2954 (2.6733) loss_scale 2048.0000 (1676.6787) mem 7382MB [2024-08-27 10:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1170/1251] eta 0:00:19 lr 0.000505 wd 0.0500 time 0.2398 (0.2405) data time 0.0010 (0.0014) model time 0.2389 (0.2392) loss 3.5616 (3.1867) grad_norm 3.2988 (2.6734) loss_scale 2048.0000 (1679.8497) mem 7382MB [2024-08-27 10:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1180/1251] eta 0:00:17 lr 0.000505 wd 0.0500 time 0.2423 (0.2405) data time 0.0010 (0.0014) model time 0.2413 (0.2391) loss 2.7443 (3.1853) grad_norm 2.1231 (2.6729) loss_scale 2048.0000 (1682.9670) mem 7382MB [2024-08-27 10:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1190/1251] eta 0:00:14 lr 0.000505 wd 0.0500 time 0.2418 (0.2405) data time 0.0008 (0.0014) model time 0.2410 (0.2392) loss 
3.6174 (3.1853) grad_norm 4.7405 (2.6710) loss_scale 2048.0000 (1686.0319) mem 7382MB [2024-08-27 10:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1200/1251] eta 0:00:12 lr 0.000505 wd 0.0500 time 0.2424 (0.2405) data time 0.0007 (0.0014) model time 0.2416 (0.2391) loss 4.1289 (3.1883) grad_norm 3.0626 (2.6717) loss_scale 2048.0000 (1689.0458) mem 7382MB [2024-08-27 10:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1210/1251] eta 0:00:09 lr 0.000505 wd 0.0500 time 0.2327 (0.2404) data time 0.0011 (0.0014) model time 0.2316 (0.2391) loss 3.0812 (3.1882) grad_norm 3.0558 (2.6698) loss_scale 2048.0000 (1692.0099) mem 7382MB [2024-08-27 10:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1220/1251] eta 0:00:07 lr 0.000505 wd 0.0500 time 0.2403 (0.2404) data time 0.0011 (0.0014) model time 0.2391 (0.2391) loss 3.0825 (3.1896) grad_norm 2.2415 (2.6727) loss_scale 2048.0000 (1694.9255) mem 7382MB [2024-08-27 10:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1230/1251] eta 0:00:05 lr 0.000505 wd 0.0500 time 0.2429 (0.2404) data time 0.0010 (0.0014) model time 0.2419 (0.2391) loss 3.1536 (3.1883) grad_norm 2.4056 (2.6734) loss_scale 2048.0000 (1697.7937) mem 7382MB [2024-08-27 10:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1240/1251] eta 0:00:02 lr 0.000505 wd 0.0500 time 0.2252 (0.2404) data time 0.0005 (0.0014) model time 0.2247 (0.2390) loss 3.3699 (3.1884) grad_norm 1.8355 (2.6728) loss_scale 2048.0000 (1700.6156) mem 7382MB [2024-08-27 10:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [159/300][1250/1251] eta 0:00:00 lr 0.000505 wd 0.0500 time 0.2451 (0.2403) data time 0.0007 (0.0014) model time 0.2444 (0.2389) loss 2.6981 (3.1866) grad_norm 2.6361 (2.6744) loss_scale 2048.0000 (1703.3925) mem 7382MB [2024-08-27 10:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 159 training takes 0:05:00 
[2024-08-27 10:08:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:08:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.421 (0.421) Loss 0.4836 (0.4836) Acc@1 91.992 (91.992) Acc@5 98.438 (98.438) Mem 7382MB [2024-08-27 10:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.114) Loss 0.7588 (0.7324) Acc@1 84.668 (84.322) Acc@5 96.582 (96.937) Mem 7382MB [2024-08-27 10:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.097) Loss 1.0684 (0.7547) Acc@1 75.195 (83.417) Acc@5 93.555 (96.880) Mem 7382MB [2024-08-27 10:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.090) Loss 1.3066 (0.8553) Acc@1 68.262 (81.045) Acc@5 91.602 (95.665) Mem 7382MB [2024-08-27 10:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.1992 (0.9119) Acc@1 71.484 (79.540) Acc@5 92.676 (95.041) Mem 7382MB [2024-08-27 10:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.258 Acc@5 94.988 [2024-08-27 10:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.3% [2024-08-27 10:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.789 (0.789) Loss 0.4031 (0.4031) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7382MB [2024-08-27 10:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.146) Loss 0.6323 (0.6319) Acc@1 87.402 (86.408) Acc@5 97.168 (97.354) Mem 7382MB [2024-08-27 10:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.114) Loss 0.9028 (0.6569) Acc@1 78.418 (85.510) Acc@5 95.605 (97.377) Mem 7382MB 
[2024-08-27 10:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.102) Loss 1.1367 (0.7459) Acc@1 72.461 (83.361) Acc@5 92.871 (96.418) Mem 7382MB [2024-08-27 10:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 1.0283 (0.7928) Acc@1 74.316 (81.993) Acc@5 93.555 (95.939) Mem 7382MB [2024-08-27 10:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.546 Acc@5 95.916 [2024-08-27 10:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.5% [2024-08-27 10:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][0/1251] eta 0:22:48 lr 0.000505 wd 0.0500 time 1.0941 (1.0941) data time 0.8345 (0.8345) model time 0.0000 (0.0000) loss 3.0733 (3.0733) grad_norm 2.0626 (2.0626) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][10/1251] eta 0:06:29 lr 0.000505 wd 0.0500 time 0.2384 (0.3141) data time 0.0010 (0.0769) model time 0.0000 (0.0000) loss 2.3232 (3.0852) grad_norm 3.0040 (2.6939) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][20/1251] eta 0:05:40 lr 0.000505 wd 0.0500 time 0.2370 (0.2770) data time 0.0011 (0.0407) model time 0.0000 (0.0000) loss 3.3263 (3.1298) grad_norm 2.0229 (2.8004) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][30/1251] eta 0:05:47 lr 0.000505 wd 0.0500 time 0.4520 (0.2846) data time 0.0011 (0.0279) model time 0.0000 (0.0000) loss 2.4044 (3.0340) grad_norm 3.7406 (2.8952) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][40/1251] eta 0:05:33 lr 0.000505 wd 0.0500 time 0.2463 (0.2751) data time 0.0009 (0.0213) model time 0.0000 
(0.0000) loss 3.7674 (3.1389) grad_norm 2.7771 (2.8127) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][50/1251] eta 0:05:22 lr 0.000505 wd 0.0500 time 0.2490 (0.2689) data time 0.0010 (0.0174) model time 0.0000 (0.0000) loss 3.7708 (3.1725) grad_norm 3.3701 (2.7844) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][60/1251] eta 0:05:15 lr 0.000505 wd 0.0500 time 0.2542 (0.2646) data time 0.0010 (0.0147) model time 0.2532 (0.2413) loss 2.1670 (3.1936) grad_norm 1.9986 (2.7147) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][70/1251] eta 0:05:08 lr 0.000505 wd 0.0500 time 0.2464 (0.2611) data time 0.0009 (0.0128) model time 0.2455 (0.2401) loss 3.5797 (3.1703) grad_norm 3.9151 (2.7127) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][80/1251] eta 0:05:02 lr 0.000505 wd 0.0500 time 0.2385 (0.2585) data time 0.0009 (0.0113) model time 0.2376 (0.2397) loss 2.1832 (3.1417) grad_norm 2.9561 (2.6904) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][90/1251] eta 0:04:57 lr 0.000505 wd 0.0500 time 0.2351 (0.2563) data time 0.0008 (0.0102) model time 0.2343 (0.2392) loss 3.6464 (3.1325) grad_norm 2.1337 (2.6686) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][100/1251] eta 0:04:53 lr 0.000505 wd 0.0500 time 0.2348 (0.2546) data time 0.0009 (0.0093) model time 0.2338 (0.2389) loss 3.8183 (3.1090) grad_norm 2.3819 (2.6177) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][110/1251] eta 
0:04:48 lr 0.000505 wd 0.0500 time 0.2496 (0.2532) data time 0.0012 (0.0085) model time 0.2484 (0.2388) loss 3.2656 (3.1222) grad_norm 4.8765 (2.6457) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][120/1251] eta 0:04:45 lr 0.000504 wd 0.0500 time 0.2399 (0.2521) data time 0.0007 (0.0079) model time 0.2392 (0.2387) loss 3.7081 (3.1404) grad_norm 1.7045 (2.6256) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][130/1251] eta 0:04:41 lr 0.000504 wd 0.0500 time 0.2385 (0.2513) data time 0.0010 (0.0074) model time 0.2375 (0.2391) loss 2.7384 (3.1374) grad_norm 2.2641 (2.5990) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][140/1251] eta 0:04:38 lr 0.000504 wd 0.0500 time 0.2437 (0.2505) data time 0.0007 (0.0069) model time 0.2429 (0.2391) loss 2.4085 (3.1424) grad_norm 2.1818 (2.5875) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][150/1251] eta 0:04:34 lr 0.000504 wd 0.0500 time 0.2469 (0.2497) data time 0.0009 (0.0065) model time 0.2461 (0.2389) loss 3.2409 (3.1456) grad_norm 5.2728 (2.5954) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][160/1251] eta 0:04:31 lr 0.000504 wd 0.0500 time 0.2375 (0.2488) data time 0.0007 (0.0062) model time 0.2367 (0.2384) loss 3.8313 (3.1448) grad_norm 2.4487 (2.6293) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][170/1251] eta 0:04:28 lr 0.000504 wd 0.0500 time 0.2378 (0.2480) data time 0.0008 (0.0059) model time 0.2369 (0.2381) loss 3.1630 (3.1465) grad_norm 4.8482 (2.6827) loss_scale 2048.0000 (2048.0000) mem 7382MB 
[2024-08-27 10:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][180/1251] eta 0:04:25 lr 0.000504 wd 0.0500 time 0.2403 (0.2475) data time 0.0008 (0.0056) model time 0.2395 (0.2381) loss 3.9875 (3.1575) grad_norm 2.6660 (2.6782) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][190/1251] eta 0:04:22 lr 0.000504 wd 0.0500 time 0.2482 (0.2472) data time 0.0010 (0.0054) model time 0.2472 (0.2382) loss 3.3398 (3.1649) grad_norm 1.9243 (2.6728) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][200/1251] eta 0:04:19 lr 0.000504 wd 0.0500 time 0.2363 (0.2467) data time 0.0007 (0.0052) model time 0.2355 (0.2382) loss 3.5427 (3.1560) grad_norm 1.8557 (2.6956) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][210/1251] eta 0:04:16 lr 0.000504 wd 0.0500 time 0.2383 (0.2463) data time 0.0011 (0.0050) model time 0.2372 (0.2380) loss 3.6007 (3.1544) grad_norm 1.6995 (2.6806) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][220/1251] eta 0:04:13 lr 0.000504 wd 0.0500 time 0.2403 (0.2458) data time 0.0011 (0.0048) model time 0.2392 (0.2378) loss 3.2970 (3.1646) grad_norm 2.4670 (2.6999) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][230/1251] eta 0:04:10 lr 0.000504 wd 0.0500 time 0.2340 (0.2455) data time 0.0010 (0.0047) model time 0.2331 (0.2379) loss 2.5235 (3.1640) grad_norm 2.2347 (2.6869) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][240/1251] eta 0:04:07 lr 0.000504 wd 0.0500 time 0.2405 (0.2453) data time 0.0008 (0.0045) model time 0.2397 
(0.2379) loss 2.8914 (3.1651) grad_norm 2.6313 (2.7119) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][250/1251] eta 0:04:05 lr 0.000504 wd 0.0500 time 0.2370 (0.2451) data time 0.0011 (0.0044) model time 0.2359 (0.2379) loss 2.6532 (3.1619) grad_norm 3.1185 (2.7048) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][260/1251] eta 0:04:02 lr 0.000504 wd 0.0500 time 0.2336 (0.2449) data time 0.0010 (0.0042) model time 0.2326 (0.2380) loss 3.3049 (3.1573) grad_norm 1.8518 (2.6881) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][270/1251] eta 0:04:00 lr 0.000504 wd 0.0500 time 0.4517 (0.2455) data time 0.0011 (0.0041) model time 0.4506 (0.2390) loss 3.7487 (3.1695) grad_norm 2.4474 (2.6722) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][280/1251] eta 0:03:58 lr 0.000504 wd 0.0500 time 0.2432 (0.2453) data time 0.0010 (0.0040) model time 0.2422 (0.2390) loss 3.2461 (3.1689) grad_norm 2.4220 (2.6591) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][290/1251] eta 0:03:55 lr 0.000504 wd 0.0500 time 0.2472 (0.2451) data time 0.0010 (0.0039) model time 0.2461 (0.2390) loss 3.3942 (3.1739) grad_norm 1.8657 (2.6531) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][300/1251] eta 0:03:52 lr 0.000504 wd 0.0500 time 0.2319 (0.2448) data time 0.0009 (0.0038) model time 0.2311 (0.2388) loss 3.9754 (3.1775) grad_norm 2.1589 (2.6512) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[160/300][310/1251] eta 0:03:50 lr 0.000504 wd 0.0500 time 0.2396 (0.2446) data time 0.0007 (0.0037) model time 0.2388 (0.2388) loss 2.8733 (3.1826) grad_norm 2.1172 (2.6363) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][320/1251] eta 0:03:47 lr 0.000504 wd 0.0500 time 0.2445 (0.2444) data time 0.0008 (0.0036) model time 0.2438 (0.2387) loss 2.4421 (3.1776) grad_norm 2.0428 (2.6213) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][330/1251] eta 0:03:45 lr 0.000504 wd 0.0500 time 0.2506 (0.2443) data time 0.0007 (0.0036) model time 0.2499 (0.2388) loss 2.6713 (3.1801) grad_norm 2.4246 (2.6186) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][340/1251] eta 0:03:42 lr 0.000503 wd 0.0500 time 0.2324 (0.2442) data time 0.0010 (0.0035) model time 0.2314 (0.2388) loss 3.3879 (3.1802) grad_norm 2.8211 (2.6127) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][350/1251] eta 0:03:39 lr 0.000503 wd 0.0500 time 0.2415 (0.2440) data time 0.0008 (0.0034) model time 0.2406 (0.2387) loss 3.0894 (3.1843) grad_norm 2.6280 (2.5979) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][360/1251] eta 0:03:37 lr 0.000503 wd 0.0500 time 0.2452 (0.2439) data time 0.0011 (0.0033) model time 0.2441 (0.2387) loss 3.2800 (3.1854) grad_norm 2.3464 (2.5877) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][370/1251] eta 0:03:34 lr 0.000503 wd 0.0500 time 0.2416 (0.2437) data time 0.0010 (0.0033) model time 0.2406 (0.2386) loss 2.9001 (3.1818) grad_norm 2.1718 (2.5851) loss_scale 2048.0000 
(2048.0000) mem 7382MB [2024-08-27 10:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][380/1251] eta 0:03:32 lr 0.000503 wd 0.0500 time 0.2459 (0.2436) data time 0.0009 (0.0032) model time 0.2450 (0.2386) loss 3.2956 (3.1927) grad_norm 2.4479 (2.5792) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][390/1251] eta 0:03:29 lr 0.000503 wd 0.0500 time 0.2412 (0.2435) data time 0.0012 (0.0032) model time 0.2399 (0.2386) loss 3.8922 (3.2007) grad_norm 2.0807 (2.5694) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][400/1251] eta 0:03:27 lr 0.000503 wd 0.0500 time 0.2395 (0.2434) data time 0.0010 (0.0031) model time 0.2385 (0.2386) loss 3.3971 (3.2043) grad_norm 3.3564 (2.5700) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][410/1251] eta 0:03:24 lr 0.000503 wd 0.0500 time 0.2308 (0.2432) data time 0.0011 (0.0031) model time 0.2298 (0.2385) loss 3.0763 (3.2061) grad_norm 2.3012 (2.5745) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][420/1251] eta 0:03:21 lr 0.000503 wd 0.0500 time 0.2346 (0.2431) data time 0.0009 (0.0030) model time 0.2337 (0.2384) loss 2.9173 (3.2074) grad_norm 3.8850 (2.5861) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][430/1251] eta 0:03:19 lr 0.000503 wd 0.0500 time 0.2468 (0.2430) data time 0.0009 (0.0030) model time 0.2459 (0.2385) loss 2.4103 (3.1990) grad_norm 2.1540 (2.5947) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][440/1251] eta 0:03:16 lr 0.000503 wd 0.0500 time 0.2305 (0.2429) data time 0.0009 
(0.0029) model time 0.2296 (0.2384) loss 2.9521 (3.1944) grad_norm 3.5725 (2.6031) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][450/1251] eta 0:03:14 lr 0.000503 wd 0.0500 time 0.2327 (0.2428) data time 0.0008 (0.0029) model time 0.2319 (0.2384) loss 3.6652 (3.1899) grad_norm 2.0187 (2.5983) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][460/1251] eta 0:03:11 lr 0.000503 wd 0.0500 time 0.2337 (0.2426) data time 0.0011 (0.0028) model time 0.2326 (0.2383) loss 3.2325 (3.1897) grad_norm 2.5172 (2.5909) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][470/1251] eta 0:03:09 lr 0.000503 wd 0.0500 time 0.2463 (0.2426) data time 0.0011 (0.0028) model time 0.2452 (0.2383) loss 2.9070 (3.1894) grad_norm 6.5902 (2.5997) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][480/1251] eta 0:03:06 lr 0.000503 wd 0.0500 time 0.2370 (0.2424) data time 0.0010 (0.0028) model time 0.2359 (0.2382) loss 3.5612 (3.1873) grad_norm 2.0484 (2.6057) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][490/1251] eta 0:03:04 lr 0.000503 wd 0.0500 time 0.2342 (0.2423) data time 0.0011 (0.0027) model time 0.2331 (0.2382) loss 3.4191 (3.1868) grad_norm 2.6553 (2.6022) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][500/1251] eta 0:03:01 lr 0.000503 wd 0.0500 time 0.2366 (0.2423) data time 0.0008 (0.0027) model time 0.2358 (0.2382) loss 3.9112 (3.1909) grad_norm 2.5701 (2.6011) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [160/300][510/1251] eta 0:02:59 lr 0.000503 wd 0.0500 time 0.2376 (0.2422) data time 0.0007 (0.0027) model time 0.2368 (0.2382) loss 2.2692 (3.1932) grad_norm 1.8448 (2.6065) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][520/1251] eta 0:02:57 lr 0.000503 wd 0.0500 time 0.2276 (0.2421) data time 0.0012 (0.0026) model time 0.2264 (0.2382) loss 2.8969 (3.1962) grad_norm 2.7083 (2.6120) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][530/1251] eta 0:02:54 lr 0.000503 wd 0.0500 time 0.2334 (0.2421) data time 0.0008 (0.0026) model time 0.2326 (0.2382) loss 3.7262 (3.1970) grad_norm 3.2565 (2.6090) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][540/1251] eta 0:02:52 lr 0.000503 wd 0.0500 time 0.2373 (0.2420) data time 0.0010 (0.0026) model time 0.2364 (0.2382) loss 3.6407 (3.1932) grad_norm 2.2887 (2.6078) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][550/1251] eta 0:02:49 lr 0.000503 wd 0.0500 time 0.2442 (0.2424) data time 0.0010 (0.0026) model time 0.2433 (0.2387) loss 3.7645 (3.1930) grad_norm 2.9495 (2.6013) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][560/1251] eta 0:02:47 lr 0.000503 wd 0.0500 time 0.2403 (0.2424) data time 0.0010 (0.0025) model time 0.2393 (0.2387) loss 3.0000 (3.1896) grad_norm 2.6142 (2.6042) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][570/1251] eta 0:02:45 lr 0.000502 wd 0.0500 time 0.2424 (0.2423) data time 0.0011 (0.0025) model time 0.2413 (0.2387) loss 3.5164 (3.1922) grad_norm 2.0472 (2.6012) loss_scale 
2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][580/1251] eta 0:02:42 lr 0.000502 wd 0.0500 time 0.2385 (0.2423) data time 0.0008 (0.0025) model time 0.2377 (0.2387) loss 3.9205 (3.1916) grad_norm 3.4443 (2.5975) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][590/1251] eta 0:02:40 lr 0.000502 wd 0.0500 time 0.2387 (0.2422) data time 0.0008 (0.0024) model time 0.2380 (0.2387) loss 3.8878 (3.1931) grad_norm 3.6078 (2.6029) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][600/1251] eta 0:02:37 lr 0.000502 wd 0.0500 time 0.2382 (0.2421) data time 0.0012 (0.0024) model time 0.2370 (0.2386) loss 3.3974 (3.1956) grad_norm 4.8705 (2.6089) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][610/1251] eta 0:02:35 lr 0.000502 wd 0.0500 time 0.2354 (0.2421) data time 0.0010 (0.0024) model time 0.2344 (0.2386) loss 3.5690 (3.1963) grad_norm 2.7850 (2.6090) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][620/1251] eta 0:02:32 lr 0.000502 wd 0.0500 time 0.2374 (0.2420) data time 0.0012 (0.0024) model time 0.2362 (0.2385) loss 3.4405 (3.2010) grad_norm 2.8450 (2.6093) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][630/1251] eta 0:02:30 lr 0.000502 wd 0.0500 time 0.2418 (0.2420) data time 0.0009 (0.0024) model time 0.2409 (0.2385) loss 2.7172 (3.2010) grad_norm 3.2149 (2.6077) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][640/1251] eta 0:02:27 lr 0.000502 wd 0.0500 time 0.2474 (0.2419) data time 
0.0010 (0.0023) model time 0.2464 (0.2385) loss 3.2012 (3.2015) grad_norm 2.0850 (2.6031) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][650/1251] eta 0:02:25 lr 0.000502 wd 0.0500 time 0.2376 (0.2418) data time 0.0011 (0.0023) model time 0.2365 (0.2385) loss 2.6676 (3.2031) grad_norm 1.9522 (2.5977) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][660/1251] eta 0:02:22 lr 0.000502 wd 0.0500 time 0.2354 (0.2418) data time 0.0012 (0.0023) model time 0.2342 (0.2385) loss 3.3163 (3.2045) grad_norm 6.2296 (2.6111) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][670/1251] eta 0:02:20 lr 0.000502 wd 0.0500 time 0.2379 (0.2417) data time 0.0010 (0.0023) model time 0.2369 (0.2384) loss 3.7002 (3.2023) grad_norm 3.1875 (2.6209) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][680/1251] eta 0:02:18 lr 0.000502 wd 0.0500 time 0.2410 (0.2417) data time 0.0011 (0.0023) model time 0.2399 (0.2385) loss 3.4749 (3.2024) grad_norm 2.4887 (2.6292) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][690/1251] eta 0:02:15 lr 0.000502 wd 0.0500 time 0.2366 (0.2417) data time 0.0010 (0.0022) model time 0.2355 (0.2384) loss 3.6158 (3.2000) grad_norm 4.5130 (2.6316) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][700/1251] eta 0:02:13 lr 0.000502 wd 0.0500 time 0.2334 (0.2416) data time 0.0009 (0.0022) model time 0.2326 (0.2384) loss 3.1213 (3.2001) grad_norm 2.7048 (2.6354) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [160/300][710/1251] eta 0:02:10 lr 0.000502 wd 0.0500 time 0.2390 (0.2416) data time 0.0010 (0.0022) model time 0.2380 (0.2384) loss 3.8465 (3.1975) grad_norm 1.6107 (2.6322) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][720/1251] eta 0:02:08 lr 0.000502 wd 0.0500 time 0.2424 (0.2415) data time 0.0010 (0.0022) model time 0.2413 (0.2384) loss 2.9032 (3.1935) grad_norm 2.2535 (2.6324) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][730/1251] eta 0:02:05 lr 0.000502 wd 0.0500 time 0.2428 (0.2415) data time 0.0009 (0.0022) model time 0.2419 (0.2384) loss 3.3873 (3.1940) grad_norm 2.9169 (2.6297) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][740/1251] eta 0:02:03 lr 0.000502 wd 0.0500 time 0.2389 (0.2416) data time 0.0010 (0.0022) model time 0.2379 (0.2385) loss 4.1752 (3.1964) grad_norm 2.8183 (2.6316) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][750/1251] eta 0:02:00 lr 0.000502 wd 0.0500 time 0.2363 (0.2415) data time 0.0007 (0.0022) model time 0.2355 (0.2385) loss 3.7882 (3.1965) grad_norm 3.5991 (2.6399) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][760/1251] eta 0:01:58 lr 0.000502 wd 0.0500 time 0.2393 (0.2415) data time 0.0007 (0.0021) model time 0.2386 (0.2385) loss 2.3969 (3.1901) grad_norm 1.6802 (2.6389) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][770/1251] eta 0:01:56 lr 0.000502 wd 0.0500 time 0.2346 (0.2414) data time 0.0008 (0.0021) model time 0.2338 (0.2385) loss 3.9163 (3.1880) grad_norm 2.0356 (2.6367) 
loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][780/1251] eta 0:01:53 lr 0.000502 wd 0.0500 time 0.2387 (0.2414) data time 0.0008 (0.0021) model time 0.2379 (0.2385) loss 2.4520 (3.1859) grad_norm 1.9582 (2.6331) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][790/1251] eta 0:01:51 lr 0.000501 wd 0.0500 time 0.2387 (0.2414) data time 0.0012 (0.0021) model time 0.2376 (0.2385) loss 2.8385 (3.1847) grad_norm 3.1391 (2.6308) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][800/1251] eta 0:01:48 lr 0.000501 wd 0.0500 time 0.2312 (0.2414) data time 0.0010 (0.0021) model time 0.2302 (0.2385) loss 3.4283 (3.1843) grad_norm 3.0908 (2.6306) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][810/1251] eta 0:01:46 lr 0.000501 wd 0.0500 time 0.2396 (0.2413) data time 0.0010 (0.0021) model time 0.2386 (0.2384) loss 3.9294 (3.1862) grad_norm 1.8338 (2.6284) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][820/1251] eta 0:01:44 lr 0.000501 wd 0.0500 time 0.2407 (0.2413) data time 0.0008 (0.0021) model time 0.2399 (0.2384) loss 3.8350 (3.1865) grad_norm 2.6476 (2.6293) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][830/1251] eta 0:01:41 lr 0.000501 wd 0.0500 time 0.2432 (0.2413) data time 0.0008 (0.0021) model time 0.2425 (0.2385) loss 2.4886 (3.1841) grad_norm 2.7245 (2.6283) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][840/1251] eta 0:01:39 lr 0.000501 wd 0.0500 time 0.2406 (0.2413) 
data time 0.0010 (0.0021) model time 0.2396 (0.2384) loss 3.0811 (3.1851) grad_norm 2.3702 (2.6286) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][850/1251] eta 0:01:36 lr 0.000501 wd 0.0500 time 0.2380 (0.2412) data time 0.0010 (0.0020) model time 0.2370 (0.2384) loss 3.4991 (3.1886) grad_norm 2.2512 (2.6258) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][860/1251] eta 0:01:34 lr 0.000501 wd 0.0500 time 0.2394 (0.2412) data time 0.0007 (0.0020) model time 0.2387 (0.2384) loss 3.6788 (3.1928) grad_norm 2.4965 (2.6279) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][870/1251] eta 0:01:31 lr 0.000501 wd 0.0500 time 0.2310 (0.2412) data time 0.0011 (0.0020) model time 0.2299 (0.2384) loss 3.3747 (3.1950) grad_norm 1.9737 (2.6283) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][880/1251] eta 0:01:29 lr 0.000501 wd 0.0500 time 0.2402 (0.2412) data time 0.0008 (0.0020) model time 0.2394 (0.2384) loss 3.7018 (3.1965) grad_norm 2.9569 (2.6255) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][890/1251] eta 0:01:27 lr 0.000501 wd 0.0500 time 0.2352 (0.2411) data time 0.0009 (0.0020) model time 0.2344 (0.2383) loss 3.9173 (3.1943) grad_norm 2.9815 (2.6265) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][900/1251] eta 0:01:24 lr 0.000501 wd 0.0500 time 0.2389 (0.2411) data time 0.0008 (0.0020) model time 0.2381 (0.2383) loss 3.6710 (3.1965) grad_norm 2.1796 (2.6290) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:44 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [160/300][910/1251] eta 0:01:22 lr 0.000501 wd 0.0500 time 0.2439 (0.2411) data time 0.0010 (0.0020) model time 0.2429 (0.2384) loss 3.6023 (3.1944) grad_norm 3.6743 (2.6290) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][920/1251] eta 0:01:19 lr 0.000501 wd 0.0500 time 0.2314 (0.2411) data time 0.0010 (0.0020) model time 0.2304 (0.2384) loss 3.3269 (3.1946) grad_norm 2.0520 (2.6276) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][930/1251] eta 0:01:17 lr 0.000501 wd 0.0500 time 0.2314 (0.2410) data time 0.0009 (0.0020) model time 0.2305 (0.2383) loss 3.5284 (3.1966) grad_norm 2.4032 (2.6245) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][940/1251] eta 0:01:14 lr 0.000501 wd 0.0500 time 0.2421 (0.2410) data time 0.0008 (0.0020) model time 0.2413 (0.2383) loss 3.8922 (3.1982) grad_norm 2.4473 (2.6205) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][950/1251] eta 0:01:12 lr 0.000501 wd 0.0500 time 0.2355 (0.2410) data time 0.0011 (0.0019) model time 0.2344 (0.2383) loss 3.1891 (3.1977) grad_norm 2.1199 (2.6153) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][960/1251] eta 0:01:10 lr 0.000501 wd 0.0500 time 0.4544 (0.2414) data time 0.0012 (0.0019) model time 0.4532 (0.2388) loss 3.5468 (3.1975) grad_norm 2.0871 (2.6103) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][970/1251] eta 0:01:07 lr 0.000501 wd 0.0500 time 0.2372 (0.2414) data time 0.0011 (0.0019) model time 0.2361 (0.2388) loss 2.7951 (3.1962) grad_norm 
2.4417 (2.6057) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][980/1251] eta 0:01:05 lr 0.000501 wd 0.0500 time 0.2410 (0.2414) data time 0.0010 (0.0019) model time 0.2399 (0.2388) loss 3.5492 (3.1970) grad_norm 2.4927 (2.6025) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][990/1251] eta 0:01:02 lr 0.000501 wd 0.0500 time 0.2415 (0.2414) data time 0.0010 (0.0019) model time 0.2405 (0.2388) loss 3.3734 (3.1950) grad_norm 2.9554 (2.6028) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1000/1251] eta 0:01:00 lr 0.000501 wd 0.0500 time 0.2434 (0.2413) data time 0.0009 (0.0019) model time 0.2424 (0.2388) loss 2.9334 (3.1967) grad_norm 3.3178 (2.6043) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1010/1251] eta 0:00:58 lr 0.000501 wd 0.0500 time 0.2422 (0.2413) data time 0.0010 (0.0019) model time 0.2412 (0.2388) loss 3.2544 (3.1987) grad_norm 1.9942 (2.6014) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1020/1251] eta 0:00:55 lr 0.000500 wd 0.0500 time 0.2440 (0.2413) data time 0.0011 (0.0019) model time 0.2429 (0.2387) loss 3.0870 (3.1963) grad_norm 2.1131 (2.6001) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1030/1251] eta 0:00:53 lr 0.000500 wd 0.0500 time 0.2381 (0.2412) data time 0.0011 (0.0019) model time 0.2370 (0.2387) loss 3.3848 (3.1974) grad_norm 2.5043 (2.5978) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1040/1251] eta 0:00:50 lr 0.000500 wd 0.0500 
time 0.2260 (0.2412) data time 0.0011 (0.0019) model time 0.2249 (0.2387) loss 3.5722 (3.1969) grad_norm 2.3813 (2.5960) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1050/1251] eta 0:00:48 lr 0.000500 wd 0.0500 time 0.2443 (0.2412) data time 0.0008 (0.0019) model time 0.2435 (0.2387) loss 3.6003 (3.1967) grad_norm 2.4969 (2.5948) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1060/1251] eta 0:00:46 lr 0.000500 wd 0.0500 time 0.2297 (0.2412) data time 0.0011 (0.0019) model time 0.2286 (0.2387) loss 2.5605 (3.1942) grad_norm 2.5412 (2.5948) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1070/1251] eta 0:00:43 lr 0.000500 wd 0.0500 time 0.2408 (0.2412) data time 0.0011 (0.0019) model time 0.2397 (0.2387) loss 2.7109 (3.1923) grad_norm 2.4600 (2.5972) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1080/1251] eta 0:00:41 lr 0.000500 wd 0.0500 time 0.2389 (0.2412) data time 0.0010 (0.0019) model time 0.2378 (0.2387) loss 3.7618 (3.1958) grad_norm 3.0823 (2.5941) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1090/1251] eta 0:00:38 lr 0.000500 wd 0.0500 time 0.2346 (0.2411) data time 0.0011 (0.0018) model time 0.2335 (0.2387) loss 3.7263 (3.1980) grad_norm 3.0383 (2.5939) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1100/1251] eta 0:00:36 lr 0.000500 wd 0.0500 time 0.2402 (0.2411) data time 0.0008 (0.0018) model time 0.2394 (0.2387) loss 4.0897 (3.1988) grad_norm 2.1874 (2.5981) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1110/1251] eta 0:00:33 lr 0.000500 wd 0.0500 time 0.2448 (0.2411) data time 0.0009 (0.0018) model time 0.2439 (0.2387) loss 3.2855 (3.1952) grad_norm 2.7431 (2.6037) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1120/1251] eta 0:00:31 lr 0.000500 wd 0.0500 time 0.2315 (0.2411) data time 0.0011 (0.0018) model time 0.2304 (0.2387) loss 2.3778 (3.1935) grad_norm 2.0216 (2.6038) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1130/1251] eta 0:00:29 lr 0.000500 wd 0.0500 time 0.2373 (0.2411) data time 0.0011 (0.0018) model time 0.2362 (0.2387) loss 2.2254 (3.1946) grad_norm 2.2011 (2.6021) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1140/1251] eta 0:00:26 lr 0.000500 wd 0.0500 time 0.2367 (0.2410) data time 0.0011 (0.0018) model time 0.2356 (0.2386) loss 2.5286 (3.1940) grad_norm 2.4554 (2.6023) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1150/1251] eta 0:00:24 lr 0.000500 wd 0.0500 time 0.2399 (0.2410) data time 0.0012 (0.0018) model time 0.2388 (0.2386) loss 3.3616 (3.1934) grad_norm 3.9444 (2.6057) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1160/1251] eta 0:00:21 lr 0.000500 wd 0.0500 time 0.2387 (0.2410) data time 0.0009 (0.0018) model time 0.2378 (0.2386) loss 3.0964 (3.1920) grad_norm 3.0794 (2.6060) loss_scale 2048.0000 (2048.0000) mem 7382MB [2024-08-27 10:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1170/1251] eta 0:00:19 lr 0.000500 wd 0.0500 time 0.2538 (0.2410) data time 0.0010 (0.0018) model time 0.2528 (0.2386) loss 
3.0044 (3.1894) grad_norm 2.6806 (2.6033) loss_scale 4096.0000 (2049.7489) mem 7382MB [2024-08-27 10:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1180/1251] eta 0:00:17 lr 0.000500 wd 0.0500 time 0.2362 (0.2410) data time 0.0011 (0.0018) model time 0.2351 (0.2386) loss 2.8901 (3.1869) grad_norm 2.6208 (2.5999) loss_scale 4096.0000 (2067.0754) mem 7382MB [2024-08-27 10:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1190/1251] eta 0:00:14 lr 0.000500 wd 0.0500 time 0.2351 (0.2410) data time 0.0011 (0.0018) model time 0.2339 (0.2386) loss 3.3392 (3.1853) grad_norm 1.6270 (2.5979) loss_scale 4096.0000 (2084.1108) mem 7382MB [2024-08-27 10:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1200/1251] eta 0:00:12 lr 0.000500 wd 0.0500 time 0.4525 (0.2411) data time 0.0011 (0.0018) model time 0.4514 (0.2388) loss 3.7396 (3.1863) grad_norm 2.3458 (2.5942) loss_scale 4096.0000 (2100.8626) mem 7382MB [2024-08-27 10:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1210/1251] eta 0:00:09 lr 0.000500 wd 0.0500 time 0.2523 (0.2411) data time 0.0014 (0.0018) model time 0.2509 (0.2388) loss 3.9414 (3.1855) grad_norm 2.8566 (2.5937) loss_scale 4096.0000 (2117.3377) mem 7382MB [2024-08-27 10:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1220/1251] eta 0:00:07 lr 0.000500 wd 0.0500 time 0.2340 (0.2411) data time 0.0011 (0.0018) model time 0.2329 (0.2388) loss 3.9561 (3.1858) grad_norm 3.8136 (2.5923) loss_scale 4096.0000 (2133.5430) mem 7382MB [2024-08-27 10:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1230/1251] eta 0:00:05 lr 0.000500 wd 0.0500 time 0.2372 (0.2411) data time 0.0009 (0.0018) model time 0.2363 (0.2388) loss 3.8919 (3.1867) grad_norm 2.0175 (2.5972) loss_scale 4096.0000 (2149.4850) mem 7382MB [2024-08-27 10:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1240/1251] eta 
0:00:02 lr 0.000499 wd 0.0500 time 0.2232 (0.2410) data time 0.0005 (0.0018) model time 0.2227 (0.2387) loss 2.6856 (3.1858) grad_norm 3.6529 (2.6027) loss_scale 4096.0000 (2165.1700) mem 7382MB [2024-08-27 10:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [160/300][1250/1251] eta 0:00:00 lr 0.000499 wd 0.0500 time 0.2285 (0.2409) data time 0.0005 (0.0017) model time 0.2280 (0.2386) loss 3.1841 (3.1868) grad_norm 2.3276 (2.6017) loss_scale 4096.0000 (2180.6043) mem 7382MB [2024-08-27 10:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 160 training takes 0:05:01 [2024-08-27 10:14:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:14:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.428 (0.428) Loss 0.4707 (0.4707) Acc@1 89.844 (89.844) Acc@5 98.633 (98.633) Mem 7382MB [2024-08-27 10:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.110) Loss 0.7432 (0.7018) Acc@1 84.180 (84.437) Acc@5 96.875 (97.088) Mem 7382MB [2024-08-27 10:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.096) Loss 1.0186 (0.7334) Acc@1 75.684 (83.459) Acc@5 94.629 (96.982) Mem 7382MB [2024-08-27 10:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.090) Loss 1.2227 (0.8348) Acc@1 69.434 (81.115) Acc@5 91.797 (95.772) Mem 7382MB [2024-08-27 10:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.1543 (0.8896) Acc@1 73.340 (79.819) Acc@5 92.773 (95.155) Mem 7382MB [2024-08-27 10:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.330 Acc@5 95.068 [2024-08-27 10:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of 
the network on the 50000 test images: 79.3% [2024-08-27 10:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.771 (0.771) Loss 0.4028 (0.4028) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7382MB [2024-08-27 10:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.142) Loss 0.6318 (0.6310) Acc@1 87.500 (86.435) Acc@5 97.168 (97.390) Mem 7382MB [2024-08-27 10:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.110) Loss 0.9033 (0.6561) Acc@1 78.223 (85.500) Acc@5 95.508 (97.387) Mem 7382MB [2024-08-27 10:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.100) Loss 1.1328 (0.7449) Acc@1 72.559 (83.380) Acc@5 92.871 (96.418) Mem 7382MB [2024-08-27 10:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.092) Loss 1.0303 (0.7918) Acc@1 74.414 (81.996) Acc@5 93.457 (95.948) Mem 7382MB [2024-08-27 10:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.566 Acc@5 95.930 [2024-08-27 10:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.6% [2024-08-27 10:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.57% [2024-08-27 10:14:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 10:14:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 10:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][0/1251] eta 0:15:36 lr 0.000499 wd 0.0500 time 0.7486 (0.7486) data time 0.5193 (0.5193) model time 0.0000 (0.0000) loss 3.6663 (3.6663) grad_norm 1.9272 (1.9272) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][10/1251] eta 0:05:51 lr 0.000499 wd 0.0500 time 0.2361 (0.2829) data time 0.0009 (0.0482) model time 0.0000 (0.0000) loss 3.6265 (3.5886) grad_norm 2.0503 (2.6951) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][20/1251] eta 0:05:22 lr 0.000499 wd 0.0500 time 0.2397 (0.2617) data time 0.0008 (0.0257) model time 0.0000 (0.0000) loss 3.6863 (3.2935) grad_norm 2.6156 (2.5159) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][30/1251] eta 0:05:10 lr 0.000499 wd 0.0500 time 0.2351 (0.2541) data time 0.0010 (0.0177) model time 0.0000 (0.0000) loss 3.2103 (3.2699) grad_norm 2.0518 (2.5274) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][40/1251] eta 0:05:03 lr 0.000499 wd 0.0500 time 0.2416 (0.2508) data time 0.0008 (0.0136) model time 0.0000 (0.0000) loss 2.7403 (3.1982) grad_norm 2.4153 (2.4802) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][50/1251] eta 0:05:04 lr 0.000499 wd 0.0500 time 0.2167 (0.2532) data time 0.0011 (0.0112) model time 0.0000 (0.0000) loss 2.4551 (3.1980) grad_norm 2.4342 (2.4910) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][60/1251] eta 0:04:58 lr 0.000499 wd 0.0500 time 0.2409 (0.2510) data time 0.0009 (0.0095) model time 0.2400 
(0.2390) loss 2.1219 (3.1727) grad_norm 2.1441 (2.4718) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][70/1251] eta 0:04:54 lr 0.000499 wd 0.0500 time 0.2322 (0.2492) data time 0.0011 (0.0083) model time 0.2311 (0.2381) loss 2.5937 (3.1475) grad_norm 2.8392 (2.4426) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][80/1251] eta 0:04:50 lr 0.000499 wd 0.0500 time 0.2372 (0.2484) data time 0.0007 (0.0074) model time 0.2365 (0.2393) loss 2.9444 (3.1277) grad_norm 2.3523 (2.4360) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][90/1251] eta 0:04:47 lr 0.000499 wd 0.0500 time 0.2432 (0.2474) data time 0.0010 (0.0067) model time 0.2422 (0.2390) loss 3.5645 (3.1144) grad_norm 2.5440 (2.4541) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][100/1251] eta 0:04:43 lr 0.000499 wd 0.0500 time 0.2402 (0.2467) data time 0.0008 (0.0062) model time 0.2394 (0.2391) loss 2.9137 (3.1267) grad_norm 2.7895 (2.4775) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][110/1251] eta 0:04:40 lr 0.000499 wd 0.0500 time 0.2378 (0.2460) data time 0.0008 (0.0057) model time 0.2370 (0.2388) loss 1.8400 (3.1410) grad_norm 2.5438 (2.4705) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][120/1251] eta 0:04:37 lr 0.000499 wd 0.0500 time 0.2521 (0.2456) data time 0.0009 (0.0053) model time 0.2511 (0.2391) loss 3.5975 (3.1683) grad_norm 3.8105 (2.4753) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][130/1251] 
eta 0:04:34 lr 0.000499 wd 0.0500 time 0.2359 (0.2452) data time 0.0011 (0.0050) model time 0.2348 (0.2391) loss 3.6313 (3.1769) grad_norm 2.8041 (2.4978) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][140/1251] eta 0:04:31 lr 0.000499 wd 0.0500 time 0.2352 (0.2447) data time 0.0011 (0.0047) model time 0.2341 (0.2388) loss 3.5284 (3.1709) grad_norm 3.0940 (2.5071) loss_scale 4096.0000 (4096.0000) mem 7382MB [2024-08-27 10:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][150/1251] eta 0:04:28 lr 0.000499 wd 0.0500 time 0.2359 (0.2441) data time 0.0008 (0.0045) model time 0.2351 (0.2384) loss 3.4911 (3.1755) grad_norm 2.8501 (nan) loss_scale 2048.0000 (4068.8742) mem 7382MB [2024-08-27 10:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][160/1251] eta 0:04:25 lr 0.000499 wd 0.0500 time 0.2380 (0.2437) data time 0.0010 (0.0043) model time 0.2369 (0.2383) loss 2.1332 (3.1529) grad_norm 2.1182 (nan) loss_scale 2048.0000 (3943.3540) mem 7382MB [2024-08-27 10:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][170/1251] eta 0:04:23 lr 0.000499 wd 0.0500 time 0.2421 (0.2435) data time 0.0008 (0.0041) model time 0.2414 (0.2384) loss 3.1752 (3.1465) grad_norm 5.1975 (nan) loss_scale 2048.0000 (3832.5146) mem 7382MB [2024-08-27 10:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][180/1251] eta 0:04:20 lr 0.000499 wd 0.0500 time 0.2358 (0.2433) data time 0.0009 (0.0039) model time 0.2350 (0.2383) loss 3.1114 (3.1564) grad_norm 2.6305 (nan) loss_scale 2048.0000 (3733.9227) mem 7382MB [2024-08-27 10:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][190/1251] eta 0:04:17 lr 0.000499 wd 0.0500 time 0.2447 (0.2430) data time 0.0008 (0.0037) model time 0.2439 (0.2382) loss 3.8466 (3.1572) grad_norm 2.3835 (nan) loss_scale 2048.0000 (3645.6545) mem 7382MB [2024-08-27 
10:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][200/1251] eta 0:04:15 lr 0.000499 wd 0.0500 time 0.2380 (0.2428) data time 0.0007 (0.0036) model time 0.2372 (0.2382) loss 4.0414 (3.1696) grad_norm 2.7347 (nan) loss_scale 2048.0000 (3566.1692) mem 7382MB [2024-08-27 10:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][210/1251] eta 0:04:12 lr 0.000499 wd 0.0500 time 0.2435 (0.2426) data time 0.0010 (0.0035) model time 0.2425 (0.2382) loss 2.9370 (3.1556) grad_norm 2.4520 (nan) loss_scale 2048.0000 (3494.2180) mem 7382MB [2024-08-27 10:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][220/1251] eta 0:04:10 lr 0.000498 wd 0.0500 time 0.2408 (0.2425) data time 0.0008 (0.0034) model time 0.2400 (0.2383) loss 3.5047 (3.1546) grad_norm 2.1689 (nan) loss_scale 2048.0000 (3428.7783) mem 7382MB [2024-08-27 10:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][230/1251] eta 0:04:07 lr 0.000498 wd 0.0500 time 0.2422 (0.2423) data time 0.0008 (0.0033) model time 0.2414 (0.2382) loss 3.0596 (3.1492) grad_norm 3.1177 (nan) loss_scale 2048.0000 (3369.0043) mem 7382MB [2024-08-27 10:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][240/1251] eta 0:04:04 lr 0.000498 wd 0.0500 time 0.2519 (0.2423) data time 0.0009 (0.0032) model time 0.2509 (0.2383) loss 3.1781 (3.1647) grad_norm 3.0175 (nan) loss_scale 2048.0000 (3314.1909) mem 7382MB [2024-08-27 10:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][250/1251] eta 0:04:02 lr 0.000498 wd 0.0500 time 0.2392 (0.2423) data time 0.0010 (0.0031) model time 0.2383 (0.2385) loss 2.6501 (3.1699) grad_norm 2.2533 (nan) loss_scale 2048.0000 (3263.7450) mem 7382MB [2024-08-27 10:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][260/1251] eta 0:03:59 lr 0.000498 wd 0.0500 time 0.2377 (0.2422) data time 0.0008 (0.0030) model time 0.2369 (0.2384) loss 3.6460 (3.1778) 
grad_norm 2.6870 (nan) loss_scale 2048.0000 (3217.1648) mem 7382MB [2024-08-27 10:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][270/1251] eta 0:03:57 lr 0.000498 wd 0.0500 time 0.2393 (0.2420) data time 0.0009 (0.0029) model time 0.2384 (0.2384) loss 2.2005 (3.1711) grad_norm 2.9412 (nan) loss_scale 2048.0000 (3174.0221) mem 7382MB [2024-08-27 10:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][280/1251] eta 0:03:54 lr 0.000498 wd 0.0500 time 0.2381 (0.2418) data time 0.0011 (0.0029) model time 0.2370 (0.2383) loss 2.9297 (3.1541) grad_norm 2.5580 (nan) loss_scale 2048.0000 (3133.9502) mem 7382MB [2024-08-27 10:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][290/1251] eta 0:03:52 lr 0.000498 wd 0.0500 time 0.2356 (0.2417) data time 0.0010 (0.0028) model time 0.2346 (0.2382) loss 3.0885 (3.1535) grad_norm 2.2711 (nan) loss_scale 2048.0000 (3096.6323) mem 7382MB [2024-08-27 10:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][300/1251] eta 0:03:49 lr 0.000498 wd 0.0500 time 0.2450 (0.2416) data time 0.0011 (0.0028) model time 0.2439 (0.2382) loss 2.2432 (3.1446) grad_norm 2.2363 (nan) loss_scale 2048.0000 (3061.7940) mem 7382MB [2024-08-27 10:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][310/1251] eta 0:03:47 lr 0.000498 wd 0.0500 time 0.2310 (0.2415) data time 0.0010 (0.0027) model time 0.2300 (0.2382) loss 3.4578 (3.1460) grad_norm 3.1948 (nan) loss_scale 2048.0000 (3029.1961) mem 7382MB [2024-08-27 10:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][320/1251] eta 0:03:44 lr 0.000498 wd 0.0500 time 0.2367 (0.2414) data time 0.0011 (0.0027) model time 0.2356 (0.2381) loss 3.1413 (3.1444) grad_norm 2.2696 (nan) loss_scale 2048.0000 (2998.6293) mem 7382MB [2024-08-27 10:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][330/1251] eta 0:03:42 lr 0.000498 wd 0.0500 time 0.2380 
(0.2413) data time 0.0010 (0.0026) model time 0.2369 (0.2382) loss 3.4069 (3.1444) grad_norm 2.2505 (nan) loss_scale 2048.0000 (2969.9094) mem 7382MB [2024-08-27 10:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][340/1251] eta 0:03:39 lr 0.000498 wd 0.0500 time 0.2444 (0.2413) data time 0.0008 (0.0026) model time 0.2437 (0.2381) loss 2.6154 (3.1460) grad_norm 2.4204 (nan) loss_scale 2048.0000 (2942.8739) mem 7382MB [2024-08-27 10:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][350/1251] eta 0:03:37 lr 0.000498 wd 0.0500 time 0.2394 (0.2412) data time 0.0010 (0.0025) model time 0.2384 (0.2382) loss 3.7491 (3.1441) grad_norm 3.0822 (nan) loss_scale 2048.0000 (2917.3789) mem 7382MB [2024-08-27 10:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][360/1251] eta 0:03:34 lr 0.000498 wd 0.0500 time 0.2396 (0.2411) data time 0.0010 (0.0025) model time 0.2387 (0.2381) loss 3.4534 (3.1459) grad_norm 5.8615 (nan) loss_scale 2048.0000 (2893.2964) mem 7382MB [2024-08-27 10:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][370/1251] eta 0:03:32 lr 0.000498 wd 0.0500 time 0.2402 (0.2410) data time 0.0011 (0.0024) model time 0.2391 (0.2380) loss 2.8890 (3.1421) grad_norm 2.3198 (nan) loss_scale 2048.0000 (2870.5121) mem 7382MB [2024-08-27 10:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][380/1251] eta 0:03:29 lr 0.000498 wd 0.0500 time 0.2415 (0.2410) data time 0.0009 (0.0024) model time 0.2405 (0.2380) loss 3.5595 (3.1387) grad_norm 2.7260 (nan) loss_scale 2048.0000 (2848.9239) mem 7382MB [2024-08-27 10:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][390/1251] eta 0:03:27 lr 0.000498 wd 0.0500 time 0.2407 (0.2409) data time 0.0010 (0.0024) model time 0.2397 (0.2380) loss 3.6454 (3.1351) grad_norm 2.1473 (nan) loss_scale 2048.0000 (2828.4399) mem 7382MB [2024-08-27 10:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [161/300][400/1251] eta 0:03:24 lr 0.000498 wd 0.0500 time 0.2316 (0.2408) data time 0.0010 (0.0023) model time 0.2305 (0.2380) loss 3.3684 (3.1376) grad_norm 2.4568 (nan) loss_scale 2048.0000 (2808.9776) mem 7382MB [2024-08-27 10:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][410/1251] eta 0:03:22 lr 0.000498 wd 0.0500 time 0.2387 (0.2408) data time 0.0010 (0.0023) model time 0.2377 (0.2380) loss 2.9021 (3.1359) grad_norm 3.2391 (nan) loss_scale 2048.0000 (2790.4623) mem 7382MB [2024-08-27 10:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][420/1251] eta 0:03:20 lr 0.000498 wd 0.0500 time 0.4421 (0.2412) data time 0.0007 (0.0023) model time 0.4413 (0.2385) loss 3.1117 (3.1432) grad_norm 2.3422 (nan) loss_scale 2048.0000 (2772.8266) mem 7382MB [2024-08-27 10:16:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][430/1251] eta 0:03:18 lr 0.000498 wd 0.0500 time 0.2463 (0.2422) data time 0.0008 (0.0022) model time 0.2455 (0.2397) loss 3.8153 (3.1442) grad_norm 2.4628 (nan) loss_scale 2048.0000 (2756.0093) mem 7382MB [2024-08-27 10:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][440/1251] eta 0:03:16 lr 0.000497 wd 0.0500 time 0.2427 (0.2421) data time 0.0010 (0.0022) model time 0.2417 (0.2397) loss 3.5625 (3.1402) grad_norm 1.9748 (nan) loss_scale 2048.0000 (2739.9546) mem 7382MB [2024-08-27 10:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][450/1251] eta 0:03:13 lr 0.000497 wd 0.0500 time 0.2361 (0.2420) data time 0.0008 (0.0022) model time 0.2353 (0.2396) loss 3.5191 (3.1385) grad_norm 3.5826 (nan) loss_scale 2048.0000 (2724.6120) mem 7382MB [2024-08-27 10:16:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][460/1251] eta 0:03:11 lr 0.000497 wd 0.0500 time 0.2420 (0.2420) data time 0.0008 (0.0022) model time 0.2412 (0.2395) loss 3.7236 (3.1354) grad_norm 2.6053 (nan) loss_scale 2048.0000 
(2709.9349) mem 7382MB [2024-08-27 10:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][470/1251] eta 0:03:08 lr 0.000497 wd 0.0500 time 0.2356 (0.2419) data time 0.0010 (0.0021) model time 0.2347 (0.2394) loss 3.2078 (3.1373) grad_norm 2.2131 (nan) loss_scale 2048.0000 (2695.8811) mem 7382MB [2024-08-27 10:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][480/1251] eta 0:03:06 lr 0.000497 wd 0.0500 time 0.2414 (0.2422) data time 0.0008 (0.0021) model time 0.2406 (0.2399) loss 2.6947 (3.1410) grad_norm 2.0113 (nan) loss_scale 2048.0000 (2682.4116) mem 7382MB [2024-08-27 10:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][490/1251] eta 0:03:04 lr 0.000497 wd 0.0500 time 0.2325 (0.2421) data time 0.0011 (0.0021) model time 0.2314 (0.2398) loss 2.0121 (3.1390) grad_norm 2.8516 (nan) loss_scale 2048.0000 (2669.4908) mem 7382MB [2024-08-27 10:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][500/1251] eta 0:03:01 lr 0.000497 wd 0.0500 time 0.2436 (0.2421) data time 0.0009 (0.0021) model time 0.2427 (0.2398) loss 3.6067 (3.1403) grad_norm 2.4287 (nan) loss_scale 2048.0000 (2657.0858) mem 7382MB [2024-08-27 10:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][510/1251] eta 0:02:59 lr 0.000497 wd 0.0500 time 0.2392 (0.2421) data time 0.0011 (0.0020) model time 0.2381 (0.2398) loss 3.2836 (3.1372) grad_norm 2.0170 (nan) loss_scale 2048.0000 (2645.1663) mem 7382MB [2024-08-27 10:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][520/1251] eta 0:02:56 lr 0.000497 wd 0.0500 time 0.2318 (0.2420) data time 0.0008 (0.0020) model time 0.2310 (0.2398) loss 3.5024 (3.1359) grad_norm 2.3051 (nan) loss_scale 2048.0000 (2633.7044) mem 7382MB [2024-08-27 10:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][530/1251] eta 0:02:54 lr 0.000497 wd 0.0500 time 0.2401 (0.2420) data time 0.0011 (0.0020) model time 
0.2390 (0.2397) loss 3.5078 (3.1361) grad_norm 4.4351 (nan) loss_scale 2048.0000 (2622.6742) mem 7382MB [2024-08-27 10:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][540/1251] eta 0:02:51 lr 0.000497 wd 0.0500 time 0.2342 (0.2419) data time 0.0011 (0.0020) model time 0.2331 (0.2397) loss 3.3251 (3.1308) grad_norm 3.2212 (nan) loss_scale 2048.0000 (2612.0518) mem 7382MB [2024-08-27 10:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][550/1251] eta 0:02:49 lr 0.000497 wd 0.0500 time 0.2422 (0.2418) data time 0.0009 (0.0020) model time 0.2412 (0.2397) loss 2.7419 (3.1317) grad_norm 3.0273 (nan) loss_scale 2048.0000 (2601.8149) mem 7382MB [2024-08-27 10:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][560/1251] eta 0:02:47 lr 0.000497 wd 0.0500 time 0.2305 (0.2418) data time 0.0010 (0.0020) model time 0.2295 (0.2397) loss 3.8445 (3.1335) grad_norm 3.0029 (nan) loss_scale 2048.0000 (2591.9430) mem 7382MB [2024-08-27 10:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][570/1251] eta 0:02:44 lr 0.000497 wd 0.0500 time 0.2285 (0.2418) data time 0.0009 (0.0020) model time 0.2277 (0.2396) loss 2.9322 (3.1399) grad_norm 2.2031 (nan) loss_scale 2048.0000 (2582.4168) mem 7382MB [2024-08-27 10:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][580/1251] eta 0:02:42 lr 0.000497 wd 0.0500 time 0.2456 (0.2417) data time 0.0011 (0.0019) model time 0.2445 (0.2396) loss 3.0352 (3.1402) grad_norm 2.3128 (nan) loss_scale 2048.0000 (2573.2186) mem 7382MB [2024-08-27 10:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][590/1251] eta 0:02:39 lr 0.000497 wd 0.0500 time 0.2420 (0.2417) data time 0.0010 (0.0019) model time 0.2410 (0.2396) loss 3.7671 (3.1427) grad_norm 1.8995 (nan) loss_scale 2048.0000 (2564.3316) mem 7382MB [2024-08-27 10:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][600/1251] eta 0:02:37 
lr 0.000497 wd 0.0500 time 0.2427 (0.2417) data time 0.0011 (0.0019) model time 0.2416 (0.2396) loss 3.2474 (3.1393) grad_norm 1.6939 (nan) loss_scale 2048.0000 (2555.7404) mem 7382MB [2024-08-27 10:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][610/1251] eta 0:02:34 lr 0.000497 wd 0.0500 time 0.2280 (0.2417) data time 0.0008 (0.0019) model time 0.2272 (0.2396) loss 3.5835 (3.1423) grad_norm 3.6357 (nan) loss_scale 2048.0000 (2547.4304) mem 7382MB [2024-08-27 10:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][620/1251] eta 0:02:32 lr 0.000497 wd 0.0500 time 0.2383 (0.2417) data time 0.0011 (0.0019) model time 0.2372 (0.2396) loss 2.9862 (3.1416) grad_norm 4.7242 (nan) loss_scale 2048.0000 (2539.3881) mem 7382MB [2024-08-27 10:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][630/1251] eta 0:02:30 lr 0.000497 wd 0.0500 time 0.2345 (0.2416) data time 0.0010 (0.0019) model time 0.2334 (0.2396) loss 3.1959 (3.1395) grad_norm 2.9946 (nan) loss_scale 2048.0000 (2531.6006) mem 7382MB [2024-08-27 10:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][640/1251] eta 0:02:27 lr 0.000497 wd 0.0500 time 0.2287 (0.2416) data time 0.0008 (0.0019) model time 0.2279 (0.2395) loss 2.8879 (3.1397) grad_norm 2.2632 (nan) loss_scale 2048.0000 (2524.0562) mem 7382MB [2024-08-27 10:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][650/1251] eta 0:02:25 lr 0.000497 wd 0.0500 time 0.2364 (0.2415) data time 0.0009 (0.0018) model time 0.2355 (0.2395) loss 4.0423 (3.1428) grad_norm 2.7144 (nan) loss_scale 2048.0000 (2516.7435) mem 7382MB [2024-08-27 10:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][660/1251] eta 0:02:22 lr 0.000497 wd 0.0500 time 0.2390 (0.2415) data time 0.0008 (0.0019) model time 0.2382 (0.2394) loss 2.4283 (3.1454) grad_norm 3.2885 (nan) loss_scale 2048.0000 (2509.6520) mem 7382MB [2024-08-27 10:16:57 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][670/1251] eta 0:02:20 lr 0.000496 wd 0.0500 time 0.2298 (0.2414) data time 0.0010 (0.0018) model time 0.2287 (0.2394) loss 3.5171 (3.1488) grad_norm 2.7758 (nan) loss_scale 2048.0000 (2502.7720) mem 7382MB [2024-08-27 10:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][680/1251] eta 0:02:17 lr 0.000496 wd 0.0500 time 0.2390 (0.2414) data time 0.0011 (0.0018) model time 0.2380 (0.2394) loss 3.7600 (3.1520) grad_norm 2.3042 (nan) loss_scale 2048.0000 (2496.0940) mem 7382MB [2024-08-27 10:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][690/1251] eta 0:02:15 lr 0.000496 wd 0.0500 time 0.2353 (0.2414) data time 0.0009 (0.0018) model time 0.2344 (0.2394) loss 2.8690 (3.1483) grad_norm 2.4904 (nan) loss_scale 2048.0000 (2489.6093) mem 7382MB [2024-08-27 10:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][700/1251] eta 0:02:12 lr 0.000496 wd 0.0500 time 0.2446 (0.2413) data time 0.0010 (0.0018) model time 0.2436 (0.2393) loss 3.5705 (3.1512) grad_norm 2.8666 (nan) loss_scale 2048.0000 (2483.3096) mem 7382MB [2024-08-27 10:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][710/1251] eta 0:02:10 lr 0.000496 wd 0.0500 time 0.2386 (0.2413) data time 0.0008 (0.0018) model time 0.2378 (0.2393) loss 2.1968 (3.1485) grad_norm 2.4094 (nan) loss_scale 2048.0000 (2477.1871) mem 7382MB [2024-08-27 10:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][720/1251] eta 0:02:08 lr 0.000496 wd 0.0500 time 0.2321 (0.2413) data time 0.0009 (0.0018) model time 0.2312 (0.2393) loss 3.3484 (3.1516) grad_norm 2.1545 (nan) loss_scale 2048.0000 (2471.2344) mem 7382MB [2024-08-27 10:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][730/1251] eta 0:02:05 lr 0.000496 wd 0.0500 time 0.2354 (0.2412) data time 0.0009 (0.0018) model time 0.2345 (0.2393) loss 2.4347 (3.1522) 
grad_norm 1.5929 (nan) loss_scale 2048.0000 (2465.4446) mem 7382MB [2024-08-27 10:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][740/1251] eta 0:02:03 lr 0.000496 wd 0.0500 time 0.2462 (0.2412) data time 0.0009 (0.0018) model time 0.2454 (0.2392) loss 3.5266 (3.1561) grad_norm 2.3911 (nan) loss_scale 2048.0000 (2459.8111) mem 7382MB [2024-08-27 10:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][750/1251] eta 0:02:00 lr 0.000496 wd 0.0500 time 0.2369 (0.2411) data time 0.0011 (0.0018) model time 0.2359 (0.2392) loss 3.4094 (3.1531) grad_norm 2.4780 (nan) loss_scale 2048.0000 (2454.3276) mem 7382MB [2024-08-27 10:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][760/1251] eta 0:01:58 lr 0.000496 wd 0.0500 time 0.2423 (0.2411) data time 0.0007 (0.0018) model time 0.2416 (0.2392) loss 3.3828 (3.1529) grad_norm 2.1939 (nan) loss_scale 2048.0000 (2448.9882) mem 7382MB [2024-08-27 10:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][770/1251] eta 0:01:55 lr 0.000496 wd 0.0500 time 0.2394 (0.2411) data time 0.0009 (0.0018) model time 0.2385 (0.2392) loss 3.7563 (3.1563) grad_norm 1.9787 (nan) loss_scale 2048.0000 (2443.7873) mem 7382MB [2024-08-27 10:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][780/1251] eta 0:01:53 lr 0.000496 wd 0.0500 time 0.2409 (0.2411) data time 0.0010 (0.0017) model time 0.2399 (0.2392) loss 3.3201 (3.1576) grad_norm 1.8517 (nan) loss_scale 2048.0000 (2438.7196) mem 7382MB [2024-08-27 10:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][790/1251] eta 0:01:51 lr 0.000496 wd 0.0500 time 0.2357 (0.2411) data time 0.0007 (0.0017) model time 0.2350 (0.2391) loss 2.3099 (3.1568) grad_norm 2.7969 (nan) loss_scale 2048.0000 (2433.7800) mem 7382MB [2024-08-27 10:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][800/1251] eta 0:01:48 lr 0.000496 wd 0.0500 time 0.2379 
(0.2410) data time 0.0011 (0.0017) model time 0.2368 (0.2391) loss 2.9498 (3.1557) grad_norm 2.0962 (nan) loss_scale 2048.0000 (2428.9638) mem 7382MB [2024-08-27 10:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][810/1251] eta 0:01:46 lr 0.000496 wd 0.0500 time 0.2421 (0.2410) data time 0.0009 (0.0017) model time 0.2413 (0.2391) loss 3.9714 (3.1579) grad_norm 2.7102 (nan) loss_scale 2048.0000 (2424.2663) mem 7382MB [2024-08-27 10:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][820/1251] eta 0:01:43 lr 0.000496 wd 0.0500 time 0.2357 (0.2410) data time 0.0011 (0.0017) model time 0.2346 (0.2391) loss 2.1044 (3.1576) grad_norm 1.8885 (nan) loss_scale 2048.0000 (2419.6833) mem 7382MB [2024-08-27 10:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][830/1251] eta 0:01:41 lr 0.000496 wd 0.0500 time 0.2384 (0.2410) data time 0.0008 (0.0017) model time 0.2376 (0.2391) loss 3.3664 (3.1601) grad_norm 2.4376 (nan) loss_scale 2048.0000 (2415.2106) mem 7382MB [2024-08-27 10:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][840/1251] eta 0:01:39 lr 0.000496 wd 0.0500 time 0.2328 (0.2410) data time 0.0010 (0.0017) model time 0.2318 (0.2391) loss 2.8026 (3.1596) grad_norm 4.1754 (nan) loss_scale 2048.0000 (2410.8442) mem 7382MB [2024-08-27 10:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][850/1251] eta 0:01:36 lr 0.000496 wd 0.0500 time 0.2291 (0.2409) data time 0.0011 (0.0017) model time 0.2280 (0.2390) loss 3.2839 (3.1616) grad_norm 2.5313 (nan) loss_scale 2048.0000 (2406.5805) mem 7382MB [2024-08-27 10:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][860/1251] eta 0:01:34 lr 0.000496 wd 0.0500 time 0.2455 (0.2409) data time 0.0010 (0.0017) model time 0.2445 (0.2390) loss 2.3800 (3.1626) grad_norm 2.4412 (nan) loss_scale 2048.0000 (2402.4158) mem 7382MB [2024-08-27 10:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [161/300][870/1251] eta 0:01:31 lr 0.000496 wd 0.0500 time 0.2383 (0.2409) data time 0.0009 (0.0017) model time 0.2374 (0.2390) loss 2.7305 (3.1591) grad_norm 3.1239 (nan) loss_scale 2048.0000 (2398.3467) mem 7382MB [2024-08-27 10:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][880/1251] eta 0:01:29 lr 0.000496 wd 0.0500 time 0.2387 (0.2408) data time 0.0007 (0.0017) model time 0.2379 (0.2390) loss 2.8771 (3.1602) grad_norm 2.1767 (nan) loss_scale 2048.0000 (2394.3700) mem 7382MB [2024-08-27 10:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][890/1251] eta 0:01:26 lr 0.000495 wd 0.0500 time 0.2311 (0.2408) data time 0.0008 (0.0017) model time 0.2303 (0.2389) loss 2.6347 (3.1576) grad_norm 2.3339 (nan) loss_scale 2048.0000 (2390.4826) mem 7382MB [2024-08-27 10:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][900/1251] eta 0:01:24 lr 0.000495 wd 0.0500 time 0.2417 (0.2408) data time 0.0012 (0.0016) model time 0.2406 (0.2390) loss 3.5396 (3.1593) grad_norm 1.8657 (nan) loss_scale 2048.0000 (2386.6815) mem 7382MB [2024-08-27 10:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][910/1251] eta 0:01:22 lr 0.000495 wd 0.0500 time 0.2369 (0.2408) data time 0.0009 (0.0016) model time 0.2360 (0.2390) loss 2.7387 (3.1591) grad_norm 2.2557 (nan) loss_scale 2048.0000 (2382.9638) mem 7382MB [2024-08-27 10:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][920/1251] eta 0:01:19 lr 0.000495 wd 0.0500 time 0.2404 (0.2408) data time 0.0010 (0.0016) model time 0.2394 (0.2390) loss 3.7101 (3.1604) grad_norm 2.3055 (nan) loss_scale 2048.0000 (2379.3268) mem 7382MB [2024-08-27 10:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][930/1251] eta 0:01:17 lr 0.000495 wd 0.0500 time 0.2337 (0.2408) data time 0.0012 (0.0016) model time 0.2325 (0.2390) loss 3.5555 (3.1616) grad_norm 2.6849 (nan) loss_scale 2048.0000 
(2375.7680) mem 7382MB [2024-08-27 10:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][940/1251] eta 0:01:14 lr 0.000495 wd 0.0500 time 0.2313 (0.2407) data time 0.0008 (0.0016) model time 0.2304 (0.2389) loss 3.3065 (3.1660) grad_norm 1.7286 (nan) loss_scale 2048.0000 (2372.2848) mem 7382MB [2024-08-27 10:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][950/1251] eta 0:01:12 lr 0.000495 wd 0.0500 time 0.2362 (0.2407) data time 0.0010 (0.0016) model time 0.2353 (0.2389) loss 2.5364 (3.1642) grad_norm 3.0371 (nan) loss_scale 2048.0000 (2368.8749) mem 7382MB [2024-08-27 10:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][960/1251] eta 0:01:10 lr 0.000495 wd 0.0500 time 0.4685 (0.2411) data time 0.0008 (0.0016) model time 0.4678 (0.2394) loss 3.4237 (3.1653) grad_norm 2.3950 (nan) loss_scale 2048.0000 (2365.5359) mem 7382MB [2024-08-27 10:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][970/1251] eta 0:01:07 lr 0.000495 wd 0.0500 time 0.2388 (0.2413) data time 0.0011 (0.0016) model time 0.2377 (0.2396) loss 3.0736 (3.1655) grad_norm 2.5300 (nan) loss_scale 2048.0000 (2362.2657) mem 7382MB [2024-08-27 10:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][980/1251] eta 0:01:05 lr 0.000495 wd 0.0500 time 0.2218 (0.2415) data time 0.0012 (0.0016) model time 0.2207 (0.2398) loss 3.5500 (3.1664) grad_norm 2.2146 (nan) loss_scale 2048.0000 (2359.0622) mem 7382MB [2024-08-27 10:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][990/1251] eta 0:01:03 lr 0.000495 wd 0.0500 time 0.2362 (0.2415) data time 0.0009 (0.0016) model time 0.2354 (0.2398) loss 1.8934 (3.1638) grad_norm 2.2007 (nan) loss_scale 2048.0000 (2355.9233) mem 7382MB [2024-08-27 10:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1000/1251] eta 0:01:00 lr 0.000495 wd 0.0500 time 0.2436 (0.2414) data time 0.0011 (0.0016) model time 
0.2425 (0.2397) loss 3.3271 (3.1641) grad_norm 2.8402 (nan) loss_scale 2048.0000 (2352.8472) mem 7382MB [2024-08-27 10:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1010/1251] eta 0:00:58 lr 0.000495 wd 0.0500 time 0.2421 (0.2414) data time 0.0011 (0.0016) model time 0.2410 (0.2397) loss 3.1351 (3.1635) grad_norm 2.4936 (nan) loss_scale 2048.0000 (2349.8318) mem 7382MB [2024-08-27 10:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1020/1251] eta 0:00:55 lr 0.000495 wd 0.0500 time 0.2332 (0.2414) data time 0.0010 (0.0016) model time 0.2322 (0.2397) loss 3.4738 (3.1609) grad_norm 3.1744 (nan) loss_scale 2048.0000 (2346.8756) mem 7382MB [2024-08-27 10:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1030/1251] eta 0:00:53 lr 0.000495 wd 0.0500 time 0.2433 (0.2414) data time 0.0009 (0.0016) model time 0.2424 (0.2397) loss 3.4905 (3.1591) grad_norm 1.9208 (nan) loss_scale 2048.0000 (2343.9767) mem 7382MB [2024-08-27 10:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1040/1251] eta 0:00:50 lr 0.000495 wd 0.0500 time 0.2480 (0.2414) data time 0.0008 (0.0016) model time 0.2472 (0.2397) loss 3.9051 (3.1588) grad_norm 2.0014 (nan) loss_scale 2048.0000 (2341.1335) mem 7382MB [2024-08-27 10:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1050/1251] eta 0:00:48 lr 0.000495 wd 0.0500 time 0.2403 (0.2413) data time 0.0011 (0.0016) model time 0.2393 (0.2396) loss 2.9102 (3.1623) grad_norm 2.5859 (nan) loss_scale 2048.0000 (2338.3444) mem 7382MB [2024-08-27 10:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1060/1251] eta 0:00:46 lr 0.000495 wd 0.0500 time 0.2486 (0.2413) data time 0.0008 (0.0016) model time 0.2478 (0.2396) loss 3.6565 (3.1660) grad_norm 4.0993 (nan) loss_scale 2048.0000 (2335.6079) mem 7382MB [2024-08-27 10:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1070/1251] eta 
0:00:43 lr 0.000495 wd 0.0500 time 0.2325 (0.2413) data time 0.0008 (0.0016) model time 0.2317 (0.2396) loss 3.5367 (3.1660) grad_norm 3.2798 (nan) loss_scale 1024.0000 (2323.3613) mem 7382MB [2024-08-27 10:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1080/1251] eta 0:00:41 lr 0.000495 wd 0.0500 time 0.2382 (0.2413) data time 0.0009 (0.0015) model time 0.2373 (0.2396) loss 3.8291 (3.1671) grad_norm 2.6998 (nan) loss_scale 1024.0000 (2311.3414) mem 7382MB [2024-08-27 10:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1090/1251] eta 0:00:38 lr 0.000495 wd 0.0500 time 0.2537 (0.2412) data time 0.0010 (0.0015) model time 0.2526 (0.2396) loss 3.7456 (3.1688) grad_norm 2.8055 (nan) loss_scale 1024.0000 (2299.5417) mem 7382MB [2024-08-27 10:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1100/1251] eta 0:00:36 lr 0.000495 wd 0.0500 time 0.2328 (0.2412) data time 0.0009 (0.0015) model time 0.2319 (0.2395) loss 3.3099 (3.1709) grad_norm 1.8449 (nan) loss_scale 1024.0000 (2287.9564) mem 7382MB [2024-08-27 10:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1110/1251] eta 0:00:34 lr 0.000495 wd 0.0500 time 0.2345 (0.2412) data time 0.0012 (0.0015) model time 0.2333 (0.2395) loss 3.4492 (3.1704) grad_norm 3.4232 (nan) loss_scale 1024.0000 (2276.5797) mem 7382MB [2024-08-27 10:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1120/1251] eta 0:00:31 lr 0.000494 wd 0.0500 time 0.2347 (0.2411) data time 0.0009 (0.0015) model time 0.2338 (0.2395) loss 3.7106 (3.1722) grad_norm 3.7452 (nan) loss_scale 1024.0000 (2265.4059) mem 7382MB [2024-08-27 10:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1130/1251] eta 0:00:29 lr 0.000494 wd 0.0500 time 0.2393 (0.2411) data time 0.0009 (0.0015) model time 0.2384 (0.2395) loss 3.6563 (3.1712) grad_norm 2.6017 (nan) loss_scale 1024.0000 (2254.4297) mem 7382MB [2024-08-27 
10:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1140/1251] eta 0:00:26 lr 0.000494 wd 0.0500 time 0.2374 (0.2411) data time 0.0008 (0.0015) model time 0.2366 (0.2394) loss 2.5447 (3.1706) grad_norm 2.5561 (nan) loss_scale 1024.0000 (2243.6459) mem 7382MB [2024-08-27 10:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1150/1251] eta 0:00:24 lr 0.000494 wd 0.0500 time 0.2574 (0.2411) data time 0.0009 (0.0015) model time 0.2565 (0.2394) loss 2.3589 (3.1690) grad_norm 2.5179 (nan) loss_scale 1024.0000 (2233.0495) mem 7382MB [2024-08-27 10:18:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1160/1251] eta 0:00:21 lr 0.000494 wd 0.0500 time 0.2393 (0.2411) data time 0.0009 (0.0015) model time 0.2384 (0.2394) loss 3.2674 (3.1696) grad_norm 2.1062 (nan) loss_scale 1024.0000 (2222.6357) mem 7382MB [2024-08-27 10:18:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1170/1251] eta 0:00:19 lr 0.000494 wd 0.0500 time 0.2419 (0.2410) data time 0.0011 (0.0015) model time 0.2408 (0.2394) loss 3.2250 (3.1703) grad_norm 2.0847 (nan) loss_scale 1024.0000 (2212.3997) mem 7382MB [2024-08-27 10:19:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1180/1251] eta 0:00:17 lr 0.000494 wd 0.0500 time 0.2408 (0.2410) data time 0.0008 (0.0015) model time 0.2400 (0.2394) loss 3.4648 (3.1723) grad_norm 2.3895 (nan) loss_scale 1024.0000 (2202.3370) mem 7382MB [2024-08-27 10:19:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1190/1251] eta 0:00:14 lr 0.000494 wd 0.0500 time 0.2417 (0.2410) data time 0.0010 (0.0015) model time 0.2407 (0.2394) loss 3.1451 (3.1731) grad_norm 2.9046 (nan) loss_scale 1024.0000 (2192.4433) mem 7382MB [2024-08-27 10:19:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1200/1251] eta 0:00:12 lr 0.000494 wd 0.0500 time 0.2342 (0.2409) data time 0.0011 (0.0015) model time 0.2331 (0.2393) loss 2.1258 
(3.1733) grad_norm 2.9555 (nan) loss_scale 1024.0000 (2182.7144) mem 7382MB [2024-08-27 10:19:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1210/1251] eta 0:00:09 lr 0.000494 wd 0.0500 time 0.2399 (0.2409) data time 0.0009 (0.0015) model time 0.2390 (0.2393) loss 3.4262 (3.1719) grad_norm 3.6965 (nan) loss_scale 1024.0000 (2173.1462) mem 7382MB [2024-08-27 10:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1220/1251] eta 0:00:07 lr 0.000494 wd 0.0500 time 0.2394 (0.2409) data time 0.0010 (0.0015) model time 0.2384 (0.2393) loss 3.1436 (3.1745) grad_norm 2.2546 (nan) loss_scale 1024.0000 (2163.7346) mem 7382MB [2024-08-27 10:19:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1230/1251] eta 0:00:05 lr 0.000494 wd 0.0500 time 0.2380 (0.2409) data time 0.0008 (0.0015) model time 0.2372 (0.2393) loss 3.3443 (3.1730) grad_norm 2.1981 (nan) loss_scale 1024.0000 (2154.4760) mem 7382MB [2024-08-27 10:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1240/1251] eta 0:00:02 lr 0.000494 wd 0.0500 time 0.2266 (0.2408) data time 0.0008 (0.0015) model time 0.2258 (0.2392) loss 3.0280 (3.1732) grad_norm 2.7758 (nan) loss_scale 1024.0000 (2145.3666) mem 7382MB [2024-08-27 10:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [161/300][1250/1251] eta 0:00:00 lr 0.000494 wd 0.0500 time 0.2289 (0.2407) data time 0.0007 (0.0015) model time 0.2282 (0.2391) loss 3.7443 (3.1762) grad_norm 1.7821 (nan) loss_scale 1024.0000 (2136.4029) mem 7382MB [2024-08-27 10:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 161 training takes 0:05:01 [2024-08-27 10:19:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:19:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 10:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.464 (0.464) Loss 0.4646 (0.4646) Acc@1 91.797 (91.797) Acc@5 98.438 (98.438) Mem 7382MB [2024-08-27 10:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.112) Loss 0.7422 (0.7184) Acc@1 84.961 (84.792) Acc@5 95.703 (96.822) Mem 7382MB [2024-08-27 10:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.095) Loss 1.0518 (0.7400) Acc@1 74.707 (83.850) Acc@5 93.457 (96.819) Mem 7382MB [2024-08-27 10:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 1.1875 (0.8348) Acc@1 72.070 (81.499) Acc@5 91.211 (95.697) Mem 7382MB [2024-08-27 10:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.1846 (0.8902) Acc@1 72.461 (80.059) Acc@5 92.383 (95.153) Mem 7382MB [2024-08-27 10:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.592 Acc@5 95.062 [2024-08-27 10:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.6% [2024-08-27 10:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.59% [2024-08-27 10:19:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 10:19:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 10:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.419 (0.419) Loss 0.4033 (0.4033) Acc@1 92.871 (92.871) Acc@5 98.633 (98.633) Mem 7382MB [2024-08-27 10:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.112) Loss 0.6313 (0.6309) Acc@1 87.402 (86.426) Acc@5 97.168 (97.354) Mem 7382MB [2024-08-27 10:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.095) Loss 0.9014 (0.6558) Acc@1 78.516 (85.524) Acc@5 95.605 (97.400) Mem 7382MB [2024-08-27 10:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.089) Loss 1.1318 (0.7444) Acc@1 72.461 (83.408) Acc@5 92.871 (96.443) Mem 7382MB [2024-08-27 10:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0303 (0.7912) Acc@1 74.609 (82.034) Acc@5 93.555 (95.965) Mem 7382MB [2024-08-27 10:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.610 Acc@5 95.948 [2024-08-27 10:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.6% [2024-08-27 10:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.61% [2024-08-27 10:19:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 10:19:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 10:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][0/1251] eta 0:13:21 lr 0.000494 wd 0.0500 time 0.6405 (0.6405) data time 0.4094 (0.4094) model time 0.0000 (0.0000) loss 2.2680 (2.2680) grad_norm 2.8779 (2.8779) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][10/1251] eta 0:05:43 lr 0.000494 wd 0.0500 time 0.2373 (0.2764) data time 0.0012 (0.0382) model time 0.0000 (0.0000) loss 2.8744 (3.1847) grad_norm 2.5551 (2.4566) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][20/1251] eta 0:05:18 lr 0.000494 wd 0.0500 time 0.2395 (0.2586) data time 0.0011 (0.0206) model time 0.0000 (0.0000) loss 3.2502 (3.3171) grad_norm 1.9453 (2.6812) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][30/1251] eta 0:05:07 lr 0.000494 wd 0.0500 time 0.2388 (0.2520) data time 0.0010 (0.0143) model time 0.0000 (0.0000) loss 2.7404 (3.2798) grad_norm 2.9786 (2.7786) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][40/1251] eta 0:05:01 lr 0.000494 wd 0.0500 time 0.2421 (0.2490) data time 0.0009 (0.0111) model time 0.0000 (0.0000) loss 4.0193 (3.2678) grad_norm 2.3719 (2.8375) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][50/1251] eta 0:04:56 lr 0.000494 wd 0.0500 time 0.2510 (0.2471) data time 0.0008 (0.0091) model time 0.0000 (0.0000) loss 3.2274 (3.2717) grad_norm 2.9098 (2.8553) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][60/1251] eta 0:04:52 lr 0.000494 wd 0.0500 time 0.2353 (0.2456) data time 0.0008 (0.0078) model time 0.2346 
(0.2368) loss 3.6994 (3.2444) grad_norm 2.7795 (2.7663) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][70/1251] eta 0:04:48 lr 0.000494 wd 0.0500 time 0.2430 (0.2445) data time 0.0007 (0.0068) model time 0.2423 (0.2369) loss 2.2820 (3.2279) grad_norm 3.1463 (2.7243) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][80/1251] eta 0:04:45 lr 0.000494 wd 0.0500 time 0.2354 (0.2437) data time 0.0011 (0.0061) model time 0.2343 (0.2369) loss 3.7051 (3.2181) grad_norm 3.3555 (2.6891) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][90/1251] eta 0:04:45 lr 0.000493 wd 0.0500 time 0.2432 (0.2456) data time 0.0011 (0.0056) model time 0.2421 (0.2426) loss 3.4837 (3.2107) grad_norm 1.9800 (2.6899) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][100/1251] eta 0:04:42 lr 0.000493 wd 0.0500 time 0.2351 (0.2453) data time 0.0011 (0.0051) model time 0.2339 (0.2424) loss 3.5563 (3.2324) grad_norm 3.3376 (2.7206) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][110/1251] eta 0:04:39 lr 0.000493 wd 0.0500 time 0.2333 (0.2447) data time 0.0013 (0.0048) model time 0.2320 (0.2415) loss 3.5794 (3.2163) grad_norm 2.5189 (2.6943) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][120/1251] eta 0:04:36 lr 0.000493 wd 0.0500 time 0.2419 (0.2443) data time 0.0009 (0.0045) model time 0.2410 (0.2413) loss 3.8479 (3.1979) grad_norm 2.5473 (2.6729) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][130/1251] 
eta 0:04:33 lr 0.000493 wd 0.0500 time 0.2370 (0.2440) data time 0.0008 (0.0042) model time 0.2362 (0.2410) loss 2.7354 (3.1953) grad_norm 2.0639 (2.6767) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][140/1251] eta 0:04:30 lr 0.000493 wd 0.0500 time 0.2383 (0.2436) data time 0.0011 (0.0040) model time 0.2372 (0.2405) loss 3.3270 (3.1986) grad_norm 2.0726 (2.6818) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][150/1251] eta 0:04:27 lr 0.000493 wd 0.0500 time 0.2425 (0.2432) data time 0.0009 (0.0038) model time 0.2417 (0.2402) loss 2.0763 (3.1850) grad_norm 3.0245 (2.6540) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][160/1251] eta 0:04:24 lr 0.000493 wd 0.0500 time 0.2339 (0.2429) data time 0.0010 (0.0036) model time 0.2329 (0.2398) loss 3.2527 (3.1812) grad_norm 2.7295 (2.6643) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][170/1251] eta 0:04:22 lr 0.000493 wd 0.0500 time 0.2315 (0.2425) data time 0.0010 (0.0035) model time 0.2305 (0.2395) loss 3.0655 (3.1821) grad_norm 3.3757 (2.6735) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][180/1251] eta 0:04:19 lr 0.000493 wd 0.0500 time 0.2418 (0.2425) data time 0.0010 (0.0033) model time 0.2408 (0.2397) loss 3.1039 (3.1777) grad_norm 3.2295 (2.6733) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][190/1251] eta 0:04:17 lr 0.000493 wd 0.0500 time 0.2374 (0.2422) data time 0.0012 (0.0032) model time 0.2362 (0.2394) loss 3.6107 (3.1930) grad_norm 4.3459 (2.7368) loss_scale 1024.0000 (1024.0000) mem 7382MB 
[2024-08-27 10:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][200/1251] eta 0:04:14 lr 0.000493 wd 0.0500 time 0.2359 (0.2422) data time 0.0009 (0.0031) model time 0.2350 (0.2395) loss 2.2005 (3.1847) grad_norm 2.5541 (2.7186) loss_scale 1024.0000 (1024.0000) mem 7382MB [2024-08-27 10:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 10:20:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:20:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 10:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 10:27:34 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 10:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 10:27:43 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 10:27:45 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 10:27:46 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 10:27:46 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 162) [2024-08-27 10:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 10:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][210/1251] eta 0:31:11 lr 0.000493 wd 0.0500 time 0.2362 (1.7977) data time 0.0008 (0.0714) model time 0.2354 (1.7264) loss 3.6722 (3.6532) grad_norm 2.9752 (2.3665) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][220/1251] eta 0:16:48 lr 0.000493 wd 0.0500 time 0.2339 (0.9784) data time 0.0011 (0.0344) model time 0.2328 (0.9440) loss 3.3643 (3.4681) grad_norm 2.1375 (2.4315) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][230/1251] eta 0:12:18 lr 0.000493 wd 0.0500 time 0.2403 (0.7234) data time 0.0007 (0.0229) model time 0.2395 (0.7005) loss 3.6092 (3.4944) grad_norm 2.2696 (2.5009) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][240/1251] eta 0:10:06 lr 0.000493 wd 0.0500 time 0.2423 (0.6000) data time 0.0010 (0.0173) model time 0.2413 (0.5827) loss 2.9853 (3.4198) grad_norm 2.0803 (2.4930) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][250/1251] eta 0:08:46 lr 0.000493 wd 0.0500 time 0.2407 (0.5264) data time 0.0010 (0.0140) model time 0.2398 (0.5124) loss 3.2884 (3.3994) grad_norm 2.1619 (2.5241) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [162/300][260/1251] eta 0:07:54 lr 0.000493 wd 0.0500 time 0.2426 (0.4792) data time 0.0008 (0.0118) model time 0.2417 (0.4674) loss 2.2433 (3.3381) grad_norm 2.1259 (2.5581) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][270/1251] eta 0:07:16 lr 0.000493 wd 0.0500 time 0.2452 (0.4445) data time 0.0010 (0.0102) model time 0.2442 (0.4343) loss 3.6459 (3.3258) grad_norm 2.4147 (2.5096) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][280/1251] eta 0:06:46 lr 0.000493 wd 0.0500 time 0.2360 (0.4186) data time 0.0010 (0.0091) model time 0.2350 (0.4096) loss 3.4207 (3.2993) grad_norm 3.0278 (2.6134) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][290/1251] eta 0:06:23 lr 0.000493 wd 0.0500 time 0.2431 (0.3988) data time 0.0009 (0.0082) model time 0.2422 (0.3906) loss 3.3687 (3.2675) grad_norm 2.5138 (2.6251) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][300/1251] eta 0:06:04 lr 0.000493 wd 0.0500 time 0.2437 (0.3830) data time 0.0010 (0.0074) model time 0.2427 (0.3755) loss 3.1631 (3.2740) grad_norm 2.5204 (2.6197) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][310/1251] eta 0:05:48 lr 0.000493 wd 0.0500 time 0.2371 (0.3703) data time 0.0008 (0.0069) model time 0.2364 (0.3635) loss 3.6159 (3.2841) grad_norm 2.4296 (2.5746) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][320/1251] eta 0:05:34 lr 0.000492 wd 0.0500 time 0.2469 (0.3597) data time 0.0009 (0.0064) model time 0.2459 (0.3533) loss 3.4335 (3.2768) grad_norm 1.8081 (2.5489) loss_scale 
1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][330/1251] eta 0:05:22 lr 0.000492 wd 0.0500 time 0.2343 (0.3504) data time 0.0009 (0.0059) model time 0.2334 (0.3444) loss 2.9435 (3.2603) grad_norm 2.6949 (2.5343) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][340/1251] eta 0:05:12 lr 0.000492 wd 0.0500 time 0.2355 (0.3425) data time 0.0012 (0.0056) model time 0.2343 (0.3369) loss 3.8502 (3.2681) grad_norm 2.5838 (2.5391) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][350/1251] eta 0:05:02 lr 0.000492 wd 0.0500 time 0.2495 (0.3356) data time 0.0008 (0.0053) model time 0.2487 (0.3304) loss 2.9288 (3.2534) grad_norm 3.8424 (2.5326) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][360/1251] eta 0:04:53 lr 0.000492 wd 0.0500 time 0.2411 (0.3297) data time 0.0008 (0.0050) model time 0.2403 (0.3247) loss 4.0480 (3.2507) grad_norm 2.7126 (2.5293) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][370/1251] eta 0:04:46 lr 0.000492 wd 0.0500 time 0.2409 (0.3246) data time 0.0010 (0.0048) model time 0.2399 (0.3199) loss 3.2267 (3.2480) grad_norm 2.5235 (2.5412) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][380/1251] eta 0:04:38 lr 0.000492 wd 0.0500 time 0.2305 (0.3199) data time 0.0007 (0.0046) model time 0.2298 (0.3153) loss 2.9509 (3.2339) grad_norm 2.9348 (2.5560) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][390/1251] eta 0:04:31 lr 0.000492 wd 0.0500 time 0.2393 (0.3157) data time 
0.0008 (0.0044) model time 0.2385 (0.3113) loss 3.4287 (3.2331) grad_norm 2.8565 (2.5801) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][400/1251] eta 0:04:25 lr 0.000492 wd 0.0500 time 0.2340 (0.3119) data time 0.0010 (0.0042) model time 0.2330 (0.3077) loss 2.7874 (3.2152) grad_norm 2.0848 (2.6021) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][410/1251] eta 0:04:19 lr 0.000492 wd 0.0500 time 0.2401 (0.3084) data time 0.0008 (0.0041) model time 0.2394 (0.3043) loss 3.3667 (3.2059) grad_norm 2.4987 (2.5854) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][420/1251] eta 0:04:13 lr 0.000492 wd 0.0500 time 0.2383 (0.3052) data time 0.0009 (0.0039) model time 0.2374 (0.3013) loss 3.7122 (3.2030) grad_norm 2.2818 (2.6016) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][430/1251] eta 0:04:08 lr 0.000492 wd 0.0500 time 0.2400 (0.3024) data time 0.0007 (0.0039) model time 0.2392 (0.2985) loss 2.4230 (3.2035) grad_norm 1.7840 (2.6238) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][440/1251] eta 0:04:03 lr 0.000492 wd 0.0500 time 0.2349 (0.2998) data time 0.0008 (0.0037) model time 0.2340 (0.2961) loss 2.4716 (3.1989) grad_norm 2.8217 (2.6197) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][450/1251] eta 0:03:58 lr 0.000492 wd 0.0500 time 0.2366 (0.2974) data time 0.0007 (0.0036) model time 0.2359 (0.2938) loss 3.4862 (3.1949) grad_norm 4.1487 (2.6262) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [162/300][460/1251] eta 0:03:53 lr 0.000492 wd 0.0500 time 0.2407 (0.2952) data time 0.0010 (0.0035) model time 0.2397 (0.2917) loss 3.4290 (3.1830) grad_norm 1.9660 (2.6243) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][470/1251] eta 0:03:48 lr 0.000492 wd 0.0500 time 0.2364 (0.2932) data time 0.0009 (0.0034) model time 0.2355 (0.2897) loss 1.9506 (3.1704) grad_norm 2.0068 (2.6167) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][480/1251] eta 0:03:44 lr 0.000492 wd 0.0500 time 0.2344 (0.2912) data time 0.0010 (0.0033) model time 0.2334 (0.2879) loss 3.4458 (3.1776) grad_norm 1.9454 (2.6043) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][490/1251] eta 0:03:40 lr 0.000492 wd 0.0500 time 0.2388 (0.2902) data time 0.0010 (0.0033) model time 0.2378 (0.2870) loss 2.8517 (3.1750) grad_norm 4.0556 (2.6029) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][500/1251] eta 0:03:36 lr 0.000492 wd 0.0500 time 0.2394 (0.2885) data time 0.0010 (0.0032) model time 0.2384 (0.2854) loss 3.4460 (3.1619) grad_norm 2.3043 (2.5987) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][510/1251] eta 0:03:33 lr 0.000492 wd 0.0500 time 0.2473 (0.2878) data time 0.0010 (0.0031) model time 0.2463 (0.2847) loss 3.1254 (3.1593) grad_norm 2.7457 (2.6004) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][520/1251] eta 0:03:29 lr 0.000492 wd 0.0500 time 0.2378 (0.2864) data time 0.0009 (0.0031) model time 0.2369 (0.2833) loss 3.7805 (3.1706) grad_norm 3.3833 (2.5939) 
loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][530/1251] eta 0:03:25 lr 0.000492 wd 0.0500 time 0.2355 (0.2850) data time 0.0009 (0.0030) model time 0.2345 (0.2820) loss 1.9963 (3.1733) grad_norm 1.6668 (2.5882) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][540/1251] eta 0:03:21 lr 0.000491 wd 0.0500 time 0.2413 (0.2837) data time 0.0012 (0.0029) model time 0.2401 (0.2808) loss 3.1769 (3.1729) grad_norm 3.1187 (2.5859) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][550/1251] eta 0:03:17 lr 0.000491 wd 0.0500 time 0.2439 (0.2824) data time 0.0009 (0.0029) model time 0.2430 (0.2796) loss 3.8096 (3.1765) grad_norm 3.8455 (2.5978) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][560/1251] eta 0:03:14 lr 0.000491 wd 0.0500 time 0.2430 (0.2813) data time 0.0009 (0.0028) model time 0.2421 (0.2785) loss 2.8862 (3.1768) grad_norm 2.0494 (2.5934) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][570/1251] eta 0:03:10 lr 0.000491 wd 0.0500 time 0.2448 (0.2802) data time 0.0010 (0.0028) model time 0.2438 (0.2774) loss 3.5801 (3.1777) grad_norm 2.6484 (2.5936) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][580/1251] eta 0:03:07 lr 0.000491 wd 0.0500 time 0.2411 (0.2792) data time 0.0008 (0.0027) model time 0.2403 (0.2764) loss 3.6441 (3.1770) grad_norm 2.4148 (2.5939) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][590/1251] eta 0:03:03 lr 0.000491 wd 0.0500 time 0.2444 (0.2781) 
data time 0.0007 (0.0027) model time 0.2437 (0.2754) loss 3.1789 (3.1718) grad_norm 2.4510 (2.6405) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][600/1251] eta 0:03:00 lr 0.000491 wd 0.0500 time 0.2413 (0.2772) data time 0.0009 (0.0027) model time 0.2404 (0.2745) loss 3.7213 (3.1739) grad_norm 3.6191 (2.6421) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][610/1251] eta 0:02:57 lr 0.000491 wd 0.0500 time 0.2423 (0.2763) data time 0.0010 (0.0026) model time 0.2413 (0.2737) loss 3.2864 (3.1758) grad_norm 3.0481 (2.6428) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][620/1251] eta 0:02:53 lr 0.000491 wd 0.0500 time 0.2413 (0.2755) data time 0.0008 (0.0026) model time 0.2405 (0.2729) loss 2.6030 (3.1767) grad_norm 2.0617 (2.6409) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][630/1251] eta 0:02:50 lr 0.000491 wd 0.0500 time 0.2399 (0.2747) data time 0.0008 (0.0025) model time 0.2391 (0.2721) loss 3.2891 (3.1812) grad_norm 2.2009 (2.6380) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][640/1251] eta 0:02:47 lr 0.000491 wd 0.0500 time 0.2419 (0.2739) data time 0.0011 (0.0025) model time 0.2407 (0.2714) loss 3.2500 (3.1846) grad_norm 2.3809 (2.6393) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 10:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 10:29:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 10:29:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 10:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 10:31:42 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 10:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 10:31:54 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 10:31:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 10:31:57 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 10:31:57 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 162) [2024-08-27 10:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 10:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][650/1251] eta 0:32:46 lr 0.000491 wd 0.0500 time 0.2239 (3.2724) data time 0.0006 (0.1743) model time 0.2233 (3.0980) loss 3.6010 (3.7073) grad_norm 2.2980 (2.9395) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][660/1251] eta 0:10:48 lr 0.000491 wd 0.0500 time 0.2289 (1.0965) data time 0.0007 (0.0505) model time 0.2282 (1.0461) loss 3.3349 (3.4488) grad_norm 2.1057 (3.0458) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [162/300][670/1251] eta 0:07:06 lr 0.000491 wd 0.0500 time 0.2279 (0.7339) data time 0.0010 (0.0299) model time 0.2268 (0.7040) loss 3.2549 (3.3813) grad_norm 2.1667 (2.7193) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][680/1251] eta 0:05:33 lr 0.000491 wd 0.0500 time 0.2335 (0.5848) data time 0.0006 (0.0213) model time 0.2329 (0.5634) loss 2.6574 (3.3833) grad_norm 3.2794 (2.5988) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][690/1251] eta 0:04:42 lr 0.000491 wd 0.0500 time 0.2247 (0.5034) data time 0.0006 (0.0167) model time 0.2241 (0.4867) loss 3.4291 (3.3419) grad_norm 2.1778 (2.5428) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][700/1251] eta 0:04:09 lr 0.000491 wd 0.0500 time 0.2300 (0.4520) data time 0.0007 (0.0138) model time 0.2293 (0.4382) loss 3.8636 (3.3399) grad_norm 2.3509 (2.5480) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][710/1251] eta 0:03:45 lr 0.000491 wd 0.0500 time 0.2229 (0.4167) data time 0.0007 (0.0118) model time 0.2221 (0.4050) loss 3.6257 (3.3042) grad_norm 2.3648 (2.6351) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][720/1251] eta 0:03:27 lr 0.000491 wd 0.0500 time 0.2242 (0.3911) data time 0.0008 (0.0104) model time 0.2234 (0.3807) loss 3.4632 (3.2802) grad_norm 6.0513 (2.6637) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][730/1251] eta 0:03:13 lr 0.000491 wd 0.0500 time 0.2259 (0.3714) data time 0.0009 (0.0092) model time 0.2250 (0.3621) loss 3.0966 (3.2465) grad_norm 3.9883 (2.7471) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][740/1251] eta 0:03:01 lr 0.000491 wd 0.0500 time 0.2278 (0.3560) data time 0.0010 (0.0084) model time 0.2267 (0.3476) loss 3.2381 (3.2537) grad_norm 2.2097 (2.7836) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][750/1251] eta 0:02:52 lr 0.000491 wd 0.0500 time 0.2275 (0.3435) data time 0.0010 (0.0077) model time 0.2265 (0.3358) loss 3.2078 (3.2752) grad_norm 2.2218 (2.7585) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][760/1251] eta 0:02:43 lr 0.000491 wd 0.0500 time 0.2294 (0.3332) data time 0.0010 (0.0071) model time 0.2284 (0.3262) loss 3.4636 (3.2643) grad_norm 2.5556 (2.7226) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][770/1251] eta 0:02:36 lr 0.000490 wd 0.0500 time 0.2258 (0.3247) data time 0.0010 (0.0066) model time 0.2248 (0.3181) loss 2.6634 (3.2642) grad_norm 2.0288 (2.6778) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][780/1251] eta 0:02:29 lr 0.000490 wd 0.0500 time 0.2280 (0.3175) data time 0.0009 (0.0062) model time 0.2271 (0.3113) loss 3.5559 (3.2592) grad_norm 1.7333 (2.6587) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][790/1251] eta 0:02:23 lr 0.000490 wd 0.0500 time 0.2219 (0.3111) data time 0.0007 (0.0058) model time 0.2212 (0.3053) loss 3.4919 (3.2491) grad_norm 2.7493 (2.6429) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][800/1251] eta 0:02:17 lr 0.000490 wd 0.0500 time 0.2242 (0.3056) data time 
0.0007 (0.0055) model time 0.2235 (0.3001) loss 3.0735 (3.2450) grad_norm 2.0314 (2.6274) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][810/1251] eta 0:02:12 lr 0.000490 wd 0.0500 time 0.2223 (0.3007) data time 0.0008 (0.0052) model time 0.2215 (0.2955) loss 3.0371 (3.2470) grad_norm 1.9266 (2.6100) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][820/1251] eta 0:02:07 lr 0.000490 wd 0.0500 time 0.2348 (0.2965) data time 0.0009 (0.0050) model time 0.2339 (0.2915) loss 1.9684 (3.2343) grad_norm 2.5380 (2.6238) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][830/1251] eta 0:02:03 lr 0.000490 wd 0.0500 time 0.2224 (0.2926) data time 0.0007 (0.0048) model time 0.2216 (0.2879) loss 2.6829 (3.2280) grad_norm 4.2311 (2.6449) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][840/1251] eta 0:01:58 lr 0.000490 wd 0.0500 time 0.2234 (0.2892) data time 0.0007 (0.0046) model time 0.2226 (0.2846) loss 3.1447 (3.2235) grad_norm 3.7722 (2.6573) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][850/1251] eta 0:01:54 lr 0.000490 wd 0.0500 time 0.2330 (0.2862) data time 0.0008 (0.0044) model time 0.2321 (0.2819) loss 3.2762 (3.2091) grad_norm 2.2753 (2.6537) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][860/1251] eta 0:01:50 lr 0.000490 wd 0.0500 time 0.2218 (0.2836) data time 0.0008 (0.0042) model time 0.2211 (0.2794) loss 3.3553 (3.2012) grad_norm 3.0988 (2.6500) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [162/300][870/1251] eta 0:01:47 lr 0.000490 wd 0.0500 time 0.2225 (0.2810) data time 0.0008 (0.0041) model time 0.2217 (0.2770) loss 3.5921 (3.2009) grad_norm 2.2712 (2.6410) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][880/1251] eta 0:01:43 lr 0.000490 wd 0.0500 time 0.2225 (0.2786) data time 0.0008 (0.0039) model time 0.2218 (0.2747) loss 1.9304 (3.1863) grad_norm 2.9148 (2.6440) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][890/1251] eta 0:01:39 lr 0.000490 wd 0.0500 time 0.2216 (0.2764) data time 0.0007 (0.0038) model time 0.2209 (0.2726) loss 2.3383 (3.1902) grad_norm 2.5172 (2.6623) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][900/1251] eta 0:01:36 lr 0.000490 wd 0.0500 time 0.2212 (0.2744) data time 0.0007 (0.0037) model time 0.2205 (0.2707) loss 2.3916 (3.1827) grad_norm 2.4837 (2.6812) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][910/1251] eta 0:01:32 lr 0.000490 wd 0.0500 time 0.2228 (0.2726) data time 0.0007 (0.0036) model time 0.2221 (0.2690) loss 3.4054 (3.1774) grad_norm 2.5334 (2.6715) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][920/1251] eta 0:01:29 lr 0.000490 wd 0.0500 time 0.2223 (0.2709) data time 0.0009 (0.0035) model time 0.2214 (0.2674) loss 3.3534 (3.1747) grad_norm 1.9979 (2.6559) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][930/1251] eta 0:01:26 lr 0.000490 wd 0.0500 time 0.2314 (0.2694) data time 0.0007 (0.0034) model time 0.2307 (0.2660) loss 2.4413 (3.1681) grad_norm 3.0314 (2.6480) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][940/1251] eta 0:01:23 lr 0.000490 wd 0.0500 time 0.2287 (0.2687) data time 0.0007 (0.0033) model time 0.2280 (0.2653) loss 2.7902 (3.1632) grad_norm 2.2445 (2.6329) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][950/1251] eta 0:01:20 lr 0.000490 wd 0.0500 time 0.2245 (0.2674) data time 0.0009 (0.0033) model time 0.2236 (0.2642) loss 3.7072 (3.1571) grad_norm 3.6479 (2.6438) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][960/1251] eta 0:01:17 lr 0.000490 wd 0.0500 time 0.2231 (0.2669) data time 0.0008 (0.0032) model time 0.2223 (0.2636) loss 3.2324 (3.1582) grad_norm 2.3757 (2.6536) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][970/1251] eta 0:01:14 lr 0.000490 wd 0.0500 time 0.2292 (0.2656) data time 0.0007 (0.0032) model time 0.2285 (0.2625) loss 3.7610 (3.1653) grad_norm 2.6510 (2.6438) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][980/1251] eta 0:01:11 lr 0.000490 wd 0.0500 time 0.2194 (0.2645) data time 0.0007 (0.0031) model time 0.2187 (0.2614) loss 3.1134 (3.1651) grad_norm 2.0592 (2.6368) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][990/1251] eta 0:01:08 lr 0.000489 wd 0.0500 time 0.2266 (0.2634) data time 0.0009 (0.0030) model time 0.2257 (0.2604) loss 3.7897 (3.1709) grad_norm 7.5750 (2.6477) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1000/1251] eta 0:01:05 lr 0.000489 wd 0.0500 time 0.2251 
(0.2624) data time 0.0008 (0.0030) model time 0.2243 (0.2594) loss 3.5718 (3.1741) grad_norm 2.8662 (2.6518) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1010/1251] eta 0:01:03 lr 0.000489 wd 0.0500 time 0.2272 (0.2615) data time 0.0008 (0.0029) model time 0.2265 (0.2585) loss 2.3135 (3.1731) grad_norm 2.5183 (2.6647) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1020/1251] eta 0:01:00 lr 0.000489 wd 0.0500 time 0.2208 (0.2605) data time 0.0009 (0.0029) model time 0.2200 (0.2576) loss 3.0106 (3.1717) grad_norm 2.7334 (2.6589) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1030/1251] eta 0:00:57 lr 0.000489 wd 0.0500 time 0.2271 (0.2596) data time 0.0007 (0.0028) model time 0.2264 (0.2568) loss 2.4519 (3.1661) grad_norm 2.2910 (2.6569) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1040/1251] eta 0:00:54 lr 0.000489 wd 0.0500 time 0.2284 (0.2588) data time 0.0008 (0.0028) model time 0.2276 (0.2560) loss 3.4730 (3.1682) grad_norm 2.3027 (2.6509) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1050/1251] eta 0:00:51 lr 0.000489 wd 0.0500 time 0.2231 (0.2579) data time 0.0008 (0.0027) model time 0.2224 (0.2552) loss 3.4068 (3.1728) grad_norm 1.7069 (2.6461) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1060/1251] eta 0:00:49 lr 0.000489 wd 0.0500 time 0.2315 (0.2572) data time 0.0008 (0.0027) model time 0.2307 (0.2545) loss 2.6090 (3.1733) grad_norm 1.8750 (2.6377) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:50 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1070/1251] eta 0:00:46 lr 0.000489 wd 0.0500 time 0.2248 (0.2565) data time 0.0007 (0.0026) model time 0.2242 (0.2538) loss 3.5757 (3.1732) grad_norm 2.5816 (2.6262) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1080/1251] eta 0:00:43 lr 0.000489 wd 0.0500 time 0.2288 (0.2558) data time 0.0008 (0.0026) model time 0.2280 (0.2532) loss 3.4462 (3.1803) grad_norm 3.9058 (2.6309) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1090/1251] eta 0:00:41 lr 0.000489 wd 0.0500 time 0.2279 (0.2551) data time 0.0009 (0.0026) model time 0.2270 (0.2526) loss 3.3165 (3.1824) grad_norm 3.0749 (2.6336) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1100/1251] eta 0:00:38 lr 0.000489 wd 0.0500 time 0.2213 (0.2545) data time 0.0007 (0.0025) model time 0.2207 (0.2520) loss 3.7171 (3.1799) grad_norm 2.1538 (2.6262) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1110/1251] eta 0:00:35 lr 0.000489 wd 0.0500 time 0.2275 (0.2539) data time 0.0007 (0.0025) model time 0.2268 (0.2514) loss 3.3718 (3.1746) grad_norm 2.1626 (2.6190) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1120/1251] eta 0:00:33 lr 0.000489 wd 0.0500 time 0.2290 (0.2533) data time 0.0008 (0.0025) model time 0.2282 (0.2509) loss 3.2336 (3.1684) grad_norm 2.4281 (2.6172) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1130/1251] eta 0:00:30 lr 0.000489 wd 0.0500 time 0.2190 (0.2527) data time 0.0009 (0.0024) model time 0.2180 (0.2503) loss 
3.0436 (3.1686) grad_norm 2.4135 (2.6140) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1140/1251] eta 0:00:27 lr 0.000489 wd 0.0500 time 0.2269 (0.2522) data time 0.0009 (0.0024) model time 0.2260 (0.2498) loss 2.7969 (3.1688) grad_norm 2.9218 (2.6167) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1150/1251] eta 0:00:25 lr 0.000489 wd 0.0500 time 0.2277 (0.2517) data time 0.0009 (0.0024) model time 0.2269 (0.2494) loss 3.3513 (3.1683) grad_norm 3.5139 (2.6233) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1160/1251] eta 0:00:22 lr 0.000489 wd 0.0500 time 0.2432 (0.2513) data time 0.0008 (0.0023) model time 0.2424 (0.2490) loss 3.9309 (3.1774) grad_norm 1.8806 (2.6312) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1170/1251] eta 0:00:20 lr 0.000489 wd 0.0500 time 0.2245 (0.2508) data time 0.0008 (0.0023) model time 0.2237 (0.2485) loss 2.1716 (3.1719) grad_norm 2.7412 (2.6300) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1180/1251] eta 0:00:17 lr 0.000489 wd 0.0500 time 0.2183 (0.2504) data time 0.0008 (0.0023) model time 0.2175 (0.2481) loss 2.3300 (3.1689) grad_norm 2.2494 (2.6317) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1190/1251] eta 0:00:15 lr 0.000489 wd 0.0500 time 0.2293 (0.2499) data time 0.0009 (0.0023) model time 0.2284 (0.2477) loss 3.6851 (3.1698) grad_norm 3.1035 (2.6362) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1200/1251] eta 
0:00:12 lr 0.000489 wd 0.0500 time 0.2225 (0.2495) data time 0.0008 (0.0022) model time 0.2217 (0.2473) loss 3.4759 (3.1735) grad_norm 2.5528 (2.6292) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1210/1251] eta 0:00:10 lr 0.000489 wd 0.0500 time 0.2207 (0.2491) data time 0.0010 (0.0022) model time 0.2198 (0.2469) loss 3.8156 (3.1766) grad_norm 2.5221 (2.6341) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1220/1251] eta 0:00:07 lr 0.000488 wd 0.0500 time 0.2294 (0.2487) data time 0.0009 (0.0022) model time 0.2286 (0.2465) loss 3.3235 (3.1787) grad_norm 3.1914 (2.6374) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1230/1251] eta 0:00:05 lr 0.000488 wd 0.0500 time 0.2275 (0.2483) data time 0.0009 (0.0022) model time 0.2266 (0.2462) loss 2.4316 (3.1809) grad_norm 3.2352 (2.6412) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1240/1251] eta 0:00:02 lr 0.000488 wd 0.0500 time 0.2190 (0.2479) data time 0.0003 (0.0021) model time 0.2187 (0.2457) loss 3.5112 (3.1828) grad_norm 2.1082 (2.6432) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [162/300][1250/1251] eta 0:00:00 lr 0.000488 wd 0.0500 time 0.2180 (0.2474) data time 0.0005 (0.0021) model time 0.2175 (0.2452) loss 3.3422 (3.1800) grad_norm 2.0927 (2.6403) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 162 training takes 0:02:29 [2024-08-27 10:34:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 10:34:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.364 (0.364) Loss 0.4729 (0.4729) Acc@1 91.406 (91.406) Acc@5 97.949 (97.949) Mem 7377MB [2024-08-27 10:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.066 (0.100) Loss 0.6958 (0.7192) Acc@1 87.207 (84.979) Acc@5 97.656 (96.902) Mem 7377MB [2024-08-27 10:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.088) Loss 1.0449 (0.7460) Acc@1 75.879 (83.980) Acc@5 94.043 (96.856) Mem 7377MB [2024-08-27 10:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.082) Loss 1.2119 (0.8506) Acc@1 71.191 (81.515) Acc@5 91.211 (95.628) Mem 7377MB [2024-08-27 10:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.078) Loss 1.1797 (0.9068) Acc@1 72.070 (80.133) Acc@5 92.676 (94.996) Mem 7377MB [2024-08-27 10:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.640 Acc@5 94.964 [2024-08-27 10:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.6% [2024-08-27 10:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.64% [2024-08-27 10:34:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 10:34:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 10:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.367 (0.367) Loss 0.4033 (0.4033) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-27 10:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.066 (0.096) Loss 0.6318 (0.6305) Acc@1 87.500 (86.452) Acc@5 97.363 (97.417) Mem 7377MB [2024-08-27 10:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.084) Loss 0.9014 (0.6553) Acc@1 78.613 (85.547) Acc@5 95.703 (97.447) Mem 7377MB [2024-08-27 10:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.080) Loss 1.1299 (0.7438) Acc@1 72.363 (83.443) Acc@5 93.066 (96.484) Mem 7377MB [2024-08-27 10:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.076) Loss 1.0283 (0.7905) Acc@1 74.609 (82.053) Acc@5 93.750 (96.001) Mem 7377MB [2024-08-27 10:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.648 Acc@5 95.974 [2024-08-27 10:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.6% [2024-08-27 10:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.65% [2024-08-27 10:34:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 10:34:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 10:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][0/1251] eta 0:12:56 lr 0.000488 wd 0.0500 time 0.6204 (0.6204) data time 0.3818 (0.3818) model time 0.0000 (0.0000) loss 3.5248 (3.5248) grad_norm 2.7470 (2.7470) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 10:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][10/1251] eta 0:05:24 lr 0.000488 wd 0.0500 time 0.2263 (0.2617) data time 0.0006 (0.0355) model time 0.0000 (0.0000) loss 4.1542 (3.3219) grad_norm 2.5411 (2.3387) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][20/1251] eta 0:05:01 lr 0.000488 wd 0.0500 time 0.2280 (0.2449) data time 0.0009 (0.0190) model time 0.0000 (0.0000) loss 3.5253 (3.3388) grad_norm 3.1195 (2.4771) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][30/1251] eta 0:04:52 lr 0.000488 wd 0.0500 time 0.2265 (0.2392) data time 0.0009 (0.0133) model time 0.0000 (0.0000) loss 2.8162 (3.2982) grad_norm 2.1373 (2.4207) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][40/1251] eta 0:04:45 lr 0.000488 wd 0.0500 time 0.2216 (0.2361) data time 0.0008 (0.0103) model time 0.0000 (0.0000) loss 2.6727 (3.2129) grad_norm 2.5830 (2.7139) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][50/1251] eta 0:04:40 lr 0.000488 wd 0.0500 time 0.2198 (0.2339) data time 0.0007 (0.0084) model time 0.0000 (0.0000) loss 2.6479 (3.2178) grad_norm 2.2729 (2.6354) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][60/1251] eta 0:04:37 lr 0.000488 wd 0.0500 time 0.2294 (0.2328) data time 0.0008 (0.0072) model time 0.2287 
(0.2261) loss 2.9605 (3.1415) grad_norm 2.7154 (2.7264) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][70/1251] eta 0:04:33 lr 0.000488 wd 0.0500 time 0.2244 (0.2319) data time 0.0007 (0.0063) model time 0.2236 (0.2259) loss 3.3337 (3.1966) grad_norm 4.1896 (2.7218) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][80/1251] eta 0:04:31 lr 0.000488 wd 0.0500 time 0.2259 (0.2315) data time 0.0010 (0.0056) model time 0.2250 (0.2265) loss 3.7175 (3.1989) grad_norm 1.8030 (2.6935) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][90/1251] eta 0:04:28 lr 0.000488 wd 0.0500 time 0.2286 (0.2311) data time 0.0008 (0.0051) model time 0.2278 (0.2265) loss 3.2732 (3.1786) grad_norm 2.1861 (2.6752) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][100/1251] eta 0:04:25 lr 0.000488 wd 0.0500 time 0.2279 (0.2306) data time 0.0008 (0.0047) model time 0.2271 (0.2263) loss 3.4938 (3.1660) grad_norm 2.2968 (2.6903) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][110/1251] eta 0:04:22 lr 0.000488 wd 0.0500 time 0.2230 (0.2301) data time 0.0008 (0.0044) model time 0.2222 (0.2260) loss 3.6700 (3.1422) grad_norm 3.7392 (2.7089) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][120/1251] eta 0:04:21 lr 0.000488 wd 0.0500 time 0.2199 (0.2312) data time 0.0007 (0.0041) model time 0.2192 (0.2282) loss 3.4420 (3.1474) grad_norm 1.7270 (2.7372) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][130/1251] 
eta 0:04:18 lr 0.000488 wd 0.0500 time 0.2256 (0.2309) data time 0.0006 (0.0038) model time 0.2250 (0.2280) loss 3.3029 (3.1621) grad_norm 2.5619 (2.7524) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][140/1251] eta 0:04:16 lr 0.000488 wd 0.0500 time 0.2245 (0.2305) data time 0.0010 (0.0036) model time 0.2236 (0.2276) loss 3.5203 (3.1736) grad_norm 2.3633 (2.7377) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][150/1251] eta 0:04:13 lr 0.000488 wd 0.0500 time 0.2234 (0.2302) data time 0.0011 (0.0035) model time 0.2222 (0.2274) loss 2.2249 (3.1647) grad_norm 2.9845 (2.7467) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][160/1251] eta 0:04:10 lr 0.000488 wd 0.0500 time 0.2277 (0.2299) data time 0.0008 (0.0033) model time 0.2269 (0.2272) loss 3.4857 (3.1844) grad_norm 2.5484 (2.7321) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][170/1251] eta 0:04:08 lr 0.000488 wd 0.0500 time 0.2220 (0.2296) data time 0.0010 (0.0032) model time 0.2210 (0.2268) loss 2.4032 (3.1854) grad_norm 2.0759 (2.7127) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][180/1251] eta 0:04:05 lr 0.000488 wd 0.0500 time 0.2215 (0.2293) data time 0.0009 (0.0030) model time 0.2205 (0.2266) loss 2.3521 (3.1810) grad_norm 2.3841 (2.7075) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 10:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 10:35:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 10:35:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 10:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 10:38:19 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 10:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 10:38:28 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 10:38:29 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 10:38:30 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 10:38:30 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 163) [2024-08-27 10:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 10:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][190/1251] eta 1:00:50 lr 0.000487 wd 0.0500 time 0.2455 (3.4409) data time 0.0007 (0.1969) model time 0.2448 (3.2440) loss 3.9527 (3.6026) grad_norm 2.7104 (2.8800) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][200/1251] eta 0:20:15 lr 0.000487 wd 0.0500 time 0.2418 (1.1566) data time 0.0008 (0.0571) model time 0.2410 (1.0995) loss 3.4103 (3.3760) grad_norm 2.1868 (3.1421) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [163/300][210/1251] eta 0:13:27 lr 0.000487 wd 0.0500 time 0.2389 (0.7757) data time 0.0010 (0.0337) model time 0.2379 (0.7419) loss 3.2909 (3.4244) grad_norm 3.0681 (2.9501) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][220/1251] eta 0:10:37 lr 0.000487 wd 0.0500 time 0.2455 (0.6180) data time 0.0008 (0.0241) model time 0.2447 (0.5939) loss 2.7395 (3.3962) grad_norm 3.4384 (2.8372) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][230/1251] eta 0:09:03 lr 0.000487 wd 0.0500 time 0.2393 (0.5326) data time 0.0008 (0.0189) model time 0.2386 (0.5137) loss 3.3829 (3.3376) grad_norm 2.0531 (2.8003) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][240/1251] eta 0:08:04 lr 0.000487 wd 0.0500 time 0.2378 (0.4790) data time 0.0008 (0.0156) model time 0.2370 (0.4634) loss 3.8440 (3.3485) grad_norm 2.2581 (2.6635) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][250/1251] eta 0:07:22 lr 0.000487 wd 0.0500 time 0.2504 (0.4421) data time 0.0008 (0.0133) model time 0.2496 (0.4289) loss 3.7676 (3.3078) grad_norm 2.2829 (2.6462) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][260/1251] eta 0:06:51 lr 0.000487 wd 0.0500 time 0.2419 (0.4153) data time 0.0008 (0.0116) model time 0.2411 (0.4037) loss 3.5812 (3.2870) grad_norm 2.3338 (2.6500) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][270/1251] eta 0:06:27 lr 0.000487 wd 0.0500 time 0.2404 (0.3948) data time 0.0010 (0.0104) model time 0.2393 (0.3844) loss 2.9807 (3.2548) grad_norm 3.4672 (2.6954) loss_scale 
1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][280/1251] eta 0:06:07 lr 0.000487 wd 0.0500 time 0.2433 (0.3786) data time 0.0010 (0.0094) model time 0.2423 (0.3692) loss 3.1051 (3.2484) grad_norm 2.0427 (2.6831) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][290/1251] eta 0:05:51 lr 0.000487 wd 0.0500 time 0.2448 (0.3655) data time 0.0010 (0.0086) model time 0.2438 (0.3569) loss 3.9447 (3.2821) grad_norm 2.4601 (2.6942) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][300/1251] eta 0:05:37 lr 0.000487 wd 0.0500 time 0.2508 (0.3551) data time 0.0012 (0.0079) model time 0.2496 (0.3471) loss 3.5941 (3.2822) grad_norm 2.1032 (2.6660) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][310/1251] eta 0:05:25 lr 0.000487 wd 0.0500 time 0.2437 (0.3461) data time 0.0010 (0.0074) model time 0.2427 (0.3388) loss 2.9347 (3.2770) grad_norm 2.1531 (2.6945) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][320/1251] eta 0:05:15 lr 0.000487 wd 0.0500 time 0.2365 (0.3385) data time 0.0010 (0.0069) model time 0.2355 (0.3315) loss 3.2945 (3.2719) grad_norm 2.1705 (2.6659) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][330/1251] eta 0:05:05 lr 0.000487 wd 0.0500 time 0.2439 (0.3320) data time 0.0007 (0.0065) model time 0.2432 (0.3255) loss 2.9535 (3.2550) grad_norm 2.5356 (2.6386) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][340/1251] eta 0:04:57 lr 0.000487 wd 0.0500 time 0.2467 (0.3263) data time 
0.0008 (0.0062) model time 0.2459 (0.3201) loss 3.2393 (3.2538) grad_norm 2.2297 (2.6301) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][350/1251] eta 0:04:49 lr 0.000487 wd 0.0500 time 0.2441 (0.3213) data time 0.0009 (0.0059) model time 0.2432 (0.3155) loss 3.0709 (3.2513) grad_norm 2.1694 (2.6090) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][360/1251] eta 0:04:42 lr 0.000487 wd 0.0500 time 0.2375 (0.3170) data time 0.0012 (0.0056) model time 0.2363 (0.3114) loss 2.2374 (3.2414) grad_norm 2.4637 (2.6005) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][370/1251] eta 0:04:35 lr 0.000487 wd 0.0500 time 0.2406 (0.3130) data time 0.0008 (0.0053) model time 0.2398 (0.3077) loss 2.8839 (3.2318) grad_norm 1.9015 (2.5958) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][380/1251] eta 0:04:29 lr 0.000487 wd 0.0500 time 0.2413 (0.3094) data time 0.0007 (0.0051) model time 0.2405 (0.3043) loss 3.1117 (3.2335) grad_norm 1.6061 (2.5807) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][390/1251] eta 0:04:23 lr 0.000487 wd 0.0500 time 0.2450 (0.3061) data time 0.0010 (0.0049) model time 0.2440 (0.3012) loss 3.4990 (3.2211) grad_norm 1.8061 (2.5578) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][400/1251] eta 0:04:17 lr 0.000487 wd 0.0500 time 0.2423 (0.3032) data time 0.0008 (0.0047) model time 0.2415 (0.2984) loss 3.0783 (3.2105) grad_norm 2.4508 (2.5499) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [163/300][410/1251] eta 0:04:12 lr 0.000487 wd 0.0500 time 0.2416 (0.3004) data time 0.0011 (0.0046) model time 0.2405 (0.2959) loss 3.5110 (3.2088) grad_norm 3.1139 (2.5453) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][420/1251] eta 0:04:07 lr 0.000486 wd 0.0500 time 0.2376 (0.2979) data time 0.0008 (0.0044) model time 0.2368 (0.2935) loss 2.5020 (3.1990) grad_norm 2.5665 (2.5646) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][430/1251] eta 0:04:02 lr 0.000486 wd 0.0500 time 0.2347 (0.2957) data time 0.0008 (0.0043) model time 0.2338 (0.2914) loss 2.5604 (3.2045) grad_norm 2.8955 (2.5760) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][440/1251] eta 0:03:58 lr 0.000486 wd 0.0500 time 0.2510 (0.2936) data time 0.0008 (0.0042) model time 0.2503 (0.2894) loss 2.2552 (3.1960) grad_norm 2.7512 (2.6032) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][450/1251] eta 0:03:53 lr 0.000486 wd 0.0500 time 0.2454 (0.2917) data time 0.0008 (0.0041) model time 0.2447 (0.2876) loss 2.9493 (3.1877) grad_norm 3.1119 (2.6183) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][460/1251] eta 0:03:49 lr 0.000486 wd 0.0500 time 0.2457 (0.2900) data time 0.0009 (0.0040) model time 0.2448 (0.2860) loss 3.3326 (3.1868) grad_norm 2.6094 (2.6175) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][470/1251] eta 0:03:45 lr 0.000486 wd 0.0500 time 0.2386 (0.2884) data time 0.0008 (0.0039) model time 0.2378 (0.2845) loss 2.1855 (3.1860) grad_norm 2.7435 (2.6228) 
loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][480/1251] eta 0:03:41 lr 0.000486 wd 0.0500 time 0.2417 (0.2875) data time 0.0008 (0.0038) model time 0.2409 (0.2838) loss 2.7977 (3.1777) grad_norm 2.0750 (2.6306) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][490/1251] eta 0:03:37 lr 0.000486 wd 0.0500 time 0.2460 (0.2861) data time 0.0016 (0.0037) model time 0.2443 (0.2824) loss 3.5822 (3.1738) grad_norm 2.4389 (2.6392) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][500/1251] eta 0:03:34 lr 0.000486 wd 0.0500 time 0.2493 (0.2855) data time 0.0010 (0.0036) model time 0.2483 (0.2819) loss 3.2994 (3.1756) grad_norm 2.1906 (2.6594) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][510/1251] eta 0:03:30 lr 0.000486 wd 0.0500 time 0.2421 (0.2842) data time 0.0010 (0.0035) model time 0.2411 (0.2807) loss 3.8941 (3.1869) grad_norm 2.1506 (2.6599) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][520/1251] eta 0:03:26 lr 0.000486 wd 0.0500 time 0.2436 (0.2830) data time 0.0009 (0.0035) model time 0.2427 (0.2795) loss 3.0704 (3.1856) grad_norm 1.6972 (2.6448) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][530/1251] eta 0:03:23 lr 0.000486 wd 0.0500 time 0.2531 (0.2818) data time 0.0010 (0.0034) model time 0.2521 (0.2784) loss 3.5864 (3.1903) grad_norm 3.1462 (2.6401) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][540/1251] eta 0:03:19 lr 0.000486 wd 0.0500 time 0.2477 (0.2808) 
data time 0.0010 (0.0034) model time 0.2467 (0.2774) loss 3.3128 (3.1926) grad_norm 7.6471 (2.6539) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][550/1251] eta 0:03:16 lr 0.000486 wd 0.0500 time 0.2407 (0.2797) data time 0.0008 (0.0033) model time 0.2399 (0.2764) loss 2.2313 (3.1868) grad_norm 2.1973 (2.6469) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 10:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][560/1251] eta 0:03:12 lr 0.000486 wd 0.0500 time 0.2535 (0.2787) data time 0.0011 (0.0033) model time 0.2524 (0.2754) loss 3.5420 (3.1863) grad_norm 2.4819 (2.6528) loss_scale 2048.0000 (1029.4759) mem 7373MB [2024-08-27 10:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 10:40:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:40:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 10:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 10:43:23 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 10:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 10:43:33 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 10:43:34 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 10:43:35 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 10:43:36 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 163) [2024-08-27 10:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 10:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][570/1251] eta 0:18:45 lr 0.000486 wd 0.0500 time 0.2414 (1.6533) data time 0.0008 (0.0834) model time 0.2406 (1.5699) loss 3.4672 (3.5942) grad_norm 1.5366 (inf) loss_scale 1024.0000 (1137.7778) mem 7373MB [2024-08-27 10:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][580/1251] eta 0:10:09 lr 0.000486 wd 0.0500 time 0.2402 (0.9084) data time 0.0009 (0.0404) model time 0.2393 (0.8681) loss 3.2847 (3.4403) grad_norm 3.0147 (inf) loss_scale 1024.0000 (1077.8947) mem 7373MB [2024-08-27 10:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][590/1251] eta 0:07:27 lr 0.000486 wd 0.0500 time 0.2308 (0.6770) data time 0.0008 (0.0268) model time 0.2300 (0.6502) loss 3.9363 (3.5116) grad_norm 3.6001 (inf) loss_scale 1024.0000 (1059.3103) mem 7373MB [2024-08-27 10:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][600/1251] eta 0:06:06 lr 0.000486 wd 0.0500 time 0.2285 (0.5636) data time 0.0010 (0.0202) model time 0.2275 (0.5435) loss 3.1634 (3.4076) grad_norm 1.9260 (inf) loss_scale 1024.0000 (1050.2564) mem 7373MB [2024-08-27 10:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][610/1251] eta 0:05:18 lr 0.000486 wd 0.0500 time 0.2380 (0.4974) data time 0.0013 (0.0163) model time 0.2368 (0.4811) loss 3.3825 (3.3687) grad_norm 3.2159 (inf) loss_scale 1024.0000 (1044.8980) mem 7373MB [2024-08-27 10:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[163/300][620/1251] eta 0:04:46 lr 0.000486 wd 0.0500 time 0.2449 (0.4536) data time 0.0007 (0.0137) model time 0.2442 (0.4399) loss 2.6324 (3.3212) grad_norm 2.5859 (inf) loss_scale 1024.0000 (1041.3559) mem 7373MB [2024-08-27 10:44:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][630/1251] eta 0:04:22 lr 0.000486 wd 0.0500 time 0.2355 (0.4224) data time 0.0010 (0.0118) model time 0.2345 (0.4106) loss 3.0863 (3.3140) grad_norm 6.2344 (inf) loss_scale 1024.0000 (1038.8406) mem 7373MB [2024-08-27 10:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][640/1251] eta 0:04:03 lr 0.000486 wd 0.0500 time 0.2310 (0.3987) data time 0.0011 (0.0105) model time 0.2300 (0.3882) loss 3.3385 (3.2845) grad_norm 2.0660 (inf) loss_scale 1024.0000 (1036.9620) mem 7373MB [2024-08-27 10:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][650/1251] eta 0:03:48 lr 0.000485 wd 0.0500 time 0.2361 (0.3806) data time 0.0007 (0.0094) model time 0.2354 (0.3712) loss 3.0701 (3.2610) grad_norm 2.9105 (inf) loss_scale 1024.0000 (1035.5056) mem 7373MB [2024-08-27 10:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][660/1251] eta 0:03:36 lr 0.000485 wd 0.0500 time 0.2295 (0.3660) data time 0.0010 (0.0085) model time 0.2284 (0.3575) loss 3.3912 (3.2731) grad_norm 2.6263 (inf) loss_scale 1024.0000 (1034.3434) mem 7373MB [2024-08-27 10:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][670/1251] eta 0:03:25 lr 0.000485 wd 0.0500 time 0.2418 (0.3543) data time 0.0008 (0.0078) model time 0.2410 (0.3464) loss 3.6705 (3.2736) grad_norm 2.3128 (inf) loss_scale 1024.0000 (1033.3945) mem 7373MB [2024-08-27 10:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][680/1251] eta 0:03:16 lr 0.000485 wd 0.0500 time 0.2398 (0.3443) data time 0.0009 (0.0073) model time 0.2388 (0.3370) loss 3.3725 (3.2691) grad_norm 3.2405 (inf) loss_scale 1024.0000 (1032.6050) mem 7373MB 
[2024-08-27 10:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][690/1251] eta 0:03:08 lr 0.000485 wd 0.0500 time 0.2390 (0.3363) data time 0.0010 (0.0068) model time 0.2381 (0.3295) loss 2.9946 (3.2430) grad_norm 1.7382 (inf) loss_scale 1024.0000 (1031.9380) mem 7373MB [2024-08-27 10:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][700/1251] eta 0:03:01 lr 0.000485 wd 0.0500 time 0.2343 (0.3293) data time 0.0008 (0.0064) model time 0.2335 (0.3230) loss 3.5837 (3.2412) grad_norm 1.7831 (inf) loss_scale 1024.0000 (1031.3669) mem 7373MB [2024-08-27 10:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][710/1251] eta 0:02:54 lr 0.000485 wd 0.0500 time 0.2336 (0.3232) data time 0.0008 (0.0060) model time 0.2329 (0.3172) loss 2.8364 (3.2271) grad_norm 2.6992 (inf) loss_scale 1024.0000 (1030.8725) mem 7373MB [2024-08-27 10:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][720/1251] eta 0:02:48 lr 0.000485 wd 0.0500 time 0.2370 (0.3179) data time 0.0008 (0.0057) model time 0.2362 (0.3122) loss 4.2524 (3.2295) grad_norm 2.6108 (inf) loss_scale 1024.0000 (1030.4403) mem 7373MB [2024-08-27 10:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][730/1251] eta 0:02:43 lr 0.000485 wd 0.0500 time 0.2295 (0.3131) data time 0.0010 (0.0054) model time 0.2285 (0.3076) loss 2.8921 (3.2277) grad_norm 1.8798 (inf) loss_scale 1024.0000 (1030.0592) mem 7373MB [2024-08-27 10:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][740/1251] eta 0:02:37 lr 0.000485 wd 0.0500 time 0.2367 (0.3090) data time 0.0007 (0.0052) model time 0.2360 (0.3038) loss 3.3026 (3.2079) grad_norm 2.4112 (inf) loss_scale 1024.0000 (1029.7207) mem 7373MB [2024-08-27 10:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][750/1251] eta 0:02:32 lr 0.000485 wd 0.0500 time 0.2390 (0.3052) data time 0.0007 (0.0049) model time 0.2383 (0.3002) loss 
3.6575 (3.2069) grad_norm 2.2570 (inf) loss_scale 1024.0000 (1029.4180) mem 7373MB [2024-08-27 10:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][760/1251] eta 0:02:28 lr 0.000485 wd 0.0500 time 0.2332 (0.3018) data time 0.0010 (0.0047) model time 0.2322 (0.2970) loss 2.5051 (3.1889) grad_norm 2.8979 (inf) loss_scale 1024.0000 (1029.1457) mem 7373MB [2024-08-27 10:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][770/1251] eta 0:02:23 lr 0.000485 wd 0.0500 time 0.2419 (0.2988) data time 0.0007 (0.0046) model time 0.2412 (0.2942) loss 3.3822 (3.1847) grad_norm 2.2433 (inf) loss_scale 1024.0000 (1028.8995) mem 7373MB [2024-08-27 10:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][780/1251] eta 0:02:19 lr 0.000485 wd 0.0500 time 0.2371 (0.2959) data time 0.0008 (0.0044) model time 0.2363 (0.2915) loss 3.6985 (3.1777) grad_norm 2.0027 (inf) loss_scale 1024.0000 (1028.6758) mem 7373MB [2024-08-27 10:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][790/1251] eta 0:02:15 lr 0.000485 wd 0.0500 time 0.2274 (0.2933) data time 0.0008 (0.0042) model time 0.2266 (0.2890) loss 2.5530 (3.1785) grad_norm 2.2758 (inf) loss_scale 1024.0000 (1028.4716) mem 7373MB [2024-08-27 10:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][800/1251] eta 0:02:11 lr 0.000485 wd 0.0500 time 0.2460 (0.2910) data time 0.0008 (0.0041) model time 0.2452 (0.2869) loss 2.2640 (3.1704) grad_norm 2.0244 (inf) loss_scale 1024.0000 (1028.2845) mem 7373MB [2024-08-27 10:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][810/1251] eta 0:02:07 lr 0.000485 wd 0.0500 time 0.2388 (0.2889) data time 0.0008 (0.0040) model time 0.2380 (0.2850) loss 3.2922 (3.1659) grad_norm 3.1465 (inf) loss_scale 1024.0000 (1028.1124) mem 7373MB [2024-08-27 10:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting 
[2024-08-27 10:44:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:44:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 10:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 10:46:56 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 10:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 10:47:13 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 10:47:14 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 10:47:15 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 10:47:15 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 163) [2024-08-27 10:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 10:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][820/1251] eta 0:35:00 lr 0.000485 wd 0.0500 time 0.2432 (4.8734) data time 0.0008 (0.3019) model time 0.2424 (4.5714) loss 2.9393 (3.6529) grad_norm 2.0145 (2.3047) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][830/1251] eta 0:09:10 lr 0.000485 wd 0.0500 time 0.2429 (1.3083) data time 0.0010 (0.0705) model time 0.2419 (1.2378) loss 3.0557 (3.4653) grad_norm 2.2568 (2.6190) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][840/1251] eta 0:05:47 lr 0.000485 wd 0.0500 time 0.2459 (0.8452) data time 0.0009 (0.0403) model time 0.2450 (0.8049) loss 3.6488 (3.4116) grad_norm 2.6256 (2.9055) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][850/1251] eta 0:04:26 lr 0.000485 wd 0.0500 time 0.2472 (0.6635) data time 0.0009 (0.0284) model time 0.2463 (0.6351) loss 3.7497 (3.4263) grad_norm 2.4002 (2.9835) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 10:47:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 10:47:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 10:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 10:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 10:52:45 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 10:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 10:52:55 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 10:52:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 10:52:58 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 10:52:58 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 163) [2024-08-27 10:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 10:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][860/1251] eta 0:31:26 lr 0.000485 wd 0.0500 time 0.2443 (4.8254) data time 0.0008 (0.6468) model time 0.2435 (4.1786) loss 2.9260 (3.5899) grad_norm 2.4337 (2.4917) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][870/1251] eta 0:08:15 lr 0.000484 wd 0.0500 time 0.2461 (1.2994) data time 0.0010 (0.1503) model time 0.2451 (1.1491) loss 3.3662 (3.4656) grad_norm 2.6394 (2.2700) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 10:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO 
Suspend command received, saving checkpoint and exiting [2024-08-27 10:53:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 10:53:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 11:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 11:00:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 11:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 11:01:08 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 11:01:09 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 11:01:10 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 11:01:10 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 163) [2024-08-27 11:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 11:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 11:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 11:05:58 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 11:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 11:06:14 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 11:06:15 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 11:06:17 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 11:06:17 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 163) [2024-08-27 11:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 11:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][880/1251] eta 0:23:29 lr 0.000484 wd 0.0500 time 0.2189 (3.7992) data time 0.0007 (0.2452) model time 0.2183 (3.5540) loss 3.8102 (3.5921) grad_norm 2.7793 (2.6450) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][890/1251] eta 0:07:31 lr 0.000484 wd 0.0500 time 0.2254 (1.2507) data time 0.0007 (0.0708) model time 0.2247 (1.1798) loss 3.5363 (3.4142) grad_norm 3.2242 (2.6708) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][900/1251] eta 0:04:49 lr 0.000484 wd 0.0500 time 0.2316 (0.8254) data time 0.0009 (0.0418) model time 0.2307 (0.7836) loss 3.3991 (3.4166) grad_norm 2.2676 (2.6117) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][910/1251] eta 0:03:41 lr 0.000484 wd 0.0500 time 0.2260 (0.6495) data time 0.0007 (0.0298) model time 0.2253 (0.6196) loss 2.7007 (3.4049) grad_norm 2.3562 (2.6536) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][920/1251] eta 0:03:03 lr 0.000484 wd 0.0500 time 0.2365 (0.5543) data time 0.0007 (0.0234) model time 0.2359 (0.5309) loss 3.2432 (3.3525) grad_norm 3.2216 (2.6471) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][930/1251] eta 0:02:38 lr 0.000484 wd 0.0500 time 0.2256 (0.4943) data time 0.0007 (0.0192) model time 0.2249 (0.4750) loss 3.6582 (3.3474) grad_norm 3.5037 (2.6819) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][940/1251] eta 0:02:20 lr 0.000484 wd 0.0500 time 0.2339 (0.4526) data time 0.0008 (0.0164) model time 0.2331 (0.4362) loss 3.0733 (3.3019) grad_norm 2.3300 (2.6623) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][950/1251] eta 0:02:07 lr 0.000484 wd 0.0500 time 0.2307 (0.4224) data time 0.0008 (0.0143) model time 0.2299 (0.4081) loss 3.2206 (3.2713) grad_norm 1.9977 (2.6239) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][960/1251] eta 0:01:56 lr 0.000484 wd 0.0500 time 0.2240 (0.3990) data time 0.0009 (0.0128) model time 0.2231 (0.3863) loss 3.4613 (3.2440) grad_norm 3.0470 (2.6049) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][970/1251] eta 0:01:47 lr 0.000484 wd 0.0500 time 0.2314 (0.3812) data time 0.0011 (0.0115) model time 0.2302 (0.3697) loss 2.9743 (3.2438) grad_norm 2.3117 (2.5852) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][980/1251] eta 0:01:39 lr 0.000484 wd 0.0500 time 0.2205 (0.3666) data time 0.0012 (0.0105) model time 0.2193 (0.3561) loss 3.1281 
(3.2803) grad_norm 2.1859 (2.5514) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][990/1251] eta 0:01:32 lr 0.000484 wd 0.0500 time 0.2334 (0.3547) data time 0.0012 (0.0097) model time 0.2322 (0.3450) loss 3.7302 (3.2726) grad_norm 3.5673 (2.5996) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1000/1251] eta 0:01:26 lr 0.000484 wd 0.0500 time 0.2394 (0.3447) data time 0.0009 (0.0090) model time 0.2385 (0.3357) loss 2.8982 (3.2653) grad_norm 2.3797 (2.6115) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1010/1251] eta 0:01:20 lr 0.000484 wd 0.0500 time 0.2244 (0.3359) data time 0.0009 (0.0084) model time 0.2234 (0.3274) loss 3.5731 (3.2628) grad_norm 3.4492 (2.6239) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1020/1251] eta 0:01:15 lr 0.000484 wd 0.0500 time 0.2242 (0.3283) data time 0.0008 (0.0080) model time 0.2234 (0.3204) loss 2.8255 (3.2497) grad_norm 1.9838 (2.6690) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1030/1251] eta 0:01:11 lr 0.000484 wd 0.0500 time 0.2361 (0.3219) data time 0.0007 (0.0075) model time 0.2353 (0.3144) loss 2.8409 (3.2357) grad_norm 2.3616 (2.6551) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1040/1251] eta 0:01:06 lr 0.000484 wd 0.0500 time 0.2294 (0.3162) data time 0.0009 (0.0071) model time 0.2286 (0.3090) loss 2.8516 (3.2262) grad_norm 1.7959 (2.6364) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1050/1251] eta 0:01:02 
lr 0.000484 wd 0.0500 time 0.2261 (0.3111) data time 0.0009 (0.0068) model time 0.2252 (0.3044) loss 2.6789 (3.2270) grad_norm 2.5453 (2.6171) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1060/1251] eta 0:00:58 lr 0.000484 wd 0.0500 time 0.2308 (0.3067) data time 0.0009 (0.0065) model time 0.2299 (0.3002) loss 2.7244 (3.2208) grad_norm 2.0911 (2.6089) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1070/1251] eta 0:00:54 lr 0.000484 wd 0.0500 time 0.2267 (0.3026) data time 0.0007 (0.0062) model time 0.2260 (0.2964) loss 3.3199 (3.2201) grad_norm 1.7449 (2.6065) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1080/1251] eta 0:00:51 lr 0.000484 wd 0.0500 time 0.2296 (0.2991) data time 0.0010 (0.0060) model time 0.2286 (0.2931) loss 3.0779 (3.2058) grad_norm 3.7431 (2.6158) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1090/1251] eta 0:00:47 lr 0.000484 wd 0.0500 time 0.2277 (0.2959) data time 0.0007 (0.0057) model time 0.2270 (0.2901) loss 3.0789 (3.1986) grad_norm 2.3914 (2.6070) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1100/1251] eta 0:00:44 lr 0.000483 wd 0.0500 time 0.2288 (0.2928) data time 0.0011 (0.0055) model time 0.2277 (0.2873) loss 3.0967 (3.2054) grad_norm 1.8992 (2.6130) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1110/1251] eta 0:00:40 lr 0.000483 wd 0.0500 time 0.2284 (0.2900) data time 0.0007 (0.0053) model time 0.2277 (0.2847) loss 2.6871 (3.1965) grad_norm 2.6503 (2.6135) loss_scale 1024.0000 (1024.0000) mem 7377MB 
[2024-08-27 11:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1120/1251] eta 0:00:37 lr 0.000483 wd 0.0500 time 0.2248 (0.2875) data time 0.0007 (0.0051) model time 0.2241 (0.2823) loss 2.0161 (3.1973) grad_norm 1.8381 (2.5999) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1130/1251] eta 0:00:34 lr 0.000483 wd 0.0500 time 0.2310 (0.2851) data time 0.0007 (0.0050) model time 0.2303 (0.2801) loss 2.4395 (3.1897) grad_norm 2.4248 (2.5897) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1140/1251] eta 0:00:31 lr 0.000483 wd 0.0500 time 0.2323 (0.2830) data time 0.0008 (0.0048) model time 0.2315 (0.2781) loss 3.1540 (3.1835) grad_norm 2.8169 (2.5901) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1150/1251] eta 0:00:28 lr 0.000483 wd 0.0500 time 0.2202 (0.2810) data time 0.0011 (0.0047) model time 0.2190 (0.2763) loss 3.3529 (3.1792) grad_norm 3.9544 (2.6064) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1160/1251] eta 0:00:25 lr 0.000483 wd 0.0500 time 0.2293 (0.2791) data time 0.0009 (0.0046) model time 0.2285 (0.2745) loss 1.9669 (3.1805) grad_norm 1.8390 (2.6160) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1170/1251] eta 0:00:22 lr 0.000483 wd 0.0500 time 0.2295 (0.2781) data time 0.0007 (0.0045) model time 0.2289 (0.2737) loss 2.9574 (3.1780) grad_norm 2.9589 (2.6205) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1180/1251] eta 0:00:19 lr 0.000483 wd 0.0500 time 0.2321 (0.2765) data time 0.0009 (0.0043) model time 
0.2311 (0.2722) loss 3.6152 (3.1722) grad_norm 2.0841 (2.6105) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1190/1251] eta 0:00:16 lr 0.000483 wd 0.0500 time 0.2301 (0.2758) data time 0.0009 (0.0042) model time 0.2293 (0.2715) loss 3.9138 (3.1719) grad_norm 2.8096 (2.6080) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1200/1251] eta 0:00:13 lr 0.000483 wd 0.0500 time 0.2407 (0.2743) data time 0.0008 (0.0041) model time 0.2399 (0.2701) loss 3.8220 (3.1821) grad_norm 2.4319 (2.6146) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1210/1251] eta 0:00:11 lr 0.000483 wd 0.0500 time 0.2255 (0.2729) data time 0.0007 (0.0041) model time 0.2248 (0.2689) loss 2.7021 (3.1801) grad_norm 2.5036 (2.6143) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1220/1251] eta 0:00:08 lr 0.000483 wd 0.0500 time 0.2317 (0.2717) data time 0.0011 (0.0040) model time 0.2306 (0.2677) loss 3.7985 (3.1864) grad_norm 2.9823 (2.6339) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1230/1251] eta 0:00:05 lr 0.000483 wd 0.0500 time 0.2303 (0.2704) data time 0.0009 (0.0039) model time 0.2293 (0.2665) loss 3.6913 (3.1878) grad_norm 2.1665 (2.6415) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [163/300][1240/1251] eta 0:00:02 lr 0.000483 wd 0.0500 time 0.2108 (0.2690) data time 0.0005 (0.0038) model time 0.2104 (0.2652) loss 2.3315 (3.1810) grad_norm 2.9481 (2.6363) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[163/300][1250/1251] eta 0:00:00 lr 0.000483 wd 0.0500 time 0.2131 (0.2675) data time 0.0007 (0.0037) model time 0.2124 (0.2638) loss 3.5642 (3.1820) grad_norm 2.1260 (2.6368) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 11:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 163 training takes 0:01:40 [2024-08-27 11:08:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:08:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.455 (0.455) Loss 0.4712 (0.4712) Acc@1 91.211 (91.211) Acc@5 98.047 (98.047) Mem 7377MB [2024-08-27 11:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.113) Loss 0.7666 (0.7098) Acc@1 85.059 (84.739) Acc@5 96.680 (96.955) Mem 7377MB [2024-08-27 11:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.098) Loss 0.9756 (0.7261) Acc@1 77.441 (83.798) Acc@5 94.141 (96.842) Mem 7377MB [2024-08-27 11:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.092) Loss 1.2344 (0.8292) Acc@1 70.996 (81.458) Acc@5 91.016 (95.612) Mem 7377MB [2024-08-27 11:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.1260 (0.8823) Acc@1 74.023 (80.016) Acc@5 92.676 (95.008) Mem 7377MB [2024-08-27 11:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.600 Acc@5 94.974 [2024-08-27 11:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.6% [2024-08-27 11:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.961 (0.961) Loss 0.4033 (0.4033) Acc@1 92.676 (92.676) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-27 11:08:10 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.163) Loss 0.6309 (0.6298) Acc@1 87.500 (86.506) Acc@5 97.363 (97.390) Mem 7377MB [2024-08-27 11:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.125) Loss 0.8984 (0.6545) Acc@1 78.320 (85.556) Acc@5 95.508 (97.410) Mem 7377MB [2024-08-27 11:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.110) Loss 1.1270 (0.7431) Acc@1 72.656 (83.449) Acc@5 93.164 (96.440) Mem 7377MB [2024-08-27 11:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.099) Loss 1.0273 (0.7896) Acc@1 74.512 (82.069) Acc@5 93.652 (95.956) Mem 7377MB [2024-08-27 11:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.686 Acc@5 95.944 [2024-08-27 11:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.7% [2024-08-27 11:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.69% [2024-08-27 11:08:13 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 11:08:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 11:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][0/1251] eta 0:16:43 lr 0.000483 wd 0.0500 time 0.8024 (0.8024) data time 0.5199 (0.5199) model time 0.0000 (0.0000) loss 2.4178 (2.4178) grad_norm 2.4711 (2.4711) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 11:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][10/1251] eta 0:05:47 lr 0.000483 wd 0.0500 time 0.2268 (0.2799) data time 0.0007 (0.0483) model time 0.0000 (0.0000) loss 2.3485 (2.8755) grad_norm 1.9511 (2.3051) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][20/1251] eta 0:05:14 lr 0.000483 wd 0.0500 time 0.2315 (0.2556) data time 0.0009 (0.0258) model time 0.0000 (0.0000) loss 3.2496 (2.9993) grad_norm 3.6772 (2.4555) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][30/1251] eta 0:05:00 lr 0.000483 wd 0.0500 time 0.2229 (0.2462) data time 0.0008 (0.0178) model time 0.0000 (0.0000) loss 3.0741 (3.1033) grad_norm 2.0007 (2.5017) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][40/1251] eta 0:04:52 lr 0.000483 wd 0.0500 time 0.2225 (0.2412) data time 0.0009 (0.0138) model time 0.0000 (0.0000) loss 3.5481 (3.1258) grad_norm 2.1223 (2.9736) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][50/1251] eta 0:04:46 lr 0.000483 wd 0.0500 time 0.2324 (0.2388) data time 0.0007 (0.0113) model time 0.0000 (0.0000) loss 3.7855 (3.1586) grad_norm 5.6990 (2.9161) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][60/1251] eta 0:04:43 lr 0.000483 wd 0.0500 time 0.2332 (0.2377) data time 0.0007 (0.0096) model time 0.2325 
(0.2311) loss 3.5582 (3.2259) grad_norm 2.9196 (2.9327) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][70/1251] eta 0:04:39 lr 0.000482 wd 0.0500 time 0.2303 (0.2366) data time 0.0011 (0.0084) model time 0.2292 (0.2298) loss 3.2996 (3.2227) grad_norm 2.4354 (2.8536) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][80/1251] eta 0:04:35 lr 0.000482 wd 0.0500 time 0.2239 (0.2357) data time 0.0008 (0.0075) model time 0.2231 (0.2292) loss 3.3250 (3.1935) grad_norm 2.5475 (2.8415) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][90/1251] eta 0:04:32 lr 0.000482 wd 0.0500 time 0.2291 (0.2350) data time 0.0007 (0.0068) model time 0.2283 (0.2291) loss 2.9783 (3.1746) grad_norm 1.9752 (2.8115) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][100/1251] eta 0:04:29 lr 0.000482 wd 0.0500 time 0.2273 (0.2344) data time 0.0007 (0.0062) model time 0.2266 (0.2288) loss 3.0662 (3.1429) grad_norm 2.5118 (2.7726) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][110/1251] eta 0:04:26 lr 0.000482 wd 0.0500 time 0.2378 (0.2339) data time 0.0009 (0.0058) model time 0.2369 (0.2287) loss 3.9635 (3.1601) grad_norm 3.6360 (2.7432) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][120/1251] eta 0:04:23 lr 0.000482 wd 0.0500 time 0.2244 (0.2333) data time 0.0006 (0.0054) model time 0.2238 (0.2281) loss 3.3324 (3.1650) grad_norm 2.7908 (2.7418) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][130/1251] 
eta 0:04:20 lr 0.000482 wd 0.0500 time 0.2359 (0.2328) data time 0.0010 (0.0051) model time 0.2349 (0.2279) loss 3.9259 (3.1635) grad_norm 1.8666 (2.7936) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][140/1251] eta 0:04:18 lr 0.000482 wd 0.0500 time 0.2245 (0.2325) data time 0.0009 (0.0048) model time 0.2236 (0.2278) loss 2.9745 (3.1872) grad_norm 2.8782 (2.7773) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][150/1251] eta 0:04:15 lr 0.000482 wd 0.0500 time 0.2304 (0.2323) data time 0.0007 (0.0045) model time 0.2298 (0.2279) loss 3.8291 (3.1772) grad_norm 2.0707 (2.8020) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][160/1251] eta 0:04:13 lr 0.000482 wd 0.0500 time 0.2235 (0.2319) data time 0.0007 (0.0043) model time 0.2227 (0.2276) loss 2.9567 (3.1607) grad_norm 1.9150 (2.7747) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][170/1251] eta 0:04:10 lr 0.000482 wd 0.0500 time 0.2213 (0.2317) data time 0.0009 (0.0041) model time 0.2204 (0.2276) loss 2.8012 (3.1639) grad_norm 2.0730 (2.7607) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][180/1251] eta 0:04:07 lr 0.000482 wd 0.0500 time 0.2176 (0.2314) data time 0.0010 (0.0040) model time 0.2166 (0.2274) loss 3.0303 (3.1750) grad_norm 2.3455 (2.7405) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][190/1251] eta 0:04:05 lr 0.000482 wd 0.0500 time 0.2215 (0.2312) data time 0.0012 (0.0038) model time 0.2202 (0.2273) loss 3.5486 (3.1832) grad_norm 2.2775 (2.7184) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 11:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][200/1251] eta 0:04:02 lr 0.000482 wd 0.0500 time 0.2252 (0.2311) data time 0.0011 (0.0037) model time 0.2241 (0.2274) loss 3.6923 (3.1865) grad_norm 2.5225 (2.7097) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][210/1251] eta 0:04:00 lr 0.000482 wd 0.0500 time 0.2250 (0.2309) data time 0.0009 (0.0035) model time 0.2242 (0.2273) loss 3.1057 (3.1905) grad_norm 2.3581 (2.7201) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][220/1251] eta 0:03:58 lr 0.000482 wd 0.0500 time 0.2314 (0.2310) data time 0.0010 (0.0034) model time 0.2304 (0.2276) loss 4.1023 (3.1981) grad_norm 3.2828 (2.7152) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][230/1251] eta 0:03:55 lr 0.000482 wd 0.0500 time 0.2349 (0.2311) data time 0.0008 (0.0033) model time 0.2341 (0.2278) loss 3.7875 (3.1916) grad_norm 2.5898 (2.6992) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][240/1251] eta 0:03:53 lr 0.000482 wd 0.0500 time 0.2258 (0.2310) data time 0.0007 (0.0033) model time 0.2250 (0.2279) loss 3.8198 (3.1956) grad_norm 2.3729 (2.6969) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][250/1251] eta 0:03:51 lr 0.000482 wd 0.0500 time 0.2265 (0.2309) data time 0.0009 (0.0032) model time 0.2256 (0.2278) loss 3.2728 (3.2020) grad_norm 2.3716 (2.7028) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][260/1251] eta 0:03:48 lr 0.000482 wd 0.0500 time 0.2348 (0.2308) data time 0.0011 (0.0031) model time 0.2337 
(0.2278) loss 2.7875 (3.2036) grad_norm 2.2582 (2.7014) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][270/1251] eta 0:03:46 lr 0.000482 wd 0.0500 time 0.2190 (0.2307) data time 0.0009 (0.0030) model time 0.2181 (0.2277) loss 2.6585 (3.1898) grad_norm 1.9843 (2.6874) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][280/1251] eta 0:03:43 lr 0.000482 wd 0.0500 time 0.2218 (0.2306) data time 0.0010 (0.0029) model time 0.2208 (0.2277) loss 3.1640 (3.1896) grad_norm 3.1647 (2.6931) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][290/1251] eta 0:03:41 lr 0.000482 wd 0.0500 time 0.2255 (0.2305) data time 0.0008 (0.0029) model time 0.2247 (0.2276) loss 3.0992 (3.1784) grad_norm 2.2321 (2.6809) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][300/1251] eta 0:03:39 lr 0.000481 wd 0.0500 time 0.2278 (0.2304) data time 0.0008 (0.0028) model time 0.2270 (0.2276) loss 3.3074 (3.1864) grad_norm 2.1035 (2.6767) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][310/1251] eta 0:03:36 lr 0.000481 wd 0.0500 time 0.2332 (0.2304) data time 0.0009 (0.0028) model time 0.2323 (0.2277) loss 3.6850 (3.1896) grad_norm 3.8231 (2.7058) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][320/1251] eta 0:03:34 lr 0.000481 wd 0.0500 time 0.2261 (0.2303) data time 0.0010 (0.0027) model time 0.2250 (0.2277) loss 3.2420 (3.1847) grad_norm 2.5316 (2.7080) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[164/300][330/1251] eta 0:03:32 lr 0.000481 wd 0.0500 time 0.2295 (0.2302) data time 0.0009 (0.0027) model time 0.2286 (0.2276) loss 3.7348 (3.1808) grad_norm 2.1684 (2.7170) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][340/1251] eta 0:03:29 lr 0.000481 wd 0.0500 time 0.2300 (0.2302) data time 0.0009 (0.0026) model time 0.2292 (0.2276) loss 3.6718 (3.1729) grad_norm 2.0516 (2.7340) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][350/1251] eta 0:03:27 lr 0.000481 wd 0.0500 time 0.2225 (0.2301) data time 0.0009 (0.0026) model time 0.2216 (0.2275) loss 2.9864 (3.1730) grad_norm 2.5984 (2.7239) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][360/1251] eta 0:03:24 lr 0.000481 wd 0.0500 time 0.2214 (0.2300) data time 0.0007 (0.0025) model time 0.2207 (0.2275) loss 3.2558 (3.1770) grad_norm 4.2083 (2.7338) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][370/1251] eta 0:03:22 lr 0.000481 wd 0.0500 time 0.2253 (0.2299) data time 0.0010 (0.0025) model time 0.2243 (0.2274) loss 3.2700 (3.1782) grad_norm 2.1300 (2.7298) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][380/1251] eta 0:03:20 lr 0.000481 wd 0.0500 time 0.2180 (0.2304) data time 0.0010 (0.0024) model time 0.2170 (0.2280) loss 1.9660 (3.1723) grad_norm 2.4067 (2.7249) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][390/1251] eta 0:03:18 lr 0.000481 wd 0.0500 time 0.2312 (0.2303) data time 0.0007 (0.0024) model time 0.2305 (0.2280) loss 3.2941 (3.1801) grad_norm 2.7757 (2.7195) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 11:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][400/1251] eta 0:03:15 lr 0.000481 wd 0.0500 time 0.2278 (0.2303) data time 0.0013 (0.0024) model time 0.2265 (0.2279) loss 2.3649 (3.1828) grad_norm 3.1478 (2.7199) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][410/1251] eta 0:03:13 lr 0.000481 wd 0.0500 time 0.2317 (0.2302) data time 0.0009 (0.0024) model time 0.2308 (0.2279) loss 2.3731 (3.1800) grad_norm 2.1135 (2.7065) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][420/1251] eta 0:03:11 lr 0.000481 wd 0.0500 time 0.2328 (0.2301) data time 0.0009 (0.0023) model time 0.2319 (0.2279) loss 3.0662 (3.1847) grad_norm 4.6496 (2.7015) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][430/1251] eta 0:03:08 lr 0.000481 wd 0.0500 time 0.2276 (0.2301) data time 0.0009 (0.0023) model time 0.2267 (0.2278) loss 4.2893 (3.1814) grad_norm 2.2326 (2.7003) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][440/1251] eta 0:03:06 lr 0.000481 wd 0.0500 time 0.2248 (0.2304) data time 0.0011 (0.0023) model time 0.2237 (0.2282) loss 3.4593 (3.1721) grad_norm 3.5331 (2.6981) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][450/1251] eta 0:03:04 lr 0.000481 wd 0.0500 time 0.2278 (0.2303) data time 0.0007 (0.0022) model time 0.2272 (0.2282) loss 2.6182 (3.1707) grad_norm 1.9183 (2.7019) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][460/1251] eta 0:03:02 lr 0.000481 wd 0.0500 time 0.2263 (0.2303) data time 0.0012 
(0.0022) model time 0.2251 (0.2281) loss 2.4897 (3.1652) grad_norm 2.4781 (2.6928) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][470/1251] eta 0:02:59 lr 0.000481 wd 0.0500 time 0.2279 (0.2302) data time 0.0007 (0.0022) model time 0.2272 (0.2281) loss 2.5104 (3.1628) grad_norm 3.0293 (2.6945) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][480/1251] eta 0:02:57 lr 0.000481 wd 0.0500 time 0.2245 (0.2302) data time 0.0008 (0.0022) model time 0.2237 (0.2280) loss 3.6369 (3.1643) grad_norm 2.0083 (2.6904) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][490/1251] eta 0:02:55 lr 0.000481 wd 0.0500 time 0.2302 (0.2301) data time 0.0008 (0.0021) model time 0.2293 (0.2280) loss 3.1410 (3.1622) grad_norm 3.4527 (2.6881) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][500/1251] eta 0:02:52 lr 0.000481 wd 0.0500 time 0.2324 (0.2301) data time 0.0006 (0.0021) model time 0.2317 (0.2280) loss 3.6029 (3.1666) grad_norm 2.5449 (2.6820) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][510/1251] eta 0:02:50 lr 0.000481 wd 0.0500 time 0.2242 (0.2301) data time 0.0008 (0.0021) model time 0.2233 (0.2280) loss 2.7702 (3.1619) grad_norm 3.4182 (2.7297) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][520/1251] eta 0:02:48 lr 0.000480 wd 0.0500 time 0.2294 (0.2301) data time 0.0009 (0.0021) model time 0.2285 (0.2281) loss 3.5148 (3.1637) grad_norm 2.7224 (2.7417) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [164/300][530/1251] eta 0:02:45 lr 0.000480 wd 0.0500 time 0.2343 (0.2301) data time 0.0007 (0.0021) model time 0.2336 (0.2281) loss 3.5473 (3.1624) grad_norm 2.8734 (2.7467) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][540/1251] eta 0:02:43 lr 0.000480 wd 0.0500 time 0.2220 (0.2300) data time 0.0009 (0.0020) model time 0.2211 (0.2281) loss 3.6925 (3.1609) grad_norm 3.0137 (2.7730) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][550/1251] eta 0:02:41 lr 0.000480 wd 0.0500 time 0.2155 (0.2300) data time 0.0009 (0.0020) model time 0.2145 (0.2280) loss 3.5824 (3.1647) grad_norm 2.2442 (2.7797) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][560/1251] eta 0:02:38 lr 0.000480 wd 0.0500 time 0.2245 (0.2300) data time 0.0007 (0.0020) model time 0.2238 (0.2281) loss 2.3675 (3.1653) grad_norm 3.5787 (2.7794) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][570/1251] eta 0:02:36 lr 0.000480 wd 0.0500 time 0.2402 (0.2301) data time 0.0008 (0.0020) model time 0.2394 (0.2282) loss 3.5411 (3.1619) grad_norm 2.0764 (2.7719) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][580/1251] eta 0:02:34 lr 0.000480 wd 0.0500 time 0.2323 (0.2301) data time 0.0009 (0.0020) model time 0.2314 (0.2282) loss 3.2521 (3.1561) grad_norm 2.5005 (2.7655) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][590/1251] eta 0:02:32 lr 0.000480 wd 0.0500 time 0.2239 (0.2301) data time 0.0010 (0.0020) model time 0.2230 (0.2282) loss 2.9844 (3.1588) grad_norm 2.0952 (2.7597) loss_scale 
1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][600/1251] eta 0:02:29 lr 0.000480 wd 0.0500 time 0.2261 (0.2301) data time 0.0010 (0.0020) model time 0.2251 (0.2282) loss 3.7179 (3.1619) grad_norm 2.1185 (2.7576) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][610/1251] eta 0:02:27 lr 0.000480 wd 0.0500 time 0.2295 (0.2301) data time 0.0010 (0.0020) model time 0.2285 (0.2282) loss 3.6963 (3.1615) grad_norm 2.4284 (2.7534) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][620/1251] eta 0:02:25 lr 0.000480 wd 0.0500 time 0.2284 (0.2301) data time 0.0008 (0.0019) model time 0.2276 (0.2282) loss 3.6606 (3.1632) grad_norm 1.9088 (2.7475) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][630/1251] eta 0:02:22 lr 0.000480 wd 0.0500 time 0.2204 (0.2301) data time 0.0010 (0.0019) model time 0.2193 (0.2282) loss 2.2209 (3.1594) grad_norm 2.4586 (2.7468) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][640/1251] eta 0:02:20 lr 0.000480 wd 0.0500 time 0.2245 (0.2301) data time 0.0008 (0.0019) model time 0.2237 (0.2282) loss 3.6892 (3.1657) grad_norm 2.2701 (2.7419) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][650/1251] eta 0:02:18 lr 0.000480 wd 0.0500 time 0.2289 (0.2301) data time 0.0011 (0.0019) model time 0.2278 (0.2282) loss 2.1974 (3.1615) grad_norm 2.5974 (2.7522) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][660/1251] eta 0:02:15 lr 0.000480 wd 0.0500 time 0.2263 (0.2301) data time 
0.0007 (0.0019) model time 0.2255 (0.2282) loss 3.9799 (3.1597) grad_norm 3.2577 (2.7500) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][670/1251] eta 0:02:13 lr 0.000480 wd 0.0500 time 0.2181 (0.2300) data time 0.0011 (0.0019) model time 0.2170 (0.2282) loss 2.7285 (3.1612) grad_norm 1.8409 (2.7483) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][680/1251] eta 0:02:11 lr 0.000480 wd 0.0500 time 0.2255 (0.2300) data time 0.0008 (0.0019) model time 0.2247 (0.2282) loss 3.5647 (3.1618) grad_norm 3.3171 (2.7415) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][690/1251] eta 0:02:09 lr 0.000480 wd 0.0500 time 0.2235 (0.2300) data time 0.0007 (0.0019) model time 0.2228 (0.2281) loss 3.2937 (3.1639) grad_norm 1.7741 (2.7348) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][700/1251] eta 0:02:06 lr 0.000480 wd 0.0500 time 0.2253 (0.2300) data time 0.0010 (0.0019) model time 0.2243 (0.2281) loss 2.7196 (3.1604) grad_norm 4.1852 (2.7311) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][710/1251] eta 0:02:04 lr 0.000480 wd 0.0500 time 0.2293 (0.2299) data time 0.0008 (0.0018) model time 0.2286 (0.2281) loss 1.8466 (3.1594) grad_norm 4.5833 (2.7466) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][720/1251] eta 0:02:02 lr 0.000480 wd 0.0500 time 0.2376 (0.2299) data time 0.0007 (0.0018) model time 0.2369 (0.2281) loss 2.9835 (3.1578) grad_norm 1.8036 (2.7427) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [164/300][730/1251] eta 0:01:59 lr 0.000480 wd 0.0500 time 0.2278 (0.2299) data time 0.0009 (0.0018) model time 0.2268 (0.2281) loss 3.0960 (3.1590) grad_norm 2.5807 (2.7403) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][740/1251] eta 0:01:57 lr 0.000480 wd 0.0500 time 0.2333 (0.2299) data time 0.0007 (0.0018) model time 0.2326 (0.2281) loss 3.7143 (3.1604) grad_norm 4.2472 (2.7457) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][750/1251] eta 0:01:55 lr 0.000479 wd 0.0500 time 0.2262 (0.2299) data time 0.0007 (0.0018) model time 0.2256 (0.2281) loss 3.7971 (3.1606) grad_norm 1.7771 (2.7399) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][760/1251] eta 0:01:52 lr 0.000479 wd 0.0500 time 0.2201 (0.2299) data time 0.0011 (0.0018) model time 0.2190 (0.2281) loss 2.7357 (3.1608) grad_norm 2.1233 (2.7386) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][770/1251] eta 0:01:50 lr 0.000479 wd 0.0500 time 0.2322 (0.2298) data time 0.0009 (0.0018) model time 0.2313 (0.2281) loss 2.9925 (3.1614) grad_norm 1.9572 (2.7335) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][780/1251] eta 0:01:48 lr 0.000479 wd 0.0500 time 0.2270 (0.2298) data time 0.0009 (0.0018) model time 0.2261 (0.2281) loss 3.5429 (3.1613) grad_norm 2.4443 (2.7310) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][790/1251] eta 0:01:45 lr 0.000479 wd 0.0500 time 0.2265 (0.2298) data time 0.0014 (0.0018) model time 0.2251 (0.2281) loss 3.3171 (3.1591) grad_norm 2.1344 (2.7327) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][800/1251] eta 0:01:43 lr 0.000479 wd 0.0500 time 0.2268 (0.2298) data time 0.0009 (0.0018) model time 0.2260 (0.2281) loss 3.8307 (3.1627) grad_norm 2.3584 (2.7277) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][810/1251] eta 0:01:41 lr 0.000479 wd 0.0500 time 0.2206 (0.2298) data time 0.0008 (0.0017) model time 0.2198 (0.2281) loss 3.0934 (3.1671) grad_norm 2.1768 (2.7223) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][820/1251] eta 0:01:39 lr 0.000479 wd 0.0500 time 0.2333 (0.2298) data time 0.0007 (0.0017) model time 0.2326 (0.2280) loss 2.9895 (3.1680) grad_norm 2.7712 (2.7176) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][830/1251] eta 0:01:36 lr 0.000479 wd 0.0500 time 0.2279 (0.2297) data time 0.0007 (0.0017) model time 0.2272 (0.2280) loss 3.5257 (3.1699) grad_norm 1.9635 (2.7133) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][840/1251] eta 0:01:34 lr 0.000479 wd 0.0500 time 0.2237 (0.2297) data time 0.0009 (0.0017) model time 0.2228 (0.2280) loss 4.1140 (3.1708) grad_norm 2.3871 (2.7149) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][850/1251] eta 0:01:32 lr 0.000479 wd 0.0500 time 0.2283 (0.2297) data time 0.0012 (0.0017) model time 0.2271 (0.2280) loss 3.6562 (3.1692) grad_norm 2.3071 (2.7151) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][860/1251] eta 0:01:29 lr 0.000479 wd 0.0500 time 0.2295 (0.2297) 
data time 0.0012 (0.0017) model time 0.2283 (0.2280) loss 3.3128 (3.1670) grad_norm 1.8228 (2.7126) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][870/1251] eta 0:01:27 lr 0.000479 wd 0.0500 time 0.2270 (0.2297) data time 0.0010 (0.0017) model time 0.2260 (0.2280) loss 3.0542 (3.1679) grad_norm 2.1827 (2.7080) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][880/1251] eta 0:01:25 lr 0.000479 wd 0.0500 time 0.2370 (0.2297) data time 0.0009 (0.0017) model time 0.2361 (0.2280) loss 2.8150 (3.1652) grad_norm 2.5109 (2.7126) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][890/1251] eta 0:01:22 lr 0.000479 wd 0.0500 time 0.2252 (0.2297) data time 0.0008 (0.0017) model time 0.2244 (0.2280) loss 3.9277 (3.1671) grad_norm 3.6117 (2.7195) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][900/1251] eta 0:01:20 lr 0.000479 wd 0.0500 time 0.2227 (0.2297) data time 0.0010 (0.0017) model time 0.2217 (0.2280) loss 3.8903 (3.1702) grad_norm 2.5729 (2.7159) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][910/1251] eta 0:01:18 lr 0.000479 wd 0.0500 time 0.2225 (0.2297) data time 0.0012 (0.0017) model time 0.2212 (0.2280) loss 3.2207 (3.1725) grad_norm 2.8493 (2.7134) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][920/1251] eta 0:01:16 lr 0.000479 wd 0.0500 time 0.2269 (0.2296) data time 0.0008 (0.0017) model time 0.2261 (0.2280) loss 3.6333 (3.1741) grad_norm 2.2940 (2.7121) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:49 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [164/300][930/1251] eta 0:01:13 lr 0.000479 wd 0.0500 time 0.2272 (0.2297) data time 0.0011 (0.0017) model time 0.2262 (0.2280) loss 2.2542 (3.1739) grad_norm 2.3886 (2.7094) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][940/1251] eta 0:01:11 lr 0.000479 wd 0.0500 time 0.2311 (0.2297) data time 0.0007 (0.0017) model time 0.2304 (0.2281) loss 2.2719 (3.1710) grad_norm 3.4808 (2.7106) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][950/1251] eta 0:01:09 lr 0.000479 wd 0.0500 time 0.2171 (0.2297) data time 0.0009 (0.0017) model time 0.2162 (0.2280) loss 4.0002 (3.1694) grad_norm 2.0775 (2.7114) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][960/1251] eta 0:01:06 lr 0.000479 wd 0.0500 time 0.2370 (0.2299) data time 0.0007 (0.0017) model time 0.2363 (0.2283) loss 2.2340 (3.1693) grad_norm 2.2437 (2.7118) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][970/1251] eta 0:01:04 lr 0.000478 wd 0.0500 time 0.2383 (0.2299) data time 0.0010 (0.0017) model time 0.2373 (0.2282) loss 3.2272 (3.1705) grad_norm 3.1524 (2.7171) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][980/1251] eta 0:01:02 lr 0.000478 wd 0.0500 time 0.2255 (0.2298) data time 0.0008 (0.0017) model time 0.2247 (0.2282) loss 2.4092 (3.1723) grad_norm 2.7051 (2.7158) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][990/1251] eta 0:00:59 lr 0.000478 wd 0.0500 time 0.2272 (0.2298) data time 0.0012 (0.0017) model time 0.2260 (0.2282) loss 3.3186 (3.1751) grad_norm 
2.8774 (2.7132) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1000/1251] eta 0:00:57 lr 0.000478 wd 0.0500 time 0.2274 (0.2298) data time 0.0008 (0.0016) model time 0.2265 (0.2282) loss 3.6994 (3.1771) grad_norm 2.1935 (2.7094) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1010/1251] eta 0:00:55 lr 0.000478 wd 0.0500 time 0.2288 (0.2298) data time 0.0007 (0.0016) model time 0.2281 (0.2282) loss 3.8587 (3.1777) grad_norm 2.0953 (2.7072) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1020/1251] eta 0:00:53 lr 0.000478 wd 0.0500 time 0.2224 (0.2298) data time 0.0008 (0.0016) model time 0.2215 (0.2282) loss 3.9609 (3.1782) grad_norm 1.9676 (2.7041) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1030/1251] eta 0:00:50 lr 0.000478 wd 0.0500 time 0.2228 (0.2298) data time 0.0010 (0.0016) model time 0.2218 (0.2282) loss 3.4050 (3.1802) grad_norm 2.0516 (2.7029) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1040/1251] eta 0:00:48 lr 0.000478 wd 0.0500 time 0.2335 (0.2298) data time 0.0007 (0.0016) model time 0.2328 (0.2283) loss 3.0317 (3.1808) grad_norm 1.8204 (2.7000) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1050/1251] eta 0:00:46 lr 0.000478 wd 0.0500 time 0.2306 (0.2298) data time 0.0009 (0.0016) model time 0.2297 (0.2283) loss 2.6964 (3.1811) grad_norm 2.7600 (2.6971) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1060/1251] eta 0:00:43 lr 0.000478 wd 
0.0500 time 0.2342 (0.2298) data time 0.0009 (0.0016) model time 0.2333 (0.2282) loss 3.5188 (3.1806) grad_norm 2.4630 (2.6963) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1070/1251] eta 0:00:41 lr 0.000478 wd 0.0500 time 0.2268 (0.2298) data time 0.0008 (0.0016) model time 0.2260 (0.2282) loss 3.2314 (3.1816) grad_norm 3.1034 (2.6985) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1080/1251] eta 0:00:39 lr 0.000478 wd 0.0500 time 0.2339 (0.2298) data time 0.0008 (0.0016) model time 0.2331 (0.2282) loss 3.1938 (3.1824) grad_norm 2.8869 (2.6974) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1090/1251] eta 0:00:37 lr 0.000478 wd 0.0500 time 0.2293 (0.2298) data time 0.0010 (0.0016) model time 0.2282 (0.2283) loss 3.6349 (3.1842) grad_norm 3.0947 (2.6972) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1100/1251] eta 0:00:34 lr 0.000478 wd 0.0500 time 0.2367 (0.2298) data time 0.0010 (0.0016) model time 0.2357 (0.2283) loss 2.3339 (3.1823) grad_norm 2.0767 (2.6951) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1110/1251] eta 0:00:32 lr 0.000478 wd 0.0500 time 0.2310 (0.2298) data time 0.0010 (0.0016) model time 0.2300 (0.2283) loss 3.2189 (3.1776) grad_norm 2.8130 (2.6941) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1120/1251] eta 0:00:30 lr 0.000478 wd 0.0500 time 0.2354 (0.2298) data time 0.0008 (0.0016) model time 0.2346 (0.2282) loss 2.7043 (3.1739) grad_norm 2.6211 (2.6942) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1130/1251] eta 0:00:27 lr 0.000478 wd 0.0500 time 0.2239 (0.2298) data time 0.0012 (0.0016) model time 0.2226 (0.2282) loss 3.4016 (3.1748) grad_norm 2.9169 (2.6952) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1140/1251] eta 0:00:25 lr 0.000478 wd 0.0500 time 0.2311 (0.2298) data time 0.0007 (0.0016) model time 0.2304 (0.2282) loss 3.7706 (3.1751) grad_norm 2.0923 (2.6969) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1150/1251] eta 0:00:23 lr 0.000478 wd 0.0500 time 0.2203 (0.2298) data time 0.0007 (0.0016) model time 0.2196 (0.2282) loss 3.8057 (3.1765) grad_norm 3.1022 (2.6937) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1160/1251] eta 0:00:20 lr 0.000478 wd 0.0500 time 0.2362 (0.2298) data time 0.0014 (0.0016) model time 0.2348 (0.2283) loss 2.9385 (3.1754) grad_norm 2.2314 (2.6897) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1170/1251] eta 0:00:18 lr 0.000478 wd 0.0500 time 0.2274 (0.2298) data time 0.0008 (0.0016) model time 0.2266 (0.2283) loss 2.9707 (3.1756) grad_norm 2.5458 (2.6927) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1180/1251] eta 0:00:16 lr 0.000478 wd 0.0500 time 0.2305 (0.2298) data time 0.0007 (0.0016) model time 0.2298 (0.2282) loss 3.8961 (3.1749) grad_norm 3.3816 (2.6940) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1190/1251] eta 0:00:14 lr 0.000478 wd 0.0500 time 0.2254 (0.2298) data time 0.0007 (0.0016) model time 0.2247 (0.2282) loss 
2.9162 (3.1720) grad_norm 2.4955 (2.6981) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1200/1251] eta 0:00:11 lr 0.000477 wd 0.0500 time 0.2261 (0.2297) data time 0.0007 (0.0016) model time 0.2254 (0.2282) loss 3.5885 (3.1725) grad_norm 2.5418 (2.7008) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1210/1251] eta 0:00:09 lr 0.000477 wd 0.0500 time 0.2281 (0.2297) data time 0.0009 (0.0016) model time 0.2273 (0.2282) loss 3.5429 (3.1719) grad_norm 2.0927 (2.6970) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1220/1251] eta 0:00:07 lr 0.000477 wd 0.0500 time 0.2345 (0.2297) data time 0.0009 (0.0016) model time 0.2336 (0.2282) loss 3.0084 (3.1740) grad_norm 2.6456 (2.6963) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1230/1251] eta 0:00:04 lr 0.000477 wd 0.0500 time 0.2235 (0.2297) data time 0.0010 (0.0015) model time 0.2225 (0.2282) loss 3.4074 (3.1757) grad_norm 3.7154 (2.6998) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1240/1251] eta 0:00:02 lr 0.000477 wd 0.0500 time 0.2151 (0.2296) data time 0.0005 (0.0015) model time 0.2146 (0.2281) loss 2.6006 (3.1745) grad_norm 2.3941 (2.7013) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [164/300][1250/1251] eta 0:00:00 lr 0.000477 wd 0.0500 time 0.2132 (0.2295) data time 0.0005 (0.0015) model time 0.2127 (0.2280) loss 2.2431 (3.1740) grad_norm 2.3918 (2.6979) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 164 training takes 0:04:47 
[2024-08-27 11:13:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-27 11:13:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-27 11:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.550 (0.550) Loss 0.4741 (0.4741) Acc@1 91.309 (91.309) Acc@5 98.242 (98.242) Mem 7381MB
[2024-08-27 11:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.122) Loss 0.7529 (0.7383) Acc@1 85.645 (84.544) Acc@5 95.801 (96.982) Mem 7381MB
[2024-08-27 11:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.102) Loss 1.0449 (0.7606) Acc@1 76.270 (83.617) Acc@5 94.434 (96.949) Mem 7381MB
[2024-08-27 11:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.095) Loss 1.2617 (0.8606) Acc@1 71.387 (81.338) Acc@5 91.406 (95.851) Mem 7381MB
[2024-08-27 11:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.1660 (0.9127) Acc@1 73.633 (80.071) Acc@5 92.969 (95.246) Mem 7381MB
[2024-08-27 11:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.718 Acc@5 95.162
[2024-08-27 11:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.7%
[2024-08-27 11:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.72%
[2024-08-27 11:13:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-27 11:13:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-27 11:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.481 (0.481) Loss 0.4048 (0.4048) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7381MB
[2024-08-27 11:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.117) Loss 0.6304 (0.6295) Acc@1 87.109 (86.515) Acc@5 97.461 (97.399) Mem 7381MB
[2024-08-27 11:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.100) Loss 0.8975 (0.6541) Acc@1 78.418 (85.565) Acc@5 95.508 (97.396) Mem 7381MB
[2024-08-27 11:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.093) Loss 1.1250 (0.7425) Acc@1 72.754 (83.465) Acc@5 93.164 (96.440) Mem 7381MB
[2024-08-27 11:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0244 (0.7887) Acc@1 74.707 (82.100) Acc@5 93.848 (95.968) Mem 7381MB
[2024-08-27 11:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.706 Acc@5 95.954
[2024-08-27 11:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.7%
[2024-08-27 11:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.71%
[2024-08-27 11:13:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 11:13:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 11:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][0/1251] eta 0:16:27 lr 0.000477 wd 0.0500 time 0.7890 (0.7890) data time 0.5490 (0.5490) model time 0.0000 (0.0000) loss 4.1007 (4.1007) grad_norm 2.9090 (2.9090) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][10/1251] eta 0:05:46 lr 0.000477 wd 0.0500 time 0.2318 (0.2796) data time 0.0010 (0.0510) model time 0.0000 (0.0000) loss 3.5697 (3.4454) grad_norm 2.8157 (2.7658) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][20/1251] eta 0:05:14 lr 0.000477 wd 0.0500 time 0.2459 (0.2553) data time 0.0007 (0.0271) model time 0.0000 (0.0000) loss 2.0221 (3.1123) grad_norm 2.0066 (2.6399) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][30/1251] eta 0:05:00 lr 0.000477 wd 0.0500 time 0.2287 (0.2461) data time 0.0012 (0.0187) model time 0.0000 (0.0000) loss 3.0666 (3.0000) grad_norm 2.3032 (2.5382) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][40/1251] eta 0:04:52 lr 0.000477 wd 0.0500 time 0.2281 (0.2416) data time 0.0008 (0.0144) model time 0.0000 (0.0000) loss 3.3405 (3.0202) grad_norm 4.6027 (2.6402) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][50/1251] eta 0:04:46 lr 0.000477 wd 0.0500 time 0.2314 (0.2388) data time 0.0007 (0.0118) model time 0.0000 (0.0000) loss 4.2465 (3.0571) grad_norm 2.0926 (2.6701) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][60/1251] eta 0:04:41 lr 0.000477 wd 0.0500 time 0.2276 (0.2368) data time 0.0008 (0.0100) model time 0.2268 
(0.2253) loss 3.8990 (3.0825) grad_norm 3.1155 (2.6895) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][70/1251] eta 0:04:38 lr 0.000477 wd 0.0500 time 0.2295 (0.2359) data time 0.0007 (0.0088) model time 0.2287 (0.2275) loss 3.5984 (3.1131) grad_norm 2.3231 (2.7075) loss_scale 2048.0000 (1168.2254) mem 7381MB [2024-08-27 11:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][80/1251] eta 0:04:35 lr 0.000477 wd 0.0500 time 0.2316 (0.2351) data time 0.0009 (0.0078) model time 0.2307 (0.2278) loss 3.6425 (3.1576) grad_norm 2.9946 (2.6993) loss_scale 2048.0000 (1276.8395) mem 7381MB [2024-08-27 11:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][90/1251] eta 0:04:32 lr 0.000477 wd 0.0500 time 0.2328 (0.2344) data time 0.0008 (0.0071) model time 0.2319 (0.2277) loss 3.7223 (3.1403) grad_norm 2.9881 (2.6765) loss_scale 2048.0000 (1361.5824) mem 7381MB [2024-08-27 11:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][100/1251] eta 0:04:28 lr 0.000477 wd 0.0500 time 0.2252 (0.2336) data time 0.0008 (0.0065) model time 0.2244 (0.2273) loss 3.3772 (3.1414) grad_norm 3.9451 (2.6849) loss_scale 2048.0000 (1429.5446) mem 7381MB [2024-08-27 11:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][110/1251] eta 0:04:26 lr 0.000477 wd 0.0500 time 0.2248 (0.2333) data time 0.0009 (0.0060) model time 0.2239 (0.2276) loss 3.2865 (3.1438) grad_norm 2.4760 (2.7361) loss_scale 2048.0000 (1485.2613) mem 7381MB [2024-08-27 11:13:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][120/1251] eta 0:04:23 lr 0.000477 wd 0.0500 time 0.2254 (0.2328) data time 0.0009 (0.0056) model time 0.2245 (0.2274) loss 2.6256 (3.1354) grad_norm 2.5906 (2.7232) loss_scale 2048.0000 (1531.7686) mem 7381MB [2024-08-27 11:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][130/1251] 
eta 0:04:20 lr 0.000477 wd 0.0500 time 0.2290 (0.2324) data time 0.0009 (0.0052) model time 0.2281 (0.2273) loss 2.9194 (3.1137) grad_norm 2.5073 (2.7116) loss_scale 2048.0000 (1571.1756) mem 7381MB [2024-08-27 11:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][140/1251] eta 0:04:18 lr 0.000477 wd 0.0500 time 0.2303 (0.2323) data time 0.0010 (0.0049) model time 0.2293 (0.2275) loss 3.3450 (3.1291) grad_norm 3.0379 (2.7195) loss_scale 2048.0000 (1604.9929) mem 7381MB [2024-08-27 11:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][150/1251] eta 0:04:15 lr 0.000477 wd 0.0500 time 0.2312 (0.2322) data time 0.0010 (0.0047) model time 0.2302 (0.2277) loss 3.3555 (3.1425) grad_norm 2.1833 (2.7283) loss_scale 2048.0000 (1634.3311) mem 7381MB [2024-08-27 11:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][160/1251] eta 0:04:13 lr 0.000477 wd 0.0500 time 0.2348 (0.2320) data time 0.0007 (0.0045) model time 0.2341 (0.2278) loss 4.1035 (3.1321) grad_norm 2.9716 (2.7041) loss_scale 2048.0000 (1660.0248) mem 7381MB [2024-08-27 11:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][170/1251] eta 0:04:10 lr 0.000476 wd 0.0500 time 0.2378 (0.2318) data time 0.0010 (0.0043) model time 0.2368 (0.2278) loss 3.4683 (3.1455) grad_norm 2.0710 (2.7020) loss_scale 2048.0000 (1682.7135) mem 7381MB [2024-08-27 11:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][180/1251] eta 0:04:08 lr 0.000476 wd 0.0500 time 0.2257 (0.2316) data time 0.0009 (0.0041) model time 0.2248 (0.2277) loss 3.4216 (3.1439) grad_norm 2.3175 (2.6966) loss_scale 2048.0000 (1702.8950) mem 7381MB [2024-08-27 11:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][190/1251] eta 0:04:05 lr 0.000476 wd 0.0500 time 0.2310 (0.2315) data time 0.0011 (0.0039) model time 0.2299 (0.2278) loss 2.6170 (3.1430) grad_norm 2.3215 (2.6873) loss_scale 2048.0000 (1720.9634) mem 7381MB 
[2024-08-27 11:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][200/1251] eta 0:04:03 lr 0.000476 wd 0.0500 time 0.2310 (0.2314) data time 0.0007 (0.0038) model time 0.2303 (0.2279) loss 3.6136 (3.1512) grad_norm 2.1052 (2.6813) loss_scale 2048.0000 (1737.2338) mem 7381MB [2024-08-27 11:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][210/1251] eta 0:04:00 lr 0.000476 wd 0.0500 time 0.2225 (0.2311) data time 0.0008 (0.0036) model time 0.2217 (0.2276) loss 2.6642 (3.1546) grad_norm 3.2765 (2.6884) loss_scale 2048.0000 (1751.9621) mem 7381MB [2024-08-27 11:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][220/1251] eta 0:03:58 lr 0.000476 wd 0.0500 time 0.2308 (0.2310) data time 0.0009 (0.0035) model time 0.2299 (0.2276) loss 3.4968 (3.1503) grad_norm 2.6956 (2.6823) loss_scale 2048.0000 (1765.3575) mem 7381MB [2024-08-27 11:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][230/1251] eta 0:03:55 lr 0.000476 wd 0.0500 time 0.2267 (0.2309) data time 0.0008 (0.0034) model time 0.2259 (0.2276) loss 3.2179 (3.1407) grad_norm 3.6343 (2.6759) loss_scale 2048.0000 (1777.5931) mem 7381MB [2024-08-27 11:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][240/1251] eta 0:03:53 lr 0.000476 wd 0.0500 time 0.2349 (0.2309) data time 0.0007 (0.0033) model time 0.2343 (0.2277) loss 2.9255 (3.1383) grad_norm 2.1710 (2.6626) loss_scale 2048.0000 (1788.8133) mem 7381MB [2024-08-27 11:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][250/1251] eta 0:03:51 lr 0.000476 wd 0.0500 time 0.2405 (0.2310) data time 0.0009 (0.0032) model time 0.2396 (0.2279) loss 3.5508 (3.1413) grad_norm 3.2965 (2.6553) loss_scale 2048.0000 (1799.1394) mem 7381MB [2024-08-27 11:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][260/1251] eta 0:03:48 lr 0.000476 wd 0.0500 time 0.2284 (0.2308) data time 0.0007 (0.0032) model time 0.2277 
(0.2278) loss 3.2234 (3.1470) grad_norm 2.7544 (2.6462) loss_scale 2048.0000 (1808.6743) mem 7381MB [2024-08-27 11:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][270/1251] eta 0:03:46 lr 0.000476 wd 0.0500 time 0.2256 (0.2307) data time 0.0011 (0.0031) model time 0.2246 (0.2277) loss 3.2795 (3.1432) grad_norm 2.5817 (2.6454) loss_scale 2048.0000 (1817.5055) mem 7381MB [2024-08-27 11:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][280/1251] eta 0:03:43 lr 0.000476 wd 0.0500 time 0.2290 (0.2306) data time 0.0010 (0.0030) model time 0.2280 (0.2277) loss 4.0613 (3.1473) grad_norm 2.5974 (2.6440) loss_scale 2048.0000 (1825.7082) mem 7381MB [2024-08-27 11:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][290/1251] eta 0:03:41 lr 0.000476 wd 0.0500 time 0.2307 (0.2305) data time 0.0009 (0.0029) model time 0.2299 (0.2277) loss 3.1474 (3.1377) grad_norm 2.8761 (2.6732) loss_scale 2048.0000 (1833.3471) mem 7381MB [2024-08-27 11:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][300/1251] eta 0:03:39 lr 0.000476 wd 0.0500 time 0.2328 (0.2308) data time 0.0007 (0.0029) model time 0.2321 (0.2281) loss 1.8818 (3.1265) grad_norm 2.0473 (2.6612) loss_scale 2048.0000 (1840.4784) mem 7381MB [2024-08-27 11:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][310/1251] eta 0:03:37 lr 0.000476 wd 0.0500 time 0.2277 (0.2312) data time 0.0006 (0.0028) model time 0.2270 (0.2287) loss 2.7764 (3.1246) grad_norm 1.9664 (2.6525) loss_scale 2048.0000 (1847.1511) mem 7381MB [2024-08-27 11:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][320/1251] eta 0:03:35 lr 0.000476 wd 0.0500 time 0.2347 (0.2312) data time 0.0007 (0.0028) model time 0.2340 (0.2288) loss 2.8171 (3.1206) grad_norm 1.9499 (2.6444) loss_scale 2048.0000 (1853.4081) mem 7381MB [2024-08-27 11:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[165/300][330/1251] eta 0:03:32 lr 0.000476 wd 0.0500 time 0.2384 (0.2312) data time 0.0008 (0.0027) model time 0.2376 (0.2287) loss 3.6010 (3.1122) grad_norm 1.9138 (2.6510) loss_scale 2048.0000 (1859.2870) mem 7381MB [2024-08-27 11:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][340/1251] eta 0:03:30 lr 0.000476 wd 0.0500 time 0.2235 (0.2311) data time 0.0010 (0.0027) model time 0.2226 (0.2286) loss 2.1544 (3.1088) grad_norm 2.7719 (2.6757) loss_scale 2048.0000 (1864.8211) mem 7381MB [2024-08-27 11:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][350/1251] eta 0:03:28 lr 0.000476 wd 0.0500 time 0.2338 (0.2311) data time 0.0009 (0.0026) model time 0.2329 (0.2287) loss 3.5444 (3.1098) grad_norm 2.0264 (2.6744) loss_scale 2048.0000 (1870.0399) mem 7381MB [2024-08-27 11:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][360/1251] eta 0:03:26 lr 0.000476 wd 0.0500 time 0.2263 (0.2315) data time 0.0009 (0.0026) model time 0.2254 (0.2293) loss 2.5851 (3.1023) grad_norm 1.5112 (2.6586) loss_scale 2048.0000 (1874.9695) mem 7381MB [2024-08-27 11:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][370/1251] eta 0:03:23 lr 0.000476 wd 0.0500 time 0.2296 (0.2315) data time 0.0008 (0.0025) model time 0.2289 (0.2292) loss 3.2848 (3.1009) grad_norm 2.7756 (2.6506) loss_scale 2048.0000 (1879.6334) mem 7381MB [2024-08-27 11:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][380/1251] eta 0:03:21 lr 0.000476 wd 0.0500 time 0.2332 (0.2315) data time 0.0007 (0.0025) model time 0.2326 (0.2293) loss 4.0193 (3.1075) grad_norm 2.3062 (2.6492) loss_scale 2048.0000 (1884.0525) mem 7381MB [2024-08-27 11:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][390/1251] eta 0:03:19 lr 0.000476 wd 0.0500 time 0.2298 (0.2314) data time 0.0008 (0.0025) model time 0.2290 (0.2292) loss 2.9222 (3.1083) grad_norm 2.4207 (2.6397) loss_scale 2048.0000 
(1888.2455) mem 7381MB [2024-08-27 11:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][400/1251] eta 0:03:16 lr 0.000475 wd 0.0500 time 0.2244 (0.2313) data time 0.0009 (0.0024) model time 0.2235 (0.2292) loss 3.6943 (3.1136) grad_norm 4.7100 (2.6444) loss_scale 2048.0000 (1892.2294) mem 7381MB [2024-08-27 11:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][410/1251] eta 0:03:14 lr 0.000475 wd 0.0500 time 0.2370 (0.2313) data time 0.0009 (0.0024) model time 0.2362 (0.2292) loss 3.5345 (3.1172) grad_norm 2.9892 (2.6452) loss_scale 2048.0000 (1896.0195) mem 7381MB [2024-08-27 11:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][420/1251] eta 0:03:12 lr 0.000475 wd 0.0500 time 0.2357 (0.2312) data time 0.0008 (0.0024) model time 0.2349 (0.2291) loss 2.5405 (3.1161) grad_norm 2.0676 (2.6434) loss_scale 2048.0000 (1899.6295) mem 7381MB [2024-08-27 11:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][430/1251] eta 0:03:09 lr 0.000475 wd 0.0500 time 0.2346 (0.2312) data time 0.0007 (0.0023) model time 0.2339 (0.2291) loss 2.9605 (3.1172) grad_norm 2.0756 (2.6405) loss_scale 2048.0000 (1903.0719) mem 7381MB [2024-08-27 11:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][440/1251] eta 0:03:07 lr 0.000475 wd 0.0500 time 0.2328 (0.2311) data time 0.0007 (0.0023) model time 0.2321 (0.2290) loss 3.4046 (3.1163) grad_norm 2.5678 (2.6525) loss_scale 2048.0000 (1906.3583) mem 7381MB [2024-08-27 11:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][450/1251] eta 0:03:05 lr 0.000475 wd 0.0500 time 0.2361 (0.2311) data time 0.0007 (0.0023) model time 0.2354 (0.2290) loss 3.4426 (3.1195) grad_norm 2.7557 (2.6542) loss_scale 2048.0000 (1909.4989) mem 7381MB [2024-08-27 11:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][460/1251] eta 0:03:02 lr 0.000475 wd 0.0500 time 0.2299 (0.2311) data time 0.0008 
(0.0023) model time 0.2291 (0.2291) loss 3.5722 (3.1201) grad_norm 3.8153 (2.6695) loss_scale 2048.0000 (1912.5033) mem 7381MB [2024-08-27 11:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][470/1251] eta 0:03:00 lr 0.000475 wd 0.0500 time 0.2283 (0.2310) data time 0.0006 (0.0022) model time 0.2277 (0.2290) loss 3.7955 (3.1241) grad_norm 1.9082 (2.6684) loss_scale 2048.0000 (1915.3800) mem 7381MB [2024-08-27 11:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][480/1251] eta 0:02:58 lr 0.000475 wd 0.0500 time 0.2253 (0.2310) data time 0.0007 (0.0022) model time 0.2246 (0.2290) loss 3.6727 (3.1233) grad_norm 3.2698 (2.6653) loss_scale 2048.0000 (1918.1372) mem 7381MB [2024-08-27 11:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][490/1251] eta 0:02:55 lr 0.000475 wd 0.0500 time 0.2357 (0.2309) data time 0.0006 (0.0022) model time 0.2350 (0.2289) loss 3.5950 (3.1234) grad_norm 2.9669 (2.6756) loss_scale 2048.0000 (1920.7821) mem 7381MB [2024-08-27 11:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][500/1251] eta 0:02:53 lr 0.000475 wd 0.0500 time 0.2291 (0.2310) data time 0.0009 (0.0022) model time 0.2282 (0.2290) loss 3.5968 (3.1262) grad_norm 2.9375 (2.6781) loss_scale 2048.0000 (1923.3214) mem 7381MB [2024-08-27 11:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][510/1251] eta 0:02:51 lr 0.000475 wd 0.0500 time 0.2182 (0.2309) data time 0.0013 (0.0021) model time 0.2170 (0.2290) loss 3.6157 (3.1302) grad_norm 2.0684 (2.6725) loss_scale 2048.0000 (1925.7613) mem 7381MB [2024-08-27 11:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][520/1251] eta 0:02:48 lr 0.000475 wd 0.0500 time 0.2277 (0.2309) data time 0.0008 (0.0021) model time 0.2269 (0.2289) loss 3.7072 (3.1331) grad_norm 2.2898 (2.6665) loss_scale 2048.0000 (1928.1075) mem 7381MB [2024-08-27 11:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [165/300][530/1251] eta 0:02:46 lr 0.000475 wd 0.0500 time 0.2289 (0.2308) data time 0.0013 (0.0021) model time 0.2276 (0.2289) loss 3.6226 (3.1358) grad_norm 2.0913 (2.6642) loss_scale 2048.0000 (1930.3653) mem 7381MB [2024-08-27 11:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][540/1251] eta 0:02:44 lr 0.000475 wd 0.0500 time 0.2311 (0.2308) data time 0.0007 (0.0021) model time 0.2304 (0.2289) loss 3.9673 (3.1377) grad_norm 3.7287 (2.6634) loss_scale 2048.0000 (1932.5397) mem 7381MB [2024-08-27 11:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][550/1251] eta 0:02:41 lr 0.000475 wd 0.0500 time 0.2345 (0.2307) data time 0.0009 (0.0021) model time 0.2336 (0.2288) loss 3.3491 (3.1356) grad_norm 2.3016 (2.6665) loss_scale 2048.0000 (1934.6352) mem 7381MB [2024-08-27 11:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][560/1251] eta 0:02:39 lr 0.000475 wd 0.0500 time 0.2268 (0.2307) data time 0.0007 (0.0021) model time 0.2262 (0.2288) loss 4.1481 (3.1372) grad_norm 2.3157 (2.6657) loss_scale 2048.0000 (1936.6560) mem 7381MB [2024-08-27 11:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][570/1251] eta 0:02:37 lr 0.000475 wd 0.0500 time 0.2239 (0.2306) data time 0.0008 (0.0020) model time 0.2231 (0.2287) loss 1.9266 (3.1393) grad_norm 2.6387 (2.6731) loss_scale 2048.0000 (1938.6060) mem 7381MB [2024-08-27 11:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][580/1251] eta 0:02:34 lr 0.000475 wd 0.0500 time 0.2258 (0.2306) data time 0.0008 (0.0020) model time 0.2249 (0.2287) loss 2.5660 (3.1349) grad_norm 2.0461 (2.6702) loss_scale 2048.0000 (1940.4888) mem 7381MB [2024-08-27 11:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][590/1251] eta 0:02:32 lr 0.000475 wd 0.0500 time 0.2319 (0.2306) data time 0.0009 (0.0020) model time 0.2310 (0.2287) loss 2.9657 (3.1345) grad_norm 2.4391 (2.6645) loss_scale 
2048.0000 (1942.3080) mem 7381MB [2024-08-27 11:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][600/1251] eta 0:02:30 lr 0.000475 wd 0.0500 time 0.2284 (0.2306) data time 0.0008 (0.0020) model time 0.2277 (0.2287) loss 3.9673 (3.1302) grad_norm 2.2274 (2.6582) loss_scale 2048.0000 (1944.0666) mem 7381MB [2024-08-27 11:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][610/1251] eta 0:02:27 lr 0.000475 wd 0.0500 time 0.2289 (0.2305) data time 0.0011 (0.0020) model time 0.2278 (0.2287) loss 3.5704 (3.1297) grad_norm 2.0505 (2.6495) loss_scale 2048.0000 (1945.7676) mem 7381MB [2024-08-27 11:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][620/1251] eta 0:02:25 lr 0.000474 wd 0.0500 time 0.2253 (0.2305) data time 0.0006 (0.0020) model time 0.2247 (0.2286) loss 3.3752 (3.1289) grad_norm 2.4936 (2.6432) loss_scale 2048.0000 (1947.4138) mem 7381MB [2024-08-27 11:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][630/1251] eta 0:02:23 lr 0.000474 wd 0.0500 time 0.2240 (0.2304) data time 0.0008 (0.0019) model time 0.2231 (0.2286) loss 3.6508 (3.1284) grad_norm 1.9274 (2.6367) loss_scale 2048.0000 (1949.0079) mem 7381MB [2024-08-27 11:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][640/1251] eta 0:02:20 lr 0.000474 wd 0.0500 time 0.2316 (0.2304) data time 0.0012 (0.0019) model time 0.2305 (0.2286) loss 3.3202 (3.1307) grad_norm 1.7480 (2.6395) loss_scale 2048.0000 (1950.5523) mem 7381MB [2024-08-27 11:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][650/1251] eta 0:02:18 lr 0.000474 wd 0.0500 time 0.2333 (0.2303) data time 0.0008 (0.0019) model time 0.2326 (0.2285) loss 3.0486 (3.1332) grad_norm 2.4082 (2.6369) loss_scale 2048.0000 (1952.0492) mem 7381MB [2024-08-27 11:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][660/1251] eta 0:02:16 lr 0.000474 wd 0.0500 time 0.2276 (0.2303) data time 
0.0007 (0.0019) model time 0.2269 (0.2285) loss 3.8344 (3.1323) grad_norm 2.5668 (2.6357) loss_scale 2048.0000 (1953.5008) mem 7381MB [2024-08-27 11:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][670/1251] eta 0:02:13 lr 0.000474 wd 0.0500 time 0.2402 (0.2303) data time 0.0011 (0.0019) model time 0.2392 (0.2285) loss 2.9472 (3.1314) grad_norm 2.6049 (2.6337) loss_scale 2048.0000 (1954.9091) mem 7381MB [2024-08-27 11:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][680/1251] eta 0:02:11 lr 0.000474 wd 0.0500 time 0.2260 (0.2303) data time 0.0012 (0.0019) model time 0.2248 (0.2285) loss 3.0085 (3.1300) grad_norm 2.1434 (2.6296) loss_scale 2048.0000 (1956.2761) mem 7381MB [2024-08-27 11:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][690/1251] eta 0:02:09 lr 0.000474 wd 0.0500 time 0.2220 (0.2303) data time 0.0008 (0.0019) model time 0.2211 (0.2285) loss 3.1196 (3.1297) grad_norm 1.8548 (2.6309) loss_scale 2048.0000 (1957.6035) mem 7381MB [2024-08-27 11:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][700/1251] eta 0:02:06 lr 0.000474 wd 0.0500 time 0.2423 (0.2303) data time 0.0011 (0.0019) model time 0.2412 (0.2285) loss 3.0699 (3.1321) grad_norm 3.7282 (2.6304) loss_scale 2048.0000 (1958.8930) mem 7381MB [2024-08-27 11:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][710/1251] eta 0:02:04 lr 0.000474 wd 0.0500 time 0.2306 (0.2302) data time 0.0011 (0.0019) model time 0.2296 (0.2285) loss 3.8202 (3.1357) grad_norm 2.0697 (2.6266) loss_scale 2048.0000 (1960.1463) mem 7381MB [2024-08-27 11:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][720/1251] eta 0:02:02 lr 0.000474 wd 0.0500 time 0.2246 (0.2302) data time 0.0010 (0.0019) model time 0.2235 (0.2284) loss 2.5852 (3.1348) grad_norm 2.1449 (2.6261) loss_scale 2048.0000 (1961.3648) mem 7381MB [2024-08-27 11:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [165/300][730/1251] eta 0:01:59 lr 0.000474 wd 0.0500 time 0.2342 (0.2302) data time 0.0010 (0.0018) model time 0.2333 (0.2284) loss 2.8809 (3.1340) grad_norm 2.6611 (2.6233) loss_scale 2048.0000 (1962.5499) mem 7381MB [2024-08-27 11:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][740/1251] eta 0:01:57 lr 0.000474 wd 0.0500 time 0.2200 (0.2301) data time 0.0010 (0.0018) model time 0.2189 (0.2284) loss 3.8604 (3.1342) grad_norm 2.6015 (2.6227) loss_scale 2048.0000 (1963.7031) mem 7381MB [2024-08-27 11:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][750/1251] eta 0:01:55 lr 0.000474 wd 0.0500 time 0.2298 (0.2301) data time 0.0007 (0.0018) model time 0.2291 (0.2283) loss 3.3855 (3.1332) grad_norm 2.0817 (2.6191) loss_scale 2048.0000 (1964.8256) mem 7381MB [2024-08-27 11:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][760/1251] eta 0:01:52 lr 0.000474 wd 0.0500 time 0.2266 (0.2301) data time 0.0007 (0.0018) model time 0.2258 (0.2283) loss 3.0923 (3.1349) grad_norm 2.0681 (2.6172) loss_scale 2048.0000 (1965.9185) mem 7381MB [2024-08-27 11:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][770/1251] eta 0:01:50 lr 0.000474 wd 0.0500 time 0.2180 (0.2300) data time 0.0012 (0.0018) model time 0.2169 (0.2283) loss 3.6430 (3.1333) grad_norm 2.7826 (2.6144) loss_scale 2048.0000 (1966.9831) mem 7381MB [2024-08-27 11:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][780/1251] eta 0:01:48 lr 0.000474 wd 0.0500 time 0.2275 (0.2300) data time 0.0009 (0.0018) model time 0.2266 (0.2283) loss 4.1724 (3.1354) grad_norm 3.3323 (2.6195) loss_scale 2048.0000 (1968.0205) mem 7381MB [2024-08-27 11:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][790/1251] eta 0:01:46 lr 0.000474 wd 0.0500 time 0.2361 (0.2300) data time 0.0010 (0.0018) model time 0.2351 (0.2283) loss 2.4524 (3.1346) grad_norm 2.3126 (2.6172) 
loss_scale 2048.0000 (1969.0316) mem 7381MB [2024-08-27 11:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][800/1251] eta 0:01:43 lr 0.000474 wd 0.0500 time 0.2238 (0.2300) data time 0.0010 (0.0018) model time 0.2228 (0.2283) loss 1.9113 (3.1350) grad_norm 2.9585 (2.6190) loss_scale 2048.0000 (1970.0175) mem 7381MB [2024-08-27 11:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][810/1251] eta 0:01:41 lr 0.000474 wd 0.0500 time 0.2341 (0.2300) data time 0.0010 (0.0018) model time 0.2331 (0.2283) loss 3.4615 (3.1374) grad_norm 4.7313 (2.6275) loss_scale 2048.0000 (1970.9790) mem 7381MB [2024-08-27 11:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][820/1251] eta 0:01:39 lr 0.000474 wd 0.0500 time 0.2301 (0.2299) data time 0.0006 (0.0018) model time 0.2295 (0.2283) loss 2.6585 (3.1385) grad_norm 2.7756 (2.6346) loss_scale 2048.0000 (1971.9172) mem 7381MB [2024-08-27 11:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][830/1251] eta 0:01:36 lr 0.000474 wd 0.0500 time 0.2320 (0.2299) data time 0.0008 (0.0018) model time 0.2312 (0.2282) loss 3.2858 (3.1437) grad_norm 1.6656 (2.6423) loss_scale 2048.0000 (1972.8327) mem 7381MB [2024-08-27 11:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][840/1251] eta 0:01:34 lr 0.000474 wd 0.0500 time 0.2265 (0.2301) data time 0.0009 (0.0018) model time 0.2256 (0.2285) loss 2.1831 (3.1386) grad_norm 2.6525 (2.6456) loss_scale 2048.0000 (1973.7265) mem 7381MB [2024-08-27 11:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][850/1251] eta 0:01:32 lr 0.000473 wd 0.0500 time 0.2347 (0.2301) data time 0.0008 (0.0017) model time 0.2339 (0.2285) loss 3.0325 (3.1400) grad_norm 1.7504 (2.6437) loss_scale 2048.0000 (1974.5993) mem 7381MB [2024-08-27 11:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][860/1251] eta 0:01:29 lr 0.000473 wd 0.0500 time 0.2274 (0.2301) 
data time 0.0006 (0.0017) model time 0.2268 (0.2285) loss 3.7221 (3.1411) grad_norm 2.1258 (2.6426) loss_scale 2048.0000 (1975.4518) mem 7381MB [2024-08-27 11:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][870/1251] eta 0:01:27 lr 0.000473 wd 0.0500 time 0.2315 (0.2301) data time 0.0009 (0.0017) model time 0.2306 (0.2285) loss 2.8657 (3.1410) grad_norm 2.6693 (2.6393) loss_scale 2048.0000 (1976.2847) mem 7381MB [2024-08-27 11:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][880/1251] eta 0:01:25 lr 0.000473 wd 0.0500 time 0.2287 (0.2301) data time 0.0008 (0.0017) model time 0.2279 (0.2284) loss 2.5610 (3.1403) grad_norm 2.6734 (2.6518) loss_scale 2048.0000 (1977.0988) mem 7381MB [2024-08-27 11:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][890/1251] eta 0:01:23 lr 0.000473 wd 0.0500 time 0.2264 (0.2301) data time 0.0012 (0.0017) model time 0.2252 (0.2284) loss 3.0217 (3.1422) grad_norm 3.9673 (2.6570) loss_scale 2048.0000 (1977.8945) mem 7381MB [2024-08-27 11:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][900/1251] eta 0:01:20 lr 0.000473 wd 0.0500 time 0.2278 (0.2301) data time 0.0010 (0.0017) model time 0.2268 (0.2284) loss 3.4279 (3.1436) grad_norm 2.2556 (2.6561) loss_scale 2048.0000 (1978.6726) mem 7381MB [2024-08-27 11:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][910/1251] eta 0:01:18 lr 0.000473 wd 0.0500 time 0.2356 (0.2300) data time 0.0010 (0.0017) model time 0.2346 (0.2284) loss 2.9710 (3.1449) grad_norm 3.0451 (2.6602) loss_scale 2048.0000 (1979.4336) mem 7381MB [2024-08-27 11:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][920/1251] eta 0:01:16 lr 0.000473 wd 0.0500 time 0.2249 (0.2300) data time 0.0009 (0.0017) model time 0.2240 (0.2284) loss 3.1427 (3.1444) grad_norm 2.6008 (2.6593) loss_scale 2048.0000 (1980.1781) mem 7381MB [2024-08-27 11:16:49 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [165/300][930/1251] eta 0:01:13 lr 0.000473 wd 0.0500 time 0.2265 (0.2300) data time 0.0011 (0.0017) model time 0.2254 (0.2284) loss 3.3184 (3.1461) grad_norm 2.0087 (2.6594) loss_scale 2048.0000 (1980.9066) mem 7381MB [2024-08-27 11:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][940/1251] eta 0:01:11 lr 0.000473 wd 0.0500 time 0.2282 (0.2300) data time 0.0010 (0.0017) model time 0.2272 (0.2284) loss 3.3734 (3.1495) grad_norm 3.2567 (2.6562) loss_scale 2048.0000 (1981.6196) mem 7381MB [2024-08-27 11:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][950/1251] eta 0:01:09 lr 0.000473 wd 0.0500 time 0.2240 (0.2300) data time 0.0012 (0.0017) model time 0.2228 (0.2284) loss 3.3960 (3.1511) grad_norm 2.4152 (2.6537) loss_scale 2048.0000 (1982.3176) mem 7381MB [2024-08-27 11:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][960/1251] eta 0:01:06 lr 0.000473 wd 0.0500 time 0.2284 (0.2300) data time 0.0010 (0.0017) model time 0.2274 (0.2284) loss 3.4784 (3.1515) grad_norm 2.2429 (2.6509) loss_scale 2048.0000 (1983.0010) mem 7381MB [2024-08-27 11:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][970/1251] eta 0:01:04 lr 0.000473 wd 0.0500 time 0.2330 (0.2300) data time 0.0009 (0.0017) model time 0.2321 (0.2284) loss 2.8821 (3.1508) grad_norm 2.7540 (2.6500) loss_scale 2048.0000 (1983.6704) mem 7381MB [2024-08-27 11:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][980/1251] eta 0:01:02 lr 0.000473 wd 0.0500 time 0.2269 (0.2300) data time 0.0013 (0.0017) model time 0.2256 (0.2284) loss 3.2589 (3.1521) grad_norm 2.2241 (2.6475) loss_scale 2048.0000 (1984.3262) mem 7381MB [2024-08-27 11:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][990/1251] eta 0:01:00 lr 0.000473 wd 0.0500 time 0.2345 (0.2300) data time 0.0009 (0.0017) model time 0.2336 (0.2284) loss 2.5323 (3.1515) grad_norm 
2.5386 (2.6459) loss_scale 2048.0000 (1984.9687) mem 7381MB [2024-08-27 11:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1000/1251] eta 0:00:57 lr 0.000473 wd 0.0500 time 0.2337 (0.2300) data time 0.0010 (0.0016) model time 0.2328 (0.2284) loss 2.9385 (3.1518) grad_norm 4.6059 (2.6459) loss_scale 2048.0000 (1985.5984) mem 7381MB [2024-08-27 11:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1010/1251] eta 0:00:55 lr 0.000473 wd 0.0500 time 0.2272 (0.2300) data time 0.0008 (0.0016) model time 0.2265 (0.2284) loss 3.6173 (3.1535) grad_norm 2.0519 (2.6447) loss_scale 2048.0000 (1986.2156) mem 7381MB [2024-08-27 11:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1020/1251] eta 0:00:53 lr 0.000473 wd 0.0500 time 0.2260 (0.2299) data time 0.0011 (0.0016) model time 0.2249 (0.2284) loss 2.2058 (3.1559) grad_norm 3.6547 (2.6442) loss_scale 2048.0000 (1986.8208) mem 7381MB [2024-08-27 11:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1030/1251] eta 0:00:50 lr 0.000473 wd 0.0500 time 0.2244 (0.2299) data time 0.0007 (0.0016) model time 0.2237 (0.2284) loss 3.7593 (3.1556) grad_norm 2.9667 (2.6417) loss_scale 2048.0000 (1987.4142) mem 7381MB [2024-08-27 11:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1040/1251] eta 0:00:48 lr 0.000473 wd 0.0500 time 0.2224 (0.2299) data time 0.0010 (0.0016) model time 0.2214 (0.2284) loss 3.4951 (3.1572) grad_norm 1.9482 (2.6371) loss_scale 2048.0000 (1987.9962) mem 7381MB [2024-08-27 11:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1050/1251] eta 0:00:46 lr 0.000473 wd 0.0500 time 0.2211 (0.2300) data time 0.0017 (0.0016) model time 0.2194 (0.2284) loss 3.0932 (3.1572) grad_norm 1.9025 (2.6364) loss_scale 2048.0000 (1988.5671) mem 7381MB [2024-08-27 11:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1060/1251] eta 0:00:43 lr 0.000473 wd 
0.0500 time 0.2272 (0.2300) data time 0.0008 (0.0016) model time 0.2264 (0.2284) loss 4.3629 (3.1595) grad_norm 1.7546 (2.6368) loss_scale 2048.0000 (1989.1272) mem 7381MB [2024-08-27 11:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1070/1251] eta 0:00:41 lr 0.000473 wd 0.0500 time 0.2381 (0.2300) data time 0.0010 (0.0016) model time 0.2372 (0.2284) loss 2.8145 (3.1605) grad_norm 2.4577 (2.6362) loss_scale 2048.0000 (1989.6769) mem 7381MB [2024-08-27 11:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1080/1251] eta 0:00:39 lr 0.000472 wd 0.0500 time 0.2328 (0.2299) data time 0.0007 (0.0016) model time 0.2321 (0.2284) loss 3.3464 (3.1635) grad_norm 2.8240 (2.6352) loss_scale 2048.0000 (1990.2165) mem 7381MB [2024-08-27 11:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1090/1251] eta 0:00:37 lr 0.000472 wd 0.0500 time 0.2259 (0.2299) data time 0.0009 (0.0016) model time 0.2251 (0.2284) loss 3.9767 (3.1646) grad_norm 2.7005 (2.6349) loss_scale 2048.0000 (1990.7461) mem 7381MB [2024-08-27 11:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1100/1251] eta 0:00:34 lr 0.000472 wd 0.0500 time 0.2259 (0.2299) data time 0.0009 (0.0016) model time 0.2251 (0.2284) loss 3.3163 (3.1653) grad_norm 2.9513 (2.6348) loss_scale 2048.0000 (1991.2661) mem 7381MB [2024-08-27 11:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1110/1251] eta 0:00:32 lr 0.000472 wd 0.0500 time 0.2314 (0.2299) data time 0.0009 (0.0016) model time 0.2305 (0.2284) loss 2.1802 (3.1655) grad_norm 2.9275 (2.6364) loss_scale 2048.0000 (1991.7768) mem 7381MB [2024-08-27 11:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1120/1251] eta 0:00:30 lr 0.000472 wd 0.0500 time 0.2376 (0.2299) data time 0.0010 (0.0016) model time 0.2365 (0.2284) loss 3.6485 (3.1655) grad_norm 1.8854 (2.6348) loss_scale 2048.0000 (1992.2783) mem 7381MB [2024-08-27 11:17:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1130/1251] eta 0:00:27 lr 0.000472 wd 0.0500 time 0.2270 (0.2299) data time 0.0009 (0.0016) model time 0.2261 (0.2284) loss 3.0412 (3.1634) grad_norm 4.9724 (2.6384) loss_scale 2048.0000 (1992.7710) mem 7381MB [2024-08-27 11:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1140/1251] eta 0:00:25 lr 0.000472 wd 0.0500 time 0.2331 (0.2299) data time 0.0008 (0.0016) model time 0.2322 (0.2284) loss 3.5209 (3.1644) grad_norm 3.4012 (2.6429) loss_scale 2048.0000 (1993.2550) mem 7381MB [2024-08-27 11:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1150/1251] eta 0:00:23 lr 0.000472 wd 0.0500 time 0.2299 (0.2299) data time 0.0007 (0.0016) model time 0.2292 (0.2284) loss 3.2171 (3.1654) grad_norm 3.0411 (2.6421) loss_scale 2048.0000 (1993.7307) mem 7381MB [2024-08-27 11:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1160/1251] eta 0:00:20 lr 0.000472 wd 0.0500 time 0.2281 (0.2300) data time 0.0010 (0.0016) model time 0.2271 (0.2284) loss 3.0445 (3.1640) grad_norm 2.2525 (2.6423) loss_scale 2048.0000 (1994.1981) mem 7381MB [2024-08-27 11:17:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1170/1251] eta 0:00:18 lr 0.000472 wd 0.0500 time 0.2355 (0.2300) data time 0.0008 (0.0016) model time 0.2347 (0.2284) loss 2.3032 (3.1622) grad_norm 1.8300 (2.6392) loss_scale 2048.0000 (1994.6576) mem 7381MB [2024-08-27 11:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1180/1251] eta 0:00:16 lr 0.000472 wd 0.0500 time 0.2341 (0.2299) data time 0.0009 (0.0016) model time 0.2332 (0.2284) loss 3.2867 (3.1629) grad_norm 2.2130 (2.6375) loss_scale 2048.0000 (1995.1092) mem 7381MB [2024-08-27 11:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1190/1251] eta 0:00:14 lr 0.000472 wd 0.0500 time 0.2263 (0.2299) data time 0.0008 (0.0016) model time 0.2255 (0.2284) loss 
3.2322 (3.1633) grad_norm 3.3263 (2.6397) loss_scale 2048.0000 (1995.5533) mem 7381MB [2024-08-27 11:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1200/1251] eta 0:00:11 lr 0.000472 wd 0.0500 time 0.2324 (0.2299) data time 0.0010 (0.0016) model time 0.2313 (0.2284) loss 3.3892 (3.1629) grad_norm 2.0629 (2.6434) loss_scale 2048.0000 (1995.9900) mem 7381MB [2024-08-27 11:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1210/1251] eta 0:00:09 lr 0.000472 wd 0.0500 time 0.2270 (0.2299) data time 0.0010 (0.0016) model time 0.2259 (0.2284) loss 3.7754 (3.1636) grad_norm 2.1083 (2.6426) loss_scale 2048.0000 (1996.4195) mem 7381MB [2024-08-27 11:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1220/1251] eta 0:00:07 lr 0.000472 wd 0.0500 time 0.2257 (0.2299) data time 0.0011 (0.0015) model time 0.2246 (0.2284) loss 3.0727 (3.1627) grad_norm 1.9333 (2.6396) loss_scale 2048.0000 (1996.8419) mem 7381MB [2024-08-27 11:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1230/1251] eta 0:00:04 lr 0.000472 wd 0.0500 time 0.2296 (0.2299) data time 0.0008 (0.0015) model time 0.2288 (0.2284) loss 2.1355 (3.1619) grad_norm 2.7283 (2.6354) loss_scale 2048.0000 (1997.2575) mem 7381MB [2024-08-27 11:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1240/1251] eta 0:00:02 lr 0.000472 wd 0.0500 time 0.2123 (0.2298) data time 0.0007 (0.0015) model time 0.2116 (0.2283) loss 3.1162 (3.1598) grad_norm 2.1885 (2.6337) loss_scale 2048.0000 (1997.6664) mem 7381MB [2024-08-27 11:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [165/300][1250/1251] eta 0:00:00 lr 0.000472 wd 0.0500 time 0.2128 (0.2297) data time 0.0006 (0.0015) model time 0.2122 (0.2282) loss 3.4181 (3.1595) grad_norm 2.3932 (2.6354) loss_scale 2048.0000 (1998.0687) mem 7381MB [2024-08-27 11:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 165 training takes 0:04:47 
[2024-08-27 11:18:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:18:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.568 (0.568) Loss 0.4692 (0.4692) Acc@1 91.113 (91.113) Acc@5 98.145 (98.145) Mem 7381MB [2024-08-27 11:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.124) Loss 0.7051 (0.7053) Acc@1 87.109 (84.863) Acc@5 96.777 (96.937) Mem 7381MB [2024-08-27 11:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.088 (0.106) Loss 1.0254 (0.7329) Acc@1 76.367 (83.770) Acc@5 94.238 (96.917) Mem 7381MB [2024-08-27 11:18:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.098) Loss 1.2676 (0.8461) Acc@1 70.898 (81.215) Acc@5 91.016 (95.703) Mem 7381MB [2024-08-27 11:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.090) Loss 1.1504 (0.8960) Acc@1 72.949 (79.983) Acc@5 92.773 (95.177) Mem 7381MB [2024-08-27 11:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.614 Acc@5 95.138 [2024-08-27 11:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.6% [2024-08-27 11:18:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.954 (0.954) Loss 0.4053 (0.4053) Acc@1 92.871 (92.871) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 11:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.166) Loss 0.6289 (0.6285) Acc@1 87.305 (86.497) Acc@5 97.559 (97.425) Mem 7381MB [2024-08-27 11:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.125) Loss 0.8945 (0.6530) Acc@1 78.418 (85.575) Acc@5 95.508 (97.391) Mem 7381MB 
[2024-08-27 11:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.109) Loss 1.1250 (0.7414) Acc@1 73.047 (83.474) Acc@5 92.969 (96.443) Mem 7381MB [2024-08-27 11:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.099) Loss 1.0225 (0.7875) Acc@1 74.316 (82.103) Acc@5 93.750 (95.972) Mem 7381MB [2024-08-27 11:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.708 Acc@5 95.954 [2024-08-27 11:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.7% [2024-08-27 11:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.71% [2024-08-27 11:18:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 11:18:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 11:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][0/1251] eta 0:17:29 lr 0.000472 wd 0.0500 time 0.8388 (0.8388) data time 0.5999 (0.5999) model time 0.0000 (0.0000) loss 3.3262 (3.3262) grad_norm 2.1483 (2.1483) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][10/1251] eta 0:05:54 lr 0.000472 wd 0.0500 time 0.2450 (0.2858) data time 0.0015 (0.0556) model time 0.0000 (0.0000) loss 2.1618 (2.8913) grad_norm 2.3289 (2.3354) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][20/1251] eta 0:05:19 lr 0.000472 wd 0.0500 time 0.2363 (0.2596) data time 0.0009 (0.0296) model time 0.0000 (0.0000) loss 3.8439 (3.0787) grad_norm 2.8624 (2.3861) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][30/1251] eta 
0:05:12 lr 0.000472 wd 0.0500 time 0.2260 (0.2556) data time 0.0007 (0.0204) model time 0.0000 (0.0000) loss 3.2351 (3.0930) grad_norm 2.3591 (2.3400) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][40/1251] eta 0:05:01 lr 0.000472 wd 0.0500 time 0.2261 (0.2486) data time 0.0014 (0.0157) model time 0.0000 (0.0000) loss 3.4224 (3.1423) grad_norm 5.4745 (2.7724) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][50/1251] eta 0:04:53 lr 0.000471 wd 0.0500 time 0.2263 (0.2447) data time 0.0012 (0.0128) model time 0.0000 (0.0000) loss 3.4054 (3.1347) grad_norm 3.5995 (2.8865) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][60/1251] eta 0:04:48 lr 0.000471 wd 0.0500 time 0.2279 (0.2419) data time 0.0009 (0.0109) model time 0.2270 (0.2266) loss 3.7089 (3.1985) grad_norm 1.5891 (2.8578) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][70/1251] eta 0:04:46 lr 0.000471 wd 0.0500 time 0.2319 (0.2427) data time 0.0006 (0.0095) model time 0.2313 (0.2366) loss 1.7081 (3.1765) grad_norm 2.7835 (2.8320) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][80/1251] eta 0:04:41 lr 0.000471 wd 0.0500 time 0.2242 (0.2408) data time 0.0007 (0.0085) model time 0.2235 (0.2331) loss 3.1068 (3.1538) grad_norm 2.4447 (2.8837) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][90/1251] eta 0:04:37 lr 0.000471 wd 0.0500 time 0.2225 (0.2394) data time 0.0011 (0.0077) model time 0.2214 (0.2315) loss 2.9110 (3.1450) grad_norm 2.9497 (2.8806) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-27 11:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][100/1251] eta 0:04:34 lr 0.000471 wd 0.0500 time 0.2225 (0.2383) data time 0.0010 (0.0070) model time 0.2215 (0.2307) loss 3.2312 (3.1606) grad_norm 2.8509 (2.8641) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][110/1251] eta 0:04:30 lr 0.000471 wd 0.0500 time 0.2298 (0.2374) data time 0.0011 (0.0065) model time 0.2287 (0.2300) loss 3.1979 (3.1350) grad_norm 3.2245 (2.8662) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][120/1251] eta 0:04:27 lr 0.000471 wd 0.0500 time 0.2299 (0.2367) data time 0.0006 (0.0060) model time 0.2293 (0.2298) loss 3.6351 (3.1465) grad_norm 2.4699 (2.8872) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][130/1251] eta 0:04:24 lr 0.000471 wd 0.0500 time 0.2218 (0.2362) data time 0.0006 (0.0057) model time 0.2212 (0.2297) loss 3.4543 (3.1470) grad_norm 2.4505 (2.8668) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][140/1251] eta 0:04:21 lr 0.000471 wd 0.0500 time 0.2277 (0.2355) data time 0.0010 (0.0053) model time 0.2267 (0.2293) loss 2.9400 (3.1510) grad_norm 2.4166 (2.8336) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][150/1251] eta 0:04:18 lr 0.000471 wd 0.0500 time 0.2221 (0.2351) data time 0.0011 (0.0051) model time 0.2210 (0.2291) loss 2.7552 (3.1354) grad_norm 1.7886 (2.7943) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][160/1251] eta 0:04:16 lr 0.000471 wd 0.0500 time 0.2295 (0.2347) data time 0.0009 (0.0049) model time 0.2286 
(0.2288) loss 3.5742 (3.1445) grad_norm 2.8920 (2.7835) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][170/1251] eta 0:04:13 lr 0.000471 wd 0.0500 time 0.2255 (0.2343) data time 0.0009 (0.0047) model time 0.2246 (0.2286) loss 2.6592 (3.1313) grad_norm 2.3095 (2.7820) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][180/1251] eta 0:04:10 lr 0.000471 wd 0.0500 time 0.2311 (0.2340) data time 0.0008 (0.0045) model time 0.2303 (0.2286) loss 3.2472 (3.1369) grad_norm 2.1629 (2.7681) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:18:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][190/1251] eta 0:04:08 lr 0.000471 wd 0.0500 time 0.2350 (0.2338) data time 0.0007 (0.0043) model time 0.2343 (0.2286) loss 3.1361 (3.1214) grad_norm 2.3568 (2.7541) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][200/1251] eta 0:04:05 lr 0.000471 wd 0.0500 time 0.2292 (0.2335) data time 0.0007 (0.0041) model time 0.2285 (0.2286) loss 2.0647 (3.1178) grad_norm 2.4545 (2.7315) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][210/1251] eta 0:04:02 lr 0.000471 wd 0.0500 time 0.2253 (0.2333) data time 0.0007 (0.0040) model time 0.2247 (0.2284) loss 2.8312 (3.1087) grad_norm 2.1628 (2.7177) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][220/1251] eta 0:04:00 lr 0.000471 wd 0.0500 time 0.2253 (0.2330) data time 0.0012 (0.0038) model time 0.2242 (0.2284) loss 3.4787 (3.1155) grad_norm 2.7210 (2.7038) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[166/300][230/1251] eta 0:03:57 lr 0.000471 wd 0.0500 time 0.2253 (0.2328) data time 0.0009 (0.0037) model time 0.2245 (0.2283) loss 3.5853 (3.1117) grad_norm 2.7728 (2.6870) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][240/1251] eta 0:03:55 lr 0.000471 wd 0.0500 time 0.2267 (0.2326) data time 0.0010 (0.0036) model time 0.2257 (0.2282) loss 2.8668 (3.1159) grad_norm 2.8588 (2.6786) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][250/1251] eta 0:03:52 lr 0.000471 wd 0.0500 time 0.2216 (0.2324) data time 0.0013 (0.0035) model time 0.2203 (0.2281) loss 2.9035 (3.1184) grad_norm 2.4280 (2.6939) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][260/1251] eta 0:03:50 lr 0.000471 wd 0.0500 time 0.2188 (0.2322) data time 0.0010 (0.0034) model time 0.2178 (0.2280) loss 2.0800 (3.1180) grad_norm 2.2079 (2.6983) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][270/1251] eta 0:03:47 lr 0.000471 wd 0.0500 time 0.2306 (0.2320) data time 0.0006 (0.0033) model time 0.2300 (0.2279) loss 3.4811 (3.1093) grad_norm 2.8827 (2.7058) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][280/1251] eta 0:03:45 lr 0.000470 wd 0.0500 time 0.2293 (0.2318) data time 0.0007 (0.0033) model time 0.2286 (0.2277) loss 3.1645 (3.1091) grad_norm 1.9326 (2.6900) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][290/1251] eta 0:03:42 lr 0.000470 wd 0.0500 time 0.2369 (0.2317) data time 0.0009 (0.0032) model time 0.2360 (0.2278) loss 2.8343 (3.1055) grad_norm 2.3571 (2.6864) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-27 11:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][300/1251] eta 0:03:40 lr 0.000470 wd 0.0500 time 0.2265 (0.2315) data time 0.0009 (0.0031) model time 0.2256 (0.2276) loss 3.3664 (3.1102) grad_norm 2.6037 (2.6745) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][310/1251] eta 0:03:37 lr 0.000470 wd 0.0500 time 0.2228 (0.2314) data time 0.0010 (0.0030) model time 0.2218 (0.2277) loss 2.2013 (3.1038) grad_norm 2.5127 (2.6846) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][320/1251] eta 0:03:35 lr 0.000470 wd 0.0500 time 0.2224 (0.2313) data time 0.0009 (0.0030) model time 0.2214 (0.2276) loss 3.3697 (3.1078) grad_norm 1.7710 (2.6846) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][330/1251] eta 0:03:32 lr 0.000470 wd 0.0500 time 0.2290 (0.2312) data time 0.0009 (0.0029) model time 0.2280 (0.2276) loss 3.0884 (3.1165) grad_norm 1.8799 (2.6717) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][340/1251] eta 0:03:30 lr 0.000470 wd 0.0500 time 0.2252 (0.2311) data time 0.0009 (0.0029) model time 0.2243 (0.2275) loss 2.7717 (3.1132) grad_norm 3.2144 (2.6650) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][350/1251] eta 0:03:28 lr 0.000470 wd 0.0500 time 0.2258 (0.2310) data time 0.0007 (0.0028) model time 0.2251 (0.2275) loss 3.4100 (3.1193) grad_norm 3.8245 (2.6852) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][360/1251] eta 0:03:25 lr 0.000470 wd 0.0500 time 0.2260 (0.2312) data time 0.0009 
(0.0028) model time 0.2251 (0.2278) loss 3.4179 (3.1282) grad_norm 2.6980 (2.7133) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][370/1251] eta 0:03:24 lr 0.000470 wd 0.0500 time 0.2231 (0.2316) data time 0.0011 (0.0028) model time 0.2220 (0.2284) loss 2.9781 (3.1302) grad_norm 2.8480 (2.7191) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][380/1251] eta 0:03:21 lr 0.000470 wd 0.0500 time 0.2333 (0.2316) data time 0.0007 (0.0027) model time 0.2326 (0.2284) loss 2.1663 (3.1288) grad_norm 2.4860 (2.7153) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][390/1251] eta 0:03:19 lr 0.000470 wd 0.0500 time 0.2297 (0.2315) data time 0.0007 (0.0027) model time 0.2290 (0.2284) loss 2.9605 (3.1349) grad_norm 2.4542 (2.7120) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][400/1251] eta 0:03:16 lr 0.000470 wd 0.0500 time 0.2242 (0.2314) data time 0.0009 (0.0026) model time 0.2233 (0.2284) loss 3.3945 (3.1354) grad_norm 3.2296 (2.7102) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][410/1251] eta 0:03:14 lr 0.000470 wd 0.0500 time 0.2243 (0.2314) data time 0.0008 (0.0026) model time 0.2234 (0.2283) loss 2.9055 (3.1343) grad_norm 3.4422 (2.7136) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][420/1251] eta 0:03:12 lr 0.000470 wd 0.0500 time 0.2311 (0.2313) data time 0.0008 (0.0025) model time 0.2303 (0.2283) loss 2.6466 (3.1418) grad_norm 2.1129 (2.7161) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [166/300][430/1251] eta 0:03:09 lr 0.000470 wd 0.0500 time 0.2248 (0.2312) data time 0.0008 (0.0025) model time 0.2240 (0.2282) loss 3.3173 (3.1429) grad_norm 1.8746 (2.7218) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][440/1251] eta 0:03:07 lr 0.000470 wd 0.0500 time 0.2209 (0.2311) data time 0.0007 (0.0025) model time 0.2201 (0.2282) loss 3.3561 (3.1461) grad_norm 2.3357 (2.7110) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][450/1251] eta 0:03:05 lr 0.000470 wd 0.0500 time 0.2344 (0.2311) data time 0.0007 (0.0025) model time 0.2337 (0.2282) loss 2.1180 (3.1396) grad_norm 2.0182 (2.7090) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][460/1251] eta 0:03:02 lr 0.000470 wd 0.0500 time 0.2252 (0.2310) data time 0.0008 (0.0024) model time 0.2243 (0.2282) loss 2.0608 (3.1387) grad_norm 2.1699 (2.7069) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][470/1251] eta 0:03:00 lr 0.000470 wd 0.0500 time 0.2361 (0.2309) data time 0.0007 (0.0024) model time 0.2354 (0.2281) loss 3.4881 (3.1403) grad_norm 2.3608 (2.7058) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][480/1251] eta 0:02:57 lr 0.000470 wd 0.0500 time 0.2268 (0.2309) data time 0.0007 (0.0024) model time 0.2261 (0.2281) loss 3.8902 (3.1399) grad_norm 2.1859 (2.7019) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][490/1251] eta 0:02:55 lr 0.000470 wd 0.0500 time 0.2291 (0.2308) data time 0.0008 (0.0023) model time 0.2283 (0.2281) loss 3.7040 (3.1407) grad_norm 2.8728 (2.6987) loss_scale 
2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][500/1251] eta 0:02:53 lr 0.000469 wd 0.0500 time 0.2383 (0.2307) data time 0.0007 (0.0023) model time 0.2376 (0.2280) loss 3.7549 (3.1422) grad_norm 2.1079 (2.6935) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][510/1251] eta 0:02:50 lr 0.000469 wd 0.0500 time 0.2262 (0.2306) data time 0.0009 (0.0023) model time 0.2254 (0.2280) loss 3.4694 (3.1479) grad_norm 2.1469 (2.6881) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][520/1251] eta 0:02:48 lr 0.000469 wd 0.0500 time 0.2387 (0.2306) data time 0.0006 (0.0023) model time 0.2381 (0.2280) loss 2.6497 (3.1492) grad_norm 2.1192 (2.6855) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][530/1251] eta 0:02:46 lr 0.000469 wd 0.0500 time 0.2255 (0.2306) data time 0.0007 (0.0022) model time 0.2248 (0.2280) loss 3.6732 (3.1505) grad_norm 2.8308 (2.6817) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][540/1251] eta 0:02:43 lr 0.000469 wd 0.0500 time 0.2271 (0.2305) data time 0.0009 (0.0022) model time 0.2261 (0.2279) loss 4.0345 (3.1516) grad_norm 1.9335 (2.6739) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][550/1251] eta 0:02:41 lr 0.000469 wd 0.0500 time 0.2284 (0.2305) data time 0.0007 (0.0022) model time 0.2277 (0.2279) loss 3.1602 (3.1510) grad_norm 2.2239 (2.6766) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][560/1251] eta 0:02:39 lr 0.000469 wd 0.0500 time 0.2264 (0.2304) data time 
0.0008 (0.0022) model time 0.2256 (0.2279) loss 2.8223 (3.1480) grad_norm 1.9838 (2.6784) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][570/1251] eta 0:02:36 lr 0.000469 wd 0.0500 time 0.2245 (0.2304) data time 0.0009 (0.0022) model time 0.2236 (0.2279) loss 3.0787 (3.1481) grad_norm 3.0528 (2.6726) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][580/1251] eta 0:02:34 lr 0.000469 wd 0.0500 time 0.2330 (0.2304) data time 0.0006 (0.0021) model time 0.2324 (0.2279) loss 3.8886 (3.1485) grad_norm 2.3969 (2.6709) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][590/1251] eta 0:02:32 lr 0.000469 wd 0.0500 time 0.2210 (0.2303) data time 0.0009 (0.0021) model time 0.2201 (0.2278) loss 3.7666 (3.1454) grad_norm 2.6601 (2.6790) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][600/1251] eta 0:02:29 lr 0.000469 wd 0.0500 time 0.2263 (0.2303) data time 0.0008 (0.0021) model time 0.2255 (0.2278) loss 2.2477 (3.1426) grad_norm 3.3383 (2.6797) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][610/1251] eta 0:02:27 lr 0.000469 wd 0.0500 time 0.2314 (0.2306) data time 0.0008 (0.0021) model time 0.2306 (0.2282) loss 3.9522 (3.1433) grad_norm 3.7398 (2.6803) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][620/1251] eta 0:02:25 lr 0.000469 wd 0.0500 time 0.2279 (0.2306) data time 0.0011 (0.0021) model time 0.2268 (0.2282) loss 2.2413 (3.1427) grad_norm 3.9610 (2.6851) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [166/300][630/1251] eta 0:02:23 lr 0.000469 wd 0.0500 time 0.2313 (0.2305) data time 0.0011 (0.0021) model time 0.2302 (0.2282) loss 3.5814 (3.1471) grad_norm 2.9081 (2.6888) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][640/1251] eta 0:02:20 lr 0.000469 wd 0.0500 time 0.2264 (0.2305) data time 0.0007 (0.0020) model time 0.2257 (0.2282) loss 2.0046 (3.1444) grad_norm 2.7587 (2.6910) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][650/1251] eta 0:02:18 lr 0.000469 wd 0.0500 time 0.2289 (0.2305) data time 0.0008 (0.0020) model time 0.2280 (0.2282) loss 3.4389 (3.1437) grad_norm 3.6708 (2.6937) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][660/1251] eta 0:02:16 lr 0.000469 wd 0.0500 time 0.2279 (0.2304) data time 0.0007 (0.0020) model time 0.2272 (0.2282) loss 3.5489 (3.1455) grad_norm 2.1100 (2.6913) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][670/1251] eta 0:02:13 lr 0.000469 wd 0.0500 time 0.2253 (0.2304) data time 0.0007 (0.0020) model time 0.2246 (0.2281) loss 2.2281 (3.1392) grad_norm 2.1948 (2.6837) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][680/1251] eta 0:02:11 lr 0.000469 wd 0.0500 time 0.2435 (0.2304) data time 0.0009 (0.0020) model time 0.2426 (0.2281) loss 3.2153 (3.1411) grad_norm 2.3832 (2.6814) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][690/1251] eta 0:02:09 lr 0.000469 wd 0.0500 time 0.2280 (0.2303) data time 0.0009 (0.0020) model time 0.2271 (0.2281) loss 3.8142 (3.1461) grad_norm 2.9478 (2.6751) 
loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][700/1251] eta 0:02:06 lr 0.000469 wd 0.0500 time 0.2232 (0.2303) data time 0.0008 (0.0020) model time 0.2224 (0.2281) loss 3.6749 (3.1471) grad_norm 2.4849 (2.6751) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][710/1251] eta 0:02:04 lr 0.000469 wd 0.0500 time 0.2252 (0.2303) data time 0.0010 (0.0019) model time 0.2243 (0.2280) loss 3.7247 (3.1529) grad_norm 2.9696 (2.6720) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][720/1251] eta 0:02:02 lr 0.000469 wd 0.0500 time 0.2229 (0.2302) data time 0.0010 (0.0019) model time 0.2218 (0.2280) loss 3.1945 (3.1526) grad_norm 3.3291 (2.6679) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][730/1251] eta 0:01:59 lr 0.000468 wd 0.0500 time 0.2356 (0.2302) data time 0.0010 (0.0019) model time 0.2346 (0.2280) loss 3.3218 (3.1520) grad_norm 2.8132 (2.6638) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][740/1251] eta 0:01:57 lr 0.000468 wd 0.0500 time 0.2307 (0.2302) data time 0.0011 (0.0019) model time 0.2296 (0.2280) loss 3.3018 (3.1535) grad_norm 3.4229 (2.6666) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][750/1251] eta 0:01:55 lr 0.000468 wd 0.0500 time 0.2280 (0.2301) data time 0.0009 (0.0019) model time 0.2271 (0.2280) loss 3.2021 (3.1558) grad_norm 2.2654 (2.6661) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][760/1251] eta 0:01:52 lr 0.000468 wd 0.0500 time 0.2285 (0.2301) 
data time 0.0008 (0.0019) model time 0.2277 (0.2280) loss 2.6424 (3.1568) grad_norm 2.4412 (2.6725) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][770/1251] eta 0:01:50 lr 0.000468 wd 0.0500 time 0.2321 (0.2301) data time 0.0010 (0.0019) model time 0.2311 (0.2280) loss 2.7744 (3.1572) grad_norm 2.1391 (2.6685) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][780/1251] eta 0:01:48 lr 0.000468 wd 0.0500 time 0.2300 (0.2300) data time 0.0007 (0.0019) model time 0.2293 (0.2279) loss 3.0638 (3.1560) grad_norm 2.9644 (2.6663) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][790/1251] eta 0:01:46 lr 0.000468 wd 0.0500 time 0.2422 (0.2301) data time 0.0009 (0.0019) model time 0.2413 (0.2280) loss 3.1615 (3.1567) grad_norm 2.7333 (2.6699) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][800/1251] eta 0:01:43 lr 0.000468 wd 0.0500 time 0.2230 (0.2301) data time 0.0008 (0.0019) model time 0.2222 (0.2279) loss 1.8818 (3.1563) grad_norm 2.9253 (2.6749) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][810/1251] eta 0:01:41 lr 0.000468 wd 0.0500 time 0.2315 (0.2300) data time 0.0010 (0.0019) model time 0.2304 (0.2279) loss 3.1751 (3.1569) grad_norm 3.5916 (2.6808) loss_scale 4096.0000 (2050.5253) mem 7381MB [2024-08-27 11:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][820/1251] eta 0:01:39 lr 0.000468 wd 0.0500 time 0.2251 (0.2300) data time 0.0010 (0.0018) model time 0.2242 (0.2279) loss 2.5874 (3.1562) grad_norm 2.3795 (2.6850) loss_scale 4096.0000 (2075.4397) mem 7381MB [2024-08-27 11:21:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [166/300][830/1251] eta 0:01:36 lr 0.000468 wd 0.0500 time 0.2234 (0.2300) data time 0.0010 (0.0018) model time 0.2224 (0.2279) loss 2.9197 (3.1560) grad_norm 1.8210 (2.6817) loss_scale 4096.0000 (2099.7545) mem 7381MB [2024-08-27 11:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][840/1251] eta 0:01:34 lr 0.000468 wd 0.0500 time 0.2326 (0.2300) data time 0.0008 (0.0018) model time 0.2318 (0.2279) loss 3.5808 (3.1576) grad_norm 2.5810 (2.6776) loss_scale 4096.0000 (2123.4911) mem 7381MB [2024-08-27 11:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][850/1251] eta 0:01:32 lr 0.000468 wd 0.0500 time 0.2390 (0.2300) data time 0.0008 (0.0018) model time 0.2382 (0.2279) loss 3.7643 (3.1578) grad_norm 2.7159 (2.6759) loss_scale 4096.0000 (2146.6698) mem 7381MB [2024-08-27 11:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][860/1251] eta 0:01:29 lr 0.000468 wd 0.0500 time 0.2160 (0.2299) data time 0.0011 (0.0018) model time 0.2148 (0.2279) loss 3.2505 (3.1581) grad_norm 3.2420 (2.6720) loss_scale 4096.0000 (2169.3101) mem 7381MB [2024-08-27 11:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][870/1251] eta 0:01:27 lr 0.000468 wd 0.0500 time 0.2317 (0.2299) data time 0.0011 (0.0018) model time 0.2306 (0.2278) loss 3.0919 (3.1609) grad_norm 2.0949 (2.6707) loss_scale 4096.0000 (2191.4305) mem 7381MB [2024-08-27 11:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][880/1251] eta 0:01:25 lr 0.000468 wd 0.0500 time 0.2389 (0.2299) data time 0.0007 (0.0018) model time 0.2382 (0.2278) loss 3.2125 (3.1610) grad_norm 2.0258 (2.6684) loss_scale 4096.0000 (2213.0488) mem 7381MB [2024-08-27 11:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][890/1251] eta 0:01:22 lr 0.000468 wd 0.0500 time 0.2276 (0.2298) data time 0.0007 (0.0018) model time 0.2269 (0.2278) loss 3.1305 (3.1634) grad_norm 
1.8151 (2.6640) loss_scale 4096.0000 (2234.1818) mem 7381MB [2024-08-27 11:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][900/1251] eta 0:01:20 lr 0.000468 wd 0.0500 time 0.2273 (0.2298) data time 0.0009 (0.0018) model time 0.2264 (0.2278) loss 3.4003 (3.1643) grad_norm 2.1123 (2.6683) loss_scale 4096.0000 (2254.8457) mem 7381MB [2024-08-27 11:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][910/1251] eta 0:01:18 lr 0.000468 wd 0.0500 time 0.2342 (0.2298) data time 0.0008 (0.0018) model time 0.2333 (0.2278) loss 2.8824 (3.1648) grad_norm 3.7286 (2.6723) loss_scale 4096.0000 (2275.0560) mem 7381MB [2024-08-27 11:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][920/1251] eta 0:01:16 lr 0.000468 wd 0.0500 time 0.2239 (0.2298) data time 0.0011 (0.0018) model time 0.2228 (0.2278) loss 3.5812 (3.1678) grad_norm 2.2521 (2.6718) loss_scale 4096.0000 (2294.8274) mem 7381MB [2024-08-27 11:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][930/1251] eta 0:01:13 lr 0.000468 wd 0.0500 time 0.2277 (0.2298) data time 0.0008 (0.0018) model time 0.2269 (0.2278) loss 3.0999 (3.1660) grad_norm 2.2285 (2.6660) loss_scale 4096.0000 (2314.1740) mem 7381MB [2024-08-27 11:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][940/1251] eta 0:01:11 lr 0.000468 wd 0.0500 time 0.2252 (0.2297) data time 0.0008 (0.0018) model time 0.2244 (0.2278) loss 3.8273 (3.1633) grad_norm 1.8388 (2.6616) loss_scale 4096.0000 (2333.1095) mem 7381MB [2024-08-27 11:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][950/1251] eta 0:01:09 lr 0.000467 wd 0.0500 time 0.2239 (0.2297) data time 0.0010 (0.0017) model time 0.2229 (0.2277) loss 3.3007 (3.1653) grad_norm 3.7327 (2.6646) loss_scale 4096.0000 (2351.6467) mem 7381MB [2024-08-27 11:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][960/1251] eta 0:01:06 lr 0.000467 wd 0.0500 time 
0.2269 (0.2301) data time 0.0009 (0.0017) model time 0.2260 (0.2282) loss 3.1407 (3.1632) grad_norm 3.4261 (2.6696) loss_scale 4096.0000 (2369.7981) mem 7381MB [2024-08-27 11:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][970/1251] eta 0:01:04 lr 0.000467 wd 0.0500 time 0.2247 (0.2301) data time 0.0011 (0.0017) model time 0.2236 (0.2281) loss 3.3254 (3.1630) grad_norm 2.2224 (2.6683) loss_scale 4096.0000 (2387.5757) mem 7381MB [2024-08-27 11:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][980/1251] eta 0:01:02 lr 0.000467 wd 0.0500 time 0.2227 (0.2301) data time 0.0007 (0.0017) model time 0.2220 (0.2282) loss 2.1659 (3.1628) grad_norm 2.2613 (2.6689) loss_scale 4096.0000 (2404.9908) mem 7381MB [2024-08-27 11:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][990/1251] eta 0:01:00 lr 0.000467 wd 0.0500 time 0.2380 (0.2301) data time 0.0010 (0.0017) model time 0.2371 (0.2281) loss 3.1337 (3.1621) grad_norm 2.5845 (2.6651) loss_scale 4096.0000 (2422.0545) mem 7381MB [2024-08-27 11:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1000/1251] eta 0:00:57 lr 0.000467 wd 0.0500 time 0.2463 (0.2301) data time 0.0007 (0.0017) model time 0.2456 (0.2282) loss 1.9895 (3.1627) grad_norm 1.9440 (2.6616) loss_scale 4096.0000 (2438.7772) mem 7381MB [2024-08-27 11:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1010/1251] eta 0:00:55 lr 0.000467 wd 0.0500 time 0.2248 (0.2300) data time 0.0007 (0.0017) model time 0.2241 (0.2281) loss 2.9284 (3.1618) grad_norm 2.7450 (2.6675) loss_scale 4096.0000 (2455.1691) mem 7381MB [2024-08-27 11:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1020/1251] eta 0:00:53 lr 0.000467 wd 0.0500 time 0.2209 (0.2300) data time 0.0011 (0.0017) model time 0.2198 (0.2281) loss 3.2618 (3.1609) grad_norm 2.5653 (2.6673) loss_scale 4096.0000 (2471.2400) mem 7381MB [2024-08-27 11:22:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1030/1251] eta 0:00:50 lr 0.000467 wd 0.0500 time 0.2249 (0.2300) data time 0.0013 (0.0017) model time 0.2236 (0.2281) loss 2.8798 (3.1592) grad_norm 2.7395 (2.6679) loss_scale 4096.0000 (2486.9990) mem 7381MB [2024-08-27 11:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1040/1251] eta 0:00:48 lr 0.000467 wd 0.0500 time 0.2249 (0.2300) data time 0.0011 (0.0017) model time 0.2238 (0.2281) loss 3.0476 (3.1608) grad_norm 2.2299 (2.6667) loss_scale 4096.0000 (2502.4553) mem 7381MB [2024-08-27 11:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1050/1251] eta 0:00:46 lr 0.000467 wd 0.0500 time 0.2221 (0.2300) data time 0.0010 (0.0017) model time 0.2211 (0.2281) loss 3.4103 (3.1590) grad_norm 2.0099 (2.6636) loss_scale 4096.0000 (2517.6175) mem 7381MB [2024-08-27 11:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1060/1251] eta 0:00:43 lr 0.000467 wd 0.0500 time 0.2390 (0.2300) data time 0.0007 (0.0017) model time 0.2383 (0.2281) loss 2.2945 (3.1581) grad_norm 2.4558 (2.6619) loss_scale 4096.0000 (2532.4939) mem 7381MB [2024-08-27 11:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1070/1251] eta 0:00:41 lr 0.000467 wd 0.0500 time 0.2267 (0.2300) data time 0.0008 (0.0017) model time 0.2259 (0.2281) loss 1.8861 (3.1576) grad_norm 1.8897 (2.6594) loss_scale 4096.0000 (2547.0924) mem 7381MB [2024-08-27 11:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1080/1251] eta 0:00:39 lr 0.000467 wd 0.0500 time 0.2213 (0.2299) data time 0.0010 (0.0017) model time 0.2202 (0.2281) loss 3.5056 (3.1579) grad_norm 2.3340 (2.6585) loss_scale 4096.0000 (2561.4209) mem 7381MB [2024-08-27 11:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1090/1251] eta 0:00:37 lr 0.000467 wd 0.0500 time 0.2334 (0.2299) data time 0.0009 (0.0017) model time 0.2325 (0.2281) loss 
3.3075 (3.1588) grad_norm 2.6515 (2.6591) loss_scale 4096.0000 (2575.4867) mem 7381MB [2024-08-27 11:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1100/1251] eta 0:00:34 lr 0.000467 wd 0.0500 time 0.2275 (0.2299) data time 0.0007 (0.0017) model time 0.2267 (0.2281) loss 3.6804 (3.1606) grad_norm 3.4387 (2.6567) loss_scale 4096.0000 (2589.2970) mem 7381MB [2024-08-27 11:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1110/1251] eta 0:00:32 lr 0.000467 wd 0.0500 time 0.2325 (0.2299) data time 0.0008 (0.0016) model time 0.2317 (0.2280) loss 2.3084 (3.1615) grad_norm 2.4037 (2.6595) loss_scale 4096.0000 (2602.8587) mem 7381MB [2024-08-27 11:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1120/1251] eta 0:00:30 lr 0.000467 wd 0.0500 time 0.2202 (0.2298) data time 0.0010 (0.0016) model time 0.2191 (0.2280) loss 3.6549 (3.1628) grad_norm 2.2564 (2.6588) loss_scale 4096.0000 (2616.1784) mem 7381MB [2024-08-27 11:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1130/1251] eta 0:00:27 lr 0.000467 wd 0.0500 time 0.2319 (0.2300) data time 0.0010 (0.0016) model time 0.2309 (0.2282) loss 2.1391 (3.1613) grad_norm 2.4505 (2.6568) loss_scale 4096.0000 (2629.2626) mem 7381MB [2024-08-27 11:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1140/1251] eta 0:00:25 lr 0.000467 wd 0.0500 time 0.2289 (0.2300) data time 0.0007 (0.0016) model time 0.2282 (0.2282) loss 3.3655 (3.1602) grad_norm 2.5771 (2.6549) loss_scale 4096.0000 (2642.1174) mem 7381MB [2024-08-27 11:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1150/1251] eta 0:00:23 lr 0.000467 wd 0.0500 time 0.2290 (0.2300) data time 0.0009 (0.0016) model time 0.2281 (0.2282) loss 2.2474 (3.1597) grad_norm 2.1820 (2.6522) loss_scale 4096.0000 (2654.7489) mem 7381MB [2024-08-27 11:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1160/1251] eta 
0:00:20 lr 0.000467 wd 0.0500 time 0.2230 (0.2300) data time 0.0010 (0.0016) model time 0.2221 (0.2282) loss 3.4900 (3.1596) grad_norm 2.0521 (2.6496) loss_scale 4096.0000 (2667.1628) mem 7381MB [2024-08-27 11:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1170/1251] eta 0:00:18 lr 0.000467 wd 0.0500 time 0.2235 (0.2300) data time 0.0009 (0.0016) model time 0.2226 (0.2282) loss 3.6136 (3.1634) grad_norm 2.1814 (2.6469) loss_scale 4096.0000 (2679.3646) mem 7381MB [2024-08-27 11:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1180/1251] eta 0:00:16 lr 0.000466 wd 0.0500 time 0.2261 (0.2300) data time 0.0011 (0.0016) model time 0.2250 (0.2282) loss 3.2484 (3.1650) grad_norm 2.5299 (2.6436) loss_scale 4096.0000 (2691.3599) mem 7381MB [2024-08-27 11:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1190/1251] eta 0:00:14 lr 0.000466 wd 0.0500 time 0.2338 (0.2299) data time 0.0006 (0.0016) model time 0.2332 (0.2282) loss 2.2613 (3.1625) grad_norm 3.7463 (2.6462) loss_scale 4096.0000 (2703.1537) mem 7381MB [2024-08-27 11:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1200/1251] eta 0:00:11 lr 0.000466 wd 0.0500 time 0.2233 (0.2299) data time 0.0010 (0.0016) model time 0.2224 (0.2282) loss 3.7957 (3.1633) grad_norm 2.3707 (2.6451) loss_scale 4096.0000 (2714.7510) mem 7381MB [2024-08-27 11:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1210/1251] eta 0:00:09 lr 0.000466 wd 0.0500 time 0.2279 (0.2299) data time 0.0008 (0.0016) model time 0.2272 (0.2281) loss 3.2505 (3.1607) grad_norm 2.8306 (2.6445) loss_scale 4096.0000 (2726.1569) mem 7381MB [2024-08-27 11:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1220/1251] eta 0:00:07 lr 0.000466 wd 0.0500 time 0.2230 (0.2299) data time 0.0008 (0.0016) model time 0.2222 (0.2281) loss 2.7005 (3.1614) grad_norm 1.6132 (2.6433) loss_scale 4096.0000 (2737.3759) mem 
7381MB [2024-08-27 11:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1230/1251] eta 0:00:04 lr 0.000466 wd 0.0500 time 0.2302 (0.2299) data time 0.0009 (0.0016) model time 0.2293 (0.2281) loss 3.5067 (3.1593) grad_norm 2.6270 (2.6462) loss_scale 4096.0000 (2748.4127) mem 7381MB [2024-08-27 11:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1240/1251] eta 0:00:02 lr 0.000466 wd 0.0500 time 0.2098 (0.2298) data time 0.0007 (0.0016) model time 0.2092 (0.2280) loss 3.0629 (3.1607) grad_norm 2.2578 (2.6483) loss_scale 4096.0000 (2759.2716) mem 7381MB [2024-08-27 11:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [166/300][1250/1251] eta 0:00:00 lr 0.000466 wd 0.0500 time 0.2114 (0.2297) data time 0.0005 (0.0016) model time 0.2109 (0.2279) loss 3.6752 (3.1611) grad_norm 3.3841 (2.6510) loss_scale 4096.0000 (2769.9568) mem 7381MB [2024-08-27 11:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 166 training takes 0:04:47 [2024-08-27 11:23:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:23:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 11:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.544 (0.544) Loss 0.4709 (0.4709) Acc@1 91.797 (91.797) Acc@5 98.242 (98.242) Mem 7381MB [2024-08-27 11:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.087 (0.123) Loss 0.7168 (0.7283) Acc@1 86.230 (84.819) Acc@5 96.875 (96.911) Mem 7381MB [2024-08-27 11:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.092 (0.105) Loss 1.0479 (0.7593) Acc@1 75.195 (83.640) Acc@5 94.727 (96.856) Mem 7381MB [2024-08-27 11:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.096) Loss 1.2676 (0.8592) Acc@1 70.410 (81.392) Acc@5 91.699 (95.719) Mem 7381MB [2024-08-27 11:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.089) Loss 1.1836 (0.9127) Acc@1 72.949 (80.030) Acc@5 92.969 (95.141) Mem 7381MB [2024-08-27 11:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.670 Acc@5 95.110 [2024-08-27 11:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.7% [2024-08-27 11:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.951 (0.951) Loss 0.4053 (0.4053) Acc@1 93.164 (93.164) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-27 11:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.171) Loss 0.6289 (0.6278) Acc@1 87.402 (86.470) Acc@5 97.656 (97.434) Mem 7381MB [2024-08-27 11:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.129) Loss 0.8906 (0.6523) Acc@1 78.613 (85.603) Acc@5 95.605 (97.410) Mem 7381MB [2024-08-27 11:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.113) Loss 1.1250 (0.7406) Acc@1 72.852 (83.455) Acc@5 92.969 (96.453) Mem 7381MB [2024-08-27 11:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.102) Loss 1.0195 
(0.7864) Acc@1 74.609 (82.098) Acc@5 93.945 (95.994) Mem 7381MB [2024-08-27 11:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.700 Acc@5 95.974 [2024-08-27 11:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.7% [2024-08-27 11:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][0/1251] eta 0:27:50 lr 0.000466 wd 0.0500 time 1.3354 (1.3354) data time 0.9099 (0.9099) model time 0.0000 (0.0000) loss 3.2391 (3.2391) grad_norm 3.5316 (3.5316) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-27 11:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][10/1251] eta 0:06:47 lr 0.000466 wd 0.0500 time 0.2265 (0.3285) data time 0.0008 (0.0838) model time 0.0000 (0.0000) loss 3.5675 (3.0922) grad_norm 2.4971 (2.5886) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-27 11:23:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][20/1251] eta 0:05:46 lr 0.000466 wd 0.0500 time 0.2229 (0.2811) data time 0.0008 (0.0444) model time 0.0000 (0.0000) loss 3.3902 (3.1192) grad_norm 2.3055 (2.5474) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-27 11:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][30/1251] eta 0:05:22 lr 0.000466 wd 0.0500 time 0.2268 (0.2639) data time 0.0008 (0.0304) model time 0.0000 (0.0000) loss 2.7275 (3.1399) grad_norm 2.0069 (2.4906) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-27 11:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][40/1251] eta 0:05:08 lr 0.000466 wd 0.0500 time 0.2269 (0.2549) data time 0.0008 (0.0232) model time 0.0000 (0.0000) loss 2.8411 (3.1719) grad_norm 3.9967 (2.4519) loss_scale 4096.0000 (4096.0000) mem 7381MB [2024-08-27 11:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][50/1251] eta 0:04:59 lr 0.000466 wd 0.0500 time 0.2245 (0.2494) data time 0.0011 (0.0189) model time 0.0000 (0.0000) 
loss 3.4321 (3.1972) grad_norm 1.7518 (nan) loss_scale 2048.0000 (3814.9020) mem 7381MB [2024-08-27 11:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][60/1251] eta 0:04:52 lr 0.000466 wd 0.0500 time 0.2225 (0.2457) data time 0.0010 (0.0160) model time 0.2214 (0.2261) loss 2.9243 (3.1737) grad_norm 2.5466 (nan) loss_scale 2048.0000 (3525.2459) mem 7381MB [2024-08-27 11:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][70/1251] eta 0:04:47 lr 0.000466 wd 0.0500 time 0.2482 (0.2436) data time 0.0009 (0.0139) model time 0.2473 (0.2277) loss 1.9963 (3.1405) grad_norm 3.4021 (nan) loss_scale 2048.0000 (3317.1831) mem 7381MB [2024-08-27 11:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][80/1251] eta 0:04:43 lr 0.000466 wd 0.0500 time 0.2330 (0.2420) data time 0.0007 (0.0125) model time 0.2322 (0.2279) loss 2.8340 (3.1494) grad_norm 3.0080 (nan) loss_scale 2048.0000 (3160.4938) mem 7381MB [2024-08-27 11:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][90/1251] eta 0:04:39 lr 0.000466 wd 0.0500 time 0.2323 (0.2405) data time 0.0011 (0.0112) model time 0.2312 (0.2277) loss 2.3727 (3.1488) grad_norm 2.4205 (nan) loss_scale 2048.0000 (3038.2418) mem 7381MB [2024-08-27 11:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][100/1251] eta 0:04:35 lr 0.000466 wd 0.0500 time 0.2285 (0.2394) data time 0.0008 (0.0102) model time 0.2277 (0.2278) loss 1.4610 (3.1334) grad_norm 3.1144 (nan) loss_scale 2048.0000 (2940.1980) mem 7381MB [2024-08-27 11:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][110/1251] eta 0:04:32 lr 0.000466 wd 0.0500 time 0.2404 (0.2385) data time 0.0010 (0.0094) model time 0.2394 (0.2278) loss 3.0680 (3.1223) grad_norm 2.4274 (nan) loss_scale 2048.0000 (2859.8198) mem 7381MB [2024-08-27 11:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][120/1251] eta 0:04:28 lr 0.000466 wd 
0.0500 time 0.2293 (0.2378) data time 0.0011 (0.0087) model time 0.2283 (0.2281) loss 3.8053 (3.1375) grad_norm 2.3816 (nan) loss_scale 2048.0000 (2792.7273) mem 7381MB [2024-08-27 11:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][130/1251] eta 0:04:25 lr 0.000466 wd 0.0500 time 0.2243 (0.2370) data time 0.0010 (0.0081) model time 0.2233 (0.2278) loss 3.2476 (3.1528) grad_norm 2.1927 (nan) loss_scale 2048.0000 (2735.8779) mem 7381MB [2024-08-27 11:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][140/1251] eta 0:04:22 lr 0.000466 wd 0.0500 time 0.2376 (0.2366) data time 0.0009 (0.0078) model time 0.2367 (0.2278) loss 3.0450 (3.1358) grad_norm 2.8904 (nan) loss_scale 2048.0000 (2687.0922) mem 7381MB [2024-08-27 11:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][150/1251] eta 0:04:19 lr 0.000465 wd 0.0500 time 0.2270 (0.2360) data time 0.0009 (0.0074) model time 0.2261 (0.2277) loss 3.0833 (3.1306) grad_norm 3.3750 (nan) loss_scale 2048.0000 (2644.7682) mem 7381MB [2024-08-27 11:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][160/1251] eta 0:04:17 lr 0.000465 wd 0.0500 time 0.2306 (0.2356) data time 0.0011 (0.0070) model time 0.2296 (0.2277) loss 2.7652 (3.1300) grad_norm 3.1527 (nan) loss_scale 2048.0000 (2607.7019) mem 7381MB [2024-08-27 11:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][170/1251] eta 0:04:14 lr 0.000465 wd 0.0500 time 0.3225 (0.2357) data time 0.0011 (0.0066) model time 0.3215 (0.2285) loss 3.2162 (3.1374) grad_norm 1.8926 (nan) loss_scale 2048.0000 (2574.9708) mem 7381MB [2024-08-27 11:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][180/1251] eta 0:04:11 lr 0.000465 wd 0.0500 time 0.2224 (0.2352) data time 0.0010 (0.0064) model time 0.2214 (0.2282) loss 3.2244 (3.1202) grad_norm 2.9775 (nan) loss_scale 2048.0000 (2545.8564) mem 7381MB [2024-08-27 11:23:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [167/300][190/1251] eta 0:04:09 lr 0.000465 wd 0.0500 time 0.2368 (0.2350) data time 0.0009 (0.0061) model time 0.2359 (0.2283) loss 3.0469 (3.1330) grad_norm 2.1544 (nan) loss_scale 2048.0000 (2519.7906) mem 7381MB [2024-08-27 11:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][200/1251] eta 0:04:06 lr 0.000465 wd 0.0500 time 0.2328 (0.2347) data time 0.0012 (0.0059) model time 0.2316 (0.2283) loss 2.7252 (3.1323) grad_norm 2.0375 (nan) loss_scale 2048.0000 (2496.3184) mem 7381MB [2024-08-27 11:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][210/1251] eta 0:04:04 lr 0.000465 wd 0.0500 time 0.2300 (0.2344) data time 0.0007 (0.0057) model time 0.2292 (0.2282) loss 2.6478 (3.1284) grad_norm 2.0366 (nan) loss_scale 2048.0000 (2475.0711) mem 7381MB [2024-08-27 11:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][220/1251] eta 0:04:01 lr 0.000465 wd 0.0500 time 0.2257 (0.2340) data time 0.0011 (0.0054) model time 0.2246 (0.2280) loss 2.1340 (3.1233) grad_norm 2.4403 (nan) loss_scale 2048.0000 (2455.7466) mem 7381MB [2024-08-27 11:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][230/1251] eta 0:03:58 lr 0.000465 wd 0.0500 time 0.2416 (0.2339) data time 0.0010 (0.0053) model time 0.2406 (0.2281) loss 3.2974 (3.1273) grad_norm 2.2935 (nan) loss_scale 2048.0000 (2438.0952) mem 7381MB [2024-08-27 11:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][240/1251] eta 0:03:56 lr 0.000465 wd 0.0500 time 0.2289 (0.2337) data time 0.0006 (0.0052) model time 0.2283 (0.2280) loss 2.9416 (3.1232) grad_norm 9.9521 (nan) loss_scale 2048.0000 (2421.9087) mem 7381MB [2024-08-27 11:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][250/1251] eta 0:03:53 lr 0.000465 wd 0.0500 time 0.2389 (0.2335) data time 0.0010 (0.0050) model time 0.2378 (0.2280) loss 2.5669 (3.1303) grad_norm 2.7447 (nan) 
loss_scale 2048.0000 (2407.0120) mem 7381MB [2024-08-27 11:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][260/1251] eta 0:03:51 lr 0.000465 wd 0.0500 time 0.2390 (0.2334) data time 0.0007 (0.0049) model time 0.2383 (0.2280) loss 2.6670 (3.1232) grad_norm 3.3439 (nan) loss_scale 2048.0000 (2393.2567) mem 7381MB [2024-08-27 11:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][270/1251] eta 0:03:48 lr 0.000465 wd 0.0500 time 0.2291 (0.2332) data time 0.0007 (0.0048) model time 0.2284 (0.2280) loss 3.2094 (3.1280) grad_norm 2.4378 (nan) loss_scale 2048.0000 (2380.5166) mem 7381MB [2024-08-27 11:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][280/1251] eta 0:03:46 lr 0.000465 wd 0.0500 time 0.2256 (0.2331) data time 0.0010 (0.0046) model time 0.2246 (0.2280) loss 3.1016 (3.1305) grad_norm 1.8775 (nan) loss_scale 2048.0000 (2368.6833) mem 7381MB [2024-08-27 11:24:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][290/1251] eta 0:03:43 lr 0.000465 wd 0.0500 time 0.2478 (0.2330) data time 0.0009 (0.0045) model time 0.2468 (0.2280) loss 2.1052 (3.1233) grad_norm 2.0573 (nan) loss_scale 2048.0000 (2357.6632) mem 7381MB [2024-08-27 11:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][300/1251] eta 0:03:41 lr 0.000465 wd 0.0500 time 0.2276 (0.2329) data time 0.0007 (0.0044) model time 0.2269 (0.2282) loss 1.9251 (3.1158) grad_norm 2.7495 (nan) loss_scale 2048.0000 (2347.3754) mem 7381MB [2024-08-27 11:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][310/1251] eta 0:03:39 lr 0.000465 wd 0.0500 time 0.2302 (0.2328) data time 0.0009 (0.0043) model time 0.2293 (0.2281) loss 3.0991 (3.1171) grad_norm 5.6112 (nan) loss_scale 2048.0000 (2337.7492) mem 7381MB [2024-08-27 11:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][320/1251] eta 0:03:36 lr 0.000465 wd 0.0500 time 0.2240 (0.2327) data time 0.0009 
(0.0042) model time 0.2230 (0.2281) loss 3.4882 (3.1219) grad_norm 2.7249 (nan) loss_scale 2048.0000 (2328.7227) mem 7381MB [2024-08-27 11:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][330/1251] eta 0:03:34 lr 0.000465 wd 0.0500 time 0.2372 (0.2325) data time 0.0011 (0.0041) model time 0.2361 (0.2280) loss 3.4346 (3.1136) grad_norm 1.8335 (nan) loss_scale 2048.0000 (2320.2417) mem 7381MB [2024-08-27 11:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][340/1251] eta 0:03:31 lr 0.000465 wd 0.0500 time 0.2283 (0.2325) data time 0.0010 (0.0040) model time 0.2273 (0.2281) loss 3.4014 (3.1219) grad_norm 2.4786 (nan) loss_scale 2048.0000 (2312.2581) mem 7381MB [2024-08-27 11:24:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][350/1251] eta 0:03:29 lr 0.000465 wd 0.0500 time 0.4401 (0.2330) data time 0.0010 (0.0039) model time 0.4391 (0.2288) loss 3.2312 (3.1244) grad_norm 3.0986 (nan) loss_scale 2048.0000 (2304.7293) mem 7381MB [2024-08-27 11:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][360/1251] eta 0:03:27 lr 0.000465 wd 0.0500 time 0.2344 (0.2328) data time 0.0008 (0.0039) model time 0.2336 (0.2287) loss 2.7760 (3.1221) grad_norm 2.9860 (nan) loss_scale 2048.0000 (2297.6177) mem 7381MB [2024-08-27 11:24:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][370/1251] eta 0:03:25 lr 0.000465 wd 0.0500 time 0.2266 (0.2327) data time 0.0007 (0.0038) model time 0.2259 (0.2287) loss 4.0065 (3.1204) grad_norm 2.2406 (nan) loss_scale 2048.0000 (2290.8895) mem 7381MB [2024-08-27 11:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][380/1251] eta 0:03:22 lr 0.000464 wd 0.0500 time 0.2274 (0.2327) data time 0.0008 (0.0037) model time 0.2265 (0.2287) loss 3.2319 (3.1198) grad_norm 2.2387 (nan) loss_scale 2048.0000 (2284.5144) mem 7381MB [2024-08-27 11:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[167/300][390/1251] eta 0:03:20 lr 0.000464 wd 0.0500 time 0.2271 (0.2326) data time 0.0007 (0.0036) model time 0.2264 (0.2287) loss 4.1000 (3.1221) grad_norm 2.1632 (nan) loss_scale 2048.0000 (2278.4655) mem 7381MB [2024-08-27 11:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][400/1251] eta 0:03:17 lr 0.000464 wd 0.0500 time 0.2318 (0.2325) data time 0.0010 (0.0036) model time 0.2308 (0.2287) loss 3.4985 (3.1232) grad_norm 2.1898 (nan) loss_scale 2048.0000 (2272.7182) mem 7381MB [2024-08-27 11:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][410/1251] eta 0:03:15 lr 0.000464 wd 0.0500 time 0.2231 (0.2324) data time 0.0010 (0.0035) model time 0.2221 (0.2287) loss 3.1754 (3.1249) grad_norm 3.2453 (nan) loss_scale 2048.0000 (2267.2506) mem 7381MB [2024-08-27 11:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][420/1251] eta 0:03:13 lr 0.000464 wd 0.0500 time 0.2346 (0.2328) data time 0.0009 (0.0035) model time 0.2337 (0.2292) loss 2.5596 (3.1241) grad_norm 3.1090 (nan) loss_scale 2048.0000 (2262.0428) mem 7381MB [2024-08-27 11:24:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][430/1251] eta 0:03:11 lr 0.000464 wd 0.0500 time 0.2306 (0.2332) data time 0.0009 (0.0034) model time 0.2297 (0.2297) loss 3.6260 (3.1318) grad_norm 2.0134 (nan) loss_scale 2048.0000 (2257.0766) mem 7381MB [2024-08-27 11:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][440/1251] eta 0:03:09 lr 0.000464 wd 0.0500 time 0.2362 (0.2331) data time 0.0012 (0.0033) model time 0.2351 (0.2297) loss 3.6647 (3.1324) grad_norm 2.0853 (nan) loss_scale 2048.0000 (2252.3356) mem 7381MB [2024-08-27 11:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][450/1251] eta 0:03:06 lr 0.000464 wd 0.0500 time 0.2322 (0.2331) data time 0.0009 (0.0033) model time 0.2313 (0.2297) loss 3.4577 (3.1348) grad_norm 3.1920 (nan) loss_scale 2048.0000 (2247.8049) mem 7381MB 
[2024-08-27 11:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][460/1251] eta 0:03:04 lr 0.000464 wd 0.0500 time 0.2270 (0.2330) data time 0.0007 (0.0033) model time 0.2263 (0.2296) loss 2.2598 (3.1371) grad_norm 2.9447 (nan) loss_scale 2048.0000 (2243.4707) mem 7381MB [2024-08-27 11:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][470/1251] eta 0:03:01 lr 0.000464 wd 0.0500 time 0.2305 (0.2329) data time 0.0007 (0.0032) model time 0.2298 (0.2296) loss 3.5627 (3.1374) grad_norm 3.0882 (nan) loss_scale 2048.0000 (2239.3206) mem 7381MB [2024-08-27 11:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][480/1251] eta 0:02:59 lr 0.000464 wd 0.0500 time 0.2263 (0.2328) data time 0.0010 (0.0032) model time 0.2253 (0.2295) loss 2.4745 (3.1321) grad_norm 2.7408 (nan) loss_scale 2048.0000 (2235.3430) mem 7381MB [2024-08-27 11:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][490/1251] eta 0:02:57 lr 0.000464 wd 0.0500 time 0.2338 (0.2327) data time 0.0010 (0.0031) model time 0.2328 (0.2294) loss 3.6308 (3.1380) grad_norm 2.1148 (nan) loss_scale 2048.0000 (2231.5275) mem 7381MB [2024-08-27 11:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][500/1251] eta 0:02:54 lr 0.000464 wd 0.0500 time 0.2215 (0.2326) data time 0.0009 (0.0031) model time 0.2206 (0.2294) loss 3.1780 (3.1358) grad_norm 3.1233 (nan) loss_scale 2048.0000 (2227.8643) mem 7381MB [2024-08-27 11:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][510/1251] eta 0:02:52 lr 0.000464 wd 0.0500 time 0.2314 (0.2325) data time 0.0010 (0.0031) model time 0.2304 (0.2294) loss 3.5156 (3.1405) grad_norm 1.8226 (nan) loss_scale 2048.0000 (2224.3444) mem 7381MB [2024-08-27 11:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][520/1251] eta 0:02:49 lr 0.000464 wd 0.0500 time 0.2276 (0.2325) data time 0.0008 (0.0030) model time 0.2269 (0.2293) loss 
4.0641 (3.1411) grad_norm 2.2405 (nan) loss_scale 2048.0000 (2220.9597) mem 7381MB [2024-08-27 11:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][530/1251] eta 0:02:47 lr 0.000464 wd 0.0500 time 0.2319 (0.2324) data time 0.0009 (0.0030) model time 0.2310 (0.2293) loss 3.8386 (3.1449) grad_norm 2.0606 (nan) loss_scale 2048.0000 (2217.7024) mem 7381MB [2024-08-27 11:25:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][540/1251] eta 0:02:45 lr 0.000464 wd 0.0500 time 0.2255 (0.2324) data time 0.0008 (0.0029) model time 0.2247 (0.2293) loss 2.8720 (3.1482) grad_norm 2.7155 (nan) loss_scale 2048.0000 (2214.5656) mem 7381MB [2024-08-27 11:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][550/1251] eta 0:02:42 lr 0.000464 wd 0.0500 time 0.2180 (0.2323) data time 0.0012 (0.0029) model time 0.2168 (0.2293) loss 1.9977 (3.1462) grad_norm 2.8635 (nan) loss_scale 2048.0000 (2211.5426) mem 7381MB [2024-08-27 11:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][560/1251] eta 0:02:40 lr 0.000464 wd 0.0500 time 0.2262 (0.2322) data time 0.0010 (0.0029) model time 0.2252 (0.2292) loss 3.8301 (3.1516) grad_norm 3.3298 (nan) loss_scale 2048.0000 (2208.6275) mem 7381MB [2024-08-27 11:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][570/1251] eta 0:02:38 lr 0.000464 wd 0.0500 time 0.2303 (0.2321) data time 0.0007 (0.0029) model time 0.2297 (0.2291) loss 3.9129 (3.1556) grad_norm 4.1233 (nan) loss_scale 2048.0000 (2205.8144) mem 7381MB [2024-08-27 11:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][580/1251] eta 0:02:35 lr 0.000464 wd 0.0500 time 0.2337 (0.2322) data time 0.0008 (0.0028) model time 0.2330 (0.2292) loss 3.0163 (3.1603) grad_norm 1.8552 (nan) loss_scale 2048.0000 (2203.0981) mem 7381MB [2024-08-27 11:25:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][590/1251] eta 0:02:33 lr 0.000464 wd 0.0500 
time 0.2293 (0.2321) data time 0.0009 (0.0028) model time 0.2284 (0.2292) loss 3.4195 (3.1616) grad_norm 4.4734 (nan) loss_scale 2048.0000 (2200.4738) mem 7381MB [2024-08-27 11:25:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][600/1251] eta 0:02:31 lr 0.000464 wd 0.0500 time 0.2202 (0.2320) data time 0.0008 (0.0028) model time 0.2194 (0.2291) loss 2.7165 (3.1626) grad_norm 2.3287 (nan) loss_scale 2048.0000 (2197.9368) mem 7381MB [2024-08-27 11:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][610/1251] eta 0:02:28 lr 0.000463 wd 0.0500 time 0.2253 (0.2319) data time 0.0009 (0.0027) model time 0.2244 (0.2291) loss 3.6841 (3.1626) grad_norm 4.5527 (nan) loss_scale 2048.0000 (2195.4828) mem 7381MB [2024-08-27 11:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][620/1251] eta 0:02:26 lr 0.000463 wd 0.0500 time 0.2362 (0.2319) data time 0.0012 (0.0027) model time 0.2350 (0.2290) loss 2.6221 (3.1602) grad_norm 4.8652 (nan) loss_scale 2048.0000 (2193.1079) mem 7381MB [2024-08-27 11:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][630/1251] eta 0:02:23 lr 0.000463 wd 0.0500 time 0.2262 (0.2318) data time 0.0009 (0.0027) model time 0.2253 (0.2290) loss 3.5321 (3.1590) grad_norm 3.7512 (nan) loss_scale 2048.0000 (2190.8082) mem 7381MB [2024-08-27 11:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][640/1251] eta 0:02:21 lr 0.000463 wd 0.0500 time 0.2251 (0.2318) data time 0.0009 (0.0027) model time 0.2242 (0.2290) loss 2.6184 (3.1569) grad_norm 2.2299 (nan) loss_scale 2048.0000 (2188.5803) mem 7381MB [2024-08-27 11:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][650/1251] eta 0:02:19 lr 0.000463 wd 0.0500 time 0.2203 (0.2317) data time 0.0010 (0.0027) model time 0.2193 (0.2289) loss 3.0997 (3.1576) grad_norm 2.5118 (nan) loss_scale 2048.0000 (2186.4209) mem 7381MB [2024-08-27 11:25:43 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [167/300][660/1251] eta 0:02:16 lr 0.000463 wd 0.0500 time 0.2251 (0.2316) data time 0.0009 (0.0026) model time 0.2243 (0.2289) loss 3.4535 (3.1574) grad_norm 2.0116 (nan) loss_scale 2048.0000 (2184.3268) mem 7381MB [2024-08-27 11:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][670/1251] eta 0:02:14 lr 0.000463 wd 0.0500 time 0.2248 (0.2316) data time 0.0009 (0.0026) model time 0.2240 (0.2288) loss 3.1934 (3.1589) grad_norm 2.9224 (nan) loss_scale 2048.0000 (2182.2951) mem 7381MB [2024-08-27 11:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][680/1251] eta 0:02:12 lr 0.000463 wd 0.0500 time 0.2278 (0.2315) data time 0.0011 (0.0026) model time 0.2268 (0.2288) loss 3.6765 (3.1601) grad_norm 2.3943 (nan) loss_scale 2048.0000 (2180.3231) mem 7381MB [2024-08-27 11:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][690/1251] eta 0:02:09 lr 0.000463 wd 0.0500 time 0.2245 (0.2315) data time 0.0011 (0.0026) model time 0.2234 (0.2288) loss 3.6079 (3.1604) grad_norm 3.0400 (nan) loss_scale 2048.0000 (2178.4081) mem 7381MB [2024-08-27 11:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][700/1251] eta 0:02:07 lr 0.000463 wd 0.0500 time 0.2274 (0.2314) data time 0.0009 (0.0025) model time 0.2265 (0.2288) loss 3.1017 (3.1563) grad_norm 2.4355 (nan) loss_scale 2048.0000 (2176.5478) mem 7381MB [2024-08-27 11:25:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][710/1251] eta 0:02:05 lr 0.000463 wd 0.0500 time 0.2284 (0.2314) data time 0.0009 (0.0025) model time 0.2275 (0.2287) loss 3.6980 (3.1618) grad_norm 2.9282 (nan) loss_scale 2048.0000 (2174.7398) mem 7381MB [2024-08-27 11:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][720/1251] eta 0:02:02 lr 0.000463 wd 0.0500 time 0.2271 (0.2313) data time 0.0007 (0.0025) model time 0.2263 (0.2287) loss 2.2837 (3.1595) grad_norm 1.7271 (nan) 
loss_scale 2048.0000 (2172.9820) mem 7381MB [2024-08-27 11:25:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][730/1251] eta 0:02:00 lr 0.000463 wd 0.0500 time 0.2256 (0.2313) data time 0.0008 (0.0025) model time 0.2248 (0.2287) loss 3.6324 (3.1581) grad_norm 2.8802 (nan) loss_scale 2048.0000 (2171.2722) mem 7381MB [2024-08-27 11:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][740/1251] eta 0:01:58 lr 0.000463 wd 0.0500 time 0.2300 (0.2312) data time 0.0010 (0.0025) model time 0.2290 (0.2286) loss 2.7304 (3.1526) grad_norm 2.1821 (nan) loss_scale 2048.0000 (2169.6086) mem 7381MB [2024-08-27 11:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][750/1251] eta 0:01:55 lr 0.000463 wd 0.0500 time 0.2276 (0.2312) data time 0.0011 (0.0024) model time 0.2266 (0.2286) loss 2.1902 (3.1495) grad_norm 3.6094 (nan) loss_scale 2048.0000 (2167.9893) mem 7381MB [2024-08-27 11:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][760/1251] eta 0:01:53 lr 0.000463 wd 0.0500 time 0.2291 (0.2312) data time 0.0010 (0.0024) model time 0.2281 (0.2286) loss 2.6839 (3.1504) grad_norm 1.7659 (nan) loss_scale 2048.0000 (2166.4126) mem 7381MB [2024-08-27 11:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][770/1251] eta 0:01:51 lr 0.000463 wd 0.0500 time 0.2291 (0.2311) data time 0.0010 (0.0024) model time 0.2280 (0.2286) loss 2.6069 (3.1489) grad_norm 2.3417 (nan) loss_scale 2048.0000 (2164.8768) mem 7381MB [2024-08-27 11:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][780/1251] eta 0:01:48 lr 0.000463 wd 0.0500 time 0.2229 (0.2311) data time 0.0011 (0.0024) model time 0.2219 (0.2286) loss 3.2980 (3.1509) grad_norm 2.4366 (nan) loss_scale 2048.0000 (2163.3803) mem 7381MB [2024-08-27 11:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][790/1251] eta 0:01:46 lr 0.000463 wd 0.0500 time 0.2250 (0.2310) data time 0.0008 
(0.0024) model time 0.2243 (0.2285) loss 2.9390 (3.1504) grad_norm 1.6208 (nan) loss_scale 2048.0000 (2161.9216) mem 7381MB [2024-08-27 11:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][800/1251] eta 0:01:44 lr 0.000463 wd 0.0500 time 0.2284 (0.2310) data time 0.0010 (0.0024) model time 0.2274 (0.2285) loss 1.9555 (3.1481) grad_norm 5.2822 (nan) loss_scale 2048.0000 (2160.4994) mem 7381MB [2024-08-27 11:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][810/1251] eta 0:01:41 lr 0.000463 wd 0.0500 time 0.2378 (0.2310) data time 0.0007 (0.0023) model time 0.2371 (0.2285) loss 3.4682 (3.1518) grad_norm 2.9648 (nan) loss_scale 2048.0000 (2159.1122) mem 7381MB [2024-08-27 11:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][820/1251] eta 0:01:39 lr 0.000463 wd 0.0500 time 0.2213 (0.2309) data time 0.0007 (0.0023) model time 0.2206 (0.2285) loss 3.4833 (3.1528) grad_norm 2.7277 (nan) loss_scale 2048.0000 (2157.7588) mem 7381MB [2024-08-27 11:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][830/1251] eta 0:01:37 lr 0.000462 wd 0.0500 time 0.2252 (0.2309) data time 0.0007 (0.0023) model time 0.2245 (0.2285) loss 3.5880 (3.1520) grad_norm 2.9530 (nan) loss_scale 2048.0000 (2156.4380) mem 7381MB [2024-08-27 11:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][840/1251] eta 0:01:34 lr 0.000462 wd 0.0500 time 0.2277 (0.2308) data time 0.0007 (0.0023) model time 0.2270 (0.2284) loss 2.6143 (3.1488) grad_norm 2.0697 (nan) loss_scale 2048.0000 (2155.1486) mem 7381MB [2024-08-27 11:26:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][850/1251] eta 0:01:32 lr 0.000462 wd 0.0500 time 0.2268 (0.2308) data time 0.0011 (0.0023) model time 0.2257 (0.2284) loss 3.5762 (3.1533) grad_norm 1.9314 (nan) loss_scale 2048.0000 (2153.8895) mem 7381MB [2024-08-27 11:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[167/300][860/1251] eta 0:01:30 lr 0.000462 wd 0.0500 time 0.2324 (0.2307) data time 0.0009 (0.0023) model time 0.2315 (0.2283) loss 3.3554 (3.1550) grad_norm 2.3167 (nan) loss_scale 2048.0000 (2152.6597) mem 7381MB [2024-08-27 11:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][870/1251] eta 0:01:27 lr 0.000462 wd 0.0500 time 0.2222 (0.2307) data time 0.0009 (0.0022) model time 0.2213 (0.2283) loss 3.4605 (3.1543) grad_norm 3.3254 (nan) loss_scale 2048.0000 (2151.4581) mem 7381MB [2024-08-27 11:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][880/1251] eta 0:01:25 lr 0.000462 wd 0.0500 time 0.2236 (0.2306) data time 0.0006 (0.0022) model time 0.2229 (0.2283) loss 2.8480 (3.1544) grad_norm 2.8578 (nan) loss_scale 2048.0000 (2150.2838) mem 7381MB [2024-08-27 11:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][890/1251] eta 0:01:23 lr 0.000462 wd 0.0500 time 0.2361 (0.2306) data time 0.0009 (0.0022) model time 0.2352 (0.2283) loss 3.2636 (3.1569) grad_norm 2.9930 (nan) loss_scale 2048.0000 (2149.1358) mem 7381MB [2024-08-27 11:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][900/1251] eta 0:01:20 lr 0.000462 wd 0.0500 time 0.2300 (0.2306) data time 0.0007 (0.0022) model time 0.2292 (0.2283) loss 2.2598 (3.1527) grad_norm 2.7543 (nan) loss_scale 2048.0000 (2148.0133) mem 7381MB [2024-08-27 11:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][910/1251] eta 0:01:18 lr 0.000462 wd 0.0500 time 0.2250 (0.2306) data time 0.0007 (0.0022) model time 0.2243 (0.2283) loss 3.0673 (3.1505) grad_norm 2.6251 (nan) loss_scale 2048.0000 (2146.9155) mem 7381MB [2024-08-27 11:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][920/1251] eta 0:01:16 lr 0.000462 wd 0.0500 time 0.2291 (0.2306) data time 0.0008 (0.0022) model time 0.2283 (0.2283) loss 3.0975 (3.1496) grad_norm 2.0450 (nan) loss_scale 2048.0000 (2145.8415) mem 7381MB 
[2024-08-27 11:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][930/1251] eta 0:01:14 lr 0.000462 wd 0.0500 time 0.2203 (0.2305) data time 0.0007 (0.0022) model time 0.2195 (0.2283) loss 3.6074 (3.1525) grad_norm 2.1625 (nan) loss_scale 2048.0000 (2144.7905) mem 7381MB [2024-08-27 11:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][940/1251] eta 0:01:11 lr 0.000462 wd 0.0500 time 0.2226 (0.2305) data time 0.0012 (0.0022) model time 0.2213 (0.2283) loss 2.7767 (3.1527) grad_norm 2.4019 (nan) loss_scale 2048.0000 (2143.7620) mem 7381MB [2024-08-27 11:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][950/1251] eta 0:01:09 lr 0.000462 wd 0.0500 time 0.2235 (0.2305) data time 0.0009 (0.0021) model time 0.2225 (0.2283) loss 3.5917 (3.1507) grad_norm 2.0978 (nan) loss_scale 2048.0000 (2142.7550) mem 7381MB [2024-08-27 11:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][960/1251] eta 0:01:07 lr 0.000462 wd 0.0500 time 0.2281 (0.2310) data time 0.0011 (0.0021) model time 0.2270 (0.2287) loss 3.5275 (3.1497) grad_norm 2.4146 (nan) loss_scale 2048.0000 (2141.7690) mem 7381MB [2024-08-27 11:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][970/1251] eta 0:01:04 lr 0.000462 wd 0.0500 time 0.2277 (0.2309) data time 0.0007 (0.0021) model time 0.2269 (0.2287) loss 2.2894 (3.1482) grad_norm 2.0665 (nan) loss_scale 2048.0000 (2140.8033) mem 7381MB [2024-08-27 11:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][980/1251] eta 0:01:02 lr 0.000462 wd 0.0500 time 0.2247 (0.2309) data time 0.0010 (0.0021) model time 0.2237 (0.2287) loss 3.4358 (3.1494) grad_norm 2.5765 (nan) loss_scale 2048.0000 (2139.8573) mem 7381MB [2024-08-27 11:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][990/1251] eta 0:01:00 lr 0.000462 wd 0.0500 time 0.2285 (0.2309) data time 0.0006 (0.0021) model time 0.2279 (0.2287) loss 
2.4578 (3.1480) grad_norm 2.0181 (nan) loss_scale 2048.0000 (2138.9304) mem 7381MB [2024-08-27 11:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1000/1251] eta 0:00:57 lr 0.000462 wd 0.0500 time 0.2347 (0.2309) data time 0.0007 (0.0021) model time 0.2340 (0.2287) loss 2.6570 (3.1472) grad_norm 2.6193 (nan) loss_scale 2048.0000 (2138.0220) mem 7381MB [2024-08-27 11:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1010/1251] eta 0:00:55 lr 0.000462 wd 0.0500 time 0.2281 (0.2308) data time 0.0008 (0.0021) model time 0.2272 (0.2287) loss 3.5223 (3.1491) grad_norm 2.4864 (nan) loss_scale 2048.0000 (2137.1316) mem 7381MB [2024-08-27 11:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1020/1251] eta 0:00:53 lr 0.000462 wd 0.0500 time 0.2305 (0.2308) data time 0.0007 (0.0021) model time 0.2299 (0.2287) loss 1.8520 (3.1478) grad_norm 1.7553 (nan) loss_scale 2048.0000 (2136.2586) mem 7381MB [2024-08-27 11:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1030/1251] eta 0:00:51 lr 0.000462 wd 0.0500 time 0.2319 (0.2308) data time 0.0012 (0.0021) model time 0.2307 (0.2287) loss 3.6932 (3.1499) grad_norm 3.1439 (nan) loss_scale 2048.0000 (2135.4025) mem 7381MB [2024-08-27 11:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1040/1251] eta 0:00:48 lr 0.000462 wd 0.0500 time 0.2322 (0.2308) data time 0.0009 (0.0021) model time 0.2313 (0.2287) loss 2.2022 (3.1517) grad_norm 2.6399 (nan) loss_scale 2048.0000 (2134.5629) mem 7381MB [2024-08-27 11:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1050/1251] eta 0:00:46 lr 0.000462 wd 0.0500 time 0.2255 (0.2308) data time 0.0010 (0.0020) model time 0.2245 (0.2286) loss 3.0852 (3.1503) grad_norm 2.4494 (nan) loss_scale 2048.0000 (2133.7393) mem 7381MB [2024-08-27 11:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1060/1251] eta 0:00:44 lr 0.000461 wd 
0.0500 time 0.2332 (0.2307) data time 0.0009 (0.0020) model time 0.2323 (0.2286) loss 2.7673 (3.1511) grad_norm 1.8992 (nan) loss_scale 2048.0000 (2132.9312) mem 7381MB [2024-08-27 11:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1070/1251] eta 0:00:41 lr 0.000461 wd 0.0500 time 0.2277 (0.2307) data time 0.0009 (0.0020) model time 0.2268 (0.2286) loss 3.3762 (3.1512) grad_norm 2.0673 (nan) loss_scale 2048.0000 (2132.1382) mem 7381MB [2024-08-27 11:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1080/1251] eta 0:00:39 lr 0.000461 wd 0.0500 time 0.2304 (0.2307) data time 0.0007 (0.0020) model time 0.2297 (0.2286) loss 3.6879 (3.1520) grad_norm 3.0598 (nan) loss_scale 2048.0000 (2131.3599) mem 7381MB [2024-08-27 11:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1090/1251] eta 0:00:37 lr 0.000461 wd 0.0500 time 0.2301 (0.2307) data time 0.0010 (0.0020) model time 0.2291 (0.2286) loss 2.6372 (3.1535) grad_norm 2.1078 (nan) loss_scale 2048.0000 (2130.5958) mem 7381MB [2024-08-27 11:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1100/1251] eta 0:00:34 lr 0.000461 wd 0.0500 time 0.2234 (0.2307) data time 0.0010 (0.0020) model time 0.2224 (0.2286) loss 4.0271 (3.1533) grad_norm 2.2406 (nan) loss_scale 2048.0000 (2129.8456) mem 7381MB [2024-08-27 11:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1110/1251] eta 0:00:32 lr 0.000461 wd 0.0500 time 0.2426 (0.2307) data time 0.0009 (0.0020) model time 0.2417 (0.2286) loss 4.2210 (3.1545) grad_norm 2.5424 (nan) loss_scale 2048.0000 (2129.1089) mem 7381MB [2024-08-27 11:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1120/1251] eta 0:00:30 lr 0.000461 wd 0.0500 time 0.2517 (0.2307) data time 0.0007 (0.0020) model time 0.2510 (0.2286) loss 2.3931 (3.1557) grad_norm 3.2616 (nan) loss_scale 1024.0000 (2121.0776) mem 7381MB [2024-08-27 11:27:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [167/300][1130/1251] eta 0:00:27 lr 0.000461 wd 0.0500 time 0.2345 (0.2307) data time 0.0010 (0.0020) model time 0.2335 (0.2286) loss 3.2850 (3.1555) grad_norm 3.4473 (nan) loss_scale 1024.0000 (2111.3775) mem 7381MB [2024-08-27 11:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1140/1251] eta 0:00:25 lr 0.000461 wd 0.0500 time 0.2263 (0.2307) data time 0.0008 (0.0020) model time 0.2255 (0.2286) loss 2.1605 (3.1554) grad_norm 2.7750 (nan) loss_scale 1024.0000 (2101.8475) mem 7381MB [2024-08-27 11:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1150/1251] eta 0:00:23 lr 0.000461 wd 0.0500 time 0.2219 (0.2306) data time 0.0009 (0.0020) model time 0.2210 (0.2286) loss 3.3296 (3.1553) grad_norm 1.7122 (nan) loss_scale 1024.0000 (2092.4831) mem 7381MB [2024-08-27 11:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1160/1251] eta 0:00:20 lr 0.000461 wd 0.0500 time 0.2298 (0.2306) data time 0.0007 (0.0020) model time 0.2290 (0.2286) loss 3.1231 (3.1580) grad_norm 2.4394 (nan) loss_scale 1024.0000 (2083.2799) mem 7381MB [2024-08-27 11:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1170/1251] eta 0:00:18 lr 0.000461 wd 0.0500 time 0.2300 (0.2306) data time 0.0007 (0.0019) model time 0.2293 (0.2286) loss 3.5110 (3.1592) grad_norm 2.5834 (nan) loss_scale 1024.0000 (2074.2340) mem 7381MB [2024-08-27 11:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1180/1251] eta 0:00:16 lr 0.000461 wd 0.0500 time 0.2273 (0.2306) data time 0.0010 (0.0019) model time 0.2263 (0.2286) loss 3.2805 (3.1606) grad_norm 2.0722 (nan) loss_scale 1024.0000 (2065.3412) mem 7381MB [2024-08-27 11:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1190/1251] eta 0:00:14 lr 0.000461 wd 0.0500 time 0.2307 (0.2306) data time 0.0009 (0.0019) model time 0.2298 (0.2286) loss 3.6879 (3.1599) grad_norm 3.8569 (nan) 
loss_scale 1024.0000 (2056.5978) mem 7381MB [2024-08-27 11:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1200/1251] eta 0:00:11 lr 0.000461 wd 0.0500 time 0.2232 (0.2306) data time 0.0010 (0.0019) model time 0.2222 (0.2285) loss 3.7640 (3.1594) grad_norm 3.4045 (nan) loss_scale 1024.0000 (2048.0000) mem 7381MB [2024-08-27 11:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1210/1251] eta 0:00:09 lr 0.000461 wd 0.0500 time 0.2356 (0.2306) data time 0.0012 (0.0019) model time 0.2344 (0.2286) loss 3.2013 (3.1582) grad_norm 2.4651 (nan) loss_scale 1024.0000 (2039.5442) mem 7381MB [2024-08-27 11:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1220/1251] eta 0:00:07 lr 0.000461 wd 0.0500 time 0.2272 (0.2305) data time 0.0009 (0.0019) model time 0.2263 (0.2286) loss 3.1148 (3.1591) grad_norm 2.2437 (nan) loss_scale 1024.0000 (2031.2269) mem 7381MB [2024-08-27 11:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1230/1251] eta 0:00:04 lr 0.000461 wd 0.0500 time 0.2297 (0.2305) data time 0.0010 (0.0019) model time 0.2287 (0.2285) loss 2.2105 (3.1579) grad_norm 3.2200 (nan) loss_scale 1024.0000 (2023.0447) mem 7381MB [2024-08-27 11:27:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1240/1251] eta 0:00:02 lr 0.000461 wd 0.0500 time 0.2179 (0.2304) data time 0.0005 (0.0019) model time 0.2174 (0.2284) loss 3.1066 (3.1574) grad_norm 3.2158 (nan) loss_scale 1024.0000 (2014.9944) mem 7381MB [2024-08-27 11:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [167/300][1250/1251] eta 0:00:00 lr 0.000461 wd 0.0500 time 0.2127 (0.2303) data time 0.0004 (0.0019) model time 0.2123 (0.2283) loss 3.3676 (3.1584) grad_norm 2.3121 (nan) loss_scale 1024.0000 (2007.0727) mem 7381MB [2024-08-27 11:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 167 training takes 0:04:48 [2024-08-27 11:27:58 msvmambav3_tiny_224] (utils.py 118): 
INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:27:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.457 (0.457) Loss 0.4673 (0.4673) Acc@1 90.820 (90.820) Acc@5 98.047 (98.047) Mem 7381MB [2024-08-27 11:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.117) Loss 0.7158 (0.7045) Acc@1 85.352 (84.632) Acc@5 96.484 (96.848) Mem 7381MB [2024-08-27 11:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.100) Loss 1.0273 (0.7243) Acc@1 75.293 (83.738) Acc@5 94.043 (96.838) Mem 7381MB [2024-08-27 11:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.094) Loss 1.2275 (0.8259) Acc@1 70.508 (81.426) Acc@5 91.602 (95.747) Mem 7381MB [2024-08-27 11:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0996 (0.8788) Acc@1 73.633 (80.030) Acc@5 94.043 (95.165) Mem 7381MB [2024-08-27 11:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.666 Acc@5 95.118 [2024-08-27 11:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.7% [2024-08-27 11:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.935 (0.935) Loss 0.4038 (0.4038) Acc@1 93.164 (93.164) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-27 11:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.166) Loss 0.6270 (0.6267) Acc@1 87.793 (86.506) Acc@5 97.754 (97.417) Mem 7381MB [2024-08-27 11:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.125) Loss 0.8906 (0.6516) Acc@1 78.809 (85.649) Acc@5 95.605 (97.414) Mem 7381MB [2024-08-27 11:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): 
INFO Test: [30/49] Time 0.084 (0.110) Loss 1.1230 (0.7396) Acc@1 72.656 (83.506) Acc@5 93.066 (96.437) Mem 7381MB [2024-08-27 11:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.099) Loss 1.0186 (0.7853) Acc@1 74.902 (82.136) Acc@5 94.141 (95.972) Mem 7381MB [2024-08-27 11:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.756 Acc@5 95.950 [2024-08-27 11:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.8% [2024-08-27 11:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.76% [2024-08-27 11:28:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 11:28:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 11:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][0/1251] eta 0:16:22 lr 0.000461 wd 0.0500 time 0.7855 (0.7855) data time 0.5432 (0.5432) model time 0.0000 (0.0000) loss 3.5953 (3.5953) grad_norm 2.3407 (2.3407) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][10/1251] eta 0:05:42 lr 0.000461 wd 0.0500 time 0.2243 (0.2762) data time 0.0007 (0.0504) model time 0.0000 (0.0000) loss 3.3800 (2.8978) grad_norm 2.5927 (2.6004) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][20/1251] eta 0:05:10 lr 0.000461 wd 0.0500 time 0.2303 (0.2523) data time 0.0010 (0.0269) model time 0.0000 (0.0000) loss 3.1847 (3.0218) grad_norm 2.0373 (2.6050) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][30/1251] eta 0:04:58 lr 0.000460 wd 0.0500 time 0.2393 (0.2443) data time 0.0011 
(0.0185) model time 0.0000 (0.0000) loss 3.0436 (3.0397) grad_norm 2.0717 (2.6551) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][40/1251] eta 0:04:51 lr 0.000460 wd 0.0500 time 0.2279 (0.2404) data time 0.0009 (0.0143) model time 0.0000 (0.0000) loss 3.5639 (3.0993) grad_norm 2.4720 (2.6480) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][50/1251] eta 0:04:46 lr 0.000460 wd 0.0500 time 0.2267 (0.2382) data time 0.0007 (0.0117) model time 0.0000 (0.0000) loss 2.5640 (3.0389) grad_norm 2.7726 (2.6021) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][60/1251] eta 0:04:41 lr 0.000460 wd 0.0500 time 0.2313 (0.2362) data time 0.0006 (0.0099) model time 0.2307 (0.2256) loss 2.5693 (3.0583) grad_norm 2.7648 (2.5901) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][70/1251] eta 0:04:37 lr 0.000460 wd 0.0500 time 0.2349 (0.2352) data time 0.0010 (0.0087) model time 0.2339 (0.2267) loss 3.1134 (3.0812) grad_norm 2.1838 (2.6047) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][80/1251] eta 0:04:34 lr 0.000460 wd 0.0500 time 0.2261 (0.2344) data time 0.0009 (0.0077) model time 0.2252 (0.2269) loss 1.7902 (3.0703) grad_norm 3.6242 (2.6482) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][90/1251] eta 0:04:31 lr 0.000460 wd 0.0500 time 0.2231 (0.2337) data time 0.0009 (0.0070) model time 0.2221 (0.2270) loss 3.5713 (3.1093) grad_norm 2.2362 (2.6773) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [168/300][100/1251] eta 0:04:28 lr 0.000460 wd 0.0500 time 0.2324 (0.2331) data time 0.0010 (0.0064) model time 0.2315 (0.2269) loss 2.7546 (3.1005) grad_norm 1.9297 (2.6718) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][110/1251] eta 0:04:25 lr 0.000460 wd 0.0500 time 0.2316 (0.2326) data time 0.0007 (0.0059) model time 0.2310 (0.2269) loss 3.5263 (3.0890) grad_norm 3.8306 (2.7102) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][120/1251] eta 0:04:22 lr 0.000460 wd 0.0500 time 0.2246 (0.2320) data time 0.0009 (0.0055) model time 0.2237 (0.2265) loss 2.8752 (3.0830) grad_norm 3.4481 (2.7092) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][130/1251] eta 0:04:19 lr 0.000460 wd 0.0500 time 0.2221 (0.2317) data time 0.0008 (0.0052) model time 0.2214 (0.2265) loss 3.5544 (3.0955) grad_norm 1.9577 (2.7249) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][140/1251] eta 0:04:17 lr 0.000460 wd 0.0500 time 0.2287 (0.2314) data time 0.0007 (0.0049) model time 0.2280 (0.2265) loss 3.8085 (3.0942) grad_norm 2.1725 (2.7120) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][150/1251] eta 0:04:14 lr 0.000460 wd 0.0500 time 0.2254 (0.2313) data time 0.0010 (0.0046) model time 0.2244 (0.2267) loss 3.5993 (3.1007) grad_norm 2.7232 (2.7500) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][160/1251] eta 0:04:12 lr 0.000460 wd 0.0500 time 0.2256 (0.2311) data time 0.0008 (0.0044) model time 0.2247 (0.2268) loss 3.0356 (3.0932) grad_norm 2.2435 (2.7594) loss_scale 
1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][170/1251] eta 0:04:09 lr 0.000460 wd 0.0500 time 0.2372 (0.2310) data time 0.0008 (0.0042) model time 0.2364 (0.2269) loss 1.9852 (3.0761) grad_norm 2.7627 (2.7557) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][180/1251] eta 0:04:07 lr 0.000460 wd 0.0500 time 0.2260 (0.2308) data time 0.0009 (0.0040) model time 0.2252 (0.2268) loss 3.6377 (3.0891) grad_norm 2.1422 (2.7429) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][190/1251] eta 0:04:04 lr 0.000460 wd 0.0500 time 0.2363 (0.2306) data time 0.0010 (0.0039) model time 0.2354 (0.2268) loss 3.9681 (3.0961) grad_norm 2.7072 (2.7437) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][200/1251] eta 0:04:02 lr 0.000460 wd 0.0500 time 0.2358 (0.2306) data time 0.0008 (0.0037) model time 0.2349 (0.2269) loss 3.6004 (3.0920) grad_norm 1.9164 (2.7431) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][210/1251] eta 0:03:59 lr 0.000460 wd 0.0500 time 0.2273 (0.2305) data time 0.0012 (0.0036) model time 0.2260 (0.2270) loss 3.3799 (3.0922) grad_norm 2.3271 (2.7205) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][220/1251] eta 0:03:57 lr 0.000460 wd 0.0500 time 0.2268 (0.2304) data time 0.0009 (0.0035) model time 0.2259 (0.2270) loss 3.7764 (3.0989) grad_norm 2.7417 (2.7135) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][230/1251] eta 0:03:57 lr 0.000460 wd 0.0500 time 0.4360 (0.2322) data time 
0.0011 (0.0034) model time 0.4349 (0.2295) loss 3.0737 (3.0969) grad_norm 3.1988 (2.7578) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][240/1251] eta 0:03:55 lr 0.000460 wd 0.0500 time 0.2300 (0.2329) data time 0.0010 (0.0033) model time 0.2290 (0.2304) loss 3.2662 (3.0921) grad_norm 2.9624 (2.7693) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][250/1251] eta 0:03:52 lr 0.000460 wd 0.0500 time 0.2313 (0.2327) data time 0.0007 (0.0032) model time 0.2306 (0.2302) loss 3.6003 (3.1095) grad_norm 2.9143 (2.7550) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][260/1251] eta 0:03:50 lr 0.000459 wd 0.0500 time 0.2252 (0.2326) data time 0.0012 (0.0032) model time 0.2240 (0.2302) loss 3.2696 (3.1147) grad_norm 3.9796 (2.7495) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][270/1251] eta 0:03:48 lr 0.000459 wd 0.0500 time 0.2268 (0.2325) data time 0.0011 (0.0031) model time 0.2257 (0.2301) loss 2.5218 (3.1088) grad_norm 2.4984 (2.7336) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][280/1251] eta 0:03:45 lr 0.000459 wd 0.0500 time 0.2305 (0.2324) data time 0.0008 (0.0030) model time 0.2297 (0.2300) loss 3.3786 (3.1121) grad_norm 3.7613 (2.7266) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][290/1251] eta 0:03:43 lr 0.000459 wd 0.0500 time 0.2265 (0.2322) data time 0.0008 (0.0029) model time 0.2258 (0.2299) loss 2.1571 (3.1080) grad_norm 2.5963 (inf) loss_scale 512.0000 (1009.9244) mem 7381MB [2024-08-27 11:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [168/300][300/1251] eta 0:03:40 lr 0.000459 wd 0.0500 time 0.2230 (0.2321) data time 0.0010 (0.0029) model time 0.2220 (0.2298) loss 2.9554 (3.1089) grad_norm 4.7369 (inf) loss_scale 512.0000 (993.3821) mem 7381MB [2024-08-27 11:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][310/1251] eta 0:03:38 lr 0.000459 wd 0.0500 time 0.2315 (0.2320) data time 0.0011 (0.0029) model time 0.2304 (0.2297) loss 2.9656 (3.1169) grad_norm 2.9706 (inf) loss_scale 512.0000 (977.9035) mem 7381MB [2024-08-27 11:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][320/1251] eta 0:03:36 lr 0.000459 wd 0.0500 time 0.2228 (0.2321) data time 0.0011 (0.0028) model time 0.2218 (0.2298) loss 3.3005 (3.1230) grad_norm 3.4125 (inf) loss_scale 512.0000 (963.3894) mem 7381MB [2024-08-27 11:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][330/1251] eta 0:03:33 lr 0.000459 wd 0.0500 time 0.2409 (0.2321) data time 0.0009 (0.0028) model time 0.2400 (0.2298) loss 2.1854 (3.1205) grad_norm 3.1834 (inf) loss_scale 512.0000 (949.7523) mem 7381MB [2024-08-27 11:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][340/1251] eta 0:03:31 lr 0.000459 wd 0.0500 time 0.2230 (0.2319) data time 0.0009 (0.0028) model time 0.2221 (0.2297) loss 3.7308 (3.1172) grad_norm 2.0514 (inf) loss_scale 512.0000 (936.9150) mem 7381MB [2024-08-27 11:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][350/1251] eta 0:03:28 lr 0.000459 wd 0.0500 time 0.2267 (0.2319) data time 0.0010 (0.0027) model time 0.2257 (0.2297) loss 3.4279 (3.1198) grad_norm 2.4606 (inf) loss_scale 512.0000 (924.8091) mem 7381MB [2024-08-27 11:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][360/1251] eta 0:03:26 lr 0.000459 wd 0.0500 time 0.2256 (0.2319) data time 0.0012 (0.0027) model time 0.2244 (0.2297) loss 3.4988 (3.1152) grad_norm 4.1689 (inf) loss_scale 512.0000 (913.3740) mem 7381MB 
[2024-08-27 11:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][370/1251] eta 0:03:24 lr 0.000459 wd 0.0500 time 0.2312 (0.2319) data time 0.0008 (0.0026) model time 0.2303 (0.2297) loss 3.4975 (3.1188) grad_norm 2.8893 (inf) loss_scale 512.0000 (902.5553) mem 7381MB [2024-08-27 11:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][380/1251] eta 0:03:21 lr 0.000459 wd 0.0500 time 0.2215 (0.2318) data time 0.0008 (0.0026) model time 0.2207 (0.2297) loss 4.0621 (3.1175) grad_norm 3.8202 (inf) loss_scale 512.0000 (892.3045) mem 7381MB [2024-08-27 11:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][390/1251] eta 0:03:19 lr 0.000459 wd 0.0500 time 0.2257 (0.2318) data time 0.0009 (0.0025) model time 0.2249 (0.2296) loss 3.8424 (3.1227) grad_norm 3.0709 (inf) loss_scale 512.0000 (882.5780) mem 7381MB [2024-08-27 11:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][400/1251] eta 0:03:17 lr 0.000459 wd 0.0500 time 0.2235 (0.2317) data time 0.0011 (0.0025) model time 0.2224 (0.2296) loss 2.3569 (3.1169) grad_norm 2.8760 (inf) loss_scale 512.0000 (873.3367) mem 7381MB [2024-08-27 11:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][410/1251] eta 0:03:14 lr 0.000459 wd 0.0500 time 0.2339 (0.2317) data time 0.0009 (0.0025) model time 0.2330 (0.2296) loss 3.5140 (3.1165) grad_norm 1.9970 (inf) loss_scale 512.0000 (864.5450) mem 7381MB [2024-08-27 11:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][420/1251] eta 0:03:12 lr 0.000459 wd 0.0500 time 0.2247 (0.2316) data time 0.0007 (0.0024) model time 0.2240 (0.2295) loss 3.3226 (3.1161) grad_norm 2.3006 (inf) loss_scale 512.0000 (856.1710) mem 7381MB [2024-08-27 11:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][430/1251] eta 0:03:10 lr 0.000459 wd 0.0500 time 0.2319 (0.2316) data time 0.0011 (0.0024) model time 0.2308 (0.2296) loss 3.4157 (3.1231) 
grad_norm 1.9758 (inf) loss_scale 512.0000 (848.1856) mem 7381MB [2024-08-27 11:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][440/1251] eta 0:03:07 lr 0.000459 wd 0.0500 time 0.2304 (0.2316) data time 0.0006 (0.0024) model time 0.2297 (0.2296) loss 3.6663 (3.1203) grad_norm 1.9868 (inf) loss_scale 512.0000 (840.5624) mem 7381MB [2024-08-27 11:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][450/1251] eta 0:03:05 lr 0.000459 wd 0.0500 time 0.2311 (0.2315) data time 0.0008 (0.0023) model time 0.2303 (0.2295) loss 3.7751 (3.1289) grad_norm 2.3631 (inf) loss_scale 512.0000 (833.2772) mem 7381MB [2024-08-27 11:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][460/1251] eta 0:03:03 lr 0.000459 wd 0.0500 time 0.2280 (0.2314) data time 0.0011 (0.0023) model time 0.2269 (0.2294) loss 2.8220 (3.1288) grad_norm 4.4994 (inf) loss_scale 512.0000 (826.3080) mem 7381MB [2024-08-27 11:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][470/1251] eta 0:03:01 lr 0.000459 wd 0.0500 time 0.2408 (0.2318) data time 0.0008 (0.0023) model time 0.2400 (0.2299) loss 3.3174 (3.1342) grad_norm 2.2951 (inf) loss_scale 512.0000 (819.6348) mem 7381MB [2024-08-27 11:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][480/1251] eta 0:02:58 lr 0.000459 wd 0.0500 time 0.2293 (0.2318) data time 0.0011 (0.0023) model time 0.2282 (0.2299) loss 3.4437 (3.1335) grad_norm 3.6219 (inf) loss_scale 512.0000 (813.2391) mem 7381MB [2024-08-27 11:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][490/1251] eta 0:02:56 lr 0.000458 wd 0.0500 time 0.2295 (0.2318) data time 0.0007 (0.0022) model time 0.2287 (0.2299) loss 3.8690 (3.1399) grad_norm 3.3110 (inf) loss_scale 512.0000 (807.1039) mem 7381MB [2024-08-27 11:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][500/1251] eta 0:02:54 lr 0.000458 wd 0.0500 time 0.2255 (0.2317) data 
time 0.0008 (0.0022) model time 0.2247 (0.2298) loss 2.6031 (3.1473) grad_norm 2.0778 (inf) loss_scale 512.0000 (801.2136) mem 7381MB [2024-08-27 11:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][510/1251] eta 0:02:51 lr 0.000458 wd 0.0500 time 0.2249 (0.2317) data time 0.0013 (0.0022) model time 0.2236 (0.2298) loss 2.6255 (3.1527) grad_norm 2.0368 (inf) loss_scale 512.0000 (795.5538) mem 7381MB [2024-08-27 11:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][520/1251] eta 0:02:49 lr 0.000458 wd 0.0500 time 0.2215 (0.2316) data time 0.0009 (0.0022) model time 0.2206 (0.2297) loss 2.8232 (3.1438) grad_norm 2.0358 (inf) loss_scale 512.0000 (790.1113) mem 7381MB [2024-08-27 11:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][530/1251] eta 0:02:46 lr 0.000458 wd 0.0500 time 0.2308 (0.2315) data time 0.0007 (0.0022) model time 0.2301 (0.2297) loss 3.6664 (3.1460) grad_norm 4.1217 (inf) loss_scale 512.0000 (784.8738) mem 7381MB [2024-08-27 11:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][540/1251] eta 0:02:44 lr 0.000458 wd 0.0500 time 0.2321 (0.2314) data time 0.0014 (0.0022) model time 0.2307 (0.2296) loss 3.6087 (3.1469) grad_norm 2.2482 (inf) loss_scale 512.0000 (779.8299) mem 7381MB [2024-08-27 11:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][550/1251] eta 0:02:42 lr 0.000458 wd 0.0500 time 0.2229 (0.2314) data time 0.0007 (0.0021) model time 0.2222 (0.2295) loss 3.2513 (3.1491) grad_norm 2.9400 (inf) loss_scale 512.0000 (774.9691) mem 7381MB [2024-08-27 11:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][560/1251] eta 0:02:39 lr 0.000458 wd 0.0500 time 0.2295 (0.2313) data time 0.0009 (0.0021) model time 0.2286 (0.2295) loss 3.5587 (3.1518) grad_norm 2.5686 (inf) loss_scale 512.0000 (770.2816) mem 7381MB [2024-08-27 11:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[168/300][570/1251] eta 0:02:37 lr 0.000458 wd 0.0500 time 0.2282 (0.2313) data time 0.0010 (0.0021) model time 0.2272 (0.2294) loss 3.7617 (3.1527) grad_norm 2.4558 (inf) loss_scale 512.0000 (765.7583) mem 7381MB [2024-08-27 11:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][580/1251] eta 0:02:35 lr 0.000458 wd 0.0500 time 0.2288 (0.2312) data time 0.0006 (0.0021) model time 0.2282 (0.2293) loss 2.5812 (3.1476) grad_norm 2.5774 (inf) loss_scale 512.0000 (761.3907) mem 7381MB [2024-08-27 11:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][590/1251] eta 0:02:32 lr 0.000458 wd 0.0500 time 0.2278 (0.2311) data time 0.0007 (0.0021) model time 0.2271 (0.2293) loss 3.8699 (3.1504) grad_norm 2.8380 (inf) loss_scale 512.0000 (757.1709) mem 7381MB [2024-08-27 11:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][600/1251] eta 0:02:30 lr 0.000458 wd 0.0500 time 0.2319 (0.2310) data time 0.0009 (0.0020) model time 0.2310 (0.2292) loss 3.5565 (3.1530) grad_norm 20.5534 (inf) loss_scale 512.0000 (753.0915) mem 7381MB [2024-08-27 11:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][610/1251] eta 0:02:28 lr 0.000458 wd 0.0500 time 0.2208 (0.2310) data time 0.0012 (0.0020) model time 0.2196 (0.2292) loss 3.2445 (3.1553) grad_norm 2.6229 (inf) loss_scale 512.0000 (749.1457) mem 7381MB [2024-08-27 11:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][620/1251] eta 0:02:25 lr 0.000458 wd 0.0500 time 0.2350 (0.2309) data time 0.0009 (0.0020) model time 0.2341 (0.2292) loss 2.4770 (3.1560) grad_norm 2.1311 (inf) loss_scale 512.0000 (745.3269) mem 7381MB [2024-08-27 11:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][630/1251] eta 0:02:23 lr 0.000458 wd 0.0500 time 0.2299 (0.2309) data time 0.0009 (0.0020) model time 0.2290 (0.2291) loss 3.3276 (3.1576) grad_norm 2.2221 (inf) loss_scale 512.0000 (741.6292) mem 7381MB [2024-08-27 
11:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][640/1251] eta 0:02:21 lr 0.000458 wd 0.0500 time 0.2448 (0.2309) data time 0.0008 (0.0020) model time 0.2440 (0.2291) loss 2.7055 (3.1587) grad_norm 4.2845 (inf) loss_scale 512.0000 (738.0468) mem 7381MB [2024-08-27 11:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][650/1251] eta 0:02:18 lr 0.000458 wd 0.0500 time 0.2302 (0.2309) data time 0.0008 (0.0020) model time 0.2294 (0.2291) loss 2.7257 (3.1567) grad_norm 2.3101 (inf) loss_scale 512.0000 (734.5745) mem 7381MB [2024-08-27 11:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][660/1251] eta 0:02:16 lr 0.000458 wd 0.0500 time 0.2414 (0.2309) data time 0.0007 (0.0020) model time 0.2407 (0.2291) loss 3.3408 (3.1534) grad_norm 2.0333 (inf) loss_scale 512.0000 (731.2073) mem 7381MB [2024-08-27 11:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][670/1251] eta 0:02:14 lr 0.000458 wd 0.0500 time 0.2300 (0.2309) data time 0.0007 (0.0020) model time 0.2293 (0.2291) loss 2.1340 (3.1522) grad_norm 2.8376 (inf) loss_scale 512.0000 (727.9404) mem 7381MB [2024-08-27 11:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][680/1251] eta 0:02:11 lr 0.000458 wd 0.0500 time 0.2266 (0.2309) data time 0.0009 (0.0020) model time 0.2257 (0.2291) loss 2.6956 (3.1526) grad_norm 2.9943 (inf) loss_scale 512.0000 (724.7695) mem 7381MB [2024-08-27 11:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][690/1251] eta 0:02:09 lr 0.000458 wd 0.0500 time 0.2302 (0.2308) data time 0.0010 (0.0020) model time 0.2292 (0.2290) loss 3.7698 (3.1590) grad_norm 2.9606 (inf) loss_scale 512.0000 (721.6903) mem 7381MB [2024-08-27 11:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][700/1251] eta 0:02:07 lr 0.000458 wd 0.0500 time 0.2295 (0.2307) data time 0.0008 (0.0019) model time 0.2287 (0.2290) loss 3.7346 (3.1594) grad_norm 
2.2627 (inf) loss_scale 512.0000 (718.6990) mem 7381MB [2024-08-27 11:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][710/1251] eta 0:02:04 lr 0.000457 wd 0.0500 time 0.2210 (0.2307) data time 0.0008 (0.0019) model time 0.2203 (0.2290) loss 2.8109 (3.1604) grad_norm 4.1314 (inf) loss_scale 512.0000 (715.7918) mem 7381MB [2024-08-27 11:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][720/1251] eta 0:02:02 lr 0.000457 wd 0.0500 time 0.2292 (0.2307) data time 0.0010 (0.0019) model time 0.2282 (0.2289) loss 2.4965 (3.1611) grad_norm 2.8889 (inf) loss_scale 512.0000 (712.9653) mem 7381MB [2024-08-27 11:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][730/1251] eta 0:02:00 lr 0.000457 wd 0.0500 time 0.2298 (0.2307) data time 0.0011 (0.0019) model time 0.2287 (0.2289) loss 2.0862 (3.1570) grad_norm 3.8313 (inf) loss_scale 512.0000 (710.2161) mem 7381MB [2024-08-27 11:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][740/1251] eta 0:01:57 lr 0.000457 wd 0.0500 time 0.2493 (0.2307) data time 0.0007 (0.0019) model time 0.2486 (0.2289) loss 3.5346 (3.1565) grad_norm 2.8031 (inf) loss_scale 512.0000 (707.5412) mem 7381MB [2024-08-27 11:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][750/1251] eta 0:01:55 lr 0.000457 wd 0.0500 time 0.3952 (0.2313) data time 0.0007 (0.0019) model time 0.3945 (0.2296) loss 3.5325 (3.1552) grad_norm 3.4997 (inf) loss_scale 512.0000 (704.9374) mem 7381MB [2024-08-27 11:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][760/1251] eta 0:01:53 lr 0.000457 wd 0.0500 time 0.2251 (0.2314) data time 0.0009 (0.0019) model time 0.2242 (0.2298) loss 2.5597 (3.1520) grad_norm 2.3288 (inf) loss_scale 512.0000 (702.4021) mem 7381MB [2024-08-27 11:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][770/1251] eta 0:01:51 lr 0.000457 wd 0.0500 time 0.2384 (0.2314) data time 0.0007 
(0.0019) model time 0.2377 (0.2297) loss 4.0644 (3.1516) grad_norm 3.5319 (inf) loss_scale 512.0000 (699.9326) mem 7381MB [2024-08-27 11:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][780/1251] eta 0:01:48 lr 0.000457 wd 0.0500 time 0.2300 (0.2313) data time 0.0006 (0.0018) model time 0.2293 (0.2297) loss 2.3008 (3.1499) grad_norm 2.2076 (inf) loss_scale 512.0000 (697.5262) mem 7381MB [2024-08-27 11:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][790/1251] eta 0:01:46 lr 0.000457 wd 0.0500 time 0.2320 (0.2312) data time 0.0009 (0.0018) model time 0.2312 (0.2296) loss 3.0881 (3.1525) grad_norm 2.2341 (inf) loss_scale 512.0000 (695.1808) mem 7381MB [2024-08-27 11:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][800/1251] eta 0:01:44 lr 0.000457 wd 0.0500 time 0.2226 (0.2312) data time 0.0009 (0.0018) model time 0.2217 (0.2295) loss 2.7827 (3.1553) grad_norm 3.0146 (inf) loss_scale 512.0000 (692.8939) mem 7381MB [2024-08-27 11:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][810/1251] eta 0:01:41 lr 0.000457 wd 0.0500 time 0.2255 (0.2312) data time 0.0007 (0.0018) model time 0.2248 (0.2295) loss 3.6448 (3.1556) grad_norm 2.8149 (inf) loss_scale 512.0000 (690.6634) mem 7381MB [2024-08-27 11:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][820/1251] eta 0:01:39 lr 0.000457 wd 0.0500 time 0.2320 (0.2311) data time 0.0009 (0.0018) model time 0.2311 (0.2295) loss 3.5438 (3.1580) grad_norm 3.8482 (inf) loss_scale 512.0000 (688.4872) mem 7381MB [2024-08-27 11:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][830/1251] eta 0:01:37 lr 0.000457 wd 0.0500 time 0.2288 (0.2311) data time 0.0007 (0.0018) model time 0.2281 (0.2295) loss 2.3593 (3.1560) grad_norm 2.7046 (inf) loss_scale 512.0000 (686.3634) mem 7381MB [2024-08-27 11:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][840/1251] eta 
0:01:34 lr 0.000457 wd 0.0500 time 0.2255 (0.2311) data time 0.0009 (0.0018) model time 0.2246 (0.2295) loss 3.0906 (3.1530) grad_norm 5.6761 (inf) loss_scale 512.0000 (684.2901) mem 7381MB [2024-08-27 11:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][850/1251] eta 0:01:32 lr 0.000457 wd 0.0500 time 0.2314 (0.2311) data time 0.0008 (0.0018) model time 0.2305 (0.2295) loss 2.9695 (3.1532) grad_norm 2.7195 (inf) loss_scale 512.0000 (682.2656) mem 7381MB [2024-08-27 11:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][860/1251] eta 0:01:30 lr 0.000457 wd 0.0500 time 0.2302 (0.2311) data time 0.0009 (0.0018) model time 0.2293 (0.2294) loss 3.0595 (3.1555) grad_norm 3.1774 (inf) loss_scale 512.0000 (680.2880) mem 7381MB [2024-08-27 11:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][870/1251] eta 0:01:28 lr 0.000457 wd 0.0500 time 0.2200 (0.2310) data time 0.0007 (0.0018) model time 0.2193 (0.2294) loss 3.2900 (3.1580) grad_norm 3.4101 (inf) loss_scale 512.0000 (678.3559) mem 7381MB [2024-08-27 11:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][880/1251] eta 0:01:25 lr 0.000457 wd 0.0500 time 0.2304 (0.2310) data time 0.0010 (0.0018) model time 0.2294 (0.2294) loss 3.2496 (3.1598) grad_norm 1.9485 (inf) loss_scale 512.0000 (676.4677) mem 7381MB [2024-08-27 11:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][890/1251] eta 0:01:23 lr 0.000457 wd 0.0500 time 0.2274 (0.2310) data time 0.0010 (0.0018) model time 0.2264 (0.2294) loss 2.8069 (3.1620) grad_norm 2.1920 (inf) loss_scale 512.0000 (674.6218) mem 7381MB [2024-08-27 11:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][900/1251] eta 0:01:21 lr 0.000457 wd 0.0500 time 0.2195 (0.2309) data time 0.0011 (0.0018) model time 0.2184 (0.2294) loss 1.9792 (3.1609) grad_norm 2.4743 (inf) loss_scale 512.0000 (672.8169) mem 7381MB [2024-08-27 11:31:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][910/1251] eta 0:01:18 lr 0.000457 wd 0.0500 time 0.2263 (0.2310) data time 0.0009 (0.0018) model time 0.2254 (0.2294) loss 3.3282 (3.1618) grad_norm 2.1777 (inf) loss_scale 512.0000 (671.0516) mem 7381MB [2024-08-27 11:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][920/1251] eta 0:01:16 lr 0.000457 wd 0.0500 time 0.2284 (0.2309) data time 0.0007 (0.0017) model time 0.2277 (0.2293) loss 3.5346 (3.1653) grad_norm 3.2035 (inf) loss_scale 512.0000 (669.3246) mem 7381MB [2024-08-27 11:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][930/1251] eta 0:01:14 lr 0.000457 wd 0.0500 time 0.2315 (0.2309) data time 0.0015 (0.0017) model time 0.2300 (0.2293) loss 3.4864 (3.1686) grad_norm 1.7785 (inf) loss_scale 512.0000 (667.6348) mem 7381MB [2024-08-27 11:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][940/1251] eta 0:01:11 lr 0.000456 wd 0.0500 time 0.2289 (0.2309) data time 0.0009 (0.0017) model time 0.2280 (0.2293) loss 3.3366 (3.1694) grad_norm 2.2929 (inf) loss_scale 512.0000 (665.9809) mem 7381MB [2024-08-27 11:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][950/1251] eta 0:01:09 lr 0.000456 wd 0.0500 time 0.2264 (0.2309) data time 0.0009 (0.0017) model time 0.2255 (0.2293) loss 2.7083 (3.1699) grad_norm 3.6065 (inf) loss_scale 512.0000 (664.3617) mem 7381MB [2024-08-27 11:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][960/1251] eta 0:01:07 lr 0.000456 wd 0.0500 time 0.2292 (0.2308) data time 0.0007 (0.0017) model time 0.2285 (0.2293) loss 3.7742 (3.1708) grad_norm 2.3274 (inf) loss_scale 512.0000 (662.7763) mem 7381MB [2024-08-27 11:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][970/1251] eta 0:01:04 lr 0.000456 wd 0.0500 time 0.2258 (0.2308) data time 0.0008 (0.0017) model time 0.2250 (0.2292) loss 3.4200 (3.1716) grad_norm 2.4071 
(inf) loss_scale 512.0000 (661.2235) mem 7381MB [2024-08-27 11:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][980/1251] eta 0:01:02 lr 0.000456 wd 0.0500 time 0.2267 (0.2308) data time 0.0010 (0.0017) model time 0.2257 (0.2292) loss 3.5970 (3.1691) grad_norm 2.0363 (inf) loss_scale 512.0000 (659.7023) mem 7381MB [2024-08-27 11:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][990/1251] eta 0:01:00 lr 0.000456 wd 0.0500 time 0.2289 (0.2309) data time 0.0012 (0.0017) model time 0.2277 (0.2294) loss 2.7339 (3.1701) grad_norm 2.2349 (inf) loss_scale 512.0000 (658.2119) mem 7381MB [2024-08-27 11:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1000/1251] eta 0:00:57 lr 0.000456 wd 0.0500 time 0.2274 (0.2309) data time 0.0007 (0.0017) model time 0.2268 (0.2294) loss 3.4291 (3.1726) grad_norm 3.0458 (inf) loss_scale 512.0000 (656.7512) mem 7381MB [2024-08-27 11:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1010/1251] eta 0:00:55 lr 0.000456 wd 0.0500 time 0.2370 (0.2309) data time 0.0012 (0.0017) model time 0.2358 (0.2293) loss 2.5063 (3.1707) grad_norm 2.2982 (inf) loss_scale 512.0000 (655.3195) mem 7381MB [2024-08-27 11:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1020/1251] eta 0:00:53 lr 0.000456 wd 0.0500 time 0.2314 (0.2309) data time 0.0008 (0.0017) model time 0.2307 (0.2293) loss 2.8842 (3.1720) grad_norm 2.2366 (inf) loss_scale 512.0000 (653.9158) mem 7381MB [2024-08-27 11:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1030/1251] eta 0:00:51 lr 0.000456 wd 0.0500 time 0.2363 (0.2308) data time 0.0008 (0.0017) model time 0.2355 (0.2293) loss 2.6177 (3.1704) grad_norm 3.2010 (inf) loss_scale 512.0000 (652.5393) mem 7381MB [2024-08-27 11:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1040/1251] eta 0:00:48 lr 0.000456 wd 0.0500 time 0.2329 (0.2308) data time 0.0011 
(0.0017) model time 0.2318 (0.2293) loss 3.8110 (3.1725) grad_norm 3.4133 (inf) loss_scale 512.0000 (651.1892) mem 7381MB [2024-08-27 11:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1050/1251] eta 0:00:46 lr 0.000456 wd 0.0500 time 0.2242 (0.2308) data time 0.0007 (0.0017) model time 0.2235 (0.2293) loss 2.4189 (3.1725) grad_norm 3.9695 (inf) loss_scale 512.0000 (649.8649) mem 7381MB [2024-08-27 11:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1060/1251] eta 0:00:44 lr 0.000456 wd 0.0500 time 0.2216 (0.2308) data time 0.0009 (0.0017) model time 0.2207 (0.2292) loss 3.3450 (3.1722) grad_norm 2.4303 (inf) loss_scale 512.0000 (648.5655) mem 7381MB [2024-08-27 11:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1070/1251] eta 0:00:41 lr 0.000456 wd 0.0500 time 0.2198 (0.2307) data time 0.0010 (0.0017) model time 0.2188 (0.2292) loss 3.1761 (3.1707) grad_norm 1.9000 (inf) loss_scale 512.0000 (647.2904) mem 7381MB [2024-08-27 11:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1080/1251] eta 0:00:39 lr 0.000456 wd 0.0500 time 0.2253 (0.2307) data time 0.0007 (0.0016) model time 0.2245 (0.2292) loss 2.8733 (3.1687) grad_norm 2.1872 (inf) loss_scale 512.0000 (646.0389) mem 7381MB [2024-08-27 11:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1090/1251] eta 0:00:37 lr 0.000456 wd 0.0500 time 0.2325 (0.2307) data time 0.0007 (0.0016) model time 0.2318 (0.2292) loss 3.4587 (3.1677) grad_norm 3.0656 (inf) loss_scale 512.0000 (644.8103) mem 7381MB [2024-08-27 11:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1100/1251] eta 0:00:34 lr 0.000456 wd 0.0500 time 0.2274 (0.2307) data time 0.0010 (0.0016) model time 0.2264 (0.2292) loss 2.9593 (3.1656) grad_norm 2.7999 (inf) loss_scale 512.0000 (643.6040) mem 7381MB [2024-08-27 11:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1110/1251] 
eta 0:00:32 lr 0.000456 wd 0.0500 time 0.2303 (0.2307) data time 0.0007 (0.0016) model time 0.2296 (0.2292) loss 2.4654 (3.1651) grad_norm 2.4444 (inf) loss_scale 512.0000 (642.4194) mem 7381MB [2024-08-27 11:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1120/1251] eta 0:00:30 lr 0.000456 wd 0.0500 time 0.2277 (0.2306) data time 0.0008 (0.0016) model time 0.2269 (0.2291) loss 2.8700 (3.1636) grad_norm 2.2530 (inf) loss_scale 512.0000 (641.2560) mem 7381MB [2024-08-27 11:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1130/1251] eta 0:00:27 lr 0.000456 wd 0.0500 time 0.2311 (0.2307) data time 0.0011 (0.0016) model time 0.2299 (0.2292) loss 3.4736 (3.1631) grad_norm 2.2356 (inf) loss_scale 512.0000 (640.1132) mem 7381MB [2024-08-27 11:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1140/1251] eta 0:00:25 lr 0.000456 wd 0.0500 time 0.2339 (0.2306) data time 0.0006 (0.0016) model time 0.2333 (0.2291) loss 3.5917 (3.1624) grad_norm 2.8919 (inf) loss_scale 512.0000 (638.9904) mem 7381MB [2024-08-27 11:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1150/1251] eta 0:00:23 lr 0.000456 wd 0.0500 time 0.2322 (0.2306) data time 0.0010 (0.0016) model time 0.2312 (0.2291) loss 2.8760 (3.1618) grad_norm 2.4071 (inf) loss_scale 512.0000 (637.8871) mem 7381MB [2024-08-27 11:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1160/1251] eta 0:00:20 lr 0.000456 wd 0.0500 time 0.2263 (0.2306) data time 0.0010 (0.0016) model time 0.2253 (0.2291) loss 3.2185 (3.1619) grad_norm 2.6471 (inf) loss_scale 512.0000 (636.8028) mem 7381MB [2024-08-27 11:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1170/1251] eta 0:00:18 lr 0.000455 wd 0.0500 time 0.2271 (0.2306) data time 0.0009 (0.0016) model time 0.2262 (0.2291) loss 3.1096 (3.1630) grad_norm 2.3082 (inf) loss_scale 512.0000 (635.7370) mem 7381MB [2024-08-27 11:32:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1180/1251] eta 0:00:16 lr 0.000455 wd 0.0500 time 0.2283 (0.2306) data time 0.0009 (0.0016) model time 0.2274 (0.2292) loss 3.1143 (3.1627) grad_norm 2.5766 (inf) loss_scale 512.0000 (634.6892) mem 7381MB [2024-08-27 11:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1190/1251] eta 0:00:14 lr 0.000455 wd 0.0500 time 0.2291 (0.2306) data time 0.0011 (0.0016) model time 0.2280 (0.2292) loss 3.3019 (3.1644) grad_norm 5.6754 (inf) loss_scale 512.0000 (633.6591) mem 7381MB [2024-08-27 11:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1200/1251] eta 0:00:11 lr 0.000455 wd 0.0500 time 0.2265 (0.2306) data time 0.0009 (0.0016) model time 0.2256 (0.2292) loss 3.1008 (3.1636) grad_norm 3.0644 (inf) loss_scale 512.0000 (632.6461) mem 7381MB [2024-08-27 11:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1210/1251] eta 0:00:09 lr 0.000455 wd 0.0500 time 0.2283 (0.2306) data time 0.0012 (0.0016) model time 0.2271 (0.2292) loss 3.0100 (3.1638) grad_norm 2.1714 (inf) loss_scale 512.0000 (631.6499) mem 7381MB [2024-08-27 11:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1220/1251] eta 0:00:07 lr 0.000455 wd 0.0500 time 0.2311 (0.2306) data time 0.0010 (0.0016) model time 0.2302 (0.2291) loss 3.3202 (3.1651) grad_norm 2.1596 (inf) loss_scale 512.0000 (630.6699) mem 7381MB [2024-08-27 11:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1230/1251] eta 0:00:04 lr 0.000455 wd 0.0500 time 0.2214 (0.2306) data time 0.0007 (0.0016) model time 0.2207 (0.2291) loss 2.1948 (3.1622) grad_norm 2.3084 (inf) loss_scale 512.0000 (629.7059) mem 7381MB [2024-08-27 11:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1240/1251] eta 0:00:02 lr 0.000455 wd 0.0500 time 0.2107 (0.2305) data time 0.0006 (0.0016) model time 0.2101 (0.2290) loss 3.1700 (3.1630) grad_norm 
2.7439 (inf) loss_scale 512.0000 (628.7575) mem 7381MB [2024-08-27 11:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [168/300][1250/1251] eta 0:00:00 lr 0.000455 wd 0.0500 time 0.2112 (0.2303) data time 0.0006 (0.0016) model time 0.2105 (0.2289) loss 2.3237 (3.1652) grad_norm 3.2480 (inf) loss_scale 512.0000 (627.8241) mem 7381MB [2024-08-27 11:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 168 training takes 0:04:48 [2024-08-27 11:32:56 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:32:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.597 (0.597) Loss 0.4797 (0.4797) Acc@1 91.309 (91.309) Acc@5 97.852 (97.852) Mem 7381MB [2024-08-27 11:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.126) Loss 0.7051 (0.7117) Acc@1 84.961 (84.925) Acc@5 97.266 (97.035) Mem 7381MB [2024-08-27 11:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.104) Loss 0.9976 (0.7375) Acc@1 76.660 (83.938) Acc@5 94.824 (96.894) Mem 7381MB [2024-08-27 11:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.096) Loss 1.2734 (0.8356) Acc@1 69.531 (81.540) Acc@5 90.918 (95.741) Mem 7381MB [2024-08-27 11:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.089) Loss 1.1309 (0.8867) Acc@1 72.656 (80.192) Acc@5 92.578 (95.179) Mem 7381MB [2024-08-27 11:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.816 Acc@5 95.150 [2024-08-27 11:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.8% [2024-08-27 11:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.82% 
[2024-08-27 11:33:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 11:33:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 11:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.505 (0.505) Loss 0.4033 (0.4033) Acc@1 93.262 (93.262) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-27 11:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.120) Loss 0.6250 (0.6262) Acc@1 87.793 (86.594) Acc@5 97.754 (97.417) Mem 7381MB [2024-08-27 11:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.100) Loss 0.8896 (0.6510) Acc@1 78.809 (85.724) Acc@5 95.801 (97.438) Mem 7381MB [2024-08-27 11:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.093) Loss 1.1211 (0.7388) Acc@1 72.754 (83.569) Acc@5 92.969 (96.459) Mem 7381MB [2024-08-27 11:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.087) Loss 1.0146 (0.7842) Acc@1 74.609 (82.227) Acc@5 94.043 (95.994) Mem 7381MB [2024-08-27 11:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.844 Acc@5 95.966 [2024-08-27 11:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.8% [2024-08-27 11:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.84% [2024-08-27 11:33:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 11:33:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 11:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][0/1251] eta 0:16:28 lr 0.000455 wd 0.0500 time 0.7905 (0.7905) data time 0.5600 (0.5600) model time 0.0000 (0.0000) loss 2.2468 (2.2468) grad_norm 2.6837 (2.6837) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][10/1251] eta 0:05:46 lr 0.000455 wd 0.0500 time 0.2277 (0.2792) data time 0.0010 (0.0520) model time 0.0000 (0.0000) loss 3.3302 (3.0958) grad_norm 3.5681 (2.5369) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][20/1251] eta 0:05:23 lr 0.000455 wd 0.0500 time 0.2300 (0.2627) data time 0.0010 (0.0278) model time 0.0000 (0.0000) loss 3.1439 (3.1064) grad_norm 2.4644 (2.6917) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][30/1251] eta 0:05:25 lr 0.000455 wd 0.0500 time 0.3629 (0.2663) data time 0.0008 (0.0191) model time 0.0000 (0.0000) loss 2.5636 (3.0734) grad_norm 2.0984 (2.6506) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][40/1251] eta 0:05:11 lr 0.000455 wd 0.0500 time 0.2364 (0.2572) data time 0.0007 (0.0147) model time 0.0000 (0.0000) loss 3.4536 (3.0688) grad_norm 2.2502 (2.6351) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][50/1251] eta 0:05:02 lr 0.000455 wd 0.0500 time 0.2286 (0.2517) data time 0.0007 (0.0121) model time 0.0000 (0.0000) loss 2.3837 (3.0069) grad_norm 2.0586 (2.6366) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][60/1251] eta 0:04:55 lr 0.000455 wd 0.0500 time 0.2301 (0.2480) data time 0.0007 (0.0102) model time 0.2294 (0.2278) loss 
2.0704 (2.9869) grad_norm 1.8047 (2.5813) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][70/1251] eta 0:04:49 lr 0.000455 wd 0.0500 time 0.2289 (0.2451) data time 0.0010 (0.0089) model time 0.2279 (0.2273) loss 3.4486 (3.0306) grad_norm 2.1255 (2.5111) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][80/1251] eta 0:04:44 lr 0.000455 wd 0.0500 time 0.2269 (0.2432) data time 0.0009 (0.0080) model time 0.2260 (0.2278) loss 3.4730 (3.0475) grad_norm 2.6631 (2.5723) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][90/1251] eta 0:04:40 lr 0.000455 wd 0.0500 time 0.2335 (0.2415) data time 0.0012 (0.0072) model time 0.2323 (0.2274) loss 3.1461 (3.0806) grad_norm 2.9746 (2.6514) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][100/1251] eta 0:04:36 lr 0.000455 wd 0.0500 time 0.2211 (0.2401) data time 0.0009 (0.0066) model time 0.2202 (0.2272) loss 3.5090 (3.1064) grad_norm 3.0290 (2.6481) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][110/1251] eta 0:04:32 lr 0.000455 wd 0.0500 time 0.2267 (0.2392) data time 0.0007 (0.0061) model time 0.2260 (0.2275) loss 2.5819 (3.1089) grad_norm 3.7353 (2.6628) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][120/1251] eta 0:04:29 lr 0.000455 wd 0.0500 time 0.2232 (0.2386) data time 0.0011 (0.0058) model time 0.2221 (0.2278) loss 3.3750 (3.0657) grad_norm 2.9588 (2.6651) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][130/1251] eta 0:04:26 lr 0.000455 wd 
0.0500 time 0.2194 (0.2379) data time 0.0010 (0.0054) model time 0.2184 (0.2279) loss 3.5965 (3.0624) grad_norm 2.0686 (2.6422) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][140/1251] eta 0:04:23 lr 0.000454 wd 0.0500 time 0.2262 (0.2373) data time 0.0007 (0.0051) model time 0.2255 (0.2280) loss 3.4512 (3.0596) grad_norm 2.0919 (2.6317) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][150/1251] eta 0:04:20 lr 0.000454 wd 0.0500 time 0.2284 (0.2368) data time 0.0009 (0.0048) model time 0.2275 (0.2281) loss 3.4284 (3.0697) grad_norm 2.2983 (2.6134) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][160/1251] eta 0:04:17 lr 0.000454 wd 0.0500 time 0.2250 (0.2363) data time 0.0009 (0.0046) model time 0.2241 (0.2280) loss 3.0642 (3.0788) grad_norm 2.4711 (2.6160) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][170/1251] eta 0:04:14 lr 0.000454 wd 0.0500 time 0.2273 (0.2358) data time 0.0011 (0.0044) model time 0.2262 (0.2279) loss 2.4666 (3.0776) grad_norm 4.3191 (2.6049) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][180/1251] eta 0:04:13 lr 0.000454 wd 0.0500 time 0.2458 (0.2365) data time 0.0009 (0.0042) model time 0.2450 (0.2294) loss 3.1143 (3.0837) grad_norm 2.1139 (2.6084) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][190/1251] eta 0:04:10 lr 0.000454 wd 0.0500 time 0.2315 (0.2361) data time 0.0010 (0.0041) model time 0.2305 (0.2292) loss 3.8869 (3.0955) grad_norm 2.4198 (2.6070) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [169/300][200/1251] eta 0:04:07 lr 0.000454 wd 0.0500 time 0.2290 (0.2357) data time 0.0011 (0.0039) model time 0.2279 (0.2291) loss 3.2186 (3.0964) grad_norm 2.3998 (2.6480) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][210/1251] eta 0:04:05 lr 0.000454 wd 0.0500 time 0.2423 (0.2355) data time 0.0009 (0.0038) model time 0.2413 (0.2291) loss 3.3701 (3.1006) grad_norm 2.3095 (2.6281) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][220/1251] eta 0:04:02 lr 0.000454 wd 0.0500 time 0.2194 (0.2351) data time 0.0008 (0.0037) model time 0.2186 (0.2290) loss 2.4428 (3.0999) grad_norm 2.1154 (2.6281) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][230/1251] eta 0:03:59 lr 0.000454 wd 0.0500 time 0.2299 (0.2349) data time 0.0007 (0.0036) model time 0.2292 (0.2290) loss 3.6106 (3.0926) grad_norm 2.9042 (2.6535) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][240/1251] eta 0:03:57 lr 0.000454 wd 0.0500 time 0.2254 (0.2347) data time 0.0010 (0.0034) model time 0.2244 (0.2289) loss 3.3662 (3.0928) grad_norm 3.2959 (2.6576) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][250/1251] eta 0:03:54 lr 0.000454 wd 0.0500 time 0.2376 (0.2346) data time 0.0008 (0.0034) model time 0.2368 (0.2290) loss 3.4013 (3.0916) grad_norm 2.1372 (2.6694) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][260/1251] eta 0:03:52 lr 0.000454 wd 0.0500 time 0.2305 (0.2343) data time 0.0008 (0.0033) model time 0.2297 (0.2289) loss 3.6732 (3.0947) grad_norm 2.3823 
(2.6616) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][270/1251] eta 0:03:49 lr 0.000454 wd 0.0500 time 0.2316 (0.2341) data time 0.0009 (0.0032) model time 0.2308 (0.2289) loss 3.3467 (3.1007) grad_norm 2.2900 (2.6654) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][280/1251] eta 0:03:47 lr 0.000454 wd 0.0500 time 0.2266 (0.2339) data time 0.0009 (0.0031) model time 0.2256 (0.2288) loss 2.7628 (3.0916) grad_norm 3.5665 (2.6698) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][290/1251] eta 0:03:44 lr 0.000454 wd 0.0500 time 0.2282 (0.2337) data time 0.0009 (0.0030) model time 0.2274 (0.2287) loss 2.1536 (3.0863) grad_norm 2.8276 (2.6614) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][300/1251] eta 0:03:41 lr 0.000454 wd 0.0500 time 0.2276 (0.2334) data time 0.0008 (0.0030) model time 0.2268 (0.2286) loss 3.7804 (3.0938) grad_norm 2.4981 (2.6612) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][310/1251] eta 0:03:39 lr 0.000454 wd 0.0500 time 0.2292 (0.2333) data time 0.0007 (0.0029) model time 0.2285 (0.2286) loss 3.3732 (3.0966) grad_norm 2.1930 (2.6610) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][320/1251] eta 0:03:37 lr 0.000454 wd 0.0500 time 0.2283 (0.2331) data time 0.0008 (0.0029) model time 0.2274 (0.2285) loss 2.4038 (3.0909) grad_norm 2.0344 (2.6642) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][330/1251] eta 0:03:34 lr 0.000454 wd 0.0500 time 0.2258 (0.2330) data 
time 0.0007 (0.0028) model time 0.2251 (0.2284) loss 3.0707 (3.0871) grad_norm 2.1489 (2.6641) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][340/1251] eta 0:03:32 lr 0.000454 wd 0.0500 time 0.2261 (0.2329) data time 0.0010 (0.0028) model time 0.2252 (0.2284) loss 2.9154 (3.0857) grad_norm 2.3791 (2.6562) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][350/1251] eta 0:03:29 lr 0.000454 wd 0.0500 time 0.2326 (0.2328) data time 0.0008 (0.0027) model time 0.2318 (0.2285) loss 3.2973 (3.0902) grad_norm 2.3529 (2.6557) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][360/1251] eta 0:03:27 lr 0.000454 wd 0.0500 time 0.2355 (0.2327) data time 0.0010 (0.0027) model time 0.2344 (0.2285) loss 2.0939 (3.0863) grad_norm 2.4144 (2.6539) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][370/1251] eta 0:03:24 lr 0.000453 wd 0.0500 time 0.2246 (0.2326) data time 0.0010 (0.0026) model time 0.2236 (0.2284) loss 2.6466 (3.0870) grad_norm 2.1111 (2.6589) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][380/1251] eta 0:03:22 lr 0.000453 wd 0.0500 time 0.2290 (0.2326) data time 0.0008 (0.0026) model time 0.2283 (0.2285) loss 3.0842 (3.0812) grad_norm 2.5402 (2.6544) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][390/1251] eta 0:03:20 lr 0.000453 wd 0.0500 time 0.2413 (0.2325) data time 0.0007 (0.0026) model time 0.2406 (0.2285) loss 3.4740 (3.0847) grad_norm 2.6393 (2.6926) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [169/300][400/1251] eta 0:03:17 lr 0.000453 wd 0.0500 time 0.2226 (0.2324) data time 0.0010 (0.0025) model time 0.2217 (0.2285) loss 3.4854 (3.0866) grad_norm 1.7835 (2.6898) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][410/1251] eta 0:03:15 lr 0.000453 wd 0.0500 time 0.2334 (0.2323) data time 0.0009 (0.0025) model time 0.2325 (0.2284) loss 1.8455 (3.0862) grad_norm 1.7876 (2.6840) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][420/1251] eta 0:03:12 lr 0.000453 wd 0.0500 time 0.2262 (0.2322) data time 0.0013 (0.0025) model time 0.2249 (0.2284) loss 3.1828 (3.0879) grad_norm 2.2976 (2.6820) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][430/1251] eta 0:03:10 lr 0.000453 wd 0.0500 time 0.2328 (0.2321) data time 0.0009 (0.0025) model time 0.2319 (0.2284) loss 3.7755 (3.0874) grad_norm 2.1104 (2.6808) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][440/1251] eta 0:03:08 lr 0.000453 wd 0.0500 time 0.2284 (0.2321) data time 0.0006 (0.0024) model time 0.2277 (0.2284) loss 2.8412 (3.0885) grad_norm 2.6438 (2.6772) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][450/1251] eta 0:03:05 lr 0.000453 wd 0.0500 time 0.2241 (0.2320) data time 0.0010 (0.0024) model time 0.2231 (0.2283) loss 3.2063 (3.0815) grad_norm 2.7770 (2.6798) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][460/1251] eta 0:03:03 lr 0.000453 wd 0.0500 time 0.2270 (0.2319) data time 0.0010 (0.0024) model time 0.2260 (0.2282) loss 3.4426 (3.0838) grad_norm 2.0224 (2.6825) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-27 11:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][470/1251] eta 0:03:01 lr 0.000453 wd 0.0500 time 0.2306 (0.2318) data time 0.0010 (0.0023) model time 0.2296 (0.2282) loss 3.4466 (3.0882) grad_norm 3.4142 (2.6864) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][480/1251] eta 0:02:58 lr 0.000453 wd 0.0500 time 0.2369 (0.2318) data time 0.0007 (0.0023) model time 0.2362 (0.2283) loss 2.4720 (3.0853) grad_norm 2.4695 (2.6826) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][490/1251] eta 0:02:56 lr 0.000453 wd 0.0500 time 0.2299 (0.2318) data time 0.0009 (0.0023) model time 0.2290 (0.2282) loss 2.3766 (3.0867) grad_norm 2.3367 (2.6944) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][500/1251] eta 0:02:54 lr 0.000453 wd 0.0500 time 0.2297 (0.2317) data time 0.0010 (0.0023) model time 0.2287 (0.2282) loss 3.7986 (3.0886) grad_norm 2.2898 (2.6903) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][510/1251] eta 0:02:51 lr 0.000453 wd 0.0500 time 0.2322 (0.2316) data time 0.0009 (0.0023) model time 0.2314 (0.2282) loss 3.2695 (3.0918) grad_norm 2.6855 (2.6861) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][520/1251] eta 0:02:49 lr 0.000453 wd 0.0500 time 0.2364 (0.2316) data time 0.0007 (0.0023) model time 0.2357 (0.2282) loss 3.8546 (3.0965) grad_norm 4.9590 (2.6927) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][530/1251] eta 0:02:46 lr 0.000453 wd 0.0500 time 0.2252 (0.2315) data time 0.0007 (0.0023) model 
time 0.2244 (0.2281) loss 3.4880 (3.1023) grad_norm 2.7522 (2.6932) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][540/1251] eta 0:02:44 lr 0.000453 wd 0.0500 time 0.2226 (0.2315) data time 0.0013 (0.0023) model time 0.2213 (0.2281) loss 3.4330 (3.1049) grad_norm 4.3828 (2.7010) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][550/1251] eta 0:02:43 lr 0.000453 wd 0.0500 time 0.2258 (0.2326) data time 0.0007 (0.0022) model time 0.2251 (0.2294) loss 2.2034 (3.0998) grad_norm 4.0114 (2.7034) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][560/1251] eta 0:02:40 lr 0.000453 wd 0.0500 time 0.2322 (0.2329) data time 0.0010 (0.0022) model time 0.2312 (0.2298) loss 3.8608 (3.0999) grad_norm 3.9418 (2.7021) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][570/1251] eta 0:02:38 lr 0.000453 wd 0.0500 time 0.2378 (0.2328) data time 0.0009 (0.0022) model time 0.2369 (0.2297) loss 3.0394 (3.1027) grad_norm 4.1324 (2.7088) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][580/1251] eta 0:02:36 lr 0.000453 wd 0.0500 time 0.2375 (0.2328) data time 0.0007 (0.0022) model time 0.2368 (0.2297) loss 2.6442 (3.1010) grad_norm 3.2390 (2.7212) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][590/1251] eta 0:02:33 lr 0.000452 wd 0.0500 time 0.2270 (0.2327) data time 0.0009 (0.0022) model time 0.2261 (0.2297) loss 3.6345 (3.1050) grad_norm 3.0982 (2.7341) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][600/1251] 
eta 0:02:31 lr 0.000452 wd 0.0500 time 0.2307 (0.2327) data time 0.0008 (0.0022) model time 0.2299 (0.2296) loss 3.1281 (3.1047) grad_norm 2.6662 (2.7379) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][610/1251] eta 0:02:29 lr 0.000452 wd 0.0500 time 0.2286 (0.2326) data time 0.0007 (0.0022) model time 0.2279 (0.2296) loss 2.9863 (3.1082) grad_norm 2.6763 (2.7430) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][620/1251] eta 0:02:26 lr 0.000452 wd 0.0500 time 0.2278 (0.2325) data time 0.0007 (0.0021) model time 0.2271 (0.2295) loss 3.8846 (3.1092) grad_norm 2.4277 (2.7384) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][630/1251] eta 0:02:24 lr 0.000452 wd 0.0500 time 0.2328 (0.2324) data time 0.0007 (0.0021) model time 0.2321 (0.2295) loss 3.8798 (3.1109) grad_norm 3.3186 (2.7389) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][640/1251] eta 0:02:21 lr 0.000452 wd 0.0500 time 0.2272 (0.2324) data time 0.0007 (0.0021) model time 0.2265 (0.2295) loss 2.3047 (3.1144) grad_norm 3.7822 (2.7426) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][650/1251] eta 0:02:19 lr 0.000452 wd 0.0500 time 0.2253 (0.2323) data time 0.0012 (0.0021) model time 0.2241 (0.2294) loss 3.5233 (3.1159) grad_norm 3.0005 (2.7434) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][660/1251] eta 0:02:17 lr 0.000452 wd 0.0500 time 0.2198 (0.2322) data time 0.0012 (0.0021) model time 0.2186 (0.2293) loss 3.3915 (3.1158) grad_norm 3.1041 (2.7454) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 
11:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][670/1251] eta 0:02:14 lr 0.000452 wd 0.0500 time 0.2343 (0.2322) data time 0.0011 (0.0021) model time 0.2332 (0.2293) loss 2.7834 (3.1155) grad_norm 2.3286 (2.7494) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][680/1251] eta 0:02:12 lr 0.000452 wd 0.0500 time 0.2227 (0.2322) data time 0.0011 (0.0020) model time 0.2217 (0.2293) loss 3.5315 (3.1201) grad_norm 2.2242 (2.7439) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][690/1251] eta 0:02:10 lr 0.000452 wd 0.0500 time 0.2240 (0.2321) data time 0.0012 (0.0020) model time 0.2228 (0.2293) loss 2.4275 (3.1198) grad_norm 1.8771 (2.7407) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][700/1251] eta 0:02:08 lr 0.000452 wd 0.0500 time 0.2289 (0.2323) data time 0.0006 (0.0020) model time 0.2283 (0.2296) loss 2.6679 (3.1183) grad_norm 2.4050 (2.7364) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][710/1251] eta 0:02:05 lr 0.000452 wd 0.0500 time 0.2360 (0.2323) data time 0.0011 (0.0020) model time 0.2349 (0.2296) loss 3.3505 (3.1185) grad_norm 2.2377 (2.7319) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][720/1251] eta 0:02:03 lr 0.000452 wd 0.0500 time 0.2342 (0.2323) data time 0.0007 (0.0020) model time 0.2335 (0.2296) loss 2.9998 (3.1227) grad_norm 2.6172 (2.7313) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][730/1251] eta 0:02:01 lr 0.000452 wd 0.0500 time 0.2305 (0.2323) data time 0.0008 (0.0020) model time 0.2297 (0.2296) loss 3.1659 
(3.1238) grad_norm 1.9522 (2.7308) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][740/1251] eta 0:01:58 lr 0.000452 wd 0.0500 time 0.2205 (0.2322) data time 0.0011 (0.0020) model time 0.2194 (0.2295) loss 3.6764 (3.1255) grad_norm 4.2863 (2.7299) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][750/1251] eta 0:01:56 lr 0.000452 wd 0.0500 time 0.2313 (0.2322) data time 0.0010 (0.0020) model time 0.2303 (0.2295) loss 3.5580 (3.1254) grad_norm 2.4779 (2.7331) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][760/1251] eta 0:01:53 lr 0.000452 wd 0.0500 time 0.2195 (0.2321) data time 0.0008 (0.0019) model time 0.2188 (0.2295) loss 2.5359 (3.1265) grad_norm 2.6197 (2.7310) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][770/1251] eta 0:01:51 lr 0.000452 wd 0.0500 time 0.2338 (0.2321) data time 0.0010 (0.0019) model time 0.2327 (0.2295) loss 2.4112 (3.1229) grad_norm 1.9073 (2.7346) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][780/1251] eta 0:01:49 lr 0.000452 wd 0.0500 time 0.2284 (0.2320) data time 0.0009 (0.0019) model time 0.2275 (0.2294) loss 3.3195 (3.1226) grad_norm 3.9374 (2.7342) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][790/1251] eta 0:01:46 lr 0.000452 wd 0.0500 time 0.2280 (0.2320) data time 0.0009 (0.0019) model time 0.2270 (0.2294) loss 3.6577 (3.1232) grad_norm 2.1160 (2.7291) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][800/1251] eta 0:01:44 lr 0.000452 wd 0.0500 
time 0.2316 (0.2319) data time 0.0009 (0.0019) model time 0.2307 (0.2294) loss 2.1689 (3.1242) grad_norm 3.2910 (2.7463) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][810/1251] eta 0:01:42 lr 0.000452 wd 0.0500 time 0.2293 (0.2319) data time 0.0008 (0.0019) model time 0.2286 (0.2294) loss 3.7733 (3.1256) grad_norm 2.9523 (2.7497) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][820/1251] eta 0:01:39 lr 0.000451 wd 0.0500 time 0.2320 (0.2318) data time 0.0009 (0.0019) model time 0.2311 (0.2293) loss 2.9172 (3.1227) grad_norm 2.2556 (2.7517) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][830/1251] eta 0:01:37 lr 0.000451 wd 0.0500 time 0.2248 (0.2318) data time 0.0011 (0.0019) model time 0.2236 (0.2293) loss 2.4591 (3.1213) grad_norm 3.1792 (2.7572) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][840/1251] eta 0:01:35 lr 0.000451 wd 0.0500 time 0.2298 (0.2317) data time 0.0008 (0.0019) model time 0.2290 (0.2292) loss 3.0373 (3.1255) grad_norm 1.7825 (2.7578) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][850/1251] eta 0:01:32 lr 0.000451 wd 0.0500 time 0.2280 (0.2317) data time 0.0011 (0.0019) model time 0.2269 (0.2292) loss 3.6124 (3.1266) grad_norm 2.7420 (2.7515) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][860/1251] eta 0:01:30 lr 0.000451 wd 0.0500 time 0.2265 (0.2316) data time 0.0010 (0.0018) model time 0.2255 (0.2292) loss 2.7967 (3.1256) grad_norm 2.0864 (2.7462) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:28 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [169/300][870/1251] eta 0:01:28 lr 0.000451 wd 0.0500 time 0.2315 (0.2316) data time 0.0007 (0.0018) model time 0.2308 (0.2291) loss 1.9580 (3.1257) grad_norm 1.9833 (2.7453) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][880/1251] eta 0:01:25 lr 0.000451 wd 0.0500 time 0.2282 (0.2315) data time 0.0008 (0.0018) model time 0.2273 (0.2291) loss 2.7924 (3.1235) grad_norm 2.4670 (2.7395) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][890/1251] eta 0:01:23 lr 0.000451 wd 0.0500 time 0.2312 (0.2315) data time 0.0011 (0.0018) model time 0.2301 (0.2291) loss 3.3511 (3.1247) grad_norm 2.5902 (2.7518) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][900/1251] eta 0:01:21 lr 0.000451 wd 0.0500 time 0.2326 (0.2315) data time 0.0008 (0.0018) model time 0.2318 (0.2291) loss 3.8219 (3.1248) grad_norm 2.5654 (2.7467) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][910/1251] eta 0:01:18 lr 0.000451 wd 0.0500 time 0.2250 (0.2315) data time 0.0010 (0.0018) model time 0.2240 (0.2291) loss 3.1544 (3.1229) grad_norm 2.3167 (2.7446) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][920/1251] eta 0:01:16 lr 0.000451 wd 0.0500 time 0.2386 (0.2314) data time 0.0007 (0.0018) model time 0.2380 (0.2291) loss 2.4473 (3.1248) grad_norm 2.1839 (2.7405) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][930/1251] eta 0:01:14 lr 0.000451 wd 0.0500 time 0.2334 (0.2314) data time 0.0006 (0.0018) model time 0.2328 (0.2290) loss 2.0585 (3.1207) grad_norm 3.0504 
(2.7379) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][940/1251] eta 0:01:11 lr 0.000451 wd 0.0500 time 0.2371 (0.2314) data time 0.0012 (0.0018) model time 0.2359 (0.2291) loss 2.8862 (3.1191) grad_norm 1.9138 (2.7365) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][950/1251] eta 0:01:09 lr 0.000451 wd 0.0500 time 0.2323 (0.2314) data time 0.0010 (0.0018) model time 0.2313 (0.2290) loss 3.0608 (3.1205) grad_norm 3.1047 (2.7340) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][960/1251] eta 0:01:07 lr 0.000451 wd 0.0500 time 0.2354 (0.2313) data time 0.0007 (0.0018) model time 0.2347 (0.2290) loss 2.7933 (3.1210) grad_norm 3.3494 (2.7397) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][970/1251] eta 0:01:04 lr 0.000451 wd 0.0500 time 0.2172 (0.2313) data time 0.0008 (0.0018) model time 0.2165 (0.2290) loss 3.6214 (3.1208) grad_norm 2.4994 (2.7415) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][980/1251] eta 0:01:02 lr 0.000451 wd 0.0500 time 0.2330 (0.2313) data time 0.0011 (0.0017) model time 0.2320 (0.2290) loss 3.7542 (3.1218) grad_norm 3.1757 (2.7409) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][990/1251] eta 0:01:00 lr 0.000451 wd 0.0500 time 0.2210 (0.2313) data time 0.0009 (0.0017) model time 0.2201 (0.2290) loss 2.2745 (3.1201) grad_norm 2.3947 (2.7561) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1000/1251] eta 0:00:58 lr 0.000451 wd 0.0500 time 0.2203 (0.2312) 
data time 0.0012 (0.0017) model time 0.2191 (0.2289) loss 3.1728 (3.1221) grad_norm 1.9215 (2.7512) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1010/1251] eta 0:00:55 lr 0.000451 wd 0.0500 time 0.2138 (0.2313) data time 0.0008 (0.0017) model time 0.2130 (0.2290) loss 3.0841 (3.1206) grad_norm 2.1220 (2.7513) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1020/1251] eta 0:00:53 lr 0.000451 wd 0.0500 time 0.2207 (0.2314) data time 0.0012 (0.0017) model time 0.2195 (0.2291) loss 3.7206 (3.1217) grad_norm 2.0384 (2.7468) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1030/1251] eta 0:00:51 lr 0.000451 wd 0.0500 time 0.2228 (0.2313) data time 0.0009 (0.0017) model time 0.2219 (0.2291) loss 3.8928 (3.1226) grad_norm 2.4016 (2.7477) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 11:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1040/1251] eta 0:00:48 lr 0.000451 wd 0.0500 time 0.2272 (0.2313) data time 0.0009 (0.0017) model time 0.2263 (0.2291) loss 3.2485 (3.1250) grad_norm 4.6816 (2.7514) loss_scale 1024.0000 (516.4265) mem 7381MB [2024-08-27 11:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1050/1251] eta 0:00:46 lr 0.000450 wd 0.0500 time 0.2324 (0.2313) data time 0.0011 (0.0017) model time 0.2313 (0.2290) loss 2.9758 (3.1224) grad_norm 2.6786 (2.7584) loss_scale 1024.0000 (521.2559) mem 7381MB [2024-08-27 11:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1060/1251] eta 0:00:44 lr 0.000450 wd 0.0500 time 0.2287 (0.2312) data time 0.0009 (0.0017) model time 0.2278 (0.2290) loss 3.6117 (3.1229) grad_norm 2.3121 (2.7691) loss_scale 1024.0000 (525.9943) mem 7381MB [2024-08-27 11:37:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [169/300][1070/1251] eta 0:00:41 lr 0.000450 wd 0.0500 time 0.2291 (0.2312) data time 0.0010 (0.0017) model time 0.2281 (0.2290) loss 3.6175 (3.1250) grad_norm 1.8780 (2.7691) loss_scale 1024.0000 (530.6443) mem 7381MB [2024-08-27 11:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1080/1251] eta 0:00:39 lr 0.000450 wd 0.0500 time 0.2280 (0.2312) data time 0.0010 (0.0017) model time 0.2270 (0.2290) loss 3.4637 (3.1271) grad_norm 2.0932 (2.7662) loss_scale 1024.0000 (535.2081) mem 7381MB [2024-08-27 11:37:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1090/1251] eta 0:00:37 lr 0.000450 wd 0.0500 time 0.2311 (0.2311) data time 0.0007 (0.0017) model time 0.2304 (0.2290) loss 2.0520 (3.1235) grad_norm 3.2678 (2.7817) loss_scale 1024.0000 (539.6884) mem 7381MB [2024-08-27 11:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1100/1251] eta 0:00:34 lr 0.000450 wd 0.0500 time 0.2277 (0.2311) data time 0.0007 (0.0017) model time 0.2270 (0.2289) loss 3.5504 (3.1231) grad_norm 2.1253 (2.7835) loss_scale 1024.0000 (544.0872) mem 7381MB [2024-08-27 11:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1110/1251] eta 0:00:32 lr 0.000450 wd 0.0500 time 0.2208 (0.2311) data time 0.0009 (0.0017) model time 0.2199 (0.2289) loss 2.2797 (3.1218) grad_norm 2.6236 (2.7799) loss_scale 1024.0000 (548.4068) mem 7381MB [2024-08-27 11:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1120/1251] eta 0:00:30 lr 0.000450 wd 0.0500 time 0.2270 (0.2310) data time 0.0010 (0.0017) model time 0.2260 (0.2289) loss 2.9457 (3.1218) grad_norm 2.4738 (2.7782) loss_scale 1024.0000 (552.6494) mem 7381MB [2024-08-27 11:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1130/1251] eta 0:00:27 lr 0.000450 wd 0.0500 time 0.2342 (0.2310) data time 0.0009 (0.0017) model time 0.2333 (0.2289) loss 3.2919 (3.1211) grad_norm 
4.7741 (2.7815) loss_scale 1024.0000 (556.8170) mem 7381MB [2024-08-27 11:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1140/1251] eta 0:00:25 lr 0.000450 wd 0.0500 time 0.2267 (0.2310) data time 0.0008 (0.0016) model time 0.2259 (0.2288) loss 2.4214 (3.1199) grad_norm 2.5821 (2.7820) loss_scale 1024.0000 (560.9115) mem 7381MB [2024-08-27 11:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1150/1251] eta 0:00:23 lr 0.000450 wd 0.0500 time 0.2204 (0.2309) data time 0.0010 (0.0016) model time 0.2193 (0.2288) loss 3.0771 (3.1195) grad_norm 2.7080 (2.7830) loss_scale 1024.0000 (564.9348) mem 7381MB [2024-08-27 11:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1160/1251] eta 0:00:21 lr 0.000450 wd 0.0500 time 0.2196 (0.2309) data time 0.0008 (0.0016) model time 0.2188 (0.2288) loss 3.7207 (3.1210) grad_norm 2.0943 (2.7846) loss_scale 1024.0000 (568.8889) mem 7381MB [2024-08-27 11:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1170/1251] eta 0:00:18 lr 0.000450 wd 0.0500 time 0.2323 (0.2309) data time 0.0010 (0.0016) model time 0.2313 (0.2288) loss 1.8951 (3.1198) grad_norm 2.4813 (2.7816) loss_scale 1024.0000 (572.7754) mem 7381MB [2024-08-27 11:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1180/1251] eta 0:00:16 lr 0.000450 wd 0.0500 time 0.2329 (0.2309) data time 0.0011 (0.0016) model time 0.2318 (0.2288) loss 2.9438 (3.1205) grad_norm 2.9355 (2.7823) loss_scale 1024.0000 (576.5961) mem 7381MB [2024-08-27 11:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1190/1251] eta 0:00:14 lr 0.000450 wd 0.0500 time 0.2239 (0.2309) data time 0.0011 (0.0016) model time 0.2228 (0.2288) loss 2.9666 (3.1219) grad_norm 2.1490 (2.7836) loss_scale 1024.0000 (580.3526) mem 7381MB [2024-08-27 11:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1200/1251] eta 0:00:11 lr 0.000450 wd 0.0500 time 
0.2247 (0.2309) data time 0.0009 (0.0016) model time 0.2238 (0.2288) loss 3.2441 (3.1218) grad_norm 1.9886 (2.7907) loss_scale 1024.0000 (584.0466) mem 7381MB [2024-08-27 11:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1210/1251] eta 0:00:09 lr 0.000450 wd 0.0500 time 0.2380 (0.2308) data time 0.0006 (0.0016) model time 0.2374 (0.2288) loss 2.3384 (3.1208) grad_norm 1.9444 (2.7926) loss_scale 1024.0000 (587.6796) mem 7381MB [2024-08-27 11:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1220/1251] eta 0:00:07 lr 0.000450 wd 0.0500 time 0.2207 (0.2308) data time 0.0010 (0.0016) model time 0.2197 (0.2288) loss 3.3153 (3.1232) grad_norm 2.1000 (2.7934) loss_scale 1024.0000 (591.2531) mem 7381MB [2024-08-27 11:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1230/1251] eta 0:00:04 lr 0.000450 wd 0.0500 time 0.2322 (0.2308) data time 0.0007 (0.0016) model time 0.2314 (0.2288) loss 3.8387 (3.1225) grad_norm 2.5256 (2.7929) loss_scale 1024.0000 (594.7685) mem 7381MB [2024-08-27 11:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1240/1251] eta 0:00:02 lr 0.000450 wd 0.0500 time 0.2193 (0.2307) data time 0.0006 (0.0016) model time 0.2187 (0.2287) loss 2.7515 (3.1246) grad_norm 2.3503 (2.7905) loss_scale 1024.0000 (598.2272) mem 7381MB [2024-08-27 11:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [169/300][1250/1251] eta 0:00:00 lr 0.000450 wd 0.0500 time 0.2124 (0.2306) data time 0.0007 (0.0016) model time 0.2117 (0.2285) loss 3.0402 (3.1246) grad_norm 2.2014 (2.7896) loss_scale 1024.0000 (601.6307) mem 7381MB [2024-08-27 11:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 169 training takes 0:04:48 [2024-08-27 11:37:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 11:37:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.535 (0.535) Loss 0.4685 (0.4685) Acc@1 91.113 (91.113) Acc@5 98.242 (98.242) Mem 7381MB [2024-08-27 11:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.120) Loss 0.6816 (0.7062) Acc@1 86.035 (84.650) Acc@5 97.559 (97.044) Mem 7381MB [2024-08-27 11:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.097) Loss 1.0645 (0.7313) Acc@1 73.633 (83.668) Acc@5 93.555 (96.898) Mem 7381MB [2024-08-27 11:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.617 (0.106) Loss 1.2480 (0.8300) Acc@1 69.629 (81.385) Acc@5 91.309 (95.772) Mem 7381MB [2024-08-27 11:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.117) Loss 1.1738 (0.8846) Acc@1 72.363 (80.033) Acc@5 92.188 (95.167) Mem 7381MB [2024-08-27 11:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.698 Acc@5 95.112 [2024-08-27 11:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.7% [2024-08-27 11:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.987 (0.987) Loss 0.4023 (0.4023) Acc@1 93.262 (93.262) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 11:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.164) Loss 0.6230 (0.6257) Acc@1 87.793 (86.692) Acc@5 97.656 (97.417) Mem 7381MB [2024-08-27 11:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.095 (0.125) Loss 0.8896 (0.6507) Acc@1 79.004 (85.775) Acc@5 95.703 (97.424) Mem 7381MB [2024-08-27 11:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.111) Loss 1.1211 (0.7384) Acc@1 72.559 (83.644) Acc@5 
93.066 (96.456) Mem 7381MB [2024-08-27 11:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.100) Loss 1.0117 (0.7834) Acc@1 74.609 (82.277) Acc@5 94.043 (96.001) Mem 7381MB [2024-08-27 11:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.884 Acc@5 95.978 [2024-08-27 11:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.9% [2024-08-27 11:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.88% [2024-08-27 11:38:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 11:38:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 11:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][0/1251] eta 0:17:27 lr 0.000450 wd 0.0500 time 0.8373 (0.8373) data time 0.6051 (0.6051) model time 0.0000 (0.0000) loss 3.2197 (3.2197) grad_norm 2.3641 (2.3641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][10/1251] eta 0:05:49 lr 0.000450 wd 0.0500 time 0.2287 (0.2817) data time 0.0009 (0.0560) model time 0.0000 (0.0000) loss 3.0517 (3.1543) grad_norm 2.3134 (2.5641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][20/1251] eta 0:05:16 lr 0.000449 wd 0.0500 time 0.2360 (0.2567) data time 0.0009 (0.0298) model time 0.0000 (0.0000) loss 3.1765 (3.1935) grad_norm 2.1954 (2.4523) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][30/1251] eta 0:05:01 lr 0.000449 wd 0.0500 time 0.2304 (0.2469) data time 0.0009 (0.0205) model time 0.0000 (0.0000) loss 2.9104 (3.2343) grad_norm 2.9569 (2.5827) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][40/1251] eta 0:04:53 lr 0.000449 wd 0.0500 time 0.2295 (0.2423) data time 0.0009 (0.0158) model time 0.0000 (0.0000) loss 2.9584 (3.1570) grad_norm 3.1836 (2.6318) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][50/1251] eta 0:04:47 lr 0.000449 wd 0.0500 time 0.2321 (0.2393) data time 0.0008 (0.0129) model time 0.0000 (0.0000) loss 3.2979 (3.1700) grad_norm 3.2199 (2.6792) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][60/1251] eta 0:04:42 lr 0.000449 wd 0.0500 time 0.2258 (0.2375) data time 0.0008 (0.0110) model time 0.2250 (0.2272) loss 3.5576 (3.1788) grad_norm 2.7065 (2.7226) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][70/1251] eta 0:04:38 lr 0.000449 wd 0.0500 time 0.2238 (0.2359) data time 0.0010 (0.0096) model time 0.2228 (0.2263) loss 2.7882 (3.1744) grad_norm 2.5575 (2.6778) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][80/1251] eta 0:04:35 lr 0.000449 wd 0.0500 time 0.2451 (0.2354) data time 0.0010 (0.0085) model time 0.2440 (0.2277) loss 3.2319 (3.1565) grad_norm 2.6566 (2.6963) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][90/1251] eta 0:04:32 lr 0.000449 wd 0.0500 time 0.2250 (0.2349) data time 0.0008 (0.0078) model time 0.2241 (0.2281) loss 2.2187 (3.1209) grad_norm 2.1632 (2.6967) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][100/1251] eta 0:04:29 lr 0.000449 wd 0.0500 time 0.2338 (0.2342) data 
time 0.0009 (0.0071) model time 0.2329 (0.2278) loss 3.4568 (3.1187) grad_norm 3.4962 (2.7006) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][110/1251] eta 0:04:26 lr 0.000449 wd 0.0500 time 0.2383 (0.2338) data time 0.0010 (0.0066) model time 0.2374 (0.2280) loss 1.6210 (3.0993) grad_norm 2.7438 (2.7102) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][120/1251] eta 0:04:23 lr 0.000449 wd 0.0500 time 0.2214 (0.2333) data time 0.0013 (0.0061) model time 0.2201 (0.2278) loss 3.3178 (3.0986) grad_norm 2.6781 (2.7246) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][130/1251] eta 0:04:24 lr 0.000449 wd 0.0500 time 0.2208 (0.2359) data time 0.0011 (0.0057) model time 0.2197 (0.2326) loss 3.2413 (3.0935) grad_norm 3.3951 (2.7229) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][140/1251] eta 0:04:24 lr 0.000449 wd 0.0500 time 0.2258 (0.2381) data time 0.0013 (0.0054) model time 0.2245 (0.2363) loss 2.7348 (3.0999) grad_norm 2.0699 (2.7129) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][150/1251] eta 0:04:21 lr 0.000449 wd 0.0500 time 0.2286 (0.2376) data time 0.0010 (0.0051) model time 0.2276 (0.2356) loss 2.2921 (3.0865) grad_norm 2.5784 (2.6995) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][160/1251] eta 0:04:18 lr 0.000449 wd 0.0500 time 0.2263 (0.2373) data time 0.0015 (0.0049) model time 0.2249 (0.2352) loss 2.6333 (3.0935) grad_norm 4.7385 (2.7100) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:47 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [170/300][170/1251] eta 0:04:15 lr 0.000449 wd 0.0500 time 0.2264 (0.2368) data time 0.0007 (0.0047) model time 0.2258 (0.2345) loss 3.0067 (3.0829) grad_norm 2.1429 (2.7231) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][180/1251] eta 0:04:12 lr 0.000449 wd 0.0500 time 0.2230 (0.2362) data time 0.0008 (0.0045) model time 0.2223 (0.2338) loss 3.4421 (3.0829) grad_norm 4.9937 (2.7340) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][190/1251] eta 0:04:10 lr 0.000449 wd 0.0500 time 0.2236 (0.2357) data time 0.0009 (0.0043) model time 0.2227 (0.2333) loss 3.7759 (3.0945) grad_norm 2.4646 (2.7765) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][200/1251] eta 0:04:07 lr 0.000449 wd 0.0500 time 0.2222 (0.2352) data time 0.0009 (0.0042) model time 0.2213 (0.2327) loss 3.1036 (3.0943) grad_norm 2.9242 (2.7632) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][210/1251] eta 0:04:04 lr 0.000449 wd 0.0500 time 0.2214 (0.2348) data time 0.0008 (0.0040) model time 0.2206 (0.2322) loss 4.1150 (3.1075) grad_norm 1.9396 (2.7527) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][220/1251] eta 0:04:01 lr 0.000449 wd 0.0500 time 0.2274 (0.2346) data time 0.0009 (0.0039) model time 0.2264 (0.2321) loss 3.1291 (3.1097) grad_norm 3.0677 (2.7801) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][230/1251] eta 0:03:59 lr 0.000449 wd 0.0500 time 0.2292 (0.2343) data time 0.0008 (0.0038) model time 0.2284 (0.2318) loss 2.8281 (3.1095) grad_norm 
2.7206 (2.7785) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][240/1251] eta 0:03:56 lr 0.000449 wd 0.0500 time 0.2265 (0.2341) data time 0.0009 (0.0036) model time 0.2256 (0.2315) loss 3.3895 (3.1121) grad_norm 2.4749 (2.7626) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][250/1251] eta 0:03:54 lr 0.000448 wd 0.0500 time 0.2218 (0.2338) data time 0.0010 (0.0035) model time 0.2208 (0.2312) loss 2.8892 (3.1161) grad_norm 2.1356 (2.7441) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][260/1251] eta 0:03:51 lr 0.000448 wd 0.0500 time 0.2272 (0.2335) data time 0.0010 (0.0034) model time 0.2263 (0.2310) loss 2.9979 (3.1126) grad_norm 2.5095 (2.7307) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][270/1251] eta 0:03:48 lr 0.000448 wd 0.0500 time 0.2264 (0.2333) data time 0.0010 (0.0034) model time 0.2254 (0.2308) loss 3.6708 (3.1144) grad_norm 2.2154 (2.7121) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][280/1251] eta 0:03:46 lr 0.000448 wd 0.0500 time 0.2201 (0.2330) data time 0.0011 (0.0033) model time 0.2190 (0.2305) loss 2.9168 (3.1071) grad_norm 2.1730 (2.6948) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][290/1251] eta 0:03:43 lr 0.000448 wd 0.0500 time 0.2233 (0.2329) data time 0.0007 (0.0032) model time 0.2226 (0.2304) loss 3.3321 (3.1080) grad_norm 2.0374 (2.6883) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][300/1251] eta 0:03:41 lr 0.000448 wd 0.0500 time 
0.2324 (0.2327) data time 0.0009 (0.0031) model time 0.2316 (0.2303) loss 3.2217 (3.1060) grad_norm 6.2602 (2.7300) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][310/1251] eta 0:03:38 lr 0.000448 wd 0.0500 time 0.2190 (0.2325) data time 0.0012 (0.0031) model time 0.2178 (0.2301) loss 3.4173 (3.1060) grad_norm 2.3721 (2.7381) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][320/1251] eta 0:03:36 lr 0.000448 wd 0.0500 time 0.2244 (0.2324) data time 0.0010 (0.0030) model time 0.2233 (0.2300) loss 3.2009 (3.1039) grad_norm 3.2603 (2.7376) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][330/1251] eta 0:03:34 lr 0.000448 wd 0.0500 time 0.2302 (0.2324) data time 0.0008 (0.0029) model time 0.2294 (0.2300) loss 2.9843 (3.1075) grad_norm 2.2475 (2.7444) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][340/1251] eta 0:03:31 lr 0.000448 wd 0.0500 time 0.2278 (0.2322) data time 0.0010 (0.0029) model time 0.2268 (0.2299) loss 3.4390 (3.1035) grad_norm 1.5938 (2.7356) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][350/1251] eta 0:03:29 lr 0.000448 wd 0.0500 time 0.2191 (0.2321) data time 0.0015 (0.0028) model time 0.2176 (0.2297) loss 2.8410 (3.1021) grad_norm 2.4181 (2.7309) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][360/1251] eta 0:03:26 lr 0.000448 wd 0.0500 time 0.2285 (0.2320) data time 0.0006 (0.0028) model time 0.2279 (0.2297) loss 3.8953 (3.1019) grad_norm 2.2197 (2.7315) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][370/1251] eta 0:03:24 lr 0.000448 wd 0.0500 time 0.2296 (0.2319) data time 0.0009 (0.0027) model time 0.2287 (0.2296) loss 3.3721 (3.0975) grad_norm 3.8682 (2.7367) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][380/1251] eta 0:03:21 lr 0.000448 wd 0.0500 time 0.2302 (0.2318) data time 0.0011 (0.0027) model time 0.2291 (0.2295) loss 3.1222 (3.1058) grad_norm 2.1488 (2.7339) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][390/1251] eta 0:03:19 lr 0.000448 wd 0.0500 time 0.2311 (0.2318) data time 0.0012 (0.0027) model time 0.2299 (0.2295) loss 3.6746 (3.1124) grad_norm 3.4291 (2.7327) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][400/1251] eta 0:03:17 lr 0.000448 wd 0.0500 time 0.2233 (0.2318) data time 0.0007 (0.0026) model time 0.2227 (0.2295) loss 3.1431 (3.1129) grad_norm 1.9970 (2.7299) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][410/1251] eta 0:03:14 lr 0.000448 wd 0.0500 time 0.2351 (0.2317) data time 0.0010 (0.0026) model time 0.2341 (0.2295) loss 3.2215 (3.1184) grad_norm 1.8873 (2.7221) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][420/1251] eta 0:03:12 lr 0.000448 wd 0.0500 time 0.2426 (0.2317) data time 0.0008 (0.0025) model time 0.2418 (0.2295) loss 3.6365 (3.1175) grad_norm 2.3248 (2.7146) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][430/1251] eta 0:03:10 lr 0.000448 wd 0.0500 time 0.2234 (0.2316) data time 0.0009 (0.0025) model time 0.2225 (0.2295) loss 3.3543 
(3.1142) grad_norm 3.5238 (2.7156) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][440/1251] eta 0:03:07 lr 0.000448 wd 0.0500 time 0.2241 (0.2315) data time 0.0009 (0.0025) model time 0.2231 (0.2294) loss 3.5204 (3.1149) grad_norm 2.6139 (2.7194) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][450/1251] eta 0:03:05 lr 0.000448 wd 0.0500 time 0.2259 (0.2314) data time 0.0007 (0.0025) model time 0.2252 (0.2293) loss 2.7058 (3.1172) grad_norm 2.4848 (2.7286) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][460/1251] eta 0:03:03 lr 0.000448 wd 0.0500 time 0.2296 (0.2314) data time 0.0008 (0.0024) model time 0.2288 (0.2293) loss 2.8190 (3.1182) grad_norm 2.8478 (2.7254) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][470/1251] eta 0:03:00 lr 0.000448 wd 0.0500 time 0.2239 (0.2313) data time 0.0010 (0.0024) model time 0.2229 (0.2292) loss 2.9681 (3.1174) grad_norm 2.1349 (2.7231) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][480/1251] eta 0:02:58 lr 0.000447 wd 0.0500 time 0.2291 (0.2312) data time 0.0008 (0.0024) model time 0.2283 (0.2292) loss 3.7530 (3.1213) grad_norm 1.8408 (2.7419) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][490/1251] eta 0:02:55 lr 0.000447 wd 0.0500 time 0.2272 (0.2312) data time 0.0010 (0.0023) model time 0.2262 (0.2291) loss 3.1836 (3.1220) grad_norm 2.6458 (2.7423) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][500/1251] eta 0:02:53 lr 
0.000447 wd 0.0500 time 0.2293 (0.2312) data time 0.0007 (0.0023) model time 0.2286 (0.2291) loss 2.7656 (3.1242) grad_norm 2.8830 (2.7410) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][510/1251] eta 0:02:51 lr 0.000447 wd 0.0500 time 0.2263 (0.2311) data time 0.0008 (0.0023) model time 0.2255 (0.2290) loss 3.2662 (3.1288) grad_norm 2.3558 (2.7351) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][520/1251] eta 0:02:48 lr 0.000447 wd 0.0500 time 0.2303 (0.2310) data time 0.0006 (0.0023) model time 0.2297 (0.2290) loss 3.2616 (3.1309) grad_norm 2.4000 (2.7513) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][530/1251] eta 0:02:46 lr 0.000447 wd 0.0500 time 0.2240 (0.2310) data time 0.0008 (0.0023) model time 0.2232 (0.2290) loss 3.3142 (3.1333) grad_norm 2.7264 (2.7516) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][540/1251] eta 0:02:44 lr 0.000447 wd 0.0500 time 0.2232 (0.2310) data time 0.0012 (0.0022) model time 0.2219 (0.2290) loss 2.9999 (3.1252) grad_norm 2.3063 (2.7469) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][550/1251] eta 0:02:41 lr 0.000447 wd 0.0500 time 0.2279 (0.2310) data time 0.0007 (0.0022) model time 0.2272 (0.2290) loss 3.1817 (3.1255) grad_norm 2.8086 (2.7484) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][560/1251] eta 0:02:39 lr 0.000447 wd 0.0500 time 0.2274 (0.2310) data time 0.0011 (0.0022) model time 0.2263 (0.2290) loss 3.2820 (3.1237) grad_norm 3.3991 (2.7401) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 
11:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][570/1251] eta 0:02:37 lr 0.000447 wd 0.0500 time 0.2324 (0.2310) data time 0.0007 (0.0022) model time 0.2317 (0.2290) loss 4.4098 (3.1308) grad_norm 2.1042 (2.7324) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][580/1251] eta 0:02:34 lr 0.000447 wd 0.0500 time 0.2301 (0.2309) data time 0.0010 (0.0022) model time 0.2291 (0.2290) loss 3.0982 (3.1326) grad_norm 2.0964 (2.7336) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][590/1251] eta 0:02:32 lr 0.000447 wd 0.0500 time 0.2303 (0.2310) data time 0.0007 (0.0022) model time 0.2296 (0.2290) loss 2.1864 (3.1300) grad_norm 2.0626 (2.7322) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][600/1251] eta 0:02:30 lr 0.000447 wd 0.0500 time 0.2209 (0.2309) data time 0.0011 (0.0021) model time 0.2198 (0.2290) loss 3.2387 (3.1297) grad_norm 2.8543 (2.7387) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][610/1251] eta 0:02:28 lr 0.000447 wd 0.0500 time 0.2240 (0.2309) data time 0.0010 (0.0021) model time 0.2230 (0.2290) loss 2.3956 (3.1283) grad_norm 2.2614 (2.7365) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][620/1251] eta 0:02:25 lr 0.000447 wd 0.0500 time 0.2197 (0.2309) data time 0.0012 (0.0021) model time 0.2185 (0.2290) loss 3.1439 (3.1312) grad_norm 2.0549 (2.7293) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][630/1251] eta 0:02:23 lr 0.000447 wd 0.0500 time 0.2214 (0.2308) data time 0.0008 (0.0021) model time 0.2206 (0.2289) 
loss 3.2934 (3.1336) grad_norm 2.4032 (2.7218) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][640/1251] eta 0:02:21 lr 0.000447 wd 0.0500 time 0.2230 (0.2308) data time 0.0012 (0.0021) model time 0.2218 (0.2289) loss 3.5917 (3.1302) grad_norm 3.9434 (2.7234) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][650/1251] eta 0:02:18 lr 0.000447 wd 0.0500 time 0.2151 (0.2308) data time 0.0012 (0.0021) model time 0.2139 (0.2289) loss 3.6815 (3.1328) grad_norm 1.8170 (2.7199) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][660/1251] eta 0:02:16 lr 0.000447 wd 0.0500 time 0.2321 (0.2308) data time 0.0011 (0.0021) model time 0.2310 (0.2289) loss 3.2528 (3.1383) grad_norm 3.0495 (2.7216) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][670/1251] eta 0:02:14 lr 0.000447 wd 0.0500 time 0.2340 (0.2308) data time 0.0010 (0.0021) model time 0.2330 (0.2289) loss 3.2739 (3.1393) grad_norm 3.0107 (2.7251) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][680/1251] eta 0:02:11 lr 0.000447 wd 0.0500 time 0.2228 (0.2307) data time 0.0012 (0.0020) model time 0.2216 (0.2289) loss 3.5608 (3.1402) grad_norm 3.5327 (2.7251) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][690/1251] eta 0:02:09 lr 0.000447 wd 0.0500 time 0.2357 (0.2307) data time 0.0010 (0.0020) model time 0.2348 (0.2288) loss 3.5488 (3.1413) grad_norm 2.4710 (2.7369) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][700/1251] eta 
0:02:07 lr 0.000446 wd 0.0500 time 0.2414 (0.2307) data time 0.0008 (0.0020) model time 0.2406 (0.2288) loss 3.4244 (3.1433) grad_norm 2.8721 (2.7382) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][710/1251] eta 0:02:04 lr 0.000446 wd 0.0500 time 0.2251 (0.2307) data time 0.0006 (0.0020) model time 0.2244 (0.2288) loss 3.8231 (3.1431) grad_norm 2.3584 (2.7335) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][720/1251] eta 0:02:02 lr 0.000446 wd 0.0500 time 0.2257 (0.2307) data time 0.0011 (0.0020) model time 0.2246 (0.2288) loss 3.0964 (3.1462) grad_norm 3.0957 (2.7330) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][730/1251] eta 0:02:00 lr 0.000446 wd 0.0500 time 0.2322 (0.2307) data time 0.0008 (0.0020) model time 0.2314 (0.2288) loss 3.6992 (3.1484) grad_norm 2.5701 (2.7289) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][740/1251] eta 0:01:57 lr 0.000446 wd 0.0500 time 0.2235 (0.2306) data time 0.0008 (0.0020) model time 0.2227 (0.2288) loss 3.5858 (3.1515) grad_norm 2.0534 (2.7274) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][750/1251] eta 0:01:55 lr 0.000446 wd 0.0500 time 0.2369 (0.2306) data time 0.0010 (0.0020) model time 0.2360 (0.2288) loss 3.3983 (3.1518) grad_norm 7.1467 (2.7304) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][760/1251] eta 0:01:53 lr 0.000446 wd 0.0500 time 0.2406 (0.2306) data time 0.0010 (0.0020) model time 0.2397 (0.2288) loss 3.5264 (3.1498) grad_norm 2.6950 (2.7284) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 11:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][770/1251] eta 0:01:50 lr 0.000446 wd 0.0500 time 0.2277 (0.2306) data time 0.0007 (0.0019) model time 0.2270 (0.2288) loss 3.6428 (3.1504) grad_norm 2.1439 (2.7325) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][780/1251] eta 0:01:48 lr 0.000446 wd 0.0500 time 0.2299 (0.2306) data time 0.0007 (0.0019) model time 0.2292 (0.2288) loss 3.3765 (3.1485) grad_norm 3.1513 (2.7337) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][790/1251] eta 0:01:46 lr 0.000446 wd 0.0500 time 0.2244 (0.2305) data time 0.0006 (0.0019) model time 0.2238 (0.2287) loss 2.8030 (3.1480) grad_norm 3.1753 (2.7357) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][800/1251] eta 0:01:43 lr 0.000446 wd 0.0500 time 0.2251 (0.2305) data time 0.0007 (0.0019) model time 0.2244 (0.2287) loss 3.9724 (3.1466) grad_norm 3.1585 (2.7317) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][810/1251] eta 0:01:41 lr 0.000446 wd 0.0500 time 0.2276 (0.2305) data time 0.0008 (0.0019) model time 0.2268 (0.2287) loss 3.3619 (3.1458) grad_norm 4.7657 (2.7362) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][820/1251] eta 0:01:39 lr 0.000446 wd 0.0500 time 0.2271 (0.2305) data time 0.0007 (0.0019) model time 0.2264 (0.2287) loss 3.8082 (3.1453) grad_norm 3.3745 (2.7460) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][830/1251] eta 0:01:37 lr 0.000446 wd 0.0500 time 0.2243 (0.2305) data time 0.0007 (0.0019) model time 0.2235 
(0.2287) loss 3.8521 (3.1464) grad_norm 5.1080 (2.7641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][840/1251] eta 0:01:34 lr 0.000446 wd 0.0500 time 0.2241 (0.2304) data time 0.0009 (0.0019) model time 0.2232 (0.2287) loss 3.4972 (3.1463) grad_norm 2.3336 (2.7660) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][850/1251] eta 0:01:32 lr 0.000446 wd 0.0500 time 0.2344 (0.2304) data time 0.0007 (0.0019) model time 0.2337 (0.2287) loss 3.2849 (3.1452) grad_norm 2.3482 (2.7686) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][860/1251] eta 0:01:30 lr 0.000446 wd 0.0500 time 0.2255 (0.2304) data time 0.0010 (0.0019) model time 0.2246 (0.2287) loss 3.5100 (3.1480) grad_norm 2.3417 (2.7647) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][870/1251] eta 0:01:27 lr 0.000446 wd 0.0500 time 0.2243 (0.2304) data time 0.0020 (0.0019) model time 0.2224 (0.2287) loss 3.2034 (3.1481) grad_norm 2.3745 (2.7626) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][880/1251] eta 0:01:25 lr 0.000446 wd 0.0500 time 0.2265 (0.2304) data time 0.0009 (0.0019) model time 0.2257 (0.2287) loss 3.0347 (3.1507) grad_norm 1.8782 (2.7623) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][890/1251] eta 0:01:23 lr 0.000446 wd 0.0500 time 0.2269 (0.2304) data time 0.0008 (0.0019) model time 0.2262 (0.2286) loss 3.4271 (3.1512) grad_norm 2.2332 (2.7606) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[170/300][900/1251] eta 0:01:20 lr 0.000446 wd 0.0500 time 0.2306 (0.2303) data time 0.0009 (0.0018) model time 0.2297 (0.2286) loss 3.2976 (3.1529) grad_norm 2.4524 (2.7579) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][910/1251] eta 0:01:18 lr 0.000446 wd 0.0500 time 0.2311 (0.2303) data time 0.0010 (0.0018) model time 0.2301 (0.2286) loss 3.4593 (3.1527) grad_norm 2.3071 (2.7594) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][920/1251] eta 0:01:16 lr 0.000446 wd 0.0500 time 0.2312 (0.2303) data time 0.0007 (0.0018) model time 0.2306 (0.2286) loss 3.4714 (3.1521) grad_norm 2.3694 (2.7575) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][930/1251] eta 0:01:13 lr 0.000445 wd 0.0500 time 0.2328 (0.2303) data time 0.0009 (0.0018) model time 0.2319 (0.2286) loss 3.2311 (3.1523) grad_norm 4.9435 (2.7584) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][940/1251] eta 0:01:11 lr 0.000445 wd 0.0500 time 0.2235 (0.2303) data time 0.0013 (0.0018) model time 0.2222 (0.2286) loss 2.9649 (3.1534) grad_norm 1.8650 (2.7553) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][950/1251] eta 0:01:09 lr 0.000445 wd 0.0500 time 0.2277 (0.2303) data time 0.0009 (0.0018) model time 0.2268 (0.2286) loss 3.2722 (3.1558) grad_norm 2.4207 (2.7525) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][960/1251] eta 0:01:07 lr 0.000445 wd 0.0500 time 0.2255 (0.2303) data time 0.0010 (0.0018) model time 0.2244 (0.2286) loss 3.3777 (3.1571) grad_norm 3.6826 (2.7510) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 11:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][970/1251] eta 0:01:04 lr 0.000445 wd 0.0500 time 0.2284 (0.2303) data time 0.0011 (0.0018) model time 0.2273 (0.2286) loss 3.4269 (3.1576) grad_norm 1.8093 (2.7528) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][980/1251] eta 0:01:02 lr 0.000445 wd 0.0500 time 0.2267 (0.2303) data time 0.0011 (0.0018) model time 0.2256 (0.2286) loss 3.5021 (3.1588) grad_norm 2.3881 (2.7493) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][990/1251] eta 0:01:00 lr 0.000445 wd 0.0500 time 0.2287 (0.2303) data time 0.0014 (0.0018) model time 0.2273 (0.2286) loss 3.4509 (3.1578) grad_norm 2.0995 (2.7457) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1000/1251] eta 0:00:57 lr 0.000445 wd 0.0500 time 0.2209 (0.2303) data time 0.0008 (0.0018) model time 0.2201 (0.2287) loss 2.2326 (3.1569) grad_norm 2.7941 (2.7445) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1010/1251] eta 0:00:55 lr 0.000445 wd 0.0500 time 0.2318 (0.2304) data time 0.0008 (0.0018) model time 0.2310 (0.2287) loss 2.6987 (3.1560) grad_norm 2.5609 (2.7438) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1020/1251] eta 0:00:53 lr 0.000445 wd 0.0500 time 0.2256 (0.2303) data time 0.0009 (0.0018) model time 0.2247 (0.2287) loss 3.3898 (3.1538) grad_norm 4.2698 (2.7469) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1030/1251] eta 0:00:50 lr 0.000445 wd 0.0500 time 0.2266 (0.2303) data time 0.0010 
(0.0018) model time 0.2256 (0.2286) loss 3.2241 (3.1544) grad_norm 2.0015 (2.7454) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1040/1251] eta 0:00:48 lr 0.000445 wd 0.0500 time 0.2289 (0.2303) data time 0.0006 (0.0018) model time 0.2282 (0.2286) loss 3.1189 (3.1542) grad_norm 2.4692 (2.7452) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1050/1251] eta 0:00:46 lr 0.000445 wd 0.0500 time 0.2313 (0.2303) data time 0.0008 (0.0018) model time 0.2306 (0.2286) loss 3.7241 (3.1570) grad_norm 2.1133 (2.7423) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1060/1251] eta 0:00:44 lr 0.000445 wd 0.0500 time 0.2386 (0.2307) data time 0.0009 (0.0018) model time 0.2377 (0.2290) loss 2.7382 (3.1579) grad_norm 1.7861 (2.7411) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1070/1251] eta 0:00:41 lr 0.000445 wd 0.0500 time 0.2220 (0.2310) data time 0.0009 (0.0018) model time 0.2211 (0.2294) loss 3.5887 (3.1568) grad_norm 2.7164 (2.7417) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1080/1251] eta 0:00:39 lr 0.000445 wd 0.0500 time 0.2360 (0.2310) data time 0.0007 (0.0018) model time 0.2353 (0.2294) loss 3.2604 (3.1569) grad_norm 2.7588 (2.7398) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1090/1251] eta 0:00:37 lr 0.000445 wd 0.0500 time 0.2283 (0.2310) data time 0.0010 (0.0018) model time 0.2273 (0.2294) loss 3.3360 (3.1589) grad_norm 2.8526 (2.7411) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [170/300][1100/1251] eta 0:00:34 lr 0.000445 wd 0.0500 time 0.2403 (0.2310) data time 0.0007 (0.0017) model time 0.2396 (0.2294) loss 3.5966 (3.1562) grad_norm 3.5472 (2.7411) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1110/1251] eta 0:00:32 lr 0.000445 wd 0.0500 time 0.2256 (0.2310) data time 0.0010 (0.0017) model time 0.2246 (0.2294) loss 3.6920 (3.1589) grad_norm 2.3151 (2.7406) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1120/1251] eta 0:00:30 lr 0.000445 wd 0.0500 time 0.2286 (0.2310) data time 0.0007 (0.0017) model time 0.2280 (0.2294) loss 2.0397 (3.1555) grad_norm 2.5587 (2.7394) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1130/1251] eta 0:00:27 lr 0.000445 wd 0.0500 time 0.2266 (0.2309) data time 0.0007 (0.0017) model time 0.2259 (0.2293) loss 2.7736 (3.1534) grad_norm 2.3898 (2.7433) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1140/1251] eta 0:00:25 lr 0.000445 wd 0.0500 time 0.2296 (0.2309) data time 0.0007 (0.0017) model time 0.2289 (0.2293) loss 3.8233 (3.1548) grad_norm 3.2380 (2.7453) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1150/1251] eta 0:00:23 lr 0.000445 wd 0.0500 time 0.2286 (0.2309) data time 0.0009 (0.0017) model time 0.2277 (0.2293) loss 3.8851 (3.1556) grad_norm 2.5382 (2.7445) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1160/1251] eta 0:00:21 lr 0.000444 wd 0.0500 time 0.2252 (0.2309) data time 0.0007 (0.0017) model time 0.2245 (0.2293) loss 3.8244 (3.1569) grad_norm 2.5783 (2.7498) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1170/1251] eta 0:00:18 lr 0.000444 wd 0.0500 time 0.2271 (0.2308) data time 0.0010 (0.0017) model time 0.2262 (0.2293) loss 2.5386 (3.1535) grad_norm 3.0797 (2.7475) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1180/1251] eta 0:00:16 lr 0.000444 wd 0.0500 time 0.2336 (0.2308) data time 0.0007 (0.0017) model time 0.2329 (0.2292) loss 3.4968 (3.1538) grad_norm 2.4346 (2.7469) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1190/1251] eta 0:00:14 lr 0.000444 wd 0.0500 time 0.2236 (0.2308) data time 0.0008 (0.0017) model time 0.2228 (0.2292) loss 2.3868 (3.1528) grad_norm 3.0439 (2.7455) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1200/1251] eta 0:00:11 lr 0.000444 wd 0.0500 time 0.2266 (0.2308) data time 0.0008 (0.0017) model time 0.2258 (0.2292) loss 3.7286 (3.1561) grad_norm 4.1212 (2.7485) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1210/1251] eta 0:00:09 lr 0.000444 wd 0.0500 time 0.2253 (0.2307) data time 0.0007 (0.0017) model time 0.2245 (0.2292) loss 3.7579 (3.1564) grad_norm 3.5532 (2.7496) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1220/1251] eta 0:00:07 lr 0.000444 wd 0.0500 time 0.2246 (0.2307) data time 0.0010 (0.0017) model time 0.2236 (0.2291) loss 3.3014 (3.1558) grad_norm 2.5623 (2.7519) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1230/1251] eta 0:00:04 lr 0.000444 wd 0.0500 time 0.2311 
(0.2307) data time 0.0011 (0.0017) model time 0.2300 (0.2291) loss 3.5554 (3.1573) grad_norm 2.5356 (2.7498) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1240/1251] eta 0:00:02 lr 0.000444 wd 0.0500 time 0.2118 (0.2306) data time 0.0005 (0.0017) model time 0.2114 (0.2290) loss 2.4279 (3.1550) grad_norm 2.3435 (2.7466) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [170/300][1250/1251] eta 0:00:00 lr 0.000444 wd 0.0500 time 0.2133 (0.2305) data time 0.0004 (0.0017) model time 0.2128 (0.2289) loss 3.1840 (3.1553) grad_norm 2.4687 (2.7442) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 170 training takes 0:04:48 [2024-08-27 11:42:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:42:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 11:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.474 (0.474) Loss 0.4856 (0.4856) Acc@1 90.918 (90.918) Acc@5 98.145 (98.145) Mem 7381MB [2024-08-27 11:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.117) Loss 0.7095 (0.7171) Acc@1 86.035 (84.739) Acc@5 96.289 (96.822) Mem 7381MB [2024-08-27 11:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.095) Loss 1.0039 (0.7324) Acc@1 76.758 (83.989) Acc@5 94.434 (96.940) Mem 7381MB [2024-08-27 11:42:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.097) Loss 1.2139 (0.8290) Acc@1 70.215 (81.691) Acc@5 91.699 (95.883) Mem 7381MB [2024-08-27 11:42:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.090) Loss 1.1328 (0.8834) Acc@1 72.559 (80.314) Acc@5 92.871 (95.293) Mem 7381MB [2024-08-27 11:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.900 Acc@5 95.220 [2024-08-27 11:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.9% [2024-08-27 11:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 79.90% [2024-08-27 11:43:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 11:43:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 11:43:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.456 (0.456) Loss 0.4028 (0.4028) Acc@1 93.164 (93.164) Acc@5 98.535 (98.535) Mem 7381MB [2024-08-27 11:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.121) Loss 0.6230 (0.6257) Acc@1 87.402 (86.594) Acc@5 97.559 (97.470) Mem 7381MB [2024-08-27 11:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.102) Loss 0.8887 (0.6504) Acc@1 78.809 (85.696) Acc@5 95.801 (97.452) Mem 7381MB [2024-08-27 11:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.094) Loss 1.1230 (0.7381) Acc@1 72.461 (83.603) Acc@5 93.066 (96.478) Mem 7381MB [2024-08-27 11:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.087) Loss 1.0098 (0.7827) Acc@1 75.000 (82.262) Acc@5 94.043 (96.020) Mem 7381MB [2024-08-27 11:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.858 Acc@5 95.986 [2024-08-27 11:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.9% [2024-08-27 11:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][0/1251] eta 0:27:34 lr 0.000444 wd 0.0500 time 1.3224 (1.3224) data time 0.9076 (0.9076) model time 0.0000 (0.0000) loss 3.5123 (3.5123) grad_norm 2.6773 (2.6773) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][10/1251] eta 0:06:47 lr 0.000444 wd 0.0500 time 0.2286 (0.3287) data time 0.0010 (0.0847) model time 0.0000 (0.0000) loss 3.2974 (3.0774) grad_norm 3.2095 (2.7027) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][20/1251] eta 0:05:45 lr 0.000444 wd 0.0500 time 0.2236 (0.2805) data time 0.0009 (0.0449) model time 0.0000 (0.0000) loss 2.8496 (3.0463) grad_norm 1.8901 (2.6839) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][30/1251] eta 0:05:21 lr 0.000444 wd 0.0500 time 0.2323 (0.2635) data time 0.0009 (0.0308) model time 0.0000 (0.0000) loss 3.1383 (3.0868) grad_norm 3.5992 (2.6955) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][40/1251] eta 0:05:07 lr 0.000444 wd 0.0500 time 0.2272 (0.2542) data time 0.0007 (0.0235) model time 0.0000 (0.0000) loss 2.2706 (3.0942) grad_norm 1.9902 (2.6830) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][50/1251] eta 0:04:59 lr 0.000444 wd 0.0500 time 0.2380 (0.2491) data time 0.0006 (0.0191) model time 0.0000 (0.0000) loss 4.2401 (3.0892) grad_norm 4.0251 (2.8055) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][60/1251] eta 0:04:52 lr 0.000444 wd 0.0500 time 0.2359 (0.2455) data time 0.0011 (0.0161) model time 0.2348 (0.2261) loss 2.6989 (3.0654) grad_norm 3.3618 (2.8903) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][70/1251] eta 0:04:46 lr 0.000444 wd 0.0500 time 0.2258 (0.2429) data time 0.0009 (0.0140) model time 0.2249 (0.2261) loss 2.5076 (3.0678) grad_norm 2.3200 (2.8856) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][80/1251] eta 0:04:42 lr 0.000444 wd 0.0500 time 0.2292 (0.2412) data time 0.0011 (0.0125) model time 0.2281 (0.2264) loss 1.8172 (3.0663) grad_norm 2.0479 (2.8624) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][90/1251] eta 0:04:38 lr 0.000444 wd 0.0500 time 0.2444 (0.2397) data 
time 0.0010 (0.0112) model time 0.2434 (0.2265) loss 3.3404 (3.1137) grad_norm 2.3943 (2.8573) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][100/1251] eta 0:04:34 lr 0.000444 wd 0.0500 time 0.2193 (0.2384) data time 0.0009 (0.0102) model time 0.2184 (0.2263) loss 2.9382 (3.1045) grad_norm 3.0347 (2.8400) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][110/1251] eta 0:04:30 lr 0.000444 wd 0.0500 time 0.2295 (0.2375) data time 0.0008 (0.0094) model time 0.2286 (0.2265) loss 2.8192 (3.1013) grad_norm 2.4713 (2.8509) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][120/1251] eta 0:04:27 lr 0.000444 wd 0.0500 time 0.2292 (0.2367) data time 0.0010 (0.0087) model time 0.2282 (0.2265) loss 3.2543 (3.1037) grad_norm 3.2569 (2.8401) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][130/1251] eta 0:04:24 lr 0.000443 wd 0.0500 time 0.2223 (0.2360) data time 0.0010 (0.0081) model time 0.2212 (0.2264) loss 3.3896 (3.0960) grad_norm 3.8625 (2.8265) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][140/1251] eta 0:04:21 lr 0.000443 wd 0.0500 time 0.2266 (0.2356) data time 0.0008 (0.0076) model time 0.2257 (0.2269) loss 3.1979 (3.0998) grad_norm 1.9073 (2.8100) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][150/1251] eta 0:04:18 lr 0.000443 wd 0.0500 time 0.2341 (0.2351) data time 0.0008 (0.0072) model time 0.2333 (0.2269) loss 2.5734 (3.1163) grad_norm 2.5270 (2.7810) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:42 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [171/300][160/1251] eta 0:04:16 lr 0.000443 wd 0.0500 time 0.2292 (0.2347) data time 0.0009 (0.0068) model time 0.2283 (0.2269) loss 4.1355 (3.1132) grad_norm 2.3120 (2.7735) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][170/1251] eta 0:04:13 lr 0.000443 wd 0.0500 time 0.2238 (0.2343) data time 0.0009 (0.0065) model time 0.2229 (0.2270) loss 2.1634 (3.1018) grad_norm 2.7652 (2.7678) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][180/1251] eta 0:04:10 lr 0.000443 wd 0.0500 time 0.2279 (0.2340) data time 0.0008 (0.0062) model time 0.2270 (0.2269) loss 2.3993 (3.1074) grad_norm 2.7719 (2.8215) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][190/1251] eta 0:04:07 lr 0.000443 wd 0.0500 time 0.2265 (0.2336) data time 0.0008 (0.0059) model time 0.2257 (0.2269) loss 3.6671 (3.1107) grad_norm 4.5123 (2.8503) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][200/1251] eta 0:04:05 lr 0.000443 wd 0.0500 time 0.2334 (0.2333) data time 0.0008 (0.0056) model time 0.2326 (0.2269) loss 2.4869 (3.0970) grad_norm 1.8724 (2.8753) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][210/1251] eta 0:04:02 lr 0.000443 wd 0.0500 time 0.2299 (0.2330) data time 0.0007 (0.0054) model time 0.2292 (0.2268) loss 2.2426 (3.0957) grad_norm 2.3567 (2.8641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][220/1251] eta 0:04:00 lr 0.000443 wd 0.0500 time 0.2217 (0.2329) data time 0.0010 (0.0052) model time 0.2207 (0.2269) loss 3.0626 (3.0909) grad_norm 
3.9481 (2.8809) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][230/1251] eta 0:03:57 lr 0.000443 wd 0.0500 time 0.2439 (0.2327) data time 0.0007 (0.0051) model time 0.2432 (0.2270) loss 3.1755 (3.0978) grad_norm 2.5195 (2.8568) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][240/1251] eta 0:03:55 lr 0.000443 wd 0.0500 time 0.2310 (0.2326) data time 0.0007 (0.0049) model time 0.2303 (0.2270) loss 2.3879 (3.1007) grad_norm 2.5559 (2.8627) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][250/1251] eta 0:03:52 lr 0.000443 wd 0.0500 time 0.2261 (0.2324) data time 0.0009 (0.0047) model time 0.2252 (0.2270) loss 3.7825 (3.1059) grad_norm 1.7471 (2.8426) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][260/1251] eta 0:03:50 lr 0.000443 wd 0.0500 time 0.2327 (0.2322) data time 0.0013 (0.0046) model time 0.2313 (0.2270) loss 3.5213 (3.1100) grad_norm 2.5708 (2.8361) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][270/1251] eta 0:03:48 lr 0.000443 wd 0.0500 time 0.4268 (0.2328) data time 0.0007 (0.0045) model time 0.4261 (0.2279) loss 4.1249 (3.1118) grad_norm 3.7862 (2.8344) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][280/1251] eta 0:03:45 lr 0.000443 wd 0.0500 time 0.2262 (0.2326) data time 0.0010 (0.0044) model time 0.2252 (0.2279) loss 2.9253 (3.1166) grad_norm 2.9043 (2.8318) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][290/1251] eta 0:03:43 lr 0.000443 wd 0.0500 time 
0.2247 (0.2325) data time 0.0007 (0.0042) model time 0.2241 (0.2279) loss 3.7583 (3.1257) grad_norm 2.3841 (2.8294) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][300/1251] eta 0:03:41 lr 0.000443 wd 0.0500 time 0.2295 (0.2324) data time 0.0009 (0.0041) model time 0.2286 (0.2280) loss 3.3255 (3.1306) grad_norm 2.5734 (2.8313) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][310/1251] eta 0:03:38 lr 0.000443 wd 0.0500 time 0.2313 (0.2323) data time 0.0010 (0.0040) model time 0.2303 (0.2279) loss 2.5855 (3.1287) grad_norm 2.0816 (2.8263) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][320/1251] eta 0:03:36 lr 0.000443 wd 0.0500 time 0.2265 (0.2322) data time 0.0010 (0.0040) model time 0.2255 (0.2279) loss 2.9255 (3.1314) grad_norm 2.0834 (2.8215) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][330/1251] eta 0:03:33 lr 0.000443 wd 0.0500 time 0.2328 (0.2321) data time 0.0008 (0.0039) model time 0.2321 (0.2279) loss 3.4442 (3.1309) grad_norm 3.0588 (2.8217) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][340/1251] eta 0:03:31 lr 0.000443 wd 0.0500 time 0.2252 (0.2321) data time 0.0007 (0.0038) model time 0.2245 (0.2280) loss 3.1014 (3.1348) grad_norm 5.3906 (2.8180) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][350/1251] eta 0:03:29 lr 0.000443 wd 0.0500 time 0.2238 (0.2324) data time 0.0010 (0.0037) model time 0.2228 (0.2285) loss 2.7271 (3.1393) grad_norm 2.8314 (2.8182) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:28 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][360/1251] eta 0:03:26 lr 0.000442 wd 0.0500 time 0.2280 (0.2323) data time 0.0010 (0.0036) model time 0.2270 (0.2284) loss 3.4324 (3.1439) grad_norm 2.2128 (2.8274) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][370/1251] eta 0:03:24 lr 0.000442 wd 0.0500 time 0.2267 (0.2322) data time 0.0008 (0.0036) model time 0.2259 (0.2284) loss 3.0754 (3.1428) grad_norm 2.1735 (2.8239) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][380/1251] eta 0:03:22 lr 0.000442 wd 0.0500 time 0.2360 (0.2321) data time 0.0007 (0.0035) model time 0.2353 (0.2284) loss 3.6930 (3.1298) grad_norm 1.8703 (2.8223) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][390/1251] eta 0:03:19 lr 0.000442 wd 0.0500 time 0.2258 (0.2320) data time 0.0010 (0.0034) model time 0.2248 (0.2283) loss 3.3051 (3.1266) grad_norm 3.5571 (2.8286) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][400/1251] eta 0:03:17 lr 0.000442 wd 0.0500 time 0.2273 (0.2319) data time 0.0010 (0.0034) model time 0.2263 (0.2282) loss 3.4183 (3.1311) grad_norm 2.2567 (2.8338) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][410/1251] eta 0:03:14 lr 0.000442 wd 0.0500 time 0.2256 (0.2317) data time 0.0008 (0.0033) model time 0.2248 (0.2282) loss 2.1737 (3.1367) grad_norm 1.8342 (2.8230) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][420/1251] eta 0:03:12 lr 0.000442 wd 0.0500 time 0.2205 (0.2316) data time 0.0011 (0.0033) model time 0.2195 (0.2281) loss 3.7124 
(3.1387) grad_norm 2.3872 (2.8295) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][430/1251] eta 0:03:10 lr 0.000442 wd 0.0500 time 0.2271 (0.2315) data time 0.0010 (0.0032) model time 0.2262 (0.2281) loss 3.4480 (3.1390) grad_norm 2.8268 (2.8346) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][440/1251] eta 0:03:07 lr 0.000442 wd 0.0500 time 0.2359 (0.2314) data time 0.0012 (0.0032) model time 0.2346 (0.2280) loss 2.2800 (3.1396) grad_norm 3.8017 (2.8276) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][450/1251] eta 0:03:05 lr 0.000442 wd 0.0500 time 0.2223 (0.2313) data time 0.0012 (0.0031) model time 0.2210 (0.2280) loss 3.3481 (3.1394) grad_norm 3.1685 (2.8570) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][460/1251] eta 0:03:02 lr 0.000442 wd 0.0500 time 0.2232 (0.2312) data time 0.0007 (0.0031) model time 0.2225 (0.2279) loss 2.2043 (3.1360) grad_norm 2.8350 (2.8643) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][470/1251] eta 0:03:00 lr 0.000442 wd 0.0500 time 0.2339 (0.2312) data time 0.0008 (0.0030) model time 0.2331 (0.2280) loss 2.4882 (3.1339) grad_norm 2.4256 (2.8617) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][480/1251] eta 0:02:58 lr 0.000442 wd 0.0500 time 0.2201 (0.2311) data time 0.0007 (0.0030) model time 0.2194 (0.2279) loss 3.2177 (3.1390) grad_norm 2.4369 (2.8590) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][490/1251] eta 0:02:55 lr 
0.000442 wd 0.0500 time 0.2295 (0.2310) data time 0.0007 (0.0030) model time 0.2288 (0.2279) loss 3.7941 (3.1369) grad_norm 2.3717 (2.8746) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:45:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][500/1251] eta 0:02:53 lr 0.000442 wd 0.0500 time 0.2239 (0.2309) data time 0.0010 (0.0029) model time 0.2229 (0.2278) loss 3.7970 (3.1405) grad_norm 2.2319 (2.8671) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][510/1251] eta 0:02:51 lr 0.000442 wd 0.0500 time 0.2335 (0.2308) data time 0.0011 (0.0029) model time 0.2325 (0.2277) loss 2.2363 (3.1319) grad_norm 2.4870 (2.8554) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:45:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][520/1251] eta 0:02:48 lr 0.000442 wd 0.0500 time 0.2246 (0.2308) data time 0.0007 (0.0029) model time 0.2239 (0.2277) loss 3.8994 (3.1360) grad_norm 3.6356 (2.8592) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 11:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][530/1251] eta 0:02:46 lr 0.000442 wd 0.0500 time 0.2208 (0.2307) data time 0.0010 (0.0028) model time 0.2198 (0.2276) loss 3.5684 (3.1382) grad_norm 3.7014 (2.8626) loss_scale 2048.0000 (1025.9284) mem 7381MB [2024-08-27 11:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][540/1251] eta 0:02:43 lr 0.000442 wd 0.0500 time 0.2268 (0.2306) data time 0.0010 (0.0028) model time 0.2258 (0.2276) loss 3.3381 (3.1426) grad_norm 3.2960 (2.8662) loss_scale 2048.0000 (1044.8207) mem 7381MB [2024-08-27 11:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][550/1251] eta 0:02:41 lr 0.000442 wd 0.0500 time 0.2275 (0.2305) data time 0.0009 (0.0028) model time 0.2266 (0.2276) loss 3.4622 (3.1466) grad_norm 3.7929 (2.8696) loss_scale 2048.0000 (1063.0272) mem 7381MB [2024-08-27 
11:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][560/1251] eta 0:02:39 lr 0.000442 wd 0.0500 time 0.2197 (0.2305) data time 0.0009 (0.0027) model time 0.2189 (0.2275) loss 2.5610 (3.1473) grad_norm 1.9366 (2.8630) loss_scale 2048.0000 (1080.5847) mem 7381MB [2024-08-27 11:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][570/1251] eta 0:02:36 lr 0.000442 wd 0.0500 time 0.2228 (0.2304) data time 0.0008 (0.0027) model time 0.2220 (0.2274) loss 3.5865 (3.1501) grad_norm 3.2185 (2.8521) loss_scale 2048.0000 (1097.5271) mem 7381MB [2024-08-27 11:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][580/1251] eta 0:02:34 lr 0.000442 wd 0.0500 time 0.2249 (0.2303) data time 0.0011 (0.0027) model time 0.2239 (0.2274) loss 3.2914 (3.1532) grad_norm 5.8565 (2.8561) loss_scale 2048.0000 (1113.8864) mem 7381MB [2024-08-27 11:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][590/1251] eta 0:02:32 lr 0.000441 wd 0.0500 time 0.2245 (0.2303) data time 0.0007 (0.0026) model time 0.2239 (0.2274) loss 2.2725 (3.1492) grad_norm 2.2167 (2.8506) loss_scale 2048.0000 (1129.6920) mem 7381MB [2024-08-27 11:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][600/1251] eta 0:02:30 lr 0.000441 wd 0.0500 time 0.4368 (0.2309) data time 0.0009 (0.0026) model time 0.4358 (0.2281) loss 3.4741 (3.1512) grad_norm 2.3448 (2.8425) loss_scale 2048.0000 (1144.9717) mem 7381MB [2024-08-27 11:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][610/1251] eta 0:02:28 lr 0.000441 wd 0.0500 time 0.2301 (0.2312) data time 0.0008 (0.0026) model time 0.2293 (0.2285) loss 3.6140 (3.1515) grad_norm 3.5485 (2.8384) loss_scale 2048.0000 (1159.7512) mem 7381MB [2024-08-27 11:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][620/1251] eta 0:02:25 lr 0.000441 wd 0.0500 time 0.2282 (0.2311) data time 0.0009 (0.0026) model time 0.2272 (0.2284) 
loss 3.2889 (3.1516) grad_norm 1.8480 (2.8329) loss_scale 2048.0000 (1174.0548) mem 7381MB [2024-08-27 11:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][630/1251] eta 0:02:23 lr 0.000441 wd 0.0500 time 0.2240 (0.2311) data time 0.0007 (0.0025) model time 0.2234 (0.2284) loss 3.9372 (3.1499) grad_norm 2.1188 (2.8236) loss_scale 2048.0000 (1187.9049) mem 7381MB [2024-08-27 11:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][640/1251] eta 0:02:21 lr 0.000441 wd 0.0500 time 0.2279 (0.2311) data time 0.0010 (0.0025) model time 0.2269 (0.2285) loss 2.6657 (3.1467) grad_norm 2.4860 (2.8171) loss_scale 2048.0000 (1201.3229) mem 7381MB [2024-08-27 11:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][650/1251] eta 0:02:18 lr 0.000441 wd 0.0500 time 0.2336 (0.2311) data time 0.0007 (0.0025) model time 0.2329 (0.2285) loss 3.7234 (3.1472) grad_norm 2.5699 (2.8190) loss_scale 2048.0000 (1214.3287) mem 7381MB [2024-08-27 11:45:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][660/1251] eta 0:02:16 lr 0.000441 wd 0.0500 time 0.2268 (0.2311) data time 0.0007 (0.0025) model time 0.2261 (0.2285) loss 3.4585 (3.1494) grad_norm 2.9169 (2.8164) loss_scale 2048.0000 (1226.9410) mem 7381MB [2024-08-27 11:45:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][670/1251] eta 0:02:14 lr 0.000441 wd 0.0500 time 0.2331 (0.2310) data time 0.0007 (0.0025) model time 0.2324 (0.2285) loss 2.9176 (3.1492) grad_norm 2.1073 (2.8131) loss_scale 2048.0000 (1239.1773) mem 7381MB [2024-08-27 11:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][680/1251] eta 0:02:11 lr 0.000441 wd 0.0500 time 0.2257 (0.2310) data time 0.0010 (0.0024) model time 0.2247 (0.2285) loss 3.5903 (3.1470) grad_norm 1.8830 (2.8060) loss_scale 2048.0000 (1251.0543) mem 7381MB [2024-08-27 11:45:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][690/1251] eta 
0:02:09 lr 0.000441 wd 0.0500 time 0.2328 (0.2310) data time 0.0010 (0.0024) model time 0.2318 (0.2285) loss 3.0125 (3.1464) grad_norm 2.3332 (2.7983) loss_scale 2048.0000 (1262.5876) mem 7381MB [2024-08-27 11:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][700/1251] eta 0:02:07 lr 0.000441 wd 0.0500 time 0.2305 (0.2310) data time 0.0008 (0.0024) model time 0.2298 (0.2285) loss 2.2721 (3.1434) grad_norm 3.1092 (2.8034) loss_scale 2048.0000 (1273.7917) mem 7381MB [2024-08-27 11:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][710/1251] eta 0:02:04 lr 0.000441 wd 0.0500 time 0.2317 (0.2310) data time 0.0009 (0.0024) model time 0.2309 (0.2285) loss 3.7260 (3.1455) grad_norm 2.2457 (2.8002) loss_scale 2048.0000 (1284.6807) mem 7381MB [2024-08-27 11:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][720/1251] eta 0:02:02 lr 0.000441 wd 0.0500 time 0.2237 (0.2310) data time 0.0011 (0.0024) model time 0.2226 (0.2285) loss 2.6088 (3.1429) grad_norm 2.9074 (2.7990) loss_scale 2048.0000 (1295.2677) mem 7381MB [2024-08-27 11:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][730/1251] eta 0:02:00 lr 0.000441 wd 0.0500 time 0.2313 (0.2309) data time 0.0007 (0.0023) model time 0.2306 (0.2285) loss 2.9011 (3.1438) grad_norm 2.8243 (2.8003) loss_scale 2048.0000 (1305.5650) mem 7381MB [2024-08-27 11:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][740/1251] eta 0:01:57 lr 0.000441 wd 0.0500 time 0.2255 (0.2309) data time 0.0008 (0.0023) model time 0.2247 (0.2285) loss 2.9522 (3.1439) grad_norm 4.1134 (2.8059) loss_scale 2048.0000 (1315.5843) mem 7381MB [2024-08-27 11:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][750/1251] eta 0:01:55 lr 0.000441 wd 0.0500 time 0.2231 (0.2309) data time 0.0011 (0.0023) model time 0.2220 (0.2285) loss 3.2380 (3.1443) grad_norm 3.2708 (2.8117) loss_scale 2048.0000 (1325.3369) mem 7381MB 
[2024-08-27 11:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][760/1251] eta 0:01:53 lr 0.000441 wd 0.0500 time 0.2353 (0.2309) data time 0.0011 (0.0023) model time 0.2342 (0.2285) loss 3.0724 (3.1453) grad_norm 2.1748 (2.8074) loss_scale 2048.0000 (1334.8331) mem 7381MB [2024-08-27 11:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][770/1251] eta 0:01:51 lr 0.000441 wd 0.0500 time 0.2261 (0.2309) data time 0.0008 (0.0023) model time 0.2252 (0.2285) loss 2.5239 (3.1427) grad_norm 1.7019 (2.7999) loss_scale 2048.0000 (1344.0830) mem 7381MB [2024-08-27 11:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][780/1251] eta 0:01:48 lr 0.000441 wd 0.0500 time 0.2302 (0.2308) data time 0.0007 (0.0023) model time 0.2295 (0.2285) loss 3.6624 (3.1424) grad_norm 1.8598 (2.7957) loss_scale 2048.0000 (1353.0960) mem 7381MB [2024-08-27 11:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][790/1251] eta 0:01:46 lr 0.000441 wd 0.0500 time 0.2122 (0.2310) data time 0.0007 (0.0022) model time 0.2115 (0.2287) loss 3.4181 (3.1435) grad_norm 2.4591 (2.7910) loss_scale 2048.0000 (1361.8812) mem 7381MB [2024-08-27 11:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][800/1251] eta 0:01:44 lr 0.000441 wd 0.0500 time 0.2335 (0.2309) data time 0.0009 (0.0022) model time 0.2326 (0.2286) loss 2.9728 (3.1448) grad_norm 1.9708 (2.7886) loss_scale 2048.0000 (1370.4469) mem 7381MB [2024-08-27 11:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][810/1251] eta 0:01:41 lr 0.000440 wd 0.0500 time 0.2258 (0.2309) data time 0.0009 (0.0022) model time 0.2249 (0.2286) loss 3.4593 (3.1460) grad_norm 2.6899 (2.7909) loss_scale 2048.0000 (1378.8015) mem 7381MB [2024-08-27 11:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][820/1251] eta 0:01:39 lr 0.000440 wd 0.0500 time 0.2253 (0.2309) data time 0.0008 (0.0022) model time 0.2245 
(0.2286) loss 2.9544 (3.1467) grad_norm 2.8584 (2.7966) loss_scale 2048.0000 (1386.9525) mem 7381MB [2024-08-27 11:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][830/1251] eta 0:01:37 lr 0.000440 wd 0.0500 time 0.2322 (0.2309) data time 0.0007 (0.0022) model time 0.2315 (0.2286) loss 3.5172 (3.1440) grad_norm 2.9962 (2.7995) loss_scale 2048.0000 (1394.9073) mem 7381MB [2024-08-27 11:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][840/1251] eta 0:01:34 lr 0.000440 wd 0.0500 time 0.2366 (0.2309) data time 0.0009 (0.0022) model time 0.2357 (0.2286) loss 3.6174 (3.1455) grad_norm 2.7636 (2.7998) loss_scale 2048.0000 (1402.6730) mem 7381MB [2024-08-27 11:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][850/1251] eta 0:01:32 lr 0.000440 wd 0.0500 time 0.2245 (0.2308) data time 0.0009 (0.0022) model time 0.2237 (0.2286) loss 3.4296 (3.1452) grad_norm 3.2944 (2.7997) loss_scale 2048.0000 (1410.2562) mem 7381MB [2024-08-27 11:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][860/1251] eta 0:01:30 lr 0.000440 wd 0.0500 time 0.2287 (0.2308) data time 0.0009 (0.0021) model time 0.2279 (0.2286) loss 3.0083 (3.1445) grad_norm 2.5679 (2.7939) loss_scale 2048.0000 (1417.6632) mem 7381MB [2024-08-27 11:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][870/1251] eta 0:01:27 lr 0.000440 wd 0.0500 time 0.2265 (0.2308) data time 0.0013 (0.0021) model time 0.2252 (0.2285) loss 3.4367 (3.1431) grad_norm 2.4109 (2.7921) loss_scale 2048.0000 (1424.9001) mem 7381MB [2024-08-27 11:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][880/1251] eta 0:01:25 lr 0.000440 wd 0.0500 time 0.2236 (0.2307) data time 0.0014 (0.0021) model time 0.2222 (0.2285) loss 3.4824 (3.1448) grad_norm 4.5991 (2.8019) loss_scale 2048.0000 (1431.9728) mem 7381MB [2024-08-27 11:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[171/300][890/1251] eta 0:01:23 lr 0.000440 wd 0.0500 time 0.2217 (0.2307) data time 0.0010 (0.0021) model time 0.2207 (0.2285) loss 3.2777 (3.1480) grad_norm 2.8311 (2.8039) loss_scale 2048.0000 (1438.8866) mem 7381MB [2024-08-27 11:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][900/1251] eta 0:01:20 lr 0.000440 wd 0.0500 time 0.2240 (0.2307) data time 0.0007 (0.0021) model time 0.2233 (0.2285) loss 4.1100 (3.1494) grad_norm 1.9032 (2.8041) loss_scale 2048.0000 (1445.6471) mem 7381MB [2024-08-27 11:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][910/1251] eta 0:01:18 lr 0.000440 wd 0.0500 time 0.2238 (0.2306) data time 0.0006 (0.0021) model time 0.2232 (0.2284) loss 3.4062 (3.1528) grad_norm 2.5178 (2.8078) loss_scale 2048.0000 (1452.2591) mem 7381MB [2024-08-27 11:46:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][920/1251] eta 0:01:16 lr 0.000440 wd 0.0500 time 0.2247 (0.2306) data time 0.0010 (0.0021) model time 0.2237 (0.2284) loss 3.4800 (3.1494) grad_norm 4.1287 (2.8116) loss_scale 2048.0000 (1458.7275) mem 7381MB [2024-08-27 11:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][930/1251] eta 0:01:14 lr 0.000440 wd 0.0500 time 0.2273 (0.2306) data time 0.0009 (0.0021) model time 0.2264 (0.2284) loss 2.7787 (3.1528) grad_norm 3.2912 (2.8169) loss_scale 2048.0000 (1465.0569) mem 7381MB [2024-08-27 11:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][940/1251] eta 0:01:11 lr 0.000440 wd 0.0500 time 0.2280 (0.2305) data time 0.0007 (0.0021) model time 0.2274 (0.2284) loss 3.4815 (3.1540) grad_norm 3.3484 (2.8142) loss_scale 2048.0000 (1471.2519) mem 7381MB [2024-08-27 11:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][950/1251] eta 0:01:09 lr 0.000440 wd 0.0500 time 0.2207 (0.2305) data time 0.0008 (0.0020) model time 0.2199 (0.2283) loss 3.5080 (3.1551) grad_norm 2.0003 (2.8108) loss_scale 2048.0000 
(1477.3165) mem 7381MB [2024-08-27 11:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][960/1251] eta 0:01:07 lr 0.000440 wd 0.0500 time 0.2246 (0.2304) data time 0.0011 (0.0020) model time 0.2235 (0.2283) loss 3.7170 (3.1542) grad_norm 4.6019 (2.8133) loss_scale 2048.0000 (1483.2549) mem 7381MB [2024-08-27 11:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][970/1251] eta 0:01:04 lr 0.000440 wd 0.0500 time 0.2273 (0.2304) data time 0.0010 (0.0020) model time 0.2263 (0.2283) loss 2.3817 (3.1535) grad_norm 2.7114 (2.8101) loss_scale 2048.0000 (1489.0711) mem 7381MB [2024-08-27 11:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][980/1251] eta 0:01:02 lr 0.000440 wd 0.0500 time 0.2287 (0.2303) data time 0.0009 (0.0020) model time 0.2279 (0.2282) loss 3.2883 (3.1548) grad_norm 2.3010 (2.8036) loss_scale 2048.0000 (1494.7686) mem 7381MB [2024-08-27 11:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][990/1251] eta 0:01:00 lr 0.000440 wd 0.0500 time 0.2319 (0.2303) data time 0.0007 (0.0020) model time 0.2312 (0.2282) loss 1.9102 (3.1550) grad_norm 3.4348 (2.8010) loss_scale 2048.0000 (1500.3512) mem 7381MB [2024-08-27 11:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1000/1251] eta 0:00:57 lr 0.000440 wd 0.0500 time 0.2255 (0.2303) data time 0.0009 (0.0020) model time 0.2246 (0.2282) loss 1.9509 (3.1547) grad_norm 1.9754 (2.7989) loss_scale 2048.0000 (1505.8222) mem 7381MB [2024-08-27 11:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1010/1251] eta 0:00:55 lr 0.000440 wd 0.0500 time 0.2271 (0.2302) data time 0.0007 (0.0020) model time 0.2263 (0.2282) loss 2.9987 (3.1553) grad_norm 2.4145 (2.7987) loss_scale 2048.0000 (1511.1850) mem 7381MB [2024-08-27 11:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1020/1251] eta 0:00:53 lr 0.000440 wd 0.0500 time 0.2310 (0.2302) data time 0.0009 
(0.0020) model time 0.2302 (0.2282) loss 3.8483 (3.1549) grad_norm 3.5618 (2.8015) loss_scale 2048.0000 (1516.4427) mem 7381MB [2024-08-27 11:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1030/1251] eta 0:00:50 lr 0.000440 wd 0.0500 time 0.2210 (0.2302) data time 0.0008 (0.0020) model time 0.2202 (0.2282) loss 3.6648 (3.1537) grad_norm 2.8721 (2.8025) loss_scale 2048.0000 (1521.5984) mem 7381MB [2024-08-27 11:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1040/1251] eta 0:00:48 lr 0.000439 wd 0.0500 time 0.2271 (0.2302) data time 0.0009 (0.0020) model time 0.2262 (0.2281) loss 3.1899 (3.1546) grad_norm 2.6641 (2.7996) loss_scale 2048.0000 (1526.6551) mem 7381MB [2024-08-27 11:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1050/1251] eta 0:00:46 lr 0.000439 wd 0.0500 time 0.2277 (0.2302) data time 0.0007 (0.0020) model time 0.2270 (0.2281) loss 1.6993 (3.1546) grad_norm 2.2216 (2.7979) loss_scale 2048.0000 (1531.6156) mem 7381MB [2024-08-27 11:47:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1060/1251] eta 0:00:43 lr 0.000439 wd 0.0500 time 0.2233 (0.2302) data time 0.0010 (0.0019) model time 0.2223 (0.2281) loss 3.0235 (3.1555) grad_norm 2.7348 (2.7961) loss_scale 2048.0000 (1536.4826) mem 7381MB [2024-08-27 11:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1070/1251] eta 0:00:41 lr 0.000439 wd 0.0500 time 0.2249 (0.2302) data time 0.0007 (0.0019) model time 0.2242 (0.2281) loss 3.6609 (3.1570) grad_norm 2.8422 (2.7978) loss_scale 2048.0000 (1541.2586) mem 7381MB [2024-08-27 11:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1080/1251] eta 0:00:39 lr 0.000439 wd 0.0500 time 0.2305 (0.2302) data time 0.0009 (0.0019) model time 0.2296 (0.2281) loss 2.8654 (3.1575) grad_norm 1.9835 (2.7939) loss_scale 2048.0000 (1545.9463) mem 7381MB [2024-08-27 11:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [171/300][1090/1251] eta 0:00:37 lr 0.000439 wd 0.0500 time 0.2390 (0.2301) data time 0.0011 (0.0019) model time 0.2379 (0.2281) loss 2.8056 (3.1546) grad_norm 2.8362 (2.7894) loss_scale 2048.0000 (1550.5481) mem 7381MB [2024-08-27 11:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1100/1251] eta 0:00:34 lr 0.000439 wd 0.0500 time 0.2219 (0.2301) data time 0.0007 (0.0019) model time 0.2211 (0.2281) loss 2.8571 (3.1534) grad_norm 2.2699 (2.7859) loss_scale 2048.0000 (1555.0663) mem 7381MB [2024-08-27 11:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1110/1251] eta 0:00:32 lr 0.000439 wd 0.0500 time 0.2225 (0.2301) data time 0.0007 (0.0019) model time 0.2218 (0.2281) loss 2.9370 (3.1534) grad_norm 2.2834 (2.7839) loss_scale 2048.0000 (1559.5032) mem 7381MB [2024-08-27 11:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1120/1251] eta 0:00:30 lr 0.000439 wd 0.0500 time 0.2343 (0.2301) data time 0.0007 (0.0019) model time 0.2336 (0.2281) loss 3.6632 (3.1584) grad_norm 2.4156 (2.7819) loss_scale 2048.0000 (1563.8608) mem 7381MB [2024-08-27 11:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1130/1251] eta 0:00:27 lr 0.000439 wd 0.0500 time 0.2320 (0.2301) data time 0.0012 (0.0019) model time 0.2308 (0.2281) loss 3.1954 (3.1586) grad_norm 1.8759 (2.7808) loss_scale 2048.0000 (1568.1415) mem 7381MB [2024-08-27 11:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1140/1251] eta 0:00:25 lr 0.000439 wd 0.0500 time 0.4492 (0.2306) data time 0.0008 (0.0019) model time 0.4484 (0.2287) loss 2.6432 (3.1591) grad_norm 4.3836 (2.7835) loss_scale 2048.0000 (1572.3471) mem 7381MB [2024-08-27 11:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1150/1251] eta 0:00:23 lr 0.000439 wd 0.0500 time 0.2255 (0.2306) data time 0.0010 (0.0019) model time 0.2245 (0.2286) loss 3.1911 (3.1602) grad_norm 3.5572 (2.7863) 
loss_scale 2048.0000 (1576.4796) mem 7381MB [2024-08-27 11:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1160/1251] eta 0:00:20 lr 0.000439 wd 0.0500 time 0.2312 (0.2306) data time 0.0009 (0.0019) model time 0.2303 (0.2286) loss 2.5372 (3.1599) grad_norm 2.2609 (2.7857) loss_scale 2048.0000 (1580.5409) mem 7381MB [2024-08-27 11:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1170/1251] eta 0:00:18 lr 0.000439 wd 0.0500 time 0.2287 (0.2305) data time 0.0009 (0.0019) model time 0.2278 (0.2286) loss 3.6662 (3.1612) grad_norm 3.3635 (2.7867) loss_scale 2048.0000 (1584.5329) mem 7381MB [2024-08-27 11:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1180/1251] eta 0:00:16 lr 0.000439 wd 0.0500 time 0.2356 (0.2305) data time 0.0008 (0.0018) model time 0.2348 (0.2286) loss 3.0491 (3.1613) grad_norm 1.9573 (2.7843) loss_scale 2048.0000 (1588.4572) mem 7381MB [2024-08-27 11:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1190/1251] eta 0:00:14 lr 0.000439 wd 0.0500 time 0.2224 (0.2305) data time 0.0009 (0.0018) model time 0.2215 (0.2286) loss 3.5282 (3.1585) grad_norm 2.3198 (2.7813) loss_scale 2048.0000 (1592.3157) mem 7381MB [2024-08-27 11:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1200/1251] eta 0:00:11 lr 0.000439 wd 0.0500 time 0.2267 (0.2305) data time 0.0007 (0.0018) model time 0.2260 (0.2286) loss 3.3463 (3.1581) grad_norm 3.6686 (2.7847) loss_scale 2048.0000 (1596.1099) mem 7381MB [2024-08-27 11:47:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1210/1251] eta 0:00:09 lr 0.000439 wd 0.0500 time 0.2228 (0.2304) data time 0.0009 (0.0018) model time 0.2219 (0.2285) loss 3.2363 (3.1602) grad_norm 2.0826 (2.7831) loss_scale 2048.0000 (1599.8415) mem 7381MB [2024-08-27 11:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1220/1251] eta 0:00:07 lr 0.000439 wd 0.0500 time 0.2237 
(0.2304) data time 0.0011 (0.0018) model time 0.2227 (0.2285) loss 3.5815 (3.1602) grad_norm 2.7327 (2.7856) loss_scale 2048.0000 (1603.5119) mem 7381MB [2024-08-27 11:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1230/1251] eta 0:00:04 lr 0.000439 wd 0.0500 time 0.2320 (0.2304) data time 0.0009 (0.0018) model time 0.2311 (0.2285) loss 3.0386 (3.1606) grad_norm 2.5064 (2.7850) loss_scale 2048.0000 (1607.1227) mem 7381MB [2024-08-27 11:47:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1240/1251] eta 0:00:02 lr 0.000439 wd 0.0500 time 0.2136 (0.2303) data time 0.0004 (0.0018) model time 0.2132 (0.2284) loss 2.3777 (3.1599) grad_norm 2.4895 (2.7900) loss_scale 2048.0000 (1610.6753) mem 7381MB [2024-08-27 11:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [171/300][1250/1251] eta 0:00:00 lr 0.000439 wd 0.0500 time 0.4090 (0.2303) data time 0.0004 (0.0018) model time 0.4086 (0.2284) loss 2.3554 (3.1584) grad_norm 3.2323 (2.7903) loss_scale 2048.0000 (1614.1711) mem 7381MB [2024-08-27 11:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 171 training takes 0:04:48 [2024-08-27 11:47:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:47:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 11:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.489 (0.489) Loss 0.4370 (0.4370) Acc@1 91.797 (91.797) Acc@5 97.852 (97.852) Mem 7381MB [2024-08-27 11:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.115) Loss 0.7295 (0.7065) Acc@1 86.035 (84.464) Acc@5 96.973 (96.875) Mem 7381MB [2024-08-27 11:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.098) Loss 1.0215 (0.7267) Acc@1 77.246 (83.840) Acc@5 93.945 (96.889) Mem 7381MB [2024-08-27 11:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.093) Loss 1.2812 (0.8240) Acc@1 68.457 (81.666) Acc@5 91.699 (95.791) Mem 7381MB [2024-08-27 11:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.1562 (0.8758) Acc@1 72.949 (80.273) Acc@5 92.676 (95.162) Mem 7381MB [2024-08-27 11:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.810 Acc@5 95.088 [2024-08-27 11:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.8% [2024-08-27 11:48:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.943 (0.943) Loss 0.4028 (0.4028) Acc@1 93.164 (93.164) Acc@5 98.535 (98.535) Mem 7381MB [2024-08-27 11:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.163) Loss 0.6211 (0.6254) Acc@1 87.207 (86.612) Acc@5 97.559 (97.479) Mem 7381MB [2024-08-27 11:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.124) Loss 0.8911 (0.6502) Acc@1 78.613 (85.691) Acc@5 95.605 (97.433) Mem 7381MB [2024-08-27 11:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.109) Loss 1.1250 (0.7376) Acc@1 72.168 (83.575) Acc@5 92.871 (96.472) Mem 7381MB [2024-08-27 11:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.098) Loss 1.0088 
(0.7821) Acc@1 75.293 (82.267) Acc@5 94.141 (96.015) Mem 7381MB [2024-08-27 11:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.850 Acc@5 95.992 [2024-08-27 11:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.9% [2024-08-27 11:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][0/1251] eta 0:26:40 lr 0.000439 wd 0.0500 time 1.2792 (1.2792) data time 0.5890 (0.5890) model time 0.0000 (0.0000) loss 3.6274 (3.6274) grad_norm 2.3050 (2.3050) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][10/1251] eta 0:06:38 lr 0.000439 wd 0.0500 time 0.2266 (0.3215) data time 0.0007 (0.0545) model time 0.0000 (0.0000) loss 1.8393 (2.9236) grad_norm 2.1644 (2.8170) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][20/1251] eta 0:06:03 lr 0.000438 wd 0.0500 time 0.2364 (0.2950) data time 0.0007 (0.0294) model time 0.0000 (0.0000) loss 2.1279 (2.8822) grad_norm 2.0749 (2.8219) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][30/1251] eta 0:05:40 lr 0.000438 wd 0.0500 time 0.2241 (0.2788) data time 0.0009 (0.0213) model time 0.0000 (0.0000) loss 2.9518 (3.0331) grad_norm 2.0882 (2.6818) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][40/1251] eta 0:05:22 lr 0.000438 wd 0.0500 time 0.2389 (0.2667) data time 0.0007 (0.0164) model time 0.0000 (0.0000) loss 3.0265 (3.0633) grad_norm 2.4033 (2.6513) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][50/1251] eta 0:05:11 lr 0.000438 wd 0.0500 time 0.2215 (0.2590) data time 0.0007 (0.0134) model time 0.0000 (0.0000) 
loss 3.3024 (3.0561) grad_norm 2.8597 (2.7020) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][60/1251] eta 0:05:02 lr 0.000438 wd 0.0500 time 0.2242 (0.2537) data time 0.0007 (0.0114) model time 0.2235 (0.2256) loss 3.9310 (3.0594) grad_norm 2.2641 (2.6807) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][70/1251] eta 0:04:55 lr 0.000438 wd 0.0500 time 0.2314 (0.2501) data time 0.0008 (0.0099) model time 0.2306 (0.2264) loss 2.8203 (3.0647) grad_norm 2.1794 (2.6145) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][80/1251] eta 0:04:49 lr 0.000438 wd 0.0500 time 0.2186 (0.2474) data time 0.0010 (0.0088) model time 0.2175 (0.2267) loss 3.6112 (3.0424) grad_norm 1.7920 (2.6514) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][90/1251] eta 0:04:44 lr 0.000438 wd 0.0500 time 0.2299 (0.2454) data time 0.0010 (0.0080) model time 0.2288 (0.2270) loss 2.6526 (3.0783) grad_norm 2.7795 (2.7004) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][100/1251] eta 0:04:40 lr 0.000438 wd 0.0500 time 0.2269 (0.2435) data time 0.0006 (0.0073) model time 0.2262 (0.2267) loss 3.5746 (3.0928) grad_norm 3.0656 (2.7124) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][110/1251] eta 0:04:36 lr 0.000438 wd 0.0500 time 0.2260 (0.2420) data time 0.0009 (0.0068) model time 0.2250 (0.2265) loss 3.4737 (3.1031) grad_norm 2.2567 (2.7120) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][120/1251] eta 0:04:32 
lr 0.000438 wd 0.0500 time 0.2309 (0.2410) data time 0.0007 (0.0063) model time 0.2303 (0.2268) loss 3.2185 (3.0940) grad_norm 1.8131 (2.6885) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][130/1251] eta 0:04:29 lr 0.000438 wd 0.0500 time 0.2254 (0.2400) data time 0.0009 (0.0059) model time 0.2244 (0.2268) loss 3.0649 (3.1059) grad_norm 2.8575 (2.6934) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][140/1251] eta 0:04:25 lr 0.000438 wd 0.0500 time 0.2247 (0.2391) data time 0.0011 (0.0055) model time 0.2237 (0.2268) loss 3.3188 (3.0947) grad_norm 2.6148 (2.6834) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][150/1251] eta 0:04:22 lr 0.000438 wd 0.0500 time 0.2368 (0.2384) data time 0.0009 (0.0052) model time 0.2359 (0.2268) loss 2.7234 (3.1105) grad_norm 2.5331 (2.6745) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][160/1251] eta 0:04:19 lr 0.000438 wd 0.0500 time 0.2229 (0.2377) data time 0.0010 (0.0050) model time 0.2220 (0.2267) loss 3.8488 (3.1069) grad_norm 2.0352 (2.6623) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][170/1251] eta 0:04:16 lr 0.000438 wd 0.0500 time 0.2212 (0.2371) data time 0.0010 (0.0048) model time 0.2202 (0.2267) loss 3.5241 (3.1108) grad_norm 2.4779 (2.6618) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][180/1251] eta 0:04:13 lr 0.000438 wd 0.0500 time 0.2272 (0.2367) data time 0.0009 (0.0046) model time 0.2263 (0.2268) loss 2.1269 (3.0911) grad_norm 2.6209 (2.6802) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 
11:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][190/1251] eta 0:04:10 lr 0.000438 wd 0.0500 time 0.2314 (0.2362) data time 0.0013 (0.0044) model time 0.2302 (0.2269) loss 2.7244 (3.0884) grad_norm 2.4158 (2.6708) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][200/1251] eta 0:04:07 lr 0.000438 wd 0.0500 time 0.2375 (0.2359) data time 0.0007 (0.0042) model time 0.2368 (0.2270) loss 3.4855 (3.0947) grad_norm 2.0182 (2.6597) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][210/1251] eta 0:04:05 lr 0.000438 wd 0.0500 time 0.2348 (0.2356) data time 0.0009 (0.0041) model time 0.2339 (0.2270) loss 3.2920 (3.0933) grad_norm 2.0467 (2.6553) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][220/1251] eta 0:04:02 lr 0.000438 wd 0.0500 time 0.2256 (0.2352) data time 0.0010 (0.0039) model time 0.2246 (0.2270) loss 3.2959 (3.0812) grad_norm 2.5308 (2.6869) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][230/1251] eta 0:03:59 lr 0.000438 wd 0.0500 time 0.2281 (0.2349) data time 0.0008 (0.0038) model time 0.2273 (0.2270) loss 3.0583 (3.0823) grad_norm 2.1458 (2.6816) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][240/1251] eta 0:03:57 lr 0.000437 wd 0.0500 time 0.2253 (0.2346) data time 0.0007 (0.0037) model time 0.2246 (0.2270) loss 2.3100 (3.0785) grad_norm 3.3531 (2.6889) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][250/1251] eta 0:03:54 lr 0.000437 wd 0.0500 time 0.2342 (0.2344) data time 0.0007 (0.0036) model time 0.2334 (0.2271) 
loss 3.4083 (3.0907) grad_norm 2.2618 (2.6811) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][260/1251] eta 0:03:52 lr 0.000437 wd 0.0500 time 0.2229 (0.2342) data time 0.0007 (0.0035) model time 0.2222 (0.2271) loss 2.9617 (3.0913) grad_norm 2.7414 (2.6896) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][270/1251] eta 0:03:49 lr 0.000437 wd 0.0500 time 0.2194 (0.2338) data time 0.0010 (0.0034) model time 0.2184 (0.2269) loss 3.5035 (3.0935) grad_norm 4.0090 (2.7037) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][280/1251] eta 0:03:46 lr 0.000437 wd 0.0500 time 0.2225 (0.2336) data time 0.0011 (0.0033) model time 0.2214 (0.2269) loss 2.2388 (3.0828) grad_norm 1.9475 (2.6989) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][290/1251] eta 0:03:44 lr 0.000437 wd 0.0500 time 0.2253 (0.2334) data time 0.0010 (0.0032) model time 0.2242 (0.2269) loss 3.1445 (3.0914) grad_norm 4.1585 (2.7132) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][300/1251] eta 0:03:41 lr 0.000437 wd 0.0500 time 0.2252 (0.2332) data time 0.0009 (0.0032) model time 0.2243 (0.2269) loss 2.4674 (3.0938) grad_norm 1.8035 (2.7097) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][310/1251] eta 0:03:39 lr 0.000437 wd 0.0500 time 0.2248 (0.2332) data time 0.0007 (0.0031) model time 0.2241 (0.2270) loss 3.2709 (3.0977) grad_norm 2.5457 (2.7128) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][320/1251] eta 
0:03:36 lr 0.000437 wd 0.0500 time 0.2283 (0.2331) data time 0.0010 (0.0030) model time 0.2273 (0.2271) loss 3.5015 (3.1041) grad_norm 3.4424 (2.7134) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][330/1251] eta 0:03:34 lr 0.000437 wd 0.0500 time 0.2211 (0.2330) data time 0.0008 (0.0030) model time 0.2202 (0.2272) loss 3.5304 (3.1054) grad_norm 2.8401 (2.7053) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][340/1251] eta 0:03:32 lr 0.000437 wd 0.0500 time 0.2296 (0.2329) data time 0.0010 (0.0029) model time 0.2285 (0.2272) loss 3.4506 (3.1050) grad_norm 2.2491 (2.7002) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][350/1251] eta 0:03:29 lr 0.000437 wd 0.0500 time 0.2260 (0.2328) data time 0.0007 (0.0029) model time 0.2253 (0.2272) loss 3.9571 (3.1058) grad_norm 2.8838 (2.7032) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][360/1251] eta 0:03:27 lr 0.000437 wd 0.0500 time 0.2192 (0.2326) data time 0.0012 (0.0028) model time 0.2179 (0.2272) loss 2.6005 (3.1069) grad_norm 4.7368 (2.7155) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][370/1251] eta 0:03:24 lr 0.000437 wd 0.0500 time 0.2314 (0.2325) data time 0.0009 (0.0028) model time 0.2305 (0.2272) loss 2.9993 (3.1095) grad_norm 2.6247 (2.7208) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][380/1251] eta 0:03:22 lr 0.000437 wd 0.0500 time 0.2285 (0.2326) data time 0.0012 (0.0027) model time 0.2273 (0.2274) loss 2.9819 (3.1105) grad_norm 2.4588 (2.7273) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-27 11:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][390/1251] eta 0:03:20 lr 0.000437 wd 0.0500 time 0.2270 (0.2329) data time 0.0008 (0.0027) model time 0.2261 (0.2279) loss 3.1813 (3.1100) grad_norm 2.2576 (2.7501) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][400/1251] eta 0:03:18 lr 0.000437 wd 0.0500 time 0.2242 (0.2328) data time 0.0009 (0.0027) model time 0.2233 (0.2278) loss 3.3106 (3.1088) grad_norm 2.5160 (2.7465) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][410/1251] eta 0:03:15 lr 0.000437 wd 0.0500 time 0.2276 (0.2327) data time 0.0012 (0.0026) model time 0.2264 (0.2278) loss 3.3653 (3.1108) grad_norm 3.0617 (2.7425) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][420/1251] eta 0:03:14 lr 0.000437 wd 0.0500 time 0.2297 (0.2335) data time 0.0008 (0.0026) model time 0.2289 (0.2289) loss 3.6957 (3.1050) grad_norm 2.4423 (2.7398) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][430/1251] eta 0:03:12 lr 0.000437 wd 0.0500 time 0.2394 (0.2343) data time 0.0007 (0.0025) model time 0.2388 (0.2299) loss 1.9508 (3.1013) grad_norm 2.6420 (2.7388) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][440/1251] eta 0:03:09 lr 0.000437 wd 0.0500 time 0.2313 (0.2342) data time 0.0006 (0.0025) model time 0.2307 (0.2299) loss 3.2479 (3.1051) grad_norm 3.4615 (2.7484) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][450/1251] eta 0:03:07 lr 0.000437 wd 0.0500 time 0.2191 (0.2341) data time 0.0009 (0.0025) model time 0.2182 
(0.2299) loss 3.4218 (3.1009) grad_norm 1.9161 (2.7466) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][460/1251] eta 0:03:05 lr 0.000437 wd 0.0500 time 0.2226 (0.2340) data time 0.0008 (0.0025) model time 0.2219 (0.2298) loss 3.5735 (3.1062) grad_norm 2.0941 (2.7365) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][470/1251] eta 0:03:02 lr 0.000436 wd 0.0500 time 0.2427 (0.2340) data time 0.0009 (0.0024) model time 0.2417 (0.2298) loss 3.4789 (3.1108) grad_norm 2.1264 (2.7347) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][480/1251] eta 0:03:00 lr 0.000436 wd 0.0500 time 0.2303 (0.2338) data time 0.0007 (0.0024) model time 0.2296 (0.2298) loss 3.5184 (3.1116) grad_norm 3.9230 (2.7422) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][490/1251] eta 0:02:57 lr 0.000436 wd 0.0500 time 0.2266 (0.2338) data time 0.0011 (0.0024) model time 0.2255 (0.2297) loss 3.1839 (3.1104) grad_norm 2.6343 (2.7442) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][500/1251] eta 0:02:55 lr 0.000436 wd 0.0500 time 0.2260 (0.2337) data time 0.0008 (0.0023) model time 0.2252 (0.2297) loss 2.5988 (3.1127) grad_norm 2.4851 (2.7472) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][510/1251] eta 0:02:53 lr 0.000436 wd 0.0500 time 0.2280 (0.2336) data time 0.0007 (0.0023) model time 0.2273 (0.2297) loss 3.2784 (3.1164) grad_norm 3.5752 (2.7422) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[172/300][520/1251] eta 0:02:50 lr 0.000436 wd 0.0500 time 0.2201 (0.2335) data time 0.0007 (0.0023) model time 0.2194 (0.2296) loss 3.4822 (3.1125) grad_norm 3.1402 (2.7356) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][530/1251] eta 0:02:48 lr 0.000436 wd 0.0500 time 0.2351 (0.2334) data time 0.0009 (0.0023) model time 0.2342 (0.2296) loss 2.3928 (3.1139) grad_norm 2.3796 (2.7250) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][540/1251] eta 0:02:45 lr 0.000436 wd 0.0500 time 0.2302 (0.2333) data time 0.0006 (0.0023) model time 0.2296 (0.2296) loss 2.6887 (3.1090) grad_norm 1.7203 (2.7204) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][550/1251] eta 0:02:43 lr 0.000436 wd 0.0500 time 0.2270 (0.2333) data time 0.0009 (0.0022) model time 0.2261 (0.2296) loss 3.2391 (3.1084) grad_norm 2.2774 (2.7160) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][560/1251] eta 0:02:41 lr 0.000436 wd 0.0500 time 0.2359 (0.2332) data time 0.0012 (0.0022) model time 0.2347 (0.2296) loss 2.5888 (3.1071) grad_norm 3.0642 (2.7115) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][570/1251] eta 0:02:38 lr 0.000436 wd 0.0500 time 0.2516 (0.2332) data time 0.0008 (0.0022) model time 0.2508 (0.2296) loss 3.0467 (3.1096) grad_norm 2.4645 (2.7120) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][580/1251] eta 0:02:36 lr 0.000436 wd 0.0500 time 0.2334 (0.2331) data time 0.0009 (0.0022) model time 0.2325 (0.2295) loss 3.9133 (3.1098) grad_norm 3.1133 (2.7161) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-27 11:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][590/1251] eta 0:02:34 lr 0.000436 wd 0.0500 time 0.2279 (0.2331) data time 0.0007 (0.0022) model time 0.2272 (0.2295) loss 3.4251 (3.1133) grad_norm 2.2461 (2.7174) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][600/1251] eta 0:02:31 lr 0.000436 wd 0.0500 time 0.2303 (0.2330) data time 0.0007 (0.0021) model time 0.2296 (0.2295) loss 2.6812 (3.1132) grad_norm 2.5366 (2.7158) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][610/1251] eta 0:02:29 lr 0.000436 wd 0.0500 time 0.2224 (0.2329) data time 0.0010 (0.0021) model time 0.2215 (0.2295) loss 3.6975 (3.1126) grad_norm 2.2553 (2.7188) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][620/1251] eta 0:02:26 lr 0.000436 wd 0.0500 time 0.2256 (0.2329) data time 0.0010 (0.0021) model time 0.2246 (0.2294) loss 3.1334 (3.1151) grad_norm 2.3292 (2.7157) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][630/1251] eta 0:02:24 lr 0.000436 wd 0.0500 time 0.2255 (0.2328) data time 0.0008 (0.0021) model time 0.2247 (0.2294) loss 3.2659 (3.1196) grad_norm 3.6208 (2.7171) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][640/1251] eta 0:02:22 lr 0.000436 wd 0.0500 time 0.2324 (0.2327) data time 0.0010 (0.0021) model time 0.2314 (0.2293) loss 3.1873 (3.1232) grad_norm 2.0957 (2.7188) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][650/1251] eta 0:02:19 lr 0.000436 wd 0.0500 time 0.2236 (0.2327) data time 0.0009 
(0.0021) model time 0.2227 (0.2293) loss 3.3715 (3.1243) grad_norm 3.1011 (2.7196) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][660/1251] eta 0:02:17 lr 0.000436 wd 0.0500 time 0.2233 (0.2326) data time 0.0007 (0.0021) model time 0.2225 (0.2293) loss 3.6061 (3.1228) grad_norm 3.1134 (2.7194) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][670/1251] eta 0:02:15 lr 0.000436 wd 0.0500 time 0.2275 (0.2325) data time 0.0007 (0.0020) model time 0.2268 (0.2292) loss 2.2812 (3.1222) grad_norm 3.5977 (2.7239) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][680/1251] eta 0:02:12 lr 0.000436 wd 0.0500 time 0.2243 (0.2325) data time 0.0013 (0.0020) model time 0.2230 (0.2292) loss 3.1821 (3.1228) grad_norm 3.1408 (2.7200) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][690/1251] eta 0:02:10 lr 0.000436 wd 0.0500 time 0.2246 (0.2324) data time 0.0011 (0.0020) model time 0.2236 (0.2292) loss 3.1217 (3.1240) grad_norm 3.3721 (2.7166) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][700/1251] eta 0:02:08 lr 0.000435 wd 0.0500 time 0.2289 (0.2324) data time 0.0007 (0.0020) model time 0.2283 (0.2292) loss 2.3973 (3.1231) grad_norm 2.1955 (2.7154) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][710/1251] eta 0:02:05 lr 0.000435 wd 0.0500 time 0.2276 (0.2323) data time 0.0007 (0.0020) model time 0.2269 (0.2292) loss 3.5297 (3.1235) grad_norm 2.6913 (2.7135) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [172/300][720/1251] eta 0:02:03 lr 0.000435 wd 0.0500 time 0.2201 (0.2323) data time 0.0007 (0.0020) model time 0.2193 (0.2292) loss 4.1607 (3.1269) grad_norm 1.8727 (2.7106) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][730/1251] eta 0:02:01 lr 0.000435 wd 0.0500 time 0.2261 (0.2323) data time 0.0008 (0.0020) model time 0.2254 (0.2292) loss 2.9156 (3.1254) grad_norm 2.1661 (2.7152) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][740/1251] eta 0:01:58 lr 0.000435 wd 0.0500 time 0.2257 (0.2323) data time 0.0011 (0.0019) model time 0.2245 (0.2292) loss 2.9701 (3.1233) grad_norm 3.0506 (2.7140) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][750/1251] eta 0:01:56 lr 0.000435 wd 0.0500 time 0.2322 (0.2322) data time 0.0009 (0.0019) model time 0.2313 (0.2292) loss 2.3011 (3.1270) grad_norm 2.4627 (2.7140) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][760/1251] eta 0:01:54 lr 0.000435 wd 0.0500 time 0.2318 (0.2322) data time 0.0009 (0.0019) model time 0.2309 (0.2292) loss 2.9558 (3.1261) grad_norm 2.5143 (2.7151) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][770/1251] eta 0:01:51 lr 0.000435 wd 0.0500 time 0.2265 (0.2321) data time 0.0010 (0.0019) model time 0.2255 (0.2291) loss 2.0967 (3.1288) grad_norm 2.7464 (2.7109) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][780/1251] eta 0:01:49 lr 0.000435 wd 0.0500 time 0.2257 (0.2321) data time 0.0007 (0.0019) model time 0.2250 (0.2291) loss 3.9425 (3.1294) grad_norm 3.7792 (2.7133) loss_scale 
2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][790/1251] eta 0:01:46 lr 0.000435 wd 0.0500 time 0.2319 (0.2320) data time 0.0009 (0.0019) model time 0.2310 (0.2291) loss 3.3576 (3.1280) grad_norm 4.5230 (2.7188) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][800/1251] eta 0:01:44 lr 0.000435 wd 0.0500 time 0.2325 (0.2320) data time 0.0006 (0.0019) model time 0.2319 (0.2291) loss 3.6446 (3.1280) grad_norm 2.4587 (2.7172) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][810/1251] eta 0:01:42 lr 0.000435 wd 0.0500 time 0.2271 (0.2320) data time 0.0010 (0.0019) model time 0.2261 (0.2291) loss 3.2064 (3.1261) grad_norm 1.9820 (2.7155) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][820/1251] eta 0:01:39 lr 0.000435 wd 0.0500 time 0.2279 (0.2320) data time 0.0010 (0.0019) model time 0.2269 (0.2291) loss 3.0183 (3.1254) grad_norm 2.9535 (2.7119) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][830/1251] eta 0:01:37 lr 0.000435 wd 0.0500 time 0.2310 (0.2320) data time 0.0008 (0.0019) model time 0.2302 (0.2291) loss 4.1540 (3.1249) grad_norm 4.1753 (2.7117) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][840/1251] eta 0:01:35 lr 0.000435 wd 0.0500 time 0.2302 (0.2320) data time 0.0007 (0.0019) model time 0.2294 (0.2291) loss 3.4866 (3.1239) grad_norm 2.2002 (2.7121) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][850/1251] eta 0:01:33 lr 0.000435 wd 0.0500 time 0.2284 (0.2319) data time 
0.0012 (0.0019) model time 0.2272 (0.2291) loss 3.1541 (3.1273) grad_norm 2.2388 (2.7131) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][860/1251] eta 0:01:30 lr 0.000435 wd 0.0500 time 0.2285 (0.2319) data time 0.0008 (0.0018) model time 0.2277 (0.2290) loss 3.3409 (3.1254) grad_norm 2.7127 (2.7145) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][870/1251] eta 0:01:28 lr 0.000435 wd 0.0500 time 0.2358 (0.2319) data time 0.0009 (0.0018) model time 0.2349 (0.2291) loss 3.0267 (3.1248) grad_norm 2.8868 (2.7147) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][880/1251] eta 0:01:26 lr 0.000435 wd 0.0500 time 0.2281 (0.2318) data time 0.0010 (0.0018) model time 0.2272 (0.2291) loss 2.1157 (3.1242) grad_norm 2.5531 (2.7273) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][890/1251] eta 0:01:23 lr 0.000435 wd 0.0500 time 0.2322 (0.2318) data time 0.0008 (0.0018) model time 0.2313 (0.2291) loss 4.0586 (3.1258) grad_norm 3.9025 (2.7303) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][900/1251] eta 0:01:21 lr 0.000435 wd 0.0500 time 0.2238 (0.2318) data time 0.0010 (0.0018) model time 0.2228 (0.2291) loss 3.7569 (3.1239) grad_norm 2.4692 (2.7290) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][910/1251] eta 0:01:19 lr 0.000435 wd 0.0500 time 0.2220 (0.2318) data time 0.0009 (0.0018) model time 0.2211 (0.2291) loss 3.4054 (3.1258) grad_norm 2.0797 (2.7246) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [172/300][920/1251] eta 0:01:16 lr 0.000435 wd 0.0500 time 0.2329 (0.2318) data time 0.0008 (0.0018) model time 0.2321 (0.2291) loss 3.4812 (3.1264) grad_norm 2.3745 (2.7244) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][930/1251] eta 0:01:14 lr 0.000434 wd 0.0500 time 0.2274 (0.2318) data time 0.0009 (0.0018) model time 0.2265 (0.2291) loss 3.3298 (3.1270) grad_norm 3.2929 (2.7232) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][940/1251] eta 0:01:12 lr 0.000434 wd 0.0500 time 0.3847 (0.2324) data time 0.0012 (0.0018) model time 0.3835 (0.2297) loss 3.4531 (3.1278) grad_norm 2.7016 (2.7180) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][950/1251] eta 0:01:10 lr 0.000434 wd 0.0500 time 0.2322 (0.2329) data time 0.0007 (0.0018) model time 0.2315 (0.2303) loss 3.0468 (3.1271) grad_norm 2.3704 (2.7151) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][960/1251] eta 0:01:07 lr 0.000434 wd 0.0500 time 0.2252 (0.2329) data time 0.0009 (0.0018) model time 0.2243 (0.2303) loss 3.2628 (3.1287) grad_norm 2.1362 (2.7127) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][970/1251] eta 0:01:05 lr 0.000434 wd 0.0500 time 0.2263 (0.2328) data time 0.0008 (0.0018) model time 0.2255 (0.2302) loss 2.3279 (3.1313) grad_norm 2.3334 (2.7084) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][980/1251] eta 0:01:03 lr 0.000434 wd 0.0500 time 0.2283 (0.2328) data time 0.0010 (0.0018) model time 0.2273 (0.2302) loss 3.5044 (3.1308) grad_norm 3.2561 (2.7078) 
loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][990/1251] eta 0:01:00 lr 0.000434 wd 0.0500 time 0.2276 (0.2328) data time 0.0013 (0.0018) model time 0.2263 (0.2302) loss 3.8113 (3.1311) grad_norm 2.1662 (2.7125) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1000/1251] eta 0:00:58 lr 0.000434 wd 0.0500 time 0.2283 (0.2327) data time 0.0009 (0.0017) model time 0.2274 (0.2302) loss 2.8679 (3.1304) grad_norm 2.8661 (2.7154) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1010/1251] eta 0:00:56 lr 0.000434 wd 0.0500 time 0.2288 (0.2327) data time 0.0008 (0.0017) model time 0.2280 (0.2302) loss 3.7093 (3.1301) grad_norm 1.8177 (2.7105) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1020/1251] eta 0:00:53 lr 0.000434 wd 0.0500 time 0.2265 (0.2327) data time 0.0009 (0.0017) model time 0.2256 (0.2302) loss 2.5079 (3.1295) grad_norm 2.3868 (2.7063) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1030/1251] eta 0:00:51 lr 0.000434 wd 0.0500 time 0.2271 (0.2327) data time 0.0011 (0.0017) model time 0.2260 (0.2302) loss 3.5402 (3.1292) grad_norm 1.7911 (2.7014) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1040/1251] eta 0:00:49 lr 0.000434 wd 0.0500 time 0.2264 (0.2326) data time 0.0009 (0.0017) model time 0.2255 (0.2301) loss 2.3134 (3.1312) grad_norm 2.8503 (2.7008) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1050/1251] eta 0:00:46 lr 0.000434 wd 0.0500 time 0.2288 
(0.2326) data time 0.0006 (0.0017) model time 0.2281 (0.2301) loss 3.8000 (3.1317) grad_norm 2.5424 (2.7053) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1060/1251] eta 0:00:44 lr 0.000434 wd 0.0500 time 0.2268 (0.2326) data time 0.0010 (0.0017) model time 0.2259 (0.2301) loss 2.8423 (3.1320) grad_norm 5.4585 (2.7108) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1070/1251] eta 0:00:42 lr 0.000434 wd 0.0500 time 0.2256 (0.2325) data time 0.0013 (0.0017) model time 0.2244 (0.2301) loss 2.3906 (3.1330) grad_norm 2.5451 (2.7126) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1080/1251] eta 0:00:39 lr 0.000434 wd 0.0500 time 0.2254 (0.2325) data time 0.0010 (0.0017) model time 0.2244 (0.2300) loss 3.2677 (3.1346) grad_norm 3.1171 (2.7142) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1090/1251] eta 0:00:37 lr 0.000434 wd 0.0500 time 0.2238 (0.2324) data time 0.0008 (0.0017) model time 0.2231 (0.2300) loss 3.5508 (3.1337) grad_norm 2.1797 (2.7128) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1100/1251] eta 0:00:35 lr 0.000434 wd 0.0500 time 0.2286 (0.2324) data time 0.0008 (0.0017) model time 0.2278 (0.2300) loss 3.4091 (3.1352) grad_norm 2.3107 (2.7089) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1110/1251] eta 0:00:32 lr 0.000434 wd 0.0500 time 0.2288 (0.2324) data time 0.0011 (0.0017) model time 0.2277 (0.2300) loss 3.0657 (3.1371) grad_norm 2.4044 (2.7088) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:24 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1120/1251] eta 0:00:30 lr 0.000434 wd 0.0500 time 0.2209 (0.2323) data time 0.0017 (0.0017) model time 0.2192 (0.2299) loss 3.6088 (3.1410) grad_norm 4.0228 (2.7082) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1130/1251] eta 0:00:28 lr 0.000434 wd 0.0500 time 0.2269 (0.2323) data time 0.0009 (0.0017) model time 0.2260 (0.2299) loss 2.9768 (3.1418) grad_norm 2.6279 (2.7067) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1140/1251] eta 0:00:25 lr 0.000434 wd 0.0500 time 0.2295 (0.2323) data time 0.0009 (0.0017) model time 0.2286 (0.2299) loss 3.7915 (3.1406) grad_norm 2.2205 (2.7095) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1150/1251] eta 0:00:23 lr 0.000433 wd 0.0500 time 0.2268 (0.2322) data time 0.0007 (0.0017) model time 0.2261 (0.2299) loss 3.6636 (3.1388) grad_norm 7.2039 (2.7112) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1160/1251] eta 0:00:21 lr 0.000433 wd 0.0500 time 0.2358 (0.2322) data time 0.0007 (0.0017) model time 0.2351 (0.2298) loss 3.0752 (3.1395) grad_norm 2.2796 (2.7115) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1170/1251] eta 0:00:18 lr 0.000433 wd 0.0500 time 0.2233 (0.2322) data time 0.0010 (0.0017) model time 0.2223 (0.2298) loss 3.4413 (3.1406) grad_norm 2.0652 (2.7132) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1180/1251] eta 0:00:16 lr 0.000433 wd 0.0500 time 0.2304 (0.2321) data time 0.0006 (0.0017) model time 0.2298 (0.2298) loss 
2.0336 (3.1374) grad_norm 2.4088 (2.7133) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1190/1251] eta 0:00:14 lr 0.000433 wd 0.0500 time 0.2238 (0.2321) data time 0.0007 (0.0017) model time 0.2231 (0.2297) loss 2.9042 (3.1387) grad_norm 2.0323 (2.7172) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1200/1251] eta 0:00:11 lr 0.000433 wd 0.0500 time 0.2258 (0.2321) data time 0.0010 (0.0017) model time 0.2248 (0.2297) loss 2.3870 (3.1355) grad_norm 2.8269 (2.7154) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1210/1251] eta 0:00:09 lr 0.000433 wd 0.0500 time 0.2281 (0.2320) data time 0.0007 (0.0017) model time 0.2274 (0.2297) loss 3.0683 (3.1360) grad_norm 3.4106 (2.7151) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1220/1251] eta 0:00:07 lr 0.000433 wd 0.0500 time 0.2276 (0.2320) data time 0.0007 (0.0017) model time 0.2269 (0.2297) loss 3.5928 (3.1356) grad_norm 3.3204 (2.7164) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1230/1251] eta 0:00:04 lr 0.000433 wd 0.0500 time 0.2265 (0.2320) data time 0.0009 (0.0017) model time 0.2256 (0.2296) loss 1.8671 (3.1345) grad_norm 2.3240 (2.7141) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1240/1251] eta 0:00:02 lr 0.000433 wd 0.0500 time 0.2112 (0.2319) data time 0.0006 (0.0017) model time 0.2106 (0.2296) loss 2.2512 (3.1339) grad_norm 2.6950 (2.7159) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [172/300][1250/1251] eta 
0:00:00 lr 0.000433 wd 0.0500 time 0.2130 (0.2317) data time 0.0004 (0.0017) model time 0.2126 (0.2294) loss 2.2622 (3.1335) grad_norm 3.7182 (2.7193) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 172 training takes 0:04:49 [2024-08-27 11:52:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:52:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.513 (0.513) Loss 0.4316 (0.4316) Acc@1 91.797 (91.797) Acc@5 98.047 (98.047) Mem 7381MB [2024-08-27 11:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.117) Loss 0.7178 (0.6861) Acc@1 85.840 (84.872) Acc@5 96.387 (97.115) Mem 7381MB [2024-08-27 11:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.101) Loss 0.9912 (0.7064) Acc@1 76.953 (84.110) Acc@5 93.848 (97.056) Mem 7381MB [2024-08-27 11:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.094) Loss 1.2227 (0.8036) Acc@1 71.387 (81.858) Acc@5 91.211 (95.895) Mem 7381MB [2024-08-27 11:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.1650 (0.8588) Acc@1 72.266 (80.423) Acc@5 92.969 (95.291) Mem 7381MB [2024-08-27 11:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.022 Acc@5 95.236 [2024-08-27 11:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.0% [2024-08-27 11:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.02% [2024-08-27 11:52:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-08-27 11:52:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 11:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.463 (0.463) Loss 0.4021 (0.4021) Acc@1 93.066 (93.066) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-27 11:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.117) Loss 0.6201 (0.6254) Acc@1 87.500 (86.657) Acc@5 97.656 (97.461) Mem 7381MB [2024-08-27 11:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.100) Loss 0.8916 (0.6501) Acc@1 78.906 (85.770) Acc@5 95.508 (97.424) Mem 7381MB [2024-08-27 11:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.094) Loss 1.1230 (0.7370) Acc@1 72.070 (83.632) Acc@5 93.066 (96.484) Mem 7381MB [2024-08-27 11:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0078 (0.7813) Acc@1 75.391 (82.322) Acc@5 94.141 (96.022) Mem 7381MB [2024-08-27 11:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.900 Acc@5 95.996 [2024-08-27 11:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.9% [2024-08-27 11:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.90% [2024-08-27 11:53:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 11:53:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 11:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][0/1251] eta 0:15:32 lr 0.000433 wd 0.0500 time 0.7455 (0.7455) data time 0.4826 (0.4826) model time 0.0000 (0.0000) loss 2.6919 (2.6919) grad_norm 3.6930 (3.6930) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][10/1251] eta 0:05:40 lr 0.000433 wd 0.0500 time 0.2244 (0.2744) data time 0.0006 (0.0449) model time 0.0000 (0.0000) loss 3.5596 (3.0234) grad_norm 3.5952 (2.7941) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][20/1251] eta 0:05:10 lr 0.000433 wd 0.0500 time 0.2258 (0.2520) data time 0.0013 (0.0240) model time 0.0000 (0.0000) loss 2.2525 (2.9487) grad_norm 2.8830 (2.6039) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 11:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][30/1251] eta 0:04:58 lr 0.000433 wd 0.0500 time 0.2312 (0.2447) data time 0.0008 (0.0166) model time 0.0000 (0.0000) loss 3.7268 (3.1034) grad_norm 2.4602 (2.7650) loss_scale 4096.0000 (2246.1935) mem 7381MB [2024-08-27 11:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][40/1251] eta 0:04:51 lr 0.000433 wd 0.0500 time 0.2299 (0.2409) data time 0.0010 (0.0129) model time 0.0000 (0.0000) loss 2.6313 (3.0611) grad_norm 1.9134 (2.7289) loss_scale 4096.0000 (2697.3659) mem 7381MB [2024-08-27 11:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][50/1251] eta 0:04:46 lr 0.000433 wd 0.0500 time 0.2271 (0.2386) data time 0.0007 (0.0105) model time 0.0000 (0.0000) loss 2.8737 (3.0826) grad_norm 2.6654 (2.7187) loss_scale 4096.0000 (2971.6078) mem 7381MB [2024-08-27 11:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][60/1251] eta 0:04:42 lr 0.000433 wd 0.0500 time 0.2307 (0.2376) data time 0.0009 (0.0090) model time 0.2298 
(0.2311) loss 3.5556 (3.1006) grad_norm 2.7534 (2.7351) loss_scale 4096.0000 (3155.9344) mem 7381MB [2024-08-27 11:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][70/1251] eta 0:04:39 lr 0.000433 wd 0.0500 time 0.2287 (0.2364) data time 0.0011 (0.0079) model time 0.2276 (0.2298) loss 3.2244 (3.1192) grad_norm 2.0572 (2.8205) loss_scale 4096.0000 (3288.3380) mem 7381MB [2024-08-27 11:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][80/1251] eta 0:04:35 lr 0.000433 wd 0.0500 time 0.2216 (0.2353) data time 0.0009 (0.0070) model time 0.2207 (0.2285) loss 3.8671 (3.1184) grad_norm 2.1082 (2.8271) loss_scale 4096.0000 (3388.0494) mem 7381MB [2024-08-27 11:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][90/1251] eta 0:04:32 lr 0.000433 wd 0.0500 time 0.2319 (0.2345) data time 0.0010 (0.0064) model time 0.2309 (0.2280) loss 2.7735 (3.0791) grad_norm 1.6728 (2.7996) loss_scale 4096.0000 (3465.8462) mem 7381MB [2024-08-27 11:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][100/1251] eta 0:04:29 lr 0.000433 wd 0.0500 time 0.2302 (0.2338) data time 0.0011 (0.0059) model time 0.2291 (0.2278) loss 3.4046 (3.0782) grad_norm 2.9152 (2.7696) loss_scale 4096.0000 (3528.2376) mem 7381MB [2024-08-27 11:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][110/1251] eta 0:04:26 lr 0.000433 wd 0.0500 time 0.2316 (0.2333) data time 0.0011 (0.0054) model time 0.2304 (0.2277) loss 3.3377 (3.0850) grad_norm 2.3774 (2.7694) loss_scale 4096.0000 (3579.3874) mem 7381MB [2024-08-27 11:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][120/1251] eta 0:04:23 lr 0.000433 wd 0.0500 time 0.2319 (0.2329) data time 0.0008 (0.0051) model time 0.2311 (0.2276) loss 3.6264 (3.0770) grad_norm 4.7761 (2.7887) loss_scale 4096.0000 (3622.0826) mem 7381MB [2024-08-27 11:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][130/1251] 
eta 0:04:20 lr 0.000432 wd 0.0500 time 0.2285 (0.2325) data time 0.0008 (0.0048) model time 0.2277 (0.2274) loss 3.3713 (3.0853) grad_norm 2.7444 (2.7731) loss_scale 4096.0000 (3658.2595) mem 7381MB [2024-08-27 11:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][140/1251] eta 0:04:17 lr 0.000432 wd 0.0500 time 0.2260 (0.2322) data time 0.0010 (0.0045) model time 0.2249 (0.2274) loss 3.2336 (3.0856) grad_norm 2.9582 (2.7600) loss_scale 4096.0000 (3689.3050) mem 7381MB [2024-08-27 11:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][150/1251] eta 0:04:15 lr 0.000432 wd 0.0500 time 0.2280 (0.2320) data time 0.0012 (0.0043) model time 0.2267 (0.2274) loss 3.1031 (3.1013) grad_norm 2.4885 (2.7359) loss_scale 4096.0000 (3716.2384) mem 7381MB [2024-08-27 11:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][160/1251] eta 0:04:12 lr 0.000432 wd 0.0500 time 0.2262 (0.2318) data time 0.0007 (0.0041) model time 0.2255 (0.2275) loss 2.5129 (3.0943) grad_norm 2.5873 (2.7118) loss_scale 4096.0000 (3739.8261) mem 7381MB [2024-08-27 11:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][170/1251] eta 0:04:10 lr 0.000432 wd 0.0500 time 0.2278 (0.2315) data time 0.0011 (0.0039) model time 0.2267 (0.2274) loss 2.3655 (3.0968) grad_norm 2.0640 (2.6974) loss_scale 4096.0000 (3760.6550) mem 7381MB [2024-08-27 11:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][180/1251] eta 0:04:07 lr 0.000432 wd 0.0500 time 0.2259 (0.2313) data time 0.0009 (0.0038) model time 0.2250 (0.2273) loss 3.9474 (3.1143) grad_norm 2.5982 (2.6827) loss_scale 4096.0000 (3779.1823) mem 7381MB [2024-08-27 11:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][190/1251] eta 0:04:05 lr 0.000432 wd 0.0500 time 0.2281 (0.2311) data time 0.0006 (0.0036) model time 0.2274 (0.2272) loss 3.7590 (3.1183) grad_norm 2.1261 (2.6710) loss_scale 4096.0000 (3795.7696) mem 7381MB 
[2024-08-27 11:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][200/1251] eta 0:04:02 lr 0.000432 wd 0.0500 time 0.2363 (0.2310) data time 0.0007 (0.0035) model time 0.2356 (0.2272) loss 3.1378 (3.1086) grad_norm 2.5030 (2.6534) loss_scale 4096.0000 (3810.7065) mem 7381MB [2024-08-27 11:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][210/1251] eta 0:04:03 lr 0.000432 wd 0.0500 time 0.2111 (0.2335) data time 0.0008 (0.0034) model time 0.2103 (0.2308) loss 2.7612 (3.1140) grad_norm 2.5551 (2.6735) loss_scale 4096.0000 (3824.2275) mem 7381MB [2024-08-27 11:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][220/1251] eta 0:04:02 lr 0.000432 wd 0.0500 time 0.2332 (0.2348) data time 0.0010 (0.0033) model time 0.2322 (0.2326) loss 2.3037 (3.1237) grad_norm 3.0672 (2.7213) loss_scale 4096.0000 (3836.5249) mem 7381MB [2024-08-27 11:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][230/1251] eta 0:03:59 lr 0.000432 wd 0.0500 time 0.2324 (0.2346) data time 0.0009 (0.0032) model time 0.2315 (0.2324) loss 3.0353 (3.1110) grad_norm 2.5489 (2.7275) loss_scale 4096.0000 (3847.7576) mem 7381MB [2024-08-27 11:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][240/1251] eta 0:03:56 lr 0.000432 wd 0.0500 time 0.2296 (0.2343) data time 0.0007 (0.0031) model time 0.2290 (0.2321) loss 3.0896 (3.1085) grad_norm 4.4043 (2.7486) loss_scale 4096.0000 (3858.0581) mem 7381MB [2024-08-27 11:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][250/1251] eta 0:03:54 lr 0.000432 wd 0.0500 time 0.2272 (0.2342) data time 0.0007 (0.0030) model time 0.2265 (0.2319) loss 2.3451 (3.1092) grad_norm 2.5293 (2.7311) loss_scale 4096.0000 (3867.5378) mem 7381MB [2024-08-27 11:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 11:54:03 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 11:54:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 11:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 11:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 11:56:17 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 11:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 11:56:26 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 11:56:28 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 11:56:29 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 11:56:29 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 173) [2024-08-27 11:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 11:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][260/1251] eta 0:30:13 lr 0.000432 wd 0.0500 time 0.2323 (1.8296) data time 0.0009 (0.1054) model time 0.2314 (1.7243) loss 3.6534 (3.5474) grad_norm 2.3709 (2.6257) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][270/1251] eta 0:16:07 lr 0.000432 wd 0.0500 time 0.2414 (0.9867) data time 0.0012 (0.0506) model time 0.2402 (0.9361) loss 3.4751 (3.4124) grad_norm 2.9865 (2.8301) loss_scale 4096.0000 (4096.0000) 
mem 7379MB [2024-08-27 11:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][280/1251] eta 0:11:43 lr 0.000432 wd 0.0500 time 0.2218 (0.7245) data time 0.0009 (0.0335) model time 0.2209 (0.6910) loss 4.0690 (3.4678) grad_norm 2.2209 (2.8947) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][290/1251] eta 0:09:34 lr 0.000432 wd 0.0500 time 0.2395 (0.5977) data time 0.0009 (0.0252) model time 0.2386 (0.5725) loss 3.0661 (3.3756) grad_norm 2.3777 (2.9290) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][300/1251] eta 0:08:16 lr 0.000432 wd 0.0500 time 0.2214 (0.5218) data time 0.0009 (0.0203) model time 0.2204 (0.5015) loss 3.3213 (3.3386) grad_norm 2.0659 (2.8226) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][310/1251] eta 0:07:24 lr 0.000432 wd 0.0500 time 0.2231 (0.4721) data time 0.0007 (0.0171) model time 0.2224 (0.4551) loss 2.3843 (3.2825) grad_norm 2.2333 (2.8637) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][320/1251] eta 0:06:46 lr 0.000432 wd 0.0500 time 0.2250 (0.4368) data time 0.0010 (0.0148) model time 0.2240 (0.4221) loss 3.3190 (3.2646) grad_norm 3.1139 (3.0190) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][330/1251] eta 0:06:17 lr 0.000432 wd 0.0500 time 0.2259 (0.4101) data time 0.0010 (0.0130) model time 0.2249 (0.3970) loss 2.7396 (3.2346) grad_norm 2.7532 (2.9742) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][340/1251] eta 0:05:54 lr 0.000432 wd 0.0500 time 0.2424 (0.3896) data time 0.0007 (0.0117) model 
time 0.2418 (0.3779) loss 3.2349 (3.2226) grad_norm 2.2208 (2.9923) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][350/1251] eta 0:05:36 lr 0.000432 wd 0.0500 time 0.2305 (0.3732) data time 0.0011 (0.0106) model time 0.2294 (0.3625) loss 3.0624 (3.2366) grad_norm 2.7348 (2.9647) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][360/1251] eta 0:05:20 lr 0.000431 wd 0.0500 time 0.2397 (0.3601) data time 0.0008 (0.0098) model time 0.2389 (0.3503) loss 3.3806 (3.2398) grad_norm 3.1564 (2.9150) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][370/1251] eta 0:05:07 lr 0.000431 wd 0.0500 time 0.2298 (0.3490) data time 0.0011 (0.0090) model time 0.2287 (0.3400) loss 3.6870 (3.2391) grad_norm 2.9054 (2.8970) loss_scale 4096.0000 (4096.0000) mem 7379MB [2024-08-27 11:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][380/1251] eta 0:04:55 lr 0.000431 wd 0.0500 time 0.2187 (0.3396) data time 0.0010 (0.0084) model time 0.2177 (0.3312) loss 3.0294 (3.2206) grad_norm 2.7557 (inf) loss_scale 2048.0000 (4016.6202) mem 7379MB [2024-08-27 11:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][390/1251] eta 0:04:45 lr 0.000431 wd 0.0500 time 0.2245 (0.3316) data time 0.0008 (0.0079) model time 0.2238 (0.3237) loss 3.6414 (3.2163) grad_norm 2.3419 (inf) loss_scale 2048.0000 (3874.9928) mem 7379MB [2024-08-27 11:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][400/1251] eta 0:04:36 lr 0.000431 wd 0.0500 time 0.2318 (0.3248) data time 0.0008 (0.0075) model time 0.2310 (0.3173) loss 2.6249 (3.2035) grad_norm 2.8666 (inf) loss_scale 2048.0000 (3752.3758) mem 7379MB [2024-08-27 11:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[173/300][410/1251] eta 0:04:28 lr 0.000431 wd 0.0500 time 0.2396 (0.3189) data time 0.0010 (0.0071) model time 0.2386 (0.3119) loss 3.8381 (3.1999) grad_norm 2.8332 (inf) loss_scale 2048.0000 (3645.1824) mem 7379MB [2024-08-27 11:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][420/1251] eta 0:04:20 lr 0.000431 wd 0.0500 time 0.2204 (0.3137) data time 0.0014 (0.0067) model time 0.2191 (0.3069) loss 3.1018 (3.1973) grad_norm 3.4143 (inf) loss_scale 2048.0000 (3550.6746) mem 7379MB [2024-08-27 11:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][430/1251] eta 0:04:13 lr 0.000431 wd 0.0500 time 0.2340 (0.3092) data time 0.0010 (0.0064) model time 0.2330 (0.3027) loss 2.9680 (3.1807) grad_norm 1.8794 (inf) loss_scale 2048.0000 (3466.7263) mem 7379MB [2024-08-27 11:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][440/1251] eta 0:04:07 lr 0.000431 wd 0.0500 time 0.2263 (0.3051) data time 0.0007 (0.0062) model time 0.2256 (0.2989) loss 3.7225 (3.1816) grad_norm 3.5617 (inf) loss_scale 2048.0000 (3391.6614) mem 7379MB [2024-08-27 11:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][450/1251] eta 0:04:01 lr 0.000431 wd 0.0500 time 0.2253 (0.3013) data time 0.0013 (0.0059) model time 0.2241 (0.2954) loss 2.4591 (3.1640) grad_norm 2.2945 (inf) loss_scale 2048.0000 (3324.1407) mem 7379MB [2024-08-27 11:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][460/1251] eta 0:03:55 lr 0.000431 wd 0.0500 time 0.2384 (0.2980) data time 0.0008 (0.0057) model time 0.2376 (0.2923) loss 3.2679 (3.1548) grad_norm 3.7591 (inf) loss_scale 2048.0000 (3263.0813) mem 7379MB [2024-08-27 11:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][470/1251] eta 0:03:50 lr 0.000431 wd 0.0500 time 0.2555 (0.2950) data time 0.0008 (0.0055) model time 0.2547 (0.2895) loss 3.8176 (3.1510) grad_norm 2.5648 (inf) loss_scale 2048.0000 (3207.5982) mem 7379MB 
[2024-08-27 11:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][480/1251] eta 0:03:45 lr 0.000431 wd 0.0500 time 0.2321 (0.2921) data time 0.0010 (0.0053) model time 0.2310 (0.2868) loss 2.6317 (3.1522) grad_norm 2.1189 (inf) loss_scale 2048.0000 (3156.9607) mem 7379MB [2024-08-27 11:57:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][490/1251] eta 0:03:40 lr 0.000431 wd 0.0500 time 0.2307 (0.2895) data time 0.0007 (0.0051) model time 0.2300 (0.2844) loss 2.2791 (3.1446) grad_norm 2.8385 (inf) loss_scale 2048.0000 (3110.5607) mem 7379MB [2024-08-27 11:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][500/1251] eta 0:03:35 lr 0.000431 wd 0.0500 time 0.2308 (0.2871) data time 0.0007 (0.0050) model time 0.2301 (0.2821) loss 2.8499 (3.1364) grad_norm 2.8066 (inf) loss_scale 2048.0000 (3067.8876) mem 7379MB [2024-08-27 11:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][510/1251] eta 0:03:31 lr 0.000431 wd 0.0500 time 0.2295 (0.2849) data time 0.0012 (0.0048) model time 0.2284 (0.2801) loss 3.0049 (3.1271) grad_norm 3.8136 (inf) loss_scale 2048.0000 (3028.5097) mem 7379MB [2024-08-27 11:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][520/1251] eta 0:03:26 lr 0.000431 wd 0.0500 time 0.2305 (0.2829) data time 0.0008 (0.0047) model time 0.2297 (0.2782) loss 2.5052 (3.1210) grad_norm 2.0778 (inf) loss_scale 2048.0000 (2992.0595) mem 7379MB [2024-08-27 11:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][530/1251] eta 0:03:22 lr 0.000431 wd 0.0500 time 0.2302 (0.2809) data time 0.0011 (0.0046) model time 0.2291 (0.2764) loss 3.7168 (3.1319) grad_norm 4.1613 (inf) loss_scale 2048.0000 (2958.2222) mem 7379MB [2024-08-27 11:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][540/1251] eta 0:03:19 lr 0.000431 wd 0.0500 time 0.2245 (0.2799) data time 0.0013 (0.0044) model time 0.2232 (0.2755) loss 
2.9235 (3.1321) grad_norm 2.9005 (inf) loss_scale 2048.0000 (2926.7266) mem 7379MB [2024-08-27 11:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][550/1251] eta 0:03:14 lr 0.000431 wd 0.0500 time 0.2177 (0.2781) data time 0.0014 (0.0043) model time 0.2163 (0.2738) loss 3.4329 (3.1172) grad_norm 3.4326 (inf) loss_scale 2048.0000 (2897.3378) mem 7379MB [2024-08-27 11:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][560/1251] eta 0:03:11 lr 0.000431 wd 0.0500 time 0.2305 (0.2773) data time 0.0010 (0.0043) model time 0.2296 (0.2731) loss 3.3793 (3.1146) grad_norm 2.6582 (inf) loss_scale 2048.0000 (2869.8511) mem 7379MB [2024-08-27 11:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][570/1251] eta 0:03:07 lr 0.000431 wd 0.0500 time 0.2297 (0.2759) data time 0.0009 (0.0042) model time 0.2288 (0.2717) loss 3.6965 (3.1250) grad_norm 2.0546 (inf) loss_scale 2048.0000 (2844.0878) mem 7379MB [2024-08-27 11:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][580/1251] eta 0:03:04 lr 0.000431 wd 0.0500 time 0.2259 (0.2745) data time 0.0011 (0.0041) model time 0.2248 (0.2704) loss 2.2691 (3.1311) grad_norm 2.3337 (inf) loss_scale 2048.0000 (2819.8906) mem 7379MB [2024-08-27 11:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][590/1251] eta 0:03:00 lr 0.000430 wd 0.0500 time 0.2304 (0.2731) data time 0.0009 (0.0040) model time 0.2295 (0.2691) loss 2.7452 (3.1301) grad_norm 2.2466 (inf) loss_scale 2048.0000 (2797.1209) mem 7379MB [2024-08-27 11:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][600/1251] eta 0:02:57 lr 0.000430 wd 0.0500 time 0.2440 (0.2719) data time 0.0010 (0.0039) model time 0.2430 (0.2680) loss 3.7087 (3.1336) grad_norm 2.3240 (inf) loss_scale 2048.0000 (2775.6562) mem 7379MB [2024-08-27 11:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][610/1251] eta 0:02:53 lr 0.000430 wd 0.0500 
time 0.2305 (0.2707) data time 0.0007 (0.0038) model time 0.2297 (0.2669) loss 2.7704 (3.1329) grad_norm 2.7081 (inf) loss_scale 2048.0000 (2755.3872) mem 7379MB [2024-08-27 11:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][620/1251] eta 0:02:50 lr 0.000430 wd 0.0500 time 0.2269 (0.2695) data time 0.0013 (0.0038) model time 0.2256 (0.2658) loss 3.5020 (3.1332) grad_norm 2.2642 (inf) loss_scale 2048.0000 (2736.2168) mem 7379MB [2024-08-27 11:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][630/1251] eta 0:02:46 lr 0.000430 wd 0.0500 time 0.2292 (0.2684) data time 0.0007 (0.0037) model time 0.2285 (0.2648) loss 3.4642 (3.1320) grad_norm 3.0613 (inf) loss_scale 2048.0000 (2718.0580) mem 7379MB [2024-08-27 11:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][640/1251] eta 0:02:43 lr 0.000430 wd 0.0500 time 0.2250 (0.2674) data time 0.0007 (0.0036) model time 0.2243 (0.2638) loss 3.5193 (3.1268) grad_norm 4.7649 (inf) loss_scale 2048.0000 (2700.8329) mem 7379MB [2024-08-27 11:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][650/1251] eta 0:02:40 lr 0.000430 wd 0.0500 time 0.2204 (0.2664) data time 0.0008 (0.0036) model time 0.2196 (0.2628) loss 3.7777 (3.1324) grad_norm 2.8285 (inf) loss_scale 2048.0000 (2684.4712) mem 7379MB [2024-08-27 11:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][660/1251] eta 0:02:36 lr 0.000430 wd 0.0500 time 0.2273 (0.2655) data time 0.0010 (0.0035) model time 0.2263 (0.2620) loss 2.9196 (3.1372) grad_norm 3.0148 (inf) loss_scale 2048.0000 (2668.9095) mem 7379MB [2024-08-27 11:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][670/1251] eta 0:02:33 lr 0.000430 wd 0.0500 time 0.2279 (0.2646) data time 0.0007 (0.0034) model time 0.2272 (0.2612) loss 2.2873 (3.1371) grad_norm 1.9706 (inf) loss_scale 2048.0000 (2654.0907) mem 7379MB [2024-08-27 11:58:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [173/300][680/1251] eta 0:02:30 lr 0.000430 wd 0.0500 time 0.2321 (0.2637) data time 0.0008 (0.0034) model time 0.2313 (0.2604) loss 3.3434 (3.1449) grad_norm 2.8129 (inf) loss_scale 2048.0000 (2639.9627) mem 7379MB [2024-08-27 11:58:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][690/1251] eta 0:02:27 lr 0.000430 wd 0.0500 time 0.2212 (0.2630) data time 0.0009 (0.0033) model time 0.2203 (0.2596) loss 3.2325 (3.1498) grad_norm 2.1248 (inf) loss_scale 2048.0000 (2626.4784) mem 7379MB [2024-08-27 11:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][700/1251] eta 0:02:24 lr 0.000430 wd 0.0500 time 0.2289 (0.2622) data time 0.0012 (0.0033) model time 0.2277 (0.2590) loss 3.7381 (3.1504) grad_norm 3.1927 (inf) loss_scale 2048.0000 (2613.5947) mem 7379MB [2024-08-27 11:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][710/1251] eta 0:02:21 lr 0.000430 wd 0.0500 time 0.2267 (0.2615) data time 0.0010 (0.0032) model time 0.2257 (0.2583) loss 2.7945 (3.1454) grad_norm 4.1288 (inf) loss_scale 2048.0000 (2601.2723) mem 7379MB [2024-08-27 11:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][720/1251] eta 0:02:18 lr 0.000430 wd 0.0500 time 0.2314 (0.2608) data time 0.0008 (0.0032) model time 0.2305 (0.2577) loss 2.0231 (3.1399) grad_norm 3.0104 (inf) loss_scale 2048.0000 (2589.4755) mem 7379MB [2024-08-27 11:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][730/1251] eta 0:02:15 lr 0.000430 wd 0.0500 time 0.2223 (0.2601) data time 0.0010 (0.0031) model time 0.2213 (0.2570) loss 3.6591 (3.1369) grad_norm 4.5816 (inf) loss_scale 2048.0000 (2578.1712) mem 7379MB [2024-08-27 11:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][740/1251] eta 0:02:12 lr 0.000430 wd 0.0500 time 0.2309 (0.2595) data time 0.0010 (0.0031) model time 0.2300 (0.2564) loss 2.9628 (3.1392) grad_norm 2.1587 (inf) 
loss_scale 2048.0000 (2567.3292) mem 7379MB [2024-08-27 11:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][750/1251] eta 0:02:09 lr 0.000430 wd 0.0500 time 0.2293 (0.2588) data time 0.0009 (0.0031) model time 0.2284 (0.2558) loss 3.1283 (3.1391) grad_norm 3.2886 (inf) loss_scale 2048.0000 (2556.9218) mem 7379MB [2024-08-27 11:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][760/1251] eta 0:02:06 lr 0.000430 wd 0.0500 time 0.2260 (0.2582) data time 0.0007 (0.0030) model time 0.2252 (0.2552) loss 3.8014 (3.1409) grad_norm 2.3689 (inf) loss_scale 2048.0000 (2546.9234) mem 7379MB [2024-08-27 11:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][770/1251] eta 0:02:03 lr 0.000430 wd 0.0500 time 0.2280 (0.2576) data time 0.0009 (0.0030) model time 0.2271 (0.2546) loss 2.0896 (3.1435) grad_norm 2.3160 (inf) loss_scale 2048.0000 (2537.3102) mem 7379MB [2024-08-27 11:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][780/1251] eta 0:02:01 lr 0.000430 wd 0.0500 time 0.2269 (0.2571) data time 0.0009 (0.0029) model time 0.2260 (0.2541) loss 3.6359 (3.1400) grad_norm 2.4221 (inf) loss_scale 2048.0000 (2528.0605) mem 7379MB [2024-08-27 11:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][790/1251] eta 0:01:58 lr 0.000430 wd 0.0500 time 0.2284 (0.2565) data time 0.0010 (0.0029) model time 0.2274 (0.2536) loss 3.0117 (3.1373) grad_norm 2.9779 (inf) loss_scale 2048.0000 (2519.1540) mem 7379MB [2024-08-27 11:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][800/1251] eta 0:01:55 lr 0.000430 wd 0.0500 time 0.2219 (0.2560) data time 0.0007 (0.0029) model time 0.2212 (0.2531) loss 3.9643 (3.1383) grad_norm 3.2643 (inf) loss_scale 2048.0000 (2510.5719) mem 7379MB [2024-08-27 11:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][810/1251] eta 0:01:52 lr 0.000429 wd 0.0500 time 0.2237 (0.2555) data time 0.0011 
(0.0028) model time 0.2225 (0.2527) loss 3.8335 (3.1428) grad_norm 2.3471 (inf) loss_scale 2048.0000 (2502.2970) mem 7379MB [2024-08-27 11:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][820/1251] eta 0:01:49 lr 0.000429 wd 0.0500 time 0.2224 (0.2550) data time 0.0009 (0.0028) model time 0.2215 (0.2522) loss 2.7208 (3.1450) grad_norm 2.0627 (inf) loss_scale 2048.0000 (2494.3128) mem 7379MB [2024-08-27 11:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][830/1251] eta 0:01:47 lr 0.000429 wd 0.0500 time 0.2263 (0.2546) data time 0.0007 (0.0028) model time 0.2255 (0.2518) loss 3.9674 (3.1465) grad_norm 3.2973 (inf) loss_scale 2048.0000 (2486.6045) mem 7379MB [2024-08-27 11:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][840/1251] eta 0:01:44 lr 0.000429 wd 0.0500 time 0.2270 (0.2541) data time 0.0011 (0.0028) model time 0.2259 (0.2514) loss 2.5626 (3.1461) grad_norm 1.9349 (inf) loss_scale 2048.0000 (2479.1579) mem 7379MB [2024-08-27 11:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][850/1251] eta 0:01:41 lr 0.000429 wd 0.0500 time 0.2230 (0.2537) data time 0.0015 (0.0027) model time 0.2216 (0.2510) loss 2.8845 (3.1472) grad_norm 2.6092 (inf) loss_scale 2048.0000 (2471.9599) mem 7379MB [2024-08-27 11:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][860/1251] eta 0:01:39 lr 0.000429 wd 0.0500 time 0.2369 (0.2533) data time 0.0008 (0.0027) model time 0.2361 (0.2506) loss 3.7089 (3.1465) grad_norm 2.4339 (inf) loss_scale 2048.0000 (2464.9984) mem 7379MB [2024-08-27 11:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][870/1251] eta 0:01:36 lr 0.000429 wd 0.0500 time 0.2298 (0.2529) data time 0.0011 (0.0027) model time 0.2286 (0.2502) loss 3.6400 (3.1469) grad_norm 4.9486 (inf) loss_scale 2048.0000 (2458.2617) mem 7379MB [2024-08-27 11:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[173/300][880/1251] eta 0:01:33 lr 0.000429 wd 0.0500 time 0.2341 (0.2526) data time 0.0010 (0.0027) model time 0.2331 (0.2499) loss 2.9053 (3.1494) grad_norm 4.3551 (inf) loss_scale 2048.0000 (2451.7393) mem 7379MB [2024-08-27 11:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][890/1251] eta 0:01:31 lr 0.000429 wd 0.0500 time 0.2282 (0.2522) data time 0.0007 (0.0026) model time 0.2276 (0.2496) loss 3.4888 (3.1516) grad_norm 2.3733 (inf) loss_scale 2048.0000 (2445.4210) mem 7379MB [2024-08-27 11:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][900/1251] eta 0:01:28 lr 0.000429 wd 0.0500 time 0.2330 (0.2518) data time 0.0008 (0.0026) model time 0.2322 (0.2492) loss 3.1518 (3.1482) grad_norm 2.7283 (inf) loss_scale 2048.0000 (2439.2974) mem 7379MB [2024-08-27 11:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][910/1251] eta 0:01:25 lr 0.000429 wd 0.0500 time 0.2266 (0.2515) data time 0.0010 (0.0026) model time 0.2255 (0.2490) loss 3.2263 (3.1469) grad_norm 2.3729 (inf) loss_scale 2048.0000 (2433.3596) mem 7379MB [2024-08-27 11:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][920/1251] eta 0:01:23 lr 0.000429 wd 0.0500 time 0.2292 (0.2512) data time 0.0009 (0.0026) model time 0.2283 (0.2487) loss 3.6645 (3.1482) grad_norm 2.4994 (inf) loss_scale 2048.0000 (2427.5994) mem 7379MB [2024-08-27 11:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][930/1251] eta 0:01:20 lr 0.000429 wd 0.0500 time 0.2345 (0.2509) data time 0.0007 (0.0025) model time 0.2338 (0.2483) loss 3.2468 (3.1520) grad_norm 2.8489 (inf) loss_scale 2048.0000 (2422.0088) mem 7379MB [2024-08-27 11:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][940/1251] eta 0:01:17 lr 0.000429 wd 0.0500 time 0.2321 (0.2506) data time 0.0009 (0.0025) model time 0.2313 (0.2481) loss 2.4583 (3.1524) grad_norm 3.3322 (inf) loss_scale 2048.0000 (2416.5806) mem 7379MB 
[2024-08-27 11:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][950/1251] eta 0:01:15 lr 0.000429 wd 0.0500 time 0.2293 (0.2503) data time 0.0011 (0.0025) model time 0.2282 (0.2478) loss 2.7147 (3.1495) grad_norm 1.7907 (inf) loss_scale 2048.0000 (2411.3076) mem 7379MB [2024-08-27 11:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][960/1251] eta 0:01:12 lr 0.000429 wd 0.0500 time 0.2318 (0.2500) data time 0.0014 (0.0025) model time 0.2304 (0.2475) loss 2.4306 (3.1487) grad_norm 3.3288 (inf) loss_scale 2048.0000 (2406.1834) mem 7379MB [2024-08-27 11:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][970/1251] eta 0:01:10 lr 0.000429 wd 0.0500 time 0.2191 (0.2497) data time 0.0009 (0.0025) model time 0.2182 (0.2472) loss 3.3663 (3.1444) grad_norm 1.9252 (inf) loss_scale 2048.0000 (2401.2017) mem 7379MB [2024-08-27 11:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][980/1251] eta 0:01:07 lr 0.000429 wd 0.0500 time 0.2292 (0.2494) data time 0.0007 (0.0025) model time 0.2285 (0.2470) loss 3.9256 (3.1446) grad_norm 1.8685 (inf) loss_scale 2048.0000 (2396.3567) mem 7379MB [2024-08-27 11:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][990/1251] eta 0:01:05 lr 0.000429 wd 0.0500 time 0.2264 (0.2492) data time 0.0012 (0.0024) model time 0.2253 (0.2467) loss 3.1002 (3.1485) grad_norm 2.2928 (inf) loss_scale 2048.0000 (2391.6428) mem 7379MB [2024-08-27 11:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1000/1251] eta 0:01:02 lr 0.000429 wd 0.0500 time 0.2286 (0.2489) data time 0.0010 (0.0024) model time 0.2276 (0.2465) loss 2.9686 (3.1461) grad_norm 2.2613 (inf) loss_scale 2048.0000 (2387.0547) mem 7379MB [2024-08-27 11:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1010/1251] eta 0:00:59 lr 0.000429 wd 0.0500 time 0.2262 (0.2487) data time 0.0010 (0.0024) model time 0.2252 (0.2463) loss 
3.2717 (3.1446) grad_norm 2.9485 (inf) loss_scale 2048.0000 (2382.5876) mem 7379MB [2024-08-27 11:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1020/1251] eta 0:00:57 lr 0.000429 wd 0.0500 time 0.2204 (0.2484) data time 0.0009 (0.0024) model time 0.2195 (0.2460) loss 2.9581 (3.1451) grad_norm 3.5345 (inf) loss_scale 2048.0000 (2378.2367) mem 7379MB [2024-08-27 11:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1030/1251] eta 0:00:54 lr 0.000429 wd 0.0500 time 0.2294 (0.2483) data time 0.0011 (0.0024) model time 0.2283 (0.2459) loss 2.6753 (3.1471) grad_norm 2.1434 (inf) loss_scale 2048.0000 (2373.9974) mem 7379MB [2024-08-27 11:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1040/1251] eta 0:00:52 lr 0.000428 wd 0.0500 time 0.2334 (0.2480) data time 0.0012 (0.0024) model time 0.2323 (0.2457) loss 3.1947 (3.1476) grad_norm 2.5182 (inf) loss_scale 2048.0000 (2369.8657) mem 7379MB [2024-08-27 11:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1050/1251] eta 0:00:49 lr 0.000428 wd 0.0500 time 0.2248 (0.2478) data time 0.0010 (0.0023) model time 0.2239 (0.2455) loss 3.1037 (3.1470) grad_norm 2.1463 (inf) loss_scale 2048.0000 (2365.8373) mem 7379MB [2024-08-27 11:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1060/1251] eta 0:00:47 lr 0.000428 wd 0.0500 time 0.2312 (0.2476) data time 0.0012 (0.0023) model time 0.2300 (0.2453) loss 1.9057 (3.1431) grad_norm 2.7993 (inf) loss_scale 2048.0000 (2361.9085) mem 7379MB [2024-08-27 11:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1070/1251] eta 0:00:44 lr 0.000428 wd 0.0500 time 0.2294 (0.2476) data time 0.0010 (0.0023) model time 0.2284 (0.2453) loss 3.2920 (3.1427) grad_norm 2.5938 (inf) loss_scale 2048.0000 (2358.0757) mem 7379MB [2024-08-27 11:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1080/1251] eta 0:00:42 lr 0.000428 wd 
0.0500 time 0.2285 (0.2476) data time 0.0010 (0.0023) model time 0.2276 (0.2453) loss 2.8720 (3.1381) grad_norm 4.4353 (inf) loss_scale 2048.0000 (2354.3353) mem 7379MB [2024-08-27 12:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1090/1251] eta 0:00:39 lr 0.000428 wd 0.0500 time 0.2283 (0.2474) data time 0.0007 (0.0023) model time 0.2276 (0.2451) loss 3.4471 (3.1373) grad_norm 3.6436 (inf) loss_scale 2048.0000 (2350.6841) mem 7379MB [2024-08-27 12:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1100/1251] eta 0:00:37 lr 0.000428 wd 0.0500 time 0.2346 (0.2472) data time 0.0009 (0.0023) model time 0.2337 (0.2450) loss 3.1032 (3.1341) grad_norm 2.5416 (inf) loss_scale 2048.0000 (2347.1190) mem 7379MB [2024-08-27 12:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1110/1251] eta 0:00:34 lr 0.000428 wd 0.0500 time 0.2294 (0.2470) data time 0.0007 (0.0023) model time 0.2287 (0.2448) loss 3.2528 (3.1355) grad_norm 2.3952 (inf) loss_scale 2048.0000 (2343.6368) mem 7379MB [2024-08-27 12:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1120/1251] eta 0:00:32 lr 0.000428 wd 0.0500 time 0.2275 (0.2468) data time 0.0009 (0.0023) model time 0.2266 (0.2446) loss 3.5049 (3.1358) grad_norm 2.5750 (inf) loss_scale 2048.0000 (2340.2348) mem 7379MB [2024-08-27 12:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1130/1251] eta 0:00:29 lr 0.000428 wd 0.0500 time 0.2247 (0.2466) data time 0.0009 (0.0022) model time 0.2238 (0.2444) loss 3.4031 (3.1348) grad_norm 6.1781 (inf) loss_scale 2048.0000 (2336.9101) mem 7379MB [2024-08-27 12:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1140/1251] eta 0:00:27 lr 0.000428 wd 0.0500 time 0.2252 (0.2465) data time 0.0009 (0.0022) model time 0.2243 (0.2442) loss 3.2495 (3.1360) grad_norm 2.7589 (inf) loss_scale 2048.0000 (2333.6603) mem 7379MB [2024-08-27 12:00:15 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [173/300][1150/1251] eta 0:00:24 lr 0.000428 wd 0.0500 time 0.2389 (0.2463) data time 0.0010 (0.0022) model time 0.2379 (0.2441) loss 2.0872 (3.1329) grad_norm 2.3983 (inf) loss_scale 2048.0000 (2330.4828) mem 7379MB [2024-08-27 12:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1160/1251] eta 0:00:22 lr 0.000428 wd 0.0500 time 0.2430 (0.2462) data time 0.0010 (0.0022) model time 0.2420 (0.2440) loss 3.2041 (3.1347) grad_norm 2.9443 (inf) loss_scale 2048.0000 (2327.3751) mem 7379MB [2024-08-27 12:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1170/1251] eta 0:00:19 lr 0.000428 wd 0.0500 time 0.2286 (0.2460) data time 0.0010 (0.0022) model time 0.2275 (0.2438) loss 3.5523 (3.1360) grad_norm 2.7540 (inf) loss_scale 2048.0000 (2324.3351) mem 7379MB [2024-08-27 12:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1180/1251] eta 0:00:17 lr 0.000428 wd 0.0500 time 0.2252 (0.2459) data time 0.0009 (0.0022) model time 0.2243 (0.2437) loss 4.2897 (3.1388) grad_norm 3.0479 (inf) loss_scale 2048.0000 (2321.3606) mem 7379MB [2024-08-27 12:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1190/1251] eta 0:00:14 lr 0.000428 wd 0.0500 time 0.2235 (0.2457) data time 0.0012 (0.0022) model time 0.2223 (0.2436) loss 3.1593 (3.1386) grad_norm 1.8845 (inf) loss_scale 2048.0000 (2318.4494) mem 7379MB [2024-08-27 12:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1200/1251] eta 0:00:12 lr 0.000428 wd 0.0500 time 0.2304 (0.2456) data time 0.0008 (0.0022) model time 0.2296 (0.2434) loss 2.0855 (3.1369) grad_norm 3.1780 (inf) loss_scale 2048.0000 (2315.5996) mem 7379MB [2024-08-27 12:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1210/1251] eta 0:00:10 lr 0.000428 wd 0.0500 time 0.2349 (0.2455) data time 0.0008 (0.0022) model time 0.2341 (0.2433) loss 2.2061 (3.1345) grad_norm 1.8074 (inf) 
loss_scale 2048.0000 (2312.8092) mem 7379MB [2024-08-27 12:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1220/1251] eta 0:00:07 lr 0.000428 wd 0.0500 time 0.2365 (0.2453) data time 0.0009 (0.0022) model time 0.2356 (0.2431) loss 3.3825 (3.1362) grad_norm 2.0920 (inf) loss_scale 2048.0000 (2310.0764) mem 7379MB [2024-08-27 12:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1230/1251] eta 0:00:05 lr 0.000428 wd 0.0500 time 0.2232 (0.2451) data time 0.0010 (0.0022) model time 0.2222 (0.2430) loss 2.6917 (3.1360) grad_norm 5.2476 (inf) loss_scale 2048.0000 (2307.3994) mem 7379MB [2024-08-27 12:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1240/1251] eta 0:00:02 lr 0.000428 wd 0.0500 time 0.2098 (0.2449) data time 0.0004 (0.0022) model time 0.2094 (0.2428) loss 3.7932 (3.1368) grad_norm 2.1092 (inf) loss_scale 2048.0000 (2304.7765) mem 7379MB [2024-08-27 12:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [173/300][1250/1251] eta 0:00:00 lr 0.000428 wd 0.0500 time 0.2110 (0.2446) data time 0.0007 (0.0021) model time 0.2103 (0.2425) loss 3.6472 (3.1367) grad_norm 3.9049 (inf) loss_scale 2048.0000 (2302.2062) mem 7379MB [2024-08-27 12:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 173 training takes 0:04:04 [2024-08-27 12:00:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:00:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 12:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.423 (0.423) Loss 0.4609 (0.4609) Acc@1 91.699 (91.699) Acc@5 98.633 (98.633) Mem 7379MB
[2024-08-27 12:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.118) Loss 0.7705 (0.7114) Acc@1 83.887 (84.615) Acc@5 96.875 (97.079) Mem 7379MB
[2024-08-27 12:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.101) Loss 1.0605 (0.7330) Acc@1 75.293 (83.845) Acc@5 93.945 (97.028) Mem 7379MB
[2024-08-27 12:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.094) Loss 1.2227 (0.8282) Acc@1 70.703 (81.707) Acc@5 91.406 (95.911) Mem 7379MB
[2024-08-27 12:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.1094 (0.8777) Acc@1 74.023 (80.435) Acc@5 93.359 (95.348) Mem 7379MB
[2024-08-27 12:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.976 Acc@5 95.284
[2024-08-27 12:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.0%
[2024-08-27 12:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.033 (1.033) Loss 0.4009 (0.4009) Acc@1 93.164 (93.164) Acc@5 98.242 (98.242) Mem 7379MB
[2024-08-27 12:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.169) Loss 0.6201 (0.6246) Acc@1 87.695 (86.692) Acc@5 97.559 (97.488) Mem 7379MB
[2024-08-27 12:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.128) Loss 0.8926 (0.6494) Acc@1 79.004 (85.793) Acc@5 95.508 (97.442) Mem 7379MB
[2024-08-27 12:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.113) Loss 1.1211 (0.7361) Acc@1 71.973 (83.666) Acc@5 92.969 (96.494) Mem 7379MB
[2024-08-27 12:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.066 (0.101) Loss 1.0068 (0.7803) Acc@1 75.293 (82.365) Acc@5 94.141 (96.025) Mem 7379MB
[2024-08-27 12:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.950 Acc@5 96.000
[2024-08-27 12:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.0%
[2024-08-27 12:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.95%
[2024-08-27 12:00:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 12:00:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 12:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][0/1251] eta 0:18:16 lr 0.000428 wd 0.0500 time 0.8763 (0.8763) data time 0.5958 (0.5958) model time 0.0000 (0.0000) loss 2.8901 (2.8901) grad_norm 2.4786 (2.4786) loss_scale 2048.0000 (2048.0000) mem 7382MB
[2024-08-27 12:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][10/1251] eta 0:05:56 lr 0.000428 wd 0.0500 time 0.2293 (0.2874) data time 0.0011 (0.0560) model time 0.0000 (0.0000) loss 3.3403 (3.1753) grad_norm 3.7023 (2.7220) loss_scale 2048.0000 (2048.0000) mem 7381MB
[2024-08-27 12:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][20/1251] eta 0:05:19 lr 0.000427 wd 0.0500 time 0.2307 (0.2596) data time 0.0011 (0.0298) model time 0.0000 (0.0000) loss 3.6163 (3.1701) grad_norm 3.4070 (2.8745) loss_scale 2048.0000 (2048.0000) mem 7381MB
[2024-08-27 12:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][30/1251] eta 0:05:05 lr 0.000427 wd 0.0500 time 0.2336 (0.2503) data time 0.0011 (0.0206) model time 0.0000 (0.0000) loss 2.4110 (3.0625) grad_norm 2.7488 (2.8796) loss_scale 2048.0000 (2048.0000) mem 7381MB
[2024-08-27 12:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][40/1251] eta
0:04:56 lr 0.000427 wd 0.0500 time 0.2339 (0.2452) data time 0.0006 (0.0158) model time 0.0000 (0.0000) loss 2.8500 (3.1120) grad_norm 2.3557 (2.8198) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][50/1251] eta 0:04:51 lr 0.000427 wd 0.0500 time 0.2348 (0.2424) data time 0.0014 (0.0131) model time 0.0000 (0.0000) loss 3.3808 (3.1126) grad_norm 2.5359 (2.8062) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][60/1251] eta 0:04:46 lr 0.000427 wd 0.0500 time 0.2275 (0.2406) data time 0.0011 (0.0112) model time 0.2264 (0.2299) loss 2.5217 (3.1251) grad_norm 2.9225 (2.7830) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][70/1251] eta 0:04:42 lr 0.000427 wd 0.0500 time 0.2250 (0.2390) data time 0.0009 (0.0098) model time 0.2242 (0.2288) loss 2.5255 (3.0827) grad_norm 1.9237 (2.7340) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][80/1251] eta 0:04:38 lr 0.000427 wd 0.0500 time 0.2216 (0.2377) data time 0.0008 (0.0088) model time 0.2208 (0.2282) loss 3.2503 (3.0879) grad_norm 2.5542 (2.6933) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][90/1251] eta 0:04:34 lr 0.000427 wd 0.0500 time 0.2207 (0.2365) data time 0.0007 (0.0079) model time 0.2200 (0.2276) loss 3.4430 (3.0630) grad_norm 2.6715 (2.6817) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][100/1251] eta 0:04:31 lr 0.000427 wd 0.0500 time 0.2322 (0.2357) data time 0.0010 (0.0073) model time 0.2312 (0.2275) loss 3.4408 (3.0704) grad_norm 2.0255 (2.6793) loss_scale 2048.0000 (2048.0000) mem 7381MB 
[2024-08-27 12:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][110/1251] eta 0:04:28 lr 0.000427 wd 0.0500 time 0.2396 (0.2352) data time 0.0008 (0.0067) model time 0.2388 (0.2277) loss 3.7339 (3.0768) grad_norm 2.8776 (2.6971) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][120/1251] eta 0:04:25 lr 0.000427 wd 0.0500 time 0.2399 (0.2348) data time 0.0010 (0.0063) model time 0.2390 (0.2279) loss 2.6020 (3.1108) grad_norm 2.9581 (2.6842) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][130/1251] eta 0:04:22 lr 0.000427 wd 0.0500 time 0.2227 (0.2341) data time 0.0013 (0.0059) model time 0.2215 (0.2275) loss 4.0418 (3.1174) grad_norm 2.1599 (2.6632) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][140/1251] eta 0:04:20 lr 0.000427 wd 0.0500 time 0.2395 (0.2342) data time 0.0011 (0.0056) model time 0.2384 (0.2281) loss 3.6194 (3.1217) grad_norm 3.7099 (2.6834) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][150/1251] eta 0:04:17 lr 0.000427 wd 0.0500 time 0.2365 (0.2339) data time 0.0014 (0.0053) model time 0.2351 (0.2282) loss 2.7226 (3.1118) grad_norm 3.8904 (2.6921) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][160/1251] eta 0:04:14 lr 0.000427 wd 0.0500 time 0.2320 (0.2337) data time 0.0012 (0.0051) model time 0.2308 (0.2281) loss 2.2836 (3.1038) grad_norm 2.7361 (2.7409) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][170/1251] eta 0:04:12 lr 0.000427 wd 0.0500 time 0.2269 (0.2335) data time 0.0009 (0.0050) model time 0.2260 
(0.2281) loss 3.1797 (3.1049) grad_norm 2.9929 (2.7239) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][180/1251] eta 0:04:10 lr 0.000427 wd 0.0500 time 0.2323 (0.2335) data time 0.0009 (0.0048) model time 0.2314 (0.2284) loss 3.4908 (3.1226) grad_norm 2.9134 (2.7216) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][190/1251] eta 0:04:07 lr 0.000427 wd 0.0500 time 0.2614 (0.2333) data time 0.0007 (0.0046) model time 0.2607 (0.2284) loss 3.6933 (3.1376) grad_norm 2.5348 (2.7331) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][200/1251] eta 0:04:04 lr 0.000427 wd 0.0500 time 0.2434 (0.2331) data time 0.0007 (0.0044) model time 0.2427 (0.2284) loss 3.6920 (3.1419) grad_norm 2.0745 (2.7223) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][210/1251] eta 0:04:02 lr 0.000427 wd 0.0500 time 0.2393 (0.2328) data time 0.0009 (0.0042) model time 0.2384 (0.2283) loss 3.2079 (3.1469) grad_norm 2.3629 (2.7152) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][220/1251] eta 0:03:59 lr 0.000427 wd 0.0500 time 0.2241 (0.2326) data time 0.0012 (0.0041) model time 0.2229 (0.2282) loss 2.9683 (3.1407) grad_norm 3.7234 (2.7418) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][230/1251] eta 0:03:57 lr 0.000427 wd 0.0500 time 0.2248 (0.2325) data time 0.0009 (0.0040) model time 0.2239 (0.2283) loss 3.2033 (3.1343) grad_norm 1.7318 (2.7298) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[174/300][240/1251] eta 0:03:54 lr 0.000427 wd 0.0500 time 0.2260 (0.2323) data time 0.0007 (0.0039) model time 0.2253 (0.2282) loss 3.2431 (3.1339) grad_norm 2.1284 (2.7242) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][250/1251] eta 0:03:52 lr 0.000426 wd 0.0500 time 0.2288 (0.2322) data time 0.0010 (0.0038) model time 0.2278 (0.2283) loss 1.9238 (3.1284) grad_norm 2.2638 (2.7092) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][260/1251] eta 0:03:50 lr 0.000426 wd 0.0500 time 0.2391 (0.2321) data time 0.0009 (0.0037) model time 0.2381 (0.2282) loss 3.3576 (3.1351) grad_norm 2.4213 (2.6896) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][270/1251] eta 0:03:47 lr 0.000426 wd 0.0500 time 0.2266 (0.2319) data time 0.0009 (0.0036) model time 0.2257 (0.2281) loss 3.0053 (3.1403) grad_norm 1.9056 (2.6867) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][280/1251] eta 0:03:45 lr 0.000426 wd 0.0500 time 0.2365 (0.2318) data time 0.0007 (0.0035) model time 0.2358 (0.2281) loss 3.6176 (3.1475) grad_norm 3.4171 (2.7042) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][290/1251] eta 0:03:42 lr 0.000426 wd 0.0500 time 0.2286 (0.2318) data time 0.0012 (0.0034) model time 0.2274 (0.2282) loss 3.0655 (3.1507) grad_norm 2.2833 (2.7054) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][300/1251] eta 0:03:40 lr 0.000426 wd 0.0500 time 0.2251 (0.2318) data time 0.0018 (0.0033) model time 0.2233 (0.2282) loss 3.4451 (3.1532) grad_norm 3.3587 (2.7147) loss_scale 2048.0000 
(2048.0000) mem 7381MB [2024-08-27 12:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][310/1251] eta 0:03:38 lr 0.000426 wd 0.0500 time 0.2459 (0.2319) data time 0.0010 (0.0033) model time 0.2449 (0.2284) loss 3.5306 (3.1531) grad_norm 2.6838 (2.7159) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][320/1251] eta 0:03:35 lr 0.000426 wd 0.0500 time 0.2270 (0.2318) data time 0.0010 (0.0032) model time 0.2260 (0.2285) loss 3.4148 (3.1416) grad_norm 3.7155 (2.7305) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][330/1251] eta 0:03:33 lr 0.000426 wd 0.0500 time 0.2344 (0.2318) data time 0.0013 (0.0032) model time 0.2332 (0.2285) loss 3.1888 (3.1418) grad_norm 2.1680 (2.7246) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][340/1251] eta 0:03:31 lr 0.000426 wd 0.0500 time 0.2297 (0.2317) data time 0.0009 (0.0031) model time 0.2288 (0.2285) loss 3.2708 (3.1402) grad_norm 3.4173 (2.7207) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][350/1251] eta 0:03:28 lr 0.000426 wd 0.0500 time 0.2205 (0.2317) data time 0.0009 (0.0030) model time 0.2196 (0.2285) loss 3.5592 (3.1409) grad_norm 3.6703 (2.7193) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][360/1251] eta 0:03:26 lr 0.000426 wd 0.0500 time 0.2246 (0.2316) data time 0.0011 (0.0030) model time 0.2235 (0.2285) loss 3.1944 (3.1419) grad_norm 2.3057 (2.7139) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][370/1251] eta 0:03:23 lr 0.000426 wd 0.0500 time 0.2275 (0.2315) data time 0.0009 
(0.0030) model time 0.2266 (0.2284) loss 3.2201 (3.1452) grad_norm 2.9548 (2.7080) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][380/1251] eta 0:03:21 lr 0.000426 wd 0.0500 time 0.2273 (0.2315) data time 0.0009 (0.0029) model time 0.2264 (0.2284) loss 3.3987 (3.1515) grad_norm 2.5063 (2.7008) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][390/1251] eta 0:03:19 lr 0.000426 wd 0.0500 time 0.2483 (0.2315) data time 0.0007 (0.0029) model time 0.2476 (0.2285) loss 3.5127 (3.1525) grad_norm 2.5041 (2.6960) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][400/1251] eta 0:03:16 lr 0.000426 wd 0.0500 time 0.2407 (0.2315) data time 0.0012 (0.0028) model time 0.2395 (0.2285) loss 3.3893 (3.1522) grad_norm 1.8544 (2.6915) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][410/1251] eta 0:03:14 lr 0.000426 wd 0.0500 time 0.2353 (0.2314) data time 0.0010 (0.0028) model time 0.2343 (0.2284) loss 3.7957 (3.1612) grad_norm 3.5459 (2.6928) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][420/1251] eta 0:03:12 lr 0.000426 wd 0.0500 time 0.2203 (0.2317) data time 0.0010 (0.0028) model time 0.2193 (0.2289) loss 3.1738 (3.1583) grad_norm 1.9100 (2.6878) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][430/1251] eta 0:03:10 lr 0.000426 wd 0.0500 time 0.2374 (0.2316) data time 0.0009 (0.0027) model time 0.2365 (0.2289) loss 3.4799 (3.1582) grad_norm 3.9861 (2.6843) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [174/300][440/1251] eta 0:03:07 lr 0.000426 wd 0.0500 time 0.2253 (0.2316) data time 0.0011 (0.0027) model time 0.2242 (0.2289) loss 3.5297 (3.1588) grad_norm 4.6516 (2.6914) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][450/1251] eta 0:03:05 lr 0.000426 wd 0.0500 time 0.2254 (0.2316) data time 0.0008 (0.0027) model time 0.2245 (0.2288) loss 3.2174 (3.1598) grad_norm 2.0134 (2.6895) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][460/1251] eta 0:03:03 lr 0.000426 wd 0.0500 time 0.2270 (0.2315) data time 0.0007 (0.0026) model time 0.2262 (0.2288) loss 3.7748 (3.1644) grad_norm 4.1863 (2.7036) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][470/1251] eta 0:03:00 lr 0.000426 wd 0.0500 time 0.2328 (0.2314) data time 0.0013 (0.0026) model time 0.2315 (0.2287) loss 3.6277 (3.1661) grad_norm 2.9303 (2.7219) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][480/1251] eta 0:02:58 lr 0.000425 wd 0.0500 time 0.2212 (0.2313) data time 0.0013 (0.0026) model time 0.2199 (0.2287) loss 2.3619 (3.1560) grad_norm 2.9998 (2.7376) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][490/1251] eta 0:02:55 lr 0.000425 wd 0.0500 time 0.2241 (0.2312) data time 0.0009 (0.0026) model time 0.2232 (0.2286) loss 2.1962 (3.1487) grad_norm 3.1056 (2.7330) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][500/1251] eta 0:02:53 lr 0.000425 wd 0.0500 time 0.2295 (0.2312) data time 0.0012 (0.0025) model time 0.2284 (0.2286) loss 3.8380 (3.1477) grad_norm 2.3755 (2.7305) loss_scale 
2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][510/1251] eta 0:02:51 lr 0.000425 wd 0.0500 time 0.4664 (0.2317) data time 0.0015 (0.0025) model time 0.4649 (0.2292) loss 2.2823 (3.1490) grad_norm 2.2056 (2.7350) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][520/1251] eta 0:02:49 lr 0.000425 wd 0.0500 time 0.2294 (0.2316) data time 0.0009 (0.0025) model time 0.2286 (0.2291) loss 3.1719 (3.1533) grad_norm 2.2567 (2.7326) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][530/1251] eta 0:02:46 lr 0.000425 wd 0.0500 time 0.2259 (0.2316) data time 0.0010 (0.0024) model time 0.2249 (0.2291) loss 2.5100 (3.1520) grad_norm 1.8963 (2.7270) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][540/1251] eta 0:02:44 lr 0.000425 wd 0.0500 time 0.2414 (0.2315) data time 0.0009 (0.0024) model time 0.2404 (0.2291) loss 4.1034 (3.1547) grad_norm 2.2385 (2.7234) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][550/1251] eta 0:02:42 lr 0.000425 wd 0.0500 time 0.2264 (0.2315) data time 0.0012 (0.0024) model time 0.2253 (0.2290) loss 3.5897 (3.1496) grad_norm 1.8822 (2.7190) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][560/1251] eta 0:02:39 lr 0.000425 wd 0.0500 time 0.2335 (0.2315) data time 0.0013 (0.0024) model time 0.2322 (0.2290) loss 2.4886 (3.1496) grad_norm 2.9422 (2.7215) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][570/1251] eta 0:02:37 lr 0.000425 wd 0.0500 time 0.2249 (0.2314) data time 
0.0007 (0.0024) model time 0.2241 (0.2290) loss 2.8657 (3.1426) grad_norm 2.0267 (2.7239) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][580/1251] eta 0:02:35 lr 0.000425 wd 0.0500 time 0.2232 (0.2314) data time 0.0010 (0.0023) model time 0.2222 (0.2290) loss 2.6996 (3.1427) grad_norm 2.2285 (2.7207) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][590/1251] eta 0:02:32 lr 0.000425 wd 0.0500 time 0.2287 (0.2313) data time 0.0007 (0.0023) model time 0.2281 (0.2289) loss 3.7150 (3.1454) grad_norm 3.0350 (2.7159) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][600/1251] eta 0:02:30 lr 0.000425 wd 0.0500 time 0.2266 (0.2313) data time 0.0008 (0.0023) model time 0.2258 (0.2289) loss 3.5004 (3.1481) grad_norm 4.7670 (2.7204) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][610/1251] eta 0:02:28 lr 0.000425 wd 0.0500 time 0.2325 (0.2312) data time 0.0007 (0.0023) model time 0.2318 (0.2289) loss 3.2549 (3.1513) grad_norm 3.2912 (2.7241) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][620/1251] eta 0:02:25 lr 0.000425 wd 0.0500 time 0.2280 (0.2312) data time 0.0009 (0.0023) model time 0.2271 (0.2289) loss 3.0994 (3.1507) grad_norm 3.0664 (2.7193) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][630/1251] eta 0:02:23 lr 0.000425 wd 0.0500 time 0.2263 (0.2311) data time 0.0008 (0.0022) model time 0.2255 (0.2289) loss 3.5656 (3.1531) grad_norm 2.2425 (2.7208) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [174/300][640/1251] eta 0:02:21 lr 0.000425 wd 0.0500 time 0.2275 (0.2311) data time 0.0016 (0.0022) model time 0.2259 (0.2288) loss 2.2130 (3.1536) grad_norm 4.5490 (2.7266) loss_scale 2048.0000 (2048.0000) mem 7381MB [2024-08-27 12:03:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][650/1251] eta 0:02:18 lr 0.000425 wd 0.0500 time 0.2223 (0.2311) data time 0.0009 (0.0022) model time 0.2214 (0.2288) loss 3.3830 (3.1487) grad_norm 3.3882 (inf) loss_scale 1024.0000 (2044.8541) mem 7381MB [2024-08-27 12:03:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][660/1251] eta 0:02:16 lr 0.000425 wd 0.0500 time 0.2275 (0.2310) data time 0.0011 (0.0022) model time 0.2264 (0.2288) loss 3.3815 (3.1429) grad_norm 2.9419 (inf) loss_scale 1024.0000 (2029.4100) mem 7381MB [2024-08-27 12:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][670/1251] eta 0:02:14 lr 0.000425 wd 0.0500 time 0.2359 (0.2310) data time 0.0008 (0.0022) model time 0.2351 (0.2287) loss 3.2338 (3.1428) grad_norm 2.3078 (inf) loss_scale 1024.0000 (2014.4262) mem 7381MB [2024-08-27 12:03:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][680/1251] eta 0:02:11 lr 0.000425 wd 0.0500 time 0.2375 (0.2309) data time 0.0008 (0.0022) model time 0.2368 (0.2287) loss 3.4769 (3.1439) grad_norm 2.1444 (inf) loss_scale 1024.0000 (1999.8825) mem 7381MB [2024-08-27 12:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][690/1251] eta 0:02:09 lr 0.000425 wd 0.0500 time 0.2239 (0.2309) data time 0.0009 (0.0021) model time 0.2230 (0.2287) loss 2.9524 (3.1433) grad_norm 3.3171 (inf) loss_scale 1024.0000 (1985.7598) mem 7381MB [2024-08-27 12:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][700/1251] eta 0:02:07 lr 0.000424 wd 0.0500 time 0.2277 (0.2308) data time 0.0009 (0.0021) model time 0.2268 (0.2287) loss 3.4955 (3.1479) grad_norm 3.4408 (inf) loss_scale 1024.0000 
(1972.0399) mem 7381MB [2024-08-27 12:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][710/1251] eta 0:02:04 lr 0.000424 wd 0.0500 time 0.2261 (0.2308) data time 0.0012 (0.0021) model time 0.2249 (0.2286) loss 2.6470 (3.1482) grad_norm 2.1344 (inf) loss_scale 1024.0000 (1958.7060) mem 7381MB [2024-08-27 12:03:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][720/1251] eta 0:02:02 lr 0.000424 wd 0.0500 time 0.2243 (0.2308) data time 0.0009 (0.0021) model time 0.2234 (0.2286) loss 2.7503 (3.1466) grad_norm 2.6350 (inf) loss_scale 1024.0000 (1945.7420) mem 7381MB [2024-08-27 12:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][730/1251] eta 0:02:00 lr 0.000424 wd 0.0500 time 0.2225 (0.2307) data time 0.0009 (0.0021) model time 0.2216 (0.2286) loss 3.7010 (3.1465) grad_norm 2.8749 (inf) loss_scale 1024.0000 (1933.1327) mem 7381MB [2024-08-27 12:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][740/1251] eta 0:01:57 lr 0.000424 wd 0.0500 time 0.2266 (0.2307) data time 0.0010 (0.0021) model time 0.2256 (0.2286) loss 3.7510 (3.1455) grad_norm 2.2559 (inf) loss_scale 1024.0000 (1920.8637) mem 7381MB [2024-08-27 12:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][750/1251] eta 0:01:55 lr 0.000424 wd 0.0500 time 0.2233 (0.2307) data time 0.0010 (0.0021) model time 0.2222 (0.2286) loss 2.3176 (3.1434) grad_norm 3.4709 (inf) loss_scale 1024.0000 (1908.9214) mem 7381MB [2024-08-27 12:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][760/1251] eta 0:01:53 lr 0.000424 wd 0.0500 time 0.2232 (0.2306) data time 0.0009 (0.0021) model time 0.2223 (0.2285) loss 4.1777 (3.1430) grad_norm 5.2935 (inf) loss_scale 1024.0000 (1897.2930) mem 7381MB [2024-08-27 12:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][770/1251] eta 0:01:50 lr 0.000424 wd 0.0500 time 0.2336 (0.2306) data time 0.0006 (0.0020) model time 
0.2329 (0.2285) loss 3.3643 (3.1444) grad_norm 3.4537 (inf) loss_scale 1024.0000 (1885.9663) mem 7381MB [2024-08-27 12:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][780/1251] eta 0:01:48 lr 0.000424 wd 0.0500 time 0.2217 (0.2305) data time 0.0009 (0.0020) model time 0.2209 (0.2284) loss 3.5107 (3.1436) grad_norm 2.6679 (inf) loss_scale 1024.0000 (1874.9296) mem 7381MB [2024-08-27 12:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][790/1251] eta 0:01:46 lr 0.000424 wd 0.0500 time 0.2254 (0.2305) data time 0.0014 (0.0020) model time 0.2241 (0.2284) loss 3.3244 (3.1418) grad_norm 5.8113 (inf) loss_scale 1024.0000 (1864.1719) mem 7381MB [2024-08-27 12:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][800/1251] eta 0:01:43 lr 0.000424 wd 0.0500 time 0.2211 (0.2305) data time 0.0008 (0.0020) model time 0.2203 (0.2284) loss 3.6847 (3.1445) grad_norm 2.3390 (inf) loss_scale 1024.0000 (1853.6829) mem 7381MB [2024-08-27 12:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][810/1251] eta 0:01:41 lr 0.000424 wd 0.0500 time 0.2321 (0.2305) data time 0.0010 (0.0020) model time 0.2312 (0.2284) loss 2.6701 (3.1450) grad_norm 2.0295 (inf) loss_scale 1024.0000 (1843.4525) mem 7381MB [2024-08-27 12:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][820/1251] eta 0:01:39 lr 0.000424 wd 0.0500 time 0.2213 (0.2304) data time 0.0015 (0.0020) model time 0.2198 (0.2284) loss 3.4942 (3.1455) grad_norm 2.4755 (inf) loss_scale 1024.0000 (1833.4714) mem 7381MB [2024-08-27 12:04:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][830/1251] eta 0:01:37 lr 0.000424 wd 0.0500 time 0.2333 (0.2304) data time 0.0007 (0.0020) model time 0.2326 (0.2284) loss 3.4294 (3.1470) grad_norm 2.7162 (inf) loss_scale 1024.0000 (1823.7304) mem 7381MB [2024-08-27 12:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][840/1251] eta 0:01:34 
lr 0.000424 wd 0.0500 time 0.2282 (0.2304) data time 0.0006 (0.0020) model time 0.2276 (0.2284) loss 3.6647 (3.1477) grad_norm 2.3770 (inf) loss_scale 1024.0000 (1814.2212) mem 7381MB [2024-08-27 12:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][850/1251] eta 0:01:32 lr 0.000424 wd 0.0500 time 0.2238 (0.2304) data time 0.0009 (0.0020) model time 0.2228 (0.2284) loss 3.0656 (3.1458) grad_norm 2.5910 (inf) loss_scale 1024.0000 (1804.9354) mem 7381MB [2024-08-27 12:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][860/1251] eta 0:01:30 lr 0.000424 wd 0.0500 time 0.2286 (0.2304) data time 0.0011 (0.0020) model time 0.2275 (0.2283) loss 3.3071 (3.1447) grad_norm 2.4417 (inf) loss_scale 1024.0000 (1795.8653) mem 7381MB [2024-08-27 12:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][870/1251] eta 0:01:27 lr 0.000424 wd 0.0500 time 0.2305 (0.2303) data time 0.0007 (0.0019) model time 0.2299 (0.2283) loss 3.7954 (3.1427) grad_norm 1.9010 (inf) loss_scale 1024.0000 (1787.0034) mem 7381MB [2024-08-27 12:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][880/1251] eta 0:01:25 lr 0.000424 wd 0.0500 time 0.2342 (0.2303) data time 0.0009 (0.0019) model time 0.2333 (0.2283) loss 3.7130 (3.1465) grad_norm 2.4092 (inf) loss_scale 1024.0000 (1778.3428) mem 7381MB [2024-08-27 12:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][890/1251] eta 0:01:23 lr 0.000424 wd 0.0500 time 0.2249 (0.2303) data time 0.0009 (0.0019) model time 0.2241 (0.2283) loss 2.5381 (3.1456) grad_norm 3.8351 (inf) loss_scale 1024.0000 (1769.8765) mem 7381MB [2024-08-27 12:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][900/1251] eta 0:01:20 lr 0.000424 wd 0.0500 time 0.2284 (0.2303) data time 0.0009 (0.0019) model time 0.2275 (0.2283) loss 3.2967 (3.1442) grad_norm 2.5526 (inf) loss_scale 1024.0000 (1761.5982) mem 7381MB [2024-08-27 12:04:21 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][910/1251] eta 0:01:18 lr 0.000424 wd 0.0500 time 0.2272 (0.2302) data time 0.0009 (0.0019) model time 0.2264 (0.2283) loss 3.0455 (3.1452) grad_norm 1.9588 (inf) loss_scale 1024.0000 (1753.5016) mem 7381MB [2024-08-27 12:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][920/1251] eta 0:01:16 lr 0.000424 wd 0.0500 time 0.2242 (0.2302) data time 0.0008 (0.0019) model time 0.2234 (0.2282) loss 2.9694 (3.1404) grad_norm 2.2781 (inf) loss_scale 1024.0000 (1745.5809) mem 7381MB [2024-08-27 12:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][930/1251] eta 0:01:13 lr 0.000423 wd 0.0500 time 0.2261 (0.2302) data time 0.0010 (0.0019) model time 0.2251 (0.2282) loss 3.0421 (3.1382) grad_norm 2.8693 (inf) loss_scale 1024.0000 (1737.8303) mem 7381MB [2024-08-27 12:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][940/1251] eta 0:01:11 lr 0.000423 wd 0.0500 time 0.2214 (0.2302) data time 0.0009 (0.0019) model time 0.2204 (0.2282) loss 2.6922 (3.1371) grad_norm 2.2763 (inf) loss_scale 1024.0000 (1730.2444) mem 7381MB [2024-08-27 12:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][950/1251] eta 0:01:09 lr 0.000423 wd 0.0500 time 0.2308 (0.2301) data time 0.0013 (0.0019) model time 0.2295 (0.2282) loss 3.5161 (3.1361) grad_norm 3.1465 (inf) loss_scale 1024.0000 (1722.8181) mem 7381MB [2024-08-27 12:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][960/1251] eta 0:01:06 lr 0.000423 wd 0.0500 time 0.2314 (0.2301) data time 0.0009 (0.0019) model time 0.2305 (0.2282) loss 2.6483 (3.1343) grad_norm 5.6754 (inf) loss_scale 1024.0000 (1715.5463) mem 7381MB [2024-08-27 12:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][970/1251] eta 0:01:04 lr 0.000423 wd 0.0500 time 0.2262 (0.2301) data time 0.0009 (0.0019) model time 0.2253 (0.2281) loss 3.2540 (3.1331) 
grad_norm 2.2821 (inf) loss_scale 1024.0000 (1708.4243) mem 7381MB [2024-08-27 12:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][980/1251] eta 0:01:02 lr 0.000423 wd 0.0500 time 0.2263 (0.2300) data time 0.0009 (0.0019) model time 0.2254 (0.2281) loss 1.9871 (3.1320) grad_norm 3.3410 (inf) loss_scale 1024.0000 (1701.4475) mem 7381MB [2024-08-27 12:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][990/1251] eta 0:01:00 lr 0.000423 wd 0.0500 time 0.2260 (0.2300) data time 0.0011 (0.0018) model time 0.2250 (0.2281) loss 3.4051 (3.1305) grad_norm 3.0952 (inf) loss_scale 1024.0000 (1694.6115) mem 7381MB [2024-08-27 12:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1000/1251] eta 0:00:57 lr 0.000423 wd 0.0500 time 0.2314 (0.2300) data time 0.0008 (0.0018) model time 0.2306 (0.2281) loss 3.2137 (3.1297) grad_norm 3.0624 (inf) loss_scale 1024.0000 (1687.9121) mem 7381MB [2024-08-27 12:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1010/1251] eta 0:00:55 lr 0.000423 wd 0.0500 time 0.2243 (0.2300) data time 0.0012 (0.0018) model time 0.2231 (0.2281) loss 3.0307 (3.1322) grad_norm 2.3580 (inf) loss_scale 1024.0000 (1681.3452) mem 7381MB [2024-08-27 12:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1020/1251] eta 0:00:53 lr 0.000423 wd 0.0500 time 0.2245 (0.2299) data time 0.0012 (0.0018) model time 0.2233 (0.2280) loss 3.4568 (3.1326) grad_norm 2.4004 (inf) loss_scale 1024.0000 (1674.9070) mem 7381MB [2024-08-27 12:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1030/1251] eta 0:00:50 lr 0.000423 wd 0.0500 time 0.2215 (0.2299) data time 0.0007 (0.0018) model time 0.2207 (0.2280) loss 3.6665 (3.1325) grad_norm 3.5473 (inf) loss_scale 1024.0000 (1668.5936) mem 7381MB [2024-08-27 12:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1040/1251] eta 0:00:48 lr 0.000423 wd 0.0500 time 
0.2219 (0.2299) data time 0.0007 (0.0018) model time 0.2212 (0.2281) loss 3.3298 (3.1334) grad_norm 2.2601 (inf) loss_scale 1024.0000 (1662.4015) mem 7381MB [2024-08-27 12:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1050/1251] eta 0:00:46 lr 0.000423 wd 0.0500 time 0.2223 (0.2301) data time 0.0009 (0.0018) model time 0.2214 (0.2283) loss 3.0784 (3.1330) grad_norm 2.9703 (inf) loss_scale 1024.0000 (1656.3273) mem 7381MB [2024-08-27 12:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1060/1251] eta 0:00:43 lr 0.000423 wd 0.0500 time 0.2173 (0.2301) data time 0.0009 (0.0018) model time 0.2164 (0.2283) loss 2.8271 (3.1323) grad_norm 2.5746 (inf) loss_scale 1024.0000 (1650.3676) mem 7381MB [2024-08-27 12:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1070/1251] eta 0:00:41 lr 0.000423 wd 0.0500 time 0.2322 (0.2301) data time 0.0010 (0.0018) model time 0.2312 (0.2282) loss 3.3208 (3.1324) grad_norm 3.2552 (inf) loss_scale 1024.0000 (1644.5191) mem 7381MB [2024-08-27 12:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1080/1251] eta 0:00:39 lr 0.000423 wd 0.0500 time 0.2199 (0.2301) data time 0.0009 (0.0018) model time 0.2190 (0.2282) loss 2.6532 (3.1322) grad_norm 3.1446 (inf) loss_scale 1024.0000 (1638.7789) mem 7381MB [2024-08-27 12:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1090/1251] eta 0:00:37 lr 0.000423 wd 0.0500 time 0.2287 (0.2300) data time 0.0009 (0.0018) model time 0.2278 (0.2282) loss 2.8742 (3.1330) grad_norm 2.7782 (inf) loss_scale 1024.0000 (1633.1439) mem 7381MB [2024-08-27 12:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1100/1251] eta 0:00:34 lr 0.000423 wd 0.0500 time 0.2269 (0.2300) data time 0.0008 (0.0018) model time 0.2261 (0.2282) loss 2.0683 (3.1319) grad_norm 2.2633 (inf) loss_scale 1024.0000 (1627.6113) mem 7381MB [2024-08-27 12:05:07 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [174/300][1110/1251] eta 0:00:32 lr 0.000423 wd 0.0500 time 0.2327 (0.2300) data time 0.0011 (0.0018) model time 0.2316 (0.2282) loss 3.4438 (3.1340) grad_norm 3.6648 (inf) loss_scale 1024.0000 (1622.1782) mem 7381MB [2024-08-27 12:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1120/1251] eta 0:00:30 lr 0.000423 wd 0.0500 time 0.2409 (0.2300) data time 0.0008 (0.0018) model time 0.2401 (0.2282) loss 2.3359 (3.1324) grad_norm 2.4139 (inf) loss_scale 1024.0000 (1616.8421) mem 7381MB [2024-08-27 12:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1130/1251] eta 0:00:27 lr 0.000423 wd 0.0500 time 0.2251 (0.2300) data time 0.0008 (0.0018) model time 0.2243 (0.2282) loss 3.6613 (3.1344) grad_norm 2.7997 (inf) loss_scale 1024.0000 (1611.6004) mem 7381MB [2024-08-27 12:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1140/1251] eta 0:00:25 lr 0.000423 wd 0.0500 time 0.2226 (0.2300) data time 0.0011 (0.0017) model time 0.2215 (0.2282) loss 3.3104 (3.1361) grad_norm 2.3710 (inf) loss_scale 1024.0000 (1606.4505) mem 7381MB [2024-08-27 12:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1150/1251] eta 0:00:23 lr 0.000423 wd 0.0500 time 0.2206 (0.2300) data time 0.0009 (0.0017) model time 0.2197 (0.2282) loss 2.4563 (3.1348) grad_norm 2.3134 (inf) loss_scale 1024.0000 (1601.3901) mem 7381MB [2024-08-27 12:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1160/1251] eta 0:00:20 lr 0.000422 wd 0.0500 time 0.2358 (0.2300) data time 0.0006 (0.0017) model time 0.2352 (0.2282) loss 2.0941 (3.1360) grad_norm 2.4550 (inf) loss_scale 1024.0000 (1596.4169) mem 7381MB [2024-08-27 12:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1170/1251] eta 0:00:18 lr 0.000422 wd 0.0500 time 0.2307 (0.2299) data time 0.0009 (0.0017) model time 0.2298 (0.2282) loss 3.1870 (3.1377) grad_norm 3.2758 (inf) 
loss_scale 1024.0000 (1591.5286) mem 7381MB [2024-08-27 12:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1180/1251] eta 0:00:16 lr 0.000422 wd 0.0500 time 0.2206 (0.2299) data time 0.0013 (0.0017) model time 0.2193 (0.2281) loss 3.1709 (3.1373) grad_norm 1.9644 (inf) loss_scale 1024.0000 (1586.7231) mem 7381MB [2024-08-27 12:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1190/1251] eta 0:00:14 lr 0.000422 wd 0.0500 time 0.2235 (0.2299) data time 0.0014 (0.0017) model time 0.2222 (0.2281) loss 3.3563 (3.1387) grad_norm 2.2909 (inf) loss_scale 1024.0000 (1581.9983) mem 7381MB [2024-08-27 12:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 12:05:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:05:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:10:01 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 12:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:10:09 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 12:10:10 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:12:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:12:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:12:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:12:43 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:12:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 174) [2024-08-27 12:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1200/1251] eta 0:01:11 lr 0.000422 wd 0.0500 time 0.2265 (1.3946) data time 0.0009 (0.0704) model time 0.2256 (1.3242) loss 3.3854 (3.4719) grad_norm 2.5223 (3.6820) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1210/1251] eta 0:00:33 lr 0.000422 wd 0.0500 time 0.2233 (0.8092) data time 0.0008 (0.0357) model time 0.2225 (0.7735) loss 3.5078 (3.3938) grad_norm 2.5019 (3.1867) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1220/1251] eta 0:00:19 lr 0.000422 wd 0.0500 time 0.2193 (0.6157) data time 0.0009 (0.0241) model time 0.2185 (0.5916) loss 3.6140 (3.4268) grad_norm 2.0800 (3.0536) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1230/1251] eta 0:00:10 lr 0.000422 wd 0.0500 time 0.2226 (0.5182) data time 0.0007 (0.0183) model time 0.2219 (0.4998) loss 2.7091 (3.3347) grad_norm 1.6751 (2.9402) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [174/300][1240/1251] eta 0:00:05 lr 0.000422 wd 0.0500 time 0.2147 (0.4587) data time 0.0005 (0.0149) model time 0.2141 (0.4438) loss 2.9474 (3.3190) grad_norm 3.2461 (2.8475) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [174/300][1250/1251] eta 0:00:00 lr 0.000422 wd 0.0500 time 0.2122 (0.4178) data time 0.0004 (0.0125) model time 0.2118 (0.4052) loss 3.1541 (3.2883) grad_norm 1.8962 (2.7926) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 174 training takes 0:00:25 [2024-08-27 12:13:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:13:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.357 (0.357) Loss 0.4512 (0.4512) Acc@1 91.016 (91.016) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-27 12:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.098) Loss 0.6631 (0.6817) Acc@1 86.621 (85.094) Acc@5 97.168 (96.928) Mem 7377MB [2024-08-27 12:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.084) Loss 0.9741 (0.7082) Acc@1 76.367 (84.142) Acc@5 95.117 (96.917) Mem 7377MB [2024-08-27 12:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.080) Loss 1.1797 (0.8065) Acc@1 71.777 (81.804) Acc@5 92.090 (95.920) Mem 7377MB [2024-08-27 12:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.076) Loss 1.1230 (0.8613) Acc@1 73.340 (80.400) Acc@5 93.262 (95.351) Mem 7377MB [2024-08-27 12:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.942 Acc@5 95.246 [2024-08-27 12:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 79.9% [2024-08-27 12:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.720 (0.720) Loss 0.4001 (0.4001) Acc@1 93.164 (93.164) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-27 12:13:21 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.136) Loss 0.6201 (0.6240) Acc@1 87.402 (86.621) Acc@5 97.461 (97.479) Mem 7377MB [2024-08-27 12:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.105) Loss 0.8896 (0.6488) Acc@1 79.004 (85.775) Acc@5 95.605 (97.452) Mem 7377MB [2024-08-27 12:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.094) Loss 1.1191 (0.7352) Acc@1 72.070 (83.663) Acc@5 92.969 (96.510) Mem 7377MB [2024-08-27 12:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.086) Loss 1.0068 (0.7795) Acc@1 75.098 (82.353) Acc@5 94.043 (96.046) Mem 7377MB [2024-08-27 12:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.928 Acc@5 96.022 [2024-08-27 12:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 81.9% [2024-08-27 12:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.93% [2024-08-27 12:13:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 12:13:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 12:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][0/1251] eta 0:13:18 lr 0.000422 wd 0.0500 time 0.6386 (0.6386) data time 0.3930 (0.3930) model time 0.0000 (0.0000) loss 2.9557 (2.9557) grad_norm 2.0457 (2.0457) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 12:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][10/1251] eta 0:05:23 lr 0.000422 wd 0.0500 time 0.2198 (0.2604) data time 0.0009 (0.0365) model time 0.0000 (0.0000) loss 3.2576 (3.0413) grad_norm 2.5960 (2.7404) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][20/1251] eta 0:04:59 lr 0.000422 wd 0.0500 time 0.2274 (0.2436) data time 0.0009 (0.0196) model time 0.0000 (0.0000) loss 2.8076 (3.1089) grad_norm 3.0832 (2.6579) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][30/1251] eta 0:04:48 lr 0.000422 wd 0.0500 time 0.2174 (0.2367) data time 0.0008 (0.0136) model time 0.0000 (0.0000) loss 3.7242 (3.1281) grad_norm 3.3696 (2.7584) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][40/1251] eta 0:04:44 lr 0.000422 wd 0.0500 time 0.2333 (0.2347) data time 0.0008 (0.0105) model time 0.0000 (0.0000) loss 3.6033 (3.1404) grad_norm 2.0323 (2.7931) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][50/1251] eta 0:04:39 lr 0.000422 wd 0.0500 time 0.2309 (0.2329) data time 0.0010 (0.0086) model time 0.0000 (0.0000) loss 2.5195 (3.1432) grad_norm 2.5437 (2.7332) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][60/1251] eta 0:04:36 lr 0.000422 wd 0.0500 time 0.2195 (0.2319) data time 0.0007 (0.0074) model time 0.2187 
(0.2263) loss 2.4277 (3.1516) grad_norm 2.7123 (2.7188) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][70/1251] eta 0:04:32 lr 0.000422 wd 0.0500 time 0.2242 (0.2308) data time 0.0010 (0.0065) model time 0.2232 (0.2245) loss 3.2948 (3.1482) grad_norm 2.8971 (2.6893) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][80/1251] eta 0:04:29 lr 0.000422 wd 0.0500 time 0.2266 (0.2299) data time 0.0007 (0.0058) model time 0.2259 (0.2240) loss 3.3480 (3.1326) grad_norm 4.2026 (2.7225) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][90/1251] eta 0:04:26 lr 0.000422 wd 0.0500 time 0.2288 (0.2291) data time 0.0008 (0.0053) model time 0.2280 (0.2234) loss 2.4146 (3.1193) grad_norm 3.1978 (2.8141) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][100/1251] eta 0:04:22 lr 0.000422 wd 0.0500 time 0.2221 (0.2284) data time 0.0009 (0.0048) model time 0.2212 (0.2230) loss 3.2847 (3.1277) grad_norm 2.4571 (2.8160) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][110/1251] eta 0:04:20 lr 0.000422 wd 0.0500 time 0.2201 (0.2280) data time 0.0010 (0.0045) model time 0.2190 (0.2228) loss 2.7853 (3.1198) grad_norm 3.1799 (2.7838) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][120/1251] eta 0:04:17 lr 0.000422 wd 0.0500 time 0.2198 (0.2275) data time 0.0009 (0.0042) model time 0.2188 (0.2226) loss 3.2073 (3.1092) grad_norm 2.6104 (2.7970) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][130/1251] 
eta 0:04:14 lr 0.000422 wd 0.0500 time 0.2243 (0.2273) data time 0.0008 (0.0039) model time 0.2235 (0.2228) loss 2.9221 (3.1137) grad_norm 1.9824 (2.7823) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][140/1251] eta 0:04:12 lr 0.000421 wd 0.0500 time 0.2201 (0.2270) data time 0.0009 (0.0037) model time 0.2193 (0.2227) loss 2.8449 (3.0935) grad_norm 2.4056 (2.7628) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][150/1251] eta 0:04:09 lr 0.000421 wd 0.0500 time 0.2213 (0.2267) data time 0.0012 (0.0035) model time 0.2201 (0.2226) loss 3.4676 (3.0814) grad_norm 2.4171 (2.7699) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][160/1251] eta 0:04:07 lr 0.000421 wd 0.0500 time 0.2229 (0.2266) data time 0.0008 (0.0034) model time 0.2221 (0.2227) loss 3.4931 (3.0711) grad_norm 2.4333 (2.7698) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][170/1251] eta 0:04:04 lr 0.000421 wd 0.0500 time 0.2251 (0.2266) data time 0.0007 (0.0032) model time 0.2244 (0.2230) loss 2.0955 (3.0760) grad_norm 3.9814 (2.7641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][180/1251] eta 0:04:02 lr 0.000421 wd 0.0500 time 0.2372 (0.2266) data time 0.0008 (0.0031) model time 0.2364 (0.2232) loss 3.1744 (3.0766) grad_norm 2.7205 (2.7517) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][190/1251] eta 0:04:00 lr 0.000421 wd 0.0500 time 0.2247 (0.2267) data time 0.0008 (0.0030) model time 0.2239 (0.2235) loss 3.3903 (3.0666) grad_norm 2.9117 (2.8804) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 12:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][200/1251] eta 0:03:58 lr 0.000421 wd 0.0500 time 0.2301 (0.2266) data time 0.0007 (0.0029) model time 0.2293 (0.2235) loss 2.9769 (3.0578) grad_norm 3.9798 (2.8989) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][210/1251] eta 0:03:55 lr 0.000421 wd 0.0500 time 0.2128 (0.2265) data time 0.0008 (0.0028) model time 0.2120 (0.2235) loss 3.7663 (3.0531) grad_norm 2.9560 (2.8943) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][220/1251] eta 0:03:53 lr 0.000421 wd 0.0500 time 0.2233 (0.2264) data time 0.0008 (0.0027) model time 0.2225 (0.2236) loss 3.1506 (3.0587) grad_norm 2.5455 (2.8856) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][230/1251] eta 0:03:51 lr 0.000421 wd 0.0500 time 0.2219 (0.2263) data time 0.0007 (0.0026) model time 0.2212 (0.2235) loss 1.9282 (3.0536) grad_norm 3.1404 (2.8750) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][240/1251] eta 0:03:51 lr 0.000421 wd 0.0500 time 0.2213 (0.2285) data time 0.0013 (0.0026) model time 0.2200 (0.2264) loss 2.8001 (3.0448) grad_norm 3.4632 (2.8823) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][250/1251] eta 0:03:48 lr 0.000421 wd 0.0500 time 0.2219 (0.2284) data time 0.0010 (0.0025) model time 0.2209 (0.2264) loss 3.2174 (3.0428) grad_norm 1.8077 (2.8594) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][260/1251] eta 0:03:46 lr 0.000421 wd 0.0500 time 0.2230 (0.2283) data time 0.0006 (0.0024) model time 0.2224 
(0.2262) loss 4.0178 (3.0575) grad_norm 2.5723 (2.8628) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][270/1251] eta 0:03:43 lr 0.000421 wd 0.0500 time 0.2235 (0.2282) data time 0.0007 (0.0024) model time 0.2227 (0.2262) loss 2.2030 (3.0579) grad_norm 2.1186 (2.8648) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][280/1251] eta 0:03:41 lr 0.000421 wd 0.0500 time 0.2222 (0.2280) data time 0.0009 (0.0023) model time 0.2213 (0.2260) loss 3.0948 (3.0671) grad_norm 2.9261 (2.8581) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][290/1251] eta 0:03:39 lr 0.000421 wd 0.0500 time 0.2301 (0.2279) data time 0.0005 (0.0023) model time 0.2296 (0.2259) loss 3.4822 (3.0683) grad_norm 2.3927 (2.8630) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][300/1251] eta 0:03:37 lr 0.000421 wd 0.0500 time 0.2233 (0.2284) data time 0.0007 (0.0022) model time 0.2226 (0.2265) loss 2.5689 (3.0696) grad_norm 3.0694 (2.8576) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][310/1251] eta 0:03:34 lr 0.000421 wd 0.0500 time 0.2176 (0.2283) data time 0.0010 (0.0022) model time 0.2167 (0.2264) loss 2.3761 (3.0691) grad_norm 3.5699 (2.8651) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][320/1251] eta 0:03:32 lr 0.000421 wd 0.0500 time 0.2201 (0.2281) data time 0.0007 (0.0022) model time 0.2194 (0.2263) loss 2.1704 (3.0678) grad_norm 2.2549 (2.8569) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[175/300][330/1251] eta 0:03:29 lr 0.000421 wd 0.0500 time 0.2190 (0.2280) data time 0.0007 (0.0021) model time 0.2183 (0.2262) loss 3.9003 (3.0649) grad_norm 2.5252 (2.8591) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][340/1251] eta 0:03:27 lr 0.000421 wd 0.0500 time 0.2253 (0.2280) data time 0.0008 (0.0021) model time 0.2245 (0.2261) loss 2.9496 (3.0717) grad_norm 2.0442 (2.8434) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][350/1251] eta 0:03:25 lr 0.000421 wd 0.0500 time 0.2209 (0.2278) data time 0.0009 (0.0021) model time 0.2200 (0.2260) loss 3.9120 (3.0797) grad_norm 2.7905 (2.8570) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][360/1251] eta 0:03:22 lr 0.000421 wd 0.0500 time 0.2173 (0.2277) data time 0.0008 (0.0020) model time 0.2165 (0.2259) loss 3.7347 (3.0791) grad_norm 3.0170 (2.8707) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][370/1251] eta 0:03:20 lr 0.000420 wd 0.0500 time 0.2285 (0.2276) data time 0.0006 (0.0020) model time 0.2279 (0.2258) loss 3.5596 (3.0891) grad_norm 3.3329 (2.8668) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][380/1251] eta 0:03:18 lr 0.000420 wd 0.0500 time 0.2237 (0.2274) data time 0.0007 (0.0020) model time 0.2230 (0.2256) loss 3.6143 (3.0957) grad_norm 2.0676 (2.8667) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 12:14:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 12:14:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:16:51 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:19:58 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:20:07 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:20:09 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:20:10 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:20:10 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 175) [2024-08-27 12:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][390/1251] eta 3:12:27 lr 0.000420 wd 0.0500 time 13.4118 (13.4118) data time 0.8243 (0.8243) model time 12.5875 (12.5875) loss 4.0523 (4.0523) grad_norm 2.7438 (2.7438) loss_scale 1024.0000 (1024.0000) mem 20033MB [2024-08-27 12:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][400/1251] eta 0:20:28 lr 0.000420 wd 0.0500 time 0.2267 (1.4439) data time 0.0011 (0.0759) model time 0.2256 (1.3680) loss 2.7023 (3.4005) grad_norm 2.2576 (3.0107) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][410/1251] eta 0:12:06 lr 0.000420 wd 0.0500 time 0.2251 (0.8644) data time 0.0010 (0.0402) model time 0.2241 (0.8241) loss 2.9030 (3.3488) grad_norm 3.1555 (2.6892) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][420/1251] eta 0:09:07 lr 0.000420 wd 0.0500 time 0.2236 (0.6591) data time 0.0008 (0.0276) model time 0.2228 (0.6315) loss 2.2570 (3.3686) grad_norm 2.7646 (2.7341) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][430/1251] eta 0:07:34 lr 0.000420 wd 0.0500 time 0.2241 (0.5541) data time 0.0010 (0.0211) model time 0.2231 (0.5330) loss 3.0733 (3.2796) grad_norm 3.5507 (2.8180) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [175/300][440/1251] eta 0:06:37 lr 0.000420 wd 0.0500 time 0.2314 (0.4901) data time 0.0007 (0.0171) model time 0.2307 (0.4730) loss 3.4708 (3.2645) grad_norm 4.0259 (2.8813) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][450/1251] eta 0:05:58 lr 0.000420 wd 0.0500 time 0.2248 (0.4474) data time 0.0009 (0.0145) model time 0.2239 (0.4329) loss 3.1680 (3.2390) grad_norm 3.0624 (2.9137) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][460/1251] eta 0:05:29 lr 0.000420 wd 0.0500 time 0.2236 (0.4165) data time 0.0010 (0.0126) model time 0.2226 (0.4039) loss 3.2364 (3.2091) grad_norm 3.2574 (2.9332) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][470/1251] eta 0:05:07 lr 0.000420 wd 0.0500 time 0.2223 (0.3934) data time 0.0010 (0.0112) model time 0.2213 (0.3823) loss 2.6028 (3.1936) grad_norm 2.0019 (2.9814) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][480/1251] eta 0:04:49 lr 0.000420 wd 0.0500 time 0.2270 (0.3752) data time 0.0011 (0.0101) model time 0.2260 (0.3652) loss 4.1278 (3.1846) grad_norm 2.7503 (2.9449) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][490/1251] eta 0:04:34 lr 0.000420 wd 0.0500 time 0.2219 (0.3610) data time 0.0008 (0.0093) model time 0.2211 (0.3517) loss 3.5318 (3.1890) grad_norm 2.9272 (2.9286) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][500/1251] eta 0:04:22 lr 0.000420 wd 0.0500 time 0.2286 (0.3490) data time 0.0011 (0.0086) model time 0.2275 (0.3404) loss 2.7221 (3.1827) grad_norm 2.1254 (2.9189) loss_scale 
1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][510/1251] eta 0:04:11 lr 0.000420 wd 0.0500 time 0.2321 (0.3391) data time 0.0009 (0.0080) model time 0.2312 (0.3312) loss 1.8537 (3.1888) grad_norm 2.0379 (2.8803) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][520/1251] eta 0:04:01 lr 0.000420 wd 0.0500 time 0.2539 (0.3310) data time 0.0010 (0.0074) model time 0.2529 (0.3236) loss 3.2495 (3.1781) grad_norm 1.9625 (2.8610) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][530/1251] eta 0:03:53 lr 0.000420 wd 0.0500 time 0.2220 (0.3237) data time 0.0009 (0.0070) model time 0.2212 (0.3168) loss 2.9041 (3.1739) grad_norm 2.3854 (2.8402) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][540/1251] eta 0:03:45 lr 0.000420 wd 0.0500 time 0.2252 (0.3175) data time 0.0011 (0.0066) model time 0.2242 (0.3110) loss 2.3930 (3.1717) grad_norm 2.6005 (2.8910) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][550/1251] eta 0:03:38 lr 0.000420 wd 0.0500 time 0.2321 (0.3122) data time 0.0011 (0.0062) model time 0.2310 (0.3060) loss 3.4859 (3.1789) grad_norm 2.2338 (2.8724) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][560/1251] eta 0:03:32 lr 0.000420 wd 0.0500 time 0.2236 (0.3072) data time 0.0009 (0.0059) model time 0.2227 (0.3013) loss 3.1945 (3.1759) grad_norm 2.7199 (2.8758) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][570/1251] eta 0:03:26 lr 0.000420 wd 0.0500 time 0.2275 (0.3029) data time 
0.0009 (0.0056) model time 0.2266 (0.2973) loss 3.2470 (3.1645) grad_norm 3.6721 (2.8727) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][580/1251] eta 0:03:20 lr 0.000420 wd 0.0500 time 0.2238 (0.2993) data time 0.0011 (0.0054) model time 0.2227 (0.2939) loss 2.8913 (3.1675) grad_norm 3.5136 (2.8767) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][590/1251] eta 0:03:15 lr 0.000420 wd 0.0500 time 0.2298 (0.2959) data time 0.0012 (0.0052) model time 0.2286 (0.2907) loss 3.1295 (3.1620) grad_norm 3.3851 (2.8820) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][600/1251] eta 0:03:10 lr 0.000419 wd 0.0500 time 0.2212 (0.2927) data time 0.0011 (0.0050) model time 0.2201 (0.2877) loss 3.2432 (3.1536) grad_norm 3.1170 (2.8920) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][610/1251] eta 0:03:05 lr 0.000419 wd 0.0500 time 0.2270 (0.2899) data time 0.0006 (0.0048) model time 0.2264 (0.2850) loss 3.6155 (3.1440) grad_norm 4.4732 (2.8982) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][620/1251] eta 0:03:01 lr 0.000419 wd 0.0500 time 0.2323 (0.2874) data time 0.0006 (0.0047) model time 0.2317 (0.2827) loss 2.3565 (3.1444) grad_norm 1.9138 (2.8735) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][630/1251] eta 0:02:57 lr 0.000419 wd 0.0500 time 0.2358 (0.2851) data time 0.0007 (0.0045) model time 0.2351 (0.2805) loss 3.1798 (3.1428) grad_norm 3.7684 (2.8650) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [175/300][640/1251] eta 0:02:52 lr 0.000419 wd 0.0500 time 0.2258 (0.2830) data time 0.0009 (0.0044) model time 0.2249 (0.2786) loss 3.4537 (3.1346) grad_norm 4.3354 (2.8771) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][650/1251] eta 0:02:48 lr 0.000419 wd 0.0500 time 0.2249 (0.2809) data time 0.0008 (0.0043) model time 0.2241 (0.2766) loss 3.3451 (3.1288) grad_norm 4.0115 (2.8951) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][660/1251] eta 0:02:44 lr 0.000419 wd 0.0500 time 0.2284 (0.2790) data time 0.0007 (0.0041) model time 0.2277 (0.2749) loss 3.6313 (3.1248) grad_norm 3.7701 (2.9249) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][670/1251] eta 0:02:41 lr 0.000419 wd 0.0500 time 0.2250 (0.2772) data time 0.0008 (0.0040) model time 0.2242 (0.2732) loss 3.3358 (3.1308) grad_norm 2.4314 (2.9330) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][680/1251] eta 0:02:37 lr 0.000419 wd 0.0500 time 0.2256 (0.2763) data time 0.0007 (0.0039) model time 0.2249 (0.2723) loss 2.0236 (3.1248) grad_norm 2.4833 (2.9333) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][690/1251] eta 0:02:34 lr 0.000419 wd 0.0500 time 0.2451 (0.2749) data time 0.0012 (0.0038) model time 0.2439 (0.2710) loss 3.4219 (3.1175) grad_norm 2.4203 (2.9258) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][700/1251] eta 0:02:31 lr 0.000419 wd 0.0500 time 0.2336 (0.2742) data time 0.0009 (0.0037) model time 0.2327 (0.2705) loss 3.2573 (3.1150) grad_norm 1.9386 (2.9138) 
loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][710/1251] eta 0:02:27 lr 0.000419 wd 0.0500 time 0.2400 (0.2729) data time 0.0008 (0.0037) model time 0.2392 (0.2693) loss 4.0880 (3.1268) grad_norm 3.8131 (2.9084) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][720/1251] eta 0:02:24 lr 0.000419 wd 0.0500 time 0.2271 (0.2717) data time 0.0008 (0.0036) model time 0.2263 (0.2681) loss 1.9041 (3.1246) grad_norm 2.2851 (2.8998) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][730/1251] eta 0:02:20 lr 0.000419 wd 0.0500 time 0.2369 (0.2706) data time 0.0009 (0.0035) model time 0.2360 (0.2671) loss 3.1600 (3.1285) grad_norm 2.2574 (2.8821) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][740/1251] eta 0:02:17 lr 0.000419 wd 0.0500 time 0.2230 (0.2694) data time 0.0007 (0.0034) model time 0.2223 (0.2660) loss 3.4356 (3.1310) grad_norm 2.5771 (2.8757) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][750/1251] eta 0:02:14 lr 0.000419 wd 0.0500 time 0.2197 (0.2683) data time 0.0008 (0.0034) model time 0.2189 (0.2649) loss 2.6794 (3.1335) grad_norm 2.0416 (2.8623) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][760/1251] eta 0:02:11 lr 0.000419 wd 0.0500 time 0.2249 (0.2674) data time 0.0009 (0.0033) model time 0.2240 (0.2641) loss 2.3838 (3.1290) grad_norm 1.6168 (2.8575) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][770/1251] eta 0:02:08 lr 0.000419 wd 0.0500 time 0.2332 (0.2664) 
data time 0.0007 (0.0033) model time 0.2325 (0.2632) loss 1.8547 (3.1254) grad_norm 2.2377 (2.8545) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][780/1251] eta 0:02:05 lr 0.000419 wd 0.0500 time 0.2269 (0.2655) data time 0.0007 (0.0032) model time 0.2262 (0.2623) loss 3.7676 (3.1239) grad_norm 2.6015 (2.8613) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][790/1251] eta 0:02:01 lr 0.000419 wd 0.0500 time 0.2324 (0.2646) data time 0.0008 (0.0031) model time 0.2316 (0.2615) loss 3.4582 (3.1281) grad_norm 2.8121 (2.9121) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][800/1251] eta 0:01:58 lr 0.000419 wd 0.0500 time 0.2250 (0.2638) data time 0.0009 (0.0031) model time 0.2240 (0.2607) loss 3.4592 (3.1355) grad_norm 2.3667 (2.9069) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][810/1251] eta 0:01:55 lr 0.000419 wd 0.0500 time 0.2364 (0.2630) data time 0.0007 (0.0030) model time 0.2357 (0.2600) loss 3.9574 (3.1364) grad_norm 3.0289 (2.8986) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][820/1251] eta 0:01:53 lr 0.000418 wd 0.0500 time 0.2276 (0.2622) data time 0.0011 (0.0030) model time 0.2265 (0.2592) loss 3.3878 (3.1431) grad_norm 2.1504 (2.8980) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][830/1251] eta 0:01:50 lr 0.000418 wd 0.0500 time 0.2230 (0.2615) data time 0.0007 (0.0029) model time 0.2223 (0.2586) loss 3.8087 (3.1438) grad_norm 2.3399 (2.8908) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:12 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [175/300][840/1251] eta 0:01:47 lr 0.000418 wd 0.0500 time 0.2358 (0.2608) data time 0.0007 (0.0029) model time 0.2351 (0.2579) loss 2.7288 (3.1441) grad_norm 2.7800 (2.8907) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][850/1251] eta 0:01:44 lr 0.000418 wd 0.0500 time 0.2261 (0.2602) data time 0.0010 (0.0029) model time 0.2251 (0.2573) loss 3.3373 (3.1376) grad_norm 3.0766 (2.8893) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][860/1251] eta 0:01:41 lr 0.000418 wd 0.0500 time 0.2349 (0.2596) data time 0.0008 (0.0028) model time 0.2341 (0.2567) loss 3.0172 (3.1298) grad_norm 3.2345 (2.8892) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][870/1251] eta 0:01:38 lr 0.000418 wd 0.0500 time 0.2295 (0.2589) data time 0.0009 (0.0028) model time 0.2286 (0.2562) loss 3.4464 (3.1285) grad_norm 2.1942 (2.8814) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][880/1251] eta 0:01:35 lr 0.000418 wd 0.0500 time 0.2253 (0.2584) data time 0.0011 (0.0027) model time 0.2242 (0.2557) loss 3.5407 (3.1340) grad_norm 3.1196 (2.8759) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][890/1251] eta 0:01:33 lr 0.000418 wd 0.0500 time 0.2317 (0.2580) data time 0.0010 (0.0027) model time 0.2307 (0.2552) loss 3.3076 (3.1336) grad_norm 2.6098 (2.8763) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][900/1251] eta 0:01:30 lr 0.000418 wd 0.0500 time 0.2279 (0.2574) data time 0.0006 (0.0027) model time 0.2272 (0.2547) loss 4.0072 (3.1384) grad_norm 
2.4545 (2.8737) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][910/1251] eta 0:01:27 lr 0.000418 wd 0.0500 time 0.2405 (0.2569) data time 0.0007 (0.0027) model time 0.2397 (0.2543) loss 2.9560 (3.1377) grad_norm 3.3042 (2.8928) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][920/1251] eta 0:01:24 lr 0.000418 wd 0.0500 time 0.2296 (0.2564) data time 0.0009 (0.0026) model time 0.2287 (0.2538) loss 3.2337 (3.1328) grad_norm 2.6264 (2.8863) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][930/1251] eta 0:01:22 lr 0.000418 wd 0.0500 time 0.2271 (0.2559) data time 0.0009 (0.0026) model time 0.2262 (0.2533) loss 3.5938 (3.1345) grad_norm 2.3834 (2.8929) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][940/1251] eta 0:01:19 lr 0.000418 wd 0.0500 time 0.2306 (0.2555) data time 0.0007 (0.0026) model time 0.2299 (0.2529) loss 3.6947 (3.1342) grad_norm 2.6852 (2.8886) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][950/1251] eta 0:01:16 lr 0.000418 wd 0.0500 time 0.2268 (0.2550) data time 0.0012 (0.0025) model time 0.2256 (0.2525) loss 3.3088 (3.1387) grad_norm 1.9758 (2.8768) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][960/1251] eta 0:01:14 lr 0.000418 wd 0.0500 time 0.2196 (0.2546) data time 0.0011 (0.0025) model time 0.2185 (0.2521) loss 3.3132 (3.1411) grad_norm 2.0721 (2.8680) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][970/1251] eta 0:01:11 lr 0.000418 wd 0.0500 time 
0.2287 (0.2542) data time 0.0007 (0.0025) model time 0.2280 (0.2517) loss 3.6211 (3.1427) grad_norm 4.8957 (2.8736) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][980/1251] eta 0:01:08 lr 0.000418 wd 0.0500 time 0.2295 (0.2538) data time 0.0008 (0.0025) model time 0.2286 (0.2513) loss 3.2933 (3.1428) grad_norm 2.1023 (2.8780) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][990/1251] eta 0:01:06 lr 0.000418 wd 0.0500 time 0.2284 (0.2534) data time 0.0012 (0.0024) model time 0.2273 (0.2510) loss 2.2510 (3.1418) grad_norm 4.9283 (2.8803) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1000/1251] eta 0:01:03 lr 0.000418 wd 0.0500 time 0.2297 (0.2531) data time 0.0009 (0.0024) model time 0.2289 (0.2506) loss 3.3798 (3.1425) grad_norm 2.1476 (2.8753) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1010/1251] eta 0:01:00 lr 0.000418 wd 0.0500 time 0.2304 (0.2527) data time 0.0008 (0.0024) model time 0.2295 (0.2503) loss 3.9778 (3.1451) grad_norm 2.7707 (2.8735) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1020/1251] eta 0:00:58 lr 0.000418 wd 0.0500 time 0.2210 (0.2524) data time 0.0010 (0.0024) model time 0.2200 (0.2500) loss 3.3248 (3.1449) grad_norm 4.3856 (2.8713) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1030/1251] eta 0:00:55 lr 0.000418 wd 0.0500 time 0.2325 (0.2521) data time 0.0007 (0.0024) model time 0.2318 (0.2497) loss 2.4294 (3.1427) grad_norm 2.0311 (2.8719) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:22:58 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1040/1251] eta 0:00:53 lr 0.000418 wd 0.0500 time 0.2221 (0.2518) data time 0.0007 (0.0024) model time 0.2214 (0.2494) loss 3.6897 (3.1437) grad_norm 2.5889 (2.8894) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1050/1251] eta 0:00:50 lr 0.000417 wd 0.0500 time 0.2276 (0.2515) data time 0.0007 (0.0023) model time 0.2269 (0.2491) loss 2.0774 (3.1379) grad_norm 6.4361 (2.8945) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1060/1251] eta 0:00:47 lr 0.000417 wd 0.0500 time 0.2233 (0.2512) data time 0.0010 (0.0023) model time 0.2223 (0.2488) loss 3.7165 (3.1417) grad_norm 3.2447 (2.8962) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1070/1251] eta 0:00:45 lr 0.000417 wd 0.0500 time 0.2277 (0.2509) data time 0.0007 (0.0023) model time 0.2270 (0.2486) loss 3.8070 (3.1429) grad_norm 3.4380 (2.8916) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1080/1251] eta 0:00:42 lr 0.000417 wd 0.0500 time 0.2264 (0.2506) data time 0.0008 (0.0023) model time 0.2256 (0.2483) loss 2.4300 (3.1401) grad_norm 2.3535 (2.8882) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1090/1251] eta 0:00:40 lr 0.000417 wd 0.0500 time 0.2244 (0.2503) data time 0.0009 (0.0023) model time 0.2235 (0.2481) loss 3.5764 (3.1386) grad_norm 3.2538 (2.8894) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1100/1251] eta 0:00:37 lr 0.000417 wd 0.0500 time 0.2348 (0.2501) data time 0.0009 (0.0023) model time 0.2340 (0.2478) loss 
2.1910 (3.1382) grad_norm 2.8303 (2.8928) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1110/1251] eta 0:00:35 lr 0.000417 wd 0.0500 time 0.2252 (0.2498) data time 0.0007 (0.0022) model time 0.2245 (0.2476) loss 3.2687 (3.1355) grad_norm 2.0425 (2.8906) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1120/1251] eta 0:00:32 lr 0.000417 wd 0.0500 time 0.2282 (0.2496) data time 0.0008 (0.0022) model time 0.2274 (0.2473) loss 3.1441 (3.1351) grad_norm 2.7158 (2.8875) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1130/1251] eta 0:00:30 lr 0.000417 wd 0.0500 time 0.2244 (0.2493) data time 0.0010 (0.0022) model time 0.2234 (0.2471) loss 3.5029 (3.1397) grad_norm 3.6249 (2.9063) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1140/1251] eta 0:00:27 lr 0.000417 wd 0.0500 time 0.2250 (0.2491) data time 0.0009 (0.0022) model time 0.2242 (0.2469) loss 3.4708 (3.1388) grad_norm 2.4752 (2.9064) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1150/1251] eta 0:00:25 lr 0.000417 wd 0.0500 time 0.2256 (0.2488) data time 0.0010 (0.0022) model time 0.2246 (0.2466) loss 2.9791 (3.1378) grad_norm 3.4709 (2.9032) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1160/1251] eta 0:00:22 lr 0.000417 wd 0.0500 time 0.2335 (0.2487) data time 0.0007 (0.0022) model time 0.2328 (0.2465) loss 3.4755 (3.1380) grad_norm 2.4971 (2.9016) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1170/1251] eta 
0:00:20 lr 0.000417 wd 0.0500 time 0.2272 (0.2484) data time 0.0006 (0.0022) model time 0.2265 (0.2463) loss 3.6049 (3.1380) grad_norm 3.1663 (2.8988) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 12:23:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:23:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:28:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:28:42 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:28:44 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:28:45 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:28:45 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 175) [2024-08-27 12:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 12:29:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:29:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:32:19 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:32:30 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:32:31 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:32:32 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:32:32 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 175) [2024-08-27 12:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1180/1251] eta 0:17:43 lr 0.000417 wd 0.0500 time 14.9723 (14.9723) data time 0.8070 (0.8070) model time 14.1653 (14.1653) loss 4.1167 (4.1167) grad_norm 2.5423 (2.5423) loss_scale 1024.0000 (1024.0000) mem 20033MB [2024-08-27 12:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1190/1251] eta 0:01:36 lr 0.000417 wd 0.0500 time 0.2243 (1.5860) data time 0.0010 (0.0753) model time 0.2234 (1.5108) loss 2.7604 (3.5823) grad_norm 2.6846 (3.0469) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1200/1251] eta 0:00:47 lr 0.000417 wd 0.0500 time 0.2276 (0.9391) data time 0.0009 (0.0400) model time 0.2267 (0.8991) loss 3.0616 (3.3708) grad_norm 4.3892 (3.0349) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1210/1251] eta 0:00:29 lr 0.000417 wd 0.0500 time 0.2258 (0.7119) data time 0.0011 (0.0275) model time 0.2248 (0.6844) loss 2.6270 (3.3726) grad_norm 1.8514 (2.9955) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1220/1251] eta 0:00:18 lr 0.000417 wd 0.0500 time 0.2221 (0.5943) data time 0.0009 (0.0211) model time 0.2212 (0.5732) loss 3.3281 (3.3055) grad_norm 4.3334 (2.9733) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [175/300][1230/1251] eta 0:00:10 lr 0.000417 wd 0.0500 time 0.2227 (0.5236) data time 0.0012 (0.0172) model time 0.2215 (0.5064) loss 3.4907 (3.3102) grad_norm 2.0313 (2.9605) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1240/1251] eta 0:00:05 lr 0.000417 wd 0.0500 time 0.2118 (0.4737) data time 0.0007 (0.0146) model time 0.2111 (0.4591) loss 3.1346 (3.2663) grad_norm 3.0756 (2.9956) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [175/300][1250/1251] eta 0:00:00 lr 0.000417 wd 0.0500 time 0.2278 (0.4378) data time 0.0006 (0.0129) model time 0.2272 (0.4249) loss 3.1174 (3.2314) grad_norm 3.1615 (3.0213) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 175 training takes 0:00:31 [2024-08-27 12:33:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:33:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 12:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.475 (0.475) Loss 0.4429 (0.4429) Acc@1 91.602 (91.602) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-27 12:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.118) Loss 0.7021 (0.6958) Acc@1 84.668 (84.801) Acc@5 96.582 (96.946) Mem 7377MB [2024-08-27 12:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.101) Loss 1.0098 (0.7182) Acc@1 75.684 (83.924) Acc@5 94.629 (96.959) Mem 7377MB [2024-08-27 12:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.094) Loss 1.2061 (0.8165) Acc@1 69.531 (81.609) Acc@5 92.285 (95.905) Mem 7377MB [2024-08-27 12:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0898 (0.8690) Acc@1 74.023 (80.366) Acc@5 93.457 (95.274) Mem 7377MB [2024-08-27 12:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.028 Acc@5 95.176 [2024-08-27 12:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.0% [2024-08-27 12:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.03% [2024-08-27 12:33:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 12:33:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 12:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.473 (0.473) Loss 0.3984 (0.3984) Acc@1 92.969 (92.969) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-27 12:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.116) Loss 0.6191 (0.6235) Acc@1 87.793 (86.692) Acc@5 97.266 (97.461) Mem 7377MB [2024-08-27 12:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.100) Loss 0.8896 (0.6485) Acc@1 78.906 (85.863) Acc@5 95.508 (97.452) Mem 7377MB [2024-08-27 12:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.094) Loss 1.1191 (0.7347) Acc@1 71.973 (83.739) Acc@5 93.066 (96.535) Mem 7377MB [2024-08-27 12:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0059 (0.7788) Acc@1 74.805 (82.393) Acc@5 94.043 (96.079) Mem 7377MB [2024-08-27 12:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.952 Acc@5 96.060 [2024-08-27 12:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-08-27 12:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.95% [2024-08-27 12:33:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 12:33:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 12:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][0/1251] eta 0:16:43 lr 0.000417 wd 0.0500 time 0.8023 (0.8023) data time 0.5028 (0.5028) model time 0.0000 (0.0000) loss 2.9264 (2.9264) grad_norm 1.8678 (1.8678) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 12:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][10/1251] eta 0:05:47 lr 0.000417 wd 0.0500 time 0.2274 (0.2799) data time 0.0010 (0.0469) model time 0.0000 (0.0000) loss 3.0519 (3.0886) grad_norm 2.2445 (2.5707) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][20/1251] eta 0:05:16 lr 0.000417 wd 0.0500 time 0.2519 (0.2570) data time 0.0008 (0.0250) model time 0.0000 (0.0000) loss 2.0405 (3.0194) grad_norm 2.5798 (2.3912) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][30/1251] eta 0:05:03 lr 0.000416 wd 0.0500 time 0.2333 (0.2486) data time 0.0007 (0.0173) model time 0.0000 (0.0000) loss 3.6490 (3.1232) grad_norm 2.3994 (2.4659) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][40/1251] eta 0:04:56 lr 0.000416 wd 0.0500 time 0.2570 (0.2446) data time 0.0012 (0.0134) model time 0.0000 (0.0000) loss 3.9558 (3.1615) grad_norm 2.3198 (2.4587) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][50/1251] eta 0:04:50 lr 0.000416 wd 0.0500 time 0.2290 (0.2417) data time 0.0011 (0.0110) model time 0.0000 (0.0000) loss 3.7286 (3.1794) grad_norm 2.3677 (2.4304) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][60/1251] eta 0:04:45 lr 0.000416 wd 0.0500 time 0.2240 (0.2394) data time 0.0009 (0.0095) model time 0.2231 
(0.2256) loss 2.7019 (3.1451) grad_norm 2.9932 (2.5419) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][70/1251] eta 0:04:41 lr 0.000416 wd 0.0500 time 0.2453 (0.2385) data time 0.0012 (0.0083) model time 0.2440 (0.2288) loss 3.2413 (3.1262) grad_norm 2.2672 (2.5453) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][80/1251] eta 0:04:38 lr 0.000416 wd 0.0500 time 0.2476 (0.2378) data time 0.0010 (0.0074) model time 0.2466 (0.2297) loss 3.4999 (3.1220) grad_norm 1.7430 (2.5852) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][90/1251] eta 0:04:34 lr 0.000416 wd 0.0500 time 0.2268 (0.2369) data time 0.0011 (0.0067) model time 0.2258 (0.2294) loss 3.4713 (3.1267) grad_norm 3.1548 (2.6580) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][100/1251] eta 0:04:32 lr 0.000416 wd 0.0500 time 0.2431 (0.2364) data time 0.0010 (0.0062) model time 0.2421 (0.2297) loss 3.0964 (3.1097) grad_norm 2.6727 (2.6126) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][110/1251] eta 0:04:29 lr 0.000416 wd 0.0500 time 0.2383 (0.2359) data time 0.0012 (0.0057) model time 0.2372 (0.2297) loss 3.1493 (3.0906) grad_norm 2.0243 (2.6028) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][120/1251] eta 0:04:26 lr 0.000416 wd 0.0500 time 0.2617 (0.2358) data time 0.0012 (0.0054) model time 0.2606 (0.2301) loss 3.2163 (3.0975) grad_norm 3.4151 (2.6249) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][130/1251] 
eta 0:04:23 lr 0.000416 wd 0.0500 time 0.2283 (0.2353) data time 0.0008 (0.0050) model time 0.2275 (0.2300) loss 3.6679 (3.0950) grad_norm 4.7270 (2.6702) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][140/1251] eta 0:04:20 lr 0.000416 wd 0.0500 time 0.2536 (0.2349) data time 0.0010 (0.0048) model time 0.2526 (0.2298) loss 3.1190 (3.0925) grad_norm 2.6599 (2.6705) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][150/1251] eta 0:04:18 lr 0.000416 wd 0.0500 time 0.2448 (0.2349) data time 0.0007 (0.0046) model time 0.2441 (0.2301) loss 3.7121 (3.0922) grad_norm 2.8589 (2.6598) loss_scale 2048.0000 (1051.1258) mem 7381MB [2024-08-27 12:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][160/1251] eta 0:04:15 lr 0.000416 wd 0.0500 time 0.2284 (0.2346) data time 0.0007 (0.0043) model time 0.2277 (0.2300) loss 3.3526 (3.0888) grad_norm 3.1086 (2.6495) loss_scale 2048.0000 (1113.0435) mem 7381MB [2024-08-27 12:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][170/1251] eta 0:04:13 lr 0.000416 wd 0.0500 time 0.2258 (0.2343) data time 0.0009 (0.0042) model time 0.2249 (0.2299) loss 3.5799 (3.0942) grad_norm 3.9691 (2.6628) loss_scale 2048.0000 (1167.7193) mem 7381MB [2024-08-27 12:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][180/1251] eta 0:04:10 lr 0.000416 wd 0.0500 time 0.2327 (0.2339) data time 0.0008 (0.0040) model time 0.2319 (0.2296) loss 3.7484 (3.0804) grad_norm 2.9067 (2.6659) loss_scale 2048.0000 (1216.3536) mem 7381MB [2024-08-27 12:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][190/1251] eta 0:04:08 lr 0.000416 wd 0.0500 time 0.2404 (0.2338) data time 0.0009 (0.0038) model time 0.2395 (0.2297) loss 3.1655 (3.0753) grad_norm 2.6543 (2.6780) loss_scale 2048.0000 (1259.8953) mem 7381MB 
[2024-08-27 12:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][200/1251] eta 0:04:05 lr 0.000416 wd 0.0500 time 0.2397 (0.2336) data time 0.0009 (0.0037) model time 0.2387 (0.2295) loss 2.7524 (3.0697) grad_norm 2.7906 (2.6772) loss_scale 2048.0000 (1299.1045) mem 7381MB [2024-08-27 12:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][210/1251] eta 0:04:02 lr 0.000416 wd 0.0500 time 0.2270 (0.2333) data time 0.0012 (0.0036) model time 0.2257 (0.2294) loss 2.1773 (3.0712) grad_norm 2.7834 (2.6589) loss_scale 2048.0000 (1334.5972) mem 7381MB [2024-08-27 12:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][220/1251] eta 0:04:01 lr 0.000416 wd 0.0500 time 0.2355 (0.2343) data time 0.0009 (0.0035) model time 0.2346 (0.2309) loss 2.9269 (3.0654) grad_norm 2.1647 (2.6469) loss_scale 2048.0000 (1366.8778) mem 7381MB [2024-08-27 12:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][230/1251] eta 0:04:00 lr 0.000416 wd 0.0500 time 0.2281 (0.2353) data time 0.0010 (0.0034) model time 0.2271 (0.2323) loss 3.2955 (3.0563) grad_norm 2.8564 (2.6568) loss_scale 2048.0000 (1396.3636) mem 7381MB [2024-08-27 12:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][240/1251] eta 0:03:57 lr 0.000416 wd 0.0500 time 0.2218 (0.2350) data time 0.0013 (0.0033) model time 0.2205 (0.2320) loss 3.3493 (3.0603) grad_norm 2.3967 (2.6623) loss_scale 2048.0000 (1423.4025) mem 7381MB [2024-08-27 12:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][250/1251] eta 0:03:55 lr 0.000416 wd 0.0500 time 0.2422 (0.2349) data time 0.0008 (0.0033) model time 0.2414 (0.2318) loss 3.1738 (3.0725) grad_norm 4.4168 (2.6641) loss_scale 2048.0000 (1448.2869) mem 7381MB [2024-08-27 12:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][260/1251] eta 0:03:52 lr 0.000415 wd 0.0500 time 0.2264 (0.2349) data time 0.0010 (0.0033) model time 0.2254 
(0.2319) loss 3.8446 (3.0719) grad_norm 2.2432 (2.6620) loss_scale 2048.0000 (1471.2644) mem 7381MB [2024-08-27 12:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][270/1251] eta 0:03:50 lr 0.000415 wd 0.0500 time 0.2269 (0.2348) data time 0.0007 (0.0033) model time 0.2262 (0.2317) loss 3.5614 (3.0756) grad_norm 4.2170 (2.6683) loss_scale 2048.0000 (1492.5461) mem 7381MB [2024-08-27 12:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][280/1251] eta 0:03:47 lr 0.000415 wd 0.0500 time 0.2284 (0.2348) data time 0.0008 (0.0032) model time 0.2276 (0.2317) loss 3.6065 (3.0784) grad_norm 3.8518 (2.6833) loss_scale 2048.0000 (1512.3132) mem 7381MB [2024-08-27 12:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][290/1251] eta 0:03:45 lr 0.000415 wd 0.0500 time 0.2252 (0.2346) data time 0.0006 (0.0032) model time 0.2246 (0.2316) loss 3.8881 (3.0800) grad_norm 2.3394 (2.6974) loss_scale 2048.0000 (1530.7216) mem 7381MB [2024-08-27 12:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][300/1251] eta 0:03:43 lr 0.000415 wd 0.0500 time 0.2300 (0.2345) data time 0.0009 (0.0031) model time 0.2291 (0.2316) loss 3.3211 (3.0784) grad_norm 2.3836 (2.7024) loss_scale 2048.0000 (1547.9070) mem 7381MB [2024-08-27 12:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][310/1251] eta 0:03:40 lr 0.000415 wd 0.0500 time 0.2311 (0.2344) data time 0.0007 (0.0030) model time 0.2304 (0.2315) loss 3.3878 (3.0775) grad_norm 2.1729 (2.6887) loss_scale 2048.0000 (1563.9871) mem 7381MB [2024-08-27 12:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][320/1251] eta 0:03:38 lr 0.000415 wd 0.0500 time 0.2330 (0.2343) data time 0.0007 (0.0030) model time 0.2323 (0.2314) loss 3.2336 (3.0710) grad_norm 2.3142 (2.6806) loss_scale 2048.0000 (1579.0654) mem 7381MB [2024-08-27 12:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[176/300][330/1251] eta 0:03:35 lr 0.000415 wd 0.0500 time 0.2255 (0.2342) data time 0.0008 (0.0029) model time 0.2246 (0.2313) loss 3.6731 (3.0767) grad_norm 2.1861 (2.6877) loss_scale 2048.0000 (1593.2326) mem 7381MB [2024-08-27 12:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][340/1251] eta 0:03:33 lr 0.000415 wd 0.0500 time 0.2291 (0.2340) data time 0.0009 (0.0029) model time 0.2283 (0.2312) loss 3.1280 (3.0781) grad_norm 4.1651 (2.7144) loss_scale 2048.0000 (1606.5689) mem 7381MB [2024-08-27 12:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][350/1251] eta 0:03:30 lr 0.000415 wd 0.0500 time 0.2327 (0.2340) data time 0.0007 (0.0028) model time 0.2320 (0.2313) loss 3.5342 (3.0767) grad_norm 2.9173 (2.7163) loss_scale 2048.0000 (1619.1453) mem 7381MB [2024-08-27 12:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][360/1251] eta 0:03:28 lr 0.000415 wd 0.0500 time 0.2301 (0.2339) data time 0.0007 (0.0028) model time 0.2295 (0.2312) loss 3.1955 (3.0853) grad_norm 2.6603 (2.7097) loss_scale 2048.0000 (1631.0249) mem 7381MB [2024-08-27 12:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][370/1251] eta 0:03:25 lr 0.000415 wd 0.0500 time 0.2270 (0.2337) data time 0.0010 (0.0027) model time 0.2259 (0.2310) loss 3.1980 (3.0919) grad_norm 2.1687 (2.7028) loss_scale 2048.0000 (1642.2642) mem 7381MB [2024-08-27 12:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][380/1251] eta 0:03:23 lr 0.000415 wd 0.0500 time 0.2363 (0.2336) data time 0.0009 (0.0027) model time 0.2354 (0.2309) loss 2.2110 (3.0926) grad_norm 2.1297 (2.7006) loss_scale 2048.0000 (1652.9134) mem 7381MB [2024-08-27 12:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][390/1251] eta 0:03:20 lr 0.000415 wd 0.0500 time 0.2245 (0.2334) data time 0.0007 (0.0026) model time 0.2238 (0.2308) loss 2.4689 (3.0875) grad_norm 2.4056 (2.7024) loss_scale 2048.0000 
(1663.0179) mem 7381MB [2024-08-27 12:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][400/1251] eta 0:03:18 lr 0.000415 wd 0.0500 time 0.2350 (0.2333) data time 0.0009 (0.0026) model time 0.2341 (0.2307) loss 2.8261 (3.0796) grad_norm 2.3212 (2.6973) loss_scale 2048.0000 (1672.6185) mem 7381MB [2024-08-27 12:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][410/1251] eta 0:03:16 lr 0.000415 wd 0.0500 time 0.2352 (0.2333) data time 0.0009 (0.0026) model time 0.2344 (0.2307) loss 3.6918 (3.0804) grad_norm 3.4389 (2.7070) loss_scale 2048.0000 (1681.7518) mem 7381MB [2024-08-27 12:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][420/1251] eta 0:03:13 lr 0.000415 wd 0.0500 time 0.2358 (0.2332) data time 0.0009 (0.0025) model time 0.2349 (0.2306) loss 3.4870 (3.0836) grad_norm 2.7529 (2.7138) loss_scale 2048.0000 (1690.4513) mem 7381MB [2024-08-27 12:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][430/1251] eta 0:03:11 lr 0.000415 wd 0.0500 time 0.2200 (0.2331) data time 0.0012 (0.0025) model time 0.2188 (0.2305) loss 3.0880 (3.0807) grad_norm 2.7943 (2.7188) loss_scale 2048.0000 (1698.7471) mem 7381MB [2024-08-27 12:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][440/1251] eta 0:03:08 lr 0.000415 wd 0.0500 time 0.2280 (0.2330) data time 0.0006 (0.0025) model time 0.2273 (0.2305) loss 3.6550 (3.0911) grad_norm 2.6515 (2.7183) loss_scale 2048.0000 (1706.6667) mem 7381MB [2024-08-27 12:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][450/1251] eta 0:03:06 lr 0.000415 wd 0.0500 time 0.2318 (0.2329) data time 0.0007 (0.0024) model time 0.2311 (0.2304) loss 2.1098 (3.0880) grad_norm 2.3622 (2.7125) loss_scale 2048.0000 (1714.2350) mem 7381MB [2024-08-27 12:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][460/1251] eta 0:03:04 lr 0.000415 wd 0.0500 time 0.2224 (0.2328) data time 0.0010 
(0.0024) model time 0.2213 (0.2303) loss 3.4306 (3.0864) grad_norm 3.0530 (2.7183) loss_scale 2048.0000 (1721.4751) mem 7381MB [2024-08-27 12:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][470/1251] eta 0:03:01 lr 0.000415 wd 0.0500 time 0.2257 (0.2327) data time 0.0006 (0.0024) model time 0.2251 (0.2302) loss 2.0393 (3.0847) grad_norm 2.9131 (2.7263) loss_scale 2048.0000 (1728.4076) mem 7381MB [2024-08-27 12:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][480/1251] eta 0:02:59 lr 0.000415 wd 0.0500 time 0.2356 (0.2326) data time 0.0007 (0.0024) model time 0.2349 (0.2302) loss 3.2461 (3.0899) grad_norm 2.3058 (2.7264) loss_scale 2048.0000 (1735.0520) mem 7381MB [2024-08-27 12:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][490/1251] eta 0:02:56 lr 0.000414 wd 0.0500 time 0.2327 (0.2325) data time 0.0007 (0.0023) model time 0.2320 (0.2301) loss 3.5015 (3.0954) grad_norm 2.8939 (2.7228) loss_scale 2048.0000 (1741.4257) mem 7381MB [2024-08-27 12:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][500/1251] eta 0:02:54 lr 0.000414 wd 0.0500 time 0.2408 (0.2325) data time 0.0013 (0.0023) model time 0.2395 (0.2301) loss 2.5773 (3.0964) grad_norm 5.2872 (2.7259) loss_scale 2048.0000 (1747.5449) mem 7381MB [2024-08-27 12:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][510/1251] eta 0:02:52 lr 0.000414 wd 0.0500 time 0.2301 (0.2324) data time 0.0008 (0.0023) model time 0.2292 (0.2301) loss 3.4230 (3.1001) grad_norm 5.7291 (2.7441) loss_scale 2048.0000 (1753.4247) mem 7381MB [2024-08-27 12:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][520/1251] eta 0:02:49 lr 0.000414 wd 0.0500 time 0.2324 (0.2323) data time 0.0008 (0.0023) model time 0.2315 (0.2300) loss 3.3495 (3.1009) grad_norm 2.8308 (2.7520) loss_scale 2048.0000 (1759.0787) mem 7381MB [2024-08-27 12:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [176/300][530/1251] eta 0:02:47 lr 0.000414 wd 0.0500 time 0.2239 (0.2322) data time 0.0007 (0.0022) model time 0.2232 (0.2299) loss 3.1583 (3.0995) grad_norm 2.4973 (2.7658) loss_scale 2048.0000 (1764.5198) mem 7381MB [2024-08-27 12:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][540/1251] eta 0:02:45 lr 0.000414 wd 0.0500 time 0.2341 (0.2322) data time 0.0008 (0.0022) model time 0.2332 (0.2299) loss 2.3771 (3.0996) grad_norm 2.2974 (2.7729) loss_scale 2048.0000 (1769.7597) mem 7381MB [2024-08-27 12:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][550/1251] eta 0:02:42 lr 0.000414 wd 0.0500 time 0.2299 (0.2321) data time 0.0011 (0.0022) model time 0.2287 (0.2298) loss 3.5215 (3.1040) grad_norm 2.6970 (inf) loss_scale 1024.0000 (1759.9419) mem 7381MB [2024-08-27 12:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][560/1251] eta 0:02:40 lr 0.000414 wd 0.0500 time 0.2294 (0.2320) data time 0.0013 (0.0022) model time 0.2281 (0.2297) loss 3.1364 (3.1067) grad_norm 2.8487 (inf) loss_scale 1024.0000 (1746.8235) mem 7381MB [2024-08-27 12:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][570/1251] eta 0:02:37 lr 0.000414 wd 0.0500 time 0.2243 (0.2320) data time 0.0009 (0.0022) model time 0.2233 (0.2297) loss 2.1871 (3.1025) grad_norm 2.3195 (inf) loss_scale 1024.0000 (1734.1646) mem 7381MB [2024-08-27 12:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][580/1251] eta 0:02:35 lr 0.000414 wd 0.0500 time 0.2259 (0.2319) data time 0.0010 (0.0021) model time 0.2248 (0.2297) loss 2.2894 (3.1027) grad_norm 2.1184 (inf) loss_scale 1024.0000 (1721.9415) mem 7381MB [2024-08-27 12:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][590/1251] eta 0:02:33 lr 0.000414 wd 0.0500 time 0.2366 (0.2319) data time 0.0009 (0.0021) model time 0.2357 (0.2297) loss 3.6056 (3.1010) grad_norm 3.1687 (inf) loss_scale 1024.0000 
(1710.1320) mem 7381MB [2024-08-27 12:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][600/1251] eta 0:02:30 lr 0.000414 wd 0.0500 time 0.2316 (0.2318) data time 0.0008 (0.0021) model time 0.2308 (0.2296) loss 2.9373 (3.1052) grad_norm 2.9289 (inf) loss_scale 1024.0000 (1698.7155) mem 7381MB [2024-08-27 12:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][610/1251] eta 0:02:28 lr 0.000414 wd 0.0500 time 0.2336 (0.2318) data time 0.0009 (0.0021) model time 0.2328 (0.2296) loss 3.1444 (3.1084) grad_norm 3.3313 (inf) loss_scale 1024.0000 (1687.6727) mem 7381MB [2024-08-27 12:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][620/1251] eta 0:02:26 lr 0.000414 wd 0.0500 time 0.2217 (0.2317) data time 0.0011 (0.0021) model time 0.2206 (0.2295) loss 3.4724 (3.1073) grad_norm 4.5295 (inf) loss_scale 1024.0000 (1676.9855) mem 7381MB [2024-08-27 12:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][630/1251] eta 0:02:23 lr 0.000414 wd 0.0500 time 0.2258 (0.2316) data time 0.0007 (0.0020) model time 0.2252 (0.2295) loss 2.5154 (3.1051) grad_norm 2.9006 (inf) loss_scale 1024.0000 (1666.6371) mem 7381MB [2024-08-27 12:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][640/1251] eta 0:02:21 lr 0.000414 wd 0.0500 time 0.2260 (0.2316) data time 0.0007 (0.0020) model time 0.2253 (0.2294) loss 1.9678 (3.1046) grad_norm 3.1005 (inf) loss_scale 1024.0000 (1656.6115) mem 7381MB [2024-08-27 12:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][650/1251] eta 0:02:19 lr 0.000414 wd 0.0500 time 0.2281 (0.2316) data time 0.0011 (0.0020) model time 0.2270 (0.2295) loss 3.1720 (3.1029) grad_norm 2.7006 (inf) loss_scale 1024.0000 (1646.8940) mem 7381MB [2024-08-27 12:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][660/1251] eta 0:02:16 lr 0.000414 wd 0.0500 time 0.2295 (0.2316) data time 0.0012 (0.0020) model time 
0.2283 (0.2295) loss 3.0133 (3.1035) grad_norm 2.9050 (inf) loss_scale 1024.0000 (1637.4705) mem 7381MB [2024-08-27 12:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][670/1251] eta 0:02:14 lr 0.000414 wd 0.0500 time 0.2358 (0.2316) data time 0.0014 (0.0020) model time 0.2345 (0.2295) loss 2.3235 (3.1059) grad_norm 2.3252 (inf) loss_scale 1024.0000 (1628.3279) mem 7381MB [2024-08-27 12:36:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][680/1251] eta 0:02:12 lr 0.000414 wd 0.0500 time 0.2336 (0.2315) data time 0.0008 (0.0020) model time 0.2327 (0.2295) loss 3.7414 (3.1062) grad_norm 2.4696 (inf) loss_scale 1024.0000 (1619.4537) mem 7381MB [2024-08-27 12:36:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][690/1251] eta 0:02:09 lr 0.000414 wd 0.0500 time 0.2273 (0.2315) data time 0.0007 (0.0020) model time 0.2267 (0.2294) loss 3.3001 (3.1072) grad_norm 3.9164 (inf) loss_scale 1024.0000 (1610.8365) mem 7381MB [2024-08-27 12:36:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][700/1251] eta 0:02:07 lr 0.000414 wd 0.0500 time 0.2315 (0.2314) data time 0.0007 (0.0020) model time 0.2307 (0.2294) loss 3.3159 (3.1068) grad_norm 2.8646 (inf) loss_scale 1024.0000 (1602.4650) mem 7381MB [2024-08-27 12:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][710/1251] eta 0:02:05 lr 0.000414 wd 0.0500 time 0.2253 (0.2314) data time 0.0010 (0.0019) model time 0.2243 (0.2294) loss 3.5047 (3.1089) grad_norm 2.6310 (inf) loss_scale 1024.0000 (1594.3291) mem 7381MB [2024-08-27 12:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][720/1251] eta 0:02:02 lr 0.000413 wd 0.0500 time 0.2300 (0.2314) data time 0.0009 (0.0019) model time 0.2291 (0.2293) loss 3.7942 (3.1093) grad_norm 3.1564 (inf) loss_scale 1024.0000 (1586.4189) mem 7381MB [2024-08-27 12:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][730/1251] eta 0:02:00 
lr 0.000413 wd 0.0500 time 0.2231 (0.2313) data time 0.0007 (0.0019) model time 0.2224 (0.2293) loss 2.1079 (3.1083) grad_norm 3.0447 (inf) loss_scale 1024.0000 (1578.7250) mem 7381MB [2024-08-27 12:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][740/1251] eta 0:01:58 lr 0.000413 wd 0.0500 time 0.2285 (0.2313) data time 0.0009 (0.0019) model time 0.2276 (0.2293) loss 3.1100 (3.1033) grad_norm 3.4117 (inf) loss_scale 1024.0000 (1571.2389) mem 7381MB [2024-08-27 12:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][750/1251] eta 0:01:55 lr 0.000413 wd 0.0500 time 0.2297 (0.2315) data time 0.0010 (0.0019) model time 0.2287 (0.2295) loss 2.8940 (3.1043) grad_norm 2.3158 (inf) loss_scale 1024.0000 (1563.9521) mem 7381MB [2024-08-27 12:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][760/1251] eta 0:01:53 lr 0.000413 wd 0.0500 time 0.2346 (0.2317) data time 0.0007 (0.0019) model time 0.2339 (0.2297) loss 3.0864 (3.1017) grad_norm 4.9302 (inf) loss_scale 1024.0000 (1556.8568) mem 7381MB [2024-08-27 12:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][770/1251] eta 0:01:51 lr 0.000413 wd 0.0500 time 0.2358 (0.2317) data time 0.0009 (0.0019) model time 0.2349 (0.2297) loss 2.1726 (3.1006) grad_norm 2.4284 (inf) loss_scale 1024.0000 (1549.9455) mem 7381MB [2024-08-27 12:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][780/1251] eta 0:01:49 lr 0.000413 wd 0.0500 time 0.2239 (0.2317) data time 0.0008 (0.0019) model time 0.2231 (0.2297) loss 2.9927 (3.0998) grad_norm 3.4826 (inf) loss_scale 1024.0000 (1543.2113) mem 7381MB [2024-08-27 12:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][790/1251] eta 0:01:46 lr 0.000413 wd 0.0500 time 0.2272 (0.2316) data time 0.0009 (0.0018) model time 0.2263 (0.2297) loss 3.0953 (3.1009) grad_norm 3.1413 (inf) loss_scale 1024.0000 (1536.6473) mem 7381MB [2024-08-27 12:36:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][800/1251] eta 0:01:44 lr 0.000413 wd 0.0500 time 0.2306 (0.2316) data time 0.0007 (0.0018) model time 0.2299 (0.2297) loss 1.8850 (3.1009) grad_norm 3.0694 (inf) loss_scale 1024.0000 (1530.2472) mem 7381MB [2024-08-27 12:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][810/1251] eta 0:01:42 lr 0.000413 wd 0.0500 time 0.2309 (0.2316) data time 0.0007 (0.0018) model time 0.2302 (0.2297) loss 2.3464 (3.1002) grad_norm 2.4599 (inf) loss_scale 1024.0000 (1524.0049) mem 7381MB [2024-08-27 12:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][820/1251] eta 0:01:39 lr 0.000413 wd 0.0500 time 0.2267 (0.2315) data time 0.0008 (0.0018) model time 0.2259 (0.2296) loss 3.7238 (3.1006) grad_norm 2.0700 (inf) loss_scale 1024.0000 (1517.9147) mem 7381MB [2024-08-27 12:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][830/1251] eta 0:01:37 lr 0.000413 wd 0.0500 time 0.2352 (0.2315) data time 0.0009 (0.0018) model time 0.2343 (0.2297) loss 3.5542 (3.1002) grad_norm 1.9632 (inf) loss_scale 1024.0000 (1511.9711) mem 7381MB [2024-08-27 12:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][840/1251] eta 0:01:35 lr 0.000413 wd 0.0500 time 0.2354 (0.2315) data time 0.0007 (0.0018) model time 0.2347 (0.2296) loss 2.0876 (3.0977) grad_norm 2.4626 (inf) loss_scale 1024.0000 (1506.1688) mem 7381MB [2024-08-27 12:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][850/1251] eta 0:01:32 lr 0.000413 wd 0.0500 time 0.2429 (0.2315) data time 0.0011 (0.0018) model time 0.2418 (0.2296) loss 3.5033 (3.1003) grad_norm 3.2030 (inf) loss_scale 1024.0000 (1500.5029) mem 7381MB [2024-08-27 12:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][860/1251] eta 0:01:30 lr 0.000413 wd 0.0500 time 0.2259 (0.2315) data time 0.0006 (0.0018) model time 0.2253 (0.2296) loss 3.2411 (3.1029) 
grad_norm 2.6103 (inf) loss_scale 1024.0000 (1494.9686) mem 7381MB [2024-08-27 12:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][870/1251] eta 0:01:28 lr 0.000413 wd 0.0500 time 0.2432 (0.2315) data time 0.0011 (0.0018) model time 0.2420 (0.2296) loss 2.0583 (3.0994) grad_norm 3.6107 (inf) loss_scale 1024.0000 (1489.5614) mem 7381MB [2024-08-27 12:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][880/1251] eta 0:01:25 lr 0.000413 wd 0.0500 time 0.2262 (0.2314) data time 0.0009 (0.0018) model time 0.2253 (0.2296) loss 3.2847 (3.0954) grad_norm 2.7251 (inf) loss_scale 1024.0000 (1484.2770) mem 7381MB [2024-08-27 12:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][890/1251] eta 0:01:23 lr 0.000413 wd 0.0500 time 0.2259 (0.2314) data time 0.0009 (0.0018) model time 0.2250 (0.2296) loss 2.9693 (3.0961) grad_norm 1.9916 (inf) loss_scale 1024.0000 (1479.1111) mem 7381MB [2024-08-27 12:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][900/1251] eta 0:01:21 lr 0.000413 wd 0.0500 time 0.2285 (0.2314) data time 0.0007 (0.0017) model time 0.2278 (0.2296) loss 3.6960 (3.0993) grad_norm 2.1386 (inf) loss_scale 1024.0000 (1474.0599) mem 7381MB [2024-08-27 12:36:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][910/1251] eta 0:01:18 lr 0.000413 wd 0.0500 time 0.2310 (0.2314) data time 0.0006 (0.0017) model time 0.2303 (0.2296) loss 4.0786 (3.1003) grad_norm 3.4712 (inf) loss_scale 1024.0000 (1469.1196) mem 7381MB [2024-08-27 12:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][920/1251] eta 0:01:16 lr 0.000413 wd 0.0500 time 0.2241 (0.2313) data time 0.0011 (0.0017) model time 0.2230 (0.2295) loss 3.3220 (3.1001) grad_norm 2.6231 (inf) loss_scale 1024.0000 (1464.2866) mem 7381MB [2024-08-27 12:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][930/1251] eta 0:01:14 lr 0.000413 wd 0.0500 time 0.2290 
(0.2313) data time 0.0009 (0.0017) model time 0.2281 (0.2296) loss 3.0648 (3.0988) grad_norm 3.6147 (inf) loss_scale 1024.0000 (1459.5575) mem 7381MB [2024-08-27 12:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][940/1251] eta 0:01:11 lr 0.000413 wd 0.0500 time 0.2275 (0.2313) data time 0.0007 (0.0017) model time 0.2268 (0.2296) loss 3.3869 (3.1005) grad_norm 2.6341 (inf) loss_scale 1024.0000 (1454.9288) mem 7381MB [2024-08-27 12:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][950/1251] eta 0:01:09 lr 0.000412 wd 0.0500 time 0.2296 (0.2313) data time 0.0012 (0.0017) model time 0.2284 (0.2295) loss 3.3523 (3.1000) grad_norm 1.7692 (inf) loss_scale 1024.0000 (1450.3975) mem 7381MB [2024-08-27 12:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][960/1251] eta 0:01:07 lr 0.000412 wd 0.0500 time 0.2296 (0.2313) data time 0.0007 (0.0017) model time 0.2288 (0.2295) loss 3.2911 (3.0977) grad_norm 2.7201 (inf) loss_scale 1024.0000 (1445.9605) mem 7381MB [2024-08-27 12:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][970/1251] eta 0:01:04 lr 0.000412 wd 0.0500 time 0.2256 (0.2313) data time 0.0009 (0.0017) model time 0.2247 (0.2295) loss 3.5735 (3.1007) grad_norm 1.7920 (inf) loss_scale 1024.0000 (1441.6148) mem 7381MB [2024-08-27 12:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][980/1251] eta 0:01:02 lr 0.000412 wd 0.0500 time 0.2329 (0.2313) data time 0.0009 (0.0017) model time 0.2320 (0.2295) loss 3.1096 (3.0995) grad_norm 2.3748 (inf) loss_scale 1024.0000 (1437.3578) mem 7381MB [2024-08-27 12:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][990/1251] eta 0:01:00 lr 0.000412 wd 0.0500 time 0.2273 (0.2312) data time 0.0011 (0.0017) model time 0.2261 (0.2295) loss 3.4063 (3.1010) grad_norm 2.1665 (inf) loss_scale 1024.0000 (1433.1867) mem 7381MB [2024-08-27 12:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [176/300][1000/1251] eta 0:00:58 lr 0.000412 wd 0.0500 time 0.2321 (0.2312) data time 0.0010 (0.0017) model time 0.2311 (0.2295) loss 1.8605 (3.0991) grad_norm 2.3916 (inf) loss_scale 1024.0000 (1429.0989) mem 7381MB [2024-08-27 12:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1010/1251] eta 0:00:55 lr 0.000412 wd 0.0500 time 0.2524 (0.2312) data time 0.0009 (0.0017) model time 0.2515 (0.2295) loss 3.4700 (3.1006) grad_norm 1.9833 (inf) loss_scale 1024.0000 (1425.0920) mem 7381MB [2024-08-27 12:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1020/1251] eta 0:00:53 lr 0.000412 wd 0.0500 time 0.2289 (0.2312) data time 0.0007 (0.0017) model time 0.2282 (0.2295) loss 2.3989 (3.0996) grad_norm 4.3573 (inf) loss_scale 1024.0000 (1421.1636) mem 7381MB [2024-08-27 12:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1030/1251] eta 0:00:51 lr 0.000412 wd 0.0500 time 0.2309 (0.2312) data time 0.0009 (0.0017) model time 0.2300 (0.2294) loss 2.7571 (3.0995) grad_norm 2.9037 (inf) loss_scale 1024.0000 (1417.3113) mem 7381MB [2024-08-27 12:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1040/1251] eta 0:00:48 lr 0.000412 wd 0.0500 time 0.2331 (0.2312) data time 0.0009 (0.0017) model time 0.2323 (0.2294) loss 3.6369 (3.1008) grad_norm 2.0290 (inf) loss_scale 1024.0000 (1413.5331) mem 7381MB [2024-08-27 12:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1050/1251] eta 0:00:46 lr 0.000412 wd 0.0500 time 0.2322 (0.2312) data time 0.0007 (0.0017) model time 0.2315 (0.2295) loss 2.9122 (3.1030) grad_norm 3.1442 (inf) loss_scale 1024.0000 (1409.8268) mem 7381MB [2024-08-27 12:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1060/1251] eta 0:00:44 lr 0.000412 wd 0.0500 time 0.2401 (0.2312) data time 0.0009 (0.0017) model time 0.2392 (0.2295) loss 2.2741 (3.1035) grad_norm 2.8301 (inf) loss_scale 1024.0000 
(1406.1904) mem 7381MB [2024-08-27 12:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1070/1251] eta 0:00:41 lr 0.000412 wd 0.0500 time 0.2294 (0.2312) data time 0.0011 (0.0017) model time 0.2282 (0.2295) loss 3.1187 (3.1028) grad_norm 4.4458 (inf) loss_scale 1024.0000 (1402.6218) mem 7381MB [2024-08-27 12:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1080/1251] eta 0:00:39 lr 0.000412 wd 0.0500 time 0.2288 (0.2312) data time 0.0013 (0.0016) model time 0.2275 (0.2295) loss 3.5865 (3.1029) grad_norm 2.7978 (inf) loss_scale 1024.0000 (1399.1193) mem 7381MB [2024-08-27 12:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1090/1251] eta 0:00:37 lr 0.000412 wd 0.0500 time 0.2316 (0.2312) data time 0.0007 (0.0016) model time 0.2309 (0.2295) loss 2.8780 (3.1023) grad_norm 2.5763 (inf) loss_scale 1024.0000 (1395.6810) mem 7381MB [2024-08-27 12:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1100/1251] eta 0:00:34 lr 0.000412 wd 0.0500 time 0.2308 (0.2312) data time 0.0007 (0.0016) model time 0.2302 (0.2295) loss 3.3101 (3.1025) grad_norm 2.3883 (inf) loss_scale 1024.0000 (1392.3052) mem 7381MB [2024-08-27 12:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1110/1251] eta 0:00:32 lr 0.000412 wd 0.0500 time 0.2559 (0.2312) data time 0.0011 (0.0016) model time 0.2549 (0.2295) loss 3.4387 (3.1066) grad_norm 2.3001 (inf) loss_scale 1024.0000 (1388.9901) mem 7381MB [2024-08-27 12:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1120/1251] eta 0:00:30 lr 0.000412 wd 0.0500 time 0.2266 (0.2312) data time 0.0009 (0.0017) model time 0.2257 (0.2295) loss 2.5993 (3.1074) grad_norm 2.7434 (inf) loss_scale 1024.0000 (1385.7342) mem 7381MB [2024-08-27 12:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1130/1251] eta 0:00:27 lr 0.000412 wd 0.0500 time 0.2332 (0.2312) data time 0.0013 (0.0017) model 
time 0.2319 (0.2294) loss 2.0516 (3.1089) grad_norm 5.1571 (inf) loss_scale 1024.0000 (1382.5358) mem 7381MB [2024-08-27 12:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1140/1251] eta 0:00:25 lr 0.000412 wd 0.0500 time 0.2252 (0.2311) data time 0.0008 (0.0016) model time 0.2244 (0.2294) loss 4.0064 (3.1098) grad_norm 2.2401 (inf) loss_scale 1024.0000 (1379.3935) mem 7381MB [2024-08-27 12:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1150/1251] eta 0:00:23 lr 0.000412 wd 0.0500 time 0.2267 (0.2311) data time 0.0010 (0.0016) model time 0.2257 (0.2294) loss 3.4220 (3.1088) grad_norm 3.7546 (inf) loss_scale 1024.0000 (1376.3058) mem 7381MB [2024-08-27 12:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1160/1251] eta 0:00:21 lr 0.000412 wd 0.0500 time 0.2250 (0.2311) data time 0.0007 (0.0016) model time 0.2243 (0.2294) loss 2.2921 (3.1082) grad_norm 2.2992 (inf) loss_scale 1024.0000 (1373.2713) mem 7381MB [2024-08-27 12:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1170/1251] eta 0:00:18 lr 0.000412 wd 0.0500 time 0.2222 (0.2311) data time 0.0008 (0.0016) model time 0.2213 (0.2294) loss 3.5064 (3.1090) grad_norm 3.4187 (inf) loss_scale 1024.0000 (1370.2886) mem 7381MB [2024-08-27 12:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1180/1251] eta 0:00:16 lr 0.000411 wd 0.0500 time 0.2269 (0.2311) data time 0.0012 (0.0016) model time 0.2257 (0.2294) loss 3.2221 (3.1073) grad_norm 3.0350 (inf) loss_scale 1024.0000 (1367.3565) mem 7381MB [2024-08-27 12:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1190/1251] eta 0:00:14 lr 0.000411 wd 0.0500 time 0.2322 (0.2311) data time 0.0010 (0.0016) model time 0.2311 (0.2294) loss 3.1854 (3.1073) grad_norm 3.6576 (inf) loss_scale 1024.0000 (1364.4736) mem 7381MB [2024-08-27 12:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1200/1251] 
eta 0:00:11 lr 0.000411 wd 0.0500 time 0.2246 (0.2311) data time 0.0010 (0.0016) model time 0.2236 (0.2294) loss 2.9437 (3.1086) grad_norm 3.0901 (inf) loss_scale 1024.0000 (1361.6386) mem 7381MB [2024-08-27 12:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1210/1251] eta 0:00:09 lr 0.000411 wd 0.0500 time 0.2344 (0.2311) data time 0.0009 (0.0016) model time 0.2335 (0.2294) loss 3.5495 (3.1119) grad_norm 4.2038 (inf) loss_scale 1024.0000 (1358.8505) mem 7381MB [2024-08-27 12:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1220/1251] eta 0:00:07 lr 0.000411 wd 0.0500 time 0.2278 (0.2311) data time 0.0009 (0.0016) model time 0.2269 (0.2294) loss 2.1584 (3.1126) grad_norm 2.9708 (inf) loss_scale 1024.0000 (1356.1081) mem 7381MB [2024-08-27 12:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1230/1251] eta 0:00:04 lr 0.000411 wd 0.0500 time 0.2275 (0.2311) data time 0.0007 (0.0016) model time 0.2269 (0.2294) loss 2.6730 (3.1142) grad_norm 2.4745 (inf) loss_scale 1024.0000 (1353.4102) mem 7381MB [2024-08-27 12:38:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1240/1251] eta 0:00:02 lr 0.000411 wd 0.0500 time 0.2108 (0.2310) data time 0.0007 (0.0016) model time 0.2101 (0.2293) loss 3.1866 (3.1132) grad_norm 2.4156 (inf) loss_scale 1024.0000 (1350.7558) mem 7381MB [2024-08-27 12:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [176/300][1250/1251] eta 0:00:00 lr 0.000411 wd 0.0500 time 0.2131 (0.2309) data time 0.0004 (0.0016) model time 0.2127 (0.2292) loss 2.5525 (3.1104) grad_norm 2.1629 (inf) loss_scale 1024.0000 (1348.1439) mem 7381MB [2024-08-27 12:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 176 training takes 0:04:48 [2024-08-27 12:38:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 12:38:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.491 (0.491) Loss 0.4365 (0.4365) Acc@1 91.406 (91.406) Acc@5 98.145 (98.145) Mem 7381MB [2024-08-27 12:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.118) Loss 0.6826 (0.6949) Acc@1 87.109 (85.067) Acc@5 97.168 (96.955) Mem 7381MB [2024-08-27 12:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.100) Loss 1.0156 (0.7254) Acc@1 76.465 (84.008) Acc@5 95.312 (96.940) Mem 7381MB [2024-08-27 12:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.094) Loss 1.2393 (0.8221) Acc@1 69.727 (81.754) Acc@5 90.820 (95.857) Mem 7381MB [2024-08-27 12:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.1377 (0.8770) Acc@1 73.828 (80.435) Acc@5 92.773 (95.224) Mem 7381MB [2024-08-27 12:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.044 Acc@5 95.204 [2024-08-27 12:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.0% [2024-08-27 12:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.04% [2024-08-27 12:38:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 12:38:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 12:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.447 (0.447) Loss 0.3979 (0.3979) Acc@1 92.969 (92.969) Acc@5 98.242 (98.242) Mem 7381MB [2024-08-27 12:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.123) Loss 0.6177 (0.6232) Acc@1 87.988 (86.781) Acc@5 97.266 (97.470) Mem 7381MB [2024-08-27 12:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.102) Loss 0.8896 (0.6481) Acc@1 78.906 (85.877) Acc@5 95.801 (97.475) Mem 7381MB [2024-08-27 12:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.094) Loss 1.1191 (0.7343) Acc@1 71.973 (83.773) Acc@5 93.262 (96.566) Mem 7381MB [2024-08-27 12:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0039 (0.7782) Acc@1 75.098 (82.424) Acc@5 94.238 (96.108) Mem 7381MB [2024-08-27 12:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.986 Acc@5 96.082 [2024-08-27 12:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-08-27 12:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 81.99% [2024-08-27 12:38:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 12:38:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 12:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][0/1251] eta 0:15:20 lr 0.000411 wd 0.0500 time 0.7359 (0.7359) data time 0.4680 (0.4680) model time 0.0000 (0.0000) loss 3.3325 (3.3325) grad_norm 2.0563 (2.0563) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][10/1251] eta 0:05:40 lr 0.000411 wd 0.0500 time 0.2335 (0.2746) data time 0.0009 (0.0436) model time 0.0000 (0.0000) loss 3.4872 (3.2286) grad_norm 2.8545 (2.5282) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][20/1251] eta 0:05:20 lr 0.000411 wd 0.0500 time 0.2285 (0.2606) data time 0.0010 (0.0233) model time 0.0000 (0.0000) loss 2.7624 (3.1539) grad_norm 2.0993 (2.5281) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][30/1251] eta 0:05:05 lr 0.000411 wd 0.0500 time 0.2287 (0.2505) data time 0.0007 (0.0162) model time 0.0000 (0.0000) loss 2.9436 (3.2205) grad_norm 2.0713 (2.5367) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][40/1251] eta 0:04:57 lr 0.000411 wd 0.0500 time 0.2239 (0.2456) data time 0.0011 (0.0129) model time 0.0000 (0.0000) loss 3.4053 (3.2414) grad_norm 2.1247 (2.4921) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][50/1251] eta 0:04:51 lr 0.000411 wd 0.0500 time 0.2236 (0.2424) data time 0.0008 (0.0106) model time 0.0000 (0.0000) loss 3.5407 (3.2494) grad_norm 2.5778 (2.5324) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][60/1251] eta 0:04:45 lr 0.000411 wd 0.0500 time 0.2297 (0.2400) data time 0.0007 (0.0091) model time 0.2290 
(0.2266) loss 3.6962 (3.2371) grad_norm 3.0839 (2.5604) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][70/1251] eta 0:04:41 lr 0.000411 wd 0.0500 time 0.2324 (0.2386) data time 0.0007 (0.0080) model time 0.2316 (0.2277) loss 2.8855 (3.2109) grad_norm 2.8022 (2.5693) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][80/1251] eta 0:04:38 lr 0.000411 wd 0.0500 time 0.2433 (0.2379) data time 0.0007 (0.0071) model time 0.2426 (0.2290) loss 3.8491 (3.2427) grad_norm 2.3839 (2.6358) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][90/1251] eta 0:04:35 lr 0.000411 wd 0.0500 time 0.2287 (0.2371) data time 0.0009 (0.0065) model time 0.2278 (0.2292) loss 3.5914 (3.2471) grad_norm 1.9562 (2.7143) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][100/1251] eta 0:04:31 lr 0.000411 wd 0.0500 time 0.2231 (0.2362) data time 0.0009 (0.0061) model time 0.2223 (0.2285) loss 3.1158 (3.2313) grad_norm 2.1510 (2.7144) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][110/1251] eta 0:04:28 lr 0.000411 wd 0.0500 time 0.2279 (0.2356) data time 0.0009 (0.0056) model time 0.2271 (0.2284) loss 3.3786 (3.2188) grad_norm 2.1439 (2.7375) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][120/1251] eta 0:04:26 lr 0.000411 wd 0.0500 time 0.2343 (0.2355) data time 0.0009 (0.0052) model time 0.2333 (0.2291) loss 3.4788 (3.2012) grad_norm 2.6244 (2.7243) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][130/1251] 
eta 0:04:23 lr 0.000411 wd 0.0500 time 0.2210 (0.2348) data time 0.0007 (0.0049) model time 0.2203 (0.2287) loss 2.7758 (3.1955) grad_norm 2.8078 (2.7137) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][140/1251] eta 0:04:20 lr 0.000411 wd 0.0500 time 0.2323 (0.2345) data time 0.0009 (0.0046) model time 0.2314 (0.2287) loss 2.9072 (3.2037) grad_norm 3.2931 (2.7147) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][150/1251] eta 0:04:17 lr 0.000411 wd 0.0500 time 0.2268 (0.2340) data time 0.0007 (0.0044) model time 0.2261 (0.2284) loss 3.3133 (3.1980) grad_norm 3.7097 (2.7165) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][160/1251] eta 0:04:15 lr 0.000410 wd 0.0500 time 0.2387 (0.2341) data time 0.0008 (0.0042) model time 0.2379 (0.2289) loss 1.8858 (3.1663) grad_norm 4.3186 (2.7531) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][170/1251] eta 0:04:13 lr 0.000410 wd 0.0500 time 0.2238 (0.2342) data time 0.0010 (0.0040) model time 0.2229 (0.2294) loss 2.5707 (3.1449) grad_norm 2.0384 (2.7587) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][180/1251] eta 0:04:10 lr 0.000410 wd 0.0500 time 0.2381 (0.2339) data time 0.0009 (0.0039) model time 0.2372 (0.2293) loss 3.3901 (3.1415) grad_norm 2.8336 (2.7638) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][190/1251] eta 0:04:07 lr 0.000410 wd 0.0500 time 0.2240 (0.2336) data time 0.0007 (0.0037) model time 0.2233 (0.2292) loss 2.8755 (3.1463) grad_norm 3.4222 (2.7769) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 12:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][200/1251] eta 0:04:05 lr 0.000410 wd 0.0500 time 0.2252 (0.2335) data time 0.0009 (0.0036) model time 0.2243 (0.2292) loss 3.2014 (3.1478) grad_norm 2.4883 (2.7798) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][210/1251] eta 0:04:02 lr 0.000410 wd 0.0500 time 0.2332 (0.2334) data time 0.0009 (0.0035) model time 0.2323 (0.2293) loss 2.8584 (3.1442) grad_norm 3.4458 (2.7926) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][220/1251] eta 0:04:00 lr 0.000410 wd 0.0500 time 0.2232 (0.2332) data time 0.0007 (0.0034) model time 0.2225 (0.2292) loss 3.1967 (3.1503) grad_norm 3.2221 (2.8064) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][230/1251] eta 0:03:58 lr 0.000410 wd 0.0500 time 0.2276 (0.2334) data time 0.0009 (0.0033) model time 0.2266 (0.2296) loss 3.1182 (3.1433) grad_norm 3.4669 (2.8031) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][240/1251] eta 0:03:55 lr 0.000410 wd 0.0500 time 0.2322 (0.2332) data time 0.0009 (0.0032) model time 0.2313 (0.2295) loss 2.3646 (3.1354) grad_norm 2.6460 (2.8051) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][250/1251] eta 0:03:53 lr 0.000410 wd 0.0500 time 0.2237 (0.2331) data time 0.0006 (0.0032) model time 0.2231 (0.2294) loss 3.5361 (3.1370) grad_norm 2.4641 (2.7987) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][260/1251] eta 0:03:50 lr 0.000410 wd 0.0500 time 0.2256 (0.2330) data time 0.0014 (0.0031) model time 0.2242 
(0.2294) loss 3.0806 (3.1326) grad_norm 2.0974 (2.7885) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][270/1251] eta 0:03:48 lr 0.000410 wd 0.0500 time 0.2260 (0.2328) data time 0.0010 (0.0030) model time 0.2251 (0.2293) loss 3.1578 (3.1399) grad_norm 2.9870 (2.7977) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][280/1251] eta 0:03:46 lr 0.000410 wd 0.0500 time 0.2180 (0.2328) data time 0.0013 (0.0030) model time 0.2167 (0.2294) loss 3.1933 (3.1441) grad_norm 3.8194 (2.8022) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][290/1251] eta 0:03:43 lr 0.000410 wd 0.0500 time 0.2337 (0.2329) data time 0.0009 (0.0029) model time 0.2329 (0.2296) loss 2.8753 (3.1454) grad_norm 2.2552 (2.7943) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][300/1251] eta 0:03:41 lr 0.000410 wd 0.0500 time 0.2214 (0.2329) data time 0.0009 (0.0029) model time 0.2205 (0.2296) loss 3.3807 (3.1459) grad_norm 2.4802 (2.7896) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][310/1251] eta 0:03:39 lr 0.000410 wd 0.0500 time 0.2228 (0.2328) data time 0.0009 (0.0028) model time 0.2220 (0.2296) loss 3.4487 (3.1481) grad_norm 2.2789 (2.7847) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][320/1251] eta 0:03:36 lr 0.000410 wd 0.0500 time 0.2273 (0.2327) data time 0.0007 (0.0028) model time 0.2265 (0.2296) loss 2.2896 (3.1472) grad_norm 2.1698 (2.7833) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[177/300][330/1251] eta 0:03:34 lr 0.000410 wd 0.0500 time 0.2271 (0.2326) data time 0.0009 (0.0027) model time 0.2262 (0.2295) loss 3.4523 (3.1317) grad_norm 2.8455 (2.8003) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][340/1251] eta 0:03:31 lr 0.000410 wd 0.0500 time 0.2311 (0.2325) data time 0.0007 (0.0027) model time 0.2305 (0.2295) loss 3.3843 (3.1268) grad_norm 2.2779 (2.8018) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][350/1251] eta 0:03:29 lr 0.000410 wd 0.0500 time 0.2236 (0.2324) data time 0.0008 (0.0026) model time 0.2228 (0.2294) loss 3.7327 (3.1292) grad_norm 3.7985 (2.7965) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][360/1251] eta 0:03:27 lr 0.000410 wd 0.0500 time 0.2219 (0.2324) data time 0.0008 (0.0026) model time 0.2211 (0.2294) loss 2.5823 (3.1259) grad_norm 3.9752 (2.7888) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][370/1251] eta 0:03:24 lr 0.000410 wd 0.0500 time 0.2335 (0.2323) data time 0.0011 (0.0026) model time 0.2324 (0.2294) loss 2.9053 (3.1280) grad_norm 2.7580 (2.7874) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][380/1251] eta 0:03:22 lr 0.000409 wd 0.0500 time 0.2283 (0.2323) data time 0.0010 (0.0026) model time 0.2273 (0.2294) loss 3.0864 (3.1346) grad_norm 3.6035 (2.7981) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 12:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][390/1251] eta 0:03:19 lr 0.000409 wd 0.0500 time 0.2277 (0.2322) data time 0.0009 (0.0026) model time 0.2268 (0.2293) loss 3.5047 (3.1359) grad_norm 4.0315 (2.8004) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 12:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 12:39:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:39:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:43:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:43:53 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:43:54 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:43:55 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:43:55 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 177) [2024-08-27 12:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][400/1251] eta 1:06:26 lr 0.000409 wd 0.0500 time 0.2217 (4.6843) data time 0.0006 (0.2290) model time 0.2210 (4.4553) loss 2.8416 (3.4142) grad_norm 3.0706 (3.1336) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][410/1251] eta 0:17:36 lr 0.000409 wd 0.0500 time 0.2249 (1.2563) data time 0.0009 (0.0540) model time 0.2240 (1.2024) loss 3.4813 (3.3773) grad_norm 2.4123 (2.6828) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][420/1251] eta 0:11:11 lr 0.000409 wd 0.0500 time 0.2238 (0.8086) data time 0.0009 (0.0309) model time 0.2229 (0.7777) loss 3.5538 (3.4102) grad_norm 3.0583 (2.7370) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][430/1251] eta 0:08:38 lr 0.000409 wd 0.0500 time 0.2232 (0.6318) data time 0.0007 (0.0219) model time 0.2225 (0.6099) loss 4.0745 (3.4049) grad_norm 2.3477 (2.9907) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][440/1251] eta 0:07:16 lr 0.000409 wd 0.0500 time 0.2287 (0.5379) data time 0.0011 (0.0170) model time 0.2276 (0.5209) loss 3.3122 (3.3401) grad_norm 2.4300 (3.0219) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [177/300][450/1251] eta 0:06:23 lr 0.000409 wd 0.0500 time 0.2179 (0.4792) data time 0.0010 (0.0141) model time 0.2169 (0.4651) loss 2.8274 (3.3062) grad_norm 2.2802 (3.0315) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][460/1251] eta 0:05:47 lr 0.000409 wd 0.0500 time 0.2176 (0.4390) data time 0.0010 (0.0120) model time 0.2166 (0.4270) loss 3.0963 (3.2654) grad_norm 2.8067 (3.0117) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][470/1251] eta 0:05:20 lr 0.000409 wd 0.0500 time 0.2212 (0.4098) data time 0.0009 (0.0105) model time 0.2203 (0.3993) loss 3.6320 (3.2335) grad_norm 2.8012 (3.0046) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][480/1251] eta 0:04:59 lr 0.000409 wd 0.0500 time 0.2215 (0.3881) data time 0.0010 (0.0094) model time 0.2205 (0.3787) loss 2.4027 (3.2087) grad_norm 2.5004 (3.0016) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][490/1251] eta 0:04:42 lr 0.000409 wd 0.0500 time 0.2200 (0.3709) data time 0.0008 (0.0085) model time 0.2192 (0.3624) loss 3.8010 (3.2007) grad_norm 2.5990 (3.0284) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][500/1251] eta 0:04:27 lr 0.000409 wd 0.0500 time 0.2148 (0.3567) data time 0.0010 (0.0078) model time 0.2138 (0.3489) loss 3.6775 (3.2177) grad_norm 2.6035 (3.0038) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][510/1251] eta 0:04:15 lr 0.000409 wd 0.0500 time 0.2176 (0.3452) data time 0.0008 (0.0072) model time 0.2168 (0.3380) loss 3.1862 (3.2107) grad_norm 3.2064 (2.9843) loss_scale 
1024.0000 (1024.0000) mem 7379MB [2024-08-27 12:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 12:44:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:44:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:46:34 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:46:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:46:48 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:46:49 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:46:49 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 177) [2024-08-27 12:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:47:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][520/1251] eta 0:24:49 lr 0.000409 wd 0.0500 time 0.2326 (2.0369) data time 0.0007 (0.0765) model time 0.2319 (1.9604) loss 3.5209 (3.5014) grad_norm 2.7181 (2.7204) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][530/1251] eta 0:12:29 lr 0.000409 wd 0.0500 time 0.2469 (1.0395) data time 0.0008 (0.0346) model time 0.2461 (1.0048) loss 3.7649 (3.4046) grad_norm 2.3222 (2.6923) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][540/1251] eta 0:08:56 lr 0.000409 wd 0.0500 time 0.2382 (0.7545) data time 0.0011 (0.0227) model time 0.2371 (0.7318) loss 3.7956 (3.4468) grad_norm 2.8195 (2.8256) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][550/1251] eta 0:07:14 lr 0.000409 wd 0.0500 time 0.2359 (0.6198) data time 0.0012 (0.0171) model time 0.2347 (0.6026) loss 3.2614 (3.3718) grad_norm 2.3680 (2.9586) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][560/1251] eta 0:06:13 lr 0.000409 wd 0.0500 time 0.2366 (0.5404) data time 0.0007 (0.0138) model time 0.2359 (0.5267) loss 3.4840 (3.3215) grad_norm 2.3600 (2.9054) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [177/300][570/1251] eta 0:05:32 lr 0.000409 wd 0.0500 time 0.2382 (0.4887) data time 0.0008 (0.0116) model time 0.2374 (0.4772) loss 2.5481 (3.2859) grad_norm 3.1206 (2.8449) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][580/1251] eta 0:05:03 lr 0.000409 wd 0.0500 time 0.2406 (0.4524) data time 0.0009 (0.0100) model time 0.2397 (0.4424) loss 2.2682 (3.2600) grad_norm 2.8481 (2.8385) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][590/1251] eta 0:04:41 lr 0.000409 wd 0.0500 time 0.2434 (0.4253) data time 0.0009 (0.0089) model time 0.2425 (0.4164) loss 2.9839 (3.2366) grad_norm 3.6645 (2.8508) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][600/1251] eta 0:04:23 lr 0.000409 wd 0.0500 time 0.2414 (0.4045) data time 0.0010 (0.0080) model time 0.2403 (0.3965) loss 3.3156 (3.2248) grad_norm 1.9697 (2.9056) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][610/1251] eta 0:04:08 lr 0.000408 wd 0.0500 time 0.2351 (0.3877) data time 0.0008 (0.0073) model time 0.2343 (0.3804) loss 3.8083 (3.2405) grad_norm 4.3092 (2.9101) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][620/1251] eta 0:03:56 lr 0.000408 wd 0.0500 time 0.2347 (0.3742) data time 0.0008 (0.0067) model time 0.2339 (0.3674) loss 2.5446 (3.2492) grad_norm 2.9537 (2.8997) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][630/1251] eta 0:03:45 lr 0.000408 wd 0.0500 time 0.2449 (0.3630) data time 0.0012 (0.0062) model time 0.2437 (0.3568) loss 3.0662 (3.2481) grad_norm 2.1760 (2.9254) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][640/1251] eta 0:03:36 lr 0.000408 wd 0.0500 time 0.2522 (0.3538) data time 0.0007 (0.0059) model time 0.2515 (0.3479) loss 3.4043 (3.2360) grad_norm 6.1328 (3.0245) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][650/1251] eta 0:03:27 lr 0.000408 wd 0.0500 time 0.2352 (0.3454) data time 0.0010 (0.0055) model time 0.2342 (0.3399) loss 3.0473 (3.2264) grad_norm 2.2912 (3.0554) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 12:47:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:47:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:49:35 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:49:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:49:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:49:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:49:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 177) [2024-08-27 12:49:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 12:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 12:55:21 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 12:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 12:55:33 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 12:55:34 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 12:55:35 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 12:55:35 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 177) [2024-08-27 12:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 12:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][660/1251] eta 0:13:30 lr 0.000408 wd 0.0500 time 0.2217 (1.3715) data time 0.0011 (0.0731) model time 0.2206 (1.2985) loss 3.7435 (3.5956) grad_norm 2.0014 (2.9687) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][670/1251] eta 0:07:43 lr 0.000408 wd 0.0500 time 0.2281 (0.7980) data time 0.0007 (0.0370) model time 0.2274 (0.7609) loss 3.5672 (3.3707) grad_norm 2.8552 (2.7887) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][680/1251] eta 0:05:45 lr 0.000408 wd 0.0500 time 0.2223 (0.6059) data time 0.0009 (0.0250) model time 0.2214 (0.5809) loss 3.4616 (3.4095) grad_norm 2.1350 (2.7424) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][690/1251] eta 0:04:46 lr 0.000408 wd 0.0500 time 0.2292 (0.5107) data time 0.0007 (0.0190) model time 0.2285 (0.4917) loss 2.5427 (3.3277) grad_norm 3.0991 (2.8444) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][700/1251] eta 0:04:09 lr 0.000408 wd 0.0500 time 0.2314 (0.4537) data time 0.0012 (0.0154) model time 0.2303 (0.4383) loss 3.0288 (3.3173) grad_norm 2.8473 (2.8938) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [177/300][710/1251] eta 0:03:44 lr 0.000408 wd 0.0500 time 0.2238 (0.4155) data time 0.0006 (0.0130) model time 0.2232 (0.4025) loss 3.5484 (3.3079) grad_norm 3.8287 (2.8775) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][720/1251] eta 0:03:26 lr 0.000408 wd 0.0500 time 0.2175 (0.3886) data time 0.0008 (0.0114) model time 0.2167 (0.3772) loss 2.5472 (3.2777) grad_norm 3.6500 (2.9428) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][730/1251] eta 0:03:11 lr 0.000408 wd 0.0500 time 0.2204 (0.3683) data time 0.0010 (0.0101) model time 0.2194 (0.3582) loss 3.2137 (3.2565) grad_norm 2.0806 (2.8842) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][740/1251] eta 0:03:00 lr 0.000408 wd 0.0500 time 0.2274 (0.3528) data time 0.0009 (0.0091) model time 0.2266 (0.3437) loss 3.7941 (3.2406) grad_norm 1.9767 (2.8513) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][750/1251] eta 0:02:50 lr 0.000408 wd 0.0500 time 0.2211 (0.3401) data time 0.0010 (0.0084) model time 0.2201 (0.3317) loss 3.5132 (3.2405) grad_norm 1.8412 (2.8405) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][760/1251] eta 0:02:41 lr 0.000408 wd 0.0500 time 0.2234 (0.3295) data time 0.0009 (0.0077) model time 0.2225 (0.3218) loss 2.9871 (3.2381) grad_norm 3.4029 (2.8335) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][770/1251] eta 0:02:34 lr 0.000408 wd 0.0500 time 0.2235 (0.3210) data time 0.0007 (0.0072) model time 0.2228 (0.3138) loss 3.4765 (3.2398) grad_norm 2.6712 (2.8247) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][780/1251] eta 0:02:27 lr 0.000408 wd 0.0500 time 0.2264 (0.3136) data time 0.0007 (0.0067) model time 0.2258 (0.3069) loss 2.8783 (3.2075) grad_norm 3.5287 (2.8204) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][790/1251] eta 0:02:21 lr 0.000408 wd 0.0500 time 0.2257 (0.3072) data time 0.0008 (0.0063) model time 0.2249 (0.3009) loss 1.8593 (3.1958) grad_norm 2.4809 (2.7887) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][800/1251] eta 0:02:16 lr 0.000408 wd 0.0500 time 0.2246 (0.3017) data time 0.0010 (0.0060) model time 0.2236 (0.2957) loss 3.9257 (3.1955) grad_norm 1.8156 (2.8372) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][810/1251] eta 0:02:10 lr 0.000408 wd 0.0500 time 0.2232 (0.2968) data time 0.0009 (0.0057) model time 0.2223 (0.2912) loss 3.1386 (3.1910) grad_norm 3.1616 (2.8196) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][820/1251] eta 0:02:06 lr 0.000408 wd 0.0500 time 0.2290 (0.2926) data time 0.0007 (0.0054) model time 0.2282 (0.2872) loss 2.4291 (3.1863) grad_norm 2.7461 (2.7980) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][830/1251] eta 0:02:01 lr 0.000408 wd 0.0500 time 0.2172 (0.2888) data time 0.0007 (0.0052) model time 0.2165 (0.2836) loss 2.5066 (3.1658) grad_norm 2.8666 (2.7824) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][840/1251] eta 0:01:57 lr 0.000407 wd 0.0500 time 0.2217 (0.2854) data time 
0.0007 (0.0050) model time 0.2211 (0.2804) loss 2.6523 (3.1656) grad_norm 2.0010 (2.7784) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][850/1251] eta 0:01:53 lr 0.000407 wd 0.0500 time 0.2256 (0.2824) data time 0.0011 (0.0048) model time 0.2245 (0.2777) loss 3.5685 (3.1545) grad_norm 2.9938 (2.7903) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][860/1251] eta 0:01:49 lr 0.000407 wd 0.0500 time 0.2286 (0.2796) data time 0.0009 (0.0046) model time 0.2278 (0.2750) loss 3.1880 (3.1489) grad_norm 1.9686 (2.7822) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][870/1251] eta 0:01:45 lr 0.000407 wd 0.0500 time 0.2195 (0.2771) data time 0.0010 (0.0044) model time 0.2185 (0.2726) loss 2.7750 (3.1433) grad_norm 3.0861 (2.7747) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][880/1251] eta 0:01:41 lr 0.000407 wd 0.0500 time 0.2250 (0.2748) data time 0.0011 (0.0043) model time 0.2239 (0.2705) loss 3.3754 (3.1467) grad_norm 2.5739 (2.7591) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][890/1251] eta 0:01:38 lr 0.000407 wd 0.0500 time 0.2232 (0.2727) data time 0.0009 (0.0041) model time 0.2223 (0.2685) loss 3.1967 (3.1383) grad_norm 2.9720 (2.7698) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][900/1251] eta 0:01:35 lr 0.000407 wd 0.0500 time 0.2223 (0.2707) data time 0.0008 (0.0040) model time 0.2215 (0.2667) loss 2.5280 (3.1292) grad_norm 3.3651 (2.7871) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [177/300][910/1251] eta 0:01:31 lr 0.000407 wd 0.0500 time 0.2163 (0.2689) data time 0.0010 (0.0039) model time 0.2153 (0.2650) loss 2.9171 (3.1238) grad_norm 3.1324 (2.7946) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][920/1251] eta 0:01:28 lr 0.000407 wd 0.0500 time 0.2181 (0.2673) data time 0.0008 (0.0038) model time 0.2173 (0.2635) loss 3.5513 (3.1202) grad_norm 2.8912 (2.8135) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][930/1251] eta 0:01:25 lr 0.000407 wd 0.0500 time 0.2247 (0.2659) data time 0.0008 (0.0037) model time 0.2238 (0.2622) loss 3.8620 (3.1256) grad_norm 2.0829 (2.8200) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][940/1251] eta 0:01:22 lr 0.000407 wd 0.0500 time 0.2263 (0.2653) data time 0.0008 (0.0036) model time 0.2254 (0.2617) loss 2.8551 (3.1213) grad_norm 3.0716 (2.8331) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][950/1251] eta 0:01:19 lr 0.000407 wd 0.0500 time 0.2217 (0.2640) data time 0.0017 (0.0035) model time 0.2200 (0.2605) loss 3.0043 (3.1071) grad_norm 2.3931 (2.8458) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][960/1251] eta 0:01:16 lr 0.000407 wd 0.0500 time 0.2253 (0.2635) data time 0.0011 (0.0034) model time 0.2242 (0.2601) loss 3.1707 (3.1043) grad_norm 2.1434 (2.8339) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][970/1251] eta 0:01:13 lr 0.000407 wd 0.0500 time 0.2303 (0.2624) data time 0.0010 (0.0033) model time 0.2294 (0.2590) loss 3.1038 (3.1122) grad_norm 2.8522 (2.8220) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][980/1251] eta 0:01:10 lr 0.000407 wd 0.0500 time 0.2336 (0.2612) data time 0.0006 (0.0033) model time 0.2329 (0.2579) loss 3.4317 (3.1173) grad_norm 6.3846 (2.8348) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][990/1251] eta 0:01:07 lr 0.000407 wd 0.0500 time 0.2324 (0.2601) data time 0.0008 (0.0032) model time 0.2317 (0.2569) loss 3.0927 (3.1153) grad_norm 2.2358 (2.8241) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1000/1251] eta 0:01:05 lr 0.000407 wd 0.0500 time 0.2357 (0.2591) data time 0.0009 (0.0031) model time 0.2348 (0.2559) loss 2.4605 (3.1160) grad_norm 2.9578 (2.8143) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1010/1251] eta 0:01:02 lr 0.000407 wd 0.0500 time 0.2222 (0.2582) data time 0.0007 (0.0032) model time 0.2216 (0.2550) loss 3.5624 (3.1153) grad_norm 2.6275 (2.8040) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1020/1251] eta 0:00:59 lr 0.000407 wd 0.0500 time 0.2373 (0.2574) data time 0.0010 (0.0032) model time 0.2363 (0.2542) loss 3.4314 (3.1149) grad_norm 2.4236 (2.8007) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1030/1251] eta 0:00:56 lr 0.000407 wd 0.0500 time 0.2246 (0.2566) data time 0.0006 (0.0031) model time 0.2240 (0.2535) loss 2.9714 (3.1120) grad_norm 2.3979 (2.7975) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1040/1251] eta 0:00:53 lr 0.000407 wd 0.0500 time 0.2277 
(0.2557) data time 0.0008 (0.0030) model time 0.2268 (0.2527) loss 2.5196 (3.1027) grad_norm 4.9648 (2.7999) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1050/1251] eta 0:00:51 lr 0.000407 wd 0.0500 time 0.2258 (0.2549) data time 0.0008 (0.0030) model time 0.2250 (0.2520) loss 3.2856 (3.1085) grad_norm 1.8812 (2.7963) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1060/1251] eta 0:00:48 lr 0.000407 wd 0.0500 time 0.2300 (0.2542) data time 0.0006 (0.0029) model time 0.2294 (0.2512) loss 2.9474 (3.1119) grad_norm 3.0323 (2.8166) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1070/1251] eta 0:00:45 lr 0.000406 wd 0.0500 time 0.2176 (0.2534) data time 0.0008 (0.0029) model time 0.2168 (0.2505) loss 3.3574 (3.1114) grad_norm 3.5193 (2.8111) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1080/1251] eta 0:00:43 lr 0.000406 wd 0.0500 time 0.2338 (0.2527) data time 0.0010 (0.0028) model time 0.2328 (0.2498) loss 3.2628 (3.1173) grad_norm 2.2444 (2.8130) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1090/1251] eta 0:00:40 lr 0.000406 wd 0.0500 time 0.2220 (0.2521) data time 0.0010 (0.0028) model time 0.2210 (0.2493) loss 3.2452 (3.1217) grad_norm 2.5058 (2.8141) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1100/1251] eta 0:00:37 lr 0.000406 wd 0.0500 time 0.2222 (0.2514) data time 0.0007 (0.0028) model time 0.2215 (0.2487) loss 3.4135 (3.1225) grad_norm 2.8855 (2.8118) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1110/1251] eta 0:00:35 lr 0.000406 wd 0.0500 time 0.2221 (0.2508) data time 0.0006 (0.0027) model time 0.2215 (0.2480) loss 3.3938 (3.1167) grad_norm 2.9601 (2.8108) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1120/1251] eta 0:00:32 lr 0.000406 wd 0.0500 time 0.2322 (0.2502) data time 0.0008 (0.0027) model time 0.2315 (0.2475) loss 2.9322 (3.1100) grad_norm 5.0109 (2.8122) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1130/1251] eta 0:00:30 lr 0.000406 wd 0.0500 time 0.2320 (0.2497) data time 0.0009 (0.0026) model time 0.2312 (0.2470) loss 2.5475 (3.1073) grad_norm 3.5984 (2.8127) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1140/1251] eta 0:00:27 lr 0.000406 wd 0.0500 time 0.2235 (0.2492) data time 0.0008 (0.0026) model time 0.2227 (0.2466) loss 2.9972 (3.1125) grad_norm 2.4867 (2.8094) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1150/1251] eta 0:00:25 lr 0.000406 wd 0.0500 time 0.2333 (0.2487) data time 0.0006 (0.0026) model time 0.2327 (0.2461) loss 2.3826 (3.1124) grad_norm 2.3339 (2.8017) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1160/1251] eta 0:00:22 lr 0.000406 wd 0.0500 time 0.2253 (0.2482) data time 0.0008 (0.0026) model time 0.2245 (0.2456) loss 3.1555 (3.1146) grad_norm 2.1380 (2.7999) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1170/1251] eta 0:00:20 lr 0.000406 wd 0.0500 time 0.2318 (0.2477) data time 0.0007 (0.0025) model time 0.2312 (0.2452) loss 
3.1672 (3.1158) grad_norm 4.8854 (2.8076) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1180/1251] eta 0:00:17 lr 0.000406 wd 0.0500 time 0.2292 (0.2473) data time 0.0010 (0.0025) model time 0.2282 (0.2448) loss 2.9765 (3.1090) grad_norm 2.1405 (2.8077) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1190/1251] eta 0:00:15 lr 0.000406 wd 0.0500 time 0.2329 (0.2468) data time 0.0006 (0.0025) model time 0.2322 (0.2444) loss 3.5261 (3.1085) grad_norm 3.0594 (2.8055) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1200/1251] eta 0:00:12 lr 0.000406 wd 0.0500 time 0.2230 (0.2465) data time 0.0008 (0.0024) model time 0.2223 (0.2440) loss 2.0212 (3.1079) grad_norm 2.0745 (2.8058) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1210/1251] eta 0:00:10 lr 0.000406 wd 0.0500 time 0.2195 (0.2460) data time 0.0008 (0.0024) model time 0.2187 (0.2436) loss 3.5415 (3.1148) grad_norm 2.6591 (2.8058) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1220/1251] eta 0:00:07 lr 0.000406 wd 0.0500 time 0.2317 (0.2457) data time 0.0009 (0.0024) model time 0.2308 (0.2433) loss 3.4050 (3.1176) grad_norm 2.1997 (2.8008) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1230/1251] eta 0:00:05 lr 0.000406 wd 0.0500 time 0.2437 (0.2454) data time 0.0006 (0.0024) model time 0.2431 (0.2430) loss 2.6648 (3.1185) grad_norm 2.4255 (2.8072) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1240/1251] eta 
0:00:02 lr 0.000406 wd 0.0500 time 0.2173 (0.2450) data time 0.0003 (0.0024) model time 0.2170 (0.2426) loss 3.6313 (3.1203) grad_norm 2.4500 (2.8049) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [177/300][1250/1251] eta 0:00:00 lr 0.000406 wd 0.0500 time 0.2130 (0.2445) data time 0.0007 (0.0024) model time 0.2124 (0.2421) loss 3.3530 (3.1218) grad_norm 2.6686 (2.8129) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 12:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 177 training takes 0:02:26 [2024-08-27 12:58:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 12:58:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 12:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.379 (0.379) Loss 0.4575 (0.4575) Acc@1 91.016 (91.016) Acc@5 97.949 (97.949) Mem 7377MB [2024-08-27 12:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.098) Loss 0.7300 (0.7143) Acc@1 84.082 (84.792) Acc@5 96.777 (96.955) Mem 7377MB [2024-08-27 12:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.085) Loss 0.9839 (0.7319) Acc@1 77.246 (84.063) Acc@5 94.727 (96.991) Mem 7377MB [2024-08-27 12:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.064 (0.080) Loss 1.2539 (0.8293) Acc@1 69.727 (81.678) Acc@5 91.992 (95.880) Mem 7377MB [2024-08-27 12:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.076) Loss 1.1133 (0.8827) Acc@1 74.609 (80.300) Acc@5 92.480 (95.277) Mem 7377MB [2024-08-27 12:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 79.982 Acc@5 95.274 [2024-08-27 12:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of 
the network on the 50000 test images: 80.0% [2024-08-27 12:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.750 (0.750) Loss 0.3982 (0.3982) Acc@1 92.969 (92.969) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-27 12:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.135) Loss 0.6162 (0.6230) Acc@1 87.988 (86.772) Acc@5 97.266 (97.461) Mem 7377MB [2024-08-27 12:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.064 (0.105) Loss 0.8853 (0.6473) Acc@1 79.004 (85.882) Acc@5 95.703 (97.470) Mem 7377MB [2024-08-27 12:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.093) Loss 1.1191 (0.7336) Acc@1 72.266 (83.761) Acc@5 93.066 (96.557) Mem 7377MB [2024-08-27 12:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.085) Loss 1.0020 (0.7776) Acc@1 75.195 (82.434) Acc@5 93.848 (96.084) Mem 7377MB [2024-08-27 12:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 13:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 13:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 13:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 13:04:01 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 13:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 13:04:12 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 13:04:13 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 13:04:14 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 13:04:14 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 177) [2024-08-27 13:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 13:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][0/1251] eta 4:32:41 lr 0.000406 wd 0.0500 time 13.0787 (13.0787) data time 1.2621 (1.2621) model time 0.0000 (0.0000) loss 4.0005 (4.0005) grad_norm 3.7835 (3.7835) loss_scale 1024.0000 (1024.0000) mem 20032MB [2024-08-27 13:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][10/1251] eta 0:29:23 lr 0.000406 wd 0.0500 time 0.2330 (1.4213) data time 0.0012 (0.1170) model time 0.0000 (0.0000) loss 2.7950 (3.4008) grad_norm 2.2708 (3.3132) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 13:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][20/1251] eta 0:17:31 lr 0.000406 wd 0.0500 time 0.2331 (0.8544) data time 0.0011 (0.0618) model time 0.0000 (0.0000) loss 2.9306 (3.3545) grad_norm 3.2385 (2.9053) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 13:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][30/1251] eta 0:13:17 lr 0.000406 wd 0.0500 time 0.2230 (0.6528) data time 0.0009 (0.0423) model time 0.0000 (0.0000) loss 2.2079 (3.3382) grad_norm 2.5092 (2.8716) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 13:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][40/1251] eta 0:11:05 lr 0.000406 wd 0.0500 time 0.2278 (0.5492) data time 0.0010 (0.0323) model time 0.0000 (0.0000) loss 3.1522 (3.2574) grad_norm 3.5649 (2.8509) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-27 13:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[178/300][50/1251] eta 0:09:44 lr 0.000405 wd 0.0500 time 0.2211 (0.4863) data time 0.0007 (0.0262) model time 0.0000 (0.0000) loss 3.3952 (3.2533) grad_norm 2.4088 (2.8274) loss_scale 2048.0000 (1224.7843) mem 7373MB [2024-08-27 13:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][60/1251] eta 0:08:48 lr 0.000405 wd 0.0500 time 0.2395 (0.4440) data time 0.0011 (0.0220) model time 0.2384 (0.2273) loss 3.1600 (3.2180) grad_norm 2.9896 (2.7884) loss_scale 2048.0000 (1359.7377) mem 7373MB [2024-08-27 13:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][70/1251] eta 0:08:08 lr 0.000405 wd 0.0500 time 0.2257 (0.4138) data time 0.0016 (0.0191) model time 0.2241 (0.2281) loss 2.9757 (3.1816) grad_norm 3.7941 (2.8291) loss_scale 2048.0000 (1456.6761) mem 7373MB [2024-08-27 13:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][80/1251] eta 0:07:38 lr 0.000405 wd 0.0500 time 0.2205 (0.3912) data time 0.0009 (0.0169) model time 0.2196 (0.2284) loss 2.8312 (3.1847) grad_norm 2.1629 (2.8203) loss_scale 2048.0000 (1529.6790) mem 7373MB [2024-08-27 13:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][90/1251] eta 0:07:13 lr 0.000405 wd 0.0500 time 0.2253 (0.3732) data time 0.0009 (0.0151) model time 0.2245 (0.2281) loss 3.8178 (3.1738) grad_norm 2.9419 (2.8069) loss_scale 2048.0000 (1586.6374) mem 7373MB [2024-08-27 13:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][100/1251] eta 0:06:53 lr 0.000405 wd 0.0500 time 0.2226 (0.3589) data time 0.0008 (0.0139) model time 0.2219 (0.2276) loss 3.6155 (3.1889) grad_norm 3.6743 (2.7974) loss_scale 2048.0000 (1632.3168) mem 7373MB [2024-08-27 13:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][110/1251] eta 0:06:35 lr 0.000405 wd 0.0500 time 0.2259 (0.3470) data time 0.0010 (0.0127) model time 0.2249 (0.2273) loss 2.6752 (3.1862) grad_norm 2.1858 (2.8047) loss_scale 2048.0000 
(1669.7658) mem 7373MB [2024-08-27 13:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][120/1251] eta 0:06:21 lr 0.000405 wd 0.0500 time 0.2278 (0.3373) data time 0.0007 (0.0118) model time 0.2270 (0.2276) loss 2.0108 (3.1887) grad_norm 3.2314 (2.7767) loss_scale 2048.0000 (1701.0248) mem 7373MB [2024-08-27 13:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][130/1251] eta 0:06:08 lr 0.000405 wd 0.0500 time 0.2363 (0.3291) data time 0.0011 (0.0109) model time 0.2352 (0.2276) loss 3.4471 (3.1875) grad_norm 2.1439 (2.8367) loss_scale 2048.0000 (1727.5115) mem 7373MB [2024-08-27 13:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][140/1251] eta 0:05:57 lr 0.000405 wd 0.0500 time 0.2268 (0.3220) data time 0.0010 (0.0102) model time 0.2258 (0.2277) loss 3.4285 (3.1784) grad_norm 4.7166 (2.8388) loss_scale 2048.0000 (1750.2411) mem 7373MB [2024-08-27 13:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][150/1251] eta 0:05:47 lr 0.000405 wd 0.0500 time 0.2328 (0.3158) data time 0.0009 (0.0096) model time 0.2319 (0.2276) loss 2.2905 (3.1702) grad_norm 4.5969 (2.8493) loss_scale 2048.0000 (1769.9603) mem 7373MB [2024-08-27 13:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][160/1251] eta 0:05:38 lr 0.000405 wd 0.0500 time 0.2267 (0.3104) data time 0.0009 (0.0091) model time 0.2257 (0.2276) loss 3.1217 (3.1761) grad_norm 2.7043 (2.8981) loss_scale 2048.0000 (1787.2298) mem 7373MB [2024-08-27 13:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][170/1251] eta 0:05:30 lr 0.000405 wd 0.0500 time 0.2292 (0.3060) data time 0.0010 (0.0086) model time 0.2282 (0.2281) loss 3.1978 (3.1648) grad_norm 2.6759 (2.9009) loss_scale 2048.0000 (1802.4795) mem 7373MB [2024-08-27 13:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][180/1251] eta 0:05:23 lr 0.000405 wd 0.0500 time 0.2311 (0.3018) data time 0.0009 
(0.0082) model time 0.2302 (0.2282) loss 3.7853 (3.1530) grad_norm 2.0565 (2.8814) loss_scale 2048.0000 (1816.0442) mem 7373MB [2024-08-27 13:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][190/1251] eta 0:05:15 lr 0.000405 wd 0.0500 time 0.2272 (0.2978) data time 0.0009 (0.0078) model time 0.2263 (0.2280) loss 3.3234 (3.1484) grad_norm 3.1508 (2.8624) loss_scale 2048.0000 (1828.1885) mem 7373MB [2024-08-27 13:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][200/1251] eta 0:05:09 lr 0.000405 wd 0.0500 time 0.2244 (0.2943) data time 0.0010 (0.0075) model time 0.2234 (0.2279) loss 2.4751 (3.1353) grad_norm 3.4480 (2.8735) loss_scale 2048.0000 (1839.1244) mem 7373MB [2024-08-27 13:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][210/1251] eta 0:05:03 lr 0.000405 wd 0.0500 time 0.2284 (0.2912) data time 0.0015 (0.0072) model time 0.2269 (0.2279) loss 3.2906 (3.1286) grad_norm 3.7287 (2.8658) loss_scale 2048.0000 (1849.0237) mem 7373MB [2024-08-27 13:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][220/1251] eta 0:04:57 lr 0.000405 wd 0.0500 time 0.2201 (0.2882) data time 0.0008 (0.0069) model time 0.2194 (0.2276) loss 3.5813 (3.1261) grad_norm 4.0221 (2.8674) loss_scale 2048.0000 (1858.0271) mem 7373MB [2024-08-27 13:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][230/1251] eta 0:04:51 lr 0.000405 wd 0.0500 time 0.2272 (0.2857) data time 0.0009 (0.0067) model time 0.2262 (0.2277) loss 2.0077 (3.1241) grad_norm 2.5089 (2.8784) loss_scale 2048.0000 (1866.2511) mem 7373MB [2024-08-27 13:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][240/1251] eta 0:04:46 lr 0.000405 wd 0.0500 time 0.2224 (0.2834) data time 0.0007 (0.0065) model time 0.2218 (0.2277) loss 3.1952 (3.1232) grad_norm 2.8595 (2.8727) loss_scale 2048.0000 (1873.7925) mem 7373MB [2024-08-27 13:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [178/300][250/1251] eta 0:04:41 lr 0.000405 wd 0.0500 time 0.2348 (0.2813) data time 0.0007 (0.0063) model time 0.2341 (0.2278) loss 3.3985 (3.1153) grad_norm 2.6293 (2.8644) loss_scale 2048.0000 (1880.7331) mem 7373MB [2024-08-27 13:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][260/1251] eta 0:04:36 lr 0.000405 wd 0.0500 time 0.2309 (0.2793) data time 0.0013 (0.0061) model time 0.2296 (0.2277) loss 3.2910 (3.1095) grad_norm 11.6904 (2.8933) loss_scale 2048.0000 (1887.1418) mem 7373MB [2024-08-27 13:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][270/1251] eta 0:04:32 lr 0.000405 wd 0.0500 time 0.2293 (0.2775) data time 0.0011 (0.0059) model time 0.2282 (0.2279) loss 3.4371 (3.1049) grad_norm 3.4709 (2.9086) loss_scale 2048.0000 (1893.0775) mem 7373MB [2024-08-27 13:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][280/1251] eta 0:04:27 lr 0.000404 wd 0.0500 time 0.2417 (0.2757) data time 0.0009 (0.0057) model time 0.2408 (0.2278) loss 3.2067 (3.1075) grad_norm 2.5053 (2.8976) loss_scale 2048.0000 (1898.5907) mem 7373MB [2024-08-27 13:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][290/1251] eta 0:04:24 lr 0.000404 wd 0.0500 time 0.2244 (0.2749) data time 0.0009 (0.0056) model time 0.2234 (0.2287) loss 2.0020 (3.1005) grad_norm 2.1230 (2.8967) loss_scale 2048.0000 (1903.7251) mem 7373MB [2024-08-27 13:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][300/1251] eta 0:04:19 lr 0.000404 wd 0.0500 time 0.2249 (0.2732) data time 0.0009 (0.0054) model time 0.2241 (0.2286) loss 2.8597 (3.0928) grad_norm 2.9228 (2.8839) loss_scale 2048.0000 (1908.5183) mem 7373MB [2024-08-27 13:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][310/1251] eta 0:04:16 lr 0.000404 wd 0.0500 time 0.2230 (0.2725) data time 0.0009 (0.0053) model time 0.2220 (0.2294) loss 3.3072 (3.0948) grad_norm 2.0907 (2.8860) loss_scale 
2048.0000 (1913.0032) mem 7373MB [2024-08-27 13:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][320/1251] eta 0:04:12 lr 0.000404 wd 0.0500 time 0.2255 (0.2715) data time 0.0007 (0.0052) model time 0.2248 (0.2297) loss 3.6730 (3.1033) grad_norm 2.4507 (2.8757) loss_scale 2048.0000 (1917.2087) mem 7373MB [2024-08-27 13:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][330/1251] eta 0:04:09 lr 0.000404 wd 0.0500 time 0.2260 (0.2706) data time 0.0011 (0.0051) model time 0.2249 (0.2301) loss 2.2532 (3.1044) grad_norm 2.5805 (2.8610) loss_scale 2048.0000 (1921.1601) mem 7373MB [2024-08-27 13:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][340/1251] eta 0:04:05 lr 0.000404 wd 0.0500 time 0.2255 (0.2694) data time 0.0015 (0.0050) model time 0.2240 (0.2300) loss 3.1869 (3.1062) grad_norm 2.3663 (2.8513) loss_scale 2048.0000 (1924.8798) mem 7373MB [2024-08-27 13:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][350/1251] eta 0:04:01 lr 0.000404 wd 0.0500 time 0.2279 (0.2682) data time 0.0007 (0.0049) model time 0.2272 (0.2298) loss 3.2710 (3.1048) grad_norm 2.6995 (2.8376) loss_scale 2048.0000 (1928.3875) mem 7373MB [2024-08-27 13:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][360/1251] eta 0:03:57 lr 0.000404 wd 0.0500 time 0.2247 (0.2670) data time 0.0009 (0.0048) model time 0.2238 (0.2298) loss 2.6405 (3.1030) grad_norm 2.3281 (2.8296) loss_scale 2048.0000 (1931.7008) mem 7373MB [2024-08-27 13:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][370/1251] eta 0:03:54 lr 0.000404 wd 0.0500 time 0.2225 (0.2660) data time 0.0011 (0.0047) model time 0.2214 (0.2297) loss 2.6585 (3.1025) grad_norm 2.2821 (2.8674) loss_scale 2048.0000 (1934.8356) mem 7373MB [2024-08-27 13:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][380/1251] eta 0:03:50 lr 0.000404 wd 0.0500 time 0.2296 (0.2651) data time 
0.0008 (0.0046) model time 0.2287 (0.2297) loss 2.1262 (3.1033) grad_norm 2.2988 (2.8623) loss_scale 2048.0000 (1937.8058) mem 7373MB [2024-08-27 13:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][390/1251] eta 0:03:47 lr 0.000404 wd 0.0500 time 0.2203 (0.2641) data time 0.0014 (0.0045) model time 0.2189 (0.2296) loss 3.6294 (3.1005) grad_norm 2.5531 (2.8668) loss_scale 2048.0000 (1940.6240) mem 7373MB [2024-08-27 13:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][400/1251] eta 0:03:44 lr 0.000404 wd 0.0500 time 0.2264 (0.2632) data time 0.0011 (0.0044) model time 0.2253 (0.2295) loss 3.1568 (3.1088) grad_norm 1.9991 (2.8669) loss_scale 2048.0000 (1943.3017) mem 7373MB [2024-08-27 13:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][410/1251] eta 0:03:40 lr 0.000404 wd 0.0500 time 0.2289 (0.2624) data time 0.0011 (0.0043) model time 0.2278 (0.2295) loss 3.2854 (3.1133) grad_norm 2.0565 (2.8687) loss_scale 2048.0000 (1945.8491) mem 7373MB [2024-08-27 13:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][420/1251] eta 0:03:37 lr 0.000404 wd 0.0500 time 0.2368 (0.2616) data time 0.0009 (0.0042) model time 0.2359 (0.2294) loss 4.1793 (3.1130) grad_norm 3.8897 (2.8728) loss_scale 2048.0000 (1948.2755) mem 7373MB [2024-08-27 13:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][430/1251] eta 0:03:34 lr 0.000404 wd 0.0500 time 0.2333 (0.2610) data time 0.0013 (0.0042) model time 0.2320 (0.2294) loss 3.4990 (3.1217) grad_norm 3.4992 (2.8732) loss_scale 2048.0000 (1950.5893) mem 7373MB [2024-08-27 13:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][440/1251] eta 0:03:30 lr 0.000404 wd 0.0500 time 0.2261 (0.2602) data time 0.0009 (0.0042) model time 0.2253 (0.2293) loss 4.0858 (3.1262) grad_norm 2.4489 (2.8780) loss_scale 2048.0000 (1952.7982) mem 7373MB [2024-08-27 13:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [178/300][450/1251] eta 0:03:27 lr 0.000404 wd 0.0500 time 0.2294 (0.2596) data time 0.0008 (0.0041) model time 0.2287 (0.2294) loss 3.2782 (3.1255) grad_norm 3.6920 (2.8792) loss_scale 2048.0000 (1954.9091) mem 7373MB [2024-08-27 13:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][460/1251] eta 0:03:24 lr 0.000404 wd 0.0500 time 0.2352 (0.2589) data time 0.0009 (0.0040) model time 0.2342 (0.2293) loss 3.5076 (3.1218) grad_norm 2.9352 (2.8746) loss_scale 2048.0000 (1956.9284) mem 7373MB [2024-08-27 13:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][470/1251] eta 0:03:21 lr 0.000404 wd 0.0500 time 0.2285 (0.2583) data time 0.0011 (0.0040) model time 0.2274 (0.2293) loss 3.0636 (3.1132) grad_norm 3.6712 (2.8741) loss_scale 2048.0000 (1958.8620) mem 7373MB [2024-08-27 13:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][480/1251] eta 0:03:18 lr 0.000404 wd 0.0500 time 0.2218 (0.2577) data time 0.0010 (0.0039) model time 0.2208 (0.2293) loss 3.4561 (3.1120) grad_norm 2.9129 (2.8680) loss_scale 2048.0000 (1960.7152) mem 7373MB [2024-08-27 13:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][490/1251] eta 0:03:15 lr 0.000404 wd 0.0500 time 0.2224 (0.2570) data time 0.0012 (0.0039) model time 0.2212 (0.2292) loss 2.9796 (3.1166) grad_norm 2.4643 (2.8654) loss_scale 2048.0000 (1962.4929) mem 7373MB [2024-08-27 13:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][500/1251] eta 0:03:12 lr 0.000404 wd 0.0500 time 0.2283 (0.2564) data time 0.0010 (0.0038) model time 0.2272 (0.2291) loss 3.3630 (3.1143) grad_norm 2.2926 (2.8546) loss_scale 2048.0000 (1964.1996) mem 7373MB [2024-08-27 13:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][510/1251] eta 0:03:09 lr 0.000403 wd 0.0500 time 0.2188 (0.2559) data time 0.0008 (0.0038) model time 0.2180 (0.2291) loss 3.5324 (3.1170) grad_norm 2.1307 (2.8466) 
loss_scale 2048.0000 (1965.8395) mem 7373MB [2024-08-27 13:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][520/1251] eta 0:03:06 lr 0.000403 wd 0.0500 time 0.2288 (0.2554) data time 0.0008 (0.0037) model time 0.2280 (0.2291) loss 2.8012 (3.1187) grad_norm 2.2319 (2.8434) loss_scale 2048.0000 (1967.4165) mem 7373MB [2024-08-27 13:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][530/1251] eta 0:03:03 lr 0.000403 wd 0.0500 time 0.2281 (0.2550) data time 0.0013 (0.0037) model time 0.2269 (0.2291) loss 3.2230 (3.1131) grad_norm 2.2308 (2.8481) loss_scale 2048.0000 (1968.9341) mem 7373MB [2024-08-27 13:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][540/1251] eta 0:03:00 lr 0.000403 wd 0.0500 time 0.2293 (0.2544) data time 0.0012 (0.0037) model time 0.2280 (0.2290) loss 3.2995 (3.1144) grad_norm 4.6095 (2.8590) loss_scale 2048.0000 (1970.3956) mem 7373MB [2024-08-27 13:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][550/1251] eta 0:02:58 lr 0.000403 wd 0.0500 time 0.2309 (0.2539) data time 0.0013 (0.0036) model time 0.2296 (0.2289) loss 3.9452 (3.1176) grad_norm 2.3570 (2.8584) loss_scale 2048.0000 (1971.8040) mem 7373MB [2024-08-27 13:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][560/1251] eta 0:02:55 lr 0.000403 wd 0.0500 time 0.2270 (0.2535) data time 0.0010 (0.0036) model time 0.2260 (0.2289) loss 3.2082 (3.1205) grad_norm 3.4299 (2.8626) loss_scale 2048.0000 (1973.1622) mem 7373MB [2024-08-27 13:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][570/1251] eta 0:02:52 lr 0.000403 wd 0.0500 time 0.2301 (0.2531) data time 0.0011 (0.0035) model time 0.2291 (0.2289) loss 3.3536 (3.1254) grad_norm 2.4577 (2.8623) loss_scale 2048.0000 (1974.4729) mem 7373MB [2024-08-27 13:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][580/1251] eta 0:02:49 lr 0.000403 wd 0.0500 time 0.2293 (0.2526) 
data time 0.0007 (0.0035) model time 0.2286 (0.2288) loss 3.5522 (3.1263) grad_norm 2.4172 (2.8568) loss_scale 2048.0000 (1975.7384) mem 7373MB [2024-08-27 13:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][590/1251] eta 0:02:46 lr 0.000403 wd 0.0500 time 0.2212 (0.2522) data time 0.0012 (0.0034) model time 0.2200 (0.2288) loss 3.5364 (3.1283) grad_norm 2.8362 (2.8572) loss_scale 2048.0000 (1976.9611) mem 7373MB [2024-08-27 13:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][600/1251] eta 0:02:43 lr 0.000403 wd 0.0500 time 0.2257 (0.2518) data time 0.0009 (0.0034) model time 0.2248 (0.2288) loss 1.9410 (3.1274) grad_norm 4.2183 (2.8577) loss_scale 2048.0000 (1978.1431) mem 7373MB [2024-08-27 13:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][610/1251] eta 0:02:41 lr 0.000403 wd 0.0500 time 0.2297 (0.2514) data time 0.0016 (0.0034) model time 0.2280 (0.2287) loss 3.2353 (3.1271) grad_norm 2.0405 (2.8581) loss_scale 2048.0000 (1979.2864) mem 7373MB [2024-08-27 13:06:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][620/1251] eta 0:02:38 lr 0.000403 wd 0.0500 time 0.2405 (0.2511) data time 0.0008 (0.0033) model time 0.2397 (0.2287) loss 3.8869 (3.1277) grad_norm 4.1822 (2.8567) loss_scale 2048.0000 (1980.3929) mem 7373MB [2024-08-27 13:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][630/1251] eta 0:02:35 lr 0.000403 wd 0.0500 time 0.2254 (0.2507) data time 0.0016 (0.0033) model time 0.2238 (0.2287) loss 3.5565 (3.1303) grad_norm 2.3050 (2.8680) loss_scale 2048.0000 (1981.4643) mem 7373MB [2024-08-27 13:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][640/1251] eta 0:02:33 lr 0.000403 wd 0.0500 time 0.2349 (0.2504) data time 0.0007 (0.0033) model time 0.2342 (0.2287) loss 1.9992 (3.1259) grad_norm 2.8273 (2.8664) loss_scale 2048.0000 (1982.5023) mem 7373MB [2024-08-27 13:07:02 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [178/300][650/1251] eta 0:02:30 lr 0.000403 wd 0.0500 time 0.2252 (0.2501) data time 0.0009 (0.0032) model time 0.2243 (0.2288) loss 3.8820 (3.1259) grad_norm 3.2813 (2.8801) loss_scale 2048.0000 (1983.5084) mem 7373MB [2024-08-27 13:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][660/1251] eta 0:02:27 lr 0.000403 wd 0.0500 time 0.2257 (0.2498) data time 0.0009 (0.0032) model time 0.2249 (0.2287) loss 2.1630 (3.1200) grad_norm 4.0277 (2.8791) loss_scale 2048.0000 (1984.4841) mem 7373MB [2024-08-27 13:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][670/1251] eta 0:02:24 lr 0.000403 wd 0.0500 time 0.2218 (0.2494) data time 0.0016 (0.0032) model time 0.2202 (0.2287) loss 3.6952 (3.1239) grad_norm 6.1992 (2.8779) loss_scale 2048.0000 (1985.4307) mem 7373MB [2024-08-27 13:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][680/1251] eta 0:02:22 lr 0.000403 wd 0.0500 time 0.2230 (0.2491) data time 0.0008 (0.0032) model time 0.2222 (0.2287) loss 3.4533 (3.1244) grad_norm 2.9812 (2.8776) loss_scale 2048.0000 (1986.3495) mem 7373MB [2024-08-27 13:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][690/1251] eta 0:02:19 lr 0.000403 wd 0.0500 time 0.2314 (0.2489) data time 0.0013 (0.0031) model time 0.2301 (0.2287) loss 2.3801 (3.1219) grad_norm 3.8384 (2.8763) loss_scale 2048.0000 (1987.2417) mem 7373MB [2024-08-27 13:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][700/1251] eta 0:02:16 lr 0.000403 wd 0.0500 time 0.2284 (0.2486) data time 0.0009 (0.0031) model time 0.2275 (0.2287) loss 3.6481 (3.1217) grad_norm 2.4274 (2.8745) loss_scale 2048.0000 (1988.1084) mem 7373MB [2024-08-27 13:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][710/1251] eta 0:02:14 lr 0.000403 wd 0.0500 time 0.2305 (0.2483) data time 0.0010 (0.0031) model time 0.2296 (0.2287) loss 2.0199 (3.1202) grad_norm 
3.1477 (2.8740) loss_scale 2048.0000 (1988.9508) mem 7373MB [2024-08-27 13:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][720/1251] eta 0:02:11 lr 0.000403 wd 0.0500 time 0.2255 (0.2481) data time 0.0010 (0.0031) model time 0.2245 (0.2287) loss 3.8880 (3.1176) grad_norm 2.7277 (2.8739) loss_scale 2048.0000 (1989.7698) mem 7373MB [2024-08-27 13:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][730/1251] eta 0:02:09 lr 0.000403 wd 0.0500 time 0.2380 (0.2478) data time 0.0009 (0.0030) model time 0.2371 (0.2287) loss 3.0592 (3.1184) grad_norm 2.3561 (2.8700) loss_scale 2048.0000 (1990.5663) mem 7373MB [2024-08-27 13:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][740/1251] eta 0:02:06 lr 0.000402 wd 0.0500 time 0.2322 (0.2477) data time 0.0011 (0.0030) model time 0.2311 (0.2287) loss 3.5261 (3.1223) grad_norm 2.4599 (2.8679) loss_scale 2048.0000 (1991.3414) mem 7373MB [2024-08-27 13:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][750/1251] eta 0:02:03 lr 0.000402 wd 0.0500 time 0.2263 (0.2474) data time 0.0012 (0.0030) model time 0.2251 (0.2287) loss 3.3090 (3.1205) grad_norm 2.0701 (2.8654) loss_scale 2048.0000 (1992.0959) mem 7373MB [2024-08-27 13:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][760/1251] eta 0:02:01 lr 0.000402 wd 0.0500 time 0.2299 (0.2472) data time 0.0011 (0.0030) model time 0.2288 (0.2287) loss 3.2845 (3.1203) grad_norm 3.0145 (2.8642) loss_scale 2048.0000 (1992.8305) mem 7373MB [2024-08-27 13:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 13:07:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 13:07:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 13:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 13:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 13:09:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 13:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 13:09:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 13:09:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 13:09:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 13:09:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 178) [2024-08-27 13:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 13:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][770/1251] eta 0:13:32 lr 0.000402 wd 0.0500 time 0.2407 (1.6894) data time 0.0006 (0.0825) model time 0.2401 (1.6069) loss 3.2531 (3.4569) grad_norm 5.4930 (3.0322) loss_scale 2048.0000 (2048.0000) mem 7372MB [2024-08-27 13:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][780/1251] eta 0:07:14 lr 0.000402 wd 0.0500 time 0.2317 (0.9216) data time 0.0010 (0.0397) model time 0.2307 (0.8819) loss 3.1342 (3.3142) grad_norm 1.9500 (2.8296) loss_scale 2048.0000 (2048.0000) mem 7372MB [2024-08-27 13:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][790/1251] eta 0:05:15 lr 0.000402 wd 0.0500 time 0.2286 (0.6833) data time 0.0007 (0.0263) model time 0.2279 (0.6570) loss 3.9635 
(3.3929) grad_norm 6.5780 (2.9136) loss_scale 2048.0000 (2048.0000) mem 7372MB [2024-08-27 13:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][800/1251] eta 0:04:15 lr 0.000402 wd 0.0500 time 0.2270 (0.5664) data time 0.0009 (0.0198) model time 0.2260 (0.5466) loss 3.1118 (3.3117) grad_norm 3.1568 (2.9035) loss_scale 2048.0000 (2048.0000) mem 7372MB [2024-08-27 13:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][810/1251] eta 0:03:39 lr 0.000402 wd 0.0500 time 0.2264 (0.4975) data time 0.0008 (0.0160) model time 0.2256 (0.4816) loss 2.7895 (3.2651) grad_norm 3.0735 (2.9040) loss_scale 2048.0000 (2048.0000) mem 7372MB [2024-08-27 13:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][820/1251] eta 0:03:14 lr 0.000402 wd 0.0500 time 0.2352 (0.4519) data time 0.0009 (0.0134) model time 0.2344 (0.4384) loss 2.3277 (3.2232) grad_norm 3.0906 (inf) loss_scale 1024.0000 (1926.5085) mem 7372MB [2024-08-27 13:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][830/1251] eta 0:02:56 lr 0.000402 wd 0.0500 time 0.2220 (0.4196) data time 0.0012 (0.0116) model time 0.2208 (0.4079) loss 2.9707 (3.2076) grad_norm 4.1237 (inf) loss_scale 1024.0000 (1795.7101) mem 7372MB [2024-08-27 13:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][840/1251] eta 0:02:42 lr 0.000402 wd 0.0500 time 0.2259 (0.3953) data time 0.0010 (0.0103) model time 0.2249 (0.3850) loss 3.5814 (3.1923) grad_norm 24.4605 (inf) loss_scale 1024.0000 (1698.0253) mem 7372MB [2024-08-27 13:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][850/1251] eta 0:02:30 lr 0.000402 wd 0.0500 time 0.2251 (0.3761) data time 0.0006 (0.0092) model time 0.2245 (0.3669) loss 3.0996 (3.1688) grad_norm 2.7300 (inf) loss_scale 1024.0000 (1622.2921) mem 7372MB [2024-08-27 13:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][860/1251] eta 0:02:21 lr 0.000402 wd 
0.0500 time 0.2197 (0.3612) data time 0.0011 (0.0084) model time 0.2186 (0.3528) loss 3.5265 (3.1825) grad_norm 2.0879 (inf) loss_scale 1024.0000 (1561.8586) mem 7372MB [2024-08-27 13:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][870/1251] eta 0:02:12 lr 0.000402 wd 0.0500 time 0.2243 (0.3490) data time 0.0009 (0.0077) model time 0.2234 (0.3412) loss 3.7933 (3.2017) grad_norm 2.2589 (inf) loss_scale 1024.0000 (1512.5138) mem 7372MB [2024-08-27 13:10:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][880/1251] eta 0:02:05 lr 0.000402 wd 0.0500 time 0.2332 (0.3388) data time 0.0009 (0.0072) model time 0.2323 (0.3317) loss 3.4656 (3.2123) grad_norm 3.0896 (inf) loss_scale 1024.0000 (1471.4622) mem 7372MB [2024-08-27 13:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][890/1251] eta 0:01:59 lr 0.000402 wd 0.0500 time 0.2241 (0.3301) data time 0.0009 (0.0067) model time 0.2233 (0.3234) loss 3.0794 (3.2058) grad_norm 1.9646 (inf) loss_scale 1024.0000 (1436.7752) mem 7372MB [2024-08-27 13:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][900/1251] eta 0:01:53 lr 0.000402 wd 0.0500 time 0.2234 (0.3227) data time 0.0010 (0.0063) model time 0.2224 (0.3164) loss 3.2008 (3.2057) grad_norm 2.3752 (inf) loss_scale 1024.0000 (1407.0791) mem 7372MB [2024-08-27 13:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][910/1251] eta 0:01:47 lr 0.000402 wd 0.0500 time 0.2265 (0.3163) data time 0.0007 (0.0059) model time 0.2258 (0.3103) loss 3.0491 (3.1917) grad_norm 3.2793 (inf) loss_scale 1024.0000 (1381.3691) mem 7372MB [2024-08-27 13:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][920/1251] eta 0:01:42 lr 0.000402 wd 0.0500 time 0.2264 (0.3109) data time 0.0006 (0.0056) model time 0.2258 (0.3053) loss 4.0083 (3.1855) grad_norm 2.5349 (inf) loss_scale 1024.0000 (1358.8931) mem 7372MB [2024-08-27 13:10:37 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [178/300][930/1251] eta 0:01:38 lr 0.000402 wd 0.0500 time 0.2266 (0.3061) data time 0.0011 (0.0054) model time 0.2255 (0.3007) loss 3.2016 (3.1810) grad_norm 2.4178 (inf) loss_scale 1024.0000 (1339.0769) mem 7372MB [2024-08-27 13:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][940/1251] eta 0:01:33 lr 0.000402 wd 0.0500 time 0.2292 (0.3017) data time 0.0008 (0.0052) model time 0.2284 (0.2966) loss 2.9032 (3.1568) grad_norm 7.4122 (inf) loss_scale 1024.0000 (1321.4749) mem 7372MB [2024-08-27 13:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][950/1251] eta 0:01:29 lr 0.000402 wd 0.0500 time 0.2301 (0.2978) data time 0.0006 (0.0049) model time 0.2294 (0.2929) loss 3.3632 (3.1519) grad_norm 2.6245 (inf) loss_scale 1024.0000 (1305.7354) mem 7372MB [2024-08-27 13:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][960/1251] eta 0:01:25 lr 0.000402 wd 0.0500 time 0.2239 (0.2942) data time 0.0010 (0.0047) model time 0.2229 (0.2895) loss 2.1662 (3.1276) grad_norm 2.3969 (inf) loss_scale 1024.0000 (1291.5779) mem 7372MB [2024-08-27 13:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][970/1251] eta 0:01:21 lr 0.000401 wd 0.0500 time 0.2232 (0.2910) data time 0.0007 (0.0046) model time 0.2224 (0.2864) loss 3.3934 (3.1254) grad_norm 3.3714 (inf) loss_scale 1024.0000 (1278.7751) mem 7372MB [2024-08-27 13:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][980/1251] eta 0:01:18 lr 0.000401 wd 0.0500 time 0.2279 (0.2882) data time 0.0007 (0.0044) model time 0.2271 (0.2838) loss 3.1976 (3.1228) grad_norm 1.8739 (inf) loss_scale 1024.0000 (1267.1416) mem 7372MB [2024-08-27 13:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][990/1251] eta 0:01:14 lr 0.000401 wd 0.0500 time 0.2237 (0.2856) data time 0.0008 (0.0042) model time 0.2229 (0.2813) loss 2.4096 (3.1211) grad_norm 2.8388 (inf) 
loss_scale 1024.0000 (1256.5240) mem 7372MB [2024-08-27 13:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1000/1251] eta 0:01:11 lr 0.000401 wd 0.0500 time 0.2362 (0.2831) data time 0.0006 (0.0041) model time 0.2356 (0.2790) loss 2.2643 (3.1166) grad_norm 1.6869 (inf) loss_scale 1024.0000 (1246.7950) mem 7372MB [2024-08-27 13:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1010/1251] eta 0:01:07 lr 0.000401 wd 0.0500 time 0.2288 (0.2810) data time 0.0010 (0.0040) model time 0.2278 (0.2770) loss 2.7121 (3.1132) grad_norm 3.8420 (inf) loss_scale 1024.0000 (1237.8474) mem 7372MB [2024-08-27 13:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1020/1251] eta 0:01:04 lr 0.000401 wd 0.0500 time 0.2253 (0.2789) data time 0.0010 (0.0039) model time 0.2243 (0.2750) loss 3.6929 (3.1081) grad_norm 3.3934 (inf) loss_scale 1024.0000 (1229.5907) mem 7372MB [2024-08-27 13:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1030/1251] eta 0:01:01 lr 0.000401 wd 0.0500 time 0.2257 (0.2769) data time 0.0006 (0.0038) model time 0.2251 (0.2731) loss 2.3055 (3.0990) grad_norm 2.5040 (inf) loss_scale 1024.0000 (1221.9480) mem 7372MB [2024-08-27 13:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1040/1251] eta 0:00:58 lr 0.000401 wd 0.0500 time 0.2264 (0.2752) data time 0.0009 (0.0037) model time 0.2255 (0.2715) loss 3.5831 (3.1069) grad_norm 2.7767 (inf) loss_scale 1024.0000 (1214.8530) mem 7372MB [2024-08-27 13:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1050/1251] eta 0:00:55 lr 0.000401 wd 0.0500 time 0.2317 (0.2743) data time 0.0008 (0.0036) model time 0.2309 (0.2708) loss 2.5431 (3.1089) grad_norm 2.3910 (inf) loss_scale 1024.0000 (1208.2491) mem 7372MB [2024-08-27 13:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1060/1251] eta 0:00:52 lr 0.000401 wd 0.0500 time 0.2256 (0.2727) data time 
0.0011 (0.0035) model time 0.2245 (0.2692) loss 3.3929 (3.0962) grad_norm 2.3026 (inf) loss_scale 1024.0000 (1202.0870) mem 7372MB [2024-08-27 13:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1070/1251] eta 0:00:49 lr 0.000401 wd 0.0500 time 0.2255 (0.2720) data time 0.0008 (0.0034) model time 0.2248 (0.2686) loss 3.5115 (3.0913) grad_norm 2.7628 (inf) loss_scale 1024.0000 (1196.3236) mem 7372MB [2024-08-27 13:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1080/1251] eta 0:00:46 lr 0.000401 wd 0.0500 time 0.2276 (0.2706) data time 0.0007 (0.0033) model time 0.2268 (0.2673) loss 3.8844 (3.0993) grad_norm 3.5130 (inf) loss_scale 1024.0000 (1190.9216) mem 7372MB [2024-08-27 13:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1090/1251] eta 0:00:43 lr 0.000401 wd 0.0500 time 0.2223 (0.2693) data time 0.0008 (0.0033) model time 0.2215 (0.2661) loss 2.0449 (3.0994) grad_norm 2.4484 (inf) loss_scale 1024.0000 (1185.8480) mem 7372MB [2024-08-27 13:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1100/1251] eta 0:00:40 lr 0.000401 wd 0.0500 time 0.2239 (0.2681) data time 0.0008 (0.0032) model time 0.2230 (0.2649) loss 2.5598 (3.0975) grad_norm 2.6469 (inf) loss_scale 1024.0000 (1181.0737) mem 7372MB [2024-08-27 13:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1110/1251] eta 0:00:37 lr 0.000401 wd 0.0500 time 0.2249 (0.2670) data time 0.0006 (0.0031) model time 0.2243 (0.2638) loss 3.4146 (3.0991) grad_norm 2.4135 (inf) loss_scale 1024.0000 (1176.5731) mem 7372MB [2024-08-27 13:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1120/1251] eta 0:00:34 lr 0.000401 wd 0.0500 time 0.2274 (0.2660) data time 0.0007 (0.0031) model time 0.2267 (0.2629) loss 2.1995 (3.0970) grad_norm 3.8797 (inf) loss_scale 1024.0000 (1172.3231) mem 7372MB [2024-08-27 13:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[178/300][1130/1251] eta 0:00:32 lr 0.000401 wd 0.0500 time 0.2252 (0.2650) data time 0.0008 (0.0030) model time 0.2244 (0.2619) loss 3.3350 (3.0970) grad_norm 1.7444 (inf) loss_scale 1024.0000 (1168.3035) mem 7372MB [2024-08-27 13:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1140/1251] eta 0:00:29 lr 0.000401 wd 0.0500 time 0.2335 (0.2640) data time 0.0007 (0.0030) model time 0.2328 (0.2610) loss 3.5730 (3.0961) grad_norm 2.2364 (inf) loss_scale 1024.0000 (1164.4960) mem 7372MB [2024-08-27 13:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1150/1251] eta 0:00:26 lr 0.000401 wd 0.0500 time 0.2282 (0.2631) data time 0.0007 (0.0030) model time 0.2276 (0.2601) loss 3.0323 (3.0901) grad_norm 2.4395 (inf) loss_scale 1024.0000 (1160.8843) mem 7372MB [2024-08-27 13:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1160/1251] eta 0:00:23 lr 0.000401 wd 0.0500 time 0.2232 (0.2622) data time 0.0006 (0.0029) model time 0.2226 (0.2593) loss 3.6336 (3.0939) grad_norm 3.3596 (inf) loss_scale 1024.0000 (1157.4536) mem 7372MB [2024-08-27 13:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1170/1251] eta 0:00:21 lr 0.000401 wd 0.0500 time 0.2281 (0.2613) data time 0.0008 (0.0029) model time 0.2272 (0.2585) loss 2.9687 (3.0992) grad_norm 3.1486 (inf) loss_scale 1024.0000 (1154.1907) mem 7372MB [2024-08-27 13:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1180/1251] eta 0:00:18 lr 0.000401 wd 0.0500 time 0.2220 (0.2606) data time 0.0006 (0.0028) model time 0.2214 (0.2578) loss 2.6027 (3.0990) grad_norm 2.2464 (inf) loss_scale 1024.0000 (1151.0835) mem 7372MB [2024-08-27 13:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1190/1251] eta 0:00:15 lr 0.000401 wd 0.0500 time 0.2327 (0.2598) data time 0.0007 (0.0028) model time 0.2320 (0.2570) loss 3.4982 (3.1069) grad_norm 2.8614 (inf) loss_scale 1024.0000 (1148.1212) mem 
7372MB [2024-08-27 13:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1200/1251] eta 0:00:13 lr 0.000400 wd 0.0500 time 0.2272 (0.2591) data time 0.0009 (0.0028) model time 0.2264 (0.2563) loss 3.1314 (3.1108) grad_norm 4.7067 (inf) loss_scale 1024.0000 (1145.2938) mem 7372MB [2024-08-27 13:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1210/1251] eta 0:00:10 lr 0.000400 wd 0.0500 time 0.2249 (0.2584) data time 0.0009 (0.0027) model time 0.2241 (0.2557) loss 3.2790 (3.1123) grad_norm 4.4512 (inf) loss_scale 1024.0000 (1142.5924) mem 7372MB [2024-08-27 13:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1220/1251] eta 0:00:07 lr 0.000400 wd 0.0500 time 0.2252 (0.2577) data time 0.0011 (0.0027) model time 0.2241 (0.2550) loss 2.8176 (3.1076) grad_norm 2.4111 (inf) loss_scale 1024.0000 (1140.0087) mem 7372MB [2024-08-27 13:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1230/1251] eta 0:00:05 lr 0.000400 wd 0.0500 time 0.2485 (0.2571) data time 0.0006 (0.0027) model time 0.2479 (0.2545) loss 2.0590 (3.0993) grad_norm 2.2950 (inf) loss_scale 1024.0000 (1137.5352) mem 7372MB [2024-08-27 13:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1240/1251] eta 0:00:02 lr 0.000400 wd 0.0500 time 0.2109 (0.2563) data time 0.0006 (0.0026) model time 0.2103 (0.2537) loss 3.3770 (3.0965) grad_norm 4.5154 (inf) loss_scale 1024.0000 (1135.1649) mem 7372MB [2024-08-27 13:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [178/300][1250/1251] eta 0:00:00 lr 0.000400 wd 0.0500 time 0.2108 (0.2554) data time 0.0006 (0.0026) model time 0.2102 (0.2528) loss 3.1998 (3.1005) grad_norm 2.3199 (inf) loss_scale 1024.0000 (1132.8916) mem 7372MB [2024-08-27 13:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 178 training takes 0:02:04 [2024-08-27 13:11:50 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 13:11:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 13:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.488 (0.488) Loss 0.4414 (0.4414) Acc@1 91.797 (91.797) Acc@5 98.633 (98.633) Mem 7372MB [2024-08-27 13:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.121) Loss 0.6929 (0.7041) Acc@1 86.914 (85.147) Acc@5 96.875 (96.884) Mem 7372MB [2024-08-27 13:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.102) Loss 0.9644 (0.7254) Acc@1 75.879 (83.984) Acc@5 95.117 (97.005) Mem 7372MB [2024-08-27 13:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.095) Loss 1.2246 (0.8241) Acc@1 70.410 (81.637) Acc@5 91.211 (95.851) Mem 7372MB [2024-08-27 13:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.088) Loss 1.1133 (0.8732) Acc@1 74.219 (80.333) Acc@5 92.383 (95.251) Mem 7372MB [2024-08-27 13:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.032 Acc@5 95.266 [2024-08-27 13:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.0% [2024-08-27 13:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.977 (0.977) Loss 0.3970 (0.3970) Acc@1 93.066 (93.066) Acc@5 98.340 (98.340) Mem 7372MB [2024-08-27 13:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.166) Loss 0.6138 (0.6228) Acc@1 88.086 (86.834) Acc@5 97.266 (97.505) Mem 7372MB [2024-08-27 13:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.124) Loss 0.8833 (0.6469) Acc@1 78.809 (85.914) Acc@5 95.703 (97.498) Mem 7372MB [2024-08-27 13:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.082 (0.110) Loss 1.1172 (0.7333) Acc@1 72.656 (83.786) Acc@5 92.969 (96.582) Mem 7372MB [2024-08-27 13:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.099) Loss 1.0010 (0.7772) Acc@1 75.586 (82.455) Acc@5 94.043 (96.103) Mem 7372MB [2024-08-27 13:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.010 Acc@5 96.076 [2024-08-27 13:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-08-27 13:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.01% [2024-08-27 13:12:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 13:12:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 13:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][0/1251] eta 0:18:48 lr 0.000400 wd 0.0500 time 0.9019 (0.9019) data time 0.5822 (0.5822) model time 0.0000 (0.0000) loss 2.8105 (2.8105) grad_norm 2.3053 (2.3053) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 13:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][10/1251] eta 0:05:57 lr 0.000400 wd 0.0500 time 0.2288 (0.2884) data time 0.0008 (0.0550) model time 0.0000 (0.0000) loss 2.3035 (3.0520) grad_norm 2.5540 (2.8594) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 13:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][20/1251] eta 0:05:17 lr 0.000400 wd 0.0500 time 0.2210 (0.2582) data time 0.0011 (0.0294) model time 0.0000 (0.0000) loss 3.2381 (3.1241) grad_norm 2.3630 (2.7966) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 13:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][30/1251] eta 0:05:02 lr 0.000400 wd 0.0500 time 0.2245 (0.2475) data time 0.0009 
(0.0203) model time 0.0000 (0.0000) loss 3.4176 (3.1722) grad_norm 2.2531 (2.9038) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 13:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][40/1251] eta 0:04:54 lr 0.000400 wd 0.0500 time 0.2265 (0.2432) data time 0.0009 (0.0155) model time 0.0000 (0.0000) loss 3.0657 (3.0708) grad_norm 2.9412 (2.9737) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 13:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][50/1251] eta 0:04:48 lr 0.000400 wd 0.0500 time 0.2417 (0.2401) data time 0.0006 (0.0127) model time 0.0000 (0.0000) loss 3.6106 (3.0785) grad_norm 5.1448 (3.0033) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 13:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][60/1251] eta 0:04:43 lr 0.000400 wd 0.0500 time 0.2225 (0.2377) data time 0.0010 (0.0108) model time 0.2215 (0.2244) loss 1.9744 (3.0772) grad_norm 2.7285 (2.9811) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 13:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][70/1251] eta 0:04:39 lr 0.000400 wd 0.0500 time 0.2358 (0.2365) data time 0.0008 (0.0094) model time 0.2350 (0.2262) loss 3.3197 (3.1188) grad_norm 2.4480 (3.0070) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 13:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 13:12:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 13:12:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 13:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 13:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 13:19:06 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 13:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 13:19:15 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 13:19:16 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 13:19:17 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 13:19:18 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 179) [2024-08-27 13:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 13:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][80/1251] eta 0:36:57 lr 0.000400 wd 0.0500 time 0.2237 (1.8936) data time 0.0010 (0.0971) model time 0.2227 (1.7964) loss 3.3099 (3.6420) grad_norm 2.3680 (3.0281) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][90/1251] eta 0:19:42 lr 0.000400 wd 0.0500 time 0.2256 (1.0181) data time 0.0015 (0.0467) model time 0.2242 (0.9714) loss 3.1426 (3.4706) grad_norm 2.1072 (2.8081) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][100/1251] eta 0:14:19 lr 0.000400 wd 0.0500 time 0.2309 (0.7467) data time 0.0007 (0.0311) model time 0.2301 (0.7155) loss 3.8308 
(3.4905) grad_norm 3.2602 (2.8829) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][110/1251] eta 0:11:40 lr 0.000400 wd 0.0500 time 0.2316 (0.6137) data time 0.0010 (0.0234) model time 0.2306 (0.5902) loss 3.1973 (3.4123) grad_norm 3.2097 (2.8583) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][120/1251] eta 0:10:05 lr 0.000400 wd 0.0500 time 0.2321 (0.5356) data time 0.0010 (0.0189) model time 0.2311 (0.5167) loss 3.0558 (3.3656) grad_norm 2.3302 (2.9605) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][130/1251] eta 0:09:01 lr 0.000400 wd 0.0500 time 0.2266 (0.4833) data time 0.0008 (0.0159) model time 0.2258 (0.4674) loss 2.3887 (3.3138) grad_norm 2.9067 (3.0336) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][140/1251] eta 0:08:15 lr 0.000400 wd 0.0500 time 0.2291 (0.4460) data time 0.0010 (0.0138) model time 0.2280 (0.4322) loss 3.4055 (3.2922) grad_norm 2.4182 (3.0594) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][150/1251] eta 0:07:40 lr 0.000400 wd 0.0500 time 0.2254 (0.4184) data time 0.0010 (0.0122) model time 0.2244 (0.4062) loss 3.0238 (3.2539) grad_norm 2.1919 (2.9755) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][160/1251] eta 0:07:13 lr 0.000400 wd 0.0500 time 0.2246 (0.3971) data time 0.0009 (0.0109) model time 0.2237 (0.3862) loss 3.1829 (3.2144) grad_norm 2.4079 (2.9189) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][170/1251] eta 0:06:50 lr 
0.000400 wd 0.0500 time 0.2313 (0.3800) data time 0.0013 (0.0099) model time 0.2301 (0.3700) loss 3.5120 (3.2331) grad_norm 1.8024 (2.8611) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][180/1251] eta 0:06:32 lr 0.000399 wd 0.0500 time 0.2259 (0.3661) data time 0.0008 (0.0091) model time 0.2250 (0.3569) loss 3.4680 (3.2501) grad_norm 1.7174 (2.8261) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][190/1251] eta 0:06:16 lr 0.000399 wd 0.0500 time 0.2265 (0.3547) data time 0.0012 (0.0085) model time 0.2253 (0.3462) loss 3.4419 (3.2422) grad_norm 4.1512 (2.9238) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][200/1251] eta 0:06:02 lr 0.000399 wd 0.0500 time 0.2369 (0.3448) data time 0.0008 (0.0079) model time 0.2361 (0.3369) loss 2.9502 (3.2287) grad_norm 2.6155 (3.0156) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][210/1251] eta 0:05:50 lr 0.000399 wd 0.0500 time 0.2275 (0.3366) data time 0.0008 (0.0074) model time 0.2267 (0.3292) loss 3.6486 (3.2196) grad_norm 2.2041 (2.9675) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][220/1251] eta 0:05:39 lr 0.000399 wd 0.0500 time 0.2317 (0.3296) data time 0.0008 (0.0072) model time 0.2309 (0.3224) loss 2.6844 (3.2015) grad_norm 3.6946 (2.9494) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][230/1251] eta 0:05:30 lr 0.000399 wd 0.0500 time 0.2354 (0.3233) data time 0.0007 (0.0068) model time 0.2347 (0.3165) loss 3.7464 (3.1952) grad_norm 2.3419 (2.9267) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 
13:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][240/1251] eta 0:05:21 lr 0.000399 wd 0.0500 time 0.2338 (0.3180) data time 0.0011 (0.0066) model time 0.2327 (0.3114) loss 3.0128 (3.1962) grad_norm 2.5820 (2.9307) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][250/1251] eta 0:05:13 lr 0.000399 wd 0.0500 time 0.2277 (0.3129) data time 0.0007 (0.0063) model time 0.2270 (0.3066) loss 3.3086 (3.1756) grad_norm 2.9016 (2.9417) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][260/1251] eta 0:05:05 lr 0.000399 wd 0.0500 time 0.2271 (0.3086) data time 0.0008 (0.0060) model time 0.2263 (0.3026) loss 3.8675 (3.1779) grad_norm 2.5487 (2.9498) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][270/1251] eta 0:04:58 lr 0.000399 wd 0.0500 time 0.2340 (0.3047) data time 0.0011 (0.0058) model time 0.2329 (0.2990) loss 2.3891 (3.1593) grad_norm 2.0166 (2.9330) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][280/1251] eta 0:04:52 lr 0.000399 wd 0.0500 time 0.2267 (0.3011) data time 0.0008 (0.0055) model time 0.2259 (0.2956) loss 3.2874 (3.1525) grad_norm 2.2517 (2.9072) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][290/1251] eta 0:04:46 lr 0.000399 wd 0.0500 time 0.2281 (0.2979) data time 0.0010 (0.0054) model time 0.2271 (0.2925) loss 3.6263 (3.1486) grad_norm 3.1006 (2.8917) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][300/1251] eta 0:04:40 lr 0.000399 wd 0.0500 time 0.2334 (0.2949) data time 0.0008 (0.0052) model time 0.2326 (0.2897) 
loss 2.8129 (3.1533) grad_norm 3.2022 (2.8906) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][310/1251] eta 0:04:34 lr 0.000399 wd 0.0500 time 0.2280 (0.2922) data time 0.0008 (0.0050) model time 0.2272 (0.2872) loss 2.4099 (3.1461) grad_norm 2.4628 (2.8840) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][320/1251] eta 0:04:29 lr 0.000399 wd 0.0500 time 0.2323 (0.2898) data time 0.0008 (0.0049) model time 0.2315 (0.2849) loss 3.1058 (3.1412) grad_norm 1.8998 (2.8722) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][330/1251] eta 0:04:24 lr 0.000399 wd 0.0500 time 0.2282 (0.2875) data time 0.0009 (0.0047) model time 0.2272 (0.2828) loss 3.2270 (3.1309) grad_norm 2.4872 (2.8998) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][340/1251] eta 0:04:19 lr 0.000399 wd 0.0500 time 0.2314 (0.2853) data time 0.0008 (0.0046) model time 0.2306 (0.2807) loss 1.9543 (3.1174) grad_norm 2.1102 (2.8984) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][350/1251] eta 0:04:15 lr 0.000399 wd 0.0500 time 0.2387 (0.2834) data time 0.0010 (0.0044) model time 0.2377 (0.2790) loss 3.6328 (3.1250) grad_norm 7.1102 (2.9055) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][360/1251] eta 0:04:11 lr 0.000399 wd 0.0500 time 0.2307 (0.2824) data time 0.0010 (0.0043) model time 0.2297 (0.2780) loss 3.2021 (3.1241) grad_norm 2.4555 (2.8921) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][370/1251] eta 
0:04:07 lr 0.000399 wd 0.0500 time 0.2294 (0.2806) data time 0.0010 (0.0042) model time 0.2285 (0.2763) loss 3.4038 (3.1142) grad_norm 2.0720 (2.8910) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][380/1251] eta 0:04:03 lr 0.000399 wd 0.0500 time 0.2307 (0.2798) data time 0.0009 (0.0041) model time 0.2298 (0.2757) loss 3.0147 (3.1096) grad_norm 4.2559 (2.8943) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][390/1251] eta 0:03:59 lr 0.000399 wd 0.0500 time 0.2432 (0.2783) data time 0.0008 (0.0040) model time 0.2424 (0.2742) loss 3.8546 (3.1200) grad_norm 2.8061 (2.8857) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][400/1251] eta 0:03:55 lr 0.000399 wd 0.0500 time 0.2294 (0.2767) data time 0.0010 (0.0040) model time 0.2284 (0.2728) loss 2.1201 (3.1242) grad_norm 1.9093 (2.8734) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][410/1251] eta 0:03:51 lr 0.000398 wd 0.0500 time 0.2353 (0.2754) data time 0.0013 (0.0039) model time 0.2340 (0.2715) loss 2.9022 (3.1235) grad_norm 2.5654 (2.8655) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][420/1251] eta 0:03:47 lr 0.000398 wd 0.0500 time 0.2257 (0.2740) data time 0.0008 (0.0038) model time 0.2249 (0.2702) loss 3.4987 (3.1253) grad_norm 2.3420 (2.8650) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][430/1251] eta 0:03:43 lr 0.000398 wd 0.0500 time 0.2204 (0.2728) data time 0.0009 (0.0037) model time 0.2195 (0.2691) loss 2.6118 (3.1223) grad_norm 2.8788 (2.8650) loss_scale 1024.0000 (1024.0000) mem 7379MB 
[2024-08-27 13:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][440/1251] eta 0:03:40 lr 0.000398 wd 0.0500 time 0.2297 (0.2717) data time 0.0011 (0.0036) model time 0.2286 (0.2681) loss 3.2586 (3.1211) grad_norm 2.3107 (2.8711) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][450/1251] eta 0:03:36 lr 0.000398 wd 0.0500 time 0.2645 (0.2708) data time 0.0007 (0.0037) model time 0.2638 (0.2671) loss 3.7098 (3.1209) grad_norm 2.2853 (2.8882) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][460/1251] eta 0:03:33 lr 0.000398 wd 0.0500 time 0.2326 (0.2697) data time 0.0008 (0.0036) model time 0.2318 (0.2661) loss 3.3022 (3.1133) grad_norm 2.5987 (2.8816) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][470/1251] eta 0:03:29 lr 0.000398 wd 0.0500 time 0.2278 (0.2687) data time 0.0007 (0.0036) model time 0.2271 (0.2652) loss 3.4009 (3.1189) grad_norm 2.6666 (2.8767) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][480/1251] eta 0:03:26 lr 0.000398 wd 0.0500 time 0.2246 (0.2678) data time 0.0010 (0.0035) model time 0.2236 (0.2642) loss 3.3207 (3.1227) grad_norm 3.1987 (2.8823) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][490/1251] eta 0:03:23 lr 0.000398 wd 0.0500 time 0.2328 (0.2669) data time 0.0007 (0.0035) model time 0.2321 (0.2635) loss 2.2938 (3.1189) grad_norm 2.0026 (2.8859) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][500/1251] eta 0:03:19 lr 0.000398 wd 0.0500 time 0.2274 (0.2660) data time 0.0007 (0.0034) model time 0.2267 
(0.2626) loss 3.0842 (3.1228) grad_norm 3.1387 (2.8880) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][510/1251] eta 0:03:16 lr 0.000398 wd 0.0500 time 0.2290 (0.2653) data time 0.0009 (0.0035) model time 0.2281 (0.2619) loss 3.1913 (3.1256) grad_norm 4.1983 (2.8908) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][520/1251] eta 0:03:13 lr 0.000398 wd 0.0500 time 0.2223 (0.2645) data time 0.0009 (0.0034) model time 0.2213 (0.2611) loss 3.0342 (3.1265) grad_norm 1.9444 (2.8823) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][530/1251] eta 0:03:10 lr 0.000398 wd 0.0500 time 0.2276 (0.2637) data time 0.0011 (0.0033) model time 0.2265 (0.2604) loss 2.8144 (3.1218) grad_norm 2.8615 (2.8799) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][540/1251] eta 0:03:06 lr 0.000398 wd 0.0500 time 0.2311 (0.2630) data time 0.0009 (0.0033) model time 0.2302 (0.2597) loss 2.2019 (3.1133) grad_norm 3.5195 (2.8759) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][550/1251] eta 0:03:03 lr 0.000398 wd 0.0500 time 0.2286 (0.2623) data time 0.0010 (0.0033) model time 0.2276 (0.2590) loss 3.3428 (3.1123) grad_norm 3.2180 (2.8800) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][560/1251] eta 0:03:00 lr 0.000398 wd 0.0500 time 0.2250 (0.2616) data time 0.0012 (0.0032) model time 0.2239 (0.2584) loss 3.1210 (3.1164) grad_norm 3.1040 (2.8859) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[179/300][570/1251] eta 0:02:57 lr 0.000398 wd 0.0500 time 0.2323 (0.2611) data time 0.0008 (0.0032) model time 0.2315 (0.2579) loss 2.9823 (3.1170) grad_norm 2.2902 (2.8847) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][580/1251] eta 0:02:54 lr 0.000398 wd 0.0500 time 0.2311 (0.2605) data time 0.0007 (0.0031) model time 0.2304 (0.2573) loss 4.1113 (3.1199) grad_norm 2.6810 (2.8806) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][590/1251] eta 0:02:51 lr 0.000398 wd 0.0500 time 0.2325 (0.2600) data time 0.0012 (0.0031) model time 0.2313 (0.2568) loss 1.8505 (3.1224) grad_norm 2.2613 (2.8767) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][600/1251] eta 0:02:48 lr 0.000398 wd 0.0500 time 0.2232 (0.2594) data time 0.0007 (0.0031) model time 0.2225 (0.2563) loss 3.6771 (3.1169) grad_norm 2.4703 (2.8716) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][610/1251] eta 0:02:45 lr 0.000398 wd 0.0500 time 0.2275 (0.2589) data time 0.0010 (0.0031) model time 0.2265 (0.2558) loss 3.4822 (3.1134) grad_norm 1.9600 (2.8641) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][620/1251] eta 0:02:43 lr 0.000398 wd 0.0500 time 0.2389 (0.2583) data time 0.0009 (0.0031) model time 0.2380 (0.2553) loss 3.6741 (3.1156) grad_norm 2.0322 (2.8605) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][630/1251] eta 0:02:40 lr 0.000398 wd 0.0500 time 0.2253 (0.2579) data time 0.0007 (0.0030) model time 0.2246 (0.2548) loss 3.5779 (3.1173) grad_norm 2.8614 (2.8598) loss_scale 1024.0000 
(1024.0000) mem 7379MB [2024-08-27 13:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][640/1251] eta 0:02:37 lr 0.000397 wd 0.0500 time 0.2249 (0.2574) data time 0.0011 (0.0030) model time 0.2238 (0.2544) loss 3.2777 (3.1199) grad_norm 2.8167 (2.8890) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][650/1251] eta 0:02:34 lr 0.000397 wd 0.0500 time 0.2371 (0.2569) data time 0.0009 (0.0030) model time 0.2363 (0.2540) loss 3.9236 (3.1210) grad_norm 2.0791 (2.8871) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][660/1251] eta 0:02:31 lr 0.000397 wd 0.0500 time 0.2396 (0.2564) data time 0.0009 (0.0029) model time 0.2387 (0.2535) loss 2.5322 (3.1216) grad_norm 2.7640 (2.8844) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][670/1251] eta 0:02:28 lr 0.000397 wd 0.0500 time 0.2296 (0.2560) data time 0.0013 (0.0029) model time 0.2283 (0.2531) loss 2.8293 (3.1204) grad_norm 2.1736 (2.8784) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][680/1251] eta 0:02:25 lr 0.000397 wd 0.0500 time 0.2248 (0.2555) data time 0.0007 (0.0029) model time 0.2240 (0.2526) loss 3.5714 (3.1205) grad_norm 1.8874 (2.8684) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][690/1251] eta 0:02:23 lr 0.000397 wd 0.0500 time 0.2305 (0.2552) data time 0.0009 (0.0029) model time 0.2296 (0.2523) loss 3.3198 (3.1224) grad_norm 3.4697 (2.8650) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][700/1251] eta 0:02:20 lr 0.000397 wd 0.0500 time 0.2298 (0.2548) data time 0.0010 
(0.0028) model time 0.2289 (0.2519) loss 2.7972 (3.1243) grad_norm 2.1184 (2.8577) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][710/1251] eta 0:02:17 lr 0.000397 wd 0.0500 time 0.2186 (0.2544) data time 0.0009 (0.0028) model time 0.2176 (0.2516) loss 3.7149 (3.1257) grad_norm 2.4447 (2.8594) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][720/1251] eta 0:02:14 lr 0.000397 wd 0.0500 time 0.2322 (0.2540) data time 0.0009 (0.0028) model time 0.2313 (0.2512) loss 2.7884 (3.1208) grad_norm 2.1268 (2.8561) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][730/1251] eta 0:02:12 lr 0.000397 wd 0.0500 time 0.2304 (0.2537) data time 0.0012 (0.0028) model time 0.2292 (0.2509) loss 3.1218 (3.1183) grad_norm 3.3619 (2.8567) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][740/1251] eta 0:02:09 lr 0.000397 wd 0.0500 time 0.2380 (0.2533) data time 0.0010 (0.0027) model time 0.2370 (0.2506) loss 3.6682 (3.1191) grad_norm 3.2517 (2.8527) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][750/1251] eta 0:02:06 lr 0.000397 wd 0.0500 time 0.2241 (0.2529) data time 0.0009 (0.0027) model time 0.2232 (0.2502) loss 3.2274 (3.1213) grad_norm 3.1914 (2.8640) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][760/1251] eta 0:02:04 lr 0.000397 wd 0.0500 time 0.2325 (0.2526) data time 0.0011 (0.0027) model time 0.2314 (0.2499) loss 2.1174 (3.1206) grad_norm 2.4070 (2.8702) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [179/300][770/1251] eta 0:02:01 lr 0.000397 wd 0.0500 time 0.2344 (0.2523) data time 0.0012 (0.0027) model time 0.2332 (0.2496) loss 2.2839 (3.1165) grad_norm 2.2589 (2.8655) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][780/1251] eta 0:01:58 lr 0.000397 wd 0.0500 time 0.2393 (0.2520) data time 0.0011 (0.0027) model time 0.2382 (0.2493) loss 2.2932 (3.1159) grad_norm 3.3494 (2.8619) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][790/1251] eta 0:01:56 lr 0.000397 wd 0.0500 time 0.2309 (0.2517) data time 0.0008 (0.0026) model time 0.2301 (0.2490) loss 3.5634 (3.1123) grad_norm 3.2211 (2.8621) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][800/1251] eta 0:01:53 lr 0.000397 wd 0.0500 time 0.2293 (0.2513) data time 0.0007 (0.0026) model time 0.2286 (0.2487) loss 4.0811 (3.1117) grad_norm 2.1073 (2.8655) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][810/1251] eta 0:01:50 lr 0.000397 wd 0.0500 time 0.2377 (0.2510) data time 0.0012 (0.0026) model time 0.2364 (0.2484) loss 3.1662 (3.1152) grad_norm 2.7838 (2.8621) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][820/1251] eta 0:01:48 lr 0.000397 wd 0.0500 time 0.2390 (0.2507) data time 0.0010 (0.0026) model time 0.2379 (0.2481) loss 3.2975 (3.1137) grad_norm 3.5620 (2.8703) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][830/1251] eta 0:01:45 lr 0.000397 wd 0.0500 time 0.2314 (0.2504) data time 0.0009 (0.0026) model time 0.2305 (0.2479) loss 3.3317 (3.1136) grad_norm 2.8933 (2.8801) loss_scale 
1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][840/1251] eta 0:01:42 lr 0.000397 wd 0.0500 time 0.2309 (0.2502) data time 0.0011 (0.0025) model time 0.2297 (0.2476) loss 2.9799 (3.1138) grad_norm 4.1532 (2.8863) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][850/1251] eta 0:01:40 lr 0.000397 wd 0.0500 time 0.2273 (0.2499) data time 0.0009 (0.0025) model time 0.2264 (0.2474) loss 2.2138 (3.1158) grad_norm 3.1359 (2.8908) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][860/1251] eta 0:01:37 lr 0.000397 wd 0.0500 time 0.2293 (0.2496) data time 0.0010 (0.0025) model time 0.2284 (0.2471) loss 3.2630 (3.1149) grad_norm 3.4378 (2.8952) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][870/1251] eta 0:01:35 lr 0.000397 wd 0.0500 time 0.2301 (0.2494) data time 0.0009 (0.0025) model time 0.2292 (0.2469) loss 3.2612 (3.1159) grad_norm 2.5566 (2.8915) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][880/1251] eta 0:01:32 lr 0.000396 wd 0.0500 time 0.2303 (0.2492) data time 0.0008 (0.0025) model time 0.2295 (0.2467) loss 1.9489 (3.1116) grad_norm 5.1634 (2.8914) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][890/1251] eta 0:01:29 lr 0.000396 wd 0.0500 time 0.2276 (0.2492) data time 0.0012 (0.0025) model time 0.2264 (0.2467) loss 3.4584 (3.1120) grad_norm 3.3245 (2.8925) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][900/1251] eta 0:01:27 lr 0.000396 wd 0.0500 time 0.2254 (0.2491) data time 
0.0010 (0.0025) model time 0.2244 (0.2467) loss 3.0884 (3.1089) grad_norm 4.9473 (2.8952) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][910/1251] eta 0:01:24 lr 0.000396 wd 0.0500 time 0.2227 (0.2489) data time 0.0009 (0.0024) model time 0.2218 (0.2464) loss 3.3169 (3.1076) grad_norm 2.6795 (2.8953) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][920/1251] eta 0:01:22 lr 0.000396 wd 0.0500 time 0.2231 (0.2486) data time 0.0009 (0.0024) model time 0.2222 (0.2462) loss 3.1320 (3.1049) grad_norm 2.6104 (2.8916) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][930/1251] eta 0:01:19 lr 0.000396 wd 0.0500 time 0.2283 (0.2484) data time 0.0009 (0.0024) model time 0.2274 (0.2460) loss 3.5210 (3.1070) grad_norm 3.8582 (2.8900) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][940/1251] eta 0:01:17 lr 0.000396 wd 0.0500 time 0.2315 (0.2482) data time 0.0009 (0.0024) model time 0.2305 (0.2458) loss 3.6307 (3.1067) grad_norm 2.9886 (2.8901) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][950/1251] eta 0:01:14 lr 0.000396 wd 0.0500 time 0.2212 (0.2480) data time 0.0007 (0.0024) model time 0.2205 (0.2456) loss 3.0673 (3.1058) grad_norm 2.3429 (2.8875) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][960/1251] eta 0:01:12 lr 0.000396 wd 0.0500 time 0.2278 (0.2478) data time 0.0008 (0.0024) model time 0.2270 (0.2454) loss 3.7636 (3.1070) grad_norm 2.8405 (2.8889) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [179/300][970/1251] eta 0:01:09 lr 0.000396 wd 0.0500 time 0.2301 (0.2476) data time 0.0011 (0.0024) model time 0.2290 (0.2452) loss 2.0034 (3.1042) grad_norm 2.3379 (2.8853) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][980/1251] eta 0:01:07 lr 0.000396 wd 0.0500 time 0.2245 (0.2473) data time 0.0009 (0.0024) model time 0.2236 (0.2450) loss 3.7574 (3.1059) grad_norm 2.1593 (2.8856) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][990/1251] eta 0:01:04 lr 0.000396 wd 0.0500 time 0.2249 (0.2472) data time 0.0008 (0.0023) model time 0.2241 (0.2448) loss 3.5909 (3.1067) grad_norm 1.9894 (2.8798) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1000/1251] eta 0:01:01 lr 0.000396 wd 0.0500 time 0.2268 (0.2470) data time 0.0007 (0.0024) model time 0.2262 (0.2446) loss 3.8001 (3.1094) grad_norm 2.8368 (2.8851) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1010/1251] eta 0:00:59 lr 0.000396 wd 0.0500 time 0.2227 (0.2468) data time 0.0011 (0.0023) model time 0.2216 (0.2444) loss 3.2614 (3.1085) grad_norm 2.5790 (2.8831) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1020/1251] eta 0:00:56 lr 0.000396 wd 0.0500 time 0.2334 (0.2466) data time 0.0007 (0.0023) model time 0.2327 (0.2443) loss 2.2745 (3.1062) grad_norm 2.5984 (2.8821) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1030/1251] eta 0:00:54 lr 0.000396 wd 0.0500 time 0.2263 (0.2465) data time 0.0007 (0.0023) model time 0.2256 (0.2441) loss 2.1220 (3.1040) grad_norm 2.7696 (2.8885) 
loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1040/1251] eta 0:00:51 lr 0.000396 wd 0.0500 time 0.2218 (0.2463) data time 0.0013 (0.0023) model time 0.2205 (0.2440) loss 3.0781 (3.1058) grad_norm 4.9566 (2.8915) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1050/1251] eta 0:00:49 lr 0.000396 wd 0.0500 time 0.2228 (0.2461) data time 0.0010 (0.0023) model time 0.2218 (0.2438) loss 2.7777 (3.1058) grad_norm 3.6260 (2.8978) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1060/1251] eta 0:00:46 lr 0.000396 wd 0.0500 time 0.2312 (0.2459) data time 0.0009 (0.0023) model time 0.2303 (0.2437) loss 3.9454 (3.1075) grad_norm 2.6552 (2.8958) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1070/1251] eta 0:00:44 lr 0.000396 wd 0.0500 time 0.2208 (0.2457) data time 0.0010 (0.0023) model time 0.2199 (0.2435) loss 2.8404 (3.1067) grad_norm 2.6359 (2.8922) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1080/1251] eta 0:00:41 lr 0.000396 wd 0.0500 time 0.2277 (0.2456) data time 0.0007 (0.0023) model time 0.2271 (0.2433) loss 4.0762 (3.1079) grad_norm 1.8810 (2.8886) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1090/1251] eta 0:00:39 lr 0.000396 wd 0.0500 time 0.2243 (0.2454) data time 0.0009 (0.0022) model time 0.2234 (0.2432) loss 2.6728 (3.1066) grad_norm 2.5013 (2.8840) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1100/1251] eta 0:00:37 lr 0.000396 wd 0.0500 time 0.2202 
(0.2453) data time 0.0013 (0.0022) model time 0.2190 (0.2430) loss 3.0487 (3.1043) grad_norm 2.5747 (2.8924) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1110/1251] eta 0:00:34 lr 0.000395 wd 0.0500 time 0.2175 (0.2451) data time 0.0009 (0.0022) model time 0.2166 (0.2429) loss 3.3598 (3.1059) grad_norm 3.6396 (2.8997) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1120/1251] eta 0:00:32 lr 0.000395 wd 0.0500 time 0.2319 (0.2450) data time 0.0009 (0.0022) model time 0.2310 (0.2427) loss 3.9978 (3.1071) grad_norm 3.9812 (2.9034) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1130/1251] eta 0:00:29 lr 0.000395 wd 0.0500 time 0.2319 (0.2448) data time 0.0012 (0.0022) model time 0.2307 (0.2426) loss 3.0809 (3.1083) grad_norm 2.3315 (2.8999) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1140/1251] eta 0:00:27 lr 0.000395 wd 0.0500 time 0.2301 (0.2447) data time 0.0010 (0.0022) model time 0.2292 (0.2425) loss 2.2512 (3.1079) grad_norm 2.4204 (2.9011) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1150/1251] eta 0:00:24 lr 0.000395 wd 0.0500 time 0.2183 (0.2446) data time 0.0009 (0.0022) model time 0.2174 (0.2424) loss 3.9245 (3.1088) grad_norm 3.3193 (2.9005) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1160/1251] eta 0:00:22 lr 0.000395 wd 0.0500 time 0.2307 (0.2445) data time 0.0008 (0.0022) model time 0.2299 (0.2423) loss 3.3856 (3.1084) grad_norm 3.7747 (2.8990) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:50 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1170/1251] eta 0:00:19 lr 0.000395 wd 0.0500 time 0.2245 (0.2443) data time 0.0009 (0.0022) model time 0.2236 (0.2421) loss 2.5041 (3.1088) grad_norm 2.7870 (2.8975) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1180/1251] eta 0:00:17 lr 0.000395 wd 0.0500 time 0.2350 (0.2442) data time 0.0007 (0.0022) model time 0.2343 (0.2420) loss 2.2745 (3.1086) grad_norm 2.4012 (2.8985) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1190/1251] eta 0:00:14 lr 0.000395 wd 0.0500 time 0.2211 (0.2440) data time 0.0012 (0.0022) model time 0.2199 (0.2419) loss 3.4756 (3.1125) grad_norm 2.7149 (2.8970) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1200/1251] eta 0:00:12 lr 0.000395 wd 0.0500 time 0.2335 (0.2439) data time 0.0009 (0.0022) model time 0.2326 (0.2418) loss 3.2659 (3.1131) grad_norm 3.2476 (2.8954) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1210/1251] eta 0:00:09 lr 0.000395 wd 0.0500 time 0.2220 (0.2438) data time 0.0009 (0.0021) model time 0.2212 (0.2416) loss 3.1135 (3.1132) grad_norm 2.5457 (2.8951) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1220/1251] eta 0:00:07 lr 0.000395 wd 0.0500 time 0.2434 (0.2437) data time 0.0009 (0.0021) model time 0.2425 (0.2415) loss 2.3365 (3.1134) grad_norm 2.7333 (2.8943) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1230/1251] eta 0:00:05 lr 0.000395 wd 0.0500 time 0.2206 (0.2436) data time 0.0012 (0.0021) model time 0.2193 (0.2415) loss 
3.3948 (3.1134) grad_norm 5.0770 (2.8959) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [179/300][1240/1251] eta 0:00:02 lr 0.000395 wd 0.0500 time 0.2085 (0.2434) data time 0.0007 (0.0021) model time 0.2078 (0.2413) loss 3.7495 (3.1125) grad_norm 2.5556 (2.8950) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 13:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 13:24:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 13:24:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 13:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 13:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 13:26:04 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 13:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 13:26:14 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 13:26:15 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 13:26:16 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 13:26:17 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 179) [2024-08-27 13:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 13:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][0/1251] eta 4:52:03 lr 0.000395 wd 0.0500 time 14.0074 (14.0074) data time 1.0411 (1.0411) model time 0.0000 (0.0000) loss 3.5903 (3.5903) grad_norm 3.2269 (3.2269) loss_scale 1024.0000 (1024.0000) mem 20035MB [2024-08-27 13:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][10/1251] eta 0:31:01 lr 0.000395 wd 0.0500 time 0.2301 (1.4997) data time 0.0009 (0.0956) model time 0.0000 (0.0000) loss 2.7269 (3.4415) grad_norm 3.2828 (2.9369) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][20/1251] eta 0:18:21 lr 0.000395 wd 0.0500 time 0.2405 (0.8951) data time 0.0010 (0.0507) model time 0.0000 (0.0000) loss 2.9658 (3.2996) grad_norm 2.0599 (2.8072) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][30/1251] eta 0:13:50 lr 0.000395 wd 0.0500 time 0.2264 (0.6799) data time 0.0007 (0.0347) model time 0.0000 (0.0000) loss 2.2682 (3.3512) grad_norm 2.1498 (2.7507) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][40/1251] eta 0:11:30 lr 0.000395 wd 0.0500 time 0.2332 (0.5700) data time 0.0011 (0.0265) model time 0.0000 (0.0000) loss 3.5200 (3.3175) grad_norm 2.4541 (2.9535) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[180/300][50/1251] eta 0:10:04 lr 0.000395 wd 0.0500 time 0.2244 (0.5034) data time 0.0009 (0.0215) model time 0.0000 (0.0000) loss 3.5087 (3.2888) grad_norm 4.2994 (3.0032) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][60/1251] eta 0:09:06 lr 0.000395 wd 0.0500 time 0.2243 (0.4589) data time 0.0012 (0.0182) model time 0.2231 (0.2308) loss 3.2101 (3.2511) grad_norm 2.1395 (2.9440) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][70/1251] eta 0:08:23 lr 0.000395 wd 0.0500 time 0.2283 (0.4262) data time 0.0010 (0.0158) model time 0.2273 (0.2283) loss 3.0189 (3.2201) grad_norm 2.9947 (2.9946) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][80/1251] eta 0:07:50 lr 0.000395 wd 0.0500 time 0.2331 (0.4017) data time 0.0009 (0.0140) model time 0.2322 (0.2278) loss 2.7478 (3.2105) grad_norm 2.8892 (2.9691) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][90/1251] eta 0:07:24 lr 0.000394 wd 0.0500 time 0.2367 (0.3829) data time 0.0008 (0.0126) model time 0.2359 (0.2281) loss 3.4453 (3.1951) grad_norm 3.1259 (2.9492) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][100/1251] eta 0:07:02 lr 0.000394 wd 0.0500 time 0.2264 (0.3673) data time 0.0010 (0.0114) model time 0.2254 (0.2273) loss 3.4086 (3.1969) grad_norm 3.9832 (3.1216) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][110/1251] eta 0:06:44 lr 0.000394 wd 0.0500 time 0.2314 (0.3549) data time 0.0014 (0.0105) model time 0.2300 (0.2274) loss 2.3246 (3.1944) grad_norm 2.9986 (3.1092) loss_scale 1024.0000 
(1024.0000) mem 7376MB [2024-08-27 13:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][120/1251] eta 0:06:29 lr 0.000394 wd 0.0500 time 0.2388 (0.3447) data time 0.0006 (0.0097) model time 0.2381 (0.2279) loss 1.8648 (3.1948) grad_norm 2.6710 (3.0690) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][130/1251] eta 0:06:16 lr 0.000394 wd 0.0500 time 0.2333 (0.3360) data time 0.0011 (0.0091) model time 0.2321 (0.2282) loss 3.5705 (3.1843) grad_norm 3.5040 (3.0789) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][140/1251] eta 0:06:04 lr 0.000394 wd 0.0500 time 0.2209 (0.3283) data time 0.0007 (0.0085) model time 0.2202 (0.2279) loss 3.0521 (3.1693) grad_norm 3.9591 (3.0816) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][150/1251] eta 0:05:54 lr 0.000394 wd 0.0500 time 0.2229 (0.3217) data time 0.0013 (0.0081) model time 0.2216 (0.2278) loss 2.4354 (3.1641) grad_norm 3.0284 (3.0834) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][160/1251] eta 0:05:44 lr 0.000394 wd 0.0500 time 0.2287 (0.3161) data time 0.0011 (0.0076) model time 0.2276 (0.2281) loss 3.1702 (3.1642) grad_norm 2.2792 (3.0670) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][170/1251] eta 0:05:36 lr 0.000394 wd 0.0500 time 0.2309 (0.3111) data time 0.0015 (0.0073) model time 0.2295 (0.2282) loss 3.5183 (3.1617) grad_norm 2.0382 (3.0385) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][180/1251] eta 0:05:28 lr 0.000394 wd 0.0500 time 0.2365 (0.3067) data time 0.0009 
(0.0069) model time 0.2355 (0.2283) loss 3.1444 (3.1487) grad_norm 3.5296 (3.0279) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][190/1251] eta 0:05:21 lr 0.000394 wd 0.0500 time 0.2246 (0.3027) data time 0.0011 (0.0067) model time 0.2235 (0.2283) loss 2.7660 (3.1544) grad_norm 2.7932 (3.0200) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][200/1251] eta 0:05:14 lr 0.000394 wd 0.0500 time 0.2248 (0.2992) data time 0.0011 (0.0065) model time 0.2237 (0.2283) loss 2.5986 (3.1365) grad_norm 2.0420 (3.0336) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][210/1251] eta 0:05:08 lr 0.000394 wd 0.0500 time 0.2327 (0.2959) data time 0.0013 (0.0063) model time 0.2314 (0.2283) loss 3.1223 (3.1307) grad_norm 3.3386 (3.0236) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][220/1251] eta 0:05:02 lr 0.000394 wd 0.0500 time 0.2265 (0.2931) data time 0.0007 (0.0061) model time 0.2258 (0.2285) loss 3.3600 (3.1235) grad_norm 2.0682 (3.0013) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][230/1251] eta 0:04:56 lr 0.000394 wd 0.0500 time 0.2254 (0.2903) data time 0.0009 (0.0059) model time 0.2245 (0.2285) loss 1.9730 (3.1188) grad_norm 3.0924 (2.9863) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][240/1251] eta 0:04:51 lr 0.000394 wd 0.0500 time 0.2279 (0.2879) data time 0.0008 (0.0057) model time 0.2271 (0.2285) loss 3.2817 (3.1151) grad_norm 2.7848 (2.9748) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [180/300][250/1251] eta 0:04:45 lr 0.000394 wd 0.0500 time 0.2273 (0.2855) data time 0.0007 (0.0055) model time 0.2266 (0.2285) loss 2.8875 (3.1076) grad_norm 4.1921 (3.0176) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][260/1251] eta 0:04:40 lr 0.000394 wd 0.0500 time 0.2298 (0.2834) data time 0.0008 (0.0054) model time 0.2291 (0.2285) loss 2.9677 (3.1026) grad_norm 2.2497 (3.0280) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][270/1251] eta 0:04:36 lr 0.000394 wd 0.0500 time 0.2268 (0.2816) data time 0.0009 (0.0053) model time 0.2259 (0.2286) loss 3.8452 (3.1007) grad_norm 2.7043 (3.0615) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][280/1251] eta 0:04:31 lr 0.000394 wd 0.0500 time 0.2293 (0.2797) data time 0.0010 (0.0051) model time 0.2283 (0.2287) loss 2.7202 (3.1037) grad_norm 3.7803 (3.0618) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][290/1251] eta 0:04:28 lr 0.000394 wd 0.0500 time 0.2634 (0.2789) data time 0.0010 (0.0050) model time 0.2624 (0.2297) loss 1.8546 (3.0950) grad_norm 2.2995 (3.0445) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][300/1251] eta 0:04:23 lr 0.000394 wd 0.0500 time 0.2275 (0.2772) data time 0.0011 (0.0049) model time 0.2264 (0.2296) loss 2.7553 (3.0837) grad_norm 3.1381 (3.0294) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][310/1251] eta 0:04:20 lr 0.000394 wd 0.0500 time 0.2281 (0.2764) data time 0.0014 (0.0048) model time 0.2267 (0.2304) loss 3.2853 (3.0840) grad_norm 8.9231 (3.0411) loss_scale 
1024.0000 (1024.0000) mem 7376MB [2024-08-27 13:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][320/1251] eta 0:04:16 lr 0.000393 wd 0.0500 time 0.2282 (0.2750) data time 0.0008 (0.0047) model time 0.2273 (0.2304) loss 3.5838 (3.0955) grad_norm 2.9552 (3.0420) loss_scale 2048.0000 (1049.5202) mem 7376MB [2024-08-27 13:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][330/1251] eta 0:04:12 lr 0.000393 wd 0.0500 time 0.2417 (0.2738) data time 0.0010 (0.0046) model time 0.2407 (0.2304) loss 2.2533 (3.0958) grad_norm 4.5415 (3.0529) loss_scale 2048.0000 (1079.6858) mem 7376MB [2024-08-27 13:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][340/1251] eta 0:04:08 lr 0.000393 wd 0.0500 time 0.2268 (0.2725) data time 0.0011 (0.0045) model time 0.2257 (0.2304) loss 3.5138 (3.0982) grad_norm 1.9366 (3.0519) loss_scale 2048.0000 (1108.0821) mem 7376MB [2024-08-27 13:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][350/1251] eta 0:04:04 lr 0.000393 wd 0.0500 time 0.2307 (0.2713) data time 0.0009 (0.0044) model time 0.2298 (0.2304) loss 2.9933 (3.0989) grad_norm 2.3455 (3.0397) loss_scale 2048.0000 (1134.8604) mem 7376MB [2024-08-27 13:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][360/1251] eta 0:04:00 lr 0.000393 wd 0.0500 time 0.2232 (0.2701) data time 0.0008 (0.0043) model time 0.2224 (0.2303) loss 2.8905 (3.0995) grad_norm 3.3013 (3.0478) loss_scale 2048.0000 (1160.1551) mem 7376MB [2024-08-27 13:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][370/1251] eta 0:03:57 lr 0.000393 wd 0.0500 time 0.2334 (0.2691) data time 0.0010 (0.0042) model time 0.2323 (0.2303) loss 2.5294 (3.0977) grad_norm 2.9155 (3.0446) loss_scale 2048.0000 (1184.0863) mem 7376MB [2024-08-27 13:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][380/1251] eta 0:03:53 lr 0.000393 wd 0.0500 time 0.2287 (0.2680) data time 
0.0007 (0.0041) model time 0.2280 (0.2302) loss 2.1068 (3.0915) grad_norm 2.7209 (3.0394) loss_scale 2048.0000 (1206.7612) mem 7376MB [2024-08-27 13:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][390/1251] eta 0:03:49 lr 0.000393 wd 0.0500 time 0.2279 (0.2671) data time 0.0008 (0.0041) model time 0.2272 (0.2302) loss 3.8480 (3.0875) grad_norm 2.6080 (3.0348) loss_scale 2048.0000 (1228.2762) mem 7376MB [2024-08-27 13:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][400/1251] eta 0:03:46 lr 0.000393 wd 0.0500 time 0.2358 (0.2662) data time 0.0012 (0.0040) model time 0.2347 (0.2302) loss 3.4658 (3.0945) grad_norm 2.8706 (3.0229) loss_scale 2048.0000 (1248.7182) mem 7376MB [2024-08-27 13:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][410/1251] eta 0:03:43 lr 0.000393 wd 0.0500 time 0.2362 (0.2654) data time 0.0015 (0.0039) model time 0.2347 (0.2302) loss 3.6082 (3.0972) grad_norm 2.5932 (3.0176) loss_scale 2048.0000 (1268.1655) mem 7376MB [2024-08-27 13:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][420/1251] eta 0:03:39 lr 0.000393 wd 0.0500 time 0.2280 (0.2646) data time 0.0010 (0.0038) model time 0.2270 (0.2302) loss 3.0896 (3.0934) grad_norm 2.3999 (3.0048) loss_scale 2048.0000 (1286.6888) mem 7376MB [2024-08-27 13:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][430/1251] eta 0:03:36 lr 0.000393 wd 0.0500 time 0.2277 (0.2637) data time 0.0007 (0.0038) model time 0.2270 (0.2301) loss 3.7207 (3.1007) grad_norm 2.8713 (3.0018) loss_scale 2048.0000 (1304.3527) mem 7376MB [2024-08-27 13:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][440/1251] eta 0:03:33 lr 0.000393 wd 0.0500 time 0.2281 (0.2629) data time 0.0009 (0.0037) model time 0.2272 (0.2300) loss 3.5052 (3.1035) grad_norm 2.8306 (2.9958) loss_scale 2048.0000 (1321.2154) mem 7376MB [2024-08-27 13:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [180/300][450/1251] eta 0:03:29 lr 0.000393 wd 0.0500 time 0.2192 (0.2621) data time 0.0013 (0.0037) model time 0.2179 (0.2300) loss 3.1748 (3.1018) grad_norm 2.2871 (2.9851) loss_scale 2048.0000 (1337.3304) mem 7376MB [2024-08-27 13:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][460/1251] eta 0:03:26 lr 0.000393 wd 0.0500 time 0.2294 (0.2614) data time 0.0013 (0.0036) model time 0.2281 (0.2299) loss 3.3454 (3.0975) grad_norm 2.9479 (2.9767) loss_scale 2048.0000 (1352.7462) mem 7376MB [2024-08-27 13:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][470/1251] eta 0:03:23 lr 0.000393 wd 0.0500 time 0.2213 (0.2608) data time 0.0010 (0.0036) model time 0.2203 (0.2300) loss 2.9433 (3.0901) grad_norm 2.0091 (2.9612) loss_scale 2048.0000 (1367.5074) mem 7376MB [2024-08-27 13:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][480/1251] eta 0:03:20 lr 0.000393 wd 0.0500 time 0.2287 (0.2602) data time 0.0010 (0.0035) model time 0.2278 (0.2299) loss 3.4128 (3.0909) grad_norm 3.7926 (2.9628) loss_scale 2048.0000 (1381.6549) mem 7376MB [2024-08-27 13:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][490/1251] eta 0:03:17 lr 0.000393 wd 0.0500 time 0.2331 (0.2596) data time 0.0010 (0.0035) model time 0.2321 (0.2299) loss 3.1448 (3.0947) grad_norm 3.9830 (2.9703) loss_scale 2048.0000 (1395.2261) mem 7376MB [2024-08-27 13:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][500/1251] eta 0:03:14 lr 0.000393 wd 0.0500 time 0.2312 (0.2590) data time 0.0009 (0.0034) model time 0.2302 (0.2299) loss 3.5822 (3.0946) grad_norm 2.8007 (2.9637) loss_scale 2048.0000 (1408.2555) mem 7376MB [2024-08-27 13:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][510/1251] eta 0:03:11 lr 0.000393 wd 0.0500 time 0.2304 (0.2584) data time 0.0007 (0.0034) model time 0.2297 (0.2298) loss 3.7391 (3.0982) grad_norm 2.7948 (2.9664) 
loss_scale 2048.0000 (1420.7750) mem 7376MB [2024-08-27 13:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][520/1251] eta 0:03:08 lr 0.000393 wd 0.0500 time 0.2187 (0.2578) data time 0.0007 (0.0034) model time 0.2179 (0.2298) loss 2.7932 (3.0979) grad_norm 3.8712 (2.9702) loss_scale 2048.0000 (1432.8138) mem 7376MB [2024-08-27 13:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][530/1251] eta 0:03:05 lr 0.000393 wd 0.0500 time 0.2236 (0.2573) data time 0.0010 (0.0033) model time 0.2226 (0.2297) loss 3.6955 (3.0936) grad_norm 3.2197 (2.9596) loss_scale 2048.0000 (1444.3992) mem 7376MB [2024-08-27 13:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][540/1251] eta 0:03:02 lr 0.000393 wd 0.0500 time 0.2274 (0.2567) data time 0.0012 (0.0033) model time 0.2263 (0.2297) loss 3.4820 (3.0941) grad_norm 2.6995 (2.9565) loss_scale 2048.0000 (1455.5564) mem 7376MB [2024-08-27 13:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][550/1251] eta 0:02:59 lr 0.000392 wd 0.0500 time 0.2318 (0.2562) data time 0.0007 (0.0032) model time 0.2311 (0.2296) loss 3.6752 (3.0943) grad_norm 5.0615 (2.9617) loss_scale 2048.0000 (1466.3085) mem 7376MB [2024-08-27 13:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][560/1251] eta 0:02:56 lr 0.000392 wd 0.0500 time 0.2239 (0.2557) data time 0.0012 (0.0032) model time 0.2227 (0.2296) loss 3.5719 (3.0985) grad_norm 2.8135 (2.9733) loss_scale 2048.0000 (1476.6774) mem 7376MB [2024-08-27 13:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][570/1251] eta 0:02:53 lr 0.000392 wd 0.0500 time 0.2295 (0.2553) data time 0.0010 (0.0032) model time 0.2285 (0.2296) loss 3.3870 (3.1007) grad_norm 2.5735 (2.9838) loss_scale 2048.0000 (1486.6830) mem 7376MB [2024-08-27 13:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][580/1251] eta 0:02:51 lr 0.000392 wd 0.0500 time 0.2236 (0.2549) 
data time 0.0010 (0.0031) model time 0.2225 (0.2296) loss 3.7802 (3.1037) grad_norm 2.6958 (2.9847) loss_scale 2048.0000 (1496.3442) mem 7376MB [2024-08-27 13:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][590/1251] eta 0:02:48 lr 0.000392 wd 0.0500 time 0.2310 (0.2544) data time 0.0011 (0.0031) model time 0.2299 (0.2295) loss 3.4262 (3.1061) grad_norm 12.7390 (3.0006) loss_scale 2048.0000 (1505.6785) mem 7376MB [2024-08-27 13:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][600/1251] eta 0:02:45 lr 0.000392 wd 0.0500 time 0.2482 (0.2541) data time 0.0011 (0.0031) model time 0.2471 (0.2296) loss 1.8075 (3.1053) grad_norm 2.2161 (2.9985) loss_scale 2048.0000 (1514.7022) mem 7376MB [2024-08-27 13:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][610/1251] eta 0:02:42 lr 0.000392 wd 0.0500 time 0.2212 (0.2537) data time 0.0011 (0.0030) model time 0.2201 (0.2296) loss 3.3551 (3.1076) grad_norm 2.8679 (2.9897) loss_scale 2048.0000 (1523.4304) mem 7376MB [2024-08-27 13:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][620/1251] eta 0:02:39 lr 0.000392 wd 0.0500 time 0.2246 (0.2533) data time 0.0010 (0.0030) model time 0.2236 (0.2296) loss 3.6200 (3.1094) grad_norm 2.2806 (2.9810) loss_scale 2048.0000 (1531.8776) mem 7376MB [2024-08-27 13:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][630/1251] eta 0:02:37 lr 0.000392 wd 0.0500 time 0.2344 (0.2529) data time 0.0009 (0.0030) model time 0.2335 (0.2295) loss 3.5692 (3.1110) grad_norm 2.1399 (2.9708) loss_scale 2048.0000 (1540.0571) mem 7376MB [2024-08-27 13:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][640/1251] eta 0:02:34 lr 0.000392 wd 0.0500 time 0.2394 (0.2526) data time 0.0008 (0.0029) model time 0.2386 (0.2295) loss 1.7124 (3.1082) grad_norm 2.9984 (2.9682) loss_scale 2048.0000 (1547.9813) mem 7376MB [2024-08-27 13:29:05 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [180/300][650/1251] eta 0:02:31 lr 0.000392 wd 0.0500 time 0.2299 (0.2522) data time 0.0007 (0.0029) model time 0.2292 (0.2295) loss 3.3137 (3.1068) grad_norm 3.6938 (2.9647) loss_scale 2048.0000 (1555.6621) mem 7376MB [2024-08-27 13:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][660/1251] eta 0:02:28 lr 0.000392 wd 0.0500 time 0.2336 (0.2518) data time 0.0008 (0.0029) model time 0.2327 (0.2295) loss 2.2269 (3.1024) grad_norm 2.6500 (2.9621) loss_scale 2048.0000 (1563.1104) mem 7376MB [2024-08-27 13:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][670/1251] eta 0:02:26 lr 0.000392 wd 0.0500 time 0.2304 (0.2515) data time 0.0008 (0.0029) model time 0.2296 (0.2294) loss 3.7241 (3.1052) grad_norm 5.1405 (2.9646) loss_scale 2048.0000 (1570.3368) mem 7376MB [2024-08-27 13:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][680/1251] eta 0:02:23 lr 0.000392 wd 0.0500 time 0.2248 (0.2511) data time 0.0007 (0.0028) model time 0.2241 (0.2294) loss 3.5713 (3.1049) grad_norm 3.1129 (2.9664) loss_scale 2048.0000 (1577.3510) mem 7376MB [2024-08-27 13:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][690/1251] eta 0:02:20 lr 0.000392 wd 0.0500 time 0.2417 (0.2508) data time 0.0009 (0.0028) model time 0.2408 (0.2294) loss 2.4565 (3.1036) grad_norm 6.3656 (2.9707) loss_scale 2048.0000 (1584.1621) mem 7376MB [2024-08-27 13:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][700/1251] eta 0:02:18 lr 0.000392 wd 0.0500 time 0.2300 (0.2506) data time 0.0009 (0.0028) model time 0.2291 (0.2294) loss 3.4491 (3.1032) grad_norm 3.1579 (2.9716) loss_scale 2048.0000 (1590.7789) mem 7376MB [2024-08-27 13:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][710/1251] eta 0:02:15 lr 0.000392 wd 0.0500 time 0.2355 (0.2502) data time 0.0008 (0.0028) model time 0.2347 (0.2294) loss 1.9220 (3.1017) grad_norm 
2.0375 (2.9667) loss_scale 2048.0000 (1597.2096) mem 7376MB [2024-08-27 13:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][720/1251] eta 0:02:12 lr 0.000392 wd 0.0500 time 0.2300 (0.2499) data time 0.0007 (0.0027) model time 0.2293 (0.2293) loss 3.7700 (3.0992) grad_norm 2.7232 (2.9660) loss_scale 2048.0000 (1603.4619) mem 7376MB [2024-08-27 13:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][730/1251] eta 0:02:10 lr 0.000392 wd 0.0500 time 0.2349 (0.2496) data time 0.0010 (0.0027) model time 0.2339 (0.2293) loss 3.4490 (3.0995) grad_norm 2.9592 (2.9691) loss_scale 2048.0000 (1609.5431) mem 7376MB [2024-08-27 13:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][740/1251] eta 0:02:07 lr 0.000392 wd 0.0500 time 0.2266 (0.2493) data time 0.0010 (0.0027) model time 0.2256 (0.2292) loss 3.7103 (3.1033) grad_norm 3.3158 (2.9700) loss_scale 2048.0000 (1615.4602) mem 7376MB [2024-08-27 13:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][750/1251] eta 0:02:04 lr 0.000392 wd 0.0500 time 0.2320 (0.2490) data time 0.0014 (0.0027) model time 0.2306 (0.2292) loss 2.7785 (3.1010) grad_norm 2.7553 (2.9635) loss_scale 2048.0000 (1621.2197) mem 7376MB [2024-08-27 13:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][760/1251] eta 0:02:02 lr 0.000392 wd 0.0500 time 0.2264 (0.2488) data time 0.0009 (0.0027) model time 0.2254 (0.2292) loss 3.4420 (3.1019) grad_norm 4.6036 (2.9692) loss_scale 2048.0000 (1626.8279) mem 7376MB [2024-08-27 13:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][770/1251] eta 0:01:59 lr 0.000392 wd 0.0500 time 0.2279 (0.2485) data time 0.0008 (0.0026) model time 0.2271 (0.2292) loss 3.7135 (3.1045) grad_norm 3.1330 (2.9732) loss_scale 2048.0000 (1632.2905) mem 7376MB [2024-08-27 13:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][780/1251] eta 0:01:56 lr 0.000391 wd 0.0500 time 
0.2303 (0.2483) data time 0.0008 (0.0026) model time 0.2294 (0.2291) loss 3.7434 (3.1061) grad_norm 3.4072 (2.9678) loss_scale 2048.0000 (1637.6133) mem 7376MB [2024-08-27 13:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][790/1251] eta 0:01:54 lr 0.000391 wd 0.0500 time 0.2239 (0.2480) data time 0.0010 (0.0026) model time 0.2229 (0.2291) loss 3.0212 (3.1056) grad_norm 3.1526 (2.9660) loss_scale 2048.0000 (1642.8015) mem 7376MB [2024-08-27 13:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][800/1251] eta 0:01:51 lr 0.000391 wd 0.0500 time 0.2297 (0.2478) data time 0.0007 (0.0026) model time 0.2290 (0.2291) loss 3.4776 (3.1060) grad_norm 1.8679 (2.9627) loss_scale 2048.0000 (1647.8602) mem 7376MB [2024-08-27 13:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][810/1251] eta 0:01:49 lr 0.000391 wd 0.0500 time 0.2269 (0.2476) data time 0.0006 (0.0026) model time 0.2263 (0.2291) loss 2.3801 (3.1006) grad_norm 2.0013 (2.9622) loss_scale 2048.0000 (1652.7941) mem 7376MB [2024-08-27 13:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][820/1251] eta 0:01:46 lr 0.000391 wd 0.0500 time 0.2357 (0.2476) data time 0.0012 (0.0025) model time 0.2345 (0.2293) loss 3.2512 (3.1017) grad_norm 2.2535 (2.9567) loss_scale 2048.0000 (1657.6078) mem 7376MB [2024-08-27 13:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][830/1251] eta 0:01:44 lr 0.000391 wd 0.0500 time 0.2312 (0.2475) data time 0.0009 (0.0025) model time 0.2304 (0.2295) loss 2.3371 (3.0986) grad_norm 2.6162 (2.9544) loss_scale 2048.0000 (1662.3057) mem 7376MB [2024-08-27 13:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][840/1251] eta 0:01:41 lr 0.000391 wd 0.0500 time 0.2263 (0.2473) data time 0.0009 (0.0025) model time 0.2254 (0.2295) loss 3.5056 (3.0982) grad_norm 2.3955 (2.9521) loss_scale 2048.0000 (1666.8918) mem 7376MB [2024-08-27 13:29:51 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][850/1251] eta 0:01:39 lr 0.000391 wd 0.0500 time 0.2386 (0.2471) data time 0.0008 (0.0025) model time 0.2378 (0.2295) loss 3.4851 (3.0973) grad_norm 2.1679 (nan) loss_scale 1024.0000 (1665.3537) mem 7376MB [2024-08-27 13:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][860/1251] eta 0:01:36 lr 0.000391 wd 0.0500 time 0.2271 (0.2469) data time 0.0011 (0.0025) model time 0.2260 (0.2294) loss 3.5879 (3.0976) grad_norm 2.5491 (nan) loss_scale 1024.0000 (1657.9048) mem 7376MB [2024-08-27 13:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][870/1251] eta 0:01:33 lr 0.000391 wd 0.0500 time 0.2248 (0.2467) data time 0.0009 (0.0025) model time 0.2239 (0.2294) loss 3.1624 (3.0980) grad_norm 2.9130 (nan) loss_scale 1024.0000 (1650.6269) mem 7376MB [2024-08-27 13:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][880/1251] eta 0:01:31 lr 0.000391 wd 0.0500 time 0.2419 (0.2465) data time 0.0009 (0.0024) model time 0.2410 (0.2294) loss 3.7928 (3.0973) grad_norm 2.8228 (nan) loss_scale 1024.0000 (1643.5142) mem 7376MB [2024-08-27 13:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][890/1251] eta 0:01:28 lr 0.000391 wd 0.0500 time 0.2343 (0.2463) data time 0.0010 (0.0024) model time 0.2333 (0.2294) loss 2.3071 (3.0954) grad_norm 3.7431 (nan) loss_scale 1024.0000 (1636.5612) mem 7376MB [2024-08-27 13:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][900/1251] eta 0:01:26 lr 0.000391 wd 0.0500 time 0.2198 (0.2461) data time 0.0010 (0.0024) model time 0.2187 (0.2294) loss 3.2490 (3.0962) grad_norm 2.0130 (nan) loss_scale 1024.0000 (1629.7625) mem 7376MB [2024-08-27 13:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][910/1251] eta 0:01:23 lr 0.000391 wd 0.0500 time 0.2414 (0.2460) data time 0.0009 (0.0024) model time 0.2405 (0.2294) loss 2.8783 (3.0975) 
grad_norm 3.6777 (nan) loss_scale 1024.0000 (1623.1131) mem 7376MB [2024-08-27 13:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][920/1251] eta 0:01:21 lr 0.000391 wd 0.0500 time 0.2286 (0.2458) data time 0.0008 (0.0024) model time 0.2278 (0.2294) loss 3.5946 (3.0990) grad_norm 4.2776 (nan) loss_scale 1024.0000 (1616.6080) mem 7376MB [2024-08-27 13:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][930/1251] eta 0:01:18 lr 0.000391 wd 0.0500 time 0.2353 (0.2456) data time 0.0011 (0.0024) model time 0.2343 (0.2294) loss 3.5076 (3.1020) grad_norm 12.6431 (nan) loss_scale 1024.0000 (1610.2427) mem 7376MB [2024-08-27 13:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][940/1251] eta 0:01:16 lr 0.000391 wd 0.0500 time 0.2235 (0.2455) data time 0.0007 (0.0024) model time 0.2228 (0.2294) loss 3.5447 (3.1023) grad_norm 2.8125 (nan) loss_scale 1024.0000 (1604.0128) mem 7376MB [2024-08-27 13:30:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][950/1251] eta 0:01:13 lr 0.000391 wd 0.0500 time 0.2279 (0.2453) data time 0.0012 (0.0023) model time 0.2267 (0.2294) loss 2.5075 (3.0973) grad_norm 2.5762 (nan) loss_scale 1024.0000 (1597.9138) mem 7376MB [2024-08-27 13:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][960/1251] eta 0:01:11 lr 0.000391 wd 0.0500 time 0.2229 (0.2451) data time 0.0010 (0.0023) model time 0.2219 (0.2294) loss 3.1626 (3.0975) grad_norm 2.3710 (nan) loss_scale 1024.0000 (1591.9417) mem 7376MB [2024-08-27 13:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][970/1251] eta 0:01:08 lr 0.000391 wd 0.0500 time 0.2294 (0.2449) data time 0.0009 (0.0023) model time 0.2285 (0.2294) loss 2.9422 (3.0977) grad_norm 3.0661 (nan) loss_scale 1024.0000 (1586.0927) mem 7376MB [2024-08-27 13:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][980/1251] eta 0:01:06 lr 0.000391 wd 0.0500 time 0.2307 
(0.2448) data time 0.0011 (0.0023) model time 0.2296 (0.2293) loss 3.4054 (3.0984) grad_norm 3.6516 (nan) loss_scale 1024.0000 (1580.3629) mem 7376MB [2024-08-27 13:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][990/1251] eta 0:01:03 lr 0.000391 wd 0.0500 time 0.2250 (0.2446) data time 0.0012 (0.0023) model time 0.2238 (0.2293) loss 3.0942 (3.0979) grad_norm 2.2314 (nan) loss_scale 1024.0000 (1574.7487) mem 7376MB [2024-08-27 13:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 13:30:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 13:30:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 13:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 13:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 13:32:04 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 14:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 14:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 14:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 14:29:37 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 15:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 15:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 15:43:56 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 15:45:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 15:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 15:46:16 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 15:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 15:46:24 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 15:46:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 15:46:27 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 15:46:27 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 180) [2024-08-27 15:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 15:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1000/1251] eta 0:06:59 lr 0.000391 wd 0.0500 time 0.2296 (1.6717) data time 0.0009 (0.1203) model time 0.2287 (1.5514) loss 3.5715 (3.6150) grad_norm 2.1961 (2.5117) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1010/1251] eta 0:03:49 lr 0.000390 wd 0.0500 time 0.2257 (0.9523) data time 0.0008 (0.0608) model time 0.2249 (0.8915) loss 3.3681 (3.3614) grad_norm 2.3140 (2.4904) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1020/1251] eta 0:02:44 lr 0.000390 wd 0.0500 time 0.2290 (0.7109) data time 0.0010 (0.0409) model time 0.2280 (0.6700) loss 3.3855 (3.3991) grad_norm 2.9749 (2.5450) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1030/1251] eta 0:02:10 lr 0.000390 wd 0.0500 time 0.2354 (0.5911) data time 0.0006 (0.0309) model time 0.2348 (0.5602) loss 2.6606 (3.2975) grad_norm 3.1919 (2.6139) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1040/1251] eta 0:01:49 lr 0.000390 wd 0.0500 time 0.2315 (0.5188) data time 0.0009 (0.0249) model time 0.2306 (0.4938) loss 2.8503 (3.2729) grad_norm 3.1866 (2.6223) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [180/300][1050/1251] eta 0:01:34 lr 0.000390 wd 0.0500 time 0.2234 (0.4712) data time 0.0011 (0.0209) model time 0.2223 (0.4503) loss 3.4460 (3.2504) grad_norm 3.2657 (2.6280) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1060/1251] eta 0:01:23 lr 0.000390 wd 0.0500 time 0.2291 (0.4368) data time 0.0007 (0.0182) model time 0.2285 (0.4186) loss 2.3588 (3.2317) grad_norm 4.1176 (2.6126) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1070/1251] eta 0:01:14 lr 0.000390 wd 0.0500 time 0.2277 (0.4110) data time 0.0009 (0.0161) model time 0.2268 (0.3949) loss 3.3367 (3.2191) grad_norm 2.7937 (2.6344) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1080/1251] eta 0:01:06 lr 0.000390 wd 0.0500 time 0.2275 (0.3908) data time 0.0007 (0.0145) model time 0.2268 (0.3764) loss 3.4854 (3.2002) grad_norm 3.3146 (2.6980) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1090/1251] eta 0:01:00 lr 0.000390 wd 0.0500 time 0.2610 (0.3751) data time 0.0013 (0.0132) model time 0.2598 (0.3619) loss 3.4307 (3.2122) grad_norm 3.8456 (2.7975) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1100/1251] eta 0:00:54 lr 0.000390 wd 0.0500 time 0.2260 (0.3618) data time 0.0010 (0.0121) model time 0.2250 (0.3498) loss 3.0512 (3.2224) grad_norm 5.2047 (2.8478) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1110/1251] eta 0:00:49 lr 0.000390 wd 0.0500 time 0.2263 (0.3508) data time 0.0008 (0.0112) model time 0.2254 (0.3396) loss 3.6352 (3.2165) grad_norm 2.7682 (2.8418) loss_scale 
1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1120/1251] eta 0:00:44 lr 0.000390 wd 0.0500 time 0.2336 (0.3415) data time 0.0007 (0.0104) model time 0.2329 (0.3310) loss 2.9245 (3.1973) grad_norm 2.2299 (2.7985) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1130/1251] eta 0:00:40 lr 0.000390 wd 0.0500 time 0.2266 (0.3335) data time 0.0007 (0.0098) model time 0.2259 (0.3237) loss 1.6599 (3.1961) grad_norm 2.0219 (2.7838) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1140/1251] eta 0:00:36 lr 0.000390 wd 0.0500 time 0.2265 (0.3266) data time 0.0010 (0.0092) model time 0.2255 (0.3174) loss 3.4428 (3.1884) grad_norm 2.5354 (2.7637) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1150/1251] eta 0:00:32 lr 0.000390 wd 0.0500 time 0.2344 (0.3209) data time 0.0009 (0.0087) model time 0.2335 (0.3121) loss 3.1613 (3.1845) grad_norm 2.6452 (2.7571) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1160/1251] eta 0:00:28 lr 0.000390 wd 0.0500 time 0.2290 (0.3156) data time 0.0009 (0.0083) model time 0.2281 (0.3074) loss 2.2584 (3.1810) grad_norm 3.1676 (2.8182) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1170/1251] eta 0:00:25 lr 0.000390 wd 0.0500 time 0.2299 (0.3109) data time 0.0007 (0.0079) model time 0.2293 (0.3030) loss 2.5044 (3.1662) grad_norm 3.9635 (2.8233) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1180/1251] eta 0:00:21 lr 0.000390 wd 0.0500 time 0.2321 (0.3066) 
data time 0.0009 (0.0075) model time 0.2312 (0.2991) loss 2.9561 (3.1714) grad_norm 2.9454 (2.8459) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 15:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 15:47:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 15:47:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 16:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 16:01:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 16:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 16:01:49 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 16:01:50 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 16:01:52 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 16:01:52 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 180) [2024-08-27 16:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 16:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1190/1251] eta 0:03:24 lr 0.000390 wd 0.0500 time 0.2468 (3.3488) data time 0.0012 (0.1538) model time 0.2456 (3.1950) loss 3.3963 (3.5488) grad_norm 1.9423 (2.3002) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1200/1251] eta 0:01:05 lr 0.000390 wd 0.0500 time 0.2456 (1.2804) data time 0.0011 (0.0523) model time 0.2446 (1.2282) loss 3.4588 (3.3887) grad_norm 2.1168 (2.5623) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1210/1251] eta 0:00:35 lr 0.000390 wd 0.0500 time 0.2379 (0.8655) data time 0.0011 (0.0321) model time 0.2368 (0.8334) loss 3.6459 (3.3931) grad_norm 3.1185 (2.6791) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1220/1251] eta 0:00:21 lr 0.000390 wd 0.0500 time 0.2425 (0.6880) data time 0.0011 (0.0233) model time 0.2414 (0.6647) loss 3.1452 (3.3833) grad_norm 4.3004 (3.0776) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1230/1251] eta 0:00:12 lr 0.000390 wd 0.0500 time 0.2411 (0.5896) data time 0.0011 (0.0186) model time 0.2400 (0.5710) loss 3.0158 (3.3265) grad_norm 2.0891 (2.9820) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [180/300][1240/1251] eta 0:00:05 lr 0.000389 wd 0.0500 time 0.2295 (0.5261) data time 0.0005 (0.0154) model time 0.2291 (0.5106) loss 2.3949 (3.3007) grad_norm 2.2639 (2.9532) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [180/300][1250/1251] eta 0:00:00 lr 0.000389 wd 0.0500 time 0.2248 (0.4796) data time 0.0007 (0.0132) model time 0.2241 (0.4665) loss 3.6490 (3.2790) grad_norm 3.0496 (2.9271) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 180 training takes 0:00:31 [2024-08-27 16:02:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 16:02:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 16:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.390 (0.390) Loss 0.4233 (0.4233) Acc@1 92.090 (92.090) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-27 16:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.109) Loss 0.6855 (0.6869) Acc@1 86.035 (84.899) Acc@5 97.266 (97.124) Mem 7379MB [2024-08-27 16:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 1.0166 (0.7126) Acc@1 75.781 (84.203) Acc@5 94.238 (97.019) Mem 7379MB [2024-08-27 16:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.091) Loss 1.2109 (0.8119) Acc@1 70.215 (81.792) Acc@5 91.406 (95.851) Mem 7379MB [2024-08-27 16:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.1045 (0.8638) Acc@1 75.098 (80.500) Acc@5 93.066 (95.251) Mem 7379MB [2024-08-27 16:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.122 Acc@5 95.214 [2024-08-27 16:02:35 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.1% [2024-08-27 16:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.12% [2024-08-27 16:02:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 16:02:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 16:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.482 (0.482) Loss 0.3958 (0.3958) Acc@1 92.969 (92.969) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-27 16:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.116) Loss 0.6094 (0.6223) Acc@1 88.574 (86.816) Acc@5 97.363 (97.532) Mem 7379MB [2024-08-27 16:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.099) Loss 0.8774 (0.6461) Acc@1 79.004 (85.938) Acc@5 95.996 (97.512) Mem 7379MB [2024-08-27 16:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.092) Loss 1.1143 (0.7323) Acc@1 72.949 (83.824) Acc@5 92.773 (96.591) Mem 7379MB [2024-08-27 16:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0010 (0.7762) Acc@1 75.586 (82.491) Acc@5 93.945 (96.115) Mem 7379MB [2024-08-27 16:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.036 Acc@5 96.098 [2024-08-27 16:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-08-27 16:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.04% [2024-08-27 16:02:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... 
[2024-08-27 16:02:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 16:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][0/1251] eta 0:15:49 lr 0.000389 wd 0.0500 time 0.7590 (0.7590) data time 0.4947 (0.4947) model time 0.0000 (0.0000) loss 2.8667 (2.8667) grad_norm 3.9917 (3.9917) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 16:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 16:02:42 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 16:02:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 16:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 16:04:44 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 16:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 16:09:49 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 16:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 16:14:28 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 16:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 16:31:07 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 16:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 16:31:24 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 16:31:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 16:31:27 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 16:31:27 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 16:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 16:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 16:38:35 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 16:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 16:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 16:44:05 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 16:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 16:44:12 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 16:44:14 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 16:44:15 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 16:44:15 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 16:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 17:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 17:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 17:18:37 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 17:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 17:18:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 17:18:52 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 17:18:53 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 17:18:53 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 17:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 17:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 17:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 17:54:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 17:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 17:54:54 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 17:54:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 17:54:57 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 17:54:58 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 17:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 17:55:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][10/1251] eta 0:32:28 lr 0.000389 wd 0.0500 time 0.2254 (1.5698) data time 0.0011 (0.1102) model time 0.0000 (0.0000) loss 3.7362 (3.5879) grad_norm 3.0600 (2.8192) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][20/1251] eta 0:18:30 lr 0.000389 wd 0.0500 time 0.2250 (0.9021) data time 0.0008 (0.0558) model time 0.0000 (0.0000) loss 3.3826 (3.3613) grad_norm 2.9546 (2.8493) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][30/1251] eta 0:13:49 lr 0.000389 wd 0.0500 time 0.2249 (0.6791) data time 0.0011 (0.0375) model time 0.0000 (0.0000) loss 3.4855 (3.4078) grad_norm 2.2028 (2.8606) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][40/1251] eta 0:11:26 lr 0.000389 wd 0.0500 time 0.2205 (0.5673) data time 0.0011 (0.0284) model time 0.0000 (0.0000) loss 2.5149 (3.3167) grad_norm 4.5581 (2.9399) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][50/1251] eta 0:10:00 lr 0.000389 wd 0.0500 time 0.2214 (0.4999) data time 0.0010 (0.0230) model time 0.0000 (0.0000) loss 2.9877 (3.3005) grad_norm 2.5917 (2.9813) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[181/300][60/1251] eta 0:09:01 lr 0.000389 wd 0.0500 time 0.2438 (0.4551) data time 0.0009 (0.0193) model time 0.2429 (0.2299) loss 3.2905 (3.2573) grad_norm 2.2084 (2.9600) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][70/1251] eta 0:08:19 lr 0.000389 wd 0.0500 time 0.2268 (0.4231) data time 0.0008 (0.0167) model time 0.2260 (0.2299) loss 2.2797 (3.2228) grad_norm 2.4943 (2.8739) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][80/1251] eta 0:07:47 lr 0.000389 wd 0.0500 time 0.2333 (0.3990) data time 0.0010 (0.0148) model time 0.2322 (0.2298) loss 3.4660 (3.2078) grad_norm 2.1876 (2.8389) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][90/1251] eta 0:07:21 lr 0.000389 wd 0.0500 time 0.2352 (0.3803) data time 0.0010 (0.0133) model time 0.2342 (0.2297) loss 3.7038 (3.1819) grad_norm 2.2406 (2.8128) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][100/1251] eta 0:07:00 lr 0.000389 wd 0.0500 time 0.2231 (0.3653) data time 0.0010 (0.0121) model time 0.2220 (0.2296) loss 3.5149 (3.1863) grad_norm 2.5591 (2.8004) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][110/1251] eta 0:06:43 lr 0.000389 wd 0.0500 time 0.2277 (0.3534) data time 0.0010 (0.0111) model time 0.2267 (0.2303) loss 3.0823 (3.1945) grad_norm 2.1814 (2.7795) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][120/1251] eta 0:06:28 lr 0.000389 wd 0.0500 time 0.2327 (0.3436) data time 0.0007 (0.0102) model time 0.2320 (0.2308) loss 3.4971 (3.1984) grad_norm 2.9344 (2.7618) loss_scale 1024.0000 
(1024.0000) mem 7377MB [2024-08-27 17:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][130/1251] eta 0:06:15 lr 0.000389 wd 0.0500 time 0.2417 (0.3353) data time 0.0009 (0.0097) model time 0.2408 (0.2311) loss 3.1756 (3.1792) grad_norm 2.5810 (2.7534) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][140/1251] eta 0:06:04 lr 0.000389 wd 0.0500 time 0.2235 (0.3279) data time 0.0009 (0.0091) model time 0.2226 (0.2310) loss 1.8520 (3.1654) grad_norm 3.7774 (2.7802) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][150/1251] eta 0:05:53 lr 0.000389 wd 0.0500 time 0.2279 (0.3215) data time 0.0010 (0.0086) model time 0.2269 (0.2309) loss 4.0610 (3.1680) grad_norm 3.8582 (2.8849) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][160/1251] eta 0:05:44 lr 0.000389 wd 0.0500 time 0.2298 (0.3159) data time 0.0010 (0.0081) model time 0.2288 (0.2310) loss 3.3219 (3.1671) grad_norm 2.3348 (2.9022) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][170/1251] eta 0:05:36 lr 0.000389 wd 0.0500 time 0.2330 (0.3111) data time 0.0007 (0.0077) model time 0.2323 (0.2312) loss 2.2148 (3.1607) grad_norm 2.3548 (2.8874) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][180/1251] eta 0:05:28 lr 0.000389 wd 0.0500 time 0.2317 (0.3066) data time 0.0009 (0.0074) model time 0.2308 (0.2309) loss 2.4476 (3.1434) grad_norm 1.9406 (2.8838) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][190/1251] eta 0:05:21 lr 0.000389 wd 0.0500 time 0.2318 (0.3027) data time 0.0008 
(0.0070) model time 0.2309 (0.2309) loss 2.5262 (3.1451) grad_norm 6.3563 (2.9245) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 17:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 17:56:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 17:56:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 18:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 18:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 18:02:19 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 18:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 18:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 18:04:13 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 18:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 18:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 18:08:14 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 18:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 18:08:27 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 18:08:29 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 18:08:30 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 18:08:30 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 18:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 18:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 18:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 18:12:26 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 18:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 18:12:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 18:12:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 18:12:42 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 18:12:42 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 18:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 18:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][200/1251] eta 0:55:25 lr 0.000389 wd 0.0500 time 0.2258 (3.1643) data time 0.0010 (0.3089) model time 0.2247 (2.8554) loss 4.1937 (3.6625) grad_norm 2.3180 (3.0491) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 18:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 18:13:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 18:13:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 18:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 18:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 18:15:48 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 18:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 18:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 18:17:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 18:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 18:17:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 18:17:52 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:02:09 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 19:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:02:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 19:02:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:02:25 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:02:25 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 19:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][210/1251] eta 0:27:25 lr 0.000389 wd 0.0500 time 0.2271 (1.5808) data time 0.0008 (0.1067) model time 0.2263 (1.4741) loss 3.6838 (3.5048) grad_norm 3.0385 (3.0723) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][220/1251] eta 0:14:15 lr 0.000388 wd 0.0500 time 0.2202 (0.8299) data time 0.0007 (0.0479) model time 0.2196 (0.7820) loss 3.9595 (3.3619) grad_norm 2.4607 (2.8914) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][230/1251] eta 0:10:28 lr 0.000388 wd 0.0500 time 0.2304 (0.6152) data time 0.0008 (0.0311) model time 0.2296 (0.5841) loss 3.6466 (3.4037) grad_norm 2.5571 (2.8049) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][240/1251] eta 0:08:38 lr 0.000388 wd 0.0500 time 0.2338 (0.5130) data time 0.0009 (0.0232) model time 0.2329 (0.4898) loss 3.0538 (3.3354) grad_norm 4.5585 (2.9190) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][250/1251] eta 0:07:33 lr 0.000388 wd 0.0500 time 0.2207 (0.4534) data time 0.0007 (0.0186) model time 0.2200 (0.4349) loss 3.2384 (3.2942) grad_norm 2.6239 (2.9404) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [181/300][260/1251] eta 0:06:50 lr 0.000388 wd 0.0500 time 0.2271 (0.4145) data time 0.0006 (0.0155) model time 0.2264 (0.3990) loss 2.4679 (3.2683) grad_norm 3.2607 (2.9231) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][270/1251] eta 0:06:19 lr 0.000388 wd 0.0500 time 0.2312 (0.3872) data time 0.0007 (0.0134) model time 0.2305 (0.3738) loss 2.1816 (3.2312) grad_norm 2.2246 (3.0472) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][280/1251] eta 0:05:56 lr 0.000388 wd 0.0500 time 0.2252 (0.3668) data time 0.0006 (0.0118) model time 0.2245 (0.3551) loss 2.4956 (3.2055) grad_norm 3.7815 (3.0175) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][290/1251] eta 0:05:37 lr 0.000388 wd 0.0500 time 0.2227 (0.3509) data time 0.0008 (0.0105) model time 0.2220 (0.3404) loss 3.5168 (3.1787) grad_norm 2.9971 (3.0392) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][300/1251] eta 0:05:21 lr 0.000388 wd 0.0500 time 0.2181 (0.3382) data time 0.0007 (0.0095) model time 0.2174 (0.3287) loss 4.0169 (3.1984) grad_norm 2.4282 (3.0563) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][310/1251] eta 0:05:08 lr 0.000388 wd 0.0500 time 0.2256 (0.3280) data time 0.0007 (0.0087) model time 0.2249 (0.3192) loss 2.4409 (3.2083) grad_norm 2.6294 (3.0242) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][320/1251] eta 0:04:57 lr 0.000388 wd 0.0500 time 0.2207 (0.3193) data time 0.0010 (0.0081) model time 0.2198 (0.3112) loss 2.7800 (3.2116) grad_norm 2.3011 (3.0301) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][330/1251] eta 0:04:47 lr 0.000388 wd 0.0500 time 0.2215 (0.3122) data time 0.0007 (0.0075) model time 0.2208 (0.3047) loss 3.2335 (3.2012) grad_norm 2.5649 (2.9963) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][340/1251] eta 0:04:38 lr 0.000388 wd 0.0500 time 0.2270 (0.3062) data time 0.0008 (0.0070) model time 0.2261 (0.2991) loss 3.2095 (3.1926) grad_norm 1.8225 (2.9771) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][350/1251] eta 0:04:31 lr 0.000388 wd 0.0500 time 0.2268 (0.3010) data time 0.0007 (0.0066) model time 0.2261 (0.2944) loss 3.5394 (3.1873) grad_norm 2.9359 (3.0028) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][360/1251] eta 0:04:24 lr 0.000388 wd 0.0500 time 0.2252 (0.2964) data time 0.0007 (0.0062) model time 0.2245 (0.2902) loss 2.6006 (3.1812) grad_norm 4.9245 (3.0110) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][370/1251] eta 0:04:17 lr 0.000388 wd 0.0500 time 0.2214 (0.2923) data time 0.0009 (0.0059) model time 0.2206 (0.2864) loss 3.7007 (3.1880) grad_norm 2.1337 (2.9812) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][380/1251] eta 0:04:11 lr 0.000388 wd 0.0500 time 0.2225 (0.2886) data time 0.0008 (0.0056) model time 0.2217 (0.2830) loss 2.8632 (3.1728) grad_norm 2.3705 (2.9599) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][390/1251] eta 0:04:05 lr 0.000388 wd 0.0500 time 0.2265 (0.2853) data time 
0.0007 (0.0054) model time 0.2258 (0.2799) loss 3.5444 (3.1665) grad_norm 3.3343 (2.9716) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][400/1251] eta 0:04:00 lr 0.000388 wd 0.0500 time 0.2232 (0.2825) data time 0.0009 (0.0052) model time 0.2224 (0.2772) loss 2.5025 (3.1542) grad_norm 3.2581 (2.9946) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][410/1251] eta 0:03:55 lr 0.000388 wd 0.0500 time 0.2237 (0.2798) data time 0.0009 (0.0050) model time 0.2228 (0.2748) loss 3.5110 (3.1448) grad_norm 2.8898 (2.9889) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][420/1251] eta 0:03:50 lr 0.000388 wd 0.0500 time 0.2290 (0.2774) data time 0.0008 (0.0048) model time 0.2282 (0.2725) loss 2.9379 (3.1391) grad_norm 2.3576 (2.9859) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][430/1251] eta 0:03:45 lr 0.000388 wd 0.0500 time 0.2253 (0.2751) data time 0.0008 (0.0047) model time 0.2245 (0.2704) loss 3.5784 (3.1447) grad_norm 4.0770 (3.0002) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][440/1251] eta 0:03:41 lr 0.000388 wd 0.0500 time 0.2270 (0.2731) data time 0.0008 (0.0045) model time 0.2263 (0.2686) loss 3.5042 (3.1368) grad_norm 3.2167 (3.0141) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][450/1251] eta 0:03:37 lr 0.000388 wd 0.0500 time 0.2253 (0.2714) data time 0.0009 (0.0044) model time 0.2244 (0.2670) loss 2.3398 (3.1281) grad_norm 2.2809 (2.9902) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [181/300][460/1251] eta 0:03:33 lr 0.000387 wd 0.0500 time 0.2272 (0.2697) data time 0.0008 (0.0043) model time 0.2265 (0.2654) loss 3.1411 (3.1170) grad_norm 2.3483 (2.9745) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][470/1251] eta 0:03:29 lr 0.000387 wd 0.0500 time 0.2263 (0.2681) data time 0.0008 (0.0041) model time 0.2255 (0.2640) loss 3.4826 (3.1123) grad_norm 2.1417 (2.9553) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][480/1251] eta 0:03:25 lr 0.000387 wd 0.0500 time 0.2266 (0.2666) data time 0.0008 (0.0040) model time 0.2258 (0.2626) loss 2.5505 (3.1143) grad_norm 2.1666 (2.9483) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][490/1251] eta 0:03:22 lr 0.000387 wd 0.0500 time 0.2292 (0.2660) data time 0.0007 (0.0039) model time 0.2285 (0.2621) loss 4.1482 (3.1133) grad_norm 2.7297 (2.9408) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][500/1251] eta 0:03:18 lr 0.000387 wd 0.0500 time 0.2350 (0.2647) data time 0.0009 (0.0038) model time 0.2341 (0.2609) loss 2.8080 (3.1003) grad_norm 2.7409 (2.9389) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][510/1251] eta 0:03:15 lr 0.000387 wd 0.0500 time 0.2201 (0.2641) data time 0.0006 (0.0037) model time 0.2195 (0.2604) loss 2.6890 (3.0960) grad_norm 3.9377 (3.0188) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][520/1251] eta 0:03:12 lr 0.000387 wd 0.0500 time 0.2231 (0.2630) data time 0.0010 (0.0036) model time 0.2221 (0.2594) loss 2.8650 (3.1046) grad_norm 2.8561 (3.0154) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][530/1251] eta 0:03:08 lr 0.000387 wd 0.0500 time 0.2265 (0.2619) data time 0.0006 (0.0035) model time 0.2259 (0.2584) loss 3.0799 (3.1097) grad_norm 2.9715 (3.0045) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][540/1251] eta 0:03:05 lr 0.000387 wd 0.0500 time 0.2477 (0.2610) data time 0.0008 (0.0035) model time 0.2469 (0.2575) loss 3.2141 (3.1060) grad_norm 3.9108 (3.0114) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][550/1251] eta 0:03:02 lr 0.000387 wd 0.0500 time 0.2308 (0.2601) data time 0.0006 (0.0035) model time 0.2301 (0.2566) loss 2.8702 (3.1086) grad_norm 1.9480 (3.0101) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 19:04:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:04:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:05:48 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 19:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:05:58 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 19:06:00 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:06:01 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:06:01 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 19:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][560/1251] eta 0:19:05 lr 0.000387 wd 0.0500 time 0.2228 (1.6577) data time 0.0008 (0.1079) model time 0.2220 (1.5498) loss 3.4450 (3.4701) grad_norm 2.7868 (3.2581) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][570/1251] eta 0:10:41 lr 0.000387 wd 0.0500 time 0.2286 (0.9423) data time 0.0006 (0.0544) model time 0.2280 (0.8879) loss 3.9148 (3.3904) grad_norm 2.5747 (2.9273) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][580/1251] eta 0:07:51 lr 0.000387 wd 0.0500 time 0.2304 (0.7031) data time 0.0008 (0.0366) model time 0.2296 (0.6664) loss 3.4142 (3.4658) grad_norm 3.2866 (2.8982) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][590/1251] eta 0:06:25 lr 0.000387 wd 0.0500 time 0.2233 (0.5832) data time 0.0008 (0.0277) model time 0.2226 (0.5555) loss 2.6053 (3.3466) grad_norm 3.1329 (2.8253) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:06:30 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][600/1251] eta 0:05:32 lr 0.000387 wd 0.0500 time 0.2362 (0.5114) data time 0.0009 (0.0223) model time 0.2353 (0.4891) loss 2.9048 (3.3222) grad_norm 2.7090 (2.8648) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][610/1251] eta 0:04:57 lr 0.000387 wd 0.0500 time 0.2337 (0.4636) data time 0.0008 (0.0188) model time 0.2330 (0.4449) loss 3.5413 (3.3014) grad_norm 2.3902 (2.8252) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][620/1251] eta 0:04:30 lr 0.000387 wd 0.0500 time 0.2289 (0.4295) data time 0.0006 (0.0162) model time 0.2283 (0.4133) loss 2.5078 (3.2601) grad_norm 3.8071 (2.8344) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 19:06:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:06:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:09:19 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 19:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:09:34 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 19:09:35 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:09:37 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:09:37 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 19:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 19:09:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:09:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:13:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:13:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:13:53 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 19:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:14:02 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 19:14:04 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:14:05 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:14:05 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 181) [2024-08-27 19:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][630/1251] eta 2:10:53 lr 0.000387 wd 0.0500 time 12.6469 (12.6469) data time 0.6408 (0.6408) model time 12.0061 (12.0061) loss 3.8041 (3.8041) grad_norm 2.6298 (2.6298) loss_scale 1024.0000 (1024.0000) mem 20033MB [2024-08-27 19:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][640/1251] eta 0:14:05 lr 0.000387 wd 0.0500 time 0.2455 (1.3841) data time 0.0011 (0.0593) model time 0.2444 (1.3248) loss 2.3771 (3.4435) grad_norm 2.6911 (2.9099) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][650/1251] eta 0:08:23 lr 0.000387 wd 0.0500 time 0.2422 (0.8385) data time 0.0010 (0.0316) model time 0.2412 (0.8068) loss 2.8891 (3.3119) grad_norm 2.8381 (2.9447) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][660/1251] eta 0:06:21 lr 0.000387 wd 0.0500 time 0.2298 (0.6452) data time 0.0009 (0.0218) model time 0.2289 (0.6234) loss 2.1965 (3.3278) grad_norm 3.1598 (3.1122) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][670/1251] eta 0:05:18 lr 0.000387 wd 0.0500 time 0.2415 (0.5474) data time 0.0009 (0.0168) model time 0.2406 (0.5306) loss 2.9256 (3.2513) grad_norm 2.9985 (3.1033) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [181/300][680/1251] eta 0:04:38 lr 0.000387 wd 0.0500 time 0.2327 (0.4873) data time 0.0009 (0.0137) model time 0.2318 (0.4736) loss 3.3083 (3.2461) grad_norm 3.3464 (3.1330) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][690/1251] eta 0:04:10 lr 0.000386 wd 0.0500 time 0.2336 (0.4472) data time 0.0012 (0.0116) model time 0.2324 (0.4356) loss 3.2241 (3.2352) grad_norm 2.4837 (3.1391) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][700/1251] eta 0:03:50 lr 0.000386 wd 0.0500 time 0.2407 (0.4184) data time 0.0011 (0.0102) model time 0.2396 (0.4082) loss 2.8880 (3.2102) grad_norm 3.1093 (3.1110) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][710/1251] eta 0:03:34 lr 0.000386 wd 0.0500 time 0.2356 (0.3960) data time 0.0011 (0.0091) model time 0.2344 (0.3870) loss 2.8223 (3.1997) grad_norm 2.3449 (3.2212) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][720/1251] eta 0:03:21 lr 0.000386 wd 0.0500 time 0.2378 (0.3790) data time 0.0008 (0.0082) model time 0.2370 (0.3708) loss 3.6917 (3.1823) grad_norm 2.2466 (3.2253) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][730/1251] eta 0:03:10 lr 0.000386 wd 0.0500 time 0.2463 (0.3651) data time 0.0008 (0.0075) model time 0.2456 (0.3577) loss 3.2798 (3.1798) grad_norm 3.6563 (3.2055) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][740/1251] eta 0:03:00 lr 0.000386 wd 0.0500 time 0.2448 (0.3539) data time 0.0010 (0.0069) model time 0.2437 (0.3470) loss 2.8817 (3.1828) grad_norm 3.2123 (3.1699) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][750/1251] eta 0:02:52 lr 0.000386 wd 0.0500 time 0.2586 (0.3447) data time 0.0009 (0.0064) model time 0.2577 (0.3383) loss 1.8517 (3.1812) grad_norm 2.8239 (3.1533) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][760/1251] eta 0:02:45 lr 0.000386 wd 0.0500 time 0.2348 (0.3368) data time 0.0011 (0.0060) model time 0.2337 (0.3308) loss 3.6415 (3.1775) grad_norm 3.0711 (3.1178) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][770/1251] eta 0:02:38 lr 0.000386 wd 0.0500 time 0.2465 (0.3301) data time 0.0009 (0.0057) model time 0.2456 (0.3244) loss 3.2824 (3.1739) grad_norm 2.0607 (3.0769) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][780/1251] eta 0:02:32 lr 0.000386 wd 0.0500 time 0.2391 (0.3241) data time 0.0010 (0.0054) model time 0.2381 (0.3188) loss 2.4111 (3.1729) grad_norm 2.3363 (3.0490) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][790/1251] eta 0:02:27 lr 0.000386 wd 0.0500 time 0.2335 (0.3190) data time 0.0013 (0.0051) model time 0.2322 (0.3139) loss 3.6624 (3.1808) grad_norm 2.5039 (3.0352) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][800/1251] eta 0:02:21 lr 0.000386 wd 0.0500 time 0.2425 (0.3144) data time 0.0010 (0.0049) model time 0.2415 (0.3095) loss 3.0394 (3.1799) grad_norm 2.4377 (3.0124) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][810/1251] eta 0:02:16 lr 0.000386 wd 0.0500 time 0.2450 (0.3104) data time 
0.0010 (0.0047) model time 0.2440 (0.3057) loss 3.3365 (3.1664) grad_norm 2.7697 (2.9897) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][820/1251] eta 0:02:12 lr 0.000386 wd 0.0500 time 0.2434 (0.3067) data time 0.0011 (0.0045) model time 0.2423 (0.3022) loss 2.6092 (3.1639) grad_norm 5.2993 (2.9992) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][830/1251] eta 0:02:07 lr 0.000386 wd 0.0500 time 0.2417 (0.3034) data time 0.0011 (0.0043) model time 0.2406 (0.2991) loss 3.0325 (3.1528) grad_norm 2.4036 (2.9816) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][840/1251] eta 0:02:03 lr 0.000386 wd 0.0500 time 0.2407 (0.3005) data time 0.0011 (0.0042) model time 0.2397 (0.2963) loss 3.7996 (3.1494) grad_norm 1.9883 (2.9680) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][850/1251] eta 0:01:59 lr 0.000386 wd 0.0500 time 0.2378 (0.2977) data time 0.0009 (0.0041) model time 0.2369 (0.2936) loss 2.8950 (3.1419) grad_norm 2.9439 (2.9507) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][860/1251] eta 0:01:55 lr 0.000386 wd 0.0500 time 0.2333 (0.2953) data time 0.0010 (0.0040) model time 0.2323 (0.2913) loss 2.0242 (3.1363) grad_norm 4.1811 (2.9661) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][870/1251] eta 0:01:51 lr 0.000386 wd 0.0500 time 0.2424 (0.2930) data time 0.0008 (0.0038) model time 0.2416 (0.2891) loss 3.2248 (3.1351) grad_norm 2.2485 (2.9717) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [181/300][880/1251] eta 0:01:47 lr 0.000386 wd 0.0500 time 0.2379 (0.2909) data time 0.0008 (0.0037) model time 0.2370 (0.2871) loss 3.3992 (3.1242) grad_norm 4.0238 (2.9675) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][890/1251] eta 0:01:44 lr 0.000386 wd 0.0500 time 0.2322 (0.2889) data time 0.0008 (0.0036) model time 0.2314 (0.2852) loss 3.0195 (3.1157) grad_norm 3.5920 (2.9607) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][900/1251] eta 0:01:40 lr 0.000386 wd 0.0500 time 0.2391 (0.2871) data time 0.0010 (0.0035) model time 0.2381 (0.2835) loss 3.5435 (3.1119) grad_norm 3.6551 (2.9745) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][910/1251] eta 0:01:37 lr 0.000386 wd 0.0500 time 0.2340 (0.2853) data time 0.0010 (0.0035) model time 0.2331 (0.2819) loss 2.9602 (3.1142) grad_norm 4.4099 (2.9874) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][920/1251] eta 0:01:34 lr 0.000385 wd 0.0500 time 0.2399 (0.2847) data time 0.0009 (0.0034) model time 0.2390 (0.2813) loss 1.9407 (3.1082) grad_norm 4.2542 (2.9750) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][930/1251] eta 0:01:30 lr 0.000385 wd 0.0500 time 0.2422 (0.2833) data time 0.0010 (0.0033) model time 0.2412 (0.2800) loss 2.9132 (3.1017) grad_norm 2.0506 (2.9716) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][940/1251] eta 0:01:27 lr 0.000385 wd 0.0500 time 0.2404 (0.2829) data time 0.0012 (0.0033) model time 0.2391 (0.2796) loss 3.1784 (3.1025) grad_norm 2.7248 (2.9646) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][950/1251] eta 0:01:24 lr 0.000385 wd 0.0500 time 0.2412 (0.2816) data time 0.0008 (0.0032) model time 0.2404 (0.2784) loss 3.8354 (3.1136) grad_norm 2.1529 (2.9511) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][960/1251] eta 0:01:21 lr 0.000385 wd 0.0500 time 0.2417 (0.2804) data time 0.0008 (0.0031) model time 0.2409 (0.2773) loss 2.3568 (3.1134) grad_norm 2.1258 (2.9461) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][970/1251] eta 0:01:18 lr 0.000385 wd 0.0500 time 0.2407 (0.2793) data time 0.0011 (0.0031) model time 0.2396 (0.2762) loss 3.4752 (3.1160) grad_norm 4.5918 (2.9570) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][980/1251] eta 0:01:15 lr 0.000385 wd 0.0500 time 0.2446 (0.2783) data time 0.0007 (0.0030) model time 0.2439 (0.2752) loss 3.1025 (3.1176) grad_norm 2.3210 (2.9500) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][990/1251] eta 0:01:12 lr 0.000385 wd 0.0500 time 0.2428 (0.2772) data time 0.0007 (0.0030) model time 0.2421 (0.2742) loss 3.0951 (3.1191) grad_norm 3.2293 (2.9542) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1000/1251] eta 0:01:09 lr 0.000385 wd 0.0500 time 0.2427 (0.2763) data time 0.0011 (0.0029) model time 0.2416 (0.2733) loss 2.7498 (3.1168) grad_norm 3.1359 (2.9472) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1010/1251] eta 0:01:06 lr 0.000385 wd 0.0500 time 0.2380 
(0.2754) data time 0.0008 (0.0029) model time 0.2372 (0.2725) loss 2.1649 (3.1138) grad_norm 2.6154 (2.9446) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1020/1251] eta 0:01:03 lr 0.000385 wd 0.0500 time 0.2405 (0.2746) data time 0.0009 (0.0028) model time 0.2396 (0.2718) loss 4.0139 (3.1142) grad_norm 2.1394 (2.9398) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1030/1251] eta 0:01:00 lr 0.000385 wd 0.0500 time 0.2416 (0.2739) data time 0.0010 (0.0028) model time 0.2406 (0.2710) loss 3.1068 (3.1187) grad_norm 4.0209 (2.9509) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1040/1251] eta 0:00:57 lr 0.000385 wd 0.0500 time 0.2392 (0.2731) data time 0.0009 (0.0028) model time 0.2382 (0.2703) loss 3.6216 (3.1246) grad_norm 4.1435 (2.9602) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1050/1251] eta 0:00:54 lr 0.000385 wd 0.0500 time 0.2451 (0.2724) data time 0.0008 (0.0028) model time 0.2443 (0.2697) loss 3.5367 (3.1230) grad_norm 2.6582 (2.9647) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1060/1251] eta 0:00:51 lr 0.000385 wd 0.0500 time 0.2358 (0.2717) data time 0.0009 (0.0027) model time 0.2348 (0.2690) loss 3.2244 (3.1292) grad_norm 2.1882 (2.9567) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1070/1251] eta 0:00:49 lr 0.000385 wd 0.0500 time 0.2440 (0.2712) data time 0.0008 (0.0027) model time 0.2431 (0.2685) loss 3.7247 (3.1310) grad_norm 3.2885 (2.9600) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:11 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1080/1251] eta 0:00:46 lr 0.000385 wd 0.0500 time 0.2417 (0.2706) data time 0.0009 (0.0027) model time 0.2408 (0.2679) loss 3.0483 (3.1299) grad_norm 3.1570 (2.9552) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1090/1251] eta 0:00:43 lr 0.000385 wd 0.0500 time 0.2314 (0.2700) data time 0.0010 (0.0026) model time 0.2304 (0.2673) loss 3.2651 (3.1241) grad_norm 3.3242 (2.9704) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1100/1251] eta 0:00:40 lr 0.000385 wd 0.0500 time 0.2388 (0.2694) data time 0.0011 (0.0026) model time 0.2377 (0.2669) loss 3.0613 (3.1163) grad_norm 3.7731 (2.9691) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1110/1251] eta 0:00:37 lr 0.000385 wd 0.0500 time 0.2416 (0.2689) data time 0.0007 (0.0026) model time 0.2409 (0.2664) loss 3.6081 (3.1165) grad_norm 2.8509 (2.9660) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1120/1251] eta 0:00:35 lr 0.000385 wd 0.0500 time 0.2350 (0.2684) data time 0.0012 (0.0025) model time 0.2338 (0.2659) loss 3.1338 (3.1201) grad_norm 4.2869 (2.9753) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1130/1251] eta 0:00:32 lr 0.000385 wd 0.0500 time 0.2398 (0.2680) data time 0.0010 (0.0025) model time 0.2388 (0.2655) loss 3.3942 (3.1195) grad_norm 2.0456 (2.9819) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1140/1251] eta 0:00:29 lr 0.000385 wd 0.0500 time 0.2470 (0.2675) data time 0.0008 (0.0025) model time 0.2462 (0.2651) loss 
3.5164 (3.1226) grad_norm 2.2317 (2.9758) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1150/1251] eta 0:00:26 lr 0.000384 wd 0.0500 time 0.2374 (0.2671) data time 0.0009 (0.0024) model time 0.2365 (0.2646) loss 2.9133 (3.1236) grad_norm 3.7636 (2.9734) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1160/1251] eta 0:00:24 lr 0.000384 wd 0.0500 time 0.2467 (0.2667) data time 0.0012 (0.0024) model time 0.2455 (0.2642) loss 3.3571 (3.1176) grad_norm 2.4326 (2.9713) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1170/1251] eta 0:00:21 lr 0.000384 wd 0.0500 time 0.2445 (0.2662) data time 0.0011 (0.0024) model time 0.2434 (0.2638) loss 3.6224 (3.1181) grad_norm 2.1218 (2.9661) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1180/1251] eta 0:00:18 lr 0.000384 wd 0.0500 time 0.2369 (0.2658) data time 0.0008 (0.0024) model time 0.2362 (0.2634) loss 3.6811 (3.1180) grad_norm 1.9084 (2.9559) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1190/1251] eta 0:00:16 lr 0.000384 wd 0.0500 time 0.2352 (0.2654) data time 0.0010 (0.0023) model time 0.2342 (0.2630) loss 3.6740 (3.1237) grad_norm 2.6317 (2.9559) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1200/1251] eta 0:00:13 lr 0.000384 wd 0.0500 time 0.2433 (0.2649) data time 0.0013 (0.0023) model time 0.2420 (0.2626) loss 3.6928 (3.1263) grad_norm 3.8153 (2.9559) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1210/1251] eta 
0:00:10 lr 0.000384 wd 0.0500 time 0.2379 (0.2646) data time 0.0010 (0.0023) model time 0.2369 (0.2623) loss 3.3621 (3.1287) grad_norm 3.0766 (2.9530) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1220/1251] eta 0:00:08 lr 0.000384 wd 0.0500 time 0.2512 (0.2642) data time 0.0010 (0.0023) model time 0.2502 (0.2620) loss 3.2471 (3.1287) grad_norm 2.2484 (2.9472) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1230/1251] eta 0:00:05 lr 0.000384 wd 0.0500 time 0.2403 (0.2639) data time 0.0010 (0.0023) model time 0.2393 (0.2616) loss 2.3685 (3.1285) grad_norm 2.6587 (2.9430) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1240/1251] eta 0:00:02 lr 0.000384 wd 0.0500 time 0.2328 (0.2634) data time 0.0007 (0.0023) model time 0.2321 (0.2612) loss 3.4868 (3.1299) grad_norm 2.5953 (2.9410) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [181/300][1250/1251] eta 0:00:00 lr 0.000384 wd 0.0500 time 0.2364 (0.2629) data time 0.0005 (0.0022) model time 0.2359 (0.2606) loss 3.5224 (3.1308) grad_norm 4.4165 (2.9501) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 181 training takes 0:02:43 [2024-08-27 19:16:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:16:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 19:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.422 (0.422) Loss 0.4504 (0.4504) Acc@1 91.309 (91.309) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-27 19:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.110) Loss 0.6396 (0.6869) Acc@1 87.695 (85.174) Acc@5 97.168 (96.999) Mem 7377MB [2024-08-27 19:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.097) Loss 0.9956 (0.7101) Acc@1 75.488 (84.366) Acc@5 95.020 (97.080) Mem 7377MB [2024-08-27 19:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.091) Loss 1.2422 (0.8025) Acc@1 71.094 (82.167) Acc@5 90.430 (96.024) Mem 7377MB [2024-08-27 19:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.0986 (0.8641) Acc@1 74.512 (80.657) Acc@5 93.457 (95.317) Mem 7377MB [2024-08-27 19:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.238 Acc@5 95.306 [2024-08-27 19:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.2% [2024-08-27 19:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.24% [2024-08-27 19:17:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 19:17:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 19:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.450 (0.450) Loss 0.3960 (0.3960) Acc@1 92.773 (92.773) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-27 19:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.112) Loss 0.6094 (0.6220) Acc@1 88.184 (86.728) Acc@5 97.363 (97.523) Mem 7377MB [2024-08-27 19:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.096) Loss 0.8779 (0.6459) Acc@1 79.004 (85.886) Acc@5 96.094 (97.493) Mem 7377MB [2024-08-27 19:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.089) Loss 1.1143 (0.7318) Acc@1 73.047 (83.799) Acc@5 92.871 (96.582) Mem 7377MB [2024-08-27 19:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 0.9995 (0.7755) Acc@1 75.684 (82.503) Acc@5 94.043 (96.108) Mem 7377MB [2024-08-27 19:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.038 Acc@5 96.086 [2024-08-27 19:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-08-27 19:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.04% [2024-08-27 19:17:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 19:17:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 19:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][0/1251] eta 0:17:22 lr 0.000384 wd 0.0500 time 0.8336 (0.8336) data time 0.5727 (0.5727) model time 0.0000 (0.0000) loss 3.2150 (3.2150) grad_norm 3.5547 (3.5547) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 19:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][10/1251] eta 0:06:09 lr 0.000384 wd 0.0500 time 0.2496 (0.2979) data time 0.0009 (0.0531) model time 0.0000 (0.0000) loss 3.3018 (3.1449) grad_norm 2.3834 (2.6874) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][20/1251] eta 0:05:35 lr 0.000384 wd 0.0500 time 0.2532 (0.2727) data time 0.0010 (0.0283) model time 0.0000 (0.0000) loss 2.2979 (2.9829) grad_norm 4.7587 (2.8039) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][30/1251] eta 0:05:22 lr 0.000384 wd 0.0500 time 0.2749 (0.2640) data time 0.0011 (0.0196) model time 0.0000 (0.0000) loss 2.3197 (3.0655) grad_norm 2.9680 (3.0053) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][40/1251] eta 0:05:12 lr 0.000384 wd 0.0500 time 0.2431 (0.2583) data time 0.0009 (0.0152) model time 0.0000 (0.0000) loss 3.6081 (3.0260) grad_norm 2.2103 (2.9390) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][50/1251] eta 0:05:06 lr 0.000384 wd 0.0500 time 0.2682 (0.2556) data time 0.0009 (0.0127) model time 0.0000 (0.0000) loss 3.2382 (3.0761) grad_norm 2.7087 (2.9246) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][60/1251] eta 0:05:01 lr 0.000384 wd 0.0500 time 0.2425 (0.2532) data time 0.0011 (0.0108) model time 0.2414 
(0.2396) loss 2.9072 (3.0966) grad_norm 2.9072 (2.9179) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][70/1251] eta 0:04:57 lr 0.000384 wd 0.0500 time 0.2592 (0.2516) data time 0.0010 (0.0094) model time 0.2583 (0.2403) loss 3.2830 (3.0834) grad_norm 3.9004 (2.9360) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][80/1251] eta 0:04:53 lr 0.000384 wd 0.0500 time 0.2536 (0.2503) data time 0.0010 (0.0084) model time 0.2526 (0.2403) loss 2.5017 (3.0621) grad_norm 2.4817 (2.9407) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][90/1251] eta 0:04:49 lr 0.000384 wd 0.0500 time 0.2434 (0.2497) data time 0.0007 (0.0076) model time 0.2427 (0.2410) loss 2.0886 (3.0432) grad_norm 2.3063 (2.9103) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][100/1251] eta 0:04:46 lr 0.000384 wd 0.0500 time 0.2388 (0.2493) data time 0.0008 (0.0070) model time 0.2380 (0.2417) loss 3.2433 (3.0444) grad_norm 3.4047 (2.9308) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][110/1251] eta 0:04:46 lr 0.000384 wd 0.0500 time 0.2418 (0.2507) data time 0.0011 (0.0065) model time 0.2408 (0.2454) loss 3.8175 (3.0723) grad_norm 2.1388 (2.9029) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][120/1251] eta 0:04:42 lr 0.000384 wd 0.0500 time 0.2485 (0.2502) data time 0.0012 (0.0061) model time 0.2473 (0.2451) loss 2.1459 (3.0922) grad_norm 2.7561 (2.8925) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][130/1251] 
eta 0:04:40 lr 0.000383 wd 0.0500 time 0.2525 (0.2498) data time 0.0007 (0.0057) model time 0.2518 (0.2450) loss 3.8248 (3.0988) grad_norm 2.8738 (2.8916) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][140/1251] eta 0:04:37 lr 0.000383 wd 0.0500 time 0.2404 (0.2495) data time 0.0010 (0.0054) model time 0.2395 (0.2449) loss 3.5081 (3.0898) grad_norm 2.6669 (2.9035) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][150/1251] eta 0:04:34 lr 0.000383 wd 0.0500 time 0.2476 (0.2490) data time 0.0007 (0.0051) model time 0.2469 (0.2445) loss 3.2218 (3.1018) grad_norm 2.9210 (2.9121) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][160/1251] eta 0:04:31 lr 0.000383 wd 0.0500 time 0.2412 (0.2485) data time 0.0009 (0.0049) model time 0.2403 (0.2440) loss 3.2496 (3.1061) grad_norm 3.2807 (2.9121) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][170/1251] eta 0:04:28 lr 0.000383 wd 0.0500 time 0.2707 (0.2484) data time 0.0010 (0.0046) model time 0.2697 (0.2441) loss 3.7002 (3.1052) grad_norm 2.2212 (2.8842) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][180/1251] eta 0:04:25 lr 0.000383 wd 0.0500 time 0.2470 (0.2482) data time 0.0007 (0.0045) model time 0.2463 (0.2441) loss 2.3758 (3.1014) grad_norm 3.0620 (2.8950) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][190/1251] eta 0:04:24 lr 0.000383 wd 0.0500 time 0.2369 (0.2489) data time 0.0012 (0.0043) model time 0.2357 (0.2453) loss 3.2642 (3.0836) grad_norm 3.8479 (2.9385) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 19:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][200/1251] eta 0:04:21 lr 0.000383 wd 0.0500 time 0.2429 (0.2488) data time 0.0007 (0.0041) model time 0.2422 (0.2453) loss 2.5798 (3.0843) grad_norm 4.3050 (3.1186) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][210/1251] eta 0:04:18 lr 0.000383 wd 0.0500 time 0.2368 (0.2484) data time 0.0008 (0.0040) model time 0.2360 (0.2450) loss 3.0525 (3.0721) grad_norm 2.9395 (3.1113) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][220/1251] eta 0:04:15 lr 0.000383 wd 0.0500 time 0.2393 (0.2482) data time 0.0010 (0.0039) model time 0.2383 (0.2448) loss 2.0236 (3.0698) grad_norm 2.2944 (3.0971) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][230/1251] eta 0:04:13 lr 0.000383 wd 0.0500 time 0.2345 (0.2479) data time 0.0007 (0.0037) model time 0.2337 (0.2445) loss 2.8857 (3.0706) grad_norm 3.3657 (3.0821) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][240/1251] eta 0:04:10 lr 0.000383 wd 0.0500 time 0.2442 (0.2477) data time 0.0011 (0.0037) model time 0.2431 (0.2444) loss 3.1038 (3.0700) grad_norm 2.9257 (3.0811) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][250/1251] eta 0:04:07 lr 0.000383 wd 0.0500 time 0.2406 (0.2475) data time 0.0008 (0.0036) model time 0.2398 (0.2442) loss 2.2695 (3.0714) grad_norm 2.5269 (3.0662) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][260/1251] eta 0:04:05 lr 0.000383 wd 0.0500 time 0.2344 (0.2473) data time 0.0010 (0.0035) model time 0.2335 
(0.2441) loss 2.1686 (3.0611) grad_norm 2.2751 (3.0426) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][270/1251] eta 0:04:02 lr 0.000383 wd 0.0500 time 0.2395 (0.2471) data time 0.0007 (0.0034) model time 0.2387 (0.2439) loss 3.2991 (3.0667) grad_norm 2.5106 (3.0402) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][280/1251] eta 0:03:59 lr 0.000383 wd 0.0500 time 0.2349 (0.2470) data time 0.0008 (0.0033) model time 0.2341 (0.2438) loss 4.1150 (3.0692) grad_norm 2.3540 (3.0578) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][290/1251] eta 0:03:57 lr 0.000383 wd 0.0500 time 0.2452 (0.2468) data time 0.0008 (0.0033) model time 0.2444 (0.2436) loss 2.1609 (3.0644) grad_norm 4.9702 (3.0553) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][300/1251] eta 0:03:54 lr 0.000383 wd 0.0500 time 0.2436 (0.2467) data time 0.0011 (0.0032) model time 0.2425 (0.2436) loss 3.4721 (3.0740) grad_norm 7.0444 (3.0591) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][310/1251] eta 0:03:52 lr 0.000383 wd 0.0500 time 0.2471 (0.2466) data time 0.0008 (0.0032) model time 0.2463 (0.2436) loss 3.6519 (3.0857) grad_norm 2.0842 (3.0653) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][320/1251] eta 0:03:49 lr 0.000383 wd 0.0500 time 0.2425 (0.2465) data time 0.0012 (0.0031) model time 0.2414 (0.2435) loss 1.8706 (3.0767) grad_norm 4.3363 (3.0736) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[182/300][330/1251] eta 0:03:46 lr 0.000383 wd 0.0500 time 0.2400 (0.2464) data time 0.0010 (0.0030) model time 0.2389 (0.2435) loss 3.3898 (3.0685) grad_norm 3.1853 (3.0641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][340/1251] eta 0:03:44 lr 0.000383 wd 0.0500 time 0.2373 (0.2463) data time 0.0010 (0.0030) model time 0.2363 (0.2434) loss 2.8564 (3.0667) grad_norm 2.1403 (3.0548) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][350/1251] eta 0:03:41 lr 0.000383 wd 0.0500 time 0.2390 (0.2462) data time 0.0010 (0.0029) model time 0.2380 (0.2433) loss 4.0839 (3.0717) grad_norm 3.1108 (3.0680) loss_scale 2048.0000 (1044.4217) mem 7381MB [2024-08-27 19:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][360/1251] eta 0:03:39 lr 0.000383 wd 0.0500 time 0.2444 (0.2462) data time 0.0009 (0.0029) model time 0.2435 (0.2433) loss 3.8580 (3.0734) grad_norm 2.2151 (3.0717) loss_scale 2048.0000 (1072.2216) mem 7381MB [2024-08-27 19:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][370/1251] eta 0:03:36 lr 0.000382 wd 0.0500 time 0.2403 (0.2460) data time 0.0010 (0.0029) model time 0.2392 (0.2432) loss 3.7830 (3.0736) grad_norm 3.7396 (3.1059) loss_scale 2048.0000 (1098.5229) mem 7381MB [2024-08-27 19:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][380/1251] eta 0:03:34 lr 0.000382 wd 0.0500 time 0.2410 (0.2459) data time 0.0013 (0.0028) model time 0.2397 (0.2431) loss 3.2558 (3.0703) grad_norm 2.6716 (3.0960) loss_scale 2048.0000 (1123.4436) mem 7381MB [2024-08-27 19:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][390/1251] eta 0:03:31 lr 0.000382 wd 0.0500 time 0.2396 (0.2459) data time 0.0008 (0.0028) model time 0.2388 (0.2431) loss 3.1791 (3.0758) grad_norm 6.6784 (3.1095) loss_scale 2048.0000 
(1147.0895) mem 7381MB [2024-08-27 19:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][400/1251] eta 0:03:29 lr 0.000382 wd 0.0500 time 0.2483 (0.2458) data time 0.0012 (0.0027) model time 0.2471 (0.2431) loss 3.1578 (3.0728) grad_norm 4.1783 (3.1256) loss_scale 2048.0000 (1169.5561) mem 7381MB [2024-08-27 19:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][410/1251] eta 0:03:26 lr 0.000382 wd 0.0500 time 0.2418 (0.2458) data time 0.0007 (0.0027) model time 0.2411 (0.2431) loss 3.4522 (3.0656) grad_norm 2.4604 (3.1224) loss_scale 2048.0000 (1190.9294) mem 7381MB [2024-08-27 19:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][420/1251] eta 0:03:24 lr 0.000382 wd 0.0500 time 0.2421 (0.2458) data time 0.0010 (0.0027) model time 0.2411 (0.2431) loss 3.2828 (3.0717) grad_norm 10.7880 (3.1338) loss_scale 2048.0000 (1211.2874) mem 7381MB [2024-08-27 19:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][430/1251] eta 0:03:21 lr 0.000382 wd 0.0500 time 0.2457 (0.2457) data time 0.0011 (0.0026) model time 0.2446 (0.2431) loss 3.4003 (3.0707) grad_norm 2.4761 (3.1412) loss_scale 2048.0000 (1230.7007) mem 7381MB [2024-08-27 19:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][440/1251] eta 0:03:19 lr 0.000382 wd 0.0500 time 0.2473 (0.2457) data time 0.0009 (0.0026) model time 0.2464 (0.2431) loss 3.6415 (3.0755) grad_norm 2.4878 (3.1336) loss_scale 2048.0000 (1249.2336) mem 7381MB [2024-08-27 19:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][450/1251] eta 0:03:16 lr 0.000382 wd 0.0500 time 0.2488 (0.2456) data time 0.0009 (0.0026) model time 0.2479 (0.2430) loss 1.9066 (3.0709) grad_norm 2.4317 (3.1239) loss_scale 2048.0000 (1266.9446) mem 7381MB [2024-08-27 19:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][460/1251] eta 0:03:14 lr 0.000382 wd 0.0500 time 0.2400 (0.2456) data time 0.0011 
(0.0026) model time 0.2388 (0.2430) loss 3.3041 (3.0711) grad_norm 3.7074 (3.1184) loss_scale 2048.0000 (1283.8872) mem 7381MB [2024-08-27 19:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][470/1251] eta 0:03:11 lr 0.000382 wd 0.0500 time 0.2476 (0.2456) data time 0.0007 (0.0025) model time 0.2468 (0.2430) loss 2.6168 (3.0674) grad_norm 3.3363 (3.1151) loss_scale 2048.0000 (1300.1104) mem 7381MB [2024-08-27 19:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][480/1251] eta 0:03:09 lr 0.000382 wd 0.0500 time 0.2414 (0.2455) data time 0.0010 (0.0025) model time 0.2404 (0.2430) loss 2.8248 (3.0652) grad_norm 2.4853 (3.1044) loss_scale 2048.0000 (1315.6590) mem 7381MB [2024-08-27 19:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][490/1251] eta 0:03:06 lr 0.000382 wd 0.0500 time 0.2341 (0.2455) data time 0.0013 (0.0025) model time 0.2329 (0.2430) loss 3.5534 (3.0685) grad_norm 3.0815 (3.0993) loss_scale 2048.0000 (1330.5743) mem 7381MB [2024-08-27 19:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][500/1251] eta 0:03:04 lr 0.000382 wd 0.0500 time 0.2354 (0.2454) data time 0.0007 (0.0025) model time 0.2347 (0.2429) loss 2.9556 (3.0714) grad_norm 2.4336 (3.1008) loss_scale 2048.0000 (1344.8942) mem 7381MB [2024-08-27 19:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][510/1251] eta 0:03:01 lr 0.000382 wd 0.0500 time 0.2423 (0.2453) data time 0.0010 (0.0025) model time 0.2414 (0.2428) loss 2.0903 (3.0728) grad_norm 2.5335 (3.1018) loss_scale 2048.0000 (1358.6536) mem 7381MB [2024-08-27 19:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][520/1251] eta 0:02:59 lr 0.000382 wd 0.0500 time 0.2533 (0.2452) data time 0.0010 (0.0024) model time 0.2522 (0.2428) loss 3.6210 (3.0743) grad_norm 2.9105 (3.1026) loss_scale 2048.0000 (1371.8848) mem 7381MB [2024-08-27 19:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [182/300][530/1251] eta 0:02:56 lr 0.000382 wd 0.0500 time 0.2419 (0.2452) data time 0.0012 (0.0024) model time 0.2408 (0.2428) loss 3.2138 (3.0737) grad_norm 7.1675 (3.1065) loss_scale 2048.0000 (1384.6177) mem 7381MB [2024-08-27 19:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][540/1251] eta 0:02:54 lr 0.000382 wd 0.0500 time 0.2447 (0.2451) data time 0.0008 (0.0024) model time 0.2439 (0.2427) loss 2.7929 (3.0710) grad_norm 2.0623 (3.0965) loss_scale 2048.0000 (1396.8799) mem 7381MB [2024-08-27 19:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][550/1251] eta 0:02:51 lr 0.000382 wd 0.0500 time 0.2369 (0.2450) data time 0.0010 (0.0024) model time 0.2359 (0.2426) loss 3.0051 (3.0713) grad_norm 2.8619 (3.0950) loss_scale 2048.0000 (1408.6969) mem 7381MB [2024-08-27 19:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][560/1251] eta 0:02:49 lr 0.000382 wd 0.0500 time 0.2405 (0.2449) data time 0.0007 (0.0023) model time 0.2398 (0.2426) loss 3.7642 (3.0785) grad_norm 2.0723 (3.0913) loss_scale 2048.0000 (1420.0927) mem 7381MB [2024-08-27 19:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][570/1251] eta 0:02:46 lr 0.000382 wd 0.0500 time 0.2379 (0.2449) data time 0.0010 (0.0024) model time 0.2369 (0.2426) loss 2.9150 (3.0802) grad_norm 4.7497 (3.0880) loss_scale 2048.0000 (1431.0893) mem 7381MB [2024-08-27 19:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][580/1251] eta 0:02:44 lr 0.000382 wd 0.0500 time 0.2385 (0.2449) data time 0.0013 (0.0023) model time 0.2372 (0.2425) loss 1.9056 (3.0809) grad_norm 11.3077 (3.1002) loss_scale 2048.0000 (1441.7074) mem 7381MB [2024-08-27 19:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][590/1251] eta 0:02:41 lr 0.000382 wd 0.0500 time 0.2421 (0.2449) data time 0.0007 (0.0023) model time 0.2414 (0.2426) loss 3.8034 (3.0830) grad_norm 2.6014 (3.0956) loss_scale 
2048.0000 (1451.9662) mem 7381MB [2024-08-27 19:19:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][600/1251] eta 0:02:39 lr 0.000381 wd 0.0500 time 0.2608 (0.2450) data time 0.0012 (0.0023) model time 0.2597 (0.2427) loss 3.4606 (3.0811) grad_norm 2.0735 (3.0853) loss_scale 2048.0000 (1461.8835) mem 7381MB [2024-08-27 19:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][610/1251] eta 0:02:37 lr 0.000381 wd 0.0500 time 0.2557 (0.2450) data time 0.0007 (0.0023) model time 0.2550 (0.2427) loss 1.9412 (3.0785) grad_norm 2.6202 (3.0752) loss_scale 2048.0000 (1471.4763) mem 7381MB [2024-08-27 19:19:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][620/1251] eta 0:02:34 lr 0.000381 wd 0.0500 time 0.2568 (0.2450) data time 0.0009 (0.0023) model time 0.2559 (0.2427) loss 3.5318 (3.0802) grad_norm 3.2875 (3.0723) loss_scale 2048.0000 (1480.7601) mem 7381MB [2024-08-27 19:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][630/1251] eta 0:02:32 lr 0.000381 wd 0.0500 time 0.2479 (0.2450) data time 0.0011 (0.0023) model time 0.2468 (0.2427) loss 3.4259 (3.0778) grad_norm 2.8655 (3.0661) loss_scale 2048.0000 (1489.7496) mem 7381MB [2024-08-27 19:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][640/1251] eta 0:02:29 lr 0.000381 wd 0.0500 time 0.2557 (0.2449) data time 0.0008 (0.0023) model time 0.2549 (0.2426) loss 2.9128 (3.0793) grad_norm 2.9914 (3.0604) loss_scale 2048.0000 (1498.4587) mem 7381MB [2024-08-27 19:19:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][650/1251] eta 0:02:27 lr 0.000381 wd 0.0500 time 0.2643 (0.2449) data time 0.0010 (0.0022) model time 0.2633 (0.2427) loss 3.0137 (3.0813) grad_norm 3.0186 (3.0517) loss_scale 2048.0000 (1506.9002) mem 7381MB [2024-08-27 19:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][660/1251] eta 0:02:24 lr 0.000381 wd 0.0500 time 0.2440 (0.2450) data time 
0.0012 (0.0022) model time 0.2428 (0.2427) loss 3.8302 (3.0877) grad_norm 2.5479 (3.0524) loss_scale 2048.0000 (1515.0862) mem 7381MB [2024-08-27 19:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][670/1251] eta 0:02:22 lr 0.000381 wd 0.0500 time 0.2466 (0.2450) data time 0.0010 (0.0022) model time 0.2456 (0.2428) loss 2.9264 (3.0892) grad_norm 2.2716 (3.0519) loss_scale 2048.0000 (1523.0283) mem 7381MB [2024-08-27 19:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][680/1251] eta 0:02:19 lr 0.000381 wd 0.0500 time 0.2495 (0.2450) data time 0.0008 (0.0022) model time 0.2487 (0.2428) loss 3.1172 (3.0915) grad_norm 2.4667 (3.0452) loss_scale 2048.0000 (1530.7372) mem 7381MB [2024-08-27 19:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][690/1251] eta 0:02:17 lr 0.000381 wd 0.0500 time 0.2494 (0.2449) data time 0.0011 (0.0022) model time 0.2483 (0.2427) loss 3.1581 (3.0903) grad_norm 2.7447 (3.0416) loss_scale 2048.0000 (1538.2229) mem 7381MB [2024-08-27 19:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][700/1251] eta 0:02:14 lr 0.000381 wd 0.0500 time 0.2417 (0.2449) data time 0.0008 (0.0022) model time 0.2409 (0.2427) loss 2.8714 (3.0861) grad_norm 1.9988 (3.0363) loss_scale 2048.0000 (1545.4950) mem 7381MB [2024-08-27 19:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][710/1251] eta 0:02:12 lr 0.000381 wd 0.0500 time 0.2410 (0.2449) data time 0.0008 (0.0022) model time 0.2402 (0.2428) loss 3.1170 (3.0875) grad_norm 2.6027 (3.0308) loss_scale 2048.0000 (1552.5626) mem 7381MB [2024-08-27 19:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][720/1251] eta 0:02:10 lr 0.000381 wd 0.0500 time 0.2402 (0.2452) data time 0.0014 (0.0021) model time 0.2388 (0.2431) loss 2.0824 (3.0863) grad_norm 2.4310 (3.0238) loss_scale 2048.0000 (1559.4341) mem 7381MB [2024-08-27 19:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [182/300][730/1251] eta 0:02:07 lr 0.000381 wd 0.0500 time 0.2587 (0.2452) data time 0.0009 (0.0021) model time 0.2577 (0.2431) loss 3.4200 (3.0908) grad_norm 4.4202 (3.0245) loss_scale 2048.0000 (1566.1176) mem 7381MB [2024-08-27 19:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][740/1251] eta 0:02:05 lr 0.000381 wd 0.0500 time 0.2382 (0.2452) data time 0.0011 (0.0021) model time 0.2371 (0.2431) loss 3.3060 (3.0929) grad_norm 3.0297 (inf) loss_scale 1024.0000 (1565.7112) mem 7381MB [2024-08-27 19:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][750/1251] eta 0:02:02 lr 0.000381 wd 0.0500 time 0.2428 (0.2452) data time 0.0009 (0.0021) model time 0.2419 (0.2431) loss 3.4275 (3.0960) grad_norm 2.0965 (inf) loss_scale 1024.0000 (1558.4980) mem 7381MB [2024-08-27 19:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][760/1251] eta 0:02:00 lr 0.000381 wd 0.0500 time 0.2346 (0.2452) data time 0.0010 (0.0021) model time 0.2336 (0.2431) loss 2.9389 (3.0961) grad_norm 2.3559 (inf) loss_scale 1024.0000 (1551.4744) mem 7381MB [2024-08-27 19:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][770/1251] eta 0:01:57 lr 0.000381 wd 0.0500 time 0.2408 (0.2451) data time 0.0009 (0.0021) model time 0.2399 (0.2430) loss 1.8718 (3.0952) grad_norm 2.9880 (inf) loss_scale 1024.0000 (1544.6329) mem 7381MB [2024-08-27 19:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][780/1251] eta 0:01:55 lr 0.000381 wd 0.0500 time 0.2440 (0.2452) data time 0.0011 (0.0021) model time 0.2429 (0.2431) loss 3.7673 (3.0971) grad_norm 2.7814 (inf) loss_scale 1024.0000 (1537.9667) mem 7381MB [2024-08-27 19:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][790/1251] eta 0:01:53 lr 0.000381 wd 0.0500 time 0.2389 (0.2451) data time 0.0009 (0.0021) model time 0.2380 (0.2430) loss 2.9421 (3.0990) grad_norm 2.6035 (inf) loss_scale 1024.0000 
(1531.4690) mem 7381MB [2024-08-27 19:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][800/1251] eta 0:01:50 lr 0.000381 wd 0.0500 time 0.2384 (0.2451) data time 0.0011 (0.0021) model time 0.2373 (0.2430) loss 3.6617 (3.0995) grad_norm 3.6695 (inf) loss_scale 1024.0000 (1525.1336) mem 7381MB [2024-08-27 19:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][810/1251] eta 0:01:48 lr 0.000381 wd 0.0500 time 0.2443 (0.2451) data time 0.0009 (0.0021) model time 0.2435 (0.2431) loss 3.5849 (3.1002) grad_norm 2.7155 (inf) loss_scale 1024.0000 (1518.9544) mem 7381MB [2024-08-27 19:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][820/1251] eta 0:01:45 lr 0.000381 wd 0.0500 time 0.2465 (0.2451) data time 0.0008 (0.0020) model time 0.2457 (0.2430) loss 3.9069 (3.0986) grad_norm 2.5204 (inf) loss_scale 1024.0000 (1512.9257) mem 7381MB [2024-08-27 19:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][830/1251] eta 0:01:43 lr 0.000380 wd 0.0500 time 0.2474 (0.2450) data time 0.0011 (0.0020) model time 0.2463 (0.2430) loss 3.5207 (3.1005) grad_norm 2.7347 (inf) loss_scale 1024.0000 (1507.0421) mem 7381MB [2024-08-27 19:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][840/1251] eta 0:01:40 lr 0.000380 wd 0.0500 time 0.2424 (0.2450) data time 0.0010 (0.0020) model time 0.2414 (0.2430) loss 3.8628 (3.1024) grad_norm 3.1440 (inf) loss_scale 1024.0000 (1501.2985) mem 7381MB [2024-08-27 19:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][850/1251] eta 0:01:38 lr 0.000380 wd 0.0500 time 0.2468 (0.2450) data time 0.0013 (0.0020) model time 0.2455 (0.2429) loss 3.2979 (3.0998) grad_norm 2.5462 (inf) loss_scale 1024.0000 (1495.6898) mem 7381MB [2024-08-27 19:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][860/1251] eta 0:01:35 lr 0.000380 wd 0.0500 time 0.2386 (0.2449) data time 0.0007 (0.0020) model time 
0.2378 (0.2429) loss 2.4050 (3.0952) grad_norm 2.6870 (inf) loss_scale 1024.0000 (1490.2114) mem 7381MB [2024-08-27 19:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][870/1251] eta 0:01:33 lr 0.000380 wd 0.0500 time 0.2342 (0.2449) data time 0.0008 (0.0020) model time 0.2333 (0.2429) loss 3.5314 (3.0926) grad_norm 2.6047 (inf) loss_scale 1024.0000 (1484.8588) mem 7381MB [2024-08-27 19:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][880/1251] eta 0:01:30 lr 0.000380 wd 0.0500 time 0.2522 (0.2449) data time 0.0007 (0.0020) model time 0.2515 (0.2429) loss 3.3073 (3.0922) grad_norm 2.8473 (inf) loss_scale 1024.0000 (1479.6277) mem 7381MB [2024-08-27 19:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][890/1251] eta 0:01:28 lr 0.000380 wd 0.0500 time 0.2411 (0.2449) data time 0.0009 (0.0020) model time 0.2402 (0.2429) loss 3.9030 (3.0933) grad_norm 3.4295 (inf) loss_scale 1024.0000 (1474.5140) mem 7381MB [2024-08-27 19:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][900/1251] eta 0:01:25 lr 0.000380 wd 0.0500 time 0.2394 (0.2449) data time 0.0008 (0.0020) model time 0.2386 (0.2429) loss 3.3607 (3.0948) grad_norm 3.2256 (inf) loss_scale 1024.0000 (1469.5139) mem 7381MB [2024-08-27 19:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][910/1251] eta 0:01:23 lr 0.000380 wd 0.0500 time 0.2434 (0.2449) data time 0.0009 (0.0020) model time 0.2424 (0.2429) loss 3.2105 (3.0937) grad_norm 2.4765 (inf) loss_scale 1024.0000 (1464.6235) mem 7381MB [2024-08-27 19:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][920/1251] eta 0:01:21 lr 0.000380 wd 0.0500 time 0.2344 (0.2449) data time 0.0010 (0.0020) model time 0.2334 (0.2429) loss 3.3987 (3.0939) grad_norm 3.5176 (inf) loss_scale 1024.0000 (1459.8393) mem 7381MB [2024-08-27 19:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][930/1251] eta 0:01:18 
lr 0.000380 wd 0.0500 time 0.2477 (0.2449) data time 0.0008 (0.0020) model time 0.2469 (0.2429) loss 2.1438 (3.0912) grad_norm 2.2300 (inf) loss_scale 1024.0000 (1455.1579) mem 7381MB [2024-08-27 19:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][940/1251] eta 0:01:16 lr 0.000380 wd 0.0500 time 0.2429 (0.2449) data time 0.0010 (0.0020) model time 0.2419 (0.2429) loss 2.1519 (3.0896) grad_norm 2.5364 (inf) loss_scale 1024.0000 (1450.5760) mem 7381MB [2024-08-27 19:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][950/1251] eta 0:01:13 lr 0.000380 wd 0.0500 time 0.2430 (0.2449) data time 0.0008 (0.0020) model time 0.2422 (0.2429) loss 3.8226 (3.0895) grad_norm 3.1041 (inf) loss_scale 1024.0000 (1446.0904) mem 7381MB [2024-08-27 19:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][960/1251] eta 0:01:11 lr 0.000380 wd 0.0500 time 0.2329 (0.2449) data time 0.0010 (0.0020) model time 0.2319 (0.2429) loss 3.2745 (3.0895) grad_norm 2.9063 (inf) loss_scale 1024.0000 (1441.6982) mem 7381MB [2024-08-27 19:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][970/1251] eta 0:01:08 lr 0.000380 wd 0.0500 time 0.2398 (0.2449) data time 0.0011 (0.0019) model time 0.2387 (0.2429) loss 2.9043 (3.0912) grad_norm 2.9425 (inf) loss_scale 1024.0000 (1437.3965) mem 7381MB [2024-08-27 19:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][980/1251] eta 0:01:06 lr 0.000380 wd 0.0500 time 0.2401 (0.2449) data time 0.0011 (0.0019) model time 0.2390 (0.2429) loss 2.8900 (3.0939) grad_norm 2.8034 (inf) loss_scale 1024.0000 (1433.1825) mem 7381MB [2024-08-27 19:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][990/1251] eta 0:01:03 lr 0.000380 wd 0.0500 time 0.2442 (0.2448) data time 0.0007 (0.0019) model time 0.2435 (0.2429) loss 3.6584 (3.0953) grad_norm 4.3807 (inf) loss_scale 1024.0000 (1429.0535) mem 7381MB [2024-08-27 19:21:13 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1000/1251] eta 0:01:01 lr 0.000380 wd 0.0500 time 0.2425 (0.2448) data time 0.0010 (0.0019) model time 0.2415 (0.2429) loss 3.2154 (3.0934) grad_norm 2.0443 (inf) loss_scale 1024.0000 (1425.0070) mem 7381MB [2024-08-27 19:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1010/1251] eta 0:00:58 lr 0.000380 wd 0.0500 time 0.2345 (0.2448) data time 0.0011 (0.0019) model time 0.2333 (0.2429) loss 3.4752 (3.0947) grad_norm 2.7347 (inf) loss_scale 1024.0000 (1421.0406) mem 7381MB [2024-08-27 19:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1020/1251] eta 0:00:56 lr 0.000380 wd 0.0500 time 0.2599 (0.2448) data time 0.0010 (0.0019) model time 0.2589 (0.2429) loss 2.7760 (3.0947) grad_norm 2.9965 (inf) loss_scale 1024.0000 (1417.1518) mem 7381MB [2024-08-27 19:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1030/1251] eta 0:00:54 lr 0.000380 wd 0.0500 time 0.2431 (0.2448) data time 0.0008 (0.0019) model time 0.2423 (0.2429) loss 2.0292 (3.0891) grad_norm 2.2758 (inf) loss_scale 1024.0000 (1413.3385) mem 7381MB [2024-08-27 19:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1040/1251] eta 0:00:51 lr 0.000380 wd 0.0500 time 0.2341 (0.2450) data time 0.0013 (0.0019) model time 0.2328 (0.2431) loss 2.9951 (3.0875) grad_norm 2.7167 (inf) loss_scale 1024.0000 (1409.5985) mem 7381MB [2024-08-27 19:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1050/1251] eta 0:00:49 lr 0.000380 wd 0.0500 time 0.2422 (0.2450) data time 0.0008 (0.0019) model time 0.2413 (0.2431) loss 1.9116 (3.0879) grad_norm 3.5528 (inf) loss_scale 1024.0000 (1405.9296) mem 7381MB [2024-08-27 19:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1060/1251] eta 0:00:46 lr 0.000379 wd 0.0500 time 0.2425 (0.2450) data time 0.0011 (0.0019) model time 0.2414 (0.2431) loss 2.4310 (3.0879) 
grad_norm 2.9462 (inf) loss_scale 1024.0000 (1402.3299) mem 7381MB [2024-08-27 19:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1070/1251] eta 0:00:44 lr 0.000379 wd 0.0500 time 0.2400 (0.2449) data time 0.0008 (0.0019) model time 0.2392 (0.2431) loss 3.6245 (3.0900) grad_norm 6.9580 (inf) loss_scale 1024.0000 (1398.7974) mem 7381MB [2024-08-27 19:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1080/1251] eta 0:00:41 lr 0.000379 wd 0.0500 time 0.2373 (0.2449) data time 0.0014 (0.0019) model time 0.2360 (0.2431) loss 3.0387 (3.0928) grad_norm 3.0297 (inf) loss_scale 1024.0000 (1395.3302) mem 7381MB [2024-08-27 19:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1090/1251] eta 0:00:39 lr 0.000379 wd 0.0500 time 0.2483 (0.2449) data time 0.0010 (0.0019) model time 0.2473 (0.2431) loss 3.1785 (3.0925) grad_norm 3.1278 (inf) loss_scale 1024.0000 (1391.9267) mem 7381MB [2024-08-27 19:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1100/1251] eta 0:00:36 lr 0.000379 wd 0.0500 time 0.2402 (0.2450) data time 0.0012 (0.0019) model time 0.2390 (0.2431) loss 3.2817 (3.0911) grad_norm 2.8863 (inf) loss_scale 1024.0000 (1388.5849) mem 7381MB [2024-08-27 19:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1110/1251] eta 0:00:34 lr 0.000379 wd 0.0500 time 0.2390 (0.2449) data time 0.0009 (0.0019) model time 0.2381 (0.2431) loss 3.2956 (3.0929) grad_norm 2.5284 (inf) loss_scale 1024.0000 (1385.3033) mem 7381MB [2024-08-27 19:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1120/1251] eta 0:00:32 lr 0.000379 wd 0.0500 time 0.2467 (0.2449) data time 0.0012 (0.0019) model time 0.2456 (0.2431) loss 3.4083 (3.0920) grad_norm 3.2657 (inf) loss_scale 1024.0000 (1382.0803) mem 7381MB [2024-08-27 19:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1130/1251] eta 0:00:29 lr 0.000379 wd 0.0500 time 
0.2452 (0.2449) data time 0.0010 (0.0019) model time 0.2443 (0.2431) loss 3.1452 (3.0905) grad_norm 3.2258 (inf) loss_scale 1024.0000 (1378.9142) mem 7381MB [2024-08-27 19:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1140/1251] eta 0:00:27 lr 0.000379 wd 0.0500 time 0.2421 (0.2449) data time 0.0011 (0.0019) model time 0.2411 (0.2431) loss 3.3674 (3.0897) grad_norm 3.1986 (inf) loss_scale 1024.0000 (1375.8037) mem 7381MB [2024-08-27 19:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1150/1251] eta 0:00:24 lr 0.000379 wd 0.0500 time 0.2442 (0.2449) data time 0.0009 (0.0019) model time 0.2433 (0.2431) loss 2.3207 (3.0894) grad_norm 2.4457 (inf) loss_scale 1024.0000 (1372.7472) mem 7381MB [2024-08-27 19:21:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1160/1251] eta 0:00:22 lr 0.000379 wd 0.0500 time 0.2310 (0.2449) data time 0.0011 (0.0018) model time 0.2299 (0.2430) loss 3.2270 (3.0901) grad_norm 2.1842 (inf) loss_scale 1024.0000 (1369.7433) mem 7381MB [2024-08-27 19:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1170/1251] eta 0:00:19 lr 0.000379 wd 0.0500 time 0.2313 (0.2448) data time 0.0010 (0.0018) model time 0.2304 (0.2430) loss 3.2498 (3.0890) grad_norm 2.9046 (inf) loss_scale 1024.0000 (1366.7908) mem 7381MB [2024-08-27 19:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1180/1251] eta 0:00:17 lr 0.000379 wd 0.0500 time 0.2361 (0.2448) data time 0.0011 (0.0018) model time 0.2350 (0.2430) loss 3.2704 (3.0917) grad_norm 2.0401 (inf) loss_scale 1024.0000 (1363.8882) mem 7381MB [2024-08-27 19:21:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1190/1251] eta 0:00:14 lr 0.000379 wd 0.0500 time 0.2452 (0.2448) data time 0.0008 (0.0018) model time 0.2444 (0.2430) loss 3.4267 (3.0910) grad_norm 2.4527 (inf) loss_scale 1024.0000 (1361.0344) mem 7381MB [2024-08-27 19:22:02 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [182/300][1200/1251] eta 0:00:12 lr 0.000379 wd 0.0500 time 0.2317 (0.2447) data time 0.0013 (0.0018) model time 0.2304 (0.2429) loss 2.6591 (3.0909) grad_norm 3.3693 (inf) loss_scale 1024.0000 (1358.2281) mem 7381MB [2024-08-27 19:22:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1210/1251] eta 0:00:10 lr 0.000379 wd 0.0500 time 0.2473 (0.2448) data time 0.0012 (0.0018) model time 0.2460 (0.2429) loss 3.8882 (3.0930) grad_norm 4.4900 (inf) loss_scale 1024.0000 (1355.4682) mem 7381MB [2024-08-27 19:22:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1220/1251] eta 0:00:07 lr 0.000379 wd 0.0500 time 0.2422 (0.2448) data time 0.0009 (0.0018) model time 0.2413 (0.2429) loss 2.3613 (3.0923) grad_norm 2.2378 (inf) loss_scale 1024.0000 (1352.7535) mem 7381MB [2024-08-27 19:22:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1230/1251] eta 0:00:05 lr 0.000379 wd 0.0500 time 0.2410 (0.2447) data time 0.0009 (0.0018) model time 0.2401 (0.2429) loss 2.6232 (3.0908) grad_norm 7.2084 (inf) loss_scale 1024.0000 (1350.0829) mem 7381MB [2024-08-27 19:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1240/1251] eta 0:00:02 lr 0.000379 wd 0.0500 time 0.2225 (0.2446) data time 0.0007 (0.0018) model time 0.2217 (0.2428) loss 3.5850 (3.0920) grad_norm 2.4978 (inf) loss_scale 1024.0000 (1347.4553) mem 7381MB [2024-08-27 19:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [182/300][1250/1251] eta 0:00:00 lr 0.000379 wd 0.0500 time 0.2268 (0.2445) data time 0.0005 (0.0018) model time 0.2263 (0.2427) loss 2.5241 (3.0902) grad_norm 3.5229 (inf) loss_scale 1024.0000 (1344.8697) mem 7381MB [2024-08-27 19:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 182 training takes 0:05:05 [2024-08-27 19:22:13 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth 
saving...... [2024-08-27 19:22:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.455 (0.455) Loss 0.4329 (0.4329) Acc@1 91.504 (91.504) Acc@5 98.242 (98.242) Mem 7381MB [2024-08-27 19:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.113) Loss 0.6670 (0.6921) Acc@1 87.109 (85.085) Acc@5 96.680 (96.946) Mem 7381MB [2024-08-27 19:22:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.098) Loss 0.9966 (0.7178) Acc@1 76.172 (84.240) Acc@5 93.750 (96.880) Mem 7381MB [2024-08-27 19:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.091) Loss 1.2119 (0.8175) Acc@1 72.363 (81.849) Acc@5 91.309 (95.876) Mem 7381MB [2024-08-27 19:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0898 (0.8698) Acc@1 74.219 (80.509) Acc@5 93.457 (95.327) Mem 7381MB [2024-08-27 19:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.220 Acc@5 95.278 [2024-08-27 19:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.2% [2024-08-27 19:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.861 (0.861) Loss 0.3955 (0.3955) Acc@1 92.969 (92.969) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 19:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.152) Loss 0.6094 (0.6219) Acc@1 88.184 (86.745) Acc@5 97.363 (97.496) Mem 7381MB [2024-08-27 19:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.117) Loss 0.8774 (0.6457) Acc@1 78.711 (85.840) Acc@5 95.996 (97.475) Mem 7381MB [2024-08-27 19:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.105) Loss 1.1162 (0.7315) Acc@1 73.340 
(83.770) Acc@5 92.969 (96.598) Mem 7381MB [2024-08-27 19:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.096) Loss 0.9971 (0.7753) Acc@1 76.074 (82.505) Acc@5 94.141 (96.129) Mem 7381MB [2024-08-27 19:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.046 Acc@5 96.120 [2024-08-27 19:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-08-27 19:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.05% [2024-08-27 19:22:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 19:22:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 19:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][0/1251] eta 0:15:32 lr 0.000379 wd 0.0500 time 0.7457 (0.7457) data time 0.5036 (0.5036) model time 0.0000 (0.0000) loss 3.9274 (3.9274) grad_norm 2.4632 (2.4632) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][10/1251] eta 0:05:57 lr 0.000379 wd 0.0500 time 0.2503 (0.2883) data time 0.0008 (0.0473) model time 0.0000 (0.0000) loss 1.9108 (3.2909) grad_norm 2.0981 (2.4711) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][20/1251] eta 0:05:27 lr 0.000379 wd 0.0500 time 0.2412 (0.2664) data time 0.0007 (0.0252) model time 0.0000 (0.0000) loss 3.8067 (3.1791) grad_norm 3.2139 (2.7875) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][30/1251] eta 0:05:15 lr 0.000379 wd 0.0500 time 0.2432 (0.2586) data time 0.0008 (0.0175) model time 0.0000 (0.0000) loss 2.9569 (3.1103) grad_norm 
2.6003 (2.7558) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][40/1251] eta 0:05:08 lr 0.000379 wd 0.0500 time 0.2494 (0.2547) data time 0.0008 (0.0139) model time 0.0000 (0.0000) loss 2.4804 (3.1121) grad_norm 2.6842 (2.8309) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][50/1251] eta 0:05:03 lr 0.000378 wd 0.0500 time 0.2555 (0.2524) data time 0.0010 (0.0114) model time 0.0000 (0.0000) loss 2.8063 (3.0247) grad_norm 2.5109 (2.8774) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][60/1251] eta 0:04:59 lr 0.000378 wd 0.0500 time 0.2483 (0.2512) data time 0.0009 (0.0098) model time 0.2474 (0.2437) loss 2.8626 (2.9961) grad_norm 2.9868 (2.9287) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][70/1251] eta 0:04:55 lr 0.000378 wd 0.0500 time 0.2449 (0.2498) data time 0.0010 (0.0085) model time 0.2439 (0.2420) loss 3.1497 (3.0128) grad_norm 3.3120 (2.9404) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][80/1251] eta 0:04:51 lr 0.000378 wd 0.0500 time 0.2437 (0.2492) data time 0.0008 (0.0076) model time 0.2429 (0.2424) loss 1.8885 (2.9891) grad_norm 2.5800 (2.9297) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][90/1251] eta 0:04:48 lr 0.000378 wd 0.0500 time 0.2548 (0.2485) data time 0.0009 (0.0069) model time 0.2539 (0.2422) loss 2.2036 (2.9997) grad_norm 5.7052 (3.0113) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][100/1251] eta 0:04:45 lr 0.000378 wd 0.0500 time 
0.2381 (0.2478) data time 0.0012 (0.0064) model time 0.2369 (0.2418) loss 3.4285 (2.9915) grad_norm 2.6622 (3.0827) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 19:22:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:22:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:24:49 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 19:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:24:59 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 19:25:00 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:25:01 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:25:01 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 183) [2024-08-27 19:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][110/1251] eta 0:47:10 lr 0.000378 wd 0.0500 time 0.2346 (2.4810) data time 0.0008 (0.1051) model time 0.2338 (2.3759) loss 3.9741 (3.6922) grad_norm 3.8529 (2.4995) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][120/1251] eta 0:20:22 lr 0.000378 wd 0.0500 time 0.2473 (1.0809) data time 0.0009 (0.0404) model time 0.2465 (1.0405) loss 3.2357 (3.3911) grad_norm 2.8003 (2.6230) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][130/1251] eta 0:14:07 lr 0.000378 wd 0.0500 time 0.2414 (0.7562) data time 0.0008 (0.0253) model time 0.2406 (0.7309) loss 3.1821 (3.3400) grad_norm 2.6930 (2.5486) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][140/1251] eta 0:11:19 lr 0.000378 wd 0.0500 time 0.2380 (0.6118) data time 0.0009 (0.0185) model time 0.2370 (0.5933) loss 3.5486 (3.3123) grad_norm 2.4127 (2.5714) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][150/1251] eta 0:09:44 lr 0.000378 wd 0.0500 time 0.2371 (0.5308) data time 0.0009 (0.0148) model time 0.2362 (0.5161) loss 3.2403 (3.2858) grad_norm 6.4240 (2.6997) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [183/300][160/1251] eta 0:08:42 lr 0.000378 wd 0.0500 time 0.2423 (0.4791) data time 0.0007 (0.0123) model time 0.2415 (0.4668) loss 3.4986 (3.2659) grad_norm 3.0369 (3.0709) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][170/1251] eta 0:07:58 lr 0.000378 wd 0.0500 time 0.2444 (0.4430) data time 0.0008 (0.0106) model time 0.2436 (0.4323) loss 2.9031 (3.2354) grad_norm 2.5480 (3.0241) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][180/1251] eta 0:07:25 lr 0.000378 wd 0.0500 time 0.2413 (0.4160) data time 0.0010 (0.0094) model time 0.2404 (0.4065) loss 3.4441 (3.1895) grad_norm 2.9092 (3.0559) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][190/1251] eta 0:06:59 lr 0.000378 wd 0.0500 time 0.2376 (0.3953) data time 0.0008 (0.0084) model time 0.2368 (0.3869) loss 2.6622 (3.1701) grad_norm 2.7360 (3.0190) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][200/1251] eta 0:06:38 lr 0.000378 wd 0.0500 time 0.2422 (0.3790) data time 0.0008 (0.0077) model time 0.2414 (0.3713) loss 3.0460 (3.1770) grad_norm 3.4487 (3.0534) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][210/1251] eta 0:06:21 lr 0.000378 wd 0.0500 time 0.2477 (0.3661) data time 0.0010 (0.0070) model time 0.2468 (0.3591) loss 2.9820 (3.1951) grad_norm 1.9081 (3.0237) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][220/1251] eta 0:06:06 lr 0.000378 wd 0.0500 time 0.2417 (0.3554) data time 0.0007 (0.0066) model time 0.2409 (0.3488) loss 3.5391 (3.1862) grad_norm 3.1288 (3.0126) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 19:25:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:25:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:31:07 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 19:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:31:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 19:31:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:31:26 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:31:26 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 183) [2024-08-27 19:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:33:33 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 19:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:33:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 19:33:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:33:42 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:33:42 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 183) [2024-08-27 19:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:44:18 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 19:44:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:44:31 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 19:44:32 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:44:33 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:44:33 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 183) [2024-08-27 19:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][230/1251] eta 0:30:11 lr 0.000378 wd 0.0500 time 0.2201 (1.7743) data time 0.0008 (0.1069) model time 0.2193 (1.6674) loss 3.7682 (3.5116) grad_norm 2.4577 (3.0275) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][240/1251] eta 0:15:23 lr 0.000378 wd 0.0500 time 0.2194 (0.9139) data time 0.0008 (0.0481) model time 0.2186 (0.8658) loss 3.8205 (3.3698) grad_norm 2.8867 (2.8689) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][250/1251] eta 0:11:08 lr 0.000378 wd 0.0500 time 0.2212 (0.6678) data time 0.0009 (0.0312) model time 0.2203 (0.6366) loss 3.3369 (3.3771) grad_norm 2.1724 (2.7493) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:44:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][260/1251] eta 0:09:06 lr 0.000378 wd 0.0500 time 0.2246 (0.5514) data time 0.0008 (0.0232) model time 0.2238 (0.5282) loss 3.2464 (3.3261) grad_norm 3.3299 (2.9829) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][270/1251] eta 0:07:54 lr 0.000378 wd 0.0500 time 0.2183 (0.4833) data time 0.0008 (0.0186) model time 0.2175 (0.4647) loss 3.4905 (3.2974) grad_norm 2.6837 (2.9265) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [183/300][280/1251] eta 0:07:06 lr 0.000377 wd 0.0500 time 0.2234 (0.4393) data time 0.0008 (0.0158) model time 0.2226 (0.4235) loss 2.5581 (3.2748) grad_norm 2.2951 (2.8577) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][290/1251] eta 0:06:31 lr 0.000377 wd 0.0500 time 0.2197 (0.4079) data time 0.0008 (0.0136) model time 0.2189 (0.3943) loss 2.2572 (3.2362) grad_norm 2.7668 (2.8750) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][300/1251] eta 0:06:05 lr 0.000377 wd 0.0500 time 0.2287 (0.3848) data time 0.0006 (0.0120) model time 0.2282 (0.3727) loss 2.4225 (3.2077) grad_norm 2.2798 (2.8849) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][310/1251] eta 0:05:45 lr 0.000377 wd 0.0500 time 0.2227 (0.3666) data time 0.0009 (0.0107) model time 0.2218 (0.3559) loss 3.1806 (3.1754) grad_norm 3.2328 (2.9439) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][320/1251] eta 0:05:27 lr 0.000377 wd 0.0500 time 0.2242 (0.3522) data time 0.0007 (0.0097) model time 0.2235 (0.3424) loss 3.9471 (3.1875) grad_norm 2.6712 (2.9397) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][330/1251] eta 0:05:13 lr 0.000377 wd 0.0500 time 0.2210 (0.3404) data time 0.0008 (0.0089) model time 0.2203 (0.3315) loss 2.5257 (3.1881) grad_norm 2.4413 (3.0085) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][340/1251] eta 0:05:01 lr 0.000377 wd 0.0500 time 0.2169 (0.3311) data time 0.0009 (0.0082) model time 0.2159 (0.3229) loss 3.0837 (3.1816) grad_norm 3.8159 (3.0224) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][350/1251] eta 0:04:50 lr 0.000377 wd 0.0500 time 0.2229 (0.3228) data time 0.0007 (0.0077) model time 0.2222 (0.3151) loss 3.3886 (3.1664) grad_norm 2.4801 (2.9965) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][360/1251] eta 0:04:41 lr 0.000377 wd 0.0500 time 0.2262 (0.3158) data time 0.0009 (0.0072) model time 0.2253 (0.3086) loss 2.8286 (3.1598) grad_norm 2.1378 (2.9773) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][370/1251] eta 0:04:32 lr 0.000377 wd 0.0500 time 0.2226 (0.3097) data time 0.0008 (0.0068) model time 0.2219 (0.3029) loss 3.7046 (3.1659) grad_norm 3.1960 (2.9302) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][380/1251] eta 0:04:25 lr 0.000377 wd 0.0500 time 0.2250 (0.3044) data time 0.0007 (0.0064) model time 0.2243 (0.2980) loss 2.8013 (3.1603) grad_norm 3.5663 (2.9189) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][390/1251] eta 0:04:18 lr 0.000377 wd 0.0500 time 0.2308 (0.2999) data time 0.0010 (0.0061) model time 0.2298 (0.2938) loss 3.0905 (3.1625) grad_norm 5.8583 (2.9473) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][400/1251] eta 0:04:11 lr 0.000377 wd 0.0500 time 0.2260 (0.2959) data time 0.0006 (0.0058) model time 0.2254 (0.2901) loss 2.7194 (3.1464) grad_norm 4.0009 (3.0156) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][410/1251] eta 0:04:05 lr 0.000377 wd 0.0500 time 0.2244 (0.2924) data time 
0.0007 (0.0055) model time 0.2237 (0.2868) loss 3.5714 (3.1419) grad_norm 3.8387 (3.0007) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][420/1251] eta 0:04:00 lr 0.000377 wd 0.0500 time 0.2228 (0.2891) data time 0.0009 (0.0053) model time 0.2219 (0.2838) loss 2.2395 (3.1279) grad_norm 2.2241 (2.9943) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][430/1251] eta 0:03:54 lr 0.000377 wd 0.0500 time 0.2271 (0.2861) data time 0.0009 (0.0051) model time 0.2262 (0.2810) loss 3.1646 (3.1123) grad_norm 2.0499 (2.9742) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][440/1251] eta 0:03:49 lr 0.000377 wd 0.0500 time 0.2203 (0.2834) data time 0.0009 (0.0049) model time 0.2195 (0.2785) loss 2.6526 (3.1064) grad_norm 5.8207 (2.9922) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][450/1251] eta 0:03:44 lr 0.000377 wd 0.0500 time 0.2208 (0.2808) data time 0.0009 (0.0047) model time 0.2199 (0.2760) loss 3.7220 (3.1103) grad_norm 2.3736 (3.0284) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][460/1251] eta 0:03:40 lr 0.000377 wd 0.0500 time 0.2275 (0.2785) data time 0.0007 (0.0045) model time 0.2268 (0.2740) loss 3.6468 (3.1033) grad_norm 4.5400 (3.0995) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][470/1251] eta 0:03:35 lr 0.000377 wd 0.0500 time 0.2237 (0.2764) data time 0.0012 (0.0044) model time 0.2225 (0.2720) loss 2.5057 (3.0974) grad_norm 4.6805 (3.1070) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [183/300][480/1251] eta 0:03:31 lr 0.000377 wd 0.0500 time 0.2286 (0.2745) data time 0.0009 (0.0043) model time 0.2278 (0.2702) loss 3.1427 (3.0882) grad_norm 2.3306 (3.0879) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][490/1251] eta 0:03:27 lr 0.000377 wd 0.0500 time 0.2275 (0.2726) data time 0.0006 (0.0041) model time 0.2268 (0.2685) loss 3.4225 (3.0809) grad_norm 2.9877 (3.0700) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][500/1251] eta 0:03:23 lr 0.000377 wd 0.0500 time 0.2227 (0.2709) data time 0.0010 (0.0040) model time 0.2218 (0.2669) loss 1.9779 (3.0785) grad_norm 2.0974 (3.0816) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][510/1251] eta 0:03:20 lr 0.000376 wd 0.0500 time 0.2241 (0.2702) data time 0.0010 (0.0039) model time 0.2230 (0.2662) loss 3.7610 (3.0766) grad_norm 3.6473 (3.0812) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][520/1251] eta 0:03:16 lr 0.000376 wd 0.0500 time 0.2233 (0.2687) data time 0.0009 (0.0038) model time 0.2224 (0.2649) loss 3.2604 (3.0658) grad_norm 2.7279 (3.0896) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][530/1251] eta 0:03:13 lr 0.000376 wd 0.0500 time 0.2253 (0.2681) data time 0.0006 (0.0037) model time 0.2248 (0.2644) loss 2.3240 (3.0628) grad_norm 3.2881 (3.0916) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][540/1251] eta 0:03:09 lr 0.000376 wd 0.0500 time 0.2254 (0.2669) data time 0.0007 (0.0036) model time 0.2247 (0.2632) loss 2.9456 (3.0720) grad_norm 2.5385 (3.0809) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][550/1251] eta 0:03:06 lr 0.000376 wd 0.0500 time 0.2259 (0.2656) data time 0.0006 (0.0036) model time 0.2253 (0.2621) loss 3.8250 (3.0832) grad_norm 2.7496 (3.0735) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][560/1251] eta 0:03:02 lr 0.000376 wd 0.0500 time 0.2458 (0.2645) data time 0.0009 (0.0035) model time 0.2449 (0.2610) loss 3.4311 (3.0814) grad_norm 3.1753 (3.0627) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][570/1251] eta 0:02:59 lr 0.000376 wd 0.0500 time 0.2274 (0.2634) data time 0.0007 (0.0034) model time 0.2267 (0.2600) loss 2.8878 (3.0793) grad_norm 2.5363 (3.0509) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][580/1251] eta 0:02:56 lr 0.000376 wd 0.0500 time 0.2274 (0.2624) data time 0.0008 (0.0034) model time 0.2266 (0.2590) loss 3.2343 (3.0806) grad_norm 2.8103 (3.0379) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][590/1251] eta 0:02:52 lr 0.000376 wd 0.0500 time 0.2277 (0.2614) data time 0.0006 (0.0033) model time 0.2272 (0.2581) loss 3.1540 (3.0813) grad_norm 6.7770 (3.0437) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][600/1251] eta 0:02:49 lr 0.000376 wd 0.0500 time 0.2233 (0.2604) data time 0.0009 (0.0032) model time 0.2224 (0.2572) loss 2.8815 (3.0798) grad_norm 3.6956 (3.0408) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][610/1251] eta 0:02:46 lr 0.000376 wd 0.0500 time 0.2246 (0.2595) 
data time 0.0007 (0.0032) model time 0.2239 (0.2564) loss 2.8682 (3.0755) grad_norm 4.7287 (3.0537) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][620/1251] eta 0:02:43 lr 0.000376 wd 0.0500 time 0.2286 (0.2586) data time 0.0006 (0.0031) model time 0.2281 (0.2555) loss 3.2331 (3.0797) grad_norm 2.4774 (3.0500) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][630/1251] eta 0:02:40 lr 0.000376 wd 0.0500 time 0.2251 (0.2579) data time 0.0007 (0.0031) model time 0.2244 (0.2548) loss 3.3159 (3.0843) grad_norm 2.6724 (3.0487) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][640/1251] eta 0:02:37 lr 0.000376 wd 0.0500 time 0.2399 (0.2572) data time 0.0008 (0.0030) model time 0.2391 (0.2541) loss 4.0988 (3.0859) grad_norm 2.6423 (3.0479) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][650/1251] eta 0:02:34 lr 0.000376 wd 0.0500 time 0.2251 (0.2564) data time 0.0008 (0.0030) model time 0.2243 (0.2535) loss 3.7544 (3.0902) grad_norm 2.3225 (3.0395) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][660/1251] eta 0:02:31 lr 0.000376 wd 0.0500 time 0.2250 (0.2558) data time 0.0009 (0.0029) model time 0.2241 (0.2528) loss 3.3280 (3.0951) grad_norm 2.4266 (3.0282) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][670/1251] eta 0:02:28 lr 0.000376 wd 0.0500 time 0.2246 (0.2551) data time 0.0008 (0.0029) model time 0.2239 (0.2523) loss 2.9110 (3.0964) grad_norm 3.3938 (3.0184) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:34 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [183/300][680/1251] eta 0:02:25 lr 0.000376 wd 0.0500 time 0.2274 (0.2545) data time 0.0008 (0.0028) model time 0.2266 (0.2517) loss 2.9604 (3.0920) grad_norm 3.2765 (3.0171) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][690/1251] eta 0:02:22 lr 0.000376 wd 0.0500 time 0.2286 (0.2539) data time 0.0010 (0.0028) model time 0.2275 (0.2511) loss 2.9217 (3.0873) grad_norm 2.6340 (3.0116) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][700/1251] eta 0:02:19 lr 0.000376 wd 0.0500 time 0.2210 (0.2533) data time 0.0009 (0.0028) model time 0.2201 (0.2505) loss 2.3187 (3.0829) grad_norm 2.3080 (3.0203) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][710/1251] eta 0:02:16 lr 0.000376 wd 0.0500 time 0.2192 (0.2527) data time 0.0009 (0.0027) model time 0.2182 (0.2500) loss 3.2756 (3.0872) grad_norm 1.9210 (3.0071) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:46:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 19:46:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:46:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 19:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 19:49:50 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 19:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 19:50:01 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 19:50:02 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 19:50:04 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 19:50:04 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 183) [2024-08-27 19:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 19:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][720/1251] eta 0:40:44 lr 0.000376 wd 0.0500 time 0.2239 (4.6040) data time 0.0007 (0.2775) model time 0.2232 (4.3266) loss 2.8690 (3.5568) grad_norm 2.2144 (2.7688) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][730/1251] eta 0:10:47 lr 0.000376 wd 0.0500 time 0.2179 (1.2433) data time 0.0013 (0.0649) model time 0.2166 (1.1784) loss 3.1760 (3.4000) grad_norm 2.6929 (2.6639) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][740/1251] eta 0:06:49 lr 0.000375 wd 0.0500 time 0.2193 (0.8010) data time 0.0010 (0.0372) model time 0.2183 (0.7639) loss 3.7839 (3.4398) grad_norm 2.1685 (2.6900) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][750/1251] eta 0:05:14 lr 0.000375 wd 0.0500 time 0.2269 (0.6270) data time 0.0007 (0.0262) model time 0.2261 (0.6008) loss 3.8151 (3.4112) grad_norm 2.4467 (2.6881) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][760/1251] eta 0:04:22 lr 0.000375 wd 0.0500 time 0.2245 (0.5338) data time 0.0009 (0.0205) model time 0.2236 (0.5133) loss 3.2256 (3.3323) grad_norm 2.8710 (2.8211) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][770/1251] eta 0:03:49 lr 0.000375 wd 0.0500 time 0.2286 (0.4763) data time 0.0009 (0.0170) model time 0.2277 (0.4594) loss 3.2744 (3.3145) grad_norm 4.0607 (2.9532) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][780/1251] eta 0:03:25 lr 0.000375 wd 0.0500 time 0.2327 (0.4369) data time 0.0010 (0.0144) model time 0.2317 (0.4224) loss 2.8426 (3.2731) grad_norm 2.6575 (2.8910) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][790/1251] eta 0:03:08 lr 0.000375 wd 0.0500 time 0.2357 (0.4088) data time 0.0011 (0.0126) model time 0.2346 (0.3962) loss 3.4804 (3.2459) grad_norm 3.2954 (2.8484) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][800/1251] eta 0:02:54 lr 0.000375 wd 0.0500 time 0.2444 (0.3877) data time 0.0006 (0.0112) model time 0.2438 (0.3765) loss 2.6959 (3.2213) grad_norm 2.6528 (2.8704) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][810/1251] eta 0:02:43 lr 0.000375 wd 0.0500 time 0.2265 (0.3707) data time 0.0007 (0.0103) model time 0.2259 (0.3604) loss 3.7856 (3.2089) grad_norm 1.9856 (2.8549) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][820/1251] eta 0:02:33 lr 0.000375 wd 0.0500 time 0.2310 (0.3569) data time 0.0008 (0.0094) model time 0.2303 (0.3475) loss 3.7649 
(3.2265) grad_norm 2.3610 (2.8213) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][830/1251] eta 0:02:25 lr 0.000375 wd 0.0500 time 0.2279 (0.3456) data time 0.0008 (0.0087) model time 0.2270 (0.3369) loss 3.1811 (3.2168) grad_norm 4.0445 (2.8456) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][840/1251] eta 0:02:18 lr 0.000375 wd 0.0500 time 0.2287 (0.3362) data time 0.0009 (0.0080) model time 0.2278 (0.3282) loss 3.5326 (3.2184) grad_norm 2.9015 (2.8671) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][850/1251] eta 0:02:11 lr 0.000375 wd 0.0500 time 0.2292 (0.3281) data time 0.0009 (0.0075) model time 0.2283 (0.3206) loss 3.4876 (3.2096) grad_norm 2.2426 (2.8382) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][860/1251] eta 0:02:05 lr 0.000375 wd 0.0500 time 0.2325 (0.3214) data time 0.0008 (0.0071) model time 0.2317 (0.3143) loss 3.5559 (3.1953) grad_norm 3.6710 (2.8396) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][870/1251] eta 0:02:00 lr 0.000375 wd 0.0500 time 0.2308 (0.3155) data time 0.0007 (0.0067) model time 0.2302 (0.3087) loss 3.2939 (3.1824) grad_norm 2.0637 (2.8495) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][880/1251] eta 0:01:55 lr 0.000375 wd 0.0500 time 0.2251 (0.3102) data time 0.0011 (0.0064) model time 0.2239 (0.3038) loss 2.2836 (3.1801) grad_norm 2.9249 (2.8494) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][890/1251] eta 0:01:50 lr 
0.000375 wd 0.0500 time 0.2332 (0.3058) data time 0.0009 (0.0061) model time 0.2323 (0.2997) loss 3.7393 (3.1793) grad_norm 2.9814 (2.8470) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][900/1251] eta 0:01:45 lr 0.000375 wd 0.0500 time 0.2298 (0.3016) data time 0.0008 (0.0058) model time 0.2290 (0.2958) loss 3.3143 (3.1606) grad_norm 3.3420 (2.8656) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][910/1251] eta 0:01:41 lr 0.000375 wd 0.0500 time 0.2312 (0.2978) data time 0.0009 (0.0056) model time 0.2304 (0.2922) loss 3.2526 (3.1509) grad_norm 3.7732 (2.8655) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][920/1251] eta 0:01:37 lr 0.000375 wd 0.0500 time 0.2249 (0.2942) data time 0.0007 (0.0054) model time 0.2242 (0.2889) loss 2.4878 (3.1347) grad_norm 2.7344 (2.8562) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][930/1251] eta 0:01:33 lr 0.000375 wd 0.0500 time 0.2241 (0.2912) data time 0.0010 (0.0052) model time 0.2231 (0.2861) loss 2.0552 (3.1274) grad_norm 2.6886 (2.8417) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][940/1251] eta 0:01:29 lr 0.000375 wd 0.0500 time 0.2247 (0.2885) data time 0.0007 (0.0050) model time 0.2239 (0.2835) loss 2.3927 (3.1206) grad_norm 3.7414 (2.8374) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][950/1251] eta 0:01:26 lr 0.000375 wd 0.0500 time 0.2237 (0.2861) data time 0.0009 (0.0050) model time 0.2228 (0.2812) loss 2.2297 (3.1156) grad_norm 3.7222 (2.8453) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 
19:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][960/1251] eta 0:01:22 lr 0.000375 wd 0.0500 time 0.2252 (0.2839) data time 0.0009 (0.0048) model time 0.2243 (0.2791) loss 3.6163 (3.1208) grad_norm 4.2607 (2.8974) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][970/1251] eta 0:01:19 lr 0.000375 wd 0.0500 time 0.2310 (0.2817) data time 0.0010 (0.0047) model time 0.2300 (0.2771) loss 3.4421 (3.1124) grad_norm 3.2997 (2.9374) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][980/1251] eta 0:01:15 lr 0.000374 wd 0.0500 time 0.2268 (0.2798) data time 0.0013 (0.0045) model time 0.2255 (0.2753) loss 3.2204 (3.1023) grad_norm 2.0923 (2.9498) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][990/1251] eta 0:01:12 lr 0.000374 wd 0.0500 time 0.2346 (0.2781) data time 0.0011 (0.0044) model time 0.2336 (0.2737) loss 3.6832 (3.1011) grad_norm 2.7562 (2.9545) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1000/1251] eta 0:01:09 lr 0.000374 wd 0.0500 time 0.2204 (0.2764) data time 0.0009 (0.0043) model time 0.2195 (0.2721) loss 3.4956 (3.1038) grad_norm 2.3531 (2.9492) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1010/1251] eta 0:01:06 lr 0.000374 wd 0.0500 time 0.2385 (0.2756) data time 0.0010 (0.0042) model time 0.2375 (0.2714) loss 3.0025 (3.0971) grad_norm 2.7304 (2.9452) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1020/1251] eta 0:01:03 lr 0.000374 wd 0.0500 time 0.2292 (0.2741) data time 0.0008 (0.0041) model time 0.2284 (0.2700) 
loss 3.0381 (3.0915) grad_norm 4.1603 (2.9408) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1030/1251] eta 0:01:00 lr 0.000374 wd 0.0500 time 0.2511 (0.2736) data time 0.0008 (0.0040) model time 0.2503 (0.2696) loss 3.4211 (3.0895) grad_norm 3.5476 (2.9350) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1040/1251] eta 0:00:57 lr 0.000374 wd 0.0500 time 0.2338 (0.2723) data time 0.0009 (0.0039) model time 0.2329 (0.2683) loss 3.4557 (3.0988) grad_norm 1.9305 (2.9230) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1050/1251] eta 0:00:54 lr 0.000374 wd 0.0500 time 0.2292 (0.2709) data time 0.0009 (0.0038) model time 0.2283 (0.2671) loss 3.4821 (3.1004) grad_norm 2.6549 (2.9215) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1060/1251] eta 0:00:51 lr 0.000374 wd 0.0500 time 0.2305 (0.2696) data time 0.0010 (0.0037) model time 0.2295 (0.2659) loss 3.8531 (3.1047) grad_norm 2.5188 (2.9253) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1070/1251] eta 0:00:48 lr 0.000374 wd 0.0500 time 0.2615 (0.2685) data time 0.0008 (0.0037) model time 0.2608 (0.2649) loss 3.1733 (3.1034) grad_norm 1.9118 (2.9417) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1080/1251] eta 0:00:45 lr 0.000374 wd 0.0500 time 0.2352 (0.2674) data time 0.0007 (0.0036) model time 0.2346 (0.2639) loss 3.2286 (3.1048) grad_norm 3.0314 (2.9326) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1090/1251] 
eta 0:00:42 lr 0.000374 wd 0.0500 time 0.2282 (0.2663) data time 0.0009 (0.0035) model time 0.2273 (0.2628) loss 3.2235 (3.1004) grad_norm 3.2370 (2.9301) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1100/1251] eta 0:00:40 lr 0.000374 wd 0.0500 time 0.2254 (0.2654) data time 0.0010 (0.0035) model time 0.2244 (0.2619) loss 2.3060 (3.0951) grad_norm 2.5693 (2.9194) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1110/1251] eta 0:00:37 lr 0.000374 wd 0.0500 time 0.2390 (0.2645) data time 0.0011 (0.0034) model time 0.2378 (0.2611) loss 3.5786 (3.0934) grad_norm 2.8563 (2.9218) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1120/1251] eta 0:00:34 lr 0.000374 wd 0.0500 time 0.2453 (0.2637) data time 0.0009 (0.0033) model time 0.2444 (0.2603) loss 2.9815 (3.0971) grad_norm 2.4714 (2.9142) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1130/1251] eta 0:00:31 lr 0.000374 wd 0.0500 time 0.2315 (0.2628) data time 0.0010 (0.0033) model time 0.2304 (0.2596) loss 3.3527 (3.1031) grad_norm 2.6754 (2.9183) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1140/1251] eta 0:00:29 lr 0.000374 wd 0.0500 time 0.2516 (0.2621) data time 0.0007 (0.0032) model time 0.2509 (0.2588) loss 2.2829 (3.1005) grad_norm 2.2486 (2.9114) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1150/1251] eta 0:00:26 lr 0.000374 wd 0.0500 time 0.2323 (0.2614) data time 0.0008 (0.0032) model time 0.2315 (0.2582) loss 3.7890 (3.1082) grad_norm 2.8148 (2.9123) loss_scale 1024.0000 (1024.0000) mem 
7377MB [2024-08-27 19:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1160/1251] eta 0:00:23 lr 0.000374 wd 0.0500 time 0.2490 (0.2607) data time 0.0007 (0.0031) model time 0.2483 (0.2575) loss 2.9546 (3.1089) grad_norm 2.7530 (2.9209) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1170/1251] eta 0:00:21 lr 0.000374 wd 0.0500 time 0.2298 (0.2600) data time 0.0009 (0.0031) model time 0.2290 (0.2569) loss 2.3637 (3.1055) grad_norm 3.2749 (2.9228) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1180/1251] eta 0:00:18 lr 0.000374 wd 0.0500 time 0.3411 (0.2596) data time 0.0007 (0.0030) model time 0.3405 (0.2565) loss 2.6605 (3.1017) grad_norm 4.2951 (2.9345) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1190/1251] eta 0:00:15 lr 0.000374 wd 0.0500 time 0.2354 (0.2590) data time 0.0009 (0.0030) model time 0.2345 (0.2560) loss 3.3557 (3.0975) grad_norm 3.1209 (2.9295) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1200/1251] eta 0:00:13 lr 0.000374 wd 0.0500 time 0.2419 (0.2584) data time 0.0007 (0.0030) model time 0.2412 (0.2554) loss 3.5290 (3.0959) grad_norm 2.5276 (2.9287) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1210/1251] eta 0:00:10 lr 0.000373 wd 0.0500 time 0.2235 (0.2578) data time 0.0009 (0.0029) model time 0.2226 (0.2549) loss 2.2789 (3.0976) grad_norm 1.9343 (2.9226) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1220/1251] eta 0:00:07 lr 0.000373 wd 0.0500 time 0.2500 (0.2573) data time 0.0011 (0.0029) 
model time 0.2489 (0.2544) loss 2.3295 (3.0943) grad_norm 2.3487 (2.9242) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1230/1251] eta 0:00:05 lr 0.000373 wd 0.0500 time 0.2363 (0.2568) data time 0.0011 (0.0029) model time 0.2353 (0.2540) loss 3.5573 (3.1026) grad_norm 2.4512 (2.9263) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1240/1251] eta 0:00:02 lr 0.000373 wd 0.0500 time 0.2142 (0.2561) data time 0.0006 (0.0028) model time 0.2136 (0.2533) loss 3.1307 (3.0991) grad_norm 2.9620 (2.9261) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [183/300][1250/1251] eta 0:00:00 lr 0.000373 wd 0.0500 time 0.2425 (0.2554) data time 0.0005 (0.0028) model time 0.2420 (0.2526) loss 3.4370 (3.0970) grad_norm 2.9578 (2.9158) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 19:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 183 training takes 0:02:16 [2024-08-27 19:52:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:52:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 19:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.445 (0.445) Loss 0.4304 (0.4304) Acc@1 91.992 (91.992) Acc@5 98.145 (98.145) Mem 7377MB [2024-08-27 19:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.091 (0.119) Loss 0.6846 (0.6750) Acc@1 86.426 (85.423) Acc@5 96.680 (97.088) Mem 7377MB [2024-08-27 19:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.101) Loss 1.0312 (0.7074) Acc@1 75.293 (84.175) Acc@5 93.945 (97.080) Mem 7377MB [2024-08-27 19:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.094) Loss 1.2168 (0.8062) Acc@1 70.996 (81.893) Acc@5 91.406 (95.990) Mem 7377MB [2024-08-27 19:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.1016 (0.8568) Acc@1 74.316 (80.674) Acc@5 92.871 (95.393) Mem 7377MB [2024-08-27 19:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.212 Acc@5 95.350 [2024-08-27 19:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.2% [2024-08-27 19:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.915 (0.915) Loss 0.3936 (0.3936) Acc@1 92.871 (92.871) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-27 19:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.165) Loss 0.6084 (0.6214) Acc@1 88.281 (86.710) Acc@5 97.363 (97.505) Mem 7377MB [2024-08-27 19:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.124) Loss 0.8755 (0.6451) Acc@1 78.906 (85.849) Acc@5 95.996 (97.503) Mem 7377MB [2024-08-27 19:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.110) Loss 1.1152 (0.7309) Acc@1 73.633 (83.792) Acc@5 93.066 (96.623) Mem 7377MB [2024-08-27 19:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.099) Loss 0.9971 
(0.7744) Acc@1 75.586 (82.512) Acc@5 94.238 (96.146) Mem 7377MB [2024-08-27 19:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.056 Acc@5 96.126 [2024-08-27 19:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-08-27 19:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.06% [2024-08-27 19:52:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 19:52:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 19:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][0/1251] eta 0:15:29 lr 0.000373 wd 0.0500 time 0.7431 (0.7431) data time 0.4551 (0.4551) model time 0.0000 (0.0000) loss 2.4763 (2.4763) grad_norm 2.1712 (2.1712) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-27 19:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][10/1251] eta 0:05:42 lr 0.000373 wd 0.0500 time 0.2259 (0.2757) data time 0.0008 (0.0424) model time 0.0000 (0.0000) loss 3.3940 (3.0262) grad_norm 3.3346 (2.6882) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][20/1251] eta 0:05:12 lr 0.000373 wd 0.0500 time 0.2243 (0.2537) data time 0.0009 (0.0226) model time 0.0000 (0.0000) loss 3.5534 (3.1251) grad_norm 3.0324 (4.5708) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][30/1251] eta 0:05:00 lr 0.000373 wd 0.0500 time 0.2242 (0.2465) data time 0.0010 (0.0157) model time 0.0000 (0.0000) loss 3.3067 (3.1375) grad_norm 3.6280 (4.1144) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:52:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][40/1251] eta 
0:04:52 lr 0.000373 wd 0.0500 time 0.2295 (0.2419) data time 0.0007 (0.0122) model time 0.0000 (0.0000) loss 3.3694 (3.1244) grad_norm 2.2053 (4.2290) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][50/1251] eta 0:04:48 lr 0.000373 wd 0.0500 time 0.2318 (0.2400) data time 0.0007 (0.0100) model time 0.0000 (0.0000) loss 2.2832 (3.1339) grad_norm 2.0031 (4.0326) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][60/1251] eta 0:04:43 lr 0.000373 wd 0.0500 time 0.2328 (0.2383) data time 0.0008 (0.0085) model time 0.2320 (0.2286) loss 3.1446 (3.1342) grad_norm 2.7639 (3.9194) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:52:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][70/1251] eta 0:04:40 lr 0.000373 wd 0.0500 time 0.2361 (0.2376) data time 0.0011 (0.0075) model time 0.2350 (0.2303) loss 3.1443 (3.1013) grad_norm 3.0785 (3.8423) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][80/1251] eta 0:04:36 lr 0.000373 wd 0.0500 time 0.2217 (0.2365) data time 0.0007 (0.0067) model time 0.2210 (0.2296) loss 2.9947 (3.1020) grad_norm 3.0564 (3.7838) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][90/1251] eta 0:04:33 lr 0.000373 wd 0.0500 time 0.2244 (0.2356) data time 0.0008 (0.0061) model time 0.2235 (0.2288) loss 3.6969 (3.1396) grad_norm 2.6298 (3.6835) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][100/1251] eta 0:04:30 lr 0.000373 wd 0.0500 time 0.2275 (0.2350) data time 0.0010 (0.0056) model time 0.2266 (0.2288) loss 2.8932 (3.1374) grad_norm 3.2986 (3.6159) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 19:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][110/1251] eta 0:04:27 lr 0.000373 wd 0.0500 time 0.2615 (0.2346) data time 0.0009 (0.0052) model time 0.2606 (0.2290) loss 3.0508 (3.1176) grad_norm 2.1854 (3.5340) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][120/1251] eta 0:04:25 lr 0.000373 wd 0.0500 time 0.2227 (0.2344) data time 0.0008 (0.0048) model time 0.2218 (0.2292) loss 3.1864 (3.1107) grad_norm 3.0930 (3.4684) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][130/1251] eta 0:04:22 lr 0.000373 wd 0.0500 time 0.2256 (0.2339) data time 0.0011 (0.0046) model time 0.2244 (0.2289) loss 3.1833 (3.0786) grad_norm 2.6358 (3.4279) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][140/1251] eta 0:04:19 lr 0.000373 wd 0.0500 time 0.2238 (0.2335) data time 0.0008 (0.0043) model time 0.2230 (0.2287) loss 3.3948 (3.0983) grad_norm 3.4079 (3.3988) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][150/1251] eta 0:04:16 lr 0.000373 wd 0.0500 time 0.2244 (0.2332) data time 0.0008 (0.0041) model time 0.2236 (0.2287) loss 3.7683 (3.1012) grad_norm 2.4667 (3.3584) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][160/1251] eta 0:04:14 lr 0.000373 wd 0.0500 time 0.2272 (0.2329) data time 0.0010 (0.0039) model time 0.2262 (0.2286) loss 3.2708 (3.0951) grad_norm 2.4639 (3.3398) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][170/1251] eta 0:04:11 lr 0.000373 wd 0.0500 time 0.2248 (0.2327) data time 0.0012 (0.0037) model time 0.2236 
(0.2286) loss 3.2325 (3.0825) grad_norm 2.9388 (3.3358) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][180/1251] eta 0:04:09 lr 0.000373 wd 0.0500 time 0.2299 (0.2325) data time 0.0007 (0.0036) model time 0.2292 (0.2285) loss 2.0428 (3.0631) grad_norm 3.4787 (3.3408) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][190/1251] eta 0:04:06 lr 0.000372 wd 0.0500 time 0.2268 (0.2322) data time 0.0007 (0.0035) model time 0.2261 (0.2283) loss 3.1193 (3.0724) grad_norm 2.5021 (3.3280) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][200/1251] eta 0:04:03 lr 0.000372 wd 0.0500 time 0.2278 (0.2321) data time 0.0009 (0.0034) model time 0.2269 (0.2283) loss 3.3803 (3.0808) grad_norm 2.2285 (3.2917) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][210/1251] eta 0:04:01 lr 0.000372 wd 0.0500 time 0.2274 (0.2319) data time 0.0008 (0.0032) model time 0.2267 (0.2283) loss 2.4506 (3.0868) grad_norm 2.7739 (3.2737) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][220/1251] eta 0:03:59 lr 0.000372 wd 0.0500 time 0.2290 (0.2327) data time 0.0009 (0.0031) model time 0.2280 (0.2294) loss 2.3662 (3.0860) grad_norm 3.7867 (3.2653) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][230/1251] eta 0:03:57 lr 0.000372 wd 0.0500 time 0.2362 (0.2326) data time 0.0010 (0.0031) model time 0.2352 (0.2294) loss 2.9281 (3.0861) grad_norm 2.4389 (3.2709) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[184/300][240/1251] eta 0:03:55 lr 0.000372 wd 0.0500 time 0.2265 (0.2325) data time 0.0012 (0.0030) model time 0.2253 (0.2293) loss 3.3463 (3.0926) grad_norm 3.0209 (3.2529) loss_scale 2048.0000 (1053.7427) mem 7381MB [2024-08-27 19:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][250/1251] eta 0:03:52 lr 0.000372 wd 0.0500 time 0.2239 (0.2322) data time 0.0007 (0.0029) model time 0.2232 (0.2291) loss 2.0283 (3.0921) grad_norm 2.0433 (3.2191) loss_scale 2048.0000 (1093.3546) mem 7381MB [2024-08-27 19:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][260/1251] eta 0:03:50 lr 0.000372 wd 0.0500 time 0.2289 (0.2321) data time 0.0009 (0.0028) model time 0.2280 (0.2291) loss 3.4344 (3.0931) grad_norm 2.8152 (3.2166) loss_scale 2048.0000 (1129.9310) mem 7381MB [2024-08-27 19:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][270/1251] eta 0:03:47 lr 0.000372 wd 0.0500 time 0.2319 (0.2321) data time 0.0009 (0.0028) model time 0.2310 (0.2291) loss 3.1802 (3.0852) grad_norm 2.4206 (3.2109) loss_scale 2048.0000 (1163.8081) mem 7381MB [2024-08-27 19:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][280/1251] eta 0:03:45 lr 0.000372 wd 0.0500 time 0.2281 (0.2325) data time 0.0010 (0.0027) model time 0.2271 (0.2297) loss 3.4018 (3.0746) grad_norm 2.6388 (3.2059) loss_scale 2048.0000 (1195.2740) mem 7381MB [2024-08-27 19:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][290/1251] eta 0:03:43 lr 0.000372 wd 0.0500 time 0.2286 (0.2324) data time 0.0011 (0.0027) model time 0.2274 (0.2297) loss 3.5368 (3.0807) grad_norm 2.8890 (3.2056) loss_scale 2048.0000 (1224.5773) mem 7381MB [2024-08-27 19:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][300/1251] eta 0:03:40 lr 0.000372 wd 0.0500 time 0.2260 (0.2323) data time 0.0009 (0.0026) model time 0.2251 (0.2296) loss 3.1084 (3.0737) grad_norm 2.5833 (3.1954) loss_scale 2048.0000 
(1251.9336) mem 7381MB [2024-08-27 19:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][310/1251] eta 0:03:38 lr 0.000372 wd 0.0500 time 0.2262 (0.2322) data time 0.0008 (0.0026) model time 0.2255 (0.2296) loss 1.8989 (3.0695) grad_norm 2.3931 (3.1765) loss_scale 2048.0000 (1277.5305) mem 7381MB [2024-08-27 19:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][320/1251] eta 0:03:36 lr 0.000372 wd 0.0500 time 0.2446 (0.2321) data time 0.0010 (0.0025) model time 0.2436 (0.2296) loss 3.1497 (3.0698) grad_norm 2.8871 (3.1661) loss_scale 2048.0000 (1301.5327) mem 7381MB [2024-08-27 19:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][330/1251] eta 0:03:33 lr 0.000372 wd 0.0500 time 0.2222 (0.2321) data time 0.0013 (0.0025) model time 0.2208 (0.2295) loss 3.3120 (3.0659) grad_norm 2.6117 (3.1532) loss_scale 2048.0000 (1324.0846) mem 7381MB [2024-08-27 19:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][340/1251] eta 0:03:31 lr 0.000372 wd 0.0500 time 0.2338 (0.2320) data time 0.0009 (0.0024) model time 0.2329 (0.2295) loss 3.1958 (3.0675) grad_norm 2.6410 (3.1358) loss_scale 2048.0000 (1345.3138) mem 7381MB [2024-08-27 19:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][350/1251] eta 0:03:28 lr 0.000372 wd 0.0500 time 0.2305 (0.2318) data time 0.0006 (0.0024) model time 0.2298 (0.2294) loss 3.6507 (3.0618) grad_norm 2.9242 (3.1211) loss_scale 2048.0000 (1365.3333) mem 7381MB [2024-08-27 19:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][360/1251] eta 0:03:26 lr 0.000372 wd 0.0500 time 0.2371 (0.2317) data time 0.0009 (0.0023) model time 0.2362 (0.2293) loss 3.3217 (3.0595) grad_norm 2.8261 (3.1254) loss_scale 2048.0000 (1384.2438) mem 7381MB [2024-08-27 19:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][370/1251] eta 0:03:24 lr 0.000372 wd 0.0500 time 0.2242 (0.2316) data time 0.0008 
(0.0023) model time 0.2234 (0.2292) loss 2.6330 (3.0582) grad_norm 2.1193 (3.1091) loss_scale 2048.0000 (1402.1348) mem 7381MB [2024-08-27 19:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][380/1251] eta 0:03:21 lr 0.000372 wd 0.0500 time 0.2307 (0.2316) data time 0.0007 (0.0023) model time 0.2299 (0.2292) loss 2.7444 (3.0574) grad_norm 2.2345 (3.0920) loss_scale 2048.0000 (1419.0866) mem 7381MB [2024-08-27 19:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][390/1251] eta 0:03:19 lr 0.000372 wd 0.0500 time 0.2380 (0.2315) data time 0.0007 (0.0022) model time 0.2374 (0.2292) loss 3.9929 (3.0663) grad_norm 3.3403 (3.0839) loss_scale 2048.0000 (1435.1714) mem 7381MB [2024-08-27 19:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][400/1251] eta 0:03:16 lr 0.000372 wd 0.0500 time 0.2372 (0.2314) data time 0.0009 (0.0022) model time 0.2363 (0.2291) loss 2.4555 (3.0716) grad_norm 2.4337 (3.0813) loss_scale 2048.0000 (1450.4539) mem 7381MB [2024-08-27 19:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][410/1251] eta 0:03:14 lr 0.000372 wd 0.0500 time 0.2234 (0.2314) data time 0.0012 (0.0022) model time 0.2222 (0.2291) loss 2.5790 (3.0653) grad_norm 3.3164 (3.0734) loss_scale 2048.0000 (1464.9927) mem 7381MB [2024-08-27 19:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][420/1251] eta 0:03:12 lr 0.000372 wd 0.0500 time 0.2259 (0.2313) data time 0.0006 (0.0022) model time 0.2253 (0.2291) loss 2.2879 (3.0561) grad_norm 3.0925 (3.0729) loss_scale 2048.0000 (1478.8409) mem 7381MB [2024-08-27 19:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][430/1251] eta 0:03:09 lr 0.000371 wd 0.0500 time 0.2362 (0.2313) data time 0.0008 (0.0021) model time 0.2354 (0.2290) loss 2.7251 (3.0589) grad_norm 2.7569 (3.0641) loss_scale 2048.0000 (1492.0464) mem 7381MB [2024-08-27 19:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [184/300][440/1251] eta 0:03:07 lr 0.000371 wd 0.0500 time 0.2332 (0.2312) data time 0.0009 (0.0021) model time 0.2323 (0.2290) loss 2.9257 (3.0632) grad_norm 8.5806 (3.0940) loss_scale 2048.0000 (1504.6531) mem 7381MB [2024-08-27 19:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][450/1251] eta 0:03:05 lr 0.000371 wd 0.0500 time 0.2215 (0.2311) data time 0.0010 (0.0021) model time 0.2204 (0.2289) loss 3.2544 (3.0629) grad_norm 2.3531 (3.0922) loss_scale 2048.0000 (1516.7007) mem 7381MB [2024-08-27 19:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][460/1251] eta 0:03:02 lr 0.000371 wd 0.0500 time 0.2491 (0.2311) data time 0.0009 (0.0021) model time 0.2482 (0.2289) loss 2.9926 (3.0630) grad_norm 3.2297 (3.0874) loss_scale 2048.0000 (1528.2256) mem 7381MB [2024-08-27 19:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][470/1251] eta 0:03:00 lr 0.000371 wd 0.0500 time 0.2279 (0.2310) data time 0.0007 (0.0020) model time 0.2271 (0.2289) loss 3.1102 (3.0624) grad_norm 2.7427 (3.0789) loss_scale 2048.0000 (1539.2611) mem 7381MB [2024-08-27 19:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][480/1251] eta 0:02:58 lr 0.000371 wd 0.0500 time 0.2344 (0.2310) data time 0.0007 (0.0020) model time 0.2337 (0.2288) loss 3.4908 (3.0677) grad_norm 3.8164 (3.0802) loss_scale 2048.0000 (1549.8378) mem 7381MB [2024-08-27 19:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][490/1251] eta 0:02:55 lr 0.000371 wd 0.0500 time 0.2356 (0.2309) data time 0.0009 (0.0020) model time 0.2347 (0.2287) loss 2.5160 (3.0637) grad_norm 2.6626 (3.0743) loss_scale 2048.0000 (1559.9837) mem 7381MB [2024-08-27 19:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][500/1251] eta 0:02:53 lr 0.000371 wd 0.0500 time 0.2427 (0.2309) data time 0.0008 (0.0020) model time 0.2420 (0.2288) loss 3.1805 (3.0603) grad_norm 2.1328 (3.0723) loss_scale 
2048.0000 (1569.7246) mem 7381MB [2024-08-27 19:54:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][510/1251] eta 0:02:51 lr 0.000371 wd 0.0500 time 0.2382 (0.2308) data time 0.0010 (0.0020) model time 0.2372 (0.2287) loss 1.9385 (3.0657) grad_norm 4.0190 (3.0657) loss_scale 2048.0000 (1579.0841) mem 7381MB [2024-08-27 19:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][520/1251] eta 0:02:48 lr 0.000371 wd 0.0500 time 0.2266 (0.2307) data time 0.0008 (0.0020) model time 0.2257 (0.2287) loss 3.3358 (3.0674) grad_norm 2.1022 (3.0581) loss_scale 2048.0000 (1588.0845) mem 7381MB [2024-08-27 19:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][530/1251] eta 0:02:46 lr 0.000371 wd 0.0500 time 0.2469 (0.2308) data time 0.0012 (0.0019) model time 0.2457 (0.2287) loss 3.4053 (3.0713) grad_norm 2.2202 (3.0553) loss_scale 2048.0000 (1596.7458) mem 7381MB [2024-08-27 19:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][540/1251] eta 0:02:44 lr 0.000371 wd 0.0500 time 0.2281 (0.2307) data time 0.0013 (0.0019) model time 0.2268 (0.2286) loss 3.4480 (3.0674) grad_norm 3.3011 (3.0556) loss_scale 2048.0000 (1605.0869) mem 7381MB [2024-08-27 19:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][550/1251] eta 0:02:41 lr 0.000371 wd 0.0500 time 0.2289 (0.2307) data time 0.0009 (0.0019) model time 0.2280 (0.2286) loss 2.8939 (3.0664) grad_norm 2.0816 (3.0458) loss_scale 2048.0000 (1613.1252) mem 7381MB [2024-08-27 19:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][560/1251] eta 0:02:39 lr 0.000371 wd 0.0500 time 0.2490 (0.2306) data time 0.0012 (0.0019) model time 0.2478 (0.2286) loss 2.5617 (3.0632) grad_norm 3.5770 (3.0389) loss_scale 2048.0000 (1620.8770) mem 7381MB [2024-08-27 19:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][570/1251] eta 0:02:37 lr 0.000371 wd 0.0500 time 0.2401 (0.2306) data time 
0.0010 (0.0019) model time 0.2391 (0.2286) loss 3.3053 (3.0619) grad_norm 2.9745 (3.0322) loss_scale 2048.0000 (1628.3573) mem 7381MB [2024-08-27 19:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][580/1251] eta 0:02:34 lr 0.000371 wd 0.0500 time 0.2349 (0.2306) data time 0.0011 (0.0019) model time 0.2339 (0.2286) loss 2.9437 (3.0624) grad_norm 2.3584 (3.0249) loss_scale 2048.0000 (1635.5800) mem 7381MB [2024-08-27 19:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][590/1251] eta 0:02:32 lr 0.000371 wd 0.0500 time 0.2278 (0.2306) data time 0.0010 (0.0018) model time 0.2267 (0.2286) loss 2.4620 (3.0643) grad_norm 2.6621 (3.0234) loss_scale 2048.0000 (1642.5584) mem 7381MB [2024-08-27 19:55:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][600/1251] eta 0:02:30 lr 0.000371 wd 0.0500 time 0.2278 (0.2306) data time 0.0007 (0.0018) model time 0.2271 (0.2286) loss 3.3458 (3.0658) grad_norm 3.4655 (3.0245) loss_scale 2048.0000 (1649.3045) mem 7381MB [2024-08-27 19:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][610/1251] eta 0:02:27 lr 0.000371 wd 0.0500 time 0.2314 (0.2305) data time 0.0011 (0.0018) model time 0.2303 (0.2286) loss 3.2713 (3.0664) grad_norm 4.1896 (3.0371) loss_scale 2048.0000 (1655.8298) mem 7381MB [2024-08-27 19:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][620/1251] eta 0:02:25 lr 0.000371 wd 0.0500 time 0.2441 (0.2305) data time 0.0011 (0.0018) model time 0.2431 (0.2286) loss 2.7553 (3.0647) grad_norm 3.5527 (3.0516) loss_scale 2048.0000 (1662.1449) mem 7381MB [2024-08-27 19:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][630/1251] eta 0:02:23 lr 0.000371 wd 0.0500 time 0.2396 (0.2305) data time 0.0008 (0.0018) model time 0.2388 (0.2286) loss 2.3901 (3.0621) grad_norm 2.6840 (inf) loss_scale 1024.0000 (1655.2773) mem 7381MB [2024-08-27 19:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [184/300][640/1251] eta 0:02:20 lr 0.000371 wd 0.0500 time 0.2537 (0.2305) data time 0.0008 (0.0018) model time 0.2529 (0.2286) loss 3.8639 (3.0680) grad_norm 2.4730 (inf) loss_scale 1024.0000 (1645.4290) mem 7381MB [2024-08-27 19:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][650/1251] eta 0:02:18 lr 0.000371 wd 0.0500 time 0.2295 (0.2305) data time 0.0009 (0.0018) model time 0.2286 (0.2286) loss 3.3876 (3.0744) grad_norm 3.5518 (inf) loss_scale 1024.0000 (1635.8833) mem 7381MB [2024-08-27 19:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][660/1251] eta 0:02:16 lr 0.000370 wd 0.0500 time 0.2393 (0.2305) data time 0.0010 (0.0018) model time 0.2383 (0.2286) loss 2.9077 (3.0738) grad_norm 3.5743 (inf) loss_scale 1024.0000 (1626.6263) mem 7381MB [2024-08-27 19:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][670/1251] eta 0:02:13 lr 0.000370 wd 0.0500 time 0.2319 (0.2304) data time 0.0009 (0.0018) model time 0.2311 (0.2285) loss 3.4182 (3.0769) grad_norm 1.9065 (inf) loss_scale 1024.0000 (1617.6453) mem 7381MB [2024-08-27 19:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][680/1251] eta 0:02:11 lr 0.000370 wd 0.0500 time 0.2226 (0.2304) data time 0.0009 (0.0018) model time 0.2217 (0.2285) loss 3.5644 (3.0787) grad_norm 4.5339 (inf) loss_scale 1024.0000 (1608.9280) mem 7381MB [2024-08-27 19:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][690/1251] eta 0:02:09 lr 0.000370 wd 0.0500 time 0.2283 (0.2304) data time 0.0011 (0.0017) model time 0.2272 (0.2285) loss 3.2456 (3.0775) grad_norm 4.3719 (inf) loss_scale 1024.0000 (1600.4631) mem 7381MB [2024-08-27 19:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][700/1251] eta 0:02:06 lr 0.000370 wd 0.0500 time 0.2328 (0.2304) data time 0.0010 (0.0017) model time 0.2318 (0.2286) loss 2.8448 (3.0753) grad_norm 4.1259 (inf) loss_scale 1024.0000 (1592.2397) 
mem 7381MB [2024-08-27 19:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][710/1251] eta 0:02:04 lr 0.000370 wd 0.0500 time 0.2291 (0.2304) data time 0.0008 (0.0017) model time 0.2283 (0.2286) loss 2.8594 (3.0761) grad_norm 2.7886 (inf) loss_scale 1024.0000 (1584.2475) mem 7381MB [2024-08-27 19:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][720/1251] eta 0:02:02 lr 0.000370 wd 0.0500 time 0.2262 (0.2304) data time 0.0011 (0.0017) model time 0.2251 (0.2286) loss 3.3858 (3.0737) grad_norm 3.4601 (inf) loss_scale 1024.0000 (1576.4771) mem 7381MB [2024-08-27 19:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][730/1251] eta 0:02:00 lr 0.000370 wd 0.0500 time 0.2483 (0.2304) data time 0.0010 (0.0017) model time 0.2473 (0.2286) loss 3.1119 (3.0742) grad_norm 3.0493 (inf) loss_scale 1024.0000 (1568.9193) mem 7381MB [2024-08-27 19:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][740/1251] eta 0:01:57 lr 0.000370 wd 0.0500 time 0.2446 (0.2304) data time 0.0007 (0.0017) model time 0.2439 (0.2286) loss 3.9742 (3.0775) grad_norm 2.6041 (inf) loss_scale 1024.0000 (1561.5655) mem 7381MB [2024-08-27 19:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][750/1251] eta 0:01:55 lr 0.000370 wd 0.0500 time 0.2489 (0.2304) data time 0.0011 (0.0017) model time 0.2478 (0.2286) loss 2.5010 (3.0811) grad_norm 2.1536 (inf) loss_scale 1024.0000 (1554.4075) mem 7381MB [2024-08-27 19:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][760/1251] eta 0:01:53 lr 0.000370 wd 0.0500 time 0.2356 (0.2304) data time 0.0010 (0.0017) model time 0.2346 (0.2286) loss 3.2985 (3.0835) grad_norm 2.8560 (inf) loss_scale 1024.0000 (1547.4376) mem 7381MB [2024-08-27 19:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][770/1251] eta 0:01:50 lr 0.000370 wd 0.0500 time 0.2338 (0.2304) data time 0.0007 (0.0017) model time 0.2331 
(0.2286) loss 3.4246 (3.0849) grad_norm 6.1910 (inf) loss_scale 1024.0000 (1540.6485) mem 7381MB [2024-08-27 19:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][780/1251] eta 0:01:48 lr 0.000370 wd 0.0500 time 0.2302 (0.2304) data time 0.0007 (0.0017) model time 0.2295 (0.2286) loss 3.7660 (3.0817) grad_norm 2.4657 (inf) loss_scale 1024.0000 (1534.0333) mem 7381MB [2024-08-27 19:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][790/1251] eta 0:01:46 lr 0.000370 wd 0.0500 time 0.2349 (0.2303) data time 0.0008 (0.0017) model time 0.2340 (0.2286) loss 2.9242 (3.0771) grad_norm 2.8153 (inf) loss_scale 1024.0000 (1527.5853) mem 7381MB [2024-08-27 19:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][800/1251] eta 0:01:43 lr 0.000370 wd 0.0500 time 0.4323 (0.2306) data time 0.0011 (0.0017) model time 0.4312 (0.2288) loss 3.8157 (3.0792) grad_norm 3.1178 (inf) loss_scale 1024.0000 (1521.2984) mem 7381MB [2024-08-27 19:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][810/1251] eta 0:01:41 lr 0.000370 wd 0.0500 time 0.2411 (0.2306) data time 0.0010 (0.0017) model time 0.2401 (0.2289) loss 2.9757 (3.0781) grad_norm 2.9594 (inf) loss_scale 1024.0000 (1515.1665) mem 7381MB [2024-08-27 19:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][820/1251] eta 0:01:39 lr 0.000370 wd 0.0500 time 0.2303 (0.2306) data time 0.0014 (0.0017) model time 0.2289 (0.2289) loss 3.5283 (3.0819) grad_norm 3.0412 (inf) loss_scale 1024.0000 (1509.1839) mem 7381MB [2024-08-27 19:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][830/1251] eta 0:01:37 lr 0.000370 wd 0.0500 time 0.2403 (0.2306) data time 0.0008 (0.0017) model time 0.2394 (0.2289) loss 3.3599 (3.0845) grad_norm 2.4279 (inf) loss_scale 1024.0000 (1503.3454) mem 7381MB [2024-08-27 19:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][840/1251] eta 0:01:34 lr 
0.000370 wd 0.0500 time 0.2361 (0.2306) data time 0.0009 (0.0016) model time 0.2352 (0.2289) loss 3.4962 (3.0866) grad_norm 4.0003 (inf) loss_scale 1024.0000 (1497.6457) mem 7381MB [2024-08-27 19:55:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][850/1251] eta 0:01:32 lr 0.000370 wd 0.0500 time 0.2666 (0.2306) data time 0.0010 (0.0016) model time 0.2656 (0.2289) loss 2.4105 (3.0868) grad_norm 2.2418 (inf) loss_scale 1024.0000 (1492.0799) mem 7381MB [2024-08-27 19:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][860/1251] eta 0:01:30 lr 0.000370 wd 0.0500 time 0.2290 (0.2306) data time 0.0007 (0.0016) model time 0.2282 (0.2289) loss 3.7178 (3.0874) grad_norm 3.0816 (inf) loss_scale 1024.0000 (1486.6434) mem 7381MB [2024-08-27 19:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][870/1251] eta 0:01:27 lr 0.000370 wd 0.0500 time 0.2445 (0.2306) data time 0.0007 (0.0016) model time 0.2437 (0.2289) loss 3.3727 (3.0903) grad_norm 3.2931 (inf) loss_scale 1024.0000 (1481.3318) mem 7381MB [2024-08-27 19:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][880/1251] eta 0:01:25 lr 0.000370 wd 0.0500 time 0.2433 (0.2306) data time 0.0009 (0.0016) model time 0.2424 (0.2289) loss 3.3012 (3.0928) grad_norm 2.3661 (inf) loss_scale 1024.0000 (1476.1407) mem 7381MB [2024-08-27 19:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][890/1251] eta 0:01:23 lr 0.000370 wd 0.0500 time 0.2374 (0.2306) data time 0.0008 (0.0016) model time 0.2365 (0.2289) loss 3.4026 (3.0920) grad_norm 4.1512 (inf) loss_scale 1024.0000 (1471.0662) mem 7381MB [2024-08-27 19:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][900/1251] eta 0:01:20 lr 0.000369 wd 0.0500 time 0.2458 (0.2307) data time 0.0008 (0.0016) model time 0.2449 (0.2290) loss 1.7238 (3.0907) grad_norm 2.1690 (inf) loss_scale 1024.0000 (1466.1043) mem 7381MB [2024-08-27 19:56:13 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][910/1251] eta 0:01:18 lr 0.000369 wd 0.0500 time 0.2473 (0.2307) data time 0.0010 (0.0016) model time 0.2463 (0.2290) loss 3.4898 (3.0916) grad_norm 2.7925 (inf) loss_scale 1024.0000 (1461.2514) mem 7381MB [2024-08-27 19:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][920/1251] eta 0:01:16 lr 0.000369 wd 0.0500 time 0.2286 (0.2307) data time 0.0008 (0.0016) model time 0.2278 (0.2290) loss 3.2877 (3.0909) grad_norm 2.8034 (inf) loss_scale 1024.0000 (1456.5038) mem 7381MB [2024-08-27 19:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][930/1251] eta 0:01:14 lr 0.000369 wd 0.0500 time 0.2243 (0.2306) data time 0.0007 (0.0016) model time 0.2235 (0.2290) loss 3.8948 (3.0935) grad_norm 2.7610 (inf) loss_scale 1024.0000 (1451.8582) mem 7381MB [2024-08-27 19:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][940/1251] eta 0:01:11 lr 0.000369 wd 0.0500 time 0.2268 (0.2306) data time 0.0010 (0.0016) model time 0.2258 (0.2290) loss 3.2107 (3.0920) grad_norm 3.8768 (inf) loss_scale 1024.0000 (1447.3114) mem 7381MB [2024-08-27 19:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][950/1251] eta 0:01:09 lr 0.000369 wd 0.0500 time 0.2273 (0.2306) data time 0.0006 (0.0016) model time 0.2267 (0.2289) loss 2.7324 (3.0875) grad_norm 3.4597 (inf) loss_scale 1024.0000 (1442.8601) mem 7381MB [2024-08-27 19:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][960/1251] eta 0:01:07 lr 0.000369 wd 0.0500 time 0.2237 (0.2306) data time 0.0006 (0.0016) model time 0.2231 (0.2289) loss 3.5116 (3.0862) grad_norm 2.7270 (inf) loss_scale 1024.0000 (1438.5016) mem 7381MB [2024-08-27 19:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][970/1251] eta 0:01:04 lr 0.000369 wd 0.0500 time 0.2467 (0.2306) data time 0.0010 (0.0016) model time 0.2457 (0.2289) loss 3.2131 (3.0858) 
grad_norm 2.5556 (inf) loss_scale 1024.0000 (1434.2327) mem 7381MB [2024-08-27 19:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][980/1251] eta 0:01:02 lr 0.000369 wd 0.0500 time 0.2299 (0.2306) data time 0.0009 (0.0016) model time 0.2290 (0.2289) loss 2.9066 (3.0882) grad_norm 2.0501 (inf) loss_scale 1024.0000 (1430.0510) mem 7381MB [2024-08-27 19:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][990/1251] eta 0:01:00 lr 0.000369 wd 0.0500 time 0.2236 (0.2306) data time 0.0009 (0.0016) model time 0.2227 (0.2289) loss 4.0978 (3.0904) grad_norm 2.3285 (inf) loss_scale 1024.0000 (1425.9536) mem 7381MB [2024-08-27 19:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1000/1251] eta 0:00:57 lr 0.000369 wd 0.0500 time 0.2387 (0.2306) data time 0.0009 (0.0016) model time 0.2378 (0.2289) loss 3.3435 (3.0888) grad_norm 2.2234 (inf) loss_scale 1024.0000 (1421.9381) mem 7381MB [2024-08-27 19:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1010/1251] eta 0:00:55 lr 0.000369 wd 0.0500 time 0.2313 (0.2305) data time 0.0007 (0.0016) model time 0.2306 (0.2289) loss 2.1638 (3.0889) grad_norm 2.7551 (inf) loss_scale 1024.0000 (1418.0020) mem 7381MB [2024-08-27 19:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1020/1251] eta 0:00:53 lr 0.000369 wd 0.0500 time 0.2241 (0.2305) data time 0.0009 (0.0016) model time 0.2231 (0.2289) loss 3.7507 (3.0876) grad_norm 3.5969 (inf) loss_scale 1024.0000 (1414.1430) mem 7381MB [2024-08-27 19:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1030/1251] eta 0:00:50 lr 0.000369 wd 0.0500 time 0.2328 (0.2305) data time 0.0011 (0.0016) model time 0.2317 (0.2289) loss 3.3370 (3.0847) grad_norm 2.9399 (inf) loss_scale 1024.0000 (1410.3589) mem 7381MB [2024-08-27 19:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1040/1251] eta 0:00:48 lr 0.000369 wd 0.0500 time 
0.2272 (0.2305) data time 0.0010 (0.0016) model time 0.2263 (0.2288) loss 3.1215 (3.0851) grad_norm 4.5310 (inf) loss_scale 1024.0000 (1406.6475) mem 7381MB [2024-08-27 19:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1050/1251] eta 0:00:46 lr 0.000369 wd 0.0500 time 0.2288 (0.2305) data time 0.0006 (0.0015) model time 0.2281 (0.2289) loss 4.0744 (3.0842) grad_norm 3.2187 (inf) loss_scale 1024.0000 (1403.0067) mem 7381MB [2024-08-27 19:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1060/1251] eta 0:00:44 lr 0.000369 wd 0.0500 time 0.2552 (0.2305) data time 0.0009 (0.0015) model time 0.2543 (0.2289) loss 3.6726 (3.0862) grad_norm 2.5679 (inf) loss_scale 1024.0000 (1399.4345) mem 7381MB [2024-08-27 19:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1070/1251] eta 0:00:41 lr 0.000369 wd 0.0500 time 0.2333 (0.2305) data time 0.0009 (0.0015) model time 0.2324 (0.2289) loss 3.8380 (3.0884) grad_norm 2.8135 (inf) loss_scale 1024.0000 (1395.9290) mem 7381MB [2024-08-27 19:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1080/1251] eta 0:00:39 lr 0.000369 wd 0.0500 time 0.2329 (0.2305) data time 0.0008 (0.0015) model time 0.2321 (0.2289) loss 3.1743 (3.0894) grad_norm 1.6844 (inf) loss_scale 1024.0000 (1392.4884) mem 7381MB [2024-08-27 19:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1090/1251] eta 0:00:37 lr 0.000369 wd 0.0500 time 0.2252 (0.2305) data time 0.0008 (0.0015) model time 0.2244 (0.2289) loss 3.9925 (3.0895) grad_norm 3.3563 (inf) loss_scale 1024.0000 (1389.1109) mem 7381MB [2024-08-27 19:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1100/1251] eta 0:00:34 lr 0.000369 wd 0.0500 time 0.2254 (0.2305) data time 0.0007 (0.0015) model time 0.2246 (0.2289) loss 3.0735 (3.0908) grad_norm 2.1720 (inf) loss_scale 1024.0000 (1385.7947) mem 7381MB [2024-08-27 19:56:59 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [184/300][1110/1251] eta 0:00:32 lr 0.000369 wd 0.0500 time 0.2241 (0.2304) data time 0.0008 (0.0015) model time 0.2233 (0.2289) loss 2.5053 (3.0889) grad_norm 2.2962 (inf) loss_scale 1024.0000 (1382.5383) mem 7381MB [2024-08-27 19:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1120/1251] eta 0:00:30 lr 0.000369 wd 0.0500 time 0.2350 (0.2304) data time 0.0009 (0.0015) model time 0.2341 (0.2288) loss 3.1499 (3.0860) grad_norm 2.4810 (inf) loss_scale 1024.0000 (1379.3399) mem 7381MB [2024-08-27 19:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1130/1251] eta 0:00:27 lr 0.000368 wd 0.0500 time 0.2397 (0.2304) data time 0.0007 (0.0015) model time 0.2391 (0.2288) loss 2.1142 (3.0837) grad_norm 3.4129 (inf) loss_scale 1024.0000 (1376.1981) mem 7381MB [2024-08-27 19:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1140/1251] eta 0:00:25 lr 0.000368 wd 0.0500 time 0.2536 (0.2304) data time 0.0008 (0.0015) model time 0.2529 (0.2288) loss 2.6324 (3.0843) grad_norm 2.0949 (inf) loss_scale 1024.0000 (1373.1113) mem 7381MB [2024-08-27 19:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1150/1251] eta 0:00:23 lr 0.000368 wd 0.0500 time 0.2275 (0.2306) data time 0.0011 (0.0015) model time 0.2263 (0.2290) loss 3.1653 (3.0849) grad_norm 2.3717 (inf) loss_scale 1024.0000 (1370.0782) mem 7381MB [2024-08-27 19:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1160/1251] eta 0:00:20 lr 0.000368 wd 0.0500 time 0.2247 (0.2306) data time 0.0007 (0.0015) model time 0.2240 (0.2290) loss 3.3507 (3.0862) grad_norm 3.8495 (inf) loss_scale 1024.0000 (1367.0973) mem 7381MB [2024-08-27 19:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1170/1251] eta 0:00:18 lr 0.000368 wd 0.0500 time 0.2261 (0.2305) data time 0.0008 (0.0015) model time 0.2254 (0.2290) loss 2.9801 (3.0883) grad_norm 2.5939 (inf) 
loss_scale 1024.0000 (1364.1674) mem 7381MB [2024-08-27 19:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1180/1251] eta 0:00:16 lr 0.000368 wd 0.0500 time 0.2205 (0.2305) data time 0.0007 (0.0015) model time 0.2199 (0.2290) loss 3.1910 (3.0877) grad_norm 3.0667 (inf) loss_scale 1024.0000 (1361.2870) mem 7381MB [2024-08-27 19:57:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1190/1251] eta 0:00:14 lr 0.000368 wd 0.0500 time 0.2248 (0.2305) data time 0.0011 (0.0015) model time 0.2238 (0.2289) loss 2.7983 (3.0865) grad_norm 2.1979 (inf) loss_scale 1024.0000 (1358.4551) mem 7381MB [2024-08-27 19:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1200/1251] eta 0:00:11 lr 0.000368 wd 0.0500 time 0.2416 (0.2305) data time 0.0006 (0.0015) model time 0.2410 (0.2290) loss 2.4034 (3.0876) grad_norm 3.6734 (inf) loss_scale 1024.0000 (1355.6703) mem 7381MB [2024-08-27 19:57:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1210/1251] eta 0:00:09 lr 0.000368 wd 0.0500 time 0.2266 (0.2305) data time 0.0007 (0.0015) model time 0.2259 (0.2290) loss 1.9255 (3.0869) grad_norm 2.1360 (inf) loss_scale 1024.0000 (1352.9315) mem 7381MB [2024-08-27 19:57:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1220/1251] eta 0:00:07 lr 0.000368 wd 0.0500 time 0.2207 (0.2305) data time 0.0012 (0.0015) model time 0.2195 (0.2290) loss 3.4325 (3.0853) grad_norm 2.9155 (inf) loss_scale 1024.0000 (1350.2375) mem 7381MB [2024-08-27 19:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1230/1251] eta 0:00:04 lr 0.000368 wd 0.0500 time 0.2353 (0.2305) data time 0.0009 (0.0015) model time 0.2344 (0.2290) loss 3.7314 (3.0866) grad_norm 2.6350 (inf) loss_scale 1024.0000 (1347.5873) mem 7381MB [2024-08-27 19:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1240/1251] eta 0:00:02 lr 0.000368 wd 0.0500 time 0.2106 (0.2304) data time 
0.0005 (0.0015) model time 0.2101 (0.2289) loss 3.4787 (3.0860) grad_norm 2.3712 (inf) loss_scale 1024.0000 (1344.9799) mem 7381MB [2024-08-27 19:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [184/300][1250/1251] eta 0:00:00 lr 0.000368 wd 0.0500 time 0.2105 (0.2303) data time 0.0004 (0.0015) model time 0.2101 (0.2288) loss 1.9311 (3.0846) grad_norm 2.3778 (inf) loss_scale 1024.0000 (1342.4141) mem 7381MB [2024-08-27 19:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 184 training takes 0:04:48 [2024-08-27 19:57:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:57:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 19:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.532 (0.532) Loss 0.4272 (0.4272) Acc@1 91.699 (91.699) Acc@5 98.047 (98.047) Mem 7381MB [2024-08-27 19:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.122) Loss 0.6870 (0.6750) Acc@1 85.449 (85.369) Acc@5 96.973 (97.124) Mem 7381MB [2024-08-27 19:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.103) Loss 1.0039 (0.7040) Acc@1 76.367 (84.445) Acc@5 94.629 (97.084) Mem 7381MB [2024-08-27 19:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.095) Loss 1.1572 (0.8050) Acc@1 72.363 (82.000) Acc@5 91.797 (96.031) Mem 7381MB [2024-08-27 19:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.1211 (0.8596) Acc@1 73.535 (80.616) Acc@5 93.262 (95.463) Mem 7381MB [2024-08-27 19:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.214 Acc@5 95.416 [2024-08-27 19:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.2% [2024-08-27 19:57:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.938 (0.938) Loss 0.3918 (0.3918) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7381MB [2024-08-27 19:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.165) Loss 0.6084 (0.6207) Acc@1 88.184 (86.790) Acc@5 97.363 (97.505) Mem 7381MB [2024-08-27 19:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.125) Loss 0.8760 (0.6446) Acc@1 78.711 (85.886) Acc@5 96.094 (97.512) Mem 7381MB [2024-08-27 19:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.110) Loss 1.1133 (0.7306) Acc@1 73.535 (83.821) Acc@5 93.164 (96.623) Mem 7381MB [2024-08-27 19:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.099) Loss 0.9951 (0.7741) Acc@1 75.391 (82.536) Acc@5 94.141 (96.160) Mem 7381MB [2024-08-27 19:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.066 Acc@5 96.150 [2024-08-27 19:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-08-27 19:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.07% [2024-08-27 19:57:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 19:57:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 19:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][0/1251] eta 0:17:21 lr 0.000368 wd 0.0500 time 0.8323 (0.8323) data time 0.6075 (0.6075) model time 0.0000 (0.0000) loss 2.8361 (2.8361) grad_norm 2.1955 (2.1955) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][10/1251] eta 0:05:53 lr 0.000368 wd 0.0500 time 0.2435 (0.2848) data time 0.0007 (0.0563) model time 0.0000 (0.0000) loss 3.5170 (3.3189) grad_norm 3.4460 (2.7880) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][20/1251] eta 0:05:17 lr 0.000368 wd 0.0500 time 0.2417 (0.2578) data time 0.0008 (0.0300) model time 0.0000 (0.0000) loss 2.3168 (3.2113) grad_norm 2.8631 (4.8312) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][30/1251] eta 0:05:03 lr 0.000368 wd 0.0500 time 0.2240 (0.2482) data time 0.0014 (0.0206) model time 0.0000 (0.0000) loss 3.2065 (3.1909) grad_norm 2.4935 (4.1613) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][40/1251] eta 0:04:54 lr 0.000368 wd 0.0500 time 0.2257 (0.2435) data time 0.0010 (0.0159) model time 0.0000 (0.0000) loss 3.1644 (3.1849) grad_norm 2.4914 (3.7942) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][50/1251] eta 0:04:49 lr 0.000368 wd 0.0500 time 0.2372 (0.2407) data time 0.0009 (0.0130) model time 0.0000 (0.0000) loss 3.0949 (3.1950) grad_norm 2.5862 (3.5617) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][60/1251] eta 0:04:44 lr 0.000368 wd 0.0500 time 0.2370 (0.2388) data time 0.0008 (0.0110) model time 0.2362 
(0.2285) loss 2.8581 (3.1455) grad_norm 2.1858 (3.4127) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][70/1251] eta 0:04:41 lr 0.000368 wd 0.0500 time 0.2327 (0.2380) data time 0.0006 (0.0096) model time 0.2321 (0.2301) loss 2.3525 (3.1026) grad_norm 3.5317 (3.2955) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][80/1251] eta 0:04:37 lr 0.000368 wd 0.0500 time 0.2399 (0.2371) data time 0.0007 (0.0086) model time 0.2392 (0.2299) loss 3.5283 (3.0975) grad_norm 2.1535 (3.2130) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][90/1251] eta 0:04:34 lr 0.000368 wd 0.0500 time 0.2350 (0.2366) data time 0.0008 (0.0077) model time 0.2342 (0.2302) loss 3.2288 (3.0797) grad_norm 2.6712 (3.2503) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][100/1251] eta 0:04:31 lr 0.000368 wd 0.0500 time 0.2510 (0.2360) data time 0.0009 (0.0071) model time 0.2501 (0.2302) loss 3.5154 (3.0939) grad_norm 2.8197 (3.2029) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][110/1251] eta 0:04:28 lr 0.000367 wd 0.0500 time 0.2277 (0.2353) data time 0.0008 (0.0066) model time 0.2269 (0.2296) loss 3.1802 (3.0726) grad_norm 3.9844 (3.2241) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][120/1251] eta 0:04:25 lr 0.000367 wd 0.0500 time 0.2330 (0.2349) data time 0.0010 (0.0061) model time 0.2320 (0.2295) loss 3.4155 (3.0761) grad_norm 3.3795 (3.2184) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][130/1251] 
eta 0:04:22 lr 0.000367 wd 0.0500 time 0.2262 (0.2345) data time 0.0010 (0.0057) model time 0.2252 (0.2294) loss 1.7974 (3.0574) grad_norm 10.1113 (3.2420) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][140/1251] eta 0:04:20 lr 0.000367 wd 0.0500 time 0.2272 (0.2341) data time 0.0009 (0.0054) model time 0.2263 (0.2293) loss 3.4484 (3.0375) grad_norm 2.6004 (3.2462) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][150/1251] eta 0:04:17 lr 0.000367 wd 0.0500 time 0.2333 (0.2338) data time 0.0009 (0.0052) model time 0.2324 (0.2291) loss 3.2672 (3.0316) grad_norm 2.5378 (3.2188) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][160/1251] eta 0:04:14 lr 0.000367 wd 0.0500 time 0.2487 (0.2335) data time 0.0007 (0.0049) model time 0.2480 (0.2290) loss 3.6527 (3.0346) grad_norm 5.4271 (3.2098) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][170/1251] eta 0:04:12 lr 0.000367 wd 0.0500 time 0.2239 (0.2333) data time 0.0009 (0.0048) model time 0.2230 (0.2289) loss 3.0319 (3.0143) grad_norm 2.1358 (3.1783) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][180/1251] eta 0:04:09 lr 0.000367 wd 0.0500 time 0.2528 (0.2332) data time 0.0008 (0.0046) model time 0.2520 (0.2290) loss 3.3388 (3.0206) grad_norm 2.3508 (3.1484) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][190/1251] eta 0:04:07 lr 0.000367 wd 0.0500 time 0.2271 (0.2329) data time 0.0009 (0.0044) model time 0.2262 (0.2289) loss 2.7106 (3.0202) grad_norm 4.0879 (3.1308) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 19:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][200/1251] eta 0:04:04 lr 0.000367 wd 0.0500 time 0.2272 (0.2327) data time 0.0007 (0.0042) model time 0.2265 (0.2288) loss 3.7089 (3.0075) grad_norm 3.4313 (3.1551) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][210/1251] eta 0:04:02 lr 0.000367 wd 0.0500 time 0.2264 (0.2325) data time 0.0011 (0.0041) model time 0.2253 (0.2287) loss 2.4707 (3.0096) grad_norm 2.7059 (3.1494) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][220/1251] eta 0:03:59 lr 0.000367 wd 0.0500 time 0.2314 (0.2325) data time 0.0007 (0.0039) model time 0.2307 (0.2288) loss 3.6121 (3.0208) grad_norm 2.4166 (3.1279) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][230/1251] eta 0:03:57 lr 0.000367 wd 0.0500 time 0.2411 (0.2323) data time 0.0008 (0.0038) model time 0.2403 (0.2287) loss 2.6938 (3.0304) grad_norm 2.4119 (3.1946) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][240/1251] eta 0:03:54 lr 0.000367 wd 0.0500 time 0.2333 (0.2322) data time 0.0009 (0.0037) model time 0.2324 (0.2287) loss 3.4950 (3.0331) grad_norm 3.2078 (3.1849) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][250/1251] eta 0:03:52 lr 0.000367 wd 0.0500 time 0.2348 (0.2321) data time 0.0007 (0.0036) model time 0.2341 (0.2287) loss 2.1965 (3.0413) grad_norm 2.8770 (3.1785) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][260/1251] eta 0:03:49 lr 0.000367 wd 0.0500 time 0.2253 (0.2320) data time 0.0010 (0.0035) model time 0.2243 
(0.2287) loss 2.9227 (3.0438) grad_norm 1.8844 (3.1769) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][270/1251] eta 0:03:48 lr 0.000367 wd 0.0500 time 0.2270 (0.2326) data time 0.0007 (0.0034) model time 0.2263 (0.2296) loss 2.6623 (3.0464) grad_norm 2.9537 (3.1568) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][280/1251] eta 0:03:46 lr 0.000367 wd 0.0500 time 0.2240 (0.2332) data time 0.0009 (0.0033) model time 0.2231 (0.2304) loss 3.9039 (3.0435) grad_norm 2.6757 (3.1385) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][290/1251] eta 0:03:43 lr 0.000367 wd 0.0500 time 0.2292 (0.2331) data time 0.0006 (0.0033) model time 0.2285 (0.2302) loss 2.9309 (3.0485) grad_norm 2.4501 (3.1100) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][300/1251] eta 0:03:41 lr 0.000367 wd 0.0500 time 0.2366 (0.2331) data time 0.0007 (0.0032) model time 0.2359 (0.2304) loss 3.3232 (3.0453) grad_norm 2.9641 (3.0987) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][310/1251] eta 0:03:39 lr 0.000367 wd 0.0500 time 0.2317 (0.2330) data time 0.0008 (0.0031) model time 0.2309 (0.2303) loss 3.2566 (3.0494) grad_norm 3.5960 (3.0910) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][320/1251] eta 0:03:36 lr 0.000367 wd 0.0500 time 0.2245 (0.2329) data time 0.0009 (0.0030) model time 0.2236 (0.2303) loss 3.2815 (3.0481) grad_norm 3.2793 (3.0978) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[185/300][330/1251] eta 0:03:34 lr 0.000367 wd 0.0500 time 0.2293 (0.2328) data time 0.0009 (0.0030) model time 0.2284 (0.2302) loss 3.2955 (3.0450) grad_norm 3.5347 (3.0813) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][340/1251] eta 0:03:31 lr 0.000367 wd 0.0500 time 0.2223 (0.2327) data time 0.0009 (0.0029) model time 0.2214 (0.2301) loss 3.7844 (3.0486) grad_norm 2.2148 (3.0674) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][350/1251] eta 0:03:29 lr 0.000366 wd 0.0500 time 0.2428 (0.2326) data time 0.0009 (0.0029) model time 0.2419 (0.2301) loss 3.5527 (3.0563) grad_norm 4.7351 (3.0706) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][360/1251] eta 0:03:27 lr 0.000366 wd 0.0500 time 0.2260 (0.2325) data time 0.0007 (0.0028) model time 0.2253 (0.2300) loss 3.3430 (3.0604) grad_norm 5.0166 (3.0976) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][370/1251] eta 0:03:24 lr 0.000366 wd 0.0500 time 0.2307 (0.2325) data time 0.0013 (0.0028) model time 0.2295 (0.2300) loss 3.4432 (3.0627) grad_norm 2.2805 (3.0828) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][380/1251] eta 0:03:22 lr 0.000366 wd 0.0500 time 0.2293 (0.2324) data time 0.0006 (0.0027) model time 0.2286 (0.2299) loss 3.2137 (3.0675) grad_norm 1.8602 (3.0694) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][390/1251] eta 0:03:20 lr 0.000366 wd 0.0500 time 0.2281 (0.2323) data time 0.0008 (0.0027) model time 0.2273 (0.2299) loss 2.4474 (3.0676) grad_norm 2.8411 (3.0616) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 19:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][400/1251] eta 0:03:17 lr 0.000366 wd 0.0500 time 0.2235 (0.2323) data time 0.0013 (0.0027) model time 0.2223 (0.2299) loss 3.2571 (3.0694) grad_norm 2.4209 (3.0582) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][410/1251] eta 0:03:15 lr 0.000366 wd 0.0500 time 0.2331 (0.2322) data time 0.0009 (0.0026) model time 0.2322 (0.2298) loss 3.0414 (3.0739) grad_norm 3.3242 (3.0566) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][420/1251] eta 0:03:12 lr 0.000366 wd 0.0500 time 0.2218 (0.2321) data time 0.0009 (0.0026) model time 0.2209 (0.2297) loss 2.7179 (3.0685) grad_norm 2.7129 (3.0454) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][430/1251] eta 0:03:10 lr 0.000366 wd 0.0500 time 0.2358 (0.2320) data time 0.0009 (0.0026) model time 0.2349 (0.2297) loss 3.3597 (3.0717) grad_norm 2.4625 (3.0414) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 19:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 19:59:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 19:59:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 20:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:01:22 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 20:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 20:01:32 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 20:01:34 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 20:01:35 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 20:01:35 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 185) [2024-08-27 20:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 20:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][440/1251] eta 0:25:16 lr 0.000366 wd 0.0500 time 0.2492 (1.8702) data time 0.0009 (0.1782) model time 0.2483 (1.6920) loss 3.4725 (3.5329) grad_norm 2.3748 (2.8447) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][450/1251] eta 0:12:48 lr 0.000366 wd 0.0500 time 0.2315 (0.9597) data time 0.0007 (0.0798) model time 0.2308 (0.8799) loss 3.9359 (3.3086) grad_norm 2.7502 (3.4429) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][460/1251] eta 0:09:13 lr 0.000366 wd 0.0500 time 0.2315 (0.6993) data time 0.0010 (0.0517) model time 0.2305 (0.6476) loss 3.5831 
(3.3595) grad_norm 2.2463 (3.3949) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][470/1251] eta 0:07:31 lr 0.000366 wd 0.0500 time 0.2615 (0.5775) data time 0.0009 (0.0384) model time 0.2606 (0.5391) loss 3.1597 (3.2954) grad_norm 2.8319 (3.1884) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][480/1251] eta 0:06:29 lr 0.000366 wd 0.0500 time 0.2317 (0.5053) data time 0.0007 (0.0308) model time 0.2310 (0.4745) loss 3.3107 (3.2707) grad_norm 2.1958 (3.0838) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][490/1251] eta 0:05:48 lr 0.000366 wd 0.0500 time 0.2372 (0.4583) data time 0.0007 (0.0257) model time 0.2365 (0.4326) loss 2.3854 (3.2189) grad_norm 3.9136 (3.1120) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][500/1251] eta 0:05:19 lr 0.000366 wd 0.0500 time 0.2340 (0.4251) data time 0.0007 (0.0221) model time 0.2333 (0.4030) loss 2.0812 (3.1801) grad_norm 3.0341 (3.1001) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][510/1251] eta 0:04:56 lr 0.000366 wd 0.0500 time 0.2327 (0.4004) data time 0.0007 (0.0194) model time 0.2321 (0.3810) loss 2.2003 (3.1572) grad_norm 2.3780 (3.0409) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][520/1251] eta 0:04:38 lr 0.000366 wd 0.0500 time 0.2306 (0.3808) data time 0.0009 (0.0174) model time 0.2297 (0.3634) loss 3.3108 (3.1408) grad_norm 2.1334 (3.0208) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][530/1251] eta 0:04:23 lr 
0.000366 wd 0.0500 time 0.2305 (0.3656) data time 0.0009 (0.0157) model time 0.2296 (0.3499) loss 3.9217 (3.1515) grad_norm 3.1814 (2.9983) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][540/1251] eta 0:04:10 lr 0.000366 wd 0.0500 time 0.2222 (0.3527) data time 0.0008 (0.0144) model time 0.2214 (0.3383) loss 2.5746 (3.1590) grad_norm 3.3681 (2.9599) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][550/1251] eta 0:04:00 lr 0.000366 wd 0.0500 time 0.2421 (0.3424) data time 0.0011 (0.0133) model time 0.2410 (0.3292) loss 3.0819 (3.1597) grad_norm 2.5351 (2.9321) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][560/1251] eta 0:03:50 lr 0.000366 wd 0.0500 time 0.2438 (0.3338) data time 0.0007 (0.0123) model time 0.2431 (0.3215) loss 3.0325 (3.1480) grad_norm 2.4032 (2.9320) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][570/1251] eta 0:03:42 lr 0.000366 wd 0.0500 time 0.2275 (0.3262) data time 0.0009 (0.0115) model time 0.2266 (0.3146) loss 3.0224 (3.1363) grad_norm 3.2367 (2.9197) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][580/1251] eta 0:03:34 lr 0.000365 wd 0.0500 time 0.2281 (0.3196) data time 0.0007 (0.0108) model time 0.2274 (0.3088) loss 3.4752 (3.1251) grad_norm 4.5899 (2.9464) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][590/1251] eta 0:03:27 lr 0.000365 wd 0.0500 time 0.2237 (0.3139) data time 0.0008 (0.0102) model time 0.2229 (0.3037) loss 2.6796 (3.1250) grad_norm 3.5512 (2.9482) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 
20:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][600/1251] eta 0:03:21 lr 0.000365 wd 0.0500 time 0.2342 (0.3088) data time 0.0011 (0.0097) model time 0.2332 (0.2991) loss 3.5068 (3.1345) grad_norm 3.4889 (2.9750) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][610/1251] eta 0:03:15 lr 0.000365 wd 0.0500 time 0.2270 (0.3043) data time 0.0008 (0.0092) model time 0.2262 (0.2951) loss 2.2902 (3.1199) grad_norm 3.2242 (2.9917) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][620/1251] eta 0:03:09 lr 0.000365 wd 0.0500 time 0.2293 (0.3003) data time 0.0008 (0.0088) model time 0.2285 (0.2915) loss 3.6776 (3.1219) grad_norm 3.4677 (2.9799) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][630/1251] eta 0:03:04 lr 0.000365 wd 0.0500 time 0.2449 (0.2967) data time 0.0012 (0.0084) model time 0.2437 (0.2883) loss 2.2256 (3.1154) grad_norm 1.7661 (2.9664) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][640/1251] eta 0:02:59 lr 0.000365 wd 0.0500 time 0.2317 (0.2934) data time 0.0010 (0.0080) model time 0.2307 (0.2854) loss 2.9720 (3.1034) grad_norm 2.4486 (2.9675) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][650/1251] eta 0:02:54 lr 0.000365 wd 0.0500 time 0.2601 (0.2907) data time 0.0009 (0.0077) model time 0.2591 (0.2830) loss 2.7310 (3.0969) grad_norm 2.5718 (2.9503) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][660/1251] eta 0:02:50 lr 0.000365 wd 0.0500 time 0.2245 (0.2879) data time 0.0010 (0.0074) model time 0.2235 (0.2804) 
loss 3.8980 (3.1094) grad_norm 3.3625 (2.9392) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][670/1251] eta 0:02:45 lr 0.000365 wd 0.0500 time 0.2285 (0.2854) data time 0.0008 (0.0071) model time 0.2277 (0.2782) loss 3.3937 (3.1009) grad_norm 2.2483 (2.9305) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][680/1251] eta 0:02:41 lr 0.000365 wd 0.0500 time 0.2383 (0.2831) data time 0.0009 (0.0069) model time 0.2374 (0.2762) loss 2.4548 (3.0956) grad_norm 2.2721 (2.9187) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][690/1251] eta 0:02:37 lr 0.000365 wd 0.0500 time 0.2245 (0.2810) data time 0.0012 (0.0067) model time 0.2233 (0.2743) loss 3.2674 (3.0893) grad_norm 2.6924 (2.9133) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][700/1251] eta 0:02:33 lr 0.000365 wd 0.0500 time 0.2301 (0.2791) data time 0.0006 (0.0065) model time 0.2295 (0.2726) loss 3.7403 (3.0827) grad_norm 3.3307 (2.9274) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][710/1251] eta 0:02:29 lr 0.000365 wd 0.0500 time 0.2363 (0.2773) data time 0.0011 (0.0063) model time 0.2351 (0.2710) loss 2.2606 (3.0865) grad_norm 1.9376 (2.9244) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][720/1251] eta 0:02:26 lr 0.000365 wd 0.0500 time 0.2294 (0.2764) data time 0.0008 (0.0061) model time 0.2286 (0.2703) loss 3.9572 (3.0876) grad_norm 2.2309 (2.9353) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][730/1251] eta 
0:02:23 lr 0.000365 wd 0.0500 time 0.2284 (0.2749) data time 0.0009 (0.0059) model time 0.2275 (0.2690) loss 2.8202 (3.0703) grad_norm 3.4212 (2.9340) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][740/1251] eta 0:02:20 lr 0.000365 wd 0.0500 time 0.2274 (0.2742) data time 0.0007 (0.0058) model time 0.2268 (0.2684) loss 2.4484 (3.0659) grad_norm 2.9850 (2.9553) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][750/1251] eta 0:02:16 lr 0.000365 wd 0.0500 time 0.2226 (0.2728) data time 0.0011 (0.0057) model time 0.2215 (0.2671) loss 3.0149 (3.0730) grad_norm 5.2304 (2.9672) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][760/1251] eta 0:02:13 lr 0.000365 wd 0.0500 time 0.2222 (0.2714) data time 0.0009 (0.0056) model time 0.2213 (0.2658) loss 3.2727 (3.0791) grad_norm 3.9078 (2.9752) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][770/1251] eta 0:02:09 lr 0.000365 wd 0.0500 time 0.2287 (0.2702) data time 0.0009 (0.0055) model time 0.2278 (0.2647) loss 3.5873 (3.0759) grad_norm 3.0775 (2.9839) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][780/1251] eta 0:02:06 lr 0.000365 wd 0.0500 time 0.2273 (0.2690) data time 0.0008 (0.0053) model time 0.2265 (0.2637) loss 2.9981 (3.0765) grad_norm 3.0296 (2.9902) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][790/1251] eta 0:02:03 lr 0.000365 wd 0.0500 time 0.2200 (0.2679) data time 0.0013 (0.0052) model time 0.2187 (0.2627) loss 3.3994 (3.0760) grad_norm 3.8920 (2.9946) loss_scale 1024.0000 (1024.0000) mem 7377MB 
[2024-08-27 20:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][800/1251] eta 0:02:00 lr 0.000365 wd 0.0500 time 0.2276 (0.2669) data time 0.0007 (0.0051) model time 0.2269 (0.2618) loss 2.7078 (3.0734) grad_norm 2.3614 (2.9879) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][810/1251] eta 0:01:57 lr 0.000365 wd 0.0500 time 0.2268 (0.2660) data time 0.0009 (0.0050) model time 0.2259 (0.2610) loss 3.5569 (3.0719) grad_norm 2.4487 (2.9806) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][820/1251] eta 0:01:54 lr 0.000364 wd 0.0500 time 0.2303 (0.2650) data time 0.0006 (0.0049) model time 0.2297 (0.2601) loss 2.8463 (3.0680) grad_norm 3.7780 (2.9776) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][830/1251] eta 0:01:51 lr 0.000364 wd 0.0500 time 0.2292 (0.2641) data time 0.0007 (0.0048) model time 0.2286 (0.2593) loss 3.2044 (3.0744) grad_norm 3.2448 (2.9692) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][840/1251] eta 0:01:48 lr 0.000364 wd 0.0500 time 0.2298 (0.2633) data time 0.0014 (0.0047) model time 0.2284 (0.2586) loss 3.2853 (3.0791) grad_norm 2.5173 (2.9754) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][850/1251] eta 0:01:45 lr 0.000364 wd 0.0500 time 0.2261 (0.2626) data time 0.0010 (0.0047) model time 0.2251 (0.2579) loss 3.9731 (3.0821) grad_norm 2.7431 (2.9754) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][860/1251] eta 0:01:42 lr 0.000364 wd 0.0500 time 0.2321 (0.2618) data time 0.0007 (0.0046) model time 0.2314 
(0.2572) loss 3.6143 (3.0856) grad_norm 2.9559 (2.9647) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][870/1251] eta 0:01:39 lr 0.000364 wd 0.0500 time 0.2324 (0.2610) data time 0.0009 (0.0045) model time 0.2314 (0.2565) loss 3.0869 (3.0882) grad_norm 2.5478 (2.9531) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][880/1251] eta 0:01:36 lr 0.000364 wd 0.0500 time 0.2392 (0.2604) data time 0.0009 (0.0044) model time 0.2383 (0.2559) loss 2.6663 (3.0883) grad_norm 2.3818 (2.9559) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][890/1251] eta 0:01:33 lr 0.000364 wd 0.0500 time 0.2318 (0.2597) data time 0.0014 (0.0044) model time 0.2304 (0.2553) loss 2.8653 (3.0860) grad_norm 2.4094 (2.9577) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][900/1251] eta 0:01:30 lr 0.000364 wd 0.0500 time 0.2295 (0.2591) data time 0.0011 (0.0043) model time 0.2284 (0.2548) loss 2.7372 (3.0822) grad_norm 2.4332 (2.9511) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][910/1251] eta 0:01:28 lr 0.000364 wd 0.0500 time 0.2232 (0.2585) data time 0.0011 (0.0042) model time 0.2220 (0.2542) loss 2.5163 (3.0786) grad_norm 2.7619 (2.9500) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][920/1251] eta 0:01:25 lr 0.000364 wd 0.0500 time 0.2308 (0.2578) data time 0.0009 (0.0042) model time 0.2299 (0.2537) loss 3.2449 (3.0838) grad_norm 2.6917 (2.9465) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[185/300][930/1251] eta 0:01:22 lr 0.000364 wd 0.0500 time 0.2301 (0.2573) data time 0.0007 (0.0041) model time 0.2294 (0.2532) loss 3.4546 (3.0831) grad_norm 2.9196 (2.9417) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][940/1251] eta 0:01:19 lr 0.000364 wd 0.0500 time 0.2346 (0.2568) data time 0.0007 (0.0040) model time 0.2339 (0.2527) loss 3.5099 (3.0836) grad_norm 2.2600 (2.9387) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][950/1251] eta 0:01:17 lr 0.000364 wd 0.0500 time 0.2350 (0.2563) data time 0.0009 (0.0040) model time 0.2341 (0.2523) loss 3.1644 (3.0882) grad_norm 2.3101 (2.9408) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][960/1251] eta 0:01:14 lr 0.000364 wd 0.0500 time 0.2291 (0.2558) data time 0.0008 (0.0039) model time 0.2283 (0.2518) loss 2.0311 (3.0809) grad_norm 1.6984 (2.9368) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][970/1251] eta 0:01:11 lr 0.000364 wd 0.0500 time 0.2300 (0.2553) data time 0.0011 (0.0039) model time 0.2289 (0.2515) loss 3.6348 (3.0816) grad_norm 2.2389 (2.9413) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][980/1251] eta 0:01:09 lr 0.000364 wd 0.0500 time 0.2300 (0.2549) data time 0.0010 (0.0038) model time 0.2290 (0.2511) loss 3.7546 (3.0835) grad_norm 2.5284 (2.9412) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][990/1251] eta 0:01:06 lr 0.000364 wd 0.0500 time 0.2244 (0.2545) data time 0.0011 (0.0038) model time 0.2233 (0.2507) loss 3.1531 (3.0876) grad_norm 2.9989 (2.9422) loss_scale 1024.0000 
(1024.0000) mem 7377MB [2024-08-27 20:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1000/1251] eta 0:01:03 lr 0.000364 wd 0.0500 time 0.2288 (0.2541) data time 0.0007 (0.0037) model time 0.2281 (0.2503) loss 3.8069 (3.0912) grad_norm 2.3835 (2.9414) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1010/1251] eta 0:01:01 lr 0.000364 wd 0.0500 time 0.2269 (0.2536) data time 0.0014 (0.0037) model time 0.2255 (0.2500) loss 3.0817 (3.0904) grad_norm 2.3017 (2.9326) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1020/1251] eta 0:00:58 lr 0.000364 wd 0.0500 time 0.2382 (0.2532) data time 0.0009 (0.0036) model time 0.2373 (0.2496) loss 3.4655 (3.0928) grad_norm 3.2325 (2.9329) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1030/1251] eta 0:00:55 lr 0.000364 wd 0.0500 time 0.2241 (0.2528) data time 0.0009 (0.0036) model time 0.2232 (0.2492) loss 2.5310 (3.0934) grad_norm 2.4541 (2.9263) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1040/1251] eta 0:00:53 lr 0.000364 wd 0.0500 time 0.2297 (0.2524) data time 0.0009 (0.0036) model time 0.2288 (0.2489) loss 2.8294 (3.0920) grad_norm 2.5676 (2.9217) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1050/1251] eta 0:00:50 lr 0.000363 wd 0.0500 time 0.2203 (0.2521) data time 0.0013 (0.0035) model time 0.2190 (0.2485) loss 3.0840 (3.0928) grad_norm 2.3799 (2.9207) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1060/1251] eta 0:00:48 lr 0.000363 wd 0.0500 time 0.2289 (0.2517) data time 
0.0009 (0.0035) model time 0.2280 (0.2482) loss 2.8772 (3.0949) grad_norm 2.2976 (2.9185) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1070/1251] eta 0:00:45 lr 0.000363 wd 0.0500 time 0.2318 (0.2514) data time 0.0015 (0.0034) model time 0.2304 (0.2479) loss 3.1624 (3.0965) grad_norm 3.3535 (2.9172) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1080/1251] eta 0:00:42 lr 0.000363 wd 0.0500 time 0.2207 (0.2510) data time 0.0010 (0.0034) model time 0.2197 (0.2476) loss 2.9210 (3.0922) grad_norm 2.1046 (2.9288) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1090/1251] eta 0:00:40 lr 0.000363 wd 0.0500 time 0.2215 (0.2507) data time 0.0013 (0.0034) model time 0.2202 (0.2474) loss 2.6110 (3.0913) grad_norm 3.4542 (2.9304) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1100/1251] eta 0:00:37 lr 0.000363 wd 0.0500 time 0.2333 (0.2505) data time 0.0008 (0.0033) model time 0.2325 (0.2471) loss 3.4638 (3.0914) grad_norm 2.6446 (2.9297) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1110/1251] eta 0:00:35 lr 0.000363 wd 0.0500 time 0.2372 (0.2502) data time 0.0013 (0.0033) model time 0.2359 (0.2469) loss 3.3242 (3.0955) grad_norm 5.4616 (2.9292) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1120/1251] eta 0:00:32 lr 0.000363 wd 0.0500 time 0.2254 (0.2499) data time 0.0007 (0.0033) model time 0.2246 (0.2466) loss 3.1552 (3.0975) grad_norm 2.2542 (2.9288) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:34 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [185/300][1130/1251] eta 0:00:30 lr 0.000363 wd 0.0500 time 0.2287 (0.2496) data time 0.0007 (0.0033) model time 0.2280 (0.2463) loss 3.4397 (3.0941) grad_norm 2.8637 (2.9315) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1140/1251] eta 0:00:27 lr 0.000363 wd 0.0500 time 0.2395 (0.2494) data time 0.0009 (0.0033) model time 0.2386 (0.2461) loss 2.8214 (3.0940) grad_norm 1.9385 (2.9261) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1150/1251] eta 0:00:25 lr 0.000363 wd 0.0500 time 0.2477 (0.2491) data time 0.0010 (0.0032) model time 0.2467 (0.2459) loss 3.4781 (3.0892) grad_norm 3.0492 (2.9215) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-27 20:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1160/1251] eta 0:00:22 lr 0.000363 wd 0.0500 time 0.2261 (0.2488) data time 0.0011 (0.0032) model time 0.2250 (0.2456) loss 3.0455 (3.0875) grad_norm 1.5468 (inf) loss_scale 512.0000 (1022.5934) mem 7377MB [2024-08-27 20:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1170/1251] eta 0:00:20 lr 0.000363 wd 0.0500 time 0.2292 (0.2486) data time 0.0011 (0.0032) model time 0.2281 (0.2454) loss 3.7002 (3.0924) grad_norm 2.6044 (inf) loss_scale 512.0000 (1015.6748) mem 7377MB [2024-08-27 20:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1180/1251] eta 0:00:17 lr 0.000363 wd 0.0500 time 0.2330 (0.2483) data time 0.0007 (0.0032) model time 0.2323 (0.2452) loss 2.0723 (3.0897) grad_norm 2.4957 (inf) loss_scale 512.0000 (1008.9412) mem 7377MB [2024-08-27 20:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1190/1251] eta 0:00:15 lr 0.000363 wd 0.0500 time 0.2289 (0.2481) data time 0.0008 (0.0031) model time 0.2281 (0.2450) loss 3.8183 (3.0886) grad_norm 3.7911 
(inf) loss_scale 512.0000 (1002.3852) mem 7377MB [2024-08-27 20:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1200/1251] eta 0:00:12 lr 0.000363 wd 0.0500 time 0.2244 (0.2479) data time 0.0013 (0.0031) model time 0.2231 (0.2447) loss 3.3774 (3.0907) grad_norm 2.6327 (inf) loss_scale 512.0000 (996.0000) mem 7377MB [2024-08-27 20:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1210/1251] eta 0:00:10 lr 0.000363 wd 0.0500 time 0.2321 (0.2476) data time 0.0007 (0.0031) model time 0.2315 (0.2445) loss 3.8128 (3.0929) grad_norm 2.7834 (inf) loss_scale 512.0000 (989.7789) mem 7377MB [2024-08-27 20:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1220/1251] eta 0:00:07 lr 0.000363 wd 0.0500 time 0.2289 (0.2474) data time 0.0006 (0.0031) model time 0.2283 (0.2443) loss 3.2558 (3.0925) grad_norm 4.1312 (inf) loss_scale 512.0000 (983.7157) mem 7377MB [2024-08-27 20:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1230/1251] eta 0:00:05 lr 0.000363 wd 0.0500 time 0.2250 (0.2473) data time 0.0007 (0.0030) model time 0.2243 (0.2442) loss 3.3691 (3.0926) grad_norm 3.9682 (inf) loss_scale 512.0000 (977.8045) mem 7377MB [2024-08-27 20:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1240/1251] eta 0:00:02 lr 0.000363 wd 0.0500 time 0.2134 (0.2469) data time 0.0006 (0.0030) model time 0.2127 (0.2439) loss 3.7055 (3.0898) grad_norm 3.0157 (inf) loss_scale 512.0000 (972.0396) mem 7377MB [2024-08-27 20:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [185/300][1250/1251] eta 0:00:00 lr 0.000363 wd 0.0500 time 0.2097 (0.2467) data time 0.0006 (0.0030) model time 0.2091 (0.2437) loss 3.1260 (3.0887) grad_norm 2.4190 (inf) loss_scale 512.0000 (966.4156) mem 7377MB [2024-08-27 20:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 185 training takes 0:03:21 [2024-08-27 20:05:02 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 20:05:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 20:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.494 (0.494) Loss 0.4370 (0.4370) Acc@1 91.406 (91.406) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-27 20:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.117) Loss 0.7031 (0.6823) Acc@1 86.035 (85.387) Acc@5 96.680 (97.106) Mem 7377MB [2024-08-27 20:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.101) Loss 0.9810 (0.7084) Acc@1 76.367 (84.384) Acc@5 94.043 (97.108) Mem 7377MB [2024-08-27 20:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.095) Loss 1.2812 (0.8090) Acc@1 69.531 (82.079) Acc@5 90.723 (95.987) Mem 7377MB [2024-08-27 20:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.0898 (0.8623) Acc@1 74.805 (80.731) Acc@5 93.359 (95.420) Mem 7377MB [2024-08-27 20:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.358 Acc@5 95.384 [2024-08-27 20:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.4% [2024-08-27 20:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.36% [2024-08-27 20:05:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 20:05:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 20:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.464 (0.464) Loss 0.3911 (0.3911) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-27 20:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.114) Loss 0.6069 (0.6200) Acc@1 88.184 (86.834) Acc@5 97.266 (97.532) Mem 7377MB [2024-08-27 20:05:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.099) Loss 0.8760 (0.6440) Acc@1 78.320 (85.882) Acc@5 96.289 (97.512) Mem 7377MB [2024-08-27 20:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.092) Loss 1.1123 (0.7299) Acc@1 73.340 (83.846) Acc@5 93.164 (96.626) Mem 7377MB [2024-08-27 20:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 0.9941 (0.7734) Acc@1 75.586 (82.567) Acc@5 94.043 (96.149) Mem 7377MB [2024-08-27 20:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.100 Acc@5 96.136 [2024-08-27 20:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-08-27 20:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.10% [2024-08-27 20:05:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 20:05:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 20:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][0/1251] eta 0:18:08 lr 0.000363 wd 0.0500 time 0.8700 (0.8700) data time 0.6133 (0.6133) model time 0.0000 (0.0000) loss 3.5868 (3.5868) grad_norm 2.5135 (2.5135) loss_scale 512.0000 (512.0000) mem 7380MB [2024-08-27 20:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][10/1251] eta 0:05:55 lr 0.000363 wd 0.0500 time 0.2461 (0.2867) data time 0.0006 (0.0568) model time 0.0000 (0.0000) loss 2.8589 (2.7394) grad_norm 3.0467 (6.6000) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][20/1251] eta 0:05:19 lr 0.000363 wd 0.0500 time 0.2390 (0.2594) data time 0.0006 (0.0304) model time 0.0000 (0.0000) loss 3.2169 (2.8486) grad_norm 3.2494 (4.8001) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][30/1251] eta 0:05:04 lr 0.000363 wd 0.0500 time 0.2311 (0.2497) data time 0.0010 (0.0209) model time 0.0000 (0.0000) loss 3.1523 (2.8558) grad_norm 2.4573 (4.1060) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][40/1251] eta 0:04:56 lr 0.000362 wd 0.0500 time 0.2320 (0.2451) data time 0.0009 (0.0160) model time 0.0000 (0.0000) loss 3.3216 (2.9411) grad_norm 2.3347 (3.8538) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][50/1251] eta 0:04:51 lr 0.000362 wd 0.0500 time 0.2263 (0.2423) data time 0.0010 (0.0131) model time 0.0000 (0.0000) loss 3.7270 (2.9663) grad_norm 3.1019 (3.6191) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][60/1251] eta 0:04:45 lr 0.000362 wd 0.0500 time 0.2288 (0.2401) data time 0.0008 (0.0111) model time 0.2281 (0.2276) loss 
3.0032 (2.9528) grad_norm 2.3649 (3.5527) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][70/1251] eta 0:04:41 lr 0.000362 wd 0.0500 time 0.2365 (0.2386) data time 0.0007 (0.0097) model time 0.2358 (0.2284) loss 3.3466 (2.9653) grad_norm 2.6309 (3.4968) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][80/1251] eta 0:04:39 lr 0.000362 wd 0.0500 time 0.2866 (0.2383) data time 0.0009 (0.0086) model time 0.2857 (0.2306) loss 2.2105 (2.9568) grad_norm 2.6429 (3.4302) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][90/1251] eta 0:04:35 lr 0.000362 wd 0.0500 time 0.2337 (0.2376) data time 0.0012 (0.0078) model time 0.2325 (0.2304) loss 3.2865 (2.9753) grad_norm 2.0772 (3.3604) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][100/1251] eta 0:04:32 lr 0.000362 wd 0.0500 time 0.2369 (0.2365) data time 0.0007 (0.0072) model time 0.2362 (0.2295) loss 3.5765 (2.9842) grad_norm 2.6225 (3.3891) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][110/1251] eta 0:04:29 lr 0.000362 wd 0.0500 time 0.2400 (0.2361) data time 0.0008 (0.0066) model time 0.2392 (0.2298) loss 3.6223 (3.0218) grad_norm 2.6620 (3.3386) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][120/1251] eta 0:04:27 lr 0.000362 wd 0.0500 time 0.2260 (0.2363) data time 0.0010 (0.0062) model time 0.2251 (0.2308) loss 3.1041 (3.0114) grad_norm 2.3477 (3.2891) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][130/1251] eta 0:04:25 lr 0.000362 wd 
0.0500 time 0.2240 (0.2368) data time 0.0007 (0.0058) model time 0.2233 (0.2321) loss 2.1205 (2.9963) grad_norm 2.6595 (3.2503) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][140/1251] eta 0:04:22 lr 0.000362 wd 0.0500 time 0.2366 (0.2362) data time 0.0008 (0.0055) model time 0.2359 (0.2316) loss 2.0325 (2.9898) grad_norm 2.7939 (3.2165) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][150/1251] eta 0:04:19 lr 0.000362 wd 0.0500 time 0.2305 (0.2357) data time 0.0012 (0.0052) model time 0.2293 (0.2312) loss 3.2495 (3.0001) grad_norm 2.7753 (3.2030) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][160/1251] eta 0:04:17 lr 0.000362 wd 0.0500 time 0.2587 (0.2356) data time 0.0009 (0.0050) model time 0.2578 (0.2313) loss 2.5949 (3.0021) grad_norm 2.5992 (3.1955) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][170/1251] eta 0:04:14 lr 0.000362 wd 0.0500 time 0.2257 (0.2353) data time 0.0011 (0.0049) model time 0.2246 (0.2309) loss 3.6778 (3.0083) grad_norm 3.1643 (3.1757) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][180/1251] eta 0:04:11 lr 0.000362 wd 0.0500 time 0.2244 (0.2349) data time 0.0011 (0.0047) model time 0.2232 (0.2306) loss 3.0930 (3.0105) grad_norm 3.8224 (3.1496) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][190/1251] eta 0:04:08 lr 0.000362 wd 0.0500 time 0.2325 (0.2345) data time 0.0008 (0.0045) model time 0.2317 (0.2304) loss 3.7798 (3.0114) grad_norm 4.0353 (3.1577) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:05 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [186/300][200/1251] eta 0:04:06 lr 0.000362 wd 0.0500 time 0.2287 (0.2343) data time 0.0008 (0.0043) model time 0.2279 (0.2303) loss 2.8203 (3.0143) grad_norm 2.9044 (3.1425) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][210/1251] eta 0:04:03 lr 0.000362 wd 0.0500 time 0.2356 (0.2342) data time 0.0010 (0.0041) model time 0.2346 (0.2303) loss 3.3745 (3.0078) grad_norm 2.8753 (3.1792) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][220/1251] eta 0:04:01 lr 0.000362 wd 0.0500 time 0.2360 (0.2339) data time 0.0008 (0.0040) model time 0.2352 (0.2302) loss 3.1410 (3.0177) grad_norm 2.6446 (3.1638) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][230/1251] eta 0:03:58 lr 0.000362 wd 0.0500 time 0.2289 (0.2338) data time 0.0007 (0.0039) model time 0.2283 (0.2300) loss 3.3536 (3.0214) grad_norm 2.9859 (3.1500) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][240/1251] eta 0:03:56 lr 0.000362 wd 0.0500 time 0.2297 (0.2336) data time 0.0009 (0.0038) model time 0.2288 (0.2299) loss 3.2429 (3.0275) grad_norm 3.1771 (3.1311) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][250/1251] eta 0:03:53 lr 0.000362 wd 0.0500 time 0.2336 (0.2334) data time 0.0010 (0.0037) model time 0.2326 (0.2298) loss 2.2211 (3.0250) grad_norm 1.7458 (3.1078) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][260/1251] eta 0:03:51 lr 0.000362 wd 0.0500 time 0.2415 (0.2333) data time 0.0007 (0.0036) model time 0.2408 (0.2297) loss 3.8368 (3.0295) grad_norm 2.6340 
(3.1056) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][270/1251] eta 0:03:48 lr 0.000361 wd 0.0500 time 0.2333 (0.2331) data time 0.0008 (0.0035) model time 0.2325 (0.2297) loss 3.4132 (3.0247) grad_norm 3.3871 (3.1182) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][280/1251] eta 0:03:46 lr 0.000361 wd 0.0500 time 0.2234 (0.2330) data time 0.0009 (0.0034) model time 0.2226 (0.2297) loss 2.4077 (3.0258) grad_norm 2.1596 (3.1017) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][290/1251] eta 0:03:43 lr 0.000361 wd 0.0500 time 0.2452 (0.2331) data time 0.0006 (0.0034) model time 0.2446 (0.2298) loss 2.0642 (3.0275) grad_norm 2.8928 (3.0894) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][300/1251] eta 0:03:41 lr 0.000361 wd 0.0500 time 0.2322 (0.2329) data time 0.0009 (0.0033) model time 0.2313 (0.2297) loss 3.5908 (3.0372) grad_norm 4.1430 (3.0799) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][310/1251] eta 0:03:39 lr 0.000361 wd 0.0500 time 0.2320 (0.2327) data time 0.0011 (0.0032) model time 0.2309 (0.2296) loss 3.0851 (3.0402) grad_norm 2.9555 (3.0695) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][320/1251] eta 0:03:36 lr 0.000361 wd 0.0500 time 0.2324 (0.2326) data time 0.0008 (0.0032) model time 0.2316 (0.2295) loss 3.2061 (3.0378) grad_norm 2.4812 (3.0671) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][330/1251] eta 0:03:34 lr 0.000361 wd 0.0500 time 0.2270 (0.2325) data 
time 0.0010 (0.0031) model time 0.2260 (0.2294) loss 2.0949 (3.0372) grad_norm 2.0089 (3.0539) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][340/1251] eta 0:03:31 lr 0.000361 wd 0.0500 time 0.2257 (0.2323) data time 0.0010 (0.0030) model time 0.2247 (0.2293) loss 3.5407 (3.0393) grad_norm 3.5682 (3.0497) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][350/1251] eta 0:03:29 lr 0.000361 wd 0.0500 time 0.2258 (0.2322) data time 0.0008 (0.0030) model time 0.2250 (0.2293) loss 3.3266 (3.0375) grad_norm 2.6048 (3.0532) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][360/1251] eta 0:03:26 lr 0.000361 wd 0.0500 time 0.2271 (0.2320) data time 0.0008 (0.0029) model time 0.2263 (0.2291) loss 3.2839 (3.0449) grad_norm 3.4490 (3.0666) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][370/1251] eta 0:03:24 lr 0.000361 wd 0.0500 time 0.2391 (0.2320) data time 0.0010 (0.0029) model time 0.2381 (0.2292) loss 3.2386 (3.0537) grad_norm 3.2437 (3.0608) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][380/1251] eta 0:03:22 lr 0.000361 wd 0.0500 time 0.2349 (0.2320) data time 0.0011 (0.0028) model time 0.2338 (0.2291) loss 2.9093 (3.0512) grad_norm 1.9759 (3.0579) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][390/1251] eta 0:03:19 lr 0.000361 wd 0.0500 time 0.2668 (0.2320) data time 0.0007 (0.0028) model time 0.2660 (0.2292) loss 2.8120 (3.0536) grad_norm 2.3980 (3.0609) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [186/300][400/1251] eta 0:03:17 lr 0.000361 wd 0.0500 time 0.2279 (0.2319) data time 0.0008 (0.0027) model time 0.2271 (0.2291) loss 1.8067 (3.0510) grad_norm 2.4817 (3.0590) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][410/1251] eta 0:03:14 lr 0.000361 wd 0.0500 time 0.2414 (0.2319) data time 0.0007 (0.0027) model time 0.2407 (0.2291) loss 3.2885 (3.0475) grad_norm 2.7557 (3.0571) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][420/1251] eta 0:03:12 lr 0.000361 wd 0.0500 time 0.2434 (0.2319) data time 0.0009 (0.0027) model time 0.2425 (0.2292) loss 2.8276 (3.0477) grad_norm 2.2386 (3.0496) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][430/1251] eta 0:03:10 lr 0.000361 wd 0.0500 time 0.2414 (0.2318) data time 0.0011 (0.0026) model time 0.2404 (0.2291) loss 1.9655 (3.0468) grad_norm 3.0219 (3.0490) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][440/1251] eta 0:03:07 lr 0.000361 wd 0.0500 time 0.2260 (0.2317) data time 0.0010 (0.0026) model time 0.2250 (0.2291) loss 3.2445 (3.0485) grad_norm 2.4222 (3.0447) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][450/1251] eta 0:03:05 lr 0.000361 wd 0.0500 time 0.2268 (0.2317) data time 0.0010 (0.0026) model time 0.2258 (0.2291) loss 3.4423 (3.0559) grad_norm 2.3884 (3.0357) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][460/1251] eta 0:03:03 lr 0.000361 wd 0.0500 time 0.2331 (0.2317) data time 0.0007 (0.0025) model time 0.2324 (0.2291) loss 2.8207 (3.0595) grad_norm 2.5700 (3.0323) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-27 20:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][470/1251] eta 0:03:00 lr 0.000361 wd 0.0500 time 0.2513 (0.2317) data time 0.0012 (0.0025) model time 0.2502 (0.2292) loss 2.9196 (3.0672) grad_norm 2.6056 (3.0350) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][480/1251] eta 0:02:58 lr 0.000361 wd 0.0500 time 0.2337 (0.2316) data time 0.0009 (0.0025) model time 0.2328 (0.2291) loss 3.4233 (3.0673) grad_norm 2.0584 (3.0303) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][490/1251] eta 0:02:56 lr 0.000361 wd 0.0500 time 0.2267 (0.2316) data time 0.0006 (0.0025) model time 0.2261 (0.2291) loss 2.5357 (3.0644) grad_norm 3.8053 (3.0420) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][500/1251] eta 0:02:53 lr 0.000361 wd 0.0500 time 0.2245 (0.2316) data time 0.0009 (0.0024) model time 0.2236 (0.2291) loss 2.7909 (3.0610) grad_norm 2.5837 (3.0378) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][510/1251] eta 0:02:51 lr 0.000360 wd 0.0500 time 0.2443 (0.2315) data time 0.0011 (0.0024) model time 0.2432 (0.2291) loss 3.2540 (3.0594) grad_norm 2.6166 (3.0331) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][520/1251] eta 0:02:49 lr 0.000360 wd 0.0500 time 0.2292 (0.2315) data time 0.0009 (0.0024) model time 0.2283 (0.2291) loss 3.9666 (3.0596) grad_norm 1.8833 (3.0293) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][530/1251] eta 0:02:46 lr 0.000360 wd 0.0500 time 0.2347 (0.2314) data time 0.0008 (0.0024) model 
time 0.2340 (0.2290) loss 3.6567 (3.0634) grad_norm 2.5045 (3.0176) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][540/1251] eta 0:02:44 lr 0.000360 wd 0.0500 time 0.2348 (0.2318) data time 0.0012 (0.0024) model time 0.2336 (0.2295) loss 2.8031 (3.0665) grad_norm 2.7021 (3.0122) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][550/1251] eta 0:02:42 lr 0.000360 wd 0.0500 time 0.2295 (0.2318) data time 0.0011 (0.0023) model time 0.2284 (0.2295) loss 2.5847 (3.0710) grad_norm 2.6374 (3.0114) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][560/1251] eta 0:02:40 lr 0.000360 wd 0.0500 time 0.2246 (0.2318) data time 0.0009 (0.0023) model time 0.2237 (0.2295) loss 3.2098 (3.0714) grad_norm 4.2320 (3.0117) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][570/1251] eta 0:02:37 lr 0.000360 wd 0.0500 time 0.2260 (0.2317) data time 0.0009 (0.0023) model time 0.2252 (0.2295) loss 3.6970 (3.0732) grad_norm 3.1495 (3.0132) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][580/1251] eta 0:02:35 lr 0.000360 wd 0.0500 time 0.2276 (0.2317) data time 0.0009 (0.0023) model time 0.2267 (0.2295) loss 2.6369 (3.0739) grad_norm 2.2034 (3.0070) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][590/1251] eta 0:02:33 lr 0.000360 wd 0.0500 time 0.2284 (0.2318) data time 0.0007 (0.0023) model time 0.2277 (0.2295) loss 3.0605 (3.0789) grad_norm 3.9832 (3.0124) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][600/1251] 
eta 0:02:30 lr 0.000360 wd 0.0500 time 0.2298 (0.2317) data time 0.0011 (0.0022) model time 0.2287 (0.2295) loss 3.0438 (3.0778) grad_norm 4.6440 (3.0249) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][610/1251] eta 0:02:28 lr 0.000360 wd 0.0500 time 0.2314 (0.2317) data time 0.0008 (0.0022) model time 0.2306 (0.2295) loss 2.8827 (3.0770) grad_norm 2.8762 (3.0226) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][620/1251] eta 0:02:26 lr 0.000360 wd 0.0500 time 0.2259 (0.2317) data time 0.0009 (0.0022) model time 0.2250 (0.2295) loss 2.7626 (3.0785) grad_norm 2.4137 (3.0197) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][630/1251] eta 0:02:23 lr 0.000360 wd 0.0500 time 0.2238 (0.2317) data time 0.0009 (0.0022) model time 0.2230 (0.2295) loss 2.5629 (3.0814) grad_norm 2.2069 (3.0184) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][640/1251] eta 0:02:21 lr 0.000360 wd 0.0500 time 0.2270 (0.2317) data time 0.0014 (0.0022) model time 0.2256 (0.2295) loss 2.5917 (3.0831) grad_norm 2.0644 (3.0145) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 20:07:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 20:07:49 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 20:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:09:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 20:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:24:13 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 20:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 20:24:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 20:24:25 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 20:24:26 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 20:24:26 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 186) [2024-08-27 20:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 20:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:41:03 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 20:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 20:41:12 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 20:41:14 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 20:41:15 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 20:41:15 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 186) [2024-08-27 20:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 20:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][650/1251] eta 0:33:25 lr 0.000360 wd 0.0500 time 0.2272 (3.3366) data time 0.0005 (0.2164) model time 0.2266 (3.1202) loss 3.7915 (3.5986) grad_norm 2.5891 (2.5589) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][660/1251] eta 0:10:57 lr 0.000360 wd 0.0500 time 0.2148 (1.1131) data time 0.0009 (0.0625) model time 0.2140 (1.0507) loss 3.6601 (3.3541) grad_norm 2.7619 (3.6461) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][670/1251] eta 0:07:11 lr 0.000360 wd 0.0500 time 0.2198 (0.7433) data time 0.0009 (0.0368) model time 0.2189 (0.7065) loss 3.3051 (3.3686) grad_norm 2.9046 (3.4549) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][680/1251] eta 0:05:37 lr 0.000360 wd 0.0500 time 0.2312 (0.5913) data time 0.0007 (0.0262) model time 0.2305 (0.5651) loss 2.4240 (3.3242) grad_norm 3.7012 (3.4412) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][690/1251] eta 0:04:45 lr 0.000360 wd 0.0500 time 0.2252 (0.5082) data time 0.0007 (0.0206) model time 0.2246 (0.4877) loss 3.4147 (3.2942) grad_norm 3.2840 (3.3332) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[186/300][700/1251] eta 0:04:11 lr 0.000360 wd 0.0500 time 0.2213 (0.4556) data time 0.0007 (0.0169) model time 0.2206 (0.4387) loss 3.4256 (3.2863) grad_norm 2.6666 (3.1993) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][710/1251] eta 0:03:46 lr 0.000360 wd 0.0500 time 0.2257 (0.4196) data time 0.0007 (0.0144) model time 0.2250 (0.4052) loss 3.2825 (3.2408) grad_norm 2.4740 (3.1087) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][720/1251] eta 0:03:28 lr 0.000360 wd 0.0500 time 0.2293 (0.3930) data time 0.0006 (0.0126) model time 0.2287 (0.3805) loss 3.6852 (3.2165) grad_norm 2.7173 (3.0248) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][730/1251] eta 0:03:14 lr 0.000360 wd 0.0500 time 0.2248 (0.3729) data time 0.0009 (0.0112) model time 0.2239 (0.3617) loss 3.3312 (3.1855) grad_norm 3.6457 (3.0896) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][740/1251] eta 0:03:02 lr 0.000359 wd 0.0500 time 0.2336 (0.3569) data time 0.0008 (0.0101) model time 0.2328 (0.3468) loss 2.7930 (3.1719) grad_norm 2.4282 (3.1135) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][750/1251] eta 0:02:52 lr 0.000359 wd 0.0500 time 0.2308 (0.3444) data time 0.0008 (0.0092) model time 0.2300 (0.3352) loss 3.2920 (3.1957) grad_norm 3.2852 (3.1388) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][760/1251] eta 0:02:44 lr 0.000359 wd 0.0500 time 0.2261 (0.3340) data time 0.0010 (0.0085) model time 0.2251 (0.3256) loss 3.2114 (3.1851) grad_norm 2.6967 (3.1039) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-27 20:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][770/1251] eta 0:02:36 lr 0.000359 wd 0.0500 time 0.2277 (0.3252) data time 0.0008 (0.0079) model time 0.2269 (0.3173) loss 2.9530 (3.1846) grad_norm 2.4835 (3.0986) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][780/1251] eta 0:02:29 lr 0.000359 wd 0.0500 time 0.2323 (0.3176) data time 0.0009 (0.0073) model time 0.2315 (0.3102) loss 3.8263 (3.1881) grad_norm 11.3507 (3.1627) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][790/1251] eta 0:02:23 lr 0.000359 wd 0.0500 time 0.2186 (0.3112) data time 0.0008 (0.0069) model time 0.2179 (0.3043) loss 2.9749 (3.1746) grad_norm 3.0604 (3.1354) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][800/1251] eta 0:02:17 lr 0.000359 wd 0.0500 time 0.2306 (0.3057) data time 0.0008 (0.0066) model time 0.2299 (0.2991) loss 2.8423 (3.1673) grad_norm 2.9573 (3.1348) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][810/1251] eta 0:02:12 lr 0.000359 wd 0.0500 time 0.2255 (0.3007) data time 0.0007 (0.0063) model time 0.2248 (0.2944) loss 3.2734 (3.1692) grad_norm 3.1008 (3.1393) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][820/1251] eta 0:02:07 lr 0.000359 wd 0.0500 time 0.2338 (0.2963) data time 0.0008 (0.0059) model time 0.2330 (0.2903) loss 2.0263 (3.1629) grad_norm 6.0333 (3.1399) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][830/1251] eta 0:02:03 lr 0.000359 wd 0.0500 time 0.2254 (0.2924) data time 0.0007 (0.0057) model time 0.2247 
(0.2867) loss 3.0246 (3.1599) grad_norm 2.5119 (3.1129) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][840/1251] eta 0:01:58 lr 0.000359 wd 0.0500 time 0.2281 (0.2890) data time 0.0007 (0.0055) model time 0.2274 (0.2836) loss 3.2395 (3.1545) grad_norm 3.1176 (3.1091) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][850/1251] eta 0:01:54 lr 0.000359 wd 0.0500 time 0.2277 (0.2859) data time 0.0008 (0.0052) model time 0.2269 (0.2807) loss 3.1260 (3.1360) grad_norm 2.9343 (3.0956) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][860/1251] eta 0:01:50 lr 0.000359 wd 0.0500 time 0.2269 (0.2832) data time 0.0007 (0.0050) model time 0.2262 (0.2782) loss 3.0114 (3.1282) grad_norm 2.3574 (3.0947) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][870/1251] eta 0:01:46 lr 0.000359 wd 0.0500 time 0.2382 (0.2806) data time 0.0007 (0.0049) model time 0.2375 (0.2758) loss 3.2584 (3.1251) grad_norm 2.0412 (3.0727) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][880/1251] eta 0:01:43 lr 0.000359 wd 0.0500 time 0.2329 (0.2783) data time 0.0006 (0.0048) model time 0.2323 (0.2736) loss 2.4617 (3.1214) grad_norm 2.7762 (3.0635) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][890/1251] eta 0:01:39 lr 0.000359 wd 0.0500 time 0.2486 (0.2764) data time 0.0006 (0.0046) model time 0.2480 (0.2718) loss 2.1433 (3.1232) grad_norm 3.2380 (3.0712) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving 
checkpoint and exiting [2024-08-27 20:42:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 20:42:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 20:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:47:03 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 20:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 20:47:11 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 20:47:13 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 20:47:14 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 20:47:14 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 186) [2024-08-27 20:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 20:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:49:13 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 20:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 20:49:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 20:49:25 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 20:49:26 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 20:49:26 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 186) [2024-08-27 20:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 20:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][900/1251] eta 0:21:22 lr 0.000359 wd 0.0500 time 0.2477 (3.6545) data time 0.0008 (0.2726) model time 0.2469 (3.3819) loss 3.4542 (3.3955) grad_norm 4.8153 (3.0611) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][910/1251] eta 0:06:55 lr 0.000359 wd 0.0500 time 0.2473 (1.2196) data time 0.0007 (0.0787) model time 0.2466 (1.1409) loss 3.7064 (3.3213) grad_norm 3.9478 (3.3893) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][920/1251] eta 0:04:29 lr 0.000359 wd 0.0500 time 0.2363 (0.8129) data time 0.0011 (0.0464) model time 0.2351 (0.7665) loss 3.2715 (3.3526) grad_norm 3.9781 (3.3571) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][930/1251] eta 0:03:27 lr 0.000359 wd 0.0500 time 0.2428 (0.6453) data time 0.0011 (0.0331) model time 0.2417 (0.6122) loss 2.1839 (3.3396) grad_norm 3.4000 (3.2209) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:49:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][940/1251] eta 0:02:51 lr 0.000359 wd 0.0500 time 0.2347 (0.5530) data time 0.0008 (0.0259) model time 0.2339 (0.5272) loss 2.8534 (3.2709) grad_norm 2.2312 (3.0898) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][950/1251] eta 0:02:29 lr 0.000359 wd 0.0500 time 0.2492 (0.4953) data time 0.0007 (0.0213) model time 0.2484 (0.4740) loss 3.7797 (3.2593) grad_norm 2.5447 (3.1702) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][960/1251] eta 0:02:12 lr 0.000359 wd 0.0500 time 0.2366 (0.4558) data time 0.0009 (0.0182) model time 0.2357 (0.4376) loss 3.3821 (3.2300) grad_norm 2.9746 (3.1097) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][970/1251] eta 0:01:59 lr 0.000359 wd 0.0500 time 0.2397 (0.4269) data time 0.0010 (0.0159) model time 0.2387 (0.4110) loss 3.2503 (3.2001) grad_norm 2.8072 (3.1158) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][980/1251] eta 0:01:49 lr 0.000358 wd 0.0500 time 0.2480 (0.4050) data time 0.0011 (0.0142) model time 0.2469 (0.3908) loss 3.3967 (3.1732) grad_norm 2.5988 (3.1391) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][990/1251] eta 0:01:41 lr 0.000358 wd 0.0500 time 0.2416 (0.3877) data time 0.0012 (0.0128) model time 0.2404 (0.3749) loss 3.3618 (3.1742) grad_norm 2.2396 (3.1186) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1000/1251] eta 0:01:33 lr 0.000358 wd 0.0500 time 0.2423 (0.3737) data time 0.0010 (0.0117) model time 0.2413 (0.3620) loss 2.7664 (3.1843) 
grad_norm 2.2749 (3.0753) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1010/1251] eta 0:01:27 lr 0.000358 wd 0.0500 time 0.2451 (0.3624) data time 0.0013 (0.0109) model time 0.2439 (0.3515) loss 3.5208 (3.1795) grad_norm 2.7777 (3.0329) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1020/1251] eta 0:01:21 lr 0.000358 wd 0.0500 time 0.2389 (0.3527) data time 0.0012 (0.0101) model time 0.2377 (0.3427) loss 2.9636 (3.1798) grad_norm 5.1376 (3.1184) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1030/1251] eta 0:01:16 lr 0.000358 wd 0.0500 time 0.2533 (0.3446) data time 0.0018 (0.0094) model time 0.2515 (0.3352) loss 3.6802 (3.1790) grad_norm 2.9993 (3.1068) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1040/1251] eta 0:01:11 lr 0.000358 wd 0.0500 time 0.2402 (0.3373) data time 0.0008 (0.0089) model time 0.2394 (0.3284) loss 2.7320 (3.1601) grad_norm 2.3191 (3.1516) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1050/1251] eta 0:01:06 lr 0.000358 wd 0.0500 time 0.2256 (0.3310) data time 0.0010 (0.0084) model time 0.2246 (0.3226) loss 3.2414 (3.1587) grad_norm 2.9668 (3.1063) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1060/1251] eta 0:01:02 lr 0.000358 wd 0.0500 time 0.2417 (0.3255) data time 0.0011 (0.0079) model time 0.2406 (0.3176) loss 3.1282 (3.1494) grad_norm 1.9925 (3.1002) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1070/1251] eta 0:00:58 lr 0.000358 wd 0.0500 
time 0.2364 (0.3207) data time 0.0011 (0.0075) model time 0.2353 (0.3131) loss 1.9083 (3.1388) grad_norm 2.9100 (3.1254) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 20:50:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 20:50:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 20:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 20:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 20:57:56 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 20:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 20:58:05 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 20:58:06 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 20:58:07 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 20:58:07 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 186) [2024-08-27 20:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 20:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1080/1251] eta 0:04:17 lr 0.000358 wd 0.0500 time 0.2249 (1.5053) data time 0.0006 (0.0819) model time 0.2243 (1.4235) loss 3.4793 (3.5233) grad_norm 2.9605 (3.1536) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1090/1251] eta 0:02:14 lr 0.000358 wd 0.0500 time 0.2307 (0.8333) data time 0.0009 (0.0393) model time 0.2298 (0.7941) loss 3.2743 (3.3807) grad_norm 2.4540 (3.1875) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1100/1251] eta 0:01:34 lr 0.000358 wd 0.0500 time 0.2251 (0.6240) data time 0.0006 (0.0261) model time 0.2245 (0.5980) loss 3.8279 (3.4081) grad_norm 3.4772 (3.2511) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1110/1251] eta 0:01:13 lr 0.000358 wd 0.0500 time 0.2279 (0.5218) data time 0.0009 (0.0196) model time 0.2270 (0.5021) loss 3.1075 (3.3266) grad_norm 3.5543 (3.2461) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1120/1251] eta 0:01:00 lr 0.000358 wd 0.0500 time 0.2223 (0.4620) data time 0.0007 (0.0159) model time 0.2216 (0.4461) loss 3.0807 (3.2865) grad_norm 2.7356 (3.1799) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[186/300][1130/1251] eta 0:00:51 lr 0.000358 wd 0.0500 time 0.2252 (0.4217) data time 0.0008 (0.0133) model time 0.2245 (0.4084) loss 2.4379 (3.2484) grad_norm 2.7260 (3.1695) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1140/1251] eta 0:00:43 lr 0.000358 wd 0.0500 time 0.2180 (0.3929) data time 0.0009 (0.0115) model time 0.2171 (0.3814) loss 3.1506 (3.2346) grad_norm 3.9004 (3.1349) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1150/1251] eta 0:00:37 lr 0.000358 wd 0.0500 time 0.2224 (0.3714) data time 0.0008 (0.0102) model time 0.2216 (0.3612) loss 2.9656 (3.2092) grad_norm 2.3838 (3.1284) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1160/1251] eta 0:00:32 lr 0.000358 wd 0.0500 time 0.2258 (0.3547) data time 0.0006 (0.0092) model time 0.2253 (0.3455) loss 3.1286 (3.1809) grad_norm 3.6280 (3.1133) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1170/1251] eta 0:00:27 lr 0.000358 wd 0.0500 time 0.2211 (0.3414) data time 0.0010 (0.0083) model time 0.2201 (0.3330) loss 3.6369 (3.1997) grad_norm 2.6919 (3.0852) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1180/1251] eta 0:00:23 lr 0.000358 wd 0.0500 time 0.2215 (0.3306) data time 0.0007 (0.0077) model time 0.2208 (0.3229) loss 3.9927 (3.2043) grad_norm 2.7607 (3.0664) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1190/1251] eta 0:00:19 lr 0.000358 wd 0.0500 time 0.2195 (0.3216) data time 0.0011 (0.0071) model time 0.2185 (0.3145) loss 3.4852 (3.1932) grad_norm 3.0828 (3.0358) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-27 20:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1200/1251] eta 0:00:16 lr 0.000358 wd 0.0500 time 0.2200 (0.3140) data time 0.0008 (0.0066) model time 0.2192 (0.3074) loss 3.5598 (3.1792) grad_norm 2.5079 (3.0467) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1210/1251] eta 0:00:12 lr 0.000358 wd 0.0500 time 0.2194 (0.3075) data time 0.0007 (0.0062) model time 0.2186 (0.3013) loss 3.9266 (3.1816) grad_norm 3.9333 (3.0283) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1220/1251] eta 0:00:09 lr 0.000357 wd 0.0500 time 0.2194 (0.3019) data time 0.0007 (0.0058) model time 0.2187 (0.2961) loss 3.1425 (3.1647) grad_norm 2.4924 (3.0741) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1230/1251] eta 0:00:06 lr 0.000357 wd 0.0500 time 0.2168 (0.2968) data time 0.0007 (0.0055) model time 0.2161 (0.2913) loss 3.5719 (3.1613) grad_norm 3.3764 (3.0737) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1240/1251] eta 0:00:03 lr 0.000357 wd 0.0500 time 0.2140 (0.2923) data time 0.0004 (0.0053) model time 0.2136 (0.2870) loss 2.8550 (3.1700) grad_norm 2.6558 (3.0631) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [186/300][1250/1251] eta 0:00:00 lr 0.000357 wd 0.0500 time 0.2125 (0.2881) data time 0.0003 (0.0050) model time 0.2122 (0.2830) loss 3.0925 (3.1542) grad_norm 2.9890 (3.0534) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 20:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 186 training takes 0:00:51 [2024-08-27 20:59:03 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 20:59:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 20:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.341 (0.341) Loss 0.4551 (0.4551) Acc@1 91.211 (91.211) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-27 20:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.095) Loss 0.7036 (0.6886) Acc@1 86.035 (85.121) Acc@5 96.484 (96.999) Mem 7377MB [2024-08-27 20:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.084) Loss 0.9585 (0.7056) Acc@1 75.684 (84.333) Acc@5 95.215 (97.108) Mem 7377MB [2024-08-27 20:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.066 (0.079) Loss 1.1992 (0.8026) Acc@1 71.484 (82.142) Acc@5 90.820 (95.946) Mem 7377MB [2024-08-27 20:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.075) Loss 1.0469 (0.8524) Acc@1 75.293 (80.905) Acc@5 93.652 (95.413) Mem 7377MB [2024-08-27 20:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.442 Acc@5 95.356 [2024-08-27 20:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.4% [2024-08-27 20:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.44% [2024-08-27 20:59:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 20:59:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 20:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.357 (0.357) Loss 0.3901 (0.3901) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-27 20:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.095) Loss 0.6074 (0.6192) Acc@1 87.988 (86.825) Acc@5 97.363 (97.532) Mem 7377MB [2024-08-27 20:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.065 (0.083) Loss 0.8765 (0.6433) Acc@1 78.125 (85.896) Acc@5 96.191 (97.503) Mem 7377MB [2024-08-27 20:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.079) Loss 1.1133 (0.7291) Acc@1 73.535 (83.855) Acc@5 93.262 (96.610) Mem 7377MB [2024-08-27 20:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.075) Loss 0.9907 (0.7726) Acc@1 75.684 (82.565) Acc@5 93.848 (96.146) Mem 7377MB [2024-08-27 20:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.098 Acc@5 96.130 [2024-08-27 20:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-08-27 20:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.10% [2024-08-27 20:59:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 20:59:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 20:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][0/1251] eta 0:13:25 lr 0.000357 wd 0.0500 time 0.6437 (0.6437) data time 0.4093 (0.4093) model time 0.0000 (0.0000) loss 2.4342 (2.4342) grad_norm 2.2913 (2.2913) loss_scale 512.0000 (512.0000) mem 7380MB [2024-08-27 20:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][10/1251] eta 0:05:29 lr 0.000357 wd 0.0500 time 0.2273 (0.2658) data time 0.0008 (0.0382) model time 0.0000 (0.0000) loss 2.3569 (3.0705) grad_norm 2.4412 (2.9118) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][20/1251] eta 0:05:03 lr 0.000357 wd 0.0500 time 0.2229 (0.2469) data time 0.0008 (0.0205) model time 0.0000 (0.0000) loss 3.5785 (2.9859) grad_norm 2.5499 (2.9051) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][30/1251] eta 0:05:01 lr 0.000357 wd 0.0500 time 0.2033 (0.2469) data time 0.0006 (0.0143) model time 0.0000 (0.0000) loss 3.1089 (2.9648) grad_norm 3.4226 (2.8900) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][40/1251] eta 0:04:52 lr 0.000357 wd 0.0500 time 0.2223 (0.2416) data time 0.0008 (0.0110) model time 0.0000 (0.0000) loss 2.6209 (2.9672) grad_norm 2.5697 (3.0040) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][50/1251] eta 0:04:45 lr 0.000357 wd 0.0500 time 0.2210 (0.2381) data time 0.0009 (0.0092) model time 0.0000 (0.0000) loss 3.4479 (2.9926) grad_norm 3.2552 (3.0376) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][60/1251] eta 0:04:40 lr 0.000357 wd 0.0500 time 0.2250 (0.2356) data time 0.0008 (0.0079) model time 0.2242 (0.2220) loss 
3.3534 (2.9861) grad_norm 2.1031 (3.4329) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][70/1251] eta 0:04:38 lr 0.000357 wd 0.0500 time 0.2220 (0.2356) data time 0.0006 (0.0072) model time 0.2214 (0.2271) loss 2.0948 (2.9682) grad_norm 2.8412 (3.3665) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][80/1251] eta 0:04:33 lr 0.000357 wd 0.0500 time 0.2215 (0.2339) data time 0.0008 (0.0064) model time 0.2207 (0.2251) loss 2.3747 (2.9575) grad_norm 2.8133 (3.4182) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][90/1251] eta 0:04:30 lr 0.000357 wd 0.0500 time 0.2178 (0.2327) data time 0.0006 (0.0058) model time 0.2171 (0.2242) loss 4.0089 (2.9525) grad_norm 2.2153 (3.3583) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][100/1251] eta 0:04:26 lr 0.000357 wd 0.0500 time 0.2190 (0.2318) data time 0.0007 (0.0053) model time 0.2182 (0.2239) loss 2.7738 (2.9635) grad_norm 1.9483 (3.2709) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][110/1251] eta 0:04:26 lr 0.000357 wd 0.0500 time 0.2236 (0.2334) data time 0.0009 (0.0049) model time 0.2227 (0.2282) loss 2.2615 (2.9673) grad_norm 2.4563 (3.2087) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][120/1251] eta 0:04:23 lr 0.000357 wd 0.0500 time 0.2264 (0.2327) data time 0.0006 (0.0046) model time 0.2258 (0.2275) loss 2.4512 (2.9383) grad_norm 3.2392 (3.1914) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][130/1251] eta 0:04:20 lr 0.000357 wd 
0.0500 time 0.2286 (0.2320) data time 0.0008 (0.0043) model time 0.2278 (0.2270) loss 2.6176 (2.9430) grad_norm 2.6352 (3.1712) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][140/1251] eta 0:04:17 lr 0.000357 wd 0.0500 time 0.2283 (0.2314) data time 0.0009 (0.0041) model time 0.2274 (0.2265) loss 2.8567 (2.9695) grad_norm 2.9230 (3.1617) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][150/1251] eta 0:04:14 lr 0.000357 wd 0.0500 time 0.2225 (0.2310) data time 0.0008 (0.0039) model time 0.2217 (0.2262) loss 3.3456 (2.9855) grad_norm 3.1096 (3.1518) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 20:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][160/1251] eta 0:04:11 lr 0.000357 wd 0.0500 time 0.2348 (0.2306) data time 0.0008 (0.0037) model time 0.2340 (0.2259) loss 3.1964 (2.9876) grad_norm 2.7330 (3.1307) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][170/1251] eta 0:04:08 lr 0.000357 wd 0.0500 time 0.2286 (0.2302) data time 0.0007 (0.0036) model time 0.2278 (0.2257) loss 2.4998 (2.9983) grad_norm 1.8809 (3.1115) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][180/1251] eta 0:04:06 lr 0.000357 wd 0.0500 time 0.2375 (0.2299) data time 0.0008 (0.0034) model time 0.2368 (0.2256) loss 3.6108 (3.0045) grad_norm 3.8388 (3.1131) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][190/1251] eta 0:04:03 lr 0.000357 wd 0.0500 time 0.2183 (0.2295) data time 0.0008 (0.0033) model time 0.2175 (0.2253) loss 3.3100 (3.0062) grad_norm 5.2887 (3.1305) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [187/300][200/1251] eta 0:04:00 lr 0.000356 wd 0.0500 time 0.2247 (0.2292) data time 0.0006 (0.0032) model time 0.2241 (0.2251) loss 3.0355 (3.0115) grad_norm 1.9730 (3.1029) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][210/1251] eta 0:03:58 lr 0.000356 wd 0.0500 time 0.2214 (0.2289) data time 0.0010 (0.0031) model time 0.2205 (0.2249) loss 2.1323 (3.0010) grad_norm 2.3881 (3.0909) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][220/1251] eta 0:03:55 lr 0.000356 wd 0.0500 time 0.2384 (0.2288) data time 0.0009 (0.0030) model time 0.2375 (0.2249) loss 2.9522 (3.0128) grad_norm 1.8788 (3.0636) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][230/1251] eta 0:03:53 lr 0.000356 wd 0.0500 time 0.2293 (0.2287) data time 0.0007 (0.0029) model time 0.2286 (0.2249) loss 3.6153 (3.0252) grad_norm 2.1858 (3.0443) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][240/1251] eta 0:03:51 lr 0.000356 wd 0.0500 time 0.2267 (0.2287) data time 0.0006 (0.0028) model time 0.2262 (0.2250) loss 3.2369 (3.0228) grad_norm 2.9799 (3.0586) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][250/1251] eta 0:03:48 lr 0.000356 wd 0.0500 time 0.2268 (0.2286) data time 0.0009 (0.0027) model time 0.2259 (0.2251) loss 3.4622 (3.0349) grad_norm 1.9048 (3.0541) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][260/1251] eta 0:03:46 lr 0.000356 wd 0.0500 time 0.2336 (0.2285) data time 0.0010 (0.0027) model time 0.2326 (0.2252) loss 2.7840 (3.0425) grad_norm 2.1776 
(3.0439) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][270/1251] eta 0:03:44 lr 0.000356 wd 0.0500 time 0.2613 (0.2286) data time 0.0006 (0.0026) model time 0.2607 (0.2254) loss 3.4841 (3.0469) grad_norm 2.7512 (3.0353) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][280/1251] eta 0:03:42 lr 0.000356 wd 0.0500 time 0.2407 (0.2287) data time 0.0006 (0.0026) model time 0.2401 (0.2255) loss 3.1012 (3.0401) grad_norm 5.6672 (3.0519) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][290/1251] eta 0:03:39 lr 0.000356 wd 0.0500 time 0.2424 (0.2286) data time 0.0008 (0.0026) model time 0.2416 (0.2255) loss 2.7214 (3.0274) grad_norm 2.4515 (3.0683) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][300/1251] eta 0:03:37 lr 0.000356 wd 0.0500 time 0.2425 (0.2286) data time 0.0008 (0.0025) model time 0.2417 (0.2255) loss 2.4383 (3.0234) grad_norm 2.3853 (3.0692) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][310/1251] eta 0:03:35 lr 0.000356 wd 0.0500 time 0.2544 (0.2286) data time 0.0007 (0.0025) model time 0.2537 (0.2256) loss 2.6317 (3.0310) grad_norm 2.9243 (3.0647) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][320/1251] eta 0:03:32 lr 0.000356 wd 0.0500 time 0.2370 (0.2285) data time 0.0008 (0.0024) model time 0.2362 (0.2256) loss 2.3481 (3.0298) grad_norm 4.0925 (3.0564) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][330/1251] eta 0:03:30 lr 0.000356 wd 0.0500 time 0.2436 (0.2285) data 
time 0.0008 (0.0024) model time 0.2428 (0.2256) loss 3.8043 (3.0366) grad_norm 2.4794 (3.0395) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][340/1251] eta 0:03:28 lr 0.000356 wd 0.0500 time 0.2401 (0.2285) data time 0.0008 (0.0023) model time 0.2393 (0.2257) loss 3.0585 (3.0419) grad_norm 2.0837 (3.0248) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][350/1251] eta 0:03:25 lr 0.000356 wd 0.0500 time 0.2395 (0.2285) data time 0.0008 (0.0023) model time 0.2387 (0.2257) loss 3.3949 (3.0345) grad_norm 2.4485 (3.0123) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][360/1251] eta 0:03:23 lr 0.000356 wd 0.0500 time 0.2390 (0.2285) data time 0.0007 (0.0023) model time 0.2383 (0.2258) loss 3.3922 (3.0333) grad_norm 3.0430 (3.0106) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][370/1251] eta 0:03:21 lr 0.000356 wd 0.0500 time 0.2242 (0.2284) data time 0.0008 (0.0023) model time 0.2234 (0.2257) loss 2.3723 (3.0343) grad_norm 2.5266 (3.0018) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][380/1251] eta 0:03:18 lr 0.000356 wd 0.0500 time 0.2285 (0.2284) data time 0.0009 (0.0022) model time 0.2276 (0.2257) loss 3.6402 (3.0444) grad_norm 2.5110 (2.9891) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][390/1251] eta 0:03:16 lr 0.000356 wd 0.0500 time 0.2313 (0.2283) data time 0.0008 (0.0022) model time 0.2305 (0.2257) loss 3.3236 (3.0476) grad_norm 2.7501 (3.0288) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [187/300][400/1251] eta 0:03:14 lr 0.000356 wd 0.0500 time 0.2257 (0.2283) data time 0.0006 (0.0022) model time 0.2252 (0.2257) loss 2.6887 (3.0502) grad_norm 2.4796 (3.0228) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][410/1251] eta 0:03:11 lr 0.000356 wd 0.0500 time 0.2348 (0.2282) data time 0.0006 (0.0021) model time 0.2342 (0.2257) loss 3.5110 (3.0539) grad_norm 3.3662 (3.0204) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][420/1251] eta 0:03:09 lr 0.000356 wd 0.0500 time 0.2329 (0.2282) data time 0.0009 (0.0021) model time 0.2321 (0.2257) loss 3.8142 (3.0582) grad_norm 2.0125 (3.0101) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][430/1251] eta 0:03:07 lr 0.000356 wd 0.0500 time 0.2282 (0.2281) data time 0.0007 (0.0021) model time 0.2275 (0.2257) loss 3.0333 (3.0546) grad_norm 3.2759 (3.0048) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][440/1251] eta 0:03:04 lr 0.000355 wd 0.0500 time 0.2328 (0.2280) data time 0.0008 (0.0020) model time 0.2320 (0.2256) loss 3.3459 (3.0557) grad_norm 2.4497 (3.0077) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][450/1251] eta 0:03:02 lr 0.000355 wd 0.0500 time 0.2288 (0.2280) data time 0.0007 (0.0020) model time 0.2281 (0.2256) loss 3.5074 (3.0583) grad_norm 2.7054 (3.0164) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][460/1251] eta 0:03:00 lr 0.000355 wd 0.0500 time 0.2369 (0.2280) data time 0.0007 (0.0020) model time 0.2362 (0.2256) loss 1.9737 (3.0595) grad_norm 3.0624 (3.0157) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-27 21:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][470/1251] eta 0:02:58 lr 0.000355 wd 0.0500 time 0.2544 (0.2280) data time 0.0009 (0.0020) model time 0.2535 (0.2257) loss 3.4860 (3.0577) grad_norm 2.2883 (3.0103) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][480/1251] eta 0:02:55 lr 0.000355 wd 0.0500 time 0.2334 (0.2281) data time 0.0006 (0.0020) model time 0.2328 (0.2258) loss 2.8528 (3.0548) grad_norm 2.0845 (3.0069) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][490/1251] eta 0:02:53 lr 0.000355 wd 0.0500 time 0.2320 (0.2282) data time 0.0006 (0.0019) model time 0.2313 (0.2259) loss 3.7333 (3.0574) grad_norm 6.0731 (3.0132) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][500/1251] eta 0:02:51 lr 0.000355 wd 0.0500 time 0.2285 (0.2282) data time 0.0008 (0.0019) model time 0.2276 (0.2260) loss 2.8558 (3.0613) grad_norm 5.2508 (3.0167) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][510/1251] eta 0:02:49 lr 0.000355 wd 0.0500 time 0.2249 (0.2282) data time 0.0007 (0.0019) model time 0.2242 (0.2260) loss 1.9521 (3.0592) grad_norm 2.5911 (3.0153) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][520/1251] eta 0:02:46 lr 0.000355 wd 0.0500 time 0.2257 (0.2281) data time 0.0008 (0.0019) model time 0.2249 (0.2260) loss 2.8151 (3.0549) grad_norm 2.0441 (3.0146) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][530/1251] eta 0:02:44 lr 0.000355 wd 0.0500 time 0.2316 (0.2281) data time 0.0008 (0.0019) model 
time 0.2309 (0.2260) loss 3.4391 (3.0557) grad_norm 2.1806 (3.0239) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][540/1251] eta 0:02:42 lr 0.000355 wd 0.0500 time 0.2284 (0.2281) data time 0.0008 (0.0018) model time 0.2276 (0.2260) loss 2.4800 (3.0494) grad_norm 2.0431 (3.0207) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][550/1251] eta 0:02:39 lr 0.000355 wd 0.0500 time 0.2334 (0.2280) data time 0.0009 (0.0018) model time 0.2325 (0.2259) loss 3.3404 (3.0501) grad_norm 1.7805 (3.0123) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][560/1251] eta 0:02:37 lr 0.000355 wd 0.0500 time 0.2257 (0.2280) data time 0.0009 (0.0018) model time 0.2248 (0.2259) loss 3.4112 (3.0563) grad_norm 2.9211 (3.0040) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][570/1251] eta 0:02:35 lr 0.000355 wd 0.0500 time 0.2188 (0.2283) data time 0.0018 (0.0018) model time 0.2170 (0.2262) loss 3.0319 (3.0552) grad_norm 2.7873 (2.9990) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][580/1251] eta 0:02:33 lr 0.000355 wd 0.0500 time 0.2204 (0.2282) data time 0.0008 (0.0018) model time 0.2195 (0.2262) loss 3.4989 (3.0566) grad_norm 3.3419 (2.9963) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][590/1251] eta 0:02:30 lr 0.000355 wd 0.0500 time 0.2107 (0.2282) data time 0.0009 (0.0018) model time 0.2098 (0.2262) loss 3.3045 (3.0588) grad_norm 3.1219 (3.0009) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][600/1251] 
eta 0:02:28 lr 0.000355 wd 0.0500 time 0.2312 (0.2282) data time 0.0008 (0.0017) model time 0.2304 (0.2262) loss 3.4239 (3.0593) grad_norm 2.7844 (3.0122) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][610/1251] eta 0:02:26 lr 0.000355 wd 0.0500 time 0.2157 (0.2282) data time 0.0009 (0.0017) model time 0.2148 (0.2262) loss 2.8990 (3.0590) grad_norm 6.3491 (3.0220) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][620/1251] eta 0:02:23 lr 0.000355 wd 0.0500 time 0.2189 (0.2281) data time 0.0009 (0.0017) model time 0.2180 (0.2262) loss 2.8722 (3.0584) grad_norm 3.1216 (3.0194) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][630/1251] eta 0:02:21 lr 0.000355 wd 0.0500 time 0.2190 (0.2286) data time 0.0007 (0.0017) model time 0.2183 (0.2268) loss 2.6490 (3.0527) grad_norm 2.2664 (3.0154) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][640/1251] eta 0:02:19 lr 0.000355 wd 0.0500 time 0.2374 (0.2286) data time 0.0008 (0.0017) model time 0.2366 (0.2268) loss 1.6478 (3.0525) grad_norm 2.4067 (3.0154) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][650/1251] eta 0:02:17 lr 0.000355 wd 0.0500 time 0.2245 (0.2286) data time 0.0008 (0.0017) model time 0.2237 (0.2267) loss 3.4786 (3.0526) grad_norm 2.8448 (3.0144) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][660/1251] eta 0:02:15 lr 0.000355 wd 0.0500 time 0.2370 (0.2286) data time 0.0007 (0.0017) model time 0.2363 (0.2268) loss 2.6535 (3.0501) grad_norm 2.7340 (3.0184) loss_scale 1024.0000 (515.0983) mem 7381MB [2024-08-27 
21:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][670/1251] eta 0:02:12 lr 0.000354 wd 0.0500 time 0.2256 (0.2286) data time 0.0006 (0.0017) model time 0.2250 (0.2267) loss 3.9988 (3.0490) grad_norm 2.2688 (3.0134) loss_scale 1024.0000 (522.6826) mem 7381MB [2024-08-27 21:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][680/1251] eta 0:02:10 lr 0.000354 wd 0.0500 time 0.2278 (0.2286) data time 0.0007 (0.0017) model time 0.2271 (0.2267) loss 2.5079 (3.0488) grad_norm 3.6668 (3.0154) loss_scale 1024.0000 (530.0441) mem 7381MB [2024-08-27 21:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][690/1251] eta 0:02:08 lr 0.000354 wd 0.0500 time 0.2215 (0.2286) data time 0.0006 (0.0017) model time 0.2209 (0.2267) loss 3.3707 (3.0501) grad_norm 2.2099 (3.0087) loss_scale 1024.0000 (537.1925) mem 7381MB [2024-08-27 21:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 21:01:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:02:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 21:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 21:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 21:04:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 21:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 21:05:06 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 21:05:07 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 21:05:08 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 21:05:08 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 187) [2024-08-27 21:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 21:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][700/1251] eta 0:26:02 lr 0.000354 wd 0.0500 time 0.2296 (2.8358) data time 0.0008 (0.2934) model time 0.2288 (2.5424) loss 3.6080 (3.6245) grad_norm 2.0445 (2.5356) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][710/1251] eta 0:09:55 lr 0.000354 wd 0.0500 time 0.2271 (1.1010) data time 0.0009 (0.0988) model time 0.2262 (1.0022) loss 3.2237 (3.3010) grad_norm 2.5197 (2.5343) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][720/1251] eta 0:06:38 lr 0.000354 wd 0.0500 time 0.2237 (0.7513) data time 0.0011 (0.0597) model time 0.2225 (0.6916) loss 3.3730 (3.2759) grad_norm 8.5149 (3.0108) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][730/1251] eta 0:05:13 lr 0.000354 wd 0.0500 time 0.2250 (0.6024) data time 0.0009 (0.0430) model time 0.2241 (0.5594) loss 3.1793 (3.2324) grad_norm 2.5359 (3.0557) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][740/1251] eta 0:04:25 lr 0.000354 wd 0.0500 time 0.2291 (0.5203) data time 0.0009 (0.0338) model time 0.2281 (0.4864) loss 3.2313 (3.1938) grad_norm 2.5846 (3.0331) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][750/1251] eta 0:03:54 lr 0.000354 wd 0.0500 time 0.2353 (0.4677) data time 0.0009 (0.0279) model time 0.2343 (0.4397) loss 2.3711 (3.1989) grad_norm 2.3254 (2.9846) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][760/1251] eta 0:03:31 lr 0.000354 wd 0.0500 time 0.2258 (0.4311) data time 0.0013 (0.0238) model time 0.2245 (0.4073) loss 3.3755 (3.1843) grad_norm 2.3500 (2.9245) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][770/1251] eta 0:03:14 lr 0.000354 wd 0.0500 time 0.2291 (0.4042) data time 0.0010 (0.0208) model time 0.2281 (0.3834) loss 2.3084 (3.1485) grad_norm 3.4494 (3.0037) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][780/1251] eta 0:03:00 lr 0.000354 wd 0.0500 time 0.2468 (0.3835) data time 0.0007 (0.0185) model time 0.2461 (0.3650) loss 3.0249 (3.1188) grad_norm 2.3375 (2.9469) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][790/1251] eta 0:02:49 lr 0.000354 wd 0.0500 time 0.2436 (0.3673) data time 0.0009 (0.0167) model time 0.2427 (0.3506) loss 3.1579 (3.1144) grad_norm 2.8421 (2.9770) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][800/1251] eta 0:02:39 lr 0.000354 wd 0.0500 time 0.2220 (0.3543) data time 0.0014 (0.0152) model time 0.2206 (0.3390) loss 2.8935 
(3.1434) grad_norm 3.1534 (2.9869) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][810/1251] eta 0:02:31 lr 0.000354 wd 0.0500 time 0.2337 (0.3435) data time 0.0011 (0.0140) model time 0.2326 (0.3295) loss 2.4020 (3.1360) grad_norm 2.3361 (2.9368) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][820/1251] eta 0:02:24 lr 0.000354 wd 0.0500 time 0.2301 (0.3347) data time 0.0009 (0.0130) model time 0.2292 (0.3218) loss 3.0851 (3.1346) grad_norm 2.6267 (2.9229) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][830/1251] eta 0:02:17 lr 0.000354 wd 0.0500 time 0.2326 (0.3269) data time 0.0007 (0.0121) model time 0.2319 (0.3149) loss 3.0164 (3.1373) grad_norm 1.9589 (2.8939) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][840/1251] eta 0:02:11 lr 0.000354 wd 0.0500 time 0.2273 (0.3201) data time 0.0012 (0.0113) model time 0.2261 (0.3087) loss 3.2018 (3.1329) grad_norm 3.0748 (2.8760) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][850/1251] eta 0:02:05 lr 0.000354 wd 0.0500 time 0.2300 (0.3141) data time 0.0011 (0.0107) model time 0.2290 (0.3034) loss 3.1335 (3.1260) grad_norm 2.2686 (2.8595) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][860/1251] eta 0:02:00 lr 0.000354 wd 0.0500 time 0.2255 (0.3091) data time 0.0010 (0.0101) model time 0.2245 (0.2990) loss 3.4723 (3.1294) grad_norm 3.2241 (2.8849) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][870/1251] eta 0:01:56 lr 
0.000354 wd 0.0500 time 0.2337 (0.3046) data time 0.0008 (0.0096) model time 0.2329 (0.2950) loss 2.9513 (3.1156) grad_norm 2.8579 (2.9107) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][880/1251] eta 0:01:51 lr 0.000354 wd 0.0500 time 0.2305 (0.3005) data time 0.0011 (0.0091) model time 0.2293 (0.2913) loss 2.9950 (3.1132) grad_norm 2.2671 (2.8956) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][890/1251] eta 0:01:47 lr 0.000354 wd 0.0500 time 0.2331 (0.2970) data time 0.0014 (0.0087) model time 0.2318 (0.2883) loss 1.8988 (3.1072) grad_norm 2.5458 (2.8813) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][900/1251] eta 0:01:43 lr 0.000354 wd 0.0500 time 0.2316 (0.2938) data time 0.0015 (0.0084) model time 0.2301 (0.2854) loss 2.8111 (3.0935) grad_norm 2.2103 (2.9006) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][910/1251] eta 0:01:39 lr 0.000353 wd 0.0500 time 0.2334 (0.2909) data time 0.0011 (0.0080) model time 0.2324 (0.2828) loss 2.9519 (3.0901) grad_norm 2.7110 (2.9202) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][920/1251] eta 0:01:35 lr 0.000353 wd 0.0500 time 0.2219 (0.2883) data time 0.0011 (0.0077) model time 0.2209 (0.2806) loss 3.4135 (3.0910) grad_norm 2.4657 (2.9911) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][930/1251] eta 0:01:31 lr 0.000353 wd 0.0500 time 0.2230 (0.2858) data time 0.0011 (0.0075) model time 0.2219 (0.2784) loss 3.6008 (3.0812) grad_norm 3.2116 (3.0137) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 
21:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][940/1251] eta 0:01:28 lr 0.000353 wd 0.0500 time 0.2286 (0.2834) data time 0.0009 (0.0072) model time 0.2276 (0.2762) loss 3.0530 (3.0884) grad_norm 2.3449 (3.0113) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][950/1251] eta 0:01:24 lr 0.000353 wd 0.0500 time 0.2399 (0.2813) data time 0.0007 (0.0070) model time 0.2392 (0.2743) loss 2.6047 (3.0794) grad_norm 2.3503 (3.0010) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][960/1251] eta 0:01:21 lr 0.000353 wd 0.0500 time 0.2287 (0.2796) data time 0.0007 (0.0068) model time 0.2280 (0.2728) loss 2.1471 (3.0697) grad_norm 2.2208 (2.9962) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][970/1251] eta 0:01:18 lr 0.000353 wd 0.0500 time 0.2258 (0.2779) data time 0.0010 (0.0066) model time 0.2248 (0.2712) loss 3.1333 (3.0706) grad_norm 2.8735 (2.9959) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][980/1251] eta 0:01:14 lr 0.000353 wd 0.0500 time 0.2371 (0.2761) data time 0.0009 (0.0065) model time 0.2362 (0.2697) loss 3.1153 (3.0665) grad_norm 3.1120 (2.9823) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][990/1251] eta 0:01:11 lr 0.000353 wd 0.0500 time 0.2394 (0.2753) data time 0.0010 (0.0063) model time 0.2384 (0.2690) loss 3.1824 (3.0641) grad_norm 3.7186 (3.0165) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1000/1251] eta 0:01:08 lr 0.000353 wd 0.0500 time 0.2299 (0.2738) data time 0.0009 (0.0061) model time 0.2290 (0.2677) 
loss 2.4193 (3.0580) grad_norm 4.1248 (3.0514) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1010/1251] eta 0:01:05 lr 0.000353 wd 0.0500 time 0.2251 (0.2732) data time 0.0013 (0.0060) model time 0.2238 (0.2672) loss 3.4274 (3.0608) grad_norm 2.5200 (3.0561) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1020/1251] eta 0:01:02 lr 0.000353 wd 0.0500 time 0.2338 (0.2719) data time 0.0008 (0.0058) model time 0.2330 (0.2661) loss 3.4479 (3.0715) grad_norm 3.1666 (3.0517) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1030/1251] eta 0:00:59 lr 0.000353 wd 0.0500 time 0.2287 (0.2707) data time 0.0007 (0.0057) model time 0.2280 (0.2650) loss 3.6225 (3.0696) grad_norm 2.1798 (3.0340) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1040/1251] eta 0:00:56 lr 0.000353 wd 0.0500 time 0.2285 (0.2695) data time 0.0006 (0.0056) model time 0.2279 (0.2640) loss 2.2821 (3.0716) grad_norm 2.5156 (3.0257) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1050/1251] eta 0:00:53 lr 0.000353 wd 0.0500 time 0.2269 (0.2684) data time 0.0010 (0.0054) model time 0.2259 (0.2630) loss 2.8648 (3.0770) grad_norm 2.7030 (3.0161) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1060/1251] eta 0:00:51 lr 0.000353 wd 0.0500 time 0.2190 (0.2673) data time 0.0013 (0.0053) model time 0.2177 (0.2620) loss 2.9214 (3.0709) grad_norm 2.8118 (3.0130) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1070/1251] 
eta 0:00:48 lr 0.000353 wd 0.0500 time 0.2429 (0.2664) data time 0.0007 (0.0052) model time 0.2422 (0.2612) loss 2.9619 (3.0688) grad_norm 2.8857 (3.0092) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1080/1251] eta 0:00:45 lr 0.000353 wd 0.0500 time 0.2227 (0.2655) data time 0.0007 (0.0051) model time 0.2220 (0.2604) loss 2.1790 (3.0615) grad_norm 2.6568 (3.0125) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1090/1251] eta 0:00:42 lr 0.000353 wd 0.0500 time 0.2277 (0.2646) data time 0.0009 (0.0050) model time 0.2267 (0.2596) loss 3.5352 (3.0663) grad_norm 3.7950 (3.0198) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1100/1251] eta 0:00:39 lr 0.000353 wd 0.0500 time 0.2227 (0.2638) data time 0.0009 (0.0049) model time 0.2218 (0.2589) loss 3.2775 (3.0704) grad_norm 2.2597 (3.0335) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1110/1251] eta 0:00:37 lr 0.000353 wd 0.0500 time 0.2333 (0.2630) data time 0.0009 (0.0048) model time 0.2324 (0.2582) loss 3.6989 (3.0761) grad_norm 2.4128 (3.0226) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1120/1251] eta 0:00:34 lr 0.000353 wd 0.0500 time 0.2300 (0.2622) data time 0.0007 (0.0047) model time 0.2294 (0.2575) loss 3.2880 (3.0757) grad_norm 5.0894 (3.0199) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1130/1251] eta 0:00:31 lr 0.000353 wd 0.0500 time 0.2257 (0.2615) data time 0.0008 (0.0047) model time 0.2249 (0.2569) loss 3.3194 (3.0820) grad_norm 3.5523 (3.0293) loss_scale 1024.0000 (1024.0000) mem 
7374MB [2024-08-27 21:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1140/1251] eta 0:00:28 lr 0.000353 wd 0.0500 time 0.2266 (0.2608) data time 0.0014 (0.0046) model time 0.2252 (0.2563) loss 3.3556 (3.0850) grad_norm 2.9648 (3.0454) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1150/1251] eta 0:00:26 lr 0.000352 wd 0.0500 time 0.2232 (0.2601) data time 0.0010 (0.0045) model time 0.2222 (0.2556) loss 3.1503 (3.0815) grad_norm 3.5456 (3.0450) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 21:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1160/1251] eta 0:00:23 lr 0.000352 wd 0.0500 time 0.2295 (0.2595) data time 0.0008 (0.0045) model time 0.2287 (0.2550) loss 2.2690 (3.0758) grad_norm 2.0450 (inf) loss_scale 512.0000 (1021.7978) mem 7374MB [2024-08-27 21:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1170/1251] eta 0:00:20 lr 0.000352 wd 0.0500 time 0.2261 (0.2589) data time 0.0007 (0.0044) model time 0.2254 (0.2545) loss 2.8581 (3.0688) grad_norm 2.1304 (inf) loss_scale 512.0000 (1011.0653) mem 7374MB [2024-08-27 21:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1180/1251] eta 0:00:18 lr 0.000352 wd 0.0500 time 0.2289 (0.2584) data time 0.0007 (0.0043) model time 0.2282 (0.2540) loss 3.5232 (3.0703) grad_norm 3.4443 (inf) loss_scale 512.0000 (1000.7753) mem 7374MB [2024-08-27 21:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1190/1251] eta 0:00:15 lr 0.000352 wd 0.0500 time 0.2271 (0.2578) data time 0.0008 (0.0042) model time 0.2262 (0.2536) loss 3.1624 (3.0699) grad_norm 3.8539 (inf) loss_scale 512.0000 (990.9010) mem 7374MB [2024-08-27 21:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1200/1251] eta 0:00:13 lr 0.000352 wd 0.0500 time 0.2423 (0.2573) data time 0.0014 (0.0042) model time 0.2408 
(0.2531) loss 3.6346 (3.0695) grad_norm 1.7700 (inf) loss_scale 512.0000 (981.4178) mem 7374MB [2024-08-27 21:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1210/1251] eta 0:00:10 lr 0.000352 wd 0.0500 time 0.2265 (0.2568) data time 0.0011 (0.0041) model time 0.2254 (0.2526) loss 2.9837 (3.0762) grad_norm 3.5617 (inf) loss_scale 512.0000 (972.3029) mem 7374MB [2024-08-27 21:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1220/1251] eta 0:00:07 lr 0.000352 wd 0.0500 time 0.2266 (0.2563) data time 0.0009 (0.0041) model time 0.2256 (0.2523) loss 3.8184 (3.0712) grad_norm 2.0235 (inf) loss_scale 512.0000 (963.5352) mem 7374MB [2024-08-27 21:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1230/1251] eta 0:00:05 lr 0.000352 wd 0.0500 time 0.2285 (0.2559) data time 0.0008 (0.0040) model time 0.2277 (0.2519) loss 2.5634 (3.0687) grad_norm 3.3701 (inf) loss_scale 512.0000 (955.0953) mem 7374MB [2024-08-27 21:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1240/1251] eta 0:00:02 lr 0.000352 wd 0.0500 time 0.2104 (0.2552) data time 0.0004 (0.0040) model time 0.2100 (0.2513) loss 2.6178 (3.0686) grad_norm 3.0093 (inf) loss_scale 512.0000 (946.9651) mem 7374MB [2024-08-27 21:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [187/300][1250/1251] eta 0:00:00 lr 0.000352 wd 0.0500 time 0.2108 (0.2544) data time 0.0006 (0.0039) model time 0.2102 (0.2505) loss 3.1995 (3.0741) grad_norm 2.6356 (inf) loss_scale 512.0000 (939.1279) mem 7374MB [2024-08-27 21:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 187 training takes 0:02:21 [2024-08-27 21:07:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:07:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 21:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.470 (0.470) Loss 0.4324 (0.4324) Acc@1 91.699 (91.699) Acc@5 98.047 (98.047) Mem 7374MB [2024-08-27 21:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.116) Loss 0.6973 (0.6847) Acc@1 85.742 (85.112) Acc@5 97.070 (97.106) Mem 7374MB [2024-08-27 21:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.101) Loss 0.9341 (0.7093) Acc@1 77.246 (84.245) Acc@5 95.312 (97.056) Mem 7374MB [2024-08-27 21:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.088 (0.095) Loss 1.2480 (0.8071) Acc@1 69.922 (81.912) Acc@5 91.016 (95.993) Mem 7374MB [2024-08-27 21:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.0547 (0.8572) Acc@1 74.707 (80.609) Acc@5 93.359 (95.374) Mem 7374MB [2024-08-27 21:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.246 Acc@5 95.328 [2024-08-27 21:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.2% [2024-08-27 21:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.947 (0.947) Loss 0.3896 (0.3896) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7374MB [2024-08-27 21:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.164) Loss 0.6074 (0.6190) Acc@1 88.184 (86.861) Acc@5 97.461 (97.514) Mem 7374MB [2024-08-27 21:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.126) Loss 0.8765 (0.6430) Acc@1 78.223 (85.924) Acc@5 96.191 (97.512) Mem 7374MB [2024-08-27 21:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.087 (0.111) Loss 1.1143 (0.7285) Acc@1 73.340 (83.874) Acc@5 93.164 (96.623) Mem 7374MB [2024-08-27 21:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.100) Loss 0.9883 
(0.7721) Acc@1 75.977 (82.577) Acc@5 93.848 (96.146) Mem 7374MB [2024-08-27 21:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.122 Acc@5 96.122 [2024-08-27 21:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-08-27 21:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.12% [2024-08-27 21:07:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 21:07:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 21:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][0/1251] eta 0:17:01 lr 0.000352 wd 0.0500 time 0.8164 (0.8164) data time 0.5498 (0.5498) model time 0.0000 (0.0000) loss 3.7074 (3.7074) grad_norm 3.1858 (3.1858) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-27 21:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][10/1251] eta 0:05:49 lr 0.000352 wd 0.0500 time 0.2235 (0.2814) data time 0.0008 (0.0514) model time 0.0000 (0.0000) loss 2.1701 (3.0970) grad_norm 3.9852 (3.0178) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][20/1251] eta 0:05:15 lr 0.000352 wd 0.0500 time 0.2253 (0.2566) data time 0.0010 (0.0275) model time 0.0000 (0.0000) loss 2.1516 (3.1048) grad_norm 2.1849 (2.8948) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][30/1251] eta 0:05:02 lr 0.000352 wd 0.0500 time 0.2266 (0.2478) data time 0.0009 (0.0190) model time 0.0000 (0.0000) loss 3.3011 (3.1385) grad_norm 3.7954 (2.9332) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][40/1251] eta 0:04:54 lr 
0.000352 wd 0.0500 time 0.2385 (0.2433) data time 0.0009 (0.0147) model time 0.0000 (0.0000) loss 2.1662 (3.1361) grad_norm 3.3650 (2.9648) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][50/1251] eta 0:04:48 lr 0.000352 wd 0.0500 time 0.2296 (0.2405) data time 0.0007 (0.0121) model time 0.0000 (0.0000) loss 2.8333 (3.1166) grad_norm 3.8358 (3.0361) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][60/1251] eta 0:04:44 lr 0.000352 wd 0.0500 time 0.2248 (0.2388) data time 0.0010 (0.0103) model time 0.2238 (0.2286) loss 2.9623 (3.1178) grad_norm 3.1129 (3.1093) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][70/1251] eta 0:04:41 lr 0.000352 wd 0.0500 time 0.2243 (0.2380) data time 0.0009 (0.0091) model time 0.2234 (0.2300) loss 2.5648 (3.1338) grad_norm 2.1421 (3.1300) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][80/1251] eta 0:04:37 lr 0.000352 wd 0.0500 time 0.2295 (0.2368) data time 0.0008 (0.0081) model time 0.2287 (0.2290) loss 3.0324 (3.1377) grad_norm 2.1243 (3.1722) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][90/1251] eta 0:04:34 lr 0.000352 wd 0.0500 time 0.2245 (0.2360) data time 0.0009 (0.0074) model time 0.2236 (0.2289) loss 3.4507 (3.1139) grad_norm 3.4868 (3.1056) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][100/1251] eta 0:04:30 lr 0.000352 wd 0.0500 time 0.2281 (0.2354) data time 0.0010 (0.0068) model time 0.2271 (0.2288) loss 2.7423 (3.1023) grad_norm 7.5900 (3.1474) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:20 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][110/1251] eta 0:04:27 lr 0.000352 wd 0.0500 time 0.2236 (0.2347) data time 0.0010 (0.0063) model time 0.2226 (0.2284) loss 3.3820 (3.0839) grad_norm 3.3625 (3.1571) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][120/1251] eta 0:04:24 lr 0.000352 wd 0.0500 time 0.2267 (0.2342) data time 0.0009 (0.0058) model time 0.2258 (0.2283) loss 2.7435 (3.0961) grad_norm 2.2715 (3.1178) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][130/1251] eta 0:04:22 lr 0.000351 wd 0.0500 time 0.2349 (0.2341) data time 0.0011 (0.0055) model time 0.2338 (0.2287) loss 2.8600 (3.0934) grad_norm 2.2506 (3.0847) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][140/1251] eta 0:04:19 lr 0.000351 wd 0.0500 time 0.2340 (0.2338) data time 0.0006 (0.0052) model time 0.2334 (0.2288) loss 2.2218 (3.0771) grad_norm 3.6191 (3.0560) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][150/1251] eta 0:04:17 lr 0.000351 wd 0.0500 time 0.2352 (0.2336) data time 0.0008 (0.0049) model time 0.2344 (0.2288) loss 3.3370 (3.0686) grad_norm 2.4442 (3.0484) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][160/1251] eta 0:04:14 lr 0.000351 wd 0.0500 time 0.2707 (0.2336) data time 0.0010 (0.0046) model time 0.2697 (0.2292) loss 3.2018 (3.0509) grad_norm 2.6838 (3.0793) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][170/1251] eta 0:04:12 lr 0.000351 wd 0.0500 time 0.2203 (0.2332) data time 0.0007 (0.0044) model time 0.2196 (0.2289) loss 3.2507 (3.0473) 
grad_norm 2.8501 (3.0914) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][180/1251] eta 0:04:09 lr 0.000351 wd 0.0500 time 0.2241 (0.2331) data time 0.0007 (0.0043) model time 0.2234 (0.2290) loss 3.6229 (3.0564) grad_norm 2.2627 (3.0949) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][190/1251] eta 0:04:07 lr 0.000351 wd 0.0500 time 0.2265 (0.2329) data time 0.0007 (0.0041) model time 0.2258 (0.2289) loss 3.8166 (3.0620) grad_norm 2.1299 (3.0840) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][200/1251] eta 0:04:04 lr 0.000351 wd 0.0500 time 0.2334 (0.2328) data time 0.0007 (0.0040) model time 0.2327 (0.2290) loss 3.7043 (3.0533) grad_norm 2.5796 (3.0623) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][210/1251] eta 0:04:02 lr 0.000351 wd 0.0500 time 0.2253 (0.2327) data time 0.0009 (0.0038) model time 0.2244 (0.2290) loss 3.4242 (3.0672) grad_norm 2.8193 (3.0877) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][220/1251] eta 0:04:00 lr 0.000351 wd 0.0500 time 0.2457 (0.2335) data time 0.0014 (0.0037) model time 0.2444 (0.2302) loss 3.5845 (3.0729) grad_norm 2.7272 (3.0834) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][230/1251] eta 0:03:58 lr 0.000351 wd 0.0500 time 0.2303 (0.2334) data time 0.0009 (0.0036) model time 0.2294 (0.2301) loss 3.0821 (3.0681) grad_norm 2.6167 (3.0842) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][240/1251] eta 0:03:55 lr 0.000351 wd 0.0500 time 
0.2489 (0.2334) data time 0.0007 (0.0035) model time 0.2482 (0.2303) loss 1.8742 (3.0652) grad_norm 1.9091 (3.0588) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][250/1251] eta 0:03:53 lr 0.000351 wd 0.0500 time 0.2374 (0.2332) data time 0.0007 (0.0034) model time 0.2366 (0.2302) loss 1.8213 (3.0558) grad_norm 2.4558 (3.0412) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][260/1251] eta 0:03:51 lr 0.000351 wd 0.0500 time 0.2238 (0.2338) data time 0.0010 (0.0033) model time 0.2228 (0.2310) loss 3.6383 (3.0448) grad_norm 2.8490 (3.0322) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][270/1251] eta 0:03:49 lr 0.000351 wd 0.0500 time 0.2207 (0.2337) data time 0.0009 (0.0032) model time 0.2198 (0.2309) loss 1.8403 (3.0328) grad_norm 3.8008 (3.0331) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][280/1251] eta 0:03:46 lr 0.000351 wd 0.0500 time 0.2286 (0.2335) data time 0.0007 (0.0032) model time 0.2278 (0.2308) loss 3.8417 (3.0333) grad_norm 2.8777 (3.0927) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][290/1251] eta 0:03:44 lr 0.000351 wd 0.0500 time 0.2334 (0.2335) data time 0.0009 (0.0031) model time 0.2325 (0.2308) loss 4.0003 (3.0301) grad_norm 2.3380 (3.0767) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][300/1251] eta 0:03:41 lr 0.000351 wd 0.0500 time 0.2281 (0.2333) data time 0.0009 (0.0030) model time 0.2272 (0.2306) loss 3.3822 (3.0322) grad_norm 3.5387 (3.0591) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [188/300][310/1251] eta 0:03:39 lr 0.000351 wd 0.0500 time 0.2340 (0.2331) data time 0.0008 (0.0030) model time 0.2332 (0.2305) loss 3.4022 (3.0300) grad_norm 2.0640 (3.0727) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][320/1251] eta 0:03:37 lr 0.000351 wd 0.0500 time 0.2314 (0.2331) data time 0.0014 (0.0029) model time 0.2301 (0.2305) loss 2.9702 (3.0327) grad_norm 2.3881 (3.0850) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][330/1251] eta 0:03:34 lr 0.000351 wd 0.0500 time 0.2260 (0.2330) data time 0.0007 (0.0029) model time 0.2253 (0.2305) loss 3.9473 (3.0324) grad_norm 4.2100 (3.0832) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][340/1251] eta 0:03:32 lr 0.000351 wd 0.0500 time 0.2259 (0.2334) data time 0.0014 (0.0028) model time 0.2245 (0.2309) loss 3.5424 (3.0307) grad_norm 2.4219 (3.0803) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][350/1251] eta 0:03:30 lr 0.000351 wd 0.0500 time 0.2279 (0.2333) data time 0.0011 (0.0028) model time 0.2268 (0.2309) loss 3.0610 (3.0311) grad_norm 3.2191 (3.0884) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][360/1251] eta 0:03:27 lr 0.000351 wd 0.0500 time 0.2253 (0.2332) data time 0.0008 (0.0027) model time 0.2245 (0.2308) loss 3.5283 (3.0312) grad_norm 2.7293 (3.0798) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][370/1251] eta 0:03:25 lr 0.000350 wd 0.0500 time 0.2293 (0.2332) data time 0.0009 (0.0027) model time 0.2284 (0.2308) loss 3.0239 (3.0383) grad_norm 4.0561 
(3.0931) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][380/1251] eta 0:03:23 lr 0.000350 wd 0.0500 time 0.2240 (0.2332) data time 0.0007 (0.0026) model time 0.2233 (0.2309) loss 2.0432 (3.0369) grad_norm 2.2529 (3.0829) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][390/1251] eta 0:03:20 lr 0.000350 wd 0.0500 time 0.2290 (0.2332) data time 0.0011 (0.0026) model time 0.2278 (0.2309) loss 3.1349 (3.0348) grad_norm 3.0655 (3.0756) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][400/1251] eta 0:03:18 lr 0.000350 wd 0.0500 time 0.2289 (0.2331) data time 0.0010 (0.0026) model time 0.2279 (0.2309) loss 3.2447 (3.0291) grad_norm 3.7255 (3.0738) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][410/1251] eta 0:03:16 lr 0.000350 wd 0.0500 time 0.2308 (0.2331) data time 0.0011 (0.0025) model time 0.2297 (0.2309) loss 3.1871 (3.0286) grad_norm 13.2889 (3.0839) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][420/1251] eta 0:03:13 lr 0.000350 wd 0.0500 time 0.2334 (0.2331) data time 0.0009 (0.0025) model time 0.2325 (0.2309) loss 3.1531 (3.0356) grad_norm 2.6547 (3.0905) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][430/1251] eta 0:03:11 lr 0.000350 wd 0.0500 time 0.2284 (0.2330) data time 0.0010 (0.0025) model time 0.2275 (0.2309) loss 2.4268 (3.0307) grad_norm 2.6040 (3.0849) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][440/1251] eta 0:03:09 lr 0.000350 wd 0.0500 time 0.2286 (0.2331) 
data time 0.0009 (0.0024) model time 0.2277 (0.2310) loss 3.0875 (3.0327) grad_norm 4.0632 (3.0871) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][450/1251] eta 0:03:06 lr 0.000350 wd 0.0500 time 0.2288 (0.2330) data time 0.0010 (0.0024) model time 0.2278 (0.2309) loss 3.2048 (3.0314) grad_norm 2.6279 (3.0796) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][460/1251] eta 0:03:04 lr 0.000350 wd 0.0500 time 0.2364 (0.2330) data time 0.0007 (0.0024) model time 0.2357 (0.2308) loss 2.6243 (3.0380) grad_norm 3.1857 (3.0888) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][470/1251] eta 0:03:01 lr 0.000350 wd 0.0500 time 0.2340 (0.2330) data time 0.0012 (0.0024) model time 0.2328 (0.2309) loss 3.1824 (3.0309) grad_norm 2.9101 (3.0787) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][480/1251] eta 0:02:59 lr 0.000350 wd 0.0500 time 0.2256 (0.2329) data time 0.0010 (0.0023) model time 0.2246 (0.2308) loss 3.7706 (3.0307) grad_norm 3.4989 (3.0805) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][490/1251] eta 0:02:57 lr 0.000350 wd 0.0500 time 0.2340 (0.2328) data time 0.0010 (0.0023) model time 0.2331 (0.2308) loss 3.1253 (3.0354) grad_norm 2.3131 (3.0807) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][500/1251] eta 0:02:54 lr 0.000350 wd 0.0500 time 0.2289 (0.2328) data time 0.0010 (0.0023) model time 0.2280 (0.2307) loss 3.2674 (3.0378) grad_norm 2.2911 (3.0775) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [188/300][510/1251] eta 0:02:52 lr 0.000350 wd 0.0500 time 0.2311 (0.2327) data time 0.0006 (0.0023) model time 0.2305 (0.2306) loss 2.5273 (3.0410) grad_norm 2.0629 (3.0741) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][520/1251] eta 0:02:50 lr 0.000350 wd 0.0500 time 0.2321 (0.2327) data time 0.0009 (0.0022) model time 0.2312 (0.2306) loss 2.9202 (3.0349) grad_norm 2.0447 (3.0668) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][530/1251] eta 0:02:47 lr 0.000350 wd 0.0500 time 0.2284 (0.2326) data time 0.0007 (0.0022) model time 0.2277 (0.2306) loss 3.1187 (3.0344) grad_norm 4.1397 (3.0657) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][540/1251] eta 0:02:45 lr 0.000350 wd 0.0500 time 0.2370 (0.2326) data time 0.0007 (0.0022) model time 0.2363 (0.2306) loss 3.6676 (3.0334) grad_norm 2.0066 (3.0594) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][550/1251] eta 0:02:43 lr 0.000350 wd 0.0500 time 0.2255 (0.2326) data time 0.0012 (0.0022) model time 0.2243 (0.2306) loss 3.3970 (3.0348) grad_norm 2.8509 (3.0528) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][560/1251] eta 0:02:40 lr 0.000350 wd 0.0500 time 0.2269 (0.2325) data time 0.0008 (0.0022) model time 0.2261 (0.2305) loss 3.3332 (3.0373) grad_norm 3.8408 (3.0510) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][570/1251] eta 0:02:38 lr 0.000350 wd 0.0500 time 0.2334 (0.2325) data time 0.0009 (0.0022) model time 0.2325 (0.2305) loss 3.0042 (3.0370) grad_norm 2.2807 (3.0427) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-27 21:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][580/1251] eta 0:02:35 lr 0.000350 wd 0.0500 time 0.2470 (0.2325) data time 0.0012 (0.0021) model time 0.2458 (0.2305) loss 2.8456 (3.0363) grad_norm 2.8711 (3.0743) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][590/1251] eta 0:02:33 lr 0.000350 wd 0.0500 time 0.2321 (0.2324) data time 0.0007 (0.0021) model time 0.2315 (0.2305) loss 2.6397 (3.0358) grad_norm 2.5795 (3.0712) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][600/1251] eta 0:02:31 lr 0.000350 wd 0.0500 time 0.2313 (0.2324) data time 0.0006 (0.0021) model time 0.2307 (0.2305) loss 3.8384 (3.0360) grad_norm 3.3216 (3.0708) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][610/1251] eta 0:02:28 lr 0.000349 wd 0.0500 time 0.2304 (0.2324) data time 0.0007 (0.0021) model time 0.2298 (0.2305) loss 3.5306 (3.0311) grad_norm 3.5873 (3.0766) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][620/1251] eta 0:02:26 lr 0.000349 wd 0.0500 time 0.2287 (0.2323) data time 0.0013 (0.0021) model time 0.2274 (0.2304) loss 3.9457 (3.0377) grad_norm 6.2654 (3.0784) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][630/1251] eta 0:02:24 lr 0.000349 wd 0.0500 time 0.2255 (0.2323) data time 0.0010 (0.0021) model time 0.2246 (0.2304) loss 2.9545 (3.0437) grad_norm 3.4127 (3.0825) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][640/1251] eta 0:02:21 lr 0.000349 wd 0.0500 time 0.2239 (0.2323) data time 0.0009 (0.0021) model 
time 0.2231 (0.2304) loss 2.8434 (3.0415) grad_norm 4.3516 (3.0784) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][650/1251] eta 0:02:19 lr 0.000349 wd 0.0500 time 0.2283 (0.2323) data time 0.0011 (0.0020) model time 0.2272 (0.2304) loss 3.1719 (3.0467) grad_norm 4.8526 (3.0926) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][660/1251] eta 0:02:17 lr 0.000349 wd 0.0500 time 0.2295 (0.2322) data time 0.0007 (0.0020) model time 0.2288 (0.2304) loss 2.4127 (3.0480) grad_norm 2.4253 (3.0959) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][670/1251] eta 0:02:14 lr 0.000349 wd 0.0500 time 0.2499 (0.2323) data time 0.0010 (0.0020) model time 0.2488 (0.2304) loss 2.6058 (3.0470) grad_norm 2.4040 (3.0948) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][680/1251] eta 0:02:12 lr 0.000349 wd 0.0500 time 0.2335 (0.2322) data time 0.0009 (0.0020) model time 0.2326 (0.2304) loss 2.4448 (3.0452) grad_norm 2.1217 (3.0899) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][690/1251] eta 0:02:10 lr 0.000349 wd 0.0500 time 0.2342 (0.2322) data time 0.0009 (0.0020) model time 0.2333 (0.2304) loss 3.2707 (3.0464) grad_norm 3.8929 (3.0913) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][700/1251] eta 0:02:07 lr 0.000349 wd 0.0500 time 0.2258 (0.2322) data time 0.0007 (0.0020) model time 0.2251 (0.2304) loss 4.3009 (3.0456) grad_norm 6.3457 (3.0888) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][710/1251] 
eta 0:02:05 lr 0.000349 wd 0.0500 time 0.2309 (0.2322) data time 0.0008 (0.0020) model time 0.2301 (0.2304) loss 3.6353 (3.0475) grad_norm 3.5467 (3.0874) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][720/1251] eta 0:02:03 lr 0.000349 wd 0.0500 time 0.2369 (0.2322) data time 0.0009 (0.0020) model time 0.2360 (0.2304) loss 3.3464 (3.0486) grad_norm 3.8241 (3.0865) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][730/1251] eta 0:02:01 lr 0.000349 wd 0.0500 time 0.2277 (0.2323) data time 0.0009 (0.0020) model time 0.2268 (0.2304) loss 3.3271 (3.0523) grad_norm 3.5127 (3.0840) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][740/1251] eta 0:01:58 lr 0.000349 wd 0.0500 time 0.2263 (0.2323) data time 0.0007 (0.0020) model time 0.2256 (0.2304) loss 3.1294 (3.0540) grad_norm 3.1170 (3.0824) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][750/1251] eta 0:01:56 lr 0.000349 wd 0.0500 time 0.2319 (0.2323) data time 0.0007 (0.0019) model time 0.2312 (0.2305) loss 2.8089 (3.0538) grad_norm 2.4076 (3.0757) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][760/1251] eta 0:01:54 lr 0.000349 wd 0.0500 time 0.2240 (0.2323) data time 0.0009 (0.0019) model time 0.2232 (0.2305) loss 3.0373 (3.0497) grad_norm 4.1420 (3.0772) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][770/1251] eta 0:01:51 lr 0.000349 wd 0.0500 time 0.2279 (0.2322) data time 0.0009 (0.0019) model time 0.2270 (0.2305) loss 3.5267 (3.0493) grad_norm 4.1247 (3.0796) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 
21:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][780/1251] eta 0:01:49 lr 0.000349 wd 0.0500 time 0.2392 (0.2325) data time 0.0009 (0.0019) model time 0.2383 (0.2307) loss 2.4041 (3.0476) grad_norm 3.0580 (3.0815) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][790/1251] eta 0:01:47 lr 0.000349 wd 0.0500 time 0.2307 (0.2325) data time 0.0010 (0.0019) model time 0.2298 (0.2307) loss 3.4721 (3.0504) grad_norm 3.2242 (3.0871) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][800/1251] eta 0:01:44 lr 0.000349 wd 0.0500 time 0.2246 (0.2324) data time 0.0009 (0.0019) model time 0.2237 (0.2307) loss 3.4658 (3.0532) grad_norm 3.2544 (3.0919) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][810/1251] eta 0:01:42 lr 0.000349 wd 0.0500 time 0.2321 (0.2324) data time 0.0013 (0.0019) model time 0.2308 (0.2307) loss 3.0719 (3.0561) grad_norm 2.8823 (3.0866) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][820/1251] eta 0:01:40 lr 0.000349 wd 0.0500 time 0.2256 (0.2324) data time 0.0011 (0.0019) model time 0.2244 (0.2307) loss 3.2310 (3.0584) grad_norm 3.0543 (3.0821) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][830/1251] eta 0:01:37 lr 0.000349 wd 0.0500 time 0.2291 (0.2324) data time 0.0007 (0.0019) model time 0.2284 (0.2307) loss 3.2149 (3.0592) grad_norm 3.2808 (3.0837) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][840/1251] eta 0:01:35 lr 0.000348 wd 0.0500 time 0.2398 (0.2323) data time 0.0008 (0.0019) model time 0.2390 (0.2306) loss 2.3617 
(3.0586) grad_norm 3.2866 (3.0792) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][850/1251] eta 0:01:33 lr 0.000348 wd 0.0500 time 0.2226 (0.2323) data time 0.0012 (0.0018) model time 0.2214 (0.2306) loss 2.9194 (3.0611) grad_norm 2.9420 (3.0734) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][860/1251] eta 0:01:30 lr 0.000348 wd 0.0500 time 0.2285 (0.2323) data time 0.0011 (0.0019) model time 0.2273 (0.2305) loss 2.1342 (3.0613) grad_norm 4.5218 (3.0781) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][870/1251] eta 0:01:28 lr 0.000348 wd 0.0500 time 0.2230 (0.2323) data time 0.0007 (0.0019) model time 0.2223 (0.2305) loss 2.8041 (3.0604) grad_norm 2.3399 (3.0786) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][880/1251] eta 0:01:26 lr 0.000348 wd 0.0500 time 0.2341 (0.2322) data time 0.0011 (0.0019) model time 0.2330 (0.2305) loss 3.1677 (3.0605) grad_norm 2.0369 (3.0808) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][890/1251] eta 0:01:23 lr 0.000348 wd 0.0500 time 0.2226 (0.2322) data time 0.0009 (0.0019) model time 0.2217 (0.2305) loss 3.3289 (3.0612) grad_norm 3.2267 (3.0855) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][900/1251] eta 0:01:21 lr 0.000348 wd 0.0500 time 0.2265 (0.2322) data time 0.0009 (0.0019) model time 0.2257 (0.2304) loss 3.6025 (3.0612) grad_norm 2.7977 (3.0835) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][910/1251] eta 0:01:19 lr 0.000348 wd 0.0500 
time 0.2273 (0.2322) data time 0.0010 (0.0018) model time 0.2263 (0.2304) loss 1.9064 (3.0628) grad_norm 2.4584 (3.0792) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][920/1251] eta 0:01:16 lr 0.000348 wd 0.0500 time 0.2226 (0.2321) data time 0.0011 (0.0018) model time 0.2215 (0.2304) loss 2.6191 (3.0616) grad_norm 2.5524 (3.0793) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][930/1251] eta 0:01:14 lr 0.000348 wd 0.0500 time 0.2200 (0.2321) data time 0.0010 (0.0018) model time 0.2190 (0.2304) loss 3.4824 (3.0593) grad_norm 2.1367 (3.0771) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][940/1251] eta 0:01:12 lr 0.000348 wd 0.0500 time 0.2304 (0.2321) data time 0.0011 (0.0018) model time 0.2293 (0.2304) loss 3.1339 (3.0567) grad_norm 3.0138 (3.0737) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][950/1251] eta 0:01:09 lr 0.000348 wd 0.0500 time 0.2312 (0.2321) data time 0.0010 (0.0018) model time 0.2302 (0.2304) loss 3.2339 (3.0563) grad_norm 2.5945 (3.0677) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][960/1251] eta 0:01:07 lr 0.000348 wd 0.0500 time 0.2313 (0.2321) data time 0.0009 (0.0018) model time 0.2304 (0.2303) loss 3.0774 (3.0582) grad_norm 2.8576 (3.0623) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][970/1251] eta 0:01:05 lr 0.000348 wd 0.0500 time 0.2325 (0.2320) data time 0.0010 (0.0018) model time 0.2315 (0.2303) loss 3.4313 (3.0618) grad_norm 2.1240 (3.0633) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:41 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [188/300][980/1251] eta 0:01:02 lr 0.000348 wd 0.0500 time 0.2237 (0.2320) data time 0.0012 (0.0018) model time 0.2225 (0.2303) loss 3.3235 (3.0612) grad_norm 2.7064 (3.0620) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][990/1251] eta 0:01:00 lr 0.000348 wd 0.0500 time 0.2278 (0.2320) data time 0.0008 (0.0018) model time 0.2270 (0.2304) loss 2.8305 (3.0599) grad_norm 2.9690 (3.0605) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1000/1251] eta 0:00:58 lr 0.000348 wd 0.0500 time 0.2284 (0.2320) data time 0.0009 (0.0018) model time 0.2276 (0.2303) loss 3.5625 (3.0604) grad_norm 3.6118 (3.0567) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1010/1251] eta 0:00:55 lr 0.000348 wd 0.0500 time 0.2282 (0.2320) data time 0.0008 (0.0018) model time 0.2275 (0.2303) loss 2.7299 (3.0566) grad_norm 4.9972 (3.0623) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1020/1251] eta 0:00:53 lr 0.000348 wd 0.0500 time 0.2286 (0.2320) data time 0.0007 (0.0018) model time 0.2280 (0.2303) loss 2.7706 (3.0570) grad_norm 3.4015 (3.0618) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1030/1251] eta 0:00:51 lr 0.000348 wd 0.0500 time 0.2353 (0.2319) data time 0.0011 (0.0018) model time 0.2342 (0.2302) loss 3.7401 (3.0566) grad_norm 3.7203 (3.0907) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1040/1251] eta 0:00:48 lr 0.000348 wd 0.0500 time 0.2284 (0.2319) data time 0.0007 (0.0018) model time 0.2277 (0.2302) loss 3.3281 (3.0582) grad_norm 2.6342 
(3.0984) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1050/1251] eta 0:00:46 lr 0.000348 wd 0.0500 time 0.2324 (0.2319) data time 0.0013 (0.0018) model time 0.2310 (0.2302) loss 3.4179 (3.0611) grad_norm 2.2868 (3.0945) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1060/1251] eta 0:00:44 lr 0.000348 wd 0.0500 time 0.2226 (0.2319) data time 0.0015 (0.0018) model time 0.2210 (0.2302) loss 3.1458 (3.0611) grad_norm 2.3424 (3.1036) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1070/1251] eta 0:00:41 lr 0.000348 wd 0.0500 time 0.2291 (0.2319) data time 0.0007 (0.0018) model time 0.2284 (0.2302) loss 3.3950 (3.0612) grad_norm 2.8809 (3.1028) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1080/1251] eta 0:00:39 lr 0.000347 wd 0.0500 time 0.2332 (0.2319) data time 0.0014 (0.0017) model time 0.2318 (0.2302) loss 3.4147 (3.0637) grad_norm 3.5160 (3.1008) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1090/1251] eta 0:00:37 lr 0.000347 wd 0.0500 time 0.2277 (0.2318) data time 0.0007 (0.0017) model time 0.2269 (0.2302) loss 1.9286 (3.0600) grad_norm 3.2621 (3.0994) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1100/1251] eta 0:00:35 lr 0.000347 wd 0.0500 time 0.2406 (0.2318) data time 0.0010 (0.0017) model time 0.2396 (0.2302) loss 2.3682 (3.0564) grad_norm 2.7335 (3.0970) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1110/1251] eta 0:00:32 lr 0.000347 wd 0.0500 time 0.2350 
(0.2318) data time 0.0011 (0.0017) model time 0.2340 (0.2302) loss 3.3488 (3.0557) grad_norm 2.4158 (3.0961) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1120/1251] eta 0:00:30 lr 0.000347 wd 0.0500 time 0.2346 (0.2319) data time 0.0008 (0.0017) model time 0.2338 (0.2303) loss 3.6594 (3.0562) grad_norm 2.9989 (3.1015) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1130/1251] eta 0:00:28 lr 0.000347 wd 0.0500 time 0.2232 (0.2319) data time 0.0009 (0.0017) model time 0.2222 (0.2303) loss 3.5723 (3.0569) grad_norm 3.3039 (3.1026) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1140/1251] eta 0:00:25 lr 0.000347 wd 0.0500 time 0.2232 (0.2319) data time 0.0010 (0.0017) model time 0.2222 (0.2303) loss 3.4204 (3.0580) grad_norm 3.5530 (3.1004) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1150/1251] eta 0:00:23 lr 0.000347 wd 0.0500 time 0.2337 (0.2321) data time 0.0008 (0.0017) model time 0.2329 (0.2305) loss 3.1236 (3.0603) grad_norm 2.3701 (3.0993) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1160/1251] eta 0:00:21 lr 0.000347 wd 0.0500 time 0.2274 (0.2321) data time 0.0011 (0.0017) model time 0.2263 (0.2305) loss 3.5961 (3.0591) grad_norm 2.9830 (3.0970) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1170/1251] eta 0:00:18 lr 0.000347 wd 0.0500 time 0.2231 (0.2321) data time 0.0011 (0.0017) model time 0.2220 (0.2305) loss 3.5605 (3.0590) grad_norm 3.9865 (3.0934) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:28 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [188/300][1180/1251] eta 0:00:16 lr 0.000347 wd 0.0500 time 0.2302 (0.2321) data time 0.0012 (0.0017) model time 0.2291 (0.2305) loss 3.5226 (3.0593) grad_norm 2.6682 (3.0937) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1190/1251] eta 0:00:14 lr 0.000347 wd 0.0500 time 0.2319 (0.2321) data time 0.0008 (0.0017) model time 0.2311 (0.2305) loss 2.5921 (3.0590) grad_norm 2.5845 (3.0991) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1200/1251] eta 0:00:11 lr 0.000347 wd 0.0500 time 0.2271 (0.2320) data time 0.0013 (0.0017) model time 0.2259 (0.2304) loss 2.8334 (3.0569) grad_norm 2.9185 (3.1026) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1210/1251] eta 0:00:09 lr 0.000347 wd 0.0500 time 0.2209 (0.2321) data time 0.0009 (0.0017) model time 0.2200 (0.2305) loss 3.3410 (3.0587) grad_norm 2.9166 (3.1022) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1220/1251] eta 0:00:07 lr 0.000347 wd 0.0500 time 0.2301 (0.2320) data time 0.0013 (0.0017) model time 0.2288 (0.2305) loss 3.9879 (3.0594) grad_norm 3.4559 (3.1010) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1230/1251] eta 0:00:04 lr 0.000347 wd 0.0500 time 0.2305 (0.2320) data time 0.0007 (0.0017) model time 0.2298 (0.2304) loss 4.0421 (3.0582) grad_norm 3.9629 (3.0998) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1240/1251] eta 0:00:02 lr 0.000347 wd 0.0500 time 0.2353 (0.2320) data time 0.0006 (0.0017) model time 0.2347 (0.2304) loss 3.3654 (3.0600) grad_norm 2.2576 
(3.0970) loss_scale 512.0000 (512.0000) mem 7381MB
[2024-08-27 21:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [188/300][1250/1251] eta 0:00:00 lr 0.000347 wd 0.0500 time 0.2351 (0.2318) data time 0.0006 (0.0017) model time 0.2345 (0.2302) loss 3.5473 (3.0600) grad_norm 2.6450 (3.0972) loss_scale 512.0000 (512.0000) mem 7381MB
[2024-08-27 21:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 188 training takes 0:04:50
[2024-08-27 21:12:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-27 21:12:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-27 21:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.546 (0.546) Loss 0.4551 (0.4551) Acc@1 92.188 (92.188) Acc@5 98.340 (98.340) Mem 7381MB
[2024-08-27 21:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.123) Loss 0.7227 (0.6755) Acc@1 85.156 (85.680) Acc@5 97.168 (97.239) Mem 7381MB
[2024-08-27 21:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.103) Loss 1.0166 (0.7066) Acc@1 74.902 (84.380) Acc@5 94.238 (97.173) Mem 7381MB
[2024-08-27 21:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.096) Loss 1.1816 (0.7984) Acc@1 71.094 (82.088) Acc@5 92.188 (96.087) Mem 7381MB
[2024-08-27 21:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.089) Loss 1.1055 (0.8472) Acc@1 73.730 (80.914) Acc@5 93.652 (95.539) Mem 7381MB
[2024-08-27 21:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.494 Acc@5 95.478
[2024-08-27 21:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.5%
[2024-08-27 21:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.49%
[2024-08-27 21:12:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-27 21:12:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-27 21:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.454 (0.454) Loss 0.3882 (0.3882) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7381MB
[2024-08-27 21:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.113) Loss 0.6074 (0.6181) Acc@1 88.379 (86.905) Acc@5 97.363 (97.470) Mem 7381MB
[2024-08-27 21:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.099) Loss 0.8760 (0.6425) Acc@1 78.125 (85.947) Acc@5 96.289 (97.503) Mem 7381MB
[2024-08-27 21:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.086 (0.094) Loss 1.1133 (0.7277) Acc@1 73.438 (83.877) Acc@5 93.066 (96.607) Mem 7381MB
[2024-08-27 21:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.068 (0.087) Loss 0.9863 (0.7713) Acc@1 76.172 (82.577) Acc@5 94.043 (96.137) Mem 7381MB
[2024-08-27 21:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.132 Acc@5 96.116
[2024-08-27 21:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.1%
[2024-08-27 21:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.13%
[2024-08-27 21:12:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 21:12:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 21:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][0/1251] eta 0:16:26 lr 0.000347 wd 0.0500 time 0.7886 (0.7886) data time 0.5261 (0.5261) model time 0.0000 (0.0000) loss 3.0050 (3.0050) grad_norm 3.1275 (3.1275) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][10/1251] eta 0:05:44 lr 0.000347 wd 0.0500 time 0.2233 (0.2775) data time 0.0007 (0.0488) model time 0.0000 (0.0000) loss 3.7531 (2.9469) grad_norm 3.0220 (3.3358) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][20/1251] eta 0:05:12 lr 0.000347 wd 0.0500 time 0.2325 (0.2535) data time 0.0009 (0.0261) model time 0.0000 (0.0000) loss 2.5533 (3.0781) grad_norm 2.6823 (3.0381) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][30/1251] eta 0:04:59 lr 0.000347 wd 0.0500 time 0.2327 (0.2455) data time 0.0008 (0.0180) model time 0.0000 (0.0000) loss 3.4070 (3.0977) grad_norm 3.7964 (3.0422) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][40/1251] eta 0:04:52 lr 0.000347 wd 0.0500 time 0.2256 (0.2416) data time 0.0007 (0.0139) model time 0.0000 (0.0000) loss 2.5245 (3.0328) grad_norm 3.6528 (2.9197) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][50/1251] eta 0:04:47 lr 0.000347 wd 0.0500 time 0.2337 (0.2392) data time 0.0008 (0.0114) model time 0.0000 (0.0000) loss 2.7405 (3.0109) grad_norm 5.0342 (3.0972) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][60/1251] eta 0:04:42 lr 0.000347 wd 0.0500 time 0.2306 (0.2374) data time 0.0007 (0.0098) model time 0.2299 (0.2269) loss 
2.1623 (3.0003) grad_norm 3.0916 (3.1424) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][70/1251] eta 0:04:38 lr 0.000346 wd 0.0500 time 0.2369 (0.2361) data time 0.0008 (0.0086) model time 0.2361 (0.2270) loss 3.4820 (3.0225) grad_norm 2.5464 (3.0778) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][80/1251] eta 0:04:35 lr 0.000346 wd 0.0500 time 0.2409 (0.2356) data time 0.0011 (0.0077) model time 0.2398 (0.2281) loss 2.9553 (3.0386) grad_norm 3.7974 (3.1028) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][90/1251] eta 0:04:32 lr 0.000346 wd 0.0500 time 0.2270 (0.2347) data time 0.0007 (0.0070) model time 0.2263 (0.2276) loss 2.3871 (3.0057) grad_norm 2.1849 (3.0794) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][100/1251] eta 0:04:29 lr 0.000346 wd 0.0500 time 0.2299 (0.2342) data time 0.0010 (0.0064) model time 0.2289 (0.2279) loss 2.4124 (3.0176) grad_norm 1.9918 (3.0560) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][110/1251] eta 0:04:26 lr 0.000346 wd 0.0500 time 0.2341 (0.2338) data time 0.0008 (0.0059) model time 0.2333 (0.2280) loss 2.6245 (2.9985) grad_norm 2.1273 (3.0340) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][120/1251] eta 0:04:23 lr 0.000346 wd 0.0500 time 0.2314 (0.2333) data time 0.0007 (0.0055) model time 0.2307 (0.2278) loss 2.5004 (2.9779) grad_norm 3.5437 (3.0202) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][130/1251] eta 0:04:21 lr 0.000346 wd 
0.0500 time 0.2450 (0.2332) data time 0.0010 (0.0052) model time 0.2439 (0.2282) loss 3.5150 (2.9858) grad_norm 2.4894 (3.0503) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][140/1251] eta 0:04:18 lr 0.000346 wd 0.0500 time 0.2281 (0.2328) data time 0.0009 (0.0049) model time 0.2272 (0.2280) loss 2.7449 (2.9795) grad_norm 2.5808 (3.0542) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][150/1251] eta 0:04:16 lr 0.000346 wd 0.0500 time 0.2322 (0.2326) data time 0.0009 (0.0046) model time 0.2313 (0.2280) loss 3.1222 (2.9784) grad_norm 3.3233 (3.0490) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][160/1251] eta 0:04:13 lr 0.000346 wd 0.0500 time 0.2416 (0.2325) data time 0.0014 (0.0044) model time 0.2403 (0.2282) loss 3.3134 (2.9782) grad_norm 2.2331 (3.0183) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][170/1251] eta 0:04:11 lr 0.000346 wd 0.0500 time 0.2272 (0.2323) data time 0.0015 (0.0042) model time 0.2257 (0.2282) loss 3.5142 (2.9819) grad_norm 4.0940 (3.0487) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][180/1251] eta 0:04:08 lr 0.000346 wd 0.0500 time 0.2286 (0.2322) data time 0.0011 (0.0040) model time 0.2276 (0.2282) loss 2.6472 (2.9652) grad_norm 2.3394 (3.0817) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][190/1251] eta 0:04:06 lr 0.000346 wd 0.0500 time 0.2218 (0.2323) data time 0.0008 (0.0039) model time 0.2210 (0.2286) loss 3.2770 (2.9749) grad_norm 3.0763 (3.0660) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:42 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [189/300][200/1251] eta 0:04:03 lr 0.000346 wd 0.0500 time 0.2429 (0.2321) data time 0.0009 (0.0037) model time 0.2420 (0.2286) loss 3.4619 (2.9919) grad_norm 2.4194 (3.0539) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][210/1251] eta 0:04:01 lr 0.000346 wd 0.0500 time 0.2242 (0.2320) data time 0.0012 (0.0037) model time 0.2230 (0.2285) loss 3.2536 (2.9894) grad_norm 2.6192 (3.0418) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][220/1251] eta 0:03:59 lr 0.000346 wd 0.0500 time 0.2414 (0.2319) data time 0.0013 (0.0036) model time 0.2401 (0.2285) loss 3.3962 (2.9940) grad_norm 3.2631 (3.0781) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][230/1251] eta 0:03:56 lr 0.000346 wd 0.0500 time 0.2435 (0.2320) data time 0.0006 (0.0035) model time 0.2429 (0.2287) loss 3.4680 (2.9960) grad_norm 2.6871 (3.0580) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][240/1251] eta 0:03:54 lr 0.000346 wd 0.0500 time 0.2378 (0.2320) data time 0.0010 (0.0034) model time 0.2369 (0.2288) loss 3.2540 (3.0013) grad_norm 2.5394 (3.0824) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][250/1251] eta 0:03:52 lr 0.000346 wd 0.0500 time 0.2251 (0.2318) data time 0.0009 (0.0033) model time 0.2242 (0.2286) loss 2.4926 (3.0054) grad_norm 3.1435 (3.0625) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][260/1251] eta 0:03:51 lr 0.000346 wd 0.0500 time 0.2230 (0.2333) data time 0.0007 (0.0033) model time 0.2223 (0.2306) loss 2.4429 (3.0019) grad_norm 2.9747 
(3.0617) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][270/1251] eta 0:03:48 lr 0.000346 wd 0.0500 time 0.2262 (0.2331) data time 0.0009 (0.0032) model time 0.2254 (0.2304) loss 2.9086 (3.0149) grad_norm 2.2001 (3.0611) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][280/1251] eta 0:03:46 lr 0.000346 wd 0.0500 time 0.2259 (0.2331) data time 0.0010 (0.0032) model time 0.2249 (0.2304) loss 3.4923 (3.0141) grad_norm 2.6397 (3.0495) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][290/1251] eta 0:03:43 lr 0.000346 wd 0.0500 time 0.2323 (0.2331) data time 0.0007 (0.0032) model time 0.2316 (0.2304) loss 2.2077 (3.0150) grad_norm 3.7245 (3.0689) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][300/1251] eta 0:03:41 lr 0.000346 wd 0.0500 time 0.2293 (0.2329) data time 0.0007 (0.0031) model time 0.2286 (0.2303) loss 4.0824 (3.0198) grad_norm 2.3458 (3.0781) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][310/1251] eta 0:03:39 lr 0.000345 wd 0.0500 time 0.2265 (0.2328) data time 0.0009 (0.0030) model time 0.2256 (0.2302) loss 3.0288 (3.0206) grad_norm 3.1991 (3.0738) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][320/1251] eta 0:03:36 lr 0.000345 wd 0.0500 time 0.2276 (0.2328) data time 0.0009 (0.0030) model time 0.2266 (0.2302) loss 2.8019 (3.0211) grad_norm 2.6740 (3.0656) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][330/1251] eta 0:03:34 lr 0.000345 wd 0.0500 time 0.2318 (0.2328) data 
time 0.0008 (0.0029) model time 0.2310 (0.2302) loss 3.5026 (3.0290) grad_norm 2.3359 (3.0517) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][340/1251] eta 0:03:31 lr 0.000345 wd 0.0500 time 0.2222 (0.2327) data time 0.0011 (0.0029) model time 0.2211 (0.2302) loss 2.9569 (3.0290) grad_norm 1.9604 (3.0507) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][350/1251] eta 0:03:29 lr 0.000345 wd 0.0500 time 0.2230 (0.2326) data time 0.0007 (0.0028) model time 0.2223 (0.2301) loss 3.8182 (3.0331) grad_norm 3.1403 (3.0447) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][360/1251] eta 0:03:27 lr 0.000345 wd 0.0500 time 0.2247 (0.2326) data time 0.0011 (0.0028) model time 0.2236 (0.2301) loss 3.7294 (3.0374) grad_norm 4.2913 (3.0412) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][370/1251] eta 0:03:24 lr 0.000345 wd 0.0500 time 0.2204 (0.2326) data time 0.0010 (0.0028) model time 0.2194 (0.2301) loss 3.2979 (3.0369) grad_norm 2.3723 (3.0319) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][380/1251] eta 0:03:22 lr 0.000345 wd 0.0500 time 0.2285 (0.2325) data time 0.0010 (0.0027) model time 0.2275 (0.2301) loss 3.5261 (3.0411) grad_norm 1.8822 (3.0271) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][390/1251] eta 0:03:20 lr 0.000345 wd 0.0500 time 0.2249 (0.2325) data time 0.0008 (0.0027) model time 0.2241 (0.2301) loss 2.7579 (3.0421) grad_norm 2.3861 (3.0193) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [189/300][400/1251] eta 0:03:17 lr 0.000345 wd 0.0500 time 0.2231 (0.2325) data time 0.0009 (0.0026) model time 0.2222 (0.2301) loss 3.0916 (3.0371) grad_norm 2.5463 (3.0083) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][410/1251] eta 0:03:15 lr 0.000345 wd 0.0500 time 0.2252 (0.2325) data time 0.0012 (0.0026) model time 0.2240 (0.2301) loss 3.2459 (3.0374) grad_norm 3.5862 (3.0252) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][420/1251] eta 0:03:13 lr 0.000345 wd 0.0500 time 0.2210 (0.2324) data time 0.0009 (0.0026) model time 0.2201 (0.2301) loss 2.8220 (3.0325) grad_norm 1.8917 (3.0291) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][430/1251] eta 0:03:10 lr 0.000345 wd 0.0500 time 0.2425 (0.2323) data time 0.0006 (0.0025) model time 0.2419 (0.2300) loss 2.7966 (3.0293) grad_norm 3.0793 (3.0388) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][440/1251] eta 0:03:08 lr 0.000345 wd 0.0500 time 0.2287 (0.2323) data time 0.0006 (0.0025) model time 0.2280 (0.2301) loss 3.7206 (3.0301) grad_norm 3.1946 (3.0308) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][450/1251] eta 0:03:06 lr 0.000345 wd 0.0500 time 0.2243 (0.2322) data time 0.0007 (0.0025) model time 0.2236 (0.2300) loss 4.1684 (3.0313) grad_norm 15.5479 (3.0610) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][460/1251] eta 0:03:03 lr 0.000345 wd 0.0500 time 0.2273 (0.2321) data time 0.0009 (0.0025) model time 0.2264 (0.2299) loss 3.2455 (3.0330) grad_norm 2.3111 (3.0802) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-27 21:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][470/1251] eta 0:03:01 lr 0.000345 wd 0.0500 time 0.2312 (0.2320) data time 0.0009 (0.0024) model time 0.2303 (0.2298) loss 2.7146 (3.0355) grad_norm 2.7006 (3.0836) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][480/1251] eta 0:02:58 lr 0.000345 wd 0.0500 time 0.2327 (0.2321) data time 0.0006 (0.0024) model time 0.2320 (0.2299) loss 3.1351 (3.0342) grad_norm 3.4324 (3.0979) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][490/1251] eta 0:02:56 lr 0.000345 wd 0.0500 time 0.2358 (0.2321) data time 0.0007 (0.0024) model time 0.2351 (0.2299) loss 3.7811 (3.0345) grad_norm 2.4788 (3.1053) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][500/1251] eta 0:02:54 lr 0.000345 wd 0.0500 time 0.2296 (0.2321) data time 0.0015 (0.0023) model time 0.2281 (0.2299) loss 3.2951 (3.0320) grad_norm 2.2913 (3.1113) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][510/1251] eta 0:02:51 lr 0.000345 wd 0.0500 time 0.2278 (0.2320) data time 0.0008 (0.0023) model time 0.2270 (0.2299) loss 3.2140 (3.0341) grad_norm 2.1071 (3.1119) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][520/1251] eta 0:02:49 lr 0.000345 wd 0.0500 time 0.2331 (0.2320) data time 0.0008 (0.0023) model time 0.2323 (0.2299) loss 3.9077 (3.0402) grad_norm 2.2886 (3.1050) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][530/1251] eta 0:02:47 lr 0.000345 wd 0.0500 time 0.2252 (0.2319) data time 0.0007 (0.0023) model 
time 0.2245 (0.2298) loss 2.9186 (3.0440) grad_norm 2.7047 (3.1086) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][540/1251] eta 0:02:44 lr 0.000344 wd 0.0500 time 0.2241 (0.2318) data time 0.0007 (0.0023) model time 0.2234 (0.2298) loss 3.9400 (3.0450) grad_norm 2.9790 (3.1621) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][550/1251] eta 0:02:42 lr 0.000344 wd 0.0500 time 0.2283 (0.2318) data time 0.0009 (0.0022) model time 0.2273 (0.2297) loss 3.0767 (3.0422) grad_norm 2.4833 (3.1575) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][560/1251] eta 0:02:40 lr 0.000344 wd 0.0500 time 0.2311 (0.2317) data time 0.0007 (0.0022) model time 0.2304 (0.2297) loss 2.5911 (3.0433) grad_norm 4.8970 (3.1585) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][570/1251] eta 0:02:37 lr 0.000344 wd 0.0500 time 0.2414 (0.2317) data time 0.0007 (0.0022) model time 0.2407 (0.2296) loss 3.6494 (3.0412) grad_norm 3.2799 (3.1695) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][580/1251] eta 0:02:35 lr 0.000344 wd 0.0500 time 0.2273 (0.2317) data time 0.0009 (0.0022) model time 0.2263 (0.2297) loss 2.8144 (3.0413) grad_norm 2.9947 (3.1789) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][590/1251] eta 0:02:33 lr 0.000344 wd 0.0500 time 0.2240 (0.2317) data time 0.0010 (0.0022) model time 0.2230 (0.2297) loss 2.0000 (3.0377) grad_norm 2.7947 (3.1750) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][600/1251] 
eta 0:02:31 lr 0.000344 wd 0.0500 time 0.4479 (0.2320) data time 0.0009 (0.0022) model time 0.4470 (0.2301) loss 3.7528 (3.0437) grad_norm 2.8214 (3.1652) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][610/1251] eta 0:02:28 lr 0.000344 wd 0.0500 time 0.2267 (0.2320) data time 0.0006 (0.0022) model time 0.2261 (0.2300) loss 1.8619 (3.0396) grad_norm 2.2287 (3.1566) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][620/1251] eta 0:02:26 lr 0.000344 wd 0.0500 time 0.2279 (0.2319) data time 0.0009 (0.0021) model time 0.2270 (0.2300) loss 3.3584 (3.0391) grad_norm 2.4503 (3.1615) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][630/1251] eta 0:02:23 lr 0.000344 wd 0.0500 time 0.2253 (0.2319) data time 0.0009 (0.0021) model time 0.2244 (0.2299) loss 2.0717 (3.0395) grad_norm 5.1405 (3.1671) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][640/1251] eta 0:02:21 lr 0.000344 wd 0.0500 time 0.2246 (0.2319) data time 0.0007 (0.0021) model time 0.2239 (0.2300) loss 2.4478 (3.0420) grad_norm 3.0724 (3.1668) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][650/1251] eta 0:02:19 lr 0.000344 wd 0.0500 time 0.2275 (0.2319) data time 0.0013 (0.0021) model time 0.2262 (0.2299) loss 2.1787 (3.0461) grad_norm 1.8345 (3.1615) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][660/1251] eta 0:02:17 lr 0.000344 wd 0.0500 time 0.2302 (0.2318) data time 0.0009 (0.0021) model time 0.2293 (0.2299) loss 3.6590 (3.0421) grad_norm 2.4101 (3.1570) loss_scale 1024.0000 (515.0983) mem 7381MB [2024-08-27 
21:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][670/1251] eta 0:02:14 lr 0.000344 wd 0.0500 time 0.2291 (0.2318) data time 0.0008 (0.0021) model time 0.2282 (0.2299) loss 3.2477 (3.0420) grad_norm 2.4975 (3.1546) loss_scale 1024.0000 (522.6826) mem 7381MB [2024-08-27 21:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][680/1251] eta 0:02:12 lr 0.000344 wd 0.0500 time 0.2226 (0.2318) data time 0.0014 (0.0021) model time 0.2212 (0.2299) loss 2.9731 (3.0428) grad_norm 2.7082 (3.1520) loss_scale 1024.0000 (530.0441) mem 7381MB [2024-08-27 21:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][690/1251] eta 0:02:09 lr 0.000344 wd 0.0500 time 0.2267 (0.2317) data time 0.0009 (0.0020) model time 0.2258 (0.2298) loss 2.6018 (3.0431) grad_norm 2.8026 (3.1626) loss_scale 1024.0000 (537.1925) mem 7381MB [2024-08-27 21:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][700/1251] eta 0:02:07 lr 0.000344 wd 0.0500 time 0.2287 (0.2317) data time 0.0010 (0.0020) model time 0.2277 (0.2298) loss 2.5941 (3.0409) grad_norm 2.4616 (3.1559) loss_scale 1024.0000 (544.1369) mem 7381MB [2024-08-27 21:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][710/1251] eta 0:02:05 lr 0.000344 wd 0.0500 time 0.2300 (0.2317) data time 0.0010 (0.0020) model time 0.2291 (0.2298) loss 3.7144 (3.0421) grad_norm 2.4617 (3.1530) loss_scale 1024.0000 (550.8861) mem 7381MB [2024-08-27 21:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][720/1251] eta 0:02:02 lr 0.000344 wd 0.0500 time 0.2267 (0.2316) data time 0.0009 (0.0020) model time 0.2258 (0.2298) loss 3.3584 (3.0445) grad_norm 2.6303 (3.1512) loss_scale 1024.0000 (557.4480) mem 7381MB [2024-08-27 21:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][730/1251] eta 0:02:00 lr 0.000344 wd 0.0500 time 0.2275 (0.2316) data time 0.0010 (0.0020) model time 0.2266 (0.2297) loss 
3.1515 (3.0458) grad_norm 2.3796 (3.1658) loss_scale 1024.0000 (563.8304) mem 7381MB [2024-08-27 21:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][740/1251] eta 0:01:58 lr 0.000344 wd 0.0500 time 0.2246 (0.2316) data time 0.0007 (0.0020) model time 0.2239 (0.2297) loss 3.8553 (3.0453) grad_norm 3.8179 (3.1625) loss_scale 1024.0000 (570.0405) mem 7381MB [2024-08-27 21:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][750/1251] eta 0:01:56 lr 0.000344 wd 0.0500 time 0.2286 (0.2316) data time 0.0008 (0.0020) model time 0.2278 (0.2297) loss 3.3487 (3.0499) grad_norm 2.8430 (3.1695) loss_scale 1024.0000 (576.0852) mem 7381MB [2024-08-27 21:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][760/1251] eta 0:01:53 lr 0.000344 wd 0.0500 time 0.2229 (0.2316) data time 0.0009 (0.0020) model time 0.2220 (0.2297) loss 3.2733 (3.0531) grad_norm 3.4133 (3.1665) loss_scale 1024.0000 (581.9711) mem 7381MB [2024-08-27 21:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][770/1251] eta 0:01:51 lr 0.000344 wd 0.0500 time 0.2297 (0.2315) data time 0.0009 (0.0020) model time 0.2288 (0.2297) loss 2.4942 (3.0548) grad_norm 3.0992 (3.1651) loss_scale 1024.0000 (587.7043) mem 7381MB [2024-08-27 21:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][780/1251] eta 0:01:49 lr 0.000343 wd 0.0500 time 0.2326 (0.2315) data time 0.0008 (0.0019) model time 0.2318 (0.2297) loss 2.9668 (3.0561) grad_norm 2.0242 (3.1660) loss_scale 1024.0000 (593.2907) mem 7381MB [2024-08-27 21:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][790/1251] eta 0:01:46 lr 0.000343 wd 0.0500 time 0.2267 (0.2315) data time 0.0009 (0.0019) model time 0.2258 (0.2297) loss 3.3257 (3.0552) grad_norm 3.5905 (3.1824) loss_scale 1024.0000 (598.7358) mem 7381MB [2024-08-27 21:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][800/1251] eta 0:01:44 lr 
0.000343 wd 0.0500 time 0.2336 (0.2315) data time 0.0009 (0.0019) model time 0.2327 (0.2297) loss 3.1910 (3.0556) grad_norm 3.8666 (3.2004) loss_scale 1024.0000 (604.0449) mem 7381MB [2024-08-27 21:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][810/1251] eta 0:01:42 lr 0.000343 wd 0.0500 time 0.2266 (0.2315) data time 0.0011 (0.0019) model time 0.2256 (0.2297) loss 3.4057 (3.0551) grad_norm 2.0735 (3.1962) loss_scale 1024.0000 (609.2232) mem 7381MB [2024-08-27 21:16:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][820/1251] eta 0:01:39 lr 0.000343 wd 0.0500 time 0.2320 (0.2315) data time 0.0007 (0.0019) model time 0.2313 (0.2297) loss 3.5443 (3.0547) grad_norm 3.2443 (3.1956) loss_scale 1024.0000 (614.2753) mem 7381MB [2024-08-27 21:16:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][830/1251] eta 0:01:37 lr 0.000343 wd 0.0500 time 0.2235 (0.2314) data time 0.0009 (0.0019) model time 0.2226 (0.2296) loss 2.9275 (3.0572) grad_norm 2.4913 (3.1933) loss_scale 1024.0000 (619.2058) mem 7381MB [2024-08-27 21:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][840/1251] eta 0:01:35 lr 0.000343 wd 0.0500 time 0.2257 (0.2314) data time 0.0007 (0.0019) model time 0.2250 (0.2296) loss 3.6924 (3.0598) grad_norm 2.3492 (3.1872) loss_scale 1024.0000 (624.0190) mem 7381MB [2024-08-27 21:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][850/1251] eta 0:01:32 lr 0.000343 wd 0.0500 time 0.2293 (0.2314) data time 0.0009 (0.0019) model time 0.2284 (0.2296) loss 3.1211 (3.0573) grad_norm 2.3344 (3.1844) loss_scale 1024.0000 (628.7192) mem 7381MB [2024-08-27 21:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][860/1251] eta 0:01:30 lr 0.000343 wd 0.0500 time 0.2275 (0.2313) data time 0.0009 (0.0019) model time 0.2266 (0.2296) loss 3.1447 (3.0587) grad_norm 2.6715 (3.1841) loss_scale 1024.0000 (633.3101) mem 7381MB [2024-08-27 21:16:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][870/1251] eta 0:01:28 lr 0.000343 wd 0.0500 time 0.2229 (0.2313) data time 0.0009 (0.0019) model time 0.2220 (0.2295) loss 3.4150 (3.0597) grad_norm 3.3768 (3.1817) loss_scale 1024.0000 (637.7956) mem 7381MB [2024-08-27 21:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][880/1251] eta 0:01:25 lr 0.000343 wd 0.0500 time 0.2274 (0.2313) data time 0.0011 (0.0019) model time 0.2263 (0.2295) loss 3.9919 (3.0618) grad_norm 2.7057 (3.1760) loss_scale 1024.0000 (642.1793) mem 7381MB [2024-08-27 21:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][890/1251] eta 0:01:23 lr 0.000343 wd 0.0500 time 0.2224 (0.2313) data time 0.0011 (0.0018) model time 0.2213 (0.2295) loss 4.0104 (3.0625) grad_norm 2.7198 (3.1686) loss_scale 1024.0000 (646.4646) mem 7381MB [2024-08-27 21:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][900/1251] eta 0:01:21 lr 0.000343 wd 0.0500 time 0.2396 (0.2313) data time 0.0007 (0.0018) model time 0.2389 (0.2295) loss 3.4024 (3.0657) grad_norm 2.5269 (3.1642) loss_scale 1024.0000 (650.6548) mem 7381MB [2024-08-27 21:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][910/1251] eta 0:01:18 lr 0.000343 wd 0.0500 time 0.2272 (0.2313) data time 0.0007 (0.0018) model time 0.2265 (0.2296) loss 2.1841 (3.0658) grad_norm 2.5432 (3.1610) loss_scale 1024.0000 (654.7530) mem 7381MB [2024-08-27 21:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][920/1251] eta 0:01:16 lr 0.000343 wd 0.0500 time 0.2337 (0.2313) data time 0.0008 (0.0018) model time 0.2329 (0.2296) loss 3.5497 (3.0689) grad_norm 2.5872 (3.1572) loss_scale 1024.0000 (658.7622) mem 7381MB [2024-08-27 21:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][930/1251] eta 0:01:14 lr 0.000343 wd 0.0500 time 0.2254 (0.2313) data time 0.0012 (0.0018) model time 0.2242 (0.2296) loss 3.7791 
(3.0680) grad_norm 3.0243 (3.1536) loss_scale 1024.0000 (662.6853) mem 7381MB [2024-08-27 21:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][940/1251] eta 0:01:11 lr 0.000343 wd 0.0500 time 0.2317 (0.2313) data time 0.0007 (0.0018) model time 0.2310 (0.2296) loss 2.0431 (3.0653) grad_norm 2.9923 (3.1574) loss_scale 1024.0000 (666.5250) mem 7381MB [2024-08-27 21:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][950/1251] eta 0:01:09 lr 0.000343 wd 0.0500 time 0.2240 (0.2313) data time 0.0009 (0.0018) model time 0.2231 (0.2296) loss 3.3192 (3.0645) grad_norm 5.2100 (3.1580) loss_scale 1024.0000 (670.2839) mem 7381MB [2024-08-27 21:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][960/1251] eta 0:01:07 lr 0.000343 wd 0.0500 time 0.2277 (0.2313) data time 0.0009 (0.0018) model time 0.2268 (0.2296) loss 3.3050 (3.0661) grad_norm 3.6997 (3.1618) loss_scale 1024.0000 (673.9646) mem 7381MB [2024-08-27 21:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][970/1251] eta 0:01:04 lr 0.000343 wd 0.0500 time 0.2292 (0.2313) data time 0.0008 (0.0018) model time 0.2284 (0.2296) loss 3.5693 (3.0675) grad_norm 3.0149 (3.1665) loss_scale 1024.0000 (677.5695) mem 7381MB [2024-08-27 21:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][980/1251] eta 0:01:02 lr 0.000343 wd 0.0500 time 0.2332 (0.2313) data time 0.0011 (0.0018) model time 0.2321 (0.2295) loss 2.1104 (3.0661) grad_norm 3.3197 (3.1641) loss_scale 1024.0000 (681.1009) mem 7381MB [2024-08-27 21:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][990/1251] eta 0:01:00 lr 0.000343 wd 0.0500 time 0.2241 (0.2313) data time 0.0017 (0.0018) model time 0.2223 (0.2295) loss 3.2230 (3.0644) grad_norm 3.0602 (3.1626) loss_scale 1024.0000 (684.5610) mem 7381MB [2024-08-27 21:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1000/1251] eta 0:00:58 lr 0.000343 
wd 0.0500 time 0.2232 (0.2312) data time 0.0009 (0.0018) model time 0.2223 (0.2295) loss 3.0079 (3.0631) grad_norm 2.0769 (3.1612) loss_scale 1024.0000 (687.9520) mem 7381MB [2024-08-27 21:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1010/1251] eta 0:00:55 lr 0.000343 wd 0.0500 time 0.2218 (0.2312) data time 0.0008 (0.0018) model time 0.2210 (0.2295) loss 3.7242 (3.0648) grad_norm 2.8605 (3.1583) loss_scale 1024.0000 (691.2760) mem 7381MB [2024-08-27 21:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1020/1251] eta 0:00:53 lr 0.000342 wd 0.0500 time 0.2261 (0.2312) data time 0.0009 (0.0018) model time 0.2252 (0.2295) loss 2.5308 (3.0643) grad_norm 2.2466 (3.1566) loss_scale 1024.0000 (694.5348) mem 7381MB [2024-08-27 21:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1030/1251] eta 0:00:51 lr 0.000342 wd 0.0500 time 0.2561 (0.2312) data time 0.0011 (0.0018) model time 0.2551 (0.2295) loss 3.1158 (3.0657) grad_norm 3.1567 (3.1527) loss_scale 1024.0000 (697.7304) mem 7381MB [2024-08-27 21:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1040/1251] eta 0:00:48 lr 0.000342 wd 0.0500 time 0.2284 (0.2312) data time 0.0007 (0.0018) model time 0.2277 (0.2295) loss 2.7018 (3.0637) grad_norm 3.7235 (3.1631) loss_scale 1024.0000 (700.8646) mem 7381MB [2024-08-27 21:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1050/1251] eta 0:00:46 lr 0.000342 wd 0.0500 time 0.2327 (0.2312) data time 0.0010 (0.0018) model time 0.2317 (0.2295) loss 2.9050 (3.0625) grad_norm 2.6069 (3.1576) loss_scale 1024.0000 (703.9391) mem 7381MB [2024-08-27 21:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1060/1251] eta 0:00:44 lr 0.000342 wd 0.0500 time 0.2369 (0.2313) data time 0.0006 (0.0018) model time 0.2363 (0.2296) loss 2.2865 (3.0598) grad_norm 3.5978 (3.1597) loss_scale 1024.0000 (706.9557) mem 7381MB [2024-08-27 21:17:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1070/1251] eta 0:00:41 lr 0.000342 wd 0.0500 time 0.2305 (0.2313) data time 0.0008 (0.0018) model time 0.2297 (0.2296) loss 3.2575 (3.0597) grad_norm 4.6991 (3.1578) loss_scale 1024.0000 (709.9160) mem 7381MB [2024-08-27 21:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1080/1251] eta 0:00:39 lr 0.000342 wd 0.0500 time 0.2257 (0.2312) data time 0.0009 (0.0018) model time 0.2248 (0.2295) loss 2.1022 (3.0557) grad_norm 3.3499 (3.1544) loss_scale 1024.0000 (712.8215) mem 7381MB [2024-08-27 21:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1090/1251] eta 0:00:37 lr 0.000342 wd 0.0500 time 0.2319 (0.2312) data time 0.0009 (0.0018) model time 0.2310 (0.2296) loss 3.4560 (3.0568) grad_norm 3.0920 (3.1519) loss_scale 1024.0000 (715.6737) mem 7381MB [2024-08-27 21:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1100/1251] eta 0:00:34 lr 0.000342 wd 0.0500 time 0.2288 (0.2312) data time 0.0008 (0.0017) model time 0.2280 (0.2296) loss 3.1012 (3.0556) grad_norm 2.6075 (3.1513) loss_scale 1024.0000 (718.4741) mem 7381MB [2024-08-27 21:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1110/1251] eta 0:00:32 lr 0.000342 wd 0.0500 time 0.2266 (0.2312) data time 0.0008 (0.0017) model time 0.2258 (0.2295) loss 3.3927 (3.0563) grad_norm 2.8739 (3.1458) loss_scale 1024.0000 (721.2241) mem 7381MB [2024-08-27 21:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1120/1251] eta 0:00:30 lr 0.000342 wd 0.0500 time 0.2326 (0.2312) data time 0.0009 (0.0017) model time 0.2317 (0.2295) loss 3.2709 (3.0553) grad_norm 3.8941 (3.1443) loss_scale 1024.0000 (723.9251) mem 7381MB [2024-08-27 21:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1130/1251] eta 0:00:27 lr 0.000342 wd 0.0500 time 0.2259 (0.2312) data time 0.0009 (0.0017) model time 0.2250 (0.2295) loss 
2.8551 (3.0582) grad_norm 2.4793 (3.1448) loss_scale 1024.0000 (726.5782) mem 7381MB [2024-08-27 21:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1140/1251] eta 0:00:25 lr 0.000342 wd 0.0500 time 0.2336 (0.2314) data time 0.0008 (0.0017) model time 0.2328 (0.2297) loss 2.2206 (3.0581) grad_norm 3.0705 (3.1419) loss_scale 1024.0000 (729.1849) mem 7381MB [2024-08-27 21:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1150/1251] eta 0:00:23 lr 0.000342 wd 0.0500 time 0.2248 (0.2314) data time 0.0011 (0.0017) model time 0.2237 (0.2297) loss 2.9747 (3.0568) grad_norm 2.4990 (3.1370) loss_scale 1024.0000 (731.7463) mem 7381MB [2024-08-27 21:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1160/1251] eta 0:00:21 lr 0.000342 wd 0.0500 time 0.2249 (0.2314) data time 0.0013 (0.0017) model time 0.2235 (0.2297) loss 3.1642 (3.0564) grad_norm 2.1347 (3.1358) loss_scale 1024.0000 (734.2636) mem 7381MB [2024-08-27 21:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1170/1251] eta 0:00:18 lr 0.000342 wd 0.0500 time 0.2402 (0.2314) data time 0.0016 (0.0017) model time 0.2387 (0.2297) loss 3.6781 (3.0572) grad_norm 2.1605 (3.1300) loss_scale 1024.0000 (736.7378) mem 7381MB [2024-08-27 21:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1180/1251] eta 0:00:16 lr 0.000342 wd 0.0500 time 0.2221 (0.2314) data time 0.0010 (0.0017) model time 0.2211 (0.2297) loss 3.0612 (3.0552) grad_norm 2.9580 (3.1256) loss_scale 1024.0000 (739.1702) mem 7381MB [2024-08-27 21:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1190/1251] eta 0:00:14 lr 0.000342 wd 0.0500 time 0.2210 (0.2317) data time 0.0010 (0.0017) model time 0.2200 (0.2301) loss 3.4660 (3.0566) grad_norm 2.6585 (3.1221) loss_scale 1024.0000 (741.5617) mem 7381MB [2024-08-27 21:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1200/1251] eta 0:00:11 
lr 0.000342 wd 0.0500 time 0.2365 (0.2317) data time 0.0009 (0.0017) model time 0.2357 (0.2301) loss 3.1925 (3.0563) grad_norm 2.2323 (3.1269) loss_scale 1024.0000 (743.9134) mem 7381MB [2024-08-27 21:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1210/1251] eta 0:00:09 lr 0.000342 wd 0.0500 time 0.2307 (0.2317) data time 0.0009 (0.0017) model time 0.2298 (0.2301) loss 2.7804 (3.0560) grad_norm 2.3219 (3.1237) loss_scale 1024.0000 (746.2263) mem 7381MB [2024-08-27 21:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1220/1251] eta 0:00:07 lr 0.000342 wd 0.0500 time 0.2288 (0.2316) data time 0.0007 (0.0017) model time 0.2281 (0.2300) loss 2.4604 (3.0541) grad_norm 3.2779 (3.1194) loss_scale 1024.0000 (748.5012) mem 7381MB [2024-08-27 21:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1230/1251] eta 0:00:04 lr 0.000342 wd 0.0500 time 0.2256 (0.2316) data time 0.0013 (0.0017) model time 0.2243 (0.2300) loss 3.4408 (3.0554) grad_norm 2.7740 (3.1171) loss_scale 1024.0000 (750.7392) mem 7381MB [2024-08-27 21:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1240/1251] eta 0:00:02 lr 0.000342 wd 0.0500 time 0.2111 (0.2315) data time 0.0006 (0.0017) model time 0.2105 (0.2299) loss 2.6406 (3.0542) grad_norm 4.3912 (3.1255) loss_scale 1024.0000 (752.9412) mem 7381MB [2024-08-27 21:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [189/300][1250/1251] eta 0:00:00 lr 0.000342 wd 0.0500 time 0.2118 (0.2314) data time 0.0006 (0.0017) model time 0.2111 (0.2298) loss 3.3451 (3.0552) grad_norm 2.7229 (3.1252) loss_scale 1024.0000 (755.1079) mem 7381MB [2024-08-27 21:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 189 training takes 0:04:49 [2024-08-27 21:17:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 21:17:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 21:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.573 (0.573) Loss 0.4177 (0.4177) Acc@1 92.383 (92.383) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 21:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.095 (0.126) Loss 0.7012 (0.6830) Acc@1 87.109 (85.529) Acc@5 96.973 (97.195) Mem 7381MB [2024-08-27 21:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.106) Loss 0.9941 (0.7060) Acc@1 75.977 (84.556) Acc@5 94.531 (97.145) Mem 7381MB [2024-08-27 21:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.097) Loss 1.2188 (0.8066) Acc@1 72.461 (82.170) Acc@5 90.918 (96.050) Mem 7381MB [2024-08-27 21:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.090) Loss 1.0586 (0.8593) Acc@1 74.707 (80.905) Acc@5 93.945 (95.489) Mem 7381MB [2024-08-27 21:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.546 Acc@5 95.430 [2024-08-27 21:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.5% [2024-08-27 21:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.55% [2024-08-27 21:17:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 21:17:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 21:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.523 (0.523) Loss 0.3882 (0.3882) Acc@1 93.066 (93.066) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 21:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.129) Loss 0.6074 (0.6174) Acc@1 88.379 (86.932) Acc@5 97.266 (97.443) Mem 7381MB [2024-08-27 21:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.107) Loss 0.8774 (0.6420) Acc@1 78.125 (85.951) Acc@5 95.996 (97.466) Mem 7381MB [2024-08-27 21:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.098) Loss 1.1123 (0.7270) Acc@1 73.535 (83.896) Acc@5 93.066 (96.604) Mem 7381MB [2024-08-27 21:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.090) Loss 0.9844 (0.7704) Acc@1 76.172 (82.615) Acc@5 94.238 (96.134) Mem 7381MB [2024-08-27 21:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.196 Acc@5 96.108 [2024-08-27 21:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.2% [2024-08-27 21:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.20% [2024-08-27 21:17:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 21:17:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 21:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][0/1251] eta 0:17:44 lr 0.000342 wd 0.0500 time 0.8507 (0.8507) data time 0.6285 (0.6285) model time 0.0000 (0.0000) loss 3.2740 (3.2740) grad_norm 2.4580 (2.4580) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][10/1251] eta 0:05:53 lr 0.000341 wd 0.0500 time 0.2258 (0.2847) data time 0.0012 (0.0581) model time 0.0000 (0.0000) loss 2.9762 (2.8005) grad_norm 2.3090 (2.7525) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][20/1251] eta 0:05:19 lr 0.000341 wd 0.0500 time 0.2447 (0.2594) data time 0.0009 (0.0309) model time 0.0000 (0.0000) loss 3.3263 (2.8929) grad_norm 2.5670 (2.7912) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][30/1251] eta 0:05:04 lr 0.000341 wd 0.0500 time 0.2336 (0.2497) data time 0.0010 (0.0213) model time 0.0000 (0.0000) loss 3.8228 (2.9419) grad_norm 2.7206 (2.8424) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][40/1251] eta 0:04:56 lr 0.000341 wd 0.0500 time 0.2408 (0.2446) data time 0.0009 (0.0163) model time 0.0000 (0.0000) loss 2.9856 (3.0190) grad_norm 2.0683 (2.8759) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][50/1251] eta 0:04:51 lr 0.000341 wd 0.0500 time 0.2429 (0.2424) data time 0.0009 (0.0134) model time 0.0000 (0.0000) loss 3.3227 (3.0096) grad_norm 3.8107 (2.9997) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][60/1251] eta 0:04:45 lr 0.000341 wd 0.0500 time 0.2270 (0.2401) data time 0.0010 (0.0113) model time 0.2259 
(0.2272) loss 3.2858 (3.0119) grad_norm 2.1148 (3.0030) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][70/1251] eta 0:04:41 lr 0.000341 wd 0.0500 time 0.2293 (0.2386) data time 0.0008 (0.0099) model time 0.2286 (0.2279) loss 2.6070 (3.0131) grad_norm 3.2121 (2.9877) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][80/1251] eta 0:04:38 lr 0.000341 wd 0.0500 time 0.2423 (0.2374) data time 0.0007 (0.0088) model time 0.2416 (0.2280) loss 2.0813 (3.0056) grad_norm 3.3878 (2.9535) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][90/1251] eta 0:04:34 lr 0.000341 wd 0.0500 time 0.2276 (0.2364) data time 0.0010 (0.0079) model time 0.2267 (0.2277) loss 3.4487 (3.0019) grad_norm 2.9197 (2.9446) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][100/1251] eta 0:04:31 lr 0.000341 wd 0.0500 time 0.2552 (0.2361) data time 0.0013 (0.0073) model time 0.2539 (0.2288) loss 2.2184 (2.9864) grad_norm 3.8591 (2.9440) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][110/1251] eta 0:04:28 lr 0.000341 wd 0.0500 time 0.2237 (0.2357) data time 0.0011 (0.0068) model time 0.2226 (0.2288) loss 2.7939 (2.9866) grad_norm 3.0991 (2.9520) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][120/1251] eta 0:04:26 lr 0.000341 wd 0.0500 time 0.2471 (0.2352) data time 0.0012 (0.0064) model time 0.2459 (0.2288) loss 3.2118 (2.9838) grad_norm 2.1512 (2.9207) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][130/1251] 
eta 0:04:23 lr 0.000341 wd 0.0500 time 0.2285 (0.2348) data time 0.0009 (0.0060) model time 0.2276 (0.2288) loss 3.2204 (2.9758) grad_norm 2.1856 (2.9386) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][140/1251] eta 0:04:20 lr 0.000341 wd 0.0500 time 0.2494 (0.2345) data time 0.0009 (0.0056) model time 0.2485 (0.2287) loss 2.9804 (2.9859) grad_norm 3.4271 (2.9408) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][150/1251] eta 0:04:18 lr 0.000341 wd 0.0500 time 0.2478 (0.2343) data time 0.0013 (0.0053) model time 0.2465 (0.2290) loss 2.1773 (2.9974) grad_norm 2.2982 (2.9452) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][160/1251] eta 0:04:15 lr 0.000341 wd 0.0500 time 0.2324 (0.2341) data time 0.0008 (0.0051) model time 0.2316 (0.2290) loss 2.7383 (2.9999) grad_norm 4.0081 (2.9913) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][170/1251] eta 0:04:12 lr 0.000341 wd 0.0500 time 0.2363 (0.2338) data time 0.0012 (0.0048) model time 0.2351 (0.2289) loss 3.0125 (3.0091) grad_norm 2.9416 (2.9775) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][180/1251] eta 0:04:10 lr 0.000341 wd 0.0500 time 0.2633 (0.2337) data time 0.0010 (0.0046) model time 0.2624 (0.2291) loss 3.3940 (3.0333) grad_norm 6.8238 (2.9857) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][190/1251] eta 0:04:07 lr 0.000341 wd 0.0500 time 0.2286 (0.2334) data time 0.0007 (0.0045) model time 0.2279 (0.2289) loss 3.4698 (3.0342) grad_norm 3.2738 (2.9860) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-27 21:18:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][200/1251] eta 0:04:05 lr 0.000341 wd 0.0500 time 0.2305 (0.2332) data time 0.0007 (0.0043) model time 0.2298 (0.2289) loss 3.6565 (3.0384) grad_norm 2.8575 (2.9789) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][210/1251] eta 0:04:02 lr 0.000341 wd 0.0500 time 0.2264 (0.2330) data time 0.0006 (0.0041) model time 0.2258 (0.2289) loss 2.3567 (3.0393) grad_norm 2.9733 (3.0007) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][220/1251] eta 0:04:00 lr 0.000341 wd 0.0500 time 0.2293 (0.2330) data time 0.0007 (0.0040) model time 0.2286 (0.2290) loss 2.4034 (3.0353) grad_norm 6.0872 (3.0275) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][230/1251] eta 0:03:57 lr 0.000341 wd 0.0500 time 0.2360 (0.2329) data time 0.0008 (0.0039) model time 0.2351 (0.2290) loss 3.9856 (3.0335) grad_norm 4.4926 (3.0323) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][240/1251] eta 0:03:55 lr 0.000341 wd 0.0500 time 0.2315 (0.2327) data time 0.0009 (0.0037) model time 0.2307 (0.2289) loss 3.0638 (3.0421) grad_norm 2.8305 (3.0302) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][250/1251] eta 0:03:52 lr 0.000340 wd 0.0500 time 0.2267 (0.2325) data time 0.0006 (0.0036) model time 0.2260 (0.2288) loss 2.9058 (3.0455) grad_norm 2.4126 (3.0381) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][260/1251] eta 0:03:50 lr 0.000340 wd 0.0500 time 0.2292 (0.2323) data time 0.0010 (0.0036) model time 0.2281 
(0.2286) loss 3.5224 (3.0445) grad_norm 2.5548 (3.0391) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][270/1251] eta 0:03:47 lr 0.000340 wd 0.0500 time 0.2336 (0.2322) data time 0.0008 (0.0035) model time 0.2328 (0.2286) loss 3.3447 (3.0319) grad_norm 7.8951 (3.0538) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][280/1251] eta 0:03:45 lr 0.000340 wd 0.0500 time 0.2296 (0.2320) data time 0.0008 (0.0034) model time 0.2288 (0.2286) loss 3.3242 (3.0279) grad_norm 3.9802 (3.0655) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][290/1251] eta 0:03:42 lr 0.000340 wd 0.0500 time 0.2244 (0.2320) data time 0.0012 (0.0033) model time 0.2232 (0.2285) loss 2.7333 (3.0309) grad_norm 2.4066 (3.0603) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][300/1251] eta 0:03:40 lr 0.000340 wd 0.0500 time 0.2682 (0.2320) data time 0.0008 (0.0033) model time 0.2674 (0.2287) loss 3.6517 (3.0339) grad_norm 3.2723 (3.0562) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][310/1251] eta 0:03:38 lr 0.000340 wd 0.0500 time 0.2335 (0.2318) data time 0.0010 (0.0032) model time 0.2325 (0.2286) loss 3.1563 (3.0361) grad_norm 2.1999 (3.0473) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][320/1251] eta 0:03:35 lr 0.000340 wd 0.0500 time 0.2380 (0.2318) data time 0.0007 (0.0031) model time 0.2373 (0.2286) loss 3.8107 (3.0415) grad_norm 4.3945 (3.0475) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[190/300][330/1251] eta 0:03:33 lr 0.000340 wd 0.0500 time 0.2215 (0.2323) data time 0.0009 (0.0031) model time 0.2207 (0.2293) loss 2.8100 (3.0479) grad_norm 2.6610 (3.0399) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][340/1251] eta 0:03:31 lr 0.000340 wd 0.0500 time 0.2268 (0.2323) data time 0.0009 (0.0030) model time 0.2258 (0.2293) loss 2.7450 (3.0442) grad_norm 2.1473 (3.0347) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][350/1251] eta 0:03:29 lr 0.000340 wd 0.0500 time 0.2256 (0.2322) data time 0.0008 (0.0030) model time 0.2247 (0.2292) loss 2.8594 (3.0461) grad_norm 2.1337 (3.0235) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][360/1251] eta 0:03:26 lr 0.000340 wd 0.0500 time 0.2311 (0.2321) data time 0.0009 (0.0029) model time 0.2302 (0.2292) loss 2.4240 (3.0444) grad_norm 2.3840 (3.0231) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][370/1251] eta 0:03:24 lr 0.000340 wd 0.0500 time 0.2310 (0.2320) data time 0.0007 (0.0029) model time 0.2304 (0.2292) loss 2.4157 (3.0427) grad_norm 2.9468 (3.0309) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][380/1251] eta 0:03:22 lr 0.000340 wd 0.0500 time 0.2203 (0.2319) data time 0.0010 (0.0028) model time 0.2193 (0.2291) loss 2.6984 (3.0386) grad_norm 2.8418 (3.0190) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][390/1251] eta 0:03:19 lr 0.000340 wd 0.0500 time 0.2294 (0.2319) data time 0.0011 (0.0028) model time 0.2283 (0.2291) loss 3.4203 (3.0434) grad_norm 3.0856 (3.0174) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-27 21:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][400/1251] eta 0:03:17 lr 0.000340 wd 0.0500 time 0.2353 (0.2318) data time 0.0010 (0.0027) model time 0.2343 (0.2291) loss 3.8124 (3.0494) grad_norm 4.5473 (3.0354) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][410/1251] eta 0:03:14 lr 0.000340 wd 0.0500 time 0.2261 (0.2317) data time 0.0009 (0.0027) model time 0.2252 (0.2291) loss 2.2050 (3.0404) grad_norm 3.6277 (3.0266) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][420/1251] eta 0:03:12 lr 0.000340 wd 0.0500 time 0.2329 (0.2317) data time 0.0007 (0.0026) model time 0.2322 (0.2290) loss 3.5776 (3.0423) grad_norm 2.1308 (3.0259) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][430/1251] eta 0:03:10 lr 0.000340 wd 0.0500 time 0.2260 (0.2317) data time 0.0007 (0.0026) model time 0.2253 (0.2290) loss 3.3182 (3.0424) grad_norm 2.9865 (3.0244) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][440/1251] eta 0:03:07 lr 0.000340 wd 0.0500 time 0.2221 (0.2316) data time 0.0013 (0.0026) model time 0.2208 (0.2290) loss 3.6930 (3.0421) grad_norm 3.2127 (3.0274) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][450/1251] eta 0:03:05 lr 0.000340 wd 0.0500 time 0.2280 (0.2316) data time 0.0009 (0.0026) model time 0.2271 (0.2290) loss 3.0634 (3.0482) grad_norm 3.8138 (3.0396) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][460/1251] eta 0:03:03 lr 0.000340 wd 0.0500 time 0.2364 (0.2315) data time 0.0006 
(0.0025) model time 0.2358 (0.2290) loss 3.3001 (3.0448) grad_norm 3.8400 (3.0481) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][470/1251] eta 0:03:00 lr 0.000340 wd 0.0500 time 0.2345 (0.2315) data time 0.0010 (0.0025) model time 0.2335 (0.2290) loss 3.0422 (3.0447) grad_norm 3.3383 (3.0496) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][480/1251] eta 0:02:58 lr 0.000340 wd 0.0500 time 0.2462 (0.2314) data time 0.0011 (0.0025) model time 0.2450 (0.2289) loss 2.6588 (3.0451) grad_norm 4.7961 (3.0791) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][490/1251] eta 0:02:56 lr 0.000339 wd 0.0500 time 0.2292 (0.2314) data time 0.0007 (0.0025) model time 0.2284 (0.2289) loss 3.7101 (3.0412) grad_norm 2.4973 (3.0758) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][500/1251] eta 0:02:53 lr 0.000339 wd 0.0500 time 0.2371 (0.2314) data time 0.0007 (0.0025) model time 0.2364 (0.2289) loss 3.2217 (3.0411) grad_norm 3.5499 (3.0711) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][510/1251] eta 0:02:51 lr 0.000339 wd 0.0500 time 0.2225 (0.2313) data time 0.0010 (0.0024) model time 0.2215 (0.2289) loss 3.1142 (3.0443) grad_norm 2.4528 (3.0718) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][520/1251] eta 0:02:49 lr 0.000339 wd 0.0500 time 0.2276 (0.2314) data time 0.0007 (0.0024) model time 0.2268 (0.2290) loss 3.8814 (3.0480) grad_norm 2.4953 (3.0735) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [190/300][530/1251] eta 0:02:46 lr 0.000339 wd 0.0500 time 0.2427 (0.2314) data time 0.0010 (0.0024) model time 0.2418 (0.2290) loss 3.4559 (3.0531) grad_norm 3.1830 (3.0703) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][540/1251] eta 0:02:44 lr 0.000339 wd 0.0500 time 0.2309 (0.2313) data time 0.0011 (0.0024) model time 0.2298 (0.2289) loss 2.1913 (3.0497) grad_norm 4.0424 (3.0756) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][550/1251] eta 0:02:42 lr 0.000339 wd 0.0500 time 0.2229 (0.2313) data time 0.0012 (0.0023) model time 0.2217 (0.2289) loss 3.4869 (3.0515) grad_norm 3.0690 (3.0687) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][560/1251] eta 0:02:39 lr 0.000339 wd 0.0500 time 0.2302 (0.2312) data time 0.0014 (0.0023) model time 0.2289 (0.2289) loss 2.9623 (3.0514) grad_norm 2.2941 (3.0614) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][570/1251] eta 0:02:37 lr 0.000339 wd 0.0500 time 0.2296 (0.2311) data time 0.0009 (0.0023) model time 0.2287 (0.2288) loss 2.9637 (3.0521) grad_norm 2.4203 (3.0671) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][580/1251] eta 0:02:35 lr 0.000339 wd 0.0500 time 0.2455 (0.2312) data time 0.0006 (0.0023) model time 0.2448 (0.2289) loss 3.1884 (3.0535) grad_norm 2.4058 (3.0628) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][590/1251] eta 0:02:32 lr 0.000339 wd 0.0500 time 0.2239 (0.2311) data time 0.0007 (0.0022) model time 0.2232 (0.2289) loss 3.7338 (3.0560) grad_norm 4.5989 (3.0723) loss_scale 
1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][600/1251] eta 0:02:30 lr 0.000339 wd 0.0500 time 0.2269 (0.2311) data time 0.0007 (0.0022) model time 0.2262 (0.2289) loss 3.1808 (3.0563) grad_norm 2.6443 (3.0692) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][610/1251] eta 0:02:28 lr 0.000339 wd 0.0500 time 0.2270 (0.2311) data time 0.0010 (0.0022) model time 0.2260 (0.2288) loss 2.2296 (3.0557) grad_norm 2.7574 (3.0668) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][620/1251] eta 0:02:25 lr 0.000339 wd 0.0500 time 0.2289 (0.2311) data time 0.0009 (0.0022) model time 0.2280 (0.2289) loss 2.9659 (3.0559) grad_norm 2.5233 (3.0666) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][630/1251] eta 0:02:23 lr 0.000339 wd 0.0500 time 0.2318 (0.2311) data time 0.0007 (0.0022) model time 0.2311 (0.2289) loss 3.9148 (3.0571) grad_norm 3.1569 (3.0697) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][640/1251] eta 0:02:21 lr 0.000339 wd 0.0500 time 0.2372 (0.2311) data time 0.0009 (0.0022) model time 0.2363 (0.2290) loss 3.2433 (3.0558) grad_norm 2.4787 (3.0659) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][650/1251] eta 0:02:18 lr 0.000339 wd 0.0500 time 0.2257 (0.2311) data time 0.0007 (0.0022) model time 0.2250 (0.2290) loss 3.9357 (3.0559) grad_norm 2.0687 (3.0613) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][660/1251] eta 0:02:16 lr 0.000339 wd 0.0500 time 0.2253 (0.2311) data time 
0.0008 (0.0021) model time 0.2245 (0.2290) loss 3.5745 (3.0564) grad_norm 6.7649 (3.0718) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][670/1251] eta 0:02:14 lr 0.000339 wd 0.0500 time 0.2283 (0.2311) data time 0.0011 (0.0021) model time 0.2271 (0.2290) loss 3.7354 (3.0580) grad_norm 2.0778 (3.0653) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][680/1251] eta 0:02:11 lr 0.000339 wd 0.0500 time 0.2297 (0.2311) data time 0.0008 (0.0021) model time 0.2288 (0.2290) loss 3.6695 (3.0598) grad_norm 2.8603 (3.0605) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][690/1251] eta 0:02:09 lr 0.000339 wd 0.0500 time 0.2237 (0.2311) data time 0.0009 (0.0021) model time 0.2228 (0.2290) loss 3.1571 (3.0621) grad_norm 2.3394 (3.0540) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][700/1251] eta 0:02:07 lr 0.000339 wd 0.0500 time 0.2296 (0.2311) data time 0.0015 (0.0021) model time 0.2281 (0.2290) loss 2.1609 (3.0590) grad_norm 3.5746 (3.0487) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][710/1251] eta 0:02:05 lr 0.000339 wd 0.0500 time 0.2244 (0.2311) data time 0.0014 (0.0021) model time 0.2230 (0.2290) loss 3.3345 (3.0630) grad_norm 7.7909 (3.0566) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][720/1251] eta 0:02:02 lr 0.000338 wd 0.0500 time 0.2299 (0.2311) data time 0.0010 (0.0021) model time 0.2289 (0.2290) loss 3.0047 (3.0622) grad_norm 3.0437 (3.0610) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [190/300][730/1251] eta 0:02:00 lr 0.000338 wd 0.0500 time 0.2207 (0.2311) data time 0.0009 (0.0021) model time 0.2198 (0.2290) loss 3.5245 (3.0632) grad_norm 3.6120 (3.0566) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][740/1251] eta 0:01:58 lr 0.000338 wd 0.0500 time 0.2248 (0.2310) data time 0.0012 (0.0020) model time 0.2237 (0.2290) loss 3.0475 (3.0647) grad_norm 2.2926 (3.0540) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][750/1251] eta 0:01:55 lr 0.000338 wd 0.0500 time 0.2227 (0.2310) data time 0.0010 (0.0020) model time 0.2217 (0.2290) loss 3.2242 (3.0639) grad_norm 3.2524 (3.0565) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][760/1251] eta 0:01:53 lr 0.000338 wd 0.0500 time 0.2268 (0.2310) data time 0.0009 (0.0020) model time 0.2259 (0.2290) loss 3.4872 (3.0615) grad_norm 2.5250 (3.0518) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][770/1251] eta 0:01:51 lr 0.000338 wd 0.0500 time 0.2218 (0.2309) data time 0.0012 (0.0020) model time 0.2206 (0.2289) loss 3.3663 (3.0637) grad_norm 3.3893 (3.0528) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][780/1251] eta 0:01:48 lr 0.000338 wd 0.0500 time 0.2304 (0.2309) data time 0.0015 (0.0020) model time 0.2289 (0.2289) loss 1.7582 (3.0607) grad_norm 2.8964 (3.0503) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][790/1251] eta 0:01:46 lr 0.000338 wd 0.0500 time 0.2310 (0.2309) data time 0.0009 (0.0020) model time 0.2301 (0.2289) loss 3.2441 (3.0608) grad_norm 2.9578 (3.0490) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][800/1251] eta 0:01:44 lr 0.000338 wd 0.0500 time 0.2280 (0.2309) data time 0.0010 (0.0020) model time 0.2270 (0.2289) loss 3.2477 (3.0609) grad_norm 2.9121 (3.0451) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][810/1251] eta 0:01:41 lr 0.000338 wd 0.0500 time 0.2293 (0.2309) data time 0.0008 (0.0020) model time 0.2285 (0.2289) loss 3.1968 (3.0596) grad_norm 3.2313 (3.0414) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][820/1251] eta 0:01:39 lr 0.000338 wd 0.0500 time 0.4309 (0.2311) data time 0.0007 (0.0020) model time 0.4302 (0.2291) loss 1.8553 (3.0590) grad_norm 3.4442 (3.0445) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][830/1251] eta 0:01:37 lr 0.000338 wd 0.0500 time 0.2226 (0.2313) data time 0.0007 (0.0020) model time 0.2219 (0.2293) loss 2.6117 (3.0596) grad_norm 2.6319 (3.0481) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][840/1251] eta 0:01:35 lr 0.000338 wd 0.0500 time 0.2273 (0.2312) data time 0.0009 (0.0020) model time 0.2265 (0.2293) loss 2.5151 (3.0601) grad_norm 2.6507 (3.0471) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][850/1251] eta 0:01:32 lr 0.000338 wd 0.0500 time 0.2257 (0.2314) data time 0.0007 (0.0019) model time 0.2250 (0.2295) loss 3.0723 (3.0585) grad_norm 2.5964 (3.0495) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][860/1251] eta 0:01:30 lr 0.000338 wd 0.0500 time 0.2180 (0.2314) 
data time 0.0009 (0.0019) model time 0.2171 (0.2295) loss 3.4022 (3.0614) grad_norm 3.6639 (3.0485) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][870/1251] eta 0:01:28 lr 0.000338 wd 0.0500 time 0.2281 (0.2313) data time 0.0012 (0.0019) model time 0.2269 (0.2295) loss 3.3305 (3.0603) grad_norm 3.1226 (3.0536) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][880/1251] eta 0:01:25 lr 0.000338 wd 0.0500 time 0.2296 (0.2313) data time 0.0006 (0.0019) model time 0.2290 (0.2294) loss 2.9442 (3.0576) grad_norm 2.7911 (3.0471) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][890/1251] eta 0:01:23 lr 0.000338 wd 0.0500 time 0.2317 (0.2313) data time 0.0009 (0.0019) model time 0.2308 (0.2294) loss 1.9866 (3.0538) grad_norm 1.9603 (3.0442) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][900/1251] eta 0:01:21 lr 0.000338 wd 0.0500 time 0.2245 (0.2313) data time 0.0014 (0.0019) model time 0.2230 (0.2294) loss 3.1603 (3.0557) grad_norm 2.7894 (3.0430) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][910/1251] eta 0:01:18 lr 0.000338 wd 0.0500 time 0.2229 (0.2313) data time 0.0013 (0.0019) model time 0.2216 (0.2294) loss 3.0285 (3.0559) grad_norm 4.0307 (3.0661) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][920/1251] eta 0:01:16 lr 0.000338 wd 0.0500 time 0.2360 (0.2313) data time 0.0009 (0.0019) model time 0.2351 (0.2294) loss 3.3090 (3.0577) grad_norm 4.1425 (3.0685) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [190/300][930/1251] eta 0:01:14 lr 0.000338 wd 0.0500 time 0.2250 (0.2313) data time 0.0008 (0.0019) model time 0.2243 (0.2294) loss 4.0653 (3.0582) grad_norm 3.2860 (3.0666) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][940/1251] eta 0:01:11 lr 0.000338 wd 0.0500 time 0.2284 (0.2313) data time 0.0009 (0.0019) model time 0.2275 (0.2294) loss 2.1324 (3.0581) grad_norm 2.9587 (3.0656) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][950/1251] eta 0:01:09 lr 0.000338 wd 0.0500 time 0.2274 (0.2313) data time 0.0009 (0.0019) model time 0.2265 (0.2294) loss 3.2960 (3.0580) grad_norm 3.6720 (3.0663) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][960/1251] eta 0:01:07 lr 0.000337 wd 0.0500 time 0.2293 (0.2313) data time 0.0009 (0.0019) model time 0.2284 (0.2294) loss 2.8707 (3.0578) grad_norm 2.5471 (3.0664) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][970/1251] eta 0:01:04 lr 0.000337 wd 0.0500 time 0.2273 (0.2312) data time 0.0008 (0.0019) model time 0.2265 (0.2294) loss 3.0011 (3.0575) grad_norm 3.0703 (3.0628) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][980/1251] eta 0:01:02 lr 0.000337 wd 0.0500 time 0.2266 (0.2312) data time 0.0008 (0.0019) model time 0.2258 (0.2294) loss 3.4043 (3.0585) grad_norm 3.7259 (3.0606) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][990/1251] eta 0:01:00 lr 0.000337 wd 0.0500 time 0.2308 (0.2312) data time 0.0007 (0.0018) model time 0.2301 (0.2294) loss 3.5526 (3.0606) grad_norm 
2.9204 (3.0612) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1000/1251] eta 0:00:58 lr 0.000337 wd 0.0500 time 0.2275 (0.2312) data time 0.0010 (0.0018) model time 0.2265 (0.2294) loss 2.4468 (3.0618) grad_norm 1.9814 (3.0655) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1010/1251] eta 0:00:55 lr 0.000337 wd 0.0500 time 0.2490 (0.2312) data time 0.0009 (0.0018) model time 0.2481 (0.2294) loss 3.2811 (3.0611) grad_norm 3.6363 (3.0731) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1020/1251] eta 0:00:53 lr 0.000337 wd 0.0500 time 0.2248 (0.2312) data time 0.0010 (0.0018) model time 0.2238 (0.2294) loss 2.4422 (3.0593) grad_norm 2.5788 (3.0706) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1030/1251] eta 0:00:51 lr 0.000337 wd 0.0500 time 0.2355 (0.2312) data time 0.0009 (0.0018) model time 0.2347 (0.2294) loss 3.4135 (3.0576) grad_norm 2.8302 (3.0698) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1040/1251] eta 0:00:48 lr 0.000337 wd 0.0500 time 0.2388 (0.2312) data time 0.0008 (0.0018) model time 0.2380 (0.2294) loss 3.0119 (3.0578) grad_norm 2.6451 (3.0721) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1050/1251] eta 0:00:46 lr 0.000337 wd 0.0500 time 0.2265 (0.2312) data time 0.0010 (0.0018) model time 0.2256 (0.2294) loss 3.4607 (3.0555) grad_norm 2.9860 (3.0670) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1060/1251] eta 0:00:44 lr 0.000337 wd 
0.0500 time 0.2264 (0.2312) data time 0.0006 (0.0018) model time 0.2257 (0.2294) loss 3.4192 (3.0568) grad_norm 4.4310 (3.0678) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1070/1251] eta 0:00:41 lr 0.000337 wd 0.0500 time 0.2258 (0.2312) data time 0.0010 (0.0018) model time 0.2248 (0.2294) loss 3.0495 (3.0575) grad_norm 2.9463 (3.0672) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1080/1251] eta 0:00:39 lr 0.000337 wd 0.0500 time 0.2292 (0.2312) data time 0.0009 (0.0018) model time 0.2283 (0.2294) loss 3.3251 (3.0564) grad_norm 3.5446 (3.0660) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1090/1251] eta 0:00:37 lr 0.000337 wd 0.0500 time 0.2301 (0.2312) data time 0.0007 (0.0018) model time 0.2294 (0.2294) loss 2.9226 (3.0553) grad_norm 2.8813 (3.0632) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1100/1251] eta 0:00:34 lr 0.000337 wd 0.0500 time 0.2274 (0.2312) data time 0.0008 (0.0018) model time 0.2266 (0.2294) loss 2.4772 (3.0569) grad_norm 2.6324 (3.0652) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1110/1251] eta 0:00:32 lr 0.000337 wd 0.0500 time 0.2205 (0.2312) data time 0.0008 (0.0018) model time 0.2197 (0.2294) loss 2.5272 (3.0570) grad_norm 1.8888 (3.0641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1120/1251] eta 0:00:30 lr 0.000337 wd 0.0500 time 0.2263 (0.2312) data time 0.0010 (0.0018) model time 0.2252 (0.2294) loss 3.1307 (3.0590) grad_norm 5.5161 (3.0706) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1130/1251] eta 0:00:27 lr 0.000337 wd 0.0500 time 0.2197 (0.2312) data time 0.0009 (0.0018) model time 0.2188 (0.2294) loss 2.6232 (3.0589) grad_norm 2.3647 (3.0723) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1140/1251] eta 0:00:25 lr 0.000337 wd 0.0500 time 0.2252 (0.2312) data time 0.0011 (0.0018) model time 0.2241 (0.2295) loss 1.9558 (3.0578) grad_norm 2.5913 (3.0716) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1150/1251] eta 0:00:23 lr 0.000337 wd 0.0500 time 0.2201 (0.2312) data time 0.0010 (0.0018) model time 0.2190 (0.2294) loss 2.4893 (3.0576) grad_norm 3.2639 (3.0694) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1160/1251] eta 0:00:21 lr 0.000337 wd 0.0500 time 0.2313 (0.2312) data time 0.0010 (0.0018) model time 0.2303 (0.2295) loss 2.6278 (3.0591) grad_norm 2.5660 (3.0637) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1170/1251] eta 0:00:18 lr 0.000337 wd 0.0500 time 0.2261 (0.2312) data time 0.0008 (0.0017) model time 0.2253 (0.2294) loss 1.9779 (3.0583) grad_norm 3.1871 (3.0641) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1180/1251] eta 0:00:16 lr 0.000337 wd 0.0500 time 0.2294 (0.2312) data time 0.0008 (0.0017) model time 0.2286 (0.2294) loss 2.7293 (3.0563) grad_norm 2.6046 (3.0663) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1190/1251] eta 0:00:14 lr 0.000337 wd 0.0500 time 0.2327 (0.2312) data time 0.0009 (0.0017) model time 0.2317 (0.2295) loss 
3.2354 (3.0567) grad_norm 3.4016 (3.0673) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1200/1251] eta 0:00:11 lr 0.000336 wd 0.0500 time 0.2341 (0.2312) data time 0.0009 (0.0017) model time 0.2332 (0.2295) loss 3.1736 (3.0579) grad_norm 2.1058 (3.0650) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1210/1251] eta 0:00:09 lr 0.000336 wd 0.0500 time 0.2311 (0.2312) data time 0.0011 (0.0017) model time 0.2300 (0.2295) loss 2.9535 (3.0563) grad_norm 2.7753 (3.0623) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1220/1251] eta 0:00:07 lr 0.000336 wd 0.0500 time 0.2215 (0.2312) data time 0.0007 (0.0017) model time 0.2207 (0.2294) loss 3.1251 (3.0553) grad_norm 2.5415 (3.0632) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1230/1251] eta 0:00:04 lr 0.000336 wd 0.0500 time 0.2258 (0.2312) data time 0.0010 (0.0017) model time 0.2248 (0.2295) loss 2.6309 (3.0557) grad_norm 2.2937 (3.0673) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1240/1251] eta 0:00:02 lr 0.000336 wd 0.0500 time 0.2232 (0.2311) data time 0.0006 (0.0017) model time 0.2226 (0.2294) loss 2.9325 (3.0558) grad_norm 3.0897 (3.0646) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [190/300][1250/1251] eta 0:00:00 lr 0.000336 wd 0.0500 time 0.2215 (0.2310) data time 0.0006 (0.0017) model time 0.2208 (0.2293) loss 3.2429 (3.0533) grad_norm 4.8140 (3.0744) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 190 training takes 0:04:48 
[2024-08-27 21:22:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-27 21:22:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-27 21:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.589 (0.589) Loss 0.4399 (0.4399) Acc@1 92.676 (92.676) Acc@5 98.242 (98.242) Mem 7381MB
[2024-08-27 21:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.129) Loss 0.6914 (0.6775) Acc@1 86.230 (85.804) Acc@5 97.168 (97.195) Mem 7381MB
[2024-08-27 21:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.106) Loss 1.0000 (0.7063) Acc@1 75.098 (84.561) Acc@5 94.629 (97.149) Mem 7381MB
[2024-08-27 21:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.093 (0.098) Loss 1.1660 (0.8031) Acc@1 71.484 (82.227) Acc@5 92.383 (96.012) Mem 7381MB
[2024-08-27 21:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.090) Loss 1.0205 (0.8539) Acc@1 76.270 (81.017) Acc@5 93.945 (95.470) Mem 7381MB
[2024-08-27 21:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.638 Acc@5 95.432
[2024-08-27 21:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.6%
[2024-08-27 21:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.64%
[2024-08-27 21:22:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-27 21:22:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-27 21:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.544 (0.544) Loss 0.3882 (0.3882) Acc@1 93.164 (93.164) Acc@5 98.535 (98.535) Mem 7381MB
[2024-08-27 21:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.120) Loss 0.6079 (0.6163) Acc@1 88.477 (87.003) Acc@5 97.168 (97.443) Mem 7381MB
[2024-08-27 21:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.101) Loss 0.8765 (0.6411) Acc@1 78.418 (85.989) Acc@5 95.898 (97.456) Mem 7381MB
[2024-08-27 21:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.095) Loss 1.1113 (0.7260) Acc@1 73.242 (83.943) Acc@5 92.773 (96.588) Mem 7381MB
[2024-08-27 21:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 0.9814 (0.7693) Acc@1 76.270 (82.665) Acc@5 94.141 (96.129) Mem 7381MB
[2024-08-27 21:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.236 Acc@5 96.110
[2024-08-27 21:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.2%
[2024-08-27 21:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.24%
[2024-08-27 21:22:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 21:22:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 21:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][0/1251] eta 0:15:21 lr 0.000336 wd 0.0500 time 0.7369 (0.7369) data time 0.4910 (0.4910) model time 0.0000 (0.0000) loss 3.3338 (3.3338) grad_norm 2.1350 (2.1350) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][10/1251] eta 0:05:39 lr 0.000336 wd 0.0500 time 0.2323 (0.2738) data time 0.0009 (0.0456) model time 0.0000 (0.0000) loss 3.1024 (3.2152) grad_norm 3.9549 (3.2536) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][20/1251] eta 0:05:12 lr 0.000336 wd 0.0500 time 0.2644 (0.2538) data time 0.0012 (0.0244) model time 0.0000 (0.0000) loss 2.5607 (3.0724) grad_norm 2.3958 (3.1678) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][30/1251] eta 0:05:06 lr 0.000336 wd 0.0500 time 0.2285 (0.2511) data time 0.0008 (0.0169) model time 0.0000 (0.0000) loss 2.7575 (3.0214) grad_norm 2.5537 (3.0204) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][40/1251] eta 0:04:56 lr 0.000336 wd 0.0500 time 0.2270 (0.2452) data time 0.0008 (0.0130) model time 0.0000 (0.0000) loss 3.1762 (2.9852) grad_norm 2.9956 (3.0040) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][50/1251] eta 0:04:51 lr 0.000336 wd 0.0500 time 0.2269 (0.2427) data time 0.0009 (0.0107) model time 0.0000 (0.0000) loss 2.8870 (3.0230) grad_norm 4.4317 (3.0755) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][60/1251] eta 0:04:46 lr 0.000336 wd 0.0500 time 0.2244 (0.2403) data time 0.0009 (0.0091) model time 0.2235 
(0.2267) loss 3.2983 (3.0109) grad_norm 3.2058 (3.1456) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][70/1251] eta 0:04:42 lr 0.000336 wd 0.0500 time 0.2266 (0.2390) data time 0.0010 (0.0080) model time 0.2257 (0.2285) loss 2.1253 (3.0095) grad_norm 3.4510 (3.1832) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][80/1251] eta 0:04:39 lr 0.000336 wd 0.0500 time 0.2255 (0.2384) data time 0.0011 (0.0073) model time 0.2244 (0.2294) loss 3.0741 (3.0117) grad_norm 2.9786 (3.1521) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][90/1251] eta 0:04:35 lr 0.000336 wd 0.0500 time 0.2329 (0.2371) data time 0.0009 (0.0066) model time 0.2320 (0.2285) loss 3.3531 (3.0341) grad_norm 3.0540 (3.1457) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][100/1251] eta 0:04:31 lr 0.000336 wd 0.0500 time 0.2237 (0.2361) data time 0.0010 (0.0061) model time 0.2227 (0.2279) loss 3.3218 (3.0552) grad_norm 3.2661 (3.1218) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][110/1251] eta 0:04:28 lr 0.000336 wd 0.0500 time 0.2432 (0.2356) data time 0.0015 (0.0056) model time 0.2417 (0.2283) loss 2.8621 (3.0422) grad_norm 2.1616 (3.0970) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][120/1251] eta 0:04:26 lr 0.000336 wd 0.0500 time 0.2273 (0.2352) data time 0.0007 (0.0053) model time 0.2266 (0.2284) loss 3.3784 (3.0363) grad_norm 2.5176 (3.0740) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][130/1251] 
eta 0:04:23 lr 0.000336 wd 0.0500 time 0.2273 (0.2347) data time 0.0011 (0.0049) model time 0.2263 (0.2283) loss 3.2279 (3.0421) grad_norm 2.8825 (3.0668) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][140/1251] eta 0:04:20 lr 0.000336 wd 0.0500 time 0.2252 (0.2342) data time 0.0009 (0.0047) model time 0.2242 (0.2281) loss 3.6677 (3.0481) grad_norm 2.7194 (3.0774) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][150/1251] eta 0:04:17 lr 0.000336 wd 0.0500 time 0.2226 (0.2336) data time 0.0010 (0.0044) model time 0.2216 (0.2278) loss 3.0147 (3.0304) grad_norm 2.5599 (3.0741) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][160/1251] eta 0:04:14 lr 0.000336 wd 0.0500 time 0.2242 (0.2332) data time 0.0011 (0.0043) model time 0.2231 (0.2275) loss 2.9388 (3.0397) grad_norm 2.4322 (3.0849) loss_scale 2048.0000 (1062.1615) mem 7381MB [2024-08-27 21:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][170/1251] eta 0:04:11 lr 0.000336 wd 0.0500 time 0.2238 (0.2330) data time 0.0007 (0.0041) model time 0.2231 (0.2275) loss 3.0309 (3.0450) grad_norm 2.4074 (3.0619) loss_scale 2048.0000 (1119.8129) mem 7381MB [2024-08-27 21:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][180/1251] eta 0:04:09 lr 0.000336 wd 0.0500 time 0.2272 (0.2327) data time 0.0012 (0.0039) model time 0.2259 (0.2275) loss 2.3682 (3.0346) grad_norm 2.8527 (3.0347) loss_scale 2048.0000 (1171.0939) mem 7381MB [2024-08-27 21:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][190/1251] eta 0:04:07 lr 0.000335 wd 0.0500 time 0.2335 (0.2337) data time 0.0008 (0.0038) model time 0.2328 (0.2292) loss 2.5796 (3.0333) grad_norm 4.2195 (3.0316) loss_scale 2048.0000 (1217.0052) mem 7381MB 
[2024-08-27 21:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][200/1251] eta 0:04:06 lr 0.000335 wd 0.0500 time 0.2317 (0.2345) data time 0.0011 (0.0037) model time 0.2306 (0.2305) loss 3.5785 (3.0403) grad_norm 3.4744 (3.0362) loss_scale 2048.0000 (1258.3483) mem 7381MB [2024-08-27 21:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][210/1251] eta 0:04:03 lr 0.000335 wd 0.0500 time 0.2387 (0.2343) data time 0.0007 (0.0035) model time 0.2380 (0.2303) loss 3.5982 (3.0438) grad_norm 2.9204 (3.0473) loss_scale 2048.0000 (1295.7725) mem 7381MB [2024-08-27 21:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][220/1251] eta 0:04:01 lr 0.000335 wd 0.0500 time 0.2206 (0.2339) data time 0.0010 (0.0034) model time 0.2195 (0.2300) loss 3.3768 (3.0491) grad_norm 2.3759 (3.0424) loss_scale 2048.0000 (1329.8100) mem 7381MB [2024-08-27 21:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][230/1251] eta 0:03:58 lr 0.000335 wd 0.0500 time 0.2256 (0.2338) data time 0.0008 (0.0033) model time 0.2248 (0.2300) loss 2.1827 (3.0491) grad_norm 2.6882 (3.0268) loss_scale 2048.0000 (1360.9004) mem 7381MB [2024-08-27 21:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][240/1251] eta 0:03:56 lr 0.000335 wd 0.0500 time 0.2359 (0.2336) data time 0.0009 (0.0032) model time 0.2351 (0.2299) loss 3.5174 (3.0552) grad_norm 2.4414 (3.0090) loss_scale 2048.0000 (1389.4108) mem 7381MB [2024-08-27 21:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][250/1251] eta 0:03:53 lr 0.000335 wd 0.0500 time 0.2306 (0.2334) data time 0.0009 (0.0032) model time 0.2297 (0.2298) loss 2.6704 (3.0608) grad_norm 2.7616 (3.0001) loss_scale 2048.0000 (1415.6494) mem 7381MB [2024-08-27 21:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][260/1251] eta 0:03:51 lr 0.000335 wd 0.0500 time 0.2332 (0.2334) data time 0.0010 (0.0031) model time 0.2322 
(0.2298) loss 3.2129 (3.0602) grad_norm 2.7980 (2.9997) loss_scale 2048.0000 (1439.8774) mem 7381MB [2024-08-27 21:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][270/1251] eta 0:03:48 lr 0.000335 wd 0.0500 time 0.2282 (0.2332) data time 0.0007 (0.0030) model time 0.2275 (0.2298) loss 3.0649 (3.0606) grad_norm 2.5199 (2.9998) loss_scale 2048.0000 (1462.3173) mem 7381MB [2024-08-27 21:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][280/1251] eta 0:03:46 lr 0.000335 wd 0.0500 time 0.2242 (0.2330) data time 0.0008 (0.0030) model time 0.2234 (0.2297) loss 2.9064 (3.0638) grad_norm 3.6247 (3.0099) loss_scale 2048.0000 (1483.1601) mem 7381MB [2024-08-27 21:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][290/1251] eta 0:03:43 lr 0.000335 wd 0.0500 time 0.2325 (0.2329) data time 0.0010 (0.0029) model time 0.2316 (0.2296) loss 3.0993 (3.0656) grad_norm 2.5780 (3.0058) loss_scale 2048.0000 (1502.5704) mem 7381MB [2024-08-27 21:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][300/1251] eta 0:03:41 lr 0.000335 wd 0.0500 time 0.2237 (0.2328) data time 0.0009 (0.0029) model time 0.2228 (0.2295) loss 2.7528 (3.0595) grad_norm 2.2019 (2.9932) loss_scale 2048.0000 (1520.6910) mem 7381MB [2024-08-27 21:24:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][310/1251] eta 0:03:38 lr 0.000335 wd 0.0500 time 0.2183 (0.2327) data time 0.0012 (0.0028) model time 0.2171 (0.2295) loss 2.9916 (3.0627) grad_norm 4.8262 (2.9892) loss_scale 2048.0000 (1537.6463) mem 7381MB [2024-08-27 21:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][320/1251] eta 0:03:36 lr 0.000335 wd 0.0500 time 0.2317 (0.2326) data time 0.0008 (0.0027) model time 0.2309 (0.2294) loss 3.1154 (3.0633) grad_norm 2.3672 (2.9904) loss_scale 2048.0000 (1553.5452) mem 7381MB [2024-08-27 21:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[191/300][330/1251] eta 0:03:34 lr 0.000335 wd 0.0500 time 0.2214 (0.2325) data time 0.0009 (0.0027) model time 0.2205 (0.2294) loss 2.7551 (3.0697) grad_norm 2.9971 (2.9922) loss_scale 2048.0000 (1568.4834) mem 7381MB [2024-08-27 21:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][340/1251] eta 0:03:31 lr 0.000335 wd 0.0500 time 0.2244 (0.2323) data time 0.0007 (0.0027) model time 0.2236 (0.2293) loss 2.3297 (3.0703) grad_norm 2.2841 (2.9991) loss_scale 2048.0000 (1582.5455) mem 7381MB [2024-08-27 21:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][350/1251] eta 0:03:29 lr 0.000335 wd 0.0500 time 0.2354 (0.2322) data time 0.0010 (0.0026) model time 0.2344 (0.2291) loss 3.1281 (3.0769) grad_norm 3.1805 (3.0345) loss_scale 2048.0000 (1595.8063) mem 7381MB [2024-08-27 21:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][360/1251] eta 0:03:26 lr 0.000335 wd 0.0500 time 0.2352 (0.2321) data time 0.0009 (0.0026) model time 0.2343 (0.2291) loss 3.0307 (3.0804) grad_norm 2.8072 (3.0331) loss_scale 2048.0000 (1608.3324) mem 7381MB [2024-08-27 21:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][370/1251] eta 0:03:24 lr 0.000335 wd 0.0500 time 0.2321 (0.2320) data time 0.0007 (0.0025) model time 0.2314 (0.2291) loss 3.8133 (3.0806) grad_norm 9.0518 (3.0467) loss_scale 2048.0000 (1620.1833) mem 7381MB [2024-08-27 21:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][380/1251] eta 0:03:22 lr 0.000335 wd 0.0500 time 0.2417 (0.2320) data time 0.0008 (0.0025) model time 0.2410 (0.2291) loss 2.7456 (3.0840) grad_norm 3.0895 (3.0485) loss_scale 2048.0000 (1631.4121) mem 7381MB [2024-08-27 21:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][390/1251] eta 0:03:19 lr 0.000335 wd 0.0500 time 0.2310 (0.2319) data time 0.0011 (0.0025) model time 0.2299 (0.2290) loss 3.6152 (3.0902) grad_norm 2.3226 (3.0427) loss_scale 2048.0000 
(1642.0665) mem 7381MB [2024-08-27 21:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][400/1251] eta 0:03:17 lr 0.000335 wd 0.0500 time 0.2311 (0.2318) data time 0.0009 (0.0025) model time 0.2302 (0.2290) loss 2.5411 (3.0905) grad_norm 2.2533 (3.0420) loss_scale 2048.0000 (1652.1895) mem 7381MB [2024-08-27 21:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][410/1251] eta 0:03:14 lr 0.000335 wd 0.0500 time 0.2251 (0.2318) data time 0.0012 (0.0024) model time 0.2239 (0.2290) loss 2.2607 (3.0894) grad_norm 2.3417 (3.0520) loss_scale 2048.0000 (1661.8200) mem 7381MB [2024-08-27 21:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][420/1251] eta 0:03:12 lr 0.000335 wd 0.0500 time 0.2319 (0.2317) data time 0.0006 (0.0024) model time 0.2313 (0.2289) loss 1.9396 (3.0896) grad_norm 2.1616 (3.0732) loss_scale 2048.0000 (1670.9929) mem 7381MB [2024-08-27 21:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][430/1251] eta 0:03:10 lr 0.000334 wd 0.0500 time 0.2338 (0.2317) data time 0.0008 (0.0024) model time 0.2330 (0.2289) loss 2.5124 (3.0915) grad_norm 2.6742 (3.0641) loss_scale 2048.0000 (1679.7401) mem 7381MB [2024-08-27 21:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][440/1251] eta 0:03:07 lr 0.000334 wd 0.0500 time 0.2277 (0.2316) data time 0.0009 (0.0023) model time 0.2269 (0.2289) loss 3.4319 (3.0923) grad_norm 2.3570 (3.0534) loss_scale 2048.0000 (1688.0907) mem 7381MB [2024-08-27 21:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][450/1251] eta 0:03:05 lr 0.000334 wd 0.0500 time 0.2236 (0.2316) data time 0.0009 (0.0023) model time 0.2227 (0.2289) loss 2.6964 (3.0906) grad_norm 2.8527 (3.0510) loss_scale 2048.0000 (1696.0710) mem 7381MB [2024-08-27 21:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][460/1251] eta 0:03:03 lr 0.000334 wd 0.0500 time 0.2337 (0.2315) data time 0.0009 
(0.0023) model time 0.2328 (0.2289) loss 3.3190 (3.0879) grad_norm 2.3423 (3.0673) loss_scale 2048.0000 (1703.7050) mem 7381MB [2024-08-27 21:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][470/1251] eta 0:03:00 lr 0.000334 wd 0.0500 time 0.2325 (0.2315) data time 0.0007 (0.0023) model time 0.2318 (0.2289) loss 3.1699 (3.0881) grad_norm 2.7669 (3.0626) loss_scale 2048.0000 (1711.0149) mem 7381MB [2024-08-27 21:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][480/1251] eta 0:02:58 lr 0.000334 wd 0.0500 time 0.2226 (0.2315) data time 0.0011 (0.0022) model time 0.2215 (0.2289) loss 2.8996 (3.0869) grad_norm 2.5667 (3.0574) loss_scale 2048.0000 (1718.0208) mem 7381MB [2024-08-27 21:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][490/1251] eta 0:02:56 lr 0.000334 wd 0.0500 time 0.2259 (0.2314) data time 0.0007 (0.0022) model time 0.2252 (0.2288) loss 3.1320 (3.0873) grad_norm 2.9392 (3.0578) loss_scale 2048.0000 (1724.7413) mem 7381MB [2024-08-27 21:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][500/1251] eta 0:02:53 lr 0.000334 wd 0.0500 time 0.2267 (0.2314) data time 0.0007 (0.0022) model time 0.2260 (0.2288) loss 3.1525 (3.0903) grad_norm 4.2777 (3.0593) loss_scale 2048.0000 (1731.1936) mem 7381MB [2024-08-27 21:24:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][510/1251] eta 0:02:51 lr 0.000334 wd 0.0500 time 0.2212 (0.2313) data time 0.0010 (0.0022) model time 0.2201 (0.2288) loss 2.3034 (3.0903) grad_norm 3.4029 (3.0650) loss_scale 2048.0000 (1737.3933) mem 7381MB [2024-08-27 21:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][520/1251] eta 0:02:49 lr 0.000334 wd 0.0500 time 0.2372 (0.2313) data time 0.0007 (0.0022) model time 0.2365 (0.2288) loss 3.3124 (3.0855) grad_norm 3.7523 (3.0649) loss_scale 2048.0000 (1743.3551) mem 7381MB [2024-08-27 21:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [191/300][530/1251] eta 0:02:46 lr 0.000334 wd 0.0500 time 0.2243 (0.2313) data time 0.0006 (0.0022) model time 0.2237 (0.2288) loss 2.7278 (3.0891) grad_norm 3.9851 (3.0638) loss_scale 2048.0000 (1749.0923) mem 7381MB [2024-08-27 21:25:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][540/1251] eta 0:02:44 lr 0.000334 wd 0.0500 time 0.2285 (0.2313) data time 0.0012 (0.0021) model time 0.2273 (0.2289) loss 2.2461 (3.0853) grad_norm 3.0101 (3.0607) loss_scale 2048.0000 (1754.6174) mem 7381MB [2024-08-27 21:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][550/1251] eta 0:02:42 lr 0.000334 wd 0.0500 time 0.2255 (0.2317) data time 0.0013 (0.0021) model time 0.2242 (0.2293) loss 3.1301 (3.0843) grad_norm 12.8872 (3.0761) loss_scale 2048.0000 (1759.9419) mem 7381MB [2024-08-27 21:25:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][560/1251] eta 0:02:40 lr 0.000334 wd 0.0500 time 0.2266 (0.2316) data time 0.0007 (0.0021) model time 0.2259 (0.2293) loss 3.7281 (3.0784) grad_norm 3.2268 (3.0741) loss_scale 2048.0000 (1765.0766) mem 7381MB [2024-08-27 21:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][570/1251] eta 0:02:37 lr 0.000334 wd 0.0500 time 0.2327 (0.2316) data time 0.0010 (0.0021) model time 0.2316 (0.2293) loss 3.3935 (3.0761) grad_norm 3.2045 (3.0727) loss_scale 2048.0000 (1770.0315) mem 7381MB [2024-08-27 21:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][580/1251] eta 0:02:35 lr 0.000334 wd 0.0500 time 0.2276 (0.2316) data time 0.0010 (0.0021) model time 0.2266 (0.2292) loss 3.1840 (3.0760) grad_norm 2.5539 (3.0712) loss_scale 2048.0000 (1774.8158) mem 7381MB [2024-08-27 21:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][590/1251] eta 0:02:33 lr 0.000334 wd 0.0500 time 0.2229 (0.2315) data time 0.0016 (0.0021) model time 0.2213 (0.2292) loss 3.1952 (3.0733) grad_norm 2.4882 (3.0681) loss_scale 
2048.0000 (1779.4382) mem 7381MB [2024-08-27 21:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][600/1251] eta 0:02:30 lr 0.000334 wd 0.0500 time 0.2366 (0.2315) data time 0.0008 (0.0020) model time 0.2358 (0.2292) loss 1.9656 (3.0731) grad_norm 2.0556 (3.0830) loss_scale 2048.0000 (1783.9068) mem 7381MB [2024-08-27 21:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][610/1251] eta 0:02:28 lr 0.000334 wd 0.0500 time 0.2341 (0.2315) data time 0.0008 (0.0021) model time 0.2333 (0.2292) loss 3.0756 (3.0751) grad_norm 3.0205 (3.0782) loss_scale 2048.0000 (1788.2291) mem 7381MB [2024-08-27 21:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][620/1251] eta 0:02:26 lr 0.000334 wd 0.0500 time 0.2246 (0.2315) data time 0.0009 (0.0020) model time 0.2237 (0.2292) loss 3.4509 (3.0731) grad_norm 2.4588 (3.0750) loss_scale 2048.0000 (1792.4122) mem 7381MB [2024-08-27 21:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][630/1251] eta 0:02:23 lr 0.000334 wd 0.0500 time 0.2305 (0.2315) data time 0.0007 (0.0020) model time 0.2298 (0.2292) loss 3.6882 (3.0807) grad_norm 2.6596 (3.0798) loss_scale 2048.0000 (1796.4628) mem 7381MB [2024-08-27 21:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][640/1251] eta 0:02:21 lr 0.000334 wd 0.0500 time 0.2273 (0.2315) data time 0.0010 (0.0020) model time 0.2263 (0.2292) loss 2.0775 (3.0791) grad_norm 3.5920 (3.0928) loss_scale 2048.0000 (1800.3869) mem 7381MB [2024-08-27 21:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][650/1251] eta 0:02:19 lr 0.000334 wd 0.0500 time 0.2381 (0.2314) data time 0.0006 (0.0020) model time 0.2375 (0.2292) loss 3.7792 (3.0825) grad_norm 3.1093 (inf) loss_scale 1024.0000 (1790.0338) mem 7381MB [2024-08-27 21:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][660/1251] eta 0:02:16 lr 0.000334 wd 0.0500 time 0.2205 (0.2314) data time 
0.0014 (0.0020) model time 0.2190 (0.2292) loss 3.2922 (3.0779) grad_norm 3.0354 (inf) loss_scale 1024.0000 (1778.4448) mem 7381MB [2024-08-27 21:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][670/1251] eta 0:02:14 lr 0.000333 wd 0.0500 time 0.2202 (0.2314) data time 0.0010 (0.0020) model time 0.2192 (0.2292) loss 3.3222 (3.0813) grad_norm 2.5417 (inf) loss_scale 1024.0000 (1767.2012) mem 7381MB [2024-08-27 21:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][680/1251] eta 0:02:12 lr 0.000333 wd 0.0500 time 0.2256 (0.2314) data time 0.0011 (0.0020) model time 0.2245 (0.2292) loss 2.1926 (3.0832) grad_norm 2.7072 (inf) loss_scale 1024.0000 (1756.2878) mem 7381MB [2024-08-27 21:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][690/1251] eta 0:02:09 lr 0.000333 wd 0.0500 time 0.2262 (0.2313) data time 0.0012 (0.0020) model time 0.2250 (0.2292) loss 3.3526 (3.0819) grad_norm 3.4686 (inf) loss_scale 1024.0000 (1745.6903) mem 7381MB [2024-08-27 21:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][700/1251] eta 0:02:07 lr 0.000333 wd 0.0500 time 0.2228 (0.2313) data time 0.0011 (0.0020) model time 0.2217 (0.2291) loss 3.2749 (3.0809) grad_norm 2.1955 (inf) loss_scale 1024.0000 (1735.3951) mem 7381MB [2024-08-27 21:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][710/1251] eta 0:02:05 lr 0.000333 wd 0.0500 time 0.2239 (0.2313) data time 0.0007 (0.0020) model time 0.2232 (0.2291) loss 3.5500 (3.0848) grad_norm 2.1703 (inf) loss_scale 1024.0000 (1725.3896) mem 7381MB [2024-08-27 21:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][720/1251] eta 0:02:02 lr 0.000333 wd 0.0500 time 0.2231 (0.2313) data time 0.0009 (0.0019) model time 0.2222 (0.2292) loss 2.7248 (3.0799) grad_norm 3.5042 (inf) loss_scale 1024.0000 (1715.6616) mem 7381MB [2024-08-27 21:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[191/300][730/1251] eta 0:02:00 lr 0.000333 wd 0.0500 time 0.2323 (0.2316) data time 0.0009 (0.0019) model time 0.2314 (0.2295) loss 3.2011 (3.0788) grad_norm 2.2720 (inf) loss_scale 1024.0000 (1706.1997) mem 7381MB [2024-08-27 21:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][740/1251] eta 0:01:58 lr 0.000333 wd 0.0500 time 0.2252 (0.2319) data time 0.0012 (0.0019) model time 0.2240 (0.2298) loss 3.4353 (3.0796) grad_norm 2.3240 (inf) loss_scale 1024.0000 (1696.9933) mem 7381MB [2024-08-27 21:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][750/1251] eta 0:01:56 lr 0.000333 wd 0.0500 time 0.2260 (0.2318) data time 0.0009 (0.0019) model time 0.2251 (0.2297) loss 3.4547 (3.0801) grad_norm 2.1317 (inf) loss_scale 1024.0000 (1688.0320) mem 7381MB [2024-08-27 21:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][760/1251] eta 0:01:53 lr 0.000333 wd 0.0500 time 0.2263 (0.2318) data time 0.0010 (0.0019) model time 0.2252 (0.2297) loss 3.3504 (3.0803) grad_norm 2.9166 (inf) loss_scale 1024.0000 (1679.3062) mem 7381MB [2024-08-27 21:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][770/1251] eta 0:01:51 lr 0.000333 wd 0.0500 time 0.2216 (0.2318) data time 0.0009 (0.0019) model time 0.2207 (0.2297) loss 2.2402 (3.0777) grad_norm 3.4637 (inf) loss_scale 1024.0000 (1670.8067) mem 7381MB [2024-08-27 21:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][780/1251] eta 0:01:49 lr 0.000333 wd 0.0500 time 0.2372 (0.2318) data time 0.0012 (0.0019) model time 0.2360 (0.2297) loss 3.1724 (3.0780) grad_norm 3.0845 (inf) loss_scale 1024.0000 (1662.5250) mem 7381MB [2024-08-27 21:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][790/1251] eta 0:01:46 lr 0.000333 wd 0.0500 time 0.2290 (0.2318) data time 0.0010 (0.0019) model time 0.2279 (0.2297) loss 2.0687 (3.0738) grad_norm 2.4340 (inf) loss_scale 1024.0000 (1654.4526) mem 7381MB 
[2024-08-27 21:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][800/1251] eta 0:01:44 lr 0.000333 wd 0.0500 time 0.2298 (0.2318) data time 0.0009 (0.0019) model time 0.2289 (0.2298) loss 2.6528 (3.0756) grad_norm 2.0807 (inf) loss_scale 1024.0000 (1646.5818) mem 7381MB [2024-08-27 21:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][810/1251] eta 0:01:42 lr 0.000333 wd 0.0500 time 0.2302 (0.2318) data time 0.0016 (0.0019) model time 0.2287 (0.2297) loss 2.8953 (3.0741) grad_norm 1.9030 (inf) loss_scale 1024.0000 (1638.9051) mem 7381MB [2024-08-27 21:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][820/1251] eta 0:01:39 lr 0.000333 wd 0.0500 time 0.2243 (0.2317) data time 0.0008 (0.0019) model time 0.2234 (0.2297) loss 3.4416 (3.0751) grad_norm 2.7143 (inf) loss_scale 1024.0000 (1631.4153) mem 7381MB [2024-08-27 21:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][830/1251] eta 0:01:37 lr 0.000333 wd 0.0500 time 0.2263 (0.2317) data time 0.0008 (0.0019) model time 0.2255 (0.2297) loss 3.3803 (3.0757) grad_norm 3.1851 (inf) loss_scale 1024.0000 (1624.1059) mem 7381MB [2024-08-27 21:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][840/1251] eta 0:01:35 lr 0.000333 wd 0.0500 time 0.2321 (0.2317) data time 0.0007 (0.0019) model time 0.2314 (0.2297) loss 3.8754 (3.0760) grad_norm 2.3084 (inf) loss_scale 1024.0000 (1616.9703) mem 7381MB [2024-08-27 21:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][850/1251] eta 0:01:32 lr 0.000333 wd 0.0500 time 0.2319 (0.2317) data time 0.0007 (0.0018) model time 0.2311 (0.2297) loss 2.9624 (3.0777) grad_norm 6.4741 (inf) loss_scale 1024.0000 (1610.0024) mem 7381MB [2024-08-27 21:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][860/1251] eta 0:01:30 lr 0.000333 wd 0.0500 time 0.2424 (0.2317) data time 0.0009 (0.0018) model time 0.2415 (0.2297) loss 
3.1503 (3.0799) grad_norm 3.1603 (inf) loss_scale 1024.0000 (1603.1963) mem 7381MB [2024-08-27 21:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][870/1251] eta 0:01:28 lr 0.000333 wd 0.0500 time 0.2341 (0.2317) data time 0.0011 (0.0018) model time 0.2330 (0.2297) loss 3.3415 (3.0787) grad_norm 2.2463 (inf) loss_scale 1024.0000 (1596.5465) mem 7381MB [2024-08-27 21:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][880/1251] eta 0:01:25 lr 0.000333 wd 0.0500 time 0.2325 (0.2317) data time 0.0008 (0.0018) model time 0.2317 (0.2297) loss 3.0549 (3.0794) grad_norm 2.4666 (inf) loss_scale 1024.0000 (1590.0477) mem 7381MB [2024-08-27 21:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][890/1251] eta 0:01:23 lr 0.000333 wd 0.0500 time 0.2242 (0.2317) data time 0.0008 (0.0018) model time 0.2235 (0.2297) loss 2.2081 (3.0786) grad_norm 2.4332 (inf) loss_scale 1024.0000 (1583.6947) mem 7381MB [2024-08-27 21:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][900/1251] eta 0:01:21 lr 0.000333 wd 0.0500 time 0.2458 (0.2317) data time 0.0009 (0.0018) model time 0.2449 (0.2298) loss 3.1648 (3.0801) grad_norm 4.7027 (inf) loss_scale 1024.0000 (1577.4828) mem 7381MB [2024-08-27 21:26:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][910/1251] eta 0:01:19 lr 0.000332 wd 0.0500 time 0.2265 (0.2317) data time 0.0007 (0.0018) model time 0.2258 (0.2297) loss 3.0469 (3.0815) grad_norm 2.6337 (inf) loss_scale 1024.0000 (1571.4072) mem 7381MB [2024-08-27 21:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][920/1251] eta 0:01:16 lr 0.000332 wd 0.0500 time 0.2336 (0.2317) data time 0.0013 (0.0018) model time 0.2323 (0.2297) loss 2.6189 (3.0805) grad_norm 5.9753 (inf) loss_scale 1024.0000 (1565.4636) mem 7381MB [2024-08-27 21:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][930/1251] eta 0:01:14 lr 0.000332 wd 0.0500 
time 0.2306 (0.2316) data time 0.0007 (0.0018) model time 0.2299 (0.2297) loss 3.8542 (3.0836) grad_norm 3.5714 (inf) loss_scale 1024.0000 (1559.6477) mem 7381MB [2024-08-27 21:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][940/1251] eta 0:01:12 lr 0.000332 wd 0.0500 time 0.2252 (0.2316) data time 0.0011 (0.0018) model time 0.2241 (0.2297) loss 2.2016 (3.0838) grad_norm 2.7760 (inf) loss_scale 1024.0000 (1553.9554) mem 7381MB [2024-08-27 21:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][950/1251] eta 0:01:09 lr 0.000332 wd 0.0500 time 0.2244 (0.2316) data time 0.0014 (0.0018) model time 0.2231 (0.2297) loss 2.9362 (3.0840) grad_norm 2.0707 (inf) loss_scale 1024.0000 (1548.3828) mem 7381MB [2024-08-27 21:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][960/1251] eta 0:01:07 lr 0.000332 wd 0.0500 time 0.2247 (0.2316) data time 0.0007 (0.0018) model time 0.2240 (0.2296) loss 2.9029 (3.0838) grad_norm 2.4471 (inf) loss_scale 1024.0000 (1542.9261) mem 7381MB [2024-08-27 21:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][970/1251] eta 0:01:05 lr 0.000332 wd 0.0500 time 0.2276 (0.2316) data time 0.0009 (0.0018) model time 0.2266 (0.2296) loss 3.2811 (3.0845) grad_norm 2.9204 (inf) loss_scale 1024.0000 (1537.5819) mem 7381MB [2024-08-27 21:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][980/1251] eta 0:01:02 lr 0.000332 wd 0.0500 time 0.2205 (0.2315) data time 0.0008 (0.0018) model time 0.2196 (0.2296) loss 2.4827 (3.0857) grad_norm 2.2077 (inf) loss_scale 1024.0000 (1532.3466) mem 7381MB [2024-08-27 21:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][990/1251] eta 0:01:00 lr 0.000332 wd 0.0500 time 0.2356 (0.2315) data time 0.0009 (0.0018) model time 0.2347 (0.2296) loss 2.8990 (3.0866) grad_norm 2.9957 (inf) loss_scale 1024.0000 (1527.2170) mem 7381MB [2024-08-27 21:26:46 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [191/300][1000/1251] eta 0:00:58 lr 0.000332 wd 0.0500 time 0.2227 (0.2315) data time 0.0009 (0.0018) model time 0.2218 (0.2296) loss 2.9165 (3.0872) grad_norm 2.9593 (inf) loss_scale 1024.0000 (1522.1898) mem 7381MB [2024-08-27 21:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1010/1251] eta 0:00:55 lr 0.000332 wd 0.0500 time 0.2251 (0.2315) data time 0.0012 (0.0018) model time 0.2240 (0.2296) loss 3.0443 (3.0849) grad_norm 4.0199 (inf) loss_scale 1024.0000 (1517.2621) mem 7381MB [2024-08-27 21:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1020/1251] eta 0:00:53 lr 0.000332 wd 0.0500 time 0.2251 (0.2315) data time 0.0007 (0.0018) model time 0.2245 (0.2296) loss 2.1576 (3.0836) grad_norm 2.3952 (inf) loss_scale 1024.0000 (1512.4310) mem 7381MB [2024-08-27 21:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1030/1251] eta 0:00:51 lr 0.000332 wd 0.0500 time 0.2271 (0.2314) data time 0.0011 (0.0018) model time 0.2260 (0.2296) loss 3.2514 (3.0827) grad_norm 2.9062 (inf) loss_scale 1024.0000 (1507.6935) mem 7381MB [2024-08-27 21:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1040/1251] eta 0:00:48 lr 0.000332 wd 0.0500 time 0.2265 (0.2315) data time 0.0008 (0.0018) model time 0.2258 (0.2296) loss 3.6318 (3.0852) grad_norm 3.3031 (inf) loss_scale 1024.0000 (1503.0471) mem 7381MB [2024-08-27 21:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1050/1251] eta 0:00:46 lr 0.000332 wd 0.0500 time 0.2237 (0.2315) data time 0.0007 (0.0018) model time 0.2230 (0.2296) loss 3.0421 (3.0829) grad_norm 4.4617 (inf) loss_scale 1024.0000 (1498.4891) mem 7381MB [2024-08-27 21:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1060/1251] eta 0:00:44 lr 0.000332 wd 0.0500 time 0.2301 (0.2314) data time 0.0009 (0.0018) model time 0.2292 (0.2296) loss 3.0692 (3.0820) grad_norm 2.3956 (inf) 
loss_scale 1024.0000 (1494.0170) mem 7381MB [2024-08-27 21:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1070/1251] eta 0:00:41 lr 0.000332 wd 0.0500 time 0.2250 (0.2314) data time 0.0010 (0.0018) model time 0.2241 (0.2296) loss 3.3112 (3.0829) grad_norm 2.2049 (inf) loss_scale 1024.0000 (1489.6284) mem 7381MB [2024-08-27 21:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1080/1251] eta 0:00:39 lr 0.000332 wd 0.0500 time 0.2231 (0.2314) data time 0.0008 (0.0017) model time 0.2223 (0.2296) loss 3.8757 (3.0813) grad_norm 3.1462 (inf) loss_scale 1024.0000 (1485.3210) mem 7381MB [2024-08-27 21:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1090/1251] eta 0:00:37 lr 0.000332 wd 0.0500 time 0.2277 (0.2314) data time 0.0012 (0.0017) model time 0.2265 (0.2295) loss 2.9779 (3.0817) grad_norm 2.1574 (inf) loss_scale 1024.0000 (1481.0926) mem 7381MB [2024-08-27 21:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1100/1251] eta 0:00:34 lr 0.000332 wd 0.0500 time 0.2316 (0.2314) data time 0.0012 (0.0017) model time 0.2304 (0.2295) loss 3.3934 (3.0823) grad_norm 2.5533 (inf) loss_scale 1024.0000 (1476.9410) mem 7381MB [2024-08-27 21:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1110/1251] eta 0:00:32 lr 0.000332 wd 0.0500 time 0.2243 (0.2314) data time 0.0012 (0.0017) model time 0.2231 (0.2295) loss 3.3206 (3.0822) grad_norm 2.3417 (inf) loss_scale 1024.0000 (1472.8641) mem 7381MB [2024-08-27 21:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1120/1251] eta 0:00:30 lr 0.000332 wd 0.0500 time 0.2245 (0.2314) data time 0.0009 (0.0017) model time 0.2236 (0.2295) loss 2.6902 (3.0798) grad_norm 2.4757 (inf) loss_scale 1024.0000 (1468.8599) mem 7381MB [2024-08-27 21:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1130/1251] eta 0:00:27 lr 0.000332 wd 0.0500 time 0.2321 (0.2314) data time 
0.0010 (0.0017) model time 0.2311 (0.2296) loss 2.2115 (3.0786) grad_norm 2.6488 (inf) loss_scale 1024.0000 (1464.9266) mem 7381MB [2024-08-27 21:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1140/1251] eta 0:00:25 lr 0.000332 wd 0.0500 time 0.2211 (0.2314) data time 0.0012 (0.0017) model time 0.2199 (0.2295) loss 3.2729 (3.0793) grad_norm 2.6513 (inf) loss_scale 1024.0000 (1461.0622) mem 7381MB [2024-08-27 21:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1150/1251] eta 0:00:23 lr 0.000331 wd 0.0500 time 0.2244 (0.2314) data time 0.0012 (0.0017) model time 0.2233 (0.2296) loss 2.2627 (3.0789) grad_norm 2.5507 (inf) loss_scale 1024.0000 (1457.2650) mem 7381MB [2024-08-27 21:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1160/1251] eta 0:00:21 lr 0.000331 wd 0.0500 time 0.2256 (0.2314) data time 0.0009 (0.0017) model time 0.2247 (0.2296) loss 3.2373 (3.0809) grad_norm 4.0644 (inf) loss_scale 1024.0000 (1453.5332) mem 7381MB [2024-08-27 21:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1170/1251] eta 0:00:18 lr 0.000331 wd 0.0500 time 0.2283 (0.2314) data time 0.0009 (0.0017) model time 0.2273 (0.2295) loss 2.9023 (3.0804) grad_norm 2.5099 (inf) loss_scale 1024.0000 (1449.8651) mem 7381MB [2024-08-27 21:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1180/1251] eta 0:00:16 lr 0.000331 wd 0.0500 time 0.2231 (0.2313) data time 0.0007 (0.0017) model time 0.2224 (0.2295) loss 3.7828 (3.0795) grad_norm 2.0653 (inf) loss_scale 1024.0000 (1446.2591) mem 7381MB [2024-08-27 21:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1190/1251] eta 0:00:14 lr 0.000331 wd 0.0500 time 0.2263 (0.2313) data time 0.0009 (0.0017) model time 0.2254 (0.2295) loss 3.1165 (3.0781) grad_norm 2.4918 (inf) loss_scale 1024.0000 (1442.7137) mem 7381MB [2024-08-27 21:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[191/300][1200/1251] eta 0:00:11 lr 0.000331 wd 0.0500 time 0.2206 (0.2313) data time 0.0008 (0.0017) model time 0.2198 (0.2295) loss 3.4994 (3.0783) grad_norm 4.8183 (inf) loss_scale 1024.0000 (1439.2273) mem 7381MB [2024-08-27 21:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1210/1251] eta 0:00:09 lr 0.000331 wd 0.0500 time 0.2280 (0.2313) data time 0.0008 (0.0017) model time 0.2272 (0.2295) loss 3.7899 (3.0774) grad_norm 3.3855 (inf) loss_scale 1024.0000 (1435.7985) mem 7381MB [2024-08-27 21:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1220/1251] eta 0:00:07 lr 0.000331 wd 0.0500 time 0.2313 (0.2313) data time 0.0010 (0.0017) model time 0.2304 (0.2295) loss 3.1508 (3.0769) grad_norm 2.8921 (inf) loss_scale 1024.0000 (1432.4259) mem 7381MB [2024-08-27 21:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1230/1251] eta 0:00:04 lr 0.000331 wd 0.0500 time 0.2399 (0.2313) data time 0.0007 (0.0017) model time 0.2392 (0.2295) loss 3.0316 (3.0757) grad_norm 2.9544 (inf) loss_scale 1024.0000 (1429.1080) mem 7381MB [2024-08-27 21:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1240/1251] eta 0:00:02 lr 0.000331 wd 0.0500 time 0.2114 (0.2312) data time 0.0005 (0.0017) model time 0.2110 (0.2294) loss 1.8499 (3.0734) grad_norm 1.8210 (inf) loss_scale 1024.0000 (1425.8437) mem 7381MB [2024-08-27 21:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [191/300][1250/1251] eta 0:00:00 lr 0.000331 wd 0.0500 time 0.2206 (0.2312) data time 0.0004 (0.0017) model time 0.2202 (0.2294) loss 3.7030 (3.0735) grad_norm 2.4910 (inf) loss_scale 1024.0000 (1422.6315) mem 7381MB [2024-08-27 21:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 191 training takes 0:04:49 [2024-08-27 21:27:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 21:27:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 21:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.449 (0.449) Loss 0.4146 (0.4146) Acc@1 92.285 (92.285) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 21:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.120) Loss 0.6968 (0.6826) Acc@1 85.938 (85.547) Acc@5 97.266 (97.106) Mem 7381MB [2024-08-27 21:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.103) Loss 1.0039 (0.7108) Acc@1 77.344 (84.584) Acc@5 94.531 (97.047) Mem 7381MB [2024-08-27 21:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.089 (0.097) Loss 1.2188 (0.8079) Acc@1 70.410 (82.211) Acc@5 91.602 (95.993) Mem 7381MB [2024-08-27 21:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.089) Loss 1.0381 (0.8537) Acc@1 75.391 (80.933) Acc@5 93.945 (95.532) Mem 7381MB [2024-08-27 21:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.586 Acc@5 95.510 [2024-08-27 21:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.6% [2024-08-27 21:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.946 (0.946) Loss 0.3884 (0.3884) Acc@1 92.969 (92.969) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 21:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.165) Loss 0.6074 (0.6153) Acc@1 88.574 (87.047) Acc@5 97.266 (97.479) Mem 7381MB [2024-08-27 21:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.126) Loss 0.8750 (0.6405) Acc@1 78.418 (86.063) Acc@5 95.898 (97.503) Mem 7381MB [2024-08-27 21:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.112) Loss 1.1133 (0.7254) Acc@1 72.949 (84.019) Acc@5 
92.773 (96.636) Mem 7381MB [2024-08-27 21:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.100) Loss 0.9800 (0.7685) Acc@1 75.977 (82.734) Acc@5 94.336 (96.177) Mem 7381MB [2024-08-27 21:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.302 Acc@5 96.150 [2024-08-27 21:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.3% [2024-08-27 21:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.30% [2024-08-27 21:27:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 21:27:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 21:27:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][0/1251] eta 0:15:38 lr 0.000331 wd 0.0500 time 0.7498 (0.7498) data time 0.4986 (0.4986) model time 0.0000 (0.0000) loss 2.3092 (2.3092) grad_norm 3.0250 (3.0250) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][10/1251] eta 0:06:08 lr 0.000331 wd 0.0500 time 0.2341 (0.2973) data time 0.0011 (0.0463) model time 0.0000 (0.0000) loss 3.5868 (3.1935) grad_norm 2.4455 (2.8585) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][20/1251] eta 0:05:25 lr 0.000331 wd 0.0500 time 0.2309 (0.2646) data time 0.0009 (0.0248) model time 0.0000 (0.0000) loss 2.1648 (2.9898) grad_norm 2.3681 (2.9026) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][30/1251] eta 0:05:09 lr 0.000331 wd 0.0500 time 0.2259 (0.2533) data time 0.0007 (0.0172) model time 0.0000 (0.0000) loss 3.0123 (3.0704) grad_norm 2.8517 (2.9670) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][40/1251] eta 0:04:59 lr 0.000331 wd 0.0500 time 0.2222 (0.2473) data time 0.0010 (0.0133) model time 0.0000 (0.0000) loss 3.6510 (3.0918) grad_norm 2.8725 (3.1024) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][50/1251] eta 0:04:52 lr 0.000331 wd 0.0500 time 0.2288 (0.2439) data time 0.0007 (0.0110) model time 0.0000 (0.0000) loss 3.7020 (3.0703) grad_norm 8.1148 (3.3121) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][60/1251] eta 0:04:52 lr 0.000331 wd 0.0500 time 0.2212 (0.2452) data time 0.0009 (0.0094) model time 0.2203 (0.2506) loss 3.5978 (3.0726) grad_norm 2.6519 (3.3470) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][70/1251] eta 0:04:46 lr 0.000331 wd 0.0500 time 0.2220 (0.2430) data time 0.0008 (0.0082) model time 0.2212 (0.2395) loss 2.7697 (3.0864) grad_norm 2.5113 (3.2417) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][80/1251] eta 0:04:42 lr 0.000331 wd 0.0500 time 0.2297 (0.2413) data time 0.0007 (0.0073) model time 0.2289 (0.2358) loss 3.6508 (3.1063) grad_norm 3.6832 (3.2027) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][90/1251] eta 0:04:38 lr 0.000331 wd 0.0500 time 0.2289 (0.2400) data time 0.0010 (0.0067) model time 0.2280 (0.2337) loss 3.3569 (3.0689) grad_norm 3.2099 (3.1913) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][100/1251] eta 0:04:35 lr 0.000331 wd 0.0500 time 0.2203 (0.2389) data 
time 0.0008 (0.0061) model time 0.2195 (0.2326) loss 3.7133 (3.0851) grad_norm 3.6674 (3.1531) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][110/1251] eta 0:04:31 lr 0.000331 wd 0.0500 time 0.2279 (0.2381) data time 0.0009 (0.0057) model time 0.2270 (0.2319) loss 3.3751 (3.0577) grad_norm 3.9200 (3.1471) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][120/1251] eta 0:04:28 lr 0.000331 wd 0.0500 time 0.2274 (0.2375) data time 0.0009 (0.0053) model time 0.2266 (0.2315) loss 2.3984 (3.0578) grad_norm 2.3148 (3.1435) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][130/1251] eta 0:04:25 lr 0.000331 wd 0.0500 time 0.2291 (0.2368) data time 0.0010 (0.0050) model time 0.2281 (0.2310) loss 2.3546 (3.0697) grad_norm 6.6712 (3.2056) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][140/1251] eta 0:04:22 lr 0.000330 wd 0.0500 time 0.2296 (0.2361) data time 0.0008 (0.0048) model time 0.2288 (0.2304) loss 2.1238 (3.0714) grad_norm 3.5006 (3.2322) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][150/1251] eta 0:04:19 lr 0.000330 wd 0.0500 time 0.2214 (0.2358) data time 0.0007 (0.0045) model time 0.2208 (0.2304) loss 1.9797 (3.0598) grad_norm 4.4672 (3.2460) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][160/1251] eta 0:04:16 lr 0.000330 wd 0.0500 time 0.2346 (0.2354) data time 0.0007 (0.0043) model time 0.2339 (0.2302) loss 2.9851 (3.0529) grad_norm 3.4075 (3.2539) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:34 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [192/300][170/1251] eta 0:04:14 lr 0.000330 wd 0.0500 time 0.2323 (0.2351) data time 0.0008 (0.0041) model time 0.2315 (0.2301) loss 1.9000 (3.0417) grad_norm 2.5952 (3.2429) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][180/1251] eta 0:04:11 lr 0.000330 wd 0.0500 time 0.2256 (0.2347) data time 0.0008 (0.0040) model time 0.2248 (0.2299) loss 2.9481 (3.0433) grad_norm 2.2990 (3.2026) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][190/1251] eta 0:04:08 lr 0.000330 wd 0.0500 time 0.2591 (0.2346) data time 0.0007 (0.0038) model time 0.2584 (0.2300) loss 2.6266 (3.0418) grad_norm 3.8299 (3.1983) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 21:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][200/1251] eta 0:04:06 lr 0.000330 wd 0.0500 time 0.2312 (0.2343) data time 0.0007 (0.0037) model time 0.2304 (0.2298) loss 4.1541 (3.0520) grad_norm 3.2876 (inf) loss_scale 512.0000 (998.5274) mem 7381MB [2024-08-27 21:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][210/1251] eta 0:04:03 lr 0.000330 wd 0.0500 time 0.2414 (0.2342) data time 0.0014 (0.0036) model time 0.2400 (0.2299) loss 3.3238 (3.0609) grad_norm 3.1028 (inf) loss_scale 512.0000 (975.4692) mem 7381MB [2024-08-27 21:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][220/1251] eta 0:04:01 lr 0.000330 wd 0.0500 time 0.2294 (0.2340) data time 0.0009 (0.0035) model time 0.2285 (0.2298) loss 2.9725 (3.0505) grad_norm 3.6529 (inf) loss_scale 512.0000 (954.4977) mem 7381MB [2024-08-27 21:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][230/1251] eta 0:03:58 lr 0.000330 wd 0.0500 time 0.2427 (0.2338) data time 0.0007 (0.0034) model time 0.2420 (0.2297) loss 3.7391 (3.0498) grad_norm 2.2832 (inf) 
loss_scale 512.0000 (935.3420) mem 7381MB [2024-08-27 21:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][240/1251] eta 0:03:56 lr 0.000330 wd 0.0500 time 0.2360 (0.2336) data time 0.0009 (0.0033) model time 0.2350 (0.2297) loss 3.1352 (3.0452) grad_norm 2.7683 (inf) loss_scale 512.0000 (917.7759) mem 7381MB [2024-08-27 21:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][250/1251] eta 0:03:53 lr 0.000330 wd 0.0500 time 0.2345 (0.2335) data time 0.0010 (0.0032) model time 0.2335 (0.2297) loss 2.2922 (3.0476) grad_norm 3.9489 (inf) loss_scale 512.0000 (901.6096) mem 7381MB [2024-08-27 21:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][260/1251] eta 0:03:51 lr 0.000330 wd 0.0500 time 0.2302 (0.2333) data time 0.0009 (0.0031) model time 0.2293 (0.2296) loss 3.2772 (3.0414) grad_norm 2.6374 (inf) loss_scale 512.0000 (886.6820) mem 7381MB [2024-08-27 21:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][270/1251] eta 0:03:48 lr 0.000330 wd 0.0500 time 0.2297 (0.2332) data time 0.0008 (0.0030) model time 0.2288 (0.2295) loss 3.3514 (3.0534) grad_norm 2.7550 (inf) loss_scale 512.0000 (872.8561) mem 7381MB [2024-08-27 21:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][280/1251] eta 0:03:46 lr 0.000330 wd 0.0500 time 0.2306 (0.2330) data time 0.0009 (0.0030) model time 0.2298 (0.2294) loss 3.2641 (3.0488) grad_norm 2.7582 (inf) loss_scale 512.0000 (860.0142) mem 7381MB [2024-08-27 21:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][290/1251] eta 0:03:43 lr 0.000330 wd 0.0500 time 0.2248 (0.2329) data time 0.0010 (0.0029) model time 0.2238 (0.2293) loss 2.2503 (3.0484) grad_norm 1.9959 (inf) loss_scale 512.0000 (848.0550) mem 7381MB [2024-08-27 21:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][300/1251] eta 0:03:41 lr 0.000330 wd 0.0500 time 0.2231 (0.2327) data time 0.0007 (0.0028) model 
time 0.2224 (0.2293) loss 3.5301 (3.0523) grad_norm 3.0290 (inf) loss_scale 512.0000 (836.8904) mem 7381MB [2024-08-27 21:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][310/1251] eta 0:03:38 lr 0.000330 wd 0.0500 time 0.2269 (0.2326) data time 0.0008 (0.0028) model time 0.2260 (0.2292) loss 3.5471 (3.0625) grad_norm 2.5350 (inf) loss_scale 512.0000 (826.4437) mem 7381MB [2024-08-27 21:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][320/1251] eta 0:03:36 lr 0.000330 wd 0.0500 time 0.2287 (0.2324) data time 0.0009 (0.0027) model time 0.2278 (0.2290) loss 3.0690 (3.0682) grad_norm 2.6287 (inf) loss_scale 512.0000 (816.6480) mem 7381MB [2024-08-27 21:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][330/1251] eta 0:03:33 lr 0.000330 wd 0.0500 time 0.2231 (0.2322) data time 0.0011 (0.0027) model time 0.2220 (0.2289) loss 2.7248 (3.0713) grad_norm 2.4446 (inf) loss_scale 512.0000 (807.4441) mem 7381MB [2024-08-27 21:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][340/1251] eta 0:03:31 lr 0.000330 wd 0.0500 time 0.2320 (0.2321) data time 0.0007 (0.0026) model time 0.2313 (0.2288) loss 2.3727 (3.0562) grad_norm 2.3357 (inf) loss_scale 512.0000 (798.7801) mem 7381MB [2024-08-27 21:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][350/1251] eta 0:03:29 lr 0.000330 wd 0.0500 time 0.2284 (0.2320) data time 0.0010 (0.0026) model time 0.2275 (0.2288) loss 3.4810 (3.0609) grad_norm 2.3658 (inf) loss_scale 512.0000 (790.6097) mem 7381MB [2024-08-27 21:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][360/1251] eta 0:03:26 lr 0.000330 wd 0.0500 time 0.2308 (0.2319) data time 0.0007 (0.0026) model time 0.2301 (0.2288) loss 3.6528 (3.0643) grad_norm 2.6200 (inf) loss_scale 512.0000 (782.8920) mem 7381MB [2024-08-27 21:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][370/1251] eta 0:03:24 lr 
0.000330 wd 0.0500 time 0.2268 (0.2320) data time 0.0007 (0.0025) model time 0.2261 (0.2289) loss 3.4020 (3.0660) grad_norm 3.2884 (inf) loss_scale 512.0000 (775.5903) mem 7381MB [2024-08-27 21:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][380/1251] eta 0:03:21 lr 0.000329 wd 0.0500 time 0.2274 (0.2319) data time 0.0009 (0.0025) model time 0.2264 (0.2288) loss 3.0008 (3.0725) grad_norm 3.5698 (inf) loss_scale 512.0000 (768.6719) mem 7381MB [2024-08-27 21:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][390/1251] eta 0:03:19 lr 0.000329 wd 0.0500 time 0.2281 (0.2319) data time 0.0013 (0.0025) model time 0.2268 (0.2289) loss 3.1251 (3.0731) grad_norm 3.0512 (inf) loss_scale 512.0000 (762.1074) mem 7381MB [2024-08-27 21:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][400/1251] eta 0:03:17 lr 0.000329 wd 0.0500 time 0.2232 (0.2319) data time 0.0012 (0.0025) model time 0.2219 (0.2289) loss 3.2371 (3.0674) grad_norm 3.8304 (inf) loss_scale 512.0000 (755.8703) mem 7381MB [2024-08-27 21:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][410/1251] eta 0:03:14 lr 0.000329 wd 0.0500 time 0.2220 (0.2318) data time 0.0016 (0.0025) model time 0.2204 (0.2288) loss 2.2339 (3.0667) grad_norm 3.0746 (inf) loss_scale 512.0000 (749.9367) mem 7381MB [2024-08-27 21:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][420/1251] eta 0:03:12 lr 0.000329 wd 0.0500 time 0.2228 (0.2318) data time 0.0006 (0.0024) model time 0.2222 (0.2288) loss 3.6702 (3.0719) grad_norm 2.9357 (inf) loss_scale 512.0000 (744.2850) mem 7381MB [2024-08-27 21:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][430/1251] eta 0:03:10 lr 0.000329 wd 0.0500 time 0.2256 (0.2317) data time 0.0011 (0.0024) model time 0.2245 (0.2288) loss 3.4003 (3.0738) grad_norm 2.3487 (inf) loss_scale 512.0000 (738.8956) mem 7381MB [2024-08-27 21:29:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [192/300][440/1251] eta 0:03:07 lr 0.000329 wd 0.0500 time 0.2199 (0.2316) data time 0.0013 (0.0024) model time 0.2186 (0.2287) loss 3.4678 (3.0728) grad_norm 3.4820 (inf) loss_scale 512.0000 (733.7506) mem 7381MB [2024-08-27 21:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][450/1251] eta 0:03:05 lr 0.000329 wd 0.0500 time 0.2275 (0.2315) data time 0.0011 (0.0024) model time 0.2264 (0.2287) loss 3.2578 (3.0751) grad_norm 2.4464 (inf) loss_scale 512.0000 (728.8337) mem 7381MB [2024-08-27 21:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][460/1251] eta 0:03:03 lr 0.000329 wd 0.0500 time 0.2476 (0.2315) data time 0.0016 (0.0023) model time 0.2461 (0.2287) loss 2.1740 (3.0734) grad_norm 3.7867 (inf) loss_scale 512.0000 (724.1302) mem 7381MB [2024-08-27 21:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][470/1251] eta 0:03:00 lr 0.000329 wd 0.0500 time 0.2270 (0.2314) data time 0.0014 (0.0023) model time 0.2257 (0.2287) loss 3.6706 (3.0728) grad_norm 2.3442 (inf) loss_scale 512.0000 (719.6263) mem 7381MB [2024-08-27 21:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][480/1251] eta 0:02:58 lr 0.000329 wd 0.0500 time 0.2186 (0.2314) data time 0.0015 (0.0023) model time 0.2171 (0.2286) loss 3.0997 (3.0679) grad_norm 2.9982 (inf) loss_scale 512.0000 (715.3098) mem 7381MB [2024-08-27 21:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][490/1251] eta 0:02:56 lr 0.000329 wd 0.0500 time 0.2282 (0.2314) data time 0.0007 (0.0022) model time 0.2275 (0.2287) loss 3.0177 (3.0659) grad_norm 2.4347 (inf) loss_scale 512.0000 (711.1690) mem 7381MB [2024-08-27 21:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][500/1251] eta 0:02:53 lr 0.000329 wd 0.0500 time 0.2337 (0.2314) data time 0.0007 (0.0022) model time 0.2330 (0.2287) loss 3.7348 (3.0646) grad_norm 2.7120 (inf) loss_scale 
512.0000 (707.1936) mem 7381MB [2024-08-27 21:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][510/1251] eta 0:02:51 lr 0.000329 wd 0.0500 time 0.2210 (0.2313) data time 0.0007 (0.0022) model time 0.2203 (0.2287) loss 3.0991 (3.0697) grad_norm 2.8244 (inf) loss_scale 512.0000 (703.3738) mem 7381MB [2024-08-27 21:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][520/1251] eta 0:02:49 lr 0.000329 wd 0.0500 time 0.2261 (0.2317) data time 0.0007 (0.0022) model time 0.2254 (0.2291) loss 2.5498 (3.0724) grad_norm 3.9531 (inf) loss_scale 512.0000 (699.7006) mem 7381MB [2024-08-27 21:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][530/1251] eta 0:02:47 lr 0.000329 wd 0.0500 time 0.2210 (0.2319) data time 0.0009 (0.0022) model time 0.2201 (0.2294) loss 3.5862 (3.0757) grad_norm 2.9592 (inf) loss_scale 512.0000 (696.1657) mem 7381MB [2024-08-27 21:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][540/1251] eta 0:02:44 lr 0.000329 wd 0.0500 time 0.2297 (0.2318) data time 0.0009 (0.0021) model time 0.2288 (0.2294) loss 2.2675 (3.0728) grad_norm 2.6539 (inf) loss_scale 512.0000 (692.7616) mem 7381MB [2024-08-27 21:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][550/1251] eta 0:02:42 lr 0.000329 wd 0.0500 time 0.2282 (0.2318) data time 0.0009 (0.0021) model time 0.2273 (0.2293) loss 2.4743 (3.0676) grad_norm 4.0651 (inf) loss_scale 512.0000 (689.4809) mem 7381MB [2024-08-27 21:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][560/1251] eta 0:02:40 lr 0.000329 wd 0.0500 time 0.2275 (0.2317) data time 0.0011 (0.0021) model time 0.2264 (0.2293) loss 2.7920 (3.0657) grad_norm 2.5551 (inf) loss_scale 512.0000 (686.3173) mem 7381MB [2024-08-27 21:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][570/1251] eta 0:02:37 lr 0.000329 wd 0.0500 time 0.2340 (0.2317) data time 0.0009 (0.0021) model time 
0.2331 (0.2293) loss 2.9322 (3.0656) grad_norm 2.8454 (inf) loss_scale 512.0000 (683.2644) mem 7381MB [2024-08-27 21:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][580/1251] eta 0:02:35 lr 0.000329 wd 0.0500 time 0.2226 (0.2316) data time 0.0010 (0.0021) model time 0.2215 (0.2292) loss 3.1463 (3.0623) grad_norm 1.9261 (inf) loss_scale 512.0000 (680.3167) mem 7381MB [2024-08-27 21:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][590/1251] eta 0:02:33 lr 0.000329 wd 0.0500 time 0.2242 (0.2316) data time 0.0010 (0.0021) model time 0.2233 (0.2292) loss 2.3833 (3.0606) grad_norm 1.8334 (inf) loss_scale 512.0000 (677.4687) mem 7381MB [2024-08-27 21:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][600/1251] eta 0:02:30 lr 0.000329 wd 0.0500 time 0.2227 (0.2315) data time 0.0009 (0.0020) model time 0.2218 (0.2291) loss 2.5652 (3.0605) grad_norm 2.6906 (inf) loss_scale 512.0000 (674.7155) mem 7381MB [2024-08-27 21:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][610/1251] eta 0:02:28 lr 0.000329 wd 0.0500 time 0.2411 (0.2315) data time 0.0008 (0.0020) model time 0.2403 (0.2291) loss 2.9142 (3.0637) grad_norm 3.8440 (inf) loss_scale 512.0000 (672.0524) mem 7381MB [2024-08-27 21:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][620/1251] eta 0:02:26 lr 0.000328 wd 0.0500 time 0.2309 (0.2315) data time 0.0009 (0.0020) model time 0.2300 (0.2291) loss 3.7991 (3.0669) grad_norm 3.3205 (inf) loss_scale 512.0000 (669.4750) mem 7381MB [2024-08-27 21:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][630/1251] eta 0:02:23 lr 0.000328 wd 0.0500 time 0.2291 (0.2314) data time 0.0012 (0.0020) model time 0.2279 (0.2291) loss 2.9416 (3.0662) grad_norm 4.1828 (inf) loss_scale 512.0000 (666.9794) mem 7381MB [2024-08-27 21:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][640/1251] eta 0:02:21 lr 0.000328 wd 
0.0500 time 0.2358 (0.2314) data time 0.0007 (0.0020) model time 0.2351 (0.2291) loss 2.6930 (3.0659) grad_norm 2.9302 (inf) loss_scale 512.0000 (664.5616) mem 7381MB [2024-08-27 21:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][650/1251] eta 0:02:19 lr 0.000328 wd 0.0500 time 0.2400 (0.2314) data time 0.0010 (0.0020) model time 0.2390 (0.2291) loss 2.0737 (3.0609) grad_norm 3.3229 (inf) loss_scale 512.0000 (662.2181) mem 7381MB [2024-08-27 21:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][660/1251] eta 0:02:16 lr 0.000328 wd 0.0500 time 0.2301 (0.2313) data time 0.0008 (0.0020) model time 0.2293 (0.2290) loss 3.1237 (3.0596) grad_norm 3.2172 (inf) loss_scale 512.0000 (659.9455) mem 7381MB [2024-08-27 21:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][670/1251] eta 0:02:14 lr 0.000328 wd 0.0500 time 0.2245 (0.2313) data time 0.0010 (0.0020) model time 0.2235 (0.2290) loss 3.6532 (3.0588) grad_norm 2.5286 (inf) loss_scale 512.0000 (657.7407) mem 7381MB [2024-08-27 21:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][680/1251] eta 0:02:12 lr 0.000328 wd 0.0500 time 0.2317 (0.2313) data time 0.0011 (0.0020) model time 0.2306 (0.2290) loss 2.8773 (3.0614) grad_norm 5.5869 (inf) loss_scale 512.0000 (655.6006) mem 7381MB [2024-08-27 21:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][690/1251] eta 0:02:09 lr 0.000328 wd 0.0500 time 0.2356 (0.2313) data time 0.0009 (0.0019) model time 0.2346 (0.2291) loss 2.8795 (3.0633) grad_norm 3.1774 (inf) loss_scale 512.0000 (653.5224) mem 7381MB [2024-08-27 21:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][700/1251] eta 0:02:07 lr 0.000328 wd 0.0500 time 0.2228 (0.2313) data time 0.0010 (0.0019) model time 0.2218 (0.2291) loss 2.7366 (3.0642) grad_norm 3.4820 (inf) loss_scale 512.0000 (651.5036) mem 7381MB [2024-08-27 21:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [192/300][710/1251] eta 0:02:05 lr 0.000328 wd 0.0500 time 0.2224 (0.2313) data time 0.0008 (0.0019) model time 0.2215 (0.2291) loss 3.5946 (3.0698) grad_norm 4.3307 (inf) loss_scale 512.0000 (649.5415) mem 7381MB [2024-08-27 21:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][720/1251] eta 0:02:02 lr 0.000328 wd 0.0500 time 0.2312 (0.2313) data time 0.0008 (0.0019) model time 0.2304 (0.2291) loss 3.1028 (3.0691) grad_norm 2.2604 (inf) loss_scale 256.0000 (645.5035) mem 7381MB [2024-08-27 21:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][730/1251] eta 0:02:00 lr 0.000328 wd 0.0500 time 0.2295 (0.2313) data time 0.0008 (0.0019) model time 0.2286 (0.2291) loss 3.0881 (3.0695) grad_norm 2.6975 (inf) loss_scale 256.0000 (640.1751) mem 7381MB [2024-08-27 21:30:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][740/1251] eta 0:01:58 lr 0.000328 wd 0.0500 time 0.2217 (0.2313) data time 0.0011 (0.0019) model time 0.2206 (0.2291) loss 2.6265 (3.0723) grad_norm 2.7126 (inf) loss_scale 256.0000 (634.9906) mem 7381MB [2024-08-27 21:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][750/1251] eta 0:01:55 lr 0.000328 wd 0.0500 time 0.2301 (0.2313) data time 0.0006 (0.0019) model time 0.2294 (0.2291) loss 2.7225 (3.0763) grad_norm 2.4610 (inf) loss_scale 256.0000 (629.9441) mem 7381MB [2024-08-27 21:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][760/1251] eta 0:01:53 lr 0.000328 wd 0.0500 time 0.2252 (0.2313) data time 0.0007 (0.0019) model time 0.2245 (0.2291) loss 2.2939 (3.0762) grad_norm 3.8660 (inf) loss_scale 256.0000 (625.0302) mem 7381MB [2024-08-27 21:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][770/1251] eta 0:01:51 lr 0.000328 wd 0.0500 time 0.2247 (0.2312) data time 0.0007 (0.0019) model time 0.2240 (0.2291) loss 2.1827 (3.0746) grad_norm 3.4175 (inf) loss_scale 256.0000 (620.2438) mem 
7381MB [2024-08-27 21:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][780/1251] eta 0:01:48 lr 0.000328 wd 0.0500 time 0.2309 (0.2312) data time 0.0010 (0.0019) model time 0.2299 (0.2291) loss 2.8810 (3.0759) grad_norm 4.4897 (inf) loss_scale 256.0000 (615.5800) mem 7381MB [2024-08-27 21:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][790/1251] eta 0:01:46 lr 0.000328 wd 0.0500 time 0.2231 (0.2312) data time 0.0008 (0.0019) model time 0.2223 (0.2290) loss 1.6528 (3.0757) grad_norm 2.8631 (inf) loss_scale 256.0000 (611.0341) mem 7381MB [2024-08-27 21:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][800/1251] eta 0:01:44 lr 0.000328 wd 0.0500 time 0.2237 (0.2311) data time 0.0007 (0.0019) model time 0.2229 (0.2290) loss 3.7889 (3.0747) grad_norm 3.5587 (inf) loss_scale 256.0000 (606.6017) mem 7381MB [2024-08-27 21:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][810/1251] eta 0:01:41 lr 0.000328 wd 0.0500 time 0.2201 (0.2311) data time 0.0013 (0.0019) model time 0.2188 (0.2290) loss 3.1634 (3.0757) grad_norm 2.4301 (inf) loss_scale 256.0000 (602.2787) mem 7381MB [2024-08-27 21:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][820/1251] eta 0:01:39 lr 0.000328 wd 0.0500 time 0.2267 (0.2310) data time 0.0009 (0.0018) model time 0.2258 (0.2289) loss 2.8425 (3.0774) grad_norm 3.5196 (inf) loss_scale 256.0000 (598.0609) mem 7381MB [2024-08-27 21:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][830/1251] eta 0:01:37 lr 0.000328 wd 0.0500 time 0.2162 (0.2310) data time 0.0009 (0.0018) model time 0.2153 (0.2289) loss 3.6881 (3.0767) grad_norm 5.2758 (inf) loss_scale 256.0000 (593.9446) mem 7381MB [2024-08-27 21:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][840/1251] eta 0:01:34 lr 0.000328 wd 0.0500 time 0.2198 (0.2310) data time 0.0008 (0.0018) model time 0.2189 (0.2289) loss 3.1024 
(3.0782) grad_norm 2.2541 (inf) loss_scale 256.0000 (589.9263) mem 7381MB [2024-08-27 21:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][850/1251] eta 0:01:32 lr 0.000328 wd 0.0500 time 0.2265 (0.2309) data time 0.0009 (0.0018) model time 0.2255 (0.2289) loss 2.6461 (3.0782) grad_norm 2.7578 (inf) loss_scale 256.0000 (586.0024) mem 7381MB [2024-08-27 21:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][860/1251] eta 0:01:30 lr 0.000328 wd 0.0500 time 0.2349 (0.2309) data time 0.0009 (0.0018) model time 0.2340 (0.2289) loss 3.3774 (3.0801) grad_norm 6.1801 (inf) loss_scale 256.0000 (582.1696) mem 7381MB [2024-08-27 21:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][870/1251] eta 0:01:27 lr 0.000327 wd 0.0500 time 0.2326 (0.2309) data time 0.0010 (0.0018) model time 0.2316 (0.2289) loss 3.0154 (3.0787) grad_norm 3.2111 (inf) loss_scale 256.0000 (578.4248) mem 7381MB [2024-08-27 21:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][880/1251] eta 0:01:25 lr 0.000327 wd 0.0500 time 0.2268 (0.2309) data time 0.0009 (0.0018) model time 0.2259 (0.2289) loss 3.1696 (3.0783) grad_norm 4.3291 (inf) loss_scale 256.0000 (574.7650) mem 7381MB [2024-08-27 21:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][890/1251] eta 0:01:23 lr 0.000327 wd 0.0500 time 0.2260 (0.2309) data time 0.0010 (0.0018) model time 0.2250 (0.2288) loss 2.2474 (3.0735) grad_norm 2.6362 (inf) loss_scale 256.0000 (571.1874) mem 7381MB [2024-08-27 21:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][900/1251] eta 0:01:21 lr 0.000327 wd 0.0500 time 0.2245 (0.2308) data time 0.0009 (0.0018) model time 0.2236 (0.2288) loss 3.2586 (3.0724) grad_norm 5.8524 (inf) loss_scale 256.0000 (567.6892) mem 7381MB [2024-08-27 21:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][910/1251] eta 0:01:18 lr 0.000327 wd 0.0500 time 0.2229 (0.2309) 
data time 0.0008 (0.0018) model time 0.2221 (0.2288) loss 2.6446 (3.0709) grad_norm 4.6504 (inf) loss_scale 256.0000 (564.2678) mem 7381MB [2024-08-27 21:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][920/1251] eta 0:01:16 lr 0.000327 wd 0.0500 time 0.2211 (0.2311) data time 0.0009 (0.0018) model time 0.2202 (0.2291) loss 2.2710 (3.0673) grad_norm 2.6666 (inf) loss_scale 256.0000 (560.9207) mem 7381MB [2024-08-27 21:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][930/1251] eta 0:01:14 lr 0.000327 wd 0.0500 time 0.2452 (0.2311) data time 0.0006 (0.0018) model time 0.2445 (0.2291) loss 2.1517 (3.0672) grad_norm 2.7755 (inf) loss_scale 256.0000 (557.6455) mem 7381MB [2024-08-27 21:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][940/1251] eta 0:01:11 lr 0.000327 wd 0.0500 time 0.2314 (0.2311) data time 0.0008 (0.0018) model time 0.2306 (0.2291) loss 2.1031 (3.0666) grad_norm 2.9410 (inf) loss_scale 256.0000 (554.4400) mem 7381MB [2024-08-27 21:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][950/1251] eta 0:01:09 lr 0.000327 wd 0.0500 time 0.2215 (0.2310) data time 0.0007 (0.0018) model time 0.2208 (0.2291) loss 3.4301 (3.0682) grad_norm 2.3671 (inf) loss_scale 256.0000 (551.3018) mem 7381MB [2024-08-27 21:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][960/1251] eta 0:01:07 lr 0.000327 wd 0.0500 time 0.2221 (0.2310) data time 0.0007 (0.0018) model time 0.2214 (0.2291) loss 2.9635 (3.0670) grad_norm 3.2782 (inf) loss_scale 256.0000 (548.2289) mem 7381MB [2024-08-27 21:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][970/1251] eta 0:01:04 lr 0.000327 wd 0.0500 time 0.2275 (0.2310) data time 0.0007 (0.0017) model time 0.2268 (0.2290) loss 3.6268 (3.0671) grad_norm 3.0919 (inf) loss_scale 256.0000 (545.2194) mem 7381MB [2024-08-27 21:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[192/300][980/1251] eta 0:01:02 lr 0.000327 wd 0.0500 time 0.2259 (0.2310) data time 0.0007 (0.0017) model time 0.2252 (0.2290) loss 3.4474 (3.0663) grad_norm 2.8968 (inf) loss_scale 256.0000 (542.2712) mem 7381MB [2024-08-27 21:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][990/1251] eta 0:01:00 lr 0.000327 wd 0.0500 time 0.2255 (0.2311) data time 0.0011 (0.0017) model time 0.2243 (0.2292) loss 3.2561 (3.0665) grad_norm 3.5648 (inf) loss_scale 256.0000 (539.3824) mem 7381MB [2024-08-27 21:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1000/1251] eta 0:00:58 lr 0.000327 wd 0.0500 time 0.2263 (0.2311) data time 0.0007 (0.0017) model time 0.2256 (0.2292) loss 2.0739 (3.0662) grad_norm 2.7158 (inf) loss_scale 256.0000 (536.5514) mem 7381MB [2024-08-27 21:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1010/1251] eta 0:00:55 lr 0.000327 wd 0.0500 time 0.2259 (0.2311) data time 0.0006 (0.0017) model time 0.2252 (0.2292) loss 2.3139 (3.0683) grad_norm 2.3462 (inf) loss_scale 256.0000 (533.7765) mem 7381MB [2024-08-27 21:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1020/1251] eta 0:00:53 lr 0.000327 wd 0.0500 time 0.2301 (0.2311) data time 0.0017 (0.0017) model time 0.2284 (0.2292) loss 2.5286 (3.0681) grad_norm 2.9250 (inf) loss_scale 256.0000 (531.0558) mem 7381MB [2024-08-27 21:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1030/1251] eta 0:00:51 lr 0.000327 wd 0.0500 time 0.2336 (0.2311) data time 0.0007 (0.0017) model time 0.2329 (0.2292) loss 3.2240 (3.0698) grad_norm 3.1772 (inf) loss_scale 256.0000 (528.3880) mem 7381MB [2024-08-27 21:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1040/1251] eta 0:00:48 lr 0.000327 wd 0.0500 time 0.2311 (0.2311) data time 0.0010 (0.0017) model time 0.2301 (0.2292) loss 3.5567 (3.0726) grad_norm 2.6507 (inf) loss_scale 256.0000 (525.7714) mem 7381MB 
[2024-08-27 21:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1050/1251] eta 0:00:46 lr 0.000327 wd 0.0500 time 0.2258 (0.2313) data time 0.0010 (0.0017) model time 0.2249 (0.2294) loss 3.1958 (3.0688) grad_norm 2.5756 (inf) loss_scale 256.0000 (523.2046) mem 7381MB [2024-08-27 21:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1060/1251] eta 0:00:44 lr 0.000327 wd 0.0500 time 0.2329 (0.2313) data time 0.0010 (0.0017) model time 0.2319 (0.2294) loss 3.3729 (3.0705) grad_norm 2.6547 (inf) loss_scale 256.0000 (520.6861) mem 7381MB [2024-08-27 21:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1070/1251] eta 0:00:41 lr 0.000327 wd 0.0500 time 0.2306 (0.2312) data time 0.0007 (0.0017) model time 0.2299 (0.2294) loss 2.0765 (3.0711) grad_norm 2.4129 (inf) loss_scale 256.0000 (518.2148) mem 7381MB [2024-08-27 21:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1080/1251] eta 0:00:39 lr 0.000327 wd 0.0500 time 0.2254 (0.2312) data time 0.0009 (0.0017) model time 0.2245 (0.2294) loss 3.1921 (3.0710) grad_norm 2.8983 (inf) loss_scale 256.0000 (515.7891) mem 7381MB [2024-08-27 21:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1090/1251] eta 0:00:37 lr 0.000327 wd 0.0500 time 0.2255 (0.2312) data time 0.0016 (0.0017) model time 0.2239 (0.2293) loss 2.9029 (3.0705) grad_norm 2.2722 (inf) loss_scale 256.0000 (513.4079) mem 7381MB [2024-08-27 21:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1100/1251] eta 0:00:34 lr 0.000327 wd 0.0500 time 0.2317 (0.2312) data time 0.0032 (0.0017) model time 0.2285 (0.2293) loss 2.6536 (3.0703) grad_norm 2.3932 (inf) loss_scale 256.0000 (511.0699) mem 7381MB [2024-08-27 21:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1110/1251] eta 0:00:32 lr 0.000326 wd 0.0500 time 0.2261 (0.2312) data time 0.0009 (0.0017) model time 0.2251 (0.2293) loss 3.5952 
(3.0692) grad_norm 2.3696 (inf) loss_scale 256.0000 (508.7741) mem 7381MB [2024-08-27 21:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1120/1251] eta 0:00:30 lr 0.000326 wd 0.0500 time 0.2298 (0.2312) data time 0.0007 (0.0017) model time 0.2291 (0.2293) loss 2.2676 (3.0661) grad_norm 2.6751 (inf) loss_scale 256.0000 (506.5192) mem 7381MB [2024-08-27 21:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1130/1251] eta 0:00:27 lr 0.000326 wd 0.0500 time 0.2232 (0.2312) data time 0.0010 (0.0017) model time 0.2222 (0.2293) loss 2.7260 (3.0655) grad_norm 3.6001 (inf) loss_scale 256.0000 (504.3042) mem 7381MB [2024-08-27 21:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1140/1251] eta 0:00:25 lr 0.000326 wd 0.0500 time 0.2298 (0.2311) data time 0.0013 (0.0017) model time 0.2286 (0.2293) loss 3.3060 (3.0665) grad_norm 3.3856 (inf) loss_scale 256.0000 (502.1280) mem 7381MB [2024-08-27 21:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1150/1251] eta 0:00:23 lr 0.000326 wd 0.0500 time 0.2409 (0.2311) data time 0.0009 (0.0017) model time 0.2400 (0.2293) loss 3.4631 (3.0671) grad_norm 2.1925 (inf) loss_scale 256.0000 (499.9896) mem 7381MB [2024-08-27 21:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1160/1251] eta 0:00:21 lr 0.000326 wd 0.0500 time 0.2266 (0.2311) data time 0.0010 (0.0017) model time 0.2256 (0.2293) loss 3.2776 (3.0692) grad_norm 2.5908 (inf) loss_scale 256.0000 (497.8880) mem 7381MB [2024-08-27 21:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1170/1251] eta 0:00:18 lr 0.000326 wd 0.0500 time 0.2319 (0.2311) data time 0.0009 (0.0017) model time 0.2310 (0.2293) loss 3.5795 (3.0703) grad_norm 2.4636 (inf) loss_scale 256.0000 (495.8224) mem 7381MB [2024-08-27 21:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1180/1251] eta 0:00:16 lr 0.000326 wd 0.0500 time 0.2277 
(0.2311) data time 0.0009 (0.0017) model time 0.2268 (0.2293) loss 2.7413 (3.0697) grad_norm 3.5112 (inf) loss_scale 256.0000 (493.7917) mem 7381MB [2024-08-27 21:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1190/1251] eta 0:00:14 lr 0.000326 wd 0.0500 time 0.2217 (0.2311) data time 0.0016 (0.0017) model time 0.2201 (0.2293) loss 3.6991 (3.0663) grad_norm 3.0101 (inf) loss_scale 256.0000 (491.7951) mem 7381MB [2024-08-27 21:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1200/1251] eta 0:00:11 lr 0.000326 wd 0.0500 time 0.2267 (0.2311) data time 0.0009 (0.0016) model time 0.2258 (0.2293) loss 3.5763 (3.0664) grad_norm 2.9621 (inf) loss_scale 256.0000 (489.8318) mem 7381MB [2024-08-27 21:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1210/1251] eta 0:00:09 lr 0.000326 wd 0.0500 time 0.2223 (0.2311) data time 0.0013 (0.0016) model time 0.2210 (0.2293) loss 4.2605 (3.0663) grad_norm 3.1649 (inf) loss_scale 256.0000 (487.9009) mem 7381MB [2024-08-27 21:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1220/1251] eta 0:00:07 lr 0.000326 wd 0.0500 time 0.2266 (0.2311) data time 0.0011 (0.0016) model time 0.2255 (0.2293) loss 3.3957 (3.0668) grad_norm 2.2427 (inf) loss_scale 256.0000 (486.0016) mem 7381MB [2024-08-27 21:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1230/1251] eta 0:00:04 lr 0.000326 wd 0.0500 time 0.2274 (0.2311) data time 0.0009 (0.0016) model time 0.2265 (0.2293) loss 2.8605 (3.0670) grad_norm 3.5350 (inf) loss_scale 256.0000 (484.1332) mem 7381MB [2024-08-27 21:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [192/300][1240/1251] eta 0:00:02 lr 0.000326 wd 0.0500 time 0.2109 (0.2310) data time 0.0006 (0.0016) model time 0.2103 (0.2292) loss 2.7173 (3.0675) grad_norm 2.8865 (inf) loss_scale 256.0000 (482.2949) mem 7381MB [2024-08-27 21:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [192/300][1250/1251] eta 0:00:00 lr 0.000326 wd 0.0500 time 0.2101 (0.2309) data time 0.0007 (0.0016) model time 0.2094 (0.2291) loss 2.9861 (3.0670) grad_norm 3.4981 (inf) loss_scale 256.0000 (480.4860) mem 7381MB [2024-08-27 21:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 192 training takes 0:04:48 [2024-08-27 21:32:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:32:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 21:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.585 (0.585) Loss 0.4805 (0.4805) Acc@1 92.285 (92.285) Acc@5 98.242 (98.242) Mem 7381MB [2024-08-27 21:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.130) Loss 0.7090 (0.6970) Acc@1 85.352 (85.556) Acc@5 96.777 (97.221) Mem 7381MB [2024-08-27 21:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.108) Loss 1.0547 (0.7248) Acc@1 75.781 (84.394) Acc@5 93.652 (97.140) Mem 7381MB [2024-08-27 21:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.099) Loss 1.2227 (0.8216) Acc@1 70.117 (82.176) Acc@5 92.383 (96.087) Mem 7381MB [2024-08-27 21:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.091) Loss 1.0762 (0.8676) Acc@1 74.902 (80.943) Acc@5 92.969 (95.501) Mem 7381MB [2024-08-27 21:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.552 Acc@5 95.476 [2024-08-27 21:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.6% [2024-08-27 21:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.956 (0.956) Loss 0.3884 (0.3884) Acc@1 92.969 (92.969) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-27 21:32:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.169) Loss 0.6084 (0.6146) Acc@1 88.477 (87.100) Acc@5 97.168 (97.496) Mem 7381MB [2024-08-27 21:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.127) Loss 0.8745 (0.6402) Acc@1 78.418 (86.114) Acc@5 95.996 (97.535) Mem 7381MB [2024-08-27 21:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.112) Loss 1.1152 (0.7252) Acc@1 72.363 (84.032) Acc@5 92.676 (96.645) Mem 7381MB [2024-08-27 21:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.101) Loss 0.9805 (0.7683) Acc@1 76.074 (82.755) Acc@5 94.336 (96.191) Mem 7381MB [2024-08-27 21:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.314 Acc@5 96.174 [2024-08-27 21:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.3% [2024-08-27 21:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.31% [2024-08-27 21:32:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 21:32:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 21:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][0/1251] eta 0:18:17 lr 0.000326 wd 0.0500 time 0.8773 (0.8773) data time 0.6287 (0.6287) model time 0.0000 (0.0000) loss 3.8638 (3.8638) grad_norm 3.3542 (3.3542) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][10/1251] eta 0:05:57 lr 0.000326 wd 0.0500 time 0.2277 (0.2882) data time 0.0010 (0.0582) model time 0.0000 (0.0000) loss 2.9352 (3.0670) grad_norm 2.3534 (2.8814) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][20/1251] eta 0:05:20 lr 0.000326 wd 0.0500 time 0.2485 (0.2606) data time 0.0008 (0.0310) model time 0.0000 (0.0000) loss 3.1652 (3.0615) grad_norm 3.0234 (2.9190) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][30/1251] eta 0:05:05 lr 0.000326 wd 0.0500 time 0.2217 (0.2500) data time 0.0011 (0.0214) model time 0.0000 (0.0000) loss 2.7448 (3.0145) grad_norm 1.7060 (2.8235) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][40/1251] eta 0:04:56 lr 0.000326 wd 0.0500 time 0.2443 (0.2452) data time 0.0007 (0.0164) model time 0.0000 (0.0000) loss 2.4332 (2.9961) grad_norm 3.1934 (2.9472) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][50/1251] eta 0:04:51 lr 0.000326 wd 0.0500 time 0.2393 (0.2423) data time 0.0007 (0.0134) model time 0.0000 (0.0000) loss 3.3672 (2.9773) grad_norm 2.4724 (3.0528) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][60/1251] eta 0:04:46 lr 0.000326 wd 0.0500 time 0.2467 (0.2403) data time 0.0007 (0.0114) model time 0.2460 (0.2287) loss 
2.7090 (2.9741) grad_norm 4.1942 (3.1791) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][70/1251] eta 0:04:41 lr 0.000326 wd 0.0500 time 0.2340 (0.2386) data time 0.0009 (0.0100) model time 0.2331 (0.2277) loss 3.3713 (2.9801) grad_norm 3.7443 (3.2100) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][80/1251] eta 0:04:38 lr 0.000326 wd 0.0500 time 0.2390 (0.2375) data time 0.0009 (0.0088) model time 0.2381 (0.2282) loss 3.6312 (3.0021) grad_norm 2.4415 (3.2058) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][90/1251] eta 0:04:34 lr 0.000326 wd 0.0500 time 0.2255 (0.2367) data time 0.0009 (0.0080) model time 0.2246 (0.2285) loss 1.7952 (2.9893) grad_norm 3.3362 (3.1538) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][100/1251] eta 0:04:31 lr 0.000325 wd 0.0500 time 0.2449 (0.2363) data time 0.0008 (0.0074) model time 0.2442 (0.2288) loss 2.0869 (2.9700) grad_norm 3.1771 (3.1302) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][110/1251] eta 0:04:30 lr 0.000325 wd 0.0500 time 0.2216 (0.2373) data time 0.0009 (0.0068) model time 0.2207 (0.2317) loss 3.5687 (2.9780) grad_norm 2.2771 (3.1114) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][120/1251] eta 0:04:29 lr 0.000325 wd 0.0500 time 0.2253 (0.2382) data time 0.0007 (0.0063) model time 0.2246 (0.2340) loss 2.8687 (2.9905) grad_norm 3.2343 (3.1671) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][130/1251] eta 0:04:26 lr 0.000325 wd 
0.0500 time 0.2227 (0.2377) data time 0.0013 (0.0059) model time 0.2215 (0.2336) loss 2.8018 (2.9866) grad_norm 3.6651 (3.1695) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][140/1251] eta 0:04:23 lr 0.000325 wd 0.0500 time 0.2276 (0.2373) data time 0.0007 (0.0056) model time 0.2270 (0.2332) loss 3.7946 (2.9857) grad_norm 2.8101 (3.1440) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][150/1251] eta 0:04:20 lr 0.000325 wd 0.0500 time 0.2258 (0.2368) data time 0.0012 (0.0053) model time 0.2246 (0.2327) loss 2.9767 (2.9726) grad_norm 3.3648 (3.1406) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][160/1251] eta 0:04:17 lr 0.000325 wd 0.0500 time 0.2247 (0.2365) data time 0.0007 (0.0051) model time 0.2240 (0.2325) loss 3.3686 (2.9720) grad_norm 3.1222 (3.1343) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][170/1251] eta 0:04:15 lr 0.000325 wd 0.0500 time 0.2229 (0.2359) data time 0.0009 (0.0048) model time 0.2220 (0.2320) loss 2.5178 (2.9768) grad_norm 2.6042 (3.1132) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][180/1251] eta 0:04:12 lr 0.000325 wd 0.0500 time 0.2277 (0.2355) data time 0.0009 (0.0046) model time 0.2268 (0.2317) loss 2.8849 (2.9701) grad_norm 2.8734 (3.1247) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][190/1251] eta 0:04:09 lr 0.000325 wd 0.0500 time 0.2233 (0.2353) data time 0.0009 (0.0044) model time 0.2224 (0.2315) loss 2.3919 (2.9686) grad_norm 3.8419 (3.1165) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:41 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [193/300][200/1251] eta 0:04:07 lr 0.000325 wd 0.0500 time 0.2244 (0.2351) data time 0.0009 (0.0043) model time 0.2234 (0.2314) loss 2.9949 (2.9659) grad_norm 3.0392 (3.0886) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][210/1251] eta 0:04:04 lr 0.000325 wd 0.0500 time 0.2267 (0.2347) data time 0.0007 (0.0041) model time 0.2260 (0.2311) loss 2.8825 (2.9713) grad_norm 3.3975 (3.0662) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][220/1251] eta 0:04:01 lr 0.000325 wd 0.0500 time 0.2262 (0.2345) data time 0.0008 (0.0040) model time 0.2254 (0.2310) loss 3.1911 (2.9728) grad_norm 2.5251 (3.0434) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][230/1251] eta 0:03:59 lr 0.000325 wd 0.0500 time 0.2291 (0.2344) data time 0.0009 (0.0039) model time 0.2282 (0.2309) loss 2.6902 (2.9680) grad_norm 2.3704 (3.0154) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][240/1251] eta 0:03:56 lr 0.000325 wd 0.0500 time 0.2311 (0.2342) data time 0.0009 (0.0038) model time 0.2302 (0.2308) loss 2.0274 (2.9623) grad_norm 2.4445 (3.0188) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][250/1251] eta 0:03:54 lr 0.000325 wd 0.0500 time 0.2273 (0.2339) data time 0.0011 (0.0037) model time 0.2262 (0.2305) loss 3.3872 (2.9617) grad_norm 4.9433 (3.0162) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][260/1251] eta 0:03:51 lr 0.000325 wd 0.0500 time 0.2431 (0.2339) data time 0.0009 (0.0036) model time 0.2422 (0.2306) loss 3.0806 (2.9680) grad_norm 5.3117 
(3.1062) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][270/1251] eta 0:03:49 lr 0.000325 wd 0.0500 time 0.2318 (0.2337) data time 0.0011 (0.0035) model time 0.2306 (0.2306) loss 2.7365 (2.9567) grad_norm 2.7781 (3.1064) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][280/1251] eta 0:03:46 lr 0.000325 wd 0.0500 time 0.2275 (0.2336) data time 0.0009 (0.0034) model time 0.2266 (0.2305) loss 3.0209 (2.9650) grad_norm 2.7760 (3.0991) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][290/1251] eta 0:03:44 lr 0.000325 wd 0.0500 time 0.2238 (0.2334) data time 0.0010 (0.0033) model time 0.2228 (0.2303) loss 3.1061 (2.9692) grad_norm 3.7536 (3.1130) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][300/1251] eta 0:03:41 lr 0.000325 wd 0.0500 time 0.2249 (0.2332) data time 0.0010 (0.0033) model time 0.2239 (0.2301) loss 2.2133 (2.9630) grad_norm 3.1986 (3.1050) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][310/1251] eta 0:03:39 lr 0.000325 wd 0.0500 time 0.2283 (0.2330) data time 0.0009 (0.0032) model time 0.2274 (0.2299) loss 3.1613 (2.9708) grad_norm 2.4313 (3.1128) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][320/1251] eta 0:03:36 lr 0.000325 wd 0.0500 time 0.2226 (0.2328) data time 0.0007 (0.0031) model time 0.2219 (0.2298) loss 3.6710 (2.9765) grad_norm 2.3997 (3.1107) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][330/1251] eta 0:03:34 lr 0.000325 wd 0.0500 time 0.2261 (0.2326) data 
time 0.0009 (0.0031) model time 0.2252 (0.2296) loss 2.8679 (2.9782) grad_norm 2.9291 (3.1026) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][340/1251] eta 0:03:31 lr 0.000324 wd 0.0500 time 0.2247 (0.2325) data time 0.0014 (0.0030) model time 0.2233 (0.2295) loss 2.6104 (2.9857) grad_norm 3.1070 (3.0980) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][350/1251] eta 0:03:29 lr 0.000324 wd 0.0500 time 0.2243 (0.2323) data time 0.0009 (0.0030) model time 0.2234 (0.2294) loss 3.6248 (3.0034) grad_norm 3.2666 (3.0927) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][360/1251] eta 0:03:27 lr 0.000324 wd 0.0500 time 0.2289 (0.2333) data time 0.0007 (0.0029) model time 0.2283 (0.2307) loss 3.3396 (3.0103) grad_norm 3.2372 (3.0875) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][370/1251] eta 0:03:25 lr 0.000324 wd 0.0500 time 0.2260 (0.2332) data time 0.0010 (0.0029) model time 0.2251 (0.2305) loss 3.0930 (3.0027) grad_norm 2.4881 (3.0869) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][380/1251] eta 0:03:23 lr 0.000324 wd 0.0500 time 0.2260 (0.2331) data time 0.0008 (0.0028) model time 0.2252 (0.2305) loss 2.8543 (3.0005) grad_norm 2.5231 (3.0861) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][390/1251] eta 0:03:20 lr 0.000324 wd 0.0500 time 0.2287 (0.2331) data time 0.0015 (0.0028) model time 0.2273 (0.2305) loss 3.3031 (3.0058) grad_norm 2.7015 (3.0778) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [193/300][400/1251] eta 0:03:18 lr 0.000324 wd 0.0500 time 0.2291 (0.2330) data time 0.0010 (0.0027) model time 0.2281 (0.2304) loss 2.7046 (3.0045) grad_norm 2.8309 (3.0709) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][410/1251] eta 0:03:16 lr 0.000324 wd 0.0500 time 0.4525 (0.2339) data time 0.0009 (0.0027) model time 0.4517 (0.2315) loss 3.6765 (3.0118) grad_norm 2.8953 (3.0732) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][420/1251] eta 0:03:14 lr 0.000324 wd 0.0500 time 0.2360 (0.2338) data time 0.0008 (0.0027) model time 0.2352 (0.2314) loss 3.5954 (3.0104) grad_norm 2.8017 (3.0777) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][430/1251] eta 0:03:11 lr 0.000324 wd 0.0500 time 0.2293 (0.2337) data time 0.0009 (0.0026) model time 0.2284 (0.2313) loss 3.6460 (3.0160) grad_norm 2.7562 (3.1111) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][440/1251] eta 0:03:09 lr 0.000324 wd 0.0500 time 0.2270 (0.2336) data time 0.0010 (0.0026) model time 0.2260 (0.2313) loss 2.9915 (3.0184) grad_norm 2.5447 (3.1060) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][450/1251] eta 0:03:06 lr 0.000324 wd 0.0500 time 0.2347 (0.2334) data time 0.0007 (0.0026) model time 0.2340 (0.2311) loss 3.6732 (3.0220) grad_norm 2.1863 (3.0998) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][460/1251] eta 0:03:04 lr 0.000324 wd 0.0500 time 0.2370 (0.2334) data time 0.0008 (0.0025) model time 0.2362 (0.2310) loss 3.2517 (3.0260) grad_norm 3.7332 (3.0972) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-08-27 21:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][470/1251] eta 0:03:02 lr 0.000324 wd 0.0500 time 0.2267 (0.2332) data time 0.0007 (0.0025) model time 0.2261 (0.2309) loss 1.9724 (3.0237) grad_norm 2.8686 (3.0991) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][480/1251] eta 0:02:59 lr 0.000324 wd 0.0500 time 0.2333 (0.2332) data time 0.0010 (0.0025) model time 0.2323 (0.2309) loss 3.1995 (3.0246) grad_norm 4.7469 (3.1180) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][490/1251] eta 0:02:57 lr 0.000324 wd 0.0500 time 0.2200 (0.2331) data time 0.0012 (0.0024) model time 0.2188 (0.2308) loss 3.7619 (3.0259) grad_norm 4.7710 (3.1300) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][500/1251] eta 0:02:54 lr 0.000324 wd 0.0500 time 0.2352 (0.2330) data time 0.0007 (0.0024) model time 0.2346 (0.2307) loss 3.0057 (3.0295) grad_norm 2.6923 (3.1311) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][510/1251] eta 0:02:52 lr 0.000324 wd 0.0500 time 0.2271 (0.2329) data time 0.0008 (0.0024) model time 0.2263 (0.2307) loss 2.8861 (3.0282) grad_norm 2.7885 (3.1368) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][520/1251] eta 0:02:50 lr 0.000324 wd 0.0500 time 0.2434 (0.2329) data time 0.0009 (0.0024) model time 0.2425 (0.2307) loss 2.8448 (3.0273) grad_norm 2.9192 (3.1313) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][530/1251] eta 0:02:47 lr 0.000324 wd 0.0500 time 0.2285 (0.2329) data time 0.0008 (0.0023) model 
time 0.2277 (0.2307) loss 3.4305 (3.0301) grad_norm 3.7928 (3.1275) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][540/1251] eta 0:02:45 lr 0.000324 wd 0.0500 time 0.2284 (0.2328) data time 0.0009 (0.0023) model time 0.2275 (0.2307) loss 3.5076 (3.0350) grad_norm 2.8978 (3.1268) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][550/1251] eta 0:02:43 lr 0.000324 wd 0.0500 time 0.2203 (0.2327) data time 0.0012 (0.0023) model time 0.2191 (0.2306) loss 3.3734 (3.0357) grad_norm 3.9031 (3.1304) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][560/1251] eta 0:02:40 lr 0.000324 wd 0.0500 time 0.2315 (0.2327) data time 0.0010 (0.0023) model time 0.2305 (0.2306) loss 3.0400 (3.0341) grad_norm 17.7076 (3.1551) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][570/1251] eta 0:02:38 lr 0.000324 wd 0.0500 time 0.2372 (0.2327) data time 0.0007 (0.0023) model time 0.2365 (0.2305) loss 2.9765 (3.0341) grad_norm 3.5011 (3.1535) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][580/1251] eta 0:02:36 lr 0.000323 wd 0.0500 time 0.2417 (0.2326) data time 0.0009 (0.0022) model time 0.2409 (0.2305) loss 3.1436 (3.0368) grad_norm 2.8152 (3.1496) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][590/1251] eta 0:02:33 lr 0.000323 wd 0.0500 time 0.2287 (0.2326) data time 0.0009 (0.0022) model time 0.2278 (0.2305) loss 3.6894 (3.0343) grad_norm 2.8030 (3.1531) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][600/1251] 
eta 0:02:31 lr 0.000323 wd 0.0500 time 0.2220 (0.2325) data time 0.0009 (0.0022) model time 0.2212 (0.2305) loss 3.1558 (3.0345) grad_norm 2.4253 (3.1782) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][610/1251] eta 0:02:29 lr 0.000323 wd 0.0500 time 0.2328 (0.2325) data time 0.0008 (0.0022) model time 0.2320 (0.2304) loss 3.2640 (3.0368) grad_norm 3.3728 (3.1814) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][620/1251] eta 0:02:26 lr 0.000323 wd 0.0500 time 0.2444 (0.2325) data time 0.0013 (0.0022) model time 0.2432 (0.2304) loss 3.4143 (3.0356) grad_norm 2.2781 (3.1738) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][630/1251] eta 0:02:24 lr 0.000323 wd 0.0500 time 0.2576 (0.2325) data time 0.0009 (0.0022) model time 0.2567 (0.2304) loss 2.6697 (3.0345) grad_norm 2.4814 (3.1646) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][640/1251] eta 0:02:22 lr 0.000323 wd 0.0500 time 0.2380 (0.2324) data time 0.0010 (0.0021) model time 0.2370 (0.2304) loss 2.3574 (3.0303) grad_norm 4.4085 (3.1635) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][650/1251] eta 0:02:19 lr 0.000323 wd 0.0500 time 0.2272 (0.2324) data time 0.0008 (0.0021) model time 0.2264 (0.2304) loss 3.4807 (3.0307) grad_norm 3.3532 (3.1637) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][660/1251] eta 0:02:17 lr 0.000323 wd 0.0500 time 0.2300 (0.2324) data time 0.0007 (0.0021) model time 0.2293 (0.2304) loss 2.7569 (3.0351) grad_norm 2.4662 (3.1556) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 
21:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][670/1251] eta 0:02:15 lr 0.000323 wd 0.0500 time 0.2454 (0.2324) data time 0.0007 (0.0021) model time 0.2447 (0.2304) loss 3.5524 (3.0367) grad_norm 2.8655 (3.1571) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][680/1251] eta 0:02:12 lr 0.000323 wd 0.0500 time 0.2290 (0.2323) data time 0.0010 (0.0021) model time 0.2280 (0.2303) loss 3.7870 (3.0338) grad_norm 4.6880 (3.1598) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][690/1251] eta 0:02:10 lr 0.000323 wd 0.0500 time 0.2280 (0.2323) data time 0.0012 (0.0021) model time 0.2267 (0.2303) loss 3.2381 (3.0345) grad_norm 3.2002 (3.1638) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][700/1251] eta 0:02:07 lr 0.000323 wd 0.0500 time 0.2456 (0.2323) data time 0.0010 (0.0021) model time 0.2446 (0.2303) loss 3.4318 (3.0342) grad_norm 3.5851 (3.1690) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][710/1251] eta 0:02:05 lr 0.000323 wd 0.0500 time 0.2346 (0.2323) data time 0.0009 (0.0021) model time 0.2337 (0.2303) loss 3.1903 (3.0345) grad_norm 2.1953 (3.1829) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][720/1251] eta 0:02:03 lr 0.000323 wd 0.0500 time 0.2267 (0.2322) data time 0.0009 (0.0021) model time 0.2258 (0.2302) loss 3.3787 (3.0363) grad_norm 2.4821 (3.1766) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][730/1251] eta 0:02:00 lr 0.000323 wd 0.0500 time 0.2294 (0.2322) data time 0.0011 (0.0020) model time 0.2283 (0.2302) loss 3.6302 
(3.0345) grad_norm 2.6010 (3.1688) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][740/1251] eta 0:01:58 lr 0.000323 wd 0.0500 time 0.2942 (0.2322) data time 0.0013 (0.0020) model time 0.2930 (0.2303) loss 3.2973 (3.0370) grad_norm 2.8244 (3.1657) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][750/1251] eta 0:01:56 lr 0.000323 wd 0.0500 time 0.3331 (0.2324) data time 0.0013 (0.0020) model time 0.3318 (0.2304) loss 2.8036 (3.0324) grad_norm 2.3097 (3.1629) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][760/1251] eta 0:01:54 lr 0.000323 wd 0.0500 time 0.2342 (0.2323) data time 0.0007 (0.0020) model time 0.2336 (0.2304) loss 2.0295 (3.0279) grad_norm 3.2594 (3.1578) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][770/1251] eta 0:01:51 lr 0.000323 wd 0.0500 time 0.2341 (0.2323) data time 0.0006 (0.0020) model time 0.2334 (0.2304) loss 2.9629 (3.0301) grad_norm 3.1772 (3.1580) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][780/1251] eta 0:01:49 lr 0.000323 wd 0.0500 time 0.2370 (0.2322) data time 0.0011 (0.0020) model time 0.2360 (0.2303) loss 2.8859 (3.0325) grad_norm 3.4782 (3.1603) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][790/1251] eta 0:01:47 lr 0.000323 wd 0.0500 time 0.2411 (0.2322) data time 0.0009 (0.0020) model time 0.2402 (0.2303) loss 3.2289 (3.0327) grad_norm 3.0459 (3.1619) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][800/1251] eta 0:01:44 lr 0.000323 wd 0.0500 
time 0.2321 (0.2322) data time 0.0010 (0.0020) model time 0.2312 (0.2303) loss 3.2766 (3.0333) grad_norm 3.1603 (3.1585) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][810/1251] eta 0:01:42 lr 0.000323 wd 0.0500 time 0.2395 (0.2322) data time 0.0010 (0.0020) model time 0.2384 (0.2303) loss 3.1832 (3.0306) grad_norm 4.6368 (3.1588) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][820/1251] eta 0:01:40 lr 0.000322 wd 0.0500 time 0.2326 (0.2321) data time 0.0009 (0.0020) model time 0.2317 (0.2303) loss 2.5901 (3.0323) grad_norm 2.7360 (3.1581) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][830/1251] eta 0:01:37 lr 0.000322 wd 0.0500 time 0.2330 (0.2321) data time 0.0010 (0.0019) model time 0.2320 (0.2302) loss 3.2969 (3.0341) grad_norm 3.8349 (3.1579) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][840/1251] eta 0:01:35 lr 0.000322 wd 0.0500 time 0.2260 (0.2321) data time 0.0009 (0.0019) model time 0.2251 (0.2302) loss 3.1319 (3.0342) grad_norm 2.9632 (3.1642) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][850/1251] eta 0:01:33 lr 0.000322 wd 0.0500 time 0.2474 (0.2321) data time 0.0015 (0.0019) model time 0.2459 (0.2302) loss 2.3507 (3.0341) grad_norm 2.9168 (3.1647) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][860/1251] eta 0:01:30 lr 0.000322 wd 0.0500 time 0.2274 (0.2320) data time 0.0011 (0.0019) model time 0.2263 (0.2302) loss 1.9772 (3.0360) grad_norm 3.1476 (3.1648) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:15 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [193/300][870/1251] eta 0:01:28 lr 0.000322 wd 0.0500 time 0.2314 (0.2320) data time 0.0016 (0.0019) model time 0.2298 (0.2301) loss 3.1115 (3.0345) grad_norm 3.1078 (3.1640) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][880/1251] eta 0:01:26 lr 0.000322 wd 0.0500 time 0.2434 (0.2320) data time 0.0009 (0.0019) model time 0.2424 (0.2301) loss 3.3464 (3.0365) grad_norm 1.9719 (3.1609) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][890/1251] eta 0:01:23 lr 0.000322 wd 0.0500 time 0.2389 (0.2320) data time 0.0009 (0.0019) model time 0.2380 (0.2301) loss 3.1228 (3.0373) grad_norm 2.1662 (3.1550) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][900/1251] eta 0:01:21 lr 0.000322 wd 0.0500 time 0.2263 (0.2325) data time 0.0013 (0.0019) model time 0.2251 (0.2306) loss 2.8857 (3.0384) grad_norm 2.3591 (3.1513) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][910/1251] eta 0:01:19 lr 0.000322 wd 0.0500 time 0.2319 (0.2324) data time 0.0011 (0.0019) model time 0.2308 (0.2306) loss 2.1391 (3.0319) grad_norm 2.7631 (3.1443) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][920/1251] eta 0:01:16 lr 0.000322 wd 0.0500 time 0.2293 (0.2324) data time 0.0010 (0.0019) model time 0.2283 (0.2306) loss 3.1581 (3.0322) grad_norm 4.2651 (3.1453) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][930/1251] eta 0:01:14 lr 0.000322 wd 0.0500 time 0.2277 (0.2324) data time 0.0009 (0.0019) model time 0.2268 (0.2306) loss 3.0946 (3.0297) grad_norm 3.2773 
(3.1466) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][940/1251] eta 0:01:12 lr 0.000322 wd 0.0500 time 0.2256 (0.2324) data time 0.0009 (0.0019) model time 0.2247 (0.2305) loss 2.2259 (3.0294) grad_norm 3.7166 (3.1499) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][950/1251] eta 0:01:09 lr 0.000322 wd 0.0500 time 0.2281 (0.2323) data time 0.0008 (0.0019) model time 0.2273 (0.2305) loss 2.9909 (3.0309) grad_norm 4.1706 (3.1542) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][960/1251] eta 0:01:07 lr 0.000322 wd 0.0500 time 0.2243 (0.2323) data time 0.0007 (0.0019) model time 0.2236 (0.2305) loss 3.0872 (3.0307) grad_norm 2.3015 (3.1508) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][970/1251] eta 0:01:05 lr 0.000322 wd 0.0500 time 0.2301 (0.2323) data time 0.0009 (0.0019) model time 0.2292 (0.2305) loss 3.1994 (3.0298) grad_norm 2.6104 (3.1469) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][980/1251] eta 0:01:02 lr 0.000322 wd 0.0500 time 0.2220 (0.2322) data time 0.0010 (0.0019) model time 0.2211 (0.2305) loss 2.8034 (3.0305) grad_norm 2.4009 (3.1492) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][990/1251] eta 0:01:00 lr 0.000322 wd 0.0500 time 0.2318 (0.2322) data time 0.0007 (0.0019) model time 0.2311 (0.2304) loss 4.0463 (3.0299) grad_norm 2.1457 (3.1442) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1000/1251] eta 0:00:58 lr 0.000322 wd 0.0500 time 0.2260 (0.2322) 
data time 0.0012 (0.0019) model time 0.2248 (0.2304) loss 3.3186 (3.0307) grad_norm 2.6219 (3.1488) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1010/1251] eta 0:00:55 lr 0.000322 wd 0.0500 time 0.2229 (0.2322) data time 0.0010 (0.0019) model time 0.2219 (0.2304) loss 3.5149 (3.0327) grad_norm 4.1995 (3.1456) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1020/1251] eta 0:00:53 lr 0.000322 wd 0.0500 time 0.2343 (0.2321) data time 0.0008 (0.0019) model time 0.2335 (0.2304) loss 2.4715 (3.0315) grad_norm 2.4422 (3.1433) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1030/1251] eta 0:00:51 lr 0.000322 wd 0.0500 time 0.2268 (0.2321) data time 0.0007 (0.0018) model time 0.2261 (0.2303) loss 3.6590 (3.0312) grad_norm 3.4188 (3.1478) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1040/1251] eta 0:00:49 lr 0.000322 wd 0.0500 time 0.2234 (0.2323) data time 0.0008 (0.0018) model time 0.2225 (0.2305) loss 2.1732 (3.0319) grad_norm 2.7829 (3.1458) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1050/1251] eta 0:00:46 lr 0.000322 wd 0.0500 time 0.2306 (0.2324) data time 0.0010 (0.0018) model time 0.2295 (0.2307) loss 2.8382 (3.0302) grad_norm 3.0359 (3.1452) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1060/1251] eta 0:00:44 lr 0.000322 wd 0.0500 time 0.2346 (0.2324) data time 0.0013 (0.0018) model time 0.2334 (0.2307) loss 3.2882 (3.0327) grad_norm 2.9296 (3.1426) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [193/300][1070/1251] eta 0:00:42 lr 0.000321 wd 0.0500 time 0.2197 (0.2324) data time 0.0008 (0.0018) model time 0.2189 (0.2306) loss 2.7776 (3.0332) grad_norm 2.8530 (3.1394) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1080/1251] eta 0:00:39 lr 0.000321 wd 0.0500 time 0.2272 (0.2324) data time 0.0010 (0.0018) model time 0.2263 (0.2306) loss 2.9359 (3.0325) grad_norm 2.5262 (3.1356) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1090/1251] eta 0:00:37 lr 0.000321 wd 0.0500 time 0.2204 (0.2323) data time 0.0010 (0.0018) model time 0.2194 (0.2306) loss 3.0883 (3.0330) grad_norm 2.3607 (3.1310) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1100/1251] eta 0:00:35 lr 0.000321 wd 0.0500 time 0.2299 (0.2323) data time 0.0010 (0.0018) model time 0.2289 (0.2306) loss 3.0085 (3.0341) grad_norm 3.2826 (3.1353) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1110/1251] eta 0:00:32 lr 0.000321 wd 0.0500 time 0.2252 (0.2322) data time 0.0013 (0.0018) model time 0.2239 (0.2305) loss 2.4589 (3.0323) grad_norm 4.6283 (3.1385) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 21:37:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:37:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 21:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 21:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 21:39:08 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 21:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 21:39:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 21:39:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 21:39:25 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 21:39:25 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 193) [2024-08-27 21:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 21:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1120/1251] eta 0:05:25 lr 0.000321 wd 0.0500 time 0.2327 (2.4881) data time 0.0009 (0.1858) model time 0.2318 (2.3023) loss 4.1731 (3.5738) grad_norm 10.4003 (4.5055) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1130/1251] eta 0:02:10 lr 0.000321 wd 0.0500 time 0.2230 (1.0756) data time 0.0009 (0.0708) model time 0.2221 (1.0048) loss 3.3072 (3.2827) grad_norm 2.9249 (3.5190) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1140/1251] eta 0:01:23 lr 0.000321 wd 0.0500 time 0.2411 (0.7501) data time 0.0007 (0.0446) model time 0.2405 (0.7055) loss 3.2643 
(3.2819) grad_norm 2.3434 (3.2416) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1150/1251] eta 0:01:01 lr 0.000321 wd 0.0500 time 0.2245 (0.6048) data time 0.0010 (0.0326) model time 0.2235 (0.5722) loss 3.5005 (3.2837) grad_norm 2.0026 (3.2226) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1160/1251] eta 0:00:47 lr 0.000321 wd 0.0500 time 0.2244 (0.5226) data time 0.0012 (0.0258) model time 0.2231 (0.4968) loss 2.7757 (3.2221) grad_norm 3.6415 (3.1852) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1170/1251] eta 0:00:38 lr 0.000321 wd 0.0500 time 0.2344 (0.4704) data time 0.0008 (0.0214) model time 0.2335 (0.4490) loss 3.6060 (3.2134) grad_norm 3.6584 (3.2028) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1180/1251] eta 0:00:30 lr 0.000321 wd 0.0500 time 0.2324 (0.4338) data time 0.0008 (0.0184) model time 0.2317 (0.4155) loss 2.7690 (3.1817) grad_norm 2.8912 (3.1842) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1190/1251] eta 0:00:24 lr 0.000321 wd 0.0500 time 0.2298 (0.4065) data time 0.0009 (0.0161) model time 0.2290 (0.3904) loss 3.2948 (3.1518) grad_norm 3.5582 (3.1501) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1200/1251] eta 0:00:19 lr 0.000321 wd 0.0500 time 0.2273 (0.3857) data time 0.0008 (0.0144) model time 0.2266 (0.3714) loss 2.2657 (3.1246) grad_norm 3.1759 (3.1174) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1210/1251] eta 0:00:15 lr 0.000321 wd 
0.0500 time 0.2229 (0.3696) data time 0.0008 (0.0130) model time 0.2221 (0.3567) loss 3.0896 (3.1238) grad_norm 2.3139 (3.0939) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1220/1251] eta 0:00:11 lr 0.000321 wd 0.0500 time 0.2201 (0.3562) data time 0.0009 (0.0118) model time 0.2193 (0.3444) loss 3.5385 (3.1447) grad_norm 4.2787 (3.0681) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1230/1251] eta 0:00:07 lr 0.000321 wd 0.0500 time 0.2362 (0.3458) data time 0.0007 (0.0110) model time 0.2355 (0.3349) loss 3.6692 (3.1553) grad_norm 4.5946 (3.0542) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1240/1251] eta 0:00:03 lr 0.000321 wd 0.0500 time 0.2115 (0.3359) data time 0.0005 (0.0102) model time 0.2111 (0.3257) loss 1.9750 (3.1382) grad_norm 3.7733 (3.0666) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [193/300][1250/1251] eta 0:00:00 lr 0.000321 wd 0.0500 time 0.2109 (0.3272) data time 0.0006 (0.0096) model time 0.2102 (0.3177) loss 2.6746 (3.1420) grad_norm 5.8816 (3.1157) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-27 21:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 193 training takes 0:00:44 [2024-08-27 21:40:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:40:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 21:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.483 (0.483) Loss 0.4341 (0.4341) Acc@1 91.699 (91.699) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-27 21:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.116) Loss 0.6924 (0.6826) Acc@1 86.523 (85.627) Acc@5 97.168 (97.177) Mem 7377MB [2024-08-27 21:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.099) Loss 0.9634 (0.7119) Acc@1 77.539 (84.626) Acc@5 94.824 (97.117) Mem 7377MB [2024-08-27 21:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.093) Loss 1.2188 (0.8013) Acc@1 69.922 (82.513) Acc@5 91.602 (96.059) Mem 7377MB [2024-08-27 21:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.1191 (0.8521) Acc@1 72.949 (81.136) Acc@5 92.773 (95.470) Mem 7377MB [2024-08-27 21:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.662 Acc@5 95.434 [2024-08-27 21:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-27 21:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.66% [2024-08-27 21:40:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 21:40:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 21:40:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.433 (0.433) Loss 0.3887 (0.3887) Acc@1 93.066 (93.066) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-27 21:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.119) Loss 0.6094 (0.6140) Acc@1 88.574 (87.163) Acc@5 97.168 (97.470) Mem 7377MB [2024-08-27 21:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.102) Loss 0.8745 (0.6397) Acc@1 78.613 (86.147) Acc@5 95.898 (97.554) Mem 7377MB [2024-08-27 21:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.094) Loss 1.1123 (0.7245) Acc@1 72.266 (84.035) Acc@5 92.676 (96.658) Mem 7377MB [2024-08-27 21:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 0.9795 (0.7677) Acc@1 75.781 (82.748) Acc@5 94.434 (96.208) Mem 7377MB [2024-08-27 21:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.316 Acc@5 96.202 [2024-08-27 21:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.3% [2024-08-27 21:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.32% [2024-08-27 21:40:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 21:40:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 21:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][0/1251] eta 0:15:40 lr 0.000321 wd 0.0500 time 0.7514 (0.7514) data time 0.4309 (0.4309) model time 0.0000 (0.0000) loss 2.4836 (2.4836) grad_norm 2.9677 (2.9677) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-27 21:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][10/1251] eta 0:05:42 lr 0.000321 wd 0.0500 time 0.2308 (0.2760) data time 0.0009 (0.0409) model time 0.0000 (0.0000) loss 2.6832 (2.8407) grad_norm 2.7126 (3.3399) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][20/1251] eta 0:05:11 lr 0.000321 wd 0.0500 time 0.2245 (0.2527) data time 0.0011 (0.0219) model time 0.0000 (0.0000) loss 2.5465 (2.9116) grad_norm 2.4450 (3.1534) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][30/1251] eta 0:05:00 lr 0.000321 wd 0.0500 time 0.2299 (0.2462) data time 0.0009 (0.0152) model time 0.0000 (0.0000) loss 3.3977 (3.0062) grad_norm 3.9216 (3.0251) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][40/1251] eta 0:04:53 lr 0.000321 wd 0.0500 time 0.2254 (0.2423) data time 0.0010 (0.0117) model time 0.0000 (0.0000) loss 2.9436 (2.9880) grad_norm 2.1542 (2.9229) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][50/1251] eta 0:04:48 lr 0.000321 wd 0.0500 time 0.2251 (0.2401) data time 0.0008 (0.0096) model time 0.0000 (0.0000) loss 3.0953 (2.9881) grad_norm 2.4534 (2.9000) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][60/1251] eta 0:04:43 lr 0.000320 wd 0.0500 time 0.2223 (0.2381) data time 0.0006 (0.0082) model time 0.2217 (0.2274) loss 
3.4269 (2.9990) grad_norm 3.1202 (2.9466) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][70/1251] eta 0:04:39 lr 0.000320 wd 0.0500 time 0.2331 (0.2367) data time 0.0007 (0.0072) model time 0.2324 (0.2271) loss 3.1414 (2.9539) grad_norm 3.2692 (2.9223) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][80/1251] eta 0:04:36 lr 0.000320 wd 0.0500 time 0.2272 (0.2357) data time 0.0008 (0.0064) model time 0.2265 (0.2274) loss 3.4826 (2.9544) grad_norm 2.6764 (2.9102) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][90/1251] eta 0:04:33 lr 0.000320 wd 0.0500 time 0.2295 (0.2353) data time 0.0009 (0.0058) model time 0.2286 (0.2282) loss 2.9958 (2.9703) grad_norm 4.7253 (2.9453) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][100/1251] eta 0:04:30 lr 0.000320 wd 0.0500 time 0.2284 (0.2347) data time 0.0006 (0.0053) model time 0.2278 (0.2283) loss 2.5873 (2.9634) grad_norm 3.0782 (3.0225) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][110/1251] eta 0:04:27 lr 0.000320 wd 0.0500 time 0.2488 (0.2344) data time 0.0009 (0.0049) model time 0.2479 (0.2286) loss 3.2554 (2.9571) grad_norm 3.9330 (3.0813) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][120/1251] eta 0:04:24 lr 0.000320 wd 0.0500 time 0.2277 (0.2339) data time 0.0007 (0.0046) model time 0.2271 (0.2284) loss 2.1255 (2.9364) grad_norm 3.4829 (3.0943) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][130/1251] eta 0:04:22 lr 0.000320 wd 
0.0500 time 0.2350 (0.2338) data time 0.0008 (0.0043) model time 0.2341 (0.2288) loss 2.2764 (2.9329) grad_norm 2.4579 (3.0957) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][140/1251] eta 0:04:19 lr 0.000320 wd 0.0500 time 0.2315 (0.2336) data time 0.0009 (0.0041) model time 0.2305 (0.2289) loss 3.2569 (2.9546) grad_norm 2.5078 (3.1133) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][150/1251] eta 0:04:16 lr 0.000320 wd 0.0500 time 0.2277 (0.2332) data time 0.0007 (0.0039) model time 0.2270 (0.2287) loss 3.6547 (2.9517) grad_norm 2.9404 (3.1014) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][160/1251] eta 0:04:15 lr 0.000320 wd 0.0500 time 0.2245 (0.2346) data time 0.0009 (0.0037) model time 0.2236 (0.2311) loss 1.6924 (2.9314) grad_norm 2.1388 (3.0827) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][170/1251] eta 0:04:14 lr 0.000320 wd 0.0500 time 0.2289 (0.2357) data time 0.0008 (0.0036) model time 0.2281 (0.2329) loss 3.6071 (2.9310) grad_norm 5.7000 (3.1286) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][180/1251] eta 0:04:12 lr 0.000320 wd 0.0500 time 0.2263 (0.2354) data time 0.0009 (0.0034) model time 0.2254 (0.2326) loss 3.2934 (2.9437) grad_norm 3.3070 (3.1266) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][190/1251] eta 0:04:09 lr 0.000320 wd 0.0500 time 0.2268 (0.2351) data time 0.0008 (0.0033) model time 0.2261 (0.2322) loss 2.2426 (2.9597) grad_norm 3.1035 (3.1150) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [194/300][200/1251] eta 0:04:06 lr 0.000320 wd 0.0500 time 0.2282 (0.2348) data time 0.0009 (0.0032) model time 0.2273 (0.2319) loss 2.5773 (2.9566) grad_norm 3.2076 (3.0946) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][210/1251] eta 0:04:04 lr 0.000320 wd 0.0500 time 0.2280 (0.2345) data time 0.0009 (0.0031) model time 0.2271 (0.2317) loss 2.8221 (2.9621) grad_norm 3.7253 (3.1071) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-27 21:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][220/1251] eta 0:04:01 lr 0.000320 wd 0.0500 time 0.2514 (0.2344) data time 0.0009 (0.0030) model time 0.2504 (0.2316) loss 2.4274 (2.9636) grad_norm 3.4892 (3.1214) loss_scale 512.0000 (265.2670) mem 7380MB [2024-08-27 21:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][230/1251] eta 0:03:58 lr 0.000320 wd 0.0500 time 0.2389 (0.2341) data time 0.0007 (0.0029) model time 0.2383 (0.2313) loss 2.5654 (2.9684) grad_norm 2.4124 (3.1123) loss_scale 512.0000 (275.9481) mem 7380MB [2024-08-27 21:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][240/1251] eta 0:03:56 lr 0.000320 wd 0.0500 time 0.2226 (0.2338) data time 0.0010 (0.0029) model time 0.2216 (0.2310) loss 3.2712 (2.9696) grad_norm 2.7526 (3.1102) loss_scale 512.0000 (285.7427) mem 7380MB [2024-08-27 21:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][250/1251] eta 0:03:53 lr 0.000320 wd 0.0500 time 0.2252 (0.2336) data time 0.0008 (0.0028) model time 0.2244 (0.2309) loss 3.4528 (2.9698) grad_norm 2.7007 (3.1047) loss_scale 512.0000 (294.7570) mem 7380MB [2024-08-27 21:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 21:41:33 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:41:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 21:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 21:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 21:46:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 21:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 21:46:48 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 21:46:49 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 21:46:50 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 21:46:50 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 194) [2024-08-27 21:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 21:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][260/1251] eta 0:43:22 lr 0.000320 wd 0.0500 time 0.2479 (2.6264) data time 0.0008 (0.1103) model time 0.2471 (2.5162) loss 3.8557 (3.4940) grad_norm 3.4984 (3.5991) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][270/1251] eta 0:18:34 lr 0.000320 wd 0.0500 time 0.2353 (1.1356) data time 0.0011 (0.0423) model time 0.2343 (1.0934) loss 2.9130 (3.2473) grad_norm 2.8616 (3.1700) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-27 21:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][280/1251] eta 0:12:48 lr 0.000320 wd 0.0500 time 0.2405 (0.7915) data time 0.0007 (0.0264) model time 0.2397 (0.7651) loss 3.0579 (3.2504) grad_norm 2.4527 (3.2548) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][290/1251] eta 0:10:13 lr 0.000320 wd 0.0500 time 0.2463 (0.6385) data time 0.0010 (0.0194) model time 0.2453 (0.6191) loss 3.3772 (3.2681) grad_norm 2.1866 (3.1393) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][300/1251] eta 0:08:44 lr 0.000319 wd 0.0500 time 0.2320 (0.5512) data time 0.0011 (0.0154) model time 0.2310 (0.5358) loss 2.9037 (3.2148) grad_norm 3.6218 (3.2668) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][310/1251] eta 0:07:46 lr 0.000319 wd 0.0500 time 0.2402 (0.4955) data time 0.0008 (0.0128) model time 0.2394 (0.4827) loss 3.5580 (3.2265) grad_norm 3.5833 (3.2538) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][320/1251] eta 0:07:05 lr 0.000319 wd 0.0500 time 0.2353 (0.4574) data time 0.0008 (0.0110) model time 0.2345 (0.4464) loss 2.6311 (3.2075) grad_norm 3.2120 (3.2667) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][330/1251] eta 0:06:34 lr 0.000319 wd 0.0500 time 0.2371 (0.4286) data time 0.0011 (0.0098) model time 0.2360 (0.4188) loss 3.0401 (3.1797) grad_norm 2.6209 (3.2223) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][340/1251] eta 0:06:10 lr 0.000319 wd 0.0500 time 0.2403 (0.4066) data time 0.0009 (0.0087) model time 0.2394 
(0.3978) loss 2.5093 (3.1472) grad_norm 3.6926 (3.2490) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][350/1251] eta 0:05:50 lr 0.000319 wd 0.0500 time 0.2355 (0.3895) data time 0.0008 (0.0081) model time 0.2347 (0.3815) loss 3.2326 (3.1615) grad_norm 4.0888 (3.2236) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][360/1251] eta 0:05:34 lr 0.000319 wd 0.0500 time 0.2386 (0.3753) data time 0.0010 (0.0074) model time 0.2376 (0.3679) loss 3.1852 (3.1856) grad_norm 2.7078 (3.2084) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][370/1251] eta 0:05:20 lr 0.000319 wd 0.0500 time 0.2423 (0.3638) data time 0.0008 (0.0069) model time 0.2416 (0.3569) loss 4.0495 (3.1775) grad_norm 3.4384 (3.1547) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 21:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 21:47:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:47:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 21:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 21:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 21:49:40 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 21:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 21:49:53 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 21:49:55 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 21:49:56 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 21:49:56 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 194) [2024-08-27 21:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 21:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][380/1251] eta 0:44:52 lr 0.000319 wd 0.0500 time 0.2278 (3.0910) data time 0.0007 (0.1975) model time 0.2271 (2.8935) loss 3.5498 (3.6858) grad_norm 2.8248 (3.9062) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][390/1251] eta 0:16:58 lr 0.000319 wd 0.0500 time 0.2279 (1.1835) data time 0.0010 (0.0666) model time 0.2269 (1.1169) loss 3.2625 (3.3551) grad_norm 4.0883 (3.7053) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][400/1251] eta 0:11:22 lr 0.000319 wd 0.0500 time 0.2247 (0.8014) data time 0.0011 (0.0404) model time 0.2236 (0.7610) loss 3.1794 (3.3443) grad_norm 2.4307 (3.5910) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][410/1251] eta 0:08:56 lr 0.000319 wd 0.0500 time 0.2320 (0.6384) data time 0.0010 (0.0292) model time 0.2310 (0.6092) loss 3.4421 (3.3084) grad_norm 3.3745 (3.4114) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][420/1251] eta 0:07:36 lr 0.000319 wd 0.0500 time 0.2305 (0.5487) data time 0.0009 (0.0229) model time 0.2296 (0.5258) loss 3.2048 (3.2466) grad_norm 2.8597 (3.2302) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][430/1251] eta 0:06:43 lr 0.000319 wd 0.0500 time 0.2293 (0.4910) data time 0.0008 (0.0190) model time 0.2285 (0.4721) loss 2.3226 (3.2232) grad_norm 2.7156 (3.1877) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][440/1251] eta 0:06:05 lr 0.000319 wd 0.0500 time 0.2320 (0.4513) data time 0.0009 (0.0162) model time 0.2310 (0.4351) loss 3.1713 (3.2103) grad_norm 2.4052 (3.1245) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][450/1251] eta 0:05:37 lr 0.000319 wd 0.0500 time 0.2372 (0.4214) data time 0.0016 (0.0142) model time 0.2355 (0.4072) loss 2.5901 (3.1676) grad_norm 2.5193 (3.1093) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][460/1251] eta 0:05:15 lr 0.000319 wd 0.0500 time 0.2455 (0.3993) data time 0.0008 (0.0127) model time 0.2447 (0.3866) loss 3.0775 (3.1320) grad_norm 4.4174 (3.1435) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][470/1251] eta 0:04:57 lr 0.000319 wd 0.0500 time 0.2353 (0.3815) data time 0.0011 (0.0115) model time 0.2342 (0.3700) loss 3.1221 (3.1221) grad_norm 2.6769 (3.1304) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][480/1251] eta 0:04:42 lr 0.000319 wd 0.0500 time 0.2407 (0.3669) data time 0.0010 (0.0105) model time 0.2397 (0.3564) loss 3.0131 (3.1459) 
grad_norm 3.6970 (3.1308) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][490/1251] eta 0:04:30 lr 0.000319 wd 0.0500 time 0.2336 (0.3551) data time 0.0008 (0.0097) model time 0.2328 (0.3454) loss 2.4217 (3.1354) grad_norm 2.8177 (3.1057) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][500/1251] eta 0:04:19 lr 0.000319 wd 0.0500 time 0.2366 (0.3452) data time 0.0009 (0.0090) model time 0.2357 (0.3363) loss 2.9781 (3.1338) grad_norm 3.2990 (3.0839) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][510/1251] eta 0:04:09 lr 0.000319 wd 0.0500 time 0.2441 (0.3371) data time 0.0008 (0.0084) model time 0.2433 (0.3287) loss 2.8947 (3.1358) grad_norm 3.0163 (3.1046) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][520/1251] eta 0:04:01 lr 0.000319 wd 0.0500 time 0.2509 (0.3298) data time 0.0013 (0.0079) model time 0.2496 (0.3219) loss 3.3644 (3.1289) grad_norm 2.9808 (3.1058) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][530/1251] eta 0:03:53 lr 0.000319 wd 0.0500 time 0.2317 (0.3233) data time 0.0010 (0.0075) model time 0.2306 (0.3159) loss 3.5000 (3.1199) grad_norm 3.2446 (3.0886) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][540/1251] eta 0:03:45 lr 0.000318 wd 0.0500 time 0.2232 (0.3178) data time 0.0010 (0.0071) model time 0.2222 (0.3107) loss 3.4651 (3.1181) grad_norm 2.6409 (3.0787) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][550/1251] eta 0:03:39 lr 0.000318 wd 0.0500 time 
0.2248 (0.3128) data time 0.0007 (0.0068) model time 0.2240 (0.3060) loss 2.8280 (3.1100) grad_norm 2.1705 (3.0872) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][560/1251] eta 0:03:33 lr 0.000318 wd 0.0500 time 0.2334 (0.3083) data time 0.0011 (0.0065) model time 0.2324 (0.3018) loss 3.1287 (3.1028) grad_norm 3.2593 (3.0775) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][570/1251] eta 0:03:27 lr 0.000318 wd 0.0500 time 0.2457 (0.3043) data time 0.0010 (0.0062) model time 0.2447 (0.2981) loss 1.8134 (3.0973) grad_norm 3.4989 (3.0808) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][580/1251] eta 0:03:21 lr 0.000318 wd 0.0500 time 0.2221 (0.3006) data time 0.0012 (0.0059) model time 0.2209 (0.2947) loss 3.0761 (3.0895) grad_norm 2.7710 (3.1672) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][590/1251] eta 0:03:16 lr 0.000318 wd 0.0500 time 0.2303 (0.2973) data time 0.0010 (0.0057) model time 0.2294 (0.2916) loss 2.9145 (3.0808) grad_norm 3.0011 (3.1516) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][600/1251] eta 0:03:11 lr 0.000318 wd 0.0500 time 0.2347 (0.2943) data time 0.0008 (0.0055) model time 0.2339 (0.2888) loss 3.4933 (3.0854) grad_norm 2.7950 (3.1235) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][610/1251] eta 0:03:06 lr 0.000318 wd 0.0500 time 0.2343 (0.2915) data time 0.0009 (0.0053) model time 0.2334 (0.2862) loss 3.1505 (3.0742) grad_norm 2.9684 (3.1576) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:11 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [194/300][620/1251] eta 0:03:02 lr 0.000318 wd 0.0500 time 0.2198 (0.2890) data time 0.0017 (0.0052) model time 0.2182 (0.2838) loss 3.2493 (3.0794) grad_norm 2.8410 (3.1634) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][630/1251] eta 0:02:58 lr 0.000318 wd 0.0500 time 0.2564 (0.2867) data time 0.0012 (0.0050) model time 0.2552 (0.2817) loss 3.0364 (3.0733) grad_norm 3.3860 (3.1529) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][640/1251] eta 0:02:53 lr 0.000318 wd 0.0500 time 0.2259 (0.2845) data time 0.0008 (0.0049) model time 0.2251 (0.2797) loss 2.2578 (3.0598) grad_norm 3.0439 (3.1417) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][650/1251] eta 0:02:49 lr 0.000318 wd 0.0500 time 0.2261 (0.2825) data time 0.0010 (0.0047) model time 0.2250 (0.2778) loss 3.5028 (3.0687) grad_norm 2.7100 (3.1509) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][660/1251] eta 0:02:45 lr 0.000318 wd 0.0500 time 0.2241 (0.2806) data time 0.0015 (0.0046) model time 0.2225 (0.2760) loss 3.1123 (3.0624) grad_norm 2.7024 (3.1401) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][670/1251] eta 0:02:42 lr 0.000318 wd 0.0500 time 0.2279 (0.2796) data time 0.0013 (0.0045) model time 0.2265 (0.2752) loss 3.0705 (3.0554) grad_norm 3.6156 (3.1445) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][680/1251] eta 0:02:38 lr 0.000318 wd 0.0500 time 0.2359 (0.2780) data time 0.0011 (0.0044) model time 0.2348 (0.2737) loss 2.3410 (3.0462) grad_norm 3.6436 
(3.1544) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][690/1251] eta 0:02:35 lr 0.000318 wd 0.0500 time 0.2308 (0.2773) data time 0.0009 (0.0043) model time 0.2298 (0.2730) loss 3.3983 (3.0537) grad_norm 2.2199 (3.1544) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][700/1251] eta 0:02:31 lr 0.000318 wd 0.0500 time 0.2277 (0.2758) data time 0.0009 (0.0042) model time 0.2268 (0.2717) loss 3.6034 (3.0636) grad_norm 3.4527 (3.1564) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][710/1251] eta 0:02:28 lr 0.000318 wd 0.0500 time 0.2342 (0.2746) data time 0.0007 (0.0041) model time 0.2335 (0.2705) loss 3.3247 (3.0604) grad_norm 3.0788 (3.1655) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][720/1251] eta 0:02:25 lr 0.000318 wd 0.0500 time 0.2284 (0.2734) data time 0.0011 (0.0040) model time 0.2273 (0.2694) loss 2.3948 (3.0615) grad_norm 7.2117 (3.1838) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][730/1251] eta 0:02:21 lr 0.000318 wd 0.0500 time 0.2262 (0.2722) data time 0.0010 (0.0039) model time 0.2253 (0.2682) loss 3.0223 (3.0654) grad_norm 2.8781 (3.1885) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][740/1251] eta 0:02:18 lr 0.000318 wd 0.0500 time 0.2445 (0.2711) data time 0.0013 (0.0039) model time 0.2432 (0.2673) loss 3.3135 (3.0630) grad_norm 3.7018 (3.1799) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][750/1251] eta 0:02:15 lr 0.000318 wd 0.0500 time 0.2192 (0.2699) data 
time 0.0008 (0.0038) model time 0.2184 (0.2661) loss 3.0107 (3.0634) grad_norm 3.4412 (3.1821) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][760/1251] eta 0:02:12 lr 0.000318 wd 0.0500 time 0.2322 (0.2690) data time 0.0009 (0.0037) model time 0.2313 (0.2653) loss 2.0477 (3.0567) grad_norm 3.2217 (3.2341) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][770/1251] eta 0:02:08 lr 0.000318 wd 0.0500 time 0.2323 (0.2681) data time 0.0009 (0.0037) model time 0.2313 (0.2644) loss 3.4157 (3.0600) grad_norm 3.1308 (3.2640) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][780/1251] eta 0:02:05 lr 0.000318 wd 0.0500 time 0.2310 (0.2672) data time 0.0006 (0.0036) model time 0.2304 (0.2635) loss 3.0941 (3.0622) grad_norm 3.7748 (3.2488) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][790/1251] eta 0:02:02 lr 0.000317 wd 0.0500 time 0.2264 (0.2663) data time 0.0011 (0.0036) model time 0.2253 (0.2627) loss 3.0510 (3.0641) grad_norm 2.6406 (3.2433) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][800/1251] eta 0:01:59 lr 0.000317 wd 0.0500 time 0.2302 (0.2654) data time 0.0010 (0.0035) model time 0.2292 (0.2619) loss 3.4632 (3.0643) grad_norm 2.6985 (3.2294) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][810/1251] eta 0:01:56 lr 0.000317 wd 0.0500 time 0.2268 (0.2646) data time 0.0009 (0.0035) model time 0.2259 (0.2611) loss 3.4577 (3.0718) grad_norm 2.4188 (3.2210) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:51:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [194/300][820/1251] eta 0:01:53 lr 0.000317 wd 0.0500 time 0.2264 (0.2638) data time 0.0010 (0.0034) model time 0.2254 (0.2604) loss 3.0608 (3.0720) grad_norm 2.4262 (3.2312) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][830/1251] eta 0:01:50 lr 0.000317 wd 0.0500 time 0.2255 (0.2631) data time 0.0009 (0.0034) model time 0.2246 (0.2597) loss 3.3344 (3.0677) grad_norm 3.4999 (3.2284) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][840/1251] eta 0:01:47 lr 0.000317 wd 0.0500 time 0.2289 (0.2623) data time 0.0007 (0.0033) model time 0.2281 (0.2590) loss 2.1611 (3.0619) grad_norm 3.1332 (3.2260) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][850/1251] eta 0:01:44 lr 0.000317 wd 0.0500 time 0.2473 (0.2618) data time 0.0008 (0.0033) model time 0.2465 (0.2585) loss 3.0826 (3.0556) grad_norm 2.2457 (3.2184) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][860/1251] eta 0:01:42 lr 0.000317 wd 0.0500 time 0.2205 (0.2611) data time 0.0011 (0.0032) model time 0.2194 (0.2578) loss 3.9050 (3.0605) grad_norm 3.0413 (3.2143) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][870/1251] eta 0:01:39 lr 0.000317 wd 0.0500 time 0.2374 (0.2604) data time 0.0007 (0.0032) model time 0.2367 (0.2572) loss 3.2396 (3.0598) grad_norm 4.6211 (3.2113) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][880/1251] eta 0:01:36 lr 0.000317 wd 0.0500 time 0.2341 (0.2598) data time 0.0011 (0.0032) model time 0.2331 (0.2567) loss 3.1040 (3.0588) grad_norm 3.5161 (3.2055) loss_scale 512.0000 
(512.0000) mem 7373MB [2024-08-27 21:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][890/1251] eta 0:01:33 lr 0.000317 wd 0.0500 time 0.2286 (0.2593) data time 0.0010 (0.0031) model time 0.2276 (0.2562) loss 3.0699 (3.0662) grad_norm 3.9668 (3.2053) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][900/1251] eta 0:01:30 lr 0.000317 wd 0.0500 time 0.2245 (0.2588) data time 0.0013 (0.0031) model time 0.2231 (0.2557) loss 3.5409 (3.0626) grad_norm 3.1381 (3.2139) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][910/1251] eta 0:01:28 lr 0.000317 wd 0.0500 time 0.2328 (0.2582) data time 0.0008 (0.0030) model time 0.2320 (0.2552) loss 2.5438 (3.0582) grad_norm 3.5798 (3.2274) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][920/1251] eta 0:01:25 lr 0.000317 wd 0.0500 time 0.2353 (0.2577) data time 0.0007 (0.0030) model time 0.2347 (0.2547) loss 2.8483 (3.0582) grad_norm 2.0974 (3.2367) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][930/1251] eta 0:01:22 lr 0.000317 wd 0.0500 time 0.2312 (0.2572) data time 0.0010 (0.0030) model time 0.2301 (0.2542) loss 3.0500 (3.0620) grad_norm 3.1359 (3.2503) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][940/1251] eta 0:01:19 lr 0.000317 wd 0.0500 time 0.2323 (0.2567) data time 0.0009 (0.0029) model time 0.2314 (0.2538) loss 2.9210 (3.0616) grad_norm 5.6137 (3.2585) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][950/1251] eta 0:01:17 lr 0.000317 wd 0.0500 time 0.2276 (0.2562) data time 0.0009 (0.0029) model 
time 0.2267 (0.2533) loss 3.1194 (3.0627) grad_norm 2.5903 (3.2715) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][960/1251] eta 0:01:14 lr 0.000317 wd 0.0500 time 0.2300 (0.2557) data time 0.0009 (0.0029) model time 0.2290 (0.2529) loss 2.9428 (3.0631) grad_norm 2.9799 (3.2673) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][970/1251] eta 0:01:11 lr 0.000317 wd 0.0500 time 0.2219 (0.2553) data time 0.0008 (0.0028) model time 0.2211 (0.2525) loss 3.4175 (3.0638) grad_norm 2.5348 (3.2591) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][980/1251] eta 0:01:09 lr 0.000317 wd 0.0500 time 0.2327 (0.2549) data time 0.0009 (0.0028) model time 0.2318 (0.2521) loss 3.8886 (3.0619) grad_norm 2.6289 (3.2579) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][990/1251] eta 0:01:06 lr 0.000317 wd 0.0500 time 0.2384 (0.2545) data time 0.0007 (0.0028) model time 0.2377 (0.2517) loss 3.4090 (3.0626) grad_norm 2.7920 (3.2502) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1000/1251] eta 0:01:03 lr 0.000317 wd 0.0500 time 0.2300 (0.2541) data time 0.0009 (0.0028) model time 0.2290 (0.2514) loss 3.0909 (3.0676) grad_norm 3.4097 (3.2429) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1010/1251] eta 0:01:01 lr 0.000317 wd 0.0500 time 0.2221 (0.2537) data time 0.0012 (0.0027) model time 0.2209 (0.2510) loss 3.0259 (3.0692) grad_norm 2.4674 (3.2390) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[194/300][1020/1251] eta 0:00:58 lr 0.000317 wd 0.0500 time 0.2312 (0.2534) data time 0.0007 (0.0027) model time 0.2304 (0.2506) loss 2.6513 (3.0647) grad_norm 3.4662 (3.2421) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1030/1251] eta 0:00:55 lr 0.000316 wd 0.0500 time 0.2314 (0.2530) data time 0.0009 (0.0027) model time 0.2305 (0.2503) loss 2.7990 (3.0650) grad_norm 2.8139 (3.2342) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1040/1251] eta 0:00:53 lr 0.000316 wd 0.0500 time 0.2352 (0.2527) data time 0.0007 (0.0027) model time 0.2345 (0.2500) loss 2.9075 (3.0594) grad_norm 3.2949 (3.2367) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1050/1251] eta 0:00:50 lr 0.000316 wd 0.0500 time 0.2289 (0.2524) data time 0.0008 (0.0026) model time 0.2280 (0.2498) loss 2.8460 (3.0637) grad_norm 1.8639 (3.2281) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1060/1251] eta 0:00:48 lr 0.000316 wd 0.0500 time 0.2181 (0.2520) data time 0.0012 (0.0026) model time 0.2169 (0.2494) loss 3.2908 (3.0633) grad_norm 3.0508 (3.2412) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1070/1251] eta 0:00:45 lr 0.000316 wd 0.0500 time 0.2300 (0.2517) data time 0.0009 (0.0026) model time 0.2292 (0.2491) loss 2.7212 (3.0604) grad_norm 2.9985 (3.2521) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1080/1251] eta 0:00:43 lr 0.000316 wd 0.0500 time 0.2260 (0.2515) data time 0.0011 (0.0026) model time 0.2249 (0.2489) loss 3.9559 (3.0614) grad_norm 2.5860 (3.2455) loss_scale 512.0000 
(512.0000) mem 7373MB [2024-08-27 21:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1090/1251] eta 0:00:40 lr 0.000316 wd 0.0500 time 0.2262 (0.2512) data time 0.0009 (0.0026) model time 0.2252 (0.2486) loss 3.5552 (3.0567) grad_norm 3.7503 (3.2432) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1100/1251] eta 0:00:37 lr 0.000316 wd 0.0500 time 0.2232 (0.2509) data time 0.0007 (0.0025) model time 0.2225 (0.2484) loss 2.7611 (3.0565) grad_norm 2.9476 (3.2376) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1110/1251] eta 0:00:35 lr 0.000316 wd 0.0500 time 0.2210 (0.2506) data time 0.0009 (0.0025) model time 0.2201 (0.2481) loss 3.6611 (3.0570) grad_norm 3.4707 (3.2346) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1120/1251] eta 0:00:32 lr 0.000316 wd 0.0500 time 0.2273 (0.2503) data time 0.0011 (0.0025) model time 0.2263 (0.2478) loss 3.1341 (3.0580) grad_norm 2.1748 (3.2334) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1130/1251] eta 0:00:30 lr 0.000316 wd 0.0500 time 0.2578 (0.2501) data time 0.0009 (0.0025) model time 0.2569 (0.2476) loss 2.1928 (3.0545) grad_norm 2.8975 (3.2246) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1140/1251] eta 0:00:27 lr 0.000316 wd 0.0500 time 0.2329 (0.2499) data time 0.0006 (0.0025) model time 0.2322 (0.2474) loss 3.0339 (3.0581) grad_norm 2.8359 (3.2270) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1150/1251] eta 0:00:25 lr 0.000316 wd 0.0500 time 0.2239 (0.2496) data time 0.0009 (0.0025) 
model time 0.2229 (0.2472) loss 2.5163 (3.0600) grad_norm 4.0609 (3.2289) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1160/1251] eta 0:00:22 lr 0.000316 wd 0.0500 time 0.2325 (0.2494) data time 0.0010 (0.0024) model time 0.2315 (0.2470) loss 2.2458 (3.0587) grad_norm 3.7809 (3.2276) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1170/1251] eta 0:00:20 lr 0.000316 wd 0.0500 time 0.2313 (0.2492) data time 0.0008 (0.0024) model time 0.2305 (0.2467) loss 2.9814 (3.0598) grad_norm 4.0366 (3.2236) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1180/1251] eta 0:00:17 lr 0.000316 wd 0.0500 time 0.2256 (0.2489) data time 0.0008 (0.0024) model time 0.2248 (0.2465) loss 3.6157 (3.0573) grad_norm 3.2477 (3.2358) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1190/1251] eta 0:00:15 lr 0.000316 wd 0.0500 time 0.2476 (0.2490) data time 0.0008 (0.0024) model time 0.2468 (0.2466) loss 3.2082 (3.0516) grad_norm 3.0649 (3.2366) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1200/1251] eta 0:00:12 lr 0.000316 wd 0.0500 time 0.2286 (0.2487) data time 0.0007 (0.0024) model time 0.2279 (0.2464) loss 2.2463 (3.0498) grad_norm 3.7028 (3.2420) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1210/1251] eta 0:00:10 lr 0.000316 wd 0.0500 time 0.2284 (0.2487) data time 0.0011 (0.0024) model time 0.2273 (0.2463) loss 2.6371 (3.0468) grad_norm 2.8283 (3.2369) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[194/300][1220/1251] eta 0:00:07 lr 0.000316 wd 0.0500 time 0.2253 (0.2485) data time 0.0010 (0.0024) model time 0.2243 (0.2461) loss 2.1309 (3.0446) grad_norm 2.3543 (3.2330) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1230/1251] eta 0:00:05 lr 0.000316 wd 0.0500 time 0.2254 (0.2483) data time 0.0008 (0.0023) model time 0.2246 (0.2459) loss 3.3085 (3.0449) grad_norm 2.1419 (3.2312) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1240/1251] eta 0:00:02 lr 0.000316 wd 0.0500 time 0.2113 (0.2480) data time 0.0006 (0.0023) model time 0.2107 (0.2456) loss 2.8231 (3.0434) grad_norm 2.9515 (3.2308) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [194/300][1250/1251] eta 0:00:00 lr 0.000316 wd 0.0500 time 0.2113 (0.2476) data time 0.0004 (0.0023) model time 0.2109 (0.2453) loss 2.7661 (3.0453) grad_norm 4.2255 (3.2330) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 21:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 194 training takes 0:03:36 [2024-08-27 21:53:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:53:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 21:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.422 (0.422) Loss 0.4331 (0.4331) Acc@1 92.480 (92.480) Acc@5 98.242 (98.242) Mem 7373MB
[2024-08-27 21:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.093 (0.114) Loss 0.6748 (0.6843) Acc@1 86.816 (85.724) Acc@5 97.168 (97.141) Mem 7373MB
[2024-08-27 21:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.099) Loss 1.0088 (0.7099) Acc@1 77.539 (84.863) Acc@5 94.238 (97.103) Mem 7373MB
[2024-08-27 21:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.093) Loss 1.2012 (0.8051) Acc@1 71.680 (82.532) Acc@5 91.699 (96.091) Mem 7373MB
[2024-08-27 21:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0488 (0.8578) Acc@1 73.730 (81.186) Acc@5 93.945 (95.446) Mem 7373MB
[2024-08-27 21:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.794 Acc@5 95.470
[2024-08-27 21:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.8%
[2024-08-27 21:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.79%
[2024-08-27 21:53:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-27 21:53:47 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-27 21:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.553 (0.553) Loss 0.3889 (0.3889) Acc@1 93.164 (93.164) Acc@5 98.438 (98.438) Mem 7373MB
[2024-08-27 21:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.126) Loss 0.6094 (0.6137) Acc@1 88.672 (87.180) Acc@5 97.363 (97.532) Mem 7373MB
[2024-08-27 21:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.105) Loss 0.8730 (0.6395) Acc@1 78.711 (86.156) Acc@5 95.898 (97.586) Mem 7373MB
[2024-08-27 21:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.091 (0.097) Loss 1.1113 (0.7241) Acc@1 72.461 (84.038) Acc@5 92.578 (96.673) Mem 7373MB
[2024-08-27 21:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.090) Loss 0.9790 (0.7674) Acc@1 75.781 (82.755) Acc@5 94.238 (96.218) Mem 7373MB
[2024-08-27 21:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.336 Acc@5 96.216
[2024-08-27 21:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.3%
[2024-08-27 21:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.34%
[2024-08-27 21:53:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 21:53:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 21:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][0/1251] eta 0:17:36 lr 0.000316 wd 0.0500 time 0.8444 (0.8444) data time 0.5423 (0.5423) model time 0.0000 (0.0000) loss 2.8115 (2.8115) grad_norm 2.6089 (2.6089) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-27 21:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][10/1251] eta 0:05:54 lr 0.000316 wd 0.0500 time 0.2308 (0.2858) data time 0.0007 (0.0502) model time 0.0000 (0.0000) loss 3.7522 (2.9027) grad_norm 4.6241 (3.4481) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 21:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][20/1251] eta 0:05:18 lr 0.000315 wd 0.0500 time 0.2315 (0.2589) data time 0.0013 (0.0269) model time 0.0000 (0.0000) loss 3.6401 (2.9796) grad_norm 3.0580 (3.3174) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 21:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][30/1251] eta 0:05:04 lr 0.000315 wd 0.0500 time 0.2263 (0.2497) data time 0.0014 (0.0186) model time 0.0000 (0.0000) loss 3.3262 (2.9849) grad_norm 2.5815 (3.3021) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 21:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][40/1251] eta 0:04:56 lr 0.000315 wd 0.0500 time 0.2260 (0.2447) data time 0.0009 (0.0143) model time 0.0000 (0.0000) loss 3.9102 (2.9619) grad_norm 3.1696 (inf) loss_scale 256.0000 (499.5122) mem 7379MB [2024-08-27 21:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][50/1251] eta 0:04:50 lr 0.000315 wd 0.0500 time 0.2300 (0.2419) data time 0.0009 (0.0117) model time 0.0000 (0.0000) loss 3.1704 (3.0238) grad_norm 2.7737 (inf) loss_scale 256.0000 (451.7647) mem 7379MB [2024-08-27 21:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][60/1251] eta 0:04:45 lr 0.000315 wd 0.0500 time 0.2220 (0.2399) data time 0.0007 (0.0100) model time 0.2213 (0.2287) loss 2.0642 
(3.0075) grad_norm 2.4370 (inf) loss_scale 256.0000 (419.6721) mem 7379MB [2024-08-27 21:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][70/1251] eta 0:04:41 lr 0.000315 wd 0.0500 time 0.2441 (0.2383) data time 0.0010 (0.0088) model time 0.2430 (0.2282) loss 2.8666 (2.9955) grad_norm 4.7301 (inf) loss_scale 256.0000 (396.6197) mem 7379MB [2024-08-27 21:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][80/1251] eta 0:04:38 lr 0.000315 wd 0.0500 time 0.2332 (0.2377) data time 0.0007 (0.0078) model time 0.2326 (0.2296) loss 3.5130 (2.9630) grad_norm 4.1838 (inf) loss_scale 256.0000 (379.2593) mem 7379MB [2024-08-27 21:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][90/1251] eta 0:04:34 lr 0.000315 wd 0.0500 time 0.2294 (0.2368) data time 0.0012 (0.0071) model time 0.2282 (0.2293) loss 3.0443 (2.9632) grad_norm 3.7620 (inf) loss_scale 256.0000 (365.7143) mem 7379MB [2024-08-27 21:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][100/1251] eta 0:04:32 lr 0.000315 wd 0.0500 time 0.2258 (0.2366) data time 0.0009 (0.0065) model time 0.2249 (0.2300) loss 3.2039 (2.9953) grad_norm 2.4132 (inf) loss_scale 256.0000 (354.8515) mem 7379MB [2024-08-27 21:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][110/1251] eta 0:04:29 lr 0.000315 wd 0.0500 time 0.2273 (0.2362) data time 0.0012 (0.0060) model time 0.2261 (0.2302) loss 2.2158 (2.9860) grad_norm 2.1117 (inf) loss_scale 256.0000 (345.9459) mem 7379MB [2024-08-27 21:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][120/1251] eta 0:04:26 lr 0.000315 wd 0.0500 time 0.2242 (0.2357) data time 0.0010 (0.0056) model time 0.2233 (0.2301) loss 2.8419 (2.9987) grad_norm 2.9376 (inf) loss_scale 256.0000 (338.5124) mem 7379MB [2024-08-27 21:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][130/1251] eta 0:04:23 lr 0.000315 wd 0.0500 time 0.2446 (0.2354) 
data time 0.0011 (0.0053) model time 0.2436 (0.2301) loss 2.9860 (2.9873) grad_norm 2.3338 (inf) loss_scale 256.0000 (332.2137) mem 7379MB [2024-08-27 21:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][140/1251] eta 0:04:21 lr 0.000315 wd 0.0500 time 0.2248 (0.2351) data time 0.0010 (0.0052) model time 0.2238 (0.2297) loss 2.3883 (3.0025) grad_norm 3.5472 (inf) loss_scale 256.0000 (326.8085) mem 7379MB [2024-08-27 21:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][150/1251] eta 0:04:18 lr 0.000315 wd 0.0500 time 0.2271 (0.2348) data time 0.0010 (0.0049) model time 0.2261 (0.2297) loss 3.1101 (2.9841) grad_norm 3.3724 (inf) loss_scale 256.0000 (322.1192) mem 7379MB [2024-08-27 21:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][160/1251] eta 0:04:15 lr 0.000315 wd 0.0500 time 0.2258 (0.2344) data time 0.0008 (0.0047) model time 0.2250 (0.2295) loss 3.8828 (2.9916) grad_norm 3.4443 (inf) loss_scale 256.0000 (318.0124) mem 7379MB [2024-08-27 21:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][170/1251] eta 0:04:13 lr 0.000315 wd 0.0500 time 0.2770 (0.2345) data time 0.0009 (0.0045) model time 0.2760 (0.2300) loss 3.3424 (2.9966) grad_norm 4.0010 (inf) loss_scale 256.0000 (314.3860) mem 7379MB [2024-08-27 21:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][180/1251] eta 0:04:10 lr 0.000315 wd 0.0500 time 0.2348 (0.2343) data time 0.0010 (0.0043) model time 0.2339 (0.2299) loss 3.0520 (3.0031) grad_norm 2.4383 (inf) loss_scale 256.0000 (311.1602) mem 7379MB [2024-08-27 21:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][190/1251] eta 0:04:08 lr 0.000315 wd 0.0500 time 0.2348 (0.2340) data time 0.0009 (0.0041) model time 0.2339 (0.2298) loss 2.7372 (3.0103) grad_norm 2.2671 (inf) loss_scale 256.0000 (308.2723) mem 7379MB [2024-08-27 21:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[195/300][200/1251] eta 0:04:05 lr 0.000315 wd 0.0500 time 0.2357 (0.2337) data time 0.0013 (0.0040) model time 0.2344 (0.2295) loss 2.8289 (3.0034) grad_norm 3.6549 (inf) loss_scale 256.0000 (305.6716) mem 7379MB [2024-08-27 21:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][210/1251] eta 0:04:03 lr 0.000315 wd 0.0500 time 0.2317 (0.2336) data time 0.0007 (0.0038) model time 0.2310 (0.2296) loss 3.0655 (3.0042) grad_norm 2.7171 (inf) loss_scale 256.0000 (303.3175) mem 7379MB [2024-08-27 21:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][220/1251] eta 0:04:00 lr 0.000315 wd 0.0500 time 0.2318 (0.2334) data time 0.0007 (0.0037) model time 0.2310 (0.2296) loss 3.3560 (3.0003) grad_norm 3.0097 (inf) loss_scale 256.0000 (301.1765) mem 7379MB [2024-08-27 21:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][230/1251] eta 0:03:58 lr 0.000315 wd 0.0500 time 0.2300 (0.2333) data time 0.0011 (0.0036) model time 0.2289 (0.2296) loss 3.0502 (3.0031) grad_norm 1.9511 (inf) loss_scale 256.0000 (299.2208) mem 7379MB [2024-08-27 21:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][240/1251] eta 0:03:55 lr 0.000315 wd 0.0500 time 0.2249 (0.2330) data time 0.0010 (0.0035) model time 0.2238 (0.2294) loss 3.4339 (3.0101) grad_norm 2.9006 (inf) loss_scale 256.0000 (297.4274) mem 7379MB [2024-08-27 21:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][250/1251] eta 0:03:53 lr 0.000315 wd 0.0500 time 0.2253 (0.2329) data time 0.0009 (0.0034) model time 0.2244 (0.2293) loss 3.1708 (3.0120) grad_norm 2.4198 (inf) loss_scale 256.0000 (295.7769) mem 7379MB [2024-08-27 21:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][260/1251] eta 0:03:50 lr 0.000315 wd 0.0500 time 0.2318 (0.2329) data time 0.0013 (0.0033) model time 0.2304 (0.2294) loss 3.4558 (3.0124) grad_norm 2.5172 (inf) loss_scale 256.0000 (294.2529) mem 7379MB [2024-08-27 
21:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][270/1251] eta 0:03:48 lr 0.000314 wd 0.0500 time 0.2316 (0.2327) data time 0.0008 (0.0032) model time 0.2308 (0.2293) loss 2.3095 (3.0106) grad_norm 13.7643 (inf) loss_scale 256.0000 (292.8413) mem 7379MB [2024-08-27 21:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][280/1251] eta 0:03:45 lr 0.000314 wd 0.0500 time 0.2390 (0.2327) data time 0.0008 (0.0032) model time 0.2382 (0.2294) loss 3.5116 (3.0162) grad_norm 2.3603 (inf) loss_scale 256.0000 (291.5302) mem 7379MB [2024-08-27 21:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][290/1251] eta 0:03:43 lr 0.000314 wd 0.0500 time 0.2204 (0.2325) data time 0.0007 (0.0031) model time 0.2197 (0.2293) loss 3.6726 (3.0099) grad_norm 3.3956 (inf) loss_scale 256.0000 (290.3093) mem 7379MB [2024-08-27 21:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][300/1251] eta 0:03:41 lr 0.000314 wd 0.0500 time 0.2299 (0.2324) data time 0.0011 (0.0030) model time 0.2288 (0.2293) loss 3.2457 (3.0179) grad_norm 3.0691 (inf) loss_scale 256.0000 (289.1694) mem 7379MB [2024-08-27 21:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][310/1251] eta 0:03:38 lr 0.000314 wd 0.0500 time 0.2297 (0.2324) data time 0.0011 (0.0030) model time 0.2286 (0.2293) loss 3.3001 (3.0293) grad_norm 3.4297 (inf) loss_scale 256.0000 (288.1029) mem 7379MB [2024-08-27 21:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][320/1251] eta 0:03:36 lr 0.000314 wd 0.0500 time 0.2349 (0.2323) data time 0.0009 (0.0029) model time 0.2340 (0.2292) loss 3.0258 (3.0298) grad_norm 2.6459 (inf) loss_scale 256.0000 (287.1028) mem 7379MB [2024-08-27 21:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][330/1251] eta 0:03:33 lr 0.000314 wd 0.0500 time 0.2384 (0.2322) data time 0.0007 (0.0028) model time 0.2377 (0.2293) loss 3.3159 (3.0368) grad_norm 
3.0055 (inf) loss_scale 256.0000 (286.1631) mem 7379MB [2024-08-27 21:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][340/1251] eta 0:03:31 lr 0.000314 wd 0.0500 time 0.2351 (0.2322) data time 0.0007 (0.0028) model time 0.2343 (0.2292) loss 2.2542 (3.0351) grad_norm 2.7250 (inf) loss_scale 256.0000 (285.2786) mem 7379MB [2024-08-27 21:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][350/1251] eta 0:03:29 lr 0.000314 wd 0.0500 time 0.2266 (0.2322) data time 0.0007 (0.0027) model time 0.2259 (0.2293) loss 2.3360 (3.0328) grad_norm 2.4925 (inf) loss_scale 256.0000 (284.4444) mem 7379MB [2024-08-27 21:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][360/1251] eta 0:03:26 lr 0.000314 wd 0.0500 time 0.2285 (0.2321) data time 0.0007 (0.0027) model time 0.2278 (0.2293) loss 2.0066 (3.0283) grad_norm 2.6350 (inf) loss_scale 256.0000 (283.6565) mem 7379MB [2024-08-27 21:55:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][370/1251] eta 0:03:24 lr 0.000314 wd 0.0500 time 0.2243 (0.2320) data time 0.0010 (0.0027) model time 0.2233 (0.2293) loss 3.0993 (3.0302) grad_norm 3.6871 (inf) loss_scale 256.0000 (282.9111) mem 7379MB [2024-08-27 21:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][380/1251] eta 0:03:22 lr 0.000314 wd 0.0500 time 0.2277 (0.2320) data time 0.0009 (0.0026) model time 0.2268 (0.2292) loss 3.6128 (3.0278) grad_norm 2.9015 (inf) loss_scale 256.0000 (282.2047) mem 7379MB [2024-08-27 21:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][390/1251] eta 0:03:19 lr 0.000314 wd 0.0500 time 0.2273 (0.2319) data time 0.0009 (0.0026) model time 0.2263 (0.2292) loss 3.5236 (3.0314) grad_norm 4.0317 (inf) loss_scale 256.0000 (281.5345) mem 7379MB [2024-08-27 21:55:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][400/1251] eta 0:03:17 lr 0.000314 wd 0.0500 time 0.2343 (0.2323) data time 0.0009 
(0.0025) model time 0.2333 (0.2297) loss 3.7452 (3.0389) grad_norm 3.4423 (inf) loss_scale 256.0000 (280.8978) mem 7379MB [2024-08-27 21:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][410/1251] eta 0:03:15 lr 0.000314 wd 0.0500 time 0.2222 (0.2323) data time 0.0008 (0.0025) model time 0.2214 (0.2298) loss 3.4320 (3.0467) grad_norm 2.2818 (inf) loss_scale 256.0000 (280.2920) mem 7379MB [2024-08-27 21:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][420/1251] eta 0:03:12 lr 0.000314 wd 0.0500 time 0.2248 (0.2322) data time 0.0007 (0.0025) model time 0.2240 (0.2297) loss 3.3139 (3.0501) grad_norm 2.8453 (inf) loss_scale 256.0000 (279.7150) mem 7379MB [2024-08-27 21:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][430/1251] eta 0:03:10 lr 0.000314 wd 0.0500 time 0.2287 (0.2322) data time 0.0007 (0.0024) model time 0.2280 (0.2297) loss 2.9777 (3.0506) grad_norm 2.7163 (inf) loss_scale 256.0000 (279.1647) mem 7379MB [2024-08-27 21:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][440/1251] eta 0:03:08 lr 0.000314 wd 0.0500 time 0.2273 (0.2322) data time 0.0012 (0.0024) model time 0.2261 (0.2297) loss 3.1100 (3.0442) grad_norm 2.1599 (inf) loss_scale 256.0000 (278.6395) mem 7379MB [2024-08-27 21:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][450/1251] eta 0:03:05 lr 0.000314 wd 0.0500 time 0.2329 (0.2322) data time 0.0010 (0.0024) model time 0.2320 (0.2297) loss 3.1812 (3.0432) grad_norm 3.3432 (inf) loss_scale 256.0000 (278.1375) mem 7379MB [2024-08-27 21:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][460/1251] eta 0:03:03 lr 0.000314 wd 0.0500 time 0.2288 (0.2321) data time 0.0010 (0.0023) model time 0.2278 (0.2297) loss 2.5708 (3.0425) grad_norm 2.1370 (inf) loss_scale 256.0000 (277.6573) mem 7379MB [2024-08-27 21:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][470/1251] eta 
0:03:01 lr 0.000314 wd 0.0500 time 0.2299 (0.2320) data time 0.0011 (0.0023) model time 0.2287 (0.2297) loss 3.3629 (3.0468) grad_norm 2.6957 (inf) loss_scale 256.0000 (277.1975) mem 7379MB [2024-08-27 21:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][480/1251] eta 0:02:58 lr 0.000314 wd 0.0500 time 0.2294 (0.2320) data time 0.0014 (0.0023) model time 0.2280 (0.2296) loss 3.3072 (3.0519) grad_norm 2.6448 (inf) loss_scale 256.0000 (276.7568) mem 7379MB [2024-08-27 21:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][490/1251] eta 0:02:56 lr 0.000314 wd 0.0500 time 0.2229 (0.2324) data time 0.0011 (0.0023) model time 0.2218 (0.2302) loss 3.1465 (3.0567) grad_norm 2.0115 (inf) loss_scale 256.0000 (276.3340) mem 7379MB [2024-08-27 21:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][500/1251] eta 0:02:54 lr 0.000314 wd 0.0500 time 0.2225 (0.2324) data time 0.0010 (0.0023) model time 0.2215 (0.2301) loss 3.3729 (3.0629) grad_norm 3.0779 (inf) loss_scale 256.0000 (275.9281) mem 7379MB [2024-08-27 21:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][510/1251] eta 0:02:52 lr 0.000313 wd 0.0500 time 0.2223 (0.2324) data time 0.0009 (0.0022) model time 0.2214 (0.2301) loss 3.0433 (3.0639) grad_norm 3.4487 (inf) loss_scale 256.0000 (275.5382) mem 7379MB [2024-08-27 21:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][520/1251] eta 0:02:49 lr 0.000313 wd 0.0500 time 0.2208 (0.2323) data time 0.0009 (0.0022) model time 0.2199 (0.2301) loss 2.4794 (3.0607) grad_norm 3.4251 (inf) loss_scale 256.0000 (275.1631) mem 7379MB [2024-08-27 21:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][530/1251] eta 0:02:47 lr 0.000313 wd 0.0500 time 0.2268 (0.2323) data time 0.0010 (0.0022) model time 0.2258 (0.2301) loss 3.2818 (3.0648) grad_norm 3.3667 (inf) loss_scale 256.0000 (274.8023) mem 7379MB [2024-08-27 21:55:57 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][540/1251] eta 0:02:45 lr 0.000313 wd 0.0500 time 0.2304 (0.2323) data time 0.0009 (0.0022) model time 0.2295 (0.2301) loss 2.1074 (3.0656) grad_norm 2.8478 (inf) loss_scale 256.0000 (274.4547) mem 7379MB [2024-08-27 21:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][550/1251] eta 0:02:42 lr 0.000313 wd 0.0500 time 0.2337 (0.2323) data time 0.0007 (0.0022) model time 0.2330 (0.2301) loss 2.7175 (3.0652) grad_norm 3.1448 (inf) loss_scale 256.0000 (274.1198) mem 7379MB [2024-08-27 21:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][560/1251] eta 0:02:40 lr 0.000313 wd 0.0500 time 0.2316 (0.2322) data time 0.0009 (0.0022) model time 0.2307 (0.2301) loss 3.1957 (3.0659) grad_norm 2.5386 (inf) loss_scale 256.0000 (273.7968) mem 7379MB [2024-08-27 21:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][570/1251] eta 0:02:38 lr 0.000313 wd 0.0500 time 0.2306 (0.2322) data time 0.0012 (0.0022) model time 0.2294 (0.2300) loss 3.3640 (3.0659) grad_norm 2.6361 (inf) loss_scale 256.0000 (273.4851) mem 7379MB [2024-08-27 21:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][580/1251] eta 0:02:35 lr 0.000313 wd 0.0500 time 0.2261 (0.2322) data time 0.0009 (0.0022) model time 0.2252 (0.2300) loss 3.9031 (3.0674) grad_norm 3.4957 (inf) loss_scale 256.0000 (273.1842) mem 7379MB [2024-08-27 21:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][590/1251] eta 0:02:33 lr 0.000313 wd 0.0500 time 0.2287 (0.2321) data time 0.0010 (0.0021) model time 0.2277 (0.2300) loss 2.0738 (3.0689) grad_norm 3.0524 (inf) loss_scale 256.0000 (272.8934) mem 7379MB [2024-08-27 21:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][600/1251] eta 0:02:31 lr 0.000313 wd 0.0500 time 0.2310 (0.2321) data time 0.0009 (0.0021) model time 0.2301 (0.2299) loss 2.7911 (3.0659) grad_norm 2.5815 
(inf) loss_scale 256.0000 (272.6123) mem 7379MB [2024-08-27 21:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][610/1251] eta 0:02:28 lr 0.000313 wd 0.0500 time 0.2395 (0.2321) data time 0.0010 (0.0021) model time 0.2385 (0.2299) loss 3.6567 (3.0599) grad_norm 4.0370 (inf) loss_scale 256.0000 (272.3404) mem 7379MB [2024-08-27 21:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][620/1251] eta 0:02:26 lr 0.000313 wd 0.0500 time 0.2250 (0.2320) data time 0.0010 (0.0021) model time 0.2240 (0.2299) loss 3.1243 (3.0564) grad_norm 2.3073 (inf) loss_scale 256.0000 (272.0773) mem 7379MB [2024-08-27 21:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][630/1251] eta 0:02:24 lr 0.000313 wd 0.0500 time 0.2241 (0.2320) data time 0.0016 (0.0021) model time 0.2225 (0.2299) loss 3.1926 (3.0583) grad_norm 3.1273 (inf) loss_scale 256.0000 (271.8225) mem 7379MB [2024-08-27 21:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][640/1251] eta 0:02:21 lr 0.000313 wd 0.0500 time 0.2223 (0.2320) data time 0.0011 (0.0021) model time 0.2212 (0.2299) loss 2.8873 (3.0591) grad_norm 2.4719 (inf) loss_scale 256.0000 (271.5757) mem 7379MB [2024-08-27 21:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][650/1251] eta 0:02:19 lr 0.000313 wd 0.0500 time 0.2287 (0.2319) data time 0.0007 (0.0021) model time 0.2280 (0.2299) loss 3.2480 (3.0616) grad_norm 2.1930 (inf) loss_scale 256.0000 (271.3364) mem 7379MB [2024-08-27 21:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][660/1251] eta 0:02:17 lr 0.000313 wd 0.0500 time 0.2281 (0.2320) data time 0.0010 (0.0020) model time 0.2272 (0.2299) loss 3.0764 (3.0590) grad_norm 5.1201 (inf) loss_scale 256.0000 (271.1044) mem 7379MB [2024-08-27 21:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][670/1251] eta 0:02:14 lr 0.000313 wd 0.0500 time 0.2282 (0.2320) data time 0.0007 (0.0020) 
model time 0.2275 (0.2300) loss 2.8509 (3.0558) grad_norm 3.3366 (inf) loss_scale 256.0000 (270.8793) mem 7379MB [2024-08-27 21:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][680/1251] eta 0:02:12 lr 0.000313 wd 0.0500 time 0.2288 (0.2320) data time 0.0011 (0.0020) model time 0.2277 (0.2300) loss 3.0655 (3.0551) grad_norm 4.1398 (inf) loss_scale 256.0000 (270.6608) mem 7379MB [2024-08-27 21:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][690/1251] eta 0:02:10 lr 0.000313 wd 0.0500 time 0.2253 (0.2320) data time 0.0008 (0.0020) model time 0.2244 (0.2300) loss 3.0339 (3.0500) grad_norm 3.2651 (inf) loss_scale 256.0000 (270.4486) mem 7379MB [2024-08-27 21:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][700/1251] eta 0:02:07 lr 0.000313 wd 0.0500 time 0.2221 (0.2320) data time 0.0011 (0.0020) model time 0.2209 (0.2300) loss 2.8237 (3.0496) grad_norm 2.7130 (inf) loss_scale 256.0000 (270.2425) mem 7379MB [2024-08-27 21:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][710/1251] eta 0:02:05 lr 0.000313 wd 0.0500 time 0.2284 (0.2320) data time 0.0007 (0.0020) model time 0.2277 (0.2300) loss 3.9556 (3.0487) grad_norm 3.1734 (inf) loss_scale 256.0000 (270.0422) mem 7379MB [2024-08-27 21:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][720/1251] eta 0:02:03 lr 0.000313 wd 0.0500 time 0.2279 (0.2320) data time 0.0008 (0.0020) model time 0.2271 (0.2300) loss 3.1134 (3.0513) grad_norm 2.9454 (inf) loss_scale 256.0000 (269.8474) mem 7379MB [2024-08-27 21:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][730/1251] eta 0:02:00 lr 0.000313 wd 0.0500 time 0.2295 (0.2320) data time 0.0009 (0.0020) model time 0.2286 (0.2300) loss 3.0098 (3.0550) grad_norm 3.1749 (inf) loss_scale 256.0000 (269.6580) mem 7379MB [2024-08-27 21:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][740/1251] eta 0:01:58 lr 
0.000313 wd 0.0500 time 0.2312 (0.2319) data time 0.0010 (0.0020) model time 0.2302 (0.2299) loss 3.4474 (3.0556) grad_norm 3.5888 (inf) loss_scale 256.0000 (269.4737) mem 7379MB [2024-08-27 21:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][750/1251] eta 0:01:56 lr 0.000312 wd 0.0500 time 0.2209 (0.2319) data time 0.0010 (0.0020) model time 0.2198 (0.2299) loss 3.4661 (3.0555) grad_norm 2.7602 (inf) loss_scale 256.0000 (269.2943) mem 7379MB [2024-08-27 21:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][760/1251] eta 0:01:53 lr 0.000312 wd 0.0500 time 0.2269 (0.2319) data time 0.0011 (0.0020) model time 0.2258 (0.2299) loss 3.7344 (3.0583) grad_norm 2.4853 (inf) loss_scale 256.0000 (269.1196) mem 7379MB [2024-08-27 21:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][770/1251] eta 0:01:51 lr 0.000312 wd 0.0500 time 0.2259 (0.2319) data time 0.0008 (0.0020) model time 0.2251 (0.2299) loss 1.6812 (3.0528) grad_norm 2.9750 (inf) loss_scale 256.0000 (268.9494) mem 7379MB [2024-08-27 21:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][780/1251] eta 0:01:49 lr 0.000312 wd 0.0500 time 0.2348 (0.2319) data time 0.0009 (0.0019) model time 0.2339 (0.2299) loss 2.7422 (3.0475) grad_norm 2.6179 (inf) loss_scale 256.0000 (268.7836) mem 7379MB [2024-08-27 21:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][790/1251] eta 0:01:46 lr 0.000312 wd 0.0500 time 0.2250 (0.2319) data time 0.0011 (0.0019) model time 0.2239 (0.2299) loss 3.2637 (3.0460) grad_norm 3.2765 (inf) loss_scale 256.0000 (268.6220) mem 7379MB [2024-08-27 21:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][800/1251] eta 0:01:44 lr 0.000312 wd 0.0500 time 0.2280 (0.2319) data time 0.0009 (0.0019) model time 0.2271 (0.2299) loss 3.6163 (3.0469) grad_norm 2.6730 (inf) loss_scale 256.0000 (268.4644) mem 7379MB [2024-08-27 21:57:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [195/300][810/1251] eta 0:01:42 lr 0.000312 wd 0.0500 time 0.2334 (0.2319) data time 0.0008 (0.0019) model time 0.2326 (0.2300) loss 3.1895 (3.0471) grad_norm 2.8039 (inf) loss_scale 256.0000 (268.3107) mem 7379MB [2024-08-27 21:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][820/1251] eta 0:01:39 lr 0.000312 wd 0.0500 time 0.2233 (0.2319) data time 0.0011 (0.0019) model time 0.2222 (0.2300) loss 3.3391 (3.0478) grad_norm 2.7826 (inf) loss_scale 256.0000 (268.1608) mem 7379MB [2024-08-27 21:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][830/1251] eta 0:01:37 lr 0.000312 wd 0.0500 time 0.2362 (0.2319) data time 0.0007 (0.0019) model time 0.2355 (0.2300) loss 3.0360 (3.0496) grad_norm 2.1765 (inf) loss_scale 256.0000 (268.0144) mem 7379MB [2024-08-27 21:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][840/1251] eta 0:01:35 lr 0.000312 wd 0.0500 time 0.2560 (0.2319) data time 0.0012 (0.0019) model time 0.2549 (0.2300) loss 3.4452 (3.0487) grad_norm 3.1242 (inf) loss_scale 256.0000 (267.8716) mem 7379MB [2024-08-27 21:57:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][850/1251] eta 0:01:33 lr 0.000312 wd 0.0500 time 0.2258 (0.2319) data time 0.0009 (0.0019) model time 0.2249 (0.2300) loss 3.2485 (3.0488) grad_norm 2.2268 (inf) loss_scale 256.0000 (267.7321) mem 7379MB [2024-08-27 21:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][860/1251] eta 0:01:30 lr 0.000312 wd 0.0500 time 0.2284 (0.2319) data time 0.0014 (0.0019) model time 0.2269 (0.2300) loss 3.0008 (3.0483) grad_norm 3.4303 (inf) loss_scale 256.0000 (267.5958) mem 7379MB [2024-08-27 21:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][870/1251] eta 0:01:28 lr 0.000312 wd 0.0500 time 0.2322 (0.2319) data time 0.0010 (0.0019) model time 0.2312 (0.2300) loss 2.5860 (3.0474) grad_norm 3.0538 (inf) loss_scale 
256.0000 (267.4627) mem 7379MB [2024-08-27 21:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][880/1251] eta 0:01:26 lr 0.000312 wd 0.0500 time 0.2264 (0.2319) data time 0.0011 (0.0019) model time 0.2253 (0.2300) loss 2.6857 (3.0450) grad_norm 3.1333 (inf) loss_scale 256.0000 (267.3326) mem 7379MB [2024-08-27 21:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][890/1251] eta 0:01:23 lr 0.000312 wd 0.0500 time 0.2309 (0.2319) data time 0.0009 (0.0019) model time 0.2299 (0.2300) loss 3.3693 (3.0483) grad_norm 2.4810 (inf) loss_scale 256.0000 (267.2054) mem 7379MB [2024-08-27 21:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][900/1251] eta 0:01:21 lr 0.000312 wd 0.0500 time 0.2371 (0.2320) data time 0.0011 (0.0019) model time 0.2361 (0.2301) loss 3.2245 (3.0483) grad_norm 8.8133 (inf) loss_scale 256.0000 (267.0810) mem 7379MB [2024-08-27 21:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][910/1251] eta 0:01:19 lr 0.000312 wd 0.0500 time 0.2301 (0.2319) data time 0.0007 (0.0019) model time 0.2294 (0.2301) loss 3.4983 (3.0463) grad_norm 3.2292 (inf) loss_scale 256.0000 (266.9594) mem 7379MB [2024-08-27 21:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][920/1251] eta 0:01:16 lr 0.000312 wd 0.0500 time 0.2220 (0.2320) data time 0.0010 (0.0019) model time 0.2210 (0.2301) loss 3.6680 (3.0489) grad_norm 3.6362 (inf) loss_scale 256.0000 (266.8404) mem 7379MB [2024-08-27 21:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][930/1251] eta 0:01:14 lr 0.000312 wd 0.0500 time 0.2331 (0.2320) data time 0.0010 (0.0019) model time 0.2321 (0.2301) loss 3.4022 (3.0486) grad_norm 2.6853 (inf) loss_scale 256.0000 (266.7240) mem 7379MB [2024-08-27 21:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][940/1251] eta 0:01:12 lr 0.000312 wd 0.0500 time 0.2325 (0.2320) data time 0.0011 (0.0019) model time 
0.2315 (0.2301) loss 2.5130 (3.0469) grad_norm 4.0127 (inf) loss_scale 256.0000 (266.6100) mem 7379MB [2024-08-27 21:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][950/1251] eta 0:01:09 lr 0.000312 wd 0.0500 time 0.2299 (0.2320) data time 0.0007 (0.0018) model time 0.2292 (0.2301) loss 3.6055 (3.0484) grad_norm 5.0229 (inf) loss_scale 256.0000 (266.4984) mem 7379MB [2024-08-27 21:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][960/1251] eta 0:01:07 lr 0.000312 wd 0.0500 time 0.2231 (0.2319) data time 0.0007 (0.0018) model time 0.2224 (0.2301) loss 2.4641 (3.0498) grad_norm 3.3551 (inf) loss_scale 256.0000 (266.3892) mem 7379MB [2024-08-27 21:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][970/1251] eta 0:01:05 lr 0.000312 wd 0.0500 time 0.2392 (0.2319) data time 0.0011 (0.0018) model time 0.2381 (0.2301) loss 3.0032 (3.0481) grad_norm 3.2745 (inf) loss_scale 256.0000 (266.2822) mem 7379MB [2024-08-27 21:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][980/1251] eta 0:01:02 lr 0.000312 wd 0.0500 time 0.2300 (0.2319) data time 0.0009 (0.0018) model time 0.2290 (0.2301) loss 3.6066 (3.0450) grad_norm 2.8183 (inf) loss_scale 256.0000 (266.1774) mem 7379MB [2024-08-27 21:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][990/1251] eta 0:01:00 lr 0.000312 wd 0.0500 time 0.2326 (0.2319) data time 0.0009 (0.0018) model time 0.2316 (0.2301) loss 2.8032 (3.0446) grad_norm 2.8560 (inf) loss_scale 256.0000 (266.0747) mem 7379MB [2024-08-27 21:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1000/1251] eta 0:00:58 lr 0.000311 wd 0.0500 time 0.2340 (0.2319) data time 0.0008 (0.0018) model time 0.2332 (0.2301) loss 3.6750 (3.0442) grad_norm 2.3741 (inf) loss_scale 256.0000 (265.9740) mem 7379MB [2024-08-27 21:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1010/1251] eta 0:00:55 lr 0.000311 
wd 0.0500 time 0.2285 (0.2319) data time 0.0007 (0.0018) model time 0.2277 (0.2301) loss 3.4454 (3.0452) grad_norm 2.7406 (inf) loss_scale 256.0000 (265.8754) mem 7379MB [2024-08-27 21:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1020/1251] eta 0:00:53 lr 0.000311 wd 0.0500 time 0.2227 (0.2319) data time 0.0015 (0.0018) model time 0.2212 (0.2301) loss 3.8888 (3.0430) grad_norm 2.2061 (inf) loss_scale 256.0000 (265.7786) mem 7379MB [2024-08-27 21:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1030/1251] eta 0:00:51 lr 0.000311 wd 0.0500 time 0.2275 (0.2319) data time 0.0010 (0.0018) model time 0.2265 (0.2301) loss 3.2339 (3.0434) grad_norm 2.3408 (inf) loss_scale 256.0000 (265.6838) mem 7379MB [2024-08-27 21:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1040/1251] eta 0:00:48 lr 0.000311 wd 0.0500 time 0.2298 (0.2319) data time 0.0007 (0.0018) model time 0.2291 (0.2301) loss 3.0464 (3.0412) grad_norm 3.4674 (inf) loss_scale 256.0000 (265.5908) mem 7379MB [2024-08-27 21:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1050/1251] eta 0:00:46 lr 0.000311 wd 0.0500 time 0.2260 (0.2319) data time 0.0007 (0.0018) model time 0.2254 (0.2301) loss 2.0240 (3.0382) grad_norm 3.5621 (inf) loss_scale 256.0000 (265.4995) mem 7379MB [2024-08-27 21:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1060/1251] eta 0:00:44 lr 0.000311 wd 0.0500 time 0.2338 (0.2319) data time 0.0007 (0.0018) model time 0.2331 (0.2301) loss 2.9457 (3.0382) grad_norm 2.5582 (inf) loss_scale 256.0000 (265.4100) mem 7379MB [2024-08-27 21:58:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1070/1251] eta 0:00:41 lr 0.000311 wd 0.0500 time 0.2324 (0.2319) data time 0.0008 (0.0018) model time 0.2315 (0.2301) loss 2.8832 (3.0377) grad_norm 3.2657 (inf) loss_scale 256.0000 (265.3221) mem 7379MB [2024-08-27 21:58:02 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [195/300][1080/1251] eta 0:00:39 lr 0.000311 wd 0.0500 time 0.2238 (0.2319) data time 0.0007 (0.0018) model time 0.2230 (0.2301) loss 3.2069 (3.0349) grad_norm 3.3124 (inf) loss_scale 256.0000 (265.2359) mem 7379MB [2024-08-27 21:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1090/1251] eta 0:00:37 lr 0.000311 wd 0.0500 time 0.2280 (0.2319) data time 0.0010 (0.0018) model time 0.2270 (0.2301) loss 2.1376 (3.0327) grad_norm 2.2722 (inf) loss_scale 256.0000 (265.1512) mem 7379MB [2024-08-27 21:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1100/1251] eta 0:00:35 lr 0.000311 wd 0.0500 time 0.2237 (0.2319) data time 0.0014 (0.0018) model time 0.2222 (0.2301) loss 3.1870 (3.0318) grad_norm 5.1349 (inf) loss_scale 256.0000 (265.0681) mem 7379MB [2024-08-27 21:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1110/1251] eta 0:00:32 lr 0.000311 wd 0.0500 time 0.2234 (0.2318) data time 0.0012 (0.0018) model time 0.2222 (0.2301) loss 2.7226 (3.0299) grad_norm 2.2489 (inf) loss_scale 256.0000 (264.9865) mem 7379MB [2024-08-27 21:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1120/1251] eta 0:00:30 lr 0.000311 wd 0.0500 time 0.2284 (0.2318) data time 0.0012 (0.0018) model time 0.2272 (0.2300) loss 3.3454 (3.0305) grad_norm 2.9972 (inf) loss_scale 256.0000 (264.9063) mem 7379MB [2024-08-27 21:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1130/1251] eta 0:00:28 lr 0.000311 wd 0.0500 time 0.2298 (0.2318) data time 0.0006 (0.0018) model time 0.2291 (0.2301) loss 3.8759 (3.0322) grad_norm 2.3066 (inf) loss_scale 256.0000 (264.8276) mem 7379MB [2024-08-27 21:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1140/1251] eta 0:00:25 lr 0.000311 wd 0.0500 time 0.2280 (0.2318) data time 0.0007 (0.0018) model time 0.2273 (0.2300) loss 2.5626 (3.0322) grad_norm 4.0930 (inf) loss_scale 
256.0000 (264.7502) mem 7379MB [2024-08-27 21:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1150/1251] eta 0:00:23 lr 0.000311 wd 0.0500 time 0.2261 (0.2318) data time 0.0012 (0.0018) model time 0.2249 (0.2300) loss 3.1956 (3.0322) grad_norm 3.1470 (inf) loss_scale 256.0000 (264.6742) mem 7379MB [2024-08-27 21:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1160/1251] eta 0:00:21 lr 0.000311 wd 0.0500 time 0.2232 (0.2318) data time 0.0015 (0.0018) model time 0.2217 (0.2300) loss 2.8626 (3.0328) grad_norm 4.0487 (inf) loss_scale 256.0000 (264.5995) mem 7379MB [2024-08-27 21:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1170/1251] eta 0:00:18 lr 0.000311 wd 0.0500 time 0.2312 (0.2317) data time 0.0008 (0.0018) model time 0.2304 (0.2300) loss 2.7971 (3.0327) grad_norm 3.6560 (inf) loss_scale 256.0000 (264.5260) mem 7379MB [2024-08-27 21:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1180/1251] eta 0:00:16 lr 0.000311 wd 0.0500 time 0.2318 (0.2317) data time 0.0007 (0.0017) model time 0.2311 (0.2300) loss 2.5535 (3.0333) grad_norm 2.1581 (inf) loss_scale 256.0000 (264.4539) mem 7379MB [2024-08-27 21:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1190/1251] eta 0:00:14 lr 0.000311 wd 0.0500 time 0.2265 (0.2317) data time 0.0010 (0.0017) model time 0.2255 (0.2300) loss 3.1881 (3.0326) grad_norm 2.1942 (inf) loss_scale 256.0000 (264.3829) mem 7379MB [2024-08-27 21:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1200/1251] eta 0:00:11 lr 0.000311 wd 0.0500 time 0.2231 (0.2317) data time 0.0007 (0.0017) model time 0.2224 (0.2300) loss 2.9888 (3.0341) grad_norm 2.8199 (inf) loss_scale 256.0000 (264.3131) mem 7379MB [2024-08-27 21:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1210/1251] eta 0:00:09 lr 0.000311 wd 0.0500 time 0.2256 (0.2317) data time 0.0007 (0.0017) model 
time 0.2249 (0.2300) loss 3.4230 (3.0334) grad_norm 2.3795 (inf) loss_scale 256.0000 (264.2444) mem 7379MB [2024-08-27 21:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1220/1251] eta 0:00:07 lr 0.000311 wd 0.0500 time 0.2298 (0.2317) data time 0.0008 (0.0017) model time 0.2290 (0.2300) loss 3.4354 (3.0350) grad_norm 2.8210 (inf) loss_scale 256.0000 (264.1769) mem 7379MB [2024-08-27 21:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1230/1251] eta 0:00:04 lr 0.000311 wd 0.0500 time 0.2337 (0.2317) data time 0.0008 (0.0017) model time 0.2330 (0.2300) loss 3.5691 (3.0338) grad_norm 4.5753 (inf) loss_scale 256.0000 (264.1105) mem 7379MB [2024-08-27 21:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1240/1251] eta 0:00:02 lr 0.000310 wd 0.0500 time 0.2128 (0.2316) data time 0.0004 (0.0017) model time 0.2123 (0.2299) loss 3.5553 (3.0342) grad_norm 3.7963 (inf) loss_scale 256.0000 (264.0451) mem 7379MB [2024-08-27 21:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [195/300][1250/1251] eta 0:00:00 lr 0.000310 wd 0.0500 time 0.2100 (0.2315) data time 0.0007 (0.0017) model time 0.2093 (0.2297) loss 3.4894 (3.0342) grad_norm 2.9572 (inf) loss_scale 256.0000 (263.9808) mem 7379MB [2024-08-27 21:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 195 training takes 0:04:49 [2024-08-27 21:58:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 21:58:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 21:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.512 (0.512) Loss 0.4456 (0.4456) Acc@1 92.871 (92.871) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-27 21:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.122) Loss 0.7119 (0.6804) Acc@1 86.719 (85.938) Acc@5 96.973 (97.097) Mem 7379MB [2024-08-27 21:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.103) Loss 0.9604 (0.7042) Acc@1 76.465 (84.798) Acc@5 95.117 (97.135) Mem 7379MB [2024-08-27 21:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.097) Loss 1.1270 (0.7951) Acc@1 72.949 (82.538) Acc@5 92.285 (96.084) Mem 7379MB [2024-08-27 21:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.089) Loss 1.0859 (0.8450) Acc@1 74.023 (81.150) Acc@5 93.750 (95.579) Mem 7379MB [2024-08-27 21:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.736 Acc@5 95.538 [2024-08-27 21:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-27 21:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.961 (0.961) Loss 0.3896 (0.3896) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7379MB [2024-08-27 21:58:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.164) Loss 0.6084 (0.6133) Acc@1 88.672 (87.198) Acc@5 97.461 (97.550) Mem 7379MB [2024-08-27 21:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.125) Loss 0.8696 (0.6391) Acc@1 78.711 (86.179) Acc@5 95.996 (97.582) Mem 7379MB [2024-08-27 21:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.111) Loss 1.1094 (0.7236) Acc@1 72.559 (84.085) Acc@5 92.773 (96.680) Mem 7379MB [2024-08-27 21:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.099) Loss 0.9785 
(0.7667) Acc@1 75.879 (82.798) Acc@5 94.434 (96.203) Mem 7379MB [2024-08-27 21:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.366 Acc@5 96.206 [2024-08-27 21:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-08-27 21:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.37% [2024-08-27 21:58:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 21:58:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 21:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][0/1251] eta 0:16:53 lr 0.000310 wd 0.0500 time 0.8101 (0.8101) data time 0.5721 (0.5721) model time 0.0000 (0.0000) loss 2.8808 (2.8808) grad_norm 2.8726 (2.8726) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][10/1251] eta 0:05:49 lr 0.000310 wd 0.0500 time 0.2266 (0.2818) data time 0.0007 (0.0530) model time 0.0000 (0.0000) loss 3.5952 (3.0987) grad_norm 3.5760 (3.7741) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][20/1251] eta 0:05:15 lr 0.000310 wd 0.0500 time 0.2240 (0.2567) data time 0.0012 (0.0283) model time 0.0000 (0.0000) loss 3.0865 (3.0618) grad_norm 3.5344 (3.8502) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][30/1251] eta 0:05:01 lr 0.000310 wd 0.0500 time 0.2240 (0.2472) data time 0.0010 (0.0196) model time 0.0000 (0.0000) loss 3.6251 (3.0861) grad_norm 2.3393 (3.7490) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][40/1251] eta 0:04:53 lr 
0.000310 wd 0.0500 time 0.2282 (0.2427) data time 0.0007 (0.0151) model time 0.0000 (0.0000) loss 3.6247 (3.0734) grad_norm 2.9783 (3.4550) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][50/1251] eta 0:04:48 lr 0.000310 wd 0.0500 time 0.2231 (0.2402) data time 0.0009 (0.0123) model time 0.0000 (0.0000) loss 3.1758 (3.0425) grad_norm 2.7167 (3.3420) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][60/1251] eta 0:04:43 lr 0.000310 wd 0.0500 time 0.2315 (0.2382) data time 0.0009 (0.0106) model time 0.2306 (0.2264) loss 3.5111 (3.0503) grad_norm 2.4007 (3.2955) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][70/1251] eta 0:04:39 lr 0.000310 wd 0.0500 time 0.2256 (0.2369) data time 0.0010 (0.0093) model time 0.2246 (0.2268) loss 2.8535 (3.0452) grad_norm 2.8748 (3.3831) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][80/1251] eta 0:04:36 lr 0.000310 wd 0.0500 time 0.2260 (0.2359) data time 0.0018 (0.0083) model time 0.2242 (0.2273) loss 3.4298 (3.0276) grad_norm 3.1496 (3.3467) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][90/1251] eta 0:04:33 lr 0.000310 wd 0.0500 time 0.2372 (0.2353) data time 0.0009 (0.0075) model time 0.2363 (0.2277) loss 3.0824 (3.0343) grad_norm 2.7738 (3.2874) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][100/1251] eta 0:04:32 lr 0.000310 wd 0.0500 time 0.2401 (0.2369) data time 0.0010 (0.0069) model time 0.2391 (0.2323) loss 3.0775 (3.0154) grad_norm 3.2838 (3.2689) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][110/1251] eta 0:04:29 lr 0.000310 wd 0.0500 time 0.2355 (0.2363) data time 0.0009 (0.0063) model time 0.2347 (0.2318) loss 2.8160 (3.0106) grad_norm 3.1435 (3.2250) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][120/1251] eta 0:04:26 lr 0.000310 wd 0.0500 time 0.2289 (0.2356) data time 0.0007 (0.0059) model time 0.2282 (0.2311) loss 3.8073 (3.0145) grad_norm 3.4709 (3.1917) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][130/1251] eta 0:04:23 lr 0.000310 wd 0.0500 time 0.2291 (0.2351) data time 0.0009 (0.0055) model time 0.2283 (0.2307) loss 3.4460 (3.0141) grad_norm 3.3024 (3.1613) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][140/1251] eta 0:04:20 lr 0.000310 wd 0.0500 time 0.2314 (0.2346) data time 0.0009 (0.0052) model time 0.2304 (0.2303) loss 2.8578 (3.0181) grad_norm 2.5671 (3.1549) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][150/1251] eta 0:04:17 lr 0.000310 wd 0.0500 time 0.2232 (0.2342) data time 0.0012 (0.0049) model time 0.2221 (0.2300) loss 3.0506 (3.0283) grad_norm 2.5050 (3.1479) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][160/1251] eta 0:04:15 lr 0.000310 wd 0.0500 time 0.2334 (0.2340) data time 0.0007 (0.0047) model time 0.2327 (0.2300) loss 3.0050 (3.0206) grad_norm 2.3383 (3.1202) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][170/1251] eta 0:04:12 lr 0.000310 wd 0.0500 time 0.2276 (0.2339) data time 0.0011 (0.0045) model time 0.2266 (0.2300) loss 3.9268 (3.0225) 
grad_norm 2.5765 (3.0971) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][180/1251] eta 0:04:10 lr 0.000310 wd 0.0500 time 0.2334 (0.2337) data time 0.0009 (0.0044) model time 0.2324 (0.2299) loss 2.8532 (3.0094) grad_norm 2.7369 (3.0973) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][190/1251] eta 0:04:07 lr 0.000310 wd 0.0500 time 0.2215 (0.2333) data time 0.0009 (0.0042) model time 0.2207 (0.2296) loss 3.2090 (3.0092) grad_norm 3.1259 (3.1112) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][200/1251] eta 0:04:05 lr 0.000310 wd 0.0500 time 0.2324 (0.2332) data time 0.0009 (0.0040) model time 0.2315 (0.2296) loss 3.8195 (3.0173) grad_norm 2.4824 (3.1204) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][210/1251] eta 0:04:02 lr 0.000310 wd 0.0500 time 0.2416 (0.2331) data time 0.0007 (0.0039) model time 0.2409 (0.2297) loss 2.7578 (3.0220) grad_norm 4.5184 (3.1372) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][220/1251] eta 0:04:00 lr 0.000310 wd 0.0500 time 0.2258 (0.2330) data time 0.0007 (0.0038) model time 0.2251 (0.2296) loss 3.8676 (3.0166) grad_norm 3.5369 (3.2205) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][230/1251] eta 0:03:57 lr 0.000310 wd 0.0500 time 0.2304 (0.2328) data time 0.0006 (0.0037) model time 0.2297 (0.2295) loss 2.9813 (3.0133) grad_norm 3.1628 (3.1977) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][240/1251] eta 0:03:55 lr 0.000309 wd 0.0500 time 
0.2345 (0.2326) data time 0.0007 (0.0035) model time 0.2339 (0.2294) loss 2.3841 (3.0138) grad_norm 3.5243 (3.1987) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][250/1251] eta 0:03:52 lr 0.000309 wd 0.0500 time 0.2728 (0.2327) data time 0.0007 (0.0035) model time 0.2721 (0.2296) loss 3.4529 (3.0102) grad_norm 2.5316 (3.1890) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][260/1251] eta 0:03:50 lr 0.000309 wd 0.0500 time 0.2317 (0.2327) data time 0.0012 (0.0035) model time 0.2305 (0.2295) loss 2.6051 (3.0087) grad_norm 2.2422 (3.1845) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][270/1251] eta 0:03:48 lr 0.000309 wd 0.0500 time 0.2264 (0.2326) data time 0.0008 (0.0034) model time 0.2256 (0.2295) loss 2.4355 (3.0042) grad_norm 2.9132 (3.1712) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][280/1251] eta 0:03:45 lr 0.000309 wd 0.0500 time 0.2344 (0.2325) data time 0.0011 (0.0034) model time 0.2333 (0.2294) loss 3.5709 (3.0139) grad_norm 2.5494 (3.1798) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 21:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][290/1251] eta 0:03:43 lr 0.000309 wd 0.0500 time 0.2304 (0.2323) data time 0.0007 (0.0033) model time 0.2297 (0.2293) loss 1.8646 (3.0017) grad_norm 3.1806 (3.1706) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][300/1251] eta 0:03:40 lr 0.000309 wd 0.0500 time 0.2255 (0.2322) data time 0.0012 (0.0032) model time 0.2243 (0.2292) loss 3.1632 (2.9992) grad_norm 4.9059 (3.1655) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:04 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [196/300][310/1251] eta 0:03:38 lr 0.000309 wd 0.0500 time 0.2290 (0.2321) data time 0.0008 (0.0031) model time 0.2282 (0.2291) loss 2.1989 (3.0006) grad_norm 3.1687 (3.1571) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][320/1251] eta 0:03:35 lr 0.000309 wd 0.0500 time 0.2272 (0.2319) data time 0.0008 (0.0031) model time 0.2264 (0.2291) loss 2.6099 (3.0026) grad_norm 3.1159 (3.1700) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][330/1251] eta 0:03:33 lr 0.000309 wd 0.0500 time 0.2209 (0.2319) data time 0.0014 (0.0030) model time 0.2194 (0.2291) loss 2.3294 (3.0139) grad_norm 2.5341 (3.1708) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][340/1251] eta 0:03:31 lr 0.000309 wd 0.0500 time 0.2337 (0.2319) data time 0.0011 (0.0030) model time 0.2326 (0.2291) loss 3.7538 (3.0043) grad_norm 2.7141 (3.1628) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][350/1251] eta 0:03:28 lr 0.000309 wd 0.0500 time 0.2344 (0.2318) data time 0.0008 (0.0029) model time 0.2337 (0.2291) loss 3.7726 (3.0089) grad_norm 3.0610 (3.1543) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][360/1251] eta 0:03:26 lr 0.000309 wd 0.0500 time 0.2280 (0.2317) data time 0.0015 (0.0029) model time 0.2265 (0.2290) loss 3.0610 (3.0119) grad_norm 3.0412 (3.1473) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][370/1251] eta 0:03:24 lr 0.000309 wd 0.0500 time 0.2458 (0.2316) data time 0.0011 (0.0028) model time 0.2447 (0.2290) loss 2.6824 (3.0093) grad_norm 3.8949 
(3.1481) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][380/1251] eta 0:03:21 lr 0.000309 wd 0.0500 time 0.2271 (0.2317) data time 0.0011 (0.0028) model time 0.2260 (0.2291) loss 2.3573 (3.0036) grad_norm 2.4837 (3.1394) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][390/1251] eta 0:03:19 lr 0.000309 wd 0.0500 time 0.2349 (0.2317) data time 0.0007 (0.0027) model time 0.2342 (0.2291) loss 3.5416 (3.0083) grad_norm 5.6842 (3.1309) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][400/1251] eta 0:03:17 lr 0.000309 wd 0.0500 time 0.2286 (0.2316) data time 0.0009 (0.0027) model time 0.2276 (0.2290) loss 2.8761 (3.0099) grad_norm 12.2713 (3.1650) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][410/1251] eta 0:03:14 lr 0.000309 wd 0.0500 time 0.2269 (0.2315) data time 0.0011 (0.0027) model time 0.2258 (0.2289) loss 3.5204 (3.0130) grad_norm 4.9195 (3.1792) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][420/1251] eta 0:03:12 lr 0.000309 wd 0.0500 time 0.2330 (0.2314) data time 0.0007 (0.0026) model time 0.2323 (0.2290) loss 3.2681 (3.0083) grad_norm 3.4331 (3.1970) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][430/1251] eta 0:03:09 lr 0.000309 wd 0.0500 time 0.2339 (0.2313) data time 0.0009 (0.0026) model time 0.2330 (0.2289) loss 3.6151 (3.0150) grad_norm 5.7756 (3.2255) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][440/1251] eta 0:03:07 lr 0.000309 wd 0.0500 time 0.2252 (0.2313) 
data time 0.0014 (0.0026) model time 0.2238 (0.2288) loss 3.3106 (3.0232) grad_norm 4.7987 (3.2345) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][450/1251] eta 0:03:05 lr 0.000309 wd 0.0500 time 0.2291 (0.2312) data time 0.0011 (0.0025) model time 0.2280 (0.2288) loss 1.9080 (3.0234) grad_norm 3.7854 (3.2376) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][460/1251] eta 0:03:02 lr 0.000309 wd 0.0500 time 0.2331 (0.2312) data time 0.0013 (0.0025) model time 0.2318 (0.2288) loss 2.9702 (3.0242) grad_norm 3.1414 (3.2257) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][470/1251] eta 0:03:00 lr 0.000309 wd 0.0500 time 0.2453 (0.2313) data time 0.0009 (0.0025) model time 0.2444 (0.2289) loss 3.0070 (3.0226) grad_norm 3.2402 (3.2283) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][480/1251] eta 0:02:58 lr 0.000308 wd 0.0500 time 0.2264 (0.2316) data time 0.0011 (0.0024) model time 0.2253 (0.2293) loss 3.2151 (3.0244) grad_norm 5.1760 (3.2324) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][490/1251] eta 0:02:56 lr 0.000308 wd 0.0500 time 0.2330 (0.2316) data time 0.0010 (0.0024) model time 0.2321 (0.2294) loss 3.6775 (3.0253) grad_norm 2.2398 (3.2277) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][500/1251] eta 0:02:53 lr 0.000308 wd 0.0500 time 0.2252 (0.2316) data time 0.0007 (0.0024) model time 0.2245 (0.2294) loss 4.0128 (3.0271) grad_norm 2.2504 (3.2197) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [196/300][510/1251] eta 0:02:51 lr 0.000308 wd 0.0500 time 0.2307 (0.2317) data time 0.0009 (0.0024) model time 0.2298 (0.2295) loss 2.4852 (3.0286) grad_norm 3.6021 (3.2209) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][520/1251] eta 0:02:49 lr 0.000308 wd 0.0500 time 0.2353 (0.2317) data time 0.0007 (0.0023) model time 0.2346 (0.2295) loss 3.4463 (3.0337) grad_norm 3.0341 (3.2261) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][530/1251] eta 0:02:46 lr 0.000308 wd 0.0500 time 0.2233 (0.2316) data time 0.0010 (0.0023) model time 0.2222 (0.2294) loss 3.2688 (3.0316) grad_norm 3.0528 (3.2312) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][540/1251] eta 0:02:44 lr 0.000308 wd 0.0500 time 0.2286 (0.2316) data time 0.0009 (0.0023) model time 0.2277 (0.2294) loss 3.0864 (3.0347) grad_norm 4.0038 (3.2268) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][550/1251] eta 0:02:42 lr 0.000308 wd 0.0500 time 0.2286 (0.2315) data time 0.0010 (0.0023) model time 0.2276 (0.2294) loss 3.3668 (3.0367) grad_norm 3.2577 (3.2265) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][560/1251] eta 0:02:40 lr 0.000308 wd 0.0500 time 0.2319 (0.2316) data time 0.0007 (0.0022) model time 0.2312 (0.2295) loss 4.1351 (3.0393) grad_norm 2.3411 (3.2269) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][570/1251] eta 0:02:37 lr 0.000308 wd 0.0500 time 0.2350 (0.2316) data time 0.0008 (0.0022) model time 0.2343 (0.2295) loss 3.5444 (3.0409) grad_norm 2.4444 (3.2178) loss_scale 256.0000 
(256.0000) mem 7379MB [2024-08-27 22:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][580/1251] eta 0:02:35 lr 0.000308 wd 0.0500 time 0.2317 (0.2316) data time 0.0008 (0.0022) model time 0.2308 (0.2295) loss 3.3228 (3.0466) grad_norm 3.4817 (3.2222) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][590/1251] eta 0:02:33 lr 0.000308 wd 0.0500 time 0.2257 (0.2316) data time 0.0007 (0.0022) model time 0.2250 (0.2295) loss 2.2577 (3.0466) grad_norm 3.0540 (3.2189) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][600/1251] eta 0:02:30 lr 0.000308 wd 0.0500 time 0.2281 (0.2315) data time 0.0007 (0.0022) model time 0.2274 (0.2295) loss 3.4871 (3.0493) grad_norm 2.5484 (3.2159) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][610/1251] eta 0:02:28 lr 0.000308 wd 0.0500 time 0.2257 (0.2315) data time 0.0009 (0.0022) model time 0.2248 (0.2295) loss 3.3481 (3.0485) grad_norm 2.2045 (3.2097) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][620/1251] eta 0:02:26 lr 0.000308 wd 0.0500 time 0.2300 (0.2315) data time 0.0009 (0.0021) model time 0.2291 (0.2295) loss 2.0121 (3.0451) grad_norm 2.1829 (3.2078) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][630/1251] eta 0:02:23 lr 0.000308 wd 0.0500 time 0.2242 (0.2315) data time 0.0008 (0.0021) model time 0.2234 (0.2295) loss 3.3841 (3.0425) grad_norm 2.4451 (3.2043) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][640/1251] eta 0:02:21 lr 0.000308 wd 0.0500 time 0.2321 (0.2315) data time 0.0010 (0.0021) model 
time 0.2312 (0.2295) loss 3.4445 (3.0442) grad_norm 2.6683 (3.1972) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][650/1251] eta 0:02:19 lr 0.000308 wd 0.0500 time 0.2278 (0.2315) data time 0.0008 (0.0021) model time 0.2270 (0.2295) loss 3.0661 (3.0435) grad_norm 4.3078 (3.1907) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][660/1251] eta 0:02:16 lr 0.000308 wd 0.0500 time 0.2261 (0.2315) data time 0.0011 (0.0021) model time 0.2251 (0.2295) loss 2.0851 (3.0394) grad_norm 2.6297 (3.1854) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][670/1251] eta 0:02:14 lr 0.000308 wd 0.0500 time 0.2456 (0.2315) data time 0.0009 (0.0021) model time 0.2447 (0.2295) loss 3.2598 (3.0368) grad_norm 2.7886 (3.1746) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][680/1251] eta 0:02:12 lr 0.000308 wd 0.0500 time 0.2284 (0.2314) data time 0.0008 (0.0020) model time 0.2276 (0.2295) loss 3.1472 (3.0382) grad_norm 1.9896 (3.1674) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][690/1251] eta 0:02:09 lr 0.000308 wd 0.0500 time 0.2333 (0.2314) data time 0.0007 (0.0020) model time 0.2326 (0.2295) loss 3.6459 (3.0386) grad_norm 3.6751 (3.1671) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][700/1251] eta 0:02:07 lr 0.000308 wd 0.0500 time 0.2266 (0.2314) data time 0.0012 (0.0020) model time 0.2255 (0.2295) loss 2.3372 (3.0352) grad_norm 3.2382 (3.1670) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][710/1251] 
eta 0:02:05 lr 0.000308 wd 0.0500 time 0.2347 (0.2313) data time 0.0007 (0.0020) model time 0.2340 (0.2294) loss 2.8119 (3.0358) grad_norm 4.7207 (3.1712) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][720/1251] eta 0:02:02 lr 0.000308 wd 0.0500 time 0.2260 (0.2313) data time 0.0010 (0.0020) model time 0.2250 (0.2294) loss 2.4319 (3.0332) grad_norm 2.6286 (3.1770) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][730/1251] eta 0:02:00 lr 0.000307 wd 0.0500 time 0.2208 (0.2314) data time 0.0012 (0.0020) model time 0.2196 (0.2295) loss 2.6035 (3.0292) grad_norm 2.7899 (3.1705) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][740/1251] eta 0:01:58 lr 0.000307 wd 0.0500 time 0.2237 (0.2313) data time 0.0010 (0.0020) model time 0.2227 (0.2294) loss 2.0035 (3.0228) grad_norm 2.9001 (3.1704) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][750/1251] eta 0:01:55 lr 0.000307 wd 0.0500 time 0.2404 (0.2313) data time 0.0009 (0.0020) model time 0.2396 (0.2294) loss 3.1839 (3.0233) grad_norm 4.6397 (3.1755) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][760/1251] eta 0:01:53 lr 0.000307 wd 0.0500 time 0.2314 (0.2313) data time 0.0009 (0.0020) model time 0.2305 (0.2294) loss 2.2343 (3.0200) grad_norm 2.2930 (3.1655) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][770/1251] eta 0:01:51 lr 0.000307 wd 0.0500 time 0.2279 (0.2313) data time 0.0006 (0.0019) model time 0.2272 (0.2294) loss 3.4999 (3.0212) grad_norm 3.3245 (3.1619) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 
22:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][780/1251] eta 0:01:48 lr 0.000307 wd 0.0500 time 0.2288 (0.2312) data time 0.0015 (0.0019) model time 0.2274 (0.2294) loss 2.9048 (3.0190) grad_norm 3.3100 (3.1634) loss_scale 256.0000 (256.0000) mem 7379MB [2024-08-27 22:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][790/1251] eta 0:01:46 lr 0.000307 wd 0.0500 time 0.2295 (0.2312) data time 0.0011 (0.0019) model time 0.2284 (0.2294) loss 3.1877 (3.0206) grad_norm 3.0074 (3.1633) loss_scale 512.0000 (256.9709) mem 7379MB [2024-08-27 22:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][800/1251] eta 0:01:44 lr 0.000307 wd 0.0500 time 0.2395 (0.2312) data time 0.0013 (0.0019) model time 0.2382 (0.2294) loss 3.4659 (3.0196) grad_norm 2.3366 (3.1625) loss_scale 512.0000 (260.1548) mem 7379MB [2024-08-27 22:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][810/1251] eta 0:01:41 lr 0.000307 wd 0.0500 time 0.2366 (0.2312) data time 0.0014 (0.0019) model time 0.2352 (0.2294) loss 2.9937 (3.0234) grad_norm 3.2325 (3.1647) loss_scale 512.0000 (263.2602) mem 7379MB [2024-08-27 22:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][820/1251] eta 0:01:39 lr 0.000307 wd 0.0500 time 0.2248 (0.2312) data time 0.0007 (0.0019) model time 0.2241 (0.2294) loss 2.1781 (3.0232) grad_norm 2.9111 (3.1636) loss_scale 512.0000 (266.2899) mem 7379MB [2024-08-27 22:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][830/1251] eta 0:01:37 lr 0.000307 wd 0.0500 time 0.2297 (0.2312) data time 0.0007 (0.0019) model time 0.2290 (0.2294) loss 2.4962 (3.0204) grad_norm 3.0466 (3.1694) loss_scale 512.0000 (269.2467) mem 7379MB [2024-08-27 22:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][840/1251] eta 0:01:34 lr 0.000307 wd 0.0500 time 0.2264 (0.2311) data time 0.0013 (0.0019) model time 0.2251 (0.2293) loss 3.0843 
(3.0205) grad_norm 2.4596 (3.1640) loss_scale 512.0000 (272.1332) mem 7379MB [2024-08-27 22:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][850/1251] eta 0:01:32 lr 0.000307 wd 0.0500 time 0.2329 (0.2311) data time 0.0009 (0.0019) model time 0.2319 (0.2293) loss 3.3200 (3.0236) grad_norm 3.1919 (3.1643) loss_scale 512.0000 (274.9518) mem 7379MB [2024-08-27 22:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][860/1251] eta 0:01:30 lr 0.000307 wd 0.0500 time 0.2194 (0.2311) data time 0.0012 (0.0019) model time 0.2183 (0.2293) loss 3.2578 (3.0217) grad_norm 2.7091 (3.1621) loss_scale 512.0000 (277.7050) mem 7379MB [2024-08-27 22:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][870/1251] eta 0:01:28 lr 0.000307 wd 0.0500 time 0.2321 (0.2311) data time 0.0007 (0.0019) model time 0.2314 (0.2293) loss 3.9925 (3.0249) grad_norm 2.3039 (3.1585) loss_scale 512.0000 (280.3949) mem 7379MB [2024-08-27 22:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][880/1251] eta 0:01:25 lr 0.000307 wd 0.0500 time 0.2323 (0.2311) data time 0.0007 (0.0019) model time 0.2316 (0.2293) loss 2.8048 (3.0242) grad_norm 2.7830 (3.1823) loss_scale 512.0000 (283.0238) mem 7379MB [2024-08-27 22:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][890/1251] eta 0:01:23 lr 0.000307 wd 0.0500 time 0.2411 (0.2311) data time 0.0010 (0.0019) model time 0.2401 (0.2293) loss 2.5535 (3.0244) grad_norm 3.3328 (3.1761) loss_scale 512.0000 (285.5937) mem 7379MB [2024-08-27 22:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][900/1251] eta 0:01:21 lr 0.000307 wd 0.0500 time 0.2220 (0.2311) data time 0.0009 (0.0019) model time 0.2211 (0.2293) loss 2.4459 (3.0217) grad_norm 2.2156 (3.1740) loss_scale 512.0000 (288.1065) mem 7379MB [2024-08-27 22:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][910/1251] eta 0:01:18 lr 0.000307 wd 0.0500 
time 0.2300 (0.2311) data time 0.0009 (0.0019) model time 0.2291 (0.2293) loss 3.2340 (3.0227) grad_norm 3.2652 (3.1722) loss_scale 512.0000 (290.5642) mem 7379MB [2024-08-27 22:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][920/1251] eta 0:01:16 lr 0.000307 wd 0.0500 time 0.2329 (0.2311) data time 0.0009 (0.0019) model time 0.2320 (0.2293) loss 2.1620 (3.0206) grad_norm 3.0436 (3.1696) loss_scale 512.0000 (292.9685) mem 7379MB [2024-08-27 22:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][930/1251] eta 0:01:14 lr 0.000307 wd 0.0500 time 0.2256 (0.2311) data time 0.0014 (0.0018) model time 0.2242 (0.2293) loss 3.4145 (3.0225) grad_norm 3.1806 (3.1636) loss_scale 512.0000 (295.3212) mem 7379MB [2024-08-27 22:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][940/1251] eta 0:01:11 lr 0.000307 wd 0.0500 time 0.2313 (0.2311) data time 0.0007 (0.0018) model time 0.2306 (0.2293) loss 2.8589 (3.0202) grad_norm 4.0995 (3.1598) loss_scale 512.0000 (297.6238) mem 7379MB [2024-08-27 22:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][950/1251] eta 0:01:09 lr 0.000307 wd 0.0500 time 0.2299 (0.2310) data time 0.0010 (0.0018) model time 0.2289 (0.2293) loss 2.1173 (3.0187) grad_norm 5.4718 (3.1614) loss_scale 512.0000 (299.8780) mem 7379MB [2024-08-27 22:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][960/1251] eta 0:01:07 lr 0.000307 wd 0.0500 time 0.2213 (0.2311) data time 0.0008 (0.0018) model time 0.2206 (0.2293) loss 2.7923 (3.0166) grad_norm 2.3179 (3.1602) loss_scale 512.0000 (302.0853) mem 7379MB [2024-08-27 22:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][970/1251] eta 0:01:04 lr 0.000307 wd 0.0500 time 0.2216 (0.2311) data time 0.0014 (0.0018) model time 0.2203 (0.2293) loss 3.1931 (3.0193) grad_norm 2.5282 (3.1570) loss_scale 512.0000 (304.2472) mem 7379MB [2024-08-27 22:02:39 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [196/300][980/1251] eta 0:01:02 lr 0.000306 wd 0.0500 time 0.2255 (0.2311) data time 0.0012 (0.0018) model time 0.2243 (0.2293) loss 3.0490 (3.0179) grad_norm 2.5874 (3.1528) loss_scale 512.0000 (306.3649) mem 7379MB [2024-08-27 22:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][990/1251] eta 0:01:00 lr 0.000306 wd 0.0500 time 0.2286 (0.2311) data time 0.0007 (0.0018) model time 0.2279 (0.2293) loss 3.3320 (3.0189) grad_norm 3.7973 (3.1536) loss_scale 512.0000 (308.4400) mem 7379MB [2024-08-27 22:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1000/1251] eta 0:00:58 lr 0.000306 wd 0.0500 time 0.2266 (0.2313) data time 0.0009 (0.0018) model time 0.2256 (0.2295) loss 3.0274 (3.0195) grad_norm 2.8853 (3.1505) loss_scale 512.0000 (310.4735) mem 7379MB [2024-08-27 22:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1010/1251] eta 0:00:55 lr 0.000306 wd 0.0500 time 0.2295 (0.2313) data time 0.0008 (0.0018) model time 0.2286 (0.2295) loss 2.2572 (3.0195) grad_norm 2.7539 (3.1489) loss_scale 512.0000 (312.4669) mem 7379MB [2024-08-27 22:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1020/1251] eta 0:00:53 lr 0.000306 wd 0.0500 time 0.2239 (0.2312) data time 0.0007 (0.0018) model time 0.2232 (0.2295) loss 3.7028 (3.0170) grad_norm 2.3486 (3.1474) loss_scale 512.0000 (314.4212) mem 7379MB [2024-08-27 22:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1030/1251] eta 0:00:51 lr 0.000306 wd 0.0500 time 0.2251 (0.2314) data time 0.0007 (0.0018) model time 0.2244 (0.2297) loss 3.0952 (3.0168) grad_norm 3.2940 (3.1483) loss_scale 512.0000 (316.3375) mem 7379MB [2024-08-27 22:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1040/1251] eta 0:00:48 lr 0.000306 wd 0.0500 time 0.2302 (0.2314) data time 0.0008 (0.0018) model time 0.2295 (0.2297) loss 2.6011 (3.0159) grad_norm 3.2883 
(3.1522) loss_scale 512.0000 (318.2171) mem 7379MB [2024-08-27 22:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1050/1251] eta 0:00:46 lr 0.000306 wd 0.0500 time 0.2293 (0.2314) data time 0.0010 (0.0018) model time 0.2283 (0.2297) loss 3.2803 (3.0160) grad_norm 2.4686 (3.1498) loss_scale 512.0000 (320.0609) mem 7379MB [2024-08-27 22:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1060/1251] eta 0:00:44 lr 0.000306 wd 0.0500 time 0.2340 (0.2314) data time 0.0009 (0.0018) model time 0.2330 (0.2297) loss 1.9561 (3.0154) grad_norm 1.9531 (3.1445) loss_scale 512.0000 (321.8699) mem 7379MB [2024-08-27 22:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1070/1251] eta 0:00:41 lr 0.000306 wd 0.0500 time 0.2239 (0.2314) data time 0.0007 (0.0018) model time 0.2231 (0.2297) loss 3.1281 (3.0157) grad_norm 3.4768 (3.1439) loss_scale 512.0000 (323.6452) mem 7379MB [2024-08-27 22:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1080/1251] eta 0:00:39 lr 0.000306 wd 0.0500 time 0.2246 (0.2314) data time 0.0009 (0.0018) model time 0.2237 (0.2297) loss 2.7499 (3.0174) grad_norm 2.3577 (3.1417) loss_scale 512.0000 (325.3876) mem 7379MB [2024-08-27 22:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1090/1251] eta 0:00:37 lr 0.000306 wd 0.0500 time 0.2354 (0.2314) data time 0.0009 (0.0018) model time 0.2345 (0.2297) loss 2.6546 (3.0174) grad_norm 2.0857 (3.1418) loss_scale 512.0000 (327.0981) mem 7379MB [2024-08-27 22:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1100/1251] eta 0:00:34 lr 0.000306 wd 0.0500 time 0.2318 (0.2314) data time 0.0007 (0.0018) model time 0.2311 (0.2297) loss 3.2338 (3.0194) grad_norm 2.7226 (3.1456) loss_scale 512.0000 (328.7775) mem 7379MB [2024-08-27 22:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1110/1251] eta 0:00:32 lr 0.000306 wd 0.0500 time 0.2293 
(0.2314) data time 0.0011 (0.0017) model time 0.2282 (0.2297) loss 3.3687 (3.0233) grad_norm 2.7326 (3.1852) loss_scale 512.0000 (330.4266) mem 7379MB [2024-08-27 22:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1120/1251] eta 0:00:30 lr 0.000306 wd 0.0500 time 0.2316 (0.2314) data time 0.0007 (0.0017) model time 0.2310 (0.2297) loss 3.1562 (3.0251) grad_norm 3.1524 (3.1855) loss_scale 512.0000 (332.0464) mem 7379MB [2024-08-27 22:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1130/1251] eta 0:00:27 lr 0.000306 wd 0.0500 time 0.2374 (0.2314) data time 0.0007 (0.0017) model time 0.2367 (0.2297) loss 2.5556 (3.0260) grad_norm 2.2921 (3.1850) loss_scale 512.0000 (333.6375) mem 7379MB [2024-08-27 22:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1140/1251] eta 0:00:25 lr 0.000306 wd 0.0500 time 0.2264 (0.2314) data time 0.0012 (0.0017) model time 0.2252 (0.2297) loss 2.9391 (3.0281) grad_norm 2.0076 (3.1798) loss_scale 512.0000 (335.2007) mem 7379MB [2024-08-27 22:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1150/1251] eta 0:00:23 lr 0.000306 wd 0.0500 time 0.2338 (0.2313) data time 0.0009 (0.0017) model time 0.2329 (0.2297) loss 3.1979 (3.0277) grad_norm 5.7673 (3.1830) loss_scale 512.0000 (336.7368) mem 7379MB [2024-08-27 22:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1160/1251] eta 0:00:21 lr 0.000306 wd 0.0500 time 0.2238 (0.2313) data time 0.0008 (0.0017) model time 0.2230 (0.2297) loss 2.5951 (3.0260) grad_norm 2.5892 (3.1853) loss_scale 512.0000 (338.2463) mem 7379MB [2024-08-27 22:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1170/1251] eta 0:00:18 lr 0.000306 wd 0.0500 time 0.2224 (0.2313) data time 0.0007 (0.0017) model time 0.2217 (0.2297) loss 2.8510 (3.0290) grad_norm 2.4213 (3.1843) loss_scale 512.0000 (339.7301) mem 7379MB [2024-08-27 22:03:25 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [196/300][1180/1251] eta 0:00:16 lr 0.000306 wd 0.0500 time 0.2293 (0.2313) data time 0.0008 (0.0017) model time 0.2285 (0.2296) loss 3.0756 (3.0300) grad_norm 2.4931 (3.1806) loss_scale 512.0000 (341.1888) mem 7379MB [2024-08-27 22:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1190/1251] eta 0:00:14 lr 0.000306 wd 0.0500 time 0.2295 (0.2312) data time 0.0007 (0.0017) model time 0.2288 (0.2296) loss 3.7479 (3.0300) grad_norm 3.3032 (3.1777) loss_scale 512.0000 (342.6230) mem 7379MB [2024-08-27 22:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1200/1251] eta 0:00:11 lr 0.000306 wd 0.0500 time 0.2329 (0.2312) data time 0.0007 (0.0017) model time 0.2322 (0.2296) loss 1.9666 (3.0280) grad_norm 3.6244 (3.1824) loss_scale 512.0000 (344.0333) mem 7379MB [2024-08-27 22:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1210/1251] eta 0:00:09 lr 0.000306 wd 0.0500 time 0.2248 (0.2312) data time 0.0007 (0.0017) model time 0.2241 (0.2296) loss 1.7873 (3.0255) grad_norm 2.9871 (3.1816) loss_scale 512.0000 (345.4203) mem 7379MB [2024-08-27 22:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1220/1251] eta 0:00:07 lr 0.000305 wd 0.0500 time 0.2310 (0.2312) data time 0.0008 (0.0017) model time 0.2302 (0.2296) loss 3.6729 (3.0259) grad_norm 2.9512 (3.1940) loss_scale 512.0000 (346.7846) mem 7379MB [2024-08-27 22:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1230/1251] eta 0:00:04 lr 0.000305 wd 0.0500 time 0.2253 (0.2312) data time 0.0008 (0.0017) model time 0.2245 (0.2296) loss 3.7842 (3.0268) grad_norm 2.5160 (3.1933) loss_scale 512.0000 (348.1267) mem 7379MB [2024-08-27 22:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1240/1251] eta 0:00:02 lr 0.000305 wd 0.0500 time 0.2109 (0.2311) data time 0.0004 (0.0017) model time 0.2104 (0.2295) loss 3.4983 (3.0272) grad_norm 4.0826 
(3.1921) loss_scale 512.0000 (349.4472) mem 7379MB [2024-08-27 22:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [196/300][1250/1251] eta 0:00:00 lr 0.000305 wd 0.0500 time 0.2109 (0.2310) data time 0.0004 (0.0017) model time 0.2105 (0.2294) loss 3.2984 (3.0273) grad_norm 2.9808 (3.1913) loss_scale 512.0000 (350.7466) mem 7379MB [2024-08-27 22:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 196 training takes 0:04:48 [2024-08-27 22:03:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:03:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.495 (0.495) Loss 0.4512 (0.4512) Acc@1 91.406 (91.406) Acc@5 97.949 (97.949) Mem 7379MB [2024-08-27 22:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.123) Loss 0.6597 (0.6750) Acc@1 86.914 (85.636) Acc@5 97.070 (97.159) Mem 7379MB [2024-08-27 22:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.096 (0.104) Loss 0.9414 (0.6972) Acc@1 77.441 (84.803) Acc@5 95.703 (97.256) Mem 7379MB [2024-08-27 22:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.096) Loss 1.1562 (0.7956) Acc@1 72.559 (82.523) Acc@5 92.090 (96.204) Mem 7379MB [2024-08-27 22:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.089) Loss 1.0312 (0.8453) Acc@1 75.684 (81.226) Acc@5 94.434 (95.658) Mem 7379MB [2024-08-27 22:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.836 Acc@5 95.618 [2024-08-27 22:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.8% [2024-08-27 22:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.84% 
[2024-08-27 22:03:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 22:03:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-27 22:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.489 (0.489) Loss 0.3894 (0.3894) Acc@1 93.262 (93.262) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-27 22:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.126) Loss 0.6060 (0.6126) Acc@1 88.672 (87.225) Acc@5 97.461 (97.541) Mem 7379MB [2024-08-27 22:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.101 (0.104) Loss 0.8652 (0.6381) Acc@1 79.004 (86.226) Acc@5 96.191 (97.582) Mem 7379MB [2024-08-27 22:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.097) Loss 1.1055 (0.7224) Acc@1 72.852 (84.120) Acc@5 92.773 (96.689) Mem 7379MB [2024-08-27 22:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.089) Loss 0.9785 (0.7655) Acc@1 76.270 (82.815) Acc@5 94.629 (96.208) Mem 7379MB [2024-08-27 22:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.386 Acc@5 96.200 [2024-08-27 22:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-08-27 22:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.39% [2024-08-27 22:03:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 22:03:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 22:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][0/1251] eta 0:16:08 lr 0.000305 wd 0.0500 time 0.7742 (0.7742) data time 0.5139 (0.5139) model time 0.0000 (0.0000) loss 3.6166 (3.6166) grad_norm 2.6214 (2.6214) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][10/1251] eta 0:05:47 lr 0.000305 wd 0.0500 time 0.2297 (0.2800) data time 0.0012 (0.0481) model time 0.0000 (0.0000) loss 3.1019 (3.1476) grad_norm 3.4627 (3.3853) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][20/1251] eta 0:05:14 lr 0.000305 wd 0.0500 time 0.2252 (0.2558) data time 0.0011 (0.0257) model time 0.0000 (0.0000) loss 3.1009 (3.0559) grad_norm 2.1328 (3.2937) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][30/1251] eta 0:05:03 lr 0.000305 wd 0.0500 time 0.2429 (0.2483) data time 0.0010 (0.0177) model time 0.0000 (0.0000) loss 2.5925 (3.0601) grad_norm 1.9804 (3.3622) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][40/1251] eta 0:04:55 lr 0.000305 wd 0.0500 time 0.2347 (0.2438) data time 0.0010 (0.0137) model time 0.0000 (0.0000) loss 2.2819 (3.0720) grad_norm 3.0536 (3.2901) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][50/1251] eta 0:04:50 lr 0.000305 wd 0.0500 time 0.2555 (0.2418) data time 0.0008 (0.0113) model time 0.0000 (0.0000) loss 2.3052 (3.0569) grad_norm 2.6996 (3.2270) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][60/1251] eta 0:04:44 lr 0.000305 wd 0.0500 time 0.2274 (0.2393) data time 0.0006 (0.0096) model time 0.2268 (0.2254) loss 
2.4164 (3.0160) grad_norm 2.6489 (3.2299) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][70/1251] eta 0:04:41 lr 0.000305 wd 0.0500 time 0.2405 (0.2381) data time 0.0012 (0.0085) model time 0.2394 (0.2274) loss 3.1531 (3.0257) grad_norm 3.8695 (3.4142) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][80/1251] eta 0:04:37 lr 0.000305 wd 0.0500 time 0.2293 (0.2372) data time 0.0008 (0.0076) model time 0.2286 (0.2281) loss 3.8326 (3.0344) grad_norm 2.5240 (3.3878) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][90/1251] eta 0:04:34 lr 0.000305 wd 0.0500 time 0.2220 (0.2364) data time 0.0010 (0.0070) model time 0.2210 (0.2281) loss 1.9182 (2.9933) grad_norm 4.5647 (3.4531) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][100/1251] eta 0:04:31 lr 0.000305 wd 0.0500 time 0.2278 (0.2355) data time 0.0011 (0.0064) model time 0.2267 (0.2278) loss 3.3435 (2.9880) grad_norm 3.5301 (3.4430) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][110/1251] eta 0:04:28 lr 0.000305 wd 0.0500 time 0.2323 (0.2351) data time 0.0007 (0.0059) model time 0.2316 (0.2280) loss 3.7696 (2.9930) grad_norm 2.5141 (3.3926) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][120/1251] eta 0:04:25 lr 0.000305 wd 0.0500 time 0.2525 (0.2349) data time 0.0009 (0.0055) model time 0.2516 (0.2286) loss 3.3554 (2.9936) grad_norm 2.3692 (3.3388) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][130/1251] eta 0:04:22 lr 0.000305 wd 
0.0500 time 0.2350 (0.2345) data time 0.0009 (0.0052) model time 0.2341 (0.2285) loss 3.0257 (3.0106) grad_norm 2.6609 (3.3340) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][140/1251] eta 0:04:20 lr 0.000305 wd 0.0500 time 0.2328 (0.2341) data time 0.0008 (0.0049) model time 0.2320 (0.2285) loss 3.0010 (2.9946) grad_norm 5.7740 (3.3957) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][150/1251] eta 0:04:17 lr 0.000305 wd 0.0500 time 0.2294 (0.2337) data time 0.0010 (0.0046) model time 0.2283 (0.2283) loss 2.6670 (2.9921) grad_norm 2.2174 (3.3531) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][160/1251] eta 0:04:14 lr 0.000305 wd 0.0500 time 0.2241 (0.2333) data time 0.0011 (0.0044) model time 0.2231 (0.2282) loss 2.5759 (2.9949) grad_norm 2.7847 (3.3214) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][170/1251] eta 0:04:11 lr 0.000305 wd 0.0500 time 0.2333 (0.2330) data time 0.0010 (0.0042) model time 0.2323 (0.2281) loss 3.7308 (2.9850) grad_norm 2.8809 (3.2920) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][180/1251] eta 0:04:09 lr 0.000305 wd 0.0500 time 0.2306 (0.2329) data time 0.0008 (0.0040) model time 0.2299 (0.2283) loss 3.3040 (2.9946) grad_norm 3.6084 (3.2877) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][190/1251] eta 0:04:06 lr 0.000305 wd 0.0500 time 0.2321 (0.2327) data time 0.0009 (0.0039) model time 0.2311 (0.2282) loss 3.4461 (3.0056) grad_norm 3.6978 (3.2802) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:38 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [197/300][200/1251] eta 0:04:04 lr 0.000305 wd 0.0500 time 0.2342 (0.2325) data time 0.0007 (0.0037) model time 0.2335 (0.2282) loss 3.7809 (3.0136) grad_norm 2.1689 (3.2937) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][210/1251] eta 0:04:01 lr 0.000305 wd 0.0500 time 0.2305 (0.2323) data time 0.0009 (0.0036) model time 0.2296 (0.2282) loss 3.0975 (3.0238) grad_norm 2.6484 (3.2806) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][220/1251] eta 0:03:59 lr 0.000304 wd 0.0500 time 0.2278 (0.2321) data time 0.0011 (0.0035) model time 0.2267 (0.2280) loss 2.4671 (3.0220) grad_norm 3.4871 (3.2701) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][230/1251] eta 0:03:56 lr 0.000304 wd 0.0500 time 0.2211 (0.2319) data time 0.0014 (0.0034) model time 0.2197 (0.2280) loss 3.7092 (3.0235) grad_norm 2.9721 (3.2688) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][240/1251] eta 0:03:54 lr 0.000304 wd 0.0500 time 0.2299 (0.2319) data time 0.0012 (0.0033) model time 0.2288 (0.2281) loss 3.1442 (3.0236) grad_norm 2.6773 (3.2495) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][250/1251] eta 0:03:52 lr 0.000304 wd 0.0500 time 0.2227 (0.2318) data time 0.0013 (0.0032) model time 0.2214 (0.2282) loss 3.1842 (3.0296) grad_norm 3.8715 (3.2442) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][260/1251] eta 0:03:49 lr 0.000304 wd 0.0500 time 0.2318 (0.2317) data time 0.0007 (0.0031) model time 0.2310 (0.2281) loss 3.2980 (3.0399) grad_norm 3.4545 
(3.2276) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][270/1251] eta 0:03:47 lr 0.000304 wd 0.0500 time 0.2419 (0.2317) data time 0.0007 (0.0030) model time 0.2412 (0.2282) loss 3.5529 (3.0425) grad_norm 2.8077 (3.2059) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][280/1251] eta 0:03:45 lr 0.000304 wd 0.0500 time 0.2214 (0.2323) data time 0.0007 (0.0030) model time 0.2207 (0.2290) loss 3.3752 (3.0384) grad_norm 2.0003 (3.1929) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][290/1251] eta 0:03:43 lr 0.000304 wd 0.0500 time 0.2274 (0.2321) data time 0.0009 (0.0029) model time 0.2265 (0.2289) loss 2.0561 (3.0394) grad_norm 2.6970 (3.1967) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][300/1251] eta 0:03:40 lr 0.000304 wd 0.0500 time 0.2314 (0.2321) data time 0.0014 (0.0028) model time 0.2300 (0.2290) loss 3.1237 (3.0427) grad_norm 3.0872 (3.1894) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][310/1251] eta 0:03:38 lr 0.000304 wd 0.0500 time 0.2323 (0.2320) data time 0.0007 (0.0028) model time 0.2316 (0.2289) loss 3.6858 (3.0457) grad_norm 4.7001 (3.2208) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][320/1251] eta 0:03:35 lr 0.000304 wd 0.0500 time 0.2221 (0.2319) data time 0.0012 (0.0027) model time 0.2209 (0.2290) loss 3.2741 (3.0402) grad_norm 3.7658 (3.2297) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][330/1251] eta 0:03:33 lr 0.000304 wd 0.0500 time 0.2233 (0.2318) data 
time 0.0007 (0.0027) model time 0.2226 (0.2289) loss 3.7538 (3.0415) grad_norm 2.6460 (3.2272) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][340/1251] eta 0:03:31 lr 0.000304 wd 0.0500 time 0.2241 (0.2317) data time 0.0009 (0.0026) model time 0.2232 (0.2288) loss 3.9294 (3.0495) grad_norm 2.6417 (3.2201) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][350/1251] eta 0:03:28 lr 0.000304 wd 0.0500 time 0.2268 (0.2316) data time 0.0007 (0.0026) model time 0.2261 (0.2287) loss 3.4323 (3.0497) grad_norm 3.0093 (3.2243) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][360/1251] eta 0:03:26 lr 0.000304 wd 0.0500 time 0.2281 (0.2315) data time 0.0011 (0.0026) model time 0.2270 (0.2287) loss 3.7417 (3.0569) grad_norm 2.2093 (3.2258) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][370/1251] eta 0:03:23 lr 0.000304 wd 0.0500 time 0.2181 (0.2313) data time 0.0010 (0.0025) model time 0.2171 (0.2286) loss 3.2825 (3.0610) grad_norm 3.0608 (3.2384) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][380/1251] eta 0:03:21 lr 0.000304 wd 0.0500 time 0.2232 (0.2313) data time 0.0017 (0.0025) model time 0.2214 (0.2286) loss 2.2966 (3.0541) grad_norm 4.6745 (3.2428) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][390/1251] eta 0:03:19 lr 0.000304 wd 0.0500 time 0.2412 (0.2314) data time 0.0012 (0.0025) model time 0.2400 (0.2287) loss 3.1361 (3.0619) grad_norm 3.2868 (3.2367) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [197/300][400/1251] eta 0:03:16 lr 0.000304 wd 0.0500 time 0.2296 (0.2313) data time 0.0012 (0.0024) model time 0.2284 (0.2287) loss 3.0447 (3.0618) grad_norm 2.6791 (3.2349) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][410/1251] eta 0:03:14 lr 0.000304 wd 0.0500 time 0.2272 (0.2313) data time 0.0011 (0.0024) model time 0.2262 (0.2287) loss 3.2971 (3.0617) grad_norm 2.2297 (3.2278) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][420/1251] eta 0:03:12 lr 0.000304 wd 0.0500 time 0.2255 (0.2312) data time 0.0014 (0.0024) model time 0.2241 (0.2286) loss 3.2722 (3.0645) grad_norm 2.9873 (3.2200) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][430/1251] eta 0:03:09 lr 0.000304 wd 0.0500 time 0.2311 (0.2312) data time 0.0009 (0.0023) model time 0.2302 (0.2286) loss 3.5230 (3.0627) grad_norm 3.1760 (3.2191) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][440/1251] eta 0:03:07 lr 0.000304 wd 0.0500 time 0.2342 (0.2311) data time 0.0010 (0.0023) model time 0.2332 (0.2286) loss 3.2466 (3.0580) grad_norm 2.4894 (3.2066) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][450/1251] eta 0:03:05 lr 0.000304 wd 0.0500 time 0.2254 (0.2312) data time 0.0010 (0.0023) model time 0.2244 (0.2287) loss 3.5550 (3.0613) grad_norm 4.9090 (3.2047) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][460/1251] eta 0:03:02 lr 0.000303 wd 0.0500 time 0.2337 (0.2312) data time 0.0010 (0.0023) model time 0.2327 (0.2287) loss 1.9463 (3.0568) grad_norm 6.6670 (3.2282) loss_scale 512.0000 
(512.0000) mem 7379MB [2024-08-27 22:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][470/1251] eta 0:03:00 lr 0.000303 wd 0.0500 time 0.2330 (0.2311) data time 0.0007 (0.0023) model time 0.2323 (0.2287) loss 3.4963 (3.0562) grad_norm 2.7558 (3.2365) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][480/1251] eta 0:02:58 lr 0.000303 wd 0.0500 time 0.2263 (0.2311) data time 0.0011 (0.0022) model time 0.2252 (0.2286) loss 3.0998 (3.0548) grad_norm 2.5572 (3.2297) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][490/1251] eta 0:02:56 lr 0.000303 wd 0.0500 time 0.2822 (0.2313) data time 0.0007 (0.0022) model time 0.2815 (0.2289) loss 3.6770 (3.0546) grad_norm 2.6346 (3.2279) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][500/1251] eta 0:02:53 lr 0.000303 wd 0.0500 time 0.3158 (0.2315) data time 0.0012 (0.0022) model time 0.3146 (0.2292) loss 1.7803 (3.0523) grad_norm 2.5792 (3.2172) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][510/1251] eta 0:02:51 lr 0.000303 wd 0.0500 time 0.2292 (0.2314) data time 0.0007 (0.0022) model time 0.2285 (0.2291) loss 2.3929 (3.0504) grad_norm 2.7895 (3.2125) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][520/1251] eta 0:02:49 lr 0.000303 wd 0.0500 time 0.2337 (0.2314) data time 0.0008 (0.0022) model time 0.2329 (0.2291) loss 3.0160 (3.0487) grad_norm 2.7362 (3.2089) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][530/1251] eta 0:02:46 lr 0.000303 wd 0.0500 time 0.2319 (0.2315) data time 0.0007 (0.0022) model 
time 0.2312 (0.2292) loss 2.6829 (3.0449) grad_norm 3.4449 (3.2174) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][540/1251] eta 0:02:44 lr 0.000303 wd 0.0500 time 0.2292 (0.2318) data time 0.0009 (0.0021) model time 0.2283 (0.2296) loss 4.1608 (3.0494) grad_norm 3.3579 (3.2137) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][550/1251] eta 0:02:42 lr 0.000303 wd 0.0500 time 0.2385 (0.2322) data time 0.0010 (0.0021) model time 0.2376 (0.2300) loss 3.2618 (3.0459) grad_norm 3.2861 (3.2081) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][560/1251] eta 0:02:40 lr 0.000303 wd 0.0500 time 0.2462 (0.2322) data time 0.0008 (0.0021) model time 0.2455 (0.2301) loss 3.2523 (3.0428) grad_norm 3.0538 (3.2052) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][570/1251] eta 0:02:38 lr 0.000303 wd 0.0500 time 0.2332 (0.2322) data time 0.0009 (0.0021) model time 0.2324 (0.2301) loss 2.2202 (3.0375) grad_norm 2.6587 (3.2026) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][580/1251] eta 0:02:35 lr 0.000303 wd 0.0500 time 0.2311 (0.2321) data time 0.0011 (0.0021) model time 0.2300 (0.2300) loss 3.1475 (3.0389) grad_norm 3.3615 (3.2018) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][590/1251] eta 0:02:33 lr 0.000303 wd 0.0500 time 0.2236 (0.2321) data time 0.0012 (0.0020) model time 0.2224 (0.2300) loss 3.2825 (3.0393) grad_norm 3.2731 (3.1957) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][600/1251] 
eta 0:02:31 lr 0.000303 wd 0.0500 time 0.2717 (0.2321) data time 0.0009 (0.0020) model time 0.2708 (0.2300) loss 3.2948 (3.0404) grad_norm 2.2375 (3.2027) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][610/1251] eta 0:02:28 lr 0.000303 wd 0.0500 time 0.2236 (0.2320) data time 0.0009 (0.0020) model time 0.2227 (0.2300) loss 3.6792 (3.0394) grad_norm 2.8550 (3.1996) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][620/1251] eta 0:02:26 lr 0.000303 wd 0.0500 time 0.2319 (0.2320) data time 0.0008 (0.0020) model time 0.2311 (0.2299) loss 1.8910 (3.0366) grad_norm 3.6067 (3.1988) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][630/1251] eta 0:02:24 lr 0.000303 wd 0.0500 time 0.2318 (0.2320) data time 0.0009 (0.0020) model time 0.2309 (0.2299) loss 3.0809 (3.0355) grad_norm 2.5219 (3.1956) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][640/1251] eta 0:02:21 lr 0.000303 wd 0.0500 time 0.2384 (0.2319) data time 0.0009 (0.0020) model time 0.2374 (0.2299) loss 3.2201 (3.0346) grad_norm 2.0894 (3.1907) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][650/1251] eta 0:02:19 lr 0.000303 wd 0.0500 time 0.2416 (0.2319) data time 0.0009 (0.0020) model time 0.2407 (0.2299) loss 3.0090 (3.0358) grad_norm 3.0212 (3.1873) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][660/1251] eta 0:02:17 lr 0.000303 wd 0.0500 time 0.2299 (0.2319) data time 0.0009 (0.0020) model time 0.2290 (0.2299) loss 3.7043 (3.0390) grad_norm 2.9959 (3.1835) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 
22:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][670/1251] eta 0:02:14 lr 0.000303 wd 0.0500 time 0.2355 (0.2319) data time 0.0008 (0.0020) model time 0.2347 (0.2299) loss 3.4805 (3.0414) grad_norm 2.3959 (3.2102) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][680/1251] eta 0:02:12 lr 0.000303 wd 0.0500 time 0.2467 (0.2319) data time 0.0009 (0.0019) model time 0.2458 (0.2299) loss 2.2239 (3.0439) grad_norm 2.5461 (3.2128) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][690/1251] eta 0:02:10 lr 0.000303 wd 0.0500 time 0.2491 (0.2319) data time 0.0012 (0.0019) model time 0.2480 (0.2299) loss 3.0748 (3.0419) grad_norm 2.6880 (3.2129) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][700/1251] eta 0:02:07 lr 0.000303 wd 0.0500 time 0.2401 (0.2319) data time 0.0012 (0.0019) model time 0.2389 (0.2299) loss 2.6800 (3.0402) grad_norm 1.9610 (3.2110) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][710/1251] eta 0:02:05 lr 0.000302 wd 0.0500 time 0.2536 (0.2319) data time 0.0007 (0.0019) model time 0.2529 (0.2300) loss 3.7659 (3.0375) grad_norm 2.8659 (3.2096) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][720/1251] eta 0:02:03 lr 0.000302 wd 0.0500 time 0.2510 (0.2319) data time 0.0007 (0.0019) model time 0.2503 (0.2299) loss 3.0288 (3.0377) grad_norm 3.0431 (3.2063) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][730/1251] eta 0:02:00 lr 0.000302 wd 0.0500 time 0.2410 (0.2318) data time 0.0010 (0.0019) model time 0.2400 (0.2299) loss 3.2284 
(3.0347) grad_norm 2.8791 (3.2003) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][740/1251] eta 0:01:58 lr 0.000302 wd 0.0500 time 0.2436 (0.2318) data time 0.0007 (0.0019) model time 0.2428 (0.2299) loss 3.1170 (3.0354) grad_norm 3.4879 (3.1958) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][750/1251] eta 0:01:56 lr 0.000302 wd 0.0500 time 0.2293 (0.2318) data time 0.0013 (0.0019) model time 0.2280 (0.2299) loss 3.3297 (3.0362) grad_norm 3.4025 (3.2001) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][760/1251] eta 0:01:53 lr 0.000302 wd 0.0500 time 0.2389 (0.2318) data time 0.0010 (0.0019) model time 0.2379 (0.2299) loss 3.3862 (3.0348) grad_norm 5.2423 (3.2092) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][770/1251] eta 0:01:51 lr 0.000302 wd 0.0500 time 0.2287 (0.2318) data time 0.0007 (0.0018) model time 0.2280 (0.2299) loss 2.7556 (3.0343) grad_norm 3.6979 (3.2087) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][780/1251] eta 0:01:49 lr 0.000302 wd 0.0500 time 0.2271 (0.2318) data time 0.0007 (0.0018) model time 0.2264 (0.2299) loss 2.7192 (3.0351) grad_norm 3.5287 (3.2175) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][790/1251] eta 0:01:46 lr 0.000302 wd 0.0500 time 0.2279 (0.2318) data time 0.0011 (0.0018) model time 0.2268 (0.2299) loss 2.7246 (3.0362) grad_norm 2.9499 (3.2199) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][800/1251] eta 0:01:44 lr 0.000302 wd 0.0500 
time 0.2346 (0.2318) data time 0.0010 (0.0018) model time 0.2336 (0.2299) loss 3.2705 (3.0389) grad_norm 3.5038 (3.2318) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][810/1251] eta 0:01:42 lr 0.000302 wd 0.0500 time 0.2332 (0.2318) data time 0.0012 (0.0018) model time 0.2320 (0.2299) loss 2.8246 (3.0386) grad_norm 2.2833 (3.2263) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][820/1251] eta 0:01:39 lr 0.000302 wd 0.0500 time 0.2381 (0.2317) data time 0.0007 (0.0018) model time 0.2374 (0.2299) loss 1.8426 (3.0359) grad_norm 3.1836 (3.2227) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][830/1251] eta 0:01:37 lr 0.000302 wd 0.0500 time 0.2283 (0.2317) data time 0.0010 (0.0018) model time 0.2273 (0.2299) loss 2.3141 (3.0361) grad_norm 2.8322 (3.2209) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][840/1251] eta 0:01:35 lr 0.000302 wd 0.0500 time 0.2287 (0.2317) data time 0.0012 (0.0018) model time 0.2275 (0.2298) loss 2.4249 (3.0372) grad_norm 2.9843 (3.2160) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][850/1251] eta 0:01:32 lr 0.000302 wd 0.0500 time 0.2476 (0.2317) data time 0.0009 (0.0018) model time 0.2468 (0.2299) loss 1.7205 (3.0355) grad_norm 2.8153 (3.2159) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][860/1251] eta 0:01:30 lr 0.000302 wd 0.0500 time 0.2295 (0.2317) data time 0.0010 (0.0018) model time 0.2286 (0.2299) loss 2.6275 (3.0333) grad_norm 3.3344 (3.2143) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:13 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [197/300][870/1251] eta 0:01:28 lr 0.000302 wd 0.0500 time 0.2392 (0.2317) data time 0.0009 (0.0018) model time 0.2383 (0.2298) loss 3.3138 (3.0331) grad_norm 2.9697 (3.2255) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][880/1251] eta 0:01:25 lr 0.000302 wd 0.0500 time 0.2389 (0.2317) data time 0.0010 (0.0018) model time 0.2379 (0.2298) loss 3.5935 (3.0359) grad_norm 4.6916 (3.2418) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][890/1251] eta 0:01:23 lr 0.000302 wd 0.0500 time 0.2362 (0.2317) data time 0.0011 (0.0018) model time 0.2351 (0.2298) loss 3.0756 (3.0339) grad_norm 3.4529 (3.2809) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][900/1251] eta 0:01:21 lr 0.000302 wd 0.0500 time 0.2331 (0.2316) data time 0.0008 (0.0018) model time 0.2324 (0.2298) loss 3.1085 (3.0343) grad_norm 3.9589 (3.2944) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][910/1251] eta 0:01:18 lr 0.000302 wd 0.0500 time 0.2323 (0.2316) data time 0.0010 (0.0018) model time 0.2313 (0.2298) loss 3.1227 (3.0344) grad_norm 2.9253 (3.2909) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][920/1251] eta 0:01:16 lr 0.000302 wd 0.0500 time 0.2335 (0.2316) data time 0.0009 (0.0017) model time 0.2326 (0.2298) loss 3.1023 (3.0339) grad_norm 3.4675 (3.3421) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][930/1251] eta 0:01:14 lr 0.000302 wd 0.0500 time 0.2278 (0.2316) data time 0.0011 (0.0017) model time 0.2267 (0.2298) loss 3.1255 (3.0320) grad_norm 2.7853 
(3.3418) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][940/1251] eta 0:01:12 lr 0.000302 wd 0.0500 time 0.2380 (0.2315) data time 0.0009 (0.0017) model time 0.2371 (0.2298) loss 3.3648 (3.0340) grad_norm 2.9028 (3.3384) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][950/1251] eta 0:01:09 lr 0.000302 wd 0.0500 time 0.2297 (0.2315) data time 0.0011 (0.0017) model time 0.2286 (0.2298) loss 2.8008 (3.0348) grad_norm 3.1227 (3.3327) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][960/1251] eta 0:01:07 lr 0.000301 wd 0.0500 time 0.2337 (0.2316) data time 0.0008 (0.0017) model time 0.2329 (0.2298) loss 2.1497 (3.0336) grad_norm 3.1693 (3.3289) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][970/1251] eta 0:01:05 lr 0.000301 wd 0.0500 time 0.2261 (0.2315) data time 0.0008 (0.0017) model time 0.2253 (0.2298) loss 1.8117 (3.0324) grad_norm 1.9962 (3.3239) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][980/1251] eta 0:01:02 lr 0.000301 wd 0.0500 time 0.2296 (0.2315) data time 0.0010 (0.0017) model time 0.2286 (0.2298) loss 3.3768 (3.0340) grad_norm 2.5191 (3.3153) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][990/1251] eta 0:01:00 lr 0.000301 wd 0.0500 time 0.2381 (0.2315) data time 0.0007 (0.0017) model time 0.2373 (0.2298) loss 2.6889 (3.0328) grad_norm 3.7048 (3.3129) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1000/1251] eta 0:00:58 lr 0.000301 wd 0.0500 time 0.2300 (0.2316) 
data time 0.0010 (0.0017) model time 0.2290 (0.2298) loss 2.3111 (3.0330) grad_norm 3.0135 (3.3095) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1010/1251] eta 0:00:55 lr 0.000301 wd 0.0500 time 0.2317 (0.2316) data time 0.0007 (0.0017) model time 0.2310 (0.2298) loss 2.5854 (3.0324) grad_norm 3.5510 (3.3095) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1020/1251] eta 0:00:53 lr 0.000301 wd 0.0500 time 0.2330 (0.2315) data time 0.0007 (0.0017) model time 0.2323 (0.2298) loss 3.5044 (3.0341) grad_norm 2.2714 (3.3211) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1030/1251] eta 0:00:51 lr 0.000301 wd 0.0500 time 0.2367 (0.2315) data time 0.0012 (0.0017) model time 0.2355 (0.2298) loss 3.2892 (3.0355) grad_norm 2.6345 (3.3190) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1040/1251] eta 0:00:48 lr 0.000301 wd 0.0500 time 0.2339 (0.2315) data time 0.0009 (0.0017) model time 0.2330 (0.2297) loss 1.8205 (3.0340) grad_norm 3.2762 (3.3242) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1050/1251] eta 0:00:46 lr 0.000301 wd 0.0500 time 0.2425 (0.2315) data time 0.0011 (0.0017) model time 0.2414 (0.2298) loss 2.0485 (3.0326) grad_norm 3.6350 (3.3187) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1060/1251] eta 0:00:44 lr 0.000301 wd 0.0500 time 0.2278 (0.2315) data time 0.0013 (0.0017) model time 0.2266 (0.2298) loss 3.3272 (3.0336) grad_norm 2.7387 (3.3193) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [197/300][1070/1251] eta 0:00:41 lr 0.000301 wd 0.0500 time 0.2590 (0.2315) data time 0.0010 (0.0017) model time 0.2581 (0.2298) loss 3.1489 (3.0350) grad_norm 3.3497 (3.3181) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1080/1251] eta 0:00:39 lr 0.000301 wd 0.0500 time 0.2203 (0.2320) data time 0.0011 (0.0017) model time 0.2192 (0.2302) loss 3.2186 (3.0321) grad_norm 2.5107 (3.3165) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1090/1251] eta 0:00:37 lr 0.000301 wd 0.0500 time 0.2322 (0.2321) data time 0.0014 (0.0017) model time 0.2308 (0.2304) loss 3.4340 (3.0340) grad_norm 2.5265 (3.3124) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1100/1251] eta 0:00:35 lr 0.000301 wd 0.0500 time 0.2229 (0.2321) data time 0.0013 (0.0017) model time 0.2216 (0.2304) loss 3.3039 (3.0341) grad_norm 2.2322 (3.3064) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1110/1251] eta 0:00:32 lr 0.000301 wd 0.0500 time 0.2395 (0.2321) data time 0.0006 (0.0017) model time 0.2388 (0.2304) loss 3.0416 (3.0342) grad_norm 2.7113 (3.3031) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1120/1251] eta 0:00:30 lr 0.000301 wd 0.0500 time 0.2267 (0.2321) data time 0.0007 (0.0017) model time 0.2260 (0.2303) loss 3.5520 (3.0337) grad_norm 2.7113 (3.3001) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1130/1251] eta 0:00:28 lr 0.000301 wd 0.0500 time 0.2302 (0.2321) data time 0.0007 (0.0017) model time 0.2295 (0.2303) loss 3.2345 (3.0336) grad_norm 3.0360 (3.2991) loss_scale 
512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1140/1251] eta 0:00:25 lr 0.000301 wd 0.0500 time 0.2229 (0.2320) data time 0.0009 (0.0017) model time 0.2220 (0.2303) loss 3.5770 (3.0326) grad_norm 2.4703 (3.2976) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1150/1251] eta 0:00:23 lr 0.000301 wd 0.0500 time 0.2327 (0.2320) data time 0.0012 (0.0017) model time 0.2315 (0.2303) loss 3.2218 (3.0331) grad_norm 2.9244 (3.3018) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1160/1251] eta 0:00:21 lr 0.000301 wd 0.0500 time 0.2252 (0.2320) data time 0.0010 (0.0017) model time 0.2242 (0.2303) loss 3.0297 (3.0320) grad_norm 4.1417 (3.3037) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1170/1251] eta 0:00:18 lr 0.000301 wd 0.0500 time 0.2205 (0.2320) data time 0.0011 (0.0017) model time 0.2194 (0.2303) loss 2.6280 (3.0311) grad_norm 2.2995 (3.3036) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1180/1251] eta 0:00:16 lr 0.000301 wd 0.0500 time 0.2210 (0.2320) data time 0.0016 (0.0017) model time 0.2194 (0.2303) loss 3.3414 (3.0331) grad_norm 2.5974 (3.3012) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1190/1251] eta 0:00:14 lr 0.000301 wd 0.0500 time 0.2289 (0.2320) data time 0.0010 (0.0017) model time 0.2279 (0.2303) loss 3.2999 (3.0325) grad_norm 2.4906 (3.2965) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1200/1251] eta 0:00:11 lr 0.000300 wd 0.0500 time 0.2388 (0.2319) data time 0.0010 
(0.0017) model time 0.2378 (0.2302) loss 3.1368 (3.0331) grad_norm 2.9849 (3.2928) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1210/1251] eta 0:00:09 lr 0.000300 wd 0.0500 time 0.2233 (0.2321) data time 0.0007 (0.0017) model time 0.2226 (0.2304) loss 2.0925 (3.0344) grad_norm 2.4397 (3.2907) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1220/1251] eta 0:00:07 lr 0.000300 wd 0.0500 time 0.2238 (0.2321) data time 0.0008 (0.0017) model time 0.2230 (0.2304) loss 3.3874 (3.0350) grad_norm 3.3034 (3.2949) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1230/1251] eta 0:00:04 lr 0.000300 wd 0.0500 time 0.2306 (0.2321) data time 0.0011 (0.0017) model time 0.2295 (0.2304) loss 2.4951 (3.0337) grad_norm 2.5525 (3.2922) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1240/1251] eta 0:00:02 lr 0.000300 wd 0.0500 time 0.2126 (0.2320) data time 0.0005 (0.0017) model time 0.2122 (0.2303) loss 3.7642 (3.0355) grad_norm 3.1695 (3.2929) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [197/300][1250/1251] eta 0:00:00 lr 0.000300 wd 0.0500 time 0.2116 (0.2318) data time 0.0004 (0.0017) model time 0.2112 (0.2301) loss 2.6665 (3.0351) grad_norm 4.3693 (3.3024) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 197 training takes 0:04:49 [2024-08-27 22:08:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 22:08:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.534 (0.534) Loss 0.4331 (0.4331) Acc@1 92.090 (92.090) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-27 22:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.121) Loss 0.7061 (0.6830) Acc@1 86.035 (85.476) Acc@5 96.875 (97.186) Mem 7379MB [2024-08-27 22:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.103) Loss 0.9912 (0.7077) Acc@1 76.953 (84.519) Acc@5 95.117 (97.173) Mem 7379MB [2024-08-27 22:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.096) Loss 1.2129 (0.8022) Acc@1 70.605 (82.302) Acc@5 91.602 (96.144) Mem 7379MB [2024-08-27 22:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.1035 (0.8538) Acc@1 72.949 (81.062) Acc@5 93.652 (95.577) Mem 7379MB [2024-08-27 22:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.734 Acc@5 95.556 [2024-08-27 22:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.7% [2024-08-27 22:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.973 (0.973) Loss 0.3884 (0.3884) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7379MB [2024-08-27 22:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.165) Loss 0.6035 (0.6116) Acc@1 88.574 (87.251) Acc@5 97.559 (97.541) Mem 7379MB [2024-08-27 22:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.127) Loss 0.8633 (0.6373) Acc@1 78.906 (86.198) Acc@5 96.191 (97.582) Mem 7379MB [2024-08-27 22:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.111) Loss 1.1035 (0.7217) Acc@1 72.754 (84.069) Acc@5 
92.773 (96.699) Mem 7379MB [2024-08-27 22:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.100) Loss 0.9790 (0.7648) Acc@1 76.270 (82.789) Acc@5 94.629 (96.230) Mem 7379MB [2024-08-27 22:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.366 Acc@5 96.228 [2024-08-27 22:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-08-27 22:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][0/1251] eta 0:29:40 lr 0.000300 wd 0.0500 time 1.4231 (1.4231) data time 0.8451 (0.8451) model time 0.0000 (0.0000) loss 3.2020 (3.2020) grad_norm 2.9336 (2.9336) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][10/1251] eta 0:07:04 lr 0.000300 wd 0.0500 time 0.2609 (0.3418) data time 0.0010 (0.0779) model time 0.0000 (0.0000) loss 3.1634 (3.1907) grad_norm 3.2615 (3.0948) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][20/1251] eta 0:05:54 lr 0.000300 wd 0.0500 time 0.2276 (0.2878) data time 0.0007 (0.0418) model time 0.0000 (0.0000) loss 2.4095 (3.1371) grad_norm 2.3616 (3.2728) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][30/1251] eta 0:05:28 lr 0.000300 wd 0.0500 time 0.2310 (0.2687) data time 0.0009 (0.0286) model time 0.0000 (0.0000) loss 3.3454 (3.2228) grad_norm 2.5883 (3.0804) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][40/1251] eta 0:05:14 lr 0.000300 wd 0.0500 time 0.2352 (0.2596) data time 0.0008 (0.0219) model time 0.0000 (0.0000) loss 3.0120 (3.1841) grad_norm 3.7920 (3.1168) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [198/300][50/1251] eta 0:05:04 lr 0.000300 wd 0.0500 time 0.2633 (0.2538) data time 0.0007 (0.0178) model time 0.0000 (0.0000) loss 3.7134 (3.1440) grad_norm 2.8645 (3.0660) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][60/1251] eta 0:04:57 lr 0.000300 wd 0.0500 time 0.2274 (0.2496) data time 0.0007 (0.0151) model time 0.2267 (0.2269) loss 2.5286 (3.1277) grad_norm 2.8820 (3.1581) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][70/1251] eta 0:04:51 lr 0.000300 wd 0.0500 time 0.2344 (0.2467) data time 0.0007 (0.0131) model time 0.2337 (0.2275) loss 3.6006 (3.1648) grad_norm 3.8680 (3.1426) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][80/1251] eta 0:04:46 lr 0.000300 wd 0.0500 time 0.2254 (0.2446) data time 0.0008 (0.0116) model time 0.2246 (0.2277) loss 2.7144 (3.1743) grad_norm 3.1853 (3.1714) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][90/1251] eta 0:04:41 lr 0.000300 wd 0.0500 time 0.2276 (0.2428) data time 0.0009 (0.0104) model time 0.2267 (0.2276) loss 2.1103 (3.1484) grad_norm 4.2314 (3.1532) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][100/1251] eta 0:04:37 lr 0.000300 wd 0.0500 time 0.2338 (0.2413) data time 0.0012 (0.0095) model time 0.2326 (0.2275) loss 1.7348 (3.1251) grad_norm 3.1590 (3.1726) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][110/1251] eta 0:04:34 lr 0.000300 wd 0.0500 time 0.2339 (0.2403) data time 0.0007 (0.0088) model time 0.2332 (0.2276) loss 1.8445 (3.1095) grad_norm 3.4299 (3.1977) loss_scale 512.0000 (512.0000) mem 
7379MB [2024-08-27 22:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][120/1251] eta 0:04:31 lr 0.000300 wd 0.0500 time 0.2994 (0.2401) data time 0.0007 (0.0081) model time 0.2987 (0.2291) loss 3.3208 (3.1020) grad_norm 2.8194 (3.1915) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][130/1251] eta 0:04:28 lr 0.000300 wd 0.0500 time 0.2427 (0.2393) data time 0.0010 (0.0076) model time 0.2417 (0.2290) loss 2.4561 (3.0899) grad_norm 2.4293 (3.1863) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][140/1251] eta 0:04:25 lr 0.000300 wd 0.0500 time 0.2338 (0.2386) data time 0.0009 (0.0071) model time 0.2328 (0.2288) loss 3.1142 (3.0843) grad_norm 3.1326 (3.1917) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][150/1251] eta 0:04:22 lr 0.000300 wd 0.0500 time 0.2320 (0.2380) data time 0.0007 (0.0067) model time 0.2313 (0.2288) loss 3.2646 (3.0748) grad_norm 3.8993 (3.1733) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][160/1251] eta 0:04:18 lr 0.000300 wd 0.0500 time 0.2223 (0.2373) data time 0.0013 (0.0064) model time 0.2210 (0.2285) loss 2.8813 (3.0609) grad_norm 3.5567 (3.1501) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][170/1251] eta 0:04:16 lr 0.000300 wd 0.0500 time 0.2523 (0.2369) data time 0.0008 (0.0061) model time 0.2515 (0.2286) loss 3.3151 (3.0585) grad_norm 2.7365 (3.1323) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][180/1251] eta 0:04:13 lr 0.000300 wd 0.0500 time 0.2306 (0.2363) data time 0.0008 (0.0058) model time 0.2298 
(0.2284) loss 2.9673 (3.0517) grad_norm 3.7287 (3.1649) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][190/1251] eta 0:04:10 lr 0.000300 wd 0.0500 time 0.2477 (0.2362) data time 0.0008 (0.0056) model time 0.2469 (0.2286) loss 2.4535 (3.0510) grad_norm 3.7560 (3.2397) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][200/1251] eta 0:04:07 lr 0.000299 wd 0.0500 time 0.2247 (0.2358) data time 0.0008 (0.0053) model time 0.2239 (0.2286) loss 3.0700 (3.0404) grad_norm 2.8694 (3.2471) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][210/1251] eta 0:04:05 lr 0.000299 wd 0.0500 time 0.2300 (0.2355) data time 0.0007 (0.0051) model time 0.2293 (0.2285) loss 2.1264 (3.0489) grad_norm 2.7189 (3.2360) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][220/1251] eta 0:04:02 lr 0.000299 wd 0.0500 time 0.2378 (0.2351) data time 0.0015 (0.0049) model time 0.2363 (0.2284) loss 1.7709 (3.0378) grad_norm 3.8244 (3.2168) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][230/1251] eta 0:03:59 lr 0.000299 wd 0.0500 time 0.2297 (0.2349) data time 0.0010 (0.0048) model time 0.2287 (0.2284) loss 2.3784 (3.0313) grad_norm 3.3296 (3.2155) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][240/1251] eta 0:03:57 lr 0.000299 wd 0.0500 time 0.2238 (0.2346) data time 0.0008 (0.0046) model time 0.2230 (0.2284) loss 3.1873 (3.0170) grad_norm 2.7619 (3.2085) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][250/1251] eta 0:03:54 
lr 0.000299 wd 0.0500 time 0.2364 (0.2345) data time 0.0011 (0.0045) model time 0.2353 (0.2285) loss 3.2189 (3.0062) grad_norm 2.7624 (3.2330) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][260/1251] eta 0:03:52 lr 0.000299 wd 0.0500 time 0.2303 (0.2344) data time 0.0009 (0.0043) model time 0.2294 (0.2285) loss 3.2762 (3.0030) grad_norm 2.4408 (3.2280) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][270/1251] eta 0:03:49 lr 0.000299 wd 0.0500 time 0.2232 (0.2342) data time 0.0011 (0.0042) model time 0.2221 (0.2286) loss 3.3254 (2.9984) grad_norm 3.1805 (3.2133) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][280/1251] eta 0:03:47 lr 0.000299 wd 0.0500 time 0.2318 (0.2340) data time 0.0011 (0.0041) model time 0.2307 (0.2285) loss 2.1288 (3.0008) grad_norm 2.7698 (3.2053) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][290/1251] eta 0:03:44 lr 0.000299 wd 0.0500 time 0.2304 (0.2338) data time 0.0017 (0.0040) model time 0.2287 (0.2285) loss 3.4582 (3.0048) grad_norm 2.4605 (3.1879) loss_scale 1024.0000 (520.7973) mem 7379MB [2024-08-27 22:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][300/1251] eta 0:03:42 lr 0.000299 wd 0.0500 time 0.2333 (0.2337) data time 0.0013 (0.0039) model time 0.2319 (0.2285) loss 3.5192 (3.0026) grad_norm 3.3611 (3.1783) loss_scale 1024.0000 (537.5150) mem 7379MB [2024-08-27 22:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][310/1251] eta 0:03:39 lr 0.000299 wd 0.0500 time 0.2300 (0.2336) data time 0.0008 (0.0038) model time 0.2292 (0.2285) loss 3.2980 (3.0142) grad_norm 4.5267 (3.1748) loss_scale 1024.0000 (553.1576) mem 7379MB [2024-08-27 22:10:06 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][320/1251] eta 0:03:37 lr 0.000299 wd 0.0500 time 0.2303 (0.2335) data time 0.0008 (0.0037) model time 0.2295 (0.2285) loss 2.3857 (3.0113) grad_norm 2.1743 (3.1684) loss_scale 1024.0000 (567.8255) mem 7379MB [2024-08-27 22:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][330/1251] eta 0:03:34 lr 0.000299 wd 0.0500 time 0.2266 (0.2333) data time 0.0009 (0.0036) model time 0.2257 (0.2285) loss 3.9686 (3.0176) grad_norm 2.8542 (3.1646) loss_scale 1024.0000 (581.6073) mem 7379MB [2024-08-27 22:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][340/1251] eta 0:03:32 lr 0.000299 wd 0.0500 time 0.2324 (0.2332) data time 0.0010 (0.0036) model time 0.2314 (0.2285) loss 3.2670 (3.0102) grad_norm 2.2328 (3.1601) loss_scale 1024.0000 (594.5806) mem 7379MB [2024-08-27 22:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][350/1251] eta 0:03:30 lr 0.000299 wd 0.0500 time 0.2267 (0.2331) data time 0.0014 (0.0035) model time 0.2253 (0.2285) loss 3.1763 (3.0135) grad_norm 2.6122 (3.1500) loss_scale 1024.0000 (606.8148) mem 7379MB [2024-08-27 22:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][360/1251] eta 0:03:28 lr 0.000299 wd 0.0500 time 0.2216 (0.2342) data time 0.0010 (0.0035) model time 0.2206 (0.2299) loss 2.2774 (3.0167) grad_norm 4.4529 (3.1576) loss_scale 1024.0000 (618.3712) mem 7379MB [2024-08-27 22:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][370/1251] eta 0:03:26 lr 0.000299 wd 0.0500 time 0.2278 (0.2346) data time 0.0010 (0.0034) model time 0.2268 (0.2305) loss 3.5187 (3.0197) grad_norm 4.3298 (3.1705) loss_scale 1024.0000 (629.3046) mem 7379MB [2024-08-27 22:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][380/1251] eta 0:03:24 lr 0.000299 wd 0.0500 time 0.2273 (0.2345) data time 0.0010 (0.0033) model time 0.2264 (0.2304) loss 3.2313 
(3.0189) grad_norm 2.7232 (3.1704) loss_scale 1024.0000 (639.6640) mem 7379MB [2024-08-27 22:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][390/1251] eta 0:03:21 lr 0.000299 wd 0.0500 time 0.2295 (0.2343) data time 0.0007 (0.0033) model time 0.2287 (0.2303) loss 3.5181 (3.0281) grad_norm 2.7017 (3.1688) loss_scale 1024.0000 (649.4936) mem 7379MB [2024-08-27 22:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][400/1251] eta 0:03:19 lr 0.000299 wd 0.0500 time 0.2289 (0.2342) data time 0.0009 (0.0032) model time 0.2280 (0.2302) loss 2.7303 (3.0196) grad_norm 3.5613 (3.2037) loss_scale 1024.0000 (658.8329) mem 7379MB [2024-08-27 22:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][410/1251] eta 0:03:16 lr 0.000299 wd 0.0500 time 0.2284 (0.2340) data time 0.0009 (0.0032) model time 0.2274 (0.2301) loss 3.3512 (3.0173) grad_norm 2.7838 (3.2029) loss_scale 1024.0000 (667.7178) mem 7379MB [2024-08-27 22:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][420/1251] eta 0:03:14 lr 0.000299 wd 0.0500 time 0.2280 (0.2339) data time 0.0009 (0.0031) model time 0.2271 (0.2300) loss 3.8079 (3.0176) grad_norm 2.0932 (3.1863) loss_scale 1024.0000 (676.1805) mem 7379MB [2024-08-27 22:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][430/1251] eta 0:03:11 lr 0.000299 wd 0.0500 time 0.2267 (0.2338) data time 0.0013 (0.0031) model time 0.2254 (0.2300) loss 3.2296 (3.0204) grad_norm 2.3862 (3.1767) loss_scale 1024.0000 (684.2506) mem 7379MB [2024-08-27 22:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][440/1251] eta 0:03:09 lr 0.000299 wd 0.0500 time 0.2314 (0.2336) data time 0.0009 (0.0030) model time 0.2305 (0.2299) loss 3.1183 (3.0213) grad_norm 2.0085 (3.1745) loss_scale 1024.0000 (691.9546) mem 7379MB [2024-08-27 22:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][450/1251] eta 0:03:07 lr 0.000298 wd 
0.0500 time 0.2299 (0.2335) data time 0.0007 (0.0030) model time 0.2292 (0.2298) loss 2.3071 (3.0161) grad_norm 2.4582 (3.1679) loss_scale 1024.0000 (699.3171) mem 7379MB [2024-08-27 22:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][460/1251] eta 0:03:04 lr 0.000298 wd 0.0500 time 0.2245 (0.2339) data time 0.0010 (0.0029) model time 0.2235 (0.2303) loss 3.0051 (3.0158) grad_norm 2.4997 (3.1655) loss_scale 1024.0000 (706.3601) mem 7379MB [2024-08-27 22:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][470/1251] eta 0:03:02 lr 0.000298 wd 0.0500 time 0.2249 (0.2338) data time 0.0010 (0.0029) model time 0.2239 (0.2302) loss 2.4028 (3.0115) grad_norm 3.2069 (3.1705) loss_scale 1024.0000 (713.1040) mem 7379MB [2024-08-27 22:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][480/1251] eta 0:03:00 lr 0.000298 wd 0.0500 time 0.2244 (0.2336) data time 0.0009 (0.0029) model time 0.2234 (0.2302) loss 2.4484 (3.0139) grad_norm 5.6648 (3.1974) loss_scale 1024.0000 (719.5676) mem 7379MB [2024-08-27 22:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][490/1251] eta 0:02:57 lr 0.000298 wd 0.0500 time 0.2272 (0.2336) data time 0.0011 (0.0028) model time 0.2261 (0.2301) loss 2.7638 (3.0109) grad_norm 3.4393 (3.2072) loss_scale 1024.0000 (725.7678) mem 7379MB [2024-08-27 22:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][500/1251] eta 0:02:55 lr 0.000298 wd 0.0500 time 0.2210 (0.2334) data time 0.0007 (0.0028) model time 0.2203 (0.2300) loss 3.7736 (3.0150) grad_norm 2.8631 (3.2019) loss_scale 1024.0000 (731.7206) mem 7379MB [2024-08-27 22:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][510/1251] eta 0:02:52 lr 0.000298 wd 0.0500 time 0.2243 (0.2334) data time 0.0013 (0.0028) model time 0.2231 (0.2300) loss 3.1110 (3.0163) grad_norm 2.7079 (3.2053) loss_scale 1024.0000 (737.4403) mem 7379MB [2024-08-27 22:10:52 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][520/1251] eta 0:02:50 lr 0.000298 wd 0.0500 time 0.2277 (0.2333) data time 0.0010 (0.0027) model time 0.2268 (0.2300) loss 3.4814 (3.0157) grad_norm 3.0138 (3.2018) loss_scale 1024.0000 (742.9405) mem 7379MB [2024-08-27 22:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][530/1251] eta 0:02:48 lr 0.000298 wd 0.0500 time 0.2275 (0.2332) data time 0.0007 (0.0027) model time 0.2269 (0.2299) loss 3.1643 (3.0179) grad_norm 3.3509 (3.2110) loss_scale 1024.0000 (748.2335) mem 7379MB [2024-08-27 22:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][540/1251] eta 0:02:45 lr 0.000298 wd 0.0500 time 0.2306 (0.2331) data time 0.0011 (0.0027) model time 0.2294 (0.2299) loss 3.2360 (3.0212) grad_norm 2.8427 (3.2110) loss_scale 1024.0000 (753.3309) mem 7379MB [2024-08-27 22:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][550/1251] eta 0:02:43 lr 0.000298 wd 0.0500 time 0.2306 (0.2331) data time 0.0015 (0.0026) model time 0.2291 (0.2298) loss 3.4444 (3.0202) grad_norm 2.5098 (3.2146) loss_scale 1024.0000 (758.2432) mem 7379MB [2024-08-27 22:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][560/1251] eta 0:02:40 lr 0.000298 wd 0.0500 time 0.2223 (0.2330) data time 0.0011 (0.0026) model time 0.2212 (0.2298) loss 2.9706 (3.0188) grad_norm 3.1550 (3.2205) loss_scale 1024.0000 (762.9804) mem 7379MB [2024-08-27 22:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][570/1251] eta 0:02:38 lr 0.000298 wd 0.0500 time 0.2381 (0.2329) data time 0.0007 (0.0026) model time 0.2375 (0.2298) loss 2.4967 (3.0199) grad_norm 3.5323 (3.2179) loss_scale 1024.0000 (767.5517) mem 7379MB [2024-08-27 22:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][580/1251] eta 0:02:36 lr 0.000298 wd 0.0500 time 0.2289 (0.2328) data time 0.0009 (0.0026) model time 0.2280 (0.2297) loss 3.2704 
(3.0205) grad_norm 2.5512 (3.2158) loss_scale 1024.0000 (771.9656) mem 7379MB [2024-08-27 22:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][590/1251] eta 0:02:33 lr 0.000298 wd 0.0500 time 0.2346 (0.2327) data time 0.0007 (0.0025) model time 0.2339 (0.2296) loss 3.2401 (3.0238) grad_norm 3.3799 (3.2135) loss_scale 1024.0000 (776.2301) mem 7379MB [2024-08-27 22:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][600/1251] eta 0:02:31 lr 0.000298 wd 0.0500 time 0.2284 (0.2327) data time 0.0009 (0.0025) model time 0.2275 (0.2296) loss 2.2510 (3.0216) grad_norm 3.6455 (3.2097) loss_scale 1024.0000 (780.3527) mem 7379MB [2024-08-27 22:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][610/1251] eta 0:02:29 lr 0.000298 wd 0.0500 time 0.2274 (0.2327) data time 0.0008 (0.0025) model time 0.2266 (0.2296) loss 3.6122 (3.0258) grad_norm 5.6339 (3.2151) loss_scale 1024.0000 (784.3404) mem 7379MB [2024-08-27 22:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][620/1251] eta 0:02:26 lr 0.000298 wd 0.0500 time 0.2245 (0.2326) data time 0.0019 (0.0025) model time 0.2226 (0.2296) loss 2.3015 (3.0251) grad_norm 5.0124 (3.2217) loss_scale 1024.0000 (788.1997) mem 7379MB [2024-08-27 22:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][630/1251] eta 0:02:24 lr 0.000298 wd 0.0500 time 0.2263 (0.2326) data time 0.0010 (0.0025) model time 0.2253 (0.2296) loss 3.3957 (3.0277) grad_norm 2.8994 (3.2266) loss_scale 1024.0000 (791.9366) mem 7379MB [2024-08-27 22:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][640/1251] eta 0:02:22 lr 0.000298 wd 0.0500 time 0.2247 (0.2326) data time 0.0011 (0.0024) model time 0.2236 (0.2296) loss 2.9916 (3.0274) grad_norm 2.9208 (3.2257) loss_scale 1024.0000 (795.5569) mem 7379MB [2024-08-27 22:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][650/1251] eta 0:02:19 lr 0.000298 wd 
0.0500 time 0.2451 (0.2325) data time 0.0010 (0.0024) model time 0.2442 (0.2296) loss 3.4816 (3.0289) grad_norm 2.8065 (3.2258) loss_scale 1024.0000 (799.0661) mem 7379MB [2024-08-27 22:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][660/1251] eta 0:02:17 lr 0.000298 wd 0.0500 time 0.2372 (0.2325) data time 0.0011 (0.0024) model time 0.2361 (0.2295) loss 2.5019 (3.0302) grad_norm 2.3251 (3.2278) loss_scale 1024.0000 (802.4690) mem 7379MB [2024-08-27 22:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][670/1251] eta 0:02:15 lr 0.000298 wd 0.0500 time 0.2276 (0.2324) data time 0.0017 (0.0024) model time 0.2259 (0.2295) loss 2.9647 (3.0308) grad_norm 2.4140 (3.2188) loss_scale 1024.0000 (805.7705) mem 7379MB [2024-08-27 22:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][680/1251] eta 0:02:12 lr 0.000298 wd 0.0500 time 0.2336 (0.2323) data time 0.0010 (0.0024) model time 0.2326 (0.2295) loss 3.1103 (3.0332) grad_norm 1.9026 (3.2141) loss_scale 1024.0000 (808.9750) mem 7379MB [2024-08-27 22:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][690/1251] eta 0:02:10 lr 0.000298 wd 0.0500 time 0.2260 (0.2323) data time 0.0009 (0.0023) model time 0.2252 (0.2294) loss 3.4885 (3.0315) grad_norm 3.3231 (3.2271) loss_scale 1024.0000 (812.0868) mem 7379MB [2024-08-27 22:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][700/1251] eta 0:02:07 lr 0.000297 wd 0.0500 time 0.2332 (0.2322) data time 0.0012 (0.0023) model time 0.2319 (0.2294) loss 1.9332 (3.0316) grad_norm 2.4240 (3.2215) loss_scale 1024.0000 (815.1098) mem 7379MB [2024-08-27 22:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][710/1251] eta 0:02:05 lr 0.000297 wd 0.0500 time 0.2239 (0.2322) data time 0.0012 (0.0023) model time 0.2227 (0.2294) loss 3.1586 (3.0314) grad_norm 3.0748 (3.2128) loss_scale 1024.0000 (818.0478) mem 7379MB [2024-08-27 22:11:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][720/1251] eta 0:02:03 lr 0.000297 wd 0.0500 time 0.2245 (0.2321) data time 0.0007 (0.0023) model time 0.2239 (0.2294) loss 3.8420 (3.0349) grad_norm 2.0267 (3.2075) loss_scale 1024.0000 (820.9043) mem 7379MB [2024-08-27 22:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][730/1251] eta 0:02:00 lr 0.000297 wd 0.0500 time 0.2273 (0.2321) data time 0.0007 (0.0023) model time 0.2267 (0.2294) loss 2.9869 (3.0311) grad_norm 2.7050 (3.2159) loss_scale 1024.0000 (823.6826) mem 7379MB [2024-08-27 22:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][740/1251] eta 0:01:58 lr 0.000297 wd 0.0500 time 0.2302 (0.2321) data time 0.0007 (0.0023) model time 0.2294 (0.2294) loss 3.4489 (3.0311) grad_norm 2.5301 (3.2134) loss_scale 1024.0000 (826.3860) mem 7379MB [2024-08-27 22:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][750/1251] eta 0:01:56 lr 0.000297 wd 0.0500 time 0.2297 (0.2320) data time 0.0015 (0.0022) model time 0.2282 (0.2293) loss 3.1919 (3.0316) grad_norm 2.3200 (3.2098) loss_scale 1024.0000 (829.0173) mem 7379MB [2024-08-27 22:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][760/1251] eta 0:01:53 lr 0.000297 wd 0.0500 time 0.2277 (0.2320) data time 0.0008 (0.0022) model time 0.2269 (0.2293) loss 3.7792 (3.0286) grad_norm 2.7421 (3.2084) loss_scale 1024.0000 (831.5795) mem 7379MB [2024-08-27 22:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][770/1251] eta 0:01:51 lr 0.000297 wd 0.0500 time 0.2269 (0.2320) data time 0.0010 (0.0022) model time 0.2259 (0.2293) loss 3.5413 (3.0302) grad_norm 2.5864 (3.2074) loss_scale 1024.0000 (834.0752) mem 7379MB [2024-08-27 22:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][780/1251] eta 0:01:49 lr 0.000297 wd 0.0500 time 0.2255 (0.2319) data time 0.0015 (0.0022) model time 0.2241 (0.2293) loss 3.0837 
(3.0308) grad_norm 2.6231 (3.2086) loss_scale 1024.0000 (836.5070) mem 7379MB [2024-08-27 22:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][790/1251] eta 0:01:46 lr 0.000297 wd 0.0500 time 0.2262 (0.2319) data time 0.0017 (0.0022) model time 0.2245 (0.2293) loss 3.2125 (3.0318) grad_norm 3.9187 (3.2091) loss_scale 1024.0000 (838.8774) mem 7379MB [2024-08-27 22:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][800/1251] eta 0:01:44 lr 0.000297 wd 0.0500 time 0.2335 (0.2319) data time 0.0008 (0.0022) model time 0.2327 (0.2293) loss 2.6023 (3.0303) grad_norm 2.9275 (3.2041) loss_scale 1024.0000 (841.1885) mem 7379MB [2024-08-27 22:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][810/1251] eta 0:01:42 lr 0.000297 wd 0.0500 time 0.2318 (0.2318) data time 0.0007 (0.0022) model time 0.2311 (0.2293) loss 2.2400 (3.0294) grad_norm 2.7991 (3.2018) loss_scale 1024.0000 (843.4427) mem 7379MB [2024-08-27 22:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][820/1251] eta 0:01:39 lr 0.000297 wd 0.0500 time 0.2429 (0.2318) data time 0.0018 (0.0021) model time 0.2411 (0.2293) loss 3.1594 (3.0306) grad_norm 3.3352 (3.1985) loss_scale 1024.0000 (845.6419) mem 7379MB [2024-08-27 22:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][830/1251] eta 0:01:37 lr 0.000297 wd 0.0500 time 0.2285 (0.2318) data time 0.0014 (0.0021) model time 0.2271 (0.2293) loss 2.1732 (3.0301) grad_norm 3.1799 (3.1989) loss_scale 1024.0000 (847.7882) mem 7379MB [2024-08-27 22:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][840/1251] eta 0:01:35 lr 0.000297 wd 0.0500 time 0.2347 (0.2318) data time 0.0011 (0.0021) model time 0.2336 (0.2293) loss 3.4727 (3.0341) grad_norm 3.3317 (3.2004) loss_scale 1024.0000 (849.8835) mem 7379MB [2024-08-27 22:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][850/1251] eta 0:01:32 lr 0.000297 wd 
0.0500 time 0.2211 (0.2317) data time 0.0012 (0.0021) model time 0.2199 (0.2292) loss 2.8942 (3.0339) grad_norm 2.9290 (inf) loss_scale 512.0000 (848.9213) mem 7379MB [2024-08-27 22:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][860/1251] eta 0:01:30 lr 0.000297 wd 0.0500 time 0.2315 (0.2317) data time 0.0012 (0.0021) model time 0.2304 (0.2292) loss 3.3899 (3.0328) grad_norm 6.6315 (inf) loss_scale 512.0000 (845.0081) mem 7379MB [2024-08-27 22:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][870/1251] eta 0:01:28 lr 0.000297 wd 0.0500 time 0.2290 (0.2317) data time 0.0009 (0.0021) model time 0.2281 (0.2292) loss 3.3148 (3.0327) grad_norm 2.1663 (inf) loss_scale 512.0000 (841.1848) mem 7379MB [2024-08-27 22:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][880/1251] eta 0:01:26 lr 0.000297 wd 0.0500 time 0.2208 (0.2322) data time 0.0008 (0.0021) model time 0.2200 (0.2298) loss 3.4551 (3.0342) grad_norm 2.5019 (inf) loss_scale 512.0000 (837.4484) mem 7379MB [2024-08-27 22:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][890/1251] eta 0:01:23 lr 0.000297 wd 0.0500 time 0.2320 (0.2324) data time 0.0008 (0.0021) model time 0.2312 (0.2300) loss 3.8458 (3.0342) grad_norm 2.9290 (inf) loss_scale 512.0000 (833.7957) mem 7379MB [2024-08-27 22:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][900/1251] eta 0:01:21 lr 0.000297 wd 0.0500 time 0.2416 (0.2324) data time 0.0008 (0.0021) model time 0.2409 (0.2300) loss 3.2586 (3.0341) grad_norm 2.3536 (inf) loss_scale 512.0000 (830.2242) mem 7379MB [2024-08-27 22:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][910/1251] eta 0:01:19 lr 0.000297 wd 0.0500 time 0.2385 (0.2324) data time 0.0007 (0.0020) model time 0.2378 (0.2300) loss 2.8583 (3.0319) grad_norm 3.4173 (inf) loss_scale 512.0000 (826.7311) mem 7379MB [2024-08-27 22:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [198/300][920/1251] eta 0:01:16 lr 0.000297 wd 0.0500 time 0.2293 (0.2323) data time 0.0008 (0.0020) model time 0.2285 (0.2299) loss 2.3930 (3.0297) grad_norm 2.3273 (inf) loss_scale 512.0000 (823.3138) mem 7379MB [2024-08-27 22:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][930/1251] eta 0:01:14 lr 0.000297 wd 0.0500 time 0.2279 (0.2323) data time 0.0008 (0.0020) model time 0.2271 (0.2299) loss 3.8413 (3.0306) grad_norm 5.3589 (inf) loss_scale 512.0000 (819.9699) mem 7379MB [2024-08-27 22:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][940/1251] eta 0:01:12 lr 0.000296 wd 0.0500 time 0.2315 (0.2322) data time 0.0007 (0.0020) model time 0.2308 (0.2299) loss 3.1759 (3.0313) grad_norm 2.8634 (inf) loss_scale 512.0000 (816.6971) mem 7379MB [2024-08-27 22:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][950/1251] eta 0:01:09 lr 0.000296 wd 0.0500 time 0.2279 (0.2322) data time 0.0007 (0.0020) model time 0.2271 (0.2298) loss 3.8732 (3.0314) grad_norm 3.7094 (inf) loss_scale 512.0000 (813.4932) mem 7379MB [2024-08-27 22:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][960/1251] eta 0:01:07 lr 0.000296 wd 0.0500 time 0.2257 (0.2322) data time 0.0011 (0.0020) model time 0.2246 (0.2298) loss 2.9593 (3.0322) grad_norm 6.1141 (inf) loss_scale 512.0000 (810.3559) mem 7379MB [2024-08-27 22:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][970/1251] eta 0:01:05 lr 0.000296 wd 0.0500 time 0.2213 (0.2321) data time 0.0011 (0.0020) model time 0.2202 (0.2298) loss 2.9828 (3.0319) grad_norm 4.0441 (inf) loss_scale 512.0000 (807.2832) mem 7379MB [2024-08-27 22:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][980/1251] eta 0:01:02 lr 0.000296 wd 0.0500 time 0.2223 (0.2322) data time 0.0010 (0.0020) model time 0.2213 (0.2299) loss 2.8418 (3.0306) grad_norm 3.2771 (inf) loss_scale 512.0000 (804.2732) mem 
7379MB [2024-08-27 22:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][990/1251] eta 0:01:00 lr 0.000296 wd 0.0500 time 0.2281 (0.2322) data time 0.0009 (0.0020) model time 0.2271 (0.2299) loss 3.1239 (3.0293) grad_norm 3.0992 (inf) loss_scale 512.0000 (801.3239) mem 7379MB [2024-08-27 22:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1000/1251] eta 0:00:58 lr 0.000296 wd 0.0500 time 0.2301 (0.2321) data time 0.0008 (0.0020) model time 0.2293 (0.2299) loss 3.4278 (3.0322) grad_norm 3.2291 (inf) loss_scale 512.0000 (798.4336) mem 7379MB [2024-08-27 22:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1010/1251] eta 0:00:55 lr 0.000296 wd 0.0500 time 0.2245 (0.2321) data time 0.0009 (0.0020) model time 0.2236 (0.2299) loss 2.9653 (3.0331) grad_norm 3.0238 (inf) loss_scale 512.0000 (795.6004) mem 7379MB [2024-08-27 22:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1020/1251] eta 0:00:53 lr 0.000296 wd 0.0500 time 0.2293 (0.2321) data time 0.0009 (0.0019) model time 0.2284 (0.2298) loss 2.4780 (3.0320) grad_norm 5.3141 (inf) loss_scale 512.0000 (792.8227) mem 7379MB [2024-08-27 22:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1030/1251] eta 0:00:51 lr 0.000296 wd 0.0500 time 0.2262 (0.2321) data time 0.0010 (0.0019) model time 0.2252 (0.2298) loss 3.4152 (3.0324) grad_norm 3.4415 (inf) loss_scale 512.0000 (790.0989) mem 7379MB [2024-08-27 22:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1040/1251] eta 0:00:48 lr 0.000296 wd 0.0500 time 0.2228 (0.2320) data time 0.0011 (0.0019) model time 0.2218 (0.2298) loss 2.0421 (3.0311) grad_norm 2.9380 (inf) loss_scale 512.0000 (787.4275) mem 7379MB [2024-08-27 22:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1050/1251] eta 0:00:46 lr 0.000296 wd 0.0500 time 0.2274 (0.2320) data time 0.0014 (0.0019) model time 0.2260 (0.2298) loss 
2.8935 (3.0328) grad_norm 2.8538 (inf) loss_scale 512.0000 (784.8069) mem 7379MB [2024-08-27 22:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1060/1251] eta 0:00:44 lr 0.000296 wd 0.0500 time 0.2309 (0.2320) data time 0.0014 (0.0019) model time 0.2295 (0.2298) loss 3.0115 (3.0346) grad_norm 2.6638 (inf) loss_scale 512.0000 (782.2356) mem 7379MB [2024-08-27 22:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1070/1251] eta 0:00:41 lr 0.000296 wd 0.0500 time 0.2402 (0.2320) data time 0.0010 (0.0019) model time 0.2393 (0.2298) loss 3.4652 (3.0361) grad_norm 2.7720 (inf) loss_scale 512.0000 (779.7124) mem 7379MB [2024-08-27 22:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1080/1251] eta 0:00:39 lr 0.000296 wd 0.0500 time 0.2306 (0.2319) data time 0.0009 (0.0019) model time 0.2297 (0.2297) loss 2.0873 (3.0356) grad_norm 2.6350 (inf) loss_scale 512.0000 (777.2359) mem 7379MB [2024-08-27 22:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1090/1251] eta 0:00:37 lr 0.000296 wd 0.0500 time 0.2267 (0.2319) data time 0.0015 (0.0019) model time 0.2253 (0.2297) loss 3.2007 (3.0336) grad_norm 2.3371 (inf) loss_scale 512.0000 (774.8048) mem 7379MB [2024-08-27 22:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1100/1251] eta 0:00:35 lr 0.000296 wd 0.0500 time 0.2287 (0.2319) data time 0.0011 (0.0019) model time 0.2276 (0.2297) loss 3.1013 (3.0334) grad_norm 2.7368 (inf) loss_scale 512.0000 (772.4178) mem 7379MB [2024-08-27 22:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 22:13:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:13:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 22:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 22:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 22:14:49 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 22:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 22:14:56 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 22:14:57 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 22:14:58 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 22:14:59 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 198) [2024-08-27 22:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 22:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1110/1251] eta 0:09:17 lr 0.000296 wd 0.0500 time 0.2209 (3.9538) data time 0.0008 (0.1468) model time 0.2201 (3.8071) loss 3.1597 (3.3838) grad_norm 2.6442 (3.2688) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1120/1251] eta 0:02:48 lr 0.000296 wd 0.0500 time 0.2189 (1.2894) data time 0.0007 (0.0426) model time 0.2182 (1.2468) loss 3.3460 (3.2399) grad_norm 2.3904 (3.1237) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1130/1251] eta 0:01:42 lr 0.000296 wd 0.0500 time 0.2187 (0.8451) data time 0.0009 (0.0252) model time 0.2177 (0.8199) loss 3.2220 
(3.2739) grad_norm 3.7552 (3.0140) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1140/1251] eta 0:01:13 lr 0.000296 wd 0.0500 time 0.2246 (0.6625) data time 0.0008 (0.0183) model time 0.2238 (0.6441) loss 2.3248 (3.2507) grad_norm 2.6249 (3.1251) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1150/1251] eta 0:00:56 lr 0.000296 wd 0.0500 time 0.2199 (0.5623) data time 0.0007 (0.0144) model time 0.2191 (0.5479) loss 3.0592 (3.2094) grad_norm 2.1739 (3.1064) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1160/1251] eta 0:00:45 lr 0.000296 wd 0.0500 time 0.2299 (0.4996) data time 0.0006 (0.0119) model time 0.2293 (0.4877) loss 3.6499 (3.2083) grad_norm 6.3046 (3.1644) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1170/1251] eta 0:00:36 lr 0.000296 wd 0.0500 time 0.2239 (0.4565) data time 0.0007 (0.0102) model time 0.2232 (0.4463) loss 3.2456 (3.1583) grad_norm 4.7027 (3.3153) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1180/1251] eta 0:00:30 lr 0.000296 wd 0.0500 time 0.2212 (0.4254) data time 0.0007 (0.0092) model time 0.2205 (0.4163) loss 3.2788 (3.1235) grad_norm 2.4212 (3.3006) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1190/1251] eta 0:00:24 lr 0.000295 wd 0.0500 time 0.2171 (0.4016) data time 0.0009 (0.0083) model time 0.2162 (0.3933) loss 3.3569 (3.1054) grad_norm 3.6650 (3.3092) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1200/1251] eta 0:00:19 lr 0.000295 wd 
0.0500 time 0.2195 (0.3828) data time 0.0011 (0.0075) model time 0.2185 (0.3752) loss 2.6845 (3.0926) grad_norm 2.6171 (3.3027) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 22:15:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:15:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 22:17:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 22:17:45 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 22:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 22:17:58 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 22:18:00 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 22:18:01 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 22:18:01 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 198) [2024-08-27 22:18:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 22:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1210/1251] eta 0:10:07 lr 0.000295 wd 0.0500 time 14.8125 (14.8125) data time 0.8814 (0.8814) model time 13.9311 (13.9311) loss 3.5692 (3.5692) grad_norm 3.4116 (3.4116) loss_scale 512.0000 (512.0000) mem 20033MB [2024-08-27 22:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1220/1251] eta 0:00:48 lr 0.000295 wd 0.0500 time 0.2285 (1.5768) data time 0.0011 (0.0810) model time 0.2275 (1.4958) loss 2.6073 (3.3068) grad_norm 2.5151 (3.7197) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1230/1251] eta 0:00:19 lr 0.000295 wd 0.0500 time 0.2247 (0.9362) data time 0.0012 (0.0429) model time 0.2234 (0.8933) loss 3.0370 (3.2030) grad_norm 1.9916 (3.4156) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1240/1251] eta 0:00:07 lr 0.000295 wd 0.0500 time 0.2144 (0.7054) data time 0.0005 (0.0296) model time 0.2139 (0.6758) loss 2.1230 (3.2206) grad_norm 2.8186 (3.2816) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [198/300][1250/1251] eta 0:00:00 lr 0.000295 wd 0.0500 time 0.2112 (0.5851) data time 0.0007 (0.0226) model time 0.2105 (0.5625) loss 3.1510 (3.1736) grad_norm 6.4123 (3.4467) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO 
EPOCH 198 training takes 0:00:23 [2024-08-27 22:18:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:18:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.491 (0.491) Loss 0.4258 (0.4258) Acc@1 92.188 (92.188) Acc@5 98.047 (98.047) Mem 7379MB [2024-08-27 22:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.116) Loss 0.6797 (0.6658) Acc@1 87.305 (85.866) Acc@5 97.070 (97.266) Mem 7379MB [2024-08-27 22:18:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.100) Loss 0.9810 (0.6940) Acc@1 76.465 (84.733) Acc@5 95.312 (97.233) Mem 7379MB [2024-08-27 22:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.093) Loss 1.2285 (0.7887) Acc@1 71.094 (82.696) Acc@5 91.406 (96.154) Mem 7379MB [2024-08-27 22:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.086) Loss 1.0801 (0.8420) Acc@1 74.219 (81.364) Acc@5 93.555 (95.594) Mem 7379MB [2024-08-27 22:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.936 Acc@5 95.524 [2024-08-27 22:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.9% [2024-08-27 22:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 80.94% [2024-08-27 22:18:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 22:18:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 22:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.452 (0.452) Loss 0.3877 (0.3877) Acc@1 92.969 (92.969) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-27 22:18:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.093 (0.115) Loss 0.5996 (0.6102) Acc@1 88.770 (87.269) Acc@5 97.559 (97.541) Mem 7379MB [2024-08-27 22:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.098) Loss 0.8657 (0.6363) Acc@1 78.809 (86.203) Acc@5 96.094 (97.582) Mem 7379MB [2024-08-27 22:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.088 (0.092) Loss 1.1006 (0.7207) Acc@1 72.559 (84.101) Acc@5 92.969 (96.699) Mem 7379MB [2024-08-27 22:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 0.9795 (0.7638) Acc@1 76.074 (82.834) Acc@5 94.727 (96.237) Mem 7379MB [2024-08-27 22:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.408 Acc@5 96.236 [2024-08-27 22:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-08-27 22:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.41% [2024-08-27 22:18:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 22:18:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 22:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][0/1251] eta 0:16:01 lr 0.000295 wd 0.0500 time 0.7687 (0.7687) data time 0.4564 (0.4564) model time 0.0000 (0.0000) loss 3.2269 (3.2269) grad_norm 3.6971 (3.6971) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 22:18:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:18:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 22:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 22:21:32 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 22:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 22:21:41 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 22:21:42 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 22:21:44 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 22:21:44 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 199) [2024-08-27 22:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 22:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][10/1251] eta 0:46:46 lr 0.000295 wd 0.0500 time 0.2168 (2.2614) data time 0.0014 (0.1947) model time 0.0000 (0.0000) loss 3.1216 (3.4166) grad_norm 3.2636 (4.1934) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][20/1251] eta 0:21:49 lr 0.000295 wd 0.0500 time 0.2254 (1.0638) data time 0.0009 (0.0808) model time 0.0000 (0.0000) loss 2.8109 (3.2584) grad_norm 2.8764 (3.7034) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][30/1251] eta 0:15:20 lr 0.000295 wd 0.0500 time 0.2399 (0.7540) data time 0.0008 (0.0512) model time 0.0000 (0.0000) loss 3.8544 (3.2683) grad_norm 2.6850 (3.4996) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][40/1251] eta 0:12:21 lr 0.000295 wd 0.0500 time 0.2285 (0.6127) data time 0.0012 (0.0379) model time 0.0000 (0.0000) loss 2.7983 (3.2352) grad_norm 2.2644 (3.3422) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][50/1251] eta 0:10:37 lr 0.000295 wd 0.0500 time 0.2197 (0.5304) data time 0.0009 (0.0301) model time 0.0000 (0.0000) loss 3.3176 (3.1784) grad_norm 2.4967 (3.2268) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[199/300][60/1251] eta 0:09:29 lr 0.000295 wd 0.0500 time 0.2217 (0.4779) data time 0.0016 (0.0253) model time 0.2201 (0.2288) loss 3.1703 (3.1837) grad_norm 4.0871 (3.1571) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][70/1251] eta 0:08:40 lr 0.000295 wd 0.0500 time 0.2326 (0.4407) data time 0.0009 (0.0217) model time 0.2317 (0.2280) loss 3.4863 (3.1559) grad_norm 3.4113 (3.2673) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][80/1251] eta 0:08:04 lr 0.000295 wd 0.0500 time 0.2238 (0.4137) data time 0.0009 (0.0190) model time 0.2229 (0.2291) loss 3.0121 (3.1242) grad_norm 2.2875 (3.2267) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][90/1251] eta 0:07:35 lr 0.000295 wd 0.0500 time 0.2327 (0.3926) data time 0.0009 (0.0170) model time 0.2319 (0.2291) loss 3.2438 (3.0909) grad_norm 4.3573 (3.4415) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][100/1251] eta 0:07:12 lr 0.000295 wd 0.0500 time 0.2251 (0.3756) data time 0.0011 (0.0153) model time 0.2241 (0.2287) loss 2.9881 (3.1027) grad_norm 3.3599 (3.4520) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][110/1251] eta 0:06:52 lr 0.000295 wd 0.0500 time 0.2369 (0.3619) data time 0.0008 (0.0140) model time 0.2362 (0.2286) loss 3.3226 (3.1272) grad_norm 2.5488 (3.4100) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][120/1251] eta 0:06:36 lr 0.000295 wd 0.0500 time 0.2370 (0.3507) data time 0.0013 (0.0129) model time 0.2357 (0.2288) loss 3.3647 (3.1220) grad_norm 3.2431 (3.3852) loss_scale 512.0000 (512.0000) mem 
7374MB [2024-08-27 22:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][130/1251] eta 0:06:22 lr 0.000295 wd 0.0500 time 0.2354 (0.3413) data time 0.0009 (0.0120) model time 0.2346 (0.2288) loss 3.2651 (3.1060) grad_norm 3.4817 (3.4026) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][140/1251] eta 0:06:10 lr 0.000295 wd 0.0500 time 0.2366 (0.3335) data time 0.0007 (0.0112) model time 0.2358 (0.2292) loss 2.7547 (3.1043) grad_norm 3.1546 (3.3796) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][150/1251] eta 0:05:59 lr 0.000295 wd 0.0500 time 0.2313 (0.3264) data time 0.0015 (0.0105) model time 0.2298 (0.2292) loss 2.9431 (3.0946) grad_norm 3.6195 (3.4830) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][160/1251] eta 0:05:49 lr 0.000295 wd 0.0500 time 0.2357 (0.3202) data time 0.0009 (0.0099) model time 0.2348 (0.2290) loss 2.7240 (3.0870) grad_norm 2.5188 (3.4467) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][170/1251] eta 0:05:40 lr 0.000295 wd 0.0500 time 0.2294 (0.3149) data time 0.0009 (0.0094) model time 0.2285 (0.2292) loss 3.4132 (3.0914) grad_norm 3.3743 (3.4162) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][180/1251] eta 0:05:32 lr 0.000295 wd 0.0500 time 0.2241 (0.3101) data time 0.0016 (0.0089) model time 0.2225 (0.2291) loss 2.7268 (3.0766) grad_norm 2.9217 (3.3661) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][190/1251] eta 0:05:24 lr 0.000294 wd 0.0500 time 0.2196 (0.3056) data time 0.0008 (0.0085) model time 0.2188 
(0.2288) loss 2.9913 (3.0687) grad_norm 3.2164 (3.3973) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][200/1251] eta 0:05:17 lr 0.000294 wd 0.0500 time 0.2356 (0.3017) data time 0.0007 (0.0081) model time 0.2350 (0.2287) loss 3.6291 (3.0650) grad_norm 23.5400 (3.4940) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][210/1251] eta 0:05:10 lr 0.000294 wd 0.0500 time 0.2319 (0.2981) data time 0.0008 (0.0078) model time 0.2311 (0.2287) loss 2.9681 (3.0505) grad_norm 4.3866 (3.4642) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][220/1251] eta 0:05:04 lr 0.000294 wd 0.0500 time 0.2348 (0.2950) data time 0.0009 (0.0075) model time 0.2340 (0.2286) loss 3.1315 (3.0484) grad_norm 3.0800 (3.4428) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][230/1251] eta 0:04:58 lr 0.000294 wd 0.0500 time 0.2331 (0.2921) data time 0.0010 (0.0072) model time 0.2321 (0.2286) loss 3.3246 (3.0566) grad_norm 3.8742 (3.4133) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][240/1251] eta 0:04:52 lr 0.000294 wd 0.0500 time 0.2286 (0.2895) data time 0.0007 (0.0070) model time 0.2279 (0.2286) loss 2.9684 (3.0515) grad_norm 3.5791 (3.4037) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][250/1251] eta 0:04:47 lr 0.000294 wd 0.0500 time 0.2273 (0.2870) data time 0.0010 (0.0067) model time 0.2263 (0.2285) loss 3.5403 (3.0474) grad_norm 2.6437 (3.3973) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][260/1251] eta 0:04:42 
lr 0.000294 wd 0.0500 time 0.2221 (0.2847) data time 0.0008 (0.0065) model time 0.2213 (0.2284) loss 2.4547 (3.0416) grad_norm 2.7765 (3.3922) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][270/1251] eta 0:04:37 lr 0.000294 wd 0.0500 time 0.2286 (0.2826) data time 0.0014 (0.0063) model time 0.2272 (0.2284) loss 2.1330 (3.0318) grad_norm 4.1310 (3.3919) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][280/1251] eta 0:04:32 lr 0.000294 wd 0.0500 time 0.2292 (0.2806) data time 0.0009 (0.0061) model time 0.2283 (0.2283) loss 3.2837 (3.0419) grad_norm 2.5245 (3.3756) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][290/1251] eta 0:04:28 lr 0.000294 wd 0.0500 time 0.4555 (0.2796) data time 0.0007 (0.0060) model time 0.4548 (0.2292) loss 3.3328 (3.0346) grad_norm 3.7230 (3.3626) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][300/1251] eta 0:04:24 lr 0.000294 wd 0.0500 time 0.2246 (0.2779) data time 0.0009 (0.0058) model time 0.2237 (0.2291) loss 1.8489 (3.0224) grad_norm 3.3973 (3.3353) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][310/1251] eta 0:04:20 lr 0.000294 wd 0.0500 time 0.4879 (0.2771) data time 0.0006 (0.0057) model time 0.4872 (0.2300) loss 3.5823 (3.0194) grad_norm 3.6509 (3.3283) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][320/1251] eta 0:04:16 lr 0.000294 wd 0.0500 time 0.2305 (0.2755) data time 0.0012 (0.0055) model time 0.2294 (0.2299) loss 3.5209 (3.0281) grad_norm 3.4976 (3.3727) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][330/1251] eta 0:04:12 lr 0.000294 wd 0.0500 time 0.2260 (0.2740) data time 0.0007 (0.0054) model time 0.2253 (0.2297) loss 2.1802 (3.0378) grad_norm 2.3706 (3.3802) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][340/1251] eta 0:04:08 lr 0.000294 wd 0.0500 time 0.2259 (0.2727) data time 0.0009 (0.0053) model time 0.2250 (0.2297) loss 2.7249 (3.0344) grad_norm 2.8190 (3.3695) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][350/1251] eta 0:04:04 lr 0.000294 wd 0.0500 time 0.2285 (0.2714) data time 0.0010 (0.0051) model time 0.2275 (0.2296) loss 2.8789 (3.0367) grad_norm 3.7182 (3.3872) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][360/1251] eta 0:04:00 lr 0.000294 wd 0.0500 time 0.2231 (0.2703) data time 0.0010 (0.0050) model time 0.2222 (0.2296) loss 2.4244 (3.0375) grad_norm 3.2746 (3.3805) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][370/1251] eta 0:03:57 lr 0.000294 wd 0.0500 time 0.2283 (0.2692) data time 0.0012 (0.0049) model time 0.2271 (0.2296) loss 2.3681 (3.0338) grad_norm 3.2366 (3.3751) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][380/1251] eta 0:03:53 lr 0.000294 wd 0.0500 time 0.2319 (0.2681) data time 0.0010 (0.0048) model time 0.2309 (0.2295) loss 2.9907 (3.0328) grad_norm 3.0835 (3.3649) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][390/1251] eta 0:03:49 lr 0.000294 wd 0.0500 time 0.2330 (0.2671) data time 0.0007 (0.0047) model time 0.2324 (0.2294) loss 3.3015 (3.0306) 
grad_norm 5.1752 (3.4241) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][400/1251] eta 0:03:46 lr 0.000294 wd 0.0500 time 0.2319 (0.2661) data time 0.0011 (0.0046) model time 0.2308 (0.2294) loss 2.7537 (3.0337) grad_norm 3.5634 (3.4225) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][410/1251] eta 0:03:42 lr 0.000294 wd 0.0500 time 0.2306 (0.2652) data time 0.0011 (0.0045) model time 0.2296 (0.2293) loss 3.0884 (3.0374) grad_norm 2.3138 (3.4128) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][420/1251] eta 0:03:39 lr 0.000294 wd 0.0500 time 0.2420 (0.2643) data time 0.0007 (0.0045) model time 0.2413 (0.2293) loss 1.9380 (3.0364) grad_norm 2.8984 (3.4042) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][430/1251] eta 0:03:36 lr 0.000294 wd 0.0500 time 0.2258 (0.2635) data time 0.0011 (0.0044) model time 0.2247 (0.2292) loss 3.4184 (3.0399) grad_norm 4.0347 (3.3970) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][440/1251] eta 0:03:33 lr 0.000293 wd 0.0500 time 0.2593 (0.2628) data time 0.0012 (0.0044) model time 0.2581 (0.2292) loss 3.2846 (3.0444) grad_norm 2.4026 (3.4094) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][450/1251] eta 0:03:29 lr 0.000293 wd 0.0500 time 0.2322 (0.2620) data time 0.0010 (0.0043) model time 0.2312 (0.2292) loss 2.5352 (3.0459) grad_norm 2.4964 (3.3974) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][460/1251] eta 0:03:26 lr 0.000293 wd 0.0500 time 
0.2289 (0.2614) data time 0.0010 (0.0043) model time 0.2279 (0.2292) loss 2.0564 (3.0420) grad_norm 2.6972 (3.3939) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][470/1251] eta 0:03:23 lr 0.000293 wd 0.0500 time 0.2374 (0.2607) data time 0.0015 (0.0042) model time 0.2359 (0.2292) loss 2.8600 (3.0386) grad_norm 3.5704 (3.3906) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][480/1251] eta 0:03:20 lr 0.000293 wd 0.0500 time 0.2214 (0.2600) data time 0.0010 (0.0041) model time 0.2204 (0.2291) loss 3.5306 (3.0351) grad_norm 2.6552 (3.3867) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][490/1251] eta 0:03:17 lr 0.000293 wd 0.0500 time 0.2393 (0.2594) data time 0.0012 (0.0041) model time 0.2381 (0.2291) loss 2.7209 (3.0354) grad_norm 2.3724 (3.3760) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][500/1251] eta 0:03:14 lr 0.000293 wd 0.0500 time 0.2226 (0.2587) data time 0.0013 (0.0040) model time 0.2214 (0.2290) loss 3.0120 (3.0328) grad_norm 4.2728 (3.3963) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][510/1251] eta 0:03:11 lr 0.000293 wd 0.0500 time 0.2299 (0.2581) data time 0.0014 (0.0039) model time 0.2285 (0.2290) loss 2.9956 (3.0333) grad_norm 3.2852 (3.3949) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][520/1251] eta 0:03:08 lr 0.000293 wd 0.0500 time 0.2329 (0.2575) data time 0.0009 (0.0039) model time 0.2320 (0.2289) loss 3.1997 (3.0401) grad_norm 7.4170 (3.4030) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:04 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [199/300][530/1251] eta 0:03:05 lr 0.000293 wd 0.0500 time 0.2268 (0.2569) data time 0.0009 (0.0038) model time 0.2259 (0.2289) loss 3.5302 (3.0339) grad_norm 2.0490 (3.3939) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][540/1251] eta 0:03:02 lr 0.000293 wd 0.0500 time 0.2241 (0.2564) data time 0.0008 (0.0038) model time 0.2233 (0.2288) loss 2.0827 (3.0283) grad_norm 3.2636 (3.3852) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][550/1251] eta 0:02:59 lr 0.000293 wd 0.0500 time 0.2414 (0.2559) data time 0.0008 (0.0037) model time 0.2406 (0.2288) loss 3.3765 (3.0291) grad_norm 3.9359 (3.3798) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][560/1251] eta 0:02:56 lr 0.000293 wd 0.0500 time 0.2191 (0.2553) data time 0.0008 (0.0037) model time 0.2183 (0.2287) loss 2.0556 (3.0323) grad_norm 3.1288 (3.3701) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][570/1251] eta 0:02:53 lr 0.000293 wd 0.0500 time 0.2386 (0.2549) data time 0.0011 (0.0036) model time 0.2375 (0.2287) loss 3.2980 (3.0349) grad_norm 2.7934 (3.3635) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][580/1251] eta 0:02:50 lr 0.000293 wd 0.0500 time 0.2265 (0.2544) data time 0.0008 (0.0036) model time 0.2258 (0.2286) loss 3.6454 (3.0362) grad_norm 3.4839 (3.3623) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][590/1251] eta 0:02:47 lr 0.000293 wd 0.0500 time 0.2687 (0.2540) data time 0.0007 (0.0036) model time 0.2680 (0.2287) loss 3.7873 (3.0389) grad_norm 2.4106 
(3.3508) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][600/1251] eta 0:02:45 lr 0.000293 wd 0.0500 time 0.2356 (0.2536) data time 0.0009 (0.0035) model time 0.2347 (0.2287) loss 3.5952 (3.0415) grad_norm 2.3705 (3.3472) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][610/1251] eta 0:02:42 lr 0.000293 wd 0.0500 time 0.2309 (0.2533) data time 0.0010 (0.0035) model time 0.2299 (0.2287) loss 3.4837 (3.0395) grad_norm 3.5988 (3.3418) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][620/1251] eta 0:02:39 lr 0.000293 wd 0.0500 time 0.2336 (0.2529) data time 0.0009 (0.0035) model time 0.2326 (0.2287) loss 3.0435 (3.0394) grad_norm 2.8172 (3.3372) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][630/1251] eta 0:02:36 lr 0.000293 wd 0.0500 time 0.2372 (0.2525) data time 0.0012 (0.0034) model time 0.2360 (0.2287) loss 3.1379 (3.0421) grad_norm 2.6060 (3.3357) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][640/1251] eta 0:02:34 lr 0.000293 wd 0.0500 time 0.2250 (0.2522) data time 0.0007 (0.0034) model time 0.2243 (0.2288) loss 2.8374 (3.0426) grad_norm 2.6419 (3.3301) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][650/1251] eta 0:02:31 lr 0.000293 wd 0.0500 time 0.2329 (0.2519) data time 0.0007 (0.0033) model time 0.2322 (0.2288) loss 3.2742 (3.0387) grad_norm 5.6002 (3.3367) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][660/1251] eta 0:02:28 lr 0.000293 wd 0.0500 time 0.2260 (0.2515) data 
time 0.0010 (0.0033) model time 0.2250 (0.2288) loss 2.1911 (3.0355) grad_norm 2.9174 (3.3395) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][670/1251] eta 0:02:25 lr 0.000293 wd 0.0500 time 0.2342 (0.2512) data time 0.0015 (0.0033) model time 0.2327 (0.2287) loss 2.9644 (3.0338) grad_norm 3.3839 (3.3457) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][680/1251] eta 0:02:23 lr 0.000293 wd 0.0500 time 0.2270 (0.2508) data time 0.0011 (0.0033) model time 0.2259 (0.2287) loss 2.6553 (3.0370) grad_norm 2.1898 (3.3347) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][690/1251] eta 0:02:20 lr 0.000292 wd 0.0500 time 0.2243 (0.2505) data time 0.0007 (0.0032) model time 0.2237 (0.2287) loss 3.5694 (3.0385) grad_norm 2.4673 (3.3283) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][700/1251] eta 0:02:17 lr 0.000292 wd 0.0500 time 0.2281 (0.2502) data time 0.0009 (0.0032) model time 0.2272 (0.2286) loss 1.9454 (3.0337) grad_norm 2.8499 (3.3255) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][710/1251] eta 0:02:15 lr 0.000292 wd 0.0500 time 0.2420 (0.2499) data time 0.0008 (0.0032) model time 0.2412 (0.2287) loss 1.9998 (3.0334) grad_norm 2.4171 (3.3181) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][720/1251] eta 0:02:12 lr 0.000292 wd 0.0500 time 0.2470 (0.2496) data time 0.0009 (0.0031) model time 0.2460 (0.2287) loss 2.0855 (3.0280) grad_norm 2.5495 (3.3183) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [199/300][730/1251] eta 0:02:09 lr 0.000292 wd 0.0500 time 0.2367 (0.2493) data time 0.0009 (0.0031) model time 0.2357 (0.2287) loss 2.3857 (3.0277) grad_norm 2.5841 (3.3119) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][740/1251] eta 0:02:07 lr 0.000292 wd 0.0500 time 0.2254 (0.2491) data time 0.0007 (0.0031) model time 0.2247 (0.2287) loss 3.6423 (3.0328) grad_norm 3.0064 (3.3082) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][750/1251] eta 0:02:04 lr 0.000292 wd 0.0500 time 0.2269 (0.2488) data time 0.0007 (0.0031) model time 0.2262 (0.2287) loss 3.4152 (3.0351) grad_norm 3.8032 (3.3055) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][760/1251] eta 0:02:02 lr 0.000292 wd 0.0500 time 0.2417 (0.2486) data time 0.0009 (0.0030) model time 0.2408 (0.2286) loss 3.0233 (3.0324) grad_norm 2.1173 (3.2970) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][770/1251] eta 0:01:59 lr 0.000292 wd 0.0500 time 0.2392 (0.2483) data time 0.0013 (0.0030) model time 0.2379 (0.2287) loss 2.0764 (3.0334) grad_norm 2.9144 (3.2960) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][780/1251] eta 0:01:56 lr 0.000292 wd 0.0500 time 0.2527 (0.2481) data time 0.0010 (0.0030) model time 0.2517 (0.2287) loss 2.7958 (3.0337) grad_norm 2.9818 (3.2931) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][790/1251] eta 0:01:54 lr 0.000292 wd 0.0500 time 0.2342 (0.2479) data time 0.0011 (0.0030) model time 0.2331 (0.2287) loss 3.1825 (3.0333) grad_norm 4.1370 (3.2985) loss_scale 512.0000 
(512.0000) mem 7374MB [2024-08-27 22:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][800/1251] eta 0:01:51 lr 0.000292 wd 0.0500 time 0.2431 (0.2477) data time 0.0008 (0.0029) model time 0.2423 (0.2287) loss 3.9237 (3.0349) grad_norm 4.4891 (3.3041) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][810/1251] eta 0:01:49 lr 0.000292 wd 0.0500 time 0.2494 (0.2474) data time 0.0009 (0.0029) model time 0.2485 (0.2287) loss 2.3245 (3.0312) grad_norm 3.7523 (3.3194) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][820/1251] eta 0:01:46 lr 0.000292 wd 0.0500 time 0.2330 (0.2475) data time 0.0009 (0.0029) model time 0.2321 (0.2290) loss 3.3106 (3.0293) grad_norm 4.1782 (3.3190) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][830/1251] eta 0:01:44 lr 0.000292 wd 0.0500 time 0.2224 (0.2474) data time 0.0010 (0.0029) model time 0.2214 (0.2291) loss 2.4211 (3.0252) grad_norm 6.9790 (3.3185) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][840/1251] eta 0:01:41 lr 0.000292 wd 0.0500 time 0.2275 (0.2473) data time 0.0015 (0.0029) model time 0.2260 (0.2292) loss 1.9443 (3.0251) grad_norm 3.1832 (3.3209) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][850/1251] eta 0:01:39 lr 0.000292 wd 0.0500 time 0.2288 (0.2470) data time 0.0009 (0.0028) model time 0.2279 (0.2292) loss 3.3934 (3.0237) grad_norm 5.6791 (3.3234) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][860/1251] eta 0:01:36 lr 0.000292 wd 0.0500 time 0.2344 (0.2469) data time 0.0009 (0.0028) model 
time 0.2335 (0.2292) loss 3.1728 (3.0230) grad_norm 3.0826 (3.3240) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][870/1251] eta 0:01:33 lr 0.000292 wd 0.0500 time 0.2259 (0.2467) data time 0.0008 (0.0028) model time 0.2251 (0.2292) loss 2.6477 (3.0206) grad_norm 5.3481 (3.3248) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][880/1251] eta 0:01:31 lr 0.000292 wd 0.0500 time 0.2287 (0.2465) data time 0.0007 (0.0028) model time 0.2280 (0.2292) loss 2.4054 (3.0217) grad_norm 3.2444 (3.3338) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][890/1251] eta 0:01:28 lr 0.000292 wd 0.0500 time 0.2399 (0.2463) data time 0.0012 (0.0028) model time 0.2387 (0.2292) loss 2.6489 (3.0209) grad_norm 2.4871 (3.3340) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][900/1251] eta 0:01:26 lr 0.000292 wd 0.0500 time 0.2342 (0.2462) data time 0.0007 (0.0028) model time 0.2335 (0.2292) loss 2.9104 (3.0208) grad_norm 3.2901 (3.3277) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][910/1251] eta 0:01:23 lr 0.000292 wd 0.0500 time 0.2211 (0.2460) data time 0.0007 (0.0028) model time 0.2205 (0.2292) loss 1.9745 (3.0210) grad_norm 3.8594 (3.3266) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][920/1251] eta 0:01:21 lr 0.000292 wd 0.0500 time 0.2240 (0.2458) data time 0.0009 (0.0028) model time 0.2231 (0.2291) loss 2.9563 (3.0212) grad_norm 3.4306 (3.3238) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][930/1251] 
eta 0:01:18 lr 0.000292 wd 0.0500 time 0.2278 (0.2457) data time 0.0010 (0.0028) model time 0.2268 (0.2292) loss 3.3763 (3.0238) grad_norm 3.7548 (3.3217) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][940/1251] eta 0:01:16 lr 0.000291 wd 0.0500 time 0.2260 (0.2455) data time 0.0010 (0.0027) model time 0.2250 (0.2291) loss 3.3885 (3.0238) grad_norm 3.8030 (3.3359) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][950/1251] eta 0:01:13 lr 0.000291 wd 0.0500 time 0.2296 (0.2453) data time 0.0007 (0.0027) model time 0.2289 (0.2291) loss 2.2771 (3.0218) grad_norm 3.8957 (3.3329) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][960/1251] eta 0:01:11 lr 0.000291 wd 0.0500 time 0.2567 (0.2452) data time 0.0011 (0.0027) model time 0.2556 (0.2292) loss 2.7695 (3.0196) grad_norm 4.8386 (3.3460) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][970/1251] eta 0:01:08 lr 0.000291 wd 0.0500 time 0.2240 (0.2450) data time 0.0012 (0.0027) model time 0.2229 (0.2292) loss 3.1657 (3.0206) grad_norm 4.0287 (3.3460) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][980/1251] eta 0:01:06 lr 0.000291 wd 0.0500 time 0.2218 (0.2449) data time 0.0009 (0.0027) model time 0.2210 (0.2291) loss 2.0342 (3.0216) grad_norm 2.2126 (3.3409) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][990/1251] eta 0:01:03 lr 0.000291 wd 0.0500 time 0.2224 (0.2447) data time 0.0008 (0.0027) model time 0.2216 (0.2291) loss 3.7064 (3.0218) grad_norm 4.2539 (3.3389) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 
22:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1000/1251] eta 0:01:01 lr 0.000291 wd 0.0500 time 0.2252 (0.2446) data time 0.0009 (0.0027) model time 0.2243 (0.2291) loss 2.3979 (3.0212) grad_norm 3.4691 (3.3346) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1010/1251] eta 0:00:58 lr 0.000291 wd 0.0500 time 0.2220 (0.2445) data time 0.0013 (0.0027) model time 0.2207 (0.2291) loss 2.9652 (3.0200) grad_norm 2.6943 (3.3320) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1020/1251] eta 0:00:56 lr 0.000291 wd 0.0500 time 0.2314 (0.2443) data time 0.0008 (0.0027) model time 0.2306 (0.2291) loss 2.1607 (3.0219) grad_norm 2.8592 (3.3295) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1030/1251] eta 0:00:53 lr 0.000291 wd 0.0500 time 0.2287 (0.2442) data time 0.0011 (0.0026) model time 0.2276 (0.2291) loss 3.0938 (3.0192) grad_norm 3.7293 (3.3263) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1040/1251] eta 0:00:51 lr 0.000291 wd 0.0500 time 0.2233 (0.2441) data time 0.0008 (0.0026) model time 0.2225 (0.2291) loss 3.3710 (3.0195) grad_norm 2.8288 (3.3215) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1050/1251] eta 0:00:49 lr 0.000291 wd 0.0500 time 0.2260 (0.2439) data time 0.0008 (0.0026) model time 0.2252 (0.2291) loss 2.7459 (3.0201) grad_norm 4.1843 (3.3233) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1060/1251] eta 0:00:46 lr 0.000291 wd 0.0500 time 0.2307 (0.2438) data time 0.0009 (0.0026) model time 0.2298 (0.2291) loss 
3.3173 (3.0211) grad_norm 4.2913 (3.3255) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1070/1251] eta 0:00:44 lr 0.000291 wd 0.0500 time 0.2265 (0.2437) data time 0.0011 (0.0026) model time 0.2254 (0.2291) loss 3.1128 (3.0222) grad_norm 3.2750 (3.3240) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1080/1251] eta 0:00:41 lr 0.000291 wd 0.0500 time 0.2273 (0.2435) data time 0.0010 (0.0026) model time 0.2263 (0.2291) loss 3.5930 (3.0210) grad_norm 3.1385 (3.3225) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1090/1251] eta 0:00:39 lr 0.000291 wd 0.0500 time 0.2300 (0.2434) data time 0.0007 (0.0026) model time 0.2293 (0.2291) loss 2.7911 (3.0209) grad_norm 3.0164 (3.3209) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1100/1251] eta 0:00:36 lr 0.000291 wd 0.0500 time 0.2220 (0.2433) data time 0.0007 (0.0026) model time 0.2213 (0.2291) loss 3.6444 (3.0205) grad_norm 2.5930 (3.3169) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1110/1251] eta 0:00:34 lr 0.000291 wd 0.0500 time 0.2286 (0.2431) data time 0.0007 (0.0026) model time 0.2279 (0.2290) loss 3.2775 (3.0200) grad_norm 2.8549 (3.3190) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1120/1251] eta 0:00:31 lr 0.000291 wd 0.0500 time 0.2252 (0.2430) data time 0.0015 (0.0025) model time 0.2237 (0.2290) loss 3.2275 (3.0215) grad_norm 3.4826 (3.3189) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1130/1251] eta 0:00:29 lr 
0.000291 wd 0.0500 time 0.2198 (0.2429) data time 0.0007 (0.0025) model time 0.2191 (0.2290) loss 3.4815 (3.0211) grad_norm 2.8894 (3.3150) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1140/1251] eta 0:00:26 lr 0.000291 wd 0.0500 time 0.2271 (0.2428) data time 0.0013 (0.0025) model time 0.2258 (0.2290) loss 2.5285 (3.0210) grad_norm 2.4701 (3.3124) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1150/1251] eta 0:00:24 lr 0.000291 wd 0.0500 time 0.2343 (0.2427) data time 0.0007 (0.0025) model time 0.2336 (0.2290) loss 4.1702 (3.0217) grad_norm 2.6169 (3.3085) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1160/1251] eta 0:00:22 lr 0.000291 wd 0.0500 time 0.2291 (0.2426) data time 0.0006 (0.0025) model time 0.2285 (0.2290) loss 2.9986 (3.0223) grad_norm 2.7456 (3.3054) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1170/1251] eta 0:00:19 lr 0.000291 wd 0.0500 time 0.2260 (0.2424) data time 0.0014 (0.0025) model time 0.2246 (0.2290) loss 3.4304 (3.0195) grad_norm 2.5546 (3.3133) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1180/1251] eta 0:00:17 lr 0.000291 wd 0.0500 time 0.2300 (0.2423) data time 0.0007 (0.0025) model time 0.2293 (0.2290) loss 4.0977 (3.0239) grad_norm 7.4479 (3.3219) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1190/1251] eta 0:00:14 lr 0.000290 wd 0.0500 time 0.2300 (0.2422) data time 0.0009 (0.0025) model time 0.2291 (0.2290) loss 3.5166 (3.0259) grad_norm 1.9969 (3.3181) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:39 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1200/1251] eta 0:00:12 lr 0.000290 wd 0.0500 time 0.2256 (0.2422) data time 0.0009 (0.0025) model time 0.2246 (0.2291) loss 2.8973 (3.0253) grad_norm 2.4876 (3.3146) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1210/1251] eta 0:00:09 lr 0.000290 wd 0.0500 time 0.2250 (0.2421) data time 0.0009 (0.0024) model time 0.2241 (0.2290) loss 3.9301 (3.0269) grad_norm 3.2894 (3.3099) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1220/1251] eta 0:00:07 lr 0.000290 wd 0.0500 time 0.2230 (0.2419) data time 0.0009 (0.0024) model time 0.2221 (0.2290) loss 3.6354 (3.0283) grad_norm 3.2079 (3.3074) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1230/1251] eta 0:00:05 lr 0.000290 wd 0.0500 time 0.2294 (0.2418) data time 0.0009 (0.0024) model time 0.2285 (0.2290) loss 2.9809 (3.0272) grad_norm 3.0239 (3.3076) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1240/1251] eta 0:00:02 lr 0.000290 wd 0.0500 time 0.2109 (0.2417) data time 0.0006 (0.0024) model time 0.2103 (0.2289) loss 3.4272 (3.0267) grad_norm 2.7141 (3.3085) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [199/300][1250/1251] eta 0:00:00 lr 0.000290 wd 0.0500 time 0.2113 (0.2414) data time 0.0006 (0.0024) model time 0.2107 (0.2288) loss 2.9449 (3.0274) grad_norm 6.2262 (3.3130) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 22:26:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 199 training takes 0:05:01 [2024-08-27 22:26:50 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:26:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.472 (0.472) Loss 0.4253 (0.4253) Acc@1 91.504 (91.504) Acc@5 98.145 (98.145) Mem 7374MB [2024-08-27 22:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.117) Loss 0.6787 (0.6639) Acc@1 85.840 (85.449) Acc@5 97.266 (97.212) Mem 7374MB [2024-08-27 22:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.101) Loss 0.9531 (0.6862) Acc@1 75.879 (84.566) Acc@5 95.117 (97.177) Mem 7374MB [2024-08-27 22:26:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.095) Loss 1.1426 (0.7772) Acc@1 72.559 (82.501) Acc@5 92.578 (96.125) Mem 7374MB [2024-08-27 22:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.0762 (0.8286) Acc@1 73.828 (81.243) Acc@5 93.652 (95.589) Mem 7374MB [2024-08-27 22:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.812 Acc@5 95.546 [2024-08-27 22:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.8% [2024-08-27 22:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.910 (0.910) Loss 0.3877 (0.3877) Acc@1 92.969 (92.969) Acc@5 98.438 (98.438) Mem 7374MB [2024-08-27 22:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.164) Loss 0.5967 (0.6092) Acc@1 89.062 (87.340) Acc@5 97.656 (97.585) Mem 7374MB [2024-08-27 22:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.124) Loss 0.8667 (0.6356) Acc@1 78.613 (86.221) Acc@5 96.094 (97.596) Mem 7374MB [2024-08-27 22:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.092 (0.111) Loss 1.0996 (0.7200) Acc@1 73.145 (84.151) Acc@5 93.066 (96.702) Mem 7374MB [2024-08-27 22:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.100) Loss 0.9805 (0.7630) Acc@1 75.977 (82.903) Acc@5 94.727 (96.213) Mem 7374MB [2024-08-27 22:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.456 Acc@5 96.218 [2024-08-27 22:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.5% [2024-08-27 22:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.46% [2024-08-27 22:27:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 22:27:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 22:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][0/1251] eta 0:17:40 lr 0.000290 wd 0.0500 time 0.8481 (0.8481) data time 0.5217 (0.5217) model time 0.0000 (0.0000) loss 2.2791 (2.2791) grad_norm 2.8232 (2.8232) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][10/1251] eta 0:05:54 lr 0.000290 wd 0.0500 time 0.2355 (0.2860) data time 0.0007 (0.0483) model time 0.0000 (0.0000) loss 3.8082 (2.8812) grad_norm 2.5593 (2.8771) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][20/1251] eta 0:05:19 lr 0.000290 wd 0.0500 time 0.2248 (0.2592) data time 0.0007 (0.0258) model time 0.0000 (0.0000) loss 2.8710 (3.0674) grad_norm 3.4486 (3.0995) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][30/1251] eta 0:05:04 lr 0.000290 wd 0.0500 time 0.2349 (0.2495) data time 0.0010 (0.0179) 
model time 0.0000 (0.0000) loss 3.3836 (3.1074) grad_norm 2.8699 (3.2755) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][40/1251] eta 0:04:55 lr 0.000290 wd 0.0500 time 0.2252 (0.2444) data time 0.0009 (0.0138) model time 0.0000 (0.0000) loss 3.3925 (3.1508) grad_norm 3.0309 (3.3053) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][50/1251] eta 0:04:49 lr 0.000290 wd 0.0500 time 0.2294 (0.2408) data time 0.0007 (0.0113) model time 0.0000 (0.0000) loss 2.3826 (3.1184) grad_norm 3.9172 (3.2451) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][60/1251] eta 0:04:44 lr 0.000290 wd 0.0500 time 0.2363 (0.2387) data time 0.0009 (0.0096) model time 0.2353 (0.2270) loss 3.1244 (3.1003) grad_norm 2.9188 (3.2925) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][70/1251] eta 0:04:43 lr 0.000290 wd 0.0500 time 0.2500 (0.2400) data time 0.0010 (0.0084) model time 0.2490 (0.2370) loss 2.7268 (3.0570) grad_norm 2.7055 (3.2242) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][80/1251] eta 0:04:39 lr 0.000290 wd 0.0500 time 0.2464 (0.2388) data time 0.0014 (0.0076) model time 0.2450 (0.2342) loss 3.7511 (3.0450) grad_norm 2.5142 (3.2137) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][90/1251] eta 0:04:36 lr 0.000290 wd 0.0500 time 0.2438 (0.2380) data time 0.0013 (0.0071) model time 0.2425 (0.2326) loss 3.1327 (3.0500) grad_norm 3.5305 (3.2571) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][100/1251] 
eta 0:04:33 lr 0.000290 wd 0.0500 time 0.2522 (0.2374) data time 0.0009 (0.0065) model time 0.2514 (0.2322) loss 3.4243 (3.0570) grad_norm 2.2920 (3.2036) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][110/1251] eta 0:04:30 lr 0.000290 wd 0.0500 time 0.2263 (0.2368) data time 0.0009 (0.0062) model time 0.2254 (0.2315) loss 3.2209 (3.0715) grad_norm 2.3452 (3.1592) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][120/1251] eta 0:04:27 lr 0.000290 wd 0.0500 time 0.2498 (0.2363) data time 0.0008 (0.0058) model time 0.2489 (0.2313) loss 2.5937 (3.0760) grad_norm 2.4686 (3.1549) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][130/1251] eta 0:04:24 lr 0.000290 wd 0.0500 time 0.2576 (0.2360) data time 0.0009 (0.0054) model time 0.2567 (0.2312) loss 3.0190 (3.0804) grad_norm 2.0253 (3.2365) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][140/1251] eta 0:04:21 lr 0.000290 wd 0.0500 time 0.2273 (0.2354) data time 0.0008 (0.0052) model time 0.2266 (0.2307) loss 2.2051 (3.0766) grad_norm 5.2066 (3.3606) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][150/1251] eta 0:04:18 lr 0.000290 wd 0.0500 time 0.2256 (0.2351) data time 0.0010 (0.0049) model time 0.2246 (0.2306) loss 3.3845 (3.0813) grad_norm 3.0032 (3.4075) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][160/1251] eta 0:04:15 lr 0.000290 wd 0.0500 time 0.2339 (0.2346) data time 0.0009 (0.0046) model time 0.2330 (0.2302) loss 2.4888 (3.0836) grad_norm 2.5865 (3.3868) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 
22:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][170/1251] eta 0:04:13 lr 0.000290 wd 0.0500 time 0.2405 (0.2342) data time 0.0013 (0.0044) model time 0.2392 (0.2299) loss 3.0298 (3.0750) grad_norm 2.9260 (3.3633) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][180/1251] eta 0:04:10 lr 0.000290 wd 0.0500 time 0.2286 (0.2339) data time 0.0009 (0.0042) model time 0.2276 (0.2297) loss 2.2445 (3.0674) grad_norm 2.5718 (3.3305) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][190/1251] eta 0:04:08 lr 0.000289 wd 0.0500 time 0.2393 (0.2338) data time 0.0010 (0.0041) model time 0.2384 (0.2298) loss 2.4749 (3.0666) grad_norm 2.5808 (3.2999) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][200/1251] eta 0:04:05 lr 0.000289 wd 0.0500 time 0.2409 (0.2336) data time 0.0010 (0.0039) model time 0.2399 (0.2297) loss 2.8124 (3.0721) grad_norm 2.9483 (3.2803) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][210/1251] eta 0:04:02 lr 0.000289 wd 0.0500 time 0.2295 (0.2334) data time 0.0010 (0.0038) model time 0.2285 (0.2296) loss 3.6489 (3.0750) grad_norm 3.5933 (3.2806) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][220/1251] eta 0:04:00 lr 0.000289 wd 0.0500 time 0.2329 (0.2333) data time 0.0010 (0.0037) model time 0.2320 (0.2296) loss 3.4109 (3.0687) grad_norm 2.7546 (3.2666) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][230/1251] eta 0:03:58 lr 0.000289 wd 0.0500 time 0.2310 (0.2332) data time 0.0011 (0.0036) model time 0.2300 (0.2296) loss 3.2572 
(3.0534) grad_norm 2.8204 (3.2917) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][240/1251] eta 0:03:55 lr 0.000289 wd 0.0500 time 0.2341 (0.2331) data time 0.0008 (0.0035) model time 0.2334 (0.2296) loss 2.9652 (3.0358) grad_norm 2.8002 (3.2751) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][250/1251] eta 0:03:53 lr 0.000289 wd 0.0500 time 0.2380 (0.2329) data time 0.0007 (0.0034) model time 0.2373 (0.2295) loss 3.2920 (3.0272) grad_norm 3.6996 (3.2657) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][260/1251] eta 0:03:50 lr 0.000289 wd 0.0500 time 0.2227 (0.2327) data time 0.0012 (0.0033) model time 0.2215 (0.2293) loss 3.5185 (3.0363) grad_norm 2.8562 (3.2615) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][270/1251] eta 0:03:48 lr 0.000289 wd 0.0500 time 0.2741 (0.2328) data time 0.0009 (0.0032) model time 0.2732 (0.2296) loss 3.2121 (3.0354) grad_norm 2.4699 (3.2419) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][280/1251] eta 0:03:46 lr 0.000289 wd 0.0500 time 0.2275 (0.2334) data time 0.0007 (0.0032) model time 0.2268 (0.2304) loss 2.0324 (3.0367) grad_norm 3.5230 (3.2299) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][290/1251] eta 0:03:44 lr 0.000289 wd 0.0500 time 0.2246 (0.2332) data time 0.0009 (0.0031) model time 0.2237 (0.2303) loss 3.2215 (3.0369) grad_norm 3.8100 (3.2287) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][300/1251] eta 0:03:41 lr 0.000289 wd 0.0500 
time 0.2232 (0.2331) data time 0.0007 (0.0030) model time 0.2226 (0.2302) loss 3.2098 (3.0343) grad_norm 4.5920 (3.2403) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][310/1251] eta 0:03:39 lr 0.000289 wd 0.0500 time 0.2222 (0.2331) data time 0.0007 (0.0030) model time 0.2215 (0.2303) loss 2.4891 (3.0324) grad_norm 3.3954 (3.2369) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][320/1251] eta 0:03:36 lr 0.000289 wd 0.0500 time 0.2247 (0.2330) data time 0.0006 (0.0029) model time 0.2241 (0.2302) loss 1.9483 (3.0172) grad_norm 3.9221 (3.2443) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][330/1251] eta 0:03:34 lr 0.000289 wd 0.0500 time 0.2282 (0.2328) data time 0.0013 (0.0029) model time 0.2268 (0.2300) loss 3.7649 (3.0197) grad_norm 3.2409 (3.2533) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][340/1251] eta 0:03:31 lr 0.000289 wd 0.0500 time 0.2201 (0.2327) data time 0.0010 (0.0029) model time 0.2191 (0.2299) loss 3.2720 (3.0213) grad_norm 3.3703 (3.2486) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][350/1251] eta 0:03:29 lr 0.000289 wd 0.0500 time 0.2247 (0.2326) data time 0.0009 (0.0028) model time 0.2238 (0.2299) loss 3.1709 (3.0257) grad_norm 5.0603 (3.2633) loss_scale 1024.0000 (522.2108) mem 7379MB [2024-08-27 22:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][360/1251] eta 0:03:27 lr 0.000289 wd 0.0500 time 0.2217 (0.2325) data time 0.0010 (0.0028) model time 0.2207 (0.2298) loss 1.7822 (3.0302) grad_norm 2.2845 (3.2737) loss_scale 1024.0000 (536.1108) mem 7379MB [2024-08-27 22:28:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [200/300][370/1251] eta 0:03:24 lr 0.000289 wd 0.0500 time 0.2357 (0.2325) data time 0.0008 (0.0027) model time 0.2349 (0.2297) loss 3.6084 (3.0324) grad_norm 3.2781 (3.2613) loss_scale 1024.0000 (549.2615) mem 7379MB [2024-08-27 22:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][380/1251] eta 0:03:22 lr 0.000289 wd 0.0500 time 0.2309 (0.2324) data time 0.0007 (0.0027) model time 0.2302 (0.2297) loss 2.5825 (3.0302) grad_norm 3.3230 (3.2580) loss_scale 1024.0000 (561.7218) mem 7379MB [2024-08-27 22:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][390/1251] eta 0:03:20 lr 0.000289 wd 0.0500 time 0.2206 (0.2324) data time 0.0007 (0.0027) model time 0.2199 (0.2297) loss 2.9938 (3.0345) grad_norm 2.9599 (3.2473) loss_scale 1024.0000 (573.5448) mem 7379MB [2024-08-27 22:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][400/1251] eta 0:03:17 lr 0.000289 wd 0.0500 time 0.2263 (0.2323) data time 0.0009 (0.0026) model time 0.2254 (0.2297) loss 2.1549 (3.0183) grad_norm 3.4994 (3.2503) loss_scale 1024.0000 (584.7781) mem 7379MB [2024-08-27 22:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][410/1251] eta 0:03:15 lr 0.000289 wd 0.0500 time 0.2341 (0.2322) data time 0.0007 (0.0026) model time 0.2334 (0.2296) loss 2.1276 (3.0109) grad_norm 2.4661 (3.2342) loss_scale 1024.0000 (595.4647) mem 7379MB [2024-08-27 22:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][420/1251] eta 0:03:12 lr 0.000289 wd 0.0500 time 0.2267 (0.2322) data time 0.0011 (0.0026) model time 0.2256 (0.2296) loss 2.2556 (3.0082) grad_norm 3.0734 (3.2409) loss_scale 1024.0000 (605.6437) mem 7379MB [2024-08-27 22:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][430/1251] eta 0:03:10 lr 0.000289 wd 0.0500 time 0.2243 (0.2321) data time 0.0011 (0.0026) model time 0.2232 (0.2296) loss 3.7957 (3.0128) grad_norm 3.4894 
(3.2331) loss_scale 1024.0000 (615.3503) mem 7379MB [2024-08-27 22:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][440/1251] eta 0:03:08 lr 0.000288 wd 0.0500 time 0.2310 (0.2321) data time 0.0008 (0.0026) model time 0.2303 (0.2295) loss 2.9662 (3.0143) grad_norm 4.5845 (3.2323) loss_scale 1024.0000 (624.6168) mem 7379MB [2024-08-27 22:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][450/1251] eta 0:03:05 lr 0.000288 wd 0.0500 time 0.2239 (0.2321) data time 0.0010 (0.0026) model time 0.2229 (0.2295) loss 4.1516 (3.0204) grad_norm 3.3247 (3.2321) loss_scale 1024.0000 (633.4723) mem 7379MB [2024-08-27 22:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][460/1251] eta 0:03:03 lr 0.000288 wd 0.0500 time 0.2258 (0.2321) data time 0.0013 (0.0025) model time 0.2244 (0.2295) loss 3.0805 (3.0236) grad_norm 3.0883 (3.2335) loss_scale 1024.0000 (641.9436) mem 7379MB [2024-08-27 22:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][470/1251] eta 0:03:01 lr 0.000288 wd 0.0500 time 0.2184 (0.2320) data time 0.0008 (0.0025) model time 0.2175 (0.2295) loss 2.1532 (3.0187) grad_norm 2.7807 (3.2358) loss_scale 1024.0000 (650.0552) mem 7379MB [2024-08-27 22:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][480/1251] eta 0:02:58 lr 0.000288 wd 0.0500 time 0.2336 (0.2320) data time 0.0010 (0.0025) model time 0.2327 (0.2295) loss 3.0446 (3.0176) grad_norm 2.8502 (3.2516) loss_scale 1024.0000 (657.8295) mem 7379MB [2024-08-27 22:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][490/1251] eta 0:02:56 lr 0.000288 wd 0.0500 time 0.2311 (0.2320) data time 0.0013 (0.0025) model time 0.2298 (0.2295) loss 3.1896 (3.0172) grad_norm 3.0199 (3.2624) loss_scale 1024.0000 (665.2872) mem 7379MB [2024-08-27 22:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][500/1251] eta 0:02:54 lr 0.000288 wd 0.0500 time 0.2251 
(0.2320) data time 0.0007 (0.0025) model time 0.2244 (0.2295) loss 3.7305 (3.0180) grad_norm 2.5280 (3.2655) loss_scale 1024.0000 (672.4471) mem 7379MB [2024-08-27 22:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][510/1251] eta 0:02:51 lr 0.000288 wd 0.0500 time 0.2258 (0.2320) data time 0.0008 (0.0025) model time 0.2250 (0.2295) loss 2.2442 (3.0115) grad_norm 3.6220 (3.2633) loss_scale 1024.0000 (679.3268) mem 7379MB [2024-08-27 22:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][520/1251] eta 0:02:49 lr 0.000288 wd 0.0500 time 0.2218 (0.2320) data time 0.0009 (0.0025) model time 0.2209 (0.2295) loss 3.0485 (3.0149) grad_norm 2.9885 (3.2642) loss_scale 1024.0000 (685.9424) mem 7379MB [2024-08-27 22:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][530/1251] eta 0:02:47 lr 0.000288 wd 0.0500 time 0.2224 (0.2319) data time 0.0013 (0.0025) model time 0.2211 (0.2294) loss 2.1620 (3.0168) grad_norm 2.3370 (3.2654) loss_scale 1024.0000 (692.3089) mem 7379MB [2024-08-27 22:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][540/1251] eta 0:02:44 lr 0.000288 wd 0.0500 time 0.2236 (0.2319) data time 0.0009 (0.0025) model time 0.2227 (0.2294) loss 3.1845 (3.0155) grad_norm 2.5881 (3.2602) loss_scale 1024.0000 (698.4399) mem 7379MB [2024-08-27 22:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][550/1251] eta 0:02:42 lr 0.000288 wd 0.0500 time 0.2211 (0.2319) data time 0.0009 (0.0024) model time 0.2202 (0.2294) loss 3.5509 (3.0230) grad_norm 1.9082 (3.2513) loss_scale 1024.0000 (704.3485) mem 7379MB [2024-08-27 22:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][560/1251] eta 0:02:40 lr 0.000288 wd 0.0500 time 0.2292 (0.2318) data time 0.0009 (0.0024) model time 0.2283 (0.2294) loss 3.1004 (3.0212) grad_norm 4.6914 (3.2478) loss_scale 1024.0000 (710.0463) mem 7379MB [2024-08-27 22:29:22 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [200/300][570/1251] eta 0:02:37 lr 0.000288 wd 0.0500 time 0.2252 (0.2318) data time 0.0008 (0.0024) model time 0.2243 (0.2294) loss 2.9295 (3.0207) grad_norm 3.2776 (3.2487) loss_scale 1024.0000 (715.5447) mem 7379MB [2024-08-27 22:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][580/1251] eta 0:02:35 lr 0.000288 wd 0.0500 time 0.2293 (0.2318) data time 0.0009 (0.0024) model time 0.2284 (0.2294) loss 3.1059 (3.0222) grad_norm 2.7720 (3.2446) loss_scale 1024.0000 (720.8537) mem 7379MB [2024-08-27 22:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][590/1251] eta 0:02:33 lr 0.000288 wd 0.0500 time 0.2224 (0.2318) data time 0.0011 (0.0024) model time 0.2212 (0.2295) loss 2.2911 (3.0219) grad_norm 2.3946 (3.2444) loss_scale 1024.0000 (725.9831) mem 7379MB [2024-08-27 22:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][600/1251] eta 0:02:30 lr 0.000288 wd 0.0500 time 0.2304 (0.2319) data time 0.0009 (0.0023) model time 0.2295 (0.2295) loss 2.8900 (3.0175) grad_norm 2.9502 (3.2359) loss_scale 1024.0000 (730.9418) mem 7379MB [2024-08-27 22:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][610/1251] eta 0:02:28 lr 0.000288 wd 0.0500 time 0.2267 (0.2322) data time 0.0009 (0.0023) model time 0.2258 (0.2299) loss 3.1802 (3.0149) grad_norm 2.6307 (3.2433) loss_scale 1024.0000 (735.7381) mem 7379MB [2024-08-27 22:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][620/1251] eta 0:02:26 lr 0.000288 wd 0.0500 time 0.2326 (0.2321) data time 0.0012 (0.0023) model time 0.2314 (0.2299) loss 3.2579 (3.0138) grad_norm 3.3763 (3.2415) loss_scale 1024.0000 (740.3800) mem 7379MB [2024-08-27 22:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][630/1251] eta 0:02:24 lr 0.000288 wd 0.0500 time 0.2314 (0.2321) data time 0.0011 (0.0023) model time 0.2302 (0.2299) loss 2.7475 (3.0154) grad_norm 2.3977 
(3.2413) loss_scale 1024.0000 (744.8748) mem 7379MB [2024-08-27 22:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][640/1251] eta 0:02:21 lr 0.000288 wd 0.0500 time 0.2240 (0.2321) data time 0.0010 (0.0023) model time 0.2230 (0.2298) loss 3.0814 (3.0186) grad_norm 2.4510 (3.2333) loss_scale 1024.0000 (749.2293) mem 7379MB [2024-08-27 22:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][650/1251] eta 0:02:19 lr 0.000288 wd 0.0500 time 0.2368 (0.2321) data time 0.0013 (0.0023) model time 0.2356 (0.2298) loss 2.6468 (3.0137) grad_norm 3.5331 (3.2300) loss_scale 1024.0000 (753.4501) mem 7379MB [2024-08-27 22:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][660/1251] eta 0:02:17 lr 0.000288 wd 0.0500 time 0.2220 (0.2320) data time 0.0017 (0.0022) model time 0.2203 (0.2298) loss 2.7774 (3.0159) grad_norm 2.8621 (3.2217) loss_scale 1024.0000 (757.5431) mem 7379MB [2024-08-27 22:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][670/1251] eta 0:02:14 lr 0.000288 wd 0.0500 time 0.2274 (0.2320) data time 0.0010 (0.0022) model time 0.2264 (0.2298) loss 2.1016 (3.0110) grad_norm 5.5456 (3.2631) loss_scale 1024.0000 (761.5142) mem 7379MB [2024-08-27 22:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][680/1251] eta 0:02:12 lr 0.000288 wd 0.0500 time 0.2306 (0.2320) data time 0.0009 (0.0022) model time 0.2297 (0.2299) loss 2.9996 (3.0078) grad_norm 3.5611 (3.2729) loss_scale 1024.0000 (765.3686) mem 7379MB [2024-08-27 22:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][690/1251] eta 0:02:10 lr 0.000287 wd 0.0500 time 0.2310 (0.2321) data time 0.0011 (0.0022) model time 0.2298 (0.2299) loss 2.3301 (3.0071) grad_norm 3.5776 (3.2818) loss_scale 1024.0000 (769.1114) mem 7379MB [2024-08-27 22:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][700/1251] eta 0:02:07 lr 0.000287 wd 0.0500 time 0.2237 
(0.2320) data time 0.0010 (0.0022) model time 0.2226 (0.2298) loss 2.0239 (3.0046) grad_norm 3.2052 (3.2756) loss_scale 1024.0000 (772.7475) mem 7379MB [2024-08-27 22:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][710/1251] eta 0:02:05 lr 0.000287 wd 0.0500 time 0.2304 (0.2320) data time 0.0010 (0.0022) model time 0.2294 (0.2299) loss 3.4950 (3.0054) grad_norm 2.8226 (3.2759) loss_scale 1024.0000 (776.2813) mem 7379MB [2024-08-27 22:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][720/1251] eta 0:02:03 lr 0.000287 wd 0.0500 time 0.2233 (0.2320) data time 0.0010 (0.0022) model time 0.2224 (0.2298) loss 2.3925 (3.0030) grad_norm 2.3368 (3.2738) loss_scale 1024.0000 (779.7171) mem 7379MB [2024-08-27 22:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][730/1251] eta 0:02:00 lr 0.000287 wd 0.0500 time 0.2222 (0.2320) data time 0.0011 (0.0021) model time 0.2211 (0.2298) loss 4.0828 (3.0063) grad_norm 2.1244 (3.2730) loss_scale 1024.0000 (783.0588) mem 7379MB [2024-08-27 22:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][740/1251] eta 0:01:58 lr 0.000287 wd 0.0500 time 0.2305 (0.2320) data time 0.0007 (0.0021) model time 0.2297 (0.2298) loss 3.7338 (3.0010) grad_norm 2.4114 (3.2759) loss_scale 1024.0000 (786.3104) mem 7379MB [2024-08-27 22:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][750/1251] eta 0:01:56 lr 0.000287 wd 0.0500 time 0.2483 (0.2319) data time 0.0006 (0.0021) model time 0.2477 (0.2298) loss 1.9579 (3.0003) grad_norm 2.2315 (3.2728) loss_scale 1024.0000 (789.4754) mem 7379MB [2024-08-27 22:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][760/1251] eta 0:01:53 lr 0.000287 wd 0.0500 time 0.2212 (0.2319) data time 0.0010 (0.0021) model time 0.2202 (0.2298) loss 3.4588 (3.0041) grad_norm 4.9508 (3.2950) loss_scale 1024.0000 (792.5572) mem 7379MB [2024-08-27 22:30:08 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [200/300][770/1251] eta 0:01:51 lr 0.000287 wd 0.0500 time 0.2256 (0.2318) data time 0.0010 (0.0021) model time 0.2246 (0.2298) loss 3.2042 (3.0037) grad_norm 4.0733 (3.2930) loss_scale 1024.0000 (795.5590) mem 7379MB [2024-08-27 22:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][780/1251] eta 0:01:49 lr 0.000287 wd 0.0500 time 0.2333 (0.2318) data time 0.0008 (0.0021) model time 0.2325 (0.2298) loss 2.8276 (3.0039) grad_norm 3.0211 (3.2944) loss_scale 1024.0000 (798.4840) mem 7379MB [2024-08-27 22:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][790/1251] eta 0:01:46 lr 0.000287 wd 0.0500 time 0.2311 (0.2318) data time 0.0010 (0.0021) model time 0.2301 (0.2298) loss 2.6756 (3.0045) grad_norm 3.2569 (3.2975) loss_scale 1024.0000 (801.3350) mem 7379MB [2024-08-27 22:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][800/1251] eta 0:01:44 lr 0.000287 wd 0.0500 time 0.2288 (0.2318) data time 0.0008 (0.0021) model time 0.2280 (0.2298) loss 3.2830 (3.0047) grad_norm 7.5184 (3.3067) loss_scale 1024.0000 (804.1149) mem 7379MB [2024-08-27 22:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][810/1251] eta 0:01:42 lr 0.000287 wd 0.0500 time 0.2304 (0.2318) data time 0.0008 (0.0020) model time 0.2296 (0.2298) loss 2.7120 (3.0029) grad_norm 3.7378 (3.3136) loss_scale 1024.0000 (806.8261) mem 7379MB [2024-08-27 22:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][820/1251] eta 0:01:39 lr 0.000287 wd 0.0500 time 0.2236 (0.2318) data time 0.0016 (0.0020) model time 0.2220 (0.2297) loss 3.1900 (3.0015) grad_norm 3.1562 (3.3122) loss_scale 1024.0000 (809.4714) mem 7379MB [2024-08-27 22:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][830/1251] eta 0:01:37 lr 0.000287 wd 0.0500 time 0.2245 (0.2318) data time 0.0015 (0.0020) model time 0.2230 (0.2298) loss 2.9833 (3.0046) grad_norm 2.2452 
(3.3036) loss_scale 1024.0000 (812.0529) mem 7379MB [2024-08-27 22:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][840/1251] eta 0:01:35 lr 0.000287 wd 0.0500 time 0.2281 (0.2318) data time 0.0014 (0.0020) model time 0.2267 (0.2298) loss 3.2981 (3.0055) grad_norm 4.4730 (3.3088) loss_scale 1024.0000 (814.5731) mem 7379MB [2024-08-27 22:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][850/1251] eta 0:01:32 lr 0.000287 wd 0.0500 time 0.2273 (0.2318) data time 0.0007 (0.0020) model time 0.2266 (0.2298) loss 2.6168 (3.0056) grad_norm 5.7872 (3.3209) loss_scale 1024.0000 (817.0341) mem 7379MB [2024-08-27 22:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][860/1251] eta 0:01:30 lr 0.000287 wd 0.0500 time 0.2290 (0.2318) data time 0.0010 (0.0020) model time 0.2281 (0.2298) loss 2.5396 (3.0058) grad_norm 2.9584 (3.3179) loss_scale 1024.0000 (819.4379) mem 7379MB [2024-08-27 22:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][870/1251] eta 0:01:28 lr 0.000287 wd 0.0500 time 0.2252 (0.2318) data time 0.0007 (0.0020) model time 0.2244 (0.2298) loss 2.6558 (3.0057) grad_norm 3.1648 (3.3178) loss_scale 1024.0000 (821.7865) mem 7379MB [2024-08-27 22:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][880/1251] eta 0:01:25 lr 0.000287 wd 0.0500 time 0.2501 (0.2318) data time 0.0011 (0.0020) model time 0.2490 (0.2298) loss 2.7091 (3.0059) grad_norm 3.3642 (3.3192) loss_scale 1024.0000 (824.0817) mem 7379MB [2024-08-27 22:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][890/1251] eta 0:01:23 lr 0.000287 wd 0.0500 time 0.2326 (0.2318) data time 0.0015 (0.0020) model time 0.2311 (0.2298) loss 3.1972 (3.0090) grad_norm 2.6848 (3.3224) loss_scale 1024.0000 (826.3255) mem 7379MB [2024-08-27 22:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][900/1251] eta 0:01:21 lr 0.000287 wd 0.0500 time 0.2233 
(0.2317) data time 0.0011 (0.0020) model time 0.2222 (0.2298) loss 3.0156 (3.0075) grad_norm 3.4421 (3.3175) loss_scale 1024.0000 (828.5194) mem 7379MB [2024-08-27 22:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][910/1251] eta 0:01:19 lr 0.000287 wd 0.0500 time 0.2234 (0.2317) data time 0.0008 (0.0020) model time 0.2226 (0.2298) loss 3.2806 (3.0092) grad_norm 3.2128 (3.3170) loss_scale 1024.0000 (830.6652) mem 7379MB [2024-08-27 22:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][920/1251] eta 0:01:16 lr 0.000287 wd 0.0500 time 0.2251 (0.2317) data time 0.0008 (0.0020) model time 0.2243 (0.2298) loss 3.1596 (3.0100) grad_norm 3.5946 (3.3250) loss_scale 1024.0000 (832.7644) mem 7379MB [2024-08-27 22:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][930/1251] eta 0:01:14 lr 0.000287 wd 0.0500 time 0.2349 (0.2318) data time 0.0010 (0.0020) model time 0.2340 (0.2298) loss 3.2678 (3.0097) grad_norm 2.9541 (3.3344) loss_scale 1024.0000 (834.8185) mem 7379MB [2024-08-27 22:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][940/1251] eta 0:01:12 lr 0.000286 wd 0.0500 time 0.2276 (0.2317) data time 0.0010 (0.0020) model time 0.2266 (0.2298) loss 3.1089 (3.0108) grad_norm 2.9588 (3.3351) loss_scale 1024.0000 (836.8289) mem 7379MB [2024-08-27 22:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][950/1251] eta 0:01:09 lr 0.000286 wd 0.0500 time 0.2284 (0.2317) data time 0.0013 (0.0019) model time 0.2272 (0.2298) loss 3.1644 (3.0111) grad_norm 4.4863 (3.3347) loss_scale 1024.0000 (838.7971) mem 7379MB [2024-08-27 22:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][960/1251] eta 0:01:07 lr 0.000286 wd 0.0500 time 0.2201 (0.2317) data time 0.0010 (0.0019) model time 0.2190 (0.2298) loss 2.4635 (3.0084) grad_norm 2.8642 (3.3327) loss_scale 1024.0000 (840.7242) mem 7379MB [2024-08-27 22:30:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [200/300][970/1251] eta 0:01:05 lr 0.000286 wd 0.0500 time 0.2308 (0.2317) data time 0.0009 (0.0019) model time 0.2299 (0.2298) loss 3.1709 (3.0098) grad_norm 2.3453 (3.3275) loss_scale 1024.0000 (842.6117) mem 7379MB [2024-08-27 22:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][980/1251] eta 0:01:02 lr 0.000286 wd 0.0500 time 0.2228 (0.2317) data time 0.0008 (0.0019) model time 0.2219 (0.2297) loss 2.9142 (3.0069) grad_norm 2.9577 (inf) loss_scale 512.0000 (842.8950) mem 7379MB [2024-08-27 22:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][990/1251] eta 0:01:00 lr 0.000286 wd 0.0500 time 0.2303 (0.2317) data time 0.0007 (0.0019) model time 0.2297 (0.2298) loss 2.9262 (3.0050) grad_norm 4.1057 (inf) loss_scale 512.0000 (839.5560) mem 7379MB [2024-08-27 22:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1000/1251] eta 0:00:58 lr 0.000286 wd 0.0500 time 0.2220 (0.2317) data time 0.0008 (0.0019) model time 0.2212 (0.2298) loss 2.4801 (3.0041) grad_norm 3.3642 (inf) loss_scale 512.0000 (836.2837) mem 7379MB [2024-08-27 22:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1010/1251] eta 0:00:55 lr 0.000286 wd 0.0500 time 0.2414 (0.2316) data time 0.0007 (0.0019) model time 0.2407 (0.2297) loss 2.2397 (3.0036) grad_norm 3.0474 (inf) loss_scale 512.0000 (833.0762) mem 7379MB [2024-08-27 22:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1020/1251] eta 0:00:53 lr 0.000286 wd 0.0500 time 0.2292 (0.2316) data time 0.0007 (0.0019) model time 0.2286 (0.2297) loss 3.2642 (3.0033) grad_norm 4.2005 (inf) loss_scale 512.0000 (829.9314) mem 7379MB [2024-08-27 22:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1030/1251] eta 0:00:51 lr 0.000286 wd 0.0500 time 0.2264 (0.2316) data time 0.0012 (0.0019) model time 0.2253 (0.2297) loss 3.1749 (3.0048) grad_norm 2.6624 (inf) loss_scale 
512.0000 (826.8477) mem 7379MB [2024-08-27 22:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1040/1251] eta 0:00:48 lr 0.000286 wd 0.0500 time 0.2249 (0.2316) data time 0.0009 (0.0019) model time 0.2241 (0.2298) loss 2.9523 (3.0029) grad_norm 2.5553 (inf) loss_scale 512.0000 (823.8232) mem 7379MB [2024-08-27 22:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1050/1251] eta 0:00:46 lr 0.000286 wd 0.0500 time 0.2256 (0.2316) data time 0.0010 (0.0019) model time 0.2246 (0.2298) loss 2.8626 (3.0026) grad_norm 2.4140 (inf) loss_scale 512.0000 (820.8563) mem 7379MB [2024-08-27 22:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1060/1251] eta 0:00:44 lr 0.000286 wd 0.0500 time 0.2357 (0.2316) data time 0.0009 (0.0019) model time 0.2348 (0.2298) loss 3.1757 (3.0011) grad_norm 3.4295 (inf) loss_scale 512.0000 (817.9453) mem 7379MB [2024-08-27 22:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1070/1251] eta 0:00:41 lr 0.000286 wd 0.0500 time 0.2257 (0.2316) data time 0.0006 (0.0019) model time 0.2251 (0.2298) loss 3.5197 (3.0024) grad_norm 3.4399 (inf) loss_scale 512.0000 (815.0887) mem 7379MB [2024-08-27 22:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1080/1251] eta 0:00:39 lr 0.000286 wd 0.0500 time 0.2218 (0.2316) data time 0.0006 (0.0019) model time 0.2211 (0.2298) loss 2.8238 (3.0039) grad_norm 2.9185 (inf) loss_scale 512.0000 (812.2849) mem 7379MB [2024-08-27 22:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1090/1251] eta 0:00:37 lr 0.000286 wd 0.0500 time 0.2303 (0.2316) data time 0.0012 (0.0019) model time 0.2291 (0.2298) loss 2.0225 (3.0039) grad_norm 2.5525 (inf) loss_scale 512.0000 (809.5325) mem 7379MB [2024-08-27 22:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1100/1251] eta 0:00:34 lr 0.000286 wd 0.0500 time 0.2205 (0.2316) data time 0.0011 (0.0019) model 
time 0.2195 (0.2298) loss 3.1469 (3.0049) grad_norm 2.8584 (inf) loss_scale 512.0000 (806.8302) mem 7379MB [2024-08-27 22:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1110/1251] eta 0:00:32 lr 0.000286 wd 0.0500 time 0.2320 (0.2316) data time 0.0007 (0.0019) model time 0.2314 (0.2298) loss 3.5412 (3.0042) grad_norm 3.1421 (inf) loss_scale 512.0000 (804.1764) mem 7379MB [2024-08-27 22:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1120/1251] eta 0:00:30 lr 0.000286 wd 0.0500 time 0.2275 (0.2316) data time 0.0006 (0.0019) model time 0.2269 (0.2298) loss 2.6206 (3.0023) grad_norm 2.8935 (inf) loss_scale 512.0000 (801.5700) mem 7379MB [2024-08-27 22:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1130/1251] eta 0:00:28 lr 0.000286 wd 0.0500 time 0.2244 (0.2317) data time 0.0016 (0.0019) model time 0.2228 (0.2299) loss 2.6020 (3.0012) grad_norm 2.9149 (inf) loss_scale 512.0000 (799.0097) mem 7379MB [2024-08-27 22:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1140/1251] eta 0:00:25 lr 0.000286 wd 0.0500 time 0.2274 (0.2317) data time 0.0007 (0.0019) model time 0.2267 (0.2299) loss 2.4198 (3.0005) grad_norm 3.6637 (inf) loss_scale 512.0000 (796.4943) mem 7379MB [2024-08-27 22:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1150/1251] eta 0:00:23 lr 0.000286 wd 0.0500 time 0.2283 (0.2317) data time 0.0014 (0.0019) model time 0.2270 (0.2299) loss 2.8313 (2.9991) grad_norm 3.1251 (inf) loss_scale 512.0000 (794.0226) mem 7379MB [2024-08-27 22:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1160/1251] eta 0:00:21 lr 0.000286 wd 0.0500 time 0.2228 (0.2317) data time 0.0009 (0.0019) model time 0.2219 (0.2299) loss 3.4194 (3.0027) grad_norm 3.7845 (inf) loss_scale 512.0000 (791.5935) mem 7379MB [2024-08-27 22:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1170/1251] eta 0:00:18 lr 
0.000286 wd 0.0500 time 0.2247 (0.2317) data time 0.0009 (0.0018) model time 0.2238 (0.2298) loss 2.7691 (3.0004) grad_norm 2.4551 (inf) loss_scale 512.0000 (789.2058) mem 7379MB [2024-08-27 22:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1180/1251] eta 0:00:16 lr 0.000286 wd 0.0500 time 0.2290 (0.2316) data time 0.0008 (0.0018) model time 0.2281 (0.2298) loss 3.7297 (3.0008) grad_norm 3.8159 (inf) loss_scale 512.0000 (786.8586) mem 7379MB [2024-08-27 22:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1190/1251] eta 0:00:14 lr 0.000285 wd 0.0500 time 0.2307 (0.2316) data time 0.0008 (0.0018) model time 0.2299 (0.2298) loss 3.0917 (3.0015) grad_norm 2.9309 (inf) loss_scale 512.0000 (784.5508) mem 7379MB [2024-08-27 22:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1200/1251] eta 0:00:11 lr 0.000285 wd 0.0500 time 0.2195 (0.2316) data time 0.0010 (0.0018) model time 0.2185 (0.2298) loss 3.4910 (3.0026) grad_norm 2.9408 (inf) loss_scale 512.0000 (782.2814) mem 7379MB [2024-08-27 22:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1210/1251] eta 0:00:09 lr 0.000285 wd 0.0500 time 0.2281 (0.2317) data time 0.0010 (0.0018) model time 0.2271 (0.2299) loss 2.9537 (3.0057) grad_norm 3.0856 (inf) loss_scale 512.0000 (780.0495) mem 7379MB [2024-08-27 22:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1220/1251] eta 0:00:07 lr 0.000285 wd 0.0500 time 0.2271 (0.2319) data time 0.0011 (0.0018) model time 0.2260 (0.2301) loss 3.3093 (3.0039) grad_norm 3.2336 (inf) loss_scale 512.0000 (777.8542) mem 7379MB [2024-08-27 22:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1230/1251] eta 0:00:04 lr 0.000285 wd 0.0500 time 0.2291 (0.2319) data time 0.0009 (0.0018) model time 0.2282 (0.2301) loss 2.9381 (3.0043) grad_norm 4.6666 (inf) loss_scale 512.0000 (775.6946) mem 7379MB [2024-08-27 22:31:57 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [200/300][1240/1251] eta 0:00:02 lr 0.000285 wd 0.0500 time 0.2133 (0.2318) data time 0.0006 (0.0018) model time 0.2127 (0.2300) loss 2.9537 (3.0042) grad_norm 2.5176 (inf) loss_scale 512.0000 (773.5697) mem 7379MB [2024-08-27 22:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [200/300][1250/1251] eta 0:00:00 lr 0.000285 wd 0.0500 time 0.2120 (0.2316) data time 0.0004 (0.0018) model time 0.2116 (0.2298) loss 2.5844 (3.0031) grad_norm 2.0472 (inf) loss_scale 512.0000 (771.4788) mem 7379MB [2024-08-27 22:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 200 training takes 0:04:49 [2024-08-27 22:31:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:32:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.479 (0.479) Loss 0.4453 (0.4453) Acc@1 92.480 (92.480) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-27 22:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.120) Loss 0.6460 (0.6773) Acc@1 87.207 (85.866) Acc@5 97.559 (97.319) Mem 7379MB [2024-08-27 22:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.102) Loss 0.9966 (0.7042) Acc@1 76.367 (84.868) Acc@5 95.410 (97.261) Mem 7379MB [2024-08-27 22:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.095) Loss 1.1885 (0.7996) Acc@1 71.680 (82.482) Acc@5 92.480 (96.242) Mem 7379MB [2024-08-27 22:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.088) Loss 1.0391 (0.8477) Acc@1 75.781 (81.369) Acc@5 93.555 (95.663) Mem 7379MB [2024-08-27 22:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.926 Acc@5 95.616 [2024-08-27 22:32:04 
msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.9% [2024-08-27 22:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.945 (0.945) Loss 0.3882 (0.3882) Acc@1 92.969 (92.969) Acc@5 98.633 (98.633) Mem 7379MB [2024-08-27 22:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.099 (0.167) Loss 0.5947 (0.6085) Acc@1 89.062 (87.331) Acc@5 97.559 (97.621) Mem 7379MB [2024-08-27 22:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.126) Loss 0.8672 (0.6351) Acc@1 78.320 (86.230) Acc@5 96.094 (97.614) Mem 7379MB [2024-08-27 22:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.111) Loss 1.0977 (0.7195) Acc@1 73.340 (84.164) Acc@5 93.164 (96.740) Mem 7379MB [2024-08-27 22:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.101) Loss 0.9795 (0.7624) Acc@1 76.074 (82.939) Acc@5 94.727 (96.237) Mem 7379MB [2024-08-27 22:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.510 Acc@5 96.244 [2024-08-27 22:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.5% [2024-08-27 22:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.51% [2024-08-27 22:32:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 22:32:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-27 22:32:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][0/1251] eta 0:17:07 lr 0.000285 wd 0.0500 time 0.8214 (0.8214) data time 0.5903 (0.5903) model time 0.0000 (0.0000) loss 3.3023 (3.3023) grad_norm 4.6340 (4.6340) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][10/1251] eta 0:05:51 lr 0.000285 wd 0.0500 time 0.2332 (0.2832) data time 0.0009 (0.0546) model time 0.0000 (0.0000) loss 3.2813 (2.9869) grad_norm 3.0914 (3.7004) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][20/1251] eta 0:05:18 lr 0.000285 wd 0.0500 time 0.2326 (0.2587) data time 0.0009 (0.0292) model time 0.0000 (0.0000) loss 3.0016 (3.0608) grad_norm 2.4132 (3.5891) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][30/1251] eta 0:05:04 lr 0.000285 wd 0.0500 time 0.2374 (0.2493) data time 0.0008 (0.0201) model time 0.0000 (0.0000) loss 3.6744 (3.0998) grad_norm 3.8604 (3.5134) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][40/1251] eta 0:04:55 lr 0.000285 wd 0.0500 time 0.2247 (0.2440) data time 0.0007 (0.0155) model time 0.0000 (0.0000) loss 2.6681 (3.0359) grad_norm 3.4629 (3.4150) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][50/1251] eta 0:04:50 lr 0.000285 wd 0.0500 time 0.2452 (0.2416) data time 0.0009 (0.0127) model time 0.0000 (0.0000) loss 3.1402 (3.0354) grad_norm 2.0166 (3.4255) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][60/1251] eta 0:04:45 lr 0.000285 wd 0.0500 time 0.2284 (0.2398) data time 0.0010 (0.0108) model time 0.2275 (0.2290) loss 
3.3667 (3.0754) grad_norm 3.5658 (3.3733) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][70/1251] eta 0:04:40 lr 0.000285 wd 0.0500 time 0.2235 (0.2379) data time 0.0010 (0.0094) model time 0.2225 (0.2273) loss 2.7579 (3.0893) grad_norm 2.6171 (3.2967) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][80/1251] eta 0:04:37 lr 0.000285 wd 0.0500 time 0.2358 (0.2369) data time 0.0008 (0.0084) model time 0.2350 (0.2277) loss 3.7829 (3.0842) grad_norm 4.1660 (3.3186) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][90/1251] eta 0:04:33 lr 0.000285 wd 0.0500 time 0.2245 (0.2358) data time 0.0007 (0.0076) model time 0.2238 (0.2272) loss 2.6729 (3.0685) grad_norm 2.6118 (3.2561) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][100/1251] eta 0:04:30 lr 0.000285 wd 0.0500 time 0.2362 (0.2351) data time 0.0007 (0.0070) model time 0.2355 (0.2272) loss 3.8903 (3.0645) grad_norm 3.2464 (3.2561) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][110/1251] eta 0:04:27 lr 0.000285 wd 0.0500 time 0.2287 (0.2346) data time 0.0006 (0.0064) model time 0.2280 (0.2274) loss 3.1299 (3.0520) grad_norm 2.8840 (3.1995) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][120/1251] eta 0:04:24 lr 0.000285 wd 0.0500 time 0.2400 (0.2341) data time 0.0009 (0.0060) model time 0.2391 (0.2273) loss 2.8926 (3.0514) grad_norm 2.1391 (3.1493) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][130/1251] eta 0:04:21 lr 0.000285 wd 
0.0500 time 0.2367 (0.2336) data time 0.0007 (0.0057) model time 0.2359 (0.2273) loss 2.8700 (3.0461) grad_norm 3.1018 (3.1330) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][140/1251] eta 0:04:19 lr 0.000285 wd 0.0500 time 0.2465 (0.2334) data time 0.0008 (0.0053) model time 0.2457 (0.2275) loss 2.9168 (3.0493) grad_norm 4.9677 (3.3804) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][150/1251] eta 0:04:16 lr 0.000285 wd 0.0500 time 0.2437 (0.2333) data time 0.0011 (0.0051) model time 0.2427 (0.2278) loss 3.4373 (3.0619) grad_norm 2.4424 (3.3684) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][160/1251] eta 0:04:14 lr 0.000285 wd 0.0500 time 0.2499 (0.2331) data time 0.0009 (0.0048) model time 0.2490 (0.2280) loss 3.0450 (3.0468) grad_norm 3.4979 (3.3499) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][170/1251] eta 0:04:11 lr 0.000285 wd 0.0500 time 0.2320 (0.2328) data time 0.0011 (0.0047) model time 0.2309 (0.2278) loss 3.2076 (3.0483) grad_norm 2.6758 (3.3491) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][180/1251] eta 0:04:09 lr 0.000285 wd 0.0500 time 0.2427 (0.2327) data time 0.0008 (0.0045) model time 0.2419 (0.2280) loss 4.0680 (3.0556) grad_norm 2.4420 (3.3194) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][190/1251] eta 0:04:06 lr 0.000284 wd 0.0500 time 0.2317 (0.2326) data time 0.0010 (0.0043) model time 0.2308 (0.2280) loss 3.1524 (3.0591) grad_norm 3.7707 (3.3027) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:56 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [201/300][200/1251] eta 0:04:04 lr 0.000284 wd 0.0500 time 0.2252 (0.2323) data time 0.0007 (0.0041) model time 0.2244 (0.2279) loss 4.0091 (3.0645) grad_norm 2.8432 (3.2762) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][210/1251] eta 0:04:01 lr 0.000284 wd 0.0500 time 0.2370 (0.2322) data time 0.0009 (0.0040) model time 0.2361 (0.2279) loss 2.8610 (3.0712) grad_norm 3.2299 (3.2584) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][220/1251] eta 0:03:59 lr 0.000284 wd 0.0500 time 0.2347 (0.2321) data time 0.0009 (0.0039) model time 0.2338 (0.2280) loss 3.2641 (3.0686) grad_norm 2.7089 (3.2596) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][230/1251] eta 0:03:56 lr 0.000284 wd 0.0500 time 0.2327 (0.2319) data time 0.0011 (0.0038) model time 0.2317 (0.2279) loss 2.9741 (3.0735) grad_norm 4.4319 (3.2631) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][240/1251] eta 0:03:54 lr 0.000284 wd 0.0500 time 0.2366 (0.2318) data time 0.0010 (0.0037) model time 0.2356 (0.2278) loss 2.6689 (3.0622) grad_norm 2.5253 (3.2518) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][250/1251] eta 0:03:51 lr 0.000284 wd 0.0500 time 0.2398 (0.2316) data time 0.0009 (0.0036) model time 0.2389 (0.2278) loss 2.3157 (3.0597) grad_norm 15.2722 (3.3099) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][260/1251] eta 0:03:49 lr 0.000284 wd 0.0500 time 0.2261 (0.2314) data time 0.0012 (0.0035) model time 0.2248 (0.2277) loss 2.5460 (3.0487) grad_norm 3.5166 
(3.3021) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][270/1251] eta 0:03:46 lr 0.000284 wd 0.0500 time 0.2308 (0.2313) data time 0.0009 (0.0034) model time 0.2299 (0.2276) loss 2.2854 (3.0533) grad_norm 2.9017 (3.2795) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][280/1251] eta 0:03:44 lr 0.000284 wd 0.0500 time 0.2309 (0.2311) data time 0.0009 (0.0033) model time 0.2300 (0.2275) loss 2.8779 (3.0580) grad_norm 3.3051 (3.3017) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][290/1251] eta 0:03:41 lr 0.000284 wd 0.0500 time 0.2250 (0.2310) data time 0.0013 (0.0033) model time 0.2237 (0.2275) loss 2.0740 (3.0448) grad_norm 3.0461 (3.3005) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][300/1251] eta 0:03:39 lr 0.000284 wd 0.0500 time 0.2339 (0.2310) data time 0.0014 (0.0032) model time 0.2324 (0.2275) loss 2.9022 (3.0458) grad_norm 4.2472 (3.2864) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][310/1251] eta 0:03:37 lr 0.000284 wd 0.0500 time 0.2371 (0.2309) data time 0.0012 (0.0032) model time 0.2359 (0.2276) loss 3.2578 (3.0467) grad_norm 2.4535 (3.2716) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][320/1251] eta 0:03:35 lr 0.000284 wd 0.0500 time 0.2358 (0.2309) data time 0.0009 (0.0031) model time 0.2349 (0.2276) loss 3.0022 (3.0456) grad_norm 3.1142 (3.2680) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][330/1251] eta 0:03:32 lr 0.000284 wd 0.0500 time 0.2401 (0.2309) data 
time 0.0009 (0.0030) model time 0.2393 (0.2277) loss 3.2655 (3.0454) grad_norm 2.6589 (3.2728) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][340/1251] eta 0:03:30 lr 0.000284 wd 0.0500 time 0.2150 (0.2314) data time 0.0007 (0.0030) model time 0.2143 (0.2283) loss 2.8577 (3.0389) grad_norm 2.9100 (3.2695) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][350/1251] eta 0:03:28 lr 0.000284 wd 0.0500 time 0.2214 (0.2313) data time 0.0007 (0.0029) model time 0.2207 (0.2283) loss 2.9684 (3.0322) grad_norm 3.0173 (3.2709) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][360/1251] eta 0:03:26 lr 0.000284 wd 0.0500 time 0.2234 (0.2312) data time 0.0011 (0.0029) model time 0.2223 (0.2283) loss 2.0419 (3.0206) grad_norm 5.4284 (3.2863) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][370/1251] eta 0:03:23 lr 0.000284 wd 0.0500 time 0.2230 (0.2312) data time 0.0009 (0.0028) model time 0.2220 (0.2283) loss 3.2407 (3.0144) grad_norm 4.9671 (3.2988) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][380/1251] eta 0:03:21 lr 0.000284 wd 0.0500 time 0.2259 (0.2311) data time 0.0010 (0.0028) model time 0.2249 (0.2283) loss 1.8217 (3.0109) grad_norm 4.0868 (3.3269) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][390/1251] eta 0:03:18 lr 0.000284 wd 0.0500 time 0.2300 (0.2310) data time 0.0010 (0.0027) model time 0.2291 (0.2282) loss 3.2442 (3.0056) grad_norm 3.1069 (3.3366) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [201/300][400/1251] eta 0:03:16 lr 0.000284 wd 0.0500 time 0.2293 (0.2309) data time 0.0009 (0.0027) model time 0.2284 (0.2281) loss 3.6449 (3.0126) grad_norm 2.3557 (3.3247) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][410/1251] eta 0:03:14 lr 0.000284 wd 0.0500 time 0.2346 (0.2310) data time 0.0007 (0.0027) model time 0.2339 (0.2282) loss 1.9839 (3.0087) grad_norm 2.5747 (3.3159) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][420/1251] eta 0:03:11 lr 0.000284 wd 0.0500 time 0.2252 (0.2309) data time 0.0006 (0.0026) model time 0.2245 (0.2282) loss 2.4486 (3.0091) grad_norm 2.5559 (3.3016) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][430/1251] eta 0:03:09 lr 0.000284 wd 0.0500 time 0.2251 (0.2309) data time 0.0007 (0.0026) model time 0.2245 (0.2282) loss 3.3272 (3.0089) grad_norm 4.9229 (3.3059) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][440/1251] eta 0:03:07 lr 0.000283 wd 0.0500 time 0.2195 (0.2309) data time 0.0010 (0.0026) model time 0.2185 (0.2282) loss 3.3790 (3.0148) grad_norm 5.2326 (3.3092) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][450/1251] eta 0:03:04 lr 0.000283 wd 0.0500 time 0.2412 (0.2309) data time 0.0010 (0.0025) model time 0.2403 (0.2283) loss 2.3143 (3.0101) grad_norm 2.7159 (3.3169) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][460/1251] eta 0:03:02 lr 0.000283 wd 0.0500 time 0.2310 (0.2309) data time 0.0012 (0.0025) model time 0.2298 (0.2283) loss 2.8095 (3.0106) grad_norm 3.2665 (3.3209) loss_scale 512.0000 
(512.0000) mem 7379MB [2024-08-27 22:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][470/1251] eta 0:03:00 lr 0.000283 wd 0.0500 time 0.2210 (0.2308) data time 0.0010 (0.0025) model time 0.2200 (0.2282) loss 3.3521 (3.0116) grad_norm 3.4267 (3.3179) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][480/1251] eta 0:02:57 lr 0.000283 wd 0.0500 time 0.2306 (0.2307) data time 0.0011 (0.0025) model time 0.2296 (0.2282) loss 2.7361 (3.0140) grad_norm 4.6945 (3.3161) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][490/1251] eta 0:02:56 lr 0.000283 wd 0.0500 time 0.2204 (0.2316) data time 0.0008 (0.0024) model time 0.2196 (0.2291) loss 3.0305 (3.0092) grad_norm 2.6411 (3.3222) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][500/1251] eta 0:02:53 lr 0.000283 wd 0.0500 time 0.2252 (0.2315) data time 0.0015 (0.0024) model time 0.2237 (0.2291) loss 2.2615 (3.0078) grad_norm 2.4063 (3.3186) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][510/1251] eta 0:02:51 lr 0.000283 wd 0.0500 time 0.2267 (0.2314) data time 0.0011 (0.0024) model time 0.2256 (0.2290) loss 2.5630 (3.0070) grad_norm 2.5599 (3.3100) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][520/1251] eta 0:02:49 lr 0.000283 wd 0.0500 time 0.2270 (0.2314) data time 0.0009 (0.0024) model time 0.2261 (0.2290) loss 3.2043 (3.0097) grad_norm 1.9874 (3.2988) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][530/1251] eta 0:02:46 lr 0.000283 wd 0.0500 time 0.2269 (0.2313) data time 0.0012 (0.0023) model 
time 0.2257 (0.2290) loss 3.3655 (3.0064) grad_norm 2.3902 (3.2878) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][540/1251] eta 0:02:44 lr 0.000283 wd 0.0500 time 0.2367 (0.2313) data time 0.0011 (0.0023) model time 0.2356 (0.2290) loss 3.1925 (3.0091) grad_norm 3.6866 (3.2868) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][550/1251] eta 0:02:42 lr 0.000283 wd 0.0500 time 0.2337 (0.2313) data time 0.0008 (0.0023) model time 0.2330 (0.2290) loss 3.4183 (3.0067) grad_norm 2.4926 (3.2793) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][560/1251] eta 0:02:39 lr 0.000283 wd 0.0500 time 0.2305 (0.2313) data time 0.0009 (0.0023) model time 0.2296 (0.2290) loss 2.6516 (3.0080) grad_norm 3.2238 (3.2719) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][570/1251] eta 0:02:37 lr 0.000283 wd 0.0500 time 0.2405 (0.2312) data time 0.0010 (0.0023) model time 0.2395 (0.2290) loss 3.3566 (3.0055) grad_norm 2.8072 (3.2766) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][580/1251] eta 0:02:35 lr 0.000283 wd 0.0500 time 0.2261 (0.2312) data time 0.0010 (0.0022) model time 0.2251 (0.2290) loss 3.4470 (3.0033) grad_norm 2.6723 (3.2768) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][590/1251] eta 0:02:32 lr 0.000283 wd 0.0500 time 0.2274 (0.2311) data time 0.0013 (0.0022) model time 0.2261 (0.2289) loss 2.1804 (2.9996) grad_norm 5.2720 (3.2822) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][600/1251] 
eta 0:02:30 lr 0.000283 wd 0.0500 time 0.2194 (0.2311) data time 0.0012 (0.0022) model time 0.2182 (0.2288) loss 2.3999 (3.0026) grad_norm 3.1867 (3.2773) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][610/1251] eta 0:02:28 lr 0.000283 wd 0.0500 time 0.2351 (0.2310) data time 0.0009 (0.0022) model time 0.2342 (0.2288) loss 2.7862 (2.9997) grad_norm 3.4877 (3.2758) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][620/1251] eta 0:02:25 lr 0.000283 wd 0.0500 time 0.2268 (0.2311) data time 0.0009 (0.0022) model time 0.2259 (0.2289) loss 2.7688 (2.9995) grad_norm 2.3997 (3.2661) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][630/1251] eta 0:02:23 lr 0.000283 wd 0.0500 time 0.2274 (0.2311) data time 0.0007 (0.0021) model time 0.2268 (0.2289) loss 2.0473 (2.9992) grad_norm 2.7744 (3.2649) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][640/1251] eta 0:02:21 lr 0.000283 wd 0.0500 time 0.2339 (0.2310) data time 0.0009 (0.0021) model time 0.2330 (0.2289) loss 2.0233 (2.9989) grad_norm 2.4154 (3.2648) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][650/1251] eta 0:02:18 lr 0.000283 wd 0.0500 time 0.2398 (0.2310) data time 0.0013 (0.0022) model time 0.2385 (0.2289) loss 3.2175 (3.0000) grad_norm 2.6317 (3.2597) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][660/1251] eta 0:02:16 lr 0.000283 wd 0.0500 time 0.2296 (0.2310) data time 0.0012 (0.0021) model time 0.2284 (0.2289) loss 3.2562 (2.9992) grad_norm 2.8956 (3.2640) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 
22:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][670/1251] eta 0:02:14 lr 0.000283 wd 0.0500 time 0.2299 (0.2311) data time 0.0009 (0.0021) model time 0.2290 (0.2289) loss 3.0592 (3.0002) grad_norm 3.0836 (3.2656) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][680/1251] eta 0:02:11 lr 0.000283 wd 0.0500 time 0.2230 (0.2311) data time 0.0010 (0.0021) model time 0.2220 (0.2289) loss 3.2555 (3.0005) grad_norm 2.7234 (3.2677) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][690/1251] eta 0:02:09 lr 0.000282 wd 0.0500 time 0.2227 (0.2310) data time 0.0009 (0.0021) model time 0.2219 (0.2289) loss 2.9051 (2.9994) grad_norm 2.7591 (3.2611) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][700/1251] eta 0:02:07 lr 0.000282 wd 0.0500 time 0.2242 (0.2310) data time 0.0009 (0.0021) model time 0.2233 (0.2289) loss 3.1855 (2.9989) grad_norm 2.7970 (3.2563) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][710/1251] eta 0:02:04 lr 0.000282 wd 0.0500 time 0.2230 (0.2310) data time 0.0007 (0.0021) model time 0.2222 (0.2289) loss 2.3924 (2.9992) grad_norm 2.7780 (3.2511) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][720/1251] eta 0:02:02 lr 0.000282 wd 0.0500 time 0.2278 (0.2310) data time 0.0011 (0.0021) model time 0.2266 (0.2289) loss 3.4703 (3.0023) grad_norm 2.9330 (3.2695) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][730/1251] eta 0:02:00 lr 0.000282 wd 0.0500 time 0.2259 (0.2310) data time 0.0014 (0.0021) model time 0.2245 (0.2289) loss 3.4133 
(3.0038) grad_norm 3.0428 (3.2615) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][740/1251] eta 0:01:58 lr 0.000282 wd 0.0500 time 0.2221 (0.2310) data time 0.0011 (0.0020) model time 0.2210 (0.2290) loss 3.0932 (3.0089) grad_norm 2.9179 (3.2586) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][750/1251] eta 0:01:55 lr 0.000282 wd 0.0500 time 0.2262 (0.2310) data time 0.0009 (0.0020) model time 0.2253 (0.2290) loss 2.9669 (3.0103) grad_norm 2.8178 (3.2504) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][760/1251] eta 0:01:53 lr 0.000282 wd 0.0500 time 0.2299 (0.2310) data time 0.0007 (0.0020) model time 0.2292 (0.2290) loss 3.6609 (3.0116) grad_norm 2.0194 (3.2442) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][770/1251] eta 0:01:51 lr 0.000282 wd 0.0500 time 0.2346 (0.2310) data time 0.0008 (0.0020) model time 0.2338 (0.2290) loss 3.8689 (3.0122) grad_norm 3.9763 (3.2379) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][780/1251] eta 0:01:48 lr 0.000282 wd 0.0500 time 0.2297 (0.2310) data time 0.0011 (0.0020) model time 0.2286 (0.2290) loss 2.9330 (3.0096) grad_norm 2.2890 (3.2308) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][790/1251] eta 0:01:46 lr 0.000282 wd 0.0500 time 0.2318 (0.2310) data time 0.0007 (0.0020) model time 0.2312 (0.2290) loss 2.4432 (3.0082) grad_norm 3.5045 (3.2257) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][800/1251] eta 0:01:44 lr 0.000282 wd 0.0500 
time 0.2309 (0.2310) data time 0.0009 (0.0020) model time 0.2300 (0.2290) loss 2.7767 (3.0107) grad_norm 3.7525 (3.2229) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][810/1251] eta 0:01:41 lr 0.000282 wd 0.0500 time 0.2215 (0.2311) data time 0.0012 (0.0020) model time 0.2203 (0.2291) loss 2.8089 (3.0114) grad_norm 2.2900 (3.2215) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][820/1251] eta 0:01:39 lr 0.000282 wd 0.0500 time 0.2322 (0.2310) data time 0.0012 (0.0020) model time 0.2310 (0.2290) loss 3.1862 (3.0131) grad_norm 2.3938 (3.2200) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][830/1251] eta 0:01:37 lr 0.000282 wd 0.0500 time 0.2301 (0.2310) data time 0.0009 (0.0020) model time 0.2292 (0.2290) loss 3.2002 (3.0096) grad_norm 2.9521 (3.2190) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][840/1251] eta 0:01:34 lr 0.000282 wd 0.0500 time 0.2312 (0.2310) data time 0.0009 (0.0020) model time 0.2303 (0.2290) loss 3.0997 (3.0075) grad_norm 2.9615 (3.2146) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][850/1251] eta 0:01:32 lr 0.000282 wd 0.0500 time 0.2275 (0.2310) data time 0.0006 (0.0020) model time 0.2269 (0.2290) loss 3.9193 (3.0071) grad_norm 4.5511 (3.2145) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][860/1251] eta 0:01:30 lr 0.000282 wd 0.0500 time 0.2302 (0.2311) data time 0.0009 (0.0020) model time 0.2293 (0.2291) loss 3.6578 (3.0085) grad_norm 4.5747 (3.2185) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:31 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [201/300][870/1251] eta 0:01:28 lr 0.000282 wd 0.0500 time 0.2400 (0.2313) data time 0.0010 (0.0020) model time 0.2391 (0.2294) loss 3.5529 (3.0109) grad_norm 5.7854 (3.2248) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][880/1251] eta 0:01:25 lr 0.000282 wd 0.0500 time 0.2318 (0.2313) data time 0.0008 (0.0020) model time 0.2310 (0.2294) loss 3.9265 (3.0127) grad_norm 17.9369 (3.2434) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][890/1251] eta 0:01:23 lr 0.000282 wd 0.0500 time 0.2288 (0.2313) data time 0.0007 (0.0020) model time 0.2281 (0.2294) loss 2.4796 (3.0119) grad_norm 2.2486 (3.2434) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][900/1251] eta 0:01:21 lr 0.000282 wd 0.0500 time 0.2316 (0.2313) data time 0.0011 (0.0020) model time 0.2305 (0.2294) loss 1.8085 (3.0114) grad_norm 3.4347 (3.2475) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][910/1251] eta 0:01:18 lr 0.000282 wd 0.0500 time 0.2289 (0.2313) data time 0.0011 (0.0020) model time 0.2278 (0.2294) loss 2.0709 (3.0123) grad_norm 2.5745 (3.2458) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][920/1251] eta 0:01:16 lr 0.000282 wd 0.0500 time 0.2299 (0.2313) data time 0.0009 (0.0020) model time 0.2290 (0.2294) loss 3.3866 (3.0135) grad_norm 2.2063 (3.2493) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][930/1251] eta 0:01:14 lr 0.000282 wd 0.0500 time 0.2318 (0.2313) data time 0.0010 (0.0019) model time 0.2308 (0.2294) loss 2.6950 (3.0131) grad_norm 2.8406 
(3.2470) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][940/1251] eta 0:01:11 lr 0.000282 wd 0.0500 time 0.2313 (0.2313) data time 0.0007 (0.0020) model time 0.2306 (0.2294) loss 3.5403 (3.0121) grad_norm 3.1127 (3.2450) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][950/1251] eta 0:01:09 lr 0.000281 wd 0.0500 time 0.2195 (0.2314) data time 0.0011 (0.0020) model time 0.2184 (0.2295) loss 2.4319 (3.0115) grad_norm 2.9458 (3.2397) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][960/1251] eta 0:01:07 lr 0.000281 wd 0.0500 time 0.2402 (0.2315) data time 0.0008 (0.0020) model time 0.2394 (0.2296) loss 1.7113 (3.0120) grad_norm 3.6316 (3.2422) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][970/1251] eta 0:01:05 lr 0.000281 wd 0.0500 time 0.2230 (0.2315) data time 0.0013 (0.0019) model time 0.2217 (0.2296) loss 2.9577 (3.0109) grad_norm 2.6491 (3.2488) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][980/1251] eta 0:01:02 lr 0.000281 wd 0.0500 time 0.2222 (0.2315) data time 0.0010 (0.0019) model time 0.2212 (0.2296) loss 3.2006 (3.0099) grad_norm 2.9843 (3.2483) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][990/1251] eta 0:01:00 lr 0.000281 wd 0.0500 time 0.2290 (0.2315) data time 0.0007 (0.0019) model time 0.2283 (0.2296) loss 2.8724 (3.0110) grad_norm 2.8395 (3.2450) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1000/1251] eta 0:00:58 lr 0.000281 wd 0.0500 time 0.2256 (0.2315) 
data time 0.0007 (0.0019) model time 0.2249 (0.2296) loss 2.2128 (3.0091) grad_norm 2.7136 (3.2402) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1010/1251] eta 0:00:55 lr 0.000281 wd 0.0500 time 0.2330 (0.2315) data time 0.0009 (0.0019) model time 0.2321 (0.2297) loss 3.2954 (3.0107) grad_norm 3.4865 (3.2362) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1020/1251] eta 0:00:53 lr 0.000281 wd 0.0500 time 0.2268 (0.2315) data time 0.0008 (0.0019) model time 0.2260 (0.2296) loss 2.8116 (3.0083) grad_norm 4.4624 (3.2356) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1030/1251] eta 0:00:51 lr 0.000281 wd 0.0500 time 0.2279 (0.2315) data time 0.0016 (0.0019) model time 0.2263 (0.2296) loss 3.0540 (3.0089) grad_norm 3.7120 (3.2361) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1040/1251] eta 0:00:48 lr 0.000281 wd 0.0500 time 0.2302 (0.2315) data time 0.0011 (0.0019) model time 0.2291 (0.2296) loss 2.9222 (3.0095) grad_norm 3.9804 (3.2350) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1050/1251] eta 0:00:46 lr 0.000281 wd 0.0500 time 0.2284 (0.2315) data time 0.0009 (0.0019) model time 0.2275 (0.2296) loss 2.9838 (3.0067) grad_norm 3.3501 (3.2334) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1060/1251] eta 0:00:44 lr 0.000281 wd 0.0500 time 0.2262 (0.2314) data time 0.0012 (0.0019) model time 0.2249 (0.2296) loss 3.8362 (3.0087) grad_norm 2.1926 (3.2336) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [201/300][1070/1251] eta 0:00:41 lr 0.000281 wd 0.0500 time 0.2338 (0.2315) data time 0.0007 (0.0019) model time 0.2332 (0.2296) loss 3.6301 (3.0105) grad_norm 3.1588 (3.2351) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1080/1251] eta 0:00:39 lr 0.000281 wd 0.0500 time 0.2280 (0.2314) data time 0.0009 (0.0019) model time 0.2271 (0.2296) loss 3.4424 (3.0112) grad_norm 2.5157 (3.2342) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1090/1251] eta 0:00:37 lr 0.000281 wd 0.0500 time 0.2219 (0.2314) data time 0.0011 (0.0019) model time 0.2208 (0.2296) loss 3.5688 (3.0140) grad_norm 2.1261 (3.2364) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1100/1251] eta 0:00:34 lr 0.000281 wd 0.0500 time 0.2296 (0.2314) data time 0.0007 (0.0019) model time 0.2289 (0.2296) loss 3.3885 (3.0133) grad_norm 2.7145 (3.2326) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1110/1251] eta 0:00:32 lr 0.000281 wd 0.0500 time 0.2285 (0.2314) data time 0.0008 (0.0019) model time 0.2277 (0.2295) loss 2.0398 (3.0124) grad_norm 2.5654 (3.2285) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1120/1251] eta 0:00:30 lr 0.000281 wd 0.0500 time 0.2177 (0.2314) data time 0.0011 (0.0019) model time 0.2167 (0.2295) loss 2.6711 (3.0136) grad_norm 2.3387 (3.2261) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1130/1251] eta 0:00:27 lr 0.000281 wd 0.0500 time 0.2342 (0.2313) data time 0.0007 (0.0019) model time 0.2335 (0.2295) loss 3.4104 (3.0152) grad_norm 2.3778 (3.2273) loss_scale 
512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1140/1251] eta 0:00:25 lr 0.000281 wd 0.0500 time 0.2262 (0.2313) data time 0.0008 (0.0019) model time 0.2255 (0.2295) loss 1.8107 (3.0159) grad_norm 2.4673 (3.2249) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1150/1251] eta 0:00:23 lr 0.000281 wd 0.0500 time 0.2347 (0.2313) data time 0.0010 (0.0018) model time 0.2337 (0.2294) loss 2.6282 (3.0175) grad_norm 2.9972 (3.2215) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1160/1251] eta 0:00:21 lr 0.000281 wd 0.0500 time 0.2268 (0.2313) data time 0.0013 (0.0018) model time 0.2255 (0.2295) loss 2.3266 (3.0169) grad_norm 2.1788 (3.2228) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1170/1251] eta 0:00:18 lr 0.000281 wd 0.0500 time 0.2336 (0.2313) data time 0.0007 (0.0018) model time 0.2329 (0.2294) loss 3.7907 (3.0181) grad_norm 3.2608 (3.2263) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1180/1251] eta 0:00:16 lr 0.000281 wd 0.0500 time 0.2282 (0.2312) data time 0.0010 (0.0018) model time 0.2271 (0.2294) loss 2.7970 (3.0174) grad_norm 3.9358 (3.2295) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1190/1251] eta 0:00:14 lr 0.000281 wd 0.0500 time 0.2200 (0.2313) data time 0.0011 (0.0018) model time 0.2189 (0.2295) loss 3.3526 (3.0180) grad_norm 3.7439 (3.2288) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1200/1251] eta 0:00:11 lr 0.000280 wd 0.0500 time 0.2286 (0.2313) data time 0.0015 
(0.0018) model time 0.2271 (0.2295) loss 2.9184 (3.0172) grad_norm 2.7591 (3.2256) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1210/1251] eta 0:00:09 lr 0.000280 wd 0.0500 time 0.2250 (0.2313) data time 0.0010 (0.0018) model time 0.2240 (0.2295) loss 3.4795 (3.0172) grad_norm 2.0659 (3.2228) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1220/1251] eta 0:00:07 lr 0.000280 wd 0.0500 time 0.2286 (0.2312) data time 0.0007 (0.0018) model time 0.2278 (0.2295) loss 3.4941 (3.0170) grad_norm 3.8851 (3.2273) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1230/1251] eta 0:00:04 lr 0.000280 wd 0.0500 time 0.2343 (0.2312) data time 0.0012 (0.0018) model time 0.2332 (0.2295) loss 3.5292 (3.0178) grad_norm 2.7887 (3.2313) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1240/1251] eta 0:00:02 lr 0.000280 wd 0.0500 time 0.2116 (0.2312) data time 0.0004 (0.0018) model time 0.2112 (0.2294) loss 3.4931 (3.0177) grad_norm 3.8514 (3.2301) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [201/300][1250/1251] eta 0:00:00 lr 0.000280 wd 0.0500 time 0.2114 (0.2310) data time 0.0006 (0.0018) model time 0.2107 (0.2292) loss 2.2440 (3.0180) grad_norm 3.3703 (3.2335) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 201 training takes 0:04:48 [2024-08-27 22:36:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 22:36:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.528 (0.528) Loss 0.4277 (0.4277) Acc@1 91.602 (91.602) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-27 22:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.127) Loss 0.6582 (0.6815) Acc@1 86.426 (85.307) Acc@5 97.363 (97.266) Mem 7379MB [2024-08-27 22:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.107) Loss 0.9678 (0.7016) Acc@1 74.609 (84.519) Acc@5 95.117 (97.219) Mem 7379MB [2024-08-27 22:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.098) Loss 1.1426 (0.7953) Acc@1 73.828 (82.447) Acc@5 92.871 (96.150) Mem 7379MB [2024-08-27 22:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.090) Loss 1.0986 (0.8472) Acc@1 73.828 (81.157) Acc@5 92.969 (95.570) Mem 7379MB [2024-08-27 22:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.820 Acc@5 95.564 [2024-08-27 22:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.8% [2024-08-27 22:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.044 (1.044) Loss 0.3882 (0.3882) Acc@1 92.871 (92.871) Acc@5 98.535 (98.535) Mem 7379MB [2024-08-27 22:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.168) Loss 0.5933 (0.6083) Acc@1 88.867 (87.260) Acc@5 97.559 (97.621) Mem 7379MB [2024-08-27 22:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.127) Loss 0.8672 (0.6349) Acc@1 78.418 (86.170) Acc@5 96.387 (97.656) Mem 7379MB [2024-08-27 22:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.111) Loss 1.0957 (0.7192) Acc@1 73.438 (84.114) Acc@5 
93.359 (96.771) Mem 7379MB [2024-08-27 22:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.100) Loss 0.9790 (0.7620) Acc@1 76.465 (82.922) Acc@5 94.531 (96.287) Mem 7379MB [2024-08-27 22:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.504 Acc@5 96.276 [2024-08-27 22:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.5% [2024-08-27 22:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][0/1251] eta 0:28:35 lr 0.000280 wd 0.0500 time 1.3713 (1.3713) data time 1.0036 (1.0036) model time 0.0000 (0.0000) loss 2.7088 (2.7088) grad_norm 2.3018 (2.3018) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][10/1251] eta 0:06:53 lr 0.000280 wd 0.0500 time 0.2376 (0.3335) data time 0.0009 (0.0926) model time 0.0000 (0.0000) loss 3.0400 (3.0326) grad_norm 3.5612 (3.0909) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][20/1251] eta 0:05:49 lr 0.000280 wd 0.0500 time 0.2319 (0.2843) data time 0.0011 (0.0491) model time 0.0000 (0.0000) loss 3.6898 (3.0936) grad_norm 6.4865 (3.3141) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][30/1251] eta 0:05:26 lr 0.000280 wd 0.0500 time 0.2369 (0.2670) data time 0.0008 (0.0339) model time 0.0000 (0.0000) loss 2.1059 (3.0366) grad_norm 3.0458 (3.2166) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][40/1251] eta 0:05:11 lr 0.000280 wd 0.0500 time 0.2443 (0.2575) data time 0.0009 (0.0260) model time 0.0000 (0.0000) loss 3.0394 (3.0823) grad_norm 4.2741 (3.1823) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [202/300][50/1251] eta 0:05:02 lr 0.000280 wd 0.0500 time 0.2336 (0.2518) data time 0.0008 (0.0214) model time 0.0000 (0.0000) loss 2.9551 (3.0874) grad_norm 2.7156 (3.1313) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][60/1251] eta 0:04:55 lr 0.000280 wd 0.0500 time 0.2584 (0.2483) data time 0.0010 (0.0181) model time 0.2574 (0.2290) loss 3.0531 (3.0774) grad_norm 2.4390 (3.1190) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][70/1251] eta 0:04:50 lr 0.000280 wd 0.0500 time 0.2431 (0.2458) data time 0.0012 (0.0158) model time 0.2419 (0.2292) loss 2.9322 (3.0437) grad_norm 2.9401 (3.1010) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][80/1251] eta 0:04:45 lr 0.000280 wd 0.0500 time 0.2417 (0.2439) data time 0.0007 (0.0140) model time 0.2410 (0.2291) loss 3.8199 (3.0815) grad_norm 2.6103 (3.0750) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][90/1251] eta 0:04:41 lr 0.000280 wd 0.0500 time 0.2422 (0.2423) data time 0.0009 (0.0126) model time 0.2412 (0.2287) loss 3.4656 (3.0445) grad_norm 4.1040 (3.1092) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][100/1251] eta 0:04:39 lr 0.000280 wd 0.0500 time 0.2218 (0.2429) data time 0.0007 (0.0116) model time 0.2211 (0.2322) loss 4.0721 (3.0512) grad_norm 2.9652 (3.1465) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][110/1251] eta 0:04:35 lr 0.000280 wd 0.0500 time 0.2248 (0.2417) data time 0.0009 (0.0107) model time 0.2239 (0.2316) loss 2.5002 (3.0402) grad_norm 2.8181 (3.1121) loss_scale 512.0000 (512.0000) mem 
7379MB [2024-08-27 22:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][120/1251] eta 0:04:32 lr 0.000280 wd 0.0500 time 0.2197 (0.2408) data time 0.0009 (0.0098) model time 0.2188 (0.2313) loss 3.1088 (3.0317) grad_norm 2.6117 (3.0808) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][130/1251] eta 0:04:29 lr 0.000280 wd 0.0500 time 0.2310 (0.2401) data time 0.0009 (0.0092) model time 0.2302 (0.2313) loss 2.7745 (3.0108) grad_norm 3.2740 (3.0682) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][140/1251] eta 0:04:25 lr 0.000280 wd 0.0500 time 0.2322 (0.2394) data time 0.0007 (0.0086) model time 0.2315 (0.2310) loss 2.8506 (3.0048) grad_norm 3.1648 (3.0878) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][150/1251] eta 0:04:22 lr 0.000280 wd 0.0500 time 0.2292 (0.2388) data time 0.0009 (0.0082) model time 0.2283 (0.2308) loss 1.9576 (2.9932) grad_norm 6.2401 (3.1139) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 22:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 22:37:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:37:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 22:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 22:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 22:39:40 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 22:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 22:39:52 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 22:39:53 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 22:39:54 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 22:39:54 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 202) [2024-08-27 22:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 22:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][160/1251] eta 0:33:58 lr 0.000280 wd 0.0500 time 0.2335 (1.8683) data time 0.0007 (0.1240) model time 0.2327 (1.7442) loss 3.2559 (3.2848) grad_norm 2.7760 (3.5378) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][170/1251] eta 0:18:08 lr 0.000280 wd 0.0500 time 0.2293 (1.0066) data time 0.0011 (0.0596) model time 0.2281 (0.9471) loss 3.0552 (3.1170) grad_norm 2.3627 (3.2902) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][180/1251] eta 0:13:10 lr 0.000280 wd 0.0500 time 0.2256 (0.7382) data time 0.0008 (0.0394) model time 0.2248 (0.6988) loss 3.2075 
(3.1614) grad_norm 6.8831 (3.3181) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][190/1251] eta 0:10:44 lr 0.000280 wd 0.0500 time 0.2286 (0.6072) data time 0.0009 (0.0295) model time 0.2277 (0.5776) loss 3.2948 (3.1496) grad_norm 3.0672 (3.4868) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][200/1251] eta 0:09:16 lr 0.000279 wd 0.0500 time 0.2216 (0.5295) data time 0.0010 (0.0237) model time 0.2206 (0.5058) loss 2.9131 (3.1355) grad_norm 3.3181 (3.4985) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][210/1251] eta 0:08:17 lr 0.000279 wd 0.0500 time 0.2240 (0.4780) data time 0.0007 (0.0199) model time 0.2234 (0.4582) loss 2.1728 (3.1261) grad_norm 2.2423 (3.5018) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][220/1251] eta 0:07:35 lr 0.000279 wd 0.0500 time 0.2288 (0.4422) data time 0.0010 (0.0175) model time 0.2278 (0.4248) loss 3.0606 (3.1157) grad_norm 3.9855 (3.4971) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][230/1251] eta 0:07:03 lr 0.000279 wd 0.0500 time 0.2214 (0.4148) data time 0.0011 (0.0155) model time 0.2203 (0.3994) loss 2.8212 (3.0796) grad_norm 2.3587 (3.4384) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][240/1251] eta 0:06:38 lr 0.000279 wd 0.0500 time 0.2300 (0.3941) data time 0.0009 (0.0138) model time 0.2291 (0.3802) loss 3.1888 (3.0565) grad_norm 2.5707 (3.3881) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][250/1251] eta 0:06:17 lr 0.000279 wd 0.0500 
time 0.2264 (0.3772) data time 0.0010 (0.0125) model time 0.2253 (0.3647) loss 3.4683 (3.0702) grad_norm 2.6461 (3.3697) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][260/1251] eta 0:06:00 lr 0.000279 wd 0.0500 time 0.2204 (0.3635) data time 0.0008 (0.0115) model time 0.2196 (0.3520) loss 3.6492 (3.0932) grad_norm 2.5839 (3.3535) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][270/1251] eta 0:05:45 lr 0.000279 wd 0.0500 time 0.2306 (0.3522) data time 0.0008 (0.0106) model time 0.2297 (0.3417) loss 3.5497 (3.0932) grad_norm 2.5634 (3.3351) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][280/1251] eta 0:05:32 lr 0.000279 wd 0.0500 time 0.2320 (0.3428) data time 0.0008 (0.0098) model time 0.2311 (0.3329) loss 2.7789 (3.0780) grad_norm 2.7288 (3.3358) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][290/1251] eta 0:05:21 lr 0.000279 wd 0.0500 time 0.2223 (0.3346) data time 0.0007 (0.0092) model time 0.2216 (0.3254) loss 3.5947 (3.0788) grad_norm 3.3197 (3.3347) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][300/1251] eta 0:05:11 lr 0.000279 wd 0.0500 time 0.2253 (0.3273) data time 0.0007 (0.0087) model time 0.2246 (0.3187) loss 2.9494 (3.0688) grad_norm 2.9381 (3.3573) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][310/1251] eta 0:05:02 lr 0.000279 wd 0.0500 time 0.2269 (0.3210) data time 0.0006 (0.0082) model time 0.2263 (0.3128) loss 3.8493 (3.0727) grad_norm 3.0668 (3.3391) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:52 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [202/300][320/1251] eta 0:04:53 lr 0.000279 wd 0.0500 time 0.2254 (0.3154) data time 0.0010 (0.0077) model time 0.2244 (0.3076) loss 3.0147 (3.0838) grad_norm 2.3867 (3.3100) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][330/1251] eta 0:04:46 lr 0.000279 wd 0.0500 time 0.2255 (0.3107) data time 0.0007 (0.0074) model time 0.2248 (0.3033) loss 3.0714 (3.0658) grad_norm 2.7451 (3.2761) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][340/1251] eta 0:04:39 lr 0.000279 wd 0.0500 time 0.2251 (0.3064) data time 0.0006 (0.0070) model time 0.2244 (0.2994) loss 3.6427 (3.0683) grad_norm 3.1052 (3.2665) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][350/1251] eta 0:04:32 lr 0.000279 wd 0.0500 time 0.2216 (0.3024) data time 0.0008 (0.0067) model time 0.2207 (0.2957) loss 2.3866 (3.0519) grad_norm 6.0820 (3.2908) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][360/1251] eta 0:04:26 lr 0.000279 wd 0.0500 time 0.2292 (0.2989) data time 0.0008 (0.0065) model time 0.2284 (0.2925) loss 3.1435 (3.0375) grad_norm 2.5950 (3.3064) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][370/1251] eta 0:04:20 lr 0.000279 wd 0.0500 time 0.2360 (0.2958) data time 0.0007 (0.0062) model time 0.2353 (0.2896) loss 3.3484 (3.0296) grad_norm 4.4452 (3.3479) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][380/1251] eta 0:04:14 lr 0.000279 wd 0.0500 time 0.2220 (0.2927) data time 0.0006 (0.0060) model time 0.2214 (0.2867) loss 2.6403 (3.0302) grad_norm 3.2083 
(3.3711) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][390/1251] eta 0:04:09 lr 0.000279 wd 0.0500 time 0.2281 (0.2900) data time 0.0007 (0.0058) model time 0.2274 (0.2842) loss 2.2438 (3.0291) grad_norm 3.0223 (3.3762) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][400/1251] eta 0:04:04 lr 0.000279 wd 0.0500 time 0.2391 (0.2877) data time 0.0007 (0.0056) model time 0.2385 (0.2821) loss 2.9286 (3.0257) grad_norm 2.3685 (3.3662) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][410/1251] eta 0:04:00 lr 0.000279 wd 0.0500 time 0.2351 (0.2855) data time 0.0012 (0.0054) model time 0.2340 (0.2801) loss 3.2230 (3.0173) grad_norm 3.7471 (3.3635) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][420/1251] eta 0:03:55 lr 0.000279 wd 0.0500 time 0.2239 (0.2834) data time 0.0007 (0.0052) model time 0.2232 (0.2781) loss 2.1600 (3.0085) grad_norm 3.1025 (3.3561) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][430/1251] eta 0:03:51 lr 0.000279 wd 0.0500 time 0.2247 (0.2814) data time 0.0009 (0.0051) model time 0.2238 (0.2763) loss 3.1893 (3.0160) grad_norm 2.6657 (3.3540) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][440/1251] eta 0:03:47 lr 0.000279 wd 0.0500 time 0.2298 (0.2803) data time 0.0009 (0.0049) model time 0.2290 (0.2753) loss 2.7035 (3.0111) grad_norm 2.2350 (3.3520) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][450/1251] eta 0:03:43 lr 0.000278 wd 0.0500 time 0.2269 (0.2784) data 
time 0.0011 (0.0048) model time 0.2258 (0.2736) loss 3.2360 (2.9986) grad_norm 3.0268 (3.3359) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][460/1251] eta 0:03:39 lr 0.000278 wd 0.0500 time 0.2270 (0.2776) data time 0.0008 (0.0047) model time 0.2262 (0.2729) loss 3.5612 (2.9977) grad_norm 3.6769 (3.3319) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][470/1251] eta 0:03:35 lr 0.000278 wd 0.0500 time 0.2227 (0.2760) data time 0.0008 (0.0046) model time 0.2218 (0.2714) loss 3.5797 (3.0047) grad_norm 4.6848 (3.3405) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 22:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][480/1251] eta 0:03:31 lr 0.000278 wd 0.0500 time 0.2254 (0.2745) data time 0.0008 (0.0045) model time 0.2245 (0.2701) loss 1.5429 (3.0060) grad_norm 3.0606 (3.3508) loss_scale 1024.0000 (519.7812) mem 7377MB [2024-08-27 22:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][490/1251] eta 0:03:27 lr 0.000278 wd 0.0500 time 0.2290 (0.2732) data time 0.0011 (0.0044) model time 0.2279 (0.2689) loss 2.8750 (3.0069) grad_norm 3.0603 (3.3527) loss_scale 1024.0000 (534.6549) mem 7377MB [2024-08-27 22:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][500/1251] eta 0:03:24 lr 0.000278 wd 0.0500 time 0.2282 (0.2719) data time 0.0009 (0.0043) model time 0.2272 (0.2677) loss 3.4281 (3.0091) grad_norm 4.2647 (3.3576) loss_scale 1024.0000 (548.6762) mem 7377MB [2024-08-27 22:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][510/1251] eta 0:03:20 lr 0.000278 wd 0.0500 time 0.2190 (0.2707) data time 0.0008 (0.0042) model time 0.2182 (0.2666) loss 2.7985 (3.0096) grad_norm 2.3982 (3.3423) loss_scale 1024.0000 (561.9164) mem 7377MB [2024-08-27 22:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [202/300][520/1251] eta 0:03:17 lr 0.000278 wd 0.0500 time 0.2340 (0.2697) data time 0.0009 (0.0041) model time 0.2332 (0.2656) loss 3.1921 (3.0060) grad_norm 2.8145 (3.3420) loss_scale 1024.0000 (574.4390) mem 7377MB [2024-08-27 22:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][530/1251] eta 0:03:13 lr 0.000278 wd 0.0500 time 0.2299 (0.2687) data time 0.0007 (0.0040) model time 0.2292 (0.2647) loss 3.6009 (3.0039) grad_norm 2.5668 (3.3241) loss_scale 1024.0000 (586.3008) mem 7377MB [2024-08-27 22:41:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][540/1251] eta 0:03:10 lr 0.000278 wd 0.0500 time 0.2235 (0.2676) data time 0.0009 (0.0039) model time 0.2226 (0.2637) loss 3.2077 (2.9971) grad_norm 4.1861 (3.3246) loss_scale 1024.0000 (597.5527) mem 7377MB [2024-08-27 22:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][550/1251] eta 0:03:06 lr 0.000278 wd 0.0500 time 0.2195 (0.2667) data time 0.0009 (0.0039) model time 0.2186 (0.2629) loss 3.2444 (3.0003) grad_norm 4.3879 (3.3355) loss_scale 1024.0000 (608.2406) mem 7377MB [2024-08-27 22:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][560/1251] eta 0:03:03 lr 0.000278 wd 0.0500 time 0.2294 (0.2657) data time 0.0008 (0.0038) model time 0.2286 (0.2620) loss 2.9891 (3.0046) grad_norm 3.6688 (3.3264) loss_scale 1024.0000 (618.4059) mem 7377MB [2024-08-27 22:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][570/1251] eta 0:03:00 lr 0.000278 wd 0.0500 time 0.2240 (0.2649) data time 0.0008 (0.0037) model time 0.2231 (0.2611) loss 1.9145 (3.0043) grad_norm 3.3239 (3.3298) loss_scale 1024.0000 (628.0859) mem 7377MB [2024-08-27 22:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 22:41:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth 
saving...... [2024-08-27 22:41:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 22:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 22:50:05 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 22:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 22:50:17 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 22:50:18 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 22:50:19 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 22:50:20 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 202) [2024-08-27 22:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 22:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][580/1251] eta 0:54:30 lr 0.000278 wd 0.0500 time 0.2406 (4.8741) data time 0.0007 (0.4233) model time 0.2398 (4.4508) loss 2.7972 (3.2363) grad_norm 1.7837 (2.5225) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][590/1251] eta 0:14:22 lr 0.000278 wd 0.0500 time 0.2543 (1.3042) data time 0.0013 (0.0985) model time 0.2530 (1.2057) loss 3.4789 (3.2393) grad_norm 2.6200 (3.1086) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [202/300][600/1251] eta 0:09:05 lr 0.000278 wd 0.0500 time 0.2279 (0.8385) data time 0.0007 (0.0562) model time 0.2272 (0.7824) loss 3.4329 (3.2318) grad_norm 4.3665 (3.3292) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][610/1251] eta 0:06:59 lr 0.000278 wd 0.0500 time 0.2289 (0.6537) data time 0.0011 (0.0395) model time 0.2278 (0.6142) loss 3.4702 (3.2664) grad_norm 2.4826 (3.1976) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][620/1251] eta 0:05:49 lr 0.000278 wd 0.0500 time 0.2208 (0.5547) data time 0.0010 (0.0305) model time 0.2198 (0.5241) loss 3.0139 (3.1798) grad_norm 2.2526 (3.0955) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][630/1251] eta 0:05:06 lr 0.000278 wd 0.0500 time 0.2177 (0.4939) data time 0.0010 (0.0250) model time 0.2167 (0.4689) loss 3.3516 (3.1806) grad_norm 2.3482 (3.0284) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][640/1251] eta 0:04:36 lr 0.000278 wd 0.0500 time 0.2230 (0.4517) data time 0.0010 (0.0212) model time 0.2220 (0.4305) loss 2.9585 (3.1639) grad_norm 2.8168 (2.9755) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][650/1251] eta 0:04:13 lr 0.000278 wd 0.0500 time 0.2256 (0.4215) data time 0.0011 (0.0185) model time 0.2245 (0.4030) loss 3.3177 (3.1341) grad_norm 7.2381 (3.0494) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][660/1251] eta 0:03:55 lr 0.000278 wd 0.0500 time 0.2253 (0.3982) data time 0.0007 (0.0165) model time 0.2246 (0.3817) loss 2.1080 (3.1104) grad_norm 3.0492 (3.1055) 
loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][670/1251] eta 0:03:40 lr 0.000278 wd 0.0500 time 0.2302 (0.3802) data time 0.0009 (0.0148) model time 0.2293 (0.3654) loss 3.6195 (3.1035) grad_norm 3.0921 (3.1925) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][680/1251] eta 0:03:28 lr 0.000278 wd 0.0500 time 0.2273 (0.3657) data time 0.0009 (0.0135) model time 0.2264 (0.3522) loss 3.2245 (3.1164) grad_norm 3.0438 (3.1924) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][690/1251] eta 0:03:18 lr 0.000278 wd 0.0500 time 0.2234 (0.3537) data time 0.0007 (0.0124) model time 0.2227 (0.3412) loss 3.0567 (3.1022) grad_norm 2.4599 (3.1573) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][700/1251] eta 0:03:09 lr 0.000278 wd 0.0500 time 0.2232 (0.3437) data time 0.0010 (0.0118) model time 0.2222 (0.3319) loss 3.1649 (3.1049) grad_norm 4.0896 (3.2246) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][710/1251] eta 0:03:01 lr 0.000277 wd 0.0500 time 0.2613 (0.3355) data time 0.0009 (0.0111) model time 0.2603 (0.3244) loss 3.1416 (3.0842) grad_norm 3.6450 (3.2529) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][720/1251] eta 0:02:54 lr 0.000277 wd 0.0500 time 0.2222 (0.3281) data time 0.0013 (0.0104) model time 0.2210 (0.3177) loss 3.3369 (3.0724) grad_norm 2.5426 (3.2444) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][730/1251] eta 0:02:47 lr 0.000277 wd 0.0500 time 0.2246 (0.3219) 
data time 0.0007 (0.0098) model time 0.2238 (0.3121) loss 3.1209 (3.0677) grad_norm 2.7211 (3.2293) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][740/1251] eta 0:02:41 lr 0.000277 wd 0.0500 time 0.2310 (0.3161) data time 0.0007 (0.0093) model time 0.2304 (0.3068) loss 2.4249 (3.0689) grad_norm 4.2663 (3.2402) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][750/1251] eta 0:02:35 lr 0.000277 wd 0.0500 time 0.2286 (0.3110) data time 0.0010 (0.0088) model time 0.2276 (0.3022) loss 2.9356 (3.0686) grad_norm 2.4152 (3.2308) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][760/1251] eta 0:02:30 lr 0.000277 wd 0.0500 time 0.2290 (0.3069) data time 0.0014 (0.0084) model time 0.2276 (0.2985) loss 3.5243 (3.0555) grad_norm 2.6336 (3.2660) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][770/1251] eta 0:02:25 lr 0.000277 wd 0.0500 time 0.2215 (0.3029) data time 0.0009 (0.0081) model time 0.2206 (0.2948) loss 3.1326 (3.0500) grad_norm 2.7663 (3.2474) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][780/1251] eta 0:02:21 lr 0.000277 wd 0.0500 time 0.2268 (0.2994) data time 0.0008 (0.0077) model time 0.2260 (0.2917) loss 2.3526 (3.0306) grad_norm 2.2186 (3.2335) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][790/1251] eta 0:02:16 lr 0.000277 wd 0.0500 time 0.2304 (0.2964) data time 0.0011 (0.0074) model time 0.2293 (0.2889) loss 2.0220 (3.0302) grad_norm 2.6823 (3.2308) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [202/300][800/1251] eta 0:02:12 lr 0.000277 wd 0.0500 time 0.2300 (0.2934) data time 0.0007 (0.0071) model time 0.2293 (0.2863) loss 2.6036 (3.0272) grad_norm 2.6522 (3.2509) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][810/1251] eta 0:02:08 lr 0.000277 wd 0.0500 time 0.2218 (0.2907) data time 0.0013 (0.0069) model time 0.2205 (0.2837) loss 2.4127 (3.0243) grad_norm 2.1539 (3.2346) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][820/1251] eta 0:02:04 lr 0.000277 wd 0.0500 time 0.2220 (0.2882) data time 0.0009 (0.0067) model time 0.2211 (0.2815) loss 3.3073 (3.0249) grad_norm 2.9071 (3.2466) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][830/1251] eta 0:02:00 lr 0.000277 wd 0.0500 time 0.2438 (0.2860) data time 0.0010 (0.0065) model time 0.2429 (0.2795) loss 3.3670 (3.0181) grad_norm 2.7496 (3.2351) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][840/1251] eta 0:01:56 lr 0.000277 wd 0.0500 time 0.2385 (0.2840) data time 0.0008 (0.0064) model time 0.2377 (0.2776) loss 3.0305 (3.0103) grad_norm 2.5390 (3.2360) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][850/1251] eta 0:01:53 lr 0.000277 wd 0.0500 time 0.2295 (0.2821) data time 0.0008 (0.0062) model time 0.2287 (0.2759) loss 3.8447 (3.0091) grad_norm 2.8396 (3.2308) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][860/1251] eta 0:01:49 lr 0.000277 wd 0.0500 time 0.2209 (0.2802) data time 0.0009 (0.0060) model time 0.2200 (0.2742) loss 3.3285 (3.0117) grad_norm 
3.0593 (3.2317) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][870/1251] eta 0:01:46 lr 0.000277 wd 0.0500 time 0.2282 (0.2793) data time 0.0011 (0.0059) model time 0.2270 (0.2735) loss 2.5506 (3.0022) grad_norm 2.2803 (3.2113) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][880/1251] eta 0:01:42 lr 0.000277 wd 0.0500 time 0.2239 (0.2776) data time 0.0010 (0.0057) model time 0.2229 (0.2719) loss 3.2799 (2.9966) grad_norm 3.6742 (3.2090) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][890/1251] eta 0:01:40 lr 0.000277 wd 0.0500 time 0.2296 (0.2771) data time 0.0007 (0.0055) model time 0.2289 (0.2715) loss 3.5472 (2.9955) grad_norm 2.7388 (3.2125) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][900/1251] eta 0:01:36 lr 0.000277 wd 0.0500 time 0.2393 (0.2757) data time 0.0009 (0.0055) model time 0.2384 (0.2702) loss 3.1494 (3.0048) grad_norm 3.6766 (3.2150) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][910/1251] eta 0:01:33 lr 0.000277 wd 0.0500 time 0.2255 (0.2742) data time 0.0011 (0.0053) model time 0.2244 (0.2689) loss 3.2568 (3.0039) grad_norm 5.6392 (3.2443) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:51:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][920/1251] eta 0:01:30 lr 0.000277 wd 0.0500 time 0.2300 (0.2730) data time 0.0009 (0.0052) model time 0.2291 (0.2677) loss 3.0664 (3.0055) grad_norm 3.4374 (3.2477) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][930/1251] eta 0:01:27 lr 0.000277 wd 0.0500 time 
0.2306 (0.2718) data time 0.0008 (0.0051) model time 0.2299 (0.2667) loss 3.4244 (3.0087) grad_norm 2.9018 (3.2417) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][940/1251] eta 0:01:24 lr 0.000277 wd 0.0500 time 0.2273 (0.2708) data time 0.0009 (0.0050) model time 0.2265 (0.2657) loss 3.4782 (3.0083) grad_norm 2.5136 (3.2297) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][950/1251] eta 0:01:21 lr 0.000277 wd 0.0500 time 0.2236 (0.2697) data time 0.0009 (0.0049) model time 0.2226 (0.2647) loss 3.3522 (3.0061) grad_norm 3.0997 (3.2297) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][960/1251] eta 0:01:18 lr 0.000276 wd 0.0500 time 0.2299 (0.2686) data time 0.0010 (0.0048) model time 0.2290 (0.2637) loss 2.2977 (3.0010) grad_norm 2.7828 (3.2362) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][970/1251] eta 0:01:15 lr 0.000276 wd 0.0500 time 0.2212 (0.2677) data time 0.0011 (0.0048) model time 0.2201 (0.2630) loss 3.1658 (3.0008) grad_norm 2.8793 (3.2320) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][980/1251] eta 0:01:12 lr 0.000276 wd 0.0500 time 0.2263 (0.2668) data time 0.0011 (0.0047) model time 0.2252 (0.2621) loss 2.5951 (3.0055) grad_norm 2.3911 (3.2447) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][990/1251] eta 0:01:09 lr 0.000276 wd 0.0500 time 0.2364 (0.2660) data time 0.0009 (0.0046) model time 0.2355 (0.2614) loss 3.5577 (3.0118) grad_norm 2.2582 (3.2372) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1000/1251] eta 0:01:06 lr 0.000276 wd 0.0500 time 0.2307 (0.2651) data time 0.0008 (0.0045) model time 0.2300 (0.2606) loss 2.4990 (3.0075) grad_norm 2.7101 (3.2384) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1010/1251] eta 0:01:03 lr 0.000276 wd 0.0500 time 0.2326 (0.2643) data time 0.0008 (0.0044) model time 0.2318 (0.2598) loss 3.9880 (3.0152) grad_norm 4.0942 (3.2429) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1020/1251] eta 0:01:00 lr 0.000276 wd 0.0500 time 0.2223 (0.2635) data time 0.0009 (0.0044) model time 0.2214 (0.2591) loss 3.0904 (3.0191) grad_norm 20.1338 (3.2849) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1030/1251] eta 0:00:58 lr 0.000276 wd 0.0500 time 0.2421 (0.2627) data time 0.0010 (0.0043) model time 0.2411 (0.2584) loss 2.1573 (3.0159) grad_norm 3.9205 (3.2898) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1040/1251] eta 0:00:55 lr 0.000276 wd 0.0500 time 0.2387 (0.2620) data time 0.0007 (0.0043) model time 0.2380 (0.2578) loss 2.9315 (3.0129) grad_norm 2.2399 (3.2852) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1050/1251] eta 0:00:52 lr 0.000276 wd 0.0500 time 0.2252 (0.2614) data time 0.0009 (0.0042) model time 0.2243 (0.2572) loss 3.0061 (3.0074) grad_norm 2.6172 (3.2831) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1060/1251] eta 0:00:49 lr 0.000276 wd 0.0500 time 0.2250 (0.2607) data time 0.0010 (0.0041) model time 0.2240 (0.2566) 
loss 3.8243 (3.0096) grad_norm 4.3513 (3.2761) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1070/1251] eta 0:00:47 lr 0.000276 wd 0.0500 time 0.2337 (0.2602) data time 0.0010 (0.0041) model time 0.2327 (0.2561) loss 2.5727 (3.0109) grad_norm 3.1122 (3.2780) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1080/1251] eta 0:00:44 lr 0.000276 wd 0.0500 time 0.2286 (0.2596) data time 0.0008 (0.0040) model time 0.2279 (0.2556) loss 2.3752 (3.0111) grad_norm 2.7350 (3.3300) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1090/1251] eta 0:00:41 lr 0.000276 wd 0.0500 time 0.2248 (0.2590) data time 0.0009 (0.0040) model time 0.2239 (0.2551) loss 3.9135 (3.0207) grad_norm 4.5472 (3.3431) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1100/1251] eta 0:00:39 lr 0.000276 wd 0.0500 time 0.2239 (0.2585) data time 0.0009 (0.0039) model time 0.2230 (0.2546) loss 3.0663 (3.0170) grad_norm 3.2048 (3.3542) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1110/1251] eta 0:00:36 lr 0.000276 wd 0.0500 time 0.2425 (0.2580) data time 0.0008 (0.0039) model time 0.2418 (0.2541) loss 3.3936 (3.0136) grad_norm 3.4221 (3.3520) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1120/1251] eta 0:00:33 lr 0.000276 wd 0.0500 time 0.2273 (0.2575) data time 0.0012 (0.0038) model time 0.2261 (0.2537) loss 2.8299 (3.0103) grad_norm 3.2850 (3.3457) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1130/1251] 
eta 0:00:31 lr 0.000276 wd 0.0500 time 0.2306 (0.2570) data time 0.0007 (0.0038) model time 0.2299 (0.2533) loss 3.6841 (3.0141) grad_norm 2.7045 (3.3473) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1140/1251] eta 0:00:28 lr 0.000276 wd 0.0500 time 0.2234 (0.2566) data time 0.0012 (0.0037) model time 0.2222 (0.2528) loss 3.2784 (3.0178) grad_norm 3.2387 (3.3489) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1150/1251] eta 0:00:25 lr 0.000276 wd 0.0500 time 0.2323 (0.2561) data time 0.0007 (0.0037) model time 0.2316 (0.2524) loss 2.8736 (3.0189) grad_norm 2.8305 (3.3443) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1160/1251] eta 0:00:23 lr 0.000276 wd 0.0500 time 0.2369 (0.2557) data time 0.0013 (0.0036) model time 0.2357 (0.2520) loss 3.3014 (3.0211) grad_norm 3.7511 (3.3575) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1170/1251] eta 0:00:20 lr 0.000276 wd 0.0500 time 0.2358 (0.2553) data time 0.0012 (0.0036) model time 0.2346 (0.2517) loss 2.3383 (3.0212) grad_norm 2.4998 (3.3465) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1180/1251] eta 0:00:18 lr 0.000276 wd 0.0500 time 0.2336 (0.2549) data time 0.0011 (0.0036) model time 0.2325 (0.2513) loss 2.1021 (3.0174) grad_norm 2.5629 (3.3352) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1190/1251] eta 0:00:15 lr 0.000276 wd 0.0500 time 0.2277 (0.2545) data time 0.0016 (0.0035) model time 0.2261 (0.2510) loss 3.4380 (3.0175) grad_norm 3.3569 (3.3324) loss_scale 1024.0000 (1024.0000) mem 
7374MB [2024-08-27 22:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1200/1251] eta 0:00:12 lr 0.000276 wd 0.0500 time 0.2289 (0.2542) data time 0.0007 (0.0035) model time 0.2283 (0.2507) loss 3.5225 (3.0203) grad_norm 2.8609 (3.3290) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1210/1251] eta 0:00:10 lr 0.000276 wd 0.0500 time 0.2293 (0.2538) data time 0.0007 (0.0035) model time 0.2285 (0.2503) loss 3.5816 (3.0208) grad_norm 3.3520 (3.3235) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1220/1251] eta 0:00:07 lr 0.000275 wd 0.0500 time 0.2283 (0.2534) data time 0.0007 (0.0035) model time 0.2276 (0.2500) loss 3.4821 (3.0179) grad_norm 4.7379 (3.3216) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1230/1251] eta 0:00:05 lr 0.000275 wd 0.0500 time 0.2340 (0.2531) data time 0.0007 (0.0034) model time 0.2333 (0.2497) loss 3.9134 (3.0180) grad_norm 2.3755 (3.3193) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1240/1251] eta 0:00:02 lr 0.000275 wd 0.0500 time 0.2127 (0.2527) data time 0.0007 (0.0034) model time 0.2120 (0.2492) loss 2.6199 (3.0140) grad_norm 3.9492 (3.3241) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [202/300][1250/1251] eta 0:00:00 lr 0.000275 wd 0.0500 time 0.2119 (0.2521) data time 0.0005 (0.0034) model time 0.2115 (0.2487) loss 3.6447 (3.0191) grad_norm 3.9316 (3.3269) loss_scale 1024.0000 (1024.0000) mem 7374MB [2024-08-27 22:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 202 training takes 0:02:49 [2024-08-27 22:53:14 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 22:53:22 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.418 (0.418) Loss 0.4067 (0.4067) Acc@1 92.773 (92.773) Acc@5 98.438 (98.438) Mem 7374MB [2024-08-27 22:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.114) Loss 0.6416 (0.6668) Acc@1 87.500 (85.653) Acc@5 97.363 (97.212) Mem 7374MB [2024-08-27 22:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.092 (0.099) Loss 0.9707 (0.6930) Acc@1 76.270 (84.868) Acc@5 94.727 (97.219) Mem 7374MB [2024-08-27 22:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.094) Loss 1.1543 (0.7882) Acc@1 73.340 (82.690) Acc@5 92.285 (96.147) Mem 7374MB [2024-08-27 22:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.0898 (0.8393) Acc@1 75.391 (81.333) Acc@5 92.383 (95.591) Mem 7374MB [2024-08-27 22:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.906 Acc@5 95.532 [2024-08-27 22:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.9% [2024-08-27 22:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.940 (0.940) Loss 0.3877 (0.3877) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7374MB [2024-08-27 22:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.164) Loss 0.5918 (0.6077) Acc@1 88.965 (87.331) Acc@5 97.656 (97.665) Mem 7374MB [2024-08-27 22:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.125) Loss 0.8667 (0.6344) Acc@1 77.930 (86.207) Acc@5 96.191 (97.652) Mem 7374MB [2024-08-27 22:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.076 (0.110) Loss 1.0947 (0.7187) Acc@1 73.242 (84.132) Acc@5 93.164 (96.790) Mem 7374MB [2024-08-27 22:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.099) Loss 0.9805 (0.7617) Acc@1 76.074 (82.915) Acc@5 94.434 (96.310) Mem 7374MB [2024-08-27 22:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.490 Acc@5 96.298 [2024-08-27 22:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.5% [2024-08-27 22:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.49% [2024-08-27 22:53:33 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-27 22:53:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-27 22:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][0/1251] eta 0:16:09 lr 0.000275 wd 0.0500 time 0.7749 (0.7749) data time 0.4316 (0.4316) model time 0.0000 (0.0000) loss 3.4633 (3.4633) grad_norm 2.3555 (2.3555) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-27 22:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][10/1251] eta 0:05:43 lr 0.000275 wd 0.0500 time 0.2386 (0.2770) data time 0.0009 (0.0402) model time 0.0000 (0.0000) loss 3.2487 (3.0217) grad_norm 2.3856 (2.9868) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][20/1251] eta 0:05:11 lr 0.000275 wd 0.0500 time 0.2244 (0.2534) data time 0.0009 (0.0219) model time 0.0000 (0.0000) loss 3.1195 (2.9655) grad_norm 3.5134 (2.8859) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][30/1251] eta 0:05:00 lr 0.000275 wd 0.0500 time 0.2255 (0.2461) data time 0.0009 
(0.0154) model time 0.0000 (0.0000) loss 3.2227 (2.9155) grad_norm 3.5293 (2.9942) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][40/1251] eta 0:04:53 lr 0.000275 wd 0.0500 time 0.2219 (0.2426) data time 0.0007 (0.0121) model time 0.0000 (0.0000) loss 2.3682 (2.8730) grad_norm 3.4771 (3.0403) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][50/1251] eta 0:04:48 lr 0.000275 wd 0.0500 time 0.2257 (0.2403) data time 0.0007 (0.0099) model time 0.0000 (0.0000) loss 3.2139 (2.8885) grad_norm 4.4818 (3.1266) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][60/1251] eta 0:04:43 lr 0.000275 wd 0.0500 time 0.2285 (0.2383) data time 0.0009 (0.0085) model time 0.2276 (0.2268) loss 3.4030 (2.9237) grad_norm 3.3540 (3.3100) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][70/1251] eta 0:04:40 lr 0.000275 wd 0.0500 time 0.2216 (0.2373) data time 0.0010 (0.0075) model time 0.2206 (0.2284) loss 2.3452 (2.9425) grad_norm 3.6503 (3.3699) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][80/1251] eta 0:04:40 lr 0.000275 wd 0.0500 time 0.2266 (0.2393) data time 0.0011 (0.0072) model time 0.2255 (0.2349) loss 1.8801 (2.9221) grad_norm 3.7834 (3.4042) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][90/1251] eta 0:04:36 lr 0.000275 wd 0.0500 time 0.2306 (0.2383) data time 0.0010 (0.0066) model time 0.2296 (0.2335) loss 2.6656 (2.9347) grad_norm 2.4566 (3.3650) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [203/300][100/1251] eta 0:04:33 lr 0.000275 wd 0.0500 time 0.2470 (0.2378) data time 0.0010 (0.0060) model time 0.2460 (0.2332) loss 3.5238 (2.9533) grad_norm 2.4255 (3.3929) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][110/1251] eta 0:04:30 lr 0.000275 wd 0.0500 time 0.2455 (0.2371) data time 0.0008 (0.0056) model time 0.2447 (0.2325) loss 2.3861 (2.9623) grad_norm 2.9759 (3.3336) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][120/1251] eta 0:04:27 lr 0.000275 wd 0.0500 time 0.2318 (0.2366) data time 0.0010 (0.0052) model time 0.2308 (0.2321) loss 3.7241 (2.9742) grad_norm 3.7228 (3.3634) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][130/1251] eta 0:04:24 lr 0.000275 wd 0.0500 time 0.2248 (0.2360) data time 0.0012 (0.0049) model time 0.2236 (0.2315) loss 2.8468 (2.9557) grad_norm 2.8664 (3.3414) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][140/1251] eta 0:04:23 lr 0.000275 wd 0.0500 time 0.2405 (0.2369) data time 0.0013 (0.0047) model time 0.2391 (0.2331) loss 3.3779 (2.9392) grad_norm 2.8251 (3.3430) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][150/1251] eta 0:04:20 lr 0.000275 wd 0.0500 time 0.2253 (0.2364) data time 0.0009 (0.0045) model time 0.2244 (0.2327) loss 3.2453 (2.9467) grad_norm 3.0739 (3.3236) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][160/1251] eta 0:04:17 lr 0.000275 wd 0.0500 time 0.2325 (0.2361) data time 0.0010 (0.0043) model time 0.2315 (0.2325) loss 2.8459 (2.9373) grad_norm 2.6878 (3.3067) loss_scale 
1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][170/1251] eta 0:04:14 lr 0.000275 wd 0.0500 time 0.2351 (0.2357) data time 0.0007 (0.0042) model time 0.2344 (0.2321) loss 2.0377 (2.9371) grad_norm 2.9290 (3.3225) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][180/1251] eta 0:04:12 lr 0.000275 wd 0.0500 time 0.2406 (0.2353) data time 0.0009 (0.0040) model time 0.2397 (0.2317) loss 3.5401 (2.9421) grad_norm 5.7466 (3.3364) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][190/1251] eta 0:04:09 lr 0.000275 wd 0.0500 time 0.2422 (0.2351) data time 0.0009 (0.0038) model time 0.2413 (0.2315) loss 3.4751 (2.9450) grad_norm 3.0811 (3.3208) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][200/1251] eta 0:04:06 lr 0.000275 wd 0.0500 time 0.2369 (0.2347) data time 0.0011 (0.0037) model time 0.2358 (0.2312) loss 3.5930 (2.9534) grad_norm 3.5829 (3.2992) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][210/1251] eta 0:04:04 lr 0.000275 wd 0.0500 time 0.2555 (0.2345) data time 0.0008 (0.0036) model time 0.2548 (0.2311) loss 3.9606 (2.9517) grad_norm 2.5709 (3.3080) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][220/1251] eta 0:04:01 lr 0.000274 wd 0.0500 time 0.2443 (0.2343) data time 0.0010 (0.0035) model time 0.2433 (0.2310) loss 2.9224 (2.9476) grad_norm 3.5951 (3.3321) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][230/1251] eta 0:03:58 lr 0.000274 wd 0.0500 time 0.2363 (0.2340) data time 
0.0009 (0.0034) model time 0.2353 (0.2308) loss 2.6299 (2.9486) grad_norm 2.2991 (3.3087) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][240/1251] eta 0:03:56 lr 0.000274 wd 0.0500 time 0.2306 (0.2340) data time 0.0007 (0.0033) model time 0.2300 (0.2307) loss 2.3524 (2.9443) grad_norm 3.2325 (3.2958) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][250/1251] eta 0:03:54 lr 0.000274 wd 0.0500 time 0.2332 (0.2339) data time 0.0009 (0.0032) model time 0.2323 (0.2308) loss 3.2662 (2.9536) grad_norm 3.0725 (3.2789) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][260/1251] eta 0:03:51 lr 0.000274 wd 0.0500 time 0.2313 (0.2338) data time 0.0010 (0.0032) model time 0.2303 (0.2307) loss 2.9162 (2.9637) grad_norm 4.9869 (3.2817) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][270/1251] eta 0:03:49 lr 0.000274 wd 0.0500 time 0.2259 (0.2337) data time 0.0012 (0.0031) model time 0.2247 (0.2306) loss 2.5022 (2.9538) grad_norm 3.7023 (3.2863) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][280/1251] eta 0:03:46 lr 0.000274 wd 0.0500 time 0.2252 (0.2335) data time 0.0009 (0.0031) model time 0.2242 (0.2304) loss 2.3396 (2.9459) grad_norm 3.7673 (3.2937) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][290/1251] eta 0:03:44 lr 0.000274 wd 0.0500 time 0.2360 (0.2333) data time 0.0007 (0.0030) model time 0.2353 (0.2303) loss 2.6325 (2.9473) grad_norm 2.6739 (3.3024) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-27 22:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [203/300][300/1251] eta 0:03:41 lr 0.000274 wd 0.0500 time 0.2309 (0.2332) data time 0.0012 (0.0029) model time 0.2297 (0.2302) loss 3.3364 (2.9569) grad_norm 2.4114 (inf) loss_scale 512.0000 (1017.1960) mem 7379MB [2024-08-27 22:54:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][310/1251] eta 0:03:39 lr 0.000274 wd 0.0500 time 0.2538 (0.2331) data time 0.0010 (0.0029) model time 0.2528 (0.2302) loss 3.1632 (2.9543) grad_norm 2.5166 (inf) loss_scale 512.0000 (1000.9518) mem 7379MB [2024-08-27 22:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][320/1251] eta 0:03:36 lr 0.000274 wd 0.0500 time 0.2428 (0.2330) data time 0.0012 (0.0029) model time 0.2416 (0.2301) loss 2.6408 (2.9557) grad_norm 3.5870 (inf) loss_scale 512.0000 (985.7196) mem 7379MB [2024-08-27 22:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][330/1251] eta 0:03:34 lr 0.000274 wd 0.0500 time 0.2418 (0.2329) data time 0.0008 (0.0028) model time 0.2410 (0.2301) loss 2.8161 (2.9555) grad_norm 4.2717 (inf) loss_scale 512.0000 (971.4079) mem 7379MB [2024-08-27 22:54:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][340/1251] eta 0:03:32 lr 0.000274 wd 0.0500 time 0.2830 (0.2330) data time 0.0009 (0.0028) model time 0.2821 (0.2302) loss 3.1048 (2.9618) grad_norm 2.9443 (inf) loss_scale 512.0000 (957.9355) mem 7379MB [2024-08-27 22:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][350/1251] eta 0:03:29 lr 0.000274 wd 0.0500 time 0.2300 (0.2329) data time 0.0010 (0.0028) model time 0.2290 (0.2301) loss 2.3648 (2.9557) grad_norm 2.8559 (inf) loss_scale 512.0000 (945.2308) mem 7379MB [2024-08-27 22:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 22:55:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-27 22:55:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 22:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 22:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 22:58:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 22:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 22:58:47 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 22:58:49 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 22:58:50 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 22:58:50 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 203) [2024-08-27 22:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 22:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][360/1251] eta 1:14:58 lr 0.000274 wd 0.0500 time 0.2174 (5.0491) data time 0.0008 (0.3148) model time 0.2166 (4.7343) loss 2.5225 (3.2944) grad_norm 2.8608 (3.3715) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][370/1251] eta 0:19:42 lr 0.000274 wd 0.0500 time 0.2248 (1.3422) data time 0.0010 (0.0737) model time 0.2238 (1.2685) loss 3.0728 (3.2679) grad_norm 2.9057 (3.3799) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[203/300][380/1251] eta 0:12:27 lr 0.000274 wd 0.0500 time 0.2270 (0.8580) data time 0.0009 (0.0421) model time 0.2260 (0.8158) loss 3.5909 (3.2852) grad_norm 2.5140 (3.4918) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][390/1251] eta 0:09:34 lr 0.000274 wd 0.0500 time 0.2305 (0.6673) data time 0.0008 (0.0298) model time 0.2298 (0.6375) loss 4.0343 (3.2994) grad_norm 2.3977 (3.4650) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][400/1251] eta 0:08:00 lr 0.000274 wd 0.0500 time 0.2291 (0.5649) data time 0.0010 (0.0231) model time 0.2281 (0.5418) loss 2.9677 (3.2310) grad_norm 2.8410 (4.0766) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][410/1251] eta 0:07:02 lr 0.000274 wd 0.0500 time 0.2259 (0.5023) data time 0.0010 (0.0190) model time 0.2249 (0.4833) loss 3.3463 (3.2320) grad_norm 3.0824 (3.9096) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][420/1251] eta 0:06:21 lr 0.000274 wd 0.0500 time 0.2255 (0.4590) data time 0.0010 (0.0162) model time 0.2245 (0.4428) loss 2.9531 (3.2073) grad_norm 1.8906 (3.7359) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][430/1251] eta 0:05:51 lr 0.000274 wd 0.0500 time 0.2256 (0.4278) data time 0.0009 (0.0142) model time 0.2247 (0.4136) loss 3.2810 (3.1502) grad_norm 2.7462 (3.6421) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][440/1251] eta 0:05:27 lr 0.000274 wd 0.0500 time 0.2315 (0.4042) data time 0.0007 (0.0126) model time 0.2309 (0.3916) loss 2.2400 (3.1111) grad_norm 4.1292 (3.5439) loss_scale 512.0000 (512.0000) mem 
7373MB [2024-08-27 22:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][450/1251] eta 0:05:08 lr 0.000274 wd 0.0500 time 0.2200 (0.3853) data time 0.0008 (0.0113) model time 0.2191 (0.3739) loss 3.6160 (3.1217) grad_norm 2.1385 (3.4524) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][460/1251] eta 0:04:52 lr 0.000274 wd 0.0500 time 0.2235 (0.3704) data time 0.0009 (0.0104) model time 0.2226 (0.3599) loss 3.2521 (3.1308) grad_norm 2.4433 (3.4199) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][470/1251] eta 0:04:39 lr 0.000273 wd 0.0500 time 0.2265 (0.3580) data time 0.0011 (0.0096) model time 0.2254 (0.3484) loss 3.3152 (3.1332) grad_norm 3.1660 (3.4136) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][480/1251] eta 0:04:27 lr 0.000273 wd 0.0500 time 0.2163 (0.3471) data time 0.0016 (0.0089) model time 0.2148 (0.3382) loss 2.9923 (3.1340) grad_norm 4.0309 (3.3997) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][490/1251] eta 0:04:17 lr 0.000273 wd 0.0500 time 0.2317 (0.3383) data time 0.0010 (0.0085) model time 0.2307 (0.3298) loss 3.0322 (3.1187) grad_norm 4.3554 (3.4778) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][500/1251] eta 0:04:08 lr 0.000273 wd 0.0500 time 0.2247 (0.3306) data time 0.0013 (0.0080) model time 0.2234 (0.3226) loss 2.9932 (3.1068) grad_norm 3.3568 (3.5103) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][510/1251] eta 0:04:00 lr 0.000273 wd 0.0500 time 0.2247 (0.3240) data time 0.0007 (0.0076) model time 0.2240 
(0.3164) loss 2.7937 (3.0966) grad_norm 2.9049 (3.5192) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][520/1251] eta 0:03:52 lr 0.000273 wd 0.0500 time 0.2225 (0.3181) data time 0.0008 (0.0072) model time 0.2217 (0.3108) loss 2.2020 (3.0861) grad_norm 2.1434 (3.4937) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][530/1251] eta 0:03:45 lr 0.000273 wd 0.0500 time 0.2278 (0.3130) data time 0.0009 (0.0069) model time 0.2269 (0.3061) loss 3.6927 (3.0810) grad_norm 3.5196 (3.4747) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][540/1251] eta 0:03:39 lr 0.000273 wd 0.0500 time 0.2254 (0.3085) data time 0.0009 (0.0066) model time 0.2245 (0.3019) loss 3.7842 (3.0733) grad_norm 3.0289 (3.4577) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][550/1251] eta 0:03:33 lr 0.000273 wd 0.0500 time 0.2411 (0.3045) data time 0.0007 (0.0063) model time 0.2404 (0.2982) loss 3.3277 (3.0743) grad_norm 2.1973 (3.4331) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][560/1251] eta 0:03:27 lr 0.000273 wd 0.0500 time 0.2257 (0.3008) data time 0.0010 (0.0060) model time 0.2247 (0.2948) loss 2.3524 (3.0516) grad_norm 2.8615 (3.4085) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 22:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][570/1251] eta 0:03:22 lr 0.000273 wd 0.0500 time 0.2247 (0.2977) data time 0.0009 (0.0058) model time 0.2238 (0.2918) loss 1.8271 (3.0423) grad_norm 4.7824 (3.4108) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][580/1251] eta 0:03:17 
lr 0.000273 wd 0.0500 time 0.2242 (0.2946) data time 0.0009 (0.0056) model time 0.2233 (0.2889) loss 2.2748 (3.0393) grad_norm 3.6887 (3.3983) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][590/1251] eta 0:03:12 lr 0.000273 wd 0.0500 time 0.2279 (0.2917) data time 0.0009 (0.0054) model time 0.2269 (0.2863) loss 2.3857 (3.0360) grad_norm 7.3516 (3.4016) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][600/1251] eta 0:03:08 lr 0.000273 wd 0.0500 time 0.2257 (0.2891) data time 0.0014 (0.0052) model time 0.2244 (0.2838) loss 3.3513 (3.0381) grad_norm 2.1673 (3.3790) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][610/1251] eta 0:03:03 lr 0.000273 wd 0.0500 time 0.2253 (0.2868) data time 0.0011 (0.0051) model time 0.2242 (0.2817) loss 3.3533 (3.0297) grad_norm 2.3227 (3.3659) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][620/1251] eta 0:02:59 lr 0.000273 wd 0.0500 time 0.2316 (0.2846) data time 0.0012 (0.0049) model time 0.2304 (0.2797) loss 3.0917 (3.0143) grad_norm 3.1496 (3.3518) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][630/1251] eta 0:02:55 lr 0.000273 wd 0.0500 time 0.2237 (0.2825) data time 0.0007 (0.0048) model time 0.2230 (0.2778) loss 3.7702 (3.0142) grad_norm 3.6583 (3.3435) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][640/1251] eta 0:02:51 lr 0.000273 wd 0.0500 time 0.2186 (0.2806) data time 0.0010 (0.0047) model time 0.2176 (0.2759) loss 3.7669 (3.0165) grad_norm 2.8195 (3.3632) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:16 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][650/1251] eta 0:02:48 lr 0.000273 wd 0.0500 time 0.2281 (0.2796) data time 0.0009 (0.0045) model time 0.2273 (0.2751) loss 2.8492 (3.0085) grad_norm 3.4531 (3.3793) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][660/1251] eta 0:02:44 lr 0.000273 wd 0.0500 time 0.2274 (0.2779) data time 0.0007 (0.0044) model time 0.2266 (0.2735) loss 2.9566 (3.0002) grad_norm 3.7937 (3.3769) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][670/1251] eta 0:02:41 lr 0.000273 wd 0.0500 time 0.2354 (0.2772) data time 0.0007 (0.0043) model time 0.2347 (0.2729) loss 3.3516 (3.0055) grad_norm 3.7158 (3.3836) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][680/1251] eta 0:02:37 lr 0.000273 wd 0.0500 time 0.2400 (0.2757) data time 0.0010 (0.0043) model time 0.2389 (0.2715) loss 3.4101 (3.0147) grad_norm 4.6429 (3.4038) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][690/1251] eta 0:02:33 lr 0.000273 wd 0.0500 time 0.2552 (0.2743) data time 0.0012 (0.0042) model time 0.2541 (0.2702) loss 3.6071 (3.0162) grad_norm 2.9929 (3.4084) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][700/1251] eta 0:02:30 lr 0.000273 wd 0.0500 time 0.2234 (0.2730) data time 0.0010 (0.0041) model time 0.2224 (0.2689) loss 3.0845 (3.0177) grad_norm 3.0897 (3.4058) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][710/1251] eta 0:02:27 lr 0.000273 wd 0.0500 time 0.2253 (0.2718) data time 0.0006 (0.0040) model time 0.2247 (0.2678) loss 3.3460 (3.0197) 
grad_norm 4.7763 (3.4020) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][720/1251] eta 0:02:23 lr 0.000273 wd 0.0500 time 0.2352 (0.2706) data time 0.0009 (0.0039) model time 0.2342 (0.2667) loss 3.3315 (3.0208) grad_norm 3.0141 (3.4035) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][730/1251] eta 0:02:20 lr 0.000272 wd 0.0500 time 0.2449 (0.2695) data time 0.0007 (0.0038) model time 0.2442 (0.2657) loss 3.5394 (3.0179) grad_norm 2.8701 (3.4004) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][740/1251] eta 0:02:17 lr 0.000272 wd 0.0500 time 0.2393 (0.2685) data time 0.0009 (0.0038) model time 0.2384 (0.2647) loss 2.3990 (3.0147) grad_norm 2.4385 (3.3946) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][750/1251] eta 0:02:14 lr 0.000272 wd 0.0500 time 0.2325 (0.2675) data time 0.0009 (0.0038) model time 0.2316 (0.2637) loss 3.6431 (3.0127) grad_norm 2.7515 (3.3919) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][760/1251] eta 0:02:10 lr 0.000272 wd 0.0500 time 0.2423 (0.2665) data time 0.0013 (0.0037) model time 0.2410 (0.2628) loss 2.6098 (3.0162) grad_norm 2.7847 (3.3844) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][770/1251] eta 0:02:07 lr 0.000272 wd 0.0500 time 0.2458 (0.2656) data time 0.0013 (0.0036) model time 0.2446 (0.2620) loss 3.3235 (3.0219) grad_norm 2.6005 (3.3759) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][780/1251] eta 0:02:04 lr 0.000272 wd 0.0500 time 
0.2371 (0.2648) data time 0.0008 (0.0036) model time 0.2363 (0.2612) loss 2.1774 (3.0181) grad_norm 3.0438 (3.3871) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][790/1251] eta 0:02:01 lr 0.000272 wd 0.0500 time 0.2438 (0.2641) data time 0.0007 (0.0036) model time 0.2430 (0.2605) loss 3.5473 (3.0259) grad_norm 3.1621 (3.4014) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][800/1251] eta 0:01:58 lr 0.000272 wd 0.0500 time 0.2515 (0.2633) data time 0.0008 (0.0035) model time 0.2508 (0.2598) loss 2.9411 (3.0278) grad_norm 2.3939 (3.3853) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][810/1251] eta 0:01:55 lr 0.000272 wd 0.0500 time 0.2392 (0.2626) data time 0.0007 (0.0035) model time 0.2385 (0.2591) loss 2.3159 (3.0251) grad_norm 3.7236 (3.4192) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][820/1251] eta 0:01:52 lr 0.000272 wd 0.0500 time 0.2350 (0.2619) data time 0.0007 (0.0034) model time 0.2343 (0.2585) loss 2.7900 (3.0219) grad_norm 4.9575 (3.4175) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][830/1251] eta 0:01:49 lr 0.000272 wd 0.0500 time 0.2531 (0.2612) data time 0.0012 (0.0034) model time 0.2519 (0.2578) loss 2.7928 (3.0150) grad_norm 2.6777 (3.4127) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][840/1251] eta 0:01:47 lr 0.000272 wd 0.0500 time 0.2458 (0.2606) data time 0.0010 (0.0034) model time 0.2449 (0.2573) loss 3.8382 (3.0152) grad_norm 3.7481 (3.4132) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:02 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [203/300][850/1251] eta 0:01:44 lr 0.000272 wd 0.0500 time 0.2509 (0.2601) data time 0.0008 (0.0033) model time 0.2501 (0.2567) loss 2.3906 (3.0166) grad_norm 2.7195 (3.4088) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][860/1251] eta 0:01:41 lr 0.000272 wd 0.0500 time 0.2271 (0.2594) data time 0.0008 (0.0033) model time 0.2263 (0.2561) loss 2.5110 (3.0141) grad_norm 5.3155 (3.4349) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][870/1251] eta 0:01:38 lr 0.000272 wd 0.0500 time 0.2225 (0.2589) data time 0.0008 (0.0033) model time 0.2217 (0.2556) loss 3.1531 (3.0200) grad_norm 5.5371 (3.4542) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][880/1251] eta 0:01:35 lr 0.000272 wd 0.0500 time 0.2498 (0.2583) data time 0.0009 (0.0032) model time 0.2489 (0.2551) loss 3.0718 (3.0158) grad_norm 2.3799 (3.4448) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][890/1251] eta 0:01:33 lr 0.000272 wd 0.0500 time 0.2442 (0.2578) data time 0.0006 (0.0032) model time 0.2435 (0.2546) loss 3.4174 (3.0157) grad_norm 3.4352 (3.4374) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][900/1251] eta 0:01:30 lr 0.000272 wd 0.0500 time 0.2305 (0.2573) data time 0.0014 (0.0032) model time 0.2290 (0.2541) loss 2.7562 (3.0120) grad_norm 3.0523 (3.4312) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][910/1251] eta 0:01:27 lr 0.000272 wd 0.0500 time 0.2331 (0.2568) data time 0.0010 (0.0031) model time 0.2321 (0.2537) loss 3.4343 (3.0173) grad_norm 3.1083 
(3.4397) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][920/1251] eta 0:01:24 lr 0.000272 wd 0.0500 time 0.2322 (0.2563) data time 0.0009 (0.0031) model time 0.2312 (0.2532) loss 3.1301 (3.0200) grad_norm 3.1363 (3.4392) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][930/1251] eta 0:01:22 lr 0.000272 wd 0.0500 time 0.2755 (0.2559) data time 0.0008 (0.0030) model time 0.2746 (0.2528) loss 2.7476 (3.0210) grad_norm 2.9860 (3.4596) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][940/1251] eta 0:01:19 lr 0.000272 wd 0.0500 time 0.2385 (0.2554) data time 0.0012 (0.0030) model time 0.2373 (0.2524) loss 3.4621 (3.0249) grad_norm 2.8574 (3.4555) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][950/1251] eta 0:01:16 lr 0.000272 wd 0.0500 time 0.2294 (0.2550) data time 0.0010 (0.0030) model time 0.2283 (0.2520) loss 3.0387 (3.0245) grad_norm 2.5643 (3.4699) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][960/1251] eta 0:01:14 lr 0.000272 wd 0.0500 time 0.2269 (0.2545) data time 0.0013 (0.0030) model time 0.2256 (0.2516) loss 2.3936 (3.0225) grad_norm 2.3654 (3.4565) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][970/1251] eta 0:01:11 lr 0.000272 wd 0.0500 time 0.2357 (0.2541) data time 0.0011 (0.0029) model time 0.2346 (0.2512) loss 3.1763 (3.0224) grad_norm 5.1602 (3.4494) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][980/1251] eta 0:01:08 lr 0.000271 wd 0.0500 time 0.2410 (0.2537) data 
time 0.0008 (0.0029) model time 0.2402 (0.2508) loss 3.4719 (3.0248) grad_norm 2.3772 (3.4429) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][990/1251] eta 0:01:06 lr 0.000271 wd 0.0500 time 0.2320 (0.2534) data time 0.0008 (0.0029) model time 0.2312 (0.2505) loss 3.4476 (3.0269) grad_norm 4.4938 (3.4616) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1000/1251] eta 0:01:03 lr 0.000271 wd 0.0500 time 0.2331 (0.2530) data time 0.0012 (0.0029) model time 0.2319 (0.2502) loss 3.6302 (3.0245) grad_norm 2.4885 (3.4580) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1010/1251] eta 0:01:00 lr 0.000271 wd 0.0500 time 0.2420 (0.2526) data time 0.0007 (0.0028) model time 0.2413 (0.2498) loss 3.5442 (3.0237) grad_norm 3.5052 (3.4620) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1020/1251] eta 0:00:58 lr 0.000271 wd 0.0500 time 0.2213 (0.2523) data time 0.0014 (0.0028) model time 0.2199 (0.2495) loss 3.1886 (3.0197) grad_norm 3.3065 (3.4543) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1030/1251] eta 0:00:55 lr 0.000271 wd 0.0500 time 0.2279 (0.2519) data time 0.0010 (0.0028) model time 0.2268 (0.2491) loss 3.4209 (3.0242) grad_norm 2.6453 (3.4442) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1040/1251] eta 0:00:53 lr 0.000271 wd 0.0500 time 0.2429 (0.2516) data time 0.0007 (0.0028) model time 0.2422 (0.2488) loss 2.9288 (3.0253) grad_norm 2.5806 (3.4389) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [203/300][1050/1251] eta 0:00:50 lr 0.000271 wd 0.0500 time 0.2294 (0.2513) data time 0.0009 (0.0028) model time 0.2285 (0.2486) loss 3.1744 (3.0242) grad_norm 4.1642 (3.4360) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1060/1251] eta 0:00:47 lr 0.000271 wd 0.0500 time 0.2285 (0.2511) data time 0.0007 (0.0027) model time 0.2278 (0.2483) loss 3.0705 (3.0211) grad_norm 2.8684 (3.4351) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1070/1251] eta 0:00:45 lr 0.000271 wd 0.0500 time 0.2369 (0.2507) data time 0.0011 (0.0027) model time 0.2358 (0.2480) loss 3.2653 (3.0196) grad_norm 3.3994 (3.4371) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1080/1251] eta 0:00:42 lr 0.000271 wd 0.0500 time 0.2353 (0.2505) data time 0.0007 (0.0027) model time 0.2347 (0.2478) loss 3.0860 (3.0187) grad_norm 3.2361 (3.4407) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1090/1251] eta 0:00:40 lr 0.000271 wd 0.0500 time 0.2283 (0.2502) data time 0.0011 (0.0027) model time 0.2272 (0.2475) loss 3.2216 (3.0188) grad_norm 2.5826 (3.4350) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1100/1251] eta 0:00:37 lr 0.000271 wd 0.0500 time 0.2315 (0.2499) data time 0.0009 (0.0026) model time 0.2306 (0.2472) loss 3.1911 (3.0215) grad_norm 2.6059 (3.4242) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1110/1251] eta 0:00:35 lr 0.000271 wd 0.0500 time 0.2386 (0.2496) data time 0.0012 (0.0026) model time 0.2374 (0.2470) loss 1.8019 (3.0207) grad_norm 7.1489 (3.4323) loss_scale 
512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1120/1251] eta 0:00:32 lr 0.000271 wd 0.0500 time 0.2295 (0.2493) data time 0.0007 (0.0026) model time 0.2288 (0.2467) loss 3.0710 (3.0226) grad_norm 2.7095 (3.4364) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1130/1251] eta 0:00:30 lr 0.000271 wd 0.0500 time 0.2282 (0.2490) data time 0.0009 (0.0026) model time 0.2273 (0.2465) loss 3.4624 (3.0243) grad_norm 3.9184 (3.4366) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1140/1251] eta 0:00:27 lr 0.000271 wd 0.0500 time 0.2237 (0.2488) data time 0.0010 (0.0026) model time 0.2227 (0.2462) loss 3.4319 (3.0249) grad_norm 10.7748 (3.4461) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1150/1251] eta 0:00:25 lr 0.000271 wd 0.0500 time 0.2484 (0.2486) data time 0.0010 (0.0025) model time 0.2474 (0.2460) loss 3.0748 (3.0229) grad_norm 3.2923 (3.4513) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1160/1251] eta 0:00:22 lr 0.000271 wd 0.0500 time 0.2279 (0.2483) data time 0.0010 (0.0025) model time 0.2269 (0.2458) loss 2.3274 (3.0194) grad_norm 3.7780 (3.4564) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1170/1251] eta 0:00:20 lr 0.000271 wd 0.0500 time 0.2233 (0.2483) data time 0.0008 (0.0025) model time 0.2224 (0.2458) loss 2.3176 (3.0138) grad_norm 3.1209 (3.4553) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1180/1251] eta 0:00:17 lr 0.000271 wd 0.0500 time 0.2262 (0.2481) data time 0.0012 
(0.0025) model time 0.2250 (0.2456) loss 2.5459 (3.0142) grad_norm 3.6430 (3.4550) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1190/1251] eta 0:00:15 lr 0.000271 wd 0.0500 time 0.2287 (0.2481) data time 0.0007 (0.0025) model time 0.2279 (0.2456) loss 3.0862 (3.0122) grad_norm 2.7804 (3.4500) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1200/1251] eta 0:00:12 lr 0.000271 wd 0.0500 time 0.2238 (0.2478) data time 0.0008 (0.0025) model time 0.2229 (0.2453) loss 3.7225 (3.0106) grad_norm 2.5641 (3.4474) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1210/1251] eta 0:00:10 lr 0.000271 wd 0.0500 time 0.2254 (0.2476) data time 0.0006 (0.0025) model time 0.2248 (0.2451) loss 2.5796 (3.0076) grad_norm 2.5005 (3.4420) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1220/1251] eta 0:00:07 lr 0.000271 wd 0.0500 time 0.2265 (0.2473) data time 0.0009 (0.0024) model time 0.2257 (0.2449) loss 1.8441 (3.0069) grad_norm 2.7353 (3.4365) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1230/1251] eta 0:00:05 lr 0.000271 wd 0.0500 time 0.2257 (0.2472) data time 0.0006 (0.0025) model time 0.2251 (0.2447) loss 2.9038 (3.0088) grad_norm 2.1384 (3.4348) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [203/300][1240/1251] eta 0:00:02 lr 0.000270 wd 0.0500 time 0.2281 (0.2469) data time 0.0007 (0.0025) model time 0.2274 (0.2444) loss 3.2125 (3.0081) grad_norm 2.7736 (3.4340) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[203/300][1250/1251] eta 0:00:00 lr 0.000270 wd 0.0500 time 0.2137 (0.2465) data time 0.0005 (0.0024) model time 0.2133 (0.2441) loss 3.0180 (3.0081) grad_norm 3.1207 (3.4293) loss_scale 512.0000 (512.0000) mem 7373MB
[2024-08-27 23:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 203 training takes 0:03:40
[2024-08-27 23:02:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-08-27 23:02:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-08-27 23:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.440 (0.440) Loss 0.4546 (0.4546) Acc@1 91.895 (91.895) Acc@5 98.340 (98.340) Mem 7373MB
[2024-08-27 23:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.114) Loss 0.6729 (0.6754) Acc@1 86.816 (85.875) Acc@5 97.266 (97.248) Mem 7373MB
[2024-08-27 23:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.099) Loss 0.9692 (0.7019) Acc@1 78.125 (85.007) Acc@5 95.410 (97.247) Mem 7373MB
[2024-08-27 23:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.094) Loss 1.2188 (0.7989) Acc@1 70.898 (82.639) Acc@5 90.820 (96.106) Mem 7373MB
[2024-08-27 23:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0605 (0.8487) Acc@1 75.977 (81.300) Acc@5 93.359 (95.608) Mem 7373MB
[2024-08-27 23:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 80.936 Acc@5 95.604
[2024-08-27 23:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 80.9%
[2024-08-27 23:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.944 (0.944) Loss 0.3877 (0.3877) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7373MB
[2024-08-27 23:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.163) Loss 0.5908 (0.6075) Acc@1 88.672 (87.287) Acc@5 97.656 (97.638) Mem 7373MB
[2024-08-27 23:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.125) Loss 0.8672 (0.6343) Acc@1 78.125 (86.203) Acc@5 96.094 (97.652) Mem 7373MB
[2024-08-27 23:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.084 (0.110) Loss 1.0938 (0.7184) Acc@1 73.145 (84.183) Acc@5 93.066 (96.774) Mem 7373MB
[2024-08-27 23:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.099) Loss 0.9795 (0.7614) Acc@1 75.977 (82.948) Acc@5 94.434 (96.287) Mem 7373MB
[2024-08-27 23:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.504 Acc@5 96.276
[2024-08-27 23:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.5%
[2024-08-27 23:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.50%
[2024-08-27 23:02:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-08-27 23:02:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-08-27 23:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][0/1251] eta 0:16:52 lr 0.000270 wd 0.0500 time 0.8094 (0.8094) data time 0.4721 (0.4721) model time 0.0000 (0.0000) loss 3.0941 (3.0941) grad_norm 2.5515 (2.5515) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][10/1251] eta 0:05:50 lr 0.000270 wd 0.0500 time 0.2306 (0.2826) data time 0.0007 (0.0440) model time 0.0000 (0.0000) loss 2.7669 (3.0351) grad_norm 2.2974 (2.8810) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][20/1251] eta 0:05:16 lr 0.000270 wd 0.0500 time 0.2341 (0.2572) data time 0.0009 (0.0235) model time 0.0000 (0.0000) loss 2.7497 (2.9908) grad_norm 2.5286 (2.9587) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][30/1251] eta 0:05:02 lr 0.000270 wd 0.0500 time 0.2229 (0.2473) data time 0.0008 (0.0162) model time 0.0000 (0.0000) loss 3.5315 (3.0517) grad_norm 3.7257 (3.0174) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][40/1251] eta 0:04:53 lr 0.000270 wd 0.0500 time 0.2249 (0.2426) data time 0.0009 (0.0125) model time 0.0000 (0.0000) loss 2.2488 (3.0691) grad_norm 2.7239 (3.0911) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][50/1251] eta 0:04:48 lr 0.000270 wd 0.0500 time 0.2238 (0.2400) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 2.3848 (2.9797) grad_norm 4.8791 (3.4203) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][60/1251] eta 0:04:43 lr 0.000270 wd 0.0500 time 0.2315 (0.2383) data time 0.0008 (0.0088) model time 0.2307 (0.2284) loss 
2.2971 (2.9203) grad_norm 4.2925 (3.4814) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][70/1251] eta 0:04:40 lr 0.000270 wd 0.0500 time 0.2258 (0.2374) data time 0.0007 (0.0078) model time 0.2252 (0.2296) loss 2.7411 (2.9399) grad_norm 2.3415 (3.4261) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][80/1251] eta 0:04:36 lr 0.000270 wd 0.0500 time 0.2236 (0.2362) data time 0.0009 (0.0069) model time 0.2228 (0.2286) loss 3.1221 (2.9630) grad_norm 2.9879 (3.3163) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][90/1251] eta 0:04:33 lr 0.000270 wd 0.0500 time 0.2301 (0.2357) data time 0.0010 (0.0064) model time 0.2291 (0.2289) loss 2.9557 (2.9602) grad_norm 2.9568 (3.2377) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][100/1251] eta 0:04:30 lr 0.000270 wd 0.0500 time 0.2315 (0.2350) data time 0.0010 (0.0059) model time 0.2305 (0.2285) loss 2.5104 (2.9551) grad_norm 4.1377 (3.2805) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][110/1251] eta 0:04:27 lr 0.000270 wd 0.0500 time 0.2248 (0.2348) data time 0.0010 (0.0055) model time 0.2238 (0.2290) loss 2.7277 (2.9648) grad_norm 3.4934 (3.3705) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][120/1251] eta 0:04:24 lr 0.000270 wd 0.0500 time 0.2230 (0.2343) data time 0.0013 (0.0051) model time 0.2218 (0.2288) loss 3.7138 (2.9800) grad_norm 2.7238 (3.3169) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][130/1251] eta 0:04:22 lr 0.000270 wd 
0.0500 time 0.2332 (0.2341) data time 0.0009 (0.0048) model time 0.2323 (0.2291) loss 2.1293 (2.9609) grad_norm 4.9075 (3.3235) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][140/1251] eta 0:04:19 lr 0.000270 wd 0.0500 time 0.2344 (0.2338) data time 0.0007 (0.0045) model time 0.2338 (0.2291) loss 2.8498 (2.9484) grad_norm 2.9759 (3.3072) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][150/1251] eta 0:04:17 lr 0.000270 wd 0.0500 time 0.2256 (0.2336) data time 0.0010 (0.0043) model time 0.2246 (0.2290) loss 1.9121 (2.9691) grad_norm 9.6094 (3.3422) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][160/1251] eta 0:04:14 lr 0.000270 wd 0.0500 time 0.2271 (0.2332) data time 0.0011 (0.0041) model time 0.2260 (0.2288) loss 3.1776 (2.9715) grad_norm 3.1153 (3.3438) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][170/1251] eta 0:04:11 lr 0.000270 wd 0.0500 time 0.2368 (0.2329) data time 0.0009 (0.0040) model time 0.2359 (0.2286) loss 3.2099 (2.9835) grad_norm 3.1317 (3.3441) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][180/1251] eta 0:04:09 lr 0.000270 wd 0.0500 time 0.2259 (0.2327) data time 0.0011 (0.0038) model time 0.2249 (0.2286) loss 3.3232 (2.9721) grad_norm 3.3778 (3.3299) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][190/1251] eta 0:04:06 lr 0.000270 wd 0.0500 time 0.2205 (0.2325) data time 0.0009 (0.0037) model time 0.2196 (0.2285) loss 3.0502 (2.9773) grad_norm 2.9115 (3.3181) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:39 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [204/300][200/1251] eta 0:04:04 lr 0.000270 wd 0.0500 time 0.2290 (0.2323) data time 0.0010 (0.0035) model time 0.2281 (0.2284) loss 2.6371 (2.9686) grad_norm 2.8565 (3.2985) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][210/1251] eta 0:04:01 lr 0.000270 wd 0.0500 time 0.2348 (0.2322) data time 0.0009 (0.0034) model time 0.2339 (0.2285) loss 3.4676 (2.9681) grad_norm 3.4422 (3.2920) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][220/1251] eta 0:03:59 lr 0.000270 wd 0.0500 time 0.2201 (0.2321) data time 0.0010 (0.0034) model time 0.2191 (0.2285) loss 3.2772 (2.9762) grad_norm 3.0599 (3.2727) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][230/1251] eta 0:03:56 lr 0.000270 wd 0.0500 time 0.2282 (0.2320) data time 0.0010 (0.0033) model time 0.2272 (0.2285) loss 2.4215 (2.9794) grad_norm 5.0745 (3.2746) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][240/1251] eta 0:03:54 lr 0.000269 wd 0.0500 time 0.2264 (0.2320) data time 0.0013 (0.0032) model time 0.2251 (0.2285) loss 3.5003 (2.9835) grad_norm 2.5018 (3.2634) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][250/1251] eta 0:03:52 lr 0.000269 wd 0.0500 time 0.2310 (0.2318) data time 0.0009 (0.0031) model time 0.2301 (0.2285) loss 3.0864 (2.9847) grad_norm 2.4179 (3.2540) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][260/1251] eta 0:03:49 lr 0.000269 wd 0.0500 time 0.2245 (0.2317) data time 0.0009 (0.0030) model time 0.2236 (0.2284) loss 2.4837 (2.9801) grad_norm 3.0137 
(3.2530) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][270/1251] eta 0:03:47 lr 0.000269 wd 0.0500 time 0.2518 (0.2317) data time 0.0007 (0.0030) model time 0.2511 (0.2285) loss 2.0003 (2.9740) grad_norm 2.4046 (3.2331) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][280/1251] eta 0:03:44 lr 0.000269 wd 0.0500 time 0.2264 (0.2316) data time 0.0007 (0.0029) model time 0.2256 (0.2285) loss 3.2744 (2.9843) grad_norm 2.7236 (3.2208) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][290/1251] eta 0:03:42 lr 0.000269 wd 0.0500 time 0.2250 (0.2316) data time 0.0010 (0.0029) model time 0.2239 (0.2286) loss 3.1618 (2.9950) grad_norm 6.8113 (3.2822) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][300/1251] eta 0:03:40 lr 0.000269 wd 0.0500 time 0.2284 (0.2316) data time 0.0009 (0.0028) model time 0.2275 (0.2286) loss 2.9308 (2.9993) grad_norm 2.8050 (3.2818) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][310/1251] eta 0:03:37 lr 0.000269 wd 0.0500 time 0.2299 (0.2315) data time 0.0009 (0.0027) model time 0.2291 (0.2286) loss 3.4014 (3.0049) grad_norm 2.1257 (3.2929) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][320/1251] eta 0:03:35 lr 0.000269 wd 0.0500 time 0.2308 (0.2316) data time 0.0014 (0.0027) model time 0.2293 (0.2287) loss 2.9994 (3.0047) grad_norm 3.0436 (3.2861) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][330/1251] eta 0:03:33 lr 0.000269 wd 0.0500 time 0.2364 (0.2317) data 
time 0.0010 (0.0027) model time 0.2355 (0.2289) loss 3.2301 (2.9996) grad_norm 3.3458 (3.3158) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][340/1251] eta 0:03:31 lr 0.000269 wd 0.0500 time 0.2261 (0.2317) data time 0.0010 (0.0026) model time 0.2251 (0.2290) loss 2.5155 (2.9937) grad_norm 2.2626 (3.3086) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][350/1251] eta 0:03:28 lr 0.000269 wd 0.0500 time 0.2203 (0.2316) data time 0.0012 (0.0026) model time 0.2190 (0.2289) loss 2.7155 (2.9943) grad_norm 2.1020 (3.2881) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][360/1251] eta 0:03:26 lr 0.000269 wd 0.0500 time 0.2301 (0.2316) data time 0.0009 (0.0025) model time 0.2292 (0.2289) loss 3.3249 (2.9884) grad_norm 2.5912 (3.2803) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][370/1251] eta 0:03:23 lr 0.000269 wd 0.0500 time 0.2317 (0.2315) data time 0.0011 (0.0025) model time 0.2306 (0.2289) loss 2.9185 (2.9904) grad_norm 9.5495 (3.3115) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][380/1251] eta 0:03:21 lr 0.000269 wd 0.0500 time 0.2330 (0.2315) data time 0.0007 (0.0025) model time 0.2323 (0.2289) loss 3.4463 (2.9933) grad_norm 2.6477 (3.3148) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][390/1251] eta 0:03:19 lr 0.000269 wd 0.0500 time 0.2223 (0.2314) data time 0.0010 (0.0024) model time 0.2213 (0.2288) loss 2.7485 (2.9993) grad_norm 3.4163 (3.3066) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [204/300][400/1251] eta 0:03:16 lr 0.000269 wd 0.0500 time 0.2285 (0.2313) data time 0.0008 (0.0024) model time 0.2277 (0.2288) loss 3.1115 (3.0031) grad_norm 5.4021 (3.3007) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][410/1251] eta 0:03:14 lr 0.000269 wd 0.0500 time 0.2288 (0.2313) data time 0.0008 (0.0024) model time 0.2280 (0.2288) loss 3.5486 (3.0018) grad_norm 3.5850 (3.3027) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][420/1251] eta 0:03:12 lr 0.000269 wd 0.0500 time 0.2259 (0.2313) data time 0.0010 (0.0023) model time 0.2249 (0.2289) loss 3.6766 (2.9952) grad_norm 3.5622 (3.3130) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][430/1251] eta 0:03:09 lr 0.000269 wd 0.0500 time 0.2304 (0.2313) data time 0.0006 (0.0023) model time 0.2298 (0.2289) loss 2.1804 (2.9865) grad_norm 2.5080 (3.3067) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][440/1251] eta 0:03:07 lr 0.000269 wd 0.0500 time 0.2308 (0.2312) data time 0.0009 (0.0023) model time 0.2299 (0.2289) loss 3.5616 (2.9927) grad_norm 4.3873 (3.2986) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][450/1251] eta 0:03:05 lr 0.000269 wd 0.0500 time 0.2215 (0.2312) data time 0.0012 (0.0022) model time 0.2202 (0.2288) loss 3.2205 (2.9904) grad_norm 2.4092 (3.2873) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][460/1251] eta 0:03:02 lr 0.000269 wd 0.0500 time 0.2264 (0.2312) data time 0.0009 (0.0022) model time 0.2255 (0.2289) loss 2.8207 (2.9959) grad_norm 3.9158 (3.2895) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-27 23:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][470/1251] eta 0:03:00 lr 0.000269 wd 0.0500 time 0.2248 (0.2312) data time 0.0010 (0.0022) model time 0.2238 (0.2289) loss 3.2136 (2.9977) grad_norm 3.1884 (3.2823) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][480/1251] eta 0:02:58 lr 0.000269 wd 0.0500 time 0.2345 (0.2316) data time 0.0009 (0.0022) model time 0.2336 (0.2294) loss 3.1632 (3.0017) grad_norm 2.2475 (3.2746) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][490/1251] eta 0:02:56 lr 0.000269 wd 0.0500 time 0.2627 (0.2316) data time 0.0009 (0.0021) model time 0.2619 (0.2294) loss 2.2796 (2.9998) grad_norm 2.1247 (3.2716) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][500/1251] eta 0:02:53 lr 0.000268 wd 0.0500 time 0.2369 (0.2316) data time 0.0007 (0.0021) model time 0.2363 (0.2295) loss 3.2760 (3.0013) grad_norm 3.2471 (3.2649) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][510/1251] eta 0:02:51 lr 0.000268 wd 0.0500 time 0.2400 (0.2316) data time 0.0010 (0.0021) model time 0.2389 (0.2295) loss 3.4605 (3.0037) grad_norm 2.8702 (3.2642) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][520/1251] eta 0:02:49 lr 0.000268 wd 0.0500 time 0.2351 (0.2316) data time 0.0013 (0.0021) model time 0.2338 (0.2295) loss 3.4604 (3.0065) grad_norm 4.5804 (3.2669) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][530/1251] eta 0:02:46 lr 0.000268 wd 0.0500 time 0.2222 (0.2316) data time 0.0010 (0.0021) model 
time 0.2213 (0.2295) loss 2.9978 (3.0059) grad_norm 3.0555 (3.2685) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][540/1251] eta 0:02:44 lr 0.000268 wd 0.0500 time 0.2467 (0.2316) data time 0.0006 (0.0021) model time 0.2461 (0.2295) loss 1.8370 (3.0033) grad_norm 2.9733 (3.2730) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][550/1251] eta 0:02:42 lr 0.000268 wd 0.0500 time 0.2404 (0.2316) data time 0.0007 (0.0021) model time 0.2397 (0.2296) loss 3.6753 (3.0057) grad_norm 2.5639 (3.2686) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][560/1251] eta 0:02:40 lr 0.000268 wd 0.0500 time 0.2352 (0.2316) data time 0.0010 (0.0020) model time 0.2341 (0.2296) loss 3.1943 (3.0061) grad_norm 2.7189 (3.2654) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][570/1251] eta 0:02:37 lr 0.000268 wd 0.0500 time 0.2596 (0.2316) data time 0.0010 (0.0020) model time 0.2586 (0.2296) loss 3.5857 (3.0116) grad_norm 2.5236 (3.2625) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][580/1251] eta 0:02:35 lr 0.000268 wd 0.0500 time 0.2484 (0.2316) data time 0.0009 (0.0021) model time 0.2476 (0.2296) loss 2.8946 (3.0081) grad_norm 2.5645 (3.2593) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][590/1251] eta 0:02:33 lr 0.000268 wd 0.0500 time 0.2319 (0.2316) data time 0.0006 (0.0021) model time 0.2313 (0.2295) loss 2.5768 (2.9988) grad_norm 2.3230 (3.2527) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][600/1251] 
eta 0:02:30 lr 0.000268 wd 0.0500 time 0.2443 (0.2316) data time 0.0008 (0.0021) model time 0.2435 (0.2295) loss 3.2061 (2.9972) grad_norm 3.1897 (3.2729) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][610/1251] eta 0:02:28 lr 0.000268 wd 0.0500 time 0.2449 (0.2316) data time 0.0010 (0.0021) model time 0.2438 (0.2295) loss 3.4739 (2.9993) grad_norm 5.1272 (3.2718) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][620/1251] eta 0:02:26 lr 0.000268 wd 0.0500 time 0.2282 (0.2316) data time 0.0010 (0.0020) model time 0.2272 (0.2295) loss 2.6563 (3.0018) grad_norm 3.5133 (3.2711) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][630/1251] eta 0:02:23 lr 0.000268 wd 0.0500 time 0.2311 (0.2315) data time 0.0007 (0.0020) model time 0.2305 (0.2295) loss 3.5083 (3.0047) grad_norm 4.0329 (3.2722) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][640/1251] eta 0:02:21 lr 0.000268 wd 0.0500 time 0.2307 (0.2315) data time 0.0013 (0.0020) model time 0.2294 (0.2295) loss 3.3253 (3.0019) grad_norm 2.4362 (3.2711) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][650/1251] eta 0:02:19 lr 0.000268 wd 0.0500 time 0.2480 (0.2315) data time 0.0007 (0.0020) model time 0.2473 (0.2295) loss 2.3519 (3.0011) grad_norm 3.1893 (3.2736) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][660/1251] eta 0:02:16 lr 0.000268 wd 0.0500 time 0.2676 (0.2316) data time 0.0007 (0.0020) model time 0.2670 (0.2296) loss 3.3326 (3.0002) grad_norm 3.7206 (3.2714) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 
23:05:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][670/1251] eta 0:02:14 lr 0.000268 wd 0.0500 time 0.2319 (0.2316) data time 0.0010 (0.0020) model time 0.2309 (0.2296) loss 3.0553 (2.9967) grad_norm 2.6814 (3.2700) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][680/1251] eta 0:02:12 lr 0.000268 wd 0.0500 time 0.2241 (0.2316) data time 0.0010 (0.0020) model time 0.2231 (0.2296) loss 2.9471 (2.9966) grad_norm 2.6558 (3.2840) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][690/1251] eta 0:02:09 lr 0.000268 wd 0.0500 time 0.2411 (0.2316) data time 0.0007 (0.0020) model time 0.2405 (0.2296) loss 3.5703 (2.9959) grad_norm 3.7623 (3.2780) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][700/1251] eta 0:02:07 lr 0.000268 wd 0.0500 time 0.4582 (0.2320) data time 0.0010 (0.0020) model time 0.4572 (0.2300) loss 3.3159 (3.0006) grad_norm 3.9899 (3.2791) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][710/1251] eta 0:02:05 lr 0.000268 wd 0.0500 time 0.2275 (0.2319) data time 0.0007 (0.0020) model time 0.2268 (0.2299) loss 3.0227 (3.0034) grad_norm 2.6800 (3.2735) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][720/1251] eta 0:02:03 lr 0.000268 wd 0.0500 time 0.2265 (0.2320) data time 0.0008 (0.0021) model time 0.2256 (0.2299) loss 2.9121 (3.0037) grad_norm 3.4328 (3.2681) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][730/1251] eta 0:02:00 lr 0.000268 wd 0.0500 time 0.2242 (0.2320) data time 0.0007 (0.0021) model time 0.2235 (0.2299) loss 3.2873 
(3.0031) grad_norm 3.7103 (3.2659) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][740/1251] eta 0:01:58 lr 0.000268 wd 0.0500 time 0.2132 (0.2321) data time 0.0008 (0.0021) model time 0.2123 (0.2301) loss 2.8646 (3.0033) grad_norm 2.6899 (3.2605) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][750/1251] eta 0:01:56 lr 0.000268 wd 0.0500 time 0.2242 (0.2323) data time 0.0007 (0.0021) model time 0.2235 (0.2303) loss 2.8552 (3.0001) grad_norm 2.8986 (3.2575) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][760/1251] eta 0:01:54 lr 0.000267 wd 0.0500 time 0.2350 (0.2323) data time 0.0009 (0.0021) model time 0.2341 (0.2302) loss 2.7584 (2.9948) grad_norm 2.7863 (3.2541) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][770/1251] eta 0:01:51 lr 0.000267 wd 0.0500 time 0.2167 (0.2323) data time 0.0007 (0.0021) model time 0.2159 (0.2302) loss 2.0906 (2.9913) grad_norm 2.4155 (3.2556) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][780/1251] eta 0:01:49 lr 0.000267 wd 0.0500 time 0.2333 (0.2323) data time 0.0007 (0.0021) model time 0.2326 (0.2303) loss 2.2554 (2.9910) grad_norm 6.1707 (3.2560) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][790/1251] eta 0:01:47 lr 0.000267 wd 0.0500 time 0.2219 (0.2323) data time 0.0010 (0.0021) model time 0.2209 (0.2303) loss 3.1914 (2.9911) grad_norm 2.5816 (3.2522) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][800/1251] eta 0:01:44 lr 0.000267 wd 0.0500 
time 0.2261 (0.2323) data time 0.0009 (0.0021) model time 0.2253 (0.2303) loss 3.4267 (2.9930) grad_norm 3.3159 (3.2612) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][810/1251] eta 0:01:42 lr 0.000267 wd 0.0500 time 0.2289 (0.2323) data time 0.0006 (0.0021) model time 0.2282 (0.2303) loss 3.0231 (2.9966) grad_norm 2.5440 (3.2562) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][820/1251] eta 0:01:40 lr 0.000267 wd 0.0500 time 0.2385 (0.2323) data time 0.0007 (0.0021) model time 0.2379 (0.2303) loss 2.6892 (2.9964) grad_norm 4.2936 (3.2566) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][830/1251] eta 0:01:37 lr 0.000267 wd 0.0500 time 0.2295 (0.2323) data time 0.0010 (0.0020) model time 0.2285 (0.2303) loss 2.5877 (2.9950) grad_norm 4.4831 (3.2550) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][840/1251] eta 0:01:35 lr 0.000267 wd 0.0500 time 0.2254 (0.2322) data time 0.0008 (0.0020) model time 0.2245 (0.2302) loss 2.2566 (2.9971) grad_norm 3.5265 (3.2564) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][850/1251] eta 0:01:33 lr 0.000267 wd 0.0500 time 0.2314 (0.2322) data time 0.0011 (0.0020) model time 0.2303 (0.2302) loss 1.9377 (2.9963) grad_norm 3.3554 (3.2544) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][860/1251] eta 0:01:30 lr 0.000267 wd 0.0500 time 0.2212 (0.2322) data time 0.0010 (0.0020) model time 0.2201 (0.2302) loss 2.9889 (2.9933) grad_norm 5.2439 (3.2583) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [204/300][870/1251] eta 0:01:28 lr 0.000267 wd 0.0500 time 0.2144 (0.2322) data time 0.0009 (0.0020) model time 0.2134 (0.2302) loss 3.4149 (2.9956) grad_norm 3.6101 (3.2627) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][880/1251] eta 0:01:26 lr 0.000267 wd 0.0500 time 0.2206 (0.2322) data time 0.0008 (0.0020) model time 0.2198 (0.2302) loss 3.7599 (2.9949) grad_norm 2.6831 (3.2651) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][890/1251] eta 0:01:23 lr 0.000267 wd 0.0500 time 0.2291 (0.2322) data time 0.0008 (0.0020) model time 0.2283 (0.2302) loss 2.0930 (2.9943) grad_norm 2.4195 (3.2669) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][900/1251] eta 0:01:21 lr 0.000267 wd 0.0500 time 0.2336 (0.2322) data time 0.0007 (0.0020) model time 0.2329 (0.2302) loss 3.2528 (2.9963) grad_norm 2.9067 (3.2669) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][910/1251] eta 0:01:19 lr 0.000267 wd 0.0500 time 0.2203 (0.2321) data time 0.0014 (0.0020) model time 0.2189 (0.2302) loss 3.0654 (2.9977) grad_norm 3.3828 (3.2691) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-27 23:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 23:06:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 23:06:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 23:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:08:33 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 23:08:42 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 23:08:43 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 23:08:45 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 23:08:45 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 204) [2024-08-27 23:08:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 23:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][920/1251] eta 0:14:25 lr 0.000267 wd 0.0500 time 0.2458 (2.6133) data time 0.0007 (0.1698) model time 0.2451 (2.4436) loss 3.5992 (3.4321) grad_norm 3.1822 (2.9722) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][930/1251] eta 0:06:02 lr 0.000267 wd 0.0500 time 0.2410 (1.1300) data time 0.0010 (0.0643) model time 0.2400 (1.0657) loss 3.5130 (3.2340) grad_norm 3.0103 (3.3910) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][940/1251] eta 0:04:05 lr 0.000267 wd 0.0500 time 0.2398 (0.7879) data time 0.0009 (0.0400) model time 0.2389 (0.7479) loss 3.1645 
(3.2621) grad_norm 2.4203 (3.2010) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][950/1251] eta 0:03:11 lr 0.000267 wd 0.0500 time 0.2485 (0.6361) data time 0.0011 (0.0292) model time 0.2474 (0.6070) loss 2.9829 (3.2323) grad_norm 3.2727 (3.3322) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][960/1251] eta 0:02:40 lr 0.000267 wd 0.0500 time 0.2418 (0.5503) data time 0.0009 (0.0231) model time 0.2409 (0.5272) loss 2.9643 (3.1869) grad_norm 2.6799 (3.4906) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][970/1251] eta 0:02:19 lr 0.000267 wd 0.0500 time 0.2336 (0.4949) data time 0.0008 (0.0191) model time 0.2327 (0.4758) loss 3.4713 (3.1843) grad_norm 3.2263 (3.4392) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][980/1251] eta 0:02:03 lr 0.000267 wd 0.0500 time 0.2389 (0.4568) data time 0.0008 (0.0164) model time 0.2382 (0.4404) loss 2.7212 (3.1409) grad_norm 3.0802 (3.3407) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][990/1251] eta 0:01:51 lr 0.000267 wd 0.0500 time 0.2396 (0.4281) data time 0.0010 (0.0144) model time 0.2386 (0.4137) loss 3.4272 (3.1154) grad_norm 3.6736 (3.2526) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1000/1251] eta 0:01:41 lr 0.000267 wd 0.0500 time 0.2405 (0.4062) data time 0.0008 (0.0128) model time 0.2396 (0.3934) loss 2.4171 (3.0889) grad_norm 2.7024 (3.3089) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1010/1251] eta 0:01:33 lr 0.000266 wd 
0.0500 time 0.2485 (0.3891) data time 0.0008 (0.0116) model time 0.2477 (0.3775) loss 3.3148 (3.0842) grad_norm 3.6928 (3.2824) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1020/1251] eta 0:01:26 lr 0.000266 wd 0.0500 time 0.2471 (0.3752) data time 0.0010 (0.0106) model time 0.2461 (0.3646) loss 3.2871 (3.1114) grad_norm 3.2217 (3.2621) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1030/1251] eta 0:01:20 lr 0.000266 wd 0.0500 time 0.2418 (0.3638) data time 0.0009 (0.0098) model time 0.2409 (0.3540) loss 3.7623 (3.1029) grad_norm 2.8870 (3.3163) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1040/1251] eta 0:01:14 lr 0.000266 wd 0.0500 time 0.2410 (0.3541) data time 0.0009 (0.0091) model time 0.2402 (0.3450) loss 2.0565 (3.0954) grad_norm 3.1128 (3.3002) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-27 23:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1050/1251] eta 0:01:09 lr 0.000266 wd 0.0500 time 0.2374 (0.3463) data time 0.0011 (0.0088) model time 0.2364 (0.3375) loss 2.5295 (3.0983) grad_norm 3.8897 (3.2725) loss_scale 1024.0000 (530.8235) mem 7376MB [2024-08-27 23:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1060/1251] eta 0:01:04 lr 0.000266 wd 0.0500 time 0.2456 (0.3392) data time 0.0008 (0.0083) model time 0.2448 (0.3309) loss 2.4259 (3.0867) grad_norm 2.9730 (3.4357) loss_scale 1024.0000 (564.6027) mem 7376MB [2024-08-27 23:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1070/1251] eta 0:01:00 lr 0.000266 wd 0.0500 time 0.2480 (0.3332) data time 0.0010 (0.0078) model time 0.2470 (0.3254) loss 3.1551 (3.0879) grad_norm 4.5684 (3.4305) loss_scale 1024.0000 (594.0513) mem 7376MB [2024-08-27 23:09:43 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1080/1251] eta 0:00:56 lr 0.000266 wd 0.0500 time 0.2389 (0.3277) data time 0.0010 (0.0074) model time 0.2379 (0.3203) loss 3.2004 (3.0869) grad_norm 2.2563 (3.4204) loss_scale 1024.0000 (619.9518) mem 7376MB [2024-08-27 23:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1090/1251] eta 0:00:51 lr 0.000266 wd 0.0500 time 0.2434 (0.3228) data time 0.0008 (0.0070) model time 0.2427 (0.3158) loss 3.0752 (3.0748) grad_norm 4.5570 (3.4267) loss_scale 1024.0000 (642.9091) mem 7376MB [2024-08-27 23:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1100/1251] eta 0:00:48 lr 0.000266 wd 0.0500 time 0.2378 (0.3185) data time 0.0013 (0.0067) model time 0.2366 (0.3117) loss 2.4335 (3.0624) grad_norm 5.8939 (3.5072) loss_scale 1024.0000 (663.3978) mem 7376MB [2024-08-27 23:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1110/1251] eta 0:00:44 lr 0.000266 wd 0.0500 time 0.2461 (0.3148) data time 0.0008 (0.0065) model time 0.2454 (0.3083) loss 2.6743 (3.0529) grad_norm 3.6318 (3.5073) loss_scale 1024.0000 (681.7959) mem 7376MB [2024-08-27 23:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1120/1251] eta 0:00:40 lr 0.000266 wd 0.0500 time 0.2468 (0.3114) data time 0.0008 (0.0062) model time 0.2461 (0.3052) loss 2.2877 (3.0400) grad_norm 2.9636 (3.4781) loss_scale 1024.0000 (698.4078) mem 7376MB [2024-08-27 23:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1130/1251] eta 0:00:37 lr 0.000266 wd 0.0500 time 0.2486 (0.3081) data time 0.0009 (0.0060) model time 0.2477 (0.3022) loss 2.0945 (3.0311) grad_norm 2.4115 (3.4554) loss_scale 1024.0000 (713.4815) mem 7376MB [2024-08-27 23:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1140/1251] eta 0:00:33 lr 0.000266 wd 0.0500 time 0.2383 (0.3052) data time 0.0008 (0.0058) model time 0.2375 (0.2995) loss 
2.7795 (3.0332) grad_norm 4.4038 (3.4428) loss_scale 1024.0000 (727.2212) mem 7376MB [2024-08-27 23:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1150/1251] eta 0:00:30 lr 0.000266 wd 0.0500 time 0.2413 (0.3025) data time 0.0008 (0.0056) model time 0.2404 (0.2969) loss 3.2294 (3.0240) grad_norm 2.8349 (3.4488) loss_scale 1024.0000 (739.7966) mem 7376MB [2024-08-27 23:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1160/1251] eta 0:00:27 lr 0.000266 wd 0.0500 time 0.2443 (0.3000) data time 0.0007 (0.0054) model time 0.2435 (0.2947) loss 2.4189 (3.0205) grad_norm 2.4609 (3.4373) loss_scale 1024.0000 (751.3496) mem 7376MB [2024-08-27 23:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1170/1251] eta 0:00:24 lr 0.000266 wd 0.0500 time 0.2472 (0.2978) data time 0.0010 (0.0052) model time 0.2462 (0.2926) loss 2.2427 (3.0108) grad_norm 4.3472 (3.4363) loss_scale 1024.0000 (762.0000) mem 7376MB [2024-08-27 23:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1180/1251] eta 0:00:21 lr 0.000266 wd 0.0500 time 0.2371 (0.2960) data time 0.0009 (0.0051) model time 0.2362 (0.2910) loss 2.6762 (3.0064) grad_norm 3.1358 (3.5062) loss_scale 1024.0000 (771.8496) mem 7376MB [2024-08-27 23:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1190/1251] eta 0:00:17 lr 0.000266 wd 0.0500 time 0.2584 (0.2941) data time 0.0010 (0.0049) model time 0.2574 (0.2891) loss 3.4034 (3.0122) grad_norm 3.3428 (inf) loss_scale 512.0000 (764.2899) mem 7376MB [2024-08-27 23:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1200/1251] eta 0:00:14 lr 0.000266 wd 0.0500 time 0.2424 (0.2922) data time 0.0011 (0.0048) model time 0.2412 (0.2874) loss 2.3468 (3.0037) grad_norm 3.3205 (inf) loss_scale 512.0000 (755.4685) mem 7376MB [2024-08-27 23:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1210/1251] eta 0:00:11 lr 
0.000266 wd 0.0500 time 0.2504 (0.2914) data time 0.0009 (0.0047) model time 0.2495 (0.2867) loss 2.1302 (3.0015) grad_norm 3.5529 (inf) loss_scale 512.0000 (747.2432) mem 7376MB [2024-08-27 23:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1220/1251] eta 0:00:08 lr 0.000266 wd 0.0500 time 0.2376 (0.2899) data time 0.0008 (0.0046) model time 0.2367 (0.2853) loss 1.8739 (2.9945) grad_norm 4.0703 (inf) loss_scale 512.0000 (739.5556) mem 7376MB [2024-08-27 23:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1230/1251] eta 0:00:06 lr 0.000266 wd 0.0500 time 0.2428 (0.2892) data time 0.0009 (0.0045) model time 0.2419 (0.2847) loss 3.1411 (3.0014) grad_norm 2.6797 (inf) loss_scale 512.0000 (732.3544) mem 7376MB [2024-08-27 23:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 23:10:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 23:10:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 23:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:13:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 23:13:42 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 23:13:43 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 23:13:44 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 23:13:44 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 204) [2024-08-27 23:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 23:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1240/1251] eta 0:00:33 lr 0.000266 wd 0.0500 time 0.2111 (3.0207) data time 0.0004 (0.1962) model time 0.2107 (2.8245) loss 3.3761 (3.4034) grad_norm 3.4747 (3.8495) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 23:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [204/300][1250/1251] eta 0:00:01 lr 0.000266 wd 0.0500 time 0.2185 (1.1504) data time 0.0006 (0.0659) model time 0.2178 (1.0845) loss 3.6306 (3.2690) grad_norm 3.3760 (3.4902) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-27 23:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 204 training takes 0:00:17 [2024-08-27 23:14:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 23:14:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-27 23:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.433 (0.433) Loss 0.4436 (0.4436) Acc@1 92.285 (92.285) Acc@5 98.145 (98.145) Mem 7374MB [2024-08-27 23:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.114) Loss 0.6543 (0.6664) Acc@1 86.328 (85.804) Acc@5 97.461 (97.301) Mem 7374MB [2024-08-27 23:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.100) Loss 0.9727 (0.6888) Acc@1 77.148 (84.970) Acc@5 95.117 (97.307) Mem 7374MB [2024-08-27 23:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.093) Loss 1.1475 (0.7818) Acc@1 73.438 (82.841) Acc@5 92.188 (96.276) Mem 7374MB [2024-08-27 23:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.087) Loss 1.0879 (0.8361) Acc@1 75.684 (81.479) Acc@5 92.773 (95.720) Mem 7374MB [2024-08-27 23:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.102 Acc@5 95.706 [2024-08-27 23:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.1% [2024-08-27 23:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.10% [2024-08-27 23:14:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-27 23:14:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-27 23:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.429 (0.429) Loss 0.3872 (0.3872) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7374MB [2024-08-27 23:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.112) Loss 0.5908 (0.6073) Acc@1 88.574 (87.322) Acc@5 97.559 (97.647) Mem 7374MB [2024-08-27 23:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 0.8706 (0.6342) Acc@1 77.832 (86.212) Acc@5 96.191 (97.666) Mem 7374MB [2024-08-27 23:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.087 (0.091) Loss 1.0938 (0.7182) Acc@1 73.438 (84.205) Acc@5 92.969 (96.787) Mem 7374MB [2024-08-27 23:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 0.9790 (0.7612) Acc@1 76.074 (83.008) Acc@5 94.141 (96.296) Mem 7374MB [2024-08-27 23:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.574 Acc@5 96.282 [2024-08-27 23:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-08-27 23:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.57% [2024-08-27 23:14:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... 
[2024-08-27 23:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:25:21 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth [2024-08-27 23:25:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth.................... 
[2024-08-27 23:25:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 23:25:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 23:25:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth' (epoch 204) [2024-08-27 23:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 23:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][0/1251] eta 3:59:39 lr 0.000266 wd 0.0500 time 11.4945 (11.4945) data time 0.9774 (0.9774) model time 0.0000 (0.0000) loss 3.6752 (3.6752) grad_norm 3.1475 (3.1475) loss_scale 512.0000 (512.0000) mem 20033MB [2024-08-27 23:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 23:25:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 23:26:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 23:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:28:34 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 23:28:45 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 23:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:30:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:37:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 23:37:44 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 23:37:46 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 23:37:47 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 23:37:47 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 205) [2024-08-27 23:37:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 23:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][10/1251] eta 1:02:14 lr 0.000266 wd 0.0500 time 0.2489 (3.0090) data time 0.0008 (0.1737) model time 0.0000 (0.0000) loss 3.4003 (3.4143) grad_norm 2.8114 (4.9112) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][20/1251] eta 0:23:51 lr 0.000265 wd 0.0500 time 0.2333 (1.1627) data time 0.0010 (0.0589) model time 0.0000 (0.0000) loss 3.1533 (3.2643) grad_norm 2.6165 (4.1367) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][30/1251] eta 0:16:12 lr 0.000265 wd 0.0500 time 0.2815 (0.7967) data time 0.0010 (0.0359) model time 0.0000 (0.0000) loss 3.1616 (3.2343) grad_norm 3.2816 (3.8488) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][40/1251] eta 0:12:53 lr 0.000265 wd 0.0500 time 0.2411 (0.6391) data time 0.0009 (0.0262) model time 0.0000 (0.0000) loss 3.0379 (3.2111) grad_norm 3.7239 (3.6683) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][50/1251] eta 0:11:02 lr 0.000265 wd 0.0500 time 0.2283 (0.5514) data time 0.0010 (0.0209) model time 0.0000 (0.0000) loss 3.1520 (3.1498) grad_norm 3.5818 (3.5813) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[205/300][60/1251] eta 0:09:50 lr 0.000265 wd 0.0500 time 0.2453 (0.4961) data time 0.0008 (0.0175) model time 0.2445 (0.2450) loss 2.2747 (3.1298) grad_norm 3.7028 (3.5826) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][70/1251] eta 0:08:59 lr 0.000265 wd 0.0500 time 0.2345 (0.4571) data time 0.0009 (0.0151) model time 0.2336 (0.2429) loss 3.7071 (3.1360) grad_norm 4.5630 (3.6151) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][80/1251] eta 0:08:22 lr 0.000265 wd 0.0500 time 0.2462 (0.4291) data time 0.0010 (0.0133) model time 0.2453 (0.2438) loss 2.5567 (3.1002) grad_norm 4.2691 (3.5842) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][90/1251] eta 0:07:52 lr 0.000265 wd 0.0500 time 0.2358 (0.4073) data time 0.0007 (0.0119) model time 0.2351 (0.2434) loss 2.8209 (3.0673) grad_norm 3.5531 (3.5014) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 23:38:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 23:38:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 23:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:49:48 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 23:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 23:49:56 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 23:49:57 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 23:49:59 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 23:49:59 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 205) [2024-08-27 23:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 23:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][100/1251] eta 1:02:58 lr 0.000265 wd 0.0500 time 0.2299 (3.2825) data time 0.0007 (0.3238) model time 0.2292 (2.9587) loss 3.8286 (3.5501) grad_norm 2.6999 (2.7904) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][110/1251] eta 0:20:53 lr 0.000265 wd 0.0500 time 0.2225 (1.0990) data time 0.0008 (0.0934) model time 0.2217 (1.0056) loss 3.1424 (3.2326) grad_norm 2.2520 (2.9424) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][120/1251] eta 0:13:51 lr 0.000265 wd 0.0500 time 0.2264 (0.7348) data time 0.0010 (0.0551) model time 0.2254 (0.6798) loss 3.2875 (3.2638) grad_norm 2.0473 (2.9706) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][130/1251] eta 0:10:56 lr 0.000265 wd 0.0500 time 0.2196 (0.5859) data time 0.0008 (0.0399) model time 0.2188 (0.5460) loss 2.0658 (3.2314) grad_norm 4.7527 (3.0814) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:50:24 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][140/1251] eta 0:09:20 lr 0.000265 wd 0.0500 time 0.2241 (0.5041) data time 0.0014 (0.0311) model time 0.2227 (0.4730) loss 3.2228 (3.1959) grad_norm 3.0412 (3.3234) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][150/1251] eta 0:08:19 lr 0.000265 wd 0.0500 time 0.2312 (0.4536) data time 0.0006 (0.0260) model time 0.2306 (0.4276) loss 3.2244 (3.1746) grad_norm 3.5206 (3.3727) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][160/1251] eta 0:07:36 lr 0.000265 wd 0.0500 time 0.2334 (0.4184) data time 0.0006 (0.0221) model time 0.2327 (0.3963) loss 3.3740 (3.1333) grad_norm 3.1993 (3.3571) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-27 23:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 23:50:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 23:50:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 23:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:52:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-27 23:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 23:52:42 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-27 23:52:44 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 23:52:45 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 23:52:45 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 205) [2024-08-27 23:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-27 23:53:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][170/1251] eta 1:06:10 lr 0.000265 wd 0.0500 time 0.2454 (3.6732) data time 0.0007 (0.2792) model time 0.2448 (3.3940) loss 3.3111 (3.2324) grad_norm 3.1993 (3.3313) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][180/1251] eta 0:21:49 lr 0.000265 wd 0.0500 time 0.2404 (1.2228) data time 0.0009 (0.0805) model time 0.2395 (1.1423) loss 3.2978 (3.1933) grad_norm 4.2735 (3.4313) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][190/1251] eta 0:14:24 lr 0.000265 wd 0.0500 time 0.2400 (0.8147) data time 0.0016 (0.0475) model time 0.2384 (0.7672) loss 3.1881 (3.2167) grad_norm 2.3814 (3.1702) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][200/1251] eta 0:11:18 lr 0.000265 wd 0.0500 time 0.2487 (0.6460) data time 0.0007 (0.0338) model time 0.2480 (0.6122) loss 2.4474 (3.1906) grad_norm 2.6930 (3.1609) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:14 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][210/1251] eta 0:09:37 lr 0.000265 wd 0.0500 time 0.2360 (0.5548) data time 0.0007 (0.0264) model time 0.2352 (0.5285) loss 2.8404 (3.1421) grad_norm 2.9583 (3.1396) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][220/1251] eta 0:08:31 lr 0.000265 wd 0.0500 time 0.2349 (0.4965) data time 0.0010 (0.0218) model time 0.2339 (0.4747) loss 3.3767 (3.1490) grad_norm 8.9175 (3.2551) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][230/1251] eta 0:07:46 lr 0.000265 wd 0.0500 time 0.2292 (0.4565) data time 0.0008 (0.0186) model time 0.2285 (0.4379) loss 3.1377 (3.1274) grad_norm 3.5579 (3.2882) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][240/1251] eta 0:07:12 lr 0.000265 wd 0.0500 time 0.2449 (0.4278) data time 0.0010 (0.0162) model time 0.2439 (0.4116) loss 3.4684 (3.0961) grad_norm 2.5609 (3.3420) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][250/1251] eta 0:06:46 lr 0.000265 wd 0.0500 time 0.2417 (0.4065) data time 0.0012 (0.0144) model time 0.2405 (0.3921) loss 3.2193 (3.0781) grad_norm 2.4272 (3.3871) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][260/1251] eta 0:06:25 lr 0.000265 wd 0.0500 time 0.2360 (0.3888) data time 0.0012 (0.0130) model time 0.2347 (0.3759) loss 2.8893 (3.0717) grad_norm 10.3311 (3.4388) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][270/1251] eta 0:06:07 lr 0.000265 wd 0.0500 time 0.2372 (0.3748) data time 0.0011 (0.0118) model time 0.2361 (0.3629) loss 3.2576 (3.1048) 
grad_norm 2.3500 (3.3908) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][280/1251] eta 0:05:52 lr 0.000264 wd 0.0500 time 0.2492 (0.3632) data time 0.0011 (0.0109) model time 0.2482 (0.3523) loss 3.5477 (3.0921) grad_norm 2.6770 (3.3464) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][290/1251] eta 0:05:39 lr 0.000264 wd 0.0500 time 0.2508 (0.3534) data time 0.0010 (0.0101) model time 0.2499 (0.3433) loss 2.6186 (3.0937) grad_norm 4.3916 (3.4121) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][300/1251] eta 0:05:28 lr 0.000264 wd 0.0500 time 0.2352 (0.3451) data time 0.0010 (0.0094) model time 0.2342 (0.3356) loss 3.5564 (3.0938) grad_norm 3.1269 (3.3985) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][310/1251] eta 0:05:17 lr 0.000264 wd 0.0500 time 0.2362 (0.3378) data time 0.0008 (0.0089) model time 0.2355 (0.3289) loss 2.7434 (3.0804) grad_norm 2.3706 (3.3993) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][320/1251] eta 0:05:08 lr 0.000264 wd 0.0500 time 0.2398 (0.3316) data time 0.0007 (0.0084) model time 0.2390 (0.3232) loss 2.8081 (3.0688) grad_norm 4.1469 (3.4687) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][330/1251] eta 0:05:00 lr 0.000264 wd 0.0500 time 0.2476 (0.3264) data time 0.0007 (0.0081) model time 0.2469 (0.3183) loss 2.8912 (3.0629) grad_norm 2.8264 (3.4748) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][340/1251] eta 0:04:52 lr 0.000264 wd 0.0500 time 
0.2402 (0.3215) data time 0.0010 (0.0077) model time 0.2392 (0.3138) loss 1.9469 (3.0563) grad_norm 5.0513 (3.7294) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][350/1251] eta 0:04:45 lr 0.000264 wd 0.0500 time 0.2365 (0.3171) data time 0.0010 (0.0074) model time 0.2356 (0.3097) loss 2.6698 (3.0506) grad_norm 2.5549 (3.7303) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][360/1251] eta 0:04:39 lr 0.000264 wd 0.0500 time 0.2413 (0.3131) data time 0.0007 (0.0071) model time 0.2406 (0.3061) loss 2.9401 (3.0480) grad_norm 3.5550 (3.7028) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][370/1251] eta 0:04:32 lr 0.000264 wd 0.0500 time 0.2411 (0.3096) data time 0.0011 (0.0068) model time 0.2400 (0.3028) loss 3.2029 (3.0377) grad_norm 2.9532 (3.7012) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][380/1251] eta 0:04:26 lr 0.000264 wd 0.0500 time 0.2398 (0.3064) data time 0.0009 (0.0065) model time 0.2389 (0.2999) loss 3.0967 (3.0303) grad_norm 6.7633 (3.6911) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][390/1251] eta 0:04:21 lr 0.000264 wd 0.0500 time 0.2413 (0.3036) data time 0.0010 (0.0063) model time 0.2402 (0.2973) loss 3.9027 (3.0306) grad_norm 2.8804 (3.6614) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][400/1251] eta 0:04:16 lr 0.000264 wd 0.0500 time 0.2371 (0.3009) data time 0.0008 (0.0061) model time 0.2363 (0.2948) loss 2.3450 (3.0215) grad_norm 2.8809 (3.6350) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:02 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [205/300][410/1251] eta 0:04:10 lr 0.000264 wd 0.0500 time 0.2315 (0.2984) data time 0.0009 (0.0058) model time 0.2306 (0.2926) loss 2.1894 (3.0238) grad_norm 2.6052 (3.6215) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][420/1251] eta 0:04:06 lr 0.000264 wd 0.0500 time 0.2568 (0.2963) data time 0.0010 (0.0057) model time 0.2559 (0.2906) loss 1.9314 (3.0148) grad_norm 7.0500 (3.6208) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][430/1251] eta 0:04:01 lr 0.000264 wd 0.0500 time 0.2457 (0.2942) data time 0.0007 (0.0056) model time 0.2449 (0.2887) loss 3.2358 (3.0108) grad_norm 3.2767 (3.6260) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][440/1251] eta 0:03:57 lr 0.000264 wd 0.0500 time 0.2400 (0.2923) data time 0.0011 (0.0054) model time 0.2389 (0.2869) loss 3.2410 (3.0045) grad_norm 3.3684 (3.6040) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][450/1251] eta 0:03:52 lr 0.000264 wd 0.0500 time 0.2431 (0.2905) data time 0.0007 (0.0052) model time 0.2424 (0.2853) loss 1.9093 (2.9991) grad_norm 4.5328 (3.6031) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][460/1251] eta 0:03:49 lr 0.000264 wd 0.0500 time 0.2421 (0.2896) data time 0.0008 (0.0051) model time 0.2413 (0.2845) loss 2.6431 (2.9919) grad_norm 3.2548 (3.6278) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-27 23:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-27 23:54:16 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-27 23:54:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-27 23:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:57:41 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-27 23:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-27 23:59:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-27 23:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-27 23:59:48 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-27 23:59:49 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-27 23:59:51 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-27 23:59:51 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 205) [2024-08-27 23:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-28 00:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-28 00:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-28 00:02:22 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-28 00:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-28 00:02:30 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-28 01:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-28 01:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-28 01:01:48 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-28 01:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-28 01:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-28 01:08:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-28 01:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-28 01:08:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-28 01:08:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-28 01:08:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-28 01:08:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 205) [2024-08-28 01:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-28 01:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][470/1251] eta 1:40:13 lr 0.000264 wd 0.0500 time 0.4201 (7.6991) data time 0.0008 (0.5211) model time 0.4193 (7.1780) loss 4.0239 (4.0084) grad_norm 10.0703 (6.8301) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-28 01:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][480/1251] eta 0:18:57 lr 0.000264 wd 0.0500 time 0.2337 (1.4750) data time 0.0008 (0.0877) model time 0.2329 (1.3873) loss 2.1192 (3.2504) grad_norm 2.4561 (3.9163) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][490/1251] eta 0:11:31 lr 0.000264 wd 0.0500 time 0.2481 (0.9089) data time 0.0009 (0.0484) model time 0.2473 (0.8605) loss 3.1823 (3.2532) grad_norm 3.4833 (3.5865) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][500/1251] eta 0:08:43 lr 0.000264 wd 0.0500 time 0.2356 (0.6972) data time 0.0009 (0.0339) model time 0.2347 (0.6633) loss 3.2184 (3.2292) grad_norm 3.8668 (3.5097) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][510/1251] eta 0:07:13 lr 0.000264 wd 0.0500 time 0.2393 (0.5856) data time 0.0009 (0.0261) model time 0.2385 (0.5595) loss 3.4898 (3.2037) grad_norm 2.9183 (3.3578) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][520/1251] eta 0:06:17 lr 0.000264 wd 0.0500 time 0.2376 (0.5169) data time 0.0008 (0.0213) model time 0.2368 (0.4956) loss 3.2693 (3.1889) grad_norm 2.9683 (3.2937) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][530/1251] eta 0:05:39 lr 0.000264 wd 0.0500 time 0.2408 (0.4706) data time 0.0007 (0.0180) model time 0.2402 (0.4526) loss 3.6635 (3.1595) grad_norm 3.9853 (3.2923) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][540/1251] eta 0:05:10 lr 0.000263 wd 0.0500 time 0.2355 (0.4373) data time 0.0009 (0.0157) model time 0.2346 (0.4216) loss 2.8872 (3.1184) grad_norm 2.7592 (3.2305) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][550/1251] eta 0:04:48 lr 0.000263 wd 0.0500 time 0.2467 (0.4121) data time 0.0009 (0.0139) model time 0.2458 (0.3982) loss 2.7457 (3.0925) grad_norm 3.8912 (3.2413) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][560/1251] eta 0:04:31 lr 0.000263 wd 0.0500 time 0.2369 (0.3925) data time 0.0007 (0.0125) model time 0.2362 (0.3800) loss 1.8481 (3.0772) grad_norm 3.5613 (3.4446) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][570/1251] eta 0:04:16 lr 0.000263 wd 0.0500 time 0.2274 (0.3770) data time 0.0008 (0.0114) model time 0.2266 (0.3657) loss 3.4824 (3.1022) 
grad_norm 2.5045 (3.4922) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][580/1251] eta 0:04:04 lr 0.000263 wd 0.0500 time 0.2266 (0.3650) data time 0.0009 (0.0104) model time 0.2257 (0.3545) loss 3.2975 (3.0944) grad_norm 2.7598 (3.4466) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][590/1251] eta 0:03:54 lr 0.000263 wd 0.0500 time 0.2305 (0.3541) data time 0.0008 (0.0097) model time 0.2297 (0.3444) loss 3.1823 (3.0873) grad_norm 2.2656 (3.4586) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][600/1251] eta 0:03:44 lr 0.000263 wd 0.0500 time 0.2183 (0.3445) data time 0.0011 (0.0091) model time 0.2172 (0.3355) loss 2.9028 (3.0697) grad_norm 3.3721 (3.4541) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][610/1251] eta 0:03:35 lr 0.000263 wd 0.0500 time 0.2196 (0.3364) data time 0.0009 (0.0085) model time 0.2187 (0.3279) loss 3.5415 (3.0644) grad_norm 3.3410 (3.4599) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][620/1251] eta 0:03:28 lr 0.000263 wd 0.0500 time 0.2257 (0.3297) data time 0.0010 (0.0080) model time 0.2248 (0.3217) loss 3.2447 (3.0644) grad_norm 5.0501 (3.4644) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][630/1251] eta 0:03:20 lr 0.000263 wd 0.0500 time 0.2183 (0.3236) data time 0.0012 (0.0076) model time 0.2171 (0.3160) loss 3.3055 (3.0656) grad_norm 2.6983 (3.4845) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][640/1251] eta 0:03:14 lr 0.000263 wd 0.0500 time 
0.2317 (0.3181) data time 0.0011 (0.0072) model time 0.2306 (0.3109) loss 2.7083 (3.0577) grad_norm 3.3674 (3.4920) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][650/1251] eta 0:03:08 lr 0.000263 wd 0.0500 time 0.2292 (0.3132) data time 0.0009 (0.0069) model time 0.2283 (0.3063) loss 3.4759 (3.0413) grad_norm 2.3434 (3.4607) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][660/1251] eta 0:03:02 lr 0.000263 wd 0.0500 time 0.2174 (0.3091) data time 0.0012 (0.0066) model time 0.2162 (0.3025) loss 3.2152 (3.0418) grad_norm 3.5251 (3.4595) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][670/1251] eta 0:02:57 lr 0.000263 wd 0.0500 time 0.2224 (0.3052) data time 0.0008 (0.0063) model time 0.2216 (0.2989) loss 3.3370 (3.0336) grad_norm 3.6256 (3.4602) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][680/1251] eta 0:02:52 lr 0.000263 wd 0.0500 time 0.2279 (0.3016) data time 0.0009 (0.0061) model time 0.2270 (0.2955) loss 3.1172 (3.0290) grad_norm 3.1895 (3.4521) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][690/1251] eta 0:02:47 lr 0.000263 wd 0.0500 time 0.2215 (0.2984) data time 0.0007 (0.0058) model time 0.2208 (0.2925) loss 3.4679 (3.0246) grad_norm 2.9929 (3.4955) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][700/1251] eta 0:02:42 lr 0.000263 wd 0.0500 time 0.2233 (0.2954) data time 0.0009 (0.0056) model time 0.2225 (0.2898) loss 3.1592 (3.0196) grad_norm 3.5054 (3.4775) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:56 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [205/300][710/1251] eta 0:02:38 lr 0.000263 wd 0.0500 time 0.2179 (0.2927) data time 0.0008 (0.0054) model time 0.2171 (0.2873) loss 3.4323 (3.0173) grad_norm 2.9317 (3.4767) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][720/1251] eta 0:02:34 lr 0.000263 wd 0.0500 time 0.2271 (0.2906) data time 0.0007 (0.0053) model time 0.2264 (0.2853) loss 3.9444 (3.0096) grad_norm 3.4104 (3.4699) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][730/1251] eta 0:02:30 lr 0.000263 wd 0.0500 time 0.2207 (0.2883) data time 0.0009 (0.0051) model time 0.2198 (0.2832) loss 3.4848 (3.0040) grad_norm 2.8697 (3.4592) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][740/1251] eta 0:02:26 lr 0.000263 wd 0.0500 time 0.2194 (0.2861) data time 0.0009 (0.0050) model time 0.2185 (0.2811) loss 2.8217 (2.9950) grad_norm 3.3070 (3.4644) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][750/1251] eta 0:02:22 lr 0.000263 wd 0.0500 time 0.2233 (0.2843) data time 0.0011 (0.0049) model time 0.2222 (0.2794) loss 1.9586 (2.9936) grad_norm 2.2703 (3.4597) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][760/1251] eta 0:02:19 lr 0.000263 wd 0.0500 time 0.2299 (0.2832) data time 0.0009 (0.0047) model time 0.2290 (0.2784) loss 2.8098 (2.9881) grad_norm 2.7336 (3.4471) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][770/1251] eta 0:02:15 lr 0.000263 wd 0.0500 time 0.2323 (0.2814) data time 0.0011 (0.0046) model time 0.2312 (0.2768) loss 3.2569 (2.9833) grad_norm 2.9300 
(3.4460) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][780/1251] eta 0:02:12 lr 0.000263 wd 0.0500 time 0.2281 (0.2805) data time 0.0009 (0.0045) model time 0.2272 (0.2760) loss 3.0688 (2.9864) grad_norm 3.3851 (3.4502) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][790/1251] eta 0:02:08 lr 0.000262 wd 0.0500 time 0.2307 (0.2790) data time 0.0009 (0.0044) model time 0.2298 (0.2746) loss 3.2556 (2.9982) grad_norm 2.4297 (3.4418) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][800/1251] eta 0:02:05 lr 0.000262 wd 0.0500 time 0.2243 (0.2775) data time 0.0011 (0.0043) model time 0.2232 (0.2732) loss 3.4627 (2.9971) grad_norm 3.7067 (3.4344) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][810/1251] eta 0:02:01 lr 0.000262 wd 0.0500 time 0.2259 (0.2760) data time 0.0007 (0.0042) model time 0.2252 (0.2718) loss 3.4939 (2.9994) grad_norm 2.8084 (3.4285) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][820/1251] eta 0:01:58 lr 0.000262 wd 0.0500 time 0.2290 (0.2748) data time 0.0007 (0.0041) model time 0.2283 (0.2706) loss 3.3320 (2.9990) grad_norm 3.0291 (3.4221) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][830/1251] eta 0:01:55 lr 0.000262 wd 0.0500 time 0.2223 (0.2736) data time 0.0008 (0.0040) model time 0.2216 (0.2695) loss 3.5749 (2.9979) grad_norm 7.0804 (3.4984) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][840/1251] eta 0:01:51 lr 0.000262 wd 0.0500 time 0.2256 (0.2725) data 
time 0.0009 (0.0040) model time 0.2247 (0.2685) loss 2.9498 (2.9949) grad_norm 2.2735 (3.4918) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][850/1251] eta 0:01:48 lr 0.000262 wd 0.0500 time 0.2413 (0.2714) data time 0.0007 (0.0039) model time 0.2406 (0.2674) loss 3.5647 (2.9940) grad_norm 2.8568 (3.4816) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][860/1251] eta 0:01:45 lr 0.000262 wd 0.0500 time 0.2338 (0.2703) data time 0.0007 (0.0038) model time 0.2330 (0.2665) loss 2.5238 (2.9892) grad_norm 3.5542 (3.4742) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][870/1251] eta 0:01:42 lr 0.000262 wd 0.0500 time 0.2517 (0.2693) data time 0.0007 (0.0038) model time 0.2510 (0.2655) loss 3.7144 (2.9979) grad_norm 3.3181 (3.4589) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][880/1251] eta 0:01:39 lr 0.000262 wd 0.0500 time 0.2275 (0.2683) data time 0.0009 (0.0037) model time 0.2266 (0.2646) loss 3.2091 (2.9999) grad_norm 2.4282 (3.4620) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][890/1251] eta 0:01:36 lr 0.000262 wd 0.0500 time 0.2262 (0.2674) data time 0.0007 (0.0037) model time 0.2254 (0.2637) loss 3.4376 (3.0004) grad_norm 3.0994 (3.4504) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][900/1251] eta 0:01:33 lr 0.000262 wd 0.0500 time 0.2322 (0.2666) data time 0.0008 (0.0036) model time 0.2315 (0.2630) loss 2.9620 (3.0044) grad_norm 2.5727 (3.4366) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [205/300][910/1251] eta 0:01:30 lr 0.000262 wd 0.0500 time 0.2298 (0.2658) data time 0.0007 (0.0036) model time 0.2291 (0.2622) loss 3.2926 (3.0079) grad_norm 3.5219 (3.4297) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][920/1251] eta 0:01:27 lr 0.000262 wd 0.0500 time 0.2217 (0.2650) data time 0.0010 (0.0035) model time 0.2208 (0.2615) loss 2.5488 (3.0088) grad_norm 3.3728 (3.4236) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][930/1251] eta 0:01:24 lr 0.000262 wd 0.0500 time 0.2182 (0.2643) data time 0.0007 (0.0035) model time 0.2174 (0.2608) loss 2.3408 (3.0037) grad_norm 3.2332 (3.4171) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][940/1251] eta 0:01:21 lr 0.000262 wd 0.0500 time 0.2406 (0.2636) data time 0.0009 (0.0034) model time 0.2397 (0.2602) loss 3.4562 (2.9972) grad_norm 3.0681 (3.4159) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][950/1251] eta 0:01:19 lr 0.000262 wd 0.0500 time 0.2261 (0.2629) data time 0.0007 (0.0034) model time 0.2254 (0.2595) loss 3.4340 (2.9966) grad_norm 3.3411 (3.4129) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][960/1251] eta 0:01:16 lr 0.000262 wd 0.0500 time 0.2311 (0.2624) data time 0.0009 (0.0034) model time 0.2302 (0.2590) loss 3.2229 (2.9986) grad_norm 2.8655 (3.4083) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][970/1251] eta 0:01:13 lr 0.000262 wd 0.0500 time 0.2217 (0.2617) data time 0.0009 (0.0033) model time 0.2208 (0.2584) loss 3.1848 (2.9980) grad_norm 3.2522 (3.4159) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-28 01:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][980/1251] eta 0:01:10 lr 0.000262 wd 0.0500 time 0.2288 (0.2611) data time 0.0007 (0.0033) model time 0.2281 (0.2578) loss 3.0725 (3.0010) grad_norm 3.2172 (3.4156) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][990/1251] eta 0:01:07 lr 0.000262 wd 0.0500 time 0.2257 (0.2605) data time 0.0008 (0.0032) model time 0.2249 (0.2573) loss 1.9476 (2.9991) grad_norm 2.7407 (3.4046) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-28 01:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-28 01:11:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-28 01:11:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-28 01:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-28 01:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-28 01:13:35 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-28 01:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-28 01:13:47 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-28 01:13:48 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-28 01:13:49 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-28 01:13:50 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 205) [2024-08-28 01:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-28 01:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1000/1251] eta 0:33:58 lr 0.000262 wd 0.0500 time 1.2436 (8.1215) data time 0.0008 (0.4356) model time 1.2427 (7.6859) loss 3.9526 (4.0514) grad_norm 2.7503 (8.3521) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-28 01:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1010/1251] eta 0:06:15 lr 0.000262 wd 0.0500 time 0.2338 (1.5598) data time 0.0008 (0.0736) model time 0.2330 (1.4863) loss 2.7267 (3.3504) grad_norm 2.8430 (5.1535) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1020/1251] eta 0:03:42 lr 0.000262 wd 0.0500 time 0.2361 (0.9625) data time 0.0010 (0.0407) model time 0.2350 (0.9218) loss 3.4409 (3.3594) grad_norm 3.6801 (4.3863) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1030/1251] eta 0:02:42 lr 0.000262 wd 0.0500 time 0.2394 (0.7375) data time 0.0008 (0.0283) model time 0.2386 (0.7091) loss 3.3384 (3.3495) grad_norm 2.4154 (4.7696) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1040/1251] eta 0:02:10 lr 0.000262 wd 0.0500 time 0.2355 (0.6203) data time 0.0013 (0.0219) model time 0.2342 (0.5984) loss 3.2707 (3.2771) grad_norm 3.0787 (4.4265) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[205/300][1050/1251] eta 0:01:50 lr 0.000261 wd 0.0500 time 0.2526 (0.5495) data time 0.0008 (0.0179) model time 0.2519 (0.5317) loss 3.2090 (3.2391) grad_norm 3.1186 (4.2459) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1060/1251] eta 0:01:35 lr 0.000261 wd 0.0500 time 0.2494 (0.5009) data time 0.0008 (0.0151) model time 0.2486 (0.4858) loss 3.3295 (3.2003) grad_norm 3.1249 (4.0809) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1070/1251] eta 0:01:24 lr 0.000261 wd 0.0500 time 0.2357 (0.4652) data time 0.0009 (0.0133) model time 0.2348 (0.4519) loss 3.1585 (3.1511) grad_norm 2.5117 (3.9922) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1080/1251] eta 0:01:14 lr 0.000261 wd 0.0500 time 0.2430 (0.4381) data time 0.0012 (0.0119) model time 0.2418 (0.4262) loss 3.3067 (3.1242) grad_norm 2.9485 (3.8703) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1090/1251] eta 0:01:07 lr 0.000261 wd 0.0500 time 0.2385 (0.4170) data time 0.0009 (0.0107) model time 0.2376 (0.4063) loss 2.3452 (3.1044) grad_norm 3.0015 (3.8062) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1100/1251] eta 0:01:00 lr 0.000261 wd 0.0500 time 0.2499 (0.4007) data time 0.0009 (0.0098) model time 0.2490 (0.3909) loss 3.8598 (3.1298) grad_norm 3.1803 (3.7508) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1110/1251] eta 0:00:54 lr 0.000261 wd 0.0500 time 0.2350 (0.3872) data time 0.0010 (0.0090) model time 0.2340 (0.3782) loss 3.1842 (3.1249) grad_norm 3.1640 (3.7439) loss_scale 512.0000 
(512.0000) mem 7379MB [2024-08-28 01:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1120/1251] eta 0:00:49 lr 0.000261 wd 0.0500 time 0.2434 (0.3753) data time 0.0009 (0.0084) model time 0.2426 (0.3670) loss 3.3673 (3.1145) grad_norm 2.7576 (3.7069) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1130/1251] eta 0:00:44 lr 0.000261 wd 0.0500 time 0.2368 (0.3656) data time 0.0011 (0.0079) model time 0.2357 (0.3577) loss 3.1570 (3.1011) grad_norm 3.1175 (3.6914) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1140/1251] eta 0:00:39 lr 0.000261 wd 0.0500 time 0.2504 (0.3573) data time 0.0010 (0.0075) model time 0.2494 (0.3498) loss 2.9483 (3.0826) grad_norm 2.8634 (3.6657) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1150/1251] eta 0:00:35 lr 0.000261 wd 0.0500 time 0.2458 (0.3501) data time 0.0010 (0.0071) model time 0.2448 (0.3431) loss 3.3075 (3.0758) grad_norm 2.6472 (3.7042) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1160/1251] eta 0:00:31 lr 0.000261 wd 0.0500 time 0.2406 (0.3438) data time 0.0013 (0.0068) model time 0.2393 (0.3370) loss 3.5828 (3.0860) grad_norm 3.7652 (3.7277) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1170/1251] eta 0:00:27 lr 0.000261 wd 0.0500 time 0.2544 (0.3381) data time 0.0010 (0.0065) model time 0.2534 (0.3316) loss 2.6170 (3.0742) grad_norm 3.5266 (3.7065) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1180/1251] eta 0:00:23 lr 0.000261 wd 0.0500 time 0.2482 (0.3329) data time 0.0010 (0.0062) 
model time 0.2472 (0.3267) loss 3.0910 (3.0575) grad_norm 3.5510 (3.6802) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1190/1251] eta 0:00:20 lr 0.000261 wd 0.0500 time 0.2475 (0.3284) data time 0.0011 (0.0059) model time 0.2465 (0.3225) loss 3.2374 (3.0537) grad_norm 3.7529 (3.6470) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1200/1251] eta 0:00:16 lr 0.000261 wd 0.0500 time 0.2416 (0.3246) data time 0.0008 (0.0058) model time 0.2409 (0.3188) loss 3.1653 (3.0430) grad_norm 3.3389 (3.6123) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1210/1251] eta 0:00:13 lr 0.000261 wd 0.0500 time 0.2423 (0.3207) data time 0.0010 (0.0055) model time 0.2413 (0.3152) loss 3.2427 (3.0413) grad_norm 3.6843 (3.5976) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1220/1251] eta 0:00:09 lr 0.000261 wd 0.0500 time 0.2536 (0.3174) data time 0.0008 (0.0053) model time 0.2528 (0.3120) loss 3.7247 (3.0422) grad_norm 2.5700 (3.5602) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1230/1251] eta 0:00:06 lr 0.000261 wd 0.0500 time 0.2525 (0.3141) data time 0.0011 (0.0052) model time 0.2514 (0.3089) loss 3.3228 (3.0370) grad_norm 3.0279 (3.5502) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [205/300][1240/1251] eta 0:00:03 lr 0.000261 wd 0.0500 time 0.2249 (0.3108) data time 0.0005 (0.0050) model time 0.2244 (0.3057) loss 3.2910 (3.0386) grad_norm 2.6168 (3.5691) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[205/300][1250/1251] eta 0:00:00 lr 0.000261 wd 0.0500 time 0.2252 (0.3074) data time 0.0005 (0.0048) model time 0.2247 (0.3026) loss 3.5624 (3.0287) grad_norm 3.2395 (3.5834) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 205 training takes 0:01:17 [2024-08-28 01:15:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-28 01:15:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-28 01:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.452 (0.452) Loss 0.4082 (0.4082) Acc@1 91.992 (91.992) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-28 01:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.111) Loss 0.6592 (0.6547) Acc@1 87.500 (86.115) Acc@5 97.656 (97.381) Mem 7379MB [2024-08-28 01:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.095) Loss 0.9673 (0.6824) Acc@1 76.855 (84.998) Acc@5 94.922 (97.354) Mem 7379MB [2024-08-28 01:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.089) Loss 1.1592 (0.7799) Acc@1 72.559 (82.737) Acc@5 92.188 (96.239) Mem 7379MB [2024-08-28 01:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0205 (0.8312) Acc@1 77.246 (81.421) Acc@5 93.457 (95.746) Mem 7379MB [2024-08-28 01:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.050 Acc@5 95.726 [2024-08-28 01:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.0% [2024-08-28 01:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.783 (0.783) Loss 0.3857 (0.3857) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7379MB [2024-08-28 01:15:21 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.143) Loss 0.5894 (0.6072) Acc@1 88.672 (87.385) Acc@5 97.559 (97.665) Mem 7379MB [2024-08-28 01:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.112) Loss 0.8711 (0.6343) Acc@1 77.930 (86.263) Acc@5 96.094 (97.666) Mem 7379MB [2024-08-28 01:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.101) Loss 1.0918 (0.7182) Acc@1 73.730 (84.262) Acc@5 92.773 (96.768) Mem 7379MB [2024-08-28 01:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 0.9775 (0.7610) Acc@1 75.977 (83.022) Acc@5 94.336 (96.291) Mem 7379MB [2024-08-28 01:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.598 Acc@5 96.294 [2024-08-28 01:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-08-28 01:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.60% [2024-08-28 01:15:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-28 01:15:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-28 01:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][0/1251] eta 0:16:38 lr 0.000261 wd 0.0500 time 0.7980 (0.7980) data time 0.3867 (0.3867) model time 0.0000 (0.0000) loss 2.7780 (2.7780) grad_norm 2.4394 (2.4394) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-28 01:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-28 01:15:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-28 01:15:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-29 21:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 21:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 21:40:03 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 21:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 21:45:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 21:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 21:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 21:54:20 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 21:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 21:54:27 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 21:54:28 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 21:54:30 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 21:54:30 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 21:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 21:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 22:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 22:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 22:03:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 22:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 22:03:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 22:03:52 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 22:03:53 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 22:03:53 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 22:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 22:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][10/1251] eta 4:09:52 lr 0.000261 wd 0.0500 time 12.0814 (12.0814) data time 0.8733 (0.8733) model time 0.0000 (0.0000) loss 3.4351 (3.4351) grad_norm 4.6483 (4.6483) loss_scale 512.0000 (512.0000) mem 20033MB [2024-08-29 22:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][20/1251] eta 0:26:54 lr 0.000261 wd 0.0500 time 0.2228 (1.3114) data time 0.0009 (0.0802) model time 0.0000 (0.0000) loss 2.2718 (3.2031) grad_norm 2.7505 (3.2651) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][30/1251] eta 0:16:09 lr 0.000261 wd 0.0500 time 0.2209 (0.7942) data time 0.0010 (0.0425) model time 0.0000 (0.0000) loss 3.2860 (3.1918) grad_norm 2.4149 (3.5126) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][40/1251] eta 0:12:18 lr 0.000261 wd 0.0500 time 0.2205 (0.6099) data time 0.0007 (0.0291) model time 0.0000 (0.0000) loss 2.1622 (3.2233) grad_norm 3.1442 (3.3743) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][50/1251] eta 0:10:19 lr 0.000261 wd 0.0500 time 0.2264 (0.5159) data time 0.0008 (0.0222) model time 0.0000 (0.0000) loss 2.9826 (3.1850) grad_norm 4.7397 (3.4909) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[206/300][60/1251] eta 0:09:06 lr 0.000260 wd 0.0500 time 0.2239 (0.4587) data time 0.0007 (0.0180) model time 0.2232 (0.2234) loss 3.4024 (3.1759) grad_norm 3.3204 (3.6166) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][70/1251] eta 0:08:15 lr 0.000260 wd 0.0500 time 0.2174 (0.4200) data time 0.0011 (0.0152) model time 0.2163 (0.2225) loss 2.9219 (3.1339) grad_norm 2.2952 (3.5167) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][80/1251] eta 0:07:39 lr 0.000260 wd 0.0500 time 0.2187 (0.3926) data time 0.0012 (0.0132) model time 0.2175 (0.2233) loss 2.8678 (3.1049) grad_norm 2.3715 (3.4619) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][90/1251] eta 0:07:12 lr 0.000260 wd 0.0500 time 0.2228 (0.3722) data time 0.0008 (0.0117) model time 0.2220 (0.2240) loss 2.4847 (3.0890) grad_norm 3.1188 (3.4728) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][100/1251] eta 0:06:49 lr 0.000260 wd 0.0500 time 0.2209 (0.3557) data time 0.0007 (0.0105) model time 0.2202 (0.2235) loss 3.4865 (3.0917) grad_norm 2.7470 (3.4327) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-29 22:04:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-29 22:04:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-29 22:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 22:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 22:16:43 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 22:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 22:16:54 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-29 22:16:55 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 22:16:57 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 22:16:57 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 22:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 22:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][110/1251] eta 0:59:24 lr 0.000260 wd 0.0500 time 0.2354 (3.1243) data time 0.0008 (0.1198) model time 0.2346 (3.0045) loss 3.7810 (3.4072) grad_norm 3.7861 (4.9901) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][120/1251] eta 0:22:40 lr 0.000260 wd 0.0500 time 0.2382 (1.2032) data time 0.0012 (0.0407) model time 0.2371 (1.1625) loss 3.3770 (3.3231) grad_norm 3.2597 (3.8539) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][130/1251] eta 0:15:19 lr 0.000260 wd 0.0500 time 0.2410 (0.8206) data time 0.0009 (0.0248) model time 0.2400 (0.7958) loss 3.5490 
(3.2878) grad_norm 3.4540 (3.4716) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][140/1251] eta 0:12:08 lr 0.000260 wd 0.0500 time 0.2471 (0.6553) data time 0.0009 (0.0180) model time 0.2462 (0.6373) loss 3.2173 (3.2425) grad_norm 4.3188 (3.6108) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-29 22:17:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-29 22:17:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-29 22:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 22:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 22:35:11 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 22:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 22:35:24 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 22:35:25 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 22:35:26 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 22:35:26 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 22:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 22:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 22:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 22:37:21 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 22:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 22:37:30 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 22:37:31 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 22:37:32 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 22:37:32 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 22:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 22:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][150/1251] eta 4:11:06 lr 0.000260 wd 0.0500 time 13.6840 (13.6840) data time 0.7664 (0.7664) model time 12.9176 (12.9176) loss 3.4963 (3.4963) grad_norm 3.0424 (3.0424) loss_scale 512.0000 (512.0000) mem 20034MB [2024-08-29 22:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][160/1251] eta 0:26:53 lr 0.000260 wd 0.0500 time 0.2485 (1.4791) data time 0.0011 (0.0707) model time 0.2474 (1.4084) loss 2.4737 (3.2688) grad_norm 3.6312 (3.5280) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][170/1251] eta 0:16:01 lr 0.000260 wd 0.0500 time 0.2352 (0.8890) data time 0.0012 (0.0376) model time 0.2339 (0.8514) loss 3.0022 (3.1873) grad_norm 2.8471 (3.4810) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][180/1251] eta 0:12:07 lr 0.000260 wd 0.0500 time 0.2373 (0.6790) data time 0.0008 (0.0259) model time 0.2366 (0.6531) loss 2.1088 (3.2267) grad_norm 3.3268 (3.4809) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][190/1251] eta 0:10:06 lr 0.000260 wd 0.0500 time 0.2371 (0.5714) data time 0.0012 (0.0198) model time 0.2360 (0.5516) loss 2.7251 (3.1713) grad_norm 2.8204 (3.3807) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[206/300][200/1251] eta 0:08:52 lr 0.000260 wd 0.0500 time 0.2406 (0.5064) data time 0.0010 (0.0162) model time 0.2396 (0.4902) loss 3.5761 (3.1557) grad_norm 2.5431 (3.2934) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][210/1251] eta 0:08:01 lr 0.000260 wd 0.0500 time 0.2440 (0.4622) data time 0.0011 (0.0137) model time 0.2429 (0.4485) loss 3.1514 (3.1250) grad_norm 2.8584 (3.3725) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][220/1251] eta 0:07:24 lr 0.000260 wd 0.0500 time 0.2427 (0.4310) data time 0.0010 (0.0119) model time 0.2417 (0.4191) loss 2.5309 (3.0885) grad_norm 2.8562 (3.3646) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][230/1251] eta 0:06:56 lr 0.000260 wd 0.0500 time 0.2406 (0.4076) data time 0.0014 (0.0106) model time 0.2392 (0.3970) loss 2.7174 (3.0699) grad_norm 2.4183 (3.3490) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][240/1251] eta 0:06:33 lr 0.000260 wd 0.0500 time 0.2342 (0.3892) data time 0.0010 (0.0095) model time 0.2332 (0.3797) loss 3.3550 (3.0563) grad_norm 3.8824 (3.9769) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][250/1251] eta 0:06:14 lr 0.000260 wd 0.0500 time 0.2459 (0.3744) data time 0.0007 (0.0087) model time 0.2452 (0.3657) loss 3.3962 (3.0591) grad_norm 3.3421 (3.9949) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][260/1251] eta 0:05:59 lr 0.000260 wd 0.0500 time 0.2364 (0.3625) data time 0.0011 (0.0080) model time 0.2353 (0.3545) loss 2.2520 (3.0565) grad_norm 3.5635 (3.9555) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-29 22:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][270/1251] eta 0:05:45 lr 0.000260 wd 0.0500 time 0.2492 (0.3524) data time 0.0008 (0.0075) model time 0.2484 (0.3449) loss 1.9199 (3.0636) grad_norm 2.8082 (3.9105) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][280/1251] eta 0:05:33 lr 0.000260 wd 0.0500 time 0.2478 (0.3439) data time 0.0010 (0.0070) model time 0.2468 (0.3369) loss 3.5251 (3.0552) grad_norm 3.0759 (3.8603) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][290/1251] eta 0:05:23 lr 0.000260 wd 0.0500 time 0.2433 (0.3365) data time 0.0008 (0.0066) model time 0.2425 (0.3299) loss 3.0358 (3.0378) grad_norm 2.5991 (3.8604) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][300/1251] eta 0:05:13 lr 0.000260 wd 0.0500 time 0.2424 (0.3301) data time 0.0010 (0.0062) model time 0.2413 (0.3239) loss 2.5741 (3.0379) grad_norm 3.1006 (3.8074) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][310/1251] eta 0:05:05 lr 0.000260 wd 0.0500 time 0.2348 (0.3245) data time 0.0011 (0.0059) model time 0.2338 (0.3186) loss 3.3682 (3.0456) grad_norm 2.2919 (3.7242) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][320/1251] eta 0:04:57 lr 0.000259 wd 0.0500 time 0.2358 (0.3194) data time 0.0010 (0.0056) model time 0.2348 (0.3138) loss 3.2264 (3.0436) grad_norm 2.6343 (3.6622) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][330/1251] eta 0:04:50 lr 0.000259 wd 0.0500 time 0.2333 (0.3149) data time 0.0010 (0.0053) model time 0.2323 
(0.3096) loss 3.1848 (3.0324) grad_norm 2.5660 (3.6355) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][340/1251] eta 0:04:43 lr 0.000259 wd 0.0500 time 0.2368 (0.3110) data time 0.0011 (0.0051) model time 0.2357 (0.3059) loss 2.6560 (3.0296) grad_norm 2.4164 (3.6030) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][350/1251] eta 0:04:36 lr 0.000259 wd 0.0500 time 0.2470 (0.3074) data time 0.0010 (0.0049) model time 0.2459 (0.3025) loss 2.8303 (3.0209) grad_norm 3.6268 (3.5961) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][360/1251] eta 0:04:31 lr 0.000259 wd 0.0500 time 0.2379 (0.3042) data time 0.0010 (0.0047) model time 0.2369 (0.2995) loss 3.9684 (3.0195) grad_norm 7.8783 (3.5998) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][370/1251] eta 0:04:25 lr 0.000259 wd 0.0500 time 0.2400 (0.3012) data time 0.0007 (0.0046) model time 0.2393 (0.2967) loss 3.4766 (3.0117) grad_norm 2.5383 (3.5846) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][380/1251] eta 0:04:20 lr 0.000259 wd 0.0500 time 0.2530 (0.2986) data time 0.0009 (0.0044) model time 0.2521 (0.2942) loss 1.7684 (3.0055) grad_norm 2.5142 (3.5726) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][390/1251] eta 0:04:14 lr 0.000259 wd 0.0500 time 0.2452 (0.2961) data time 0.0008 (0.0043) model time 0.2444 (0.2918) loss 2.9819 (3.0030) grad_norm 2.7126 (3.5412) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][400/1251] eta 0:04:10 
lr 0.000259 wd 0.0500 time 0.2369 (0.2938) data time 0.0008 (0.0042) model time 0.2361 (0.2896) loss 3.1026 (2.9976) grad_norm 2.4380 (3.5191) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 22:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-29 22:38:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-29 22:38:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-29 22:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 22:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 22:56:12 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 22:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 22:56:20 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 22:56:21 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 22:56:22 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 22:56:22 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 22:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 23:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 23:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 23:16:59 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 23:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 23:17:14 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 23:17:16 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 23:17:17 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 23:17:17 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 23:17:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 23:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][410/1251] eta 0:42:20 lr 0.000259 wd 0.0500 time 0.2371 (3.0206) data time 0.0008 (0.1980) model time 0.2363 (2.8226) loss 3.5860 (3.5049) grad_norm 3.2598 (2.9392) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][420/1251] eta 0:16:09 lr 0.000259 wd 0.0500 time 0.2450 (1.1671) data time 0.0011 (0.0667) model time 0.2440 (1.1003) loss 3.2824 (3.1973) grad_norm 3.4579 (3.1928) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][430/1251] eta 0:10:53 lr 0.000259 wd 0.0500 time 0.2390 (0.7963) data time 0.0012 (0.0405) model time 0.2378 (0.7558) loss 3.3280 (3.1956) grad_norm 2.9270 (3.1538) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][440/1251] eta 0:08:36 lr 0.000259 wd 0.0500 time 0.2372 (0.6367) data time 0.0010 (0.0292) model time 0.2362 (0.6075) loss 3.3017 (3.1731) grad_norm 3.3655 (3.1744) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][450/1251] eta 0:07:19 lr 0.000259 wd 0.0500 time 0.2422 (0.5485) data time 0.0009 (0.0230) model time 0.2413 (0.5255) loss 3.3317 (3.1530) grad_norm 5.0389 (3.1723) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[206/300][460/1251] eta 0:06:29 lr 0.000259 wd 0.0500 time 0.2413 (0.4921) data time 0.0008 (0.0190) model time 0.2405 (0.4731) loss 2.5088 (3.1278) grad_norm 3.0585 (3.4905) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][470/1251] eta 0:05:53 lr 0.000259 wd 0.0500 time 0.2374 (0.4529) data time 0.0013 (0.0162) model time 0.2361 (0.4367) loss 3.4031 (3.1189) grad_norm 2.5853 (3.6774) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][480/1251] eta 0:05:27 lr 0.000259 wd 0.0500 time 0.2359 (0.4243) data time 0.0010 (0.0142) model time 0.2349 (0.4101) loss 2.3996 (3.0755) grad_norm 2.7045 (3.5685) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][490/1251] eta 0:05:06 lr 0.000259 wd 0.0500 time 0.2389 (0.4024) data time 0.0009 (0.0127) model time 0.2380 (0.3897) loss 2.8883 (3.0640) grad_norm 3.7638 (3.4976) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][500/1251] eta 0:04:49 lr 0.000259 wd 0.0500 time 0.2352 (0.3850) data time 0.0011 (0.0115) model time 0.2341 (0.3735) loss 3.1115 (3.0478) grad_norm 3.1047 (3.4485) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][510/1251] eta 0:04:34 lr 0.000259 wd 0.0500 time 0.2481 (0.3711) data time 0.0009 (0.0105) model time 0.2471 (0.3606) loss 2.6094 (3.0716) grad_norm 2.1721 (3.4270) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][520/1251] eta 0:04:22 lr 0.000259 wd 0.0500 time 0.2332 (0.3597) data time 0.0011 (0.0097) model time 0.2321 (0.3500) loss 2.1539 (3.0549) grad_norm 2.9387 (3.3761) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-29 23:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][530/1251] eta 0:04:12 lr 0.000259 wd 0.0500 time 0.2349 (0.3499) data time 0.0007 (0.0090) model time 0.2342 (0.3410) loss 2.9337 (3.0653) grad_norm 3.0629 (3.3433) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][540/1251] eta 0:04:02 lr 0.000259 wd 0.0500 time 0.2408 (0.3416) data time 0.0009 (0.0084) model time 0.2399 (0.3332) loss 3.1671 (3.0591) grad_norm 2.2161 (3.3534) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][550/1251] eta 0:03:54 lr 0.000259 wd 0.0500 time 0.2347 (0.3345) data time 0.0009 (0.0079) model time 0.2338 (0.3266) loss 3.3175 (3.0488) grad_norm 4.7962 (3.3832) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][560/1251] eta 0:03:46 lr 0.000259 wd 0.0500 time 0.2451 (0.3283) data time 0.0011 (0.0075) model time 0.2440 (0.3208) loss 3.1418 (3.0465) grad_norm 2.8015 (3.3665) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][570/1251] eta 0:03:39 lr 0.000259 wd 0.0500 time 0.2389 (0.3227) data time 0.0009 (0.0071) model time 0.2380 (0.3156) loss 3.4635 (3.0527) grad_norm 3.1106 (3.3338) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][580/1251] eta 0:03:33 lr 0.000258 wd 0.0500 time 0.2340 (0.3179) data time 0.0009 (0.0068) model time 0.2332 (0.3111) loss 2.9662 (3.0470) grad_norm 4.2838 (3.3210) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][590/1251] eta 0:03:27 lr 0.000258 wd 0.0500 time 0.2436 (0.3137) data time 0.0011 (0.0065) model time 0.2425 
(0.3072) loss 3.3141 (3.0418) grad_norm 3.3426 (3.3346) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][600/1251] eta 0:03:21 lr 0.000258 wd 0.0500 time 0.2411 (0.3097) data time 0.0011 (0.0062) model time 0.2400 (0.3035) loss 2.0747 (3.0333) grad_norm 2.5397 (3.3547) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][610/1251] eta 0:03:16 lr 0.000258 wd 0.0500 time 0.2311 (0.3061) data time 0.0010 (0.0060) model time 0.2301 (0.3002) loss 2.3698 (3.0232) grad_norm 5.1903 (3.3495) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][620/1251] eta 0:03:11 lr 0.000258 wd 0.0500 time 0.2351 (0.3030) data time 0.0010 (0.0057) model time 0.2341 (0.2972) loss 2.7119 (3.0149) grad_norm 2.7863 (3.3325) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][630/1251] eta 0:03:06 lr 0.000258 wd 0.0500 time 0.2409 (0.3001) data time 0.0009 (0.0055) model time 0.2400 (0.2945) loss 3.1078 (3.0129) grad_norm 2.5105 (3.3033) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][640/1251] eta 0:03:01 lr 0.000258 wd 0.0500 time 0.2367 (0.2976) data time 0.0010 (0.0054) model time 0.2357 (0.2923) loss 3.1503 (3.0054) grad_norm 2.6468 (3.2842) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][650/1251] eta 0:02:57 lr 0.000258 wd 0.0500 time 0.2496 (0.2952) data time 0.0012 (0.0052) model time 0.2484 (0.2900) loss 3.0442 (3.0046) grad_norm 2.7690 (3.2937) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][660/1251] eta 0:02:53 
lr 0.000258 wd 0.0500 time 0.2345 (0.2930) data time 0.0008 (0.0050) model time 0.2336 (0.2880) loss 2.7470 (2.9938) grad_norm 3.0594 (3.2919) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][670/1251] eta 0:02:49 lr 0.000258 wd 0.0500 time 0.2320 (0.2910) data time 0.0009 (0.0049) model time 0.2310 (0.2861) loss 2.3565 (2.9898) grad_norm 2.8064 (3.3082) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][680/1251] eta 0:02:45 lr 0.000258 wd 0.0500 time 0.2435 (0.2891) data time 0.0009 (0.0047) model time 0.2426 (0.2843) loss 3.2658 (2.9919) grad_norm 3.4946 (3.3673) loss_scale 1024.0000 (513.8618) mem 7377MB [2024-08-29 23:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][690/1251] eta 0:02:41 lr 0.000258 wd 0.0500 time 0.2478 (0.2874) data time 0.0011 (0.0046) model time 0.2467 (0.2827) loss 2.8088 (2.9862) grad_norm 2.1750 (3.3672) loss_scale 1024.0000 (531.7614) mem 7377MB [2024-08-29 23:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][700/1251] eta 0:02:37 lr 0.000258 wd 0.0500 time 0.2384 (0.2865) data time 0.0010 (0.0045) model time 0.2374 (0.2820) loss 3.3805 (2.9833) grad_norm 2.6074 (3.3698) loss_scale 1024.0000 (548.4475) mem 7377MB [2024-08-29 23:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][710/1251] eta 0:02:34 lr 0.000258 wd 0.0500 time 0.2307 (0.2849) data time 0.0011 (0.0044) model time 0.2297 (0.2805) loss 2.5628 (2.9785) grad_norm 2.4928 (3.3613) loss_scale 1024.0000 (564.0393) mem 7377MB [2024-08-29 23:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-29 23:18:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-29 23:18:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-29 23:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 23:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 23:20:51 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 23:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 23:21:01 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-29 23:21:02 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 23:21:03 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 23:21:03 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 23:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 23:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][720/1251] eta 0:18:32 lr 0.000258 wd 0.0500 time 0.2383 (2.0945) data time 0.0007 (0.1989) model time 0.2376 (1.8956) loss 3.7461 (3.4683) grad_norm 7.7505 (3.7159) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][730/1251] eta 0:09:15 lr 0.000258 wd 0.0500 time 0.2409 (1.0668) data time 0.0010 (0.0890) model time 0.2399 (0.9778) loss 3.8517 (3.2890) grad_norm 2.9382 (3.5972) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [206/300][740/1251] eta 0:06:34 lr 0.000258 wd 0.0500 time 0.2393 (0.7729) data time 0.0009 (0.0576) model time 0.2384 (0.7153) loss 3.4625 (3.3077) grad_norm 4.6715 (3.7611) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][750/1251] eta 0:05:17 lr 0.000258 wd 0.0500 time 0.2394 (0.6338) data time 0.0012 (0.0427) model time 0.2383 (0.5910) loss 2.9501 (3.2393) grad_norm 4.7221 (3.6764) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][760/1251] eta 0:04:31 lr 0.000258 wd 0.0500 time 0.2464 (0.5523) data time 0.0007 (0.0340) model time 0.2457 (0.5183) loss 3.4557 (3.1932) grad_norm 2.7502 (3.5544) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][770/1251] eta 0:03:59 lr 0.000258 wd 0.0500 time 0.2411 (0.4988) data time 0.0009 (0.0283) model time 0.2402 (0.4704) loss 2.5853 (3.1739) grad_norm 2.6405 (3.6215) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][780/1251] eta 0:03:37 lr 0.000258 wd 0.0500 time 0.2359 (0.4610) data time 0.0008 (0.0243) model time 0.2351 (0.4366) loss 2.4767 (3.1440) grad_norm 3.5101 (3.5654) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][790/1251] eta 0:03:19 lr 0.000258 wd 0.0500 time 0.2363 (0.4330) data time 0.0008 (0.0214) model time 0.2355 (0.4116) loss 2.4724 (3.1253) grad_norm 2.5984 (3.5058) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][800/1251] eta 0:03:05 lr 0.000258 wd 0.0500 time 0.2436 (0.4113) data time 0.0014 (0.0191) model time 0.2422 (0.3922) loss 3.2513 (3.0958) grad_norm 4.3321 (3.4696) loss_scale 
1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][810/1251] eta 0:02:53 lr 0.000258 wd 0.0500 time 0.2380 (0.3938) data time 0.0010 (0.0172) model time 0.2370 (0.3766) loss 3.7023 (3.0997) grad_norm 3.2656 (3.4927) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-29 23:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-29 23:21:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-29 23:21:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-29 23:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 23:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 23:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 23:30:41 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 23:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 23:30:50 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 23:30:52 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 23:30:53 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 23:30:53 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 23:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 23:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][820/1251] eta 0:13:49 lr 0.000258 wd 0.0500 time 0.2377 (1.9251) data time 0.0008 (0.1233) model time 0.2369 (1.8018) loss 3.3199 (3.5504) grad_norm 3.5145 (3.6487) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-29 23:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][830/1251] eta 0:06:56 lr 0.000258 wd 0.0500 time 0.2327 (0.9883) data time 0.0009 (0.0555) model time 0.2318 (0.9328) loss 3.6017 (3.3552) grad_norm 3.6884 (3.5423) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-29 23:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][840/1251] eta 0:04:56 lr 0.000257 wd 0.0500 time 0.2420 (0.7208) data time 0.0012 (0.0360) model time 0.2408 (0.6848) loss 3.1492 (3.3592) grad_norm 3.7177 (inf) loss_scale 512.0000 (859.4286) mem 7381MB [2024-08-29 23:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][850/1251] eta 0:03:58 lr 0.000257 wd 0.0500 time 0.2341 (0.5940) data time 0.0009 (0.0268) model time 0.2332 (0.5671) loss 3.5304 (3.3078) grad_norm 4.1718 (inf) loss_scale 512.0000 (768.0000) mem 7381MB [2024-08-29 23:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][860/1251] eta 0:03:23 lr 0.000257 wd 0.0500 time 0.2311 (0.5202) data time 0.0010 (0.0215) model time 0.2301 (0.4987) loss 3.1154 (3.2505) grad_norm 2.9463 (inf) loss_scale 512.0000 (714.6667) mem 7381MB [2024-08-29 23:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[206/300][870/1251] eta 0:02:59 lr 0.000257 wd 0.0500 time 0.2344 (0.4718) data time 0.0007 (0.0180) model time 0.2337 (0.4539) loss 2.3203 (3.2078) grad_norm 2.8444 (inf) loss_scale 512.0000 (679.7241) mem 7381MB [2024-08-29 23:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][880/1251] eta 0:02:42 lr 0.000257 wd 0.0500 time 0.2435 (0.4375) data time 0.0009 (0.0155) model time 0.2426 (0.4220) loss 2.0011 (3.1619) grad_norm 2.6689 (inf) loss_scale 512.0000 (655.0588) mem 7381MB [2024-08-29 23:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][890/1251] eta 0:02:28 lr 0.000257 wd 0.0500 time 0.2357 (0.4120) data time 0.0009 (0.0136) model time 0.2347 (0.3984) loss 2.6093 (3.1266) grad_norm 3.5256 (inf) loss_scale 512.0000 (636.7179) mem 7381MB [2024-08-29 23:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][900/1251] eta 0:02:17 lr 0.000257 wd 0.0500 time 0.2376 (0.3923) data time 0.0010 (0.0122) model time 0.2366 (0.3801) loss 3.1194 (3.1001) grad_norm 3.2630 (inf) loss_scale 512.0000 (622.5455) mem 7381MB [2024-08-29 23:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][910/1251] eta 0:02:08 lr 0.000257 wd 0.0500 time 0.2342 (0.3765) data time 0.0008 (0.0111) model time 0.2334 (0.3654) loss 3.5725 (3.1011) grad_norm 5.5231 (inf) loss_scale 512.0000 (611.2653) mem 7381MB [2024-08-29 23:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][920/1251] eta 0:02:00 lr 0.000257 wd 0.0500 time 0.2333 (0.3636) data time 0.0008 (0.0101) model time 0.2325 (0.3535) loss 2.3066 (3.1072) grad_norm 4.1991 (inf) loss_scale 512.0000 (602.0741) mem 7381MB [2024-08-29 23:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][930/1251] eta 0:01:53 lr 0.000257 wd 0.0500 time 0.2359 (0.3533) data time 0.0011 (0.0094) model time 0.2348 (0.3439) loss 3.2269 (3.1061) grad_norm 3.2543 (inf) loss_scale 512.0000 (594.4407) mem 7381MB [2024-08-29 
23:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][940/1251] eta 0:01:47 lr 0.000257 wd 0.0500 time 0.2383 (0.3444) data time 0.0008 (0.0087) model time 0.2375 (0.3357) loss 3.1595 (3.0956) grad_norm 2.8006 (inf) loss_scale 512.0000 (588.0000) mem 7381MB [2024-08-29 23:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][950/1251] eta 0:01:41 lr 0.000257 wd 0.0500 time 0.2344 (0.3369) data time 0.0011 (0.0082) model time 0.2333 (0.3287) loss 3.1445 (3.0918) grad_norm 4.5390 (inf) loss_scale 512.0000 (582.4928) mem 7381MB [2024-08-29 23:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-29 23:31:46 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-29 23:31:47 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-29 23:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-29 23:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-29 23:51:43 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-29 23:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-29 23:51:53 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-29 23:51:55 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-29 23:51:56 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-29 23:51:56 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 206) [2024-08-29 23:51:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-29 23:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][960/1251] eta 0:34:41 lr 0.000257 wd 0.0500 time 0.3850 (7.1523) data time 0.0008 (0.3847) model time 0.3842 (6.7676) loss 3.7898 (3.8196) grad_norm 2.1624 (2.9735) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-29 23:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][970/1251] eta 0:06:30 lr 0.000257 wd 0.0500 time 0.2323 (1.3880) data time 0.0008 (0.0650) model time 0.2314 (1.3231) loss 2.7511 (3.2595) grad_norm 2.4232 (2.9278) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][980/1251] eta 0:03:54 lr 0.000257 wd 0.0500 time 0.2379 (0.8656) data time 0.0010 (0.0370) model time 0.2369 (0.8287) loss 3.3749 (3.2428) grad_norm 2.1842 (3.0101) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][990/1251] eta 0:02:54 lr 0.000257 wd 0.0500 time 0.2447 (0.6690) data time 0.0007 (0.0257) model time 0.2440 (0.6432) loss 2.9979 (3.2669) grad_norm 2.8400 (3.0600) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1000/1251] eta 0:02:22 lr 0.000257 wd 0.0500 time 0.2360 (0.5663) data time 0.0010 (0.0198) model time 0.2350 (0.5465) loss 3.0482 (3.2043) grad_norm 4.4876 (3.1356) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[206/300][1010/1251] eta 0:02:01 lr 0.000257 wd 0.0500 time 0.2337 (0.5033) data time 0.0008 (0.0162) model time 0.2329 (0.4871) loss 3.4344 (3.2007) grad_norm 3.8871 (3.2554) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1020/1251] eta 0:01:46 lr 0.000257 wd 0.0500 time 0.2448 (0.4607) data time 0.0008 (0.0138) model time 0.2440 (0.4469) loss 3.4597 (3.1738) grad_norm 2.3759 (3.2585) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1030/1251] eta 0:01:34 lr 0.000257 wd 0.0500 time 0.2303 (0.4298) data time 0.0010 (0.0120) model time 0.2294 (0.4178) loss 2.7987 (3.1274) grad_norm 2.2210 (3.1944) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1040/1251] eta 0:01:25 lr 0.000257 wd 0.0500 time 0.2289 (0.4066) data time 0.0009 (0.0107) model time 0.2279 (0.3959) loss 3.7510 (3.1145) grad_norm 3.6033 (3.2210) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1050/1251] eta 0:01:18 lr 0.000257 wd 0.0500 time 0.2313 (0.3882) data time 0.0007 (0.0096) model time 0.2305 (0.3786) loss 1.8919 (3.0836) grad_norm 2.7098 (3.2165) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1060/1251] eta 0:01:11 lr 0.000257 wd 0.0500 time 0.2392 (0.3737) data time 0.0007 (0.0088) model time 0.2385 (0.3649) loss 3.4064 (3.1068) grad_norm 2.6898 (3.2317) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1070/1251] eta 0:01:05 lr 0.000257 wd 0.0500 time 0.2370 (0.3615) data time 0.0011 (0.0081) model time 0.2359 (0.3535) loss 3.5237 (3.1035) grad_norm 3.2256 (3.2166) loss_scale 512.0000 
(512.0000) mem 7378MB [2024-08-29 23:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1080/1251] eta 0:01:00 lr 0.000257 wd 0.0500 time 0.2442 (0.3516) data time 0.0008 (0.0075) model time 0.2435 (0.3441) loss 3.0606 (3.1006) grad_norm 3.4341 (3.1911) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1090/1251] eta 0:00:55 lr 0.000257 wd 0.0500 time 0.2432 (0.3432) data time 0.0009 (0.0070) model time 0.2423 (0.3362) loss 3.2209 (3.1053) grad_norm 12.7192 (3.2599) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1100/1251] eta 0:00:50 lr 0.000256 wd 0.0500 time 0.2348 (0.3359) data time 0.0010 (0.0066) model time 0.2338 (0.3293) loss 3.1837 (3.0899) grad_norm 2.7664 (3.2509) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1110/1251] eta 0:00:46 lr 0.000256 wd 0.0500 time 0.2331 (0.3295) data time 0.0009 (0.0062) model time 0.2322 (0.3232) loss 3.4266 (3.0841) grad_norm 2.7825 (3.2442) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1120/1251] eta 0:00:42 lr 0.000256 wd 0.0500 time 0.2379 (0.3237) data time 0.0009 (0.0059) model time 0.2370 (0.3178) loss 3.3149 (3.0843) grad_norm 2.7202 (3.2111) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1130/1251] eta 0:00:38 lr 0.000256 wd 0.0500 time 0.2310 (0.3188) data time 0.0012 (0.0056) model time 0.2299 (0.3132) loss 2.6359 (3.0762) grad_norm 2.9188 (3.1967) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1140/1251] eta 0:00:34 lr 0.000256 wd 0.0500 time 0.2433 (0.3144) data time 0.0009 (0.0054) 
model time 0.2425 (0.3090) loss 3.3614 (3.0677) grad_norm 6.7432 (3.2084) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1150/1251] eta 0:00:31 lr 0.000256 wd 0.0500 time 0.2385 (0.3105) data time 0.0010 (0.0051) model time 0.2375 (0.3054) loss 3.2108 (3.0662) grad_norm 3.2312 (3.4493) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1160/1251] eta 0:00:27 lr 0.000256 wd 0.0500 time 0.2424 (0.3071) data time 0.0008 (0.0049) model time 0.2416 (0.3022) loss 3.4405 (3.0514) grad_norm 2.7460 (3.4333) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1170/1251] eta 0:00:24 lr 0.000256 wd 0.0500 time 0.2413 (0.3039) data time 0.0010 (0.0048) model time 0.2403 (0.2991) loss 3.1743 (3.0455) grad_norm 3.2199 (3.4051) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1180/1251] eta 0:00:21 lr 0.000256 wd 0.0500 time 0.2413 (0.3011) data time 0.0008 (0.0046) model time 0.2406 (0.2965) loss 3.8584 (3.0462) grad_norm 2.9217 (3.4066) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1190/1251] eta 0:00:18 lr 0.000256 wd 0.0500 time 0.2449 (0.2984) data time 0.0007 (0.0044) model time 0.2442 (0.2940) loss 3.2344 (3.0431) grad_norm 5.0162 (3.4207) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1200/1251] eta 0:00:15 lr 0.000256 wd 0.0500 time 0.2366 (0.2959) data time 0.0007 (0.0043) model time 0.2359 (0.2916) loss 3.5454 (3.0414) grad_norm 2.9659 (3.4251) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[206/300][1210/1251] eta 0:00:12 lr 0.000256 wd 0.0500 time 0.2253 (0.2936) data time 0.0010 (0.0042) model time 0.2242 (0.2894) loss 3.5500 (3.0320) grad_norm 2.3024 (3.4023) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1220/1251] eta 0:00:09 lr 0.000256 wd 0.0500 time 0.2332 (0.2914) data time 0.0009 (0.0040) model time 0.2323 (0.2874) loss 2.9384 (3.0193) grad_norm 3.3452 (3.4800) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1230/1251] eta 0:00:06 lr 0.000256 wd 0.0500 time 0.2384 (0.2894) data time 0.0009 (0.0039) model time 0.2375 (0.2855) loss 2.7264 (3.0126) grad_norm 3.8843 (3.6524) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1240/1251] eta 0:00:03 lr 0.000256 wd 0.0500 time 0.2229 (0.2874) data time 0.0007 (0.0038) model time 0.2222 (0.2836) loss 2.0562 (3.0102) grad_norm 2.3781 (3.6369) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [206/300][1250/1251] eta 0:00:00 lr 0.000256 wd 0.0500 time 0.2225 (0.2861) data time 0.0007 (0.0037) model time 0.2218 (0.2824) loss 3.0472 (3.0080) grad_norm 2.8583 (3.6168) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-29 23:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 206 training takes 0:01:23 [2024-08-29 23:53:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-29 23:53:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-29 23:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.421 (0.421) Loss 0.4065 (0.4065) Acc@1 92.578 (92.578) Acc@5 98.633 (98.633) Mem 7378MB [2024-08-29 23:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.107) Loss 0.6182 (0.6468) Acc@1 87.500 (86.168) Acc@5 97.559 (97.434) Mem 7378MB [2024-08-29 23:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.094) Loss 0.9639 (0.6806) Acc@1 76.758 (84.859) Acc@5 95.508 (97.387) Mem 7378MB [2024-08-29 23:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.088) Loss 1.1738 (0.7790) Acc@1 72.656 (82.734) Acc@5 91.895 (96.276) Mem 7378MB [2024-08-29 23:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 1.0342 (0.8309) Acc@1 75.098 (81.376) Acc@5 93.750 (95.696) Mem 7378MB [2024-08-29 23:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.000 Acc@5 95.634 [2024-08-29 23:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.0% [2024-08-29 23:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.905 (0.905) Loss 0.3840 (0.3840) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7378MB [2024-08-29 23:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.154) Loss 0.5889 (0.6076) Acc@1 88.770 (87.367) Acc@5 97.656 (97.674) Mem 7378MB [2024-08-29 23:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.118) Loss 0.8706 (0.6345) Acc@1 78.125 (86.263) Acc@5 96.387 (97.680) Mem 7378MB [2024-08-29 23:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.104) Loss 1.0928 (0.7181) Acc@1 73.535 (84.262) Acc@5 93.164 (96.768) Mem 7378MB [2024-08-29 23:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 0.9775 
(0.7608) Acc@1 75.879 (83.041) Acc@5 94.238 (96.289) Mem 7378MB [2024-08-29 23:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.610 Acc@5 96.300 [2024-08-29 23:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-08-29 23:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.61% [2024-08-29 23:53:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-29 23:53:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-29 23:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][0/1251] eta 0:15:26 lr 0.000256 wd 0.0500 time 0.7403 (0.7403) data time 0.4691 (0.4691) model time 0.0000 (0.0000) loss 2.6577 (2.6577) grad_norm 3.2819 (3.2819) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][10/1251] eta 0:05:55 lr 0.000256 wd 0.0500 time 0.2467 (0.2866) data time 0.0008 (0.0450) model time 0.0000 (0.0000) loss 2.8222 (2.7858) grad_norm 4.5285 (4.5304) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][20/1251] eta 0:05:26 lr 0.000256 wd 0.0500 time 0.2352 (0.2650) data time 0.0008 (0.0241) model time 0.0000 (0.0000) loss 3.7064 (2.8876) grad_norm 2.7167 (3.7592) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][30/1251] eta 0:05:14 lr 0.000256 wd 0.0500 time 0.2494 (0.2575) data time 0.0010 (0.0166) model time 0.0000 (0.0000) loss 3.3420 (2.9613) grad_norm 2.9336 (3.5759) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][40/1251] eta 0:05:07 lr 
0.000256 wd 0.0500 time 0.2442 (0.2535) data time 0.0010 (0.0129) model time 0.0000 (0.0000) loss 3.3125 (2.9356) grad_norm 4.3987 (3.5625) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][50/1251] eta 0:05:02 lr 0.000256 wd 0.0500 time 0.2539 (0.2518) data time 0.0010 (0.0106) model time 0.0000 (0.0000) loss 3.4078 (2.9649) grad_norm 3.0765 (3.5124) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][60/1251] eta 0:04:57 lr 0.000256 wd 0.0500 time 0.2521 (0.2502) data time 0.0009 (0.0090) model time 0.2513 (0.2411) loss 3.4808 (2.9698) grad_norm 2.5018 (3.4705) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][70/1251] eta 0:04:53 lr 0.000256 wd 0.0500 time 0.2391 (0.2483) data time 0.0006 (0.0079) model time 0.2385 (0.2384) loss 3.1494 (2.9778) grad_norm 2.2700 (3.4617) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][80/1251] eta 0:04:49 lr 0.000256 wd 0.0500 time 0.2383 (0.2470) data time 0.0008 (0.0070) model time 0.2375 (0.2381) loss 3.1052 (2.9594) grad_norm 2.3614 (3.4438) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][90/1251] eta 0:04:45 lr 0.000256 wd 0.0500 time 0.2389 (0.2463) data time 0.0009 (0.0063) model time 0.2380 (0.2383) loss 1.9089 (2.9381) grad_norm 2.8191 (3.3862) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][100/1251] eta 0:04:43 lr 0.000256 wd 0.0500 time 0.2402 (0.2460) data time 0.0009 (0.0058) model time 0.2394 (0.2391) loss 3.3931 (2.9324) grad_norm 5.0590 (3.3976) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][110/1251] eta 0:04:39 lr 0.000255 wd 0.0500 time 0.2315 (0.2454) data time 0.0010 (0.0054) model time 0.2305 (0.2390) loss 2.8309 (2.9543) grad_norm 2.9157 (3.3799) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][120/1251] eta 0:04:37 lr 0.000255 wd 0.0500 time 0.2436 (0.2450) data time 0.0010 (0.0050) model time 0.2426 (0.2392) loss 3.3230 (2.9735) grad_norm 2.3823 (3.3468) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][130/1251] eta 0:04:34 lr 0.000255 wd 0.0500 time 0.2371 (0.2447) data time 0.0008 (0.0047) model time 0.2363 (0.2392) loss 2.5300 (2.9682) grad_norm 2.7937 (3.3926) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][140/1251] eta 0:04:31 lr 0.000255 wd 0.0500 time 0.2407 (0.2444) data time 0.0007 (0.0044) model time 0.2400 (0.2393) loss 3.5577 (2.9849) grad_norm 3.1759 (3.3831) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][150/1251] eta 0:04:28 lr 0.000255 wd 0.0500 time 0.2398 (0.2441) data time 0.0007 (0.0042) model time 0.2391 (0.2393) loss 3.0467 (2.9932) grad_norm 2.5108 (3.3609) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][160/1251] eta 0:04:26 lr 0.000255 wd 0.0500 time 0.2383 (0.2439) data time 0.0007 (0.0040) model time 0.2376 (0.2393) loss 2.4371 (2.9843) grad_norm 2.6335 (3.3454) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][170/1251] eta 0:04:23 lr 0.000255 wd 0.0500 time 0.2596 (0.2436) data time 0.0007 (0.0038) model time 0.2589 (0.2392) loss 2.3244 (2.9773) 
grad_norm 3.7433 (3.3500) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][180/1251] eta 0:04:20 lr 0.000255 wd 0.0500 time 0.2566 (0.2435) data time 0.0007 (0.0037) model time 0.2558 (0.2393) loss 3.1237 (2.9648) grad_norm 2.4766 (3.3385) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][190/1251] eta 0:04:18 lr 0.000255 wd 0.0500 time 0.2360 (0.2435) data time 0.0007 (0.0036) model time 0.2354 (0.2395) loss 3.2763 (2.9647) grad_norm 3.1994 (3.3123) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][200/1251] eta 0:04:15 lr 0.000255 wd 0.0500 time 0.2480 (0.2434) data time 0.0007 (0.0034) model time 0.2473 (0.2396) loss 2.3240 (2.9729) grad_norm 2.5110 (3.3209) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][210/1251] eta 0:04:13 lr 0.000255 wd 0.0500 time 0.2464 (0.2433) data time 0.0008 (0.0033) model time 0.2457 (0.2396) loss 1.9998 (2.9678) grad_norm 2.6604 (3.3435) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][220/1251] eta 0:04:10 lr 0.000255 wd 0.0500 time 0.2360 (0.2430) data time 0.0006 (0.0032) model time 0.2354 (0.2394) loss 3.3707 (2.9848) grad_norm 3.4219 (3.3396) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][230/1251] eta 0:04:08 lr 0.000255 wd 0.0500 time 0.2400 (0.2429) data time 0.0010 (0.0031) model time 0.2391 (0.2394) loss 2.8648 (2.9773) grad_norm 2.9838 (3.3357) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][240/1251] eta 0:04:05 lr 0.000255 wd 0.0500 time 
0.2404 (0.2428) data time 0.0008 (0.0030) model time 0.2395 (0.2394) loss 3.5788 (2.9745) grad_norm 3.8875 (3.3784) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][250/1251] eta 0:04:02 lr 0.000255 wd 0.0500 time 0.2397 (0.2427) data time 0.0008 (0.0029) model time 0.2388 (0.2393) loss 2.8980 (2.9672) grad_norm 3.5728 (3.3861) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][260/1251] eta 0:04:00 lr 0.000255 wd 0.0500 time 0.2424 (0.2427) data time 0.0009 (0.0029) model time 0.2415 (0.2395) loss 3.4458 (2.9714) grad_norm 3.7881 (3.4161) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][270/1251] eta 0:03:57 lr 0.000255 wd 0.0500 time 0.2394 (0.2426) data time 0.0010 (0.0028) model time 0.2384 (0.2395) loss 2.9019 (2.9780) grad_norm 4.5031 (3.4190) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][280/1251] eta 0:03:55 lr 0.000255 wd 0.0500 time 0.2454 (0.2425) data time 0.0008 (0.0027) model time 0.2446 (0.2394) loss 3.0125 (2.9819) grad_norm 3.7794 (3.4181) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][290/1251] eta 0:03:52 lr 0.000255 wd 0.0500 time 0.2370 (0.2424) data time 0.0010 (0.0027) model time 0.2361 (0.2394) loss 3.4784 (2.9863) grad_norm 3.3758 (3.4203) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][300/1251] eta 0:03:50 lr 0.000255 wd 0.0500 time 0.2359 (0.2423) data time 0.0010 (0.0026) model time 0.2349 (0.2393) loss 1.9832 (2.9859) grad_norm 3.2153 (3.4334) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:53 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [207/300][310/1251] eta 0:03:47 lr 0.000255 wd 0.0500 time 0.2373 (0.2422) data time 0.0009 (0.0026) model time 0.2364 (0.2394) loss 1.9812 (2.9808) grad_norm 2.9375 (3.4659) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][320/1251] eta 0:03:45 lr 0.000255 wd 0.0500 time 0.2395 (0.2421) data time 0.0009 (0.0025) model time 0.2385 (0.2393) loss 3.3088 (2.9824) grad_norm 3.4008 (3.4617) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][330/1251] eta 0:03:42 lr 0.000255 wd 0.0500 time 0.2313 (0.2420) data time 0.0009 (0.0025) model time 0.2304 (0.2393) loss 3.5590 (2.9890) grad_norm 2.9653 (3.4441) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][340/1251] eta 0:03:40 lr 0.000255 wd 0.0500 time 0.2436 (0.2420) data time 0.0009 (0.0024) model time 0.2428 (0.2393) loss 3.7973 (2.9942) grad_norm 2.6084 (3.4335) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][350/1251] eta 0:03:38 lr 0.000255 wd 0.0500 time 0.2401 (0.2420) data time 0.0007 (0.0024) model time 0.2394 (0.2394) loss 3.3717 (2.9894) grad_norm 3.7702 (3.4227) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][360/1251] eta 0:03:35 lr 0.000255 wd 0.0500 time 0.2394 (0.2420) data time 0.0007 (0.0023) model time 0.2387 (0.2394) loss 3.5906 (2.9866) grad_norm 3.1752 (3.4582) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][370/1251] eta 0:03:33 lr 0.000254 wd 0.0500 time 0.2381 (0.2419) data time 0.0010 (0.0023) model time 0.2372 (0.2394) loss 2.5157 (2.9791) grad_norm 4.4619 
(3.4542) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][380/1251] eta 0:03:30 lr 0.000254 wd 0.0500 time 0.2477 (0.2419) data time 0.0008 (0.0023) model time 0.2469 (0.2394) loss 3.4019 (2.9862) grad_norm 2.9468 (3.4432) loss_scale 512.0000 (512.0000) mem 7383MB [2024-08-29 23:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-29 23:55:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-29 23:55:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 00:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 00:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 00:21:54 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 00:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 00:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 00:41:35 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 00:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 00:41:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 00:41:53 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 00:41:54 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 00:41:54 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 207) [2024-08-30 00:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 00:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][390/1251] eta 0:26:42 lr 0.000254 wd 0.0500 time 0.2446 (1.8608) data time 0.0009 (0.0778) model time 0.2437 (1.7830) loss 3.1277 (3.4699) grad_norm 13.8898 (6.2995) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][400/1251] eta 0:14:54 lr 0.000254 wd 0.0500 time 0.2418 (1.0515) data time 0.0008 (0.0395) model time 0.2410 (1.0121) loss 3.4173 (3.3172) grad_norm 5.8912 (4.8535) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][410/1251] eta 0:10:59 lr 0.000254 wd 0.0500 time 0.2500 (0.7838) data time 0.0011 (0.0267) model time 0.2489 (0.7571) loss 3.1795 (3.2833) grad_norm 2.9117 (4.4609) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][420/1251] eta 0:08:59 lr 0.000254 wd 0.0500 time 0.2440 (0.6492) data time 0.0008 (0.0203) model time 0.2431 (0.6290) loss 2.3627 (3.1990) grad_norm 4.8571 (4.2336) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][430/1251] eta 0:07:46 lr 0.000254 wd 0.0500 time 0.2441 (0.5685) data time 0.0010 (0.0165) model time 0.2431 (0.5521) loss 2.6089 (3.1654) grad_norm 5.4439 (4.1753) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[207/300][440/1251] eta 0:06:56 lr 0.000254 wd 0.0500 time 0.2427 (0.5140) data time 0.0007 (0.0139) model time 0.2419 (0.5001) loss 3.1694 (3.1517) grad_norm 3.0457 (4.0571) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][450/1251] eta 0:06:20 lr 0.000254 wd 0.0500 time 0.2513 (0.4754) data time 0.0008 (0.0121) model time 0.2505 (0.4633) loss 2.1102 (3.1061) grad_norm 3.5613 (3.9655) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][460/1251] eta 0:05:53 lr 0.000254 wd 0.0500 time 0.2404 (0.4469) data time 0.0013 (0.0109) model time 0.2391 (0.4360) loss 3.5345 (3.0918) grad_norm 2.8212 (3.9006) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][470/1251] eta 0:05:31 lr 0.000254 wd 0.0500 time 0.2463 (0.4244) data time 0.0009 (0.0098) model time 0.2454 (0.4146) loss 3.6761 (3.0779) grad_norm 2.7435 (3.8433) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][480/1251] eta 0:05:13 lr 0.000254 wd 0.0500 time 0.2340 (0.4066) data time 0.0012 (0.0090) model time 0.2328 (0.3976) loss 2.9481 (3.0755) grad_norm 2.8137 (3.7566) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][490/1251] eta 0:04:58 lr 0.000254 wd 0.0500 time 0.2387 (0.3920) data time 0.0012 (0.0083) model time 0.2375 (0.3838) loss 3.0668 (3.0959) grad_norm 3.8078 (3.8110) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][500/1251] eta 0:04:45 lr 0.000254 wd 0.0500 time 0.2536 (0.3798) data time 0.0009 (0.0077) model time 0.2528 (0.3722) loss 3.5635 (3.1022) grad_norm 2.5426 (3.7355) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-30 00:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][510/1251] eta 0:04:33 lr 0.000254 wd 0.0500 time 0.2362 (0.3693) data time 0.0009 (0.0072) model time 0.2352 (0.3622) loss 2.6763 (3.0784) grad_norm 3.4217 (3.7103) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][520/1251] eta 0:04:23 lr 0.000254 wd 0.0500 time 0.2418 (0.3602) data time 0.0007 (0.0067) model time 0.2411 (0.3534) loss 2.0814 (3.0628) grad_norm 3.7617 (3.6910) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][530/1251] eta 0:04:14 lr 0.000254 wd 0.0500 time 0.2536 (0.3525) data time 0.0011 (0.0064) model time 0.2525 (0.3461) loss 3.2201 (3.0624) grad_norm 3.8178 (3.6993) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][540/1251] eta 0:04:05 lr 0.000254 wd 0.0500 time 0.2499 (0.3459) data time 0.0010 (0.0061) model time 0.2489 (0.3399) loss 3.6076 (3.0676) grad_norm 2.5322 (3.6742) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][550/1251] eta 0:03:58 lr 0.000254 wd 0.0500 time 0.2472 (0.3399) data time 0.0008 (0.0058) model time 0.2464 (0.3341) loss 2.2550 (3.0652) grad_norm 2.2664 (3.6176) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][560/1251] eta 0:03:51 lr 0.000254 wd 0.0500 time 0.2555 (0.3349) data time 0.0008 (0.0055) model time 0.2547 (0.3294) loss 2.4110 (3.0474) grad_norm 2.0373 (3.7323) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][570/1251] eta 0:03:44 lr 0.000254 wd 0.0500 time 0.2485 (0.3302) data time 0.0008 (0.0053) model time 0.2476 
(0.3249) loss 2.6949 (3.0489) grad_norm 3.6115 (3.6934) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][580/1251] eta 0:03:38 lr 0.000254 wd 0.0500 time 0.2560 (0.3260) data time 0.0013 (0.0051) model time 0.2547 (0.3209) loss 3.1122 (3.0336) grad_norm 6.6455 (3.6956) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][590/1251] eta 0:03:32 lr 0.000254 wd 0.0500 time 0.2445 (0.3221) data time 0.0008 (0.0049) model time 0.2437 (0.3172) loss 3.0576 (3.0279) grad_norm 2.5116 (3.6874) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][600/1251] eta 0:03:27 lr 0.000254 wd 0.0500 time 0.2532 (0.3187) data time 0.0011 (0.0047) model time 0.2522 (0.3140) loss 3.1843 (3.0246) grad_norm 2.9285 (3.6497) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][610/1251] eta 0:03:22 lr 0.000254 wd 0.0500 time 0.2425 (0.3153) data time 0.0010 (0.0046) model time 0.2415 (0.3108) loss 3.4063 (3.0258) grad_norm 3.2800 (3.6563) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][620/1251] eta 0:03:17 lr 0.000254 wd 0.0500 time 0.2482 (0.3124) data time 0.0011 (0.0044) model time 0.2470 (0.3079) loss 3.2339 (3.0158) grad_norm 3.9152 (3.6634) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][630/1251] eta 0:03:12 lr 0.000253 wd 0.0500 time 0.2423 (0.3096) data time 0.0007 (0.0043) model time 0.2416 (0.3053) loss 2.2960 (3.0058) grad_norm 4.5996 (3.6756) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][640/1251] eta 0:03:07 
lr 0.000253 wd 0.0500 time 0.2404 (0.3071) data time 0.0010 (0.0042) model time 0.2394 (0.3029) loss 2.4303 (3.0007) grad_norm 3.1350 (3.6756) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][650/1251] eta 0:03:03 lr 0.000253 wd 0.0500 time 0.2547 (0.3049) data time 0.0007 (0.0040) model time 0.2540 (0.3008) loss 3.9137 (2.9958) grad_norm 2.8800 (3.6437) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][660/1251] eta 0:02:58 lr 0.000253 wd 0.0500 time 0.2438 (0.3027) data time 0.0011 (0.0039) model time 0.2427 (0.2987) loss 3.2799 (3.0000) grad_norm 2.4763 (3.6286) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][670/1251] eta 0:02:55 lr 0.000253 wd 0.0500 time 0.2429 (0.3016) data time 0.0014 (0.0039) model time 0.2415 (0.2977) loss 2.4696 (2.9952) grad_norm 2.4339 (3.5958) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][680/1251] eta 0:02:51 lr 0.000253 wd 0.0500 time 0.2467 (0.2997) data time 0.0009 (0.0038) model time 0.2458 (0.2959) loss 2.6928 (2.9834) grad_norm 3.5827 (3.5985) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][690/1251] eta 0:02:47 lr 0.000253 wd 0.0500 time 0.2365 (0.2987) data time 0.0010 (0.0037) model time 0.2355 (0.2950) loss 2.9384 (2.9789) grad_norm 4.8122 (3.5941) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][700/1251] eta 0:02:43 lr 0.000253 wd 0.0500 time 0.2354 (0.2971) data time 0.0014 (0.0036) model time 0.2339 (0.2935) loss 2.8283 (2.9856) grad_norm 3.6180 (3.5813) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][710/1251] eta 0:02:39 lr 0.000253 wd 0.0500 time 0.2306 (0.2954) data time 0.0010 (0.0035) model time 0.2297 (0.2919) loss 2.8989 (2.9854) grad_norm 3.4670 (3.5768) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][720/1251] eta 0:02:36 lr 0.000253 wd 0.0500 time 0.2421 (0.2939) data time 0.0010 (0.0035) model time 0.2410 (0.2904) loss 3.3364 (2.9862) grad_norm 2.4691 (3.5574) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:43:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 00:43:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 00:43:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 00:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 00:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 00:54:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 00:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 00:54:38 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 00:54:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 00:54:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 00:54:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 207) [2024-08-30 00:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 00:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][730/1251] eta 0:40:43 lr 0.000253 wd 0.0500 time 0.2311 (4.6904) data time 0.0008 (0.2435) model time 0.2303 (4.4469) loss 2.6249 (3.2639) grad_norm 3.1084 (2.8415) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 00:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][740/1251] eta 0:10:50 lr 0.000253 wd 0.0500 time 0.2354 (1.2721) data time 0.0011 (0.0571) model time 0.2343 (1.2149) loss 3.2058 (3.2599) grad_norm 2.8842 (3.3087) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 00:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][750/1251] eta 0:06:53 lr 0.000253 wd 0.0500 time 0.2349 (0.8246) data time 0.0009 (0.0328) model time 0.2339 (0.7919) loss 3.6028 (3.2303) grad_norm 4.7138 (3.7495) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 00:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][760/1251] eta 0:05:18 lr 0.000253 wd 0.0500 time 0.2348 (0.6485) data time 0.0008 (0.0232) model time 0.2340 (0.6253) loss 4.1259 (3.2486) grad_norm 2.7099 (3.4804) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 00:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][770/1251] eta 0:04:26 lr 0.000253 wd 0.0500 time 0.2354 (0.5547) data time 0.0012 (0.0180) model time 0.2342 (0.5367) loss 2.8915 (3.1953) grad_norm 3.6352 (3.4132) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 00:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend 
command received, saving checkpoint and exiting [2024-08-30 00:55:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 00:55:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 00:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 00:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 00:58:12 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 00:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 00:58:19 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 00:58:20 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 00:58:21 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 00:58:21 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 207) [2024-08-30 00:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 00:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][780/1251] eta 0:20:44 lr 0.000253 wd 0.0500 time 0.2304 (2.6430) data time 0.0006 (0.1489) model time 0.2297 (2.4941) loss 3.9209 (3.4333) grad_norm 6.2804 (3.9832) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 00:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 00:58:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 00:58:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 01:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 01:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 01:05:42 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 01:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 01:05:53 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 01:05:54 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 01:05:55 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 01:05:55 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 207) [2024-08-30 01:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 01:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 01:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 01:30:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 01:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 01:30:47 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 01:30:48 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 01:30:49 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 01:30:49 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 207) [2024-08-30 01:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 01:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][790/1251] eta 1:49:39 lr 0.000253 wd 0.0500 time 14.2723 (14.2723) data time 0.8704 (0.8704) model time 13.4019 (13.4019) loss 3.7471 (3.7471) grad_norm 2.4854 (2.4854) loss_scale 512.0000 (512.0000) mem 20031MB [2024-08-30 01:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][800/1251] eta 0:11:33 lr 0.000253 wd 0.0500 time 0.2379 (1.5374) data time 0.0012 (0.0802) model time 0.2367 (1.4572) loss 2.4296 (3.2473) grad_norm 2.9618 (3.8961) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][810/1251] eta 0:06:46 lr 0.000253 wd 0.0500 time 0.2378 (0.9213) data time 0.0010 (0.0425) model time 0.2368 (0.8789) loss 3.2205 (3.2726) grad_norm 3.8048 (3.9644) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][820/1251] eta 0:05:02 lr 0.000253 wd 0.0500 time 0.2429 (0.7015) data time 0.0007 (0.0291) model time 0.2423 (0.6724) loss 2.3245 (3.2441) grad_norm 2.4605 (3.9734) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][830/1251] eta 0:04:08 lr 0.000253 wd 0.0500 time 0.2425 (0.5897) data time 0.0011 (0.0223) model time 0.2414 (0.5675) loss 3.1300 (3.1878) grad_norm 2.8026 (4.5592) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[207/300][840/1251] eta 0:03:34 lr 0.000253 wd 0.0500 time 0.2424 (0.5216) data time 0.0008 (0.0181) model time 0.2416 (0.5035) loss 3.1849 (3.1657) grad_norm 3.6117 (4.2894) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][850/1251] eta 0:03:10 lr 0.000253 wd 0.0500 time 0.2506 (0.4762) data time 0.0009 (0.0153) model time 0.2497 (0.4609) loss 3.1819 (3.1333) grad_norm 2.7182 (4.1998) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][860/1251] eta 0:02:53 lr 0.000253 wd 0.0500 time 0.2485 (0.4434) data time 0.0010 (0.0133) model time 0.2475 (0.4301) loss 2.9724 (3.1040) grad_norm 2.5576 (3.9770) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][870/1251] eta 0:02:39 lr 0.000253 wd 0.0500 time 0.2374 (0.4187) data time 0.0010 (0.0118) model time 0.2364 (0.4069) loss 2.6037 (3.0907) grad_norm 2.6535 (3.8460) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][880/1251] eta 0:02:28 lr 0.000253 wd 0.0500 time 0.2381 (0.3994) data time 0.0008 (0.0106) model time 0.2373 (0.3888) loss 3.4865 (3.0685) grad_norm 3.7955 (3.8100) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][890/1251] eta 0:02:18 lr 0.000252 wd 0.0500 time 0.2400 (0.3840) data time 0.0008 (0.0096) model time 0.2392 (0.3744) loss 3.7232 (3.0769) grad_norm 3.2549 (3.7399) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][900/1251] eta 0:02:10 lr 0.000252 wd 0.0500 time 0.2301 (0.3713) data time 0.0010 (0.0089) model time 0.2291 (0.3625) loss 2.4723 (3.0865) grad_norm 2.7173 (3.7428) loss_scale 512.0000 (512.0000) mem 
7373MB [2024-08-30 01:31:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][910/1251] eta 0:02:03 lr 0.000252 wd 0.0500 time 0.2403 (0.3609) data time 0.0008 (0.0082) model time 0.2395 (0.3526) loss 1.9254 (3.0836) grad_norm 4.5536 (3.7263) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][920/1251] eta 0:01:56 lr 0.000252 wd 0.0500 time 0.2422 (0.3520) data time 0.0009 (0.0077) model time 0.2413 (0.3444) loss 3.0357 (3.0710) grad_norm 2.6234 (3.6839) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][930/1251] eta 0:01:50 lr 0.000252 wd 0.0500 time 0.2462 (0.3444) data time 0.0007 (0.0072) model time 0.2455 (0.3372) loss 3.1641 (3.0651) grad_norm 6.1966 (3.6947) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][940/1251] eta 0:01:44 lr 0.000252 wd 0.0500 time 0.2416 (0.3376) data time 0.0009 (0.0068) model time 0.2407 (0.3308) loss 2.3506 (3.0614) grad_norm 2.9834 (3.6804) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][950/1251] eta 0:01:39 lr 0.000252 wd 0.0500 time 0.2510 (0.3319) data time 0.0010 (0.0064) model time 0.2500 (0.3255) loss 3.2851 (3.0674) grad_norm 3.4401 (3.6611) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][960/1251] eta 0:01:35 lr 0.000252 wd 0.0500 time 0.2320 (0.3265) data time 0.0009 (0.0061) model time 0.2311 (0.3204) loss 2.9403 (3.0593) grad_norm 3.6340 (3.7521) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][970/1251] eta 0:01:30 lr 0.000252 wd 0.0500 time 0.2696 (0.3222) data time 0.0009 (0.0058) model time 0.2687 
(0.3164) loss 3.4644 (3.0449) grad_norm 3.4257 (3.7345) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][980/1251] eta 0:01:26 lr 0.000252 wd 0.0500 time 0.2507 (0.3181) data time 0.0009 (0.0056) model time 0.2498 (0.3126) loss 2.5034 (3.0464) grad_norm 3.5598 (3.6984) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][990/1251] eta 0:01:22 lr 0.000252 wd 0.0500 time 0.2390 (0.3144) data time 0.0009 (0.0053) model time 0.2381 (0.3091) loss 2.8329 (3.0323) grad_norm 2.1777 (3.6611) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1000/1251] eta 0:01:18 lr 0.000252 wd 0.0500 time 0.2427 (0.3110) data time 0.0012 (0.0051) model time 0.2415 (0.3058) loss 3.3183 (3.0240) grad_norm 2.9361 (3.6276) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1010/1251] eta 0:01:14 lr 0.000252 wd 0.0500 time 0.2670 (0.3079) data time 0.0007 (0.0049) model time 0.2662 (0.3030) loss 3.0906 (3.0180) grad_norm 3.2628 (3.6288) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1020/1251] eta 0:01:10 lr 0.000252 wd 0.0500 time 0.2505 (0.3051) data time 0.0007 (0.0048) model time 0.2498 (0.3003) loss 1.7873 (3.0145) grad_norm 3.4786 (3.5979) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1030/1251] eta 0:01:06 lr 0.000252 wd 0.0500 time 0.2535 (0.3026) data time 0.0006 (0.0046) model time 0.2529 (0.2980) loss 3.3371 (3.0132) grad_norm 3.4964 (3.6463) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1040/1251] eta 
0:01:03 lr 0.000252 wd 0.0500 time 0.2408 (0.3004) data time 0.0007 (0.0045) model time 0.2401 (0.2959) loss 3.3739 (3.0062) grad_norm 4.9075 (3.6280) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1050/1251] eta 0:00:59 lr 0.000252 wd 0.0500 time 0.2571 (0.2983) data time 0.0006 (0.0043) model time 0.2565 (0.2940) loss 3.2245 (2.9979) grad_norm 3.1931 (3.6303) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1060/1251] eta 0:00:56 lr 0.000252 wd 0.0500 time 0.2455 (0.2963) data time 0.0008 (0.0042) model time 0.2447 (0.2920) loss 3.6545 (2.9954) grad_norm 2.7761 (3.5999) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1070/1251] eta 0:00:53 lr 0.000252 wd 0.0500 time 0.2420 (0.2943) data time 0.0009 (0.0041) model time 0.2411 (0.2902) loss 3.2755 (2.9998) grad_norm 2.9248 (3.5798) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1080/1251] eta 0:00:50 lr 0.000252 wd 0.0500 time 0.2522 (0.2934) data time 0.0008 (0.0040) model time 0.2514 (0.2894) loss 1.8581 (2.9930) grad_norm 2.9977 (3.5618) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1090/1251] eta 0:00:46 lr 0.000252 wd 0.0500 time 0.2430 (0.2918) data time 0.0012 (0.0039) model time 0.2418 (0.2879) loss 3.1741 (2.9834) grad_norm 2.9905 (3.5637) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1100/1251] eta 0:00:43 lr 0.000252 wd 0.0500 time 0.2359 (0.2911) data time 0.0010 (0.0038) model time 0.2349 (0.2873) loss 3.2802 (2.9819) grad_norm 4.1386 (3.5685) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 
01:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1110/1251] eta 0:00:40 lr 0.000252 wd 0.0500 time 0.2445 (0.2896) data time 0.0008 (0.0037) model time 0.2437 (0.2859) loss 3.8100 (2.9936) grad_norm 2.9091 (3.6023) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1120/1251] eta 0:00:37 lr 0.000252 wd 0.0500 time 0.2395 (0.2883) data time 0.0008 (0.0036) model time 0.2388 (0.2846) loss 1.6499 (2.9908) grad_norm 2.7931 (3.5919) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1130/1251] eta 0:00:34 lr 0.000252 wd 0.0500 time 0.2401 (0.2869) data time 0.0010 (0.0036) model time 0.2392 (0.2833) loss 3.2246 (2.9938) grad_norm 3.3973 (3.5778) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1140/1251] eta 0:00:31 lr 0.000252 wd 0.0500 time 0.2413 (0.2857) data time 0.0007 (0.0035) model time 0.2406 (0.2822) loss 3.6416 (2.9952) grad_norm 4.3376 (3.6045) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1150/1251] eta 0:00:28 lr 0.000251 wd 0.0500 time 0.2373 (0.2845) data time 0.0007 (0.0035) model time 0.2366 (0.2811) loss 2.8752 (2.9959) grad_norm 3.6151 (3.5935) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1160/1251] eta 0:00:25 lr 0.000251 wd 0.0500 time 0.2444 (0.2834) data time 0.0010 (0.0034) model time 0.2434 (0.2800) loss 2.4549 (2.9957) grad_norm 2.6517 (3.5751) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1170/1251] eta 0:00:22 lr 0.000251 wd 0.0500 time 0.2515 (0.2824) data time 0.0007 (0.0033) model time 0.2507 (0.2791) loss 
2.2791 (2.9933) grad_norm 2.9642 (3.5651) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1180/1251] eta 0:00:19 lr 0.000251 wd 0.0500 time 0.2440 (0.2814) data time 0.0007 (0.0033) model time 0.2433 (0.2781) loss 3.4917 (2.9900) grad_norm 4.1783 (3.5723) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1190/1251] eta 0:00:17 lr 0.000251 wd 0.0500 time 0.2490 (0.2804) data time 0.0009 (0.0032) model time 0.2481 (0.2772) loss 3.4268 (2.9952) grad_norm 3.2141 (3.5656) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1200/1251] eta 0:00:14 lr 0.000251 wd 0.0500 time 0.2426 (0.2795) data time 0.0009 (0.0032) model time 0.2417 (0.2764) loss 3.5040 (3.0009) grad_norm 2.6119 (3.5659) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1210/1251] eta 0:00:11 lr 0.000251 wd 0.0500 time 0.2413 (0.2786) data time 0.0007 (0.0031) model time 0.2407 (0.2755) loss 3.6113 (2.9990) grad_norm 2.9569 (3.5743) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1220/1251] eta 0:00:08 lr 0.000251 wd 0.0500 time 0.2364 (0.2778) data time 0.0008 (0.0031) model time 0.2356 (0.2748) loss 3.3976 (3.0060) grad_norm 3.1453 (3.5644) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1230/1251] eta 0:00:05 lr 0.000251 wd 0.0500 time 0.2378 (0.2771) data time 0.0009 (0.0030) model time 0.2369 (0.2741) loss 3.4957 (3.0093) grad_norm 2.3487 (3.5540) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1240/1251] eta 0:00:03 lr 
0.000251 wd 0.0500 time 0.2258 (0.2762) data time 0.0005 (0.0030) model time 0.2253 (0.2732) loss 2.8326 (3.0070) grad_norm 2.9484 (3.5502) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [207/300][1250/1251] eta 0:00:00 lr 0.000251 wd 0.0500 time 0.2230 (0.2751) data time 0.0006 (0.0029) model time 0.2224 (0.2722) loss 3.5424 (3.0057) grad_norm 4.3459 (3.5526) loss_scale 512.0000 (512.0000) mem 7373MB [2024-08-30 01:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 207 training takes 0:02:06 [2024-08-30 01:33:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 01:33:03 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 01:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.397 (0.397) Loss 0.3940 (0.3940) Acc@1 92.285 (92.285) Acc@5 98.438 (98.438) Mem 7373MB [2024-08-30 01:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.108) Loss 0.6592 (0.6513) Acc@1 88.086 (86.088) Acc@5 97.363 (97.354) Mem 7373MB [2024-08-30 01:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.094) Loss 0.9248 (0.6816) Acc@1 77.344 (85.142) Acc@5 95.312 (97.266) Mem 7373MB [2024-08-30 01:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.088) Loss 1.1572 (0.7741) Acc@1 72.852 (82.882) Acc@5 92.578 (96.286) Mem 7373MB [2024-08-30 01:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0674 (0.8237) Acc@1 75.586 (81.648) Acc@5 92.871 (95.739) Mem 7373MB [2024-08-30 01:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.138 Acc@5 95.680 [2024-08-30 01:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on 
the 50000 test images: 81.1% [2024-08-30 01:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.14% [2024-08-30 01:33:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 01:33:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-30 01:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.414 (1.414) Loss 0.3821 (0.3821) Acc@1 93.164 (93.164) Acc@5 98.535 (98.535) Mem 7373MB [2024-08-30 01:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.196) Loss 0.5874 (0.6070) Acc@1 88.965 (87.393) Acc@5 97.656 (97.674) Mem 7373MB [2024-08-30 01:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.139) Loss 0.8706 (0.6342) Acc@1 78.320 (86.291) Acc@5 96.289 (97.656) Mem 7373MB [2024-08-30 01:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.119) Loss 1.0918 (0.7177) Acc@1 73.535 (84.236) Acc@5 92.969 (96.736) Mem 7373MB [2024-08-30 01:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.106) Loss 0.9775 (0.7605) Acc@1 76.074 (83.027) Acc@5 94.238 (96.265) Mem 7373MB [2024-08-30 01:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.574 Acc@5 96.280 [2024-08-30 01:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-08-30 01:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.57% [2024-08-30 01:33:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 01:33:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 01:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][0/1251] eta 0:15:55 lr 0.000251 wd 0.0500 time 0.7639 (0.7639) data time 0.4749 (0.4749) model time 0.0000 (0.0000) loss 2.1518 (2.1518) grad_norm 4.2429 (4.2429) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][10/1251] eta 0:05:59 lr 0.000251 wd 0.0500 time 0.2543 (0.2900) data time 0.0009 (0.0442) model time 0.0000 (0.0000) loss 3.2274 (2.6906) grad_norm 3.4661 (3.1169) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][20/1251] eta 0:05:29 lr 0.000251 wd 0.0500 time 0.2530 (0.2676) data time 0.0009 (0.0236) model time 0.0000 (0.0000) loss 3.7006 (2.7779) grad_norm 3.8718 (3.3664) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][30/1251] eta 0:05:18 lr 0.000251 wd 0.0500 time 0.2416 (0.2605) data time 0.0010 (0.0163) model time 0.0000 (0.0000) loss 3.1530 (2.8808) grad_norm 6.6505 (3.3372) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][40/1251] eta 0:05:10 lr 0.000251 wd 0.0500 time 0.2511 (0.2563) data time 0.0011 (0.0126) model time 0.0000 (0.0000) loss 2.9646 (2.8682) grad_norm 3.5630 (3.3158) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][50/1251] eta 0:05:04 lr 0.000251 wd 0.0500 time 0.2416 (0.2534) data time 0.0008 (0.0103) model time 0.0000 (0.0000) loss 3.5573 (2.9334) grad_norm 3.2737 (3.3345) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][60/1251] eta 0:04:59 lr 0.000251 wd 0.0500 time 0.2413 (0.2517) data time 0.0009 (0.0088) model time 0.2405 (0.2420) loss 
2.1355 (2.9258) grad_norm 2.4633 (3.3229) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][70/1251] eta 0:04:55 lr 0.000251 wd 0.0500 time 0.2441 (0.2503) data time 0.0010 (0.0077) model time 0.2430 (0.2412) loss 3.3612 (2.9257) grad_norm 2.6335 (3.3437) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][80/1251] eta 0:04:51 lr 0.000251 wd 0.0500 time 0.2500 (0.2492) data time 0.0008 (0.0068) model time 0.2492 (0.2412) loss 2.0290 (2.9072) grad_norm 3.2915 (3.3424) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][90/1251] eta 0:04:48 lr 0.000251 wd 0.0500 time 0.2619 (0.2487) data time 0.0007 (0.0062) model time 0.2612 (0.2418) loss 3.4984 (2.9327) grad_norm 3.6106 (3.3754) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][100/1251] eta 0:04:45 lr 0.000251 wd 0.0500 time 0.2452 (0.2481) data time 0.0008 (0.0057) model time 0.2444 (0.2418) loss 3.9413 (2.9634) grad_norm 3.0572 (3.4912) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][110/1251] eta 0:04:42 lr 0.000251 wd 0.0500 time 0.2539 (0.2477) data time 0.0012 (0.0052) model time 0.2527 (0.2418) loss 2.7230 (2.9661) grad_norm 3.7966 (3.4413) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][120/1251] eta 0:04:39 lr 0.000251 wd 0.0500 time 0.2508 (0.2474) data time 0.0008 (0.0049) model time 0.2500 (0.2421) loss 3.4115 (2.9756) grad_norm 2.3601 (3.4110) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][130/1251] eta 0:04:36 lr 0.000251 wd 
0.0500 time 0.2489 (0.2471) data time 0.0011 (0.0046) model time 0.2478 (0.2420) loss 3.3198 (2.9852) grad_norm 3.0616 (3.4244) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][140/1251] eta 0:04:34 lr 0.000251 wd 0.0500 time 0.2695 (0.2470) data time 0.0008 (0.0043) model time 0.2687 (0.2423) loss 2.9916 (2.9817) grad_norm 5.4214 (3.4860) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][150/1251] eta 0:04:31 lr 0.000251 wd 0.0500 time 0.2620 (0.2469) data time 0.0009 (0.0041) model time 0.2610 (0.2425) loss 2.1180 (2.9747) grad_norm 3.9330 (3.4946) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][160/1251] eta 0:04:29 lr 0.000251 wd 0.0500 time 0.2596 (0.2467) data time 0.0009 (0.0040) model time 0.2586 (0.2426) loss 2.9941 (2.9906) grad_norm 4.4361 (3.5594) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][170/1251] eta 0:04:26 lr 0.000250 wd 0.0500 time 0.2520 (0.2465) data time 0.0008 (0.0038) model time 0.2512 (0.2425) loss 3.4150 (3.0023) grad_norm 3.4015 (3.5475) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][180/1251] eta 0:04:23 lr 0.000250 wd 0.0500 time 0.2490 (0.2464) data time 0.0009 (0.0037) model time 0.2482 (0.2426) loss 1.8356 (2.9875) grad_norm 2.6850 (3.4964) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][190/1251] eta 0:04:21 lr 0.000250 wd 0.0500 time 0.2450 (0.2461) data time 0.0009 (0.0035) model time 0.2441 (0.2424) loss 2.4434 (2.9909) grad_norm 5.7905 (3.5487) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [208/300][200/1251] eta 0:04:18 lr 0.000250 wd 0.0500 time 0.2381 (0.2460) data time 0.0010 (0.0034) model time 0.2370 (0.2424) loss 3.4948 (2.9817) grad_norm 3.2367 (3.5435) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][210/1251] eta 0:04:16 lr 0.000250 wd 0.0500 time 0.2404 (0.2459) data time 0.0008 (0.0033) model time 0.2396 (0.2425) loss 2.7415 (2.9956) grad_norm 3.4461 (3.5233) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][220/1251] eta 0:04:13 lr 0.000250 wd 0.0500 time 0.2383 (0.2458) data time 0.0009 (0.0032) model time 0.2375 (0.2424) loss 2.8775 (3.0002) grad_norm 2.9493 (3.5109) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][230/1251] eta 0:04:10 lr 0.000250 wd 0.0500 time 0.2480 (0.2458) data time 0.0009 (0.0031) model time 0.2470 (0.2426) loss 2.7920 (2.9940) grad_norm 3.0072 (3.5366) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][240/1251] eta 0:04:08 lr 0.000250 wd 0.0500 time 0.2411 (0.2457) data time 0.0007 (0.0030) model time 0.2405 (0.2426) loss 2.2420 (2.9826) grad_norm 3.7775 (3.5618) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][250/1251] eta 0:04:05 lr 0.000250 wd 0.0500 time 0.2413 (0.2457) data time 0.0007 (0.0030) model time 0.2406 (0.2426) loss 2.2242 (2.9798) grad_norm 3.1609 (3.5470) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][260/1251] eta 0:04:03 lr 0.000250 wd 0.0500 time 0.2472 (0.2457) data time 0.0008 (0.0029) model time 0.2465 (0.2427) loss 2.9710 (2.9772) grad_norm 3.2888 
(3.5488) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][270/1251] eta 0:04:01 lr 0.000250 wd 0.0500 time 0.4624 (0.2464) data time 0.0010 (0.0028) model time 0.4614 (0.2438) loss 3.0135 (2.9782) grad_norm 2.7491 (3.5372) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][280/1251] eta 0:03:59 lr 0.000250 wd 0.0500 time 0.2439 (0.2463) data time 0.0008 (0.0027) model time 0.2431 (0.2436) loss 1.8942 (2.9851) grad_norm 2.5472 (3.5242) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][290/1251] eta 0:03:56 lr 0.000250 wd 0.0500 time 0.2536 (0.2462) data time 0.0006 (0.0027) model time 0.2530 (0.2436) loss 3.5516 (2.9904) grad_norm 2.9120 (3.5360) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][300/1251] eta 0:03:54 lr 0.000250 wd 0.0500 time 0.2420 (0.2461) data time 0.0008 (0.0026) model time 0.2412 (0.2435) loss 3.2610 (2.9885) grad_norm 2.8276 (3.5212) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][310/1251] eta 0:03:51 lr 0.000250 wd 0.0500 time 0.2378 (0.2460) data time 0.0007 (0.0026) model time 0.2372 (0.2434) loss 2.9451 (2.9909) grad_norm 3.8010 (3.5424) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][320/1251] eta 0:03:48 lr 0.000250 wd 0.0500 time 0.2426 (0.2459) data time 0.0007 (0.0025) model time 0.2419 (0.2435) loss 3.1986 (2.9956) grad_norm 2.6483 (3.5551) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 01:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][330/1251] eta 0:03:46 lr 0.000250 wd 0.0500 time 0.2453 (0.2459) data 
time 0.0010 (0.0025) model time 0.2443 (0.2434) loss 3.8149 (2.9947) grad_norm 3.6649 (3.5711) loss_scale 1024.0000 (513.5468) mem 7378MB [2024-08-30 01:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][340/1251] eta 0:03:43 lr 0.000250 wd 0.0500 time 0.2408 (0.2457) data time 0.0008 (0.0024) model time 0.2401 (0.2433) loss 2.3618 (2.9976) grad_norm 2.5622 (3.5532) loss_scale 1024.0000 (528.5161) mem 7378MB [2024-08-30 01:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][350/1251] eta 0:03:41 lr 0.000250 wd 0.0500 time 0.2434 (0.2462) data time 0.0010 (0.0024) model time 0.2424 (0.2439) loss 3.4048 (2.9856) grad_norm 3.3334 (3.5327) loss_scale 1024.0000 (542.6325) mem 7378MB [2024-08-30 01:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][360/1251] eta 0:03:39 lr 0.000250 wd 0.0500 time 0.2380 (0.2461) data time 0.0008 (0.0024) model time 0.2372 (0.2439) loss 2.6087 (2.9850) grad_norm 2.7595 (3.5178) loss_scale 1024.0000 (555.9668) mem 7378MB [2024-08-30 01:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][370/1251] eta 0:03:36 lr 0.000250 wd 0.0500 time 0.2399 (0.2460) data time 0.0007 (0.0023) model time 0.2392 (0.2438) loss 2.8568 (2.9753) grad_norm 2.3829 (3.5049) loss_scale 1024.0000 (568.5822) mem 7378MB [2024-08-30 01:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][380/1251] eta 0:03:34 lr 0.000250 wd 0.0500 time 0.2387 (0.2459) data time 0.0010 (0.0023) model time 0.2377 (0.2437) loss 2.2260 (2.9714) grad_norm 6.8562 (3.5206) loss_scale 1024.0000 (580.5354) mem 7378MB [2024-08-30 01:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][390/1251] eta 0:03:31 lr 0.000250 wd 0.0500 time 0.2393 (0.2458) data time 0.0008 (0.0023) model time 0.2385 (0.2436) loss 2.8579 (2.9702) grad_norm 4.4402 (3.5359) loss_scale 1024.0000 (591.8772) mem 7378MB [2024-08-30 01:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [208/300][400/1251] eta 0:03:29 lr 0.000250 wd 0.0500 time 0.2432 (0.2458) data time 0.0010 (0.0022) model time 0.2422 (0.2436) loss 3.0553 (2.9707) grad_norm 2.8623 (3.5432) loss_scale 1024.0000 (602.6534) mem 7378MB [2024-08-30 01:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][410/1251] eta 0:03:26 lr 0.000250 wd 0.0500 time 0.2430 (0.2457) data time 0.0007 (0.0022) model time 0.2424 (0.2435) loss 1.8716 (2.9697) grad_norm 2.9768 (3.5424) loss_scale 1024.0000 (612.9051) mem 7378MB [2024-08-30 01:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][420/1251] eta 0:03:24 lr 0.000250 wd 0.0500 time 0.2411 (0.2456) data time 0.0007 (0.0022) model time 0.2404 (0.2434) loss 2.1664 (2.9660) grad_norm 3.8287 (inf) loss_scale 512.0000 (615.3729) mem 7378MB [2024-08-30 01:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][430/1251] eta 0:03:21 lr 0.000249 wd 0.0500 time 0.2437 (0.2455) data time 0.0007 (0.0021) model time 0.2430 (0.2434) loss 3.5645 (2.9672) grad_norm 3.1494 (inf) loss_scale 512.0000 (612.9745) mem 7378MB [2024-08-30 01:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][440/1251] eta 0:03:19 lr 0.000249 wd 0.0500 time 0.2492 (0.2456) data time 0.0006 (0.0021) model time 0.2485 (0.2435) loss 3.4414 (2.9669) grad_norm 2.6579 (inf) loss_scale 512.0000 (610.6848) mem 7378MB [2024-08-30 01:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][450/1251] eta 0:03:16 lr 0.000249 wd 0.0500 time 0.2469 (0.2456) data time 0.0006 (0.0021) model time 0.2463 (0.2436) loss 2.1785 (2.9640) grad_norm 3.3223 (inf) loss_scale 512.0000 (608.4967) mem 7378MB [2024-08-30 01:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][460/1251] eta 0:03:14 lr 0.000249 wd 0.0500 time 0.2545 (0.2456) data time 0.0009 (0.0021) model time 0.2536 (0.2436) loss 3.2805 (2.9691) grad_norm 2.9494 (inf) loss_scale 512.0000 (606.4035) 
mem 7378MB [2024-08-30 01:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][470/1251] eta 0:03:11 lr 0.000249 wd 0.0500 time 0.2371 (0.2455) data time 0.0006 (0.0020) model time 0.2365 (0.2435) loss 3.5431 (2.9762) grad_norm 3.2352 (inf) loss_scale 512.0000 (604.3992) mem 7378MB [2024-08-30 01:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][480/1251] eta 0:03:09 lr 0.000249 wd 0.0500 time 0.2561 (0.2455) data time 0.0008 (0.0020) model time 0.2553 (0.2435) loss 1.9872 (2.9720) grad_norm 2.9541 (inf) loss_scale 512.0000 (602.4782) mem 7378MB [2024-08-30 01:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][490/1251] eta 0:03:06 lr 0.000249 wd 0.0500 time 0.2524 (0.2454) data time 0.0008 (0.0020) model time 0.2516 (0.2434) loss 3.1763 (2.9638) grad_norm 3.9916 (inf) loss_scale 512.0000 (600.6354) mem 7378MB [2024-08-30 01:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][500/1251] eta 0:03:04 lr 0.000249 wd 0.0500 time 0.2547 (0.2454) data time 0.0009 (0.0020) model time 0.2538 (0.2434) loss 2.5280 (2.9638) grad_norm 3.5313 (inf) loss_scale 512.0000 (598.8663) mem 7378MB [2024-08-30 01:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][510/1251] eta 0:03:01 lr 0.000249 wd 0.0500 time 0.2695 (0.2454) data time 0.0006 (0.0020) model time 0.2688 (0.2435) loss 3.5753 (2.9665) grad_norm 3.7022 (inf) loss_scale 512.0000 (597.1663) mem 7378MB [2024-08-30 01:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][520/1251] eta 0:02:59 lr 0.000249 wd 0.0500 time 0.2348 (0.2455) data time 0.0008 (0.0020) model time 0.2340 (0.2436) loss 3.5071 (2.9672) grad_norm 4.4492 (inf) loss_scale 512.0000 (595.5317) mem 7378MB [2024-08-30 01:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][530/1251] eta 0:02:56 lr 0.000249 wd 0.0500 time 0.2349 (0.2455) data time 0.0010 (0.0019) model time 0.2339 (0.2436) loss 
3.2466 (2.9651) grad_norm 2.6563 (inf) loss_scale 512.0000 (593.9586) mem 7378MB [2024-08-30 01:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][540/1251] eta 0:02:54 lr 0.000249 wd 0.0500 time 0.2411 (0.2455) data time 0.0012 (0.0019) model time 0.2399 (0.2436) loss 3.0075 (2.9634) grad_norm 3.2744 (inf) loss_scale 512.0000 (592.4436) mem 7378MB [2024-08-30 01:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][550/1251] eta 0:02:52 lr 0.000249 wd 0.0500 time 0.2411 (0.2455) data time 0.0008 (0.0019) model time 0.2404 (0.2436) loss 3.2536 (2.9662) grad_norm 2.1800 (inf) loss_scale 512.0000 (590.9837) mem 7378MB [2024-08-30 01:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][560/1251] eta 0:02:49 lr 0.000249 wd 0.0500 time 0.2314 (0.2455) data time 0.0010 (0.0019) model time 0.2304 (0.2436) loss 3.1876 (2.9675) grad_norm 2.6976 (inf) loss_scale 512.0000 (589.5758) mem 7378MB [2024-08-30 01:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][570/1251] eta 0:02:47 lr 0.000249 wd 0.0500 time 0.2471 (0.2454) data time 0.0008 (0.0019) model time 0.2463 (0.2436) loss 3.0150 (2.9623) grad_norm 3.7191 (inf) loss_scale 512.0000 (588.2172) mem 7378MB [2024-08-30 01:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][580/1251] eta 0:02:44 lr 0.000249 wd 0.0500 time 0.2375 (0.2454) data time 0.0009 (0.0019) model time 0.2365 (0.2435) loss 3.2103 (2.9654) grad_norm 3.0838 (inf) loss_scale 512.0000 (586.9053) mem 7378MB [2024-08-30 01:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][590/1251] eta 0:02:42 lr 0.000249 wd 0.0500 time 0.2321 (0.2453) data time 0.0010 (0.0019) model time 0.2310 (0.2435) loss 3.0173 (2.9653) grad_norm 2.5182 (inf) loss_scale 512.0000 (585.6379) mem 7378MB [2024-08-30 01:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][600/1251] eta 0:02:39 lr 0.000249 wd 0.0500 time 0.2366 
(0.2453) data time 0.0009 (0.0018) model time 0.2358 (0.2435) loss 3.7595 (2.9666) grad_norm 2.9678 (inf) loss_scale 512.0000 (584.4126) mem 7378MB [2024-08-30 01:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][610/1251] eta 0:02:37 lr 0.000249 wd 0.0500 time 0.2355 (0.2452) data time 0.0011 (0.0018) model time 0.2344 (0.2434) loss 2.2359 (2.9636) grad_norm 2.3460 (inf) loss_scale 512.0000 (583.2275) mem 7378MB [2024-08-30 01:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][620/1251] eta 0:02:34 lr 0.000249 wd 0.0500 time 0.2399 (0.2452) data time 0.0009 (0.0018) model time 0.2389 (0.2434) loss 3.0292 (2.9634) grad_norm 3.3605 (inf) loss_scale 512.0000 (582.0805) mem 7378MB [2024-08-30 01:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][630/1251] eta 0:02:32 lr 0.000249 wd 0.0500 time 0.2415 (0.2452) data time 0.0007 (0.0018) model time 0.2409 (0.2434) loss 2.5788 (2.9628) grad_norm 4.8046 (inf) loss_scale 512.0000 (580.9699) mem 7378MB [2024-08-30 01:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][640/1251] eta 0:02:29 lr 0.000249 wd 0.0500 time 0.2433 (0.2452) data time 0.0009 (0.0018) model time 0.2424 (0.2434) loss 2.5013 (2.9626) grad_norm 2.5190 (inf) loss_scale 512.0000 (579.8939) mem 7378MB [2024-08-30 01:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][650/1251] eta 0:02:27 lr 0.000249 wd 0.0500 time 0.2401 (0.2452) data time 0.0012 (0.0018) model time 0.2390 (0.2434) loss 3.2607 (2.9638) grad_norm 2.8423 (inf) loss_scale 512.0000 (578.8510) mem 7378MB [2024-08-30 01:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][660/1251] eta 0:02:24 lr 0.000249 wd 0.0500 time 0.2390 (0.2451) data time 0.0007 (0.0018) model time 0.2383 (0.2434) loss 2.9407 (2.9666) grad_norm 2.7204 (inf) loss_scale 512.0000 (577.8396) mem 7378MB [2024-08-30 01:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[208/300][670/1251] eta 0:02:22 lr 0.000249 wd 0.0500 time 0.2469 (0.2452) data time 0.0007 (0.0018) model time 0.2461 (0.2434) loss 1.9096 (2.9689) grad_norm 4.9056 (inf) loss_scale 512.0000 (576.8584) mem 7378MB [2024-08-30 01:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 01:36:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 01:36:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 01:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 01:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 01:39:19 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 01:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 01:39:32 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 01:39:33 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 01:39:34 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 01:39:34 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 208) [2024-08-30 01:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 01:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 01:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 01:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 01:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 01:56:53 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 01:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 01:57:03 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 01:57:04 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 01:57:06 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 01:57:06 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 208) [2024-08-30 01:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 01:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][680/1251] eta 0:20:30 lr 0.000249 wd 0.0500 time 0.2438 (2.1544) data time 0.0007 (0.0988) model time 0.2431 (2.0556) loss 3.1385 (3.3140) grad_norm 2.5525 (3.1384) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 01:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][690/1251] eta 0:10:12 lr 0.000248 wd 0.0500 time 0.2469 (1.0919) data time 0.0009 (0.0447) model time 0.2459 (1.0472) loss 3.5372 (3.1921) grad_norm 3.3902 (3.5239) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 01:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][700/1251] eta 0:07:15 lr 0.000248 wd 0.0500 time 0.2468 (0.7899) data time 0.0010 (0.0293) model time 0.2458 (0.7605) loss 3.5155 (3.2286) grad_norm 3.0858 (3.4210) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 01:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][710/1251] eta 0:05:50 lr 0.000248 wd 0.0500 time 0.2445 (0.6471) data time 0.0010 (0.0226) model time 0.2435 (0.6246) loss 3.3052 (3.2091) grad_norm 3.1801 (3.3407) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 01:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][720/1251] eta 0:04:59 lr 0.000248 wd 0.0500 time 0.2420 (0.5639) data time 0.0008 (0.0181) model time 0.2412 (0.5457) loss 3.1675 (3.1839) grad_norm 2.9621 (3.2843) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 01:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[208/300][730/1251] eta 0:04:25 lr 0.000248 wd 0.0500 time 0.2262 (0.5091) data time 0.0009 (0.0155) model time 0.2253 (0.4935) loss 2.2499 (3.1392) grad_norm 3.3339 (3.5117) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 01:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][740/1251] eta 0:04:00 lr 0.000248 wd 0.0500 time 0.2387 (0.4703) data time 0.0008 (0.0136) model time 0.2379 (0.4568) loss 1.8465 (3.1183) grad_norm 2.5151 (3.4904) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 01:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 01:57:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 01:57:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 02:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:01:19 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:01:27 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:01:29 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:01:30 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:01:30 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 208) [2024-08-30 02:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][750/1251] eta 0:38:43 lr 0.000248 wd 0.0500 time 0.2218 (4.6386) data time 0.0011 (0.3375) model time 0.2207 (4.3011) loss 2.8208 (3.2428) grad_norm 3.5273 (3.1276) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][760/1251] eta 0:10:11 lr 0.000248 wd 0.0500 time 0.2307 (1.2445) data time 0.0009 (0.0787) model time 0.2298 (1.1658) loss 3.0446 (3.2711) grad_norm 6.8415 (3.9314) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][770/1251] eta 0:06:26 lr 0.000248 wd 0.0500 time 0.2228 (0.8026) data time 0.0007 (0.0449) model time 0.2221 (0.7577) loss 3.4262 (3.2439) grad_norm 3.0973 (4.1327) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][780/1251] eta 0:04:56 lr 0.000248 wd 0.0500 time 0.2293 (0.6293) data time 0.0008 (0.0316) model time 0.2285 (0.5977) loss 3.5430 (3.2261) grad_norm 3.6067 (3.7721) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][790/1251] eta 0:04:07 lr 0.000248 wd 0.0500 time 0.2458 (0.5370) data time 0.0009 (0.0246) model time 0.2449 (0.5124) loss 3.0423 (3.1662) grad_norm 3.1572 (3.5814) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[208/300][800/1251] eta 0:03:36 lr 0.000248 wd 0.0500 time 0.2341 (0.4793) data time 0.0009 (0.0201) model time 0.2333 (0.4591) loss 3.0799 (3.1308) grad_norm 6.5363 (3.6622) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][810/1251] eta 0:03:13 lr 0.000248 wd 0.0500 time 0.2238 (0.4392) data time 0.0009 (0.0171) model time 0.2229 (0.4221) loss 2.9351 (3.1191) grad_norm 5.0901 (3.7598) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][820/1251] eta 0:02:56 lr 0.000248 wd 0.0500 time 0.2278 (0.4103) data time 0.0008 (0.0149) model time 0.2269 (0.3954) loss 3.3163 (3.0945) grad_norm 2.8387 (3.7283) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][830/1251] eta 0:02:43 lr 0.000248 wd 0.0500 time 0.2248 (0.3885) data time 0.0007 (0.0132) model time 0.2241 (0.3753) loss 2.0774 (3.0655) grad_norm 3.7530 (3.7921) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][840/1251] eta 0:02:32 lr 0.000248 wd 0.0500 time 0.2227 (0.3713) data time 0.0007 (0.0119) model time 0.2220 (0.3594) loss 3.1337 (3.0518) grad_norm 3.9668 (3.7655) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][850/1251] eta 0:02:23 lr 0.000248 wd 0.0500 time 0.2314 (0.3574) data time 0.0009 (0.0109) model time 0.2305 (0.3465) loss 3.6501 (3.0772) grad_norm 2.6544 (3.7260) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][860/1251] eta 0:02:15 lr 0.000248 wd 0.0500 time 0.2248 (0.3460) data time 0.0008 (0.0100) model time 0.2240 (0.3360) loss 3.3281 (3.0718) grad_norm 7.0295 (3.7248) loss_scale 512.0000 (512.0000) mem 
7374MB [2024-08-30 02:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][870/1251] eta 0:02:08 lr 0.000248 wd 0.0500 time 0.2289 (0.3369) data time 0.0012 (0.0093) model time 0.2277 (0.3275) loss 3.2158 (3.0805) grad_norm 2.6311 (3.6812) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][880/1251] eta 0:02:01 lr 0.000248 wd 0.0500 time 0.2247 (0.3287) data time 0.0008 (0.0087) model time 0.2238 (0.3200) loss 3.1602 (3.0796) grad_norm 3.0917 (3.6492) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][890/1251] eta 0:01:56 lr 0.000248 wd 0.0500 time 0.2465 (0.3217) data time 0.0010 (0.0082) model time 0.2455 (0.3135) loss 3.5554 (3.0672) grad_norm 2.6715 (3.6509) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][900/1251] eta 0:01:50 lr 0.000248 wd 0.0500 time 0.2313 (0.3155) data time 0.0006 (0.0077) model time 0.2307 (0.3078) loss 2.9259 (3.0642) grad_norm 3.2533 (3.6051) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][910/1251] eta 0:01:45 lr 0.000248 wd 0.0500 time 0.2305 (0.3103) data time 0.0006 (0.0073) model time 0.2299 (0.3030) loss 2.3693 (3.0642) grad_norm 3.1671 (3.5926) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][920/1251] eta 0:01:41 lr 0.000248 wd 0.0500 time 0.2301 (0.3056) data time 0.0008 (0.0070) model time 0.2292 (0.2986) loss 3.2687 (3.0622) grad_norm 2.7990 (3.5948) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][930/1251] eta 0:01:36 lr 0.000248 wd 0.0500 time 0.2301 (0.3015) data time 0.0009 (0.0067) model time 0.2291 
(0.2948) loss 3.4396 (3.0483) grad_norm 2.9651 (3.5941) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][940/1251] eta 0:01:32 lr 0.000248 wd 0.0500 time 0.2256 (0.2978) data time 0.0009 (0.0064) model time 0.2247 (0.2914) loss 2.9625 (3.0465) grad_norm 5.2885 (3.5868) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][950/1251] eta 0:01:28 lr 0.000248 wd 0.0500 time 0.2243 (0.2944) data time 0.0006 (0.0061) model time 0.2237 (0.2883) loss 2.5799 (3.0299) grad_norm 2.2023 (3.5847) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][960/1251] eta 0:01:24 lr 0.000247 wd 0.0500 time 0.2318 (0.2916) data time 0.0011 (0.0059) model time 0.2308 (0.2857) loss 1.9932 (3.0211) grad_norm 2.7862 (3.5742) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][970/1251] eta 0:01:21 lr 0.000247 wd 0.0500 time 0.2218 (0.2888) data time 0.0007 (0.0057) model time 0.2211 (0.2831) loss 2.7399 (3.0232) grad_norm 2.8647 (3.5480) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][980/1251] eta 0:01:17 lr 0.000247 wd 0.0500 time 0.2244 (0.2861) data time 0.0009 (0.0055) model time 0.2235 (0.2806) loss 2.4958 (3.0160) grad_norm 3.0497 (3.5181) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][990/1251] eta 0:01:14 lr 0.000247 wd 0.0500 time 0.2251 (0.2838) data time 0.0009 (0.0053) model time 0.2242 (0.2785) loss 3.5397 (3.0200) grad_norm 5.1853 (3.5104) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1000/1251] eta 0:01:10 
lr 0.000247 wd 0.0500 time 0.2214 (0.2816) data time 0.0013 (0.0051) model time 0.2201 (0.2765) loss 3.2167 (3.0136) grad_norm 3.4201 (3.4874) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1010/1251] eta 0:01:07 lr 0.000247 wd 0.0500 time 0.2244 (0.2795) data time 0.0009 (0.0050) model time 0.2235 (0.2745) loss 3.2633 (3.0006) grad_norm 2.3267 (3.4720) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1020/1251] eta 0:01:04 lr 0.000247 wd 0.0500 time 0.2288 (0.2777) data time 0.0006 (0.0048) model time 0.2281 (0.2729) loss 3.7910 (2.9959) grad_norm 2.7592 (3.4886) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1030/1251] eta 0:01:00 lr 0.000247 wd 0.0500 time 0.2299 (0.2760) data time 0.0009 (0.0047) model time 0.2290 (0.2713) loss 3.2731 (2.9963) grad_norm 3.8947 (3.5013) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1040/1251] eta 0:00:58 lr 0.000247 wd 0.0500 time 0.2263 (0.2751) data time 0.0015 (0.0046) model time 0.2248 (0.2705) loss 2.6083 (2.9886) grad_norm 4.1883 (3.5847) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1050/1251] eta 0:00:54 lr 0.000247 wd 0.0500 time 0.2200 (0.2734) data time 0.0011 (0.0045) model time 0.2189 (0.2690) loss 2.7722 (2.9832) grad_norm 2.8444 (3.5847) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1060/1251] eta 0:00:52 lr 0.000247 wd 0.0500 time 0.2299 (0.2728) data time 0.0008 (0.0044) model time 0.2291 (0.2684) loss 3.7626 (2.9854) grad_norm 5.0467 (3.5856) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1070/1251] eta 0:00:49 lr 0.000247 wd 0.0500 time 0.2352 (0.2715) data time 0.0010 (0.0043) model time 0.2341 (0.2672) loss 3.3968 (2.9933) grad_norm 3.8297 (3.5788) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1080/1251] eta 0:00:46 lr 0.000247 wd 0.0500 time 0.2337 (0.2701) data time 0.0008 (0.0042) model time 0.2329 (0.2659) loss 3.2925 (2.9905) grad_norm 3.1750 (3.5615) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1090/1251] eta 0:00:43 lr 0.000247 wd 0.0500 time 0.2312 (0.2689) data time 0.0009 (0.0041) model time 0.2303 (0.2648) loss 3.1725 (2.9885) grad_norm 3.4026 (3.5586) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1100/1251] eta 0:00:40 lr 0.000247 wd 0.0500 time 0.2270 (0.2678) data time 0.0008 (0.0040) model time 0.2261 (0.2638) loss 3.5243 (2.9896) grad_norm 2.8663 (3.5458) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1110/1251] eta 0:00:37 lr 0.000247 wd 0.0500 time 0.2279 (0.2666) data time 0.0009 (0.0039) model time 0.2270 (0.2627) loss 3.5589 (2.9899) grad_norm 2.8109 (3.5480) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1120/1251] eta 0:00:34 lr 0.000247 wd 0.0500 time 0.2308 (0.2657) data time 0.0009 (0.0039) model time 0.2299 (0.2618) loss 3.4636 (2.9867) grad_norm 4.5276 (3.5464) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1130/1251] eta 0:00:32 lr 0.000247 wd 0.0500 time 0.2389 (0.2648) data time 0.0009 (0.0038) model time 0.2380 (0.2610) loss 2.0153 
(2.9829) grad_norm 3.0262 (3.5403) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1140/1251] eta 0:00:29 lr 0.000247 wd 0.0500 time 0.2502 (0.2638) data time 0.0011 (0.0037) model time 0.2491 (0.2601) loss 3.4420 (2.9794) grad_norm 2.4796 (3.5333) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1150/1251] eta 0:00:26 lr 0.000247 wd 0.0500 time 0.2410 (0.2629) data time 0.0017 (0.0036) model time 0.2393 (0.2593) loss 2.8995 (2.9846) grad_norm 2.3256 (3.5439) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1160/1251] eta 0:00:23 lr 0.000247 wd 0.0500 time 0.2373 (0.2621) data time 0.0010 (0.0036) model time 0.2363 (0.2585) loss 3.1837 (2.9893) grad_norm 3.5901 (3.5374) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1170/1251] eta 0:00:21 lr 0.000247 wd 0.0500 time 0.2295 (0.2613) data time 0.0009 (0.0035) model time 0.2286 (0.2578) loss 2.1963 (2.9861) grad_norm 1.9697 (3.5226) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1180/1251] eta 0:00:18 lr 0.000247 wd 0.0500 time 0.2422 (0.2605) data time 0.0007 (0.0035) model time 0.2415 (0.2571) loss 3.6185 (2.9926) grad_norm 2.8505 (3.5123) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1190/1251] eta 0:00:15 lr 0.000247 wd 0.0500 time 0.2330 (0.2598) data time 0.0007 (0.0034) model time 0.2322 (0.2563) loss 3.2126 (2.9978) grad_norm 2.9042 (3.5061) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1200/1251] eta 0:00:13 lr 0.000247 wd 
0.0500 time 0.2368 (0.2591) data time 0.0007 (0.0034) model time 0.2361 (0.2557) loss 2.1156 (2.9942) grad_norm 2.7268 (3.4951) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1210/1251] eta 0:00:10 lr 0.000247 wd 0.0500 time 0.2354 (0.2584) data time 0.0006 (0.0033) model time 0.2347 (0.2551) loss 3.1101 (2.9909) grad_norm 3.4715 (3.4882) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1220/1251] eta 0:00:07 lr 0.000246 wd 0.0500 time 0.2429 (0.2579) data time 0.0007 (0.0033) model time 0.2422 (0.2546) loss 3.0513 (2.9873) grad_norm 2.3421 (3.4873) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1230/1251] eta 0:00:05 lr 0.000246 wd 0.0500 time 0.2449 (0.2573) data time 0.0008 (0.0032) model time 0.2441 (0.2541) loss 3.2868 (2.9893) grad_norm 3.3700 (3.4785) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1240/1251] eta 0:00:02 lr 0.000246 wd 0.0500 time 0.2213 (0.2566) data time 0.0004 (0.0032) model time 0.2209 (0.2534) loss 2.4330 (2.9930) grad_norm 3.1944 (3.4775) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [208/300][1250/1251] eta 0:00:00 lr 0.000246 wd 0.0500 time 0.2215 (0.2557) data time 0.0004 (0.0031) model time 0.2211 (0.2526) loss 2.4204 (2.9892) grad_norm 2.2920 (3.4688) loss_scale 512.0000 (512.0000) mem 7374MB [2024-08-30 02:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 208 training takes 0:02:08 [2024-08-30 02:03:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 02:03:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 02:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.438 (0.438) Loss 0.4033 (0.4033) Acc@1 92.480 (92.480) Acc@5 98.633 (98.633) Mem 7374MB [2024-08-30 02:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.116) Loss 0.6528 (0.6479) Acc@1 87.109 (85.964) Acc@5 97.559 (97.346) Mem 7374MB [2024-08-30 02:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.088 (0.101) Loss 0.9375 (0.6764) Acc@1 77.441 (85.147) Acc@5 94.824 (97.275) Mem 7374MB [2024-08-30 02:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.095) Loss 1.2354 (0.7709) Acc@1 70.605 (83.005) Acc@5 91.797 (96.330) Mem 7374MB [2024-08-30 02:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.0586 (0.8213) Acc@1 76.074 (81.664) Acc@5 93.457 (95.763) Mem 7374MB [2024-08-30 02:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.264 Acc@5 95.716 [2024-08-30 02:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-30 02:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.26% [2024-08-30 02:03:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 02:03:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-30 02:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.447 (0.447) Loss 0.3811 (0.3811) Acc@1 93.164 (93.164) Acc@5 98.535 (98.535) Mem 7374MB [2024-08-30 02:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.114) Loss 0.5884 (0.6068) Acc@1 88.965 (87.393) Acc@5 97.656 (97.674) Mem 7374MB [2024-08-30 02:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.098) Loss 0.8711 (0.6340) Acc@1 78.125 (86.323) Acc@5 96.191 (97.642) Mem 7374MB [2024-08-30 02:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.092) Loss 1.0918 (0.7176) Acc@1 73.535 (84.233) Acc@5 93.164 (96.746) Mem 7374MB [2024-08-30 02:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.086) Loss 0.9795 (0.7603) Acc@1 76.270 (83.024) Acc@5 94.141 (96.284) Mem 7374MB [2024-08-30 02:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.568 Acc@5 96.278 [2024-08-30 02:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-08-30 02:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.57% [2024-08-30 02:04:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 02:04:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 02:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][0/1251] eta 0:17:05 lr 0.000246 wd 0.0500 time 0.8199 (0.8199) data time 0.5544 (0.5544) model time 0.0000 (0.0000) loss 3.0141 (3.0141) grad_norm 3.8070 (3.8070) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 02:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][10/1251] eta 0:05:52 lr 0.000246 wd 0.0500 time 0.2196 (0.2840) data time 0.0014 (0.0514) model time 0.0000 (0.0000) loss 3.5119 (3.3086) grad_norm 3.8549 (3.1219) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 02:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][20/1251] eta 0:05:14 lr 0.000246 wd 0.0500 time 0.2284 (0.2559) data time 0.0007 (0.0275) model time 0.0000 (0.0000) loss 1.9370 (3.0369) grad_norm 2.3904 (3.0605) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 02:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 02:04:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 02:04:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 02:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:06:35 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 02:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:06:45 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 02:06:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:06:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:06:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 209) [2024-08-30 02:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][30/1251] eta 1:46:17 lr 0.000246 wd 0.0500 time 0.2391 (5.2236) data time 0.0008 (0.2110) model time 0.0000 (0.0000) loss 2.5769 (3.4338) grad_norm 2.8279 (3.2575) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][40/1251] eta 0:28:05 lr 0.000246 wd 0.0500 time 0.2402 (1.3918) data time 0.0010 (0.0496) model time 0.0000 (0.0000) loss 2.8778 (3.1786) grad_norm 4.6421 (3.7604) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][50/1251] eta 0:17:53 lr 0.000246 wd 0.0500 time 0.2469 (0.8941) data time 0.0007 (0.0287) model time 0.0000 (0.0000) loss 3.5836 (3.1743) grad_norm 4.4544 (3.4317) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][60/1251] eta 0:13:50 lr 0.000246 wd 0.0500 time 0.2445 (0.6977) data time 0.0008 (0.0203) model time 0.2437 (0.2448) loss 3.7868 (3.1956) grad_norm 4.1118 (3.4311) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][70/1251] eta 0:11:39 lr 0.000246 wd 0.0500 time 0.2526 (0.5927) data time 0.0009 (0.0160) model time 0.2516 (0.2447) loss 2.9458 (3.1276) grad_norm 3.2169 (3.5010) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][80/1251] eta 0:10:17 lr 0.000246 wd 0.0500 time 0.2447 (0.5272) data time 0.0009 (0.0132) model time 0.2437 (0.2446) loss 2.9597 (3.1186) grad_norm 2.5365 (3.4492) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][90/1251] eta 0:09:20 lr 0.000246 wd 0.0500 time 0.2422 (0.4824) data time 0.0011 (0.0113) model time 0.2412 (0.2444) loss 2.8063 (3.0946) grad_norm 2.4121 (3.6444) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][100/1251] eta 0:08:37 lr 0.000246 wd 0.0500 time 0.2347 (0.4497) data time 0.0010 (0.0099) model time 0.2337 (0.2440) loss 2.9819 (3.0695) grad_norm 3.9315 (3.6303) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][110/1251] eta 0:08:04 lr 0.000246 wd 0.0500 time 0.2332 (0.4249) data time 0.0009 (0.0088) model time 0.2323 (0.2438) loss 2.2454 (3.0525) grad_norm 3.5317 (3.6561) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][120/1251] eta 0:07:38 lr 0.000246 wd 0.0500 time 0.2415 (0.4055) data time 0.0009 (0.0080) model time 0.2406 (0.2437) loss 3.2702 (3.0543) grad_norm 2.7966 (3.6253) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][130/1251] eta 0:07:17 lr 0.000246 wd 0.0500 time 0.2483 (0.3900) data time 0.0009 (0.0073) model time 0.2474 (0.2438) loss 3.8471 (3.0883) 
grad_norm 3.1797 (3.6045) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][140/1251] eta 0:06:58 lr 0.000246 wd 0.0500 time 0.2353 (0.3767) data time 0.0007 (0.0068) model time 0.2346 (0.2433) loss 3.1861 (3.0775) grad_norm 2.9621 (3.5662) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][150/1251] eta 0:06:43 lr 0.000246 wd 0.0500 time 0.2428 (0.3661) data time 0.0010 (0.0063) model time 0.2418 (0.2434) loss 3.2265 (3.0687) grad_norm 2.9356 (3.5166) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][160/1251] eta 0:06:29 lr 0.000246 wd 0.0500 time 0.2341 (0.3570) data time 0.0009 (0.0059) model time 0.2332 (0.2436) loss 3.1149 (3.0587) grad_norm 3.5805 (3.5263) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][170/1251] eta 0:06:17 lr 0.000246 wd 0.0500 time 0.2466 (0.3490) data time 0.0010 (0.0056) model time 0.2456 (0.2434) loss 3.2188 (3.0514) grad_norm 3.5148 (3.6185) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][180/1251] eta 0:06:06 lr 0.000246 wd 0.0500 time 0.2382 (0.3419) data time 0.0007 (0.0053) model time 0.2375 (0.2430) loss 3.1432 (3.0456) grad_norm 3.1462 (3.5884) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][190/1251] eta 0:05:56 lr 0.000246 wd 0.0500 time 0.2551 (0.3360) data time 0.0008 (0.0050) model time 0.2543 (0.2431) loss 2.3940 (3.0479) grad_norm 4.4089 (3.5913) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][200/1251] eta 0:05:47 lr 0.000246 wd 0.0500 time 
0.2489 (0.3306) data time 0.0009 (0.0048) model time 0.2479 (0.2431) loss 3.4592 (3.0453) grad_norm 2.5634 (3.5723) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][210/1251] eta 0:05:39 lr 0.000246 wd 0.0500 time 0.2443 (0.3258) data time 0.0011 (0.0046) model time 0.2433 (0.2430) loss 3.3959 (3.0333) grad_norm 4.9376 (3.5674) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][220/1251] eta 0:05:31 lr 0.000246 wd 0.0500 time 0.2404 (0.3215) data time 0.0008 (0.0044) model time 0.2396 (0.2429) loss 3.0582 (3.0257) grad_norm 3.4483 (3.6073) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][230/1251] eta 0:05:24 lr 0.000245 wd 0.0500 time 0.2413 (0.3176) data time 0.0009 (0.0043) model time 0.2405 (0.2428) loss 1.8816 (3.0095) grad_norm 2.3886 (3.6044) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][240/1251] eta 0:05:17 lr 0.000245 wd 0.0500 time 0.2484 (0.3141) data time 0.0010 (0.0041) model time 0.2473 (0.2427) loss 1.8569 (3.0033) grad_norm 4.3061 (3.6170) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][250/1251] eta 0:05:11 lr 0.000245 wd 0.0500 time 0.2364 (0.3110) data time 0.0008 (0.0040) model time 0.2357 (0.2428) loss 2.4003 (3.0004) grad_norm 4.1160 (3.6257) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][260/1251] eta 0:05:05 lr 0.000245 wd 0.0500 time 0.2459 (0.3080) data time 0.0009 (0.0038) model time 0.2450 (0.2426) loss 2.2732 (2.9985) grad_norm 4.0405 (3.6064) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [209/300][270/1251] eta 0:04:59 lr 0.000245 wd 0.0500 time 0.2342 (0.3052) data time 0.0011 (0.0037) model time 0.2332 (0.2425) loss 3.3654 (3.0008) grad_norm 2.2481 (3.5858) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][280/1251] eta 0:04:53 lr 0.000245 wd 0.0500 time 0.2315 (0.3027) data time 0.0011 (0.0036) model time 0.2304 (0.2424) loss 3.5132 (2.9950) grad_norm 3.0115 (3.5599) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][290/1251] eta 0:04:48 lr 0.000245 wd 0.0500 time 0.2447 (0.3005) data time 0.0008 (0.0035) model time 0.2439 (0.2425) loss 2.6776 (2.9836) grad_norm 8.8446 (3.5716) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][300/1251] eta 0:04:43 lr 0.000245 wd 0.0500 time 0.2506 (0.2983) data time 0.0008 (0.0034) model time 0.2498 (0.2424) loss 3.2344 (2.9809) grad_norm 2.7191 (3.6046) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][310/1251] eta 0:04:38 lr 0.000245 wd 0.0500 time 0.2402 (0.2964) data time 0.0010 (0.0034) model time 0.2392 (0.2424) loss 3.1916 (2.9781) grad_norm 3.0102 (3.5991) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][320/1251] eta 0:04:35 lr 0.000245 wd 0.0500 time 0.2530 (0.2954) data time 0.0010 (0.0033) model time 0.2519 (0.2433) loss 2.8918 (2.9727) grad_norm 4.0088 (3.6029) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][330/1251] eta 0:04:30 lr 0.000245 wd 0.0500 time 0.2471 (0.2937) data time 0.0009 (0.0032) model time 0.2462 (0.2432) loss 2.9043 (2.9646) grad_norm 5.1154 
(3.6055) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][340/1251] eta 0:04:26 lr 0.000245 wd 0.0500 time 0.2401 (0.2930) data time 0.0007 (0.0032) model time 0.2394 (0.2441) loss 3.5766 (2.9668) grad_norm 2.9544 (3.6019) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][350/1251] eta 0:04:22 lr 0.000245 wd 0.0500 time 0.2466 (0.2914) data time 0.0011 (0.0031) model time 0.2455 (0.2440) loss 3.1889 (2.9757) grad_norm 2.3215 (3.6094) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][360/1251] eta 0:04:18 lr 0.000245 wd 0.0500 time 0.2431 (0.2900) data time 0.0010 (0.0030) model time 0.2421 (0.2440) loss 3.3493 (2.9749) grad_norm 6.7449 (3.6007) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][370/1251] eta 0:04:14 lr 0.000245 wd 0.0500 time 0.2366 (0.2886) data time 0.0012 (0.0030) model time 0.2353 (0.2440) loss 3.1632 (2.9776) grad_norm 2.7777 (3.5870) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][380/1251] eta 0:04:10 lr 0.000245 wd 0.0500 time 0.2452 (0.2873) data time 0.0008 (0.0029) model time 0.2444 (0.2438) loss 3.2918 (2.9789) grad_norm 2.5279 (3.5724) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][390/1251] eta 0:04:06 lr 0.000245 wd 0.0500 time 0.2476 (0.2861) data time 0.0007 (0.0029) model time 0.2469 (0.2438) loss 3.1643 (2.9793) grad_norm 3.1900 (3.5682) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][400/1251] eta 0:04:02 lr 0.000245 wd 0.0500 time 0.2414 (0.2850) data 
time 0.0008 (0.0028) model time 0.2406 (0.2438) loss 3.1205 (2.9733) grad_norm 3.0933 (3.5619) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][410/1251] eta 0:03:58 lr 0.000245 wd 0.0500 time 0.2483 (0.2839) data time 0.0009 (0.0028) model time 0.2474 (0.2438) loss 2.2666 (2.9680) grad_norm 3.1127 (3.5454) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][420/1251] eta 0:03:55 lr 0.000245 wd 0.0500 time 0.2390 (0.2828) data time 0.0011 (0.0028) model time 0.2379 (0.2437) loss 3.3738 (2.9661) grad_norm 3.7377 (3.5338) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][430/1251] eta 0:03:51 lr 0.000245 wd 0.0500 time 0.2434 (0.2818) data time 0.0011 (0.0027) model time 0.2423 (0.2436) loss 2.9951 (2.9697) grad_norm 3.8540 (3.5235) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][440/1251] eta 0:03:47 lr 0.000245 wd 0.0500 time 0.2447 (0.2809) data time 0.0010 (0.0027) model time 0.2436 (0.2436) loss 3.3600 (2.9750) grad_norm 2.4379 (3.5174) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][450/1251] eta 0:03:44 lr 0.000245 wd 0.0500 time 0.2397 (0.2801) data time 0.0007 (0.0026) model time 0.2389 (0.2436) loss 2.4393 (2.9729) grad_norm 2.7695 (3.5010) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][460/1251] eta 0:03:40 lr 0.000245 wd 0.0500 time 0.2386 (0.2791) data time 0.0008 (0.0026) model time 0.2378 (0.2435) loss 3.4916 (2.9796) grad_norm 4.6346 (3.4989) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [209/300][470/1251] eta 0:03:37 lr 0.000245 wd 0.0500 time 0.2349 (0.2782) data time 0.0008 (0.0026) model time 0.2341 (0.2434) loss 3.2715 (2.9820) grad_norm 2.7964 (3.4922) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][480/1251] eta 0:03:33 lr 0.000245 wd 0.0500 time 0.2400 (0.2774) data time 0.0007 (0.0025) model time 0.2393 (0.2433) loss 2.2208 (2.9789) grad_norm 3.5422 (3.4798) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][490/1251] eta 0:03:30 lr 0.000245 wd 0.0500 time 0.2393 (0.2766) data time 0.0007 (0.0025) model time 0.2386 (0.2432) loss 2.6487 (2.9740) grad_norm 3.0641 (3.4850) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][500/1251] eta 0:03:27 lr 0.000244 wd 0.0500 time 0.2499 (0.2759) data time 0.0007 (0.0025) model time 0.2492 (0.2432) loss 2.8463 (2.9688) grad_norm 2.2828 (3.4732) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][510/1251] eta 0:03:23 lr 0.000244 wd 0.0500 time 0.2405 (0.2752) data time 0.0008 (0.0024) model time 0.2397 (0.2432) loss 3.5196 (2.9695) grad_norm 3.4595 (3.4742) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][520/1251] eta 0:03:20 lr 0.000244 wd 0.0500 time 0.2410 (0.2745) data time 0.0007 (0.0024) model time 0.2403 (0.2431) loss 2.4710 (2.9707) grad_norm 3.0413 (3.4731) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][530/1251] eta 0:03:17 lr 0.000244 wd 0.0500 time 0.2316 (0.2739) data time 0.0009 (0.0024) model time 0.2307 (0.2430) loss 2.3119 (2.9711) grad_norm 2.7662 (3.4914) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-30 02:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][540/1251] eta 0:03:14 lr 0.000244 wd 0.0500 time 0.2435 (0.2733) data time 0.0007 (0.0024) model time 0.2428 (0.2430) loss 3.4791 (2.9769) grad_norm 4.7132 (3.4885) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][550/1251] eta 0:03:11 lr 0.000244 wd 0.0500 time 0.2373 (0.2727) data time 0.0010 (0.0023) model time 0.2363 (0.2430) loss 3.0204 (2.9733) grad_norm 2.6472 (3.4830) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][560/1251] eta 0:03:08 lr 0.000244 wd 0.0500 time 0.2280 (0.2721) data time 0.0007 (0.0023) model time 0.2273 (0.2429) loss 3.4715 (2.9714) grad_norm 3.3729 (3.4773) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][570/1251] eta 0:03:04 lr 0.000244 wd 0.0500 time 0.2420 (0.2716) data time 0.0010 (0.0023) model time 0.2410 (0.2429) loss 2.9738 (2.9702) grad_norm 2.9897 (3.4739) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][580/1251] eta 0:03:01 lr 0.000244 wd 0.0500 time 0.2444 (0.2711) data time 0.0008 (0.0023) model time 0.2436 (0.2429) loss 3.3131 (2.9737) grad_norm 3.1776 (3.4732) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][590/1251] eta 0:02:58 lr 0.000244 wd 0.0500 time 0.2347 (0.2706) data time 0.0010 (0.0023) model time 0.2337 (0.2429) loss 3.1719 (2.9775) grad_norm 2.3145 (3.4767) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][600/1251] eta 0:02:55 lr 0.000244 wd 0.0500 time 0.2470 (0.2701) data time 0.0009 (0.0023) model 
time 0.2462 (0.2429) loss 2.7204 (2.9778) grad_norm 3.2050 (3.4703) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][610/1251] eta 0:02:52 lr 0.000244 wd 0.0500 time 0.2400 (0.2697) data time 0.0010 (0.0022) model time 0.2390 (0.2429) loss 3.0686 (2.9793) grad_norm 3.4938 (3.4719) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][620/1251] eta 0:02:49 lr 0.000244 wd 0.0500 time 0.2394 (0.2693) data time 0.0009 (0.0022) model time 0.2384 (0.2429) loss 2.1132 (2.9781) grad_norm 2.5044 (3.4740) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][630/1251] eta 0:02:46 lr 0.000244 wd 0.0500 time 0.2474 (0.2689) data time 0.0010 (0.0022) model time 0.2465 (0.2429) loss 1.9314 (2.9747) grad_norm 2.7019 (3.4786) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][640/1251] eta 0:02:44 lr 0.000244 wd 0.0500 time 0.2467 (0.2685) data time 0.0010 (0.0022) model time 0.2457 (0.2429) loss 3.0157 (2.9746) grad_norm 3.8737 (3.4847) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][650/1251] eta 0:02:41 lr 0.000244 wd 0.0500 time 0.2399 (0.2680) data time 0.0008 (0.0022) model time 0.2391 (0.2429) loss 3.5988 (2.9783) grad_norm 2.9218 (3.4850) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][660/1251] eta 0:02:38 lr 0.000244 wd 0.0500 time 0.2453 (0.2676) data time 0.0008 (0.0022) model time 0.2446 (0.2428) loss 3.4168 (2.9817) grad_norm 3.2792 (3.4846) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][670/1251] 
eta 0:02:35 lr 0.000244 wd 0.0500 time 0.2455 (0.2672) data time 0.0007 (0.0022) model time 0.2447 (0.2428) loss 3.4971 (2.9788) grad_norm 2.8086 (3.4812) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][680/1251] eta 0:02:32 lr 0.000244 wd 0.0500 time 0.2453 (0.2669) data time 0.0008 (0.0022) model time 0.2445 (0.2428) loss 3.6259 (2.9786) grad_norm 3.1602 (3.4788) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][690/1251] eta 0:02:29 lr 0.000244 wd 0.0500 time 0.2457 (0.2665) data time 0.0010 (0.0021) model time 0.2447 (0.2428) loss 2.6811 (2.9741) grad_norm 2.5216 (3.5001) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][700/1251] eta 0:02:26 lr 0.000244 wd 0.0500 time 0.2381 (0.2662) data time 0.0008 (0.0021) model time 0.2374 (0.2428) loss 3.5832 (2.9772) grad_norm 2.6850 (3.4959) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][710/1251] eta 0:02:23 lr 0.000244 wd 0.0500 time 0.2429 (0.2659) data time 0.0008 (0.0021) model time 0.2421 (0.2429) loss 3.0007 (2.9778) grad_norm 4.2413 (3.4975) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][720/1251] eta 0:02:21 lr 0.000244 wd 0.0500 time 0.2493 (0.2657) data time 0.0010 (0.0021) model time 0.2483 (0.2429) loss 3.2117 (2.9758) grad_norm 2.5883 (3.4947) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][730/1251] eta 0:02:18 lr 0.000244 wd 0.0500 time 0.2539 (0.2654) data time 0.0009 (0.0021) model time 0.2530 (0.2429) loss 3.3782 (2.9732) grad_norm 2.8213 (3.4894) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 
02:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][740/1251] eta 0:02:15 lr 0.000244 wd 0.0500 time 0.2521 (0.2651) data time 0.0008 (0.0021) model time 0.2513 (0.2429) loss 2.8165 (2.9697) grad_norm 3.1398 (3.4931) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][750/1251] eta 0:02:12 lr 0.000244 wd 0.0500 time 0.2598 (0.2648) data time 0.0008 (0.0021) model time 0.2590 (0.2429) loss 3.0202 (2.9687) grad_norm 3.2549 (3.4895) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][760/1251] eta 0:02:09 lr 0.000243 wd 0.0500 time 0.2449 (0.2645) data time 0.0010 (0.0020) model time 0.2439 (0.2429) loss 2.8384 (2.9697) grad_norm 4.8511 (3.4988) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][770/1251] eta 0:02:07 lr 0.000243 wd 0.0500 time 0.2467 (0.2642) data time 0.0011 (0.0020) model time 0.2456 (0.2429) loss 3.2238 (2.9707) grad_norm 2.8778 (3.5025) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][780/1251] eta 0:02:04 lr 0.000243 wd 0.0500 time 0.2454 (0.2638) data time 0.0010 (0.0020) model time 0.2445 (0.2428) loss 2.3362 (2.9695) grad_norm 2.8806 (3.5099) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][790/1251] eta 0:02:01 lr 0.000243 wd 0.0500 time 0.2470 (0.2635) data time 0.0008 (0.0020) model time 0.2462 (0.2427) loss 2.7113 (2.9707) grad_norm 2.3612 (3.5097) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][800/1251] eta 0:01:58 lr 0.000243 wd 0.0500 time 0.2506 (0.2633) data time 0.0008 (0.0020) model time 0.2499 (0.2427) loss 3.4353 
(2.9731) grad_norm 3.2218 (3.5176) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][810/1251] eta 0:01:55 lr 0.000243 wd 0.0500 time 0.2449 (0.2630) data time 0.0010 (0.0020) model time 0.2439 (0.2427) loss 2.9810 (2.9736) grad_norm 3.2767 (3.5161) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][820/1251] eta 0:01:53 lr 0.000243 wd 0.0500 time 0.2614 (0.2627) data time 0.0010 (0.0020) model time 0.2604 (0.2427) loss 3.3958 (2.9736) grad_norm 3.1481 (3.5113) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][830/1251] eta 0:01:50 lr 0.000243 wd 0.0500 time 0.2439 (0.2624) data time 0.0009 (0.0020) model time 0.2429 (0.2426) loss 2.2113 (2.9718) grad_norm 2.7781 (3.5076) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][840/1251] eta 0:01:47 lr 0.000243 wd 0.0500 time 0.2369 (0.2624) data time 0.0008 (0.0020) model time 0.2361 (0.2429) loss 2.3592 (2.9682) grad_norm 4.5931 (3.5073) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][850/1251] eta 0:01:45 lr 0.000243 wd 0.0500 time 0.4118 (0.2624) data time 0.0010 (0.0019) model time 0.4108 (0.2430) loss 2.6252 (2.9680) grad_norm 3.7825 (3.5088) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][860/1251] eta 0:01:42 lr 0.000243 wd 0.0500 time 0.2391 (0.2621) data time 0.0008 (0.0019) model time 0.2382 (0.2430) loss 2.9874 (2.9652) grad_norm 2.5325 (3.5087) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][870/1251] eta 0:01:39 lr 0.000243 wd 0.0500 
time 0.2375 (0.2618) data time 0.0007 (0.0019) model time 0.2368 (0.2429) loss 3.4296 (2.9642) grad_norm 2.9994 (3.5073) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][880/1251] eta 0:01:37 lr 0.000243 wd 0.0500 time 0.2450 (0.2616) data time 0.0008 (0.0019) model time 0.2442 (0.2429) loss 2.5957 (2.9619) grad_norm 1.9602 (3.4994) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][890/1251] eta 0:01:34 lr 0.000243 wd 0.0500 time 0.2461 (0.2614) data time 0.0010 (0.0019) model time 0.2451 (0.2429) loss 2.1182 (2.9612) grad_norm 2.8170 (3.4943) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][900/1251] eta 0:01:31 lr 0.000243 wd 0.0500 time 0.2424 (0.2612) data time 0.0008 (0.0019) model time 0.2417 (0.2429) loss 3.2773 (2.9618) grad_norm 4.4189 (3.4902) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][910/1251] eta 0:01:29 lr 0.000243 wd 0.0500 time 0.2418 (0.2610) data time 0.0011 (0.0019) model time 0.2407 (0.2429) loss 3.4019 (2.9588) grad_norm 4.4417 (3.4889) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][920/1251] eta 0:01:26 lr 0.000243 wd 0.0500 time 0.2451 (0.2608) data time 0.0010 (0.0019) model time 0.2441 (0.2429) loss 2.7385 (2.9574) grad_norm 2.7167 (3.4881) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][930/1251] eta 0:01:23 lr 0.000243 wd 0.0500 time 0.2429 (0.2606) data time 0.0010 (0.0019) model time 0.2419 (0.2429) loss 3.0340 (2.9574) grad_norm 2.8259 (3.4916) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [209/300][940/1251] eta 0:01:20 lr 0.000243 wd 0.0500 time 0.2397 (0.2604) data time 0.0009 (0.0019) model time 0.2388 (0.2428) loss 3.5920 (2.9567) grad_norm 2.9296 (3.4891) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][950/1251] eta 0:01:18 lr 0.000243 wd 0.0500 time 0.2415 (0.2602) data time 0.0010 (0.0019) model time 0.2406 (0.2428) loss 2.8307 (2.9591) grad_norm 3.0338 (3.4922) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][960/1251] eta 0:01:15 lr 0.000243 wd 0.0500 time 0.2401 (0.2601) data time 0.0008 (0.0018) model time 0.2393 (0.2429) loss 2.6723 (2.9612) grad_norm 2.7802 (3.4951) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][970/1251] eta 0:01:13 lr 0.000243 wd 0.0500 time 0.2478 (0.2599) data time 0.0011 (0.0018) model time 0.2467 (0.2428) loss 2.2659 (2.9580) grad_norm 2.5525 (3.4937) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][980/1251] eta 0:01:10 lr 0.000243 wd 0.0500 time 0.2459 (0.2597) data time 0.0009 (0.0018) model time 0.2450 (0.2428) loss 3.1846 (2.9548) grad_norm 2.6065 (3.4914) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][990/1251] eta 0:01:07 lr 0.000243 wd 0.0500 time 0.2382 (0.2596) data time 0.0010 (0.0018) model time 0.2372 (0.2429) loss 3.2476 (2.9557) grad_norm 3.6582 (3.4931) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1000/1251] eta 0:01:05 lr 0.000243 wd 0.0500 time 0.2491 (0.2594) data time 0.0008 (0.0018) model time 0.2483 (0.2429) loss 3.1776 (2.9584) grad_norm 3.9993 
(3.4868) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1010/1251] eta 0:01:02 lr 0.000243 wd 0.0500 time 0.2347 (0.2593) data time 0.0008 (0.0018) model time 0.2339 (0.2429) loss 2.3392 (2.9576) grad_norm 2.4600 (3.4803) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1020/1251] eta 0:00:59 lr 0.000243 wd 0.0500 time 0.2414 (0.2591) data time 0.0007 (0.0018) model time 0.2407 (0.2429) loss 2.9702 (2.9587) grad_norm 3.8812 (3.4854) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1030/1251] eta 0:00:57 lr 0.000242 wd 0.0500 time 0.2386 (0.2590) data time 0.0009 (0.0018) model time 0.2377 (0.2429) loss 3.1054 (2.9592) grad_norm 1.9760 (3.4925) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1040/1251] eta 0:00:54 lr 0.000242 wd 0.0500 time 0.2371 (0.2588) data time 0.0009 (0.0018) model time 0.2362 (0.2429) loss 3.3654 (2.9608) grad_norm 2.4275 (3.4886) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1050/1251] eta 0:00:51 lr 0.000242 wd 0.0500 time 0.2519 (0.2587) data time 0.0008 (0.0018) model time 0.2511 (0.2429) loss 1.8661 (2.9601) grad_norm 3.7895 (3.4888) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1060/1251] eta 0:00:49 lr 0.000242 wd 0.0500 time 0.2460 (0.2585) data time 0.0007 (0.0018) model time 0.2452 (0.2429) loss 2.7025 (2.9581) grad_norm 2.4730 (3.4862) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1070/1251] eta 0:00:46 lr 0.000242 wd 0.0500 time 0.2301 
(0.2584) data time 0.0008 (0.0018) model time 0.2294 (0.2429) loss 3.4227 (2.9619) grad_norm 3.9189 (3.4842) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1080/1251] eta 0:00:44 lr 0.000242 wd 0.0500 time 0.2502 (0.2582) data time 0.0010 (0.0018) model time 0.2492 (0.2429) loss 2.9983 (2.9617) grad_norm 3.5728 (3.4850) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1090/1251] eta 0:00:41 lr 0.000242 wd 0.0500 time 0.2397 (0.2581) data time 0.0009 (0.0018) model time 0.2389 (0.2429) loss 3.6026 (2.9637) grad_norm 3.1157 (3.4827) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1100/1251] eta 0:00:38 lr 0.000242 wd 0.0500 time 0.2421 (0.2580) data time 0.0010 (0.0018) model time 0.2412 (0.2429) loss 2.6484 (2.9616) grad_norm 3.1286 (3.4776) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1110/1251] eta 0:00:36 lr 0.000242 wd 0.0500 time 0.2412 (0.2579) data time 0.0009 (0.0017) model time 0.2402 (0.2429) loss 3.4125 (2.9633) grad_norm 4.6622 (3.5113) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1120/1251] eta 0:00:33 lr 0.000242 wd 0.0500 time 0.2332 (0.2577) data time 0.0010 (0.0017) model time 0.2322 (0.2429) loss 3.0617 (2.9624) grad_norm 4.2562 (3.5163) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1130/1251] eta 0:00:31 lr 0.000242 wd 0.0500 time 0.2444 (0.2576) data time 0.0010 (0.0017) model time 0.2434 (0.2429) loss 2.9698 (2.9621) grad_norm 2.9224 (3.5136) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:39 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [209/300][1140/1251] eta 0:00:28 lr 0.000242 wd 0.0500 time 0.2415 (0.2574) data time 0.0008 (0.0017) model time 0.2407 (0.2429) loss 3.4927 (2.9636) grad_norm 3.7126 (3.5104) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1150/1251] eta 0:00:25 lr 0.000242 wd 0.0500 time 0.2549 (0.2573) data time 0.0012 (0.0017) model time 0.2536 (0.2428) loss 2.7089 (2.9637) grad_norm 2.3448 (3.5076) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1160/1251] eta 0:00:23 lr 0.000242 wd 0.0500 time 0.2503 (0.2572) data time 0.0010 (0.0017) model time 0.2493 (0.2428) loss 2.6900 (2.9645) grad_norm 3.5514 (3.5033) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 02:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1170/1251] eta 0:00:20 lr 0.000242 wd 0.0500 time 0.2392 (0.2571) data time 0.0007 (0.0017) model time 0.2384 (0.2428) loss 3.0209 (2.9645) grad_norm 3.5133 (3.5006) loss_scale 1024.0000 (515.1356) mem 7377MB [2024-08-30 02:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1180/1251] eta 0:00:18 lr 0.000242 wd 0.0500 time 0.2375 (0.2569) data time 0.0010 (0.0017) model time 0.2366 (0.2428) loss 3.3338 (2.9647) grad_norm 3.1207 (3.4984) loss_scale 1024.0000 (519.5490) mem 7377MB [2024-08-30 02:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1190/1251] eta 0:00:15 lr 0.000242 wd 0.0500 time 0.2386 (0.2568) data time 0.0010 (0.0017) model time 0.2377 (0.2428) loss 2.7584 (2.9636) grad_norm 3.8117 (3.4965) loss_scale 1024.0000 (523.8865) mem 7377MB [2024-08-30 02:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1200/1251] eta 0:00:13 lr 0.000242 wd 0.0500 time 0.2509 (0.2567) data time 0.0007 (0.0017) model time 0.2502 (0.2428) loss 3.4749 (2.9641) grad_norm 
3.9377 (3.4938) loss_scale 1024.0000 (528.1500) mem 7377MB [2024-08-30 02:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1210/1251] eta 0:00:10 lr 0.000242 wd 0.0500 time 0.2390 (0.2566) data time 0.0010 (0.0017) model time 0.2380 (0.2428) loss 3.6393 (2.9667) grad_norm 4.7522 (3.4927) loss_scale 1024.0000 (532.3415) mem 7377MB [2024-08-30 02:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1220/1251] eta 0:00:07 lr 0.000242 wd 0.0500 time 0.2400 (0.2565) data time 0.0009 (0.0017) model time 0.2391 (0.2428) loss 3.1870 (2.9678) grad_norm 2.9844 (3.4927) loss_scale 1024.0000 (536.4627) mem 7377MB [2024-08-30 02:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1230/1251] eta 0:00:05 lr 0.000242 wd 0.0500 time 0.2406 (0.2564) data time 0.0008 (0.0017) model time 0.2399 (0.2428) loss 3.5068 (2.9685) grad_norm 2.4825 (3.4870) loss_scale 1024.0000 (540.5154) mem 7377MB [2024-08-30 02:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1240/1251] eta 0:00:02 lr 0.000242 wd 0.0500 time 0.2287 (0.2562) data time 0.0005 (0.0017) model time 0.2282 (0.2427) loss 3.1593 (2.9693) grad_norm 3.5777 (3.4869) loss_scale 1024.0000 (544.5012) mem 7377MB [2024-08-30 02:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [209/300][1250/1251] eta 0:00:00 lr 0.000242 wd 0.0500 time 0.2285 (0.2559) data time 0.0007 (0.0017) model time 0.2278 (0.2425) loss 3.3832 (2.9687) grad_norm 2.5213 (3.4814) loss_scale 1024.0000 (548.4219) mem 7377MB [2024-08-30 02:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 209 training takes 0:05:13 [2024-08-30 02:12:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 02:12:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 02:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.450 (0.450) Loss 0.3960 (0.3960) Acc@1 92.578 (92.578) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-30 02:12:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.110) Loss 0.6758 (0.6570) Acc@1 85.840 (86.035) Acc@5 97.266 (97.381) Mem 7377MB [2024-08-30 02:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.096) Loss 0.9360 (0.6820) Acc@1 77.148 (85.040) Acc@5 95.215 (97.345) Mem 7377MB [2024-08-30 02:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 1.1875 (0.7727) Acc@1 71.094 (82.847) Acc@5 92.578 (96.299) Mem 7377MB [2024-08-30 02:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.063 (0.084) Loss 1.0576 (0.8213) Acc@1 75.000 (81.586) Acc@5 93.359 (95.746) Mem 7377MB [2024-08-30 02:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.182 Acc@5 95.684 [2024-08-30 02:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-30 02:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.812 (0.812) Loss 0.3804 (0.3804) Acc@1 93.262 (93.262) Acc@5 98.633 (98.633) Mem 7377MB [2024-08-30 02:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.148) Loss 0.5889 (0.6064) Acc@1 89.258 (87.376) Acc@5 97.656 (97.656) Mem 7377MB [2024-08-30 02:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.114) Loss 0.8701 (0.6336) Acc@1 78.516 (86.314) Acc@5 96.094 (97.619) Mem 7377MB [2024-08-30 02:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.102) Loss 1.0928 (0.7173) Acc@1 73.535 (84.227) Acc@5 93.359 (96.733) Mem 7377MB [2024-08-30 02:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 0.9805 
(0.7601) Acc@1 76.270 (83.015) Acc@5 94.238 (96.282) Mem 7377MB [2024-08-30 02:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.578 Acc@5 96.270 [2024-08-30 02:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-08-30 02:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.58% [2024-08-30 02:12:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 02:12:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-30 02:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][0/1251] eta 0:13:12 lr 0.000242 wd 0.0500 time 0.6336 (0.6336) data time 0.3794 (0.3794) model time 0.0000 (0.0000) loss 3.3011 (3.3011) grad_norm 3.5737 (3.5737) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-30 02:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][10/1251] eta 0:05:50 lr 0.000242 wd 0.0500 time 0.2457 (0.2828) data time 0.0011 (0.0355) model time 0.0000 (0.0000) loss 2.4722 (2.8573) grad_norm 2.5641 (3.1741) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][20/1251] eta 0:05:25 lr 0.000242 wd 0.0500 time 0.2465 (0.2641) data time 0.0010 (0.0191) model time 0.0000 (0.0000) loss 2.6204 (2.9187) grad_norm 3.3537 (3.2561) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][30/1251] eta 0:05:13 lr 0.000242 wd 0.0500 time 0.2360 (0.2569) data time 0.0010 (0.0133) model time 0.0000 (0.0000) loss 2.9103 (2.8511) grad_norm 3.3297 (3.3296) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][40/1251] eta 
0:05:06 lr 0.000241 wd 0.0500 time 0.2398 (0.2528) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 2.9156 (2.8909) grad_norm 2.5043 (3.3627) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][50/1251] eta 0:05:00 lr 0.000241 wd 0.0500 time 0.2504 (0.2504) data time 0.0007 (0.0086) model time 0.0000 (0.0000) loss 3.8630 (2.9561) grad_norm 7.1012 (3.3784) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 02:12:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 02:12:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 02:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:18:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:18:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:18:41 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:18:42 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:18:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][60/1251] eta 1:17:20 lr 0.000241 wd 0.0500 time 0.2274 (3.8960) data time 0.0007 (0.2911) model time 0.2267 (3.6048) loss 3.0400 (3.2181) grad_norm 2.5708 (3.3888) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][70/1251] eta 0:25:08 lr 0.000241 wd 0.0500 time 0.2309 (1.2769) data time 0.0007 (0.0839) model time 0.2302 (1.1930) loss 3.3540 (3.2232) grad_norm 3.3893 (3.7496) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][80/1251] eta 0:16:25 lr 0.000241 wd 0.0500 time 0.2312 (0.8415) data time 0.0012 (0.0495) model time 0.2300 (0.7920) loss 3.0573 (3.1954) grad_norm 2.2484 (3.6001) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][90/1251] eta 0:12:49 lr 0.000241 wd 0.0500 time 0.2205 (0.6625) data time 0.0007 (0.0352) model time 0.2198 (0.6273) loss 2.4623 (3.2049) grad_norm 6.4111 (3.6108) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][100/1251] eta 0:10:48 lr 0.000241 wd 0.0500 time 0.2362 (0.5638) data time 0.0007 (0.0274) model time 0.2354 (0.5363) loss 2.8332 (3.1683) grad_norm 2.6841 (3.5282) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[210/300][110/1251] eta 0:09:32 lr 0.000241 wd 0.0500 time 0.2275 (0.5016) data time 0.0007 (0.0226) model time 0.2268 (0.4790) loss 3.4398 (3.1732) grad_norm 3.6658 (3.5128) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][120/1251] eta 0:08:38 lr 0.000241 wd 0.0500 time 0.2237 (0.4587) data time 0.0007 (0.0192) model time 0.2230 (0.4395) loss 3.1446 (3.1253) grad_norm 3.9085 (3.5331) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][130/1251] eta 0:07:59 lr 0.000241 wd 0.0500 time 0.2265 (0.4279) data time 0.0008 (0.0168) model time 0.2257 (0.4111) loss 2.8542 (3.0715) grad_norm 2.8123 (3.5263) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][140/1251] eta 0:07:29 lr 0.000241 wd 0.0500 time 0.2314 (0.4042) data time 0.0016 (0.0149) model time 0.2299 (0.3893) loss 3.3554 (3.0546) grad_norm 2.4302 (3.4971) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][150/1251] eta 0:07:04 lr 0.000241 wd 0.0500 time 0.2223 (0.3853) data time 0.0012 (0.0135) model time 0.2210 (0.3718) loss 2.9313 (3.0511) grad_norm 5.4493 (3.4849) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][160/1251] eta 0:06:43 lr 0.000241 wd 0.0500 time 0.2235 (0.3702) data time 0.0009 (0.0123) model time 0.2225 (0.3579) loss 3.1521 (3.0697) grad_norm 4.0062 (3.4462) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][170/1251] eta 0:06:27 lr 0.000241 wd 0.0500 time 0.2477 (0.3581) data time 0.0010 (0.0113) model time 0.2468 (0.3468) loss 3.2015 (3.0602) grad_norm 3.6819 (3.4045) loss_scale 1024.0000 
(1024.0000) mem 7377MB [2024-08-30 02:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][180/1251] eta 0:06:12 lr 0.000241 wd 0.0500 time 0.2345 (0.3481) data time 0.0013 (0.0106) model time 0.2331 (0.3375) loss 2.7719 (3.0608) grad_norm 3.1630 (3.4081) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][190/1251] eta 0:05:59 lr 0.000241 wd 0.0500 time 0.2228 (0.3392) data time 0.0009 (0.0099) model time 0.2219 (0.3294) loss 3.6189 (3.0632) grad_norm 3.2815 (3.4147) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][200/1251] eta 0:05:48 lr 0.000241 wd 0.0500 time 0.2229 (0.3316) data time 0.0009 (0.0093) model time 0.2220 (0.3223) loss 2.7054 (3.0539) grad_norm 9.0884 (3.5588) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][210/1251] eta 0:05:38 lr 0.000241 wd 0.0500 time 0.2347 (0.3249) data time 0.0007 (0.0087) model time 0.2340 (0.3162) loss 2.5757 (3.0470) grad_norm 2.5027 (3.6820) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][220/1251] eta 0:05:28 lr 0.000241 wd 0.0500 time 0.2234 (0.3189) data time 0.0008 (0.0083) model time 0.2226 (0.3106) loss 2.9228 (3.0448) grad_norm 3.1119 (3.6594) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][230/1251] eta 0:05:20 lr 0.000241 wd 0.0500 time 0.2269 (0.3136) data time 0.0009 (0.0079) model time 0.2260 (0.3058) loss 2.1995 (3.0385) grad_norm 2.9385 (3.6353) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][240/1251] eta 0:05:12 lr 0.000241 wd 0.0500 time 0.2415 (0.3092) data time 0.0010 
(0.0075) model time 0.2405 (0.3017) loss 2.5981 (3.0275) grad_norm 2.1640 (3.6095) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][250/1251] eta 0:05:05 lr 0.000241 wd 0.0500 time 0.2317 (0.3050) data time 0.0007 (0.0072) model time 0.2309 (0.2979) loss 2.6375 (3.0238) grad_norm 2.9408 (3.5892) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][260/1251] eta 0:04:58 lr 0.000241 wd 0.0500 time 0.2393 (0.3014) data time 0.0010 (0.0069) model time 0.2383 (0.2945) loss 2.6276 (3.0064) grad_norm 4.1957 (3.5807) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][270/1251] eta 0:04:52 lr 0.000241 wd 0.0500 time 0.2292 (0.2980) data time 0.0007 (0.0066) model time 0.2284 (0.2914) loss 2.9381 (2.9963) grad_norm 3.1211 (3.5820) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][280/1251] eta 0:04:46 lr 0.000241 wd 0.0500 time 0.2231 (0.2950) data time 0.0012 (0.0064) model time 0.2219 (0.2886) loss 3.0664 (2.9953) grad_norm 3.3033 (3.5817) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][290/1251] eta 0:04:40 lr 0.000241 wd 0.0500 time 0.2272 (0.2922) data time 0.0010 (0.0061) model time 0.2262 (0.2860) loss 1.9557 (2.9846) grad_norm 3.6942 (3.6295) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][300/1251] eta 0:04:35 lr 0.000241 wd 0.0500 time 0.2278 (0.2895) data time 0.0007 (0.0059) model time 0.2271 (0.2836) loss 2.0314 (2.9885) grad_norm 3.3496 (3.6276) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [210/300][310/1251] eta 0:04:30 lr 0.000240 wd 0.0500 time 0.2256 (0.2871) data time 0.0008 (0.0057) model time 0.2247 (0.2814) loss 2.3724 (2.9758) grad_norm 2.9730 (3.6117) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][320/1251] eta 0:04:25 lr 0.000240 wd 0.0500 time 0.2237 (0.2849) data time 0.0009 (0.0056) model time 0.2228 (0.2794) loss 3.1815 (2.9711) grad_norm 3.7406 (3.6055) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][330/1251] eta 0:04:20 lr 0.000240 wd 0.0500 time 0.2338 (0.2830) data time 0.0008 (0.0054) model time 0.2329 (0.2777) loss 3.0204 (2.9727) grad_norm 4.6651 (3.5995) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][340/1251] eta 0:04:16 lr 0.000240 wd 0.0500 time 0.2184 (0.2811) data time 0.0011 (0.0052) model time 0.2173 (0.2759) loss 2.1398 (2.9694) grad_norm 2.5977 (3.6080) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][350/1251] eta 0:04:12 lr 0.000240 wd 0.0500 time 0.2282 (0.2801) data time 0.0010 (0.0051) model time 0.2272 (0.2750) loss 2.6119 (2.9656) grad_norm 4.1453 (3.5984) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][360/1251] eta 0:04:08 lr 0.000240 wd 0.0500 time 0.2225 (0.2784) data time 0.0012 (0.0050) model time 0.2212 (0.2734) loss 3.3661 (2.9603) grad_norm 2.5417 (3.5756) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][370/1251] eta 0:04:04 lr 0.000240 wd 0.0500 time 0.2264 (0.2776) data time 0.0011 (0.0049) model time 0.2253 (0.2728) loss 3.2901 (2.9566) grad_norm 2.9688 (3.5681) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][380/1251] eta 0:04:00 lr 0.000240 wd 0.0500 time 0.2360 (0.2761) data time 0.0007 (0.0047) model time 0.2353 (0.2714) loss 3.4226 (2.9700) grad_norm 3.6448 (3.5503) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][390/1251] eta 0:03:56 lr 0.000240 wd 0.0500 time 0.2298 (0.2748) data time 0.0010 (0.0046) model time 0.2288 (0.2702) loss 2.9149 (2.9664) grad_norm 2.5615 (3.5281) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][400/1251] eta 0:03:52 lr 0.000240 wd 0.0500 time 0.2341 (0.2735) data time 0.0010 (0.0045) model time 0.2332 (0.2689) loss 3.0807 (2.9705) grad_norm 3.0447 (3.5150) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][410/1251] eta 0:03:48 lr 0.000240 wd 0.0500 time 0.2261 (0.2722) data time 0.0008 (0.0044) model time 0.2253 (0.2678) loss 3.0413 (2.9736) grad_norm 3.1618 (3.4996) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][420/1251] eta 0:03:45 lr 0.000240 wd 0.0500 time 0.2284 (0.2710) data time 0.0008 (0.0043) model time 0.2276 (0.2667) loss 2.0807 (2.9695) grad_norm 3.6418 (3.4842) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][430/1251] eta 0:03:41 lr 0.000240 wd 0.0500 time 0.2253 (0.2699) data time 0.0010 (0.0043) model time 0.2243 (0.2656) loss 3.9175 (2.9710) grad_norm 2.6964 (3.4739) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][440/1251] eta 0:03:37 lr 0.000240 wd 0.0500 time 0.2325 (0.2688) data time 
0.0009 (0.0042) model time 0.2316 (0.2646) loss 2.2763 (2.9675) grad_norm 2.9948 (3.4708) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][450/1251] eta 0:03:34 lr 0.000240 wd 0.0500 time 0.2302 (0.2678) data time 0.0011 (0.0041) model time 0.2292 (0.2637) loss 3.2006 (2.9681) grad_norm 3.0407 (3.4633) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][460/1251] eta 0:03:30 lr 0.000240 wd 0.0500 time 0.2297 (0.2667) data time 0.0007 (0.0040) model time 0.2290 (0.2627) loss 2.9225 (2.9717) grad_norm 2.1899 (3.4657) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][470/1251] eta 0:03:27 lr 0.000240 wd 0.0500 time 0.2235 (0.2658) data time 0.0010 (0.0040) model time 0.2225 (0.2619) loss 2.4015 (2.9739) grad_norm 3.1649 (3.4803) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][480/1251] eta 0:03:24 lr 0.000240 wd 0.0500 time 0.2235 (0.2649) data time 0.0010 (0.0039) model time 0.2225 (0.2610) loss 3.4002 (2.9745) grad_norm 2.4723 (3.4682) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][490/1251] eta 0:03:20 lr 0.000240 wd 0.0500 time 0.2316 (0.2641) data time 0.0009 (0.0038) model time 0.2308 (0.2603) loss 3.3683 (2.9810) grad_norm 2.9066 (3.4837) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][500/1251] eta 0:03:17 lr 0.000240 wd 0.0500 time 0.2311 (0.2633) data time 0.0009 (0.0038) model time 0.2303 (0.2595) loss 3.4461 (2.9817) grad_norm 3.1027 (3.4848) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [210/300][510/1251] eta 0:03:14 lr 0.000240 wd 0.0500 time 0.2316 (0.2625) data time 0.0007 (0.0037) model time 0.2310 (0.2588) loss 3.7981 (2.9792) grad_norm 3.8440 (3.4813) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][520/1251] eta 0:03:11 lr 0.000240 wd 0.0500 time 0.2457 (0.2618) data time 0.0007 (0.0036) model time 0.2449 (0.2581) loss 3.0278 (2.9733) grad_norm 2.4747 (3.4719) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][530/1251] eta 0:03:08 lr 0.000240 wd 0.0500 time 0.2256 (0.2611) data time 0.0013 (0.0036) model time 0.2243 (0.2575) loss 3.0044 (2.9668) grad_norm 2.9161 (3.4647) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][540/1251] eta 0:03:05 lr 0.000240 wd 0.0500 time 0.2385 (0.2604) data time 0.0015 (0.0035) model time 0.2370 (0.2568) loss 2.6315 (2.9681) grad_norm 2.2531 (3.4560) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][550/1251] eta 0:03:02 lr 0.000240 wd 0.0500 time 0.2284 (0.2597) data time 0.0009 (0.0035) model time 0.2275 (0.2562) loss 2.9869 (2.9685) grad_norm 2.2619 (3.4498) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][560/1251] eta 0:02:59 lr 0.000240 wd 0.0500 time 0.2323 (0.2590) data time 0.0009 (0.0034) model time 0.2314 (0.2556) loss 2.9546 (2.9669) grad_norm 2.8507 (3.4409) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][570/1251] eta 0:02:56 lr 0.000240 wd 0.0500 time 0.2280 (0.2585) data time 0.0011 (0.0034) model time 0.2268 (0.2551) loss 3.1511 (2.9744) grad_norm 3.3713 (3.4737) 
loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 02:21:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 02:21:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 02:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:23:11 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:23:24 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:23:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:23:27 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:23:27 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:25:37 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:25:46 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:25:47 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:25:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:25:49 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][580/1251] eta 1:19:44 lr 0.000239 wd 0.0500 time 0.5052 (7.1310) data time 0.0008 (0.4231) model time 0.5044 (6.7079) loss 3.3751 (3.4836) grad_norm 4.5034 (4.1717) loss_scale 1024.0000 (1024.0000) mem 7375MB [2024-08-30 02:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][590/1251] eta 0:15:09 lr 0.000239 wd 0.0500 time 0.2228 (1.3760) data time 0.0009 (0.0714) model time 0.2219 (1.3046) loss 2.4432 (3.2477) grad_norm 4.6780 (3.5869) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][600/1251] eta 0:09:15 lr 0.000239 wd 0.0500 time 0.2232 (0.8535) data time 0.0010 (0.0394) model time 0.2222 (0.8141) loss 2.9025 (3.1704) grad_norm 3.3321 (4.0149) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][610/1251] eta 0:07:01 lr 0.000239 wd 0.0500 time 0.2258 (0.6575) data time 0.0007 (0.0274) model time 0.2251 (0.6301) loss 3.5127 (3.2055) grad_norm 3.3274 (3.9250) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][620/1251] eta 0:05:50 lr 0.000239 wd 0.0500 time 0.2271 (0.5551) data time 0.0009 (0.0211) model time 0.2262 (0.5339) loss 2.9756 (3.1444) grad_norm 3.4014 (3.7586) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [210/300][630/1251] eta 0:05:05 lr 0.000239 wd 0.0500 time 0.2374 (0.4923) data time 0.0009 (0.0172) model time 0.2365 (0.4751) loss 2.8355 (3.1162) grad_norm 3.0518 (3.6477) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][640/1251] eta 0:04:34 lr 0.000239 wd 0.0500 time 0.2282 (0.4495) data time 0.0009 (0.0146) model time 0.2273 (0.4349) loss 3.1867 (3.0783) grad_norm 3.4526 (3.5274) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][650/1251] eta 0:04:11 lr 0.000239 wd 0.0500 time 0.2296 (0.4190) data time 0.0010 (0.0127) model time 0.2286 (0.4063) loss 3.1707 (3.0430) grad_norm 3.1561 (3.4495) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][660/1251] eta 0:03:53 lr 0.000239 wd 0.0500 time 0.2281 (0.3957) data time 0.0010 (0.0113) model time 0.2271 (0.3844) loss 2.7698 (3.0306) grad_norm 3.7103 (3.4319) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][670/1251] eta 0:03:39 lr 0.000239 wd 0.0500 time 0.2330 (0.3774) data time 0.0010 (0.0102) model time 0.2320 (0.3672) loss 2.0010 (3.0090) grad_norm 3.8584 (3.4355) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][680/1251] eta 0:03:27 lr 0.000239 wd 0.0500 time 0.2268 (0.3627) data time 0.0007 (0.0093) model time 0.2260 (0.3534) loss 3.5681 (3.0327) grad_norm 2.6054 (3.4298) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][690/1251] eta 0:03:16 lr 0.000239 wd 0.0500 time 0.2388 (0.3509) data time 0.0009 (0.0085) model time 0.2380 (0.3423) loss 3.2559 (3.0319) grad_norm 2.6845 (3.4335) loss_scale 
1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][700/1251] eta 0:03:07 lr 0.000239 wd 0.0500 time 0.2314 (0.3409) data time 0.0010 (0.0079) model time 0.2303 (0.3330) loss 3.2192 (3.0317) grad_norm 2.9924 (3.4130) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][710/1251] eta 0:02:59 lr 0.000239 wd 0.0500 time 0.2310 (0.3324) data time 0.0009 (0.0074) model time 0.2301 (0.3250) loss 3.1457 (3.0197) grad_norm 3.7439 (3.4149) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][720/1251] eta 0:02:52 lr 0.000239 wd 0.0500 time 0.2392 (0.3253) data time 0.0008 (0.0070) model time 0.2384 (0.3183) loss 2.9959 (3.0042) grad_norm 3.5105 (3.4103) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][730/1251] eta 0:02:46 lr 0.000239 wd 0.0500 time 0.2284 (0.3189) data time 0.0008 (0.0066) model time 0.2276 (0.3123) loss 3.2267 (2.9952) grad_norm 2.7872 (3.4287) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][740/1251] eta 0:02:40 lr 0.000239 wd 0.0500 time 0.2321 (0.3133) data time 0.0009 (0.0062) model time 0.2312 (0.3071) loss 2.9475 (2.9972) grad_norm 2.3089 (3.3919) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][750/1251] eta 0:02:34 lr 0.000239 wd 0.0500 time 0.2283 (0.3084) data time 0.0009 (0.0059) model time 0.2274 (0.3025) loss 2.7816 (2.9909) grad_norm 3.2640 (3.3727) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][760/1251] eta 0:02:29 lr 0.000239 wd 0.0500 time 0.2259 (0.3040) data time 
0.0008 (0.0057) model time 0.2251 (0.2984) loss 2.9850 (2.9839) grad_norm 4.3257 (3.3861) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][770/1251] eta 0:02:24 lr 0.000239 wd 0.0500 time 0.2286 (0.3000) data time 0.0009 (0.0054) model time 0.2277 (0.2946) loss 2.8336 (2.9810) grad_norm 4.0804 (3.3918) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][780/1251] eta 0:02:19 lr 0.000239 wd 0.0500 time 0.2209 (0.2965) data time 0.0011 (0.0052) model time 0.2199 (0.2913) loss 2.8996 (2.9683) grad_norm 2.3716 (3.4119) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][790/1251] eta 0:02:15 lr 0.000239 wd 0.0500 time 0.2317 (0.2933) data time 0.0010 (0.0050) model time 0.2308 (0.2883) loss 2.8753 (2.9649) grad_norm 3.0474 (3.4052) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][800/1251] eta 0:02:10 lr 0.000239 wd 0.0500 time 0.2284 (0.2905) data time 0.0009 (0.0048) model time 0.2275 (0.2856) loss 3.7631 (2.9637) grad_norm 3.8726 (3.4122) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][810/1251] eta 0:02:06 lr 0.000239 wd 0.0500 time 0.2322 (0.2879) data time 0.0007 (0.0047) model time 0.2315 (0.2832) loss 3.2292 (2.9629) grad_norm 3.3209 (3.4085) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][820/1251] eta 0:02:02 lr 0.000239 wd 0.0500 time 0.2266 (0.2854) data time 0.0006 (0.0045) model time 0.2260 (0.2808) loss 3.8344 (2.9645) grad_norm 3.4383 (3.4070) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [210/300][830/1251] eta 0:01:59 lr 0.000239 wd 0.0500 time 0.2301 (0.2832) data time 0.0008 (0.0044) model time 0.2293 (0.2789) loss 3.5537 (2.9555) grad_norm 3.0755 (3.4032) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][840/1251] eta 0:01:55 lr 0.000238 wd 0.0500 time 0.2280 (0.2811) data time 0.0009 (0.0042) model time 0.2271 (0.2768) loss 3.3913 (2.9478) grad_norm 3.6038 (3.3993) loss_scale 1024.0000 (1024.0000) mem 7376MB [2024-08-30 02:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 02:27:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 02:27:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 02:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:29:32 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:29:40 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:29:42 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:29:43 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:29:43 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:40:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:40:29 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:40:43 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:40:44 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:40:45 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:40:45 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:42:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:42:55 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:42:57 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:42:58 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:42:58 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:45:53 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:46:02 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:46:04 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:46:05 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:46:05 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:47:49 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:47:55 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:47:56 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:47:58 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:47:58 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 210) [2024-08-30 02:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 02:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][850/1251] eta 0:52:13 lr 0.000238 wd 0.0500 time 1.0855 (7.8143) data time 0.0007 (0.7233) model time 1.0848 (7.0910) loss 3.7652 (3.6890) grad_norm 2.4274 (3.8512) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-08-30 02:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][860/1251] eta 0:09:42 lr 0.000238 wd 0.0500 time 0.2243 (1.4894) data time 0.0008 (0.1214) model time 0.2236 (1.3680) loss 2.8660 (3.2217) grad_norm 2.9552 (3.3717) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][870/1251] eta 0:05:48 lr 0.000238 wd 0.0500 time 0.2282 (0.9140) data time 0.0008 (0.0667) model time 0.2273 (0.8474) loss 3.2000 (3.1957) grad_norm 2.2310 (3.2665) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][880/1251] eta 0:04:19 lr 0.000238 wd 0.0500 time 0.2285 (0.6986) data time 0.0006 (0.0461) model time 0.2279 (0.6525) loss 3.1040 (3.1882) grad_norm 3.5637 (3.3150) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][890/1251] eta 0:03:31 lr 0.000238 wd 0.0500 time 0.2326 (0.5859) data time 0.0008 (0.0354) model time 0.2319 (0.5505) loss 3.2666 (3.1568) grad_norm 3.1128 (3.3552) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [210/300][900/1251] eta 0:03:01 lr 0.000238 wd 0.0500 time 0.2237 (0.5162) data time 0.0008 (0.0288) model time 0.2229 (0.4875) loss 3.1427 (3.1270) grad_norm 3.8087 (3.3915) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][910/1251] eta 0:02:40 lr 0.000238 wd 0.0500 time 0.2486 (0.4694) data time 0.0007 (0.0243) model time 0.2479 (0.4451) loss 3.5366 (3.0961) grad_norm 2.9668 (3.4639) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][920/1251] eta 0:02:23 lr 0.000238 wd 0.0500 time 0.2260 (0.4349) data time 0.0009 (0.0210) model time 0.2251 (0.4139) loss 2.7154 (3.0497) grad_norm 4.2623 (3.5410) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][930/1251] eta 0:02:11 lr 0.000238 wd 0.0500 time 0.2192 (0.4092) data time 0.0009 (0.0186) model time 0.2182 (0.3906) loss 3.0128 (3.0365) grad_norm 2.7667 (3.5378) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][940/1251] eta 0:02:01 lr 0.000238 wd 0.0500 time 0.2259 (0.3892) data time 0.0006 (0.0167) model time 0.2253 (0.3725) loss 2.1207 (3.0185) grad_norm 2.5667 (3.4680) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][950/1251] eta 0:01:52 lr 0.000238 wd 0.0500 time 0.2211 (0.3729) data time 0.0007 (0.0151) model time 0.2204 (0.3578) loss 3.4348 (3.0363) grad_norm 3.2659 (3.4467) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][960/1251] eta 0:01:44 lr 0.000238 wd 0.0500 time 0.2222 (0.3596) data time 0.0009 (0.0139) model time 0.2213 (0.3458) loss 3.6363 (3.0467) grad_norm 2.7062 (3.3930) loss_scale 
1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][970/1251] eta 0:01:37 lr 0.000238 wd 0.0500 time 0.2273 (0.3485) data time 0.0006 (0.0128) model time 0.2267 (0.3357) loss 3.2689 (3.0475) grad_norm 2.8377 (3.3749) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][980/1251] eta 0:01:31 lr 0.000238 wd 0.0500 time 0.2331 (0.3391) data time 0.0008 (0.0119) model time 0.2324 (0.3272) loss 3.0985 (3.0406) grad_norm 4.7409 (3.3782) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][990/1251] eta 0:01:26 lr 0.000238 wd 0.0500 time 0.2257 (0.3309) data time 0.0009 (0.0111) model time 0.2248 (0.3198) loss 3.4357 (3.0384) grad_norm 3.8525 (3.4126) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1000/1251] eta 0:01:21 lr 0.000238 wd 0.0500 time 0.2233 (0.3239) data time 0.0009 (0.0105) model time 0.2224 (0.3134) loss 3.2072 (3.0318) grad_norm 2.6845 (3.3964) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1010/1251] eta 0:01:16 lr 0.000238 wd 0.0500 time 0.2232 (0.3177) data time 0.0008 (0.0099) model time 0.2223 (0.3078) loss 3.1548 (3.0353) grad_norm 4.0581 (3.4113) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1020/1251] eta 0:01:12 lr 0.000238 wd 0.0500 time 0.2281 (0.3124) data time 0.0009 (0.0094) model time 0.2272 (0.3030) loss 2.5826 (3.0293) grad_norm 3.9045 (3.4077) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1030/1251] eta 0:01:07 lr 0.000238 wd 0.0500 time 0.2204 (0.3074) data 
time 0.0009 (0.0089) model time 0.2195 (0.2985) loss 3.5786 (3.0198) grad_norm 16.0894 (3.4729) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1040/1251] eta 0:01:03 lr 0.000238 wd 0.0500 time 0.2216 (0.3031) data time 0.0009 (0.0085) model time 0.2207 (0.2946) loss 3.3447 (3.0164) grad_norm 3.2917 (3.4788) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1050/1251] eta 0:01:00 lr 0.000238 wd 0.0500 time 0.2204 (0.2993) data time 0.0007 (0.0081) model time 0.2197 (0.2912) loss 3.0794 (3.0035) grad_norm 2.5181 (3.4604) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1060/1251] eta 0:00:56 lr 0.000238 wd 0.0500 time 0.2346 (0.2957) data time 0.0008 (0.0078) model time 0.2337 (0.2880) loss 3.1410 (3.0046) grad_norm 2.6745 (3.4362) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1070/1251] eta 0:00:52 lr 0.000238 wd 0.0500 time 0.2296 (0.2925) data time 0.0007 (0.0075) model time 0.2289 (0.2851) loss 3.7816 (3.0003) grad_norm 3.3159 (3.4324) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1080/1251] eta 0:00:49 lr 0.000238 wd 0.0500 time 0.2310 (0.2896) data time 0.0005 (0.0072) model time 0.2305 (0.2824) loss 3.0080 (2.9986) grad_norm 5.8358 (3.4835) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1090/1251] eta 0:00:46 lr 0.000238 wd 0.0500 time 0.2263 (0.2870) data time 0.0006 (0.0069) model time 0.2257 (0.2801) loss 3.5444 (2.9969) grad_norm 2.6580 (3.4902) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:13 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [210/300][1100/1251] eta 0:00:42 lr 0.000238 wd 0.0500 time 0.2258 (0.2845) data time 0.0008 (0.0067) model time 0.2251 (0.2778) loss 3.7838 (2.9896) grad_norm 2.2718 (3.4784) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1110/1251] eta 0:00:39 lr 0.000237 wd 0.0500 time 0.2193 (0.2822) data time 0.0010 (0.0065) model time 0.2183 (0.2757) loss 3.0556 (2.9802) grad_norm 2.7787 (3.4771) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1120/1251] eta 0:00:36 lr 0.000237 wd 0.0500 time 0.2179 (0.2800) data time 0.0010 (0.0063) model time 0.2170 (0.2737) loss 2.9926 (2.9793) grad_norm 3.6869 (3.4712) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1130/1251] eta 0:00:33 lr 0.000237 wd 0.0500 time 0.2347 (0.2780) data time 0.0008 (0.0061) model time 0.2340 (0.2719) loss 2.0164 (2.9810) grad_norm 2.7427 (3.4587) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1140/1251] eta 0:00:30 lr 0.000237 wd 0.0500 time 0.2227 (0.2767) data time 0.0007 (0.0059) model time 0.2220 (0.2709) loss 2.9529 (2.9806) grad_norm 2.6571 (3.4523) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1150/1251] eta 0:00:27 lr 0.000237 wd 0.0500 time 0.2283 (0.2750) data time 0.0008 (0.0057) model time 0.2275 (0.2693) loss 3.3122 (2.9728) grad_norm 4.1835 (3.4412) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1160/1251] eta 0:00:24 lr 0.000237 wd 0.0500 time 0.2319 (0.2740) data time 0.0008 (0.0056) model time 0.2312 (0.2685) loss 3.2200 (2.9727) 
grad_norm 3.7924 (3.4422) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1170/1251] eta 0:00:22 lr 0.000237 wd 0.0500 time 0.2278 (0.2725) data time 0.0008 (0.0054) model time 0.2270 (0.2671) loss 3.0462 (2.9793) grad_norm 3.5246 (3.4349) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1180/1251] eta 0:00:19 lr 0.000237 wd 0.0500 time 0.2279 (0.2710) data time 0.0008 (0.0053) model time 0.2272 (0.2657) loss 3.2139 (2.9790) grad_norm 3.3342 (3.4397) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1190/1251] eta 0:00:16 lr 0.000237 wd 0.0500 time 0.2203 (0.2695) data time 0.0008 (0.0052) model time 0.2195 (0.2644) loss 3.2789 (2.9812) grad_norm 3.1580 (3.4462) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1200/1251] eta 0:00:13 lr 0.000237 wd 0.0500 time 0.2220 (0.2682) data time 0.0007 (0.0050) model time 0.2213 (0.2631) loss 3.6800 (2.9811) grad_norm 2.8218 (3.4543) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1210/1251] eta 0:00:10 lr 0.000237 wd 0.0500 time 0.2307 (0.2670) data time 0.0007 (0.0049) model time 0.2301 (0.2620) loss 3.1381 (2.9819) grad_norm 2.9727 (3.4551) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1220/1251] eta 0:00:08 lr 0.000237 wd 0.0500 time 0.2243 (0.2658) data time 0.0007 (0.0048) model time 0.2235 (0.2609) loss 3.0829 (2.9781) grad_norm 2.2897 (3.4428) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1230/1251] eta 0:00:05 lr 
0.000237 wd 0.0500 time 0.2201 (0.2646) data time 0.0007 (0.0047) model time 0.2194 (0.2599) loss 3.2349 (2.9735) grad_norm 3.6836 (3.4397) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1240/1251] eta 0:00:02 lr 0.000237 wd 0.0500 time 0.2127 (0.2635) data time 0.0003 (0.0046) model time 0.2123 (0.2589) loss 2.7277 (2.9679) grad_norm 2.6451 (3.4303) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [210/300][1250/1251] eta 0:00:00 lr 0.000237 wd 0.0500 time 0.2128 (0.2623) data time 0.0003 (0.0045) model time 0.2124 (0.2578) loss 3.5348 (2.9768) grad_norm 4.8100 (3.4276) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 02:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 210 training takes 0:01:45 [2024-08-30 02:49:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 02:49:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 02:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.378 (0.378) Loss 0.4272 (0.4272) Acc@1 92.285 (92.285) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-30 02:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.098) Loss 0.6396 (0.6686) Acc@1 88.867 (86.337) Acc@5 97.070 (97.283) Mem 7377MB [2024-08-30 02:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.085) Loss 0.9316 (0.6925) Acc@1 79.004 (85.370) Acc@5 95.117 (97.312) Mem 7377MB [2024-08-30 02:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.080) Loss 1.1514 (0.7850) Acc@1 73.828 (83.099) Acc@5 91.797 (96.264) Mem 7377MB [2024-08-30 02:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.075) Loss 1.0371 (0.8318) Acc@1 76.270 (81.824) Acc@5 93.652 (95.798) Mem 7377MB [2024-08-30 02:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.334 Acc@5 95.726 [2024-08-30 02:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-30 02:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.33% [2024-08-30 02:49:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 02:49:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-30 02:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.369 (0.369) Loss 0.3796 (0.3796) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-30 02:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.096) Loss 0.5898 (0.6067) Acc@1 88.867 (87.367) Acc@5 97.656 (97.665) Mem 7377MB [2024-08-30 02:50:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.067 (0.082) Loss 0.8687 (0.6335) Acc@1 78.418 (86.300) Acc@5 95.996 (97.638) Mem 7377MB [2024-08-30 02:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.078) Loss 1.0918 (0.7172) Acc@1 73.633 (84.233) Acc@5 93.262 (96.752) Mem 7377MB [2024-08-30 02:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.074) Loss 0.9819 (0.7601) Acc@1 76.465 (83.053) Acc@5 94.238 (96.299) Mem 7377MB [2024-08-30 02:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.618 Acc@5 96.280 [2024-08-30 02:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-08-30 02:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.62% [2024-08-30 02:50:02 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 02:50:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 02:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][0/1251] eta 0:13:33 lr 0.000237 wd 0.0500 time 0.6503 (0.6503) data time 0.4209 (0.4209) model time 0.0000 (0.0000) loss 2.8811 (2.8811) grad_norm 3.2543 (3.2543) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-30 02:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][10/1251] eta 0:05:26 lr 0.000237 wd 0.0500 time 0.2224 (0.2629) data time 0.0008 (0.0392) model time 0.0000 (0.0000) loss 3.1972 (3.1527) grad_norm 3.5105 (3.4038) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][20/1251] eta 0:05:01 lr 0.000237 wd 0.0500 time 0.2211 (0.2447) data time 0.0007 (0.0209) model time 0.0000 (0.0000) loss 2.6747 (2.9970) grad_norm 2.1889 (3.1082) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][30/1251] eta 0:04:51 lr 0.000237 wd 0.0500 time 0.2244 (0.2388) data time 0.0007 (0.0145) model time 0.0000 (0.0000) loss 3.5131 (3.1059) grad_norm 3.2300 (3.2186) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][40/1251] eta 0:04:44 lr 0.000237 wd 0.0500 time 0.2180 (0.2349) data time 0.0007 (0.0112) model time 0.0000 (0.0000) loss 3.0912 (3.0857) grad_norm 3.3963 (3.2228) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][50/1251] eta 0:04:40 lr 0.000237 wd 0.0500 time 0.2271 (0.2333) data time 0.0009 (0.0092) model time 0.0000 (0.0000) loss 2.3107 (3.0401) grad_norm 3.1332 (3.2133) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][60/1251] eta 0:04:36 lr 0.000237 wd 0.0500 time 0.2271 (0.2321) data time 0.0006 (0.0078) model time 0.2265 
(0.2254) loss 2.8807 (3.0031) grad_norm 3.7403 (3.1877) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][70/1251] eta 0:04:32 lr 0.000237 wd 0.0500 time 0.2224 (0.2311) data time 0.0006 (0.0069) model time 0.2218 (0.2245) loss 2.9423 (2.9665) grad_norm 5.1677 (3.3221) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][80/1251] eta 0:04:29 lr 0.000237 wd 0.0500 time 0.2245 (0.2304) data time 0.0007 (0.0061) model time 0.2238 (0.2247) loss 3.5794 (2.9772) grad_norm 2.6337 (3.3549) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][90/1251] eta 0:04:26 lr 0.000237 wd 0.0500 time 0.2194 (0.2298) data time 0.0008 (0.0056) model time 0.2186 (0.2245) loss 2.1156 (2.9859) grad_norm 2.7049 (3.3260) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][100/1251] eta 0:04:24 lr 0.000237 wd 0.0500 time 0.2251 (0.2295) data time 0.0006 (0.0051) model time 0.2245 (0.2247) loss 2.6346 (2.9887) grad_norm 2.4837 (3.2836) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][110/1251] eta 0:04:21 lr 0.000237 wd 0.0500 time 0.2217 (0.2290) data time 0.0006 (0.0047) model time 0.2211 (0.2245) loss 3.3652 (3.0165) grad_norm 3.2042 (3.2666) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][120/1251] eta 0:04:18 lr 0.000237 wd 0.0500 time 0.2203 (0.2286) data time 0.0009 (0.0044) model time 0.2194 (0.2243) loss 3.1117 (3.0009) grad_norm 3.3981 (3.2439) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][130/1251] 
eta 0:04:15 lr 0.000236 wd 0.0500 time 0.2218 (0.2282) data time 0.0007 (0.0041) model time 0.2211 (0.2240) loss 3.7785 (2.9923) grad_norm 3.1448 (3.2494) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][140/1251] eta 0:04:13 lr 0.000236 wd 0.0500 time 0.2241 (0.2280) data time 0.0008 (0.0039) model time 0.2233 (0.2241) loss 3.1598 (2.9757) grad_norm 3.3409 (3.2661) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][150/1251] eta 0:04:10 lr 0.000236 wd 0.0500 time 0.2212 (0.2277) data time 0.0008 (0.0037) model time 0.2204 (0.2240) loss 3.2190 (2.9835) grad_norm 3.8304 (3.2811) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][160/1251] eta 0:04:08 lr 0.000236 wd 0.0500 time 0.2184 (0.2274) data time 0.0009 (0.0035) model time 0.2175 (0.2237) loss 2.8004 (2.9912) grad_norm 2.8286 (3.2634) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][170/1251] eta 0:04:05 lr 0.000236 wd 0.0500 time 0.2249 (0.2273) data time 0.0007 (0.0034) model time 0.2243 (0.2238) loss 2.9867 (2.9951) grad_norm 3.2926 (3.2650) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][180/1251] eta 0:04:03 lr 0.000236 wd 0.0500 time 0.2251 (0.2272) data time 0.0009 (0.0033) model time 0.2243 (0.2238) loss 3.3148 (3.0065) grad_norm 27.4454 (3.3848) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][190/1251] eta 0:04:00 lr 0.000236 wd 0.0500 time 0.2222 (0.2270) data time 0.0008 (0.0031) model time 0.2214 (0.2238) loss 2.4004 (3.0044) grad_norm 4.8791 (3.4588) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-30 02:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][200/1251] eta 0:03:58 lr 0.000236 wd 0.0500 time 0.2279 (0.2269) data time 0.0008 (0.0030) model time 0.2271 (0.2238) loss 2.2756 (2.9969) grad_norm 2.3588 (3.4621) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][210/1251] eta 0:03:56 lr 0.000236 wd 0.0500 time 0.2291 (0.2269) data time 0.0010 (0.0029) model time 0.2281 (0.2239) loss 3.8125 (3.0041) grad_norm 3.1290 (3.4600) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][220/1251] eta 0:03:53 lr 0.000236 wd 0.0500 time 0.2275 (0.2268) data time 0.0006 (0.0028) model time 0.2269 (0.2239) loss 3.2002 (3.0122) grad_norm 3.5058 (3.4501) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][230/1251] eta 0:03:51 lr 0.000236 wd 0.0500 time 0.2231 (0.2267) data time 0.0006 (0.0027) model time 0.2225 (0.2239) loss 3.5424 (3.0183) grad_norm 2.6373 (3.4396) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][240/1251] eta 0:03:49 lr 0.000236 wd 0.0500 time 0.2235 (0.2266) data time 0.0006 (0.0027) model time 0.2229 (0.2238) loss 3.8892 (3.0083) grad_norm 2.7064 (3.4277) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][250/1251] eta 0:03:46 lr 0.000236 wd 0.0500 time 0.2234 (0.2264) data time 0.0006 (0.0026) model time 0.2228 (0.2237) loss 3.4471 (3.0040) grad_norm 2.5829 (3.4065) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][260/1251] eta 0:03:44 lr 0.000236 wd 0.0500 time 0.2278 (0.2264) data time 0.0008 (0.0025) model time 0.2270 
(0.2238) loss 2.6252 (2.9888) grad_norm 2.8954 (3.3955) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][270/1251] eta 0:03:42 lr 0.000236 wd 0.0500 time 0.2252 (0.2263) data time 0.0008 (0.0025) model time 0.2244 (0.2238) loss 3.2367 (2.9978) grad_norm 2.2269 (3.4037) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][280/1251] eta 0:03:39 lr 0.000236 wd 0.0500 time 0.2265 (0.2263) data time 0.0006 (0.0024) model time 0.2259 (0.2238) loss 2.7404 (2.9972) grad_norm 3.0867 (3.4102) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][290/1251] eta 0:03:37 lr 0.000236 wd 0.0500 time 0.2195 (0.2262) data time 0.0010 (0.0024) model time 0.2185 (0.2237) loss 3.2864 (2.9945) grad_norm 3.4497 (3.3995) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][300/1251] eta 0:03:35 lr 0.000236 wd 0.0500 time 0.2283 (0.2261) data time 0.0006 (0.0023) model time 0.2277 (0.2237) loss 3.5478 (2.9902) grad_norm 3.3605 (3.3990) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][310/1251] eta 0:03:32 lr 0.000236 wd 0.0500 time 0.2253 (0.2261) data time 0.0007 (0.0023) model time 0.2246 (0.2238) loss 2.6187 (2.9830) grad_norm 3.8181 (3.4072) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][320/1251] eta 0:03:30 lr 0.000236 wd 0.0500 time 0.2266 (0.2261) data time 0.0006 (0.0023) model time 0.2260 (0.2238) loss 3.0758 (2.9838) grad_norm 3.7370 (3.4004) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[211/300][330/1251] eta 0:03:28 lr 0.000236 wd 0.0500 time 0.2265 (0.2266) data time 0.0009 (0.0022) model time 0.2256 (0.2244) loss 3.1839 (2.9865) grad_norm 3.2416 (3.3959) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][340/1251] eta 0:03:26 lr 0.000236 wd 0.0500 time 0.2249 (0.2265) data time 0.0008 (0.0022) model time 0.2240 (0.2244) loss 3.1980 (2.9922) grad_norm 5.4154 (3.3881) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][350/1251] eta 0:03:24 lr 0.000236 wd 0.0500 time 0.2235 (0.2264) data time 0.0010 (0.0021) model time 0.2226 (0.2243) loss 2.1799 (2.9883) grad_norm 2.7148 (3.3959) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][360/1251] eta 0:03:21 lr 0.000236 wd 0.0500 time 0.2309 (0.2264) data time 0.0007 (0.0021) model time 0.2302 (0.2243) loss 2.6759 (2.9918) grad_norm 3.9192 (3.4016) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][370/1251] eta 0:03:19 lr 0.000236 wd 0.0500 time 0.2227 (0.2263) data time 0.0008 (0.0021) model time 0.2219 (0.2243) loss 3.6359 (2.9972) grad_norm 3.2130 (3.4090) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][380/1251] eta 0:03:17 lr 0.000236 wd 0.0500 time 0.2248 (0.2263) data time 0.0008 (0.0021) model time 0.2241 (0.2242) loss 3.4171 (2.9949) grad_norm 4.4666 (3.4329) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 02:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][390/1251] eta 0:03:14 lr 0.000236 wd 0.0500 time 0.2282 (0.2262) data time 0.0008 (0.0020) model time 0.2274 (0.2242) loss 2.9124 (2.9900) grad_norm 3.0673 (3.4225) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-08-30 02:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 02:51:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 02:51:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 02:54:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 02:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 02:55:04 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 02:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 02:55:13 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 02:55:15 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 02:55:16 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 02:55:16 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 211) [2024-08-30 02:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:00:41 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 03:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:00:48 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 03:00:50 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:00:51 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:00:51 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 211) [2024-08-30 03:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][400/1251] eta 0:38:49 lr 0.000235 wd 0.0500 time 0.2245 (2.7370) data time 0.0007 (0.1470) model time 0.2238 (2.5900) loss 3.4734 (3.5085) grad_norm 3.1756 (3.3491) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][410/1251] eta 0:14:54 lr 0.000235 wd 0.0500 time 0.2287 (1.0631) data time 0.0008 (0.0497) model time 0.2279 (1.0135) loss 3.1805 (3.2619) grad_norm 3.4253 (3.3729) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][420/1251] eta 0:10:04 lr 0.000235 wd 0.0500 time 0.2222 (0.7278) data time 0.0009 (0.0302) model time 0.2213 (0.6976) loss 3.3497 (3.1981) grad_norm 3.4617 (3.2937) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][430/1251] eta 0:07:59 lr 0.000235 wd 0.0500 time 0.2198 (0.5835) data time 0.0008 (0.0219) model time 0.2190 (0.5616) loss 3.1909 (3.1865) grad_norm 2.7360 (3.5859) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][440/1251] eta 0:06:48 lr 0.000235 wd 0.0500 time 0.2260 (0.5034) data time 0.0009 (0.0172) model time 0.2252 (0.4862) loss 3.2531 (3.1513) grad_norm 4.5420 (3.6251) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [211/300][450/1251] eta 0:06:02 lr 0.000235 wd 0.0500 time 0.2238 (0.4528) data time 0.0006 (0.0143) model time 0.2232 (0.4385) loss 2.1388 (3.1209) grad_norm 2.9695 (4.5583) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][460/1251] eta 0:05:30 lr 0.000235 wd 0.0500 time 0.2337 (0.4180) data time 0.0008 (0.0122) model time 0.2329 (0.4058) loss 3.0710 (3.0966) grad_norm 2.6582 (4.3166) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][470/1251] eta 0:05:06 lr 0.000235 wd 0.0500 time 0.2296 (0.3924) data time 0.0008 (0.0107) model time 0.2288 (0.3817) loss 2.3765 (3.0559) grad_norm 12.3833 (4.3631) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][480/1251] eta 0:04:47 lr 0.000235 wd 0.0500 time 0.2258 (0.3724) data time 0.0006 (0.0096) model time 0.2252 (0.3628) loss 2.9930 (3.0400) grad_norm 3.3895 (4.1934) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][490/1251] eta 0:04:31 lr 0.000235 wd 0.0500 time 0.2202 (0.3568) data time 0.0011 (0.0087) model time 0.2192 (0.3481) loss 3.3475 (3.0324) grad_norm 2.3992 (4.0381) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][500/1251] eta 0:04:18 lr 0.000235 wd 0.0500 time 0.2279 (0.3446) data time 0.0009 (0.0079) model time 0.2270 (0.3366) loss 2.6266 (3.0440) grad_norm 3.2055 (4.0140) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][510/1251] eta 0:04:07 lr 0.000235 wd 0.0500 time 0.2248 (0.3342) data time 0.0007 (0.0073) model time 0.2241 (0.3269) loss 2.1149 (3.0272) grad_norm 2.7180 (3.9198) loss_scale 
1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][520/1251] eta 0:03:57 lr 0.000235 wd 0.0500 time 0.2245 (0.3254) data time 0.0006 (0.0068) model time 0.2239 (0.3186) loss 2.4000 (3.0260) grad_norm 2.5186 (3.8479) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][530/1251] eta 0:03:49 lr 0.000235 wd 0.0500 time 0.2154 (0.3179) data time 0.0008 (0.0064) model time 0.2146 (0.3115) loss 3.2431 (3.0288) grad_norm 3.0189 (3.7696) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][540/1251] eta 0:03:41 lr 0.000235 wd 0.0500 time 0.2272 (0.3114) data time 0.0009 (0.0060) model time 0.2263 (0.3054) loss 3.1966 (3.0228) grad_norm 2.9424 (3.6943) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][550/1251] eta 0:03:34 lr 0.000235 wd 0.0500 time 0.2343 (0.3058) data time 0.0008 (0.0057) model time 0.2335 (0.3001) loss 3.4309 (3.0190) grad_norm 2.7851 (3.6992) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][560/1251] eta 0:03:27 lr 0.000235 wd 0.0500 time 0.2241 (0.3010) data time 0.0009 (0.0054) model time 0.2233 (0.2956) loss 3.2750 (3.0217) grad_norm 5.4467 (3.7059) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][570/1251] eta 0:03:22 lr 0.000235 wd 0.0500 time 0.2418 (0.2967) data time 0.0007 (0.0052) model time 0.2412 (0.2916) loss 3.1987 (3.0204) grad_norm 3.0115 (3.6654) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][580/1251] eta 0:03:16 lr 0.000235 wd 0.0500 time 0.2277 (0.2926) data time 
0.0008 (0.0049) model time 0.2268 (0.2877) loss 3.0326 (3.0109) grad_norm 2.8801 (3.6342) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][590/1251] eta 0:03:11 lr 0.000235 wd 0.0500 time 0.2284 (0.2891) data time 0.0009 (0.0047) model time 0.2275 (0.2843) loss 1.7915 (3.0071) grad_norm 3.2311 (3.6317) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][600/1251] eta 0:03:06 lr 0.000235 wd 0.0500 time 0.2235 (0.2859) data time 0.0007 (0.0045) model time 0.2228 (0.2814) loss 2.6948 (2.9960) grad_norm 7.6236 (3.6491) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][610/1251] eta 0:03:01 lr 0.000235 wd 0.0500 time 0.2181 (0.2830) data time 0.0009 (0.0044) model time 0.2172 (0.2786) loss 2.9195 (2.9910) grad_norm 3.3207 (3.6596) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][620/1251] eta 0:02:56 lr 0.000235 wd 0.0500 time 0.2283 (0.2804) data time 0.0010 (0.0042) model time 0.2273 (0.2762) loss 3.3094 (2.9931) grad_norm 2.8316 (3.7198) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][630/1251] eta 0:02:52 lr 0.000235 wd 0.0500 time 0.2224 (0.2780) data time 0.0008 (0.0041) model time 0.2216 (0.2739) loss 3.3481 (2.9842) grad_norm 2.2790 (3.7025) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:02:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][640/1251] eta 0:02:48 lr 0.000235 wd 0.0500 time 0.2280 (0.2758) data time 0.0008 (0.0040) model time 0.2272 (0.2719) loss 3.1237 (2.9838) grad_norm 3.1302 (3.7052) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [211/300][650/1251] eta 0:02:44 lr 0.000235 wd 0.0500 time 0.2214 (0.2738) data time 0.0008 (0.0038) model time 0.2207 (0.2700) loss 2.5234 (2.9755) grad_norm 2.5863 (3.7001) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][660/1251] eta 0:02:40 lr 0.000234 wd 0.0500 time 0.2316 (0.2719) data time 0.0008 (0.0037) model time 0.2307 (0.2682) loss 2.2411 (2.9644) grad_norm 5.6939 (3.7040) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 03:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][670/1251] eta 0:02:37 lr 0.000234 wd 0.0500 time 0.2267 (0.2703) data time 0.0008 (0.0036) model time 0.2259 (0.2667) loss 3.3037 (2.9639) grad_norm 4.6039 (3.6983) loss_scale 2048.0000 (1057.5127) mem 7381MB [2024-08-30 03:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][680/1251] eta 0:02:33 lr 0.000234 wd 0.0500 time 0.2280 (0.2687) data time 0.0011 (0.0035) model time 0.2269 (0.2652) loss 2.9781 (2.9621) grad_norm 2.7431 (3.6882) loss_scale 2048.0000 (1092.2667) mem 7381MB [2024-08-30 03:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][690/1251] eta 0:02:30 lr 0.000234 wd 0.0500 time 0.2290 (0.2680) data time 0.0007 (0.0035) model time 0.2283 (0.2645) loss 3.3317 (2.9564) grad_norm 2.3936 (3.6818) loss_scale 2048.0000 (1124.6644) mem 7381MB [2024-08-30 03:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][700/1251] eta 0:02:26 lr 0.000234 wd 0.0500 time 0.2161 (0.2665) data time 0.0009 (0.0034) model time 0.2152 (0.2631) loss 2.3268 (2.9490) grad_norm 3.9842 (3.6942) loss_scale 2048.0000 (1154.9377) mem 7381MB [2024-08-30 03:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][710/1251] eta 0:02:23 lr 0.000234 wd 0.0500 time 0.2373 (0.2661) data time 0.0007 (0.0033) model time 0.2366 (0.2628) loss 3.3354 (2.9545) grad_norm 3.0374 (3.7189) 
loss_scale 2048.0000 (1183.2889) mem 7381MB [2024-08-30 03:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][720/1251] eta 0:02:20 lr 0.000234 wd 0.0500 time 0.2216 (0.2648) data time 0.0007 (0.0032) model time 0.2210 (0.2616) loss 3.7263 (2.9658) grad_norm 3.2581 (3.7072) loss_scale 2048.0000 (1209.8954) mem 7381MB [2024-08-30 03:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][730/1251] eta 0:02:17 lr 0.000234 wd 0.0500 time 0.2259 (0.2636) data time 0.0006 (0.0032) model time 0.2253 (0.2604) loss 3.2504 (2.9654) grad_norm 3.9611 (inf) loss_scale 1024.0000 (1207.4030) mem 7381MB [2024-08-30 03:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:02:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:02:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:05:00 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 03:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:10:20 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 03:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:10:32 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 03:10:33 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:10:34 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:10:34 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 211) [2024-08-30 03:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][740/1251] eta 0:15:19 lr 0.000234 wd 0.0500 time 0.2365 (1.7989) data time 0.0009 (0.1029) model time 0.2356 (1.6960) loss 3.3375 (3.3518) grad_norm 2.8973 (3.0700) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][750/1251] eta 0:07:47 lr 0.000234 wd 0.0500 time 0.2512 (0.9333) data time 0.0008 (0.0463) model time 0.2505 (0.8870) loss 3.8058 (3.2082) grad_norm 3.8173 (3.3450) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][760/1251] eta 0:05:37 lr 0.000234 wd 0.0500 time 0.2434 (0.6865) data time 0.0011 (0.0302) model time 0.2423 (0.6564) loss 3.5559 (3.2322) grad_norm 4.4062 (3.4479) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][770/1251] eta 0:04:33 lr 0.000234 wd 0.0500 time 0.2443 (0.5694) data time 0.0011 (0.0225) model time 0.2432 (0.5469) loss 3.2018 (3.1962) grad_norm 6.0440 (3.4728) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:11:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][780/1251] eta 0:03:55 lr 0.000234 wd 0.0500 time 0.2375 (0.5005) data time 0.0009 (0.0181) model time 0.2366 (0.4824) loss 3.1731 (3.1588) grad_norm 3.6108 (3.4475) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:11:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:11:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:14:19 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 03:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:14:33 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 03:14:34 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:14:35 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:14:35 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 211) [2024-08-30 03:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][790/1251] eta 0:45:50 lr 0.000234 wd 0.0500 time 0.3524 (5.9658) data time 0.0007 (0.5004) model time 0.3517 (5.4654) loss 3.5653 (3.5221) grad_norm 3.9565 (3.3101) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-08-30 03:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][800/1251] eta 0:08:52 lr 0.000234 wd 0.0500 time 0.2303 (1.1814) data time 0.0006 (0.0841) model time 0.2297 (1.0973) loss 2.3313 (3.1976) grad_norm 3.4805 (3.4861) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][810/1251] eta 0:05:29 lr 0.000234 wd 0.0500 time 0.2275 (0.7462) data time 0.0008 (0.0463) model time 0.2268 (0.6999) loss 3.1059 (3.1917) grad_norm 2.7191 (3.2925) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][820/1251] eta 0:04:11 lr 0.000234 wd 0.0500 time 0.2188 (0.5825) data time 0.0007 (0.0321) model time 0.2181 (0.5504) loss 3.1661 (3.1972) grad_norm 3.1791 (3.2313) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][830/1251] eta 0:03:29 lr 0.000234 wd 0.0500 time 0.2394 (0.4984) data time 0.0008 (0.0247) model time 0.2386 (0.4737) loss 3.1512 (3.1696) grad_norm 2.3820 (3.3519) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [211/300][840/1251] eta 0:03:03 lr 0.000234 wd 0.0500 time 0.2240 (0.4457) data time 0.0008 (0.0201) model time 0.2232 (0.4256) loss 3.2334 (3.1509) grad_norm 3.0219 (3.3026) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][850/1251] eta 0:02:44 lr 0.000234 wd 0.0500 time 0.2252 (0.4101) data time 0.0007 (0.0170) model time 0.2244 (0.3931) loss 3.3931 (3.1133) grad_norm 3.7519 (3.3122) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][860/1251] eta 0:02:30 lr 0.000234 wd 0.0500 time 0.2220 (0.3843) data time 0.0008 (0.0148) model time 0.2212 (0.3695) loss 2.9616 (3.0691) grad_norm 2.7583 (3.3085) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][870/1251] eta 0:02:18 lr 0.000234 wd 0.0500 time 0.2239 (0.3648) data time 0.0010 (0.0131) model time 0.2228 (0.3517) loss 3.1362 (3.0656) grad_norm 4.1954 (3.3293) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][880/1251] eta 0:02:09 lr 0.000234 wd 0.0500 time 0.2189 (0.3494) data time 0.0007 (0.0118) model time 0.2182 (0.3376) loss 2.1490 (3.0539) grad_norm 3.5874 (3.3883) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][890/1251] eta 0:02:01 lr 0.000234 wd 0.0500 time 0.2257 (0.3371) data time 0.0007 (0.0107) model time 0.2249 (0.3264) loss 3.4960 (3.0687) grad_norm 3.2365 (3.4342) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][900/1251] eta 0:01:54 lr 0.000234 wd 0.0500 time 0.2226 (0.3269) data time 0.0009 (0.0098) model time 0.2217 (0.3170) loss 3.3225 (3.0607) grad_norm 2.5563 (3.4507) loss_scale 
1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][910/1251] eta 0:01:48 lr 0.000234 wd 0.0500 time 0.2220 (0.3186) data time 0.0007 (0.0091) model time 0.2213 (0.3095) loss 3.1467 (3.0563) grad_norm 4.0443 (3.4378) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][920/1251] eta 0:01:43 lr 0.000234 wd 0.0500 time 0.2227 (0.3116) data time 0.0009 (0.0085) model time 0.2217 (0.3031) loss 3.1144 (3.0503) grad_norm 3.6542 (3.4434) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][930/1251] eta 0:01:38 lr 0.000233 wd 0.0500 time 0.2281 (0.3055) data time 0.0008 (0.0080) model time 0.2273 (0.2975) loss 3.4162 (3.0399) grad_norm 4.5832 (3.4474) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][940/1251] eta 0:01:33 lr 0.000233 wd 0.0500 time 0.2275 (0.3002) data time 0.0009 (0.0075) model time 0.2266 (0.2928) loss 3.1871 (3.0388) grad_norm 2.9854 (3.5102) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][950/1251] eta 0:01:28 lr 0.000233 wd 0.0500 time 0.2271 (0.2956) data time 0.0008 (0.0071) model time 0.2263 (0.2885) loss 3.0541 (3.0407) grad_norm 3.1637 (3.5129) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][960/1251] eta 0:01:24 lr 0.000233 wd 0.0500 time 0.2260 (0.2914) data time 0.0010 (0.0067) model time 0.2251 (0.2847) loss 2.6995 (3.0313) grad_norm 5.2160 (3.5301) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][970/1251] eta 0:01:20 lr 0.000233 wd 0.0500 time 0.2327 (0.2877) data time 
0.0007 (0.0064) model time 0.2320 (0.2813) loss 3.2420 (3.0206) grad_norm 3.1631 (3.5435) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][980/1251] eta 0:01:17 lr 0.000233 wd 0.0500 time 0.2274 (0.2844) data time 0.0008 (0.0061) model time 0.2266 (0.2783) loss 3.4194 (3.0240) grad_norm 3.9199 (3.5877) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][990/1251] eta 0:01:13 lr 0.000233 wd 0.0500 time 0.2247 (0.2816) data time 0.0006 (0.0059) model time 0.2241 (0.2757) loss 3.6340 (3.0160) grad_norm 2.7449 (3.5788) loss_scale 1024.0000 (1024.0000) mem 7379MB [2024-08-30 03:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:15:37 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:15:38 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:22:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:22:08 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 03:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:22:20 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 03:22:22 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:22:23 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:22:23 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 211) [2024-08-30 03:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1000/1251] eta 0:07:02 lr 0.000233 wd 0.0500 time 0.2288 (1.6821) data time 0.0006 (0.0913) model time 0.2282 (1.5908) loss 3.2969 (3.3672) grad_norm 3.8790 (3.8168) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1010/1251] eta 0:03:30 lr 0.000233 wd 0.0500 time 0.2181 (0.8731) data time 0.0006 (0.0411) model time 0.2174 (0.8320) loss 3.5681 (3.1846) grad_norm 3.0210 (3.4146) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1020/1251] eta 0:02:28 lr 0.000233 wd 0.0500 time 0.2214 (0.6420) data time 0.0010 (0.0267) model time 0.2204 (0.6153) loss 3.0286 (3.2049) grad_norm 3.9792 (3.4146) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1030/1251] eta 0:01:57 lr 0.000233 wd 0.0500 time 0.2287 (0.5323) data time 0.0008 (0.0199) model time 0.2279 (0.5123) loss 3.1279 (3.1981) grad_norm 3.0701 (3.7694) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:49 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1040/1251] eta 0:01:38 lr 0.000233 wd 0.0500 time 0.2216 (0.4688) data time 0.0007 (0.0160) model time 0.2209 (0.4528) loss 3.3107 (3.1805) grad_norm 8.5079 (3.8846) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1050/1251] eta 0:01:25 lr 0.000233 wd 0.0500 time 0.2205 (0.4267) data time 0.0007 (0.0134) model time 0.2199 (0.4133) loss 2.6136 (3.1576) grad_norm 3.0419 (3.7376) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1060/1251] eta 0:01:15 lr 0.000233 wd 0.0500 time 0.2197 (0.3974) data time 0.0007 (0.0115) model time 0.2190 (0.3858) loss 2.0099 (3.1407) grad_norm 3.4929 (3.6513) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1070/1251] eta 0:01:07 lr 0.000233 wd 0.0500 time 0.2216 (0.3754) data time 0.0007 (0.0102) model time 0.2209 (0.3652) loss 2.2373 (3.1167) grad_norm 2.5426 (3.5573) loss_scale 1024.0000 (1024.0000) mem 7377MB [2024-08-30 03:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1080/1251] eta 0:01:01 lr 0.000233 wd 0.0500 time 0.2278 (0.3588) data time 0.0009 (0.0091) model time 0.2269 (0.3497) loss 3.3766 (3.0898) grad_norm 3.6358 (inf) loss_scale 512.0000 (1006.5455) mem 7377MB [2024-08-30 03:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1090/1251] eta 0:00:55 lr 0.000233 wd 0.0500 time 0.2173 (0.3453) data time 0.0009 (0.0083) model time 0.2165 (0.3370) loss 3.5041 (3.0903) grad_norm 3.3143 (inf) loss_scale 512.0000 (956.0816) mem 7377MB [2024-08-30 03:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1100/1251] eta 0:00:50 lr 0.000233 wd 0.0500 time 0.2263 (0.3344) data time 0.0008 (0.0076) model time 0.2255 (0.3267) loss 2.3047 
(3.0915) grad_norm 4.7837 (inf) loss_scale 512.0000 (914.9630) mem 7377MB [2024-08-30 03:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1110/1251] eta 0:00:45 lr 0.000233 wd 0.0500 time 0.2329 (0.3253) data time 0.0008 (0.0071) model time 0.2322 (0.3182) loss 2.9726 (3.0904) grad_norm 3.3230 (inf) loss_scale 512.0000 (880.8136) mem 7377MB [2024-08-30 03:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1120/1251] eta 0:00:41 lr 0.000233 wd 0.0500 time 0.2299 (0.3178) data time 0.0008 (0.0066) model time 0.2291 (0.3112) loss 2.8622 (3.0701) grad_norm 3.3176 (inf) loss_scale 512.0000 (852.0000) mem 7377MB [2024-08-30 03:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1130/1251] eta 0:00:37 lr 0.000233 wd 0.0500 time 0.2198 (0.3112) data time 0.0011 (0.0062) model time 0.2187 (0.3050) loss 2.9161 (3.0622) grad_norm 2.4957 (inf) loss_scale 512.0000 (827.3623) mem 7377MB [2024-08-30 03:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1140/1251] eta 0:00:33 lr 0.000233 wd 0.0500 time 0.2239 (0.3056) data time 0.0007 (0.0058) model time 0.2232 (0.2998) loss 3.3353 (3.0568) grad_norm 2.7882 (inf) loss_scale 512.0000 (806.0541) mem 7377MB [2024-08-30 03:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1150/1251] eta 0:00:30 lr 0.000233 wd 0.0500 time 0.2223 (0.3006) data time 0.0008 (0.0055) model time 0.2215 (0.2951) loss 2.6235 (3.0439) grad_norm 2.8877 (inf) loss_scale 512.0000 (787.4430) mem 7377MB [2024-08-30 03:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1160/1251] eta 0:00:26 lr 0.000233 wd 0.0500 time 0.2252 (0.2960) data time 0.0009 (0.0052) model time 0.2243 (0.2908) loss 3.0830 (3.0515) grad_norm 3.9614 (inf) loss_scale 512.0000 (771.0476) mem 7377MB [2024-08-30 03:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1170/1251] eta 0:00:23 lr 0.000233 wd 0.0500 time 0.2256 
(0.2920) data time 0.0008 (0.0050) model time 0.2247 (0.2870) loss 2.4665 (3.0386) grad_norm 3.0527 (inf) loss_scale 512.0000 (756.4944) mem 7377MB [2024-08-30 03:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1180/1251] eta 0:00:20 lr 0.000233 wd 0.0500 time 0.2246 (0.2885) data time 0.0006 (0.0048) model time 0.2240 (0.2837) loss 3.5826 (3.0422) grad_norm 4.3419 (inf) loss_scale 512.0000 (743.4894) mem 7377MB [2024-08-30 03:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1190/1251] eta 0:00:17 lr 0.000233 wd 0.0500 time 0.2317 (0.2855) data time 0.0009 (0.0046) model time 0.2308 (0.2809) loss 2.3279 (3.0361) grad_norm 3.0007 (inf) loss_scale 512.0000 (731.7980) mem 7377MB [2024-08-30 03:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1200/1251] eta 0:00:14 lr 0.000232 wd 0.0500 time 0.2297 (0.2827) data time 0.0009 (0.0044) model time 0.2288 (0.2783) loss 3.5420 (3.0263) grad_norm 3.5501 (inf) loss_scale 512.0000 (721.2308) mem 7377MB [2024-08-30 03:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1210/1251] eta 0:00:11 lr 0.000232 wd 0.0500 time 0.2223 (0.2802) data time 0.0010 (0.0043) model time 0.2213 (0.2759) loss 2.9093 (3.0201) grad_norm 3.5512 (inf) loss_scale 512.0000 (711.6330) mem 7377MB [2024-08-30 03:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1220/1251] eta 0:00:08 lr 0.000232 wd 0.0500 time 0.2360 (0.2779) data time 0.0009 (0.0041) model time 0.2350 (0.2738) loss 3.2796 (3.0213) grad_norm 3.8451 (inf) loss_scale 512.0000 (702.8772) mem 7377MB [2024-08-30 03:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1230/1251] eta 0:00:05 lr 0.000232 wd 0.0500 time 0.2186 (0.2758) data time 0.0006 (0.0040) model time 0.2180 (0.2718) loss 3.3213 (3.0136) grad_norm 3.9770 (inf) loss_scale 512.0000 (694.8571) mem 7377MB [2024-08-30 03:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [211/300][1240/1251] eta 0:00:03 lr 0.000232 wd 0.0500 time 0.2155 (0.2735) data time 0.0006 (0.0039) model time 0.2150 (0.2697) loss 2.4115 (3.0025) grad_norm 4.0519 (inf) loss_scale 512.0000 (687.4839) mem 7377MB [2024-08-30 03:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [211/300][1250/1251] eta 0:00:00 lr 0.000232 wd 0.0500 time 0.2124 (0.2713) data time 0.0006 (0.0037) model time 0.2118 (0.2676) loss 3.3822 (2.9950) grad_norm 3.0676 (inf) loss_scale 512.0000 (680.6822) mem 7377MB [2024-08-30 03:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 211 training takes 0:01:09 [2024-08-30 03:23:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:23:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.382 (0.382) Loss 0.4087 (0.4087) Acc@1 92.773 (92.773) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-30 03:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.100) Loss 0.6982 (0.6709) Acc@1 86.133 (85.991) Acc@5 97.168 (97.363) Mem 7377MB [2024-08-30 03:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.089) Loss 0.9473 (0.6960) Acc@1 78.125 (85.049) Acc@5 95.117 (97.373) Mem 7377MB [2024-08-30 03:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.083) Loss 1.1484 (0.7856) Acc@1 73.438 (82.872) Acc@5 92.969 (96.384) Mem 7377MB [2024-08-30 03:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.079) Loss 1.0742 (0.8345) Acc@1 75.000 (81.598) Acc@5 93.457 (95.841) Mem 7377MB [2024-08-30 03:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.260 Acc@5 95.812 [2024-08-30 03:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 
261): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-30 03:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.811 (0.811) Loss 0.3794 (0.3794) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-30 03:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.146) Loss 0.5898 (0.6063) Acc@1 88.770 (87.367) Acc@5 97.656 (97.665) Mem 7377MB [2024-08-30 03:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.110) Loss 0.8662 (0.6330) Acc@1 78.516 (86.333) Acc@5 95.898 (97.633) Mem 7377MB [2024-08-30 03:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.098) Loss 1.0898 (0.7168) Acc@1 73.438 (84.299) Acc@5 93.066 (96.730) Mem 7377MB [2024-08-30 03:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.090) Loss 0.9829 (0.7598) Acc@1 76.562 (83.117) Acc@5 94.238 (96.291) Mem 7377MB [2024-08-30 03:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.688 Acc@5 96.266 [2024-08-30 03:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-08-30 03:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.69% [2024-08-30 03:23:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 03:23:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 03:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][0/1251] eta 0:12:08 lr 0.000232 wd 0.0500 time 0.5826 (0.5826) data time 0.3423 (0.3423) model time 0.0000 (0.0000) loss 3.3007 (3.3007) grad_norm 4.8177 (4.8177) loss_scale 512.0000 (512.0000) mem 7380MB [2024-08-30 03:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][10/1251] eta 0:05:21 lr 0.000232 wd 0.0500 time 0.2217 (0.2592) data time 0.0007 (0.0320) model time 0.0000 (0.0000) loss 1.7889 (2.7380) grad_norm 3.6523 (3.3765) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 03:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:23:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:23:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:28:21 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 03:28:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:28:30 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 03:28:31 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:28:33 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:28:33 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 212) [2024-08-30 03:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][20/1251] eta 0:32:42 lr 0.000232 wd 0.0500 time 0.2235 (1.5945) data time 0.0007 (0.0792) model time 0.0000 (0.0000) loss 3.1818 (3.3202) grad_norm 3.1036 (3.7360) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][30/1251] eta 0:17:46 lr 0.000232 wd 0.0500 time 0.2280 (0.8736) data time 0.0009 (0.0380) model time 0.0000 (0.0000) loss 3.0527 (3.1202) grad_norm 2.7907 (3.6695) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][40/1251] eta 0:13:07 lr 0.000232 wd 0.0500 time 0.2222 (0.6499) data time 0.0007 (0.0252) model time 0.0000 (0.0000) loss 3.4588 (3.1773) grad_norm 2.9730 (3.4293) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][50/1251] eta 0:10:49 lr 0.000232 wd 0.0500 time 0.2294 (0.5410) data time 0.0010 (0.0190) model time 0.0000 (0.0000) loss 3.1238 (3.1297) grad_norm 3.4589 (3.4074) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:00 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][60/1251] eta 0:09:28 lr 0.000232 wd 0.0500 time 0.2287 (0.4770) data time 0.0011 (0.0153) model time 0.2277 (0.2262) loss 2.8851 (3.0875) grad_norm 2.9100 (3.3761) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][70/1251] eta 0:08:33 lr 0.000232 wd 0.0500 time 0.2230 (0.4347) data time 0.0008 (0.0129) model time 0.2222 (0.2264) loss 1.9647 (3.0365) grad_norm 2.5116 (3.4024) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][80/1251] eta 0:07:53 lr 0.000232 wd 0.0500 time 0.2223 (0.4044) data time 0.0009 (0.0111) model time 0.2214 (0.2258) loss 3.2402 (3.0178) grad_norm 3.3453 (3.3794) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][90/1251] eta 0:07:23 lr 0.000232 wd 0.0500 time 0.2201 (0.3816) data time 0.0009 (0.0098) model time 0.2192 (0.2252) loss 2.6909 (2.9953) grad_norm 3.8515 (3.3685) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][100/1251] eta 0:06:58 lr 0.000232 wd 0.0500 time 0.2215 (0.3640) data time 0.0009 (0.0088) model time 0.2206 (0.2250) loss 2.8394 (2.9816) grad_norm 3.6789 (3.4145) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][110/1251] eta 0:06:39 lr 0.000232 wd 0.0500 time 0.2315 (0.3499) data time 0.0008 (0.0080) model time 0.2307 (0.2247) loss 2.8483 (3.0023) grad_norm 3.9519 (3.4442) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][120/1251] eta 0:06:22 lr 0.000232 wd 0.0500 time 0.2251 (0.3385) data time 0.0007 (0.0074) model time 0.2243 (0.2247) loss 3.3753 (3.0127) 
grad_norm 3.0079 (3.4984) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][130/1251] eta 0:06:08 lr 0.000232 wd 0.0500 time 0.2223 (0.3289) data time 0.0010 (0.0068) model time 0.2213 (0.2246) loss 3.2215 (3.0109) grad_norm 2.5742 (3.5004) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][140/1251] eta 0:05:56 lr 0.000232 wd 0.0500 time 0.2303 (0.3210) data time 0.0008 (0.0064) model time 0.2296 (0.2247) loss 2.9426 (2.9975) grad_norm 2.6361 (3.4785) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][150/1251] eta 0:05:45 lr 0.000232 wd 0.0500 time 0.2253 (0.3141) data time 0.0006 (0.0060) model time 0.2246 (0.2247) loss 3.2666 (3.0042) grad_norm 3.0958 (3.4937) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][160/1251] eta 0:05:36 lr 0.000232 wd 0.0500 time 0.2267 (0.3082) data time 0.0007 (0.0056) model time 0.2260 (0.2248) loss 2.6593 (2.9923) grad_norm 3.6498 (3.4712) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][170/1251] eta 0:05:27 lr 0.000232 wd 0.0500 time 0.2263 (0.3030) data time 0.0007 (0.0053) model time 0.2256 (0.2248) loss 3.5036 (2.9901) grad_norm 3.1095 (3.4497) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][180/1251] eta 0:05:19 lr 0.000232 wd 0.0500 time 0.2272 (0.2986) data time 0.0009 (0.0051) model time 0.2263 (0.2250) loss 2.9474 (2.9944) grad_norm 2.7408 (3.4448) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][190/1251] eta 0:05:12 lr 0.000232 wd 0.0500 time 
0.2278 (0.2946) data time 0.0006 (0.0048) model time 0.2271 (0.2251) loss 2.9112 (2.9789) grad_norm 3.8786 (3.4252) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][200/1251] eta 0:05:05 lr 0.000232 wd 0.0500 time 0.2190 (0.2910) data time 0.0007 (0.0046) model time 0.2182 (0.2251) loss 3.5195 (2.9795) grad_norm 2.9646 (3.4087) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][210/1251] eta 0:04:59 lr 0.000232 wd 0.0500 time 0.2215 (0.2876) data time 0.0009 (0.0044) model time 0.2206 (0.2249) loss 2.0851 (2.9612) grad_norm 2.7961 (3.3987) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][220/1251] eta 0:04:53 lr 0.000231 wd 0.0500 time 0.2229 (0.2846) data time 0.0007 (0.0043) model time 0.2222 (0.2249) loss 3.2034 (2.9525) grad_norm 4.3709 (3.3936) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][230/1251] eta 0:04:47 lr 0.000231 wd 0.0500 time 0.2306 (0.2819) data time 0.0006 (0.0041) model time 0.2300 (0.2248) loss 3.4148 (2.9511) grad_norm 5.8333 (3.3908) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][240/1251] eta 0:04:42 lr 0.000231 wd 0.0500 time 0.2239 (0.2794) data time 0.0007 (0.0040) model time 0.2232 (0.2248) loss 2.4762 (2.9526) grad_norm 4.2522 (3.4390) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][250/1251] eta 0:04:37 lr 0.000231 wd 0.0500 time 0.2213 (0.2772) data time 0.0012 (0.0038) model time 0.2201 (0.2248) loss 2.0074 (2.9453) grad_norm 3.1734 (3.4592) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:45 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [212/300][260/1251] eta 0:04:32 lr 0.000231 wd 0.0500 time 0.2226 (0.2752) data time 0.0007 (0.0037) model time 0.2220 (0.2250) loss 2.5943 (2.9418) grad_norm 4.5354 (3.4589) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][270/1251] eta 0:04:28 lr 0.000231 wd 0.0500 time 0.2259 (0.2734) data time 0.0009 (0.0036) model time 0.2249 (0.2250) loss 3.1552 (2.9337) grad_norm 3.4162 (3.4808) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][280/1251] eta 0:04:23 lr 0.000231 wd 0.0500 time 0.2299 (0.2717) data time 0.0006 (0.0035) model time 0.2293 (0.2251) loss 1.9187 (2.9243) grad_norm 2.7944 (3.4683) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:29:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:29:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:33:27 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 03:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:33:37 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 03:33:38 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:33:39 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:33:39 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 212) [2024-08-30 03:33:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][290/1251] eta 3:31:10 lr 0.000231 wd 0.0500 time 13.1844 (13.1844) data time 0.8245 (0.8245) model time 12.3599 (12.3599) loss 3.5473 (3.5473) grad_norm 3.6637 (3.6637) loss_scale 512.0000 (512.0000) mem 20033MB [2024-08-30 03:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][300/1251] eta 0:22:33 lr 0.000231 wd 0.0500 time 0.2338 (1.4236) data time 0.0009 (0.0759) model time 0.2329 (1.3477) loss 2.3999 (3.2688) grad_norm 2.9414 (3.1594) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][310/1251] eta 0:13:23 lr 0.000231 wd 0.0500 time 0.2313 (0.8544) data time 0.0012 (0.0403) model time 0.2301 (0.8141) loss 2.9730 (3.1984) grad_norm 2.7875 (3.0123) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][320/1251] eta 0:10:09 lr 0.000231 wd 0.0500 time 0.2209 (0.6542) data time 0.0009 (0.0277) model time 0.2200 (0.6265) loss 1.9980 (3.1836) grad_norm 3.4510 (4.5636) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:07 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][330/1251] eta 0:08:27 lr 0.000231 wd 0.0500 time 0.2272 (0.5512) data time 0.0009 (0.0212) model time 0.2262 (0.5300) loss 2.8902 (3.1280) grad_norm 9.9529 (4.3575) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][340/1251] eta 0:07:24 lr 0.000231 wd 0.0500 time 0.2258 (0.4878) data time 0.0007 (0.0173) model time 0.2251 (0.4705) loss 3.4429 (3.1112) grad_norm 4.4103 (4.1088) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][350/1251] eta 0:06:40 lr 0.000231 wd 0.0500 time 0.2235 (0.4450) data time 0.0009 (0.0146) model time 0.2226 (0.4304) loss 2.8734 (3.0794) grad_norm 3.6042 (3.9868) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][360/1251] eta 0:06:09 lr 0.000231 wd 0.0500 time 0.2380 (0.4150) data time 0.0011 (0.0127) model time 0.2370 (0.4022) loss 2.9140 (3.0542) grad_norm 3.2087 (3.8657) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][370/1251] eta 0:05:45 lr 0.000231 wd 0.0500 time 0.2279 (0.3920) data time 0.0009 (0.0113) model time 0.2270 (0.3807) loss 2.4810 (3.0262) grad_norm 2.9219 (3.7875) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][380/1251] eta 0:05:25 lr 0.000231 wd 0.0500 time 0.2208 (0.3739) data time 0.0009 (0.0102) model time 0.2199 (0.3638) loss 3.5548 (3.0174) grad_norm 3.3248 (3.9228) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][390/1251] eta 0:05:09 lr 0.000231 wd 0.0500 time 0.2293 (0.3597) data time 0.0012 (0.0093) model time 0.2282 (0.3504) loss 3.0977 (3.0264) 
grad_norm 3.7390 (3.8511) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][400/1251] eta 0:04:55 lr 0.000231 wd 0.0500 time 0.2280 (0.3478) data time 0.0009 (0.0086) model time 0.2270 (0.3393) loss 2.4352 (3.0332) grad_norm 2.8528 (3.7926) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][410/1251] eta 0:04:44 lr 0.000231 wd 0.0500 time 0.2263 (0.3378) data time 0.0007 (0.0079) model time 0.2256 (0.3299) loss 2.0400 (3.0310) grad_norm 3.2475 (3.7593) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][420/1251] eta 0:04:33 lr 0.000231 wd 0.0500 time 0.2239 (0.3295) data time 0.0011 (0.0074) model time 0.2228 (0.3221) loss 3.6996 (3.0227) grad_norm 3.9733 (3.7355) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][430/1251] eta 0:04:24 lr 0.000231 wd 0.0500 time 0.2289 (0.3223) data time 0.0007 (0.0070) model time 0.2282 (0.3153) loss 3.2316 (3.0145) grad_norm 3.0130 (3.7619) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][440/1251] eta 0:04:16 lr 0.000231 wd 0.0500 time 0.2312 (0.3160) data time 0.0011 (0.0066) model time 0.2302 (0.3095) loss 2.2442 (3.0141) grad_norm 2.3119 (3.7635) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][450/1251] eta 0:04:08 lr 0.000231 wd 0.0500 time 0.2209 (0.3107) data time 0.0013 (0.0063) model time 0.2195 (0.3045) loss 3.2882 (3.0210) grad_norm 4.9782 (3.7618) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][460/1251] eta 0:04:02 lr 0.000231 wd 0.0500 time 
0.2281 (0.3061) data time 0.0010 (0.0060) model time 0.2271 (0.3002) loss 3.3762 (3.0166) grad_norm 4.4161 (3.7529) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][470/1251] eta 0:03:55 lr 0.000231 wd 0.0500 time 0.2333 (0.3019) data time 0.0010 (0.0057) model time 0.2323 (0.2962) loss 3.5613 (3.0061) grad_norm 5.3230 (3.7355) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][480/1251] eta 0:03:49 lr 0.000231 wd 0.0500 time 0.2280 (0.2980) data time 0.0010 (0.0055) model time 0.2271 (0.2925) loss 2.6882 (3.0045) grad_norm 2.4866 (3.7323) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][490/1251] eta 0:03:44 lr 0.000230 wd 0.0500 time 0.2248 (0.2947) data time 0.0015 (0.0052) model time 0.2233 (0.2895) loss 2.6570 (2.9939) grad_norm 3.4888 (3.7887) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][500/1251] eta 0:03:38 lr 0.000230 wd 0.0500 time 0.2251 (0.2915) data time 0.0009 (0.0050) model time 0.2242 (0.2864) loss 3.1584 (2.9861) grad_norm 3.7864 (3.8394) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][510/1251] eta 0:03:34 lr 0.000230 wd 0.0500 time 0.2321 (0.2889) data time 0.0008 (0.0049) model time 0.2313 (0.2840) loss 3.0037 (2.9827) grad_norm 2.9660 (3.8149) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][520/1251] eta 0:03:29 lr 0.000230 wd 0.0500 time 0.2233 (0.2863) data time 0.0009 (0.0047) model time 0.2224 (0.2816) loss 1.8529 (2.9816) grad_norm 2.6496 (3.7915) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:53 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [212/300][530/1251] eta 0:03:24 lr 0.000230 wd 0.0500 time 0.2266 (0.2840) data time 0.0007 (0.0045) model time 0.2259 (0.2795) loss 3.5285 (2.9840) grad_norm 2.9867 (3.8484) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][540/1251] eta 0:03:20 lr 0.000230 wd 0.0500 time 0.2315 (0.2819) data time 0.0008 (0.0044) model time 0.2307 (0.2775) loss 3.3329 (2.9743) grad_norm 3.3412 (3.8369) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][550/1251] eta 0:03:16 lr 0.000230 wd 0.0500 time 0.2282 (0.2801) data time 0.0007 (0.0043) model time 0.2275 (0.2758) loss 3.1367 (2.9659) grad_norm 3.1676 (3.8137) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][560/1251] eta 0:03:12 lr 0.000230 wd 0.0500 time 0.2379 (0.2782) data time 0.0007 (0.0042) model time 0.2372 (0.2741) loss 3.1716 (2.9608) grad_norm 3.7457 (3.8265) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][570/1251] eta 0:03:08 lr 0.000230 wd 0.0500 time 0.2225 (0.2765) data time 0.0009 (0.0041) model time 0.2216 (0.2724) loss 2.5387 (2.9610) grad_norm 3.5316 (3.8185) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][580/1251] eta 0:03:04 lr 0.000230 wd 0.0500 time 0.2303 (0.2757) data time 0.0008 (0.0040) model time 0.2296 (0.2717) loss 2.0701 (2.9559) grad_norm 2.7138 (3.7833) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][590/1251] eta 0:03:01 lr 0.000230 wd 0.0500 time 0.2258 (0.2742) data time 0.0014 (0.0039) model time 0.2245 (0.2703) loss 2.6697 (2.9460) grad_norm 3.0165 
(3.7668) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][600/1251] eta 0:02:58 lr 0.000230 wd 0.0500 time 0.2214 (0.2735) data time 0.0011 (0.0038) model time 0.2203 (0.2697) loss 3.4948 (2.9459) grad_norm 3.2701 (3.7764) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][610/1251] eta 0:02:54 lr 0.000230 wd 0.0500 time 0.2254 (0.2720) data time 0.0007 (0.0037) model time 0.2247 (0.2683) loss 3.5658 (2.9578) grad_norm 4.0936 (3.7700) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][620/1251] eta 0:02:50 lr 0.000230 wd 0.0500 time 0.2281 (0.2707) data time 0.0007 (0.0036) model time 0.2274 (0.2671) loss 1.6716 (2.9561) grad_norm 3.8382 (3.7622) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][630/1251] eta 0:02:47 lr 0.000230 wd 0.0500 time 0.2473 (0.2697) data time 0.0010 (0.0035) model time 0.2462 (0.2661) loss 3.5016 (2.9609) grad_norm 2.9548 (3.7502) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][640/1251] eta 0:02:44 lr 0.000230 wd 0.0500 time 0.2346 (0.2686) data time 0.0007 (0.0035) model time 0.2339 (0.2651) loss 3.3161 (2.9617) grad_norm 5.4132 (3.7727) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][650/1251] eta 0:02:40 lr 0.000230 wd 0.0500 time 0.2219 (0.2675) data time 0.0011 (0.0034) model time 0.2208 (0.2641) loss 2.4370 (2.9617) grad_norm 9.8655 (3.7873) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][660/1251] eta 0:02:37 lr 0.000230 wd 0.0500 time 0.2339 (0.2665) data 
time 0.0010 (0.0034) model time 0.2329 (0.2632) loss 2.5032 (2.9580) grad_norm 2.9060 (3.7737) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][670/1251] eta 0:02:34 lr 0.000230 wd 0.0500 time 0.2307 (0.2655) data time 0.0007 (0.0033) model time 0.2300 (0.2622) loss 1.5533 (2.9553) grad_norm 3.4750 (3.7648) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][680/1251] eta 0:02:31 lr 0.000230 wd 0.0500 time 0.2395 (0.2647) data time 0.0008 (0.0032) model time 0.2386 (0.2614) loss 4.0186 (2.9533) grad_norm 2.6476 (3.7486) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][690/1251] eta 0:02:27 lr 0.000230 wd 0.0500 time 0.2263 (0.2637) data time 0.0010 (0.0032) model time 0.2253 (0.2605) loss 2.9344 (2.9554) grad_norm 2.7293 (3.7298) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][700/1251] eta 0:02:24 lr 0.000230 wd 0.0500 time 0.2258 (0.2629) data time 0.0010 (0.0031) model time 0.2248 (0.2598) loss 3.3028 (2.9582) grad_norm 3.0152 (3.7294) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][710/1251] eta 0:02:21 lr 0.000230 wd 0.0500 time 0.2316 (0.2622) data time 0.0008 (0.0031) model time 0.2308 (0.2591) loss 3.8794 (2.9620) grad_norm 3.6442 (3.7279) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][720/1251] eta 0:02:18 lr 0.000230 wd 0.0500 time 0.2242 (0.2615) data time 0.0008 (0.0030) model time 0.2234 (0.2584) loss 3.3371 (2.9661) grad_norm 3.1918 (3.7196) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [212/300][730/1251] eta 0:02:15 lr 0.000230 wd 0.0500 time 0.2255 (0.2608) data time 0.0010 (0.0030) model time 0.2245 (0.2578) loss 3.5546 (2.9682) grad_norm 4.6412 (3.7102) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][740/1251] eta 0:02:12 lr 0.000230 wd 0.0500 time 0.2226 (0.2601) data time 0.0012 (0.0029) model time 0.2214 (0.2571) loss 2.7729 (2.9684) grad_norm 6.6340 (3.8148) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][750/1251] eta 0:02:09 lr 0.000230 wd 0.0500 time 0.2215 (0.2594) data time 0.0014 (0.0029) model time 0.2201 (0.2565) loss 3.2975 (2.9625) grad_norm 4.4452 (3.8216) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][760/1251] eta 0:02:07 lr 0.000229 wd 0.0500 time 0.2341 (0.2588) data time 0.0011 (0.0029) model time 0.2330 (0.2559) loss 2.8540 (2.9537) grad_norm 2.9009 (3.8086) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][770/1251] eta 0:02:04 lr 0.000229 wd 0.0500 time 0.2229 (0.2582) data time 0.0009 (0.0028) model time 0.2220 (0.2554) loss 3.6330 (2.9543) grad_norm 4.6625 (3.7980) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][780/1251] eta 0:02:01 lr 0.000229 wd 0.0500 time 0.2262 (0.2576) data time 0.0009 (0.0028) model time 0.2253 (0.2548) loss 3.2935 (2.9581) grad_norm 2.7352 (3.7993) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][790/1251] eta 0:01:58 lr 0.000229 wd 0.0500 time 0.2282 (0.2570) data time 0.0011 (0.0028) model time 0.2270 (0.2542) loss 3.4351 (2.9581) grad_norm 2.8130 (3.7939) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-30 03:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][800/1251] eta 0:01:55 lr 0.000229 wd 0.0500 time 0.2282 (0.2565) data time 0.0007 (0.0027) model time 0.2274 (0.2538) loss 3.2650 (2.9637) grad_norm 6.2765 (3.7809) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][810/1251] eta 0:01:52 lr 0.000229 wd 0.0500 time 0.2246 (0.2560) data time 0.0009 (0.0027) model time 0.2237 (0.2533) loss 2.7825 (2.9641) grad_norm 2.5944 (3.7732) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][820/1251] eta 0:01:50 lr 0.000229 wd 0.0500 time 0.2266 (0.2555) data time 0.0010 (0.0027) model time 0.2256 (0.2528) loss 3.5239 (2.9585) grad_norm 3.4911 (3.7734) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][830/1251] eta 0:01:47 lr 0.000229 wd 0.0500 time 0.2306 (0.2551) data time 0.0014 (0.0026) model time 0.2292 (0.2524) loss 3.3772 (2.9577) grad_norm 3.2311 (3.7689) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][840/1251] eta 0:01:44 lr 0.000229 wd 0.0500 time 0.2403 (0.2547) data time 0.0007 (0.0026) model time 0.2396 (0.2521) loss 3.5683 (2.9580) grad_norm 5.1960 (3.7739) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][850/1251] eta 0:01:41 lr 0.000229 wd 0.0500 time 0.2233 (0.2542) data time 0.0010 (0.0026) model time 0.2223 (0.2516) loss 3.2474 (2.9609) grad_norm 2.6162 (3.7761) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][860/1251] eta 0:01:39 lr 0.000229 wd 0.0500 time 0.2243 (0.2538) data time 0.0014 (0.0026) model 
time 0.2229 (0.2512) loss 3.4266 (2.9624) grad_norm 3.1819 (3.7627) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][870/1251] eta 0:01:36 lr 0.000229 wd 0.0500 time 0.2260 (0.2533) data time 0.0007 (0.0025) model time 0.2253 (0.2508) loss 3.2685 (2.9633) grad_norm 2.5604 (3.7528) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][880/1251] eta 0:01:33 lr 0.000229 wd 0.0500 time 0.2353 (0.2529) data time 0.0010 (0.0025) model time 0.2343 (0.2504) loss 3.1467 (2.9654) grad_norm 3.4514 (3.7522) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][890/1251] eta 0:01:31 lr 0.000229 wd 0.0500 time 0.2279 (0.2526) data time 0.0010 (0.0025) model time 0.2268 (0.2501) loss 2.2765 (2.9633) grad_norm 4.3133 (3.7494) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][900/1251] eta 0:01:28 lr 0.000229 wd 0.0500 time 0.2297 (0.2522) data time 0.0009 (0.0025) model time 0.2288 (0.2497) loss 2.9886 (2.9635) grad_norm 2.7849 (3.7393) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][910/1251] eta 0:01:25 lr 0.000229 wd 0.0500 time 0.2238 (0.2517) data time 0.0009 (0.0024) model time 0.2229 (0.2493) loss 3.4457 (2.9662) grad_norm 3.0228 (3.7343) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][920/1251] eta 0:01:23 lr 0.000229 wd 0.0500 time 0.2261 (0.2514) data time 0.0013 (0.0024) model time 0.2248 (0.2489) loss 3.4897 (2.9676) grad_norm 3.1488 (3.7265) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][930/1251] 
eta 0:01:20 lr 0.000229 wd 0.0500 time 0.2217 (0.2510) data time 0.0008 (0.0024) model time 0.2209 (0.2486) loss 2.3763 (2.9656) grad_norm 2.9125 (3.7197) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][940/1251] eta 0:01:17 lr 0.000229 wd 0.0500 time 0.2295 (0.2507) data time 0.0007 (0.0024) model time 0.2288 (0.2483) loss 3.8060 (2.9669) grad_norm 4.0202 (3.7269) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][950/1251] eta 0:01:15 lr 0.000229 wd 0.0500 time 0.2314 (0.2504) data time 0.0007 (0.0024) model time 0.2307 (0.2480) loss 1.7582 (2.9609) grad_norm 2.9903 (3.7234) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][960/1251] eta 0:01:12 lr 0.000229 wd 0.0500 time 0.2297 (0.2501) data time 0.0007 (0.0023) model time 0.2290 (0.2477) loss 3.4825 (2.9641) grad_norm 3.1225 (3.7160) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][970/1251] eta 0:01:10 lr 0.000229 wd 0.0500 time 0.2287 (0.2498) data time 0.0010 (0.0023) model time 0.2277 (0.2474) loss 3.5419 (2.9656) grad_norm 4.1768 (3.7143) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][980/1251] eta 0:01:07 lr 0.000229 wd 0.0500 time 0.2267 (0.2494) data time 0.0010 (0.0023) model time 0.2257 (0.2471) loss 2.3104 (2.9639) grad_norm 2.9218 (3.7030) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][990/1251] eta 0:01:05 lr 0.000229 wd 0.0500 time 0.2313 (0.2491) data time 0.0009 (0.0023) model time 0.2304 (0.2469) loss 3.3140 (2.9623) grad_norm 3.8923 (3.7027) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 
03:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1000/1251] eta 0:01:02 lr 0.000229 wd 0.0500 time 0.2277 (0.2488) data time 0.0008 (0.0023) model time 0.2269 (0.2465) loss 2.1625 (2.9612) grad_norm 2.5894 (3.6945) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1010/1251] eta 0:00:59 lr 0.000229 wd 0.0500 time 0.2251 (0.2486) data time 0.0011 (0.0023) model time 0.2240 (0.2463) loss 3.1379 (2.9575) grad_norm 3.2027 (3.6843) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1020/1251] eta 0:00:57 lr 0.000229 wd 0.0500 time 0.2343 (0.2483) data time 0.0010 (0.0022) model time 0.2333 (0.2461) loss 3.0889 (2.9580) grad_norm 4.4714 (3.6730) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1030/1251] eta 0:00:54 lr 0.000229 wd 0.0500 time 0.2298 (0.2480) data time 0.0012 (0.0022) model time 0.2286 (0.2458) loss 3.3213 (2.9611) grad_norm 2.4468 (3.6650) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1040/1251] eta 0:00:52 lr 0.000228 wd 0.0500 time 0.2249 (0.2478) data time 0.0010 (0.0022) model time 0.2240 (0.2456) loss 3.1700 (2.9604) grad_norm 2.4142 (3.6597) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1050/1251] eta 0:00:49 lr 0.000228 wd 0.0500 time 0.2256 (0.2475) data time 0.0015 (0.0022) model time 0.2241 (0.2453) loss 3.0665 (2.9599) grad_norm 3.6242 (3.6716) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1060/1251] eta 0:00:47 lr 0.000228 wd 0.0500 time 0.2306 (0.2473) data time 0.0007 (0.0022) model time 0.2298 (0.2451) loss 
3.7966 (2.9614) grad_norm 3.9648 (3.6789) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:36:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1070/1251] eta 0:00:44 lr 0.000228 wd 0.0500 time 0.2328 (0.2471) data time 0.0009 (0.0022) model time 0.2318 (0.2450) loss 3.6279 (2.9635) grad_norm 2.3750 (3.6738) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1080/1251] eta 0:00:42 lr 0.000228 wd 0.0500 time 0.2196 (0.2469) data time 0.0009 (0.0022) model time 0.2187 (0.2447) loss 2.9712 (2.9620) grad_norm 2.0840 (3.6650) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1090/1251] eta 0:00:39 lr 0.000228 wd 0.0500 time 0.2198 (0.2467) data time 0.0009 (0.0021) model time 0.2189 (0.2446) loss 3.5663 (2.9639) grad_norm 2.5719 (3.6565) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1100/1251] eta 0:00:37 lr 0.000228 wd 0.0500 time 0.2276 (0.2465) data time 0.0008 (0.0021) model time 0.2268 (0.2443) loss 2.1248 (2.9587) grad_norm 3.2914 (3.6464) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1110/1251] eta 0:00:34 lr 0.000228 wd 0.0500 time 0.2336 (0.2466) data time 0.0008 (0.0021) model time 0.2328 (0.2444) loss 3.2720 (2.9603) grad_norm 2.8248 (3.6391) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:37:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:37:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 03:37:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:38:44 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 03:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:38:57 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 03:38:58 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:38:59 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:38:59 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 212) [2024-08-30 03:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1120/1251] eta 0:16:30 lr 0.000228 wd 0.0500 time 0.3486 (7.5593) data time 0.0008 (0.9373) model time 0.3478 (6.6219) loss 3.7633 (3.8199) grad_norm 2.3907 (2.5793) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 03:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1130/1251] eta 0:02:55 lr 0.000228 wd 0.0500 time 0.2226 (1.4479) data time 0.0007 (0.1571) model time 0.2219 (1.2908) loss 2.2514 (3.2516) grad_norm 3.6742 (3.2790) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [212/300][1140/1251] eta 0:01:38 lr 0.000228 wd 0.0500 time 0.2211 (0.8909) data time 0.0008 (0.0861) model time 0.2202 (0.8047) loss 3.3296 (3.2265) grad_norm 3.5136 (3.2836) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1150/1251] eta 0:01:08 lr 0.000228 wd 0.0500 time 0.2297 (0.6830) data time 0.0006 (0.0595) model time 0.2291 (0.6235) loss 3.3144 (3.2440) grad_norm 3.5583 (3.3557) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1160/1251] eta 0:00:52 lr 0.000228 wd 0.0500 time 0.2194 (0.5739) data time 0.0010 (0.0456) model time 0.2184 (0.5284) loss 3.0623 (3.1974) grad_norm 3.1781 (3.4462) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1170/1251] eta 0:00:41 lr 0.000228 wd 0.0500 time 0.2239 (0.5070) data time 0.0006 (0.0370) model time 0.2233 (0.4700) loss 2.9465 (3.1992) grad_norm 3.1477 (3.4808) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1180/1251] eta 0:00:32 lr 0.000228 wd 0.0500 time 0.2258 (0.4618) data time 0.0007 (0.0312) model time 0.2251 (0.4306) loss 3.7926 (3.1579) grad_norm 2.9790 (3.6071) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1190/1251] eta 0:00:26 lr 0.000228 wd 0.0500 time 0.2302 (0.4291) data time 0.0009 (0.0270) model time 0.2293 (0.4022) loss 3.1213 (3.1264) grad_norm 2.7656 (3.5925) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1200/1251] eta 0:00:20 lr 0.000228 wd 0.0500 time 0.2201 (0.4041) data time 0.0008 (0.0238) model time 0.2193 (0.3803) loss 2.9144 (3.1059) grad_norm 5.9881 (3.6210) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-30 03:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1210/1251] eta 0:00:15 lr 0.000228 wd 0.0500 time 0.2343 (0.3849) data time 0.0007 (0.0213) model time 0.2336 (0.3636) loss 2.0605 (3.0960) grad_norm 2.8764 (3.6180) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1220/1251] eta 0:00:11 lr 0.000228 wd 0.0500 time 0.2265 (0.3695) data time 0.0006 (0.0193) model time 0.2259 (0.3502) loss 3.5279 (3.1084) grad_norm 2.8276 (3.6918) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1230/1251] eta 0:00:07 lr 0.000228 wd 0.0500 time 0.2247 (0.3568) data time 0.0009 (0.0177) model time 0.2237 (0.3391) loss 3.1779 (3.0993) grad_norm 2.9174 (3.6667) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1240/1251] eta 0:00:03 lr 0.000228 wd 0.0500 time 0.2149 (0.3456) data time 0.0003 (0.0164) model time 0.2146 (0.3293) loss 3.2113 (3.0948) grad_norm 2.8008 (3.6830) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [212/300][1250/1251] eta 0:00:00 lr 0.000228 wd 0.0500 time 0.2236 (0.3357) data time 0.0005 (0.0152) model time 0.2232 (0.3206) loss 3.2161 (3.0802) grad_norm 3.9076 (3.6571) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 212 training takes 0:00:44 [2024-08-30 03:39:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:39:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 03:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.366 (0.366) Loss 0.4268 (0.4268) Acc@1 91.895 (91.895) Acc@5 98.145 (98.145) Mem 7377MB [2024-08-30 03:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.098) Loss 0.6729 (0.6755) Acc@1 86.426 (85.804) Acc@5 97.461 (97.381) Mem 7377MB [2024-08-30 03:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.085) Loss 0.9702 (0.7004) Acc@1 76.562 (85.040) Acc@5 95.020 (97.331) Mem 7377MB [2024-08-30 03:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.081) Loss 1.1484 (0.7921) Acc@1 71.973 (82.800) Acc@5 92.480 (96.270) Mem 7377MB [2024-08-30 03:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.0625 (0.8428) Acc@1 75.879 (81.595) Acc@5 94.043 (95.682) Mem 7377MB [2024-08-30 03:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.220 Acc@5 95.616 [2024-08-30 03:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.2% [2024-08-30 03:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.678 (0.678) Loss 0.3789 (0.3789) Acc@1 93.262 (93.262) Acc@5 98.633 (98.633) Mem 7377MB [2024-08-30 03:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.135) Loss 0.5903 (0.6062) Acc@1 88.770 (87.349) Acc@5 97.852 (97.710) Mem 7377MB [2024-08-30 03:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.104) Loss 0.8633 (0.6325) Acc@1 78.418 (86.305) Acc@5 95.996 (97.689) Mem 7377MB [2024-08-30 03:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.093) Loss 1.0859 (0.7163) Acc@1 73.730 (84.268) Acc@5 92.969 (96.784) Mem 7377MB [2024-08-30 03:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.086) Loss 0.9819 
(0.7594) Acc@1 76.660 (83.141) Acc@5 94.141 (96.344) Mem 7377MB [2024-08-30 03:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.722 Acc@5 96.300 [2024-08-30 03:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-08-30 03:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.72% [2024-08-30 03:39:59 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 03:40:00 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-30 03:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][0/1251] eta 0:12:06 lr 0.000228 wd 0.0500 time 0.5806 (0.5806) data time 0.3326 (0.3326) model time 0.0000 (0.0000) loss 3.1393 (3.1393) grad_norm 3.7389 (3.7389) loss_scale 512.0000 (512.0000) mem 7380MB [2024-08-30 03:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][10/1251] eta 0:05:19 lr 0.000228 wd 0.0500 time 0.2193 (0.2571) data time 0.0010 (0.0311) model time 0.0000 (0.0000) loss 3.3769 (2.9160) grad_norm 6.6612 (3.7772) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 03:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][20/1251] eta 0:04:57 lr 0.000228 wd 0.0500 time 0.2239 (0.2417) data time 0.0008 (0.0168) model time 0.0000 (0.0000) loss 2.6966 (2.8808) grad_norm 2.4845 (3.5008) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 03:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:40:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 03:40:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 03:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:54:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 03:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:54:44 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 03:54:46 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:54:47 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:54:47 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 213) [2024-08-30 03:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][30/1251] eta 0:55:44 lr 0.000228 wd 0.0500 time 0.2222 (2.7390) data time 0.0007 (0.1027) model time 0.0000 (0.0000) loss 3.5581 (3.4534) grad_norm 3.8126 (3.4020) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][40/1251] eta 0:21:27 lr 0.000228 wd 0.0500 time 0.2371 (1.0628) data time 0.0010 (0.0349) model time 0.0000 (0.0000) loss 3.0810 (3.2081) grad_norm 3.4562 (3.8564) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][50/1251] eta 0:14:33 lr 0.000228 wd 0.0500 time 0.2317 (0.7269) data time 0.0009 (0.0213) model time 0.0000 (0.0000) loss 3.3629 (3.2057) grad_norm 4.7937 (3.8923) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 03:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 03:55:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 03:55:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 03:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 03:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 03:59:12 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 03:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 03:59:21 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 03:59:23 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 03:59:24 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 03:59:24 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 213) [2024-08-30 03:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 03:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][60/1251] eta 2:23:42 lr 0.000227 wd 0.0500 time 0.3991 (7.2397) data time 0.0009 (0.5551) model time 0.3982 (6.6846) loss 3.7936 (3.7255) grad_norm 2.3935 (2.8829) loss_scale 512.0000 (512.0000) mem 7378MB [2024-08-30 03:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][70/1251] eta 0:27:52 lr 0.000227 wd 0.0500 time 0.2453 (1.4159) data time 0.0012 (0.0934) model time 0.2440 (1.3224) loss 2.5098 (3.2799) grad_norm 3.7327 (3.7455) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][80/1251] eta 0:17:17 lr 0.000227 wd 0.0500 time 0.2248 (0.8859) data time 0.0010 (0.0515) model time 0.2238 (0.8344) loss 3.1995 (3.2171) 
grad_norm 2.7251 (3.8772) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][90/1251] eta 0:13:09 lr 0.000227 wd 0.0500 time 0.2267 (0.6803) data time 0.0009 (0.0357) model time 0.2257 (0.6446) loss 2.9424 (3.1552) grad_norm 2.6222 (3.5357) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][100/1251] eta 0:11:00 lr 0.000227 wd 0.0500 time 0.2781 (0.5740) data time 0.0011 (0.0275) model time 0.2770 (0.5465) loss 3.0268 (3.0984) grad_norm 4.7938 (3.4924) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][110/1251] eta 0:09:43 lr 0.000227 wd 0.0500 time 0.2849 (0.5117) data time 0.0012 (0.0224) model time 0.2837 (0.4892) loss 3.1000 (3.0855) grad_norm 2.8817 (3.5614) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 03:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][120/1251] eta 0:08:52 lr 0.000227 wd 0.0500 time 0.2961 (0.4708) data time 0.0015 (0.0190) model time 0.2947 (0.4518) loss 3.3257 (3.0710) grad_norm 2.6089 (3.6537) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][130/1251] eta 0:08:09 lr 0.000227 wd 0.0500 time 0.2242 (0.4371) data time 0.0009 (0.0165) model time 0.2233 (0.4206) loss 3.0493 (3.0416) grad_norm 3.3296 (3.6316) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][140/1251] eta 0:07:37 lr 0.000227 wd 0.0500 time 0.2296 (0.4118) data time 0.0010 (0.0146) model time 0.2286 (0.3972) loss 2.8675 (3.0241) grad_norm 3.0364 (3.6622) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][150/1251] eta 0:07:11 lr 0.000227 wd 0.0500 time 
0.2277 (0.3919) data time 0.0008 (0.0132) model time 0.2268 (0.3788) loss 2.1131 (3.0058) grad_norm 45.2768 (4.0848) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][160/1251] eta 0:06:53 lr 0.000227 wd 0.0500 time 0.2210 (0.3791) data time 0.0009 (0.0120) model time 0.2201 (0.3671) loss 3.5983 (3.0382) grad_norm 2.8409 (4.0510) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][170/1251] eta 0:06:35 lr 0.000227 wd 0.0500 time 0.2200 (0.3657) data time 0.0011 (0.0110) model time 0.2189 (0.3547) loss 3.2678 (3.0383) grad_norm 2.6693 (3.9783) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][180/1251] eta 0:06:20 lr 0.000227 wd 0.0500 time 0.2318 (0.3553) data time 0.0008 (0.0102) model time 0.2310 (0.3451) loss 3.2119 (3.0402) grad_norm 3.7943 (3.9819) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][190/1251] eta 0:06:06 lr 0.000227 wd 0.0500 time 0.2290 (0.3455) data time 0.0009 (0.0095) model time 0.2281 (0.3360) loss 3.2479 (3.0372) grad_norm 3.7038 (3.9575) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][200/1251] eta 0:05:56 lr 0.000227 wd 0.0500 time 0.2769 (0.3390) data time 0.0013 (0.0089) model time 0.2756 (0.3301) loss 3.1095 (3.0244) grad_norm 3.2591 (4.0675) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][210/1251] eta 0:05:45 lr 0.000227 wd 0.0500 time 0.2280 (0.3324) data time 0.0010 (0.0084) model time 0.2270 (0.3240) loss 3.7234 (3.0243) grad_norm 4.3893 (4.0426) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:22 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [213/300][220/1251] eta 0:05:36 lr 0.000227 wd 0.0500 time 0.2315 (0.3260) data time 0.0011 (0.0080) model time 0.2304 (0.3180) loss 3.4034 (3.0323) grad_norm 2.7797 (3.9932) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][230/1251] eta 0:05:27 lr 0.000227 wd 0.0500 time 0.2357 (0.3204) data time 0.0010 (0.0076) model time 0.2347 (0.3128) loss 2.8318 (3.0256) grad_norm 4.6858 (3.9435) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][240/1251] eta 0:05:21 lr 0.000227 wd 0.0500 time 0.2314 (0.3176) data time 0.0010 (0.0072) model time 0.2304 (0.3104) loss 3.4465 (3.0150) grad_norm 2.4572 (3.8860) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][250/1251] eta 0:05:14 lr 0.000227 wd 0.0500 time 0.3289 (0.3139) data time 0.0021 (0.0069) model time 0.3269 (0.3070) loss 3.0397 (3.0119) grad_norm 2.9099 (3.8346) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][260/1251] eta 0:05:07 lr 0.000227 wd 0.0500 time 0.2299 (0.3101) data time 0.0009 (0.0066) model time 0.2290 (0.3035) loss 3.6250 (3.0017) grad_norm 4.2822 (3.8245) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][270/1251] eta 0:05:00 lr 0.000227 wd 0.0500 time 0.2324 (0.3062) data time 0.0011 (0.0064) model time 0.2314 (0.2999) loss 3.0137 (2.9918) grad_norm 2.5178 (3.7815) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][280/1251] eta 0:04:54 lr 0.000227 wd 0.0500 time 0.2362 (0.3037) data time 0.0010 (0.0061) model time 0.2352 (0.2975) loss 3.7817 (2.9885) grad_norm 2.8933 
(3.7402) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][290/1251] eta 0:04:49 lr 0.000227 wd 0.0500 time 0.2229 (0.3011) data time 0.0012 (0.0059) model time 0.2217 (0.2952) loss 3.1914 (2.9838) grad_norm 3.2912 (3.7252) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][300/1251] eta 0:04:43 lr 0.000227 wd 0.0500 time 0.2404 (0.2982) data time 0.0009 (0.0057) model time 0.2395 (0.2925) loss 3.0475 (2.9806) grad_norm 2.8399 (3.7009) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][310/1251] eta 0:04:38 lr 0.000227 wd 0.0500 time 0.2501 (0.2962) data time 0.0019 (0.0055) model time 0.2482 (0.2906) loss 3.1106 (2.9702) grad_norm 3.3832 (3.6989) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][320/1251] eta 0:04:34 lr 0.000227 wd 0.0500 time 0.2770 (0.2945) data time 0.0015 (0.0054) model time 0.2755 (0.2891) loss 3.4305 (2.9654) grad_norm 2.7419 (3.6851) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][330/1251] eta 0:04:29 lr 0.000226 wd 0.0500 time 0.2306 (0.2930) data time 0.0015 (0.0052) model time 0.2291 (0.2878) loss 2.7091 (2.9611) grad_norm 2.5899 (3.6645) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][340/1251] eta 0:04:24 lr 0.000226 wd 0.0500 time 0.2306 (0.2907) data time 0.0010 (0.0051) model time 0.2296 (0.2856) loss 2.0480 (2.9638) grad_norm 2.8192 (3.6989) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][350/1251] eta 0:04:20 lr 0.000226 wd 0.0500 time 0.2241 (0.2893) data 
time 0.0011 (0.0049) model time 0.2230 (0.2844) loss 2.9540 (2.9590) grad_norm 3.7388 (3.7095) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][360/1251] eta 0:04:16 lr 0.000226 wd 0.0500 time 0.2265 (0.2878) data time 0.0009 (0.0048) model time 0.2256 (0.2831) loss 3.0239 (2.9523) grad_norm 7.2434 (3.7314) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][370/1251] eta 0:04:13 lr 0.000226 wd 0.0500 time 0.2394 (0.2872) data time 0.0010 (0.0047) model time 0.2385 (0.2825) loss 3.2756 (2.9529) grad_norm 2.6599 (3.7266) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][380/1251] eta 0:04:09 lr 0.000226 wd 0.0500 time 0.2681 (0.2859) data time 0.0015 (0.0046) model time 0.2666 (0.2814) loss 2.9329 (2.9605) grad_norm 4.8691 (3.7144) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][390/1251] eta 0:04:04 lr 0.000226 wd 0.0500 time 0.2257 (0.2842) data time 0.0011 (0.0045) model time 0.2246 (0.2797) loss 3.1770 (2.9550) grad_norm 3.4136 (3.7019) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][400/1251] eta 0:04:00 lr 0.000226 wd 0.0500 time 0.3171 (0.2830) data time 0.0013 (0.0044) model time 0.3158 (0.2787) loss 3.2643 (2.9582) grad_norm 2.7308 (3.7010) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][410/1251] eta 0:03:57 lr 0.000226 wd 0.0500 time 0.2276 (0.2823) data time 0.0008 (0.0043) model time 0.2268 (0.2780) loss 3.0086 (2.9573) grad_norm 4.0221 (3.7104) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [213/300][420/1251] eta 0:03:53 lr 0.000226 wd 0.0500 time 0.2357 (0.2809) data time 0.0008 (0.0042) model time 0.2349 (0.2767) loss 3.7617 (2.9597) grad_norm 5.5548 (3.7165) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][430/1251] eta 0:03:49 lr 0.000226 wd 0.0500 time 0.2226 (0.2794) data time 0.0011 (0.0041) model time 0.2216 (0.2753) loss 3.1879 (2.9552) grad_norm 2.9194 (3.7165) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][440/1251] eta 0:03:45 lr 0.000226 wd 0.0500 time 0.2339 (0.2786) data time 0.0009 (0.0040) model time 0.2330 (0.2746) loss 3.4693 (2.9519) grad_norm 3.2982 (3.7004) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][450/1251] eta 0:03:42 lr 0.000226 wd 0.0500 time 0.2238 (0.2777) data time 0.0007 (0.0039) model time 0.2231 (0.2738) loss 2.6869 (2.9468) grad_norm 3.0468 (3.7002) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][460/1251] eta 0:03:39 lr 0.000226 wd 0.0500 time 0.2272 (0.2770) data time 0.0008 (0.0039) model time 0.2264 (0.2732) loss 3.6539 (2.9541) grad_norm 3.2048 (3.6882) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][470/1251] eta 0:03:35 lr 0.000226 wd 0.0500 time 0.2274 (0.2759) data time 0.0009 (0.0038) model time 0.2265 (0.2721) loss 3.2518 (2.9573) grad_norm 4.6521 (3.6827) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][480/1251] eta 0:03:31 lr 0.000226 wd 0.0500 time 0.2240 (0.2748) data time 0.0006 (0.0037) model time 0.2234 (0.2710) loss 3.2956 (2.9565) grad_norm 2.6823 (3.6721) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-30 04:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][490/1251] eta 0:03:28 lr 0.000226 wd 0.0500 time 0.2272 (0.2745) data time 0.0007 (0.0037) model time 0.2265 (0.2708) loss 2.6771 (2.9601) grad_norm 3.5333 (3.6565) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][500/1251] eta 0:03:25 lr 0.000226 wd 0.0500 time 0.2271 (0.2734) data time 0.0007 (0.0036) model time 0.2264 (0.2698) loss 3.0781 (2.9649) grad_norm 2.7850 (3.6469) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][510/1251] eta 0:03:21 lr 0.000226 wd 0.0500 time 0.2271 (0.2724) data time 0.0012 (0.0036) model time 0.2259 (0.2689) loss 2.7315 (2.9635) grad_norm 3.4856 (3.6632) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][520/1251] eta 0:03:18 lr 0.000226 wd 0.0500 time 0.2218 (0.2718) data time 0.0010 (0.0035) model time 0.2208 (0.2683) loss 2.1824 (2.9593) grad_norm 3.8219 (3.6754) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][530/1251] eta 0:03:15 lr 0.000226 wd 0.0500 time 0.2248 (0.2714) data time 0.0011 (0.0035) model time 0.2237 (0.2680) loss 2.9346 (2.9550) grad_norm 3.2537 (3.6727) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][540/1251] eta 0:03:12 lr 0.000226 wd 0.0500 time 0.2303 (0.2710) data time 0.0009 (0.0034) model time 0.2294 (0.2675) loss 3.2477 (2.9540) grad_norm 3.1850 (3.6588) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][550/1251] eta 0:03:09 lr 0.000226 wd 0.0500 time 0.2217 (0.2701) data time 0.0013 (0.0034) model 
time 0.2204 (0.2667) loss 3.5493 (2.9572) grad_norm 2.9946 (3.6504) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][560/1251] eta 0:03:06 lr 0.000226 wd 0.0500 time 0.2284 (0.2693) data time 0.0013 (0.0033) model time 0.2271 (0.2659) loss 3.2668 (2.9565) grad_norm 2.5393 (3.6394) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][570/1251] eta 0:03:03 lr 0.000226 wd 0.0500 time 0.2857 (0.2691) data time 0.0012 (0.0033) model time 0.2846 (0.2658) loss 3.4800 (2.9605) grad_norm 2.6449 (3.6218) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][580/1251] eta 0:03:00 lr 0.000226 wd 0.0500 time 0.2270 (0.2685) data time 0.0010 (0.0032) model time 0.2260 (0.2653) loss 2.0354 (2.9577) grad_norm 4.4708 (3.6233) loss_scale 1024.0000 (516.9042) mem 7377MB [2024-08-30 04:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][590/1251] eta 0:02:57 lr 0.000226 wd 0.0500 time 0.2266 (0.2682) data time 0.0009 (0.0032) model time 0.2257 (0.2650) loss 3.2689 (2.9541) grad_norm 2.9427 (3.6169) loss_scale 1024.0000 (526.4361) mem 7377MB [2024-08-30 04:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][600/1251] eta 0:02:54 lr 0.000225 wd 0.0500 time 0.2534 (0.2675) data time 0.0007 (0.0032) model time 0.2528 (0.2643) loss 2.2907 (2.9539) grad_norm 3.0820 (3.6112) loss_scale 1024.0000 (535.6162) mem 7377MB [2024-08-30 04:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][610/1251] eta 0:02:51 lr 0.000225 wd 0.0500 time 0.2347 (0.2671) data time 0.0007 (0.0031) model time 0.2341 (0.2640) loss 3.2886 (2.9586) grad_norm 4.0014 (3.6040) loss_scale 1024.0000 (544.4638) mem 7377MB [2024-08-30 04:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[213/300][620/1251] eta 0:02:48 lr 0.000225 wd 0.0500 time 0.2249 (0.2668) data time 0.0009 (0.0031) model time 0.2240 (0.2637) loss 3.7047 (2.9627) grad_norm 3.2011 (3.5937) loss_scale 1024.0000 (552.9964) mem 7377MB [2024-08-30 04:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][630/1251] eta 0:02:45 lr 0.000225 wd 0.0500 time 0.2276 (0.2662) data time 0.0009 (0.0030) model time 0.2267 (0.2631) loss 2.4693 (2.9628) grad_norm 2.4819 (3.5916) loss_scale 1024.0000 (561.2308) mem 7377MB [2024-08-30 04:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][640/1251] eta 0:02:42 lr 0.000225 wd 0.0500 time 0.2287 (0.2655) data time 0.0010 (0.0030) model time 0.2277 (0.2625) loss 3.2681 (2.9656) grad_norm 3.7430 (3.6078) loss_scale 1024.0000 (569.1821) mem 7377MB [2024-08-30 04:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][650/1251] eta 0:02:39 lr 0.000225 wd 0.0500 time 0.2754 (0.2654) data time 0.0013 (0.0030) model time 0.2741 (0.2625) loss 3.1198 (2.9674) grad_norm 3.7063 (3.6064) loss_scale 1024.0000 (576.8649) mem 7377MB [2024-08-30 04:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][660/1251] eta 0:02:36 lr 0.000225 wd 0.0500 time 0.3193 (0.2650) data time 0.0009 (0.0030) model time 0.3184 (0.2620) loss 3.2643 (2.9657) grad_norm 2.7603 (3.6028) loss_scale 1024.0000 (584.2924) mem 7377MB [2024-08-30 04:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][670/1251] eta 0:02:33 lr 0.000225 wd 0.0500 time 0.2282 (0.2645) data time 0.0010 (0.0029) model time 0.2272 (0.2616) loss 1.9517 (2.9645) grad_norm 3.0223 (3.5995) loss_scale 1024.0000 (591.4771) mem 7377MB [2024-08-30 04:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][680/1251] eta 0:02:30 lr 0.000225 wd 0.0500 time 0.2242 (0.2639) data time 0.0009 (0.0029) model time 0.2233 (0.2611) loss 2.8063 (2.9679) grad_norm 2.8915 (3.6120) loss_scale 1024.0000 
(598.4309) mem 7377MB [2024-08-30 04:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][690/1251] eta 0:02:27 lr 0.000225 wd 0.0500 time 0.3262 (0.2637) data time 0.0013 (0.0029) model time 0.3249 (0.2608) loss 3.2364 (2.9689) grad_norm 2.5792 (3.6177) loss_scale 1024.0000 (605.1646) mem 7377MB [2024-08-30 04:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][700/1251] eta 0:02:25 lr 0.000225 wd 0.0500 time 0.2327 (0.2635) data time 0.0011 (0.0028) model time 0.2316 (0.2607) loss 2.1687 (2.9658) grad_norm 2.6803 (3.6224) loss_scale 1024.0000 (611.6885) mem 7377MB [2024-08-30 04:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][710/1251] eta 0:02:22 lr 0.000225 wd 0.0500 time 0.2358 (0.2629) data time 0.0010 (0.0028) model time 0.2347 (0.2601) loss 1.7539 (2.9647) grad_norm 4.5406 (3.6297) loss_scale 1024.0000 (618.0123) mem 7377MB [2024-08-30 04:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][720/1251] eta 0:02:19 lr 0.000225 wd 0.0500 time 0.2271 (0.2627) data time 0.0012 (0.0028) model time 0.2259 (0.2599) loss 3.1108 (2.9607) grad_norm 4.5912 (3.6309) loss_scale 1024.0000 (624.1450) mem 7377MB [2024-08-30 04:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][730/1251] eta 0:02:16 lr 0.000225 wd 0.0500 time 0.2399 (0.2623) data time 0.0012 (0.0028) model time 0.2387 (0.2595) loss 2.4109 (2.9619) grad_norm 2.9480 (3.6329) loss_scale 1024.0000 (630.0952) mem 7377MB [2024-08-30 04:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][740/1251] eta 0:02:14 lr 0.000225 wd 0.0500 time 0.3173 (0.2623) data time 0.0013 (0.0027) model time 0.3161 (0.2596) loss 2.8329 (2.9634) grad_norm 3.0630 (3.6274) loss_scale 1024.0000 (635.8710) mem 7377MB [2024-08-30 04:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][750/1251] eta 0:02:11 lr 0.000225 wd 0.0500 time 0.2310 (0.2618) data time 0.0010 (0.0027) 
model time 0.2301 (0.2591) loss 3.5018 (2.9614) grad_norm 3.3298 (3.6283) loss_scale 1024.0000 (641.4798) mem 7377MB [2024-08-30 04:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][760/1251] eta 0:02:08 lr 0.000225 wd 0.0500 time 0.2261 (0.2613) data time 0.0007 (0.0027) model time 0.2254 (0.2586) loss 2.3510 (2.9583) grad_norm 3.5379 (3.6353) loss_scale 1024.0000 (646.9288) mem 7377MB [2024-08-30 04:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][770/1251] eta 0:02:05 lr 0.000225 wd 0.0500 time 0.2379 (0.2609) data time 0.0009 (0.0027) model time 0.2370 (0.2582) loss 1.7802 (2.9555) grad_norm 2.6131 (3.6356) loss_scale 1024.0000 (652.2247) mem 7377MB [2024-08-30 04:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][780/1251] eta 0:02:02 lr 0.000225 wd 0.0500 time 0.2374 (0.2610) data time 0.0011 (0.0026) model time 0.2362 (0.2583) loss 2.8381 (2.9538) grad_norm 2.7877 (3.6256) loss_scale 1024.0000 (657.3740) mem 7377MB [2024-08-30 04:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][790/1251] eta 0:02:00 lr 0.000225 wd 0.0500 time 0.2213 (0.2605) data time 0.0010 (0.0026) model time 0.2203 (0.2579) loss 2.9325 (2.9547) grad_norm 3.3777 (3.6200) loss_scale 1024.0000 (662.3825) mem 7377MB [2024-08-30 04:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 04:02:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 04:02:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 04:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 04:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 04:06:55 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 04:08:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 04:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 04:09:06 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 04:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 04:09:18 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 04:09:19 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 04:09:20 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 04:09:20 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 213) [2024-08-30 04:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 04:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][800/1251] eta 0:37:29 lr 0.000225 wd 0.0500 time 0.2255 (4.9884) data time 0.0007 (0.5678) model time 0.2248 (4.4206) loss 3.0066 (3.4295) grad_norm 2.8616 (3.7023) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][810/1251] eta 0:09:53 lr 0.000225 wd 0.0500 time 0.3221 (1.3466) data time 0.0017 (0.1319) model time 0.3204 (1.2147) loss 3.2519 (3.2597) grad_norm 4.1925 (3.7558) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][820/1251] eta 0:06:10 lr 0.000225 wd 0.0500 time 0.2233 (0.8605) data time 0.0007 (0.0750) model time 0.2225 (0.7855) loss 3.5736 (3.2773) grad_norm 3.4375 (3.7306) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][830/1251] eta 0:04:46 lr 0.000225 wd 0.0500 time 0.3019 (0.6796) data time 0.0010 (0.0526) model time 0.3009 (0.6270) loss 3.6758 (3.2560) grad_norm 2.9565 (3.5471) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][840/1251] eta 0:03:58 lr 0.000225 wd 0.0500 time 0.2295 (0.5799) data time 0.0011 (0.0406) model time 0.2284 (0.5393) loss 3.0051 (3.1832) grad_norm 4.1745 (3.5269) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [213/300][850/1251] eta 0:03:26 lr 0.000225 wd 0.0500 time 0.2335 (0.5143) data time 0.0009 (0.0331) model time 0.2326 (0.4811) loss 2.9808 (3.1591) grad_norm 3.0973 (3.4282) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][860/1251] eta 0:03:03 lr 0.000225 wd 0.0500 time 0.2299 (0.4688) data time 0.0009 (0.0280) model time 0.2291 (0.4408) loss 2.9564 (3.1315) grad_norm 1.8882 (3.4214) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][870/1251] eta 0:02:48 lr 0.000224 wd 0.0500 time 0.2291 (0.4429) data time 0.0010 (0.0244) model time 0.2281 (0.4185) loss 3.3080 (3.1079) grad_norm 2.5035 (3.3759) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][880/1251] eta 0:02:35 lr 0.000224 wd 0.0500 time 0.4043 (0.4197) data time 0.0012 (0.0216) model time 0.4031 (0.3981) loss 1.9680 (3.0803) grad_norm 2.8585 (3.3449) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][890/1251] eta 0:02:24 lr 0.000224 wd 0.0500 time 0.2299 (0.3993) data time 0.0007 (0.0194) model time 0.2292 (0.3799) loss 3.5584 (3.0656) grad_norm 3.7955 (3.3578) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][900/1251] eta 0:02:14 lr 0.000224 wd 0.0500 time 0.2337 (0.3828) data time 0.0010 (0.0176) model time 0.2327 (0.3653) loss 3.5314 (3.0741) grad_norm 3.9283 (3.4269) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][910/1251] eta 0:02:07 lr 0.000224 wd 0.0500 time 0.3412 (0.3728) data time 0.0014 (0.0161) model time 0.3398 (0.3566) loss 2.6422 (3.0617) grad_norm 2.5792 (3.4096) loss_scale 
1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][920/1251] eta 0:01:59 lr 0.000224 wd 0.0500 time 0.2232 (0.3622) data time 0.0013 (0.0150) model time 0.2219 (0.3473) loss 3.0350 (3.0523) grad_norm 6.0430 (3.4334) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][930/1251] eta 0:01:53 lr 0.000224 wd 0.0500 time 0.2275 (0.3522) data time 0.0010 (0.0139) model time 0.2265 (0.3383) loss 2.9038 (3.0398) grad_norm 2.6985 (3.4449) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][940/1251] eta 0:01:47 lr 0.000224 wd 0.0500 time 0.2283 (0.3457) data time 0.0010 (0.0130) model time 0.2273 (0.3327) loss 3.3658 (3.0344) grad_norm 3.9851 (3.4493) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][950/1251] eta 0:01:42 lr 0.000224 wd 0.0500 time 0.2343 (0.3404) data time 0.0007 (0.0122) model time 0.2336 (0.3281) loss 2.7861 (3.0274) grad_norm 4.2289 (3.4397) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][960/1251] eta 0:01:37 lr 0.000224 wd 0.0500 time 0.2264 (0.3354) data time 0.0007 (0.0115) model time 0.2257 (0.3238) loss 2.5528 (3.0241) grad_norm 4.5331 (3.4361) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][970/1251] eta 0:01:32 lr 0.000224 wd 0.0500 time 0.2319 (0.3291) data time 0.0008 (0.0109) model time 0.2311 (0.3182) loss 3.6725 (3.0240) grad_norm 3.2198 (3.4169) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][980/1251] eta 0:01:27 lr 0.000224 wd 0.0500 time 0.2408 (0.3236) data time 
0.0009 (0.0104) model time 0.2399 (0.3132) loss 3.3605 (3.0142) grad_norm 3.4110 (3.4082) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][990/1251] eta 0:01:23 lr 0.000224 wd 0.0500 time 0.2351 (0.3205) data time 0.0007 (0.0099) model time 0.2344 (0.3106) loss 3.1320 (3.0135) grad_norm 2.3198 (3.3948) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1000/1251] eta 0:01:19 lr 0.000224 wd 0.0500 time 0.2232 (0.3172) data time 0.0011 (0.0095) model time 0.2221 (0.3077) loss 1.9511 (3.0007) grad_norm 6.0479 (3.4022) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1010/1251] eta 0:01:15 lr 0.000224 wd 0.0500 time 0.2255 (0.3131) data time 0.0008 (0.0091) model time 0.2246 (0.3040) loss 1.6965 (2.9889) grad_norm 3.3686 (3.3981) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1020/1251] eta 0:01:11 lr 0.000224 wd 0.0500 time 0.2394 (0.3105) data time 0.0009 (0.0087) model time 0.2385 (0.3017) loss 2.4608 (2.9865) grad_norm 3.3698 (3.4093) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1030/1251] eta 0:01:08 lr 0.000224 wd 0.0500 time 0.2430 (0.3085) data time 0.0010 (0.0084) model time 0.2421 (0.3001) loss 2.3195 (2.9842) grad_norm 3.1058 (3.3986) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1040/1251] eta 0:01:04 lr 0.000224 wd 0.0500 time 0.2419 (0.3068) data time 0.0009 (0.0081) model time 0.2410 (0.2987) loss 3.2052 (2.9850) grad_norm 2.9115 (3.3681) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:41 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [213/300][1050/1251] eta 0:01:01 lr 0.000224 wd 0.0500 time 0.2242 (0.3037) data time 0.0010 (0.0078) model time 0.2233 (0.2959) loss 3.1719 (2.9774) grad_norm 4.8944 (3.3987) loss_scale 1024.0000 (1024.0000) mem 7373MB [2024-08-30 04:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1060/1251] eta 0:00:57 lr 0.000224 wd 0.0500 time 0.2374 (0.3009) data time 0.0008 (0.0076) model time 0.2366 (0.2933) loss 2.6393 (2.9641) grad_norm 3.0750 (inf) loss_scale 512.0000 (1018.1597) mem 7373MB [2024-08-30 04:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1070/1251] eta 0:00:54 lr 0.000224 wd 0.0500 time 0.3424 (0.2998) data time 0.0014 (0.0073) model time 0.3410 (0.2925) loss 3.5772 (2.9635) grad_norm 3.5410 (inf) loss_scale 512.0000 (999.6190) mem 7373MB [2024-08-30 04:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1080/1251] eta 0:00:50 lr 0.000224 wd 0.0500 time 0.2245 (0.2980) data time 0.0013 (0.0071) model time 0.2231 (0.2909) loss 3.4036 (2.9677) grad_norm 2.6434 (inf) loss_scale 512.0000 (982.3887) mem 7373MB [2024-08-30 04:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1090/1251] eta 0:00:47 lr 0.000224 wd 0.0500 time 0.2290 (0.2971) data time 0.0010 (0.0069) model time 0.2280 (0.2902) loss 2.9328 (2.9598) grad_norm 4.6204 (inf) loss_scale 512.0000 (966.3345) mem 7373MB [2024-08-30 04:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1100/1251] eta 0:00:44 lr 0.000224 wd 0.0500 time 0.2448 (0.2951) data time 0.0007 (0.0067) model time 0.2440 (0.2883) loss 3.1415 (2.9544) grad_norm 3.2650 (inf) loss_scale 512.0000 (951.3399) mem 7373MB [2024-08-30 04:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1110/1251] eta 0:00:41 lr 0.000224 wd 0.0500 time 0.2288 (0.2952) data time 0.0009 (0.0065) model time 0.2280 (0.2887) loss 3.3690 (2.9549) grad_norm 3.4893 (inf) 
loss_scale 512.0000 (937.3035) mem 7373MB [2024-08-30 04:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1120/1251] eta 0:00:38 lr 0.000224 wd 0.0500 time 0.2258 (0.2932) data time 0.0010 (0.0064) model time 0.2249 (0.2868) loss 3.4691 (2.9623) grad_norm 2.6997 (inf) loss_scale 512.0000 (924.1362) mem 7373MB [2024-08-30 04:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1130/1251] eta 0:00:35 lr 0.000224 wd 0.0500 time 0.2315 (0.2912) data time 0.0010 (0.0062) model time 0.2306 (0.2850) loss 3.5454 (2.9595) grad_norm 3.1998 (inf) loss_scale 512.0000 (911.7598) mem 7373MB [2024-08-30 04:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1140/1251] eta 0:00:32 lr 0.000224 wd 0.0500 time 0.2288 (0.2903) data time 0.0008 (0.0061) model time 0.2279 (0.2842) loss 3.4017 (2.9641) grad_norm 3.9681 (inf) loss_scale 512.0000 (900.1050) mem 7373MB [2024-08-30 04:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1150/1251] eta 0:00:29 lr 0.000223 wd 0.0500 time 0.2321 (0.2894) data time 0.0008 (0.0059) model time 0.2313 (0.2835) loss 3.2066 (2.9681) grad_norm 2.9840 (inf) loss_scale 512.0000 (889.1105) mem 7373MB [2024-08-30 04:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1160/1251] eta 0:00:26 lr 0.000223 wd 0.0500 time 0.2336 (0.2887) data time 0.0008 (0.0058) model time 0.2328 (0.2829) loss 3.4097 (2.9674) grad_norm 2.5593 (inf) loss_scale 512.0000 (878.7218) mem 7373MB [2024-08-30 04:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1170/1251] eta 0:00:23 lr 0.000223 wd 0.0500 time 0.2232 (0.2870) data time 0.0009 (0.0057) model time 0.2223 (0.2814) loss 3.2849 (2.9622) grad_norm 3.3500 (inf) loss_scale 512.0000 (868.8901) mem 7373MB [2024-08-30 04:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1180/1251] eta 0:00:20 lr 0.000223 wd 0.0500 time 0.2345 (0.2856) data time 0.0010 
(0.0055) model time 0.2336 (0.2801) loss 1.8562 (2.9589) grad_norm 3.1378 (inf) loss_scale 512.0000 (859.5718) mem 7373MB [2024-08-30 04:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1190/1251] eta 0:00:17 lr 0.000223 wd 0.0500 time 0.3487 (0.2854) data time 0.0020 (0.0054) model time 0.3467 (0.2800) loss 3.6207 (2.9584) grad_norm 3.8490 (inf) loss_scale 512.0000 (850.7277) mem 7373MB [2024-08-30 04:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1200/1251] eta 0:00:14 lr 0.000223 wd 0.0500 time 0.2227 (0.2841) data time 0.0010 (0.0053) model time 0.2217 (0.2788) loss 3.2257 (2.9664) grad_norm 4.1105 (inf) loss_scale 512.0000 (842.3226) mem 7373MB [2024-08-30 04:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1210/1251] eta 0:00:11 lr 0.000223 wd 0.0500 time 0.2754 (0.2829) data time 0.0011 (0.0052) model time 0.2743 (0.2776) loss 3.0101 (2.9702) grad_norm 2.8720 (inf) loss_scale 512.0000 (834.3245) mem 7373MB [2024-08-30 04:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1220/1251] eta 0:00:08 lr 0.000223 wd 0.0500 time 0.2266 (0.2822) data time 0.0007 (0.0051) model time 0.2259 (0.2771) loss 2.2866 (2.9668) grad_norm 2.8233 (inf) loss_scale 512.0000 (826.7045) mem 7373MB [2024-08-30 04:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1230/1251] eta 0:00:05 lr 0.000223 wd 0.0500 time 0.2332 (0.2817) data time 0.0009 (0.0050) model time 0.2323 (0.2766) loss 3.4999 (2.9744) grad_norm 2.6418 (inf) loss_scale 512.0000 (819.4365) mem 7373MB [2024-08-30 04:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1240/1251] eta 0:00:03 lr 0.000223 wd 0.0500 time 0.2119 (0.2810) data time 0.0004 (0.0049) model time 0.2115 (0.2761) loss 3.0491 (2.9772) grad_norm 4.7854 (inf) loss_scale 512.0000 (812.4966) mem 7373MB [2024-08-30 04:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [213/300][1250/1251] 
eta 0:00:00 lr 0.000223 wd 0.0500 time 0.2117 (0.2795) data time 0.0004 (0.0048) model time 0.2112 (0.2746) loss 2.4089 (2.9727) grad_norm 2.9146 (inf) loss_scale 512.0000 (805.8631) mem 7373MB [2024-08-30 04:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 213 training takes 0:02:06 [2024-08-30 04:11:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 04:11:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 04:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.424 (0.424) Loss 0.4165 (0.4165) Acc@1 92.676 (92.676) Acc@5 98.242 (98.242) Mem 7373MB [2024-08-30 04:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.116) Loss 0.6387 (0.6595) Acc@1 87.500 (86.310) Acc@5 97.852 (97.425) Mem 7373MB [2024-08-30 04:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.101) Loss 0.9985 (0.6891) Acc@1 76.270 (85.273) Acc@5 95.312 (97.400) Mem 7373MB [2024-08-30 04:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.095 (0.099) Loss 1.2256 (0.7860) Acc@1 71.680 (82.901) Acc@5 92.285 (96.371) Mem 7373MB [2024-08-30 04:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.088 (0.094) Loss 1.0371 (0.8323) Acc@1 76.270 (81.776) Acc@5 93.359 (95.806) Mem 7373MB [2024-08-30 04:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.342 Acc@5 95.742 [2024-08-30 04:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-30 04:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.34% [2024-08-30 04:11:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-08-30 04:11:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-30 04:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.623 (0.623) Loss 0.3779 (0.3779) Acc@1 93.359 (93.359) Acc@5 98.633 (98.633) Mem 7373MB [2024-08-30 04:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.138) Loss 0.5889 (0.6055) Acc@1 88.770 (87.331) Acc@5 97.852 (97.718) Mem 7373MB [2024-08-30 04:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.109) Loss 0.8643 (0.6318) Acc@1 78.613 (86.333) Acc@5 95.898 (97.684) Mem 7373MB [2024-08-30 04:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.100) Loss 1.0850 (0.7155) Acc@1 73.926 (84.274) Acc@5 92.969 (96.758) Mem 7373MB [2024-08-30 04:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 0.9810 (0.7586) Acc@1 76.855 (83.139) Acc@5 94.238 (96.308) Mem 7373MB [2024-08-30 04:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.726 Acc@5 96.274 [2024-08-30 04:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-08-30 04:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.73% [2024-08-30 04:11:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 04:11:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 04:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][0/1251] eta 0:15:33 lr 0.000223 wd 0.0500 time 0.7463 (0.7463) data time 0.4612 (0.4612) model time 0.0000 (0.0000) loss 3.5060 (3.5060) grad_norm 2.0733 (2.0733) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 04:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][10/1251] eta 0:05:39 lr 0.000223 wd 0.0500 time 0.2258 (0.2733) data time 0.0007 (0.0429) model time 0.0000 (0.0000) loss 3.4806 (2.8750) grad_norm 2.6827 (3.2556) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][20/1251] eta 0:05:10 lr 0.000223 wd 0.0500 time 0.2287 (0.2519) data time 0.0009 (0.0229) model time 0.0000 (0.0000) loss 2.8342 (2.7408) grad_norm 2.5520 (3.1934) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][30/1251] eta 0:05:13 lr 0.000223 wd 0.0500 time 0.3181 (0.2571) data time 0.0022 (0.0159) model time 0.0000 (0.0000) loss 2.9551 (2.8065) grad_norm 2.0705 (3.1406) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][40/1251] eta 0:05:13 lr 0.000223 wd 0.0500 time 0.3053 (0.2586) data time 0.0016 (0.0124) model time 0.0000 (0.0000) loss 3.1565 (2.8700) grad_norm 2.9485 (3.1445) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][50/1251] eta 0:05:02 lr 0.000223 wd 0.0500 time 0.2228 (0.2522) data time 0.0012 (0.0101) model time 0.0000 (0.0000) loss 2.9677 (2.8799) grad_norm 2.9883 (3.2307) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][60/1251] eta 0:04:55 lr 0.000223 wd 0.0500 time 0.2285 (0.2483) data time 0.0008 (0.0086) model time 0.2276 (0.2273) loss 
3.4975 (2.9510) grad_norm 3.8542 (3.4524) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][70/1251] eta 0:04:53 lr 0.000223 wd 0.0500 time 0.3887 (0.2485) data time 0.0009 (0.0075) model time 0.3877 (0.2379) loss 2.3566 (2.9219) grad_norm 2.9474 (3.3741) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][80/1251] eta 0:04:51 lr 0.000223 wd 0.0500 time 0.2255 (0.2493) data time 0.0010 (0.0068) model time 0.2245 (0.2432) loss 2.4498 (2.9152) grad_norm 2.4339 (3.3250) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][90/1251] eta 0:04:46 lr 0.000223 wd 0.0500 time 0.2314 (0.2470) data time 0.0008 (0.0062) model time 0.2306 (0.2392) loss 3.2577 (2.9073) grad_norm 2.8127 (3.5281) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][100/1251] eta 0:04:44 lr 0.000223 wd 0.0500 time 0.2462 (0.2474) data time 0.0007 (0.0057) model time 0.2455 (0.2414) loss 3.4905 (2.9234) grad_norm 2.5276 (3.5362) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][110/1251] eta 0:04:40 lr 0.000223 wd 0.0500 time 0.2382 (0.2460) data time 0.0009 (0.0053) model time 0.2373 (0.2396) loss 3.4903 (2.9489) grad_norm 4.3750 (3.5292) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][120/1251] eta 0:04:42 lr 0.000223 wd 0.0500 time 0.2299 (0.2495) data time 0.0007 (0.0049) model time 0.2292 (0.2464) loss 3.0201 (2.9515) grad_norm 2.0902 (3.4996) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][130/1251] eta 0:04:37 lr 0.000223 wd 
0.0500 time 0.2296 (0.2479) data time 0.0008 (0.0046) model time 0.2289 (0.2441) loss 2.3106 (2.9566) grad_norm 3.5386 (3.5671) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][140/1251] eta 0:04:34 lr 0.000223 wd 0.0500 time 0.2295 (0.2468) data time 0.0008 (0.0044) model time 0.2287 (0.2426) loss 3.0183 (2.9595) grad_norm 4.0451 (3.6137) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][150/1251] eta 0:04:31 lr 0.000223 wd 0.0500 time 0.3603 (0.2470) data time 0.0018 (0.0041) model time 0.3585 (0.2432) loss 3.0024 (2.9498) grad_norm 5.1858 (3.6824) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][160/1251] eta 0:04:30 lr 0.000223 wd 0.0500 time 0.2260 (0.2477) data time 0.0007 (0.0040) model time 0.2252 (0.2446) loss 3.0153 (2.9514) grad_norm 3.5318 (3.6814) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][170/1251] eta 0:04:28 lr 0.000222 wd 0.0500 time 0.2417 (0.2482) data time 0.0008 (0.0038) model time 0.2410 (0.2454) loss 3.1277 (2.9683) grad_norm 3.0209 (3.6670) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][180/1251] eta 0:04:24 lr 0.000222 wd 0.0500 time 0.2304 (0.2471) data time 0.0009 (0.0036) model time 0.2295 (0.2440) loss 2.7420 (2.9684) grad_norm 4.6139 (3.6686) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][190/1251] eta 0:04:21 lr 0.000222 wd 0.0500 time 0.2318 (0.2463) data time 0.0011 (0.0035) model time 0.2308 (0.2430) loss 3.0322 (2.9533) grad_norm 7.1016 (3.6994) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:38 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [214/300][200/1251] eta 0:04:20 lr 0.000222 wd 0.0500 time 0.2490 (0.2482) data time 0.0012 (0.0034) model time 0.2479 (0.2457) loss 3.0325 (2.9499) grad_norm 4.9644 (3.7782) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][210/1251] eta 0:04:17 lr 0.000222 wd 0.0500 time 0.2292 (0.2473) data time 0.0011 (0.0033) model time 0.2282 (0.2447) loss 3.3808 (2.9374) grad_norm 3.6770 (3.7823) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][220/1251] eta 0:04:14 lr 0.000222 wd 0.0500 time 0.2335 (0.2465) data time 0.0007 (0.0032) model time 0.2327 (0.2437) loss 3.6366 (2.9561) grad_norm 4.8774 (3.7572) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][230/1251] eta 0:04:12 lr 0.000222 wd 0.0500 time 0.2335 (0.2470) data time 0.0007 (0.0031) model time 0.2328 (0.2444) loss 3.3185 (2.9592) grad_norm 3.0160 (3.7537) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][240/1251] eta 0:04:10 lr 0.000222 wd 0.0500 time 0.3120 (0.2479) data time 0.0011 (0.0030) model time 0.3109 (0.2456) loss 3.1875 (2.9542) grad_norm 4.5161 (3.7611) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][250/1251] eta 0:04:08 lr 0.000222 wd 0.0500 time 0.2495 (0.2481) data time 0.0008 (0.0029) model time 0.2487 (0.2459) loss 3.1730 (2.9456) grad_norm 5.4565 (3.7519) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][260/1251] eta 0:04:05 lr 0.000222 wd 0.0500 time 0.2382 (0.2474) data time 0.0007 (0.0029) model time 0.2375 (0.2451) loss 2.2715 (2.9299) grad_norm 3.3796 
(3.7438) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][270/1251] eta 0:04:02 lr 0.000222 wd 0.0500 time 0.2385 (0.2468) data time 0.0008 (0.0028) model time 0.2378 (0.2444) loss 2.7688 (2.9296) grad_norm 2.8424 (3.7744) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][280/1251] eta 0:04:00 lr 0.000222 wd 0.0500 time 0.2308 (0.2482) data time 0.0009 (0.0027) model time 0.2300 (0.2462) loss 3.1703 (2.9328) grad_norm 3.6432 (3.7543) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][290/1251] eta 0:03:57 lr 0.000222 wd 0.0500 time 0.2287 (0.2475) data time 0.0007 (0.0027) model time 0.2280 (0.2454) loss 2.2875 (2.9401) grad_norm 2.5585 (3.7460) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][300/1251] eta 0:03:55 lr 0.000222 wd 0.0500 time 0.2244 (0.2475) data time 0.0011 (0.0026) model time 0.2232 (0.2455) loss 2.0284 (2.9379) grad_norm 3.6284 (3.7469) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][310/1251] eta 0:03:52 lr 0.000222 wd 0.0500 time 0.2250 (0.2470) data time 0.0013 (0.0026) model time 0.2237 (0.2449) loss 2.9777 (2.9424) grad_norm 3.2928 (3.7343) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][320/1251] eta 0:03:51 lr 0.000222 wd 0.0500 time 0.2999 (0.2483) data time 0.0017 (0.0026) model time 0.2982 (0.2464) loss 3.2447 (2.9489) grad_norm 4.1738 (3.7200) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][330/1251] eta 0:03:48 lr 0.000222 wd 0.0500 time 0.2290 (0.2477) data 
time 0.0007 (0.0025) model time 0.2284 (0.2457) loss 2.9846 (2.9494) grad_norm 3.9410 (3.7047) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][340/1251] eta 0:03:45 lr 0.000222 wd 0.0500 time 0.2324 (0.2472) data time 0.0009 (0.0025) model time 0.2315 (0.2451) loss 3.1539 (2.9489) grad_norm 5.2150 (3.6967) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][350/1251] eta 0:03:42 lr 0.000222 wd 0.0500 time 0.2919 (0.2469) data time 0.0009 (0.0024) model time 0.2911 (0.2448) loss 2.5662 (2.9421) grad_norm 3.1926 (3.6890) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][360/1251] eta 0:03:41 lr 0.000222 wd 0.0500 time 0.2347 (0.2486) data time 0.0009 (0.0024) model time 0.2338 (0.2468) loss 2.9261 (2.9308) grad_norm 5.7212 (3.6834) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][370/1251] eta 0:03:38 lr 0.000222 wd 0.0500 time 0.2299 (0.2485) data time 0.0011 (0.0024) model time 0.2288 (0.2468) loss 3.4502 (2.9326) grad_norm 3.1690 (3.6769) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][380/1251] eta 0:03:36 lr 0.000222 wd 0.0500 time 0.2447 (0.2480) data time 0.0008 (0.0024) model time 0.2439 (0.2462) loss 2.4940 (2.9236) grad_norm 2.8136 (3.6832) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][390/1251] eta 0:03:33 lr 0.000222 wd 0.0500 time 0.2302 (0.2476) data time 0.0012 (0.0023) model time 0.2290 (0.2458) loss 1.9527 (2.9195) grad_norm 6.2019 (3.7057) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [214/300][400/1251] eta 0:03:31 lr 0.000222 wd 0.0500 time 0.2187 (0.2488) data time 0.0010 (0.0023) model time 0.2177 (0.2471) loss 3.2127 (2.9200) grad_norm 2.7514 (3.7173) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][410/1251] eta 0:03:28 lr 0.000222 wd 0.0500 time 0.2243 (0.2484) data time 0.0009 (0.0023) model time 0.2233 (0.2467) loss 3.4588 (2.9228) grad_norm 2.9324 (3.7238) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][420/1251] eta 0:03:25 lr 0.000222 wd 0.0500 time 0.2256 (0.2479) data time 0.0010 (0.0022) model time 0.2246 (0.2461) loss 3.0278 (2.9264) grad_norm 3.3246 (3.7116) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][430/1251] eta 0:03:23 lr 0.000222 wd 0.0500 time 0.2277 (0.2481) data time 0.0009 (0.0022) model time 0.2268 (0.2464) loss 3.6383 (2.9247) grad_norm 4.7383 (3.7143) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][440/1251] eta 0:03:21 lr 0.000222 wd 0.0500 time 0.2299 (0.2483) data time 0.0014 (0.0022) model time 0.2285 (0.2467) loss 2.8750 (2.9220) grad_norm 2.9726 (3.6996) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][450/1251] eta 0:03:19 lr 0.000221 wd 0.0500 time 0.2293 (0.2487) data time 0.0010 (0.0022) model time 0.2284 (0.2471) loss 2.3773 (2.9216) grad_norm 2.7020 (3.7364) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][460/1251] eta 0:03:16 lr 0.000221 wd 0.0500 time 0.2236 (0.2482) data time 0.0009 (0.0021) model time 0.2227 (0.2466) loss 2.3420 (2.9199) grad_norm 2.7323 (3.7287) loss_scale 512.0000 
(512.0000) mem 7379MB [2024-08-30 04:13:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][470/1251] eta 0:03:13 lr 0.000221 wd 0.0500 time 0.2348 (0.2478) data time 0.0008 (0.0021) model time 0.2340 (0.2462) loss 4.0042 (2.9290) grad_norm 4.2716 (3.7323) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][480/1251] eta 0:03:11 lr 0.000221 wd 0.0500 time 0.2219 (0.2486) data time 0.0010 (0.0021) model time 0.2209 (0.2471) loss 2.5053 (2.9312) grad_norm 3.2844 (3.7216) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][490/1251] eta 0:03:08 lr 0.000221 wd 0.0500 time 0.2274 (0.2482) data time 0.0010 (0.0021) model time 0.2264 (0.2466) loss 2.5721 (2.9267) grad_norm 5.1863 (3.7283) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][500/1251] eta 0:03:06 lr 0.000221 wd 0.0500 time 0.2309 (0.2481) data time 0.0011 (0.0020) model time 0.2298 (0.2466) loss 2.0672 (2.9198) grad_norm 2.7769 (3.7247) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][510/1251] eta 0:03:03 lr 0.000221 wd 0.0500 time 0.2330 (0.2478) data time 0.0007 (0.0020) model time 0.2324 (0.2462) loss 2.8396 (2.9223) grad_norm 3.8686 (3.7246) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][520/1251] eta 0:03:01 lr 0.000221 wd 0.0500 time 0.3017 (0.2485) data time 0.0017 (0.0020) model time 0.3000 (0.2470) loss 3.0875 (2.9268) grad_norm 5.2875 (3.7165) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][530/1251] eta 0:02:58 lr 0.000221 wd 0.0500 time 0.2341 (0.2482) data time 0.0009 (0.0020) model 
time 0.2332 (0.2466) loss 3.2318 (2.9269) grad_norm 5.3300 (3.7189) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][540/1251] eta 0:02:56 lr 0.000221 wd 0.0500 time 0.2249 (0.2478) data time 0.0009 (0.0020) model time 0.2240 (0.2462) loss 2.1883 (2.9270) grad_norm 2.5783 (3.7055) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][550/1251] eta 0:02:53 lr 0.000221 wd 0.0500 time 0.2242 (0.2475) data time 0.0009 (0.0020) model time 0.2233 (0.2459) loss 3.0340 (2.9273) grad_norm 3.1063 (3.6932) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 04:14:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 04:14:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 04:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 04:15:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 04:16:01 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 04:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 04:16:11 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 04:16:13 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 04:16:14 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 04:16:14 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 214) [2024-08-30 04:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 04:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][560/1251] eta 0:27:43 lr 0.000221 wd 0.0500 time 0.2374 (2.4079) data time 0.0010 (0.1136) model time 0.2364 (2.2943) loss 3.7202 (3.4839) grad_norm 5.5361 (3.8008) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][570/1251] eta 0:11:58 lr 0.000221 wd 0.0500 time 0.2396 (1.0558) data time 0.0009 (0.0432) model time 0.2387 (1.0125) loss 3.4158 (3.2209) grad_norm 2.9050 (3.5976) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][580/1251] eta 0:08:18 lr 0.000221 wd 0.0500 time 0.2405 (0.7428) data time 0.0007 (0.0270) model time 0.2398 (0.7158) loss 3.1785 (3.2307) grad_norm 3.0800 (3.3710) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][590/1251] eta 0:06:38 lr 0.000221 wd 0.0500 time 0.2410 (0.6036) data time 0.0009 (0.0198) model time 0.2401 (0.5838) loss 3.3508 (3.2054) grad_norm 3.1406 (3.3131) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][600/1251] eta 0:05:42 lr 0.000221 wd 0.0500 time 0.2484 (0.5258) data time 0.0009 (0.0157) model time 0.2475 (0.5101) loss 2.8647 (3.1547) grad_norm 4.5491 (3.5325) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[214/300][610/1251] eta 0:05:05 lr 0.000221 wd 0.0500 time 0.2411 (0.4762) data time 0.0008 (0.0131) model time 0.2403 (0.4631) loss 3.3392 (3.1497) grad_norm 2.4761 (3.5248) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][620/1251] eta 0:04:38 lr 0.000221 wd 0.0500 time 0.2475 (0.4406) data time 0.0008 (0.0112) model time 0.2467 (0.4294) loss 2.2957 (3.1106) grad_norm 3.3544 (3.5152) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][630/1251] eta 0:04:17 lr 0.000221 wd 0.0500 time 0.2362 (0.4141) data time 0.0011 (0.0099) model time 0.2351 (0.4043) loss 3.0790 (3.0680) grad_norm 4.0348 (3.6547) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][640/1251] eta 0:04:00 lr 0.000221 wd 0.0500 time 0.2286 (0.3941) data time 0.0009 (0.0088) model time 0.2277 (0.3853) loss 2.4085 (3.0379) grad_norm 3.1881 (3.5879) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][650/1251] eta 0:03:47 lr 0.000221 wd 0.0500 time 0.2368 (0.3783) data time 0.0008 (0.0080) model time 0.2360 (0.3703) loss 3.0767 (3.0449) grad_norm 4.9911 (3.6229) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][660/1251] eta 0:03:35 lr 0.000221 wd 0.0500 time 0.2413 (0.3652) data time 0.0010 (0.0074) model time 0.2402 (0.3578) loss 3.3816 (3.0646) grad_norm 4.3674 (3.6464) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][670/1251] eta 0:03:26 lr 0.000221 wd 0.0500 time 0.2362 (0.3546) data time 0.0009 (0.0068) model time 0.2354 (0.3478) loss 3.3633 (3.0392) grad_norm 2.9335 (3.6028) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-30 04:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][680/1251] eta 0:03:17 lr 0.000221 wd 0.0500 time 0.2365 (0.3456) data time 0.0007 (0.0063) model time 0.2358 (0.3393) loss 2.0545 (3.0280) grad_norm 3.5637 (3.5723) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][690/1251] eta 0:03:09 lr 0.000221 wd 0.0500 time 0.2503 (0.3381) data time 0.0010 (0.0060) model time 0.2494 (0.3321) loss 2.5650 (3.0231) grad_norm 3.1715 (3.5608) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][700/1251] eta 0:03:02 lr 0.000221 wd 0.0500 time 0.2323 (0.3315) data time 0.0008 (0.0056) model time 0.2316 (0.3259) loss 2.7169 (3.0153) grad_norm 3.6423 (3.5348) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][710/1251] eta 0:02:56 lr 0.000221 wd 0.0500 time 0.2423 (0.3259) data time 0.0009 (0.0053) model time 0.2413 (0.3205) loss 3.4469 (3.0179) grad_norm 3.0188 (3.5391) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][720/1251] eta 0:02:50 lr 0.000220 wd 0.0500 time 0.2376 (0.3207) data time 0.0010 (0.0051) model time 0.2365 (0.3157) loss 3.2333 (3.0204) grad_norm 2.4746 (3.5418) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][730/1251] eta 0:02:44 lr 0.000220 wd 0.0500 time 0.2363 (0.3162) data time 0.0008 (0.0048) model time 0.2355 (0.3114) loss 2.7993 (3.0048) grad_norm 4.1303 (3.5264) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][740/1251] eta 0:02:39 lr 0.000220 wd 0.0500 time 0.2393 (0.3122) data time 0.0012 (0.0046) model time 0.2382 
(0.3075) loss 2.6818 (2.9997) grad_norm 2.9211 (3.4988) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][750/1251] eta 0:02:34 lr 0.000220 wd 0.0500 time 0.2417 (0.3087) data time 0.0008 (0.0044) model time 0.2409 (0.3042) loss 2.6834 (2.9899) grad_norm 2.8563 (3.4818) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][760/1251] eta 0:02:30 lr 0.000220 wd 0.0500 time 0.2537 (0.3055) data time 0.0008 (0.0043) model time 0.2529 (0.3013) loss 2.0800 (2.9774) grad_norm 3.7282 (3.4848) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][770/1251] eta 0:02:25 lr 0.000220 wd 0.0500 time 0.2383 (0.3027) data time 0.0008 (0.0041) model time 0.2375 (0.2986) loss 2.3364 (2.9762) grad_norm 3.7033 (3.4980) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][780/1251] eta 0:02:21 lr 0.000220 wd 0.0500 time 0.2390 (0.3002) data time 0.0007 (0.0040) model time 0.2383 (0.2962) loss 2.5290 (2.9794) grad_norm 4.1493 (3.5494) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][790/1251] eta 0:02:17 lr 0.000220 wd 0.0500 time 0.2414 (0.2977) data time 0.0007 (0.0038) model time 0.2407 (0.2939) loss 3.4767 (2.9716) grad_norm 3.5423 (3.5337) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][800/1251] eta 0:02:13 lr 0.000220 wd 0.0500 time 0.2481 (0.2955) data time 0.0007 (0.0037) model time 0.2473 (0.2918) loss 2.1595 (2.9650) grad_norm 4.5349 (3.5208) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][810/1251] eta 0:02:09 
lr 0.000220 wd 0.0500 time 0.2350 (0.2933) data time 0.0010 (0.0036) model time 0.2340 (0.2897) loss 2.4612 (2.9581) grad_norm 2.9875 (3.5394) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 04:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 04:17:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 04:17:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 04:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 04:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 04:53:16 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 04:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 04:53:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 04:53:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 04:53:26 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 04:53:26 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 214) [2024-08-30 04:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 04:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][820/1251] eta 0:22:44 lr 0.000220 wd 0.0500 time 0.2252 (3.1668) data time 0.0005 (0.1992) model time 0.2247 (2.9676) loss 3.4039 (3.3997) grad_norm 3.1637 (3.1374) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][830/1251] eta 0:07:28 lr 0.000220 wd 0.0500 time 0.2268 (1.0664) data time 0.0009 (0.0577) model time 0.2259 (1.0087) loss 3.3923 (3.2253) grad_norm 3.3186 (3.7708) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][840/1251] eta 0:04:54 lr 0.000220 wd 0.0500 time 0.2196 (0.7166) data time 0.0010 (0.0341) model time 0.2186 (0.6825) loss 3.1442 (3.2547) grad_norm 3.9199 (3.7302) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][850/1251] eta 0:03:49 lr 0.000220 wd 0.0500 time 0.2238 (0.5733) data time 0.0007 (0.0243) model time 0.2230 (0.5490) loss 2.6828 (3.2473) grad_norm 4.7316 (3.6614) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][860/1251] eta 0:03:13 lr 0.000220 wd 0.0500 time 0.2200 (0.4944) data time 0.0009 (0.0191) model time 0.2191 (0.4754) loss 3.1087 (3.1933) grad_norm 3.1873 (3.5344) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[214/300][870/1251] eta 0:02:49 lr 0.000220 wd 0.0500 time 0.2256 (0.4444) data time 0.0007 (0.0157) model time 0.2250 (0.4287) loss 3.3939 (3.1876) grad_norm 2.9822 (3.4546) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][880/1251] eta 0:02:32 lr 0.000220 wd 0.0500 time 0.2215 (0.4098) data time 0.0008 (0.0134) model time 0.2207 (0.3965) loss 3.2495 (3.1349) grad_norm 2.6375 (3.4737) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][890/1251] eta 0:02:19 lr 0.000220 wd 0.0500 time 0.2217 (0.3850) data time 0.0008 (0.0117) model time 0.2210 (0.3733) loss 3.1901 (3.1075) grad_norm 3.3846 (3.4679) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][900/1251] eta 0:02:08 lr 0.000220 wd 0.0500 time 0.2301 (0.3661) data time 0.0008 (0.0105) model time 0.2293 (0.3556) loss 3.3913 (3.0644) grad_norm 8.1027 (3.4841) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][910/1251] eta 0:01:59 lr 0.000220 wd 0.0500 time 0.2219 (0.3513) data time 0.0009 (0.0094) model time 0.2210 (0.3418) loss 2.9706 (3.0595) grad_norm 2.8548 (3.4696) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][920/1251] eta 0:01:52 lr 0.000220 wd 0.0500 time 0.2251 (0.3391) data time 0.0009 (0.0086) model time 0.2241 (0.3305) loss 2.9861 (3.0775) grad_norm 3.3216 (3.4406) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][930/1251] eta 0:01:45 lr 0.000220 wd 0.0500 time 0.2325 (0.3292) data time 0.0011 (0.0080) model time 0.2315 (0.3213) loss 3.2902 (3.0643) grad_norm 2.7499 (3.4042) loss_scale 512.0000 (512.0000) mem 
7379MB [2024-08-30 04:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][940/1251] eta 0:01:39 lr 0.000220 wd 0.0500 time 0.2215 (0.3209) data time 0.0009 (0.0074) model time 0.2206 (0.3135) loss 2.5885 (3.0663) grad_norm 3.1853 (3.4380) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][950/1251] eta 0:01:34 lr 0.000220 wd 0.0500 time 0.2271 (0.3140) data time 0.0010 (0.0069) model time 0.2260 (0.3070) loss 3.3394 (3.0570) grad_norm 5.5434 (3.5044) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][960/1251] eta 0:01:29 lr 0.000220 wd 0.0500 time 0.2232 (0.3079) data time 0.0008 (0.0065) model time 0.2223 (0.3014) loss 2.7690 (3.0486) grad_norm 3.6457 (3.5231) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][970/1251] eta 0:01:25 lr 0.000220 wd 0.0500 time 0.2244 (0.3026) data time 0.0009 (0.0062) model time 0.2234 (0.2964) loss 3.1552 (3.0470) grad_norm 4.0632 (3.5672) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][980/1251] eta 0:01:20 lr 0.000220 wd 0.0500 time 0.2303 (0.2981) data time 0.0008 (0.0058) model time 0.2295 (0.2923) loss 2.8119 (3.0400) grad_norm 3.3139 (3.7106) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][990/1251] eta 0:01:16 lr 0.000220 wd 0.0500 time 0.2302 (0.2940) data time 0.0009 (0.0056) model time 0.2293 (0.2884) loss 2.0181 (3.0278) grad_norm 3.4927 (3.7007) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1000/1251] eta 0:01:12 lr 0.000219 wd 0.0500 time 0.2234 (0.2903) data time 0.0008 (0.0053) model time 0.2225 
(0.2850) loss 2.6814 (3.0175) grad_norm 3.5171 (3.6778) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1010/1251] eta 0:01:09 lr 0.000219 wd 0.0500 time 0.2321 (0.2871) data time 0.0008 (0.0051) model time 0.2313 (0.2821) loss 2.9252 (3.0157) grad_norm 3.1503 (3.6745) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1020/1251] eta 0:01:05 lr 0.000219 wd 0.0500 time 0.2248 (0.2842) data time 0.0009 (0.0049) model time 0.2239 (0.2793) loss 3.5023 (3.0013) grad_norm 3.6230 (3.6601) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1030/1251] eta 0:01:02 lr 0.000219 wd 0.0500 time 0.2259 (0.2815) data time 0.0009 (0.0047) model time 0.2250 (0.2768) loss 2.9138 (2.9962) grad_norm 2.8299 (3.6483) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1040/1251] eta 0:00:58 lr 0.000219 wd 0.0500 time 0.2211 (0.2791) data time 0.0009 (0.0045) model time 0.2202 (0.2745) loss 3.3955 (2.9950) grad_norm 3.1104 (3.6221) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1050/1251] eta 0:00:55 lr 0.000219 wd 0.0500 time 0.2170 (0.2770) data time 0.0009 (0.0044) model time 0.2161 (0.2726) loss 2.4189 (2.9880) grad_norm 5.1579 (3.6148) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1060/1251] eta 0:00:52 lr 0.000219 wd 0.0500 time 0.2392 (0.2750) data time 0.0006 (0.0042) model time 0.2387 (0.2707) loss 1.8289 (2.9882) grad_norm 3.7231 (3.6087) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1070/1251] eta 
0:00:49 lr 0.000219 wd 0.0500 time 0.2311 (0.2731) data time 0.0006 (0.0041) model time 0.2304 (0.2690) loss 2.2115 (2.9789) grad_norm 3.9276 (3.6048) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 04:54:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 04:54:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 04:54:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 04:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 04:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 04:56:26 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 05:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 05:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 05:04:01 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 05:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 05:04:11 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 05:04:13 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 05:04:14 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 05:04:14 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 214) [2024-08-30 05:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 05:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 05:18:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 05:19:02 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 05:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 05:19:14 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 05:19:15 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 05:19:16 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 05:19:16 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 214) [2024-08-30 05:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 05:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1080/1251] eta 0:05:23 lr 0.000219 wd 0.0500 time 0.2412 (1.8903) data time 0.0008 (0.0942) model time 0.2405 (1.7960) loss 3.2835 (3.3725) grad_norm 3.4107 (3.6119) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1090/1251] eta 0:02:36 lr 0.000219 wd 0.0500 time 0.2391 (0.9729) data time 0.0007 (0.0425) model time 0.2384 (0.9303) loss 3.9669 (3.1911) grad_norm 3.1732 (3.3358) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1100/1251] eta 0:01:47 lr 0.000219 wd 0.0500 time 0.2437 (0.7112) data time 0.0009 (0.0278) model time 0.2427 (0.6835) loss 3.7053 (3.2166) grad_norm 3.5638 (3.4273) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1110/1251] eta 0:01:22 lr 0.000219 wd 0.0500 time 0.2418 (0.5880) data time 0.0013 (0.0208) model time 0.2405 (0.5672) loss 2.9348 (3.1776) grad_norm 2.3606 (3.4488) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1120/1251] eta 0:01:07 lr 0.000219 wd 0.0500 time 0.2357 (0.5164) data time 0.0008 (0.0167) model time 0.2349 (0.4997) loss 3.1550 (3.1435) grad_norm 3.4870 (3.5904) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[214/300][1130/1251] eta 0:00:56 lr 0.000219 wd 0.0500 time 0.2358 (0.4688) data time 0.0009 (0.0140) model time 0.2350 (0.4547) loss 2.4851 (3.1161) grad_norm 2.8986 (3.5695) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1140/1251] eta 0:00:48 lr 0.000219 wd 0.0500 time 0.2316 (0.4350) data time 0.0009 (0.0121) model time 0.2307 (0.4229) loss 2.0847 (3.0804) grad_norm 2.8356 (3.5244) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1150/1251] eta 0:00:41 lr 0.000219 wd 0.0500 time 0.2428 (0.4104) data time 0.0010 (0.0107) model time 0.2418 (0.3997) loss 2.3896 (3.0579) grad_norm 2.9982 (3.5386) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1160/1251] eta 0:00:35 lr 0.000219 wd 0.0500 time 0.2351 (0.3909) data time 0.0010 (0.0096) model time 0.2341 (0.3812) loss 2.9687 (3.0347) grad_norm 2.3852 (3.5583) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1170/1251] eta 0:00:30 lr 0.000219 wd 0.0500 time 0.2344 (0.3754) data time 0.0009 (0.0087) model time 0.2334 (0.3666) loss 3.3326 (3.0360) grad_norm 5.3842 (3.5443) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1180/1251] eta 0:00:25 lr 0.000219 wd 0.0500 time 0.2410 (0.3630) data time 0.0010 (0.0080) model time 0.2400 (0.3550) loss 2.3577 (3.0473) grad_norm 2.6208 (3.5226) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1190/1251] eta 0:00:21 lr 0.000219 wd 0.0500 time 0.2460 (0.3531) data time 0.0012 (0.0075) model time 0.2448 (0.3457) loss 3.1055 (3.0339) grad_norm 4.0160 (3.5074) loss_scale 512.0000 
(512.0000) mem 7379MB [2024-08-30 05:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1200/1251] eta 0:00:17 lr 0.000219 wd 0.0500 time 0.2377 (0.3445) data time 0.0008 (0.0070) model time 0.2370 (0.3375) loss 2.9039 (3.0248) grad_norm 3.9952 (3.5407) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1210/1251] eta 0:00:13 lr 0.000219 wd 0.0500 time 0.2370 (0.3371) data time 0.0013 (0.0065) model time 0.2358 (0.3306) loss 2.8195 (3.0156) grad_norm 2.8563 (3.5108) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1220/1251] eta 0:00:10 lr 0.000219 wd 0.0500 time 0.2456 (0.3307) data time 0.0009 (0.0062) model time 0.2447 (0.3246) loss 3.2403 (3.0051) grad_norm 2.9159 (3.5244) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1230/1251] eta 0:00:06 lr 0.000219 wd 0.0500 time 0.2358 (0.3253) data time 0.0009 (0.0059) model time 0.2349 (0.3194) loss 2.4216 (2.9950) grad_norm 3.0222 (3.5094) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1240/1251] eta 0:00:03 lr 0.000219 wd 0.0500 time 0.2240 (0.3199) data time 0.0007 (0.0056) model time 0.2233 (0.3144) loss 3.0950 (3.0014) grad_norm 3.4571 (3.5187) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [214/300][1250/1251] eta 0:00:00 lr 0.000219 wd 0.0500 time 0.2246 (0.3147) data time 0.0005 (0.0053) model time 0.2241 (0.3094) loss 2.6039 (2.9888) grad_norm 6.2946 (3.5144) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 05:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 214 training takes 0:00:56 [2024-08-30 05:20:17 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:20:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 05:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.424 (0.424) Loss 0.4143 (0.4143) Acc@1 92.969 (92.969) Acc@5 98.145 (98.145) Mem 7379MB [2024-08-30 05:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.108) Loss 0.5996 (0.6513) Acc@1 87.988 (86.435) Acc@5 97.559 (97.496) Mem 7379MB [2024-08-30 05:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.093) Loss 0.9614 (0.6798) Acc@1 76.270 (85.486) Acc@5 95.312 (97.428) Mem 7379MB [2024-08-30 05:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.088) Loss 1.1729 (0.7718) Acc@1 73.535 (83.235) Acc@5 92.188 (96.380) Mem 7379MB [2024-08-30 05:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0566 (0.8202) Acc@1 75.781 (82.019) Acc@5 93.359 (95.817) Mem 7379MB [2024-08-30 05:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.556 Acc@5 95.764 [2024-08-30 05:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.6% [2024-08-30 05:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.56% [2024-08-30 05:20:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 05:20:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-30 05:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.466 (0.466) Loss 0.3782 (0.3782) Acc@1 93.262 (93.262) Acc@5 98.438 (98.438) Mem 7379MB [2024-08-30 05:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.111) Loss 0.5889 (0.6053) Acc@1 88.867 (87.296) Acc@5 97.852 (97.718) Mem 7379MB [2024-08-30 05:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.095) Loss 0.8638 (0.6316) Acc@1 78.516 (86.337) Acc@5 95.996 (97.689) Mem 7379MB [2024-08-30 05:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.0859 (0.7153) Acc@1 73.730 (84.268) Acc@5 93.164 (96.780) Mem 7379MB [2024-08-30 05:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 0.9819 (0.7585) Acc@1 77.051 (83.148) Acc@5 94.336 (96.318) Mem 7379MB [2024-08-30 05:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.736 Acc@5 96.284 [2024-08-30 05:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-08-30 05:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.74% [2024-08-30 05:20:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 05:20:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 05:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][0/1251] eta 0:15:42 lr 0.000219 wd 0.0500 time 0.7537 (0.7537) data time 0.4720 (0.4720) model time 0.0000 (0.0000) loss 2.9301 (2.9301) grad_norm 4.5459 (4.5459) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][10/1251] eta 0:05:57 lr 0.000219 wd 0.0500 time 0.2386 (0.2882) data time 0.0010 (0.0438) model time 0.0000 (0.0000) loss 3.8252 (3.0403) grad_norm 6.6750 (4.6769) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][20/1251] eta 0:05:28 lr 0.000218 wd 0.0500 time 0.2491 (0.2672) data time 0.0011 (0.0235) model time 0.0000 (0.0000) loss 2.3319 (2.9125) grad_norm 2.5471 (3.8738) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][30/1251] eta 0:05:17 lr 0.000218 wd 0.0500 time 0.2420 (0.2601) data time 0.0009 (0.0162) model time 0.0000 (0.0000) loss 2.6955 (2.8582) grad_norm 3.2626 (3.4997) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][40/1251] eta 0:05:10 lr 0.000218 wd 0.0500 time 0.2498 (0.2564) data time 0.0009 (0.0125) model time 0.0000 (0.0000) loss 3.9865 (2.8653) grad_norm 2.9484 (3.4133) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][50/1251] eta 0:05:05 lr 0.000218 wd 0.0500 time 0.2584 (0.2546) data time 0.0008 (0.0103) model time 0.0000 (0.0000) loss 2.2837 (2.8933) grad_norm 3.2407 (3.3807) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][60/1251] eta 0:05:00 lr 0.000218 wd 0.0500 time 0.2357 (0.2527) data time 0.0010 (0.0088) model time 0.2347 (0.2419) loss 
1.8953 (2.8693) grad_norm 3.1934 (3.3245) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][70/1251] eta 0:04:57 lr 0.000218 wd 0.0500 time 0.2453 (0.2516) data time 0.0006 (0.0077) model time 0.2447 (0.2428) loss 2.9704 (2.8881) grad_norm 2.7648 (3.3298) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][80/1251] eta 0:04:53 lr 0.000218 wd 0.0500 time 0.2525 (0.2509) data time 0.0010 (0.0069) model time 0.2515 (0.2435) loss 3.2260 (2.8772) grad_norm 3.9272 (3.3424) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][90/1251] eta 0:04:53 lr 0.000218 wd 0.0500 time 0.2516 (0.2529) data time 0.0008 (0.0062) model time 0.2508 (0.2497) loss 1.6033 (2.8480) grad_norm 3.0248 (3.5566) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][100/1251] eta 0:04:49 lr 0.000218 wd 0.0500 time 0.2429 (0.2519) data time 0.0010 (0.0057) model time 0.2420 (0.2480) loss 3.3694 (2.8712) grad_norm 4.3356 (3.6005) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][110/1251] eta 0:04:49 lr 0.000218 wd 0.0500 time 0.5326 (0.2538) data time 0.0010 (0.0053) model time 0.5316 (0.2520) loss 2.7364 (2.8715) grad_norm 2.9892 (3.5805) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][120/1251] eta 0:04:45 lr 0.000218 wd 0.0500 time 0.2458 (0.2528) data time 0.0010 (0.0050) model time 0.2447 (0.2505) loss 3.0516 (2.8450) grad_norm 4.2312 (3.5483) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][130/1251] eta 0:04:42 lr 0.000218 wd 
0.0500 time 0.2434 (0.2524) data time 0.0010 (0.0047) model time 0.2424 (0.2499) loss 3.3937 (2.8418) grad_norm 3.1627 (3.5774) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][140/1251] eta 0:04:39 lr 0.000218 wd 0.0500 time 0.2413 (0.2516) data time 0.0010 (0.0044) model time 0.2403 (0.2488) loss 3.8070 (2.8728) grad_norm 2.7254 (3.5691) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][150/1251] eta 0:04:36 lr 0.000218 wd 0.0500 time 0.2405 (0.2510) data time 0.0011 (0.0042) model time 0.2393 (0.2480) loss 1.6774 (2.8837) grad_norm 3.1167 (3.5520) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][160/1251] eta 0:04:33 lr 0.000218 wd 0.0500 time 0.2405 (0.2505) data time 0.0009 (0.0040) model time 0.2396 (0.2475) loss 2.6270 (2.8871) grad_norm 3.2536 (3.5238) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][170/1251] eta 0:04:30 lr 0.000218 wd 0.0500 time 0.2401 (0.2500) data time 0.0007 (0.0038) model time 0.2394 (0.2469) loss 3.1759 (2.8930) grad_norm 3.2459 (3.6611) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][180/1251] eta 0:04:27 lr 0.000218 wd 0.0500 time 0.2498 (0.2496) data time 0.0007 (0.0037) model time 0.2491 (0.2465) loss 2.6507 (2.8956) grad_norm 2.9766 (3.6418) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][190/1251] eta 0:04:24 lr 0.000218 wd 0.0500 time 0.2497 (0.2492) data time 0.0009 (0.0035) model time 0.2488 (0.2462) loss 3.2747 (2.8941) grad_norm 3.6992 (3.6444) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [215/300][200/1251] eta 0:04:21 lr 0.000218 wd 0.0500 time 0.2377 (0.2489) data time 0.0009 (0.0034) model time 0.2368 (0.2459) loss 2.9909 (2.9018) grad_norm 4.0387 (3.6277) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][210/1251] eta 0:04:18 lr 0.000218 wd 0.0500 time 0.2421 (0.2486) data time 0.0007 (0.0033) model time 0.2414 (0.2456) loss 3.0687 (2.8947) grad_norm 2.6521 (3.6104) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][220/1251] eta 0:04:16 lr 0.000218 wd 0.0500 time 0.2474 (0.2484) data time 0.0009 (0.0032) model time 0.2465 (0.2454) loss 3.0824 (2.9029) grad_norm 3.6036 (3.6150) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][230/1251] eta 0:04:13 lr 0.000218 wd 0.0500 time 0.2411 (0.2481) data time 0.0011 (0.0031) model time 0.2400 (0.2452) loss 3.1423 (2.9108) grad_norm 5.5079 (3.6307) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][240/1251] eta 0:04:10 lr 0.000218 wd 0.0500 time 0.2377 (0.2480) data time 0.0008 (0.0030) model time 0.2369 (0.2451) loss 2.1091 (2.9081) grad_norm 2.2324 (3.6054) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][250/1251] eta 0:04:08 lr 0.000218 wd 0.0500 time 0.2471 (0.2479) data time 0.0008 (0.0029) model time 0.2463 (0.2451) loss 3.2475 (2.9216) grad_norm 3.7374 (3.5850) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][260/1251] eta 0:04:05 lr 0.000218 wd 0.0500 time 0.2379 (0.2477) data time 0.0010 (0.0029) model time 0.2369 (0.2449) loss 2.8190 (2.9288) grad_norm 2.2209 
(3.5668) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][270/1251] eta 0:04:02 lr 0.000218 wd 0.0500 time 0.2477 (0.2475) data time 0.0011 (0.0028) model time 0.2467 (0.2448) loss 3.1356 (2.9319) grad_norm 2.5471 (3.5607) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][280/1251] eta 0:04:00 lr 0.000218 wd 0.0500 time 0.2407 (0.2473) data time 0.0010 (0.0027) model time 0.2397 (0.2447) loss 2.7154 (2.9227) grad_norm 4.0898 (3.5845) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][290/1251] eta 0:03:57 lr 0.000218 wd 0.0500 time 0.2436 (0.2474) data time 0.0008 (0.0027) model time 0.2428 (0.2447) loss 2.0721 (2.9148) grad_norm 4.1018 (3.5938) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][300/1251] eta 0:03:55 lr 0.000217 wd 0.0500 time 0.2563 (0.2473) data time 0.0011 (0.0026) model time 0.2551 (0.2447) loss 3.0806 (2.9140) grad_norm 5.8769 (3.5954) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][310/1251] eta 0:03:52 lr 0.000217 wd 0.0500 time 0.2439 (0.2472) data time 0.0011 (0.0026) model time 0.2428 (0.2447) loss 2.7054 (2.9233) grad_norm 3.0077 (3.6111) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][320/1251] eta 0:03:50 lr 0.000217 wd 0.0500 time 0.2496 (0.2471) data time 0.0008 (0.0025) model time 0.2488 (0.2446) loss 2.9598 (2.9243) grad_norm 3.5071 (3.6003) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 05:21:50 
msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:21:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 05:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 05:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 05:25:21 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 05:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 05:25:30 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 05:25:32 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 05:25:33 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 05:25:33 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 215) [2024-08-30 05:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 05:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][330/1251] eta 0:27:00 lr 0.000217 wd 0.0500 time 0.2965 (1.7590) data time 0.0010 (0.0993) model time 0.2954 (1.6597) loss 3.2839 (3.3632) grad_norm 5.8035 (3.5075) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][340/1251] eta 0:15:04 lr 0.000217 wd 0.0500 time 0.2398 (0.9932) data time 0.0008 (0.0502) model time 0.2390 (0.9430) loss 3.1213 (3.1769) grad_norm 3.3483 (3.4733) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][350/1251] eta 0:11:12 lr 0.000217 wd 0.0500 time 0.2283 (0.7460) data time 0.0009 (0.0339) model time 0.2274 (0.7121) loss 3.2678 (3.1951) grad_norm 2.9225 (3.3460) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][360/1251] eta 0:09:09 lr 0.000217 wd 0.0500 time 0.2250 (0.6164) data time 0.0008 (0.0256) model time 0.2242 (0.5908) loss 2.3765 (3.1211) grad_norm 3.2994 (3.4093) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][370/1251] eta 0:07:55 lr 0.000217 wd 0.0500 time 0.2708 (0.5400) data time 0.0010 (0.0207) model time 0.2698 (0.5193) loss 2.4613 (3.0994) grad_norm 3.7865 (3.6846) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[215/300][380/1251] eta 0:07:09 lr 0.000217 wd 0.0500 time 0.2226 (0.4932) data time 0.0006 (0.0174) model time 0.2219 (0.4758) loss 3.1347 (3.0767) grad_norm 2.8770 (3.6664) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][390/1251] eta 0:06:31 lr 0.000217 wd 0.0500 time 0.2223 (0.4549) data time 0.0007 (0.0151) model time 0.2217 (0.4398) loss 2.2396 (3.0368) grad_norm 3.3587 (3.5855) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][400/1251] eta 0:06:04 lr 0.000217 wd 0.0500 time 0.3293 (0.4281) data time 0.0014 (0.0133) model time 0.3279 (0.4148) loss 3.2657 (3.0200) grad_norm 3.3891 (3.5242) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][410/1251] eta 0:05:41 lr 0.000217 wd 0.0500 time 0.2315 (0.4061) data time 0.0007 (0.0120) model time 0.2308 (0.3941) loss 3.1570 (3.0087) grad_norm 3.4481 (3.4549) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][420/1251] eta 0:05:25 lr 0.000217 wd 0.0500 time 0.3154 (0.3922) data time 0.0016 (0.0109) model time 0.3138 (0.3813) loss 3.2677 (3.0127) grad_norm 2.9706 (3.5887) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][430/1251] eta 0:05:10 lr 0.000217 wd 0.0500 time 0.2186 (0.3787) data time 0.0011 (0.0102) model time 0.2174 (0.3685) loss 2.4904 (3.0180) grad_norm 4.4516 (3.5745) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][440/1251] eta 0:04:56 lr 0.000217 wd 0.0500 time 0.2220 (0.3662) data time 0.0006 (0.0094) model time 0.2214 (0.3568) loss 3.7285 (3.0286) grad_norm 3.6425 (3.6257) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-30 05:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][450/1251] eta 0:04:44 lr 0.000217 wd 0.0500 time 0.2289 (0.3555) data time 0.0007 (0.0088) model time 0.2282 (0.3468) loss 3.0830 (3.0148) grad_norm 3.0262 (3.6259) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][460/1251] eta 0:04:36 lr 0.000217 wd 0.0500 time 0.2432 (0.3492) data time 0.0009 (0.0082) model time 0.2423 (0.3410) loss 1.8683 (3.0065) grad_norm 4.5560 (3.6411) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][470/1251] eta 0:04:26 lr 0.000217 wd 0.0500 time 0.2301 (0.3410) data time 0.0009 (0.0078) model time 0.2292 (0.3333) loss 3.1779 (3.0080) grad_norm 4.4667 (3.6882) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][480/1251] eta 0:04:18 lr 0.000217 wd 0.0500 time 0.2217 (0.3350) data time 0.0011 (0.0073) model time 0.2206 (0.3276) loss 3.0543 (3.0139) grad_norm 2.5684 (3.6636) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][490/1251] eta 0:04:10 lr 0.000217 wd 0.0500 time 0.2234 (0.3288) data time 0.0011 (0.0070) model time 0.2223 (0.3218) loss 2.3377 (3.0126) grad_norm 2.8650 (3.6466) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][500/1251] eta 0:04:04 lr 0.000217 wd 0.0500 time 0.2760 (0.3249) data time 0.0009 (0.0067) model time 0.2751 (0.3183) loss 2.3406 (2.9971) grad_norm 2.5688 (3.6306) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][510/1251] eta 0:03:57 lr 0.000217 wd 0.0500 time 0.2207 (0.3206) data time 0.0007 (0.0064) model time 0.2200 
(0.3143) loss 2.4881 (3.0007) grad_norm 4.2095 (3.6747) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][520/1251] eta 0:03:50 lr 0.000217 wd 0.0500 time 0.2302 (0.3159) data time 0.0008 (0.0061) model time 0.2294 (0.3098) loss 3.1737 (2.9884) grad_norm 4.1906 (3.6558) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][530/1251] eta 0:03:44 lr 0.000217 wd 0.0500 time 0.2289 (0.3117) data time 0.0008 (0.0059) model time 0.2281 (0.3059) loss 2.9701 (2.9786) grad_norm 2.5644 (3.6604) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][540/1251] eta 0:03:40 lr 0.000217 wd 0.0500 time 0.3069 (0.3100) data time 0.0017 (0.0057) model time 0.3052 (0.3044) loss 2.7188 (2.9766) grad_norm 4.3642 (3.6498) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][550/1251] eta 0:03:35 lr 0.000217 wd 0.0500 time 0.2905 (0.3067) data time 0.0012 (0.0055) model time 0.2893 (0.3013) loss 2.8484 (2.9769) grad_norm 3.2855 (3.6364) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 05:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][560/1251] eta 0:03:30 lr 0.000217 wd 0.0500 time 0.2286 (0.3045) data time 0.0010 (0.0053) model time 0.2276 (0.2992) loss 3.5445 (2.9712) grad_norm 3.4274 (3.6237) loss_scale 1024.0000 (522.6667) mem 7377MB [2024-08-30 05:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][570/1251] eta 0:03:25 lr 0.000216 wd 0.0500 time 0.2312 (0.3014) data time 0.0007 (0.0051) model time 0.2305 (0.2963) loss 2.1826 (2.9633) grad_norm 3.2480 (3.6484) loss_scale 1024.0000 (542.7200) mem 7377MB [2024-08-30 05:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][580/1251] eta 
0:03:20 lr 0.000216 wd 0.0500 time 0.2329 (0.2993) data time 0.0011 (0.0050) model time 0.2319 (0.2943) loss 2.2168 (2.9566) grad_norm 2.9866 (3.6691) loss_scale 1024.0000 (561.2308) mem 7377MB [2024-08-30 05:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][590/1251] eta 0:03:16 lr 0.000216 wd 0.0500 time 0.2260 (0.2975) data time 0.0006 (0.0048) model time 0.2254 (0.2926) loss 3.6389 (2.9503) grad_norm 7.3991 (3.6836) loss_scale 1024.0000 (578.3704) mem 7377MB [2024-08-30 05:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][600/1251] eta 0:03:11 lr 0.000216 wd 0.0500 time 0.2243 (0.2949) data time 0.0009 (0.0047) model time 0.2234 (0.2902) loss 3.3788 (2.9571) grad_norm 3.5041 (3.7618) loss_scale 1024.0000 (594.2857) mem 7377MB [2024-08-30 05:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][610/1251] eta 0:03:08 lr 0.000216 wd 0.0500 time 0.2278 (0.2942) data time 0.0009 (0.0046) model time 0.2269 (0.2897) loss 2.4782 (2.9522) grad_norm 2.6425 (3.7539) loss_scale 1024.0000 (609.1034) mem 7377MB [2024-08-30 05:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][620/1251] eta 0:03:04 lr 0.000216 wd 0.0500 time 0.2437 (0.2930) data time 0.0018 (0.0044) model time 0.2419 (0.2885) loss 3.0960 (2.9407) grad_norm 4.6933 (3.7359) loss_scale 1024.0000 (622.9333) mem 7377MB [2024-08-30 05:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][630/1251] eta 0:03:01 lr 0.000216 wd 0.0500 time 0.2284 (0.2926) data time 0.0008 (0.0043) model time 0.2276 (0.2883) loss 2.8076 (2.9358) grad_norm 4.0666 (3.7318) loss_scale 1024.0000 (635.8710) mem 7377MB [2024-08-30 05:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][640/1251] eta 0:02:57 lr 0.000216 wd 0.0500 time 0.2324 (0.2907) data time 0.0009 (0.0042) model time 0.2314 (0.2864) loss 3.1081 (2.9478) grad_norm 2.5025 (3.7603) loss_scale 1024.0000 (648.0000) mem 7377MB [2024-08-30 
05:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][650/1251] eta 0:02:53 lr 0.000216 wd 0.0500 time 0.2254 (0.2887) data time 0.0007 (0.0041) model time 0.2248 (0.2846) loss 3.2591 (2.9508) grad_norm 2.5295 (3.7603) loss_scale 1024.0000 (659.3939) mem 7377MB [2024-08-30 05:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][660/1251] eta 0:02:49 lr 0.000216 wd 0.0500 time 0.2429 (0.2875) data time 0.0011 (0.0041) model time 0.2418 (0.2834) loss 2.9926 (2.9511) grad_norm 3.9893 (3.7797) loss_scale 1024.0000 (670.1176) mem 7377MB [2024-08-30 05:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][670/1251] eta 0:02:46 lr 0.000216 wd 0.0500 time 0.2272 (0.2864) data time 0.0008 (0.0040) model time 0.2263 (0.2824) loss 2.5606 (2.9516) grad_norm 3.3937 (3.7799) loss_scale 1024.0000 (680.2286) mem 7377MB [2024-08-30 05:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][680/1251] eta 0:02:42 lr 0.000216 wd 0.0500 time 0.2205 (0.2852) data time 0.0007 (0.0039) model time 0.2197 (0.2813) loss 3.4702 (2.9545) grad_norm 4.1879 (3.7794) loss_scale 1024.0000 (689.7778) mem 7377MB [2024-08-30 05:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][690/1251] eta 0:02:39 lr 0.000216 wd 0.0500 time 0.2245 (0.2836) data time 0.0009 (0.0038) model time 0.2235 (0.2798) loss 3.1922 (2.9556) grad_norm 4.6880 (3.7666) loss_scale 1024.0000 (698.8108) mem 7377MB [2024-08-30 05:27:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][700/1251] eta 0:02:35 lr 0.000216 wd 0.0500 time 0.3222 (0.2827) data time 0.0011 (0.0037) model time 0.3210 (0.2789) loss 2.6638 (2.9546) grad_norm 2.9956 (3.7520) loss_scale 1024.0000 (707.3684) mem 7377MB [2024-08-30 05:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][710/1251] eta 0:02:32 lr 0.000216 wd 0.0500 time 0.2204 (0.2821) data time 0.0010 (0.0037) model time 0.2194 (0.2785) loss 
1.8272 (2.9445) grad_norm 2.4825 (3.7457) loss_scale 1024.0000 (715.4872) mem 7377MB [2024-08-30 05:27:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][720/1251] eta 0:02:29 lr 0.000216 wd 0.0500 time 0.2376 (0.2808) data time 0.0009 (0.0036) model time 0.2368 (0.2771) loss 2.9703 (2.9480) grad_norm 2.4442 (3.7503) loss_scale 1024.0000 (723.2000) mem 7377MB [2024-08-30 05:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][730/1251] eta 0:02:25 lr 0.000216 wd 0.0500 time 0.2182 (0.2794) data time 0.0013 (0.0036) model time 0.2169 (0.2759) loss 3.0308 (2.9541) grad_norm 4.3674 (3.7380) loss_scale 1024.0000 (730.5366) mem 7377MB [2024-08-30 05:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][740/1251] eta 0:02:22 lr 0.000216 wd 0.0500 time 0.2881 (0.2788) data time 0.0007 (0.0035) model time 0.2874 (0.2753) loss 3.1890 (2.9538) grad_norm 3.0766 (3.7396) loss_scale 1024.0000 (737.5238) mem 7377MB [2024-08-30 05:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][750/1251] eta 0:02:19 lr 0.000216 wd 0.0500 time 0.2276 (0.2781) data time 0.0008 (0.0034) model time 0.2267 (0.2747) loss 3.3301 (2.9613) grad_norm 5.5036 (3.7387) loss_scale 1024.0000 (744.1860) mem 7377MB [2024-08-30 05:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][760/1251] eta 0:02:16 lr 0.000216 wd 0.0500 time 0.2327 (0.2773) data time 0.0008 (0.0034) model time 0.2318 (0.2739) loss 2.4143 (2.9639) grad_norm 5.0914 (3.7350) loss_scale 1024.0000 (750.5455) mem 7377MB [2024-08-30 05:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][770/1251] eta 0:02:12 lr 0.000216 wd 0.0500 time 0.2199 (0.2761) data time 0.0007 (0.0033) model time 0.2192 (0.2728) loss 2.9578 (2.9634) grad_norm 5.3858 (3.7481) loss_scale 1024.0000 (756.6222) mem 7377MB [2024-08-30 05:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][780/1251] eta 0:02:09 lr 
0.000216 wd 0.0500 time 0.2437 (0.2751) data time 0.0009 (0.0033) model time 0.2428 (0.2718) loss 3.3115 (2.9589) grad_norm 4.0262 (3.7391) loss_scale 1024.0000 (762.4348) mem 7377MB [2024-08-30 05:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][790/1251] eta 0:02:06 lr 0.000216 wd 0.0500 time 0.2238 (0.2750) data time 0.0010 (0.0033) model time 0.2228 (0.2717) loss 3.0246 (2.9516) grad_norm 2.8799 (3.7295) loss_scale 1024.0000 (768.0000) mem 7377MB [2024-08-30 05:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][800/1251] eta 0:02:03 lr 0.000216 wd 0.0500 time 0.2330 (0.2740) data time 0.0008 (0.0032) model time 0.2322 (0.2708) loss 2.3440 (2.9502) grad_norm 2.7278 (3.7161) loss_scale 1024.0000 (773.3333) mem 7377MB [2024-08-30 05:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][810/1251] eta 0:02:00 lr 0.000216 wd 0.0500 time 0.2198 (0.2733) data time 0.0010 (0.0032) model time 0.2188 (0.2701) loss 2.8155 (2.9550) grad_norm 3.7695 (3.7230) loss_scale 1024.0000 (778.4490) mem 7377MB [2024-08-30 05:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][820/1251] eta 0:01:57 lr 0.000216 wd 0.0500 time 0.2226 (0.2724) data time 0.0007 (0.0031) model time 0.2218 (0.2693) loss 2.4133 (2.9546) grad_norm 3.4652 (3.7208) loss_scale 1024.0000 (783.3600) mem 7377MB [2024-08-30 05:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][830/1251] eta 0:01:54 lr 0.000216 wd 0.0500 time 0.2975 (0.2725) data time 0.0013 (0.0031) model time 0.2961 (0.2694) loss 2.9057 (2.9581) grad_norm 4.3764 (3.7244) loss_scale 1024.0000 (788.0784) mem 7377MB [2024-08-30 05:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][840/1251] eta 0:01:51 lr 0.000216 wd 0.0500 time 0.2241 (0.2717) data time 0.0007 (0.0031) model time 0.2233 (0.2686) loss 3.0914 (2.9595) grad_norm 2.5759 (3.7116) loss_scale 1024.0000 (792.6154) mem 7377MB [2024-08-30 05:28:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][850/1251] eta 0:01:48 lr 0.000215 wd 0.0500 time 0.2215 (0.2708) data time 0.0009 (0.0030) model time 0.2206 (0.2678) loss 2.8126 (2.9523) grad_norm 5.4623 (3.7105) loss_scale 1024.0000 (796.9811) mem 7377MB [2024-08-30 05:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][860/1251] eta 0:01:45 lr 0.000215 wd 0.0500 time 0.2214 (0.2700) data time 0.0008 (0.0030) model time 0.2205 (0.2671) loss 3.1820 (2.9508) grad_norm 3.0349 (3.7075) loss_scale 1024.0000 (801.1852) mem 7377MB [2024-08-30 05:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][870/1251] eta 0:01:42 lr 0.000215 wd 0.0500 time 0.2766 (0.2703) data time 0.0016 (0.0030) model time 0.2750 (0.2673) loss 1.9064 (2.9501) grad_norm 3.5836 (3.7050) loss_scale 1024.0000 (805.2364) mem 7377MB [2024-08-30 05:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][880/1251] eta 0:01:40 lr 0.000215 wd 0.0500 time 0.3310 (0.2697) data time 0.0011 (0.0029) model time 0.3299 (0.2668) loss 3.2155 (2.9543) grad_norm 7.8443 (3.7070) loss_scale 1024.0000 (809.1429) mem 7377MB [2024-08-30 05:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][890/1251] eta 0:01:37 lr 0.000215 wd 0.0500 time 0.2271 (0.2693) data time 0.0010 (0.0029) model time 0.2261 (0.2664) loss 2.8826 (2.9551) grad_norm 3.1325 (3.7076) loss_scale 1024.0000 (812.9123) mem 7377MB [2024-08-30 05:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][900/1251] eta 0:01:34 lr 0.000215 wd 0.0500 time 0.2242 (0.2686) data time 0.0007 (0.0029) model time 0.2234 (0.2657) loss 2.5526 (2.9561) grad_norm 4.5120 (3.7010) loss_scale 1024.0000 (816.5517) mem 7377MB [2024-08-30 05:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][910/1251] eta 0:01:31 lr 0.000215 wd 0.0500 time 0.2843 (0.2684) data time 0.0007 (0.0028) model time 0.2836 (0.2656) loss 3.5919 
(2.9586) grad_norm 4.8916 (3.6992) loss_scale 1024.0000 (820.0678) mem 7377MB [2024-08-30 05:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][920/1251] eta 0:01:28 lr 0.000215 wd 0.0500 time 0.2305 (0.2682) data time 0.0009 (0.0028) model time 0.2296 (0.2654) loss 3.0533 (2.9583) grad_norm 2.6220 (3.6907) loss_scale 1024.0000 (823.4667) mem 7377MB [2024-08-30 05:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][930/1251] eta 0:01:25 lr 0.000215 wd 0.0500 time 0.2236 (0.2676) data time 0.0006 (0.0028) model time 0.2229 (0.2648) loss 2.9025 (2.9564) grad_norm 3.0557 (3.6843) loss_scale 1024.0000 (826.7541) mem 7377MB [2024-08-30 05:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][940/1251] eta 0:01:23 lr 0.000215 wd 0.0500 time 0.2584 (0.2670) data time 0.0010 (0.0028) model time 0.2573 (0.2642) loss 3.1984 (2.9565) grad_norm 3.3378 (3.6783) loss_scale 1024.0000 (829.9355) mem 7377MB [2024-08-30 05:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][950/1251] eta 0:01:20 lr 0.000215 wd 0.0500 time 0.2237 (0.2670) data time 0.0010 (0.0027) model time 0.2227 (0.2642) loss 3.5653 (2.9586) grad_norm 4.0858 (3.6662) loss_scale 1024.0000 (833.0159) mem 7377MB [2024-08-30 05:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][960/1251] eta 0:01:17 lr 0.000215 wd 0.0500 time 0.2283 (0.2667) data time 0.0012 (0.0027) model time 0.2271 (0.2640) loss 1.8764 (2.9590) grad_norm 2.6866 (3.6532) loss_scale 1024.0000 (836.0000) mem 7377MB [2024-08-30 05:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][970/1251] eta 0:01:14 lr 0.000215 wd 0.0500 time 0.2244 (0.2661) data time 0.0009 (0.0027) model time 0.2235 (0.2634) loss 2.9151 (2.9547) grad_norm 3.1329 (3.6455) loss_scale 1024.0000 (838.8923) mem 7377MB [2024-08-30 05:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][980/1251] eta 0:01:11 lr 0.000215 wd 
0.0500 time 0.2272 (0.2655) data time 0.0014 (0.0027) model time 0.2258 (0.2628) loss 3.0848 (2.9530) grad_norm 3.3082 (3.6379) loss_scale 1024.0000 (841.6970) mem 7377MB [2024-08-30 05:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][990/1251] eta 0:01:09 lr 0.000215 wd 0.0500 time 0.2266 (0.2653) data time 0.0010 (0.0026) model time 0.2256 (0.2626) loss 3.2180 (2.9553) grad_norm 3.0916 (3.6328) loss_scale 1024.0000 (844.4179) mem 7377MB [2024-08-30 05:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1000/1251] eta 0:01:06 lr 0.000215 wd 0.0500 time 0.2265 (0.2650) data time 0.0010 (0.0026) model time 0.2255 (0.2624) loss 2.5686 (2.9574) grad_norm 2.7894 (3.6531) loss_scale 1024.0000 (847.0588) mem 7377MB [2024-08-30 05:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1010/1251] eta 0:01:03 lr 0.000215 wd 0.0500 time 0.2332 (0.2645) data time 0.0007 (0.0026) model time 0.2325 (0.2619) loss 2.1997 (2.9572) grad_norm 2.9352 (3.6442) loss_scale 1024.0000 (849.6232) mem 7377MB [2024-08-30 05:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1020/1251] eta 0:01:01 lr 0.000215 wd 0.0500 time 0.2280 (0.2644) data time 0.0008 (0.0026) model time 0.2272 (0.2619) loss 2.2697 (2.9539) grad_norm 2.1016 (3.6483) loss_scale 1024.0000 (852.1143) mem 7377MB [2024-08-30 05:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1030/1251] eta 0:00:58 lr 0.000215 wd 0.0500 time 0.2621 (0.2642) data time 0.0013 (0.0026) model time 0.2608 (0.2617) loss 2.9514 (2.9534) grad_norm 3.3981 (3.6513) loss_scale 1024.0000 (854.5352) mem 7377MB [2024-08-30 05:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1040/1251] eta 0:00:55 lr 0.000215 wd 0.0500 time 0.2207 (0.2641) data time 0.0011 (0.0025) model time 0.2196 (0.2616) loss 2.0155 (2.9486) grad_norm 2.7196 (3.6521) loss_scale 1024.0000 (856.8889) mem 7377MB [2024-08-30 05:28:51 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1050/1251] eta 0:00:52 lr 0.000215 wd 0.0500 time 0.2270 (0.2637) data time 0.0009 (0.0025) model time 0.2262 (0.2611) loss 3.2405 (2.9499) grad_norm 3.9680 (3.6534) loss_scale 1024.0000 (859.1781) mem 7377MB [2024-08-30 05:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1060/1251] eta 0:00:50 lr 0.000215 wd 0.0500 time 0.2258 (0.2631) data time 0.0009 (0.0025) model time 0.2249 (0.2606) loss 3.5137 (2.9532) grad_norm 4.0007 (3.6533) loss_scale 1024.0000 (861.4054) mem 7377MB [2024-08-30 05:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1070/1251] eta 0:00:47 lr 0.000215 wd 0.0500 time 0.2266 (0.2629) data time 0.0010 (0.0025) model time 0.2256 (0.2604) loss 2.9756 (2.9532) grad_norm 2.7290 (3.6546) loss_scale 1024.0000 (863.5733) mem 7377MB [2024-08-30 05:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1080/1251] eta 0:00:44 lr 0.000215 wd 0.0500 time 0.2281 (0.2627) data time 0.0009 (0.0025) model time 0.2272 (0.2603) loss 3.2642 (2.9522) grad_norm 3.3663 (3.6477) loss_scale 1024.0000 (865.6842) mem 7377MB [2024-08-30 05:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1090/1251] eta 0:00:42 lr 0.000215 wd 0.0500 time 0.2389 (0.2625) data time 0.0010 (0.0025) model time 0.2379 (0.2601) loss 3.3109 (2.9539) grad_norm 3.7351 (3.6541) loss_scale 1024.0000 (867.7403) mem 7377MB [2024-08-30 05:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1100/1251] eta 0:00:39 lr 0.000215 wd 0.0500 time 0.2230 (0.2621) data time 0.0011 (0.0024) model time 0.2219 (0.2597) loss 3.1161 (2.9542) grad_norm 3.6640 (3.6520) loss_scale 1024.0000 (869.7436) mem 7377MB [2024-08-30 05:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1110/1251] eta 0:00:36 lr 0.000215 wd 0.0500 time 0.2401 (0.2617) data time 0.0010 (0.0024) model time 0.2391 (0.2593) loss 
3.0712 (2.9537) grad_norm 3.6685 (3.6458) loss_scale 1024.0000 (871.6962) mem 7377MB [2024-08-30 05:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1120/1251] eta 0:00:34 lr 0.000215 wd 0.0500 time 0.2223 (0.2618) data time 0.0010 (0.0024) model time 0.2213 (0.2594) loss 2.7861 (2.9548) grad_norm 4.1686 (3.6476) loss_scale 1024.0000 (873.6000) mem 7377MB [2024-08-30 05:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1130/1251] eta 0:00:31 lr 0.000214 wd 0.0500 time 0.2357 (0.2613) data time 0.0006 (0.0024) model time 0.2351 (0.2590) loss 2.7198 (2.9511) grad_norm 2.6075 (3.6403) loss_scale 1024.0000 (875.4568) mem 7377MB [2024-08-30 05:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1140/1251] eta 0:00:29 lr 0.000214 wd 0.0500 time 0.3233 (0.2614) data time 0.0019 (0.0024) model time 0.3214 (0.2590) loss 2.0946 (2.9513) grad_norm 2.1387 (3.6312) loss_scale 1024.0000 (877.2683) mem 7377MB [2024-08-30 05:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1150/1251] eta 0:00:26 lr 0.000214 wd 0.0500 time 0.2535 (0.2613) data time 0.0009 (0.0024) model time 0.2527 (0.2589) loss 3.0473 (2.9485) grad_norm 3.6214 (3.6236) loss_scale 1024.0000 (879.0361) mem 7377MB [2024-08-30 05:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1160/1251] eta 0:00:23 lr 0.000214 wd 0.0500 time 0.3411 (0.2614) data time 0.0010 (0.0023) model time 0.3401 (0.2591) loss 2.5093 (2.9480) grad_norm 5.7976 (3.6219) loss_scale 1024.0000 (880.7619) mem 7377MB [2024-08-30 05:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1170/1251] eta 0:00:21 lr 0.000214 wd 0.0500 time 0.2295 (0.2611) data time 0.0007 (0.0023) model time 0.2289 (0.2587) loss 3.5104 (2.9456) grad_norm 3.7966 (3.6458) loss_scale 1024.0000 (882.4471) mem 7377MB [2024-08-30 05:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1180/1251] eta 0:00:18 
lr 0.000214 wd 0.0500 time 0.2215 (0.2606) data time 0.0008 (0.0023) model time 0.2208 (0.2583) loss 2.4435 (2.9457) grad_norm 7.3752 (3.6455) loss_scale 1024.0000 (884.0930) mem 7377MB [2024-08-30 05:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1190/1251] eta 0:00:15 lr 0.000214 wd 0.0500 time 0.2292 (0.2603) data time 0.0007 (0.0023) model time 0.2285 (0.2580) loss 3.0920 (2.9471) grad_norm 3.0473 (3.6454) loss_scale 1024.0000 (885.7011) mem 7377MB [2024-08-30 05:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1200/1251] eta 0:00:13 lr 0.000214 wd 0.0500 time 0.2225 (0.2604) data time 0.0011 (0.0023) model time 0.2214 (0.2581) loss 2.3968 (2.9445) grad_norm 2.6297 (3.6404) loss_scale 1024.0000 (887.2727) mem 7377MB [2024-08-30 05:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1210/1251] eta 0:00:10 lr 0.000214 wd 0.0500 time 0.2298 (0.2600) data time 0.0009 (0.0023) model time 0.2289 (0.2577) loss 2.1664 (2.9444) grad_norm 4.5608 (3.6367) loss_scale 1024.0000 (888.8090) mem 7377MB [2024-08-30 05:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1220/1251] eta 0:00:08 lr 0.000214 wd 0.0500 time 0.2260 (0.2599) data time 0.0008 (0.0023) model time 0.2252 (0.2576) loss 3.4840 (2.9437) grad_norm 3.1964 (3.6347) loss_scale 1024.0000 (890.3111) mem 7377MB [2024-08-30 05:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1230/1251] eta 0:00:05 lr 0.000214 wd 0.0500 time 0.2283 (0.2595) data time 0.0008 (0.0022) model time 0.2276 (0.2573) loss 3.3109 (2.9450) grad_norm 2.5777 (3.6250) loss_scale 1024.0000 (891.7802) mem 7377MB [2024-08-30 05:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1240/1251] eta 0:00:02 lr 0.000214 wd 0.0500 time 0.2999 (0.2595) data time 0.0011 (0.0022) model time 0.2988 (0.2573) loss 3.4232 (2.9450) grad_norm 3.8541 (3.6241) loss_scale 1024.0000 (893.2174) mem 7377MB [2024-08-30 
05:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [215/300][1250/1251] eta 0:00:00 lr 0.000214 wd 0.0500 time 0.2111 (0.2591) data time 0.0006 (0.0022) model time 0.2105 (0.2569) loss 2.8603 (2.9472) grad_norm 2.5501 (3.6417) loss_scale 1024.0000 (894.6237) mem 7377MB [2024-08-30 05:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 215 training takes 0:04:00 [2024-08-30 05:29:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:29:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 05:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.474 (0.474) Loss 0.4009 (0.4009) Acc@1 92.871 (92.871) Acc@5 98.047 (98.047) Mem 7377MB [2024-08-30 05:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.119) Loss 0.6299 (0.6343) Acc@1 87.695 (86.381) Acc@5 97.656 (97.399) Mem 7377MB [2024-08-30 05:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.101) Loss 0.9375 (0.6684) Acc@1 76.953 (85.254) Acc@5 95.312 (97.405) Mem 7377MB [2024-08-30 05:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.093) Loss 1.1523 (0.7701) Acc@1 72.266 (82.847) Acc@5 91.895 (96.314) Mem 7377MB [2024-08-30 05:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.076 (0.088) Loss 1.0283 (0.8161) Acc@1 76.465 (81.826) Acc@5 93.262 (95.794) Mem 7377MB [2024-08-30 05:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.390 Acc@5 95.730 [2024-08-30 05:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.4% [2024-08-30 05:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.941 (0.941) Loss 0.3789 (0.3789) Acc@1 93.262 (93.262) Acc@5 98.438 
(98.438) Mem 7377MB [2024-08-30 05:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.167) Loss 0.5879 (0.6052) Acc@1 88.672 (87.331) Acc@5 97.754 (97.701) Mem 7377MB [2024-08-30 05:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.129) Loss 0.8613 (0.6312) Acc@1 78.711 (86.328) Acc@5 95.996 (97.680) Mem 7377MB [2024-08-30 05:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.113) Loss 1.0869 (0.7150) Acc@1 73.438 (84.221) Acc@5 93.262 (96.790) Mem 7377MB [2024-08-30 05:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.101) Loss 0.9805 (0.7582) Acc@1 76.953 (83.136) Acc@5 94.434 (96.308) Mem 7377MB [2024-08-30 05:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.726 Acc@5 96.288 [2024-08-30 05:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-08-30 05:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.73% [2024-08-30 05:29:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 05:29:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 05:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][0/1251] eta 0:16:02 lr 0.000214 wd 0.0500 time 0.7694 (0.7694) data time 0.4337 (0.4337) model time 0.0000 (0.0000) loss 3.4606 (3.4606) grad_norm 2.5640 (2.5640) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-08-30 05:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][10/1251] eta 0:06:26 lr 0.000214 wd 0.0500 time 0.3112 (0.3116) data time 0.0012 (0.0406) model time 0.0000 (0.0000) loss 3.4131 (2.9303) grad_norm 2.9683 (3.3528) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][20/1251] eta 0:05:38 lr 0.000214 wd 0.0500 time 0.2280 (0.2753) data time 0.0009 (0.0218) model time 0.0000 (0.0000) loss 2.2728 (2.7450) grad_norm 3.4629 (3.6011) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][30/1251] eta 0:05:17 lr 0.000214 wd 0.0500 time 0.2368 (0.2600) data time 0.0011 (0.0151) model time 0.0000 (0.0000) loss 3.1307 (2.8099) grad_norm 2.6683 (3.7021) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][40/1251] eta 0:05:13 lr 0.000214 wd 0.0500 time 0.2217 (0.2587) data time 0.0010 (0.0117) model time 0.0000 (0.0000) loss 3.0354 (2.8333) grad_norm 2.9057 (3.6754) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][50/1251] eta 0:05:10 lr 0.000214 wd 0.0500 time 0.2207 (0.2585) data time 0.0011 (0.0097) model time 0.0000 (0.0000) loss 3.4148 (2.8665) grad_norm 4.3934 (3.6893) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][60/1251] eta 0:05:07 lr 0.000214 wd 0.0500 time 0.2509 (0.2583) data time 0.0008 (0.0083) model time 0.2501 
(0.2562) loss 3.3555 (2.8727) grad_norm 2.6952 (3.7458) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][70/1251] eta 0:05:00 lr 0.000214 wd 0.0500 time 0.2298 (0.2545) data time 0.0009 (0.0073) model time 0.2289 (0.2430) loss 2.7779 (2.8647) grad_norm 2.5377 (3.7221) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][80/1251] eta 0:04:54 lr 0.000214 wd 0.0500 time 0.2323 (0.2515) data time 0.0011 (0.0066) model time 0.2313 (0.2380) loss 2.9684 (2.8920) grad_norm 5.1837 (3.9599) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][90/1251] eta 0:04:52 lr 0.000214 wd 0.0500 time 0.3014 (0.2522) data time 0.0014 (0.0061) model time 0.3000 (0.2427) loss 3.3979 (2.8912) grad_norm 6.9386 (3.9729) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][100/1251] eta 0:04:49 lr 0.000214 wd 0.0500 time 0.2279 (0.2513) data time 0.0011 (0.0056) model time 0.2268 (0.2425) loss 3.1648 (2.8721) grad_norm 2.6638 (3.9095) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][110/1251] eta 0:04:45 lr 0.000214 wd 0.0500 time 0.2864 (0.2500) data time 0.0008 (0.0052) model time 0.2856 (0.2414) loss 3.4554 (2.8917) grad_norm 2.1581 (3.8531) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][120/1251] eta 0:04:41 lr 0.000214 wd 0.0500 time 0.2254 (0.2490) data time 0.0010 (0.0049) model time 0.2244 (0.2406) loss 2.2737 (2.8998) grad_norm 4.6238 (3.8895) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][130/1251] 
eta 0:04:39 lr 0.000214 wd 0.0500 time 0.2272 (0.2492) data time 0.0014 (0.0046) model time 0.2258 (0.2419) loss 2.3966 (2.9144) grad_norm 4.0827 (3.8703) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][140/1251] eta 0:04:36 lr 0.000214 wd 0.0500 time 0.2280 (0.2488) data time 0.0009 (0.0043) model time 0.2271 (0.2420) loss 3.3544 (2.9077) grad_norm 2.9402 (3.8694) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][150/1251] eta 0:04:32 lr 0.000214 wd 0.0500 time 0.2315 (0.2477) data time 0.0008 (0.0041) model time 0.2307 (0.2408) loss 1.9311 (2.9048) grad_norm 5.2828 (3.8674) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][160/1251] eta 0:04:28 lr 0.000213 wd 0.0500 time 0.2412 (0.2465) data time 0.0009 (0.0039) model time 0.2403 (0.2397) loss 3.1313 (2.9044) grad_norm 3.3631 (3.8360) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][170/1251] eta 0:04:27 lr 0.000213 wd 0.0500 time 0.2347 (0.2473) data time 0.0014 (0.0038) model time 0.2332 (0.2413) loss 3.0461 (2.8994) grad_norm 3.6739 (3.8188) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][180/1251] eta 0:04:24 lr 0.000213 wd 0.0500 time 0.2163 (0.2473) data time 0.0009 (0.0036) model time 0.2154 (0.2416) loss 3.2246 (2.9017) grad_norm 2.6307 (3.7748) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][190/1251] eta 0:04:22 lr 0.000213 wd 0.0500 time 0.2334 (0.2471) data time 0.0009 (0.0035) model time 0.2325 (0.2416) loss 2.2062 (2.9157) grad_norm 2.7063 (3.7441) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-08-30 05:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][200/1251] eta 0:04:18 lr 0.000213 wd 0.0500 time 0.2230 (0.2460) data time 0.0009 (0.0034) model time 0.2222 (0.2405) loss 3.9718 (2.9202) grad_norm 6.8959 (3.7531) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][210/1251] eta 0:04:15 lr 0.000213 wd 0.0500 time 0.3250 (0.2459) data time 0.0018 (0.0033) model time 0.3232 (0.2407) loss 1.9561 (2.9160) grad_norm 2.7065 (3.7628) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][220/1251] eta 0:04:14 lr 0.000213 wd 0.0500 time 0.2311 (0.2465) data time 0.0009 (0.0032) model time 0.2302 (0.2416) loss 3.0495 (2.9174) grad_norm 4.5885 (3.7676) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][230/1251] eta 0:04:10 lr 0.000213 wd 0.0500 time 0.2269 (0.2457) data time 0.0008 (0.0031) model time 0.2260 (0.2408) loss 3.1677 (2.9130) grad_norm 2.9890 (3.7518) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][240/1251] eta 0:04:07 lr 0.000213 wd 0.0500 time 0.2338 (0.2450) data time 0.0008 (0.0030) model time 0.2330 (0.2401) loss 2.7800 (2.9176) grad_norm 2.9293 (3.7699) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][250/1251] eta 0:04:05 lr 0.000213 wd 0.0500 time 0.2485 (0.2454) data time 0.0010 (0.0030) model time 0.2475 (0.2408) loss 2.7555 (2.9290) grad_norm 2.9255 (3.7603) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][260/1251] eta 0:04:04 lr 0.000213 wd 0.0500 time 0.2783 (0.2464) data time 0.0013 (0.0029) model time 0.2770 
(0.2422) loss 1.8764 (2.9357) grad_norm 3.7539 (3.7423) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][270/1251] eta 0:04:01 lr 0.000213 wd 0.0500 time 0.2345 (0.2458) data time 0.0006 (0.0028) model time 0.2339 (0.2416) loss 2.8372 (2.9396) grad_norm 4.6404 (3.7392) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][280/1251] eta 0:03:58 lr 0.000213 wd 0.0500 time 0.2362 (0.2453) data time 0.0009 (0.0028) model time 0.2353 (0.2411) loss 2.2407 (2.9391) grad_norm 3.7274 (3.7870) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][290/1251] eta 0:03:55 lr 0.000213 wd 0.0500 time 0.2466 (0.2448) data time 0.0012 (0.0028) model time 0.2454 (0.2406) loss 2.6530 (2.9358) grad_norm 3.6583 (3.7797) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][300/1251] eta 0:03:53 lr 0.000213 wd 0.0500 time 0.2232 (0.2455) data time 0.0010 (0.0027) model time 0.2222 (0.2415) loss 3.1535 (2.9372) grad_norm 4.1416 (3.7690) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][310/1251] eta 0:03:50 lr 0.000213 wd 0.0500 time 0.2302 (0.2449) data time 0.0006 (0.0027) model time 0.2296 (0.2410) loss 3.5389 (2.9373) grad_norm 3.7788 (3.7966) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][320/1251] eta 0:03:48 lr 0.000213 wd 0.0500 time 0.2196 (0.2452) data time 0.0007 (0.0026) model time 0.2189 (0.2414) loss 2.9754 (2.9313) grad_norm 3.0727 (3.7815) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-08-30 05:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[216/300][330/1251] eta 0:03:45 lr 0.000213 wd 0.0500 time 0.2287 (0.2447) data time 0.0009 (0.0026) model time 0.2278 (0.2409) loss 2.9468 (2.9357) grad_norm 2.8273 (inf) loss_scale 512.0000 (1020.9063) mem 7381MB [2024-08-30 05:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][340/1251] eta 0:03:43 lr 0.000213 wd 0.0500 time 0.2536 (0.2452) data time 0.0011 (0.0026) model time 0.2525 (0.2416) loss 2.4295 (2.9426) grad_norm 3.0879 (inf) loss_scale 512.0000 (1005.9824) mem 7381MB [2024-08-30 05:31:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][350/1251] eta 0:03:40 lr 0.000213 wd 0.0500 time 0.2297 (0.2448) data time 0.0010 (0.0025) model time 0.2288 (0.2411) loss 3.1581 (2.9532) grad_norm 3.9114 (inf) loss_scale 512.0000 (991.9088) mem 7381MB [2024-08-30 05:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][360/1251] eta 0:03:37 lr 0.000213 wd 0.0500 time 0.2259 (0.2443) data time 0.0007 (0.0025) model time 0.2252 (0.2406) loss 3.2311 (2.9561) grad_norm 4.7616 (inf) loss_scale 512.0000 (978.6150) mem 7381MB [2024-08-30 05:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][370/1251] eta 0:03:34 lr 0.000213 wd 0.0500 time 0.2275 (0.2439) data time 0.0007 (0.0025) model time 0.2268 (0.2402) loss 3.4171 (2.9538) grad_norm 2.9303 (inf) loss_scale 512.0000 (966.0377) mem 7381MB [2024-08-30 05:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][380/1251] eta 0:03:33 lr 0.000213 wd 0.0500 time 0.2265 (0.2447) data time 0.0007 (0.0024) model time 0.2258 (0.2412) loss 1.7570 (2.9491) grad_norm 4.8416 (inf) loss_scale 512.0000 (954.1207) mem 7381MB [2024-08-30 05:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][390/1251] eta 0:03:30 lr 0.000213 wd 0.0500 time 0.2943 (0.2445) data time 0.0014 (0.0024) model time 0.2929 (0.2411) loss 2.1566 (2.9418) grad_norm 4.2892 (inf) loss_scale 512.0000 (942.8133) mem 7381MB [2024-08-30 
05:31:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][400/1251] eta 0:03:28 lr 0.000213 wd 0.0500 time 0.2286 (0.2444) data time 0.0008 (0.0024) model time 0.2278 (0.2411) loss 3.0738 (2.9437) grad_norm 2.5720 (inf) loss_scale 512.0000 (932.0698) mem 7381MB [2024-08-30 05:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][410/1251] eta 0:03:25 lr 0.000213 wd 0.0500 time 0.2406 (0.2441) data time 0.0008 (0.0024) model time 0.2397 (0.2407) loss 3.4247 (2.9466) grad_norm 3.8470 (inf) loss_scale 512.0000 (921.8491) mem 7381MB [2024-08-30 05:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][420/1251] eta 0:03:23 lr 0.000213 wd 0.0500 time 0.3150 (0.2444) data time 0.0009 (0.0023) model time 0.3141 (0.2412) loss 2.9560 (2.9475) grad_norm 2.4081 (inf) loss_scale 512.0000 (912.1140) mem 7381MB [2024-08-30 05:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][430/1251] eta 0:03:20 lr 0.000213 wd 0.0500 time 0.2321 (0.2445) data time 0.0009 (0.0023) model time 0.2312 (0.2413) loss 3.5486 (2.9505) grad_norm 3.6443 (inf) loss_scale 512.0000 (902.8306) mem 7381MB [2024-08-30 05:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][440/1251] eta 0:03:18 lr 0.000212 wd 0.0500 time 0.2330 (0.2442) data time 0.0015 (0.0023) model time 0.2315 (0.2410) loss 3.2421 (2.9522) grad_norm 3.4138 (inf) loss_scale 512.0000 (893.9683) mem 7381MB [2024-08-30 05:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][450/1251] eta 0:03:15 lr 0.000212 wd 0.0500 time 0.2203 (0.2445) data time 0.0011 (0.0023) model time 0.2192 (0.2414) loss 3.1295 (2.9551) grad_norm 4.4337 (inf) loss_scale 512.0000 (885.4989) mem 7381MB [2024-08-30 05:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][460/1251] eta 0:03:13 lr 0.000212 wd 0.0500 time 0.2203 (0.2451) data time 0.0008 (0.0022) model time 0.2195 (0.2421) loss 3.1300 (2.9588) grad_norm 
2.8652 (inf) loss_scale 512.0000 (877.3970) mem 7381MB [2024-08-30 05:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][470/1251] eta 0:03:11 lr 0.000212 wd 0.0500 time 0.2213 (0.2453) data time 0.0011 (0.0022) model time 0.2203 (0.2424) loss 3.1428 (2.9593) grad_norm 2.3946 (inf) loss_scale 512.0000 (869.6391) mem 7381MB [2024-08-30 05:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][480/1251] eta 0:03:08 lr 0.000212 wd 0.0500 time 0.2413 (0.2450) data time 0.0008 (0.0022) model time 0.2404 (0.2421) loss 3.1486 (2.9625) grad_norm 4.3151 (inf) loss_scale 512.0000 (862.2037) mem 7381MB [2024-08-30 05:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][490/1251] eta 0:03:06 lr 0.000212 wd 0.0500 time 0.2228 (0.2446) data time 0.0010 (0.0022) model time 0.2219 (0.2417) loss 3.0939 (2.9617) grad_norm 3.3622 (inf) loss_scale 512.0000 (855.0713) mem 7381MB [2024-08-30 05:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][500/1251] eta 0:03:03 lr 0.000212 wd 0.0500 time 0.2760 (0.2447) data time 0.0007 (0.0021) model time 0.2753 (0.2418) loss 3.4099 (2.9612) grad_norm 3.9299 (inf) loss_scale 512.0000 (848.2236) mem 7381MB [2024-08-30 05:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][510/1251] eta 0:03:01 lr 0.000212 wd 0.0500 time 0.2275 (0.2446) data time 0.0008 (0.0021) model time 0.2267 (0.2418) loss 3.1317 (2.9621) grad_norm 3.2416 (inf) loss_scale 512.0000 (841.6438) mem 7381MB [2024-08-30 05:32:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][520/1251] eta 0:02:58 lr 0.000212 wd 0.0500 time 0.2224 (0.2446) data time 0.0008 (0.0021) model time 0.2216 (0.2418) loss 3.4553 (2.9634) grad_norm 3.1922 (inf) loss_scale 512.0000 (835.3167) mem 7381MB [2024-08-30 05:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][530/1251] eta 0:02:56 lr 0.000212 wd 0.0500 time 0.2254 (0.2443) data time 0.0009 
(0.0021) model time 0.2245 (0.2415) loss 2.6586 (2.9639) grad_norm 2.5229 (inf) loss_scale 512.0000 (829.2279) mem 7381MB [2024-08-30 05:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][540/1251] eta 0:02:53 lr 0.000212 wd 0.0500 time 0.3056 (0.2444) data time 0.0014 (0.0021) model time 0.3042 (0.2417) loss 1.6462 (2.9598) grad_norm 2.5423 (inf) loss_scale 512.0000 (823.3641) mem 7381MB [2024-08-30 05:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][550/1251] eta 0:02:51 lr 0.000212 wd 0.0500 time 0.2277 (0.2446) data time 0.0007 (0.0020) model time 0.2270 (0.2419) loss 3.2361 (2.9557) grad_norm 2.6927 (inf) loss_scale 512.0000 (817.7132) mem 7381MB [2024-08-30 05:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][560/1251] eta 0:02:48 lr 0.000212 wd 0.0500 time 0.2272 (0.2444) data time 0.0009 (0.0020) model time 0.2263 (0.2417) loss 2.8751 (2.9496) grad_norm 3.0215 (inf) loss_scale 512.0000 (812.2638) mem 7381MB [2024-08-30 05:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][570/1251] eta 0:02:46 lr 0.000212 wd 0.0500 time 0.2299 (0.2441) data time 0.0008 (0.0020) model time 0.2290 (0.2414) loss 2.5428 (2.9468) grad_norm 4.9497 (inf) loss_scale 512.0000 (807.0053) mem 7381MB [2024-08-30 05:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][580/1251] eta 0:02:43 lr 0.000212 wd 0.0500 time 0.2223 (0.2441) data time 0.0013 (0.0020) model time 0.2210 (0.2414) loss 3.4903 (2.9488) grad_norm 2.7318 (inf) loss_scale 512.0000 (801.9277) mem 7381MB [2024-08-30 05:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][590/1251] eta 0:02:41 lr 0.000212 wd 0.0500 time 0.2264 (0.2442) data time 0.0008 (0.0020) model time 0.2256 (0.2416) loss 3.1244 (2.9509) grad_norm 3.2790 (inf) loss_scale 512.0000 (797.0220) mem 7381MB [2024-08-30 05:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][600/1251] eta 
0:02:39 lr 0.000212 wd 0.0500 time 0.2255 (0.2443) data time 0.0006 (0.0020) model time 0.2249 (0.2417) loss 2.0406 (2.9512) grad_norm 2.8142 (inf) loss_scale 512.0000 (792.2795) mem 7381MB [2024-08-30 05:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][610/1251] eta 0:02:36 lr 0.000212 wd 0.0500 time 0.2347 (0.2440) data time 0.0006 (0.0020) model time 0.2340 (0.2414) loss 1.7692 (2.9500) grad_norm 4.2006 (inf) loss_scale 512.0000 (787.6923) mem 7381MB [2024-08-30 05:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][620/1251] eta 0:02:33 lr 0.000212 wd 0.0500 time 0.2292 (0.2437) data time 0.0009 (0.0019) model time 0.2284 (0.2411) loss 3.0698 (2.9474) grad_norm 3.6997 (inf) loss_scale 512.0000 (783.2528) mem 7381MB [2024-08-30 05:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][630/1251] eta 0:02:31 lr 0.000212 wd 0.0500 time 0.2268 (0.2441) data time 0.0009 (0.0019) model time 0.2259 (0.2416) loss 1.9054 (2.9463) grad_norm 3.4047 (inf) loss_scale 512.0000 (778.9540) mem 7381MB [2024-08-30 05:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][640/1251] eta 0:02:29 lr 0.000212 wd 0.0500 time 0.2303 (0.2442) data time 0.0007 (0.0019) model time 0.2296 (0.2417) loss 2.9465 (2.9434) grad_norm 3.7447 (inf) loss_scale 512.0000 (774.7894) mem 7381MB [2024-08-30 05:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][650/1251] eta 0:02:26 lr 0.000212 wd 0.0500 time 0.2227 (0.2439) data time 0.0006 (0.0019) model time 0.2220 (0.2414) loss 2.0210 (2.9416) grad_norm 3.2473 (inf) loss_scale 512.0000 (770.7527) mem 7381MB [2024-08-30 05:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][660/1251] eta 0:02:24 lr 0.000212 wd 0.0500 time 0.2341 (0.2440) data time 0.0011 (0.0019) model time 0.2330 (0.2416) loss 2.5469 (2.9442) grad_norm 2.9492 (inf) loss_scale 512.0000 (766.8381) mem 7381MB [2024-08-30 05:32:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][670/1251] eta 0:02:21 lr 0.000212 wd 0.0500 time 0.3040 (0.2443) data time 0.0009 (0.0019) model time 0.3031 (0.2419) loss 3.3393 (2.9478) grad_norm 2.6178 (inf) loss_scale 512.0000 (763.0402) mem 7381MB [2024-08-30 05:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][680/1251] eta 0:02:19 lr 0.000212 wd 0.0500 time 0.2345 (0.2443) data time 0.0008 (0.0019) model time 0.2337 (0.2420) loss 2.5300 (2.9479) grad_norm 3.8792 (inf) loss_scale 512.0000 (759.3539) mem 7381MB [2024-08-30 05:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][690/1251] eta 0:02:16 lr 0.000212 wd 0.0500 time 0.2278 (0.2441) data time 0.0009 (0.0019) model time 0.2269 (0.2418) loss 2.3491 (2.9483) grad_norm 3.9824 (inf) loss_scale 512.0000 (755.7742) mem 7381MB [2024-08-30 05:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][700/1251] eta 0:02:14 lr 0.000212 wd 0.0500 time 0.2322 (0.2439) data time 0.0009 (0.0019) model time 0.2313 (0.2416) loss 3.3193 (2.9504) grad_norm 4.7441 (inf) loss_scale 512.0000 (752.2967) mem 7381MB [2024-08-30 05:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][710/1251] eta 0:02:12 lr 0.000212 wd 0.0500 time 0.2398 (0.2443) data time 0.0010 (0.0019) model time 0.2389 (0.2420) loss 2.7707 (2.9492) grad_norm 3.7422 (inf) loss_scale 512.0000 (748.9170) mem 7381MB [2024-08-30 05:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][720/1251] eta 0:02:09 lr 0.000211 wd 0.0500 time 0.2279 (0.2441) data time 0.0009 (0.0018) model time 0.2270 (0.2417) loss 2.3871 (2.9449) grad_norm 3.0657 (inf) loss_scale 512.0000 (745.6311) mem 7381MB [2024-08-30 05:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][730/1251] eta 0:02:07 lr 0.000211 wd 0.0500 time 0.2263 (0.2442) data time 0.0010 (0.0018) model time 0.2253 (0.2419) loss 2.3312 (2.9407) grad_norm 2.8654 
(inf) loss_scale 512.0000 (742.4350) mem 7381MB [2024-08-30 05:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][740/1251] eta 0:02:04 lr 0.000211 wd 0.0500 time 0.2201 (0.2439) data time 0.0009 (0.0018) model time 0.2192 (0.2416) loss 3.0634 (2.9403) grad_norm 3.2023 (inf) loss_scale 512.0000 (739.3252) mem 7381MB [2024-08-30 05:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][750/1251] eta 0:02:02 lr 0.000211 wd 0.0500 time 0.3329 (0.2442) data time 0.0013 (0.0018) model time 0.3316 (0.2419) loss 3.3056 (2.9392) grad_norm 3.3212 (inf) loss_scale 512.0000 (736.2983) mem 7381MB [2024-08-30 05:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][760/1251] eta 0:01:59 lr 0.000211 wd 0.0500 time 0.2265 (0.2441) data time 0.0008 (0.0018) model time 0.2257 (0.2419) loss 2.8390 (2.9386) grad_norm 9.7914 (inf) loss_scale 512.0000 (733.3509) mem 7381MB [2024-08-30 05:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][770/1251] eta 0:01:57 lr 0.000211 wd 0.0500 time 0.2522 (0.2439) data time 0.0007 (0.0018) model time 0.2515 (0.2417) loss 3.2382 (2.9438) grad_norm 3.2933 (inf) loss_scale 512.0000 (730.4799) mem 7381MB [2024-08-30 05:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][780/1251] eta 0:01:54 lr 0.000211 wd 0.0500 time 0.2387 (0.2437) data time 0.0006 (0.0018) model time 0.2381 (0.2414) loss 2.5270 (2.9423) grad_norm 4.5504 (inf) loss_scale 512.0000 (727.6825) mem 7381MB [2024-08-30 05:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][790/1251] eta 0:01:52 lr 0.000211 wd 0.0500 time 0.3180 (0.2440) data time 0.0015 (0.0018) model time 0.3166 (0.2418) loss 2.0332 (2.9396) grad_norm 3.5131 (inf) loss_scale 512.0000 (724.9558) mem 7381MB [2024-08-30 05:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][800/1251] eta 0:01:50 lr 0.000211 wd 0.0500 time 0.2645 (0.2439) data time 0.0010 (0.0018) 
model time 0.2635 (0.2417) loss 3.3799 (2.9412) grad_norm 3.2409 (inf) loss_scale 512.0000 (722.2971) mem 7381MB [2024-08-30 05:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][810/1251] eta 0:01:47 lr 0.000211 wd 0.0500 time 0.2378 (0.2440) data time 0.0010 (0.0018) model time 0.2367 (0.2417) loss 2.6792 (2.9392) grad_norm 4.0834 (inf) loss_scale 512.0000 (719.7041) mem 7381MB [2024-08-30 05:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][820/1251] eta 0:01:45 lr 0.000211 wd 0.0500 time 0.2293 (0.2438) data time 0.0010 (0.0018) model time 0.2282 (0.2416) loss 2.8709 (2.9380) grad_norm 3.1802 (inf) loss_scale 512.0000 (717.1742) mem 7381MB [2024-08-30 05:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][830/1251] eta 0:01:42 lr 0.000211 wd 0.0500 time 0.2286 (0.2439) data time 0.0008 (0.0018) model time 0.2278 (0.2417) loss 3.3667 (2.9383) grad_norm 3.9953 (inf) loss_scale 512.0000 (714.7052) mem 7381MB [2024-08-30 05:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][840/1251] eta 0:01:40 lr 0.000211 wd 0.0500 time 0.2234 (0.2440) data time 0.0011 (0.0017) model time 0.2224 (0.2418) loss 2.9357 (2.9398) grad_norm 3.1471 (inf) loss_scale 512.0000 (712.2949) mem 7381MB [2024-08-30 05:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][850/1251] eta 0:01:37 lr 0.000211 wd 0.0500 time 0.2232 (0.2438) data time 0.0006 (0.0017) model time 0.2226 (0.2416) loss 3.3594 (2.9409) grad_norm 4.2678 (inf) loss_scale 512.0000 (709.9412) mem 7381MB [2024-08-30 05:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][860/1251] eta 0:01:35 lr 0.000211 wd 0.0500 time 0.3079 (0.2439) data time 0.0013 (0.0017) model time 0.3067 (0.2417) loss 3.1527 (2.9389) grad_norm 3.0320 (inf) loss_scale 512.0000 (707.6423) mem 7381MB [2024-08-30 05:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][870/1251] eta 0:01:32 lr 
0.000211 wd 0.0500 time 0.2675 (0.2440) data time 0.0014 (0.0017) model time 0.2661 (0.2418) loss 2.6346 (2.9409) grad_norm 6.5001 (inf) loss_scale 512.0000 (705.3961) mem 7381MB [2024-08-30 05:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][880/1251] eta 0:01:30 lr 0.000211 wd 0.0500 time 0.2210 (0.2441) data time 0.0008 (0.0017) model time 0.2203 (0.2420) loss 1.9924 (2.9396) grad_norm 3.7591 (inf) loss_scale 512.0000 (703.2009) mem 7381MB [2024-08-30 05:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][890/1251] eta 0:01:28 lr 0.000211 wd 0.0500 time 0.2285 (0.2439) data time 0.0008 (0.0017) model time 0.2277 (0.2418) loss 3.3924 (2.9409) grad_norm 2.1764 (inf) loss_scale 512.0000 (701.0550) mem 7381MB [2024-08-30 05:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][900/1251] eta 0:01:25 lr 0.000211 wd 0.0500 time 0.2263 (0.2437) data time 0.0009 (0.0017) model time 0.2254 (0.2416) loss 3.4965 (2.9426) grad_norm 14.4816 (inf) loss_scale 512.0000 (698.9567) mem 7381MB [2024-08-30 05:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][910/1251] eta 0:01:23 lr 0.000211 wd 0.0500 time 0.2974 (0.2437) data time 0.0011 (0.0017) model time 0.2963 (0.2416) loss 2.0455 (2.9410) grad_norm 2.8539 (inf) loss_scale 512.0000 (696.9045) mem 7381MB [2024-08-30 05:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][920/1251] eta 0:01:20 lr 0.000211 wd 0.0500 time 0.2212 (0.2438) data time 0.0009 (0.0017) model time 0.2203 (0.2417) loss 2.4854 (2.9395) grad_norm 4.7847 (inf) loss_scale 512.0000 (694.8969) mem 7381MB [2024-08-30 05:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][930/1251] eta 0:01:18 lr 0.000211 wd 0.0500 time 0.2292 (0.2436) data time 0.0006 (0.0017) model time 0.2285 (0.2416) loss 2.8993 (2.9388) grad_norm 3.6424 (inf) loss_scale 512.0000 (692.9323) mem 7381MB [2024-08-30 05:33:43 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [216/300][940/1251] eta 0:01:15 lr 0.000211 wd 0.0500 time 0.2217 (0.2436) data time 0.0012 (0.0017) model time 0.2205 (0.2416) loss 3.6285 (2.9376) grad_norm 3.4176 (inf) loss_scale 512.0000 (691.0096) mem 7381MB [2024-08-30 05:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][950/1251] eta 0:01:13 lr 0.000211 wd 0.0500 time 0.2742 (0.2435) data time 0.0009 (0.0017) model time 0.2733 (0.2415) loss 3.1087 (2.9410) grad_norm 2.7198 (inf) loss_scale 512.0000 (689.1272) mem 7381MB [2024-08-30 05:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][960/1251] eta 0:01:10 lr 0.000211 wd 0.0500 time 0.2279 (0.2437) data time 0.0009 (0.0017) model time 0.2270 (0.2417) loss 2.3332 (2.9397) grad_norm 2.9129 (inf) loss_scale 512.0000 (687.2841) mem 7381MB [2024-08-30 05:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][970/1251] eta 0:01:08 lr 0.000211 wd 0.0500 time 0.2244 (0.2435) data time 0.0007 (0.0017) model time 0.2237 (0.2415) loss 3.6743 (2.9391) grad_norm 3.9328 (inf) loss_scale 512.0000 (685.4789) mem 7381MB [2024-08-30 05:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][980/1251] eta 0:01:05 lr 0.000211 wd 0.0500 time 0.2285 (0.2434) data time 0.0009 (0.0017) model time 0.2276 (0.2413) loss 3.2587 (2.9395) grad_norm 2.4297 (inf) loss_scale 512.0000 (683.7105) mem 7381MB [2024-08-30 05:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][990/1251] eta 0:01:03 lr 0.000211 wd 0.0500 time 0.2243 (0.2434) data time 0.0009 (0.0016) model time 0.2234 (0.2413) loss 2.6789 (2.9356) grad_norm 5.9325 (inf) loss_scale 512.0000 (681.9778) mem 7381MB [2024-08-30 05:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1000/1251] eta 0:01:01 lr 0.000210 wd 0.0500 time 0.2246 (0.2435) data time 0.0009 (0.0016) model time 0.2237 (0.2414) loss 2.9386 (2.9341) grad_norm 3.0498 (inf) loss_scale 
512.0000 (680.2797) mem 7381MB [2024-08-30 05:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1010/1251] eta 0:00:58 lr 0.000210 wd 0.0500 time 0.2265 (0.2434) data time 0.0007 (0.0016) model time 0.2259 (0.2414) loss 3.5184 (2.9341) grad_norm 2.5308 (inf) loss_scale 512.0000 (678.6152) mem 7381MB [2024-08-30 05:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1020/1251] eta 0:00:56 lr 0.000210 wd 0.0500 time 0.2187 (0.2433) data time 0.0010 (0.0016) model time 0.2176 (0.2413) loss 2.7585 (2.9324) grad_norm 4.5887 (inf) loss_scale 512.0000 (676.9833) mem 7381MB [2024-08-30 05:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1030/1251] eta 0:00:53 lr 0.000210 wd 0.0500 time 0.2225 (0.2432) data time 0.0014 (0.0016) model time 0.2211 (0.2412) loss 3.4187 (2.9329) grad_norm 3.4042 (inf) loss_scale 512.0000 (675.3831) mem 7381MB [2024-08-30 05:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1040/1251] eta 0:00:51 lr 0.000210 wd 0.0500 time 0.2998 (0.2434) data time 0.0015 (0.0016) model time 0.2983 (0.2415) loss 2.6005 (2.9303) grad_norm 2.6175 (inf) loss_scale 512.0000 (673.8136) mem 7381MB [2024-08-30 05:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1050/1251] eta 0:00:48 lr 0.000210 wd 0.0500 time 0.2280 (0.2434) data time 0.0009 (0.0016) model time 0.2271 (0.2414) loss 2.1877 (2.9304) grad_norm 3.7155 (inf) loss_scale 512.0000 (672.2740) mem 7381MB [2024-08-30 05:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1060/1251] eta 0:00:46 lr 0.000210 wd 0.0500 time 0.2462 (0.2432) data time 0.0006 (0.0016) model time 0.2455 (0.2412) loss 3.3331 (2.9304) grad_norm 3.0161 (inf) loss_scale 512.0000 (670.7634) mem 7381MB [2024-08-30 05:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1070/1251] eta 0:00:44 lr 0.000210 wd 0.0500 time 0.2218 (0.2432) data time 0.0009 (0.0016) model 
time 0.2209 (0.2413) loss 3.0312 (2.9285) grad_norm 2.4918 (inf) loss_scale 512.0000 (669.2810) mem 7381MB [2024-08-30 05:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1080/1251] eta 0:00:41 lr 0.000210 wd 0.0500 time 0.2300 (0.2433) data time 0.0007 (0.0016) model time 0.2293 (0.2414) loss 3.3273 (2.9304) grad_norm 4.5365 (inf) loss_scale 512.0000 (667.8261) mem 7381MB [2024-08-30 05:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1090/1251] eta 0:00:39 lr 0.000210 wd 0.0500 time 0.2358 (0.2434) data time 0.0010 (0.0016) model time 0.2348 (0.2414) loss 2.4582 (2.9301) grad_norm 2.8848 (inf) loss_scale 512.0000 (666.3978) mem 7381MB [2024-08-30 05:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1100/1251] eta 0:00:36 lr 0.000210 wd 0.0500 time 0.2262 (0.2432) data time 0.0009 (0.0016) model time 0.2253 (0.2413) loss 3.5293 (2.9312) grad_norm 3.2696 (inf) loss_scale 512.0000 (664.9955) mem 7381MB [2024-08-30 05:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1110/1251] eta 0:00:34 lr 0.000210 wd 0.0500 time 0.2284 (0.2431) data time 0.0008 (0.0016) model time 0.2276 (0.2412) loss 3.6016 (2.9328) grad_norm 4.1704 (inf) loss_scale 512.0000 (663.6184) mem 7381MB [2024-08-30 05:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1120/1251] eta 0:00:31 lr 0.000210 wd 0.0500 time 0.2324 (0.2432) data time 0.0006 (0.0016) model time 0.2318 (0.2413) loss 2.8593 (2.9315) grad_norm 4.8600 (inf) loss_scale 512.0000 (662.2658) mem 7381MB [2024-08-30 05:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1130/1251] eta 0:00:29 lr 0.000210 wd 0.0500 time 0.2251 (0.2432) data time 0.0009 (0.0016) model time 0.2242 (0.2413) loss 2.8842 (2.9302) grad_norm 3.3856 (inf) loss_scale 512.0000 (660.9372) mem 7381MB [2024-08-30 05:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1140/1251] eta 0:00:27 lr 
0.000210 wd 0.0500 time 0.2234 (0.2433) data time 0.0009 (0.0016) model time 0.2225 (0.2413) loss 3.2289 (2.9309) grad_norm 4.6387 (inf) loss_scale 512.0000 (659.6319) mem 7381MB [2024-08-30 05:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1150/1251] eta 0:00:24 lr 0.000210 wd 0.0500 time 0.2271 (0.2431) data time 0.0007 (0.0016) model time 0.2264 (0.2412) loss 2.7559 (2.9301) grad_norm 2.4418 (inf) loss_scale 512.0000 (658.3493) mem 7381MB [2024-08-30 05:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1160/1251] eta 0:00:22 lr 0.000210 wd 0.0500 time 0.2241 (0.2432) data time 0.0009 (0.0016) model time 0.2232 (0.2412) loss 2.8520 (2.9306) grad_norm 4.0540 (inf) loss_scale 512.0000 (657.0887) mem 7381MB [2024-08-30 05:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1170/1251] eta 0:00:19 lr 0.000210 wd 0.0500 time 0.2264 (0.2432) data time 0.0008 (0.0016) model time 0.2257 (0.2413) loss 2.8881 (2.9292) grad_norm 3.6662 (inf) loss_scale 512.0000 (655.8497) mem 7381MB [2024-08-30 05:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1180/1251] eta 0:00:17 lr 0.000210 wd 0.0500 time 0.2334 (0.2433) data time 0.0008 (0.0016) model time 0.2326 (0.2414) loss 3.2858 (2.9314) grad_norm 2.9580 (inf) loss_scale 512.0000 (654.6317) mem 7381MB [2024-08-30 05:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1190/1251] eta 0:00:14 lr 0.000210 wd 0.0500 time 0.2273 (0.2431) data time 0.0010 (0.0016) model time 0.2263 (0.2413) loss 3.0541 (2.9306) grad_norm 3.3137 (inf) loss_scale 512.0000 (653.4341) mem 7381MB [2024-08-30 05:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1200/1251] eta 0:00:12 lr 0.000210 wd 0.0500 time 0.3173 (0.2433) data time 0.0016 (0.0016) model time 0.3157 (0.2414) loss 3.1688 (2.9318) grad_norm 3.2294 (inf) loss_scale 512.0000 (652.2565) mem 7381MB [2024-08-30 05:34:48 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [216/300][1210/1251] eta 0:00:09 lr 0.000210 wd 0.0500 time 0.3193 (0.2434) data time 0.0011 (0.0016) model time 0.3181 (0.2415) loss 2.4845 (2.9339) grad_norm 2.8585 (inf) loss_scale 512.0000 (651.0983) mem 7381MB [2024-08-30 05:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1220/1251] eta 0:00:07 lr 0.000210 wd 0.0500 time 0.2279 (0.2434) data time 0.0007 (0.0016) model time 0.2272 (0.2415) loss 3.2466 (2.9334) grad_norm 3.5037 (inf) loss_scale 512.0000 (649.9590) mem 7381MB [2024-08-30 05:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1230/1251] eta 0:00:05 lr 0.000210 wd 0.0500 time 0.2278 (0.2433) data time 0.0009 (0.0015) model time 0.2269 (0.2414) loss 2.9921 (2.9347) grad_norm 4.2485 (inf) loss_scale 512.0000 (648.8383) mem 7381MB [2024-08-30 05:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1240/1251] eta 0:00:02 lr 0.000210 wd 0.0500 time 0.2849 (0.2432) data time 0.0007 (0.0015) model time 0.2842 (0.2414) loss 3.5204 (2.9366) grad_norm 3.1386 (inf) loss_scale 512.0000 (647.7357) mem 7381MB [2024-08-30 05:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [216/300][1250/1251] eta 0:00:00 lr 0.000210 wd 0.0500 time 0.2115 (0.2432) data time 0.0006 (0.0015) model time 0.2109 (0.2414) loss 2.3759 (2.9351) grad_norm 5.2508 (inf) loss_scale 512.0000 (646.6507) mem 7381MB [2024-08-30 05:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 216 training takes 0:05:04 [2024-08-30 05:34:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:34:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 05:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.574 (0.574) Loss 0.4019 (0.4019) Acc@1 93.457 (93.457) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-30 05:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.123) Loss 0.6636 (0.6365) Acc@1 87.793 (86.639) Acc@5 97.168 (97.417) Mem 7381MB [2024-08-30 05:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.102) Loss 0.9414 (0.6643) Acc@1 77.246 (85.617) Acc@5 95.117 (97.410) Mem 7381MB [2024-08-30 05:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.094 (0.099) Loss 1.1484 (0.7606) Acc@1 72.754 (83.257) Acc@5 92.676 (96.399) Mem 7381MB [2024-08-30 05:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.091) Loss 1.0859 (0.8092) Acc@1 74.121 (81.938) Acc@5 93.750 (95.851) Mem 7381MB [2024-08-30 05:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.438 Acc@5 95.774 [2024-08-30 05:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.4% [2024-08-30 05:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.978 (0.978) Loss 0.3784 (0.3784) Acc@1 93.555 (93.555) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-30 05:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.088 (0.169) Loss 0.5874 (0.6044) Acc@1 88.574 (87.358) Acc@5 97.754 (97.763) Mem 7381MB [2024-08-30 05:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.129) Loss 0.8613 (0.6308) Acc@1 78.516 (86.347) Acc@5 96.094 (97.717) Mem 7381MB [2024-08-30 05:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.114) Loss 1.0889 (0.7147) Acc@1 73.828 (84.258) Acc@5 93.359 (96.828) Mem 7381MB [2024-08-30 05:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.077 (0.105) Loss 0.9785 
(0.7578) Acc@1 77.441 (83.198) Acc@5 94.629 (96.351) Mem 7381MB [2024-08-30 05:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.784 Acc@5 96.322 [2024-08-30 05:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 05:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.78% [2024-08-30 05:35:08 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 05:35:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-30 05:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][0/1251] eta 0:15:25 lr 0.000210 wd 0.0500 time 0.7396 (0.7396) data time 0.5058 (0.5058) model time 0.0000 (0.0000) loss 3.2518 (3.2518) grad_norm 3.3414 (3.3414) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][10/1251] eta 0:05:37 lr 0.000210 wd 0.0500 time 0.2235 (0.2721) data time 0.0009 (0.0470) model time 0.0000 (0.0000) loss 2.4704 (3.1181) grad_norm 2.5490 (3.1208) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][20/1251] eta 0:05:08 lr 0.000210 wd 0.0500 time 0.2275 (0.2503) data time 0.0008 (0.0251) model time 0.0000 (0.0000) loss 3.0887 (2.9421) grad_norm 2.3912 (3.0693) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][30/1251] eta 0:05:17 lr 0.000209 wd 0.0500 time 0.3209 (0.2601) data time 0.0012 (0.0174) model time 0.0000 (0.0000) loss 3.3540 (2.8939) grad_norm 3.4862 (2.9901) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][40/1251] eta 0:05:06 lr 
0.000209 wd 0.0500 time 0.2292 (0.2527) data time 0.0010 (0.0135) model time 0.0000 (0.0000) loss 2.7760 (2.8700) grad_norm 3.4383 (3.0203) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][50/1251] eta 0:05:01 lr 0.000209 wd 0.0500 time 0.2287 (0.2514) data time 0.0009 (0.0111) model time 0.0000 (0.0000) loss 2.0701 (2.8317) grad_norm 4.1106 (3.0536) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][60/1251] eta 0:04:54 lr 0.000209 wd 0.0500 time 0.2294 (0.2471) data time 0.0010 (0.0094) model time 0.2284 (0.2242) loss 3.1075 (2.8509) grad_norm 2.1389 (3.1053) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][70/1251] eta 0:04:52 lr 0.000209 wd 0.0500 time 0.2292 (0.2475) data time 0.0009 (0.0082) model time 0.2282 (0.2364) loss 3.3889 (2.8515) grad_norm 2.7075 (3.1614) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][80/1251] eta 0:04:52 lr 0.000209 wd 0.0500 time 0.2276 (0.2502) data time 0.0011 (0.0074) model time 0.2265 (0.2471) loss 2.8543 (2.8516) grad_norm 2.4420 (3.2378) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][90/1251] eta 0:04:47 lr 0.000209 wd 0.0500 time 0.2263 (0.2476) data time 0.0008 (0.0067) model time 0.2256 (0.2416) loss 3.4935 (2.8776) grad_norm 3.3885 (3.3214) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][100/1251] eta 0:04:45 lr 0.000209 wd 0.0500 time 0.2818 (0.2480) data time 0.0017 (0.0061) model time 0.2801 (0.2434) loss 3.2713 (2.8942) grad_norm 3.3927 (3.8099) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][110/1251] eta 0:04:43 lr 0.000209 wd 0.0500 time 0.3165 (0.2482) data time 0.0010 (0.0057) model time 0.3155 (0.2445) loss 3.1233 (2.8903) grad_norm 2.7941 (3.7925) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][120/1251] eta 0:04:41 lr 0.000209 wd 0.0500 time 0.2185 (0.2485) data time 0.0010 (0.0053) model time 0.2175 (0.2452) loss 2.1215 (2.8726) grad_norm 3.6347 (3.7806) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][130/1251] eta 0:04:36 lr 0.000209 wd 0.0500 time 0.2181 (0.2469) data time 0.0009 (0.0050) model time 0.2171 (0.2430) loss 3.1645 (2.8687) grad_norm 3.7329 (3.7560) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][140/1251] eta 0:04:32 lr 0.000209 wd 0.0500 time 0.2219 (0.2455) data time 0.0009 (0.0047) model time 0.2210 (0.2411) loss 3.6355 (2.8861) grad_norm 3.0032 (3.7065) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][150/1251] eta 0:04:30 lr 0.000209 wd 0.0500 time 0.3062 (0.2453) data time 0.0011 (0.0045) model time 0.3051 (0.2412) loss 3.7693 (2.9057) grad_norm 6.2500 (3.7165) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][160/1251] eta 0:04:28 lr 0.000209 wd 0.0500 time 0.2344 (0.2458) data time 0.0009 (0.0043) model time 0.2336 (0.2420) loss 3.1900 (2.9048) grad_norm 4.2043 (3.7297) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][170/1251] eta 0:04:24 lr 0.000209 wd 0.0500 time 0.2256 (0.2447) data time 0.0008 (0.0041) model time 0.2248 (0.2407) loss 2.9609 (2.9038) 
grad_norm 2.8336 (3.7015) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][180/1251] eta 0:04:22 lr 0.000209 wd 0.0500 time 0.2317 (0.2448) data time 0.0007 (0.0039) model time 0.2309 (0.2411) loss 3.0004 (2.8961) grad_norm 2.7799 (3.7087) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][190/1251] eta 0:04:18 lr 0.000209 wd 0.0500 time 0.2315 (0.2439) data time 0.0006 (0.0037) model time 0.2308 (0.2401) loss 3.0026 (2.8962) grad_norm 2.8836 (3.7164) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][200/1251] eta 0:04:17 lr 0.000209 wd 0.0500 time 0.2336 (0.2450) data time 0.0009 (0.0036) model time 0.2327 (0.2417) loss 2.9196 (2.9022) grad_norm 4.5249 (3.7098) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][210/1251] eta 0:04:14 lr 0.000209 wd 0.0500 time 0.2296 (0.2442) data time 0.0008 (0.0035) model time 0.2288 (0.2408) loss 2.5356 (2.8930) grad_norm 4.0850 (3.7299) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][220/1251] eta 0:04:10 lr 0.000209 wd 0.0500 time 0.2288 (0.2434) data time 0.0008 (0.0034) model time 0.2280 (0.2399) loss 3.2883 (2.8975) grad_norm 3.0902 (3.7084) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][230/1251] eta 0:04:08 lr 0.000209 wd 0.0500 time 0.2665 (0.2438) data time 0.0011 (0.0033) model time 0.2654 (0.2405) loss 2.6074 (2.9021) grad_norm 3.1752 (3.6824) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][240/1251] eta 0:04:06 lr 0.000209 wd 0.0500 time 
0.2240 (0.2437) data time 0.0008 (0.0032) model time 0.2232 (0.2406) loss 2.4854 (2.8989) grad_norm 3.8038 (3.7972) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][250/1251] eta 0:04:04 lr 0.000209 wd 0.0500 time 0.2248 (0.2443) data time 0.0009 (0.0031) model time 0.2239 (0.2414) loss 3.0386 (2.9000) grad_norm 2.7820 (3.7971) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][260/1251] eta 0:04:01 lr 0.000209 wd 0.0500 time 0.2394 (0.2436) data time 0.0009 (0.0030) model time 0.2385 (0.2407) loss 2.8347 (2.9045) grad_norm 2.8265 (3.7960) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][270/1251] eta 0:03:58 lr 0.000209 wd 0.0500 time 0.2304 (0.2431) data time 0.0009 (0.0030) model time 0.2295 (0.2401) loss 3.5409 (2.9110) grad_norm 3.1212 (3.7902) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][280/1251] eta 0:03:56 lr 0.000209 wd 0.0500 time 0.3102 (0.2439) data time 0.0009 (0.0029) model time 0.3093 (0.2411) loss 1.6661 (2.9093) grad_norm 2.5307 (3.7754) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][290/1251] eta 0:03:53 lr 0.000209 wd 0.0500 time 0.2386 (0.2434) data time 0.0010 (0.0028) model time 0.2376 (0.2406) loss 2.3799 (2.9075) grad_norm 3.3351 (3.7581) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][300/1251] eta 0:03:50 lr 0.000209 wd 0.0500 time 0.2180 (0.2428) data time 0.0008 (0.0028) model time 0.2171 (0.2399) loss 2.4976 (2.9052) grad_norm 2.8457 (3.7510) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [217/300][310/1251] eta 0:03:48 lr 0.000208 wd 0.0500 time 0.2355 (0.2428) data time 0.0010 (0.0027) model time 0.2345 (0.2400) loss 2.9363 (2.9146) grad_norm 3.3842 (3.7393) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][320/1251] eta 0:03:46 lr 0.000208 wd 0.0500 time 0.2219 (0.2428) data time 0.0008 (0.0027) model time 0.2211 (0.2401) loss 2.2268 (2.9091) grad_norm 3.5348 (3.7307) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][330/1251] eta 0:03:43 lr 0.000208 wd 0.0500 time 0.2288 (0.2431) data time 0.0009 (0.0026) model time 0.2279 (0.2405) loss 2.9009 (2.9080) grad_norm 3.5449 (3.7290) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][340/1251] eta 0:03:41 lr 0.000208 wd 0.0500 time 0.2297 (0.2427) data time 0.0008 (0.0026) model time 0.2289 (0.2401) loss 3.3757 (2.9143) grad_norm 2.6958 (3.7054) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][350/1251] eta 0:03:38 lr 0.000208 wd 0.0500 time 0.2261 (0.2423) data time 0.0013 (0.0026) model time 0.2248 (0.2396) loss 2.9759 (2.9179) grad_norm 2.9037 (3.7011) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][360/1251] eta 0:03:35 lr 0.000208 wd 0.0500 time 0.2312 (0.2423) data time 0.0007 (0.0025) model time 0.2305 (0.2396) loss 2.5520 (2.9177) grad_norm 2.9189 (3.6848) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][370/1251] eta 0:03:33 lr 0.000208 wd 0.0500 time 0.2233 (0.2424) data time 0.0010 (0.0025) model time 0.2224 (0.2398) loss 3.2606 (2.9205) grad_norm 3.4753 
(3.6865) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][380/1251] eta 0:03:31 lr 0.000208 wd 0.0500 time 0.2203 (0.2424) data time 0.0011 (0.0024) model time 0.2192 (0.2399) loss 2.7740 (2.9285) grad_norm 4.8130 (3.6927) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][390/1251] eta 0:03:28 lr 0.000208 wd 0.0500 time 0.2368 (0.2421) data time 0.0011 (0.0024) model time 0.2357 (0.2396) loss 2.4460 (2.9304) grad_norm 3.7988 (3.7120) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][400/1251] eta 0:03:26 lr 0.000208 wd 0.0500 time 0.2966 (0.2428) data time 0.0015 (0.0024) model time 0.2951 (0.2405) loss 3.1751 (2.9335) grad_norm 3.0866 (3.7022) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][410/1251] eta 0:03:24 lr 0.000208 wd 0.0500 time 0.2266 (0.2429) data time 0.0011 (0.0023) model time 0.2255 (0.2406) loss 2.6286 (2.9299) grad_norm 2.2438 (3.7000) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][420/1251] eta 0:03:21 lr 0.000208 wd 0.0500 time 0.2288 (0.2425) data time 0.0009 (0.0023) model time 0.2279 (0.2402) loss 3.2610 (2.9330) grad_norm 2.3618 (3.6994) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][430/1251] eta 0:03:19 lr 0.000208 wd 0.0500 time 0.3050 (0.2424) data time 0.0013 (0.0023) model time 0.3037 (0.2401) loss 3.2560 (2.9290) grad_norm 2.7449 (3.7016) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][440/1251] eta 0:03:16 lr 0.000208 wd 0.0500 time 0.2835 (0.2424) data 
time 0.0011 (0.0023) model time 0.2824 (0.2402) loss 3.1064 (2.9302) grad_norm 2.7038 (3.6949) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][450/1251] eta 0:03:14 lr 0.000208 wd 0.0500 time 0.2610 (0.2432) data time 0.0020 (0.0022) model time 0.2590 (0.2410) loss 3.0150 (2.9297) grad_norm 3.2836 (3.6929) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][460/1251] eta 0:03:12 lr 0.000208 wd 0.0500 time 0.2267 (0.2428) data time 0.0009 (0.0022) model time 0.2258 (0.2407) loss 3.6874 (2.9359) grad_norm 3.1487 (3.6821) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][470/1251] eta 0:03:09 lr 0.000208 wd 0.0500 time 0.2329 (0.2426) data time 0.0011 (0.0022) model time 0.2318 (0.2404) loss 2.0146 (2.9332) grad_norm 3.3259 (3.6835) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][480/1251] eta 0:03:06 lr 0.000208 wd 0.0500 time 0.2262 (0.2422) data time 0.0007 (0.0022) model time 0.2255 (0.2400) loss 2.7197 (2.9360) grad_norm 2.6795 (3.6706) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][490/1251] eta 0:03:04 lr 0.000208 wd 0.0500 time 0.2291 (0.2427) data time 0.0012 (0.0021) model time 0.2279 (0.2406) loss 2.1890 (2.9359) grad_norm 4.0291 (3.6732) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][500/1251] eta 0:03:02 lr 0.000208 wd 0.0500 time 0.2435 (0.2424) data time 0.0008 (0.0021) model time 0.2428 (0.2403) loss 2.2214 (2.9381) grad_norm 2.8289 (3.6672) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [217/300][510/1251] eta 0:02:59 lr 0.000208 wd 0.0500 time 0.2304 (0.2425) data time 0.0007 (0.0021) model time 0.2297 (0.2404) loss 2.9244 (2.9406) grad_norm 3.8755 (3.6738) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][520/1251] eta 0:02:57 lr 0.000208 wd 0.0500 time 0.2213 (0.2422) data time 0.0009 (0.0021) model time 0.2204 (0.2401) loss 3.2456 (2.9464) grad_norm 3.1201 (3.6773) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][530/1251] eta 0:02:54 lr 0.000208 wd 0.0500 time 0.3083 (0.2427) data time 0.0013 (0.0021) model time 0.3070 (0.2407) loss 3.6539 (2.9505) grad_norm 5.2370 (3.6795) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][540/1251] eta 0:02:52 lr 0.000208 wd 0.0500 time 0.2268 (0.2427) data time 0.0009 (0.0020) model time 0.2259 (0.2407) loss 2.5249 (2.9492) grad_norm 5.1310 (3.6728) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][550/1251] eta 0:02:49 lr 0.000208 wd 0.0500 time 0.2290 (0.2425) data time 0.0009 (0.0020) model time 0.2281 (0.2404) loss 2.5804 (2.9529) grad_norm 2.7520 (3.6625) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][560/1251] eta 0:02:47 lr 0.000208 wd 0.0500 time 0.2294 (0.2422) data time 0.0008 (0.0020) model time 0.2286 (0.2402) loss 3.2342 (2.9464) grad_norm 3.2430 (3.6650) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][570/1251] eta 0:02:45 lr 0.000208 wd 0.0500 time 0.2267 (0.2425) data time 0.0007 (0.0020) model time 0.2260 (0.2405) loss 2.9178 (2.9455) grad_norm 2.4917 (3.6643) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-30 05:37:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][580/1251] eta 0:02:42 lr 0.000208 wd 0.0500 time 0.3136 (0.2426) data time 0.0012 (0.0020) model time 0.3124 (0.2406) loss 3.3477 (2.9425) grad_norm 5.4037 (3.6628) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][590/1251] eta 0:02:40 lr 0.000207 wd 0.0500 time 0.2287 (0.2425) data time 0.0007 (0.0020) model time 0.2280 (0.2406) loss 2.6069 (2.9458) grad_norm 2.5912 (3.6523) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][600/1251] eta 0:02:37 lr 0.000207 wd 0.0500 time 0.2291 (0.2423) data time 0.0010 (0.0020) model time 0.2282 (0.2404) loss 2.7737 (2.9444) grad_norm 2.7903 (3.6429) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][610/1251] eta 0:02:35 lr 0.000207 wd 0.0500 time 0.2720 (0.2426) data time 0.0007 (0.0019) model time 0.2713 (0.2407) loss 1.8399 (2.9414) grad_norm 4.3785 (3.6483) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][620/1251] eta 0:02:33 lr 0.000207 wd 0.0500 time 0.2224 (0.2426) data time 0.0008 (0.0019) model time 0.2216 (0.2407) loss 1.8220 (2.9384) grad_norm 3.9392 (3.6624) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][630/1251] eta 0:02:30 lr 0.000207 wd 0.0500 time 0.2238 (0.2424) data time 0.0007 (0.0019) model time 0.2231 (0.2405) loss 1.8089 (2.9393) grad_norm 3.7418 (3.6591) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][640/1251] eta 0:02:28 lr 0.000207 wd 0.0500 time 0.2266 (0.2426) data time 0.0009 (0.0019) model 
time 0.2258 (0.2407) loss 2.4939 (2.9418) grad_norm 3.8563 (3.6844) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][650/1251] eta 0:02:25 lr 0.000207 wd 0.0500 time 0.2291 (0.2427) data time 0.0007 (0.0019) model time 0.2284 (0.2408) loss 2.2422 (2.9397) grad_norm 2.8678 (3.6747) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][660/1251] eta 0:02:23 lr 0.000207 wd 0.0500 time 0.2278 (0.2429) data time 0.0011 (0.0019) model time 0.2267 (0.2410) loss 2.9672 (2.9409) grad_norm 2.6628 (3.6623) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][670/1251] eta 0:02:20 lr 0.000207 wd 0.0500 time 0.2366 (0.2427) data time 0.0009 (0.0019) model time 0.2357 (0.2408) loss 2.8258 (2.9389) grad_norm 4.2910 (3.6568) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][680/1251] eta 0:02:18 lr 0.000207 wd 0.0500 time 0.2268 (0.2425) data time 0.0012 (0.0019) model time 0.2256 (0.2406) loss 2.7231 (2.9327) grad_norm 4.9538 (3.6678) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][690/1251] eta 0:02:16 lr 0.000207 wd 0.0500 time 0.2195 (0.2425) data time 0.0012 (0.0018) model time 0.2182 (0.2407) loss 1.9196 (2.9281) grad_norm 3.6546 (3.6678) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][700/1251] eta 0:02:13 lr 0.000207 wd 0.0500 time 0.2262 (0.2426) data time 0.0007 (0.0018) model time 0.2255 (0.2408) loss 3.2337 (2.9288) grad_norm 3.3866 (3.6631) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][710/1251] 
eta 0:02:11 lr 0.000207 wd 0.0500 time 0.3271 (0.2426) data time 0.0012 (0.0018) model time 0.3258 (0.2408) loss 3.2706 (2.9247) grad_norm 3.9352 (3.6553) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][720/1251] eta 0:02:08 lr 0.000207 wd 0.0500 time 0.2268 (0.2425) data time 0.0009 (0.0018) model time 0.2260 (0.2407) loss 3.2773 (2.9272) grad_norm 5.0232 (3.6577) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][730/1251] eta 0:02:06 lr 0.000207 wd 0.0500 time 0.2750 (0.2425) data time 0.0014 (0.0018) model time 0.2736 (0.2407) loss 3.0172 (2.9233) grad_norm 4.6227 (3.6600) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][740/1251] eta 0:02:03 lr 0.000207 wd 0.0500 time 0.2216 (0.2426) data time 0.0008 (0.0018) model time 0.2207 (0.2409) loss 3.0649 (2.9242) grad_norm 2.7831 (3.6625) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][750/1251] eta 0:02:01 lr 0.000207 wd 0.0500 time 0.2295 (0.2425) data time 0.0009 (0.0018) model time 0.2286 (0.2407) loss 3.3276 (2.9271) grad_norm 3.1419 (3.6700) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][760/1251] eta 0:01:58 lr 0.000207 wd 0.0500 time 0.2332 (0.2423) data time 0.0007 (0.0018) model time 0.2325 (0.2406) loss 3.8996 (2.9320) grad_norm 5.7456 (3.6987) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][770/1251] eta 0:01:56 lr 0.000207 wd 0.0500 time 0.3257 (0.2424) data time 0.0011 (0.0018) model time 0.3246 (0.2406) loss 1.7438 (2.9272) grad_norm 3.2788 (3.6956) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 
05:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][780/1251] eta 0:01:54 lr 0.000207 wd 0.0500 time 0.2283 (0.2425) data time 0.0006 (0.0018) model time 0.2277 (0.2408) loss 2.7157 (2.9280) grad_norm 4.3674 (3.6902) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][790/1251] eta 0:01:51 lr 0.000207 wd 0.0500 time 0.2285 (0.2426) data time 0.0010 (0.0018) model time 0.2275 (0.2409) loss 3.7474 (2.9304) grad_norm 2.7479 (3.6821) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][800/1251] eta 0:01:49 lr 0.000207 wd 0.0500 time 0.2313 (0.2424) data time 0.0009 (0.0017) model time 0.2304 (0.2407) loss 2.1333 (2.9291) grad_norm 3.1271 (3.6734) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][810/1251] eta 0:01:46 lr 0.000207 wd 0.0500 time 0.2224 (0.2423) data time 0.0011 (0.0017) model time 0.2213 (0.2405) loss 3.2667 (2.9315) grad_norm 3.4640 (3.6708) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][820/1251] eta 0:01:44 lr 0.000207 wd 0.0500 time 0.2271 (0.2426) data time 0.0006 (0.0017) model time 0.2264 (0.2409) loss 3.3540 (2.9326) grad_norm 3.5097 (3.6694) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][830/1251] eta 0:01:42 lr 0.000207 wd 0.0500 time 0.2260 (0.2424) data time 0.0007 (0.0017) model time 0.2253 (0.2407) loss 2.7985 (2.9309) grad_norm 3.7256 (3.6646) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][840/1251] eta 0:01:39 lr 0.000207 wd 0.0500 time 0.2336 (0.2423) data time 0.0008 (0.0017) model time 0.2329 (0.2406) loss 2.5292 
(2.9313) grad_norm 7.1040 (3.6623) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][850/1251] eta 0:01:37 lr 0.000207 wd 0.0500 time 0.2228 (0.2424) data time 0.0008 (0.0017) model time 0.2220 (0.2407) loss 3.2790 (2.9299) grad_norm 3.3158 (3.6621) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][860/1251] eta 0:01:34 lr 0.000207 wd 0.0500 time 0.2270 (0.2426) data time 0.0007 (0.0017) model time 0.2263 (0.2409) loss 2.3812 (2.9300) grad_norm 5.1785 (3.6774) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][870/1251] eta 0:01:32 lr 0.000206 wd 0.0500 time 0.2359 (0.2428) data time 0.0008 (0.0017) model time 0.2351 (0.2411) loss 3.3799 (2.9321) grad_norm 3.4123 (3.6829) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][880/1251] eta 0:01:30 lr 0.000206 wd 0.0500 time 0.2313 (0.2426) data time 0.0008 (0.0017) model time 0.2305 (0.2410) loss 2.9059 (2.9316) grad_norm 3.7721 (3.6885) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][890/1251] eta 0:01:27 lr 0.000206 wd 0.0500 time 0.2316 (0.2425) data time 0.0009 (0.0017) model time 0.2307 (0.2408) loss 3.4911 (2.9311) grad_norm 3.7178 (3.6814) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][900/1251] eta 0:01:25 lr 0.000206 wd 0.0500 time 0.3230 (0.2428) data time 0.0012 (0.0017) model time 0.3217 (0.2412) loss 3.0492 (2.9280) grad_norm 2.2860 (3.6730) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][910/1251] eta 0:01:22 lr 0.000206 wd 0.0500 
time 0.2333 (0.2426) data time 0.0009 (0.0017) model time 0.2324 (0.2410) loss 3.2445 (2.9264) grad_norm 3.1734 (3.6732) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][920/1251] eta 0:01:20 lr 0.000206 wd 0.0500 time 0.3215 (0.2429) data time 0.0017 (0.0017) model time 0.3198 (0.2413) loss 2.3472 (2.9282) grad_norm 2.7235 (3.6725) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][930/1251] eta 0:01:17 lr 0.000206 wd 0.0500 time 0.2274 (0.2427) data time 0.0014 (0.0017) model time 0.2261 (0.2411) loss 3.3299 (2.9284) grad_norm 3.6993 (3.6677) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][940/1251] eta 0:01:15 lr 0.000206 wd 0.0500 time 0.3249 (0.2430) data time 0.0010 (0.0017) model time 0.3239 (0.2414) loss 2.8467 (2.9287) grad_norm 4.8103 (3.6660) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][950/1251] eta 0:01:13 lr 0.000206 wd 0.0500 time 0.2296 (0.2429) data time 0.0007 (0.0016) model time 0.2288 (0.2413) loss 3.5628 (2.9300) grad_norm 2.8126 (3.6651) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][960/1251] eta 0:01:10 lr 0.000206 wd 0.0500 time 0.2279 (0.2428) data time 0.0008 (0.0016) model time 0.2271 (0.2412) loss 2.8990 (2.9266) grad_norm 4.2574 (3.6680) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][970/1251] eta 0:01:08 lr 0.000206 wd 0.0500 time 0.2311 (0.2426) data time 0.0006 (0.0016) model time 0.2305 (0.2410) loss 2.7395 (2.9255) grad_norm 2.8015 (3.6676) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:07 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [217/300][980/1251] eta 0:01:05 lr 0.000206 wd 0.0500 time 0.2799 (0.2429) data time 0.0014 (0.0016) model time 0.2785 (0.2413) loss 2.5607 (2.9247) grad_norm 3.2008 (3.6644) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][990/1251] eta 0:01:03 lr 0.000206 wd 0.0500 time 0.2319 (0.2428) data time 0.0009 (0.0016) model time 0.2310 (0.2412) loss 2.6669 (2.9252) grad_norm 4.3157 (3.6691) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1000/1251] eta 0:01:00 lr 0.000206 wd 0.0500 time 0.2310 (0.2429) data time 0.0008 (0.0016) model time 0.2302 (0.2413) loss 3.1097 (2.9266) grad_norm 3.2997 (3.6668) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1010/1251] eta 0:00:58 lr 0.000206 wd 0.0500 time 0.2251 (0.2431) data time 0.0007 (0.0016) model time 0.2244 (0.2416) loss 2.2841 (2.9239) grad_norm 2.7787 (3.6733) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1020/1251] eta 0:00:56 lr 0.000206 wd 0.0500 time 0.3089 (0.2433) data time 0.0020 (0.0016) model time 0.3069 (0.2418) loss 2.1401 (2.9254) grad_norm 2.7303 (3.6725) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1030/1251] eta 0:00:53 lr 0.000206 wd 0.0500 time 0.2244 (0.2433) data time 0.0007 (0.0016) model time 0.2237 (0.2418) loss 3.6104 (2.9269) grad_norm 2.9081 (3.6663) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1040/1251] eta 0:00:51 lr 0.000206 wd 0.0500 time 0.2226 (0.2431) data time 0.0009 (0.0016) model time 0.2217 (0.2416) loss 2.3289 (2.9268) grad_norm 2.9098 
(3.6590) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1050/1251] eta 0:00:48 lr 0.000206 wd 0.0500 time 0.3007 (0.2432) data time 0.0016 (0.0016) model time 0.2991 (0.2417) loss 3.3055 (2.9284) grad_norm 3.0341 (3.6624) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1060/1251] eta 0:00:46 lr 0.000206 wd 0.0500 time 0.2377 (0.2433) data time 0.0011 (0.0016) model time 0.2366 (0.2418) loss 3.1791 (2.9325) grad_norm 2.5322 (3.6588) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1070/1251] eta 0:00:44 lr 0.000206 wd 0.0500 time 0.2243 (0.2434) data time 0.0009 (0.0016) model time 0.2234 (0.2419) loss 3.1616 (2.9326) grad_norm 3.5065 (3.6552) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1080/1251] eta 0:00:41 lr 0.000206 wd 0.0500 time 0.2306 (0.2432) data time 0.0007 (0.0016) model time 0.2299 (0.2417) loss 2.3048 (2.9324) grad_norm 3.1454 (3.6517) loss_scale 1024.0000 (513.4209) mem 7381MB [2024-08-30 05:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1090/1251] eta 0:00:39 lr 0.000206 wd 0.0500 time 0.2278 (0.2431) data time 0.0008 (0.0016) model time 0.2269 (0.2416) loss 2.4937 (2.9318) grad_norm 3.2956 (3.6508) loss_scale 1024.0000 (518.1008) mem 7381MB [2024-08-30 05:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1100/1251] eta 0:00:36 lr 0.000206 wd 0.0500 time 0.2255 (0.2431) data time 0.0009 (0.0016) model time 0.2245 (0.2416) loss 2.3958 (2.9307) grad_norm 3.0012 (3.6465) loss_scale 1024.0000 (522.6957) mem 7381MB [2024-08-30 05:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1110/1251] eta 0:00:34 lr 0.000206 wd 0.0500 time 0.2337 
(0.2432) data time 0.0006 (0.0016) model time 0.2331 (0.2417) loss 3.0167 (2.9314) grad_norm 2.6781 (3.6535) loss_scale 1024.0000 (527.2079) mem 7381MB [2024-08-30 05:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1120/1251] eta 0:00:31 lr 0.000206 wd 0.0500 time 0.2216 (0.2430) data time 0.0009 (0.0016) model time 0.2207 (0.2415) loss 3.8062 (2.9331) grad_norm 11.3093 (3.6609) loss_scale 1024.0000 (531.6396) mem 7381MB [2024-08-30 05:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1130/1251] eta 0:00:29 lr 0.000206 wd 0.0500 time 0.2276 (0.2431) data time 0.0011 (0.0016) model time 0.2265 (0.2415) loss 2.2789 (2.9342) grad_norm 3.1386 (3.6648) loss_scale 1024.0000 (535.9929) mem 7381MB [2024-08-30 05:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1140/1251] eta 0:00:26 lr 0.000206 wd 0.0500 time 0.3288 (0.2431) data time 0.0040 (0.0016) model time 0.3248 (0.2416) loss 2.5970 (2.9356) grad_norm 3.0213 (3.6678) loss_scale 1024.0000 (540.2699) mem 7381MB [2024-08-30 05:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1150/1251] eta 0:00:24 lr 0.000205 wd 0.0500 time 0.2268 (0.2433) data time 0.0009 (0.0016) model time 0.2258 (0.2418) loss 3.0223 (2.9340) grad_norm 2.6778 (3.6661) loss_scale 1024.0000 (544.4726) mem 7381MB [2024-08-30 05:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1160/1251] eta 0:00:22 lr 0.000205 wd 0.0500 time 0.2252 (0.2431) data time 0.0008 (0.0016) model time 0.2243 (0.2416) loss 2.9941 (2.9328) grad_norm 2.9986 (3.6612) loss_scale 1024.0000 (548.6029) mem 7381MB [2024-08-30 05:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1170/1251] eta 0:00:19 lr 0.000205 wd 0.0500 time 0.2138 (0.2430) data time 0.0010 (0.0015) model time 0.2128 (0.2415) loss 3.2596 (2.9342) grad_norm inf (inf) loss_scale 512.0000 (552.2254) mem 7381MB [2024-08-30 05:39:55 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [217/300][1180/1251] eta 0:00:17 lr 0.000205 wd 0.0500 time 0.2196 (0.2430) data time 0.0009 (0.0015) model time 0.2187 (0.2415) loss 2.8871 (2.9338) grad_norm 3.3448 (inf) loss_scale 512.0000 (551.8848) mem 7381MB [2024-08-30 05:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1190/1251] eta 0:00:14 lr 0.000205 wd 0.0500 time 0.2285 (0.2430) data time 0.0006 (0.0015) model time 0.2279 (0.2416) loss 2.6049 (2.9344) grad_norm 3.2637 (inf) loss_scale 512.0000 (551.5500) mem 7381MB [2024-08-30 05:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1200/1251] eta 0:00:12 lr 0.000205 wd 0.0500 time 0.2251 (0.2431) data time 0.0009 (0.0015) model time 0.2243 (0.2416) loss 2.2150 (2.9349) grad_norm 2.7305 (inf) loss_scale 512.0000 (551.2206) mem 7381MB [2024-08-30 05:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1210/1251] eta 0:00:09 lr 0.000205 wd 0.0500 time 0.2277 (0.2430) data time 0.0010 (0.0015) model time 0.2267 (0.2415) loss 2.7414 (2.9358) grad_norm 3.3744 (inf) loss_scale 512.0000 (550.8968) mem 7381MB [2024-08-30 05:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1220/1251] eta 0:00:07 lr 0.000205 wd 0.0500 time 0.2266 (0.2429) data time 0.0011 (0.0015) model time 0.2255 (0.2414) loss 3.1824 (2.9338) grad_norm 3.1234 (inf) loss_scale 512.0000 (550.5782) mem 7381MB [2024-08-30 05:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1230/1251] eta 0:00:05 lr 0.000205 wd 0.0500 time 0.2260 (0.2431) data time 0.0010 (0.0015) model time 0.2250 (0.2416) loss 2.1007 (2.9342) grad_norm 2.3012 (inf) loss_scale 512.0000 (550.2648) mem 7381MB [2024-08-30 05:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1240/1251] eta 0:00:02 lr 0.000205 wd 0.0500 time 0.2110 (0.2429) data time 0.0006 (0.0015) model time 0.2104 (0.2414) loss 3.1206 (2.9347) grad_norm 3.7393 (inf) loss_scale 
512.0000 (549.9565) mem 7381MB [2024-08-30 05:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [217/300][1250/1251] eta 0:00:00 lr 0.000205 wd 0.0500 time 0.2127 (0.2427) data time 0.0006 (0.0015) model time 0.2120 (0.2412) loss 2.9112 (2.9335) grad_norm 2.9397 (inf) loss_scale 512.0000 (549.6531) mem 7381MB [2024-08-30 05:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 217 training takes 0:05:03 [2024-08-30 05:40:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:40:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 05:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.743 (0.743) Loss 0.4365 (0.4365) Acc@1 92.676 (92.676) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-30 05:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.141) Loss 0.6357 (0.6629) Acc@1 88.574 (86.586) Acc@5 97.852 (97.523) Mem 7381MB [2024-08-30 05:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.113) Loss 0.9795 (0.6942) Acc@1 76.758 (85.384) Acc@5 95.215 (97.414) Mem 7381MB [2024-08-30 05:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.105) Loss 1.1602 (0.7874) Acc@1 72.852 (83.219) Acc@5 91.699 (96.380) Mem 7381MB [2024-08-30 05:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.066 (0.097) Loss 0.9863 (0.8346) Acc@1 76.270 (82.003) Acc@5 94.922 (95.851) Mem 7381MB [2024-08-30 05:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.578 Acc@5 95.792 [2024-08-30 05:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.6% [2024-08-30 05:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.58% [2024-08-30 05:40:17 
msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 05:40:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-30 05:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.445 (0.445) Loss 0.3772 (0.3772) Acc@1 93.359 (93.359) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-30 05:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.114) Loss 0.5879 (0.6041) Acc@1 88.574 (87.358) Acc@5 97.852 (97.754) Mem 7381MB [2024-08-30 05:40:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.098) Loss 0.8613 (0.6306) Acc@1 78.613 (86.342) Acc@5 95.996 (97.693) Mem 7381MB [2024-08-30 05:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.092) Loss 1.0889 (0.7146) Acc@1 73.730 (84.240) Acc@5 93.359 (96.828) Mem 7381MB [2024-08-30 05:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 0.9775 (0.7577) Acc@1 77.246 (83.160) Acc@5 94.629 (96.356) Mem 7381MB [2024-08-30 05:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.732 Acc@5 96.318 [2024-08-30 05:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-08-30 05:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][0/1251] eta 0:26:06 lr 0.000205 wd 0.0500 time 1.2523 (1.2523) data time 0.8617 (0.8617) model time 0.0000 (0.0000) loss 3.2367 (3.2367) grad_norm 3.7897 (3.7897) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][10/1251] eta 0:07:21 lr 0.000205 wd 0.0500 time 0.3141 (0.3559) data time 0.0015 (0.0794) model time 0.0000 (0.0000) loss 2.8463 (2.9278) grad_norm 4.5411 (3.5206) loss_scale 512.0000 (512.0000) 
mem 7381MB [2024-08-30 05:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][20/1251] eta 0:06:08 lr 0.000205 wd 0.0500 time 0.2230 (0.2997) data time 0.0008 (0.0422) model time 0.0000 (0.0000) loss 3.2706 (2.9652) grad_norm 5.3655 (3.8494) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][30/1251] eta 0:05:38 lr 0.000205 wd 0.0500 time 0.2452 (0.2776) data time 0.0010 (0.0289) model time 0.0000 (0.0000) loss 3.4130 (3.0211) grad_norm 10.2319 (4.0330) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][40/1251] eta 0:05:28 lr 0.000205 wd 0.0500 time 0.2267 (0.2713) data time 0.0009 (0.0222) model time 0.0000 (0.0000) loss 2.7225 (2.9311) grad_norm 2.4192 (3.8543) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][50/1251] eta 0:05:19 lr 0.000205 wd 0.0500 time 0.2271 (0.2662) data time 0.0009 (0.0181) model time 0.0000 (0.0000) loss 2.4001 (2.9368) grad_norm 2.2954 (3.7713) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][60/1251] eta 0:05:14 lr 0.000205 wd 0.0500 time 0.2287 (0.2638) data time 0.0006 (0.0153) model time 0.2281 (0.2505) loss 2.9611 (2.9379) grad_norm 4.2815 (3.7948) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][70/1251] eta 0:05:05 lr 0.000205 wd 0.0500 time 0.2290 (0.2589) data time 0.0006 (0.0133) model time 0.2283 (0.2394) loss 2.9210 (2.9180) grad_norm 2.5757 (3.7371) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][80/1251] eta 0:04:58 lr 0.000205 wd 0.0500 time 0.2324 (0.2548) data time 0.0009 (0.0117) model time 0.2315 
(0.2345) loss 3.4033 (2.9333) grad_norm 9.8304 (3.7655) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][90/1251] eta 0:04:56 lr 0.000205 wd 0.0500 time 0.3034 (0.2552) data time 0.0007 (0.0106) model time 0.3027 (0.2402) loss 3.3330 (2.9154) grad_norm 8.6978 (3.7604) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][100/1251] eta 0:04:52 lr 0.000205 wd 0.0500 time 0.2300 (0.2539) data time 0.0007 (0.0096) model time 0.2293 (0.2404) loss 3.5569 (2.9179) grad_norm 3.3798 (3.7856) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][110/1251] eta 0:04:48 lr 0.000205 wd 0.0500 time 0.2281 (0.2529) data time 0.0006 (0.0089) model time 0.2275 (0.2406) loss 2.1860 (2.9069) grad_norm 2.8057 (3.8006) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][120/1251] eta 0:04:43 lr 0.000205 wd 0.0500 time 0.2272 (0.2508) data time 0.0008 (0.0082) model time 0.2264 (0.2386) loss 3.5087 (2.9060) grad_norm 4.4103 (3.7646) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][130/1251] eta 0:04:42 lr 0.000205 wd 0.0500 time 0.2287 (0.2521) data time 0.0008 (0.0077) model time 0.2279 (0.2421) loss 3.7176 (2.9279) grad_norm 2.6908 (3.7001) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][140/1251] eta 0:04:40 lr 0.000205 wd 0.0500 time 0.2327 (0.2521) data time 0.0009 (0.0072) model time 0.2317 (0.2430) loss 3.2456 (2.9291) grad_norm 3.0239 (3.7043) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][150/1251] eta 0:04:35 
lr 0.000205 wd 0.0500 time 0.2302 (0.2506) data time 0.0008 (0.0068) model time 0.2294 (0.2416) loss 2.9049 (2.9365) grad_norm 5.7136 (3.7222) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][160/1251] eta 0:04:33 lr 0.000205 wd 0.0500 time 0.2343 (0.2506) data time 0.0008 (0.0065) model time 0.2335 (0.2423) loss 2.3789 (2.9440) grad_norm 3.1369 (3.6977) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][170/1251] eta 0:04:31 lr 0.000205 wd 0.0500 time 0.3365 (0.2510) data time 0.0012 (0.0061) model time 0.3353 (0.2435) loss 2.0367 (2.9331) grad_norm 3.0020 (3.6621) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][180/1251] eta 0:04:28 lr 0.000204 wd 0.0500 time 0.2249 (0.2508) data time 0.0010 (0.0059) model time 0.2239 (0.2437) loss 2.6166 (2.9420) grad_norm 4.4739 (3.6332) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][190/1251] eta 0:04:24 lr 0.000204 wd 0.0500 time 0.2292 (0.2495) data time 0.0006 (0.0056) model time 0.2286 (0.2424) loss 2.5256 (2.9471) grad_norm 4.0093 (3.6238) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][200/1251] eta 0:04:21 lr 0.000204 wd 0.0500 time 0.2246 (0.2484) data time 0.0009 (0.0054) model time 0.2238 (0.2413) loss 3.0987 (2.9607) grad_norm 3.4158 (3.6132) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][210/1251] eta 0:04:18 lr 0.000204 wd 0.0500 time 0.3027 (0.2481) data time 0.0019 (0.0052) model time 0.3008 (0.2413) loss 3.3258 (2.9600) grad_norm 3.3448 (3.6665) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:18 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][220/1251] eta 0:04:16 lr 0.000204 wd 0.0500 time 0.2188 (0.2485) data time 0.0010 (0.0050) model time 0.2178 (0.2421) loss 3.1037 (2.9632) grad_norm 3.8826 (3.6657) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][230/1251] eta 0:04:12 lr 0.000204 wd 0.0500 time 0.2272 (0.2475) data time 0.0006 (0.0048) model time 0.2265 (0.2411) loss 2.9621 (2.9613) grad_norm 3.0188 (3.6455) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][240/1251] eta 0:04:10 lr 0.000204 wd 0.0500 time 0.2240 (0.2474) data time 0.0009 (0.0047) model time 0.2231 (0.2413) loss 2.6366 (2.9594) grad_norm 3.5534 (3.6705) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][250/1251] eta 0:04:06 lr 0.000204 wd 0.0500 time 0.2353 (0.2467) data time 0.0009 (0.0045) model time 0.2344 (0.2407) loss 3.2177 (2.9650) grad_norm 2.6247 (3.6802) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][260/1251] eta 0:04:05 lr 0.000204 wd 0.0500 time 0.2327 (0.2476) data time 0.0009 (0.0044) model time 0.2318 (0.2420) loss 2.3604 (2.9617) grad_norm 3.0625 (3.6675) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][270/1251] eta 0:04:02 lr 0.000204 wd 0.0500 time 0.2314 (0.2469) data time 0.0009 (0.0043) model time 0.2304 (0.2413) loss 2.9306 (2.9580) grad_norm 2.9750 (3.8074) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][280/1251] eta 0:03:59 lr 0.000204 wd 0.0500 time 0.2267 (0.2462) data time 0.0007 (0.0042) model time 0.2261 (0.2407) loss 3.1854 (2.9557) 
grad_norm 3.5229 (3.8168) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][290/1251] eta 0:03:56 lr 0.000204 wd 0.0500 time 0.3407 (0.2462) data time 0.0013 (0.0041) model time 0.3395 (0.2408) loss 3.5888 (2.9628) grad_norm 3.8682 (3.7977) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][300/1251] eta 0:03:54 lr 0.000204 wd 0.0500 time 0.2275 (0.2462) data time 0.0011 (0.0040) model time 0.2264 (0.2410) loss 2.8971 (2.9623) grad_norm 3.5768 (3.7860) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][310/1251] eta 0:03:51 lr 0.000204 wd 0.0500 time 0.2279 (0.2464) data time 0.0009 (0.0039) model time 0.2270 (0.2414) loss 3.0045 (2.9650) grad_norm 12.3699 (3.8044) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][320/1251] eta 0:03:48 lr 0.000204 wd 0.0500 time 0.2195 (0.2457) data time 0.0010 (0.0038) model time 0.2185 (0.2408) loss 3.2101 (2.9564) grad_norm 4.7376 (3.8036) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][330/1251] eta 0:03:45 lr 0.000204 wd 0.0500 time 0.2397 (0.2452) data time 0.0010 (0.0037) model time 0.2388 (0.2403) loss 3.0654 (2.9632) grad_norm 2.4374 (3.7904) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][340/1251] eta 0:03:43 lr 0.000204 wd 0.0500 time 0.3230 (0.2458) data time 0.0018 (0.0036) model time 0.3212 (0.2411) loss 2.2570 (2.9621) grad_norm 3.1255 (3.7644) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][350/1251] eta 0:03:41 lr 0.000204 wd 0.0500 time 
0.2319 (0.2454) data time 0.0010 (0.0036) model time 0.2308 (0.2407) loss 3.5632 (2.9599) grad_norm 3.3526 (3.7444) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][360/1251] eta 0:03:38 lr 0.000204 wd 0.0500 time 0.2243 (0.2449) data time 0.0010 (0.0035) model time 0.2233 (0.2402) loss 2.4754 (2.9612) grad_norm 2.3951 (3.7852) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][370/1251] eta 0:03:35 lr 0.000204 wd 0.0500 time 0.2293 (0.2451) data time 0.0009 (0.0035) model time 0.2284 (0.2406) loss 2.8038 (2.9632) grad_norm 3.5635 (3.7836) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][380/1251] eta 0:03:33 lr 0.000204 wd 0.0500 time 0.2341 (0.2454) data time 0.0009 (0.0034) model time 0.2332 (0.2410) loss 1.7195 (2.9599) grad_norm 2.7733 (3.7604) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][390/1251] eta 0:03:31 lr 0.000204 wd 0.0500 time 0.2195 (0.2455) data time 0.0009 (0.0033) model time 0.2186 (0.2412) loss 3.2100 (2.9571) grad_norm 2.2454 (3.7468) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][400/1251] eta 0:03:28 lr 0.000204 wd 0.0500 time 0.2218 (0.2450) data time 0.0008 (0.0033) model time 0.2210 (0.2408) loss 3.3249 (2.9582) grad_norm 3.5537 (3.7286) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][410/1251] eta 0:03:25 lr 0.000204 wd 0.0500 time 0.2290 (0.2446) data time 0.0007 (0.0032) model time 0.2283 (0.2404) loss 3.6753 (2.9517) grad_norm 3.2761 (3.7185) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:06 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [218/300][420/1251] eta 0:03:23 lr 0.000204 wd 0.0500 time 0.2196 (0.2448) data time 0.0010 (0.0032) model time 0.2186 (0.2407) loss 3.3915 (2.9579) grad_norm 3.3423 (3.7135) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][430/1251] eta 0:03:21 lr 0.000204 wd 0.0500 time 0.2241 (0.2450) data time 0.0010 (0.0031) model time 0.2231 (0.2410) loss 3.4839 (2.9519) grad_norm 3.7183 (3.7029) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][440/1251] eta 0:03:18 lr 0.000204 wd 0.0500 time 0.2284 (0.2450) data time 0.0007 (0.0031) model time 0.2277 (0.2411) loss 1.9322 (2.9490) grad_norm 3.1583 (3.7096) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][450/1251] eta 0:03:15 lr 0.000204 wd 0.0500 time 0.2266 (0.2446) data time 0.0006 (0.0030) model time 0.2260 (0.2408) loss 2.5408 (2.9447) grad_norm 3.5800 (3.7068) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][460/1251] eta 0:03:13 lr 0.000204 wd 0.0500 time 0.2214 (0.2447) data time 0.0008 (0.0030) model time 0.2205 (0.2409) loss 3.5470 (2.9463) grad_norm 4.3750 (3.7145) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][470/1251] eta 0:03:11 lr 0.000203 wd 0.0500 time 0.2211 (0.2449) data time 0.0011 (0.0030) model time 0.2199 (0.2412) loss 3.6754 (2.9447) grad_norm 3.7332 (3.7285) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][480/1251] eta 0:03:08 lr 0.000203 wd 0.0500 time 0.2275 (0.2445) data time 0.0010 (0.0029) model time 0.2265 (0.2409) loss 3.2226 (2.9446) grad_norm 3.5409 
(3.7178) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][490/1251] eta 0:03:05 lr 0.000203 wd 0.0500 time 0.2257 (0.2442) data time 0.0007 (0.0029) model time 0.2249 (0.2405) loss 2.9893 (2.9447) grad_norm 3.0593 (3.7149) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][500/1251] eta 0:03:03 lr 0.000203 wd 0.0500 time 0.3042 (0.2445) data time 0.0010 (0.0028) model time 0.3032 (0.2409) loss 3.5614 (2.9431) grad_norm 4.2066 (3.7150) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][510/1251] eta 0:03:01 lr 0.000203 wd 0.0500 time 0.3160 (0.2451) data time 0.0014 (0.0028) model time 0.3146 (0.2416) loss 2.8865 (2.9397) grad_norm 3.6057 (3.7043) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][520/1251] eta 0:02:59 lr 0.000203 wd 0.0500 time 0.2279 (0.2452) data time 0.0006 (0.0028) model time 0.2273 (0.2418) loss 3.6697 (2.9351) grad_norm 2.5930 (3.6888) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][530/1251] eta 0:02:56 lr 0.000203 wd 0.0500 time 0.2261 (0.2449) data time 0.0010 (0.0027) model time 0.2250 (0.2415) loss 3.2359 (2.9370) grad_norm 3.1429 (3.6835) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][540/1251] eta 0:02:54 lr 0.000203 wd 0.0500 time 0.2806 (0.2448) data time 0.0016 (0.0027) model time 0.2790 (0.2415) loss 3.1985 (2.9429) grad_norm 4.3111 (3.6764) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][550/1251] eta 0:02:51 lr 0.000203 wd 0.0500 time 0.2347 (0.2450) data 
time 0.0011 (0.0027) model time 0.2336 (0.2418) loss 2.9777 (2.9430) grad_norm 3.3442 (3.6776) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][560/1251] eta 0:02:49 lr 0.000203 wd 0.0500 time 0.2258 (0.2448) data time 0.0009 (0.0027) model time 0.2249 (0.2415) loss 3.1141 (2.9425) grad_norm 14.1483 (3.6951) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][570/1251] eta 0:02:46 lr 0.000203 wd 0.0500 time 0.2391 (0.2448) data time 0.0008 (0.0026) model time 0.2383 (0.2416) loss 2.1476 (2.9406) grad_norm 4.3979 (3.6944) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][580/1251] eta 0:02:44 lr 0.000203 wd 0.0500 time 0.2543 (0.2446) data time 0.0007 (0.0026) model time 0.2537 (0.2414) loss 3.7327 (2.9391) grad_norm 5.5511 (3.6904) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][590/1251] eta 0:02:41 lr 0.000203 wd 0.0500 time 0.2266 (0.2450) data time 0.0009 (0.0026) model time 0.2257 (0.2418) loss 3.4777 (2.9413) grad_norm 3.5280 (3.6932) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][600/1251] eta 0:02:39 lr 0.000203 wd 0.0500 time 0.2199 (0.2446) data time 0.0008 (0.0026) model time 0.2191 (0.2415) loss 3.3540 (2.9395) grad_norm 5.4139 (3.6993) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][610/1251] eta 0:02:36 lr 0.000203 wd 0.0500 time 0.2257 (0.2444) data time 0.0007 (0.0025) model time 0.2250 (0.2413) loss 3.4471 (2.9424) grad_norm 3.9071 (3.7069) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [218/300][620/1251] eta 0:02:34 lr 0.000203 wd 0.0500 time 0.2571 (0.2444) data time 0.0014 (0.0025) model time 0.2558 (0.2413) loss 2.9482 (2.9452) grad_norm 3.0784 (3.7029) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][630/1251] eta 0:02:31 lr 0.000203 wd 0.0500 time 0.2324 (0.2444) data time 0.0007 (0.0025) model time 0.2317 (0.2414) loss 3.5233 (2.9483) grad_norm 2.9709 (3.6991) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 05:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 05:42:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:42:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 05:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 05:46:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 05:46:12 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 05:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 05:46:24 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 05:46:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 05:46:27 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 05:46:27 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 218) [2024-08-30 05:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 05:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 05:46:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:46:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 05:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 05:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 05:49:11 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 05:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 05:49:21 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 05:49:22 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 05:49:23 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 05:49:23 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 218) [2024-08-30 05:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 05:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][640/1251] eta 0:17:43 lr 0.000203 wd 0.0500 time 0.2413 (1.7406) data time 0.0009 (0.0896) model time 0.2404 (1.6510) loss 3.4785 (3.3173) grad_norm 3.0826 (3.7316) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][650/1251] eta 0:09:03 lr 0.000203 wd 0.0500 time 0.2306 (0.9050) data time 0.0009 (0.0405) model time 0.2296 (0.8645) loss 3.6008 (3.2067) grad_norm 4.9980 (3.6367) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][660/1251] eta 0:06:33 lr 0.000203 wd 0.0500 time 0.2324 (0.6654) data time 0.0010 (0.0264) model time 0.2314 (0.6390) loss 3.4245 (3.2223) grad_norm 3.4033 (3.5032) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][670/1251] eta 0:05:21 lr 0.000203 wd 0.0500 time 0.2391 (0.5530) data time 0.0010 (0.0197) model time 0.2382 (0.5333) loss 2.6387 (3.1681) grad_norm 2.4587 (3.4126) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][680/1251] eta 0:04:37 lr 0.000203 wd 0.0500 time 0.2385 (0.4865) data time 0.0007 (0.0158) model time 0.2378 (0.4707) loss 3.2607 (3.1499) grad_norm 2.8159 (3.6368) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[218/300][690/1251] eta 0:04:08 lr 0.000203 wd 0.0500 time 0.2304 (0.4438) data time 0.0009 (0.0133) model time 0.2295 (0.4306) loss 2.2299 (3.1062) grad_norm 2.3479 (3.6412) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][700/1251] eta 0:03:47 lr 0.000203 wd 0.0500 time 0.2267 (0.4129) data time 0.0009 (0.0115) model time 0.2258 (0.4014) loss 2.0503 (3.0731) grad_norm 2.6298 (3.5401) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][710/1251] eta 0:03:31 lr 0.000203 wd 0.0500 time 0.2362 (0.3903) data time 0.0009 (0.0101) model time 0.2353 (0.3802) loss 2.5469 (3.0439) grad_norm 4.1385 (3.5601) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][720/1251] eta 0:03:18 lr 0.000203 wd 0.0500 time 0.2359 (0.3732) data time 0.0010 (0.0091) model time 0.2348 (0.3641) loss 3.2624 (3.0120) grad_norm 3.6491 (3.6269) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][730/1251] eta 0:03:07 lr 0.000203 wd 0.0500 time 0.2313 (0.3597) data time 0.0008 (0.0083) model time 0.2305 (0.3514) loss 3.9465 (3.0185) grad_norm 3.3270 (3.6266) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][740/1251] eta 0:02:57 lr 0.000203 wd 0.0500 time 0.2277 (0.3482) data time 0.0008 (0.0076) model time 0.2269 (0.3406) loss 2.0835 (3.0254) grad_norm 2.5631 (3.6360) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][750/1251] eta 0:02:49 lr 0.000202 wd 0.0500 time 0.2373 (0.3386) data time 0.0011 (0.0071) model time 0.2363 (0.3316) loss 3.0981 (3.0273) grad_norm 3.0076 (3.6193) loss_scale 512.0000 (512.0000) mem 
7375MB [2024-08-30 05:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][760/1251] eta 0:02:42 lr 0.000202 wd 0.0500 time 0.2355 (0.3304) data time 0.0008 (0.0066) model time 0.2347 (0.3238) loss 2.9127 (3.0026) grad_norm 2.6983 (3.6534) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][770/1251] eta 0:02:35 lr 0.000202 wd 0.0500 time 0.2366 (0.3241) data time 0.0012 (0.0062) model time 0.2354 (0.3179) loss 2.9759 (2.9954) grad_norm 3.9391 (3.6326) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][780/1251] eta 0:02:29 lr 0.000202 wd 0.0500 time 0.2298 (0.3179) data time 0.0009 (0.0058) model time 0.2290 (0.3121) loss 2.9632 (2.9851) grad_norm 4.0100 (3.6049) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][790/1251] eta 0:02:24 lr 0.000202 wd 0.0500 time 0.2277 (0.3128) data time 0.0008 (0.0055) model time 0.2269 (0.3073) loss 2.6598 (2.9806) grad_norm 3.8938 (3.6247) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][800/1251] eta 0:02:19 lr 0.000202 wd 0.0500 time 0.2340 (0.3084) data time 0.0011 (0.0053) model time 0.2329 (0.3032) loss 3.5725 (2.9861) grad_norm 4.4935 (3.6461) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][810/1251] eta 0:02:14 lr 0.000202 wd 0.0500 time 0.2309 (0.3043) data time 0.0010 (0.0050) model time 0.2299 (0.2993) loss 2.3072 (2.9692) grad_norm 3.3336 (3.6204) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][820/1251] eta 0:02:09 lr 0.000202 wd 0.0500 time 0.2364 (0.3008) data time 0.0009 (0.0048) model time 0.2355 
(0.2960) loss 3.2035 (2.9673) grad_norm 2.7280 (3.6390) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][830/1251] eta 0:02:05 lr 0.000202 wd 0.0500 time 0.2391 (0.2975) data time 0.0011 (0.0046) model time 0.2381 (0.2928) loss 1.9031 (2.9545) grad_norm 2.7007 (3.6487) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][840/1251] eta 0:02:01 lr 0.000202 wd 0.0500 time 0.2290 (0.2945) data time 0.0009 (0.0045) model time 0.2281 (0.2900) loss 3.0746 (2.9481) grad_norm 7.7686 (3.6847) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][850/1251] eta 0:01:57 lr 0.000202 wd 0.0500 time 0.2510 (0.2920) data time 0.0011 (0.0043) model time 0.2500 (0.2877) loss 2.6565 (2.9434) grad_norm 3.7094 (3.6694) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][860/1251] eta 0:01:53 lr 0.000202 wd 0.0500 time 0.2360 (0.2896) data time 0.0011 (0.0042) model time 0.2349 (0.2854) loss 3.3167 (2.9507) grad_norm 4.5969 (3.6618) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][870/1251] eta 0:01:49 lr 0.000202 wd 0.0500 time 0.2614 (0.2874) data time 0.0009 (0.0040) model time 0.2605 (0.2834) loss 2.8989 (2.9463) grad_norm 2.7531 (3.6867) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][880/1251] eta 0:01:45 lr 0.000202 wd 0.0500 time 0.2339 (0.2854) data time 0.0011 (0.0039) model time 0.2328 (0.2815) loss 2.2340 (2.9350) grad_norm 2.5051 (3.6757) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][890/1251] eta 0:01:42 
lr 0.000202 wd 0.0500 time 0.2365 (0.2834) data time 0.0011 (0.0038) model time 0.2354 (0.2796) loss 3.1092 (2.9263) grad_norm 2.8009 (3.6506) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][900/1251] eta 0:01:39 lr 0.000202 wd 0.0500 time 0.2326 (0.2822) data time 0.0010 (0.0037) model time 0.2316 (0.2785) loss 3.3930 (2.9203) grad_norm 16.4499 (3.7144) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][910/1251] eta 0:01:35 lr 0.000202 wd 0.0500 time 0.2344 (0.2805) data time 0.0010 (0.0036) model time 0.2334 (0.2769) loss 2.3342 (2.9252) grad_norm 3.9706 (3.7064) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][920/1251] eta 0:01:32 lr 0.000202 wd 0.0500 time 0.2354 (0.2798) data time 0.0008 (0.0035) model time 0.2346 (0.2762) loss 3.7258 (2.9265) grad_norm 2.5472 (3.6963) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][930/1251] eta 0:01:29 lr 0.000202 wd 0.0500 time 0.2404 (0.2784) data time 0.0009 (0.0034) model time 0.2395 (0.2750) loss 2.9959 (2.9152) grad_norm 2.2588 (3.6941) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][940/1251] eta 0:01:26 lr 0.000202 wd 0.0500 time 0.2499 (0.2780) data time 0.0011 (0.0034) model time 0.2489 (0.2747) loss 2.5138 (2.9101) grad_norm 4.1562 (3.6856) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][950/1251] eta 0:01:23 lr 0.000202 wd 0.0500 time 0.2429 (0.2767) data time 0.0011 (0.0033) model time 0.2418 (0.2734) loss 3.1054 (2.9182) grad_norm 3.5010 (3.6798) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:50:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][960/1251] eta 0:01:20 lr 0.000202 wd 0.0500 time 0.2312 (0.2755) data time 0.0009 (0.0032) model time 0.2303 (0.2722) loss 2.8981 (2.9222) grad_norm 2.4352 (3.6965) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][970/1251] eta 0:01:17 lr 0.000202 wd 0.0500 time 0.2422 (0.2743) data time 0.0011 (0.0032) model time 0.2411 (0.2712) loss 3.3167 (2.9224) grad_norm 4.0937 (3.6902) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][980/1251] eta 0:01:14 lr 0.000202 wd 0.0500 time 0.2398 (0.2733) data time 0.0007 (0.0031) model time 0.2391 (0.2702) loss 3.0488 (2.9233) grad_norm 4.7938 (3.7014) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][990/1251] eta 0:01:11 lr 0.000202 wd 0.0500 time 0.2289 (0.2722) data time 0.0009 (0.0030) model time 0.2280 (0.2692) loss 3.1493 (2.9239) grad_norm 4.0128 (3.6980) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1000/1251] eta 0:01:08 lr 0.000202 wd 0.0500 time 0.2264 (0.2712) data time 0.0009 (0.0030) model time 0.2256 (0.2682) loss 2.5388 (2.9230) grad_norm 2.3828 (3.6927) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1010/1251] eta 0:01:05 lr 0.000202 wd 0.0500 time 0.2356 (0.2703) data time 0.0010 (0.0029) model time 0.2346 (0.2673) loss 3.3915 (2.9224) grad_norm 3.3512 (3.6970) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1020/1251] eta 0:01:02 lr 0.000202 wd 0.0500 time 0.2372 (0.2695) data time 0.0007 (0.0029) model time 0.2365 (0.2666) loss 2.6202 (2.9170) 
grad_norm 3.4925 (3.6995) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1030/1251] eta 0:00:59 lr 0.000202 wd 0.0500 time 0.2461 (0.2687) data time 0.0007 (0.0028) model time 0.2454 (0.2658) loss 3.5134 (2.9214) grad_norm 5.7706 (3.7175) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1040/1251] eta 0:00:56 lr 0.000201 wd 0.0500 time 0.2359 (0.2679) data time 0.0012 (0.0028) model time 0.2347 (0.2651) loss 3.0847 (2.9265) grad_norm 4.0938 (3.7043) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1050/1251] eta 0:00:53 lr 0.000201 wd 0.0500 time 0.2380 (0.2672) data time 0.0009 (0.0028) model time 0.2371 (0.2645) loss 3.5833 (2.9269) grad_norm 7.1680 (3.7013) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1060/1251] eta 0:00:50 lr 0.000201 wd 0.0500 time 0.2394 (0.2665) data time 0.0009 (0.0027) model time 0.2385 (0.2638) loss 3.4148 (2.9306) grad_norm 2.7076 (3.6870) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1070/1251] eta 0:00:48 lr 0.000201 wd 0.0500 time 0.2395 (0.2660) data time 0.0012 (0.0027) model time 0.2383 (0.2633) loss 2.7928 (2.9338) grad_norm 4.1039 (3.6801) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1080/1251] eta 0:00:45 lr 0.000201 wd 0.0500 time 0.2394 (0.2653) data time 0.0010 (0.0026) model time 0.2384 (0.2627) loss 2.4555 (2.9331) grad_norm 2.9121 (3.6684) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1090/1251] eta 0:00:42 lr 0.000201 wd 0.0500 
time 0.2392 (0.2647) data time 0.0007 (0.0026) model time 0.2385 (0.2621) loss 2.5476 (2.9300) grad_norm 3.1706 (3.6631) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1100/1251] eta 0:00:39 lr 0.000201 wd 0.0500 time 0.2340 (0.2642) data time 0.0011 (0.0026) model time 0.2329 (0.2616) loss 2.5396 (2.9254) grad_norm 7.8788 (3.6691) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1110/1251] eta 0:00:37 lr 0.000201 wd 0.0500 time 0.2452 (0.2637) data time 0.0009 (0.0025) model time 0.2443 (0.2611) loss 1.8597 (2.9200) grad_norm 2.4784 (3.6640) loss_scale 512.0000 (512.0000) mem 7375MB [2024-08-30 05:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1120/1251] eta 0:00:34 lr 0.000201 wd 0.0500 time 0.2321 (0.2631) data time 0.0010 (0.0025) model time 0.2311 (0.2606) loss 3.3840 (2.9240) grad_norm inf (inf) loss_scale 256.0000 (511.4754) mem 7375MB [2024-08-30 05:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1130/1251] eta 0:00:31 lr 0.000201 wd 0.0500 time 0.2387 (0.2626) data time 0.0008 (0.0025) model time 0.2379 (0.2601) loss 3.0792 (2.9230) grad_norm 3.7512 (inf) loss_scale 256.0000 (506.3454) mem 7375MB [2024-08-30 05:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1140/1251] eta 0:00:29 lr 0.000201 wd 0.0500 time 0.2345 (0.2620) data time 0.0008 (0.0025) model time 0.2337 (0.2595) loss 3.4538 (2.9245) grad_norm 2.9459 (inf) loss_scale 256.0000 (501.4173) mem 7375MB [2024-08-30 05:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1150/1251] eta 0:00:26 lr 0.000201 wd 0.0500 time 0.2587 (0.2616) data time 0.0009 (0.0024) model time 0.2578 (0.2592) loss 3.0678 (2.9292) grad_norm 3.7857 (inf) loss_scale 256.0000 (496.6795) mem 7375MB [2024-08-30 05:51:46 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [218/300][1160/1251] eta 0:00:23 lr 0.000201 wd 0.0500 time 0.2356 (0.2611) data time 0.0007 (0.0024) model time 0.2349 (0.2587) loss 1.8900 (2.9208) grad_norm 2.4669 (inf) loss_scale 256.0000 (492.1212) mem 7375MB [2024-08-30 05:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1170/1251] eta 0:00:21 lr 0.000201 wd 0.0500 time 0.2341 (0.2606) data time 0.0009 (0.0024) model time 0.2332 (0.2583) loss 3.1978 (2.9194) grad_norm 2.4657 (inf) loss_scale 256.0000 (487.7323) mem 7375MB [2024-08-30 05:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1180/1251] eta 0:00:18 lr 0.000201 wd 0.0500 time 0.2301 (0.2601) data time 0.0010 (0.0023) model time 0.2291 (0.2578) loss 3.6312 (2.9208) grad_norm 2.8090 (inf) loss_scale 256.0000 (483.5036) mem 7375MB [2024-08-30 05:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 05:51:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:51:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 05:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 05:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 05:58:06 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 05:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 05:58:16 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 05:58:18 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 05:58:19 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 05:58:19 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 218) [2024-08-30 05:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 05:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1190/1251] eta 0:02:01 lr 0.000201 wd 0.0500 time 0.2231 (1.9929) data time 0.0011 (0.1176) model time 0.2220 (1.8753) loss 3.2654 (3.4982) grad_norm 5.0334 (3.8456) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 05:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1200/1251] eta 0:00:48 lr 0.000201 wd 0.0500 time 0.2218 (0.9522) data time 0.0009 (0.0490) model time 0.2209 (0.9032) loss 2.8638 (3.2054) grad_norm 4.9902 (4.7859) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 05:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1210/1251] eta 0:00:27 lr 0.000201 wd 0.0500 time 0.2204 (0.6824) data time 0.0007 (0.0312) model time 0.2197 (0.6512) loss 3.5277 (3.2258) grad_norm 3.7744 (4.3267) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 05:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1220/1251] eta 0:00:17 lr 0.000201 wd 0.0500 time 0.2213 (0.5591) data time 0.0010 (0.0230) model time 0.2203 (0.5361) loss 2.8252 (3.1860) grad_norm 3.7953 (3.9943) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 05:58:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1230/1251] eta 0:00:10 lr 0.000201 wd 0.0500 time 0.2212 (0.4880) data time 0.0008 (0.0183) model time 0.2203 (0.4697) loss 3.3662 (3.1530) grad_norm 3.8433 (3.8986) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 05:58:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1240/1251] eta 0:00:04 lr 0.000201 wd 0.0500 time 0.2146 (0.4409) data time 0.0006 (0.0153) model time 0.2140 (0.4256) loss 2.7405 (3.1435) grad_norm 3.6431 (3.8075) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 05:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [218/300][1250/1251] eta 0:00:00 lr 0.000201 wd 0.0500 time 0.2158 (0.4073) data time 0.0006 (0.0131) model time 0.2152 (0.3942) loss 3.3382 (3.1225) grad_norm 2.7562 (3.7411) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 05:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 218 training takes 0:00:27 [2024-08-30 05:58:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 05:58:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 05:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.343 (0.343) Loss 0.4021 (0.4021) Acc@1 92.871 (92.871) Acc@5 98.438 (98.438) Mem 7377MB [2024-08-30 05:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.097) Loss 0.6846 (0.6551) Acc@1 87.500 (86.337) Acc@5 97.461 (97.425) Mem 7377MB [2024-08-30 05:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.085) Loss 0.9604 (0.6809) Acc@1 77.344 (85.449) Acc@5 94.629 (97.391) Mem 7377MB [2024-08-30 05:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.080) Loss 1.1592 (0.7762) Acc@1 73.926 (83.191) Acc@5 92.090 (96.374) Mem 7377MB [2024-08-30 05:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.0439 (0.8236) Acc@1 75.098 (81.955) Acc@5 94.141 (95.872) Mem 7377MB [2024-08-30 06:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 06:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 06:00:52 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 06:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 06:01:00 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 06:01:01 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 06:01:03 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 06:01:03 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 218) [2024-08-30 06:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 06:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][0/1251] eta 4:53:57 lr 0.000201 wd 0.0500 time 14.0983 (14.0983) data time 1.1133 (1.1133) model time 0.0000 (0.0000) loss 3.4590 (3.4590) grad_norm 2.8558 (2.8558) loss_scale 256.0000 (256.0000) mem 20033MB [2024-08-30 06:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][10/1251] eta 0:31:12 lr 0.000201 wd 0.0500 time 0.2500 (1.5088) data time 0.0011 (0.1023) model time 0.0000 (0.0000) loss 2.8132 (3.2910) grad_norm 3.2161 (3.5936) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][20/1251] eta 0:18:26 lr 0.000201 wd 0.0500 time 0.2233 (0.8985) data time 0.0010 (0.0542) model time 0.0000 (0.0000) loss 2.9593 (3.1796) grad_norm 3.3642 (3.2615) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][30/1251] eta 0:13:52 lr 0.000201 wd 0.0500 time 0.2200 (0.6817) data time 0.0008 (0.0371) model time 0.0000 (0.0000) loss 2.4283 (3.2164) grad_norm 3.6849 (3.2325) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][40/1251] eta 0:11:31 lr 0.000201 wd 0.0500 time 0.2245 (0.5706) data time 0.0013 (0.0283) model time 0.0000 (0.0000) loss 2.5499 (3.1226) grad_norm 3.9687 (3.4044) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[219/300][50/1251] eta 0:10:04 lr 0.000201 wd 0.0500 time 0.2259 (0.5033) data time 0.0007 (0.0230) model time 0.0000 (0.0000) loss 3.7430 (3.1137) grad_norm 3.8719 (3.3937) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][60/1251] eta 0:09:06 lr 0.000201 wd 0.0500 time 0.2271 (0.4588) data time 0.0014 (0.0194) model time 0.2256 (0.2303) loss 3.3240 (3.0936) grad_norm 3.0346 (3.4000) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][70/1251] eta 0:08:23 lr 0.000200 wd 0.0500 time 0.2293 (0.4263) data time 0.0009 (0.0169) model time 0.2284 (0.2286) loss 2.8378 (3.0513) grad_norm 3.9634 (3.4110) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][80/1251] eta 0:07:50 lr 0.000200 wd 0.0500 time 0.2332 (0.4018) data time 0.0017 (0.0149) model time 0.2315 (0.2280) loss 2.6728 (3.0371) grad_norm 2.9499 (3.4107) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][90/1251] eta 0:07:24 lr 0.000200 wd 0.0500 time 0.2270 (0.3826) data time 0.0009 (0.0134) model time 0.2261 (0.2275) loss 3.4024 (3.0298) grad_norm 5.1228 (3.4593) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][100/1251] eta 0:07:03 lr 0.000200 wd 0.0500 time 0.2285 (0.3677) data time 0.0011 (0.0122) model time 0.2274 (0.2281) loss 3.0871 (3.0347) grad_norm 4.5624 (3.4437) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][110/1251] eta 0:06:45 lr 0.000200 wd 0.0500 time 0.2276 (0.3552) data time 0.0013 (0.0112) model time 0.2263 (0.2282) loss 2.4065 (3.0347) grad_norm 5.0859 (3.4654) loss_scale 256.0000 (256.0000) mem 7377MB 
[2024-08-30 06:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][120/1251] eta 0:06:30 lr 0.000200 wd 0.0500 time 0.2248 (0.3449) data time 0.0012 (0.0104) model time 0.2235 (0.2281) loss 1.4957 (3.0229) grad_norm 17.3504 (3.5824) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][130/1251] eta 0:06:16 lr 0.000200 wd 0.0500 time 0.2258 (0.3360) data time 0.0010 (0.0097) model time 0.2248 (0.2280) loss 3.1921 (3.0170) grad_norm 3.4363 (3.5425) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][140/1251] eta 0:06:04 lr 0.000200 wd 0.0500 time 0.2287 (0.3282) data time 0.0010 (0.0091) model time 0.2278 (0.2278) loss 3.4429 (3.0044) grad_norm 10.5139 (3.6102) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][150/1251] eta 0:05:53 lr 0.000200 wd 0.0500 time 0.2213 (0.3215) data time 0.0014 (0.0086) model time 0.2199 (0.2275) loss 2.2293 (2.9966) grad_norm 2.6299 (3.5849) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][160/1251] eta 0:05:44 lr 0.000200 wd 0.0500 time 0.2318 (0.3157) data time 0.0010 (0.0081) model time 0.2307 (0.2275) loss 3.6624 (3.0023) grad_norm 3.6074 (3.5619) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][170/1251] eta 0:05:35 lr 0.000200 wd 0.0500 time 0.2265 (0.3106) data time 0.0010 (0.0077) model time 0.2255 (0.2275) loss 3.0858 (3.0014) grad_norm 3.8506 (3.5836) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][180/1251] eta 0:05:27 lr 0.000200 wd 0.0500 time 0.2362 (0.3061) data time 0.0009 (0.0073) model time 0.2353 (0.2275) 
loss 3.2646 (2.9855) grad_norm 3.3295 (3.5558) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][190/1251] eta 0:05:20 lr 0.000200 wd 0.0500 time 0.2292 (0.3020) data time 0.0010 (0.0070) model time 0.2282 (0.2274) loss 2.6418 (2.9812) grad_norm 3.3813 (3.5693) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][200/1251] eta 0:05:13 lr 0.000200 wd 0.0500 time 0.2238 (0.2982) data time 0.0009 (0.0067) model time 0.2229 (0.2273) loss 3.0929 (2.9732) grad_norm 3.6063 (3.6376) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][210/1251] eta 0:05:07 lr 0.000200 wd 0.0500 time 0.2254 (0.2950) data time 0.0012 (0.0065) model time 0.2243 (0.2274) loss 3.3548 (2.9695) grad_norm 3.5089 (3.6197) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][220/1251] eta 0:05:00 lr 0.000200 wd 0.0500 time 0.2300 (0.2919) data time 0.0007 (0.0062) model time 0.2293 (0.2273) loss 3.4199 (2.9668) grad_norm 2.8375 (3.6322) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][230/1251] eta 0:04:55 lr 0.000200 wd 0.0500 time 0.2318 (0.2892) data time 0.0009 (0.0060) model time 0.2309 (0.2274) loss 1.8880 (2.9667) grad_norm 3.3958 (3.6508) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][240/1251] eta 0:04:49 lr 0.000200 wd 0.0500 time 0.2246 (0.2866) data time 0.0008 (0.0058) model time 0.2238 (0.2273) loss 2.9874 (2.9645) grad_norm 2.7085 (3.6452) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][250/1251] eta 0:04:44 lr 
0.000200 wd 0.0500 time 0.2279 (0.2843) data time 0.0008 (0.0056) model time 0.2271 (0.2274) loss 3.2206 (2.9553) grad_norm 3.9432 (3.6646) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][260/1251] eta 0:04:39 lr 0.000200 wd 0.0500 time 0.2300 (0.2822) data time 0.0007 (0.0054) model time 0.2293 (0.2274) loss 2.7424 (2.9460) grad_norm 4.7126 (3.6521) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][270/1251] eta 0:04:34 lr 0.000200 wd 0.0500 time 0.2236 (0.2802) data time 0.0014 (0.0053) model time 0.2222 (0.2273) loss 3.2213 (2.9429) grad_norm 3.8188 (3.6670) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][280/1251] eta 0:04:30 lr 0.000200 wd 0.0500 time 0.2283 (0.2783) data time 0.0010 (0.0051) model time 0.2274 (0.2273) loss 2.9310 (2.9515) grad_norm 2.7633 (3.6705) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][290/1251] eta 0:04:26 lr 0.000200 wd 0.0500 time 0.2218 (0.2773) data time 0.0008 (0.0050) model time 0.2210 (0.2281) loss 1.4676 (2.9410) grad_norm 3.3361 (3.6494) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][300/1251] eta 0:04:22 lr 0.000200 wd 0.0500 time 0.2301 (0.2757) data time 0.0015 (0.0048) model time 0.2286 (0.2281) loss 2.8466 (2.9325) grad_norm 3.8958 (3.6555) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][310/1251] eta 0:04:18 lr 0.000200 wd 0.0500 time 0.2216 (0.2748) data time 0.0010 (0.0047) model time 0.2206 (0.2289) loss 3.2273 (2.9286) grad_norm 3.7205 (3.6805) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][320/1251] eta 0:04:14 lr 0.000200 wd 0.0500 time 0.2380 (0.2734) data time 0.0006 (0.0046) model time 0.2374 (0.2289) loss 3.2920 (2.9379) grad_norm 2.8984 (3.6697) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][330/1251] eta 0:04:10 lr 0.000200 wd 0.0500 time 0.2278 (0.2720) data time 0.0007 (0.0045) model time 0.2270 (0.2288) loss 1.6793 (2.9356) grad_norm 2.9890 (3.6564) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][340/1251] eta 0:04:06 lr 0.000200 wd 0.0500 time 0.2257 (0.2707) data time 0.0011 (0.0044) model time 0.2246 (0.2287) loss 3.0588 (2.9380) grad_norm 3.1503 (3.6401) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][350/1251] eta 0:04:02 lr 0.000200 wd 0.0500 time 0.2264 (0.2696) data time 0.0009 (0.0043) model time 0.2255 (0.2287) loss 3.0710 (2.9403) grad_norm 2.9853 (3.6201) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][360/1251] eta 0:03:59 lr 0.000199 wd 0.0500 time 0.2247 (0.2684) data time 0.0012 (0.0042) model time 0.2234 (0.2286) loss 2.4945 (2.9409) grad_norm 7.4616 (3.6750) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][370/1251] eta 0:03:55 lr 0.000199 wd 0.0500 time 0.2256 (0.2674) data time 0.0012 (0.0041) model time 0.2243 (0.2286) loss 2.2438 (2.9377) grad_norm 3.1927 (3.6835) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][380/1251] eta 0:03:51 lr 0.000199 wd 0.0500 time 0.2250 (0.2663) data time 0.0009 (0.0041) model time 0.2241 (0.2286) loss 2.1013 (2.9357) 
grad_norm 3.5583 (3.6884) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][390/1251] eta 0:03:48 lr 0.000199 wd 0.0500 time 0.2273 (0.2654) data time 0.0007 (0.0040) model time 0.2266 (0.2286) loss 3.3818 (2.9313) grad_norm 3.1687 (3.6929) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][400/1251] eta 0:03:45 lr 0.000199 wd 0.0500 time 0.2169 (0.2645) data time 0.0013 (0.0039) model time 0.2156 (0.2286) loss 3.0672 (2.9344) grad_norm 3.4771 (3.7018) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][410/1251] eta 0:03:41 lr 0.000199 wd 0.0500 time 0.2242 (0.2636) data time 0.0011 (0.0038) model time 0.2230 (0.2285) loss 3.2887 (2.9387) grad_norm 5.1061 (3.7031) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][420/1251] eta 0:03:38 lr 0.000199 wd 0.0500 time 0.2323 (0.2628) data time 0.0009 (0.0038) model time 0.2314 (0.2285) loss 3.5090 (2.9378) grad_norm 2.7044 (3.7022) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][430/1251] eta 0:03:35 lr 0.000199 wd 0.0500 time 0.2361 (0.2620) data time 0.0009 (0.0037) model time 0.2352 (0.2285) loss 3.0869 (2.9420) grad_norm 2.6263 (3.7218) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][440/1251] eta 0:03:31 lr 0.000199 wd 0.0500 time 0.2328 (0.2613) data time 0.0007 (0.0037) model time 0.2321 (0.2285) loss 3.5574 (2.9461) grad_norm 2.5165 (3.7425) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][450/1251] eta 0:03:28 lr 0.000199 wd 0.0500 time 
0.2298 (0.2606) data time 0.0010 (0.0036) model time 0.2288 (0.2285) loss 2.7026 (2.9438) grad_norm 5.0267 (3.7470) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][460/1251] eta 0:03:25 lr 0.000199 wd 0.0500 time 0.2318 (0.2599) data time 0.0010 (0.0036) model time 0.2309 (0.2285) loss 3.2421 (2.9382) grad_norm 3.6247 (3.7397) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][470/1251] eta 0:03:22 lr 0.000199 wd 0.0500 time 0.2285 (0.2592) data time 0.0009 (0.0035) model time 0.2276 (0.2284) loss 2.6398 (2.9302) grad_norm 3.2697 (3.7306) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][480/1251] eta 0:03:19 lr 0.000199 wd 0.0500 time 0.2206 (0.2585) data time 0.0008 (0.0035) model time 0.2197 (0.2283) loss 3.0106 (2.9276) grad_norm 3.3498 (3.7144) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][490/1251] eta 0:03:16 lr 0.000199 wd 0.0500 time 0.2237 (0.2579) data time 0.0015 (0.0034) model time 0.2222 (0.2283) loss 3.0221 (2.9330) grad_norm 9.7374 (3.7201) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][500/1251] eta 0:03:13 lr 0.000199 wd 0.0500 time 0.2243 (0.2573) data time 0.0011 (0.0034) model time 0.2232 (0.2283) loss 2.9419 (2.9296) grad_norm 3.5710 (3.7223) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][510/1251] eta 0:03:10 lr 0.000199 wd 0.0500 time 0.2320 (0.2567) data time 0.0006 (0.0033) model time 0.2313 (0.2283) loss 3.5186 (2.9337) grad_norm 2.5100 (3.7187) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:23 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [219/300][520/1251] eta 0:03:07 lr 0.000199 wd 0.0500 time 0.2319 (0.2562) data time 0.0007 (0.0033) model time 0.2312 (0.2283) loss 2.7401 (2.9333) grad_norm 2.6320 (3.7157) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][530/1251] eta 0:03:04 lr 0.000199 wd 0.0500 time 0.2221 (0.2556) data time 0.0009 (0.0032) model time 0.2212 (0.2282) loss 3.3641 (2.9286) grad_norm 5.3904 (3.7184) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][540/1251] eta 0:03:01 lr 0.000199 wd 0.0500 time 0.2317 (0.2552) data time 0.0011 (0.0032) model time 0.2306 (0.2282) loss 2.8734 (2.9261) grad_norm 4.2791 (3.7157) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][550/1251] eta 0:02:58 lr 0.000199 wd 0.0500 time 0.2368 (0.2547) data time 0.0010 (0.0031) model time 0.2358 (0.2282) loss 3.5130 (2.9265) grad_norm 3.5075 (3.7275) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][560/1251] eta 0:02:55 lr 0.000199 wd 0.0500 time 0.2289 (0.2543) data time 0.0018 (0.0031) model time 0.2271 (0.2283) loss 3.3926 (2.9339) grad_norm 2.5865 (3.7275) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][570/1251] eta 0:02:52 lr 0.000199 wd 0.0500 time 0.2287 (0.2539) data time 0.0012 (0.0031) model time 0.2276 (0.2284) loss 3.1607 (2.9358) grad_norm 2.8890 (3.7162) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][580/1251] eta 0:02:50 lr 0.000199 wd 0.0500 time 0.2353 (0.2535) data time 0.0007 (0.0030) model time 0.2346 (0.2284) loss 3.1097 (2.9348) grad_norm 4.4450 
(3.7137) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][590/1251] eta 0:02:47 lr 0.000199 wd 0.0500 time 0.2317 (0.2531) data time 0.0009 (0.0030) model time 0.2307 (0.2284) loss 3.2470 (2.9353) grad_norm 2.7390 (3.7063) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][600/1251] eta 0:02:44 lr 0.000199 wd 0.0500 time 0.2305 (0.2528) data time 0.0012 (0.0030) model time 0.2293 (0.2284) loss 2.0247 (2.9333) grad_norm 3.6261 (3.7031) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][610/1251] eta 0:02:41 lr 0.000199 wd 0.0500 time 0.2249 (0.2524) data time 0.0016 (0.0030) model time 0.2233 (0.2284) loss 2.8611 (2.9334) grad_norm 2.9420 (3.6982) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][620/1251] eta 0:02:39 lr 0.000199 wd 0.0500 time 0.2289 (0.2520) data time 0.0008 (0.0029) model time 0.2281 (0.2284) loss 3.1237 (2.9341) grad_norm 3.5813 (3.6992) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][630/1251] eta 0:02:36 lr 0.000199 wd 0.0500 time 0.2341 (0.2517) data time 0.0011 (0.0029) model time 0.2330 (0.2284) loss 3.3151 (2.9377) grad_norm 3.7899 (3.6990) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][640/1251] eta 0:02:33 lr 0.000198 wd 0.0500 time 0.2275 (0.2514) data time 0.0009 (0.0029) model time 0.2266 (0.2285) loss 2.0354 (2.9358) grad_norm 4.5924 (3.7039) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][650/1251] eta 0:02:30 lr 0.000198 wd 0.0500 time 0.2323 (0.2510) data 
time 0.0007 (0.0028) model time 0.2316 (0.2284) loss 3.6102 (2.9358) grad_norm 3.9136 (3.7056) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][660/1251] eta 0:02:28 lr 0.000198 wd 0.0500 time 0.2266 (0.2507) data time 0.0007 (0.0028) model time 0.2258 (0.2284) loss 1.9061 (2.9308) grad_norm 2.7748 (3.6947) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][670/1251] eta 0:02:25 lr 0.000198 wd 0.0500 time 0.2271 (0.2504) data time 0.0008 (0.0028) model time 0.2263 (0.2285) loss 3.2070 (2.9331) grad_norm 5.1831 (3.6925) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][680/1251] eta 0:02:22 lr 0.000198 wd 0.0500 time 0.2239 (0.2501) data time 0.0009 (0.0028) model time 0.2229 (0.2284) loss 3.2437 (2.9343) grad_norm 3.6446 (3.6829) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][690/1251] eta 0:02:20 lr 0.000198 wd 0.0500 time 0.2282 (0.2498) data time 0.0014 (0.0027) model time 0.2268 (0.2285) loss 2.2052 (2.9307) grad_norm 4.2090 (3.6823) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][700/1251] eta 0:02:17 lr 0.000198 wd 0.0500 time 0.2272 (0.2496) data time 0.0009 (0.0027) model time 0.2262 (0.2285) loss 3.3477 (2.9296) grad_norm 4.3588 (3.6908) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][710/1251] eta 0:02:14 lr 0.000198 wd 0.0500 time 0.2284 (0.2493) data time 0.0007 (0.0027) model time 0.2277 (0.2285) loss 1.9361 (2.9292) grad_norm 3.1324 (3.6883) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [219/300][720/1251] eta 0:02:12 lr 0.000198 wd 0.0500 time 0.2218 (0.2490) data time 0.0008 (0.0027) model time 0.2209 (0.2285) loss 3.2780 (2.9269) grad_norm 3.2154 (3.6874) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][730/1251] eta 0:02:09 lr 0.000198 wd 0.0500 time 0.2277 (0.2487) data time 0.0012 (0.0026) model time 0.2265 (0.2285) loss 2.8942 (2.9282) grad_norm 5.5506 (3.6801) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][740/1251] eta 0:02:06 lr 0.000198 wd 0.0500 time 0.2333 (0.2484) data time 0.0012 (0.0026) model time 0.2321 (0.2285) loss 3.1429 (2.9311) grad_norm 3.0213 (3.6778) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][750/1251] eta 0:02:04 lr 0.000198 wd 0.0500 time 0.2288 (0.2481) data time 0.0009 (0.0026) model time 0.2279 (0.2284) loss 2.9853 (2.9284) grad_norm 4.9337 (3.6770) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][760/1251] eta 0:02:01 lr 0.000198 wd 0.0500 time 0.2314 (0.2479) data time 0.0011 (0.0026) model time 0.2303 (0.2284) loss 3.0634 (2.9294) grad_norm 5.6133 (3.6825) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][770/1251] eta 0:01:59 lr 0.000198 wd 0.0500 time 0.2224 (0.2476) data time 0.0010 (0.0026) model time 0.2214 (0.2284) loss 3.5924 (2.9302) grad_norm 3.6289 (3.6957) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][780/1251] eta 0:01:56 lr 0.000198 wd 0.0500 time 0.2292 (0.2474) data time 0.0007 (0.0025) model time 0.2285 (0.2284) loss 3.5625 (2.9314) grad_norm 2.8957 (3.6937) loss_scale 256.0000 
(256.0000) mem 7377MB [2024-08-30 06:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][790/1251] eta 0:01:53 lr 0.000198 wd 0.0500 time 0.2280 (0.2472) data time 0.0009 (0.0025) model time 0.2271 (0.2284) loss 2.6068 (2.9309) grad_norm 3.8365 (3.7201) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][800/1251] eta 0:01:51 lr 0.000198 wd 0.0500 time 0.2342 (0.2470) data time 0.0007 (0.0025) model time 0.2335 (0.2284) loss 3.2272 (2.9313) grad_norm 4.9936 (3.7279) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][810/1251] eta 0:01:48 lr 0.000198 wd 0.0500 time 0.2272 (0.2467) data time 0.0007 (0.0025) model time 0.2265 (0.2284) loss 2.3433 (2.9250) grad_norm 3.0066 (3.7232) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][820/1251] eta 0:01:46 lr 0.000198 wd 0.0500 time 0.2270 (0.2468) data time 0.0006 (0.0025) model time 0.2264 (0.2286) loss 3.1814 (2.9264) grad_norm 3.7513 (3.7175) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][830/1251] eta 0:01:43 lr 0.000198 wd 0.0500 time 0.2353 (0.2467) data time 0.0009 (0.0025) model time 0.2344 (0.2288) loss 1.9062 (2.9227) grad_norm 4.1264 (3.7202) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][840/1251] eta 0:01:41 lr 0.000198 wd 0.0500 time 0.2408 (0.2465) data time 0.0008 (0.0024) model time 0.2400 (0.2288) loss 3.6102 (2.9233) grad_norm 5.2714 (3.7198) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][850/1251] eta 0:01:38 lr 0.000198 wd 0.0500 time 0.2294 (0.2463) data time 0.0007 (0.0024) model 
time 0.2287 (0.2288) loss 3.0613 (2.9225) grad_norm 2.4154 (3.7209) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][860/1251] eta 0:01:36 lr 0.000198 wd 0.0500 time 0.2318 (0.2461) data time 0.0009 (0.0024) model time 0.2308 (0.2288) loss 3.2480 (2.9226) grad_norm 3.5117 (3.7161) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][870/1251] eta 0:01:33 lr 0.000198 wd 0.0500 time 0.2359 (0.2459) data time 0.0007 (0.0024) model time 0.2352 (0.2288) loss 2.8647 (2.9235) grad_norm 2.9217 (3.7162) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][880/1251] eta 0:01:31 lr 0.000198 wd 0.0500 time 0.2455 (0.2458) data time 0.0008 (0.0024) model time 0.2447 (0.2288) loss 3.4770 (2.9214) grad_norm 5.3036 (3.7148) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][890/1251] eta 0:01:28 lr 0.000198 wd 0.0500 time 0.2307 (0.2456) data time 0.0009 (0.0024) model time 0.2298 (0.2288) loss 2.4330 (2.9194) grad_norm 2.7007 (3.7273) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][900/1251] eta 0:01:26 lr 0.000198 wd 0.0500 time 0.2304 (0.2454) data time 0.0010 (0.0024) model time 0.2294 (0.2288) loss 3.3512 (2.9194) grad_norm 3.2929 (3.7233) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][910/1251] eta 0:01:23 lr 0.000198 wd 0.0500 time 0.2238 (0.2452) data time 0.0009 (0.0023) model time 0.2229 (0.2288) loss 2.1505 (2.9191) grad_norm 3.6133 (3.7323) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][920/1251] 
eta 0:01:21 lr 0.000198 wd 0.0500 time 0.2295 (0.2450) data time 0.0013 (0.0023) model time 0.2282 (0.2288) loss 3.4293 (2.9203) grad_norm 3.8444 (3.7328) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][930/1251] eta 0:01:18 lr 0.000197 wd 0.0500 time 0.2308 (0.2449) data time 0.0012 (0.0023) model time 0.2296 (0.2288) loss 3.4966 (2.9245) grad_norm 2.7832 (3.7308) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][940/1251] eta 0:01:16 lr 0.000197 wd 0.0500 time 0.2287 (0.2447) data time 0.0007 (0.0023) model time 0.2280 (0.2288) loss 3.0255 (2.9235) grad_norm 3.0034 (3.7314) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][950/1251] eta 0:01:13 lr 0.000197 wd 0.0500 time 0.2336 (0.2445) data time 0.0009 (0.0023) model time 0.2327 (0.2288) loss 2.1158 (2.9182) grad_norm 2.7934 (3.7294) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][960/1251] eta 0:01:11 lr 0.000197 wd 0.0500 time 0.2416 (0.2444) data time 0.0009 (0.0023) model time 0.2407 (0.2288) loss 3.1696 (2.9182) grad_norm 5.6924 (3.7277) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][970/1251] eta 0:01:08 lr 0.000197 wd 0.0500 time 0.2265 (0.2442) data time 0.0010 (0.0023) model time 0.2255 (0.2287) loss 2.6920 (2.9195) grad_norm 3.3696 (3.7228) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][980/1251] eta 0:01:06 lr 0.000197 wd 0.0500 time 0.2259 (0.2441) data time 0.0015 (0.0022) model time 0.2244 (0.2287) loss 3.2741 (2.9208) grad_norm 3.5819 (3.7182) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 
06:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][990/1251] eta 0:01:03 lr 0.000197 wd 0.0500 time 0.2299 (0.2439) data time 0.0010 (0.0022) model time 0.2288 (0.2287) loss 3.2026 (2.9206) grad_norm 3.2816 (3.7161) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1000/1251] eta 0:01:01 lr 0.000197 wd 0.0500 time 0.2439 (0.2437) data time 0.0009 (0.0022) model time 0.2430 (0.2287) loss 2.8247 (2.9199) grad_norm 3.6052 (3.7108) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1010/1251] eta 0:00:58 lr 0.000197 wd 0.0500 time 0.2209 (0.2436) data time 0.0012 (0.0022) model time 0.2197 (0.2287) loss 2.9329 (2.9212) grad_norm 5.0177 (3.7158) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1020/1251] eta 0:00:56 lr 0.000197 wd 0.0500 time 0.2223 (0.2434) data time 0.0008 (0.0022) model time 0.2215 (0.2286) loss 3.3333 (2.9218) grad_norm 3.5367 (3.7177) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1030/1251] eta 0:00:53 lr 0.000197 wd 0.0500 time 0.2205 (0.2433) data time 0.0012 (0.0022) model time 0.2193 (0.2286) loss 3.0546 (2.9191) grad_norm 3.6376 (3.7167) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1040/1251] eta 0:00:51 lr 0.000197 wd 0.0500 time 0.2275 (0.2432) data time 0.0008 (0.0022) model time 0.2267 (0.2287) loss 3.1779 (2.9208) grad_norm 2.5530 (3.7126) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1050/1251] eta 0:00:48 lr 0.000197 wd 0.0500 time 0.2252 (0.2430) data time 0.0009 (0.0022) model time 0.2242 (0.2286) loss 
2.1713 (2.9204) grad_norm 5.4235 (3.7105) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1060/1251] eta 0:00:46 lr 0.000197 wd 0.0500 time 0.2273 (0.2429) data time 0.0007 (0.0022) model time 0.2266 (0.2286) loss 2.6524 (2.9225) grad_norm 3.2518 (3.7217) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1070/1251] eta 0:00:43 lr 0.000197 wd 0.0500 time 0.2299 (0.2427) data time 0.0009 (0.0021) model time 0.2290 (0.2286) loss 3.3527 (2.9224) grad_norm 2.5525 (3.7170) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1080/1251] eta 0:00:41 lr 0.000197 wd 0.0500 time 0.2334 (0.2426) data time 0.0009 (0.0021) model time 0.2325 (0.2286) loss 1.8739 (2.9218) grad_norm 3.6529 (3.7127) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1090/1251] eta 0:00:39 lr 0.000197 wd 0.0500 time 0.2294 (0.2425) data time 0.0009 (0.0021) model time 0.2285 (0.2286) loss 3.2793 (2.9215) grad_norm 4.4081 (3.7217) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1100/1251] eta 0:00:36 lr 0.000197 wd 0.0500 time 0.2319 (0.2424) data time 0.0011 (0.0021) model time 0.2308 (0.2286) loss 3.1695 (2.9205) grad_norm 6.7599 (3.7255) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1110/1251] eta 0:00:34 lr 0.000197 wd 0.0500 time 0.2314 (0.2422) data time 0.0010 (0.0021) model time 0.2304 (0.2285) loss 3.3730 (2.9204) grad_norm 3.6495 (3.7285) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1120/1251] eta 0:00:31 lr 
0.000197 wd 0.0500 time 0.2319 (0.2421) data time 0.0010 (0.0021) model time 0.2309 (0.2286) loss 2.3294 (2.9217) grad_norm 2.9165 (3.7389) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1130/1251] eta 0:00:29 lr 0.000197 wd 0.0500 time 0.2274 (0.2420) data time 0.0007 (0.0021) model time 0.2267 (0.2286) loss 3.3801 (2.9225) grad_norm 6.1004 (3.7656) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1140/1251] eta 0:00:26 lr 0.000197 wd 0.0500 time 0.2254 (0.2419) data time 0.0011 (0.0021) model time 0.2243 (0.2286) loss 2.0403 (2.9213) grad_norm 3.1914 (3.7625) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1150/1251] eta 0:00:24 lr 0.000197 wd 0.0500 time 0.2313 (0.2418) data time 0.0009 (0.0021) model time 0.2304 (0.2286) loss 3.1748 (2.9223) grad_norm 2.6744 (3.7575) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1160/1251] eta 0:00:21 lr 0.000197 wd 0.0500 time 0.2275 (0.2417) data time 0.0008 (0.0021) model time 0.2267 (0.2285) loss 3.3208 (2.9220) grad_norm 4.2649 (3.7538) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1170/1251] eta 0:00:19 lr 0.000197 wd 0.0500 time 0.2499 (0.2416) data time 0.0009 (0.0021) model time 0.2489 (0.2285) loss 2.7327 (2.9218) grad_norm 3.4689 (3.7502) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1180/1251] eta 0:00:17 lr 0.000197 wd 0.0500 time 0.2397 (0.2415) data time 0.0011 (0.0021) model time 0.2386 (0.2285) loss 2.5365 (2.9243) grad_norm 3.2620 (3.7515) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:57 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1190/1251] eta 0:00:14 lr 0.000197 wd 0.0500 time 0.2434 (0.2414) data time 0.0010 (0.0020) model time 0.2424 (0.2286) loss 2.2704 (2.9254) grad_norm 3.6276 (3.7494) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1200/1251] eta 0:00:12 lr 0.000197 wd 0.0500 time 0.2326 (0.2413) data time 0.0009 (0.0020) model time 0.2318 (0.2285) loss 3.3709 (2.9271) grad_norm 3.6665 (3.7455) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1210/1251] eta 0:00:09 lr 0.000197 wd 0.0500 time 0.2344 (0.2412) data time 0.0007 (0.0020) model time 0.2337 (0.2285) loss 2.4095 (2.9265) grad_norm 4.2322 (3.7431) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1220/1251] eta 0:00:07 lr 0.000196 wd 0.0500 time 0.2378 (0.2411) data time 0.0007 (0.0020) model time 0.2371 (0.2286) loss 2.4876 (2.9262) grad_norm 4.0237 (3.7431) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1230/1251] eta 0:00:05 lr 0.000196 wd 0.0500 time 0.2261 (0.2410) data time 0.0009 (0.0020) model time 0.2253 (0.2285) loss 3.4330 (2.9262) grad_norm 4.0693 (3.7475) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1240/1251] eta 0:00:02 lr 0.000196 wd 0.0500 time 0.2105 (0.2408) data time 0.0005 (0.0020) model time 0.2101 (0.2285) loss 3.5603 (2.9259) grad_norm 4.2826 (3.7618) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [219/300][1250/1251] eta 0:00:00 lr 0.000196 wd 0.0500 time 0.2116 (0.2406) data time 0.0005 (0.0020) model time 0.2111 (0.2283) loss 3.0690 
(2.9240) grad_norm 4.3547 (3.7683) loss_scale 256.0000 (256.0000) mem 7377MB [2024-08-30 06:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 219 training takes 0:05:00 [2024-08-30 06:06:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 06:06:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 06:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.444 (0.444) Loss 0.4365 (0.4365) Acc@1 91.895 (91.895) Acc@5 98.145 (98.145) Mem 7377MB [2024-08-30 06:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.115) Loss 0.6162 (0.6518) Acc@1 87.402 (86.328) Acc@5 97.852 (97.505) Mem 7377MB [2024-08-30 06:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.100) Loss 0.9766 (0.6780) Acc@1 76.465 (85.366) Acc@5 94.922 (97.433) Mem 7377MB [2024-08-30 06:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.093) Loss 1.1807 (0.7723) Acc@1 72.070 (83.068) Acc@5 92.188 (96.330) Mem 7377MB [2024-08-30 06:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.0898 (0.8225) Acc@1 74.414 (81.915) Acc@5 93.457 (95.822) Mem 7377MB [2024-08-30 06:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.380 Acc@5 95.760 [2024-08-30 06:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.4% [2024-08-30 06:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.011 (1.011) Loss 0.3765 (0.3765) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7377MB [2024-08-30 06:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.172) Loss 0.5859 (0.6033) Acc@1 89.062 (87.473) Acc@5 97.852 (97.745) Mem 7377MB 
[2024-08-30 06:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.129) Loss 0.8652 (0.6300) Acc@1 78.516 (86.412) Acc@5 96.191 (97.726) Mem 7377MB [2024-08-30 06:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.113) Loss 1.0889 (0.7138) Acc@1 73.828 (84.334) Acc@5 93.164 (96.834) Mem 7377MB [2024-08-30 06:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.102) Loss 0.9756 (0.7571) Acc@1 77.246 (83.244) Acc@5 94.824 (96.363) Mem 7377MB [2024-08-30 06:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.802 Acc@5 96.322 [2024-08-30 06:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 06:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.80% [2024-08-30 06:06:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 06:06:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 06:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][0/1251] eta 0:16:46 lr 0.000196 wd 0.0500 time 0.8043 (0.8043) data time 0.4768 (0.4768) model time 0.0000 (0.0000) loss 2.7598 (2.7598) grad_norm 4.1253 (4.1253) loss_scale 256.0000 (256.0000) mem 7380MB [2024-08-30 06:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][10/1251] eta 0:05:47 lr 0.000196 wd 0.0500 time 0.2343 (0.2796) data time 0.0009 (0.0444) model time 0.0000 (0.0000) loss 3.1485 (3.0167) grad_norm 4.0808 (3.8923) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][20/1251] eta 0:05:13 lr 0.000196 wd 0.0500 time 0.2247 (0.2547) data time 0.0020 (0.0238) model time 0.0000 (0.0000) loss 2.8323 (2.9982) grad_norm 2.9081 (3.6667) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][30/1251] eta 0:05:00 lr 0.000196 wd 0.0500 time 0.2302 (0.2465) data time 0.0014 (0.0165) model time 0.0000 (0.0000) loss 3.1338 (3.0811) grad_norm 4.7941 (3.5745) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][40/1251] eta 0:04:53 lr 0.000196 wd 0.0500 time 0.2247 (0.2423) data time 0.0009 (0.0128) model time 0.0000 (0.0000) loss 2.6300 (3.0614) grad_norm 5.4321 (3.6200) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][50/1251] eta 0:04:47 lr 0.000196 wd 0.0500 time 0.2266 (0.2396) data time 0.0012 (0.0105) model time 0.0000 (0.0000) loss 2.3976 (3.0506) grad_norm 2.7668 (3.9866) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][60/1251] eta 0:04:43 lr 0.000196 wd 0.0500 time 0.2273 (0.2379) data time 0.0009 (0.0089) model time 0.2263 (0.2284) loss 
3.2274 (3.0102) grad_norm 3.6208 (4.0054) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][70/1251] eta 0:04:42 lr 0.000196 wd 0.0500 time 0.2288 (0.2393) data time 0.0007 (0.0078) model time 0.2281 (0.2374) loss 2.6576 (2.9429) grad_norm 4.4773 (3.9694) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][80/1251] eta 0:04:39 lr 0.000196 wd 0.0500 time 0.2408 (0.2384) data time 0.0007 (0.0070) model time 0.2401 (0.2354) loss 3.0968 (2.9440) grad_norm 3.1712 (3.9020) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][90/1251] eta 0:04:35 lr 0.000196 wd 0.0500 time 0.2241 (0.2374) data time 0.0010 (0.0064) model time 0.2232 (0.2334) loss 2.0600 (2.9406) grad_norm 2.6080 (3.8201) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][100/1251] eta 0:04:32 lr 0.000196 wd 0.0500 time 0.2300 (0.2366) data time 0.0009 (0.0059) model time 0.2291 (0.2324) loss 3.2151 (2.9563) grad_norm 3.4985 (3.8709) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][110/1251] eta 0:04:29 lr 0.000196 wd 0.0500 time 0.2372 (0.2359) data time 0.0014 (0.0054) model time 0.2357 (0.2316) loss 2.8768 (2.9711) grad_norm 3.3308 (3.8297) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][120/1251] eta 0:04:26 lr 0.000196 wd 0.0500 time 0.2266 (0.2355) data time 0.0009 (0.0051) model time 0.2257 (0.2313) loss 3.1563 (2.9839) grad_norm 3.2121 (3.8037) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][130/1251] eta 0:04:23 lr 0.000196 wd 
0.0500 time 0.2246 (0.2350) data time 0.0013 (0.0048) model time 0.2234 (0.2309) loss 2.6298 (2.9915) grad_norm 3.5615 (3.9608) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][140/1251] eta 0:04:20 lr 0.000196 wd 0.0500 time 0.2290 (0.2346) data time 0.0011 (0.0045) model time 0.2279 (0.2306) loss 2.0163 (2.9788) grad_norm 2.6593 (3.8935) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][150/1251] eta 0:04:17 lr 0.000196 wd 0.0500 time 0.2303 (0.2342) data time 0.0007 (0.0043) model time 0.2296 (0.2304) loss 3.5649 (2.9764) grad_norm 3.4354 (3.8913) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][160/1251] eta 0:04:15 lr 0.000196 wd 0.0500 time 0.2304 (0.2340) data time 0.0007 (0.0041) model time 0.2297 (0.2302) loss 2.5061 (2.9824) grad_norm 5.3896 (3.8998) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][170/1251] eta 0:04:12 lr 0.000196 wd 0.0500 time 0.2320 (0.2338) data time 0.0009 (0.0039) model time 0.2311 (0.2303) loss 3.3282 (2.9762) grad_norm 2.8045 (3.8437) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][180/1251] eta 0:04:10 lr 0.000196 wd 0.0500 time 0.2272 (0.2335) data time 0.0013 (0.0038) model time 0.2259 (0.2300) loss 3.2155 (2.9710) grad_norm 3.4938 (3.8302) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][190/1251] eta 0:04:07 lr 0.000196 wd 0.0500 time 0.2224 (0.2334) data time 0.0012 (0.0036) model time 0.2212 (0.2300) loss 3.1619 (2.9649) grad_norm 2.7910 (3.8117) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [220/300][200/1251] eta 0:04:05 lr 0.000196 wd 0.0500 time 0.2272 (0.2332) data time 0.0010 (0.0035) model time 0.2262 (0.2298) loss 3.1866 (2.9718) grad_norm 3.9959 (3.8005) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][210/1251] eta 0:04:02 lr 0.000196 wd 0.0500 time 0.2501 (0.2331) data time 0.0011 (0.0034) model time 0.2490 (0.2298) loss 3.4528 (2.9748) grad_norm 5.7974 (3.8889) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][220/1251] eta 0:04:00 lr 0.000196 wd 0.0500 time 0.2404 (0.2330) data time 0.0009 (0.0033) model time 0.2395 (0.2299) loss 3.5649 (2.9658) grad_norm 2.6991 (3.8671) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][230/1251] eta 0:03:57 lr 0.000196 wd 0.0500 time 0.2400 (0.2329) data time 0.0006 (0.0032) model time 0.2394 (0.2299) loss 2.2163 (2.9482) grad_norm 5.3015 (3.8673) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][240/1251] eta 0:03:55 lr 0.000196 wd 0.0500 time 0.2382 (0.2329) data time 0.0009 (0.0031) model time 0.2373 (0.2299) loss 3.0478 (2.9364) grad_norm 3.5061 (3.8325) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][250/1251] eta 0:03:52 lr 0.000196 wd 0.0500 time 0.2272 (0.2327) data time 0.0010 (0.0030) model time 0.2262 (0.2298) loss 2.7152 (2.9335) grad_norm 3.8491 (3.8423) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][260/1251] eta 0:03:50 lr 0.000195 wd 0.0500 time 0.2286 (0.2327) data time 0.0009 (0.0030) model time 0.2277 (0.2299) loss 3.6641 (2.9419) grad_norm 3.5380 
(3.8436) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][270/1251] eta 0:03:48 lr 0.000195 wd 0.0500 time 0.2297 (0.2325) data time 0.0007 (0.0029) model time 0.2290 (0.2297) loss 2.9926 (2.9459) grad_norm 3.2039 (3.8355) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][280/1251] eta 0:03:46 lr 0.000195 wd 0.0500 time 0.2382 (0.2332) data time 0.0010 (0.0028) model time 0.2372 (0.2307) loss 3.2961 (2.9412) grad_norm 4.7924 (3.8230) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][290/1251] eta 0:03:43 lr 0.000195 wd 0.0500 time 0.2301 (0.2330) data time 0.0009 (0.0028) model time 0.2292 (0.2305) loss 2.9469 (2.9409) grad_norm 3.5439 (3.8004) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][300/1251] eta 0:03:41 lr 0.000195 wd 0.0500 time 0.2288 (0.2330) data time 0.0014 (0.0027) model time 0.2274 (0.2305) loss 2.4440 (2.9319) grad_norm 3.8131 (3.8016) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][310/1251] eta 0:03:39 lr 0.000195 wd 0.0500 time 0.2322 (0.2329) data time 0.0009 (0.0027) model time 0.2313 (0.2305) loss 1.8649 (2.9273) grad_norm 3.9835 (3.8101) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][320/1251] eta 0:03:36 lr 0.000195 wd 0.0500 time 0.2391 (0.2329) data time 0.0007 (0.0026) model time 0.2384 (0.2305) loss 3.3964 (2.9250) grad_norm 4.3192 (3.8334) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][330/1251] eta 0:03:34 lr 0.000195 wd 0.0500 time 0.2268 (0.2328) data 
time 0.0007 (0.0026) model time 0.2261 (0.2304) loss 3.2079 (2.9221) grad_norm 5.2715 (3.8284) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][340/1251] eta 0:03:31 lr 0.000195 wd 0.0500 time 0.2307 (0.2327) data time 0.0012 (0.0025) model time 0.2295 (0.2303) loss 3.1229 (2.9258) grad_norm 3.8145 (3.8143) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][350/1251] eta 0:03:29 lr 0.000195 wd 0.0500 time 0.2335 (0.2326) data time 0.0009 (0.0025) model time 0.2326 (0.2303) loss 3.0739 (2.9337) grad_norm 4.0982 (3.8026) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][360/1251] eta 0:03:27 lr 0.000195 wd 0.0500 time 0.2343 (0.2325) data time 0.0010 (0.0025) model time 0.2333 (0.2302) loss 3.3814 (2.9383) grad_norm 4.5224 (3.8060) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][370/1251] eta 0:03:24 lr 0.000195 wd 0.0500 time 0.2292 (0.2325) data time 0.0012 (0.0024) model time 0.2280 (0.2303) loss 2.9449 (2.9347) grad_norm 2.6925 (3.7924) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][380/1251] eta 0:03:22 lr 0.000195 wd 0.0500 time 0.2283 (0.2325) data time 0.0011 (0.0024) model time 0.2271 (0.2302) loss 2.8576 (2.9344) grad_norm 6.8193 (3.8085) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][390/1251] eta 0:03:20 lr 0.000195 wd 0.0500 time 0.2285 (0.2323) data time 0.0009 (0.0024) model time 0.2277 (0.2301) loss 2.6428 (2.9353) grad_norm 2.9672 (3.7962) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [220/300][400/1251] eta 0:03:17 lr 0.000195 wd 0.0500 time 0.2231 (0.2322) data time 0.0009 (0.0023) model time 0.2223 (0.2301) loss 1.9904 (2.9220) grad_norm 3.1205 (3.7873) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][410/1251] eta 0:03:15 lr 0.000195 wd 0.0500 time 0.2347 (0.2323) data time 0.0009 (0.0023) model time 0.2338 (0.2301) loss 3.0470 (2.9175) grad_norm 6.4409 (3.7953) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][420/1251] eta 0:03:13 lr 0.000195 wd 0.0500 time 0.2289 (0.2323) data time 0.0011 (0.0023) model time 0.2278 (0.2302) loss 1.7320 (2.9187) grad_norm 4.2413 (3.8050) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][430/1251] eta 0:03:10 lr 0.000195 wd 0.0500 time 0.2265 (0.2322) data time 0.0010 (0.0022) model time 0.2255 (0.2301) loss 2.2503 (2.9196) grad_norm 4.0640 (3.8048) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][440/1251] eta 0:03:08 lr 0.000195 wd 0.0500 time 0.2190 (0.2321) data time 0.0008 (0.0022) model time 0.2182 (0.2300) loss 3.2531 (2.9239) grad_norm 3.5995 (3.8111) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][450/1251] eta 0:03:05 lr 0.000195 wd 0.0500 time 0.2333 (0.2321) data time 0.0012 (0.0022) model time 0.2322 (0.2300) loss 3.3092 (2.9310) grad_norm 2.7899 (3.8155) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][460/1251] eta 0:03:03 lr 0.000195 wd 0.0500 time 0.2218 (0.2320) data time 0.0010 (0.0022) model time 0.2208 (0.2300) loss 3.1039 (2.9299) grad_norm 3.3269 (3.8115) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-08-30 06:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][470/1251] eta 0:03:01 lr 0.000195 wd 0.0500 time 0.2277 (0.2320) data time 0.0009 (0.0021) model time 0.2267 (0.2299) loss 3.3172 (2.9270) grad_norm 6.5610 (3.8183) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][480/1251] eta 0:02:58 lr 0.000195 wd 0.0500 time 0.2252 (0.2319) data time 0.0007 (0.0021) model time 0.2244 (0.2298) loss 3.3373 (2.9300) grad_norm 3.6080 (3.8239) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][490/1251] eta 0:02:56 lr 0.000195 wd 0.0500 time 0.2375 (0.2319) data time 0.0010 (0.0021) model time 0.2365 (0.2299) loss 3.1628 (2.9286) grad_norm 3.8671 (3.8191) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][500/1251] eta 0:02:54 lr 0.000195 wd 0.0500 time 0.2297 (0.2319) data time 0.0015 (0.0021) model time 0.2282 (0.2299) loss 2.9935 (2.9245) grad_norm 2.4893 (3.8040) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][510/1251] eta 0:02:51 lr 0.000195 wd 0.0500 time 0.2288 (0.2319) data time 0.0013 (0.0021) model time 0.2275 (0.2299) loss 3.2461 (2.9224) grad_norm 2.8858 (3.8005) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][520/1251] eta 0:02:49 lr 0.000195 wd 0.0500 time 0.2301 (0.2318) data time 0.0009 (0.0021) model time 0.2292 (0.2298) loss 2.0224 (2.9211) grad_norm 3.6886 (3.7879) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][530/1251] eta 0:02:47 lr 0.000195 wd 0.0500 time 0.2251 (0.2318) data time 0.0011 (0.0020) model 
time 0.2241 (0.2298) loss 2.9180 (2.9224) grad_norm 2.7110 (3.7830) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][540/1251] eta 0:02:44 lr 0.000195 wd 0.0500 time 0.2220 (0.2318) data time 0.0011 (0.0020) model time 0.2209 (0.2298) loss 3.0848 (2.9203) grad_norm 4.1225 (3.7771) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][550/1251] eta 0:02:42 lr 0.000194 wd 0.0500 time 0.2251 (0.2317) data time 0.0012 (0.0020) model time 0.2239 (0.2298) loss 3.1636 (2.9240) grad_norm 5.8858 (3.8050) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][560/1251] eta 0:02:40 lr 0.000194 wd 0.0500 time 0.2246 (0.2317) data time 0.0006 (0.0020) model time 0.2240 (0.2297) loss 3.3867 (2.9220) grad_norm 3.1205 (3.8060) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][570/1251] eta 0:02:37 lr 0.000194 wd 0.0500 time 0.2325 (0.2316) data time 0.0010 (0.0020) model time 0.2314 (0.2297) loss 2.2565 (2.9214) grad_norm 3.5348 (3.8027) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][580/1251] eta 0:02:35 lr 0.000194 wd 0.0500 time 0.2306 (0.2316) data time 0.0010 (0.0020) model time 0.2296 (0.2297) loss 3.1410 (2.9255) grad_norm 2.7866 (3.8051) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][590/1251] eta 0:02:33 lr 0.000194 wd 0.0500 time 0.2242 (0.2316) data time 0.0008 (0.0019) model time 0.2234 (0.2297) loss 2.1182 (2.9225) grad_norm 3.7296 (3.7963) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][600/1251] 
eta 0:02:30 lr 0.000194 wd 0.0500 time 0.2298 (0.2315) data time 0.0006 (0.0019) model time 0.2292 (0.2296) loss 2.4720 (2.9201) grad_norm 3.2016 (3.7841) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][610/1251] eta 0:02:28 lr 0.000194 wd 0.0500 time 0.2404 (0.2318) data time 0.0013 (0.0019) model time 0.2391 (0.2299) loss 3.1610 (2.9223) grad_norm 3.1284 (3.7851) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 06:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][620/1251] eta 0:02:26 lr 0.000194 wd 0.0500 time 0.2284 (0.2317) data time 0.0007 (0.0019) model time 0.2278 (0.2299) loss 2.7544 (2.9183) grad_norm 2.4587 (3.7806) loss_scale 512.0000 (257.2367) mem 7381MB [2024-08-30 06:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][630/1251] eta 0:02:23 lr 0.000194 wd 0.0500 time 0.2339 (0.2317) data time 0.0009 (0.0019) model time 0.2330 (0.2299) loss 2.9405 (2.9217) grad_norm 3.7407 (3.7713) loss_scale 512.0000 (261.2742) mem 7381MB [2024-08-30 06:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][640/1251] eta 0:02:21 lr 0.000194 wd 0.0500 time 0.2298 (0.2317) data time 0.0008 (0.0019) model time 0.2289 (0.2299) loss 2.3167 (2.9177) grad_norm 4.1404 (3.7683) loss_scale 512.0000 (265.1856) mem 7381MB [2024-08-30 06:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][650/1251] eta 0:02:19 lr 0.000194 wd 0.0500 time 0.2226 (0.2317) data time 0.0018 (0.0019) model time 0.2208 (0.2299) loss 2.8495 (2.9174) grad_norm 3.3314 (3.7637) loss_scale 512.0000 (268.9770) mem 7381MB [2024-08-30 06:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][660/1251] eta 0:02:16 lr 0.000194 wd 0.0500 time 0.2239 (0.2316) data time 0.0012 (0.0019) model time 0.2227 (0.2298) loss 3.0846 (2.9191) grad_norm 2.7599 (3.7738) loss_scale 512.0000 (272.6536) mem 7381MB [2024-08-30 
06:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][670/1251] eta 0:02:14 lr 0.000194 wd 0.0500 time 0.2415 (0.2317) data time 0.0013 (0.0018) model time 0.2402 (0.2299) loss 3.1542 (2.9121) grad_norm 5.8903 (3.7798) loss_scale 512.0000 (276.2206) mem 7381MB [2024-08-30 06:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][680/1251] eta 0:02:12 lr 0.000194 wd 0.0500 time 0.2334 (0.2316) data time 0.0007 (0.0018) model time 0.2327 (0.2299) loss 2.3929 (2.9093) grad_norm 2.4887 (3.7681) loss_scale 512.0000 (279.6828) mem 7381MB [2024-08-30 06:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][690/1251] eta 0:02:09 lr 0.000194 wd 0.0500 time 0.2202 (0.2316) data time 0.0011 (0.0018) model time 0.2190 (0.2298) loss 3.4805 (2.9098) grad_norm 4.0291 (3.7700) loss_scale 512.0000 (283.0449) mem 7381MB [2024-08-30 06:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][700/1251] eta 0:02:07 lr 0.000194 wd 0.0500 time 0.2323 (0.2315) data time 0.0009 (0.0018) model time 0.2314 (0.2298) loss 2.7695 (2.9073) grad_norm 2.5916 (3.7606) loss_scale 512.0000 (286.3110) mem 7381MB [2024-08-30 06:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][710/1251] eta 0:02:05 lr 0.000194 wd 0.0500 time 0.2224 (0.2315) data time 0.0008 (0.0018) model time 0.2216 (0.2298) loss 3.5343 (2.9078) grad_norm 3.3131 (3.7547) loss_scale 512.0000 (289.4852) mem 7381MB [2024-08-30 06:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][720/1251] eta 0:02:02 lr 0.000194 wd 0.0500 time 0.2337 (0.2315) data time 0.0014 (0.0018) model time 0.2323 (0.2298) loss 3.1457 (2.9043) grad_norm 3.2800 (3.7481) loss_scale 512.0000 (292.5714) mem 7381MB [2024-08-30 06:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][730/1251] eta 0:02:00 lr 0.000194 wd 0.0500 time 0.2233 (0.2315) data time 0.0015 (0.0018) model time 0.2218 (0.2297) loss 2.2891 
(2.9024) grad_norm 3.2318 (3.7529) loss_scale 512.0000 (295.5732) mem 7381MB [2024-08-30 06:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][740/1251] eta 0:01:58 lr 0.000194 wd 0.0500 time 0.2249 (0.2314) data time 0.0009 (0.0018) model time 0.2239 (0.2297) loss 2.8430 (2.9026) grad_norm 2.6332 (3.7484) loss_scale 512.0000 (298.4939) mem 7381MB [2024-08-30 06:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][750/1251] eta 0:01:55 lr 0.000194 wd 0.0500 time 0.2264 (0.2314) data time 0.0010 (0.0018) model time 0.2255 (0.2297) loss 2.3519 (2.8998) grad_norm 3.2898 (3.7463) loss_scale 512.0000 (301.3369) mem 7381MB [2024-08-30 06:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][760/1251] eta 0:01:53 lr 0.000194 wd 0.0500 time 0.2330 (0.2314) data time 0.0008 (0.0018) model time 0.2321 (0.2297) loss 3.0132 (2.9042) grad_norm 3.3731 (3.7452) loss_scale 512.0000 (304.1051) mem 7381MB [2024-08-30 06:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][770/1251] eta 0:01:51 lr 0.000194 wd 0.0500 time 0.2251 (0.2314) data time 0.0012 (0.0017) model time 0.2240 (0.2297) loss 3.0297 (2.9039) grad_norm 3.1471 (3.7361) loss_scale 512.0000 (306.8016) mem 7381MB [2024-08-30 06:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][780/1251] eta 0:01:48 lr 0.000194 wd 0.0500 time 0.2260 (0.2314) data time 0.0011 (0.0017) model time 0.2249 (0.2297) loss 3.1723 (2.9067) grad_norm 9.3742 (3.7457) loss_scale 512.0000 (309.4289) mem 7381MB [2024-08-30 06:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][790/1251] eta 0:01:46 lr 0.000194 wd 0.0500 time 0.2296 (0.2313) data time 0.0010 (0.0017) model time 0.2286 (0.2296) loss 2.6442 (2.9065) grad_norm 3.3483 (3.7530) loss_scale 512.0000 (311.9899) mem 7381MB [2024-08-30 06:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][800/1251] eta 0:01:44 lr 0.000194 wd 0.0500 
time 0.2285 (0.2313) data time 0.0011 (0.0017) model time 0.2274 (0.2296) loss 3.1507 (2.9061) grad_norm 4.6296 (3.7577) loss_scale 512.0000 (314.4869) mem 7381MB [2024-08-30 06:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][810/1251] eta 0:01:41 lr 0.000194 wd 0.0500 time 0.2260 (0.2313) data time 0.0008 (0.0017) model time 0.2252 (0.2296) loss 2.0449 (2.9039) grad_norm 3.1048 (3.7561) loss_scale 512.0000 (316.9223) mem 7381MB [2024-08-30 06:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][820/1251] eta 0:01:39 lr 0.000194 wd 0.0500 time 0.2407 (0.2313) data time 0.0011 (0.0017) model time 0.2395 (0.2296) loss 3.3476 (2.9060) grad_norm 2.3288 (3.7518) loss_scale 512.0000 (319.2984) mem 7381MB [2024-08-30 06:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][830/1251] eta 0:01:37 lr 0.000194 wd 0.0500 time 0.2371 (0.2312) data time 0.0007 (0.0017) model time 0.2364 (0.2296) loss 1.8085 (2.9028) grad_norm 2.8107 (3.7529) loss_scale 512.0000 (321.6173) mem 7381MB [2024-08-30 06:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][840/1251] eta 0:01:35 lr 0.000193 wd 0.0500 time 0.2334 (0.2312) data time 0.0007 (0.0017) model time 0.2327 (0.2295) loss 2.1073 (2.9045) grad_norm 3.1294 (3.7456) loss_scale 512.0000 (323.8811) mem 7381MB [2024-08-30 06:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][850/1251] eta 0:01:32 lr 0.000193 wd 0.0500 time 0.2252 (0.2311) data time 0.0007 (0.0017) model time 0.2245 (0.2295) loss 3.0513 (2.9036) grad_norm 3.0103 (3.7487) loss_scale 512.0000 (326.0917) mem 7381MB [2024-08-30 06:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][860/1251] eta 0:01:30 lr 0.000193 wd 0.0500 time 0.2235 (0.2311) data time 0.0011 (0.0017) model time 0.2224 (0.2294) loss 2.2597 (2.9051) grad_norm 3.9307 (3.7491) loss_scale 512.0000 (328.2509) mem 7381MB [2024-08-30 06:09:48 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [220/300][870/1251] eta 0:01:28 lr 0.000193 wd 0.0500 time 0.2269 (0.2311) data time 0.0006 (0.0017) model time 0.2263 (0.2294) loss 2.8516 (2.9053) grad_norm 3.4200 (3.7422) loss_scale 512.0000 (330.3605) mem 7381MB [2024-08-30 06:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][880/1251] eta 0:01:25 lr 0.000193 wd 0.0500 time 0.2340 (0.2310) data time 0.0009 (0.0017) model time 0.2331 (0.2294) loss 3.1927 (2.9074) grad_norm 4.0785 (3.7506) loss_scale 512.0000 (332.4222) mem 7381MB [2024-08-30 06:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][890/1251] eta 0:01:23 lr 0.000193 wd 0.0500 time 0.2403 (0.2310) data time 0.0009 (0.0017) model time 0.2394 (0.2294) loss 3.1642 (2.9104) grad_norm 3.4878 (3.7480) loss_scale 512.0000 (334.4377) mem 7381MB [2024-08-30 06:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][900/1251] eta 0:01:21 lr 0.000193 wd 0.0500 time 0.2242 (0.2310) data time 0.0013 (0.0017) model time 0.2228 (0.2294) loss 3.2461 (2.9102) grad_norm 3.1982 (3.7450) loss_scale 512.0000 (336.4084) mem 7381MB [2024-08-30 06:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][910/1251] eta 0:01:18 lr 0.000193 wd 0.0500 time 0.2226 (0.2310) data time 0.0013 (0.0017) model time 0.2213 (0.2293) loss 2.6902 (2.9106) grad_norm 3.9194 (3.7464) loss_scale 512.0000 (338.3359) mem 7381MB [2024-08-30 06:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][920/1251] eta 0:01:16 lr 0.000193 wd 0.0500 time 0.2348 (0.2310) data time 0.0007 (0.0016) model time 0.2341 (0.2294) loss 2.3312 (2.9128) grad_norm 4.9986 (3.7458) loss_scale 512.0000 (340.2215) mem 7381MB [2024-08-30 06:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][930/1251] eta 0:01:14 lr 0.000193 wd 0.0500 time 0.2270 (0.2310) data time 0.0009 (0.0016) model time 0.2261 (0.2294) loss 3.5485 (2.9130) grad_norm 3.7475 
(3.7414) loss_scale 512.0000 (342.0666) mem 7381MB [2024-08-30 06:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][940/1251] eta 0:01:11 lr 0.000193 wd 0.0500 time 0.2226 (0.2309) data time 0.0010 (0.0016) model time 0.2216 (0.2293) loss 2.6542 (2.9151) grad_norm 2.9218 (3.7416) loss_scale 512.0000 (343.8725) mem 7381MB [2024-08-30 06:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][950/1251] eta 0:01:09 lr 0.000193 wd 0.0500 time 0.2308 (0.2309) data time 0.0009 (0.0016) model time 0.2298 (0.2293) loss 3.2838 (2.9124) grad_norm 2.5466 (3.7386) loss_scale 512.0000 (345.6404) mem 7381MB [2024-08-30 06:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][960/1251] eta 0:01:07 lr 0.000193 wd 0.0500 time 0.2359 (0.2309) data time 0.0007 (0.0016) model time 0.2352 (0.2293) loss 3.5122 (2.9118) grad_norm 3.9494 (3.7487) loss_scale 512.0000 (347.3715) mem 7381MB [2024-08-30 06:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][970/1251] eta 0:01:04 lr 0.000193 wd 0.0500 time 0.2520 (0.2309) data time 0.0009 (0.0016) model time 0.2510 (0.2293) loss 2.8609 (2.9108) grad_norm 4.3595 (3.7480) loss_scale 512.0000 (349.0669) mem 7381MB [2024-08-30 06:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][980/1251] eta 0:01:02 lr 0.000193 wd 0.0500 time 0.2328 (0.2309) data time 0.0006 (0.0016) model time 0.2322 (0.2293) loss 2.2987 (2.9097) grad_norm 3.0213 (3.7492) loss_scale 512.0000 (350.7278) mem 7381MB [2024-08-30 06:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][990/1251] eta 0:01:00 lr 0.000193 wd 0.0500 time 0.2237 (0.2308) data time 0.0013 (0.0016) model time 0.2224 (0.2292) loss 3.1699 (2.9105) grad_norm 4.0704 (3.7581) loss_scale 512.0000 (352.3552) mem 7381MB [2024-08-30 06:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1000/1251] eta 0:00:57 lr 0.000193 wd 0.0500 time 0.2335 (0.2308) 
data time 0.0008 (0.0016) model time 0.2328 (0.2292) loss 3.7592 (2.9098) grad_norm 3.1052 (3.7576) loss_scale 512.0000 (353.9500) mem 7381MB [2024-08-30 06:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1010/1251] eta 0:00:55 lr 0.000193 wd 0.0500 time 0.2276 (0.2308) data time 0.0009 (0.0016) model time 0.2268 (0.2292) loss 3.2209 (2.9090) grad_norm 3.7922 (3.7559) loss_scale 512.0000 (355.5134) mem 7381MB [2024-08-30 06:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1020/1251] eta 0:00:53 lr 0.000193 wd 0.0500 time 0.2288 (0.2308) data time 0.0007 (0.0016) model time 0.2281 (0.2292) loss 3.2385 (2.9101) grad_norm 3.6151 (3.7550) loss_scale 512.0000 (357.0460) mem 7381MB [2024-08-30 06:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1030/1251] eta 0:00:51 lr 0.000193 wd 0.0500 time 0.2296 (0.2308) data time 0.0008 (0.0016) model time 0.2287 (0.2292) loss 3.2696 (2.9113) grad_norm 3.2661 (3.7508) loss_scale 512.0000 (358.5490) mem 7381MB [2024-08-30 06:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1040/1251] eta 0:00:48 lr 0.000193 wd 0.0500 time 0.2252 (0.2308) data time 0.0007 (0.0016) model time 0.2245 (0.2292) loss 3.0667 (2.9099) grad_norm 3.7910 (3.7515) loss_scale 512.0000 (360.0231) mem 7381MB [2024-08-30 06:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1050/1251] eta 0:00:46 lr 0.000193 wd 0.0500 time 0.2440 (0.2308) data time 0.0007 (0.0016) model time 0.2433 (0.2292) loss 2.0016 (2.9069) grad_norm 4.5746 (3.7485) loss_scale 512.0000 (361.4691) mem 7381MB [2024-08-30 06:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1060/1251] eta 0:00:44 lr 0.000193 wd 0.0500 time 0.2217 (0.2307) data time 0.0010 (0.0016) model time 0.2207 (0.2292) loss 3.0333 (2.9058) grad_norm 5.2116 (3.7489) loss_scale 512.0000 (362.8878) mem 7381MB [2024-08-30 06:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [220/300][1070/1251] eta 0:00:41 lr 0.000193 wd 0.0500 time 0.2328 (0.2307) data time 0.0009 (0.0016) model time 0.2319 (0.2291) loss 3.3872 (2.9068) grad_norm 3.5591 (3.7479) loss_scale 512.0000 (364.2801) mem 7381MB [2024-08-30 06:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1080/1251] eta 0:00:39 lr 0.000193 wd 0.0500 time 0.2276 (0.2307) data time 0.0007 (0.0016) model time 0.2269 (0.2291) loss 3.3615 (2.9091) grad_norm 3.0854 (3.7432) loss_scale 512.0000 (365.6466) mem 7381MB [2024-08-30 06:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1090/1251] eta 0:00:37 lr 0.000193 wd 0.0500 time 0.2331 (0.2307) data time 0.0009 (0.0016) model time 0.2322 (0.2291) loss 3.1080 (2.9092) grad_norm 3.2317 (3.7385) loss_scale 512.0000 (366.9881) mem 7381MB [2024-08-30 06:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1100/1251] eta 0:00:34 lr 0.000193 wd 0.0500 time 0.2436 (0.2307) data time 0.0007 (0.0016) model time 0.2429 (0.2291) loss 2.5644 (2.9087) grad_norm 5.2620 (3.7423) loss_scale 512.0000 (368.3052) mem 7381MB [2024-08-30 06:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1110/1251] eta 0:00:32 lr 0.000193 wd 0.0500 time 0.2249 (0.2307) data time 0.0007 (0.0016) model time 0.2242 (0.2291) loss 3.1174 (2.9077) grad_norm 2.4230 (3.7370) loss_scale 512.0000 (369.5986) mem 7381MB [2024-08-30 06:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1120/1251] eta 0:00:30 lr 0.000193 wd 0.0500 time 0.3852 (0.2308) data time 0.0008 (0.0016) model time 0.3844 (0.2292) loss 2.9643 (2.9077) grad_norm 4.2326 (3.7366) loss_scale 512.0000 (370.8689) mem 7381MB [2024-08-30 06:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1130/1251] eta 0:00:27 lr 0.000192 wd 0.0500 time 0.2284 (0.2307) data time 0.0012 (0.0015) model time 0.2271 (0.2292) loss 2.9360 (2.9093) grad_norm 2.4221 (3.7435) loss_scale 
512.0000 (372.1167) mem 7381MB [2024-08-30 06:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1140/1251] eta 0:00:25 lr 0.000192 wd 0.0500 time 0.2344 (0.2307) data time 0.0009 (0.0015) model time 0.2335 (0.2292) loss 2.6813 (2.9072) grad_norm 3.8803 (3.7465) loss_scale 512.0000 (373.3427) mem 7381MB [2024-08-30 06:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1150/1251] eta 0:00:23 lr 0.000192 wd 0.0500 time 0.2311 (0.2307) data time 0.0009 (0.0015) model time 0.2302 (0.2292) loss 3.1344 (2.9074) grad_norm 3.2672 (3.7477) loss_scale 512.0000 (374.5474) mem 7381MB [2024-08-30 06:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1160/1251] eta 0:00:20 lr 0.000192 wd 0.0500 time 0.2370 (0.2307) data time 0.0007 (0.0015) model time 0.2363 (0.2292) loss 2.5957 (2.9082) grad_norm 2.6596 (3.7457) loss_scale 512.0000 (375.7313) mem 7381MB [2024-08-30 06:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1170/1251] eta 0:00:18 lr 0.000192 wd 0.0500 time 0.2310 (0.2307) data time 0.0007 (0.0015) model time 0.2304 (0.2292) loss 2.2647 (2.9054) grad_norm 3.4245 (3.7435) loss_scale 512.0000 (376.8950) mem 7381MB [2024-08-30 06:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1180/1251] eta 0:00:16 lr 0.000192 wd 0.0500 time 0.2255 (0.2307) data time 0.0013 (0.0015) model time 0.2242 (0.2292) loss 3.0538 (2.9074) grad_norm 4.0876 (3.7408) loss_scale 512.0000 (378.0390) mem 7381MB [2024-08-30 06:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1190/1251] eta 0:00:14 lr 0.000192 wd 0.0500 time 0.2345 (0.2307) data time 0.0009 (0.0015) model time 0.2336 (0.2292) loss 2.9272 (2.9079) grad_norm 3.4045 (3.7401) loss_scale 512.0000 (379.1637) mem 7381MB [2024-08-30 06:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1200/1251] eta 0:00:11 lr 0.000192 wd 0.0500 time 0.2262 (0.2307) data time 0.0008 
(0.0015) model time 0.2254 (0.2292) loss 3.8424 (2.9104) grad_norm 2.6034 (3.7563) loss_scale 512.0000 (380.2698) mem 7381MB [2024-08-30 06:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1210/1251] eta 0:00:09 lr 0.000192 wd 0.0500 time 0.4349 (0.2310) data time 0.0007 (0.0015) model time 0.4342 (0.2295) loss 1.8563 (2.9101) grad_norm 3.9224 (3.7554) loss_scale 512.0000 (381.3576) mem 7381MB [2024-08-30 06:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1220/1251] eta 0:00:07 lr 0.000192 wd 0.0500 time 0.2272 (0.2310) data time 0.0016 (0.0015) model time 0.2257 (0.2295) loss 2.6058 (2.9094) grad_norm 5.3789 (3.7560) loss_scale 512.0000 (382.4275) mem 7381MB [2024-08-30 06:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1230/1251] eta 0:00:04 lr 0.000192 wd 0.0500 time 0.2296 (0.2310) data time 0.0009 (0.0015) model time 0.2287 (0.2295) loss 2.3411 (2.9080) grad_norm 4.2863 (3.7570) loss_scale 512.0000 (383.4801) mem 7381MB [2024-08-30 06:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1240/1251] eta 0:00:02 lr 0.000192 wd 0.0500 time 0.2120 (0.2309) data time 0.0007 (0.0015) model time 0.2114 (0.2294) loss 2.9500 (2.9099) grad_norm 3.5519 (3.7559) loss_scale 512.0000 (384.5157) mem 7381MB [2024-08-30 06:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [220/300][1250/1251] eta 0:00:00 lr 0.000192 wd 0.0500 time 0.2111 (0.2308) data time 0.0004 (0.0015) model time 0.2106 (0.2293) loss 1.8412 (2.9087) grad_norm 3.4406 (3.7597) loss_scale 512.0000 (385.5348) mem 7381MB [2024-08-30 06:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 220 training takes 0:04:48 [2024-08-30 06:11:16 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 06:11:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 06:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.539 (0.539) Loss 0.4092 (0.4092) Acc@1 92.383 (92.383) Acc@5 98.535 (98.535) Mem 7381MB [2024-08-30 06:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.123) Loss 0.6523 (0.6626) Acc@1 88.086 (86.444) Acc@5 96.875 (97.363) Mem 7381MB [2024-08-30 06:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.103) Loss 0.9604 (0.6894) Acc@1 77.051 (85.393) Acc@5 95.605 (97.359) Mem 7381MB [2024-08-30 06:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.096) Loss 1.1387 (0.7817) Acc@1 72.852 (83.150) Acc@5 92.578 (96.377) Mem 7381MB [2024-08-30 06:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 1.0273 (0.8299) Acc@1 76.074 (81.850) Acc@5 93.750 (95.872) Mem 7381MB [2024-08-30 06:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.348 Acc@5 95.806 [2024-08-30 06:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.3% [2024-08-30 06:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.941 (0.941) Loss 0.3762 (0.3762) Acc@1 93.262 (93.262) Acc@5 98.438 (98.438) Mem 7381MB [2024-08-30 06:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.091 (0.170) Loss 0.5854 (0.6031) Acc@1 89.062 (87.447) Acc@5 97.852 (97.727) Mem 7381MB [2024-08-30 06:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.128) Loss 0.8667 (0.6298) Acc@1 78.711 (86.412) Acc@5 96.289 (97.707) Mem 7381MB [2024-08-30 06:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.087 (0.113) Loss 1.0898 (0.7137) Acc@1 73.633 (84.353) Acc@5 
93.164 (96.809) Mem 7381MB [2024-08-30 06:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.102) Loss 0.9746 (0.7570) Acc@1 77.344 (83.248) Acc@5 94.531 (96.334) Mem 7381MB [2024-08-30 06:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.802 Acc@5 96.312 [2024-08-30 06:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 06:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][0/1251] eta 0:26:20 lr 0.000192 wd 0.0500 time 1.2633 (1.2633) data time 0.7638 (0.7638) model time 0.0000 (0.0000) loss 2.8075 (2.8075) grad_norm 4.0839 (4.0839) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][10/1251] eta 0:06:41 lr 0.000192 wd 0.0500 time 0.2434 (0.3232) data time 0.0009 (0.0705) model time 0.0000 (0.0000) loss 2.1000 (2.8972) grad_norm 3.7900 (3.9420) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][20/1251] eta 0:05:40 lr 0.000192 wd 0.0500 time 0.2346 (0.2768) data time 0.0007 (0.0374) model time 0.0000 (0.0000) loss 2.4010 (2.9281) grad_norm 4.9001 (4.4916) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][30/1251] eta 0:05:18 lr 0.000192 wd 0.0500 time 0.2250 (0.2612) data time 0.0007 (0.0257) model time 0.0000 (0.0000) loss 2.5081 (2.9751) grad_norm 4.2321 (4.3558) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][40/1251] eta 0:05:06 lr 0.000192 wd 0.0500 time 0.2225 (0.2529) data time 0.0011 (0.0197) model time 0.0000 (0.0000) loss 3.6823 (2.9698) grad_norm 3.9112 (4.1816) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [221/300][50/1251] eta 0:04:58 lr 0.000192 wd 0.0500 time 0.2358 (0.2483) data time 0.0009 (0.0161) model time 0.0000 (0.0000) loss 3.1254 (2.9998) grad_norm 2.4825 (3.9728) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][60/1251] eta 0:04:51 lr 0.000192 wd 0.0500 time 0.2306 (0.2446) data time 0.0008 (0.0136) model time 0.2297 (0.2243) loss 2.7027 (3.0244) grad_norm 3.8690 (3.9239) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][70/1251] eta 0:04:46 lr 0.000192 wd 0.0500 time 0.2322 (0.2425) data time 0.0008 (0.0119) model time 0.2315 (0.2265) loss 3.1724 (3.0408) grad_norm 3.3793 (3.8477) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][80/1251] eta 0:04:41 lr 0.000192 wd 0.0500 time 0.2317 (0.2408) data time 0.0009 (0.0105) model time 0.2307 (0.2269) loss 2.9834 (3.0378) grad_norm 3.3577 (3.8345) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][90/1251] eta 0:04:37 lr 0.000192 wd 0.0500 time 0.2226 (0.2392) data time 0.0007 (0.0095) model time 0.2219 (0.2266) loss 3.1223 (3.0167) grad_norm 2.4446 (3.7356) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][100/1251] eta 0:04:34 lr 0.000192 wd 0.0500 time 0.2283 (0.2382) data time 0.0007 (0.0087) model time 0.2277 (0.2267) loss 2.4965 (3.0124) grad_norm 2.9109 (3.7127) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][110/1251] eta 0:04:30 lr 0.000192 wd 0.0500 time 0.2245 (0.2375) data time 0.0009 (0.0080) model time 0.2236 (0.2271) loss 2.4749 (2.9783) grad_norm 2.5579 (3.6595) loss_scale 512.0000 (512.0000) mem 
7381MB [2024-08-30 06:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][120/1251] eta 0:04:27 lr 0.000192 wd 0.0500 time 0.2217 (0.2367) data time 0.0013 (0.0074) model time 0.2204 (0.2272) loss 3.1680 (2.9860) grad_norm 2.9570 (3.6397) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][130/1251] eta 0:04:24 lr 0.000192 wd 0.0500 time 0.2267 (0.2361) data time 0.0008 (0.0069) model time 0.2259 (0.2273) loss 3.4261 (2.9728) grad_norm 2.8418 (3.6201) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][140/1251] eta 0:04:21 lr 0.000192 wd 0.0500 time 0.2252 (0.2355) data time 0.0014 (0.0065) model time 0.2238 (0.2271) loss 2.5080 (2.9783) grad_norm 6.1764 (3.6432) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][150/1251] eta 0:04:18 lr 0.000192 wd 0.0500 time 0.2283 (0.2349) data time 0.0006 (0.0061) model time 0.2277 (0.2270) loss 2.2404 (2.9599) grad_norm 3.0078 (3.6504) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][160/1251] eta 0:04:15 lr 0.000191 wd 0.0500 time 0.2237 (0.2346) data time 0.0006 (0.0058) model time 0.2231 (0.2271) loss 3.0834 (2.9732) grad_norm 3.4603 (3.6419) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][170/1251] eta 0:04:13 lr 0.000191 wd 0.0500 time 0.2347 (0.2343) data time 0.0008 (0.0056) model time 0.2338 (0.2273) loss 3.0391 (2.9778) grad_norm 3.3270 (3.6246) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][180/1251] eta 0:04:10 lr 0.000191 wd 0.0500 time 0.2337 (0.2341) data time 0.0012 (0.0053) model time 0.2326 
(0.2275) loss 3.0037 (2.9791) grad_norm 4.0666 (3.6038) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][190/1251] eta 0:04:08 lr 0.000191 wd 0.0500 time 0.2333 (0.2339) data time 0.0007 (0.0051) model time 0.2326 (0.2275) loss 2.4274 (2.9763) grad_norm 2.7735 (3.5835) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][200/1251] eta 0:04:05 lr 0.000191 wd 0.0500 time 0.2280 (0.2336) data time 0.0009 (0.0049) model time 0.2271 (0.2275) loss 2.7998 (2.9857) grad_norm 4.7317 (3.6261) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][210/1251] eta 0:04:02 lr 0.000191 wd 0.0500 time 0.2224 (0.2333) data time 0.0007 (0.0047) model time 0.2217 (0.2275) loss 3.1306 (2.9910) grad_norm 2.8666 (3.6521) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][220/1251] eta 0:04:00 lr 0.000191 wd 0.0500 time 0.2219 (0.2330) data time 0.0014 (0.0045) model time 0.2205 (0.2274) loss 3.2626 (2.9883) grad_norm 3.1666 (3.6376) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][230/1251] eta 0:03:57 lr 0.000191 wd 0.0500 time 0.2351 (0.2328) data time 0.0007 (0.0044) model time 0.2344 (0.2273) loss 3.5277 (2.9982) grad_norm 6.0755 (3.6685) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][240/1251] eta 0:03:55 lr 0.000191 wd 0.0500 time 0.2338 (0.2327) data time 0.0009 (0.0042) model time 0.2329 (0.2274) loss 3.1669 (2.9797) grad_norm 3.6808 (3.6966) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][250/1251] eta 0:03:52 
lr 0.000191 wd 0.0500 time 0.2254 (0.2327) data time 0.0012 (0.0041) model time 0.2243 (0.2276) loss 2.3445 (2.9702) grad_norm 3.1345 (3.6870) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][260/1251] eta 0:03:50 lr 0.000191 wd 0.0500 time 0.2321 (0.2325) data time 0.0006 (0.0040) model time 0.2315 (0.2276) loss 3.5646 (2.9658) grad_norm 4.5953 (3.6855) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][270/1251] eta 0:03:47 lr 0.000191 wd 0.0500 time 0.2287 (0.2324) data time 0.0006 (0.0039) model time 0.2280 (0.2276) loss 3.6116 (2.9728) grad_norm 3.7435 (3.6786) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][280/1251] eta 0:03:45 lr 0.000191 wd 0.0500 time 0.2255 (0.2322) data time 0.0007 (0.0038) model time 0.2248 (0.2275) loss 3.8841 (2.9728) grad_norm 3.2235 (3.6840) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][290/1251] eta 0:03:43 lr 0.000191 wd 0.0500 time 0.2340 (0.2321) data time 0.0009 (0.0037) model time 0.2331 (0.2276) loss 2.2954 (2.9610) grad_norm 2.8319 (3.6813) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][300/1251] eta 0:03:40 lr 0.000191 wd 0.0500 time 0.2278 (0.2320) data time 0.0011 (0.0036) model time 0.2267 (0.2276) loss 3.2000 (2.9564) grad_norm 3.9304 (3.6869) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][310/1251] eta 0:03:38 lr 0.000191 wd 0.0500 time 0.2259 (0.2319) data time 0.0011 (0.0035) model time 0.2247 (0.2275) loss 2.9193 (2.9518) grad_norm 3.1166 (3.6747) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][320/1251] eta 0:03:35 lr 0.000191 wd 0.0500 time 0.2276 (0.2317) data time 0.0009 (0.0035) model time 0.2266 (0.2275) loss 3.1602 (2.9497) grad_norm 4.0257 (3.6711) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][330/1251] eta 0:03:33 lr 0.000191 wd 0.0500 time 0.2334 (0.2316) data time 0.0007 (0.0034) model time 0.2327 (0.2275) loss 3.4154 (2.9478) grad_norm 4.3634 (3.6761) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][340/1251] eta 0:03:30 lr 0.000191 wd 0.0500 time 0.2272 (0.2315) data time 0.0012 (0.0033) model time 0.2260 (0.2275) loss 1.9034 (2.9455) grad_norm 3.1859 (3.6817) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][350/1251] eta 0:03:28 lr 0.000191 wd 0.0500 time 0.2307 (0.2316) data time 0.0011 (0.0033) model time 0.2296 (0.2276) loss 2.9462 (2.9449) grad_norm 2.9181 (3.6740) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][360/1251] eta 0:03:26 lr 0.000191 wd 0.0500 time 0.4341 (0.2322) data time 0.0012 (0.0032) model time 0.4329 (0.2284) loss 1.9293 (2.9253) grad_norm 3.3138 (3.6500) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][370/1251] eta 0:03:24 lr 0.000191 wd 0.0500 time 0.2278 (0.2320) data time 0.0011 (0.0031) model time 0.2267 (0.2284) loss 3.2219 (2.9207) grad_norm 2.9945 (3.6456) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 06:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 06:12:52 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 06:12:52 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 06:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 06:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 06:24:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 06:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 06:25:02 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 06:25:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 06:25:10 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 06:25:11 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 06:25:12 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 06:25:12 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 221) [2024-08-30 06:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 06:25:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][380/1251] eta 0:18:59 lr 0.000191 wd 0.0500 time 0.2217 (1.3077) data time 0.0010 (0.0608) model time 0.2208 (1.2470) loss 3.2510 (3.3745) grad_norm 3.0677 (3.9609) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][390/1251] eta 0:11:01 lr 0.000191 wd 0.0500 time 0.2263 (0.7678) data time 0.0008 (0.0309) model time 0.2255 (0.7370) loss 2.9129 (3.1468) grad_norm 2.2631 (3.7543) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][400/1251] eta 0:08:20 lr 0.000191 wd 0.0500 time 0.2250 (0.5879) data time 0.0009 (0.0209) model time 0.2241 (0.5670) loss 3.4095 (3.1736) grad_norm 3.2946 (3.6084) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][410/1251] eta 0:06:58 lr 0.000191 wd 0.0500 time 0.2208 (0.4970) data time 0.0009 (0.0159) model time 0.2200 (0.4811) loss 2.2619 (3.0939) grad_norm 4.7399 (3.5960) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][420/1251] eta 0:06:07 lr 0.000191 wd 0.0500 time 0.2186 (0.4427) data time 0.0009 (0.0129) model time 0.2176 (0.4298) loss 2.5102 (3.0844) grad_norm 2.8817 (3.5618) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[221/300][430/1251] eta 0:05:33 lr 0.000191 wd 0.0500 time 0.2382 (0.4068) data time 0.0007 (0.0109) model time 0.2375 (0.3958) loss 3.0243 (3.0525) grad_norm 3.0451 (3.5907) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][440/1251] eta 0:05:08 lr 0.000191 wd 0.0500 time 0.2231 (0.3806) data time 0.0008 (0.0095) model time 0.2223 (0.3711) loss 2.1424 (3.0306) grad_norm 3.8732 (3.8189) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][450/1251] eta 0:04:49 lr 0.000191 wd 0.0500 time 0.2215 (0.3611) data time 0.0008 (0.0084) model time 0.2207 (0.3527) loss 3.1277 (3.0166) grad_norm 3.0434 (3.7703) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][460/1251] eta 0:04:33 lr 0.000190 wd 0.0500 time 0.2255 (0.3460) data time 0.0007 (0.0076) model time 0.2248 (0.3384) loss 3.0759 (3.0069) grad_norm 2.7565 (3.7133) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][470/1251] eta 0:04:20 lr 0.000190 wd 0.0500 time 0.2275 (0.3340) data time 0.0009 (0.0069) model time 0.2266 (0.3271) loss 3.1671 (3.0146) grad_norm 36.0117 (4.0406) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][480/1251] eta 0:04:10 lr 0.000190 wd 0.0500 time 0.2210 (0.3243) data time 0.0010 (0.0064) model time 0.2200 (0.3179) loss 2.8126 (3.0155) grad_norm 4.2340 (4.1124) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][490/1251] eta 0:04:00 lr 0.000190 wd 0.0500 time 0.2243 (0.3161) data time 0.0008 (0.0059) model time 0.2235 (0.3102) loss 3.2309 (3.0161) grad_norm 2.6300 (4.0584) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-08-30 06:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][500/1251] eta 0:03:52 lr 0.000190 wd 0.0500 time 0.2187 (0.3092) data time 0.0009 (0.0055) model time 0.2178 (0.3037) loss 2.9734 (2.9962) grad_norm 2.5867 (3.9950) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:25:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][510/1251] eta 0:03:44 lr 0.000190 wd 0.0500 time 0.2235 (0.3032) data time 0.0006 (0.0052) model time 0.2229 (0.2980) loss 1.7019 (2.9827) grad_norm 3.8012 (4.0150) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][520/1251] eta 0:03:37 lr 0.000190 wd 0.0500 time 0.2324 (0.2980) data time 0.0009 (0.0049) model time 0.2315 (0.2931) loss 3.2391 (2.9796) grad_norm 3.4308 (4.0074) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][530/1251] eta 0:03:31 lr 0.000190 wd 0.0500 time 0.2295 (0.2936) data time 0.0008 (0.0047) model time 0.2287 (0.2889) loss 3.0992 (2.9738) grad_norm 3.0668 (4.0336) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][540/1251] eta 0:03:25 lr 0.000190 wd 0.0500 time 0.2281 (0.2895) data time 0.0007 (0.0045) model time 0.2274 (0.2851) loss 2.5537 (2.9803) grad_norm 3.4724 (4.0035) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][550/1251] eta 0:03:20 lr 0.000190 wd 0.0500 time 0.2249 (0.2861) data time 0.0008 (0.0043) model time 0.2242 (0.2818) loss 2.2724 (2.9672) grad_norm 2.6291 (3.9746) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][560/1251] eta 0:03:15 lr 0.000190 wd 0.0500 time 0.2280 (0.2829) data time 0.0007 (0.0041) model time 0.2273 
(0.2788) loss 2.5337 (2.9700) grad_norm 2.8820 (3.9483) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][570/1251] eta 0:03:10 lr 0.000190 wd 0.0500 time 0.2214 (0.2800) data time 0.0010 (0.0039) model time 0.2204 (0.2761) loss 3.4037 (2.9585) grad_norm 4.3142 (3.9818) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][580/1251] eta 0:03:06 lr 0.000190 wd 0.0500 time 0.2314 (0.2775) data time 0.0008 (0.0038) model time 0.2306 (0.2738) loss 3.0341 (2.9475) grad_norm 2.9056 (3.9850) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][590/1251] eta 0:03:01 lr 0.000190 wd 0.0500 time 0.2273 (0.2752) data time 0.0009 (0.0037) model time 0.2264 (0.2715) loss 2.8876 (2.9383) grad_norm 2.5719 (3.9954) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][600/1251] eta 0:02:57 lr 0.000190 wd 0.0500 time 0.2246 (0.2730) data time 0.0010 (0.0035) model time 0.2235 (0.2694) loss 3.2469 (2.9434) grad_norm 5.1701 (4.0013) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][610/1251] eta 0:02:53 lr 0.000190 wd 0.0500 time 0.2213 (0.2711) data time 0.0011 (0.0034) model time 0.2201 (0.2676) loss 3.2787 (2.9355) grad_norm 3.7225 (3.9932) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][620/1251] eta 0:02:49 lr 0.000190 wd 0.0500 time 0.2240 (0.2693) data time 0.0008 (0.0033) model time 0.2232 (0.2660) loss 2.1792 (2.9259) grad_norm 3.1473 (3.9879) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][630/1251] eta 0:02:46 
lr 0.000190 wd 0.0500 time 0.2268 (0.2677) data time 0.0010 (0.0032) model time 0.2258 (0.2645) loss 2.3512 (2.9171) grad_norm 2.9371 (3.9765) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][640/1251] eta 0:02:42 lr 0.000190 wd 0.0500 time 0.2259 (0.2662) data time 0.0006 (0.0031) model time 0.2253 (0.2630) loss 3.5930 (2.9107) grad_norm 4.5042 (3.9569) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][650/1251] eta 0:02:39 lr 0.000190 wd 0.0500 time 0.2210 (0.2646) data time 0.0011 (0.0031) model time 0.2199 (0.2616) loss 3.3025 (2.9168) grad_norm 2.6054 (3.9403) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][660/1251] eta 0:02:36 lr 0.000190 wd 0.0500 time 0.2179 (0.2640) data time 0.0010 (0.0030) model time 0.2169 (0.2610) loss 2.6640 (2.9115) grad_norm 2.6340 (3.9297) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][670/1251] eta 0:02:32 lr 0.000190 wd 0.0500 time 0.2222 (0.2627) data time 0.0008 (0.0029) model time 0.2214 (0.2598) loss 2.3711 (2.9005) grad_norm 2.9033 (3.9239) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][680/1251] eta 0:02:29 lr 0.000190 wd 0.0500 time 0.2278 (0.2623) data time 0.0009 (0.0029) model time 0.2269 (0.2594) loss 2.6293 (2.8986) grad_norm 3.0346 (3.9157) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][690/1251] eta 0:02:26 lr 0.000190 wd 0.0500 time 0.2224 (0.2611) data time 0.0010 (0.0028) model time 0.2214 (0.2583) loss 2.6224 (2.9077) grad_norm 2.9437 (3.8897) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:42 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][700/1251] eta 0:02:23 lr 0.000190 wd 0.0500 time 0.2266 (0.2600) data time 0.0007 (0.0027) model time 0.2259 (0.2573) loss 2.6610 (2.9087) grad_norm 3.9257 (3.8852) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][710/1251] eta 0:02:20 lr 0.000190 wd 0.0500 time 0.2330 (0.2590) data time 0.0009 (0.0027) model time 0.2321 (0.2563) loss 2.9112 (2.9073) grad_norm 3.7642 (3.8742) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][720/1251] eta 0:02:17 lr 0.000190 wd 0.0500 time 0.2270 (0.2580) data time 0.0010 (0.0026) model time 0.2260 (0.2554) loss 2.7843 (2.9081) grad_norm 3.0296 (3.8740) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][730/1251] eta 0:02:13 lr 0.000190 wd 0.0500 time 0.2293 (0.2571) data time 0.0007 (0.0026) model time 0.2286 (0.2545) loss 4.0219 (2.9151) grad_norm 2.9511 (3.8687) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][740/1251] eta 0:02:11 lr 0.000190 wd 0.0500 time 0.2277 (0.2564) data time 0.0009 (0.0026) model time 0.2268 (0.2538) loss 3.1879 (2.9141) grad_norm 5.2820 (3.8626) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][750/1251] eta 0:02:08 lr 0.000189 wd 0.0500 time 0.2223 (0.2556) data time 0.0007 (0.0025) model time 0.2215 (0.2531) loss 2.7650 (2.9125) grad_norm 3.9945 (3.8518) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][760/1251] eta 0:02:05 lr 0.000189 wd 0.0500 time 0.2274 (0.2548) data time 0.0007 (0.0025) model time 0.2267 (0.2523) loss 1.8502 (2.9047) 
grad_norm 3.4939 (3.8295) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][770/1251] eta 0:02:02 lr 0.000189 wd 0.0500 time 0.2198 (0.2541) data time 0.0012 (0.0024) model time 0.2187 (0.2517) loss 3.1053 (2.9107) grad_norm 2.4784 (3.8114) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][780/1251] eta 0:01:59 lr 0.000189 wd 0.0500 time 0.2239 (0.2535) data time 0.0007 (0.0024) model time 0.2232 (0.2511) loss 2.9806 (2.9155) grad_norm 2.6142 (3.8061) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][790/1251] eta 0:01:56 lr 0.000189 wd 0.0500 time 0.2258 (0.2529) data time 0.0007 (0.0024) model time 0.2252 (0.2505) loss 3.2763 (2.9148) grad_norm 2.7078 (3.7917) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][800/1251] eta 0:01:53 lr 0.000189 wd 0.0500 time 0.2257 (0.2523) data time 0.0008 (0.0023) model time 0.2248 (0.2499) loss 3.2584 (2.9214) grad_norm 2.9766 (3.7858) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][810/1251] eta 0:01:50 lr 0.000189 wd 0.0500 time 0.2222 (0.2516) data time 0.0009 (0.0023) model time 0.2214 (0.2494) loss 2.7479 (2.9257) grad_norm 3.9333 (3.7994) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][820/1251] eta 0:01:48 lr 0.000189 wd 0.0500 time 0.2206 (0.2510) data time 0.0008 (0.0023) model time 0.2198 (0.2488) loss 3.0850 (2.9277) grad_norm 3.9093 (3.7927) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][830/1251] eta 0:01:45 lr 0.000189 wd 0.0500 time 
0.2207 (0.2505) data time 0.0006 (0.0022) model time 0.2201 (0.2482) loss 2.9052 (2.9219) grad_norm 4.9643 (3.8050) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][840/1251] eta 0:01:42 lr 0.000189 wd 0.0500 time 0.2233 (0.2499) data time 0.0009 (0.0022) model time 0.2225 (0.2477) loss 2.6661 (2.9142) grad_norm 4.1867 (3.8046) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][850/1251] eta 0:01:40 lr 0.000189 wd 0.0500 time 0.2240 (0.2494) data time 0.0009 (0.0022) model time 0.2231 (0.2472) loss 2.3386 (2.9140) grad_norm 3.1845 (3.7926) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][860/1251] eta 0:01:37 lr 0.000189 wd 0.0500 time 0.2269 (0.2489) data time 0.0006 (0.0021) model time 0.2262 (0.2468) loss 2.5318 (2.9180) grad_norm 3.5820 (3.7971) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][870/1251] eta 0:01:34 lr 0.000189 wd 0.0500 time 0.2269 (0.2485) data time 0.0007 (0.0021) model time 0.2262 (0.2463) loss 2.0597 (2.9166) grad_norm 2.6002 (3.7878) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][880/1251] eta 0:01:32 lr 0.000189 wd 0.0500 time 0.2332 (0.2480) data time 0.0008 (0.0021) model time 0.2324 (0.2459) loss 3.0801 (2.9192) grad_norm 2.7904 (3.7777) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][890/1251] eta 0:01:29 lr 0.000189 wd 0.0500 time 0.2301 (0.2476) data time 0.0006 (0.0021) model time 0.2295 (0.2455) loss 3.0390 (2.9228) grad_norm 2.8517 (3.7883) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [221/300][900/1251] eta 0:01:26 lr 0.000189 wd 0.0500 time 0.2341 (0.2472) data time 0.0008 (0.0021) model time 0.2333 (0.2452) loss 2.8642 (2.9165) grad_norm 2.8632 (3.7836) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][910/1251] eta 0:01:24 lr 0.000189 wd 0.0500 time 0.2238 (0.2468) data time 0.0008 (0.0020) model time 0.2230 (0.2448) loss 3.5918 (2.9166) grad_norm 2.6669 (3.7751) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][920/1251] eta 0:01:21 lr 0.000189 wd 0.0500 time 0.2255 (0.2465) data time 0.0009 (0.0020) model time 0.2246 (0.2445) loss 1.8708 (2.9153) grad_norm 3.9549 (3.7726) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][930/1251] eta 0:01:19 lr 0.000189 wd 0.0500 time 0.2221 (0.2461) data time 0.0009 (0.0020) model time 0.2212 (0.2441) loss 3.0678 (2.9196) grad_norm 3.3539 (3.7704) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][940/1251] eta 0:01:16 lr 0.000189 wd 0.0500 time 0.2192 (0.2457) data time 0.0009 (0.0020) model time 0.2183 (0.2438) loss 2.8870 (2.9221) grad_norm 2.0320 (3.7604) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][950/1251] eta 0:01:13 lr 0.000189 wd 0.0500 time 0.2226 (0.2454) data time 0.0007 (0.0020) model time 0.2219 (0.2434) loss 2.4926 (2.9235) grad_norm 2.8720 (3.7487) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][960/1251] eta 0:01:11 lr 0.000189 wd 0.0500 time 0.2241 (0.2450) data time 0.0007 (0.0019) model time 0.2234 (0.2431) loss 3.3898 (2.9249) grad_norm 3.1121 
(3.7470) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][970/1251] eta 0:01:08 lr 0.000189 wd 0.0500 time 0.2211 (0.2447) data time 0.0010 (0.0019) model time 0.2201 (0.2427) loss 3.0920 (2.9255) grad_norm 4.8605 (3.7562) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][980/1251] eta 0:01:06 lr 0.000189 wd 0.0500 time 0.2198 (0.2443) data time 0.0007 (0.0019) model time 0.2190 (0.2424) loss 2.9139 (2.9238) grad_norm 4.8259 (3.7704) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][990/1251] eta 0:01:03 lr 0.000189 wd 0.0500 time 0.2260 (0.2440) data time 0.0011 (0.0019) model time 0.2249 (0.2421) loss 3.0602 (2.9262) grad_norm 5.4978 (3.7900) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1000/1251] eta 0:01:01 lr 0.000189 wd 0.0500 time 0.2241 (0.2437) data time 0.0008 (0.0019) model time 0.2233 (0.2418) loss 3.5884 (2.9264) grad_norm 3.0274 (3.7877) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1010/1251] eta 0:00:58 lr 0.000189 wd 0.0500 time 0.2227 (0.2434) data time 0.0007 (0.0019) model time 0.2219 (0.2416) loss 1.5702 (2.9273) grad_norm 3.8785 (3.7895) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1020/1251] eta 0:00:56 lr 0.000189 wd 0.0500 time 0.2199 (0.2432) data time 0.0010 (0.0019) model time 0.2189 (0.2413) loss 3.5162 (2.9253) grad_norm 3.6882 (3.8091) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 06:27:56 
msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 06:27:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 06:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 06:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 06:32:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 06:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 06:33:06 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 06:33:07 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 06:33:08 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 06:33:08 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 221) [2024-08-30 06:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 06:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1030/1251] eta 0:11:08 lr 0.000189 wd 0.0500 time 0.2333 (3.0266) data time 0.0009 (0.1154) model time 0.2324 (2.9112) loss 3.2752 (3.2897) grad_norm 4.6240 (4.7309) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1040/1251] eta 0:04:05 lr 0.000188 wd 0.0500 time 0.2280 (1.1658) data time 0.0010 (0.0392) model time 0.2270 (1.1266) loss 3.1924 (3.1415) grad_norm 3.7785 (4.2794) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1050/1251] eta 0:02:40 lr 0.000188 wd 0.0500 time 0.2376 (0.7962) data time 0.0010 (0.0240) model time 0.2366 (0.7722) loss 2.8433 (3.1725) grad_norm 4.6766 (3.9556) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1060/1251] eta 0:02:01 lr 0.000188 wd 0.0500 time 0.2435 (0.6363) data time 0.0014 (0.0174) model time 0.2421 (0.6188) loss 3.3004 (3.1438) grad_norm 4.9694 (3.9370) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1070/1251] eta 0:01:39 lr 0.000188 wd 0.0500 time 0.2346 (0.5475) data time 0.0009 (0.0138) model time 0.2336 (0.5337) loss 2.9519 (3.0907) grad_norm 5.1884 (3.8501) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[221/300][1080/1251] eta 0:01:24 lr 0.000188 wd 0.0500 time 0.2367 (0.4916) data time 0.0008 (0.0115) model time 0.2360 (0.4801) loss 2.1679 (3.0787) grad_norm 2.8079 (3.7180) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1090/1251] eta 0:01:12 lr 0.000188 wd 0.0500 time 0.2316 (0.4531) data time 0.0009 (0.0099) model time 0.2307 (0.4432) loss 3.2475 (3.0528) grad_norm 3.0076 (3.6973) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1100/1251] eta 0:01:04 lr 0.000188 wd 0.0500 time 0.2332 (0.4244) data time 0.0010 (0.0087) model time 0.2322 (0.4157) loss 2.3302 (3.0105) grad_norm 3.7749 (3.6330) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1110/1251] eta 0:00:56 lr 0.000188 wd 0.0500 time 0.2193 (0.4021) data time 0.0008 (0.0078) model time 0.2185 (0.3943) loss 3.1055 (3.0022) grad_norm 2.4066 (3.7254) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1120/1251] eta 0:00:50 lr 0.000188 wd 0.0500 time 0.2326 (0.3845) data time 0.0010 (0.0071) model time 0.2315 (0.3774) loss 2.8212 (2.9951) grad_norm 3.1805 (3.7005) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1130/1251] eta 0:00:44 lr 0.000188 wd 0.0500 time 0.2464 (0.3707) data time 0.0011 (0.0065) model time 0.2453 (0.3642) loss 2.8392 (3.0151) grad_norm 2.7524 (3.7567) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 06:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 06:33:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 06:33:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 07:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 07:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 07:05:47 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 07:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 07:06:04 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 07:06:05 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 07:06:06 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 07:06:06 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 221) [2024-08-30 07:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 07:06:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1140/1251] eta 0:04:43 lr 0.000188 wd 0.0500 time 0.2280 (2.5513) data time 0.0013 (0.1411) model time 0.2267 (2.4102) loss 3.3916 (3.4912) grad_norm 7.3253 (4.2434) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 07:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1150/1251] eta 0:02:00 lr 0.000188 wd 0.0500 time 0.2427 (1.1976) data time 0.0010 (0.0588) model time 0.2416 (1.1388) loss 2.8420 (3.2331) grad_norm 7.4946 (4.2508) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 07:06:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [221/300][1160/1251] eta 0:01:16 lr 0.000188 wd 0.0500 time 0.2243 (0.8391) data time 0.0007 (0.0374) model time 0.2236 (0.8017) loss 3.7809 (3.2550) grad_norm 3.3396 (4.0609) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 07:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1170/1251] eta 0:00:54 lr 0.000188 wd 0.0500 time 0.3087 (0.6771) data time 0.0010 (0.0276) model time 0.3077 (0.6495) loss 2.6754 (3.2129) grad_norm 4.8913 (4.0506) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 07:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1180/1251] eta 0:00:41 lr 0.000188 wd 0.0500 time 0.2221 (0.5876) data time 0.0009 (0.0220) model time 0.2213 (0.5656) loss 3.6281 (3.1792) grad_norm 4.0996 (4.0857) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 07:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1190/1251] eta 0:00:32 lr 0.000188 wd 0.0500 time 0.2259 (0.5247) data time 0.0010 (0.0183) model time 0.2250 (0.5063) loss 2.6560 (3.1539) grad_norm 3.2638 (3.9424) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 07:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 07:06:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 07:06:42 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 09:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 09:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 09:05:41 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 09:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 09:05:50 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 09:05:52 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 09:05:53 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 09:05:53 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 221) [2024-08-30 09:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 09:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1200/1251] eta 0:01:31 lr 0.000188 wd 0.0500 time 0.2431 (1.7880) data time 0.0011 (0.0811) model time 0.2419 (1.7069) loss 3.2786 (3.4224) grad_norm 3.2143 (3.7452) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 09:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1210/1251] eta 0:00:41 lr 0.000188 wd 0.0500 time 0.2432 (1.0145) data time 0.0008 (0.0411) model time 0.2424 (0.9734) loss 3.2977 (3.1801) grad_norm 2.8581 (3.9439) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 09:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1220/1251] eta 0:00:23 lr 0.000188 wd 0.0500 time 0.2356 (0.7565) data time 0.0011 (0.0278) model time 0.2345 (0.7287) loss 3.1167 (3.1863) grad_norm 5.1556 (3.9641) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 09:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1230/1251] eta 0:00:13 lr 0.000188 wd 0.0500 time 0.2369 (0.6279) data time 0.0009 (0.0211) model time 0.2360 (0.6068) loss 2.7412 (3.1348) grad_norm 3.3974 (3.9022) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 09:06:24 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1240/1251] eta 0:00:06 lr 0.000188 wd 0.0500 time 0.2252 (0.5487) data time 0.0007 (0.0172) model time 0.2245 (0.5315) loss 2.2301 (3.1144) grad_norm 4.2212 (4.1080) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 09:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [221/300][1250/1251] eta 0:00:00 lr 0.000188 wd 0.0500 time 0.2285 (0.4947) data time 0.0005 (0.0144) model time 0.2280 (0.4802) loss 2.7630 (3.0825) grad_norm 2.5263 (3.9583) loss_scale 512.0000 (512.0000) mem 7376MB [2024-08-30 09:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 221 training takes 0:00:29 [2024-08-30 09:06:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 09:06:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 09:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.411 (0.411) Loss 0.3848 (0.3848) Acc@1 92.773 (92.773) Acc@5 98.438 (98.438) Mem 7376MB [2024-08-30 09:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.106) Loss 0.6494 (0.6409) Acc@1 87.598 (86.701) Acc@5 97.461 (97.443) Mem 7376MB [2024-08-30 09:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.091) Loss 0.9980 (0.6704) Acc@1 75.293 (85.500) Acc@5 94.824 (97.456) Mem 7376MB [2024-08-30 09:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.086) Loss 1.1113 (0.7602) Acc@1 74.219 (83.313) Acc@5 91.895 (96.431) Mem 7376MB [2024-08-30 09:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.081) Loss 1.0908 (0.8097) Acc@1 74.805 (82.057) Acc@5 94.141 (95.944) Mem 7376MB [2024-08-30 09:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.600 Acc@5 95.904 
[2024-08-30 09:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.6% [2024-08-30 09:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.60% [2024-08-30 09:06:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 09:06:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-30 09:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.473 (0.473) Loss 0.3765 (0.3765) Acc@1 93.359 (93.359) Acc@5 98.340 (98.340) Mem 7376MB [2024-08-30 09:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.109) Loss 0.5830 (0.6027) Acc@1 88.867 (87.482) Acc@5 97.852 (97.710) Mem 7376MB [2024-08-30 09:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.094) Loss 0.8677 (0.6296) Acc@1 78.809 (86.440) Acc@5 96.191 (97.684) Mem 7376MB [2024-08-30 09:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.088) Loss 1.0879 (0.7135) Acc@1 73.438 (84.343) Acc@5 93.066 (96.796) Mem 7376MB [2024-08-30 09:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 0.9751 (0.7568) Acc@1 77.051 (83.203) Acc@5 94.727 (96.332) Mem 7376MB [2024-08-30 09:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.774 Acc@5 96.310 [2024-08-30 09:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 09:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.77% [2024-08-30 09:06:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... 
[2024-08-30 09:06:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-30 09:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][0/1251] eta 0:16:25 lr 0.000188 wd 0.0500 time 0.7882 (0.7882) data time 0.4895 (0.4895) model time 0.0000 (0.0000) loss 3.3051 (3.3051) grad_norm 3.4789 (3.4789) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][10/1251] eta 0:05:57 lr 0.000188 wd 0.0500 time 0.2394 (0.2883) data time 0.0010 (0.0455) model time 0.0000 (0.0000) loss 2.8488 (2.7783) grad_norm 2.9208 (5.6389) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][20/1251] eta 0:05:25 lr 0.000188 wd 0.0500 time 0.2381 (0.2648) data time 0.0011 (0.0244) model time 0.0000 (0.0000) loss 2.6788 (2.8209) grad_norm 3.0720 (4.4758) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][30/1251] eta 0:05:13 lr 0.000188 wd 0.0500 time 0.2453 (0.2568) data time 0.0008 (0.0168) model time 0.0000 (0.0000) loss 3.1881 (2.8206) grad_norm 3.5104 (4.1861) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][40/1251] eta 0:05:05 lr 0.000188 wd 0.0500 time 0.2448 (0.2526) data time 0.0008 (0.0130) model time 0.0000 (0.0000) loss 3.0745 (2.8743) grad_norm 3.3882 (4.0287) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][50/1251] eta 0:05:01 lr 0.000188 wd 0.0500 time 0.2409 (0.2507) data time 0.0010 (0.0107) model time 0.0000 (0.0000) loss 2.5389 (2.9063) grad_norm 3.8688 (3.9882) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:06:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [222/300][60/1251] eta 0:04:56 lr 0.000188 wd 0.0500 time 0.2340 (0.2487) data time 0.0008 (0.0092) model time 0.2332 (0.2370) loss 1.8056 (2.9205) grad_norm 3.1972 (3.9931) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][70/1251] eta 0:04:52 lr 0.000188 wd 0.0500 time 0.2482 (0.2478) data time 0.0011 (0.0080) model time 0.2470 (0.2391) loss 3.3047 (2.9065) grad_norm 3.6316 (3.9273) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][80/1251] eta 0:04:49 lr 0.000187 wd 0.0500 time 0.2367 (0.2471) data time 0.0009 (0.0072) model time 0.2358 (0.2396) loss 2.9860 (2.9084) grad_norm 4.8765 (3.8788) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][90/1251] eta 0:04:46 lr 0.000187 wd 0.0500 time 0.2536 (0.2465) data time 0.0010 (0.0065) model time 0.2525 (0.2399) loss 2.4066 (2.9078) grad_norm 3.8166 (3.8653) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][100/1251] eta 0:04:42 lr 0.000187 wd 0.0500 time 0.2358 (0.2457) data time 0.0011 (0.0060) model time 0.2347 (0.2394) loss 3.2974 (2.9151) grad_norm 2.6100 (3.8157) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][110/1251] eta 0:04:39 lr 0.000187 wd 0.0500 time 0.2471 (0.2452) data time 0.0011 (0.0055) model time 0.2460 (0.2394) loss 2.7728 (2.9108) grad_norm 2.9575 (3.7697) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][120/1251] eta 0:04:36 lr 0.000187 wd 0.0500 time 0.2400 (0.2448) data time 0.0011 (0.0052) model time 0.2389 (0.2394) loss 3.3790 (2.9059) grad_norm 3.4727 (3.7493) loss_scale 1024.0000 
(533.1570) mem 7379MB [2024-08-30 09:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][130/1251] eta 0:04:33 lr 0.000187 wd 0.0500 time 0.2419 (0.2444) data time 0.0010 (0.0048) model time 0.2409 (0.2393) loss 2.5654 (2.9092) grad_norm 4.3977 (3.7288) loss_scale 1024.0000 (570.6260) mem 7379MB [2024-08-30 09:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][140/1251] eta 0:04:33 lr 0.000187 wd 0.0500 time 0.2396 (0.2459) data time 0.0011 (0.0046) model time 0.2385 (0.2421) loss 2.6473 (2.8976) grad_norm 2.9851 (3.7035) loss_scale 1024.0000 (602.7801) mem 7379MB [2024-08-30 09:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][150/1251] eta 0:04:30 lr 0.000187 wd 0.0500 time 0.2528 (0.2455) data time 0.0011 (0.0043) model time 0.2517 (0.2418) loss 3.1930 (2.8977) grad_norm 2.9000 (3.7301) loss_scale 1024.0000 (630.6755) mem 7379MB [2024-08-30 09:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][160/1251] eta 0:04:27 lr 0.000187 wd 0.0500 time 0.2339 (0.2453) data time 0.0008 (0.0041) model time 0.2331 (0.2417) loss 3.2501 (2.8925) grad_norm 3.9327 (3.7093) loss_scale 1024.0000 (655.1056) mem 7379MB [2024-08-30 09:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][170/1251] eta 0:04:24 lr 0.000187 wd 0.0500 time 0.2415 (0.2450) data time 0.0008 (0.0039) model time 0.2407 (0.2414) loss 1.8816 (2.8922) grad_norm 2.4572 (3.7044) loss_scale 1024.0000 (676.6784) mem 7379MB [2024-08-30 09:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][180/1251] eta 0:04:22 lr 0.000187 wd 0.0500 time 0.2342 (0.2448) data time 0.0010 (0.0038) model time 0.2333 (0.2415) loss 3.2366 (2.8908) grad_norm 2.1521 (3.6845) loss_scale 1024.0000 (695.8674) mem 7379MB [2024-08-30 09:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][190/1251] eta 0:04:19 lr 0.000187 wd 0.0500 time 0.2387 (0.2447) data time 0.0008 (0.0036) 
model time 0.2379 (0.2414) loss 2.8498 (2.8756) grad_norm 4.2207 (3.6711) loss_scale 1024.0000 (713.0471) mem 7379MB [2024-08-30 09:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][200/1251] eta 0:04:16 lr 0.000187 wd 0.0500 time 0.2373 (0.2444) data time 0.0009 (0.0035) model time 0.2364 (0.2412) loss 2.9430 (2.8716) grad_norm 3.0421 (3.6877) loss_scale 1024.0000 (728.5174) mem 7379MB [2024-08-30 09:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][210/1251] eta 0:04:14 lr 0.000187 wd 0.0500 time 0.2472 (0.2442) data time 0.0008 (0.0034) model time 0.2465 (0.2411) loss 3.1895 (2.8662) grad_norm 2.8719 (3.7009) loss_scale 1024.0000 (742.5213) mem 7379MB [2024-08-30 09:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][220/1251] eta 0:04:11 lr 0.000187 wd 0.0500 time 0.2327 (0.2440) data time 0.0011 (0.0033) model time 0.2316 (0.2410) loss 3.1874 (2.8739) grad_norm 3.0590 (3.7041) loss_scale 1024.0000 (755.2579) mem 7379MB [2024-08-30 09:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][230/1251] eta 0:04:09 lr 0.000187 wd 0.0500 time 0.2364 (0.2439) data time 0.0010 (0.0032) model time 0.2355 (0.2409) loss 1.7882 (2.8696) grad_norm 4.5743 (3.7945) loss_scale 1024.0000 (766.8918) mem 7379MB [2024-08-30 09:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][240/1251] eta 0:04:08 lr 0.000187 wd 0.0500 time 0.2345 (0.2459) data time 0.0010 (0.0031) model time 0.2335 (0.2436) loss 2.8652 (2.8607) grad_norm 2.6474 (3.7840) loss_scale 1024.0000 (777.5602) mem 7379MB [2024-08-30 09:07:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][250/1251] eta 0:04:05 lr 0.000187 wd 0.0500 time 0.2355 (0.2457) data time 0.0010 (0.0030) model time 0.2345 (0.2434) loss 3.4138 (2.8613) grad_norm 4.0270 (3.8006) loss_scale 1024.0000 (787.3785) mem 7379MB [2024-08-30 09:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[222/300][260/1251] eta 0:04:03 lr 0.000187 wd 0.0500 time 0.2373 (0.2455) data time 0.0008 (0.0029) model time 0.2365 (0.2431) loss 3.6418 (2.8719) grad_norm 2.9164 (3.7909) loss_scale 1024.0000 (796.4444) mem 7379MB [2024-08-30 09:07:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][270/1251] eta 0:04:00 lr 0.000187 wd 0.0500 time 0.2425 (0.2453) data time 0.0008 (0.0029) model time 0.2417 (0.2430) loss 2.3708 (2.8753) grad_norm 2.7271 (3.7680) loss_scale 1024.0000 (804.8413) mem 7379MB [2024-08-30 09:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][280/1251] eta 0:03:58 lr 0.000187 wd 0.0500 time 0.2377 (0.2452) data time 0.0011 (0.0028) model time 0.2366 (0.2429) loss 3.1272 (2.8776) grad_norm 3.0440 (3.7546) loss_scale 1024.0000 (812.6406) mem 7379MB [2024-08-30 09:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][290/1251] eta 0:03:55 lr 0.000187 wd 0.0500 time 0.2428 (0.2451) data time 0.0008 (0.0028) model time 0.2419 (0.2428) loss 3.2181 (2.8826) grad_norm 3.1863 (3.7625) loss_scale 1024.0000 (819.9038) mem 7379MB [2024-08-30 09:07:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][300/1251] eta 0:03:53 lr 0.000187 wd 0.0500 time 0.2410 (0.2450) data time 0.0009 (0.0027) model time 0.2401 (0.2428) loss 2.5350 (2.8836) grad_norm 4.2094 (3.7470) loss_scale 1024.0000 (826.6844) mem 7379MB [2024-08-30 09:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][310/1251] eta 0:03:50 lr 0.000187 wd 0.0500 time 0.2383 (0.2450) data time 0.0011 (0.0026) model time 0.2372 (0.2428) loss 2.2384 (2.8846) grad_norm 3.4746 (3.7503) loss_scale 1024.0000 (833.0289) mem 7379MB [2024-08-30 09:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][320/1251] eta 0:03:47 lr 0.000187 wd 0.0500 time 0.2374 (0.2449) data time 0.0008 (0.0026) model time 0.2367 (0.2427) loss 1.5804 (2.8787) grad_norm 4.9840 (3.7451) loss_scale 1024.0000 
(838.9782) mem 7379MB [2024-08-30 09:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][330/1251] eta 0:03:45 lr 0.000187 wd 0.0500 time 0.2357 (0.2448) data time 0.0009 (0.0025) model time 0.2348 (0.2427) loss 3.5046 (2.8777) grad_norm 2.7551 (3.7320) loss_scale 1024.0000 (844.5680) mem 7379MB [2024-08-30 09:08:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][340/1251] eta 0:03:42 lr 0.000187 wd 0.0500 time 0.2392 (0.2446) data time 0.0010 (0.0025) model time 0.2381 (0.2425) loss 3.1388 (2.8817) grad_norm 3.7798 (3.7440) loss_scale 1024.0000 (849.8299) mem 7379MB [2024-08-30 09:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][350/1251] eta 0:03:40 lr 0.000187 wd 0.0500 time 0.2389 (0.2445) data time 0.0010 (0.0025) model time 0.2378 (0.2423) loss 3.0591 (2.8879) grad_norm 3.4650 (3.7418) loss_scale 1024.0000 (854.7920) mem 7379MB [2024-08-30 09:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][360/1251] eta 0:03:37 lr 0.000187 wd 0.0500 time 0.2350 (0.2443) data time 0.0008 (0.0024) model time 0.2342 (0.2422) loss 3.3925 (2.8862) grad_norm 2.7539 (3.7391) loss_scale 1024.0000 (859.4792) mem 7379MB [2024-08-30 09:08:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][370/1251] eta 0:03:35 lr 0.000187 wd 0.0500 time 0.2396 (0.2442) data time 0.0008 (0.0024) model time 0.2388 (0.2421) loss 3.1498 (2.8937) grad_norm 2.8386 (3.7566) loss_scale 1024.0000 (863.9137) mem 7379MB [2024-08-30 09:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][380/1251] eta 0:03:32 lr 0.000186 wd 0.0500 time 0.2349 (0.2440) data time 0.0007 (0.0023) model time 0.2342 (0.2420) loss 3.4377 (2.8966) grad_norm 2.2172 (3.7450) loss_scale 1024.0000 (868.1155) mem 7379MB [2024-08-30 09:08:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][390/1251] eta 0:03:30 lr 0.000186 wd 0.0500 time 0.2387 (0.2439) data time 0.0009 (0.0023) 
model time 0.2378 (0.2419) loss 3.0457 (2.8962) grad_norm 2.6375 (3.7491) loss_scale 1024.0000 (872.1023) mem 7379MB [2024-08-30 09:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][400/1251] eta 0:03:27 lr 0.000186 wd 0.0500 time 0.2497 (0.2439) data time 0.0011 (0.0023) model time 0.2486 (0.2418) loss 3.5942 (2.8919) grad_norm 2.9202 (3.7351) loss_scale 1024.0000 (875.8903) mem 7379MB [2024-08-30 09:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][410/1251] eta 0:03:25 lr 0.000186 wd 0.0500 time 0.2462 (0.2438) data time 0.0011 (0.0022) model time 0.2451 (0.2417) loss 3.0241 (2.8832) grad_norm 2.4368 (3.7248) loss_scale 1024.0000 (879.4939) mem 7379MB [2024-08-30 09:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][420/1251] eta 0:03:22 lr 0.000186 wd 0.0500 time 0.2364 (0.2436) data time 0.0009 (0.0022) model time 0.2355 (0.2416) loss 3.3461 (2.8837) grad_norm 2.4398 (3.7354) loss_scale 1024.0000 (882.9264) mem 7379MB [2024-08-30 09:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][430/1251] eta 0:03:20 lr 0.000186 wd 0.0500 time 0.2363 (0.2436) data time 0.0010 (0.0022) model time 0.2353 (0.2416) loss 3.1735 (2.8892) grad_norm 5.3639 (3.7628) loss_scale 1024.0000 (886.1995) mem 7379MB [2024-08-30 09:08:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][440/1251] eta 0:03:17 lr 0.000186 wd 0.0500 time 0.2405 (0.2435) data time 0.0011 (0.0022) model time 0.2394 (0.2416) loss 3.2383 (2.8888) grad_norm 3.9735 (3.7718) loss_scale 1024.0000 (889.3243) mem 7379MB [2024-08-30 09:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][450/1251] eta 0:03:15 lr 0.000186 wd 0.0500 time 0.2422 (0.2435) data time 0.0008 (0.0021) model time 0.2414 (0.2416) loss 3.4984 (2.8919) grad_norm 2.7215 (3.7582) loss_scale 1024.0000 (892.3104) mem 7379MB [2024-08-30 09:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[222/300][460/1251] eta 0:03:12 lr 0.000186 wd 0.0500 time 0.2341 (0.2435) data time 0.0008 (0.0021) model time 0.2333 (0.2415) loss 2.4832 (2.8928) grad_norm 3.5453 (3.7589) loss_scale 1024.0000 (895.1670) mem 7379MB [2024-08-30 09:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][470/1251] eta 0:03:10 lr 0.000186 wd 0.0500 time 0.2406 (0.2434) data time 0.0010 (0.0021) model time 0.2396 (0.2414) loss 3.0244 (2.8895) grad_norm 4.7171 (3.7587) loss_scale 1024.0000 (897.9023) mem 7379MB [2024-08-30 09:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][480/1251] eta 0:03:07 lr 0.000186 wd 0.0500 time 0.2365 (0.2433) data time 0.0010 (0.0021) model time 0.2355 (0.2414) loss 2.8938 (2.8889) grad_norm 3.8390 (3.7467) loss_scale 1024.0000 (900.5239) mem 7379MB [2024-08-30 09:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][490/1251] eta 0:03:05 lr 0.000186 wd 0.0500 time 0.2375 (0.2433) data time 0.0008 (0.0020) model time 0.2367 (0.2414) loss 3.1981 (2.8917) grad_norm 9.6182 (3.7467) loss_scale 1024.0000 (903.0387) mem 7379MB [2024-08-30 09:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][500/1251] eta 0:03:02 lr 0.000186 wd 0.0500 time 0.2434 (0.2432) data time 0.0011 (0.0020) model time 0.2423 (0.2413) loss 2.8381 (2.8961) grad_norm 3.7270 (3.7488) loss_scale 1024.0000 (905.4531) mem 7379MB [2024-08-30 09:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][510/1251] eta 0:03:00 lr 0.000186 wd 0.0500 time 0.2398 (0.2431) data time 0.0012 (0.0020) model time 0.2387 (0.2412) loss 3.1187 (2.8989) grad_norm 4.1349 (3.7475) loss_scale 1024.0000 (907.7730) mem 7379MB [2024-08-30 09:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][520/1251] eta 0:02:57 lr 0.000186 wd 0.0500 time 0.2320 (0.2430) data time 0.0008 (0.0020) model time 0.2312 (0.2411) loss 3.9760 (2.9007) grad_norm 3.9185 (3.7627) loss_scale 1024.0000 
(910.0038) mem 7379MB [2024-08-30 09:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][530/1251] eta 0:02:55 lr 0.000186 wd 0.0500 time 0.2443 (0.2430) data time 0.0012 (0.0020) model time 0.2431 (0.2411) loss 2.9739 (2.9032) grad_norm 3.0807 (3.7573) loss_scale 1024.0000 (912.1507) mem 7379MB [2024-08-30 09:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][540/1251] eta 0:02:52 lr 0.000186 wd 0.0500 time 0.2445 (0.2430) data time 0.0011 (0.0020) model time 0.2435 (0.2411) loss 2.0473 (2.9014) grad_norm 3.0491 (3.7475) loss_scale 1024.0000 (914.2181) mem 7379MB [2024-08-30 09:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][550/1251] eta 0:02:50 lr 0.000186 wd 0.0500 time 0.2362 (0.2429) data time 0.0010 (0.0019) model time 0.2352 (0.2411) loss 2.8675 (2.9013) grad_norm 3.1337 (3.7368) loss_scale 1024.0000 (916.2105) mem 7379MB [2024-08-30 09:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][560/1251] eta 0:02:47 lr 0.000186 wd 0.0500 time 0.2366 (0.2429) data time 0.0008 (0.0019) model time 0.2358 (0.2411) loss 3.4399 (2.9033) grad_norm 4.1787 (3.7261) loss_scale 1024.0000 (918.1319) mem 7379MB [2024-08-30 09:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][570/1251] eta 0:02:45 lr 0.000186 wd 0.0500 time 0.2367 (0.2428) data time 0.0010 (0.0019) model time 0.2357 (0.2410) loss 3.4376 (2.9061) grad_norm 2.6244 (3.7149) loss_scale 1024.0000 (919.9860) mem 7379MB [2024-08-30 09:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][580/1251] eta 0:02:42 lr 0.000186 wd 0.0500 time 0.2431 (0.2428) data time 0.0008 (0.0019) model time 0.2423 (0.2410) loss 1.9011 (2.9045) grad_norm 4.0195 (3.7129) loss_scale 1024.0000 (921.7762) mem 7379MB [2024-08-30 09:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][590/1251] eta 0:02:40 lr 0.000186 wd 0.0500 time 0.2347 (0.2427) data time 0.0008 (0.0019) 
model time 0.2339 (0.2409) loss 3.8892 (2.9070) grad_norm 3.6767 (3.7107) loss_scale 1024.0000 (923.5059) mem 7379MB [2024-08-30 09:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][600/1251] eta 0:02:38 lr 0.000186 wd 0.0500 time 0.2371 (0.2427) data time 0.0008 (0.0019) model time 0.2363 (0.2409) loss 1.8426 (2.9019) grad_norm 4.2505 (3.7181) loss_scale 1024.0000 (925.1780) mem 7379MB [2024-08-30 09:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][610/1251] eta 0:02:35 lr 0.000186 wd 0.0500 time 0.2374 (0.2427) data time 0.0008 (0.0018) model time 0.2366 (0.2409) loss 3.5610 (2.9060) grad_norm 2.9404 (3.7251) loss_scale 1024.0000 (926.7954) mem 7379MB [2024-08-30 09:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][620/1251] eta 0:02:33 lr 0.000186 wd 0.0500 time 0.2390 (0.2426) data time 0.0009 (0.0018) model time 0.2382 (0.2408) loss 3.2755 (2.9062) grad_norm 2.9480 (3.7343) loss_scale 1024.0000 (928.3607) mem 7379MB [2024-08-30 09:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][630/1251] eta 0:02:30 lr 0.000186 wd 0.0500 time 0.2373 (0.2426) data time 0.0011 (0.0018) model time 0.2362 (0.2408) loss 1.9737 (2.9049) grad_norm 3.0484 (3.7348) loss_scale 1024.0000 (929.8764) mem 7379MB [2024-08-30 09:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][640/1251] eta 0:02:28 lr 0.000186 wd 0.0500 time 0.2337 (0.2425) data time 0.0010 (0.0018) model time 0.2327 (0.2407) loss 3.3591 (2.9025) grad_norm 3.4976 (3.7228) loss_scale 1024.0000 (931.3448) mem 7379MB [2024-08-30 09:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][650/1251] eta 0:02:25 lr 0.000186 wd 0.0500 time 0.2462 (0.2425) data time 0.0008 (0.0018) model time 0.2454 (0.2408) loss 1.8782 (2.9020) grad_norm 3.5449 (3.7168) loss_scale 1024.0000 (932.7680) mem 7379MB [2024-08-30 09:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[222/300][660/1251] eta 0:02:23 lr 0.000186 wd 0.0500 time 0.2453 (0.2425) data time 0.0007 (0.0018) model time 0.2445 (0.2407) loss 3.5001 (2.8994) grad_norm 3.2410 (3.7149) loss_scale 1024.0000 (934.1483) mem 7379MB [2024-08-30 09:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][670/1251] eta 0:02:20 lr 0.000185 wd 0.0500 time 0.2419 (0.2425) data time 0.0010 (0.0018) model time 0.2409 (0.2407) loss 3.1647 (2.8986) grad_norm 15.6913 (3.7447) loss_scale 1024.0000 (935.4873) mem 7379MB [2024-08-30 09:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][680/1251] eta 0:02:18 lr 0.000185 wd 0.0500 time 0.2557 (0.2425) data time 0.0012 (0.0018) model time 0.2545 (0.2408) loss 3.3833 (2.9026) grad_norm 6.8264 (3.7644) loss_scale 1024.0000 (936.7871) mem 7379MB [2024-08-30 09:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][690/1251] eta 0:02:16 lr 0.000185 wd 0.0500 time 0.2491 (0.2425) data time 0.0011 (0.0018) model time 0.2481 (0.2408) loss 3.4102 (2.9013) grad_norm 4.7646 (3.7665) loss_scale 1024.0000 (938.0492) mem 7379MB [2024-08-30 09:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][700/1251] eta 0:02:13 lr 0.000185 wd 0.0500 time 0.2472 (0.2424) data time 0.0010 (0.0017) model time 0.2461 (0.2408) loss 2.9464 (2.9000) grad_norm 2.8966 (3.7674) loss_scale 1024.0000 (939.2753) mem 7379MB [2024-08-30 09:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][710/1251] eta 0:02:11 lr 0.000185 wd 0.0500 time 0.2419 (0.2424) data time 0.0008 (0.0017) model time 0.2411 (0.2407) loss 3.5220 (2.9019) grad_norm 3.3697 (3.7619) loss_scale 1024.0000 (940.4669) mem 7379MB [2024-08-30 09:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][720/1251] eta 0:02:08 lr 0.000185 wd 0.0500 time 0.2362 (0.2424) data time 0.0009 (0.0017) model time 0.2353 (0.2407) loss 3.5585 (2.9012) grad_norm 3.1779 (3.7596) loss_scale 1024.0000 
(941.6255) mem 7379MB [2024-08-30 09:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][730/1251] eta 0:02:06 lr 0.000185 wd 0.0500 time 0.2432 (0.2424) data time 0.0009 (0.0017) model time 0.2423 (0.2407) loss 2.5535 (2.9006) grad_norm 3.2392 (3.7516) loss_scale 1024.0000 (942.7524) mem 7379MB [2024-08-30 09:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][740/1251] eta 0:02:03 lr 0.000185 wd 0.0500 time 0.2526 (0.2424) data time 0.0008 (0.0017) model time 0.2518 (0.2407) loss 3.4603 (2.9017) grad_norm 4.2799 (3.7674) loss_scale 1024.0000 (943.8489) mem 7379MB [2024-08-30 09:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][750/1251] eta 0:02:01 lr 0.000185 wd 0.0500 time 0.2395 (0.2424) data time 0.0008 (0.0017) model time 0.2387 (0.2407) loss 2.1699 (2.8954) grad_norm 5.5554 (3.7681) loss_scale 1024.0000 (944.9161) mem 7379MB [2024-08-30 09:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][760/1251] eta 0:01:59 lr 0.000185 wd 0.0500 time 0.2378 (0.2428) data time 0.0009 (0.0017) model time 0.2369 (0.2412) loss 3.0132 (2.8974) grad_norm 3.5999 (3.7640) loss_scale 1024.0000 (945.9553) mem 7379MB [2024-08-30 09:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][770/1251] eta 0:01:56 lr 0.000185 wd 0.0500 time 0.2401 (0.2427) data time 0.0010 (0.0017) model time 0.2391 (0.2411) loss 2.0686 (2.8927) grad_norm 3.3517 (3.7599) loss_scale 1024.0000 (946.9676) mem 7379MB [2024-08-30 09:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][780/1251] eta 0:01:54 lr 0.000185 wd 0.0500 time 0.2528 (0.2427) data time 0.0008 (0.0017) model time 0.2520 (0.2411) loss 3.0041 (2.8929) grad_norm 3.0958 (3.7644) loss_scale 1024.0000 (947.9539) mem 7379MB [2024-08-30 09:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][790/1251] eta 0:01:51 lr 0.000185 wd 0.0500 time 0.2367 (0.2427) data time 0.0008 (0.0017) 
model time 0.2360 (0.2411) loss 3.6649 (2.8917) grad_norm 2.5116 (3.7730) loss_scale 1024.0000 (948.9153) mem 7379MB [2024-08-30 09:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][800/1251] eta 0:01:49 lr 0.000185 wd 0.0500 time 0.2376 (0.2426) data time 0.0011 (0.0017) model time 0.2365 (0.2410) loss 3.4247 (2.8911) grad_norm 4.3911 (3.7692) loss_scale 1024.0000 (949.8527) mem 7379MB [2024-08-30 09:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][810/1251] eta 0:01:46 lr 0.000185 wd 0.0500 time 0.2491 (0.2426) data time 0.0008 (0.0016) model time 0.2482 (0.2410) loss 3.0880 (2.8940) grad_norm 10.8311 (3.7889) loss_scale 1024.0000 (950.7670) mem 7379MB [2024-08-30 09:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][820/1251] eta 0:01:44 lr 0.000185 wd 0.0500 time 0.2395 (0.2426) data time 0.0008 (0.0016) model time 0.2387 (0.2410) loss 3.2719 (2.8923) grad_norm 2.8500 (3.7867) loss_scale 1024.0000 (951.6590) mem 7379MB [2024-08-30 09:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][830/1251] eta 0:01:42 lr 0.000185 wd 0.0500 time 0.2382 (0.2425) data time 0.0011 (0.0016) model time 0.2371 (0.2409) loss 2.0477 (2.8901) grad_norm 3.6488 (3.7896) loss_scale 1024.0000 (952.5295) mem 7379MB [2024-08-30 09:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][840/1251] eta 0:01:39 lr 0.000185 wd 0.0500 time 0.2470 (0.2425) data time 0.0011 (0.0016) model time 0.2459 (0.2409) loss 3.2038 (2.8902) grad_norm 6.1834 (3.7882) loss_scale 1024.0000 (953.3793) mem 7379MB [2024-08-30 09:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][850/1251] eta 0:01:37 lr 0.000185 wd 0.0500 time 0.2499 (0.2425) data time 0.0010 (0.0016) model time 0.2489 (0.2409) loss 2.6042 (2.8908) grad_norm 4.5315 (3.7919) loss_scale 1024.0000 (954.2092) mem 7379MB [2024-08-30 09:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[222/300][860/1251] eta 0:01:34 lr 0.000185 wd 0.0500 time 0.2406 (0.2425) data time 0.0008 (0.0016) model time 0.2398 (0.2409) loss 3.4446 (2.8923) grad_norm 3.1971 (3.7905) loss_scale 1024.0000 (955.0197) mem 7379MB [2024-08-30 09:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][870/1251] eta 0:01:32 lr 0.000185 wd 0.0500 time 0.2433 (0.2424) data time 0.0010 (0.0016) model time 0.2424 (0.2409) loss 3.5619 (2.8953) grad_norm 5.7754 (3.7923) loss_scale 1024.0000 (955.8117) mem 7379MB [2024-08-30 09:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][880/1251] eta 0:01:29 lr 0.000185 wd 0.0500 time 0.2399 (0.2424) data time 0.0008 (0.0016) model time 0.2390 (0.2408) loss 3.4203 (2.8944) grad_norm 2.6031 (3.7946) loss_scale 1024.0000 (956.5857) mem 7379MB [2024-08-30 09:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][890/1251] eta 0:01:27 lr 0.000185 wd 0.0500 time 0.2433 (0.2424) data time 0.0010 (0.0016) model time 0.2422 (0.2408) loss 2.0412 (2.8897) grad_norm 2.9973 (3.7911) loss_scale 1024.0000 (957.3423) mem 7379MB [2024-08-30 09:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][900/1251] eta 0:01:25 lr 0.000185 wd 0.0500 time 0.2431 (0.2423) data time 0.0010 (0.0016) model time 0.2421 (0.2408) loss 3.3728 (2.8895) grad_norm 2.6878 (3.7851) loss_scale 1024.0000 (958.0821) mem 7379MB [2024-08-30 09:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][910/1251] eta 0:01:22 lr 0.000185 wd 0.0500 time 0.2416 (0.2423) data time 0.0011 (0.0016) model time 0.2405 (0.2407) loss 2.8032 (2.8913) grad_norm 4.8657 (3.7840) loss_scale 1024.0000 (958.8057) mem 7379MB [2024-08-30 09:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][920/1251] eta 0:01:20 lr 0.000185 wd 0.0500 time 0.2410 (0.2422) data time 0.0011 (0.0016) model time 0.2399 (0.2407) loss 3.5592 (2.8921) grad_norm 3.4521 (3.7834) loss_scale 1024.0000 
(959.5136) mem 7379MB [2024-08-30 09:10:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][930/1251] eta 0:01:17 lr 0.000185 wd 0.0500 time 0.2420 (0.2422) data time 0.0012 (0.0016) model time 0.2409 (0.2407) loss 2.5152 (2.8908) grad_norm 2.9750 (inf) loss_scale 512.0000 (956.9066) mem 7379MB [2024-08-30 09:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][940/1251] eta 0:01:15 lr 0.000185 wd 0.0500 time 0.2396 (0.2422) data time 0.0012 (0.0016) model time 0.2385 (0.2407) loss 3.0930 (2.8909) grad_norm 2.4379 (inf) loss_scale 512.0000 (952.1785) mem 7379MB [2024-08-30 09:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][950/1251] eta 0:01:12 lr 0.000185 wd 0.0500 time 0.2396 (0.2422) data time 0.0011 (0.0016) model time 0.2385 (0.2407) loss 3.2162 (2.8920) grad_norm 3.8885 (inf) loss_scale 512.0000 (947.5499) mem 7379MB [2024-08-30 09:10:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][960/1251] eta 0:01:10 lr 0.000185 wd 0.0500 time 0.2465 (0.2422) data time 0.0007 (0.0016) model time 0.2458 (0.2407) loss 2.9729 (2.8916) grad_norm 3.9018 (inf) loss_scale 512.0000 (943.0177) mem 7379MB [2024-08-30 09:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][970/1251] eta 0:01:08 lr 0.000184 wd 0.0500 time 0.2384 (0.2422) data time 0.0010 (0.0015) model time 0.2374 (0.2407) loss 2.7645 (2.8877) grad_norm 8.9059 (inf) loss_scale 512.0000 (938.5788) mem 7379MB [2024-08-30 09:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][980/1251] eta 0:01:05 lr 0.000184 wd 0.0500 time 0.2461 (0.2422) data time 0.0007 (0.0015) model time 0.2454 (0.2407) loss 3.1765 (2.8902) grad_norm 4.3689 (inf) loss_scale 512.0000 (934.2304) mem 7379MB [2024-08-30 09:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][990/1251] eta 0:01:03 lr 0.000184 wd 0.0500 time 0.2442 (0.2422) data time 0.0012 (0.0015) model time 0.2430 
(0.2407) loss 2.0077 (2.8894) grad_norm 3.3330 (inf) loss_scale 512.0000 (929.9697) mem 7379MB [2024-08-30 09:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1000/1251] eta 0:01:00 lr 0.000184 wd 0.0500 time 0.2452 (0.2422) data time 0.0009 (0.0015) model time 0.2442 (0.2407) loss 2.5289 (2.8911) grad_norm 3.0251 (inf) loss_scale 512.0000 (925.7942) mem 7379MB [2024-08-30 09:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1010/1251] eta 0:00:58 lr 0.000184 wd 0.0500 time 0.2442 (0.2422) data time 0.0011 (0.0015) model time 0.2431 (0.2407) loss 3.2686 (2.8910) grad_norm 2.7705 (inf) loss_scale 512.0000 (921.7013) mem 7379MB [2024-08-30 09:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1020/1251] eta 0:00:55 lr 0.000184 wd 0.0500 time 0.2447 (0.2422) data time 0.0010 (0.0015) model time 0.2437 (0.2407) loss 2.1476 (2.8904) grad_norm 5.2952 (inf) loss_scale 512.0000 (917.6885) mem 7379MB [2024-08-30 09:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1030/1251] eta 0:00:53 lr 0.000184 wd 0.0500 time 0.2400 (0.2422) data time 0.0010 (0.0015) model time 0.2390 (0.2408) loss 3.1827 (2.8897) grad_norm 3.7567 (inf) loss_scale 512.0000 (913.7536) mem 7379MB [2024-08-30 09:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1040/1251] eta 0:00:51 lr 0.000184 wd 0.0500 time 0.2458 (0.2422) data time 0.0011 (0.0015) model time 0.2447 (0.2408) loss 3.2796 (2.8892) grad_norm 3.6962 (inf) loss_scale 512.0000 (909.8943) mem 7379MB [2024-08-30 09:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1050/1251] eta 0:00:48 lr 0.000184 wd 0.0500 time 0.2364 (0.2422) data time 0.0009 (0.0015) model time 0.2355 (0.2408) loss 3.6486 (2.8910) grad_norm 2.9401 (inf) loss_scale 512.0000 (906.1085) mem 7379MB [2024-08-30 09:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1060/1251] eta 0:00:46 lr 0.000184 wd 
0.0500 time 0.2395 (0.2422) data time 0.0010 (0.0015) model time 0.2385 (0.2407) loss 2.4409 (2.8927) grad_norm 4.4932 (inf) loss_scale 512.0000 (902.3940) mem 7379MB [2024-08-30 09:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1070/1251] eta 0:00:43 lr 0.000184 wd 0.0500 time 0.2422 (0.2424) data time 0.0009 (0.0015) model time 0.2413 (0.2409) loss 3.2999 (2.8936) grad_norm 2.4816 (inf) loss_scale 512.0000 (898.7488) mem 7379MB [2024-08-30 09:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1080/1251] eta 0:00:41 lr 0.000184 wd 0.0500 time 0.2465 (0.2424) data time 0.0010 (0.0015) model time 0.2455 (0.2409) loss 2.4815 (2.8923) grad_norm 3.9836 (inf) loss_scale 512.0000 (895.1711) mem 7379MB [2024-08-30 09:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1090/1251] eta 0:00:39 lr 0.000184 wd 0.0500 time 0.2420 (0.2424) data time 0.0010 (0.0015) model time 0.2410 (0.2409) loss 3.2959 (2.8931) grad_norm 4.2346 (inf) loss_scale 512.0000 (891.6590) mem 7379MB [2024-08-30 09:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1100/1251] eta 0:00:36 lr 0.000184 wd 0.0500 time 0.2386 (0.2423) data time 0.0008 (0.0015) model time 0.2379 (0.2409) loss 3.3112 (2.8926) grad_norm 2.6386 (inf) loss_scale 512.0000 (888.2107) mem 7379MB [2024-08-30 09:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1110/1251] eta 0:00:34 lr 0.000184 wd 0.0500 time 0.2380 (0.2423) data time 0.0010 (0.0015) model time 0.2369 (0.2409) loss 2.7266 (2.8925) grad_norm 3.0258 (inf) loss_scale 512.0000 (884.8245) mem 7379MB [2024-08-30 09:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1120/1251] eta 0:00:31 lr 0.000184 wd 0.0500 time 0.2487 (0.2423) data time 0.0010 (0.0015) model time 0.2477 (0.2409) loss 2.7020 (2.8960) grad_norm 3.4400 (inf) loss_scale 512.0000 (881.4987) mem 7379MB [2024-08-30 09:11:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [222/300][1130/1251] eta 0:00:29 lr 0.000184 wd 0.0500 time 0.2400 (0.2423) data time 0.0010 (0.0015) model time 0.2390 (0.2409) loss 2.2358 (2.8980) grad_norm 3.5818 (inf) loss_scale 512.0000 (878.2317) mem 7379MB [2024-08-30 09:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1140/1251] eta 0:00:26 lr 0.000184 wd 0.0500 time 0.2474 (0.2423) data time 0.0008 (0.0015) model time 0.2466 (0.2409) loss 3.2632 (2.8997) grad_norm 4.0481 (inf) loss_scale 512.0000 (875.0219) mem 7379MB [2024-08-30 09:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1150/1251] eta 0:00:24 lr 0.000184 wd 0.0500 time 0.2434 (0.2423) data time 0.0008 (0.0015) model time 0.2426 (0.2409) loss 1.9941 (2.8998) grad_norm 3.2648 (inf) loss_scale 512.0000 (871.8679) mem 7379MB [2024-08-30 09:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1160/1251] eta 0:00:22 lr 0.000184 wd 0.0500 time 0.2417 (0.2423) data time 0.0008 (0.0015) model time 0.2409 (0.2409) loss 2.3311 (2.8990) grad_norm 2.6351 (inf) loss_scale 512.0000 (868.7683) mem 7379MB [2024-08-30 09:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1170/1251] eta 0:00:19 lr 0.000184 wd 0.0500 time 0.2389 (0.2423) data time 0.0011 (0.0015) model time 0.2378 (0.2409) loss 3.1858 (2.9000) grad_norm 5.0768 (inf) loss_scale 512.0000 (865.7216) mem 7379MB [2024-08-30 09:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1180/1251] eta 0:00:17 lr 0.000184 wd 0.0500 time 0.2388 (0.2424) data time 0.0008 (0.0015) model time 0.2381 (0.2409) loss 3.5111 (2.8998) grad_norm 2.7026 (inf) loss_scale 512.0000 (862.7265) mem 7379MB [2024-08-30 09:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1190/1251] eta 0:00:14 lr 0.000184 wd 0.0500 time 0.2327 (0.2424) data time 0.0008 (0.0015) model time 0.2319 (0.2409) loss 3.0128 (2.8985) grad_norm 4.3335 (inf) loss_scale 
512.0000 (859.7817) mem 7379MB [2024-08-30 09:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1200/1251] eta 0:00:12 lr 0.000184 wd 0.0500 time 0.2396 (0.2424) data time 0.0011 (0.0015) model time 0.2385 (0.2409) loss 3.0017 (2.8998) grad_norm 3.5702 (inf) loss_scale 512.0000 (856.8859) mem 7379MB [2024-08-30 09:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1210/1251] eta 0:00:09 lr 0.000184 wd 0.0500 time 0.2483 (0.2424) data time 0.0008 (0.0015) model time 0.2476 (0.2409) loss 2.2908 (2.9012) grad_norm 2.6152 (inf) loss_scale 512.0000 (854.0380) mem 7379MB [2024-08-30 09:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1220/1251] eta 0:00:07 lr 0.000184 wd 0.0500 time 0.2372 (0.2424) data time 0.0011 (0.0014) model time 0.2361 (0.2409) loss 3.7769 (2.9045) grad_norm 2.6381 (inf) loss_scale 512.0000 (851.2367) mem 7379MB [2024-08-30 09:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1230/1251] eta 0:00:05 lr 0.000184 wd 0.0500 time 0.2430 (0.2424) data time 0.0008 (0.0014) model time 0.2422 (0.2410) loss 3.1559 (2.9048) grad_norm 4.1693 (inf) loss_scale 512.0000 (848.4809) mem 7379MB [2024-08-30 09:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1240/1251] eta 0:00:02 lr 0.000184 wd 0.0500 time 0.2239 (0.2423) data time 0.0005 (0.0014) model time 0.2234 (0.2409) loss 3.5313 (2.9055) grad_norm 3.1156 (inf) loss_scale 512.0000 (845.7695) mem 7379MB [2024-08-30 09:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [222/300][1250/1251] eta 0:00:00 lr 0.000184 wd 0.0500 time 0.2216 (0.2421) data time 0.0005 (0.0014) model time 0.2211 (0.2407) loss 1.7431 (2.9036) grad_norm 3.3117 (inf) loss_scale 512.0000 (843.1015) mem 7379MB [2024-08-30 09:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 222 training takes 0:05:02 [2024-08-30 09:11:43 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 09:11:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 09:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.488 (0.488) Loss 0.3899 (0.3899) Acc@1 92.383 (92.383) Acc@5 98.242 (98.242) Mem 7379MB [2024-08-30 09:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.112) Loss 0.6318 (0.6356) Acc@1 87.695 (86.319) Acc@5 97.754 (97.541) Mem 7379MB [2024-08-30 09:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.097) Loss 0.9487 (0.6663) Acc@1 77.246 (85.268) Acc@5 95.312 (97.447) Mem 7379MB [2024-08-30 09:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.090) Loss 1.1152 (0.7547) Acc@1 73.340 (83.254) Acc@5 93.164 (96.462) Mem 7379MB [2024-08-30 09:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0664 (0.8043) Acc@1 74.023 (82.005) Acc@5 94.141 (95.937) Mem 7379MB [2024-08-30 09:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.546 Acc@5 95.894 [2024-08-30 09:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.5% [2024-08-30 09:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.785 (0.785) Loss 0.3762 (0.3762) Acc@1 93.359 (93.359) Acc@5 98.340 (98.340) Mem 7379MB [2024-08-30 09:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.144) Loss 0.5835 (0.6026) Acc@1 89.258 (87.536) Acc@5 97.852 (97.718) Mem 7379MB [2024-08-30 09:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.113) Loss 0.8691 (0.6296) Acc@1 78.516 (86.435) Acc@5 96.094 (97.675) Mem 7379MB [2024-08-30 09:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.076 (0.102) Loss 1.0869 (0.7134) Acc@1 73.633 (84.337) Acc@5 93.066 (96.774) Mem 7379MB [2024-08-30 09:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.067 (0.093) Loss 0.9780 (0.7568) Acc@1 77.051 (83.225) Acc@5 94.727 (96.327) Mem 7379MB [2024-08-30 09:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.798 Acc@5 96.308 [2024-08-30 09:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 09:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.80% [2024-08-30 09:11:52 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 09:11:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-30 09:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][0/1251] eta 0:14:57 lr 0.000184 wd 0.0500 time 0.7177 (0.7177) data time 0.4994 (0.4994) model time 0.0000 (0.0000) loss 3.0214 (3.0214) grad_norm 3.1828 (3.1828) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][10/1251] eta 0:05:52 lr 0.000183 wd 0.0500 time 0.2410 (0.2839) data time 0.0007 (0.0463) model time 0.0000 (0.0000) loss 2.4234 (2.5871) grad_norm 3.8948 (4.5010) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][20/1251] eta 0:05:26 lr 0.000183 wd 0.0500 time 0.2570 (0.2650) data time 0.0008 (0.0247) model time 0.0000 (0.0000) loss 2.4805 (2.7970) grad_norm 3.8058 (4.5050) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][30/1251] eta 0:05:21 lr 0.000183 wd 0.0500 time 0.2373 (0.2632) data time 0.0011 (0.0171) 
model time 0.0000 (0.0000) loss 2.3897 (2.8334) grad_norm 3.3451 (4.3297) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][40/1251] eta 0:05:17 lr 0.000183 wd 0.0500 time 0.2450 (0.2624) data time 0.0008 (0.0132) model time 0.0000 (0.0000) loss 3.5098 (2.9201) grad_norm 3.0104 (4.2388) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][50/1251] eta 0:05:10 lr 0.000183 wd 0.0500 time 0.2380 (0.2582) data time 0.0010 (0.0109) model time 0.0000 (0.0000) loss 2.8422 (2.9499) grad_norm 3.1518 (4.0342) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][60/1251] eta 0:05:04 lr 0.000183 wd 0.0500 time 0.2450 (0.2554) data time 0.0010 (0.0093) model time 0.2440 (0.2398) loss 3.7003 (2.9645) grad_norm 4.5188 (4.0453) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][70/1251] eta 0:04:59 lr 0.000183 wd 0.0500 time 0.2379 (0.2539) data time 0.0010 (0.0081) model time 0.2369 (0.2419) loss 2.6475 (2.9693) grad_norm 6.3735 (4.0115) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][80/1251] eta 0:04:55 lr 0.000183 wd 0.0500 time 0.2325 (0.2521) data time 0.0007 (0.0072) model time 0.2318 (0.2407) loss 2.0223 (2.9624) grad_norm 2.5716 (4.0059) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][90/1251] eta 0:04:51 lr 0.000183 wd 0.0500 time 0.2376 (0.2512) data time 0.0007 (0.0067) model time 0.2369 (0.2409) loss 3.7031 (2.9677) grad_norm 3.6728 (4.0082) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][100/1251] 
eta 0:04:47 lr 0.000183 wd 0.0500 time 0.2362 (0.2502) data time 0.0010 (0.0061) model time 0.2352 (0.2408) loss 2.6324 (2.9756) grad_norm 2.6845 (4.0454) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][110/1251] eta 0:04:44 lr 0.000183 wd 0.0500 time 0.2439 (0.2497) data time 0.0010 (0.0057) model time 0.2429 (0.2412) loss 3.0674 (2.9648) grad_norm 4.5553 (3.9834) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][120/1251] eta 0:04:41 lr 0.000183 wd 0.0500 time 0.2325 (0.2490) data time 0.0008 (0.0053) model time 0.2317 (0.2411) loss 3.2663 (2.9636) grad_norm 5.1437 (4.0371) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][130/1251] eta 0:04:38 lr 0.000183 wd 0.0500 time 0.2319 (0.2483) data time 0.0007 (0.0049) model time 0.2311 (0.2408) loss 3.7498 (2.9599) grad_norm 3.4577 (4.0382) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][140/1251] eta 0:04:35 lr 0.000183 wd 0.0500 time 0.2501 (0.2480) data time 0.0011 (0.0047) model time 0.2490 (0.2410) loss 3.5346 (2.9658) grad_norm 3.7686 (4.0591) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][150/1251] eta 0:04:32 lr 0.000183 wd 0.0500 time 0.2382 (0.2477) data time 0.0008 (0.0044) model time 0.2374 (0.2412) loss 3.3146 (2.9719) grad_norm 4.4369 (4.0625) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][160/1251] eta 0:04:29 lr 0.000183 wd 0.0500 time 0.2354 (0.2473) data time 0.0010 (0.0042) model time 0.2344 (0.2411) loss 3.1062 (2.9561) grad_norm 7.6209 (4.1236) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 
09:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][170/1251] eta 0:04:27 lr 0.000183 wd 0.0500 time 0.2448 (0.2471) data time 0.0008 (0.0040) model time 0.2441 (0.2412) loss 2.3362 (2.9393) grad_norm 3.1094 (4.1337) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][180/1251] eta 0:04:24 lr 0.000183 wd 0.0500 time 0.2396 (0.2468) data time 0.0008 (0.0039) model time 0.2388 (0.2412) loss 3.3686 (2.9153) grad_norm 3.4300 (4.0883) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][190/1251] eta 0:04:21 lr 0.000183 wd 0.0500 time 0.2358 (0.2465) data time 0.0010 (0.0037) model time 0.2348 (0.2412) loss 2.7817 (2.9098) grad_norm 2.9563 (4.0604) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][200/1251] eta 0:04:18 lr 0.000183 wd 0.0500 time 0.2365 (0.2462) data time 0.0009 (0.0036) model time 0.2357 (0.2410) loss 3.4928 (2.9166) grad_norm 3.4590 (4.0273) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][210/1251] eta 0:04:16 lr 0.000183 wd 0.0500 time 0.2433 (0.2460) data time 0.0009 (0.0035) model time 0.2424 (0.2410) loss 2.7842 (2.9195) grad_norm 2.7056 (4.0083) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][220/1251] eta 0:04:13 lr 0.000183 wd 0.0500 time 0.2506 (0.2458) data time 0.0011 (0.0034) model time 0.2495 (0.2410) loss 3.1632 (2.9157) grad_norm 3.7091 (3.9971) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][230/1251] eta 0:04:10 lr 0.000183 wd 0.0500 time 0.2406 (0.2456) data time 0.0011 (0.0033) model time 0.2395 (0.2409) loss 3.0035 
(2.9155) grad_norm 5.6203 (4.0359) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][240/1251] eta 0:04:08 lr 0.000183 wd 0.0500 time 0.2331 (0.2454) data time 0.0009 (0.0032) model time 0.2322 (0.2408) loss 1.9889 (2.9031) grad_norm 3.3997 (4.0400) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 09:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 09:12:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 09:12:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 09:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 09:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 09:18:12 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 09:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 09:18:23 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 09:18:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 09:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 09:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 09:23:31 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 09:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 09:23:37 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 09:23:39 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 09:23:40 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 09:23:40 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 223) [2024-08-30 09:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 09:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][250/1251] eta 0:30:48 lr 0.000183 wd 0.0500 time 0.2259 (1.8464) data time 0.0010 (0.1004) model time 0.2249 (1.7460) loss 2.9911 (3.2918) grad_norm 4.8060 (4.0199) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][260/1251] eta 0:14:45 lr 0.000183 wd 0.0500 time 0.2266 (0.8931) data time 0.0008 (0.0419) model time 0.2258 (0.8513) loss 2.4764 (3.0999) grad_norm 4.3110 (4.0482) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][270/1251] eta 0:10:33 lr 0.000183 wd 0.0500 time 0.2284 
(0.6461) data time 0.0008 (0.0268) model time 0.2276 (0.6193) loss 3.5347 (3.1600) grad_norm 4.0324 (3.7932) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][280/1251] eta 0:08:37 lr 0.000183 wd 0.0500 time 0.2217 (0.5329) data time 0.0011 (0.0198) model time 0.2206 (0.5130) loss 2.9748 (3.1131) grad_norm 2.6627 (3.7882) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][290/1251] eta 0:07:29 lr 0.000183 wd 0.0500 time 0.2226 (0.4675) data time 0.0009 (0.0158) model time 0.2217 (0.4517) loss 2.8466 (3.0858) grad_norm 3.0650 (3.7076) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][300/1251] eta 0:06:44 lr 0.000183 wd 0.0500 time 0.2212 (0.4248) data time 0.0010 (0.0132) model time 0.2202 (0.4116) loss 2.8182 (3.0780) grad_norm 49.4230 (4.4610) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][310/1251] eta 0:06:11 lr 0.000182 wd 0.0500 time 0.2264 (0.3953) data time 0.0009 (0.0114) model time 0.2255 (0.3839) loss 2.9157 (3.0535) grad_norm 3.3942 (4.3690) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][320/1251] eta 0:05:47 lr 0.000182 wd 0.0500 time 0.2236 (0.3733) data time 0.0009 (0.0100) model time 0.2227 (0.3633) loss 3.2901 (3.0169) grad_norm 4.3586 (4.3099) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][330/1251] eta 0:05:28 lr 0.000182 wd 0.0500 time 0.2197 (0.3562) data time 0.0009 (0.0090) model time 0.2187 (0.3472) loss 3.1776 (3.0035) grad_norm 2.7144 (4.2442) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [223/300][340/1251] eta 0:05:12 lr 0.000182 wd 0.0500 time 0.2269 (0.3429) data time 0.0008 (0.0082) model time 0.2260 (0.3347) loss 3.1471 (2.9989) grad_norm 2.8083 (4.2449) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][350/1251] eta 0:04:59 lr 0.000182 wd 0.0500 time 0.2238 (0.3320) data time 0.0006 (0.0075) model time 0.2232 (0.3245) loss 2.7628 (3.0165) grad_norm 3.9874 (4.2054) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][360/1251] eta 0:04:47 lr 0.000182 wd 0.0500 time 0.2243 (0.3230) data time 0.0009 (0.0069) model time 0.2234 (0.3160) loss 3.7704 (3.0192) grad_norm 3.1080 (4.1734) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][370/1251] eta 0:04:37 lr 0.000182 wd 0.0500 time 0.2247 (0.3152) data time 0.0008 (0.0065) model time 0.2239 (0.3088) loss 3.3377 (3.0094) grad_norm 3.8335 (4.1291) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][380/1251] eta 0:04:28 lr 0.000182 wd 0.0500 time 0.2366 (0.3088) data time 0.0009 (0.0061) model time 0.2357 (0.3027) loss 2.4202 (3.0045) grad_norm 3.9827 (4.1035) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][390/1251] eta 0:04:21 lr 0.000182 wd 0.0500 time 0.2227 (0.3032) data time 0.0009 (0.0057) model time 0.2217 (0.2975) loss 2.8219 (2.9995) grad_norm 2.9979 (4.0518) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 09:24:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 09:24:31 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 09:24:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 10:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 10:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 10:16:10 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 10:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 10:16:17 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 10:16:18 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 10:16:19 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 10:16:19 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 223) [2024-08-30 10:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 11:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 11:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 11:20:39 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 11:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 11:20:53 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 11:20:55 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 11:20:56 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 11:20:56 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 223) [2024-08-30 11:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 11:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][400/1251] eta 1:11:29 lr 0.000182 wd 0.0500 time 0.2262 (5.0403) data time 0.0006 (0.2231) model time 0.2256 (4.8172) loss 2.4270 (3.0369) grad_norm 3.6693 (3.2122) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][410/1251] eta 0:18:44 lr 0.000182 wd 0.0500 time 0.2234 (1.3366) data time 0.0008 (0.0523) model time 0.2226 (1.2843) loss 2.8151 (3.0654) grad_norm 2.9502 (3.2259) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][420/1251] eta 0:11:48 lr 0.000182 wd 0.0500 time 0.2209 (0.8528) data time 0.0007 (0.0302) model time 0.2202 (0.8227) loss 3.2312 (3.0555) grad_norm 3.0195 (3.5404) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][430/1251] eta 0:09:03 lr 0.000182 wd 0.0500 time 0.2275 (0.6626) data time 0.0007 (0.0213) model time 0.2268 (0.6413) loss 3.1586 (3.0504) grad_norm 2.8793 (3.5457) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:24 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][440/1251] eta 0:07:34 lr 0.000182 wd 0.0500 time 0.2187 (0.5606) data time 0.0009 (0.0165) model time 0.2177 (0.5441) loss 2.9835 (3.0134) grad_norm 3.9823 (3.6529) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][450/1251] eta 0:06:38 lr 0.000182 wd 0.0500 time 0.2276 (0.4972) data time 0.0008 (0.0136) model time 0.2267 (0.4836) loss 3.0439 (3.0127) grad_norm 4.4240 (3.5926) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][460/1251] eta 0:05:59 lr 0.000182 wd 0.0500 time 0.2259 (0.4539) data time 0.0008 (0.0116) model time 0.2251 (0.4423) loss 2.4274 (2.9825) grad_norm 3.0727 (3.5200) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][470/1251] eta 0:05:30 lr 0.000182 wd 0.0500 time 0.2255 (0.4227) data time 0.0009 (0.0101) model time 0.2246 (0.4125) loss 3.3339 (2.9454) grad_norm 4.2693 (3.5399) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][480/1251] eta 0:05:07 lr 0.000182 wd 0.0500 time 0.2261 (0.3988) data time 0.0006 (0.0090) model time 0.2255 (0.3897) loss 2.1180 (2.9336) grad_norm 2.8178 (3.5575) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][490/1251] eta 0:04:49 lr 0.000182 wd 0.0500 time 0.2304 (0.3802) data time 0.0007 (0.0081) model time 0.2297 (0.3720) loss 3.6533 (2.9319) grad_norm 4.1770 (3.5093) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][500/1251] eta 0:04:34 lr 0.000182 wd 0.0500 time 0.2258 (0.3653) data time 0.0006 (0.0074) model time 0.2252 (0.3579) loss 3.4904 (2.9627) 
grad_norm 2.5288 (3.5123) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][510/1251] eta 0:04:21 lr 0.000182 wd 0.0500 time 0.2261 (0.3528) data time 0.0006 (0.0069) model time 0.2254 (0.3460) loss 2.6847 (2.9544) grad_norm 3.8364 (3.4942) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][520/1251] eta 0:04:10 lr 0.000182 wd 0.0500 time 0.2237 (0.3423) data time 0.0007 (0.0064) model time 0.2230 (0.3360) loss 2.8229 (2.9506) grad_norm 3.6360 (3.5180) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][530/1251] eta 0:04:00 lr 0.000182 wd 0.0500 time 0.2236 (0.3335) data time 0.0008 (0.0060) model time 0.2227 (0.3275) loss 3.2191 (2.9477) grad_norm 3.4334 (3.5434) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][540/1251] eta 0:03:51 lr 0.000182 wd 0.0500 time 0.2215 (0.3258) data time 0.0009 (0.0056) model time 0.2206 (0.3202) loss 3.2861 (2.9431) grad_norm 4.1867 (3.5859) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][550/1251] eta 0:03:43 lr 0.000182 wd 0.0500 time 0.2273 (0.3191) data time 0.0006 (0.0053) model time 0.2266 (0.3138) loss 2.9064 (2.9397) grad_norm 3.2592 (3.5894) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][560/1251] eta 0:03:36 lr 0.000182 wd 0.0500 time 0.2194 (0.3134) data time 0.0007 (0.0050) model time 0.2187 (0.3084) loss 2.1959 (2.9377) grad_norm 2.6707 (3.5623) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][570/1251] eta 0:03:30 lr 0.000182 wd 0.0500 time 
0.2253 (0.3085) data time 0.0008 (0.0051) model time 0.2246 (0.3034) loss 3.3560 (2.9390) grad_norm 3.7476 (3.5527) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][580/1251] eta 0:03:23 lr 0.000182 wd 0.0500 time 0.2240 (0.3040) data time 0.0008 (0.0049) model time 0.2232 (0.2991) loss 3.2426 (2.9286) grad_norm 3.1686 (3.5371) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][590/1251] eta 0:03:18 lr 0.000182 wd 0.0500 time 0.2319 (0.2999) data time 0.0006 (0.0047) model time 0.2313 (0.2952) loss 2.8407 (2.9256) grad_norm 3.0588 (3.5499) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][600/1251] eta 0:03:12 lr 0.000182 wd 0.0500 time 0.2269 (0.2962) data time 0.0007 (0.0046) model time 0.2262 (0.2917) loss 2.2472 (2.9132) grad_norm 2.6027 (3.7089) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][610/1251] eta 0:03:07 lr 0.000181 wd 0.0500 time 0.2281 (0.2930) data time 0.0007 (0.0044) model time 0.2274 (0.2886) loss 2.1669 (2.9150) grad_norm 2.6540 (3.7155) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][620/1251] eta 0:03:02 lr 0.000181 wd 0.0500 time 0.2219 (0.2900) data time 0.0007 (0.0042) model time 0.2211 (0.2858) loss 2.2207 (2.9102) grad_norm 4.5012 (3.7302) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][630/1251] eta 0:02:58 lr 0.000181 wd 0.0500 time 0.2239 (0.2872) data time 0.0010 (0.0041) model time 0.2229 (0.2831) loss 2.1694 (2.9062) grad_norm 2.4367 (3.7362) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:09 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [223/300][640/1251] eta 0:02:53 lr 0.000181 wd 0.0500 time 0.2242 (0.2846) data time 0.0008 (0.0040) model time 0.2234 (0.2807) loss 3.2513 (2.9087) grad_norm 2.9967 (3.7256) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][650/1251] eta 0:02:49 lr 0.000181 wd 0.0500 time 0.2249 (0.2823) data time 0.0009 (0.0038) model time 0.2240 (0.2784) loss 3.2639 (2.9026) grad_norm 3.1172 (3.7470) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][660/1251] eta 0:02:45 lr 0.000181 wd 0.0500 time 0.2278 (0.2801) data time 0.0006 (0.0037) model time 0.2273 (0.2764) loss 2.7102 (2.8905) grad_norm 3.6614 (3.7425) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][670/1251] eta 0:02:41 lr 0.000181 wd 0.0500 time 0.2297 (0.2780) data time 0.0006 (0.0036) model time 0.2291 (0.2744) loss 3.3484 (2.8874) grad_norm 2.4647 (3.7230) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][680/1251] eta 0:02:37 lr 0.000181 wd 0.0500 time 0.2257 (0.2761) data time 0.0010 (0.0035) model time 0.2247 (0.2726) loss 2.9951 (2.8869) grad_norm 2.7220 (3.7314) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][690/1251] eta 0:02:34 lr 0.000181 wd 0.0500 time 0.2192 (0.2751) data time 0.0009 (0.0034) model time 0.2183 (0.2716) loss 2.7223 (2.8822) grad_norm 2.7738 (3.7178) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][700/1251] eta 0:02:30 lr 0.000181 wd 0.0500 time 0.2203 (0.2733) data time 0.0008 (0.0034) model time 0.2195 (0.2700) loss 3.2815 (2.8775) grad_norm 3.4954 
(3.7042) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][710/1251] eta 0:02:27 lr 0.000181 wd 0.0500 time 0.2251 (0.2724) data time 0.0006 (0.0033) model time 0.2244 (0.2691) loss 3.6211 (2.8823) grad_norm 3.7195 (3.7050) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][720/1251] eta 0:02:23 lr 0.000181 wd 0.0500 time 0.2215 (0.2710) data time 0.0009 (0.0032) model time 0.2206 (0.2678) loss 3.5365 (2.8910) grad_norm 2.4850 (3.7003) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][730/1251] eta 0:02:20 lr 0.000181 wd 0.0500 time 0.2217 (0.2695) data time 0.0008 (0.0031) model time 0.2209 (0.2664) loss 3.0266 (2.8917) grad_norm 2.5189 (3.6820) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][740/1251] eta 0:02:17 lr 0.000181 wd 0.0500 time 0.2359 (0.2682) data time 0.0008 (0.0031) model time 0.2351 (0.2652) loss 3.0495 (2.8944) grad_norm 3.1421 (3.6913) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][750/1251] eta 0:02:13 lr 0.000181 wd 0.0500 time 0.2285 (0.2670) data time 0.0007 (0.0030) model time 0.2278 (0.2640) loss 3.2668 (2.8971) grad_norm 3.6417 (3.6846) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][760/1251] eta 0:02:10 lr 0.000181 wd 0.0500 time 0.2215 (0.2659) data time 0.0008 (0.0029) model time 0.2207 (0.2630) loss 3.4413 (2.9010) grad_norm 2.7651 (3.6955) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][770/1251] eta 0:02:07 lr 0.000181 wd 0.0500 time 0.2525 (0.2649) data 
time 0.0007 (0.0029) model time 0.2519 (0.2620) loss 3.3379 (2.8958) grad_norm 2.9322 (3.6945) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][780/1251] eta 0:02:04 lr 0.000181 wd 0.0500 time 0.2295 (0.2638) data time 0.0008 (0.0028) model time 0.2286 (0.2610) loss 2.0604 (2.8908) grad_norm 2.9662 (3.6825) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][790/1251] eta 0:02:01 lr 0.000181 wd 0.0500 time 0.2305 (0.2628) data time 0.0009 (0.0028) model time 0.2296 (0.2600) loss 3.4588 (2.8908) grad_norm 5.3884 (3.6912) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][800/1251] eta 0:01:58 lr 0.000181 wd 0.0500 time 0.2269 (0.2619) data time 0.0009 (0.0027) model time 0.2260 (0.2592) loss 3.0233 (2.8972) grad_norm 3.3384 (3.7004) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][810/1251] eta 0:01:55 lr 0.000181 wd 0.0500 time 0.2257 (0.2610) data time 0.0008 (0.0027) model time 0.2248 (0.2583) loss 2.9632 (2.9014) grad_norm 2.4258 (3.6937) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][820/1251] eta 0:01:52 lr 0.000181 wd 0.0500 time 0.2297 (0.2602) data time 0.0007 (0.0027) model time 0.2290 (0.2575) loss 2.1897 (2.8991) grad_norm 4.7181 (3.6863) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][830/1251] eta 0:01:49 lr 0.000181 wd 0.0500 time 0.2261 (0.2594) data time 0.0007 (0.0026) model time 0.2254 (0.2568) loss 3.5108 (2.9080) grad_norm 3.8803 (3.6848) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [223/300][840/1251] eta 0:01:46 lr 0.000181 wd 0.0500 time 0.2213 (0.2586) data time 0.0006 (0.0026) model time 0.2207 (0.2561) loss 2.8637 (2.9101) grad_norm 3.0637 (3.7024) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][850/1251] eta 0:01:43 lr 0.000181 wd 0.0500 time 0.2298 (0.2579) data time 0.0007 (0.0025) model time 0.2291 (0.2554) loss 2.2775 (2.9072) grad_norm 3.2572 (3.7105) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:22:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][860/1251] eta 0:01:40 lr 0.000181 wd 0.0500 time 0.2340 (0.2573) data time 0.0006 (0.0025) model time 0.2334 (0.2548) loss 2.7655 (2.9035) grad_norm 3.6541 (3.7052) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][870/1251] eta 0:01:37 lr 0.000181 wd 0.0500 time 0.2397 (0.2567) data time 0.0006 (0.0025) model time 0.2391 (0.2542) loss 2.6591 (2.8973) grad_norm 3.2759 (3.7174) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][880/1251] eta 0:01:35 lr 0.000181 wd 0.0500 time 0.2340 (0.2561) data time 0.0007 (0.0024) model time 0.2334 (0.2536) loss 3.4530 (2.9013) grad_norm 5.7042 (3.7326) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][890/1251] eta 0:01:32 lr 0.000181 wd 0.0500 time 0.2322 (0.2555) data time 0.0006 (0.0024) model time 0.2316 (0.2531) loss 2.1558 (2.9031) grad_norm 4.6673 (3.7383) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][900/1251] eta 0:01:29 lr 0.000180 wd 0.0500 time 0.2339 (0.2549) data time 0.0007 (0.0024) model time 0.2332 (0.2526) loss 2.2613 (2.9031) grad_norm 2.7698 (3.7413) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-30 11:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][910/1251] eta 0:01:26 lr 0.000180 wd 0.0500 time 0.2213 (0.2544) data time 0.0008 (0.0024) model time 0.2205 (0.2520) loss 3.5132 (2.9102) grad_norm 6.1130 (3.7548) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][920/1251] eta 0:01:24 lr 0.000180 wd 0.0500 time 0.2454 (0.2539) data time 0.0009 (0.0023) model time 0.2445 (0.2515) loss 3.1749 (2.9078) grad_norm 4.4030 (3.7503) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][930/1251] eta 0:01:21 lr 0.000180 wd 0.0500 time 0.2341 (0.2534) data time 0.0007 (0.0023) model time 0.2334 (0.2511) loss 3.4657 (2.9060) grad_norm 4.0769 (3.7491) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][940/1251] eta 0:01:18 lr 0.000180 wd 0.0500 time 0.2225 (0.2529) data time 0.0008 (0.0023) model time 0.2217 (0.2506) loss 2.8633 (2.9018) grad_norm 3.3237 (3.7423) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][950/1251] eta 0:01:15 lr 0.000180 wd 0.0500 time 0.2244 (0.2523) data time 0.0006 (0.0023) model time 0.2238 (0.2501) loss 3.1785 (2.9055) grad_norm 4.1985 (3.7700) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][960/1251] eta 0:01:13 lr 0.000180 wd 0.0500 time 0.2310 (0.2519) data time 0.0008 (0.0023) model time 0.2302 (0.2496) loss 2.8817 (2.9065) grad_norm 3.4172 (3.7666) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][970/1251] eta 0:01:10 lr 0.000180 wd 0.0500 time 0.2230 (0.2514) data time 0.0006 (0.0022) model 
time 0.2224 (0.2491) loss 2.8759 (2.9073) grad_norm 2.6739 (3.8209) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][980/1251] eta 0:01:07 lr 0.000180 wd 0.0500 time 0.2306 (0.2509) data time 0.0007 (0.0022) model time 0.2299 (0.2487) loss 3.1677 (2.9115) grad_norm 3.7803 (3.8229) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][990/1251] eta 0:01:05 lr 0.000180 wd 0.0500 time 0.2292 (0.2505) data time 0.0008 (0.0022) model time 0.2285 (0.2482) loss 2.7163 (2.9121) grad_norm 21.2553 (3.8479) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1000/1251] eta 0:01:02 lr 0.000180 wd 0.0500 time 0.2252 (0.2501) data time 0.0009 (0.0022) model time 0.2243 (0.2479) loss 2.0765 (2.9105) grad_norm 3.3725 (3.8496) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1010/1251] eta 0:01:00 lr 0.000180 wd 0.0500 time 0.2272 (0.2496) data time 0.0008 (0.0022) model time 0.2264 (0.2475) loss 3.2396 (2.9115) grad_norm 4.4609 (3.8548) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1020/1251] eta 0:00:57 lr 0.000180 wd 0.0500 time 0.2217 (0.2493) data time 0.0007 (0.0022) model time 0.2210 (0.2471) loss 3.5047 (2.9151) grad_norm 7.1563 (3.8564) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1030/1251] eta 0:00:55 lr 0.000180 wd 0.0500 time 0.2385 (0.2489) data time 0.0007 (0.0021) model time 0.2378 (0.2468) loss 3.9020 (2.9164) grad_norm 3.2024 (3.8554) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[223/300][1040/1251] eta 0:00:52 lr 0.000180 wd 0.0500 time 0.2264 (0.2485) data time 0.0006 (0.0021) model time 0.2258 (0.2464) loss 3.1935 (2.9112) grad_norm 4.1443 (3.8544) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1050/1251] eta 0:00:49 lr 0.000180 wd 0.0500 time 0.2242 (0.2482) data time 0.0008 (0.0021) model time 0.2234 (0.2461) loss 3.4832 (2.9120) grad_norm 3.7020 (3.8571) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1060/1251] eta 0:00:47 lr 0.000180 wd 0.0500 time 0.2284 (0.2479) data time 0.0009 (0.0021) model time 0.2274 (0.2458) loss 2.7134 (2.9070) grad_norm 3.8071 (3.8546) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1070/1251] eta 0:00:44 lr 0.000180 wd 0.0500 time 0.2264 (0.2475) data time 0.0008 (0.0021) model time 0.2256 (0.2455) loss 3.2167 (2.9111) grad_norm 3.5730 (3.8627) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1080/1251] eta 0:00:42 lr 0.000180 wd 0.0500 time 0.2286 (0.2472) data time 0.0009 (0.0020) model time 0.2277 (0.2452) loss 3.2407 (2.9129) grad_norm 19.1356 (3.8891) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1090/1251] eta 0:00:39 lr 0.000180 wd 0.0500 time 0.2211 (0.2469) data time 0.0008 (0.0020) model time 0.2203 (0.2449) loss 3.2668 (2.9096) grad_norm 2.9654 (3.9096) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1100/1251] eta 0:00:37 lr 0.000180 wd 0.0500 time 0.2240 (0.2466) data time 0.0007 (0.0020) model time 0.2233 (0.2446) loss 3.4952 (2.9079) grad_norm 3.9832 (3.9154) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-30 11:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1110/1251] eta 0:00:34 lr 0.000180 wd 0.0500 time 0.2287 (0.2463) data time 0.0006 (0.0020) model time 0.2281 (0.2443) loss 2.8213 (2.9045) grad_norm 4.9954 (3.9169) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1120/1251] eta 0:00:32 lr 0.000180 wd 0.0500 time 0.2184 (0.2460) data time 0.0008 (0.0020) model time 0.2176 (0.2440) loss 3.1011 (2.9037) grad_norm 8.7718 (3.9188) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1130/1251] eta 0:00:29 lr 0.000180 wd 0.0500 time 0.2324 (0.2457) data time 0.0007 (0.0020) model time 0.2316 (0.2438) loss 2.7018 (2.9038) grad_norm 2.9369 (3.9109) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1140/1251] eta 0:00:27 lr 0.000180 wd 0.0500 time 0.2200 (0.2455) data time 0.0009 (0.0020) model time 0.2191 (0.2435) loss 2.9281 (2.9052) grad_norm 4.0319 (3.9091) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1150/1251] eta 0:00:24 lr 0.000180 wd 0.0500 time 0.2322 (0.2452) data time 0.0008 (0.0019) model time 0.2313 (0.2432) loss 1.7385 (2.9042) grad_norm 5.5237 (3.9129) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1160/1251] eta 0:00:22 lr 0.000180 wd 0.0500 time 0.2252 (0.2449) data time 0.0007 (0.0019) model time 0.2245 (0.2430) loss 2.9862 (2.9058) grad_norm 4.3362 (3.9097) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1170/1251] eta 0:00:19 lr 0.000180 wd 0.0500 time 0.2205 (0.2447) data time 0.0007 (0.0019) 
model time 0.2198 (0.2427) loss 3.6805 (2.9076) grad_norm 7.9217 (3.9130) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1180/1251] eta 0:00:17 lr 0.000180 wd 0.0500 time 0.2299 (0.2444) data time 0.0009 (0.0019) model time 0.2290 (0.2425) loss 3.0352 (2.9084) grad_norm 4.8572 (3.9132) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1190/1251] eta 0:00:14 lr 0.000180 wd 0.0500 time 0.2310 (0.2442) data time 0.0008 (0.0019) model time 0.2302 (0.2423) loss 3.2602 (2.9068) grad_norm 3.7749 (3.9061) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1200/1251] eta 0:00:12 lr 0.000179 wd 0.0500 time 0.2273 (0.2439) data time 0.0008 (0.0019) model time 0.2266 (0.2421) loss 2.1992 (2.9061) grad_norm 4.3627 (3.9005) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1210/1251] eta 0:00:10 lr 0.000179 wd 0.0500 time 0.2262 (0.2440) data time 0.0008 (0.0019) model time 0.2255 (0.2421) loss 2.1306 (2.9007) grad_norm 3.0891 (3.8911) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1220/1251] eta 0:00:07 lr 0.000179 wd 0.0500 time 0.2175 (0.2437) data time 0.0009 (0.0019) model time 0.2166 (0.2419) loss 2.4646 (2.9017) grad_norm 3.0493 (3.8949) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1230/1251] eta 0:00:05 lr 0.000179 wd 0.0500 time 0.2258 (0.2437) data time 0.0007 (0.0018) model time 0.2251 (0.2418) loss 3.1560 (2.8990) grad_norm 5.2199 (3.8886) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[223/300][1240/1251] eta 0:00:02 lr 0.000179 wd 0.0500 time 0.2122 (0.2434) data time 0.0004 (0.0018) model time 0.2119 (0.2416) loss 3.4633 (2.8987) grad_norm 2.5966 (3.8829) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [223/300][1250/1251] eta 0:00:00 lr 0.000179 wd 0.0500 time 0.2126 (0.2431) data time 0.0003 (0.0018) model time 0.2123 (0.2412) loss 3.1273 (2.8965) grad_norm 3.5219 (3.8808) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 11:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 223 training takes 0:03:27 [2024-08-30 11:24:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 11:24:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 11:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.344 (0.344) Loss 0.4053 (0.4053) Acc@1 92.480 (92.480) Acc@5 98.242 (98.242) Mem 7377MB [2024-08-30 11:24:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.066 (0.097) Loss 0.6289 (0.6414) Acc@1 87.695 (86.692) Acc@5 97.656 (97.505) Mem 7377MB [2024-08-30 11:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.085) Loss 0.9512 (0.6738) Acc@1 77.148 (85.463) Acc@5 94.824 (97.405) Mem 7377MB [2024-08-30 11:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.080) Loss 1.1514 (0.7639) Acc@1 73.145 (83.285) Acc@5 92.090 (96.384) Mem 7377MB [2024-08-30 11:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 0.9990 (0.8108) Acc@1 77.051 (82.103) Acc@5 94.727 (95.894) Mem 7377MB [2024-08-30 11:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.692 Acc@5 95.844 [2024-08-30 11:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 
261): INFO Accuracy of the network on the 50000 test images: 81.7% [2024-08-30 11:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.69% [2024-08-30 11:24:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 11:24:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-08-30 11:24:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.383 (0.383) Loss 0.3762 (0.3762) Acc@1 93.262 (93.262) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-30 11:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.098) Loss 0.5830 (0.6019) Acc@1 89.062 (87.464) Acc@5 97.852 (97.710) Mem 7377MB [2024-08-30 11:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.086) Loss 0.8711 (0.6289) Acc@1 78.516 (86.421) Acc@5 96.191 (97.675) Mem 7377MB [2024-08-30 11:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.081) Loss 1.0869 (0.7129) Acc@1 74.023 (84.343) Acc@5 92.969 (96.752) Mem 7377MB [2024-08-30 11:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 0.9780 (0.7565) Acc@1 77.344 (83.210) Acc@5 94.629 (96.303) Mem 7377MB [2024-08-30 11:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.798 Acc@5 96.290 [2024-08-30 11:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 11:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.80% [2024-08-30 11:24:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... 
[2024-08-30 11:24:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-30 11:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][0/1251] eta 0:14:57 lr 0.000179 wd 0.0500 time 0.7175 (0.7175) data time 0.4873 (0.4873) model time 0.0000 (0.0000) loss 3.1787 (3.1787) grad_norm 4.1852 (4.1852) loss_scale 512.0000 (512.0000) mem 7380MB [2024-08-30 11:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][10/1251] eta 0:05:33 lr 0.000179 wd 0.0500 time 0.2218 (0.2690) data time 0.0009 (0.0451) model time 0.0000 (0.0000) loss 3.0475 (2.8716) grad_norm 3.2210 (4.0201) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][20/1251] eta 0:05:05 lr 0.000179 wd 0.0500 time 0.2258 (0.2479) data time 0.0010 (0.0241) model time 0.0000 (0.0000) loss 3.0668 (2.9421) grad_norm 5.5454 (3.9164) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][30/1251] eta 0:04:53 lr 0.000179 wd 0.0500 time 0.2269 (0.2404) data time 0.0006 (0.0166) model time 0.0000 (0.0000) loss 3.3677 (2.8736) grad_norm 4.0595 (3.8703) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:24:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][40/1251] eta 0:04:46 lr 0.000179 wd 0.0500 time 0.2280 (0.2369) data time 0.0008 (0.0128) model time 0.0000 (0.0000) loss 3.2593 (2.8599) grad_norm 4.1609 (3.8497) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][50/1251] eta 0:04:42 lr 0.000179 wd 0.0500 time 0.2328 (0.2350) data time 0.0006 (0.0105) model time 0.0000 (0.0000) loss 2.7064 (2.8717) grad_norm 4.7993 (3.8909) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [224/300][60/1251] eta 0:04:38 lr 0.000179 wd 0.0500 time 0.2315 (0.2337) data time 0.0006 (0.0089) model time 0.2309 (0.2261) loss 2.2518 (2.8499) grad_norm 5.2005 (3.8563) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][70/1251] eta 0:04:34 lr 0.000179 wd 0.0500 time 0.2387 (0.2328) data time 0.0006 (0.0078) model time 0.2382 (0.2263) loss 3.6863 (2.8926) grad_norm 3.1913 (3.7866) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][80/1251] eta 0:04:31 lr 0.000179 wd 0.0500 time 0.2278 (0.2318) data time 0.0008 (0.0069) model time 0.2270 (0.2256) loss 2.4814 (2.9215) grad_norm 3.1260 (3.9539) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][90/1251] eta 0:04:28 lr 0.000179 wd 0.0500 time 0.2277 (0.2310) data time 0.0008 (0.0063) model time 0.2269 (0.2250) loss 2.5242 (2.8848) grad_norm 4.2685 (3.8852) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][100/1251] eta 0:04:25 lr 0.000179 wd 0.0500 time 0.2247 (0.2303) data time 0.0005 (0.0057) model time 0.2241 (0.2247) loss 2.0440 (2.8541) grad_norm 4.2801 (3.8478) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][110/1251] eta 0:04:22 lr 0.000179 wd 0.0500 time 0.2237 (0.2297) data time 0.0007 (0.0053) model time 0.2230 (0.2244) loss 2.5401 (2.8607) grad_norm 3.2919 (3.9200) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][120/1251] eta 0:04:19 lr 0.000179 wd 0.0500 time 0.2259 (0.2293) data time 0.0009 (0.0049) model time 0.2250 (0.2242) loss 3.0607 (2.8793) grad_norm 3.5431 (3.8648) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-30 11:25:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][130/1251] eta 0:04:16 lr 0.000179 wd 0.0500 time 0.2278 (0.2292) data time 0.0009 (0.0046) model time 0.2269 (0.2246) loss 3.1262 (2.8761) grad_norm 2.9490 (3.8317) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][140/1251] eta 0:04:14 lr 0.000179 wd 0.0500 time 0.2307 (0.2290) data time 0.0008 (0.0044) model time 0.2298 (0.2247) loss 2.6010 (2.8765) grad_norm 4.4010 (3.8839) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][150/1251] eta 0:04:12 lr 0.000179 wd 0.0500 time 0.2330 (0.2289) data time 0.0006 (0.0041) model time 0.2324 (0.2249) loss 2.9801 (2.8823) grad_norm 3.4703 (3.8577) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][160/1251] eta 0:04:09 lr 0.000179 wd 0.0500 time 0.2195 (0.2288) data time 0.0007 (0.0039) model time 0.2188 (0.2250) loss 3.1651 (2.8986) grad_norm 4.0959 (3.8479) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][170/1251] eta 0:04:07 lr 0.000179 wd 0.0500 time 0.2363 (0.2287) data time 0.0008 (0.0038) model time 0.2356 (0.2251) loss 2.4919 (2.8846) grad_norm 2.5456 (3.9485) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][180/1251] eta 0:04:04 lr 0.000179 wd 0.0500 time 0.2274 (0.2285) data time 0.0007 (0.0036) model time 0.2267 (0.2251) loss 2.8542 (2.8744) grad_norm 3.0391 (3.9455) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][190/1251] eta 0:04:02 lr 0.000179 wd 0.0500 time 0.2254 (0.2284) data time 0.0010 (0.0034) model 
time 0.2245 (0.2251) loss 1.7237 (2.8876) grad_norm 8.5007 (3.9456) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][200/1251] eta 0:03:59 lr 0.000179 wd 0.0500 time 0.2255 (0.2283) data time 0.0009 (0.0033) model time 0.2246 (0.2251) loss 3.1262 (2.8894) grad_norm 4.5548 (3.9684) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][210/1251] eta 0:03:57 lr 0.000179 wd 0.0500 time 0.2378 (0.2283) data time 0.0009 (0.0032) model time 0.2368 (0.2252) loss 3.2833 (2.8982) grad_norm 2.9741 (3.9722) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][220/1251] eta 0:03:55 lr 0.000179 wd 0.0500 time 0.2288 (0.2282) data time 0.0009 (0.0031) model time 0.2279 (0.2252) loss 3.1702 (2.8882) grad_norm 2.9685 (3.9454) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][230/1251] eta 0:03:52 lr 0.000179 wd 0.0500 time 0.2189 (0.2279) data time 0.0008 (0.0030) model time 0.2181 (0.2250) loss 2.7381 (2.8894) grad_norm 3.8778 (3.9372) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][240/1251] eta 0:03:50 lr 0.000179 wd 0.0500 time 0.2315 (0.2278) data time 0.0009 (0.0029) model time 0.2305 (0.2250) loss 2.6551 (2.8837) grad_norm 3.0040 (3.9216) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][250/1251] eta 0:03:47 lr 0.000178 wd 0.0500 time 0.2218 (0.2277) data time 0.0010 (0.0028) model time 0.2208 (0.2249) loss 3.2320 (2.8853) grad_norm 2.4414 (3.9111) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][260/1251] 
eta 0:03:45 lr 0.000178 wd 0.0500 time 0.2336 (0.2276) data time 0.0008 (0.0028) model time 0.2328 (0.2249) loss 3.0125 (2.8890) grad_norm 3.3341 (3.8927) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][270/1251] eta 0:03:43 lr 0.000178 wd 0.0500 time 0.2190 (0.2274) data time 0.0009 (0.0027) model time 0.2181 (0.2247) loss 2.4890 (2.8930) grad_norm 3.0727 (3.8737) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][280/1251] eta 0:03:40 lr 0.000178 wd 0.0500 time 0.2273 (0.2273) data time 0.0006 (0.0026) model time 0.2267 (0.2247) loss 3.1595 (2.8928) grad_norm 2.8410 (3.8865) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][290/1251] eta 0:03:38 lr 0.000178 wd 0.0500 time 0.2286 (0.2273) data time 0.0009 (0.0026) model time 0.2277 (0.2247) loss 3.2005 (2.8927) grad_norm 3.1851 (3.8877) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][300/1251] eta 0:03:36 lr 0.000178 wd 0.0500 time 0.2240 (0.2272) data time 0.0006 (0.0026) model time 0.2234 (0.2246) loss 2.3407 (2.8886) grad_norm 4.0708 (3.8665) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][310/1251] eta 0:03:33 lr 0.000178 wd 0.0500 time 0.2189 (0.2270) data time 0.0006 (0.0025) model time 0.2182 (0.2245) loss 1.8931 (2.8828) grad_norm 4.0628 (3.8587) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][320/1251] eta 0:03:31 lr 0.000178 wd 0.0500 time 0.2236 (0.2269) data time 0.0007 (0.0025) model time 0.2229 (0.2245) loss 3.5074 (2.8929) grad_norm 5.5959 (3.8416) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 
11:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][330/1251] eta 0:03:28 lr 0.000178 wd 0.0500 time 0.2204 (0.2268) data time 0.0009 (0.0024) model time 0.2195 (0.2244) loss 3.0647 (2.9033) grad_norm 3.0120 (3.8303) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][340/1251] eta 0:03:26 lr 0.000178 wd 0.0500 time 0.2271 (0.2268) data time 0.0009 (0.0024) model time 0.2262 (0.2244) loss 3.2228 (2.9060) grad_norm 2.6293 (3.8212) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][350/1251] eta 0:03:24 lr 0.000178 wd 0.0500 time 0.2264 (0.2267) data time 0.0009 (0.0023) model time 0.2255 (0.2243) loss 3.0877 (2.9084) grad_norm 3.2359 (3.8149) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][360/1251] eta 0:03:21 lr 0.000178 wd 0.0500 time 0.2254 (0.2266) data time 0.0009 (0.0023) model time 0.2245 (0.2243) loss 3.0893 (2.9085) grad_norm 2.8438 (3.8249) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][370/1251] eta 0:03:19 lr 0.000178 wd 0.0500 time 0.2271 (0.2266) data time 0.0008 (0.0023) model time 0.2262 (0.2243) loss 3.2240 (2.9063) grad_norm 5.6837 (3.8362) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][380/1251] eta 0:03:17 lr 0.000178 wd 0.0500 time 0.2264 (0.2266) data time 0.0008 (0.0022) model time 0.2256 (0.2243) loss 2.4129 (2.9006) grad_norm 3.0624 (3.8470) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][390/1251] eta 0:03:15 lr 0.000178 wd 0.0500 time 0.2192 (0.2269) data time 0.0010 (0.0022) model time 0.2182 (0.2247) loss 2.5835 
(2.9033) grad_norm 5.0113 (3.8419) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][400/1251] eta 0:03:13 lr 0.000178 wd 0.0500 time 0.2258 (0.2269) data time 0.0010 (0.0022) model time 0.2248 (0.2248) loss 2.9197 (2.8983) grad_norm 2.9736 (3.8301) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][410/1251] eta 0:03:10 lr 0.000178 wd 0.0500 time 0.2243 (0.2268) data time 0.0008 (0.0021) model time 0.2236 (0.2247) loss 3.1191 (2.9020) grad_norm 5.9197 (3.8398) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][420/1251] eta 0:03:08 lr 0.000178 wd 0.0500 time 0.2199 (0.2268) data time 0.0006 (0.0021) model time 0.2193 (0.2247) loss 3.3662 (2.9067) grad_norm 4.0828 (3.8391) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][430/1251] eta 0:03:06 lr 0.000178 wd 0.0500 time 0.2276 (0.2267) data time 0.0008 (0.0021) model time 0.2268 (0.2247) loss 2.4521 (2.9117) grad_norm 2.8604 (3.8600) loss_scale 1024.0000 (521.5035) mem 7381MB [2024-08-30 11:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][440/1251] eta 0:03:03 lr 0.000178 wd 0.0500 time 0.2205 (0.2267) data time 0.0006 (0.0021) model time 0.2198 (0.2247) loss 3.0273 (2.9128) grad_norm 4.7801 (3.8682) loss_scale 1024.0000 (532.8980) mem 7381MB [2024-08-30 11:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][450/1251] eta 0:03:01 lr 0.000178 wd 0.0500 time 0.2183 (0.2266) data time 0.0006 (0.0020) model time 0.2177 (0.2246) loss 3.5095 (2.9156) grad_norm 3.5040 (3.8727) loss_scale 1024.0000 (543.7871) mem 7381MB [2024-08-30 11:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][460/1251] eta 0:02:59 lr 0.000178 wd 
0.0500 time 0.2257 (0.2265) data time 0.0006 (0.0020) model time 0.2251 (0.2245) loss 3.3478 (2.9093) grad_norm 2.9829 (3.8714) loss_scale 1024.0000 (554.2039) mem 7381MB [2024-08-30 11:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][470/1251] eta 0:02:56 lr 0.000178 wd 0.0500 time 0.2227 (0.2265) data time 0.0006 (0.0020) model time 0.2221 (0.2245) loss 2.4821 (2.9032) grad_norm 3.1909 (3.8919) loss_scale 1024.0000 (564.1783) mem 7381MB [2024-08-30 11:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][480/1251] eta 0:02:54 lr 0.000178 wd 0.0500 time 0.2250 (0.2265) data time 0.0009 (0.0020) model time 0.2241 (0.2245) loss 3.0397 (2.9048) grad_norm 4.1370 (3.9090) loss_scale 1024.0000 (573.7380) mem 7381MB [2024-08-30 11:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][490/1251] eta 0:02:52 lr 0.000178 wd 0.0500 time 0.2266 (0.2264) data time 0.0010 (0.0019) model time 0.2256 (0.2245) loss 3.3381 (2.9042) grad_norm 3.3757 (3.9074) loss_scale 1024.0000 (582.9084) mem 7381MB [2024-08-30 11:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][500/1251] eta 0:02:50 lr 0.000178 wd 0.0500 time 0.2263 (0.2267) data time 0.0009 (0.0019) model time 0.2253 (0.2249) loss 2.6899 (2.9093) grad_norm 2.7185 (3.9026) loss_scale 1024.0000 (591.7126) mem 7381MB [2024-08-30 11:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][510/1251] eta 0:02:47 lr 0.000178 wd 0.0500 time 0.2277 (0.2267) data time 0.0007 (0.0019) model time 0.2270 (0.2248) loss 3.0677 (2.9130) grad_norm 2.4889 (3.8952) loss_scale 1024.0000 (600.1722) mem 7381MB [2024-08-30 11:26:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][520/1251] eta 0:02:45 lr 0.000178 wd 0.0500 time 0.2280 (0.2266) data time 0.0008 (0.0019) model time 0.2273 (0.2248) loss 3.1338 (2.9158) grad_norm 5.2262 (3.8912) loss_scale 1024.0000 (608.3071) mem 7381MB [2024-08-30 11:26:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][530/1251] eta 0:02:43 lr 0.000178 wd 0.0500 time 0.2286 (0.2266) data time 0.0008 (0.0019) model time 0.2278 (0.2248) loss 1.8235 (2.9150) grad_norm 5.1835 (3.8927) loss_scale 1024.0000 (616.1356) mem 7381MB [2024-08-30 11:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][540/1251] eta 0:02:41 lr 0.000178 wd 0.0500 time 0.2210 (0.2266) data time 0.0006 (0.0018) model time 0.2204 (0.2248) loss 3.2593 (2.9148) grad_norm 4.5421 (3.8965) loss_scale 1024.0000 (623.6747) mem 7381MB [2024-08-30 11:26:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][550/1251] eta 0:02:38 lr 0.000177 wd 0.0500 time 0.2287 (0.2266) data time 0.0006 (0.0018) model time 0.2281 (0.2248) loss 3.1704 (2.9196) grad_norm 3.4393 (3.8967) loss_scale 1024.0000 (630.9401) mem 7381MB [2024-08-30 11:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][560/1251] eta 0:02:36 lr 0.000177 wd 0.0500 time 0.2178 (0.2266) data time 0.0009 (0.0018) model time 0.2169 (0.2248) loss 3.2739 (2.9205) grad_norm 3.8475 (3.9035) loss_scale 1024.0000 (637.9465) mem 7381MB [2024-08-30 11:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][570/1251] eta 0:02:34 lr 0.000177 wd 0.0500 time 0.2265 (0.2265) data time 0.0008 (0.0018) model time 0.2256 (0.2247) loss 3.3152 (2.9202) grad_norm 3.6418 (3.9025) loss_scale 1024.0000 (644.7075) mem 7381MB [2024-08-30 11:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][580/1251] eta 0:02:31 lr 0.000177 wd 0.0500 time 0.2435 (0.2265) data time 0.0005 (0.0018) model time 0.2429 (0.2248) loss 1.8302 (2.9181) grad_norm 3.4799 (3.9000) loss_scale 1024.0000 (651.2358) mem 7381MB [2024-08-30 11:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][590/1251] eta 0:02:29 lr 0.000177 wd 0.0500 time 0.2311 (0.2265) data time 0.0006 (0.0018) model time 0.2305 (0.2247) loss 3.3155 
(2.9211) grad_norm 3.9277 (3.8953) loss_scale 1024.0000 (657.5431) mem 7381MB [2024-08-30 11:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][600/1251] eta 0:02:27 lr 0.000177 wd 0.0500 time 0.2215 (0.2264) data time 0.0007 (0.0017) model time 0.2207 (0.2247) loss 2.7871 (2.9193) grad_norm 8.0443 (3.9170) loss_scale 1024.0000 (663.6406) mem 7381MB [2024-08-30 11:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][610/1251] eta 0:02:25 lr 0.000177 wd 0.0500 time 0.2251 (0.2264) data time 0.0007 (0.0017) model time 0.2245 (0.2247) loss 3.4843 (2.9237) grad_norm 3.8607 (3.9159) loss_scale 1024.0000 (669.5385) mem 7381MB [2024-08-30 11:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][620/1251] eta 0:02:22 lr 0.000177 wd 0.0500 time 0.2230 (0.2264) data time 0.0008 (0.0017) model time 0.2222 (0.2247) loss 3.0031 (2.9215) grad_norm 3.5456 (3.9114) loss_scale 1024.0000 (675.2464) mem 7381MB [2024-08-30 11:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][630/1251] eta 0:02:20 lr 0.000177 wd 0.0500 time 0.2260 (0.2264) data time 0.0006 (0.0017) model time 0.2254 (0.2247) loss 2.2793 (2.9136) grad_norm 4.2596 (3.9024) loss_scale 1024.0000 (680.7734) mem 7381MB [2024-08-30 11:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][640/1251] eta 0:02:18 lr 0.000177 wd 0.0500 time 0.2472 (0.2264) data time 0.0006 (0.0017) model time 0.2465 (0.2247) loss 2.8560 (2.9119) grad_norm 2.9312 (3.8947) loss_scale 1024.0000 (686.1279) mem 7381MB [2024-08-30 11:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][650/1251] eta 0:02:16 lr 0.000177 wd 0.0500 time 0.2222 (0.2263) data time 0.0008 (0.0017) model time 0.2214 (0.2247) loss 2.9910 (2.9116) grad_norm 4.4525 (3.8921) loss_scale 1024.0000 (691.3180) mem 7381MB [2024-08-30 11:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][660/1251] eta 0:02:13 lr 0.000177 wd 
0.0500 time 0.2238 (0.2263) data time 0.0008 (0.0017) model time 0.2230 (0.2246) loss 2.4627 (2.9114) grad_norm 3.4097 (3.8945) loss_scale 1024.0000 (696.3510) mem 7381MB [2024-08-30 11:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][670/1251] eta 0:02:11 lr 0.000177 wd 0.0500 time 0.2246 (0.2263) data time 0.0006 (0.0017) model time 0.2240 (0.2247) loss 3.5173 (2.9144) grad_norm 4.9390 (3.8927) loss_scale 1024.0000 (701.2340) mem 7381MB [2024-08-30 11:27:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][680/1251] eta 0:02:09 lr 0.000177 wd 0.0500 time 0.2289 (0.2263) data time 0.0010 (0.0016) model time 0.2280 (0.2246) loss 3.0334 (2.9131) grad_norm 3.7218 (3.8873) loss_scale 1024.0000 (705.9736) mem 7381MB [2024-08-30 11:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][690/1251] eta 0:02:06 lr 0.000177 wd 0.0500 time 0.2272 (0.2262) data time 0.0005 (0.0016) model time 0.2266 (0.2246) loss 2.3516 (2.9130) grad_norm 3.0362 (3.8812) loss_scale 1024.0000 (710.5760) mem 7381MB [2024-08-30 11:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][700/1251] eta 0:02:04 lr 0.000177 wd 0.0500 time 0.2258 (0.2262) data time 0.0007 (0.0016) model time 0.2251 (0.2246) loss 3.5610 (2.9111) grad_norm 3.3610 (3.8742) loss_scale 1024.0000 (715.0471) mem 7381MB [2024-08-30 11:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][710/1251] eta 0:02:02 lr 0.000177 wd 0.0500 time 0.2230 (0.2262) data time 0.0009 (0.0016) model time 0.2220 (0.2246) loss 2.7924 (2.9077) grad_norm 2.7920 (3.8671) loss_scale 1024.0000 (719.3924) mem 7381MB [2024-08-30 11:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][720/1251] eta 0:02:00 lr 0.000177 wd 0.0500 time 0.2220 (0.2261) data time 0.0008 (0.0016) model time 0.2212 (0.2245) loss 3.1723 (2.9082) grad_norm 2.8871 (3.8554) loss_scale 1024.0000 (723.6172) mem 7381MB [2024-08-30 11:27:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][730/1251] eta 0:01:57 lr 0.000177 wd 0.0500 time 0.2256 (0.2261) data time 0.0006 (0.0016) model time 0.2251 (0.2245) loss 3.2622 (2.9075) grad_norm 4.0829 (3.8472) loss_scale 1024.0000 (727.7264) mem 7381MB [2024-08-30 11:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][740/1251] eta 0:01:55 lr 0.000177 wd 0.0500 time 0.2198 (0.2260) data time 0.0008 (0.0016) model time 0.2190 (0.2245) loss 3.2313 (2.9107) grad_norm 2.7265 (3.8378) loss_scale 1024.0000 (731.7247) mem 7381MB [2024-08-30 11:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][750/1251] eta 0:01:53 lr 0.000177 wd 0.0500 time 0.2190 (0.2260) data time 0.0008 (0.0016) model time 0.2182 (0.2244) loss 3.4975 (2.9144) grad_norm 3.9284 (3.8359) loss_scale 1024.0000 (735.6165) mem 7381MB [2024-08-30 11:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][760/1251] eta 0:01:50 lr 0.000177 wd 0.0500 time 0.2225 (0.2260) data time 0.0007 (0.0016) model time 0.2218 (0.2244) loss 3.2920 (2.9135) grad_norm 4.3002 (3.8288) loss_scale 1024.0000 (739.4060) mem 7381MB [2024-08-30 11:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][770/1251] eta 0:01:48 lr 0.000177 wd 0.0500 time 0.2239 (0.2260) data time 0.0007 (0.0016) model time 0.2233 (0.2244) loss 3.4377 (2.9140) grad_norm 6.2141 (3.8374) loss_scale 1024.0000 (743.0973) mem 7381MB [2024-08-30 11:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][780/1251] eta 0:01:46 lr 0.000177 wd 0.0500 time 0.2257 (0.2259) data time 0.0007 (0.0015) model time 0.2250 (0.2244) loss 2.9750 (2.9139) grad_norm 4.0966 (3.8361) loss_scale 1024.0000 (746.6940) mem 7381MB [2024-08-30 11:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][790/1251] eta 0:01:44 lr 0.000177 wd 0.0500 time 0.2159 (0.2259) data time 0.0007 (0.0015) model time 0.2152 (0.2244) loss 2.3268 
(2.9105) grad_norm 3.2803 (3.8391) loss_scale 1024.0000 (750.1997) mem 7381MB [2024-08-30 11:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][800/1251] eta 0:01:41 lr 0.000177 wd 0.0500 time 0.2229 (0.2259) data time 0.0008 (0.0015) model time 0.2221 (0.2243) loss 2.3568 (2.9045) grad_norm 3.4001 (3.8323) loss_scale 1024.0000 (753.6180) mem 7381MB [2024-08-30 11:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][810/1251] eta 0:01:39 lr 0.000177 wd 0.0500 time 0.2233 (0.2258) data time 0.0006 (0.0015) model time 0.2227 (0.2243) loss 2.1071 (2.9018) grad_norm 2.8120 (3.8231) loss_scale 1024.0000 (756.9519) mem 7381MB [2024-08-30 11:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][820/1251] eta 0:01:37 lr 0.000177 wd 0.0500 time 0.2225 (0.2258) data time 0.0007 (0.0015) model time 0.2218 (0.2243) loss 1.9719 (2.9024) grad_norm 3.7702 (3.8203) loss_scale 1024.0000 (760.2046) mem 7381MB [2024-08-30 11:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][830/1251] eta 0:01:35 lr 0.000177 wd 0.0500 time 0.2233 (0.2258) data time 0.0008 (0.0015) model time 0.2225 (0.2243) loss 3.0675 (2.9017) grad_norm 2.9978 (3.8169) loss_scale 1024.0000 (763.3791) mem 7381MB [2024-08-30 11:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][840/1251] eta 0:01:32 lr 0.000177 wd 0.0500 time 0.2261 (0.2258) data time 0.0007 (0.0015) model time 0.2254 (0.2243) loss 3.4565 (2.9051) grad_norm 3.4489 (3.8148) loss_scale 1024.0000 (766.4780) mem 7381MB [2024-08-30 11:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][850/1251] eta 0:01:30 lr 0.000176 wd 0.0500 time 0.2226 (0.2258) data time 0.0007 (0.0015) model time 0.2218 (0.2243) loss 2.9706 (2.9075) grad_norm 2.7403 (3.8593) loss_scale 1024.0000 (769.5041) mem 7381MB [2024-08-30 11:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][860/1251] eta 0:01:28 lr 0.000176 wd 
0.0500 time 0.2284 (0.2257) data time 0.0005 (0.0015) model time 0.2278 (0.2243) loss 2.8484 (2.9074) grad_norm 5.1670 (3.8687) loss_scale 1024.0000 (772.4599) mem 7381MB [2024-08-30 11:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][870/1251] eta 0:01:26 lr 0.000176 wd 0.0500 time 0.2276 (0.2257) data time 0.0008 (0.0015) model time 0.2268 (0.2243) loss 2.6419 (2.9060) grad_norm 3.0173 (3.8719) loss_scale 1024.0000 (775.3479) mem 7381MB [2024-08-30 11:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][880/1251] eta 0:01:23 lr 0.000176 wd 0.0500 time 0.2289 (0.2257) data time 0.0006 (0.0015) model time 0.2283 (0.2243) loss 2.1112 (2.9094) grad_norm 3.9366 (3.8824) loss_scale 1024.0000 (778.1703) mem 7381MB [2024-08-30 11:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][890/1251] eta 0:01:21 lr 0.000176 wd 0.0500 time 0.2291 (0.2257) data time 0.0006 (0.0015) model time 0.2285 (0.2242) loss 1.8375 (2.9075) grad_norm 3.0869 (3.8844) loss_scale 1024.0000 (780.9293) mem 7381MB [2024-08-30 11:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][900/1251] eta 0:01:19 lr 0.000176 wd 0.0500 time 0.2306 (0.2257) data time 0.0008 (0.0015) model time 0.2298 (0.2243) loss 3.1316 (2.9058) grad_norm 2.4536 (3.8774) loss_scale 1024.0000 (783.6271) mem 7381MB [2024-08-30 11:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][910/1251] eta 0:01:17 lr 0.000176 wd 0.0500 time 0.4408 (0.2259) data time 0.0005 (0.0015) model time 0.4402 (0.2245) loss 3.5347 (2.9078) grad_norm 4.4289 (3.9023) loss_scale 1024.0000 (786.2656) mem 7381MB [2024-08-30 11:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][920/1251] eta 0:01:14 lr 0.000176 wd 0.0500 time 0.2259 (0.2259) data time 0.0006 (0.0014) model time 0.2253 (0.2245) loss 3.6508 (2.9077) grad_norm 3.0012 (3.9150) loss_scale 1024.0000 (788.8469) mem 7381MB [2024-08-30 11:28:10 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][930/1251] eta 0:01:12 lr 0.000176 wd 0.0500 time 0.2293 (0.2259) data time 0.0006 (0.0014) model time 0.2287 (0.2245) loss 1.9841 (2.9062) grad_norm 3.1116 (3.9133) loss_scale 1024.0000 (791.3727) mem 7381MB [2024-08-30 11:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][940/1251] eta 0:01:10 lr 0.000176 wd 0.0500 time 0.2276 (0.2259) data time 0.0006 (0.0014) model time 0.2270 (0.2245) loss 2.9836 (2.9077) grad_norm 4.0180 (3.9121) loss_scale 1024.0000 (793.8448) mem 7381MB [2024-08-30 11:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][950/1251] eta 0:01:08 lr 0.000176 wd 0.0500 time 0.2203 (0.2259) data time 0.0009 (0.0014) model time 0.2194 (0.2245) loss 2.6014 (2.9090) grad_norm 3.0144 (3.9134) loss_scale 1024.0000 (796.2650) mem 7381MB [2024-08-30 11:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][960/1251] eta 0:01:05 lr 0.000176 wd 0.0500 time 0.2261 (0.2259) data time 0.0006 (0.0014) model time 0.2255 (0.2245) loss 2.7809 (2.9090) grad_norm 5.5526 (3.9089) loss_scale 1024.0000 (798.6348) mem 7381MB [2024-08-30 11:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][970/1251] eta 0:01:03 lr 0.000176 wd 0.0500 time 0.2185 (0.2259) data time 0.0007 (0.0014) model time 0.2178 (0.2245) loss 2.9532 (2.9093) grad_norm 2.5803 (3.9080) loss_scale 1024.0000 (800.9557) mem 7381MB [2024-08-30 11:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][980/1251] eta 0:01:01 lr 0.000176 wd 0.0500 time 0.2259 (0.2259) data time 0.0008 (0.0014) model time 0.2251 (0.2245) loss 2.6006 (2.9101) grad_norm 3.1430 (3.9129) loss_scale 1024.0000 (803.2294) mem 7381MB [2024-08-30 11:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][990/1251] eta 0:00:58 lr 0.000176 wd 0.0500 time 0.2228 (0.2259) data time 0.0008 (0.0014) model time 0.2220 (0.2245) loss 3.1738 
(2.9093) grad_norm 14.1870 (3.9224) loss_scale 1024.0000 (805.4571) mem 7381MB [2024-08-30 11:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1000/1251] eta 0:00:56 lr 0.000176 wd 0.0500 time 0.2257 (0.2259) data time 0.0006 (0.0014) model time 0.2250 (0.2245) loss 1.8920 (2.9069) grad_norm 5.4854 (3.9338) loss_scale 1024.0000 (807.6404) mem 7381MB [2024-08-30 11:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1010/1251] eta 0:00:54 lr 0.000176 wd 0.0500 time 0.2197 (0.2259) data time 0.0008 (0.0014) model time 0.2189 (0.2245) loss 1.9525 (2.9060) grad_norm 4.7151 (3.9414) loss_scale 1024.0000 (809.7804) mem 7381MB [2024-08-30 11:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1020/1251] eta 0:00:52 lr 0.000176 wd 0.0500 time 0.4296 (0.2261) data time 0.0006 (0.0014) model time 0.4290 (0.2247) loss 2.9361 (2.9052) grad_norm 3.4486 (3.9386) loss_scale 1024.0000 (811.8786) mem 7381MB [2024-08-30 11:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1030/1251] eta 0:00:49 lr 0.000176 wd 0.0500 time 0.2254 (0.2261) data time 0.0006 (0.0014) model time 0.2248 (0.2247) loss 3.0356 (2.9071) grad_norm 2.5529 (3.9444) loss_scale 1024.0000 (813.9360) mem 7381MB [2024-08-30 11:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1040/1251] eta 0:00:47 lr 0.000176 wd 0.0500 time 0.2233 (0.2261) data time 0.0009 (0.0014) model time 0.2224 (0.2247) loss 2.9434 (2.9068) grad_norm 3.7383 (3.9426) loss_scale 1024.0000 (815.9539) mem 7381MB [2024-08-30 11:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1050/1251] eta 0:00:45 lr 0.000176 wd 0.0500 time 0.2258 (0.2261) data time 0.0009 (0.0014) model time 0.2249 (0.2247) loss 3.3011 (2.9069) grad_norm 3.7018 (3.9413) loss_scale 1024.0000 (817.9334) mem 7381MB [2024-08-30 11:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1060/1251] eta 0:00:43 lr 
0.000176 wd 0.0500 time 0.2213 (0.2260) data time 0.0006 (0.0014) model time 0.2207 (0.2247) loss 1.8827 (2.9060) grad_norm 3.1867 (3.9383) loss_scale 1024.0000 (819.8756) mem 7381MB [2024-08-30 11:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1070/1251] eta 0:00:40 lr 0.000176 wd 0.0500 time 0.2235 (0.2260) data time 0.0009 (0.0014) model time 0.2226 (0.2246) loss 2.4448 (2.9028) grad_norm 2.8350 (3.9328) loss_scale 1024.0000 (821.7815) mem 7381MB [2024-08-30 11:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1080/1251] eta 0:00:38 lr 0.000176 wd 0.0500 time 0.2243 (0.2260) data time 0.0009 (0.0014) model time 0.2234 (0.2246) loss 3.0097 (2.9022) grad_norm 2.8064 (3.9269) loss_scale 1024.0000 (823.6522) mem 7381MB [2024-08-30 11:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1090/1251] eta 0:00:36 lr 0.000176 wd 0.0500 time 0.2198 (0.2260) data time 0.0006 (0.0014) model time 0.2192 (0.2246) loss 2.8829 (2.9017) grad_norm 3.8057 (inf) loss_scale 512.0000 (821.7342) mem 7381MB [2024-08-30 11:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1100/1251] eta 0:00:34 lr 0.000176 wd 0.0500 time 0.2279 (0.2260) data time 0.0010 (0.0014) model time 0.2269 (0.2246) loss 2.2971 (2.9000) grad_norm 4.0755 (inf) loss_scale 512.0000 (818.9210) mem 7381MB [2024-08-30 11:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1110/1251] eta 0:00:31 lr 0.000176 wd 0.0500 time 0.2257 (0.2260) data time 0.0006 (0.0013) model time 0.2251 (0.2246) loss 1.7938 (2.8998) grad_norm 3.6485 (inf) loss_scale 512.0000 (816.1584) mem 7381MB [2024-08-30 11:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1120/1251] eta 0:00:29 lr 0.000176 wd 0.0500 time 0.2230 (0.2260) data time 0.0006 (0.0013) model time 0.2223 (0.2246) loss 3.4119 (2.8995) grad_norm 3.1953 (inf) loss_scale 512.0000 (813.4451) mem 7381MB [2024-08-30 11:28:56 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1130/1251] eta 0:00:27 lr 0.000176 wd 0.0500 time 0.2343 (0.2260) data time 0.0007 (0.0013) model time 0.2336 (0.2247) loss 1.8312 (2.8970) grad_norm 3.3461 (inf) loss_scale 512.0000 (810.7798) mem 7381MB [2024-08-30 11:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1140/1251] eta 0:00:25 lr 0.000176 wd 0.0500 time 0.2236 (0.2260) data time 0.0009 (0.0013) model time 0.2228 (0.2247) loss 3.1227 (2.8979) grad_norm 3.8151 (inf) loss_scale 512.0000 (808.1613) mem 7381MB [2024-08-30 11:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1150/1251] eta 0:00:22 lr 0.000175 wd 0.0500 time 0.2184 (0.2260) data time 0.0006 (0.0013) model time 0.2177 (0.2247) loss 3.5610 (2.8983) grad_norm 3.7895 (inf) loss_scale 512.0000 (805.5882) mem 7381MB [2024-08-30 11:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1160/1251] eta 0:00:20 lr 0.000175 wd 0.0500 time 0.2224 (0.2260) data time 0.0007 (0.0013) model time 0.2217 (0.2247) loss 2.9085 (2.8996) grad_norm 3.5190 (inf) loss_scale 512.0000 (803.0594) mem 7381MB [2024-08-30 11:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1170/1251] eta 0:00:18 lr 0.000175 wd 0.0500 time 0.2221 (0.2260) data time 0.0007 (0.0013) model time 0.2214 (0.2247) loss 2.2578 (2.8994) grad_norm 2.9807 (inf) loss_scale 512.0000 (800.5739) mem 7381MB [2024-08-30 11:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1180/1251] eta 0:00:16 lr 0.000175 wd 0.0500 time 0.2186 (0.2260) data time 0.0008 (0.0013) model time 0.2178 (0.2247) loss 2.5751 (2.9005) grad_norm 3.0635 (inf) loss_scale 512.0000 (798.1304) mem 7381MB [2024-08-30 11:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1190/1251] eta 0:00:13 lr 0.000175 wd 0.0500 time 0.2308 (0.2261) data time 0.0006 (0.0013) model time 0.2302 (0.2247) loss 1.9156 (2.9001) grad_norm 
2.2707 (inf) loss_scale 512.0000 (795.7280) mem 7381MB [2024-08-30 11:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1200/1251] eta 0:00:11 lr 0.000175 wd 0.0500 time 0.2264 (0.2261) data time 0.0006 (0.0013) model time 0.2258 (0.2248) loss 3.3882 (2.9007) grad_norm 3.8536 (inf) loss_scale 512.0000 (793.3655) mem 7381MB [2024-08-30 11:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1210/1251] eta 0:00:09 lr 0.000175 wd 0.0500 time 0.2319 (0.2261) data time 0.0006 (0.0013) model time 0.2313 (0.2248) loss 3.2466 (2.8987) grad_norm 12.9653 (inf) loss_scale 512.0000 (791.0421) mem 7381MB [2024-08-30 11:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1220/1251] eta 0:00:07 lr 0.000175 wd 0.0500 time 0.2229 (0.2261) data time 0.0008 (0.0013) model time 0.2221 (0.2248) loss 3.5824 (2.9001) grad_norm 4.7344 (inf) loss_scale 512.0000 (788.7568) mem 7381MB [2024-08-30 11:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1230/1251] eta 0:00:04 lr 0.000175 wd 0.0500 time 0.2265 (0.2261) data time 0.0006 (0.0013) model time 0.2259 (0.2248) loss 3.4159 (2.8994) grad_norm 3.1472 (inf) loss_scale 512.0000 (786.5085) mem 7381MB [2024-08-30 11:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1240/1251] eta 0:00:02 lr 0.000175 wd 0.0500 time 0.2193 (0.2261) data time 0.0003 (0.0013) model time 0.2189 (0.2248) loss 1.9700 (2.9003) grad_norm 6.2248 (inf) loss_scale 512.0000 (784.2965) mem 7381MB [2024-08-30 11:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [224/300][1250/1251] eta 0:00:00 lr 0.000175 wd 0.0500 time 0.2195 (0.2260) data time 0.0005 (0.0013) model time 0.2189 (0.2247) loss 2.6263 (2.8994) grad_norm 2.5341 (inf) loss_scale 512.0000 (782.1199) mem 7381MB [2024-08-30 11:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 224 training takes 0:04:42 [2024-08-30 11:29:23 msvmambav3_tiny_224] (utils.py 118): 
INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 11:29:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 11:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.374 (0.374) Loss 0.4092 (0.4092) Acc@1 92.871 (92.871) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-30 11:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.099) Loss 0.6001 (0.6408) Acc@1 88.574 (86.515) Acc@5 97.949 (97.594) Mem 7381MB [2024-08-30 11:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.086) Loss 0.9443 (0.6687) Acc@1 77.734 (85.514) Acc@5 94.824 (97.484) Mem 7381MB [2024-08-30 11:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.068 (0.080) Loss 1.1445 (0.7595) Acc@1 73.828 (83.383) Acc@5 92.578 (96.525) Mem 7381MB [2024-08-30 11:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.0479 (0.8114) Acc@1 75.586 (82.129) Acc@5 94.043 (95.953) Mem 7381MB [2024-08-30 11:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.742 Acc@5 95.924 [2024-08-30 11:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.7% [2024-08-30 11:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.74% [2024-08-30 11:29:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 11:29:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-30 11:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.375 (0.375) Loss 0.3767 (0.3767) Acc@1 93.164 (93.164) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-30 11:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.099) Loss 0.5801 (0.6014) Acc@1 89.355 (87.482) Acc@5 97.852 (97.763) Mem 7381MB [2024-08-30 11:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.085) Loss 0.8687 (0.6286) Acc@1 78.613 (86.407) Acc@5 95.898 (97.689) Mem 7381MB [2024-08-30 11:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.081) Loss 1.0869 (0.7128) Acc@1 74.121 (84.350) Acc@5 92.969 (96.780) Mem 7381MB [2024-08-30 11:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 0.9771 (0.7565) Acc@1 77.344 (83.201) Acc@5 94.531 (96.344) Mem 7381MB [2024-08-30 11:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.806 Acc@5 96.326 [2024-08-30 11:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 11:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.81% [2024-08-30 11:29:31 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 11:29:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 11:29:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][0/1251] eta 0:14:02 lr 0.000175 wd 0.0500 time 0.6736 (0.6736) data time 0.4705 (0.4705) model time 0.0000 (0.0000) loss 3.1398 (3.1398) grad_norm 4.5476 (4.5476) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][10/1251] eta 0:05:30 lr 0.000175 wd 0.0500 time 0.2192 (0.2662) data time 0.0009 (0.0435) model time 0.0000 (0.0000) loss 3.3238 (2.9538) grad_norm 5.0300 (3.9885) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][20/1251] eta 0:05:03 lr 0.000175 wd 0.0500 time 0.2298 (0.2465) data time 0.0006 (0.0232) model time 0.0000 (0.0000) loss 3.3920 (2.9307) grad_norm 3.3950 (4.0726) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][30/1251] eta 0:04:53 lr 0.000175 wd 0.0500 time 0.2295 (0.2401) data time 0.0009 (0.0160) model time 0.0000 (0.0000) loss 3.0858 (2.9902) grad_norm 4.2480 (3.9970) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][40/1251] eta 0:04:46 lr 0.000175 wd 0.0500 time 0.2298 (0.2366) data time 0.0007 (0.0123) model time 0.0000 (0.0000) loss 3.6045 (2.9826) grad_norm 3.8487 (3.9275) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][50/1251] eta 0:04:41 lr 0.000175 wd 0.0500 time 0.2256 (0.2347) data time 0.0012 (0.0101) model time 0.0000 (0.0000) loss 3.1448 (2.9722) grad_norm 3.0962 (3.8389) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][60/1251] eta 0:04:37 lr 0.000175 wd 0.0500 time 0.2348 (0.2333) data time 0.0006 (0.0086) model time 0.2343 (0.2250) loss 
2.6282 (2.9772) grad_norm 3.9640 (3.9511) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][70/1251] eta 0:04:34 lr 0.000175 wd 0.0500 time 0.2326 (0.2323) data time 0.0006 (0.0075) model time 0.2321 (0.2251) loss 2.1149 (2.9690) grad_norm 2.4683 (3.9344) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][80/1251] eta 0:04:31 lr 0.000175 wd 0.0500 time 0.2297 (0.2316) data time 0.0009 (0.0067) model time 0.2288 (0.2254) loss 3.3259 (2.9579) grad_norm 3.8739 (3.9542) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][90/1251] eta 0:04:28 lr 0.000175 wd 0.0500 time 0.2265 (0.2309) data time 0.0009 (0.0061) model time 0.2256 (0.2250) loss 2.6208 (2.9701) grad_norm 2.8932 (4.0507) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][100/1251] eta 0:04:25 lr 0.000175 wd 0.0500 time 0.2253 (0.2303) data time 0.0010 (0.0056) model time 0.2243 (0.2249) loss 2.2223 (2.9341) grad_norm 3.1580 (4.0539) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][110/1251] eta 0:04:22 lr 0.000175 wd 0.0500 time 0.2225 (0.2299) data time 0.0011 (0.0052) model time 0.2214 (0.2247) loss 3.4156 (2.9355) grad_norm 4.2625 (4.0241) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 11:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][120/1251] eta 0:04:19 lr 0.000175 wd 0.0500 time 0.2259 (0.2293) data time 0.0006 (0.0048) model time 0.2253 (0.2244) loss 2.1414 (2.9100) grad_norm 2.7264 (inf) loss_scale 256.0000 (501.4215) mem 7381MB [2024-08-30 11:30:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][130/1251] eta 0:04:16 lr 0.000175 wd 
0.0500 time 0.2214 (0.2290) data time 0.0009 (0.0045) model time 0.2205 (0.2244) loss 3.0487 (2.8937) grad_norm 5.1508 (inf) loss_scale 256.0000 (482.6870) mem 7381MB [2024-08-30 11:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][140/1251] eta 0:04:14 lr 0.000175 wd 0.0500 time 0.2195 (0.2287) data time 0.0009 (0.0043) model time 0.2186 (0.2243) loss 2.3616 (2.8890) grad_norm 4.5876 (inf) loss_scale 256.0000 (466.6099) mem 7381MB [2024-08-30 11:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][150/1251] eta 0:04:11 lr 0.000175 wd 0.0500 time 0.2343 (0.2285) data time 0.0006 (0.0040) model time 0.2337 (0.2244) loss 2.4578 (2.8882) grad_norm 10.0391 (inf) loss_scale 256.0000 (452.6623) mem 7381MB [2024-08-30 11:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][160/1251] eta 0:04:09 lr 0.000175 wd 0.0500 time 0.2269 (0.2283) data time 0.0006 (0.0039) model time 0.2263 (0.2244) loss 3.3797 (2.8916) grad_norm 3.5119 (inf) loss_scale 256.0000 (440.4472) mem 7381MB [2024-08-30 11:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][170/1251] eta 0:04:06 lr 0.000175 wd 0.0500 time 0.2412 (0.2282) data time 0.0010 (0.0038) model time 0.2402 (0.2244) loss 3.0622 (2.8974) grad_norm 3.3868 (inf) loss_scale 256.0000 (429.6608) mem 7381MB [2024-08-30 11:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][180/1251] eta 0:04:04 lr 0.000175 wd 0.0500 time 0.2211 (0.2280) data time 0.0007 (0.0036) model time 0.2203 (0.2243) loss 2.2790 (2.8882) grad_norm 2.6924 (inf) loss_scale 256.0000 (420.0663) mem 7381MB [2024-08-30 11:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][190/1251] eta 0:04:01 lr 0.000175 wd 0.0500 time 0.2196 (0.2278) data time 0.0009 (0.0035) model time 0.2187 (0.2243) loss 2.2805 (2.8875) grad_norm 3.2622 (inf) loss_scale 256.0000 (411.4764) mem 7381MB [2024-08-30 11:30:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [225/300][200/1251] eta 0:03:59 lr 0.000174 wd 0.0500 time 0.2434 (0.2278) data time 0.0005 (0.0034) model time 0.2429 (0.2243) loss 2.4526 (2.8911) grad_norm 3.9476 (inf) loss_scale 256.0000 (403.7413) mem 7381MB [2024-08-30 11:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][210/1251] eta 0:03:56 lr 0.000174 wd 0.0500 time 0.2305 (0.2276) data time 0.0008 (0.0032) model time 0.2297 (0.2243) loss 2.8647 (2.8951) grad_norm 3.8583 (inf) loss_scale 256.0000 (396.7393) mem 7381MB [2024-08-30 11:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][220/1251] eta 0:03:54 lr 0.000174 wd 0.0500 time 0.2364 (0.2275) data time 0.0007 (0.0031) model time 0.2357 (0.2244) loss 3.3262 (2.9068) grad_norm 3.1570 (inf) loss_scale 256.0000 (390.3710) mem 7381MB [2024-08-30 11:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][230/1251] eta 0:03:53 lr 0.000174 wd 0.0500 time 0.2235 (0.2282) data time 0.0008 (0.0031) model time 0.2226 (0.2253) loss 3.2056 (2.9087) grad_norm 3.0414 (inf) loss_scale 256.0000 (384.5541) mem 7381MB [2024-08-30 11:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][240/1251] eta 0:03:50 lr 0.000174 wd 0.0500 time 0.2225 (0.2282) data time 0.0008 (0.0030) model time 0.2217 (0.2253) loss 2.5238 (2.9037) grad_norm 2.6622 (inf) loss_scale 256.0000 (379.2199) mem 7381MB [2024-08-30 11:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][250/1251] eta 0:03:48 lr 0.000174 wd 0.0500 time 0.2203 (0.2281) data time 0.0009 (0.0030) model time 0.2195 (0.2253) loss 2.8672 (2.9041) grad_norm 4.1674 (inf) loss_scale 256.0000 (374.3108) mem 7381MB [2024-08-30 11:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][260/1251] eta 0:03:45 lr 0.000174 wd 0.0500 time 0.2163 (0.2280) data time 0.0008 (0.0029) model time 0.2155 (0.2252) loss 2.9001 (2.9018) grad_norm 3.8908 (inf) loss_scale 
256.0000 (369.7778) mem 7381MB [2024-08-30 11:30:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][270/1251] eta 0:03:43 lr 0.000174 wd 0.0500 time 0.2203 (0.2281) data time 0.0006 (0.0029) model time 0.2196 (0.2254) loss 1.8898 (2.8925) grad_norm 4.4642 (inf) loss_scale 256.0000 (365.5793) mem 7381MB [2024-08-30 11:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][280/1251] eta 0:03:41 lr 0.000174 wd 0.0500 time 0.2256 (0.2280) data time 0.0006 (0.0028) model time 0.2250 (0.2253) loss 3.6780 (2.8965) grad_norm 3.2964 (inf) loss_scale 256.0000 (361.6797) mem 7381MB [2024-08-30 11:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][290/1251] eta 0:03:39 lr 0.000174 wd 0.0500 time 0.2222 (0.2279) data time 0.0010 (0.0027) model time 0.2212 (0.2253) loss 2.4632 (2.8897) grad_norm 5.8846 (inf) loss_scale 256.0000 (358.0481) mem 7381MB [2024-08-30 11:30:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][300/1251] eta 0:03:36 lr 0.000174 wd 0.0500 time 0.2263 (0.2279) data time 0.0009 (0.0027) model time 0.2254 (0.2253) loss 3.3970 (2.8912) grad_norm 2.6583 (inf) loss_scale 256.0000 (354.6578) mem 7381MB [2024-08-30 11:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][310/1251] eta 0:03:34 lr 0.000174 wd 0.0500 time 0.2230 (0.2278) data time 0.0007 (0.0026) model time 0.2223 (0.2253) loss 3.2653 (2.8908) grad_norm 3.4485 (inf) loss_scale 256.0000 (351.4855) mem 7381MB [2024-08-30 11:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][320/1251] eta 0:03:31 lr 0.000174 wd 0.0500 time 0.2235 (0.2277) data time 0.0006 (0.0026) model time 0.2230 (0.2252) loss 2.8371 (2.8857) grad_norm 3.9610 (inf) loss_scale 256.0000 (348.5109) mem 7381MB [2024-08-30 11:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][330/1251] eta 0:03:29 lr 0.000174 wd 0.0500 time 0.2265 (0.2277) data time 0.0006 (0.0026) model time 
0.2259 (0.2252) loss 2.7355 (2.8874) grad_norm 3.2509 (inf) loss_scale 256.0000 (345.7160) mem 7381MB [2024-08-30 11:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][340/1251] eta 0:03:27 lr 0.000174 wd 0.0500 time 0.2191 (0.2276) data time 0.0006 (0.0025) model time 0.2185 (0.2252) loss 3.2191 (2.8906) grad_norm 18.2769 (inf) loss_scale 256.0000 (343.0850) mem 7381MB [2024-08-30 11:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][350/1251] eta 0:03:25 lr 0.000174 wd 0.0500 time 0.2280 (0.2276) data time 0.0007 (0.0025) model time 0.2273 (0.2252) loss 3.3259 (2.8975) grad_norm 4.5287 (inf) loss_scale 256.0000 (340.6040) mem 7381MB [2024-08-30 11:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][360/1251] eta 0:03:22 lr 0.000174 wd 0.0500 time 0.2219 (0.2275) data time 0.0006 (0.0024) model time 0.2214 (0.2252) loss 2.7455 (2.8894) grad_norm 3.0566 (inf) loss_scale 256.0000 (338.2604) mem 7381MB [2024-08-30 11:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][370/1251] eta 0:03:20 lr 0.000174 wd 0.0500 time 0.2182 (0.2276) data time 0.0009 (0.0024) model time 0.2172 (0.2253) loss 3.0767 (2.8870) grad_norm 4.0529 (inf) loss_scale 256.0000 (336.0431) mem 7381MB [2024-08-30 11:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][380/1251] eta 0:03:18 lr 0.000174 wd 0.0500 time 0.2222 (0.2275) data time 0.0009 (0.0023) model time 0.2213 (0.2253) loss 3.0169 (2.8882) grad_norm 3.9565 (inf) loss_scale 256.0000 (333.9423) mem 7381MB [2024-08-30 11:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][390/1251] eta 0:03:15 lr 0.000174 wd 0.0500 time 0.2226 (0.2275) data time 0.0007 (0.0023) model time 0.2219 (0.2253) loss 2.9290 (2.8916) grad_norm 5.7309 (inf) loss_scale 256.0000 (331.9488) mem 7381MB [2024-08-30 11:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][400/1251] eta 0:03:14 lr 0.000174 
wd 0.0500 time 0.2249 (0.2281) data time 0.0005 (0.0023) model time 0.2243 (0.2259) loss 3.4820 (2.8901) grad_norm 4.3612 (inf) loss_scale 256.0000 (330.0549) mem 7381MB [2024-08-30 11:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][410/1251] eta 0:03:11 lr 0.000174 wd 0.0500 time 0.2282 (0.2280) data time 0.0008 (0.0023) model time 0.2274 (0.2259) loss 3.4151 (2.8910) grad_norm 3.6313 (inf) loss_scale 256.0000 (328.2530) mem 7381MB [2024-08-30 11:31:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][420/1251] eta 0:03:09 lr 0.000174 wd 0.0500 time 0.2235 (0.2280) data time 0.0008 (0.0023) model time 0.2227 (0.2259) loss 3.1376 (2.8929) grad_norm 4.4876 (inf) loss_scale 256.0000 (326.5368) mem 7381MB [2024-08-30 11:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][430/1251] eta 0:03:07 lr 0.000174 wd 0.0500 time 0.2340 (0.2279) data time 0.0006 (0.0022) model time 0.2334 (0.2259) loss 3.2488 (2.8968) grad_norm 3.3851 (inf) loss_scale 256.0000 (324.9002) mem 7381MB [2024-08-30 11:31:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][440/1251] eta 0:03:04 lr 0.000174 wd 0.0500 time 0.2245 (0.2279) data time 0.0008 (0.0022) model time 0.2237 (0.2259) loss 2.8642 (2.8949) grad_norm 2.4331 (inf) loss_scale 256.0000 (323.3379) mem 7381MB [2024-08-30 11:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][450/1251] eta 0:03:02 lr 0.000174 wd 0.0500 time 0.2236 (0.2279) data time 0.0007 (0.0022) model time 0.2229 (0.2258) loss 3.4863 (2.8991) grad_norm 3.1818 (inf) loss_scale 256.0000 (321.8448) mem 7381MB [2024-08-30 11:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][460/1251] eta 0:03:00 lr 0.000174 wd 0.0500 time 0.2274 (0.2279) data time 0.0007 (0.0022) model time 0.2267 (0.2259) loss 3.4246 (2.9047) grad_norm 3.3132 (inf) loss_scale 256.0000 (320.4165) mem 7381MB [2024-08-30 11:31:19 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [225/300][470/1251] eta 0:02:57 lr 0.000174 wd 0.0500 time 0.2266 (0.2279) data time 0.0008 (0.0021) model time 0.2257 (0.2259) loss 2.4164 (2.9070) grad_norm 3.8276 (inf) loss_scale 256.0000 (319.0488) mem 7381MB [2024-08-30 11:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][480/1251] eta 0:02:55 lr 0.000174 wd 0.0500 time 0.2195 (0.2279) data time 0.0006 (0.0021) model time 0.2189 (0.2259) loss 2.9906 (2.9074) grad_norm 3.7450 (inf) loss_scale 256.0000 (317.7380) mem 7381MB [2024-08-30 11:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][490/1251] eta 0:02:53 lr 0.000174 wd 0.0500 time 0.2259 (0.2278) data time 0.0006 (0.0021) model time 0.2254 (0.2258) loss 3.0402 (2.9056) grad_norm 3.1499 (inf) loss_scale 256.0000 (316.4807) mem 7381MB [2024-08-30 11:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][500/1251] eta 0:02:51 lr 0.000174 wd 0.0500 time 0.2214 (0.2278) data time 0.0007 (0.0021) model time 0.2207 (0.2258) loss 2.7483 (2.9076) grad_norm 3.3004 (inf) loss_scale 256.0000 (315.2735) mem 7381MB [2024-08-30 11:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][510/1251] eta 0:02:48 lr 0.000173 wd 0.0500 time 0.2232 (0.2277) data time 0.0007 (0.0021) model time 0.2225 (0.2257) loss 3.2129 (2.9066) grad_norm 4.9765 (inf) loss_scale 256.0000 (314.1135) mem 7381MB [2024-08-30 11:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][520/1251] eta 0:02:46 lr 0.000173 wd 0.0500 time 0.2270 (0.2277) data time 0.0008 (0.0020) model time 0.2262 (0.2258) loss 3.1804 (2.9067) grad_norm 3.7922 (inf) loss_scale 256.0000 (312.9981) mem 7381MB [2024-08-30 11:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][530/1251] eta 0:02:44 lr 0.000173 wd 0.0500 time 0.2328 (0.2277) data time 0.0008 (0.0020) model time 0.2320 (0.2258) loss 2.6031 (2.9095) grad_norm 2.9487 (inf) loss_scale 
256.0000 (311.9247) mem 7381MB [2024-08-30 11:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][540/1251] eta 0:02:41 lr 0.000173 wd 0.0500 time 0.2271 (0.2278) data time 0.0007 (0.0020) model time 0.2264 (0.2259) loss 3.3536 (2.9136) grad_norm 3.8479 (inf) loss_scale 256.0000 (310.8909) mem 7381MB [2024-08-30 11:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][550/1251] eta 0:02:39 lr 0.000173 wd 0.0500 time 0.2276 (0.2277) data time 0.0011 (0.0020) model time 0.2265 (0.2258) loss 3.0528 (2.9081) grad_norm 3.2650 (inf) loss_scale 256.0000 (309.8947) mem 7381MB [2024-08-30 11:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][560/1251] eta 0:02:37 lr 0.000173 wd 0.0500 time 0.2270 (0.2277) data time 0.0008 (0.0020) model time 0.2262 (0.2259) loss 3.2110 (2.9104) grad_norm 3.0097 (inf) loss_scale 256.0000 (308.9340) mem 7381MB [2024-08-30 11:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][570/1251] eta 0:02:35 lr 0.000173 wd 0.0500 time 0.2271 (0.2277) data time 0.0006 (0.0020) model time 0.2265 (0.2259) loss 2.4512 (2.9108) grad_norm 3.1208 (inf) loss_scale 256.0000 (308.0070) mem 7381MB [2024-08-30 11:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][580/1251] eta 0:02:32 lr 0.000173 wd 0.0500 time 0.2300 (0.2277) data time 0.0008 (0.0019) model time 0.2292 (0.2259) loss 2.6946 (2.9131) grad_norm 3.3416 (inf) loss_scale 256.0000 (307.1119) mem 7381MB [2024-08-30 11:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][590/1251] eta 0:02:30 lr 0.000173 wd 0.0500 time 0.2285 (0.2277) data time 0.0006 (0.0019) model time 0.2279 (0.2259) loss 3.0959 (2.9157) grad_norm 3.0749 (inf) loss_scale 256.0000 (306.2470) mem 7381MB [2024-08-30 11:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][600/1251] eta 0:02:28 lr 0.000173 wd 0.0500 time 0.2200 (0.2277) data time 0.0008 (0.0019) model time 
0.2193 (0.2259) loss 3.2839 (2.9200) grad_norm 2.8168 (inf) loss_scale 256.0000 (305.4110) mem 7381MB [2024-08-30 11:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][610/1251] eta 0:02:26 lr 0.000173 wd 0.0500 time 0.2321 (0.2278) data time 0.0009 (0.0019) model time 0.2313 (0.2260) loss 3.2096 (2.9214) grad_norm 4.6310 (inf) loss_scale 256.0000 (304.6023) mem 7381MB [2024-08-30 11:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][620/1251] eta 0:02:23 lr 0.000173 wd 0.0500 time 0.2239 (0.2278) data time 0.0007 (0.0019) model time 0.2233 (0.2260) loss 3.2267 (2.9234) grad_norm 4.5981 (inf) loss_scale 256.0000 (303.8196) mem 7381MB [2024-08-30 11:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][630/1251] eta 0:02:21 lr 0.000173 wd 0.0500 time 0.2236 (0.2278) data time 0.0006 (0.0019) model time 0.2230 (0.2261) loss 2.5681 (2.9243) grad_norm 3.4983 (inf) loss_scale 256.0000 (303.0618) mem 7381MB [2024-08-30 11:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][640/1251] eta 0:02:19 lr 0.000173 wd 0.0500 time 0.2272 (0.2278) data time 0.0006 (0.0019) model time 0.2266 (0.2261) loss 3.6415 (2.9223) grad_norm 3.6212 (inf) loss_scale 256.0000 (302.3276) mem 7381MB [2024-08-30 11:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][650/1251] eta 0:02:16 lr 0.000173 wd 0.0500 time 0.2233 (0.2279) data time 0.0006 (0.0019) model time 0.2227 (0.2261) loss 1.8395 (2.9190) grad_norm 3.1260 (inf) loss_scale 256.0000 (301.6160) mem 7381MB [2024-08-30 11:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][660/1251] eta 0:02:14 lr 0.000173 wd 0.0500 time 0.2267 (0.2279) data time 0.0009 (0.0018) model time 0.2258 (0.2262) loss 3.1448 (2.9203) grad_norm 3.8164 (inf) loss_scale 256.0000 (300.9259) mem 7381MB [2024-08-30 11:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][670/1251] eta 0:02:12 lr 0.000173 wd 
0.0500 time 0.2454 (0.2279) data time 0.0006 (0.0018) model time 0.2448 (0.2262) loss 2.9306 (2.9225) grad_norm 3.5589 (inf) loss_scale 256.0000 (300.2563) mem 7381MB [2024-08-30 11:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][680/1251] eta 0:02:10 lr 0.000173 wd 0.0500 time 0.2232 (0.2279) data time 0.0007 (0.0018) model time 0.2225 (0.2262) loss 2.9636 (2.9214) grad_norm 2.9252 (inf) loss_scale 256.0000 (299.6065) mem 7381MB [2024-08-30 11:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][690/1251] eta 0:02:07 lr 0.000173 wd 0.0500 time 0.2216 (0.2279) data time 0.0008 (0.0018) model time 0.2208 (0.2262) loss 3.3023 (2.9194) grad_norm 3.2154 (inf) loss_scale 256.0000 (298.9754) mem 7381MB [2024-08-30 11:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][700/1251] eta 0:02:05 lr 0.000173 wd 0.0500 time 0.2230 (0.2278) data time 0.0009 (0.0018) model time 0.2221 (0.2261) loss 2.7785 (2.9188) grad_norm 3.0658 (inf) loss_scale 256.0000 (298.3623) mem 7381MB [2024-08-30 11:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][710/1251] eta 0:02:03 lr 0.000173 wd 0.0500 time 0.2254 (0.2278) data time 0.0008 (0.0018) model time 0.2246 (0.2261) loss 2.2709 (2.9196) grad_norm 2.7086 (inf) loss_scale 256.0000 (297.7665) mem 7381MB [2024-08-30 11:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][720/1251] eta 0:02:00 lr 0.000173 wd 0.0500 time 0.2193 (0.2278) data time 0.0008 (0.0018) model time 0.2186 (0.2261) loss 3.4767 (2.9192) grad_norm 4.1077 (inf) loss_scale 256.0000 (297.1872) mem 7381MB [2024-08-30 11:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][730/1251] eta 0:01:58 lr 0.000173 wd 0.0500 time 0.2196 (0.2277) data time 0.0008 (0.0018) model time 0.2188 (0.2260) loss 2.7197 (2.9185) grad_norm 3.6608 (inf) loss_scale 256.0000 (296.6238) mem 7381MB [2024-08-30 11:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [225/300][740/1251] eta 0:01:56 lr 0.000173 wd 0.0500 time 0.2229 (0.2277) data time 0.0007 (0.0018) model time 0.2223 (0.2260) loss 3.0922 (2.9156) grad_norm 2.9006 (inf) loss_scale 256.0000 (296.0756) mem 7381MB [2024-08-30 11:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][750/1251] eta 0:01:54 lr 0.000173 wd 0.0500 time 0.2243 (0.2276) data time 0.0007 (0.0018) model time 0.2236 (0.2259) loss 2.4169 (2.9149) grad_norm 3.9067 (inf) loss_scale 256.0000 (295.5419) mem 7381MB [2024-08-30 11:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][760/1251] eta 0:01:51 lr 0.000173 wd 0.0500 time 0.2302 (0.2276) data time 0.0008 (0.0017) model time 0.2295 (0.2259) loss 3.1872 (2.9093) grad_norm 4.5854 (inf) loss_scale 256.0000 (295.0223) mem 7381MB [2024-08-30 11:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][770/1251] eta 0:01:49 lr 0.000173 wd 0.0500 time 0.2285 (0.2275) data time 0.0005 (0.0017) model time 0.2279 (0.2259) loss 2.8621 (2.9082) grad_norm 3.3769 (inf) loss_scale 256.0000 (294.5162) mem 7381MB [2024-08-30 11:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][780/1251] eta 0:01:47 lr 0.000173 wd 0.0500 time 0.2268 (0.2275) data time 0.0009 (0.0017) model time 0.2259 (0.2259) loss 2.7164 (2.9057) grad_norm 3.2272 (inf) loss_scale 256.0000 (294.0230) mem 7381MB [2024-08-30 11:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][790/1251] eta 0:01:44 lr 0.000173 wd 0.0500 time 0.2288 (0.2275) data time 0.0006 (0.0017) model time 0.2282 (0.2258) loss 3.2375 (2.9062) grad_norm 4.6504 (inf) loss_scale 256.0000 (293.5424) mem 7381MB [2024-08-30 11:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][800/1251] eta 0:01:42 lr 0.000173 wd 0.0500 time 0.2231 (0.2274) data time 0.0006 (0.0017) model time 0.2225 (0.2258) loss 2.3751 (2.9065) grad_norm 3.3975 (inf) loss_scale 256.0000 (293.0737) mem 
7381MB [2024-08-30 11:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][810/1251] eta 0:01:40 lr 0.000172 wd 0.0500 time 0.2232 (0.2274) data time 0.0008 (0.0017) model time 0.2224 (0.2258) loss 2.9372 (2.9070) grad_norm 3.9980 (inf) loss_scale 256.0000 (292.6165) mem 7381MB [2024-08-30 11:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][820/1251] eta 0:01:38 lr 0.000172 wd 0.0500 time 0.2242 (0.2274) data time 0.0006 (0.0017) model time 0.2236 (0.2258) loss 2.8732 (2.9056) grad_norm 7.5311 (inf) loss_scale 256.0000 (292.1705) mem 7381MB [2024-08-30 11:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][830/1251] eta 0:01:35 lr 0.000172 wd 0.0500 time 0.2272 (0.2274) data time 0.0006 (0.0017) model time 0.2266 (0.2257) loss 3.1168 (2.9084) grad_norm 3.1617 (inf) loss_scale 256.0000 (291.7353) mem 7381MB [2024-08-30 11:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][840/1251] eta 0:01:33 lr 0.000172 wd 0.0500 time 0.2307 (0.2273) data time 0.0008 (0.0017) model time 0.2299 (0.2257) loss 2.8773 (2.9099) grad_norm 4.3605 (inf) loss_scale 256.0000 (291.3103) mem 7381MB [2024-08-30 11:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][850/1251] eta 0:01:31 lr 0.000172 wd 0.0500 time 0.2206 (0.2273) data time 0.0007 (0.0017) model time 0.2199 (0.2257) loss 2.8605 (2.9062) grad_norm 3.3943 (inf) loss_scale 256.0000 (290.8954) mem 7381MB [2024-08-30 11:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][860/1251] eta 0:01:28 lr 0.000172 wd 0.0500 time 0.2319 (0.2273) data time 0.0006 (0.0017) model time 0.2313 (0.2257) loss 2.6110 (2.9067) grad_norm 3.5292 (inf) loss_scale 256.0000 (290.4901) mem 7381MB [2024-08-30 11:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][870/1251] eta 0:01:26 lr 0.000172 wd 0.0500 time 0.2253 (0.2273) data time 0.0008 (0.0016) model time 0.2245 (0.2257) loss 3.2451 
(2.9081) grad_norm 3.0271 (inf) loss_scale 256.0000 (290.0941) mem 7381MB [2024-08-30 11:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][880/1251] eta 0:01:24 lr 0.000172 wd 0.0500 time 0.2318 (0.2273) data time 0.0006 (0.0016) model time 0.2312 (0.2257) loss 2.1793 (2.9050) grad_norm 4.9203 (inf) loss_scale 256.0000 (289.7072) mem 7381MB [2024-08-30 11:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][890/1251] eta 0:01:22 lr 0.000172 wd 0.0500 time 0.2218 (0.2273) data time 0.0006 (0.0016) model time 0.2212 (0.2257) loss 2.2588 (2.9071) grad_norm 3.5193 (inf) loss_scale 256.0000 (289.3288) mem 7381MB [2024-08-30 11:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][900/1251] eta 0:01:19 lr 0.000172 wd 0.0500 time 0.2277 (0.2272) data time 0.0007 (0.0016) model time 0.2270 (0.2257) loss 3.2394 (2.9073) grad_norm 3.2825 (inf) loss_scale 256.0000 (288.9589) mem 7381MB [2024-08-30 11:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][910/1251] eta 0:01:17 lr 0.000172 wd 0.0500 time 0.2327 (0.2272) data time 0.0005 (0.0016) model time 0.2322 (0.2257) loss 2.3870 (2.9068) grad_norm 3.4588 (inf) loss_scale 256.0000 (288.5971) mem 7381MB [2024-08-30 11:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][920/1251] eta 0:01:15 lr 0.000172 wd 0.0500 time 0.2260 (0.2272) data time 0.0009 (0.0016) model time 0.2251 (0.2257) loss 2.4113 (2.9062) grad_norm 3.1470 (inf) loss_scale 256.0000 (288.2432) mem 7381MB [2024-08-30 11:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][930/1251] eta 0:01:12 lr 0.000172 wd 0.0500 time 0.2279 (0.2272) data time 0.0008 (0.0016) model time 0.2271 (0.2257) loss 3.0558 (2.9067) grad_norm 3.4828 (inf) loss_scale 256.0000 (287.8969) mem 7381MB [2024-08-30 11:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][940/1251] eta 0:01:10 lr 0.000172 wd 0.0500 time 0.2236 (0.2272) 
data time 0.0007 (0.0016) model time 0.2228 (0.2257) loss 3.2392 (2.9062) grad_norm 3.7632 (inf) loss_scale 256.0000 (287.5579) mem 7381MB [2024-08-30 11:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][950/1251] eta 0:01:08 lr 0.000172 wd 0.0500 time 0.2223 (0.2272) data time 0.0008 (0.0016) model time 0.2215 (0.2257) loss 3.3066 (2.9068) grad_norm 5.3936 (inf) loss_scale 256.0000 (287.2261) mem 7381MB [2024-08-30 11:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][960/1251] eta 0:01:06 lr 0.000172 wd 0.0500 time 0.2210 (0.2272) data time 0.0008 (0.0016) model time 0.2201 (0.2257) loss 2.4867 (2.9047) grad_norm 2.7084 (inf) loss_scale 256.0000 (286.9011) mem 7381MB [2024-08-30 11:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][970/1251] eta 0:01:03 lr 0.000172 wd 0.0500 time 0.2239 (0.2272) data time 0.0006 (0.0016) model time 0.2233 (0.2256) loss 1.6124 (2.9029) grad_norm 5.2542 (inf) loss_scale 256.0000 (286.5829) mem 7381MB [2024-08-30 11:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][980/1251] eta 0:01:01 lr 0.000172 wd 0.0500 time 0.2205 (0.2272) data time 0.0007 (0.0016) model time 0.2198 (0.2256) loss 2.3175 (2.9014) grad_norm 2.6079 (inf) loss_scale 256.0000 (286.2712) mem 7381MB [2024-08-30 11:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][990/1251] eta 0:00:59 lr 0.000172 wd 0.0500 time 0.2261 (0.2271) data time 0.0009 (0.0016) model time 0.2253 (0.2256) loss 2.8012 (2.9020) grad_norm 5.3022 (inf) loss_scale 256.0000 (285.9657) mem 7381MB [2024-08-30 11:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1000/1251] eta 0:00:57 lr 0.000172 wd 0.0500 time 0.2298 (0.2271) data time 0.0007 (0.0016) model time 0.2291 (0.2256) loss 3.2291 (2.9014) grad_norm 3.5065 (inf) loss_scale 256.0000 (285.6663) mem 7381MB [2024-08-30 11:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[225/300][1010/1251] eta 0:00:54 lr 0.000172 wd 0.0500 time 0.2260 (0.2271) data time 0.0008 (0.0015) model time 0.2253 (0.2256) loss 2.8812 (2.9012) grad_norm 6.4949 (inf) loss_scale 256.0000 (285.3729) mem 7381MB [2024-08-30 11:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1020/1251] eta 0:00:52 lr 0.000172 wd 0.0500 time 0.2273 (0.2271) data time 0.0006 (0.0015) model time 0.2267 (0.2256) loss 3.8302 (2.9035) grad_norm 5.4491 (inf) loss_scale 256.0000 (285.0852) mem 7381MB [2024-08-30 11:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1030/1251] eta 0:00:50 lr 0.000172 wd 0.0500 time 0.2314 (0.2270) data time 0.0008 (0.0015) model time 0.2305 (0.2256) loss 3.1884 (2.9032) grad_norm 3.2572 (inf) loss_scale 256.0000 (284.8031) mem 7381MB [2024-08-30 11:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1040/1251] eta 0:00:47 lr 0.000172 wd 0.0500 time 0.2222 (0.2270) data time 0.0006 (0.0015) model time 0.2217 (0.2255) loss 3.5029 (2.9008) grad_norm 3.1810 (inf) loss_scale 256.0000 (284.5264) mem 7381MB [2024-08-30 11:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1050/1251] eta 0:00:45 lr 0.000172 wd 0.0500 time 0.2254 (0.2270) data time 0.0006 (0.0015) model time 0.2248 (0.2255) loss 2.4831 (2.9012) grad_norm 7.0604 (inf) loss_scale 256.0000 (284.2550) mem 7381MB [2024-08-30 11:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1060/1251] eta 0:00:43 lr 0.000172 wd 0.0500 time 0.2232 (0.2270) data time 0.0008 (0.0015) model time 0.2223 (0.2255) loss 3.3473 (2.9006) grad_norm 4.8366 (inf) loss_scale 256.0000 (283.9887) mem 7381MB [2024-08-30 11:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1070/1251] eta 0:00:41 lr 0.000172 wd 0.0500 time 0.2335 (0.2270) data time 0.0006 (0.0015) model time 0.2329 (0.2255) loss 1.6517 (2.8988) grad_norm 3.2346 (inf) loss_scale 256.0000 (283.7274) mem 7381MB 
[2024-08-30 11:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1080/1251] eta 0:00:38 lr 0.000172 wd 0.0500 time 0.2269 (0.2270) data time 0.0006 (0.0015) model time 0.2264 (0.2255) loss 1.8799 (2.8986) grad_norm 4.9515 (inf) loss_scale 256.0000 (283.4709) mem 7381MB [2024-08-30 11:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1090/1251] eta 0:00:36 lr 0.000172 wd 0.0500 time 0.2208 (0.2269) data time 0.0008 (0.0015) model time 0.2200 (0.2255) loss 3.2161 (2.8981) grad_norm 5.0576 (inf) loss_scale 256.0000 (283.2191) mem 7381MB [2024-08-30 11:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1100/1251] eta 0:00:34 lr 0.000172 wd 0.0500 time 0.2211 (0.2269) data time 0.0009 (0.0015) model time 0.2202 (0.2254) loss 3.1502 (2.8999) grad_norm 4.6079 (inf) loss_scale 256.0000 (282.9718) mem 7381MB [2024-08-30 11:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1110/1251] eta 0:00:31 lr 0.000172 wd 0.0500 time 0.2229 (0.2269) data time 0.0006 (0.0015) model time 0.2223 (0.2254) loss 2.5868 (2.9005) grad_norm 2.9528 (inf) loss_scale 256.0000 (282.7291) mem 7381MB [2024-08-30 11:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1120/1251] eta 0:00:29 lr 0.000171 wd 0.0500 time 0.2191 (0.2269) data time 0.0009 (0.0015) model time 0.2182 (0.2254) loss 2.8709 (2.8998) grad_norm 3.2339 (inf) loss_scale 256.0000 (282.4906) mem 7381MB [2024-08-30 11:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1130/1251] eta 0:00:27 lr 0.000171 wd 0.0500 time 0.2236 (0.2268) data time 0.0007 (0.0015) model time 0.2229 (0.2254) loss 2.7146 (2.9014) grad_norm 2.9995 (inf) loss_scale 256.0000 (282.2564) mem 7381MB [2024-08-30 11:33:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1140/1251] eta 0:00:25 lr 0.000171 wd 0.0500 time 0.2230 (0.2268) data time 0.0008 (0.0015) model time 0.2222 (0.2254) loss 2.7039 
(2.9032) grad_norm 2.8866 (inf) loss_scale 256.0000 (282.0263) mem 7381MB [2024-08-30 11:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1150/1251] eta 0:00:22 lr 0.000171 wd 0.0500 time 0.2172 (0.2268) data time 0.0009 (0.0015) model time 0.2163 (0.2253) loss 2.6478 (2.9030) grad_norm 4.7019 (inf) loss_scale 256.0000 (281.8002) mem 7381MB [2024-08-30 11:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1160/1251] eta 0:00:20 lr 0.000171 wd 0.0500 time 0.2387 (0.2269) data time 0.0008 (0.0015) model time 0.2379 (0.2255) loss 3.0087 (2.9035) grad_norm 3.2001 (inf) loss_scale 256.0000 (281.5780) mem 7381MB [2024-08-30 11:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1170/1251] eta 0:00:18 lr 0.000171 wd 0.0500 time 0.2229 (0.2269) data time 0.0005 (0.0015) model time 0.2223 (0.2255) loss 3.4515 (2.9032) grad_norm 4.1644 (inf) loss_scale 256.0000 (281.3595) mem 7381MB [2024-08-30 11:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1180/1251] eta 0:00:16 lr 0.000171 wd 0.0500 time 0.2221 (0.2269) data time 0.0007 (0.0015) model time 0.2214 (0.2255) loss 2.0965 (2.9023) grad_norm 4.0691 (inf) loss_scale 256.0000 (281.1448) mem 7381MB [2024-08-30 11:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1190/1251] eta 0:00:13 lr 0.000171 wd 0.0500 time 0.2244 (0.2269) data time 0.0008 (0.0014) model time 0.2237 (0.2254) loss 3.3760 (2.9046) grad_norm 5.4229 (inf) loss_scale 256.0000 (280.9337) mem 7381MB [2024-08-30 11:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1200/1251] eta 0:00:11 lr 0.000171 wd 0.0500 time 0.2243 (0.2268) data time 0.0006 (0.0014) model time 0.2237 (0.2254) loss 3.6130 (2.9047) grad_norm 3.1338 (inf) loss_scale 256.0000 (280.7261) mem 7381MB [2024-08-30 11:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1210/1251] eta 0:00:09 lr 0.000171 wd 0.0500 time 0.2246 
(0.2268) data time 0.0007 (0.0014) model time 0.2239 (0.2254) loss 3.2774 (2.9058) grad_norm 4.5606 (inf) loss_scale 256.0000 (280.5219) mem 7381MB [2024-08-30 11:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1220/1251] eta 0:00:07 lr 0.000171 wd 0.0500 time 0.2236 (0.2268) data time 0.0008 (0.0014) model time 0.2228 (0.2254) loss 3.3549 (2.9053) grad_norm 36.3441 (inf) loss_scale 256.0000 (280.3210) mem 7381MB [2024-08-30 11:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1230/1251] eta 0:00:04 lr 0.000171 wd 0.0500 time 0.2243 (0.2267) data time 0.0007 (0.0014) model time 0.2236 (0.2253) loss 1.8001 (2.9043) grad_norm 3.9486 (inf) loss_scale 256.0000 (280.1235) mem 7381MB [2024-08-30 11:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1240/1251] eta 0:00:02 lr 0.000171 wd 0.0500 time 0.2118 (0.2266) data time 0.0006 (0.0014) model time 0.2112 (0.2252) loss 3.3486 (2.9046) grad_norm 4.1674 (inf) loss_scale 256.0000 (279.9291) mem 7381MB [2024-08-30 11:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [225/300][1250/1251] eta 0:00:00 lr 0.000171 wd 0.0500 time 0.2121 (0.2265) data time 0.0005 (0.0014) model time 0.2116 (0.2251) loss 2.0531 (2.9038) grad_norm 3.4698 (inf) loss_scale 256.0000 (279.7378) mem 7381MB [2024-08-30 11:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 225 training takes 0:04:43 [2024-08-30 11:34:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 11:34:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 11:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.415 (0.415) Loss 0.4180 (0.4180) Acc@1 92.578 (92.578) Acc@5 98.242 (98.242) Mem 7381MB
[2024-08-30 11:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.104) Loss 0.6572 (0.6419) Acc@1 87.402 (86.515) Acc@5 97.266 (97.541) Mem 7381MB
[2024-08-30 11:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.089) Loss 0.9556 (0.6746) Acc@1 76.074 (85.486) Acc@5 95.508 (97.438) Mem 7381MB
[2024-08-30 11:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.082) Loss 1.2061 (0.7651) Acc@1 71.484 (83.326) Acc@5 91.406 (96.421) Mem 7381MB
[2024-08-30 11:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.078) Loss 1.0615 (0.8124) Acc@1 75.098 (82.181) Acc@5 94.238 (95.960) Mem 7381MB
[2024-08-30 11:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.746 Acc@5 95.902
[2024-08-30 11:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.7%
[2024-08-30 11:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.75%
[2024-08-30 11:34:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-08-30 11:34:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-08-30 11:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.392 (0.392) Loss 0.3782 (0.3782) Acc@1 93.262 (93.262) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-30 11:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.099) Loss 0.5791 (0.6013) Acc@1 89.355 (87.464) Acc@5 97.852 (97.745) Mem 7381MB [2024-08-30 11:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.067 (0.086) Loss 0.8696 (0.6287) Acc@1 78.809 (86.393) Acc@5 95.898 (97.689) Mem 7381MB [2024-08-30 11:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.066 (0.080) Loss 1.0889 (0.7128) Acc@1 74.316 (84.312) Acc@5 92.969 (96.774) Mem 7381MB [2024-08-30 11:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 0.9766 (0.7565) Acc@1 77.148 (83.177) Acc@5 94.531 (96.332) Mem 7381MB [2024-08-30 11:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.780 Acc@5 96.310 [2024-08-30 11:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 11:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][0/1251] eta 0:21:53 lr 0.000171 wd 0.0500 time 1.0502 (1.0502) data time 0.5407 (0.5407) model time 0.0000 (0.0000) loss 2.9135 (2.9135) grad_norm 3.2842 (3.2842) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][10/1251] eta 0:06:13 lr 0.000171 wd 0.0500 time 0.2322 (0.3009) data time 0.0006 (0.0500) model time 0.0000 (0.0000) loss 3.6991 (3.0554) grad_norm 4.6969 (4.0840) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][20/1251] eta 0:05:25 lr 0.000171 wd 0.0500 time 0.2324 (0.2644) data time 0.0009 (0.0266) model time 0.0000 (0.0000) loss 3.0137 (2.9983) grad_norm 2.7272 (4.1579) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][30/1251] eta 0:05:07 lr 0.000171 wd 0.0500 time 0.2237 (0.2517) data time 0.0008 (0.0183) model time 0.0000 (0.0000) loss 2.6861 (2.9973) grad_norm 2.7501 (3.9372) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][40/1251] eta 0:04:56 lr 0.000171 wd 0.0500 time 0.2259 (0.2450) data time 0.0009 (0.0141) model time 0.0000 (0.0000) loss 2.6774 (2.9519) grad_norm 3.6155 (3.7517) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][50/1251] eta 0:04:54 lr 0.000171 wd 0.0500 time 0.2302 (0.2450) data time 0.0013 (0.0115) model time 0.0000 (0.0000) loss 3.0221 (2.9014) grad_norm 3.0630 (3.6999) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][60/1251] eta 0:04:47 lr 0.000171 wd 0.0500 time 0.2254 (0.2418) data time 0.0010 (0.0098) model time 0.2244 (0.2242) loss 3.2359 (2.9104) grad_norm 2.9368 (3.9129) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][70/1251] eta 0:04:42 lr 0.000171 wd 0.0500 time 0.2223 (0.2392) data time 0.0007 (0.0085) model time 0.2216 (0.2234) loss 2.5475 (2.9012) grad_norm 2.8178 (3.8130) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][80/1251] eta 0:04:38 lr 0.000171 wd 0.0500 time 0.2218 (0.2375) data time 0.0006 (0.0076) model time 0.2212 (0.2237) loss 2.3073 (2.8832) grad_norm 8.0773 (3.8478) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][90/1251] eta 0:04:34 lr 0.000171 wd 0.0500 time 0.2230 (0.2360) data time 0.0008 
(0.0068) model time 0.2222 (0.2236) loss 2.9810 (2.8837) grad_norm 2.5149 (3.7897) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][100/1251] eta 0:04:30 lr 0.000171 wd 0.0500 time 0.2241 (0.2349) data time 0.0005 (0.0063) model time 0.2235 (0.2238) loss 3.5180 (2.8891) grad_norm 4.2458 (3.8105) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][110/1251] eta 0:04:27 lr 0.000171 wd 0.0500 time 0.2217 (0.2343) data time 0.0009 (0.0058) model time 0.2208 (0.2244) loss 2.6649 (2.8730) grad_norm 2.8726 (3.7903) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][120/1251] eta 0:04:24 lr 0.000171 wd 0.0500 time 0.2265 (0.2336) data time 0.0008 (0.0054) model time 0.2257 (0.2243) loss 3.2806 (2.8754) grad_norm 3.4128 (3.7498) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][130/1251] eta 0:04:21 lr 0.000171 wd 0.0500 time 0.2257 (0.2329) data time 0.0008 (0.0050) model time 0.2249 (0.2242) loss 3.1945 (2.8705) grad_norm 3.5960 (3.7516) loss_scale 256.0000 (256.0000) mem 7381MB [2024-08-30 11:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 11:34:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 11:34:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 11:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-30 11:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-30 11:43:43 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-30 11:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-30 11:43:50 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-30 11:43:52 msvmambav3_tiny_224] (utils.py 30): INFO resuming model:
[2024-08-30 11:43:53 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema:
[2024-08-30 11:43:53 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 226)
[2024-08-30 11:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training
[2024-08-30 11:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json
[2024-08-30 11:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224
[2024-08-30 11:45:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw....................
[2024-08-30 11:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth
[2024-08-30 11:46:11 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth....................
[2024-08-30 11:46:12 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 11:46:13 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 11:46:14 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 226) [2024-08-30 11:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 11:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][140/1251] eta 0:44:37 lr 0.000171 wd 0.0500 time 0.2269 (2.4096) data time 0.0010 (0.1340) model time 0.2259 (2.2755) loss 3.4300 (3.3694) grad_norm 4.1839 (4.9160) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][150/1251] eta 0:19:14 lr 0.000171 wd 0.0500 time 0.2249 (1.0488) data time 0.0013 (0.0510) model time 0.2236 (0.9977) loss 3.1071 (3.1543) grad_norm 2.5349 (4.5086) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][160/1251] eta 0:13:20 lr 0.000171 wd 0.0500 time 0.2278 (0.7340) data time 0.0008 (0.0319) model time 0.2270 (0.7022) loss 3.1194 (3.1373) grad_norm 3.1100 (4.1876) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][170/1251] eta 0:10:41 lr 0.000170 wd 0.0500 time 0.2255 (0.5937) data time 0.0013 (0.0233) model time 0.2242 (0.5703) loss 3.1536 (3.1504) grad_norm 4.3323 (4.1270) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][180/1251] eta 0:09:10 lr 0.000170 wd 0.0500 time 0.2298 (0.5143) data time 0.0011 (0.0185) model time 0.2287 (0.4958) loss 2.8486 (3.0901) grad_norm 3.6889 (4.2156) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[226/300][190/1251] eta 0:08:11 lr 0.000170 wd 0.0500 time 0.2231 (0.4630) data time 0.0010 (0.0154) model time 0.2221 (0.4476) loss 3.3070 (3.0815) grad_norm 5.4238 (4.1341) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][200/1251] eta 0:07:29 lr 0.000170 wd 0.0500 time 0.2241 (0.4272) data time 0.0010 (0.0132) model time 0.2231 (0.4140) loss 2.5304 (3.0458) grad_norm 2.8713 (4.2689) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][210/1251] eta 0:06:57 lr 0.000170 wd 0.0500 time 0.2278 (0.4010) data time 0.0010 (0.0116) model time 0.2268 (0.3894) loss 3.5423 (3.0199) grad_norm 2.6072 (4.1727) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][220/1251] eta 0:06:32 lr 0.000170 wd 0.0500 time 0.2251 (0.3807) data time 0.0007 (0.0104) model time 0.2244 (0.3703) loss 2.2618 (2.9913) grad_norm 4.8164 (4.1679) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][230/1251] eta 0:06:12 lr 0.000170 wd 0.0500 time 0.2266 (0.3649) data time 0.0007 (0.0095) model time 0.2258 (0.3554) loss 2.7927 (2.9852) grad_norm 5.7494 (4.2185) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][240/1251] eta 0:05:55 lr 0.000170 wd 0.0500 time 0.2424 (0.3521) data time 0.0009 (0.0088) model time 0.2414 (0.3433) loss 3.3303 (3.0005) grad_norm 4.4919 (4.1806) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][250/1251] eta 0:05:41 lr 0.000170 wd 0.0500 time 0.2227 (0.3412) data time 0.0007 (0.0081) model time 0.2220 (0.3331) loss 3.3256 (2.9967) grad_norm 3.5833 (4.1052) loss_scale 256.0000 (256.0000) mem 
7373MB [2024-08-30 11:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][260/1251] eta 0:05:29 lr 0.000170 wd 0.0500 time 0.2293 (0.3322) data time 0.0009 (0.0075) model time 0.2285 (0.3246) loss 2.0834 (2.9868) grad_norm 3.0915 (4.0682) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][270/1251] eta 0:05:18 lr 0.000170 wd 0.0500 time 0.2204 (0.3246) data time 0.0011 (0.0071) model time 0.2193 (0.3175) loss 2.5071 (2.9856) grad_norm 3.4701 (4.1067) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][280/1251] eta 0:05:08 lr 0.000170 wd 0.0500 time 0.2399 (0.3181) data time 0.0007 (0.0067) model time 0.2392 (0.3114) loss 2.7055 (2.9795) grad_norm 4.1094 (4.0361) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][290/1251] eta 0:05:00 lr 0.000170 wd 0.0500 time 0.2197 (0.3124) data time 0.0013 (0.0063) model time 0.2184 (0.3061) loss 2.7277 (2.9754) grad_norm 3.5678 (3.9934) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][300/1251] eta 0:04:52 lr 0.000170 wd 0.0500 time 0.2270 (0.3074) data time 0.0011 (0.0060) model time 0.2259 (0.3014) loss 2.8566 (2.9675) grad_norm 3.3390 (3.9538) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][310/1251] eta 0:04:44 lr 0.000170 wd 0.0500 time 0.2283 (0.3028) data time 0.0007 (0.0057) model time 0.2276 (0.2971) loss 2.4953 (2.9576) grad_norm 4.0012 (3.9584) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][320/1251] eta 0:04:38 lr 0.000170 wd 0.0500 time 0.2261 (0.2988) data time 0.0010 (0.0055) model time 0.2251 
(0.2933) loss 2.3643 (2.9487) grad_norm 2.6489 (3.9390) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][330/1251] eta 0:04:31 lr 0.000170 wd 0.0500 time 0.2223 (0.2952) data time 0.0007 (0.0053) model time 0.2217 (0.2899) loss 2.4303 (2.9387) grad_norm 3.5937 (3.9178) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][340/1251] eta 0:04:25 lr 0.000170 wd 0.0500 time 0.2258 (0.2919) data time 0.0007 (0.0051) model time 0.2251 (0.2869) loss 2.3120 (2.9246) grad_norm 3.4225 (3.9781) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][350/1251] eta 0:04:20 lr 0.000170 wd 0.0500 time 0.2237 (0.2891) data time 0.0014 (0.0049) model time 0.2223 (0.2842) loss 1.8729 (2.9185) grad_norm 2.9030 (3.9497) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][360/1251] eta 0:04:15 lr 0.000170 wd 0.0500 time 0.2357 (0.2865) data time 0.0008 (0.0047) model time 0.2349 (0.2817) loss 2.5465 (2.9249) grad_norm 4.1472 (3.9497) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][370/1251] eta 0:04:10 lr 0.000170 wd 0.0500 time 0.2300 (0.2840) data time 0.0007 (0.0046) model time 0.2292 (0.2795) loss 3.2764 (2.9178) grad_norm 3.7845 (3.9927) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][380/1251] eta 0:04:05 lr 0.000170 wd 0.0500 time 0.2279 (0.2817) data time 0.0010 (0.0044) model time 0.2269 (0.2773) loss 2.0202 (2.9154) grad_norm 4.5324 (3.9962) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][390/1251] eta 0:04:00 
lr 0.000170 wd 0.0500 time 0.2201 (0.2796) data time 0.0013 (0.0043) model time 0.2188 (0.2753) loss 2.0036 (2.9076) grad_norm 3.2290 (4.0202) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][400/1251] eta 0:03:56 lr 0.000170 wd 0.0500 time 0.2252 (0.2776) data time 0.0007 (0.0042) model time 0.2245 (0.2735) loss 2.5033 (2.8996) grad_norm 3.7783 (4.0127) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][410/1251] eta 0:03:52 lr 0.000170 wd 0.0500 time 0.2235 (0.2759) data time 0.0012 (0.0041) model time 0.2224 (0.2718) loss 3.3069 (2.9042) grad_norm 4.1722 (4.0095) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][420/1251] eta 0:03:47 lr 0.000170 wd 0.0500 time 0.2296 (0.2742) data time 0.0009 (0.0040) model time 0.2286 (0.2703) loss 2.1184 (2.8968) grad_norm 4.3744 (4.0052) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][430/1251] eta 0:03:44 lr 0.000170 wd 0.0500 time 0.2416 (0.2735) data time 0.0011 (0.0039) model time 0.2405 (0.2696) loss 2.2606 (2.8927) grad_norm 3.6763 (3.9756) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][440/1251] eta 0:03:40 lr 0.000170 wd 0.0500 time 0.2452 (0.2720) data time 0.0008 (0.0038) model time 0.2444 (0.2682) loss 1.8976 (2.8852) grad_norm 9.4318 (3.9848) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][450/1251] eta 0:03:37 lr 0.000170 wd 0.0500 time 0.2232 (0.2714) data time 0.0009 (0.0037) model time 0.2223 (0.2677) loss 3.4133 (2.8955) grad_norm 2.8141 (3.9806) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:46 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][460/1251] eta 0:03:33 lr 0.000170 wd 0.0500 time 0.2328 (0.2701) data time 0.0010 (0.0036) model time 0.2317 (0.2665) loss 3.4050 (2.9029) grad_norm 4.1597 (3.9790) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][470/1251] eta 0:03:30 lr 0.000170 wd 0.0500 time 0.2359 (0.2690) data time 0.0010 (0.0036) model time 0.2349 (0.2654) loss 2.8987 (2.8997) grad_norm 2.8056 (3.9500) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][480/1251] eta 0:03:26 lr 0.000169 wd 0.0500 time 0.2247 (0.2678) data time 0.0010 (0.0035) model time 0.2237 (0.2643) loss 3.5030 (2.9013) grad_norm 2.9995 (3.9339) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][490/1251] eta 0:03:22 lr 0.000169 wd 0.0500 time 0.2224 (0.2667) data time 0.0007 (0.0034) model time 0.2217 (0.2632) loss 2.3834 (2.9024) grad_norm 3.7870 (4.0316) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][500/1251] eta 0:03:19 lr 0.000169 wd 0.0500 time 0.2348 (0.2656) data time 0.0012 (0.0034) model time 0.2337 (0.2622) loss 3.0551 (2.9017) grad_norm 4.9217 (4.0248) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][510/1251] eta 0:03:16 lr 0.000169 wd 0.0500 time 0.2233 (0.2646) data time 0.0010 (0.0033) model time 0.2224 (0.2613) loss 2.1876 (2.8992) grad_norm 3.7628 (4.0651) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][520/1251] eta 0:03:12 lr 0.000169 wd 0.0500 time 0.2295 (0.2637) data time 0.0012 (0.0033) model time 0.2283 (0.2604) loss 3.4325 (2.8962) 
grad_norm 3.2395 (4.0576) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][530/1251] eta 0:03:09 lr 0.000169 wd 0.0500 time 0.2253 (0.2628) data time 0.0007 (0.0032) model time 0.2246 (0.2596) loss 3.5180 (2.9027) grad_norm 4.7934 (4.1371) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][540/1251] eta 0:03:06 lr 0.000169 wd 0.0500 time 0.2213 (0.2619) data time 0.0011 (0.0032) model time 0.2202 (0.2587) loss 3.0807 (2.9053) grad_norm 3.4878 (4.1312) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][550/1251] eta 0:03:02 lr 0.000169 wd 0.0500 time 0.2270 (0.2610) data time 0.0007 (0.0031) model time 0.2263 (0.2579) loss 2.7957 (2.9085) grad_norm 3.5210 (4.2059) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][560/1251] eta 0:02:59 lr 0.000169 wd 0.0500 time 0.2253 (0.2602) data time 0.0010 (0.0031) model time 0.2243 (0.2571) loss 3.1208 (2.9092) grad_norm 4.6366 (4.1997) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][570/1251] eta 0:02:56 lr 0.000169 wd 0.0500 time 0.2242 (0.2594) data time 0.0009 (0.0030) model time 0.2233 (0.2564) loss 2.4113 (2.9145) grad_norm 4.6207 (4.1816) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][580/1251] eta 0:02:53 lr 0.000169 wd 0.0500 time 0.2247 (0.2587) data time 0.0006 (0.0030) model time 0.2241 (0.2557) loss 3.0183 (2.9170) grad_norm 3.6806 (4.1714) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][590/1251] eta 0:02:50 lr 0.000169 wd 0.0500 time 
0.2307 (0.2581) data time 0.0007 (0.0029) model time 0.2300 (0.2552) loss 2.3908 (2.9144) grad_norm 4.8617 (4.1786) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][600/1251] eta 0:02:47 lr 0.000169 wd 0.0500 time 0.2242 (0.2574) data time 0.0008 (0.0029) model time 0.2234 (0.2546) loss 2.2879 (2.9087) grad_norm 3.9934 (4.1634) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][610/1251] eta 0:02:44 lr 0.000169 wd 0.0500 time 0.2320 (0.2568) data time 0.0009 (0.0029) model time 0.2311 (0.2540) loss 2.6753 (2.9032) grad_norm 2.5817 (4.1457) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][620/1251] eta 0:02:41 lr 0.000169 wd 0.0500 time 0.2282 (0.2562) data time 0.0011 (0.0028) model time 0.2271 (0.2534) loss 2.8955 (2.9071) grad_norm 3.5946 (4.1323) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][630/1251] eta 0:02:38 lr 0.000169 wd 0.0500 time 0.2329 (0.2556) data time 0.0007 (0.0028) model time 0.2322 (0.2529) loss 2.8893 (2.9066) grad_norm 6.1092 (4.1462) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][640/1251] eta 0:02:35 lr 0.000169 wd 0.0500 time 0.2257 (0.2551) data time 0.0011 (0.0028) model time 0.2246 (0.2523) loss 2.7722 (2.9073) grad_norm 3.6993 (4.1385) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][650/1251] eta 0:02:32 lr 0.000169 wd 0.0500 time 0.2262 (0.2545) data time 0.0010 (0.0027) model time 0.2252 (0.2518) loss 2.5988 (2.9132) grad_norm 3.0099 (4.1364) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:32 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [226/300][660/1251] eta 0:02:30 lr 0.000169 wd 0.0500 time 0.2312 (0.2540) data time 0.0007 (0.0027) model time 0.2306 (0.2513) loss 1.9950 (2.9087) grad_norm 3.6914 (4.1277) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][670/1251] eta 0:02:27 lr 0.000169 wd 0.0500 time 0.2374 (0.2535) data time 0.0010 (0.0027) model time 0.2364 (0.2509) loss 2.8302 (2.9071) grad_norm 3.6564 (4.1285) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][680/1251] eta 0:02:24 lr 0.000169 wd 0.0500 time 0.2313 (0.2531) data time 0.0007 (0.0026) model time 0.2306 (0.2505) loss 2.6040 (2.9064) grad_norm 2.5957 (4.1274) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][690/1251] eta 0:02:21 lr 0.000169 wd 0.0500 time 0.2297 (0.2526) data time 0.0010 (0.0026) model time 0.2287 (0.2500) loss 3.4040 (2.9129) grad_norm 3.5058 (4.1568) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][700/1251] eta 0:02:18 lr 0.000169 wd 0.0500 time 0.2262 (0.2522) data time 0.0010 (0.0026) model time 0.2252 (0.2496) loss 1.8350 (2.9122) grad_norm 6.1473 (4.1512) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][710/1251] eta 0:02:16 lr 0.000169 wd 0.0500 time 0.2307 (0.2518) data time 0.0011 (0.0025) model time 0.2296 (0.2492) loss 2.4635 (2.9128) grad_norm 4.0882 (4.1444) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][720/1251] eta 0:02:13 lr 0.000169 wd 0.0500 time 0.2336 (0.2514) data time 0.0011 (0.0025) model time 0.2325 (0.2489) loss 2.7579 (2.9135) grad_norm 3.1736 
(4.1379) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][730/1251] eta 0:02:10 lr 0.000169 wd 0.0500 time 0.2322 (0.2510) data time 0.0006 (0.0025) model time 0.2315 (0.2485) loss 2.2107 (2.9136) grad_norm 2.6588 (4.1273) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][740/1251] eta 0:02:08 lr 0.000169 wd 0.0500 time 0.2298 (0.2506) data time 0.0014 (0.0025) model time 0.2285 (0.2481) loss 2.6431 (2.9115) grad_norm 6.3293 (4.1155) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][750/1251] eta 0:02:05 lr 0.000169 wd 0.0500 time 0.2337 (0.2502) data time 0.0009 (0.0025) model time 0.2328 (0.2477) loss 2.9684 (2.9127) grad_norm 9.3358 (4.1380) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][760/1251] eta 0:02:02 lr 0.000169 wd 0.0500 time 0.2257 (0.2498) data time 0.0010 (0.0024) model time 0.2247 (0.2474) loss 2.4885 (2.9150) grad_norm 4.7178 (4.1382) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][770/1251] eta 0:02:00 lr 0.000169 wd 0.0500 time 0.2251 (0.2495) data time 0.0007 (0.0024) model time 0.2244 (0.2471) loss 3.2563 (2.9161) grad_norm 2.9491 (4.1306) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][780/1251] eta 0:01:57 lr 0.000168 wd 0.0500 time 0.2319 (0.2492) data time 0.0010 (0.0024) model time 0.2309 (0.2468) loss 3.3359 (2.9129) grad_norm 3.8665 (4.1204) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][790/1251] eta 0:01:54 lr 0.000168 wd 0.0500 time 0.2236 (0.2489) data 
time 0.0011 (0.0024) model time 0.2225 (0.2465) loss 2.3816 (2.9132) grad_norm 4.8340 (4.1135) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][800/1251] eta 0:01:52 lr 0.000168 wd 0.0500 time 0.2365 (0.2486) data time 0.0009 (0.0024) model time 0.2357 (0.2463) loss 3.3797 (2.9105) grad_norm 3.4685 (4.1004) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][810/1251] eta 0:01:49 lr 0.000168 wd 0.0500 time 0.2259 (0.2483) data time 0.0008 (0.0023) model time 0.2251 (0.2460) loss 2.6450 (2.9142) grad_norm 3.4332 (4.0942) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][820/1251] eta 0:01:46 lr 0.000168 wd 0.0500 time 0.2280 (0.2480) data time 0.0009 (0.0023) model time 0.2271 (0.2457) loss 2.5581 (2.9143) grad_norm 3.8217 (4.0952) loss_scale 256.0000 (256.0000) mem 7373MB [2024-08-30 11:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 11:49:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 11:49:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 12:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 12:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 12:05:07 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 12:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 12:05:22 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 12:05:24 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 12:05:25 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 12:05:25 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 226) [2024-08-30 12:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 12:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][830/1251] eta 0:10:38 lr 0.000168 wd 0.0500 time 0.2344 (1.5178) data time 0.0009 (0.1028) model time 0.2335 (1.4150) loss 3.3906 (3.3171) grad_norm 2.7661 (3.5133) loss_scale 256.0000 (256.0000) mem 7374MB [2024-08-30 12:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][840/1251] eta 0:06:01 lr 0.000168 wd 0.0500 time 0.2437 (0.8800) data time 0.0008 (0.0519) model time 0.2429 (0.8280) loss 3.3042 (3.1414) grad_norm 4.9242 (3.9511) loss_scale 256.0000 (256.0000) mem 7374MB [2024-08-30 12:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][850/1251] eta 0:04:27 lr 0.000168 wd 0.0500 time 0.2433 (0.6676) data time 0.0009 (0.0350) model time 0.2424 (0.6326) loss 3.0477 (3.1725) grad_norm 4.6771 (3.8132) loss_scale 256.0000 (256.0000) mem 7374MB [2024-08-30 12:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][860/1251] eta 0:03:39 lr 0.000168 wd 0.0500 time 0.2425 (0.5613) data time 0.0009 (0.0265) model time 0.2416 (0.5348) loss 2.2354 (3.0756) grad_norm 2.7391 (3.7206) loss_scale 256.0000 (256.0000) mem 7374MB [2024-08-30 12:05:54 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][870/1251] eta 0:03:09 lr 0.000168 wd 0.0500 time 0.2456 (0.4975) data time 0.0009 (0.0214) model time 0.2446 (0.4761) loss 2.4981 (3.0456) grad_norm 3.0261 (3.6099) loss_scale 512.0000 (286.7200) mem 7374MB [2024-08-30 12:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][880/1251] eta 0:02:48 lr 0.000168 wd 0.0500 time 0.2453 (0.4550) data time 0.0007 (0.0180) model time 0.2446 (0.4370) loss 2.9759 (3.0135) grad_norm 3.8016 (3.5993) loss_scale 512.0000 (324.2667) mem 7374MB [2024-08-30 12:05:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][890/1251] eta 0:02:33 lr 0.000168 wd 0.0500 time 0.2459 (0.4245) data time 0.0007 (0.0156) model time 0.2453 (0.4090) loss 2.0771 (2.9745) grad_norm 3.3544 (3.5556) loss_scale 512.0000 (351.0857) mem 7374MB [2024-08-30 12:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][900/1251] eta 0:02:21 lr 0.000168 wd 0.0500 time 0.2347 (0.4023) data time 0.0012 (0.0137) model time 0.2335 (0.3885) loss 3.1681 (2.9716) grad_norm 3.4604 (3.5482) loss_scale 512.0000 (371.2000) mem 7374MB [2024-08-30 12:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][910/1251] eta 0:02:11 lr 0.000168 wd 0.0500 time 0.2476 (0.3847) data time 0.0007 (0.0123) model time 0.2468 (0.3724) loss 3.2890 (2.9484) grad_norm 3.3254 (3.5702) loss_scale 512.0000 (386.8444) mem 7374MB [2024-08-30 12:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][920/1251] eta 0:02:02 lr 0.000168 wd 0.0500 time 0.2451 (0.3706) data time 0.0010 (0.0112) model time 0.2440 (0.3594) loss 3.2544 (2.9612) grad_norm 12.7581 (3.6513) loss_scale 512.0000 (399.3600) mem 7374MB [2024-08-30 12:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][930/1251] eta 0:01:55 lr 0.000168 wd 0.0500 time 0.2600 (0.3591) data time 0.0010 (0.0103) model time 0.2590 (0.3488) loss 2.8978 (2.9736) 
grad_norm 2.7303 (3.7110) loss_scale 512.0000 (409.6000) mem 7374MB [2024-08-30 12:06:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][940/1251] eta 0:01:48 lr 0.000168 wd 0.0500 time 0.2479 (0.3495) data time 0.0007 (0.0095) model time 0.2472 (0.3400) loss 3.4239 (2.9744) grad_norm 2.6000 (3.7205) loss_scale 512.0000 (418.1333) mem 7374MB [2024-08-30 12:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][950/1251] eta 0:01:42 lr 0.000168 wd 0.0500 time 0.2359 (0.3413) data time 0.0009 (0.0089) model time 0.2350 (0.3324) loss 2.8368 (2.9483) grad_norm 6.3399 (3.7186) loss_scale 512.0000 (425.3538) mem 7374MB [2024-08-30 12:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 12:06:15 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 12:06:17 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 12:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 12:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 12:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 12:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 12:26:13 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 12:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 12:26:25 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 12:26:26 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 12:26:28 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 12:26:28 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 226) [2024-08-30 12:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 12:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][960/1251] eta 0:10:19 lr 0.000168 wd 0.0500 time 0.2406 (2.1282) data time 0.0012 (0.1476) model time 0.2394 (1.9806) loss 2.7926 (3.3895) grad_norm 4.0960 (3.4240) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 12:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][970/1251] eta 0:04:46 lr 0.000168 wd 0.0500 time 0.2373 (1.0206) data time 0.0012 (0.0616) model time 0.2362 (0.9590) loss 2.9129 (3.1690) grad_norm 3.1277 (3.1803) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 12:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][980/1251] eta 0:03:18 lr 0.000168 wd 0.0500 time 0.2464 (0.7331) data time 0.0009 (0.0393) model time 0.2455 (0.6938) loss 3.3941 (3.1604) grad_norm 3.2856 (3.7619) loss_scale 512.0000 (512.0000) mem 7379MB [2024-08-30 12:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 12:26:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 12:26:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 12:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 12:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 12:52:46 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 12:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 12:53:03 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 12:53:05 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 12:53:06 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 12:53:06 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 226) [2024-08-30 12:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 12:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][990/1251] eta 0:16:43 lr 0.000168 wd 0.0500 time 0.2255 (3.8450) data time 0.0008 (0.2643) model time 0.2247 (3.5807) loss 2.4429 (3.1175) grad_norm 2.2371 (3.2401) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1000/1251] eta 0:04:26 lr 0.000168 wd 0.0500 time 0.2316 (1.0618) data time 0.0010 (0.0618) model time 0.2305 (1.0000) loss 3.3052 (3.1325) grad_norm 3.7607 (3.5519) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [226/300][1010/1251] eta 0:02:48 lr 0.000168 wd 0.0500 time 0.2286 (0.6986) data time 0.0008 (0.0353) model time 0.2279 (0.6633) loss 3.4163 (3.1112) grad_norm 3.4507 (3.4405) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1020/1251] eta 0:02:08 lr 0.000168 wd 0.0500 time 0.2309 (0.5557) data time 0.0007 (0.0249) model time 0.2302 (0.5308) loss 3.5441 (3.0903) grad_norm 2.7293 (3.5250) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1030/1251] eta 0:01:45 lr 0.000168 wd 0.0500 time 0.2262 (0.4792) data time 0.0011 (0.0194) model time 0.2252 (0.4598) loss 2.6244 (3.0285) grad_norm 3.6247 (3.6278) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1040/1251] eta 0:01:31 lr 0.000168 wd 0.0500 time 0.2294 (0.4320) data time 0.0011 (0.0159) model time 0.2283 (0.4161) loss 3.1864 (3.0558) grad_norm 3.1710 (3.8347) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1050/1251] eta 0:01:20 lr 0.000168 wd 0.0500 time 0.2222 (0.3994) data time 0.0010 (0.0135) model time 0.2212 (0.3859) loss 2.5177 (3.0128) grad_norm 5.0200 (3.9245) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1060/1251] eta 0:01:11 lr 0.000168 wd 0.0500 time 0.2272 (0.3759) data time 0.0010 (0.0118) model time 0.2262 (0.3641) loss 3.2237 (2.9797) grad_norm 3.4044 (4.0384) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1070/1251] eta 0:01:04 lr 0.000168 wd 0.0500 time 0.2258 (0.3577) data time 0.0007 (0.0105) model time 0.2251 (0.3472) loss 2.3054 (2.9473) grad_norm 3.0792 (3.9719) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-08-30 12:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1080/1251] eta 0:00:58 lr 0.000168 wd 0.0500 time 0.2247 (0.3433) data time 0.0007 (0.0095) model time 0.2241 (0.3339) loss 3.1388 (2.9409) grad_norm 4.2444 (4.2425) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1090/1251] eta 0:00:53 lr 0.000167 wd 0.0500 time 0.2197 (0.3317) data time 0.0009 (0.0087) model time 0.2188 (0.3231) loss 3.8242 (2.9697) grad_norm 3.1820 (4.1623) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1100/1251] eta 0:00:48 lr 0.000167 wd 0.0500 time 0.2324 (0.3225) data time 0.0008 (0.0080) model time 0.2316 (0.3145) loss 2.9995 (2.9592) grad_norm 2.9121 (4.0783) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1110/1251] eta 0:00:44 lr 0.000167 wd 0.0500 time 0.2311 (0.3146) data time 0.0009 (0.0074) model time 0.2302 (0.3072) loss 2.9721 (2.9619) grad_norm 3.7786 (4.0568) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1120/1251] eta 0:00:40 lr 0.000167 wd 0.0500 time 0.2277 (0.3080) data time 0.0009 (0.0070) model time 0.2268 (0.3011) loss 3.6370 (2.9628) grad_norm 2.8211 (4.0241) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1130/1251] eta 0:00:36 lr 0.000167 wd 0.0500 time 0.2285 (0.3024) data time 0.0012 (0.0066) model time 0.2273 (0.2959) loss 3.4052 (2.9582) grad_norm 4.3124 (4.0037) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1140/1251] eta 0:00:33 lr 0.000167 wd 0.0500 time 0.2271 (0.2976) data time 0.0009 (0.0063) 
model time 0.2263 (0.2913) loss 2.9140 (2.9536) grad_norm 5.6466 (4.0605) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1150/1251] eta 0:00:29 lr 0.000167 wd 0.0500 time 0.2253 (0.2933) data time 0.0006 (0.0059) model time 0.2247 (0.2873) loss 2.5034 (2.9542) grad_norm 4.3595 (4.0641) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1160/1251] eta 0:00:26 lr 0.000167 wd 0.0500 time 0.2250 (0.2893) data time 0.0009 (0.0057) model time 0.2241 (0.2837) loss 3.0258 (2.9502) grad_norm 5.2344 (4.0811) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1170/1251] eta 0:00:23 lr 0.000167 wd 0.0500 time 0.2269 (0.2858) data time 0.0010 (0.0054) model time 0.2258 (0.2805) loss 3.2535 (2.9384) grad_norm 14.0243 (4.1016) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1180/1251] eta 0:00:20 lr 0.000167 wd 0.0500 time 0.2302 (0.2828) data time 0.0007 (0.0052) model time 0.2295 (0.2776) loss 3.3148 (2.9380) grad_norm 2.5339 (4.0604) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1190/1251] eta 0:00:17 lr 0.000167 wd 0.0500 time 0.2234 (0.2799) data time 0.0006 (0.0050) model time 0.2227 (0.2749) loss 1.9867 (2.9221) grad_norm 3.6919 (4.0502) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1200/1251] eta 0:00:14 lr 0.000167 wd 0.0500 time 0.2280 (0.2774) data time 0.0009 (0.0048) model time 0.2271 (0.2726) loss 2.1840 (2.9194) grad_norm 2.4400 (4.0562) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[226/300][1210/1251] eta 0:00:11 lr 0.000167 wd 0.0500 time 0.2241 (0.2752) data time 0.0009 (0.0046) model time 0.2232 (0.2706) loss 2.3271 (2.9188) grad_norm 3.0476 (4.0523) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1220/1251] eta 0:00:08 lr 0.000167 wd 0.0500 time 0.2347 (0.2731) data time 0.0008 (0.0045) model time 0.2339 (0.2687) loss 2.4617 (2.9165) grad_norm 3.9883 (4.1122) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1230/1251] eta 0:00:05 lr 0.000167 wd 0.0500 time 0.2222 (0.2712) data time 0.0011 (0.0043) model time 0.2211 (0.2669) loss 3.3727 (2.9210) grad_norm 3.0244 (4.1038) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1240/1251] eta 0:00:02 lr 0.000167 wd 0.0500 time 0.2309 (0.2694) data time 0.0006 (0.0042) model time 0.2303 (0.2652) loss 3.1679 (2.9120) grad_norm 4.5069 (4.1211) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [226/300][1250/1251] eta 0:00:00 lr 0.000167 wd 0.0500 time 0.2115 (0.2673) data time 0.0004 (0.0041) model time 0.2111 (0.2633) loss 2.8100 (2.8990) grad_norm 3.9088 (4.1157) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 12:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 226 training takes 0:01:10 [2024-08-30 12:54:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 12:54:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-30 12:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.370 (0.370) Loss 0.4004 (0.4004) Acc@1 92.188 (92.188) Acc@5 98.633 (98.633) Mem 7377MB [2024-08-30 12:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.096) Loss 0.6416 (0.6370) Acc@1 88.086 (86.417) Acc@5 97.656 (97.443) Mem 7377MB [2024-08-30 12:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.085) Loss 0.9438 (0.6639) Acc@1 76.758 (85.556) Acc@5 95.020 (97.359) Mem 7377MB [2024-08-30 12:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.080) Loss 1.1592 (0.7564) Acc@1 72.949 (83.420) Acc@5 92.090 (96.412) Mem 7377MB [2024-08-30 12:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.0420 (0.8040) Acc@1 75.195 (82.262) Acc@5 93.652 (95.906) Mem 7377MB [2024-08-30 12:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.762 Acc@5 95.860 [2024-08-30 12:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.8% [2024-08-30 12:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.76% [2024-08-30 12:54:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-08-30 12:54:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-08-30 12:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.382 (0.382) Loss 0.3794 (0.3794) Acc@1 93.164 (93.164) Acc@5 98.340 (98.340) Mem 7377MB [2024-08-30 12:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.096) Loss 0.5796 (0.6010) Acc@1 89.160 (87.402) Acc@5 97.852 (97.772) Mem 7377MB [2024-08-30 12:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.084) Loss 0.8701 (0.6286) Acc@1 78.418 (86.347) Acc@5 95.996 (97.717) Mem 7377MB [2024-08-30 12:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.080) Loss 1.0879 (0.7127) Acc@1 74.316 (84.262) Acc@5 93.066 (96.784) Mem 7377MB [2024-08-30 12:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.075) Loss 0.9751 (0.7562) Acc@1 77.246 (83.148) Acc@5 94.727 (96.337) Mem 7377MB [2024-08-30 12:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.740 Acc@5 96.318 [2024-08-30 12:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-08-30 12:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.74% [2024-08-30 12:54:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 12:54:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-08-30 12:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][0/1251] eta 0:16:52 lr 0.000167 wd 0.0500 time 0.8092 (0.8092) data time 0.4127 (0.4127) model time 0.0000 (0.0000) loss 2.9368 (2.9368) grad_norm 4.3804 (4.3804) loss_scale 512.0000 (512.0000) mem 7380MB [2024-08-30 12:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][10/1251] eta 0:05:46 lr 0.000167 wd 0.0500 time 0.2259 (0.2790) data time 0.0010 (0.0384) model time 0.0000 (0.0000) loss 2.8577 (2.7804) grad_norm 3.2457 (4.0487) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][20/1251] eta 0:05:12 lr 0.000167 wd 0.0500 time 0.2322 (0.2537) data time 0.0007 (0.0206) model time 0.0000 (0.0000) loss 2.1495 (2.7671) grad_norm 3.3478 (3.9668) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][30/1251] eta 0:05:09 lr 0.000167 wd 0.0500 time 0.2290 (0.2533) data time 0.0008 (0.0146) model time 0.0000 (0.0000) loss 2.6273 (2.7623) grad_norm 3.7219 (3.9077) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][40/1251] eta 0:04:58 lr 0.000167 wd 0.0500 time 0.2315 (0.2469) data time 0.0009 (0.0113) model time 0.0000 (0.0000) loss 2.8820 (2.7494) grad_norm 2.8547 (3.8180) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][50/1251] eta 0:04:51 lr 0.000167 wd 0.0500 time 0.2296 (0.2426) data time 0.0009 (0.0093) model time 0.0000 (0.0000) loss 3.1502 (2.7789) grad_norm 3.6110 (3.9029) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][60/1251] eta 0:04:45 lr 0.000167 wd 0.0500 time 0.2225 (0.2401) data time 0.0008 (0.0080) model time 0.2217 (0.2265) loss 
3.5518 (2.8344) grad_norm 4.1749 (3.7961) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][70/1251] eta 0:04:41 lr 0.000167 wd 0.0500 time 0.2304 (0.2383) data time 0.0008 (0.0070) model time 0.2296 (0.2262) loss 2.8178 (2.8424) grad_norm 2.9330 (3.9696) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][80/1251] eta 0:04:37 lr 0.000167 wd 0.0500 time 0.2188 (0.2369) data time 0.0011 (0.0063) model time 0.2176 (0.2261) loss 3.3671 (2.8634) grad_norm 3.3137 (3.9425) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][90/1251] eta 0:04:33 lr 0.000167 wd 0.0500 time 0.2273 (0.2356) data time 0.0010 (0.0057) model time 0.2263 (0.2257) loss 3.3092 (2.8754) grad_norm 2.8220 (3.8830) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][100/1251] eta 0:04:29 lr 0.000167 wd 0.0500 time 0.2324 (0.2346) data time 0.0007 (0.0052) model time 0.2317 (0.2254) loss 2.0515 (2.8664) grad_norm 3.4151 (3.8232) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][110/1251] eta 0:04:26 lr 0.000167 wd 0.0500 time 0.2308 (0.2339) data time 0.0009 (0.0048) model time 0.2298 (0.2254) loss 3.3128 (2.8667) grad_norm 4.1285 (3.9003) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][120/1251] eta 0:04:23 lr 0.000167 wd 0.0500 time 0.2288 (0.2333) data time 0.0008 (0.0045) model time 0.2280 (0.2255) loss 2.2053 (2.8487) grad_norm 4.0539 (3.9215) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][130/1251] eta 0:04:20 lr 0.000167 wd 
0.0500 time 0.2267 (0.2326) data time 0.0010 (0.0042) model time 0.2257 (0.2252) loss 3.1632 (2.8529) grad_norm 5.3733 (3.9557) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][140/1251] eta 0:04:18 lr 0.000167 wd 0.0500 time 0.2239 (0.2323) data time 0.0008 (0.0040) model time 0.2231 (0.2254) loss 3.0878 (2.8698) grad_norm 2.6237 (4.0504) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][150/1251] eta 0:04:15 lr 0.000166 wd 0.0500 time 0.2332 (0.2320) data time 0.0010 (0.0038) model time 0.2322 (0.2256) loss 2.6481 (2.8882) grad_norm 3.4951 (4.0272) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][160/1251] eta 0:04:12 lr 0.000166 wd 0.0500 time 0.2214 (0.2317) data time 0.0008 (0.0037) model time 0.2206 (0.2257) loss 3.6882 (2.8839) grad_norm 3.3744 (3.9930) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][170/1251] eta 0:04:10 lr 0.000166 wd 0.0500 time 0.2241 (0.2318) data time 0.0010 (0.0035) model time 0.2230 (0.2261) loss 3.3796 (2.9026) grad_norm 3.5780 (3.9459) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][180/1251] eta 0:04:08 lr 0.000166 wd 0.0500 time 0.2428 (0.2317) data time 0.0010 (0.0034) model time 0.2418 (0.2264) loss 3.0774 (2.9066) grad_norm 3.9501 (3.9563) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][190/1251] eta 0:04:05 lr 0.000166 wd 0.0500 time 0.2308 (0.2314) data time 0.0008 (0.0032) model time 0.2300 (0.2263) loss 3.5100 (2.9009) grad_norm 2.8621 (3.9481) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [227/300][200/1251] eta 0:04:02 lr 0.000166 wd 0.0500 time 0.2244 (0.2312) data time 0.0007 (0.0031) model time 0.2237 (0.2263) loss 3.5595 (2.8956) grad_norm 6.7727 (3.9942) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][210/1251] eta 0:04:00 lr 0.000166 wd 0.0500 time 0.2238 (0.2311) data time 0.0009 (0.0030) model time 0.2228 (0.2263) loss 3.1046 (2.8862) grad_norm 5.0116 (4.0856) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][220/1251] eta 0:03:58 lr 0.000166 wd 0.0500 time 0.2220 (0.2310) data time 0.0008 (0.0029) model time 0.2212 (0.2264) loss 2.9342 (2.8889) grad_norm 3.1339 (4.0612) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][230/1251] eta 0:03:55 lr 0.000166 wd 0.0500 time 0.2212 (0.2307) data time 0.0009 (0.0029) model time 0.2203 (0.2263) loss 2.9470 (2.8907) grad_norm 7.0377 (4.0615) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][240/1251] eta 0:03:53 lr 0.000166 wd 0.0500 time 0.2241 (0.2305) data time 0.0008 (0.0028) model time 0.2232 (0.2262) loss 2.6693 (2.8865) grad_norm 4.0235 (4.0832) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][250/1251] eta 0:03:50 lr 0.000166 wd 0.0500 time 0.2290 (0.2304) data time 0.0009 (0.0027) model time 0.2281 (0.2262) loss 3.3037 (2.8979) grad_norm 4.9228 (4.0982) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][260/1251] eta 0:03:48 lr 0.000166 wd 0.0500 time 0.2250 (0.2302) data time 0.0009 (0.0027) model time 0.2241 (0.2261) loss 2.0650 (2.8879) grad_norm 7.8178 
(4.0849) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][270/1251] eta 0:03:45 lr 0.000166 wd 0.0500 time 0.2281 (0.2301) data time 0.0010 (0.0026) model time 0.2271 (0.2261) loss 2.2538 (2.8853) grad_norm 3.4100 (4.0581) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][280/1251] eta 0:03:43 lr 0.000166 wd 0.0500 time 0.2302 (0.2299) data time 0.0008 (0.0026) model time 0.2294 (0.2260) loss 3.0450 (2.8820) grad_norm 4.1103 (4.0412) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][290/1251] eta 0:03:40 lr 0.000166 wd 0.0500 time 0.2271 (0.2298) data time 0.0007 (0.0025) model time 0.2263 (0.2260) loss 3.7557 (2.8877) grad_norm 4.5607 (4.0330) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][300/1251] eta 0:03:38 lr 0.000166 wd 0.0500 time 0.2226 (0.2297) data time 0.0011 (0.0025) model time 0.2215 (0.2260) loss 3.4621 (2.8919) grad_norm 3.2093 (4.0276) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][310/1251] eta 0:03:36 lr 0.000166 wd 0.0500 time 0.2283 (0.2296) data time 0.0009 (0.0024) model time 0.2275 (0.2260) loss 3.1163 (2.8890) grad_norm 3.4418 (3.9974) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][320/1251] eta 0:03:33 lr 0.000166 wd 0.0500 time 0.2299 (0.2294) data time 0.0006 (0.0024) model time 0.2293 (0.2259) loss 2.2831 (2.8880) grad_norm 25.2010 (4.0490) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][330/1251] eta 0:03:31 lr 0.000166 wd 0.0500 time 0.2261 (0.2293) 
data time 0.0008 (0.0023) model time 0.2253 (0.2258) loss 3.0895 (2.8921) grad_norm 3.9789 (4.0579) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][340/1251] eta 0:03:28 lr 0.000166 wd 0.0500 time 0.2252 (0.2292) data time 0.0012 (0.0023) model time 0.2240 (0.2257) loss 2.6820 (2.8844) grad_norm 3.8973 (4.0554) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][350/1251] eta 0:03:26 lr 0.000166 wd 0.0500 time 0.2322 (0.2291) data time 0.0008 (0.0023) model time 0.2314 (0.2258) loss 2.7557 (2.8830) grad_norm 3.1077 (4.0323) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][360/1251] eta 0:03:24 lr 0.000166 wd 0.0500 time 0.2271 (0.2291) data time 0.0008 (0.0023) model time 0.2263 (0.2257) loss 3.3644 (2.8890) grad_norm 3.6832 (4.0226) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][370/1251] eta 0:03:21 lr 0.000166 wd 0.0500 time 0.2240 (0.2290) data time 0.0013 (0.0022) model time 0.2227 (0.2257) loss 2.9286 (2.8931) grad_norm 2.4068 (4.0086) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][380/1251] eta 0:03:19 lr 0.000166 wd 0.0500 time 0.2299 (0.2290) data time 0.0007 (0.0022) model time 0.2291 (0.2257) loss 2.8462 (2.8836) grad_norm 2.8824 (4.0003) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][390/1251] eta 0:03:17 lr 0.000166 wd 0.0500 time 0.2235 (0.2289) data time 0.0010 (0.0022) model time 0.2224 (0.2258) loss 3.4162 (2.8851) grad_norm 3.8505 (3.9786) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [227/300][400/1251] eta 0:03:14 lr 0.000166 wd 0.0500 time 0.2254 (0.2289) data time 0.0011 (0.0022) model time 0.2243 (0.2258) loss 3.3611 (2.8782) grad_norm 4.0297 (3.9710) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][410/1251] eta 0:03:12 lr 0.000166 wd 0.0500 time 0.2230 (0.2288) data time 0.0008 (0.0021) model time 0.2222 (0.2258) loss 3.3090 (2.8861) grad_norm 3.4388 (3.9598) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][420/1251] eta 0:03:10 lr 0.000166 wd 0.0500 time 0.2330 (0.2288) data time 0.0008 (0.0021) model time 0.2322 (0.2258) loss 3.1210 (2.8886) grad_norm 3.4230 (3.9641) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][430/1251] eta 0:03:07 lr 0.000166 wd 0.0500 time 0.2359 (0.2289) data time 0.0010 (0.0021) model time 0.2350 (0.2259) loss 3.2091 (2.8862) grad_norm 5.2925 (3.9677) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][440/1251] eta 0:03:05 lr 0.000166 wd 0.0500 time 0.2297 (0.2289) data time 0.0009 (0.0020) model time 0.2287 (0.2260) loss 3.3126 (2.8833) grad_norm 3.7493 (4.0770) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][450/1251] eta 0:03:03 lr 0.000166 wd 0.0500 time 0.2199 (0.2289) data time 0.0008 (0.0020) model time 0.2191 (0.2260) loss 1.9574 (2.8761) grad_norm 4.6756 (4.0676) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][460/1251] eta 0:03:01 lr 0.000165 wd 0.0500 time 0.2293 (0.2288) data time 0.0008 (0.0020) model time 0.2284 (0.2260) loss 2.6378 (2.8752) grad_norm 4.5931 (4.0575) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-08-30 12:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][470/1251] eta 0:02:58 lr 0.000165 wd 0.0500 time 0.2420 (0.2288) data time 0.0012 (0.0020) model time 0.2408 (0.2261) loss 3.4082 (2.8767) grad_norm 2.5822 (4.0510) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][480/1251] eta 0:02:56 lr 0.000165 wd 0.0500 time 0.2154 (0.2288) data time 0.0009 (0.0020) model time 0.2145 (0.2261) loss 1.5820 (2.8768) grad_norm 4.7372 (4.0445) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][490/1251] eta 0:02:54 lr 0.000165 wd 0.0500 time 0.2219 (0.2288) data time 0.0009 (0.0019) model time 0.2211 (0.2261) loss 2.0732 (2.8742) grad_norm 4.4366 (4.0461) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][500/1251] eta 0:02:52 lr 0.000165 wd 0.0500 time 0.2290 (0.2292) data time 0.0008 (0.0019) model time 0.2282 (0.2266) loss 2.7147 (2.8774) grad_norm 2.9410 (4.0392) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][510/1251] eta 0:02:49 lr 0.000165 wd 0.0500 time 0.2306 (0.2292) data time 0.0009 (0.0019) model time 0.2296 (0.2266) loss 3.1794 (2.8817) grad_norm 2.8103 (4.0335) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][520/1251] eta 0:02:47 lr 0.000165 wd 0.0500 time 0.2275 (0.2292) data time 0.0009 (0.0019) model time 0.2266 (0.2266) loss 2.4202 (2.8813) grad_norm 3.0327 (4.0183) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][530/1251] eta 0:02:45 lr 0.000165 wd 0.0500 time 0.2198 (0.2291) data time 0.0010 (0.0019) model 
time 0.2188 (0.2266) loss 3.1773 (2.8806) grad_norm 3.7285 (4.0225) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][540/1251] eta 0:02:42 lr 0.000165 wd 0.0500 time 0.2288 (0.2291) data time 0.0009 (0.0019) model time 0.2279 (0.2266) loss 2.5729 (2.8757) grad_norm 4.1049 (4.0418) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][550/1251] eta 0:02:40 lr 0.000165 wd 0.0500 time 0.2256 (0.2294) data time 0.0010 (0.0018) model time 0.2246 (0.2269) loss 3.2313 (2.8697) grad_norm 4.0163 (4.0259) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][560/1251] eta 0:02:38 lr 0.000165 wd 0.0500 time 0.2245 (0.2293) data time 0.0010 (0.0019) model time 0.2235 (0.2268) loss 3.0864 (2.8714) grad_norm 2.9701 (4.0135) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][570/1251] eta 0:02:36 lr 0.000165 wd 0.0500 time 0.2255 (0.2292) data time 0.0006 (0.0018) model time 0.2248 (0.2268) loss 2.7407 (2.8681) grad_norm 3.4929 (4.0101) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][580/1251] eta 0:02:33 lr 0.000165 wd 0.0500 time 0.2220 (0.2292) data time 0.0009 (0.0018) model time 0.2211 (0.2268) loss 2.0092 (2.8669) grad_norm 4.1265 (4.0043) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][590/1251] eta 0:02:31 lr 0.000165 wd 0.0500 time 0.2363 (0.2291) data time 0.0009 (0.0018) model time 0.2354 (0.2268) loss 3.0921 (2.8660) grad_norm 2.7237 (3.9956) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][600/1251] 
eta 0:02:29 lr 0.000165 wd 0.0500 time 0.2215 (0.2291) data time 0.0009 (0.0018) model time 0.2205 (0.2267) loss 3.1913 (2.8636) grad_norm 3.4330 (3.9866) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][610/1251] eta 0:02:26 lr 0.000165 wd 0.0500 time 0.2246 (0.2290) data time 0.0008 (0.0018) model time 0.2238 (0.2267) loss 3.1007 (2.8665) grad_norm 3.8726 (4.0107) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][620/1251] eta 0:02:24 lr 0.000165 wd 0.0500 time 0.2241 (0.2290) data time 0.0008 (0.0018) model time 0.2233 (0.2267) loss 3.5783 (2.8649) grad_norm 3.2284 (4.0044) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][630/1251] eta 0:02:22 lr 0.000165 wd 0.0500 time 0.2247 (0.2290) data time 0.0011 (0.0018) model time 0.2236 (0.2267) loss 3.2961 (2.8649) grad_norm 4.7337 (4.0021) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][640/1251] eta 0:02:19 lr 0.000165 wd 0.0500 time 0.2288 (0.2289) data time 0.0008 (0.0017) model time 0.2280 (0.2267) loss 2.3442 (2.8647) grad_norm 3.7133 (3.9964) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][650/1251] eta 0:02:17 lr 0.000165 wd 0.0500 time 0.2230 (0.2289) data time 0.0007 (0.0017) model time 0.2222 (0.2267) loss 2.3766 (2.8625) grad_norm 2.5618 (3.9860) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][660/1251] eta 0:02:15 lr 0.000165 wd 0.0500 time 0.2244 (0.2289) data time 0.0006 (0.0017) model time 0.2238 (0.2266) loss 3.4465 (2.8678) grad_norm 9.1588 (4.0036) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 
12:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][670/1251] eta 0:02:12 lr 0.000165 wd 0.0500 time 0.2273 (0.2289) data time 0.0010 (0.0017) model time 0.2263 (0.2267) loss 2.6683 (2.8723) grad_norm 3.5405 (4.0087) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][680/1251] eta 0:02:10 lr 0.000165 wd 0.0500 time 0.2360 (0.2288) data time 0.0009 (0.0017) model time 0.2352 (0.2266) loss 2.1738 (2.8682) grad_norm 4.3128 (4.0083) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][690/1251] eta 0:02:08 lr 0.000165 wd 0.0500 time 0.2278 (0.2288) data time 0.0006 (0.0017) model time 0.2272 (0.2266) loss 2.4930 (2.8652) grad_norm 3.2187 (4.0141) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][700/1251] eta 0:02:06 lr 0.000165 wd 0.0500 time 0.2312 (0.2288) data time 0.0007 (0.0017) model time 0.2305 (0.2266) loss 2.5957 (2.8667) grad_norm 4.1153 (4.0149) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][710/1251] eta 0:02:03 lr 0.000165 wd 0.0500 time 0.2238 (0.2288) data time 0.0012 (0.0017) model time 0.2226 (0.2266) loss 2.9189 (2.8688) grad_norm 2.7162 (4.0120) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][720/1251] eta 0:02:01 lr 0.000165 wd 0.0500 time 0.2217 (0.2288) data time 0.0011 (0.0017) model time 0.2206 (0.2266) loss 2.9210 (2.8671) grad_norm 4.2947 (4.0194) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][730/1251] eta 0:01:59 lr 0.000165 wd 0.0500 time 0.2293 (0.2288) data time 0.0010 (0.0017) model time 0.2284 (0.2266) loss 2.7736 
(2.8684) grad_norm 7.6190 (4.0328) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][740/1251] eta 0:01:56 lr 0.000165 wd 0.0500 time 0.2291 (0.2288) data time 0.0009 (0.0017) model time 0.2282 (0.2267) loss 2.8363 (2.8679) grad_norm 3.6539 (4.0432) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][750/1251] eta 0:01:54 lr 0.000165 wd 0.0500 time 0.2354 (0.2288) data time 0.0009 (0.0016) model time 0.2345 (0.2267) loss 3.0949 (2.8695) grad_norm 3.4565 (4.0384) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][760/1251] eta 0:01:52 lr 0.000165 wd 0.0500 time 0.2241 (0.2288) data time 0.0010 (0.0016) model time 0.2231 (0.2267) loss 2.8101 (2.8667) grad_norm 7.6999 (4.0411) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][770/1251] eta 0:01:50 lr 0.000164 wd 0.0500 time 0.2285 (0.2287) data time 0.0006 (0.0016) model time 0.2279 (0.2267) loss 3.0885 (2.8645) grad_norm 5.0113 (4.0449) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][780/1251] eta 0:01:47 lr 0.000164 wd 0.0500 time 0.2209 (0.2287) data time 0.0010 (0.0016) model time 0.2199 (0.2267) loss 2.1928 (2.8678) grad_norm 11.0360 (4.0483) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][790/1251] eta 0:01:45 lr 0.000164 wd 0.0500 time 0.2272 (0.2287) data time 0.0010 (0.0016) model time 0.2262 (0.2267) loss 3.1950 (2.8691) grad_norm 3.6977 (4.0624) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][800/1251] eta 0:01:43 lr 0.000164 wd 
0.0500 time 0.2216 (0.2287) data time 0.0010 (0.0016) model time 0.2207 (0.2267) loss 3.0525 (2.8710) grad_norm 3.4980 (4.0553) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][810/1251] eta 0:01:40 lr 0.000164 wd 0.0500 time 0.2265 (0.2287) data time 0.0009 (0.0016) model time 0.2256 (0.2267) loss 3.3304 (2.8681) grad_norm 3.5340 (4.0451) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][820/1251] eta 0:01:38 lr 0.000164 wd 0.0500 time 0.2192 (0.2286) data time 0.0009 (0.0016) model time 0.2182 (0.2266) loss 2.7384 (2.8682) grad_norm 3.2816 (4.0358) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][830/1251] eta 0:01:36 lr 0.000164 wd 0.0500 time 0.2343 (0.2287) data time 0.0010 (0.0016) model time 0.2332 (0.2267) loss 2.6977 (2.8669) grad_norm 2.7210 (4.0293) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][840/1251] eta 0:01:33 lr 0.000164 wd 0.0500 time 0.2245 (0.2286) data time 0.0009 (0.0016) model time 0.2236 (0.2266) loss 3.2828 (2.8667) grad_norm 3.7330 (4.0321) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][850/1251] eta 0:01:31 lr 0.000164 wd 0.0500 time 0.2313 (0.2286) data time 0.0008 (0.0016) model time 0.2305 (0.2266) loss 2.8831 (2.8655) grad_norm 3.8096 (4.0294) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][860/1251] eta 0:01:29 lr 0.000164 wd 0.0500 time 0.2344 (0.2286) data time 0.0009 (0.0016) model time 0.2335 (0.2266) loss 2.4826 (2.8662) grad_norm 3.4795 (4.0262) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:52 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [227/300][870/1251] eta 0:01:27 lr 0.000164 wd 0.0500 time 0.2263 (0.2285) data time 0.0009 (0.0016) model time 0.2254 (0.2266) loss 3.2407 (2.8679) grad_norm 4.8430 (4.0315) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][880/1251] eta 0:01:24 lr 0.000164 wd 0.0500 time 0.2221 (0.2285) data time 0.0010 (0.0016) model time 0.2211 (0.2266) loss 3.0354 (2.8679) grad_norm 3.6078 (4.0273) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][890/1251] eta 0:01:22 lr 0.000164 wd 0.0500 time 0.2288 (0.2285) data time 0.0005 (0.0015) model time 0.2283 (0.2266) loss 2.4134 (2.8667) grad_norm 5.3517 (4.0215) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][900/1251] eta 0:01:20 lr 0.000164 wd 0.0500 time 0.2287 (0.2285) data time 0.0006 (0.0015) model time 0.2281 (0.2266) loss 1.8261 (2.8641) grad_norm 3.8969 (4.0202) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][910/1251] eta 0:01:17 lr 0.000164 wd 0.0500 time 0.2298 (0.2285) data time 0.0007 (0.0015) model time 0.2290 (0.2266) loss 3.2800 (2.8664) grad_norm 4.4102 (4.0192) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][920/1251] eta 0:01:15 lr 0.000164 wd 0.0500 time 0.2190 (0.2285) data time 0.0010 (0.0015) model time 0.2180 (0.2266) loss 3.5131 (2.8718) grad_norm 2.8666 (4.0121) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][930/1251] eta 0:01:13 lr 0.000164 wd 0.0500 time 0.2300 (0.2284) data time 0.0011 (0.0015) model time 0.2289 (0.2266) loss 3.1140 (2.8727) grad_norm 3.1489 
(4.0039) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][940/1251] eta 0:01:11 lr 0.000164 wd 0.0500 time 0.2231 (0.2284) data time 0.0008 (0.0015) model time 0.2223 (0.2265) loss 2.9468 (2.8744) grad_norm 3.3360 (3.9977) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][950/1251] eta 0:01:08 lr 0.000164 wd 0.0500 time 0.2370 (0.2284) data time 0.0009 (0.0015) model time 0.2361 (0.2265) loss 3.2056 (2.8755) grad_norm 9.3527 (4.0067) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][960/1251] eta 0:01:06 lr 0.000164 wd 0.0500 time 0.2165 (0.2283) data time 0.0010 (0.0015) model time 0.2155 (0.2264) loss 3.3552 (2.8759) grad_norm 2.7796 (4.0021) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][970/1251] eta 0:01:04 lr 0.000164 wd 0.0500 time 0.2281 (0.2283) data time 0.0009 (0.0015) model time 0.2273 (0.2264) loss 2.5363 (2.8743) grad_norm 3.2402 (4.0007) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][980/1251] eta 0:01:01 lr 0.000164 wd 0.0500 time 0.2202 (0.2282) data time 0.0009 (0.0015) model time 0.2193 (0.2264) loss 2.6328 (2.8746) grad_norm 5.2450 (3.9968) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][990/1251] eta 0:00:59 lr 0.000164 wd 0.0500 time 0.2239 (0.2282) data time 0.0010 (0.0015) model time 0.2230 (0.2264) loss 2.9281 (2.8726) grad_norm 3.9894 (3.9924) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1000/1251] eta 0:00:57 lr 0.000164 wd 0.0500 time 0.2240 (0.2281) 
data time 0.0009 (0.0015) model time 0.2231 (0.2263) loss 3.0293 (2.8746) grad_norm 3.1744 (3.9932) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1010/1251] eta 0:00:54 lr 0.000164 wd 0.0500 time 0.2175 (0.2281) data time 0.0007 (0.0015) model time 0.2167 (0.2263) loss 3.8478 (2.8769) grad_norm 3.4293 (3.9920) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1020/1251] eta 0:00:52 lr 0.000164 wd 0.0500 time 0.2225 (0.2282) data time 0.0010 (0.0015) model time 0.2215 (0.2264) loss 2.6623 (2.8801) grad_norm 3.4970 (3.9980) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1030/1251] eta 0:00:50 lr 0.000164 wd 0.0500 time 0.2328 (0.2282) data time 0.0006 (0.0015) model time 0.2322 (0.2264) loss 3.1500 (2.8804) grad_norm 3.3883 (3.9961) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1040/1251] eta 0:00:48 lr 0.000164 wd 0.0500 time 0.2193 (0.2282) data time 0.0009 (0.0015) model time 0.2184 (0.2264) loss 3.3646 (2.8817) grad_norm 3.5344 (3.9944) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1050/1251] eta 0:00:45 lr 0.000164 wd 0.0500 time 0.2318 (0.2282) data time 0.0007 (0.0015) model time 0.2311 (0.2264) loss 3.4345 (2.8793) grad_norm 3.0893 (3.9935) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1060/1251] eta 0:00:43 lr 0.000164 wd 0.0500 time 0.2269 (0.2281) data time 0.0008 (0.0015) model time 0.2261 (0.2264) loss 2.4821 (2.8765) grad_norm 4.3308 (3.9959) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [227/300][1070/1251] eta 0:00:41 lr 0.000164 wd 0.0500 time 0.2278 (0.2283) data time 0.0009 (0.0015) model time 0.2270 (0.2266) loss 2.9243 (2.8780) grad_norm 2.9280 (3.9930) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1080/1251] eta 0:00:39 lr 0.000163 wd 0.0500 time 0.2199 (0.2283) data time 0.0010 (0.0015) model time 0.2189 (0.2265) loss 3.0029 (2.8773) grad_norm 2.4369 (3.9855) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1090/1251] eta 0:00:36 lr 0.000163 wd 0.0500 time 0.2297 (0.2283) data time 0.0010 (0.0014) model time 0.2287 (0.2265) loss 2.7933 (2.8789) grad_norm 4.0683 (3.9845) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1100/1251] eta 0:00:34 lr 0.000163 wd 0.0500 time 0.2259 (0.2282) data time 0.0011 (0.0015) model time 0.2248 (0.2265) loss 3.0262 (2.8806) grad_norm 3.1533 (3.9818) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1110/1251] eta 0:00:32 lr 0.000163 wd 0.0500 time 0.2254 (0.2282) data time 0.0011 (0.0014) model time 0.2243 (0.2264) loss 3.1597 (2.8816) grad_norm 4.2468 (3.9897) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1120/1251] eta 0:00:29 lr 0.000163 wd 0.0500 time 0.2237 (0.2282) data time 0.0008 (0.0014) model time 0.2229 (0.2264) loss 2.1773 (2.8816) grad_norm 3.1980 (3.9904) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1130/1251] eta 0:00:27 lr 0.000163 wd 0.0500 time 0.2291 (0.2282) data time 0.0008 (0.0014) model time 0.2283 (0.2264) loss 3.6385 (2.8824) grad_norm 4.7142 (3.9886) loss_scale 
512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1140/1251] eta 0:00:25 lr 0.000163 wd 0.0500 time 0.2423 (0.2282) data time 0.0007 (0.0014) model time 0.2416 (0.2264) loss 3.1769 (2.8848) grad_norm 3.3644 (3.9874) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1150/1251] eta 0:00:23 lr 0.000163 wd 0.0500 time 0.2175 (0.2281) data time 0.0010 (0.0014) model time 0.2165 (0.2264) loss 3.1797 (2.8862) grad_norm 5.2706 (3.9861) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1160/1251] eta 0:00:20 lr 0.000163 wd 0.0500 time 0.2284 (0.2281) data time 0.0008 (0.0014) model time 0.2276 (0.2264) loss 2.7718 (2.8857) grad_norm 5.0150 (3.9863) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1170/1251] eta 0:00:18 lr 0.000163 wd 0.0500 time 0.2260 (0.2281) data time 0.0006 (0.0014) model time 0.2254 (0.2263) loss 2.2144 (2.8852) grad_norm 2.4584 (3.9893) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1180/1251] eta 0:00:16 lr 0.000163 wd 0.0500 time 0.2259 (0.2280) data time 0.0007 (0.0014) model time 0.2251 (0.2263) loss 3.1050 (2.8856) grad_norm 4.1476 (3.9839) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1190/1251] eta 0:00:13 lr 0.000163 wd 0.0500 time 0.2237 (0.2280) data time 0.0010 (0.0014) model time 0.2227 (0.2263) loss 2.8697 (2.8849) grad_norm 3.4095 (3.9857) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1200/1251] eta 0:00:11 lr 0.000163 wd 0.0500 time 0.2346 (0.2280) data time 0.0006 
(0.0014) model time 0.2340 (0.2263) loss 3.5017 (2.8873) grad_norm 3.8415 (3.9833) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1210/1251] eta 0:00:09 lr 0.000163 wd 0.0500 time 0.2304 (0.2279) data time 0.0008 (0.0014) model time 0.2296 (0.2262) loss 2.9195 (2.8865) grad_norm 2.9619 (3.9942) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1220/1251] eta 0:00:07 lr 0.000163 wd 0.0500 time 0.2238 (0.2279) data time 0.0007 (0.0014) model time 0.2231 (0.2262) loss 2.1625 (2.8830) grad_norm 6.0071 (4.0004) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1230/1251] eta 0:00:04 lr 0.000163 wd 0.0500 time 0.2270 (0.2279) data time 0.0010 (0.0014) model time 0.2260 (0.2262) loss 2.9476 (2.8810) grad_norm 2.7495 (3.9994) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1240/1251] eta 0:00:02 lr 0.000163 wd 0.0500 time 0.2132 (0.2278) data time 0.0006 (0.0014) model time 0.2126 (0.2261) loss 3.1858 (2.8812) grad_norm 4.2368 (3.9983) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [227/300][1250/1251] eta 0:00:00 lr 0.000163 wd 0.0500 time 0.2137 (0.2277) data time 0.0006 (0.0014) model time 0.2131 (0.2261) loss 2.5647 (2.8818) grad_norm 2.6142 (3.9941) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 227 training takes 0:04:44 [2024-08-30 12:59:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 12:59:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 12:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.360 (0.360) Loss 0.4011 (0.4011) Acc@1 92.773 (92.773) Acc@5 98.145 (98.145) Mem 7381MB [2024-08-30 12:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.098) Loss 0.5850 (0.6283) Acc@1 88.379 (86.630) Acc@5 97.852 (97.496) Mem 7381MB [2024-08-30 12:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.085) Loss 0.9170 (0.6563) Acc@1 76.172 (85.658) Acc@5 95.605 (97.438) Mem 7381MB [2024-08-30 12:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.080) Loss 1.1475 (0.7478) Acc@1 71.484 (83.380) Acc@5 92.383 (96.472) Mem 7381MB [2024-08-30 12:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.0771 (0.7996) Acc@1 76.172 (82.069) Acc@5 93.457 (95.941) Mem 7381MB [2024-08-30 12:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.604 Acc@5 95.886 [2024-08-30 12:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.6% [2024-08-30 12:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.698 (0.698) Loss 0.3801 (0.3801) Acc@1 93.164 (93.164) Acc@5 98.340 (98.340) Mem 7381MB [2024-08-30 12:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.136) Loss 0.5791 (0.6007) Acc@1 89.160 (87.420) Acc@5 97.754 (97.772) Mem 7381MB [2024-08-30 12:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.067 (0.107) Loss 0.8701 (0.6284) Acc@1 78.320 (86.351) Acc@5 95.996 (97.731) Mem 7381MB [2024-08-30 12:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.096) Loss 1.0859 (0.7124) Acc@1 74.219 (84.287) Acc@5 
92.969 (96.812) Mem 7381MB [2024-08-30 12:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.088) Loss 0.9761 (0.7563) Acc@1 77.441 (83.191) Acc@5 94.629 (96.363) Mem 7381MB [2024-08-30 12:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.772 Acc@5 96.334 [2024-08-30 12:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-08-30 12:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.77% [2024-08-30 12:59:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-08-30 12:59:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-08-30 12:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][0/1251] eta 0:12:59 lr 0.000163 wd 0.0500 time 0.6234 (0.6234) data time 0.4049 (0.4049) model time 0.0000 (0.0000) loss 3.2597 (3.2597) grad_norm 4.4125 (4.4125) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][10/1251] eta 0:05:25 lr 0.000163 wd 0.0500 time 0.2231 (0.2621) data time 0.0007 (0.0378) model time 0.0000 (0.0000) loss 3.2018 (2.9512) grad_norm 3.3521 (3.9164) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][20/1251] eta 0:05:02 lr 0.000163 wd 0.0500 time 0.2265 (0.2458) data time 0.0011 (0.0203) model time 0.0000 (0.0000) loss 2.6099 (2.8030) grad_norm 4.1696 (3.8645) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][30/1251] eta 0:04:52 lr 0.000163 wd 0.0500 time 0.2238 (0.2394) data time 0.0010 (0.0141) model time 0.0000 (0.0000) loss 2.1314 (2.7782) grad_norm 3.2912 (3.7980) loss_scale 
512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][40/1251] eta 0:04:45 lr 0.000163 wd 0.0500 time 0.2261 (0.2362) data time 0.0008 (0.0109) model time 0.0000 (0.0000) loss 3.9956 (2.8127) grad_norm 3.1320 (3.8248) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][50/1251] eta 0:04:41 lr 0.000163 wd 0.0500 time 0.2220 (0.2341) data time 0.0008 (0.0090) model time 0.0000 (0.0000) loss 2.9398 (2.7772) grad_norm 3.4416 (3.9722) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][60/1251] eta 0:04:36 lr 0.000163 wd 0.0500 time 0.2290 (0.2325) data time 0.0008 (0.0076) model time 0.2283 (0.2234) loss 3.0758 (2.7999) grad_norm 2.2218 (4.3025) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][70/1251] eta 0:04:33 lr 0.000163 wd 0.0500 time 0.2263 (0.2316) data time 0.0007 (0.0067) model time 0.2256 (0.2241) loss 3.0192 (2.7978) grad_norm 3.2401 (4.2909) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][80/1251] eta 0:04:30 lr 0.000163 wd 0.0500 time 0.2350 (0.2312) data time 0.0008 (0.0060) model time 0.2342 (0.2252) loss 2.8178 (2.8404) grad_norm 2.8495 (4.1551) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][90/1251] eta 0:04:28 lr 0.000163 wd 0.0500 time 0.2240 (0.2308) data time 0.0010 (0.0055) model time 0.2229 (0.2257) loss 3.1115 (2.8706) grad_norm 7.5906 (4.1538) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][100/1251] eta 0:04:25 lr 0.000163 wd 0.0500 time 0.2250 (0.2302) data time 0.0007 (0.0050) 
model time 0.2243 (0.2253) loss 2.9298 (2.8769) grad_norm 8.1825 (4.1183) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][110/1251] eta 0:04:22 lr 0.000163 wd 0.0500 time 0.2299 (0.2299) data time 0.0008 (0.0047) model time 0.2291 (0.2254) loss 1.9826 (2.8701) grad_norm 2.8809 (4.0633) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][120/1251] eta 0:04:19 lr 0.000163 wd 0.0500 time 0.2288 (0.2297) data time 0.0008 (0.0043) model time 0.2281 (0.2255) loss 3.0578 (2.8750) grad_norm 4.2236 (3.9943) loss_scale 512.0000 (512.0000) mem 7381MB [2024-08-30 12:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 12:59:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 12:59:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 13:04:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 13:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 13:04:38 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 13:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 13:04:48 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 13:04:50 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 13:04:51 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 13:04:51 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-08-30 13:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 13:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 13:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 13:10:00 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 13:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 13:10:11 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-08-30 13:10:12 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 13:10:13 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 13:10:13 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-08-30 13:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 13:17:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 13:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 13:17:44 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 13:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 13:17:53 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 13:17:54 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 13:17:56 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 13:17:56 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-08-30 13:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 13:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 13:18:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-08-30 13:18:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-08-30 13:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 13:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 13:20:32 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-08-30 13:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 13:20:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 13:20:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 13:20:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 13:20:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-08-30 13:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 13:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-30 13:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-08-30 13:23:42 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... 
[2024-08-30 13:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-08-30 13:23:49 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-08-30 13:23:51 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-08-30 13:23:52 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-08-30 13:23:52 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-08-30 13:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-08-30 13:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][130/1251] eta 0:39:26 lr 0.000163 wd 0.0500 time 0.2208 (2.1109) data time 0.0012 (0.1261) model time 0.2195 (1.9849) loss 3.3508 (3.4508) grad_norm 3.4479 (3.6561) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][140/1251] eta 0:18:31 lr 0.000162 wd 0.0500 time 0.2192 (1.0008) data time 0.0011 (0.0525) model time 0.2182 (0.9483) loss 2.6440 (3.1446) grad_norm 3.5575 (3.4970) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][150/1251] eta 0:13:07 lr 0.000162 wd 0.0500 time 0.2271 (0.7149) data time 0.0008 (0.0334) model time 0.2263 (0.6814) loss 3.4358 (3.1515) grad_norm 3.7645 (3.7914) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][160/1251] eta 0:10:37 lr 0.000162 wd 0.0500 time 0.2228 (0.5842) data time 0.0009 (0.0247) model time 0.2219 (0.5596) loss 2.7853 (3.1169) grad_norm 4.0608 (3.8760) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][170/1251] eta 0:09:08 lr 0.000162 wd 0.0500 time 0.2211 (0.5074) data time 0.0007 (0.0197) model time 0.2203 (0.4876) loss 3.0648 (3.0688) grad_norm 3.7247 (3.8620) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][180/1251] eta 0:08:10 lr 0.000162 wd 0.0500 time 0.2253 (0.4577) data time 0.0009 (0.0164) model time 0.2244 (0.4412) loss 2.6545 (3.0496) grad_norm 5.9607 (3.9151) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][190/1251] eta 0:07:28 lr 0.000162 wd 0.0500 time 0.2257 (0.4226) data time 0.0008 (0.0141) model time 0.2249 (0.4084) loss 3.1603 (3.0164) grad_norm 6.1802 (3.8879) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][200/1251] eta 0:06:56 lr 0.000162 wd 0.0500 time 0.2192 (0.3965) data time 0.0012 (0.0124) model time 0.2181 (0.3841) loss 2.9772 (2.9854) grad_norm 3.2296 (3.8454) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][210/1251] eta 0:06:32 lr 0.000162 wd 0.0500 time 0.2257 (0.3767) data time 0.0009 (0.0111) model time 0.2248 (0.3656) loss 2.9641 (2.9524) grad_norm 5.3568 (3.9839) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][220/1251] eta 0:06:12 lr 0.000162 wd 0.0500 time 0.2261 (0.3610) data time 0.0009 (0.0100) model time 0.2253 (0.3509) loss 3.1968 (2.9574) grad_norm 6.0485 (3.9848) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][230/1251] eta 0:05:55 lr 0.000162 wd 0.0500 time 0.2209 (0.3483) data time 0.0006 (0.0092) model time 0.2203 (0.3391) loss 2.9895 (2.9743) 
grad_norm 3.4545 (3.9352) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][240/1251] eta 0:05:41 lr 0.000162 wd 0.0500 time 0.2280 (0.3378) data time 0.0009 (0.0085) model time 0.2271 (0.3293) loss 3.3854 (2.9672) grad_norm 7.4730 (3.9788) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][250/1251] eta 0:05:29 lr 0.000162 wd 0.0500 time 0.2342 (0.3290) data time 0.0008 (0.0079) model time 0.2335 (0.3211) loss 3.0192 (2.9548) grad_norm 5.2424 (3.9664) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][260/1251] eta 0:05:18 lr 0.000162 wd 0.0500 time 0.2292 (0.3213) data time 0.0006 (0.0074) model time 0.2286 (0.3140) loss 2.2998 (2.9481) grad_norm 3.0647 (3.9441) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][270/1251] eta 0:05:08 lr 0.000162 wd 0.0500 time 0.2276 (0.3148) data time 0.0008 (0.0069) model time 0.2269 (0.3079) loss 3.0338 (2.9416) grad_norm 3.8966 (3.9776) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][280/1251] eta 0:05:00 lr 0.000162 wd 0.0500 time 0.2217 (0.3091) data time 0.0009 (0.0065) model time 0.2208 (0.3026) loss 2.5193 (2.9395) grad_norm 3.8609 (3.9531) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][290/1251] eta 0:04:52 lr 0.000162 wd 0.0500 time 0.2304 (0.3041) data time 0.0008 (0.0062) model time 0.2296 (0.2979) loss 2.9210 (2.9388) grad_norm 4.0255 (3.9494) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][300/1251] eta 0:04:44 lr 0.000162 wd 0.0500 time 
0.2170 (0.2995) data time 0.0009 (0.0059) model time 0.2161 (0.2936) loss 3.2191 (2.9302) grad_norm 2.4644 (3.9006) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][310/1251] eta 0:04:38 lr 0.000162 wd 0.0500 time 0.2190 (0.2955) data time 0.0010 (0.0056) model time 0.2181 (0.2898) loss 3.0928 (2.9231) grad_norm 3.6657 (3.9205) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][320/1251] eta 0:04:31 lr 0.000162 wd 0.0500 time 0.2239 (0.2919) data time 0.0007 (0.0054) model time 0.2232 (0.2865) loss 3.2660 (2.9167) grad_norm 10.8312 (3.9383) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][330/1251] eta 0:04:25 lr 0.000162 wd 0.0500 time 0.2282 (0.2887) data time 0.0007 (0.0052) model time 0.2274 (0.2835) loss 3.1577 (2.9043) grad_norm 3.6058 (3.9212) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][340/1251] eta 0:04:20 lr 0.000162 wd 0.0500 time 0.2177 (0.2858) data time 0.0009 (0.0050) model time 0.2169 (0.2808) loss 3.3358 (2.9021) grad_norm 3.6722 (3.9258) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:25:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][350/1251] eta 0:04:15 lr 0.000162 wd 0.0500 time 0.2228 (0.2834) data time 0.0011 (0.0048) model time 0.2218 (0.2785) loss 2.9578 (2.9038) grad_norm 2.7507 (3.9408) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][360/1251] eta 0:04:10 lr 0.000162 wd 0.0500 time 0.2176 (0.2810) data time 0.0009 (0.0047) model time 0.2168 (0.2763) loss 2.8206 (2.8996) grad_norm 2.7339 (3.9040) loss_scale 512.0000 (512.0000) mem 7377MB [2024-08-30 13:25:04 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [228/300][370/1251] eta 0:04:05 lr 0.000162 wd 0.0500 time 0.2294 (0.2788) data time 0.0008 (0.0045) model time 0.2286 (0.2743) loss 3.1352 (2.8969) grad_norm 17.3748 (3.9350) loss_scale 1024.0000 (528.5830) mem 7377MB [2024-08-30 13:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][380/1251] eta 0:04:00 lr 0.000162 wd 0.0500 time 0.2262 (0.2767) data time 0.0006 (0.0044) model time 0.2255 (0.2723) loss 2.1892 (2.8855) grad_norm 7.3366 (inf) loss_scale 512.0000 (537.8988) mem 7377MB [2024-08-30 13:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][390/1251] eta 0:03:56 lr 0.000162 wd 0.0500 time 0.2211 (0.2747) data time 0.0007 (0.0042) model time 0.2204 (0.2704) loss 2.2309 (2.8801) grad_norm 6.9525 (inf) loss_scale 512.0000 (536.9288) mem 7377MB [2024-08-30 13:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][400/1251] eta 0:03:52 lr 0.000162 wd 0.0500 time 0.2198 (0.2728) data time 0.0009 (0.0041) model time 0.2189 (0.2687) loss 2.7932 (2.8885) grad_norm 3.9779 (inf) loss_scale 512.0000 (536.0289) mem 7377MB [2024-08-30 13:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][410/1251] eta 0:03:48 lr 0.000162 wd 0.0500 time 0.4396 (0.2719) data time 0.0008 (0.0040) model time 0.4388 (0.2679) loss 3.6506 (2.8836) grad_norm 3.8371 (inf) loss_scale 512.0000 (535.1916) mem 7377MB [2024-08-30 13:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][420/1251] eta 0:03:44 lr 0.000162 wd 0.0500 time 0.2208 (0.2703) data time 0.0010 (0.0039) model time 0.2198 (0.2664) loss 1.7343 (2.8709) grad_norm 3.4612 (inf) loss_scale 512.0000 (534.4108) mem 7377MB [2024-08-30 13:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][430/1251] eta 0:03:41 lr 0.000162 wd 0.0500 time 0.4729 (0.2696) data time 0.0008 (0.0038) model time 0.4721 (0.2658) loss 3.6894 (2.8701) grad_norm 3.7310 (inf) loss_scale 
512.0000 (533.6808) mem 7377MB [2024-08-30 13:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][440/1251] eta 0:03:37 lr 0.000162 wd 0.0500 time 0.2239 (0.2682) data time 0.0010 (0.0037) model time 0.2229 (0.2644) loss 3.1463 (2.8791) grad_norm 4.0384 (inf) loss_scale 512.0000 (532.9968) mem 7377MB [2024-08-30 13:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][450/1251] eta 0:03:33 lr 0.000161 wd 0.0500 time 0.2215 (0.2668) data time 0.0007 (0.0036) model time 0.2208 (0.2631) loss 2.2172 (2.8867) grad_norm 3.7831 (inf) loss_scale 512.0000 (532.3547) mem 7377MB [2024-08-30 13:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][460/1251] eta 0:03:29 lr 0.000161 wd 0.0500 time 0.2244 (0.2655) data time 0.0008 (0.0035) model time 0.2235 (0.2619) loss 2.4344 (2.8822) grad_norm 3.2831 (inf) loss_scale 512.0000 (531.7507) mem 7377MB [2024-08-30 13:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-08-30 13:25:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-08-30 13:25:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-08-31 00:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-08-31 00:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 04:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 04:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 04:57:42 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 04:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 04:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 04:59:42 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 04:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 04:59:51 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 04:59:53 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 04:59:54 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 04:59:54 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-09-01 04:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][470/1251] eta 0:39:21 lr 0.000161 wd 0.0500 time 0.2451 (3.0236) data time 0.0008 (0.1429) model time 0.2443 (2.8807) loss 3.6725 (3.3282) grad_norm 2.9174 (3.9270) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][480/1251] eta 0:16:29 lr 0.000161 wd 0.0500 time 0.2348 (1.2833) data time 0.0011 (0.0542) model time 0.2338 (1.2290) loss 3.1752 (3.1312) grad_norm 3.5458 (4.4601) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][490/1251] eta 0:11:12 lr 0.000161 wd 0.0500 time 0.2402 (0.8831) data time 0.0010 (0.0342) model time 0.2393 (0.8489) loss 2.5461 (3.1188) grad_norm 3.1512 (4.1207) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][500/1251] eta 0:08:49 lr 0.000161 wd 0.0500 time 0.2388 (0.7049) data time 0.0009 (0.0250) model time 0.2379 (0.6799) loss 2.7730 (3.1105) grad_norm 3.8241 (3.9483) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][510/1251] eta 0:07:27 lr 0.000161 wd 0.0500 time 0.2406 (0.6042) data time 0.0010 (0.0198) model time 0.2396 (0.5844) loss 2.8793 (3.0681) grad_norm 4.4447 (4.0234) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[228/300][520/1251] eta 0:06:34 lr 0.000161 wd 0.0500 time 0.2399 (0.5393) data time 0.0008 (0.0164) model time 0.2391 (0.5229) loss 3.3699 (3.0613) grad_norm 5.8590 (4.2271) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][530/1251] eta 0:05:56 lr 0.000161 wd 0.0500 time 0.2405 (0.4950) data time 0.0009 (0.0141) model time 0.2396 (0.4809) loss 2.6145 (3.0240) grad_norm 2.9118 (4.1738) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][540/1251] eta 0:05:28 lr 0.000161 wd 0.0500 time 0.2467 (0.4617) data time 0.0010 (0.0124) model time 0.2456 (0.4493) loss 3.3180 (3.0004) grad_norm 2.9383 (4.1496) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][550/1251] eta 0:05:05 lr 0.000161 wd 0.0500 time 0.2309 (0.4361) data time 0.0009 (0.0111) model time 0.2300 (0.4250) loss 1.9800 (2.9660) grad_norm 3.8616 (4.1408) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][560/1251] eta 0:04:47 lr 0.000161 wd 0.0500 time 0.2384 (0.4158) data time 0.0009 (0.0100) model time 0.2374 (0.4058) loss 2.8273 (2.9652) grad_norm 3.4469 (4.0446) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][570/1251] eta 0:04:32 lr 0.000161 wd 0.0500 time 0.2415 (0.3995) data time 0.0011 (0.0093) model time 0.2403 (0.3902) loss 3.4215 (2.9959) grad_norm 3.7345 (3.9797) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][580/1251] eta 0:04:18 lr 0.000161 wd 0.0500 time 0.2425 (0.3859) data time 0.0008 (0.0086) model time 0.2417 (0.3773) loss 3.2889 (2.9921) grad_norm 2.8916 (3.9467) loss_scale 512.0000 (512.0000) mem 
7352MB [2024-09-01 05:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][590/1251] eta 0:04:07 lr 0.000161 wd 0.0500 time 0.2411 (0.3745) data time 0.0008 (0.0080) model time 0.2403 (0.3665) loss 1.8042 (2.9797) grad_norm 3.0628 (3.9651) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][600/1251] eta 0:03:57 lr 0.000161 wd 0.0500 time 0.2421 (0.3648) data time 0.0011 (0.0075) model time 0.2410 (0.3573) loss 2.3682 (2.9778) grad_norm 2.5899 (3.9245) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][610/1251] eta 0:03:48 lr 0.000161 wd 0.0500 time 0.2330 (0.3565) data time 0.0008 (0.0070) model time 0.2322 (0.3495) loss 2.4109 (2.9742) grad_norm 4.5147 (3.9596) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][620/1251] eta 0:03:40 lr 0.000161 wd 0.0500 time 0.2344 (0.3491) data time 0.0011 (0.0067) model time 0.2333 (0.3424) loss 3.0326 (2.9720) grad_norm 4.0732 (3.9495) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][630/1251] eta 0:03:32 lr 0.000161 wd 0.0500 time 0.2348 (0.3425) data time 0.0010 (0.0063) model time 0.2337 (0.3362) loss 3.2830 (2.9725) grad_norm 4.3088 (3.9363) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][640/1251] eta 0:03:25 lr 0.000161 wd 0.0500 time 0.2430 (0.3367) data time 0.0008 (0.0060) model time 0.2422 (0.3307) loss 2.8583 (2.9595) grad_norm 4.8779 (3.9355) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][650/1251] eta 0:03:19 lr 0.000161 wd 0.0500 time 0.2356 (0.3316) data time 0.0011 (0.0058) model time 0.2345 
(0.3259) loss 2.5021 (2.9536) grad_norm 4.2119 (3.9227) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][660/1251] eta 0:03:13 lr 0.000161 wd 0.0500 time 0.2404 (0.3271) data time 0.0007 (0.0055) model time 0.2397 (0.3215) loss 2.5457 (2.9464) grad_norm 3.7046 (3.9188) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][670/1251] eta 0:03:07 lr 0.000161 wd 0.0500 time 0.2403 (0.3229) data time 0.0008 (0.0053) model time 0.2395 (0.3176) loss 2.1162 (2.9278) grad_norm 3.0893 (3.8781) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][680/1251] eta 0:03:02 lr 0.000161 wd 0.0500 time 0.2424 (0.3190) data time 0.0007 (0.0051) model time 0.2417 (0.3139) loss 2.0796 (2.9241) grad_norm 5.9802 (3.8742) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][690/1251] eta 0:02:57 lr 0.000161 wd 0.0500 time 0.2396 (0.3157) data time 0.0008 (0.0049) model time 0.2388 (0.3108) loss 2.6443 (2.9328) grad_norm 4.1686 (3.9673) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][700/1251] eta 0:02:52 lr 0.000161 wd 0.0500 time 0.2453 (0.3125) data time 0.0009 (0.0048) model time 0.2444 (0.3078) loss 3.1471 (2.9266) grad_norm 2.9594 (3.9984) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][710/1251] eta 0:02:47 lr 0.000161 wd 0.0500 time 0.2417 (0.3096) data time 0.0008 (0.0046) model time 0.2408 (0.3050) loss 2.0143 (2.9212) grad_norm 2.9255 (3.9820) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][720/1251] eta 0:02:42 
lr 0.000161 wd 0.0500 time 0.2331 (0.3069) data time 0.0011 (0.0045) model time 0.2320 (0.3024) loss 2.3428 (2.9194) grad_norm 5.1912 (3.9867) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][730/1251] eta 0:02:38 lr 0.000161 wd 0.0500 time 0.2399 (0.3044) data time 0.0009 (0.0044) model time 0.2391 (0.3001) loss 2.6293 (2.9111) grad_norm 3.9082 (3.9831) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][740/1251] eta 0:02:34 lr 0.000161 wd 0.0500 time 0.2450 (0.3021) data time 0.0010 (0.0043) model time 0.2440 (0.2979) loss 3.1240 (2.9134) grad_norm 3.1381 (4.0195) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][750/1251] eta 0:02:30 lr 0.000161 wd 0.0500 time 0.2399 (0.3000) data time 0.0012 (0.0041) model time 0.2387 (0.2958) loss 2.2593 (2.9059) grad_norm 3.9069 (4.0106) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][760/1251] eta 0:02:26 lr 0.000160 wd 0.0500 time 0.2435 (0.2988) data time 0.0008 (0.0040) model time 0.2426 (0.2948) loss 1.9907 (2.8979) grad_norm 3.4302 (3.9950) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][770/1251] eta 0:02:22 lr 0.000160 wd 0.0500 time 0.2354 (0.2969) data time 0.0009 (0.0039) model time 0.2345 (0.2929) loss 2.1615 (2.8885) grad_norm 4.4740 (3.9828) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][780/1251] eta 0:02:19 lr 0.000160 wd 0.0500 time 0.2521 (0.2960) data time 0.0008 (0.0039) model time 0.2513 (0.2921) loss 3.0554 (2.8959) grad_norm 3.7210 (3.9791) loss_scale 512.0000 (512.0000) mem 7352MB [2024-09-01 05:01:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 05:01:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:01:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 05:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:03:17 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 05:03:30 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 05:03:31 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 05:03:32 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 05:03:32 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-09-01 05:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][790/1251] eta 0:11:22 lr 0.000160 wd 0.0500 time 0.2316 (1.4814) data time 0.0006 (0.0732) model time 0.2311 (1.4082) loss 3.0826 (3.2137) grad_norm 3.1361 (4.1472) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][800/1251] eta 0:06:10 lr 0.000160 wd 0.0500 time 0.2201 (0.8215) data time 0.0010 (0.0352) model time 0.2191 (0.7863) loss 2.9303 (3.0957) grad_norm 2.8955 (4.3659) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][810/1251] eta 0:04:31 lr 0.000160 wd 0.0500 time 0.2245 (0.6163) data time 0.0006 (0.0233) model time 0.2239 (0.5929) loss 3.3030 (3.1296) grad_norm 3.0672 (4.3894) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][820/1251] eta 0:03:42 lr 0.000160 wd 0.0500 time 0.2300 (0.5168) data time 0.0008 (0.0176) model time 0.2291 (0.4992) loss 2.8572 (3.0616) grad_norm 3.3467 (4.8095) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][830/1251] eta 0:03:12 lr 0.000160 wd 0.0500 time 0.2239 (0.4578) data time 0.0008 (0.0142) model time 0.2231 (0.4436) loss 2.5160 (3.0297) grad_norm 3.5717 (4.6516) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[228/300][840/1251] eta 0:02:52 lr 0.000160 wd 0.0500 time 0.2351 (0.4190) data time 0.0008 (0.0119) model time 0.2343 (0.4071) loss 1.8582 (3.0036) grad_norm 4.7674 (4.6553) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][850/1251] eta 0:02:36 lr 0.000160 wd 0.0500 time 0.2310 (0.3912) data time 0.0010 (0.0103) model time 0.2300 (0.3808) loss 2.8823 (2.9901) grad_norm 3.4394 (4.5239) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][860/1251] eta 0:02:24 lr 0.000160 wd 0.0500 time 0.2397 (0.3707) data time 0.0009 (0.0091) model time 0.2388 (0.3615) loss 2.9870 (2.9737) grad_norm 3.7549 (4.4040) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][870/1251] eta 0:02:14 lr 0.000160 wd 0.0500 time 0.2196 (0.3543) data time 0.0006 (0.0082) model time 0.2190 (0.3461) loss 2.9404 (2.9655) grad_norm 4.4774 (4.3709) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][880/1251] eta 0:02:06 lr 0.000160 wd 0.0500 time 0.2228 (0.3413) data time 0.0012 (0.0075) model time 0.2216 (0.3339) loss 2.9955 (2.9704) grad_norm 5.0145 (4.2912) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][890/1251] eta 0:01:59 lr 0.000160 wd 0.0500 time 0.2216 (0.3304) data time 0.0008 (0.0069) model time 0.2209 (0.3236) loss 3.0176 (2.9792) grad_norm 4.7865 (4.3635) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][900/1251] eta 0:01:52 lr 0.000160 wd 0.0500 time 0.2295 (0.3217) data time 0.0008 (0.0064) model time 0.2287 (0.3154) loss 3.0251 (2.9662) grad_norm 5.4851 (4.2945) loss_scale 512.0000 (512.0000) mem 
7377MB [2024-09-01 05:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][910/1251] eta 0:01:47 lr 0.000160 wd 0.0500 time 0.2264 (0.3143) data time 0.0007 (0.0059) model time 0.2257 (0.3083) loss 2.5640 (2.9470) grad_norm 2.8769 (4.2214) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][920/1251] eta 0:01:41 lr 0.000160 wd 0.0500 time 0.2385 (0.3080) data time 0.0008 (0.0056) model time 0.2377 (0.3024) loss 3.5666 (2.9478) grad_norm 3.3475 (4.1971) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][930/1251] eta 0:01:37 lr 0.000160 wd 0.0500 time 0.2240 (0.3023) data time 0.0006 (0.0053) model time 0.2234 (0.2971) loss 2.6200 (2.9434) grad_norm 3.7277 (4.1645) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][940/1251] eta 0:01:32 lr 0.000160 wd 0.0500 time 0.2260 (0.2976) data time 0.0007 (0.0050) model time 0.2252 (0.2926) loss 3.4799 (2.9409) grad_norm 3.6652 (4.1581) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][950/1251] eta 0:01:28 lr 0.000160 wd 0.0500 time 0.2229 (0.2933) data time 0.0007 (0.0048) model time 0.2221 (0.2885) loss 3.3604 (2.9424) grad_norm 2.5871 (4.1198) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][960/1251] eta 0:01:24 lr 0.000160 wd 0.0500 time 0.2272 (0.2895) data time 0.0009 (0.0046) model time 0.2263 (0.2849) loss 2.6633 (2.9248) grad_norm 2.4679 (4.0889) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][970/1251] eta 0:01:20 lr 0.000160 wd 0.0500 time 0.2296 (0.2862) data time 0.0007 (0.0044) model time 0.2289 
(0.2818) loss 3.7586 (2.9252) grad_norm 5.2531 (4.1311) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][980/1251] eta 0:01:16 lr 0.000160 wd 0.0500 time 0.2246 (0.2833) data time 0.0008 (0.0042) model time 0.2238 (0.2791) loss 2.3596 (2.9133) grad_norm 4.3403 (4.1009) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][990/1251] eta 0:01:13 lr 0.000160 wd 0.0500 time 0.2246 (0.2806) data time 0.0007 (0.0041) model time 0.2239 (0.2766) loss 3.4618 (2.9093) grad_norm 3.6295 (4.0809) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1000/1251] eta 0:01:09 lr 0.000160 wd 0.0500 time 0.2251 (0.2782) data time 0.0008 (0.0039) model time 0.2244 (0.2742) loss 3.2965 (2.9056) grad_norm 3.9594 (4.0698) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1010/1251] eta 0:01:06 lr 0.000160 wd 0.0500 time 0.2182 (0.2760) data time 0.0007 (0.0038) model time 0.2175 (0.2722) loss 2.0877 (2.9071) grad_norm 3.8596 (4.0480) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1020/1251] eta 0:01:03 lr 0.000160 wd 0.0500 time 0.2236 (0.2739) data time 0.0006 (0.0037) model time 0.2230 (0.2702) loss 1.9519 (2.9020) grad_norm 3.5158 (4.0287) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1030/1251] eta 0:01:00 lr 0.000160 wd 0.0500 time 0.2234 (0.2719) data time 0.0008 (0.0036) model time 0.2227 (0.2683) loss 2.8790 (2.9009) grad_norm 5.8206 (4.0338) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1040/1251] eta 
0:00:56 lr 0.000160 wd 0.0500 time 0.2206 (0.2701) data time 0.0010 (0.0035) model time 0.2196 (0.2666) loss 3.2221 (2.8978) grad_norm 2.9032 (4.0413) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1050/1251] eta 0:00:53 lr 0.000160 wd 0.0500 time 0.2202 (0.2684) data time 0.0007 (0.0034) model time 0.2194 (0.2650) loss 2.1539 (2.8884) grad_norm 3.5908 (4.0637) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1060/1251] eta 0:00:50 lr 0.000160 wd 0.0500 time 0.2187 (0.2667) data time 0.0010 (0.0033) model time 0.2177 (0.2634) loss 3.0357 (2.8937) grad_norm 2.8392 (4.0480) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1070/1251] eta 0:00:48 lr 0.000160 wd 0.0500 time 0.2309 (0.2661) data time 0.0007 (0.0032) model time 0.2301 (0.2629) loss 2.8974 (2.8953) grad_norm 5.0008 (4.0432) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1080/1251] eta 0:00:45 lr 0.000159 wd 0.0500 time 0.2234 (0.2648) data time 0.0009 (0.0031) model time 0.2225 (0.2617) loss 3.1207 (2.8818) grad_norm 2.3751 (4.0335) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1090/1251] eta 0:00:42 lr 0.000159 wd 0.0500 time 0.2261 (0.2644) data time 0.0010 (0.0031) model time 0.2251 (0.2614) loss 3.1157 (2.8784) grad_norm 3.3236 (4.0537) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1100/1251] eta 0:00:39 lr 0.000159 wd 0.0500 time 0.2239 (0.2632) data time 0.0007 (0.0030) model time 0.2231 (0.2602) loss 3.4643 (2.8878) grad_norm 3.4419 (4.0305) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 
05:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1110/1251] eta 0:00:36 lr 0.000159 wd 0.0500 time 0.2303 (0.2622) data time 0.0009 (0.0029) model time 0.2295 (0.2592) loss 2.0217 (2.8917) grad_norm 4.2828 (4.0164) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1120/1251] eta 0:00:34 lr 0.000159 wd 0.0500 time 0.2268 (0.2611) data time 0.0009 (0.0029) model time 0.2259 (0.2582) loss 2.4633 (2.8903) grad_norm 2.8241 (4.0110) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1130/1251] eta 0:00:31 lr 0.000159 wd 0.0500 time 0.2300 (0.2601) data time 0.0006 (0.0028) model time 0.2295 (0.2573) loss 3.3613 (2.8937) grad_norm 7.3896 (4.0330) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1140/1251] eta 0:00:28 lr 0.000159 wd 0.0500 time 0.2230 (0.2591) data time 0.0007 (0.0028) model time 0.2222 (0.2563) loss 2.3391 (2.8911) grad_norm 2.8639 (4.0421) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1150/1251] eta 0:00:26 lr 0.000159 wd 0.0500 time 0.2282 (0.2582) data time 0.0009 (0.0027) model time 0.2273 (0.2554) loss 3.2088 (2.8911) grad_norm 3.4207 (4.0689) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1160/1251] eta 0:00:23 lr 0.000159 wd 0.0500 time 0.2257 (0.2573) data time 0.0007 (0.0027) model time 0.2250 (0.2546) loss 3.1549 (2.8887) grad_norm 3.3263 (4.0574) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1170/1251] eta 0:00:20 lr 0.000159 wd 0.0500 time 0.2238 (0.2565) data time 0.0006 (0.0026) model time 0.2231 (0.2539) loss 
3.0856 (2.8844) grad_norm 2.9979 (4.0407) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1180/1251] eta 0:00:18 lr 0.000159 wd 0.0500 time 0.2305 (0.2558) data time 0.0008 (0.0026) model time 0.2297 (0.2532) loss 3.2398 (2.8889) grad_norm 14.9398 (4.0624) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1190/1251] eta 0:00:15 lr 0.000159 wd 0.0500 time 0.2193 (0.2550) data time 0.0009 (0.0026) model time 0.2184 (0.2525) loss 3.0099 (2.8913) grad_norm 3.1287 (4.0479) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1200/1251] eta 0:00:12 lr 0.000159 wd 0.0500 time 0.2211 (0.2543) data time 0.0007 (0.0025) model time 0.2204 (0.2518) loss 2.4285 (2.8886) grad_norm 2.3976 (4.0275) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1210/1251] eta 0:00:10 lr 0.000159 wd 0.0500 time 0.2235 (0.2536) data time 0.0008 (0.0025) model time 0.2228 (0.2511) loss 3.0732 (2.8962) grad_norm 2.6605 (4.0134) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1220/1251] eta 0:00:07 lr 0.000159 wd 0.0500 time 0.2258 (0.2530) data time 0.0009 (0.0025) model time 0.2249 (0.2505) loss 2.8257 (2.8993) grad_norm 2.7294 (4.0034) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1230/1251] eta 0:00:05 lr 0.000159 wd 0.0500 time 0.2204 (0.2523) data time 0.0009 (0.0024) model time 0.2195 (0.2499) loss 2.8195 (2.8997) grad_norm 4.1308 (3.9926) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1240/1251] eta 0:00:02 lr 
0.000159 wd 0.0500 time 0.2113 (0.2516) data time 0.0006 (0.0024) model time 0.2107 (0.2492) loss 2.4434 (2.8943) grad_norm 3.3118 (3.9946) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [228/300][1250/1251] eta 0:00:00 lr 0.000159 wd 0.0500 time 0.2154 (0.2508) data time 0.0004 (0.0024) model time 0.2150 (0.2484) loss 1.6723 (2.8862) grad_norm 3.9577 (3.9956) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 228 training takes 0:01:57 [2024-09-01 05:05:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:05:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 05:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.366 (0.366) Loss 0.4133 (0.4133) Acc@1 92.578 (92.578) Acc@5 98.340 (98.340) Mem 7377MB [2024-09-01 05:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.066 (0.096) Loss 0.6289 (0.6349) Acc@1 87.109 (86.497) Acc@5 97.559 (97.479) Mem 7377MB [2024-09-01 05:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.085) Loss 0.9229 (0.6639) Acc@1 78.027 (85.542) Acc@5 94.922 (97.340) Mem 7377MB [2024-09-01 05:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.080) Loss 1.1250 (0.7575) Acc@1 72.852 (83.279) Acc@5 93.359 (96.412) Mem 7377MB [2024-09-01 05:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.076) Loss 1.0098 (0.8085) Acc@1 76.074 (82.079) Acc@5 94.238 (95.837) Mem 7377MB [2024-09-01 05:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.662 Acc@5 95.804 [2024-09-01 05:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on 
the 50000 test images: 81.7% [2024-09-01 05:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.806 (0.806) Loss 0.3811 (0.3811) Acc@1 92.871 (92.871) Acc@5 98.242 (98.242) Mem 7377MB [2024-09-01 05:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:07:30 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 05:07:39 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-09-01 05:07:40 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 05:07:41 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 05:07:41 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 228) [2024-09-01 05:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][0/1251] eta 4:28:03 lr 0.000159 wd 0.0500 time 12.8563 (12.8563) data time 0.8847 (0.8847) model time 0.0000 (0.0000) loss 3.6817 (3.6817) grad_norm 2.8970 (2.8970) loss_scale 512.0000 (512.0000) mem 20035MB [2024-09-01 05:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][10/1251] eta 0:28:58 lr 0.000159 wd 0.0500 time 0.2403 (1.4006) data time 0.0011 (0.0813) model time 0.0000 (0.0000) loss 2.5736 (3.1715) grad_norm 3.8225 (3.7407) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 
05:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][20/1251] eta 0:17:22 lr 0.000159 wd 0.0500 time 0.2337 (0.8470) data time 0.0011 (0.0431) model time 0.0000 (0.0000) loss 2.9738 (3.0533) grad_norm 3.0207 (3.8356) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][30/1251] eta 0:13:14 lr 0.000159 wd 0.0500 time 0.2295 (0.6503) data time 0.0008 (0.0295) model time 0.0000 (0.0000) loss 2.2636 (3.0296) grad_norm 3.8511 (3.8046) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][40/1251] eta 0:11:06 lr 0.000159 wd 0.0500 time 0.2358 (0.5500) data time 0.0010 (0.0226) model time 0.0000 (0.0000) loss 2.7426 (2.9795) grad_norm 2.5762 (3.7780) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][50/1251] eta 0:09:46 lr 0.000159 wd 0.0500 time 0.2389 (0.4885) data time 0.0009 (0.0184) model time 0.0000 (0.0000) loss 3.2903 (2.9810) grad_norm 3.4541 (3.7780) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][60/1251] eta 0:08:52 lr 0.000159 wd 0.0500 time 0.2324 (0.4472) data time 0.0010 (0.0155) model time 0.2315 (0.2354) loss 3.2040 (2.9572) grad_norm 4.4244 (3.8784) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][70/1251] eta 0:08:13 lr 0.000159 wd 0.0500 time 0.2387 (0.4176) data time 0.0012 (0.0135) model time 0.2374 (0.2357) loss 2.7440 (2.9146) grad_norm 4.5210 (3.8484) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][80/1251] eta 0:07:42 lr 0.000159 wd 0.0500 time 0.2275 (0.3949) data time 0.0011 (0.0120) model time 0.2264 (0.2345) loss 2.7280 (2.9059) 
grad_norm 4.6151 (3.8796) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][90/1251] eta 0:07:18 lr 0.000159 wd 0.0500 time 0.2308 (0.3774) data time 0.0007 (0.0108) model time 0.2301 (0.2345) loss 3.7620 (2.9081) grad_norm 5.3270 (3.8116) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][100/1251] eta 0:06:58 lr 0.000159 wd 0.0500 time 0.2358 (0.3634) data time 0.0008 (0.0098) model time 0.2349 (0.2347) loss 3.0484 (2.9217) grad_norm 3.4621 (3.8109) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][110/1251] eta 0:06:41 lr 0.000159 wd 0.0500 time 0.2383 (0.3520) data time 0.0011 (0.0091) model time 0.2372 (0.2348) loss 2.6369 (2.9407) grad_norm 3.0284 (3.7954) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][120/1251] eta 0:06:27 lr 0.000159 wd 0.0500 time 0.2378 (0.3425) data time 0.0008 (0.0084) model time 0.2370 (0.2350) loss 2.0060 (2.9476) grad_norm 4.4197 (3.8083) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][130/1251] eta 0:06:14 lr 0.000159 wd 0.0500 time 0.2327 (0.3345) data time 0.0011 (0.0079) model time 0.2316 (0.2352) loss 3.0033 (2.9445) grad_norm 3.7610 (3.8064) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][140/1251] eta 0:06:04 lr 0.000158 wd 0.0500 time 0.2351 (0.3277) data time 0.0010 (0.0074) model time 0.2341 (0.2355) loss 2.8828 (2.9345) grad_norm 3.0762 (3.7709) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][150/1251] eta 0:05:54 lr 0.000158 wd 0.0500 time 
0.2398 (0.3218) data time 0.0009 (0.0069) model time 0.2388 (0.2357) loss 2.4396 (2.9324) grad_norm 4.3805 (3.8106) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][160/1251] eta 0:05:45 lr 0.000158 wd 0.0500 time 0.2383 (0.3166) data time 0.0011 (0.0066) model time 0.2372 (0.2358) loss 3.5573 (2.9382) grad_norm 5.2909 (3.8679) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][170/1251] eta 0:05:37 lr 0.000158 wd 0.0500 time 0.2377 (0.3121) data time 0.0011 (0.0063) model time 0.2366 (0.2360) loss 3.0778 (2.9320) grad_norm 2.8435 (3.8709) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][180/1251] eta 0:05:29 lr 0.000158 wd 0.0500 time 0.2355 (0.3080) data time 0.0009 (0.0060) model time 0.2346 (0.2361) loss 3.5401 (2.9198) grad_norm 2.5512 (3.8453) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][190/1251] eta 0:05:23 lr 0.000158 wd 0.0500 time 0.2758 (0.3046) data time 0.0011 (0.0057) model time 0.2747 (0.2365) loss 2.7509 (2.9206) grad_norm 5.7485 (3.8463) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][200/1251] eta 0:05:16 lr 0.000158 wd 0.0500 time 0.2415 (0.3014) data time 0.0010 (0.0055) model time 0.2404 (0.2367) loss 2.8308 (2.9100) grad_norm 3.7075 (3.8406) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][210/1251] eta 0:05:10 lr 0.000158 wd 0.0500 time 0.2467 (0.2984) data time 0.0009 (0.0053) model time 0.2458 (0.2367) loss 3.2882 (2.9119) grad_norm 3.4211 (3.8192) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:51 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [229/300][220/1251] eta 0:05:04 lr 0.000158 wd 0.0500 time 0.2379 (0.2957) data time 0.0008 (0.0051) model time 0.2371 (0.2368) loss 3.3759 (2.9096) grad_norm 3.3889 (3.8070) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][230/1251] eta 0:04:59 lr 0.000158 wd 0.0500 time 0.2379 (0.2932) data time 0.0008 (0.0049) model time 0.2371 (0.2368) loss 1.9678 (2.9035) grad_norm 3.4831 (3.8040) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][240/1251] eta 0:04:54 lr 0.000158 wd 0.0500 time 0.2379 (0.2908) data time 0.0008 (0.0047) model time 0.2370 (0.2367) loss 2.6734 (2.8974) grad_norm 4.2574 (3.8281) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][250/1251] eta 0:04:49 lr 0.000158 wd 0.0500 time 0.2445 (0.2887) data time 0.0008 (0.0046) model time 0.2437 (0.2367) loss 3.0645 (2.8894) grad_norm 4.3808 (3.8125) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][260/1251] eta 0:04:44 lr 0.000158 wd 0.0500 time 0.2451 (0.2867) data time 0.0007 (0.0045) model time 0.2444 (0.2366) loss 2.8331 (2.8793) grad_norm 3.2798 (3.8023) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][270/1251] eta 0:04:39 lr 0.000158 wd 0.0500 time 0.2456 (0.2850) data time 0.0007 (0.0043) model time 0.2449 (0.2368) loss 2.8710 (2.8784) grad_norm 3.5970 (3.8040) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][280/1251] eta 0:04:35 lr 0.000158 wd 0.0500 time 0.2416 (0.2834) data time 0.0009 (0.0042) model time 0.2407 (0.2368) loss 2.7030 (2.8804) grad_norm 3.2286 
(3.8036) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][290/1251] eta 0:04:31 lr 0.000158 wd 0.0500 time 0.2347 (0.2826) data time 0.0010 (0.0041) model time 0.2337 (0.2377) loss 1.5317 (2.8769) grad_norm 4.5509 (3.8531) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][300/1251] eta 0:04:27 lr 0.000158 wd 0.0500 time 0.2340 (0.2811) data time 0.0011 (0.0040) model time 0.2329 (0.2378) loss 2.5965 (2.8677) grad_norm 4.4376 (3.8802) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][310/1251] eta 0:04:24 lr 0.000158 wd 0.0500 time 0.2413 (0.2806) data time 0.0009 (0.0039) model time 0.2404 (0.2388) loss 3.1211 (2.8659) grad_norm 4.0475 (3.8767) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][320/1251] eta 0:04:20 lr 0.000158 wd 0.0500 time 0.2326 (0.2795) data time 0.0009 (0.0038) model time 0.2317 (0.2389) loss 3.1852 (2.8726) grad_norm 2.8723 (3.8619) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][330/1251] eta 0:04:16 lr 0.000158 wd 0.0500 time 0.2484 (0.2783) data time 0.0008 (0.0037) model time 0.2476 (0.2390) loss 1.6641 (2.8722) grad_norm 3.4633 (3.8530) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][340/1251] eta 0:04:12 lr 0.000158 wd 0.0500 time 0.2377 (0.2772) data time 0.0012 (0.0037) model time 0.2365 (0.2389) loss 2.7684 (2.8746) grad_norm 2.9621 (3.8432) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][350/1251] eta 0:04:08 lr 0.000158 wd 0.0500 time 0.2369 (0.2760) data 
time 0.0007 (0.0036) model time 0.2362 (0.2388) loss 2.9283 (2.8769) grad_norm 4.2864 (3.8378) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][360/1251] eta 0:04:05 lr 0.000158 wd 0.0500 time 0.2340 (0.2750) data time 0.0007 (0.0035) model time 0.2333 (0.2388) loss 2.3967 (2.8774) grad_norm 3.6702 (3.8351) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][370/1251] eta 0:04:01 lr 0.000158 wd 0.0500 time 0.2322 (0.2740) data time 0.0009 (0.0035) model time 0.2313 (0.2387) loss 2.2184 (2.8772) grad_norm 4.8226 (3.8556) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][380/1251] eta 0:03:57 lr 0.000158 wd 0.0500 time 0.2344 (0.2731) data time 0.0008 (0.0034) model time 0.2337 (0.2387) loss 1.9273 (2.8752) grad_norm 3.7276 (3.8413) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][390/1251] eta 0:03:54 lr 0.000158 wd 0.0500 time 0.2354 (0.2722) data time 0.0008 (0.0033) model time 0.2347 (0.2386) loss 3.3184 (2.8712) grad_norm 5.2362 (3.8547) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][400/1251] eta 0:03:50 lr 0.000158 wd 0.0500 time 0.2406 (0.2714) data time 0.0009 (0.0033) model time 0.2397 (0.2386) loss 2.4923 (2.8723) grad_norm 4.2116 (3.8540) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][410/1251] eta 0:03:47 lr 0.000158 wd 0.0500 time 0.2383 (0.2706) data time 0.0010 (0.0032) model time 0.2373 (0.2386) loss 3.3092 (2.8768) grad_norm 3.9531 (3.8558) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [229/300][420/1251] eta 0:03:44 lr 0.000158 wd 0.0500 time 0.2332 (0.2699) data time 0.0007 (0.0032) model time 0.2325 (0.2386) loss 3.2205 (2.8773) grad_norm 3.1755 (3.8611) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][430/1251] eta 0:03:40 lr 0.000158 wd 0.0500 time 0.2302 (0.2691) data time 0.0009 (0.0031) model time 0.2293 (0.2386) loss 3.1416 (2.8830) grad_norm 3.1356 (3.8828) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][440/1251] eta 0:03:37 lr 0.000158 wd 0.0500 time 0.2359 (0.2684) data time 0.0008 (0.0031) model time 0.2351 (0.2385) loss 3.6414 (2.8870) grad_norm 3.5035 (3.8891) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][450/1251] eta 0:03:34 lr 0.000158 wd 0.0500 time 0.2364 (0.2677) data time 0.0010 (0.0030) model time 0.2354 (0.2385) loss 2.6398 (2.8852) grad_norm 4.1657 (3.8885) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][460/1251] eta 0:03:31 lr 0.000157 wd 0.0500 time 0.2352 (0.2670) data time 0.0011 (0.0030) model time 0.2342 (0.2384) loss 2.8972 (2.8785) grad_norm 4.9319 (3.8956) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][470/1251] eta 0:03:28 lr 0.000157 wd 0.0500 time 0.2359 (0.2664) data time 0.0012 (0.0029) model time 0.2347 (0.2383) loss 2.9282 (2.8720) grad_norm 4.4178 (3.8947) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][480/1251] eta 0:03:24 lr 0.000157 wd 0.0500 time 0.2394 (0.2658) data time 0.0007 (0.0029) model time 0.2387 (0.2383) loss 2.9825 (2.8716) grad_norm 2.5154 (3.9017) loss_scale 512.0000 
(512.0000) mem 7376MB [2024-09-01 05:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][490/1251] eta 0:03:21 lr 0.000157 wd 0.0500 time 0.2344 (0.2653) data time 0.0011 (0.0029) model time 0.2333 (0.2383) loss 3.2661 (2.8758) grad_norm 4.7617 (3.9018) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][500/1251] eta 0:03:18 lr 0.000157 wd 0.0500 time 0.2479 (0.2647) data time 0.0010 (0.0028) model time 0.2469 (0.2383) loss 2.8431 (2.8730) grad_norm 2.8776 (3.9021) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][510/1251] eta 0:03:15 lr 0.000157 wd 0.0500 time 0.2328 (0.2641) data time 0.0008 (0.0028) model time 0.2320 (0.2382) loss 3.2293 (2.8778) grad_norm 2.4401 (3.8994) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][520/1251] eta 0:03:12 lr 0.000157 wd 0.0500 time 0.2316 (0.2636) data time 0.0008 (0.0028) model time 0.2308 (0.2382) loss 2.4381 (2.8779) grad_norm 3.2869 (3.8853) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][530/1251] eta 0:03:09 lr 0.000157 wd 0.0500 time 0.2327 (0.2632) data time 0.0011 (0.0027) model time 0.2316 (0.2382) loss 3.1119 (2.8723) grad_norm 3.4840 (3.8778) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][540/1251] eta 0:03:06 lr 0.000157 wd 0.0500 time 0.2521 (0.2628) data time 0.0010 (0.0027) model time 0.2511 (0.2382) loss 2.7486 (2.8736) grad_norm 3.4627 (3.8710) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][550/1251] eta 0:03:03 lr 0.000157 wd 0.0500 time 0.2368 (0.2623) data time 0.0007 (0.0027) model 
time 0.2361 (0.2382) loss 3.3309 (2.8741) grad_norm 3.1656 (3.8763) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][560/1251] eta 0:03:00 lr 0.000157 wd 0.0500 time 0.2437 (0.2619) data time 0.0011 (0.0026) model time 0.2426 (0.2382) loss 3.1656 (2.8783) grad_norm 5.6328 (3.8935) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][570/1251] eta 0:02:58 lr 0.000157 wd 0.0500 time 0.2360 (0.2615) data time 0.0010 (0.0026) model time 0.2350 (0.2381) loss 2.8841 (2.8792) grad_norm 3.9756 (3.8950) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][580/1251] eta 0:02:55 lr 0.000157 wd 0.0500 time 0.2365 (0.2611) data time 0.0008 (0.0026) model time 0.2357 (0.2381) loss 3.1219 (2.8798) grad_norm 4.9892 (3.9028) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][590/1251] eta 0:02:52 lr 0.000157 wd 0.0500 time 0.2308 (0.2607) data time 0.0010 (0.0026) model time 0.2298 (0.2381) loss 3.1362 (2.8799) grad_norm 3.6430 (3.9031) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][600/1251] eta 0:02:49 lr 0.000157 wd 0.0500 time 0.2291 (0.2603) data time 0.0011 (0.0025) model time 0.2280 (0.2380) loss 2.2095 (2.8784) grad_norm 3.3200 (3.9746) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][610/1251] eta 0:02:46 lr 0.000157 wd 0.0500 time 0.2307 (0.2599) data time 0.0010 (0.0025) model time 0.2297 (0.2380) loss 2.7002 (2.8793) grad_norm 2.7155 (3.9684) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][620/1251] 
eta 0:02:43 lr 0.000157 wd 0.0500 time 0.2411 (0.2596) data time 0.0008 (0.0025) model time 0.2404 (0.2380) loss 2.9791 (2.8808) grad_norm 4.0181 (3.9657) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][630/1251] eta 0:02:41 lr 0.000157 wd 0.0500 time 0.2456 (0.2593) data time 0.0011 (0.0025) model time 0.2446 (0.2380) loss 3.3120 (2.8827) grad_norm 4.6414 (3.9568) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][640/1251] eta 0:02:38 lr 0.000157 wd 0.0500 time 0.2382 (0.2590) data time 0.0009 (0.0024) model time 0.2373 (0.2381) loss 1.8134 (2.8805) grad_norm 3.3201 (3.9569) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][650/1251] eta 0:02:35 lr 0.000157 wd 0.0500 time 0.2311 (0.2586) data time 0.0008 (0.0024) model time 0.2303 (0.2380) loss 3.4845 (2.8805) grad_norm 6.5774 (3.9575) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][660/1251] eta 0:02:32 lr 0.000157 wd 0.0500 time 0.2484 (0.2583) data time 0.0007 (0.0024) model time 0.2477 (0.2380) loss 1.8352 (2.8754) grad_norm 3.4311 (3.9517) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][670/1251] eta 0:02:29 lr 0.000157 wd 0.0500 time 0.2332 (0.2580) data time 0.0008 (0.0024) model time 0.2324 (0.2380) loss 3.4714 (2.8787) grad_norm 3.5366 (3.9460) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][680/1251] eta 0:02:27 lr 0.000157 wd 0.0500 time 0.2402 (0.2577) data time 0.0008 (0.0024) model time 0.2394 (0.2380) loss 3.3356 (2.8797) grad_norm 3.8448 (3.9755) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 
05:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][690/1251] eta 0:02:24 lr 0.000157 wd 0.0500 time 0.2335 (0.2574) data time 0.0011 (0.0023) model time 0.2325 (0.2379) loss 2.2749 (2.8775) grad_norm 3.3336 (3.9703) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][700/1251] eta 0:02:21 lr 0.000157 wd 0.0500 time 0.2355 (0.2571) data time 0.0011 (0.0023) model time 0.2344 (0.2379) loss 3.2482 (2.8762) grad_norm 3.0028 (3.9612) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][710/1251] eta 0:02:18 lr 0.000157 wd 0.0500 time 0.2328 (0.2569) data time 0.0007 (0.0023) model time 0.2321 (0.2379) loss 1.6528 (2.8749) grad_norm 2.7462 (3.9544) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][720/1251] eta 0:02:16 lr 0.000157 wd 0.0500 time 0.2394 (0.2566) data time 0.0007 (0.0023) model time 0.2386 (0.2379) loss 2.9760 (2.8723) grad_norm 3.6086 (3.9473) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][730/1251] eta 0:02:13 lr 0.000157 wd 0.0500 time 0.2359 (0.2563) data time 0.0009 (0.0023) model time 0.2349 (0.2378) loss 3.1273 (2.8735) grad_norm 2.6126 (3.9410) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][740/1251] eta 0:02:10 lr 0.000157 wd 0.0500 time 0.2394 (0.2560) data time 0.0009 (0.0023) model time 0.2385 (0.2378) loss 3.1540 (2.8780) grad_norm 4.1243 (3.9654) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][750/1251] eta 0:02:08 lr 0.000157 wd 0.0500 time 0.2388 (0.2558) data time 0.0010 (0.0022) model time 0.2378 (0.2378) loss 3.0009 
(2.8767) grad_norm 4.5235 (3.9744) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][760/1251] eta 0:02:05 lr 0.000157 wd 0.0500 time 0.2347 (0.2556) data time 0.0010 (0.0022) model time 0.2337 (0.2378) loss 3.0803 (2.8760) grad_norm 8.8610 (3.9925) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][770/1251] eta 0:02:02 lr 0.000156 wd 0.0500 time 0.2323 (0.2554) data time 0.0009 (0.0022) model time 0.2314 (0.2378) loss 3.1752 (2.8774) grad_norm 3.7601 (4.0083) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][780/1251] eta 0:02:00 lr 0.000156 wd 0.0500 time 0.2359 (0.2552) data time 0.0008 (0.0022) model time 0.2351 (0.2378) loss 3.7092 (2.8790) grad_norm 3.7433 (4.0076) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][790/1251] eta 0:01:57 lr 0.000156 wd 0.0500 time 0.2343 (0.2550) data time 0.0010 (0.0022) model time 0.2333 (0.2378) loss 2.7605 (2.8779) grad_norm 3.8896 (4.0049) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][800/1251] eta 0:01:54 lr 0.000156 wd 0.0500 time 0.2407 (0.2547) data time 0.0008 (0.0022) model time 0.2399 (0.2378) loss 3.0236 (2.8780) grad_norm 3.2788 (4.0068) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][810/1251] eta 0:01:52 lr 0.000156 wd 0.0500 time 0.2442 (0.2545) data time 0.0007 (0.0021) model time 0.2435 (0.2378) loss 2.0348 (2.8725) grad_norm 3.4248 (4.0017) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][820/1251] eta 0:01:49 lr 0.000156 wd 0.0500 
time 0.2343 (0.2546) data time 0.0007 (0.0021) model time 0.2336 (0.2381) loss 3.0771 (2.8735) grad_norm 5.2014 (3.9986) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][830/1251] eta 0:01:47 lr 0.000156 wd 0.0500 time 0.2369 (0.2546) data time 0.0010 (0.0021) model time 0.2359 (0.2383) loss 2.0561 (2.8699) grad_norm 5.0430 (3.9933) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][840/1251] eta 0:01:44 lr 0.000156 wd 0.0500 time 0.2476 (0.2544) data time 0.0007 (0.0021) model time 0.2469 (0.2382) loss 3.1504 (2.8699) grad_norm 4.4742 (3.9943) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][850/1251] eta 0:01:41 lr 0.000156 wd 0.0500 time 0.2343 (0.2542) data time 0.0007 (0.0021) model time 0.2336 (0.2382) loss 3.4453 (2.8686) grad_norm 3.0118 (3.9890) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][860/1251] eta 0:01:39 lr 0.000156 wd 0.0500 time 0.2366 (0.2540) data time 0.0010 (0.0021) model time 0.2356 (0.2382) loss 3.2105 (2.8689) grad_norm 2.9273 (3.9827) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][870/1251] eta 0:01:36 lr 0.000156 wd 0.0500 time 0.2380 (0.2538) data time 0.0007 (0.0021) model time 0.2373 (0.2382) loss 2.7836 (2.8706) grad_norm 4.3545 (3.9827) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][880/1251] eta 0:01:34 lr 0.000156 wd 0.0500 time 0.2418 (0.2537) data time 0.0010 (0.0021) model time 0.2408 (0.2382) loss 3.3432 (2.8690) grad_norm 3.4094 (3.9926) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:31 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [229/300][890/1251] eta 0:01:31 lr 0.000156 wd 0.0500 time 0.2447 (0.2535) data time 0.0010 (0.0020) model time 0.2437 (0.2382) loss 2.1442 (2.8677) grad_norm 3.9188 (3.9964) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][900/1251] eta 0:01:28 lr 0.000156 wd 0.0500 time 0.2279 (0.2533) data time 0.0011 (0.0020) model time 0.2269 (0.2381) loss 2.9128 (2.8672) grad_norm 2.7620 (3.9898) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][910/1251] eta 0:01:26 lr 0.000156 wd 0.0500 time 0.2313 (0.2531) data time 0.0009 (0.0020) model time 0.2304 (0.2381) loss 2.4306 (2.8682) grad_norm 2.6504 (3.9843) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][920/1251] eta 0:01:23 lr 0.000156 wd 0.0500 time 0.2384 (0.2529) data time 0.0009 (0.0020) model time 0.2375 (0.2381) loss 3.3715 (2.8697) grad_norm 3.3922 (3.9828) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][930/1251] eta 0:01:21 lr 0.000156 wd 0.0500 time 0.2360 (0.2528) data time 0.0010 (0.0020) model time 0.2350 (0.2381) loss 3.0040 (2.8711) grad_norm 2.4953 (3.9859) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][940/1251] eta 0:01:18 lr 0.000156 wd 0.0500 time 0.2400 (0.2526) data time 0.0009 (0.0020) model time 0.2391 (0.2380) loss 3.0125 (2.8708) grad_norm 5.4609 (3.9857) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][950/1251] eta 0:01:15 lr 0.000156 wd 0.0500 time 0.2392 (0.2524) data time 0.0010 (0.0020) model time 0.2382 (0.2380) loss 2.3343 (2.8665) grad_norm 2.9942 
(3.9891) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][960/1251] eta 0:01:13 lr 0.000156 wd 0.0500 time 0.2379 (0.2523) data time 0.0010 (0.0020) model time 0.2369 (0.2380) loss 3.4490 (2.8662) grad_norm 6.1435 (3.9937) loss_scale 512.0000 (512.0000) mem 7376MB [2024-09-01 05:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 05:11:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:11:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 05:15:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:15:22 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 05:15:30 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 05:15:31 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 05:15:32 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 05:15:32 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 229) [2024-09-01 05:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][970/1251] eta 0:07:55 lr 0.000156 wd 0.0500 time 0.2205 (1.6911) data time 0.0006 (0.0957) model time 0.2198 (1.5955) loss 2.9424 (3.1957) grad_norm 3.0418 (3.5907) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][980/1251] eta 0:03:57 lr 0.000156 wd 0.0500 time 0.2212 (0.8771) data time 0.0007 (0.0431) model time 0.2205 (0.8340) loss 3.0353 (3.0789) grad_norm 2.8260 (3.5973) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][990/1251] eta 0:02:48 lr 0.000156 wd 0.0500 time 0.2328 (0.6447) data time 0.0009 (0.0280) model time 0.2319 (0.6167) loss 3.2301 (3.1115) grad_norm 3.3791 (3.7240) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1000/1251] eta 0:02:14 lr 0.000156 wd 0.0500 time 0.2313 (0.5353) data time 0.0009 (0.0209) model time 0.2304 (0.5144) loss 3.3342 (3.0741) grad_norm 3.6001 (3.7987) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1010/1251] eta 0:01:53 lr 0.000156 wd 0.0500 time 0.2270 (0.4709) data time 0.0008 (0.0168) model time 0.2262 (0.4542) loss 3.3419 (3.0597) grad_norm 4.5085 (4.1904) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[229/300][1020/1251] eta 0:01:39 lr 0.000156 wd 0.0500 time 0.2219 (0.4286) data time 0.0007 (0.0141) model time 0.2211 (0.4145) loss 2.0891 (3.0140) grad_norm 2.6265 (4.1957) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1030/1251] eta 0:01:28 lr 0.000156 wd 0.0500 time 0.2242 (0.3985) data time 0.0008 (0.0121) model time 0.2234 (0.3864) loss 2.3382 (2.9874) grad_norm 3.8597 (4.4338) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1040/1251] eta 0:01:19 lr 0.000156 wd 0.0500 time 0.2306 (0.3763) data time 0.0007 (0.0107) model time 0.2300 (0.3656) loss 2.1865 (2.9629) grad_norm 3.7329 (4.4484) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1050/1251] eta 0:01:12 lr 0.000156 wd 0.0500 time 0.2258 (0.3591) data time 0.0010 (0.0096) model time 0.2248 (0.3495) loss 3.4010 (2.9449) grad_norm 3.2993 (4.3699) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1060/1251] eta 0:01:06 lr 0.000156 wd 0.0500 time 0.2243 (0.3456) data time 0.0007 (0.0087) model time 0.2237 (0.3368) loss 3.2562 (2.9433) grad_norm 3.2719 (4.3615) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1070/1251] eta 0:01:00 lr 0.000156 wd 0.0500 time 0.2301 (0.3347) data time 0.0008 (0.0080) model time 0.2293 (0.3266) loss 2.4006 (2.9450) grad_norm 3.8985 (4.3193) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1080/1251] eta 0:00:55 lr 0.000156 wd 0.0500 time 0.2314 (0.3257) data time 0.0008 (0.0074) model time 0.2306 (0.3182) loss 2.7725 (2.9508) grad_norm 4.8331 (4.2790) loss_scale 512.0000 
(512.0000) mem 7377MB [2024-09-01 05:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1090/1251] eta 0:00:51 lr 0.000155 wd 0.0500 time 0.2250 (0.3179) data time 0.0008 (0.0069) model time 0.2241 (0.3110) loss 3.1848 (2.9407) grad_norm 4.3507 (4.2360) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1100/1251] eta 0:00:46 lr 0.000155 wd 0.0500 time 0.2216 (0.3112) data time 0.0008 (0.0065) model time 0.2208 (0.3047) loss 2.8707 (2.9365) grad_norm 4.8263 (4.2445) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1110/1251] eta 0:00:43 lr 0.000155 wd 0.0500 time 0.2271 (0.3055) data time 0.0008 (0.0061) model time 0.2263 (0.2993) loss 3.2331 (2.9344) grad_norm 4.1026 (4.2710) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1120/1251] eta 0:00:39 lr 0.000155 wd 0.0500 time 0.2244 (0.3005) data time 0.0007 (0.0058) model time 0.2236 (0.2946) loss 2.3804 (2.9305) grad_norm 3.6453 (4.3354) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 05:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1130/1251] eta 0:00:35 lr 0.000155 wd 0.0500 time 0.2262 (0.2961) data time 0.0008 (0.0055) model time 0.2254 (0.2906) loss 3.6812 (2.9359) grad_norm 4.9467 (4.3142) loss_scale 1024.0000 (530.2857) mem 7377MB [2024-09-01 05:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1140/1251] eta 0:00:32 lr 0.000155 wd 0.0500 time 0.2235 (0.2921) data time 0.0008 (0.0053) model time 0.2227 (0.2868) loss 2.2989 (2.9170) grad_norm 4.4049 (4.2803) loss_scale 1024.0000 (558.0225) mem 7377MB [2024-09-01 05:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1150/1251] eta 0:00:29 lr 0.000155 wd 0.0500 time 0.2222 (0.2884) data time 0.0008 
(0.0051) model time 0.2215 (0.2834) loss 3.5916 (2.9162) grad_norm 3.4724 (4.2874) loss_scale 1024.0000 (582.8085) mem 7377MB [2024-09-01 05:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1160/1251] eta 0:00:25 lr 0.000155 wd 0.0500 time 0.2270 (0.2854) data time 0.0010 (0.0048) model time 0.2260 (0.2805) loss 2.6189 (2.9111) grad_norm 2.7184 (4.2424) loss_scale 1024.0000 (605.0909) mem 7377MB [2024-09-01 05:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1170/1251] eta 0:00:22 lr 0.000155 wd 0.0500 time 0.2270 (0.2825) data time 0.0008 (0.0047) model time 0.2262 (0.2778) loss 3.0568 (2.9016) grad_norm 3.4681 (4.2124) loss_scale 1024.0000 (625.2308) mem 7377MB [2024-09-01 05:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1180/1251] eta 0:00:19 lr 0.000155 wd 0.0500 time 0.2267 (0.2799) data time 0.0008 (0.0045) model time 0.2259 (0.2754) loss 2.7362 (2.8936) grad_norm 2.6715 (4.1953) loss_scale 1024.0000 (643.5229) mem 7377MB [2024-09-01 05:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1190/1251] eta 0:00:16 lr 0.000155 wd 0.0500 time 0.2305 (0.2775) data time 0.0007 (0.0043) model time 0.2298 (0.2732) loss 3.3249 (2.9007) grad_norm 5.6410 (4.1931) loss_scale 1024.0000 (660.2105) mem 7377MB [2024-09-01 05:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1200/1251] eta 0:00:14 lr 0.000155 wd 0.0500 time 0.2180 (0.2754) data time 0.0006 (0.0042) model time 0.2174 (0.2712) loss 3.0497 (2.8980) grad_norm 4.2119 (4.1704) loss_scale 1024.0000 (675.4958) mem 7377MB [2024-09-01 05:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1210/1251] eta 0:00:11 lr 0.000155 wd 0.0500 time 0.2242 (0.2735) data time 0.0010 (0.0041) model time 0.2232 (0.2694) loss 2.2023 (2.8888) grad_norm 3.7728 (4.1455) loss_scale 1024.0000 (689.5484) mem 7377MB [2024-09-01 05:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [229/300][1220/1251] eta 0:00:08 lr 0.000155 wd 0.0500 time 0.2226 (0.2718) data time 0.0010 (0.0039) model time 0.2217 (0.2678) loss 3.1028 (2.8812) grad_norm 3.4211 (4.1263) loss_scale 1024.0000 (702.5116) mem 7377MB [2024-09-01 05:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1230/1251] eta 0:00:05 lr 0.000155 wd 0.0500 time 0.2252 (0.2701) data time 0.0008 (0.0038) model time 0.2245 (0.2662) loss 3.1037 (2.8756) grad_norm 3.1732 (4.0985) loss_scale 1024.0000 (714.5075) mem 7377MB [2024-09-01 05:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1240/1251] eta 0:00:02 lr 0.000155 wd 0.0500 time 0.2158 (0.2683) data time 0.0006 (0.0037) model time 0.2152 (0.2646) loss 2.1511 (2.8758) grad_norm 3.3530 (4.0753) loss_scale 1024.0000 (725.6403) mem 7377MB [2024-09-01 05:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [229/300][1250/1251] eta 0:00:00 lr 0.000155 wd 0.0500 time 0.2160 (0.2673) data time 0.0004 (0.0036) model time 0.2155 (0.2636) loss 3.5565 (2.8765) grad_norm 2.8946 (4.0622) loss_scale 1024.0000 (736.0000) mem 7377MB [2024-09-01 05:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 229 training takes 0:01:16 [2024-09-01 05:16:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:16:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 05:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.365 (0.365) Loss 0.3970 (0.3970) Acc@1 92.480 (92.480) Acc@5 98.633 (98.633) Mem 7377MB [2024-09-01 05:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.096) Loss 0.5732 (0.6219) Acc@1 89.062 (86.737) Acc@5 97.754 (97.638) Mem 7377MB [2024-09-01 05:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.086) Loss 0.9595 (0.6561) Acc@1 77.148 (85.742) Acc@5 94.727 (97.489) Mem 7377MB [2024-09-01 05:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.081) Loss 1.1064 (0.7485) Acc@1 74.609 (83.512) Acc@5 92.871 (96.522) Mem 7377MB [2024-09-01 05:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.077) Loss 1.0381 (0.7995) Acc@1 76.270 (82.181) Acc@5 93.359 (96.010) Mem 7377MB [2024-09-01 05:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.748 Acc@5 95.942 [2024-09-01 05:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.7% [2024-09-01 05:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.734 (0.734) Loss 0.3813 (0.3813) Acc@1 92.969 (92.969) Acc@5 98.242 (98.242) Mem 7377MB [2024-09-01 05:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.138) Loss 0.5771 (0.6007) Acc@1 89.453 (87.473) Acc@5 97.754 (97.789) Mem 7377MB [2024-09-01 05:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.067 (0.106) Loss 0.8726 (0.6289) Acc@1 78.223 (86.403) Acc@5 95.898 (97.712) Mem 7377MB [2024-09-01 05:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.095) Loss 1.0879 (0.7129) Acc@1 73.926 (84.315) Acc@5 93.262 (96.812) Mem 7377MB [2024-09-01 05:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.087) Loss 0.9756 
(0.7572) Acc@1 76.953 (83.182) Acc@5 94.824 (96.368) Mem 7377MB [2024-09-01 05:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.760 Acc@5 96.322 [2024-09-01 05:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 05:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.76% [2024-09-01 05:17:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 05:17:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 05:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][0/1251] eta 0:15:56 lr 0.000155 wd 0.0500 time 0.7647 (0.7647) data time 0.4849 (0.4849) model time 0.0000 (0.0000) loss 3.1115 (3.1115) grad_norm 2.6305 (2.6305) loss_scale 1024.0000 (1024.0000) mem 7380MB [2024-09-01 05:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][10/1251] eta 0:05:42 lr 0.000155 wd 0.0500 time 0.2358 (0.2761) data time 0.0009 (0.0450) model time 0.0000 (0.0000) loss 3.3025 (2.5930) grad_norm 4.1628 (4.0960) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 05:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][20/1251] eta 0:05:09 lr 0.000155 wd 0.0500 time 0.2233 (0.2518) data time 0.0008 (0.0240) model time 0.0000 (0.0000) loss 3.0507 (2.6681) grad_norm 5.3546 (4.4437) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 05:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 05:17:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 05:17:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 05:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:19:57 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:23:16 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:27:34 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 05:27:45 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 05:27:46 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 05:27:48 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 05:27:48 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 230) [2024-09-01 05:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][30/1251] eta 0:36:13 lr 0.000155 wd 0.0500 time 0.2273 (1.7800) data time 0.0007 (0.1685) model time 0.0000 (0.0000) loss 2.9820 (3.2827) grad_norm 6.0311 (4.6516) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][40/1251] eta 0:19:26 lr 0.000155 wd 0.0500 time 0.2315 (0.9635) data time 0.0014 (0.0805) model time 0.0000 (0.0000) loss 2.9402 (3.0969) grad_norm 6.1231 (4.8767) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][50/1251] eta 0:14:13 lr 0.000155 wd 0.0500 time 0.2231 (0.7106) data time 0.0007 (0.0531) model time 0.0000 (0.0000) loss 3.3854 (3.1805) grad_norm 4.9748 (4.7227) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][60/1251] eta 0:11:38 lr 0.000155 wd 0.0500 time 0.2313 (0.5868) data time 0.0009 (0.0397) model time 0.2304 (0.2266) loss 3.0639 (3.1270) grad_norm 3.2307 (4.5979) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][70/1251] eta 0:10:07 lr 0.000155 wd 0.0500 time 0.2416 (0.5142) data time 0.0009 (0.0318) model time 0.2406 (0.2283) loss 2.9085 (3.0724) grad_norm 3.4937 (4.4399) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[230/300][80/1251] eta 0:09:05 lr 0.000155 wd 0.0500 time 0.2210 (0.4656) data time 0.0008 (0.0266) model time 0.2203 (0.2278) loss 2.1530 (3.0338) grad_norm 8.4478 (4.3943) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][90/1251] eta 0:08:20 lr 0.000155 wd 0.0500 time 0.2222 (0.4309) data time 0.0009 (0.0229) model time 0.2213 (0.2271) loss 2.9642 (3.0184) grad_norm 6.0890 (4.3735) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][100/1251] eta 0:07:46 lr 0.000155 wd 0.0500 time 0.2352 (0.4053) data time 0.0009 (0.0201) model time 0.2343 (0.2272) loss 3.0661 (2.9961) grad_norm 4.9755 (4.4515) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][110/1251] eta 0:07:19 lr 0.000155 wd 0.0500 time 0.2289 (0.3853) data time 0.0006 (0.0180) model time 0.2283 (0.2270) loss 2.8965 (2.9696) grad_norm 3.0874 (4.3751) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][120/1251] eta 0:06:58 lr 0.000155 wd 0.0500 time 0.2232 (0.3697) data time 0.0011 (0.0163) model time 0.2221 (0.2274) loss 3.1428 (2.9710) grad_norm 2.2968 (4.2857) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][130/1251] eta 0:06:40 lr 0.000155 wd 0.0500 time 0.2295 (0.3569) data time 0.0007 (0.0149) model time 0.2288 (0.2277) loss 3.7084 (2.9799) grad_norm 2.7412 (4.1841) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][140/1251] eta 0:06:24 lr 0.000155 wd 0.0500 time 0.2297 (0.3462) data time 0.0009 (0.0137) model time 0.2289 (0.2278) loss 3.0886 (2.9765) grad_norm 3.4931 (4.1213) loss_scale 1024.0000 
(1024.0000) mem 7378MB [2024-09-01 05:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][150/1251] eta 0:06:11 lr 0.000155 wd 0.0500 time 0.2227 (0.3373) data time 0.0012 (0.0127) model time 0.2214 (0.2280) loss 2.5817 (2.9630) grad_norm 3.4084 (4.0623) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][160/1251] eta 0:05:59 lr 0.000154 wd 0.0500 time 0.2448 (0.3296) data time 0.0009 (0.0119) model time 0.2439 (0.2281) loss 3.9041 (2.9630) grad_norm 4.0237 (4.0494) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][170/1251] eta 0:05:48 lr 0.000154 wd 0.0500 time 0.2276 (0.3225) data time 0.0007 (0.0111) model time 0.2269 (0.2277) loss 2.5865 (2.9495) grad_norm 6.9497 (4.1915) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][180/1251] eta 0:05:39 lr 0.000154 wd 0.0500 time 0.2268 (0.3166) data time 0.0012 (0.0105) model time 0.2256 (0.2276) loss 3.4917 (2.9451) grad_norm 4.5112 (4.1798) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][190/1251] eta 0:05:30 lr 0.000154 wd 0.0500 time 0.2276 (0.3114) data time 0.0009 (0.0100) model time 0.2267 (0.2277) loss 2.9825 (2.9456) grad_norm 4.0407 (4.1638) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][200/1251] eta 0:05:22 lr 0.000154 wd 0.0500 time 0.2301 (0.3068) data time 0.0011 (0.0095) model time 0.2290 (0.2277) loss 2.6900 (2.9307) grad_norm 2.9696 (4.1249) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][210/1251] eta 0:05:15 lr 0.000154 wd 0.0500 time 0.2203 (0.3027) data time 0.0007 
(0.0090) model time 0.2197 (0.2277) loss 3.5148 (2.9299) grad_norm 4.0949 (4.1487) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][220/1251] eta 0:05:08 lr 0.000154 wd 0.0500 time 0.2299 (0.2989) data time 0.0010 (0.0086) model time 0.2288 (0.2277) loss 2.2111 (2.9119) grad_norm 4.5343 (4.1506) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][230/1251] eta 0:05:01 lr 0.000154 wd 0.0500 time 0.2253 (0.2955) data time 0.0006 (0.0082) model time 0.2247 (0.2276) loss 3.0827 (2.9072) grad_norm 4.2535 (4.1323) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][240/1251] eta 0:04:55 lr 0.000154 wd 0.0500 time 0.2187 (0.2923) data time 0.0008 (0.0079) model time 0.2179 (0.2275) loss 3.2994 (2.8974) grad_norm 2.8960 (4.1208) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][250/1251] eta 0:04:49 lr 0.000154 wd 0.0500 time 0.2300 (0.2896) data time 0.0006 (0.0076) model time 0.2294 (0.2275) loss 2.3017 (2.8979) grad_norm 3.1062 (4.0853) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][260/1251] eta 0:04:44 lr 0.000154 wd 0.0500 time 0.2260 (0.2870) data time 0.0011 (0.0073) model time 0.2249 (0.2275) loss 2.2239 (2.8903) grad_norm 5.6159 (4.0976) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][270/1251] eta 0:04:39 lr 0.000154 wd 0.0500 time 0.2285 (0.2845) data time 0.0010 (0.0071) model time 0.2276 (0.2274) loss 2.5602 (2.8847) grad_norm 3.5922 (4.0984) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [230/300][280/1251] eta 0:04:34 lr 0.000154 wd 0.0500 time 0.2214 (0.2824) data time 0.0011 (0.0068) model time 0.2202 (0.2274) loss 3.3692 (2.8810) grad_norm 3.2551 (4.0859) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][290/1251] eta 0:04:29 lr 0.000154 wd 0.0500 time 0.2315 (0.2803) data time 0.0007 (0.0066) model time 0.2308 (0.2273) loss 1.8510 (2.8714) grad_norm 3.1576 (4.0953) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][300/1251] eta 0:04:24 lr 0.000154 wd 0.0500 time 0.2252 (0.2784) data time 0.0011 (0.0064) model time 0.2241 (0.2272) loss 2.9000 (2.8764) grad_norm 5.2260 (4.1086) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][310/1251] eta 0:04:21 lr 0.000154 wd 0.0500 time 0.2222 (0.2774) data time 0.0009 (0.0062) model time 0.2213 (0.2281) loss 2.7714 (2.8740) grad_norm 3.4140 (4.1002) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][320/1251] eta 0:04:16 lr 0.000154 wd 0.0500 time 0.2296 (0.2758) data time 0.0013 (0.0061) model time 0.2282 (0.2280) loss 2.8410 (2.8630) grad_norm 4.0975 (4.1136) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][330/1251] eta 0:04:13 lr 0.000154 wd 0.0500 time 0.2285 (0.2751) data time 0.0012 (0.0059) model time 0.2273 (0.2289) loss 3.0833 (2.8588) grad_norm 3.7746 (4.1275) loss_scale 1024.0000 (1024.0000) mem 7378MB [2024-09-01 05:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][340/1251] eta 0:04:09 lr 0.000154 wd 0.0500 time 0.2170 (0.2736) data time 0.0007 (0.0058) model time 0.2164 (0.2288) loss 3.3007 (2.8675) grad_norm inf (inf) loss_scale 
512.0000 (1022.3950) mem 7378MB [2024-09-01 05:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][350/1251] eta 0:04:05 lr 0.000154 wd 0.0500 time 0.2297 (0.2722) data time 0.0008 (0.0056) model time 0.2289 (0.2288) loss 1.4692 (2.8688) grad_norm 3.3038 (inf) loss_scale 512.0000 (1006.8815) mem 7378MB [2024-09-01 05:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][360/1251] eta 0:04:01 lr 0.000154 wd 0.0500 time 0.2240 (0.2710) data time 0.0009 (0.0055) model time 0.2231 (0.2288) loss 2.4706 (2.8693) grad_norm 5.5508 (inf) loss_scale 512.0000 (992.2832) mem 7378MB [2024-09-01 05:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][370/1251] eta 0:03:57 lr 0.000154 wd 0.0500 time 0.2336 (0.2698) data time 0.0007 (0.0053) model time 0.2329 (0.2288) loss 3.2295 (2.8735) grad_norm 2.8434 (inf) loss_scale 512.0000 (978.5215) mem 7378MB [2024-09-01 05:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][380/1251] eta 0:03:53 lr 0.000154 wd 0.0500 time 0.2170 (0.2686) data time 0.0007 (0.0052) model time 0.2163 (0.2288) loss 2.4163 (2.8715) grad_norm 4.1053 (inf) loss_scale 512.0000 (965.5265) mem 7378MB [2024-09-01 05:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][390/1251] eta 0:03:50 lr 0.000154 wd 0.0500 time 0.2255 (0.2675) data time 0.0008 (0.0051) model time 0.2247 (0.2287) loss 3.0947 (2.8696) grad_norm 3.0184 (inf) loss_scale 512.0000 (953.2358) mem 7378MB [2024-09-01 05:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][400/1251] eta 0:03:46 lr 0.000154 wd 0.0500 time 0.2215 (0.2666) data time 0.0008 (0.0050) model time 0.2207 (0.2288) loss 2.8704 (2.8716) grad_norm 3.1066 (inf) loss_scale 512.0000 (941.5937) mem 7378MB [2024-09-01 05:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][410/1251] eta 0:03:43 lr 0.000154 wd 0.0500 time 0.2341 (0.2656) data time 0.0007 (0.0049) model time 
0.2334 (0.2287) loss 3.4124 (2.8663) grad_norm 3.1436 (inf) loss_scale 512.0000 (930.5501) mem 7378MB [2024-09-01 05:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][420/1251] eta 0:03:39 lr 0.000154 wd 0.0500 time 0.2241 (0.2646) data time 0.0007 (0.0048) model time 0.2234 (0.2286) loss 3.3032 (2.8714) grad_norm 3.8556 (inf) loss_scale 512.0000 (920.0602) mem 7378MB [2024-09-01 05:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][430/1251] eta 0:03:36 lr 0.000154 wd 0.0500 time 0.2255 (0.2637) data time 0.0009 (0.0047) model time 0.2246 (0.2286) loss 2.7538 (2.8747) grad_norm 4.2896 (inf) loss_scale 512.0000 (910.0831) mem 7378MB [2024-09-01 05:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][440/1251] eta 0:03:33 lr 0.000154 wd 0.0500 time 0.2312 (0.2629) data time 0.0010 (0.0046) model time 0.2302 (0.2286) loss 2.0358 (2.8733) grad_norm 3.5782 (inf) loss_scale 512.0000 (900.5823) mem 7378MB [2024-09-01 05:29:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][450/1251] eta 0:03:29 lr 0.000154 wd 0.0500 time 0.2347 (0.2621) data time 0.0006 (0.0045) model time 0.2340 (0.2286) loss 2.7537 (2.8781) grad_norm 4.9413 (inf) loss_scale 512.0000 (891.5245) mem 7378MB [2024-09-01 05:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][460/1251] eta 0:03:26 lr 0.000154 wd 0.0500 time 0.2304 (0.2613) data time 0.0011 (0.0044) model time 0.2294 (0.2285) loss 2.8698 (2.8832) grad_norm 3.2721 (inf) loss_scale 512.0000 (882.8793) mem 7378MB [2024-09-01 05:29:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][470/1251] eta 0:03:23 lr 0.000154 wd 0.0500 time 0.2308 (0.2606) data time 0.0009 (0.0044) model time 0.2299 (0.2285) loss 3.0570 (2.8845) grad_norm 3.9701 (inf) loss_scale 512.0000 (874.6192) mem 7378MB [2024-09-01 05:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][480/1251] eta 0:03:20 lr 0.000153 wd 
0.0500 time 0.2239 (0.2600) data time 0.0013 (0.0043) model time 0.2225 (0.2286) loss 2.5500 (2.8783) grad_norm 4.1915 (inf) loss_scale 512.0000 (866.7190) mem 7378MB [2024-09-01 05:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][490/1251] eta 0:03:17 lr 0.000153 wd 0.0500 time 0.2309 (0.2593) data time 0.0007 (0.0042) model time 0.2302 (0.2285) loss 2.1203 (2.8726) grad_norm 3.6987 (inf) loss_scale 512.0000 (859.1557) mem 7378MB [2024-09-01 05:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][500/1251] eta 0:03:14 lr 0.000153 wd 0.0500 time 0.2254 (0.2586) data time 0.0009 (0.0042) model time 0.2244 (0.2285) loss 3.3013 (2.8724) grad_norm 6.0876 (inf) loss_scale 512.0000 (851.9081) mem 7378MB [2024-09-01 05:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][510/1251] eta 0:03:11 lr 0.000153 wd 0.0500 time 0.2315 (0.2581) data time 0.0009 (0.0041) model time 0.2306 (0.2285) loss 2.7912 (2.8763) grad_norm 2.9807 (inf) loss_scale 512.0000 (844.9571) mem 7378MB [2024-09-01 05:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][520/1251] eta 0:03:08 lr 0.000153 wd 0.0500 time 0.2259 (0.2575) data time 0.0007 (0.0040) model time 0.2252 (0.2285) loss 2.7783 (2.8772) grad_norm 3.6191 (inf) loss_scale 512.0000 (838.2846) mem 7378MB [2024-09-01 05:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][530/1251] eta 0:03:05 lr 0.000153 wd 0.0500 time 0.2290 (0.2569) data time 0.0007 (0.0040) model time 0.2283 (0.2285) loss 3.5999 (2.8814) grad_norm 4.1153 (inf) loss_scale 512.0000 (831.8743) mem 7378MB [2024-09-01 05:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][540/1251] eta 0:03:02 lr 0.000153 wd 0.0500 time 0.2261 (0.2563) data time 0.0009 (0.0039) model time 0.2252 (0.2284) loss 2.1830 (2.8831) grad_norm 3.5792 (inf) loss_scale 512.0000 (825.7110) mem 7378MB [2024-09-01 05:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [230/300][550/1251] eta 0:02:59 lr 0.000153 wd 0.0500 time 0.2220 (0.2558) data time 0.0009 (0.0039) model time 0.2211 (0.2284) loss 3.4297 (2.8789) grad_norm 2.3589 (inf) loss_scale 512.0000 (819.7807) mem 7378MB [2024-09-01 05:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][560/1251] eta 0:02:56 lr 0.000153 wd 0.0500 time 0.2249 (0.2552) data time 0.0009 (0.0038) model time 0.2240 (0.2283) loss 3.5158 (2.8775) grad_norm 2.5040 (inf) loss_scale 512.0000 (814.0705) mem 7378MB [2024-09-01 05:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][570/1251] eta 0:02:53 lr 0.000153 wd 0.0500 time 0.2255 (0.2547) data time 0.0007 (0.0038) model time 0.2248 (0.2283) loss 3.0144 (2.8791) grad_norm 3.9066 (inf) loss_scale 512.0000 (808.5683) mem 7378MB [2024-09-01 05:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][580/1251] eta 0:02:50 lr 0.000153 wd 0.0500 time 0.2317 (0.2543) data time 0.0006 (0.0037) model time 0.2311 (0.2283) loss 3.1547 (2.8802) grad_norm 5.0244 (inf) loss_scale 512.0000 (803.2630) mem 7378MB [2024-09-01 05:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][590/1251] eta 0:02:47 lr 0.000153 wd 0.0500 time 0.2271 (0.2538) data time 0.0012 (0.0037) model time 0.2259 (0.2283) loss 2.5897 (2.8813) grad_norm 3.9356 (inf) loss_scale 512.0000 (798.1441) mem 7378MB [2024-09-01 05:30:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][600/1251] eta 0:02:44 lr 0.000153 wd 0.0500 time 0.2244 (0.2534) data time 0.0006 (0.0036) model time 0.2237 (0.2283) loss 3.3189 (2.8817) grad_norm 3.4072 (inf) loss_scale 512.0000 (793.2021) mem 7378MB [2024-09-01 05:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][610/1251] eta 0:02:42 lr 0.000153 wd 0.0500 time 0.2262 (0.2529) data time 0.0009 (0.0036) model time 0.2253 (0.2282) loss 2.6064 (2.8831) grad_norm 5.0002 (inf) loss_scale 512.0000 (788.4278) mem 
7378MB [2024-09-01 05:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][620/1251] eta 0:02:39 lr 0.000153 wd 0.0500 time 0.2267 (0.2525) data time 0.0012 (0.0035) model time 0.2255 (0.2282) loss 2.6937 (2.8839) grad_norm 3.3727 (inf) loss_scale 512.0000 (783.8130) mem 7378MB [2024-09-01 05:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][630/1251] eta 0:02:36 lr 0.000153 wd 0.0500 time 0.2237 (0.2521) data time 0.0010 (0.0035) model time 0.2227 (0.2282) loss 3.0570 (2.8833) grad_norm 3.6492 (inf) loss_scale 512.0000 (779.3498) mem 7378MB [2024-09-01 05:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][640/1251] eta 0:02:33 lr 0.000153 wd 0.0500 time 0.2214 (0.2517) data time 0.0012 (0.0034) model time 0.2202 (0.2281) loss 3.0144 (2.8846) grad_norm 3.3092 (inf) loss_scale 512.0000 (775.0307) mem 7378MB [2024-09-01 05:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][650/1251] eta 0:02:31 lr 0.000153 wd 0.0500 time 0.2202 (0.2513) data time 0.0010 (0.0034) model time 0.2192 (0.2281) loss 2.8843 (2.8853) grad_norm 4.4537 (inf) loss_scale 512.0000 (770.8490) mem 7378MB [2024-09-01 05:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][660/1251] eta 0:02:28 lr 0.000153 wd 0.0500 time 0.2239 (0.2510) data time 0.0008 (0.0034) model time 0.2231 (0.2281) loss 3.2587 (2.8871) grad_norm 3.7110 (inf) loss_scale 512.0000 (766.7981) mem 7378MB [2024-09-01 05:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][670/1251] eta 0:02:25 lr 0.000153 wd 0.0500 time 0.2345 (0.2506) data time 0.0010 (0.0033) model time 0.2336 (0.2281) loss 2.7334 (2.8816) grad_norm 3.8470 (inf) loss_scale 512.0000 (762.8721) mem 7378MB [2024-09-01 05:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][680/1251] eta 0:02:22 lr 0.000153 wd 0.0500 time 0.2258 (0.2503) data time 0.0009 (0.0033) model time 0.2249 (0.2280) loss 2.8081 
(2.8798) grad_norm 5.1007 (inf) loss_scale 512.0000 (759.0653) mem 7378MB [2024-09-01 05:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][690/1251] eta 0:02:20 lr 0.000153 wd 0.0500 time 0.2255 (0.2499) data time 0.0010 (0.0033) model time 0.2245 (0.2280) loss 3.0649 (2.8805) grad_norm 4.0782 (inf) loss_scale 512.0000 (755.3722) mem 7378MB [2024-09-01 05:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][700/1251] eta 0:02:17 lr 0.000153 wd 0.0500 time 0.2255 (0.2496) data time 0.0011 (0.0032) model time 0.2243 (0.2280) loss 2.9615 (2.8841) grad_norm 2.9388 (inf) loss_scale 512.0000 (751.7879) mem 7378MB [2024-09-01 05:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][710/1251] eta 0:02:14 lr 0.000153 wd 0.0500 time 0.2293 (0.2493) data time 0.0010 (0.0032) model time 0.2283 (0.2281) loss 1.8320 (2.8835) grad_norm 3.2871 (inf) loss_scale 256.0000 (746.8215) mem 7378MB [2024-09-01 05:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][720/1251] eta 0:02:12 lr 0.000153 wd 0.0500 time 0.2303 (0.2490) data time 0.0011 (0.0032) model time 0.2292 (0.2280) loss 2.2095 (2.8797) grad_norm 3.8429 (inf) loss_scale 256.0000 (739.7997) mem 7378MB [2024-09-01 05:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][730/1251] eta 0:02:09 lr 0.000153 wd 0.0500 time 0.2239 (0.2487) data time 0.0011 (0.0031) model time 0.2228 (0.2280) loss 2.2090 (2.8793) grad_norm 3.3133 (inf) loss_scale 256.0000 (732.9760) mem 7378MB [2024-09-01 05:30:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][740/1251] eta 0:02:06 lr 0.000153 wd 0.0500 time 0.2244 (0.2485) data time 0.0007 (0.0031) model time 0.2237 (0.2280) loss 3.2673 (2.8762) grad_norm 5.3348 (inf) loss_scale 256.0000 (726.3421) mem 7378MB [2024-09-01 05:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][750/1251] eta 0:02:04 lr 0.000153 wd 0.0500 time 0.2331 (0.2482) 
data time 0.0007 (0.0031) model time 0.2324 (0.2280) loss 3.4439 (2.8755) grad_norm 5.2465 (inf) loss_scale 256.0000 (719.8903) mem 7378MB [2024-09-01 05:30:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][760/1251] eta 0:02:01 lr 0.000153 wd 0.0500 time 0.2218 (0.2479) data time 0.0012 (0.0030) model time 0.2207 (0.2280) loss 2.8404 (2.8782) grad_norm 4.2733 (inf) loss_scale 256.0000 (713.6130) mem 7378MB [2024-09-01 05:30:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][770/1251] eta 0:01:59 lr 0.000153 wd 0.0500 time 0.2309 (0.2477) data time 0.0010 (0.0030) model time 0.2299 (0.2280) loss 2.7608 (2.8782) grad_norm 3.7279 (inf) loss_scale 256.0000 (707.5033) mem 7378MB [2024-09-01 05:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][780/1251] eta 0:01:56 lr 0.000153 wd 0.0500 time 0.2230 (0.2475) data time 0.0011 (0.0030) model time 0.2219 (0.2280) loss 3.0833 (2.8756) grad_norm 4.2265 (inf) loss_scale 256.0000 (701.5547) mem 7378MB [2024-09-01 05:31:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][790/1251] eta 0:01:53 lr 0.000153 wd 0.0500 time 0.2299 (0.2472) data time 0.0007 (0.0030) model time 0.2292 (0.2280) loss 2.5557 (2.8760) grad_norm 3.8225 (inf) loss_scale 256.0000 (695.7607) mem 7378MB [2024-09-01 05:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][800/1251] eta 0:01:51 lr 0.000152 wd 0.0500 time 0.2281 (0.2470) data time 0.0006 (0.0029) model time 0.2275 (0.2280) loss 2.2403 (2.8787) grad_norm 3.3002 (inf) loss_scale 256.0000 (690.1155) mem 7378MB [2024-09-01 05:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][810/1251] eta 0:01:48 lr 0.000152 wd 0.0500 time 0.2334 (0.2467) data time 0.0009 (0.0029) model time 0.2325 (0.2280) loss 2.8155 (2.8784) grad_norm 4.1523 (inf) loss_scale 256.0000 (684.6134) mem 7378MB [2024-09-01 05:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[230/300][820/1251] eta 0:01:46 lr 0.000152 wd 0.0500 time 0.2316 (0.2465) data time 0.0009 (0.0029) model time 0.2307 (0.2280) loss 3.6141 (2.8799) grad_norm 3.3316 (inf) loss_scale 256.0000 (679.2491) mem 7378MB [2024-09-01 05:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][830/1251] eta 0:01:43 lr 0.000152 wd 0.0500 time 0.2298 (0.2463) data time 0.0012 (0.0029) model time 0.2287 (0.2280) loss 1.9651 (2.8764) grad_norm 4.1214 (inf) loss_scale 256.0000 (674.0173) mem 7378MB [2024-09-01 05:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][840/1251] eta 0:01:41 lr 0.000152 wd 0.0500 time 0.2293 (0.2463) data time 0.0008 (0.0029) model time 0.2285 (0.2283) loss 3.2594 (2.8775) grad_norm 3.5450 (inf) loss_scale 256.0000 (668.9133) mem 7378MB [2024-09-01 05:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][850/1251] eta 0:01:38 lr 0.000152 wd 0.0500 time 0.2241 (0.2463) data time 0.0008 (0.0028) model time 0.2233 (0.2285) loss 2.6499 (2.8750) grad_norm 3.7213 (inf) loss_scale 128.0000 (663.6236) mem 7378MB [2024-09-01 05:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][860/1251] eta 0:01:36 lr 0.000152 wd 0.0500 time 0.2278 (0.2461) data time 0.0009 (0.0028) model time 0.2269 (0.2285) loss 3.3455 (2.8743) grad_norm 4.3811 (inf) loss_scale 128.0000 (657.2396) mem 7378MB [2024-09-01 05:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][870/1251] eta 0:01:33 lr 0.000152 wd 0.0500 time 0.2297 (0.2459) data time 0.0010 (0.0028) model time 0.2287 (0.2285) loss 3.2632 (2.8716) grad_norm 3.5195 (inf) loss_scale 128.0000 (651.0059) mem 7378MB [2024-09-01 05:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][880/1251] eta 0:01:31 lr 0.000152 wd 0.0500 time 0.2237 (0.2457) data time 0.0007 (0.0028) model time 0.2230 (0.2284) loss 3.0586 (2.8729) grad_norm 3.4210 (inf) loss_scale 128.0000 (644.9173) mem 7378MB [2024-09-01 
05:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][890/1251] eta 0:01:28 lr 0.000152 wd 0.0500 time 0.2254 (0.2455) data time 0.0013 (0.0028) model time 0.2241 (0.2284) loss 3.1071 (2.8736) grad_norm 4.6809 (inf) loss_scale 128.0000 (638.9689) mem 7378MB [2024-09-01 05:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][900/1251] eta 0:01:26 lr 0.000152 wd 0.0500 time 0.2232 (0.2453) data time 0.0008 (0.0027) model time 0.2224 (0.2284) loss 2.6441 (2.8721) grad_norm 3.2627 (inf) loss_scale 128.0000 (633.1559) mem 7378MB [2024-09-01 05:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][910/1251] eta 0:01:23 lr 0.000152 wd 0.0500 time 0.2206 (0.2451) data time 0.0007 (0.0027) model time 0.2199 (0.2284) loss 2.8343 (2.8717) grad_norm 3.3815 (inf) loss_scale 128.0000 (627.4736) mem 7378MB [2024-09-01 05:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][920/1251] eta 0:01:21 lr 0.000152 wd 0.0500 time 0.2249 (0.2449) data time 0.0009 (0.0027) model time 0.2240 (0.2284) loss 2.0132 (2.8694) grad_norm 3.2565 (inf) loss_scale 128.0000 (621.9177) mem 7378MB [2024-09-01 05:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][930/1251] eta 0:01:18 lr 0.000152 wd 0.0500 time 0.2324 (0.2448) data time 0.0009 (0.0027) model time 0.2315 (0.2285) loss 3.3627 (2.8708) grad_norm 3.3152 (inf) loss_scale 128.0000 (616.4840) mem 7378MB [2024-09-01 05:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][940/1251] eta 0:01:16 lr 0.000152 wd 0.0500 time 0.2216 (0.2448) data time 0.0011 (0.0027) model time 0.2204 (0.2286) loss 3.2033 (2.8705) grad_norm 4.0429 (inf) loss_scale 128.0000 (611.1687) mem 7378MB [2024-09-01 05:31:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][950/1251] eta 0:01:13 lr 0.000152 wd 0.0500 time 0.2252 (0.2446) data time 0.0009 (0.0026) model time 0.2243 (0.2286) loss 3.4456 (2.8740) grad_norm 
4.1585 (inf) loss_scale 128.0000 (605.9677) mem 7378MB [2024-09-01 05:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][960/1251] eta 0:01:11 lr 0.000152 wd 0.0500 time 0.2246 (0.2445) data time 0.0010 (0.0026) model time 0.2236 (0.2286) loss 3.0465 (2.8719) grad_norm 3.2132 (inf) loss_scale 128.0000 (600.8775) mem 7378MB [2024-09-01 05:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][970/1251] eta 0:01:08 lr 0.000152 wd 0.0500 time 0.2211 (0.2443) data time 0.0010 (0.0026) model time 0.2201 (0.2286) loss 2.2776 (2.8691) grad_norm 3.4979 (inf) loss_scale 128.0000 (595.8946) mem 7378MB [2024-09-01 05:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][980/1251] eta 0:01:06 lr 0.000152 wd 0.0500 time 0.2188 (0.2442) data time 0.0015 (0.0026) model time 0.2173 (0.2286) loss 1.7975 (2.8672) grad_norm 4.9594 (inf) loss_scale 128.0000 (591.0156) mem 7378MB [2024-09-01 05:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][990/1251] eta 0:01:03 lr 0.000152 wd 0.0500 time 0.2290 (0.2440) data time 0.0008 (0.0026) model time 0.2282 (0.2286) loss 3.5341 (2.8692) grad_norm 5.8204 (inf) loss_scale 128.0000 (586.2374) mem 7378MB [2024-09-01 05:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1000/1251] eta 0:01:01 lr 0.000152 wd 0.0500 time 0.2192 (0.2439) data time 0.0014 (0.0026) model time 0.2178 (0.2286) loss 2.6339 (2.8685) grad_norm 3.7054 (inf) loss_scale 128.0000 (581.5567) mem 7378MB [2024-09-01 05:31:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1010/1251] eta 0:00:58 lr 0.000152 wd 0.0500 time 0.2300 (0.2437) data time 0.0009 (0.0025) model time 0.2291 (0.2286) loss 3.6926 (2.8683) grad_norm 3.6909 (inf) loss_scale 128.0000 (576.9707) mem 7378MB [2024-09-01 05:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1020/1251] eta 0:00:56 lr 0.000152 wd 0.0500 time 0.2226 (0.2436) data time 
0.0009 (0.0025) model time 0.2217 (0.2286) loss 2.9665 (2.8684) grad_norm 4.3952 (inf) loss_scale 128.0000 (572.4765) mem 7378MB [2024-09-01 05:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1030/1251] eta 0:00:53 lr 0.000152 wd 0.0500 time 0.2342 (0.2434) data time 0.0007 (0.0025) model time 0.2335 (0.2286) loss 3.6422 (2.8700) grad_norm 3.5950 (inf) loss_scale 128.0000 (568.0714) mem 7378MB [2024-09-01 05:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1040/1251] eta 0:00:51 lr 0.000152 wd 0.0500 time 0.2253 (0.2432) data time 0.0010 (0.0025) model time 0.2244 (0.2285) loss 2.5013 (2.8688) grad_norm 3.1951 (inf) loss_scale 128.0000 (563.7527) mem 7378MB [2024-09-01 05:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1050/1251] eta 0:00:48 lr 0.000152 wd 0.0500 time 0.2280 (0.2431) data time 0.0009 (0.0025) model time 0.2271 (0.2285) loss 3.1984 (2.8678) grad_norm 9.2805 (inf) loss_scale 128.0000 (559.5180) mem 7378MB [2024-09-01 05:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1060/1251] eta 0:00:46 lr 0.000152 wd 0.0500 time 0.2262 (0.2429) data time 0.0011 (0.0025) model time 0.2251 (0.2285) loss 3.2502 (2.8689) grad_norm 3.4845 (inf) loss_scale 128.0000 (555.3648) mem 7378MB [2024-09-01 05:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1070/1251] eta 0:00:43 lr 0.000152 wd 0.0500 time 0.2265 (0.2428) data time 0.0008 (0.0025) model time 0.2256 (0.2285) loss 3.8423 (2.8685) grad_norm 3.5207 (inf) loss_scale 128.0000 (551.2908) mem 7378MB [2024-09-01 05:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1080/1251] eta 0:00:41 lr 0.000152 wd 0.0500 time 0.2234 (0.2426) data time 0.0009 (0.0024) model time 0.2225 (0.2285) loss 2.9625 (2.8696) grad_norm 3.8033 (inf) loss_scale 128.0000 (547.2937) mem 7378MB [2024-09-01 05:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[230/300][1090/1251] eta 0:00:39 lr 0.000152 wd 0.0500 time 0.2341 (0.2425) data time 0.0011 (0.0024) model time 0.2331 (0.2284) loss 1.9641 (2.8687) grad_norm 4.9164 (inf) loss_scale 128.0000 (543.3714) mem 7378MB [2024-09-01 05:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1100/1251] eta 0:00:36 lr 0.000152 wd 0.0500 time 0.2306 (0.2424) data time 0.0006 (0.0024) model time 0.2301 (0.2284) loss 3.2064 (2.8694) grad_norm 2.7028 (inf) loss_scale 128.0000 (539.5218) mem 7378MB [2024-09-01 05:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1110/1251] eta 0:00:34 lr 0.000152 wd 0.0500 time 0.2263 (0.2423) data time 0.0006 (0.0024) model time 0.2257 (0.2284) loss 3.0366 (2.8671) grad_norm 3.0694 (inf) loss_scale 128.0000 (535.7429) mem 7378MB [2024-09-01 05:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1120/1251] eta 0:00:31 lr 0.000151 wd 0.0500 time 0.2239 (0.2421) data time 0.0008 (0.0024) model time 0.2231 (0.2284) loss 2.1200 (2.8663) grad_norm 2.6569 (inf) loss_scale 128.0000 (532.0328) mem 7378MB [2024-09-01 05:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1130/1251] eta 0:00:29 lr 0.000151 wd 0.0500 time 0.2269 (0.2420) data time 0.0009 (0.0024) model time 0.2260 (0.2284) loss 1.8421 (2.8661) grad_norm 3.4410 (inf) loss_scale 128.0000 (528.3895) mem 7378MB [2024-09-01 05:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1140/1251] eta 0:00:26 lr 0.000151 wd 0.0500 time 0.2234 (0.2419) data time 0.0009 (0.0024) model time 0.2225 (0.2284) loss 3.2650 (2.8703) grad_norm 3.2575 (inf) loss_scale 128.0000 (524.8114) mem 7378MB [2024-09-01 05:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1150/1251] eta 0:00:24 lr 0.000151 wd 0.0500 time 0.2237 (0.2418) data time 0.0015 (0.0024) model time 0.2222 (0.2284) loss 3.1441 (2.8699) grad_norm 5.9711 (inf) loss_scale 128.0000 (521.2967) mem 7378MB 
[2024-09-01 05:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1160/1251] eta 0:00:21 lr 0.000151 wd 0.0500 time 0.2228 (0.2416) data time 0.0008 (0.0023) model time 0.2220 (0.2284) loss 2.7047 (2.8689) grad_norm 3.3462 (inf) loss_scale 128.0000 (517.8437) mem 7378MB [2024-09-01 05:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1170/1251] eta 0:00:19 lr 0.000151 wd 0.0500 time 0.2322 (0.2416) data time 0.0013 (0.0023) model time 0.2308 (0.2284) loss 2.4792 (2.8682) grad_norm 3.0663 (inf) loss_scale 128.0000 (514.4508) mem 7378MB [2024-09-01 05:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1180/1251] eta 0:00:17 lr 0.000151 wd 0.0500 time 0.2270 (0.2414) data time 0.0009 (0.0023) model time 0.2261 (0.2284) loss 2.6645 (2.8683) grad_norm 3.4265 (inf) loss_scale 128.0000 (511.1165) mem 7378MB [2024-09-01 05:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1190/1251] eta 0:00:14 lr 0.000151 wd 0.0500 time 0.2323 (0.2413) data time 0.0015 (0.0023) model time 0.2308 (0.2284) loss 3.2418 (2.8676) grad_norm 4.8390 (inf) loss_scale 128.0000 (507.8392) mem 7378MB [2024-09-01 05:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1200/1251] eta 0:00:12 lr 0.000151 wd 0.0500 time 0.2258 (0.2412) data time 0.0010 (0.0023) model time 0.2248 (0.2283) loss 3.0629 (2.8700) grad_norm 3.2650 (inf) loss_scale 128.0000 (504.6175) mem 7378MB [2024-09-01 05:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1210/1251] eta 0:00:09 lr 0.000151 wd 0.0500 time 0.2349 (0.2411) data time 0.0012 (0.0023) model time 0.2337 (0.2284) loss 2.9342 (2.8715) grad_norm 2.8138 (inf) loss_scale 128.0000 (501.4500) mem 7378MB [2024-09-01 05:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1220/1251] eta 0:00:07 lr 0.000151 wd 0.0500 time 0.2721 (0.2410) data time 0.0009 (0.0023) model time 0.2712 (0.2283) loss 3.1103 
(2.8717) grad_norm 3.0425 (inf) loss_scale 128.0000 (498.3353) mem 7378MB [2024-09-01 05:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1230/1251] eta 0:00:05 lr 0.000151 wd 0.0500 time 0.2275 (0.2409) data time 0.0013 (0.0023) model time 0.2263 (0.2283) loss 2.7204 (2.8730) grad_norm 6.7547 (inf) loss_scale 128.0000 (495.2721) mem 7378MB [2024-09-01 05:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1240/1251] eta 0:00:02 lr 0.000151 wd 0.0500 time 0.2118 (0.2407) data time 0.0006 (0.0023) model time 0.2112 (0.2283) loss 1.5217 (2.8726) grad_norm 4.1377 (inf) loss_scale 128.0000 (492.2592) mem 7378MB [2024-09-01 05:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [230/300][1250/1251] eta 0:00:00 lr 0.000151 wd 0.0500 time 0.2114 (0.2405) data time 0.0004 (0.0022) model time 0.2110 (0.2281) loss 2.8458 (2.8723) grad_norm 4.1052 (inf) loss_scale 128.0000 (489.2954) mem 7378MB [2024-09-01 05:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 230 training takes 0:04:55 [2024-09-01 05:32:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:32:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 05:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.454 (0.454) Loss 0.4043 (0.4043) Acc@1 92.188 (92.188) Acc@5 98.535 (98.535) Mem 7378MB [2024-09-01 05:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.115) Loss 0.6216 (0.6314) Acc@1 87.500 (86.399) Acc@5 97.949 (97.532) Mem 7378MB [2024-09-01 05:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.099) Loss 0.9229 (0.6636) Acc@1 78.613 (85.477) Acc@5 95.605 (97.438) Mem 7378MB [2024-09-01 05:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.093) Loss 1.1221 (0.7529) Acc@1 74.023 (83.361) Acc@5 92.578 (96.525) Mem 7378MB [2024-09-01 05:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.0469 (0.8012) Acc@1 76.465 (82.157) Acc@5 93.945 (96.020) Mem 7378MB [2024-09-01 05:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.766 Acc@5 95.996 [2024-09-01 05:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.8% [2024-09-01 05:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.77% [2024-09-01 05:32:56 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 05:32:56 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 05:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.501 (0.501) Loss 0.3813 (0.3813) Acc@1 93.066 (93.066) Acc@5 98.145 (98.145) Mem 7378MB [2024-09-01 05:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.119) Loss 0.5742 (0.6005) Acc@1 89.453 (87.464) Acc@5 97.754 (97.763) Mem 7378MB [2024-09-01 05:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.102) Loss 0.8740 (0.6289) Acc@1 77.832 (86.375) Acc@5 95.801 (97.698) Mem 7378MB [2024-09-01 05:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.096) Loss 1.0879 (0.7129) Acc@1 73.535 (84.262) Acc@5 93.164 (96.815) Mem 7378MB [2024-09-01 05:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.088) Loss 0.9780 (0.7574) Acc@1 76.660 (83.144) Acc@5 94.727 (96.370) Mem 7378MB [2024-09-01 05:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.724 Acc@5 96.346 [2024-09-01 05:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 05:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.72% [2024-09-01 05:33:01 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 05:33:02 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 05:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][0/1251] eta 0:17:32 lr 0.000151 wd 0.0500 time 0.8412 (0.8412) data time 0.5589 (0.5589) model time 0.0000 (0.0000) loss 2.6772 (2.6772) grad_norm 3.2756 (3.2756) loss_scale 128.0000 (128.0000) mem 7382MB [2024-09-01 05:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][10/1251] eta 0:05:52 lr 0.000151 wd 0.0500 time 0.2312 (0.2837) data time 0.0012 (0.0518) model time 0.0000 (0.0000) loss 2.8065 (2.9054) grad_norm 4.0595 (3.6842) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][20/1251] eta 0:05:15 lr 0.000151 wd 0.0500 time 0.2261 (0.2560) data time 0.0010 (0.0276) model time 0.0000 (0.0000) loss 1.7885 (2.8127) grad_norm 3.3793 (3.5756) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][30/1251] eta 0:05:02 lr 0.000151 wd 0.0500 time 0.2250 (0.2478) data time 0.0009 (0.0190) model time 0.0000 (0.0000) loss 3.0865 (2.8781) grad_norm 3.0165 (3.9000) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][40/1251] eta 0:04:53 lr 0.000151 wd 0.0500 time 0.2199 (0.2425) data time 0.0010 (0.0146) model time 0.0000 (0.0000) loss 2.7594 (2.9272) grad_norm 5.0884 (4.0539) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][50/1251] eta 0:04:48 lr 0.000151 wd 0.0500 time 0.2292 (0.2399) data time 0.0008 (0.0120) model time 0.0000 (0.0000) loss 3.4867 (2.9455) grad_norm 2.5940 (3.9668) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][60/1251] eta 0:04:43 lr 0.000151 wd 0.0500 time 0.2311 (0.2377) data time 0.0013 (0.0102) model time 0.2298 (0.2255) loss 
2.8672 (2.9592) grad_norm 3.5431 (4.0609) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][70/1251] eta 0:04:42 lr 0.000151 wd 0.0500 time 0.2231 (0.2390) data time 0.0009 (0.0089) model time 0.2222 (0.2356) loss 3.2411 (2.9534) grad_norm 3.7775 (4.0259) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][80/1251] eta 0:04:37 lr 0.000151 wd 0.0500 time 0.2240 (0.2373) data time 0.0007 (0.0079) model time 0.2234 (0.2318) loss 3.2728 (2.9409) grad_norm 3.0237 (4.0119) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][90/1251] eta 0:04:34 lr 0.000151 wd 0.0500 time 0.2326 (0.2363) data time 0.0010 (0.0072) model time 0.2317 (0.2307) loss 2.8410 (2.8999) grad_norm 3.3189 (3.9910) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][100/1251] eta 0:04:30 lr 0.000151 wd 0.0500 time 0.2288 (0.2354) data time 0.0012 (0.0066) model time 0.2276 (0.2298) loss 3.4910 (2.8910) grad_norm 3.7797 (3.9600) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][110/1251] eta 0:04:27 lr 0.000151 wd 0.0500 time 0.2262 (0.2348) data time 0.0013 (0.0061) model time 0.2249 (0.2294) loss 3.1947 (2.8978) grad_norm 4.3479 (3.9938) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][120/1251] eta 0:04:24 lr 0.000151 wd 0.0500 time 0.2317 (0.2341) data time 0.0007 (0.0057) model time 0.2310 (0.2289) loss 3.1774 (2.9018) grad_norm 3.3264 (4.0206) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][130/1251] eta 0:04:22 lr 0.000151 wd 
0.0500 time 0.2258 (0.2337) data time 0.0013 (0.0053) model time 0.2246 (0.2288) loss 3.1580 (2.9076) grad_norm 3.8296 (3.9675) loss_scale 128.0000 (128.0000) mem 7383MB [2024-09-01 05:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 05:33:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:33:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 05:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:45:37 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 05:45:49 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 05:45:50 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 05:45:52 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 05:45:52 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 231) [2024-09-01 05:45:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][140/1251] eta 0:37:08 lr 0.000151 wd 0.0500 time 0.2184 (2.0058) data time 0.0011 (0.1038) model time 0.2173 (1.9020) loss 3.0688 (3.3151) grad_norm 3.5482 (3.5492) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][150/1251] eta 0:17:36 lr 0.000151 wd 0.0500 time 0.2310 (0.9600) data time 0.0009 (0.0433) model time 0.2302 (0.9167) loss 2.5748 (3.0960) grad_norm 6.0095 (4.1158) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][160/1251] eta 0:12:31 lr 0.000151 wd 0.0500 time 0.2295 (0.6885) data time 0.0006 (0.0276) model time 0.2288 (0.6609) loss 3.5983 (3.1638) grad_norm 19.3610 (4.6131) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][170/1251] eta 0:10:09 lr 0.000151 wd 0.0500 time 0.2398 (0.5642) data time 0.0009 (0.0205) model time 0.2389 (0.5437) loss 2.6165 (3.1434) grad_norm 3.0461 (4.5499) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][180/1251] eta 0:08:46 lr 0.000151 wd 0.0500 time 0.2226 (0.4920) data time 0.0007 (0.0163) model time 0.2220 (0.4757) loss 2.9662 (3.0877) grad_norm 3.8313 (4.4610) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[231/300][190/1251] eta 0:07:52 lr 0.000150 wd 0.0500 time 0.2271 (0.4451) data time 0.0008 (0.0136) model time 0.2263 (0.4315) loss 2.7690 (3.0755) grad_norm 6.8983 (4.4198) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][200/1251] eta 0:07:13 lr 0.000150 wd 0.0500 time 0.2248 (0.4126) data time 0.0009 (0.0117) model time 0.2239 (0.4008) loss 2.7508 (3.0513) grad_norm 3.4316 (4.3477) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][210/1251] eta 0:06:44 lr 0.000150 wd 0.0500 time 0.2293 (0.3883) data time 0.0008 (0.0103) model time 0.2285 (0.3780) loss 2.9125 (3.0047) grad_norm 4.8842 (4.2724) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][220/1251] eta 0:06:21 lr 0.000150 wd 0.0500 time 0.2268 (0.3698) data time 0.0009 (0.0092) model time 0.2259 (0.3606) loss 2.9016 (2.9822) grad_norm 2.8295 (4.2916) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][230/1251] eta 0:06:02 lr 0.000150 wd 0.0500 time 0.2277 (0.3551) data time 0.0008 (0.0084) model time 0.2269 (0.3467) loss 3.4240 (2.9870) grad_norm 4.9545 (4.2363) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][240/1251] eta 0:05:46 lr 0.000150 wd 0.0500 time 0.2290 (0.3432) data time 0.0007 (0.0077) model time 0.2283 (0.3355) loss 2.7444 (2.9942) grad_norm 3.7531 (4.1636) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][250/1251] eta 0:05:33 lr 0.000150 wd 0.0500 time 0.2277 (0.3333) data time 0.0008 (0.0071) model time 0.2269 (0.3262) loss 3.4176 (2.9822) grad_norm 5.5320 (4.1307) loss_scale 128.0000 (128.0000) mem 
7378MB [2024-09-01 05:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][260/1251] eta 0:05:21 lr 0.000150 wd 0.0500 time 0.2287 (0.3249) data time 0.0009 (0.0066) model time 0.2278 (0.3183) loss 3.2003 (2.9639) grad_norm 3.7633 (4.1296) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][270/1251] eta 0:05:11 lr 0.000150 wd 0.0500 time 0.2290 (0.3178) data time 0.0007 (0.0062) model time 0.2284 (0.3116) loss 2.4155 (2.9612) grad_norm 4.1794 (4.1012) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][280/1251] eta 0:05:02 lr 0.000150 wd 0.0500 time 0.2211 (0.3116) data time 0.0009 (0.0059) model time 0.2202 (0.3058) loss 3.0656 (2.9552) grad_norm 3.2005 (4.0604) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][290/1251] eta 0:04:54 lr 0.000150 wd 0.0500 time 0.2227 (0.3063) data time 0.0009 (0.0055) model time 0.2218 (0.3008) loss 2.3970 (2.9508) grad_norm 3.1901 (4.0319) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][300/1251] eta 0:04:46 lr 0.000150 wd 0.0500 time 0.2192 (0.3016) data time 0.0009 (0.0053) model time 0.2184 (0.2963) loss 2.9521 (2.9518) grad_norm 3.7859 (4.0290) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][310/1251] eta 0:04:39 lr 0.000150 wd 0.0500 time 0.2237 (0.2973) data time 0.0009 (0.0050) model time 0.2228 (0.2923) loss 2.8995 (2.9398) grad_norm 3.0672 (4.0115) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][320/1251] eta 0:04:33 lr 0.000150 wd 0.0500 time 0.2191 (0.2935) data time 0.0008 (0.0048) model time 0.2183 
(0.2887) loss 2.7312 (2.9361) grad_norm 3.5552 (3.9882) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][330/1251] eta 0:04:27 lr 0.000150 wd 0.0500 time 0.2361 (0.2902) data time 0.0008 (0.0046) model time 0.2353 (0.2856) loss 3.1750 (2.9281) grad_norm 4.0598 (3.9889) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][340/1251] eta 0:04:21 lr 0.000150 wd 0.0500 time 0.2238 (0.2872) data time 0.0007 (0.0044) model time 0.2231 (0.2827) loss 2.9444 (2.9121) grad_norm 7.2715 (4.0285) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][350/1251] eta 0:04:16 lr 0.000150 wd 0.0500 time 0.2254 (0.2843) data time 0.0008 (0.0043) model time 0.2246 (0.2801) loss 2.8706 (2.9104) grad_norm 6.3714 (4.0597) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][360/1251] eta 0:04:11 lr 0.000150 wd 0.0500 time 0.2313 (0.2819) data time 0.0009 (0.0041) model time 0.2304 (0.2778) loss 3.1028 (2.9125) grad_norm 3.3906 (4.0632) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][370/1251] eta 0:04:06 lr 0.000150 wd 0.0500 time 0.2321 (0.2797) data time 0.0008 (0.0040) model time 0.2313 (0.2757) loss 2.7779 (2.9012) grad_norm 4.2882 (4.0598) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][380/1251] eta 0:04:01 lr 0.000150 wd 0.0500 time 0.2313 (0.2776) data time 0.0008 (0.0039) model time 0.2305 (0.2737) loss 3.0385 (2.8955) grad_norm 4.0237 (4.0631) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][390/1251] eta 0:03:57 
lr 0.000150 wd 0.0500 time 0.2278 (0.2757) data time 0.0008 (0.0037) model time 0.2270 (0.2720) loss 2.1092 (2.8831) grad_norm 6.0992 (4.0784) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][400/1251] eta 0:03:53 lr 0.000150 wd 0.0500 time 0.2199 (0.2738) data time 0.0008 (0.0036) model time 0.2191 (0.2702) loss 2.1661 (2.8776) grad_norm 2.8375 (4.0791) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][410/1251] eta 0:03:48 lr 0.000150 wd 0.0500 time 0.2260 (0.2722) data time 0.0009 (0.0035) model time 0.2251 (0.2686) loss 2.8506 (2.8839) grad_norm 3.3120 (4.0629) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][420/1251] eta 0:03:45 lr 0.000150 wd 0.0500 time 0.4459 (0.2713) data time 0.0008 (0.0034) model time 0.4451 (0.2679) loss 3.5070 (2.8791) grad_norm 5.5840 (4.0772) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][430/1251] eta 0:03:41 lr 0.000150 wd 0.0500 time 0.2320 (0.2699) data time 0.0009 (0.0034) model time 0.2310 (0.2665) loss 1.9257 (2.8704) grad_norm 5.0691 (4.1118) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][440/1251] eta 0:03:38 lr 0.000150 wd 0.0500 time 0.4315 (0.2691) data time 0.0008 (0.0033) model time 0.4307 (0.2658) loss 3.3486 (2.8667) grad_norm 5.3433 (4.1193) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][450/1251] eta 0:03:34 lr 0.000150 wd 0.0500 time 0.2284 (0.2678) data time 0.0009 (0.0032) model time 0.2275 (0.2646) loss 3.5757 (2.8741) grad_norm 14.5923 (4.1514) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:22 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][460/1251] eta 0:03:30 lr 0.000150 wd 0.0500 time 0.2314 (0.2666) data time 0.0006 (0.0031) model time 0.2309 (0.2635) loss 2.2477 (2.8783) grad_norm 3.6042 (4.1268) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][470/1251] eta 0:03:27 lr 0.000150 wd 0.0500 time 0.2283 (0.2655) data time 0.0009 (0.0031) model time 0.2274 (0.2624) loss 2.5171 (2.8764) grad_norm 3.3476 (4.1365) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][480/1251] eta 0:03:23 lr 0.000150 wd 0.0500 time 0.2205 (0.2644) data time 0.0008 (0.0030) model time 0.2197 (0.2614) loss 3.1130 (2.8774) grad_norm 3.2876 (4.1394) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][490/1251] eta 0:03:20 lr 0.000150 wd 0.0500 time 0.2340 (0.2634) data time 0.0011 (0.0030) model time 0.2328 (0.2604) loss 2.5411 (2.8748) grad_norm 5.3391 (4.1498) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][500/1251] eta 0:03:17 lr 0.000150 wd 0.0500 time 0.2370 (0.2624) data time 0.0007 (0.0029) model time 0.2363 (0.2595) loss 2.6289 (2.8756) grad_norm 5.1778 (4.1583) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][510/1251] eta 0:03:13 lr 0.000149 wd 0.0500 time 0.2299 (0.2615) data time 0.0010 (0.0028) model time 0.2289 (0.2587) loss 2.8861 (2.8718) grad_norm 2.7426 (4.1502) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][520/1251] eta 0:03:10 lr 0.000149 wd 0.0500 time 0.2261 (0.2606) data time 0.0008 (0.0028) model time 0.2253 (0.2578) loss 3.3415 (2.8693) 
grad_norm 2.9283 (4.1393) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][530/1251] eta 0:03:07 lr 0.000149 wd 0.0500 time 0.2275 (0.2598) data time 0.0008 (0.0027) model time 0.2267 (0.2571) loss 2.6326 (2.8727) grad_norm 3.8905 (4.1529) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][540/1251] eta 0:03:04 lr 0.000149 wd 0.0500 time 0.2334 (0.2590) data time 0.0009 (0.0027) model time 0.2325 (0.2563) loss 2.7644 (2.8782) grad_norm 5.6942 (4.1601) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][550/1251] eta 0:03:01 lr 0.000149 wd 0.0500 time 0.2279 (0.2583) data time 0.0007 (0.0027) model time 0.2272 (0.2556) loss 1.7507 (2.8785) grad_norm 4.7118 (4.1947) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][560/1251] eta 0:02:57 lr 0.000149 wd 0.0500 time 0.2233 (0.2575) data time 0.0010 (0.0026) model time 0.2223 (0.2549) loss 3.2787 (2.8803) grad_norm 2.9712 (4.1807) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][570/1251] eta 0:02:54 lr 0.000149 wd 0.0500 time 0.2272 (0.2569) data time 0.0008 (0.0026) model time 0.2263 (0.2543) loss 3.1895 (2.8867) grad_norm 2.8313 (4.1648) loss_scale 128.0000 (128.0000) mem 7378MB [2024-09-01 05:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 05:47:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:47:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 05:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:51:07 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 05:51:17 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... [2024-09-01 05:51:19 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 05:51:21 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 05:51:21 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 231) [2024-09-01 05:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][580/1251] eta 0:14:07 lr 0.000149 wd 0.0500 time 0.2291 (1.2636) data time 0.0009 (0.0598) model time 0.2283 (1.2038) loss 3.2185 (3.2673) grad_norm 3.8373 (3.6864) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][590/1251] eta 0:08:12 lr 0.000149 wd 0.0500 time 0.2224 (0.7453) data time 0.0007 (0.0304) model time 0.2217 (0.7149) loss 3.5410 (3.1554) grad_norm 3.1662 (3.5461) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][600/1251] eta 0:06:12 lr 0.000149 wd 0.0500 time 0.2272 (0.5723) data time 0.0009 (0.0206) model time 0.2263 (0.5517) loss 3.0264 
(3.1613) grad_norm 3.3774 (3.6517) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][610/1251] eta 0:05:11 lr 0.000149 wd 0.0500 time 0.2230 (0.4857) data time 0.0007 (0.0157) model time 0.2223 (0.4700) loss 2.3479 (3.0586) grad_norm 2.8684 (3.6967) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][620/1251] eta 0:04:33 lr 0.000149 wd 0.0500 time 0.2452 (0.4340) data time 0.0008 (0.0128) model time 0.2443 (0.4212) loss 2.3404 (3.0254) grad_norm 3.6924 (3.6581) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][630/1251] eta 0:04:07 lr 0.000149 wd 0.0500 time 0.2291 (0.3993) data time 0.0006 (0.0108) model time 0.2285 (0.3885) loss 3.2610 (3.0131) grad_norm 3.3683 (3.6255) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][640/1251] eta 0:03:48 lr 0.000149 wd 0.0500 time 0.2215 (0.3744) data time 0.0008 (0.0094) model time 0.2207 (0.3650) loss 2.5555 (2.9866) grad_norm 4.2715 (3.6149) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][650/1251] eta 0:03:33 lr 0.000149 wd 0.0500 time 0.2206 (0.3558) data time 0.0009 (0.0083) model time 0.2197 (0.3475) loss 3.1110 (2.9669) grad_norm 5.1069 (3.8095) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][660/1251] eta 0:03:21 lr 0.000149 wd 0.0500 time 0.2276 (0.3415) data time 0.0008 (0.0075) model time 0.2268 (0.3340) loss 3.2618 (2.9592) grad_norm 2.3830 (3.8957) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][670/1251] eta 0:03:11 lr 0.000149 wd 0.0500 
time 0.2273 (0.3300) data time 0.0009 (0.0069) model time 0.2264 (0.3231) loss 3.5567 (2.9693) grad_norm 3.4837 (3.9044) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][680/1251] eta 0:03:02 lr 0.000149 wd 0.0500 time 0.2223 (0.3205) data time 0.0009 (0.0063) model time 0.2214 (0.3141) loss 2.6308 (2.9770) grad_norm 3.4157 (3.9494) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][690/1251] eta 0:02:55 lr 0.000149 wd 0.0500 time 0.2193 (0.3126) data time 0.0007 (0.0059) model time 0.2186 (0.3067) loss 3.5933 (2.9899) grad_norm 7.2538 (3.9507) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][700/1251] eta 0:02:48 lr 0.000149 wd 0.0500 time 0.2284 (0.3060) data time 0.0007 (0.0055) model time 0.2276 (0.3005) loss 3.0677 (2.9637) grad_norm 3.4855 (4.1603) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][710/1251] eta 0:02:42 lr 0.000149 wd 0.0500 time 0.2230 (0.3003) data time 0.0007 (0.0052) model time 0.2223 (0.2952) loss 1.8062 (2.9561) grad_norm 4.8223 (4.1140) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][720/1251] eta 0:02:36 lr 0.000149 wd 0.0500 time 0.2208 (0.2955) data time 0.0009 (0.0049) model time 0.2199 (0.2906) loss 3.2824 (2.9582) grad_norm 3.3465 (4.1044) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][730/1251] eta 0:02:31 lr 0.000149 wd 0.0500 time 0.2212 (0.2911) data time 0.0008 (0.0047) model time 0.2204 (0.2864) loss 2.9596 (2.9537) grad_norm 6.2663 (4.0976) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:52:13 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [231/300][740/1251] eta 0:02:26 lr 0.000149 wd 0.0500 time 0.2276 (0.2875) data time 0.0008 (0.0045) model time 0.2267 (0.2829) loss 2.1242 (2.9487) grad_norm 3.0725 (4.0685) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 05:52:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 05:52:16 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 05:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 05:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 05:54:18 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 05:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 05:54:27 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 05:54:28 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 05:54:30 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 05:54:30 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 231) [2024-09-01 05:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 05:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][750/1251] eta 0:28:32 lr 0.000149 wd 0.0500 time 0.2342 (3.4176) data time 0.0009 (0.2089) model time 0.2334 (3.2087) loss 3.3112 (3.2334) grad_norm 8.5329 (5.0741) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][760/1251] eta 0:09:24 lr 0.000149 wd 0.0500 time 0.2386 (1.1501) data time 0.0008 (0.0604) model time 0.2378 (1.0897) loss 3.1186 (3.0489) grad_norm 6.4077 (4.7761) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][770/1251] eta 0:06:11 lr 0.000149 wd 0.0500 time 0.2391 (0.7715) data time 0.0011 (0.0357) model time 0.2380 (0.7358) loss 2.9692 (3.1193) grad_norm 4.9120 (4.5684) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][780/1251] eta 0:04:49 lr 0.000149 wd 0.0500 time 0.2412 (0.6155) data time 0.0008 (0.0255) model time 0.2403 (0.5900) loss 2.2901 (3.1130) grad_norm 3.2542 (4.3745) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][790/1251] eta 0:04:04 lr 0.000149 wd 0.0500 time 0.2382 (0.5301) data time 0.0008 (0.0199) model time 0.2374 (0.5102) loss 3.0554 (3.0680) grad_norm 3.7050 (4.1932) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[231/300][800/1251] eta 0:03:34 lr 0.000149 wd 0.0500 time 0.2406 (0.4765) data time 0.0008 (0.0164) model time 0.2397 (0.4601) loss 3.2793 (3.0479) grad_norm 7.6904 (4.1113) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][810/1251] eta 0:03:13 lr 0.000149 wd 0.0500 time 0.2446 (0.4394) data time 0.0008 (0.0140) model time 0.2437 (0.4254) loss 3.1472 (3.0080) grad_norm 5.2876 (4.0793) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][820/1251] eta 0:02:57 lr 0.000149 wd 0.0500 time 0.2333 (0.4125) data time 0.0011 (0.0123) model time 0.2322 (0.4002) loss 3.1986 (2.9804) grad_norm 3.2033 (4.0411) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][830/1251] eta 0:02:45 lr 0.000149 wd 0.0500 time 0.2388 (0.3919) data time 0.0010 (0.0110) model time 0.2377 (0.3810) loss 2.8494 (2.9565) grad_norm 3.7999 (3.9451) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][840/1251] eta 0:02:34 lr 0.000148 wd 0.0500 time 0.2362 (0.3757) data time 0.0011 (0.0099) model time 0.2351 (0.3658) loss 2.6292 (2.9462) grad_norm 5.3066 (3.9711) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][850/1251] eta 0:02:25 lr 0.000148 wd 0.0500 time 0.2408 (0.3627) data time 0.0009 (0.0090) model time 0.2399 (0.3537) loss 2.9339 (2.9707) grad_norm 6.1428 (4.1078) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][860/1251] eta 0:02:17 lr 0.000148 wd 0.0500 time 0.2455 (0.3521) data time 0.0010 (0.0083) model time 0.2445 (0.3438) loss 3.0095 (2.9629) grad_norm 3.5353 (4.1126) loss_scale 128.0000 (128.0000) mem 
7377MB [2024-09-01 05:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][870/1251] eta 0:02:10 lr 0.000148 wd 0.0500 time 0.2508 (0.3431) data time 0.0009 (0.0077) model time 0.2498 (0.3354) loss 2.5644 (2.9548) grad_norm 3.4616 (5.3216) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][880/1251] eta 0:02:04 lr 0.000148 wd 0.0500 time 0.2410 (0.3357) data time 0.0010 (0.0072) model time 0.2399 (0.3284) loss 3.1729 (2.9620) grad_norm 5.7591 (5.2354) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][890/1251] eta 0:01:58 lr 0.000148 wd 0.0500 time 0.2417 (0.3292) data time 0.0009 (0.0068) model time 0.2408 (0.3224) loss 2.8837 (2.9490) grad_norm 3.9672 (5.1382) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][900/1251] eta 0:01:53 lr 0.000148 wd 0.0500 time 0.2369 (0.3235) data time 0.0008 (0.0065) model time 0.2361 (0.3170) loss 3.0133 (2.9530) grad_norm 4.2350 (5.0645) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][910/1251] eta 0:01:48 lr 0.000148 wd 0.0500 time 0.2377 (0.3185) data time 0.0009 (0.0061) model time 0.2368 (0.3124) loss 2.5195 (2.9478) grad_norm 4.9023 (4.9977) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][920/1251] eta 0:01:43 lr 0.000148 wd 0.0500 time 0.2452 (0.3140) data time 0.0011 (0.0058) model time 0.2441 (0.3082) loss 1.7238 (2.9335) grad_norm 3.8713 (4.9070) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][930/1251] eta 0:01:39 lr 0.000148 wd 0.0500 time 0.2372 (0.3101) data time 0.0009 (0.0056) model time 0.2363 
(0.3046) loss 2.5656 (2.9278) grad_norm 5.0406 (4.8523) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][940/1251] eta 0:01:35 lr 0.000148 wd 0.0500 time 0.2463 (0.3065) data time 0.0008 (0.0053) model time 0.2455 (0.3011) loss 2.7131 (2.9249) grad_norm 4.4594 (4.9657) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][950/1251] eta 0:01:31 lr 0.000148 wd 0.0500 time 0.2453 (0.3033) data time 0.0010 (0.0051) model time 0.2443 (0.2981) loss 3.1341 (2.9117) grad_norm 2.8142 (4.9131) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][960/1251] eta 0:01:27 lr 0.000148 wd 0.0500 time 0.2462 (0.3003) data time 0.0008 (0.0049) model time 0.2455 (0.2954) loss 2.8373 (2.9066) grad_norm 4.0011 (4.8891) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][970/1251] eta 0:01:23 lr 0.000148 wd 0.0500 time 0.2333 (0.2978) data time 0.0011 (0.0048) model time 0.2322 (0.2930) loss 3.3054 (2.9083) grad_norm 4.2361 (4.8234) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][980/1251] eta 0:01:20 lr 0.000148 wd 0.0500 time 0.2347 (0.2954) data time 0.0007 (0.0046) model time 0.2340 (0.2908) loss 2.2018 (2.9002) grad_norm 5.0159 (4.8019) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][990/1251] eta 0:01:16 lr 0.000148 wd 0.0500 time 0.2394 (0.2932) data time 0.0007 (0.0044) model time 0.2387 (0.2887) loss 2.0131 (2.9080) grad_norm 3.9166 (4.7661) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1000/1251] eta 0:01:13 
lr 0.000148 wd 0.0500 time 0.2333 (0.2911) data time 0.0007 (0.0043) model time 0.2326 (0.2868) loss 1.7763 (2.8979) grad_norm 3.5281 (4.7661) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1010/1251] eta 0:01:09 lr 0.000148 wd 0.0500 time 0.2372 (0.2892) data time 0.0008 (0.0042) model time 0.2365 (0.2850) loss 2.6045 (2.8919) grad_norm 3.2823 (4.7460) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1020/1251] eta 0:01:06 lr 0.000148 wd 0.0500 time 0.2371 (0.2874) data time 0.0014 (0.0041) model time 0.2357 (0.2833) loss 2.9975 (2.8883) grad_norm 3.5127 (4.6971) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1030/1251] eta 0:01:03 lr 0.000148 wd 0.0500 time 0.2342 (0.2857) data time 0.0009 (0.0040) model time 0.2333 (0.2817) loss 2.2339 (2.8834) grad_norm 5.2259 (4.6797) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1040/1251] eta 0:01:00 lr 0.000148 wd 0.0500 time 0.2372 (0.2848) data time 0.0008 (0.0039) model time 0.2364 (0.2810) loss 2.4400 (2.8779) grad_norm 3.9184 (4.6546) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1050/1251] eta 0:00:56 lr 0.000148 wd 0.0500 time 0.2400 (0.2833) data time 0.0010 (0.0038) model time 0.2389 (0.2795) loss 3.3521 (2.8763) grad_norm 3.4856 (4.6622) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1060/1251] eta 0:00:53 lr 0.000148 wd 0.0500 time 0.2336 (0.2827) data time 0.0010 (0.0037) model time 0.2326 (0.2790) loss 3.1857 (2.8769) grad_norm 3.5879 (4.6456) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:05 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1070/1251] eta 0:00:50 lr 0.000148 wd 0.0500 time 0.2320 (0.2813) data time 0.0009 (0.0036) model time 0.2311 (0.2777) loss 3.2201 (2.8887) grad_norm 2.9618 (4.6149) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1080/1251] eta 0:00:47 lr 0.000148 wd 0.0500 time 0.2397 (0.2801) data time 0.0010 (0.0035) model time 0.2387 (0.2766) loss 3.1194 (2.8854) grad_norm 3.2761 (4.5762) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1090/1251] eta 0:00:44 lr 0.000148 wd 0.0500 time 0.2530 (0.2790) data time 0.0010 (0.0035) model time 0.2520 (0.2755) loss 3.3587 (2.8921) grad_norm 3.1514 (4.5409) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1100/1251] eta 0:00:41 lr 0.000148 wd 0.0500 time 0.2379 (0.2778) data time 0.0011 (0.0034) model time 0.2368 (0.2745) loss 3.2382 (2.8950) grad_norm 4.3474 (4.5378) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1110/1251] eta 0:00:39 lr 0.000148 wd 0.0500 time 0.2457 (0.2769) data time 0.0008 (0.0033) model time 0.2449 (0.2736) loss 2.1975 (2.8925) grad_norm 5.6385 (4.5402) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1120/1251] eta 0:00:36 lr 0.000148 wd 0.0500 time 0.2458 (0.2760) data time 0.0010 (0.0033) model time 0.2448 (0.2727) loss 3.6730 (2.8919) grad_norm 2.6665 (4.5168) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1130/1251] eta 0:00:33 lr 0.000148 wd 0.0500 time 0.2424 (0.2750) data time 0.0009 (0.0032) model time 0.2415 (0.2718) loss 2.0769 
(2.8837) grad_norm 4.9849 (4.4892) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1140/1251] eta 0:00:30 lr 0.000148 wd 0.0500 time 0.2364 (0.2741) data time 0.0011 (0.0031) model time 0.2353 (0.2710) loss 2.9552 (2.8832) grad_norm 4.1797 (4.4690) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1150/1251] eta 0:00:27 lr 0.000148 wd 0.0500 time 0.2348 (0.2732) data time 0.0009 (0.0031) model time 0.2339 (0.2702) loss 3.2055 (2.8883) grad_norm 5.0179 (4.4766) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1160/1251] eta 0:00:24 lr 0.000147 wd 0.0500 time 0.2358 (0.2724) data time 0.0010 (0.0030) model time 0.2347 (0.2693) loss 2.8269 (2.8900) grad_norm 3.8944 (4.4533) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1170/1251] eta 0:00:22 lr 0.000147 wd 0.0500 time 0.2469 (0.2717) data time 0.0008 (0.0030) model time 0.2460 (0.2687) loss 3.6248 (2.8901) grad_norm 2.7318 (4.4273) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1180/1251] eta 0:00:19 lr 0.000147 wd 0.0500 time 0.2387 (0.2709) data time 0.0009 (0.0029) model time 0.2377 (0.2680) loss 3.0251 (2.8974) grad_norm 3.4208 (4.4076) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1190/1251] eta 0:00:16 lr 0.000147 wd 0.0500 time 0.2411 (0.2702) data time 0.0011 (0.0029) model time 0.2400 (0.2673) loss 2.5204 (2.8949) grad_norm 4.2651 (4.3948) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1200/1251] eta 0:00:13 lr 0.000147 wd 
0.0500 time 0.2427 (0.2695) data time 0.0009 (0.0029) model time 0.2418 (0.2667) loss 3.7942 (2.8933) grad_norm 3.6753 (4.3885) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1210/1251] eta 0:00:11 lr 0.000147 wd 0.0500 time 0.2362 (0.2689) data time 0.0009 (0.0028) model time 0.2354 (0.2661) loss 3.3855 (2.8887) grad_norm 4.5213 (4.3776) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1220/1251] eta 0:00:08 lr 0.000147 wd 0.0500 time 0.2368 (0.2683) data time 0.0012 (0.0028) model time 0.2356 (0.2655) loss 2.7609 (2.8819) grad_norm 3.2385 (4.3696) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1230/1251] eta 0:00:05 lr 0.000147 wd 0.0500 time 0.2377 (0.2677) data time 0.0011 (0.0027) model time 0.2367 (0.2649) loss 2.7462 (2.8793) grad_norm 2.7900 (4.3584) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1240/1251] eta 0:00:02 lr 0.000147 wd 0.0500 time 0.2308 (0.2670) data time 0.0007 (0.0027) model time 0.2302 (0.2643) loss 2.7104 (2.8802) grad_norm 3.8765 (4.3598) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [231/300][1250/1251] eta 0:00:00 lr 0.000147 wd 0.0500 time 0.2237 (0.2662) data time 0.0007 (0.0027) model time 0.2229 (0.2635) loss 3.0045 (2.8780) grad_norm 3.2190 (4.3483) loss_scale 128.0000 (128.0000) mem 7377MB [2024-09-01 05:56:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 231 training takes 0:02:14 [2024-09-01 05:56:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 05:56:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 05:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.416 (0.416) Loss 0.3745 (0.3745) Acc@1 93.066 (93.066) Acc@5 98.633 (98.633) Mem 7377MB [2024-09-01 05:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.067 (0.108) Loss 0.5767 (0.6317) Acc@1 88.867 (86.532) Acc@5 97.754 (97.576) Mem 7377MB [2024-09-01 05:56:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.094) Loss 0.8975 (0.6618) Acc@1 77.832 (85.519) Acc@5 95.801 (97.493) Mem 7377MB [2024-09-01 05:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.088) Loss 1.1172 (0.7529) Acc@1 73.438 (83.332) Acc@5 92.871 (96.541) Mem 7377MB [2024-09-01 05:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 1.0488 (0.7997) Acc@1 75.781 (82.205) Acc@5 93.945 (96.051) Mem 7377MB [2024-09-01 05:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.788 Acc@5 96.006 [2024-09-01 05:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.8% [2024-09-01 05:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.79% [2024-09-01 05:56:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 05:56:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 05:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.421 (0.421) Loss 0.3816 (0.3816) Acc@1 93.066 (93.066) Acc@5 98.145 (98.145) Mem 7377MB [2024-09-01 05:57:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.106) Loss 0.5728 (0.6009) Acc@1 89.453 (87.482) Acc@5 97.754 (97.736) Mem 7377MB [2024-09-01 05:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.093) Loss 0.8760 (0.6294) Acc@1 77.930 (86.412) Acc@5 95.703 (97.647) Mem 7377MB [2024-09-01 05:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.088) Loss 1.0869 (0.7133) Acc@1 73.730 (84.296) Acc@5 93.066 (96.771) Mem 7377MB [2024-09-01 05:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 0.9810 (0.7577) Acc@1 76.562 (83.172) Acc@5 94.629 (96.351) Mem 7377MB [2024-09-01 05:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.738 Acc@5 96.334 [2024-09-01 05:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 05:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.74% [2024-09-01 05:57:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 05:57:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 05:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][0/1251] eta 0:16:38 lr 0.000147 wd 0.0500 time 0.7978 (0.7978) data time 0.4088 (0.4088) model time 0.0000 (0.0000) loss 3.2612 (3.2612) grad_norm 3.3411 (3.3411) loss_scale 128.0000 (128.0000) mem 7380MB [2024-09-01 05:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][10/1251] eta 0:06:03 lr 0.000147 wd 0.0500 time 0.2404 (0.2929) data time 0.0011 (0.0381) model time 0.0000 (0.0000) loss 2.8751 (3.1610) grad_norm 4.7833 (3.8362) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][20/1251] eta 0:05:28 lr 0.000147 wd 0.0500 time 0.2365 (0.2672) data time 0.0009 (0.0204) model time 0.0000 (0.0000) loss 3.0870 (2.8862) grad_norm 3.9882 (3.7307) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][30/1251] eta 0:05:14 lr 0.000147 wd 0.0500 time 0.2400 (0.2580) data time 0.0009 (0.0142) model time 0.0000 (0.0000) loss 2.4096 (2.8165) grad_norm 4.6772 (4.0054) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][40/1251] eta 0:05:07 lr 0.000147 wd 0.0500 time 0.2359 (0.2536) data time 0.0007 (0.0110) model time 0.0000 (0.0000) loss 2.4263 (2.8277) grad_norm 3.0577 (4.3785) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][50/1251] eta 0:05:01 lr 0.000147 wd 0.0500 time 0.2431 (0.2512) data time 0.0009 (0.0090) model time 0.0000 (0.0000) loss 2.8232 (2.8767) grad_norm 3.6587 (4.3053) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][60/1251] eta 0:04:57 lr 0.000147 wd 0.0500 time 0.2366 (0.2494) data time 0.0011 (0.0077) model time 0.2355 (0.2391) loss 
3.3814 (2.9118) grad_norm 4.0001 (4.2877) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][70/1251] eta 0:04:52 lr 0.000147 wd 0.0500 time 0.2310 (0.2479) data time 0.0010 (0.0068) model time 0.2299 (0.2385) loss 3.1021 (2.9021) grad_norm 3.0256 (4.2125) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][80/1251] eta 0:04:49 lr 0.000147 wd 0.0500 time 0.2422 (0.2470) data time 0.0009 (0.0060) model time 0.2413 (0.2388) loss 2.8205 (2.9034) grad_norm 3.9741 (4.5357) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][90/1251] eta 0:04:45 lr 0.000147 wd 0.0500 time 0.2414 (0.2462) data time 0.0007 (0.0055) model time 0.2407 (0.2387) loss 3.2931 (2.9138) grad_norm 4.0381 (4.5212) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][100/1251] eta 0:04:42 lr 0.000147 wd 0.0500 time 0.2387 (0.2454) data time 0.0007 (0.0050) model time 0.2381 (0.2384) loss 3.3111 (2.8986) grad_norm 3.7978 (4.4165) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][110/1251] eta 0:04:39 lr 0.000147 wd 0.0500 time 0.2502 (0.2448) data time 0.0007 (0.0047) model time 0.2495 (0.2383) loss 3.5048 (2.9005) grad_norm 4.0811 (4.4580) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][120/1251] eta 0:04:36 lr 0.000147 wd 0.0500 time 0.2396 (0.2445) data time 0.0010 (0.0044) model time 0.2387 (0.2385) loss 3.3738 (2.9180) grad_norm 4.5443 (4.4714) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][130/1251] eta 0:04:33 lr 0.000147 wd 
0.0500 time 0.2428 (0.2443) data time 0.0011 (0.0041) model time 0.2417 (0.2388) loss 2.7527 (2.9175) grad_norm 3.1622 (5.5829) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][140/1251] eta 0:04:30 lr 0.000147 wd 0.0500 time 0.2361 (0.2438) data time 0.0008 (0.0039) model time 0.2353 (0.2386) loss 2.2976 (2.9006) grad_norm 3.0310 (5.4838) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][150/1251] eta 0:04:28 lr 0.000147 wd 0.0500 time 0.2319 (0.2435) data time 0.0007 (0.0037) model time 0.2312 (0.2386) loss 2.7638 (2.9074) grad_norm 9.5570 (5.4498) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][160/1251] eta 0:04:25 lr 0.000147 wd 0.0500 time 0.2396 (0.2433) data time 0.0007 (0.0035) model time 0.2390 (0.2387) loss 3.0501 (2.8889) grad_norm 3.5884 (5.3932) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][170/1251] eta 0:04:22 lr 0.000147 wd 0.0500 time 0.2435 (0.2432) data time 0.0008 (0.0034) model time 0.2427 (0.2389) loss 3.1885 (2.9093) grad_norm 3.5838 (5.2869) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][180/1251] eta 0:04:20 lr 0.000147 wd 0.0500 time 0.2349 (0.2431) data time 0.0010 (0.0032) model time 0.2340 (0.2390) loss 2.7949 (2.9051) grad_norm 3.9194 (5.2123) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][190/1251] eta 0:04:17 lr 0.000147 wd 0.0500 time 0.2314 (0.2429) data time 0.0012 (0.0031) model time 0.2302 (0.2389) loss 2.2525 (2.8983) grad_norm 3.3325 (5.1179) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [232/300][200/1251] eta 0:04:15 lr 0.000147 wd 0.0500 time 0.2279 (0.2428) data time 0.0009 (0.0030) model time 0.2271 (0.2389) loss 3.3915 (2.8978) grad_norm 3.8802 (5.0638) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][210/1251] eta 0:04:12 lr 0.000147 wd 0.0500 time 0.2323 (0.2426) data time 0.0010 (0.0029) model time 0.2313 (0.2389) loss 3.3347 (2.8834) grad_norm 2.8054 (5.0084) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][220/1251] eta 0:04:10 lr 0.000147 wd 0.0500 time 0.2354 (0.2425) data time 0.0009 (0.0028) model time 0.2345 (0.2389) loss 2.6503 (2.8788) grad_norm 3.5973 (4.9230) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][230/1251] eta 0:04:08 lr 0.000147 wd 0.0500 time 0.2365 (0.2433) data time 0.0008 (0.0028) model time 0.2357 (0.2400) loss 3.0489 (2.8828) grad_norm 8.5284 (4.8996) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][240/1251] eta 0:04:05 lr 0.000146 wd 0.0500 time 0.2273 (0.2431) data time 0.0010 (0.0027) model time 0.2263 (0.2399) loss 3.2297 (2.8868) grad_norm 5.0181 (4.8663) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][250/1251] eta 0:04:03 lr 0.000146 wd 0.0500 time 0.2558 (0.2430) data time 0.0011 (0.0026) model time 0.2547 (0.2400) loss 2.1045 (2.8770) grad_norm 4.7847 (4.8170) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][260/1251] eta 0:04:00 lr 0.000146 wd 0.0500 time 0.2442 (0.2430) data time 0.0008 (0.0026) model time 0.2434 (0.2400) loss 3.2361 (2.8855) grad_norm 4.2311 
(4.7744) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][270/1251] eta 0:03:58 lr 0.000146 wd 0.0500 time 0.2375 (0.2430) data time 0.0010 (0.0025) model time 0.2365 (0.2400) loss 2.0638 (2.8876) grad_norm 4.4225 (4.7448) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][280/1251] eta 0:03:55 lr 0.000146 wd 0.0500 time 0.2312 (0.2428) data time 0.0017 (0.0025) model time 0.2295 (0.2400) loss 2.0226 (2.8824) grad_norm 3.7717 (4.7152) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][290/1251] eta 0:03:53 lr 0.000146 wd 0.0500 time 0.2400 (0.2427) data time 0.0007 (0.0024) model time 0.2393 (0.2399) loss 2.7805 (2.8871) grad_norm 4.0637 (4.6953) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][300/1251] eta 0:03:50 lr 0.000146 wd 0.0500 time 0.2414 (0.2428) data time 0.0007 (0.0024) model time 0.2407 (0.2401) loss 3.4899 (2.8833) grad_norm 3.2109 (4.6760) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][310/1251] eta 0:03:48 lr 0.000146 wd 0.0500 time 0.2439 (0.2433) data time 0.0007 (0.0023) model time 0.2432 (0.2408) loss 3.2270 (2.8658) grad_norm 4.6126 (4.6601) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][320/1251] eta 0:03:46 lr 0.000146 wd 0.0500 time 0.2449 (0.2432) data time 0.0008 (0.0023) model time 0.2441 (0.2407) loss 2.4432 (2.8627) grad_norm 3.2682 (4.6425) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][330/1251] eta 0:03:43 lr 0.000146 wd 0.0500 time 0.2431 (0.2432) data 
time 0.0011 (0.0022) model time 0.2420 (0.2407) loss 2.1550 (2.8560) grad_norm 4.8610 (4.6513) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][340/1251] eta 0:03:41 lr 0.000146 wd 0.0500 time 0.2476 (0.2431) data time 0.0008 (0.0022) model time 0.2468 (0.2407) loss 1.6617 (2.8484) grad_norm 4.5950 (4.6442) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 05:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][350/1251] eta 0:03:38 lr 0.000146 wd 0.0500 time 0.2317 (0.2430) data time 0.0008 (0.0022) model time 0.2309 (0.2407) loss 3.2096 (2.8506) grad_norm 2.5378 (4.6583) loss_scale 256.0000 (129.4587) mem 7381MB [2024-09-01 05:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][360/1251] eta 0:03:36 lr 0.000146 wd 0.0500 time 0.2365 (0.2429) data time 0.0011 (0.0021) model time 0.2354 (0.2406) loss 2.7495 (2.8503) grad_norm 5.6059 (4.7087) loss_scale 256.0000 (132.9640) mem 7381MB [2024-09-01 05:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][370/1251] eta 0:03:33 lr 0.000146 wd 0.0500 time 0.2469 (0.2428) data time 0.0009 (0.0021) model time 0.2460 (0.2405) loss 3.0762 (2.8534) grad_norm 3.6834 (4.6937) loss_scale 256.0000 (136.2803) mem 7381MB [2024-09-01 05:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][380/1251] eta 0:03:31 lr 0.000146 wd 0.0500 time 0.2461 (0.2428) data time 0.0007 (0.0021) model time 0.2454 (0.2405) loss 2.4224 (2.8501) grad_norm 4.4816 (4.6731) loss_scale 256.0000 (139.4226) mem 7381MB [2024-09-01 05:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][390/1251] eta 0:03:28 lr 0.000146 wd 0.0500 time 0.2329 (0.2427) data time 0.0011 (0.0020) model time 0.2318 (0.2404) loss 2.9705 (2.8514) grad_norm 3.7769 (4.6510) loss_scale 256.0000 (142.4041) mem 7381MB [2024-09-01 05:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [232/300][400/1251] eta 0:03:26 lr 0.000146 wd 0.0500 time 0.2366 (0.2426) data time 0.0007 (0.0020) model time 0.2359 (0.2404) loss 3.2566 (2.8529) grad_norm 3.3289 (4.6256) loss_scale 256.0000 (145.2369) mem 7381MB [2024-09-01 05:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][410/1251] eta 0:03:24 lr 0.000146 wd 0.0500 time 0.2434 (0.2426) data time 0.0007 (0.0020) model time 0.2426 (0.2404) loss 3.3112 (2.8519) grad_norm 4.3895 (4.6079) loss_scale 256.0000 (147.9319) mem 7381MB [2024-09-01 05:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][420/1251] eta 0:03:21 lr 0.000146 wd 0.0500 time 0.2436 (0.2425) data time 0.0007 (0.0020) model time 0.2429 (0.2403) loss 3.4150 (2.8599) grad_norm 3.8746 (4.5808) loss_scale 256.0000 (150.4988) mem 7381MB [2024-09-01 05:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][430/1251] eta 0:03:19 lr 0.000146 wd 0.0500 time 0.2450 (0.2426) data time 0.0008 (0.0020) model time 0.2442 (0.2404) loss 1.8684 (2.8607) grad_norm 3.6995 (4.5552) loss_scale 256.0000 (152.9466) mem 7381MB [2024-09-01 05:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][440/1251] eta 0:03:16 lr 0.000146 wd 0.0500 time 0.2380 (0.2425) data time 0.0010 (0.0019) model time 0.2371 (0.2403) loss 3.0574 (2.8553) grad_norm 2.9314 (4.5313) loss_scale 256.0000 (155.2834) mem 7381MB [2024-09-01 05:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][450/1251] eta 0:03:14 lr 0.000146 wd 0.0500 time 0.2289 (0.2424) data time 0.0011 (0.0019) model time 0.2278 (0.2403) loss 3.0269 (2.8482) grad_norm 3.0640 (4.5196) loss_scale 256.0000 (157.5166) mem 7381MB [2024-09-01 05:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][460/1251] eta 0:03:11 lr 0.000146 wd 0.0500 time 0.2353 (0.2424) data time 0.0011 (0.0019) model time 0.2342 (0.2402) loss 2.4007 (2.8489) grad_norm 4.6447 (4.5098) loss_scale 256.0000 
(159.6529) mem 7381MB [2024-09-01 05:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][470/1251] eta 0:03:09 lr 0.000146 wd 0.0500 time 0.2422 (0.2423) data time 0.0009 (0.0019) model time 0.2413 (0.2403) loss 3.5219 (2.8543) grad_norm 3.6839 (4.5025) loss_scale 256.0000 (161.6985) mem 7381MB [2024-09-01 05:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][480/1251] eta 0:03:06 lr 0.000146 wd 0.0500 time 0.2380 (0.2423) data time 0.0010 (0.0019) model time 0.2370 (0.2403) loss 3.0782 (2.8522) grad_norm 5.4446 (4.4906) loss_scale 256.0000 (163.6590) mem 7381MB [2024-09-01 05:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][490/1251] eta 0:03:04 lr 0.000146 wd 0.0500 time 0.2267 (0.2422) data time 0.0008 (0.0018) model time 0.2259 (0.2402) loss 3.1799 (2.8511) grad_norm 3.6793 (4.4675) loss_scale 256.0000 (165.5397) mem 7381MB [2024-09-01 05:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][500/1251] eta 0:03:01 lr 0.000146 wd 0.0500 time 0.2326 (0.2422) data time 0.0010 (0.0018) model time 0.2315 (0.2402) loss 2.4930 (2.8517) grad_norm 5.0759 (4.4609) loss_scale 256.0000 (167.3453) mem 7381MB [2024-09-01 05:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][510/1251] eta 0:02:59 lr 0.000146 wd 0.0500 time 0.2321 (0.2421) data time 0.0009 (0.0018) model time 0.2313 (0.2401) loss 3.3738 (2.8593) grad_norm 5.7788 (4.4468) loss_scale 256.0000 (169.0802) mem 7381MB [2024-09-01 05:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][520/1251] eta 0:02:56 lr 0.000146 wd 0.0500 time 0.2464 (0.2420) data time 0.0007 (0.0018) model time 0.2457 (0.2400) loss 1.7807 (2.8521) grad_norm 4.1875 (4.4832) loss_scale 256.0000 (170.7486) mem 7381MB [2024-09-01 05:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][530/1251] eta 0:02:54 lr 0.000146 wd 0.0500 time 0.2339 (0.2420) data time 0.0007 (0.0018) model 
time 0.2332 (0.2400) loss 3.5704 (2.8502) grad_norm 6.0161 (4.4915) loss_scale 256.0000 (172.3540) mem 7381MB [2024-09-01 05:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][540/1251] eta 0:02:52 lr 0.000146 wd 0.0500 time 0.2369 (0.2419) data time 0.0011 (0.0018) model time 0.2358 (0.2400) loss 2.7928 (2.8548) grad_norm 4.1742 (4.4835) loss_scale 256.0000 (173.9002) mem 7381MB [2024-09-01 05:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][550/1251] eta 0:02:49 lr 0.000146 wd 0.0500 time 0.2349 (0.2419) data time 0.0009 (0.0017) model time 0.2340 (0.2399) loss 3.3455 (2.8563) grad_norm 5.2763 (4.4913) loss_scale 256.0000 (175.3902) mem 7381MB [2024-09-01 05:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][560/1251] eta 0:02:47 lr 0.000146 wd 0.0500 time 0.2356 (0.2419) data time 0.0007 (0.0017) model time 0.2349 (0.2399) loss 3.6296 (2.8596) grad_norm 4.8871 (4.4904) loss_scale 256.0000 (176.8271) mem 7381MB [2024-09-01 05:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][570/1251] eta 0:02:44 lr 0.000145 wd 0.0500 time 0.2402 (0.2418) data time 0.0014 (0.0017) model time 0.2388 (0.2399) loss 2.8447 (2.8537) grad_norm 3.4337 (4.4873) loss_scale 256.0000 (178.2137) mem 7381MB [2024-09-01 05:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][580/1251] eta 0:02:42 lr 0.000145 wd 0.0500 time 0.2424 (0.2418) data time 0.0007 (0.0017) model time 0.2418 (0.2399) loss 1.9446 (2.8528) grad_norm 6.6911 (4.5067) loss_scale 256.0000 (179.5525) mem 7381MB [2024-09-01 05:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][590/1251] eta 0:02:39 lr 0.000145 wd 0.0500 time 0.2429 (0.2418) data time 0.0009 (0.0017) model time 0.2420 (0.2399) loss 2.3693 (2.8516) grad_norm 3.3919 (4.4922) loss_scale 256.0000 (180.8460) mem 7381MB [2024-09-01 05:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][600/1251] 
eta 0:02:37 lr 0.000145 wd 0.0500 time 0.2459 (0.2417) data time 0.0007 (0.0017) model time 0.2453 (0.2398) loss 2.7751 (2.8522) grad_norm 3.4463 (4.4874) loss_scale 256.0000 (182.0965) mem 7381MB [2024-09-01 05:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][610/1251] eta 0:02:34 lr 0.000145 wd 0.0500 time 0.2360 (0.2417) data time 0.0009 (0.0017) model time 0.2352 (0.2398) loss 3.4327 (2.8547) grad_norm 3.9672 (4.4842) loss_scale 256.0000 (183.3061) mem 7381MB [2024-09-01 05:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][620/1251] eta 0:02:32 lr 0.000145 wd 0.0500 time 0.2348 (0.2417) data time 0.0008 (0.0017) model time 0.2340 (0.2399) loss 3.5611 (2.8560) grad_norm 3.6430 (4.4825) loss_scale 256.0000 (184.4767) mem 7381MB [2024-09-01 05:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][630/1251] eta 0:02:30 lr 0.000145 wd 0.0500 time 0.2355 (0.2417) data time 0.0009 (0.0017) model time 0.2346 (0.2399) loss 2.4804 (2.8566) grad_norm 3.7362 (4.4747) loss_scale 256.0000 (185.6101) mem 7381MB [2024-09-01 05:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][640/1251] eta 0:02:27 lr 0.000145 wd 0.0500 time 0.2363 (0.2417) data time 0.0009 (0.0016) model time 0.2354 (0.2399) loss 2.7753 (2.8567) grad_norm 2.8076 (4.4673) loss_scale 256.0000 (186.7083) mem 7381MB [2024-09-01 05:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][650/1251] eta 0:02:25 lr 0.000145 wd 0.0500 time 0.2306 (0.2417) data time 0.0009 (0.0016) model time 0.2297 (0.2398) loss 3.1420 (2.8555) grad_norm 15.7428 (4.4734) loss_scale 256.0000 (187.7727) mem 7381MB [2024-09-01 05:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][660/1251] eta 0:02:22 lr 0.000145 wd 0.0500 time 0.2378 (0.2416) data time 0.0010 (0.0016) model time 0.2368 (0.2398) loss 2.8188 (2.8519) grad_norm 2.9107 (4.4799) loss_scale 256.0000 (188.8048) mem 7381MB [2024-09-01 
05:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][670/1251] eta 0:02:20 lr 0.000145 wd 0.0500 time 0.2478 (0.2417) data time 0.0007 (0.0016) model time 0.2471 (0.2399) loss 3.3553 (2.8567) grad_norm 4.0305 (4.4628) loss_scale 256.0000 (189.8063) mem 7381MB [2024-09-01 05:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][680/1251] eta 0:02:17 lr 0.000145 wd 0.0500 time 0.2542 (0.2417) data time 0.0007 (0.0016) model time 0.2534 (0.2399) loss 3.0774 (2.8610) grad_norm 3.1097 (4.4490) loss_scale 256.0000 (190.7783) mem 7381MB [2024-09-01 05:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][690/1251] eta 0:02:15 lr 0.000145 wd 0.0500 time 0.2380 (0.2417) data time 0.0009 (0.0016) model time 0.2372 (0.2399) loss 2.6228 (2.8609) grad_norm 2.3901 (4.4347) loss_scale 256.0000 (191.7221) mem 7381MB [2024-09-01 05:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][700/1251] eta 0:02:13 lr 0.000145 wd 0.0500 time 0.2426 (0.2417) data time 0.0007 (0.0016) model time 0.2420 (0.2399) loss 3.3574 (2.8630) grad_norm 4.1951 (4.4333) loss_scale 256.0000 (192.6391) mem 7381MB [2024-09-01 05:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][710/1251] eta 0:02:10 lr 0.000145 wd 0.0500 time 0.2440 (0.2417) data time 0.0007 (0.0016) model time 0.2433 (0.2399) loss 3.6176 (2.8627) grad_norm 2.7602 (4.4218) loss_scale 256.0000 (193.5302) mem 7381MB [2024-09-01 05:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][720/1251] eta 0:02:08 lr 0.000145 wd 0.0500 time 0.2371 (0.2416) data time 0.0010 (0.0016) model time 0.2362 (0.2399) loss 3.1193 (2.8606) grad_norm 3.2121 (4.4099) loss_scale 256.0000 (194.3967) mem 7381MB [2024-09-01 06:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][730/1251] eta 0:02:05 lr 0.000145 wd 0.0500 time 0.2413 (0.2417) data time 0.0009 (0.0016) model time 0.2403 (0.2400) loss 3.0397 
(2.8582) grad_norm 2.9542 (4.4021) loss_scale 256.0000 (195.2394) mem 7381MB [2024-09-01 06:00:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][740/1251] eta 0:02:03 lr 0.000145 wd 0.0500 time 0.2437 (0.2417) data time 0.0013 (0.0016) model time 0.2424 (0.2400) loss 2.7493 (2.8578) grad_norm 3.8982 (4.3936) loss_scale 256.0000 (196.0594) mem 7381MB [2024-09-01 06:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][750/1251] eta 0:02:01 lr 0.000145 wd 0.0500 time 0.2361 (0.2417) data time 0.0007 (0.0016) model time 0.2355 (0.2399) loss 2.5820 (2.8554) grad_norm 3.4279 (4.3778) loss_scale 256.0000 (196.8575) mem 7381MB [2024-09-01 06:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][760/1251] eta 0:01:58 lr 0.000145 wd 0.0500 time 0.2356 (0.2416) data time 0.0007 (0.0015) model time 0.2348 (0.2399) loss 3.4908 (2.8577) grad_norm 2.6341 (4.3666) loss_scale 256.0000 (197.6347) mem 7381MB [2024-09-01 06:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][770/1251] eta 0:01:56 lr 0.000145 wd 0.0500 time 0.2430 (0.2416) data time 0.0008 (0.0015) model time 0.2421 (0.2399) loss 3.0502 (2.8588) grad_norm 2.6439 (4.3619) loss_scale 256.0000 (198.3917) mem 7381MB [2024-09-01 06:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][780/1251] eta 0:01:53 lr 0.000145 wd 0.0500 time 0.2429 (0.2416) data time 0.0010 (0.0015) model time 0.2420 (0.2399) loss 3.3881 (2.8626) grad_norm 5.7581 (4.3567) loss_scale 256.0000 (199.1293) mem 7381MB [2024-09-01 06:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][790/1251] eta 0:01:51 lr 0.000145 wd 0.0500 time 0.2357 (0.2416) data time 0.0009 (0.0015) model time 0.2349 (0.2399) loss 2.8257 (2.8640) grad_norm 3.7734 (4.3455) loss_scale 256.0000 (199.8483) mem 7381MB [2024-09-01 06:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][800/1251] eta 0:01:48 lr 0.000145 wd 0.0500 
time 0.2330 (0.2415) data time 0.0010 (0.0015) model time 0.2320 (0.2399) loss 2.0346 (2.8641) grad_norm 3.0276 (4.3395) loss_scale 256.0000 (200.5493) mem 7381MB [2024-09-01 06:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][810/1251] eta 0:01:46 lr 0.000145 wd 0.0500 time 0.2355 (0.2416) data time 0.0007 (0.0015) model time 0.2348 (0.2399) loss 1.7702 (2.8605) grad_norm 3.3574 (4.3408) loss_scale 256.0000 (201.2330) mem 7381MB [2024-09-01 06:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][820/1251] eta 0:01:44 lr 0.000145 wd 0.0500 time 0.2416 (0.2416) data time 0.0010 (0.0015) model time 0.2406 (0.2399) loss 3.3597 (2.8587) grad_norm 3.7511 (4.3514) loss_scale 256.0000 (201.9001) mem 7381MB [2024-09-01 06:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][830/1251] eta 0:01:41 lr 0.000145 wd 0.0500 time 0.2453 (0.2419) data time 0.0007 (0.0015) model time 0.2445 (0.2403) loss 1.7451 (2.8590) grad_norm 3.8485 (4.3403) loss_scale 256.0000 (202.5511) mem 7381MB [2024-09-01 06:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][840/1251] eta 0:01:39 lr 0.000145 wd 0.0500 time 0.2485 (0.2419) data time 0.0010 (0.0015) model time 0.2476 (0.2403) loss 2.7188 (2.8597) grad_norm 2.7295 (4.3315) loss_scale 256.0000 (203.1867) mem 7381MB [2024-09-01 06:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][850/1251] eta 0:01:36 lr 0.000145 wd 0.0500 time 0.2361 (0.2419) data time 0.0009 (0.0015) model time 0.2352 (0.2403) loss 2.3598 (2.8616) grad_norm 3.4983 (4.3272) loss_scale 256.0000 (203.8073) mem 7381MB [2024-09-01 06:00:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][860/1251] eta 0:01:34 lr 0.000145 wd 0.0500 time 0.2338 (0.2419) data time 0.0010 (0.0015) model time 0.2327 (0.2403) loss 3.2351 (2.8640) grad_norm 3.3641 (4.3241) loss_scale 256.0000 (204.4135) mem 7381MB [2024-09-01 06:00:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [232/300][870/1251] eta 0:01:32 lr 0.000145 wd 0.0500 time 0.2381 (0.2419) data time 0.0007 (0.0015) model time 0.2374 (0.2403) loss 3.0020 (2.8659) grad_norm 3.2943 (4.3109) loss_scale 256.0000 (205.0057) mem 7381MB [2024-09-01 06:00:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][880/1251] eta 0:01:29 lr 0.000145 wd 0.0500 time 0.2501 (0.2419) data time 0.0007 (0.0015) model time 0.2495 (0.2403) loss 3.1464 (2.8661) grad_norm 4.2746 (4.3089) loss_scale 256.0000 (205.5846) mem 7381MB [2024-09-01 06:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][890/1251] eta 0:01:27 lr 0.000144 wd 0.0500 time 0.2339 (0.2418) data time 0.0008 (0.0015) model time 0.2331 (0.2403) loss 3.0622 (2.8664) grad_norm 2.7358 (4.3043) loss_scale 256.0000 (206.1504) mem 7381MB [2024-09-01 06:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][900/1251] eta 0:01:24 lr 0.000144 wd 0.0500 time 0.2467 (0.2418) data time 0.0009 (0.0015) model time 0.2458 (0.2403) loss 3.3609 (2.8689) grad_norm 3.8051 (4.3052) loss_scale 256.0000 (206.7037) mem 7381MB [2024-09-01 06:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][910/1251] eta 0:01:22 lr 0.000144 wd 0.0500 time 0.2350 (0.2418) data time 0.0007 (0.0015) model time 0.2343 (0.2403) loss 2.6838 (2.8697) grad_norm 3.9226 (4.2970) loss_scale 256.0000 (207.2448) mem 7381MB [2024-09-01 06:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][920/1251] eta 0:01:20 lr 0.000144 wd 0.0500 time 0.2398 (0.2418) data time 0.0010 (0.0015) model time 0.2389 (0.2402) loss 2.9772 (2.8690) grad_norm 4.6120 (4.2922) loss_scale 256.0000 (207.7742) mem 7381MB [2024-09-01 06:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][930/1251] eta 0:01:17 lr 0.000144 wd 0.0500 time 0.2413 (0.2418) data time 0.0010 (0.0014) model time 0.2403 (0.2402) loss 3.1399 (2.8687) grad_norm 3.6332 
(4.2886) loss_scale 256.0000 (208.2922) mem 7381MB [2024-09-01 06:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][940/1251] eta 0:01:15 lr 0.000144 wd 0.0500 time 0.2451 (0.2418) data time 0.0007 (0.0014) model time 0.2444 (0.2403) loss 2.5616 (2.8700) grad_norm 4.5737 (4.2877) loss_scale 256.0000 (208.7991) mem 7381MB [2024-09-01 06:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][950/1251] eta 0:01:12 lr 0.000144 wd 0.0500 time 0.2425 (0.2418) data time 0.0008 (0.0014) model time 0.2417 (0.2403) loss 2.8291 (2.8701) grad_norm 3.9887 (4.3134) loss_scale 256.0000 (209.2955) mem 7381MB [2024-09-01 06:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][960/1251] eta 0:01:10 lr 0.000144 wd 0.0500 time 0.2371 (0.2418) data time 0.0011 (0.0014) model time 0.2359 (0.2403) loss 3.1403 (2.8723) grad_norm 3.1840 (4.3023) loss_scale 256.0000 (209.7815) mem 7381MB [2024-09-01 06:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][970/1251] eta 0:01:07 lr 0.000144 wd 0.0500 time 0.2385 (0.2418) data time 0.0010 (0.0014) model time 0.2375 (0.2403) loss 2.1398 (2.8705) grad_norm 2.6680 (4.3036) loss_scale 256.0000 (210.2575) mem 7381MB [2024-09-01 06:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][980/1251] eta 0:01:05 lr 0.000144 wd 0.0500 time 0.2486 (0.2418) data time 0.0009 (0.0014) model time 0.2476 (0.2403) loss 2.9925 (2.8660) grad_norm 3.4616 (4.3005) loss_scale 256.0000 (210.7238) mem 7381MB [2024-09-01 06:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][990/1251] eta 0:01:03 lr 0.000144 wd 0.0500 time 0.2357 (0.2418) data time 0.0008 (0.0014) model time 0.2349 (0.2403) loss 2.4425 (2.8634) grad_norm 2.7683 (4.2965) loss_scale 256.0000 (211.1806) mem 7381MB [2024-09-01 06:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1000/1251] eta 0:01:00 lr 0.000144 wd 0.0500 time 0.2320 (0.2418) 
data time 0.0010 (0.0014) model time 0.2309 (0.2402) loss 2.9616 (2.8649) grad_norm 5.5041 (4.3018) loss_scale 256.0000 (211.6284) mem 7381MB [2024-09-01 06:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1010/1251] eta 0:00:58 lr 0.000144 wd 0.0500 time 0.2345 (0.2417) data time 0.0008 (0.0014) model time 0.2337 (0.2402) loss 3.1613 (2.8660) grad_norm 3.3266 (4.2975) loss_scale 256.0000 (212.0673) mem 7381MB [2024-09-01 06:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1020/1251] eta 0:00:55 lr 0.000144 wd 0.0500 time 0.2508 (0.2417) data time 0.0008 (0.0014) model time 0.2500 (0.2402) loss 3.1681 (2.8691) grad_norm 2.3693 (4.3169) loss_scale 256.0000 (212.4976) mem 7381MB [2024-09-01 06:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1030/1251] eta 0:00:53 lr 0.000144 wd 0.0500 time 0.2460 (0.2417) data time 0.0009 (0.0014) model time 0.2451 (0.2402) loss 2.9248 (2.8680) grad_norm 13.0956 (4.3257) loss_scale 256.0000 (212.9195) mem 7381MB [2024-09-01 06:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1040/1251] eta 0:00:51 lr 0.000144 wd 0.0500 time 0.2497 (0.2417) data time 0.0010 (0.0014) model time 0.2487 (0.2402) loss 2.4012 (2.8671) grad_norm 4.4220 (4.3312) loss_scale 256.0000 (213.3333) mem 7381MB [2024-09-01 06:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1050/1251] eta 0:00:48 lr 0.000144 wd 0.0500 time 0.2437 (0.2417) data time 0.0009 (0.0014) model time 0.2428 (0.2402) loss 3.3808 (2.8675) grad_norm 2.9014 (4.3249) loss_scale 256.0000 (213.7393) mem 7381MB [2024-09-01 06:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1060/1251] eta 0:00:46 lr 0.000144 wd 0.0500 time 0.2314 (0.2418) data time 0.0009 (0.0014) model time 0.2305 (0.2403) loss 2.6033 (2.8645) grad_norm 3.7225 (4.3199) loss_scale 256.0000 (214.1376) mem 7381MB [2024-09-01 06:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [232/300][1070/1251] eta 0:00:43 lr 0.000144 wd 0.0500 time 0.2425 (0.2418) data time 0.0009 (0.0014) model time 0.2417 (0.2403) loss 3.1053 (2.8645) grad_norm 2.6482 (4.3226) loss_scale 256.0000 (214.5285) mem 7381MB [2024-09-01 06:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1080/1251] eta 0:00:41 lr 0.000144 wd 0.0500 time 0.2413 (0.2418) data time 0.0009 (0.0014) model time 0.2404 (0.2403) loss 2.9226 (2.8640) grad_norm 3.0495 (4.3229) loss_scale 256.0000 (214.9121) mem 7381MB [2024-09-01 06:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1090/1251] eta 0:00:38 lr 0.000144 wd 0.0500 time 0.2465 (0.2418) data time 0.0011 (0.0014) model time 0.2454 (0.2403) loss 2.6951 (2.8662) grad_norm 3.1509 (4.3241) loss_scale 256.0000 (215.2887) mem 7381MB [2024-09-01 06:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1100/1251] eta 0:00:36 lr 0.000144 wd 0.0500 time 0.2443 (0.2418) data time 0.0010 (0.0014) model time 0.2434 (0.2403) loss 3.3400 (2.8683) grad_norm 3.1961 (inf) loss_scale 128.0000 (214.6122) mem 7381MB [2024-09-01 06:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1110/1251] eta 0:00:34 lr 0.000144 wd 0.0500 time 0.2407 (0.2418) data time 0.0007 (0.0014) model time 0.2399 (0.2403) loss 3.1207 (2.8688) grad_norm 7.3860 (inf) loss_scale 128.0000 (213.8326) mem 7381MB [2024-09-01 06:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1120/1251] eta 0:00:31 lr 0.000144 wd 0.0500 time 0.2404 (0.2418) data time 0.0008 (0.0014) model time 0.2395 (0.2403) loss 1.9789 (2.8689) grad_norm 4.5615 (inf) loss_scale 128.0000 (213.0669) mem 7381MB [2024-09-01 06:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1130/1251] eta 0:00:29 lr 0.000144 wd 0.0500 time 0.2472 (0.2417) data time 0.0009 (0.0014) model time 0.2462 (0.2403) loss 2.8407 (2.8706) grad_norm 3.5681 (inf) loss_scale 128.0000 
(212.3148) mem 7381MB [2024-09-01 06:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1140/1251] eta 0:00:26 lr 0.000144 wd 0.0500 time 0.2449 (0.2418) data time 0.0008 (0.0014) model time 0.2441 (0.2403) loss 2.8975 (2.8683) grad_norm 3.4888 (inf) loss_scale 128.0000 (211.5758) mem 7381MB [2024-09-01 06:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1150/1251] eta 0:00:24 lr 0.000144 wd 0.0500 time 0.2321 (0.2417) data time 0.0007 (0.0014) model time 0.2314 (0.2403) loss 2.1175 (2.8634) grad_norm 3.1528 (inf) loss_scale 128.0000 (210.8497) mem 7381MB [2024-09-01 06:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1160/1251] eta 0:00:22 lr 0.000144 wd 0.0500 time 0.2445 (0.2419) data time 0.0008 (0.0014) model time 0.2438 (0.2404) loss 2.9949 (2.8615) grad_norm 4.5404 (inf) loss_scale 128.0000 (210.1361) mem 7381MB [2024-09-01 06:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1170/1251] eta 0:00:19 lr 0.000144 wd 0.0500 time 0.2381 (0.2419) data time 0.0008 (0.0014) model time 0.2373 (0.2404) loss 3.2907 (2.8618) grad_norm 3.4271 (inf) loss_scale 128.0000 (209.4347) mem 7381MB [2024-09-01 06:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1180/1251] eta 0:00:17 lr 0.000144 wd 0.0500 time 0.2373 (0.2419) data time 0.0009 (0.0014) model time 0.2364 (0.2404) loss 3.2831 (2.8630) grad_norm 5.0966 (inf) loss_scale 128.0000 (208.7451) mem 7381MB [2024-09-01 06:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1190/1251] eta 0:00:14 lr 0.000144 wd 0.0500 time 0.2460 (0.2418) data time 0.0009 (0.0013) model time 0.2451 (0.2404) loss 3.0639 (2.8643) grad_norm 3.0541 (inf) loss_scale 128.0000 (208.0672) mem 7381MB [2024-09-01 06:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1200/1251] eta 0:00:12 lr 0.000144 wd 0.0500 time 0.2421 (0.2418) data time 0.0008 (0.0013) model time 0.2413 
(0.2404) loss 3.2253 (2.8660) grad_norm 2.4820 (inf) loss_scale 128.0000 (207.4005) mem 7381MB [2024-09-01 06:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1210/1251] eta 0:00:09 lr 0.000144 wd 0.0500 time 0.2436 (0.2418) data time 0.0009 (0.0013) model time 0.2428 (0.2404) loss 1.9289 (2.8642) grad_norm 3.3545 (inf) loss_scale 128.0000 (206.7448) mem 7381MB [2024-09-01 06:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1220/1251] eta 0:00:07 lr 0.000143 wd 0.0500 time 0.2473 (0.2418) data time 0.0009 (0.0013) model time 0.2464 (0.2404) loss 3.2922 (2.8635) grad_norm 5.0710 (inf) loss_scale 128.0000 (206.0999) mem 7381MB [2024-09-01 06:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1230/1251] eta 0:00:05 lr 0.000143 wd 0.0500 time 0.2453 (0.2418) data time 0.0009 (0.0013) model time 0.2443 (0.2404) loss 2.0609 (2.8629) grad_norm 3.2294 (inf) loss_scale 128.0000 (205.4655) mem 7381MB [2024-09-01 06:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1240/1251] eta 0:00:02 lr 0.000143 wd 0.0500 time 0.2286 (0.2418) data time 0.0007 (0.0013) model time 0.2279 (0.2404) loss 3.2218 (2.8630) grad_norm 3.4082 (inf) loss_scale 128.0000 (204.8413) mem 7381MB [2024-09-01 06:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [232/300][1250/1251] eta 0:00:00 lr 0.000143 wd 0.0500 time 0.2240 (0.2416) data time 0.0005 (0.0013) model time 0.2236 (0.2402) loss 2.3996 (2.8611) grad_norm 6.7588 (inf) loss_scale 128.0000 (204.2270) mem 7381MB [2024-09-01 06:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 232 training takes 0:05:02 [2024-09-01 06:02:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:02:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 06:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.429 (0.429) Loss 0.4084 (0.4084) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 06:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.107) Loss 0.6123 (0.6305) Acc@1 88.770 (86.621) Acc@5 97.363 (97.470) Mem 7381MB [2024-09-01 06:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.094) Loss 0.9678 (0.6604) Acc@1 75.781 (85.561) Acc@5 95.117 (97.424) Mem 7381MB [2024-09-01 06:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.088) Loss 1.1250 (0.7469) Acc@1 73.730 (83.562) Acc@5 92.969 (96.535) Mem 7381MB [2024-09-01 06:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0684 (0.8003) Acc@1 74.902 (82.243) Acc@5 93.945 (95.989) Mem 7381MB [2024-09-01 06:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.824 Acc@5 95.936 [2024-09-01 06:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.8% [2024-09-01 06:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.82% [2024-09-01 06:02:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 06:02:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 06:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.419 (0.419) Loss 0.3813 (0.3813) Acc@1 93.066 (93.066) Acc@5 98.242 (98.242) Mem 7381MB [2024-09-01 06:02:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.107) Loss 0.5713 (0.6009) Acc@1 89.648 (87.473) Acc@5 97.852 (97.736) Mem 7381MB [2024-09-01 06:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.096) Loss 0.8755 (0.6293) Acc@1 77.930 (86.412) Acc@5 95.605 (97.661) Mem 7381MB [2024-09-01 06:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.091) Loss 1.0859 (0.7129) Acc@1 73.633 (84.277) Acc@5 92.969 (96.784) Mem 7381MB [2024-09-01 06:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 0.9819 (0.7575) Acc@1 76.465 (83.139) Acc@5 94.531 (96.356) Mem 7381MB [2024-09-01 06:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.710 Acc@5 96.342 [2024-09-01 06:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 06:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][0/1251] eta 0:28:54 lr 0.000143 wd 0.0500 time 1.3864 (1.3864) data time 0.7478 (0.7478) model time 0.0000 (0.0000) loss 2.3706 (2.3706) grad_norm 4.8681 (4.8681) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][10/1251] eta 0:07:06 lr 0.000143 wd 0.0500 time 0.2448 (0.3438) data time 0.0010 (0.0689) model time 0.0000 (0.0000) loss 2.7463 (3.0315) grad_norm 3.8564 (4.2850) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][20/1251] eta 0:06:04 lr 0.000143 wd 0.0500 time 0.2460 (0.2959) data time 0.0009 (0.0366) model time 0.0000 (0.0000) loss 2.8062 (2.9980) grad_norm 3.3179 (5.3112) 
loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][30/1251] eta 0:05:41 lr 0.000143 wd 0.0500 time 0.2467 (0.2795) data time 0.0008 (0.0253) model time 0.0000 (0.0000) loss 3.6831 (2.8879) grad_norm 4.8761 (4.8554) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][40/1251] eta 0:05:26 lr 0.000143 wd 0.0500 time 0.2424 (0.2697) data time 0.0011 (0.0193) model time 0.0000 (0.0000) loss 3.2893 (2.9019) grad_norm 2.9506 (4.6989) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][50/1251] eta 0:05:17 lr 0.000143 wd 0.0500 time 0.2421 (0.2643) data time 0.0010 (0.0158) model time 0.0000 (0.0000) loss 2.9211 (2.8941) grad_norm 2.7208 (4.8312) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][60/1251] eta 0:05:09 lr 0.000143 wd 0.0500 time 0.2296 (0.2602) data time 0.0011 (0.0134) model time 0.2285 (0.2383) loss 2.5904 (2.8567) grad_norm 4.2862 (4.6623) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][70/1251] eta 0:05:03 lr 0.000143 wd 0.0500 time 0.2426 (0.2573) data time 0.0007 (0.0117) model time 0.2418 (0.2387) loss 3.4605 (2.8854) grad_norm 4.7963 (4.6002) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][80/1251] eta 0:04:58 lr 0.000143 wd 0.0500 time 0.2280 (0.2550) data time 0.0008 (0.0103) model time 0.2272 (0.2383) loss 2.4164 (2.8949) grad_norm 3.0650 (4.5842) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][90/1251] eta 0:04:54 lr 0.000143 wd 0.0500 time 0.2371 (0.2533) data time 0.0010 
(0.0093) model time 0.2361 (0.2383) loss 2.8045 (2.8732) grad_norm 5.2480 (4.5154) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][100/1251] eta 0:04:50 lr 0.000143 wd 0.0500 time 0.2451 (0.2521) data time 0.0011 (0.0085) model time 0.2441 (0.2386) loss 2.8921 (2.8473) grad_norm 2.6001 (4.4843) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][110/1251] eta 0:04:46 lr 0.000143 wd 0.0500 time 0.2330 (0.2509) data time 0.0010 (0.0078) model time 0.2319 (0.2386) loss 2.0540 (2.8423) grad_norm 4.2554 (4.4564) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][120/1251] eta 0:04:42 lr 0.000143 wd 0.0500 time 0.2389 (0.2502) data time 0.0008 (0.0072) model time 0.2382 (0.2390) loss 3.4975 (2.8422) grad_norm 4.3195 (4.3918) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][130/1251] eta 0:04:39 lr 0.000143 wd 0.0500 time 0.2425 (0.2494) data time 0.0007 (0.0068) model time 0.2418 (0.2389) loss 2.6102 (2.8448) grad_norm 4.3455 (4.3360) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][140/1251] eta 0:04:36 lr 0.000143 wd 0.0500 time 0.2309 (0.2487) data time 0.0011 (0.0064) model time 0.2297 (0.2389) loss 3.2437 (2.8324) grad_norm 3.7178 (4.3558) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][150/1251] eta 0:04:33 lr 0.000143 wd 0.0500 time 0.2429 (0.2482) data time 0.0011 (0.0060) model time 0.2418 (0.2390) loss 3.3875 (2.8360) grad_norm 2.9359 (4.3705) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[233/300][160/1251] eta 0:04:30 lr 0.000143 wd 0.0500 time 0.2434 (0.2478) data time 0.0007 (0.0057) model time 0.2427 (0.2391) loss 3.0653 (2.8230) grad_norm 3.0841 (4.3469) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][170/1251] eta 0:04:27 lr 0.000143 wd 0.0500 time 0.2376 (0.2473) data time 0.0008 (0.0054) model time 0.2368 (0.2391) loss 1.8259 (2.8063) grad_norm 3.4582 (4.3477) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][180/1251] eta 0:04:24 lr 0.000143 wd 0.0500 time 0.2454 (0.2470) data time 0.0009 (0.0052) model time 0.2445 (0.2393) loss 2.7825 (2.8059) grad_norm 5.3581 (4.5096) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][190/1251] eta 0:04:21 lr 0.000143 wd 0.0500 time 0.2395 (0.2466) data time 0.0007 (0.0049) model time 0.2388 (0.2392) loss 2.4859 (2.8014) grad_norm 3.0446 (4.4681) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][200/1251] eta 0:04:18 lr 0.000143 wd 0.0500 time 0.2413 (0.2462) data time 0.0009 (0.0047) model time 0.2405 (0.2391) loss 3.2791 (2.7917) grad_norm 4.0694 (4.4538) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][210/1251] eta 0:04:15 lr 0.000143 wd 0.0500 time 0.2398 (0.2459) data time 0.0009 (0.0046) model time 0.2389 (0.2391) loss 2.2386 (2.7854) grad_norm 3.4619 (4.4134) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][220/1251] eta 0:04:13 lr 0.000143 wd 0.0500 time 0.2493 (0.2458) data time 0.0009 (0.0044) model time 0.2484 (0.2392) loss 3.0990 (2.7869) grad_norm 3.3996 (4.3767) loss_scale 128.0000 (128.0000) mem 
7381MB [2024-09-01 06:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][230/1251] eta 0:04:10 lr 0.000143 wd 0.0500 time 0.2352 (0.2455) data time 0.0011 (0.0043) model time 0.2341 (0.2392) loss 2.3063 (2.7758) grad_norm 3.4022 (4.4127) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][240/1251] eta 0:04:08 lr 0.000143 wd 0.0500 time 0.2401 (0.2454) data time 0.0008 (0.0041) model time 0.2392 (0.2394) loss 3.3745 (2.7770) grad_norm 4.2287 (4.3794) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][250/1251] eta 0:04:05 lr 0.000143 wd 0.0500 time 0.2376 (0.2453) data time 0.0008 (0.0040) model time 0.2367 (0.2394) loss 3.3059 (2.7839) grad_norm 3.3085 (4.3553) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][260/1251] eta 0:04:02 lr 0.000143 wd 0.0500 time 0.2449 (0.2452) data time 0.0007 (0.0039) model time 0.2441 (0.2395) loss 2.7988 (2.7856) grad_norm 3.5950 (4.3496) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][270/1251] eta 0:04:00 lr 0.000143 wd 0.0500 time 0.2390 (0.2450) data time 0.0009 (0.0038) model time 0.2380 (0.2395) loss 3.0778 (2.7917) grad_norm 3.8916 (4.3541) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][280/1251] eta 0:03:57 lr 0.000143 wd 0.0500 time 0.2439 (0.2450) data time 0.0011 (0.0037) model time 0.2428 (0.2397) loss 2.7381 (2.7954) grad_norm 4.2825 (4.3442) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][290/1251] eta 0:03:55 lr 0.000143 wd 0.0500 time 0.2459 (0.2456) data time 0.0007 (0.0036) model time 0.2452 
(0.2406) loss 2.3135 (2.7940) grad_norm 5.6552 (4.4478) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][300/1251] eta 0:03:53 lr 0.000142 wd 0.0500 time 0.2420 (0.2455) data time 0.0008 (0.0035) model time 0.2412 (0.2406) loss 2.4227 (2.7972) grad_norm 3.3013 (4.4300) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][310/1251] eta 0:03:50 lr 0.000142 wd 0.0500 time 0.2388 (0.2453) data time 0.0007 (0.0034) model time 0.2381 (0.2406) loss 2.9004 (2.7952) grad_norm 4.9900 (4.4211) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][320/1251] eta 0:03:48 lr 0.000142 wd 0.0500 time 0.2361 (0.2452) data time 0.0007 (0.0034) model time 0.2354 (0.2406) loss 3.0054 (2.7994) grad_norm 2.9407 (4.4047) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][330/1251] eta 0:03:45 lr 0.000142 wd 0.0500 time 0.2409 (0.2451) data time 0.0009 (0.0033) model time 0.2400 (0.2407) loss 3.0215 (2.8023) grad_norm 3.8706 (4.3844) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][340/1251] eta 0:03:43 lr 0.000142 wd 0.0500 time 0.2341 (0.2450) data time 0.0007 (0.0032) model time 0.2334 (0.2406) loss 3.1013 (2.8074) grad_norm 2.8188 (4.3699) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][350/1251] eta 0:03:40 lr 0.000142 wd 0.0500 time 0.2410 (0.2449) data time 0.0007 (0.0031) model time 0.2403 (0.2406) loss 3.1808 (2.8056) grad_norm 4.2580 (4.3564) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][360/1251] eta 0:03:38 
lr 0.000142 wd 0.0500 time 0.2417 (0.2448) data time 0.0008 (0.0031) model time 0.2410 (0.2406) loss 3.3343 (2.8086) grad_norm 3.4685 (4.3339) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][370/1251] eta 0:03:35 lr 0.000142 wd 0.0500 time 0.2381 (0.2447) data time 0.0011 (0.0030) model time 0.2370 (0.2406) loss 3.4002 (2.8104) grad_norm 3.2513 (4.3201) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][380/1251] eta 0:03:33 lr 0.000142 wd 0.0500 time 0.2427 (0.2451) data time 0.0009 (0.0030) model time 0.2418 (0.2412) loss 3.0351 (2.8169) grad_norm 7.2797 (4.3053) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][390/1251] eta 0:03:30 lr 0.000142 wd 0.0500 time 0.2369 (0.2451) data time 0.0009 (0.0029) model time 0.2360 (0.2412) loss 3.2517 (2.8194) grad_norm 3.9771 (4.2945) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][400/1251] eta 0:03:28 lr 0.000142 wd 0.0500 time 0.2416 (0.2449) data time 0.0010 (0.0029) model time 0.2406 (0.2411) loss 3.2698 (2.8195) grad_norm 3.1230 (4.3111) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][410/1251] eta 0:03:25 lr 0.000142 wd 0.0500 time 0.2447 (0.2449) data time 0.0008 (0.0028) model time 0.2439 (0.2411) loss 3.5085 (2.8223) grad_norm 9.9693 (4.3327) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][420/1251] eta 0:03:23 lr 0.000142 wd 0.0500 time 0.2450 (0.2447) data time 0.0009 (0.0028) model time 0.2441 (0.2411) loss 2.9786 (2.8210) grad_norm 6.5411 (4.3485) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][430/1251] eta 0:03:20 lr 0.000142 wd 0.0500 time 0.2376 (0.2446) data time 0.0007 (0.0027) model time 0.2368 (0.2410) loss 3.5590 (2.8231) grad_norm 3.9905 (4.3514) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][440/1251] eta 0:03:18 lr 0.000142 wd 0.0500 time 0.2339 (0.2445) data time 0.0009 (0.0027) model time 0.2330 (0.2410) loss 2.4497 (2.8264) grad_norm 2.9712 (4.3414) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][450/1251] eta 0:03:15 lr 0.000142 wd 0.0500 time 0.2448 (0.2445) data time 0.0009 (0.0027) model time 0.2438 (0.2409) loss 2.7952 (2.8225) grad_norm 3.2330 (4.3359) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][460/1251] eta 0:03:13 lr 0.000142 wd 0.0500 time 0.2361 (0.2444) data time 0.0009 (0.0026) model time 0.2352 (0.2409) loss 2.4610 (2.8225) grad_norm 5.1886 (4.3173) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][470/1251] eta 0:03:10 lr 0.000142 wd 0.0500 time 0.2410 (0.2443) data time 0.0008 (0.0026) model time 0.2401 (0.2409) loss 3.1710 (2.8187) grad_norm 4.0119 (4.2999) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][480/1251] eta 0:03:08 lr 0.000142 wd 0.0500 time 0.2464 (0.2443) data time 0.0009 (0.0026) model time 0.2455 (0.2409) loss 3.0797 (2.8151) grad_norm 2.9661 (4.3263) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][490/1251] eta 0:03:05 lr 0.000142 wd 0.0500 time 0.2409 (0.2442) data time 0.0008 (0.0025) model time 0.2401 (0.2409) loss 2.9888 (2.8131) 
grad_norm 3.7969 (4.3072) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][500/1251] eta 0:03:03 lr 0.000142 wd 0.0500 time 0.2370 (0.2441) data time 0.0009 (0.0025) model time 0.2362 (0.2408) loss 3.3943 (2.8135) grad_norm 2.4257 (4.2901) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][510/1251] eta 0:03:00 lr 0.000142 wd 0.0500 time 0.2391 (0.2440) data time 0.0011 (0.0025) model time 0.2380 (0.2408) loss 2.9375 (2.8174) grad_norm 3.1946 (4.2748) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][520/1251] eta 0:02:58 lr 0.000142 wd 0.0500 time 0.2458 (0.2440) data time 0.0008 (0.0024) model time 0.2450 (0.2408) loss 2.6489 (2.8204) grad_norm 12.1374 (4.2910) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][530/1251] eta 0:02:55 lr 0.000142 wd 0.0500 time 0.2343 (0.2439) data time 0.0008 (0.0024) model time 0.2335 (0.2408) loss 3.0586 (2.8191) grad_norm 4.3739 (4.3321) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][540/1251] eta 0:02:53 lr 0.000142 wd 0.0500 time 0.2350 (0.2439) data time 0.0009 (0.0024) model time 0.2340 (0.2408) loss 2.8442 (2.8173) grad_norm 3.8093 (4.3140) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][550/1251] eta 0:02:50 lr 0.000142 wd 0.0500 time 0.2362 (0.2439) data time 0.0011 (0.0024) model time 0.2351 (0.2408) loss 2.9359 (2.8164) grad_norm 3.4229 (4.3007) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][560/1251] eta 0:02:48 lr 0.000142 wd 0.0500 time 
0.2426 (0.2439) data time 0.0009 (0.0023) model time 0.2417 (0.2408) loss 3.2717 (2.8175) grad_norm 2.7394 (4.2912) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][570/1251] eta 0:02:46 lr 0.000142 wd 0.0500 time 0.2445 (0.2439) data time 0.0009 (0.0023) model time 0.2436 (0.2409) loss 2.8323 (2.8238) grad_norm 4.1965 (4.2838) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][580/1251] eta 0:02:43 lr 0.000142 wd 0.0500 time 0.2478 (0.2438) data time 0.0010 (0.0023) model time 0.2468 (0.2409) loss 3.3923 (2.8282) grad_norm 3.5390 (4.3023) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][590/1251] eta 0:02:41 lr 0.000142 wd 0.0500 time 0.2400 (0.2438) data time 0.0010 (0.0023) model time 0.2390 (0.2408) loss 3.1930 (2.8275) grad_norm 4.0324 (4.2990) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][600/1251] eta 0:02:38 lr 0.000142 wd 0.0500 time 0.2408 (0.2437) data time 0.0011 (0.0023) model time 0.2398 (0.2409) loss 2.8650 (2.8270) grad_norm 2.2732 (4.2950) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][610/1251] eta 0:02:36 lr 0.000142 wd 0.0500 time 0.2480 (0.2438) data time 0.0007 (0.0022) model time 0.2473 (0.2409) loss 3.2068 (2.8277) grad_norm 3.1997 (4.2870) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][620/1251] eta 0:02:33 lr 0.000142 wd 0.0500 time 0.2372 (0.2437) data time 0.0010 (0.0022) model time 0.2362 (0.2409) loss 2.8594 (2.8264) grad_norm 4.7550 (4.2857) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [233/300][630/1251] eta 0:02:31 lr 0.000141 wd 0.0500 time 0.2424 (0.2437) data time 0.0008 (0.0022) model time 0.2416 (0.2409) loss 3.0508 (2.8273) grad_norm 3.6216 (4.2766) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][640/1251] eta 0:02:28 lr 0.000141 wd 0.0500 time 0.2363 (0.2437) data time 0.0009 (0.0022) model time 0.2354 (0.2409) loss 3.1782 (2.8261) grad_norm 3.1635 (4.2725) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][650/1251] eta 0:02:26 lr 0.000141 wd 0.0500 time 0.2399 (0.2437) data time 0.0008 (0.0022) model time 0.2391 (0.2409) loss 3.5535 (2.8276) grad_norm 3.2398 (4.2621) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:04:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][660/1251] eta 0:02:24 lr 0.000141 wd 0.0500 time 0.2421 (0.2437) data time 0.0008 (0.0021) model time 0.2412 (0.2410) loss 1.9314 (2.8274) grad_norm 5.2426 (4.2620) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][670/1251] eta 0:02:21 lr 0.000141 wd 0.0500 time 0.2396 (0.2437) data time 0.0007 (0.0021) model time 0.2388 (0.2410) loss 1.7705 (2.8262) grad_norm 3.7980 (4.2566) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][680/1251] eta 0:02:19 lr 0.000141 wd 0.0500 time 0.2350 (0.2436) data time 0.0010 (0.0021) model time 0.2339 (0.2409) loss 3.4732 (2.8285) grad_norm 2.8490 (4.2530) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][690/1251] eta 0:02:16 lr 0.000141 wd 0.0500 time 0.2369 (0.2436) data time 0.0007 (0.0021) model time 0.2361 (0.2409) loss 2.6113 (2.8290) grad_norm 3.6848 
(4.2526) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][700/1251] eta 0:02:14 lr 0.000141 wd 0.0500 time 0.2444 (0.2435) data time 0.0007 (0.0021) model time 0.2438 (0.2408) loss 3.2079 (2.8349) grad_norm 3.4649 (4.3072) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][710/1251] eta 0:02:11 lr 0.000141 wd 0.0500 time 0.2379 (0.2434) data time 0.0009 (0.0021) model time 0.2371 (0.2408) loss 1.8915 (2.8290) grad_norm 3.2216 (4.3045) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][720/1251] eta 0:02:09 lr 0.000141 wd 0.0500 time 0.2420 (0.2434) data time 0.0007 (0.0020) model time 0.2412 (0.2408) loss 2.3595 (2.8294) grad_norm 3.9420 (4.3028) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][730/1251] eta 0:02:06 lr 0.000141 wd 0.0500 time 0.2386 (0.2434) data time 0.0007 (0.0020) model time 0.2379 (0.2408) loss 3.2192 (2.8310) grad_norm 3.2266 (4.3010) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][740/1251] eta 0:02:04 lr 0.000141 wd 0.0500 time 0.2418 (0.2433) data time 0.0009 (0.0020) model time 0.2408 (0.2408) loss 2.3881 (2.8320) grad_norm 3.1511 (4.2939) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][750/1251] eta 0:02:01 lr 0.000141 wd 0.0500 time 0.2413 (0.2433) data time 0.0007 (0.0020) model time 0.2406 (0.2407) loss 2.6226 (2.8319) grad_norm 3.1647 (4.2885) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][760/1251] eta 0:01:59 lr 0.000141 wd 0.0500 time 0.2486 (0.2432) data 
time 0.0010 (0.0020) model time 0.2476 (0.2407) loss 2.8841 (2.8328) grad_norm 4.6572 (4.2806) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][770/1251] eta 0:01:57 lr 0.000141 wd 0.0500 time 0.2409 (0.2432) data time 0.0009 (0.0020) model time 0.2400 (0.2408) loss 3.7815 (2.8361) grad_norm 3.2539 (4.2656) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][780/1251] eta 0:01:54 lr 0.000141 wd 0.0500 time 0.2410 (0.2432) data time 0.0009 (0.0020) model time 0.2401 (0.2408) loss 2.6000 (2.8374) grad_norm 2.8606 (4.2575) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][790/1251] eta 0:01:52 lr 0.000141 wd 0.0500 time 0.2447 (0.2432) data time 0.0012 (0.0020) model time 0.2435 (0.2408) loss 2.7755 (2.8371) grad_norm 2.8715 (4.2524) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][800/1251] eta 0:01:49 lr 0.000141 wd 0.0500 time 0.2319 (0.2432) data time 0.0013 (0.0019) model time 0.2306 (0.2408) loss 3.0422 (2.8406) grad_norm 2.8223 (4.2442) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][810/1251] eta 0:01:47 lr 0.000141 wd 0.0500 time 0.2378 (0.2432) data time 0.0010 (0.0019) model time 0.2368 (0.2408) loss 2.6797 (2.8442) grad_norm 3.5885 (4.2369) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][820/1251] eta 0:01:44 lr 0.000141 wd 0.0500 time 0.2405 (0.2432) data time 0.0010 (0.0019) model time 0.2395 (0.2408) loss 3.1093 (2.8451) grad_norm 3.0310 (4.2307) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [233/300][830/1251] eta 0:01:42 lr 0.000141 wd 0.0500 time 0.2466 (0.2432) data time 0.0011 (0.0019) model time 0.2455 (0.2408) loss 3.0920 (2.8449) grad_norm 6.3894 (4.2419) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][840/1251] eta 0:01:39 lr 0.000141 wd 0.0500 time 0.2372 (0.2432) data time 0.0010 (0.0019) model time 0.2362 (0.2408) loss 3.6090 (2.8442) grad_norm 3.4804 (4.2354) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][850/1251] eta 0:01:37 lr 0.000141 wd 0.0500 time 0.2431 (0.2432) data time 0.0009 (0.0019) model time 0.2422 (0.2409) loss 2.6082 (2.8443) grad_norm 7.1821 (4.2284) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][860/1251] eta 0:01:35 lr 0.000141 wd 0.0500 time 0.2480 (0.2432) data time 0.0009 (0.0019) model time 0.2471 (0.2409) loss 2.4158 (2.8437) grad_norm 2.9232 (4.2225) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][870/1251] eta 0:01:32 lr 0.000141 wd 0.0500 time 0.2422 (0.2432) data time 0.0011 (0.0019) model time 0.2411 (0.2409) loss 3.1808 (2.8445) grad_norm 3.5979 (4.2413) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][880/1251] eta 0:01:30 lr 0.000141 wd 0.0500 time 0.2394 (0.2432) data time 0.0007 (0.0019) model time 0.2387 (0.2409) loss 3.2179 (2.8465) grad_norm 4.1009 (4.2428) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][890/1251] eta 0:01:27 lr 0.000141 wd 0.0500 time 0.2387 (0.2431) data time 0.0009 (0.0018) model time 0.2377 (0.2408) loss 2.1905 (2.8486) grad_norm 5.8994 (4.2484) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 06:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][900/1251] eta 0:01:25 lr 0.000141 wd 0.0500 time 0.2467 (0.2433) data time 0.0007 (0.0018) model time 0.2459 (0.2410) loss 3.4561 (2.8465) grad_norm 3.0909 (4.2492) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][910/1251] eta 0:01:22 lr 0.000141 wd 0.0500 time 0.2406 (0.2433) data time 0.0010 (0.0018) model time 0.2396 (0.2410) loss 3.0091 (2.8485) grad_norm 4.1004 (4.2590) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][920/1251] eta 0:01:20 lr 0.000141 wd 0.0500 time 0.2349 (0.2432) data time 0.0009 (0.0018) model time 0.2339 (0.2410) loss 3.0812 (2.8490) grad_norm 3.4567 (4.2525) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][930/1251] eta 0:01:18 lr 0.000141 wd 0.0500 time 0.2490 (0.2432) data time 0.0008 (0.0018) model time 0.2482 (0.2410) loss 3.5946 (2.8500) grad_norm 2.5905 (4.2465) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][940/1251] eta 0:01:15 lr 0.000141 wd 0.0500 time 0.2319 (0.2432) data time 0.0010 (0.0018) model time 0.2309 (0.2410) loss 2.3562 (2.8518) grad_norm 3.2313 (4.2393) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][950/1251] eta 0:01:13 lr 0.000141 wd 0.0500 time 0.2421 (0.2432) data time 0.0007 (0.0018) model time 0.2414 (0.2410) loss 2.9802 (2.8544) grad_norm 2.8786 (4.2291) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][960/1251] eta 0:01:10 lr 0.000141 wd 0.0500 time 0.2536 (0.2432) data time 0.0009 (0.0018) model 
time 0.2527 (0.2410) loss 3.6223 (2.8561) grad_norm 3.8503 (4.2228) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][970/1251] eta 0:01:08 lr 0.000140 wd 0.0500 time 0.2387 (0.2431) data time 0.0007 (0.0018) model time 0.2380 (0.2410) loss 3.1855 (2.8570) grad_norm 3.7376 (4.2171) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][980/1251] eta 0:01:05 lr 0.000140 wd 0.0500 time 0.2363 (0.2431) data time 0.0010 (0.0018) model time 0.2353 (0.2410) loss 1.9614 (2.8574) grad_norm 3.9039 (4.2140) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][990/1251] eta 0:01:03 lr 0.000140 wd 0.0500 time 0.2353 (0.2431) data time 0.0010 (0.0018) model time 0.2343 (0.2410) loss 2.9013 (2.8568) grad_norm 2.9917 (4.2065) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1000/1251] eta 0:01:01 lr 0.000140 wd 0.0500 time 0.2381 (0.2431) data time 0.0010 (0.0018) model time 0.2370 (0.2409) loss 3.4173 (2.8541) grad_norm 3.1884 (4.2107) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1010/1251] eta 0:00:58 lr 0.000140 wd 0.0500 time 0.2349 (0.2430) data time 0.0007 (0.0017) model time 0.2341 (0.2409) loss 3.1070 (2.8549) grad_norm 3.8084 (4.2053) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1020/1251] eta 0:00:56 lr 0.000140 wd 0.0500 time 0.2380 (0.2430) data time 0.0008 (0.0017) model time 0.2372 (0.2409) loss 2.3734 (2.8546) grad_norm 2.6530 (4.1949) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[233/300][1030/1251] eta 0:00:53 lr 0.000140 wd 0.0500 time 0.2404 (0.2430) data time 0.0011 (0.0017) model time 0.2393 (0.2409) loss 2.7441 (2.8534) grad_norm 7.0506 (4.1960) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1040/1251] eta 0:00:51 lr 0.000140 wd 0.0500 time 0.2411 (0.2430) data time 0.0011 (0.0017) model time 0.2400 (0.2409) loss 2.3734 (2.8499) grad_norm 3.2337 (4.1936) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1050/1251] eta 0:00:48 lr 0.000140 wd 0.0500 time 0.2363 (0.2429) data time 0.0008 (0.0017) model time 0.2355 (0.2409) loss 3.0109 (2.8510) grad_norm 5.9138 (4.1931) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1060/1251] eta 0:00:46 lr 0.000140 wd 0.0500 time 0.2415 (0.2429) data time 0.0010 (0.0017) model time 0.2405 (0.2409) loss 3.2061 (2.8521) grad_norm 3.9154 (4.1952) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1070/1251] eta 0:00:43 lr 0.000140 wd 0.0500 time 0.2354 (0.2429) data time 0.0011 (0.0017) model time 0.2343 (0.2408) loss 3.4595 (2.8510) grad_norm 5.7332 (4.1977) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1080/1251] eta 0:00:41 lr 0.000140 wd 0.0500 time 0.3099 (0.2429) data time 0.0010 (0.0017) model time 0.3089 (0.2408) loss 3.0409 (2.8510) grad_norm 4.6188 (4.2109) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1090/1251] eta 0:00:39 lr 0.000140 wd 0.0500 time 0.2397 (0.2429) data time 0.0011 (0.0017) model time 0.2385 (0.2408) loss 2.6087 (2.8499) grad_norm 3.2250 (4.2091) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 06:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1100/1251] eta 0:00:36 lr 0.000140 wd 0.0500 time 0.2323 (0.2428) data time 0.0007 (0.0017) model time 0.2316 (0.2408) loss 1.9232 (2.8476) grad_norm 3.0816 (4.2040) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1110/1251] eta 0:00:34 lr 0.000140 wd 0.0500 time 0.2333 (0.2428) data time 0.0009 (0.0017) model time 0.2324 (0.2407) loss 2.8172 (2.8444) grad_norm 4.8019 (4.2148) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1120/1251] eta 0:00:31 lr 0.000140 wd 0.0500 time 0.2320 (0.2428) data time 0.0010 (0.0017) model time 0.2310 (0.2408) loss 2.9935 (2.8445) grad_norm 4.2160 (4.2228) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1130/1251] eta 0:00:29 lr 0.000140 wd 0.0500 time 0.2397 (0.2428) data time 0.0012 (0.0017) model time 0.2384 (0.2407) loss 2.3082 (2.8425) grad_norm 5.7856 (4.2240) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1140/1251] eta 0:00:26 lr 0.000140 wd 0.0500 time 0.2437 (0.2428) data time 0.0011 (0.0017) model time 0.2425 (0.2407) loss 2.6184 (2.8423) grad_norm 3.1264 (4.2249) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1150/1251] eta 0:00:24 lr 0.000140 wd 0.0500 time 0.2409 (0.2427) data time 0.0008 (0.0017) model time 0.2401 (0.2407) loss 2.4023 (2.8415) grad_norm 3.4270 (4.2255) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1160/1251] eta 0:00:22 lr 0.000140 wd 0.0500 time 0.2347 (0.2427) data time 0.0010 (0.0017) 
model time 0.2337 (0.2407) loss 3.5062 (2.8425) grad_norm 2.8999 (4.2224) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1170/1251] eta 0:00:19 lr 0.000140 wd 0.0500 time 0.2408 (0.2427) data time 0.0008 (0.0016) model time 0.2400 (0.2407) loss 3.2219 (2.8420) grad_norm 2.9268 (4.2289) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1180/1251] eta 0:00:17 lr 0.000140 wd 0.0500 time 0.2355 (0.2427) data time 0.0009 (0.0016) model time 0.2346 (0.2407) loss 2.8569 (2.8445) grad_norm 3.1361 (4.2239) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1190/1251] eta 0:00:14 lr 0.000140 wd 0.0500 time 0.2339 (0.2427) data time 0.0009 (0.0016) model time 0.2330 (0.2407) loss 2.8123 (2.8446) grad_norm 3.2766 (4.2164) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1200/1251] eta 0:00:12 lr 0.000140 wd 0.0500 time 0.2412 (0.2426) data time 0.0008 (0.0016) model time 0.2404 (0.2407) loss 3.5043 (2.8435) grad_norm 6.9269 (4.2164) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1210/1251] eta 0:00:09 lr 0.000140 wd 0.0500 time 0.2438 (0.2426) data time 0.0009 (0.0016) model time 0.2429 (0.2407) loss 3.0246 (2.8424) grad_norm 2.7797 (4.2151) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1220/1251] eta 0:00:07 lr 0.000140 wd 0.0500 time 0.2406 (0.2428) data time 0.0010 (0.0016) model time 0.2397 (0.2409) loss 2.8782 (2.8443) grad_norm 4.0095 (4.2092) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[233/300][1230/1251] eta 0:00:05 lr 0.000140 wd 0.0500 time 0.2449 (0.2428) data time 0.0010 (0.0016) model time 0.2439 (0.2408) loss 3.1863 (2.8431) grad_norm 3.0227 (4.2069) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1240/1251] eta 0:00:02 lr 0.000140 wd 0.0500 time 0.2228 (0.2427) data time 0.0005 (0.0016) model time 0.2223 (0.2408) loss 2.9943 (2.8446) grad_norm 3.3930 (4.2042) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [233/300][1250/1251] eta 0:00:00 lr 0.000140 wd 0.0500 time 0.2224 (0.2426) data time 0.0007 (0.0016) model time 0.2218 (0.2406) loss 2.4609 (2.8442) grad_norm 3.3785 (4.1998) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 233 training takes 0:05:03 [2024-09-01 06:07:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:07:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 06:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.454 (0.454) Loss 0.4080 (0.4080) Acc@1 92.676 (92.676) Acc@5 98.340 (98.340) Mem 7381MB [2024-09-01 06:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.094 (0.112) Loss 0.6338 (0.6374) Acc@1 88.281 (86.808) Acc@5 97.656 (97.425) Mem 7381MB [2024-09-01 06:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.097) Loss 0.9614 (0.6682) Acc@1 76.953 (85.761) Acc@5 95.117 (97.373) Mem 7381MB [2024-09-01 06:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.1426 (0.7589) Acc@1 73.730 (83.606) Acc@5 92.773 (96.443) Mem 7381MB [2024-09-01 06:07:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.0586 (0.8086) Acc@1 76.172 (82.419) Acc@5 93.750 (95.927) Mem 7381MB [2024-09-01 06:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.978 Acc@5 95.884 [2024-09-01 06:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-09-01 06:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 81.98% [2024-09-01 06:07:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 06:07:25 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 06:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.450 (0.450) Loss 0.3809 (0.3809) Acc@1 93.359 (93.359) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.108) Loss 0.5708 (0.6008) Acc@1 89.551 (87.553) Acc@5 97.852 (97.763) Mem 7381MB [2024-09-01 06:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.093) Loss 0.8760 (0.6293) Acc@1 78.027 (86.458) Acc@5 95.801 (97.689) Mem 7381MB [2024-09-01 06:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.088) Loss 1.0850 (0.7128) Acc@1 73.926 (84.362) Acc@5 92.871 (96.806) Mem 7381MB [2024-09-01 06:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.082) Loss 0.9854 (0.7575) Acc@1 76.562 (83.194) Acc@5 94.238 (96.351) Mem 7381MB [2024-09-01 06:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.748 Acc@5 96.334 [2024-09-01 06:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 06:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.75% [2024-09-01 06:07:29 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 06:07:30 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 06:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][0/1251] eta 0:16:51 lr 0.000140 wd 0.0500 time 0.8084 (0.8084) data time 0.5826 (0.5826) model time 0.0000 (0.0000) loss 2.6947 (2.6947) grad_norm 5.7383 (5.7383) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][10/1251] eta 0:06:04 lr 0.000140 wd 0.0500 time 0.2484 (0.2937) data time 0.0011 (0.0539) model time 0.0000 (0.0000) loss 2.2886 (2.8187) grad_norm 3.2096 (3.7910) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][20/1251] eta 0:05:30 lr 0.000140 wd 0.0500 time 0.2458 (0.2681) data time 0.0009 (0.0287) model time 0.0000 (0.0000) loss 2.2734 (2.7548) grad_norm 3.1326 (3.6936) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][30/1251] eta 0:05:16 lr 0.000140 wd 0.0500 time 0.2341 (0.2594) data time 0.0009 (0.0197) model time 0.0000 (0.0000) loss 3.0833 (2.8170) grad_norm 3.3574 (4.0760) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][40/1251] eta 0:05:08 lr 0.000140 wd 0.0500 time 0.2398 (0.2548) data time 0.0009 (0.0152) model time 0.0000 (0.0000) loss 1.9925 (2.7696) grad_norm 3.7206 (4.0675) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][50/1251] eta 0:05:03 lr 0.000139 wd 0.0500 time 0.2476 (0.2528) data time 0.0011 (0.0124) model time 0.0000 (0.0000) loss 2.9180 (2.8066) grad_norm 5.8254 (4.0489) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][60/1251] eta 0:04:58 lr 0.000139 wd 0.0500 time 0.2417 (0.2510) data time 0.0007 (0.0105) model time 0.2409 (0.2407) loss 
2.5040 (2.7682) grad_norm 4.5359 (4.0586) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][70/1251] eta 0:04:56 lr 0.000139 wd 0.0500 time 0.2366 (0.2514) data time 0.0007 (0.0092) model time 0.2358 (0.2467) loss 2.2076 (2.7812) grad_norm 3.1397 (4.0260) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][80/1251] eta 0:04:53 lr 0.000139 wd 0.0500 time 0.2408 (0.2504) data time 0.0007 (0.0082) model time 0.2401 (0.2453) loss 2.5945 (2.7615) grad_norm 3.6838 (3.9863) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][90/1251] eta 0:04:49 lr 0.000139 wd 0.0500 time 0.2405 (0.2495) data time 0.0009 (0.0074) model time 0.2396 (0.2444) loss 3.1251 (2.7800) grad_norm 2.9211 (3.9498) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][100/1251] eta 0:04:46 lr 0.000139 wd 0.0500 time 0.2415 (0.2485) data time 0.0010 (0.0068) model time 0.2405 (0.2432) loss 3.3278 (2.7808) grad_norm 2.9770 (3.9101) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][110/1251] eta 0:04:43 lr 0.000139 wd 0.0500 time 0.2581 (0.2481) data time 0.0008 (0.0063) model time 0.2572 (0.2431) loss 2.8198 (2.7918) grad_norm 3.7424 (3.8832) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][120/1251] eta 0:04:40 lr 0.000139 wd 0.0500 time 0.2444 (0.2478) data time 0.0012 (0.0058) model time 0.2432 (0.2432) loss 2.8077 (2.8074) grad_norm 4.9095 (3.8969) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][130/1251] eta 0:04:37 lr 0.000139 wd 
0.0500 time 0.2353 (0.2474) data time 0.0009 (0.0055) model time 0.2344 (0.2430) loss 1.7725 (2.8066) grad_norm 5.2518 (3.9013) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][140/1251] eta 0:04:34 lr 0.000139 wd 0.0500 time 0.2411 (0.2470) data time 0.0007 (0.0051) model time 0.2404 (0.2426) loss 3.1805 (2.7970) grad_norm 2.8332 (3.9717) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][150/1251] eta 0:04:31 lr 0.000139 wd 0.0500 time 0.2436 (0.2466) data time 0.0007 (0.0049) model time 0.2429 (0.2424) loss 3.0398 (2.7978) grad_norm 3.4343 (3.9592) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][160/1251] eta 0:04:28 lr 0.000139 wd 0.0500 time 0.2443 (0.2462) data time 0.0008 (0.0046) model time 0.2435 (0.2422) loss 2.2767 (2.7900) grad_norm 3.5166 (3.9397) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][170/1251] eta 0:04:25 lr 0.000139 wd 0.0500 time 0.2376 (0.2461) data time 0.0011 (0.0044) model time 0.2365 (0.2422) loss 3.1790 (2.7910) grad_norm 4.0854 (3.9775) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][180/1251] eta 0:04:23 lr 0.000139 wd 0.0500 time 0.2379 (0.2456) data time 0.0010 (0.0042) model time 0.2369 (0.2417) loss 1.7147 (2.7889) grad_norm 4.1806 (3.9960) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][190/1251] eta 0:04:20 lr 0.000139 wd 0.0500 time 0.2421 (0.2453) data time 0.0008 (0.0041) model time 0.2414 (0.2416) loss 3.0331 (2.7936) grad_norm 3.3273 (3.9740) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:19 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [234/300][200/1251] eta 0:04:17 lr 0.000139 wd 0.0500 time 0.2477 (0.2451) data time 0.0009 (0.0039) model time 0.2468 (0.2415) loss 2.9840 (2.8058) grad_norm 3.3599 (3.9594) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][210/1251] eta 0:04:14 lr 0.000139 wd 0.0500 time 0.2420 (0.2448) data time 0.0009 (0.0038) model time 0.2411 (0.2413) loss 3.1421 (2.8136) grad_norm 4.5929 (3.9712) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][220/1251] eta 0:04:12 lr 0.000139 wd 0.0500 time 0.2447 (0.2447) data time 0.0007 (0.0036) model time 0.2440 (0.2413) loss 3.2750 (2.8273) grad_norm 3.6721 (3.9548) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][230/1251] eta 0:04:09 lr 0.000139 wd 0.0500 time 0.2409 (0.2445) data time 0.0012 (0.0035) model time 0.2397 (0.2411) loss 3.0264 (2.8417) grad_norm 3.3943 (3.9606) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][240/1251] eta 0:04:07 lr 0.000139 wd 0.0500 time 0.2401 (0.2444) data time 0.0008 (0.0034) model time 0.2393 (0.2411) loss 2.7878 (2.8442) grad_norm 3.0748 (3.9333) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][250/1251] eta 0:04:04 lr 0.000139 wd 0.0500 time 0.2440 (0.2442) data time 0.0007 (0.0033) model time 0.2433 (0.2410) loss 2.1769 (2.8454) grad_norm 11.7456 (3.9699) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][260/1251] eta 0:04:02 lr 0.000139 wd 0.0500 time 0.2458 (0.2442) data time 0.0008 (0.0032) model time 0.2450 (0.2411) loss 2.7698 (2.8553) grad_norm 5.0153 
(3.9797) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][270/1251] eta 0:03:59 lr 0.000139 wd 0.0500 time 0.2383 (0.2441) data time 0.0009 (0.0031) model time 0.2374 (0.2411) loss 3.2146 (2.8545) grad_norm 3.2984 (3.9816) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][280/1251] eta 0:03:56 lr 0.000139 wd 0.0500 time 0.2392 (0.2440) data time 0.0007 (0.0031) model time 0.2385 (0.2411) loss 2.7648 (2.8506) grad_norm 4.0300 (4.0765) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][290/1251] eta 0:03:54 lr 0.000139 wd 0.0500 time 0.2377 (0.2439) data time 0.0009 (0.0030) model time 0.2368 (0.2410) loss 2.9219 (2.8578) grad_norm 4.2223 (4.0648) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][300/1251] eta 0:03:51 lr 0.000139 wd 0.0500 time 0.2338 (0.2438) data time 0.0009 (0.0029) model time 0.2329 (0.2409) loss 2.9198 (2.8573) grad_norm 3.7133 (4.1122) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][310/1251] eta 0:03:49 lr 0.000139 wd 0.0500 time 0.2381 (0.2437) data time 0.0008 (0.0029) model time 0.2373 (0.2409) loss 3.1550 (2.8598) grad_norm 3.2900 (4.1254) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][320/1251] eta 0:03:46 lr 0.000139 wd 0.0500 time 0.2450 (0.2437) data time 0.0007 (0.0028) model time 0.2443 (0.2410) loss 2.3329 (2.8535) grad_norm 3.6451 (4.1211) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][330/1251] eta 0:03:44 lr 0.000139 wd 0.0500 time 0.2411 (0.2436) data 
time 0.0007 (0.0028) model time 0.2404 (0.2409) loss 1.7291 (2.8497) grad_norm 4.0475 (4.1079) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][340/1251] eta 0:03:41 lr 0.000139 wd 0.0500 time 0.2359 (0.2436) data time 0.0007 (0.0027) model time 0.2352 (0.2410) loss 2.9434 (2.8493) grad_norm 3.1484 (4.1198) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][350/1251] eta 0:03:39 lr 0.000139 wd 0.0500 time 0.2394 (0.2435) data time 0.0008 (0.0027) model time 0.2386 (0.2409) loss 3.2482 (2.8457) grad_norm 5.3779 (4.1266) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][360/1251] eta 0:03:36 lr 0.000139 wd 0.0500 time 0.2360 (0.2434) data time 0.0008 (0.0026) model time 0.2351 (0.2408) loss 3.1618 (2.8451) grad_norm 2.7833 (4.1100) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][370/1251] eta 0:03:34 lr 0.000139 wd 0.0500 time 0.2376 (0.2433) data time 0.0008 (0.0026) model time 0.2367 (0.2407) loss 3.5894 (2.8450) grad_norm 4.8212 (4.1123) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][380/1251] eta 0:03:31 lr 0.000138 wd 0.0500 time 0.2408 (0.2432) data time 0.0010 (0.0025) model time 0.2398 (0.2407) loss 3.3304 (2.8496) grad_norm 3.3143 (4.1077) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][390/1251] eta 0:03:29 lr 0.000138 wd 0.0500 time 0.2369 (0.2431) data time 0.0009 (0.0025) model time 0.2360 (0.2407) loss 2.2428 (2.8479) grad_norm 4.0837 (4.0944) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [234/300][400/1251] eta 0:03:26 lr 0.000138 wd 0.0500 time 0.2362 (0.2431) data time 0.0007 (0.0025) model time 0.2355 (0.2406) loss 3.6479 (2.8493) grad_norm 3.6166 (4.0919) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][410/1251] eta 0:03:24 lr 0.000138 wd 0.0500 time 0.2327 (0.2430) data time 0.0009 (0.0024) model time 0.2318 (0.2406) loss 3.2927 (2.8512) grad_norm 3.6127 (4.0743) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][420/1251] eta 0:03:21 lr 0.000138 wd 0.0500 time 0.2400 (0.2430) data time 0.0009 (0.0024) model time 0.2392 (0.2406) loss 2.8189 (2.8506) grad_norm 2.5144 (4.0766) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][430/1251] eta 0:03:19 lr 0.000138 wd 0.0500 time 0.2382 (0.2430) data time 0.0010 (0.0024) model time 0.2372 (0.2407) loss 2.3573 (2.8477) grad_norm 4.2032 (4.0641) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][440/1251] eta 0:03:17 lr 0.000138 wd 0.0500 time 0.2440 (0.2430) data time 0.0009 (0.0023) model time 0.2430 (0.2407) loss 2.8266 (2.8465) grad_norm 3.3626 (4.0525) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][450/1251] eta 0:03:14 lr 0.000138 wd 0.0500 time 0.2409 (0.2429) data time 0.0008 (0.0023) model time 0.2401 (0.2407) loss 3.6306 (2.8488) grad_norm 4.0045 (4.0519) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][460/1251] eta 0:03:12 lr 0.000138 wd 0.0500 time 0.2409 (0.2429) data time 0.0008 (0.0023) model time 0.2401 (0.2407) loss 3.7180 (2.8452) grad_norm 4.4148 (4.0563) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 06:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][470/1251] eta 0:03:09 lr 0.000138 wd 0.0500 time 0.2345 (0.2428) data time 0.0007 (0.0022) model time 0.2338 (0.2406) loss 1.7484 (2.8430) grad_norm 2.5847 (4.0436) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][480/1251] eta 0:03:07 lr 0.000138 wd 0.0500 time 0.2452 (0.2428) data time 0.0007 (0.0022) model time 0.2445 (0.2406) loss 3.7359 (2.8441) grad_norm 2.9515 (4.0474) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][490/1251] eta 0:03:05 lr 0.000138 wd 0.0500 time 0.4596 (0.2432) data time 0.0010 (0.0022) model time 0.4586 (0.2411) loss 2.0670 (2.8421) grad_norm 6.1970 (4.0752) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][500/1251] eta 0:03:02 lr 0.000138 wd 0.0500 time 0.2428 (0.2432) data time 0.0009 (0.0022) model time 0.2419 (0.2411) loss 3.0757 (2.8474) grad_norm 5.3838 (4.0742) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][510/1251] eta 0:03:00 lr 0.000138 wd 0.0500 time 0.2462 (0.2432) data time 0.0007 (0.0021) model time 0.2455 (0.2412) loss 1.8864 (2.8442) grad_norm 3.4649 (4.0637) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][520/1251] eta 0:02:57 lr 0.000138 wd 0.0500 time 0.2358 (0.2432) data time 0.0010 (0.0021) model time 0.2347 (0.2412) loss 3.2616 (2.8438) grad_norm 4.0797 (4.0585) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][530/1251] eta 0:02:55 lr 0.000138 wd 0.0500 time 0.2431 (0.2431) data time 0.0008 (0.0021) model 
time 0.2423 (0.2411) loss 3.2411 (2.8449) grad_norm 3.7749 (4.0564) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][540/1251] eta 0:02:52 lr 0.000138 wd 0.0500 time 0.2355 (0.2431) data time 0.0009 (0.0021) model time 0.2346 (0.2411) loss 2.0009 (2.8410) grad_norm 4.2446 (4.0544) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][550/1251] eta 0:02:50 lr 0.000138 wd 0.0500 time 0.2365 (0.2431) data time 0.0009 (0.0021) model time 0.2356 (0.2411) loss 2.9941 (2.8425) grad_norm 3.0171 (4.0436) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][560/1251] eta 0:02:47 lr 0.000138 wd 0.0500 time 0.2391 (0.2430) data time 0.0009 (0.0020) model time 0.2382 (0.2410) loss 3.5038 (2.8470) grad_norm 5.7150 (4.0374) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][570/1251] eta 0:02:45 lr 0.000138 wd 0.0500 time 0.2342 (0.2429) data time 0.0009 (0.0020) model time 0.2333 (0.2410) loss 3.3009 (2.8476) grad_norm 3.1926 (4.0269) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][580/1251] eta 0:02:42 lr 0.000138 wd 0.0500 time 0.2343 (0.2429) data time 0.0008 (0.0020) model time 0.2335 (0.2410) loss 4.0545 (2.8517) grad_norm 2.9707 (4.0240) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][590/1251] eta 0:02:40 lr 0.000138 wd 0.0500 time 0.2145 (0.2432) data time 0.0011 (0.0020) model time 0.2134 (0.2413) loss 3.1143 (2.8512) grad_norm 2.9554 (4.0161) loss_scale 256.0000 (128.2166) mem 7381MB [2024-09-01 06:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][600/1251] 
eta 0:02:38 lr 0.000138 wd 0.0500 time 0.2339 (0.2431) data time 0.0008 (0.0020) model time 0.2331 (0.2413) loss 3.0829 (2.8509) grad_norm 3.4288 (4.0097) loss_scale 256.0000 (130.3428) mem 7381MB [2024-09-01 06:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][610/1251] eta 0:02:35 lr 0.000138 wd 0.0500 time 0.2311 (0.2431) data time 0.0009 (0.0020) model time 0.2301 (0.2412) loss 3.1000 (2.8529) grad_norm 2.6570 (3.9997) loss_scale 256.0000 (132.3993) mem 7381MB [2024-09-01 06:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][620/1251] eta 0:02:33 lr 0.000138 wd 0.0500 time 0.2430 (0.2431) data time 0.0008 (0.0019) model time 0.2422 (0.2412) loss 2.9704 (2.8568) grad_norm 3.9255 (3.9983) loss_scale 256.0000 (134.3897) mem 7381MB [2024-09-01 06:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][630/1251] eta 0:02:30 lr 0.000138 wd 0.0500 time 0.2512 (0.2430) data time 0.0009 (0.0019) model time 0.2502 (0.2412) loss 2.6143 (2.8601) grad_norm 3.9115 (4.0000) loss_scale 256.0000 (136.3170) mem 7381MB [2024-09-01 06:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][640/1251] eta 0:02:28 lr 0.000138 wd 0.0500 time 0.2368 (0.2430) data time 0.0012 (0.0019) model time 0.2356 (0.2411) loss 2.8177 (2.8596) grad_norm 4.4341 (4.0009) loss_scale 256.0000 (138.1841) mem 7381MB [2024-09-01 06:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][650/1251] eta 0:02:25 lr 0.000138 wd 0.0500 time 0.2297 (0.2429) data time 0.0008 (0.0019) model time 0.2289 (0.2411) loss 2.6773 (2.8576) grad_norm 3.6118 (3.9944) loss_scale 256.0000 (139.9939) mem 7381MB [2024-09-01 06:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][660/1251] eta 0:02:23 lr 0.000138 wd 0.0500 time 0.2411 (0.2429) data time 0.0009 (0.0019) model time 0.2402 (0.2411) loss 3.0275 (2.8598) grad_norm 3.2874 (3.9921) loss_scale 256.0000 (141.7489) mem 7381MB [2024-09-01 
06:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][670/1251] eta 0:02:21 lr 0.000138 wd 0.0500 time 0.2464 (0.2429) data time 0.0009 (0.0019) model time 0.2455 (0.2411) loss 1.8873 (2.8596) grad_norm 3.5272 (3.9951) loss_scale 256.0000 (143.4516) mem 7381MB [2024-09-01 06:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][680/1251] eta 0:02:18 lr 0.000138 wd 0.0500 time 0.2482 (0.2430) data time 0.0010 (0.0019) model time 0.2473 (0.2412) loss 2.9029 (2.8600) grad_norm 2.7699 (3.9839) loss_scale 256.0000 (145.1043) mem 7381MB [2024-09-01 06:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][690/1251] eta 0:02:16 lr 0.000138 wd 0.0500 time 0.2362 (0.2429) data time 0.0008 (0.0018) model time 0.2354 (0.2411) loss 1.7038 (2.8580) grad_norm 3.1713 (3.9850) loss_scale 256.0000 (146.7091) mem 7381MB [2024-09-01 06:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][700/1251] eta 0:02:13 lr 0.000138 wd 0.0500 time 0.2389 (0.2429) data time 0.0011 (0.0018) model time 0.2378 (0.2411) loss 2.7984 (2.8575) grad_norm 4.4306 (3.9910) loss_scale 256.0000 (148.2682) mem 7381MB [2024-09-01 06:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][710/1251] eta 0:02:11 lr 0.000138 wd 0.0500 time 0.2465 (0.2429) data time 0.0010 (0.0018) model time 0.2454 (0.2411) loss 3.3387 (2.8602) grad_norm 2.9390 (3.9888) loss_scale 256.0000 (149.7834) mem 7381MB [2024-09-01 06:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][720/1251] eta 0:02:08 lr 0.000137 wd 0.0500 time 0.2319 (0.2428) data time 0.0009 (0.0018) model time 0.2310 (0.2411) loss 3.1591 (2.8609) grad_norm 5.3619 (4.0003) loss_scale 256.0000 (151.2566) mem 7381MB [2024-09-01 06:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][730/1251] eta 0:02:06 lr 0.000137 wd 0.0500 time 0.2447 (0.2428) data time 0.0011 (0.0018) model time 0.2437 (0.2411) loss 2.9611 
(2.8609) grad_norm 5.0412 (4.0024) loss_scale 256.0000 (152.6895) mem 7381MB [2024-09-01 06:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][740/1251] eta 0:02:04 lr 0.000137 wd 0.0500 time 0.2348 (0.2428) data time 0.0010 (0.0018) model time 0.2337 (0.2411) loss 3.2335 (2.8637) grad_norm 4.5414 (4.0360) loss_scale 256.0000 (154.0837) mem 7381MB [2024-09-01 06:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][750/1251] eta 0:02:01 lr 0.000137 wd 0.0500 time 0.2472 (0.2428) data time 0.0008 (0.0018) model time 0.2463 (0.2411) loss 2.7071 (2.8624) grad_norm 3.4988 (4.0394) loss_scale 256.0000 (155.4407) mem 7381MB [2024-09-01 06:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][760/1251] eta 0:01:59 lr 0.000137 wd 0.0500 time 0.2379 (0.2428) data time 0.0007 (0.0018) model time 0.2373 (0.2411) loss 3.4805 (2.8655) grad_norm 2.9193 (4.0485) loss_scale 256.0000 (156.7622) mem 7381MB [2024-09-01 06:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][770/1251] eta 0:01:56 lr 0.000137 wd 0.0500 time 0.2592 (0.2428) data time 0.0011 (0.0018) model time 0.2581 (0.2411) loss 3.3075 (2.8672) grad_norm 4.3252 (4.0434) loss_scale 256.0000 (158.0493) mem 7381MB [2024-09-01 06:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][780/1251] eta 0:01:54 lr 0.000137 wd 0.0500 time 0.2314 (0.2427) data time 0.0008 (0.0017) model time 0.2306 (0.2410) loss 2.8908 (2.8669) grad_norm 3.0302 (4.0426) loss_scale 256.0000 (159.3035) mem 7381MB [2024-09-01 06:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][790/1251] eta 0:01:51 lr 0.000137 wd 0.0500 time 0.2588 (0.2427) data time 0.0009 (0.0017) model time 0.2579 (0.2410) loss 3.1995 (2.8684) grad_norm 3.9687 (4.0388) loss_scale 256.0000 (160.5259) mem 7381MB [2024-09-01 06:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][800/1251] eta 0:01:49 lr 0.000137 wd 0.0500 
time 0.2330 (0.2427) data time 0.0008 (0.0017) model time 0.2321 (0.2410) loss 2.8230 (2.8681) grad_norm 2.9350 (4.0384) loss_scale 256.0000 (161.7179) mem 7381MB [2024-09-01 06:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][810/1251] eta 0:01:47 lr 0.000137 wd 0.0500 time 0.2393 (0.2427) data time 0.0008 (0.0017) model time 0.2385 (0.2410) loss 3.4886 (2.8658) grad_norm 3.8328 (4.0437) loss_scale 256.0000 (162.8804) mem 7381MB [2024-09-01 06:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][820/1251] eta 0:01:44 lr 0.000137 wd 0.0500 time 0.2293 (0.2427) data time 0.0011 (0.0017) model time 0.2282 (0.2410) loss 3.0943 (2.8682) grad_norm 4.8383 (4.0452) loss_scale 256.0000 (164.0146) mem 7381MB [2024-09-01 06:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][830/1251] eta 0:01:42 lr 0.000137 wd 0.0500 time 0.2425 (0.2426) data time 0.0010 (0.0017) model time 0.2415 (0.2410) loss 2.9403 (2.8664) grad_norm 3.9896 (4.0505) loss_scale 256.0000 (165.1215) mem 7381MB [2024-09-01 06:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][840/1251] eta 0:01:39 lr 0.000137 wd 0.0500 time 0.2408 (0.2426) data time 0.0011 (0.0017) model time 0.2397 (0.2410) loss 3.1440 (2.8654) grad_norm 4.3526 (4.0459) loss_scale 256.0000 (166.2021) mem 7381MB [2024-09-01 06:10:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][850/1251] eta 0:01:37 lr 0.000137 wd 0.0500 time 0.2308 (0.2426) data time 0.0008 (0.0017) model time 0.2300 (0.2409) loss 2.0809 (2.8661) grad_norm 3.9939 (4.0459) loss_scale 256.0000 (167.2573) mem 7381MB [2024-09-01 06:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][860/1251] eta 0:01:34 lr 0.000137 wd 0.0500 time 0.2432 (0.2426) data time 0.0010 (0.0017) model time 0.2422 (0.2409) loss 2.5397 (2.8648) grad_norm 4.4084 (4.0527) loss_scale 256.0000 (168.2880) mem 7381MB [2024-09-01 06:11:01 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [234/300][870/1251] eta 0:01:32 lr 0.000137 wd 0.0500 time 0.2375 (0.2425) data time 0.0007 (0.0017) model time 0.2368 (0.2409) loss 1.7786 (2.8652) grad_norm 4.8973 (4.0511) loss_scale 256.0000 (169.2951) mem 7381MB [2024-09-01 06:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][880/1251] eta 0:01:29 lr 0.000137 wd 0.0500 time 0.2395 (0.2425) data time 0.0007 (0.0017) model time 0.2388 (0.2409) loss 2.4078 (2.8634) grad_norm 3.9763 (4.0558) loss_scale 256.0000 (170.2792) mem 7381MB [2024-09-01 06:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][890/1251] eta 0:01:27 lr 0.000137 wd 0.0500 time 0.2340 (0.2425) data time 0.0007 (0.0016) model time 0.2333 (0.2408) loss 3.0193 (2.8630) grad_norm 3.5732 (4.0573) loss_scale 256.0000 (171.2413) mem 7381MB [2024-09-01 06:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][900/1251] eta 0:01:25 lr 0.000137 wd 0.0500 time 0.2375 (0.2425) data time 0.0010 (0.0016) model time 0.2365 (0.2408) loss 2.5886 (2.8609) grad_norm 4.8598 (4.0752) loss_scale 256.0000 (172.1820) mem 7381MB [2024-09-01 06:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][910/1251] eta 0:01:22 lr 0.000137 wd 0.0500 time 0.2415 (0.2424) data time 0.0010 (0.0016) model time 0.2406 (0.2408) loss 2.8635 (2.8625) grad_norm 3.5591 (4.0722) loss_scale 256.0000 (173.1021) mem 7381MB [2024-09-01 06:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][920/1251] eta 0:01:20 lr 0.000137 wd 0.0500 time 0.2428 (0.2424) data time 0.0010 (0.0016) model time 0.2418 (0.2408) loss 2.4877 (2.8608) grad_norm 3.8050 (4.0703) loss_scale 256.0000 (174.0022) mem 7381MB [2024-09-01 06:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][930/1251] eta 0:01:17 lr 0.000137 wd 0.0500 time 0.2384 (0.2424) data time 0.0008 (0.0016) model time 0.2376 (0.2408) loss 1.7884 (2.8588) grad_norm 6.9868 
(4.0707) loss_scale 256.0000 (174.8829) mem 7381MB [2024-09-01 06:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][940/1251] eta 0:01:15 lr 0.000137 wd 0.0500 time 0.2422 (0.2424) data time 0.0007 (0.0016) model time 0.2415 (0.2408) loss 1.6974 (2.8575) grad_norm 3.1129 (4.0641) loss_scale 256.0000 (175.7450) mem 7381MB [2024-09-01 06:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][950/1251] eta 0:01:12 lr 0.000137 wd 0.0500 time 0.2419 (0.2424) data time 0.0008 (0.0016) model time 0.2410 (0.2408) loss 3.1946 (2.8575) grad_norm 7.3100 (4.0644) loss_scale 256.0000 (176.5889) mem 7381MB [2024-09-01 06:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][960/1251] eta 0:01:10 lr 0.000137 wd 0.0500 time 0.2476 (0.2424) data time 0.0010 (0.0016) model time 0.2466 (0.2408) loss 2.7635 (2.8581) grad_norm 4.0111 (4.0607) loss_scale 256.0000 (177.4152) mem 7381MB [2024-09-01 06:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][970/1251] eta 0:01:08 lr 0.000137 wd 0.0500 time 0.2478 (0.2423) data time 0.0007 (0.0016) model time 0.2470 (0.2408) loss 3.3151 (2.8593) grad_norm 3.8230 (4.0597) loss_scale 256.0000 (178.2245) mem 7381MB [2024-09-01 06:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][980/1251] eta 0:01:05 lr 0.000137 wd 0.0500 time 0.2447 (0.2423) data time 0.0008 (0.0016) model time 0.2439 (0.2408) loss 2.1092 (2.8591) grad_norm 3.8848 (4.0559) loss_scale 256.0000 (179.0173) mem 7381MB [2024-09-01 06:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][990/1251] eta 0:01:03 lr 0.000137 wd 0.0500 time 0.2408 (0.2423) data time 0.0011 (0.0016) model time 0.2396 (0.2408) loss 3.1646 (2.8610) grad_norm 3.7012 (4.0543) loss_scale 256.0000 (179.7941) mem 7381MB [2024-09-01 06:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1000/1251] eta 0:01:00 lr 0.000137 wd 0.0500 time 0.2436 (0.2423) 
data time 0.0009 (0.0016) model time 0.2427 (0.2408) loss 1.9692 (2.8585) grad_norm 3.7733 (4.0492) loss_scale 256.0000 (180.5554) mem 7381MB [2024-09-01 06:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1010/1251] eta 0:00:58 lr 0.000137 wd 0.0500 time 0.2420 (0.2423) data time 0.0007 (0.0016) model time 0.2412 (0.2408) loss 3.3978 (2.8597) grad_norm 4.7866 (4.0448) loss_scale 256.0000 (181.3017) mem 7381MB [2024-09-01 06:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1020/1251] eta 0:00:55 lr 0.000137 wd 0.0500 time 0.2416 (0.2423) data time 0.0008 (0.0016) model time 0.2409 (0.2407) loss 2.1688 (2.8601) grad_norm 3.9073 (4.0392) loss_scale 256.0000 (182.0333) mem 7381MB [2024-09-01 06:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1030/1251] eta 0:00:53 lr 0.000137 wd 0.0500 time 0.2428 (0.2423) data time 0.0008 (0.0016) model time 0.2420 (0.2407) loss 3.0986 (2.8603) grad_norm 3.2840 (4.0365) loss_scale 256.0000 (182.7507) mem 7381MB [2024-09-01 06:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1040/1251] eta 0:00:51 lr 0.000137 wd 0.0500 time 0.2447 (0.2423) data time 0.0012 (0.0016) model time 0.2435 (0.2407) loss 3.7554 (2.8630) grad_norm 6.5995 (4.0407) loss_scale 256.0000 (183.4544) mem 7381MB [2024-09-01 06:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1050/1251] eta 0:00:48 lr 0.000137 wd 0.0500 time 0.2362 (0.2422) data time 0.0010 (0.0015) model time 0.2351 (0.2407) loss 2.9737 (2.8655) grad_norm 4.5401 (4.0377) loss_scale 256.0000 (184.1446) mem 7381MB [2024-09-01 06:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1060/1251] eta 0:00:46 lr 0.000136 wd 0.0500 time 0.2419 (0.2422) data time 0.0007 (0.0015) model time 0.2412 (0.2407) loss 2.5034 (2.8632) grad_norm 3.9125 (4.0868) loss_scale 256.0000 (184.8219) mem 7381MB [2024-09-01 06:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [234/300][1070/1251] eta 0:00:43 lr 0.000136 wd 0.0500 time 0.2371 (0.2422) data time 0.0009 (0.0015) model time 0.2363 (0.2407) loss 3.7637 (2.8622) grad_norm 3.4040 (4.1064) loss_scale 256.0000 (185.4865) mem 7381MB [2024-09-01 06:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1080/1251] eta 0:00:41 lr 0.000136 wd 0.0500 time 0.2470 (0.2422) data time 0.0007 (0.0015) model time 0.2463 (0.2407) loss 2.7676 (2.8586) grad_norm 3.7044 (4.1107) loss_scale 256.0000 (186.1388) mem 7381MB [2024-09-01 06:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1090/1251] eta 0:00:38 lr 0.000136 wd 0.0500 time 0.2306 (0.2422) data time 0.0009 (0.0015) model time 0.2297 (0.2406) loss 2.3633 (2.8581) grad_norm 3.3513 (4.1176) loss_scale 256.0000 (186.7791) mem 7381MB [2024-09-01 06:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1100/1251] eta 0:00:36 lr 0.000136 wd 0.0500 time 0.2335 (0.2421) data time 0.0010 (0.0015) model time 0.2325 (0.2406) loss 2.9468 (2.8562) grad_norm 4.8303 (4.1219) loss_scale 256.0000 (187.4078) mem 7381MB [2024-09-01 06:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1110/1251] eta 0:00:34 lr 0.000136 wd 0.0500 time 0.2290 (0.2421) data time 0.0012 (0.0015) model time 0.2278 (0.2406) loss 2.7776 (2.8569) grad_norm 3.1374 (4.1200) loss_scale 256.0000 (188.0252) mem 7381MB [2024-09-01 06:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1120/1251] eta 0:00:31 lr 0.000136 wd 0.0500 time 0.2450 (0.2421) data time 0.0009 (0.0015) model time 0.2442 (0.2406) loss 3.3175 (2.8565) grad_norm 3.0802 (4.1186) loss_scale 256.0000 (188.6316) mem 7381MB [2024-09-01 06:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1130/1251] eta 0:00:29 lr 0.000136 wd 0.0500 time 0.2495 (0.2421) data time 0.0009 (0.0015) model time 0.2485 (0.2406) loss 3.1521 (2.8557) grad_norm 4.4386 (4.1190) loss_scale 
256.0000 (189.2272) mem 7381MB [2024-09-01 06:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1140/1251] eta 0:00:26 lr 0.000136 wd 0.0500 time 0.2352 (0.2421) data time 0.0009 (0.0015) model time 0.2343 (0.2406) loss 2.8347 (2.8552) grad_norm 4.7079 (4.1244) loss_scale 256.0000 (189.8124) mem 7381MB [2024-09-01 06:12:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1150/1251] eta 0:00:24 lr 0.000136 wd 0.0500 time 0.2425 (0.2421) data time 0.0012 (0.0015) model time 0.2413 (0.2406) loss 3.2805 (2.8566) grad_norm 2.6912 (4.1213) loss_scale 256.0000 (190.3875) mem 7381MB [2024-09-01 06:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1160/1251] eta 0:00:22 lr 0.000136 wd 0.0500 time 0.2356 (0.2421) data time 0.0011 (0.0015) model time 0.2345 (0.2406) loss 2.9688 (2.8562) grad_norm 3.6673 (4.1170) loss_scale 256.0000 (190.9526) mem 7381MB [2024-09-01 06:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1170/1251] eta 0:00:19 lr 0.000136 wd 0.0500 time 0.2424 (0.2421) data time 0.0008 (0.0015) model time 0.2416 (0.2406) loss 3.1858 (2.8571) grad_norm 4.9993 (4.1276) loss_scale 256.0000 (191.5081) mem 7381MB [2024-09-01 06:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1180/1251] eta 0:00:17 lr 0.000136 wd 0.0500 time 0.2390 (0.2421) data time 0.0009 (0.0015) model time 0.2382 (0.2406) loss 2.2014 (2.8571) grad_norm 3.0077 (4.1303) loss_scale 256.0000 (192.0542) mem 7381MB [2024-09-01 06:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1190/1251] eta 0:00:14 lr 0.000136 wd 0.0500 time 0.2379 (0.2421) data time 0.0010 (0.0015) model time 0.2370 (0.2406) loss 3.3842 (2.8572) grad_norm 3.3475 (4.1279) loss_scale 256.0000 (192.5911) mem 7381MB [2024-09-01 06:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1200/1251] eta 0:00:12 lr 0.000136 wd 0.0500 time 0.2486 (0.2421) data time 0.0009 
(0.0015) model time 0.2476 (0.2406) loss 3.1922 (2.8576) grad_norm 2.7695 (4.1251) loss_scale 256.0000 (193.1191) mem 7381MB [2024-09-01 06:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1210/1251] eta 0:00:09 lr 0.000136 wd 0.0500 time 0.2420 (0.2421) data time 0.0009 (0.0015) model time 0.2411 (0.2406) loss 3.1904 (2.8582) grad_norm 5.3554 (4.1196) loss_scale 256.0000 (193.6383) mem 7381MB [2024-09-01 06:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1220/1251] eta 0:00:07 lr 0.000136 wd 0.0500 time 0.2430 (0.2421) data time 0.0009 (0.0015) model time 0.2421 (0.2406) loss 2.9225 (2.8580) grad_norm 3.1315 (4.1164) loss_scale 256.0000 (194.1491) mem 7381MB [2024-09-01 06:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1230/1251] eta 0:00:05 lr 0.000136 wd 0.0500 time 0.2435 (0.2420) data time 0.0011 (0.0015) model time 0.2424 (0.2406) loss 2.7156 (2.8558) grad_norm 3.9166 (4.1151) loss_scale 256.0000 (194.6515) mem 7381MB [2024-09-01 06:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1240/1251] eta 0:00:02 lr 0.000136 wd 0.0500 time 0.2241 (0.2420) data time 0.0005 (0.0015) model time 0.2237 (0.2405) loss 3.5463 (2.8568) grad_norm 3.4559 (4.1118) loss_scale 256.0000 (195.1459) mem 7381MB [2024-09-01 06:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [234/300][1250/1251] eta 0:00:00 lr 0.000136 wd 0.0500 time 0.2256 (0.2418) data time 0.0007 (0.0015) model time 0.2249 (0.2404) loss 3.2815 (2.8584) grad_norm 4.3413 (4.1098) loss_scale 256.0000 (195.6323) mem 7381MB [2024-09-01 06:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 234 training takes 0:05:02 [2024-09-01 06:12:33 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 06:12:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 06:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.454 (0.454) Loss 0.4053 (0.4053) Acc@1 92.676 (92.676) Acc@5 98.145 (98.145) Mem 7381MB [2024-09-01 06:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.109) Loss 0.5991 (0.6347) Acc@1 88.672 (86.657) Acc@5 97.559 (97.576) Mem 7381MB [2024-09-01 06:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.095) Loss 0.9365 (0.6594) Acc@1 77.441 (85.672) Acc@5 95.605 (97.461) Mem 7381MB [2024-09-01 06:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.1514 (0.7485) Acc@1 73.047 (83.553) Acc@5 92.578 (96.459) Mem 7381MB [2024-09-01 06:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0410 (0.7986) Acc@1 74.512 (82.277) Acc@5 93.848 (95.948) Mem 7381MB [2024-09-01 06:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.874 Acc@5 95.874 [2024-09-01 06:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.9% [2024-09-01 06:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.742 (0.742) Loss 0.3801 (0.3801) Acc@1 93.359 (93.359) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.147) Loss 0.5713 (0.6005) Acc@1 89.551 (87.536) Acc@5 97.852 (97.772) Mem 7381MB [2024-09-01 06:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.115) Loss 0.8770 (0.6292) Acc@1 78.125 (86.416) Acc@5 95.898 (97.693) Mem 7381MB [2024-09-01 06:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.103) Loss 1.0850 (0.7126) Acc@1 74.316 (84.340) Acc@5 
93.066 (96.821) Mem 7381MB [2024-09-01 06:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 0.9858 (0.7575) Acc@1 76.465 (83.184) Acc@5 94.238 (96.351) Mem 7381MB [2024-09-01 06:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.746 Acc@5 96.330 [2024-09-01 06:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 06:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][0/1251] eta 0:23:38 lr 0.000136 wd 0.0500 time 1.1341 (1.1341) data time 0.6183 (0.6183) model time 0.0000 (0.0000) loss 3.0630 (3.0630) grad_norm 3.3337 (3.3337) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][10/1251] eta 0:06:43 lr 0.000136 wd 0.0500 time 0.2489 (0.3253) data time 0.0009 (0.0571) model time 0.0000 (0.0000) loss 2.8575 (2.6262) grad_norm 5.7212 (4.5498) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][20/1251] eta 0:05:51 lr 0.000136 wd 0.0500 time 0.2413 (0.2856) data time 0.0007 (0.0304) model time 0.0000 (0.0000) loss 2.9111 (2.6780) grad_norm 3.3154 (4.2122) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][30/1251] eta 0:05:31 lr 0.000136 wd 0.0500 time 0.2438 (0.2716) data time 0.0010 (0.0209) model time 0.0000 (0.0000) loss 2.4309 (2.7315) grad_norm 4.1531 (4.1164) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][40/1251] eta 0:05:20 lr 0.000136 wd 0.0500 time 0.2414 (0.2649) data time 0.0011 (0.0161) model time 0.0000 (0.0000) loss 2.8712 (2.7531) grad_norm 4.6462 (4.2000) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [235/300][50/1251] eta 0:05:13 lr 0.000136 wd 0.0500 time 0.2502 (0.2607) data time 0.0010 (0.0131) model time 0.0000 (0.0000) loss 3.1595 (2.7243) grad_norm 3.5700 (4.2123) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][60/1251] eta 0:05:06 lr 0.000136 wd 0.0500 time 0.2403 (0.2576) data time 0.0010 (0.0111) model time 0.2393 (0.2408) loss 3.0247 (2.7678) grad_norm 3.4867 (4.3144) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][70/1251] eta 0:05:01 lr 0.000136 wd 0.0500 time 0.2457 (0.2555) data time 0.0011 (0.0097) model time 0.2447 (0.2411) loss 2.5669 (2.7872) grad_norm 4.0998 (4.1836) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][80/1251] eta 0:04:57 lr 0.000136 wd 0.0500 time 0.2450 (0.2541) data time 0.0007 (0.0086) model time 0.2443 (0.2420) loss 1.6928 (2.7646) grad_norm 3.2045 (4.1500) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][90/1251] eta 0:04:53 lr 0.000136 wd 0.0500 time 0.2414 (0.2529) data time 0.0009 (0.0078) model time 0.2405 (0.2420) loss 1.9082 (2.7626) grad_norm 2.9316 (4.1558) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][100/1251] eta 0:04:54 lr 0.000136 wd 0.0500 time 0.2449 (0.2562) data time 0.0009 (0.0071) model time 0.2440 (0.2505) loss 3.3982 (2.7767) grad_norm 3.7167 (4.0818) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][110/1251] eta 0:04:50 lr 0.000136 wd 0.0500 time 0.2474 (0.2548) data time 0.0008 (0.0066) model time 0.2466 (0.2487) loss 2.6751 (2.7756) grad_norm 3.3746 (4.1181) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 06:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][120/1251] eta 0:04:46 lr 0.000136 wd 0.0500 time 0.2393 (0.2536) data time 0.0010 (0.0061) model time 0.2383 (0.2475) loss 2.1170 (2.7832) grad_norm 3.8065 (4.1443) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][130/1251] eta 0:04:42 lr 0.000136 wd 0.0500 time 0.2421 (0.2523) data time 0.0009 (0.0057) model time 0.2411 (0.2460) loss 2.4344 (2.7771) grad_norm 3.5582 (4.1174) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][140/1251] eta 0:04:39 lr 0.000135 wd 0.0500 time 0.2431 (0.2515) data time 0.0009 (0.0054) model time 0.2422 (0.2453) loss 3.3341 (2.7981) grad_norm 3.8077 (4.1279) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][150/1251] eta 0:04:35 lr 0.000135 wd 0.0500 time 0.2389 (0.2507) data time 0.0011 (0.0051) model time 0.2378 (0.2445) loss 3.0903 (2.8080) grad_norm 6.3937 (4.1223) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][160/1251] eta 0:04:32 lr 0.000135 wd 0.0500 time 0.2377 (0.2501) data time 0.0007 (0.0048) model time 0.2370 (0.2441) loss 1.7842 (2.8022) grad_norm 3.3520 (4.1638) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][170/1251] eta 0:04:29 lr 0.000135 wd 0.0500 time 0.2364 (0.2495) data time 0.0009 (0.0046) model time 0.2355 (0.2438) loss 2.0070 (2.8012) grad_norm 6.8842 (4.1928) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][180/1251] eta 0:04:26 lr 0.000135 wd 0.0500 time 0.2348 (0.2491) data time 0.0010 (0.0044) model time 0.2338 
(0.2435) loss 2.7179 (2.8077) grad_norm 15.9154 (4.2704) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][190/1251] eta 0:04:23 lr 0.000135 wd 0.0500 time 0.2442 (0.2486) data time 0.0009 (0.0042) model time 0.2433 (0.2432) loss 3.2612 (2.8273) grad_norm 3.4949 (4.2690) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][200/1251] eta 0:04:20 lr 0.000135 wd 0.0500 time 0.2418 (0.2482) data time 0.0010 (0.0041) model time 0.2408 (0.2429) loss 2.8301 (2.8193) grad_norm 2.5629 (4.2284) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][210/1251] eta 0:04:19 lr 0.000135 wd 0.0500 time 0.2429 (0.2488) data time 0.0011 (0.0039) model time 0.2418 (0.2440) loss 3.4419 (2.8289) grad_norm 5.3242 (4.2430) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][220/1251] eta 0:04:16 lr 0.000135 wd 0.0500 time 0.2373 (0.2485) data time 0.0011 (0.0038) model time 0.2363 (0.2438) loss 2.5529 (2.8263) grad_norm 4.1508 (4.3171) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][230/1251] eta 0:04:13 lr 0.000135 wd 0.0500 time 0.2485 (0.2481) data time 0.0008 (0.0037) model time 0.2476 (0.2436) loss 2.4210 (2.8270) grad_norm 3.0615 (4.2897) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][240/1251] eta 0:04:10 lr 0.000135 wd 0.0500 time 0.2370 (0.2479) data time 0.0008 (0.0036) model time 0.2362 (0.2434) loss 3.4012 (2.8296) grad_norm 4.6814 (4.2750) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][250/1251] eta 0:04:07 
lr 0.000135 wd 0.0500 time 0.2346 (0.2475) data time 0.0007 (0.0035) model time 0.2339 (0.2431) loss 3.1227 (2.8265) grad_norm 2.8909 (4.2639) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][260/1251] eta 0:04:04 lr 0.000135 wd 0.0500 time 0.2420 (0.2472) data time 0.0009 (0.0034) model time 0.2411 (0.2429) loss 3.4529 (2.8297) grad_norm 3.7159 (4.2413) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][270/1251] eta 0:04:02 lr 0.000135 wd 0.0500 time 0.2393 (0.2469) data time 0.0009 (0.0033) model time 0.2384 (0.2427) loss 3.3115 (2.8328) grad_norm 3.4886 (4.2253) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][280/1251] eta 0:03:59 lr 0.000135 wd 0.0500 time 0.2430 (0.2467) data time 0.0009 (0.0032) model time 0.2421 (0.2426) loss 2.8449 (2.8342) grad_norm 4.0145 (4.2154) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][290/1251] eta 0:03:56 lr 0.000135 wd 0.0500 time 0.2444 (0.2465) data time 0.0009 (0.0031) model time 0.2435 (0.2425) loss 2.5438 (2.8324) grad_norm 3.7890 (4.2287) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][300/1251] eta 0:03:54 lr 0.000135 wd 0.0500 time 0.2452 (0.2463) data time 0.0009 (0.0031) model time 0.2442 (0.2423) loss 3.0367 (2.8354) grad_norm 3.3416 (4.2282) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][310/1251] eta 0:03:51 lr 0.000135 wd 0.0500 time 0.2473 (0.2462) data time 0.0011 (0.0030) model time 0.2462 (0.2424) loss 3.0399 (2.8330) grad_norm 3.1313 (4.2194) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][320/1251] eta 0:03:49 lr 0.000135 wd 0.0500 time 0.2348 (0.2460) data time 0.0009 (0.0029) model time 0.2339 (0.2422) loss 2.9363 (2.8339) grad_norm 8.4972 (4.2400) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][330/1251] eta 0:03:46 lr 0.000135 wd 0.0500 time 0.2457 (0.2459) data time 0.0008 (0.0029) model time 0.2448 (0.2422) loss 1.9070 (2.8391) grad_norm 3.4531 (4.2281) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][340/1251] eta 0:03:43 lr 0.000135 wd 0.0500 time 0.2420 (0.2457) data time 0.0007 (0.0028) model time 0.2413 (0.2421) loss 3.2299 (2.8395) grad_norm 3.8371 (4.2927) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][350/1251] eta 0:03:41 lr 0.000135 wd 0.0500 time 0.2312 (0.2456) data time 0.0011 (0.0028) model time 0.2301 (0.2420) loss 2.1079 (2.8322) grad_norm 5.1737 (4.2927) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][360/1251] eta 0:03:38 lr 0.000135 wd 0.0500 time 0.2433 (0.2454) data time 0.0007 (0.0027) model time 0.2425 (0.2419) loss 3.6041 (2.8404) grad_norm 4.9605 (4.3037) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][370/1251] eta 0:03:36 lr 0.000135 wd 0.0500 time 0.2482 (0.2453) data time 0.0008 (0.0027) model time 0.2475 (0.2418) loss 2.5755 (2.8411) grad_norm 3.3266 (4.3022) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][380/1251] eta 0:03:33 lr 0.000135 wd 0.0500 time 0.2335 (0.2451) data time 0.0009 (0.0026) model time 0.2326 (0.2417) loss 3.3857 (2.8467) 
grad_norm 3.5222 (4.3006) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][390/1251] eta 0:03:30 lr 0.000135 wd 0.0500 time 0.2435 (0.2451) data time 0.0009 (0.0026) model time 0.2427 (0.2417) loss 3.2786 (2.8479) grad_norm 3.9674 (4.3213) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][400/1251] eta 0:03:28 lr 0.000135 wd 0.0500 time 0.2366 (0.2449) data time 0.0009 (0.0025) model time 0.2357 (0.2416) loss 3.2508 (2.8519) grad_norm 2.6932 (4.3339) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][410/1251] eta 0:03:25 lr 0.000135 wd 0.0500 time 0.2354 (0.2448) data time 0.0008 (0.0025) model time 0.2346 (0.2415) loss 2.4511 (2.8546) grad_norm 3.8171 (4.3415) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][420/1251] eta 0:03:23 lr 0.000135 wd 0.0500 time 0.2456 (0.2447) data time 0.0009 (0.0025) model time 0.2447 (0.2415) loss 1.7892 (2.8536) grad_norm 5.6267 (4.3354) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][430/1251] eta 0:03:20 lr 0.000135 wd 0.0500 time 0.2458 (0.2447) data time 0.0010 (0.0024) model time 0.2449 (0.2415) loss 3.2609 (2.8577) grad_norm 3.3908 (4.3408) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][440/1251] eta 0:03:18 lr 0.000135 wd 0.0500 time 0.2387 (0.2446) data time 0.0007 (0.0024) model time 0.2380 (0.2415) loss 3.5838 (2.8640) grad_norm 3.3122 (4.3231) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][450/1251] eta 0:03:15 lr 0.000135 wd 0.0500 time 
0.2366 (0.2445) data time 0.0008 (0.0024) model time 0.2357 (0.2415) loss 2.8293 (2.8692) grad_norm 4.0034 (4.3527) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][460/1251] eta 0:03:13 lr 0.000135 wd 0.0500 time 0.2428 (0.2445) data time 0.0010 (0.0023) model time 0.2418 (0.2415) loss 2.8316 (2.8668) grad_norm 2.9909 (4.3483) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][470/1251] eta 0:03:10 lr 0.000135 wd 0.0500 time 0.2374 (0.2444) data time 0.0007 (0.0023) model time 0.2367 (0.2415) loss 2.7477 (2.8678) grad_norm 3.8594 (4.3550) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][480/1251] eta 0:03:08 lr 0.000134 wd 0.0500 time 0.2411 (0.2444) data time 0.0010 (0.0023) model time 0.2401 (0.2414) loss 3.1912 (2.8672) grad_norm 3.2489 (4.3472) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][490/1251] eta 0:03:05 lr 0.000134 wd 0.0500 time 0.2380 (0.2442) data time 0.0009 (0.0023) model time 0.2371 (0.2413) loss 2.6144 (2.8629) grad_norm 4.3860 (4.3436) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][500/1251] eta 0:03:03 lr 0.000134 wd 0.0500 time 0.2421 (0.2441) data time 0.0010 (0.0022) model time 0.2411 (0.2412) loss 3.0656 (2.8616) grad_norm 3.3200 (4.3576) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][510/1251] eta 0:03:00 lr 0.000134 wd 0.0500 time 0.2343 (0.2440) data time 0.0009 (0.0022) model time 0.2334 (0.2412) loss 2.3835 (2.8577) grad_norm 2.8264 (4.3411) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:49 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [235/300][520/1251] eta 0:02:58 lr 0.000134 wd 0.0500 time 0.2369 (0.2440) data time 0.0014 (0.0022) model time 0.2355 (0.2411) loss 2.8133 (2.8586) grad_norm 3.6158 (4.3385) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][530/1251] eta 0:02:55 lr 0.000134 wd 0.0500 time 0.2409 (0.2439) data time 0.0009 (0.0022) model time 0.2400 (0.2411) loss 2.9152 (2.8589) grad_norm 5.7877 (4.3291) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][540/1251] eta 0:02:53 lr 0.000134 wd 0.0500 time 0.2393 (0.2439) data time 0.0009 (0.0021) model time 0.2383 (0.2411) loss 2.8564 (2.8592) grad_norm 3.7635 (4.3212) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][550/1251] eta 0:02:50 lr 0.000134 wd 0.0500 time 0.2290 (0.2438) data time 0.0009 (0.0021) model time 0.2281 (0.2411) loss 3.1988 (2.8603) grad_norm 3.7757 (4.3090) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][560/1251] eta 0:02:48 lr 0.000134 wd 0.0500 time 0.2395 (0.2437) data time 0.0010 (0.0021) model time 0.2385 (0.2410) loss 3.2150 (2.8624) grad_norm 4.2343 (4.3254) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][570/1251] eta 0:02:45 lr 0.000134 wd 0.0500 time 0.2352 (0.2436) data time 0.0008 (0.0021) model time 0.2344 (0.2409) loss 2.3487 (2.8563) grad_norm 2.9492 (4.3415) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][580/1251] eta 0:02:43 lr 0.000134 wd 0.0500 time 0.2449 (0.2436) data time 0.0012 (0.0021) model time 0.2437 (0.2409) loss 3.0979 (2.8617) grad_norm 3.3840 
(4.3444) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][590/1251] eta 0:02:40 lr 0.000134 wd 0.0500 time 0.2386 (0.2435) data time 0.0008 (0.0020) model time 0.2378 (0.2409) loss 1.9498 (2.8593) grad_norm 4.8710 (4.3358) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][600/1251] eta 0:02:38 lr 0.000134 wd 0.0500 time 0.2339 (0.2435) data time 0.0008 (0.0020) model time 0.2331 (0.2409) loss 2.8974 (2.8580) grad_norm 4.8521 (4.3388) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][610/1251] eta 0:02:36 lr 0.000134 wd 0.0500 time 0.2399 (0.2434) data time 0.0009 (0.0020) model time 0.2390 (0.2409) loss 2.4671 (2.8521) grad_norm 3.8551 (4.3487) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][620/1251] eta 0:02:33 lr 0.000134 wd 0.0500 time 0.2476 (0.2434) data time 0.0009 (0.0020) model time 0.2466 (0.2408) loss 2.1981 (2.8492) grad_norm 3.6240 (4.3549) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][630/1251] eta 0:02:31 lr 0.000134 wd 0.0500 time 0.2422 (0.2433) data time 0.0009 (0.0020) model time 0.2413 (0.2408) loss 2.2456 (2.8501) grad_norm 3.0138 (4.3523) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][640/1251] eta 0:02:28 lr 0.000134 wd 0.0500 time 0.2445 (0.2433) data time 0.0010 (0.0020) model time 0.2435 (0.2408) loss 2.7960 (2.8483) grad_norm 5.8088 (4.3571) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][650/1251] eta 0:02:26 lr 0.000134 wd 0.0500 time 0.2302 (0.2432) data 
time 0.0009 (0.0020) model time 0.2293 (0.2407) loss 3.2230 (2.8516) grad_norm 4.4243 (4.3498) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][660/1251] eta 0:02:23 lr 0.000134 wd 0.0500 time 0.2360 (0.2432) data time 0.0008 (0.0019) model time 0.2351 (0.2407) loss 3.0275 (2.8511) grad_norm 2.5517 (4.3471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][670/1251] eta 0:02:21 lr 0.000134 wd 0.0500 time 0.2363 (0.2431) data time 0.0009 (0.0019) model time 0.2353 (0.2407) loss 1.8247 (2.8482) grad_norm 3.4930 (4.3499) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][680/1251] eta 0:02:18 lr 0.000134 wd 0.0500 time 0.2421 (0.2431) data time 0.0007 (0.0019) model time 0.2414 (0.2407) loss 3.2087 (2.8548) grad_norm 5.5330 (4.3712) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][690/1251] eta 0:02:16 lr 0.000134 wd 0.0500 time 0.2384 (0.2430) data time 0.0008 (0.0019) model time 0.2375 (0.2406) loss 3.4014 (2.8570) grad_norm 5.2625 (4.3667) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][700/1251] eta 0:02:13 lr 0.000134 wd 0.0500 time 0.2462 (0.2430) data time 0.0007 (0.0019) model time 0.2456 (0.2406) loss 3.3727 (2.8575) grad_norm 3.5365 (4.3631) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][710/1251] eta 0:02:11 lr 0.000134 wd 0.0500 time 0.2400 (0.2430) data time 0.0007 (0.0019) model time 0.2393 (0.2406) loss 2.2677 (2.8558) grad_norm 8.8306 (4.3616) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [235/300][720/1251] eta 0:02:08 lr 0.000134 wd 0.0500 time 0.2369 (0.2429) data time 0.0009 (0.0019) model time 0.2360 (0.2406) loss 3.3513 (2.8591) grad_norm 5.0636 (4.3586) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][730/1251] eta 0:02:06 lr 0.000134 wd 0.0500 time 0.2350 (0.2429) data time 0.0011 (0.0019) model time 0.2339 (0.2405) loss 3.1997 (2.8620) grad_norm 2.7161 (4.3494) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][740/1251] eta 0:02:04 lr 0.000134 wd 0.0500 time 0.2405 (0.2428) data time 0.0009 (0.0018) model time 0.2396 (0.2405) loss 3.1146 (2.8614) grad_norm 4.1090 (4.3391) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][750/1251] eta 0:02:01 lr 0.000134 wd 0.0500 time 0.2424 (0.2430) data time 0.0008 (0.0018) model time 0.2417 (0.2408) loss 2.3463 (2.8608) grad_norm 7.9843 (4.3412) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][760/1251] eta 0:01:59 lr 0.000134 wd 0.0500 time 0.2411 (0.2430) data time 0.0010 (0.0018) model time 0.2400 (0.2407) loss 3.1905 (2.8640) grad_norm 5.4246 (4.3558) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][770/1251] eta 0:01:56 lr 0.000134 wd 0.0500 time 0.2431 (0.2430) data time 0.0008 (0.0018) model time 0.2423 (0.2407) loss 1.7982 (2.8610) grad_norm 7.0821 (4.3628) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][780/1251] eta 0:01:54 lr 0.000134 wd 0.0500 time 0.2488 (0.2430) data time 0.0007 (0.0018) model time 0.2481 (0.2407) loss 2.9322 (2.8598) grad_norm 3.5949 (4.3615) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 06:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][790/1251] eta 0:01:51 lr 0.000134 wd 0.0500 time 0.2320 (0.2429) data time 0.0008 (0.0018) model time 0.2312 (0.2407) loss 3.4832 (2.8600) grad_norm 3.0718 (4.3483) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][800/1251] eta 0:01:49 lr 0.000134 wd 0.0500 time 0.2354 (0.2429) data time 0.0007 (0.0018) model time 0.2347 (0.2406) loss 2.8850 (2.8608) grad_norm 3.8154 (4.3471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][810/1251] eta 0:01:47 lr 0.000134 wd 0.0500 time 0.2335 (0.2428) data time 0.0011 (0.0018) model time 0.2324 (0.2406) loss 2.8719 (2.8615) grad_norm 7.0262 (4.3518) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][820/1251] eta 0:01:44 lr 0.000133 wd 0.0500 time 0.2430 (0.2428) data time 0.0010 (0.0018) model time 0.2421 (0.2406) loss 3.4744 (2.8592) grad_norm 3.1186 (4.3629) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][830/1251] eta 0:01:42 lr 0.000133 wd 0.0500 time 0.2400 (0.2428) data time 0.0010 (0.0017) model time 0.2389 (0.2406) loss 2.7807 (2.8574) grad_norm 5.1894 (4.3600) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][840/1251] eta 0:01:39 lr 0.000133 wd 0.0500 time 0.2346 (0.2428) data time 0.0007 (0.0017) model time 0.2338 (0.2406) loss 2.2939 (2.8547) grad_norm 3.3376 (4.3474) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][850/1251] eta 0:01:37 lr 0.000133 wd 0.0500 time 0.2391 (0.2427) data time 0.0010 (0.0017) model 
time 0.2381 (0.2406) loss 2.9845 (2.8551) grad_norm 3.1277 (4.3362) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][860/1251] eta 0:01:34 lr 0.000133 wd 0.0500 time 0.2350 (0.2427) data time 0.0009 (0.0017) model time 0.2341 (0.2406) loss 2.4611 (2.8551) grad_norm 3.7060 (4.3540) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][870/1251] eta 0:01:32 lr 0.000133 wd 0.0500 time 0.2391 (0.2427) data time 0.0007 (0.0017) model time 0.2383 (0.2406) loss 2.7473 (2.8551) grad_norm 4.3759 (4.3529) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][880/1251] eta 0:01:30 lr 0.000133 wd 0.0500 time 0.2473 (0.2427) data time 0.0009 (0.0017) model time 0.2463 (0.2406) loss 3.1159 (2.8566) grad_norm 3.3560 (4.3444) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][890/1251] eta 0:01:27 lr 0.000133 wd 0.0500 time 0.2411 (0.2427) data time 0.0009 (0.0017) model time 0.2402 (0.2406) loss 1.6153 (2.8569) grad_norm 5.2595 (4.3446) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][900/1251] eta 0:01:25 lr 0.000133 wd 0.0500 time 0.2531 (0.2427) data time 0.0009 (0.0017) model time 0.2521 (0.2406) loss 3.3764 (2.8583) grad_norm 3.4803 (4.3388) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][910/1251] eta 0:01:22 lr 0.000133 wd 0.0500 time 0.2363 (0.2426) data time 0.0009 (0.0017) model time 0.2353 (0.2406) loss 2.0472 (2.8602) grad_norm 4.4443 (4.3531) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][920/1251] 
eta 0:01:20 lr 0.000133 wd 0.0500 time 0.2502 (0.2426) data time 0.0010 (0.0017) model time 0.2492 (0.2406) loss 2.9116 (2.8596) grad_norm 3.8931 (4.3500) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][930/1251] eta 0:01:17 lr 0.000133 wd 0.0500 time 0.2383 (0.2426) data time 0.0009 (0.0017) model time 0.2373 (0.2406) loss 2.1957 (2.8605) grad_norm 4.9968 (4.3529) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][940/1251] eta 0:01:15 lr 0.000133 wd 0.0500 time 0.2374 (0.2426) data time 0.0011 (0.0017) model time 0.2363 (0.2405) loss 3.2296 (2.8614) grad_norm 5.2688 (4.3520) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][950/1251] eta 0:01:13 lr 0.000133 wd 0.0500 time 0.2379 (0.2425) data time 0.0008 (0.0017) model time 0.2371 (0.2405) loss 3.0857 (2.8613) grad_norm 4.8927 (4.3471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][960/1251] eta 0:01:10 lr 0.000133 wd 0.0500 time 0.2395 (0.2425) data time 0.0011 (0.0016) model time 0.2385 (0.2405) loss 2.3542 (2.8635) grad_norm 3.5862 (4.3380) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][970/1251] eta 0:01:08 lr 0.000133 wd 0.0500 time 0.2381 (0.2425) data time 0.0010 (0.0016) model time 0.2372 (0.2405) loss 3.3798 (2.8635) grad_norm 3.5444 (4.3445) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][980/1251] eta 0:01:05 lr 0.000133 wd 0.0500 time 0.2324 (0.2425) data time 0.0009 (0.0016) model time 0.2315 (0.2405) loss 3.6745 (2.8657) grad_norm 4.0238 (4.3397) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
06:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][990/1251] eta 0:01:03 lr 0.000133 wd 0.0500 time 0.2399 (0.2424) data time 0.0008 (0.0016) model time 0.2392 (0.2405) loss 2.3399 (2.8667) grad_norm 20.7743 (4.3560) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1000/1251] eta 0:01:00 lr 0.000133 wd 0.0500 time 0.2457 (0.2424) data time 0.0009 (0.0016) model time 0.2448 (0.2405) loss 3.1963 (2.8677) grad_norm 2.9015 (4.3620) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1010/1251] eta 0:00:58 lr 0.000133 wd 0.0500 time 0.2435 (0.2424) data time 0.0009 (0.0016) model time 0.2426 (0.2405) loss 1.9097 (2.8669) grad_norm 3.8945 (4.3573) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1020/1251] eta 0:00:55 lr 0.000133 wd 0.0500 time 0.2432 (0.2424) data time 0.0010 (0.0016) model time 0.2422 (0.2404) loss 2.8498 (2.8664) grad_norm 6.1559 (4.3641) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1030/1251] eta 0:00:53 lr 0.000133 wd 0.0500 time 0.2379 (0.2428) data time 0.0008 (0.0016) model time 0.2372 (0.2409) loss 2.9251 (2.8693) grad_norm 3.7844 (4.3644) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1040/1251] eta 0:00:51 lr 0.000133 wd 0.0500 time 0.2428 (0.2428) data time 0.0008 (0.0016) model time 0.2420 (0.2409) loss 2.8907 (2.8690) grad_norm 3.0718 (4.3584) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1050/1251] eta 0:00:48 lr 0.000133 wd 0.0500 time 0.2309 (0.2428) data time 0.0011 (0.0016) model time 0.2298 (0.2409) loss 
3.0921 (2.8697) grad_norm 3.1483 (4.3482) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1060/1251] eta 0:00:46 lr 0.000133 wd 0.0500 time 0.2415 (0.2428) data time 0.0009 (0.0016) model time 0.2405 (0.2409) loss 2.8258 (2.8675) grad_norm 3.6914 (4.3409) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1070/1251] eta 0:00:43 lr 0.000133 wd 0.0500 time 0.2522 (0.2428) data time 0.0009 (0.0016) model time 0.2513 (0.2409) loss 3.4710 (2.8667) grad_norm 3.0380 (4.3390) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1080/1251] eta 0:00:41 lr 0.000133 wd 0.0500 time 0.2379 (0.2428) data time 0.0010 (0.0016) model time 0.2369 (0.2409) loss 3.1262 (2.8656) grad_norm 5.3408 (4.3386) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1090/1251] eta 0:00:39 lr 0.000133 wd 0.0500 time 0.2488 (0.2428) data time 0.0010 (0.0016) model time 0.2478 (0.2409) loss 2.9823 (2.8662) grad_norm 4.5583 (4.3361) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1100/1251] eta 0:00:36 lr 0.000133 wd 0.0500 time 0.2464 (0.2428) data time 0.0010 (0.0016) model time 0.2454 (0.2409) loss 1.9035 (2.8644) grad_norm 3.5501 (4.3290) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1110/1251] eta 0:00:34 lr 0.000133 wd 0.0500 time 0.2453 (0.2428) data time 0.0008 (0.0016) model time 0.2445 (0.2409) loss 2.4635 (2.8641) grad_norm 3.2618 (4.3242) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1120/1251] eta 0:00:31 lr 
0.000133 wd 0.0500 time 0.2430 (0.2428) data time 0.0009 (0.0016) model time 0.2421 (0.2409) loss 2.8574 (2.8653) grad_norm 4.6339 (4.3251) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1130/1251] eta 0:00:29 lr 0.000133 wd 0.0500 time 0.2425 (0.2427) data time 0.0010 (0.0015) model time 0.2414 (0.2409) loss 3.1685 (2.8632) grad_norm 2.8511 (4.3189) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1140/1251] eta 0:00:26 lr 0.000133 wd 0.0500 time 0.2455 (0.2427) data time 0.0009 (0.0015) model time 0.2446 (0.2408) loss 3.3122 (2.8648) grad_norm 15.4848 (4.3256) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1150/1251] eta 0:00:24 lr 0.000133 wd 0.0500 time 0.2319 (0.2426) data time 0.0010 (0.0015) model time 0.2309 (0.2408) loss 3.1537 (2.8654) grad_norm 5.8968 (4.3233) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1160/1251] eta 0:00:22 lr 0.000132 wd 0.0500 time 0.2387 (0.2426) data time 0.0009 (0.0015) model time 0.2378 (0.2408) loss 3.3848 (2.8667) grad_norm 4.0193 (4.3193) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1170/1251] eta 0:00:19 lr 0.000132 wd 0.0500 time 0.2385 (0.2426) data time 0.0010 (0.0015) model time 0.2375 (0.2408) loss 3.3739 (2.8654) grad_norm 3.9548 (4.3208) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1180/1251] eta 0:00:17 lr 0.000132 wd 0.0500 time 0.2454 (0.2426) data time 0.0010 (0.0015) model time 0.2444 (0.2408) loss 3.1422 (2.8650) grad_norm 3.8504 (4.3165) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1190/1251] eta 0:00:14 lr 0.000132 wd 0.0500 time 0.2345 (0.2426) data time 0.0008 (0.0015) model time 0.2337 (0.2408) loss 3.0569 (2.8645) grad_norm 4.8663 (4.3168) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1200/1251] eta 0:00:12 lr 0.000132 wd 0.0500 time 0.2380 (0.2426) data time 0.0008 (0.0015) model time 0.2372 (0.2408) loss 3.5659 (2.8656) grad_norm 4.9046 (4.3123) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1210/1251] eta 0:00:09 lr 0.000132 wd 0.0500 time 0.2405 (0.2426) data time 0.0012 (0.0015) model time 0.2393 (0.2408) loss 2.9092 (2.8662) grad_norm 4.3548 (4.3120) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1220/1251] eta 0:00:07 lr 0.000132 wd 0.0500 time 0.2400 (0.2426) data time 0.0007 (0.0015) model time 0.2393 (0.2408) loss 2.8761 (2.8658) grad_norm 4.8169 (4.3100) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1230/1251] eta 0:00:05 lr 0.000132 wd 0.0500 time 0.2482 (0.2425) data time 0.0010 (0.0015) model time 0.2472 (0.2408) loss 2.8134 (2.8641) grad_norm 4.1935 (4.3051) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1240/1251] eta 0:00:02 lr 0.000132 wd 0.0500 time 0.2240 (0.2425) data time 0.0005 (0.0015) model time 0.2236 (0.2407) loss 2.3012 (2.8644) grad_norm 4.5444 (4.3056) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [235/300][1250/1251] eta 0:00:00 lr 0.000132 wd 0.0500 time 0.2297 (0.2423) data time 0.0005 (0.0015) model time 0.2293 (0.2405) loss 2.8519 
(2.8635) grad_norm 3.2774 (4.3027) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 235 training takes 0:05:03 [2024-09-01 06:17:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:17:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 06:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.3918 (0.3918) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 06:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.110) Loss 0.6338 (0.6286) Acc@1 87.305 (86.763) Acc@5 97.070 (97.488) Mem 7381MB [2024-09-01 06:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.095) Loss 0.9390 (0.6561) Acc@1 77.051 (85.779) Acc@5 95.410 (97.405) Mem 7381MB [2024-09-01 06:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.089) Loss 1.1699 (0.7499) Acc@1 72.559 (83.515) Acc@5 92.383 (96.421) Mem 7381MB [2024-09-01 06:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0186 (0.7963) Acc@1 76.660 (82.362) Acc@5 94.824 (95.948) Mem 7381MB [2024-09-01 06:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.000 Acc@5 95.946 [2024-09-01 06:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-09-01 06:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.00% [2024-09-01 06:17:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-09-01 06:17:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 06:17:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.449 (0.449) Loss 0.3794 (0.3794) Acc@1 93.262 (93.262) Acc@5 98.340 (98.340) Mem 7381MB [2024-09-01 06:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.113) Loss 0.5708 (0.6008) Acc@1 89.746 (87.553) Acc@5 97.852 (97.736) Mem 7381MB [2024-09-01 06:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.098) Loss 0.8770 (0.6293) Acc@1 77.930 (86.416) Acc@5 95.898 (97.670) Mem 7381MB [2024-09-01 06:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.092) Loss 1.0859 (0.7127) Acc@1 74.023 (84.334) Acc@5 92.969 (96.793) Mem 7381MB [2024-09-01 06:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 0.9868 (0.7577) Acc@1 76.562 (83.186) Acc@5 94.238 (96.330) Mem 7381MB [2024-09-01 06:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.760 Acc@5 96.312 [2024-09-01 06:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 06:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.76% [2024-09-01 06:17:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 06:17:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 06:17:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][0/1251] eta 0:13:02 lr 0.000132 wd 0.0500 time 0.6255 (0.6255) data time 0.4043 (0.4043) model time 0.0000 (0.0000) loss 3.5257 (3.5257) grad_norm 3.2561 (3.2561) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][10/1251] eta 0:05:41 lr 0.000132 wd 0.0500 time 0.2421 (0.2751) data time 0.0009 (0.0376) model time 0.0000 (0.0000) loss 3.4213 (2.8381) grad_norm 4.1323 (4.0250) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][20/1251] eta 0:05:18 lr 0.000132 wd 0.0500 time 0.2419 (0.2585) data time 0.0007 (0.0202) model time 0.0000 (0.0000) loss 3.1160 (2.8493) grad_norm 3.3430 (4.0074) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][30/1251] eta 0:05:08 lr 0.000132 wd 0.0500 time 0.2444 (0.2524) data time 0.0008 (0.0140) model time 0.0000 (0.0000) loss 3.1403 (2.8101) grad_norm 3.1980 (4.0097) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][40/1251] eta 0:05:02 lr 0.000132 wd 0.0500 time 0.2406 (0.2499) data time 0.0007 (0.0109) model time 0.0000 (0.0000) loss 1.8373 (2.7385) grad_norm 3.6585 (4.0256) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][50/1251] eta 0:04:57 lr 0.000132 wd 0.0500 time 0.2364 (0.2476) data time 0.0006 (0.0089) model time 0.0000 (0.0000) loss 3.3124 (2.7742) grad_norm 3.4025 (3.9422) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][60/1251] eta 0:04:53 lr 0.000132 wd 0.0500 time 0.2338 (0.2465) data time 0.0009 (0.0076) model time 0.2329 (0.2395) loss 
3.1436 (2.7733) grad_norm 3.0357 (4.0032) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][70/1251] eta 0:04:49 lr 0.000132 wd 0.0500 time 0.2417 (0.2455) data time 0.0007 (0.0067) model time 0.2410 (0.2390) loss 3.2862 (2.7626) grad_norm 4.2775 (3.9499) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][80/1251] eta 0:04:46 lr 0.000132 wd 0.0500 time 0.2426 (0.2448) data time 0.0009 (0.0060) model time 0.2417 (0.2392) loss 3.4342 (2.7875) grad_norm 4.7859 (3.9370) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][90/1251] eta 0:04:43 lr 0.000132 wd 0.0500 time 0.2341 (0.2444) data time 0.0008 (0.0054) model time 0.2333 (0.2395) loss 3.5897 (2.8010) grad_norm 3.2833 (3.8817) loss_scale 512.0000 (264.4396) mem 7381MB [2024-09-01 06:18:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][100/1251] eta 0:04:40 lr 0.000132 wd 0.0500 time 0.2359 (0.2440) data time 0.0012 (0.0050) model time 0.2347 (0.2393) loss 2.6401 (2.7883) grad_norm 3.1342 (3.8657) loss_scale 512.0000 (288.9505) mem 7381MB [2024-09-01 06:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][110/1251] eta 0:04:37 lr 0.000132 wd 0.0500 time 0.2433 (0.2435) data time 0.0009 (0.0046) model time 0.2424 (0.2390) loss 2.9345 (2.7936) grad_norm 5.8369 (3.8716) loss_scale 512.0000 (309.0450) mem 7381MB [2024-09-01 06:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][120/1251] eta 0:04:34 lr 0.000132 wd 0.0500 time 0.2461 (0.2431) data time 0.0009 (0.0043) model time 0.2452 (0.2389) loss 2.8440 (2.8219) grad_norm 5.2443 (3.9885) loss_scale 512.0000 (325.8182) mem 7381MB [2024-09-01 06:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][130/1251] eta 0:04:32 lr 0.000132 wd 
0.0500 time 0.2426 (0.2431) data time 0.0010 (0.0041) model time 0.2416 (0.2392) loss 2.9987 (2.8272) grad_norm 4.1173 (4.0079) loss_scale 512.0000 (340.0305) mem 7381MB [2024-09-01 06:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][140/1251] eta 0:04:29 lr 0.000132 wd 0.0500 time 0.2420 (0.2429) data time 0.0011 (0.0039) model time 0.2408 (0.2392) loss 2.2793 (2.8158) grad_norm 4.1321 (4.0431) loss_scale 512.0000 (352.2270) mem 7381MB [2024-09-01 06:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][150/1251] eta 0:04:27 lr 0.000132 wd 0.0500 time 0.2485 (0.2428) data time 0.0007 (0.0037) model time 0.2477 (0.2393) loss 3.2596 (2.8343) grad_norm 3.6912 (4.0484) loss_scale 512.0000 (362.8079) mem 7381MB [2024-09-01 06:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][160/1251] eta 0:04:24 lr 0.000132 wd 0.0500 time 0.2444 (0.2426) data time 0.0008 (0.0035) model time 0.2435 (0.2393) loss 1.8621 (2.8245) grad_norm 3.9810 (4.0941) loss_scale 512.0000 (372.0745) mem 7381MB [2024-09-01 06:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][170/1251] eta 0:04:22 lr 0.000132 wd 0.0500 time 0.2380 (0.2424) data time 0.0011 (0.0033) model time 0.2369 (0.2392) loss 2.5937 (2.8310) grad_norm 20.5354 (4.3184) loss_scale 512.0000 (380.2573) mem 7381MB [2024-09-01 06:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][180/1251] eta 0:04:19 lr 0.000132 wd 0.0500 time 0.2437 (0.2424) data time 0.0011 (0.0032) model time 0.2426 (0.2393) loss 2.8781 (2.8378) grad_norm 3.5786 (4.3017) loss_scale 512.0000 (387.5359) mem 7381MB [2024-09-01 06:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][190/1251] eta 0:04:17 lr 0.000132 wd 0.0500 time 0.2421 (0.2422) data time 0.0010 (0.0031) model time 0.2411 (0.2393) loss 2.8902 (2.8423) grad_norm 3.5909 (4.2978) loss_scale 512.0000 (394.0524) mem 7381MB [2024-09-01 06:18:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][200/1251] eta 0:04:14 lr 0.000132 wd 0.0500 time 0.2434 (0.2423) data time 0.0010 (0.0030) model time 0.2425 (0.2395) loss 2.2073 (2.8421) grad_norm 3.4043 (4.3862) loss_scale 512.0000 (399.9204) mem 7381MB [2024-09-01 06:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][210/1251] eta 0:04:12 lr 0.000132 wd 0.0500 time 0.2437 (0.2423) data time 0.0010 (0.0029) model time 0.2427 (0.2396) loss 3.2644 (2.8368) grad_norm 5.9055 (4.3504) loss_scale 512.0000 (405.2322) mem 7381MB [2024-09-01 06:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][220/1251] eta 0:04:09 lr 0.000132 wd 0.0500 time 0.2462 (0.2422) data time 0.0011 (0.0028) model time 0.2452 (0.2396) loss 2.8812 (2.8367) grad_norm 6.8728 (4.3782) loss_scale 512.0000 (410.0633) mem 7381MB [2024-09-01 06:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][230/1251] eta 0:04:07 lr 0.000132 wd 0.0500 time 0.2368 (0.2422) data time 0.0009 (0.0027) model time 0.2359 (0.2397) loss 3.2080 (2.8314) grad_norm 3.2369 (4.3552) loss_scale 512.0000 (414.4762) mem 7381MB [2024-09-01 06:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][240/1251] eta 0:04:04 lr 0.000132 wd 0.0500 time 0.2398 (0.2422) data time 0.0009 (0.0027) model time 0.2389 (0.2397) loss 3.5456 (2.8311) grad_norm 3.8102 (4.3330) loss_scale 512.0000 (418.5228) mem 7381MB [2024-09-01 06:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][250/1251] eta 0:04:02 lr 0.000132 wd 0.0500 time 0.2267 (0.2420) data time 0.0009 (0.0026) model time 0.2258 (0.2397) loss 3.3730 (2.8330) grad_norm 3.8789 (4.3156) loss_scale 512.0000 (422.2470) mem 7381MB [2024-09-01 06:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][260/1251] eta 0:03:59 lr 0.000131 wd 0.0500 time 0.2474 (0.2421) data time 0.0008 (0.0025) model time 0.2467 (0.2398) loss 3.4678 (2.8395) 
grad_norm 3.9688 (4.3218) loss_scale 512.0000 (425.6858) mem 7381MB [2024-09-01 06:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][270/1251] eta 0:03:57 lr 0.000131 wd 0.0500 time 0.2411 (0.2420) data time 0.0011 (0.0025) model time 0.2400 (0.2398) loss 2.1127 (2.8367) grad_norm 4.0775 (4.3009) loss_scale 512.0000 (428.8708) mem 7381MB [2024-09-01 06:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][280/1251] eta 0:03:55 lr 0.000131 wd 0.0500 time 0.2427 (0.2421) data time 0.0010 (0.0024) model time 0.2418 (0.2399) loss 2.9875 (2.8362) grad_norm 4.2917 (4.3152) loss_scale 512.0000 (431.8292) mem 7381MB [2024-09-01 06:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][290/1251] eta 0:03:52 lr 0.000131 wd 0.0500 time 0.2398 (0.2421) data time 0.0010 (0.0024) model time 0.2388 (0.2399) loss 3.1671 (2.8423) grad_norm 3.0492 (4.3151) loss_scale 512.0000 (434.5842) mem 7381MB [2024-09-01 06:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][300/1251] eta 0:03:50 lr 0.000131 wd 0.0500 time 0.2415 (0.2420) data time 0.0010 (0.0023) model time 0.2405 (0.2399) loss 3.2498 (2.8522) grad_norm 6.0190 (4.3265) loss_scale 512.0000 (437.1561) mem 7381MB [2024-09-01 06:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][310/1251] eta 0:03:47 lr 0.000131 wd 0.0500 time 0.2414 (0.2420) data time 0.0009 (0.0023) model time 0.2405 (0.2399) loss 3.3137 (2.8464) grad_norm 4.6766 (4.3269) loss_scale 512.0000 (439.5627) mem 7381MB [2024-09-01 06:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][320/1251] eta 0:03:45 lr 0.000131 wd 0.0500 time 0.2396 (0.2420) data time 0.0008 (0.0022) model time 0.2388 (0.2399) loss 3.2033 (2.8558) grad_norm 3.7917 (4.3122) loss_scale 512.0000 (441.8193) mem 7381MB [2024-09-01 06:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][330/1251] eta 0:03:42 lr 0.000131 wd 0.0500 time 
0.2448 (0.2419) data time 0.0009 (0.0022) model time 0.2438 (0.2399) loss 2.4640 (2.8498) grad_norm 3.2763 (4.2945) loss_scale 512.0000 (443.9396) mem 7381MB [2024-09-01 06:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][340/1251] eta 0:03:40 lr 0.000131 wd 0.0500 time 0.2415 (0.2419) data time 0.0007 (0.0022) model time 0.2408 (0.2400) loss 2.7152 (2.8541) grad_norm 3.0959 (4.3129) loss_scale 512.0000 (445.9355) mem 7381MB [2024-09-01 06:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][350/1251] eta 0:03:37 lr 0.000131 wd 0.0500 time 0.2395 (0.2419) data time 0.0010 (0.0021) model time 0.2384 (0.2400) loss 2.7930 (2.8508) grad_norm 3.0340 (4.2911) loss_scale 512.0000 (447.8177) mem 7381MB [2024-09-01 06:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][360/1251] eta 0:03:35 lr 0.000131 wd 0.0500 time 0.2439 (0.2419) data time 0.0008 (0.0021) model time 0.2431 (0.2400) loss 3.0180 (2.8594) grad_norm 5.5565 (4.3184) loss_scale 512.0000 (449.5956) mem 7381MB [2024-09-01 06:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][370/1251] eta 0:03:33 lr 0.000131 wd 0.0500 time 0.2356 (0.2419) data time 0.0012 (0.0021) model time 0.2344 (0.2400) loss 2.0282 (2.8646) grad_norm 4.7607 (4.3365) loss_scale 512.0000 (451.2776) mem 7381MB [2024-09-01 06:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][380/1251] eta 0:03:30 lr 0.000131 wd 0.0500 time 0.2394 (0.2418) data time 0.0011 (0.0020) model time 0.2383 (0.2399) loss 2.0751 (2.8646) grad_norm 4.7984 (4.4255) loss_scale 512.0000 (452.8714) mem 7381MB [2024-09-01 06:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][390/1251] eta 0:03:28 lr 0.000131 wd 0.0500 time 0.2378 (0.2418) data time 0.0009 (0.0020) model time 0.2369 (0.2399) loss 2.6063 (2.8546) grad_norm 4.3979 (4.4161) loss_scale 512.0000 (454.3836) mem 7381MB [2024-09-01 06:19:32 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [236/300][400/1251] eta 0:03:25 lr 0.000131 wd 0.0500 time 0.2335 (0.2417) data time 0.0007 (0.0020) model time 0.2328 (0.2399) loss 2.6881 (2.8566) grad_norm 4.4728 (4.4216) loss_scale 512.0000 (455.8204) mem 7381MB [2024-09-01 06:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][410/1251] eta 0:03:23 lr 0.000131 wd 0.0500 time 0.2426 (0.2417) data time 0.0011 (0.0020) model time 0.2416 (0.2398) loss 3.0034 (2.8590) grad_norm 2.5546 (4.4013) loss_scale 512.0000 (457.1873) mem 7381MB [2024-09-01 06:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][420/1251] eta 0:03:20 lr 0.000131 wd 0.0500 time 0.2432 (0.2416) data time 0.0008 (0.0019) model time 0.2424 (0.2398) loss 3.0231 (2.8604) grad_norm 3.8832 (4.3968) loss_scale 512.0000 (458.4893) mem 7381MB [2024-09-01 06:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][430/1251] eta 0:03:18 lr 0.000131 wd 0.0500 time 0.2300 (0.2415) data time 0.0010 (0.0019) model time 0.2290 (0.2397) loss 2.8129 (2.8625) grad_norm 3.7654 (4.3760) loss_scale 512.0000 (459.7309) mem 7381MB [2024-09-01 06:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][440/1251] eta 0:03:15 lr 0.000131 wd 0.0500 time 0.2491 (0.2416) data time 0.0009 (0.0019) model time 0.2482 (0.2398) loss 3.2350 (2.8644) grad_norm 3.2784 (4.3641) loss_scale 512.0000 (460.9161) mem 7381MB [2024-09-01 06:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][450/1251] eta 0:03:13 lr 0.000131 wd 0.0500 time 0.2341 (0.2415) data time 0.0009 (0.0019) model time 0.2331 (0.2397) loss 2.7682 (2.8602) grad_norm 2.8187 (4.3435) loss_scale 512.0000 (462.0488) mem 7381MB [2024-09-01 06:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][460/1251] eta 0:03:11 lr 0.000131 wd 0.0500 time 0.2353 (0.2415) data time 0.0007 (0.0019) model time 0.2345 (0.2397) loss 3.2888 (2.8627) grad_norm 2.7658 
(4.3262) loss_scale 512.0000 (463.1323) mem 7381MB [2024-09-01 06:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][470/1251] eta 0:03:08 lr 0.000131 wd 0.0500 time 0.2321 (0.2415) data time 0.0010 (0.0018) model time 0.2311 (0.2397) loss 3.3461 (2.8661) grad_norm 3.2791 (4.3161) loss_scale 512.0000 (464.1699) mem 7381MB [2024-09-01 06:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][480/1251] eta 0:03:06 lr 0.000131 wd 0.0500 time 0.2457 (0.2418) data time 0.0012 (0.0018) model time 0.2446 (0.2401) loss 2.9918 (2.8695) grad_norm 2.4728 (4.3058) loss_scale 512.0000 (465.1642) mem 7381MB [2024-09-01 06:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][490/1251] eta 0:03:03 lr 0.000131 wd 0.0500 time 0.2348 (0.2418) data time 0.0012 (0.0018) model time 0.2336 (0.2401) loss 2.3291 (2.8678) grad_norm 3.4211 (4.3057) loss_scale 512.0000 (466.1181) mem 7381MB [2024-09-01 06:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][500/1251] eta 0:03:01 lr 0.000131 wd 0.0500 time 0.2524 (0.2418) data time 0.0010 (0.0018) model time 0.2514 (0.2401) loss 2.8760 (2.8671) grad_norm 4.3521 (4.3139) loss_scale 512.0000 (467.0339) mem 7381MB [2024-09-01 06:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][510/1251] eta 0:02:59 lr 0.000131 wd 0.0500 time 0.2500 (0.2418) data time 0.0007 (0.0018) model time 0.2494 (0.2402) loss 2.4462 (2.8705) grad_norm 3.9996 (4.3149) loss_scale 512.0000 (467.9139) mem 7381MB [2024-09-01 06:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][520/1251] eta 0:02:56 lr 0.000131 wd 0.0500 time 0.2371 (0.2419) data time 0.0009 (0.0018) model time 0.2362 (0.2402) loss 2.4929 (2.8687) grad_norm 3.2958 (4.3222) loss_scale 512.0000 (468.7601) mem 7381MB [2024-09-01 06:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][530/1251] eta 0:02:54 lr 0.000131 wd 0.0500 time 0.2438 (0.2419) data 
time 0.0008 (0.0018) model time 0.2430 (0.2403) loss 2.8460 (2.8680) grad_norm 3.5978 (4.3106) loss_scale 512.0000 (469.5744) mem 7381MB [2024-09-01 06:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][540/1251] eta 0:02:52 lr 0.000131 wd 0.0500 time 0.2279 (0.2422) data time 0.0009 (0.0017) model time 0.2269 (0.2407) loss 2.1588 (2.8677) grad_norm 2.9796 (4.3100) loss_scale 512.0000 (470.3586) mem 7381MB [2024-09-01 06:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][550/1251] eta 0:02:50 lr 0.000131 wd 0.0500 time 0.2458 (0.2430) data time 0.0008 (0.0017) model time 0.2450 (0.2416) loss 2.5061 (2.8664) grad_norm 3.4090 (4.3007) loss_scale 512.0000 (471.1143) mem 7381MB [2024-09-01 06:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][560/1251] eta 0:02:47 lr 0.000131 wd 0.0500 time 0.2425 (0.2430) data time 0.0008 (0.0017) model time 0.2417 (0.2416) loss 3.0616 (2.8717) grad_norm 4.9072 (4.2923) loss_scale 512.0000 (471.8431) mem 7381MB [2024-09-01 06:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][570/1251] eta 0:02:45 lr 0.000131 wd 0.0500 time 0.2380 (0.2430) data time 0.0007 (0.0017) model time 0.2373 (0.2415) loss 3.4013 (2.8734) grad_norm 5.1060 (4.2953) loss_scale 512.0000 (472.5464) mem 7381MB [2024-09-01 06:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][580/1251] eta 0:02:43 lr 0.000131 wd 0.0500 time 0.2523 (0.2430) data time 0.0007 (0.0017) model time 0.2516 (0.2416) loss 2.8079 (2.8736) grad_norm 2.6888 (4.2872) loss_scale 512.0000 (473.2255) mem 7381MB [2024-09-01 06:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][590/1251] eta 0:02:40 lr 0.000131 wd 0.0500 time 0.2360 (0.2430) data time 0.0009 (0.0017) model time 0.2350 (0.2415) loss 2.3195 (2.8716) grad_norm 3.8721 (4.2755) loss_scale 512.0000 (473.8816) mem 7381MB [2024-09-01 06:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [236/300][600/1251] eta 0:02:38 lr 0.000130 wd 0.0500 time 0.2507 (0.2430) data time 0.0010 (0.0017) model time 0.2496 (0.2415) loss 2.1271 (2.8670) grad_norm 3.1110 (4.2595) loss_scale 512.0000 (474.5158) mem 7381MB [2024-09-01 06:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][610/1251] eta 0:02:35 lr 0.000130 wd 0.0500 time 0.2414 (0.2430) data time 0.0009 (0.0017) model time 0.2405 (0.2415) loss 3.1425 (2.8655) grad_norm 4.9054 (4.2542) loss_scale 512.0000 (475.1293) mem 7381MB [2024-09-01 06:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][620/1251] eta 0:02:33 lr 0.000130 wd 0.0500 time 0.2424 (0.2430) data time 0.0007 (0.0016) model time 0.2417 (0.2416) loss 3.0932 (2.8644) grad_norm 3.5211 (4.2427) loss_scale 512.0000 (475.7230) mem 7381MB [2024-09-01 06:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][630/1251] eta 0:02:30 lr 0.000130 wd 0.0500 time 0.2392 (0.2430) data time 0.0010 (0.0016) model time 0.2382 (0.2416) loss 2.5814 (2.8617) grad_norm 3.4808 (4.2389) loss_scale 512.0000 (476.2979) mem 7381MB [2024-09-01 06:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][640/1251] eta 0:02:28 lr 0.000130 wd 0.0500 time 0.2450 (0.2430) data time 0.0008 (0.0016) model time 0.2443 (0.2416) loss 3.2122 (2.8599) grad_norm 5.2148 (4.2814) loss_scale 512.0000 (476.8549) mem 7381MB [2024-09-01 06:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][650/1251] eta 0:02:26 lr 0.000130 wd 0.0500 time 0.2402 (0.2430) data time 0.0008 (0.0016) model time 0.2394 (0.2416) loss 1.9833 (2.8588) grad_norm 3.5720 (4.2783) loss_scale 512.0000 (477.3948) mem 7381MB [2024-09-01 06:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][660/1251] eta 0:02:23 lr 0.000130 wd 0.0500 time 0.2356 (0.2430) data time 0.0010 (0.0016) model time 0.2346 (0.2416) loss 2.3478 (2.8600) grad_norm 4.2569 (4.3046) loss_scale 512.0000 
(477.9183) mem 7381MB [2024-09-01 06:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][670/1251] eta 0:02:21 lr 0.000130 wd 0.0500 time 0.2420 (0.2430) data time 0.0009 (0.0016) model time 0.2411 (0.2416) loss 2.5676 (2.8624) grad_norm 4.7690 (4.3097) loss_scale 512.0000 (478.4262) mem 7381MB [2024-09-01 06:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][680/1251] eta 0:02:18 lr 0.000130 wd 0.0500 time 0.2453 (0.2429) data time 0.0009 (0.0016) model time 0.2444 (0.2416) loss 3.6668 (2.8622) grad_norm 4.1904 (4.3271) loss_scale 512.0000 (478.9192) mem 7381MB [2024-09-01 06:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][690/1251] eta 0:02:16 lr 0.000130 wd 0.0500 time 0.2364 (0.2429) data time 0.0010 (0.0016) model time 0.2354 (0.2415) loss 3.2681 (2.8622) grad_norm 3.7916 (4.3210) loss_scale 512.0000 (479.3980) mem 7381MB [2024-09-01 06:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][700/1251] eta 0:02:13 lr 0.000130 wd 0.0500 time 0.2464 (0.2429) data time 0.0007 (0.0016) model time 0.2456 (0.2415) loss 2.2428 (2.8588) grad_norm 2.7553 (4.3296) loss_scale 512.0000 (479.8631) mem 7381MB [2024-09-01 06:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][710/1251] eta 0:02:11 lr 0.000130 wd 0.0500 time 0.2415 (0.2429) data time 0.0010 (0.0016) model time 0.2405 (0.2415) loss 2.8065 (2.8558) grad_norm 4.9558 (4.3265) loss_scale 512.0000 (480.3150) mem 7381MB [2024-09-01 06:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][720/1251] eta 0:02:08 lr 0.000130 wd 0.0500 time 0.2367 (0.2429) data time 0.0010 (0.0016) model time 0.2357 (0.2415) loss 2.5949 (2.8532) grad_norm 3.4438 (4.3245) loss_scale 512.0000 (480.7545) mem 7381MB [2024-09-01 06:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][730/1251] eta 0:02:06 lr 0.000130 wd 0.0500 time 0.2381 (0.2428) data time 0.0008 (0.0016) model 
time 0.2372 (0.2415) loss 3.1556 (2.8557) grad_norm 3.6977 (4.3269) loss_scale 512.0000 (481.1819) mem 7381MB [2024-09-01 06:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][740/1251] eta 0:02:04 lr 0.000130 wd 0.0500 time 0.2397 (0.2428) data time 0.0007 (0.0015) model time 0.2390 (0.2415) loss 3.1395 (2.8576) grad_norm 4.9449 (4.3249) loss_scale 512.0000 (481.5978) mem 7381MB [2024-09-01 06:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][750/1251] eta 0:02:01 lr 0.000130 wd 0.0500 time 0.2332 (0.2428) data time 0.0012 (0.0015) model time 0.2320 (0.2415) loss 2.6997 (2.8595) grad_norm 3.2391 (4.3274) loss_scale 512.0000 (482.0027) mem 7381MB [2024-09-01 06:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][760/1251] eta 0:01:59 lr 0.000130 wd 0.0500 time 0.2426 (0.2428) data time 0.0009 (0.0015) model time 0.2417 (0.2415) loss 2.6908 (2.8628) grad_norm 4.8066 (4.3387) loss_scale 512.0000 (482.3968) mem 7381MB [2024-09-01 06:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][770/1251] eta 0:01:56 lr 0.000130 wd 0.0500 time 0.2329 (0.2428) data time 0.0010 (0.0015) model time 0.2319 (0.2415) loss 1.9351 (2.8629) grad_norm 5.4921 (4.3508) loss_scale 512.0000 (482.7808) mem 7381MB [2024-09-01 06:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][780/1251] eta 0:01:54 lr 0.000130 wd 0.0500 time 0.2395 (0.2428) data time 0.0011 (0.0015) model time 0.2385 (0.2415) loss 3.2210 (2.8650) grad_norm 29.4900 (4.3754) loss_scale 512.0000 (483.1549) mem 7381MB [2024-09-01 06:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][790/1251] eta 0:01:51 lr 0.000130 wd 0.0500 time 0.2461 (0.2428) data time 0.0010 (0.0015) model time 0.2452 (0.2414) loss 3.3991 (2.8684) grad_norm 3.1416 (4.3654) loss_scale 512.0000 (483.5196) mem 7381MB [2024-09-01 06:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][800/1251] 
eta 0:01:49 lr 0.000130 wd 0.0500 time 0.2445 (0.2428) data time 0.0009 (0.0015) model time 0.2435 (0.2414) loss 3.0635 (2.8725) grad_norm 4.1054 (4.3693) loss_scale 512.0000 (483.8752) mem 7381MB [2024-09-01 06:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][810/1251] eta 0:01:47 lr 0.000130 wd 0.0500 time 0.2409 (0.2428) data time 0.0009 (0.0015) model time 0.2399 (0.2414) loss 3.1699 (2.8730) grad_norm 3.5494 (4.3666) loss_scale 512.0000 (484.2219) mem 7381MB [2024-09-01 06:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][820/1251] eta 0:01:44 lr 0.000130 wd 0.0500 time 0.2356 (0.2427) data time 0.0008 (0.0015) model time 0.2348 (0.2414) loss 2.3285 (2.8728) grad_norm 3.9775 (4.3633) loss_scale 512.0000 (484.5603) mem 7381MB [2024-09-01 06:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][830/1251] eta 0:01:42 lr 0.000130 wd 0.0500 time 0.2498 (0.2428) data time 0.0009 (0.0015) model time 0.2489 (0.2414) loss 3.0209 (2.8724) grad_norm 4.9589 (4.3541) loss_scale 512.0000 (484.8905) mem 7381MB [2024-09-01 06:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][840/1251] eta 0:01:39 lr 0.000130 wd 0.0500 time 0.2421 (0.2428) data time 0.0009 (0.0015) model time 0.2412 (0.2415) loss 3.1334 (2.8736) grad_norm 4.2273 (4.3510) loss_scale 512.0000 (485.2128) mem 7381MB [2024-09-01 06:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][850/1251] eta 0:01:37 lr 0.000130 wd 0.0500 time 0.2408 (0.2428) data time 0.0009 (0.0015) model time 0.2399 (0.2415) loss 2.9243 (2.8705) grad_norm 2.7684 (4.3445) loss_scale 512.0000 (485.5276) mem 7381MB [2024-09-01 06:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][860/1251] eta 0:01:34 lr 0.000130 wd 0.0500 time 0.2375 (0.2427) data time 0.0010 (0.0015) model time 0.2365 (0.2414) loss 2.5910 (2.8711) grad_norm 4.6155 (4.3540) loss_scale 512.0000 (485.8351) mem 7381MB [2024-09-01 
06:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][870/1251] eta 0:01:32 lr 0.000130 wd 0.0500 time 0.2388 (0.2427) data time 0.0008 (0.0015) model time 0.2380 (0.2414) loss 3.0871 (2.8732) grad_norm 5.3311 (4.3483) loss_scale 512.0000 (486.1355) mem 7381MB [2024-09-01 06:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][880/1251] eta 0:01:30 lr 0.000130 wd 0.0500 time 0.2499 (0.2427) data time 0.0009 (0.0015) model time 0.2489 (0.2414) loss 2.6042 (2.8718) grad_norm 6.6856 (4.3496) loss_scale 512.0000 (486.4291) mem 7381MB [2024-09-01 06:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][890/1251] eta 0:01:27 lr 0.000130 wd 0.0500 time 0.2392 (0.2427) data time 0.0009 (0.0014) model time 0.2383 (0.2414) loss 2.6626 (2.8739) grad_norm 6.2508 (4.4150) loss_scale 512.0000 (486.7160) mem 7381MB [2024-09-01 06:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][900/1251] eta 0:01:25 lr 0.000130 wd 0.0500 time 0.2414 (0.2427) data time 0.0007 (0.0014) model time 0.2407 (0.2414) loss 2.3848 (2.8727) grad_norm 3.1515 (4.4077) loss_scale 512.0000 (486.9967) mem 7381MB [2024-09-01 06:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][910/1251] eta 0:01:22 lr 0.000130 wd 0.0500 time 0.2474 (0.2427) data time 0.0009 (0.0014) model time 0.2465 (0.2415) loss 2.6065 (2.8737) grad_norm 2.8382 (4.3983) loss_scale 512.0000 (487.2711) mem 7381MB [2024-09-01 06:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][920/1251] eta 0:01:20 lr 0.000130 wd 0.0500 time 0.2387 (0.2427) data time 0.0009 (0.0014) model time 0.2378 (0.2415) loss 1.9902 (2.8726) grad_norm 3.4166 (4.3896) loss_scale 512.0000 (487.5396) mem 7381MB [2024-09-01 06:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][930/1251] eta 0:01:17 lr 0.000130 wd 0.0500 time 0.2456 (0.2427) data time 0.0010 (0.0014) model time 0.2446 (0.2414) loss 3.3639 
(2.8717) grad_norm 3.4449 (4.3827) loss_scale 512.0000 (487.8024) mem 7381MB [2024-09-01 06:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][940/1251] eta 0:01:15 lr 0.000129 wd 0.0500 time 0.2462 (0.2427) data time 0.0010 (0.0014) model time 0.2452 (0.2415) loss 3.0312 (2.8698) grad_norm 3.2471 (4.3819) loss_scale 512.0000 (488.0595) mem 7381MB [2024-09-01 06:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][950/1251] eta 0:01:13 lr 0.000129 wd 0.0500 time 0.2446 (0.2427) data time 0.0007 (0.0014) model time 0.2439 (0.2414) loss 2.5086 (2.8679) grad_norm 3.2242 (4.3809) loss_scale 512.0000 (488.3113) mem 7381MB [2024-09-01 06:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][960/1251] eta 0:01:10 lr 0.000129 wd 0.0500 time 0.2442 (0.2428) data time 0.0009 (0.0014) model time 0.2434 (0.2415) loss 2.9761 (2.8663) grad_norm 4.0960 (4.3773) loss_scale 512.0000 (488.5578) mem 7381MB [2024-09-01 06:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][970/1251] eta 0:01:08 lr 0.000129 wd 0.0500 time 0.2489 (0.2428) data time 0.0010 (0.0014) model time 0.2479 (0.2415) loss 2.3929 (2.8637) grad_norm 3.0546 (4.3740) loss_scale 512.0000 (488.7992) mem 7381MB [2024-09-01 06:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][980/1251] eta 0:01:05 lr 0.000129 wd 0.0500 time 0.2427 (0.2428) data time 0.0008 (0.0014) model time 0.2420 (0.2415) loss 2.2091 (2.8619) grad_norm 3.9502 (4.3782) loss_scale 512.0000 (489.0357) mem 7381MB [2024-09-01 06:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][990/1251] eta 0:01:03 lr 0.000129 wd 0.0500 time 0.2387 (0.2428) data time 0.0009 (0.0014) model time 0.2378 (0.2415) loss 2.5507 (2.8617) grad_norm 3.7505 (4.3988) loss_scale 512.0000 (489.2674) mem 7381MB [2024-09-01 06:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1000/1251] eta 0:01:00 lr 0.000129 wd 
0.0500 time 0.2172 (0.2430) data time 0.0012 (0.0014) model time 0.2161 (0.2417) loss 3.1453 (2.8617) grad_norm 4.3323 (4.3971) loss_scale 512.0000 (489.4945) mem 7381MB [2024-09-01 06:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1010/1251] eta 0:00:58 lr 0.000129 wd 0.0500 time 0.2509 (0.2430) data time 0.0009 (0.0014) model time 0.2500 (0.2417) loss 2.8763 (2.8610) grad_norm 5.0113 (4.3959) loss_scale 512.0000 (489.7171) mem 7381MB [2024-09-01 06:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1020/1251] eta 0:00:56 lr 0.000129 wd 0.0500 time 0.2371 (0.2429) data time 0.0010 (0.0014) model time 0.2361 (0.2417) loss 3.0107 (2.8600) grad_norm 3.6774 (4.3969) loss_scale 512.0000 (489.9354) mem 7381MB [2024-09-01 06:22:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1030/1251] eta 0:00:53 lr 0.000129 wd 0.0500 time 0.2458 (0.2429) data time 0.0011 (0.0014) model time 0.2447 (0.2417) loss 2.8039 (2.8597) grad_norm 3.9205 (4.3913) loss_scale 512.0000 (490.1494) mem 7381MB [2024-09-01 06:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1040/1251] eta 0:00:51 lr 0.000129 wd 0.0500 time 0.2433 (0.2429) data time 0.0008 (0.0014) model time 0.2424 (0.2417) loss 2.8402 (2.8607) grad_norm 4.5131 (4.3935) loss_scale 512.0000 (490.3593) mem 7381MB [2024-09-01 06:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1050/1251] eta 0:00:48 lr 0.000129 wd 0.0500 time 0.2377 (0.2429) data time 0.0011 (0.0014) model time 0.2366 (0.2417) loss 2.9344 (2.8615) grad_norm 3.4907 (4.3909) loss_scale 512.0000 (490.5652) mem 7381MB [2024-09-01 06:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1060/1251] eta 0:00:46 lr 0.000129 wd 0.0500 time 0.2380 (0.2429) data time 0.0010 (0.0014) model time 0.2370 (0.2417) loss 3.0433 (2.8627) grad_norm 2.9825 (4.3955) loss_scale 512.0000 (490.7672) mem 7381MB [2024-09-01 06:22:15 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1070/1251] eta 0:00:43 lr 0.000129 wd 0.0500 time 0.2373 (0.2429) data time 0.0011 (0.0014) model time 0.2362 (0.2417) loss 2.9936 (2.8623) grad_norm 2.5990 (4.3905) loss_scale 512.0000 (490.9655) mem 7381MB [2024-09-01 06:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1080/1251] eta 0:00:41 lr 0.000129 wd 0.0500 time 0.2415 (0.2433) data time 0.0011 (0.0014) model time 0.2404 (0.2421) loss 3.1149 (2.8630) grad_norm 4.7396 (4.3905) loss_scale 512.0000 (491.1600) mem 7381MB [2024-09-01 06:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1090/1251] eta 0:00:39 lr 0.000129 wd 0.0500 time 0.2406 (0.2435) data time 0.0011 (0.0014) model time 0.2395 (0.2423) loss 2.5556 (2.8632) grad_norm 8.6695 (4.3909) loss_scale 512.0000 (491.3511) mem 7381MB [2024-09-01 06:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1100/1251] eta 0:00:36 lr 0.000129 wd 0.0500 time 0.2458 (0.2435) data time 0.0007 (0.0014) model time 0.2450 (0.2423) loss 2.0733 (2.8597) grad_norm 2.6309 (4.3885) loss_scale 512.0000 (491.5386) mem 7381MB [2024-09-01 06:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1110/1251] eta 0:00:34 lr 0.000129 wd 0.0500 time 0.2393 (0.2435) data time 0.0013 (0.0014) model time 0.2380 (0.2423) loss 2.5737 (2.8600) grad_norm 3.7689 (4.3863) loss_scale 512.0000 (491.7228) mem 7381MB [2024-09-01 06:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1120/1251] eta 0:00:31 lr 0.000129 wd 0.0500 time 0.2348 (0.2434) data time 0.0010 (0.0014) model time 0.2337 (0.2422) loss 1.8554 (2.8610) grad_norm 3.6662 (4.3842) loss_scale 512.0000 (491.9037) mem 7381MB [2024-09-01 06:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1130/1251] eta 0:00:29 lr 0.000129 wd 0.0500 time 0.2473 (0.2444) data time 0.0009 (0.0014) model time 0.2463 (0.2432) loss 3.3171 
(2.8603) grad_norm 3.3942 (4.3834) loss_scale 512.0000 (492.0813) mem 7381MB [2024-09-01 06:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1140/1251] eta 0:00:27 lr 0.000129 wd 0.0500 time 0.2445 (0.2444) data time 0.0007 (0.0014) model time 0.2438 (0.2432) loss 2.1925 (2.8605) grad_norm 4.1282 (4.3827) loss_scale 512.0000 (492.2559) mem 7381MB [2024-09-01 06:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1150/1251] eta 0:00:24 lr 0.000129 wd 0.0500 time 0.2393 (0.2443) data time 0.0012 (0.0013) model time 0.2381 (0.2432) loss 2.7152 (2.8603) grad_norm 3.7520 (4.3765) loss_scale 512.0000 (492.4275) mem 7381MB [2024-09-01 06:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1160/1251] eta 0:00:22 lr 0.000129 wd 0.0500 time 0.2483 (0.2443) data time 0.0010 (0.0013) model time 0.2473 (0.2432) loss 2.2270 (2.8580) grad_norm 4.0811 (4.3717) loss_scale 512.0000 (492.5960) mem 7381MB [2024-09-01 06:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1170/1251] eta 0:00:19 lr 0.000129 wd 0.0500 time 0.2376 (0.2443) data time 0.0008 (0.0013) model time 0.2368 (0.2431) loss 2.4634 (2.8561) grad_norm 4.1256 (4.3647) loss_scale 512.0000 (492.7617) mem 7381MB [2024-09-01 06:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1180/1251] eta 0:00:17 lr 0.000129 wd 0.0500 time 0.2425 (0.2443) data time 0.0009 (0.0013) model time 0.2416 (0.2431) loss 2.1522 (2.8559) grad_norm 4.3390 (4.3642) loss_scale 512.0000 (492.9246) mem 7381MB [2024-09-01 06:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1190/1251] eta 0:00:14 lr 0.000129 wd 0.0500 time 0.2414 (0.2443) data time 0.0007 (0.0013) model time 0.2407 (0.2431) loss 3.1209 (2.8555) grad_norm 6.8237 (4.3660) loss_scale 512.0000 (493.0848) mem 7381MB [2024-09-01 06:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1200/1251] eta 0:00:12 lr 0.000129 wd 
0.0500 time 0.2422 (0.2442) data time 0.0007 (0.0013) model time 0.2414 (0.2431) loss 3.0166 (2.8559) grad_norm 3.8214 (4.3591) loss_scale 512.0000 (493.2423) mem 7381MB [2024-09-01 06:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1210/1251] eta 0:00:10 lr 0.000129 wd 0.0500 time 0.2353 (0.2442) data time 0.0008 (0.0013) model time 0.2345 (0.2431) loss 3.6386 (2.8578) grad_norm 3.4990 (4.3515) loss_scale 512.0000 (493.3972) mem 7381MB [2024-09-01 06:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1220/1251] eta 0:00:07 lr 0.000129 wd 0.0500 time 0.2397 (0.2442) data time 0.0010 (0.0013) model time 0.2387 (0.2431) loss 3.0589 (2.8582) grad_norm 2.6493 (4.3460) loss_scale 512.0000 (493.5495) mem 7381MB [2024-09-01 06:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1230/1251] eta 0:00:05 lr 0.000129 wd 0.0500 time 0.2312 (0.2442) data time 0.0010 (0.0013) model time 0.2302 (0.2430) loss 2.6371 (2.8583) grad_norm 3.9280 (4.3427) loss_scale 512.0000 (493.6994) mem 7381MB [2024-09-01 06:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1240/1251] eta 0:00:02 lr 0.000129 wd 0.0500 time 0.2289 (0.2441) data time 0.0005 (0.0013) model time 0.2284 (0.2430) loss 2.0419 (2.8547) grad_norm 4.2590 (4.3414) loss_scale 512.0000 (493.8469) mem 7381MB [2024-09-01 06:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [236/300][1250/1251] eta 0:00:00 lr 0.000129 wd 0.0500 time 0.2227 (0.2440) data time 0.0005 (0.0013) model time 0.2223 (0.2428) loss 2.5257 (2.8551) grad_norm 3.4452 (4.3444) loss_scale 512.0000 (493.9920) mem 7381MB [2024-09-01 06:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 236 training takes 0:05:05 [2024-09-01 06:23:00 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 06:23:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 06:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.453 (0.453) Loss 0.4070 (0.4070) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 06:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.113) Loss 0.5820 (0.6363) Acc@1 88.965 (86.586) Acc@5 98.047 (97.656) Mem 7381MB [2024-09-01 06:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.097) Loss 0.9824 (0.6644) Acc@1 76.562 (85.645) Acc@5 95.117 (97.503) Mem 7381MB [2024-09-01 06:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.091) Loss 1.1035 (0.7542) Acc@1 75.488 (83.597) Acc@5 93.262 (96.620) Mem 7381MB [2024-09-01 06:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0645 (0.8053) Acc@1 75.879 (82.369) Acc@5 93.652 (96.015) Mem 7381MB [2024-09-01 06:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.924 Acc@5 95.942 [2024-09-01 06:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 81.9% [2024-09-01 06:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.853 (0.853) Loss 0.3796 (0.3796) Acc@1 93.164 (93.164) Acc@5 98.340 (98.340) Mem 7381MB [2024-09-01 06:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.150) Loss 0.5698 (0.6007) Acc@1 89.746 (87.491) Acc@5 97.949 (97.754) Mem 7381MB [2024-09-01 06:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.116) Loss 0.8799 (0.6293) Acc@1 78.027 (86.398) Acc@5 96.094 (97.698) Mem 7381MB [2024-09-01 06:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.103) Loss 1.0859 (0.7128) Acc@1 74.219 (84.331) Acc@5 
93.066 (96.818) Mem 7381MB [2024-09-01 06:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 0.9893 (0.7579) Acc@1 76.367 (83.182) Acc@5 94.141 (96.346) Mem 7381MB [2024-09-01 06:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.764 Acc@5 96.328 [2024-09-01 06:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 06:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.76% [2024-09-01 06:23:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 06:23:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 06:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][0/1251] eta 0:16:25 lr 0.000129 wd 0.0500 time 0.7877 (0.7877) data time 0.5659 (0.5659) model time 0.0000 (0.0000) loss 2.8356 (2.8356) grad_norm 4.2728 (4.2728) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][10/1251] eta 0:06:00 lr 0.000129 wd 0.0500 time 0.2467 (0.2904) data time 0.0008 (0.0523) model time 0.0000 (0.0000) loss 3.1756 (2.7321) grad_norm 6.2500 (4.6409) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][20/1251] eta 0:05:28 lr 0.000129 wd 0.0500 time 0.2392 (0.2670) data time 0.0009 (0.0279) model time 0.0000 (0.0000) loss 2.9311 (2.8391) grad_norm 6.0356 (4.6410) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][30/1251] eta 0:05:15 lr 0.000129 wd 0.0500 time 0.2351 (0.2583) data time 0.0009 (0.0192) model time 0.0000 (0.0000) loss 2.6446 (2.8581) grad_norm 4.2519 (4.4773) loss_scale 
512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][40/1251] eta 0:05:07 lr 0.000128 wd 0.0500 time 0.2390 (0.2538) data time 0.0011 (0.0147) model time 0.0000 (0.0000) loss 2.8604 (2.8451) grad_norm 4.4418 (4.3621) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][50/1251] eta 0:05:01 lr 0.000128 wd 0.0500 time 0.2373 (0.2513) data time 0.0009 (0.0120) model time 0.0000 (0.0000) loss 2.9807 (2.8479) grad_norm 11.3821 (4.4242) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][60/1251] eta 0:04:57 lr 0.000128 wd 0.0500 time 0.2424 (0.2496) data time 0.0007 (0.0102) model time 0.2417 (0.2402) loss 3.4617 (2.8595) grad_norm 3.9707 (4.3870) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][70/1251] eta 0:04:53 lr 0.000128 wd 0.0500 time 0.2442 (0.2489) data time 0.0011 (0.0089) model time 0.2431 (0.2417) loss 2.8976 (2.8610) grad_norm 3.8123 (4.4635) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][80/1251] eta 0:04:50 lr 0.000128 wd 0.0500 time 0.2323 (0.2479) data time 0.0010 (0.0079) model time 0.2313 (0.2411) loss 3.4255 (2.8474) grad_norm 4.9265 (4.5211) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][90/1251] eta 0:04:46 lr 0.000128 wd 0.0500 time 0.2429 (0.2471) data time 0.0007 (0.0072) model time 0.2422 (0.2409) loss 2.3764 (2.8373) grad_norm 4.7906 (4.5794) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][100/1251] eta 0:04:43 lr 0.000128 wd 0.0500 time 0.2309 (0.2467) data time 0.0008 (0.0065) 
model time 0.2301 (0.2410) loss 3.4606 (2.8237) grad_norm 4.8618 (4.5472) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][110/1251] eta 0:04:41 lr 0.000128 wd 0.0500 time 0.2463 (0.2464) data time 0.0009 (0.0060) model time 0.2454 (0.2412) loss 2.9593 (2.8228) grad_norm 3.9003 (4.5659) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][120/1251] eta 0:04:38 lr 0.000128 wd 0.0500 time 0.2418 (0.2460) data time 0.0010 (0.0056) model time 0.2408 (0.2412) loss 2.6880 (2.8188) grad_norm 3.3267 (4.5041) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][130/1251] eta 0:04:35 lr 0.000128 wd 0.0500 time 0.2489 (0.2456) data time 0.0007 (0.0053) model time 0.2482 (0.2411) loss 3.0535 (2.8186) grad_norm 3.2139 (4.4407) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][140/1251] eta 0:04:32 lr 0.000128 wd 0.0500 time 0.2455 (0.2454) data time 0.0010 (0.0050) model time 0.2445 (0.2412) loss 2.9570 (2.8203) grad_norm 3.6796 (4.4012) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][150/1251] eta 0:04:30 lr 0.000128 wd 0.0500 time 0.2471 (0.2452) data time 0.0007 (0.0047) model time 0.2464 (0.2412) loss 2.5820 (2.8120) grad_norm 3.1029 (4.3763) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][160/1251] eta 0:04:27 lr 0.000128 wd 0.0500 time 0.2362 (0.2449) data time 0.0011 (0.0045) model time 0.2351 (0.2409) loss 1.6452 (2.8010) grad_norm 3.2496 (4.3199) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[237/300][170/1251] eta 0:04:24 lr 0.000128 wd 0.0500 time 0.2369 (0.2446) data time 0.0009 (0.0043) model time 0.2361 (0.2407) loss 3.3959 (2.8152) grad_norm 6.0440 (4.3304) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][180/1251] eta 0:04:21 lr 0.000128 wd 0.0500 time 0.2404 (0.2443) data time 0.0008 (0.0041) model time 0.2397 (0.2406) loss 2.8900 (2.8158) grad_norm 3.5615 (4.3103) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][190/1251] eta 0:04:18 lr 0.000128 wd 0.0500 time 0.2384 (0.2440) data time 0.0010 (0.0039) model time 0.2374 (0.2404) loss 1.6893 (2.8047) grad_norm 3.3762 (4.2864) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][200/1251] eta 0:04:16 lr 0.000128 wd 0.0500 time 0.2401 (0.2437) data time 0.0008 (0.0038) model time 0.2393 (0.2402) loss 2.5985 (2.7957) grad_norm 3.3326 (4.2467) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][210/1251] eta 0:04:13 lr 0.000128 wd 0.0500 time 0.2496 (0.2435) data time 0.0010 (0.0037) model time 0.2486 (0.2400) loss 2.8379 (2.7982) grad_norm 2.5895 (4.2388) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][220/1251] eta 0:04:10 lr 0.000128 wd 0.0500 time 0.2300 (0.2433) data time 0.0009 (0.0035) model time 0.2291 (0.2399) loss 2.8663 (2.8052) grad_norm 2.9312 (4.2307) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][230/1251] eta 0:04:08 lr 0.000128 wd 0.0500 time 0.2441 (0.2432) data time 0.0011 (0.0034) model time 0.2430 (0.2399) loss 1.7453 (2.7970) grad_norm 3.5699 (4.2123) loss_scale 512.0000 (512.0000) mem 
7381MB [2024-09-01 06:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][240/1251] eta 0:04:05 lr 0.000128 wd 0.0500 time 0.2404 (0.2429) data time 0.0009 (0.0033) model time 0.2395 (0.2397) loss 2.6872 (2.7985) grad_norm 5.5324 (4.1975) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][250/1251] eta 0:04:03 lr 0.000128 wd 0.0500 time 0.2398 (0.2428) data time 0.0009 (0.0032) model time 0.2389 (0.2396) loss 2.4288 (2.7852) grad_norm 3.2432 (4.1789) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][260/1251] eta 0:04:00 lr 0.000128 wd 0.0500 time 0.2419 (0.2427) data time 0.0007 (0.0031) model time 0.2412 (0.2396) loss 3.1634 (2.7900) grad_norm 3.7532 (4.1810) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][270/1251] eta 0:03:57 lr 0.000128 wd 0.0500 time 0.2397 (0.2426) data time 0.0009 (0.0031) model time 0.2388 (0.2396) loss 3.0336 (2.7902) grad_norm 4.1427 (4.1522) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][280/1251] eta 0:03:55 lr 0.000128 wd 0.0500 time 0.2360 (0.2425) data time 0.0013 (0.0030) model time 0.2347 (0.2395) loss 1.6701 (2.7917) grad_norm 3.5928 (4.1461) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][290/1251] eta 0:03:53 lr 0.000128 wd 0.0500 time 0.2403 (0.2425) data time 0.0009 (0.0029) model time 0.2394 (0.2396) loss 3.1085 (2.7929) grad_norm 3.2300 (4.1524) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][300/1251] eta 0:03:50 lr 0.000128 wd 0.0500 time 0.2340 (0.2425) data time 0.0011 (0.0029) model time 0.2329 
(0.2397) loss 3.2860 (2.7913) grad_norm 3.6049 (4.1986) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][310/1251] eta 0:03:48 lr 0.000128 wd 0.0500 time 0.2387 (0.2430) data time 0.0007 (0.0028) model time 0.2380 (0.2404) loss 2.5353 (2.7970) grad_norm 13.2786 (4.2526) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][320/1251] eta 0:03:46 lr 0.000128 wd 0.0500 time 0.2438 (0.2429) data time 0.0009 (0.0027) model time 0.2429 (0.2404) loss 2.7688 (2.7890) grad_norm 4.6408 (4.2439) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][330/1251] eta 0:03:43 lr 0.000128 wd 0.0500 time 0.2438 (0.2428) data time 0.0009 (0.0027) model time 0.2429 (0.2403) loss 2.8992 (2.7920) grad_norm 3.6400 (4.2440) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][340/1251] eta 0:03:41 lr 0.000128 wd 0.0500 time 0.2451 (0.2427) data time 0.0010 (0.0026) model time 0.2441 (0.2403) loss 3.4411 (2.7952) grad_norm 4.3032 (4.2449) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][350/1251] eta 0:03:38 lr 0.000128 wd 0.0500 time 0.2372 (0.2426) data time 0.0008 (0.0026) model time 0.2364 (0.2402) loss 2.2759 (2.7893) grad_norm 2.6117 (4.2232) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][360/1251] eta 0:03:37 lr 0.000128 wd 0.0500 time 0.2421 (0.2438) data time 0.0007 (0.0025) model time 0.2414 (0.2416) loss 2.1509 (2.7900) grad_norm 4.8500 (4.2365) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][370/1251] eta 0:03:35 
lr 0.000128 wd 0.0500 time 0.2368 (0.2443) data time 0.0009 (0.0025) model time 0.2359 (0.2422) loss 3.0244 (2.7899) grad_norm 4.4153 (4.2533) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][380/1251] eta 0:03:32 lr 0.000128 wd 0.0500 time 0.2521 (0.2442) data time 0.0009 (0.0025) model time 0.2512 (0.2421) loss 2.9812 (2.7911) grad_norm 3.1930 (4.2766) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][390/1251] eta 0:03:30 lr 0.000127 wd 0.0500 time 0.2316 (0.2441) data time 0.0009 (0.0024) model time 0.2307 (0.2420) loss 3.3416 (2.7989) grad_norm 3.1410 (4.2496) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][400/1251] eta 0:03:27 lr 0.000127 wd 0.0500 time 0.2475 (0.2441) data time 0.0007 (0.0024) model time 0.2468 (0.2420) loss 3.1485 (2.8058) grad_norm 5.0187 (4.2344) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][410/1251] eta 0:03:25 lr 0.000127 wd 0.0500 time 0.2399 (0.2440) data time 0.0010 (0.0024) model time 0.2389 (0.2419) loss 3.1342 (2.8095) grad_norm 5.2246 (4.2206) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][420/1251] eta 0:03:22 lr 0.000127 wd 0.0500 time 0.2365 (0.2439) data time 0.0009 (0.0023) model time 0.2356 (0.2418) loss 2.1303 (2.8052) grad_norm 5.8877 (4.2184) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][430/1251] eta 0:03:20 lr 0.000127 wd 0.0500 time 0.2440 (0.2438) data time 0.0007 (0.0023) model time 0.2433 (0.2418) loss 2.9881 (2.8031) grad_norm 3.1874 (4.2362) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:24:58 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][440/1251] eta 0:03:17 lr 0.000127 wd 0.0500 time 0.2379 (0.2437) data time 0.0011 (0.0023) model time 0.2368 (0.2417) loss 2.9004 (2.8029) grad_norm 11.9069 (4.2518) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][450/1251] eta 0:03:15 lr 0.000127 wd 0.0500 time 0.2450 (0.2437) data time 0.0009 (0.0022) model time 0.2441 (0.2417) loss 2.5381 (2.8001) grad_norm 3.8235 (4.2466) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][460/1251] eta 0:03:12 lr 0.000127 wd 0.0500 time 0.2425 (0.2437) data time 0.0010 (0.0022) model time 0.2415 (0.2418) loss 3.0051 (2.8051) grad_norm 3.2199 (4.2290) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][470/1251] eta 0:03:10 lr 0.000127 wd 0.0500 time 0.2396 (0.2436) data time 0.0010 (0.0022) model time 0.2386 (0.2417) loss 2.5422 (2.8058) grad_norm 3.9053 (4.2151) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][480/1251] eta 0:03:07 lr 0.000127 wd 0.0500 time 0.2416 (0.2436) data time 0.0009 (0.0022) model time 0.2407 (0.2417) loss 2.6731 (2.8089) grad_norm 2.8131 (4.2103) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][490/1251] eta 0:03:05 lr 0.000127 wd 0.0500 time 0.2416 (0.2435) data time 0.0007 (0.0021) model time 0.2409 (0.2416) loss 3.5539 (2.8116) grad_norm 3.8272 (4.1980) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][500/1251] eta 0:03:02 lr 0.000127 wd 0.0500 time 0.2409 (0.2435) data time 0.0009 (0.0021) model time 0.2400 (0.2416) loss 3.2554 (2.8124) 
grad_norm 3.0269 (4.1999) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][510/1251] eta 0:03:00 lr 0.000127 wd 0.0500 time 0.2399 (0.2435) data time 0.0007 (0.0021) model time 0.2392 (0.2416) loss 2.1566 (2.8141) grad_norm 3.4458 (4.1872) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][520/1251] eta 0:02:57 lr 0.000127 wd 0.0500 time 0.2348 (0.2434) data time 0.0010 (0.0021) model time 0.2338 (0.2415) loss 3.1144 (2.8149) grad_norm 5.4865 (4.2084) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][530/1251] eta 0:02:55 lr 0.000127 wd 0.0500 time 0.2360 (0.2434) data time 0.0010 (0.0020) model time 0.2350 (0.2415) loss 3.2334 (2.8132) grad_norm 2.8648 (4.2617) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][540/1251] eta 0:02:52 lr 0.000127 wd 0.0500 time 0.2366 (0.2433) data time 0.0009 (0.0020) model time 0.2357 (0.2415) loss 2.7439 (2.8114) grad_norm 4.3970 (4.2669) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][550/1251] eta 0:02:50 lr 0.000127 wd 0.0500 time 0.2435 (0.2433) data time 0.0009 (0.0020) model time 0.2426 (0.2415) loss 3.0074 (2.8141) grad_norm 4.4847 (4.2687) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][560/1251] eta 0:02:48 lr 0.000127 wd 0.0500 time 0.2360 (0.2432) data time 0.0010 (0.0020) model time 0.2350 (0.2414) loss 2.3135 (2.8124) grad_norm 3.1834 (4.2641) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][570/1251] eta 0:02:45 lr 0.000127 wd 0.0500 time 
0.2333 (0.2432) data time 0.0009 (0.0020) model time 0.2324 (0.2414) loss 2.3209 (2.8089) grad_norm 4.2764 (4.2747) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][580/1251] eta 0:02:43 lr 0.000127 wd 0.0500 time 0.2449 (0.2432) data time 0.0008 (0.0020) model time 0.2442 (0.2414) loss 2.9748 (2.8099) grad_norm 3.5216 (4.2760) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][590/1251] eta 0:02:40 lr 0.000127 wd 0.0500 time 0.2363 (0.2431) data time 0.0008 (0.0019) model time 0.2354 (0.2414) loss 2.7283 (2.8128) grad_norm 4.4161 (4.2825) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][600/1251] eta 0:02:38 lr 0.000127 wd 0.0500 time 0.2418 (0.2431) data time 0.0010 (0.0019) model time 0.2408 (0.2413) loss 3.1051 (2.8123) grad_norm 3.6622 (4.2855) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][610/1251] eta 0:02:35 lr 0.000127 wd 0.0500 time 0.2358 (0.2431) data time 0.0009 (0.0019) model time 0.2349 (0.2413) loss 3.0662 (2.8128) grad_norm 4.9096 (4.3105) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][620/1251] eta 0:02:33 lr 0.000127 wd 0.0500 time 0.2502 (0.2430) data time 0.0010 (0.0019) model time 0.2492 (0.2413) loss 3.1695 (2.8120) grad_norm 5.2414 (4.3064) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][630/1251] eta 0:02:30 lr 0.000127 wd 0.0500 time 0.2389 (0.2430) data time 0.0009 (0.0019) model time 0.2380 (0.2413) loss 2.2332 (2.8148) grad_norm 3.3359 (4.2964) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:46 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [237/300][640/1251] eta 0:02:28 lr 0.000127 wd 0.0500 time 0.2419 (0.2430) data time 0.0009 (0.0019) model time 0.2409 (0.2413) loss 2.5106 (2.8116) grad_norm 3.5607 (4.2888) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][650/1251] eta 0:02:26 lr 0.000127 wd 0.0500 time 0.2382 (0.2429) data time 0.0009 (0.0019) model time 0.2373 (0.2412) loss 3.2122 (2.8122) grad_norm 3.0739 (4.2800) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][660/1251] eta 0:02:23 lr 0.000127 wd 0.0500 time 0.2421 (0.2429) data time 0.0007 (0.0018) model time 0.2413 (0.2412) loss 3.6066 (2.8138) grad_norm 3.5106 (4.2805) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][670/1251] eta 0:02:21 lr 0.000127 wd 0.0500 time 0.2455 (0.2429) data time 0.0011 (0.0018) model time 0.2444 (0.2412) loss 1.9214 (2.8124) grad_norm 3.0323 (4.3174) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][680/1251] eta 0:02:18 lr 0.000127 wd 0.0500 time 0.2382 (0.2428) data time 0.0008 (0.0018) model time 0.2374 (0.2411) loss 3.2008 (2.8129) grad_norm 4.0126 (4.3144) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][690/1251] eta 0:02:16 lr 0.000127 wd 0.0500 time 0.2412 (0.2428) data time 0.0010 (0.0018) model time 0.2402 (0.2412) loss 2.9732 (2.8112) grad_norm 3.7946 (4.3092) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][700/1251] eta 0:02:13 lr 0.000127 wd 0.0500 time 0.2399 (0.2428) data time 0.0011 (0.0018) model time 0.2388 (0.2412) loss 2.4286 (2.8103) grad_norm 4.3675 
(4.3130) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][710/1251] eta 0:02:11 lr 0.000127 wd 0.0500 time 0.2376 (0.2428) data time 0.0010 (0.0018) model time 0.2367 (0.2412) loss 2.2301 (2.8146) grad_norm 5.2130 (4.3083) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][720/1251] eta 0:02:08 lr 0.000127 wd 0.0500 time 0.2444 (0.2428) data time 0.0009 (0.0018) model time 0.2435 (0.2411) loss 2.9911 (2.8152) grad_norm 5.1586 (4.3060) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][730/1251] eta 0:02:06 lr 0.000127 wd 0.0500 time 0.2374 (0.2427) data time 0.0009 (0.0018) model time 0.2365 (0.2411) loss 2.5952 (2.8135) grad_norm 5.1166 (4.3256) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][740/1251] eta 0:02:04 lr 0.000126 wd 0.0500 time 0.2381 (0.2427) data time 0.0008 (0.0017) model time 0.2373 (0.2411) loss 2.1397 (2.8146) grad_norm 5.6768 (4.3334) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][750/1251] eta 0:02:01 lr 0.000126 wd 0.0500 time 0.2439 (0.2427) data time 0.0010 (0.0017) model time 0.2429 (0.2410) loss 2.5937 (2.8131) grad_norm 3.2653 (4.3610) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][760/1251] eta 0:01:59 lr 0.000126 wd 0.0500 time 0.2395 (0.2426) data time 0.0011 (0.0017) model time 0.2384 (0.2410) loss 2.9891 (2.8144) grad_norm 2.9711 (4.3646) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][770/1251] eta 0:01:56 lr 0.000126 wd 0.0500 time 0.2507 (0.2426) data 
time 0.0008 (0.0017) model time 0.2499 (0.2410) loss 3.4027 (2.8156) grad_norm 2.8916 (4.3586) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][780/1251] eta 0:01:54 lr 0.000126 wd 0.0500 time 0.2394 (0.2427) data time 0.0011 (0.0017) model time 0.2384 (0.2411) loss 2.8950 (2.8132) grad_norm 4.5779 (4.3527) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][790/1251] eta 0:01:51 lr 0.000126 wd 0.0500 time 0.2439 (0.2426) data time 0.0007 (0.0017) model time 0.2432 (0.2410) loss 2.0687 (2.8143) grad_norm 3.0956 (4.3412) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][800/1251] eta 0:01:49 lr 0.000126 wd 0.0500 time 0.2423 (0.2425) data time 0.0009 (0.0017) model time 0.2414 (0.2410) loss 1.8806 (2.8116) grad_norm 3.5384 (4.3379) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][810/1251] eta 0:01:46 lr 0.000126 wd 0.0500 time 0.2465 (0.2425) data time 0.0010 (0.0017) model time 0.2456 (0.2410) loss 3.1906 (2.8109) grad_norm 3.7949 (4.3362) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][820/1251] eta 0:01:44 lr 0.000126 wd 0.0500 time 0.2482 (0.2425) data time 0.0010 (0.0017) model time 0.2472 (0.2410) loss 3.1910 (2.8116) grad_norm 3.2233 (4.3278) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][830/1251] eta 0:01:42 lr 0.000126 wd 0.0500 time 0.2410 (0.2425) data time 0.0009 (0.0017) model time 0.2401 (0.2409) loss 2.4523 (2.8135) grad_norm 4.0492 (4.3203) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [237/300][840/1251] eta 0:01:39 lr 0.000126 wd 0.0500 time 0.2376 (0.2425) data time 0.0008 (0.0017) model time 0.2368 (0.2409) loss 3.0263 (2.8134) grad_norm 3.3860 (4.3216) loss_scale 1024.0000 (514.4352) mem 7381MB [2024-09-01 06:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][850/1251] eta 0:01:37 lr 0.000126 wd 0.0500 time 0.2441 (0.2424) data time 0.0010 (0.0017) model time 0.2431 (0.2409) loss 3.0973 (2.8147) grad_norm 4.0707 (4.3287) loss_scale 1024.0000 (520.4230) mem 7381MB [2024-09-01 06:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][860/1251] eta 0:01:34 lr 0.000126 wd 0.0500 time 0.2387 (0.2424) data time 0.0007 (0.0016) model time 0.2380 (0.2409) loss 1.9888 (2.8124) grad_norm 3.5240 (4.3196) loss_scale 1024.0000 (526.2718) mem 7381MB [2024-09-01 06:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][870/1251] eta 0:01:32 lr 0.000126 wd 0.0500 time 0.2416 (0.2424) data time 0.0010 (0.0016) model time 0.2406 (0.2408) loss 3.2411 (2.8141) grad_norm 3.2028 (4.3144) loss_scale 1024.0000 (531.9862) mem 7381MB [2024-09-01 06:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][880/1251] eta 0:01:30 lr 0.000126 wd 0.0500 time 0.2437 (0.2427) data time 0.0007 (0.0016) model time 0.2430 (0.2412) loss 2.9881 (2.8136) grad_norm 3.2793 (4.3074) loss_scale 1024.0000 (537.5709) mem 7381MB [2024-09-01 06:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][890/1251] eta 0:01:27 lr 0.000126 wd 0.0500 time 0.2352 (0.2429) data time 0.0009 (0.0016) model time 0.2342 (0.2414) loss 3.0269 (2.8140) grad_norm 4.0662 (4.3020) loss_scale 1024.0000 (543.0303) mem 7381MB [2024-09-01 06:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][900/1251] eta 0:01:25 lr 0.000126 wd 0.0500 time 0.2467 (0.2429) data time 0.0007 (0.0016) model time 0.2459 (0.2414) loss 3.7034 (2.8157) grad_norm 3.2666 (4.2943) loss_scale 1024.0000 
(548.3685) mem 7381MB [2024-09-01 06:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][910/1251] eta 0:01:22 lr 0.000126 wd 0.0500 time 0.2396 (0.2429) data time 0.0015 (0.0016) model time 0.2380 (0.2414) loss 2.9855 (2.8185) grad_norm 3.2198 (4.2908) loss_scale 1024.0000 (553.5895) mem 7381MB [2024-09-01 06:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][920/1251] eta 0:01:20 lr 0.000126 wd 0.0500 time 0.2491 (0.2428) data time 0.0010 (0.0016) model time 0.2481 (0.2413) loss 2.7122 (2.8146) grad_norm 4.7314 (4.2975) loss_scale 1024.0000 (558.6971) mem 7381MB [2024-09-01 06:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][930/1251] eta 0:01:17 lr 0.000126 wd 0.0500 time 0.2377 (0.2428) data time 0.0007 (0.0016) model time 0.2370 (0.2413) loss 3.1243 (2.8158) grad_norm 3.2113 (4.3026) loss_scale 1024.0000 (563.6950) mem 7381MB [2024-09-01 06:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][940/1251] eta 0:01:15 lr 0.000126 wd 0.0500 time 0.2455 (0.2428) data time 0.0008 (0.0016) model time 0.2448 (0.2413) loss 3.1283 (2.8175) grad_norm 4.2925 (4.2967) loss_scale 1024.0000 (568.5866) mem 7381MB [2024-09-01 06:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][950/1251] eta 0:01:13 lr 0.000126 wd 0.0500 time 0.2386 (0.2428) data time 0.0010 (0.0016) model time 0.2377 (0.2413) loss 3.1072 (2.8197) grad_norm 4.5400 (4.3031) loss_scale 1024.0000 (573.3754) mem 7381MB [2024-09-01 06:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][960/1251] eta 0:01:10 lr 0.000126 wd 0.0500 time 0.2378 (0.2428) data time 0.0007 (0.0016) model time 0.2371 (0.2413) loss 1.8943 (2.8148) grad_norm 3.2619 (4.3071) loss_scale 1024.0000 (578.0645) mem 7381MB [2024-09-01 06:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][970/1251] eta 0:01:08 lr 0.000126 wd 0.0500 time 0.2485 (0.2428) data time 0.0007 (0.0016) 
model time 0.2478 (0.2413) loss 3.5723 (2.8152) grad_norm 2.9397 (4.3085) loss_scale 1024.0000 (582.6571) mem 7381MB [2024-09-01 06:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][980/1251] eta 0:01:05 lr 0.000126 wd 0.0500 time 0.2427 (0.2428) data time 0.0007 (0.0016) model time 0.2420 (0.2413) loss 2.0992 (2.8146) grad_norm 4.2256 (4.3093) loss_scale 1024.0000 (587.1560) mem 7381MB [2024-09-01 06:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][990/1251] eta 0:01:03 lr 0.000126 wd 0.0500 time 0.2418 (0.2428) data time 0.0011 (0.0016) model time 0.2407 (0.2413) loss 2.3626 (2.8144) grad_norm 3.2371 (4.3152) loss_scale 1024.0000 (591.5641) mem 7381MB [2024-09-01 06:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1000/1251] eta 0:01:00 lr 0.000126 wd 0.0500 time 0.2335 (0.2427) data time 0.0011 (0.0016) model time 0.2323 (0.2413) loss 3.1081 (2.8154) grad_norm 3.7318 (4.3158) loss_scale 1024.0000 (595.8841) mem 7381MB [2024-09-01 06:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1010/1251] eta 0:00:58 lr 0.000126 wd 0.0500 time 0.2370 (0.2427) data time 0.0009 (0.0016) model time 0.2361 (0.2413) loss 2.3367 (2.8153) grad_norm 5.3839 (4.3135) loss_scale 1024.0000 (600.1187) mem 7381MB [2024-09-01 06:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1020/1251] eta 0:00:56 lr 0.000126 wd 0.0500 time 0.2424 (0.2427) data time 0.0007 (0.0015) model time 0.2417 (0.2413) loss 3.2371 (2.8153) grad_norm 3.9942 (4.3201) loss_scale 1024.0000 (604.2703) mem 7381MB [2024-09-01 06:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1030/1251] eta 0:00:53 lr 0.000126 wd 0.0500 time 0.2384 (0.2427) data time 0.0009 (0.0015) model time 0.2375 (0.2413) loss 3.6576 (2.8170) grad_norm 4.7165 (4.3253) loss_scale 1024.0000 (608.3414) mem 7381MB [2024-09-01 06:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[237/300][1040/1251] eta 0:00:51 lr 0.000126 wd 0.0500 time 0.2421 (0.2427) data time 0.0010 (0.0015) model time 0.2411 (0.2413) loss 1.6871 (2.8142) grad_norm 3.4484 (4.3210) loss_scale 1024.0000 (612.3343) mem 7381MB [2024-09-01 06:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1050/1251] eta 0:00:48 lr 0.000126 wd 0.0500 time 0.2411 (0.2427) data time 0.0009 (0.0015) model time 0.2403 (0.2412) loss 2.9146 (2.8163) grad_norm 3.0660 (4.3296) loss_scale 1024.0000 (616.2512) mem 7381MB [2024-09-01 06:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1060/1251] eta 0:00:46 lr 0.000126 wd 0.0500 time 0.2331 (0.2427) data time 0.0011 (0.0015) model time 0.2320 (0.2412) loss 2.5529 (2.8170) grad_norm 4.7846 (4.3232) loss_scale 1024.0000 (620.0943) mem 7381MB [2024-09-01 06:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1070/1251] eta 0:00:43 lr 0.000126 wd 0.0500 time 0.2398 (0.2426) data time 0.0011 (0.0015) model time 0.2387 (0.2412) loss 2.9980 (2.8165) grad_norm 5.6201 (4.3198) loss_scale 1024.0000 (623.8655) mem 7381MB [2024-09-01 06:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1080/1251] eta 0:00:41 lr 0.000126 wd 0.0500 time 0.2428 (0.2426) data time 0.0008 (0.0015) model time 0.2421 (0.2412) loss 1.6597 (2.8154) grad_norm 4.2318 (4.3165) loss_scale 1024.0000 (627.5671) mem 7381MB [2024-09-01 06:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1090/1251] eta 0:00:39 lr 0.000125 wd 0.0500 time 0.2409 (0.2426) data time 0.0009 (0.0015) model time 0.2400 (0.2412) loss 3.1030 (2.8191) grad_norm 3.6966 (4.3082) loss_scale 1024.0000 (631.2007) mem 7381MB [2024-09-01 06:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1100/1251] eta 0:00:36 lr 0.000125 wd 0.0500 time 0.2394 (0.2426) data time 0.0007 (0.0015) model time 0.2387 (0.2412) loss 3.5868 (2.8184) grad_norm 3.2692 (4.3013) loss_scale 1024.0000 
(634.7684) mem 7381MB [2024-09-01 06:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1110/1251] eta 0:00:34 lr 0.000125 wd 0.0500 time 0.2385 (0.2426) data time 0.0009 (0.0015) model time 0.2376 (0.2412) loss 2.1098 (2.8203) grad_norm 14.6233 (4.3097) loss_scale 1024.0000 (638.2718) mem 7381MB [2024-09-01 06:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1120/1251] eta 0:00:31 lr 0.000125 wd 0.0500 time 0.2288 (0.2426) data time 0.0010 (0.0015) model time 0.2278 (0.2412) loss 2.8780 (2.8206) grad_norm 4.3283 (4.3199) loss_scale 1024.0000 (641.7128) mem 7381MB [2024-09-01 06:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1130/1251] eta 0:00:29 lr 0.000125 wd 0.0500 time 0.2488 (0.2426) data time 0.0011 (0.0015) model time 0.2477 (0.2412) loss 3.2509 (2.8199) grad_norm 3.7861 (4.3176) loss_scale 1024.0000 (645.0928) mem 7381MB [2024-09-01 06:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1140/1251] eta 0:00:26 lr 0.000125 wd 0.0500 time 0.2446 (0.2426) data time 0.0007 (0.0015) model time 0.2438 (0.2412) loss 2.4949 (2.8204) grad_norm 3.3058 (4.3111) loss_scale 1024.0000 (648.4137) mem 7381MB [2024-09-01 06:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1150/1251] eta 0:00:24 lr 0.000125 wd 0.0500 time 0.2423 (0.2426) data time 0.0007 (0.0015) model time 0.2416 (0.2412) loss 3.3755 (2.8224) grad_norm 3.0906 (4.3096) loss_scale 1024.0000 (651.6768) mem 7381MB [2024-09-01 06:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1160/1251] eta 0:00:22 lr 0.000125 wd 0.0500 time 0.2364 (0.2426) data time 0.0010 (0.0015) model time 0.2354 (0.2412) loss 2.8487 (2.8209) grad_norm 2.5426 (4.3079) loss_scale 1024.0000 (654.8837) mem 7381MB [2024-09-01 06:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1170/1251] eta 0:00:19 lr 0.000125 wd 0.0500 time 0.2410 (0.2426) data time 0.0010 
(0.0015) model time 0.2401 (0.2412) loss 2.0324 (2.8201) grad_norm 3.9660 (4.3033) loss_scale 1024.0000 (658.0359) mem 7381MB [2024-09-01 06:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1180/1251] eta 0:00:17 lr 0.000125 wd 0.0500 time 0.2366 (0.2425) data time 0.0007 (0.0015) model time 0.2359 (0.2411) loss 3.3334 (2.8217) grad_norm 4.4440 (4.2998) loss_scale 1024.0000 (661.1346) mem 7381MB [2024-09-01 06:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1190/1251] eta 0:00:14 lr 0.000125 wd 0.0500 time 0.2421 (0.2425) data time 0.0007 (0.0015) model time 0.2414 (0.2411) loss 3.3906 (2.8235) grad_norm 3.4155 (4.3066) loss_scale 1024.0000 (664.1814) mem 7381MB [2024-09-01 06:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1200/1251] eta 0:00:12 lr 0.000125 wd 0.0500 time 0.2501 (0.2425) data time 0.0010 (0.0015) model time 0.2491 (0.2412) loss 2.8310 (2.8223) grad_norm 5.1314 (4.3084) loss_scale 1024.0000 (667.1774) mem 7381MB [2024-09-01 06:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1210/1251] eta 0:00:09 lr 0.000125 wd 0.0500 time 0.2367 (0.2425) data time 0.0009 (0.0015) model time 0.2357 (0.2412) loss 2.6895 (2.8211) grad_norm 3.1454 (4.3049) loss_scale 1024.0000 (670.1239) mem 7381MB [2024-09-01 06:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1220/1251] eta 0:00:07 lr 0.000125 wd 0.0500 time 0.2478 (0.2425) data time 0.0007 (0.0015) model time 0.2471 (0.2412) loss 2.7050 (2.8215) grad_norm 4.2891 (4.3010) loss_scale 1024.0000 (673.0221) mem 7381MB [2024-09-01 06:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1230/1251] eta 0:00:05 lr 0.000125 wd 0.0500 time 0.2415 (0.2425) data time 0.0010 (0.0015) model time 0.2405 (0.2412) loss 3.0652 (2.8223) grad_norm 4.3486 (4.2980) loss_scale 1024.0000 (675.8733) mem 7381MB [2024-09-01 06:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [237/300][1240/1251] eta 0:00:02 lr 0.000125 wd 0.0500 time 0.2219 (0.2426) data time 0.0005 (0.0014) model time 0.2214 (0.2413) loss 3.5548 (2.8215) grad_norm 4.7546 (4.2942) loss_scale 1024.0000 (678.6785) mem 7381MB [2024-09-01 06:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [237/300][1250/1251] eta 0:00:00 lr 0.000125 wd 0.0500 time 0.2225 (0.2425) data time 0.0005 (0.0014) model time 0.2221 (0.2411) loss 2.8759 (2.8210) grad_norm 3.8766 (4.2972) loss_scale 1024.0000 (681.4388) mem 7381MB [2024-09-01 06:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 237 training takes 0:05:03 [2024-09-01 06:28:13 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:28:14 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 06:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.456 (0.456) Loss 0.4204 (0.4204) Acc@1 92.871 (92.871) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 06:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.110) Loss 0.6108 (0.6393) Acc@1 88.281 (86.870) Acc@5 97.461 (97.479) Mem 7381MB [2024-09-01 06:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.094) Loss 0.9741 (0.6720) Acc@1 77.148 (85.784) Acc@5 95.215 (97.391) Mem 7381MB [2024-09-01 06:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.089) Loss 1.1357 (0.7625) Acc@1 73.242 (83.565) Acc@5 92.578 (96.402) Mem 7381MB [2024-09-01 06:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0762 (0.8124) Acc@1 75.684 (82.424) Acc@5 94.043 (95.894) Mem 7381MB [2024-09-01 06:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.070 Acc@5 95.860 [2024-09-01 06:28:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-09-01 06:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.07% [2024-09-01 06:28:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 06:28:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 06:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.445 (0.445) Loss 0.3796 (0.3796) Acc@1 93.164 (93.164) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.110) Loss 0.5693 (0.6008) Acc@1 89.941 (87.509) Acc@5 97.949 (97.772) Mem 7381MB [2024-09-01 06:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.094) Loss 0.8818 (0.6296) Acc@1 77.930 (86.393) Acc@5 96.094 (97.712) Mem 7381MB [2024-09-01 06:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.0859 (0.7132) Acc@1 74.316 (84.331) Acc@5 93.555 (96.840) Mem 7381MB [2024-09-01 06:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 0.9922 (0.7584) Acc@1 76.270 (83.201) Acc@5 94.238 (96.365) Mem 7381MB [2024-09-01 06:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.776 Acc@5 96.344 [2024-09-01 06:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 06:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.78% [2024-09-01 06:28:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... 
[2024-09-01 06:28:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 06:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][0/1251] eta 0:17:11 lr 0.000125 wd 0.0500 time 0.8246 (0.8246) data time 0.5975 (0.5975) model time 0.0000 (0.0000) loss 2.8135 (2.8135) grad_norm 3.4405 (3.4405) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][10/1251] eta 0:06:07 lr 0.000125 wd 0.0500 time 0.2437 (0.2958) data time 0.0009 (0.0552) model time 0.0000 (0.0000) loss 2.4225 (2.8732) grad_norm 4.2275 (4.0950) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][20/1251] eta 0:05:32 lr 0.000125 wd 0.0500 time 0.2477 (0.2700) data time 0.0009 (0.0294) model time 0.0000 (0.0000) loss 1.9530 (2.7231) grad_norm 2.4871 (3.7896) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][30/1251] eta 0:05:18 lr 0.000125 wd 0.0500 time 0.2463 (0.2611) data time 0.0010 (0.0202) model time 0.0000 (0.0000) loss 2.4920 (2.7549) grad_norm 7.7133 (4.1081) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][40/1251] eta 0:05:09 lr 0.000125 wd 0.0500 time 0.2435 (0.2558) data time 0.0008 (0.0155) model time 0.0000 (0.0000) loss 2.7825 (2.7128) grad_norm 4.4540 (4.1778) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][50/1251] eta 0:05:04 lr 0.000125 wd 0.0500 time 0.2387 (0.2534) data time 0.0007 (0.0127) model time 0.0000 (0.0000) loss 3.0054 (2.7247) grad_norm 4.1142 (4.4404) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:39 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [238/300][60/1251] eta 0:04:59 lr 0.000125 wd 0.0500 time 0.2436 (0.2516) data time 0.0009 (0.0108) model time 0.2427 (0.2413) loss 3.8439 (2.7703) grad_norm 3.9714 (4.3850) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][70/1251] eta 0:04:55 lr 0.000125 wd 0.0500 time 0.2418 (0.2500) data time 0.0010 (0.0094) model time 0.2408 (0.2404) loss 2.8291 (2.7764) grad_norm 5.2620 (4.7783) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][80/1251] eta 0:04:51 lr 0.000125 wd 0.0500 time 0.2401 (0.2488) data time 0.0009 (0.0084) model time 0.2392 (0.2399) loss 3.5229 (2.8077) grad_norm 4.7029 (4.7437) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][90/1251] eta 0:04:47 lr 0.000125 wd 0.0500 time 0.2352 (0.2480) data time 0.0011 (0.0075) model time 0.2341 (0.2401) loss 3.4515 (2.8148) grad_norm 4.1122 (4.6470) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][100/1251] eta 0:04:44 lr 0.000125 wd 0.0500 time 0.2379 (0.2474) data time 0.0009 (0.0069) model time 0.2369 (0.2404) loss 2.5762 (2.8016) grad_norm 2.6754 (4.5826) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][110/1251] eta 0:04:41 lr 0.000125 wd 0.0500 time 0.2401 (0.2467) data time 0.0011 (0.0064) model time 0.2390 (0.2400) loss 2.9591 (2.8082) grad_norm 3.6634 (4.5307) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][120/1251] eta 0:04:38 lr 0.000125 wd 0.0500 time 0.2394 (0.2463) data time 0.0011 (0.0059) model time 0.2383 (0.2401) loss 2.3098 (2.8165) grad_norm 
4.6443 (4.5481) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][130/1251] eta 0:04:35 lr 0.000125 wd 0.0500 time 0.2369 (0.2460) data time 0.0010 (0.0055) model time 0.2358 (0.2402) loss 2.7917 (2.8179) grad_norm 2.8346 (4.4807) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][140/1251] eta 0:04:32 lr 0.000125 wd 0.0500 time 0.2430 (0.2457) data time 0.0010 (0.0052) model time 0.2420 (0.2403) loss 3.4066 (2.8253) grad_norm 4.7483 (4.4466) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][150/1251] eta 0:04:32 lr 0.000125 wd 0.0500 time 0.2363 (0.2475) data time 0.0010 (0.0049) model time 0.2353 (0.2434) loss 2.9398 (2.8148) grad_norm 6.3875 (4.4958) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][160/1251] eta 0:04:30 lr 0.000125 wd 0.0500 time 0.2347 (0.2479) data time 0.0008 (0.0047) model time 0.2339 (0.2443) loss 3.2955 (2.8226) grad_norm 3.0974 (4.4954) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][170/1251] eta 0:04:27 lr 0.000125 wd 0.0500 time 0.2367 (0.2474) data time 0.0012 (0.0045) model time 0.2355 (0.2438) loss 3.2555 (2.8199) grad_norm 4.0599 (4.4798) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][180/1251] eta 0:04:24 lr 0.000125 wd 0.0500 time 0.2443 (0.2471) data time 0.0009 (0.0043) model time 0.2434 (0.2436) loss 3.2112 (2.8242) grad_norm 3.7931 (4.4477) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][190/1251] eta 0:04:21 lr 0.000124 wd 0.0500 time 
0.2335 (0.2468) data time 0.0010 (0.0041) model time 0.2325 (0.2434) loss 3.5421 (2.8222) grad_norm 2.7470 (4.4616) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][200/1251] eta 0:04:19 lr 0.000124 wd 0.0500 time 0.2403 (0.2465) data time 0.0009 (0.0040) model time 0.2395 (0.2431) loss 2.9013 (2.8297) grad_norm 5.2584 (4.4506) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][210/1251] eta 0:04:16 lr 0.000124 wd 0.0500 time 0.2475 (0.2463) data time 0.0010 (0.0038) model time 0.2466 (0.2430) loss 3.1435 (2.8218) grad_norm 4.6649 (4.4312) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][220/1251] eta 0:04:13 lr 0.000124 wd 0.0500 time 0.2357 (0.2459) data time 0.0010 (0.0037) model time 0.2346 (0.2427) loss 2.6296 (2.8212) grad_norm 4.7408 (4.4218) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][230/1251] eta 0:04:10 lr 0.000124 wd 0.0500 time 0.2304 (0.2456) data time 0.0010 (0.0036) model time 0.2294 (0.2424) loss 3.1981 (2.8158) grad_norm 5.8076 (4.4156) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][240/1251] eta 0:04:08 lr 0.000124 wd 0.0500 time 0.2393 (0.2453) data time 0.0011 (0.0035) model time 0.2382 (0.2422) loss 3.2061 (2.8139) grad_norm 5.0134 (4.3937) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][250/1251] eta 0:04:05 lr 0.000124 wd 0.0500 time 0.2428 (0.2451) data time 0.0010 (0.0034) model time 0.2417 (0.2420) loss 2.4011 (2.8236) grad_norm 9.5779 (4.4176) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:28 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][260/1251] eta 0:04:02 lr 0.000124 wd 0.0500 time 0.2414 (0.2450) data time 0.0011 (0.0033) model time 0.2403 (0.2420) loss 3.3584 (2.8321) grad_norm 4.1652 (4.4323) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][270/1251] eta 0:04:00 lr 0.000124 wd 0.0500 time 0.2352 (0.2450) data time 0.0011 (0.0032) model time 0.2341 (0.2420) loss 2.7649 (2.8330) grad_norm 3.9281 (4.4127) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][280/1251] eta 0:03:57 lr 0.000124 wd 0.0500 time 0.2414 (0.2449) data time 0.0010 (0.0031) model time 0.2403 (0.2420) loss 3.0247 (2.8372) grad_norm 5.4910 (4.3918) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][290/1251] eta 0:03:55 lr 0.000124 wd 0.0500 time 0.2432 (0.2448) data time 0.0008 (0.0030) model time 0.2423 (0.2420) loss 3.6301 (2.8372) grad_norm 4.7418 (4.3816) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][300/1251] eta 0:03:52 lr 0.000124 wd 0.0500 time 0.2405 (0.2447) data time 0.0011 (0.0030) model time 0.2393 (0.2419) loss 3.2606 (2.8310) grad_norm 3.5102 (4.3922) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][310/1251] eta 0:03:50 lr 0.000124 wd 0.0500 time 0.2313 (0.2446) data time 0.0010 (0.0029) model time 0.2302 (0.2419) loss 3.1732 (2.8327) grad_norm 3.6118 (4.3920) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][320/1251] eta 0:03:47 lr 0.000124 wd 0.0500 time 0.2406 (0.2444) data time 0.0007 (0.0028) model time 0.2399 (0.2417) loss 2.1345 
(2.8307) grad_norm 3.5549 (4.3688) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][330/1251] eta 0:03:45 lr 0.000124 wd 0.0500 time 0.2420 (0.2443) data time 0.0008 (0.0028) model time 0.2411 (0.2417) loss 2.5871 (2.8344) grad_norm 3.4919 (4.3798) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][340/1251] eta 0:03:42 lr 0.000124 wd 0.0500 time 0.2362 (0.2442) data time 0.0010 (0.0027) model time 0.2352 (0.2416) loss 3.1138 (2.8315) grad_norm 4.8614 (4.3848) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][350/1251] eta 0:03:39 lr 0.000124 wd 0.0500 time 0.2419 (0.2441) data time 0.0007 (0.0027) model time 0.2411 (0.2415) loss 3.5626 (2.8360) grad_norm 3.4796 (4.3650) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][360/1251] eta 0:03:37 lr 0.000124 wd 0.0500 time 0.2360 (0.2439) data time 0.0008 (0.0026) model time 0.2352 (0.2414) loss 2.7997 (2.8352) grad_norm 3.4185 (4.3395) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][370/1251] eta 0:03:34 lr 0.000124 wd 0.0500 time 0.2372 (0.2438) data time 0.0007 (0.0026) model time 0.2364 (0.2413) loss 2.8074 (2.8376) grad_norm 2.6815 (4.3390) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][380/1251] eta 0:03:32 lr 0.000124 wd 0.0500 time 0.2382 (0.2437) data time 0.0008 (0.0025) model time 0.2374 (0.2413) loss 2.8899 (2.8434) grad_norm 5.0407 (4.3330) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][390/1251] eta 0:03:29 lr 
0.000124 wd 0.0500 time 0.2439 (0.2437) data time 0.0008 (0.0025) model time 0.2431 (0.2413) loss 3.0929 (2.8455) grad_norm 2.7618 (4.3115) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][400/1251] eta 0:03:27 lr 0.000124 wd 0.0500 time 0.2481 (0.2436) data time 0.0007 (0.0025) model time 0.2474 (0.2412) loss 2.9191 (2.8465) grad_norm 3.4922 (4.3060) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][410/1251] eta 0:03:24 lr 0.000124 wd 0.0500 time 0.2370 (0.2435) data time 0.0013 (0.0024) model time 0.2357 (0.2411) loss 2.8008 (2.8371) grad_norm 4.6988 (4.3041) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][420/1251] eta 0:03:22 lr 0.000124 wd 0.0500 time 0.2487 (0.2434) data time 0.0009 (0.0024) model time 0.2477 (0.2411) loss 3.0559 (2.8381) grad_norm 4.9928 (4.3697) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][430/1251] eta 0:03:20 lr 0.000124 wd 0.0500 time 0.2328 (0.2438) data time 0.0010 (0.0024) model time 0.2319 (0.2415) loss 3.2998 (2.8405) grad_norm 4.5677 (4.3558) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][440/1251] eta 0:03:17 lr 0.000124 wd 0.0500 time 0.2384 (0.2437) data time 0.0007 (0.0023) model time 0.2377 (0.2415) loss 3.3544 (2.8431) grad_norm 4.5489 (4.3596) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][450/1251] eta 0:03:15 lr 0.000124 wd 0.0500 time 0.2349 (0.2436) data time 0.0010 (0.0023) model time 0.2338 (0.2414) loss 2.6794 (2.8461) grad_norm 6.7844 (4.4153) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 
06:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][460/1251] eta 0:03:12 lr 0.000124 wd 0.0500 time 0.2360 (0.2435) data time 0.0007 (0.0023) model time 0.2353 (0.2413) loss 2.4836 (2.8425) grad_norm 3.1725 (4.4035) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][470/1251] eta 0:03:10 lr 0.000124 wd 0.0500 time 0.2484 (0.2435) data time 0.0010 (0.0023) model time 0.2473 (0.2413) loss 3.0152 (2.8412) grad_norm 6.4406 (4.3951) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][480/1251] eta 0:03:07 lr 0.000124 wd 0.0500 time 0.2454 (0.2435) data time 0.0009 (0.0022) model time 0.2444 (0.2413) loss 2.8735 (2.8397) grad_norm 3.3262 (4.4123) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][490/1251] eta 0:03:05 lr 0.000124 wd 0.0500 time 0.2414 (0.2435) data time 0.0011 (0.0022) model time 0.2403 (0.2413) loss 3.0924 (2.8436) grad_norm 4.1540 (4.3959) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][500/1251] eta 0:03:02 lr 0.000124 wd 0.0500 time 0.2421 (0.2434) data time 0.0008 (0.0022) model time 0.2413 (0.2413) loss 2.9557 (2.8484) grad_norm 3.6410 (4.3976) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][510/1251] eta 0:03:00 lr 0.000124 wd 0.0500 time 0.2468 (0.2434) data time 0.0010 (0.0022) model time 0.2459 (0.2413) loss 3.3049 (2.8453) grad_norm 5.3074 (4.4180) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][520/1251] eta 0:02:57 lr 0.000124 wd 0.0500 time 0.2438 (0.2434) data time 0.0010 (0.0021) model time 0.2428 (0.2413) 
loss 3.2680 (2.8481) grad_norm 3.5050 (4.4135) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][530/1251] eta 0:02:55 lr 0.000124 wd 0.0500 time 0.2414 (0.2434) data time 0.0009 (0.0021) model time 0.2405 (0.2413) loss 3.3149 (2.8535) grad_norm 2.8256 (4.4114) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][540/1251] eta 0:02:53 lr 0.000123 wd 0.0500 time 0.2462 (0.2434) data time 0.0011 (0.0021) model time 0.2451 (0.2414) loss 3.3668 (2.8546) grad_norm 3.8993 (4.4114) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][550/1251] eta 0:02:50 lr 0.000123 wd 0.0500 time 0.2454 (0.2434) data time 0.0010 (0.0021) model time 0.2444 (0.2414) loss 3.2457 (2.8560) grad_norm 2.7360 (4.3966) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][560/1251] eta 0:02:48 lr 0.000123 wd 0.0500 time 0.2378 (0.2433) data time 0.0009 (0.0020) model time 0.2368 (0.2413) loss 3.2756 (2.8556) grad_norm 8.0610 (4.3952) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][570/1251] eta 0:02:45 lr 0.000123 wd 0.0500 time 0.2400 (0.2433) data time 0.0007 (0.0020) model time 0.2393 (0.2413) loss 3.2542 (2.8537) grad_norm 2.2214 (4.3839) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][580/1251] eta 0:02:43 lr 0.000123 wd 0.0500 time 0.2391 (0.2432) data time 0.0008 (0.0020) model time 0.2382 (0.2413) loss 3.5316 (2.8542) grad_norm 4.8633 (4.3752) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][590/1251] eta 
0:02:40 lr 0.000123 wd 0.0500 time 0.2443 (0.2432) data time 0.0009 (0.0020) model time 0.2434 (0.2412) loss 2.7076 (2.8557) grad_norm 4.5141 (4.3698) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][600/1251] eta 0:02:38 lr 0.000123 wd 0.0500 time 0.2410 (0.2431) data time 0.0007 (0.0020) model time 0.2403 (0.2412) loss 3.2031 (2.8579) grad_norm 4.1049 (4.3607) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][610/1251] eta 0:02:35 lr 0.000123 wd 0.0500 time 0.2290 (0.2431) data time 0.0009 (0.0020) model time 0.2281 (0.2411) loss 3.8058 (2.8601) grad_norm 6.0024 (4.3579) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][620/1251] eta 0:02:33 lr 0.000123 wd 0.0500 time 0.2443 (0.2430) data time 0.0009 (0.0019) model time 0.2434 (0.2411) loss 3.3002 (2.8602) grad_norm 3.8015 (4.3462) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][630/1251] eta 0:02:30 lr 0.000123 wd 0.0500 time 0.2314 (0.2430) data time 0.0010 (0.0019) model time 0.2304 (0.2411) loss 3.4865 (2.8594) grad_norm 3.8368 (4.3317) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][640/1251] eta 0:02:28 lr 0.000123 wd 0.0500 time 0.2309 (0.2430) data time 0.0009 (0.0019) model time 0.2300 (0.2411) loss 3.4766 (2.8588) grad_norm 4.4345 (4.3195) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][650/1251] eta 0:02:26 lr 0.000123 wd 0.0500 time 0.2350 (0.2430) data time 0.0008 (0.0019) model time 0.2342 (0.2411) loss 3.3386 (2.8558) grad_norm 2.9344 (4.3101) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-09-01 06:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][660/1251] eta 0:02:23 lr 0.000123 wd 0.0500 time 0.2457 (0.2430) data time 0.0007 (0.0019) model time 0.2450 (0.2412) loss 2.7501 (2.8548) grad_norm 2.8099 (4.2985) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][670/1251] eta 0:02:21 lr 0.000123 wd 0.0500 time 0.2487 (0.2430) data time 0.0007 (0.0019) model time 0.2480 (0.2412) loss 3.6251 (2.8543) grad_norm 2.8235 (4.2946) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][680/1251] eta 0:02:19 lr 0.000123 wd 0.0500 time 0.2358 (0.2439) data time 0.0009 (0.0019) model time 0.2349 (0.2422) loss 2.4609 (2.8517) grad_norm 3.8123 (4.2924) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][690/1251] eta 0:02:16 lr 0.000123 wd 0.0500 time 0.2346 (0.2439) data time 0.0010 (0.0018) model time 0.2336 (0.2421) loss 2.8598 (2.8505) grad_norm 4.7023 (4.3002) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][700/1251] eta 0:02:14 lr 0.000123 wd 0.0500 time 0.2436 (0.2438) data time 0.0007 (0.0018) model time 0.2428 (0.2421) loss 3.3413 (2.8505) grad_norm 5.5165 (4.3208) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][710/1251] eta 0:02:11 lr 0.000123 wd 0.0500 time 0.2397 (0.2438) data time 0.0009 (0.0018) model time 0.2388 (0.2421) loss 2.9550 (2.8505) grad_norm 7.5118 (4.3531) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][720/1251] eta 0:02:09 lr 0.000123 wd 0.0500 time 0.2396 (0.2438) data time 0.0007 (0.0018) model time 0.2389 
(0.2421) loss 3.1293 (2.8495) grad_norm 3.2647 (4.3603) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][730/1251] eta 0:02:06 lr 0.000123 wd 0.0500 time 0.2355 (0.2437) data time 0.0008 (0.0018) model time 0.2347 (0.2420) loss 3.0612 (2.8534) grad_norm 3.8864 (4.3659) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][740/1251] eta 0:02:04 lr 0.000123 wd 0.0500 time 0.2432 (0.2437) data time 0.0010 (0.0018) model time 0.2423 (0.2420) loss 2.8915 (2.8538) grad_norm 2.6538 (4.3600) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][750/1251] eta 0:02:02 lr 0.000123 wd 0.0500 time 0.2397 (0.2437) data time 0.0009 (0.0018) model time 0.2388 (0.2420) loss 2.9218 (2.8552) grad_norm 5.5691 (4.3568) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][760/1251] eta 0:01:59 lr 0.000123 wd 0.0500 time 0.2432 (0.2436) data time 0.0007 (0.0018) model time 0.2425 (0.2420) loss 2.4910 (2.8540) grad_norm 3.0146 (4.3572) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][770/1251] eta 0:01:57 lr 0.000123 wd 0.0500 time 0.2450 (0.2436) data time 0.0009 (0.0018) model time 0.2441 (0.2419) loss 3.1938 (2.8562) grad_norm 3.3824 (4.3537) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][780/1251] eta 0:01:54 lr 0.000123 wd 0.0500 time 0.2422 (0.2436) data time 0.0013 (0.0017) model time 0.2410 (0.2419) loss 3.0347 (2.8570) grad_norm 2.7945 (4.3445) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[238/300][790/1251] eta 0:01:52 lr 0.000123 wd 0.0500 time 0.2405 (0.2436) data time 0.0010 (0.0017) model time 0.2396 (0.2419) loss 2.9406 (2.8558) grad_norm 6.0528 (4.3390) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][800/1251] eta 0:01:49 lr 0.000123 wd 0.0500 time 0.2330 (0.2435) data time 0.0012 (0.0017) model time 0.2319 (0.2419) loss 2.6628 (2.8558) grad_norm 6.2751 (4.3390) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][810/1251] eta 0:01:47 lr 0.000123 wd 0.0500 time 0.2370 (0.2435) data time 0.0010 (0.0017) model time 0.2360 (0.2418) loss 2.8420 (2.8554) grad_norm 5.2752 (4.3528) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][820/1251] eta 0:01:44 lr 0.000123 wd 0.0500 time 0.2417 (0.2435) data time 0.0009 (0.0017) model time 0.2408 (0.2418) loss 2.7492 (2.8564) grad_norm 4.4716 (4.3480) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][830/1251] eta 0:01:42 lr 0.000123 wd 0.0500 time 0.2366 (0.2435) data time 0.0010 (0.0017) model time 0.2356 (0.2418) loss 3.0544 (2.8573) grad_norm 5.5578 (4.3450) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][840/1251] eta 0:01:40 lr 0.000123 wd 0.0500 time 0.2391 (0.2434) data time 0.0009 (0.0017) model time 0.2382 (0.2418) loss 3.0726 (2.8575) grad_norm 3.0366 (4.3443) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][850/1251] eta 0:01:37 lr 0.000123 wd 0.0500 time 0.2352 (0.2434) data time 0.0009 (0.0017) model time 0.2342 (0.2418) loss 3.3178 (2.8597) grad_norm 4.0341 (4.3356) loss_scale 1024.0000 
(1024.0000) mem 7381MB [2024-09-01 06:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][860/1251] eta 0:01:35 lr 0.000123 wd 0.0500 time 0.2439 (0.2434) data time 0.0010 (0.0017) model time 0.2429 (0.2418) loss 3.3374 (2.8592) grad_norm 4.6420 (4.3395) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][870/1251] eta 0:01:32 lr 0.000123 wd 0.0500 time 0.2398 (0.2434) data time 0.0007 (0.0017) model time 0.2391 (0.2417) loss 2.0457 (2.8577) grad_norm 4.9956 (4.3511) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][880/1251] eta 0:01:30 lr 0.000123 wd 0.0500 time 0.2432 (0.2433) data time 0.0007 (0.0017) model time 0.2425 (0.2417) loss 2.1271 (2.8555) grad_norm 3.6415 (4.3476) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][890/1251] eta 0:01:27 lr 0.000122 wd 0.0500 time 0.2293 (0.2433) data time 0.0009 (0.0017) model time 0.2284 (0.2417) loss 2.8171 (2.8537) grad_norm 3.8678 (4.3405) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][900/1251] eta 0:01:25 lr 0.000122 wd 0.0500 time 0.2393 (0.2432) data time 0.0013 (0.0016) model time 0.2380 (0.2416) loss 2.6860 (2.8539) grad_norm 2.6093 (4.3325) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][910/1251] eta 0:01:22 lr 0.000122 wd 0.0500 time 0.2536 (0.2433) data time 0.0007 (0.0016) model time 0.2529 (0.2417) loss 2.7501 (2.8530) grad_norm 3.4647 (4.3253) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][920/1251] eta 0:01:20 lr 0.000122 wd 0.0500 time 0.2417 (0.2432) data time 0.0008 
(0.0016) model time 0.2409 (0.2417) loss 3.0720 (2.8559) grad_norm 2.9391 (4.3253) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][930/1251] eta 0:01:18 lr 0.000122 wd 0.0500 time 0.2413 (0.2432) data time 0.0010 (0.0016) model time 0.2403 (0.2416) loss 2.6334 (2.8558) grad_norm 5.2332 (4.3240) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][940/1251] eta 0:01:15 lr 0.000122 wd 0.0500 time 0.4006 (0.2434) data time 0.0008 (0.0016) model time 0.3998 (0.2418) loss 3.4156 (2.8563) grad_norm 2.5631 (4.3171) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][950/1251] eta 0:01:13 lr 0.000122 wd 0.0500 time 0.2369 (0.2433) data time 0.0008 (0.0016) model time 0.2361 (0.2418) loss 3.4092 (2.8565) grad_norm 4.1274 (4.3290) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][960/1251] eta 0:01:10 lr 0.000122 wd 0.0500 time 0.2391 (0.2433) data time 0.0010 (0.0016) model time 0.2381 (0.2417) loss 3.3025 (2.8587) grad_norm 2.8600 (4.3247) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][970/1251] eta 0:01:08 lr 0.000122 wd 0.0500 time 0.2372 (0.2432) data time 0.0007 (0.0016) model time 0.2365 (0.2417) loss 3.3001 (2.8548) grad_norm 3.9492 (4.3194) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][980/1251] eta 0:01:05 lr 0.000122 wd 0.0500 time 0.2390 (0.2432) data time 0.0010 (0.0016) model time 0.2381 (0.2417) loss 3.1711 (2.8574) grad_norm 4.4257 (4.3183) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [238/300][990/1251] eta 0:01:03 lr 0.000122 wd 0.0500 time 0.2398 (0.2432) data time 0.0007 (0.0016) model time 0.2391 (0.2417) loss 2.1699 (2.8524) grad_norm 3.0076 (4.3291) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1000/1251] eta 0:01:01 lr 0.000122 wd 0.0500 time 0.2404 (0.2432) data time 0.0007 (0.0016) model time 0.2397 (0.2416) loss 2.5046 (2.8500) grad_norm 6.3231 (4.3252) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1010/1251] eta 0:00:58 lr 0.000122 wd 0.0500 time 0.2401 (0.2432) data time 0.0009 (0.0016) model time 0.2392 (0.2416) loss 3.3624 (2.8506) grad_norm 3.7402 (4.3253) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1020/1251] eta 0:00:56 lr 0.000122 wd 0.0500 time 0.2392 (0.2432) data time 0.0009 (0.0016) model time 0.2382 (0.2416) loss 3.0067 (2.8523) grad_norm 4.3494 (4.3245) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1030/1251] eta 0:00:53 lr 0.000122 wd 0.0500 time 0.2364 (0.2431) data time 0.0009 (0.0016) model time 0.2356 (0.2416) loss 2.9977 (2.8518) grad_norm 3.2280 (4.3199) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1040/1251] eta 0:00:51 lr 0.000122 wd 0.0500 time 0.2383 (0.2431) data time 0.0010 (0.0016) model time 0.2373 (0.2416) loss 2.0592 (2.8482) grad_norm 2.9259 (4.3188) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1050/1251] eta 0:00:48 lr 0.000122 wd 0.0500 time 0.2434 (0.2431) data time 0.0009 (0.0016) model time 0.2425 (0.2416) loss 3.1119 (2.8479) grad_norm 3.8692 (4.3143) 
loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1060/1251] eta 0:00:46 lr 0.000122 wd 0.0500 time 0.2351 (0.2431) data time 0.0010 (0.0015) model time 0.2341 (0.2416) loss 1.9588 (2.8460) grad_norm 3.8726 (4.3093) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1070/1251] eta 0:00:43 lr 0.000122 wd 0.0500 time 0.2433 (0.2430) data time 0.0007 (0.0015) model time 0.2426 (0.2415) loss 3.1320 (2.8475) grad_norm 3.8853 (4.3062) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1080/1251] eta 0:00:41 lr 0.000122 wd 0.0500 time 0.2464 (0.2430) data time 0.0008 (0.0015) model time 0.2456 (0.2415) loss 3.8941 (2.8480) grad_norm 4.4399 (4.3085) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1090/1251] eta 0:00:39 lr 0.000122 wd 0.0500 time 0.2354 (0.2430) data time 0.0009 (0.0015) model time 0.2345 (0.2415) loss 2.8191 (2.8469) grad_norm 5.2392 (4.3136) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1100/1251] eta 0:00:36 lr 0.000122 wd 0.0500 time 0.2443 (0.2430) data time 0.0010 (0.0015) model time 0.2433 (0.2415) loss 3.5599 (2.8493) grad_norm 4.2100 (4.3082) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1110/1251] eta 0:00:34 lr 0.000122 wd 0.0500 time 0.2343 (0.2430) data time 0.0007 (0.0015) model time 0.2336 (0.2415) loss 2.3707 (2.8463) grad_norm 3.1614 (4.3022) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1120/1251] eta 0:00:31 lr 0.000122 wd 0.0500 time 0.2415 
(0.2430) data time 0.0007 (0.0015) model time 0.2408 (0.2415) loss 3.1139 (2.8459) grad_norm 4.7075 (4.3067) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1130/1251] eta 0:00:29 lr 0.000122 wd 0.0500 time 0.2379 (0.2429) data time 0.0009 (0.0015) model time 0.2369 (0.2414) loss 2.9055 (2.8458) grad_norm 3.3955 (4.2986) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1140/1251] eta 0:00:26 lr 0.000122 wd 0.0500 time 0.2431 (0.2429) data time 0.0009 (0.0015) model time 0.2421 (0.2414) loss 2.9326 (2.8454) grad_norm 3.2341 (4.2903) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1150/1251] eta 0:00:24 lr 0.000122 wd 0.0500 time 0.2439 (0.2429) data time 0.0007 (0.0015) model time 0.2432 (0.2414) loss 2.8474 (2.8461) grad_norm 4.2857 (4.3180) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1160/1251] eta 0:00:22 lr 0.000122 wd 0.0500 time 0.2425 (0.2429) data time 0.0010 (0.0015) model time 0.2415 (0.2414) loss 2.9134 (2.8469) grad_norm 4.0204 (4.3147) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1170/1251] eta 0:00:19 lr 0.000122 wd 0.0500 time 0.2360 (0.2429) data time 0.0009 (0.0015) model time 0.2352 (0.2414) loss 2.3365 (2.8446) grad_norm 4.0870 (4.3149) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1180/1251] eta 0:00:17 lr 0.000122 wd 0.0500 time 0.2462 (0.2429) data time 0.0007 (0.0015) model time 0.2455 (0.2414) loss 1.6540 (2.8436) grad_norm 3.0388 (4.3199) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:13 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1190/1251] eta 0:00:14 lr 0.000122 wd 0.0500 time 0.2435 (0.2428) data time 0.0007 (0.0015) model time 0.2428 (0.2414) loss 2.6243 (2.8442) grad_norm 14.3088 (4.3287) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1200/1251] eta 0:00:12 lr 0.000122 wd 0.0500 time 0.2416 (0.2428) data time 0.0011 (0.0015) model time 0.2406 (0.2414) loss 3.1127 (2.8439) grad_norm 3.7680 (4.3340) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1210/1251] eta 0:00:09 lr 0.000122 wd 0.0500 time 0.2291 (0.2428) data time 0.0010 (0.0015) model time 0.2281 (0.2413) loss 2.7382 (2.8447) grad_norm 3.7212 (4.3347) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1220/1251] eta 0:00:07 lr 0.000122 wd 0.0500 time 0.2469 (0.2428) data time 0.0007 (0.0015) model time 0.2462 (0.2413) loss 3.1478 (2.8432) grad_norm 17.6225 (4.3463) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1230/1251] eta 0:00:05 lr 0.000122 wd 0.0500 time 0.2321 (0.2428) data time 0.0010 (0.0015) model time 0.2311 (0.2413) loss 2.5660 (2.8425) grad_norm 2.7350 (4.3433) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1240/1251] eta 0:00:02 lr 0.000122 wd 0.0500 time 0.2271 (0.2427) data time 0.0007 (0.0015) model time 0.2264 (0.2412) loss 3.3071 (2.8419) grad_norm 3.2061 (4.3417) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [238/300][1250/1251] eta 0:00:00 lr 0.000121 wd 0.0500 time 0.2271 (0.2426) data time 0.0007 (0.0015) model time 0.2264 (0.2411) 
loss 3.0125 (2.8416) grad_norm 2.9346 (4.3373) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 238 training takes 0:05:03 [2024-09-01 06:33:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:33:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 06:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.4211 (0.4211) Acc@1 91.992 (91.992) Acc@5 98.340 (98.340) Mem 7381MB [2024-09-01 06:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.109) Loss 0.6050 (0.6480) Acc@1 88.086 (86.630) Acc@5 97.949 (97.514) Mem 7381MB [2024-09-01 06:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.095) Loss 0.9473 (0.6771) Acc@1 77.051 (85.654) Acc@5 95.410 (97.414) Mem 7381MB [2024-09-01 06:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.091) Loss 1.1660 (0.7653) Acc@1 71.680 (83.417) Acc@5 92.285 (96.418) Mem 7381MB [2024-09-01 06:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.063 (0.085) Loss 1.0625 (0.8120) Acc@1 76.855 (82.360) Acc@5 93.164 (95.979) Mem 7381MB [2024-09-01 06:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 81.998 Acc@5 95.952 [2024-09-01 06:33:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-09-01 06:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.817 (0.817) Loss 0.3794 (0.3794) Acc@1 93.262 (93.262) Acc@5 98.340 (98.340) Mem 7381MB [2024-09-01 06:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.149) Loss 0.5679 (0.6003) Acc@1 89.648 (87.553) Acc@5 97.852 (97.745) Mem 
7381MB [2024-09-01 06:33:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.115) Loss 0.8833 (0.6295) Acc@1 77.930 (86.403) Acc@5 95.898 (97.675) Mem 7381MB [2024-09-01 06:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.104) Loss 1.0869 (0.7134) Acc@1 74.219 (84.337) Acc@5 93.555 (96.818) Mem 7381MB [2024-09-01 06:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 0.9946 (0.7586) Acc@1 75.977 (83.215) Acc@5 94.238 (96.353) Mem 7381MB [2024-09-01 06:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.816 Acc@5 96.336 [2024-09-01 06:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 06:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.82% [2024-09-01 06:33:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 06:33:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 06:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][0/1251] eta 0:15:19 lr 0.000121 wd 0.0500 time 0.7348 (0.7348) data time 0.5101 (0.5101) model time 0.0000 (0.0000) loss 3.4476 (3.4476) grad_norm 6.4433 (6.4433) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][10/1251] eta 0:05:50 lr 0.000121 wd 0.0500 time 0.2367 (0.2823) data time 0.0009 (0.0472) model time 0.0000 (0.0000) loss 3.3504 (2.9641) grad_norm 4.3704 (4.2022) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][20/1251] eta 0:05:21 lr 0.000121 wd 0.0500 time 0.2397 (0.2615) data time 0.0007 (0.0252) model time 0.0000 (0.0000) loss 2.1822 (2.9421) grad_norm 4.4333 (4.0998) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][30/1251] eta 0:05:10 lr 0.000121 wd 0.0500 time 0.2373 (0.2541) data time 0.0010 (0.0174) model time 0.0000 (0.0000) loss 3.3449 (2.8552) grad_norm 3.7460 (4.1217) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][40/1251] eta 0:05:03 lr 0.000121 wd 0.0500 time 0.2365 (0.2506) data time 0.0007 (0.0134) model time 0.0000 (0.0000) loss 1.8105 (2.7796) grad_norm 6.2441 (4.2492) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][50/1251] eta 0:04:59 lr 0.000121 wd 0.0500 time 0.2452 (0.2491) data time 0.0009 (0.0109) model time 0.0000 (0.0000) loss 2.1813 (2.8083) grad_norm 2.9434 (4.1379) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][60/1251] eta 0:04:55 lr 0.000121 wd 0.0500 time 0.2397 (0.2477) data time 0.0009 (0.0093) model time 0.2388 
(0.2397) loss 3.2653 (2.8306) grad_norm 4.1417 (4.1732) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][70/1251] eta 0:04:51 lr 0.000121 wd 0.0500 time 0.2492 (0.2467) data time 0.0010 (0.0081) model time 0.2482 (0.2398) loss 2.4245 (2.7840) grad_norm 2.4594 (4.1635) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][80/1251] eta 0:04:48 lr 0.000121 wd 0.0500 time 0.2419 (0.2460) data time 0.0007 (0.0072) model time 0.2411 (0.2397) loss 2.1293 (2.7893) grad_norm 5.2493 (4.2684) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][90/1251] eta 0:04:44 lr 0.000121 wd 0.0500 time 0.2439 (0.2454) data time 0.0007 (0.0066) model time 0.2432 (0.2396) loss 2.8545 (2.7768) grad_norm 6.1539 (4.3336) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][100/1251] eta 0:04:42 lr 0.000121 wd 0.0500 time 0.2433 (0.2451) data time 0.0008 (0.0060) model time 0.2426 (0.2401) loss 2.9319 (2.7910) grad_norm 3.6803 (4.2776) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][110/1251] eta 0:04:39 lr 0.000121 wd 0.0500 time 0.2442 (0.2446) data time 0.0008 (0.0056) model time 0.2433 (0.2399) loss 2.0144 (2.7931) grad_norm 4.4174 (4.2486) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][120/1251] eta 0:04:36 lr 0.000121 wd 0.0500 time 0.2581 (0.2443) data time 0.0007 (0.0052) model time 0.2573 (0.2399) loss 3.1715 (2.8011) grad_norm 4.5626 (4.2628) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][130/1251] 
eta 0:04:33 lr 0.000121 wd 0.0500 time 0.2370 (0.2440) data time 0.0012 (0.0049) model time 0.2358 (0.2397) loss 3.1655 (2.8025) grad_norm 5.1348 (4.2828) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][140/1251] eta 0:04:32 lr 0.000121 wd 0.0500 time 0.4024 (0.2449) data time 0.0009 (0.0046) model time 0.4015 (0.2415) loss 3.3944 (2.8096) grad_norm 4.3280 (4.2910) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][150/1251] eta 0:04:29 lr 0.000121 wd 0.0500 time 0.2380 (0.2444) data time 0.0011 (0.0043) model time 0.2369 (0.2411) loss 2.7897 (2.8198) grad_norm 3.1648 (4.2878) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][160/1251] eta 0:04:26 lr 0.000121 wd 0.0500 time 0.2402 (0.2442) data time 0.0009 (0.0041) model time 0.2393 (0.2409) loss 3.1784 (2.8302) grad_norm 3.2526 (4.2811) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][170/1251] eta 0:04:23 lr 0.000121 wd 0.0500 time 0.2477 (0.2440) data time 0.0011 (0.0039) model time 0.2466 (0.2409) loss 3.4126 (2.8391) grad_norm 5.2337 (4.2441) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][180/1251] eta 0:04:21 lr 0.000121 wd 0.0500 time 0.2407 (0.2440) data time 0.0010 (0.0038) model time 0.2397 (0.2410) loss 2.4813 (2.8348) grad_norm 4.4602 (4.2175) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][190/1251] eta 0:04:18 lr 0.000121 wd 0.0500 time 0.2416 (0.2439) data time 0.0012 (0.0036) model time 0.2404 (0.2410) loss 2.2749 (2.8385) grad_norm 4.3600 (4.2040) loss_scale 1024.0000 (1024.0000) mem 7381MB 
[2024-09-01 06:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][200/1251] eta 0:04:18 lr 0.000121 wd 0.0500 time 0.4574 (0.2459) data time 0.0007 (0.0035) model time 0.4567 (0.2438) loss 3.0380 (2.8352) grad_norm 3.3929 (4.1949) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][210/1251] eta 0:04:15 lr 0.000121 wd 0.0500 time 0.2431 (0.2457) data time 0.0009 (0.0034) model time 0.2422 (0.2436) loss 2.6604 (2.8435) grad_norm 5.0023 (4.2046) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][220/1251] eta 0:04:13 lr 0.000121 wd 0.0500 time 0.2390 (0.2455) data time 0.0011 (0.0033) model time 0.2379 (0.2435) loss 2.5911 (2.8503) grad_norm 6.6506 (4.2199) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][230/1251] eta 0:04:10 lr 0.000121 wd 0.0500 time 0.2390 (0.2454) data time 0.0009 (0.0032) model time 0.2381 (0.2434) loss 3.0526 (2.8579) grad_norm 4.9913 (4.2666) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][240/1251] eta 0:04:08 lr 0.000121 wd 0.0500 time 0.2473 (0.2453) data time 0.0010 (0.0031) model time 0.2463 (0.2433) loss 2.5551 (2.8577) grad_norm 4.8050 (4.2969) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][250/1251] eta 0:04:05 lr 0.000121 wd 0.0500 time 0.2429 (0.2452) data time 0.0009 (0.0030) model time 0.2421 (0.2432) loss 3.1646 (2.8386) grad_norm 6.6030 (4.3093) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][260/1251] eta 0:04:02 lr 0.000121 wd 0.0500 time 0.2391 (0.2451) data time 0.0010 (0.0029) model time 0.2382 
(0.2431) loss 2.9137 (2.8313) grad_norm 4.6205 (4.3064) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][270/1251] eta 0:04:00 lr 0.000121 wd 0.0500 time 0.2469 (0.2450) data time 0.0009 (0.0029) model time 0.2460 (0.2430) loss 2.8578 (2.8294) grad_norm 4.3358 (4.3602) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][280/1251] eta 0:03:57 lr 0.000121 wd 0.0500 time 0.2370 (0.2448) data time 0.0008 (0.0028) model time 0.2362 (0.2429) loss 2.3757 (2.8322) grad_norm 5.1530 (4.3741) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][290/1251] eta 0:03:55 lr 0.000121 wd 0.0500 time 0.2387 (0.2447) data time 0.0007 (0.0027) model time 0.2379 (0.2427) loss 3.7588 (2.8344) grad_norm 9.2269 (4.4253) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][300/1251] eta 0:03:52 lr 0.000121 wd 0.0500 time 0.2376 (0.2445) data time 0.0010 (0.0027) model time 0.2365 (0.2426) loss 2.8998 (2.8357) grad_norm 3.5618 (4.4548) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][310/1251] eta 0:03:49 lr 0.000121 wd 0.0500 time 0.2372 (0.2444) data time 0.0011 (0.0026) model time 0.2361 (0.2425) loss 2.2256 (2.8357) grad_norm 5.5937 (4.4888) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][320/1251] eta 0:03:47 lr 0.000121 wd 0.0500 time 0.2314 (0.2442) data time 0.0011 (0.0026) model time 0.2303 (0.2423) loss 3.0306 (2.8398) grad_norm 2.7857 (4.4617) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[239/300][330/1251] eta 0:03:44 lr 0.000121 wd 0.0500 time 0.2339 (0.2441) data time 0.0008 (0.0025) model time 0.2331 (0.2422) loss 2.1692 (2.8402) grad_norm 3.1029 (4.4324) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][340/1251] eta 0:03:42 lr 0.000121 wd 0.0500 time 0.2462 (0.2440) data time 0.0008 (0.0025) model time 0.2455 (0.2422) loss 1.9288 (2.8355) grad_norm 3.8504 (4.4519) loss_scale 2048.0000 (1042.0176) mem 7381MB [2024-09-01 06:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][350/1251] eta 0:03:39 lr 0.000121 wd 0.0500 time 0.2402 (0.2439) data time 0.0007 (0.0024) model time 0.2395 (0.2421) loss 2.9317 (2.8410) grad_norm 3.7401 (4.4433) loss_scale 2048.0000 (1070.6781) mem 7381MB [2024-09-01 06:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][360/1251] eta 0:03:37 lr 0.000120 wd 0.0500 time 0.2420 (0.2439) data time 0.0007 (0.0024) model time 0.2413 (0.2421) loss 3.2034 (2.8356) grad_norm 6.1817 (4.4391) loss_scale 2048.0000 (1097.7507) mem 7381MB [2024-09-01 06:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][370/1251] eta 0:03:34 lr 0.000120 wd 0.0500 time 0.2455 (0.2439) data time 0.0010 (0.0023) model time 0.2445 (0.2421) loss 3.0692 (2.8382) grad_norm 3.5155 (4.4446) loss_scale 2048.0000 (1123.3639) mem 7381MB [2024-09-01 06:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][380/1251] eta 0:03:32 lr 0.000120 wd 0.0500 time 0.2428 (0.2438) data time 0.0007 (0.0023) model time 0.2421 (0.2420) loss 1.7970 (2.8259) grad_norm 3.3888 (4.4279) loss_scale 2048.0000 (1147.6325) mem 7381MB [2024-09-01 06:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][390/1251] eta 0:03:29 lr 0.000120 wd 0.0500 time 0.2428 (0.2438) data time 0.0009 (0.0023) model time 0.2420 (0.2420) loss 3.5483 (2.8313) grad_norm 4.4972 (4.4075) loss_scale 2048.0000 
(1170.6598) mem 7381MB [2024-09-01 06:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][400/1251] eta 0:03:27 lr 0.000120 wd 0.0500 time 0.2480 (0.2437) data time 0.0011 (0.0023) model time 0.2470 (0.2419) loss 3.1478 (2.8339) grad_norm 3.9514 (4.4042) loss_scale 2048.0000 (1192.5387) mem 7381MB [2024-09-01 06:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][410/1251] eta 0:03:24 lr 0.000120 wd 0.0500 time 0.2456 (0.2436) data time 0.0010 (0.0022) model time 0.2446 (0.2418) loss 3.0926 (2.8389) grad_norm 3.3419 (4.4001) loss_scale 2048.0000 (1213.3528) mem 7381MB [2024-09-01 06:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][420/1251] eta 0:03:22 lr 0.000120 wd 0.0500 time 0.2392 (0.2435) data time 0.0011 (0.0022) model time 0.2381 (0.2417) loss 2.9901 (2.8410) grad_norm 4.2491 (4.3875) loss_scale 2048.0000 (1233.1781) mem 7381MB [2024-09-01 06:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][430/1251] eta 0:03:19 lr 0.000120 wd 0.0500 time 0.2344 (0.2434) data time 0.0008 (0.0022) model time 0.2336 (0.2417) loss 2.5694 (2.8417) grad_norm 3.0594 (inf) loss_scale 1024.0000 (1235.4524) mem 7381MB [2024-09-01 06:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][440/1251] eta 0:03:17 lr 0.000120 wd 0.0500 time 0.2525 (0.2434) data time 0.0008 (0.0021) model time 0.2517 (0.2416) loss 3.2777 (2.8435) grad_norm 3.6493 (inf) loss_scale 512.0000 (1226.0136) mem 7381MB [2024-09-01 06:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][450/1251] eta 0:03:14 lr 0.000120 wd 0.0500 time 0.2347 (0.2433) data time 0.0012 (0.0021) model time 0.2335 (0.2416) loss 3.2508 (2.8497) grad_norm 3.8334 (inf) loss_scale 512.0000 (1210.1818) mem 7381MB [2024-09-01 06:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][460/1251] eta 0:03:12 lr 0.000120 wd 0.0500 time 0.2392 (0.2433) data time 0.0009 (0.0021) model 
time 0.2383 (0.2416) loss 1.8787 (2.8470) grad_norm 3.6593 (inf) loss_scale 512.0000 (1195.0369) mem 7381MB [2024-09-01 06:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][470/1251] eta 0:03:09 lr 0.000120 wd 0.0500 time 0.2386 (0.2432) data time 0.0009 (0.0021) model time 0.2377 (0.2415) loss 3.2095 (2.8501) grad_norm 6.6031 (inf) loss_scale 512.0000 (1180.5350) mem 7381MB [2024-09-01 06:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][480/1251] eta 0:03:07 lr 0.000120 wd 0.0500 time 0.2371 (0.2431) data time 0.0009 (0.0021) model time 0.2362 (0.2414) loss 3.1331 (2.8501) grad_norm 3.7747 (inf) loss_scale 512.0000 (1166.6362) mem 7381MB [2024-09-01 06:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][490/1251] eta 0:03:04 lr 0.000120 wd 0.0500 time 0.2374 (0.2430) data time 0.0012 (0.0020) model time 0.2362 (0.2413) loss 2.7906 (2.8483) grad_norm 3.0389 (inf) loss_scale 512.0000 (1153.3035) mem 7381MB [2024-09-01 06:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][500/1251] eta 0:03:02 lr 0.000120 wd 0.0500 time 0.2309 (0.2430) data time 0.0009 (0.0020) model time 0.2301 (0.2413) loss 3.2259 (2.8448) grad_norm 5.9276 (inf) loss_scale 512.0000 (1140.5030) mem 7381MB [2024-09-01 06:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][510/1251] eta 0:03:00 lr 0.000120 wd 0.0500 time 0.2433 (0.2429) data time 0.0011 (0.0020) model time 0.2422 (0.2413) loss 2.5544 (2.8413) grad_norm 3.4661 (inf) loss_scale 512.0000 (1128.2035) mem 7381MB [2024-09-01 06:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][520/1251] eta 0:02:57 lr 0.000120 wd 0.0500 time 0.2406 (0.2429) data time 0.0008 (0.0020) model time 0.2398 (0.2412) loss 3.2082 (2.8414) grad_norm 4.3327 (inf) loss_scale 512.0000 (1116.3762) mem 7381MB [2024-09-01 06:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][530/1251] eta 0:02:55 lr 
0.000120 wd 0.0500 time 0.2438 (0.2429) data time 0.0007 (0.0020) model time 0.2431 (0.2412) loss 3.4097 (2.8439) grad_norm 4.1859 (inf) loss_scale 512.0000 (1104.9944) mem 7381MB [2024-09-01 06:35:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][540/1251] eta 0:02:52 lr 0.000120 wd 0.0500 time 0.2427 (0.2428) data time 0.0008 (0.0019) model time 0.2419 (0.2412) loss 2.5241 (2.8438) grad_norm 4.1940 (inf) loss_scale 512.0000 (1094.0333) mem 7381MB [2024-09-01 06:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][550/1251] eta 0:02:50 lr 0.000120 wd 0.0500 time 0.2328 (0.2427) data time 0.0011 (0.0019) model time 0.2318 (0.2411) loss 3.1614 (2.8415) grad_norm 4.2543 (inf) loss_scale 512.0000 (1083.4701) mem 7381MB [2024-09-01 06:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][560/1251] eta 0:02:47 lr 0.000120 wd 0.0500 time 0.2466 (0.2427) data time 0.0010 (0.0019) model time 0.2455 (0.2411) loss 3.1954 (2.8425) grad_norm 16.6014 (inf) loss_scale 512.0000 (1073.2834) mem 7381MB [2024-09-01 06:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][570/1251] eta 0:02:45 lr 0.000120 wd 0.0500 time 0.2410 (0.2426) data time 0.0007 (0.0019) model time 0.2404 (0.2410) loss 2.2585 (2.8402) grad_norm 5.4986 (inf) loss_scale 512.0000 (1063.4536) mem 7381MB [2024-09-01 06:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][580/1251] eta 0:02:42 lr 0.000120 wd 0.0500 time 0.2400 (0.2426) data time 0.0010 (0.0019) model time 0.2390 (0.2410) loss 3.3585 (2.8412) grad_norm 4.9067 (inf) loss_scale 512.0000 (1053.9621) mem 7381MB [2024-09-01 06:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][590/1251] eta 0:02:40 lr 0.000120 wd 0.0500 time 0.2381 (0.2426) data time 0.0011 (0.0019) model time 0.2370 (0.2410) loss 2.2706 (2.8379) grad_norm 5.7807 (inf) loss_scale 512.0000 (1044.7919) mem 7381MB [2024-09-01 06:36:03 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [239/300][600/1251] eta 0:02:37 lr 0.000120 wd 0.0500 time 0.2422 (0.2426) data time 0.0007 (0.0018) model time 0.2414 (0.2410) loss 2.7009 (2.8377) grad_norm 6.4226 (inf) loss_scale 512.0000 (1035.9268) mem 7381MB [2024-09-01 06:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][610/1251] eta 0:02:35 lr 0.000120 wd 0.0500 time 0.2464 (0.2425) data time 0.0009 (0.0018) model time 0.2454 (0.2410) loss 2.9228 (2.8394) grad_norm 5.0834 (inf) loss_scale 512.0000 (1027.3519) mem 7381MB [2024-09-01 06:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][620/1251] eta 0:02:33 lr 0.000120 wd 0.0500 time 0.2403 (0.2425) data time 0.0010 (0.0018) model time 0.2393 (0.2409) loss 3.4548 (2.8397) grad_norm 7.9535 (inf) loss_scale 512.0000 (1019.0531) mem 7381MB [2024-09-01 06:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][630/1251] eta 0:02:30 lr 0.000120 wd 0.0500 time 0.2297 (0.2424) data time 0.0010 (0.0018) model time 0.2287 (0.2409) loss 2.8975 (2.8424) grad_norm 3.6219 (inf) loss_scale 512.0000 (1011.0174) mem 7381MB [2024-09-01 06:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][640/1251] eta 0:02:28 lr 0.000120 wd 0.0500 time 0.2325 (0.2424) data time 0.0009 (0.0018) model time 0.2316 (0.2408) loss 2.2955 (2.8395) grad_norm 3.3897 (inf) loss_scale 512.0000 (1003.2324) mem 7381MB [2024-09-01 06:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][650/1251] eta 0:02:25 lr 0.000120 wd 0.0500 time 0.2415 (0.2424) data time 0.0007 (0.0018) model time 0.2408 (0.2409) loss 3.0088 (2.8394) grad_norm 6.3428 (inf) loss_scale 512.0000 (995.6866) mem 7381MB [2024-09-01 06:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][660/1251] eta 0:02:23 lr 0.000120 wd 0.0500 time 0.2443 (0.2424) data time 0.0008 (0.0018) model time 0.2435 (0.2408) loss 3.7765 (2.8416) grad_norm 2.4908 (inf) loss_scale 
512.0000 (988.3691) mem 7381MB [2024-09-01 06:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][670/1251] eta 0:02:20 lr 0.000120 wd 0.0500 time 0.2407 (0.2427) data time 0.0010 (0.0018) model time 0.2397 (0.2411) loss 3.1087 (2.8437) grad_norm 3.1017 (inf) loss_scale 512.0000 (981.2697) mem 7381MB [2024-09-01 06:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][680/1251] eta 0:02:18 lr 0.000120 wd 0.0500 time 0.2407 (0.2426) data time 0.0009 (0.0017) model time 0.2398 (0.2411) loss 3.1373 (2.8451) grad_norm 3.3768 (inf) loss_scale 512.0000 (974.3789) mem 7381MB [2024-09-01 06:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][690/1251] eta 0:02:16 lr 0.000120 wd 0.0500 time 0.2461 (0.2426) data time 0.0008 (0.0017) model time 0.2454 (0.2411) loss 2.8862 (2.8441) grad_norm 5.3514 (inf) loss_scale 512.0000 (967.6874) mem 7381MB [2024-09-01 06:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][700/1251] eta 0:02:13 lr 0.000120 wd 0.0500 time 0.2453 (0.2426) data time 0.0007 (0.0017) model time 0.2446 (0.2411) loss 3.9024 (2.8411) grad_norm 5.9390 (inf) loss_scale 512.0000 (961.1869) mem 7381MB [2024-09-01 06:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][710/1251] eta 0:02:11 lr 0.000119 wd 0.0500 time 0.2250 (0.2425) data time 0.0011 (0.0017) model time 0.2239 (0.2410) loss 3.0638 (2.8429) grad_norm 3.4546 (inf) loss_scale 512.0000 (954.8692) mem 7381MB [2024-09-01 06:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][720/1251] eta 0:02:08 lr 0.000119 wd 0.0500 time 0.2357 (0.2424) data time 0.0007 (0.0017) model time 0.2349 (0.2409) loss 3.1515 (2.8430) grad_norm 5.2370 (inf) loss_scale 512.0000 (948.7268) mem 7381MB [2024-09-01 06:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][730/1251] eta 0:02:06 lr 0.000119 wd 0.0500 time 0.2407 (0.2424) data time 0.0010 (0.0017) model time 
0.2397 (0.2409) loss 2.2527 (2.8426) grad_norm 3.9240 (inf) loss_scale 512.0000 (942.7524) mem 7381MB [2024-09-01 06:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][740/1251] eta 0:02:03 lr 0.000119 wd 0.0500 time 0.2368 (0.2424) data time 0.0012 (0.0017) model time 0.2356 (0.2409) loss 3.1813 (2.8429) grad_norm 3.2424 (inf) loss_scale 512.0000 (936.9393) mem 7381MB [2024-09-01 06:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][750/1251] eta 0:02:01 lr 0.000119 wd 0.0500 time 0.2332 (0.2424) data time 0.0011 (0.0017) model time 0.2321 (0.2409) loss 3.0302 (2.8448) grad_norm 3.2585 (inf) loss_scale 512.0000 (931.2810) mem 7381MB [2024-09-01 06:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][760/1251] eta 0:01:58 lr 0.000119 wd 0.0500 time 0.2465 (0.2424) data time 0.0009 (0.0017) model time 0.2456 (0.2409) loss 3.1492 (2.8482) grad_norm 2.8847 (inf) loss_scale 512.0000 (925.7714) mem 7381MB [2024-09-01 06:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][770/1251] eta 0:01:56 lr 0.000119 wd 0.0500 time 0.2377 (0.2423) data time 0.0007 (0.0017) model time 0.2370 (0.2409) loss 3.1449 (2.8509) grad_norm 2.9407 (inf) loss_scale 512.0000 (920.4047) mem 7381MB [2024-09-01 06:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][780/1251] eta 0:01:54 lr 0.000119 wd 0.0500 time 0.2430 (0.2423) data time 0.0007 (0.0016) model time 0.2423 (0.2408) loss 3.0057 (2.8536) grad_norm 2.9913 (inf) loss_scale 512.0000 (915.1754) mem 7381MB [2024-09-01 06:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][790/1251] eta 0:01:51 lr 0.000119 wd 0.0500 time 0.2286 (0.2423) data time 0.0012 (0.0016) model time 0.2274 (0.2408) loss 2.9621 (2.8505) grad_norm 3.1645 (inf) loss_scale 512.0000 (910.0784) mem 7381MB [2024-09-01 06:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][800/1251] eta 0:01:49 lr 0.000119 wd 
0.0500 time 0.2461 (0.2423) data time 0.0009 (0.0016) model time 0.2452 (0.2409) loss 2.7292 (2.8550) grad_norm 4.5842 (inf) loss_scale 512.0000 (905.1086) mem 7381MB [2024-09-01 06:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][810/1251] eta 0:01:46 lr 0.000119 wd 0.0500 time 0.2317 (0.2423) data time 0.0012 (0.0016) model time 0.2305 (0.2409) loss 3.3042 (2.8568) grad_norm 2.7506 (inf) loss_scale 512.0000 (900.2614) mem 7381MB [2024-09-01 06:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][820/1251] eta 0:01:44 lr 0.000119 wd 0.0500 time 0.2418 (0.2423) data time 0.0007 (0.0016) model time 0.2411 (0.2409) loss 3.0317 (2.8564) grad_norm 5.1199 (inf) loss_scale 512.0000 (895.5323) mem 7381MB [2024-09-01 06:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][830/1251] eta 0:01:42 lr 0.000119 wd 0.0500 time 0.2345 (0.2423) data time 0.0010 (0.0016) model time 0.2334 (0.2409) loss 3.2325 (2.8553) grad_norm 3.7203 (inf) loss_scale 512.0000 (890.9170) mem 7381MB [2024-09-01 06:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][840/1251] eta 0:01:39 lr 0.000119 wd 0.0500 time 0.2478 (0.2423) data time 0.0009 (0.0016) model time 0.2469 (0.2408) loss 2.1362 (2.8538) grad_norm 3.5417 (inf) loss_scale 512.0000 (886.4114) mem 7381MB [2024-09-01 06:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][850/1251] eta 0:01:37 lr 0.000119 wd 0.0500 time 0.2376 (0.2423) data time 0.0008 (0.0016) model time 0.2368 (0.2408) loss 3.3386 (2.8547) grad_norm 2.7587 (inf) loss_scale 512.0000 (882.0118) mem 7381MB [2024-09-01 06:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][860/1251] eta 0:01:34 lr 0.000119 wd 0.0500 time 0.2383 (0.2422) data time 0.0007 (0.0016) model time 0.2376 (0.2408) loss 1.7228 (2.8548) grad_norm 4.3339 (inf) loss_scale 512.0000 (877.7143) mem 7381MB [2024-09-01 06:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [239/300][870/1251] eta 0:01:32 lr 0.000119 wd 0.0500 time 0.2411 (0.2422) data time 0.0007 (0.0016) model time 0.2404 (0.2408) loss 1.8976 (2.8546) grad_norm 3.3375 (inf) loss_scale 512.0000 (873.5155) mem 7381MB [2024-09-01 06:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][880/1251] eta 0:01:29 lr 0.000119 wd 0.0500 time 0.2367 (0.2422) data time 0.0007 (0.0016) model time 0.2360 (0.2408) loss 2.4149 (2.8562) grad_norm 2.9644 (inf) loss_scale 512.0000 (869.4120) mem 7381MB [2024-09-01 06:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][890/1251] eta 0:01:27 lr 0.000119 wd 0.0500 time 0.2471 (0.2422) data time 0.0007 (0.0016) model time 0.2464 (0.2408) loss 3.3295 (2.8563) grad_norm 4.4746 (inf) loss_scale 512.0000 (865.4007) mem 7381MB [2024-09-01 06:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][900/1251] eta 0:01:25 lr 0.000119 wd 0.0500 time 0.2452 (0.2422) data time 0.0007 (0.0016) model time 0.2445 (0.2408) loss 3.4573 (2.8549) grad_norm 4.9139 (inf) loss_scale 512.0000 (861.4784) mem 7381MB [2024-09-01 06:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][910/1251] eta 0:01:22 lr 0.000119 wd 0.0500 time 0.2337 (0.2421) data time 0.0009 (0.0016) model time 0.2328 (0.2407) loss 3.1265 (2.8562) grad_norm 2.6037 (inf) loss_scale 512.0000 (857.6422) mem 7381MB [2024-09-01 06:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][920/1251] eta 0:01:20 lr 0.000119 wd 0.0500 time 0.2366 (0.2421) data time 0.0007 (0.0015) model time 0.2359 (0.2407) loss 1.5139 (2.8572) grad_norm 5.6240 (inf) loss_scale 512.0000 (853.8893) mem 7381MB [2024-09-01 06:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][930/1251] eta 0:01:17 lr 0.000119 wd 0.0500 time 0.2419 (0.2421) data time 0.0010 (0.0015) model time 0.2410 (0.2407) loss 2.8572 (2.8564) grad_norm 5.1012 (inf) loss_scale 512.0000 (850.2170) mem 
7381MB [2024-09-01 06:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][940/1251] eta 0:01:15 lr 0.000119 wd 0.0500 time 0.2455 (0.2421) data time 0.0007 (0.0015) model time 0.2447 (0.2407) loss 3.0471 (2.8580) grad_norm 4.0560 (inf) loss_scale 512.0000 (846.6227) mem 7381MB [2024-09-01 06:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][950/1251] eta 0:01:12 lr 0.000119 wd 0.0500 time 0.2328 (0.2421) data time 0.0010 (0.0015) model time 0.2319 (0.2407) loss 2.8133 (2.8581) grad_norm 4.1720 (inf) loss_scale 512.0000 (843.1041) mem 7381MB [2024-09-01 06:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][960/1251] eta 0:01:10 lr 0.000119 wd 0.0500 time 0.2435 (0.2421) data time 0.0007 (0.0015) model time 0.2428 (0.2407) loss 2.5732 (2.8551) grad_norm 13.8545 (inf) loss_scale 512.0000 (839.6587) mem 7381MB [2024-09-01 06:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][970/1251] eta 0:01:08 lr 0.000119 wd 0.0500 time 0.2524 (0.2421) data time 0.0007 (0.0015) model time 0.2517 (0.2408) loss 2.4284 (2.8544) grad_norm 3.5570 (inf) loss_scale 512.0000 (836.2842) mem 7381MB [2024-09-01 06:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][980/1251] eta 0:01:05 lr 0.000119 wd 0.0500 time 0.2303 (0.2421) data time 0.0008 (0.0015) model time 0.2296 (0.2407) loss 2.8343 (2.8525) grad_norm 4.6784 (inf) loss_scale 512.0000 (832.9786) mem 7381MB [2024-09-01 06:37:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][990/1251] eta 0:01:03 lr 0.000119 wd 0.0500 time 0.2380 (0.2421) data time 0.0008 (0.0015) model time 0.2372 (0.2407) loss 3.4746 (2.8565) grad_norm 7.8042 (inf) loss_scale 512.0000 (829.7397) mem 7381MB [2024-09-01 06:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1000/1251] eta 0:01:00 lr 0.000119 wd 0.0500 time 0.2443 (0.2421) data time 0.0010 (0.0015) model time 0.2434 (0.2407) loss 2.8003 
(2.8584) grad_norm 3.3478 (inf) loss_scale 512.0000 (826.5654) mem 7381MB [2024-09-01 06:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1010/1251] eta 0:00:58 lr 0.000119 wd 0.0500 time 0.2376 (0.2421) data time 0.0009 (0.0015) model time 0.2367 (0.2407) loss 2.2423 (2.8578) grad_norm 2.9955 (inf) loss_scale 512.0000 (823.4540) mem 7381MB [2024-09-01 06:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1020/1251] eta 0:00:55 lr 0.000119 wd 0.0500 time 0.2362 (0.2421) data time 0.0011 (0.0015) model time 0.2351 (0.2407) loss 2.7629 (2.8568) grad_norm 5.0757 (inf) loss_scale 512.0000 (820.4035) mem 7381MB [2024-09-01 06:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1030/1251] eta 0:00:53 lr 0.000119 wd 0.0500 time 0.2363 (0.2420) data time 0.0010 (0.0015) model time 0.2353 (0.2407) loss 2.3195 (2.8565) grad_norm 4.0269 (inf) loss_scale 512.0000 (817.4122) mem 7381MB [2024-09-01 06:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1040/1251] eta 0:00:51 lr 0.000119 wd 0.0500 time 0.2432 (0.2420) data time 0.0008 (0.0015) model time 0.2424 (0.2407) loss 3.7430 (2.8583) grad_norm 3.6841 (inf) loss_scale 512.0000 (814.4784) mem 7381MB [2024-09-01 06:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1050/1251] eta 0:00:48 lr 0.000119 wd 0.0500 time 0.2554 (0.2420) data time 0.0009 (0.0015) model time 0.2545 (0.2407) loss 2.6913 (2.8583) grad_norm 4.2675 (inf) loss_scale 512.0000 (811.6004) mem 7381MB [2024-09-01 06:37:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1060/1251] eta 0:00:46 lr 0.000119 wd 0.0500 time 0.2482 (0.2420) data time 0.0009 (0.0015) model time 0.2473 (0.2407) loss 2.8473 (2.8558) grad_norm 3.2844 (inf) loss_scale 512.0000 (808.7766) mem 7381MB [2024-09-01 06:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1070/1251] eta 0:00:43 lr 0.000118 wd 0.0500 time 0.2411 
(0.2420) data time 0.0008 (0.0015) model time 0.2403 (0.2407) loss 2.8942 (2.8550) grad_norm 3.7303 (inf) loss_scale 512.0000 (806.0056) mem 7381MB [2024-09-01 06:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1080/1251] eta 0:00:41 lr 0.000118 wd 0.0500 time 0.2373 (0.2420) data time 0.0010 (0.0015) model time 0.2363 (0.2407) loss 2.9768 (2.8569) grad_norm 3.9463 (inf) loss_scale 512.0000 (803.2858) mem 7381MB [2024-09-01 06:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1090/1251] eta 0:00:38 lr 0.000118 wd 0.0500 time 0.2447 (0.2420) data time 0.0010 (0.0015) model time 0.2437 (0.2407) loss 2.9419 (2.8566) grad_norm 3.5000 (inf) loss_scale 512.0000 (800.6159) mem 7381MB [2024-09-01 06:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1100/1251] eta 0:00:36 lr 0.000118 wd 0.0500 time 0.2474 (0.2420) data time 0.0008 (0.0015) model time 0.2466 (0.2407) loss 2.7334 (2.8580) grad_norm 5.4266 (inf) loss_scale 512.0000 (797.9946) mem 7381MB [2024-09-01 06:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1110/1251] eta 0:00:34 lr 0.000118 wd 0.0500 time 0.2429 (0.2420) data time 0.0008 (0.0015) model time 0.2421 (0.2407) loss 1.6166 (2.8558) grad_norm 4.3614 (inf) loss_scale 512.0000 (795.4203) mem 7381MB [2024-09-01 06:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1120/1251] eta 0:00:31 lr 0.000118 wd 0.0500 time 0.2347 (0.2420) data time 0.0008 (0.0014) model time 0.2339 (0.2407) loss 2.1361 (2.8549) grad_norm 3.6926 (inf) loss_scale 512.0000 (792.8921) mem 7381MB [2024-09-01 06:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1130/1251] eta 0:00:29 lr 0.000118 wd 0.0500 time 0.4647 (0.2424) data time 0.0009 (0.0014) model time 0.4637 (0.2411) loss 2.9836 (2.8546) grad_norm 3.6687 (inf) loss_scale 512.0000 (790.4085) mem 7381MB [2024-09-01 06:38:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [239/300][1140/1251] eta 0:00:26 lr 0.000118 wd 0.0500 time 0.2404 (0.2424) data time 0.0010 (0.0014) model time 0.2394 (0.2411) loss 2.2409 (2.8520) grad_norm 4.9575 (inf) loss_scale 512.0000 (787.9684) mem 7381MB [2024-09-01 06:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1150/1251] eta 0:00:24 lr 0.000118 wd 0.0500 time 0.2427 (0.2424) data time 0.0009 (0.0014) model time 0.2418 (0.2411) loss 2.8060 (2.8525) grad_norm 4.3663 (inf) loss_scale 512.0000 (785.5708) mem 7381MB [2024-09-01 06:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1160/1251] eta 0:00:22 lr 0.000118 wd 0.0500 time 0.2471 (0.2424) data time 0.0010 (0.0014) model time 0.2461 (0.2411) loss 2.6683 (2.8528) grad_norm 2.6567 (inf) loss_scale 512.0000 (783.2145) mem 7381MB [2024-09-01 06:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1170/1251] eta 0:00:19 lr 0.000118 wd 0.0500 time 0.2477 (0.2424) data time 0.0011 (0.0014) model time 0.2467 (0.2411) loss 3.0093 (2.8497) grad_norm 3.4087 (inf) loss_scale 512.0000 (780.8984) mem 7381MB [2024-09-01 06:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1180/1251] eta 0:00:17 lr 0.000118 wd 0.0500 time 0.2518 (0.2424) data time 0.0007 (0.0014) model time 0.2511 (0.2411) loss 3.3596 (2.8498) grad_norm 3.3768 (inf) loss_scale 512.0000 (778.6215) mem 7381MB [2024-09-01 06:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1190/1251] eta 0:00:14 lr 0.000118 wd 0.0500 time 0.2406 (0.2423) data time 0.0007 (0.0014) model time 0.2398 (0.2411) loss 3.3025 (2.8508) grad_norm 3.4736 (inf) loss_scale 512.0000 (776.3829) mem 7381MB [2024-09-01 06:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1200/1251] eta 0:00:12 lr 0.000118 wd 0.0500 time 0.2412 (0.2424) data time 0.0011 (0.0014) model time 0.2401 (0.2411) loss 3.0290 (2.8489) grad_norm 3.9780 (inf) loss_scale 512.0000 (774.1815) mem 7381MB 
[2024-09-01 06:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1210/1251] eta 0:00:09 lr 0.000118 wd 0.0500 time 0.2348 (0.2423) data time 0.0008 (0.0014) model time 0.2339 (0.2411) loss 3.3689 (2.8501) grad_norm 2.5581 (inf) loss_scale 512.0000 (772.0165) mem 7381MB [2024-09-01 06:38:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1220/1251] eta 0:00:07 lr 0.000118 wd 0.0500 time 0.2446 (0.2423) data time 0.0010 (0.0014) model time 0.2436 (0.2410) loss 2.8327 (2.8507) grad_norm 4.7685 (inf) loss_scale 512.0000 (769.8870) mem 7381MB [2024-09-01 06:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1230/1251] eta 0:00:05 lr 0.000118 wd 0.0500 time 0.2383 (0.2423) data time 0.0009 (0.0014) model time 0.2374 (0.2410) loss 3.4001 (2.8512) grad_norm 3.8616 (inf) loss_scale 512.0000 (767.7920) mem 7381MB [2024-09-01 06:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1240/1251] eta 0:00:02 lr 0.000118 wd 0.0500 time 0.2240 (0.2423) data time 0.0004 (0.0014) model time 0.2236 (0.2410) loss 3.3830 (2.8511) grad_norm 5.0225 (inf) loss_scale 512.0000 (765.7309) mem 7381MB [2024-09-01 06:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [239/300][1250/1251] eta 0:00:00 lr 0.000118 wd 0.0500 time 0.2252 (0.2421) data time 0.0007 (0.0014) model time 0.2245 (0.2408) loss 3.0323 (2.8527) grad_norm 3.9325 (inf) loss_scale 512.0000 (763.7026) mem 7381MB [2024-09-01 06:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 239 training takes 0:05:02 [2024-09-01 06:38:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:38:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 06:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.404 (0.404) Loss 0.4023 (0.4023) Acc@1 93.262 (93.262) Acc@5 98.242 (98.242) Mem 7381MB [2024-09-01 06:38:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.108) Loss 0.5801 (0.6317) Acc@1 89.355 (86.994) Acc@5 97.949 (97.461) Mem 7381MB [2024-09-01 06:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.095) Loss 0.9526 (0.6659) Acc@1 77.246 (85.821) Acc@5 95.508 (97.438) Mem 7381MB [2024-09-01 06:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 1.1406 (0.7556) Acc@1 72.852 (83.587) Acc@5 92.285 (96.481) Mem 7381MB [2024-09-01 06:38:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0859 (0.8045) Acc@1 75.293 (82.393) Acc@5 93.555 (96.006) Mem 7381MB [2024-09-01 06:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.006 Acc@5 95.964 [2024-09-01 06:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.0% [2024-09-01 06:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.777 (0.777) Loss 0.3801 (0.3801) Acc@1 93.555 (93.555) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.146) Loss 0.5669 (0.6002) Acc@1 89.746 (87.571) Acc@5 97.949 (97.781) Mem 7381MB [2024-09-01 06:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.114) Loss 0.8843 (0.6295) Acc@1 78.223 (86.444) Acc@5 95.801 (97.689) Mem 7381MB [2024-09-01 06:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.087 (0.103) Loss 1.0869 (0.7135) Acc@1 74.316 (84.375) Acc@5 93.555 (96.831) Mem 7381MB [2024-09-01 06:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 0.9961 
(0.7588) Acc@1 76.172 (83.256) Acc@5 94.141 (96.356) Mem 7381MB [2024-09-01 06:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.858 Acc@5 96.332 [2024-09-01 06:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 06:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.86% [2024-09-01 06:38:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 06:38:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 06:38:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][0/1251] eta 0:14:40 lr 0.000118 wd 0.0500 time 0.7036 (0.7036) data time 0.4788 (0.4788) model time 0.0000 (0.0000) loss 2.8565 (2.8565) grad_norm 3.6708 (3.6708) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][10/1251] eta 0:05:48 lr 0.000118 wd 0.0500 time 0.2385 (0.2808) data time 0.0009 (0.0445) model time 0.0000 (0.0000) loss 2.9066 (2.6504) grad_norm 8.2030 (4.7183) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][20/1251] eta 0:05:23 lr 0.000118 wd 0.0500 time 0.2420 (0.2627) data time 0.0010 (0.0238) model time 0.0000 (0.0000) loss 2.5492 (2.7801) grad_norm 3.6108 (5.1895) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][30/1251] eta 0:05:12 lr 0.000118 wd 0.0500 time 0.2442 (0.2556) data time 0.0007 (0.0164) model time 0.0000 (0.0000) loss 3.5066 (2.7421) grad_norm 3.5753 (4.8812) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][40/1251] eta 0:05:05 lr 
0.000118 wd 0.0500 time 0.2466 (0.2520) data time 0.0009 (0.0127) model time 0.0000 (0.0000) loss 3.0474 (2.7652) grad_norm 4.0835 (4.5912) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][50/1251] eta 0:05:00 lr 0.000118 wd 0.0500 time 0.2435 (0.2503) data time 0.0007 (0.0104) model time 0.0000 (0.0000) loss 2.0464 (2.7198) grad_norm 4.1460 (4.3946) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][60/1251] eta 0:04:55 lr 0.000118 wd 0.0500 time 0.2342 (0.2485) data time 0.0011 (0.0088) model time 0.2331 (0.2380) loss 2.3278 (2.7217) grad_norm 3.5749 (4.4593) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][70/1251] eta 0:04:52 lr 0.000118 wd 0.0500 time 0.2391 (0.2475) data time 0.0009 (0.0077) model time 0.2382 (0.2393) loss 2.9422 (2.7589) grad_norm 4.0486 (4.3904) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][80/1251] eta 0:04:48 lr 0.000118 wd 0.0500 time 0.2337 (0.2465) data time 0.0013 (0.0069) model time 0.2324 (0.2388) loss 2.7969 (2.7487) grad_norm 4.2996 (4.2908) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][90/1251] eta 0:04:45 lr 0.000118 wd 0.0500 time 0.2332 (0.2458) data time 0.0014 (0.0063) model time 0.2318 (0.2389) loss 2.7001 (2.7362) grad_norm 3.3714 (4.2506) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][100/1251] eta 0:04:42 lr 0.000118 wd 0.0500 time 0.2460 (0.2453) data time 0.0012 (0.0057) model time 0.2449 (0.2391) loss 2.6974 (2.7407) grad_norm 5.8147 (4.3279) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:17 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][110/1251] eta 0:04:39 lr 0.000118 wd 0.0500 time 0.2310 (0.2447) data time 0.0009 (0.0053) model time 0.2301 (0.2390) loss 2.1073 (2.7428) grad_norm 6.5778 (4.3732) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][120/1251] eta 0:04:36 lr 0.000118 wd 0.0500 time 0.2496 (0.2443) data time 0.0007 (0.0049) model time 0.2489 (0.2389) loss 3.2931 (2.7654) grad_norm 4.9827 (4.3758) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][130/1251] eta 0:04:33 lr 0.000118 wd 0.0500 time 0.2382 (0.2440) data time 0.0007 (0.0046) model time 0.2375 (0.2390) loss 2.5406 (2.7731) grad_norm 3.0721 (4.3856) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][140/1251] eta 0:04:32 lr 0.000118 wd 0.0500 time 0.2419 (0.2452) data time 0.0009 (0.0044) model time 0.2409 (0.2413) loss 3.3400 (2.7793) grad_norm 4.6290 (4.3639) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][150/1251] eta 0:04:29 lr 0.000118 wd 0.0500 time 0.2388 (0.2450) data time 0.0009 (0.0042) model time 0.2379 (0.2413) loss 2.2853 (2.7700) grad_norm 3.9446 (4.3451) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][160/1251] eta 0:04:27 lr 0.000118 wd 0.0500 time 0.2346 (0.2448) data time 0.0009 (0.0040) model time 0.2338 (0.2412) loss 2.7420 (2.7847) grad_norm 8.3403 (4.3543) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][170/1251] eta 0:04:24 lr 0.000118 wd 0.0500 time 0.2357 (0.2446) data time 0.0008 (0.0038) model time 0.2349 (0.2411) loss 2.3348 (2.7874) 
grad_norm 3.4324 (4.3143) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][180/1251] eta 0:04:21 lr 0.000117 wd 0.0500 time 0.2371 (0.2444) data time 0.0007 (0.0036) model time 0.2365 (0.2410) loss 2.8889 (2.7988) grad_norm 3.5911 (4.3204) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][190/1251] eta 0:04:19 lr 0.000117 wd 0.0500 time 0.2423 (0.2442) data time 0.0011 (0.0035) model time 0.2412 (0.2410) loss 3.2266 (2.8053) grad_norm 5.0868 (4.3578) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][200/1251] eta 0:04:16 lr 0.000117 wd 0.0500 time 0.2362 (0.2440) data time 0.0009 (0.0034) model time 0.2353 (0.2409) loss 3.3617 (2.8097) grad_norm 3.0837 (4.3409) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][210/1251] eta 0:04:13 lr 0.000117 wd 0.0500 time 0.2415 (0.2439) data time 0.0011 (0.0032) model time 0.2404 (0.2408) loss 2.7928 (2.8054) grad_norm 5.6686 (4.3353) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][220/1251] eta 0:04:11 lr 0.000117 wd 0.0500 time 0.2335 (0.2437) data time 0.0008 (0.0031) model time 0.2327 (0.2408) loss 3.3810 (2.8051) grad_norm 3.4955 (4.3236) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][230/1251] eta 0:04:08 lr 0.000117 wd 0.0500 time 0.2458 (0.2437) data time 0.0011 (0.0030) model time 0.2447 (0.2408) loss 2.0862 (2.8047) grad_norm 4.5313 (4.4262) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][240/1251] eta 0:04:06 lr 0.000117 wd 0.0500 time 
0.2409 (0.2437) data time 0.0010 (0.0030) model time 0.2400 (0.2410) loss 2.3705 (2.8061) grad_norm 4.6093 (4.4011) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][250/1251] eta 0:04:03 lr 0.000117 wd 0.0500 time 0.2440 (0.2436) data time 0.0009 (0.0029) model time 0.2431 (0.2409) loss 2.6061 (2.8101) grad_norm 2.4328 (4.3853) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][260/1251] eta 0:04:01 lr 0.000117 wd 0.0500 time 0.2317 (0.2434) data time 0.0009 (0.0028) model time 0.2307 (0.2408) loss 3.0450 (2.8104) grad_norm 2.9900 (4.3906) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][270/1251] eta 0:03:58 lr 0.000117 wd 0.0500 time 0.2424 (0.2433) data time 0.0009 (0.0027) model time 0.2414 (0.2407) loss 2.8980 (2.8090) grad_norm 3.3288 (4.3805) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][280/1251] eta 0:03:56 lr 0.000117 wd 0.0500 time 0.2391 (0.2432) data time 0.0011 (0.0027) model time 0.2380 (0.2407) loss 3.1785 (2.8150) grad_norm 3.0123 (4.3481) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][290/1251] eta 0:03:53 lr 0.000117 wd 0.0500 time 0.2389 (0.2432) data time 0.0010 (0.0026) model time 0.2379 (0.2406) loss 3.1355 (2.8090) grad_norm 4.3113 (4.3384) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][300/1251] eta 0:03:51 lr 0.000117 wd 0.0500 time 0.2411 (0.2432) data time 0.0007 (0.0026) model time 0.2403 (0.2407) loss 1.8278 (2.8065) grad_norm 6.1117 (4.4040) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:05 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [240/300][310/1251] eta 0:03:48 lr 0.000117 wd 0.0500 time 0.2382 (0.2431) data time 0.0008 (0.0025) model time 0.2374 (0.2407) loss 3.2046 (2.8080) grad_norm 4.6475 (4.3971) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][320/1251] eta 0:03:46 lr 0.000117 wd 0.0500 time 0.2448 (0.2431) data time 0.0012 (0.0025) model time 0.2436 (0.2407) loss 3.2107 (2.8008) grad_norm 3.5274 (4.3888) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][330/1251] eta 0:03:43 lr 0.000117 wd 0.0500 time 0.2426 (0.2431) data time 0.0008 (0.0024) model time 0.2418 (0.2407) loss 3.3041 (2.8065) grad_norm 5.3683 (4.4202) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][340/1251] eta 0:03:41 lr 0.000117 wd 0.0500 time 0.2442 (0.2430) data time 0.0009 (0.0024) model time 0.2433 (0.2407) loss 2.8080 (2.8112) grad_norm 5.3568 (4.4328) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][350/1251] eta 0:03:38 lr 0.000117 wd 0.0500 time 0.2436 (0.2430) data time 0.0007 (0.0023) model time 0.2428 (0.2407) loss 2.8762 (2.8115) grad_norm 4.7782 (4.4476) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][360/1251] eta 0:03:36 lr 0.000117 wd 0.0500 time 0.2332 (0.2429) data time 0.0012 (0.0023) model time 0.2320 (0.2407) loss 2.9404 (2.8147) grad_norm 3.1695 (4.4331) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][370/1251] eta 0:03:34 lr 0.000117 wd 0.0500 time 0.2418 (0.2429) data time 0.0009 (0.0023) model time 0.2409 (0.2407) loss 2.5609 (2.8143) grad_norm 3.2348 
(4.4129) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][380/1251] eta 0:03:31 lr 0.000117 wd 0.0500 time 0.2323 (0.2429) data time 0.0009 (0.0023) model time 0.2314 (0.2407) loss 3.2608 (2.8166) grad_norm 3.9900 (4.4223) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][390/1251] eta 0:03:29 lr 0.000117 wd 0.0500 time 0.2422 (0.2428) data time 0.0008 (0.0022) model time 0.2414 (0.2407) loss 3.2692 (2.8084) grad_norm 3.5170 (4.4060) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][400/1251] eta 0:03:26 lr 0.000117 wd 0.0500 time 0.2310 (0.2428) data time 0.0011 (0.0022) model time 0.2299 (0.2407) loss 2.0224 (2.8115) grad_norm 3.7261 (4.4062) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][410/1251] eta 0:03:24 lr 0.000117 wd 0.0500 time 0.2284 (0.2427) data time 0.0008 (0.0022) model time 0.2276 (0.2406) loss 2.3915 (2.8040) grad_norm 3.9141 (4.3892) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][420/1251] eta 0:03:21 lr 0.000117 wd 0.0500 time 0.2302 (0.2427) data time 0.0010 (0.0021) model time 0.2292 (0.2407) loss 3.0008 (2.8054) grad_norm 3.3255 (4.3899) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][430/1251] eta 0:03:19 lr 0.000117 wd 0.0500 time 0.2349 (0.2427) data time 0.0011 (0.0021) model time 0.2338 (0.2407) loss 2.6625 (2.8046) grad_norm 4.3504 (4.3939) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][440/1251] eta 0:03:16 lr 0.000117 wd 0.0500 time 0.2462 (0.2427) data 
time 0.0009 (0.0021) model time 0.2453 (0.2407) loss 2.8798 (2.8086) grad_norm 4.7398 (4.3858) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][450/1251] eta 0:03:14 lr 0.000117 wd 0.0500 time 0.2384 (0.2426) data time 0.0007 (0.0021) model time 0.2377 (0.2406) loss 3.1827 (2.8090) grad_norm 5.3166 (4.4265) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][460/1251] eta 0:03:11 lr 0.000117 wd 0.0500 time 0.2421 (0.2426) data time 0.0007 (0.0020) model time 0.2414 (0.2406) loss 3.1882 (2.8097) grad_norm 3.6136 (4.4200) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][470/1251] eta 0:03:09 lr 0.000117 wd 0.0500 time 0.2368 (0.2425) data time 0.0007 (0.0020) model time 0.2361 (0.2406) loss 2.1748 (2.8082) grad_norm 3.6316 (4.4222) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][480/1251] eta 0:03:06 lr 0.000117 wd 0.0500 time 0.2355 (0.2425) data time 0.0009 (0.0020) model time 0.2346 (0.2406) loss 3.5404 (2.8086) grad_norm 3.9781 (4.4219) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][490/1251] eta 0:03:04 lr 0.000117 wd 0.0500 time 0.2467 (0.2424) data time 0.0010 (0.0020) model time 0.2457 (0.2405) loss 2.9450 (2.8110) grad_norm 5.2333 (4.4319) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][500/1251] eta 0:03:02 lr 0.000117 wd 0.0500 time 0.2366 (0.2424) data time 0.0010 (0.0020) model time 0.2356 (0.2405) loss 2.5203 (2.8160) grad_norm 4.9618 (4.4440) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [240/300][510/1251] eta 0:02:59 lr 0.000117 wd 0.0500 time 0.2327 (0.2425) data time 0.0014 (0.0019) model time 0.2313 (0.2406) loss 2.9302 (2.8182) grad_norm 3.9848 (4.4298) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][520/1251] eta 0:02:57 lr 0.000117 wd 0.0500 time 0.2411 (0.2424) data time 0.0014 (0.0019) model time 0.2397 (0.2406) loss 3.0265 (2.8183) grad_norm 4.9523 (4.4235) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][530/1251] eta 0:02:54 lr 0.000117 wd 0.0500 time 0.2395 (0.2424) data time 0.0007 (0.0019) model time 0.2388 (0.2406) loss 3.2961 (2.8153) grad_norm 4.3572 (4.4202) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][540/1251] eta 0:02:52 lr 0.000117 wd 0.0500 time 0.2456 (0.2424) data time 0.0007 (0.0019) model time 0.2449 (0.2406) loss 1.8927 (2.8164) grad_norm 3.2890 (4.4054) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][550/1251] eta 0:02:49 lr 0.000116 wd 0.0500 time 0.2428 (0.2424) data time 0.0010 (0.0019) model time 0.2419 (0.2406) loss 3.2129 (2.8163) grad_norm 3.5044 (4.4564) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][560/1251] eta 0:02:47 lr 0.000116 wd 0.0500 time 0.2430 (0.2425) data time 0.0010 (0.0019) model time 0.2420 (0.2407) loss 2.3999 (2.8159) grad_norm 12.7102 (4.4742) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][570/1251] eta 0:02:45 lr 0.000116 wd 0.0500 time 0.2410 (0.2425) data time 0.0009 (0.0018) model time 0.2401 (0.2407) loss 2.0320 (2.8165) grad_norm 4.8518 (4.4615) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-09-01 06:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][580/1251] eta 0:02:42 lr 0.000116 wd 0.0500 time 0.2454 (0.2424) data time 0.0007 (0.0018) model time 0.2448 (0.2406) loss 2.8363 (2.8170) grad_norm 4.2892 (4.4484) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][590/1251] eta 0:02:40 lr 0.000116 wd 0.0500 time 0.2396 (0.2423) data time 0.0007 (0.0018) model time 0.2389 (0.2406) loss 3.5842 (2.8185) grad_norm 3.2935 (4.4460) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][600/1251] eta 0:02:37 lr 0.000116 wd 0.0500 time 0.2400 (0.2423) data time 0.0008 (0.0018) model time 0.2392 (0.2406) loss 2.9087 (2.8160) grad_norm 4.2905 (4.4623) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][610/1251] eta 0:02:35 lr 0.000116 wd 0.0500 time 0.2417 (0.2423) data time 0.0011 (0.0018) model time 0.2405 (0.2406) loss 3.0119 (2.8151) grad_norm 4.2541 (4.4565) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][620/1251] eta 0:02:32 lr 0.000116 wd 0.0500 time 0.2503 (0.2423) data time 0.0009 (0.0018) model time 0.2493 (0.2406) loss 2.1462 (2.8177) grad_norm 3.9518 (4.4694) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][630/1251] eta 0:02:30 lr 0.000116 wd 0.0500 time 0.2438 (0.2422) data time 0.0010 (0.0018) model time 0.2428 (0.2405) loss 2.6492 (2.8170) grad_norm 4.8281 (4.4664) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][640/1251] eta 0:02:27 lr 0.000116 wd 0.0500 time 0.2467 (0.2422) data time 0.0010 (0.0018) model 
time 0.2456 (0.2405) loss 1.5698 (2.8206) grad_norm 3.1110 (4.4673) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][650/1251] eta 0:02:25 lr 0.000116 wd 0.0500 time 0.2455 (0.2422) data time 0.0007 (0.0017) model time 0.2448 (0.2405) loss 3.0606 (2.8210) grad_norm 2.8359 (4.4603) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][660/1251] eta 0:02:23 lr 0.000116 wd 0.0500 time 0.2366 (0.2422) data time 0.0009 (0.0017) model time 0.2357 (0.2405) loss 2.8766 (2.8193) grad_norm 4.4595 (4.4577) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][670/1251] eta 0:02:20 lr 0.000116 wd 0.0500 time 0.2430 (0.2422) data time 0.0009 (0.0017) model time 0.2421 (0.2405) loss 3.2714 (2.8204) grad_norm 5.8689 (4.4557) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][680/1251] eta 0:02:18 lr 0.000116 wd 0.0500 time 0.2414 (0.2422) data time 0.0010 (0.0017) model time 0.2404 (0.2405) loss 3.0921 (2.8171) grad_norm 4.2930 (4.4530) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][690/1251] eta 0:02:15 lr 0.000116 wd 0.0500 time 0.2401 (0.2422) data time 0.0011 (0.0017) model time 0.2390 (0.2406) loss 3.0103 (2.8184) grad_norm 4.9920 (4.4527) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][700/1251] eta 0:02:13 lr 0.000116 wd 0.0500 time 0.2397 (0.2422) data time 0.0007 (0.0017) model time 0.2390 (0.2406) loss 2.7515 (2.8168) grad_norm 3.6758 (4.4590) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][710/1251] 
eta 0:02:11 lr 0.000116 wd 0.0500 time 0.2398 (0.2422) data time 0.0007 (0.0017) model time 0.2391 (0.2406) loss 2.9208 (2.8157) grad_norm 5.9784 (4.4822) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][720/1251] eta 0:02:08 lr 0.000116 wd 0.0500 time 0.2367 (0.2428) data time 0.0009 (0.0017) model time 0.2358 (0.2412) loss 2.9650 (2.8181) grad_norm 4.4066 (4.4785) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][730/1251] eta 0:02:06 lr 0.000116 wd 0.0500 time 0.2416 (0.2431) data time 0.0009 (0.0017) model time 0.2407 (0.2415) loss 2.9440 (2.8155) grad_norm 3.2778 (4.4765) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][740/1251] eta 0:02:04 lr 0.000116 wd 0.0500 time 0.2385 (0.2431) data time 0.0010 (0.0016) model time 0.2375 (0.2415) loss 2.7020 (2.8145) grad_norm 5.5603 (4.4654) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][750/1251] eta 0:02:01 lr 0.000116 wd 0.0500 time 0.2396 (0.2431) data time 0.0009 (0.0016) model time 0.2387 (0.2415) loss 2.1285 (2.8133) grad_norm 2.8082 (4.4521) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][760/1251] eta 0:01:59 lr 0.000116 wd 0.0500 time 0.2405 (0.2430) data time 0.0008 (0.0016) model time 0.2397 (0.2415) loss 3.2291 (2.8137) grad_norm 5.6255 (4.4635) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][770/1251] eta 0:01:56 lr 0.000116 wd 0.0500 time 0.2494 (0.2430) data time 0.0009 (0.0016) model time 0.2485 (0.2415) loss 3.1822 (2.8122) grad_norm 4.6876 (4.4567) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 
06:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][780/1251] eta 0:01:54 lr 0.000116 wd 0.0500 time 0.2380 (0.2430) data time 0.0007 (0.0016) model time 0.2372 (0.2415) loss 3.4651 (2.8145) grad_norm 4.3517 (4.4494) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][790/1251] eta 0:01:52 lr 0.000116 wd 0.0500 time 0.2403 (0.2430) data time 0.0009 (0.0016) model time 0.2394 (0.2415) loss 3.3840 (2.8169) grad_norm 3.5109 (4.4555) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][800/1251] eta 0:01:49 lr 0.000116 wd 0.0500 time 0.2423 (0.2430) data time 0.0009 (0.0016) model time 0.2413 (0.2415) loss 2.6123 (2.8160) grad_norm 6.3866 (4.4524) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][810/1251] eta 0:01:47 lr 0.000116 wd 0.0500 time 0.2468 (0.2430) data time 0.0012 (0.0016) model time 0.2455 (0.2415) loss 3.4391 (2.8182) grad_norm 3.1067 (4.4460) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][820/1251] eta 0:01:44 lr 0.000116 wd 0.0500 time 0.2401 (0.2430) data time 0.0009 (0.0016) model time 0.2392 (0.2415) loss 3.2152 (2.8179) grad_norm 4.4714 (4.4415) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][830/1251] eta 0:01:42 lr 0.000116 wd 0.0500 time 0.2505 (0.2430) data time 0.0010 (0.0016) model time 0.2494 (0.2415) loss 3.0178 (2.8190) grad_norm 5.0337 (4.4407) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][840/1251] eta 0:01:39 lr 0.000116 wd 0.0500 time 0.2386 (0.2430) data time 0.0008 (0.0016) model time 0.2378 (0.2415) loss 2.0869 
(2.8213) grad_norm 3.1264 (4.4373) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][850/1251] eta 0:01:37 lr 0.000116 wd 0.0500 time 0.2393 (0.2430) data time 0.0010 (0.0016) model time 0.2383 (0.2415) loss 3.2818 (2.8214) grad_norm 3.1308 (4.4306) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][860/1251] eta 0:01:34 lr 0.000116 wd 0.0500 time 0.2446 (0.2429) data time 0.0009 (0.0016) model time 0.2437 (0.2415) loss 3.4671 (2.8227) grad_norm 3.6161 (4.4250) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][870/1251] eta 0:01:32 lr 0.000116 wd 0.0500 time 0.2444 (0.2429) data time 0.0007 (0.0016) model time 0.2437 (0.2414) loss 2.5231 (2.8233) grad_norm 6.2536 (4.4275) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][880/1251] eta 0:01:30 lr 0.000116 wd 0.0500 time 0.2317 (0.2429) data time 0.0009 (0.0015) model time 0.2307 (0.2414) loss 3.4008 (2.8225) grad_norm 4.5102 (4.4282) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][890/1251] eta 0:01:27 lr 0.000116 wd 0.0500 time 0.2426 (0.2429) data time 0.0008 (0.0015) model time 0.2418 (0.2414) loss 2.1670 (2.8230) grad_norm 3.9832 (4.4411) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][900/1251] eta 0:01:25 lr 0.000116 wd 0.0500 time 0.2456 (0.2428) data time 0.0010 (0.0015) model time 0.2446 (0.2414) loss 3.1746 (2.8228) grad_norm 7.0688 (4.4403) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][910/1251] eta 0:01:22 lr 0.000115 wd 0.0500 
time 0.2446 (0.2428) data time 0.0009 (0.0015) model time 0.2436 (0.2414) loss 2.6096 (2.8260) grad_norm 4.3766 (4.4470) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][920/1251] eta 0:01:20 lr 0.000115 wd 0.0500 time 0.2375 (0.2428) data time 0.0009 (0.0015) model time 0.2366 (0.2413) loss 3.1226 (2.8256) grad_norm 4.3523 (4.4420) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][930/1251] eta 0:01:17 lr 0.000115 wd 0.0500 time 0.2422 (0.2428) data time 0.0010 (0.0015) model time 0.2412 (0.2413) loss 2.3996 (2.8253) grad_norm 3.3178 (4.4385) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][940/1251] eta 0:01:15 lr 0.000115 wd 0.0500 time 0.2367 (0.2428) data time 0.0009 (0.0015) model time 0.2358 (0.2413) loss 1.8379 (2.8259) grad_norm 3.5861 (4.4384) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][950/1251] eta 0:01:13 lr 0.000115 wd 0.0500 time 0.2455 (0.2427) data time 0.0010 (0.0015) model time 0.2446 (0.2413) loss 3.1276 (2.8280) grad_norm 4.7761 (4.4338) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][960/1251] eta 0:01:10 lr 0.000115 wd 0.0500 time 0.2364 (0.2427) data time 0.0008 (0.0015) model time 0.2357 (0.2413) loss 3.5411 (2.8279) grad_norm 3.3524 (4.4336) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][970/1251] eta 0:01:08 lr 0.000115 wd 0.0500 time 0.2350 (0.2427) data time 0.0009 (0.0015) model time 0.2341 (0.2413) loss 3.5020 (2.8288) grad_norm 3.9877 (4.4351) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:48 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [240/300][980/1251] eta 0:01:05 lr 0.000115 wd 0.0500 time 0.2383 (0.2427) data time 0.0012 (0.0015) model time 0.2371 (0.2413) loss 3.1649 (2.8305) grad_norm 3.4745 (4.4270) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][990/1251] eta 0:01:03 lr 0.000115 wd 0.0500 time 0.2436 (0.2427) data time 0.0010 (0.0015) model time 0.2427 (0.2412) loss 3.6704 (2.8340) grad_norm 3.9368 (4.4275) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1000/1251] eta 0:01:00 lr 0.000115 wd 0.0500 time 0.2441 (0.2427) data time 0.0011 (0.0015) model time 0.2430 (0.2413) loss 3.5204 (2.8361) grad_norm 4.3003 (4.4234) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1010/1251] eta 0:00:58 lr 0.000115 wd 0.0500 time 0.2374 (0.2427) data time 0.0011 (0.0015) model time 0.2363 (0.2413) loss 3.2916 (2.8341) grad_norm 4.7195 (4.4380) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1020/1251] eta 0:00:56 lr 0.000115 wd 0.0500 time 0.2383 (0.2427) data time 0.0008 (0.0015) model time 0.2374 (0.2413) loss 3.9084 (2.8334) grad_norm 4.0657 (4.4337) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1030/1251] eta 0:00:53 lr 0.000115 wd 0.0500 time 0.2446 (0.2427) data time 0.0009 (0.0015) model time 0.2437 (0.2413) loss 2.7168 (2.8336) grad_norm 2.4767 (4.4277) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1040/1251] eta 0:00:51 lr 0.000115 wd 0.0500 time 0.2325 (0.2427) data time 0.0011 (0.0015) model time 0.2314 (0.2413) loss 3.2909 (2.8363) grad_norm 6.3525 
(4.4292) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1050/1251] eta 0:00:48 lr 0.000115 wd 0.0500 time 0.2420 (0.2427) data time 0.0007 (0.0015) model time 0.2413 (0.2413) loss 1.8110 (2.8339) grad_norm 3.6233 (4.4275) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1060/1251] eta 0:00:46 lr 0.000115 wd 0.0500 time 0.2440 (0.2427) data time 0.0007 (0.0015) model time 0.2433 (0.2413) loss 2.4420 (2.8339) grad_norm 4.1983 (4.4206) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1070/1251] eta 0:00:43 lr 0.000115 wd 0.0500 time 0.2504 (0.2428) data time 0.0010 (0.0015) model time 0.2495 (0.2415) loss 2.0442 (2.8291) grad_norm 6.3250 (4.4244) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1080/1251] eta 0:00:41 lr 0.000115 wd 0.0500 time 0.2407 (0.2428) data time 0.0007 (0.0015) model time 0.2400 (0.2414) loss 2.4883 (2.8286) grad_norm 4.3962 (4.4249) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1090/1251] eta 0:00:39 lr 0.000115 wd 0.0500 time 0.2386 (0.2428) data time 0.0010 (0.0014) model time 0.2376 (0.2414) loss 3.3367 (2.8295) grad_norm 13.9386 (4.4348) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1100/1251] eta 0:00:36 lr 0.000115 wd 0.0500 time 0.2307 (0.2428) data time 0.0010 (0.0014) model time 0.2297 (0.2414) loss 1.9856 (2.8297) grad_norm 5.4304 (4.4467) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1110/1251] eta 0:00:34 lr 0.000115 wd 0.0500 time 0.2431 
(0.2427) data time 0.0010 (0.0014) model time 0.2422 (0.2414) loss 2.1729 (2.8287) grad_norm 4.1829 (4.4679) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1120/1251] eta 0:00:31 lr 0.000115 wd 0.0500 time 0.2365 (0.2427) data time 0.0008 (0.0014) model time 0.2358 (0.2413) loss 2.0410 (2.8279) grad_norm 3.7699 (4.4639) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1130/1251] eta 0:00:29 lr 0.000115 wd 0.0500 time 0.2384 (0.2427) data time 0.0007 (0.0014) model time 0.2377 (0.2413) loss 2.8780 (2.8287) grad_norm 4.0902 (4.4744) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1140/1251] eta 0:00:26 lr 0.000115 wd 0.0500 time 0.2363 (0.2427) data time 0.0009 (0.0014) model time 0.2354 (0.2413) loss 3.6915 (2.8278) grad_norm 3.3404 (4.4777) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1150/1251] eta 0:00:24 lr 0.000115 wd 0.0500 time 0.2494 (0.2426) data time 0.0010 (0.0014) model time 0.2484 (0.2413) loss 2.7097 (2.8301) grad_norm 3.1330 (4.4933) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1160/1251] eta 0:00:22 lr 0.000115 wd 0.0500 time 0.2407 (0.2426) data time 0.0007 (0.0014) model time 0.2400 (0.2413) loss 3.3374 (2.8295) grad_norm 4.4522 (4.4885) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1170/1251] eta 0:00:19 lr 0.000115 wd 0.0500 time 0.2424 (0.2426) data time 0.0007 (0.0014) model time 0.2417 (0.2412) loss 1.7698 (2.8296) grad_norm 3.5122 (4.4868) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [240/300][1180/1251] eta 0:00:17 lr 0.000115 wd 0.0500 time 0.2406 (0.2426) data time 0.0008 (0.0014) model time 0.2397 (0.2412) loss 3.1021 (2.8302) grad_norm 3.7715 (4.4799) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 06:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1190/1251] eta 0:00:14 lr 0.000115 wd 0.0500 time 0.2401 (0.2426) data time 0.0010 (0.0014) model time 0.2392 (0.2412) loss 2.1555 (2.8307) grad_norm 3.7720 (4.5166) loss_scale 1024.0000 (514.1495) mem 7381MB [2024-09-01 06:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1200/1251] eta 0:00:12 lr 0.000115 wd 0.0500 time 0.2391 (0.2426) data time 0.0010 (0.0014) model time 0.2381 (0.2412) loss 1.9673 (2.8308) grad_norm 3.6435 (4.5119) loss_scale 1024.0000 (518.3947) mem 7381MB [2024-09-01 06:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1210/1251] eta 0:00:09 lr 0.000115 wd 0.0500 time 0.2300 (0.2426) data time 0.0008 (0.0014) model time 0.2291 (0.2412) loss 3.2588 (2.8296) grad_norm 3.0509 (4.5014) loss_scale 1024.0000 (522.5698) mem 7381MB [2024-09-01 06:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1220/1251] eta 0:00:07 lr 0.000115 wd 0.0500 time 0.2397 (0.2425) data time 0.0009 (0.0014) model time 0.2388 (0.2412) loss 2.9433 (2.8294) grad_norm 5.8114 (4.4978) loss_scale 1024.0000 (526.6765) mem 7381MB [2024-09-01 06:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1230/1251] eta 0:00:05 lr 0.000115 wd 0.0500 time 0.2516 (0.2425) data time 0.0007 (0.0014) model time 0.2509 (0.2412) loss 2.7355 (2.8284) grad_norm 3.9366 (4.4990) loss_scale 1024.0000 (530.7165) mem 7381MB [2024-09-01 06:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1240/1251] eta 0:00:02 lr 0.000115 wd 0.0500 time 0.2249 (0.2424) data time 0.0005 (0.0014) model time 0.2244 (0.2411) loss 2.3390 (2.8268) grad_norm 
5.0484 (4.5038) loss_scale 1024.0000 (534.6914) mem 7381MB [2024-09-01 06:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [240/300][1250/1251] eta 0:00:00 lr 0.000115 wd 0.0500 time 0.2257 (0.2423) data time 0.0005 (0.0014) model time 0.2253 (0.2409) loss 2.7678 (2.8266) grad_norm 3.1123 (4.4971) loss_scale 1024.0000 (538.6027) mem 7381MB [2024-09-01 06:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 240 training takes 0:05:03 [2024-09-01 06:43:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:43:53 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 06:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.462 (0.462) Loss 0.4150 (0.4150) Acc@1 92.578 (92.578) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.113) Loss 0.6167 (0.6322) Acc@1 87.695 (86.967) Acc@5 97.754 (97.603) Mem 7381MB [2024-09-01 06:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.098) Loss 0.9312 (0.6625) Acc@1 77.441 (85.849) Acc@5 95.801 (97.498) Mem 7381MB [2024-09-01 06:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.091) Loss 1.1621 (0.7554) Acc@1 72.949 (83.625) Acc@5 92.871 (96.582) Mem 7381MB [2024-09-01 06:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0674 (0.8038) Acc@1 76.562 (82.462) Acc@5 93.262 (96.018) Mem 7381MB [2024-09-01 06:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.076 Acc@5 95.964 [2024-09-01 06:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-09-01 06:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 
82.08% [2024-09-01 06:43:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 06:43:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 06:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.448 (0.448) Loss 0.3809 (0.3809) Acc@1 93.457 (93.457) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.111) Loss 0.5674 (0.6005) Acc@1 89.648 (87.580) Acc@5 98.047 (97.754) Mem 7381MB [2024-09-01 06:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.095) Loss 0.8862 (0.6300) Acc@1 78.027 (86.426) Acc@5 95.898 (97.666) Mem 7381MB [2024-09-01 06:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.090) Loss 1.0869 (0.7141) Acc@1 74.414 (84.340) Acc@5 93.555 (96.818) Mem 7381MB [2024-09-01 06:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.084) Loss 0.9976 (0.7593) Acc@1 76.270 (83.234) Acc@5 94.141 (96.349) Mem 7381MB [2024-09-01 06:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.838 Acc@5 96.324 [2024-09-01 06:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 06:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][0/1251] eta 0:22:48 lr 0.000115 wd 0.0500 time 1.0938 (1.0938) data time 0.5201 (0.5201) model time 0.0000 (0.0000) loss 2.9194 (2.9194) grad_norm 2.6904 (2.6904) loss_scale 1024.0000 (1024.0000) mem 7381MB [2024-09-01 06:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][10/1251] eta 0:07:02 lr 0.000115 wd 0.0500 time 0.2532 (0.3404) data time 0.0009 (0.0483) model time 0.0000 (0.0000) loss 2.9282 (2.7481) grad_norm 3.0938 (nan) 
loss_scale 512.0000 (698.1818) mem 7381MB [2024-09-01 06:44:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][20/1251] eta 0:06:35 lr 0.000115 wd 0.0500 time 0.2353 (0.3216) data time 0.0011 (0.0258) model time 0.0000 (0.0000) loss 2.9406 (2.8790) grad_norm 3.6151 (nan) loss_scale 512.0000 (609.5238) mem 7381MB [2024-09-01 06:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][30/1251] eta 0:06:01 lr 0.000114 wd 0.0500 time 0.2383 (0.2959) data time 0.0008 (0.0178) model time 0.0000 (0.0000) loss 2.1034 (2.8142) grad_norm 4.4678 (nan) loss_scale 512.0000 (578.0645) mem 7381MB [2024-09-01 06:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][40/1251] eta 0:05:41 lr 0.000114 wd 0.0500 time 0.2394 (0.2823) data time 0.0011 (0.0137) model time 0.0000 (0.0000) loss 2.5318 (2.8195) grad_norm 3.2844 (nan) loss_scale 512.0000 (561.9512) mem 7381MB [2024-09-01 06:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][50/1251] eta 0:05:28 lr 0.000114 wd 0.0500 time 0.2424 (0.2739) data time 0.0007 (0.0112) model time 0.0000 (0.0000) loss 2.7792 (2.8364) grad_norm 5.3665 (nan) loss_scale 512.0000 (552.1569) mem 7381MB [2024-09-01 06:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][60/1251] eta 0:05:19 lr 0.000114 wd 0.0500 time 0.2393 (0.2684) data time 0.0008 (0.0095) model time 0.2385 (0.2397) loss 3.1580 (2.8605) grad_norm 3.4756 (nan) loss_scale 512.0000 (545.5738) mem 7381MB [2024-09-01 06:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][70/1251] eta 0:05:12 lr 0.000114 wd 0.0500 time 0.2408 (0.2647) data time 0.0009 (0.0083) model time 0.2399 (0.2403) loss 2.8424 (2.8296) grad_norm 4.9680 (nan) loss_scale 512.0000 (540.8451) mem 7381MB [2024-09-01 06:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][80/1251] eta 0:05:06 lr 0.000114 wd 0.0500 time 0.2452 (0.2619) data time 0.0007 (0.0074) model time 
0.2446 (0.2405) loss 2.5701 (2.8389) grad_norm 3.1035 (nan) loss_scale 512.0000 (537.2840) mem 7381MB [2024-09-01 06:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][90/1251] eta 0:05:01 lr 0.000114 wd 0.0500 time 0.2436 (0.2596) data time 0.0010 (0.0067) model time 0.2425 (0.2406) loss 2.5260 (2.8612) grad_norm 3.8027 (nan) loss_scale 512.0000 (534.5055) mem 7381MB [2024-09-01 06:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][100/1251] eta 0:04:56 lr 0.000114 wd 0.0500 time 0.2310 (0.2578) data time 0.0011 (0.0061) model time 0.2300 (0.2404) loss 2.4079 (2.8293) grad_norm 3.5944 (nan) loss_scale 512.0000 (532.2772) mem 7381MB [2024-09-01 06:44:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][110/1251] eta 0:04:52 lr 0.000114 wd 0.0500 time 0.2366 (0.2563) data time 0.0009 (0.0057) model time 0.2357 (0.2403) loss 2.6319 (2.8262) grad_norm 3.4511 (nan) loss_scale 512.0000 (530.4505) mem 7381MB [2024-09-01 06:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][120/1251] eta 0:04:48 lr 0.000114 wd 0.0500 time 0.2459 (0.2552) data time 0.0007 (0.0053) model time 0.2452 (0.2406) loss 1.8441 (2.8250) grad_norm 3.0186 (nan) loss_scale 512.0000 (528.9256) mem 7381MB [2024-09-01 06:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][130/1251] eta 0:04:44 lr 0.000114 wd 0.0500 time 0.2403 (0.2541) data time 0.0009 (0.0049) model time 0.2394 (0.2406) loss 3.2915 (2.8087) grad_norm 4.6429 (nan) loss_scale 512.0000 (527.6336) mem 7381MB [2024-09-01 06:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][140/1251] eta 0:04:41 lr 0.000114 wd 0.0500 time 0.2415 (0.2531) data time 0.0010 (0.0046) model time 0.2405 (0.2404) loss 2.9241 (2.8345) grad_norm 3.7637 (nan) loss_scale 512.0000 (526.5248) mem 7381MB [2024-09-01 06:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][150/1251] eta 0:04:37 lr 0.000114 wd 
0.0500 time 0.2328 (0.2523) data time 0.0011 (0.0044) model time 0.2317 (0.2403) loss 2.4765 (2.8252) grad_norm 3.6863 (nan) loss_scale 512.0000 (525.5629) mem 7381MB [2024-09-01 06:44:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][160/1251] eta 0:04:34 lr 0.000114 wd 0.0500 time 0.2420 (0.2516) data time 0.0010 (0.0042) model time 0.2410 (0.2404) loss 2.8897 (2.8360) grad_norm 3.4307 (nan) loss_scale 512.0000 (524.7205) mem 7381MB [2024-09-01 06:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][170/1251] eta 0:04:31 lr 0.000114 wd 0.0500 time 0.2450 (0.2511) data time 0.0010 (0.0040) model time 0.2440 (0.2404) loss 2.6979 (2.8296) grad_norm 6.1527 (nan) loss_scale 512.0000 (523.9766) mem 7381MB [2024-09-01 06:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][180/1251] eta 0:04:28 lr 0.000114 wd 0.0500 time 0.2341 (0.2504) data time 0.0010 (0.0038) model time 0.2331 (0.2402) loss 3.2158 (2.8305) grad_norm 3.0475 (nan) loss_scale 512.0000 (523.3149) mem 7381MB [2024-09-01 06:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][190/1251] eta 0:04:25 lr 0.000114 wd 0.0500 time 0.2384 (0.2500) data time 0.0007 (0.0037) model time 0.2377 (0.2403) loss 2.7851 (2.8303) grad_norm 3.2736 (nan) loss_scale 512.0000 (522.7225) mem 7381MB [2024-09-01 06:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][200/1251] eta 0:04:22 lr 0.000114 wd 0.0500 time 0.2329 (0.2495) data time 0.0009 (0.0036) model time 0.2320 (0.2402) loss 2.3410 (2.8202) grad_norm 3.8144 (nan) loss_scale 512.0000 (522.1891) mem 7381MB [2024-09-01 06:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][210/1251] eta 0:04:19 lr 0.000114 wd 0.0500 time 0.2469 (0.2492) data time 0.0008 (0.0035) model time 0.2461 (0.2403) loss 3.3879 (2.8292) grad_norm 6.2403 (nan) loss_scale 512.0000 (521.7062) mem 7381MB [2024-09-01 06:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [241/300][220/1251] eta 0:04:16 lr 0.000114 wd 0.0500 time 0.2360 (0.2488) data time 0.0008 (0.0033) model time 0.2352 (0.2403) loss 2.1152 (2.8231) grad_norm 4.5890 (nan) loss_scale 512.0000 (521.2670) mem 7381MB [2024-09-01 06:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][230/1251] eta 0:04:13 lr 0.000114 wd 0.0500 time 0.2368 (0.2484) data time 0.0010 (0.0032) model time 0.2358 (0.2403) loss 3.2067 (2.8142) grad_norm 4.2997 (nan) loss_scale 512.0000 (520.8658) mem 7381MB [2024-09-01 06:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][240/1251] eta 0:04:10 lr 0.000114 wd 0.0500 time 0.2388 (0.2482) data time 0.0010 (0.0031) model time 0.2378 (0.2403) loss 3.1199 (2.8252) grad_norm 3.4373 (nan) loss_scale 512.0000 (520.4979) mem 7381MB [2024-09-01 06:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][250/1251] eta 0:04:08 lr 0.000114 wd 0.0500 time 0.2444 (0.2479) data time 0.0006 (0.0031) model time 0.2437 (0.2403) loss 2.5055 (2.8222) grad_norm 4.0867 (nan) loss_scale 512.0000 (520.1594) mem 7381MB [2024-09-01 06:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][260/1251] eta 0:04:05 lr 0.000114 wd 0.0500 time 0.2397 (0.2476) data time 0.0009 (0.0030) model time 0.2388 (0.2402) loss 3.1215 (2.8217) grad_norm 4.8600 (nan) loss_scale 512.0000 (519.8467) mem 7381MB [2024-09-01 06:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][270/1251] eta 0:04:02 lr 0.000114 wd 0.0500 time 0.2393 (0.2474) data time 0.0009 (0.0029) model time 0.2383 (0.2403) loss 3.0135 (2.8223) grad_norm 9.0806 (nan) loss_scale 512.0000 (519.5572) mem 7381MB [2024-09-01 06:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][280/1251] eta 0:04:00 lr 0.000114 wd 0.0500 time 0.2386 (0.2472) data time 0.0013 (0.0029) model time 0.2373 (0.2402) loss 2.4183 (2.8274) grad_norm 5.3017 (nan) loss_scale 512.0000 (519.2883) mem 
7381MB [2024-09-01 06:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][290/1251] eta 0:03:57 lr 0.000114 wd 0.0500 time 0.4457 (0.2477) data time 0.0011 (0.0028) model time 0.4446 (0.2411) loss 2.4052 (2.8249) grad_norm 4.3967 (nan) loss_scale 512.0000 (519.0378) mem 7381MB [2024-09-01 06:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][300/1251] eta 0:03:55 lr 0.000114 wd 0.0500 time 0.2415 (0.2475) data time 0.0007 (0.0027) model time 0.2409 (0.2412) loss 2.7276 (2.8254) grad_norm 3.5872 (nan) loss_scale 512.0000 (518.8040) mem 7381MB [2024-09-01 06:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][310/1251] eta 0:03:52 lr 0.000114 wd 0.0500 time 0.2397 (0.2474) data time 0.0010 (0.0027) model time 0.2387 (0.2412) loss 3.0565 (2.8297) grad_norm 3.8148 (nan) loss_scale 512.0000 (518.5852) mem 7381MB [2024-09-01 06:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][320/1251] eta 0:03:50 lr 0.000114 wd 0.0500 time 0.2391 (0.2472) data time 0.0010 (0.0026) model time 0.2381 (0.2411) loss 3.3282 (2.8325) grad_norm 4.8194 (nan) loss_scale 512.0000 (518.3801) mem 7381MB [2024-09-01 06:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][330/1251] eta 0:03:47 lr 0.000114 wd 0.0500 time 0.2467 (0.2471) data time 0.0007 (0.0026) model time 0.2460 (0.2412) loss 2.5292 (2.8293) grad_norm 4.1045 (nan) loss_scale 512.0000 (518.1873) mem 7381MB [2024-09-01 06:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][340/1251] eta 0:03:44 lr 0.000114 wd 0.0500 time 0.2330 (0.2469) data time 0.0010 (0.0025) model time 0.2320 (0.2411) loss 3.1278 (2.8189) grad_norm 5.9922 (nan) loss_scale 512.0000 (518.0059) mem 7381MB [2024-09-01 06:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][350/1251] eta 0:03:42 lr 0.000114 wd 0.0500 time 0.2364 (0.2467) data time 0.0009 (0.0025) model time 0.2355 (0.2410) loss 2.9637 
(2.8163) grad_norm 4.0886 (nan) loss_scale 512.0000 (517.8348) mem 7381MB [2024-09-01 06:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][360/1251] eta 0:03:39 lr 0.000114 wd 0.0500 time 0.2404 (0.2465) data time 0.0007 (0.0024) model time 0.2397 (0.2410) loss 2.4056 (2.8099) grad_norm 3.7211 (nan) loss_scale 512.0000 (517.6731) mem 7381MB [2024-09-01 06:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][370/1251] eta 0:03:36 lr 0.000114 wd 0.0500 time 0.2400 (0.2463) data time 0.0009 (0.0024) model time 0.2390 (0.2409) loss 2.0778 (2.8087) grad_norm 3.5994 (nan) loss_scale 512.0000 (517.5202) mem 7381MB [2024-09-01 06:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][380/1251] eta 0:03:34 lr 0.000114 wd 0.0500 time 0.2445 (0.2462) data time 0.0010 (0.0024) model time 0.2435 (0.2409) loss 2.6851 (2.8053) grad_norm 3.0253 (nan) loss_scale 512.0000 (517.3753) mem 7381MB [2024-09-01 06:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][390/1251] eta 0:03:31 lr 0.000113 wd 0.0500 time 0.2340 (0.2461) data time 0.0007 (0.0023) model time 0.2332 (0.2409) loss 2.4387 (2.8073) grad_norm 5.1082 (nan) loss_scale 512.0000 (517.2379) mem 7381MB [2024-09-01 06:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][400/1251] eta 0:03:29 lr 0.000113 wd 0.0500 time 0.2417 (0.2460) data time 0.0009 (0.0023) model time 0.2408 (0.2409) loss 2.7633 (2.8096) grad_norm 5.1330 (nan) loss_scale 512.0000 (517.1072) mem 7381MB [2024-09-01 06:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][410/1251] eta 0:03:26 lr 0.000113 wd 0.0500 time 0.2450 (0.2459) data time 0.0010 (0.0023) model time 0.2440 (0.2409) loss 3.0495 (2.8114) grad_norm 4.8894 (nan) loss_scale 512.0000 (516.9830) mem 7381MB [2024-09-01 06:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][420/1251] eta 0:03:24 lr 0.000113 wd 0.0500 time 0.2400 (0.2458) 
data time 0.0009 (0.0022) model time 0.2391 (0.2409) loss 2.8992 (2.8058) grad_norm 4.3961 (nan) loss_scale 512.0000 (516.8646) mem 7381MB [2024-09-01 06:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][430/1251] eta 0:03:21 lr 0.000113 wd 0.0500 time 0.2424 (0.2457) data time 0.0008 (0.0022) model time 0.2416 (0.2409) loss 2.5089 (2.8014) grad_norm 3.1571 (nan) loss_scale 512.0000 (516.7517) mem 7381MB [2024-09-01 06:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][440/1251] eta 0:03:19 lr 0.000113 wd 0.0500 time 0.2386 (0.2456) data time 0.0008 (0.0022) model time 0.2377 (0.2409) loss 3.6445 (2.8005) grad_norm 5.0085 (nan) loss_scale 512.0000 (516.6440) mem 7381MB [2024-09-01 06:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][450/1251] eta 0:03:16 lr 0.000113 wd 0.0500 time 0.2391 (0.2455) data time 0.0009 (0.0022) model time 0.2382 (0.2408) loss 2.4583 (2.7958) grad_norm 3.9153 (nan) loss_scale 512.0000 (516.5410) mem 7381MB [2024-09-01 06:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][460/1251] eta 0:03:14 lr 0.000113 wd 0.0500 time 0.2421 (0.2454) data time 0.0009 (0.0021) model time 0.2412 (0.2409) loss 3.0795 (2.7985) grad_norm 3.2662 (nan) loss_scale 512.0000 (516.4425) mem 7381MB [2024-09-01 06:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][470/1251] eta 0:03:11 lr 0.000113 wd 0.0500 time 0.2372 (0.2453) data time 0.0007 (0.0021) model time 0.2364 (0.2409) loss 2.9944 (2.7997) grad_norm 6.1928 (nan) loss_scale 512.0000 (516.3482) mem 7381MB [2024-09-01 06:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][480/1251] eta 0:03:09 lr 0.000113 wd 0.0500 time 0.2386 (0.2453) data time 0.0010 (0.0021) model time 0.2376 (0.2409) loss 2.5435 (2.8016) grad_norm 3.3709 (nan) loss_scale 512.0000 (516.2578) mem 7381MB [2024-09-01 06:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[241/300][490/1251] eta 0:03:06 lr 0.000113 wd 0.0500 time 0.2417 (0.2452) data time 0.0010 (0.0021) model time 0.2408 (0.2409) loss 3.3701 (2.8018) grad_norm 4.5574 (nan) loss_scale 512.0000 (516.1711) mem 7381MB [2024-09-01 06:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][500/1251] eta 0:03:04 lr 0.000113 wd 0.0500 time 0.2408 (0.2451) data time 0.0008 (0.0020) model time 0.2400 (0.2409) loss 3.0001 (2.8007) grad_norm 3.6923 (nan) loss_scale 512.0000 (516.0878) mem 7381MB [2024-09-01 06:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][510/1251] eta 0:03:01 lr 0.000113 wd 0.0500 time 0.2457 (0.2451) data time 0.0010 (0.0020) model time 0.2447 (0.2409) loss 3.1003 (2.7967) grad_norm 3.4709 (nan) loss_scale 512.0000 (516.0078) mem 7381MB [2024-09-01 06:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][520/1251] eta 0:02:59 lr 0.000113 wd 0.0500 time 0.2382 (0.2450) data time 0.0007 (0.0020) model time 0.2374 (0.2409) loss 3.3645 (2.7992) grad_norm 6.7746 (nan) loss_scale 512.0000 (515.9309) mem 7381MB [2024-09-01 06:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][530/1251] eta 0:02:56 lr 0.000113 wd 0.0500 time 0.2330 (0.2449) data time 0.0008 (0.0020) model time 0.2322 (0.2408) loss 3.1717 (2.8030) grad_norm 3.5996 (nan) loss_scale 512.0000 (515.8569) mem 7381MB [2024-09-01 06:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][540/1251] eta 0:02:54 lr 0.000113 wd 0.0500 time 0.2428 (0.2448) data time 0.0011 (0.0020) model time 0.2417 (0.2408) loss 3.0105 (2.8075) grad_norm 5.4773 (nan) loss_scale 512.0000 (515.7856) mem 7381MB [2024-09-01 06:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][550/1251] eta 0:02:52 lr 0.000113 wd 0.0500 time 0.2501 (0.2455) data time 0.0007 (0.0019) model time 0.2494 (0.2416) loss 2.8774 (2.8069) grad_norm 3.1559 (nan) loss_scale 512.0000 (515.7169) mem 7381MB [2024-09-01 
06:46:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][560/1251] eta 0:02:50 lr 0.000113 wd 0.0500 time 0.2438 (0.2462) data time 0.0008 (0.0019) model time 0.2429 (0.2425) loss 3.3976 (2.8085) grad_norm 4.9793 (nan) loss_scale 512.0000 (515.6506) mem 7381MB [2024-09-01 06:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][570/1251] eta 0:02:47 lr 0.000113 wd 0.0500 time 0.2441 (0.2462) data time 0.0008 (0.0019) model time 0.2432 (0.2425) loss 1.8181 (2.8052) grad_norm 8.2861 (nan) loss_scale 512.0000 (515.5867) mem 7381MB [2024-09-01 06:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][580/1251] eta 0:02:45 lr 0.000113 wd 0.0500 time 0.2438 (0.2461) data time 0.0008 (0.0019) model time 0.2429 (0.2424) loss 3.0916 (2.8107) grad_norm 6.1794 (nan) loss_scale 512.0000 (515.5250) mem 7381MB [2024-09-01 06:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][590/1251] eta 0:02:42 lr 0.000113 wd 0.0500 time 0.2389 (0.2460) data time 0.0010 (0.0019) model time 0.2380 (0.2424) loss 2.7506 (2.8069) grad_norm 5.5806 (nan) loss_scale 512.0000 (515.4653) mem 7381MB [2024-09-01 06:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][600/1251] eta 0:02:40 lr 0.000113 wd 0.0500 time 0.2406 (0.2459) data time 0.0008 (0.0019) model time 0.2398 (0.2424) loss 2.1135 (2.8048) grad_norm 7.7314 (nan) loss_scale 512.0000 (515.4077) mem 7381MB [2024-09-01 06:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][610/1251] eta 0:02:37 lr 0.000113 wd 0.0500 time 0.2403 (0.2459) data time 0.0010 (0.0018) model time 0.2393 (0.2423) loss 2.9528 (2.7996) grad_norm 3.3102 (nan) loss_scale 512.0000 (515.3519) mem 7381MB [2024-09-01 06:46:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][620/1251] eta 0:02:35 lr 0.000113 wd 0.0500 time 0.2329 (0.2457) data time 0.0007 (0.0018) model time 0.2322 (0.2422) loss 2.8197 (2.7992) grad_norm 
8.3169 (nan) loss_scale 512.0000 (515.2979) mem 7381MB [2024-09-01 06:46:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][630/1251] eta 0:02:32 lr 0.000113 wd 0.0500 time 0.2328 (0.2457) data time 0.0008 (0.0018) model time 0.2321 (0.2422) loss 2.6628 (2.7971) grad_norm 4.0031 (nan) loss_scale 512.0000 (515.2456) mem 7381MB [2024-09-01 06:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][640/1251] eta 0:02:30 lr 0.000113 wd 0.0500 time 0.2452 (0.2456) data time 0.0007 (0.0018) model time 0.2446 (0.2422) loss 2.4609 (2.7963) grad_norm 4.8171 (nan) loss_scale 512.0000 (515.1950) mem 7381MB [2024-09-01 06:46:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][650/1251] eta 0:02:27 lr 0.000113 wd 0.0500 time 0.2577 (0.2455) data time 0.0011 (0.0018) model time 0.2567 (0.2421) loss 3.3429 (2.7973) grad_norm 3.5867 (nan) loss_scale 512.0000 (515.1459) mem 7381MB [2024-09-01 06:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][660/1251] eta 0:02:25 lr 0.000113 wd 0.0500 time 0.2380 (0.2454) data time 0.0008 (0.0018) model time 0.2372 (0.2421) loss 3.2784 (2.7953) grad_norm 3.8979 (nan) loss_scale 512.0000 (515.0983) mem 7381MB [2024-09-01 06:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][670/1251] eta 0:02:22 lr 0.000113 wd 0.0500 time 0.2348 (0.2453) data time 0.0009 (0.0018) model time 0.2339 (0.2420) loss 3.0624 (2.7940) grad_norm 3.9982 (nan) loss_scale 512.0000 (515.0522) mem 7381MB [2024-09-01 06:46:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][680/1251] eta 0:02:20 lr 0.000113 wd 0.0500 time 0.2430 (0.2453) data time 0.0011 (0.0018) model time 0.2420 (0.2420) loss 2.7235 (2.7935) grad_norm 4.5962 (nan) loss_scale 512.0000 (515.0073) mem 7381MB [2024-09-01 06:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][690/1251] eta 0:02:17 lr 0.000113 wd 0.0500 time 0.2307 (0.2452) data time 0.0010 
(0.0017) model time 0.2297 (0.2419) loss 2.2845 (2.7913) grad_norm 4.1126 (nan) loss_scale 512.0000 (514.9638) mem 7381MB [2024-09-01 06:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][700/1251] eta 0:02:15 lr 0.000113 wd 0.0500 time 0.2323 (0.2451) data time 0.0009 (0.0017) model time 0.2314 (0.2418) loss 3.1870 (2.7951) grad_norm 3.7760 (nan) loss_scale 512.0000 (514.9215) mem 7381MB [2024-09-01 06:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][710/1251] eta 0:02:12 lr 0.000113 wd 0.0500 time 0.2410 (0.2450) data time 0.0009 (0.0017) model time 0.2401 (0.2418) loss 2.9034 (2.7973) grad_norm 4.2662 (nan) loss_scale 512.0000 (514.8805) mem 7381MB [2024-09-01 06:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][720/1251] eta 0:02:10 lr 0.000113 wd 0.0500 time 0.2402 (0.2449) data time 0.0008 (0.0017) model time 0.2394 (0.2417) loss 2.3766 (2.7957) grad_norm 8.5113 (nan) loss_scale 512.0000 (514.8405) mem 7381MB [2024-09-01 06:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][730/1251] eta 0:02:07 lr 0.000113 wd 0.0500 time 0.2441 (0.2449) data time 0.0008 (0.0017) model time 0.2433 (0.2417) loss 3.5946 (2.7957) grad_norm 7.2374 (nan) loss_scale 512.0000 (514.8016) mem 7381MB [2024-09-01 06:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][740/1251] eta 0:02:05 lr 0.000113 wd 0.0500 time 0.2412 (0.2448) data time 0.0010 (0.0017) model time 0.2402 (0.2417) loss 3.0259 (2.7970) grad_norm 5.7084 (nan) loss_scale 512.0000 (514.7638) mem 7381MB [2024-09-01 06:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][750/1251] eta 0:02:02 lr 0.000113 wd 0.0500 time 0.2374 (0.2447) data time 0.0008 (0.0017) model time 0.2366 (0.2416) loss 2.6040 (2.7983) grad_norm 9.9979 (nan) loss_scale 512.0000 (514.7270) mem 7381MB [2024-09-01 06:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][760/1251] eta 
0:02:00 lr 0.000112 wd 0.0500 time 0.2472 (0.2447) data time 0.0010 (0.0017) model time 0.2463 (0.2416) loss 2.9017 (2.8018) grad_norm 3.9711 (nan) loss_scale 512.0000 (514.6912) mem 7381MB [2024-09-01 06:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][770/1251] eta 0:01:57 lr 0.000112 wd 0.0500 time 0.2504 (0.2446) data time 0.0009 (0.0017) model time 0.2495 (0.2416) loss 2.6491 (2.8031) grad_norm 4.6837 (nan) loss_scale 512.0000 (514.6563) mem 7381MB [2024-09-01 06:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][780/1251] eta 0:01:55 lr 0.000112 wd 0.0500 time 0.2373 (0.2446) data time 0.0007 (0.0017) model time 0.2366 (0.2415) loss 3.3287 (2.8025) grad_norm 2.7668 (nan) loss_scale 512.0000 (514.6223) mem 7381MB [2024-09-01 06:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][790/1251] eta 0:01:52 lr 0.000112 wd 0.0500 time 0.2378 (0.2445) data time 0.0009 (0.0016) model time 0.2369 (0.2415) loss 2.1624 (2.8022) grad_norm 3.6142 (nan) loss_scale 512.0000 (514.5891) mem 7381MB [2024-09-01 06:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][800/1251] eta 0:01:50 lr 0.000112 wd 0.0500 time 0.2430 (0.2445) data time 0.0010 (0.0016) model time 0.2419 (0.2415) loss 2.7776 (2.8010) grad_norm 4.2903 (nan) loss_scale 512.0000 (514.5568) mem 7381MB [2024-09-01 06:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][810/1251] eta 0:01:47 lr 0.000112 wd 0.0500 time 0.2140 (0.2446) data time 0.0009 (0.0016) model time 0.2131 (0.2416) loss 3.4319 (2.8030) grad_norm 3.4294 (nan) loss_scale 512.0000 (514.5253) mem 7381MB [2024-09-01 06:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][820/1251] eta 0:01:45 lr 0.000112 wd 0.0500 time 0.2346 (0.2445) data time 0.0008 (0.0016) model time 0.2338 (0.2416) loss 2.5241 (2.8043) grad_norm 2.6751 (nan) loss_scale 512.0000 (514.4945) mem 7381MB [2024-09-01 06:47:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][830/1251] eta 0:01:42 lr 0.000112 wd 0.0500 time 0.2364 (0.2445) data time 0.0010 (0.0016) model time 0.2354 (0.2416) loss 2.6483 (2.8056) grad_norm 3.1165 (nan) loss_scale 512.0000 (514.4645) mem 7381MB [2024-09-01 06:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][840/1251] eta 0:01:40 lr 0.000112 wd 0.0500 time 0.2379 (0.2444) data time 0.0009 (0.0016) model time 0.2370 (0.2415) loss 3.0150 (2.8048) grad_norm 4.7896 (nan) loss_scale 512.0000 (514.4352) mem 7381MB [2024-09-01 06:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][850/1251] eta 0:01:38 lr 0.000112 wd 0.0500 time 0.2392 (0.2444) data time 0.0009 (0.0016) model time 0.2383 (0.2415) loss 3.0170 (2.8055) grad_norm 3.3715 (nan) loss_scale 512.0000 (514.4066) mem 7381MB [2024-09-01 06:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][860/1251] eta 0:01:35 lr 0.000112 wd 0.0500 time 0.2401 (0.2444) data time 0.0009 (0.0016) model time 0.2392 (0.2415) loss 2.7791 (2.8056) grad_norm 5.3021 (nan) loss_scale 512.0000 (514.3786) mem 7381MB [2024-09-01 06:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][870/1251] eta 0:01:33 lr 0.000112 wd 0.0500 time 0.2426 (0.2443) data time 0.0012 (0.0016) model time 0.2414 (0.2415) loss 3.1296 (2.8064) grad_norm 4.1030 (nan) loss_scale 512.0000 (514.3513) mem 7381MB [2024-09-01 06:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][880/1251] eta 0:01:30 lr 0.000112 wd 0.0500 time 0.2395 (0.2443) data time 0.0008 (0.0016) model time 0.2386 (0.2414) loss 2.9529 (2.8048) grad_norm 3.7899 (nan) loss_scale 512.0000 (514.3246) mem 7381MB [2024-09-01 06:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][890/1251] eta 0:01:28 lr 0.000112 wd 0.0500 time 0.2431 (0.2442) data time 0.0009 (0.0016) model time 0.2422 (0.2414) loss 3.3118 (2.8064) grad_norm 3.3939 
(nan) loss_scale 512.0000 (514.2985) mem 7381MB [2024-09-01 06:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][900/1251] eta 0:01:25 lr 0.000112 wd 0.0500 time 0.2423 (0.2442) data time 0.0008 (0.0016) model time 0.2415 (0.2414) loss 2.0502 (2.8058) grad_norm 2.9107 (nan) loss_scale 512.0000 (514.2730) mem 7381MB [2024-09-01 06:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][910/1251] eta 0:01:23 lr 0.000112 wd 0.0500 time 0.2355 (0.2441) data time 0.0008 (0.0016) model time 0.2347 (0.2414) loss 2.6875 (2.8090) grad_norm 3.6068 (nan) loss_scale 512.0000 (514.2481) mem 7381MB [2024-09-01 06:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][920/1251] eta 0:01:20 lr 0.000112 wd 0.0500 time 0.2447 (0.2441) data time 0.0010 (0.0016) model time 0.2437 (0.2414) loss 2.8503 (2.8092) grad_norm 10.4543 (nan) loss_scale 512.0000 (514.2237) mem 7381MB [2024-09-01 06:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][930/1251] eta 0:01:18 lr 0.000112 wd 0.0500 time 0.2422 (0.2441) data time 0.0011 (0.0016) model time 0.2411 (0.2414) loss 2.9865 (2.8117) grad_norm 2.8150 (nan) loss_scale 512.0000 (514.1998) mem 7381MB [2024-09-01 06:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][940/1251] eta 0:01:15 lr 0.000112 wd 0.0500 time 0.2503 (0.2440) data time 0.0007 (0.0015) model time 0.2497 (0.2413) loss 1.9753 (2.8113) grad_norm 3.4843 (nan) loss_scale 512.0000 (514.1764) mem 7381MB [2024-09-01 06:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][950/1251] eta 0:01:13 lr 0.000112 wd 0.0500 time 0.2390 (0.2440) data time 0.0010 (0.0015) model time 0.2380 (0.2413) loss 2.7130 (2.8110) grad_norm 3.5875 (nan) loss_scale 512.0000 (514.1535) mem 7381MB [2024-09-01 06:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][960/1251] eta 0:01:11 lr 0.000112 wd 0.0500 time 0.2432 (0.2440) data time 0.0009 
(0.0015) model time 0.2423 (0.2414) loss 2.8167 (2.8136) grad_norm 5.4275 (nan) loss_scale 512.0000 (514.1311) mem 7381MB [2024-09-01 06:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][970/1251] eta 0:01:08 lr 0.000112 wd 0.0500 time 0.2402 (0.2440) data time 0.0010 (0.0015) model time 0.2392 (0.2413) loss 2.4042 (2.8141) grad_norm 4.2710 (nan) loss_scale 512.0000 (514.1092) mem 7381MB [2024-09-01 06:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][980/1251] eta 0:01:06 lr 0.000112 wd 0.0500 time 0.2427 (0.2440) data time 0.0010 (0.0015) model time 0.2417 (0.2413) loss 1.9725 (2.8132) grad_norm 5.2307 (nan) loss_scale 512.0000 (514.0877) mem 7381MB [2024-09-01 06:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][990/1251] eta 0:01:03 lr 0.000112 wd 0.0500 time 0.2512 (0.2440) data time 0.0008 (0.0015) model time 0.2504 (0.2413) loss 3.0841 (2.8130) grad_norm 10.8197 (nan) loss_scale 512.0000 (514.0666) mem 7381MB [2024-09-01 06:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1000/1251] eta 0:01:01 lr 0.000112 wd 0.0500 time 0.2280 (0.2439) data time 0.0010 (0.0015) model time 0.2270 (0.2413) loss 3.1874 (2.8134) grad_norm 3.7069 (nan) loss_scale 512.0000 (514.0460) mem 7381MB [2024-09-01 06:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1010/1251] eta 0:00:58 lr 0.000112 wd 0.0500 time 0.2423 (0.2439) data time 0.0009 (0.0015) model time 0.2414 (0.2413) loss 2.2079 (2.8152) grad_norm 2.9558 (nan) loss_scale 512.0000 (514.0257) mem 7381MB [2024-09-01 06:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1020/1251] eta 0:00:56 lr 0.000112 wd 0.0500 time 0.2398 (0.2439) data time 0.0010 (0.0015) model time 0.2389 (0.2413) loss 2.3743 (2.8136) grad_norm 3.6132 (nan) loss_scale 512.0000 (514.0059) mem 7381MB [2024-09-01 06:48:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1030/1251] 
eta 0:00:53 lr 0.000112 wd 0.0500 time 0.2352 (0.2438) data time 0.0009 (0.0015) model time 0.2344 (0.2413) loss 3.1651 (2.8131) grad_norm 5.0012 (nan) loss_scale 512.0000 (513.9864) mem 7381MB [2024-09-01 06:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1040/1251] eta 0:00:51 lr 0.000112 wd 0.0500 time 0.2363 (0.2438) data time 0.0011 (0.0015) model time 0.2352 (0.2413) loss 3.1484 (2.8116) grad_norm 4.3419 (nan) loss_scale 512.0000 (513.9673) mem 7381MB [2024-09-01 06:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1050/1251] eta 0:00:49 lr 0.000112 wd 0.0500 time 0.2416 (0.2438) data time 0.0012 (0.0015) model time 0.2404 (0.2413) loss 3.0660 (2.8125) grad_norm 3.7923 (nan) loss_scale 512.0000 (513.9486) mem 7381MB [2024-09-01 06:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1060/1251] eta 0:00:46 lr 0.000112 wd 0.0500 time 0.2353 (0.2438) data time 0.0008 (0.0015) model time 0.2345 (0.2412) loss 3.0102 (2.8128) grad_norm 8.1981 (nan) loss_scale 512.0000 (513.9303) mem 7381MB [2024-09-01 06:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1070/1251] eta 0:00:44 lr 0.000112 wd 0.0500 time 0.2506 (0.2441) data time 0.0009 (0.0015) model time 0.2497 (0.2416) loss 3.3256 (2.8140) grad_norm 4.2766 (nan) loss_scale 512.0000 (513.9122) mem 7381MB [2024-09-01 06:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1080/1251] eta 0:00:41 lr 0.000112 wd 0.0500 time 0.2663 (0.2444) data time 0.0012 (0.0015) model time 0.2651 (0.2419) loss 3.7187 (2.8128) grad_norm 3.6330 (nan) loss_scale 512.0000 (513.8945) mem 7381MB [2024-09-01 06:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1090/1251] eta 0:00:39 lr 0.000112 wd 0.0500 time 0.2425 (0.2444) data time 0.0007 (0.0015) model time 0.2418 (0.2419) loss 2.1230 (2.8139) grad_norm 2.7520 (nan) loss_scale 512.0000 (513.8772) mem 7381MB [2024-09-01 06:48:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1100/1251] eta 0:00:36 lr 0.000112 wd 0.0500 time 0.2378 (0.2443) data time 0.0007 (0.0015) model time 0.2371 (0.2419) loss 3.4131 (2.8132) grad_norm 3.0564 (nan) loss_scale 512.0000 (513.8601) mem 7381MB [2024-09-01 06:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1110/1251] eta 0:00:34 lr 0.000112 wd 0.0500 time 0.2712 (0.2443) data time 0.0011 (0.0015) model time 0.2701 (0.2419) loss 2.8479 (2.8145) grad_norm 4.7426 (nan) loss_scale 512.0000 (513.8434) mem 7381MB [2024-09-01 06:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1120/1251] eta 0:00:31 lr 0.000112 wd 0.0500 time 0.2370 (0.2443) data time 0.0009 (0.0015) model time 0.2362 (0.2418) loss 3.5020 (2.8156) grad_norm 4.0485 (nan) loss_scale 512.0000 (513.8269) mem 7381MB [2024-09-01 06:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1130/1251] eta 0:00:29 lr 0.000111 wd 0.0500 time 0.2332 (0.2442) data time 0.0011 (0.0015) model time 0.2322 (0.2418) loss 2.7154 (2.8155) grad_norm 7.0929 (nan) loss_scale 512.0000 (513.8108) mem 7381MB [2024-09-01 06:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1140/1251] eta 0:00:27 lr 0.000111 wd 0.0500 time 0.2425 (0.2442) data time 0.0008 (0.0015) model time 0.2418 (0.2418) loss 2.2083 (2.8163) grad_norm 3.6411 (nan) loss_scale 512.0000 (513.7949) mem 7381MB [2024-09-01 06:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1150/1251] eta 0:00:24 lr 0.000111 wd 0.0500 time 0.2372 (0.2442) data time 0.0009 (0.0015) model time 0.2364 (0.2418) loss 3.2253 (2.8178) grad_norm 2.6981 (nan) loss_scale 512.0000 (513.7793) mem 7381MB [2024-09-01 06:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1160/1251] eta 0:00:22 lr 0.000111 wd 0.0500 time 0.2403 (0.2442) data time 0.0010 (0.0015) model time 0.2393 (0.2418) loss 3.5125 (2.8198) grad_norm 
3.5654 (nan) loss_scale 512.0000 (513.7640) mem 7381MB [2024-09-01 06:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1170/1251] eta 0:00:19 lr 0.000111 wd 0.0500 time 0.2513 (0.2441) data time 0.0010 (0.0015) model time 0.2503 (0.2418) loss 3.2299 (2.8191) grad_norm 3.9873 (nan) loss_scale 512.0000 (513.7489) mem 7381MB [2024-09-01 06:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1180/1251] eta 0:00:17 lr 0.000111 wd 0.0500 time 0.2463 (0.2441) data time 0.0009 (0.0015) model time 0.2454 (0.2418) loss 2.7386 (2.8189) grad_norm 3.7877 (nan) loss_scale 512.0000 (513.7341) mem 7381MB [2024-09-01 06:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1190/1251] eta 0:00:14 lr 0.000111 wd 0.0500 time 0.2409 (0.2441) data time 0.0007 (0.0014) model time 0.2402 (0.2418) loss 3.4005 (2.8186) grad_norm 3.1842 (nan) loss_scale 512.0000 (513.7196) mem 7381MB [2024-09-01 06:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1200/1251] eta 0:00:12 lr 0.000111 wd 0.0500 time 0.2429 (0.2440) data time 0.0007 (0.0014) model time 0.2422 (0.2417) loss 3.4392 (2.8196) grad_norm 3.3002 (nan) loss_scale 512.0000 (513.7052) mem 7381MB [2024-09-01 06:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1210/1251] eta 0:00:10 lr 0.000111 wd 0.0500 time 0.2408 (0.2440) data time 0.0007 (0.0014) model time 0.2401 (0.2417) loss 2.6606 (2.8193) grad_norm 2.9205 (nan) loss_scale 512.0000 (513.6912) mem 7381MB [2024-09-01 06:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1220/1251] eta 0:00:07 lr 0.000111 wd 0.0500 time 0.2418 (0.2440) data time 0.0009 (0.0014) model time 0.2409 (0.2417) loss 2.9484 (2.8187) grad_norm 4.1583 (nan) loss_scale 512.0000 (513.6773) mem 7381MB [2024-09-01 06:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1230/1251] eta 0:00:05 lr 0.000111 wd 0.0500 time 0.2413 (0.2440) data time 
0.0009 (0.0014) model time 0.2403 (0.2417) loss 2.5595 (2.8165) grad_norm 2.8938 (nan) loss_scale 512.0000 (513.6637) mem 7381MB [2024-09-01 06:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1240/1251] eta 0:00:02 lr 0.000111 wd 0.0500 time 0.2264 (0.2439) data time 0.0005 (0.0014) model time 0.2259 (0.2416) loss 3.1264 (2.8159) grad_norm 4.3906 (nan) loss_scale 256.0000 (511.5874) mem 7381MB [2024-09-01 06:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [241/300][1250/1251] eta 0:00:00 lr 0.000111 wd 0.0500 time 0.2236 (0.2438) data time 0.0005 (0.0014) model time 0.2231 (0.2415) loss 3.5941 (2.8182) grad_norm 4.1878 (nan) loss_scale 256.0000 (509.5444) mem 7381MB [2024-09-01 06:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 241 training takes 0:05:04 [2024-09-01 06:49:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:49:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 06:49:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.479 (0.479) Loss 0.3933 (0.3933) Acc@1 93.066 (93.066) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 06:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.113) Loss 0.6304 (0.6442) Acc@1 87.793 (86.976) Acc@5 97.949 (97.656) Mem 7381MB [2024-09-01 06:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.097) Loss 0.9707 (0.6772) Acc@1 77.051 (85.775) Acc@5 95.215 (97.470) Mem 7381MB [2024-09-01 06:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.091) Loss 1.2178 (0.7669) Acc@1 71.191 (83.562) Acc@5 91.797 (96.532) Mem 7381MB [2024-09-01 06:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0869 (0.8136) Acc@1 75.879 (82.446) Acc@5 92.871 (96.010) Mem 7381MB [2024-09-01 06:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.126 Acc@5 95.940 [2024-09-01 06:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-09-01 06:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.13% [2024-09-01 06:49:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 06:49:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 06:49:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.459 (0.459) Loss 0.3806 (0.3806) Acc@1 93.457 (93.457) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.114) Loss 0.5669 (0.6003) Acc@1 89.648 (87.562) Acc@5 98.047 (97.772) Mem 7381MB [2024-09-01 06:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.097) Loss 0.8887 (0.6303) Acc@1 77.637 (86.393) Acc@5 95.898 (97.684) Mem 7381MB [2024-09-01 06:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.090 (0.091) Loss 1.0908 (0.7149) Acc@1 74.121 (84.299) Acc@5 93.457 (96.834) Mem 7381MB [2024-09-01 06:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 0.9995 (0.7602) Acc@1 76.367 (83.217) Acc@5 94.141 (96.351) Mem 7381MB [2024-09-01 06:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.828 Acc@5 96.326 [2024-09-01 06:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 06:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][0/1251] eta 0:24:12 lr 0.000111 wd 0.0500 time 1.1614 (1.1614) data time 0.8282 (0.8282) model time 0.0000 (0.0000) loss 3.7256 (3.7256) grad_norm 3.6515 (3.6515) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][10/1251] eta 0:06:41 lr 0.000111 wd 0.0500 time 0.2445 (0.3233) data time 0.0007 (0.0761) model time 0.0000 (0.0000) loss 2.8983 (3.0921) grad_norm 3.0826 (3.9349) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][20/1251] eta 0:05:51 lr 0.000111 wd 0.0500 time 0.2526 (0.2857) data time 0.0010 (0.0404) model time 0.0000 (0.0000) loss 2.1258 (2.9171) grad_norm 3.3877 (4.0727) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][30/1251] eta 0:05:39 lr 0.000111 wd 0.0500 time 0.2440 (0.2782) data time 0.0009 (0.0277) model time 0.0000 (0.0000) loss 3.1088 (2.8817) grad_norm 3.0547 (4.2907) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][40/1251] eta 0:05:26 lr 0.000111 wd 0.0500 time 0.2438 (0.2697) data time 0.0008 (0.0212) model time 0.0000 (0.0000) loss 3.6413 (2.9468) grad_norm 3.5856 (4.1555) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][50/1251] eta 0:05:16 lr 0.000111 wd 0.0500 time 0.2407 (0.2636) data time 0.0010 (0.0172) model time 0.0000 (0.0000) loss 1.8521 (2.8653) grad_norm 3.3213 (4.1746) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][60/1251] eta 0:05:09 lr 0.000111 wd 0.0500 time 0.2393 (0.2600) data time 0.0009 (0.0146) model time 0.2383 (0.2406) loss 2.9142 (2.8361) grad_norm 3.0054 (4.1376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][70/1251] eta 0:05:03 lr 0.000111 wd 0.0500 time 0.2458 (0.2571) data time 0.0010 (0.0127) model time 0.2449 (0.2397) loss 1.9337 (2.7962) grad_norm 4.2135 (4.1027) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][80/1251] eta 0:04:59 lr 0.000111 wd 0.0500 time 0.2484 (0.2557) data time 0.0008 (0.0112) model time 0.2476 (0.2413) loss 2.1431 (2.8234) grad_norm 3.3791 (4.1044) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][90/1251] eta 0:04:55 lr 0.000111 wd 0.0500 time 0.2390 (0.2542) data time 0.0008 
(0.0101) model time 0.2382 (0.2412) loss 3.4010 (2.8597) grad_norm 5.5884 (4.2052) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][100/1251] eta 0:04:51 lr 0.000111 wd 0.0500 time 0.2519 (0.2531) data time 0.0007 (0.0092) model time 0.2512 (0.2414) loss 2.6269 (2.8378) grad_norm 9.8569 (4.2860) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][110/1251] eta 0:04:47 lr 0.000111 wd 0.0500 time 0.2457 (0.2520) data time 0.0009 (0.0084) model time 0.2448 (0.2412) loss 3.6298 (2.8461) grad_norm 5.4764 (4.2448) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][120/1251] eta 0:04:43 lr 0.000111 wd 0.0500 time 0.2372 (0.2510) data time 0.0007 (0.0078) model time 0.2365 (0.2409) loss 3.0850 (2.8337) grad_norm 4.0301 (4.2218) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][130/1251] eta 0:04:40 lr 0.000111 wd 0.0500 time 0.2363 (0.2501) data time 0.0007 (0.0073) model time 0.2356 (0.2406) loss 3.2233 (2.8385) grad_norm 3.7938 (4.2164) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][140/1251] eta 0:04:36 lr 0.000111 wd 0.0500 time 0.2346 (0.2491) data time 0.0011 (0.0069) model time 0.2335 (0.2399) loss 2.3885 (2.8599) grad_norm 12.9174 (4.3396) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][150/1251] eta 0:04:47 lr 0.000111 wd 0.0500 time 0.2418 (0.2613) data time 0.0010 (0.0065) model time 0.2407 (0.2591) loss 2.9482 (2.8362) grad_norm 2.9646 (4.3790) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[242/300][160/1251] eta 0:04:43 lr 0.000111 wd 0.0500 time 0.2413 (0.2601) data time 0.0008 (0.0061) model time 0.2405 (0.2575) loss 3.2936 (2.8346) grad_norm 3.7606 (4.3874) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][170/1251] eta 0:04:40 lr 0.000111 wd 0.0500 time 0.2443 (0.2591) data time 0.0009 (0.0058) model time 0.2434 (0.2562) loss 3.2276 (2.8434) grad_norm 7.3741 (4.3800) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][180/1251] eta 0:04:36 lr 0.000111 wd 0.0500 time 0.2402 (0.2582) data time 0.0009 (0.0056) model time 0.2393 (0.2551) loss 2.6416 (2.8278) grad_norm 6.1411 (4.3919) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][190/1251] eta 0:04:33 lr 0.000111 wd 0.0500 time 0.2370 (0.2574) data time 0.0013 (0.0053) model time 0.2357 (0.2542) loss 2.8942 (2.8241) grad_norm 4.3248 (4.3967) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][200/1251] eta 0:04:29 lr 0.000111 wd 0.0500 time 0.2390 (0.2566) data time 0.0012 (0.0051) model time 0.2379 (0.2532) loss 3.0626 (2.8386) grad_norm 4.4548 (4.4223) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][210/1251] eta 0:04:26 lr 0.000111 wd 0.0500 time 0.2385 (0.2559) data time 0.0009 (0.0049) model time 0.2376 (0.2524) loss 2.8403 (2.8382) grad_norm 4.2250 (4.4123) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][220/1251] eta 0:04:23 lr 0.000111 wd 0.0500 time 0.2473 (0.2552) data time 0.0010 (0.0047) model time 0.2464 (0.2517) loss 2.6797 (2.8401) grad_norm 3.9748 (4.4075) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 06:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][230/1251] eta 0:04:19 lr 0.000111 wd 0.0500 time 0.2447 (0.2546) data time 0.0010 (0.0046) model time 0.2438 (0.2511) loss 2.5381 (2.8304) grad_norm 3.3204 (4.4043) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][240/1251] eta 0:04:16 lr 0.000111 wd 0.0500 time 0.2357 (0.2539) data time 0.0009 (0.0044) model time 0.2348 (0.2504) loss 2.6484 (2.8300) grad_norm 3.8572 (4.4013) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][250/1251] eta 0:04:13 lr 0.000110 wd 0.0500 time 0.2285 (0.2533) data time 0.0010 (0.0043) model time 0.2275 (0.2497) loss 3.1783 (2.8316) grad_norm 4.0876 (4.3804) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][260/1251] eta 0:04:10 lr 0.000110 wd 0.0500 time 0.2433 (0.2533) data time 0.0012 (0.0042) model time 0.2421 (0.2498) loss 2.3397 (2.8235) grad_norm 5.4997 (4.3694) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][270/1251] eta 0:04:08 lr 0.000110 wd 0.0500 time 0.2410 (0.2529) data time 0.0007 (0.0040) model time 0.2403 (0.2494) loss 1.7853 (2.8220) grad_norm 3.9989 (4.3848) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][280/1251] eta 0:04:05 lr 0.000110 wd 0.0500 time 0.2454 (0.2524) data time 0.0009 (0.0039) model time 0.2445 (0.2490) loss 1.9204 (2.8270) grad_norm 3.7054 (4.3806) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][290/1251] eta 0:04:02 lr 0.000110 wd 0.0500 time 0.2323 (0.2520) data time 0.0008 (0.0038) model time 0.2315 
(0.2486) loss 2.1008 (2.8199) grad_norm 3.6926 (4.3602) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][300/1251] eta 0:03:59 lr 0.000110 wd 0.0500 time 0.2378 (0.2517) data time 0.0008 (0.0037) model time 0.2370 (0.2483) loss 3.3262 (2.8108) grad_norm 3.0674 (4.3353) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][310/1251] eta 0:03:56 lr 0.000110 wd 0.0500 time 0.2418 (0.2514) data time 0.0009 (0.0037) model time 0.2409 (0.2480) loss 2.9542 (2.8127) grad_norm 4.4869 (4.3255) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][320/1251] eta 0:03:53 lr 0.000110 wd 0.0500 time 0.2468 (0.2510) data time 0.0008 (0.0036) model time 0.2461 (0.2477) loss 3.4820 (2.8093) grad_norm 3.1754 (4.3147) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][330/1251] eta 0:03:51 lr 0.000110 wd 0.0500 time 0.4003 (0.2512) data time 0.0009 (0.0035) model time 0.3994 (0.2479) loss 3.4378 (2.8098) grad_norm 3.7736 (4.3105) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][340/1251] eta 0:03:49 lr 0.000110 wd 0.0500 time 0.2388 (0.2519) data time 0.0009 (0.0034) model time 0.2378 (0.2488) loss 3.1694 (2.8120) grad_norm 6.4007 (4.3023) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][350/1251] eta 0:03:47 lr 0.000110 wd 0.0500 time 0.2451 (0.2520) data time 0.0010 (0.0033) model time 0.2442 (0.2490) loss 2.9562 (2.8140) grad_norm 4.4981 (4.3095) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][360/1251] eta 0:03:44 
lr 0.000110 wd 0.0500 time 0.2414 (0.2517) data time 0.0009 (0.0033) model time 0.2405 (0.2488) loss 3.0042 (2.8204) grad_norm 5.4103 (4.3182) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][370/1251] eta 0:03:41 lr 0.000110 wd 0.0500 time 0.2357 (0.2514) data time 0.0009 (0.0032) model time 0.2347 (0.2484) loss 2.3714 (2.8235) grad_norm 3.6312 (4.3229) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][380/1251] eta 0:03:38 lr 0.000110 wd 0.0500 time 0.2303 (0.2511) data time 0.0010 (0.0032) model time 0.2293 (0.2481) loss 2.4652 (2.8189) grad_norm 4.0753 (4.3138) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][390/1251] eta 0:03:36 lr 0.000110 wd 0.0500 time 0.2434 (0.2509) data time 0.0007 (0.0031) model time 0.2427 (0.2480) loss 2.1795 (2.8061) grad_norm 3.7015 (4.3252) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][400/1251] eta 0:03:33 lr 0.000110 wd 0.0500 time 0.2402 (0.2506) data time 0.0008 (0.0031) model time 0.2394 (0.2477) loss 3.6497 (2.8064) grad_norm 6.1284 (4.3156) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][410/1251] eta 0:03:30 lr 0.000110 wd 0.0500 time 0.2392 (0.2503) data time 0.0010 (0.0030) model time 0.2382 (0.2475) loss 3.1112 (2.8036) grad_norm 4.2720 (4.3114) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][420/1251] eta 0:03:27 lr 0.000110 wd 0.0500 time 0.2460 (0.2501) data time 0.0010 (0.0030) model time 0.2450 (0.2473) loss 2.9860 (2.8119) grad_norm 4.5925 (4.3124) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:04 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][430/1251] eta 0:03:25 lr 0.000110 wd 0.0500 time 0.2429 (0.2499) data time 0.0011 (0.0029) model time 0.2419 (0.2471) loss 2.5415 (2.8165) grad_norm 3.8191 (4.3430) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][440/1251] eta 0:03:22 lr 0.000110 wd 0.0500 time 0.2374 (0.2497) data time 0.0010 (0.0029) model time 0.2364 (0.2469) loss 3.4325 (2.8181) grad_norm 5.6575 (4.3385) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][450/1251] eta 0:03:19 lr 0.000110 wd 0.0500 time 0.2405 (0.2495) data time 0.0008 (0.0028) model time 0.2397 (0.2468) loss 3.1280 (2.8178) grad_norm 5.8457 (4.3450) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][460/1251] eta 0:03:17 lr 0.000110 wd 0.0500 time 0.2373 (0.2494) data time 0.0009 (0.0028) model time 0.2364 (0.2466) loss 3.2357 (2.8193) grad_norm 5.4825 (4.3397) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][470/1251] eta 0:03:14 lr 0.000110 wd 0.0500 time 0.2398 (0.2491) data time 0.0007 (0.0028) model time 0.2391 (0.2464) loss 3.0411 (2.8181) grad_norm 5.4083 (4.3336) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][480/1251] eta 0:03:11 lr 0.000110 wd 0.0500 time 0.2407 (0.2490) data time 0.0009 (0.0027) model time 0.2398 (0.2462) loss 2.4091 (2.8208) grad_norm 4.7615 (4.3511) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][490/1251] eta 0:03:09 lr 0.000110 wd 0.0500 time 0.2437 (0.2489) data time 0.0008 (0.0027) model time 0.2428 (0.2462) loss 3.2504 (2.8196) 
grad_norm 3.3836 (4.3493) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][500/1251] eta 0:03:06 lr 0.000110 wd 0.0500 time 0.2375 (0.2487) data time 0.0009 (0.0026) model time 0.2366 (0.2461) loss 3.0688 (2.8178) grad_norm 3.6560 (4.3437) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][510/1251] eta 0:03:04 lr 0.000110 wd 0.0500 time 0.2398 (0.2486) data time 0.0007 (0.0026) model time 0.2390 (0.2460) loss 1.7441 (2.8096) grad_norm 4.8752 (4.3396) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][520/1251] eta 0:03:01 lr 0.000110 wd 0.0500 time 0.2431 (0.2485) data time 0.0010 (0.0026) model time 0.2421 (0.2459) loss 2.9447 (2.8109) grad_norm 4.2806 (4.3660) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][530/1251] eta 0:02:59 lr 0.000110 wd 0.0500 time 0.2379 (0.2484) data time 0.0009 (0.0026) model time 0.2370 (0.2458) loss 2.8916 (2.8116) grad_norm 3.5628 (4.3595) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][540/1251] eta 0:02:56 lr 0.000110 wd 0.0500 time 0.2354 (0.2483) data time 0.0009 (0.0025) model time 0.2345 (0.2457) loss 2.8327 (2.8122) grad_norm 4.3216 (4.3734) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][550/1251] eta 0:02:53 lr 0.000110 wd 0.0500 time 0.2353 (0.2481) data time 0.0009 (0.0025) model time 0.2343 (0.2456) loss 2.3707 (2.8078) grad_norm 5.3309 (4.3696) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][560/1251] eta 0:02:51 lr 0.000110 wd 0.0500 time 
0.2442 (0.2480) data time 0.0009 (0.0025) model time 0.2433 (0.2455) loss 3.1971 (2.8052) grad_norm 3.7580 (4.3928) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][570/1251] eta 0:02:48 lr 0.000110 wd 0.0500 time 0.2457 (0.2479) data time 0.0007 (0.0024) model time 0.2450 (0.2454) loss 1.8804 (2.8053) grad_norm 3.9520 (4.4235) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][580/1251] eta 0:02:46 lr 0.000110 wd 0.0500 time 0.2387 (0.2478) data time 0.0008 (0.0024) model time 0.2379 (0.2452) loss 2.3261 (2.8011) grad_norm 3.3123 (4.4347) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][590/1251] eta 0:02:43 lr 0.000110 wd 0.0500 time 0.2363 (0.2476) data time 0.0010 (0.0024) model time 0.2353 (0.2451) loss 3.1700 (2.8025) grad_norm 3.3049 (4.4264) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][600/1251] eta 0:02:41 lr 0.000110 wd 0.0500 time 0.2464 (0.2475) data time 0.0009 (0.0024) model time 0.2456 (0.2451) loss 3.6632 (2.8027) grad_norm 3.5385 (4.4260) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][610/1251] eta 0:02:38 lr 0.000110 wd 0.0500 time 0.2478 (0.2475) data time 0.0009 (0.0023) model time 0.2469 (0.2450) loss 2.8422 (2.8024) grad_norm 6.9482 (4.4218) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][620/1251] eta 0:02:36 lr 0.000110 wd 0.0500 time 0.2413 (0.2474) data time 0.0012 (0.0023) model time 0.2402 (0.2449) loss 2.0626 (2.8020) grad_norm 3.7327 (4.5004) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:52 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [242/300][630/1251] eta 0:02:33 lr 0.000109 wd 0.0500 time 0.2428 (0.2473) data time 0.0007 (0.0023) model time 0.2420 (0.2449) loss 1.9216 (2.7972) grad_norm 4.4992 (4.4855) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][640/1251] eta 0:02:31 lr 0.000109 wd 0.0500 time 0.2501 (0.2472) data time 0.0009 (0.0023) model time 0.2492 (0.2448) loss 1.7610 (2.7925) grad_norm 3.3024 (4.4823) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][650/1251] eta 0:02:28 lr 0.000109 wd 0.0500 time 0.2357 (0.2471) data time 0.0009 (0.0023) model time 0.2348 (0.2447) loss 2.0453 (2.7909) grad_norm 3.9513 (4.4877) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][660/1251] eta 0:02:26 lr 0.000109 wd 0.0500 time 0.2447 (0.2470) data time 0.0008 (0.0022) model time 0.2439 (0.2447) loss 3.1242 (2.7917) grad_norm 3.8668 (4.5434) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][670/1251] eta 0:02:23 lr 0.000109 wd 0.0500 time 0.2417 (0.2470) data time 0.0010 (0.0022) model time 0.2407 (0.2446) loss 2.6187 (2.7923) grad_norm 6.2235 (4.5358) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][680/1251] eta 0:02:20 lr 0.000109 wd 0.0500 time 0.2479 (0.2469) data time 0.0010 (0.0022) model time 0.2470 (0.2446) loss 2.3763 (2.7895) grad_norm 4.8441 (4.5301) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][690/1251] eta 0:02:18 lr 0.000109 wd 0.0500 time 0.2427 (0.2468) data time 0.0007 (0.0022) model time 0.2419 (0.2445) loss 2.9450 (2.7884) grad_norm 9.4163 
(4.5421) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][700/1251] eta 0:02:15 lr 0.000109 wd 0.0500 time 0.2431 (0.2467) data time 0.0011 (0.0022) model time 0.2421 (0.2444) loss 2.4925 (2.7899) grad_norm 3.5348 (4.5364) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][710/1251] eta 0:02:13 lr 0.000109 wd 0.0500 time 0.2454 (0.2466) data time 0.0010 (0.0022) model time 0.2444 (0.2443) loss 2.5369 (2.7909) grad_norm 4.0619 (4.5261) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][720/1251] eta 0:02:10 lr 0.000109 wd 0.0500 time 0.2433 (0.2465) data time 0.0008 (0.0021) model time 0.2425 (0.2443) loss 2.9998 (2.7909) grad_norm 3.9637 (4.5230) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][730/1251] eta 0:02:08 lr 0.000109 wd 0.0500 time 0.2507 (0.2465) data time 0.0008 (0.0021) model time 0.2500 (0.2442) loss 2.4104 (2.7901) grad_norm 4.1739 (4.5176) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][740/1251] eta 0:02:05 lr 0.000109 wd 0.0500 time 0.2414 (0.2464) data time 0.0010 (0.0021) model time 0.2404 (0.2442) loss 2.6825 (2.7901) grad_norm 4.0743 (4.5226) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][750/1251] eta 0:02:03 lr 0.000109 wd 0.0500 time 0.2428 (0.2464) data time 0.0007 (0.0021) model time 0.2421 (0.2441) loss 3.3029 (2.7931) grad_norm 4.0744 (4.5290) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][760/1251] eta 0:02:00 lr 0.000109 wd 0.0500 time 0.2453 (0.2463) data 
time 0.0010 (0.0021) model time 0.2443 (0.2441) loss 2.6516 (2.7923) grad_norm 4.8163 (4.5177) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][770/1251] eta 0:01:58 lr 0.000109 wd 0.0500 time 0.2449 (0.2463) data time 0.0007 (0.0021) model time 0.2442 (0.2440) loss 2.7987 (2.7937) grad_norm 4.1914 (4.5075) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][780/1251] eta 0:01:55 lr 0.000109 wd 0.0500 time 0.2385 (0.2462) data time 0.0008 (0.0021) model time 0.2377 (0.2440) loss 2.3904 (2.7937) grad_norm 17.6913 (4.5173) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][790/1251] eta 0:01:53 lr 0.000109 wd 0.0500 time 0.2363 (0.2461) data time 0.0009 (0.0020) model time 0.2354 (0.2439) loss 2.6879 (2.7953) grad_norm 4.4323 (4.5114) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][800/1251] eta 0:01:50 lr 0.000109 wd 0.0500 time 0.2378 (0.2460) data time 0.0010 (0.0020) model time 0.2367 (0.2439) loss 3.0820 (2.7935) grad_norm 70.6617 (4.5973) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][810/1251] eta 0:01:48 lr 0.000109 wd 0.0500 time 0.2364 (0.2460) data time 0.0009 (0.0020) model time 0.2356 (0.2438) loss 1.8254 (2.7912) grad_norm 4.1748 (4.5909) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][820/1251] eta 0:01:45 lr 0.000109 wd 0.0500 time 0.2435 (0.2459) data time 0.0007 (0.0020) model time 0.2428 (0.2437) loss 3.3820 (2.7888) grad_norm 3.8365 (4.6012) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [242/300][830/1251] eta 0:01:43 lr 0.000109 wd 0.0500 time 0.2352 (0.2458) data time 0.0007 (0.0020) model time 0.2344 (0.2437) loss 1.9711 (2.7903) grad_norm 4.6221 (4.6032) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][840/1251] eta 0:01:41 lr 0.000109 wd 0.0500 time 0.2456 (0.2458) data time 0.0007 (0.0020) model time 0.2449 (0.2436) loss 3.3933 (2.7920) grad_norm 5.6511 (4.6017) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][850/1251] eta 0:01:38 lr 0.000109 wd 0.0500 time 0.2357 (0.2457) data time 0.0008 (0.0020) model time 0.2349 (0.2436) loss 3.3122 (2.7932) grad_norm 4.2195 (4.5956) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][860/1251] eta 0:01:36 lr 0.000109 wd 0.0500 time 0.2399 (0.2459) data time 0.0013 (0.0020) model time 0.2387 (0.2438) loss 1.7857 (2.7933) grad_norm 3.3468 (4.5933) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][870/1251] eta 0:01:33 lr 0.000109 wd 0.0500 time 0.2341 (0.2466) data time 0.0009 (0.0020) model time 0.2331 (0.2445) loss 2.2165 (2.7922) grad_norm 3.0158 (4.5849) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][880/1251] eta 0:01:31 lr 0.000109 wd 0.0500 time 0.2343 (0.2465) data time 0.0007 (0.0019) model time 0.2336 (0.2445) loss 2.3422 (2.7934) grad_norm 3.4482 (4.5746) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][890/1251] eta 0:01:28 lr 0.000109 wd 0.0500 time 0.2407 (0.2464) data time 0.0007 (0.0019) model time 0.2400 (0.2444) loss 3.2121 (2.7931) grad_norm 3.9404 (4.5634) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 06:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][900/1251] eta 0:01:26 lr 0.000109 wd 0.0500 time 0.2428 (0.2463) data time 0.0009 (0.0019) model time 0.2419 (0.2443) loss 3.2107 (2.7936) grad_norm 3.4609 (4.5573) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][910/1251] eta 0:01:23 lr 0.000109 wd 0.0500 time 0.2372 (0.2463) data time 0.0010 (0.0019) model time 0.2361 (0.2443) loss 1.9810 (2.7907) grad_norm 3.6247 (4.5536) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][920/1251] eta 0:01:21 lr 0.000109 wd 0.0500 time 0.2411 (0.2463) data time 0.0008 (0.0019) model time 0.2403 (0.2443) loss 2.8064 (2.7907) grad_norm 2.5756 (4.5478) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][930/1251] eta 0:01:19 lr 0.000109 wd 0.0500 time 0.2516 (0.2463) data time 0.0009 (0.0019) model time 0.2507 (0.2443) loss 2.7577 (2.7917) grad_norm 3.8836 (4.5429) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][940/1251] eta 0:01:16 lr 0.000109 wd 0.0500 time 0.2321 (0.2462) data time 0.0008 (0.0019) model time 0.2313 (0.2442) loss 3.0724 (2.7908) grad_norm 3.9412 (4.5325) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][950/1251] eta 0:01:14 lr 0.000109 wd 0.0500 time 0.2378 (0.2462) data time 0.0008 (0.0019) model time 0.2370 (0.2442) loss 1.7920 (2.7873) grad_norm 3.3419 (4.5281) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][960/1251] eta 0:01:11 lr 0.000109 wd 0.0500 time 0.2466 (0.2463) data time 0.0009 (0.0019) model 
time 0.2456 (0.2444) loss 3.4695 (2.7901) grad_norm 4.2467 (4.5526) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][970/1251] eta 0:01:09 lr 0.000109 wd 0.0500 time 0.2428 (0.2463) data time 0.0010 (0.0019) model time 0.2418 (0.2443) loss 3.0583 (2.7907) grad_norm 6.2815 (4.5488) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][980/1251] eta 0:01:06 lr 0.000109 wd 0.0500 time 0.2456 (0.2462) data time 0.0009 (0.0019) model time 0.2447 (0.2443) loss 3.2302 (2.7932) grad_norm 3.0964 (4.5575) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][990/1251] eta 0:01:04 lr 0.000109 wd 0.0500 time 0.2450 (0.2462) data time 0.0007 (0.0018) model time 0.2443 (0.2443) loss 3.0708 (2.7960) grad_norm 3.8656 (4.5590) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1000/1251] eta 0:01:01 lr 0.000108 wd 0.0500 time 0.2401 (0.2462) data time 0.0007 (0.0018) model time 0.2394 (0.2443) loss 3.2867 (2.7963) grad_norm 3.8592 (4.5495) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1010/1251] eta 0:00:59 lr 0.000108 wd 0.0500 time 0.2321 (0.2461) data time 0.0010 (0.0018) model time 0.2311 (0.2442) loss 3.0393 (2.7981) grad_norm 2.4083 (4.5517) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1020/1251] eta 0:00:56 lr 0.000108 wd 0.0500 time 0.2416 (0.2461) data time 0.0010 (0.0018) model time 0.2406 (0.2442) loss 3.4449 (2.7986) grad_norm 3.3988 (4.5427) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[242/300][1030/1251] eta 0:00:54 lr 0.000108 wd 0.0500 time 0.2385 (0.2460) data time 0.0008 (0.0018) model time 0.2377 (0.2441) loss 3.0215 (2.7995) grad_norm 3.7691 (4.5372) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1040/1251] eta 0:00:51 lr 0.000108 wd 0.0500 time 0.2431 (0.2460) data time 0.0007 (0.0018) model time 0.2424 (0.2441) loss 3.2785 (2.8009) grad_norm 5.2695 (4.5322) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1050/1251] eta 0:00:49 lr 0.000108 wd 0.0500 time 0.2438 (0.2460) data time 0.0009 (0.0018) model time 0.2429 (0.2441) loss 2.5405 (2.8006) grad_norm 3.0652 (4.5239) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1060/1251] eta 0:00:46 lr 0.000108 wd 0.0500 time 0.2459 (0.2459) data time 0.0009 (0.0018) model time 0.2450 (0.2440) loss 3.0905 (2.8022) grad_norm 3.4227 (4.5201) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1070/1251] eta 0:00:44 lr 0.000108 wd 0.0500 time 0.2444 (0.2459) data time 0.0010 (0.0018) model time 0.2434 (0.2440) loss 3.2682 (2.8047) grad_norm 3.5163 (4.5186) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1080/1251] eta 0:00:42 lr 0.000108 wd 0.0500 time 0.2386 (0.2458) data time 0.0011 (0.0018) model time 0.2375 (0.2439) loss 2.8472 (2.8044) grad_norm 4.3165 (4.5117) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1090/1251] eta 0:00:39 lr 0.000108 wd 0.0500 time 0.2425 (0.2457) data time 0.0010 (0.0018) model time 0.2415 (0.2438) loss 2.7032 (2.8052) grad_norm 4.4757 (4.5074) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 06:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1100/1251] eta 0:00:37 lr 0.000108 wd 0.0500 time 0.2357 (0.2457) data time 0.0008 (0.0018) model time 0.2348 (0.2438) loss 3.2689 (2.8046) grad_norm 6.8479 (4.5110) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1110/1251] eta 0:00:34 lr 0.000108 wd 0.0500 time 0.2440 (0.2456) data time 0.0009 (0.0018) model time 0.2431 (0.2437) loss 3.4033 (2.8045) grad_norm 4.4988 (4.5139) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1120/1251] eta 0:00:32 lr 0.000108 wd 0.0500 time 0.2324 (0.2456) data time 0.0009 (0.0018) model time 0.2315 (0.2437) loss 2.9818 (2.8049) grad_norm 4.2842 (4.5085) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1130/1251] eta 0:00:29 lr 0.000108 wd 0.0500 time 0.2497 (0.2455) data time 0.0009 (0.0017) model time 0.2488 (0.2437) loss 3.2957 (2.8073) grad_norm 6.3932 (4.5195) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1140/1251] eta 0:00:27 lr 0.000108 wd 0.0500 time 0.2340 (0.2455) data time 0.0011 (0.0017) model time 0.2329 (0.2436) loss 3.1281 (2.8077) grad_norm 5.4900 (4.5242) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1150/1251] eta 0:00:24 lr 0.000108 wd 0.0500 time 0.2425 (0.2455) data time 0.0010 (0.0017) model time 0.2416 (0.2436) loss 2.3951 (2.8079) grad_norm 4.1146 (4.5219) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1160/1251] eta 0:00:22 lr 0.000108 wd 0.0500 time 0.2342 (0.2454) data time 0.0008 (0.0017) 
model time 0.2334 (0.2436) loss 2.3911 (2.8057) grad_norm 3.4357 (4.5220) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1170/1251] eta 0:00:19 lr 0.000108 wd 0.0500 time 0.2404 (0.2454) data time 0.0009 (0.0017) model time 0.2394 (0.2435) loss 2.9214 (2.8065) grad_norm 11.4940 (4.5496) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1180/1251] eta 0:00:17 lr 0.000108 wd 0.0500 time 0.2363 (0.2453) data time 0.0010 (0.0017) model time 0.2353 (0.2435) loss 2.9798 (2.8083) grad_norm 4.1670 (4.5511) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1190/1251] eta 0:00:14 lr 0.000108 wd 0.0500 time 0.2628 (0.2453) data time 0.0009 (0.0017) model time 0.2619 (0.2435) loss 2.3657 (2.8080) grad_norm 4.8603 (4.5523) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1200/1251] eta 0:00:12 lr 0.000108 wd 0.0500 time 0.2397 (0.2453) data time 0.0008 (0.0017) model time 0.2388 (0.2435) loss 3.0344 (2.8052) grad_norm 6.4706 (4.5531) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1210/1251] eta 0:00:10 lr 0.000108 wd 0.0500 time 0.2339 (0.2452) data time 0.0008 (0.0017) model time 0.2331 (0.2434) loss 3.1113 (2.8051) grad_norm 3.2343 (4.5646) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1220/1251] eta 0:00:07 lr 0.000108 wd 0.0500 time 0.2374 (0.2452) data time 0.0010 (0.0017) model time 0.2363 (0.2434) loss 2.9624 (2.8031) grad_norm 3.1604 (4.5716) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[242/300][1230/1251] eta 0:00:05 lr 0.000108 wd 0.0500 time 0.2479 (0.2452) data time 0.0009 (0.0017) model time 0.2470 (0.2434) loss 3.3380 (2.8039) grad_norm 4.3990 (4.5810) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1240/1251] eta 0:00:02 lr 0.000108 wd 0.0500 time 0.2279 (0.2451) data time 0.0007 (0.0017) model time 0.2272 (0.2433) loss 2.6650 (2.8043) grad_norm 4.1539 (4.5819) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [242/300][1250/1251] eta 0:00:00 lr 0.000108 wd 0.0500 time 0.2274 (0.2450) data time 0.0005 (0.0017) model time 0.2269 (0.2432) loss 2.6967 (2.8043) grad_norm 4.2623 (4.5752) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 242 training takes 0:05:06 [2024-09-01 06:54:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:54:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 06:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.474 (0.474) Loss 0.3960 (0.3960) Acc@1 92.285 (92.285) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 06:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.113) Loss 0.6245 (0.6263) Acc@1 87.988 (86.958) Acc@5 97.852 (97.647) Mem 7381MB [2024-09-01 06:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.097) Loss 0.9707 (0.6624) Acc@1 76.172 (85.789) Acc@5 95.410 (97.568) Mem 7381MB [2024-09-01 06:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.091) Loss 1.1162 (0.7509) Acc@1 73.926 (83.654) Acc@5 92.871 (96.620) Mem 7381MB [2024-09-01 06:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0596 (0.8019) Acc@1 76.172 (82.472) Acc@5 93.359 (96.025) Mem 7381MB [2024-09-01 06:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.078 Acc@5 95.988 [2024-09-01 06:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-09-01 06:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.810 (0.810) Loss 0.3811 (0.3811) Acc@1 93.262 (93.262) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.150) Loss 0.5674 (0.6002) Acc@1 89.844 (87.562) Acc@5 98.145 (97.754) Mem 7381MB [2024-09-01 06:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.117) Loss 0.8921 (0.6306) Acc@1 77.930 (86.430) Acc@5 95.898 (97.670) Mem 7381MB [2024-09-01 06:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.104) Loss 1.0918 (0.7152) Acc@1 74.512 (84.353) Acc@5 93.555 (96.850) Mem 7381MB [2024-09-01 06:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.095) Loss 0.9995 
(0.7603) Acc@1 76.465 (83.260) Acc@5 94.238 (96.358) Mem 7381MB [2024-09-01 06:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.880 Acc@5 96.332 [2024-09-01 06:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 06:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.88% [2024-09-01 06:54:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 06:54:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 06:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][0/1251] eta 0:14:04 lr 0.000108 wd 0.0500 time 0.6750 (0.6750) data time 0.4344 (0.4344) model time 0.0000 (0.0000) loss 1.9182 (1.9182) grad_norm 4.0254 (4.0254) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][10/1251] eta 0:05:48 lr 0.000108 wd 0.0500 time 0.2421 (0.2805) data time 0.0007 (0.0404) model time 0.0000 (0.0000) loss 2.7807 (2.6787) grad_norm 15.3089 (4.6377) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][20/1251] eta 0:05:22 lr 0.000108 wd 0.0500 time 0.2387 (0.2617) data time 0.0009 (0.0217) model time 0.0000 (0.0000) loss 2.8011 (2.7269) grad_norm 3.1306 (4.2612) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][30/1251] eta 0:05:11 lr 0.000108 wd 0.0500 time 0.2511 (0.2555) data time 0.0008 (0.0150) model time 0.0000 (0.0000) loss 2.6276 (2.8034) grad_norm 4.8121 (4.1410) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][40/1251] eta 0:05:05 
lr 0.000108 wd 0.0500 time 0.2432 (0.2521) data time 0.0008 (0.0116) model time 0.0000 (0.0000) loss 2.3320 (2.8135) grad_norm 3.7858 (4.0908) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][50/1251] eta 0:05:00 lr 0.000108 wd 0.0500 time 0.2373 (0.2500) data time 0.0007 (0.0095) model time 0.0000 (0.0000) loss 2.9967 (2.8005) grad_norm 3.6120 (4.0112) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][60/1251] eta 0:04:55 lr 0.000108 wd 0.0500 time 0.2411 (0.2482) data time 0.0007 (0.0081) model time 0.2404 (0.2376) loss 2.7127 (2.8318) grad_norm 3.5138 (4.0522) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][70/1251] eta 0:04:51 lr 0.000108 wd 0.0500 time 0.2297 (0.2470) data time 0.0010 (0.0071) model time 0.2286 (0.2380) loss 2.7003 (2.8337) grad_norm 3.4364 (4.0308) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][80/1251] eta 0:04:48 lr 0.000108 wd 0.0500 time 0.2444 (0.2461) data time 0.0009 (0.0064) model time 0.2435 (0.2384) loss 2.7967 (2.8289) grad_norm 4.3183 (4.0036) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][90/1251] eta 0:04:45 lr 0.000108 wd 0.0500 time 0.2424 (0.2455) data time 0.0009 (0.0058) model time 0.2415 (0.2387) loss 2.9915 (2.8184) grad_norm 4.1603 (4.0343) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][100/1251] eta 0:04:41 lr 0.000108 wd 0.0500 time 0.2386 (0.2448) data time 0.0008 (0.0053) model time 0.2378 (0.2385) loss 2.4116 (2.8008) grad_norm 3.6174 (4.0461) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:00 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][110/1251] eta 0:04:38 lr 0.000108 wd 0.0500 time 0.2403 (0.2444) data time 0.0007 (0.0049) model time 0.2396 (0.2387) loss 2.9959 (2.8160) grad_norm 3.4259 (4.0735) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][120/1251] eta 0:04:36 lr 0.000108 wd 0.0500 time 0.2438 (0.2441) data time 0.0010 (0.0046) model time 0.2427 (0.2387) loss 2.1746 (2.8281) grad_norm 5.9917 (4.0951) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][130/1251] eta 0:04:33 lr 0.000107 wd 0.0500 time 0.2453 (0.2436) data time 0.0007 (0.0043) model time 0.2447 (0.2385) loss 2.9381 (2.8272) grad_norm 4.5369 (4.1101) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][140/1251] eta 0:04:30 lr 0.000107 wd 0.0500 time 0.2422 (0.2434) data time 0.0009 (0.0041) model time 0.2413 (0.2387) loss 3.0748 (2.8364) grad_norm 2.8148 (4.1656) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][150/1251] eta 0:04:27 lr 0.000107 wd 0.0500 time 0.2366 (0.2434) data time 0.0011 (0.0039) model time 0.2355 (0.2390) loss 2.6364 (2.8393) grad_norm 4.0850 (4.1427) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][160/1251] eta 0:04:25 lr 0.000107 wd 0.0500 time 0.2407 (0.2434) data time 0.0011 (0.0037) model time 0.2396 (0.2394) loss 3.2327 (2.8365) grad_norm 4.8042 (4.2980) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][170/1251] eta 0:04:22 lr 0.000107 wd 0.0500 time 0.2356 (0.2432) data time 0.0010 (0.0035) model time 0.2346 (0.2393) loss 3.2557 (2.8435) 
grad_norm 4.4082 (4.3446) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][180/1251] eta 0:04:20 lr 0.000107 wd 0.0500 time 0.2383 (0.2430) data time 0.0011 (0.0034) model time 0.2372 (0.2392) loss 2.9574 (2.8352) grad_norm 5.3973 (4.3396) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][190/1251] eta 0:04:17 lr 0.000107 wd 0.0500 time 0.2369 (0.2428) data time 0.0007 (0.0033) model time 0.2362 (0.2392) loss 2.9940 (2.8420) grad_norm 4.0064 (4.3174) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][200/1251] eta 0:04:14 lr 0.000107 wd 0.0500 time 0.2322 (0.2426) data time 0.0011 (0.0032) model time 0.2311 (0.2391) loss 3.4640 (2.8500) grad_norm 5.1789 (4.3130) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][210/1251] eta 0:04:12 lr 0.000107 wd 0.0500 time 0.2488 (0.2426) data time 0.0010 (0.0031) model time 0.2478 (0.2392) loss 2.1361 (2.8593) grad_norm 5.5745 (4.4575) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][220/1251] eta 0:04:10 lr 0.000107 wd 0.0500 time 0.2395 (0.2425) data time 0.0007 (0.0030) model time 0.2388 (0.2393) loss 2.6789 (2.8568) grad_norm 2.8947 (4.4642) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][230/1251] eta 0:04:07 lr 0.000107 wd 0.0500 time 0.2375 (0.2424) data time 0.0007 (0.0029) model time 0.2368 (0.2392) loss 2.4509 (2.8621) grad_norm 6.0057 (4.4987) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][240/1251] eta 0:04:04 lr 0.000107 wd 0.0500 time 
0.2367 (0.2423) data time 0.0007 (0.0028) model time 0.2360 (0.2392) loss 2.4998 (2.8593) grad_norm 4.0068 (4.5247) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][250/1251] eta 0:04:02 lr 0.000107 wd 0.0500 time 0.2341 (0.2421) data time 0.0010 (0.0027) model time 0.2331 (0.2391) loss 2.8924 (2.8596) grad_norm 3.6535 (4.5182) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][260/1251] eta 0:03:59 lr 0.000107 wd 0.0500 time 0.2430 (0.2421) data time 0.0010 (0.0027) model time 0.2420 (0.2391) loss 2.7279 (2.8598) grad_norm 5.0740 (4.5374) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][270/1251] eta 0:03:57 lr 0.000107 wd 0.0500 time 0.2408 (0.2421) data time 0.0010 (0.0026) model time 0.2399 (0.2392) loss 2.5379 (2.8622) grad_norm 5.2012 (4.5357) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][280/1251] eta 0:03:54 lr 0.000107 wd 0.0500 time 0.2467 (0.2420) data time 0.0009 (0.0025) model time 0.2458 (0.2392) loss 2.6082 (2.8588) grad_norm 3.5628 (4.5026) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][290/1251] eta 0:03:52 lr 0.000107 wd 0.0500 time 0.2405 (0.2419) data time 0.0008 (0.0025) model time 0.2396 (0.2392) loss 3.0177 (2.8640) grad_norm 4.4430 (4.4872) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][300/1251] eta 0:03:50 lr 0.000107 wd 0.0500 time 0.4660 (0.2426) data time 0.0011 (0.0024) model time 0.4649 (0.2401) loss 1.8102 (2.8586) grad_norm 5.1074 (4.4673) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:49 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [243/300][310/1251] eta 0:03:50 lr 0.000107 wd 0.0500 time 0.4621 (0.2452) data time 0.0007 (0.0024) model time 0.4614 (0.2432) loss 2.9188 (2.8594) grad_norm 2.9508 (4.4415) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][320/1251] eta 0:03:48 lr 0.000107 wd 0.0500 time 0.2359 (0.2449) data time 0.0011 (0.0023) model time 0.2348 (0.2429) loss 3.0148 (2.8523) grad_norm 3.0543 (4.4230) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][330/1251] eta 0:03:45 lr 0.000107 wd 0.0500 time 0.2316 (0.2447) data time 0.0009 (0.0023) model time 0.2307 (0.2428) loss 3.0428 (2.8568) grad_norm 3.4373 (4.4185) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][340/1251] eta 0:03:42 lr 0.000107 wd 0.0500 time 0.2362 (0.2446) data time 0.0007 (0.0023) model time 0.2356 (0.2426) loss 2.8572 (2.8597) grad_norm 5.0759 (4.4696) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][350/1251] eta 0:03:40 lr 0.000107 wd 0.0500 time 0.2438 (0.2444) data time 0.0009 (0.0022) model time 0.2429 (0.2425) loss 3.5663 (2.8672) grad_norm 6.1567 (4.4643) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][360/1251] eta 0:03:37 lr 0.000107 wd 0.0500 time 0.2389 (0.2443) data time 0.0009 (0.0022) model time 0.2380 (0.2423) loss 3.6323 (2.8715) grad_norm 4.1223 (4.4625) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][370/1251] eta 0:03:35 lr 0.000107 wd 0.0500 time 0.2403 (0.2441) data time 0.0009 (0.0022) model time 0.2394 (0.2422) loss 2.9363 (2.8674) grad_norm 4.2531 
(4.4650) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][380/1251] eta 0:03:32 lr 0.000107 wd 0.0500 time 0.2384 (0.2441) data time 0.0010 (0.0021) model time 0.2374 (0.2421) loss 2.5856 (2.8594) grad_norm 3.1758 (4.4690) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][390/1251] eta 0:03:30 lr 0.000107 wd 0.0500 time 0.2446 (0.2440) data time 0.0007 (0.0021) model time 0.2439 (0.2421) loss 3.0017 (2.8627) grad_norm 5.8127 (4.4569) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][400/1251] eta 0:03:27 lr 0.000107 wd 0.0500 time 0.2460 (0.2439) data time 0.0007 (0.0021) model time 0.2453 (0.2420) loss 2.4225 (2.8618) grad_norm 4.5055 (4.4520) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][410/1251] eta 0:03:25 lr 0.000107 wd 0.0500 time 0.2508 (0.2439) data time 0.0010 (0.0020) model time 0.2498 (0.2420) loss 1.6812 (2.8563) grad_norm 3.2237 (4.4483) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][420/1251] eta 0:03:22 lr 0.000107 wd 0.0500 time 0.2368 (0.2438) data time 0.0010 (0.0020) model time 0.2358 (0.2420) loss 2.7284 (2.8527) grad_norm 3.1847 (4.4395) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][430/1251] eta 0:03:20 lr 0.000107 wd 0.0500 time 0.2390 (0.2437) data time 0.0009 (0.0020) model time 0.2381 (0.2419) loss 3.0085 (2.8414) grad_norm 3.7647 (4.4231) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][440/1251] eta 0:03:17 lr 0.000107 wd 0.0500 time 0.2416 (0.2437) data 
time 0.0007 (0.0020) model time 0.2409 (0.2418) loss 2.7848 (2.8386) grad_norm 4.6930 (4.5009) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][450/1251] eta 0:03:15 lr 0.000107 wd 0.0500 time 0.2451 (0.2437) data time 0.0011 (0.0020) model time 0.2440 (0.2419) loss 2.6765 (2.8356) grad_norm 3.0759 (4.4898) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][460/1251] eta 0:03:12 lr 0.000107 wd 0.0500 time 0.2450 (0.2436) data time 0.0009 (0.0019) model time 0.2440 (0.2418) loss 2.8221 (2.8315) grad_norm 4.0309 (4.4817) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][470/1251] eta 0:03:10 lr 0.000107 wd 0.0500 time 0.2441 (0.2435) data time 0.0007 (0.0019) model time 0.2434 (0.2418) loss 2.2082 (2.8273) grad_norm 6.3758 (4.4782) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][480/1251] eta 0:03:07 lr 0.000107 wd 0.0500 time 0.2422 (0.2435) data time 0.0010 (0.0019) model time 0.2412 (0.2417) loss 2.7810 (2.8291) grad_norm 6.2392 (4.4768) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][490/1251] eta 0:03:05 lr 0.000107 wd 0.0500 time 0.2335 (0.2434) data time 0.0007 (0.0019) model time 0.2328 (0.2416) loss 3.5345 (2.8304) grad_norm 4.1408 (4.4701) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][500/1251] eta 0:03:02 lr 0.000107 wd 0.0500 time 0.2439 (0.2433) data time 0.0007 (0.0019) model time 0.2431 (0.2416) loss 2.1481 (2.8271) grad_norm 4.2879 (4.4725) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [243/300][510/1251] eta 0:03:00 lr 0.000106 wd 0.0500 time 0.2357 (0.2433) data time 0.0009 (0.0018) model time 0.2348 (0.2416) loss 2.2186 (2.8221) grad_norm 5.5978 (4.4646) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 06:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][520/1251] eta 0:02:57 lr 0.000106 wd 0.0500 time 0.2432 (0.2432) data time 0.0011 (0.0018) model time 0.2421 (0.2415) loss 2.8547 (2.8195) grad_norm 4.3161 (inf) loss_scale 128.0000 (254.7716) mem 7381MB [2024-09-01 06:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][530/1251] eta 0:02:55 lr 0.000106 wd 0.0500 time 0.2455 (0.2432) data time 0.0011 (0.0018) model time 0.2444 (0.2415) loss 3.0726 (2.8224) grad_norm 3.9944 (inf) loss_scale 128.0000 (252.3842) mem 7381MB [2024-09-01 06:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][540/1251] eta 0:02:52 lr 0.000106 wd 0.0500 time 0.2399 (0.2431) data time 0.0007 (0.0018) model time 0.2392 (0.2414) loss 2.0240 (2.8212) grad_norm 3.4954 (inf) loss_scale 128.0000 (250.0850) mem 7381MB [2024-09-01 06:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][550/1251] eta 0:02:50 lr 0.000106 wd 0.0500 time 0.2425 (0.2431) data time 0.0007 (0.0018) model time 0.2418 (0.2414) loss 2.1595 (2.8198) grad_norm 4.2440 (inf) loss_scale 128.0000 (247.8693) mem 7381MB [2024-09-01 06:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][560/1251] eta 0:02:47 lr 0.000106 wd 0.0500 time 0.2423 (0.2431) data time 0.0008 (0.0018) model time 0.2415 (0.2414) loss 3.1627 (2.8147) grad_norm 3.1478 (inf) loss_scale 128.0000 (245.7326) mem 7381MB [2024-09-01 06:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][570/1251] eta 0:02:45 lr 0.000106 wd 0.0500 time 0.2413 (0.2431) data time 0.0011 (0.0018) model time 0.2402 (0.2414) loss 3.2043 (2.8177) grad_norm 6.3652 (inf) loss_scale 128.0000 (243.6708) mem 7381MB 
[2024-09-01 06:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][580/1251] eta 0:02:43 lr 0.000106 wd 0.0500 time 0.2462 (0.2431) data time 0.0010 (0.0017) model time 0.2452 (0.2414) loss 2.9397 (2.8198) grad_norm 5.4877 (inf) loss_scale 128.0000 (241.6799) mem 7381MB [2024-09-01 06:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][590/1251] eta 0:02:40 lr 0.000106 wd 0.0500 time 0.2376 (0.2431) data time 0.0008 (0.0017) model time 0.2368 (0.2414) loss 2.2132 (2.8188) grad_norm 4.0508 (inf) loss_scale 128.0000 (239.7563) mem 7381MB [2024-09-01 06:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][600/1251] eta 0:02:38 lr 0.000106 wd 0.0500 time 0.2307 (0.2430) data time 0.0009 (0.0017) model time 0.2298 (0.2413) loss 2.4063 (2.8215) grad_norm 3.8940 (inf) loss_scale 128.0000 (237.8968) mem 7381MB [2024-09-01 06:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][610/1251] eta 0:02:35 lr 0.000106 wd 0.0500 time 0.2353 (0.2430) data time 0.0007 (0.0017) model time 0.2346 (0.2413) loss 3.1566 (2.8230) grad_norm 3.2756 (inf) loss_scale 128.0000 (236.0982) mem 7381MB [2024-09-01 06:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][620/1251] eta 0:02:33 lr 0.000106 wd 0.0500 time 0.2458 (0.2430) data time 0.0009 (0.0017) model time 0.2449 (0.2414) loss 3.1061 (2.8222) grad_norm 37.3744 (inf) loss_scale 128.0000 (234.3575) mem 7381MB [2024-09-01 06:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][630/1251] eta 0:02:30 lr 0.000106 wd 0.0500 time 0.2305 (0.2429) data time 0.0009 (0.0017) model time 0.2297 (0.2413) loss 2.1952 (2.8180) grad_norm 4.7878 (inf) loss_scale 128.0000 (232.6719) mem 7381MB [2024-09-01 06:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][640/1251] eta 0:02:28 lr 0.000106 wd 0.0500 time 0.2405 (0.2429) data time 0.0007 (0.0017) model time 0.2398 (0.2413) loss 3.2469 
(2.8183) grad_norm 3.7317 (inf) loss_scale 128.0000 (231.0390) mem 7381MB [2024-09-01 06:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][650/1251] eta 0:02:25 lr 0.000106 wd 0.0500 time 0.2490 (0.2428) data time 0.0011 (0.0017) model time 0.2479 (0.2412) loss 3.1819 (2.8217) grad_norm 3.6685 (inf) loss_scale 128.0000 (229.4562) mem 7381MB [2024-09-01 06:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][660/1251] eta 0:02:23 lr 0.000106 wd 0.0500 time 0.2421 (0.2428) data time 0.0013 (0.0017) model time 0.2408 (0.2411) loss 1.8875 (2.8187) grad_norm 4.1600 (inf) loss_scale 128.0000 (227.9213) mem 7381MB [2024-09-01 06:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][670/1251] eta 0:02:21 lr 0.000106 wd 0.0500 time 0.2418 (0.2427) data time 0.0008 (0.0016) model time 0.2410 (0.2411) loss 2.9779 (2.8159) grad_norm 3.2266 (inf) loss_scale 128.0000 (226.4322) mem 7381MB [2024-09-01 06:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][680/1251] eta 0:02:18 lr 0.000106 wd 0.0500 time 0.2369 (0.2427) data time 0.0010 (0.0016) model time 0.2359 (0.2411) loss 3.2719 (2.8159) grad_norm 3.0728 (inf) loss_scale 128.0000 (224.9868) mem 7381MB [2024-09-01 06:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][690/1251] eta 0:02:16 lr 0.000106 wd 0.0500 time 0.2343 (0.2426) data time 0.0008 (0.0016) model time 0.2335 (0.2411) loss 3.2809 (2.8164) grad_norm 2.8707 (inf) loss_scale 128.0000 (223.5832) mem 7381MB [2024-09-01 06:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][700/1251] eta 0:02:13 lr 0.000106 wd 0.0500 time 0.2438 (0.2426) data time 0.0010 (0.0016) model time 0.2429 (0.2410) loss 2.4764 (2.8164) grad_norm 3.8826 (inf) loss_scale 128.0000 (222.2197) mem 7381MB [2024-09-01 06:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][710/1251] eta 0:02:11 lr 0.000106 wd 0.0500 time 0.2444 (0.2426) 
data time 0.0008 (0.0016) model time 0.2436 (0.2410) loss 3.3397 (2.8147) grad_norm 3.4778 (inf) loss_scale 128.0000 (220.8945) mem 7381MB [2024-09-01 06:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][720/1251] eta 0:02:08 lr 0.000106 wd 0.0500 time 0.2332 (0.2426) data time 0.0008 (0.0016) model time 0.2324 (0.2410) loss 2.7528 (2.8147) grad_norm 5.3844 (inf) loss_scale 128.0000 (219.6061) mem 7381MB [2024-09-01 06:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][730/1251] eta 0:02:06 lr 0.000106 wd 0.0500 time 0.2307 (0.2425) data time 0.0007 (0.0016) model time 0.2300 (0.2410) loss 2.0531 (2.8130) grad_norm 4.4934 (inf) loss_scale 128.0000 (218.3529) mem 7381MB [2024-09-01 06:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][740/1251] eta 0:02:03 lr 0.000106 wd 0.0500 time 0.2420 (0.2425) data time 0.0011 (0.0016) model time 0.2409 (0.2410) loss 3.0480 (2.8162) grad_norm 5.2191 (inf) loss_scale 128.0000 (217.1336) mem 7381MB [2024-09-01 06:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][750/1251] eta 0:02:01 lr 0.000106 wd 0.0500 time 0.2446 (0.2425) data time 0.0008 (0.0016) model time 0.2438 (0.2410) loss 2.4098 (2.8168) grad_norm 4.7477 (inf) loss_scale 128.0000 (215.9467) mem 7381MB [2024-09-01 06:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][760/1251] eta 0:01:59 lr 0.000106 wd 0.0500 time 0.2371 (0.2425) data time 0.0007 (0.0016) model time 0.2364 (0.2410) loss 3.0561 (2.8165) grad_norm 4.6251 (inf) loss_scale 128.0000 (214.7911) mem 7381MB [2024-09-01 06:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][770/1251] eta 0:01:56 lr 0.000106 wd 0.0500 time 0.2325 (0.2425) data time 0.0007 (0.0016) model time 0.2317 (0.2409) loss 2.7946 (2.8154) grad_norm 3.0462 (inf) loss_scale 128.0000 (213.6654) mem 7381MB [2024-09-01 06:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[243/300][780/1251] eta 0:01:54 lr 0.000106 wd 0.0500 time 0.2347 (0.2424) data time 0.0010 (0.0015) model time 0.2337 (0.2409) loss 3.0579 (2.8142) grad_norm 4.5477 (inf) loss_scale 128.0000 (212.5685) mem 7381MB [2024-09-01 06:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][790/1251] eta 0:01:51 lr 0.000106 wd 0.0500 time 0.2329 (0.2424) data time 0.0007 (0.0015) model time 0.2322 (0.2409) loss 2.5352 (2.8135) grad_norm 3.3467 (inf) loss_scale 128.0000 (211.4994) mem 7381MB [2024-09-01 06:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][800/1251] eta 0:01:49 lr 0.000106 wd 0.0500 time 0.2378 (0.2424) data time 0.0010 (0.0015) model time 0.2369 (0.2409) loss 2.4197 (2.8097) grad_norm 4.1538 (inf) loss_scale 128.0000 (210.4569) mem 7381MB [2024-09-01 06:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][810/1251] eta 0:01:46 lr 0.000106 wd 0.0500 time 0.2414 (0.2424) data time 0.0008 (0.0015) model time 0.2406 (0.2408) loss 3.3062 (2.8117) grad_norm 4.1882 (inf) loss_scale 128.0000 (209.4402) mem 7381MB [2024-09-01 06:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][820/1251] eta 0:01:44 lr 0.000106 wd 0.0500 time 0.2391 (0.2423) data time 0.0010 (0.0015) model time 0.2382 (0.2408) loss 2.9282 (2.8124) grad_norm 2.6794 (inf) loss_scale 128.0000 (208.4482) mem 7381MB [2024-09-01 06:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][830/1251] eta 0:01:42 lr 0.000106 wd 0.0500 time 0.2423 (0.2423) data time 0.0009 (0.0015) model time 0.2415 (0.2408) loss 3.4503 (2.8132) grad_norm 4.6898 (inf) loss_scale 128.0000 (207.4801) mem 7381MB [2024-09-01 06:57:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][840/1251] eta 0:01:39 lr 0.000106 wd 0.0500 time 0.2388 (0.2423) data time 0.0009 (0.0015) model time 0.2379 (0.2408) loss 2.4302 (2.8108) grad_norm 4.5825 (inf) loss_scale 128.0000 (206.5351) mem 7381MB [2024-09-01 
06:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][850/1251] eta 0:01:37 lr 0.000106 wd 0.0500 time 0.2418 (0.2425) data time 0.0009 (0.0015) model time 0.2409 (0.2410) loss 2.7951 (2.8128) grad_norm 3.6776 (inf) loss_scale 128.0000 (205.6122) mem 7381MB [2024-09-01 06:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][860/1251] eta 0:01:34 lr 0.000106 wd 0.0500 time 0.2329 (0.2425) data time 0.0009 (0.0015) model time 0.2320 (0.2410) loss 2.9153 (2.8131) grad_norm 4.1874 (inf) loss_scale 128.0000 (204.7108) mem 7381MB [2024-09-01 06:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][870/1251] eta 0:01:32 lr 0.000106 wd 0.0500 time 0.2400 (0.2425) data time 0.0010 (0.0015) model time 0.2391 (0.2410) loss 3.0286 (2.8099) grad_norm 3.3171 (inf) loss_scale 128.0000 (203.8301) mem 7381MB [2024-09-01 06:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][880/1251] eta 0:01:29 lr 0.000106 wd 0.0500 time 0.2387 (0.2424) data time 0.0009 (0.0015) model time 0.2378 (0.2410) loss 3.1745 (2.8103) grad_norm 3.3773 (inf) loss_scale 128.0000 (202.9694) mem 7381MB [2024-09-01 06:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][890/1251] eta 0:01:27 lr 0.000105 wd 0.0500 time 0.2415 (0.2424) data time 0.0009 (0.0015) model time 0.2406 (0.2410) loss 2.2723 (2.8127) grad_norm 4.4090 (inf) loss_scale 128.0000 (202.1279) mem 7381MB [2024-09-01 06:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][900/1251] eta 0:01:25 lr 0.000105 wd 0.0500 time 0.2508 (0.2424) data time 0.0009 (0.0015) model time 0.2499 (0.2410) loss 2.4425 (2.8125) grad_norm 5.9405 (inf) loss_scale 128.0000 (201.3052) mem 7381MB [2024-09-01 06:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][910/1251] eta 0:01:22 lr 0.000105 wd 0.0500 time 0.2414 (0.2424) data time 0.0010 (0.0015) model time 0.2404 (0.2409) loss 3.0459 (2.8127) grad_norm 
3.3167 (inf) loss_scale 128.0000 (200.5005) mem 7381MB [2024-09-01 06:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][920/1251] eta 0:01:20 lr 0.000105 wd 0.0500 time 0.2380 (0.2424) data time 0.0008 (0.0015) model time 0.2371 (0.2410) loss 3.0920 (2.8130) grad_norm 3.1717 (inf) loss_scale 128.0000 (199.7134) mem 7381MB [2024-09-01 06:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][930/1251] eta 0:01:17 lr 0.000105 wd 0.0500 time 0.2336 (0.2424) data time 0.0008 (0.0015) model time 0.2328 (0.2410) loss 3.0567 (2.8125) grad_norm 3.3656 (inf) loss_scale 128.0000 (198.9431) mem 7381MB [2024-09-01 06:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][940/1251] eta 0:01:15 lr 0.000105 wd 0.0500 time 0.2391 (0.2424) data time 0.0009 (0.0014) model time 0.2382 (0.2410) loss 3.2607 (2.8117) grad_norm 3.7384 (inf) loss_scale 128.0000 (198.1892) mem 7381MB [2024-09-01 06:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][950/1251] eta 0:01:12 lr 0.000105 wd 0.0500 time 0.2360 (0.2424) data time 0.0010 (0.0014) model time 0.2351 (0.2410) loss 2.1272 (2.8106) grad_norm 3.8833 (inf) loss_scale 128.0000 (197.4511) mem 7381MB [2024-09-01 06:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][960/1251] eta 0:01:10 lr 0.000105 wd 0.0500 time 0.2467 (0.2424) data time 0.0007 (0.0014) model time 0.2460 (0.2410) loss 2.6328 (2.8113) grad_norm 7.3952 (inf) loss_scale 128.0000 (196.7284) mem 7381MB [2024-09-01 06:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][970/1251] eta 0:01:08 lr 0.000105 wd 0.0500 time 0.2414 (0.2424) data time 0.0010 (0.0014) model time 0.2404 (0.2410) loss 2.3371 (2.8111) grad_norm 10.5432 (inf) loss_scale 128.0000 (196.0206) mem 7381MB [2024-09-01 06:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][980/1251] eta 0:01:05 lr 0.000105 wd 0.0500 time 0.2452 (0.2424) data time 0.0009 
(0.0014) model time 0.2443 (0.2409) loss 3.0213 (2.8106) grad_norm 4.9886 (inf) loss_scale 128.0000 (195.3272) mem 7381MB [2024-09-01 06:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][990/1251] eta 0:01:03 lr 0.000105 wd 0.0500 time 0.2477 (0.2423) data time 0.0008 (0.0014) model time 0.2469 (0.2409) loss 3.3703 (2.8113) grad_norm 3.3968 (inf) loss_scale 128.0000 (194.6478) mem 7381MB [2024-09-01 06:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1000/1251] eta 0:01:00 lr 0.000105 wd 0.0500 time 0.2360 (0.2423) data time 0.0008 (0.0014) model time 0.2352 (0.2409) loss 3.4308 (2.8120) grad_norm 3.5746 (inf) loss_scale 128.0000 (193.9820) mem 7381MB [2024-09-01 06:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1010/1251] eta 0:00:58 lr 0.000105 wd 0.0500 time 0.2413 (0.2423) data time 0.0007 (0.0014) model time 0.2405 (0.2409) loss 3.3291 (2.8134) grad_norm 4.2053 (inf) loss_scale 128.0000 (193.3294) mem 7381MB [2024-09-01 06:58:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1020/1251] eta 0:00:55 lr 0.000105 wd 0.0500 time 0.2362 (0.2422) data time 0.0008 (0.0014) model time 0.2354 (0.2408) loss 3.0316 (2.8140) grad_norm 5.4699 (inf) loss_scale 128.0000 (192.6895) mem 7381MB [2024-09-01 06:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1030/1251] eta 0:00:53 lr 0.000105 wd 0.0500 time 0.2382 (0.2422) data time 0.0007 (0.0014) model time 0.2374 (0.2408) loss 3.1394 (2.8164) grad_norm 10.3905 (inf) loss_scale 128.0000 (192.0621) mem 7381MB [2024-09-01 06:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1040/1251] eta 0:00:51 lr 0.000105 wd 0.0500 time 0.2442 (0.2422) data time 0.0007 (0.0014) model time 0.2435 (0.2408) loss 1.8793 (2.8139) grad_norm 3.4166 (inf) loss_scale 128.0000 (191.4467) mem 7381MB [2024-09-01 06:58:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1050/1251] 
eta 0:00:48 lr 0.000105 wd 0.0500 time 0.2412 (0.2422) data time 0.0011 (0.0014) model time 0.2400 (0.2408) loss 3.2231 (2.8146) grad_norm 3.2217 (inf) loss_scale 128.0000 (190.8430) mem 7381MB [2024-09-01 06:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1060/1251] eta 0:00:46 lr 0.000105 wd 0.0500 time 0.2359 (0.2421) data time 0.0009 (0.0014) model time 0.2350 (0.2407) loss 2.1254 (2.8119) grad_norm 4.8222 (inf) loss_scale 128.0000 (190.2507) mem 7381MB [2024-09-01 06:58:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1070/1251] eta 0:00:43 lr 0.000105 wd 0.0500 time 0.2407 (0.2421) data time 0.0010 (0.0014) model time 0.2397 (0.2407) loss 2.3886 (2.8112) grad_norm 4.7395 (inf) loss_scale 128.0000 (189.6695) mem 7381MB [2024-09-01 06:58:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1080/1251] eta 0:00:41 lr 0.000105 wd 0.0500 time 0.2462 (0.2421) data time 0.0008 (0.0014) model time 0.2454 (0.2408) loss 3.5798 (2.8107) grad_norm 3.0079 (inf) loss_scale 128.0000 (189.0990) mem 7381MB [2024-09-01 06:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1090/1251] eta 0:00:38 lr 0.000105 wd 0.0500 time 0.2464 (0.2421) data time 0.0010 (0.0014) model time 0.2453 (0.2408) loss 2.9932 (2.8123) grad_norm 4.1835 (inf) loss_scale 128.0000 (188.5390) mem 7381MB [2024-09-01 06:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1100/1251] eta 0:00:36 lr 0.000105 wd 0.0500 time 0.2328 (0.2421) data time 0.0009 (0.0014) model time 0.2318 (0.2408) loss 2.4227 (2.8128) grad_norm 4.3778 (inf) loss_scale 128.0000 (187.9891) mem 7381MB [2024-09-01 06:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1110/1251] eta 0:00:34 lr 0.000105 wd 0.0500 time 0.2439 (0.2421) data time 0.0009 (0.0014) model time 0.2430 (0.2408) loss 2.9065 (2.8150) grad_norm 4.7488 (inf) loss_scale 128.0000 (187.4491) mem 7381MB [2024-09-01 06:59:04 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1120/1251] eta 0:00:31 lr 0.000105 wd 0.0500 time 0.2492 (0.2422) data time 0.0007 (0.0014) model time 0.2485 (0.2408) loss 3.3886 (2.8170) grad_norm 3.8339 (inf) loss_scale 128.0000 (186.9188) mem 7381MB [2024-09-01 06:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1130/1251] eta 0:00:29 lr 0.000105 wd 0.0500 time 0.2413 (0.2421) data time 0.0010 (0.0014) model time 0.2404 (0.2408) loss 2.1076 (2.8126) grad_norm 3.2977 (inf) loss_scale 128.0000 (186.3979) mem 7381MB [2024-09-01 06:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1140/1251] eta 0:00:26 lr 0.000105 wd 0.0500 time 0.2427 (0.2421) data time 0.0008 (0.0014) model time 0.2419 (0.2408) loss 2.6683 (2.8131) grad_norm 4.0096 (inf) loss_scale 128.0000 (185.8861) mem 7381MB [2024-09-01 06:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1150/1251] eta 0:00:24 lr 0.000105 wd 0.0500 time 0.2467 (0.2421) data time 0.0009 (0.0014) model time 0.2458 (0.2407) loss 2.6121 (2.8141) grad_norm 4.7805 (inf) loss_scale 128.0000 (185.3831) mem 7381MB [2024-09-01 06:59:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1160/1251] eta 0:00:22 lr 0.000105 wd 0.0500 time 0.2449 (0.2421) data time 0.0009 (0.0014) model time 0.2439 (0.2407) loss 1.8353 (2.8132) grad_norm 3.3150 (inf) loss_scale 128.0000 (184.8889) mem 7381MB [2024-09-01 06:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1170/1251] eta 0:00:19 lr 0.000105 wd 0.0500 time 0.2375 (0.2421) data time 0.0011 (0.0014) model time 0.2364 (0.2407) loss 2.6046 (2.8128) grad_norm 4.1988 (inf) loss_scale 128.0000 (184.4031) mem 7381MB [2024-09-01 06:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1180/1251] eta 0:00:17 lr 0.000105 wd 0.0500 time 0.2401 (0.2421) data time 0.0010 (0.0014) model time 0.2391 (0.2407) loss 3.0239 (2.8142) grad_norm 
2.6131 (inf) loss_scale 128.0000 (183.9255) mem 7381MB [2024-09-01 06:59:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1190/1251] eta 0:00:14 lr 0.000105 wd 0.0500 time 0.2448 (0.2420) data time 0.0010 (0.0014) model time 0.2439 (0.2407) loss 2.9073 (2.8137) grad_norm 4.8737 (inf) loss_scale 128.0000 (183.4559) mem 7381MB [2024-09-01 06:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1200/1251] eta 0:00:12 lr 0.000105 wd 0.0500 time 0.2544 (0.2420) data time 0.0010 (0.0014) model time 0.2534 (0.2407) loss 2.2850 (2.8118) grad_norm 3.7309 (inf) loss_scale 128.0000 (182.9942) mem 7381MB [2024-09-01 06:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1210/1251] eta 0:00:09 lr 0.000105 wd 0.0500 time 0.2334 (0.2420) data time 0.0010 (0.0014) model time 0.2323 (0.2407) loss 3.1948 (2.8093) grad_norm 3.4069 (inf) loss_scale 128.0000 (182.5400) mem 7381MB [2024-09-01 06:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1220/1251] eta 0:00:07 lr 0.000105 wd 0.0500 time 0.2407 (0.2420) data time 0.0008 (0.0014) model time 0.2399 (0.2407) loss 2.2561 (2.8092) grad_norm 4.9005 (inf) loss_scale 128.0000 (182.0934) mem 7381MB [2024-09-01 06:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1230/1251] eta 0:00:05 lr 0.000105 wd 0.0500 time 0.4609 (0.2422) data time 0.0007 (0.0013) model time 0.4602 (0.2409) loss 2.6662 (2.8117) grad_norm 4.1589 (inf) loss_scale 128.0000 (181.6539) mem 7381MB [2024-09-01 06:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1240/1251] eta 0:00:02 lr 0.000105 wd 0.0500 time 0.2279 (0.2426) data time 0.0007 (0.0013) model time 0.2272 (0.2413) loss 3.0427 (2.8116) grad_norm 4.4594 (inf) loss_scale 128.0000 (181.2216) mem 7381MB [2024-09-01 06:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [243/300][1250/1251] eta 0:00:00 lr 0.000105 wd 0.0500 time 0.2229 (0.2425) data time 
0.0007 (0.0013) model time 0.2222 (0.2411) loss 2.7181 (2.8121) grad_norm 5.7019 (inf) loss_scale 128.0000 (180.7962) mem 7381MB [2024-09-01 06:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 243 training takes 0:05:03 [2024-09-01 06:59:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 06:59:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 06:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.401 (0.401) Loss 0.3984 (0.3984) Acc@1 93.555 (93.555) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 06:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.109) Loss 0.6299 (0.6306) Acc@1 88.184 (86.967) Acc@5 97.266 (97.514) Mem 7381MB [2024-09-01 06:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.093) Loss 0.9707 (0.6609) Acc@1 76.270 (85.849) Acc@5 95.215 (97.489) Mem 7381MB [2024-09-01 06:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.088) Loss 1.1484 (0.7529) Acc@1 73.438 (83.679) Acc@5 92.285 (96.532) Mem 7381MB [2024-09-01 06:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0332 (0.7991) Acc@1 76.562 (82.479) Acc@5 92.969 (95.989) Mem 7381MB [2024-09-01 06:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.068 Acc@5 95.972 [2024-09-01 06:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-09-01 06:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.874 (0.874) Loss 0.3811 (0.3811) Acc@1 93.164 (93.164) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 06:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.152) Loss 0.5674 (0.6000) Acc@1 89.941 
(87.509) Acc@5 98.242 (97.763) Mem 7381MB [2024-09-01 06:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.117) Loss 0.8940 (0.6307) Acc@1 77.832 (86.375) Acc@5 96.094 (97.721) Mem 7381MB [2024-09-01 06:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.104) Loss 1.0918 (0.7156) Acc@1 74.512 (84.318) Acc@5 93.457 (96.881) Mem 7381MB [2024-09-01 06:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 0.9980 (0.7607) Acc@1 76.855 (83.229) Acc@5 94.336 (96.368) Mem 7381MB [2024-09-01 06:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.850 Acc@5 96.318 [2024-09-01 06:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 06:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][0/1251] eta 0:26:39 lr 0.000105 wd 0.0500 time 1.2788 (1.2788) data time 0.5368 (0.5368) model time 0.0000 (0.0000) loss 3.1637 (3.1637) grad_norm 4.4598 (4.4598) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][10/1251] eta 0:06:54 lr 0.000105 wd 0.0500 time 0.2501 (0.3337) data time 0.0010 (0.0497) model time 0.0000 (0.0000) loss 2.4479 (3.0534) grad_norm 3.4808 (5.6577) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][20/1251] eta 0:05:56 lr 0.000104 wd 0.0500 time 0.2467 (0.2893) data time 0.0007 (0.0265) model time 0.0000 (0.0000) loss 1.9175 (2.9146) grad_norm 3.9325 (5.0956) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:59:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][30/1251] eta 0:05:33 lr 0.000104 wd 0.0500 time 0.2394 (0.2734) data time 0.0010 (0.0183) model time 0.0000 (0.0000) loss 3.3944 (2.8917) grad_norm 2.9155 (4.7679) loss_scale 128.0000 (128.0000) 
mem 7381MB [2024-09-01 06:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][40/1251] eta 0:05:21 lr 0.000104 wd 0.0500 time 0.2367 (0.2655) data time 0.0011 (0.0141) model time 0.0000 (0.0000) loss 1.6803 (2.8900) grad_norm 4.3656 (4.6296) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 06:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][50/1251] eta 0:05:14 lr 0.000104 wd 0.0500 time 0.2438 (0.2615) data time 0.0007 (0.0115) model time 0.0000 (0.0000) loss 3.0431 (2.8911) grad_norm 5.0845 (4.9825) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][60/1251] eta 0:05:07 lr 0.000104 wd 0.0500 time 0.2395 (0.2585) data time 0.0007 (0.0098) model time 0.2388 (0.2426) loss 3.2777 (2.9280) grad_norm 3.4409 (5.0093) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][70/1251] eta 0:05:02 lr 0.000104 wd 0.0500 time 0.2499 (0.2562) data time 0.0011 (0.0085) model time 0.2488 (0.2418) loss 3.1864 (2.9003) grad_norm 2.9650 (4.8567) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][80/1251] eta 0:04:59 lr 0.000104 wd 0.0500 time 0.2282 (0.2559) data time 0.0010 (0.0076) model time 0.2272 (0.2456) loss 2.0026 (2.8657) grad_norm 6.5707 (4.8190) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][90/1251] eta 0:04:55 lr 0.000104 wd 0.0500 time 0.2384 (0.2544) data time 0.0007 (0.0069) model time 0.2376 (0.2444) loss 3.3109 (2.8664) grad_norm 6.7595 (4.8360) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][100/1251] eta 0:04:51 lr 0.000104 wd 0.0500 time 0.2421 (0.2531) data time 0.0007 (0.0063) model time 0.2414 
(0.2436) loss 2.6791 (2.8683) grad_norm 2.7887 (4.7754) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][110/1251] eta 0:04:47 lr 0.000104 wd 0.0500 time 0.2383 (0.2521) data time 0.0011 (0.0058) model time 0.2372 (0.2432) loss 2.5097 (2.8904) grad_norm 4.5684 (4.6855) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][120/1251] eta 0:04:44 lr 0.000104 wd 0.0500 time 0.2400 (0.2512) data time 0.0011 (0.0054) model time 0.2388 (0.2427) loss 2.6305 (2.8772) grad_norm 5.6137 (4.7251) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][130/1251] eta 0:04:40 lr 0.000104 wd 0.0500 time 0.2459 (0.2505) data time 0.0008 (0.0051) model time 0.2450 (0.2426) loss 2.9261 (2.8647) grad_norm 3.7625 (4.7211) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][140/1251] eta 0:04:37 lr 0.000104 wd 0.0500 time 0.2353 (0.2497) data time 0.0009 (0.0048) model time 0.2344 (0.2421) loss 2.6977 (2.8555) grad_norm 2.3480 (4.6380) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][150/1251] eta 0:04:33 lr 0.000104 wd 0.0500 time 0.2293 (0.2488) data time 0.0011 (0.0045) model time 0.2282 (0.2413) loss 2.1656 (2.8475) grad_norm 3.2044 (4.5682) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][160/1251] eta 0:04:30 lr 0.000104 wd 0.0500 time 0.2384 (0.2481) data time 0.0009 (0.0043) model time 0.2375 (0.2409) loss 2.6416 (2.8553) grad_norm 3.0343 (4.5079) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][170/1251] eta 0:04:27 
lr 0.000104 wd 0.0500 time 0.2356 (0.2474) data time 0.0009 (0.0041) model time 0.2347 (0.2405) loss 3.3976 (2.8453) grad_norm 3.6355 (4.5038) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][180/1251] eta 0:05:30 lr 0.000104 wd 0.0500 time 0.2296 (0.3082) data time 0.0007 (0.0502) model time 0.2289 (0.2611) loss 3.2423 (2.8478) grad_norm 4.2936 (4.4576) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][190/1251] eta 0:05:44 lr 0.000104 wd 0.0500 time 0.5020 (0.3242) data time 0.0009 (0.0477) model time 0.5010 (0.2863) loss 2.9209 (2.8469) grad_norm 6.5118 (4.4565) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][200/1251] eta 0:05:36 lr 0.000104 wd 0.0500 time 0.2382 (0.3200) data time 0.0007 (0.0453) model time 0.2375 (0.2830) loss 2.8193 (2.8400) grad_norm 4.6653 (4.4354) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][210/1251] eta 0:05:29 lr 0.000104 wd 0.0500 time 0.2307 (0.3161) data time 0.0007 (0.0432) model time 0.2300 (0.2801) loss 3.0097 (2.8344) grad_norm 4.1455 (4.4230) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][220/1251] eta 0:05:22 lr 0.000104 wd 0.0500 time 0.2312 (0.3127) data time 0.0008 (0.0413) model time 0.2304 (0.2777) loss 1.9214 (2.8329) grad_norm 4.0951 (4.4173) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][230/1251] eta 0:05:15 lr 0.000104 wd 0.0500 time 0.2347 (0.3092) data time 0.0009 (0.0396) model time 0.2338 (0.2752) loss 2.9572 (2.8287) grad_norm 3.9378 (4.4120) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:00 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][240/1251] eta 0:05:14 lr 0.000104 wd 0.0500 time 0.4814 (0.3107) data time 0.0011 (0.0380) model time 0.4804 (0.2788) loss 2.6099 (2.8270) grad_norm 3.5169 (4.4275) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][250/1251] eta 0:05:10 lr 0.000104 wd 0.0500 time 0.2308 (0.3099) data time 0.0009 (0.0365) model time 0.2298 (0.2794) loss 2.6431 (2.8180) grad_norm 3.3940 (4.4143) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][260/1251] eta 0:05:06 lr 0.000104 wd 0.0500 time 0.2462 (0.3090) data time 0.0012 (0.0352) model time 0.2451 (0.2796) loss 2.9892 (2.8233) grad_norm 4.0251 (4.5131) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][270/1251] eta 0:05:05 lr 0.000104 wd 0.0500 time 0.2270 (0.3109) data time 0.0008 (0.0339) model time 0.2262 (0.2833) loss 3.2679 (2.8242) grad_norm 4.1453 (4.5357) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][280/1251] eta 0:04:59 lr 0.000104 wd 0.0500 time 0.2282 (0.3082) data time 0.0009 (0.0327) model time 0.2272 (0.2812) loss 3.1207 (2.8217) grad_norm 4.6151 (4.5267) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][290/1251] eta 0:05:00 lr 0.000104 wd 0.0500 time 0.8053 (0.3127) data time 0.0007 (0.0316) model time 0.8045 (0.2877) loss 2.5320 (2.8215) grad_norm 3.5609 (4.5095) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][300/1251] eta 0:04:55 lr 0.000104 wd 0.0500 time 0.2460 (0.3107) data time 0.0009 (0.0306) model time 0.2451 (0.2862) loss 3.2411 (2.8240) 
grad_norm 5.8286 (4.5445) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][310/1251] eta 0:04:50 lr 0.000104 wd 0.0500 time 0.2329 (0.3083) data time 0.0009 (0.0297) model time 0.2320 (0.2843) loss 1.7485 (2.8271) grad_norm 2.6987 (4.5527) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][320/1251] eta 0:04:44 lr 0.000104 wd 0.0500 time 0.2326 (0.3060) data time 0.0010 (0.0288) model time 0.2316 (0.2823) loss 2.5827 (2.8296) grad_norm 3.3612 (4.5875) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][330/1251] eta 0:04:46 lr 0.000104 wd 0.0500 time 0.2350 (0.3107) data time 0.0007 (0.0279) model time 0.2342 (0.2888) loss 2.9795 (2.8319) grad_norm 3.8453 (4.5713) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][340/1251] eta 0:04:41 lr 0.000104 wd 0.0500 time 0.2421 (0.3086) data time 0.0008 (0.0272) model time 0.2414 (0.2870) loss 2.7974 (2.8272) grad_norm 3.7806 (4.5574) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][350/1251] eta 0:04:36 lr 0.000104 wd 0.0500 time 0.2380 (0.3068) data time 0.0009 (0.0264) model time 0.2372 (0.2856) loss 3.0955 (2.8301) grad_norm 4.1492 (4.5925) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][360/1251] eta 0:04:31 lr 0.000104 wd 0.0500 time 0.2327 (0.3050) data time 0.0007 (0.0257) model time 0.2320 (0.2841) loss 3.3817 (2.8356) grad_norm 3.5610 (4.5723) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][370/1251] eta 0:04:27 lr 0.000104 wd 0.0500 time 
0.2362 (0.3032) data time 0.0008 (0.0250) model time 0.2354 (0.2826) loss 2.3568 (2.8374) grad_norm 4.4980 (4.5944) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][380/1251] eta 0:04:22 lr 0.000104 wd 0.0500 time 0.2362 (0.3015) data time 0.0009 (0.0244) model time 0.2353 (0.2812) loss 2.9250 (2.8413) grad_norm 4.8458 (4.5826) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][390/1251] eta 0:04:18 lr 0.000104 wd 0.0500 time 0.2368 (0.2999) data time 0.0008 (0.0238) model time 0.2359 (0.2800) loss 3.4631 (2.8461) grad_norm 5.1477 (4.5867) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][400/1251] eta 0:04:13 lr 0.000103 wd 0.0500 time 0.2352 (0.2982) data time 0.0010 (0.0232) model time 0.2342 (0.2786) loss 3.4312 (2.8378) grad_norm 2.5011 (4.5909) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][410/1251] eta 0:04:09 lr 0.000103 wd 0.0500 time 0.2294 (0.2967) data time 0.0007 (0.0227) model time 0.2287 (0.2774) loss 2.7792 (2.8303) grad_norm 3.0545 (4.5874) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][420/1251] eta 0:04:09 lr 0.000103 wd 0.0500 time 2.2178 (0.2999) data time 0.0009 (0.0222) model time 2.2169 (0.2816) loss 2.9285 (2.8265) grad_norm 2.7201 (4.5583) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][430/1251] eta 0:04:09 lr 0.000103 wd 0.0500 time 0.2837 (0.3038) data time 0.0007 (0.0217) model time 0.2830 (0.2864) loss 2.9679 (2.8275) grad_norm 4.6618 (4.6244) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:03 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [244/300][440/1251] eta 0:04:13 lr 0.000103 wd 0.0500 time 0.2333 (0.3123) data time 0.0007 (0.0212) model time 0.2326 (0.2965) loss 2.1968 (2.8257) grad_norm 4.1231 (4.6174) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][450/1251] eta 0:04:09 lr 0.000103 wd 0.0500 time 0.2321 (0.3120) data time 0.0007 (0.0208) model time 0.2314 (0.2964) loss 2.5975 (2.8264) grad_norm 4.5435 (4.6117) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][460/1251] eta 0:04:06 lr 0.000103 wd 0.0500 time 0.2393 (0.3114) data time 0.0010 (0.0203) model time 0.2383 (0.2962) loss 2.3507 (2.8255) grad_norm 5.9071 (4.6128) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][470/1251] eta 0:04:02 lr 0.000103 wd 0.0500 time 0.2459 (0.3100) data time 0.0008 (0.0199) model time 0.2451 (0.2949) loss 1.9464 (2.8164) grad_norm 3.0799 (4.5982) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][480/1251] eta 0:03:57 lr 0.000103 wd 0.0500 time 0.2273 (0.3084) data time 0.0009 (0.0195) model time 0.2264 (0.2935) loss 3.3745 (2.8207) grad_norm 7.8486 (4.6393) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][490/1251] eta 0:03:53 lr 0.000103 wd 0.0500 time 0.2432 (0.3071) data time 0.0008 (0.0192) model time 0.2424 (0.2923) loss 3.2437 (2.8219) grad_norm 4.2188 (4.6264) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][500/1251] eta 0:03:49 lr 0.000103 wd 0.0500 time 0.2354 (0.3061) data time 0.0007 (0.0188) model time 0.2347 (0.2915) loss 3.3443 (2.8144) grad_norm 4.1963 
(4.6231) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][510/1251] eta 0:03:46 lr 0.000103 wd 0.0500 time 0.2424 (0.3057) data time 0.0009 (0.0184) model time 0.2415 (0.2914) loss 3.0403 (2.8132) grad_norm 5.6287 (4.7054) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][520/1251] eta 0:03:42 lr 0.000103 wd 0.0500 time 0.2411 (0.3044) data time 0.0011 (0.0181) model time 0.2400 (0.2903) loss 3.1284 (2.8130) grad_norm 4.1781 (4.7090) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][530/1251] eta 0:03:38 lr 0.000103 wd 0.0500 time 0.2277 (0.3031) data time 0.0007 (0.0178) model time 0.2270 (0.2891) loss 2.7873 (2.8129) grad_norm 4.3399 (4.7135) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][540/1251] eta 0:03:34 lr 0.000103 wd 0.0500 time 0.2395 (0.3020) data time 0.0009 (0.0175) model time 0.2385 (0.2881) loss 2.7394 (2.8110) grad_norm 5.4525 (4.6913) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][550/1251] eta 0:03:30 lr 0.000103 wd 0.0500 time 0.2443 (0.3008) data time 0.0011 (0.0172) model time 0.2432 (0.2870) loss 3.0700 (2.8122) grad_norm 4.7551 (4.6814) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][560/1251] eta 0:03:27 lr 0.000103 wd 0.0500 time 0.2336 (0.2997) data time 0.0009 (0.0169) model time 0.2327 (0.2861) loss 2.8716 (2.8129) grad_norm 4.4740 (4.6779) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][570/1251] eta 0:03:23 lr 0.000103 wd 0.0500 time 0.2393 (0.2986) data 
time 0.0007 (0.0166) model time 0.2386 (0.2851) loss 3.1015 (2.8170) grad_norm 5.0716 (4.6619) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][580/1251] eta 0:03:19 lr 0.000103 wd 0.0500 time 0.2370 (0.2975) data time 0.0009 (0.0163) model time 0.2361 (0.2842) loss 1.9877 (2.8127) grad_norm 4.7578 (4.6529) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][590/1251] eta 0:03:15 lr 0.000103 wd 0.0500 time 0.2407 (0.2965) data time 0.0009 (0.0161) model time 0.2398 (0.2833) loss 3.3383 (2.8108) grad_norm 3.3179 (4.6479) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][600/1251] eta 0:03:12 lr 0.000103 wd 0.0500 time 0.2292 (0.2955) data time 0.0007 (0.0158) model time 0.2285 (0.2824) loss 3.5176 (2.8112) grad_norm 7.0594 (4.6518) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][610/1251] eta 0:03:09 lr 0.000103 wd 0.0500 time 0.2298 (0.2949) data time 0.0007 (0.0156) model time 0.2291 (0.2820) loss 2.3925 (2.8106) grad_norm 3.8168 (4.7256) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][620/1251] eta 0:03:09 lr 0.000103 wd 0.0500 time 4.2772 (0.3004) data time 0.0007 (0.0154) model time 4.2765 (0.2882) loss 3.3500 (2.8138) grad_norm 5.5075 (4.7234) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][630/1251] eta 0:03:15 lr 0.000103 wd 0.0500 time 0.2778 (0.3147) data time 0.0009 (0.0151) model time 0.2769 (0.3040) loss 2.5539 (2.8177) grad_norm 4.8943 (4.7189) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [244/300][640/1251] eta 0:03:11 lr 0.000103 wd 0.0500 time 0.2321 (0.3137) data time 0.0009 (0.0149) model time 0.2312 (0.3030) loss 2.8714 (2.8178) grad_norm 4.3213 (4.7040) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][650/1251] eta 0:03:07 lr 0.000103 wd 0.0500 time 0.2337 (0.3125) data time 0.0007 (0.0147) model time 0.2330 (0.3019) loss 2.0286 (2.8178) grad_norm 2.9455 (4.6898) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][660/1251] eta 0:03:04 lr 0.000103 wd 0.0500 time 0.2313 (0.3114) data time 0.0011 (0.0145) model time 0.2302 (0.3009) loss 2.9330 (2.8139) grad_norm 3.6881 (4.7017) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][670/1251] eta 0:03:00 lr 0.000103 wd 0.0500 time 0.2244 (0.3103) data time 0.0010 (0.0143) model time 0.2234 (0.2998) loss 3.0539 (2.8114) grad_norm 3.7408 (4.7369) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][680/1251] eta 0:02:56 lr 0.000103 wd 0.0500 time 0.2261 (0.3092) data time 0.0011 (0.0141) model time 0.2250 (0.2988) loss 2.6400 (2.8130) grad_norm 2.6409 (4.7273) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][690/1251] eta 0:02:52 lr 0.000103 wd 0.0500 time 0.2367 (0.3082) data time 0.0010 (0.0139) model time 0.2357 (0.2978) loss 2.9547 (2.8149) grad_norm 4.3388 (4.7279) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][700/1251] eta 0:02:49 lr 0.000103 wd 0.0500 time 0.2410 (0.3082) data time 0.0009 (0.0137) model time 0.2401 (0.2980) loss 3.7694 (2.8138) grad_norm 4.8758 (4.7266) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 07:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][710/1251] eta 0:02:46 lr 0.000103 wd 0.0500 time 0.2338 (0.3072) data time 0.0007 (0.0135) model time 0.2331 (0.2970) loss 3.0054 (2.8148) grad_norm 3.8430 (4.7223) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][720/1251] eta 0:02:45 lr 0.000103 wd 0.0500 time 0.2437 (0.3114) data time 0.0009 (0.0134) model time 0.2428 (0.3017) loss 2.7101 (2.8127) grad_norm 3.4880 (4.7121) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][730/1251] eta 0:02:41 lr 0.000103 wd 0.0500 time 0.2447 (0.3104) data time 0.0008 (0.0132) model time 0.2439 (0.3007) loss 2.3631 (2.8106) grad_norm 9.5713 (4.7153) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][740/1251] eta 0:02:38 lr 0.000103 wd 0.0500 time 0.2336 (0.3093) data time 0.0010 (0.0130) model time 0.2326 (0.2997) loss 2.8911 (2.8105) grad_norm 5.4231 (4.7124) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][750/1251] eta 0:02:34 lr 0.000103 wd 0.0500 time 0.2360 (0.3088) data time 0.0009 (0.0129) model time 0.2351 (0.2993) loss 2.7909 (2.8116) grad_norm 4.4301 (4.7086) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][760/1251] eta 0:02:31 lr 0.000103 wd 0.0500 time 0.2229 (0.3078) data time 0.0011 (0.0127) model time 0.2218 (0.2983) loss 2.5952 (2.8077) grad_norm 5.4498 (4.6993) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][770/1251] eta 0:02:30 lr 0.000103 wd 0.0500 time 0.2358 (0.3122) data time 0.0010 (0.0126) model 
time 0.2348 (0.3032) loss 2.9345 (2.8112) grad_norm 3.4523 (4.7187) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][780/1251] eta 0:02:26 lr 0.000103 wd 0.0500 time 0.2406 (0.3113) data time 0.0009 (0.0124) model time 0.2396 (0.3023) loss 3.1842 (2.8097) grad_norm 4.1772 (4.7172) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][790/1251] eta 0:02:26 lr 0.000102 wd 0.0500 time 0.2372 (0.3169) data time 0.0009 (0.0145) model time 0.2364 (0.3061) loss 2.9359 (2.8081) grad_norm 3.0921 (4.7145) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][800/1251] eta 0:02:22 lr 0.000102 wd 0.0500 time 0.2320 (0.3164) data time 0.0007 (0.0143) model time 0.2313 (0.3056) loss 1.9241 (2.8090) grad_norm 3.3395 (4.7014) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][810/1251] eta 0:02:19 lr 0.000102 wd 0.0500 time 0.2336 (0.3159) data time 0.0008 (0.0141) model time 0.2327 (0.3052) loss 3.0510 (2.8096) grad_norm 4.5026 (4.6888) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][820/1251] eta 0:02:15 lr 0.000102 wd 0.0500 time 0.2300 (0.3149) data time 0.0008 (0.0140) model time 0.2291 (0.3043) loss 2.1543 (2.8091) grad_norm 4.6209 (4.7023) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][830/1251] eta 0:02:13 lr 0.000102 wd 0.0500 time 0.3390 (0.3180) data time 0.0009 (0.0141) model time 0.3381 (0.3074) loss 3.2215 (2.8078) grad_norm 3.2796 (4.6930) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][840/1251] 
eta 0:02:11 lr 0.000102 wd 0.0500 time 0.5745 (0.3196) data time 0.0009 (0.0144) model time 0.5736 (0.3087) loss 3.2829 (2.8065) grad_norm 3.7462 (4.6880) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][850/1251] eta 0:02:09 lr 0.000102 wd 0.0500 time 0.2367 (0.3236) data time 0.0009 (0.0159) model time 0.2358 (0.3114) loss 2.9604 (2.8049) grad_norm 2.7809 (4.7025) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][860/1251] eta 0:02:06 lr 0.000102 wd 0.0500 time 0.2343 (0.3226) data time 0.0009 (0.0157) model time 0.2334 (0.3104) loss 3.1126 (2.8058) grad_norm 4.0847 (4.6934) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][870/1251] eta 0:02:02 lr 0.000102 wd 0.0500 time 0.2328 (0.3216) data time 0.0009 (0.0156) model time 0.2319 (0.3095) loss 2.9760 (2.8029) grad_norm 4.1620 (4.6818) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][880/1251] eta 0:01:58 lr 0.000102 wd 0.0500 time 0.2286 (0.3206) data time 0.0014 (0.0154) model time 0.2272 (0.3086) loss 2.3608 (2.8045) grad_norm 4.0514 (4.6761) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][890/1251] eta 0:01:55 lr 0.000102 wd 0.0500 time 0.2422 (0.3197) data time 0.0012 (0.0152) model time 0.2410 (0.3078) loss 3.0838 (2.8033) grad_norm 4.6150 (4.6669) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][900/1251] eta 0:01:51 lr 0.000102 wd 0.0500 time 0.2323 (0.3188) data time 0.0007 (0.0151) model time 0.2316 (0.3069) loss 1.9299 (2.8031) grad_norm 2.8689 (4.6573) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
07:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][910/1251] eta 0:01:48 lr 0.000102 wd 0.0500 time 0.2432 (0.3179) data time 0.0009 (0.0149) model time 0.2422 (0.3061) loss 2.0640 (2.8019) grad_norm 5.3708 (4.7615) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][920/1251] eta 0:01:44 lr 0.000102 wd 0.0500 time 0.2335 (0.3170) data time 0.0009 (0.0148) model time 0.2326 (0.3052) loss 2.9570 (2.8026) grad_norm 3.4489 (4.7633) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][930/1251] eta 0:01:41 lr 0.000102 wd 0.0500 time 0.2307 (0.3161) data time 0.0010 (0.0146) model time 0.2297 (0.3045) loss 1.5304 (2.8034) grad_norm 4.5123 (4.7536) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][940/1251] eta 0:01:38 lr 0.000102 wd 0.0500 time 0.2274 (0.3153) data time 0.0010 (0.0145) model time 0.2264 (0.3037) loss 2.9851 (2.8037) grad_norm 4.3292 (4.7462) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][950/1251] eta 0:01:34 lr 0.000102 wd 0.0500 time 0.2358 (0.3144) data time 0.0009 (0.0143) model time 0.2349 (0.3029) loss 2.7419 (2.8015) grad_norm 3.3589 (4.7447) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][960/1251] eta 0:01:32 lr 0.000102 wd 0.0500 time 0.2310 (0.3190) data time 0.0009 (0.0142) model time 0.2302 (0.3079) loss 2.9844 (2.8040) grad_norm 8.8104 (4.7456) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][970/1251] eta 0:01:29 lr 0.000102 wd 0.0500 time 0.2404 (0.3182) data time 0.0009 (0.0141) model time 0.2396 (0.3071) loss 1.9433 
(2.8039) grad_norm 3.4657 (4.7390) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][980/1251] eta 0:01:26 lr 0.000102 wd 0.0500 time 0.2389 (0.3175) data time 0.0010 (0.0139) model time 0.2379 (0.3065) loss 3.0367 (2.8039) grad_norm 4.9107 (4.7369) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][990/1251] eta 0:01:23 lr 0.000102 wd 0.0500 time 0.4668 (0.3217) data time 0.2536 (0.0147) model time 0.2132 (0.3101) loss 3.0558 (2.8073) grad_norm 3.3739 (4.7281) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1000/1251] eta 0:01:20 lr 0.000102 wd 0.0500 time 0.2375 (0.3214) data time 0.0007 (0.0146) model time 0.2368 (0.3099) loss 1.8316 (2.8076) grad_norm 4.6382 (4.7241) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1010/1251] eta 0:01:17 lr 0.000102 wd 0.0500 time 0.2378 (0.3214) data time 0.0014 (0.0144) model time 0.2364 (0.3100) loss 2.5933 (2.8060) grad_norm 4.2283 (4.7219) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1020/1251] eta 0:01:14 lr 0.000102 wd 0.0500 time 0.2334 (0.3206) data time 0.0009 (0.0143) model time 0.2325 (0.3092) loss 2.9354 (2.8036) grad_norm 3.7432 (4.7561) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1030/1251] eta 0:01:10 lr 0.000102 wd 0.0500 time 0.2311 (0.3204) data time 0.0008 (0.0142) model time 0.2303 (0.3092) loss 2.0173 (2.8026) grad_norm 3.2931 (4.7494) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1040/1251] eta 0:01:07 lr 0.000102 wd 
0.0500 time 0.2293 (0.3197) data time 0.0008 (0.0140) model time 0.2285 (0.3086) loss 3.1054 (2.8002) grad_norm 7.4512 (4.7544) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1050/1251] eta 0:01:04 lr 0.000102 wd 0.0500 time 0.2341 (0.3190) data time 0.0010 (0.0139) model time 0.2331 (0.3079) loss 3.0908 (2.8003) grad_norm 3.9342 (4.7633) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1060/1251] eta 0:01:00 lr 0.000102 wd 0.0500 time 0.2382 (0.3181) data time 0.0007 (0.0138) model time 0.2375 (0.3071) loss 3.0804 (2.8008) grad_norm 3.7019 (4.7590) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1070/1251] eta 0:00:57 lr 0.000102 wd 0.0500 time 0.2342 (0.3174) data time 0.0009 (0.0137) model time 0.2333 (0.3064) loss 2.9992 (2.8029) grad_norm 3.7842 (4.7614) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1080/1251] eta 0:00:55 lr 0.000102 wd 0.0500 time 0.2898 (0.3262) data time 0.0007 (0.0135) model time 0.2891 (0.3157) loss 2.0439 (2.8021) grad_norm 4.8594 (4.7641) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1090/1251] eta 0:00:52 lr 0.000102 wd 0.0500 time 0.2361 (0.3266) data time 0.0009 (0.0134) model time 0.2352 (0.3163) loss 2.6936 (2.8025) grad_norm 4.7905 (4.7623) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1100/1251] eta 0:00:49 lr 0.000102 wd 0.0500 time 0.3434 (0.3286) data time 0.0007 (0.0133) model time 0.3427 (0.3184) loss 3.0262 (2.8036) grad_norm 3.8151 (4.7559) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:49 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1110/1251] eta 0:00:46 lr 0.000102 wd 0.0500 time 0.2305 (0.3278) data time 0.0010 (0.0132) model time 0.2295 (0.3177) loss 3.0592 (2.8031) grad_norm 3.3594 (4.7549) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1120/1251] eta 0:00:42 lr 0.000102 wd 0.0500 time 0.2340 (0.3269) data time 0.0009 (0.0131) model time 0.2330 (0.3169) loss 2.5159 (2.8034) grad_norm 10.6959 (4.7533) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1130/1251] eta 0:00:40 lr 0.000102 wd 0.0500 time 0.2326 (0.3334) data time 0.0009 (0.0130) model time 0.2317 (0.3238) loss 3.0915 (2.8048) grad_norm 4.9636 (4.7511) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1140/1251] eta 0:00:36 lr 0.000102 wd 0.0500 time 0.2510 (0.3326) data time 0.0008 (0.0129) model time 0.2502 (0.3230) loss 3.0773 (2.8049) grad_norm 3.3905 (4.7492) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1150/1251] eta 0:00:33 lr 0.000102 wd 0.0500 time 0.2302 (0.3345) data time 0.0010 (0.0140) model time 0.2292 (0.3238) loss 2.9473 (2.8039) grad_norm 4.7453 (4.7451) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1160/1251] eta 0:00:30 lr 0.000102 wd 0.0500 time 0.2389 (0.3348) data time 0.0010 (0.0139) model time 0.2379 (0.3242) loss 2.9467 (2.8040) grad_norm 7.5582 (4.7439) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1170/1251] eta 0:00:27 lr 0.000102 wd 0.0500 time 0.2346 (0.3343) data time 0.0010 (0.0139) model time 0.2336 (0.3236) loss 2.7887 
(2.8048) grad_norm 4.5240 (4.7477) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1180/1251] eta 0:00:23 lr 0.000101 wd 0.0500 time 0.2582 (0.3334) data time 0.0010 (0.0138) model time 0.2572 (0.3228) loss 3.0642 (2.8057) grad_norm 3.9034 (4.7419) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1190/1251] eta 0:00:20 lr 0.000101 wd 0.0500 time 0.2486 (0.3326) data time 0.0007 (0.0137) model time 0.2479 (0.3221) loss 3.0361 (2.8074) grad_norm 4.3715 (4.7752) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1200/1251] eta 0:00:16 lr 0.000101 wd 0.0500 time 0.2346 (0.3318) data time 0.0009 (0.0136) model time 0.2337 (0.3213) loss 3.1228 (2.8072) grad_norm 3.4771 (4.7687) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1210/1251] eta 0:00:13 lr 0.000101 wd 0.0500 time 0.2288 (0.3310) data time 0.0006 (0.0135) model time 0.2282 (0.3205) loss 2.9900 (2.8050) grad_norm 3.0443 (4.7644) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1220/1251] eta 0:00:10 lr 0.000101 wd 0.0500 time 0.2569 (0.3366) data time 0.0010 (0.0134) model time 0.2559 (0.3264) loss 2.7411 (2.8050) grad_norm 5.1627 (4.7599) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1230/1251] eta 0:00:07 lr 0.000101 wd 0.0500 time 0.2263 (0.3358) data time 0.0007 (0.0133) model time 0.2255 (0.3257) loss 3.4202 (2.8062) grad_norm 3.2487 (4.7616) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1240/1251] eta 0:00:03 lr 0.000101 wd 
0.0500 time 0.2251 (0.3350) data time 0.0007 (0.0132) model time 0.2244 (0.3249) loss 3.1687 (2.8075) grad_norm 6.7953 (4.7624) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [244/300][1250/1251] eta 0:00:00 lr 0.000101 wd 0.0500 time 0.2248 (0.3341) data time 0.0005 (0.0131) model time 0.2243 (0.3241) loss 2.5339 (2.8087) grad_norm 4.4583 (4.7557) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 244 training takes 0:06:57 [2024-09-01 07:06:43 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 07:06:44 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 07:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 1.546 (1.546) Loss 0.4014 (0.4014) Acc@1 92.969 (92.969) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 07:06:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.062 (0.547) Loss 0.5771 (0.6199) Acc@1 89.453 (87.251) Acc@5 97.852 (97.674) Mem 7381MB [2024-09-01 07:06:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.551) Loss 0.9517 (0.6536) Acc@1 76.562 (86.175) Acc@5 95.605 (97.577) Mem 7381MB [2024-09-01 07:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.411) Loss 1.1611 (0.7496) Acc@1 72.949 (83.773) Acc@5 92.480 (96.623) Mem 7381MB [2024-09-01 07:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.328) Loss 1.0439 (0.8018) Acc@1 76.562 (82.577) Acc@5 93.750 (96.082) Mem 7381MB [2024-09-01 07:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.190 Acc@5 96.016 [2024-09-01 07:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 
test images: 82.2% [2024-09-01 07:06:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.19% [2024-09-01 07:06:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 07:06:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 07:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 6.235 (6.235) Loss 0.3811 (0.3811) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 07:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.102 (0.796) Loss 0.5674 (0.5996) Acc@1 90.039 (87.589) Acc@5 98.047 (97.763) Mem 7381MB [2024-09-01 07:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.458) Loss 0.8945 (0.6306) Acc@1 77.734 (86.435) Acc@5 96.191 (97.726) Mem 7381MB [2024-09-01 07:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.335) Loss 1.0947 (0.7160) Acc@1 74.512 (84.381) Acc@5 93.555 (96.903) Mem 7381MB [2024-09-01 07:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.269) Loss 0.9971 (0.7610) Acc@1 76.953 (83.277) Acc@5 94.238 (96.391) Mem 7381MB [2024-09-01 07:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.896 Acc@5 96.346 [2024-09-01 07:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.90% [2024-09-01 07:07:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 07:07:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 07:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][0/1251] eta 0:20:35 lr 0.000101 wd 0.0500 time 0.9878 (0.9878) data time 0.7414 (0.7414) model time 0.0000 (0.0000) loss 2.6531 (2.6531) grad_norm 2.4848 (2.4848) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][10/1251] eta 0:06:18 lr 0.000101 wd 0.0500 time 0.2389 (0.3049) data time 0.0007 (0.0683) model time 0.0000 (0.0000) loss 2.1465 (3.0344) grad_norm 4.4373 (4.0028) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][20/1251] eta 0:06:50 lr 0.000101 wd 0.0500 time 1.5534 (0.3331) data time 0.0010 (0.0363) model time 0.0000 (0.0000) loss 2.6041 (2.8516) grad_norm 4.5874 (3.9265) loss_scale 256.0000 (170.6667) mem 7381MB [2024-09-01 07:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][30/1251] eta 0:06:10 lr 0.000101 wd 0.0500 time 0.2351 (0.3033) data time 0.0009 (0.0249) model time 0.0000 (0.0000) loss 3.0875 (2.7381) grad_norm 5.9706 (4.3257) loss_scale 256.0000 (198.1935) mem 7381MB [2024-09-01 07:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][40/1251] eta 0:05:47 lr 0.000101 wd 0.0500 time 0.2407 (0.2872) data time 0.0010 (0.0191) model time 0.0000 (0.0000) loss 3.1018 (2.7592) grad_norm 2.8638 (4.9208) loss_scale 256.0000 (212.2927) mem 7381MB [2024-09-01 07:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][50/1251] eta 0:05:47 lr 0.000101 wd 0.0500 time 0.2438 (0.2893) data time 0.0013 (0.0155) model time 0.0000 (0.0000) loss 2.7678 (2.7000) grad_norm 4.6693 (6.8323) loss_scale 256.0000 (220.8627) mem 7381MB [2024-09-01 07:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][60/1251] eta 0:05:38 lr 0.000101 wd 0.0500 time 0.2402 (0.2839) data time 0.0010 (0.0132) model time 0.2392 (0.2549) loss 
3.3648 (2.7328) grad_norm 5.3419 (6.4711) loss_scale 256.0000 (226.6230) mem 7381MB [2024-09-01 07:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][70/1251] eta 0:05:27 lr 0.000101 wd 0.0500 time 0.2369 (0.2770) data time 0.0010 (0.0115) model time 0.2359 (0.2444) loss 2.7765 (2.7473) grad_norm 4.3067 (6.5859) loss_scale 256.0000 (230.7606) mem 7381MB [2024-09-01 07:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][80/1251] eta 0:05:21 lr 0.000101 wd 0.0500 time 0.2353 (0.2749) data time 0.0009 (0.0102) model time 0.2345 (0.2492) loss 3.5360 (2.7953) grad_norm 5.5001 (6.3591) loss_scale 256.0000 (233.8765) mem 7381MB [2024-09-01 07:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][90/1251] eta 0:05:18 lr 0.000101 wd 0.0500 time 0.2338 (0.2743) data time 0.0008 (0.0092) model time 0.2330 (0.2541) loss 1.9778 (2.7863) grad_norm 3.5869 (6.1466) loss_scale 256.0000 (236.3077) mem 7381MB [2024-09-01 07:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][100/1251] eta 0:05:14 lr 0.000101 wd 0.0500 time 0.2356 (0.2734) data time 0.0009 (0.0084) model time 0.2347 (0.2561) loss 3.2966 (2.7893) grad_norm 4.3598 (6.0428) loss_scale 256.0000 (238.2574) mem 7381MB [2024-09-01 07:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][110/1251] eta 0:06:50 lr 0.000101 wd 0.0500 time 0.7394 (0.3599) data time 0.0007 (0.0077) model time 0.7387 (0.4188) loss 1.7705 (2.7912) grad_norm 11.9129 (5.9786) loss_scale 256.0000 (239.8559) mem 7381MB [2024-09-01 07:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][120/1251] eta 0:06:36 lr 0.000101 wd 0.0500 time 0.2297 (0.3502) data time 0.0011 (0.0071) model time 0.2286 (0.3935) loss 2.0494 (2.7976) grad_norm 4.4045 (5.8244) loss_scale 256.0000 (241.1901) mem 7381MB [2024-09-01 07:07:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][130/1251] eta 0:06:56 lr 0.000101 wd 
0.0500 time 4.1172 (0.3719) data time 0.0009 (0.0067) model time 4.1163 (0.4236) loss 2.9511 (2.7880) grad_norm 4.8632 (5.8168) loss_scale 256.0000 (242.3206) mem 7381MB [2024-09-01 07:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][140/1251] eta 0:07:24 lr 0.000101 wd 0.0500 time 0.2430 (0.3996) data time 0.0008 (0.0113) model time 0.2422 (0.4532) loss 2.8594 (2.7802) grad_norm 4.3009 (5.7466) loss_scale 256.0000 (243.2908) mem 7381MB [2024-09-01 07:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][150/1251] eta 0:07:15 lr 0.000101 wd 0.0500 time 0.2386 (0.3954) data time 0.0008 (0.0106) model time 0.2377 (0.4414) loss 3.3120 (2.7867) grad_norm 4.2781 (5.6768) loss_scale 256.0000 (244.1325) mem 7381MB [2024-09-01 07:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][160/1251] eta 0:07:00 lr 0.000101 wd 0.0500 time 0.2440 (0.3856) data time 0.0008 (0.0100) model time 0.2432 (0.4227) loss 3.1117 (2.7841) grad_norm 4.2442 (5.5942) loss_scale 256.0000 (244.8696) mem 7381MB [2024-09-01 07:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][170/1251] eta 0:06:47 lr 0.000101 wd 0.0500 time 0.2432 (0.3770) data time 0.0010 (0.0095) model time 0.2422 (0.4073) loss 3.0280 (2.7759) grad_norm 2.9973 (5.5078) loss_scale 256.0000 (245.5205) mem 7381MB [2024-09-01 07:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][180/1251] eta 0:06:35 lr 0.000101 wd 0.0500 time 0.2396 (0.3691) data time 0.0010 (0.0090) model time 0.2386 (0.3939) loss 2.8270 (2.7822) grad_norm 3.3379 (5.4624) loss_scale 256.0000 (246.0994) mem 7381MB [2024-09-01 07:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][190/1251] eta 0:06:24 lr 0.000101 wd 0.0500 time 0.2511 (0.3622) data time 0.0010 (0.0086) model time 0.2501 (0.3826) loss 3.0756 (2.7929) grad_norm 3.6291 (5.4042) loss_scale 256.0000 (246.6178) mem 7381MB [2024-09-01 07:08:22 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [245/300][200/1251] eta 0:06:14 lr 0.000101 wd 0.0500 time 0.2378 (0.3563) data time 0.0009 (0.0082) model time 0.2369 (0.3733) loss 2.2762 (2.7829) grad_norm 3.0731 (5.3412) loss_scale 256.0000 (247.0846) mem 7381MB [2024-09-01 07:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][210/1251] eta 0:06:05 lr 0.000101 wd 0.0500 time 0.2458 (0.3509) data time 0.0009 (0.0079) model time 0.2449 (0.3650) loss 2.9621 (2.7916) grad_norm 3.9793 (5.3057) loss_scale 256.0000 (247.5071) mem 7381MB [2024-09-01 07:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][220/1251] eta 0:05:56 lr 0.000101 wd 0.0500 time 0.2360 (0.3458) data time 0.0010 (0.0076) model time 0.2350 (0.3575) loss 2.9331 (2.7802) grad_norm 4.3061 (5.2539) loss_scale 256.0000 (247.8914) mem 7381MB [2024-09-01 07:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][230/1251] eta 0:05:48 lr 0.000101 wd 0.0500 time 0.2365 (0.3411) data time 0.0009 (0.0073) model time 0.2356 (0.3508) loss 2.9667 (2.7798) grad_norm 4.7028 (5.2384) loss_scale 256.0000 (248.2424) mem 7381MB [2024-09-01 07:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][240/1251] eta 0:05:40 lr 0.000101 wd 0.0500 time 0.2389 (0.3366) data time 0.0008 (0.0070) model time 0.2382 (0.3446) loss 2.5009 (2.7764) grad_norm 4.4467 (5.2084) loss_scale 256.0000 (248.5643) mem 7381MB [2024-09-01 07:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][250/1251] eta 0:05:37 lr 0.000101 wd 0.0500 time 0.2406 (0.3370) data time 0.0009 (0.0068) model time 0.2397 (0.3446) loss 2.9622 (2.7730) grad_norm 4.4997 (5.1669) loss_scale 256.0000 (248.8606) mem 7381MB [2024-09-01 07:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][260/1251] eta 0:05:30 lr 0.000101 wd 0.0500 time 0.2413 (0.3334) data time 0.0008 (0.0066) model time 0.2404 (0.3397) loss 2.7522 (2.7689) grad_norm 4.8315 
(5.1808) loss_scale 256.0000 (249.1341) mem 7381MB [2024-09-01 07:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][270/1251] eta 0:05:23 lr 0.000101 wd 0.0500 time 0.2346 (0.3299) data time 0.0009 (0.0064) model time 0.2337 (0.3351) loss 2.0472 (2.7585) grad_norm 4.9198 (5.1439) loss_scale 256.0000 (249.3875) mem 7381MB [2024-09-01 07:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][280/1251] eta 0:05:17 lr 0.000101 wd 0.0500 time 0.2405 (0.3265) data time 0.0008 (0.0062) model time 0.2397 (0.3307) loss 3.1127 (2.7653) grad_norm 2.5372 (5.1304) loss_scale 256.0000 (249.6228) mem 7381MB [2024-09-01 07:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][290/1251] eta 0:05:23 lr 0.000101 wd 0.0500 time 3.4310 (0.3361) data time 0.0007 (0.0060) model time 3.4303 (0.3421) loss 2.8049 (2.7595) grad_norm 3.3287 (5.0961) loss_scale 256.0000 (249.8419) mem 7381MB [2024-09-01 07:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][300/1251] eta 0:05:24 lr 0.000101 wd 0.0500 time 1.0238 (0.3415) data time 0.0010 (0.0058) model time 1.0229 (0.3483) loss 3.0787 (2.7609) grad_norm 4.9027 (5.0654) loss_scale 256.0000 (250.0465) mem 7381MB [2024-09-01 07:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][310/1251] eta 0:05:23 lr 0.000101 wd 0.0500 time 0.2368 (0.3435) data time 0.0010 (0.0057) model time 0.2359 (0.3504) loss 2.8948 (2.7626) grad_norm 7.2229 (5.0576) loss_scale 256.0000 (250.2379) mem 7381MB [2024-09-01 07:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][320/1251] eta 0:05:16 lr 0.000100 wd 0.0500 time 0.2700 (0.3404) data time 0.0012 (0.0055) model time 0.2688 (0.3464) loss 3.0678 (2.7710) grad_norm 4.1095 (5.0560) loss_scale 256.0000 (250.4174) mem 7381MB [2024-09-01 07:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][330/1251] eta 0:05:10 lr 0.000100 wd 0.0500 time 0.2322 (0.3372) data 
time 0.0007 (0.0054) model time 0.2315 (0.3423) loss 3.3628 (2.7701) grad_norm 6.0167 (5.0311) loss_scale 256.0000 (250.5861) mem 7381MB [2024-09-01 07:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][340/1251] eta 0:05:10 lr 0.000100 wd 0.0500 time 0.8177 (0.3411) data time 0.0010 (0.0053) model time 0.8167 (0.3467) loss 2.5638 (2.7715) grad_norm 3.7232 (5.0985) loss_scale 256.0000 (250.7449) mem 7381MB [2024-09-01 07:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][350/1251] eta 0:05:04 lr 0.000100 wd 0.0500 time 0.2616 (0.3385) data time 0.0010 (0.0053) model time 0.2606 (0.3432) loss 3.4152 (2.7764) grad_norm 5.4492 (5.0762) loss_scale 256.0000 (250.8946) mem 7381MB [2024-09-01 07:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][360/1251] eta 0:04:59 lr 0.000100 wd 0.0500 time 0.2323 (0.3357) data time 0.0011 (0.0052) model time 0.2312 (0.3398) loss 2.9300 (2.7744) grad_norm 2.7825 (5.0539) loss_scale 256.0000 (251.0360) mem 7381MB [2024-09-01 07:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][370/1251] eta 0:04:53 lr 0.000100 wd 0.0500 time 0.2334 (0.3330) data time 0.0009 (0.0051) model time 0.2325 (0.3366) loss 3.0326 (2.7737) grad_norm 4.0737 (5.0412) loss_scale 256.0000 (251.1698) mem 7381MB [2024-09-01 07:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][380/1251] eta 0:04:47 lr 0.000100 wd 0.0500 time 0.2250 (0.3305) data time 0.0009 (0.0050) model time 0.2240 (0.3335) loss 3.4669 (2.7760) grad_norm 3.8755 (5.0338) loss_scale 256.0000 (251.2966) mem 7381MB [2024-09-01 07:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][390/1251] eta 0:04:42 lr 0.000100 wd 0.0500 time 0.2372 (0.3282) data time 0.0007 (0.0049) model time 0.2365 (0.3307) loss 3.4723 (2.7806) grad_norm 3.0685 (4.9928) loss_scale 256.0000 (251.4169) mem 7381MB [2024-09-01 07:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [245/300][400/1251] eta 0:04:38 lr 0.000100 wd 0.0500 time 0.2258 (0.3271) data time 0.0009 (0.0048) model time 0.2249 (0.3294) loss 3.5585 (2.7857) grad_norm 3.4473 (4.9673) loss_scale 256.0000 (251.5312) mem 7381MB [2024-09-01 07:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][410/1251] eta 0:04:33 lr 0.000100 wd 0.0500 time 0.2522 (0.3255) data time 0.0009 (0.0047) model time 0.2513 (0.3275) loss 3.3808 (2.7848) grad_norm 5.5208 (5.1956) loss_scale 256.0000 (251.6399) mem 7381MB [2024-09-01 07:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][420/1251] eta 0:04:28 lr 0.000100 wd 0.0500 time 0.2342 (0.3235) data time 0.0010 (0.0046) model time 0.2332 (0.3251) loss 3.0128 (2.7772) grad_norm 5.0826 (5.1730) loss_scale 256.0000 (251.7435) mem 7381MB [2024-09-01 07:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][430/1251] eta 0:04:23 lr 0.000100 wd 0.0500 time 0.2392 (0.3215) data time 0.0007 (0.0045) model time 0.2384 (0.3228) loss 2.7611 (2.7733) grad_norm 3.8636 (5.1512) loss_scale 256.0000 (251.8422) mem 7381MB [2024-09-01 07:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][440/1251] eta 0:04:19 lr 0.000100 wd 0.0500 time 0.2325 (0.3196) data time 0.0009 (0.0044) model time 0.2316 (0.3205) loss 2.9239 (2.7802) grad_norm 4.7019 (5.1899) loss_scale 256.0000 (251.9365) mem 7381MB [2024-09-01 07:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][450/1251] eta 0:04:14 lr 0.000100 wd 0.0500 time 0.2300 (0.3178) data time 0.0007 (0.0044) model time 0.2292 (0.3185) loss 2.2456 (2.7811) grad_norm 3.3673 (5.1621) loss_scale 256.0000 (252.0266) mem 7381MB [2024-09-01 07:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][460/1251] eta 0:04:10 lr 0.000100 wd 0.0500 time 0.2410 (0.3162) data time 0.0008 (0.0043) model time 0.2402 (0.3166) loss 2.8376 (2.7771) grad_norm 3.3105 (5.1286) loss_scale 256.0000 
(252.1128) mem 7381MB [2024-09-01 07:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][470/1251] eta 0:04:05 lr 0.000100 wd 0.0500 time 0.2451 (0.3145) data time 0.0010 (0.0042) model time 0.2441 (0.3148) loss 3.3423 (2.7769) grad_norm 2.5692 (5.0954) loss_scale 256.0000 (252.1953) mem 7381MB [2024-09-01 07:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][480/1251] eta 0:04:01 lr 0.000100 wd 0.0500 time 0.2277 (0.3128) data time 0.0006 (0.0042) model time 0.2270 (0.3128) loss 1.9969 (2.7795) grad_norm 2.9204 (5.0803) loss_scale 256.0000 (252.2744) mem 7381MB [2024-09-01 07:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][490/1251] eta 0:03:56 lr 0.000100 wd 0.0500 time 0.2364 (0.3113) data time 0.0007 (0.0041) model time 0.2357 (0.3110) loss 2.4486 (2.7808) grad_norm 4.4305 (5.0843) loss_scale 256.0000 (252.3503) mem 7381MB [2024-09-01 07:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][500/1251] eta 0:03:54 lr 0.000100 wd 0.0500 time 0.2420 (0.3128) data time 0.0006 (0.0051) model time 0.2413 (0.3116) loss 2.4166 (2.7830) grad_norm 3.4298 (5.0592) loss_scale 256.0000 (252.4232) mem 7381MB [2024-09-01 07:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][510/1251] eta 0:03:50 lr 0.000100 wd 0.0500 time 0.2298 (0.3113) data time 0.0010 (0.0050) model time 0.2288 (0.3099) loss 2.9651 (2.7806) grad_norm 5.8185 (5.0464) loss_scale 256.0000 (252.4932) mem 7381MB [2024-09-01 07:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][520/1251] eta 0:03:47 lr 0.000100 wd 0.0500 time 0.2308 (0.3111) data time 0.0012 (0.0049) model time 0.2296 (0.3097) loss 3.2419 (2.7801) grad_norm 4.0040 (5.0605) loss_scale 256.0000 (252.5605) mem 7381MB [2024-09-01 07:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][530/1251] eta 0:03:43 lr 0.000100 wd 0.0500 time 0.2407 (0.3098) data time 0.0009 (0.0049) model 
time 0.2398 (0.3082) loss 2.8907 (2.7848) grad_norm 3.4701 (5.0411) loss_scale 256.0000 (252.6252) mem 7381MB [2024-09-01 07:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][540/1251] eta 0:03:39 lr 0.000100 wd 0.0500 time 0.2355 (0.3084) data time 0.0007 (0.0048) model time 0.2348 (0.3067) loss 2.6024 (2.7849) grad_norm 4.7350 (5.0182) loss_scale 256.0000 (252.6876) mem 7381MB [2024-09-01 07:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][550/1251] eta 0:03:46 lr 0.000100 wd 0.0500 time 0.3013 (0.3233) data time 0.0009 (0.0064) model time 0.3003 (0.3213) loss 2.1130 (2.7839) grad_norm 3.5472 (5.0011) loss_scale 256.0000 (252.7477) mem 7381MB [2024-09-01 07:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][560/1251] eta 0:03:43 lr 0.000100 wd 0.0500 time 0.7422 (0.3229) data time 0.0009 (0.0063) model time 0.7413 (0.3209) loss 3.3376 (2.7880) grad_norm 4.7360 (4.9839) loss_scale 256.0000 (252.8057) mem 7381MB [2024-09-01 07:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][570/1251] eta 0:03:41 lr 0.000100 wd 0.0500 time 0.3257 (0.3245) data time 0.0009 (0.0062) model time 0.3247 (0.3227) loss 2.2746 (2.7863) grad_norm 3.6484 (4.9861) loss_scale 256.0000 (252.8616) mem 7381MB [2024-09-01 07:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][580/1251] eta 0:03:36 lr 0.000100 wd 0.0500 time 0.2332 (0.3231) data time 0.0010 (0.0061) model time 0.2322 (0.3212) loss 2.7475 (2.7860) grad_norm 3.2855 (4.9729) loss_scale 256.0000 (252.9157) mem 7381MB [2024-09-01 07:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][590/1251] eta 0:03:32 lr 0.000100 wd 0.0500 time 0.2391 (0.3217) data time 0.0009 (0.0060) model time 0.2382 (0.3196) loss 1.8053 (2.7862) grad_norm 2.6846 (4.9513) loss_scale 256.0000 (252.9679) mem 7381MB [2024-09-01 07:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][600/1251] 
eta 0:03:28 lr 0.000100 wd 0.0500 time 0.2347 (0.3203) data time 0.0007 (0.0059) model time 0.2340 (0.3181) loss 3.4463 (2.7837) grad_norm 5.0691 (4.9362) loss_scale 256.0000 (253.0183) mem 7381MB [2024-09-01 07:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][610/1251] eta 0:03:24 lr 0.000100 wd 0.0500 time 0.2382 (0.3189) data time 0.0009 (0.0058) model time 0.2373 (0.3167) loss 2.4474 (2.7837) grad_norm 5.2872 (4.9193) loss_scale 256.0000 (253.0671) mem 7381MB [2024-09-01 07:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][620/1251] eta 0:03:20 lr 0.000100 wd 0.0500 time 0.2337 (0.3175) data time 0.0009 (0.0058) model time 0.2328 (0.3152) loss 2.7790 (2.7802) grad_norm 3.9825 (4.9149) loss_scale 256.0000 (253.1143) mem 7381MB [2024-09-01 07:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][630/1251] eta 0:03:16 lr 0.000100 wd 0.0500 time 0.2452 (0.3163) data time 0.0010 (0.0057) model time 0.2442 (0.3139) loss 3.2011 (2.7806) grad_norm 5.3721 (4.9057) loss_scale 256.0000 (253.1601) mem 7381MB [2024-09-01 07:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][640/1251] eta 0:03:12 lr 0.000100 wd 0.0500 time 0.2337 (0.3151) data time 0.0009 (0.0056) model time 0.2328 (0.3126) loss 3.0502 (2.7827) grad_norm 5.1323 (4.8991) loss_scale 256.0000 (253.2044) mem 7381MB [2024-09-01 07:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][650/1251] eta 0:03:08 lr 0.000100 wd 0.0500 time 0.2453 (0.3140) data time 0.0010 (0.0055) model time 0.2444 (0.3114) loss 3.1954 (2.7841) grad_norm 4.1913 (4.8990) loss_scale 256.0000 (253.2473) mem 7381MB [2024-09-01 07:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][660/1251] eta 0:03:04 lr 0.000100 wd 0.0500 time 0.2415 (0.3129) data time 0.0011 (0.0055) model time 0.2404 (0.3102) loss 3.4101 (2.7857) grad_norm 3.6176 (4.8823) loss_scale 256.0000 (253.2890) mem 7381MB [2024-09-01 
07:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][670/1251] eta 0:03:01 lr 0.000100 wd 0.0500 time 0.2571 (0.3118) data time 0.0009 (0.0054) model time 0.2562 (0.3091) loss 2.9883 (2.7853) grad_norm 4.7009 (4.8721) loss_scale 256.0000 (253.3294) mem 7381MB [2024-09-01 07:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][680/1251] eta 0:02:57 lr 0.000100 wd 0.0500 time 0.2428 (0.3108) data time 0.0011 (0.0053) model time 0.2417 (0.3081) loss 2.7557 (2.7835) grad_norm 2.5911 (4.8656) loss_scale 256.0000 (253.3686) mem 7381MB [2024-09-01 07:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][690/1251] eta 0:02:53 lr 0.000100 wd 0.0500 time 0.2435 (0.3098) data time 0.0008 (0.0053) model time 0.2428 (0.3070) loss 1.9534 (2.7777) grad_norm 3.3507 (4.8623) loss_scale 256.0000 (253.4067) mem 7381MB [2024-09-01 07:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][700/1251] eta 0:02:50 lr 0.000100 wd 0.0500 time 0.2367 (0.3089) data time 0.0011 (0.0052) model time 0.2356 (0.3060) loss 2.1986 (2.7796) grad_norm 3.3773 (4.8542) loss_scale 256.0000 (253.4437) mem 7381MB [2024-09-01 07:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][710/1251] eta 0:02:46 lr 0.000099 wd 0.0500 time 0.2493 (0.3079) data time 0.0010 (0.0051) model time 0.2483 (0.3050) loss 2.9361 (2.7799) grad_norm 3.2285 (4.8698) loss_scale 256.0000 (253.4796) mem 7381MB [2024-09-01 07:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][720/1251] eta 0:02:43 lr 0.000099 wd 0.0500 time 0.2552 (0.3070) data time 0.0009 (0.0051) model time 0.2543 (0.3041) loss 2.9405 (2.7802) grad_norm 4.0366 (4.8712) loss_scale 256.0000 (253.5146) mem 7381MB [2024-09-01 07:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][730/1251] eta 0:02:39 lr 0.000099 wd 0.0500 time 0.2469 (0.3062) data time 0.0009 (0.0050) model time 0.2460 (0.3032) loss 2.2776 
(2.7791) grad_norm 6.3493 (4.8624) loss_scale 256.0000 (253.5486) mem 7381MB [2024-09-01 07:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][740/1251] eta 0:02:35 lr 0.000099 wd 0.0500 time 0.2424 (0.3053) data time 0.0009 (0.0050) model time 0.2416 (0.3022) loss 2.7815 (2.7789) grad_norm 4.4124 (4.8735) loss_scale 256.0000 (253.5816) mem 7381MB [2024-09-01 07:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][750/1251] eta 0:02:32 lr 0.000099 wd 0.0500 time 0.2456 (0.3044) data time 0.0009 (0.0049) model time 0.2447 (0.3014) loss 3.0012 (2.7818) grad_norm 4.7966 (4.8778) loss_scale 256.0000 (253.6138) mem 7381MB [2024-09-01 07:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][760/1251] eta 0:02:29 lr 0.000099 wd 0.0500 time 0.2451 (0.3036) data time 0.0010 (0.0049) model time 0.2441 (0.3005) loss 3.0426 (2.7822) grad_norm 3.0928 (4.8715) loss_scale 256.0000 (253.6452) mem 7381MB [2024-09-01 07:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][770/1251] eta 0:02:25 lr 0.000099 wd 0.0500 time 0.2455 (0.3028) data time 0.0008 (0.0048) model time 0.2447 (0.2997) loss 3.0796 (2.7824) grad_norm 8.9994 (4.8774) loss_scale 256.0000 (253.6757) mem 7381MB [2024-09-01 07:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][780/1251] eta 0:02:22 lr 0.000099 wd 0.0500 time 0.2296 (0.3020) data time 0.0008 (0.0048) model time 0.2289 (0.2989) loss 1.8451 (2.7806) grad_norm 4.1278 (4.8747) loss_scale 256.0000 (253.7055) mem 7381MB [2024-09-01 07:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][790/1251] eta 0:02:18 lr 0.000099 wd 0.0500 time 0.2477 (0.3013) data time 0.0011 (0.0047) model time 0.2466 (0.2981) loss 2.9976 (2.7785) grad_norm 2.7699 (4.8641) loss_scale 256.0000 (253.7345) mem 7381MB [2024-09-01 07:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][800/1251] eta 0:02:15 lr 0.000099 wd 0.0500 
time 0.2399 (0.3005) data time 0.0009 (0.0047) model time 0.2390 (0.2974) loss 1.8695 (2.7789) grad_norm 3.0758 (4.8451) loss_scale 256.0000 (253.7628) mem 7381MB [2024-09-01 07:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][810/1251] eta 0:02:12 lr 0.000099 wd 0.0500 time 0.2498 (0.2998) data time 0.0009 (0.0046) model time 0.2489 (0.2966) loss 2.4019 (2.7796) grad_norm 3.9107 (4.8327) loss_scale 256.0000 (253.7904) mem 7381MB [2024-09-01 07:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][820/1251] eta 0:02:08 lr 0.000099 wd 0.0500 time 0.2308 (0.2990) data time 0.0007 (0.0046) model time 0.2301 (0.2958) loss 2.2296 (2.7802) grad_norm 3.4176 (4.8298) loss_scale 256.0000 (253.8173) mem 7381MB [2024-09-01 07:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][830/1251] eta 0:02:05 lr 0.000099 wd 0.0500 time 0.2367 (0.2983) data time 0.0010 (0.0046) model time 0.2357 (0.2950) loss 2.9521 (2.7812) grad_norm 3.7716 (4.8313) loss_scale 256.0000 (253.8436) mem 7381MB [2024-09-01 07:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][840/1251] eta 0:02:02 lr 0.000099 wd 0.0500 time 0.2400 (0.2976) data time 0.0011 (0.0045) model time 0.2389 (0.2943) loss 2.8127 (2.7826) grad_norm 3.3634 (4.8318) loss_scale 256.0000 (253.8692) mem 7381MB [2024-09-01 07:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][850/1251] eta 0:01:59 lr 0.000099 wd 0.0500 time 0.2333 (0.2969) data time 0.0012 (0.0045) model time 0.2321 (0.2936) loss 3.2146 (2.7832) grad_norm 3.9212 (4.8362) loss_scale 256.0000 (253.8942) mem 7381MB [2024-09-01 07:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][860/1251] eta 0:01:55 lr 0.000099 wd 0.0500 time 0.2453 (0.2963) data time 0.0011 (0.0044) model time 0.2442 (0.2930) loss 2.2744 (2.7834) grad_norm 3.4898 (4.8275) loss_scale 256.0000 (253.9187) mem 7381MB [2024-09-01 07:11:28 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [245/300][870/1251] eta 0:01:52 lr 0.000099 wd 0.0500 time 0.2415 (0.2956) data time 0.0010 (0.0044) model time 0.2405 (0.2923) loss 3.2177 (2.7829) grad_norm 3.0673 (4.8125) loss_scale 256.0000 (253.9426) mem 7381MB [2024-09-01 07:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][880/1251] eta 0:01:49 lr 0.000099 wd 0.0500 time 0.2404 (0.2950) data time 0.0009 (0.0044) model time 0.2395 (0.2917) loss 2.6254 (2.7832) grad_norm 4.0066 (4.8007) loss_scale 256.0000 (253.9659) mem 7381MB [2024-09-01 07:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][890/1251] eta 0:01:46 lr 0.000099 wd 0.0500 time 0.2392 (0.2943) data time 0.0007 (0.0043) model time 0.2385 (0.2910) loss 3.0311 (2.7844) grad_norm 6.7883 (4.8030) loss_scale 256.0000 (253.9888) mem 7381MB [2024-09-01 07:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][900/1251] eta 0:01:43 lr 0.000099 wd 0.0500 time 0.2359 (0.2937) data time 0.0010 (0.0043) model time 0.2349 (0.2904) loss 2.9600 (2.7838) grad_norm 3.6067 (4.7962) loss_scale 256.0000 (254.0111) mem 7381MB [2024-09-01 07:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][910/1251] eta 0:01:39 lr 0.000099 wd 0.0500 time 0.2420 (0.2931) data time 0.0009 (0.0042) model time 0.2411 (0.2898) loss 1.8879 (2.7843) grad_norm 38.9053 (4.8270) loss_scale 256.0000 (254.0329) mem 7381MB [2024-09-01 07:11:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][920/1251] eta 0:01:36 lr 0.000099 wd 0.0500 time 0.2397 (0.2926) data time 0.0008 (0.0042) model time 0.2389 (0.2892) loss 2.3741 (2.7811) grad_norm 7.5432 (4.8267) loss_scale 256.0000 (254.0543) mem 7381MB [2024-09-01 07:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][930/1251] eta 0:01:33 lr 0.000099 wd 0.0500 time 0.2410 (0.2920) data time 0.0007 (0.0042) model time 0.2403 (0.2886) loss 3.1824 (2.7781) grad_norm 3.9877 
(4.8343) loss_scale 256.0000 (254.0752) mem 7381MB [2024-09-01 07:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][940/1251] eta 0:01:30 lr 0.000099 wd 0.0500 time 0.2386 (0.2915) data time 0.0008 (0.0041) model time 0.2378 (0.2881) loss 3.3742 (2.7801) grad_norm 4.5904 (4.8368) loss_scale 256.0000 (254.0956) mem 7381MB [2024-09-01 07:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][950/1251] eta 0:01:27 lr 0.000099 wd 0.0500 time 0.2343 (0.2909) data time 0.0010 (0.0041) model time 0.2333 (0.2875) loss 2.7917 (2.7803) grad_norm 7.6973 (4.8405) loss_scale 256.0000 (254.1157) mem 7381MB [2024-09-01 07:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][960/1251] eta 0:01:24 lr 0.000099 wd 0.0500 time 0.2371 (0.2904) data time 0.0007 (0.0041) model time 0.2363 (0.2870) loss 2.0968 (2.7786) grad_norm 3.5946 (4.8368) loss_scale 256.0000 (254.1353) mem 7381MB [2024-09-01 07:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][970/1251] eta 0:01:21 lr 0.000099 wd 0.0500 time 0.2436 (0.2903) data time 0.0011 (0.0040) model time 0.2426 (0.2870) loss 2.9746 (2.7797) grad_norm 5.0761 (4.8576) loss_scale 256.0000 (254.1545) mem 7381MB [2024-09-01 07:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][980/1251] eta 0:01:18 lr 0.000099 wd 0.0500 time 0.2420 (0.2898) data time 0.0010 (0.0040) model time 0.2410 (0.2864) loss 3.0498 (2.7814) grad_norm 4.7037 (4.8597) loss_scale 256.0000 (254.1733) mem 7381MB [2024-09-01 07:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][990/1251] eta 0:01:15 lr 0.000099 wd 0.0500 time 0.2467 (0.2895) data time 0.0007 (0.0040) model time 0.2460 (0.2862) loss 2.7021 (2.7819) grad_norm 6.2361 (4.8540) loss_scale 256.0000 (254.1917) mem 7381MB [2024-09-01 07:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1000/1251] eta 0:01:12 lr 0.000099 wd 0.0500 time 0.2383 (0.2890) 
data time 0.0006 (0.0039) model time 0.2377 (0.2856) loss 3.6586 (2.7826) grad_norm 3.4199 (4.8418) loss_scale 256.0000 (254.2098) mem 7381MB [2024-09-01 07:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1010/1251] eta 0:01:09 lr 0.000099 wd 0.0500 time 0.2338 (0.2884) data time 0.0010 (0.0039) model time 0.2328 (0.2851) loss 3.1227 (2.7812) grad_norm 7.3334 (4.8359) loss_scale 256.0000 (254.2275) mem 7381MB [2024-09-01 07:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1020/1251] eta 0:01:07 lr 0.000099 wd 0.0500 time 0.2391 (0.2901) data time 0.0008 (0.0039) model time 0.2384 (0.2869) loss 1.6844 (2.7795) grad_norm 3.5629 (4.8620) loss_scale 256.0000 (254.2449) mem 7381MB [2024-09-01 07:12:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1030/1251] eta 0:01:04 lr 0.000099 wd 0.0500 time 0.2422 (0.2896) data time 0.0008 (0.0039) model time 0.2415 (0.2864) loss 2.6246 (2.7797) grad_norm 4.0017 (4.8573) loss_scale 256.0000 (254.2619) mem 7381MB [2024-09-01 07:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1040/1251] eta 0:01:01 lr 0.000099 wd 0.0500 time 0.2379 (0.2891) data time 0.0007 (0.0038) model time 0.2372 (0.2859) loss 3.1679 (2.7808) grad_norm 3.8827 (4.8773) loss_scale 256.0000 (254.2786) mem 7381MB [2024-09-01 07:12:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1050/1251] eta 0:00:58 lr 0.000099 wd 0.0500 time 0.2421 (0.2887) data time 0.0008 (0.0038) model time 0.2413 (0.2854) loss 2.4279 (2.7800) grad_norm 3.4785 (4.8748) loss_scale 256.0000 (254.2950) mem 7381MB [2024-09-01 07:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1060/1251] eta 0:00:55 lr 0.000099 wd 0.0500 time 0.2425 (0.2882) data time 0.0006 (0.0038) model time 0.2418 (0.2850) loss 2.2086 (2.7790) grad_norm 5.7188 (4.8756) loss_scale 256.0000 (254.3110) mem 7381MB [2024-09-01 07:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [245/300][1070/1251] eta 0:00:52 lr 0.000099 wd 0.0500 time 0.2336 (0.2878) data time 0.0011 (0.0038) model time 0.2325 (0.2845) loss 2.1227 (2.7786) grad_norm 4.0316 (4.8708) loss_scale 256.0000 (254.3268) mem 7381MB [2024-09-01 07:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1080/1251] eta 0:00:49 lr 0.000099 wd 0.0500 time 0.2390 (0.2873) data time 0.0010 (0.0037) model time 0.2380 (0.2841) loss 2.4975 (2.7788) grad_norm 3.4222 (4.8719) loss_scale 256.0000 (254.3423) mem 7381MB [2024-09-01 07:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1090/1251] eta 0:00:46 lr 0.000099 wd 0.0500 time 0.2372 (0.2868) data time 0.0011 (0.0037) model time 0.2361 (0.2836) loss 3.3673 (2.7804) grad_norm 23.3895 (4.8798) loss_scale 256.0000 (254.3575) mem 7381MB [2024-09-01 07:12:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1100/1251] eta 0:00:43 lr 0.000098 wd 0.0500 time 0.2503 (0.2864) data time 0.0009 (0.0037) model time 0.2494 (0.2832) loss 3.0849 (2.7815) grad_norm 5.3705 (4.8833) loss_scale 256.0000 (254.3724) mem 7381MB [2024-09-01 07:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1110/1251] eta 0:00:40 lr 0.000098 wd 0.0500 time 0.2369 (0.2860) data time 0.0007 (0.0037) model time 0.2361 (0.2828) loss 3.6169 (2.7830) grad_norm 4.0612 (4.8868) loss_scale 256.0000 (254.3870) mem 7381MB [2024-09-01 07:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1120/1251] eta 0:00:37 lr 0.000098 wd 0.0500 time 0.2389 (0.2856) data time 0.0010 (0.0036) model time 0.2380 (0.2824) loss 2.9464 (2.7811) grad_norm 3.9433 (4.8845) loss_scale 256.0000 (254.4014) mem 7381MB [2024-09-01 07:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1130/1251] eta 0:00:34 lr 0.000098 wd 0.0500 time 0.2501 (0.2852) data time 0.0010 (0.0036) model time 0.2491 (0.2820) loss 3.4072 (2.7827) grad_norm 3.8395 (4.8810) loss_scale 
256.0000 (254.4156) mem 7381MB [2024-09-01 07:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1140/1251] eta 0:00:31 lr 0.000098 wd 0.0500 time 0.2350 (0.2848) data time 0.0009 (0.0036) model time 0.2341 (0.2816) loss 2.6509 (2.7826) grad_norm 3.8033 (4.8749) loss_scale 256.0000 (254.4294) mem 7381MB [2024-09-01 07:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1150/1251] eta 0:00:28 lr 0.000098 wd 0.0500 time 0.2310 (0.2844) data time 0.0010 (0.0036) model time 0.2299 (0.2812) loss 3.1482 (2.7855) grad_norm 3.1714 (4.8702) loss_scale 256.0000 (254.4431) mem 7381MB [2024-09-01 07:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1160/1251] eta 0:00:25 lr 0.000098 wd 0.0500 time 0.2330 (0.2840) data time 0.0011 (0.0035) model time 0.2318 (0.2808) loss 2.6815 (2.7861) grad_norm 4.4255 (4.8649) loss_scale 256.0000 (254.4565) mem 7381MB [2024-09-01 07:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1170/1251] eta 0:00:22 lr 0.000098 wd 0.0500 time 0.2453 (0.2836) data time 0.0008 (0.0035) model time 0.2445 (0.2803) loss 3.0611 (2.7842) grad_norm 4.8035 (4.8637) loss_scale 256.0000 (254.4697) mem 7381MB [2024-09-01 07:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1180/1251] eta 0:00:20 lr 0.000098 wd 0.0500 time 0.2424 (0.2831) data time 0.0012 (0.0035) model time 0.2412 (0.2799) loss 2.5141 (2.7837) grad_norm 5.9607 (4.8721) loss_scale 256.0000 (254.4826) mem 7381MB [2024-09-01 07:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1190/1251] eta 0:00:17 lr 0.000098 wd 0.0500 time 0.2324 (0.2837) data time 0.0007 (0.0042) model time 0.2316 (0.2797) loss 3.0679 (2.7819) grad_norm 3.9791 (4.8739) loss_scale 256.0000 (254.4954) mem 7381MB [2024-09-01 07:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1200/1251] eta 0:00:14 lr 0.000098 wd 0.0500 time 0.2358 (0.2835) data time 0.0007 
(0.0042) model time 0.2351 (0.2796) loss 3.2377 (2.7837) grad_norm 3.8341 (4.8797) loss_scale 256.0000 (254.5079) mem 7381MB [2024-09-01 07:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1210/1251] eta 0:00:11 lr 0.000098 wd 0.0500 time 0.2556 (0.2855) data time 0.0007 (0.0042) model time 0.2548 (0.2817) loss 2.3004 (2.7841) grad_norm 5.6043 (4.8763) loss_scale 256.0000 (254.5202) mem 7381MB [2024-09-01 07:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1220/1251] eta 0:00:08 lr 0.000098 wd 0.0500 time 0.2345 (0.2855) data time 0.0012 (0.0041) model time 0.2333 (0.2817) loss 2.7830 (2.7835) grad_norm 4.9342 (4.8685) loss_scale 256.0000 (254.5324) mem 7381MB [2024-09-01 07:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1230/1251] eta 0:00:05 lr 0.000098 wd 0.0500 time 0.2298 (0.2851) data time 0.0009 (0.0041) model time 0.2289 (0.2813) loss 3.0429 (2.7822) grad_norm 4.3019 (4.8647) loss_scale 256.0000 (254.5443) mem 7381MB [2024-09-01 07:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1240/1251] eta 0:00:03 lr 0.000098 wd 0.0500 time 0.2286 (0.2847) data time 0.0007 (0.0041) model time 0.2279 (0.2809) loss 2.8887 (2.7815) grad_norm 6.8766 (4.8674) loss_scale 256.0000 (254.5560) mem 7381MB [2024-09-01 07:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [245/300][1250/1251] eta 0:00:00 lr 0.000098 wd 0.0500 time 0.2274 (0.2842) data time 0.0006 (0.0041) model time 0.2267 (0.2804) loss 3.2549 (2.7832) grad_norm 2.8336 (4.8633) loss_scale 256.0000 (254.5675) mem 7381MB [2024-09-01 07:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 245 training takes 0:05:55 [2024-09-01 07:13:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 07:13:07 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 07:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.448 (0.448) Loss 0.4143 (0.4143) Acc@1 92.871 (92.871) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 07:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.114) Loss 0.6089 (0.6456) Acc@1 88.965 (86.985) Acc@5 98.145 (97.523) Mem 7381MB [2024-09-01 07:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.096) Loss 0.9219 (0.6710) Acc@1 77.539 (85.896) Acc@5 96.191 (97.535) Mem 7381MB [2024-09-01 07:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.1768 (0.7638) Acc@1 73.145 (83.616) Acc@5 92.480 (96.654) Mem 7381MB [2024-09-01 07:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0459 (0.8120) Acc@1 75.977 (82.541) Acc@5 94.043 (96.077) Mem 7381MB [2024-09-01 07:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.140 Acc@5 96.018 [2024-09-01 07:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.1% [2024-09-01 07:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.952 (0.952) Loss 0.3816 (0.3816) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 07:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.157) Loss 0.5674 (0.5994) Acc@1 90.039 (87.598) Acc@5 98.047 (97.772) Mem 7381MB [2024-09-01 07:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.119) Loss 0.8970 (0.6306) Acc@1 77.930 (86.454) Acc@5 96.191 (97.726) Mem 7381MB [2024-09-01 07:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.105) Loss 1.0947 (0.7163) Acc@1 74.121 (84.362) Acc@5 
93.457 (96.884) Mem 7381MB [2024-09-01 07:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.096) Loss 0.9961 (0.7613) Acc@1 76.953 (83.256) Acc@5 94.141 (96.391) Mem 7381MB [2024-09-01 07:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.896 Acc@5 96.346 [2024-09-01 07:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][0/1251] eta 0:33:36 lr 0.000098 wd 0.0500 time 1.6117 (1.6117) data time 0.7007 (0.7007) model time 0.0000 (0.0000) loss 3.1616 (3.1616) grad_norm 3.8299 (3.8299) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][10/1251] eta 0:07:29 lr 0.000098 wd 0.0500 time 0.2399 (0.3620) data time 0.0007 (0.0645) model time 0.0000 (0.0000) loss 1.7388 (2.6983) grad_norm 4.3801 (4.0380) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][20/1251] eta 0:06:12 lr 0.000098 wd 0.0500 time 0.2272 (0.3027) data time 0.0007 (0.0343) model time 0.0000 (0.0000) loss 3.2999 (2.7763) grad_norm 4.2099 (4.1032) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][30/1251] eta 0:05:42 lr 0.000098 wd 0.0500 time 0.2352 (0.2802) data time 0.0010 (0.0235) model time 0.0000 (0.0000) loss 2.8388 (2.7819) grad_norm 4.2828 (4.0829) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][40/1251] eta 0:05:28 lr 0.000098 wd 0.0500 time 0.2472 (0.2710) data time 0.0009 (0.0180) model time 0.0000 (0.0000) loss 3.6980 (2.7926) grad_norm 11.3054 (4.7195) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [246/300][50/1251] eta 0:05:27 lr 0.000098 wd 0.0500 time 0.2419 (0.2727) data time 0.0012 (0.0147) model time 0.0000 (0.0000) loss 2.8551 (2.7717) grad_norm 3.3525 (4.7648) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][60/1251] eta 0:05:17 lr 0.000098 wd 0.0500 time 0.2337 (0.2667) data time 0.0010 (0.0124) model time 0.2327 (0.2351) loss 3.4308 (2.8205) grad_norm 3.7766 (4.7951) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][70/1251] eta 0:05:10 lr 0.000098 wd 0.0500 time 0.2328 (0.2625) data time 0.0010 (0.0108) model time 0.2318 (0.2356) loss 2.3869 (2.8145) grad_norm 3.5536 (4.6106) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][80/1251] eta 0:05:03 lr 0.000098 wd 0.0500 time 0.2410 (0.2594) data time 0.0009 (0.0096) model time 0.2401 (0.2357) loss 2.8556 (2.8435) grad_norm 5.4178 (4.5955) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][90/1251] eta 0:04:57 lr 0.000098 wd 0.0500 time 0.2367 (0.2567) data time 0.0011 (0.0087) model time 0.2357 (0.2352) loss 2.9627 (2.8420) grad_norm 5.9174 (4.6396) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][100/1251] eta 0:05:12 lr 0.000098 wd 0.0500 time 0.2329 (0.2715) data time 0.0011 (0.0250) model time 0.2319 (0.2349) loss 2.6238 (2.8284) grad_norm 5.6343 (4.6229) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][110/1251] eta 0:05:16 lr 0.000098 wd 0.0500 time 0.2571 (0.2771) data time 0.0021 (0.0228) model time 0.2550 (0.2512) loss 3.1616 (2.8282) grad_norm 4.2032 (4.5629) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 07:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][120/1251] eta 0:05:09 lr 0.000098 wd 0.0500 time 0.2334 (0.2736) data time 0.0009 (0.0210) model time 0.2325 (0.2487) loss 2.2112 (2.8179) grad_norm 3.0209 (4.5358) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][130/1251] eta 0:05:03 lr 0.000098 wd 0.0500 time 0.2391 (0.2707) data time 0.0010 (0.0195) model time 0.2381 (0.2469) loss 3.0821 (2.8189) grad_norm 6.9289 (4.5357) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][140/1251] eta 0:04:57 lr 0.000098 wd 0.0500 time 0.2390 (0.2682) data time 0.0008 (0.0182) model time 0.2382 (0.2455) loss 2.5718 (2.8089) grad_norm 3.9020 (4.5273) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][150/1251] eta 0:04:52 lr 0.000098 wd 0.0500 time 0.2356 (0.2661) data time 0.0011 (0.0170) model time 0.2345 (0.2445) loss 2.9314 (2.8085) grad_norm 3.7374 (4.4968) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][160/1251] eta 0:04:48 lr 0.000098 wd 0.0500 time 0.2330 (0.2644) data time 0.0007 (0.0160) model time 0.2323 (0.2438) loss 2.4522 (2.8035) grad_norm 3.2514 (4.4557) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][170/1251] eta 0:04:43 lr 0.000098 wd 0.0500 time 0.2262 (0.2627) data time 0.0009 (0.0152) model time 0.2253 (0.2430) loss 3.5022 (2.8106) grad_norm 3.5223 (4.4562) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][180/1251] eta 0:04:40 lr 0.000098 wd 0.0500 time 0.2940 (0.2618) data time 0.0009 (0.0144) model 
time 0.2931 (0.2432) loss 3.6476 (2.8164) grad_norm 3.4342 (4.4935) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][190/1251] eta 0:04:53 lr 0.000098 wd 0.0500 time 0.2430 (0.2762) data time 0.0011 (0.0178) model time 0.2418 (0.2586) loss 2.5519 (2.8162) grad_norm 4.4699 (4.4748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][200/1251] eta 0:04:48 lr 0.000098 wd 0.0500 time 0.2332 (0.2744) data time 0.0012 (0.0170) model time 0.2320 (0.2573) loss 2.8332 (2.8216) grad_norm 3.8978 (4.4721) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][210/1251] eta 0:04:43 lr 0.000098 wd 0.0500 time 0.2327 (0.2725) data time 0.0011 (0.0162) model time 0.2317 (0.2558) loss 3.2815 (2.8275) grad_norm 3.9995 (4.4657) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][220/1251] eta 0:04:39 lr 0.000098 wd 0.0500 time 0.2310 (0.2708) data time 0.0011 (0.0155) model time 0.2299 (0.2545) loss 3.1592 (2.8374) grad_norm 4.0089 (4.5132) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][230/1251] eta 0:04:34 lr 0.000098 wd 0.0500 time 0.2392 (0.2693) data time 0.0011 (0.0149) model time 0.2381 (0.2534) loss 2.9669 (2.8376) grad_norm 2.9697 (4.8149) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][240/1251] eta 0:04:31 lr 0.000098 wd 0.0500 time 0.2599 (0.2690) data time 0.0009 (0.0143) model time 0.2590 (0.2538) loss 3.3754 (2.8510) grad_norm 4.2137 (4.7862) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][250/1251] 
eta 0:04:28 lr 0.000097 wd 0.0500 time 0.2506 (0.2679) data time 0.0011 (0.0138) model time 0.2495 (0.2531) loss 2.8921 (2.8568) grad_norm 3.3185 (4.7368) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][260/1251] eta 0:04:24 lr 0.000097 wd 0.0500 time 0.2438 (0.2668) data time 0.0012 (0.0133) model time 0.2426 (0.2524) loss 2.6499 (2.8632) grad_norm 3.8151 (4.7110) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][270/1251] eta 0:04:20 lr 0.000097 wd 0.0500 time 0.2275 (0.2658) data time 0.0008 (0.0128) model time 0.2267 (0.2518) loss 2.6416 (2.8662) grad_norm 4.2300 (4.7965) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][280/1251] eta 0:04:17 lr 0.000097 wd 0.0500 time 0.2475 (0.2649) data time 0.0007 (0.0124) model time 0.2468 (0.2513) loss 3.3574 (2.8596) grad_norm 4.4622 (4.7618) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][290/1251] eta 0:04:13 lr 0.000097 wd 0.0500 time 0.2338 (0.2640) data time 0.0008 (0.0120) model time 0.2330 (0.2507) loss 2.1138 (2.8543) grad_norm 3.7323 (4.7258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][300/1251] eta 0:04:10 lr 0.000097 wd 0.0500 time 0.2383 (0.2631) data time 0.0009 (0.0117) model time 0.2374 (0.2501) loss 3.2609 (2.8525) grad_norm 3.6542 (4.7128) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][310/1251] eta 0:04:06 lr 0.000097 wd 0.0500 time 0.2395 (0.2623) data time 0.0010 (0.0113) model time 0.2385 (0.2496) loss 2.7885 (2.8510) grad_norm 3.9051 (4.7007) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
07:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][320/1251] eta 0:04:03 lr 0.000097 wd 0.0500 time 0.2307 (0.2615) data time 0.0008 (0.0110) model time 0.2300 (0.2490) loss 1.7458 (2.8448) grad_norm 3.4448 (4.8258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][330/1251] eta 0:04:00 lr 0.000097 wd 0.0500 time 0.2411 (0.2607) data time 0.0007 (0.0107) model time 0.2404 (0.2486) loss 2.0816 (2.8368) grad_norm 3.1467 (4.8648) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][340/1251] eta 0:03:56 lr 0.000097 wd 0.0500 time 0.2285 (0.2600) data time 0.0010 (0.0104) model time 0.2275 (0.2482) loss 2.5023 (2.8326) grad_norm 3.3467 (4.9043) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][350/1251] eta 0:03:53 lr 0.000097 wd 0.0500 time 0.2423 (0.2593) data time 0.0007 (0.0101) model time 0.2415 (0.2477) loss 2.7328 (2.8311) grad_norm 3.2134 (4.8783) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][360/1251] eta 0:03:50 lr 0.000097 wd 0.0500 time 0.2367 (0.2588) data time 0.0007 (0.0099) model time 0.2360 (0.2474) loss 2.2904 (2.8328) grad_norm 6.9814 (4.8769) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][370/1251] eta 0:03:47 lr 0.000097 wd 0.0500 time 0.2359 (0.2581) data time 0.0007 (0.0096) model time 0.2352 (0.2470) loss 3.0470 (2.8390) grad_norm 3.5114 (4.8830) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][380/1251] eta 0:03:44 lr 0.000097 wd 0.0500 time 0.2317 (0.2576) data time 0.0012 (0.0094) model time 0.2305 (0.2467) loss 3.1318 
(2.8367) grad_norm 3.0826 (4.8620) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][390/1251] eta 0:03:41 lr 0.000097 wd 0.0500 time 0.2401 (0.2576) data time 0.0011 (0.0096) model time 0.2390 (0.2466) loss 2.8875 (2.8360) grad_norm 4.0859 (4.9964) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][400/1251] eta 0:03:38 lr 0.000097 wd 0.0500 time 0.2401 (0.2571) data time 0.0008 (0.0094) model time 0.2392 (0.2462) loss 2.3067 (2.8378) grad_norm 4.1361 (4.9819) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][410/1251] eta 0:03:35 lr 0.000097 wd 0.0500 time 0.2450 (0.2566) data time 0.0010 (0.0092) model time 0.2440 (0.2459) loss 2.2675 (2.8368) grad_norm 3.7310 (4.9588) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][420/1251] eta 0:03:32 lr 0.000097 wd 0.0500 time 0.2376 (0.2562) data time 0.0009 (0.0090) model time 0.2367 (0.2457) loss 2.1744 (2.8335) grad_norm 3.6536 (4.9299) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][430/1251] eta 0:03:30 lr 0.000097 wd 0.0500 time 0.2370 (0.2558) data time 0.0007 (0.0088) model time 0.2363 (0.2455) loss 2.2593 (2.8289) grad_norm 4.1209 (4.9187) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][440/1251] eta 0:03:27 lr 0.000097 wd 0.0500 time 0.2406 (0.2554) data time 0.0007 (0.0086) model time 0.2398 (0.2453) loss 2.8906 (2.8233) grad_norm 4.6938 (4.9230) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][450/1251] eta 0:03:24 lr 0.000097 wd 0.0500 
time 0.2319 (0.2552) data time 0.0010 (0.0085) model time 0.2309 (0.2453) loss 3.0918 (2.8208) grad_norm 3.1817 (4.9022) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][460/1251] eta 0:03:21 lr 0.000097 wd 0.0500 time 0.2277 (0.2548) data time 0.0010 (0.0083) model time 0.2267 (0.2451) loss 2.8221 (2.8251) grad_norm 5.0894 (4.8944) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][470/1251] eta 0:03:18 lr 0.000097 wd 0.0500 time 0.2307 (0.2544) data time 0.0008 (0.0082) model time 0.2299 (0.2448) loss 2.1604 (2.8228) grad_norm 4.0394 (4.8851) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][480/1251] eta 0:03:15 lr 0.000097 wd 0.0500 time 0.2360 (0.2540) data time 0.0007 (0.0080) model time 0.2353 (0.2446) loss 3.3898 (2.8234) grad_norm 5.6514 (4.8897) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][490/1251] eta 0:03:15 lr 0.000097 wd 0.0500 time 0.2422 (0.2575) data time 0.0008 (0.0079) model time 0.2414 (0.2486) loss 1.4217 (2.8238) grad_norm 4.1925 (4.8747) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][500/1251] eta 0:03:13 lr 0.000097 wd 0.0500 time 0.2455 (0.2571) data time 0.0011 (0.0077) model time 0.2444 (0.2484) loss 3.5779 (2.8279) grad_norm 4.5435 (5.0163) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][510/1251] eta 0:03:10 lr 0.000097 wd 0.0500 time 0.2307 (0.2568) data time 0.0007 (0.0076) model time 0.2300 (0.2482) loss 3.0322 (2.8277) grad_norm 4.5760 (4.9995) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:28 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [246/300][520/1251] eta 0:03:07 lr 0.000097 wd 0.0500 time 0.2276 (0.2563) data time 0.0008 (0.0075) model time 0.2269 (0.2479) loss 2.7940 (2.8219) grad_norm 4.2909 (4.9832) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][530/1251] eta 0:03:04 lr 0.000097 wd 0.0500 time 0.2369 (0.2560) data time 0.0010 (0.0073) model time 0.2359 (0.2476) loss 2.6775 (2.8159) grad_norm 5.7969 (5.0208) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][540/1251] eta 0:03:01 lr 0.000097 wd 0.0500 time 0.2515 (0.2557) data time 0.0007 (0.0072) model time 0.2507 (0.2475) loss 2.9066 (2.8197) grad_norm 3.3383 (5.0138) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][550/1251] eta 0:02:59 lr 0.000097 wd 0.0500 time 0.2319 (0.2554) data time 0.0011 (0.0071) model time 0.2308 (0.2472) loss 2.0726 (2.8153) grad_norm 2.9467 (4.9879) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][560/1251] eta 0:02:56 lr 0.000097 wd 0.0500 time 0.2498 (0.2551) data time 0.0008 (0.0070) model time 0.2490 (0.2471) loss 2.6917 (2.8155) grad_norm 6.2118 (4.9749) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][570/1251] eta 0:02:53 lr 0.000097 wd 0.0500 time 0.2422 (0.2547) data time 0.0010 (0.0069) model time 0.2412 (0.2468) loss 2.0921 (2.8133) grad_norm 4.6234 (5.0061) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][580/1251] eta 0:02:50 lr 0.000097 wd 0.0500 time 0.2360 (0.2544) data time 0.0011 (0.0068) model time 0.2349 (0.2466) loss 2.2643 (2.8122) grad_norm 3.3453 
(4.9959) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][590/1251] eta 0:02:48 lr 0.000097 wd 0.0500 time 0.2377 (0.2556) data time 0.0008 (0.0080) model time 0.2370 (0.2466) loss 3.1943 (2.8147) grad_norm 6.5403 (4.9795) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][600/1251] eta 0:02:46 lr 0.000097 wd 0.0500 time 0.2381 (0.2553) data time 0.0011 (0.0079) model time 0.2370 (0.2464) loss 2.9722 (2.8134) grad_norm 7.4318 (5.0026) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][610/1251] eta 0:02:43 lr 0.000097 wd 0.0500 time 0.2370 (0.2551) data time 0.0010 (0.0078) model time 0.2360 (0.2463) loss 3.3352 (2.8081) grad_norm 4.0145 (4.9831) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][620/1251] eta 0:02:40 lr 0.000097 wd 0.0500 time 0.2349 (0.2548) data time 0.0010 (0.0077) model time 0.2339 (0.2462) loss 2.8725 (2.8068) grad_norm 3.2285 (4.9754) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][630/1251] eta 0:02:38 lr 0.000097 wd 0.0500 time 0.2265 (0.2545) data time 0.0010 (0.0076) model time 0.2255 (0.2460) loss 2.8498 (2.8082) grad_norm 3.9416 (4.9566) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][640/1251] eta 0:02:35 lr 0.000096 wd 0.0500 time 0.2411 (0.2543) data time 0.0008 (0.0075) model time 0.2403 (0.2458) loss 3.2106 (2.8099) grad_norm 4.3099 (4.9615) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][650/1251] eta 0:02:32 lr 0.000096 wd 0.0500 time 0.2232 (0.2540) data 
time 0.0007 (0.0074) model time 0.2226 (0.2456) loss 2.4453 (2.8111) grad_norm 4.4254 (4.9529) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][660/1251] eta 0:02:29 lr 0.000096 wd 0.0500 time 0.2277 (0.2536) data time 0.0009 (0.0073) model time 0.2269 (0.2454) loss 3.0965 (2.8062) grad_norm 3.1760 (4.9490) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][670/1251] eta 0:02:28 lr 0.000096 wd 0.0500 time 0.2505 (0.2557) data time 0.0008 (0.0072) model time 0.2496 (0.2477) loss 3.5000 (2.8095) grad_norm 3.6359 (4.9408) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][680/1251] eta 0:02:25 lr 0.000096 wd 0.0500 time 0.2324 (0.2555) data time 0.0009 (0.0071) model time 0.2316 (0.2476) loss 2.8489 (2.8113) grad_norm 3.2707 (4.9316) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][690/1251] eta 0:02:23 lr 0.000096 wd 0.0500 time 0.2256 (0.2552) data time 0.0010 (0.0070) model time 0.2246 (0.2475) loss 2.1223 (2.8083) grad_norm 7.8135 (4.9262) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][700/1251] eta 0:02:20 lr 0.000096 wd 0.0500 time 0.2353 (0.2550) data time 0.0009 (0.0069) model time 0.2345 (0.2473) loss 2.3648 (2.8052) grad_norm 3.5086 (4.9155) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][710/1251] eta 0:02:17 lr 0.000096 wd 0.0500 time 0.2460 (0.2548) data time 0.0010 (0.0068) model time 0.2450 (0.2472) loss 2.0594 (2.8050) grad_norm 10.5471 (4.9186) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [246/300][720/1251] eta 0:02:15 lr 0.000096 wd 0.0500 time 0.2440 (0.2546) data time 0.0007 (0.0068) model time 0.2433 (0.2471) loss 2.2471 (2.8008) grad_norm 3.3318 (4.9152) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][730/1251] eta 0:02:12 lr 0.000096 wd 0.0500 time 0.2387 (0.2544) data time 0.0007 (0.0067) model time 0.2380 (0.2470) loss 2.6035 (2.8011) grad_norm 5.3682 (4.9089) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][740/1251] eta 0:02:09 lr 0.000096 wd 0.0500 time 0.2389 (0.2542) data time 0.0007 (0.0066) model time 0.2382 (0.2468) loss 3.1007 (2.7993) grad_norm 3.7793 (4.9045) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][750/1251] eta 0:02:07 lr 0.000096 wd 0.0500 time 0.2417 (0.2540) data time 0.0010 (0.0065) model time 0.2407 (0.2467) loss 2.7140 (2.8006) grad_norm 5.3354 (4.9201) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][760/1251] eta 0:02:04 lr 0.000096 wd 0.0500 time 0.2405 (0.2539) data time 0.0007 (0.0065) model time 0.2398 (0.2467) loss 2.7464 (2.8023) grad_norm 4.5417 (4.9135) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][770/1251] eta 0:02:02 lr 0.000096 wd 0.0500 time 0.2365 (0.2537) data time 0.0009 (0.0064) model time 0.2355 (0.2466) loss 2.6258 (2.8048) grad_norm 3.2357 (4.9078) loss_scale 512.0000 (258.6563) mem 7381MB [2024-09-01 07:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][780/1251] eta 0:01:59 lr 0.000096 wd 0.0500 time 0.2348 (0.2535) data time 0.0008 (0.0063) model time 0.2340 (0.2464) loss 2.9874 (2.8039) grad_norm 3.6165 (4.9062) loss_scale 512.0000 
(261.9001) mem 7381MB [2024-09-01 07:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][790/1251] eta 0:01:56 lr 0.000096 wd 0.0500 time 0.2248 (0.2532) data time 0.0011 (0.0062) model time 0.2237 (0.2462) loss 2.7748 (2.8048) grad_norm 6.3237 (4.9094) loss_scale 512.0000 (265.0619) mem 7381MB [2024-09-01 07:16:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][800/1251] eta 0:01:54 lr 0.000096 wd 0.0500 time 0.2370 (0.2530) data time 0.0010 (0.0062) model time 0.2360 (0.2461) loss 2.4308 (2.8060) grad_norm 6.9593 (4.9187) loss_scale 512.0000 (268.1448) mem 7381MB [2024-09-01 07:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][810/1251] eta 0:01:51 lr 0.000096 wd 0.0500 time 0.2341 (0.2528) data time 0.0008 (0.0061) model time 0.2333 (0.2460) loss 3.1568 (2.8042) grad_norm 3.6441 (4.9156) loss_scale 512.0000 (271.1517) mem 7381MB [2024-09-01 07:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][820/1251] eta 0:01:48 lr 0.000096 wd 0.0500 time 0.2430 (0.2527) data time 0.0008 (0.0061) model time 0.2422 (0.2459) loss 3.1171 (2.8022) grad_norm 8.8543 (4.9359) loss_scale 512.0000 (274.0853) mem 7381MB [2024-09-01 07:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][830/1251] eta 0:01:46 lr 0.000096 wd 0.0500 time 0.2329 (0.2525) data time 0.0010 (0.0060) model time 0.2319 (0.2458) loss 2.9976 (2.7983) grad_norm 3.4600 (4.9315) loss_scale 512.0000 (276.9483) mem 7381MB [2024-09-01 07:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][840/1251] eta 0:01:43 lr 0.000096 wd 0.0500 time 0.2300 (0.2523) data time 0.0009 (0.0059) model time 0.2291 (0.2456) loss 2.6643 (2.7971) grad_norm 4.2998 (4.9347) loss_scale 512.0000 (279.7432) mem 7381MB [2024-09-01 07:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][850/1251] eta 0:01:41 lr 0.000096 wd 0.0500 time 0.2400 (0.2521) data time 0.0010 (0.0059) model 
time 0.2390 (0.2455) loss 2.4458 (2.7989) grad_norm 3.4310 (4.9240) loss_scale 512.0000 (282.4724) mem 7381MB [2024-09-01 07:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][860/1251] eta 0:01:38 lr 0.000096 wd 0.0500 time 0.2461 (0.2520) data time 0.0009 (0.0058) model time 0.2452 (0.2455) loss 2.8333 (2.7988) grad_norm 4.0092 (4.9109) loss_scale 512.0000 (285.1382) mem 7381MB [2024-09-01 07:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][870/1251] eta 0:01:35 lr 0.000096 wd 0.0500 time 0.2391 (0.2519) data time 0.0007 (0.0058) model time 0.2383 (0.2454) loss 3.1622 (2.8007) grad_norm 3.5154 (4.8969) loss_scale 512.0000 (287.7428) mem 7381MB [2024-09-01 07:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][880/1251] eta 0:01:33 lr 0.000096 wd 0.0500 time 0.2399 (0.2517) data time 0.0007 (0.0057) model time 0.2392 (0.2452) loss 1.7932 (2.8024) grad_norm 3.4670 (4.8915) loss_scale 512.0000 (290.2883) mem 7381MB [2024-09-01 07:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][890/1251] eta 0:01:30 lr 0.000096 wd 0.0500 time 0.2425 (0.2515) data time 0.0011 (0.0057) model time 0.2414 (0.2452) loss 2.7376 (2.8029) grad_norm 3.7494 (4.9198) loss_scale 512.0000 (292.7767) mem 7381MB [2024-09-01 07:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][900/1251] eta 0:01:28 lr 0.000096 wd 0.0500 time 0.2276 (0.2514) data time 0.0011 (0.0056) model time 0.2265 (0.2451) loss 2.9548 (2.8014) grad_norm 6.3078 (4.9261) loss_scale 512.0000 (295.2098) mem 7381MB [2024-09-01 07:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][910/1251] eta 0:01:25 lr 0.000096 wd 0.0500 time 0.2300 (0.2513) data time 0.0007 (0.0056) model time 0.2292 (0.2450) loss 3.5608 (2.8045) grad_norm 6.4288 (4.9297) loss_scale 512.0000 (297.5895) mem 7381MB [2024-09-01 07:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][920/1251] 
eta 0:01:23 lr 0.000096 wd 0.0500 time 0.2296 (0.2511) data time 0.0009 (0.0055) model time 0.2287 (0.2449) loss 2.9006 (2.8027) grad_norm 4.3834 (4.9262) loss_scale 512.0000 (299.9175) mem 7381MB [2024-09-01 07:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][930/1251] eta 0:01:20 lr 0.000096 wd 0.0500 time 0.2393 (0.2510) data time 0.0011 (0.0055) model time 0.2382 (0.2448) loss 3.2774 (2.8014) grad_norm 4.3514 (4.9224) loss_scale 512.0000 (302.1955) mem 7381MB [2024-09-01 07:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][940/1251] eta 0:01:17 lr 0.000096 wd 0.0500 time 0.2364 (0.2508) data time 0.0010 (0.0054) model time 0.2353 (0.2447) loss 3.3018 (2.7984) grad_norm 3.6260 (4.9232) loss_scale 512.0000 (304.4251) mem 7381MB [2024-09-01 07:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][950/1251] eta 0:01:15 lr 0.000096 wd 0.0500 time 0.2306 (0.2506) data time 0.0010 (0.0054) model time 0.2297 (0.2445) loss 2.8982 (2.7983) grad_norm 3.7773 (4.9144) loss_scale 512.0000 (306.6078) mem 7381MB [2024-09-01 07:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][960/1251] eta 0:01:13 lr 0.000096 wd 0.0500 time 0.3538 (0.2540) data time 0.0007 (0.0059) model time 0.3531 (0.2475) loss 1.9418 (2.7998) grad_norm 4.0638 (4.9108) loss_scale 512.0000 (308.7451) mem 7381MB [2024-09-01 07:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][970/1251] eta 0:01:11 lr 0.000096 wd 0.0500 time 0.2379 (0.2543) data time 0.0009 (0.0060) model time 0.2370 (0.2478) loss 1.7685 (2.7983) grad_norm 4.1699 (4.9037) loss_scale 512.0000 (310.8383) mem 7381MB [2024-09-01 07:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][980/1251] eta 0:01:08 lr 0.000096 wd 0.0500 time 0.2406 (0.2544) data time 0.0009 (0.0060) model time 0.2397 (0.2479) loss 2.9101 (2.7980) grad_norm 3.4964 (4.8989) loss_scale 512.0000 (312.8889) mem 7381MB [2024-09-01 
07:17:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][990/1251] eta 0:01:06 lr 0.000096 wd 0.0500 time 0.2313 (0.2564) data time 0.0009 (0.0059) model time 0.2304 (0.2500) loss 3.3141 (2.7994) grad_norm 3.7008 (4.8905) loss_scale 512.0000 (314.8981) mem 7381MB [2024-09-01 07:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1000/1251] eta 0:01:04 lr 0.000096 wd 0.0500 time 0.2314 (0.2583) data time 0.0008 (0.0059) model time 0.2306 (0.2522) loss 2.8441 (2.7992) grad_norm 4.7014 (4.9298) loss_scale 512.0000 (316.8671) mem 7381MB [2024-09-01 07:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1010/1251] eta 0:01:02 lr 0.000096 wd 0.0500 time 0.2385 (0.2581) data time 0.0006 (0.0058) model time 0.2379 (0.2520) loss 2.2104 (2.7983) grad_norm 3.5792 (4.9258) loss_scale 512.0000 (318.7972) mem 7381MB [2024-09-01 07:17:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1020/1251] eta 0:00:59 lr 0.000096 wd 0.0500 time 0.2593 (0.2582) data time 0.0009 (0.0058) model time 0.2584 (0.2522) loss 3.3416 (2.7975) grad_norm 3.2495 (4.9238) loss_scale 512.0000 (320.6895) mem 7381MB [2024-09-01 07:17:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1030/1251] eta 0:00:57 lr 0.000096 wd 0.0500 time 0.2359 (0.2581) data time 0.0009 (0.0057) model time 0.2350 (0.2520) loss 2.4078 (2.7961) grad_norm 3.7684 (4.9154) loss_scale 512.0000 (322.5451) mem 7381MB [2024-09-01 07:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1040/1251] eta 0:00:54 lr 0.000095 wd 0.0500 time 0.2427 (0.2579) data time 0.0007 (0.0057) model time 0.2421 (0.2519) loss 2.7006 (2.7972) grad_norm 4.5141 (4.9102) loss_scale 512.0000 (324.3650) mem 7381MB [2024-09-01 07:17:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1050/1251] eta 0:00:51 lr 0.000095 wd 0.0500 time 0.2382 (0.2576) data time 0.0011 (0.0056) model time 0.2371 (0.2517) loss 
2.7119 (2.7994) grad_norm 3.9651 (4.9085) loss_scale 512.0000 (326.1503) mem 7381MB [2024-09-01 07:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1060/1251] eta 0:00:49 lr 0.000095 wd 0.0500 time 0.2355 (0.2575) data time 0.0010 (0.0056) model time 0.2345 (0.2516) loss 2.8544 (2.8007) grad_norm 3.9513 (4.9037) loss_scale 512.0000 (327.9020) mem 7381MB [2024-09-01 07:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1070/1251] eta 0:00:46 lr 0.000095 wd 0.0500 time 0.2364 (0.2573) data time 0.0010 (0.0055) model time 0.2354 (0.2514) loss 2.3139 (2.8002) grad_norm 4.7649 (4.8940) loss_scale 512.0000 (329.6209) mem 7381MB [2024-09-01 07:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1080/1251] eta 0:00:43 lr 0.000095 wd 0.0500 time 0.2401 (0.2571) data time 0.0008 (0.0055) model time 0.2394 (0.2513) loss 2.7016 (2.8011) grad_norm 2.9991 (inf) loss_scale 256.0000 (329.1767) mem 7381MB [2024-09-01 07:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1090/1251] eta 0:00:41 lr 0.000095 wd 0.0500 time 0.2274 (0.2569) data time 0.0009 (0.0055) model time 0.2266 (0.2511) loss 3.4715 (2.8046) grad_norm 3.9706 (inf) loss_scale 256.0000 (328.5060) mem 7381MB [2024-09-01 07:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1100/1251] eta 0:00:38 lr 0.000095 wd 0.0500 time 0.2274 (0.2567) data time 0.0013 (0.0054) model time 0.2261 (0.2510) loss 3.0903 (2.8035) grad_norm 4.8443 (inf) loss_scale 256.0000 (327.8474) mem 7381MB [2024-09-01 07:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1110/1251] eta 0:00:36 lr 0.000095 wd 0.0500 time 0.2348 (0.2565) data time 0.0009 (0.0054) model time 0.2339 (0.2508) loss 2.5135 (2.8043) grad_norm 5.1263 (inf) loss_scale 256.0000 (327.2007) mem 7381MB [2024-09-01 07:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1120/1251] eta 0:00:33 lr 0.000095 wd 
0.0500 time 0.2345 (0.2563) data time 0.0007 (0.0053) model time 0.2338 (0.2507) loss 2.0618 (2.8006) grad_norm 5.8153 (inf) loss_scale 256.0000 (326.5656) mem 7381MB [2024-09-01 07:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1130/1251] eta 0:00:31 lr 0.000095 wd 0.0500 time 0.2435 (0.2562) data time 0.0009 (0.0053) model time 0.2425 (0.2506) loss 1.5671 (2.8002) grad_norm 6.4438 (inf) loss_scale 256.0000 (325.9416) mem 7381MB [2024-09-01 07:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1140/1251] eta 0:00:28 lr 0.000095 wd 0.0500 time 0.2339 (0.2561) data time 0.0011 (0.0053) model time 0.2328 (0.2505) loss 3.1230 (2.8012) grad_norm 5.3486 (inf) loss_scale 256.0000 (325.3287) mem 7381MB [2024-09-01 07:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1150/1251] eta 0:00:25 lr 0.000095 wd 0.0500 time 0.2307 (0.2560) data time 0.0008 (0.0052) model time 0.2299 (0.2504) loss 3.5675 (2.8025) grad_norm 3.1108 (inf) loss_scale 256.0000 (324.7263) mem 7381MB [2024-09-01 07:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1160/1251] eta 0:00:23 lr 0.000095 wd 0.0500 time 0.2373 (0.2558) data time 0.0009 (0.0052) model time 0.2364 (0.2503) loss 3.0453 (2.8035) grad_norm 3.6189 (inf) loss_scale 256.0000 (324.1344) mem 7381MB [2024-09-01 07:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1170/1251] eta 0:00:20 lr 0.000095 wd 0.0500 time 0.2370 (0.2557) data time 0.0009 (0.0052) model time 0.2361 (0.2502) loss 2.9066 (2.8067) grad_norm 2.9860 (inf) loss_scale 256.0000 (323.5525) mem 7381MB [2024-09-01 07:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1180/1251] eta 0:00:18 lr 0.000095 wd 0.0500 time 0.2381 (0.2565) data time 0.0010 (0.0061) model time 0.2370 (0.2501) loss 2.6832 (2.8061) grad_norm 3.7016 (inf) loss_scale 256.0000 (322.9805) mem 7381MB [2024-09-01 07:18:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [246/300][1190/1251] eta 0:00:15 lr 0.000095 wd 0.0500 time 0.2376 (0.2563) data time 0.0009 (0.0060) model time 0.2367 (0.2500) loss 3.3023 (2.8060) grad_norm 3.2726 (inf) loss_scale 256.0000 (322.4181) mem 7381MB [2024-09-01 07:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1200/1251] eta 0:00:13 lr 0.000095 wd 0.0500 time 0.2412 (0.2562) data time 0.0009 (0.0060) model time 0.2403 (0.2498) loss 2.6813 (2.8080) grad_norm 4.1728 (inf) loss_scale 256.0000 (321.8651) mem 7381MB [2024-09-01 07:18:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1210/1251] eta 0:00:10 lr 0.000095 wd 0.0500 time 0.2447 (0.2561) data time 0.0010 (0.0060) model time 0.2437 (0.2497) loss 3.2689 (2.8096) grad_norm 3.8024 (inf) loss_scale 256.0000 (321.3212) mem 7381MB [2024-09-01 07:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1220/1251] eta 0:00:07 lr 0.000095 wd 0.0500 time 0.2400 (0.2559) data time 0.0010 (0.0059) model time 0.2390 (0.2496) loss 2.7401 (2.8092) grad_norm 4.2814 (inf) loss_scale 256.0000 (320.7862) mem 7381MB [2024-09-01 07:18:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1230/1251] eta 0:00:05 lr 0.000095 wd 0.0500 time 0.2290 (0.2558) data time 0.0008 (0.0059) model time 0.2282 (0.2496) loss 2.9825 (2.8096) grad_norm 4.7037 (inf) loss_scale 256.0000 (320.2600) mem 7381MB [2024-09-01 07:18:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1240/1251] eta 0:00:02 lr 0.000095 wd 0.0500 time 0.2253 (0.2556) data time 0.0007 (0.0059) model time 0.2246 (0.2494) loss 2.8358 (2.8054) grad_norm 6.4152 (inf) loss_scale 256.0000 (319.7421) mem 7381MB [2024-09-01 07:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [246/300][1250/1251] eta 0:00:00 lr 0.000095 wd 0.0500 time 0.2227 (0.2554) data time 0.0005 (0.0058) model time 0.2222 (0.2492) loss 3.7621 (2.8054) grad_norm 4.1798 (inf) loss_scale 
256.0000 (319.2326) mem 7381MB [2024-09-01 07:18:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 246 training takes 0:05:19 [2024-09-01 07:18:34 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 07:18:35 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 07:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.434 (0.434) Loss 0.3948 (0.3948) Acc@1 92.871 (92.871) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 07:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.107) Loss 0.6270 (0.6314) Acc@1 88.477 (86.994) Acc@5 97.852 (97.665) Mem 7381MB [2024-09-01 07:18:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.093) Loss 0.9170 (0.6606) Acc@1 77.344 (85.965) Acc@5 95.605 (97.596) Mem 7381MB [2024-09-01 07:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.088) Loss 1.1514 (0.7523) Acc@1 72.852 (83.748) Acc@5 92.383 (96.623) Mem 7381MB [2024-09-01 07:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0156 (0.8006) Acc@1 76.270 (82.558) Acc@5 93.652 (96.065) Mem 7381MB [2024-09-01 07:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.194 Acc@5 96.014 [2024-09-01 07:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.2% [2024-09-01 07:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.19% [2024-09-01 07:18:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 07:18:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 07:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.506 (0.506) Loss 0.3818 (0.3818) Acc@1 93.164 (93.164) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 07:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.118) Loss 0.5669 (0.5990) Acc@1 90.137 (87.571) Acc@5 98.047 (97.781) Mem 7381MB [2024-09-01 07:18:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.099) Loss 0.8945 (0.6302) Acc@1 77.930 (86.440) Acc@5 96.289 (97.740) Mem 7381MB [2024-09-01 07:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.093) Loss 1.0957 (0.7163) Acc@1 74.121 (84.353) Acc@5 93.359 (96.891) Mem 7381MB [2024-09-01 07:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 0.9946 (0.7615) Acc@1 76.953 (83.258) Acc@5 94.336 (96.387) Mem 7381MB [2024-09-01 07:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.898 Acc@5 96.346 [2024-09-01 07:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.90% [2024-09-01 07:18:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 07:18:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 07:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][0/1251] eta 0:19:27 lr 0.000095 wd 0.0500 time 0.9333 (0.9333) data time 0.7127 (0.7127) model time 0.0000 (0.0000) loss 2.8541 (2.8541) grad_norm 4.2510 (4.2510) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][10/1251] eta 0:06:12 lr 0.000095 wd 0.0500 time 0.2375 (0.3004) data time 0.0008 (0.0657) model time 0.0000 (0.0000) loss 2.1853 (2.6926) grad_norm 4.8115 (4.4376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][20/1251] eta 0:05:32 lr 0.000095 wd 0.0500 time 0.2349 (0.2702) data time 0.0008 (0.0348) model time 0.0000 (0.0000) loss 3.1312 (2.7878) grad_norm 4.4583 (4.3913) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][30/1251] eta 0:05:17 lr 0.000095 wd 0.0500 time 0.2288 (0.2596) data time 0.0010 (0.0239) model time 0.0000 (0.0000) loss 3.3444 (2.8033) grad_norm 3.3928 (4.1819) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][40/1251] eta 0:06:43 lr 0.000095 wd 0.0500 time 0.2453 (0.3328) data time 0.0322 (0.0258) model time 0.0000 (0.0000) loss 3.0461 (2.8155) grad_norm 4.4896 (4.4171) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][50/1251] eta 0:07:08 lr 0.000095 wd 0.0500 time 0.4028 (0.3565) data time 0.0010 (0.0227) model time 0.0000 (0.0000) loss 3.0219 (2.8015) grad_norm 3.3870 (4.2958) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][60/1251] eta 0:07:26 lr 0.000095 wd 0.0500 time 0.2345 (0.3747) data time 0.0007 (0.0306) model time 0.2338 (0.3967) loss 
3.3054 (2.7663) grad_norm 5.0843 (4.2763) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][70/1251] eta 0:07:52 lr 0.000095 wd 0.0500 time 0.4749 (0.4000) data time 0.0006 (0.0426) model time 0.4742 (0.4176) loss 3.0965 (2.7522) grad_norm 5.1525 (4.2720) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][80/1251] eta 0:07:25 lr 0.000095 wd 0.0500 time 0.2291 (0.3808) data time 0.0008 (0.0374) model time 0.2282 (0.3596) loss 2.9786 (2.7763) grad_norm 3.9599 (4.2613) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][90/1251] eta 0:07:04 lr 0.000095 wd 0.0500 time 0.2319 (0.3655) data time 0.0011 (0.0334) model time 0.2308 (0.3297) loss 2.5174 (2.7660) grad_norm 4.5152 (4.3031) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][100/1251] eta 0:06:46 lr 0.000095 wd 0.0500 time 0.2379 (0.3529) data time 0.0009 (0.0302) model time 0.2370 (0.3113) loss 2.9121 (2.7816) grad_norm 3.6561 (4.3264) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][110/1251] eta 0:06:30 lr 0.000095 wd 0.0500 time 0.2345 (0.3426) data time 0.0007 (0.0276) model time 0.2338 (0.2989) loss 2.0437 (2.7598) grad_norm 3.0860 (4.3720) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][120/1251] eta 0:06:17 lr 0.000095 wd 0.0500 time 0.2286 (0.3337) data time 0.0011 (0.0254) model time 0.2274 (0.2897) loss 1.9868 (2.7496) grad_norm 3.5979 (4.3649) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][130/1251] eta 0:06:05 lr 0.000095 wd 
0.0500 time 0.2339 (0.3262) data time 0.0007 (0.0235) model time 0.2332 (0.2828) loss 3.0709 (2.7343) grad_norm 3.5052 (4.3517) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][140/1251] eta 0:05:55 lr 0.000095 wd 0.0500 time 0.2388 (0.3200) data time 0.0010 (0.0219) model time 0.2378 (0.2778) loss 3.3977 (2.7522) grad_norm 4.3159 (4.3149) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][150/1251] eta 0:05:46 lr 0.000095 wd 0.0500 time 0.2386 (0.3147) data time 0.0010 (0.0206) model time 0.2377 (0.2739) loss 3.1832 (2.7641) grad_norm 3.9546 (4.3103) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][160/1251] eta 0:05:38 lr 0.000095 wd 0.0500 time 0.2413 (0.3100) data time 0.0011 (0.0193) model time 0.2402 (0.2707) loss 3.2035 (2.7746) grad_norm 3.2336 (4.2981) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][170/1251] eta 0:05:30 lr 0.000095 wd 0.0500 time 0.2455 (0.3058) data time 0.0007 (0.0183) model time 0.2447 (0.2679) loss 3.0135 (2.7832) grad_norm 3.5090 (4.3081) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][180/1251] eta 0:05:23 lr 0.000095 wd 0.0500 time 0.2318 (0.3020) data time 0.0008 (0.0173) model time 0.2310 (0.2653) loss 2.6369 (2.7832) grad_norm 3.0757 (4.3138) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][190/1251] eta 0:05:16 lr 0.000094 wd 0.0500 time 0.2355 (0.2987) data time 0.0010 (0.0165) model time 0.2345 (0.2635) loss 3.6507 (2.7992) grad_norm 5.2606 (4.3111) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:44 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [247/300][200/1251] eta 0:05:10 lr 0.000094 wd 0.0500 time 0.2313 (0.2956) data time 0.0009 (0.0157) model time 0.2304 (0.2616) loss 3.3097 (2.7963) grad_norm 7.1712 (4.3333) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][210/1251] eta 0:05:04 lr 0.000094 wd 0.0500 time 0.2376 (0.2927) data time 0.0007 (0.0150) model time 0.2369 (0.2598) loss 3.1663 (2.7966) grad_norm 5.5637 (4.3637) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][220/1251] eta 0:04:59 lr 0.000094 wd 0.0500 time 0.2377 (0.2901) data time 0.0010 (0.0144) model time 0.2368 (0.2583) loss 2.9953 (2.7988) grad_norm 3.9533 (4.4034) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][230/1251] eta 0:04:53 lr 0.000094 wd 0.0500 time 0.2359 (0.2879) data time 0.0010 (0.0138) model time 0.2349 (0.2572) loss 2.1201 (2.7929) grad_norm 5.3174 (4.4176) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][240/1251] eta 0:04:48 lr 0.000094 wd 0.0500 time 0.2293 (0.2858) data time 0.0008 (0.0133) model time 0.2285 (0.2561) loss 2.0973 (2.7825) grad_norm 4.4356 (4.3992) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][250/1251] eta 0:04:44 lr 0.000094 wd 0.0500 time 0.2350 (0.2837) data time 0.0007 (0.0128) model time 0.2343 (0.2550) loss 2.5464 (2.7818) grad_norm 5.2594 (4.5134) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][260/1251] eta 0:04:39 lr 0.000094 wd 0.0500 time 0.2559 (0.2820) data time 0.0009 (0.0123) model time 0.2550 (0.2540) loss 2.6893 (2.7849) grad_norm 4.1187 
(4.5495) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][270/1251] eta 0:04:35 lr 0.000094 wd 0.0500 time 0.2404 (0.2805) data time 0.0009 (0.0119) model time 0.2395 (0.2535) loss 2.7244 (2.7928) grad_norm 5.9548 (4.5466) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][280/1251] eta 0:04:30 lr 0.000094 wd 0.0500 time 0.2335 (0.2789) data time 0.0008 (0.0115) model time 0.2327 (0.2527) loss 2.0261 (2.7885) grad_norm 3.6907 (4.5395) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][290/1251] eta 0:04:26 lr 0.000094 wd 0.0500 time 0.2365 (0.2776) data time 0.0007 (0.0112) model time 0.2358 (0.2522) loss 3.1316 (2.7874) grad_norm 3.8745 (4.5313) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][300/1251] eta 0:04:22 lr 0.000094 wd 0.0500 time 0.2325 (0.2764) data time 0.0009 (0.0108) model time 0.2316 (0.2516) loss 3.0241 (2.7956) grad_norm 4.7452 (4.5318) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][310/1251] eta 0:04:18 lr 0.000094 wd 0.0500 time 0.2334 (0.2751) data time 0.0007 (0.0105) model time 0.2326 (0.2511) loss 3.3124 (2.7985) grad_norm 3.0437 (4.4960) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][320/1251] eta 0:04:15 lr 0.000094 wd 0.0500 time 0.2439 (0.2740) data time 0.0008 (0.0102) model time 0.2432 (0.2506) loss 3.0706 (2.7948) grad_norm 4.1839 (4.4770) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][330/1251] eta 0:04:11 lr 0.000094 wd 0.0500 time 0.2455 (0.2730) data 
time 0.0010 (0.0099) model time 0.2445 (0.2502) loss 2.9619 (2.7942) grad_norm 2.6937 (4.4654) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][340/1251] eta 0:04:07 lr 0.000094 wd 0.0500 time 0.2415 (0.2721) data time 0.0009 (0.0097) model time 0.2406 (0.2499) loss 2.5115 (2.7950) grad_norm 4.0557 (4.4919) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][350/1251] eta 0:04:04 lr 0.000094 wd 0.0500 time 0.2412 (0.2712) data time 0.0007 (0.0094) model time 0.2405 (0.2496) loss 3.3518 (2.8006) grad_norm 8.3685 (4.5002) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][360/1251] eta 0:04:00 lr 0.000094 wd 0.0500 time 0.2491 (0.2703) data time 0.0007 (0.0092) model time 0.2483 (0.2491) loss 2.6564 (2.7988) grad_norm 3.5748 (4.4857) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][370/1251] eta 0:03:57 lr 0.000094 wd 0.0500 time 0.2482 (0.2694) data time 0.0007 (0.0090) model time 0.2476 (0.2488) loss 3.1209 (2.7990) grad_norm 3.0274 (4.4891) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][380/1251] eta 0:03:54 lr 0.000094 wd 0.0500 time 0.2399 (0.2687) data time 0.0007 (0.0087) model time 0.2392 (0.2485) loss 1.7629 (2.7882) grad_norm 4.9920 (4.4899) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][390/1251] eta 0:03:50 lr 0.000094 wd 0.0500 time 0.2335 (0.2679) data time 0.0007 (0.0085) model time 0.2328 (0.2482) loss 3.2235 (2.7859) grad_norm 4.2644 (4.4681) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [247/300][400/1251] eta 0:03:47 lr 0.000094 wd 0.0500 time 0.2429 (0.2673) data time 0.0008 (0.0084) model time 0.2421 (0.2480) loss 2.3974 (2.7808) grad_norm 13.5509 (4.4819) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][410/1251] eta 0:03:44 lr 0.000094 wd 0.0500 time 0.2318 (0.2666) data time 0.0009 (0.0082) model time 0.2309 (0.2477) loss 3.0494 (2.7831) grad_norm 4.7207 (4.4763) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][420/1251] eta 0:03:40 lr 0.000094 wd 0.0500 time 0.2292 (0.2658) data time 0.0009 (0.0080) model time 0.2282 (0.2474) loss 3.1614 (2.7826) grad_norm 3.4998 (4.5312) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][430/1251] eta 0:03:37 lr 0.000094 wd 0.0500 time 0.2289 (0.2651) data time 0.0008 (0.0078) model time 0.2282 (0.2470) loss 3.3322 (2.7862) grad_norm 6.9811 (4.5967) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][440/1251] eta 0:03:34 lr 0.000094 wd 0.0500 time 0.2296 (0.2645) data time 0.0009 (0.0077) model time 0.2288 (0.2468) loss 2.0159 (2.7825) grad_norm 4.2958 (4.5818) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][450/1251] eta 0:03:31 lr 0.000094 wd 0.0500 time 0.2299 (0.2639) data time 0.0011 (0.0075) model time 0.2287 (0.2465) loss 3.3824 (2.7803) grad_norm 4.7866 (4.5793) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][460/1251] eta 0:03:28 lr 0.000094 wd 0.0500 time 0.2354 (0.2637) data time 0.0010 (0.0074) model time 0.2344 (0.2467) loss 3.2995 (2.7862) grad_norm 3.8251 (4.5766) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 07:20:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][470/1251] eta 0:03:25 lr 0.000094 wd 0.0500 time 0.2287 (0.2636) data time 0.0007 (0.0073) model time 0.2280 (0.2469) loss 3.5776 (2.7857) grad_norm 4.6389 (4.5692) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][480/1251] eta 0:03:30 lr 0.000094 wd 0.0500 time 4.6675 (0.2725) data time 4.4089 (0.0163) model time 0.2586 (0.2471) loss 3.5566 (2.7855) grad_norm 9.1035 (4.6040) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:20:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][490/1251] eta 0:03:29 lr 0.000094 wd 0.0500 time 0.6777 (0.2748) data time 0.0009 (0.0164) model time 0.6768 (0.2497) loss 3.2151 (2.7845) grad_norm 8.5470 (4.6021) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][500/1251] eta 0:03:27 lr 0.000094 wd 0.0500 time 0.2437 (0.2763) data time 0.0008 (0.0161) model time 0.2429 (0.2518) loss 1.8658 (2.7843) grad_norm 5.0424 (4.6291) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][510/1251] eta 0:03:24 lr 0.000094 wd 0.0500 time 0.2320 (0.2755) data time 0.0013 (0.0158) model time 0.2307 (0.2515) loss 2.6019 (2.7838) grad_norm 2.9992 (4.6390) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][520/1251] eta 0:03:20 lr 0.000094 wd 0.0500 time 0.2358 (0.2747) data time 0.0010 (0.0155) model time 0.2348 (0.2511) loss 3.0506 (2.7801) grad_norm 4.4582 (4.6320) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][530/1251] eta 0:03:17 lr 0.000094 wd 0.0500 time 0.2382 (0.2740) data time 0.0007 (0.0152) model 
time 0.2376 (0.2508) loss 3.6490 (2.7809) grad_norm 3.4571 (4.6173) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][540/1251] eta 0:03:14 lr 0.000094 wd 0.0500 time 0.2387 (0.2733) data time 0.0011 (0.0150) model time 0.2376 (0.2505) loss 2.6326 (2.7773) grad_norm 3.2733 (4.6072) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][550/1251] eta 0:03:11 lr 0.000094 wd 0.0500 time 0.2278 (0.2726) data time 0.0011 (0.0147) model time 0.2267 (0.2501) loss 3.1101 (2.7760) grad_norm 4.8630 (4.6016) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][560/1251] eta 0:03:07 lr 0.000094 wd 0.0500 time 0.2452 (0.2720) data time 0.0007 (0.0145) model time 0.2445 (0.2498) loss 2.9025 (2.7742) grad_norm 5.4956 (4.5993) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][570/1251] eta 0:03:04 lr 0.000094 wd 0.0500 time 0.2364 (0.2713) data time 0.0007 (0.0143) model time 0.2357 (0.2496) loss 2.9054 (2.7715) grad_norm 3.7982 (4.6020) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][580/1251] eta 0:03:01 lr 0.000094 wd 0.0500 time 0.2420 (0.2707) data time 0.0011 (0.0140) model time 0.2409 (0.2493) loss 1.9984 (2.7673) grad_norm 3.0363 (4.6050) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][590/1251] eta 0:02:58 lr 0.000094 wd 0.0500 time 0.2355 (0.2702) data time 0.0007 (0.0138) model time 0.2348 (0.2491) loss 3.0568 (2.7699) grad_norm 5.3780 (4.6039) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][600/1251] 
eta 0:02:55 lr 0.000093 wd 0.0500 time 0.2421 (0.2697) data time 0.0011 (0.0136) model time 0.2410 (0.2489) loss 2.6684 (2.7699) grad_norm 9.9696 (4.6137) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][610/1251] eta 0:02:52 lr 0.000093 wd 0.0500 time 0.2384 (0.2693) data time 0.0010 (0.0134) model time 0.2374 (0.2488) loss 2.9978 (2.7720) grad_norm 3.5304 (4.6067) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][620/1251] eta 0:02:49 lr 0.000093 wd 0.0500 time 0.2365 (0.2688) data time 0.0011 (0.0132) model time 0.2354 (0.2486) loss 3.1496 (2.7772) grad_norm 4.3943 (4.6171) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][630/1251] eta 0:02:46 lr 0.000093 wd 0.0500 time 0.2503 (0.2683) data time 0.0007 (0.0130) model time 0.2496 (0.2484) loss 2.5578 (2.7810) grad_norm 3.1396 (4.6182) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][640/1251] eta 0:02:43 lr 0.000093 wd 0.0500 time 0.2423 (0.2679) data time 0.0009 (0.0128) model time 0.2414 (0.2483) loss 3.1902 (2.7759) grad_norm 2.7808 (4.6073) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][650/1251] eta 0:02:40 lr 0.000093 wd 0.0500 time 0.2410 (0.2675) data time 0.0009 (0.0126) model time 0.2401 (0.2482) loss 3.2513 (2.7796) grad_norm 3.6825 (4.6057) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][660/1251] eta 0:02:37 lr 0.000093 wd 0.0500 time 0.2425 (0.2671) data time 0.0008 (0.0125) model time 0.2416 (0.2481) loss 1.9217 (2.7799) grad_norm 3.8612 (4.6007) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
07:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][670/1251] eta 0:02:34 lr 0.000093 wd 0.0500 time 0.2444 (0.2668) data time 0.0010 (0.0123) model time 0.2435 (0.2479) loss 3.0501 (2.7839) grad_norm 3.8086 (4.6009) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][680/1251] eta 0:02:32 lr 0.000093 wd 0.0500 time 0.2472 (0.2664) data time 0.0010 (0.0121) model time 0.2463 (0.2478) loss 3.0432 (2.7815) grad_norm 3.6460 (4.5968) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][690/1251] eta 0:02:29 lr 0.000093 wd 0.0500 time 0.2501 (0.2660) data time 0.0007 (0.0120) model time 0.2494 (0.2477) loss 2.2192 (2.7812) grad_norm 3.3062 (4.5850) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][700/1251] eta 0:02:26 lr 0.000093 wd 0.0500 time 0.2447 (0.2657) data time 0.0009 (0.0118) model time 0.2438 (0.2476) loss 3.0903 (2.7811) grad_norm 3.0229 (4.5740) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][710/1251] eta 0:02:23 lr 0.000093 wd 0.0500 time 0.2352 (0.2653) data time 0.0008 (0.0116) model time 0.2344 (0.2474) loss 1.6282 (2.7790) grad_norm 2.8809 (4.5696) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][720/1251] eta 0:02:20 lr 0.000093 wd 0.0500 time 0.2372 (0.2650) data time 0.0010 (0.0115) model time 0.2362 (0.2473) loss 3.0553 (2.7789) grad_norm 3.6215 (4.5640) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][730/1251] eta 0:02:17 lr 0.000093 wd 0.0500 time 0.2385 (0.2646) data time 0.0010 (0.0114) model time 0.2374 (0.2472) loss 2.8681 
(2.7777) grad_norm 4.4714 (4.5660) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][740/1251] eta 0:02:15 lr 0.000093 wd 0.0500 time 0.2412 (0.2643) data time 0.0007 (0.0112) model time 0.2405 (0.2471) loss 2.1034 (2.7759) grad_norm 3.0910 (4.5607) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][750/1251] eta 0:02:12 lr 0.000093 wd 0.0500 time 0.2356 (0.2640) data time 0.0007 (0.0111) model time 0.2349 (0.2470) loss 3.1074 (2.7793) grad_norm 4.6699 (4.5631) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][760/1251] eta 0:02:09 lr 0.000093 wd 0.0500 time 0.2418 (0.2637) data time 0.0008 (0.0109) model time 0.2410 (0.2469) loss 3.0532 (2.7788) grad_norm 6.1530 (4.5583) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][770/1251] eta 0:02:06 lr 0.000093 wd 0.0500 time 0.2393 (0.2634) data time 0.0009 (0.0108) model time 0.2385 (0.2468) loss 3.0298 (2.7811) grad_norm 3.1204 (4.5596) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][780/1251] eta 0:02:03 lr 0.000093 wd 0.0500 time 0.2431 (0.2631) data time 0.0010 (0.0107) model time 0.2421 (0.2467) loss 3.0192 (2.7821) grad_norm 3.3632 (4.5618) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][790/1251] eta 0:02:01 lr 0.000093 wd 0.0500 time 0.2379 (0.2628) data time 0.0008 (0.0106) model time 0.2372 (0.2466) loss 3.5343 (2.7829) grad_norm 5.4442 (4.6008) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][800/1251] eta 0:01:58 lr 0.000093 wd 0.0500 
time 0.2409 (0.2626) data time 0.0009 (0.0105) model time 0.2400 (0.2466) loss 3.5346 (2.7851) grad_norm 3.6882 (4.6023) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][810/1251] eta 0:01:55 lr 0.000093 wd 0.0500 time 0.2381 (0.2623) data time 0.0011 (0.0103) model time 0.2371 (0.2465) loss 3.1974 (2.7877) grad_norm 4.0558 (4.6258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][820/1251] eta 0:01:52 lr 0.000093 wd 0.0500 time 0.2419 (0.2621) data time 0.0007 (0.0102) model time 0.2411 (0.2464) loss 3.2513 (2.7868) grad_norm 3.7371 (4.6305) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][830/1251] eta 0:01:50 lr 0.000093 wd 0.0500 time 0.2355 (0.2618) data time 0.0008 (0.0101) model time 0.2346 (0.2463) loss 3.1344 (2.7919) grad_norm 4.4796 (4.6431) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][840/1251] eta 0:01:47 lr 0.000093 wd 0.0500 time 0.2415 (0.2616) data time 0.0009 (0.0100) model time 0.2406 (0.2463) loss 2.4297 (2.7906) grad_norm 3.1403 (4.6463) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][850/1251] eta 0:01:44 lr 0.000093 wd 0.0500 time 0.2427 (0.2613) data time 0.0009 (0.0099) model time 0.2418 (0.2462) loss 2.0889 (2.7872) grad_norm 3.3922 (4.6616) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][860/1251] eta 0:01:42 lr 0.000093 wd 0.0500 time 0.2362 (0.2611) data time 0.0011 (0.0098) model time 0.2351 (0.2461) loss 3.1145 (2.7880) grad_norm 4.2510 (4.6636) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:32 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [247/300][870/1251] eta 0:01:39 lr 0.000093 wd 0.0500 time 0.2442 (0.2609) data time 0.0010 (0.0097) model time 0.2431 (0.2461) loss 2.9362 (2.7883) grad_norm 5.7234 (4.6732) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][880/1251] eta 0:01:36 lr 0.000093 wd 0.0500 time 0.2391 (0.2607) data time 0.0012 (0.0096) model time 0.2379 (0.2460) loss 3.0643 (2.7880) grad_norm 4.8533 (4.6672) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][890/1251] eta 0:01:34 lr 0.000093 wd 0.0500 time 0.2446 (0.2605) data time 0.0010 (0.0095) model time 0.2435 (0.2460) loss 3.2546 (2.7894) grad_norm 3.6227 (4.6586) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][900/1251] eta 0:01:31 lr 0.000093 wd 0.0500 time 0.2406 (0.2603) data time 0.0008 (0.0094) model time 0.2398 (0.2459) loss 2.8912 (2.7871) grad_norm 17.3101 (4.6685) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][910/1251] eta 0:01:28 lr 0.000093 wd 0.0500 time 0.2431 (0.2601) data time 0.0008 (0.0093) model time 0.2422 (0.2459) loss 2.5427 (2.7853) grad_norm 5.5242 (4.6862) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][920/1251] eta 0:01:26 lr 0.000093 wd 0.0500 time 0.2306 (0.2599) data time 0.0010 (0.0092) model time 0.2296 (0.2458) loss 2.8153 (2.7841) grad_norm 4.0681 (4.6869) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][930/1251] eta 0:01:23 lr 0.000093 wd 0.0500 time 0.2443 (0.2597) data time 0.0007 (0.0091) model time 0.2436 (0.2458) loss 1.9147 (2.7831) grad_norm 4.7977 
(4.6828) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][940/1251] eta 0:01:20 lr 0.000093 wd 0.0500 time 0.2449 (0.2596) data time 0.0007 (0.0091) model time 0.2442 (0.2458) loss 1.9211 (2.7833) grad_norm 3.3589 (4.6708) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][950/1251] eta 0:01:18 lr 0.000093 wd 0.0500 time 0.2463 (0.2594) data time 0.0008 (0.0090) model time 0.2455 (0.2457) loss 2.9285 (2.7850) grad_norm 3.9377 (4.6616) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][960/1251] eta 0:01:15 lr 0.000093 wd 0.0500 time 0.2393 (0.2593) data time 0.0008 (0.0089) model time 0.2386 (0.2457) loss 2.1052 (2.7834) grad_norm 3.7619 (4.6542) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][970/1251] eta 0:01:12 lr 0.000093 wd 0.0500 time 0.2397 (0.2591) data time 0.0007 (0.0088) model time 0.2390 (0.2456) loss 2.9242 (2.7831) grad_norm 2.9750 (4.6561) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:22:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][980/1251] eta 0:01:10 lr 0.000093 wd 0.0500 time 0.2470 (0.2589) data time 0.0011 (0.0087) model time 0.2459 (0.2456) loss 2.2061 (2.7832) grad_norm 5.3100 (4.6560) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][990/1251] eta 0:01:07 lr 0.000093 wd 0.0500 time 0.2425 (0.2587) data time 0.0007 (0.0086) model time 0.2418 (0.2455) loss 1.5856 (2.7807) grad_norm 5.7367 (4.6503) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1000/1251] eta 0:01:04 lr 0.000093 wd 0.0500 time 0.2455 (0.2588) 
data time 0.0009 (0.0086) model time 0.2446 (0.2458) loss 2.7636 (2.7820) grad_norm 4.0384 (4.6809) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1010/1251] eta 0:01:02 lr 0.000092 wd 0.0500 time 0.2390 (0.2588) data time 0.0009 (0.0085) model time 0.2381 (0.2459) loss 3.3806 (2.7827) grad_norm 4.0776 (4.6793) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1020/1251] eta 0:00:59 lr 0.000092 wd 0.0500 time 0.2595 (0.2587) data time 0.0007 (0.0084) model time 0.2588 (0.2459) loss 2.9031 (2.7842) grad_norm 3.4040 (4.6728) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1030/1251] eta 0:00:57 lr 0.000092 wd 0.0500 time 0.2443 (0.2585) data time 0.0011 (0.0083) model time 0.2432 (0.2458) loss 2.7914 (2.7860) grad_norm 6.5451 (4.6715) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1040/1251] eta 0:00:54 lr 0.000092 wd 0.0500 time 0.2358 (0.2584) data time 0.0011 (0.0083) model time 0.2348 (0.2458) loss 2.7175 (2.7839) grad_norm 6.2141 (4.6665) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1050/1251] eta 0:00:51 lr 0.000092 wd 0.0500 time 0.2392 (0.2582) data time 0.0011 (0.0082) model time 0.2381 (0.2458) loss 2.4768 (2.7846) grad_norm 3.9357 (4.6622) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1060/1251] eta 0:00:49 lr 0.000092 wd 0.0500 time 0.2372 (0.2581) data time 0.0011 (0.0081) model time 0.2361 (0.2457) loss 2.9445 (2.7845) grad_norm 3.6056 (4.6592) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [247/300][1070/1251] eta 0:00:46 lr 0.000092 wd 0.0500 time 0.2365 (0.2579) data time 0.0012 (0.0081) model time 0.2353 (0.2457) loss 3.1826 (2.7871) grad_norm 5.7562 (4.6716) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1080/1251] eta 0:00:44 lr 0.000092 wd 0.0500 time 0.2412 (0.2578) data time 0.0009 (0.0080) model time 0.2403 (0.2456) loss 2.4591 (2.7849) grad_norm 2.9987 (4.6658) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1090/1251] eta 0:00:41 lr 0.000092 wd 0.0500 time 0.2362 (0.2577) data time 0.0010 (0.0079) model time 0.2352 (0.2456) loss 2.5749 (2.7833) grad_norm 3.3719 (4.6690) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1100/1251] eta 0:00:38 lr 0.000092 wd 0.0500 time 0.2445 (0.2576) data time 0.0009 (0.0079) model time 0.2436 (0.2456) loss 2.7469 (2.7845) grad_norm 4.7896 (4.6650) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1110/1251] eta 0:00:36 lr 0.000092 wd 0.0500 time 0.2355 (0.2574) data time 0.0009 (0.0078) model time 0.2346 (0.2455) loss 3.1166 (2.7855) grad_norm 3.5429 (4.6602) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1120/1251] eta 0:00:33 lr 0.000092 wd 0.0500 time 0.2405 (0.2573) data time 0.0007 (0.0078) model time 0.2397 (0.2455) loss 3.1059 (2.7860) grad_norm 4.0670 (4.6588) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1130/1251] eta 0:00:31 lr 0.000092 wd 0.0500 time 0.2395 (0.2572) data time 0.0011 (0.0077) model time 0.2385 (0.2455) loss 3.1926 (2.7866) grad_norm 5.5367 (4.6557) loss_scale 
256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1140/1251] eta 0:00:28 lr 0.000092 wd 0.0500 time 0.2391 (0.2570) data time 0.0007 (0.0076) model time 0.2383 (0.2454) loss 2.4795 (2.7835) grad_norm 4.5472 (4.6575) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1150/1251] eta 0:00:25 lr 0.000092 wd 0.0500 time 0.2502 (0.2569) data time 0.0009 (0.0076) model time 0.2493 (0.2454) loss 2.8659 (2.7852) grad_norm 3.6397 (4.6485) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1160/1251] eta 0:00:23 lr 0.000092 wd 0.0500 time 0.2434 (0.2568) data time 0.0007 (0.0075) model time 0.2427 (0.2454) loss 3.4864 (2.7881) grad_norm 4.8965 (4.6461) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1170/1251] eta 0:00:20 lr 0.000092 wd 0.0500 time 0.2343 (0.2566) data time 0.0007 (0.0075) model time 0.2336 (0.2453) loss 2.4183 (2.7867) grad_norm 3.2690 (4.6543) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1180/1251] eta 0:00:18 lr 0.000092 wd 0.0500 time 0.2378 (0.2565) data time 0.0010 (0.0074) model time 0.2368 (0.2453) loss 1.7681 (2.7852) grad_norm 3.7274 (4.6705) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1190/1251] eta 0:00:15 lr 0.000092 wd 0.0500 time 0.2365 (0.2564) data time 0.0008 (0.0074) model time 0.2357 (0.2452) loss 2.3057 (2.7870) grad_norm 4.7633 (4.6721) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1200/1251] eta 0:00:13 lr 0.000092 wd 0.0500 time 0.2433 (0.2563) data time 0.0008 
(0.0073) model time 0.2425 (0.2452) loss 2.9502 (2.7876) grad_norm 4.7796 (4.6675) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1210/1251] eta 0:00:10 lr 0.000092 wd 0.0500 time 0.2413 (0.2561) data time 0.0009 (0.0073) model time 0.2404 (0.2452) loss 2.5410 (2.7882) grad_norm 5.0648 (4.6720) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1220/1251] eta 0:00:07 lr 0.000092 wd 0.0500 time 0.2398 (0.2561) data time 0.0007 (0.0072) model time 0.2391 (0.2451) loss 2.8661 (2.7887) grad_norm 4.3593 (4.6818) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1230/1251] eta 0:00:05 lr 0.000092 wd 0.0500 time 0.2379 (0.2559) data time 0.0009 (0.0072) model time 0.2370 (0.2451) loss 3.0468 (2.7883) grad_norm 6.4376 (4.6812) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1240/1251] eta 0:00:02 lr 0.000092 wd 0.0500 time 0.2249 (0.2558) data time 0.0007 (0.0071) model time 0.2242 (0.2450) loss 3.1455 (2.7887) grad_norm 8.4809 (4.6938) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [247/300][1250/1251] eta 0:00:00 lr 0.000092 wd 0.0500 time 0.2261 (0.2555) data time 0.0007 (0.0071) model time 0.2254 (0.2449) loss 2.7069 (2.7888) grad_norm 3.5921 (4.6959) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 247 training takes 0:05:19 [2024-09-01 07:24:04 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 07:24:05 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 07:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.430 (0.430) Loss 0.3923 (0.3923) Acc@1 93.457 (93.457) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 07:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.112) Loss 0.5796 (0.6179) Acc@1 89.648 (87.358) Acc@5 97.949 (97.772) Mem 7381MB [2024-09-01 07:24:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 0.8970 (0.6479) Acc@1 78.125 (86.137) Acc@5 95.508 (97.624) Mem 7381MB [2024-09-01 07:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.090) Loss 1.1182 (0.7390) Acc@1 74.121 (83.909) Acc@5 92.871 (96.664) Mem 7381MB [2024-09-01 07:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0107 (0.7904) Acc@1 76.270 (82.705) Acc@5 94.238 (96.120) Mem 7381MB [2024-09-01 07:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.264 Acc@5 96.084 [2024-09-01 07:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.3% [2024-09-01 07:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.26% [2024-09-01 07:24:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 07:24:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 07:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.477 (0.477) Loss 0.3823 (0.3823) Acc@1 93.066 (93.066) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 07:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.114) Loss 0.5664 (0.5993) Acc@1 90.039 (87.607) Acc@5 98.047 (97.754) Mem 7381MB [2024-09-01 07:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.097) Loss 0.8940 (0.6304) Acc@1 77.832 (86.416) Acc@5 96.289 (97.707) Mem 7381MB [2024-09-01 07:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.091) Loss 1.0957 (0.7166) Acc@1 73.926 (84.321) Acc@5 93.262 (96.859) Mem 7381MB [2024-09-01 07:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 0.9922 (0.7618) Acc@1 76.758 (83.222) Acc@5 94.336 (96.353) Mem 7381MB [2024-09-01 07:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.884 Acc@5 96.314 [2024-09-01 07:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][0/1251] eta 0:26:23 lr 0.000092 wd 0.0500 time 1.2654 (1.2654) data time 0.7076 (0.7076) model time 0.0000 (0.0000) loss 2.9554 (2.9554) grad_norm 3.1905 (3.1905) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][10/1251] eta 0:06:56 lr 0.000092 wd 0.0500 time 0.2359 (0.3356) data time 0.0012 (0.0652) model time 0.0000 (0.0000) loss 3.1053 (2.7662) grad_norm 5.2862 (4.5386) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][20/1251] eta 0:06:09 lr 0.000092 wd 0.0500 time 0.2406 (0.3002) data time 0.0012 (0.0346) model time 0.0000 (0.0000) loss 3.1341 (2.8887) grad_norm 3.5322 (4.6987) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][30/1251] eta 0:05:42 lr 0.000092 wd 0.0500 time 0.2349 (0.2803) data time 0.0007 (0.0238) model time 0.0000 (0.0000) loss 3.3289 (2.8547) grad_norm 3.4052 (4.4434) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][40/1251] eta 0:05:27 lr 0.000092 wd 0.0500 time 0.2386 (0.2705) data time 0.0008 (0.0182) model time 0.0000 (0.0000) loss 2.9224 (2.7886) grad_norm 3.9631 (4.3091) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][50/1251] eta 0:05:17 lr 0.000092 wd 0.0500 time 0.2326 (0.2644) data time 0.0007 (0.0148) model time 0.0000 (0.0000) loss 3.4720 (2.8390) grad_norm 4.8193 (4.2785) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][60/1251] eta 0:05:10 lr 0.000092 wd 0.0500 time 0.2510 (0.2609) data time 0.0012 (0.0126) model time 0.2498 (0.2418) loss 3.0998 (2.8248) grad_norm 5.7998 (4.2641) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][70/1251] eta 0:05:05 lr 0.000092 wd 0.0500 time 0.2474 (0.2585) data time 0.0010 (0.0109) model time 0.2464 (0.2424) loss 2.8216 (2.8602) grad_norm 3.9330 (4.2523) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][80/1251] eta 0:04:59 lr 0.000092 wd 0.0500 time 0.2360 (0.2561) data time 0.0009 (0.0098) model time 0.2351 (0.2409) loss 2.7908 (2.8768) grad_norm 4.4068 (4.4038) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][90/1251] eta 0:04:55 lr 0.000092 wd 0.0500 time 0.2424 (0.2545) data time 0.0008 
(0.0088) model time 0.2416 (0.2409) loss 2.2479 (2.8719) grad_norm 5.0286 (4.3452) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][100/1251] eta 0:04:51 lr 0.000092 wd 0.0500 time 0.2499 (0.2533) data time 0.0008 (0.0080) model time 0.2491 (0.2410) loss 1.6850 (2.8788) grad_norm 3.1262 (4.3279) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][110/1251] eta 0:04:47 lr 0.000092 wd 0.0500 time 0.2420 (0.2522) data time 0.0009 (0.0074) model time 0.2410 (0.2407) loss 3.0523 (2.8588) grad_norm 4.6810 (4.3563) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][120/1251] eta 0:04:44 lr 0.000092 wd 0.0500 time 0.2321 (0.2512) data time 0.0009 (0.0069) model time 0.2311 (0.2405) loss 2.4614 (2.8440) grad_norm 3.4036 (4.4099) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][130/1251] eta 0:04:40 lr 0.000092 wd 0.0500 time 0.2398 (0.2504) data time 0.0010 (0.0064) model time 0.2388 (0.2405) loss 3.0778 (2.8559) grad_norm 3.9749 (4.3932) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][140/1251] eta 0:04:37 lr 0.000092 wd 0.0500 time 0.2382 (0.2496) data time 0.0008 (0.0060) model time 0.2374 (0.2402) loss 2.5755 (2.8500) grad_norm 7.0240 (4.4230) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][150/1251] eta 0:04:34 lr 0.000092 wd 0.0500 time 0.2351 (0.2490) data time 0.0010 (0.0057) model time 0.2341 (0.2401) loss 3.0396 (2.8533) grad_norm 3.2482 (4.5171) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[248/300][160/1251] eta 0:04:31 lr 0.000091 wd 0.0500 time 0.2422 (0.2487) data time 0.0010 (0.0054) model time 0.2412 (0.2404) loss 2.7295 (2.8469) grad_norm 4.6391 (4.5454) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][170/1251] eta 0:04:28 lr 0.000091 wd 0.0500 time 0.2381 (0.2483) data time 0.0010 (0.0051) model time 0.2370 (0.2405) loss 2.8691 (2.8417) grad_norm 3.3150 (4.5238) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:24:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][180/1251] eta 0:04:25 lr 0.000091 wd 0.0500 time 0.2410 (0.2481) data time 0.0009 (0.0049) model time 0.2401 (0.2406) loss 2.8113 (2.8455) grad_norm 3.3958 (4.4842) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][190/1251] eta 0:04:22 lr 0.000091 wd 0.0500 time 0.2382 (0.2477) data time 0.0009 (0.0047) model time 0.2373 (0.2407) loss 3.0601 (2.8469) grad_norm 3.5635 (4.7137) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][200/1251] eta 0:04:20 lr 0.000091 wd 0.0500 time 0.2469 (0.2475) data time 0.0007 (0.0045) model time 0.2462 (0.2408) loss 3.1069 (2.8415) grad_norm 2.8941 (4.6797) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][210/1251] eta 0:04:17 lr 0.000091 wd 0.0500 time 0.2494 (0.2474) data time 0.0011 (0.0044) model time 0.2484 (0.2409) loss 2.8719 (2.8369) grad_norm 3.0588 (4.6784) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][220/1251] eta 0:04:14 lr 0.000091 wd 0.0500 time 0.2411 (0.2472) data time 0.0009 (0.0042) model time 0.2402 (0.2410) loss 3.0457 (2.8271) grad_norm 5.2227 (4.6705) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 07:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][230/1251] eta 0:04:12 lr 0.000091 wd 0.0500 time 0.2361 (0.2469) data time 0.0009 (0.0041) model time 0.2352 (0.2409) loss 2.9573 (2.8279) grad_norm 4.4197 (4.6320) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][240/1251] eta 0:04:09 lr 0.000091 wd 0.0500 time 0.2415 (0.2467) data time 0.0007 (0.0040) model time 0.2408 (0.2409) loss 2.8492 (2.8236) grad_norm 3.1934 (4.6338) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][250/1251] eta 0:04:06 lr 0.000091 wd 0.0500 time 0.2413 (0.2465) data time 0.0009 (0.0038) model time 0.2404 (0.2409) loss 2.9423 (2.8238) grad_norm 3.8360 (4.6434) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][260/1251] eta 0:04:04 lr 0.000091 wd 0.0500 time 0.2383 (0.2464) data time 0.0008 (0.0037) model time 0.2374 (0.2409) loss 3.1405 (2.8256) grad_norm 6.7628 (4.6484) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][270/1251] eta 0:04:01 lr 0.000091 wd 0.0500 time 0.2353 (0.2463) data time 0.0007 (0.0036) model time 0.2346 (0.2411) loss 1.7986 (2.8236) grad_norm 7.0049 (4.6486) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][280/1251] eta 0:03:59 lr 0.000091 wd 0.0500 time 0.2399 (0.2470) data time 0.0008 (0.0035) model time 0.2390 (0.2421) loss 2.9441 (2.8176) grad_norm 6.8004 (4.6966) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][290/1251] eta 0:03:57 lr 0.000091 wd 0.0500 time 0.2483 (0.2475) data time 0.0010 (0.0034) model time 0.2473 
(0.2429) loss 2.7765 (2.8144) grad_norm 3.3566 (4.7538) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][300/1251] eta 0:03:55 lr 0.000091 wd 0.0500 time 0.2368 (0.2473) data time 0.0007 (0.0034) model time 0.2361 (0.2427) loss 3.0088 (2.8176) grad_norm 4.4426 (4.7399) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][310/1251] eta 0:03:52 lr 0.000091 wd 0.0500 time 0.2414 (0.2471) data time 0.0012 (0.0033) model time 0.2403 (0.2427) loss 2.9331 (2.8123) grad_norm 4.5993 (4.7350) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][320/1251] eta 0:03:49 lr 0.000091 wd 0.0500 time 0.2401 (0.2470) data time 0.0010 (0.0032) model time 0.2391 (0.2427) loss 3.4197 (2.8158) grad_norm 3.6639 (4.7329) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][330/1251] eta 0:03:47 lr 0.000091 wd 0.0500 time 0.2433 (0.2468) data time 0.0007 (0.0031) model time 0.2426 (0.2426) loss 1.5356 (2.8113) grad_norm 3.3091 (4.7177) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][340/1251] eta 0:03:44 lr 0.000091 wd 0.0500 time 0.2391 (0.2467) data time 0.0011 (0.0031) model time 0.2381 (0.2425) loss 2.8267 (2.8089) grad_norm 2.9176 (4.7030) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][350/1251] eta 0:03:42 lr 0.000091 wd 0.0500 time 0.2427 (0.2465) data time 0.0010 (0.0030) model time 0.2417 (0.2425) loss 2.8713 (2.8022) grad_norm 3.5416 (4.7037) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][360/1251] eta 0:03:39 
lr 0.000091 wd 0.0500 time 0.2464 (0.2464) data time 0.0008 (0.0030) model time 0.2457 (0.2424) loss 2.1301 (2.8030) grad_norm 3.3620 (4.7032) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][370/1251] eta 0:03:36 lr 0.000091 wd 0.0500 time 0.2290 (0.2462) data time 0.0012 (0.0029) model time 0.2278 (0.2423) loss 2.3550 (2.8025) grad_norm 3.3069 (4.6944) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][380/1251] eta 0:03:34 lr 0.000091 wd 0.0500 time 0.2287 (0.2460) data time 0.0010 (0.0029) model time 0.2278 (0.2422) loss 2.9612 (2.8047) grad_norm 5.2254 (4.6813) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][390/1251] eta 0:03:31 lr 0.000091 wd 0.0500 time 0.2316 (0.2458) data time 0.0009 (0.0028) model time 0.2307 (0.2420) loss 2.1835 (2.8002) grad_norm 3.8374 (4.6629) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][400/1251] eta 0:03:35 lr 0.000091 wd 0.0500 time 0.2405 (0.2527) data time 0.0010 (0.0028) model time 0.2395 (0.2500) loss 3.2706 (2.8026) grad_norm 4.1493 (4.6492) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][410/1251] eta 0:03:32 lr 0.000091 wd 0.0500 time 0.2479 (0.2524) data time 0.0007 (0.0027) model time 0.2472 (0.2497) loss 3.4487 (2.8040) grad_norm 5.2364 (4.6403) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][420/1251] eta 0:03:29 lr 0.000091 wd 0.0500 time 0.2446 (0.2522) data time 0.0008 (0.0027) model time 0.2438 (0.2495) loss 2.5680 (2.8015) grad_norm 3.2768 (4.6244) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][430/1251] eta 0:03:26 lr 0.000091 wd 0.0500 time 0.2386 (0.2520) data time 0.0008 (0.0026) model time 0.2378 (0.2493) loss 3.0598 (2.8008) grad_norm 5.6145 (4.6123) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][440/1251] eta 0:03:24 lr 0.000091 wd 0.0500 time 0.2443 (0.2517) data time 0.0007 (0.0026) model time 0.2435 (0.2491) loss 2.0814 (2.8025) grad_norm 3.6055 (4.5958) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][450/1251] eta 0:03:21 lr 0.000091 wd 0.0500 time 0.2385 (0.2515) data time 0.0009 (0.0026) model time 0.2376 (0.2488) loss 3.0850 (2.8039) grad_norm 4.5830 (4.5863) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][460/1251] eta 0:03:18 lr 0.000091 wd 0.0500 time 0.2358 (0.2513) data time 0.0009 (0.0025) model time 0.2349 (0.2486) loss 3.1682 (2.7993) grad_norm 3.5759 (4.5786) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][470/1251] eta 0:03:16 lr 0.000091 wd 0.0500 time 0.2329 (0.2510) data time 0.0010 (0.0025) model time 0.2319 (0.2484) loss 2.7260 (2.7984) grad_norm 4.7616 (4.5930) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][480/1251] eta 0:03:13 lr 0.000091 wd 0.0500 time 0.2430 (0.2509) data time 0.0008 (0.0025) model time 0.2423 (0.2482) loss 2.9917 (2.8025) grad_norm 9.9762 (4.5957) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][490/1251] eta 0:03:10 lr 0.000091 wd 0.0500 time 0.2373 (0.2507) data time 0.0010 (0.0024) model time 0.2363 (0.2481) loss 3.0679 (2.8051) 
grad_norm 4.2795 (4.5947) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][500/1251] eta 0:03:08 lr 0.000091 wd 0.0500 time 0.2322 (0.2504) data time 0.0007 (0.0024) model time 0.2315 (0.2478) loss 3.2913 (2.7973) grad_norm 4.3854 (4.5918) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][510/1251] eta 0:03:05 lr 0.000091 wd 0.0500 time 0.2351 (0.2502) data time 0.0010 (0.0024) model time 0.2341 (0.2477) loss 2.9461 (2.7967) grad_norm 5.5181 (4.6220) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][520/1251] eta 0:03:02 lr 0.000091 wd 0.0500 time 0.2328 (0.2501) data time 0.0011 (0.0024) model time 0.2318 (0.2475) loss 2.9798 (2.7949) grad_norm 4.4971 (4.6389) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][530/1251] eta 0:03:00 lr 0.000091 wd 0.0500 time 0.2451 (0.2499) data time 0.0010 (0.0023) model time 0.2441 (0.2473) loss 3.0298 (2.7926) grad_norm 7.5708 (4.6471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][540/1251] eta 0:02:57 lr 0.000091 wd 0.0500 time 0.2497 (0.2497) data time 0.0009 (0.0023) model time 0.2488 (0.2472) loss 3.2472 (2.7951) grad_norm 3.7792 (4.6525) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][550/1251] eta 0:02:54 lr 0.000091 wd 0.0500 time 0.2357 (0.2495) data time 0.0007 (0.0023) model time 0.2350 (0.2470) loss 3.0261 (2.7943) grad_norm 4.1947 (4.6504) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][560/1251] eta 0:02:52 lr 0.000091 wd 0.0500 time 
0.2351 (0.2498) data time 0.0009 (0.0023) model time 0.2342 (0.2473) loss 1.9598 (2.7906) grad_norm 3.4446 (4.6851) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][570/1251] eta 0:02:50 lr 0.000090 wd 0.0500 time 0.2478 (0.2496) data time 0.0009 (0.0022) model time 0.2469 (0.2472) loss 2.9663 (2.7923) grad_norm 3.5598 (4.6855) loss_scale 512.0000 (256.4483) mem 7381MB [2024-09-01 07:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][580/1251] eta 0:02:47 lr 0.000090 wd 0.0500 time 0.2421 (0.2495) data time 0.0008 (0.0022) model time 0.2412 (0.2471) loss 3.0143 (2.7909) grad_norm 2.7105 (4.6703) loss_scale 512.0000 (260.8468) mem 7381MB [2024-09-01 07:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][590/1251] eta 0:02:44 lr 0.000090 wd 0.0500 time 0.2393 (0.2493) data time 0.0010 (0.0022) model time 0.2383 (0.2469) loss 2.3145 (2.7873) grad_norm 11.5708 (4.6732) loss_scale 512.0000 (265.0964) mem 7381MB [2024-09-01 07:26:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][600/1251] eta 0:02:42 lr 0.000090 wd 0.0500 time 0.2404 (0.2492) data time 0.0009 (0.0022) model time 0.2395 (0.2468) loss 2.8432 (2.7867) grad_norm 4.4428 (4.6736) loss_scale 512.0000 (269.2047) mem 7381MB [2024-09-01 07:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][610/1251] eta 0:02:39 lr 0.000090 wd 0.0500 time 0.2394 (0.2490) data time 0.0009 (0.0022) model time 0.2385 (0.2466) loss 3.0941 (2.7863) grad_norm 3.3929 (4.6694) loss_scale 512.0000 (273.1784) mem 7381MB [2024-09-01 07:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][620/1251] eta 0:02:37 lr 0.000090 wd 0.0500 time 0.2412 (0.2489) data time 0.0010 (0.0021) model time 0.2401 (0.2465) loss 2.9169 (2.7836) grad_norm 4.2855 (4.6586) loss_scale 512.0000 (277.0242) mem 7381MB [2024-09-01 07:26:51 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [248/300][630/1251] eta 0:02:34 lr 0.000090 wd 0.0500 time 0.2377 (0.2487) data time 0.0010 (0.0021) model time 0.2367 (0.2464) loss 3.0890 (2.7859) grad_norm 3.6164 (4.6545) loss_scale 512.0000 (280.7480) mem 7381MB [2024-09-01 07:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][640/1251] eta 0:02:31 lr 0.000090 wd 0.0500 time 0.2442 (0.2486) data time 0.0009 (0.0021) model time 0.2433 (0.2462) loss 2.6791 (2.7861) grad_norm 6.1942 (4.6575) loss_scale 512.0000 (284.3557) mem 7381MB [2024-09-01 07:26:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][650/1251] eta 0:02:29 lr 0.000090 wd 0.0500 time 0.2366 (0.2485) data time 0.0011 (0.0021) model time 0.2355 (0.2461) loss 3.1467 (2.7852) grad_norm 4.9163 (4.6594) loss_scale 512.0000 (287.8525) mem 7381MB [2024-09-01 07:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][660/1251] eta 0:02:26 lr 0.000090 wd 0.0500 time 0.2370 (0.2484) data time 0.0013 (0.0021) model time 0.2357 (0.2460) loss 2.7713 (2.7838) grad_norm 6.9476 (4.6800) loss_scale 512.0000 (291.2436) mem 7381MB [2024-09-01 07:27:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][670/1251] eta 0:02:24 lr 0.000090 wd 0.0500 time 0.2423 (0.2483) data time 0.0008 (0.0021) model time 0.2415 (0.2460) loss 3.2733 (2.7854) grad_norm 18.8064 (4.7134) loss_scale 512.0000 (294.5335) mem 7381MB [2024-09-01 07:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][680/1251] eta 0:02:21 lr 0.000090 wd 0.0500 time 0.2383 (0.2482) data time 0.0008 (0.0020) model time 0.2375 (0.2458) loss 3.2920 (2.7881) grad_norm 4.4111 (4.7327) loss_scale 512.0000 (297.7269) mem 7381MB [2024-09-01 07:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][690/1251] eta 0:02:19 lr 0.000090 wd 0.0500 time 0.2412 (0.2481) data time 0.0009 (0.0020) model time 0.2403 (0.2458) loss 2.9755 (2.7911) grad_norm 5.1459 
(4.7691) loss_scale 512.0000 (300.8278) mem 7381MB [2024-09-01 07:27:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][700/1251] eta 0:02:16 lr 0.000090 wd 0.0500 time 0.2364 (0.2479) data time 0.0007 (0.0020) model time 0.2357 (0.2457) loss 3.6634 (2.7915) grad_norm 4.3122 (4.8303) loss_scale 512.0000 (303.8402) mem 7381MB [2024-09-01 07:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][710/1251] eta 0:02:14 lr 0.000090 wd 0.0500 time 0.2386 (0.2479) data time 0.0009 (0.0020) model time 0.2378 (0.2456) loss 3.0075 (2.7918) grad_norm 3.5250 (4.8262) loss_scale 512.0000 (306.7679) mem 7381MB [2024-09-01 07:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][720/1251] eta 0:02:11 lr 0.000090 wd 0.0500 time 0.2313 (0.2478) data time 0.0007 (0.0020) model time 0.2306 (0.2455) loss 3.0453 (2.7909) grad_norm 4.0521 (4.8154) loss_scale 512.0000 (309.6144) mem 7381MB [2024-09-01 07:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][730/1251] eta 0:02:09 lr 0.000090 wd 0.0500 time 0.2414 (0.2477) data time 0.0010 (0.0020) model time 0.2403 (0.2454) loss 3.0062 (2.7923) grad_norm 4.1221 (4.8202) loss_scale 512.0000 (312.3830) mem 7381MB [2024-09-01 07:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][740/1251] eta 0:02:06 lr 0.000090 wd 0.0500 time 0.2373 (0.2476) data time 0.0007 (0.0020) model time 0.2365 (0.2453) loss 3.2786 (2.7923) grad_norm 4.4130 (4.8196) loss_scale 512.0000 (315.0769) mem 7381MB [2024-09-01 07:27:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][750/1251] eta 0:02:03 lr 0.000090 wd 0.0500 time 0.2408 (0.2474) data time 0.0012 (0.0019) model time 0.2397 (0.2452) loss 3.2029 (2.7914) grad_norm 5.0111 (4.8233) loss_scale 512.0000 (317.6991) mem 7381MB [2024-09-01 07:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][760/1251] eta 0:02:01 lr 0.000090 wd 0.0500 time 0.2421 (0.2474) data 
time 0.0010 (0.0019) model time 0.2411 (0.2451) loss 3.0735 (2.7931) grad_norm 3.1597 (4.8403) loss_scale 512.0000 (320.2523) mem 7381MB [2024-09-01 07:27:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][770/1251] eta 0:01:58 lr 0.000090 wd 0.0500 time 0.2438 (0.2473) data time 0.0010 (0.0019) model time 0.2428 (0.2451) loss 3.0270 (2.7902) grad_norm 4.8767 (4.8423) loss_scale 512.0000 (322.7393) mem 7381MB [2024-09-01 07:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][780/1251] eta 0:01:56 lr 0.000090 wd 0.0500 time 0.2472 (0.2472) data time 0.0008 (0.0019) model time 0.2464 (0.2450) loss 3.1150 (2.7887) grad_norm 3.0350 (4.8510) loss_scale 512.0000 (325.1626) mem 7381MB [2024-09-01 07:27:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][790/1251] eta 0:01:53 lr 0.000090 wd 0.0500 time 0.2411 (0.2471) data time 0.0009 (0.0019) model time 0.2402 (0.2449) loss 2.8800 (2.7834) grad_norm 22.7308 (4.8598) loss_scale 512.0000 (327.5247) mem 7381MB [2024-09-01 07:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][800/1251] eta 0:01:51 lr 0.000090 wd 0.0500 time 0.2439 (0.2474) data time 0.0008 (0.0019) model time 0.2430 (0.2453) loss 2.4655 (2.7788) grad_norm 4.1357 (4.8562) loss_scale 512.0000 (329.8277) mem 7381MB [2024-09-01 07:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][810/1251] eta 0:01:49 lr 0.000090 wd 0.0500 time 0.2384 (0.2476) data time 0.0012 (0.0019) model time 0.2373 (0.2454) loss 2.9275 (2.7789) grad_norm 6.3294 (4.8514) loss_scale 512.0000 (332.0740) mem 7381MB [2024-09-01 07:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][820/1251] eta 0:01:46 lr 0.000090 wd 0.0500 time 0.2471 (0.2475) data time 0.0009 (0.0019) model time 0.2462 (0.2454) loss 3.3104 (2.7810) grad_norm 5.6586 (4.8483) loss_scale 512.0000 (334.2655) mem 7381MB [2024-09-01 07:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [248/300][830/1251] eta 0:01:44 lr 0.000090 wd 0.0500 time 0.2455 (0.2474) data time 0.0007 (0.0019) model time 0.2448 (0.2453) loss 2.8697 (2.7817) grad_norm 3.4134 (4.8499) loss_scale 512.0000 (336.4043) mem 7381MB [2024-09-01 07:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][840/1251] eta 0:01:41 lr 0.000090 wd 0.0500 time 0.2418 (0.2473) data time 0.0007 (0.0018) model time 0.2411 (0.2452) loss 3.0860 (2.7826) grad_norm 4.1802 (4.8627) loss_scale 512.0000 (338.4923) mem 7381MB [2024-09-01 07:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][850/1251] eta 0:01:39 lr 0.000090 wd 0.0500 time 0.2400 (0.2473) data time 0.0009 (0.0018) model time 0.2391 (0.2452) loss 2.4885 (2.7849) grad_norm 4.0148 (4.8523) loss_scale 512.0000 (340.5311) mem 7381MB [2024-09-01 07:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][860/1251] eta 0:01:36 lr 0.000090 wd 0.0500 time 0.2462 (0.2472) data time 0.0011 (0.0018) model time 0.2452 (0.2451) loss 2.7599 (2.7831) grad_norm 3.5415 (4.8519) loss_scale 512.0000 (342.5226) mem 7381MB [2024-09-01 07:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][870/1251] eta 0:01:34 lr 0.000090 wd 0.0500 time 0.2441 (0.2471) data time 0.0008 (0.0018) model time 0.2433 (0.2451) loss 3.3500 (2.7827) grad_norm 3.7014 (4.8471) loss_scale 512.0000 (344.4684) mem 7381MB [2024-09-01 07:27:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][880/1251] eta 0:01:31 lr 0.000090 wd 0.0500 time 0.2353 (0.2470) data time 0.0009 (0.0018) model time 0.2344 (0.2450) loss 2.4576 (2.7827) grad_norm 3.3274 (4.8410) loss_scale 512.0000 (346.3700) mem 7381MB [2024-09-01 07:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][890/1251] eta 0:01:29 lr 0.000090 wd 0.0500 time 0.2425 (0.2470) data time 0.0010 (0.0018) model time 0.2415 (0.2449) loss 3.1730 (2.7809) grad_norm 3.4402 (4.8489) loss_scale 512.0000 
(348.2290) mem 7381MB [2024-09-01 07:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][900/1251] eta 0:01:26 lr 0.000090 wd 0.0500 time 0.2422 (0.2469) data time 0.0007 (0.0018) model time 0.2415 (0.2449) loss 3.2577 (2.7802) grad_norm 4.8242 (4.8463) loss_scale 512.0000 (350.0466) mem 7381MB [2024-09-01 07:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][910/1251] eta 0:01:24 lr 0.000090 wd 0.0500 time 0.2376 (0.2469) data time 0.0011 (0.0018) model time 0.2366 (0.2448) loss 2.2625 (2.7792) grad_norm 3.8336 (4.8410) loss_scale 512.0000 (351.8244) mem 7381MB [2024-09-01 07:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][920/1251] eta 0:01:21 lr 0.000090 wd 0.0500 time 0.2387 (0.2468) data time 0.0009 (0.0018) model time 0.2377 (0.2448) loss 3.2200 (2.7796) grad_norm 3.2721 (4.8394) loss_scale 512.0000 (353.5635) mem 7381MB [2024-09-01 07:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][930/1251] eta 0:01:19 lr 0.000090 wd 0.0500 time 0.2451 (0.2468) data time 0.0009 (0.0018) model time 0.2442 (0.2447) loss 3.3271 (2.7805) grad_norm 4.2626 (4.8359) loss_scale 512.0000 (355.2653) mem 7381MB [2024-09-01 07:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][940/1251] eta 0:01:16 lr 0.000090 wd 0.0500 time 0.2427 (0.2467) data time 0.0010 (0.0018) model time 0.2417 (0.2447) loss 2.9114 (2.7784) grad_norm 6.3035 (4.8302) loss_scale 512.0000 (356.9309) mem 7381MB [2024-09-01 07:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][950/1251] eta 0:01:14 lr 0.000090 wd 0.0500 time 0.2421 (0.2467) data time 0.0009 (0.0017) model time 0.2411 (0.2447) loss 3.0129 (2.7766) grad_norm 3.3405 (4.8493) loss_scale 512.0000 (358.5615) mem 7381MB [2024-09-01 07:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][960/1251] eta 0:01:11 lr 0.000090 wd 0.0500 time 0.2293 (0.2466) data time 0.0009 (0.0017) model 
time 0.2285 (0.2446) loss 3.3448 (2.7779) grad_norm 8.3699 (4.8580) loss_scale 512.0000 (360.1582) mem 7381MB [2024-09-01 07:28:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][970/1251] eta 0:01:09 lr 0.000090 wd 0.0500 time 0.2433 (0.2465) data time 0.0010 (0.0017) model time 0.2423 (0.2445) loss 2.7254 (2.7778) grad_norm 5.2099 (4.8611) loss_scale 512.0000 (361.7219) mem 7381MB [2024-09-01 07:28:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][980/1251] eta 0:01:06 lr 0.000090 wd 0.0500 time 0.2395 (0.2465) data time 0.0008 (0.0017) model time 0.2388 (0.2445) loss 3.5469 (2.7774) grad_norm 6.9474 (4.8541) loss_scale 512.0000 (363.2538) mem 7381MB [2024-09-01 07:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][990/1251] eta 0:01:04 lr 0.000089 wd 0.0500 time 0.2388 (0.2464) data time 0.0007 (0.0017) model time 0.2381 (0.2444) loss 1.7672 (2.7744) grad_norm 3.3063 (4.8542) loss_scale 512.0000 (364.7548) mem 7381MB [2024-09-01 07:28:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1000/1251] eta 0:01:01 lr 0.000089 wd 0.0500 time 0.2421 (0.2464) data time 0.0007 (0.0017) model time 0.2413 (0.2444) loss 3.4239 (2.7746) grad_norm 3.5369 (4.8504) loss_scale 512.0000 (366.2258) mem 7381MB [2024-09-01 07:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1010/1251] eta 0:00:59 lr 0.000089 wd 0.0500 time 0.2459 (0.2463) data time 0.0007 (0.0017) model time 0.2452 (0.2444) loss 1.7694 (2.7742) grad_norm 2.3357 (4.8405) loss_scale 512.0000 (367.6677) mem 7381MB [2024-09-01 07:28:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1020/1251] eta 0:00:56 lr 0.000089 wd 0.0500 time 0.2399 (0.2463) data time 0.0009 (0.0017) model time 0.2390 (0.2443) loss 3.0122 (2.7719) grad_norm 3.6430 (4.8316) loss_scale 512.0000 (369.0813) mem 7381MB [2024-09-01 07:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[248/300][1030/1251] eta 0:00:54 lr 0.000089 wd 0.0500 time 0.2426 (0.2462) data time 0.0010 (0.0017) model time 0.2416 (0.2443) loss 2.2997 (2.7703) grad_norm 3.6562 (4.8253) loss_scale 512.0000 (370.4675) mem 7381MB [2024-09-01 07:28:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1040/1251] eta 0:00:51 lr 0.000089 wd 0.0500 time 0.2444 (0.2462) data time 0.0009 (0.0017) model time 0.2435 (0.2442) loss 2.8763 (2.7697) grad_norm 4.5647 (4.8186) loss_scale 512.0000 (371.8271) mem 7381MB [2024-09-01 07:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1050/1251] eta 0:00:49 lr 0.000089 wd 0.0500 time 0.2446 (0.2461) data time 0.0007 (0.0017) model time 0.2440 (0.2442) loss 3.3380 (2.7715) grad_norm 3.8255 (4.8112) loss_scale 512.0000 (373.1608) mem 7381MB [2024-09-01 07:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1060/1251] eta 0:00:47 lr 0.000089 wd 0.0500 time 0.2449 (0.2461) data time 0.0011 (0.0017) model time 0.2438 (0.2442) loss 2.8261 (2.7715) grad_norm 6.2428 (4.8061) loss_scale 512.0000 (374.4694) mem 7381MB [2024-09-01 07:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1070/1251] eta 0:00:44 lr 0.000089 wd 0.0500 time 0.4045 (0.2462) data time 0.0009 (0.0017) model time 0.4036 (0.2443) loss 3.0288 (2.7713) grad_norm 5.2852 (4.8126) loss_scale 512.0000 (375.7535) mem 7381MB [2024-09-01 07:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1080/1251] eta 0:00:42 lr 0.000089 wd 0.0500 time 0.2414 (0.2461) data time 0.0010 (0.0017) model time 0.2404 (0.2442) loss 2.6266 (2.7670) grad_norm 5.3011 (4.8072) loss_scale 512.0000 (377.0139) mem 7381MB [2024-09-01 07:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1090/1251] eta 0:00:39 lr 0.000089 wd 0.0500 time 0.2358 (0.2461) data time 0.0009 (0.0016) model time 0.2349 (0.2442) loss 1.8016 (2.7659) grad_norm 3.0596 (4.8029) loss_scale 512.0000 
(378.2511) mem 7381MB [2024-09-01 07:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1100/1251] eta 0:00:37 lr 0.000089 wd 0.0500 time 0.2427 (0.2461) data time 0.0009 (0.0016) model time 0.2419 (0.2442) loss 3.2752 (2.7662) grad_norm 4.5190 (4.8027) loss_scale 512.0000 (379.4659) mem 7381MB [2024-09-01 07:28:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1110/1251] eta 0:00:34 lr 0.000089 wd 0.0500 time 0.2422 (0.2461) data time 0.0007 (0.0016) model time 0.2415 (0.2442) loss 2.8877 (2.7668) grad_norm 3.4749 (4.7939) loss_scale 512.0000 (380.6589) mem 7381MB [2024-09-01 07:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1120/1251] eta 0:00:32 lr 0.000089 wd 0.0500 time 0.2412 (0.2460) data time 0.0011 (0.0016) model time 0.2400 (0.2441) loss 2.8592 (2.7679) grad_norm 3.9009 (4.7884) loss_scale 512.0000 (381.8305) mem 7381MB [2024-09-01 07:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1130/1251] eta 0:00:29 lr 0.000089 wd 0.0500 time 0.2457 (0.2460) data time 0.0007 (0.0016) model time 0.2449 (0.2441) loss 2.4354 (2.7682) grad_norm 5.5867 (4.7888) loss_scale 512.0000 (382.9814) mem 7381MB [2024-09-01 07:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1140/1251] eta 0:00:27 lr 0.000089 wd 0.0500 time 0.2454 (0.2460) data time 0.0008 (0.0016) model time 0.2447 (0.2441) loss 2.6192 (2.7676) grad_norm 5.1067 (4.7984) loss_scale 512.0000 (384.1122) mem 7381MB [2024-09-01 07:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1150/1251] eta 0:00:24 lr 0.000089 wd 0.0500 time 0.2386 (0.2459) data time 0.0010 (0.0016) model time 0.2376 (0.2441) loss 2.5863 (2.7673) grad_norm 3.5396 (4.7989) loss_scale 512.0000 (385.2233) mem 7381MB [2024-09-01 07:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1160/1251] eta 0:00:22 lr 0.000089 wd 0.0500 time 0.2411 (0.2459) data time 0.0009 (0.0016) 
model time 0.2402 (0.2440) loss 3.1441 (2.7680) grad_norm 4.4876 (4.7976) loss_scale 512.0000 (386.3152) mem 7381MB [2024-09-01 07:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1170/1251] eta 0:00:19 lr 0.000089 wd 0.0500 time 0.2345 (0.2458) data time 0.0010 (0.0016) model time 0.2334 (0.2440) loss 3.2845 (2.7688) grad_norm 2.9785 (4.7923) loss_scale 512.0000 (387.3886) mem 7381MB [2024-09-01 07:29:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1180/1251] eta 0:00:17 lr 0.000089 wd 0.0500 time 0.2458 (0.2458) data time 0.0009 (0.0016) model time 0.2449 (0.2440) loss 2.8154 (2.7693) grad_norm 5.3555 (4.8060) loss_scale 512.0000 (388.4437) mem 7381MB [2024-09-01 07:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1190/1251] eta 0:00:14 lr 0.000089 wd 0.0500 time 0.2426 (0.2458) data time 0.0010 (0.0016) model time 0.2416 (0.2439) loss 2.8388 (2.7699) grad_norm 3.6999 (4.8060) loss_scale 512.0000 (389.4811) mem 7381MB [2024-09-01 07:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1200/1251] eta 0:00:12 lr 0.000089 wd 0.0500 time 0.2381 (0.2457) data time 0.0012 (0.0016) model time 0.2369 (0.2439) loss 2.6886 (2.7708) grad_norm 5.1860 (4.8027) loss_scale 512.0000 (390.5012) mem 7381MB [2024-09-01 07:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1210/1251] eta 0:00:10 lr 0.000089 wd 0.0500 time 0.2532 (0.2457) data time 0.0009 (0.0016) model time 0.2522 (0.2439) loss 3.0581 (2.7721) grad_norm 4.4633 (4.8050) loss_scale 512.0000 (391.5045) mem 7381MB [2024-09-01 07:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1220/1251] eta 0:00:07 lr 0.000089 wd 0.0500 time 0.2346 (0.2457) data time 0.0009 (0.0016) model time 0.2337 (0.2439) loss 2.1995 (2.7726) grad_norm 3.9196 (4.8014) loss_scale 512.0000 (392.4914) mem 7381MB [2024-09-01 07:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[248/300][1230/1251] eta 0:00:05 lr 0.000089 wd 0.0500 time 0.2527 (0.2456) data time 0.0009 (0.0016) model time 0.2518 (0.2438) loss 3.0960 (2.7716) grad_norm 4.6848 (inf) loss_scale 256.0000 (391.3826) mem 7381MB [2024-09-01 07:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1240/1251] eta 0:00:02 lr 0.000089 wd 0.0500 time 0.2269 (0.2455) data time 0.0007 (0.0016) model time 0.2262 (0.2437) loss 2.8822 (2.7732) grad_norm 4.1868 (inf) loss_scale 256.0000 (390.2917) mem 7381MB [2024-09-01 07:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [248/300][1250/1251] eta 0:00:00 lr 0.000089 wd 0.0500 time 0.2281 (0.2454) data time 0.0005 (0.0016) model time 0.2277 (0.2436) loss 2.7387 (2.7743) grad_norm 6.1671 (inf) loss_scale 256.0000 (389.2182) mem 7381MB [2024-09-01 07:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 248 training takes 0:05:06 [2024-09-01 07:29:21 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 07:29:21 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 07:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.417 (0.417) Loss 0.4158 (0.4158) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 07:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.117) Loss 0.5850 (0.6279) Acc@1 90.332 (87.358) Acc@5 98.145 (97.692) Mem 7381MB [2024-09-01 07:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.099) Loss 0.9297 (0.6600) Acc@1 77.441 (86.091) Acc@5 95.508 (97.577) Mem 7381MB [2024-09-01 07:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.094) Loss 1.1465 (0.7495) Acc@1 73.828 (83.934) Acc@5 92.480 (96.610) Mem 7381MB [2024-09-01 07:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.087) Loss 1.0381 (0.7983) Acc@1 75.488 (82.727) Acc@5 94.238 (96.122) Mem 7381MB [2024-09-01 07:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.302 Acc@5 96.082 [2024-09-01 07:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.3% [2024-09-01 07:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.30% [2024-09-01 07:29:25 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 07:29:26 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 07:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.470 (0.470) Loss 0.3826 (0.3826) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 07:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.113) Loss 0.5669 (0.5994) Acc@1 90.039 (87.607) Acc@5 98.047 (97.834) Mem 7381MB [2024-09-01 07:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.096) Loss 0.8921 (0.6303) Acc@1 77.832 (86.440) Acc@5 96.289 (97.745) Mem 7381MB [2024-09-01 07:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.090) Loss 1.0996 (0.7168) Acc@1 74.121 (84.369) Acc@5 93.262 (96.881) Mem 7381MB [2024-09-01 07:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.084) Loss 0.9922 (0.7623) Acc@1 76.758 (83.258) Acc@5 94.336 (96.377) Mem 7381MB [2024-09-01 07:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.910 Acc@5 96.330 [2024-09-01 07:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.91% [2024-09-01 07:29:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 07:29:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 07:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][0/1251] eta 0:15:05 lr 0.000089 wd 0.0500 time 0.7240 (0.7240) data time 0.5034 (0.5034) model time 0.0000 (0.0000) loss 3.0161 (3.0161) grad_norm 5.5860 (5.5860) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][10/1251] eta 0:05:54 lr 0.000089 wd 0.0500 time 0.2459 (0.2857) data time 0.0009 (0.0468) model time 0.0000 (0.0000) loss 3.0869 (2.6149) grad_norm 3.3289 (4.3636) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][20/1251] eta 0:05:26 lr 0.000089 wd 0.0500 time 0.2512 (0.2649) data time 0.0007 (0.0249) model time 0.0000 (0.0000) loss 2.6570 (2.6713) grad_norm 3.9277 (4.2268) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][30/1251] eta 0:05:15 lr 0.000089 wd 0.0500 time 0.2457 (0.2583) data time 0.0010 (0.0172) model time 0.0000 (0.0000) loss 3.2078 (2.7053) grad_norm 5.3404 (4.2934) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][40/1251] eta 0:05:08 lr 0.000089 wd 0.0500 time 0.2375 (0.2544) data time 0.0009 (0.0133) model time 0.0000 (0.0000) loss 3.1480 (2.7485) grad_norm 3.5998 (4.2391) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][50/1251] eta 0:05:02 lr 0.000089 wd 0.0500 time 0.2357 (0.2516) data time 0.0007 (0.0109) model time 0.0000 (0.0000) loss 2.3812 (2.7983) grad_norm 3.5866 (4.2827) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][60/1251] eta 0:04:57 lr 0.000089 wd 0.0500 time 0.2384 (0.2498) data time 0.0009 (0.0092) model time 0.2375 (0.2397) loss 
2.5254 (2.8021) grad_norm 4.1509 (4.2402) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][70/1251] eta 0:04:58 lr 0.000089 wd 0.0500 time 0.4119 (0.2530) data time 0.0008 (0.0081) model time 0.4112 (0.2556) loss 3.2073 (2.7908) grad_norm 4.0465 (4.2910) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][80/1251] eta 0:04:56 lr 0.000089 wd 0.0500 time 0.2350 (0.2532) data time 0.0007 (0.0073) model time 0.2343 (0.2550) loss 2.1201 (2.7692) grad_norm 3.6494 (4.3983) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][90/1251] eta 0:04:52 lr 0.000089 wd 0.0500 time 0.2489 (0.2519) data time 0.0009 (0.0066) model time 0.2480 (0.2513) loss 2.3141 (2.7548) grad_norm 8.6490 (4.5250) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][100/1251] eta 0:04:48 lr 0.000089 wd 0.0500 time 0.2433 (0.2510) data time 0.0009 (0.0060) model time 0.2425 (0.2495) loss 3.0161 (2.7742) grad_norm 5.8450 (4.5394) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][110/1251] eta 0:04:45 lr 0.000089 wd 0.0500 time 0.2422 (0.2502) data time 0.0007 (0.0055) model time 0.2415 (0.2480) loss 2.5600 (2.7731) grad_norm 4.3259 (4.5425) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][120/1251] eta 0:04:42 lr 0.000089 wd 0.0500 time 0.2354 (0.2494) data time 0.0012 (0.0052) model time 0.2343 (0.2468) loss 3.0513 (2.7629) grad_norm 6.6850 (4.5987) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][130/1251] eta 0:04:38 lr 0.000089 wd 
0.0500 time 0.2415 (0.2488) data time 0.0012 (0.0049) model time 0.2402 (0.2460) loss 2.6097 (2.7862) grad_norm 4.5792 (4.7066) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][140/1251] eta 0:04:36 lr 0.000089 wd 0.0500 time 0.2468 (0.2485) data time 0.0010 (0.0046) model time 0.2457 (0.2457) loss 2.7908 (2.7662) grad_norm 2.7364 (4.6981) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][150/1251] eta 0:04:32 lr 0.000088 wd 0.0500 time 0.2313 (0.2480) data time 0.0008 (0.0043) model time 0.2305 (0.2451) loss 2.6597 (2.7748) grad_norm 4.7843 (4.6621) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][160/1251] eta 0:04:29 lr 0.000088 wd 0.0500 time 0.2400 (0.2475) data time 0.0008 (0.0041) model time 0.2391 (0.2446) loss 2.9613 (2.7843) grad_norm 3.2217 (4.6509) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][170/1251] eta 0:04:27 lr 0.000088 wd 0.0500 time 0.2385 (0.2472) data time 0.0007 (0.0040) model time 0.2378 (0.2443) loss 2.7411 (2.7783) grad_norm 6.1842 (4.6255) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][180/1251] eta 0:04:24 lr 0.000088 wd 0.0500 time 0.2469 (0.2468) data time 0.0009 (0.0038) model time 0.2461 (0.2438) loss 2.6781 (2.7739) grad_norm 4.5495 (4.6020) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][190/1251] eta 0:04:21 lr 0.000088 wd 0.0500 time 0.2403 (0.2464) data time 0.0008 (0.0036) model time 0.2395 (0.2435) loss 2.9523 (2.7750) grad_norm 5.3665 (4.6618) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [249/300][200/1251] eta 0:04:18 lr 0.000088 wd 0.0500 time 0.2385 (0.2461) data time 0.0008 (0.0035) model time 0.2377 (0.2432) loss 3.2376 (2.7705) grad_norm 4.5477 (4.6274) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][210/1251] eta 0:04:15 lr 0.000088 wd 0.0500 time 0.2346 (0.2457) data time 0.0011 (0.0034) model time 0.2335 (0.2428) loss 3.0257 (2.7755) grad_norm 5.0894 (4.6050) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][220/1251] eta 0:04:13 lr 0.000088 wd 0.0500 time 0.2402 (0.2454) data time 0.0007 (0.0033) model time 0.2395 (0.2425) loss 3.1105 (2.7819) grad_norm 8.2233 (4.6528) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][230/1251] eta 0:04:10 lr 0.000088 wd 0.0500 time 0.2342 (0.2450) data time 0.0007 (0.0032) model time 0.2335 (0.2421) loss 3.4229 (2.7857) grad_norm 4.7031 (4.6655) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][240/1251] eta 0:04:07 lr 0.000088 wd 0.0500 time 0.2324 (0.2447) data time 0.0010 (0.0031) model time 0.2314 (0.2418) loss 2.9885 (2.7874) grad_norm 3.1154 (4.6447) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][250/1251] eta 0:04:05 lr 0.000088 wd 0.0500 time 0.2425 (0.2449) data time 0.0011 (0.0030) model time 0.2414 (0.2422) loss 2.6519 (2.7855) grad_norm 3.5336 (4.6153) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][260/1251] eta 0:04:02 lr 0.000088 wd 0.0500 time 0.2407 (0.2446) data time 0.0009 (0.0029) model time 0.2398 (0.2419) loss 2.2352 (2.7736) grad_norm 4.6025 
(4.6273) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][270/1251] eta 0:03:59 lr 0.000088 wd 0.0500 time 0.2378 (0.2443) data time 0.0011 (0.0029) model time 0.2368 (0.2416) loss 2.7226 (2.7675) grad_norm 3.5538 (4.6206) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][280/1251] eta 0:03:56 lr 0.000088 wd 0.0500 time 0.2287 (0.2440) data time 0.0011 (0.0028) model time 0.2276 (0.2413) loss 2.7626 (2.7693) grad_norm 3.8398 (4.6554) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][290/1251] eta 0:03:54 lr 0.000088 wd 0.0500 time 0.2360 (0.2438) data time 0.0011 (0.0027) model time 0.2349 (0.2412) loss 2.4771 (2.7725) grad_norm 4.7124 (4.6595) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][300/1251] eta 0:04:00 lr 0.000088 wd 0.0500 time 1.0774 (0.2532) data time 0.0008 (0.0027) model time 1.0766 (0.2525) loss 3.1250 (2.7794) grad_norm 6.4690 (4.6680) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][310/1251] eta 0:04:00 lr 0.000088 wd 0.0500 time 0.2393 (0.2556) data time 0.0010 (0.0026) model time 0.2383 (0.2554) loss 2.3989 (2.7713) grad_norm 4.9875 (inf) loss_scale 128.0000 (254.3537) mem 7381MB [2024-09-01 07:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][320/1251] eta 0:03:57 lr 0.000088 wd 0.0500 time 0.2364 (0.2552) data time 0.0010 (0.0026) model time 0.2354 (0.2549) loss 2.5848 (2.7732) grad_norm 5.1623 (inf) loss_scale 128.0000 (250.4174) mem 7381MB [2024-09-01 07:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][330/1251] eta 0:03:54 lr 0.000088 wd 0.0500 time 0.2285 (0.2547) data time 
0.0010 (0.0025) model time 0.2274 (0.2543) loss 2.9896 (2.7748) grad_norm 4.3453 (inf) loss_scale 128.0000 (246.7190) mem 7381MB [2024-09-01 07:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][340/1251] eta 0:03:51 lr 0.000088 wd 0.0500 time 0.2328 (0.2542) data time 0.0008 (0.0025) model time 0.2320 (0.2537) loss 2.1662 (2.7758) grad_norm 3.3080 (inf) loss_scale 128.0000 (243.2375) mem 7381MB [2024-09-01 07:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][350/1251] eta 0:03:48 lr 0.000088 wd 0.0500 time 0.2303 (0.2537) data time 0.0013 (0.0025) model time 0.2290 (0.2531) loss 2.2560 (2.7679) grad_norm 2.7979 (inf) loss_scale 128.0000 (239.9544) mem 7381MB [2024-09-01 07:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][360/1251] eta 0:03:45 lr 0.000088 wd 0.0500 time 0.2355 (0.2532) data time 0.0010 (0.0024) model time 0.2344 (0.2524) loss 3.3372 (2.7688) grad_norm 2.8977 (inf) loss_scale 128.0000 (236.8532) mem 7381MB [2024-09-01 07:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][370/1251] eta 0:03:42 lr 0.000088 wd 0.0500 time 0.2326 (0.2529) data time 0.0008 (0.0024) model time 0.2318 (0.2520) loss 3.4116 (2.7683) grad_norm 4.0937 (inf) loss_scale 128.0000 (233.9191) mem 7381MB [2024-09-01 07:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][380/1251] eta 0:03:39 lr 0.000088 wd 0.0500 time 0.2389 (0.2525) data time 0.0009 (0.0023) model time 0.2380 (0.2516) loss 3.2627 (2.7703) grad_norm 4.7336 (inf) loss_scale 128.0000 (231.1391) mem 7381MB [2024-09-01 07:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][390/1251] eta 0:03:37 lr 0.000088 wd 0.0500 time 0.2450 (0.2522) data time 0.0007 (0.0023) model time 0.2442 (0.2512) loss 2.4959 (2.7624) grad_norm 6.9329 (inf) loss_scale 128.0000 (228.5013) mem 7381MB [2024-09-01 07:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][400/1251] 
eta 0:03:34 lr 0.000088 wd 0.0500 time 0.2334 (0.2518) data time 0.0009 (0.0023) model time 0.2324 (0.2508) loss 2.0167 (2.7638) grad_norm 4.7029 (inf) loss_scale 128.0000 (225.9950) mem 7381MB [2024-09-01 07:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][410/1251] eta 0:03:31 lr 0.000088 wd 0.0500 time 0.2338 (0.2515) data time 0.0010 (0.0023) model time 0.2328 (0.2504) loss 2.3950 (2.7680) grad_norm 5.3068 (inf) loss_scale 128.0000 (223.6107) mem 7381MB [2024-09-01 07:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][420/1251] eta 0:03:28 lr 0.000088 wd 0.0500 time 0.2365 (0.2511) data time 0.0011 (0.0022) model time 0.2355 (0.2501) loss 2.9218 (2.7653) grad_norm 3.7114 (inf) loss_scale 128.0000 (221.3397) mem 7381MB [2024-09-01 07:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][430/1251] eta 0:03:25 lr 0.000088 wd 0.0500 time 0.2346 (0.2507) data time 0.0009 (0.0022) model time 0.2336 (0.2496) loss 3.4077 (2.7643) grad_norm 3.9324 (inf) loss_scale 128.0000 (219.1740) mem 7381MB [2024-09-01 07:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][440/1251] eta 0:03:23 lr 0.000088 wd 0.0500 time 0.2308 (0.2504) data time 0.0009 (0.0022) model time 0.2298 (0.2492) loss 1.3997 (2.7613) grad_norm 4.3222 (inf) loss_scale 128.0000 (217.1066) mem 7381MB [2024-09-01 07:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][450/1251] eta 0:03:20 lr 0.000088 wd 0.0500 time 0.2331 (0.2501) data time 0.0010 (0.0021) model time 0.2320 (0.2489) loss 2.9458 (2.7597) grad_norm 9.0151 (inf) loss_scale 128.0000 (215.1308) mem 7381MB [2024-09-01 07:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][460/1251] eta 0:03:17 lr 0.000088 wd 0.0500 time 0.2606 (0.2499) data time 0.0008 (0.0021) model time 0.2598 (0.2486) loss 3.0131 (2.7601) grad_norm 3.2731 (inf) loss_scale 128.0000 (213.2408) mem 7381MB [2024-09-01 07:31:28 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][470/1251] eta 0:03:14 lr 0.000088 wd 0.0500 time 0.2498 (0.2497) data time 0.0007 (0.0021) model time 0.2491 (0.2484) loss 2.7625 (2.7602) grad_norm 5.4898 (inf) loss_scale 128.0000 (211.4310) mem 7381MB [2024-09-01 07:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][480/1251] eta 0:03:12 lr 0.000088 wd 0.0500 time 0.2315 (0.2494) data time 0.0011 (0.0021) model time 0.2304 (0.2482) loss 2.1841 (2.7647) grad_norm 2.8444 (inf) loss_scale 128.0000 (209.6965) mem 7381MB [2024-09-01 07:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][490/1251] eta 0:03:09 lr 0.000088 wd 0.0500 time 0.2295 (0.2492) data time 0.0007 (0.0020) model time 0.2288 (0.2479) loss 3.3750 (2.7699) grad_norm 4.0068 (inf) loss_scale 128.0000 (208.0326) mem 7381MB [2024-09-01 07:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][500/1251] eta 0:03:06 lr 0.000088 wd 0.0500 time 0.2378 (0.2489) data time 0.0007 (0.0020) model time 0.2372 (0.2476) loss 3.3315 (2.7746) grad_norm 4.5150 (inf) loss_scale 128.0000 (206.4351) mem 7381MB [2024-09-01 07:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][510/1251] eta 0:03:05 lr 0.000088 wd 0.0500 time 0.7963 (0.2497) data time 0.0010 (0.0020) model time 0.7953 (0.2485) loss 3.0856 (2.7780) grad_norm 4.8533 (inf) loss_scale 128.0000 (204.9002) mem 7381MB [2024-09-01 07:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][520/1251] eta 0:03:03 lr 0.000088 wd 0.0500 time 0.2359 (0.2510) data time 0.0012 (0.0020) model time 0.2347 (0.2499) loss 2.8053 (2.7743) grad_norm 4.6743 (inf) loss_scale 128.0000 (203.4242) mem 7381MB [2024-09-01 07:31:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][530/1251] eta 0:03:00 lr 0.000088 wd 0.0500 time 0.2285 (0.2507) data time 0.0010 (0.0020) model time 0.2275 (0.2496) loss 2.0029 (2.7710) grad_norm 3.8229 
(inf) loss_scale 128.0000 (202.0038) mem 7381MB [2024-09-01 07:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][540/1251] eta 0:02:58 lr 0.000088 wd 0.0500 time 0.2325 (0.2504) data time 0.0008 (0.0019) model time 0.2317 (0.2493) loss 2.9634 (2.7745) grad_norm 3.1133 (inf) loss_scale 128.0000 (200.6359) mem 7381MB [2024-09-01 07:31:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][550/1251] eta 0:02:55 lr 0.000088 wd 0.0500 time 0.2307 (0.2502) data time 0.0011 (0.0019) model time 0.2296 (0.2491) loss 3.1404 (2.7760) grad_norm 3.5736 (inf) loss_scale 128.0000 (199.3176) mem 7381MB [2024-09-01 07:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][560/1251] eta 0:02:52 lr 0.000088 wd 0.0500 time 0.2431 (0.2500) data time 0.0007 (0.0019) model time 0.2424 (0.2488) loss 3.1864 (2.7816) grad_norm 4.9168 (inf) loss_scale 128.0000 (198.0463) mem 7381MB [2024-09-01 07:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][570/1251] eta 0:02:50 lr 0.000087 wd 0.0500 time 0.2351 (0.2498) data time 0.0010 (0.0019) model time 0.2342 (0.2485) loss 3.0528 (2.7848) grad_norm 4.3856 (inf) loss_scale 128.0000 (196.8196) mem 7381MB [2024-09-01 07:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][580/1251] eta 0:02:48 lr 0.000087 wd 0.0500 time 1.4764 (0.2517) data time 0.0007 (0.0019) model time 1.4758 (0.2506) loss 2.5718 (2.7851) grad_norm 3.2785 (inf) loss_scale 128.0000 (195.6351) mem 7381MB [2024-09-01 07:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][590/1251] eta 0:02:47 lr 0.000087 wd 0.0500 time 0.4068 (0.2529) data time 0.0011 (0.0020) model time 0.4057 (0.2519) loss 3.2766 (2.7876) grad_norm 5.1556 (inf) loss_scale 128.0000 (194.4907) mem 7381MB [2024-09-01 07:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][600/1251] eta 0:02:45 lr 0.000087 wd 0.0500 time 0.2372 (0.2543) data time 0.0011 (0.0020) 
model time 0.2361 (0.2534) loss 2.3107 (2.7858) grad_norm 2.7102 (inf) loss_scale 128.0000 (193.3844) mem 7381MB [2024-09-01 07:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][610/1251] eta 0:02:43 lr 0.000087 wd 0.0500 time 0.2417 (0.2544) data time 0.0009 (0.0020) model time 0.2408 (0.2535) loss 2.6836 (2.7886) grad_norm 6.7937 (inf) loss_scale 128.0000 (192.3142) mem 7381MB [2024-09-01 07:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][620/1251] eta 0:02:40 lr 0.000087 wd 0.0500 time 0.2366 (0.2550) data time 0.0008 (0.0019) model time 0.2358 (0.2541) loss 2.2343 (2.7886) grad_norm 4.0033 (inf) loss_scale 128.0000 (191.2786) mem 7381MB [2024-09-01 07:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][630/1251] eta 0:02:38 lr 0.000087 wd 0.0500 time 0.2408 (0.2548) data time 0.0009 (0.0019) model time 0.2399 (0.2539) loss 2.4376 (2.7906) grad_norm 4.5271 (inf) loss_scale 128.0000 (190.2758) mem 7381MB [2024-09-01 07:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][640/1251] eta 0:02:35 lr 0.000087 wd 0.0500 time 0.2342 (0.2545) data time 0.0009 (0.0019) model time 0.2333 (0.2536) loss 2.9621 (2.7904) grad_norm 3.6077 (inf) loss_scale 128.0000 (189.3042) mem 7381MB [2024-09-01 07:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][650/1251] eta 0:02:32 lr 0.000087 wd 0.0500 time 0.2426 (0.2542) data time 0.0010 (0.0019) model time 0.2417 (0.2532) loss 2.5000 (2.7918) grad_norm 4.3955 (inf) loss_scale 128.0000 (188.3625) mem 7381MB [2024-09-01 07:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][660/1251] eta 0:02:30 lr 0.000087 wd 0.0500 time 0.2345 (0.2546) data time 0.0010 (0.0023) model time 0.2335 (0.2533) loss 2.6316 (2.7923) grad_norm 3.1606 (inf) loss_scale 128.0000 (187.4493) mem 7381MB [2024-09-01 07:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][670/1251] eta 0:02:27 lr 
0.000087 wd 0.0500 time 0.2308 (0.2543) data time 0.0008 (0.0023) model time 0.2300 (0.2530) loss 3.5536 (2.7946) grad_norm 4.4392 (inf) loss_scale 128.0000 (186.5633) mem 7381MB [2024-09-01 07:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][680/1251] eta 0:02:25 lr 0.000087 wd 0.0500 time 0.2346 (0.2540) data time 0.0009 (0.0022) model time 0.2336 (0.2526) loss 3.0597 (2.7939) grad_norm 4.3700 (inf) loss_scale 128.0000 (185.7034) mem 7381MB [2024-09-01 07:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][690/1251] eta 0:02:22 lr 0.000087 wd 0.0500 time 0.2328 (0.2538) data time 0.0007 (0.0022) model time 0.2321 (0.2524) loss 3.1111 (2.7971) grad_norm 4.1159 (inf) loss_scale 128.0000 (184.8683) mem 7381MB [2024-09-01 07:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][700/1251] eta 0:02:19 lr 0.000087 wd 0.0500 time 0.2292 (0.2535) data time 0.0010 (0.0022) model time 0.2282 (0.2521) loss 3.2104 (2.7987) grad_norm 3.1415 (inf) loss_scale 128.0000 (184.0571) mem 7381MB [2024-09-01 07:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][710/1251] eta 0:02:17 lr 0.000087 wd 0.0500 time 0.2592 (0.2533) data time 0.0013 (0.0022) model time 0.2579 (0.2520) loss 2.1102 (2.7950) grad_norm 5.1096 (inf) loss_scale 128.0000 (183.2686) mem 7381MB [2024-09-01 07:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][720/1251] eta 0:02:14 lr 0.000087 wd 0.0500 time 0.2448 (0.2532) data time 0.0012 (0.0022) model time 0.2436 (0.2518) loss 2.7491 (2.7955) grad_norm 3.9213 (inf) loss_scale 128.0000 (182.5021) mem 7381MB [2024-09-01 07:32:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][730/1251] eta 0:02:11 lr 0.000087 wd 0.0500 time 0.2379 (0.2529) data time 0.0007 (0.0022) model time 0.2372 (0.2515) loss 2.7108 (2.7971) grad_norm 4.4908 (inf) loss_scale 128.0000 (181.7565) mem 7381MB [2024-09-01 07:32:38 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [249/300][740/1251] eta 0:02:09 lr 0.000087 wd 0.0500 time 0.2397 (0.2528) data time 0.0009 (0.0021) model time 0.2388 (0.2514) loss 3.2347 (2.8023) grad_norm 11.2120 (inf) loss_scale 128.0000 (181.0310) mem 7381MB [2024-09-01 07:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][750/1251] eta 0:02:06 lr 0.000087 wd 0.0500 time 0.2372 (0.2525) data time 0.0007 (0.0021) model time 0.2365 (0.2511) loss 2.2303 (2.8050) grad_norm 3.4458 (inf) loss_scale 128.0000 (180.3249) mem 7381MB [2024-09-01 07:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][760/1251] eta 0:02:05 lr 0.000087 wd 0.0500 time 0.8772 (0.2550) data time 0.0008 (0.0021) model time 0.8764 (0.2538) loss 3.0495 (2.8071) grad_norm 4.5527 (inf) loss_scale 128.0000 (179.6373) mem 7381MB [2024-09-01 07:32:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][770/1251] eta 0:02:02 lr 0.000087 wd 0.0500 time 0.4434 (0.2557) data time 0.0009 (0.0022) model time 0.4425 (0.2544) loss 2.0448 (2.8041) grad_norm 3.9043 (inf) loss_scale 128.0000 (178.9676) mem 7381MB [2024-09-01 07:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][780/1251] eta 0:02:00 lr 0.000087 wd 0.0500 time 0.2352 (0.2554) data time 0.0011 (0.0022) model time 0.2342 (0.2541) loss 3.1315 (2.8035) grad_norm 5.3895 (inf) loss_scale 128.0000 (178.3150) mem 7381MB [2024-09-01 07:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][790/1251] eta 0:01:58 lr 0.000087 wd 0.0500 time 0.2319 (0.2580) data time 0.0009 (0.0026) model time 0.2309 (0.2564) loss 2.3809 (2.8033) grad_norm 3.7108 (inf) loss_scale 128.0000 (177.6789) mem 7381MB [2024-09-01 07:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][800/1251] eta 0:01:56 lr 0.000087 wd 0.0500 time 0.2395 (0.2591) data time 0.0009 (0.0026) model time 0.2386 (0.2575) loss 3.5176 (2.8017) grad_norm 3.9954 (inf) loss_scale 
128.0000 (177.0587) mem 7381MB [2024-09-01 07:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][810/1251] eta 0:01:54 lr 0.000087 wd 0.0500 time 0.2454 (0.2588) data time 0.0007 (0.0026) model time 0.2447 (0.2573) loss 2.9104 (2.8046) grad_norm 5.6060 (inf) loss_scale 128.0000 (176.4538) mem 7381MB [2024-09-01 07:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][820/1251] eta 0:01:52 lr 0.000087 wd 0.0500 time 0.2353 (0.2606) data time 0.0013 (0.0026) model time 0.2340 (0.2592) loss 2.3884 (2.8017) grad_norm 4.4263 (inf) loss_scale 128.0000 (175.8636) mem 7381MB [2024-09-01 07:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][830/1251] eta 0:01:50 lr 0.000087 wd 0.0500 time 0.2363 (0.2622) data time 0.0009 (0.0025) model time 0.2354 (0.2609) loss 3.0153 (2.8004) grad_norm 3.6711 (inf) loss_scale 128.0000 (175.2876) mem 7381MB [2024-09-01 07:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][840/1251] eta 0:01:47 lr 0.000087 wd 0.0500 time 0.2321 (0.2620) data time 0.0007 (0.0025) model time 0.2314 (0.2607) loss 2.6044 (2.8022) grad_norm 3.4369 (inf) loss_scale 128.0000 (174.7253) mem 7381MB [2024-09-01 07:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][850/1251] eta 0:01:44 lr 0.000087 wd 0.0500 time 0.2342 (0.2617) data time 0.0008 (0.0025) model time 0.2334 (0.2604) loss 2.9605 (2.8003) grad_norm 4.0568 (inf) loss_scale 128.0000 (174.1763) mem 7381MB [2024-09-01 07:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][860/1251] eta 0:01:42 lr 0.000087 wd 0.0500 time 0.2493 (0.2614) data time 0.0011 (0.0025) model time 0.2482 (0.2601) loss 2.6923 (2.8007) grad_norm 3.4206 (inf) loss_scale 128.0000 (173.6400) mem 7381MB [2024-09-01 07:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][870/1251] eta 0:01:39 lr 0.000087 wd 0.0500 time 0.2376 (0.2612) data time 0.0007 (0.0025) model time 
0.2369 (0.2598) loss 2.9859 (2.8036) grad_norm 3.9225 (inf) loss_scale 128.0000 (173.1160) mem 7381MB [2024-09-01 07:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][880/1251] eta 0:01:36 lr 0.000087 wd 0.0500 time 0.2452 (0.2609) data time 0.0010 (0.0024) model time 0.2442 (0.2596) loss 3.0125 (2.8031) grad_norm 4.0544 (inf) loss_scale 128.0000 (172.6039) mem 7381MB [2024-09-01 07:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][890/1251] eta 0:01:34 lr 0.000087 wd 0.0500 time 0.2439 (0.2607) data time 0.0007 (0.0024) model time 0.2432 (0.2593) loss 2.3556 (2.8035) grad_norm 3.9748 (inf) loss_scale 128.0000 (172.1033) mem 7381MB [2024-09-01 07:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][900/1251] eta 0:01:31 lr 0.000087 wd 0.0500 time 0.2406 (0.2605) data time 0.0008 (0.0024) model time 0.2398 (0.2591) loss 2.8508 (2.8019) grad_norm 4.8260 (inf) loss_scale 128.0000 (171.6138) mem 7381MB [2024-09-01 07:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][910/1251] eta 0:01:28 lr 0.000087 wd 0.0500 time 0.2399 (0.2602) data time 0.0008 (0.0024) model time 0.2391 (0.2588) loss 2.1404 (2.8007) grad_norm 3.7970 (inf) loss_scale 128.0000 (171.1350) mem 7381MB [2024-09-01 07:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][920/1251] eta 0:01:26 lr 0.000087 wd 0.0500 time 0.2426 (0.2599) data time 0.0009 (0.0024) model time 0.2417 (0.2586) loss 2.9093 (2.7951) grad_norm 3.4808 (inf) loss_scale 128.0000 (170.6667) mem 7381MB [2024-09-01 07:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][930/1251] eta 0:01:23 lr 0.000087 wd 0.0500 time 0.2410 (0.2597) data time 0.0011 (0.0024) model time 0.2400 (0.2583) loss 2.6738 (2.7960) grad_norm 3.9168 (inf) loss_scale 128.0000 (170.2084) mem 7381MB [2024-09-01 07:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][940/1251] eta 0:01:20 lr 0.000087 wd 
0.0500 time 0.2424 (0.2594) data time 0.0009 (0.0024) model time 0.2415 (0.2580) loss 1.8791 (2.7954) grad_norm 5.1063 (inf) loss_scale 128.0000 (169.7598) mem 7381MB [2024-09-01 07:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][950/1251] eta 0:01:18 lr 0.000087 wd 0.0500 time 0.2308 (0.2592) data time 0.0007 (0.0023) model time 0.2301 (0.2578) loss 2.7871 (2.7959) grad_norm 3.2860 (inf) loss_scale 128.0000 (169.3207) mem 7381MB [2024-09-01 07:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][960/1251] eta 0:01:15 lr 0.000087 wd 0.0500 time 0.2330 (0.2590) data time 0.0008 (0.0023) model time 0.2322 (0.2576) loss 3.1218 (2.7953) grad_norm 4.9067 (inf) loss_scale 128.0000 (168.8907) mem 7381MB [2024-09-01 07:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][970/1251] eta 0:01:12 lr 0.000087 wd 0.0500 time 0.2463 (0.2588) data time 0.0009 (0.0023) model time 0.2454 (0.2573) loss 3.0807 (2.7944) grad_norm 3.4761 (inf) loss_scale 128.0000 (168.4696) mem 7381MB [2024-09-01 07:33:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][980/1251] eta 0:01:10 lr 0.000087 wd 0.0500 time 0.2348 (0.2586) data time 0.0010 (0.0023) model time 0.2339 (0.2571) loss 2.9191 (2.7954) grad_norm 5.8093 (inf) loss_scale 128.0000 (168.0571) mem 7381MB [2024-09-01 07:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][990/1251] eta 0:01:07 lr 0.000086 wd 0.0500 time 0.2430 (0.2584) data time 0.0007 (0.0023) model time 0.2423 (0.2569) loss 3.5740 (2.7955) grad_norm 9.3664 (inf) loss_scale 128.0000 (167.6529) mem 7381MB [2024-09-01 07:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1000/1251] eta 0:01:04 lr 0.000086 wd 0.0500 time 0.2360 (0.2581) data time 0.0008 (0.0023) model time 0.2352 (0.2567) loss 1.9451 (2.7932) grad_norm 5.6085 (inf) loss_scale 128.0000 (167.2567) mem 7381MB [2024-09-01 07:33:52 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [249/300][1010/1251] eta 0:01:02 lr 0.000086 wd 0.0500 time 0.2357 (0.2579) data time 0.0008 (0.0023) model time 0.2348 (0.2565) loss 3.2910 (2.7950) grad_norm 4.3865 (inf) loss_scale 128.0000 (166.8684) mem 7381MB [2024-09-01 07:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1020/1251] eta 0:00:59 lr 0.000086 wd 0.0500 time 0.2349 (0.2577) data time 0.0008 (0.0022) model time 0.2341 (0.2562) loss 1.8801 (2.7952) grad_norm 3.8279 (inf) loss_scale 128.0000 (166.4878) mem 7381MB [2024-09-01 07:33:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1030/1251] eta 0:00:56 lr 0.000086 wd 0.0500 time 0.2394 (0.2575) data time 0.0009 (0.0022) model time 0.2385 (0.2561) loss 3.2080 (2.7966) grad_norm 4.8162 (inf) loss_scale 128.0000 (166.1145) mem 7381MB [2024-09-01 07:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1040/1251] eta 0:00:54 lr 0.000086 wd 0.0500 time 0.2273 (0.2573) data time 0.0008 (0.0022) model time 0.2265 (0.2558) loss 3.1443 (2.7959) grad_norm 3.1847 (inf) loss_scale 128.0000 (165.7483) mem 7381MB [2024-09-01 07:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1050/1251] eta 0:00:51 lr 0.000086 wd 0.0500 time 0.2310 (0.2571) data time 0.0010 (0.0022) model time 0.2300 (0.2556) loss 1.8561 (2.7951) grad_norm 5.6748 (inf) loss_scale 128.0000 (165.3892) mem 7381MB [2024-09-01 07:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1060/1251] eta 0:00:49 lr 0.000086 wd 0.0500 time 0.2330 (0.2570) data time 0.0007 (0.0022) model time 0.2323 (0.2555) loss 2.6934 (2.7933) grad_norm 3.3259 (inf) loss_scale 128.0000 (165.0368) mem 7381MB [2024-09-01 07:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1070/1251] eta 0:00:46 lr 0.000086 wd 0.0500 time 0.2550 (0.2568) data time 0.0013 (0.0022) model time 0.2536 (0.2553) loss 2.9466 (2.7925) grad_norm 4.8045 (inf) loss_scale 
128.0000 (164.6909) mem 7381MB [2024-09-01 07:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1080/1251] eta 0:00:43 lr 0.000086 wd 0.0500 time 0.2277 (0.2566) data time 0.0009 (0.0022) model time 0.2267 (0.2551) loss 2.5654 (2.7922) grad_norm 8.2410 (inf) loss_scale 128.0000 (164.3515) mem 7381MB [2024-09-01 07:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1090/1251] eta 0:00:41 lr 0.000086 wd 0.0500 time 0.2272 (0.2564) data time 0.0007 (0.0022) model time 0.2264 (0.2549) loss 1.7727 (2.7924) grad_norm 4.1979 (inf) loss_scale 128.0000 (164.0183) mem 7381MB [2024-09-01 07:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1100/1251] eta 0:00:38 lr 0.000086 wd 0.0500 time 0.2412 (0.2564) data time 0.0008 (0.0022) model time 0.2405 (0.2549) loss 2.8519 (2.7923) grad_norm 10.4235 (inf) loss_scale 128.0000 (163.6912) mem 7381MB [2024-09-01 07:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1110/1251] eta 0:00:36 lr 0.000086 wd 0.0500 time 0.2388 (0.2562) data time 0.0011 (0.0022) model time 0.2377 (0.2547) loss 1.4777 (2.7884) grad_norm 3.5662 (inf) loss_scale 128.0000 (163.3699) mem 7381MB [2024-09-01 07:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1120/1251] eta 0:00:33 lr 0.000086 wd 0.0500 time 0.2339 (0.2561) data time 0.0008 (0.0021) model time 0.2330 (0.2546) loss 3.4539 (2.7898) grad_norm 3.8025 (inf) loss_scale 128.0000 (163.0544) mem 7381MB [2024-09-01 07:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1130/1251] eta 0:00:30 lr 0.000086 wd 0.0500 time 0.2385 (0.2559) data time 0.0012 (0.0021) model time 0.2373 (0.2544) loss 3.2959 (2.7907) grad_norm 4.7648 (inf) loss_scale 128.0000 (162.7445) mem 7381MB [2024-09-01 07:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1140/1251] eta 0:00:28 lr 0.000086 wd 0.0500 time 0.2346 (0.2557) data time 0.0010 (0.0021) model 
time 0.2337 (0.2542) loss 2.7459 (2.7920) grad_norm 4.3607 (inf) loss_scale 128.0000 (162.4400) mem 7381MB [2024-09-01 07:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1150/1251] eta 0:00:25 lr 0.000086 wd 0.0500 time 0.2417 (0.2556) data time 0.0007 (0.0021) model time 0.2410 (0.2541) loss 1.8645 (2.7920) grad_norm 4.5695 (inf) loss_scale 128.0000 (162.1407) mem 7381MB [2024-09-01 07:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1160/1251] eta 0:00:23 lr 0.000086 wd 0.0500 time 0.5304 (0.2571) data time 0.0010 (0.0021) model time 0.5294 (0.2556) loss 3.1180 (2.7921) grad_norm 7.5989 (inf) loss_scale 128.0000 (161.8467) mem 7381MB [2024-09-01 07:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1170/1251] eta 0:00:20 lr 0.000086 wd 0.0500 time 0.2405 (0.2569) data time 0.0006 (0.0021) model time 0.2399 (0.2555) loss 3.2637 (2.7919) grad_norm 3.5769 (inf) loss_scale 128.0000 (161.5576) mem 7381MB [2024-09-01 07:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1180/1251] eta 0:00:18 lr 0.000086 wd 0.0500 time 0.2347 (0.2568) data time 0.0009 (0.0021) model time 0.2339 (0.2553) loss 3.1065 (2.7923) grad_norm 5.0121 (inf) loss_scale 128.0000 (161.2735) mem 7381MB [2024-09-01 07:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1190/1251] eta 0:00:15 lr 0.000086 wd 0.0500 time 0.2427 (0.2567) data time 0.0009 (0.0021) model time 0.2418 (0.2552) loss 3.4656 (2.7944) grad_norm 80.7217 (inf) loss_scale 128.0000 (160.9941) mem 7381MB [2024-09-01 07:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1200/1251] eta 0:00:13 lr 0.000086 wd 0.0500 time 0.2358 (0.2565) data time 0.0010 (0.0021) model time 0.2348 (0.2550) loss 2.1516 (2.7923) grad_norm 3.7341 (inf) loss_scale 128.0000 (160.7194) mem 7381MB [2024-09-01 07:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1210/1251] eta 0:00:10 
lr 0.000086 wd 0.0500 time 0.2362 (0.2564) data time 0.0008 (0.0021) model time 0.2355 (0.2549) loss 3.8014 (2.7936) grad_norm 5.7743 (inf) loss_scale 128.0000 (160.4492) mem 7381MB [2024-09-01 07:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1220/1251] eta 0:00:07 lr 0.000086 wd 0.0500 time 0.2333 (0.2562) data time 0.0007 (0.0021) model time 0.2326 (0.2548) loss 1.7446 (2.7928) grad_norm 4.2452 (inf) loss_scale 128.0000 (160.1835) mem 7381MB [2024-09-01 07:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1230/1251] eta 0:00:05 lr 0.000086 wd 0.0500 time 0.2364 (0.2561) data time 0.0008 (0.0021) model time 0.2357 (0.2546) loss 3.0092 (2.7932) grad_norm 3.8521 (inf) loss_scale 128.0000 (159.9220) mem 7381MB [2024-09-01 07:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1240/1251] eta 0:00:02 lr 0.000086 wd 0.0500 time 0.2249 (0.2559) data time 0.0007 (0.0021) model time 0.2242 (0.2544) loss 3.4241 (2.7955) grad_norm 2.8554 (inf) loss_scale 128.0000 (159.6648) mem 7381MB [2024-09-01 07:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [249/300][1250/1251] eta 0:00:00 lr 0.000086 wd 0.0500 time 0.2247 (0.2557) data time 0.0005 (0.0020) model time 0.2243 (0.2542) loss 3.2307 (2.7955) grad_norm 4.7702 (inf) loss_scale 128.0000 (159.4117) mem 7381MB [2024-09-01 07:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 249 training takes 0:05:19 [2024-09-01 07:34:51 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 07:34:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 07:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 2.121 (2.121) Loss 0.3867 (0.3867) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 07:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.066 (0.335) Loss 0.6265 (0.6247) Acc@1 88.770 (87.154) Acc@5 97.461 (97.665) Mem 7381MB [2024-09-01 07:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.219) Loss 0.9468 (0.6575) Acc@1 77.734 (85.942) Acc@5 95.605 (97.605) Mem 7381MB [2024-09-01 07:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.185) Loss 1.1709 (0.7487) Acc@1 73.926 (83.827) Acc@5 92.285 (96.673) Mem 7381MB [2024-09-01 07:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.156) Loss 1.0332 (0.7994) Acc@1 75.488 (82.615) Acc@5 94.434 (96.110) Mem 7381MB [2024-09-01 07:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.226 Acc@5 96.074 [2024-09-01 07:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.2% [2024-09-01 07:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.999 (0.999) Loss 0.3828 (0.3828) Acc@1 93.359 (93.359) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 07:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.162) Loss 0.5679 (0.5997) Acc@1 89.941 (87.624) Acc@5 98.145 (97.869) Mem 7381MB [2024-09-01 07:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.122) Loss 0.8911 (0.6305) Acc@1 78.027 (86.468) Acc@5 96.191 (97.763) Mem 7381MB [2024-09-01 07:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.087 (0.108) Loss 1.1006 (0.7171) Acc@1 74.121 (84.381) Acc@5 93.262 (96.900) Mem 7381MB [2024-09-01 07:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.098) Loss 0.9932 
(0.7628) Acc@1 76.660 (83.248) Acc@5 94.336 (96.396) Mem 7381MB [2024-09-01 07:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.910 Acc@5 96.348 [2024-09-01 07:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][0/1251] eta 0:26:32 lr 0.000086 wd 0.0500 time 1.2727 (1.2727) data time 0.7074 (0.7074) model time 0.0000 (0.0000) loss 2.4326 (2.4326) grad_norm 3.5133 (3.5133) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][10/1251] eta 0:06:51 lr 0.000086 wd 0.0500 time 0.2378 (0.3312) data time 0.0010 (0.0653) model time 0.0000 (0.0000) loss 3.4796 (2.6254) grad_norm 3.6994 (4.0606) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][20/1251] eta 0:05:54 lr 0.000086 wd 0.0500 time 0.2407 (0.2877) data time 0.0011 (0.0347) model time 0.0000 (0.0000) loss 3.1536 (2.6592) grad_norm 3.9372 (4.1687) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][30/1251] eta 0:05:31 lr 0.000086 wd 0.0500 time 0.2403 (0.2712) data time 0.0007 (0.0238) model time 0.0000 (0.0000) loss 3.0390 (2.6297) grad_norm 5.1291 (4.3917) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][40/1251] eta 0:05:20 lr 0.000086 wd 0.0500 time 0.2379 (0.2646) data time 0.0008 (0.0183) model time 0.0000 (0.0000) loss 2.3764 (2.5787) grad_norm 4.5318 (4.4093) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][50/1251] eta 0:05:11 lr 0.000086 wd 0.0500 time 0.2467 (0.2595) data time 0.0009 (0.0149) model time 0.0000 (0.0000) loss 2.1147 
(2.6143) grad_norm 3.7193 (5.1985) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][60/1251] eta 0:05:04 lr 0.000086 wd 0.0500 time 0.2353 (0.2558) data time 0.0010 (0.0126) model time 0.2344 (0.2361) loss 3.0835 (2.6494) grad_norm 3.1012 (5.1436) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][70/1251] eta 0:04:59 lr 0.000086 wd 0.0500 time 0.2414 (0.2534) data time 0.0007 (0.0110) model time 0.2407 (0.2367) loss 3.3107 (2.6704) grad_norm 3.2403 (5.0046) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][80/1251] eta 0:04:54 lr 0.000086 wd 0.0500 time 0.2373 (0.2517) data time 0.0009 (0.0098) model time 0.2364 (0.2375) loss 2.5928 (2.7075) grad_norm 5.2019 (4.9647) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][90/1251] eta 0:04:50 lr 0.000086 wd 0.0500 time 0.2419 (0.2505) data time 0.0008 (0.0088) model time 0.2411 (0.2381) loss 2.2001 (2.6925) grad_norm 4.4128 (4.9588) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][100/1251] eta 0:04:47 lr 0.000086 wd 0.0500 time 0.2419 (0.2494) data time 0.0008 (0.0080) model time 0.2411 (0.2381) loss 2.7364 (2.7031) grad_norm 5.9260 (4.9364) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][110/1251] eta 0:04:43 lr 0.000086 wd 0.0500 time 0.2369 (0.2487) data time 0.0009 (0.0074) model time 0.2360 (0.2385) loss 1.9856 (2.7021) grad_norm 3.5060 (4.8724) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][120/1251] eta 0:04:40 lr 0.000086 wd 0.0500 
time 0.2473 (0.2481) data time 0.0010 (0.0068) model time 0.2463 (0.2388) loss 2.1515 (2.7067) grad_norm 4.7346 (4.8608) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][130/1251] eta 0:04:37 lr 0.000086 wd 0.0500 time 0.2369 (0.2471) data time 0.0011 (0.0064) model time 0.2358 (0.2383) loss 2.3353 (2.7189) grad_norm 4.7443 (4.8605) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][140/1251] eta 0:04:34 lr 0.000086 wd 0.0500 time 0.2449 (0.2468) data time 0.0009 (0.0060) model time 0.2439 (0.2386) loss 2.7175 (2.7266) grad_norm 4.3183 (5.1241) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][150/1251] eta 0:04:30 lr 0.000086 wd 0.0500 time 0.2297 (0.2461) data time 0.0008 (0.0057) model time 0.2289 (0.2382) loss 3.0180 (2.7231) grad_norm 3.9206 (5.1625) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][160/1251] eta 0:04:30 lr 0.000085 wd 0.0500 time 0.2314 (0.2481) data time 0.0009 (0.0054) model time 0.2306 (0.2418) loss 2.7279 (2.7249) grad_norm 4.1616 (5.2194) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][170/1251] eta 0:04:27 lr 0.000085 wd 0.0500 time 0.2381 (0.2474) data time 0.0011 (0.0051) model time 0.2370 (0.2412) loss 3.3183 (2.7339) grad_norm 4.3552 (5.2017) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][180/1251] eta 0:04:24 lr 0.000085 wd 0.0500 time 0.2398 (0.2468) data time 0.0009 (0.0049) model time 0.2389 (0.2409) loss 3.4669 (2.7313) grad_norm 3.4948 (5.1542) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [250/300][190/1251] eta 0:04:22 lr 0.000085 wd 0.0500 time 0.2379 (0.2476) data time 0.0009 (0.0047) model time 0.2369 (0.2423) loss 3.0423 (2.7337) grad_norm 5.5137 (5.1253) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][200/1251] eta 0:04:24 lr 0.000085 wd 0.0500 time 0.2440 (0.2520) data time 0.0011 (0.0045) model time 0.2428 (0.2484) loss 2.1973 (2.7380) grad_norm 3.9806 (5.0814) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][210/1251] eta 0:04:21 lr 0.000085 wd 0.0500 time 0.2436 (0.2514) data time 0.0008 (0.0044) model time 0.2428 (0.2479) loss 2.6106 (2.7418) grad_norm 7.0462 (5.0280) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][220/1251] eta 0:04:18 lr 0.000085 wd 0.0500 time 0.2427 (0.2510) data time 0.0010 (0.0042) model time 0.2417 (0.2475) loss 2.5274 (2.7470) grad_norm 3.4657 (5.0028) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][230/1251] eta 0:04:15 lr 0.000085 wd 0.0500 time 0.2407 (0.2504) data time 0.0009 (0.0041) model time 0.2399 (0.2469) loss 3.1044 (2.7472) grad_norm 4.3929 (4.9733) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][240/1251] eta 0:04:12 lr 0.000085 wd 0.0500 time 0.2385 (0.2498) data time 0.0008 (0.0039) model time 0.2377 (0.2462) loss 2.8809 (2.7548) grad_norm 4.7460 (4.9402) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][250/1251] eta 0:04:09 lr 0.000085 wd 0.0500 time 0.2467 (0.2493) data time 0.0007 (0.0038) model time 0.2460 (0.2458) loss 3.4985 (2.7624) grad_norm 5.4542 
(4.9263) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][260/1251] eta 0:04:06 lr 0.000085 wd 0.0500 time 0.2324 (0.2489) data time 0.0009 (0.0037) model time 0.2315 (0.2453) loss 2.3459 (2.7595) grad_norm 5.6561 (4.8987) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][270/1251] eta 0:04:03 lr 0.000085 wd 0.0500 time 0.2430 (0.2485) data time 0.0007 (0.0036) model time 0.2423 (0.2450) loss 3.3191 (2.7516) grad_norm 75.1021 (5.1773) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][280/1251] eta 0:04:00 lr 0.000085 wd 0.0500 time 0.2478 (0.2482) data time 0.0010 (0.0035) model time 0.2467 (0.2447) loss 2.7803 (2.7522) grad_norm 6.0237 (5.1466) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][290/1251] eta 0:03:58 lr 0.000085 wd 0.0500 time 0.2376 (0.2478) data time 0.0009 (0.0034) model time 0.2367 (0.2443) loss 3.5872 (2.7653) grad_norm 6.0050 (5.1924) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][300/1251] eta 0:03:55 lr 0.000085 wd 0.0500 time 0.2408 (0.2474) data time 0.0008 (0.0034) model time 0.2401 (0.2440) loss 3.3048 (2.7640) grad_norm 2.8390 (5.1635) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][310/1251] eta 0:03:52 lr 0.000085 wd 0.0500 time 0.2365 (0.2472) data time 0.0010 (0.0033) model time 0.2355 (0.2438) loss 2.0367 (2.7667) grad_norm 3.5310 (5.1223) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][320/1251] eta 0:03:49 lr 0.000085 wd 0.0500 time 0.2436 (0.2470) 
data time 0.0008 (0.0032) model time 0.2428 (0.2436) loss 3.0278 (2.7647) grad_norm 6.3974 (5.1030) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][330/1251] eta 0:03:47 lr 0.000085 wd 0.0500 time 0.2413 (0.2468) data time 0.0007 (0.0031) model time 0.2406 (0.2435) loss 3.0189 (2.7666) grad_norm 4.3443 (5.1466) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][340/1251] eta 0:03:44 lr 0.000085 wd 0.0500 time 0.2406 (0.2466) data time 0.0007 (0.0031) model time 0.2399 (0.2434) loss 2.8305 (2.7679) grad_norm 4.2185 (5.1477) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][350/1251] eta 0:03:42 lr 0.000085 wd 0.0500 time 0.2567 (0.2464) data time 0.0006 (0.0030) model time 0.2560 (0.2432) loss 2.0467 (2.7619) grad_norm 4.1607 (5.1435) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][360/1251] eta 0:03:39 lr 0.000085 wd 0.0500 time 0.2427 (0.2463) data time 0.0007 (0.0030) model time 0.2420 (0.2431) loss 2.0499 (2.7611) grad_norm 3.4887 (5.1103) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][370/1251] eta 0:03:36 lr 0.000085 wd 0.0500 time 0.2456 (0.2461) data time 0.0008 (0.0029) model time 0.2448 (0.2429) loss 2.8607 (2.7625) grad_norm 4.1399 (5.1077) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][380/1251] eta 0:03:34 lr 0.000085 wd 0.0500 time 0.2353 (0.2464) data time 0.0008 (0.0029) model time 0.2346 (0.2434) loss 1.6924 (2.7535) grad_norm 3.7237 (5.0942) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [250/300][390/1251] eta 0:03:31 lr 0.000085 wd 0.0500 time 0.2348 (0.2462) data time 0.0006 (0.0028) model time 0.2342 (0.2432) loss 2.4803 (2.7544) grad_norm 3.6291 (5.0761) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][400/1251] eta 0:03:29 lr 0.000085 wd 0.0500 time 0.2265 (0.2458) data time 0.0007 (0.0028) model time 0.2258 (0.2429) loss 3.1643 (2.7479) grad_norm 3.7439 (5.0670) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][410/1251] eta 0:03:33 lr 0.000085 wd 0.0500 time 0.2356 (0.2536) data time 0.0012 (0.0027) model time 0.2345 (0.2518) loss 2.9066 (2.7488) grad_norm 4.1850 (5.0717) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][420/1251] eta 0:03:30 lr 0.000085 wd 0.0500 time 0.2399 (0.2532) data time 0.0009 (0.0027) model time 0.2390 (0.2514) loss 3.2967 (2.7480) grad_norm 2.9214 (5.0607) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][430/1251] eta 0:03:41 lr 0.000085 wd 0.0500 time 0.2301 (0.2692) data time 0.0008 (0.0040) model time 0.2293 (0.2679) loss 3.4378 (2.7495) grad_norm 3.4241 (5.0464) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][440/1251] eta 0:03:39 lr 0.000085 wd 0.0500 time 0.2287 (0.2711) data time 0.0008 (0.0041) model time 0.2279 (0.2698) loss 1.9593 (2.7498) grad_norm 10.9908 (5.0307) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][450/1251] eta 0:03:43 lr 0.000085 wd 0.0500 time 0.2445 (0.2786) data time 0.0009 (0.0041) model time 0.2435 (0.2784) loss 3.0861 (2.7552) grad_norm 7.4235 (5.0331) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 07:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][460/1251] eta 0:03:39 lr 0.000085 wd 0.0500 time 0.2356 (0.2777) data time 0.0009 (0.0040) model time 0.2347 (0.2774) loss 2.6344 (2.7559) grad_norm 4.1275 (5.0306) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][470/1251] eta 0:03:36 lr 0.000085 wd 0.0500 time 0.2358 (0.2769) data time 0.0009 (0.0039) model time 0.2349 (0.2764) loss 2.5246 (2.7524) grad_norm 3.3982 (5.0317) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][480/1251] eta 0:03:32 lr 0.000085 wd 0.0500 time 0.2318 (0.2760) data time 0.0010 (0.0039) model time 0.2308 (0.2754) loss 3.4205 (2.7488) grad_norm 4.2385 (5.0327) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][490/1251] eta 0:03:29 lr 0.000085 wd 0.0500 time 0.2317 (0.2752) data time 0.0010 (0.0038) model time 0.2307 (0.2745) loss 2.9163 (2.7488) grad_norm 6.1878 (5.0237) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][500/1251] eta 0:03:28 lr 0.000085 wd 0.0500 time 0.2402 (0.2777) data time 0.0007 (0.0038) model time 0.2395 (0.2773) loss 3.1580 (2.7512) grad_norm 3.9653 (4.9965) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][510/1251] eta 0:03:27 lr 0.000085 wd 0.0500 time 0.2356 (0.2797) data time 0.0010 (0.0061) model time 0.2347 (0.2769) loss 3.0011 (2.7521) grad_norm 4.3790 (4.9753) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][520/1251] eta 0:03:23 lr 0.000085 wd 0.0500 time 0.2383 (0.2789) data time 0.0007 (0.0060) model 
time 0.2376 (0.2760) loss 3.2746 (2.7517) grad_norm 4.5834 (4.9561) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][530/1251] eta 0:03:20 lr 0.000085 wd 0.0500 time 0.2428 (0.2782) data time 0.0006 (0.0059) model time 0.2422 (0.2752) loss 2.1690 (2.7498) grad_norm 3.2813 (4.9445) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][540/1251] eta 0:03:17 lr 0.000085 wd 0.0500 time 0.2292 (0.2773) data time 0.0012 (0.0058) model time 0.2280 (0.2743) loss 2.8931 (2.7554) grad_norm 3.9373 (4.9216) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][550/1251] eta 0:03:13 lr 0.000085 wd 0.0500 time 0.2494 (0.2766) data time 0.0009 (0.0057) model time 0.2485 (0.2735) loss 2.7315 (2.7549) grad_norm 5.5961 (4.9256) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][560/1251] eta 0:03:11 lr 0.000085 wd 0.0500 time 0.2382 (0.2778) data time 0.0008 (0.0056) model time 0.2374 (0.2750) loss 2.2245 (2.7581) grad_norm 10.1555 (4.9337) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][570/1251] eta 0:03:08 lr 0.000085 wd 0.0500 time 0.2344 (0.2772) data time 0.0010 (0.0056) model time 0.2333 (0.2743) loss 2.8145 (2.7568) grad_norm 3.7331 (4.9271) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][580/1251] eta 0:03:05 lr 0.000085 wd 0.0500 time 0.2335 (0.2765) data time 0.0011 (0.0055) model time 0.2324 (0.2736) loss 2.3344 (2.7600) grad_norm 9.2014 (4.9335) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][590/1251] 
eta 0:03:02 lr 0.000084 wd 0.0500 time 0.2348 (0.2758) data time 0.0009 (0.0054) model time 0.2340 (0.2728) loss 2.4417 (2.7652) grad_norm 3.2075 (4.9708) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][600/1251] eta 0:02:59 lr 0.000084 wd 0.0500 time 0.2324 (0.2751) data time 0.0010 (0.0053) model time 0.2314 (0.2722) loss 2.8489 (2.7686) grad_norm 5.9730 (4.9647) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][610/1251] eta 0:02:55 lr 0.000084 wd 0.0500 time 0.2362 (0.2746) data time 0.0011 (0.0053) model time 0.2352 (0.2715) loss 2.7692 (2.7678) grad_norm 3.2046 (4.9487) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][620/1251] eta 0:02:52 lr 0.000084 wd 0.0500 time 0.2333 (0.2740) data time 0.0009 (0.0052) model time 0.2324 (0.2709) loss 2.3603 (2.7690) grad_norm 6.5503 (4.9790) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][630/1251] eta 0:02:49 lr 0.000084 wd 0.0500 time 0.2357 (0.2734) data time 0.0010 (0.0051) model time 0.2347 (0.2704) loss 2.7492 (2.7684) grad_norm 5.1447 (4.9756) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][640/1251] eta 0:02:46 lr 0.000084 wd 0.0500 time 0.2513 (0.2729) data time 0.0009 (0.0051) model time 0.2504 (0.2698) loss 3.1953 (2.7675) grad_norm 4.3922 (4.9627) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][650/1251] eta 0:02:43 lr 0.000084 wd 0.0500 time 0.2377 (0.2724) data time 0.0007 (0.0050) model time 0.2369 (0.2693) loss 3.1857 (2.7681) grad_norm 4.5888 (4.9563) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
07:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][660/1251] eta 0:02:40 lr 0.000084 wd 0.0500 time 0.2449 (0.2719) data time 0.0011 (0.0049) model time 0.2439 (0.2688) loss 3.1998 (2.7702) grad_norm 4.3768 (4.9448) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][670/1251] eta 0:02:37 lr 0.000084 wd 0.0500 time 0.2455 (0.2713) data time 0.0007 (0.0049) model time 0.2448 (0.2683) loss 2.2877 (2.7693) grad_norm 3.4390 (4.9316) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][680/1251] eta 0:02:34 lr 0.000084 wd 0.0500 time 0.2416 (0.2708) data time 0.0008 (0.0048) model time 0.2408 (0.2677) loss 2.0327 (2.7687) grad_norm 4.2917 (4.9264) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][690/1251] eta 0:02:31 lr 0.000084 wd 0.0500 time 0.2287 (0.2703) data time 0.0010 (0.0048) model time 0.2277 (0.2672) loss 2.6889 (2.7691) grad_norm 3.6397 (4.9122) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][700/1251] eta 0:02:28 lr 0.000084 wd 0.0500 time 0.2491 (0.2699) data time 0.0012 (0.0047) model time 0.2478 (0.2668) loss 3.0699 (2.7707) grad_norm 4.2483 (4.8992) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][710/1251] eta 0:02:25 lr 0.000084 wd 0.0500 time 0.2349 (0.2694) data time 0.0010 (0.0047) model time 0.2339 (0.2663) loss 2.7866 (2.7686) grad_norm 4.4656 (4.8863) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][720/1251] eta 0:02:22 lr 0.000084 wd 0.0500 time 0.2469 (0.2690) data time 0.0012 (0.0046) model time 0.2456 (0.2658) loss 2.8392 
(2.7687) grad_norm 3.7021 (4.8803) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][730/1251] eta 0:02:19 lr 0.000084 wd 0.0500 time 0.2326 (0.2685) data time 0.0007 (0.0046) model time 0.2319 (0.2654) loss 3.3924 (2.7683) grad_norm 8.4257 (4.8889) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][740/1251] eta 0:02:16 lr 0.000084 wd 0.0500 time 0.2452 (0.2681) data time 0.0010 (0.0045) model time 0.2442 (0.2650) loss 2.8474 (2.7684) grad_norm 3.4747 (4.8823) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][750/1251] eta 0:02:14 lr 0.000084 wd 0.0500 time 0.2360 (0.2677) data time 0.0008 (0.0045) model time 0.2351 (0.2646) loss 2.7782 (2.7705) grad_norm 7.5454 (4.8909) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][760/1251] eta 0:02:11 lr 0.000084 wd 0.0500 time 0.2434 (0.2673) data time 0.0008 (0.0044) model time 0.2427 (0.2642) loss 1.9586 (2.7704) grad_norm 4.7808 (4.8935) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][770/1251] eta 0:02:08 lr 0.000084 wd 0.0500 time 0.2439 (0.2669) data time 0.0008 (0.0044) model time 0.2431 (0.2638) loss 2.5759 (2.7674) grad_norm 3.8309 (4.8872) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][780/1251] eta 0:02:05 lr 0.000084 wd 0.0500 time 0.2316 (0.2666) data time 0.0011 (0.0043) model time 0.2306 (0.2635) loss 2.9335 (2.7657) grad_norm 5.0520 (4.8857) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][790/1251] eta 0:02:02 lr 0.000084 wd 0.0500 
time 0.2449 (0.2662) data time 0.0008 (0.0043) model time 0.2441 (0.2631) loss 2.9328 (2.7654) grad_norm 4.5902 (4.8804) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][800/1251] eta 0:01:59 lr 0.000084 wd 0.0500 time 0.2341 (0.2659) data time 0.0008 (0.0043) model time 0.2334 (0.2628) loss 3.5874 (2.7645) grad_norm 2.8722 (4.8728) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][810/1251] eta 0:01:57 lr 0.000084 wd 0.0500 time 0.2371 (0.2655) data time 0.0010 (0.0042) model time 0.2362 (0.2625) loss 2.7082 (2.7661) grad_norm 3.5887 (4.8669) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][820/1251] eta 0:01:54 lr 0.000084 wd 0.0500 time 0.2448 (0.2652) data time 0.0011 (0.0042) model time 0.2437 (0.2621) loss 1.7619 (2.7647) grad_norm 3.7728 (4.8593) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][830/1251] eta 0:01:51 lr 0.000084 wd 0.0500 time 0.2389 (0.2649) data time 0.0011 (0.0041) model time 0.2378 (0.2618) loss 2.0557 (2.7637) grad_norm 3.0861 (4.8722) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][840/1251] eta 0:01:48 lr 0.000084 wd 0.0500 time 0.2353 (0.2645) data time 0.0009 (0.0041) model time 0.2345 (0.2615) loss 2.5744 (2.7667) grad_norm 4.1973 (4.8675) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][850/1251] eta 0:01:45 lr 0.000084 wd 0.0500 time 0.2370 (0.2643) data time 0.0007 (0.0041) model time 0.2364 (0.2612) loss 1.8251 (2.7694) grad_norm 3.5585 (4.8779) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [250/300][860/1251] eta 0:01:43 lr 0.000084 wd 0.0500 time 0.2490 (0.2640) data time 0.0007 (0.0040) model time 0.2482 (0.2609) loss 3.2161 (2.7686) grad_norm 4.8260 (4.8791) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][870/1251] eta 0:01:40 lr 0.000084 wd 0.0500 time 0.2421 (0.2636) data time 0.0008 (0.0040) model time 0.2414 (0.2606) loss 1.6995 (2.7692) grad_norm 3.0289 (4.8729) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][880/1251] eta 0:01:37 lr 0.000084 wd 0.0500 time 0.2399 (0.2633) data time 0.0010 (0.0040) model time 0.2389 (0.2603) loss 2.9978 (2.7704) grad_norm 4.8077 (4.8763) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][890/1251] eta 0:01:34 lr 0.000084 wd 0.0500 time 0.2531 (0.2630) data time 0.0009 (0.0039) model time 0.2522 (0.2600) loss 3.2067 (2.7697) grad_norm 8.1899 (4.8808) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:38:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][900/1251] eta 0:01:32 lr 0.000084 wd 0.0500 time 0.2486 (0.2627) data time 0.0009 (0.0039) model time 0.2477 (0.2597) loss 2.4501 (2.7713) grad_norm 5.9116 (4.8895) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][910/1251] eta 0:01:29 lr 0.000084 wd 0.0500 time 0.2339 (0.2625) data time 0.0008 (0.0039) model time 0.2331 (0.2594) loss 2.9974 (2.7735) grad_norm 4.0574 (4.8917) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][920/1251] eta 0:01:26 lr 0.000084 wd 0.0500 time 0.2276 (0.2624) data time 0.0008 (0.0038) model time 0.2269 (0.2594) loss 2.2285 (2.7736) grad_norm 5.3909 
(4.8858) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][930/1251] eta 0:01:24 lr 0.000084 wd 0.0500 time 0.3482 (0.2637) data time 0.0009 (0.0038) model time 0.3474 (0.2607) loss 3.4859 (2.7735) grad_norm 4.9074 (4.8766) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][940/1251] eta 0:01:22 lr 0.000084 wd 0.0500 time 0.2402 (0.2643) data time 0.0007 (0.0038) model time 0.2395 (0.2614) loss 3.3988 (2.7753) grad_norm 4.1433 (4.8702) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][950/1251] eta 0:01:19 lr 0.000084 wd 0.0500 time 0.2299 (0.2642) data time 0.0010 (0.0038) model time 0.2289 (0.2613) loss 1.9968 (2.7747) grad_norm 4.6463 (4.8606) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][960/1251] eta 0:01:17 lr 0.000084 wd 0.0500 time 0.2443 (0.2651) data time 0.0012 (0.0048) model time 0.2431 (0.2612) loss 3.0738 (2.7754) grad_norm 3.5811 (4.8600) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][970/1251] eta 0:01:14 lr 0.000084 wd 0.0500 time 0.2350 (0.2648) data time 0.0008 (0.0047) model time 0.2342 (0.2609) loss 3.3334 (2.7757) grad_norm 3.5369 (4.8556) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][980/1251] eta 0:01:11 lr 0.000084 wd 0.0500 time 0.2583 (0.2646) data time 0.0014 (0.0047) model time 0.2569 (0.2607) loss 3.3853 (2.7758) grad_norm 5.9346 (4.8953) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][990/1251] eta 0:01:08 lr 0.000084 wd 0.0500 time 0.2301 (0.2643) data 
time 0.0010 (0.0047) model time 0.2291 (0.2605) loss 2.7504 (2.7785) grad_norm 7.3943 (4.9026) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1000/1251] eta 0:01:06 lr 0.000084 wd 0.0500 time 0.2401 (0.2641) data time 0.0008 (0.0046) model time 0.2393 (0.2602) loss 3.1111 (2.7778) grad_norm 6.1837 (4.9047) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1010/1251] eta 0:01:03 lr 0.000084 wd 0.0500 time 0.2349 (0.2638) data time 0.0010 (0.0046) model time 0.2340 (0.2600) loss 2.4091 (2.7777) grad_norm 5.4711 (4.8989) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1020/1251] eta 0:01:00 lr 0.000083 wd 0.0500 time 0.2287 (0.2636) data time 0.0010 (0.0046) model time 0.2277 (0.2597) loss 2.8688 (2.7775) grad_norm 12.3079 (4.8981) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1030/1251] eta 0:00:58 lr 0.000083 wd 0.0500 time 0.2286 (0.2633) data time 0.0010 (0.0045) model time 0.2276 (0.2596) loss 2.8236 (2.7778) grad_norm 5.1124 (4.8926) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1040/1251] eta 0:00:55 lr 0.000083 wd 0.0500 time 0.2302 (0.2631) data time 0.0011 (0.0045) model time 0.2291 (0.2593) loss 2.7059 (2.7766) grad_norm 5.5962 (4.8878) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1050/1251] eta 0:00:52 lr 0.000083 wd 0.0500 time 0.2426 (0.2629) data time 0.0009 (0.0045) model time 0.2417 (0.2591) loss 2.0541 (2.7764) grad_norm 3.1690 (4.8829) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [250/300][1060/1251] eta 0:00:50 lr 0.000083 wd 0.0500 time 0.2329 (0.2626) data time 0.0007 (0.0044) model time 0.2322 (0.2589) loss 2.9746 (2.7766) grad_norm 3.5390 (4.8874) loss_scale 256.0000 (128.6032) mem 7381MB [2024-09-01 07:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1070/1251] eta 0:00:47 lr 0.000083 wd 0.0500 time 0.2396 (0.2624) data time 0.0010 (0.0044) model time 0.2386 (0.2586) loss 3.1649 (2.7786) grad_norm 7.0246 (4.8910) loss_scale 256.0000 (129.7927) mem 7381MB [2024-09-01 07:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1080/1251] eta 0:00:44 lr 0.000083 wd 0.0500 time 0.2344 (0.2621) data time 0.0009 (0.0044) model time 0.2335 (0.2584) loss 2.9719 (2.7795) grad_norm 5.8056 (4.8879) loss_scale 256.0000 (130.9602) mem 7381MB [2024-09-01 07:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1090/1251] eta 0:00:42 lr 0.000083 wd 0.0500 time 0.4374 (0.2623) data time 0.0007 (0.0043) model time 0.4366 (0.2586) loss 2.2591 (2.7783) grad_norm 4.3694 (4.8911) loss_scale 256.0000 (132.1063) mem 7381MB [2024-09-01 07:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1100/1251] eta 0:00:39 lr 0.000083 wd 0.0500 time 0.2299 (0.2620) data time 0.0011 (0.0043) model time 0.2288 (0.2584) loss 2.8813 (2.7768) grad_norm 3.8399 (4.8918) loss_scale 256.0000 (133.2316) mem 7381MB [2024-09-01 07:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1110/1251] eta 0:00:36 lr 0.000083 wd 0.0500 time 0.2345 (0.2618) data time 0.0009 (0.0043) model time 0.2336 (0.2582) loss 2.6314 (2.7759) grad_norm 4.9245 (4.8986) loss_scale 256.0000 (134.3366) mem 7381MB [2024-09-01 07:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1120/1251] eta 0:00:34 lr 0.000083 wd 0.0500 time 0.2355 (0.2616) data time 0.0007 (0.0043) model time 0.2348 (0.2580) loss 3.0329 (2.7781) grad_norm 5.4054 (4.8973) loss_scale 
256.0000 (135.4219) mem 7381MB [2024-09-01 07:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1130/1251] eta 0:00:31 lr 0.000083 wd 0.0500 time 0.2418 (0.2614) data time 0.0009 (0.0042) model time 0.2409 (0.2578) loss 3.0283 (2.7778) grad_norm 3.1186 (4.8865) loss_scale 256.0000 (136.4881) mem 7381MB [2024-09-01 07:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1140/1251] eta 0:00:28 lr 0.000083 wd 0.0500 time 0.2385 (0.2612) data time 0.0009 (0.0042) model time 0.2377 (0.2576) loss 3.3489 (2.7789) grad_norm 3.2611 (4.8841) loss_scale 256.0000 (137.5355) mem 7381MB [2024-09-01 07:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1150/1251] eta 0:00:26 lr 0.000083 wd 0.0500 time 0.2434 (0.2611) data time 0.0009 (0.0042) model time 0.2425 (0.2575) loss 1.6417 (2.7779) grad_norm 4.3425 (4.8815) loss_scale 256.0000 (138.5647) mem 7381MB [2024-09-01 07:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1160/1251] eta 0:00:23 lr 0.000083 wd 0.0500 time 0.2443 (0.2609) data time 0.0009 (0.0041) model time 0.2434 (0.2573) loss 3.1486 (2.7789) grad_norm 3.9289 (4.8779) loss_scale 256.0000 (139.5762) mem 7381MB [2024-09-01 07:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1170/1251] eta 0:00:21 lr 0.000083 wd 0.0500 time 0.2322 (0.2606) data time 0.0007 (0.0041) model time 0.2315 (0.2571) loss 3.1850 (2.7789) grad_norm 4.3821 (4.8787) loss_scale 256.0000 (140.5705) mem 7381MB [2024-09-01 07:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1180/1251] eta 0:00:18 lr 0.000083 wd 0.0500 time 0.2398 (0.2604) data time 0.0007 (0.0041) model time 0.2391 (0.2569) loss 1.9909 (2.7788) grad_norm 4.0276 (4.8738) loss_scale 256.0000 (141.5478) mem 7381MB [2024-09-01 07:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1190/1251] eta 0:00:15 lr 0.000083 wd 0.0500 time 0.2424 (0.2602) data time 0.0008 
(0.0041) model time 0.2415 (0.2567) loss 2.8128 (2.7807) grad_norm 6.2108 (4.8714) loss_scale 256.0000 (142.5088) mem 7381MB [2024-09-01 07:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1200/1251] eta 0:00:13 lr 0.000083 wd 0.0500 time 0.2357 (0.2601) data time 0.0008 (0.0040) model time 0.2349 (0.2566) loss 3.6072 (2.7808) grad_norm 4.2706 (4.8715) loss_scale 256.0000 (143.4538) mem 7381MB [2024-09-01 07:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1210/1251] eta 0:00:10 lr 0.000083 wd 0.0500 time 0.2403 (0.2600) data time 0.0007 (0.0040) model time 0.2396 (0.2565) loss 3.5598 (2.7800) grad_norm 4.5781 (4.8698) loss_scale 256.0000 (144.3832) mem 7381MB [2024-09-01 07:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1220/1251] eta 0:00:08 lr 0.000083 wd 0.0500 time 0.2378 (0.2598) data time 0.0010 (0.0040) model time 0.2368 (0.2563) loss 2.4352 (2.7790) grad_norm 3.6586 (4.8664) loss_scale 256.0000 (145.2973) mem 7381MB [2024-09-01 07:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1230/1251] eta 0:00:05 lr 0.000083 wd 0.0500 time 0.2602 (0.2597) data time 0.0008 (0.0040) model time 0.2594 (0.2562) loss 3.0732 (2.7798) grad_norm 3.4314 (4.8585) loss_scale 256.0000 (146.1966) mem 7381MB [2024-09-01 07:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1240/1251] eta 0:00:02 lr 0.000083 wd 0.0500 time 0.2242 (0.2595) data time 0.0005 (0.0039) model time 0.2238 (0.2560) loss 3.1778 (2.7808) grad_norm 4.0148 (4.8547) loss_scale 256.0000 (147.0814) mem 7381MB [2024-09-01 07:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [250/300][1250/1251] eta 0:00:00 lr 0.000083 wd 0.0500 time 0.2326 (0.2592) data time 0.0007 (0.0039) model time 0.2318 (0.2558) loss 2.7409 (2.7818) grad_norm 5.1081 (4.8614) loss_scale 256.0000 (147.9520) mem 7381MB [2024-09-01 07:40:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 
250 training takes 0:05:24 [2024-09-01 07:40:27 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 07:40:28 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 07:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.465 (0.465) Loss 0.4048 (0.4048) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 07:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.114) Loss 0.6016 (0.6244) Acc@1 89.258 (87.083) Acc@5 97.559 (97.736) Mem 7381MB [2024-09-01 07:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.098) Loss 0.9541 (0.6547) Acc@1 77.539 (86.021) Acc@5 95.703 (97.638) Mem 7381MB [2024-09-01 07:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.092) Loss 1.1631 (0.7460) Acc@1 74.414 (83.896) Acc@5 91.797 (96.724) Mem 7381MB [2024-09-01 07:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.086) Loss 1.0205 (0.7955) Acc@1 75.488 (82.705) Acc@5 94.238 (96.179) Mem 7381MB [2024-09-01 07:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.246 Acc@5 96.118 [2024-09-01 07:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.2% [2024-09-01 07:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.954 (0.954) Loss 0.3818 (0.3818) Acc@1 93.359 (93.359) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 07:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.162) Loss 0.5688 (0.5998) Acc@1 89.941 (87.660) Acc@5 98.145 (97.869) Mem 7381MB [2024-09-01 07:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.122) Loss 0.8911 (0.6305) Acc@1 78.027 (86.514) Acc@5 96.289 
(97.763) Mem 7381MB [2024-09-01 07:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.109) Loss 1.1016 (0.7172) Acc@1 74.121 (84.403) Acc@5 93.066 (96.894) Mem 7381MB [2024-09-01 07:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.098) Loss 0.9937 (0.7631) Acc@1 76.465 (83.258) Acc@5 94.336 (96.391) Mem 7381MB [2024-09-01 07:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.902 Acc@5 96.350 [2024-09-01 07:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][0/1251] eta 0:27:40 lr 0.000083 wd 0.0500 time 1.3276 (1.3276) data time 0.8473 (0.8473) model time 0.0000 (0.0000) loss 2.9571 (2.9571) grad_norm 2.9803 (2.9803) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][10/1251] eta 0:06:57 lr 0.000083 wd 0.0500 time 0.2329 (0.3365) data time 0.0011 (0.0780) model time 0.0000 (0.0000) loss 2.9467 (2.7942) grad_norm 4.0310 (4.1069) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][20/1251] eta 0:05:54 lr 0.000083 wd 0.0500 time 0.2459 (0.2882) data time 0.0009 (0.0413) model time 0.0000 (0.0000) loss 1.9823 (2.7423) grad_norm 3.9374 (4.3137) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][30/1251] eta 0:05:30 lr 0.000083 wd 0.0500 time 0.2309 (0.2706) data time 0.0009 (0.0283) model time 0.0000 (0.0000) loss 3.1446 (2.7943) grad_norm 3.5133 (4.1861) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][40/1251] eta 0:05:17 lr 0.000083 wd 0.0500 time 0.2383 (0.2623) data time 0.0009 (0.0217) model time 
0.0000 (0.0000) loss 2.9659 (2.8157) grad_norm 4.1785 (4.1725) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][50/1251] eta 0:05:09 lr 0.000083 wd 0.0500 time 0.2545 (0.2579) data time 0.0012 (0.0176) model time 0.0000 (0.0000) loss 2.9506 (2.8156) grad_norm 4.7677 (4.2966) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][60/1251] eta 0:05:03 lr 0.000083 wd 0.0500 time 0.2314 (0.2550) data time 0.0008 (0.0149) model time 0.2306 (0.2392) loss 3.3569 (2.7937) grad_norm 3.4628 (4.3937) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][70/1251] eta 0:04:58 lr 0.000083 wd 0.0500 time 0.2377 (0.2524) data time 0.0008 (0.0129) model time 0.2368 (0.2376) loss 2.3685 (2.7823) grad_norm 4.7244 (4.3204) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][80/1251] eta 0:04:52 lr 0.000083 wd 0.0500 time 0.2325 (0.2498) data time 0.0007 (0.0115) model time 0.2318 (0.2349) loss 2.1402 (2.7768) grad_norm 5.4621 (4.3458) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][90/1251] eta 0:04:48 lr 0.000083 wd 0.0500 time 0.2359 (0.2483) data time 0.0010 (0.0103) model time 0.2350 (0.2351) loss 2.8072 (2.7782) grad_norm 4.8370 (4.3542) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][100/1251] eta 0:04:44 lr 0.000083 wd 0.0500 time 0.2383 (0.2472) data time 0.0008 (0.0094) model time 0.2375 (0.2353) loss 3.4669 (2.7686) grad_norm 4.0230 (4.4169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][110/1251] eta 
0:04:40 lr 0.000083 wd 0.0500 time 0.2334 (0.2461) data time 0.0013 (0.0087) model time 0.2322 (0.2350) loss 2.2260 (2.7284) grad_norm 4.2336 (4.4085) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][120/1251] eta 0:04:37 lr 0.000083 wd 0.0500 time 0.2360 (0.2452) data time 0.0010 (0.0080) model time 0.2350 (0.2350) loss 2.5199 (2.7346) grad_norm 3.7632 (4.3848) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][130/1251] eta 0:04:34 lr 0.000083 wd 0.0500 time 0.2370 (0.2447) data time 0.0008 (0.0075) model time 0.2362 (0.2353) loss 2.9485 (2.7500) grad_norm 5.2622 (4.4497) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][140/1251] eta 0:04:30 lr 0.000083 wd 0.0500 time 0.2325 (0.2439) data time 0.0008 (0.0070) model time 0.2317 (0.2349) loss 3.6072 (2.7585) grad_norm 4.7164 (4.5244) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][150/1251] eta 0:04:29 lr 0.000083 wd 0.0500 time 0.2324 (0.2445) data time 0.0012 (0.0067) model time 0.2312 (0.2366) loss 2.8261 (2.7390) grad_norm 4.0082 (4.4945) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][160/1251] eta 0:04:26 lr 0.000083 wd 0.0500 time 0.2408 (0.2441) data time 0.0011 (0.0063) model time 0.2397 (0.2366) loss 2.8831 (2.7520) grad_norm 3.6561 (4.5514) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][170/1251] eta 0:04:23 lr 0.000083 wd 0.0500 time 0.2345 (0.2437) data time 0.0010 (0.0060) model time 0.2335 (0.2366) loss 2.2632 (2.7311) grad_norm 8.4240 (4.6169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
07:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][180/1251] eta 0:04:20 lr 0.000083 wd 0.0500 time 0.2372 (0.2433) data time 0.0009 (0.0057) model time 0.2363 (0.2365) loss 2.3453 (2.7129) grad_norm 3.1455 (4.6126) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][190/1251] eta 0:04:17 lr 0.000083 wd 0.0500 time 0.2372 (0.2431) data time 0.0008 (0.0055) model time 0.2364 (0.2367) loss 2.2549 (2.7232) grad_norm 3.7527 (4.5872) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][200/1251] eta 0:04:15 lr 0.000082 wd 0.0500 time 0.2421 (0.2428) data time 0.0007 (0.0053) model time 0.2414 (0.2366) loss 2.6250 (2.7237) grad_norm 4.6164 (4.5699) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][210/1251] eta 0:04:12 lr 0.000082 wd 0.0500 time 0.2352 (0.2424) data time 0.0010 (0.0051) model time 0.2342 (0.2364) loss 2.7857 (2.7270) grad_norm 3.6370 (4.5729) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][220/1251] eta 0:04:09 lr 0.000082 wd 0.0500 time 0.2362 (0.2421) data time 0.0008 (0.0049) model time 0.2353 (0.2363) loss 3.1335 (2.7281) grad_norm 8.3951 (4.6930) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][230/1251] eta 0:04:06 lr 0.000082 wd 0.0500 time 0.2460 (0.2419) data time 0.0008 (0.0047) model time 0.2452 (0.2363) loss 3.4836 (2.7305) grad_norm 4.2813 (4.8889) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][240/1251] eta 0:04:04 lr 0.000082 wd 0.0500 time 0.2302 (0.2416) data time 0.0008 (0.0046) model time 0.2293 (0.2362) loss 2.8186 
(2.7264) grad_norm 4.7108 (4.9024) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][250/1251] eta 0:04:01 lr 0.000082 wd 0.0500 time 0.2330 (0.2414) data time 0.0009 (0.0044) model time 0.2322 (0.2361) loss 2.9916 (2.7304) grad_norm 10.6960 (4.9492) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][260/1251] eta 0:03:58 lr 0.000082 wd 0.0500 time 0.2384 (0.2411) data time 0.0008 (0.0043) model time 0.2376 (0.2360) loss 3.4063 (2.7386) grad_norm 3.7710 (4.9333) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][270/1251] eta 0:03:56 lr 0.000082 wd 0.0500 time 0.2283 (0.2410) data time 0.0009 (0.0042) model time 0.2274 (0.2360) loss 2.9397 (2.7457) grad_norm 5.0724 (4.9093) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][280/1251] eta 0:03:53 lr 0.000082 wd 0.0500 time 0.2565 (0.2408) data time 0.0007 (0.0041) model time 0.2558 (0.2360) loss 1.7207 (2.7481) grad_norm 3.7187 (4.9579) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][290/1251] eta 0:03:51 lr 0.000082 wd 0.0500 time 0.2411 (0.2407) data time 0.0011 (0.0039) model time 0.2400 (0.2360) loss 3.3416 (2.7485) grad_norm 4.4458 (5.1161) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][300/1251] eta 0:03:48 lr 0.000082 wd 0.0500 time 0.2396 (0.2406) data time 0.0009 (0.0039) model time 0.2388 (0.2361) loss 2.6150 (2.7512) grad_norm 4.6412 (5.2104) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][310/1251] eta 0:03:46 lr 0.000082 wd 
0.0500 time 0.2214 (0.2405) data time 0.0010 (0.0038) model time 0.2203 (0.2360) loss 2.1832 (2.7518) grad_norm 4.2941 (5.1901) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][320/1251] eta 0:03:43 lr 0.000082 wd 0.0500 time 0.2349 (0.2402) data time 0.0010 (0.0037) model time 0.2339 (0.2359) loss 3.1264 (2.7546) grad_norm 4.3889 (5.1867) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][330/1251] eta 0:03:41 lr 0.000082 wd 0.0500 time 0.2365 (0.2402) data time 0.0009 (0.0036) model time 0.2356 (0.2360) loss 2.7007 (2.7543) grad_norm 4.8126 (5.1814) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][340/1251] eta 0:03:38 lr 0.000082 wd 0.0500 time 0.2309 (0.2401) data time 0.0008 (0.0035) model time 0.2301 (0.2360) loss 2.3349 (2.7478) grad_norm 4.6239 (5.1784) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][350/1251] eta 0:03:36 lr 0.000082 wd 0.0500 time 0.2359 (0.2399) data time 0.0010 (0.0035) model time 0.2349 (0.2358) loss 2.8466 (2.7518) grad_norm 4.5638 (5.1585) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][360/1251] eta 0:03:33 lr 0.000082 wd 0.0500 time 0.2317 (0.2398) data time 0.0007 (0.0034) model time 0.2310 (0.2358) loss 3.2521 (2.7569) grad_norm 3.9524 (5.1434) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][370/1251] eta 0:03:31 lr 0.000082 wd 0.0500 time 0.2330 (0.2396) data time 0.0011 (0.0033) model time 0.2319 (0.2357) loss 3.2172 (2.7591) grad_norm 3.8255 (5.2288) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:08 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [251/300][380/1251] eta 0:03:28 lr 0.000082 wd 0.0500 time 0.2365 (0.2395) data time 0.0008 (0.0033) model time 0.2358 (0.2357) loss 3.3601 (2.7638) grad_norm 3.7768 (5.2314) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][390/1251] eta 0:03:26 lr 0.000082 wd 0.0500 time 0.2256 (0.2394) data time 0.0008 (0.0032) model time 0.2248 (0.2355) loss 2.7773 (2.7657) grad_norm 4.9241 (5.2247) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][400/1251] eta 0:03:23 lr 0.000082 wd 0.0500 time 0.2371 (0.2393) data time 0.0009 (0.0032) model time 0.2362 (0.2355) loss 3.0036 (2.7653) grad_norm 3.6226 (5.1935) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][410/1251] eta 0:03:21 lr 0.000082 wd 0.0500 time 0.2376 (0.2392) data time 0.0010 (0.0031) model time 0.2366 (0.2355) loss 2.4506 (2.7687) grad_norm 4.3737 (5.1748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][420/1251] eta 0:03:18 lr 0.000082 wd 0.0500 time 0.2293 (0.2391) data time 0.0009 (0.0030) model time 0.2284 (0.2354) loss 2.5768 (2.7655) grad_norm 3.9912 (5.1586) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][430/1251] eta 0:03:16 lr 0.000082 wd 0.0500 time 0.2311 (0.2390) data time 0.0010 (0.0030) model time 0.2301 (0.2354) loss 2.4875 (2.7683) grad_norm 7.2381 (5.1348) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][440/1251] eta 0:03:13 lr 0.000082 wd 0.0500 time 0.2402 (0.2390) data time 0.0013 (0.0030) model time 0.2389 (0.2354) loss 3.0115 (2.7645) grad_norm 3.7859 
(5.1160) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][450/1251] eta 0:03:11 lr 0.000082 wd 0.0500 time 0.2388 (0.2389) data time 0.0010 (0.0029) model time 0.2378 (0.2354) loss 3.3623 (2.7657) grad_norm 5.0420 (5.1044) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][460/1251] eta 0:03:08 lr 0.000082 wd 0.0500 time 0.2313 (0.2388) data time 0.0008 (0.0029) model time 0.2305 (0.2354) loss 2.9988 (2.7691) grad_norm 13.0543 (5.1264) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][470/1251] eta 0:03:06 lr 0.000082 wd 0.0500 time 0.2271 (0.2387) data time 0.0010 (0.0028) model time 0.2261 (0.2353) loss 2.5250 (2.7764) grad_norm 4.1630 (5.1220) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][480/1251] eta 0:03:03 lr 0.000082 wd 0.0500 time 0.2397 (0.2386) data time 0.0009 (0.0028) model time 0.2388 (0.2353) loss 3.2679 (2.7807) grad_norm 3.9558 (5.1129) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][490/1251] eta 0:03:01 lr 0.000082 wd 0.0500 time 0.2322 (0.2386) data time 0.0010 (0.0028) model time 0.2312 (0.2353) loss 3.1515 (2.7832) grad_norm 4.3617 (5.0870) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][500/1251] eta 0:02:59 lr 0.000082 wd 0.0500 time 0.2337 (0.2387) data time 0.0007 (0.0027) model time 0.2329 (0.2355) loss 2.2285 (2.7832) grad_norm 3.5547 (5.0805) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][510/1251] eta 0:02:56 lr 0.000082 wd 0.0500 time 0.2373 (0.2386) 
data time 0.0007 (0.0027) model time 0.2366 (0.2355) loss 2.3211 (2.7812) grad_norm 12.2291 (5.0788) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][520/1251] eta 0:02:54 lr 0.000082 wd 0.0500 time 0.2345 (0.2388) data time 0.0009 (0.0027) model time 0.2336 (0.2357) loss 3.2690 (2.7778) grad_norm 4.0550 (5.0571) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][530/1251] eta 0:02:52 lr 0.000082 wd 0.0500 time 0.2335 (0.2387) data time 0.0012 (0.0026) model time 0.2323 (0.2356) loss 1.7578 (2.7751) grad_norm 3.1365 (5.0469) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][540/1251] eta 0:02:49 lr 0.000082 wd 0.0500 time 0.2452 (0.2386) data time 0.0013 (0.0026) model time 0.2439 (0.2355) loss 2.5992 (2.7744) grad_norm 4.0888 (5.0353) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][550/1251] eta 0:02:47 lr 0.000082 wd 0.0500 time 0.2359 (0.2385) data time 0.0008 (0.0026) model time 0.2351 (0.2355) loss 3.1445 (2.7779) grad_norm 4.1376 (5.0187) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][560/1251] eta 0:02:44 lr 0.000082 wd 0.0500 time 0.2365 (0.2385) data time 0.0009 (0.0026) model time 0.2356 (0.2356) loss 3.0835 (2.7713) grad_norm 6.2642 (5.0023) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][570/1251] eta 0:02:42 lr 0.000082 wd 0.0500 time 0.2366 (0.2386) data time 0.0010 (0.0025) model time 0.2356 (0.2356) loss 2.8492 (2.7708) grad_norm 5.1778 (5.0089) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [251/300][580/1251] eta 0:02:40 lr 0.000082 wd 0.0500 time 0.2297 (0.2385) data time 0.0009 (0.0025) model time 0.2288 (0.2356) loss 3.1019 (2.7746) grad_norm 2.7770 (4.9895) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][590/1251] eta 0:02:37 lr 0.000082 wd 0.0500 time 0.2371 (0.2385) data time 0.0007 (0.0025) model time 0.2364 (0.2356) loss 3.2505 (2.7775) grad_norm 6.4869 (4.9838) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][600/1251] eta 0:02:35 lr 0.000082 wd 0.0500 time 0.2335 (0.2385) data time 0.0010 (0.0025) model time 0.2325 (0.2356) loss 2.7157 (2.7780) grad_norm 5.7729 (5.0090) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][610/1251] eta 0:02:32 lr 0.000082 wd 0.0500 time 0.2385 (0.2384) data time 0.0008 (0.0024) model time 0.2377 (0.2356) loss 2.0672 (2.7773) grad_norm 4.0212 (5.0338) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][620/1251] eta 0:02:30 lr 0.000082 wd 0.0500 time 0.2440 (0.2384) data time 0.0008 (0.0024) model time 0.2433 (0.2356) loss 1.9261 (2.7785) grad_norm 4.5009 (5.0358) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][630/1251] eta 0:02:28 lr 0.000081 wd 0.0500 time 0.2391 (0.2384) data time 0.0009 (0.0024) model time 0.2382 (0.2356) loss 3.0342 (2.7750) grad_norm 6.6609 (5.0209) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][640/1251] eta 0:02:25 lr 0.000081 wd 0.0500 time 0.2341 (0.2383) data time 0.0011 (0.0024) model time 0.2331 (0.2356) loss 2.6226 (2.7721) grad_norm 4.0369 (5.0196) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 07:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][650/1251] eta 0:02:23 lr 0.000081 wd 0.0500 time 0.2268 (0.2386) data time 0.0010 (0.0023) model time 0.2259 (0.2360) loss 2.5148 (2.7739) grad_norm 3.2072 (5.0126) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][660/1251] eta 0:02:21 lr 0.000081 wd 0.0500 time 0.2386 (0.2392) data time 0.0011 (0.0023) model time 0.2376 (0.2366) loss 2.8371 (2.7707) grad_norm 4.1036 (5.0037) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][670/1251] eta 0:02:19 lr 0.000081 wd 0.0500 time 0.4516 (0.2395) data time 0.0010 (0.0023) model time 0.4506 (0.2370) loss 3.5145 (2.7732) grad_norm 3.7947 (4.9858) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][680/1251] eta 0:02:16 lr 0.000081 wd 0.0500 time 0.2392 (0.2395) data time 0.0007 (0.0023) model time 0.2384 (0.2370) loss 3.1028 (2.7738) grad_norm 3.6604 (4.9848) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][690/1251] eta 0:02:14 lr 0.000081 wd 0.0500 time 0.2327 (0.2394) data time 0.0009 (0.0023) model time 0.2318 (0.2369) loss 2.8950 (2.7751) grad_norm 4.9076 (4.9855) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][700/1251] eta 0:02:11 lr 0.000081 wd 0.0500 time 0.2353 (0.2394) data time 0.0007 (0.0022) model time 0.2347 (0.2369) loss 3.6369 (2.7787) grad_norm 3.8750 (4.9735) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][710/1251] eta 0:02:09 lr 0.000081 wd 0.0500 time 0.2383 (0.2394) data time 0.0010 (0.0022) model 
time 0.2374 (0.2369) loss 1.7819 (2.7792) grad_norm 5.3801 (4.9664) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][720/1251] eta 0:02:07 lr 0.000081 wd 0.0500 time 0.2364 (0.2394) data time 0.0007 (0.0022) model time 0.2356 (0.2369) loss 3.1016 (2.7789) grad_norm 6.2025 (4.9636) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][730/1251] eta 0:02:04 lr 0.000081 wd 0.0500 time 0.2376 (0.2393) data time 0.0007 (0.0022) model time 0.2369 (0.2369) loss 2.9666 (2.7809) grad_norm 4.6429 (4.9628) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][740/1251] eta 0:02:02 lr 0.000081 wd 0.0500 time 0.2292 (0.2393) data time 0.0007 (0.0022) model time 0.2285 (0.2369) loss 1.6595 (2.7792) grad_norm 6.7196 (4.9590) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][750/1251] eta 0:01:59 lr 0.000081 wd 0.0500 time 0.2436 (0.2392) data time 0.0009 (0.0022) model time 0.2427 (0.2368) loss 2.8702 (2.7789) grad_norm 2.6431 (4.9685) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][760/1251] eta 0:01:57 lr 0.000081 wd 0.0500 time 0.2334 (0.2392) data time 0.0008 (0.0022) model time 0.2325 (0.2368) loss 3.3825 (2.7797) grad_norm 4.5354 (4.9640) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][770/1251] eta 0:01:55 lr 0.000081 wd 0.0500 time 0.2334 (0.2392) data time 0.0010 (0.0021) model time 0.2324 (0.2368) loss 2.8818 (2.7789) grad_norm 3.6538 (4.9597) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][780/1251] 
eta 0:01:52 lr 0.000081 wd 0.0500 time 0.2378 (0.2391) data time 0.0010 (0.0021) model time 0.2368 (0.2368) loss 2.7624 (2.7798) grad_norm 4.1246 (4.9432) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][790/1251] eta 0:01:50 lr 0.000081 wd 0.0500 time 0.2248 (0.2391) data time 0.0009 (0.0021) model time 0.2240 (0.2368) loss 3.1535 (2.7767) grad_norm 3.3741 (4.9342) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][800/1251] eta 0:01:47 lr 0.000081 wd 0.0500 time 0.2380 (0.2391) data time 0.0011 (0.0021) model time 0.2369 (0.2368) loss 2.9388 (2.7735) grad_norm 3.5175 (4.9236) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][810/1251] eta 0:01:45 lr 0.000081 wd 0.0500 time 0.2387 (0.2390) data time 0.0007 (0.0021) model time 0.2379 (0.2367) loss 3.0304 (2.7742) grad_norm 4.1242 (4.9220) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][820/1251] eta 0:01:43 lr 0.000081 wd 0.0500 time 0.2365 (0.2390) data time 0.0012 (0.0021) model time 0.2353 (0.2367) loss 2.7724 (2.7748) grad_norm 4.9895 (4.9146) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][830/1251] eta 0:01:40 lr 0.000081 wd 0.0500 time 0.2488 (0.2390) data time 0.0009 (0.0021) model time 0.2479 (0.2367) loss 2.4913 (2.7746) grad_norm 3.7215 (4.9100) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][840/1251] eta 0:01:38 lr 0.000081 wd 0.0500 time 0.2322 (0.2389) data time 0.0013 (0.0020) model time 0.2309 (0.2367) loss 2.0325 (2.7706) grad_norm 4.2756 (4.9113) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
07:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][850/1251] eta 0:01:35 lr 0.000081 wd 0.0500 time 0.2360 (0.2390) data time 0.0009 (0.0020) model time 0.2351 (0.2367) loss 3.1202 (2.7699) grad_norm 3.9284 (4.9102) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][860/1251] eta 0:01:33 lr 0.000081 wd 0.0500 time 0.2380 (0.2390) data time 0.0007 (0.0020) model time 0.2372 (0.2367) loss 2.9022 (2.7699) grad_norm 5.6393 (4.9121) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][870/1251] eta 0:01:31 lr 0.000081 wd 0.0500 time 0.2295 (0.2389) data time 0.0007 (0.0020) model time 0.2289 (0.2367) loss 1.9220 (2.7701) grad_norm 6.5515 (4.9077) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][880/1251] eta 0:01:28 lr 0.000081 wd 0.0500 time 0.2369 (0.2389) data time 0.0007 (0.0020) model time 0.2362 (0.2367) loss 2.7837 (2.7695) grad_norm 6.3760 (4.9048) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][890/1251] eta 0:01:26 lr 0.000081 wd 0.0500 time 0.2549 (0.2389) data time 0.0011 (0.0020) model time 0.2538 (0.2367) loss 2.9409 (2.7708) grad_norm 2.9200 (4.9024) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][900/1251] eta 0:01:23 lr 0.000081 wd 0.0500 time 0.2329 (0.2390) data time 0.0008 (0.0020) model time 0.2320 (0.2368) loss 2.0686 (2.7720) grad_norm 3.1330 (4.8985) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][910/1251] eta 0:01:21 lr 0.000081 wd 0.0500 time 0.2437 (0.2390) data time 0.0010 (0.0020) model time 0.2426 (0.2368) loss 1.9791 
(2.7720) grad_norm 4.7426 (4.8924) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][920/1251] eta 0:01:19 lr 0.000081 wd 0.0500 time 0.2292 (0.2390) data time 0.0011 (0.0020) model time 0.2281 (0.2368) loss 3.1806 (2.7708) grad_norm 5.3977 (4.8878) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][930/1251] eta 0:01:16 lr 0.000081 wd 0.0500 time 0.2307 (0.2389) data time 0.0010 (0.0019) model time 0.2297 (0.2368) loss 3.1372 (2.7734) grad_norm 4.1260 (4.8840) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][940/1251] eta 0:01:14 lr 0.000081 wd 0.0500 time 0.2378 (0.2389) data time 0.0007 (0.0019) model time 0.2370 (0.2368) loss 3.6155 (2.7765) grad_norm 5.1382 (4.8850) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][950/1251] eta 0:01:11 lr 0.000081 wd 0.0500 time 0.2385 (0.2388) data time 0.0007 (0.0019) model time 0.2377 (0.2367) loss 3.1221 (2.7791) grad_norm 5.1706 (4.8818) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][960/1251] eta 0:01:09 lr 0.000081 wd 0.0500 time 0.2279 (0.2388) data time 0.0011 (0.0019) model time 0.2268 (0.2367) loss 2.8207 (2.7799) grad_norm 5.5967 (4.8785) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][970/1251] eta 0:01:07 lr 0.000081 wd 0.0500 time 0.2381 (0.2388) data time 0.0009 (0.0019) model time 0.2373 (0.2367) loss 3.3385 (2.7809) grad_norm 3.6319 (4.8753) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][980/1251] eta 0:01:04 lr 0.000081 wd 0.0500 
time 0.2344 (0.2388) data time 0.0008 (0.0019) model time 0.2336 (0.2367) loss 2.4706 (2.7806) grad_norm 3.5474 (4.8915) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][990/1251] eta 0:01:02 lr 0.000081 wd 0.0500 time 0.2285 (0.2387) data time 0.0009 (0.0019) model time 0.2276 (0.2367) loss 3.2956 (2.7803) grad_norm 6.1520 (4.8893) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1000/1251] eta 0:00:59 lr 0.000081 wd 0.0500 time 0.2352 (0.2387) data time 0.0008 (0.0019) model time 0.2343 (0.2366) loss 2.9406 (2.7791) grad_norm 9.2020 (4.9326) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1010/1251] eta 0:00:57 lr 0.000081 wd 0.0500 time 0.2343 (0.2387) data time 0.0009 (0.0019) model time 0.2334 (0.2366) loss 2.2555 (2.7795) grad_norm 5.8132 (4.9287) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1020/1251] eta 0:00:55 lr 0.000081 wd 0.0500 time 0.2411 (0.2386) data time 0.0008 (0.0019) model time 0.2403 (0.2366) loss 3.3400 (2.7795) grad_norm 3.7225 (4.9240) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1030/1251] eta 0:00:52 lr 0.000081 wd 0.0500 time 0.2293 (0.2386) data time 0.0007 (0.0019) model time 0.2286 (0.2366) loss 3.0716 (2.7797) grad_norm 4.9753 (4.9220) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1040/1251] eta 0:00:50 lr 0.000081 wd 0.0500 time 0.2312 (0.2385) data time 0.0009 (0.0018) model time 0.2303 (0.2365) loss 2.5790 (2.7785) grad_norm 4.0770 (4.9144) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:47 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [251/300][1050/1251] eta 0:00:47 lr 0.000081 wd 0.0500 time 0.2316 (0.2385) data time 0.0010 (0.0018) model time 0.2305 (0.2365) loss 2.7416 (2.7786) grad_norm 3.8400 (4.9047) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1060/1251] eta 0:00:45 lr 0.000081 wd 0.0500 time 0.2369 (0.2385) data time 0.0009 (0.0018) model time 0.2360 (0.2365) loss 3.1518 (2.7791) grad_norm 4.2825 (4.8967) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1070/1251] eta 0:00:43 lr 0.000080 wd 0.0500 time 0.2295 (0.2385) data time 0.0008 (0.0018) model time 0.2286 (0.2365) loss 3.0952 (2.7790) grad_norm 4.1916 (4.8932) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1080/1251] eta 0:00:40 lr 0.000080 wd 0.0500 time 0.2320 (0.2384) data time 0.0009 (0.0018) model time 0.2310 (0.2364) loss 2.7555 (2.7803) grad_norm 3.2715 (4.8838) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1090/1251] eta 0:00:38 lr 0.000080 wd 0.0500 time 0.2397 (0.2384) data time 0.0007 (0.0018) model time 0.2390 (0.2364) loss 3.0927 (2.7821) grad_norm 4.2214 (4.8772) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1100/1251] eta 0:00:36 lr 0.000080 wd 0.0500 time 0.2329 (0.2384) data time 0.0009 (0.0018) model time 0.2320 (0.2364) loss 2.2397 (2.7806) grad_norm 5.8963 (4.8757) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1110/1251] eta 0:00:33 lr 0.000080 wd 0.0500 time 0.2364 (0.2384) data time 0.0010 (0.0018) model time 0.2354 (0.2364) loss 2.9795 (2.7793) grad_norm 4.0670 
(4.8775) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1120/1251] eta 0:00:31 lr 0.000080 wd 0.0500 time 0.2320 (0.2384) data time 0.0010 (0.0018) model time 0.2310 (0.2364) loss 2.8672 (2.7788) grad_norm 6.4501 (4.8784) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1130/1251] eta 0:00:28 lr 0.000080 wd 0.0500 time 0.2390 (0.2384) data time 0.0010 (0.0018) model time 0.2380 (0.2364) loss 2.6928 (2.7778) grad_norm 4.3170 (4.8800) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1140/1251] eta 0:00:26 lr 0.000080 wd 0.0500 time 0.2254 (0.2384) data time 0.0009 (0.0018) model time 0.2245 (0.2364) loss 2.4782 (2.7765) grad_norm 3.6662 (4.8744) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1150/1251] eta 0:00:24 lr 0.000080 wd 0.0500 time 0.2342 (0.2383) data time 0.0010 (0.0018) model time 0.2332 (0.2364) loss 3.2174 (2.7749) grad_norm 6.3627 (4.8778) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1160/1251] eta 0:00:21 lr 0.000080 wd 0.0500 time 0.2300 (0.2383) data time 0.0009 (0.0018) model time 0.2291 (0.2363) loss 3.1543 (2.7761) grad_norm 18.7641 (4.8830) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1170/1251] eta 0:00:19 lr 0.000080 wd 0.0500 time 0.2386 (0.2382) data time 0.0010 (0.0018) model time 0.2376 (0.2363) loss 2.9406 (2.7787) grad_norm 3.3049 (4.8783) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1180/1251] eta 0:00:16 lr 0.000080 wd 0.0500 time 0.2372 
(0.2382) data time 0.0011 (0.0018) model time 0.2361 (0.2363) loss 2.0556 (2.7783) grad_norm 5.3273 (4.8788) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1190/1251] eta 0:00:14 lr 0.000080 wd 0.0500 time 0.2321 (0.2385) data time 0.0008 (0.0017) model time 0.2313 (0.2365) loss 2.0572 (2.7765) grad_norm 3.3027 (4.8724) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1200/1251] eta 0:00:12 lr 0.000080 wd 0.0500 time 0.2344 (0.2388) data time 0.0009 (0.0017) model time 0.2336 (0.2369) loss 2.4138 (2.7751) grad_norm 4.5934 (4.8691) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1210/1251] eta 0:00:09 lr 0.000080 wd 0.0500 time 0.2417 (0.2388) data time 0.0009 (0.0017) model time 0.2408 (0.2369) loss 3.0823 (2.7757) grad_norm 6.2692 (4.8882) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1220/1251] eta 0:00:07 lr 0.000080 wd 0.0500 time 0.2424 (0.2388) data time 0.0008 (0.0017) model time 0.2416 (0.2369) loss 2.1753 (2.7750) grad_norm 5.9965 (4.8943) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1230/1251] eta 0:00:05 lr 0.000080 wd 0.0500 time 0.2358 (0.2388) data time 0.0010 (0.0017) model time 0.2348 (0.2369) loss 2.9025 (2.7733) grad_norm 4.6083 (4.9026) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [251/300][1240/1251] eta 0:00:02 lr 0.000080 wd 0.0500 time 0.2222 (0.2387) data time 0.0005 (0.0017) model time 0.2217 (0.2368) loss 1.6114 (2.7714) grad_norm 4.9052 (4.9000) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:35 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [251/300][1250/1251] eta 0:00:00 lr 0.000080 wd 0.0500 time 0.2229 (0.2386) data time 0.0007 (0.0017) model time 0.2221 (0.2367) loss 2.2034 (2.7708) grad_norm 7.9575 (4.9039) loss_scale 256.0000 (256.0000) mem 7381MB
[2024-09-01 07:45:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 251 training takes 0:04:58
[2024-09-01 07:45:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-09-01 07:45:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-09-01 07:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.543 (0.543) Loss 0.3997 (0.3997) Acc@1 93.066 (93.066) Acc@5 98.535 (98.535) Mem 7381MB
[2024-09-01 07:45:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.094 (0.118) Loss 0.6182 (0.6274) Acc@1 88.379 (87.118) Acc@5 97.754 (97.621) Mem 7381MB
[2024-09-01 07:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.097) Loss 0.9419 (0.6566) Acc@1 77.051 (86.082) Acc@5 95.703 (97.596) Mem 7381MB
[2024-09-01 07:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.094) Loss 1.1719 (0.7478) Acc@1 72.656 (83.912) Acc@5 92.188 (96.607) Mem 7381MB
[2024-09-01 07:45:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.088) Loss 1.0273 (0.7972) Acc@1 77.246 (82.793) Acc@5 93.750 (96.082) Mem 7381MB
[2024-09-01 07:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.390 Acc@5 96.012
[2024-09-01 07:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.4%
[2024-09-01 07:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.39%
[2024-09-01 07:45:40 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-09-01 07:45:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-09-01 07:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.567 (0.567) Loss 0.3813 (0.3813) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB
[2024-09-01 07:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.064 (0.134) Loss 0.5684 (0.5997) Acc@1 89.941 (87.615) Acc@5 98.145 (97.869) Mem 7381MB
[2024-09-01 07:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.108 (0.139) Loss 0.8911 (0.6303) Acc@1 78.027 (86.514) Acc@5 96.289 (97.768) Mem 7381MB
[2024-09-01 07:45:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.122) Loss 1.1025 (0.7170) Acc@1 74.316 (84.394) Acc@5 93.066 (96.881) Mem 7381MB
[2024-09-01 07:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.108) Loss 0.9927 (0.7632) Acc@1 76.367 (83.282) Acc@5 94.336 (96.384) Mem 7381MB
[2024-09-01 07:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.944 Acc@5 96.342
[2024-09-01 07:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9%
[2024-09-01 07:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.94%
[2024-09-01 07:45:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-09-01 07:45:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-09-01 07:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][0/1251] eta 0:16:43 lr 0.000080 wd 0.0500 time 0.8025 (0.8025) data time 0.5174 (0.5174) model time 0.0000 (0.0000) loss 2.0449 (2.0449) grad_norm 4.1413 (4.1413) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][10/1251] eta 0:05:56 lr 0.000080 wd 0.0500 time 0.2381 (0.2876) data time 0.0010 (0.0480) model time 0.0000 (0.0000) loss 2.9360 (2.5537) grad_norm 3.8766 (5.2250) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][20/1251] eta 0:05:23 lr 0.000080 wd 0.0500 time 0.2251 (0.2632) data time 0.0010 (0.0257) model time 0.0000 (0.0000) loss 3.2465 (2.5653) grad_norm 2.8070 (4.5646) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][30/1251] eta 0:05:10 lr 0.000080 wd 0.0500 time 0.2452 (0.2539) data time 0.0011 (0.0177) model time 0.0000 (0.0000) loss 2.8257 (2.6525) grad_norm 3.2967 (4.6630) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][40/1251] eta 0:05:02 lr 0.000080 wd 0.0500 time 0.2380 (0.2498) data time 0.0009 (0.0136) model time 0.0000 (0.0000) loss 2.2009 (2.6473) grad_norm 5.2177 (4.8127) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][50/1251] eta 0:04:55 lr 0.000080 wd 0.0500 time 0.2349 (0.2463) data time 0.0007 (0.0112) model time 0.0000 (0.0000) loss 3.4792 (2.6768) grad_norm 4.7540 (4.7885) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][60/1251] eta 0:04:51 lr 0.000080 wd 0.0500 time 0.2316 (0.2446) data time 0.0013 (0.0095) model time 0.2304 (0.2352) loss 
2.3743 (2.6944) grad_norm 4.5811 (4.6994) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][70/1251] eta 0:04:47 lr 0.000080 wd 0.0500 time 0.2303 (0.2432) data time 0.0011 (0.0083) model time 0.2292 (0.2344) loss 3.0405 (2.7235) grad_norm 6.2379 (4.6795) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][80/1251] eta 0:04:43 lr 0.000080 wd 0.0500 time 0.2389 (0.2424) data time 0.0011 (0.0074) model time 0.2378 (0.2348) loss 2.9215 (2.7035) grad_norm 4.0542 (4.5788) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][90/1251] eta 0:04:40 lr 0.000080 wd 0.0500 time 0.2429 (0.2418) data time 0.0010 (0.0067) model time 0.2419 (0.2350) loss 2.8664 (2.7425) grad_norm 4.2291 (5.9267) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][100/1251] eta 0:04:39 lr 0.000080 wd 0.0500 time 0.2301 (0.2430) data time 0.0008 (0.0061) model time 0.2293 (0.2387) loss 1.8996 (2.7431) grad_norm 3.9259 (5.7536) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][110/1251] eta 0:04:36 lr 0.000080 wd 0.0500 time 0.2368 (0.2423) data time 0.0009 (0.0057) model time 0.2359 (0.2379) loss 1.7791 (2.7441) grad_norm 4.5126 (5.6315) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][120/1251] eta 0:04:33 lr 0.000080 wd 0.0500 time 0.2275 (0.2416) data time 0.0011 (0.0053) model time 0.2264 (0.2372) loss 2.7911 (2.7396) grad_norm 4.1846 (5.5243) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][130/1251] eta 0:04:30 lr 0.000080 wd 
0.0500 time 0.2405 (0.2411) data time 0.0006 (0.0050) model time 0.2398 (0.2368) loss 2.5914 (2.7411) grad_norm 6.2305 (5.4387) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][140/1251] eta 0:04:27 lr 0.000080 wd 0.0500 time 0.2407 (0.2408) data time 0.0010 (0.0047) model time 0.2397 (0.2368) loss 3.2653 (2.7435) grad_norm 4.3383 (5.4972) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][150/1251] eta 0:04:24 lr 0.000080 wd 0.0500 time 0.2326 (0.2404) data time 0.0010 (0.0044) model time 0.2315 (0.2365) loss 2.9697 (2.7373) grad_norm 6.9052 (5.4841) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][160/1251] eta 0:04:21 lr 0.000080 wd 0.0500 time 0.2325 (0.2400) data time 0.0006 (0.0042) model time 0.2319 (0.2362) loss 2.5488 (2.7177) grad_norm 4.0250 (5.3983) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][170/1251] eta 0:04:19 lr 0.000080 wd 0.0500 time 0.2500 (0.2398) data time 0.0007 (0.0040) model time 0.2493 (0.2360) loss 3.1933 (2.7205) grad_norm 2.7879 (5.3321) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][180/1251] eta 0:04:16 lr 0.000080 wd 0.0500 time 0.2375 (0.2396) data time 0.0009 (0.0039) model time 0.2366 (0.2360) loss 3.2451 (2.7304) grad_norm 3.5493 (5.3232) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][190/1251] eta 0:04:14 lr 0.000080 wd 0.0500 time 0.2372 (0.2395) data time 0.0009 (0.0037) model time 0.2363 (0.2360) loss 2.6179 (2.7316) grad_norm 3.0419 (5.2861) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:34 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [252/300][200/1251] eta 0:04:11 lr 0.000080 wd 0.0500 time 0.2325 (0.2393) data time 0.0009 (0.0036) model time 0.2316 (0.2359) loss 2.8031 (2.7446) grad_norm 3.0471 (5.2459) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][210/1251] eta 0:04:08 lr 0.000080 wd 0.0500 time 0.2284 (0.2391) data time 0.0009 (0.0035) model time 0.2275 (0.2358) loss 3.6572 (2.7532) grad_norm 4.5610 (5.1957) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][220/1251] eta 0:04:06 lr 0.000080 wd 0.0500 time 0.2285 (0.2388) data time 0.0013 (0.0034) model time 0.2273 (0.2356) loss 2.3348 (2.7593) grad_norm 8.1614 (5.1860) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][230/1251] eta 0:04:03 lr 0.000080 wd 0.0500 time 0.2289 (0.2387) data time 0.0006 (0.0033) model time 0.2282 (0.2355) loss 3.0171 (2.7534) grad_norm 5.8505 (5.1578) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][240/1251] eta 0:04:01 lr 0.000080 wd 0.0500 time 0.2362 (0.2386) data time 0.0009 (0.0032) model time 0.2354 (0.2355) loss 2.1087 (2.7484) grad_norm 5.9951 (5.2196) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][250/1251] eta 0:03:59 lr 0.000079 wd 0.0500 time 0.2380 (0.2393) data time 0.0009 (0.0031) model time 0.2371 (0.2365) loss 3.1239 (2.7424) grad_norm 3.6513 (5.2299) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][260/1251] eta 0:04:02 lr 0.000079 wd 0.0500 time 0.2402 (0.2443) data time 0.0011 (0.0030) model time 0.2391 (0.2428) loss 2.5070 (2.7445) grad_norm 5.4462 
(5.2181) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][270/1251] eta 0:03:59 lr 0.000079 wd 0.0500 time 0.2335 (0.2440) data time 0.0007 (0.0029) model time 0.2328 (0.2424) loss 2.2039 (2.7440) grad_norm 5.5132 (5.2688) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][280/1251] eta 0:03:56 lr 0.000079 wd 0.0500 time 0.2313 (0.2437) data time 0.0007 (0.0029) model time 0.2306 (0.2421) loss 2.6232 (2.7499) grad_norm 4.5493 (5.2389) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][290/1251] eta 0:03:53 lr 0.000079 wd 0.0500 time 0.2449 (0.2435) data time 0.0008 (0.0028) model time 0.2441 (0.2418) loss 3.2584 (2.7497) grad_norm 2.8300 (5.3955) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][300/1251] eta 0:03:51 lr 0.000079 wd 0.0500 time 0.2399 (0.2433) data time 0.0010 (0.0027) model time 0.2389 (0.2416) loss 2.9640 (2.7491) grad_norm 4.3603 (5.3638) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][310/1251] eta 0:03:48 lr 0.000079 wd 0.0500 time 0.2529 (0.2431) data time 0.0009 (0.0027) model time 0.2520 (0.2414) loss 1.8678 (2.7529) grad_norm 4.7769 (5.3363) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:47:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][320/1251] eta 0:03:46 lr 0.000079 wd 0.0500 time 0.2504 (0.2430) data time 0.0008 (0.0026) model time 0.2496 (0.2413) loss 2.7819 (2.7528) grad_norm 3.2813 (inf) loss_scale 128.0000 (254.4050) mem 7381MB [2024-09-01 07:47:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][330/1251] eta 0:03:43 lr 0.000079 wd 0.0500 time 0.2360 (0.2428) data 
time 0.0008 (0.0026) model time 0.2352 (0.2412) loss 2.7957 (2.7651) grad_norm 4.0188 (inf) loss_scale 128.0000 (250.5861) mem 7381MB [2024-09-01 07:47:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][340/1251] eta 0:03:41 lr 0.000079 wd 0.0500 time 0.2306 (0.2426) data time 0.0011 (0.0025) model time 0.2295 (0.2410) loss 2.7797 (2.7642) grad_norm 3.2593 (inf) loss_scale 128.0000 (246.9912) mem 7381MB [2024-09-01 07:47:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][350/1251] eta 0:03:38 lr 0.000079 wd 0.0500 time 0.2350 (0.2425) data time 0.0008 (0.0025) model time 0.2342 (0.2408) loss 3.1840 (2.7716) grad_norm 3.0968 (inf) loss_scale 128.0000 (243.6011) mem 7381MB [2024-09-01 07:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][360/1251] eta 0:03:36 lr 0.000079 wd 0.0500 time 0.2357 (0.2424) data time 0.0009 (0.0025) model time 0.2348 (0.2408) loss 2.4160 (2.7579) grad_norm 3.3146 (inf) loss_scale 128.0000 (240.3989) mem 7381MB [2024-09-01 07:47:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][370/1251] eta 0:03:33 lr 0.000079 wd 0.0500 time 0.2268 (0.2423) data time 0.0011 (0.0024) model time 0.2256 (0.2406) loss 3.1078 (2.7541) grad_norm 3.5512 (inf) loss_scale 128.0000 (237.3693) mem 7381MB [2024-09-01 07:47:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][380/1251] eta 0:03:30 lr 0.000079 wd 0.0500 time 0.2296 (0.2421) data time 0.0012 (0.0024) model time 0.2284 (0.2404) loss 2.7409 (2.7506) grad_norm 3.9557 (inf) loss_scale 128.0000 (234.4987) mem 7381MB [2024-09-01 07:47:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][390/1251] eta 0:03:29 lr 0.000079 wd 0.0500 time 0.2990 (0.2430) data time 0.0012 (0.0024) model time 0.2979 (0.2415) loss 2.6597 (2.7475) grad_norm 3.3097 (inf) loss_scale 128.0000 (231.7749) mem 7381MB [2024-09-01 07:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[252/300][400/1251] eta 0:03:27 lr 0.000079 wd 0.0500 time 0.4436 (0.2441) data time 0.0009 (0.0023) model time 0.4426 (0.2427) loss 2.9016 (2.7488) grad_norm 4.7951 (inf) loss_scale 128.0000 (229.1870) mem 7381MB [2024-09-01 07:47:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][410/1251] eta 0:03:25 lr 0.000079 wd 0.0500 time 0.2439 (0.2442) data time 0.0009 (0.0023) model time 0.2430 (0.2428) loss 2.6075 (2.7480) grad_norm 4.4006 (inf) loss_scale 128.0000 (226.7251) mem 7381MB [2024-09-01 07:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][420/1251] eta 0:03:22 lr 0.000079 wd 0.0500 time 0.2454 (0.2440) data time 0.0011 (0.0023) model time 0.2442 (0.2426) loss 1.7354 (2.7490) grad_norm 3.6441 (inf) loss_scale 128.0000 (224.3800) mem 7381MB [2024-09-01 07:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][430/1251] eta 0:03:21 lr 0.000079 wd 0.0500 time 0.4969 (0.2454) data time 0.0007 (0.0023) model time 0.4963 (0.2442) loss 2.4256 (2.7438) grad_norm 3.3461 (inf) loss_scale 128.0000 (222.1439) mem 7381MB [2024-09-01 07:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][440/1251] eta 0:03:22 lr 0.000079 wd 0.0500 time 0.2712 (0.2491) data time 0.0012 (0.0022) model time 0.2700 (0.2484) loss 2.4203 (2.7441) grad_norm 4.4381 (inf) loss_scale 128.0000 (220.0091) mem 7381MB [2024-09-01 07:47:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][450/1251] eta 0:03:19 lr 0.000079 wd 0.0500 time 0.2348 (0.2490) data time 0.0012 (0.0022) model time 0.2336 (0.2483) loss 1.9270 (2.7421) grad_norm 7.7460 (inf) loss_scale 128.0000 (217.9690) mem 7381MB [2024-09-01 07:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][460/1251] eta 0:03:16 lr 0.000079 wd 0.0500 time 0.2324 (0.2486) data time 0.0010 (0.0022) model time 0.2313 (0.2479) loss 2.7695 (2.7401) grad_norm 3.3224 (inf) loss_scale 128.0000 (216.0174) mem 7381MB [2024-09-01 
07:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][470/1251] eta 0:03:18 lr 0.000079 wd 0.0500 time 0.2362 (0.2546) data time 0.0008 (0.0021) model time 0.2354 (0.2545) loss 2.1430 (2.7376) grad_norm 4.3709 (inf) loss_scale 128.0000 (214.1486) mem 7381MB [2024-09-01 07:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][480/1251] eta 0:03:16 lr 0.000079 wd 0.0500 time 0.2278 (0.2553) data time 0.0007 (0.0021) model time 0.2271 (0.2553) loss 2.8534 (2.7379) grad_norm 7.1858 (inf) loss_scale 128.0000 (212.3576) mem 7381MB [2024-09-01 07:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][490/1251] eta 0:03:13 lr 0.000079 wd 0.0500 time 0.2343 (0.2549) data time 0.0010 (0.0021) model time 0.2333 (0.2549) loss 3.3192 (2.7400) grad_norm 3.7894 (inf) loss_scale 128.0000 (210.6395) mem 7381MB [2024-09-01 07:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][500/1251] eta 0:03:11 lr 0.000079 wd 0.0500 time 0.2320 (0.2545) data time 0.0012 (0.0021) model time 0.2308 (0.2544) loss 3.1363 (2.7402) grad_norm 4.2353 (inf) loss_scale 128.0000 (208.9900) mem 7381MB [2024-09-01 07:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][510/1251] eta 0:03:08 lr 0.000079 wd 0.0500 time 0.2467 (0.2542) data time 0.0008 (0.0021) model time 0.2459 (0.2541) loss 2.8080 (2.7399) grad_norm 3.6219 (inf) loss_scale 128.0000 (207.4051) mem 7381MB [2024-09-01 07:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][520/1251] eta 0:03:05 lr 0.000079 wd 0.0500 time 0.2451 (0.2539) data time 0.0009 (0.0020) model time 0.2442 (0.2537) loss 2.1045 (2.7392) grad_norm 6.1193 (inf) loss_scale 128.0000 (205.8810) mem 7381MB [2024-09-01 07:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][530/1251] eta 0:03:02 lr 0.000079 wd 0.0500 time 0.2435 (0.2536) data time 0.0008 (0.0020) model time 0.2426 (0.2533) loss 2.7728 (2.7352) grad_norm 
4.1962 (inf) loss_scale 128.0000 (204.4143) mem 7381MB [2024-09-01 07:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][540/1251] eta 0:03:00 lr 0.000079 wd 0.0500 time 0.2411 (0.2534) data time 0.0009 (0.0020) model time 0.2402 (0.2531) loss 2.1426 (2.7323) grad_norm 4.3798 (inf) loss_scale 128.0000 (203.0018) mem 7381MB [2024-09-01 07:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][550/1251] eta 0:02:57 lr 0.000079 wd 0.0500 time 0.2363 (0.2530) data time 0.0007 (0.0020) model time 0.2356 (0.2527) loss 2.8555 (2.7353) grad_norm 3.0078 (inf) loss_scale 128.0000 (201.6407) mem 7381MB [2024-09-01 07:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][560/1251] eta 0:02:54 lr 0.000079 wd 0.0500 time 0.2375 (0.2527) data time 0.0009 (0.0020) model time 0.2366 (0.2523) loss 2.7877 (2.7336) grad_norm 4.0405 (inf) loss_scale 128.0000 (200.3280) mem 7381MB [2024-09-01 07:48:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][570/1251] eta 0:02:51 lr 0.000079 wd 0.0500 time 0.2300 (0.2524) data time 0.0010 (0.0019) model time 0.2290 (0.2520) loss 3.1498 (2.7360) grad_norm 3.9464 (inf) loss_scale 128.0000 (199.0613) mem 7381MB [2024-09-01 07:48:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][580/1251] eta 0:02:49 lr 0.000079 wd 0.0500 time 0.2377 (0.2522) data time 0.0008 (0.0019) model time 0.2369 (0.2518) loss 3.2856 (2.7354) grad_norm 10.3156 (inf) loss_scale 128.0000 (197.8382) mem 7381MB [2024-09-01 07:48:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][590/1251] eta 0:02:46 lr 0.000079 wd 0.0500 time 0.2418 (0.2519) data time 0.0010 (0.0019) model time 0.2408 (0.2514) loss 2.5061 (2.7364) grad_norm 3.6274 (inf) loss_scale 128.0000 (196.6565) mem 7381MB [2024-09-01 07:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][600/1251] eta 0:02:43 lr 0.000079 wd 0.0500 time 0.2406 (0.2517) data time 0.0009 
(0.0019) model time 0.2397 (0.2511) loss 3.5209 (2.7352) grad_norm 3.9362 (inf) loss_scale 128.0000 (195.5141) mem 7381MB [2024-09-01 07:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][610/1251] eta 0:02:41 lr 0.000079 wd 0.0500 time 0.2559 (0.2515) data time 0.0008 (0.0019) model time 0.2551 (0.2509) loss 1.6831 (2.7337) grad_norm 3.8574 (inf) loss_scale 128.0000 (194.4092) mem 7381MB [2024-09-01 07:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][620/1251] eta 0:02:38 lr 0.000079 wd 0.0500 time 0.2411 (0.2513) data time 0.0013 (0.0019) model time 0.2399 (0.2507) loss 2.5361 (2.7339) grad_norm 7.3412 (inf) loss_scale 128.0000 (193.3398) mem 7381MB [2024-09-01 07:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][630/1251] eta 0:02:35 lr 0.000079 wd 0.0500 time 0.2280 (0.2511) data time 0.0009 (0.0019) model time 0.2271 (0.2505) loss 2.0928 (2.7354) grad_norm 5.0005 (inf) loss_scale 128.0000 (192.3043) mem 7381MB [2024-09-01 07:48:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][640/1251] eta 0:02:33 lr 0.000079 wd 0.0500 time 0.2294 (0.2509) data time 0.0009 (0.0018) model time 0.2284 (0.2502) loss 2.9003 (2.7342) grad_norm 4.1430 (inf) loss_scale 128.0000 (191.3011) mem 7381MB [2024-09-01 07:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][650/1251] eta 0:02:30 lr 0.000079 wd 0.0500 time 0.2330 (0.2507) data time 0.0010 (0.0018) model time 0.2320 (0.2500) loss 2.7320 (2.7374) grad_norm 6.3223 (inf) loss_scale 128.0000 (190.3287) mem 7381MB [2024-09-01 07:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][660/1251] eta 0:02:28 lr 0.000079 wd 0.0500 time 0.2332 (0.2505) data time 0.0011 (0.0018) model time 0.2321 (0.2498) loss 2.1286 (2.7362) grad_norm 4.1313 (inf) loss_scale 128.0000 (189.3858) mem 7381MB [2024-09-01 07:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][670/1251] eta 
0:02:25 lr 0.000079 wd 0.0500 time 0.2403 (0.2503) data time 0.0010 (0.0018) model time 0.2393 (0.2496) loss 2.7976 (2.7360) grad_norm 5.9320 (inf) loss_scale 128.0000 (188.4709) mem 7381MB [2024-09-01 07:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][680/1251] eta 0:02:22 lr 0.000079 wd 0.0500 time 0.2462 (0.2501) data time 0.0011 (0.0018) model time 0.2451 (0.2494) loss 2.5718 (2.7376) grad_norm 5.5951 (inf) loss_scale 128.0000 (187.5830) mem 7381MB [2024-09-01 07:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][690/1251] eta 0:02:20 lr 0.000079 wd 0.0500 time 0.2408 (0.2499) data time 0.0007 (0.0018) model time 0.2401 (0.2491) loss 3.2603 (2.7385) grad_norm 3.3929 (inf) loss_scale 128.0000 (186.7207) mem 7381MB [2024-09-01 07:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][700/1251] eta 0:02:17 lr 0.000078 wd 0.0500 time 0.2358 (0.2497) data time 0.0008 (0.0018) model time 0.2349 (0.2489) loss 2.7834 (2.7381) grad_norm 4.2007 (inf) loss_scale 128.0000 (185.8830) mem 7381MB [2024-09-01 07:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][710/1251] eta 0:02:14 lr 0.000078 wd 0.0500 time 0.2370 (0.2495) data time 0.0009 (0.0018) model time 0.2362 (0.2487) loss 1.8896 (2.7337) grad_norm 3.8775 (inf) loss_scale 128.0000 (185.0689) mem 7381MB [2024-09-01 07:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][720/1251] eta 0:02:12 lr 0.000078 wd 0.0500 time 0.2373 (0.2493) data time 0.0008 (0.0018) model time 0.2365 (0.2485) loss 2.0871 (2.7354) grad_norm 4.0555 (inf) loss_scale 128.0000 (184.2774) mem 7381MB [2024-09-01 07:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][730/1251] eta 0:02:09 lr 0.000078 wd 0.0500 time 0.2340 (0.2491) data time 0.0009 (0.0017) model time 0.2331 (0.2482) loss 2.9741 (2.7368) grad_norm 3.5257 (inf) loss_scale 128.0000 (183.5075) mem 7381MB [2024-09-01 07:48:50 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][740/1251] eta 0:02:07 lr 0.000078 wd 0.0500 time 0.2416 (0.2488) data time 0.0010 (0.0017) model time 0.2405 (0.2480) loss 3.4421 (2.7407) grad_norm 4.3379 (inf) loss_scale 128.0000 (182.7584) mem 7381MB [2024-09-01 07:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][750/1251] eta 0:02:04 lr 0.000078 wd 0.0500 time 0.2306 (0.2486) data time 0.0008 (0.0017) model time 0.2297 (0.2478) loss 3.1352 (2.7417) grad_norm 5.0526 (inf) loss_scale 128.0000 (182.0293) mem 7381MB [2024-09-01 07:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][760/1251] eta 0:02:02 lr 0.000078 wd 0.0500 time 0.2364 (0.2485) data time 0.0008 (0.0017) model time 0.2356 (0.2476) loss 3.0499 (2.7439) grad_norm 3.8553 (inf) loss_scale 128.0000 (181.3193) mem 7381MB [2024-09-01 07:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][770/1251] eta 0:01:59 lr 0.000078 wd 0.0500 time 0.2352 (0.2484) data time 0.0010 (0.0017) model time 0.2342 (0.2475) loss 2.2326 (2.7441) grad_norm 5.0473 (inf) loss_scale 128.0000 (180.6278) mem 7381MB [2024-09-01 07:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][780/1251] eta 0:01:56 lr 0.000078 wd 0.0500 time 0.2317 (0.2482) data time 0.0007 (0.0017) model time 0.2310 (0.2473) loss 1.9227 (2.7467) grad_norm 3.8242 (inf) loss_scale 128.0000 (179.9539) mem 7381MB [2024-09-01 07:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][790/1251] eta 0:01:54 lr 0.000078 wd 0.0500 time 0.2415 (0.2481) data time 0.0010 (0.0017) model time 0.2405 (0.2472) loss 2.6812 (2.7460) grad_norm 6.0711 (inf) loss_scale 128.0000 (179.2971) mem 7381MB [2024-09-01 07:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][800/1251] eta 0:01:52 lr 0.000078 wd 0.0500 time 0.2376 (0.2485) data time 0.0007 (0.0017) model time 0.2368 (0.2476) loss 3.5591 (2.7470) grad_norm 4.9418 
(inf) loss_scale 128.0000 (178.6567) mem 7381MB [2024-09-01 07:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][810/1251] eta 0:01:50 lr 0.000078 wd 0.0500 time 0.2310 (0.2501) data time 0.0008 (0.0017) model time 0.2301 (0.2493) loss 3.4182 (2.7459) grad_norm 3.7893 (inf) loss_scale 128.0000 (178.0321) mem 7381MB [2024-09-01 07:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][820/1251] eta 0:01:47 lr 0.000078 wd 0.0500 time 0.2327 (0.2504) data time 0.0009 (0.0017) model time 0.2318 (0.2497) loss 2.5104 (2.7467) grad_norm 3.5462 (inf) loss_scale 128.0000 (177.4227) mem 7381MB [2024-09-01 07:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][830/1251] eta 0:01:45 lr 0.000078 wd 0.0500 time 0.2325 (0.2503) data time 0.0011 (0.0017) model time 0.2314 (0.2495) loss 2.6115 (2.7464) grad_norm 6.5910 (inf) loss_scale 128.0000 (176.8279) mem 7381MB [2024-09-01 07:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][840/1251] eta 0:01:42 lr 0.000078 wd 0.0500 time 0.2339 (0.2504) data time 0.0007 (0.0016) model time 0.2332 (0.2496) loss 2.9803 (2.7483) grad_norm 5.5029 (inf) loss_scale 128.0000 (176.2473) mem 7381MB [2024-09-01 07:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][850/1251] eta 0:01:40 lr 0.000078 wd 0.0500 time 0.2411 (0.2518) data time 0.0010 (0.0016) model time 0.2401 (0.2512) loss 3.8085 (2.7498) grad_norm 3.4993 (inf) loss_scale 128.0000 (175.6804) mem 7381MB [2024-09-01 07:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][860/1251] eta 0:01:38 lr 0.000078 wd 0.0500 time 0.2370 (0.2517) data time 0.0009 (0.0016) model time 0.2361 (0.2510) loss 3.0828 (2.7508) grad_norm 4.3966 (inf) loss_scale 128.0000 (175.1266) mem 7381MB [2024-09-01 07:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][870/1251] eta 0:01:35 lr 0.000078 wd 0.0500 time 0.2312 (0.2515) data time 0.0008 (0.0016) 
model time 0.2303 (0.2508) loss 3.1071 (2.7509) grad_norm 4.3446 (inf) loss_scale 128.0000 (174.5855) mem 7381MB [2024-09-01 07:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][880/1251] eta 0:01:33 lr 0.000078 wd 0.0500 time 0.2310 (0.2513) data time 0.0007 (0.0016) model time 0.2303 (0.2506) loss 2.1894 (2.7490) grad_norm 3.6563 (inf) loss_scale 128.0000 (174.0568) mem 7381MB [2024-09-01 07:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][890/1251] eta 0:01:30 lr 0.000078 wd 0.0500 time 0.2426 (0.2512) data time 0.0010 (0.0016) model time 0.2417 (0.2504) loss 3.5180 (2.7504) grad_norm 3.5392 (inf) loss_scale 128.0000 (173.5398) mem 7381MB [2024-09-01 07:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][900/1251] eta 0:01:28 lr 0.000078 wd 0.0500 time 0.2320 (0.2510) data time 0.0010 (0.0016) model time 0.2310 (0.2502) loss 3.3361 (2.7523) grad_norm 3.9286 (inf) loss_scale 128.0000 (173.0344) mem 7381MB [2024-09-01 07:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][910/1251] eta 0:01:25 lr 0.000078 wd 0.0500 time 0.2295 (0.2508) data time 0.0010 (0.0016) model time 0.2285 (0.2501) loss 3.1365 (2.7520) grad_norm 5.7528 (inf) loss_scale 128.0000 (172.5401) mem 7381MB [2024-09-01 07:49:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][920/1251] eta 0:01:22 lr 0.000078 wd 0.0500 time 0.2307 (0.2507) data time 0.0007 (0.0016) model time 0.2299 (0.2499) loss 2.9720 (2.7536) grad_norm 4.7201 (inf) loss_scale 128.0000 (172.0565) mem 7381MB [2024-09-01 07:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][930/1251] eta 0:01:20 lr 0.000078 wd 0.0500 time 0.2409 (0.2505) data time 0.0007 (0.0016) model time 0.2402 (0.2497) loss 2.4116 (2.7536) grad_norm 4.3638 (inf) loss_scale 128.0000 (171.5832) mem 7381MB [2024-09-01 07:49:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][940/1251] eta 0:01:17 lr 
0.000078 wd 0.0500 time 0.2399 (0.2504) data time 0.0009 (0.0016) model time 0.2390 (0.2496) loss 2.5222 (2.7537) grad_norm 3.6226 (inf) loss_scale 128.0000 (171.1201) mem 7381MB [2024-09-01 07:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][950/1251] eta 0:01:15 lr 0.000078 wd 0.0500 time 0.2347 (0.2502) data time 0.0007 (0.0016) model time 0.2340 (0.2494) loss 1.9012 (2.7535) grad_norm 5.9183 (inf) loss_scale 128.0000 (170.6667) mem 7381MB [2024-09-01 07:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][960/1251] eta 0:01:12 lr 0.000078 wd 0.0500 time 0.2347 (0.2501) data time 0.0007 (0.0016) model time 0.2340 (0.2493) loss 3.2068 (2.7521) grad_norm 6.0535 (inf) loss_scale 128.0000 (170.2227) mem 7381MB [2024-09-01 07:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][970/1251] eta 0:01:10 lr 0.000078 wd 0.0500 time 0.2306 (0.2500) data time 0.0010 (0.0016) model time 0.2295 (0.2491) loss 2.5715 (2.7512) grad_norm 4.1036 (inf) loss_scale 128.0000 (169.7878) mem 7381MB [2024-09-01 07:49:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][980/1251] eta 0:01:07 lr 0.000078 wd 0.0500 time 0.2334 (0.2498) data time 0.0009 (0.0016) model time 0.2325 (0.2490) loss 2.0004 (2.7504) grad_norm 6.2176 (inf) loss_scale 128.0000 (169.3619) mem 7381MB [2024-09-01 07:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][990/1251] eta 0:01:05 lr 0.000078 wd 0.0500 time 0.2351 (0.2497) data time 0.0012 (0.0016) model time 0.2340 (0.2488) loss 2.4923 (2.7508) grad_norm 4.6302 (inf) loss_scale 128.0000 (168.9445) mem 7381MB [2024-09-01 07:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1000/1251] eta 0:01:02 lr 0.000078 wd 0.0500 time 0.2439 (0.2500) data time 0.0010 (0.0015) model time 0.2430 (0.2491) loss 2.7580 (2.7515) grad_norm 5.4973 (inf) loss_scale 128.0000 (168.5355) mem 7381MB [2024-09-01 07:49:59 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [252/300][1010/1251] eta 0:01:00 lr 0.000078 wd 0.0500 time 0.2459 (0.2498) data time 0.0008 (0.0015) model time 0.2451 (0.2490) loss 3.1976 (2.7520) grad_norm 3.9652 (inf) loss_scale 128.0000 (168.1345) mem 7381MB [2024-09-01 07:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1020/1251] eta 0:00:57 lr 0.000078 wd 0.0500 time 0.2324 (0.2497) data time 0.0008 (0.0015) model time 0.2316 (0.2488) loss 3.0227 (2.7556) grad_norm 4.8540 (inf) loss_scale 128.0000 (167.7414) mem 7381MB [2024-09-01 07:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1030/1251] eta 0:00:55 lr 0.000078 wd 0.0500 time 0.2459 (0.2498) data time 0.0007 (0.0015) model time 0.2452 (0.2489) loss 2.8114 (2.7559) grad_norm 3.5837 (inf) loss_scale 128.0000 (167.3560) mem 7381MB [2024-09-01 07:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1040/1251] eta 0:00:52 lr 0.000078 wd 0.0500 time 0.2390 (0.2496) data time 0.0008 (0.0015) model time 0.2382 (0.2488) loss 2.9473 (2.7560) grad_norm 4.6284 (inf) loss_scale 128.0000 (166.9779) mem 7381MB [2024-09-01 07:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1050/1251] eta 0:00:50 lr 0.000078 wd 0.0500 time 0.2333 (0.2495) data time 0.0007 (0.0015) model time 0.2326 (0.2486) loss 3.2202 (2.7574) grad_norm 3.8938 (inf) loss_scale 128.0000 (166.6070) mem 7381MB [2024-09-01 07:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1060/1251] eta 0:00:47 lr 0.000078 wd 0.0500 time 0.2327 (0.2493) data time 0.0008 (0.0015) model time 0.2319 (0.2485) loss 3.3270 (2.7585) grad_norm 6.3584 (inf) loss_scale 128.0000 (166.2432) mem 7381MB [2024-09-01 07:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1070/1251] eta 0:00:45 lr 0.000078 wd 0.0500 time 0.2430 (0.2492) data time 0.0010 (0.0015) model time 0.2419 (0.2483) loss 3.1099 (2.7595) grad_norm 5.8790 (inf) loss_scale 
128.0000 (165.8861) mem 7381MB [2024-09-01 07:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1080/1251] eta 0:00:42 lr 0.000078 wd 0.0500 time 0.2361 (0.2491) data time 0.0010 (0.0015) model time 0.2352 (0.2482) loss 3.0579 (2.7576) grad_norm 3.6874 (inf) loss_scale 128.0000 (165.5356) mem 7381MB [2024-09-01 07:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1090/1251] eta 0:00:40 lr 0.000078 wd 0.0500 time 0.2428 (0.2489) data time 0.0007 (0.0015) model time 0.2421 (0.2480) loss 3.4224 (2.7589) grad_norm 4.8551 (inf) loss_scale 128.0000 (165.1916) mem 7381MB [2024-09-01 07:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1100/1251] eta 0:00:37 lr 0.000078 wd 0.0500 time 0.2471 (0.2489) data time 0.0009 (0.0015) model time 0.2462 (0.2480) loss 3.0157 (2.7582) grad_norm 6.4158 (inf) loss_scale 128.0000 (164.8538) mem 7381MB [2024-09-01 07:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1110/1251] eta 0:00:35 lr 0.000078 wd 0.0500 time 0.2425 (0.2488) data time 0.0008 (0.0015) model time 0.2417 (0.2479) loss 1.8192 (2.7557) grad_norm 3.0796 (inf) loss_scale 128.0000 (164.5221) mem 7381MB [2024-09-01 07:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1120/1251] eta 0:00:32 lr 0.000078 wd 0.0500 time 0.2444 (0.2487) data time 0.0015 (0.0015) model time 0.2428 (0.2478) loss 2.5545 (2.7545) grad_norm 4.7429 (inf) loss_scale 128.0000 (164.1963) mem 7381MB [2024-09-01 07:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1130/1251] eta 0:00:30 lr 0.000078 wd 0.0500 time 0.2424 (0.2486) data time 0.0007 (0.0015) model time 0.2417 (0.2477) loss 2.8731 (2.7581) grad_norm 3.7137 (inf) loss_scale 128.0000 (163.8762) mem 7381MB [2024-09-01 07:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1140/1251] eta 0:00:27 lr 0.000077 wd 0.0500 time 0.2486 (0.2486) data time 0.0009 (0.0015) model 
time 0.2476 (0.2476) loss 2.7060 (2.7603) grad_norm 3.2143 (inf) loss_scale 128.0000 (163.5618) mem 7381MB [2024-09-01 07:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1150/1251] eta 0:00:25 lr 0.000077 wd 0.0500 time 0.2451 (0.2485) data time 0.0008 (0.0015) model time 0.2444 (0.2476) loss 2.9866 (2.7612) grad_norm 3.9759 (inf) loss_scale 128.0000 (163.2528) mem 7381MB [2024-09-01 07:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1160/1251] eta 0:00:22 lr 0.000077 wd 0.0500 time 0.2366 (0.2484) data time 0.0009 (0.0015) model time 0.2357 (0.2475) loss 2.0632 (2.7615) grad_norm 6.0807 (inf) loss_scale 128.0000 (162.9492) mem 7381MB [2024-09-01 07:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1170/1251] eta 0:00:20 lr 0.000077 wd 0.0500 time 0.2449 (0.2483) data time 0.0007 (0.0015) model time 0.2442 (0.2474) loss 2.3902 (2.7621) grad_norm 4.2281 (inf) loss_scale 128.0000 (162.6507) mem 7381MB [2024-09-01 07:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1180/1251] eta 0:00:17 lr 0.000077 wd 0.0500 time 0.2383 (0.2483) data time 0.0008 (0.0015) model time 0.2375 (0.2474) loss 2.8588 (2.7635) grad_norm 4.8380 (inf) loss_scale 128.0000 (162.3573) mem 7381MB [2024-09-01 07:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1190/1251] eta 0:00:15 lr 0.000077 wd 0.0500 time 0.2396 (0.2482) data time 0.0007 (0.0015) model time 0.2389 (0.2473) loss 2.8533 (2.7647) grad_norm 4.3700 (inf) loss_scale 128.0000 (162.0688) mem 7381MB [2024-09-01 07:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1200/1251] eta 0:00:12 lr 0.000077 wd 0.0500 time 0.2348 (0.2482) data time 0.0011 (0.0015) model time 0.2337 (0.2472) loss 2.2260 (2.7655) grad_norm 5.9630 (inf) loss_scale 128.0000 (161.7852) mem 7381MB [2024-09-01 07:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1210/1251] eta 0:00:10 lr 
0.000077 wd 0.0500 time 0.2469 (0.2481) data time 0.0010 (0.0015) model time 0.2459 (0.2472) loss 2.0354 (2.7652) grad_norm 6.1684 (inf) loss_scale 128.0000 (161.5062) mem 7381MB [2024-09-01 07:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1220/1251] eta 0:00:07 lr 0.000077 wd 0.0500 time 0.2406 (0.2481) data time 0.0007 (0.0015) model time 0.2398 (0.2471) loss 2.5680 (2.7659) grad_norm 7.3061 (inf) loss_scale 128.0000 (161.2318) mem 7381MB [2024-09-01 07:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1230/1251] eta 0:00:05 lr 0.000077 wd 0.0500 time 0.2386 (0.2480) data time 0.0007 (0.0014) model time 0.2379 (0.2471) loss 2.0838 (2.7641) grad_norm 4.7653 (inf) loss_scale 128.0000 (160.9618) mem 7381MB [2024-09-01 07:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1240/1251] eta 0:00:02 lr 0.000077 wd 0.0500 time 0.2287 (0.2479) data time 0.0007 (0.0014) model time 0.2280 (0.2469) loss 2.9538 (2.7635) grad_norm 4.6732 (inf) loss_scale 128.0000 (160.6962) mem 7381MB [2024-09-01 07:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [252/300][1250/1251] eta 0:00:00 lr 0.000077 wd 0.0500 time 0.2256 (0.2477) data time 0.0007 (0.0014) model time 0.2249 (0.2468) loss 2.9067 (2.7634) grad_norm 5.0814 (inf) loss_scale 128.0000 (160.4349) mem 7381MB [2024-09-01 07:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 252 training takes 0:05:09 [2024-09-01 07:50:56 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 07:50:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 07:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.442 (0.442) Loss 0.3970 (0.3970) Acc@1 93.164 (93.164) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 07:50:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.110) Loss 0.5928 (0.6331) Acc@1 89.062 (86.985) Acc@5 97.852 (97.630) Mem 7381MB [2024-09-01 07:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.095) Loss 0.9614 (0.6631) Acc@1 76.855 (85.900) Acc@5 95.312 (97.521) Mem 7381MB [2024-09-01 07:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.090) Loss 1.1348 (0.7535) Acc@1 73.438 (83.808) Acc@5 92.676 (96.658) Mem 7381MB [2024-09-01 07:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.063 (0.084) Loss 1.0186 (0.8020) Acc@1 76.953 (82.622) Acc@5 93.945 (96.146) Mem 7381MB [2024-09-01 07:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.268 Acc@5 96.066 [2024-09-01 07:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.3% [2024-09-01 07:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.795 (0.795) Loss 0.3806 (0.3806) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 07:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.147) Loss 0.5679 (0.5994) Acc@1 89.941 (87.660) Acc@5 98.145 (97.896) Mem 7381MB [2024-09-01 07:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.114) Loss 0.8911 (0.6300) Acc@1 78.125 (86.547) Acc@5 96.387 (97.786) Mem 7381MB [2024-09-01 07:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.102) Loss 1.1045 (0.7172) Acc@1 74.023 (84.400) Acc@5 93.164 (96.916) Mem 7381MB [2024-09-01 07:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 0.9927 
(0.7633) Acc@1 76.465 (83.282) Acc@5 94.434 (96.425) Mem 7381MB [2024-09-01 07:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.934 Acc@5 96.380 [2024-09-01 07:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 07:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][0/1251] eta 0:23:19 lr 0.000077 wd 0.0500 time 1.1188 (1.1188) data time 0.4773 (0.4773) model time 0.0000 (0.0000) loss 2.8555 (2.8555) grad_norm 3.8800 (3.8800) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][10/1251] eta 0:06:41 lr 0.000077 wd 0.0500 time 0.2435 (0.3231) data time 0.0008 (0.0444) model time 0.0000 (0.0000) loss 2.3463 (2.8694) grad_norm 3.6973 (4.2858) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][20/1251] eta 0:05:49 lr 0.000077 wd 0.0500 time 0.2411 (0.2841) data time 0.0007 (0.0237) model time 0.0000 (0.0000) loss 2.2010 (2.9000) grad_norm 5.9530 (4.6904) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][30/1251] eta 0:05:30 lr 0.000077 wd 0.0500 time 0.2436 (0.2710) data time 0.0007 (0.0164) model time 0.0000 (0.0000) loss 3.3295 (2.9209) grad_norm 5.8861 (4.7155) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][40/1251] eta 0:05:19 lr 0.000077 wd 0.0500 time 0.2533 (0.2641) data time 0.0011 (0.0126) model time 0.0000 (0.0000) loss 2.3098 (2.8392) grad_norm 4.5573 (4.7727) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][50/1251] eta 0:05:11 lr 0.000077 wd 0.0500 time 0.2373 (0.2597) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 2.9427 
(2.8777) grad_norm 5.0300 (4.6922) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][60/1251] eta 0:05:06 lr 0.000077 wd 0.0500 time 0.2376 (0.2570) data time 0.0011 (0.0088) model time 0.2365 (0.2420) loss 2.4963 (2.8162) grad_norm 4.0624 (4.5979) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][70/1251] eta 0:05:00 lr 0.000077 wd 0.0500 time 0.2455 (0.2547) data time 0.0009 (0.0077) model time 0.2446 (0.2409) loss 3.0696 (2.8061) grad_norm 5.5012 (4.6029) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][80/1251] eta 0:04:56 lr 0.000077 wd 0.0500 time 0.2298 (0.2528) data time 0.0011 (0.0069) model time 0.2287 (0.2401) loss 3.0600 (2.8248) grad_norm 4.9139 (4.5907) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][90/1251] eta 0:04:51 lr 0.000077 wd 0.0500 time 0.2390 (0.2513) data time 0.0007 (0.0063) model time 0.2383 (0.2395) loss 2.5298 (2.8332) grad_norm 4.5852 (4.6501) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][100/1251] eta 0:04:47 lr 0.000077 wd 0.0500 time 0.2437 (0.2502) data time 0.0009 (0.0057) model time 0.2428 (0.2395) loss 2.5831 (2.8319) grad_norm 3.5557 (4.6193) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][110/1251] eta 0:04:44 lr 0.000077 wd 0.0500 time 0.2418 (0.2494) data time 0.0008 (0.0053) model time 0.2411 (0.2396) loss 3.0838 (2.8376) grad_norm 3.9383 (4.5511) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][120/1251] eta 0:04:41 lr 0.000077 wd 0.0500 
time 0.2418 (0.2487) data time 0.0009 (0.0050) model time 0.2409 (0.2396) loss 3.1789 (2.8246) grad_norm 4.7688 (4.5489) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][130/1251] eta 0:04:38 lr 0.000077 wd 0.0500 time 0.2471 (0.2481) data time 0.0006 (0.0047) model time 0.2465 (0.2396) loss 2.3571 (2.8323) grad_norm 7.8750 (4.5990) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][140/1251] eta 0:04:35 lr 0.000077 wd 0.0500 time 0.2435 (0.2477) data time 0.0008 (0.0044) model time 0.2427 (0.2399) loss 3.0060 (2.8115) grad_norm 5.8099 (4.5946) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][150/1251] eta 0:04:32 lr 0.000077 wd 0.0500 time 0.2405 (0.2471) data time 0.0007 (0.0042) model time 0.2398 (0.2397) loss 2.8500 (2.8221) grad_norm 4.8066 (4.6537) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][160/1251] eta 0:04:29 lr 0.000077 wd 0.0500 time 0.2398 (0.2466) data time 0.0012 (0.0040) model time 0.2386 (0.2396) loss 2.3827 (2.8053) grad_norm 6.2440 (4.6276) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][170/1251] eta 0:04:26 lr 0.000077 wd 0.0500 time 0.2498 (0.2464) data time 0.0009 (0.0038) model time 0.2488 (0.2397) loss 3.1252 (2.8151) grad_norm 7.5635 (4.6421) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][180/1251] eta 0:04:23 lr 0.000077 wd 0.0500 time 0.2460 (0.2461) data time 0.0008 (0.0036) model time 0.2452 (0.2398) loss 3.2836 (2.8157) grad_norm 4.7338 (4.7024) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:52 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [253/300][190/1251] eta 0:04:20 lr 0.000077 wd 0.0500 time 0.2409 (0.2459) data time 0.0009 (0.0035) model time 0.2400 (0.2399) loss 2.9347 (2.8111) grad_norm 4.1242 (4.7891) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][200/1251] eta 0:04:18 lr 0.000077 wd 0.0500 time 0.2412 (0.2458) data time 0.0009 (0.0034) model time 0.2402 (0.2400) loss 2.6804 (2.8007) grad_norm 3.4180 (4.7944) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][210/1251] eta 0:04:15 lr 0.000077 wd 0.0500 time 0.2542 (0.2456) data time 0.0009 (0.0033) model time 0.2533 (0.2401) loss 2.9212 (2.7943) grad_norm 8.7731 (4.8460) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][220/1251] eta 0:04:14 lr 0.000077 wd 0.0500 time 0.2507 (0.2464) data time 0.0007 (0.0032) model time 0.2499 (0.2414) loss 3.2105 (2.8008) grad_norm 4.1447 (4.9363) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][230/1251] eta 0:04:11 lr 0.000077 wd 0.0500 time 0.2469 (0.2462) data time 0.0010 (0.0031) model time 0.2459 (0.2414) loss 2.6395 (2.7948) grad_norm 6.4753 (4.9561) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][240/1251] eta 0:04:08 lr 0.000077 wd 0.0500 time 0.2421 (0.2460) data time 0.0015 (0.0030) model time 0.2407 (0.2413) loss 3.0038 (2.7853) grad_norm 5.3797 (4.9529) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][250/1251] eta 0:04:06 lr 0.000077 wd 0.0500 time 0.2375 (0.2458) data time 0.0008 (0.0029) model time 0.2367 (0.2412) loss 3.1560 (2.7877) grad_norm 4.7849 
(4.9675) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][260/1251] eta 0:04:03 lr 0.000077 wd 0.0500 time 0.2403 (0.2456) data time 0.0009 (0.0028) model time 0.2395 (0.2412) loss 1.6684 (2.7833) grad_norm 3.5551 (4.9596) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][270/1251] eta 0:04:01 lr 0.000077 wd 0.0500 time 0.2332 (0.2467) data time 0.0009 (0.0028) model time 0.2323 (0.2426) loss 2.8508 (2.7915) grad_norm 3.2746 (4.9480) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][280/1251] eta 0:04:00 lr 0.000077 wd 0.0500 time 0.2440 (0.2472) data time 0.0010 (0.0027) model time 0.2430 (0.2435) loss 2.7531 (2.7927) grad_norm 5.3887 (4.9265) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][290/1251] eta 0:03:57 lr 0.000077 wd 0.0500 time 0.2371 (0.2471) data time 0.0010 (0.0026) model time 0.2361 (0.2434) loss 2.8122 (2.7842) grad_norm 7.0011 (4.9069) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][300/1251] eta 0:03:54 lr 0.000077 wd 0.0500 time 0.2443 (0.2470) data time 0.0009 (0.0026) model time 0.2435 (0.2434) loss 2.8675 (2.7823) grad_norm 5.4451 (4.8811) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][310/1251] eta 0:03:52 lr 0.000077 wd 0.0500 time 0.2389 (0.2469) data time 0.0010 (0.0026) model time 0.2379 (0.2433) loss 3.0569 (2.7861) grad_norm 5.9829 (4.8700) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][320/1251] eta 0:03:49 lr 0.000077 wd 0.0500 time 0.2413 (0.2468) data 
time 0.0009 (0.0025) model time 0.2404 (0.2433) loss 2.1888 (2.7962) grad_norm 4.6879 (4.8552) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][330/1251] eta 0:03:47 lr 0.000077 wd 0.0500 time 0.2501 (0.2467) data time 0.0010 (0.0025) model time 0.2491 (0.2433) loss 3.0666 (2.8002) grad_norm 5.5807 (4.8549) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][340/1251] eta 0:03:44 lr 0.000076 wd 0.0500 time 0.2532 (0.2466) data time 0.0009 (0.0024) model time 0.2523 (0.2433) loss 2.9647 (2.8015) grad_norm 8.3552 (4.8609) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][350/1251] eta 0:03:42 lr 0.000076 wd 0.0500 time 0.2390 (0.2464) data time 0.0010 (0.0024) model time 0.2381 (0.2431) loss 2.3201 (2.8021) grad_norm 3.6362 (4.8635) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][360/1251] eta 0:03:39 lr 0.000076 wd 0.0500 time 0.2366 (0.2463) data time 0.0012 (0.0024) model time 0.2355 (0.2431) loss 1.8426 (2.8018) grad_norm 4.7274 (4.8841) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][370/1251] eta 0:03:36 lr 0.000076 wd 0.0500 time 0.2396 (0.2462) data time 0.0008 (0.0023) model time 0.2388 (0.2430) loss 3.4029 (2.8000) grad_norm 3.7201 (4.8814) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][380/1251] eta 0:03:34 lr 0.000076 wd 0.0500 time 0.2453 (0.2461) data time 0.0007 (0.0023) model time 0.2446 (0.2430) loss 2.8819 (2.8067) grad_norm 6.8927 (4.9045) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [253/300][390/1251] eta 0:03:31 lr 0.000076 wd 0.0500 time 0.2394 (0.2460) data time 0.0009 (0.0023) model time 0.2386 (0.2429) loss 2.2107 (2.8070) grad_norm 4.6597 (4.9181) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][400/1251] eta 0:03:29 lr 0.000076 wd 0.0500 time 0.2379 (0.2459) data time 0.0007 (0.0022) model time 0.2372 (0.2428) loss 3.2899 (2.8097) grad_norm 4.9114 (4.9391) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][410/1251] eta 0:03:26 lr 0.000076 wd 0.0500 time 0.2482 (0.2458) data time 0.0012 (0.0022) model time 0.2471 (0.2428) loss 3.2108 (2.8063) grad_norm 3.8485 (4.9313) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][420/1251] eta 0:03:24 lr 0.000076 wd 0.0500 time 0.2509 (0.2458) data time 0.0007 (0.0022) model time 0.2502 (0.2428) loss 2.1468 (2.8066) grad_norm 3.8766 (4.9146) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][430/1251] eta 0:03:21 lr 0.000076 wd 0.0500 time 0.2439 (0.2456) data time 0.0010 (0.0021) model time 0.2429 (0.2427) loss 1.8794 (2.8055) grad_norm 5.0089 (4.9335) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][440/1251] eta 0:03:19 lr 0.000076 wd 0.0500 time 0.2382 (0.2455) data time 0.0009 (0.0021) model time 0.2372 (0.2426) loss 2.9974 (2.7999) grad_norm 4.3723 (4.9159) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][450/1251] eta 0:03:16 lr 0.000076 wd 0.0500 time 0.2419 (0.2454) data time 0.0007 (0.0021) model time 0.2412 (0.2425) loss 1.8612 (2.7932) grad_norm 5.4752 (4.9623) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 07:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][460/1251] eta 0:03:14 lr 0.000076 wd 0.0500 time 0.2380 (0.2453) data time 0.0010 (0.0021) model time 0.2370 (0.2425) loss 2.7038 (2.7910) grad_norm 3.0841 (4.9503) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][470/1251] eta 0:03:11 lr 0.000076 wd 0.0500 time 0.2422 (0.2453) data time 0.0009 (0.0020) model time 0.2412 (0.2425) loss 2.7968 (2.7851) grad_norm 4.0827 (4.9431) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][480/1251] eta 0:03:09 lr 0.000076 wd 0.0500 time 0.2443 (0.2452) data time 0.0009 (0.0020) model time 0.2435 (0.2424) loss 2.8673 (2.7884) grad_norm 4.7957 (4.9561) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][490/1251] eta 0:03:06 lr 0.000076 wd 0.0500 time 0.2375 (0.2451) data time 0.0008 (0.0020) model time 0.2368 (0.2424) loss 2.8736 (2.7911) grad_norm 5.5491 (4.9536) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][500/1251] eta 0:03:04 lr 0.000076 wd 0.0500 time 0.2462 (0.2451) data time 0.0009 (0.0020) model time 0.2452 (0.2424) loss 3.0248 (2.7952) grad_norm 4.6569 (4.9564) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][510/1251] eta 0:03:01 lr 0.000076 wd 0.0500 time 0.2461 (0.2450) data time 0.0012 (0.0020) model time 0.2449 (0.2424) loss 3.0180 (2.7948) grad_norm 4.3112 (4.9401) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][520/1251] eta 0:02:59 lr 0.000076 wd 0.0500 time 0.2349 (0.2449) data time 0.0010 (0.0019) model 
time 0.2339 (0.2423) loss 2.7786 (2.7925) grad_norm 4.9270 (4.9235) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][530/1251] eta 0:02:56 lr 0.000076 wd 0.0500 time 0.2394 (0.2449) data time 0.0011 (0.0019) model time 0.2383 (0.2423) loss 3.1496 (2.7922) grad_norm 4.4919 (4.9615) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][540/1251] eta 0:02:54 lr 0.000076 wd 0.0500 time 0.2512 (0.2448) data time 0.0009 (0.0019) model time 0.2503 (0.2422) loss 2.7693 (2.7897) grad_norm 3.2394 (4.9525) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][550/1251] eta 0:02:51 lr 0.000076 wd 0.0500 time 0.2385 (0.2448) data time 0.0007 (0.0019) model time 0.2378 (0.2422) loss 2.8926 (2.7916) grad_norm 4.8834 (4.9645) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][560/1251] eta 0:02:49 lr 0.000076 wd 0.0500 time 0.2370 (0.2447) data time 0.0009 (0.0019) model time 0.2361 (0.2422) loss 2.5266 (2.7894) grad_norm 3.6946 (4.9680) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][570/1251] eta 0:02:46 lr 0.000076 wd 0.0500 time 0.2427 (0.2447) data time 0.0009 (0.0019) model time 0.2418 (0.2422) loss 3.3810 (2.7863) grad_norm 5.5453 (4.9718) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][580/1251] eta 0:02:44 lr 0.000076 wd 0.0500 time 0.2409 (0.2447) data time 0.0009 (0.0018) model time 0.2399 (0.2422) loss 3.0161 (2.7833) grad_norm 8.2549 (5.0026) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][590/1251] 
eta 0:02:41 lr 0.000076 wd 0.0500 time 0.2426 (0.2446) data time 0.0007 (0.0018) model time 0.2418 (0.2422) loss 2.4170 (2.7856) grad_norm 4.4219 (5.0034) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][600/1251] eta 0:02:39 lr 0.000076 wd 0.0500 time 0.2395 (0.2446) data time 0.0009 (0.0018) model time 0.2386 (0.2422) loss 3.1868 (2.7862) grad_norm 4.2220 (5.0012) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][610/1251] eta 0:02:36 lr 0.000076 wd 0.0500 time 0.2355 (0.2446) data time 0.0010 (0.0018) model time 0.2345 (0.2422) loss 2.6775 (2.7860) grad_norm 5.2438 (4.9957) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][620/1251] eta 0:02:34 lr 0.000076 wd 0.0500 time 0.2429 (0.2446) data time 0.0008 (0.0018) model time 0.2422 (0.2422) loss 2.2516 (2.7885) grad_norm 4.3184 (4.9841) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][630/1251] eta 0:02:31 lr 0.000076 wd 0.0500 time 0.2457 (0.2445) data time 0.0007 (0.0018) model time 0.2450 (0.2422) loss 2.6019 (2.7888) grad_norm 6.9845 (4.9767) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][640/1251] eta 0:02:29 lr 0.000076 wd 0.0500 time 0.2541 (0.2445) data time 0.0009 (0.0018) model time 0.2532 (0.2422) loss 1.8937 (2.7856) grad_norm 3.6008 (5.0139) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][650/1251] eta 0:02:26 lr 0.000076 wd 0.0500 time 0.2398 (0.2444) data time 0.0012 (0.0018) model time 0.2386 (0.2421) loss 2.6192 (2.7871) grad_norm 3.9191 (5.0100) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
07:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][660/1251] eta 0:02:24 lr 0.000076 wd 0.0500 time 0.2444 (0.2444) data time 0.0011 (0.0017) model time 0.2434 (0.2421) loss 3.1054 (2.7863) grad_norm 4.6586 (5.0133) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][670/1251] eta 0:02:22 lr 0.000076 wd 0.0500 time 0.2402 (0.2444) data time 0.0011 (0.0017) model time 0.2391 (0.2421) loss 2.4692 (2.7848) grad_norm 4.2241 (4.9998) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][680/1251] eta 0:02:19 lr 0.000076 wd 0.0500 time 0.2417 (0.2444) data time 0.0009 (0.0017) model time 0.2408 (0.2421) loss 3.0032 (2.7851) grad_norm 3.6141 (4.9862) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][690/1251] eta 0:02:17 lr 0.000076 wd 0.0500 time 0.2410 (0.2443) data time 0.0007 (0.0017) model time 0.2403 (0.2421) loss 3.1726 (2.7813) grad_norm 3.9764 (4.9736) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][700/1251] eta 0:02:14 lr 0.000076 wd 0.0500 time 0.2402 (0.2443) data time 0.0008 (0.0017) model time 0.2394 (0.2420) loss 2.3747 (2.7829) grad_norm 3.1481 (4.9637) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][710/1251] eta 0:02:12 lr 0.000076 wd 0.0500 time 0.2345 (0.2442) data time 0.0010 (0.0017) model time 0.2335 (0.2420) loss 2.9366 (2.7839) grad_norm 4.4569 (4.9548) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][720/1251] eta 0:02:09 lr 0.000076 wd 0.0500 time 0.2700 (0.2442) data time 0.0009 (0.0017) model time 0.2691 (0.2420) loss 2.7717 
(2.7833) grad_norm 4.1731 (4.9455) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][730/1251] eta 0:02:07 lr 0.000076 wd 0.0500 time 0.2446 (0.2442) data time 0.0010 (0.0017) model time 0.2436 (0.2420) loss 3.3628 (2.7833) grad_norm 44.4317 (4.9961) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][740/1251] eta 0:02:04 lr 0.000076 wd 0.0500 time 0.2390 (0.2442) data time 0.0008 (0.0017) model time 0.2381 (0.2420) loss 3.5995 (2.7830) grad_norm 5.2552 (4.9911) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][750/1251] eta 0:02:02 lr 0.000076 wd 0.0500 time 0.2510 (0.2444) data time 0.0008 (0.0017) model time 0.2502 (0.2423) loss 3.1594 (2.7822) grad_norm 4.4410 (4.9851) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][760/1251] eta 0:01:59 lr 0.000076 wd 0.0500 time 0.2430 (0.2444) data time 0.0010 (0.0016) model time 0.2420 (0.2423) loss 3.3613 (2.7845) grad_norm 4.5867 (4.9810) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][770/1251] eta 0:01:57 lr 0.000076 wd 0.0500 time 0.2403 (0.2443) data time 0.0008 (0.0016) model time 0.2396 (0.2422) loss 2.1086 (2.7840) grad_norm 8.0045 (5.0156) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][780/1251] eta 0:01:55 lr 0.000076 wd 0.0500 time 0.2455 (0.2443) data time 0.0011 (0.0016) model time 0.2444 (0.2422) loss 3.1520 (2.7852) grad_norm 3.6421 (5.0120) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][790/1251] eta 0:01:52 lr 0.000075 wd 
0.0500 time 0.3950 (0.2445) data time 0.0010 (0.0016) model time 0.3940 (0.2424) loss 3.1499 (2.7847) grad_norm 4.8364 (5.0020) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][800/1251] eta 0:01:50 lr 0.000075 wd 0.0500 time 0.2393 (0.2448) data time 0.0009 (0.0016) model time 0.2384 (0.2428) loss 2.1601 (2.7803) grad_norm 6.0546 (5.0036) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][810/1251] eta 0:01:47 lr 0.000075 wd 0.0500 time 0.2443 (0.2447) data time 0.0007 (0.0016) model time 0.2436 (0.2427) loss 2.7553 (2.7793) grad_norm 5.0513 (5.0039) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][820/1251] eta 0:01:45 lr 0.000075 wd 0.0500 time 0.2422 (0.2447) data time 0.0007 (0.0016) model time 0.2414 (0.2427) loss 2.1227 (2.7790) grad_norm 3.7256 (4.9990) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][830/1251] eta 0:01:42 lr 0.000075 wd 0.0500 time 0.2440 (0.2446) data time 0.0009 (0.0016) model time 0.2431 (0.2426) loss 2.9754 (2.7789) grad_norm 4.8513 (4.9850) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][840/1251] eta 0:01:40 lr 0.000075 wd 0.0500 time 0.2413 (0.2446) data time 0.0011 (0.0016) model time 0.2403 (0.2426) loss 2.1795 (2.7788) grad_norm 8.1013 (4.9820) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][850/1251] eta 0:01:38 lr 0.000075 wd 0.0500 time 0.2348 (0.2446) data time 0.0007 (0.0016) model time 0.2340 (0.2426) loss 2.3598 (2.7776) grad_norm 4.5771 (4.9829) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:35 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [253/300][860/1251] eta 0:01:35 lr 0.000075 wd 0.0500 time 0.2414 (0.2445) data time 0.0009 (0.0016) model time 0.2404 (0.2426) loss 3.0643 (2.7788) grad_norm 3.8637 (4.9778) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][870/1251] eta 0:01:33 lr 0.000075 wd 0.0500 time 0.2312 (0.2445) data time 0.0011 (0.0016) model time 0.2302 (0.2425) loss 1.7817 (2.7765) grad_norm 4.3816 (4.9722) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][880/1251] eta 0:01:30 lr 0.000075 wd 0.0500 time 0.2389 (0.2444) data time 0.0009 (0.0016) model time 0.2380 (0.2425) loss 3.2382 (2.7776) grad_norm 25.0020 (4.9869) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][890/1251] eta 0:01:28 lr 0.000075 wd 0.0500 time 0.2417 (0.2444) data time 0.0010 (0.0015) model time 0.2407 (0.2425) loss 3.3375 (2.7796) grad_norm 4.3047 (5.0050) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][900/1251] eta 0:01:25 lr 0.000075 wd 0.0500 time 0.2370 (0.2444) data time 0.0011 (0.0015) model time 0.2359 (0.2425) loss 2.3417 (2.7783) grad_norm 6.7457 (5.0032) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][910/1251] eta 0:01:23 lr 0.000075 wd 0.0500 time 0.2279 (0.2444) data time 0.0011 (0.0015) model time 0.2268 (0.2425) loss 2.9294 (2.7778) grad_norm 7.0801 (5.0028) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][920/1251] eta 0:01:20 lr 0.000075 wd 0.0500 time 0.2431 (0.2444) data time 0.0009 (0.0015) model time 0.2422 (0.2425) loss 2.6650 (2.7778) grad_norm 3.9025 
(5.0033) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][930/1251] eta 0:01:18 lr 0.000075 wd 0.0500 time 0.2372 (0.2444) data time 0.0007 (0.0015) model time 0.2365 (0.2425) loss 3.1384 (2.7791) grad_norm 3.3153 (5.0010) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][940/1251] eta 0:01:15 lr 0.000075 wd 0.0500 time 0.2404 (0.2443) data time 0.0012 (0.0015) model time 0.2393 (0.2424) loss 2.6817 (2.7766) grad_norm 5.6054 (5.0740) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][950/1251] eta 0:01:13 lr 0.000075 wd 0.0500 time 0.2401 (0.2442) data time 0.0012 (0.0015) model time 0.2389 (0.2424) loss 2.9756 (2.7770) grad_norm 3.1084 (5.0630) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][960/1251] eta 0:01:11 lr 0.000075 wd 0.0500 time 0.2497 (0.2442) data time 0.0007 (0.0015) model time 0.2490 (0.2423) loss 2.5664 (2.7763) grad_norm 4.1896 (5.0583) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][970/1251] eta 0:01:08 lr 0.000075 wd 0.0500 time 0.2476 (0.2442) data time 0.0007 (0.0015) model time 0.2470 (0.2423) loss 3.4314 (2.7768) grad_norm 3.2347 (5.0505) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][980/1251] eta 0:01:06 lr 0.000075 wd 0.0500 time 0.2360 (0.2441) data time 0.0011 (0.0015) model time 0.2348 (0.2423) loss 2.9502 (2.7746) grad_norm 9.2392 (5.0435) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][990/1251] eta 0:01:03 lr 0.000075 wd 0.0500 time 0.2391 (0.2441) data 
time 0.0008 (0.0015) model time 0.2383 (0.2422) loss 2.4466 (2.7734) grad_norm 4.8488 (5.0328) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1000/1251] eta 0:01:01 lr 0.000075 wd 0.0500 time 0.2439 (0.2441) data time 0.0009 (0.0015) model time 0.2430 (0.2422) loss 2.7575 (2.7737) grad_norm 5.5407 (5.0326) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1010/1251] eta 0:00:58 lr 0.000075 wd 0.0500 time 0.2542 (0.2441) data time 0.0008 (0.0015) model time 0.2534 (0.2422) loss 3.0424 (2.7733) grad_norm 4.4167 (5.0430) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1020/1251] eta 0:00:56 lr 0.000075 wd 0.0500 time 0.2493 (0.2441) data time 0.0009 (0.0015) model time 0.2484 (0.2422) loss 3.0216 (2.7723) grad_norm 5.4065 (5.0540) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1030/1251] eta 0:00:53 lr 0.000075 wd 0.0500 time 0.2392 (0.2440) data time 0.0009 (0.0015) model time 0.2382 (0.2422) loss 2.1624 (2.7732) grad_norm 4.8923 (5.0535) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1040/1251] eta 0:00:51 lr 0.000075 wd 0.0500 time 0.2427 (0.2440) data time 0.0009 (0.0015) model time 0.2418 (0.2422) loss 1.9538 (2.7741) grad_norm 4.2333 (5.0669) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1050/1251] eta 0:00:49 lr 0.000075 wd 0.0500 time 0.2443 (0.2440) data time 0.0009 (0.0015) model time 0.2433 (0.2421) loss 2.4766 (2.7743) grad_norm 3.7519 (5.0710) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [253/300][1060/1251] eta 0:00:46 lr 0.000075 wd 0.0500 time 0.2468 (0.2439) data time 0.0008 (0.0015) model time 0.2460 (0.2421) loss 2.8576 (2.7752) grad_norm 6.0106 (5.0721) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 07:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1070/1251] eta 0:00:44 lr 0.000075 wd 0.0500 time 0.2423 (0.2439) data time 0.0007 (0.0015) model time 0.2416 (0.2421) loss 3.1538 (2.7760) grad_norm 4.3611 (5.0685) loss_scale 256.0000 (128.5976) mem 7381MB [2024-09-01 07:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1080/1251] eta 0:00:41 lr 0.000075 wd 0.0500 time 0.2436 (0.2439) data time 0.0010 (0.0015) model time 0.2426 (0.2421) loss 2.8635 (2.7782) grad_norm 5.2612 (5.0573) loss_scale 256.0000 (129.7761) mem 7381MB [2024-09-01 07:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1090/1251] eta 0:00:39 lr 0.000075 wd 0.0500 time 0.2343 (0.2438) data time 0.0007 (0.0015) model time 0.2336 (0.2420) loss 3.5879 (2.7797) grad_norm 3.9072 (5.0679) loss_scale 256.0000 (130.9331) mem 7381MB [2024-09-01 07:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1100/1251] eta 0:00:36 lr 0.000075 wd 0.0500 time 0.2453 (0.2438) data time 0.0008 (0.0014) model time 0.2445 (0.2420) loss 3.2191 (2.7802) grad_norm 3.9834 (5.0578) loss_scale 256.0000 (132.0690) mem 7381MB [2024-09-01 07:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1110/1251] eta 0:00:34 lr 0.000075 wd 0.0500 time 0.2376 (0.2437) data time 0.0010 (0.0014) model time 0.2366 (0.2420) loss 2.5985 (2.7792) grad_norm 4.2327 (5.0541) loss_scale 256.0000 (133.1845) mem 7381MB [2024-09-01 07:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1120/1251] eta 0:00:31 lr 0.000075 wd 0.0500 time 0.2487 (0.2437) data time 0.0008 (0.0014) model time 0.2479 (0.2420) loss 2.2404 (2.7784) grad_norm 5.6319 (5.0564) loss_scale 
256.0000 (134.2801) mem 7381MB [2024-09-01 07:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1130/1251] eta 0:00:29 lr 0.000075 wd 0.0500 time 0.2408 (0.2437) data time 0.0007 (0.0014) model time 0.2401 (0.2420) loss 2.7169 (2.7772) grad_norm 3.6069 (5.0699) loss_scale 256.0000 (135.3563) mem 7381MB [2024-09-01 07:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1140/1251] eta 0:00:27 lr 0.000075 wd 0.0500 time 0.2376 (0.2437) data time 0.0010 (0.0014) model time 0.2366 (0.2419) loss 2.4185 (2.7756) grad_norm 3.0814 (5.0621) loss_scale 256.0000 (136.4137) mem 7381MB [2024-09-01 07:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1150/1251] eta 0:00:24 lr 0.000075 wd 0.0500 time 0.2433 (0.2437) data time 0.0009 (0.0014) model time 0.2424 (0.2419) loss 3.0436 (2.7766) grad_norm 4.0111 (5.0588) loss_scale 256.0000 (137.4526) mem 7381MB [2024-09-01 07:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1160/1251] eta 0:00:22 lr 0.000075 wd 0.0500 time 0.2421 (0.2437) data time 0.0009 (0.0014) model time 0.2412 (0.2419) loss 3.2014 (2.7765) grad_norm 4.5584 (5.0648) loss_scale 256.0000 (138.4737) mem 7381MB [2024-09-01 07:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1170/1251] eta 0:00:19 lr 0.000075 wd 0.0500 time 0.2406 (0.2436) data time 0.0007 (0.0014) model time 0.2399 (0.2419) loss 3.2854 (2.7758) grad_norm 6.5982 (5.0681) loss_scale 256.0000 (139.4774) mem 7381MB [2024-09-01 07:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1180/1251] eta 0:00:17 lr 0.000075 wd 0.0500 time 0.2349 (0.2436) data time 0.0009 (0.0014) model time 0.2340 (0.2418) loss 3.4874 (2.7761) grad_norm 6.9142 (5.0787) loss_scale 256.0000 (140.4640) mem 7381MB [2024-09-01 07:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1190/1251] eta 0:00:14 lr 0.000075 wd 0.0500 time 0.2307 (0.2436) data time 0.0012 
(0.0014) model time 0.2295 (0.2418) loss 2.8482 (2.7766) grad_norm 4.2525 (5.0743) loss_scale 256.0000 (141.4341) mem 7381MB [2024-09-01 07:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1200/1251] eta 0:00:12 lr 0.000075 wd 0.0500 time 0.2445 (0.2435) data time 0.0007 (0.0014) model time 0.2438 (0.2418) loss 3.0689 (2.7773) grad_norm 6.9857 (5.0686) loss_scale 256.0000 (142.3880) mem 7381MB [2024-09-01 07:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1210/1251] eta 0:00:09 lr 0.000075 wd 0.0500 time 0.2411 (0.2435) data time 0.0008 (0.0014) model time 0.2403 (0.2418) loss 3.4249 (2.7771) grad_norm 5.3488 (5.0645) loss_scale 256.0000 (143.3262) mem 7381MB [2024-09-01 07:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1220/1251] eta 0:00:07 lr 0.000075 wd 0.0500 time 0.2467 (0.2435) data time 0.0010 (0.0014) model time 0.2456 (0.2418) loss 2.3723 (2.7762) grad_norm 4.1678 (5.0605) loss_scale 256.0000 (144.2490) mem 7381MB [2024-09-01 07:56:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1230/1251] eta 0:00:05 lr 0.000075 wd 0.0500 time 0.2412 (0.2435) data time 0.0007 (0.0014) model time 0.2405 (0.2418) loss 2.7467 (2.7762) grad_norm 5.3184 (5.0611) loss_scale 256.0000 (145.1568) mem 7381MB [2024-09-01 07:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1240/1251] eta 0:00:02 lr 0.000075 wd 0.0500 time 0.2215 (0.2434) data time 0.0005 (0.0014) model time 0.2210 (0.2417) loss 3.1265 (2.7754) grad_norm 4.7221 (5.0636) loss_scale 256.0000 (146.0500) mem 7381MB [2024-09-01 07:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [253/300][1250/1251] eta 0:00:00 lr 0.000074 wd 0.0500 time 0.2286 (0.2433) data time 0.0004 (0.0014) model time 0.2282 (0.2416) loss 2.5919 (2.7744) grad_norm 3.5115 (5.0631) loss_scale 256.0000 (146.9289) mem 7381MB [2024-09-01 07:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 
253 training takes 0:05:04
[2024-09-01 07:56:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-09-01 07:56:10 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-09-01 07:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.437 (0.437) Loss 0.3828 (0.3828) Acc@1 92.871 (92.871) Acc@5 98.633 (98.633) Mem 7381MB
[2024-09-01 07:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.111) Loss 0.5786 (0.6135) Acc@1 89.062 (87.172) Acc@5 98.145 (97.692) Mem 7381MB
[2024-09-01 07:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.095) Loss 0.9360 (0.6445) Acc@1 76.758 (86.058) Acc@5 96.094 (97.610) Mem 7381MB
[2024-09-01 07:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 1.1416 (0.7378) Acc@1 72.949 (83.902) Acc@5 92.285 (96.651) Mem 7381MB
[2024-09-01 07:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0059 (0.7862) Acc@1 76.758 (82.717) Acc@5 94.141 (96.137) Mem 7381MB
[2024-09-01 07:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.328 Acc@5 96.056
[2024-09-01 07:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.3%
[2024-09-01 07:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.775 (0.775) Loss 0.3799 (0.3799) Acc@1 93.359 (93.359) Acc@5 98.828 (98.828) Mem 7381MB
[2024-09-01 07:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.148) Loss 0.5688 (0.5998) Acc@1 90.137 (87.642) Acc@5 98.145 (97.860) Mem 7381MB
[2024-09-01 07:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.114) Loss 0.8926 (0.6302) Acc@1 78.027 (86.528) Acc@5 96.191 (97.759) Mem 7381MB
[2024-09-01 07:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.102) Loss 1.1074 (0.7175) Acc@1 73.926 (84.413) Acc@5 93.164 (96.888) Mem 7381MB
[2024-09-01 07:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 0.9907 (0.7639) Acc@1 76.562 (83.301) Acc@5 94.434 (96.391) Mem 7381MB
[2024-09-01 07:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.954 Acc@5 96.344
[2024-09-01 07:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0%
[2024-09-01 07:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.95%
[2024-09-01 07:56:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving......
[2024-09-01 07:56:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!!
[2024-09-01 07:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][0/1251] eta 0:14:06 lr 0.000074 wd 0.0500 time 0.6769 (0.6769) data time 0.4511 (0.4511) model time 0.0000 (0.0000) loss 2.2194 (2.2194) grad_norm 3.6016 (3.6016) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][10/1251] eta 0:05:45 lr 0.000074 wd 0.0500 time 0.2387 (0.2788) data time 0.0010 (0.0419) model time 0.0000 (0.0000) loss 2.5619 (2.7228) grad_norm 13.5907 (5.2514) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][20/1251] eta 0:05:32 lr 0.000074 wd 0.0500 time 0.2362 (0.2697) data time 0.0009 (0.0224) model time 0.0000 (0.0000) loss 3.5882 (2.7911) grad_norm 5.0200 (5.4656) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][30/1251] eta 0:05:17 lr 0.000074 wd 0.0500 time 0.2372 (0.2603) data time 0.0011 (0.0155) model time 0.0000 (0.0000) loss 2.4343 (2.7542) grad_norm 3.9361 (5.2441) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][40/1251] eta 0:05:09 lr 0.000074 wd 0.0500 time 0.2404 (0.2555) data time 0.0008 (0.0119) model time 0.0000 (0.0000) loss 1.7620 (2.7329) grad_norm 3.7177 (5.0974) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][50/1251] eta 0:05:03 lr 0.000074 wd 0.0500 time 0.2451 (0.2525) data time 0.0008 (0.0098) model time 0.0000 (0.0000) loss 3.0569 (2.7436) grad_norm 3.9133 (4.9369) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][60/1251] eta 0:04:58 lr 0.000074 wd 0.0500 time 0.2435 (0.2506) data time 0.0010 (0.0083) model time 0.2426 (0.2400) loss 
2.5723 (2.7654) grad_norm 5.5042 (4.9512) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][70/1251] eta 0:04:54 lr 0.000074 wd 0.0500 time 0.2460 (0.2493) data time 0.0007 (0.0073) model time 0.2453 (0.2403) loss 1.6689 (2.7523) grad_norm 4.8191 (4.8208) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][80/1251] eta 0:04:50 lr 0.000074 wd 0.0500 time 0.2418 (0.2482) data time 0.0010 (0.0065) model time 0.2408 (0.2400) loss 2.7883 (2.7439) grad_norm 4.9486 (4.9572) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][90/1251] eta 0:04:47 lr 0.000074 wd 0.0500 time 0.2486 (0.2476) data time 0.0009 (0.0059) model time 0.2477 (0.2404) loss 3.1623 (2.7635) grad_norm 3.5292 (4.8961) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][100/1251] eta 0:04:44 lr 0.000074 wd 0.0500 time 0.2495 (0.2469) data time 0.0008 (0.0054) model time 0.2486 (0.2403) loss 2.7109 (2.7818) grad_norm 5.1248 (4.8686) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][110/1251] eta 0:04:41 lr 0.000074 wd 0.0500 time 0.2464 (0.2465) data time 0.0007 (0.0050) model time 0.2457 (0.2404) loss 2.2925 (2.7498) grad_norm 3.2736 (4.8944) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][120/1251] eta 0:04:38 lr 0.000074 wd 0.0500 time 0.2389 (0.2458) data time 0.0009 (0.0047) model time 0.2380 (0.2400) loss 2.8327 (2.7597) grad_norm 4.3736 (4.8800) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][130/1251] eta 0:04:35 lr 0.000074 wd 
0.0500 time 0.2432 (0.2455) data time 0.0006 (0.0044) model time 0.2425 (0.2402) loss 3.0487 (2.7687) grad_norm 8.3157 (4.8810) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][140/1251] eta 0:04:32 lr 0.000074 wd 0.0500 time 0.2408 (0.2453) data time 0.0011 (0.0042) model time 0.2397 (0.2403) loss 2.9644 (2.7696) grad_norm 4.6935 (4.8605) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][150/1251] eta 0:04:29 lr 0.000074 wd 0.0500 time 0.2375 (0.2449) data time 0.0009 (0.0039) model time 0.2366 (0.2401) loss 2.5361 (2.7696) grad_norm 4.3562 (4.8503) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][160/1251] eta 0:04:26 lr 0.000074 wd 0.0500 time 0.2405 (0.2447) data time 0.0010 (0.0038) model time 0.2395 (0.2401) loss 2.2415 (2.7629) grad_norm 3.1905 (4.8342) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][170/1251] eta 0:04:24 lr 0.000074 wd 0.0500 time 0.2437 (0.2447) data time 0.0007 (0.0036) model time 0.2430 (0.2404) loss 3.2824 (2.7582) grad_norm 4.4044 (4.8021) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][180/1251] eta 0:04:21 lr 0.000074 wd 0.0500 time 0.2398 (0.2444) data time 0.0009 (0.0035) model time 0.2389 (0.2403) loss 2.1254 (2.7609) grad_norm 3.6649 (4.8257) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][190/1251] eta 0:04:19 lr 0.000074 wd 0.0500 time 0.2457 (0.2444) data time 0.0010 (0.0033) model time 0.2447 (0.2405) loss 2.9889 (2.7608) grad_norm 3.4011 (4.8237) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:08 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [254/300][200/1251] eta 0:04:16 lr 0.000074 wd 0.0500 time 0.2513 (0.2443) data time 0.0010 (0.0032) model time 0.2503 (0.2405) loss 2.9134 (2.7650) grad_norm 4.6473 (4.8360) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][210/1251] eta 0:04:14 lr 0.000074 wd 0.0500 time 0.2385 (0.2440) data time 0.0009 (0.0031) model time 0.2376 (0.2404) loss 3.4508 (2.7640) grad_norm 3.9086 (4.8172) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][220/1251] eta 0:04:11 lr 0.000074 wd 0.0500 time 0.2413 (0.2440) data time 0.0010 (0.0030) model time 0.2403 (0.2405) loss 2.5427 (2.7522) grad_norm 3.7093 (4.7995) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][230/1251] eta 0:04:09 lr 0.000074 wd 0.0500 time 0.2488 (0.2440) data time 0.0009 (0.0029) model time 0.2480 (0.2406) loss 2.9943 (2.7554) grad_norm 3.0650 (4.7650) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][240/1251] eta 0:04:06 lr 0.000074 wd 0.0500 time 0.2485 (0.2438) data time 0.0009 (0.0028) model time 0.2476 (0.2405) loss 2.8429 (2.7609) grad_norm 4.9855 (4.7856) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][250/1251] eta 0:04:04 lr 0.000074 wd 0.0500 time 0.2504 (0.2439) data time 0.0007 (0.0028) model time 0.2497 (0.2408) loss 3.0697 (2.7581) grad_norm 6.0724 (4.8111) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][260/1251] eta 0:04:01 lr 0.000074 wd 0.0500 time 0.2437 (0.2439) data time 0.0010 (0.0027) model time 0.2427 (0.2408) loss 2.6357 (2.7533) grad_norm 3.2514 
(4.8196) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][270/1251] eta 0:03:59 lr 0.000074 wd 0.0500 time 0.2451 (0.2446) data time 0.0007 (0.0026) model time 0.2444 (0.2418) loss 3.5081 (2.7524) grad_norm 4.6477 (4.8981) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][280/1251] eta 0:03:57 lr 0.000074 wd 0.0500 time 0.2447 (0.2445) data time 0.0010 (0.0026) model time 0.2437 (0.2418) loss 2.8960 (2.7525) grad_norm 3.1842 (4.9842) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][290/1251] eta 0:03:54 lr 0.000074 wd 0.0500 time 0.2280 (0.2444) data time 0.0011 (0.0025) model time 0.2270 (0.2417) loss 2.0375 (2.7520) grad_norm 5.2236 (4.9867) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][300/1251] eta 0:03:52 lr 0.000074 wd 0.0500 time 0.2355 (0.2442) data time 0.0011 (0.0025) model time 0.2344 (0.2415) loss 2.5879 (2.7519) grad_norm 3.5833 (4.9625) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][310/1251] eta 0:03:49 lr 0.000074 wd 0.0500 time 0.2453 (0.2441) data time 0.0007 (0.0024) model time 0.2446 (0.2415) loss 2.8868 (2.7471) grad_norm 3.8740 (4.9775) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][320/1251] eta 0:03:47 lr 0.000074 wd 0.0500 time 0.2383 (0.2439) data time 0.0011 (0.0024) model time 0.2372 (0.2414) loss 3.3213 (2.7439) grad_norm 4.8547 (4.9528) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][330/1251] eta 0:03:44 lr 0.000074 wd 0.0500 time 0.2482 (0.2438) data 
time 0.0009 (0.0023) model time 0.2473 (0.2413) loss 3.0509 (2.7430) grad_norm 5.4687 (4.9707) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][340/1251] eta 0:03:42 lr 0.000074 wd 0.0500 time 0.2408 (0.2437) data time 0.0008 (0.0023) model time 0.2401 (0.2412) loss 3.1713 (2.7454) grad_norm 6.9980 (4.9608) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][350/1251] eta 0:03:39 lr 0.000074 wd 0.0500 time 0.2438 (0.2437) data time 0.0010 (0.0023) model time 0.2428 (0.2412) loss 3.0796 (2.7461) grad_norm 2.8015 (5.2024) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][360/1251] eta 0:03:37 lr 0.000074 wd 0.0500 time 0.2372 (0.2435) data time 0.0009 (0.0022) model time 0.2363 (0.2411) loss 2.9089 (2.7410) grad_norm 4.2138 (5.1786) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][370/1251] eta 0:03:34 lr 0.000074 wd 0.0500 time 0.2467 (0.2435) data time 0.0008 (0.0022) model time 0.2459 (0.2411) loss 3.1627 (2.7383) grad_norm 4.8854 (5.1846) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][380/1251] eta 0:03:32 lr 0.000074 wd 0.0500 time 0.2395 (0.2434) data time 0.0011 (0.0022) model time 0.2384 (0.2410) loss 2.4938 (2.7388) grad_norm 5.4931 (5.1651) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][390/1251] eta 0:03:29 lr 0.000074 wd 0.0500 time 0.2436 (0.2433) data time 0.0010 (0.0021) model time 0.2426 (0.2410) loss 3.0683 (2.7412) grad_norm 4.0486 (5.1484) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 07:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [254/300][400/1251] eta 0:03:27 lr 0.000074 wd 0.0500 time 0.2380 (0.2433) data time 0.0010 (0.0021) model time 0.2370 (0.2410) loss 2.9042 (2.7389) grad_norm 3.9790 (inf) loss_scale 128.0000 (253.7656) mem 7381MB [2024-09-01 07:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][410/1251] eta 0:03:24 lr 0.000074 wd 0.0500 time 0.2418 (0.2432) data time 0.0009 (0.0021) model time 0.2409 (0.2410) loss 2.7637 (2.7395) grad_norm 3.7035 (inf) loss_scale 128.0000 (250.7056) mem 7381MB [2024-09-01 07:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][420/1251] eta 0:03:22 lr 0.000074 wd 0.0500 time 0.2423 (0.2432) data time 0.0011 (0.0020) model time 0.2413 (0.2409) loss 3.3698 (2.7401) grad_norm 5.7220 (inf) loss_scale 128.0000 (247.7910) mem 7381MB [2024-09-01 07:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][430/1251] eta 0:03:19 lr 0.000074 wd 0.0500 time 0.2466 (0.2433) data time 0.0010 (0.0020) model time 0.2456 (0.2410) loss 3.0323 (2.7395) grad_norm 5.1905 (inf) loss_scale 128.0000 (245.0116) mem 7381MB [2024-09-01 07:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][440/1251] eta 0:03:17 lr 0.000074 wd 0.0500 time 0.2401 (0.2432) data time 0.0010 (0.0020) model time 0.2391 (0.2410) loss 3.0160 (2.7391) grad_norm 6.2223 (inf) loss_scale 128.0000 (242.3583) mem 7381MB [2024-09-01 07:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][450/1251] eta 0:03:14 lr 0.000073 wd 0.0500 time 0.2674 (0.2432) data time 0.0010 (0.0020) model time 0.2664 (0.2410) loss 3.4746 (2.7420) grad_norm 4.9543 (inf) loss_scale 128.0000 (239.8226) mem 7381MB [2024-09-01 07:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][460/1251] eta 0:03:12 lr 0.000073 wd 0.0500 time 0.2390 (0.2431) data time 0.0008 (0.0020) model time 0.2382 (0.2410) loss 3.2039 (2.7407) grad_norm 6.8261 (inf) loss_scale 128.0000 (237.3970) mem 7381MB 
[2024-09-01 07:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][470/1251] eta 0:03:09 lr 0.000073 wd 0.0500 time 0.2484 (0.2431) data time 0.0008 (0.0020) model time 0.2476 (0.2410) loss 3.4437 (2.7407) grad_norm 4.9614 (inf) loss_scale 128.0000 (235.0743) mem 7381MB [2024-09-01 07:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][480/1251] eta 0:03:07 lr 0.000073 wd 0.0500 time 0.2347 (0.2431) data time 0.0009 (0.0019) model time 0.2337 (0.2409) loss 2.2573 (2.7403) grad_norm 4.9405 (inf) loss_scale 128.0000 (232.8482) mem 7381MB [2024-09-01 07:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][490/1251] eta 0:03:04 lr 0.000073 wd 0.0500 time 0.2315 (0.2430) data time 0.0009 (0.0019) model time 0.2305 (0.2409) loss 2.7541 (2.7441) grad_norm 7.6212 (inf) loss_scale 128.0000 (230.7128) mem 7381MB [2024-09-01 07:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][500/1251] eta 0:03:02 lr 0.000073 wd 0.0500 time 0.2343 (0.2430) data time 0.0007 (0.0019) model time 0.2335 (0.2409) loss 2.7842 (2.7433) grad_norm 10.3603 (inf) loss_scale 128.0000 (228.6627) mem 7381MB [2024-09-01 07:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][510/1251] eta 0:03:00 lr 0.000073 wd 0.0500 time 0.2545 (0.2430) data time 0.0010 (0.0019) model time 0.2535 (0.2409) loss 2.8670 (2.7455) grad_norm 5.6316 (inf) loss_scale 128.0000 (226.6928) mem 7381MB [2024-09-01 07:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][520/1251] eta 0:02:57 lr 0.000073 wd 0.0500 time 0.2443 (0.2430) data time 0.0008 (0.0019) model time 0.2435 (0.2409) loss 3.1989 (2.7490) grad_norm 4.9957 (inf) loss_scale 128.0000 (224.7985) mem 7381MB [2024-09-01 07:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][530/1251] eta 0:02:55 lr 0.000073 wd 0.0500 time 0.2432 (0.2429) data time 0.0008 (0.0018) model time 0.2424 (0.2409) loss 3.2444 
(2.7469) grad_norm 4.6686 (inf) loss_scale 128.0000 (222.9755) mem 7381MB [2024-09-01 07:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][540/1251] eta 0:02:52 lr 0.000073 wd 0.0500 time 0.2423 (0.2432) data time 0.0009 (0.0018) model time 0.2414 (0.2412) loss 2.6597 (2.7478) grad_norm 6.8998 (inf) loss_scale 128.0000 (221.2200) mem 7381MB [2024-09-01 07:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][550/1251] eta 0:02:50 lr 0.000073 wd 0.0500 time 0.2427 (0.2435) data time 0.0011 (0.0018) model time 0.2416 (0.2416) loss 3.3626 (2.7501) grad_norm 4.0695 (inf) loss_scale 128.0000 (219.5281) mem 7381MB [2024-09-01 07:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][560/1251] eta 0:02:48 lr 0.000073 wd 0.0500 time 0.2467 (0.2435) data time 0.0009 (0.0018) model time 0.2458 (0.2416) loss 3.0256 (2.7524) grad_norm 3.5635 (inf) loss_scale 128.0000 (217.8966) mem 7381MB [2024-09-01 07:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][570/1251] eta 0:02:45 lr 0.000073 wd 0.0500 time 0.2494 (0.2435) data time 0.0007 (0.0018) model time 0.2487 (0.2416) loss 2.9337 (2.7534) grad_norm 6.4785 (inf) loss_scale 128.0000 (216.3222) mem 7381MB [2024-09-01 07:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][580/1251] eta 0:02:43 lr 0.000073 wd 0.0500 time 0.2473 (0.2434) data time 0.0009 (0.0018) model time 0.2463 (0.2416) loss 2.7427 (2.7534) grad_norm 3.3081 (inf) loss_scale 128.0000 (214.8021) mem 7381MB [2024-09-01 07:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][590/1251] eta 0:02:40 lr 0.000073 wd 0.0500 time 0.2381 (0.2434) data time 0.0011 (0.0018) model time 0.2369 (0.2416) loss 2.5766 (2.7579) grad_norm 4.6665 (inf) loss_scale 128.0000 (213.3333) mem 7381MB [2024-09-01 07:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][600/1251] eta 0:02:38 lr 0.000073 wd 0.0500 time 0.2398 (0.2434) 
data time 0.0011 (0.0017) model time 0.2387 (0.2416) loss 2.6643 (2.7586) grad_norm 4.4361 (inf) loss_scale 128.0000 (211.9135) mem 7381MB [2024-09-01 07:58:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][610/1251] eta 0:02:36 lr 0.000073 wd 0.0500 time 0.2437 (0.2434) data time 0.0007 (0.0017) model time 0.2429 (0.2416) loss 2.5203 (2.7606) grad_norm 3.8913 (inf) loss_scale 128.0000 (210.5401) mem 7381MB [2024-09-01 07:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][620/1251] eta 0:02:33 lr 0.000073 wd 0.0500 time 0.2380 (0.2434) data time 0.0010 (0.0017) model time 0.2370 (0.2416) loss 2.9120 (2.7604) grad_norm 3.6942 (inf) loss_scale 128.0000 (209.2110) mem 7381MB [2024-09-01 07:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][630/1251] eta 0:02:31 lr 0.000073 wd 0.0500 time 0.2392 (0.2433) data time 0.0009 (0.0017) model time 0.2383 (0.2415) loss 3.0177 (2.7630) grad_norm 6.1995 (inf) loss_scale 128.0000 (207.9239) mem 7381MB [2024-09-01 07:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][640/1251] eta 0:02:28 lr 0.000073 wd 0.0500 time 0.2373 (0.2433) data time 0.0009 (0.0017) model time 0.2364 (0.2415) loss 2.8353 (2.7657) grad_norm 4.8566 (inf) loss_scale 128.0000 (206.6771) mem 7381MB [2024-09-01 07:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][650/1251] eta 0:02:26 lr 0.000073 wd 0.0500 time 0.2468 (0.2433) data time 0.0008 (0.0017) model time 0.2460 (0.2416) loss 3.3790 (2.7652) grad_norm 4.5598 (inf) loss_scale 128.0000 (205.4685) mem 7381MB [2024-09-01 07:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][660/1251] eta 0:02:23 lr 0.000073 wd 0.0500 time 0.2577 (0.2433) data time 0.0010 (0.0017) model time 0.2567 (0.2416) loss 2.9319 (2.7655) grad_norm 4.2212 (inf) loss_scale 128.0000 (204.2965) mem 7381MB [2024-09-01 07:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[254/300][670/1251] eta 0:02:21 lr 0.000073 wd 0.0500 time 0.2440 (0.2434) data time 0.0011 (0.0017) model time 0.2429 (0.2416) loss 2.8951 (2.7667) grad_norm 3.8947 (inf) loss_scale 128.0000 (203.1595) mem 7381MB [2024-09-01 07:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][680/1251] eta 0:02:18 lr 0.000073 wd 0.0500 time 0.2484 (0.2433) data time 0.0008 (0.0017) model time 0.2476 (0.2416) loss 1.9036 (2.7672) grad_norm 5.8271 (inf) loss_scale 128.0000 (202.0558) mem 7381MB [2024-09-01 07:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][690/1251] eta 0:02:16 lr 0.000073 wd 0.0500 time 0.2409 (0.2433) data time 0.0007 (0.0016) model time 0.2401 (0.2416) loss 3.7597 (2.7678) grad_norm 7.0004 (inf) loss_scale 128.0000 (200.9841) mem 7381MB [2024-09-01 07:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][700/1251] eta 0:02:14 lr 0.000073 wd 0.0500 time 0.2321 (0.2433) data time 0.0011 (0.0016) model time 0.2310 (0.2416) loss 2.8640 (2.7695) grad_norm 6.0244 (inf) loss_scale 128.0000 (199.9429) mem 7381MB [2024-09-01 07:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][710/1251] eta 0:02:11 lr 0.000073 wd 0.0500 time 0.2535 (0.2433) data time 0.0008 (0.0016) model time 0.2527 (0.2416) loss 3.0011 (2.7710) grad_norm 5.6131 (inf) loss_scale 128.0000 (198.9311) mem 7381MB [2024-09-01 07:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][720/1251] eta 0:02:09 lr 0.000073 wd 0.0500 time 0.2423 (0.2433) data time 0.0011 (0.0016) model time 0.2412 (0.2417) loss 2.8592 (2.7712) grad_norm 3.7961 (inf) loss_scale 128.0000 (197.9473) mem 7381MB [2024-09-01 07:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][730/1251] eta 0:02:06 lr 0.000073 wd 0.0500 time 0.2428 (0.2433) data time 0.0009 (0.0016) model time 0.2419 (0.2417) loss 2.7306 (2.7704) grad_norm 2.9491 (inf) loss_scale 128.0000 (196.9904) mem 7381MB [2024-09-01 
07:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][740/1251] eta 0:02:04 lr 0.000073 wd 0.0500 time 0.2396 (0.2433) data time 0.0010 (0.0016) model time 0.2386 (0.2416) loss 3.0268 (2.7663) grad_norm 4.7933 (inf) loss_scale 128.0000 (196.0594) mem 7381MB [2024-09-01 07:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][750/1251] eta 0:02:01 lr 0.000073 wd 0.0500 time 0.2411 (0.2433) data time 0.0007 (0.0016) model time 0.2404 (0.2416) loss 2.4848 (2.7633) grad_norm 3.5328 (inf) loss_scale 128.0000 (195.1531) mem 7381MB [2024-09-01 07:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][760/1251] eta 0:01:59 lr 0.000073 wd 0.0500 time 0.2479 (0.2433) data time 0.0007 (0.0016) model time 0.2472 (0.2416) loss 3.2747 (2.7633) grad_norm 4.8423 (inf) loss_scale 128.0000 (194.2707) mem 7381MB [2024-09-01 07:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][770/1251] eta 0:01:57 lr 0.000073 wd 0.0500 time 0.2461 (0.2433) data time 0.0011 (0.0016) model time 0.2450 (0.2416) loss 2.2903 (2.7625) grad_norm 11.5551 (inf) loss_scale 128.0000 (193.4112) mem 7381MB [2024-09-01 07:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][780/1251] eta 0:01:54 lr 0.000073 wd 0.0500 time 0.2384 (0.2432) data time 0.0010 (0.0016) model time 0.2374 (0.2416) loss 3.2843 (2.7646) grad_norm 4.9992 (inf) loss_scale 128.0000 (192.5736) mem 7381MB [2024-09-01 07:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][790/1251] eta 0:01:52 lr 0.000073 wd 0.0500 time 0.2426 (0.2432) data time 0.0010 (0.0016) model time 0.2417 (0.2416) loss 3.2285 (2.7655) grad_norm 3.3756 (inf) loss_scale 128.0000 (191.7573) mem 7381MB [2024-09-01 07:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][800/1251] eta 0:01:49 lr 0.000073 wd 0.0500 time 0.2436 (0.2432) data time 0.0008 (0.0016) model time 0.2428 (0.2416) loss 2.9161 (2.7642) grad_norm 
3.7969 (inf) loss_scale 128.0000 (190.9613) mem 7381MB [2024-09-01 07:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][810/1251] eta 0:01:47 lr 0.000073 wd 0.0500 time 0.2425 (0.2432) data time 0.0011 (0.0015) model time 0.2415 (0.2416) loss 2.8950 (2.7635) grad_norm 6.6080 (inf) loss_scale 128.0000 (190.1850) mem 7381MB [2024-09-01 07:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][820/1251] eta 0:01:44 lr 0.000073 wd 0.0500 time 0.2397 (0.2432) data time 0.0007 (0.0015) model time 0.2389 (0.2416) loss 3.4257 (2.7633) grad_norm 5.0231 (inf) loss_scale 128.0000 (189.4275) mem 7381MB [2024-09-01 07:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][830/1251] eta 0:01:42 lr 0.000073 wd 0.0500 time 0.2557 (0.2432) data time 0.0007 (0.0015) model time 0.2550 (0.2416) loss 3.2796 (2.7625) grad_norm 4.5884 (inf) loss_scale 128.0000 (188.6883) mem 7381MB [2024-09-01 07:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][840/1251] eta 0:01:39 lr 0.000073 wd 0.0500 time 0.2449 (0.2432) data time 0.0007 (0.0015) model time 0.2442 (0.2416) loss 3.1123 (2.7655) grad_norm 4.0260 (inf) loss_scale 128.0000 (187.9667) mem 7381MB [2024-09-01 07:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][850/1251] eta 0:01:37 lr 0.000073 wd 0.0500 time 0.2460 (0.2432) data time 0.0008 (0.0015) model time 0.2452 (0.2416) loss 3.2290 (2.7661) grad_norm 4.0916 (inf) loss_scale 128.0000 (187.2620) mem 7381MB [2024-09-01 07:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][860/1251] eta 0:01:35 lr 0.000073 wd 0.0500 time 0.2394 (0.2432) data time 0.0010 (0.0015) model time 0.2383 (0.2416) loss 2.0114 (2.7645) grad_norm 4.1499 (inf) loss_scale 128.0000 (186.5738) mem 7381MB [2024-09-01 07:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][870/1251] eta 0:01:32 lr 0.000073 wd 0.0500 time 0.2457 (0.2432) data time 0.0008 
(0.0015) model time 0.2449 (0.2416) loss 2.3779 (2.7615) grad_norm 3.8611 (inf) loss_scale 128.0000 (185.9013) mem 7381MB [2024-09-01 07:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][880/1251] eta 0:01:30 lr 0.000073 wd 0.0500 time 0.2417 (0.2432) data time 0.0009 (0.0015) model time 0.2408 (0.2416) loss 3.0676 (2.7614) grad_norm 4.9136 (inf) loss_scale 128.0000 (185.2440) mem 7381MB [2024-09-01 07:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][890/1251] eta 0:01:27 lr 0.000073 wd 0.0500 time 0.2413 (0.2432) data time 0.0009 (0.0015) model time 0.2404 (0.2416) loss 2.9468 (2.7601) grad_norm 6.6136 (inf) loss_scale 128.0000 (184.6016) mem 7381MB [2024-09-01 07:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][900/1251] eta 0:01:25 lr 0.000073 wd 0.0500 time 0.2412 (0.2432) data time 0.0010 (0.0015) model time 0.2402 (0.2416) loss 2.9618 (2.7590) grad_norm 4.5113 (inf) loss_scale 128.0000 (183.9734) mem 7381MB [2024-09-01 08:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][910/1251] eta 0:01:22 lr 0.000072 wd 0.0500 time 0.2310 (0.2432) data time 0.0010 (0.0015) model time 0.2299 (0.2416) loss 2.0021 (2.7568) grad_norm 4.3274 (inf) loss_scale 128.0000 (183.3589) mem 7381MB [2024-09-01 08:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][920/1251] eta 0:01:20 lr 0.000072 wd 0.0500 time 0.2475 (0.2432) data time 0.0008 (0.0015) model time 0.2467 (0.2416) loss 2.5511 (2.7562) grad_norm 4.3866 (inf) loss_scale 128.0000 (182.7579) mem 7381MB [2024-09-01 08:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][930/1251] eta 0:01:18 lr 0.000072 wd 0.0500 time 0.2386 (0.2431) data time 0.0009 (0.0015) model time 0.2377 (0.2416) loss 3.4605 (2.7548) grad_norm 5.4389 (inf) loss_scale 128.0000 (182.1697) mem 7381MB [2024-09-01 08:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][940/1251] eta 
0:01:15 lr 0.000072 wd 0.0500 time 0.2389 (0.2431) data time 0.0009 (0.0015) model time 0.2380 (0.2416) loss 2.4903 (2.7568) grad_norm 3.9614 (inf) loss_scale 128.0000 (181.5940) mem 7381MB [2024-09-01 08:00:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][950/1251] eta 0:01:13 lr 0.000072 wd 0.0500 time 0.2438 (0.2431) data time 0.0010 (0.0015) model time 0.2428 (0.2416) loss 3.2160 (2.7565) grad_norm 4.1317 (inf) loss_scale 128.0000 (181.0305) mem 7381MB [2024-09-01 08:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][960/1251] eta 0:01:10 lr 0.000072 wd 0.0500 time 0.2456 (0.2431) data time 0.0010 (0.0015) model time 0.2446 (0.2415) loss 2.5887 (2.7583) grad_norm 10.4097 (inf) loss_scale 128.0000 (180.4787) mem 7381MB [2024-09-01 08:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][970/1251] eta 0:01:08 lr 0.000072 wd 0.0500 time 0.2353 (0.2430) data time 0.0010 (0.0015) model time 0.2343 (0.2415) loss 2.2079 (2.7584) grad_norm 5.3446 (inf) loss_scale 128.0000 (179.9382) mem 7381MB [2024-09-01 08:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][980/1251] eta 0:01:05 lr 0.000072 wd 0.0500 time 0.2445 (0.2430) data time 0.0009 (0.0014) model time 0.2436 (0.2415) loss 2.9536 (2.7578) grad_norm 4.6471 (inf) loss_scale 128.0000 (179.4088) mem 7381MB [2024-09-01 08:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][990/1251] eta 0:01:03 lr 0.000072 wd 0.0500 time 0.2413 (0.2430) data time 0.0011 (0.0014) model time 0.2402 (0.2415) loss 3.2763 (2.7568) grad_norm 4.7787 (inf) loss_scale 128.0000 (178.8900) mem 7381MB [2024-09-01 08:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1000/1251] eta 0:01:00 lr 0.000072 wd 0.0500 time 0.2493 (0.2430) data time 0.0009 (0.0014) model time 0.2484 (0.2415) loss 2.9910 (2.7572) grad_norm 5.0083 (inf) loss_scale 128.0000 (178.3816) mem 7381MB [2024-09-01 08:00:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1010/1251] eta 0:00:58 lr 0.000072 wd 0.0500 time 0.2460 (0.2430) data time 0.0007 (0.0014) model time 0.2453 (0.2415) loss 2.3497 (2.7582) grad_norm 6.2935 (inf) loss_scale 128.0000 (177.8833) mem 7381MB [2024-09-01 08:00:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1020/1251] eta 0:00:56 lr 0.000072 wd 0.0500 time 0.2393 (0.2430) data time 0.0009 (0.0014) model time 0.2384 (0.2415) loss 2.7811 (2.7577) grad_norm 3.2516 (inf) loss_scale 128.0000 (177.3947) mem 7381MB [2024-09-01 08:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1030/1251] eta 0:00:53 lr 0.000072 wd 0.0500 time 0.2412 (0.2429) data time 0.0011 (0.0014) model time 0.2402 (0.2415) loss 2.3064 (2.7587) grad_norm 5.8174 (inf) loss_scale 128.0000 (176.9156) mem 7381MB [2024-09-01 08:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1040/1251] eta 0:00:51 lr 0.000072 wd 0.0500 time 0.2390 (0.2429) data time 0.0008 (0.0014) model time 0.2382 (0.2414) loss 1.6846 (2.7597) grad_norm 3.8102 (inf) loss_scale 128.0000 (176.4457) mem 7381MB [2024-09-01 08:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1050/1251] eta 0:00:48 lr 0.000072 wd 0.0500 time 0.2389 (0.2429) data time 0.0011 (0.0014) model time 0.2378 (0.2414) loss 2.4739 (2.7579) grad_norm 6.6451 (inf) loss_scale 128.0000 (175.9848) mem 7381MB [2024-09-01 08:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1060/1251] eta 0:00:46 lr 0.000072 wd 0.0500 time 0.2439 (0.2429) data time 0.0009 (0.0014) model time 0.2430 (0.2415) loss 3.0805 (2.7594) grad_norm 3.9480 (inf) loss_scale 128.0000 (175.5325) mem 7381MB [2024-09-01 08:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1070/1251] eta 0:00:44 lr 0.000072 wd 0.0500 time 0.2557 (0.2433) data time 0.0009 (0.0014) model time 0.2548 (0.2418) loss 2.4685 (2.7590) grad_norm 
6.3634 (inf) loss_scale 128.0000 (175.0887) mem 7381MB [2024-09-01 08:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1080/1251] eta 0:00:41 lr 0.000072 wd 0.0500 time 0.2340 (0.2433) data time 0.0007 (0.0014) model time 0.2333 (0.2419) loss 1.6627 (2.7576) grad_norm 8.0200 (inf) loss_scale 128.0000 (174.6531) mem 7381MB [2024-09-01 08:00:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1090/1251] eta 0:00:39 lr 0.000072 wd 0.0500 time 0.2357 (0.2433) data time 0.0011 (0.0014) model time 0.2346 (0.2419) loss 2.1290 (2.7580) grad_norm 4.3401 (inf) loss_scale 128.0000 (174.2255) mem 7381MB [2024-09-01 08:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1100/1251] eta 0:00:36 lr 0.000072 wd 0.0500 time 0.2352 (0.2433) data time 0.0010 (0.0014) model time 0.2342 (0.2419) loss 2.6714 (2.7575) grad_norm 4.8419 (inf) loss_scale 128.0000 (173.8056) mem 7381MB [2024-09-01 08:00:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1110/1251] eta 0:00:34 lr 0.000072 wd 0.0500 time 0.2492 (0.2433) data time 0.0011 (0.0014) model time 0.2480 (0.2419) loss 3.2506 (2.7574) grad_norm 5.7253 (inf) loss_scale 128.0000 (173.3933) mem 7381MB [2024-09-01 08:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1120/1251] eta 0:00:31 lr 0.000072 wd 0.0500 time 0.2448 (0.2433) data time 0.0010 (0.0014) model time 0.2438 (0.2419) loss 3.2787 (2.7597) grad_norm 3.8530 (inf) loss_scale 128.0000 (172.9884) mem 7381MB [2024-09-01 08:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1130/1251] eta 0:00:29 lr 0.000072 wd 0.0500 time 0.2362 (0.2433) data time 0.0010 (0.0014) model time 0.2352 (0.2419) loss 3.0492 (2.7599) grad_norm 4.7826 (inf) loss_scale 128.0000 (172.5906) mem 7381MB [2024-09-01 08:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1140/1251] eta 0:00:27 lr 0.000072 wd 0.0500 time 0.2398 (0.2433) data time 
0.0007 (0.0014) model time 0.2391 (0.2418) loss 3.0428 (2.7601) grad_norm 4.3328 (inf) loss_scale 128.0000 (172.1998) mem 7381MB [2024-09-01 08:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1150/1251] eta 0:00:24 lr 0.000072 wd 0.0500 time 0.2362 (0.2432) data time 0.0011 (0.0014) model time 0.2350 (0.2418) loss 2.6939 (2.7591) grad_norm 4.3000 (inf) loss_scale 128.0000 (171.8158) mem 7381MB [2024-09-01 08:01:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1160/1251] eta 0:00:22 lr 0.000072 wd 0.0500 time 0.2431 (0.2432) data time 0.0008 (0.0014) model time 0.2424 (0.2418) loss 3.0154 (2.7592) grad_norm 4.1779 (inf) loss_scale 128.0000 (171.4384) mem 7381MB [2024-09-01 08:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1170/1251] eta 0:00:19 lr 0.000072 wd 0.0500 time 0.2438 (0.2432) data time 0.0010 (0.0014) model time 0.2428 (0.2418) loss 2.9579 (2.7584) grad_norm 5.4352 (inf) loss_scale 128.0000 (171.0675) mem 7381MB [2024-09-01 08:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1180/1251] eta 0:00:17 lr 0.000072 wd 0.0500 time 0.2501 (0.2432) data time 0.0009 (0.0014) model time 0.2492 (0.2418) loss 3.0508 (2.7599) grad_norm 4.6811 (inf) loss_scale 128.0000 (170.7028) mem 7381MB [2024-09-01 08:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1190/1251] eta 0:00:14 lr 0.000072 wd 0.0500 time 0.2403 (0.2432) data time 0.0008 (0.0014) model time 0.2395 (0.2418) loss 3.2272 (2.7594) grad_norm 4.7798 (inf) loss_scale 128.0000 (170.3442) mem 7381MB [2024-09-01 08:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1200/1251] eta 0:00:12 lr 0.000072 wd 0.0500 time 0.2483 (0.2434) data time 0.0010 (0.0014) model time 0.2474 (0.2420) loss 2.9464 (2.7596) grad_norm 3.9100 (inf) loss_scale 128.0000 (169.9917) mem 7381MB [2024-09-01 08:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[254/300][1210/1251] eta 0:00:09 lr 0.000072 wd 0.0500 time 0.2433 (0.2434) data time 0.0009 (0.0014) model time 0.2423 (0.2420) loss 3.1159 (2.7594) grad_norm 5.1078 (inf) loss_scale 128.0000 (169.6449) mem 7381MB [2024-09-01 08:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1220/1251] eta 0:00:07 lr 0.000072 wd 0.0500 time 0.2331 (0.2434) data time 0.0008 (0.0014) model time 0.2323 (0.2420) loss 3.4525 (2.7585) grad_norm 4.0192 (inf) loss_scale 128.0000 (169.3038) mem 7381MB [2024-09-01 08:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1230/1251] eta 0:00:05 lr 0.000072 wd 0.0500 time 0.2417 (0.2433) data time 0.0010 (0.0014) model time 0.2406 (0.2420) loss 2.8961 (2.7602) grad_norm 4.7669 (inf) loss_scale 128.0000 (168.9683) mem 7381MB [2024-09-01 08:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1240/1251] eta 0:00:02 lr 0.000072 wd 0.0500 time 0.2240 (0.2433) data time 0.0005 (0.0014) model time 0.2236 (0.2419) loss 3.0150 (2.7586) grad_norm 3.3951 (inf) loss_scale 128.0000 (168.6382) mem 7381MB [2024-09-01 08:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [254/300][1250/1251] eta 0:00:00 lr 0.000072 wd 0.0500 time 0.2223 (0.2431) data time 0.0005 (0.0013) model time 0.2218 (0.2417) loss 3.3850 (2.7613) grad_norm 4.6465 (inf) loss_scale 128.0000 (168.3133) mem 7381MB [2024-09-01 08:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 254 training takes 0:05:04 [2024-09-01 08:01:23 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:01:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 08:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.461 (0.461) Loss 0.3938 (0.3938) Acc@1 92.969 (92.969) Acc@5 98.438 (98.438) Mem 7381MB
[2024-09-01 08:01:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.110) Loss 0.5957 (0.6187) Acc@1 89.746 (87.038) Acc@5 97.656 (97.638) Mem 7381MB
[2024-09-01 08:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.097) Loss 0.9048 (0.6446) Acc@1 78.320 (85.993) Acc@5 96.191 (97.638) Mem 7381MB
[2024-09-01 08:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.1279 (0.7357) Acc@1 73.730 (83.858) Acc@5 92.285 (96.636) Mem 7381MB
[2024-09-01 08:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0293 (0.7848) Acc@1 76.074 (82.720) Acc@5 94.531 (96.125) Mem 7381MB
[2024-09-01 08:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.298 Acc@5 96.032
[2024-09-01 08:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.3%
[2024-09-01 08:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.818 (0.818) Loss 0.3801 (0.3801) Acc@1 93.457 (93.457) Acc@5 98.730 (98.730) Mem 7381MB
[2024-09-01 08:01:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.149) Loss 0.5674 (0.6000) Acc@1 90.430 (87.713) Acc@5 98.145 (97.843) Mem 7381MB
[2024-09-01 08:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.115) Loss 0.8931 (0.6304) Acc@1 78.125 (86.561) Acc@5 96.191 (97.749) Mem 7381MB
[2024-09-01 08:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.102) Loss 1.1074 (0.7176) Acc@1 73.633 (84.400) Acc@5 93.066 (96.888) Mem 7381MB
[2024-09-01 08:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 0.9907 (0.7641) Acc@1 76.758 (83.289) Acc@5 94.336 (96.382) Mem 7381MB
[2024-09-01 08:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.934 Acc@5 96.332
[2024-09-01 08:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9%
[2024-09-01 08:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][0/1251] eta 0:26:50 lr 0.000072 wd 0.0500 time 1.2870 (1.2870) data time 0.6483 (0.6483) model time 0.0000 (0.0000) loss 2.8111 (2.8111) grad_norm 4.6970 (4.6970) loss_scale 128.0000 (128.0000) mem 7381MB
[2024-09-01 08:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][10/1251] eta 0:06:58 lr 0.000072 wd 0.0500 time 0.2375 (0.3371) data time 0.0010 (0.0598) model time 0.0000 (0.0000) loss 3.1698 (2.9894) grad_norm 3.8488 (5.7685) loss_scale 128.0000 (128.0000) mem 7381MB
[2024-09-01 08:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][20/1251] eta 0:06:00 lr 0.000072 wd 0.0500 time 0.2405 (0.2925) data time 0.0010 (0.0318) model time 0.0000 (0.0000) loss 2.5004 (2.9324) grad_norm 5.1584 (5.4753) loss_scale 128.0000 (128.0000) mem 7381MB
[2024-09-01 08:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][30/1251] eta 0:05:36 lr 0.000072 wd 0.0500 time 0.2246 (0.2757) data time 0.0008 (0.0219) model time 0.0000 (0.0000) loss 3.3400 (2.8952) grad_norm 5.3243 (5.3171) loss_scale 128.0000 (128.0000) mem 7381MB
[2024-09-01 08:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][40/1251] eta 0:05:22 lr 0.000072 wd 0.0500 time 0.2364 (0.2665) data time 0.0009 (0.0168) model time 0.0000 (0.0000) loss 3.3427 (2.8786) grad_norm 2.9915 (4.9392) loss_scale 128.0000 (128.0000) mem 7381MB
[2024-09-01 08:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][50/1251] eta 0:05:14 lr 0.000072 wd 0.0500 time 0.2489 (0.2617) data time 0.0009 (0.0137) model time 0.0000 (0.0000) loss 2.6176
(2.8506) grad_norm 3.1601 (4.9949) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][60/1251] eta 0:05:08 lr 0.000072 wd 0.0500 time 0.2436 (0.2587) data time 0.0010 (0.0116) model time 0.2426 (0.2426) loss 3.1780 (2.8732) grad_norm 6.0910 (5.4851) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][70/1251] eta 0:05:02 lr 0.000072 wd 0.0500 time 0.2404 (0.2561) data time 0.0008 (0.0101) model time 0.2396 (0.2411) loss 3.3686 (2.8874) grad_norm 3.8040 (5.4185) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][80/1251] eta 0:04:57 lr 0.000072 wd 0.0500 time 0.2393 (0.2541) data time 0.0007 (0.0090) model time 0.2386 (0.2402) loss 2.1713 (2.8476) grad_norm 6.3815 (5.3170) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][90/1251] eta 0:04:53 lr 0.000072 wd 0.0500 time 0.2443 (0.2526) data time 0.0007 (0.0081) model time 0.2436 (0.2401) loss 2.9499 (2.8306) grad_norm 4.8060 (5.2613) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][100/1251] eta 0:04:49 lr 0.000072 wd 0.0500 time 0.2462 (0.2515) data time 0.0007 (0.0074) model time 0.2455 (0.2401) loss 3.1462 (2.8278) grad_norm 7.6720 (5.2028) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][110/1251] eta 0:04:45 lr 0.000072 wd 0.0500 time 0.2383 (0.2506) data time 0.0010 (0.0068) model time 0.2373 (0.2403) loss 2.8542 (2.8018) grad_norm 5.1666 (5.1550) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][120/1251] eta 0:04:42 lr 0.000072 wd 0.0500 
time 0.2430 (0.2497) data time 0.0007 (0.0063) model time 0.2423 (0.2400) loss 3.1709 (2.8138) grad_norm 3.1724 (5.1257) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][130/1251] eta 0:04:39 lr 0.000071 wd 0.0500 time 0.2434 (0.2489) data time 0.0012 (0.0059) model time 0.2422 (0.2398) loss 2.9010 (2.8018) grad_norm 2.7478 (5.0861) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][140/1251] eta 0:04:35 lr 0.000071 wd 0.0500 time 0.2461 (0.2482) data time 0.0009 (0.0056) model time 0.2452 (0.2396) loss 3.3623 (2.8076) grad_norm 4.1930 (5.0155) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][150/1251] eta 0:04:32 lr 0.000071 wd 0.0500 time 0.2312 (0.2476) data time 0.0012 (0.0053) model time 0.2300 (0.2395) loss 2.9170 (2.8006) grad_norm 3.0138 (5.2057) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][160/1251] eta 0:04:29 lr 0.000071 wd 0.0500 time 0.2365 (0.2472) data time 0.0008 (0.0050) model time 0.2357 (0.2395) loss 3.2238 (2.7973) grad_norm 3.3536 (5.1431) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][170/1251] eta 0:04:26 lr 0.000071 wd 0.0500 time 0.2360 (0.2468) data time 0.0011 (0.0048) model time 0.2349 (0.2395) loss 2.1026 (2.7809) grad_norm 7.2830 (5.1247) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][180/1251] eta 0:04:23 lr 0.000071 wd 0.0500 time 0.2431 (0.2464) data time 0.0007 (0.0046) model time 0.2424 (0.2394) loss 3.0854 (2.7778) grad_norm 4.1404 (5.1017) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [255/300][190/1251] eta 0:04:21 lr 0.000071 wd 0.0500 time 0.2481 (0.2462) data time 0.0008 (0.0044) model time 0.2473 (0.2396) loss 2.2993 (2.7803) grad_norm 3.9294 (5.3693) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][200/1251] eta 0:04:18 lr 0.000071 wd 0.0500 time 0.2443 (0.2459) data time 0.0008 (0.0042) model time 0.2434 (0.2396) loss 3.1436 (2.7858) grad_norm 2.7837 (5.3524) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][210/1251] eta 0:04:15 lr 0.000071 wd 0.0500 time 0.2429 (0.2458) data time 0.0009 (0.0041) model time 0.2420 (0.2397) loss 3.4285 (2.7875) grad_norm 5.9771 (5.3227) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][220/1251] eta 0:04:13 lr 0.000071 wd 0.0500 time 0.2516 (0.2456) data time 0.0009 (0.0039) model time 0.2506 (0.2397) loss 2.9802 (2.7994) grad_norm 4.3873 (5.2670) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][230/1251] eta 0:04:10 lr 0.000071 wd 0.0500 time 0.2472 (0.2453) data time 0.0007 (0.0038) model time 0.2465 (0.2397) loss 2.7332 (2.8012) grad_norm 3.7306 (5.2540) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][240/1251] eta 0:04:07 lr 0.000071 wd 0.0500 time 0.2455 (0.2452) data time 0.0011 (0.0037) model time 0.2444 (0.2398) loss 3.2804 (2.7969) grad_norm 4.5061 (5.2421) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][250/1251] eta 0:04:05 lr 0.000071 wd 0.0500 time 0.2442 (0.2450) data time 0.0010 (0.0036) model time 0.2433 (0.2398) loss 2.9310 (2.7916) grad_norm 107.5315 
(5.6311) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][260/1251] eta 0:04:02 lr 0.000071 wd 0.0500 time 0.2361 (0.2449) data time 0.0011 (0.0035) model time 0.2350 (0.2398) loss 3.0544 (2.7877) grad_norm 3.6279 (5.5891) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][270/1251] eta 0:04:00 lr 0.000071 wd 0.0500 time 0.2416 (0.2448) data time 0.0009 (0.0034) model time 0.2407 (0.2399) loss 2.8589 (2.7827) grad_norm 6.5873 (5.5693) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][280/1251] eta 0:03:57 lr 0.000071 wd 0.0500 time 0.2420 (0.2446) data time 0.0007 (0.0033) model time 0.2413 (0.2398) loss 3.2414 (2.7805) grad_norm 3.0904 (5.5370) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][290/1251] eta 0:03:54 lr 0.000071 wd 0.0500 time 0.2309 (0.2444) data time 0.0008 (0.0032) model time 0.2301 (0.2398) loss 1.9378 (2.7834) grad_norm 3.8376 (5.5010) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][300/1251] eta 0:03:52 lr 0.000071 wd 0.0500 time 0.2376 (0.2443) data time 0.0011 (0.0031) model time 0.2366 (0.2397) loss 2.4600 (2.7880) grad_norm 6.1278 (5.4780) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][310/1251] eta 0:03:50 lr 0.000071 wd 0.0500 time 0.2364 (0.2447) data time 0.0009 (0.0031) model time 0.2355 (0.2404) loss 3.0977 (2.7822) grad_norm 6.2392 (5.4429) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][320/1251] eta 0:03:47 lr 0.000071 wd 0.0500 time 0.2411 (0.2446) data 
time 0.0009 (0.0030) model time 0.2402 (0.2403) loss 3.0270 (2.7792) grad_norm 5.9843 (5.4326) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][330/1251] eta 0:03:45 lr 0.000071 wd 0.0500 time 0.2374 (0.2444) data time 0.0011 (0.0030) model time 0.2364 (0.2402) loss 2.7045 (2.7753) grad_norm 6.4338 (5.4547) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][340/1251] eta 0:03:42 lr 0.000071 wd 0.0500 time 0.2367 (0.2442) data time 0.0009 (0.0029) model time 0.2359 (0.2402) loss 2.5276 (2.7699) grad_norm 3.9377 (5.4320) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][350/1251] eta 0:03:39 lr 0.000071 wd 0.0500 time 0.2332 (0.2442) data time 0.0007 (0.0028) model time 0.2325 (0.2402) loss 2.6672 (2.7630) grad_norm 4.1341 (5.4142) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][360/1251] eta 0:03:37 lr 0.000071 wd 0.0500 time 0.2466 (0.2440) data time 0.0009 (0.0028) model time 0.2457 (0.2402) loss 3.1292 (2.7709) grad_norm 5.6416 (5.3859) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][370/1251] eta 0:03:34 lr 0.000071 wd 0.0500 time 0.2369 (0.2440) data time 0.0009 (0.0027) model time 0.2359 (0.2402) loss 2.8563 (2.7729) grad_norm 5.3237 (5.3864) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][380/1251] eta 0:03:32 lr 0.000071 wd 0.0500 time 0.2423 (0.2439) data time 0.0010 (0.0027) model time 0.2412 (0.2402) loss 2.3391 (2.7693) grad_norm 5.7980 (5.3959) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [255/300][390/1251] eta 0:03:30 lr 0.000071 wd 0.0500 time 0.2339 (0.2439) data time 0.0010 (0.0027) model time 0.2329 (0.2403) loss 2.5033 (2.7726) grad_norm 4.5900 (5.4204) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][400/1251] eta 0:03:27 lr 0.000071 wd 0.0500 time 0.2389 (0.2439) data time 0.0007 (0.0026) model time 0.2381 (0.2403) loss 2.8885 (2.7771) grad_norm 4.9269 (5.4076) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][410/1251] eta 0:03:25 lr 0.000071 wd 0.0500 time 0.2385 (0.2439) data time 0.0007 (0.0026) model time 0.2378 (0.2404) loss 3.3247 (2.7825) grad_norm 5.4456 (5.3835) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][420/1251] eta 0:03:23 lr 0.000071 wd 0.0500 time 0.2296 (0.2443) data time 0.0008 (0.0025) model time 0.2288 (0.2409) loss 3.2808 (2.7801) grad_norm 5.1984 (5.3506) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][430/1251] eta 0:03:20 lr 0.000071 wd 0.0500 time 0.2309 (0.2442) data time 0.0008 (0.0025) model time 0.2301 (0.2408) loss 2.3655 (2.7774) grad_norm 4.1694 (5.3391) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][440/1251] eta 0:03:17 lr 0.000071 wd 0.0500 time 0.2402 (0.2441) data time 0.0009 (0.0025) model time 0.2393 (0.2408) loss 1.8053 (2.7800) grad_norm 3.5438 (5.3155) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][450/1251] eta 0:03:15 lr 0.000071 wd 0.0500 time 0.2393 (0.2440) data time 0.0007 (0.0024) model time 0.2386 (0.2407) loss 2.9641 (2.7838) grad_norm 6.6480 (5.2999) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 08:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][460/1251] eta 0:03:12 lr 0.000071 wd 0.0500 time 0.2535 (0.2440) data time 0.0008 (0.0024) model time 0.2526 (0.2408) loss 3.2691 (2.7792) grad_norm 5.5988 (5.3234) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][470/1251] eta 0:03:10 lr 0.000071 wd 0.0500 time 0.4608 (0.2445) data time 0.0009 (0.0024) model time 0.4599 (0.2414) loss 2.3558 (2.7759) grad_norm 5.6910 (5.3004) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][480/1251] eta 0:03:08 lr 0.000071 wd 0.0500 time 0.2427 (0.2444) data time 0.0007 (0.0023) model time 0.2420 (0.2413) loss 2.5797 (2.7754) grad_norm 9.1333 (5.2974) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][490/1251] eta 0:03:05 lr 0.000071 wd 0.0500 time 0.2396 (0.2443) data time 0.0009 (0.0023) model time 0.2387 (0.2413) loss 3.2706 (2.7767) grad_norm 3.6421 (5.2808) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][500/1251] eta 0:03:03 lr 0.000071 wd 0.0500 time 0.2377 (0.2443) data time 0.0009 (0.0023) model time 0.2368 (0.2413) loss 2.8543 (2.7745) grad_norm 5.1173 (5.2563) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][510/1251] eta 0:03:00 lr 0.000071 wd 0.0500 time 0.2431 (0.2442) data time 0.0009 (0.0023) model time 0.2421 (0.2412) loss 2.6972 (2.7705) grad_norm 3.6308 (5.2474) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][520/1251] eta 0:02:58 lr 0.000071 wd 0.0500 time 0.2447 (0.2441) data time 0.0009 (0.0022) model 
time 0.2438 (0.2412) loss 2.3173 (2.7733) grad_norm 3.1403 (5.2336) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][530/1251] eta 0:02:55 lr 0.000071 wd 0.0500 time 0.2346 (0.2441) data time 0.0009 (0.0022) model time 0.2337 (0.2412) loss 3.2335 (2.7769) grad_norm 4.0092 (5.2233) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][540/1251] eta 0:02:53 lr 0.000071 wd 0.0500 time 0.2296 (0.2440) data time 0.0012 (0.0022) model time 0.2284 (0.2412) loss 2.5703 (2.7755) grad_norm 3.7157 (5.2064) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][550/1251] eta 0:02:51 lr 0.000071 wd 0.0500 time 0.2391 (0.2440) data time 0.0007 (0.0022) model time 0.2384 (0.2412) loss 2.8593 (2.7756) grad_norm 4.3810 (5.1888) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][560/1251] eta 0:02:48 lr 0.000071 wd 0.0500 time 0.2385 (0.2439) data time 0.0007 (0.0022) model time 0.2378 (0.2411) loss 1.5786 (2.7728) grad_norm 6.5784 (5.2720) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][570/1251] eta 0:02:46 lr 0.000071 wd 0.0500 time 0.2396 (0.2439) data time 0.0008 (0.0021) model time 0.2388 (0.2412) loss 2.3784 (2.7742) grad_norm 5.7016 (5.2831) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][580/1251] eta 0:02:43 lr 0.000071 wd 0.0500 time 0.2446 (0.2439) data time 0.0009 (0.0021) model time 0.2437 (0.2412) loss 2.4657 (2.7751) grad_norm 4.5103 (5.2858) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][590/1251] 
eta 0:02:41 lr 0.000071 wd 0.0500 time 0.2452 (0.2438) data time 0.0010 (0.0021) model time 0.2442 (0.2412) loss 3.3149 (2.7754) grad_norm 6.5452 (5.2777) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][600/1251] eta 0:02:38 lr 0.000070 wd 0.0500 time 0.2415 (0.2438) data time 0.0007 (0.0021) model time 0.2407 (0.2412) loss 2.9606 (2.7720) grad_norm 3.3532 (5.2583) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][610/1251] eta 0:02:36 lr 0.000070 wd 0.0500 time 0.2525 (0.2438) data time 0.0007 (0.0021) model time 0.2518 (0.2412) loss 3.3953 (2.7718) grad_norm 4.7407 (5.2514) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][620/1251] eta 0:02:33 lr 0.000070 wd 0.0500 time 0.2402 (0.2438) data time 0.0011 (0.0020) model time 0.2391 (0.2412) loss 2.9560 (2.7690) grad_norm 3.2675 (5.2307) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][630/1251] eta 0:02:31 lr 0.000070 wd 0.0500 time 0.2336 (0.2437) data time 0.0009 (0.0020) model time 0.2327 (0.2412) loss 2.8918 (2.7696) grad_norm 3.9778 (5.2229) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][640/1251] eta 0:02:28 lr 0.000070 wd 0.0500 time 0.2411 (0.2437) data time 0.0012 (0.0020) model time 0.2399 (0.2412) loss 1.6110 (2.7668) grad_norm 4.0255 (5.2502) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][650/1251] eta 0:02:26 lr 0.000070 wd 0.0500 time 0.2432 (0.2437) data time 0.0012 (0.0020) model time 0.2420 (0.2411) loss 2.8793 (2.7689) grad_norm 5.8093 (5.2398) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
08:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][660/1251] eta 0:02:23 lr 0.000070 wd 0.0500 time 0.2497 (0.2436) data time 0.0009 (0.0020) model time 0.2488 (0.2411) loss 3.2068 (2.7683) grad_norm 5.8910 (5.2970) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][670/1251] eta 0:02:21 lr 0.000070 wd 0.0500 time 0.2466 (0.2436) data time 0.0007 (0.0020) model time 0.2459 (0.2411) loss 3.0536 (2.7700) grad_norm 4.1974 (5.2865) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][680/1251] eta 0:02:19 lr 0.000070 wd 0.0500 time 0.2403 (0.2435) data time 0.0007 (0.0019) model time 0.2396 (0.2411) loss 2.0747 (2.7740) grad_norm 4.0824 (5.2751) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][690/1251] eta 0:02:16 lr 0.000070 wd 0.0500 time 0.2403 (0.2435) data time 0.0007 (0.0019) model time 0.2395 (0.2411) loss 3.0031 (2.7758) grad_norm 3.8974 (5.2624) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][700/1251] eta 0:02:14 lr 0.000070 wd 0.0500 time 0.2404 (0.2435) data time 0.0008 (0.0019) model time 0.2396 (0.2411) loss 2.9577 (2.7751) grad_norm 3.3598 (5.2597) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][710/1251] eta 0:02:11 lr 0.000070 wd 0.0500 time 0.2405 (0.2435) data time 0.0010 (0.0019) model time 0.2395 (0.2411) loss 3.2116 (2.7739) grad_norm 4.2067 (5.2470) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][720/1251] eta 0:02:09 lr 0.000070 wd 0.0500 time 0.2465 (0.2435) data time 0.0009 (0.0019) model time 0.2456 (0.2411) loss 2.6235 
(2.7763) grad_norm 11.0152 (5.2492) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][730/1251] eta 0:02:06 lr 0.000070 wd 0.0500 time 0.2412 (0.2435) data time 0.0008 (0.0019) model time 0.2404 (0.2411) loss 1.9865 (2.7760) grad_norm 3.4139 (5.2324) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][740/1251] eta 0:02:04 lr 0.000070 wd 0.0500 time 0.2547 (0.2435) data time 0.0009 (0.0019) model time 0.2538 (0.2411) loss 2.2723 (2.7800) grad_norm 4.1871 (5.2160) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][750/1251] eta 0:02:01 lr 0.000070 wd 0.0500 time 0.2367 (0.2434) data time 0.0008 (0.0019) model time 0.2359 (0.2411) loss 1.8841 (2.7787) grad_norm 5.9728 (5.2101) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][760/1251] eta 0:01:59 lr 0.000070 wd 0.0500 time 0.2389 (0.2435) data time 0.0011 (0.0018) model time 0.2378 (0.2412) loss 2.3406 (2.7762) grad_norm 4.9236 (5.2156) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][770/1251] eta 0:01:57 lr 0.000070 wd 0.0500 time 0.2500 (0.2435) data time 0.0009 (0.0018) model time 0.2490 (0.2412) loss 3.2589 (2.7775) grad_norm 3.4274 (5.2134) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][780/1251] eta 0:01:54 lr 0.000070 wd 0.0500 time 0.2510 (0.2435) data time 0.0009 (0.0018) model time 0.2501 (0.2412) loss 2.5241 (2.7762) grad_norm 4.8739 (5.1933) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][790/1251] eta 0:01:52 lr 0.000070 wd 
0.0500 time 0.2438 (0.2434) data time 0.0009 (0.0018) model time 0.2430 (0.2412) loss 2.7614 (2.7753) grad_norm 4.2279 (5.1812) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][800/1251] eta 0:01:49 lr 0.000070 wd 0.0500 time 0.2330 (0.2434) data time 0.0009 (0.0018) model time 0.2322 (0.2412) loss 2.9518 (2.7749) grad_norm 7.1270 (5.1759) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][810/1251] eta 0:01:47 lr 0.000070 wd 0.0500 time 0.2340 (0.2433) data time 0.0008 (0.0018) model time 0.2332 (0.2411) loss 3.0641 (2.7745) grad_norm 11.3663 (5.1799) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][820/1251] eta 0:01:44 lr 0.000070 wd 0.0500 time 0.2418 (0.2433) data time 0.0008 (0.0018) model time 0.2411 (0.2411) loss 3.3207 (2.7752) grad_norm 4.1066 (5.1765) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][830/1251] eta 0:01:42 lr 0.000070 wd 0.0500 time 0.2376 (0.2433) data time 0.0010 (0.0018) model time 0.2366 (0.2411) loss 2.0220 (2.7748) grad_norm 5.1901 (5.1753) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][840/1251] eta 0:01:39 lr 0.000070 wd 0.0500 time 0.2467 (0.2433) data time 0.0009 (0.0018) model time 0.2458 (0.2411) loss 2.7642 (2.7778) grad_norm 4.9643 (5.1733) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][850/1251] eta 0:01:37 lr 0.000070 wd 0.0500 time 0.2439 (0.2433) data time 0.0010 (0.0017) model time 0.2429 (0.2411) loss 3.1451 (2.7776) grad_norm 6.5828 (5.1675) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][860/1251] eta 0:01:35 lr 0.000070 wd 0.0500 time 0.2390 (0.2432) data time 0.0009 (0.0017) model time 0.2381 (0.2411) loss 2.9565 (2.7797) grad_norm 4.8196 (5.1652) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][870/1251] eta 0:01:32 lr 0.000070 wd 0.0500 time 0.2353 (0.2432) data time 0.0007 (0.0017) model time 0.2346 (0.2411) loss 3.1715 (2.7795) grad_norm 3.2039 (5.1749) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][880/1251] eta 0:01:30 lr 0.000070 wd 0.0500 time 0.2357 (0.2432) data time 0.0010 (0.0017) model time 0.2347 (0.2410) loss 3.2395 (2.7806) grad_norm 5.3279 (5.1635) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][890/1251] eta 0:01:27 lr 0.000070 wd 0.0500 time 0.2424 (0.2431) data time 0.0010 (0.0017) model time 0.2413 (0.2410) loss 2.5454 (2.7794) grad_norm 4.2114 (5.1622) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][900/1251] eta 0:01:25 lr 0.000070 wd 0.0500 time 0.2449 (0.2431) data time 0.0007 (0.0017) model time 0.2442 (0.2410) loss 3.3537 (2.7809) grad_norm 3.8801 (5.1579) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][910/1251] eta 0:01:22 lr 0.000070 wd 0.0500 time 0.2404 (0.2431) data time 0.0011 (0.0017) model time 0.2393 (0.2410) loss 2.7291 (2.7822) grad_norm 4.8945 (5.1584) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][920/1251] eta 0:01:20 lr 0.000070 wd 0.0500 time 0.2379 (0.2431) data time 0.0011 (0.0017) model time 0.2369 (0.2410) loss 2.6698 (2.7826) 
grad_norm 4.1907 (5.1503) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][930/1251] eta 0:01:18 lr 0.000070 wd 0.0500 time 0.2491 (0.2431) data time 0.0010 (0.0017) model time 0.2481 (0.2411) loss 2.6284 (2.7833) grad_norm 4.5132 (5.1459) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][940/1251] eta 0:01:15 lr 0.000070 wd 0.0500 time 0.2418 (0.2431) data time 0.0007 (0.0017) model time 0.2411 (0.2411) loss 2.5846 (2.7827) grad_norm 5.5203 (5.1392) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][950/1251] eta 0:01:13 lr 0.000070 wd 0.0500 time 0.2439 (0.2431) data time 0.0009 (0.0017) model time 0.2430 (0.2411) loss 3.1984 (2.7779) grad_norm 7.4183 (5.1365) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][960/1251] eta 0:01:10 lr 0.000070 wd 0.0500 time 0.2366 (0.2431) data time 0.0012 (0.0017) model time 0.2354 (0.2411) loss 3.0432 (2.7773) grad_norm 3.0780 (5.1267) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][970/1251] eta 0:01:08 lr 0.000070 wd 0.0500 time 0.2413 (0.2431) data time 0.0009 (0.0017) model time 0.2404 (0.2411) loss 2.7545 (2.7780) grad_norm 13.4410 (5.1430) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][980/1251] eta 0:01:05 lr 0.000070 wd 0.0500 time 0.2513 (0.2431) data time 0.0007 (0.0016) model time 0.2506 (0.2411) loss 2.3731 (2.7805) grad_norm 2.5257 (5.1280) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][990/1251] eta 0:01:03 lr 0.000070 wd 0.0500 time 
0.2419 (0.2431) data time 0.0007 (0.0016) model time 0.2413 (0.2411) loss 2.6036 (2.7813) grad_norm 5.7124 (5.1232) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1000/1251] eta 0:01:01 lr 0.000070 wd 0.0500 time 0.2339 (0.2431) data time 0.0010 (0.0016) model time 0.2328 (0.2411) loss 2.5843 (2.7809) grad_norm 4.0400 (5.1106) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1010/1251] eta 0:00:58 lr 0.000070 wd 0.0500 time 0.2449 (0.2431) data time 0.0008 (0.0016) model time 0.2441 (0.2411) loss 3.7199 (2.7789) grad_norm 4.6362 (5.1122) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1020/1251] eta 0:00:56 lr 0.000070 wd 0.0500 time 0.2394 (0.2431) data time 0.0011 (0.0016) model time 0.2383 (0.2411) loss 3.1038 (2.7789) grad_norm 6.0030 (5.1121) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1030/1251] eta 0:00:53 lr 0.000070 wd 0.0500 time 0.2340 (0.2431) data time 0.0012 (0.0016) model time 0.2328 (0.2411) loss 3.2042 (2.7783) grad_norm 4.3540 (5.1104) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1040/1251] eta 0:00:51 lr 0.000070 wd 0.0500 time 0.2370 (0.2431) data time 0.0007 (0.0016) model time 0.2363 (0.2411) loss 3.5435 (2.7797) grad_norm 6.0324 (5.1092) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1050/1251] eta 0:00:48 lr 0.000070 wd 0.0500 time 0.2431 (0.2430) data time 0.0007 (0.0016) model time 0.2425 (0.2411) loss 2.2738 (2.7803) grad_norm 5.6525 (5.1045) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [255/300][1060/1251] eta 0:00:46 lr 0.000070 wd 0.0500 time 0.2395 (0.2430) data time 0.0007 (0.0016) model time 0.2388 (0.2411) loss 2.2579 (2.7799) grad_norm 4.6485 (5.1040) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1070/1251] eta 0:00:43 lr 0.000069 wd 0.0500 time 0.2370 (0.2430) data time 0.0011 (0.0016) model time 0.2359 (0.2411) loss 3.3292 (2.7796) grad_norm 4.4107 (5.1064) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1080/1251] eta 0:00:41 lr 0.000069 wd 0.0500 time 0.2444 (0.2430) data time 0.0010 (0.0016) model time 0.2435 (0.2411) loss 2.6257 (2.7800) grad_norm 3.7431 (5.1053) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1090/1251] eta 0:00:39 lr 0.000069 wd 0.0500 time 0.2444 (0.2430) data time 0.0008 (0.0016) model time 0.2436 (0.2411) loss 3.3461 (2.7805) grad_norm 4.0504 (5.1200) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1100/1251] eta 0:00:36 lr 0.000069 wd 0.0500 time 0.2358 (0.2430) data time 0.0011 (0.0016) model time 0.2347 (0.2411) loss 3.0428 (2.7801) grad_norm 4.3804 (5.1228) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1110/1251] eta 0:00:34 lr 0.000069 wd 0.0500 time 0.2347 (0.2430) data time 0.0009 (0.0016) model time 0.2338 (0.2411) loss 2.0798 (2.7791) grad_norm 3.9410 (5.1129) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1120/1251] eta 0:00:31 lr 0.000069 wd 0.0500 time 0.2398 (0.2430) data time 0.0011 (0.0016) model time 0.2388 (0.2411) loss 2.7327 (2.7789) grad_norm 4.0954 
(5.1062) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1130/1251] eta 0:00:29 lr 0.000069 wd 0.0500 time 0.2446 (0.2430) data time 0.0009 (0.0016) model time 0.2438 (0.2411) loss 3.0820 (2.7778) grad_norm 6.0290 (5.1273) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1140/1251] eta 0:00:26 lr 0.000069 wd 0.0500 time 0.2368 (0.2430) data time 0.0009 (0.0016) model time 0.2359 (0.2411) loss 2.9927 (2.7768) grad_norm 6.5443 (5.1252) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1150/1251] eta 0:00:24 lr 0.000069 wd 0.0500 time 0.2377 (0.2429) data time 0.0009 (0.0016) model time 0.2368 (0.2411) loss 3.6839 (2.7795) grad_norm 6.7714 (5.1275) loss_scale 256.0000 (128.8897) mem 7381MB [2024-09-01 08:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1160/1251] eta 0:00:22 lr 0.000069 wd 0.0500 time 0.2342 (0.2429) data time 0.0007 (0.0015) model time 0.2335 (0.2410) loss 1.8800 (2.7793) grad_norm 4.4175 (5.1231) loss_scale 256.0000 (129.9845) mem 7381MB [2024-09-01 08:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1170/1251] eta 0:00:19 lr 0.000069 wd 0.0500 time 0.2318 (0.2429) data time 0.0008 (0.0015) model time 0.2310 (0.2410) loss 2.7300 (2.7790) grad_norm 3.9316 (5.1174) loss_scale 256.0000 (131.0606) mem 7381MB [2024-09-01 08:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1180/1251] eta 0:00:17 lr 0.000069 wd 0.0500 time 0.2366 (0.2429) data time 0.0012 (0.0015) model time 0.2354 (0.2410) loss 3.1314 (2.7789) grad_norm 3.4197 (5.1100) loss_scale 256.0000 (132.1185) mem 7381MB [2024-09-01 08:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1190/1251] eta 0:00:14 lr 0.000069 wd 0.0500 time 0.2417 
(0.2428) data time 0.0011 (0.0015) model time 0.2406 (0.2410) loss 3.2720 (2.7796) grad_norm 3.5680 (5.1039) loss_scale 256.0000 (133.1587) mem 7381MB [2024-09-01 08:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1200/1251] eta 0:00:12 lr 0.000069 wd 0.0500 time 0.2345 (0.2428) data time 0.0007 (0.0015) model time 0.2337 (0.2410) loss 3.0184 (2.7790) grad_norm 5.7955 (5.1036) loss_scale 256.0000 (134.1815) mem 7381MB [2024-09-01 08:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1210/1251] eta 0:00:09 lr 0.000069 wd 0.0500 time 0.2434 (0.2428) data time 0.0009 (0.0015) model time 0.2425 (0.2410) loss 2.5250 (2.7805) grad_norm 5.0806 (5.0940) loss_scale 256.0000 (135.1874) mem 7381MB [2024-09-01 08:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1220/1251] eta 0:00:07 lr 0.000069 wd 0.0500 time 0.2378 (0.2428) data time 0.0008 (0.0015) model time 0.2370 (0.2410) loss 2.6280 (2.7814) grad_norm 4.7228 (5.0899) loss_scale 256.0000 (136.1769) mem 7381MB [2024-09-01 08:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1230/1251] eta 0:00:05 lr 0.000069 wd 0.0500 time 0.2365 (0.2428) data time 0.0007 (0.0015) model time 0.2358 (0.2410) loss 1.9242 (2.7798) grad_norm 3.7003 (5.0896) loss_scale 256.0000 (137.1503) mem 7381MB [2024-09-01 08:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1240/1251] eta 0:00:02 lr 0.000069 wd 0.0500 time 0.2232 (0.2429) data time 0.0005 (0.0015) model time 0.2227 (0.2411) loss 3.6433 (2.7797) grad_norm 4.3998 (5.0903) loss_scale 256.0000 (138.1080) mem 7381MB [2024-09-01 08:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [255/300][1250/1251] eta 0:00:00 lr 0.000069 wd 0.0500 time 0.2257 (0.2427) data time 0.0005 (0.0015) model time 0.2253 (0.2409) loss 2.3764 (2.7775) grad_norm 3.9491 (5.0869) loss_scale 256.0000 (139.0504) mem 7381MB [2024-09-01 08:06:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 398): INFO EPOCH 255 training takes 0:05:03 [2024-09-01 08:06:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:06:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 08:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.457 (0.457) Loss 0.3896 (0.3896) Acc@1 93.359 (93.359) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 08:06:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.111) Loss 0.5786 (0.6166) Acc@1 89.648 (87.012) Acc@5 97.656 (97.665) Mem 7381MB [2024-09-01 08:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.096) Loss 0.9209 (0.6460) Acc@1 77.148 (86.007) Acc@5 95.703 (97.605) Mem 7381MB [2024-09-01 08:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.090) Loss 1.1465 (0.7370) Acc@1 72.949 (83.915) Acc@5 92.676 (96.680) Mem 7381MB [2024-09-01 08:06:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 0.9819 (0.7842) Acc@1 77.344 (82.786) Acc@5 94.336 (96.170) Mem 7381MB [2024-09-01 08:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.318 Acc@5 96.070 [2024-09-01 08:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.3% [2024-09-01 08:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.850 (0.850) Loss 0.3804 (0.3804) Acc@1 93.359 (93.359) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 08:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.151) Loss 0.5679 (0.6007) Acc@1 90.527 (87.775) Acc@5 98.047 (97.825) Mem 7381MB [2024-09-01 08:06:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.117) Loss 0.8950 (0.6310) 
Acc@1 78.027 (86.616) Acc@5 96.094 (97.745) Mem 7381MB [2024-09-01 08:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.105) Loss 1.1074 (0.7183) Acc@1 73.926 (84.444) Acc@5 93.164 (96.884) Mem 7381MB [2024-09-01 08:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 0.9907 (0.7647) Acc@1 77.051 (83.320) Acc@5 94.336 (96.382) Mem 7381MB [2024-09-01 08:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.956 Acc@5 96.326 [2024-09-01 08:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 08:06:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.96% [2024-09-01 08:06:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 08:06:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 08:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][0/1251] eta 0:15:19 lr 0.000069 wd 0.0500 time 0.7348 (0.7348) data time 0.5097 (0.5097) model time 0.0000 (0.0000) loss 2.4628 (2.4628) grad_norm 3.9528 (3.9528) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][10/1251] eta 0:05:52 lr 0.000069 wd 0.0500 time 0.2465 (0.2844) data time 0.0007 (0.0472) model time 0.0000 (0.0000) loss 2.9298 (2.5609) grad_norm 4.8708 (6.1624) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][20/1251] eta 0:05:25 lr 0.000069 wd 0.0500 time 0.2361 (0.2642) data time 0.0010 (0.0252) model time 0.0000 (0.0000) loss 3.4963 (2.6084) grad_norm 6.3564 (6.2904) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][30/1251] eta 0:05:12 lr 0.000069 wd 0.0500 time 0.2399 (0.2562) data time 0.0007 (0.0174) model time 0.0000 (0.0000) loss 1.9709 (2.6052) grad_norm 3.7215 (6.0367) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][40/1251] eta 0:05:05 lr 0.000069 wd 0.0500 time 0.2482 (0.2525) data time 0.0010 (0.0134) model time 0.0000 (0.0000) loss 2.6339 (2.6021) grad_norm 4.2355 (5.6890) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][50/1251] eta 0:05:00 lr 0.000069 wd 0.0500 time 0.2393 (0.2501) data time 0.0009 (0.0110) model time 0.0000 (0.0000) loss 1.9483 (2.5973) grad_norm 11.4976 (5.7336) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][60/1251] eta 0:05:00 lr 0.000069 wd 0.0500 time 0.2408 (0.2522) data time 0.0011 (0.0093) model time 0.2397 (0.2617) loss 
2.9060 (2.6394) grad_norm 5.6918 (5.7970) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][70/1251] eta 0:04:55 lr 0.000069 wd 0.0500 time 0.2395 (0.2505) data time 0.0009 (0.0082) model time 0.2386 (0.2505) loss 2.9408 (2.6569) grad_norm 5.7659 (5.7407) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][80/1251] eta 0:04:52 lr 0.000069 wd 0.0500 time 0.2456 (0.2495) data time 0.0011 (0.0073) model time 0.2444 (0.2473) loss 3.2964 (2.6753) grad_norm 8.2486 (5.7601) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][90/1251] eta 0:04:53 lr 0.000069 wd 0.0500 time 0.2304 (0.2530) data time 0.0011 (0.0066) model time 0.2293 (0.2557) loss 3.0646 (2.6762) grad_norm 7.3366 (5.7486) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][100/1251] eta 0:04:49 lr 0.000069 wd 0.0500 time 0.2418 (0.2519) data time 0.0009 (0.0060) model time 0.2409 (0.2527) loss 3.4576 (2.7002) grad_norm 4.1597 (5.6271) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][110/1251] eta 0:04:46 lr 0.000069 wd 0.0500 time 0.2395 (0.2508) data time 0.0008 (0.0056) model time 0.2387 (0.2503) loss 2.6460 (2.6919) grad_norm 4.5590 (5.4764) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][120/1251] eta 0:04:42 lr 0.000069 wd 0.0500 time 0.2423 (0.2502) data time 0.0010 (0.0052) model time 0.2412 (0.2492) loss 2.9913 (2.6881) grad_norm 3.5732 (5.4113) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][130/1251] eta 0:04:39 lr 0.000069 wd 
0.0500 time 0.2412 (0.2493) data time 0.0009 (0.0049) model time 0.2403 (0.2477) loss 3.3312 (2.7158) grad_norm 6.9842 (5.4174) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][140/1251] eta 0:04:36 lr 0.000069 wd 0.0500 time 0.2399 (0.2485) data time 0.0007 (0.0046) model time 0.2391 (0.2466) loss 2.8367 (2.7209) grad_norm 5.4013 (5.4206) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][150/1251] eta 0:04:33 lr 0.000069 wd 0.0500 time 0.2459 (0.2481) data time 0.0008 (0.0044) model time 0.2451 (0.2461) loss 3.2282 (2.7118) grad_norm 4.4559 (5.3549) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][160/1251] eta 0:04:30 lr 0.000069 wd 0.0500 time 0.2299 (0.2476) data time 0.0009 (0.0042) model time 0.2290 (0.2454) loss 2.3878 (2.6888) grad_norm 3.8550 (5.2705) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][170/1251] eta 0:04:27 lr 0.000069 wd 0.0500 time 0.2418 (0.2470) data time 0.0010 (0.0040) model time 0.2408 (0.2447) loss 3.0464 (2.6973) grad_norm 4.7665 (5.2011) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][180/1251] eta 0:04:24 lr 0.000069 wd 0.0500 time 0.2391 (0.2467) data time 0.0008 (0.0038) model time 0.2382 (0.2443) loss 3.2676 (2.6997) grad_norm 3.3631 (5.1358) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][190/1251] eta 0:04:21 lr 0.000069 wd 0.0500 time 0.2445 (0.2465) data time 0.0009 (0.0037) model time 0.2436 (0.2442) loss 2.5851 (2.7062) grad_norm 3.5293 (5.1084) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:35 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [256/300][200/1251] eta 0:04:18 lr 0.000069 wd 0.0500 time 0.2443 (0.2463) data time 0.0009 (0.0035) model time 0.2434 (0.2440) loss 3.2951 (2.7121) grad_norm 4.8720 (5.1098) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][210/1251] eta 0:04:16 lr 0.000069 wd 0.0500 time 0.2393 (0.2461) data time 0.0010 (0.0034) model time 0.2382 (0.2438) loss 2.9762 (2.7133) grad_norm 6.1051 (5.1013) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][220/1251] eta 0:04:13 lr 0.000069 wd 0.0500 time 0.2424 (0.2458) data time 0.0010 (0.0033) model time 0.2414 (0.2436) loss 3.0096 (2.7194) grad_norm 3.7172 (5.1302) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][230/1251] eta 0:04:10 lr 0.000069 wd 0.0500 time 0.2425 (0.2457) data time 0.0010 (0.0032) model time 0.2415 (0.2435) loss 2.1750 (2.7253) grad_norm 5.3356 (5.0959) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][240/1251] eta 0:04:08 lr 0.000069 wd 0.0500 time 0.2472 (0.2456) data time 0.0011 (0.0031) model time 0.2461 (0.2434) loss 2.6724 (2.7272) grad_norm 4.9143 (5.0661) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][250/1251] eta 0:04:05 lr 0.000069 wd 0.0500 time 0.2448 (0.2455) data time 0.0009 (0.0030) model time 0.2439 (0.2433) loss 3.1021 (2.7320) grad_norm 7.3769 (5.1146) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][260/1251] eta 0:04:03 lr 0.000069 wd 0.0500 time 0.2404 (0.2453) data time 0.0010 (0.0029) model time 0.2395 (0.2431) loss 2.9791 (2.7330) grad_norm 4.3881 
(5.0929) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][270/1251] eta 0:04:00 lr 0.000069 wd 0.0500 time 0.2361 (0.2452) data time 0.0011 (0.0029) model time 0.2350 (0.2431) loss 3.0772 (2.7361) grad_norm 3.2955 (5.0500) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][280/1251] eta 0:03:57 lr 0.000069 wd 0.0500 time 0.2318 (0.2451) data time 0.0011 (0.0028) model time 0.2307 (0.2430) loss 3.6250 (2.7446) grad_norm 4.1335 (5.0304) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][290/1251] eta 0:03:55 lr 0.000068 wd 0.0500 time 0.2432 (0.2449) data time 0.0010 (0.0027) model time 0.2422 (0.2428) loss 3.1679 (2.7537) grad_norm 3.3873 (5.2505) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][300/1251] eta 0:03:52 lr 0.000068 wd 0.0500 time 0.2440 (0.2448) data time 0.0007 (0.0027) model time 0.2433 (0.2428) loss 3.5613 (2.7654) grad_norm 3.6738 (5.2160) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][310/1251] eta 0:03:50 lr 0.000068 wd 0.0500 time 0.2453 (0.2447) data time 0.0007 (0.0026) model time 0.2446 (0.2427) loss 2.5005 (2.7623) grad_norm 8.0300 (5.1985) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][320/1251] eta 0:03:47 lr 0.000068 wd 0.0500 time 0.2445 (0.2447) data time 0.0008 (0.0026) model time 0.2436 (0.2426) loss 2.9174 (2.7614) grad_norm 7.8652 (5.1779) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][330/1251] eta 0:03:45 lr 0.000068 wd 0.0500 time 0.2455 (0.2446) data 
time 0.0007 (0.0025) model time 0.2448 (0.2426) loss 3.0483 (2.7666) grad_norm 4.2293 (5.2820) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][340/1251] eta 0:03:42 lr 0.000068 wd 0.0500 time 0.2423 (0.2445) data time 0.0008 (0.0025) model time 0.2415 (0.2426) loss 3.3756 (2.7713) grad_norm 3.8131 (5.2600) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][350/1251] eta 0:03:40 lr 0.000068 wd 0.0500 time 0.2400 (0.2444) data time 0.0010 (0.0024) model time 0.2390 (0.2425) loss 3.1523 (2.7770) grad_norm 4.6412 (5.2416) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][360/1251] eta 0:03:37 lr 0.000068 wd 0.0500 time 0.2409 (0.2443) data time 0.0007 (0.0024) model time 0.2401 (0.2424) loss 2.6539 (2.7743) grad_norm 4.3005 (5.2586) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][370/1251] eta 0:03:35 lr 0.000068 wd 0.0500 time 0.2425 (0.2442) data time 0.0011 (0.0024) model time 0.2414 (0.2422) loss 2.4107 (2.7708) grad_norm 4.7982 (5.2472) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][380/1251] eta 0:03:32 lr 0.000068 wd 0.0500 time 0.2476 (0.2441) data time 0.0007 (0.0023) model time 0.2469 (0.2422) loss 2.7776 (2.7711) grad_norm 3.8688 (5.2449) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][390/1251] eta 0:03:30 lr 0.000068 wd 0.0500 time 0.2376 (0.2441) data time 0.0008 (0.0023) model time 0.2368 (0.2422) loss 2.0083 (2.7734) grad_norm 5.3885 (5.2306) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [256/300][400/1251] eta 0:03:27 lr 0.000068 wd 0.0500 time 0.2356 (0.2440) data time 0.0007 (0.0023) model time 0.2348 (0.2422) loss 3.3090 (2.7758) grad_norm 4.0293 (5.2151) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][410/1251] eta 0:03:25 lr 0.000068 wd 0.0500 time 0.2493 (0.2440) data time 0.0007 (0.0022) model time 0.2486 (0.2421) loss 1.7576 (2.7765) grad_norm 4.1755 (5.1819) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][420/1251] eta 0:03:22 lr 0.000068 wd 0.0500 time 0.2345 (0.2439) data time 0.0010 (0.0022) model time 0.2335 (0.2421) loss 2.9366 (2.7740) grad_norm 5.6151 (5.1802) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][430/1251] eta 0:03:20 lr 0.000068 wd 0.0500 time 0.2446 (0.2442) data time 0.0010 (0.0022) model time 0.2436 (0.2424) loss 3.3472 (2.7709) grad_norm 3.6597 (5.1476) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][440/1251] eta 0:03:17 lr 0.000068 wd 0.0500 time 0.2292 (0.2441) data time 0.0008 (0.0021) model time 0.2284 (0.2423) loss 2.2670 (2.7673) grad_norm 4.1638 (5.1524) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][450/1251] eta 0:03:15 lr 0.000068 wd 0.0500 time 0.2434 (0.2441) data time 0.0010 (0.0021) model time 0.2424 (0.2423) loss 2.6921 (2.7700) grad_norm 4.2110 (5.1524) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][460/1251] eta 0:03:13 lr 0.000068 wd 0.0500 time 0.2384 (0.2441) data time 0.0012 (0.0021) model time 0.2371 (0.2423) loss 2.8738 (2.7683) grad_norm 3.9944 (5.1707) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 08:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][470/1251] eta 0:03:10 lr 0.000068 wd 0.0500 time 0.2411 (0.2440) data time 0.0010 (0.0021) model time 0.2401 (0.2423) loss 2.8966 (2.7686) grad_norm 4.5336 (5.1637) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][480/1251] eta 0:03:08 lr 0.000068 wd 0.0500 time 0.2488 (0.2440) data time 0.0007 (0.0021) model time 0.2482 (0.2422) loss 2.1259 (2.7637) grad_norm 5.2004 (5.1471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][490/1251] eta 0:03:05 lr 0.000068 wd 0.0500 time 0.2385 (0.2439) data time 0.0009 (0.0020) model time 0.2376 (0.2422) loss 2.9866 (2.7661) grad_norm 5.9793 (5.1330) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][500/1251] eta 0:03:03 lr 0.000068 wd 0.0500 time 0.2433 (0.2438) data time 0.0009 (0.0020) model time 0.2424 (0.2421) loss 3.1788 (2.7636) grad_norm 5.8763 (5.1306) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][510/1251] eta 0:03:00 lr 0.000068 wd 0.0500 time 0.2400 (0.2437) data time 0.0007 (0.0020) model time 0.2393 (0.2420) loss 3.0244 (2.7635) grad_norm 3.2607 (5.1428) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][520/1251] eta 0:02:58 lr 0.000068 wd 0.0500 time 0.2429 (0.2437) data time 0.0012 (0.0020) model time 0.2417 (0.2420) loss 3.0365 (2.7697) grad_norm 3.4936 (5.1325) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][530/1251] eta 0:02:55 lr 0.000068 wd 0.0500 time 0.2362 (0.2436) data time 0.0010 (0.0020) model 
time 0.2352 (0.2419) loss 2.9067 (2.7698) grad_norm 3.4564 (5.1240) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][540/1251] eta 0:02:53 lr 0.000068 wd 0.0500 time 0.2411 (0.2436) data time 0.0009 (0.0019) model time 0.2401 (0.2419) loss 3.6667 (2.7744) grad_norm 5.1509 (5.1212) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][550/1251] eta 0:02:50 lr 0.000068 wd 0.0500 time 0.2412 (0.2436) data time 0.0009 (0.0019) model time 0.2403 (0.2419) loss 2.8601 (2.7747) grad_norm 5.3514 (5.1511) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][560/1251] eta 0:02:48 lr 0.000068 wd 0.0500 time 0.2393 (0.2435) data time 0.0009 (0.0019) model time 0.2384 (0.2419) loss 3.3841 (2.7759) grad_norm 6.1500 (5.1401) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][570/1251] eta 0:02:45 lr 0.000068 wd 0.0500 time 0.2396 (0.2435) data time 0.0011 (0.0019) model time 0.2385 (0.2419) loss 3.0728 (2.7783) grad_norm 4.7716 (5.1422) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][580/1251] eta 0:02:43 lr 0.000068 wd 0.0500 time 0.2430 (0.2435) data time 0.0011 (0.0019) model time 0.2419 (0.2419) loss 3.0364 (2.7839) grad_norm 3.6563 (5.1271) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][590/1251] eta 0:02:40 lr 0.000068 wd 0.0500 time 0.2405 (0.2435) data time 0.0009 (0.0019) model time 0.2395 (0.2419) loss 2.9471 (2.7829) grad_norm 3.7381 (5.1355) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][600/1251] 
eta 0:02:38 lr 0.000068 wd 0.0500 time 0.2539 (0.2435) data time 0.0008 (0.0018) model time 0.2530 (0.2419) loss 3.4293 (2.7863) grad_norm 5.0369 (5.1305) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][610/1251] eta 0:02:36 lr 0.000068 wd 0.0500 time 0.2409 (0.2435) data time 0.0010 (0.0018) model time 0.2399 (0.2419) loss 2.5241 (2.7874) grad_norm 7.6395 (5.1314) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][620/1251] eta 0:02:33 lr 0.000068 wd 0.0500 time 0.2414 (0.2434) data time 0.0007 (0.0018) model time 0.2407 (0.2418) loss 3.2008 (2.7869) grad_norm 4.9391 (5.1232) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][630/1251] eta 0:02:31 lr 0.000068 wd 0.0500 time 0.2414 (0.2434) data time 0.0009 (0.0018) model time 0.2405 (0.2418) loss 2.2727 (2.7883) grad_norm 4.0398 (5.1092) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][640/1251] eta 0:02:28 lr 0.000068 wd 0.0500 time 0.2345 (0.2434) data time 0.0009 (0.0018) model time 0.2336 (0.2418) loss 3.0873 (2.7862) grad_norm 6.2065 (5.1121) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][650/1251] eta 0:02:26 lr 0.000068 wd 0.0500 time 0.2420 (0.2434) data time 0.0008 (0.0018) model time 0.2412 (0.2418) loss 2.0789 (2.7851) grad_norm 5.2593 (5.1075) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][660/1251] eta 0:02:23 lr 0.000068 wd 0.0500 time 0.2350 (0.2433) data time 0.0010 (0.0018) model time 0.2340 (0.2418) loss 2.2374 (2.7829) grad_norm 5.1407 (5.1057) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
08:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][670/1251] eta 0:02:21 lr 0.000068 wd 0.0500 time 0.2437 (0.2433) data time 0.0011 (0.0017) model time 0.2426 (0.2418) loss 3.2017 (2.7818) grad_norm 3.8534 (5.0941) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][680/1251] eta 0:02:18 lr 0.000068 wd 0.0500 time 0.2361 (0.2433) data time 0.0011 (0.0017) model time 0.2350 (0.2418) loss 2.3576 (2.7801) grad_norm 3.4092 (5.0928) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][690/1251] eta 0:02:16 lr 0.000068 wd 0.0500 time 0.2450 (0.2433) data time 0.0008 (0.0017) model time 0.2443 (0.2418) loss 2.8413 (2.7801) grad_norm 5.5683 (5.1001) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][700/1251] eta 0:02:14 lr 0.000068 wd 0.0500 time 0.2513 (0.2433) data time 0.0010 (0.0017) model time 0.2503 (0.2418) loss 2.8225 (2.7790) grad_norm 3.0613 (5.0939) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][710/1251] eta 0:02:11 lr 0.000068 wd 0.0500 time 0.2407 (0.2433) data time 0.0009 (0.0017) model time 0.2398 (0.2417) loss 3.0208 (2.7823) grad_norm 4.1618 (5.1064) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][720/1251] eta 0:02:09 lr 0.000068 wd 0.0500 time 0.2420 (0.2432) data time 0.0011 (0.0017) model time 0.2409 (0.2417) loss 3.0099 (2.7829) grad_norm 4.7391 (5.0988) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][730/1251] eta 0:02:06 lr 0.000068 wd 0.0500 time 0.2408 (0.2432) data time 0.0007 (0.0017) model time 0.2401 (0.2417) loss 3.0630 
(2.7827) grad_norm 4.0558 (5.0965) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][740/1251] eta 0:02:04 lr 0.000068 wd 0.0500 time 0.2405 (0.2432) data time 0.0008 (0.0017) model time 0.2397 (0.2417) loss 2.2614 (2.7799) grad_norm 3.8275 (5.0853) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][750/1251] eta 0:02:01 lr 0.000068 wd 0.0500 time 0.2387 (0.2432) data time 0.0009 (0.0017) model time 0.2377 (0.2417) loss 3.0192 (2.7813) grad_norm 3.6298 (5.0826) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][760/1251] eta 0:01:59 lr 0.000068 wd 0.0500 time 0.2362 (0.2431) data time 0.0009 (0.0017) model time 0.2353 (0.2417) loss 1.8949 (2.7805) grad_norm 5.1777 (5.0771) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][770/1251] eta 0:01:56 lr 0.000067 wd 0.0500 time 0.2460 (0.2431) data time 0.0007 (0.0017) model time 0.2453 (0.2416) loss 2.3078 (2.7771) grad_norm 3.4365 (5.0683) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][780/1251] eta 0:01:54 lr 0.000067 wd 0.0500 time 0.2416 (0.2431) data time 0.0009 (0.0016) model time 0.2407 (0.2416) loss 2.8199 (2.7761) grad_norm 2.6849 (5.0614) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][790/1251] eta 0:01:52 lr 0.000067 wd 0.0500 time 0.2405 (0.2431) data time 0.0007 (0.0016) model time 0.2398 (0.2416) loss 2.2103 (2.7744) grad_norm 3.6623 (5.0628) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][800/1251] eta 0:01:49 lr 0.000067 wd 0.0500 
time 0.2376 (0.2431) data time 0.0010 (0.0016) model time 0.2367 (0.2416) loss 3.0021 (2.7720) grad_norm 4.3767 (5.0642) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][810/1251] eta 0:01:47 lr 0.000067 wd 0.0500 time 0.2345 (0.2430) data time 0.0007 (0.0016) model time 0.2338 (0.2416) loss 2.9704 (2.7764) grad_norm 4.9968 (5.0908) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][820/1251] eta 0:01:44 lr 0.000067 wd 0.0500 time 0.2378 (0.2430) data time 0.0008 (0.0016) model time 0.2370 (0.2416) loss 2.5796 (2.7750) grad_norm 3.9348 (5.0851) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][830/1251] eta 0:01:42 lr 0.000067 wd 0.0500 time 0.2390 (0.2431) data time 0.0008 (0.0016) model time 0.2381 (0.2416) loss 3.2077 (2.7740) grad_norm 3.5253 (5.1201) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][840/1251] eta 0:01:39 lr 0.000067 wd 0.0500 time 0.2456 (0.2430) data time 0.0008 (0.0016) model time 0.2448 (0.2416) loss 2.5922 (2.7737) grad_norm 4.8698 (5.1162) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][850/1251] eta 0:01:37 lr 0.000067 wd 0.0500 time 0.2312 (0.2430) data time 0.0009 (0.0016) model time 0.2303 (0.2416) loss 1.8453 (2.7736) grad_norm 3.6288 (5.1109) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][860/1251] eta 0:01:35 lr 0.000067 wd 0.0500 time 0.2409 (0.2430) data time 0.0009 (0.0016) model time 0.2399 (0.2415) loss 2.5290 (2.7763) grad_norm 4.5582 (5.1217) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [256/300][870/1251] eta 0:01:32 lr 0.000067 wd 0.0500 time 0.2428 (0.2429) data time 0.0009 (0.0016) model time 0.2418 (0.2415) loss 2.9885 (2.7772) grad_norm 11.3775 (5.1466) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][880/1251] eta 0:01:30 lr 0.000067 wd 0.0500 time 0.2380 (0.2429) data time 0.0009 (0.0016) model time 0.2371 (0.2415) loss 2.8246 (2.7768) grad_norm 3.5374 (5.1376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][890/1251] eta 0:01:27 lr 0.000067 wd 0.0500 time 0.2412 (0.2429) data time 0.0007 (0.0016) model time 0.2405 (0.2415) loss 3.1843 (2.7771) grad_norm 5.5704 (5.1374) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][900/1251] eta 0:01:25 lr 0.000067 wd 0.0500 time 0.2406 (0.2429) data time 0.0009 (0.0016) model time 0.2397 (0.2415) loss 3.0627 (2.7792) grad_norm 4.4693 (5.1393) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][910/1251] eta 0:01:22 lr 0.000067 wd 0.0500 time 0.2377 (0.2429) data time 0.0012 (0.0015) model time 0.2365 (0.2415) loss 2.9127 (2.7804) grad_norm 3.1014 (5.1374) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][920/1251] eta 0:01:20 lr 0.000067 wd 0.0500 time 0.2429 (0.2429) data time 0.0009 (0.0015) model time 0.2420 (0.2415) loss 2.7907 (2.7788) grad_norm 4.0055 (5.1305) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][930/1251] eta 0:01:17 lr 0.000067 wd 0.0500 time 0.2369 (0.2429) data time 0.0009 (0.0015) model time 0.2360 (0.2414) loss 3.0507 (2.7790) grad_norm 4.3836 
(5.1219) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][940/1251] eta 0:01:15 lr 0.000067 wd 0.0500 time 0.2382 (0.2429) data time 0.0011 (0.0015) model time 0.2372 (0.2415) loss 3.3202 (2.7785) grad_norm 4.5580 (5.1333) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][950/1251] eta 0:01:13 lr 0.000067 wd 0.0500 time 0.4599 (0.2431) data time 0.0011 (0.0015) model time 0.4588 (0.2417) loss 3.2854 (2.7810) grad_norm 7.1619 (5.1329) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][960/1251] eta 0:01:10 lr 0.000067 wd 0.0500 time 0.2348 (0.2430) data time 0.0011 (0.0015) model time 0.2337 (0.2416) loss 3.1529 (2.7796) grad_norm 4.2687 (5.1231) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][970/1251] eta 0:01:08 lr 0.000067 wd 0.0500 time 0.2349 (0.2430) data time 0.0009 (0.0015) model time 0.2340 (0.2416) loss 3.4020 (2.7794) grad_norm 22.1273 (5.1335) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][980/1251] eta 0:01:05 lr 0.000067 wd 0.0500 time 0.2407 (0.2430) data time 0.0011 (0.0015) model time 0.2396 (0.2416) loss 2.8098 (2.7801) grad_norm 7.4661 (5.2267) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][990/1251] eta 0:01:03 lr 0.000067 wd 0.0500 time 0.2444 (0.2432) data time 0.0009 (0.0015) model time 0.2434 (0.2418) loss 2.7156 (2.7806) grad_norm 4.2766 (5.2339) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1000/1251] eta 0:01:01 lr 0.000067 wd 0.0500 time 0.2424 (0.2432) 
data time 0.0009 (0.0015) model time 0.2415 (0.2418) loss 2.6798 (2.7823) grad_norm 5.6661 (5.2841) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1010/1251] eta 0:00:58 lr 0.000067 wd 0.0500 time 0.2389 (0.2431) data time 0.0007 (0.0015) model time 0.2381 (0.2418) loss 1.9935 (2.7807) grad_norm 4.4517 (5.2855) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1020/1251] eta 0:00:56 lr 0.000067 wd 0.0500 time 0.2417 (0.2435) data time 0.0009 (0.0015) model time 0.2408 (0.2422) loss 3.1818 (2.7784) grad_norm 6.7770 (5.2940) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1030/1251] eta 0:00:53 lr 0.000067 wd 0.0500 time 0.2489 (0.2435) data time 0.0007 (0.0015) model time 0.2482 (0.2422) loss 3.5627 (2.7763) grad_norm 9.8373 (5.2953) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1040/1251] eta 0:00:51 lr 0.000067 wd 0.0500 time 0.2439 (0.2435) data time 0.0012 (0.0015) model time 0.2426 (0.2421) loss 3.0160 (2.7784) grad_norm 4.1829 (5.2939) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1050/1251] eta 0:00:48 lr 0.000067 wd 0.0500 time 0.2429 (0.2435) data time 0.0007 (0.0015) model time 0.2422 (0.2421) loss 3.2470 (2.7797) grad_norm 13.6607 (5.2974) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1060/1251] eta 0:00:46 lr 0.000067 wd 0.0500 time 0.2402 (0.2435) data time 0.0009 (0.0015) model time 0.2393 (0.2421) loss 2.1361 (2.7796) grad_norm 3.2715 (5.2925) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [256/300][1070/1251] eta 0:00:44 lr 0.000067 wd 0.0500 time 0.2434 (0.2434) data time 0.0007 (0.0015) model time 0.2427 (0.2421) loss 3.0667 (2.7805) grad_norm 4.0482 (5.2986) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1080/1251] eta 0:00:41 lr 0.000067 wd 0.0500 time 0.2421 (0.2434) data time 0.0008 (0.0015) model time 0.2413 (0.2421) loss 2.5901 (2.7779) grad_norm 4.5723 (5.2880) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1090/1251] eta 0:00:39 lr 0.000067 wd 0.0500 time 0.2459 (0.2434) data time 0.0007 (0.0015) model time 0.2452 (0.2421) loss 3.0756 (2.7787) grad_norm 3.0631 (5.2755) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1100/1251] eta 0:00:36 lr 0.000067 wd 0.0500 time 0.2409 (0.2434) data time 0.0008 (0.0015) model time 0.2401 (0.2421) loss 2.3235 (2.7776) grad_norm 3.6414 (5.2665) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1110/1251] eta 0:00:34 lr 0.000067 wd 0.0500 time 0.2436 (0.2434) data time 0.0011 (0.0014) model time 0.2425 (0.2421) loss 2.6317 (2.7776) grad_norm 6.2964 (5.2731) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1120/1251] eta 0:00:31 lr 0.000067 wd 0.0500 time 0.2347 (0.2434) data time 0.0009 (0.0014) model time 0.2338 (0.2421) loss 2.5643 (2.7790) grad_norm 5.2078 (5.2745) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1130/1251] eta 0:00:29 lr 0.000067 wd 0.0500 time 0.2468 (0.2434) data time 0.0007 (0.0014) model time 0.2461 (0.2421) loss 2.8201 (2.7780) grad_norm 3.7712 (5.2671) loss_scale 
256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1140/1251] eta 0:00:27 lr 0.000067 wd 0.0500 time 0.2406 (0.2433) data time 0.0007 (0.0014) model time 0.2399 (0.2420) loss 3.0400 (2.7785) grad_norm 4.4176 (5.2595) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1150/1251] eta 0:00:24 lr 0.000067 wd 0.0500 time 0.2325 (0.2433) data time 0.0010 (0.0014) model time 0.2314 (0.2420) loss 2.6227 (2.7796) grad_norm 3.0937 (5.2598) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1160/1251] eta 0:00:22 lr 0.000067 wd 0.0500 time 0.2428 (0.2433) data time 0.0008 (0.0014) model time 0.2420 (0.2420) loss 3.2216 (2.7798) grad_norm 4.7240 (5.2620) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1170/1251] eta 0:00:19 lr 0.000067 wd 0.0500 time 0.2434 (0.2433) data time 0.0010 (0.0014) model time 0.2424 (0.2420) loss 3.4338 (2.7802) grad_norm 3.8947 (5.2621) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1180/1251] eta 0:00:17 lr 0.000067 wd 0.0500 time 0.2446 (0.2433) data time 0.0010 (0.0014) model time 0.2436 (0.2420) loss 2.3549 (2.7805) grad_norm 3.7571 (5.2679) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1190/1251] eta 0:00:14 lr 0.000067 wd 0.0500 time 0.2386 (0.2433) data time 0.0012 (0.0014) model time 0.2375 (0.2420) loss 2.8988 (2.7807) grad_norm 5.2748 (5.2611) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1200/1251] eta 0:00:12 lr 0.000067 wd 0.0500 time 0.2335 (0.2433) data time 0.0007 
(0.0014) model time 0.2328 (0.2420) loss 2.1579 (2.7819) grad_norm 4.7560 (5.2537) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1210/1251] eta 0:00:09 lr 0.000067 wd 0.0500 time 0.2400 (0.2433) data time 0.0007 (0.0014) model time 0.2393 (0.2420) loss 2.7266 (2.7794) grad_norm 4.5396 (5.2457) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1220/1251] eta 0:00:07 lr 0.000067 wd 0.0500 time 0.2370 (0.2433) data time 0.0009 (0.0014) model time 0.2361 (0.2420) loss 3.0506 (2.7793) grad_norm 5.2619 (5.2400) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1230/1251] eta 0:00:05 lr 0.000067 wd 0.0500 time 0.2419 (0.2433) data time 0.0007 (0.0014) model time 0.2412 (0.2420) loss 3.6509 (2.7795) grad_norm 4.4981 (5.2801) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1240/1251] eta 0:00:02 lr 0.000067 wd 0.0500 time 0.2262 (0.2432) data time 0.0005 (0.0014) model time 0.2257 (0.2419) loss 3.2581 (2.7809) grad_norm 4.2670 (5.2689) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [256/300][1250/1251] eta 0:00:00 lr 0.000067 wd 0.0500 time 0.2241 (0.2430) data time 0.0004 (0.0014) model time 0.2236 (0.2417) loss 3.4082 (2.7814) grad_norm 4.0647 (5.2693) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 256 training takes 0:05:04 [2024-09-01 08:11:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 08:11:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 08:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.435 (0.435) Loss 0.3987 (0.3987) Acc@1 93.652 (93.652) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 08:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.093 (0.113) Loss 0.5938 (0.6180) Acc@1 88.867 (87.456) Acc@5 97.656 (97.727) Mem 7381MB [2024-09-01 08:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.096) Loss 0.9331 (0.6487) Acc@1 77.441 (86.300) Acc@5 95.117 (97.624) Mem 7381MB [2024-09-01 08:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.089) Loss 1.1191 (0.7403) Acc@1 74.219 (84.110) Acc@5 92.480 (96.645) Mem 7381MB [2024-09-01 08:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0098 (0.7910) Acc@1 76.953 (82.853) Acc@5 93.750 (96.127) Mem 7381MB [2024-09-01 08:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.434 Acc@5 96.072 [2024-09-01 08:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-09-01 08:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.43% [2024-09-01 08:11:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 08:11:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 08:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.461 (0.461) Loss 0.3809 (0.3809) Acc@1 93.359 (93.359) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 08:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.109) Loss 0.5674 (0.6010) Acc@1 90.430 (87.757) Acc@5 97.949 (97.816) Mem 7381MB [2024-09-01 08:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.094) Loss 0.8945 (0.6310) Acc@1 77.930 (86.584) Acc@5 96.191 (97.749) Mem 7381MB [2024-09-01 08:11:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.088) Loss 1.1084 (0.7187) Acc@1 74.121 (84.425) Acc@5 93.164 (96.891) Mem 7381MB [2024-09-01 08:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 0.9897 (0.7650) Acc@1 76.855 (83.303) Acc@5 94.238 (96.384) Mem 7381MB [2024-09-01 08:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.928 Acc@5 96.326 [2024-09-01 08:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 08:12:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][0/1251] eta 0:23:41 lr 0.000067 wd 0.0500 time 1.1365 (1.1365) data time 0.6270 (0.6270) model time 0.0000 (0.0000) loss 2.8409 (2.8409) grad_norm 6.1829 (6.1829) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][10/1251] eta 0:06:37 lr 0.000066 wd 0.0500 time 0.2482 (0.3204) data time 0.0007 (0.0578) model time 0.0000 (0.0000) loss 3.2928 (2.8833) grad_norm 7.1036 (6.9675) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][20/1251] eta 0:05:47 lr 0.000066 wd 0.0500 time 0.2514 (0.2820) data time 0.0007 (0.0308) model time 0.0000 (0.0000) loss 1.8050 (2.8711) grad_norm 4.0101 (6.0469) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][30/1251] eta 0:05:28 lr 0.000066 wd 0.0500 time 0.2398 (0.2687) data time 0.0008 (0.0212) model time 0.0000 (0.0000) loss 1.7678 (2.7669) grad_norm 5.8319 (5.5101) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][40/1251] eta 0:05:17 lr 0.000066 wd 0.0500 time 0.2436 (0.2620) data time 0.0009 (0.0162) model time 0.0000 (0.0000) loss 1.6820 (2.7550) grad_norm 5.1114 (5.3117) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][50/1251] eta 0:05:09 lr 0.000066 wd 0.0500 time 0.2434 (0.2580) data time 0.0007 (0.0132) model time 0.0000 (0.0000) loss 3.5462 (2.8113) grad_norm 7.8477 (5.1583) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][60/1251] eta 0:05:03 lr 0.000066 wd 0.0500 time 0.2356 (0.2551) data time 0.0012 (0.0112) model time 0.2344 (0.2394) loss 2.9249 (2.7748) grad_norm 5.7873 (5.0692) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][70/1251] eta 0:04:58 lr 0.000066 wd 0.0500 time 0.2352 (0.2529) data time 0.0012 (0.0098) model time 0.2340 (0.2388) loss 3.0158 (2.7755) grad_norm 7.8591 (5.0888) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][80/1251] eta 0:04:54 lr 0.000066 wd 0.0500 time 0.2339 (0.2513) data time 0.0015 (0.0087) model time 0.2324 (0.2388) loss 2.9452 (2.7300) grad_norm 5.7503 (5.0167) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][90/1251] eta 0:04:50 lr 0.000066 wd 0.0500 time 0.2420 (0.2501) data time 0.0010 
(0.0079) model time 0.2410 (0.2391) loss 2.5804 (2.7390) grad_norm 4.4187 (5.0005) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][100/1251] eta 0:04:46 lr 0.000066 wd 0.0500 time 0.2437 (0.2490) data time 0.0008 (0.0072) model time 0.2429 (0.2388) loss 3.3568 (2.7375) grad_norm 4.2222 (4.9875) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][110/1251] eta 0:04:43 lr 0.000066 wd 0.0500 time 0.2441 (0.2482) data time 0.0011 (0.0066) model time 0.2430 (0.2389) loss 2.5824 (2.7321) grad_norm 3.2049 (4.9373) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][120/1251] eta 0:04:40 lr 0.000066 wd 0.0500 time 0.2419 (0.2477) data time 0.0008 (0.0062) model time 0.2412 (0.2393) loss 2.2417 (2.7256) grad_norm 3.9228 (4.9458) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][130/1251] eta 0:04:37 lr 0.000066 wd 0.0500 time 0.2515 (0.2474) data time 0.0010 (0.0058) model time 0.2505 (0.2396) loss 2.1474 (2.7336) grad_norm 4.0838 (4.9133) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][140/1251] eta 0:04:34 lr 0.000066 wd 0.0500 time 0.2416 (0.2470) data time 0.0010 (0.0054) model time 0.2406 (0.2397) loss 2.8350 (2.7436) grad_norm 7.4083 (4.9290) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][150/1251] eta 0:04:31 lr 0.000066 wd 0.0500 time 0.2459 (0.2466) data time 0.0010 (0.0051) model time 0.2449 (0.2398) loss 2.5707 (2.7508) grad_norm 4.3398 (4.9501) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[257/300][160/1251] eta 0:04:28 lr 0.000066 wd 0.0500 time 0.2389 (0.2463) data time 0.0010 (0.0049) model time 0.2378 (0.2399) loss 2.8480 (2.7574) grad_norm 2.9312 (4.9277) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][170/1251] eta 0:04:26 lr 0.000066 wd 0.0500 time 0.2494 (0.2461) data time 0.0008 (0.0046) model time 0.2486 (0.2401) loss 3.1187 (2.7592) grad_norm 4.5298 (4.8786) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][180/1251] eta 0:04:23 lr 0.000066 wd 0.0500 time 0.2406 (0.2459) data time 0.0009 (0.0044) model time 0.2397 (0.2401) loss 3.1729 (2.7551) grad_norm 4.7868 (4.8767) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][190/1251] eta 0:04:20 lr 0.000066 wd 0.0500 time 0.2374 (0.2454) data time 0.0010 (0.0043) model time 0.2364 (0.2398) loss 2.8762 (2.7415) grad_norm 6.1036 (4.8690) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][200/1251] eta 0:04:17 lr 0.000066 wd 0.0500 time 0.2502 (0.2451) data time 0.0009 (0.0041) model time 0.2493 (0.2398) loss 2.2173 (2.7391) grad_norm 4.4158 (4.8737) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][210/1251] eta 0:04:14 lr 0.000066 wd 0.0500 time 0.2410 (0.2448) data time 0.0010 (0.0039) model time 0.2400 (0.2397) loss 3.0024 (2.7295) grad_norm 3.2436 (4.8815) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][220/1251] eta 0:04:12 lr 0.000066 wd 0.0500 time 0.2407 (0.2446) data time 0.0009 (0.0038) model time 0.2398 (0.2396) loss 2.1135 (2.7299) grad_norm 5.3463 (4.8598) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 08:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][230/1251] eta 0:04:09 lr 0.000066 wd 0.0500 time 0.2450 (0.2444) data time 0.0007 (0.0037) model time 0.2443 (0.2396) loss 1.8278 (2.7257) grad_norm 4.5996 (4.8331) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:12:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][240/1251] eta 0:04:06 lr 0.000066 wd 0.0500 time 0.2417 (0.2443) data time 0.0008 (0.0036) model time 0.2409 (0.2396) loss 3.1863 (2.7293) grad_norm 6.2836 (4.8426) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][250/1251] eta 0:04:04 lr 0.000066 wd 0.0500 time 0.2446 (0.2442) data time 0.0008 (0.0035) model time 0.2438 (0.2397) loss 3.3364 (2.7192) grad_norm 3.7288 (4.8445) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][260/1251] eta 0:04:01 lr 0.000066 wd 0.0500 time 0.2380 (0.2442) data time 0.0011 (0.0034) model time 0.2368 (0.2398) loss 2.6568 (2.7130) grad_norm 5.4346 (4.8325) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][270/1251] eta 0:03:59 lr 0.000066 wd 0.0500 time 0.2385 (0.2440) data time 0.0008 (0.0033) model time 0.2376 (0.2398) loss 2.3155 (2.7096) grad_norm 3.1401 (4.8050) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][280/1251] eta 0:03:56 lr 0.000066 wd 0.0500 time 0.2361 (0.2438) data time 0.0009 (0.0032) model time 0.2352 (0.2397) loss 3.2915 (2.7162) grad_norm 5.5757 (4.7940) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][290/1251] eta 0:03:54 lr 0.000066 wd 0.0500 time 0.2387 (0.2437) data time 0.0009 (0.0031) model time 0.2378 
(0.2397) loss 3.4287 (2.7262) grad_norm 3.2547 (4.7738) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][300/1251] eta 0:03:51 lr 0.000066 wd 0.0500 time 0.2431 (0.2436) data time 0.0009 (0.0031) model time 0.2422 (0.2397) loss 2.7617 (2.7342) grad_norm 5.0907 (4.7993) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][310/1251] eta 0:03:49 lr 0.000066 wd 0.0500 time 0.2437 (0.2436) data time 0.0011 (0.0030) model time 0.2426 (0.2398) loss 3.2031 (2.7299) grad_norm 5.5346 (4.8099) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][320/1251] eta 0:03:46 lr 0.000066 wd 0.0500 time 0.2426 (0.2436) data time 0.0010 (0.0029) model time 0.2415 (0.2399) loss 2.1616 (2.7295) grad_norm 3.8675 (4.9468) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][330/1251] eta 0:03:44 lr 0.000066 wd 0.0500 time 0.2310 (0.2435) data time 0.0010 (0.0029) model time 0.2299 (0.2399) loss 2.9102 (2.7358) grad_norm 3.6532 (4.9545) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][340/1251] eta 0:03:41 lr 0.000066 wd 0.0500 time 0.2342 (0.2435) data time 0.0008 (0.0028) model time 0.2334 (0.2400) loss 3.0087 (2.7436) grad_norm 6.0782 (4.9563) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][350/1251] eta 0:03:39 lr 0.000066 wd 0.0500 time 0.2386 (0.2434) data time 0.0008 (0.0028) model time 0.2378 (0.2400) loss 2.5456 (2.7493) grad_norm 4.1128 (4.9363) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][360/1251] eta 0:03:36 
lr 0.000066 wd 0.0500 time 0.2385 (0.2433) data time 0.0009 (0.0027) model time 0.2375 (0.2400) loss 3.3636 (2.7492) grad_norm 4.6155 (4.9287) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][370/1251] eta 0:03:34 lr 0.000066 wd 0.0500 time 0.2348 (0.2432) data time 0.0009 (0.0027) model time 0.2339 (0.2399) loss 3.0061 (2.7517) grad_norm 4.9147 (4.9169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][380/1251] eta 0:03:31 lr 0.000066 wd 0.0500 time 0.2321 (0.2431) data time 0.0010 (0.0026) model time 0.2310 (0.2398) loss 2.9430 (2.7538) grad_norm 4.5967 (4.9151) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][390/1251] eta 0:03:29 lr 0.000066 wd 0.0500 time 0.2325 (0.2430) data time 0.0008 (0.0026) model time 0.2316 (0.2397) loss 3.6973 (2.7601) grad_norm 3.1843 (4.9107) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][400/1251] eta 0:03:26 lr 0.000066 wd 0.0500 time 0.2431 (0.2430) data time 0.0010 (0.0025) model time 0.2421 (0.2398) loss 2.6178 (2.7585) grad_norm 10.1380 (4.9750) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][410/1251] eta 0:03:24 lr 0.000066 wd 0.0500 time 0.2389 (0.2429) data time 0.0010 (0.0025) model time 0.2379 (0.2398) loss 2.8615 (2.7605) grad_norm 3.7919 (4.9736) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][420/1251] eta 0:03:21 lr 0.000066 wd 0.0500 time 0.2458 (0.2429) data time 0.0009 (0.0025) model time 0.2449 (0.2398) loss 3.3075 (2.7602) grad_norm 4.7703 (4.9792) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][430/1251] eta 0:03:19 lr 0.000066 wd 0.0500 time 0.2352 (0.2429) data time 0.0010 (0.0024) model time 0.2342 (0.2398) loss 3.0676 (2.7612) grad_norm 4.8387 (4.9794) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][440/1251] eta 0:03:16 lr 0.000066 wd 0.0500 time 0.2452 (0.2428) data time 0.0010 (0.0024) model time 0.2442 (0.2398) loss 2.9646 (2.7615) grad_norm 5.7951 (4.9709) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][450/1251] eta 0:03:14 lr 0.000066 wd 0.0500 time 0.2402 (0.2428) data time 0.0010 (0.0024) model time 0.2393 (0.2398) loss 3.1024 (2.7631) grad_norm 3.8951 (4.9661) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][460/1251] eta 0:03:11 lr 0.000066 wd 0.0500 time 0.2385 (0.2427) data time 0.0009 (0.0023) model time 0.2376 (0.2398) loss 3.1836 (2.7570) grad_norm 4.7950 (4.9560) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][470/1251] eta 0:03:09 lr 0.000066 wd 0.0500 time 0.2416 (0.2431) data time 0.0010 (0.0023) model time 0.2406 (0.2403) loss 2.2330 (2.7549) grad_norm 3.4285 (4.9503) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][480/1251] eta 0:03:07 lr 0.000066 wd 0.0500 time 0.2372 (0.2430) data time 0.0007 (0.0023) model time 0.2365 (0.2403) loss 3.1689 (2.7559) grad_norm 3.9552 (4.9378) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][490/1251] eta 0:03:04 lr 0.000065 wd 0.0500 time 0.2429 (0.2430) data time 0.0011 (0.0023) model time 0.2418 (0.2403) loss 3.1183 (2.7558) 
grad_norm 5.1394 (4.9306) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][500/1251] eta 0:03:02 lr 0.000065 wd 0.0500 time 0.2403 (0.2430) data time 0.0007 (0.0022) model time 0.2396 (0.2403) loss 2.2343 (2.7586) grad_norm 3.7086 (4.9138) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][510/1251] eta 0:03:00 lr 0.000065 wd 0.0500 time 0.2308 (0.2429) data time 0.0009 (0.0022) model time 0.2299 (0.2403) loss 2.7149 (2.7586) grad_norm 5.4065 (4.9187) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][520/1251] eta 0:02:58 lr 0.000065 wd 0.0500 time 0.2222 (0.2437) data time 0.0011 (0.0022) model time 0.2211 (0.2412) loss 2.5202 (2.7598) grad_norm 4.0033 (4.9165) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][530/1251] eta 0:02:56 lr 0.000065 wd 0.0500 time 0.2439 (0.2444) data time 0.0010 (0.0022) model time 0.2429 (0.2420) loss 2.8794 (2.7626) grad_norm 5.4305 (4.9157) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][540/1251] eta 0:02:53 lr 0.000065 wd 0.0500 time 0.2431 (0.2444) data time 0.0007 (0.0021) model time 0.2424 (0.2420) loss 2.6718 (2.7645) grad_norm 5.4354 (4.9836) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][550/1251] eta 0:02:51 lr 0.000065 wd 0.0500 time 0.2276 (0.2443) data time 0.0010 (0.0021) model time 0.2265 (0.2419) loss 3.3402 (2.7681) grad_norm 3.8773 (4.9741) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][560/1251] eta 0:02:48 lr 0.000065 wd 0.0500 time 
0.2370 (0.2442) data time 0.0009 (0.0021) model time 0.2361 (0.2419) loss 2.3984 (2.7697) grad_norm 3.7973 (4.9607) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][570/1251] eta 0:02:46 lr 0.000065 wd 0.0500 time 0.2330 (0.2442) data time 0.0010 (0.0021) model time 0.2320 (0.2419) loss 3.0958 (2.7710) grad_norm 3.5170 (4.9553) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][580/1251] eta 0:02:43 lr 0.000065 wd 0.0500 time 0.2266 (0.2441) data time 0.0011 (0.0021) model time 0.2256 (0.2418) loss 2.8097 (2.7694) grad_norm 3.9885 (4.9467) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][590/1251] eta 0:02:41 lr 0.000065 wd 0.0500 time 0.2358 (0.2441) data time 0.0008 (0.0020) model time 0.2350 (0.2418) loss 2.4551 (2.7651) grad_norm 3.0107 (4.9371) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][600/1251] eta 0:02:39 lr 0.000065 wd 0.0500 time 1.1010 (0.2454) data time 0.0009 (0.0020) model time 1.1001 (0.2433) loss 2.7263 (2.7659) grad_norm 4.1724 (4.9321) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][610/1251] eta 0:02:38 lr 0.000065 wd 0.0500 time 0.2350 (0.2466) data time 0.0010 (0.0020) model time 0.2339 (0.2446) loss 2.9817 (2.7689) grad_norm 5.1235 (4.9318) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][620/1251] eta 0:02:35 lr 0.000065 wd 0.0500 time 0.2349 (0.2465) data time 0.0008 (0.0020) model time 0.2341 (0.2445) loss 2.9034 (2.7662) grad_norm 3.6114 (4.9303) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:35 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [257/300][630/1251] eta 0:02:33 lr 0.000065 wd 0.0500 time 0.2477 (0.2464) data time 0.0007 (0.0020) model time 0.2470 (0.2444) loss 2.4773 (2.7669) grad_norm 5.6204 (4.9394) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][640/1251] eta 0:02:30 lr 0.000065 wd 0.0500 time 0.2406 (0.2463) data time 0.0008 (0.0020) model time 0.2398 (0.2443) loss 2.7082 (2.7668) grad_norm 3.8990 (4.9352) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][650/1251] eta 0:02:27 lr 0.000065 wd 0.0500 time 0.2391 (0.2462) data time 0.0008 (0.0019) model time 0.2383 (0.2443) loss 3.1941 (2.7670) grad_norm 7.9321 (4.9398) loss_scale 512.0000 (259.9324) mem 7381MB [2024-09-01 08:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][660/1251] eta 0:02:25 lr 0.000065 wd 0.0500 time 0.2371 (0.2462) data time 0.0008 (0.0019) model time 0.2363 (0.2442) loss 3.2489 (2.7665) grad_norm 5.3697 (4.9334) loss_scale 512.0000 (263.7458) mem 7381MB [2024-09-01 08:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][670/1251] eta 0:02:22 lr 0.000065 wd 0.0500 time 0.2446 (0.2461) data time 0.0011 (0.0019) model time 0.2435 (0.2442) loss 2.9226 (2.7647) grad_norm 5.3080 (5.0021) loss_scale 512.0000 (267.4456) mem 7381MB [2024-09-01 08:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][680/1251] eta 0:02:20 lr 0.000065 wd 0.0500 time 0.2554 (0.2461) data time 0.0008 (0.0019) model time 0.2546 (0.2441) loss 3.2772 (2.7643) grad_norm 4.8724 (4.9948) loss_scale 512.0000 (271.0367) mem 7381MB [2024-09-01 08:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][690/1251] eta 0:02:18 lr 0.000065 wd 0.0500 time 0.2415 (0.2460) data time 0.0009 (0.0019) model time 0.2406 (0.2441) loss 2.8274 (2.7641) grad_norm 6.4929 
(4.9991) loss_scale 512.0000 (274.5239) mem 7381MB [2024-09-01 08:14:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][700/1251] eta 0:02:15 lr 0.000065 wd 0.0500 time 0.2474 (0.2460) data time 0.0009 (0.0019) model time 0.2464 (0.2440) loss 2.1240 (2.7632) grad_norm 5.0064 (4.9982) loss_scale 512.0000 (277.9116) mem 7381MB [2024-09-01 08:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][710/1251] eta 0:02:13 lr 0.000065 wd 0.0500 time 0.2408 (0.2459) data time 0.0009 (0.0019) model time 0.2399 (0.2440) loss 2.0746 (2.7607) grad_norm 4.2725 (5.0262) loss_scale 512.0000 (281.2039) mem 7381MB [2024-09-01 08:14:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][720/1251] eta 0:02:10 lr 0.000065 wd 0.0500 time 0.2394 (0.2459) data time 0.0010 (0.0019) model time 0.2384 (0.2440) loss 3.2049 (2.7611) grad_norm 4.4282 (5.0177) loss_scale 512.0000 (284.4050) mem 7381MB [2024-09-01 08:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][730/1251] eta 0:02:08 lr 0.000065 wd 0.0500 time 0.2396 (0.2458) data time 0.0011 (0.0018) model time 0.2386 (0.2439) loss 2.7259 (2.7605) grad_norm 8.5142 (5.0136) loss_scale 512.0000 (287.5185) mem 7381MB [2024-09-01 08:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][740/1251] eta 0:02:05 lr 0.000065 wd 0.0500 time 0.2378 (0.2457) data time 0.0009 (0.0018) model time 0.2369 (0.2439) loss 3.3603 (2.7601) grad_norm 4.5355 (5.0097) loss_scale 512.0000 (290.5479) mem 7381MB [2024-09-01 08:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][750/1251] eta 0:02:03 lr 0.000065 wd 0.0500 time 0.2592 (0.2457) data time 0.0009 (0.0018) model time 0.2583 (0.2438) loss 3.0968 (2.7600) grad_norm 5.4074 (5.0041) loss_scale 512.0000 (293.4967) mem 7381MB [2024-09-01 08:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][760/1251] eta 0:02:00 lr 0.000065 wd 0.0500 time 0.2427 (0.2456) data 
time 0.0010 (0.0018) model time 0.2417 (0.2438) loss 2.4578 (2.7588) grad_norm 3.2445 (4.9963) loss_scale 512.0000 (296.3679) mem 7381MB [2024-09-01 08:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][770/1251] eta 0:01:58 lr 0.000065 wd 0.0500 time 0.2380 (0.2456) data time 0.0009 (0.0018) model time 0.2370 (0.2437) loss 2.5576 (2.7584) grad_norm 6.2932 (4.9975) loss_scale 512.0000 (299.1647) mem 7381MB [2024-09-01 08:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][780/1251] eta 0:01:55 lr 0.000065 wd 0.0500 time 0.2387 (0.2455) data time 0.0008 (0.0018) model time 0.2379 (0.2436) loss 3.1613 (2.7567) grad_norm 3.7681 (4.9937) loss_scale 512.0000 (301.8899) mem 7381MB [2024-09-01 08:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][790/1251] eta 0:01:53 lr 0.000065 wd 0.0500 time 0.2402 (0.2455) data time 0.0008 (0.0018) model time 0.2394 (0.2436) loss 2.6615 (2.7591) grad_norm 7.6182 (4.9901) loss_scale 512.0000 (304.5461) mem 7381MB [2024-09-01 08:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][800/1251] eta 0:01:50 lr 0.000065 wd 0.0500 time 0.2370 (0.2454) data time 0.0010 (0.0018) model time 0.2360 (0.2436) loss 2.4089 (2.7610) grad_norm 3.7972 (5.0012) loss_scale 512.0000 (307.1361) mem 7381MB [2024-09-01 08:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][810/1251] eta 0:01:48 lr 0.000065 wd 0.0500 time 0.2397 (0.2454) data time 0.0011 (0.0018) model time 0.2386 (0.2435) loss 2.3009 (2.7623) grad_norm 3.9076 (5.0051) loss_scale 512.0000 (309.6621) mem 7381MB [2024-09-01 08:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][820/1251] eta 0:01:45 lr 0.000065 wd 0.0500 time 0.2391 (0.2453) data time 0.0008 (0.0018) model time 0.2383 (0.2435) loss 1.8540 (2.7612) grad_norm 6.3508 (5.0065) loss_scale 512.0000 (312.1267) mem 7381MB [2024-09-01 08:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [257/300][830/1251] eta 0:01:43 lr 0.000065 wd 0.0500 time 0.2320 (0.2453) data time 0.0008 (0.0017) model time 0.2312 (0.2435) loss 3.2615 (2.7612) grad_norm 6.3428 (5.0073) loss_scale 512.0000 (314.5319) mem 7381MB [2024-09-01 08:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][840/1251] eta 0:01:40 lr 0.000065 wd 0.0500 time 0.2364 (0.2452) data time 0.0010 (0.0017) model time 0.2354 (0.2434) loss 2.7530 (2.7594) grad_norm 6.2134 (5.0233) loss_scale 512.0000 (316.8799) mem 7381MB [2024-09-01 08:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][850/1251] eta 0:01:38 lr 0.000065 wd 0.0500 time 0.2377 (0.2452) data time 0.0011 (0.0017) model time 0.2366 (0.2434) loss 2.8730 (2.7584) grad_norm 4.8798 (5.0276) loss_scale 512.0000 (319.1727) mem 7381MB [2024-09-01 08:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][860/1251] eta 0:01:35 lr 0.000065 wd 0.0500 time 0.2331 (0.2451) data time 0.0007 (0.0017) model time 0.2324 (0.2433) loss 3.2656 (2.7623) grad_norm 5.9732 (5.0263) loss_scale 512.0000 (321.4123) mem 7381MB [2024-09-01 08:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][870/1251] eta 0:01:33 lr 0.000065 wd 0.0500 time 0.2371 (0.2451) data time 0.0008 (0.0017) model time 0.2362 (0.2433) loss 3.0487 (2.7624) grad_norm 4.0192 (5.0186) loss_scale 512.0000 (323.6005) mem 7381MB [2024-09-01 08:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][880/1251] eta 0:01:30 lr 0.000065 wd 0.0500 time 0.2353 (0.2450) data time 0.0010 (0.0017) model time 0.2343 (0.2432) loss 1.8462 (2.7587) grad_norm 5.8399 (5.0276) loss_scale 512.0000 (325.7389) mem 7381MB [2024-09-01 08:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][890/1251] eta 0:01:28 lr 0.000065 wd 0.0500 time 0.2438 (0.2450) data time 0.0010 (0.0017) model time 0.2428 (0.2432) loss 2.9227 (2.7605) grad_norm 3.9565 (5.0193) loss_scale 512.0000 
(327.8294) mem 7381MB [2024-09-01 08:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][900/1251] eta 0:01:25 lr 0.000065 wd 0.0500 time 0.2378 (0.2449) data time 0.0008 (0.0017) model time 0.2370 (0.2432) loss 3.5538 (2.7612) grad_norm 4.6214 (5.0209) loss_scale 512.0000 (329.8735) mem 7381MB [2024-09-01 08:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][910/1251] eta 0:01:23 lr 0.000065 wd 0.0500 time 0.2430 (0.2449) data time 0.0010 (0.0017) model time 0.2420 (0.2431) loss 2.5409 (2.7595) grad_norm 4.8208 (5.0212) loss_scale 512.0000 (331.8727) mem 7381MB [2024-09-01 08:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][920/1251] eta 0:01:21 lr 0.000065 wd 0.0500 time 0.2458 (0.2449) data time 0.0008 (0.0017) model time 0.2449 (0.2431) loss 2.0767 (2.7575) grad_norm 3.8397 (5.0419) loss_scale 512.0000 (333.8284) mem 7381MB [2024-09-01 08:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][930/1251] eta 0:01:18 lr 0.000065 wd 0.0500 time 0.2495 (0.2449) data time 0.0007 (0.0017) model time 0.2488 (0.2431) loss 2.4069 (2.7561) grad_norm 4.5919 (5.0361) loss_scale 512.0000 (335.7422) mem 7381MB [2024-09-01 08:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][940/1251] eta 0:01:16 lr 0.000065 wd 0.0500 time 0.2368 (0.2448) data time 0.0012 (0.0017) model time 0.2355 (0.2431) loss 3.1013 (2.7584) grad_norm 3.6625 (5.0380) loss_scale 512.0000 (337.6153) mem 7381MB [2024-09-01 08:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][950/1251] eta 0:01:13 lr 0.000065 wd 0.0500 time 0.2363 (0.2448) data time 0.0007 (0.0017) model time 0.2357 (0.2431) loss 3.3660 (2.7613) grad_norm 5.6291 (5.0408) loss_scale 512.0000 (339.4490) mem 7381MB [2024-09-01 08:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][960/1251] eta 0:01:11 lr 0.000065 wd 0.0500 time 0.2415 (0.2448) data time 0.0010 (0.0016) model 
time 0.2405 (0.2430) loss 3.0972 (2.7619) grad_norm 3.8087 (5.0508) loss_scale 512.0000 (341.2445) mem 7381MB [2024-09-01 08:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][970/1251] eta 0:01:08 lr 0.000065 wd 0.0500 time 0.2381 (0.2448) data time 0.0008 (0.0016) model time 0.2374 (0.2430) loss 2.5149 (2.7616) grad_norm 7.0316 (5.0571) loss_scale 512.0000 (343.0031) mem 7381MB [2024-09-01 08:15:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][980/1251] eta 0:01:06 lr 0.000064 wd 0.0500 time 0.2430 (0.2447) data time 0.0008 (0.0016) model time 0.2422 (0.2430) loss 3.4328 (2.7622) grad_norm 6.2430 (5.0734) loss_scale 512.0000 (344.7258) mem 7381MB [2024-09-01 08:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][990/1251] eta 0:01:03 lr 0.000064 wd 0.0500 time 0.2400 (0.2447) data time 0.0009 (0.0016) model time 0.2391 (0.2430) loss 3.3734 (2.7609) grad_norm 6.3021 (5.0906) loss_scale 512.0000 (346.4137) mem 7381MB [2024-09-01 08:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1000/1251] eta 0:01:01 lr 0.000064 wd 0.0500 time 0.2463 (0.2447) data time 0.0011 (0.0016) model time 0.2452 (0.2430) loss 2.5752 (2.7620) grad_norm 4.2575 (5.0852) loss_scale 512.0000 (348.0679) mem 7381MB [2024-09-01 08:16:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1010/1251] eta 0:00:59 lr 0.000064 wd 0.0500 time 0.2517 (0.2449) data time 0.0009 (0.0016) model time 0.2508 (0.2432) loss 2.7942 (2.7626) grad_norm 3.1001 (5.0825) loss_scale 512.0000 (349.6894) mem 7381MB [2024-09-01 08:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1020/1251] eta 0:00:56 lr 0.000064 wd 0.0500 time 0.2411 (0.2448) data time 0.0008 (0.0016) model time 0.2403 (0.2431) loss 1.7358 (2.7626) grad_norm 3.9616 (5.0749) loss_scale 512.0000 (351.2791) mem 7381MB [2024-09-01 08:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[257/300][1030/1251] eta 0:00:54 lr 0.000064 wd 0.0500 time 0.2449 (0.2448) data time 0.0007 (0.0016) model time 0.2442 (0.2431) loss 3.1304 (2.7613) grad_norm 12.3859 (5.0930) loss_scale 512.0000 (352.8380) mem 7381MB [2024-09-01 08:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1040/1251] eta 0:00:51 lr 0.000064 wd 0.0500 time 0.2447 (0.2448) data time 0.0007 (0.0016) model time 0.2440 (0.2431) loss 2.8600 (2.7638) grad_norm 6.6823 (5.0978) loss_scale 512.0000 (354.3670) mem 7381MB [2024-09-01 08:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1050/1251] eta 0:00:49 lr 0.000064 wd 0.0500 time 0.2507 (0.2447) data time 0.0009 (0.0016) model time 0.2497 (0.2430) loss 3.0056 (2.7674) grad_norm 3.7906 (5.1173) loss_scale 512.0000 (355.8668) mem 7381MB [2024-09-01 08:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1060/1251] eta 0:00:46 lr 0.000064 wd 0.0500 time 0.2405 (0.2453) data time 0.0010 (0.0016) model time 0.2395 (0.2436) loss 3.2786 (2.7673) grad_norm 13.0259 (5.1249) loss_scale 512.0000 (357.3384) mem 7381MB [2024-09-01 08:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1070/1251] eta 0:00:44 lr 0.000064 wd 0.0500 time 0.2413 (0.2454) data time 0.0009 (0.0016) model time 0.2404 (0.2438) loss 3.1695 (2.7684) grad_norm 7.9004 (5.1246) loss_scale 512.0000 (358.7824) mem 7381MB [2024-09-01 08:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1080/1251] eta 0:00:41 lr 0.000064 wd 0.0500 time 0.2489 (0.2454) data time 0.0010 (0.0016) model time 0.2479 (0.2438) loss 3.1025 (2.7703) grad_norm 4.2363 (5.1277) loss_scale 512.0000 (360.1998) mem 7381MB [2024-09-01 08:16:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1090/1251] eta 0:00:39 lr 0.000064 wd 0.0500 time 0.2298 (0.2453) data time 0.0008 (0.0016) model time 0.2290 (0.2437) loss 2.0443 (2.7696) grad_norm 5.6199 (5.1319) loss_scale 512.0000 
(361.5912) mem 7381MB [2024-09-01 08:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1100/1251] eta 0:00:37 lr 0.000064 wd 0.0500 time 0.2396 (0.2453) data time 0.0011 (0.0016) model time 0.2385 (0.2437) loss 2.3199 (2.7681) grad_norm 3.7084 (5.1267) loss_scale 512.0000 (362.9573) mem 7381MB [2024-09-01 08:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1110/1251] eta 0:00:34 lr 0.000064 wd 0.0500 time 0.2404 (0.2453) data time 0.0007 (0.0016) model time 0.2398 (0.2436) loss 1.6896 (2.7659) grad_norm 3.8012 (5.1203) loss_scale 512.0000 (364.2988) mem 7381MB [2024-09-01 08:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1120/1251] eta 0:00:32 lr 0.000064 wd 0.0500 time 0.2464 (0.2452) data time 0.0010 (0.0016) model time 0.2454 (0.2436) loss 2.9469 (2.7658) grad_norm 3.6142 (5.1147) loss_scale 512.0000 (365.6164) mem 7381MB [2024-09-01 08:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1130/1251] eta 0:00:29 lr 0.000064 wd 0.0500 time 0.2470 (0.2452) data time 0.0007 (0.0016) model time 0.2463 (0.2436) loss 2.8035 (2.7654) grad_norm 3.4542 (5.1119) loss_scale 512.0000 (366.9107) mem 7381MB [2024-09-01 08:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1140/1251] eta 0:00:27 lr 0.000064 wd 0.0500 time 0.2445 (0.2452) data time 0.0008 (0.0016) model time 0.2437 (0.2436) loss 2.6796 (2.7650) grad_norm 3.1471 (5.1103) loss_scale 512.0000 (368.1823) mem 7381MB [2024-09-01 08:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1150/1251] eta 0:00:24 lr 0.000064 wd 0.0500 time 0.2416 (0.2451) data time 0.0007 (0.0015) model time 0.2409 (0.2435) loss 2.8543 (2.7633) grad_norm 4.4331 (5.1050) loss_scale 512.0000 (369.4318) mem 7381MB [2024-09-01 08:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1160/1251] eta 0:00:22 lr 0.000064 wd 0.0500 time 0.2357 (0.2451) data time 0.0007 (0.0015) 
model time 0.2350 (0.2435) loss 2.6896 (2.7654) grad_norm 4.0859 (5.1130) loss_scale 512.0000 (370.6598) mem 7381MB [2024-09-01 08:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1170/1251] eta 0:00:19 lr 0.000064 wd 0.0500 time 0.2516 (0.2451) data time 0.0009 (0.0015) model time 0.2507 (0.2435) loss 1.9174 (2.7641) grad_norm 3.5983 (5.1057) loss_scale 512.0000 (371.8668) mem 7381MB [2024-09-01 08:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1180/1251] eta 0:00:17 lr 0.000064 wd 0.0500 time 0.2315 (0.2450) data time 0.0007 (0.0015) model time 0.2308 (0.2434) loss 2.4656 (2.7620) grad_norm 4.7805 (5.1009) loss_scale 512.0000 (373.0533) mem 7381MB [2024-09-01 08:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1190/1251] eta 0:00:14 lr 0.000064 wd 0.0500 time 0.2430 (0.2450) data time 0.0009 (0.0015) model time 0.2421 (0.2434) loss 2.7555 (2.7610) grad_norm 3.4740 (5.0964) loss_scale 512.0000 (374.2200) mem 7381MB [2024-09-01 08:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1200/1251] eta 0:00:12 lr 0.000064 wd 0.0500 time 0.2364 (0.2449) data time 0.0010 (0.0015) model time 0.2354 (0.2433) loss 2.2798 (2.7603) grad_norm 7.0708 (5.0926) loss_scale 512.0000 (375.3672) mem 7381MB [2024-09-01 08:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1210/1251] eta 0:00:10 lr 0.000064 wd 0.0500 time 0.2429 (0.2449) data time 0.0007 (0.0015) model time 0.2422 (0.2433) loss 3.0845 (2.7604) grad_norm 5.0389 (5.0910) loss_scale 512.0000 (376.4955) mem 7381MB [2024-09-01 08:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1220/1251] eta 0:00:07 lr 0.000064 wd 0.0500 time 0.2370 (0.2449) data time 0.0008 (0.0015) model time 0.2362 (0.2433) loss 2.1064 (2.7588) grad_norm 3.6562 (5.1107) loss_scale 512.0000 (377.6052) mem 7381MB [2024-09-01 08:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[257/300][1230/1251] eta 0:00:05 lr 0.000064 wd 0.0500 time 0.2363 (0.2448) data time 0.0009 (0.0015) model time 0.2354 (0.2433) loss 2.9551 (2.7603) grad_norm 3.3701 (5.0989) loss_scale 512.0000 (378.6970) mem 7381MB [2024-09-01 08:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1240/1251] eta 0:00:02 lr 0.000064 wd 0.0500 time 0.2275 (0.2447) data time 0.0007 (0.0015) model time 0.2268 (0.2432) loss 2.9415 (2.7594) grad_norm 5.0103 (5.0934) loss_scale 512.0000 (379.7712) mem 7381MB [2024-09-01 08:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [257/300][1250/1251] eta 0:00:00 lr 0.000064 wd 0.0500 time 0.2304 (0.2446) data time 0.0005 (0.0015) model time 0.2299 (0.2430) loss 1.7588 (2.7604) grad_norm 5.4171 (5.0920) loss_scale 512.0000 (380.8281) mem 7381MB [2024-09-01 08:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 257 training takes 0:05:06 [2024-09-01 08:17:05 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:17:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 08:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.451 (0.451) Loss 0.3918 (0.3918) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 08:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.111) Loss 0.5962 (0.6226) Acc@1 90.234 (87.473) Acc@5 97.949 (97.701) Mem 7381MB [2024-09-01 08:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.096) Loss 0.9175 (0.6529) Acc@1 78.516 (86.342) Acc@5 95.898 (97.624) Mem 7381MB [2024-09-01 08:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.089) Loss 1.1934 (0.7510) Acc@1 72.461 (84.000) Acc@5 92.285 (96.591) Mem 7381MB [2024-09-01 08:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0068 (0.7987) Acc@1 77.734 (82.863) Acc@5 94.238 (96.118) Mem 7381MB [2024-09-01 08:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.456 Acc@5 96.064 [2024-09-01 08:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.5% [2024-09-01 08:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.46% [2024-09-01 08:17:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 08:17:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 08:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.448 (0.448) Loss 0.3813 (0.3813) Acc@1 93.457 (93.457) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 08:17:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.109) Loss 0.5669 (0.6014) Acc@1 90.430 (87.784) Acc@5 97.949 (97.834) Mem 7381MB [2024-09-01 08:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.094) Loss 0.8940 (0.6312) Acc@1 77.734 (86.561) Acc@5 96.191 (97.773) Mem 7381MB [2024-09-01 08:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.088) Loss 1.1094 (0.7192) Acc@1 74.023 (84.416) Acc@5 92.969 (96.894) Mem 7381MB [2024-09-01 08:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.082) Loss 0.9907 (0.7656) Acc@1 77.148 (83.306) Acc@5 94.238 (96.375) Mem 7381MB [2024-09-01 08:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.940 Acc@5 96.316 [2024-09-01 08:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 08:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][0/1251] eta 0:22:41 lr 0.000064 wd 0.0500 time 1.0884 (1.0884) data time 0.4407 (0.4407) model time 0.0000 (0.0000) loss 2.7745 (2.7745) grad_norm 3.8202 (3.8202) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][10/1251] eta 0:06:40 lr 0.000064 wd 0.0500 time 0.2479 (0.3224) data time 0.0008 (0.0409) model time 0.0000 (0.0000) loss 3.0554 (2.7755) grad_norm 6.6605 (6.0679) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][20/1251] eta 0:05:49 lr 0.000064 wd 0.0500 time 0.2454 (0.2836) data time 0.0007 (0.0219) model time 0.0000 (0.0000) loss 3.3142 (2.9520) grad_norm 9.1899 (5.5060) 
loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][30/1251] eta 0:05:29 lr 0.000064 wd 0.0500 time 0.2471 (0.2702) data time 0.0007 (0.0151) model time 0.0000 (0.0000) loss 3.1326 (2.9761) grad_norm 3.7057 (5.2892) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][40/1251] eta 0:05:19 lr 0.000064 wd 0.0500 time 0.2401 (0.2635) data time 0.0006 (0.0117) model time 0.0000 (0.0000) loss 2.4353 (2.8325) grad_norm 5.9761 (5.0575) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][50/1251] eta 0:05:11 lr 0.000064 wd 0.0500 time 0.2408 (0.2592) data time 0.0008 (0.0096) model time 0.0000 (0.0000) loss 3.5502 (2.8468) grad_norm 4.4042 (4.9810) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][60/1251] eta 0:05:05 lr 0.000064 wd 0.0500 time 0.2386 (0.2565) data time 0.0008 (0.0082) model time 0.2379 (0.2416) loss 3.0981 (2.8169) grad_norm 4.2179 (4.8924) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][70/1251] eta 0:05:00 lr 0.000064 wd 0.0500 time 0.2513 (0.2543) data time 0.0007 (0.0071) model time 0.2506 (0.2409) loss 2.3018 (2.8101) grad_norm 3.3826 (4.7929) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][80/1251] eta 0:04:56 lr 0.000064 wd 0.0500 time 0.2455 (0.2531) data time 0.0007 (0.0064) model time 0.2447 (0.2417) loss 3.5364 (2.8158) grad_norm 7.3409 (4.8712) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][90/1251] eta 0:04:52 lr 0.000064 wd 0.0500 time 0.2378 (0.2518) data time 0.0011 
(0.0058) model time 0.2368 (0.2413) loss 2.5586 (2.8092) grad_norm 3.8284 (4.8865) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][100/1251] eta 0:04:48 lr 0.000064 wd 0.0500 time 0.2495 (0.2509) data time 0.0009 (0.0053) model time 0.2486 (0.2414) loss 3.0353 (2.8029) grad_norm 7.7327 (4.9331) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][110/1251] eta 0:04:45 lr 0.000064 wd 0.0500 time 0.2397 (0.2499) data time 0.0008 (0.0049) model time 0.2389 (0.2411) loss 2.0356 (2.7696) grad_norm 4.9323 (4.8951) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][120/1251] eta 0:04:41 lr 0.000064 wd 0.0500 time 0.2358 (0.2492) data time 0.0007 (0.0046) model time 0.2351 (0.2408) loss 2.5141 (2.7430) grad_norm 4.9672 (4.8665) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][130/1251] eta 0:04:38 lr 0.000064 wd 0.0500 time 0.2409 (0.2486) data time 0.0011 (0.0043) model time 0.2398 (0.2409) loss 2.7548 (2.7459) grad_norm 4.8715 (4.8673) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][140/1251] eta 0:04:35 lr 0.000064 wd 0.0500 time 0.2438 (0.2481) data time 0.0009 (0.0041) model time 0.2428 (0.2409) loss 3.0835 (2.7675) grad_norm 4.6365 (4.8624) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][150/1251] eta 0:04:32 lr 0.000064 wd 0.0500 time 0.2401 (0.2478) data time 0.0009 (0.0039) model time 0.2392 (0.2410) loss 2.3196 (2.7593) grad_norm 3.9069 (4.8469) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[258/300][160/1251] eta 0:04:29 lr 0.000064 wd 0.0500 time 0.2451 (0.2475) data time 0.0009 (0.0037) model time 0.2442 (0.2410) loss 3.0159 (2.7595) grad_norm 4.5917 (4.8751) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][170/1251] eta 0:04:26 lr 0.000064 wd 0.0500 time 0.2308 (0.2469) data time 0.0009 (0.0035) model time 0.2299 (0.2407) loss 3.2891 (2.7670) grad_norm 4.9236 (4.8634) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][180/1251] eta 0:04:24 lr 0.000064 wd 0.0500 time 0.2352 (0.2466) data time 0.0011 (0.0034) model time 0.2340 (0.2407) loss 2.8071 (2.7655) grad_norm 4.5574 (4.8773) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][190/1251] eta 0:04:21 lr 0.000064 wd 0.0500 time 0.2355 (0.2465) data time 0.0009 (0.0033) model time 0.2346 (0.2409) loss 2.0455 (2.7517) grad_norm 5.4685 (4.8341) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][200/1251] eta 0:04:18 lr 0.000064 wd 0.0500 time 0.2465 (0.2463) data time 0.0008 (0.0032) model time 0.2456 (0.2409) loss 2.3685 (2.7572) grad_norm 3.3340 (4.8428) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][210/1251] eta 0:04:16 lr 0.000064 wd 0.0500 time 0.2455 (0.2461) data time 0.0010 (0.0031) model time 0.2445 (0.2409) loss 2.5508 (2.7654) grad_norm 4.3243 (4.8611) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][220/1251] eta 0:04:13 lr 0.000064 wd 0.0500 time 0.2419 (0.2459) data time 0.0009 (0.0030) model time 0.2410 (0.2409) loss 2.4801 (2.7678) grad_norm 4.0772 (4.8671) loss_scale 512.0000 (512.0000) mem 
7381MB [2024-09-01 08:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][230/1251] eta 0:04:10 lr 0.000063 wd 0.0500 time 0.2297 (0.2454) data time 0.0009 (0.0029) model time 0.2287 (0.2405) loss 2.8118 (2.7609) grad_norm 4.5721 (4.8488) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][240/1251] eta 0:04:08 lr 0.000063 wd 0.0500 time 0.2395 (0.2457) data time 0.0010 (0.0028) model time 0.2385 (0.2411) loss 1.8568 (2.7683) grad_norm 6.7724 (4.9158) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][250/1251] eta 0:04:05 lr 0.000063 wd 0.0500 time 0.2477 (0.2456) data time 0.0007 (0.0027) model time 0.2470 (0.2411) loss 2.9587 (2.7631) grad_norm 4.5372 (4.9392) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][260/1251] eta 0:04:03 lr 0.000063 wd 0.0500 time 0.2338 (0.2460) data time 0.0010 (0.0027) model time 0.2328 (0.2418) loss 2.5165 (2.7680) grad_norm 8.9570 (4.9552) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][270/1251] eta 0:04:01 lr 0.000063 wd 0.0500 time 0.2413 (0.2458) data time 0.0009 (0.0026) model time 0.2404 (0.2417) loss 2.8317 (2.7661) grad_norm 4.6361 (5.0216) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][280/1251] eta 0:03:58 lr 0.000063 wd 0.0500 time 0.2399 (0.2456) data time 0.0011 (0.0025) model time 0.2388 (0.2417) loss 2.0027 (2.7618) grad_norm 3.1680 (5.0202) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][290/1251] eta 0:03:56 lr 0.000063 wd 0.0500 time 0.2473 (0.2462) data time 0.0009 (0.0025) model time 0.2463 
(0.2424) loss 3.1370 (2.7680) grad_norm 5.3088 (5.0178) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][300/1251] eta 0:03:54 lr 0.000063 wd 0.0500 time 0.2444 (0.2461) data time 0.0008 (0.0024) model time 0.2436 (0.2425) loss 2.9736 (2.7745) grad_norm 4.3978 (4.9825) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][310/1251] eta 0:03:51 lr 0.000063 wd 0.0500 time 0.2459 (0.2460) data time 0.0007 (0.0024) model time 0.2451 (0.2424) loss 3.3383 (2.7762) grad_norm 3.5614 (4.9566) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][320/1251] eta 0:03:48 lr 0.000063 wd 0.0500 time 0.2432 (0.2459) data time 0.0008 (0.0024) model time 0.2424 (0.2424) loss 3.1777 (2.7756) grad_norm 3.9156 (4.9455) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][330/1251] eta 0:03:46 lr 0.000063 wd 0.0500 time 0.2394 (0.2457) data time 0.0010 (0.0023) model time 0.2384 (0.2423) loss 1.7330 (2.7721) grad_norm 5.8483 (5.0067) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][340/1251] eta 0:03:44 lr 0.000063 wd 0.0500 time 0.2456 (0.2469) data time 0.0010 (0.0023) model time 0.2446 (0.2437) loss 2.3907 (2.7690) grad_norm 3.4011 (5.0062) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][350/1251] eta 0:03:42 lr 0.000063 wd 0.0500 time 0.2439 (0.2473) data time 0.0010 (0.0022) model time 0.2429 (0.2443) loss 3.0163 (2.7692) grad_norm 11.3095 (5.0326) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][360/1251] eta 0:03:40 
lr 0.000063 wd 0.0500 time 0.2385 (0.2471) data time 0.0008 (0.0022) model time 0.2377 (0.2441) loss 2.7799 (2.7734) grad_norm 5.3786 (5.0823) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][370/1251] eta 0:03:37 lr 0.000063 wd 0.0500 time 0.2537 (0.2470) data time 0.0007 (0.0022) model time 0.2530 (0.2440) loss 2.1987 (2.7735) grad_norm 4.9170 (5.0611) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][380/1251] eta 0:03:35 lr 0.000063 wd 0.0500 time 0.2437 (0.2469) data time 0.0010 (0.0021) model time 0.2427 (0.2440) loss 2.1580 (2.7671) grad_norm 4.9964 (5.0597) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][390/1251] eta 0:03:32 lr 0.000063 wd 0.0500 time 0.2575 (0.2467) data time 0.0010 (0.0021) model time 0.2565 (0.2439) loss 1.8697 (2.7588) grad_norm 4.5567 (5.0592) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][400/1251] eta 0:03:29 lr 0.000063 wd 0.0500 time 0.2426 (0.2466) data time 0.0009 (0.0021) model time 0.2416 (0.2438) loss 2.2145 (2.7586) grad_norm 4.7825 (5.0503) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][410/1251] eta 0:03:27 lr 0.000063 wd 0.0500 time 0.2398 (0.2465) data time 0.0011 (0.0021) model time 0.2386 (0.2437) loss 2.5046 (2.7551) grad_norm 3.7415 (5.1438) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][420/1251] eta 0:03:24 lr 0.000063 wd 0.0500 time 0.2286 (0.2463) data time 0.0008 (0.0020) model time 0.2278 (0.2435) loss 2.5594 (2.7537) grad_norm 2.2621 (5.1161) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:01 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][430/1251] eta 0:03:22 lr 0.000063 wd 0.0500 time 0.2384 (0.2462) data time 0.0011 (0.0020) model time 0.2373 (0.2435) loss 3.2743 (2.7545) grad_norm 4.5792 (5.1039) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][440/1251] eta 0:03:19 lr 0.000063 wd 0.0500 time 0.2404 (0.2460) data time 0.0007 (0.0020) model time 0.2397 (0.2433) loss 3.2499 (2.7572) grad_norm 4.0811 (5.0900) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][450/1251] eta 0:03:16 lr 0.000063 wd 0.0500 time 0.2399 (0.2458) data time 0.0007 (0.0020) model time 0.2392 (0.2432) loss 3.3151 (2.7650) grad_norm 4.8423 (5.1032) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][460/1251] eta 0:03:14 lr 0.000063 wd 0.0500 time 0.2474 (0.2457) data time 0.0011 (0.0019) model time 0.2464 (0.2431) loss 2.3906 (2.7618) grad_norm 5.5253 (5.0887) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][470/1251] eta 0:03:11 lr 0.000063 wd 0.0500 time 0.2328 (0.2456) data time 0.0008 (0.0019) model time 0.2320 (0.2430) loss 2.9265 (2.7665) grad_norm 3.1389 (5.0758) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][480/1251] eta 0:03:09 lr 0.000063 wd 0.0500 time 0.2431 (0.2455) data time 0.0009 (0.0019) model time 0.2422 (0.2429) loss 2.7130 (2.7681) grad_norm 6.1015 (5.0853) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][490/1251] eta 0:03:06 lr 0.000063 wd 0.0500 time 0.2345 (0.2454) data time 0.0009 (0.0019) model time 0.2336 (0.2428) loss 3.5066 (2.7716) 
grad_norm 2.3724 (5.0833) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][500/1251] eta 0:03:04 lr 0.000063 wd 0.0500 time 0.2440 (0.2453) data time 0.0010 (0.0019) model time 0.2430 (0.2428) loss 3.0590 (2.7732) grad_norm 5.6300 (5.0734) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][510/1251] eta 0:03:01 lr 0.000063 wd 0.0500 time 0.2473 (0.2452) data time 0.0009 (0.0019) model time 0.2464 (0.2427) loss 2.9974 (2.7740) grad_norm 38.5628 (5.1591) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][520/1251] eta 0:02:59 lr 0.000063 wd 0.0500 time 0.2403 (0.2452) data time 0.0009 (0.0018) model time 0.2394 (0.2427) loss 2.3941 (2.7732) grad_norm 3.9198 (5.1401) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][530/1251] eta 0:02:56 lr 0.000063 wd 0.0500 time 0.2333 (0.2451) data time 0.0007 (0.0018) model time 0.2325 (0.2426) loss 2.7290 (2.7765) grad_norm 4.7385 (5.1342) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][540/1251] eta 0:02:54 lr 0.000063 wd 0.0500 time 0.2365 (0.2450) data time 0.0009 (0.0018) model time 0.2355 (0.2426) loss 2.0081 (2.7743) grad_norm 6.5728 (5.1289) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][550/1251] eta 0:02:51 lr 0.000063 wd 0.0500 time 0.2445 (0.2450) data time 0.0007 (0.0018) model time 0.2438 (0.2425) loss 2.7303 (2.7755) grad_norm 8.4116 (5.1346) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][560/1251] eta 0:02:49 lr 0.000063 wd 0.0500 time 
0.2465 (0.2449) data time 0.0011 (0.0018) model time 0.2453 (0.2425) loss 3.0909 (2.7745) grad_norm 3.4934 (5.1389) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][570/1251] eta 0:02:46 lr 0.000063 wd 0.0500 time 0.2363 (0.2449) data time 0.0007 (0.0018) model time 0.2356 (0.2425) loss 2.8530 (2.7776) grad_norm 4.5358 (5.1766) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][580/1251] eta 0:02:44 lr 0.000063 wd 0.0500 time 0.2418 (0.2448) data time 0.0007 (0.0017) model time 0.2411 (0.2424) loss 2.9116 (2.7790) grad_norm 4.2271 (5.1817) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][590/1251] eta 0:02:41 lr 0.000063 wd 0.0500 time 0.2408 (0.2447) data time 0.0008 (0.0017) model time 0.2400 (0.2423) loss 3.0591 (2.7794) grad_norm 6.7934 (5.1838) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][600/1251] eta 0:02:39 lr 0.000063 wd 0.0500 time 0.2446 (0.2446) data time 0.0009 (0.0017) model time 0.2437 (0.2423) loss 3.1589 (2.7826) grad_norm 3.2908 (5.1733) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][610/1251] eta 0:02:36 lr 0.000063 wd 0.0500 time 0.2401 (0.2446) data time 0.0010 (0.0017) model time 0.2390 (0.2422) loss 2.8150 (2.7782) grad_norm 5.2590 (5.1682) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][620/1251] eta 0:02:34 lr 0.000063 wd 0.0500 time 0.2468 (0.2445) data time 0.0010 (0.0017) model time 0.2458 (0.2422) loss 2.7878 (2.7732) grad_norm 3.7142 (5.1685) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:49 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [258/300][630/1251] eta 0:02:31 lr 0.000063 wd 0.0500 time 0.2402 (0.2444) data time 0.0009 (0.0017) model time 0.2393 (0.2422) loss 2.7145 (2.7748) grad_norm 6.3268 (5.1637) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][640/1251] eta 0:02:29 lr 0.000063 wd 0.0500 time 0.2366 (0.2444) data time 0.0007 (0.0017) model time 0.2359 (0.2421) loss 3.1875 (2.7718) grad_norm 3.1228 (5.1884) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][650/1251] eta 0:02:26 lr 0.000063 wd 0.0500 time 0.2487 (0.2443) data time 0.0007 (0.0017) model time 0.2479 (0.2421) loss 3.3990 (2.7724) grad_norm 4.0784 (5.1969) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][660/1251] eta 0:02:24 lr 0.000063 wd 0.0500 time 0.2407 (0.2443) data time 0.0007 (0.0017) model time 0.2400 (0.2421) loss 2.3925 (2.7687) grad_norm 7.4917 (5.2125) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][670/1251] eta 0:02:21 lr 0.000063 wd 0.0500 time 0.2420 (0.2443) data time 0.0010 (0.0016) model time 0.2410 (0.2420) loss 2.4456 (2.7689) grad_norm 4.8030 (5.2514) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][680/1251] eta 0:02:19 lr 0.000063 wd 0.0500 time 0.2490 (0.2442) data time 0.0008 (0.0016) model time 0.2482 (0.2420) loss 2.0460 (2.7680) grad_norm 2.7689 (5.2494) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][690/1251] eta 0:02:16 lr 0.000063 wd 0.0500 time 0.2390 (0.2442) data time 0.0007 (0.0016) model time 0.2382 (0.2420) loss 2.9635 (2.7665) grad_norm 5.0653 
(5.2442) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][700/1251] eta 0:02:14 lr 0.000063 wd 0.0500 time 0.2448 (0.2441) data time 0.0010 (0.0016) model time 0.2439 (0.2419) loss 3.0687 (2.7634) grad_norm 8.0799 (5.2383) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][710/1251] eta 0:02:12 lr 0.000063 wd 0.0500 time 0.2466 (0.2440) data time 0.0009 (0.0016) model time 0.2457 (0.2419) loss 2.6616 (2.7654) grad_norm 4.0098 (5.2303) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][720/1251] eta 0:02:09 lr 0.000063 wd 0.0500 time 0.2386 (0.2440) data time 0.0010 (0.0016) model time 0.2376 (0.2418) loss 3.0952 (2.7678) grad_norm 5.4631 (5.2314) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][730/1251] eta 0:02:07 lr 0.000062 wd 0.0500 time 0.2404 (0.2439) data time 0.0008 (0.0016) model time 0.2396 (0.2418) loss 3.4703 (2.7652) grad_norm 4.4629 (5.2262) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][740/1251] eta 0:02:04 lr 0.000062 wd 0.0500 time 0.2453 (0.2440) data time 0.0010 (0.0016) model time 0.2443 (0.2418) loss 2.8842 (2.7635) grad_norm 8.5742 (5.2331) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][750/1251] eta 0:02:02 lr 0.000062 wd 0.0500 time 0.2432 (0.2439) data time 0.0009 (0.0016) model time 0.2424 (0.2418) loss 1.6564 (2.7598) grad_norm 4.7536 (5.2231) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][760/1251] eta 0:01:59 lr 0.000062 wd 0.0500 time 0.2412 (0.2438) data 
time 0.0009 (0.0016) model time 0.2403 (0.2418) loss 2.2902 (2.7572) grad_norm 4.3232 (5.2152) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][770/1251] eta 0:01:57 lr 0.000062 wd 0.0500 time 0.2464 (0.2438) data time 0.0008 (0.0016) model time 0.2456 (0.2417) loss 3.0390 (2.7611) grad_norm 4.4310 (5.2080) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][780/1251] eta 0:01:54 lr 0.000062 wd 0.0500 time 0.2423 (0.2438) data time 0.0009 (0.0015) model time 0.2414 (0.2417) loss 2.4906 (2.7629) grad_norm 6.3934 (5.2032) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][790/1251] eta 0:01:52 lr 0.000062 wd 0.0500 time 0.2488 (0.2440) data time 0.0009 (0.0015) model time 0.2479 (0.2420) loss 2.7792 (2.7650) grad_norm 6.7454 (5.2084) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][800/1251] eta 0:01:50 lr 0.000062 wd 0.0500 time 0.2374 (0.2440) data time 0.0009 (0.0015) model time 0.2365 (0.2420) loss 1.7228 (2.7643) grad_norm 3.9208 (5.1998) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][810/1251] eta 0:01:47 lr 0.000062 wd 0.0500 time 0.2451 (0.2441) data time 0.0010 (0.0015) model time 0.2440 (0.2421) loss 3.0877 (2.7655) grad_norm 15.3546 (5.2001) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][820/1251] eta 0:01:45 lr 0.000062 wd 0.0500 time 0.2429 (0.2441) data time 0.0011 (0.0015) model time 0.2418 (0.2421) loss 2.9856 (2.7663) grad_norm 4.0424 (5.1905) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [258/300][830/1251] eta 0:01:42 lr 0.000062 wd 0.0500 time 0.2476 (0.2441) data time 0.0009 (0.0015) model time 0.2466 (0.2421) loss 2.7742 (2.7674) grad_norm 3.4599 (5.1827) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][840/1251] eta 0:01:40 lr 0.000062 wd 0.0500 time 0.2384 (0.2440) data time 0.0007 (0.0015) model time 0.2377 (0.2421) loss 3.0307 (2.7705) grad_norm 3.4039 (5.1758) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][850/1251] eta 0:01:37 lr 0.000062 wd 0.0500 time 0.2488 (0.2440) data time 0.0008 (0.0015) model time 0.2480 (0.2420) loss 2.7550 (2.7706) grad_norm 3.9279 (5.1831) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][860/1251] eta 0:01:35 lr 0.000062 wd 0.0500 time 0.2479 (0.2444) data time 0.0008 (0.0015) model time 0.2470 (0.2425) loss 2.8477 (2.7716) grad_norm 5.4659 (5.1779) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][870/1251] eta 0:01:33 lr 0.000062 wd 0.0500 time 0.2361 (0.2445) data time 0.0011 (0.0015) model time 0.2349 (0.2426) loss 3.1493 (2.7714) grad_norm 4.7048 (5.1834) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][880/1251] eta 0:01:30 lr 0.000062 wd 0.0500 time 0.2377 (0.2445) data time 0.0009 (0.0015) model time 0.2368 (0.2426) loss 2.3383 (2.7688) grad_norm 5.6489 (5.1778) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][890/1251] eta 0:01:28 lr 0.000062 wd 0.0500 time 0.2405 (0.2445) data time 0.0007 (0.0015) model time 0.2398 (0.2426) loss 2.8766 (2.7688) grad_norm 4.3801 (5.1683) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-09-01 08:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][900/1251] eta 0:01:25 lr 0.000062 wd 0.0500 time 0.2378 (0.2444) data time 0.0011 (0.0015) model time 0.2367 (0.2426) loss 2.4919 (2.7683) grad_norm 5.4101 (5.1640) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][910/1251] eta 0:01:23 lr 0.000062 wd 0.0500 time 0.2456 (0.2444) data time 0.0009 (0.0015) model time 0.2447 (0.2425) loss 3.2374 (2.7685) grad_norm 3.9034 (5.1574) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][920/1251] eta 0:01:20 lr 0.000062 wd 0.0500 time 0.2418 (0.2444) data time 0.0007 (0.0015) model time 0.2411 (0.2425) loss 1.9843 (2.7660) grad_norm 5.8466 (5.1525) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][930/1251] eta 0:01:18 lr 0.000062 wd 0.0500 time 0.2330 (0.2443) data time 0.0011 (0.0015) model time 0.2319 (0.2425) loss 2.3456 (2.7641) grad_norm 3.6078 (5.1450) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][940/1251] eta 0:01:15 lr 0.000062 wd 0.0500 time 0.2409 (0.2443) data time 0.0007 (0.0015) model time 0.2402 (0.2424) loss 2.2217 (2.7626) grad_norm 5.1314 (5.1366) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][950/1251] eta 0:01:13 lr 0.000062 wd 0.0500 time 0.2412 (0.2443) data time 0.0010 (0.0014) model time 0.2402 (0.2424) loss 2.9620 (2.7655) grad_norm 4.6452 (5.1368) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][960/1251] eta 0:01:11 lr 0.000062 wd 0.0500 time 0.2406 (0.2442) data time 0.0009 (0.0014) model 
time 0.2397 (0.2424) loss 3.1677 (2.7624) grad_norm 4.0164 (5.2027) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][970/1251] eta 0:01:08 lr 0.000062 wd 0.0500 time 0.2385 (0.2442) data time 0.0009 (0.0014) model time 0.2376 (0.2423) loss 2.5146 (2.7609) grad_norm 7.3267 (5.1987) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][980/1251] eta 0:01:06 lr 0.000062 wd 0.0500 time 0.2402 (0.2442) data time 0.0009 (0.0014) model time 0.2393 (0.2423) loss 2.8090 (2.7602) grad_norm 4.0669 (5.1965) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][990/1251] eta 0:01:03 lr 0.000062 wd 0.0500 time 0.2455 (0.2441) data time 0.0007 (0.0014) model time 0.2448 (0.2423) loss 3.2348 (2.7607) grad_norm 3.5222 (5.1906) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1000/1251] eta 0:01:01 lr 0.000062 wd 0.0500 time 0.2412 (0.2441) data time 0.0008 (0.0014) model time 0.2404 (0.2423) loss 2.1125 (2.7596) grad_norm 4.8723 (5.1833) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1010/1251] eta 0:00:58 lr 0.000062 wd 0.0500 time 0.2413 (0.2440) data time 0.0012 (0.0014) model time 0.2401 (0.2423) loss 3.3749 (2.7609) grad_norm 3.4812 (5.1853) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1020/1251] eta 0:00:56 lr 0.000062 wd 0.0500 time 0.2349 (0.2440) data time 0.0010 (0.0014) model time 0.2339 (0.2422) loss 2.7466 (2.7602) grad_norm 5.3682 (5.1762) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[258/300][1030/1251] eta 0:00:53 lr 0.000062 wd 0.0500 time 0.2430 (0.2440) data time 0.0010 (0.0014) model time 0.2420 (0.2422) loss 3.1021 (2.7601) grad_norm 3.4751 (5.1700) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1040/1251] eta 0:00:51 lr 0.000062 wd 0.0500 time 0.2349 (0.2440) data time 0.0010 (0.0014) model time 0.2339 (0.2422) loss 2.3559 (2.7600) grad_norm 5.2284 (5.1649) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1050/1251] eta 0:00:49 lr 0.000062 wd 0.0500 time 0.2403 (0.2439) data time 0.0008 (0.0014) model time 0.2395 (0.2422) loss 2.1868 (2.7605) grad_norm 4.7477 (5.1639) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1060/1251] eta 0:00:46 lr 0.000062 wd 0.0500 time 0.2395 (0.2439) data time 0.0012 (0.0014) model time 0.2384 (0.2422) loss 3.2789 (2.7610) grad_norm 4.6692 (5.1578) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1070/1251] eta 0:00:44 lr 0.000062 wd 0.0500 time 0.2340 (0.2439) data time 0.0011 (0.0014) model time 0.2329 (0.2421) loss 2.9900 (2.7618) grad_norm 5.3715 (5.1506) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1080/1251] eta 0:00:41 lr 0.000062 wd 0.0500 time 0.2461 (0.2438) data time 0.0007 (0.0014) model time 0.2453 (0.2421) loss 2.6391 (2.7596) grad_norm 6.1083 (5.1508) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1090/1251] eta 0:00:39 lr 0.000062 wd 0.0500 time 0.2470 (0.2438) data time 0.0009 (0.0014) model time 0.2460 (0.2421) loss 2.7345 (2.7596) grad_norm 5.2144 (5.1496) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-09-01 08:21:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1100/1251] eta 0:00:36 lr 0.000062 wd 0.0500 time 0.2452 (0.2438) data time 0.0007 (0.0014) model time 0.2446 (0.2421) loss 3.1991 (2.7599) grad_norm 7.8997 (5.1564) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1110/1251] eta 0:00:34 lr 0.000062 wd 0.0500 time 0.2341 (0.2438) data time 0.0011 (0.0014) model time 0.2331 (0.2421) loss 2.7688 (2.7615) grad_norm 5.2271 (5.1636) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1120/1251] eta 0:00:31 lr 0.000062 wd 0.0500 time 0.2453 (0.2438) data time 0.0007 (0.0014) model time 0.2446 (0.2420) loss 3.3477 (2.7597) grad_norm 3.8051 (5.1606) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1130/1251] eta 0:00:29 lr 0.000062 wd 0.0500 time 0.2481 (0.2438) data time 0.0010 (0.0014) model time 0.2470 (0.2420) loss 2.6670 (2.7599) grad_norm 6.4177 (5.1581) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1140/1251] eta 0:00:27 lr 0.000062 wd 0.0500 time 0.2367 (0.2437) data time 0.0009 (0.0014) model time 0.2357 (0.2420) loss 3.4421 (2.7596) grad_norm 6.1424 (5.1659) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1150/1251] eta 0:00:24 lr 0.000062 wd 0.0500 time 0.2437 (0.2437) data time 0.0010 (0.0014) model time 0.2427 (0.2420) loss 3.0845 (2.7583) grad_norm 5.4830 (5.1609) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:21:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1160/1251] eta 0:00:22 lr 0.000062 wd 0.0500 time 0.2357 (0.2437) data time 0.0009 (0.0014) 
model time 0.2348 (0.2420) loss 2.3721 (2.7604) grad_norm 3.2729 (5.1606) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1170/1251] eta 0:00:19 lr 0.000062 wd 0.0500 time 0.2439 (0.2437) data time 0.0009 (0.0014) model time 0.2431 (0.2420) loss 2.4706 (2.7554) grad_norm 5.6594 (5.1565) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1180/1251] eta 0:00:17 lr 0.000062 wd 0.0500 time 0.2434 (0.2437) data time 0.0008 (0.0014) model time 0.2426 (0.2420) loss 2.1823 (2.7557) grad_norm 4.2949 (5.1542) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1190/1251] eta 0:00:14 lr 0.000062 wd 0.0500 time 0.2416 (0.2437) data time 0.0007 (0.0014) model time 0.2409 (0.2420) loss 3.5684 (2.7553) grad_norm 3.5714 (5.1603) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1200/1251] eta 0:00:12 lr 0.000062 wd 0.0500 time 0.2436 (0.2437) data time 0.0008 (0.0014) model time 0.2429 (0.2420) loss 3.0763 (2.7523) grad_norm 4.5970 (5.1532) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1210/1251] eta 0:00:09 lr 0.000062 wd 0.0500 time 0.2385 (0.2436) data time 0.0009 (0.0014) model time 0.2377 (0.2420) loss 1.3967 (2.7524) grad_norm 5.1781 (5.1468) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1220/1251] eta 0:00:07 lr 0.000062 wd 0.0500 time 0.2441 (0.2436) data time 0.0009 (0.0013) model time 0.2433 (0.2419) loss 3.0775 (2.7520) grad_norm 4.9837 (5.1447) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[258/300][1230/1251] eta 0:00:05 lr 0.000061 wd 0.0500 time 0.2397 (0.2436) data time 0.0009 (0.0013) model time 0.2387 (0.2419) loss 3.0843 (2.7521) grad_norm 5.6884 (5.1373) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1240/1251] eta 0:00:02 lr 0.000061 wd 0.0500 time 0.2264 (0.2435) data time 0.0007 (0.0013) model time 0.2257 (0.2418) loss 2.9852 (2.7516) grad_norm 3.9675 (5.1462) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [258/300][1250/1251] eta 0:00:00 lr 0.000061 wd 0.0500 time 0.2234 (0.2433) data time 0.0006 (0.0013) model time 0.2228 (0.2417) loss 3.0165 (2.7532) grad_norm 3.7295 (5.1387) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 258 training takes 0:05:04 [2024-09-01 08:22:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:22:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 08:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.431 (0.431) Loss 0.3933 (0.3933) Acc@1 92.676 (92.676) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 08:22:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.110) Loss 0.5859 (0.6229) Acc@1 89.551 (87.340) Acc@5 97.852 (97.692) Mem 7381MB [2024-09-01 08:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.094) Loss 0.9199 (0.6521) Acc@1 78.223 (86.314) Acc@5 95.215 (97.554) Mem 7381MB [2024-09-01 08:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.089) Loss 1.2129 (0.7490) Acc@1 71.875 (83.984) Acc@5 91.797 (96.566) Mem 7381MB [2024-09-01 08:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0312 (0.7984) Acc@1 76.562 (82.772) Acc@5 94.043 (96.108) Mem 7381MB [2024-09-01 08:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.418 Acc@5 96.048 [2024-09-01 08:22:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-09-01 08:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.844 (0.844) Loss 0.3809 (0.3809) Acc@1 93.359 (93.359) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 08:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.152) Loss 0.5664 (0.6015) Acc@1 90.527 (87.793) Acc@5 98.145 (97.843) Mem 7381MB [2024-09-01 08:22:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.086 (0.118) Loss 0.8936 (0.6312) Acc@1 77.930 (86.607) Acc@5 96.094 (97.759) Mem 7381MB [2024-09-01 08:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.105) Loss 1.1113 (0.7196) Acc@1 74.023 (84.425) Acc@5 93.164 (96.878) Mem 7381MB [2024-09-01 08:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.095) Loss 0.9917 
(0.7659) Acc@1 77.051 (83.303) Acc@5 94.336 (96.382) Mem 7381MB [2024-09-01 08:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.934 Acc@5 96.330 [2024-09-01 08:22:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 08:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][0/1251] eta 1:10:41 lr 0.000061 wd 0.0500 time 3.3907 (3.3907) data time 3.1017 (3.1017) model time 0.0000 (0.0000) loss 2.7605 (2.7605) grad_norm 4.3454 (4.3454) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][10/1251] eta 0:10:54 lr 0.000061 wd 0.0500 time 0.2369 (0.5272) data time 0.0007 (0.2829) model time 0.0000 (0.0000) loss 2.4622 (2.8467) grad_norm 4.2883 (6.1805) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][20/1251] eta 0:08:00 lr 0.000061 wd 0.0500 time 0.2446 (0.3904) data time 0.0007 (0.1487) model time 0.0000 (0.0000) loss 2.6578 (2.7204) grad_norm 4.5083 (5.4882) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][30/1251] eta 0:06:57 lr 0.000061 wd 0.0500 time 0.2426 (0.3422) data time 0.0007 (0.1010) model time 0.0000 (0.0000) loss 3.2907 (2.8081) grad_norm 4.5045 (5.1114) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][40/1251] eta 0:06:23 lr 0.000061 wd 0.0500 time 0.2303 (0.3165) data time 0.0009 (0.0766) model time 0.0000 (0.0000) loss 2.9361 (2.8472) grad_norm 4.2538 (5.1591) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][50/1251] eta 0:06:02 lr 0.000061 wd 0.0500 time 0.2422 (0.3022) data time 0.0011 (0.0618) model time 0.0000 (0.0000) loss 2.9888 
(2.8557) grad_norm 4.1458 (4.8903) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][60/1251] eta 0:05:47 lr 0.000061 wd 0.0500 time 0.2423 (0.2917) data time 0.0009 (0.0518) model time 0.2414 (0.2375) loss 2.4520 (2.7849) grad_norm 3.9120 (4.7344) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][70/1251] eta 0:05:36 lr 0.000061 wd 0.0500 time 0.2538 (0.2848) data time 0.0007 (0.0447) model time 0.2530 (0.2394) loss 1.7566 (2.7484) grad_norm 3.7220 (5.0286) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][80/1251] eta 0:05:29 lr 0.000061 wd 0.0500 time 0.2452 (0.2813) data time 0.0009 (0.0393) model time 0.2444 (0.2448) loss 3.3908 (2.7907) grad_norm 4.6742 (4.9866) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][90/1251] eta 0:05:21 lr 0.000061 wd 0.0500 time 0.2409 (0.2769) data time 0.0009 (0.0351) model time 0.2400 (0.2438) loss 3.1244 (2.8098) grad_norm 4.1782 (4.9867) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][100/1251] eta 0:05:14 lr 0.000061 wd 0.0500 time 0.2452 (0.2733) data time 0.0009 (0.0317) model time 0.2443 (0.2429) loss 2.5518 (2.8055) grad_norm 4.3947 (5.0233) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:22:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][110/1251] eta 0:05:08 lr 0.000061 wd 0.0500 time 0.2478 (0.2705) data time 0.0009 (0.0289) model time 0.2469 (0.2427) loss 2.1583 (2.7972) grad_norm 5.2481 (5.1506) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][120/1251] eta 0:05:03 lr 0.000061 wd 0.0500 
time 0.2376 (0.2683) data time 0.0011 (0.0266) model time 0.2365 (0.2426) loss 3.1684 (2.8054) grad_norm 5.4082 (5.1846) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][130/1251] eta 0:05:01 lr 0.000061 wd 0.0500 time 0.2333 (0.2688) data time 0.0011 (0.0247) model time 0.2322 (0.2466) loss 2.7470 (2.8102) grad_norm 3.2227 (5.1947) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 08:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][140/1251] eta 0:04:57 lr 0.000061 wd 0.0500 time 0.2401 (0.2681) data time 0.0011 (0.0230) model time 0.2391 (0.2478) loss 2.1503 (2.7827) grad_norm 4.1287 (5.1829) loss_scale 1024.0000 (519.2624) mem 7381MB [2024-09-01 08:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][150/1251] eta 0:04:53 lr 0.000061 wd 0.0500 time 0.2420 (0.2663) data time 0.0010 (0.0215) model time 0.2410 (0.2471) loss 2.3110 (2.7747) grad_norm 6.1679 (5.1696) loss_scale 1024.0000 (552.6887) mem 7381MB [2024-09-01 08:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][160/1251] eta 0:04:48 lr 0.000061 wd 0.0500 time 0.2422 (0.2648) data time 0.0010 (0.0202) model time 0.2412 (0.2464) loss 2.0225 (2.7676) grad_norm 4.8232 (5.1259) loss_scale 1024.0000 (581.9627) mem 7381MB [2024-09-01 08:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][170/1251] eta 0:04:44 lr 0.000061 wd 0.0500 time 0.2397 (0.2635) data time 0.0007 (0.0191) model time 0.2390 (0.2461) loss 3.4423 (2.7771) grad_norm 4.5309 (5.1207) loss_scale 1024.0000 (607.8129) mem 7381MB [2024-09-01 08:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][180/1251] eta 0:04:41 lr 0.000061 wd 0.0500 time 0.2452 (0.2624) data time 0.0008 (0.0181) model time 0.2445 (0.2458) loss 2.1497 (2.7662) grad_norm 4.9874 (5.1572) loss_scale 1024.0000 (630.8066) mem 7381MB [2024-09-01 08:23:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [259/300][190/1251] eta 0:04:37 lr 0.000061 wd 0.0500 time 0.2403 (0.2613) data time 0.0010 (0.0172) model time 0.2394 (0.2454) loss 3.0712 (2.7586) grad_norm 4.5208 (5.1897) loss_scale 1024.0000 (651.3927) mem 7381MB [2024-09-01 08:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][200/1251] eta 0:04:33 lr 0.000061 wd 0.0500 time 0.2416 (0.2603) data time 0.0011 (0.0164) model time 0.2405 (0.2451) loss 3.0824 (2.7683) grad_norm 4.5988 (5.2209) loss_scale 1024.0000 (669.9303) mem 7381MB [2024-09-01 08:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][210/1251] eta 0:04:30 lr 0.000061 wd 0.0500 time 0.2332 (0.2594) data time 0.0009 (0.0157) model time 0.2323 (0.2448) loss 1.8491 (2.7755) grad_norm 6.7058 (5.1940) loss_scale 1024.0000 (686.7109) mem 7381MB [2024-09-01 08:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][220/1251] eta 0:04:26 lr 0.000061 wd 0.0500 time 0.2386 (0.2586) data time 0.0007 (0.0150) model time 0.2379 (0.2445) loss 3.2513 (2.7823) grad_norm 5.0036 (5.2019) loss_scale 1024.0000 (701.9729) mem 7381MB [2024-09-01 08:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][230/1251] eta 0:04:22 lr 0.000061 wd 0.0500 time 0.2349 (0.2576) data time 0.0010 (0.0144) model time 0.2339 (0.2440) loss 3.2735 (2.7754) grad_norm 6.5483 (5.1658) loss_scale 1024.0000 (715.9134) mem 7381MB [2024-09-01 08:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][240/1251] eta 0:04:19 lr 0.000061 wd 0.0500 time 0.2370 (0.2569) data time 0.0010 (0.0139) model time 0.2360 (0.2437) loss 3.0008 (2.7764) grad_norm 5.0229 (5.1597) loss_scale 1024.0000 (728.6971) mem 7381MB [2024-09-01 08:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][250/1251] eta 0:04:16 lr 0.000061 wd 0.0500 time 0.2397 (0.2562) data time 0.0008 (0.0133) model time 0.2389 (0.2435) loss 3.1581 (2.7739) grad_norm 7.4071 
(5.1573) loss_scale 1024.0000 (740.4622) mem 7381MB [2024-09-01 08:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][260/1251] eta 0:04:13 lr 0.000061 wd 0.0500 time 0.2283 (0.2556) data time 0.0011 (0.0129) model time 0.2272 (0.2433) loss 3.2628 (2.7764) grad_norm 7.9121 (5.1564) loss_scale 1024.0000 (751.3257) mem 7381MB [2024-09-01 08:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][270/1251] eta 0:04:10 lr 0.000061 wd 0.0500 time 0.2414 (0.2549) data time 0.0007 (0.0124) model time 0.2406 (0.2430) loss 3.2769 (2.7745) grad_norm 5.5049 (5.1724) loss_scale 1024.0000 (761.3875) mem 7381MB [2024-09-01 08:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][280/1251] eta 0:04:07 lr 0.000061 wd 0.0500 time 0.2421 (0.2544) data time 0.0010 (0.0120) model time 0.2410 (0.2428) loss 2.8812 (2.7723) grad_norm 3.8384 (5.1489) loss_scale 1024.0000 (770.7331) mem 7381MB [2024-09-01 08:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][290/1251] eta 0:04:04 lr 0.000061 wd 0.0500 time 0.2371 (0.2539) data time 0.0008 (0.0117) model time 0.2363 (0.2427) loss 3.3350 (2.7735) grad_norm 3.6624 (5.2195) loss_scale 1024.0000 (779.4364) mem 7381MB [2024-09-01 08:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][300/1251] eta 0:04:01 lr 0.000061 wd 0.0500 time 0.2611 (0.2535) data time 0.0009 (0.0113) model time 0.2602 (0.2425) loss 3.1294 (2.7746) grad_norm 4.6103 (inf) loss_scale 512.0000 (782.4585) mem 7381MB [2024-09-01 08:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][310/1251] eta 0:03:58 lr 0.000061 wd 0.0500 time 0.2352 (0.2531) data time 0.0009 (0.0110) model time 0.2343 (0.2424) loss 2.8220 (2.7779) grad_norm 3.8333 (inf) loss_scale 512.0000 (773.7621) mem 7381MB [2024-09-01 08:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][320/1251] eta 0:03:55 lr 0.000061 wd 0.0500 time 0.2353 (0.2526) data 
time 0.0007 (0.0107) model time 0.2345 (0.2422) loss 3.7113 (2.7744) grad_norm 4.5054 (inf) loss_scale 512.0000 (765.6075) mem 7381MB [2024-09-01 08:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][330/1251] eta 0:03:52 lr 0.000061 wd 0.0500 time 0.2442 (0.2522) data time 0.0011 (0.0104) model time 0.2431 (0.2421) loss 2.6728 (2.7696) grad_norm 3.4255 (inf) loss_scale 512.0000 (757.9456) mem 7381MB [2024-09-01 08:23:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][340/1251] eta 0:03:49 lr 0.000061 wd 0.0500 time 0.2393 (0.2519) data time 0.0010 (0.0101) model time 0.2383 (0.2420) loss 1.8510 (2.7572) grad_norm 5.2755 (inf) loss_scale 512.0000 (750.7331) mem 7381MB [2024-09-01 08:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][350/1251] eta 0:03:46 lr 0.000061 wd 0.0500 time 0.2449 (0.2516) data time 0.0010 (0.0098) model time 0.2439 (0.2420) loss 2.7744 (2.7579) grad_norm 3.7601 (inf) loss_scale 512.0000 (743.9316) mem 7381MB [2024-09-01 08:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][360/1251] eta 0:03:43 lr 0.000061 wd 0.0500 time 0.2394 (0.2512) data time 0.0007 (0.0096) model time 0.2387 (0.2418) loss 2.6525 (2.7510) grad_norm 5.1044 (inf) loss_scale 512.0000 (737.5069) mem 7381MB [2024-09-01 08:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][370/1251] eta 0:03:41 lr 0.000061 wd 0.0500 time 0.2356 (0.2510) data time 0.0008 (0.0094) model time 0.2348 (0.2418) loss 2.5778 (2.7496) grad_norm 7.5970 (inf) loss_scale 512.0000 (731.4286) mem 7381MB [2024-09-01 08:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][380/1251] eta 0:03:38 lr 0.000061 wd 0.0500 time 0.2411 (0.2508) data time 0.0011 (0.0092) model time 0.2401 (0.2418) loss 2.9854 (2.7456) grad_norm 5.8642 (inf) loss_scale 512.0000 (725.6693) mem 7381MB [2024-09-01 08:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[259/300][390/1251] eta 0:03:36 lr 0.000061 wd 0.0500 time 0.2413 (0.2510) data time 0.0010 (0.0089) model time 0.2403 (0.2424) loss 3.2812 (2.7506) grad_norm 3.3399 (inf) loss_scale 512.0000 (720.2046) mem 7381MB [2024-09-01 08:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][400/1251] eta 0:03:33 lr 0.000061 wd 0.0500 time 0.2421 (0.2508) data time 0.0007 (0.0087) model time 0.2414 (0.2423) loss 1.9194 (2.7449) grad_norm 4.0587 (inf) loss_scale 512.0000 (715.0125) mem 7381MB [2024-09-01 08:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][410/1251] eta 0:03:30 lr 0.000061 wd 0.0500 time 0.2395 (0.2505) data time 0.0008 (0.0086) model time 0.2387 (0.2422) loss 3.0860 (2.7537) grad_norm 3.6512 (inf) loss_scale 512.0000 (710.0730) mem 7381MB [2024-09-01 08:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][420/1251] eta 0:03:27 lr 0.000061 wd 0.0500 time 0.2380 (0.2503) data time 0.0010 (0.0084) model time 0.2370 (0.2421) loss 2.7551 (2.7558) grad_norm 5.0083 (inf) loss_scale 512.0000 (705.3682) mem 7381MB [2024-09-01 08:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][430/1251] eta 0:03:25 lr 0.000061 wd 0.0500 time 0.2430 (0.2501) data time 0.0009 (0.0082) model time 0.2420 (0.2420) loss 2.7122 (2.7616) grad_norm 5.3546 (inf) loss_scale 512.0000 (700.8817) mem 7381MB [2024-09-01 08:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][440/1251] eta 0:03:22 lr 0.000061 wd 0.0500 time 0.2434 (0.2498) data time 0.0006 (0.0080) model time 0.2427 (0.2419) loss 3.2267 (2.7615) grad_norm 3.9206 (inf) loss_scale 512.0000 (696.5986) mem 7381MB [2024-09-01 08:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][450/1251] eta 0:03:19 lr 0.000061 wd 0.0500 time 0.2330 (0.2496) data time 0.0012 (0.0079) model time 0.2317 (0.2418) loss 2.7443 (2.7627) grad_norm 2.7997 (inf) loss_scale 512.0000 (692.5055) mem 7381MB [2024-09-01 
08:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][460/1251] eta 0:03:17 lr 0.000061 wd 0.0500 time 0.2334 (0.2494) data time 0.0007 (0.0077) model time 0.2327 (0.2418) loss 3.3324 (2.7600) grad_norm 13.0877 (inf) loss_scale 512.0000 (688.5900) mem 7381MB [2024-09-01 08:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][470/1251] eta 0:03:14 lr 0.000061 wd 0.0500 time 0.2423 (0.2493) data time 0.0010 (0.0076) model time 0.2413 (0.2418) loss 1.9345 (2.7603) grad_norm 4.4311 (inf) loss_scale 512.0000 (684.8408) mem 7381MB [2024-09-01 08:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][480/1251] eta 0:03:12 lr 0.000061 wd 0.0500 time 0.2484 (0.2491) data time 0.0010 (0.0075) model time 0.2474 (0.2418) loss 3.0952 (2.7644) grad_norm 3.7466 (inf) loss_scale 512.0000 (681.2474) mem 7381MB [2024-09-01 08:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][490/1251] eta 0:03:09 lr 0.000060 wd 0.0500 time 0.2477 (0.2490) data time 0.0011 (0.0073) model time 0.2466 (0.2418) loss 3.1296 (2.7682) grad_norm 4.1882 (inf) loss_scale 512.0000 (677.8004) mem 7381MB [2024-09-01 08:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][500/1251] eta 0:03:06 lr 0.000060 wd 0.0500 time 0.2470 (0.2489) data time 0.0007 (0.0072) model time 0.2463 (0.2418) loss 3.8130 (2.7713) grad_norm 4.0375 (inf) loss_scale 512.0000 (674.4910) mem 7381MB [2024-09-01 08:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][510/1251] eta 0:03:04 lr 0.000060 wd 0.0500 time 0.2420 (0.2487) data time 0.0009 (0.0071) model time 0.2411 (0.2418) loss 2.0160 (2.7653) grad_norm 3.8462 (inf) loss_scale 512.0000 (671.3112) mem 7381MB [2024-09-01 08:24:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][520/1251] eta 0:03:01 lr 0.000060 wd 0.0500 time 0.2387 (0.2486) data time 0.0009 (0.0070) model time 0.2378 (0.2418) loss 1.5292 (2.7599) grad_norm 
3.8838 (inf) loss_scale 512.0000 (668.2534) mem 7381MB [2024-09-01 08:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][530/1251] eta 0:02:59 lr 0.000060 wd 0.0500 time 0.2383 (0.2485) data time 0.0010 (0.0068) model time 0.2373 (0.2418) loss 2.9968 (2.7591) grad_norm 3.4606 (inf) loss_scale 512.0000 (665.3107) mem 7381MB [2024-09-01 08:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][540/1251] eta 0:02:56 lr 0.000060 wd 0.0500 time 0.2484 (0.2484) data time 0.0009 (0.0067) model time 0.2474 (0.2418) loss 2.8128 (2.7611) grad_norm 5.5584 (inf) loss_scale 512.0000 (662.4769) mem 7381MB [2024-09-01 08:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][550/1251] eta 0:02:54 lr 0.000060 wd 0.0500 time 0.2436 (0.2483) data time 0.0010 (0.0066) model time 0.2426 (0.2417) loss 3.1653 (2.7609) grad_norm 4.0168 (inf) loss_scale 512.0000 (659.7459) mem 7381MB [2024-09-01 08:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][560/1251] eta 0:02:51 lr 0.000060 wd 0.0500 time 0.2344 (0.2481) data time 0.0011 (0.0065) model time 0.2334 (0.2417) loss 3.0539 (2.7632) grad_norm 6.6329 (inf) loss_scale 512.0000 (657.1123) mem 7381MB [2024-09-01 08:24:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][570/1251] eta 0:02:48 lr 0.000060 wd 0.0500 time 0.2471 (0.2480) data time 0.0007 (0.0064) model time 0.2464 (0.2417) loss 2.3586 (2.7619) grad_norm 4.7058 (inf) loss_scale 512.0000 (654.5709) mem 7381MB [2024-09-01 08:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][580/1251] eta 0:02:46 lr 0.000060 wd 0.0500 time 0.2430 (0.2480) data time 0.0010 (0.0063) model time 0.2421 (0.2417) loss 2.7401 (2.7658) grad_norm 4.4368 (inf) loss_scale 512.0000 (652.1170) mem 7381MB [2024-09-01 08:24:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][590/1251] eta 0:02:43 lr 0.000060 wd 0.0500 time 0.2382 (0.2479) data time 0.0010 
(0.0063) model time 0.2372 (0.2417) loss 3.1767 (2.7670) grad_norm 6.8335 (inf) loss_scale 512.0000 (649.7462) mem 7381MB [2024-09-01 08:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][600/1251] eta 0:02:41 lr 0.000060 wd 0.0500 time 0.4440 (0.2481) data time 0.0009 (0.0062) model time 0.4431 (0.2421) loss 2.4904 (2.7686) grad_norm 3.5365 (inf) loss_scale 512.0000 (647.4542) mem 7381MB [2024-09-01 08:25:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][610/1251] eta 0:02:38 lr 0.000060 wd 0.0500 time 0.2357 (0.2480) data time 0.0007 (0.0061) model time 0.2350 (0.2421) loss 2.1953 (2.7648) grad_norm 4.5911 (inf) loss_scale 512.0000 (645.2373) mem 7381MB [2024-09-01 08:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][620/1251] eta 0:02:36 lr 0.000060 wd 0.0500 time 0.2446 (0.2480) data time 0.0007 (0.0060) model time 0.2438 (0.2421) loss 2.7895 (2.7661) grad_norm 4.2427 (inf) loss_scale 512.0000 (643.0918) mem 7381MB [2024-09-01 08:25:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][630/1251] eta 0:02:33 lr 0.000060 wd 0.0500 time 0.2385 (0.2478) data time 0.0010 (0.0059) model time 0.2375 (0.2420) loss 2.0821 (2.7660) grad_norm 4.2809 (inf) loss_scale 512.0000 (641.0143) mem 7381MB [2024-09-01 08:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][640/1251] eta 0:02:31 lr 0.000060 wd 0.0500 time 0.2465 (0.2477) data time 0.0010 (0.0058) model time 0.2455 (0.2420) loss 2.9782 (2.7674) grad_norm 3.8959 (inf) loss_scale 512.0000 (639.0016) mem 7381MB [2024-09-01 08:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][650/1251] eta 0:02:28 lr 0.000060 wd 0.0500 time 0.2401 (0.2476) data time 0.0011 (0.0058) model time 0.2390 (0.2420) loss 2.9118 (2.7697) grad_norm 4.4619 (inf) loss_scale 512.0000 (637.0507) mem 7381MB [2024-09-01 08:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][660/1251] eta 
0:02:26 lr 0.000060 wd 0.0500 time 0.2380 (0.2485) data time 0.0011 (0.0057) model time 0.2369 (0.2430) loss 2.2359 (2.7645) grad_norm 4.8966 (inf) loss_scale 512.0000 (635.1589) mem 7381MB [2024-09-01 08:25:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][670/1251] eta 0:02:24 lr 0.000060 wd 0.0500 time 0.2441 (0.2484) data time 0.0011 (0.0056) model time 0.2430 (0.2429) loss 2.1758 (2.7665) grad_norm 4.9533 (inf) loss_scale 512.0000 (633.3234) mem 7381MB [2024-09-01 08:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][680/1251] eta 0:02:21 lr 0.000060 wd 0.0500 time 0.2353 (0.2482) data time 0.0009 (0.0056) model time 0.2345 (0.2428) loss 3.0346 (2.7679) grad_norm 3.5101 (inf) loss_scale 512.0000 (631.5419) mem 7381MB [2024-09-01 08:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][690/1251] eta 0:02:19 lr 0.000060 wd 0.0500 time 0.2429 (0.2481) data time 0.0007 (0.0055) model time 0.2422 (0.2428) loss 3.4275 (2.7706) grad_norm 5.0176 (inf) loss_scale 512.0000 (629.8119) mem 7381MB [2024-09-01 08:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][700/1251] eta 0:02:16 lr 0.000060 wd 0.0500 time 0.2396 (0.2481) data time 0.0010 (0.0054) model time 0.2386 (0.2428) loss 2.2563 (2.7670) grad_norm 4.6405 (inf) loss_scale 512.0000 (628.1312) mem 7381MB [2024-09-01 08:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][710/1251] eta 0:02:14 lr 0.000060 wd 0.0500 time 0.2409 (0.2480) data time 0.0007 (0.0054) model time 0.2402 (0.2428) loss 2.0914 (2.7638) grad_norm 3.8897 (inf) loss_scale 512.0000 (626.4979) mem 7381MB [2024-09-01 08:25:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][720/1251] eta 0:02:11 lr 0.000060 wd 0.0500 time 0.2413 (0.2479) data time 0.0010 (0.0053) model time 0.2403 (0.2427) loss 3.0701 (2.7654) grad_norm 3.0229 (inf) loss_scale 512.0000 (624.9098) mem 7381MB [2024-09-01 08:25:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][730/1251] eta 0:02:09 lr 0.000060 wd 0.0500 time 0.2403 (0.2478) data time 0.0007 (0.0053) model time 0.2396 (0.2427) loss 1.8570 (2.7641) grad_norm 5.4446 (inf) loss_scale 512.0000 (623.3653) mem 7381MB [2024-09-01 08:25:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][740/1251] eta 0:02:06 lr 0.000060 wd 0.0500 time 0.2454 (0.2477) data time 0.0011 (0.0052) model time 0.2443 (0.2427) loss 2.1722 (2.7637) grad_norm 3.4689 (inf) loss_scale 512.0000 (621.8623) mem 7381MB [2024-09-01 08:25:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][750/1251] eta 0:02:04 lr 0.000060 wd 0.0500 time 0.2454 (0.2476) data time 0.0007 (0.0051) model time 0.2447 (0.2426) loss 2.6605 (2.7623) grad_norm 3.3162 (inf) loss_scale 512.0000 (620.3995) mem 7381MB [2024-09-01 08:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][760/1251] eta 0:02:01 lr 0.000060 wd 0.0500 time 0.2407 (0.2475) data time 0.0010 (0.0051) model time 0.2398 (0.2426) loss 3.3943 (2.7646) grad_norm 6.7516 (inf) loss_scale 512.0000 (618.9750) mem 7381MB [2024-09-01 08:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][770/1251] eta 0:01:59 lr 0.000060 wd 0.0500 time 0.2416 (0.2475) data time 0.0011 (0.0050) model time 0.2406 (0.2426) loss 3.2609 (2.7634) grad_norm 4.6734 (inf) loss_scale 512.0000 (617.5875) mem 7381MB [2024-09-01 08:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][780/1251] eta 0:01:56 lr 0.000060 wd 0.0500 time 0.2418 (0.2474) data time 0.0009 (0.0050) model time 0.2408 (0.2425) loss 2.9100 (2.7632) grad_norm 6.8819 (inf) loss_scale 512.0000 (616.2356) mem 7381MB [2024-09-01 08:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][790/1251] eta 0:01:54 lr 0.000060 wd 0.0500 time 0.2376 (0.2473) data time 0.0009 (0.0049) model time 0.2367 (0.2425) loss 2.9676 (2.7633) grad_norm 7.5435 
(inf) loss_scale 512.0000 (614.9178) mem 7381MB [2024-09-01 08:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][800/1251] eta 0:01:51 lr 0.000060 wd 0.0500 time 0.2486 (0.2472) data time 0.0011 (0.0049) model time 0.2475 (0.2425) loss 3.2414 (2.7601) grad_norm 4.9511 (inf) loss_scale 512.0000 (613.6330) mem 7381MB [2024-09-01 08:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][810/1251] eta 0:01:48 lr 0.000060 wd 0.0500 time 0.2459 (0.2471) data time 0.0009 (0.0048) model time 0.2450 (0.2424) loss 2.5377 (2.7605) grad_norm 4.8241 (inf) loss_scale 512.0000 (612.3798) mem 7381MB [2024-09-01 08:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][820/1251] eta 0:01:46 lr 0.000060 wd 0.0500 time 0.2397 (0.2471) data time 0.0010 (0.0048) model time 0.2387 (0.2424) loss 2.4368 (2.7607) grad_norm 4.2249 (inf) loss_scale 512.0000 (611.1571) mem 7381MB [2024-09-01 08:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][830/1251] eta 0:01:43 lr 0.000060 wd 0.0500 time 0.2424 (0.2470) data time 0.0010 (0.0047) model time 0.2414 (0.2424) loss 2.9635 (2.7600) grad_norm 7.1662 (inf) loss_scale 512.0000 (609.9639) mem 7381MB [2024-09-01 08:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][840/1251] eta 0:01:41 lr 0.000060 wd 0.0500 time 0.2395 (0.2470) data time 0.0010 (0.0047) model time 0.2384 (0.2424) loss 3.2012 (2.7621) grad_norm 5.6498 (inf) loss_scale 512.0000 (608.7990) mem 7381MB [2024-09-01 08:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][850/1251] eta 0:01:39 lr 0.000060 wd 0.0500 time 0.2357 (0.2469) data time 0.0008 (0.0047) model time 0.2349 (0.2424) loss 3.5007 (2.7669) grad_norm 7.4759 (inf) loss_scale 512.0000 (607.6616) mem 7381MB [2024-09-01 08:26:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][860/1251] eta 0:01:36 lr 0.000060 wd 0.0500 time 0.2511 (0.2468) data time 0.0009 (0.0046) 
model time 0.2502 (0.2423) loss 2.0668 (2.7683) grad_norm 5.2761 (inf) loss_scale 512.0000 (606.5505) mem 7381MB [2024-09-01 08:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][870/1251] eta 0:01:34 lr 0.000060 wd 0.0500 time 0.2347 (0.2468) data time 0.0010 (0.0046) model time 0.2337 (0.2423) loss 3.1137 (2.7681) grad_norm 7.9851 (inf) loss_scale 512.0000 (605.4650) mem 7381MB [2024-09-01 08:26:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][880/1251] eta 0:01:31 lr 0.000060 wd 0.0500 time 0.2417 (0.2467) data time 0.0007 (0.0045) model time 0.2410 (0.2423) loss 3.3214 (2.7695) grad_norm 3.5931 (inf) loss_scale 512.0000 (604.4041) mem 7381MB [2024-09-01 08:26:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][890/1251] eta 0:01:29 lr 0.000060 wd 0.0500 time 0.2410 (0.2467) data time 0.0009 (0.0045) model time 0.2401 (0.2423) loss 2.6094 (2.7703) grad_norm 4.7269 (inf) loss_scale 512.0000 (603.3670) mem 7381MB [2024-09-01 08:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][900/1251] eta 0:01:26 lr 0.000060 wd 0.0500 time 0.2364 (0.2466) data time 0.0010 (0.0045) model time 0.2354 (0.2422) loss 3.0420 (2.7722) grad_norm 3.9998 (inf) loss_scale 512.0000 (602.3529) mem 7381MB [2024-09-01 08:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][910/1251] eta 0:01:24 lr 0.000060 wd 0.0500 time 0.2512 (0.2466) data time 0.0007 (0.0044) model time 0.2504 (0.2422) loss 3.1165 (2.7744) grad_norm 4.2795 (inf) loss_scale 512.0000 (601.3611) mem 7381MB [2024-09-01 08:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][920/1251] eta 0:01:21 lr 0.000060 wd 0.0500 time 0.2428 (0.2465) data time 0.0010 (0.0044) model time 0.2418 (0.2422) loss 3.1641 (2.7776) grad_norm 3.5918 (inf) loss_scale 512.0000 (600.3909) mem 7381MB [2024-09-01 08:26:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][930/1251] eta 0:01:19 lr 
0.000060 wd 0.0500 time 0.2503 (0.2466) data time 0.0007 (0.0043) model time 0.2496 (0.2424) loss 3.5174 (2.7794) grad_norm 4.5373 (inf) loss_scale 512.0000 (599.4415) mem 7381MB [2024-09-01 08:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][940/1251] eta 0:01:16 lr 0.000060 wd 0.0500 time 0.2372 (0.2466) data time 0.0009 (0.0043) model time 0.2363 (0.2424) loss 2.7036 (2.7778) grad_norm 3.3461 (inf) loss_scale 512.0000 (598.5122) mem 7381MB [2024-09-01 08:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][950/1251] eta 0:01:14 lr 0.000060 wd 0.0500 time 0.2474 (0.2465) data time 0.0009 (0.0043) model time 0.2465 (0.2423) loss 2.5420 (2.7775) grad_norm 6.3430 (inf) loss_scale 512.0000 (597.6025) mem 7381MB [2024-09-01 08:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][960/1251] eta 0:01:11 lr 0.000060 wd 0.0500 time 0.2475 (0.2464) data time 0.0009 (0.0042) model time 0.2466 (0.2423) loss 3.2576 (2.7775) grad_norm 4.9510 (inf) loss_scale 512.0000 (596.7118) mem 7381MB [2024-09-01 08:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][970/1251] eta 0:01:09 lr 0.000060 wd 0.0500 time 0.2463 (0.2464) data time 0.0008 (0.0042) model time 0.2456 (0.2423) loss 2.7495 (2.7801) grad_norm 4.8942 (inf) loss_scale 512.0000 (595.8393) mem 7381MB [2024-09-01 08:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][980/1251] eta 0:01:06 lr 0.000060 wd 0.0500 time 0.2421 (0.2463) data time 0.0009 (0.0042) model time 0.2412 (0.2423) loss 2.3335 (2.7809) grad_norm 3.0978 (inf) loss_scale 512.0000 (594.9847) mem 7381MB [2024-09-01 08:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][990/1251] eta 0:01:04 lr 0.000060 wd 0.0500 time 0.2327 (0.2463) data time 0.0009 (0.0041) model time 0.2318 (0.2422) loss 2.9982 (2.7791) grad_norm 4.5539 (inf) loss_scale 512.0000 (594.1473) mem 7381MB [2024-09-01 08:26:35 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [259/300][1000/1251] eta 0:01:01 lr 0.000060 wd 0.0500 time 0.2431 (0.2462) data time 0.0009 (0.0041) model time 0.2422 (0.2422) loss 3.2157 (2.7768) grad_norm 3.0029 (inf) loss_scale 512.0000 (593.3267) mem 7381MB [2024-09-01 08:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1010/1251] eta 0:00:59 lr 0.000059 wd 0.0500 time 0.2420 (0.2462) data time 0.0008 (0.0041) model time 0.2412 (0.2422) loss 3.1258 (2.7739) grad_norm 6.4575 (inf) loss_scale 512.0000 (592.5223) mem 7381MB [2024-09-01 08:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1020/1251] eta 0:00:56 lr 0.000059 wd 0.0500 time 0.2401 (0.2461) data time 0.0009 (0.0040) model time 0.2392 (0.2422) loss 2.9853 (2.7736) grad_norm 6.0590 (inf) loss_scale 512.0000 (591.7336) mem 7381MB [2024-09-01 08:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1030/1251] eta 0:00:54 lr 0.000059 wd 0.0500 time 0.2375 (0.2461) data time 0.0011 (0.0040) model time 0.2364 (0.2421) loss 2.8186 (2.7734) grad_norm 3.7758 (inf) loss_scale 512.0000 (590.9602) mem 7381MB [2024-09-01 08:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1040/1251] eta 0:00:51 lr 0.000059 wd 0.0500 time 0.2481 (0.2460) data time 0.0009 (0.0040) model time 0.2471 (0.2421) loss 3.1670 (2.7721) grad_norm 4.0190 (inf) loss_scale 512.0000 (590.2017) mem 7381MB [2024-09-01 08:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1050/1251] eta 0:00:49 lr 0.000059 wd 0.0500 time 0.2482 (0.2460) data time 0.0010 (0.0040) model time 0.2471 (0.2421) loss 2.8818 (2.7723) grad_norm 4.2387 (inf) loss_scale 512.0000 (589.4577) mem 7381MB [2024-09-01 08:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1060/1251] eta 0:00:46 lr 0.000059 wd 0.0500 time 0.2488 (0.2460) data time 0.0009 (0.0039) model time 0.2479 (0.2421) loss 3.2864 (2.7717) grad_norm 5.7932 (inf) loss_scale 
512.0000 (588.7276) mem 7381MB [2024-09-01 08:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1070/1251] eta 0:00:44 lr 0.000059 wd 0.0500 time 0.2400 (0.2460) data time 0.0009 (0.0039) model time 0.2391 (0.2421) loss 2.9796 (2.7703) grad_norm 5.3250 (inf) loss_scale 512.0000 (588.0112) mem 7381MB [2024-09-01 08:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1080/1251] eta 0:00:42 lr 0.000059 wd 0.0500 time 0.2414 (0.2459) data time 0.0009 (0.0039) model time 0.2405 (0.2421) loss 2.3960 (2.7677) grad_norm 5.5516 (inf) loss_scale 512.0000 (587.3080) mem 7381MB [2024-09-01 08:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1090/1251] eta 0:00:39 lr 0.000059 wd 0.0500 time 0.2458 (0.2459) data time 0.0007 (0.0038) model time 0.2451 (0.2421) loss 2.9107 (2.7676) grad_norm 4.3677 (inf) loss_scale 256.0000 (585.2099) mem 7381MB [2024-09-01 08:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1100/1251] eta 0:00:37 lr 0.000059 wd 0.0500 time 0.2347 (0.2458) data time 0.0009 (0.0038) model time 0.2338 (0.2421) loss 3.0470 (2.7666) grad_norm 7.9355 (inf) loss_scale 256.0000 (582.2198) mem 7381MB [2024-09-01 08:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1110/1251] eta 0:00:34 lr 0.000059 wd 0.0500 time 0.2357 (0.2458) data time 0.0009 (0.0038) model time 0.2348 (0.2421) loss 2.0064 (2.7649) grad_norm 3.1368 (inf) loss_scale 256.0000 (579.2835) mem 7381MB [2024-09-01 08:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1120/1251] eta 0:00:32 lr 0.000059 wd 0.0500 time 0.2339 (0.2457) data time 0.0011 (0.0038) model time 0.2328 (0.2420) loss 1.8876 (2.7613) grad_norm 5.9408 (inf) loss_scale 256.0000 (576.3996) mem 7381MB [2024-09-01 08:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1130/1251] eta 0:00:29 lr 0.000059 wd 0.0500 time 0.2324 (0.2457) data time 0.0011 (0.0037) model 
time 0.2314 (0.2420) loss 3.0209 (2.7602) grad_norm 4.1409 (inf) loss_scale 256.0000 (573.5668) mem 7381MB [2024-09-01 08:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1140/1251] eta 0:00:27 lr 0.000059 wd 0.0500 time 0.2463 (0.2457) data time 0.0007 (0.0037) model time 0.2457 (0.2420) loss 2.8232 (2.7608) grad_norm 4.4932 (inf) loss_scale 256.0000 (570.7835) mem 7381MB [2024-09-01 08:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1150/1251] eta 0:00:24 lr 0.000059 wd 0.0500 time 0.2392 (0.2456) data time 0.0007 (0.0037) model time 0.2384 (0.2420) loss 3.4385 (2.7620) grad_norm 4.7810 (inf) loss_scale 256.0000 (568.0487) mem 7381MB [2024-09-01 08:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1160/1251] eta 0:00:22 lr 0.000059 wd 0.0500 time 0.2391 (0.2456) data time 0.0007 (0.0037) model time 0.2383 (0.2420) loss 3.2788 (2.7630) grad_norm 4.3375 (inf) loss_scale 256.0000 (565.3609) mem 7381MB [2024-09-01 08:27:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1170/1251] eta 0:00:19 lr 0.000059 wd 0.0500 time 0.2364 (0.2456) data time 0.0009 (0.0037) model time 0.2355 (0.2420) loss 2.7919 (2.7622) grad_norm 10.1664 (inf) loss_scale 256.0000 (562.7190) mem 7381MB [2024-09-01 08:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1180/1251] eta 0:00:17 lr 0.000059 wd 0.0500 time 0.2327 (0.2456) data time 0.0009 (0.0036) model time 0.2319 (0.2420) loss 2.0156 (2.7603) grad_norm 6.1385 (inf) loss_scale 256.0000 (560.1219) mem 7381MB [2024-09-01 08:27:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1190/1251] eta 0:00:14 lr 0.000059 wd 0.0500 time 0.2409 (0.2455) data time 0.0009 (0.0036) model time 0.2400 (0.2420) loss 2.7313 (2.7626) grad_norm 4.4277 (inf) loss_scale 256.0000 (557.5684) mem 7381MB [2024-09-01 08:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1200/1251] eta 0:00:12 
lr 0.000059 wd 0.0500 time 0.2455 (0.2455) data time 0.0008 (0.0036) model time 0.2447 (0.2420) loss 3.1265 (2.7637) grad_norm 3.6860 (inf) loss_scale 256.0000 (555.0575) mem 7381MB [2024-09-01 08:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1210/1251] eta 0:00:10 lr 0.000059 wd 0.0500 time 0.2431 (0.2455) data time 0.0010 (0.0036) model time 0.2420 (0.2420) loss 2.5036 (2.7619) grad_norm 3.8512 (inf) loss_scale 256.0000 (552.5879) mem 7381MB [2024-09-01 08:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1220/1251] eta 0:00:07 lr 0.000059 wd 0.0500 time 0.2390 (0.2454) data time 0.0008 (0.0035) model time 0.2382 (0.2419) loss 2.1078 (2.7629) grad_norm 4.6116 (inf) loss_scale 256.0000 (550.1589) mem 7381MB [2024-09-01 08:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1230/1251] eta 0:00:05 lr 0.000059 wd 0.0500 time 0.2403 (0.2454) data time 0.0009 (0.0035) model time 0.2394 (0.2419) loss 3.5502 (2.7633) grad_norm 3.8106 (inf) loss_scale 256.0000 (547.7693) mem 7381MB [2024-09-01 08:27:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1240/1251] eta 0:00:02 lr 0.000059 wd 0.0500 time 0.2232 (0.2453) data time 0.0005 (0.0035) model time 0.2227 (0.2418) loss 3.0191 (2.7624) grad_norm 3.2261 (inf) loss_scale 256.0000 (545.4182) mem 7381MB [2024-09-01 08:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [259/300][1250/1251] eta 0:00:00 lr 0.000059 wd 0.0500 time 0.2228 (0.2451) data time 0.0007 (0.0035) model time 0.2221 (0.2417) loss 1.9055 (2.7603) grad_norm 4.3818 (inf) loss_scale 256.0000 (543.1047) mem 7381MB [2024-09-01 08:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 259 training takes 0:05:06 [2024-09-01 08:27:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 08:27:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 08:27:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.441 (0.441) Loss 0.3984 (0.3984) Acc@1 93.262 (93.262) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 08:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.110) Loss 0.5913 (0.6241) Acc@1 88.965 (87.314) Acc@5 97.656 (97.621) Mem 7381MB [2024-09-01 08:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.094) Loss 0.9375 (0.6550) Acc@1 77.539 (86.151) Acc@5 95.312 (97.568) Mem 7381MB [2024-09-01 08:27:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.089) Loss 1.1602 (0.7508) Acc@1 73.438 (83.928) Acc@5 92.578 (96.598) Mem 7381MB [2024-09-01 08:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0479 (0.8007) Acc@1 76.465 (82.736) Acc@5 93.945 (96.096) Mem 7381MB [2024-09-01 08:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.352 Acc@5 96.064 [2024-09-01 08:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-09-01 08:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.858 (0.858) Loss 0.3806 (0.3806) Acc@1 93.359 (93.359) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 08:27:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.148) Loss 0.5664 (0.6017) Acc@1 90.527 (87.802) Acc@5 98.047 (97.843) Mem 7381MB [2024-09-01 08:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.114) Loss 0.8936 (0.6313) Acc@1 78.027 (86.616) Acc@5 95.996 (97.749) Mem 7381MB [2024-09-01 08:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.102) Loss 1.1143 (0.7202) Acc@1 73.926 (84.444) Acc@5 
93.066 (96.872) Mem 7381MB [2024-09-01 08:27:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 0.9927 (0.7667) Acc@1 76.855 (83.327) Acc@5 94.336 (96.365) Mem 7381MB [2024-09-01 08:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.962 Acc@5 96.316 [2024-09-01 08:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 08:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.96% [2024-09-01 08:27:44 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 08:27:45 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 08:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][0/1251] eta 0:14:40 lr 0.000059 wd 0.0500 time 0.7036 (0.7036) data time 0.4769 (0.4769) model time 0.0000 (0.0000) loss 3.0883 (3.0883) grad_norm 4.5798 (4.5798) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:27:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][10/1251] eta 0:05:50 lr 0.000059 wd 0.0500 time 0.2367 (0.2822) data time 0.0010 (0.0442) model time 0.0000 (0.0000) loss 3.2340 (2.7208) grad_norm 5.8268 (4.6632) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:27:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][20/1251] eta 0:05:21 lr 0.000059 wd 0.0500 time 0.2404 (0.2614) data time 0.0007 (0.0236) model time 0.0000 (0.0000) loss 1.9605 (2.7208) grad_norm 3.3285 (4.6249) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:27:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][30/1251] eta 0:05:11 lr 0.000059 wd 0.0500 time 0.2434 (0.2548) data time 0.0007 (0.0163) model time 0.0000 (0.0000) loss 2.4059 (2.7652) grad_norm 4.7296 (4.7165) loss_scale 
256.0000 (256.0000) mem 7381MB [2024-09-01 08:27:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][40/1251] eta 0:05:04 lr 0.000059 wd 0.0500 time 0.2375 (0.2515) data time 0.0010 (0.0125) model time 0.0000 (0.0000) loss 2.8164 (2.7847) grad_norm 4.7036 (4.9478) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:27:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][50/1251] eta 0:04:59 lr 0.000059 wd 0.0500 time 0.2370 (0.2493) data time 0.0010 (0.0103) model time 0.0000 (0.0000) loss 3.1489 (2.7884) grad_norm 5.7487 (5.1137) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][60/1251] eta 0:04:54 lr 0.000059 wd 0.0500 time 0.2363 (0.2476) data time 0.0009 (0.0087) model time 0.2354 (0.2380) loss 2.5544 (2.7747) grad_norm 8.3393 (5.2551) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][70/1251] eta 0:04:51 lr 0.000059 wd 0.0500 time 0.2488 (0.2471) data time 0.0007 (0.0077) model time 0.2481 (0.2403) loss 2.8505 (2.8005) grad_norm 5.3724 (5.1680) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][80/1251] eta 0:04:48 lr 0.000059 wd 0.0500 time 0.2424 (0.2463) data time 0.0011 (0.0068) model time 0.2413 (0.2400) loss 3.1953 (2.8258) grad_norm 4.1297 (5.0965) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][90/1251] eta 0:04:44 lr 0.000059 wd 0.0500 time 0.2409 (0.2454) data time 0.0007 (0.0062) model time 0.2402 (0.2393) loss 3.7540 (2.8142) grad_norm 8.4213 (5.1121) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][100/1251] eta 0:04:41 lr 0.000059 wd 0.0500 time 0.2403 (0.2449) data time 0.0009 (0.0057) 
model time 0.2394 (0.2394) loss 3.0390 (2.8309) grad_norm 3.4586 (5.0937) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][110/1251] eta 0:04:39 lr 0.000059 wd 0.0500 time 0.2468 (0.2447) data time 0.0007 (0.0053) model time 0.2461 (0.2397) loss 2.7460 (2.8044) grad_norm 4.0994 (5.1056) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][120/1251] eta 0:04:38 lr 0.000059 wd 0.0500 time 0.2382 (0.2463) data time 0.0008 (0.0049) model time 0.2374 (0.2430) loss 2.6442 (2.7918) grad_norm 3.8832 (5.1802) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][130/1251] eta 0:04:35 lr 0.000059 wd 0.0500 time 0.2443 (0.2458) data time 0.0009 (0.0046) model time 0.2433 (0.2425) loss 3.1664 (2.7905) grad_norm 3.2615 (5.2187) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][140/1251] eta 0:04:32 lr 0.000059 wd 0.0500 time 0.2420 (0.2454) data time 0.0008 (0.0044) model time 0.2412 (0.2421) loss 3.1975 (2.7619) grad_norm 5.5928 (5.1895) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][150/1251] eta 0:04:29 lr 0.000059 wd 0.0500 time 0.2380 (0.2450) data time 0.0010 (0.0041) model time 0.2370 (0.2418) loss 3.0244 (2.7545) grad_norm 8.7658 (5.1740) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][160/1251] eta 0:04:28 lr 0.000059 wd 0.0500 time 0.2362 (0.2461) data time 0.0010 (0.0040) model time 0.2352 (0.2435) loss 3.0864 (2.7492) grad_norm 9.4224 (5.1684) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[260/300][170/1251] eta 0:04:25 lr 0.000059 wd 0.0500 time 0.2478 (0.2459) data time 0.0011 (0.0038) model time 0.2467 (0.2435) loss 2.8878 (2.7586) grad_norm 4.4914 (5.1881) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][180/1251] eta 0:04:23 lr 0.000059 wd 0.0500 time 0.2333 (0.2456) data time 0.0011 (0.0036) model time 0.2323 (0.2431) loss 2.0574 (2.7435) grad_norm 3.6752 (5.1878) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][190/1251] eta 0:04:22 lr 0.000059 wd 0.0500 time 0.2313 (0.2477) data time 0.0008 (0.0035) model time 0.2305 (0.2461) loss 3.0735 (2.7377) grad_norm 4.1413 (5.2644) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][200/1251] eta 0:04:20 lr 0.000059 wd 0.0500 time 0.2448 (0.2475) data time 0.0007 (0.0034) model time 0.2441 (0.2458) loss 3.4629 (2.7423) grad_norm 2.9499 (5.3063) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][210/1251] eta 0:04:17 lr 0.000059 wd 0.0500 time 0.2512 (0.2472) data time 0.0010 (0.0033) model time 0.2502 (0.2455) loss 2.9287 (2.7254) grad_norm 4.8088 (5.2913) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][220/1251] eta 0:04:14 lr 0.000059 wd 0.0500 time 0.2476 (0.2470) data time 0.0007 (0.0032) model time 0.2469 (0.2453) loss 3.1591 (2.7367) grad_norm 5.0228 (5.3038) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][230/1251] eta 0:04:11 lr 0.000059 wd 0.0500 time 0.2470 (0.2467) data time 0.0009 (0.0031) model time 0.2461 (0.2450) loss 3.0846 (2.7411) grad_norm 6.7594 (5.3320) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 08:28:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][240/1251] eta 0:04:09 lr 0.000059 wd 0.0500 time 0.2419 (0.2466) data time 0.0009 (0.0030) model time 0.2410 (0.2448) loss 2.8675 (2.7301) grad_norm 7.3236 (5.3148) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][250/1251] eta 0:04:06 lr 0.000059 wd 0.0500 time 0.2432 (0.2464) data time 0.0009 (0.0029) model time 0.2423 (0.2446) loss 2.9798 (2.7253) grad_norm 5.1040 (5.3319) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][260/1251] eta 0:04:03 lr 0.000059 wd 0.0500 time 0.2340 (0.2462) data time 0.0011 (0.0028) model time 0.2329 (0.2444) loss 2.7882 (2.7238) grad_norm 4.4073 (5.3064) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][270/1251] eta 0:04:01 lr 0.000059 wd 0.0500 time 0.2437 (0.2462) data time 0.0007 (0.0027) model time 0.2430 (0.2445) loss 2.7994 (2.7390) grad_norm 6.7790 (5.5363) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][280/1251] eta 0:03:58 lr 0.000058 wd 0.0500 time 0.2405 (0.2461) data time 0.0009 (0.0027) model time 0.2396 (0.2443) loss 2.3222 (2.7281) grad_norm 7.8790 (5.5955) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][290/1251] eta 0:03:56 lr 0.000058 wd 0.0500 time 0.2373 (0.2459) data time 0.0009 (0.0026) model time 0.2363 (0.2442) loss 2.8073 (2.7299) grad_norm 3.4700 (5.5679) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:28:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][300/1251] eta 0:03:53 lr 0.000058 wd 0.0500 time 0.2472 (0.2458) data time 0.0009 (0.0026) model time 0.2463 
(0.2441) loss 2.9096 (2.7361) grad_norm 5.1977 (5.5338) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][310/1251] eta 0:03:51 lr 0.000058 wd 0.0500 time 0.2446 (0.2457) data time 0.0009 (0.0025) model time 0.2437 (0.2440) loss 3.4843 (2.7401) grad_norm 7.0636 (5.5206) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][320/1251] eta 0:03:48 lr 0.000058 wd 0.0500 time 0.2387 (0.2456) data time 0.0007 (0.0025) model time 0.2380 (0.2439) loss 2.4746 (2.7326) grad_norm 5.0554 (5.4799) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][330/1251] eta 0:03:46 lr 0.000058 wd 0.0500 time 0.2400 (0.2456) data time 0.0010 (0.0024) model time 0.2389 (0.2439) loss 2.6352 (2.7316) grad_norm 4.6537 (5.4518) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][340/1251] eta 0:03:43 lr 0.000058 wd 0.0500 time 0.2373 (0.2454) data time 0.0010 (0.0024) model time 0.2363 (0.2437) loss 2.0727 (2.7261) grad_norm 4.4720 (5.4219) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][350/1251] eta 0:03:41 lr 0.000058 wd 0.0500 time 0.2399 (0.2453) data time 0.0007 (0.0023) model time 0.2392 (0.2436) loss 1.9320 (2.7251) grad_norm 4.5468 (5.4018) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][360/1251] eta 0:03:38 lr 0.000058 wd 0.0500 time 0.2427 (0.2452) data time 0.0009 (0.0023) model time 0.2418 (0.2435) loss 2.8365 (2.7170) grad_norm 7.5087 (5.4177) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][370/1251] eta 0:03:35 
lr 0.000058 wd 0.0500 time 0.2396 (0.2451) data time 0.0010 (0.0023) model time 0.2386 (0.2434) loss 2.9610 (2.7170) grad_norm 6.3117 (5.4130) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][380/1251] eta 0:03:33 lr 0.000058 wd 0.0500 time 0.2388 (0.2450) data time 0.0011 (0.0022) model time 0.2376 (0.2434) loss 2.8205 (2.7203) grad_norm 3.6367 (5.3758) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][390/1251] eta 0:03:30 lr 0.000058 wd 0.0500 time 0.2361 (0.2449) data time 0.0007 (0.0022) model time 0.2355 (0.2432) loss 2.8148 (2.7187) grad_norm 7.6945 (5.3808) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][400/1251] eta 0:03:28 lr 0.000058 wd 0.0500 time 0.2384 (0.2447) data time 0.0008 (0.0022) model time 0.2376 (0.2431) loss 2.7174 (2.7109) grad_norm 5.9536 (5.4279) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][410/1251] eta 0:03:25 lr 0.000058 wd 0.0500 time 0.2336 (0.2446) data time 0.0009 (0.0021) model time 0.2327 (0.2429) loss 1.9978 (2.7140) grad_norm 5.1185 (5.4167) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][420/1251] eta 0:03:23 lr 0.000058 wd 0.0500 time 0.2402 (0.2445) data time 0.0011 (0.0021) model time 0.2392 (0.2428) loss 3.0880 (2.7150) grad_norm 4.6202 (5.4004) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][430/1251] eta 0:03:20 lr 0.000058 wd 0.0500 time 0.2354 (0.2445) data time 0.0010 (0.0021) model time 0.2344 (0.2428) loss 3.0984 (2.7224) grad_norm 3.1806 (5.3791) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:32 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][440/1251] eta 0:03:18 lr 0.000058 wd 0.0500 time 0.2422 (0.2444) data time 0.0008 (0.0021) model time 0.2414 (0.2427) loss 2.9813 (2.7149) grad_norm 4.7231 (5.3516) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][450/1251] eta 0:03:15 lr 0.000058 wd 0.0500 time 0.2438 (0.2444) data time 0.0009 (0.0020) model time 0.2429 (0.2428) loss 2.8347 (2.7097) grad_norm 5.0844 (5.5292) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][460/1251] eta 0:03:13 lr 0.000058 wd 0.0500 time 0.2366 (0.2443) data time 0.0011 (0.0020) model time 0.2355 (0.2427) loss 2.2917 (2.7093) grad_norm 4.1241 (5.5252) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][470/1251] eta 0:03:10 lr 0.000058 wd 0.0500 time 0.2462 (0.2443) data time 0.0007 (0.0020) model time 0.2455 (0.2427) loss 3.3936 (2.7116) grad_norm 4.0255 (5.5177) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][480/1251] eta 0:03:08 lr 0.000058 wd 0.0500 time 0.2305 (0.2442) data time 0.0011 (0.0020) model time 0.2294 (0.2426) loss 2.8043 (2.7113) grad_norm 3.7987 (5.5133) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][490/1251] eta 0:03:05 lr 0.000058 wd 0.0500 time 0.2327 (0.2442) data time 0.0010 (0.0020) model time 0.2317 (0.2426) loss 2.7388 (2.7100) grad_norm 11.7750 (5.5196) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][500/1251] eta 0:03:03 lr 0.000058 wd 0.0500 time 0.2465 (0.2441) data time 0.0010 (0.0019) model time 0.2454 (0.2425) loss 3.2262 (2.7150) 
grad_norm 3.9373 (5.5139) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][510/1251] eta 0:03:00 lr 0.000058 wd 0.0500 time 0.2352 (0.2440) data time 0.0007 (0.0019) model time 0.2345 (0.2424) loss 3.2932 (2.7162) grad_norm 4.0127 (5.5225) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][520/1251] eta 0:02:58 lr 0.000058 wd 0.0500 time 0.2462 (0.2440) data time 0.0007 (0.0019) model time 0.2455 (0.2424) loss 2.0106 (2.7179) grad_norm 6.0882 (5.5301) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][530/1251] eta 0:02:55 lr 0.000058 wd 0.0500 time 0.2434 (0.2439) data time 0.0009 (0.0019) model time 0.2426 (0.2423) loss 2.9289 (2.7215) grad_norm 3.0776 (5.5214) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][540/1251] eta 0:02:53 lr 0.000058 wd 0.0500 time 0.2444 (0.2439) data time 0.0007 (0.0019) model time 0.2437 (0.2423) loss 3.0995 (2.7207) grad_norm 4.5020 (5.5406) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][550/1251] eta 0:02:50 lr 0.000058 wd 0.0500 time 0.2428 (0.2438) data time 0.0009 (0.0019) model time 0.2419 (0.2423) loss 2.7065 (2.7154) grad_norm 5.0099 (5.5240) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][560/1251] eta 0:02:48 lr 0.000058 wd 0.0500 time 0.2379 (0.2438) data time 0.0007 (0.0018) model time 0.2371 (0.2423) loss 2.5277 (2.7146) grad_norm 6.4903 (5.7391) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][570/1251] eta 0:02:46 lr 0.000058 wd 0.0500 time 
0.2405 (0.2438) data time 0.0008 (0.0018) model time 0.2397 (0.2423) loss 2.9626 (2.7142) grad_norm 3.3713 (5.7128) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][580/1251] eta 0:02:43 lr 0.000058 wd 0.0500 time 0.2383 (0.2437) data time 0.0009 (0.0018) model time 0.2373 (0.2422) loss 3.4357 (2.7113) grad_norm 3.4167 (5.6952) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][590/1251] eta 0:02:41 lr 0.000058 wd 0.0500 time 0.2572 (0.2437) data time 0.0010 (0.0018) model time 0.2562 (0.2422) loss 2.6756 (2.7110) grad_norm 5.2008 (5.6919) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][600/1251] eta 0:02:38 lr 0.000058 wd 0.0500 time 0.2416 (0.2436) data time 0.0008 (0.0018) model time 0.2408 (0.2421) loss 2.8782 (2.7104) grad_norm 3.0435 (5.6854) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][610/1251] eta 0:02:36 lr 0.000058 wd 0.0500 time 0.2490 (0.2436) data time 0.0006 (0.0018) model time 0.2484 (0.2421) loss 2.5294 (2.7094) grad_norm 5.7371 (5.6610) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][620/1251] eta 0:02:33 lr 0.000058 wd 0.0500 time 0.2457 (0.2437) data time 0.0007 (0.0018) model time 0.2450 (0.2422) loss 2.6893 (2.7096) grad_norm 2.7966 (5.6459) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][630/1251] eta 0:02:31 lr 0.000058 wd 0.0500 time 0.2364 (0.2436) data time 0.0009 (0.0017) model time 0.2355 (0.2421) loss 3.0643 (2.7145) grad_norm 7.7122 (5.6458) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:21 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [260/300][640/1251] eta 0:02:28 lr 0.000058 wd 0.0500 time 0.2551 (0.2438) data time 0.0007 (0.0017) model time 0.2544 (0.2424) loss 3.2488 (2.7135) grad_norm 4.3536 (5.6297) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][650/1251] eta 0:02:26 lr 0.000058 wd 0.0500 time 0.2543 (0.2438) data time 0.0009 (0.0017) model time 0.2534 (0.2423) loss 3.4608 (2.7130) grad_norm 27.4902 (5.6497) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][660/1251] eta 0:02:24 lr 0.000058 wd 0.0500 time 0.2415 (0.2438) data time 0.0009 (0.0017) model time 0.2406 (0.2423) loss 2.8789 (2.7141) grad_norm 2.7469 (5.6381) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][670/1251] eta 0:02:21 lr 0.000058 wd 0.0500 time 0.2373 (0.2437) data time 0.0007 (0.0017) model time 0.2365 (0.2423) loss 3.1534 (2.7142) grad_norm 4.3361 (5.6207) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][680/1251] eta 0:02:19 lr 0.000058 wd 0.0500 time 0.2360 (0.2437) data time 0.0008 (0.0017) model time 0.2352 (0.2423) loss 3.3110 (2.7146) grad_norm 5.6641 (5.6135) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][690/1251] eta 0:02:16 lr 0.000058 wd 0.0500 time 0.2480 (0.2437) data time 0.0010 (0.0017) model time 0.2471 (0.2423) loss 2.8807 (2.7159) grad_norm 3.8682 (5.6087) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][700/1251] eta 0:02:14 lr 0.000058 wd 0.0500 time 0.2442 (0.2437) data time 0.0011 (0.0017) model time 0.2431 (0.2422) loss 2.9884 (2.7147) grad_norm 7.5367 
(5.6174) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][710/1251] eta 0:02:11 lr 0.000058 wd 0.0500 time 0.2372 (0.2436) data time 0.0007 (0.0017) model time 0.2364 (0.2422) loss 2.4142 (2.7166) grad_norm 6.9687 (5.6348) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][720/1251] eta 0:02:09 lr 0.000058 wd 0.0500 time 0.2390 (0.2436) data time 0.0010 (0.0017) model time 0.2380 (0.2422) loss 2.0660 (2.7162) grad_norm 7.2254 (5.6344) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][730/1251] eta 0:02:06 lr 0.000058 wd 0.0500 time 0.2476 (0.2436) data time 0.0008 (0.0016) model time 0.2469 (0.2421) loss 2.6345 (2.7195) grad_norm 5.6944 (5.6313) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][740/1251] eta 0:02:04 lr 0.000058 wd 0.0500 time 0.2367 (0.2435) data time 0.0010 (0.0016) model time 0.2357 (0.2421) loss 2.8921 (2.7228) grad_norm 5.2076 (5.6192) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][750/1251] eta 0:02:01 lr 0.000058 wd 0.0500 time 0.2350 (0.2434) data time 0.0009 (0.0016) model time 0.2341 (0.2420) loss 2.3513 (2.7246) grad_norm 5.6786 (5.6241) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][760/1251] eta 0:01:59 lr 0.000058 wd 0.0500 time 0.2379 (0.2434) data time 0.0007 (0.0016) model time 0.2371 (0.2420) loss 1.9117 (2.7250) grad_norm 3.8487 (5.6176) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][770/1251] eta 0:01:57 lr 0.000058 wd 0.0500 time 0.2515 (0.2434) data 
time 0.0010 (0.0016) model time 0.2504 (0.2420) loss 2.9013 (2.7257) grad_norm 3.6244 (5.6169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][780/1251] eta 0:01:54 lr 0.000058 wd 0.0500 time 0.2379 (0.2434) data time 0.0010 (0.0016) model time 0.2369 (0.2420) loss 2.3877 (2.7221) grad_norm 3.3428 (5.6044) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][790/1251] eta 0:01:52 lr 0.000058 wd 0.0500 time 0.2484 (0.2434) data time 0.0010 (0.0016) model time 0.2473 (0.2420) loss 3.0765 (2.7217) grad_norm 5.1112 (5.5846) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][800/1251] eta 0:01:49 lr 0.000057 wd 0.0500 time 0.2369 (0.2433) data time 0.0008 (0.0016) model time 0.2362 (0.2419) loss 2.2751 (2.7224) grad_norm 4.0885 (5.6405) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][810/1251] eta 0:01:47 lr 0.000057 wd 0.0500 time 0.2403 (0.2433) data time 0.0010 (0.0016) model time 0.2393 (0.2419) loss 2.3945 (2.7188) grad_norm 2.9709 (5.6266) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][820/1251] eta 0:01:44 lr 0.000057 wd 0.0500 time 0.2446 (0.2433) data time 0.0010 (0.0016) model time 0.2436 (0.2419) loss 3.1654 (2.7209) grad_norm 5.3124 (5.6218) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][830/1251] eta 0:01:42 lr 0.000057 wd 0.0500 time 0.2474 (0.2433) data time 0.0007 (0.0016) model time 0.2467 (0.2419) loss 2.6046 (2.7192) grad_norm 3.3593 (5.6177) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [260/300][840/1251] eta 0:01:39 lr 0.000057 wd 0.0500 time 0.2422 (0.2433) data time 0.0009 (0.0016) model time 0.2413 (0.2419) loss 2.9821 (2.7197) grad_norm 7.6855 (5.6092) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][850/1251] eta 0:01:37 lr 0.000057 wd 0.0500 time 0.2418 (0.2433) data time 0.0008 (0.0016) model time 0.2410 (0.2419) loss 3.1875 (2.7203) grad_norm 4.4694 (5.5924) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][860/1251] eta 0:01:35 lr 0.000057 wd 0.0500 time 0.2429 (0.2433) data time 0.0010 (0.0016) model time 0.2419 (0.2419) loss 2.7249 (2.7175) grad_norm 5.7505 (5.5836) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][870/1251] eta 0:01:32 lr 0.000057 wd 0.0500 time 0.2349 (0.2433) data time 0.0008 (0.0015) model time 0.2341 (0.2419) loss 1.6003 (2.7175) grad_norm 4.7562 (5.5684) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][880/1251] eta 0:01:30 lr 0.000057 wd 0.0500 time 0.2439 (0.2433) data time 0.0008 (0.0015) model time 0.2431 (0.2419) loss 2.0753 (2.7167) grad_norm 4.5612 (5.5550) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][890/1251] eta 0:01:27 lr 0.000057 wd 0.0500 time 0.2419 (0.2432) data time 0.0011 (0.0015) model time 0.2408 (0.2419) loss 2.1282 (2.7178) grad_norm 8.8154 (5.5598) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][900/1251] eta 0:01:25 lr 0.000057 wd 0.0500 time 0.2454 (0.2432) data time 0.0008 (0.0015) model time 0.2446 (0.2418) loss 2.7749 (2.7181) grad_norm 6.0060 (5.7303) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 08:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][910/1251] eta 0:01:22 lr 0.000057 wd 0.0500 time 0.2380 (0.2432) data time 0.0011 (0.0015) model time 0.2370 (0.2418) loss 2.8787 (2.7202) grad_norm 6.9831 (5.7196) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][920/1251] eta 0:01:20 lr 0.000057 wd 0.0500 time 0.2447 (0.2432) data time 0.0007 (0.0015) model time 0.2440 (0.2418) loss 2.7140 (2.7223) grad_norm 3.6981 (5.7007) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][930/1251] eta 0:01:18 lr 0.000057 wd 0.0500 time 0.2419 (0.2432) data time 0.0009 (0.0015) model time 0.2410 (0.2418) loss 2.7797 (2.7225) grad_norm 6.4149 (5.6911) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][940/1251] eta 0:01:15 lr 0.000057 wd 0.0500 time 0.2396 (0.2431) data time 0.0008 (0.0015) model time 0.2389 (0.2418) loss 2.6964 (2.7188) grad_norm 6.2457 (5.6825) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][950/1251] eta 0:01:13 lr 0.000057 wd 0.0500 time 0.2404 (0.2431) data time 0.0012 (0.0015) model time 0.2392 (0.2417) loss 2.8093 (2.7185) grad_norm 3.9837 (5.6767) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][960/1251] eta 0:01:10 lr 0.000057 wd 0.0500 time 0.2438 (0.2431) data time 0.0012 (0.0015) model time 0.2426 (0.2417) loss 2.2152 (2.7206) grad_norm 3.7522 (5.6746) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][970/1251] eta 0:01:08 lr 0.000057 wd 0.0500 time 0.2396 (0.2431) data time 0.0010 (0.0015) model 
time 0.2386 (0.2417) loss 2.9206 (2.7212) grad_norm 6.4729 (5.6696) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][980/1251] eta 0:01:05 lr 0.000057 wd 0.0500 time 0.2336 (0.2431) data time 0.0011 (0.0015) model time 0.2325 (0.2417) loss 2.4819 (2.7241) grad_norm 5.4970 (5.6633) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][990/1251] eta 0:01:03 lr 0.000057 wd 0.0500 time 0.2425 (0.2431) data time 0.0012 (0.0015) model time 0.2412 (0.2417) loss 2.9281 (2.7260) grad_norm 5.9722 (5.6633) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1000/1251] eta 0:01:01 lr 0.000057 wd 0.0500 time 0.2374 (0.2431) data time 0.0011 (0.0015) model time 0.2363 (0.2417) loss 2.8161 (2.7265) grad_norm 6.9859 (5.6742) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1010/1251] eta 0:00:58 lr 0.000057 wd 0.0500 time 0.2465 (0.2431) data time 0.0011 (0.0015) model time 0.2454 (0.2417) loss 3.1124 (2.7254) grad_norm 8.3104 (5.6699) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1020/1251] eta 0:00:56 lr 0.000057 wd 0.0500 time 0.2410 (0.2431) data time 0.0009 (0.0015) model time 0.2401 (0.2417) loss 2.5709 (2.7258) grad_norm 5.8652 (5.6603) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1030/1251] eta 0:00:53 lr 0.000057 wd 0.0500 time 0.2446 (0.2430) data time 0.0007 (0.0015) model time 0.2439 (0.2417) loss 2.4486 (2.7252) grad_norm 4.0456 (5.6550) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[260/300][1040/1251] eta 0:00:51 lr 0.000057 wd 0.0500 time 0.2378 (0.2430) data time 0.0008 (0.0015) model time 0.2370 (0.2417) loss 3.1627 (2.7275) grad_norm 3.8247 (5.6571) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1050/1251] eta 0:00:48 lr 0.000057 wd 0.0500 time 0.2458 (0.2430) data time 0.0009 (0.0015) model time 0.2449 (0.2417) loss 2.6872 (2.7264) grad_norm 4.0404 (5.6440) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1060/1251] eta 0:00:46 lr 0.000057 wd 0.0500 time 0.2471 (0.2430) data time 0.0010 (0.0014) model time 0.2461 (0.2417) loss 2.9389 (2.7278) grad_norm 4.0460 (5.6419) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1070/1251] eta 0:00:43 lr 0.000057 wd 0.0500 time 0.2429 (0.2430) data time 0.0011 (0.0015) model time 0.2418 (0.2417) loss 2.9381 (2.7302) grad_norm 3.9056 (5.6344) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1080/1251] eta 0:00:41 lr 0.000057 wd 0.0500 time 0.2364 (0.2431) data time 0.0009 (0.0015) model time 0.2355 (0.2417) loss 2.9178 (2.7307) grad_norm 3.3358 (5.6277) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1090/1251] eta 0:00:39 lr 0.000057 wd 0.0500 time 0.2373 (0.2432) data time 0.0007 (0.0014) model time 0.2366 (0.2419) loss 3.0411 (2.7297) grad_norm 6.5349 (5.6188) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1100/1251] eta 0:00:36 lr 0.000057 wd 0.0500 time 0.2380 (0.2432) data time 0.0010 (0.0014) model time 0.2370 (0.2419) loss 3.1607 (2.7290) grad_norm 4.2366 (5.6101) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 08:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1110/1251] eta 0:00:34 lr 0.000057 wd 0.0500 time 0.2424 (0.2432) data time 0.0011 (0.0014) model time 0.2413 (0.2419) loss 2.3945 (2.7296) grad_norm 4.0832 (5.6109) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1120/1251] eta 0:00:31 lr 0.000057 wd 0.0500 time 0.2355 (0.2436) data time 0.0010 (0.0014) model time 0.2345 (0.2423) loss 2.9935 (2.7297) grad_norm 5.1604 (5.6042) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1130/1251] eta 0:00:29 lr 0.000057 wd 0.0500 time 0.2504 (0.2436) data time 0.0009 (0.0014) model time 0.2495 (0.2423) loss 2.7133 (2.7315) grad_norm 5.7700 (5.6005) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1140/1251] eta 0:00:27 lr 0.000057 wd 0.0500 time 0.2369 (0.2436) data time 0.0009 (0.0014) model time 0.2360 (0.2423) loss 1.7586 (2.7319) grad_norm 3.9001 (5.5992) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1150/1251] eta 0:00:24 lr 0.000057 wd 0.0500 time 0.2408 (0.2436) data time 0.0011 (0.0014) model time 0.2397 (0.2423) loss 2.3518 (2.7297) grad_norm 3.3824 (5.5898) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1160/1251] eta 0:00:22 lr 0.000057 wd 0.0500 time 0.2363 (0.2438) data time 0.0010 (0.0014) model time 0.2352 (0.2425) loss 2.6575 (2.7294) grad_norm 3.5252 (5.5809) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1170/1251] eta 0:00:19 lr 0.000057 wd 0.0500 time 0.2381 (0.2437) data time 0.0007 (0.0014) 
model time 0.2374 (0.2425) loss 2.9300 (2.7284) grad_norm 6.9000 (5.5748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1180/1251] eta 0:00:17 lr 0.000057 wd 0.0500 time 0.2378 (0.2437) data time 0.0007 (0.0014) model time 0.2371 (0.2425) loss 2.1205 (2.7282) grad_norm 3.7835 (5.5687) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1190/1251] eta 0:00:14 lr 0.000057 wd 0.0500 time 0.2502 (0.2437) data time 0.0009 (0.0014) model time 0.2493 (0.2424) loss 2.4691 (2.7279) grad_norm 4.1357 (5.5667) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1200/1251] eta 0:00:12 lr 0.000057 wd 0.0500 time 0.2411 (0.2437) data time 0.0011 (0.0014) model time 0.2400 (0.2424) loss 2.7064 (2.7268) grad_norm 3.7056 (5.5651) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1210/1251] eta 0:00:09 lr 0.000057 wd 0.0500 time 0.2436 (0.2437) data time 0.0011 (0.0014) model time 0.2425 (0.2424) loss 3.4475 (2.7272) grad_norm 4.2904 (5.5614) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1220/1251] eta 0:00:07 lr 0.000057 wd 0.0500 time 0.2461 (0.2437) data time 0.0012 (0.0014) model time 0.2450 (0.2424) loss 2.7470 (2.7285) grad_norm 8.4408 (5.5632) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1230/1251] eta 0:00:05 lr 0.000057 wd 0.0500 time 0.2469 (0.2436) data time 0.0010 (0.0014) model time 0.2460 (0.2424) loss 3.3289 (2.7316) grad_norm 4.1455 (5.5546) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[260/300][1240/1251] eta 0:00:02 lr 0.000057 wd 0.0500 time 0.2259 (0.2436) data time 0.0005 (0.0014) model time 0.2255 (0.2423) loss 3.0508 (2.7315) grad_norm 4.8951 (5.5476) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [260/300][1250/1251] eta 0:00:00 lr 0.000057 wd 0.0500 time 0.2248 (0.2434) data time 0.0005 (0.0014) model time 0.2243 (0.2421) loss 2.7947 (2.7307) grad_norm 4.4283 (5.5446) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 260 training takes 0:05:04 [2024-09-01 08:32:49 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:32:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 08:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.452 (0.452) Loss 0.4031 (0.4031) Acc@1 92.969 (92.969) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 08:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.111) Loss 0.5830 (0.6231) Acc@1 90.137 (87.296) Acc@5 97.852 (97.736) Mem 7381MB [2024-09-01 08:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.096) Loss 0.9067 (0.6504) Acc@1 78.125 (86.249) Acc@5 96.094 (97.642) Mem 7381MB [2024-09-01 08:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.1592 (0.7422) Acc@1 73.633 (84.019) Acc@5 92.773 (96.683) Mem 7381MB [2024-09-01 08:32:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0391 (0.7911) Acc@1 75.879 (82.812) Acc@5 93.652 (96.172) Mem 7381MB [2024-09-01 08:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.406 Acc@5 96.118 [2024-09-01 08:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 
261): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-09-01 08:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.814 (0.814) Loss 0.3816 (0.3816) Acc@1 93.359 (93.359) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 08:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.146) Loss 0.5664 (0.6021) Acc@1 90.527 (87.837) Acc@5 97.852 (97.834) Mem 7381MB [2024-09-01 08:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.112) Loss 0.8940 (0.6317) Acc@1 77.930 (86.621) Acc@5 95.996 (97.735) Mem 7381MB [2024-09-01 08:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.102) Loss 1.1143 (0.7207) Acc@1 74.023 (84.479) Acc@5 92.969 (96.856) Mem 7381MB [2024-09-01 08:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 0.9932 (0.7674) Acc@1 77.051 (83.363) Acc@5 94.141 (96.353) Mem 7381MB [2024-09-01 08:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.980 Acc@5 96.306 [2024-09-01 08:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 08:32:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.98% [2024-09-01 08:32:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 08:32:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 08:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][0/1251] eta 0:16:48 lr 0.000057 wd 0.0500 time 0.8065 (0.8065) data time 0.5817 (0.5817) model time 0.0000 (0.0000) loss 2.0584 (2.0584) grad_norm 4.7767 (4.7767) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][10/1251] eta 0:06:03 lr 0.000057 wd 0.0500 time 0.2623 (0.2933) data time 0.0008 (0.0538) model time 0.0000 (0.0000) loss 2.3129 (2.7480) grad_norm 4.5698 (4.6038) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][20/1251] eta 0:05:29 lr 0.000057 wd 0.0500 time 0.2421 (0.2674) data time 0.0008 (0.0288) model time 0.0000 (0.0000) loss 3.1821 (2.6747) grad_norm 4.6340 (4.3380) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][30/1251] eta 0:05:14 lr 0.000057 wd 0.0500 time 0.2346 (0.2579) data time 0.0009 (0.0199) model time 0.0000 (0.0000) loss 1.5691 (2.5894) grad_norm 4.3111 (4.6207) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][40/1251] eta 0:05:07 lr 0.000057 wd 0.0500 time 0.2421 (0.2538) data time 0.0011 (0.0153) model time 0.0000 (0.0000) loss 3.2617 (2.6243) grad_norm 9.6386 (4.8210) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][50/1251] eta 0:05:02 lr 0.000057 wd 0.0500 time 0.2505 (0.2515) data time 0.0010 (0.0125) model time 0.0000 (0.0000) loss 2.8056 (2.6482) grad_norm 5.1756 (4.8108) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][60/1251] eta 0:04:57 lr 0.000057 wd 0.0500 time 0.2380 (0.2499) data time 0.0010 (0.0106) model time 0.2370 (0.2409) loss 
2.8081 (2.6884) grad_norm 4.4921 (4.8390) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][70/1251] eta 0:04:53 lr 0.000057 wd 0.0500 time 0.2374 (0.2487) data time 0.0011 (0.0092) model time 0.2364 (0.2405) loss 2.8439 (2.6708) grad_norm 5.2196 (5.0071) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][80/1251] eta 0:04:50 lr 0.000056 wd 0.0500 time 0.2461 (0.2478) data time 0.0010 (0.0082) model time 0.2451 (0.2406) loss 2.8632 (2.6864) grad_norm 4.1428 (4.9911) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][90/1251] eta 0:04:46 lr 0.000056 wd 0.0500 time 0.2434 (0.2469) data time 0.0010 (0.0074) model time 0.2424 (0.2401) loss 2.6624 (2.6919) grad_norm 4.2705 (4.9649) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][100/1251] eta 0:04:43 lr 0.000056 wd 0.0500 time 0.2461 (0.2462) data time 0.0009 (0.0068) model time 0.2452 (0.2399) loss 3.3801 (2.6967) grad_norm 5.8083 (5.0214) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][110/1251] eta 0:04:40 lr 0.000056 wd 0.0500 time 0.2415 (0.2459) data time 0.0011 (0.0063) model time 0.2404 (0.2401) loss 2.8437 (2.7039) grad_norm 3.5368 (4.9452) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][120/1251] eta 0:04:37 lr 0.000056 wd 0.0500 time 0.2411 (0.2454) data time 0.0009 (0.0058) model time 0.2401 (0.2401) loss 2.8494 (2.7073) grad_norm 4.5090 (4.9428) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][130/1251] eta 0:04:34 lr 0.000056 wd 
0.0500 time 0.2496 (0.2451) data time 0.0008 (0.0055) model time 0.2488 (0.2400) loss 2.2264 (2.7029) grad_norm 17.3680 (5.0850) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][140/1251] eta 0:04:32 lr 0.000056 wd 0.0500 time 0.2433 (0.2449) data time 0.0007 (0.0051) model time 0.2425 (0.2401) loss 2.9036 (2.6973) grad_norm 5.7498 (5.2595) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][150/1251] eta 0:04:29 lr 0.000056 wd 0.0500 time 0.2461 (0.2448) data time 0.0007 (0.0049) model time 0.2454 (0.2404) loss 2.9256 (2.7152) grad_norm 4.5127 (5.2340) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][160/1251] eta 0:04:26 lr 0.000056 wd 0.0500 time 0.2423 (0.2447) data time 0.0008 (0.0046) model time 0.2415 (0.2406) loss 1.5807 (2.7205) grad_norm 4.4881 (5.2445) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][170/1251] eta 0:04:24 lr 0.000056 wd 0.0500 time 0.2448 (0.2448) data time 0.0009 (0.0044) model time 0.2440 (0.2409) loss 3.3337 (2.7189) grad_norm 3.7569 (5.1963) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][180/1251] eta 0:04:22 lr 0.000056 wd 0.0500 time 0.2412 (0.2447) data time 0.0010 (0.0042) model time 0.2402 (0.2411) loss 3.3827 (2.7299) grad_norm 5.1543 (5.2046) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][190/1251] eta 0:04:19 lr 0.000056 wd 0.0500 time 0.2392 (0.2445) data time 0.0011 (0.0041) model time 0.2381 (0.2409) loss 2.8693 (2.7341) grad_norm 3.9449 (5.1848) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:48 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][200/1251] eta 0:04:16 lr 0.000056 wd 0.0500 time 0.2421 (0.2442) data time 0.0009 (0.0039) model time 0.2412 (0.2408) loss 3.0662 (2.7389) grad_norm 8.5127 (5.2180) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][210/1251] eta 0:04:14 lr 0.000056 wd 0.0500 time 0.2412 (0.2441) data time 0.0010 (0.0038) model time 0.2402 (0.2407) loss 3.0168 (2.7503) grad_norm 5.2631 (5.2035) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][220/1251] eta 0:04:11 lr 0.000056 wd 0.0500 time 0.2445 (0.2439) data time 0.0007 (0.0036) model time 0.2437 (0.2406) loss 3.4023 (2.7575) grad_norm 3.9778 (5.2596) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][230/1251] eta 0:04:08 lr 0.000056 wd 0.0500 time 0.2409 (0.2438) data time 0.0011 (0.0035) model time 0.2398 (0.2406) loss 3.3528 (2.7547) grad_norm 4.5779 (5.2677) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][240/1251] eta 0:04:06 lr 0.000056 wd 0.0500 time 0.2439 (0.2437) data time 0.0009 (0.0034) model time 0.2430 (0.2406) loss 3.3018 (2.7512) grad_norm 4.8087 (5.2238) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][250/1251] eta 0:04:03 lr 0.000056 wd 0.0500 time 0.2472 (0.2436) data time 0.0010 (0.0033) model time 0.2462 (0.2405) loss 3.1890 (2.7474) grad_norm 5.9417 (5.2057) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][260/1251] eta 0:04:01 lr 0.000056 wd 0.0500 time 0.2460 (0.2434) data time 0.0007 (0.0033) model time 0.2453 (0.2405) loss 2.7589 (2.7422) 
grad_norm 4.1651 (5.1814) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][270/1251] eta 0:03:58 lr 0.000056 wd 0.0500 time 0.2434 (0.2434) data time 0.0008 (0.0032) model time 0.2427 (0.2405) loss 3.2105 (2.7364) grad_norm 3.8110 (5.1572) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][280/1251] eta 0:03:56 lr 0.000056 wd 0.0500 time 0.2410 (0.2433) data time 0.0009 (0.0031) model time 0.2401 (0.2404) loss 1.6273 (2.7367) grad_norm 3.4783 (5.1669) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][290/1251] eta 0:03:53 lr 0.000056 wd 0.0500 time 0.2403 (0.2433) data time 0.0009 (0.0030) model time 0.2393 (0.2405) loss 3.3524 (2.7388) grad_norm 3.5827 (5.1701) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][300/1251] eta 0:03:51 lr 0.000056 wd 0.0500 time 0.2550 (0.2433) data time 0.0008 (0.0029) model time 0.2542 (0.2406) loss 3.1766 (2.7424) grad_norm 5.2414 (5.1657) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][310/1251] eta 0:03:48 lr 0.000056 wd 0.0500 time 0.2430 (0.2431) data time 0.0009 (0.0029) model time 0.2421 (0.2405) loss 2.7395 (2.7325) grad_norm 4.1172 (5.1454) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][320/1251] eta 0:03:46 lr 0.000056 wd 0.0500 time 0.2439 (0.2431) data time 0.0007 (0.0028) model time 0.2432 (0.2405) loss 2.4029 (2.7309) grad_norm 5.1147 (5.1513) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][330/1251] eta 0:03:43 lr 0.000056 wd 0.0500 time 
0.2370 (0.2430) data time 0.0010 (0.0028) model time 0.2360 (0.2404) loss 2.6331 (2.7332) grad_norm 5.8537 (5.2366) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][340/1251] eta 0:03:41 lr 0.000056 wd 0.0500 time 0.2421 (0.2429) data time 0.0010 (0.0027) model time 0.2411 (0.2404) loss 2.9311 (2.7273) grad_norm 8.4607 (5.3068) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][350/1251] eta 0:03:38 lr 0.000056 wd 0.0500 time 0.2346 (0.2429) data time 0.0007 (0.0027) model time 0.2339 (0.2404) loss 3.4092 (2.7315) grad_norm 6.2112 (5.3554) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][360/1251] eta 0:03:36 lr 0.000056 wd 0.0500 time 0.2420 (0.2428) data time 0.0008 (0.0026) model time 0.2412 (0.2404) loss 2.2977 (2.7283) grad_norm 2.7970 (5.3501) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][370/1251] eta 0:03:33 lr 0.000056 wd 0.0500 time 0.2465 (0.2428) data time 0.0009 (0.0026) model time 0.2456 (0.2404) loss 2.8238 (2.7253) grad_norm 4.5533 (5.3338) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][380/1251] eta 0:03:31 lr 0.000056 wd 0.0500 time 0.2407 (0.2427) data time 0.0007 (0.0025) model time 0.2400 (0.2404) loss 3.1222 (2.7254) grad_norm 4.0062 (5.3261) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][390/1251] eta 0:03:28 lr 0.000056 wd 0.0500 time 0.2361 (0.2426) data time 0.0011 (0.0025) model time 0.2350 (0.2403) loss 2.5765 (2.7296) grad_norm 7.6181 (5.3329) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [261/300][400/1251] eta 0:03:26 lr 0.000056 wd 0.0500 time 0.2388 (0.2430) data time 0.0009 (0.0025) model time 0.2379 (0.2408) loss 2.2379 (2.7304) grad_norm 5.8303 (5.3418) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][410/1251] eta 0:03:24 lr 0.000056 wd 0.0500 time 0.2383 (0.2430) data time 0.0013 (0.0024) model time 0.2369 (0.2408) loss 2.8267 (2.7247) grad_norm 4.3968 (5.3246) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][420/1251] eta 0:03:21 lr 0.000056 wd 0.0500 time 0.2451 (0.2429) data time 0.0009 (0.0024) model time 0.2441 (0.2407) loss 2.4351 (2.7231) grad_norm 3.4891 (5.2997) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][430/1251] eta 0:03:19 lr 0.000056 wd 0.0500 time 0.2414 (0.2436) data time 0.0008 (0.0024) model time 0.2406 (0.2415) loss 3.3808 (2.7197) grad_norm 2.7214 (5.2748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][440/1251] eta 0:03:17 lr 0.000056 wd 0.0500 time 0.2422 (0.2435) data time 0.0010 (0.0023) model time 0.2412 (0.2415) loss 2.7008 (2.7202) grad_norm 4.1459 (5.2557) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][450/1251] eta 0:03:14 lr 0.000056 wd 0.0500 time 0.2350 (0.2434) data time 0.0010 (0.0023) model time 0.2340 (0.2413) loss 2.7088 (2.7207) grad_norm 4.5041 (5.2678) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][460/1251] eta 0:03:12 lr 0.000056 wd 0.0500 time 0.2388 (0.2433) data time 0.0008 (0.0023) model time 0.2381 (0.2412) loss 1.9065 (2.7199) grad_norm 4.0723 
(5.2701) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][470/1251] eta 0:03:10 lr 0.000056 wd 0.0500 time 0.2428 (0.2436) data time 0.0010 (0.0023) model time 0.2418 (0.2416) loss 2.4667 (2.7220) grad_norm 4.0096 (5.2702) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][480/1251] eta 0:03:07 lr 0.000056 wd 0.0500 time 0.2444 (0.2436) data time 0.0011 (0.0023) model time 0.2433 (0.2416) loss 2.8770 (2.7243) grad_norm 4.1512 (5.2771) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][490/1251] eta 0:03:05 lr 0.000056 wd 0.0500 time 0.2456 (0.2435) data time 0.0010 (0.0022) model time 0.2446 (0.2415) loss 2.8956 (2.7239) grad_norm 4.1531 (5.2604) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][500/1251] eta 0:03:02 lr 0.000056 wd 0.0500 time 0.2369 (0.2434) data time 0.0009 (0.0022) model time 0.2359 (0.2414) loss 2.4080 (2.7262) grad_norm 4.4779 (5.2483) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][510/1251] eta 0:03:00 lr 0.000056 wd 0.0500 time 0.2463 (0.2434) data time 0.0008 (0.0022) model time 0.2455 (0.2415) loss 2.2961 (2.7253) grad_norm 6.4628 (5.2403) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][520/1251] eta 0:02:57 lr 0.000056 wd 0.0500 time 0.2404 (0.2434) data time 0.0009 (0.0022) model time 0.2395 (0.2415) loss 2.6997 (2.7254) grad_norm 3.7849 (5.2314) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][530/1251] eta 0:02:55 lr 0.000056 wd 0.0500 time 0.2420 (0.2434) data 
time 0.0007 (0.0021) model time 0.2413 (0.2415) loss 3.8951 (2.7301) grad_norm 3.6978 (5.2215) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][540/1251] eta 0:02:53 lr 0.000056 wd 0.0500 time 0.2344 (0.2433) data time 0.0007 (0.0021) model time 0.2337 (0.2414) loss 2.3478 (2.7303) grad_norm 4.1105 (5.2156) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][550/1251] eta 0:02:50 lr 0.000056 wd 0.0500 time 0.2452 (0.2433) data time 0.0011 (0.0021) model time 0.2441 (0.2415) loss 2.9295 (2.7336) grad_norm 4.8614 (5.2020) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][560/1251] eta 0:02:48 lr 0.000056 wd 0.0500 time 0.2447 (0.2433) data time 0.0008 (0.0021) model time 0.2439 (0.2414) loss 2.2136 (2.7339) grad_norm 4.9495 (5.2006) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][570/1251] eta 0:02:45 lr 0.000056 wd 0.0500 time 0.2520 (0.2432) data time 0.0008 (0.0021) model time 0.2512 (0.2414) loss 2.8938 (2.7356) grad_norm 3.4723 (5.1963) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][580/1251] eta 0:02:43 lr 0.000056 wd 0.0500 time 0.2327 (0.2432) data time 0.0009 (0.0020) model time 0.2318 (0.2413) loss 2.5074 (2.7367) grad_norm 4.4469 (5.1768) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][590/1251] eta 0:02:40 lr 0.000056 wd 0.0500 time 0.2422 (0.2432) data time 0.0008 (0.0020) model time 0.2413 (0.2414) loss 1.8138 (2.7334) grad_norm 5.8501 (5.1672) loss_scale 512.0000 (259.4653) mem 7381MB [2024-09-01 08:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [261/300][600/1251] eta 0:02:38 lr 0.000056 wd 0.0500 time 0.2421 (0.2431) data time 0.0007 (0.0020) model time 0.2414 (0.2413) loss 2.8458 (2.7324) grad_norm 4.9773 (5.1924) loss_scale 512.0000 (263.6672) mem 7381MB [2024-09-01 08:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][610/1251] eta 0:02:35 lr 0.000055 wd 0.0500 time 0.2433 (0.2431) data time 0.0011 (0.0020) model time 0.2422 (0.2413) loss 2.7505 (2.7334) grad_norm 6.5208 (5.2036) loss_scale 512.0000 (267.7316) mem 7381MB [2024-09-01 08:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][620/1251] eta 0:02:33 lr 0.000055 wd 0.0500 time 0.2467 (0.2431) data time 0.0007 (0.0020) model time 0.2460 (0.2413) loss 2.5761 (2.7335) grad_norm 4.6991 (5.2217) loss_scale 512.0000 (271.6651) mem 7381MB [2024-09-01 08:35:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][630/1251] eta 0:02:30 lr 0.000055 wd 0.0500 time 0.2432 (0.2431) data time 0.0009 (0.0020) model time 0.2422 (0.2413) loss 2.5046 (2.7344) grad_norm 7.3338 (5.2164) loss_scale 512.0000 (275.4739) mem 7381MB [2024-09-01 08:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][640/1251] eta 0:02:28 lr 0.000055 wd 0.0500 time 0.2313 (0.2430) data time 0.0012 (0.0019) model time 0.2301 (0.2413) loss 2.8912 (2.7376) grad_norm 5.4206 (5.2192) loss_scale 512.0000 (279.1638) mem 7381MB [2024-09-01 08:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][650/1251] eta 0:02:26 lr 0.000055 wd 0.0500 time 0.2386 (0.2430) data time 0.0010 (0.0019) model time 0.2376 (0.2413) loss 2.6717 (2.7343) grad_norm 5.3317 (5.2349) loss_scale 512.0000 (282.7404) mem 7381MB [2024-09-01 08:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][660/1251] eta 0:02:23 lr 0.000055 wd 0.0500 time 0.2367 (0.2433) data time 0.0009 (0.0019) model time 0.2358 (0.2416) loss 2.7058 (2.7364) grad_norm 4.3123 (5.2299) loss_scale 512.0000 
(286.2088) mem 7381MB [2024-09-01 08:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][670/1251] eta 0:02:21 lr 0.000055 wd 0.0500 time 0.2409 (0.2433) data time 0.0009 (0.0019) model time 0.2400 (0.2416) loss 3.0093 (2.7354) grad_norm 5.6374 (5.2427) loss_scale 512.0000 (289.5738) mem 7381MB [2024-09-01 08:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][680/1251] eta 0:02:18 lr 0.000055 wd 0.0500 time 0.2472 (0.2433) data time 0.0010 (0.0019) model time 0.2463 (0.2416) loss 2.6821 (2.7335) grad_norm 3.9667 (5.2324) loss_scale 512.0000 (292.8399) mem 7381MB [2024-09-01 08:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][690/1251] eta 0:02:16 lr 0.000055 wd 0.0500 time 0.2458 (0.2433) data time 0.0008 (0.0019) model time 0.2450 (0.2416) loss 2.6987 (2.7309) grad_norm 3.9351 (5.2203) loss_scale 512.0000 (296.0116) mem 7381MB [2024-09-01 08:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][700/1251] eta 0:02:14 lr 0.000055 wd 0.0500 time 0.2389 (0.2439) data time 0.0008 (0.0019) model time 0.2381 (0.2423) loss 3.6897 (2.7321) grad_norm 5.6674 (5.2530) loss_scale 512.0000 (299.0927) mem 7381MB [2024-09-01 08:35:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][710/1251] eta 0:02:12 lr 0.000055 wd 0.0500 time 0.2486 (0.2442) data time 0.0011 (0.0018) model time 0.2475 (0.2426) loss 2.3945 (2.7275) grad_norm 3.9268 (5.2627) loss_scale 512.0000 (302.0872) mem 7381MB [2024-09-01 08:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][720/1251] eta 0:02:09 lr 0.000055 wd 0.0500 time 0.2408 (0.2441) data time 0.0007 (0.0018) model time 0.2400 (0.2426) loss 2.2082 (2.7271) grad_norm 3.6742 (5.2550) loss_scale 512.0000 (304.9986) mem 7381MB [2024-09-01 08:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][730/1251] eta 0:02:07 lr 0.000055 wd 0.0500 time 0.2475 (0.2441) data time 0.0009 (0.0018) model 
time 0.2466 (0.2425) loss 2.4462 (2.7246) grad_norm 3.4435 (5.2496) loss_scale 512.0000 (307.8304) mem 7381MB [2024-09-01 08:36:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][740/1251] eta 0:02:04 lr 0.000055 wd 0.0500 time 0.2470 (0.2440) data time 0.0007 (0.0018) model time 0.2463 (0.2425) loss 2.8346 (2.7244) grad_norm 3.1604 (5.2411) loss_scale 512.0000 (310.5857) mem 7381MB [2024-09-01 08:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][750/1251] eta 0:02:02 lr 0.000055 wd 0.0500 time 0.2398 (0.2440) data time 0.0011 (0.0018) model time 0.2387 (0.2424) loss 2.9656 (2.7261) grad_norm 3.8482 (5.2382) loss_scale 512.0000 (313.2676) mem 7381MB [2024-09-01 08:36:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][760/1251] eta 0:01:59 lr 0.000055 wd 0.0500 time 0.2370 (0.2440) data time 0.0007 (0.0018) model time 0.2363 (0.2424) loss 3.0065 (2.7250) grad_norm 4.9286 (5.2291) loss_scale 512.0000 (315.8791) mem 7381MB [2024-09-01 08:36:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][770/1251] eta 0:01:57 lr 0.000055 wd 0.0500 time 0.2391 (0.2439) data time 0.0008 (0.0018) model time 0.2383 (0.2424) loss 2.4508 (2.7262) grad_norm 6.5343 (5.2365) loss_scale 512.0000 (318.4228) mem 7381MB [2024-09-01 08:36:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][780/1251] eta 0:01:54 lr 0.000055 wd 0.0500 time 0.2332 (0.2439) data time 0.0010 (0.0018) model time 0.2322 (0.2423) loss 3.0116 (2.7265) grad_norm 4.4115 (5.2280) loss_scale 512.0000 (320.9014) mem 7381MB [2024-09-01 08:36:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][790/1251] eta 0:01:52 lr 0.000055 wd 0.0500 time 0.2498 (0.2439) data time 0.0009 (0.0018) model time 0.2489 (0.2423) loss 2.3696 (2.7250) grad_norm 4.1894 (5.2269) loss_scale 512.0000 (323.3173) mem 7381MB [2024-09-01 08:36:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][800/1251] 
eta 0:01:49 lr 0.000055 wd 0.0500 time 0.2375 (0.2438) data time 0.0007 (0.0018) model time 0.2368 (0.2423) loss 1.7282 (2.7266) grad_norm 3.4047 (5.2938) loss_scale 512.0000 (325.6729) mem 7381MB [2024-09-01 08:36:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][810/1251] eta 0:01:47 lr 0.000055 wd 0.0500 time 0.2519 (0.2438) data time 0.0012 (0.0017) model time 0.2507 (0.2423) loss 1.6769 (2.7268) grad_norm 5.6425 (5.2893) loss_scale 512.0000 (327.9704) mem 7381MB [2024-09-01 08:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][820/1251] eta 0:01:45 lr 0.000055 wd 0.0500 time 0.2349 (0.2438) data time 0.0011 (0.0017) model time 0.2338 (0.2422) loss 2.8584 (2.7254) grad_norm 3.6398 (5.2957) loss_scale 512.0000 (330.2119) mem 7381MB [2024-09-01 08:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][830/1251] eta 0:01:42 lr 0.000055 wd 0.0500 time 0.2307 (0.2437) data time 0.0009 (0.0017) model time 0.2297 (0.2422) loss 2.5643 (2.7255) grad_norm 7.8378 (inf) loss_scale 256.0000 (330.2431) mem 7381MB [2024-09-01 08:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][840/1251] eta 0:01:40 lr 0.000055 wd 0.0500 time 0.2395 (0.2437) data time 0.0012 (0.0017) model time 0.2383 (0.2422) loss 2.8963 (2.7253) grad_norm 3.1007 (inf) loss_scale 256.0000 (329.3603) mem 7381MB [2024-09-01 08:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][850/1251] eta 0:01:37 lr 0.000055 wd 0.0500 time 0.2441 (0.2437) data time 0.0008 (0.0017) model time 0.2433 (0.2422) loss 3.2163 (2.7274) grad_norm 5.1893 (inf) loss_scale 256.0000 (328.4982) mem 7381MB [2024-09-01 08:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][860/1251] eta 0:01:35 lr 0.000055 wd 0.0500 time 0.2350 (0.2436) data time 0.0007 (0.0017) model time 0.2343 (0.2421) loss 3.2099 (2.7298) grad_norm 4.3796 (inf) loss_scale 256.0000 (327.6562) mem 7381MB [2024-09-01 08:36:31 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][870/1251] eta 0:01:32 lr 0.000055 wd 0.0500 time 0.2451 (0.2436) data time 0.0009 (0.0017) model time 0.2442 (0.2421) loss 3.4419 (2.7331) grad_norm 4.8736 (inf) loss_scale 256.0000 (326.8335) mem 7381MB [2024-09-01 08:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][880/1251] eta 0:01:30 lr 0.000055 wd 0.0500 time 0.2369 (0.2436) data time 0.0010 (0.0017) model time 0.2359 (0.2421) loss 3.1329 (2.7324) grad_norm 6.4583 (inf) loss_scale 256.0000 (326.0295) mem 7381MB [2024-09-01 08:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][890/1251] eta 0:01:27 lr 0.000055 wd 0.0500 time 0.2411 (0.2436) data time 0.0011 (0.0017) model time 0.2401 (0.2421) loss 2.0998 (2.7326) grad_norm 4.5527 (inf) loss_scale 256.0000 (325.2435) mem 7381MB [2024-09-01 08:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][900/1251] eta 0:01:25 lr 0.000055 wd 0.0500 time 0.2384 (0.2436) data time 0.0007 (0.0017) model time 0.2377 (0.2421) loss 2.4994 (2.7310) grad_norm 4.1630 (inf) loss_scale 256.0000 (324.4750) mem 7381MB [2024-09-01 08:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][910/1251] eta 0:01:23 lr 0.000055 wd 0.0500 time 0.2439 (0.2435) data time 0.0009 (0.0017) model time 0.2429 (0.2420) loss 1.8697 (2.7327) grad_norm 3.4415 (inf) loss_scale 256.0000 (323.7234) mem 7381MB [2024-09-01 08:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][920/1251] eta 0:01:20 lr 0.000055 wd 0.0500 time 0.2388 (0.2435) data time 0.0009 (0.0016) model time 0.2378 (0.2420) loss 2.7694 (2.7345) grad_norm 7.5548 (inf) loss_scale 256.0000 (322.9881) mem 7381MB [2024-09-01 08:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][930/1251] eta 0:01:18 lr 0.000055 wd 0.0500 time 0.2388 (0.2435) data time 0.0010 (0.0016) model time 0.2378 (0.2420) loss 2.9475 (2.7328) grad_norm 4.1083 
(inf) loss_scale 256.0000 (322.2685) mem 7381MB [2024-09-01 08:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][940/1251] eta 0:01:15 lr 0.000055 wd 0.0500 time 0.2388 (0.2435) data time 0.0011 (0.0016) model time 0.2377 (0.2420) loss 2.2153 (2.7328) grad_norm 6.1402 (inf) loss_scale 256.0000 (321.5643) mem 7381MB [2024-09-01 08:36:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][950/1251] eta 0:01:13 lr 0.000055 wd 0.0500 time 0.2403 (0.2434) data time 0.0010 (0.0016) model time 0.2393 (0.2419) loss 2.6299 (2.7358) grad_norm 2.9548 (inf) loss_scale 256.0000 (320.8749) mem 7381MB [2024-09-01 08:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][960/1251] eta 0:01:10 lr 0.000055 wd 0.0500 time 0.2414 (0.2434) data time 0.0007 (0.0016) model time 0.2407 (0.2419) loss 1.9619 (2.7313) grad_norm 5.1222 (inf) loss_scale 256.0000 (320.1998) mem 7381MB [2024-09-01 08:36:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][970/1251] eta 0:01:08 lr 0.000055 wd 0.0500 time 0.2451 (0.2434) data time 0.0009 (0.0016) model time 0.2442 (0.2419) loss 2.0140 (2.7277) grad_norm 3.1836 (inf) loss_scale 256.0000 (319.5386) mem 7381MB [2024-09-01 08:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][980/1251] eta 0:01:05 lr 0.000055 wd 0.0500 time 0.2426 (0.2433) data time 0.0010 (0.0016) model time 0.2416 (0.2419) loss 3.2184 (2.7287) grad_norm 4.8583 (inf) loss_scale 256.0000 (318.8909) mem 7381MB [2024-09-01 08:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][990/1251] eta 0:01:03 lr 0.000055 wd 0.0500 time 0.2445 (0.2433) data time 0.0009 (0.0016) model time 0.2436 (0.2419) loss 2.5043 (2.7286) grad_norm 7.1713 (inf) loss_scale 256.0000 (318.2563) mem 7381MB [2024-09-01 08:37:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1000/1251] eta 0:01:01 lr 0.000055 wd 0.0500 time 0.2395 (0.2433) data time 0.0007 
(0.0016) model time 0.2388 (0.2419) loss 2.7469 (2.7291) grad_norm 6.1378 (inf) loss_scale 256.0000 (317.6344) mem 7381MB [2024-09-01 08:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1010/1251] eta 0:00:58 lr 0.000055 wd 0.0500 time 0.2415 (0.2433) data time 0.0012 (0.0016) model time 0.2404 (0.2419) loss 3.1767 (2.7297) grad_norm 4.1983 (inf) loss_scale 256.0000 (317.0247) mem 7381MB [2024-09-01 08:37:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1020/1251] eta 0:00:56 lr 0.000055 wd 0.0500 time 0.2451 (0.2433) data time 0.0007 (0.0016) model time 0.2444 (0.2419) loss 2.3688 (2.7299) grad_norm 6.4284 (inf) loss_scale 256.0000 (316.4270) mem 7381MB [2024-09-01 08:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1030/1251] eta 0:00:53 lr 0.000055 wd 0.0500 time 0.2393 (0.2433) data time 0.0009 (0.0016) model time 0.2384 (0.2419) loss 3.2799 (2.7282) grad_norm 6.3059 (inf) loss_scale 256.0000 (315.8409) mem 7381MB [2024-09-01 08:37:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1040/1251] eta 0:00:51 lr 0.000055 wd 0.0500 time 0.2392 (0.2433) data time 0.0010 (0.0016) model time 0.2382 (0.2418) loss 2.7513 (2.7278) grad_norm 5.8493 (inf) loss_scale 256.0000 (315.2661) mem 7381MB [2024-09-01 08:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1050/1251] eta 0:00:48 lr 0.000055 wd 0.0500 time 0.2412 (0.2433) data time 0.0007 (0.0016) model time 0.2405 (0.2418) loss 3.0575 (2.7272) grad_norm 4.6209 (inf) loss_scale 256.0000 (314.7022) mem 7381MB [2024-09-01 08:37:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1060/1251] eta 0:00:46 lr 0.000055 wd 0.0500 time 0.2317 (0.2432) data time 0.0012 (0.0016) model time 0.2305 (0.2418) loss 2.9640 (2.7273) grad_norm 5.9536 (inf) loss_scale 256.0000 (314.1489) mem 7381MB [2024-09-01 08:37:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1070/1251] 
eta 0:00:44 lr 0.000055 wd 0.0500 time 0.2423 (0.2432) data time 0.0010 (0.0016) model time 0.2413 (0.2418) loss 2.8154 (2.7283) grad_norm 5.4389 (inf) loss_scale 256.0000 (313.6060) mem 7381MB [2024-09-01 08:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1080/1251] eta 0:00:41 lr 0.000055 wd 0.0500 time 0.2457 (0.2432) data time 0.0007 (0.0015) model time 0.2450 (0.2418) loss 3.4087 (2.7300) grad_norm 8.0244 (inf) loss_scale 256.0000 (313.0731) mem 7381MB [2024-09-01 08:37:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1090/1251] eta 0:00:39 lr 0.000055 wd 0.0500 time 0.2430 (0.2432) data time 0.0007 (0.0015) model time 0.2423 (0.2418) loss 3.2333 (2.7302) grad_norm 4.1760 (inf) loss_scale 256.0000 (312.5500) mem 7381MB [2024-09-01 08:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1100/1251] eta 0:00:36 lr 0.000055 wd 0.0500 time 0.2355 (0.2432) data time 0.0010 (0.0015) model time 0.2345 (0.2418) loss 3.2330 (2.7325) grad_norm 4.6670 (inf) loss_scale 256.0000 (312.0363) mem 7381MB [2024-09-01 08:37:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1110/1251] eta 0:00:34 lr 0.000055 wd 0.0500 time 0.2320 (0.2431) data time 0.0012 (0.0015) model time 0.2308 (0.2417) loss 2.5827 (2.7327) grad_norm 4.4662 (inf) loss_scale 256.0000 (311.5320) mem 7381MB [2024-09-01 08:37:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1120/1251] eta 0:00:31 lr 0.000055 wd 0.0500 time 0.2304 (0.2431) data time 0.0009 (0.0015) model time 0.2295 (0.2417) loss 3.0388 (2.7319) grad_norm 4.8190 (inf) loss_scale 256.0000 (311.0366) mem 7381MB [2024-09-01 08:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1130/1251] eta 0:00:29 lr 0.000055 wd 0.0500 time 0.2378 (0.2431) data time 0.0009 (0.0015) model time 0.2369 (0.2417) loss 3.0725 (2.7326) grad_norm 5.3997 (inf) loss_scale 256.0000 (310.5500) mem 7381MB [2024-09-01 08:37:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1140/1251] eta 0:00:26 lr 0.000055 wd 0.0500 time 0.2381 (0.2431) data time 0.0009 (0.0015) model time 0.2371 (0.2417) loss 2.5615 (2.7323) grad_norm 4.0090 (inf) loss_scale 256.0000 (310.0719) mem 7381MB [2024-09-01 08:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1150/1251] eta 0:00:24 lr 0.000054 wd 0.0500 time 0.2372 (0.2430) data time 0.0007 (0.0015) model time 0.2365 (0.2416) loss 1.8691 (2.7316) grad_norm 3.9458 (inf) loss_scale 256.0000 (309.6021) mem 7381MB [2024-09-01 08:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1160/1251] eta 0:00:22 lr 0.000054 wd 0.0500 time 0.2447 (0.2430) data time 0.0009 (0.0015) model time 0.2437 (0.2416) loss 3.0543 (2.7317) grad_norm 3.8024 (inf) loss_scale 256.0000 (309.1404) mem 7381MB [2024-09-01 08:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1170/1251] eta 0:00:19 lr 0.000054 wd 0.0500 time 0.2433 (0.2430) data time 0.0007 (0.0015) model time 0.2426 (0.2416) loss 2.2057 (2.7283) grad_norm 6.1543 (inf) loss_scale 256.0000 (308.6866) mem 7381MB [2024-09-01 08:37:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1180/1251] eta 0:00:17 lr 0.000054 wd 0.0500 time 0.2450 (0.2430) data time 0.0009 (0.0015) model time 0.2441 (0.2416) loss 3.3077 (2.7313) grad_norm 8.8170 (inf) loss_scale 256.0000 (308.2405) mem 7381MB [2024-09-01 08:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1190/1251] eta 0:00:14 lr 0.000054 wd 0.0500 time 0.2316 (0.2432) data time 0.0010 (0.0015) model time 0.2305 (0.2418) loss 2.8725 (2.7287) grad_norm 3.6240 (inf) loss_scale 256.0000 (307.8018) mem 7381MB [2024-09-01 08:37:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1200/1251] eta 0:00:12 lr 0.000054 wd 0.0500 time 0.2428 (0.2431) data time 0.0007 (0.0015) model time 0.2421 (0.2418) loss 2.6659 (2.7289) grad_norm 
4.0402 (inf) loss_scale 256.0000 (307.3705) mem 7381MB [2024-09-01 08:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1210/1251] eta 0:00:09 lr 0.000054 wd 0.0500 time 0.2378 (0.2431) data time 0.0010 (0.0015) model time 0.2368 (0.2418) loss 3.2290 (2.7292) grad_norm 5.0938 (inf) loss_scale 256.0000 (306.9463) mem 7381MB [2024-09-01 08:37:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1220/1251] eta 0:00:07 lr 0.000054 wd 0.0500 time 0.2455 (0.2431) data time 0.0008 (0.0015) model time 0.2447 (0.2418) loss 2.0565 (2.7300) grad_norm 13.7985 (inf) loss_scale 256.0000 (306.5291) mem 7381MB [2024-09-01 08:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1230/1251] eta 0:00:05 lr 0.000054 wd 0.0500 time 0.2453 (0.2431) data time 0.0009 (0.0015) model time 0.2443 (0.2418) loss 2.9325 (2.7293) grad_norm 5.3130 (inf) loss_scale 256.0000 (306.1186) mem 7381MB [2024-09-01 08:38:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1240/1251] eta 0:00:02 lr 0.000054 wd 0.0500 time 0.2226 (0.2436) data time 0.0005 (0.0015) model time 0.2221 (0.2422) loss 2.6039 (2.7306) grad_norm 5.4508 (inf) loss_scale 256.0000 (305.7147) mem 7381MB [2024-09-01 08:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [261/300][1250/1251] eta 0:00:00 lr 0.000054 wd 0.0500 time 0.2230 (0.2434) data time 0.0007 (0.0015) model time 0.2223 (0.2421) loss 2.8119 (2.7283) grad_norm 3.3595 (inf) loss_scale 256.0000 (305.3173) mem 7381MB [2024-09-01 08:38:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 261 training takes 0:05:04 [2024-09-01 08:38:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:38:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 08:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.518 (0.518) Loss 0.3843 (0.3843) Acc@1 93.262 (93.262) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 08:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.117) Loss 0.5498 (0.6142) Acc@1 90.723 (87.411) Acc@5 97.852 (97.763) Mem 7381MB [2024-09-01 08:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.098) Loss 0.9399 (0.6442) Acc@1 77.344 (86.407) Acc@5 95.508 (97.661) Mem 7381MB [2024-09-01 08:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.092) Loss 1.1484 (0.7409) Acc@1 74.023 (84.104) Acc@5 92.480 (96.636) Mem 7381MB [2024-09-01 08:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.0010 (0.7891) Acc@1 76.953 (82.910) Acc@5 94.434 (96.163) Mem 7381MB [2024-09-01 08:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.432 Acc@5 96.114 [2024-09-01 08:38:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-09-01 08:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.798 (0.798) Loss 0.3816 (0.3816) Acc@1 93.262 (93.262) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 08:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.149) Loss 0.5649 (0.6019) Acc@1 90.527 (87.828) Acc@5 97.852 (97.843) Mem 7381MB [2024-09-01 08:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.115) Loss 0.8955 (0.6317) Acc@1 78.027 (86.644) Acc@5 95.898 (97.740) Mem 7381MB [2024-09-01 08:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.103) Loss 1.1152 (0.7212) Acc@1 73.926 (84.466) Acc@5 93.066 (96.859) Mem 7381MB [2024-09-01 08:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 0.9941 
(0.7678) Acc@1 77.246 (83.344) Acc@5 94.141 (96.356) Mem 7381MB [2024-09-01 08:38:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.962 Acc@5 96.304 [2024-09-01 08:38:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 08:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][0/1251] eta 0:23:02 lr 0.000054 wd 0.0500 time 1.1050 (1.1050) data time 0.7699 (0.7699) model time 0.0000 (0.0000) loss 3.3296 (3.3296) grad_norm 3.6614 (3.6614) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][10/1251] eta 0:06:41 lr 0.000054 wd 0.0500 time 0.2470 (0.3236) data time 0.0007 (0.0709) model time 0.0000 (0.0000) loss 1.6759 (2.5487) grad_norm 4.0362 (5.8535) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][20/1251] eta 0:06:01 lr 0.000054 wd 0.0500 time 0.2422 (0.2933) data time 0.0008 (0.0376) model time 0.0000 (0.0000) loss 2.4787 (2.6204) grad_norm 8.7701 (6.4863) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][30/1251] eta 0:05:36 lr 0.000054 wd 0.0500 time 0.2314 (0.2759) data time 0.0012 (0.0259) model time 0.0000 (0.0000) loss 3.2308 (2.6588) grad_norm 3.3834 (5.9639) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][40/1251] eta 0:05:23 lr 0.000054 wd 0.0500 time 0.2431 (0.2674) data time 0.0010 (0.0198) model time 0.0000 (0.0000) loss 2.8727 (2.6379) grad_norm 3.9100 (5.7749) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][50/1251] eta 0:05:14 lr 0.000054 wd 0.0500 time 0.2382 (0.2619) data time 0.0010 (0.0161) model time 0.0000 (0.0000) loss 2.8907 
(2.6210) grad_norm 5.8811 (5.6025) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][60/1251] eta 0:05:08 lr 0.000054 wd 0.0500 time 0.2390 (0.2589) data time 0.0011 (0.0136) model time 0.2379 (0.2426) loss 2.9267 (2.6313) grad_norm 7.7793 (5.6679) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][70/1251] eta 0:05:02 lr 0.000054 wd 0.0500 time 0.2386 (0.2562) data time 0.0008 (0.0118) model time 0.2379 (0.2407) loss 2.7614 (2.6805) grad_norm 4.1200 (5.6535) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][80/1251] eta 0:04:57 lr 0.000054 wd 0.0500 time 0.2362 (0.2544) data time 0.0012 (0.0105) model time 0.2350 (0.2406) loss 3.3872 (2.6958) grad_norm 5.0282 (5.5183) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][90/1251] eta 0:04:53 lr 0.000054 wd 0.0500 time 0.2470 (0.2529) data time 0.0008 (0.0095) model time 0.2462 (0.2403) loss 3.5184 (2.7206) grad_norm 5.9264 (5.4977) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][100/1251] eta 0:04:49 lr 0.000054 wd 0.0500 time 0.2546 (0.2517) data time 0.0008 (0.0086) model time 0.2538 (0.2402) loss 1.8793 (2.7276) grad_norm 5.5470 (5.4478) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][110/1251] eta 0:04:45 lr 0.000054 wd 0.0500 time 0.2449 (0.2506) data time 0.0009 (0.0080) model time 0.2440 (0.2400) loss 2.8807 (2.7328) grad_norm 6.1369 (5.3647) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][120/1251] eta 0:04:42 lr 0.000054 wd 0.0500 
time 0.2412 (0.2496) data time 0.0009 (0.0074) model time 0.2403 (0.2396) loss 3.2432 (2.7520) grad_norm 5.6410 (5.4097) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][130/1251] eta 0:04:38 lr 0.000054 wd 0.0500 time 0.2399 (0.2489) data time 0.0010 (0.0069) model time 0.2389 (0.2396) loss 3.2332 (2.7549) grad_norm 5.4356 (5.4086) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][140/1251] eta 0:04:36 lr 0.000054 wd 0.0500 time 0.2628 (0.2485) data time 0.0010 (0.0065) model time 0.2618 (0.2399) loss 2.8042 (2.7559) grad_norm 4.8310 (5.4225) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][150/1251] eta 0:04:33 lr 0.000054 wd 0.0500 time 0.2431 (0.2481) data time 0.0007 (0.0061) model time 0.2424 (0.2401) loss 3.3327 (2.7426) grad_norm 4.1120 (5.3753) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][160/1251] eta 0:04:30 lr 0.000054 wd 0.0500 time 0.2438 (0.2477) data time 0.0007 (0.0058) model time 0.2431 (0.2401) loss 1.9541 (2.7445) grad_norm 7.5190 (5.4727) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][170/1251] eta 0:04:27 lr 0.000054 wd 0.0500 time 0.2355 (0.2474) data time 0.0009 (0.0055) model time 0.2346 (0.2402) loss 3.2279 (2.7450) grad_norm 5.1549 (5.4647) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:38:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][180/1251] eta 0:04:24 lr 0.000054 wd 0.0500 time 0.2406 (0.2471) data time 0.0008 (0.0053) model time 0.2397 (0.2403) loss 3.0308 (2.7545) grad_norm 3.9720 (5.4057) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [262/300][190/1251] eta 0:04:21 lr 0.000054 wd 0.0500 time 0.2414 (0.2468) data time 0.0009 (0.0050) model time 0.2405 (0.2403) loss 1.8081 (2.7414) grad_norm 3.9342 (5.3638) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][200/1251] eta 0:04:19 lr 0.000054 wd 0.0500 time 0.2404 (0.2466) data time 0.0007 (0.0048) model time 0.2396 (0.2405) loss 3.3290 (2.7485) grad_norm 4.3514 (5.3268) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][210/1251] eta 0:04:16 lr 0.000054 wd 0.0500 time 0.2380 (0.2465) data time 0.0011 (0.0046) model time 0.2369 (0.2406) loss 3.1697 (2.7505) grad_norm 5.8968 (5.2984) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][220/1251] eta 0:04:14 lr 0.000054 wd 0.0500 time 0.2439 (0.2464) data time 0.0007 (0.0045) model time 0.2432 (0.2407) loss 3.3933 (2.7513) grad_norm 9.2813 (5.3276) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][230/1251] eta 0:04:11 lr 0.000054 wd 0.0500 time 0.2354 (0.2461) data time 0.0007 (0.0043) model time 0.2347 (0.2407) loss 2.7404 (2.7560) grad_norm 4.7034 (5.3140) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][240/1251] eta 0:04:08 lr 0.000054 wd 0.0500 time 0.2458 (0.2459) data time 0.0011 (0.0042) model time 0.2447 (0.2407) loss 2.9813 (2.7619) grad_norm 5.0363 (5.4764) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][250/1251] eta 0:04:05 lr 0.000054 wd 0.0500 time 0.2377 (0.2457) data time 0.0009 (0.0041) model time 0.2368 (0.2406) loss 2.9974 (2.7639) grad_norm 5.5099 
(5.4365) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][260/1251] eta 0:04:03 lr 0.000054 wd 0.0500 time 0.2387 (0.2454) data time 0.0007 (0.0039) model time 0.2379 (0.2404) loss 2.8960 (2.7562) grad_norm 5.1215 (5.4268) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][270/1251] eta 0:04:00 lr 0.000054 wd 0.0500 time 0.2451 (0.2453) data time 0.0010 (0.0038) model time 0.2440 (0.2404) loss 2.1390 (2.7507) grad_norm 7.1990 (5.3931) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][280/1251] eta 0:03:58 lr 0.000054 wd 0.0500 time 0.2498 (0.2452) data time 0.0010 (0.0037) model time 0.2488 (0.2405) loss 3.2027 (2.7455) grad_norm 3.9637 (5.3834) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][290/1251] eta 0:03:55 lr 0.000054 wd 0.0500 time 0.2383 (0.2451) data time 0.0008 (0.0036) model time 0.2375 (0.2406) loss 2.5167 (2.7335) grad_norm 7.1847 (5.3770) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][300/1251] eta 0:03:52 lr 0.000054 wd 0.0500 time 0.2485 (0.2450) data time 0.0010 (0.0035) model time 0.2475 (0.2405) loss 1.9256 (2.7291) grad_norm 4.4850 (5.3955) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][310/1251] eta 0:03:50 lr 0.000054 wd 0.0500 time 0.2505 (0.2449) data time 0.0007 (0.0035) model time 0.2498 (0.2406) loss 2.0494 (2.7279) grad_norm 4.7406 (5.3960) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][320/1251] eta 0:03:47 lr 0.000054 wd 0.0500 time 0.2468 (0.2448) data 
time 0.0007 (0.0034) model time 0.2461 (0.2406) loss 3.1430 (2.7285) grad_norm 17.4108 (5.4171) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][330/1251] eta 0:03:45 lr 0.000054 wd 0.0500 time 0.2493 (0.2448) data time 0.0009 (0.0033) model time 0.2484 (0.2407) loss 2.7843 (2.7315) grad_norm 7.1168 (5.4475) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][340/1251] eta 0:03:42 lr 0.000054 wd 0.0500 time 0.2420 (0.2447) data time 0.0007 (0.0032) model time 0.2413 (0.2407) loss 3.0298 (2.7324) grad_norm 4.6951 (5.4181) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][350/1251] eta 0:03:40 lr 0.000054 wd 0.0500 time 0.2430 (0.2446) data time 0.0010 (0.0032) model time 0.2420 (0.2407) loss 3.1376 (2.7314) grad_norm 4.5857 (5.4170) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][360/1251] eta 0:03:37 lr 0.000054 wd 0.0500 time 0.2367 (0.2445) data time 0.0009 (0.0031) model time 0.2359 (0.2407) loss 3.0538 (2.7346) grad_norm 9.5318 (5.5007) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][370/1251] eta 0:03:35 lr 0.000054 wd 0.0500 time 0.2473 (0.2445) data time 0.0008 (0.0031) model time 0.2465 (0.2408) loss 3.1312 (2.7272) grad_norm 6.2431 (5.4859) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][380/1251] eta 0:03:32 lr 0.000054 wd 0.0500 time 0.2390 (0.2445) data time 0.0008 (0.0030) model time 0.2381 (0.2408) loss 2.7214 (2.7229) grad_norm 4.8527 (5.4640) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [262/300][390/1251] eta 0:03:30 lr 0.000054 wd 0.0500 time 0.2454 (0.2445) data time 0.0008 (0.0030) model time 0.2446 (0.2409) loss 3.1444 (2.7250) grad_norm 4.7726 (5.4970) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][400/1251] eta 0:03:28 lr 0.000054 wd 0.0500 time 0.2470 (0.2445) data time 0.0007 (0.0029) model time 0.2463 (0.2410) loss 2.5735 (2.7292) grad_norm 9.9188 (5.5162) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][410/1251] eta 0:03:25 lr 0.000054 wd 0.0500 time 0.2403 (0.2444) data time 0.0010 (0.0029) model time 0.2393 (0.2410) loss 2.9672 (2.7304) grad_norm 5.0558 (5.5158) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][420/1251] eta 0:03:23 lr 0.000054 wd 0.0500 time 0.2333 (0.2443) data time 0.0011 (0.0028) model time 0.2322 (0.2409) loss 2.4949 (2.7289) grad_norm 4.7627 (5.4989) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:39:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][430/1251] eta 0:03:20 lr 0.000054 wd 0.0500 time 0.2349 (0.2442) data time 0.0009 (0.0028) model time 0.2340 (0.2408) loss 3.1719 (2.7302) grad_norm 3.5262 (5.4723) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][440/1251] eta 0:03:17 lr 0.000054 wd 0.0500 time 0.2460 (0.2441) data time 0.0009 (0.0027) model time 0.2450 (0.2408) loss 3.1412 (2.7323) grad_norm 4.5487 (5.4579) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][450/1251] eta 0:03:15 lr 0.000053 wd 0.0500 time 0.2481 (0.2441) data time 0.0009 (0.0027) model time 0.2472 (0.2408) loss 3.2871 (2.7291) grad_norm 4.8430 (5.4463) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 08:40:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][460/1251] eta 0:03:13 lr 0.000053 wd 0.0500 time 0.2485 (0.2440) data time 0.0009 (0.0027) model time 0.2476 (0.2408) loss 2.7339 (2.7310) grad_norm 3.4426 (5.4331) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][470/1251] eta 0:03:10 lr 0.000053 wd 0.0500 time 0.2347 (0.2440) data time 0.0007 (0.0026) model time 0.2340 (0.2408) loss 2.6525 (2.7304) grad_norm 4.3866 (5.4855) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][480/1251] eta 0:03:08 lr 0.000053 wd 0.0500 time 0.2480 (0.2444) data time 0.0010 (0.0026) model time 0.2470 (0.2414) loss 3.0178 (2.7350) grad_norm 3.7566 (5.4723) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][490/1251] eta 0:03:05 lr 0.000053 wd 0.0500 time 0.2480 (0.2444) data time 0.0007 (0.0026) model time 0.2473 (0.2414) loss 3.0249 (2.7370) grad_norm 4.3872 (5.4787) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][500/1251] eta 0:03:03 lr 0.000053 wd 0.0500 time 0.2393 (0.2443) data time 0.0009 (0.0025) model time 0.2384 (0.2413) loss 3.3045 (2.7381) grad_norm 5.9009 (5.4501) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][510/1251] eta 0:03:01 lr 0.000053 wd 0.0500 time 0.4411 (0.2451) data time 0.0010 (0.0025) model time 0.4402 (0.2422) loss 2.3380 (2.7337) grad_norm 4.0565 (5.4382) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][520/1251] eta 0:02:59 lr 0.000053 wd 0.0500 time 0.2447 (0.2454) data time 0.0011 (0.0025) model 
time 0.2436 (0.2427) loss 2.9470 (2.7286) grad_norm 4.0858 (5.4165) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][530/1251] eta 0:02:56 lr 0.000053 wd 0.0500 time 0.2480 (0.2454) data time 0.0009 (0.0024) model time 0.2471 (0.2427) loss 2.3417 (2.7283) grad_norm 4.6710 (5.4242) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][540/1251] eta 0:02:54 lr 0.000053 wd 0.0500 time 0.2440 (0.2453) data time 0.0007 (0.0024) model time 0.2433 (0.2426) loss 2.0999 (2.7283) grad_norm 3.8624 (5.4158) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][550/1251] eta 0:02:52 lr 0.000053 wd 0.0500 time 0.4476 (0.2456) data time 0.0007 (0.0024) model time 0.4470 (0.2430) loss 2.6363 (2.7299) grad_norm 5.4246 (5.4188) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][560/1251] eta 0:02:49 lr 0.000053 wd 0.0500 time 0.2384 (0.2455) data time 0.0011 (0.0024) model time 0.2373 (0.2429) loss 2.3710 (2.7266) grad_norm 6.6988 (5.4171) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][570/1251] eta 0:02:47 lr 0.000053 wd 0.0500 time 0.2451 (0.2455) data time 0.0008 (0.0023) model time 0.2442 (0.2429) loss 1.9282 (2.7281) grad_norm 3.5144 (5.4050) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][580/1251] eta 0:02:44 lr 0.000053 wd 0.0500 time 0.2440 (0.2454) data time 0.0008 (0.0023) model time 0.2432 (0.2428) loss 1.7962 (2.7287) grad_norm 6.7910 (5.4244) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][590/1251] 
eta 0:02:42 lr 0.000053 wd 0.0500 time 0.2451 (0.2453) data time 0.0011 (0.0023) model time 0.2440 (0.2428) loss 2.7828 (2.7313) grad_norm 6.5014 (5.4279) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][600/1251] eta 0:02:39 lr 0.000053 wd 0.0500 time 0.2412 (0.2453) data time 0.0010 (0.0023) model time 0.2403 (0.2427) loss 3.1034 (2.7277) grad_norm 3.8847 (5.4173) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][610/1251] eta 0:02:37 lr 0.000053 wd 0.0500 time 0.2430 (0.2452) data time 0.0010 (0.0022) model time 0.2420 (0.2427) loss 2.5526 (2.7254) grad_norm 4.9132 (5.4263) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][620/1251] eta 0:02:34 lr 0.000053 wd 0.0500 time 0.2442 (0.2452) data time 0.0008 (0.0022) model time 0.2434 (0.2427) loss 2.0032 (2.7242) grad_norm 5.9038 (5.4178) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][630/1251] eta 0:02:32 lr 0.000053 wd 0.0500 time 0.2449 (0.2451) data time 0.0008 (0.0022) model time 0.2441 (0.2427) loss 1.8707 (2.7242) grad_norm 7.3380 (5.4221) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][640/1251] eta 0:02:29 lr 0.000053 wd 0.0500 time 0.2528 (0.2451) data time 0.0009 (0.0022) model time 0.2519 (0.2427) loss 1.8538 (2.7223) grad_norm 5.8353 (5.4114) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][650/1251] eta 0:02:27 lr 0.000053 wd 0.0500 time 0.2422 (0.2450) data time 0.0009 (0.0022) model time 0.2413 (0.2426) loss 2.6896 (2.7196) grad_norm 4.2518 (5.4101) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
08:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][660/1251] eta 0:02:24 lr 0.000053 wd 0.0500 time 0.2405 (0.2450) data time 0.0010 (0.0021) model time 0.2394 (0.2426) loss 2.4905 (2.7170) grad_norm 3.5611 (5.4011) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][670/1251] eta 0:02:22 lr 0.000053 wd 0.0500 time 0.2494 (0.2450) data time 0.0007 (0.0021) model time 0.2487 (0.2426) loss 2.6860 (2.7210) grad_norm 3.7037 (5.3834) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][680/1251] eta 0:02:19 lr 0.000053 wd 0.0500 time 0.2377 (0.2449) data time 0.0010 (0.0021) model time 0.2367 (0.2426) loss 2.9584 (2.7229) grad_norm 23.3282 (5.5102) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][690/1251] eta 0:02:17 lr 0.000053 wd 0.0500 time 0.2382 (0.2449) data time 0.0010 (0.0021) model time 0.2372 (0.2425) loss 3.1331 (2.7227) grad_norm 4.6158 (5.5106) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][700/1251] eta 0:02:14 lr 0.000053 wd 0.0500 time 0.2393 (0.2448) data time 0.0010 (0.0021) model time 0.2383 (0.2425) loss 3.2268 (2.7252) grad_norm 5.1498 (5.5036) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][710/1251] eta 0:02:12 lr 0.000053 wd 0.0500 time 0.2450 (0.2448) data time 0.0011 (0.0021) model time 0.2439 (0.2425) loss 2.6246 (2.7259) grad_norm 3.8006 (5.4935) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][720/1251] eta 0:02:09 lr 0.000053 wd 0.0500 time 0.2377 (0.2447) data time 0.0009 (0.0020) model time 0.2368 (0.2424) loss 2.7663 
(2.7258) grad_norm 4.4611 (5.4839) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][730/1251] eta 0:02:07 lr 0.000053 wd 0.0500 time 0.2307 (0.2446) data time 0.0009 (0.0020) model time 0.2298 (0.2423) loss 3.1854 (2.7232) grad_norm 3.6283 (5.4800) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][740/1251] eta 0:02:04 lr 0.000053 wd 0.0500 time 0.2405 (0.2446) data time 0.0009 (0.0020) model time 0.2396 (0.2423) loss 2.0465 (2.7214) grad_norm 4.7055 (5.4730) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][750/1251] eta 0:02:02 lr 0.000053 wd 0.0500 time 0.2467 (0.2445) data time 0.0013 (0.0020) model time 0.2454 (0.2423) loss 3.4297 (2.7224) grad_norm 3.3363 (5.4654) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][760/1251] eta 0:02:00 lr 0.000053 wd 0.0500 time 0.2403 (0.2445) data time 0.0009 (0.0020) model time 0.2394 (0.2423) loss 3.3146 (2.7228) grad_norm 6.9272 (5.4638) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][770/1251] eta 0:01:57 lr 0.000053 wd 0.0500 time 0.2421 (0.2445) data time 0.0009 (0.0020) model time 0.2412 (0.2423) loss 3.1122 (2.7241) grad_norm 3.6623 (5.4480) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][780/1251] eta 0:01:55 lr 0.000053 wd 0.0500 time 0.2450 (0.2445) data time 0.0009 (0.0020) model time 0.2441 (0.2423) loss 2.2218 (2.7193) grad_norm 3.9665 (5.4410) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][790/1251] eta 0:01:52 lr 0.000053 wd 0.0500 
time 0.2425 (0.2444) data time 0.0009 (0.0020) model time 0.2416 (0.2423) loss 2.1270 (2.7163) grad_norm 4.1866 (5.4267) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][800/1251] eta 0:01:50 lr 0.000053 wd 0.0500 time 0.2427 (0.2444) data time 0.0008 (0.0019) model time 0.2419 (0.2422) loss 2.8133 (2.7137) grad_norm 4.0735 (5.4163) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][810/1251] eta 0:01:47 lr 0.000053 wd 0.0500 time 0.2365 (0.2444) data time 0.0010 (0.0019) model time 0.2355 (0.2422) loss 3.1971 (2.7137) grad_norm 9.0981 (5.4154) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][820/1251] eta 0:01:45 lr 0.000053 wd 0.0500 time 0.2472 (0.2444) data time 0.0010 (0.0019) model time 0.2462 (0.2422) loss 2.1992 (2.7104) grad_norm 5.5069 (5.4099) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][830/1251] eta 0:01:42 lr 0.000053 wd 0.0500 time 0.2479 (0.2443) data time 0.0010 (0.0019) model time 0.2469 (0.2422) loss 3.0558 (2.7119) grad_norm 3.1486 (5.5019) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][840/1251] eta 0:01:40 lr 0.000053 wd 0.0500 time 0.2408 (0.2443) data time 0.0009 (0.0019) model time 0.2399 (0.2422) loss 3.0561 (2.7135) grad_norm 4.5270 (5.4960) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][850/1251] eta 0:01:37 lr 0.000053 wd 0.0500 time 0.2453 (0.2443) data time 0.0011 (0.0019) model time 0.2442 (0.2422) loss 2.9484 (2.7143) grad_norm 6.9980 (5.4923) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:43 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [262/300][860/1251] eta 0:01:35 lr 0.000053 wd 0.0500 time 0.2443 (0.2442) data time 0.0011 (0.0019) model time 0.2432 (0.2422) loss 2.7032 (2.7183) grad_norm 4.7542 (5.4763) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][870/1251] eta 0:01:33 lr 0.000053 wd 0.0500 time 0.2430 (0.2442) data time 0.0007 (0.0019) model time 0.2423 (0.2421) loss 2.8979 (2.7206) grad_norm 4.8938 (5.4656) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][880/1251] eta 0:01:30 lr 0.000053 wd 0.0500 time 0.2420 (0.2442) data time 0.0010 (0.0019) model time 0.2410 (0.2421) loss 2.6907 (2.7200) grad_norm 4.8044 (5.4518) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][890/1251] eta 0:01:28 lr 0.000053 wd 0.0500 time 0.2377 (0.2442) data time 0.0009 (0.0018) model time 0.2368 (0.2421) loss 1.7549 (2.7168) grad_norm 4.2483 (5.4584) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][900/1251] eta 0:01:25 lr 0.000053 wd 0.0500 time 0.2456 (0.2441) data time 0.0010 (0.0018) model time 0.2446 (0.2421) loss 2.5302 (2.7185) grad_norm 4.2425 (5.4589) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][910/1251] eta 0:01:23 lr 0.000053 wd 0.0500 time 0.2529 (0.2442) data time 0.0007 (0.0018) model time 0.2522 (0.2421) loss 2.9162 (2.7159) grad_norm 6.5680 (5.4564) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][920/1251] eta 0:01:20 lr 0.000053 wd 0.0500 time 0.2504 (0.2441) data time 0.0007 (0.0018) model time 0.2497 (0.2421) loss 2.9055 (2.7185) grad_norm 4.0416 
(5.4479) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][930/1251] eta 0:01:18 lr 0.000053 wd 0.0500 time 0.2487 (0.2441) data time 0.0008 (0.0018) model time 0.2478 (0.2421) loss 2.8965 (2.7175) grad_norm 3.2942 (5.4423) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][940/1251] eta 0:01:15 lr 0.000053 wd 0.0500 time 0.2460 (0.2441) data time 0.0008 (0.0018) model time 0.2452 (0.2421) loss 2.8727 (2.7166) grad_norm 6.9410 (5.4436) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][950/1251] eta 0:01:13 lr 0.000053 wd 0.0500 time 0.2361 (0.2441) data time 0.0010 (0.0018) model time 0.2351 (0.2421) loss 3.4210 (2.7198) grad_norm 4.4461 (5.4397) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][960/1251] eta 0:01:11 lr 0.000053 wd 0.0500 time 0.2384 (0.2440) data time 0.0009 (0.0018) model time 0.2375 (0.2420) loss 2.6102 (2.7205) grad_norm 5.3965 (5.4406) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][970/1251] eta 0:01:08 lr 0.000053 wd 0.0500 time 0.2355 (0.2440) data time 0.0007 (0.0018) model time 0.2348 (0.2420) loss 3.2848 (2.7184) grad_norm 6.7099 (5.4316) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][980/1251] eta 0:01:06 lr 0.000053 wd 0.0500 time 0.2470 (0.2440) data time 0.0009 (0.0018) model time 0.2461 (0.2420) loss 2.0782 (2.7155) grad_norm 4.2738 (5.4335) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][990/1251] eta 0:01:03 lr 0.000053 wd 0.0500 time 0.2479 (0.2440) data 
time 0.0010 (0.0018) model time 0.2469 (0.2420) loss 1.4925 (2.7147) grad_norm 5.5839 (5.4261) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1000/1251] eta 0:01:01 lr 0.000052 wd 0.0500 time 0.2410 (0.2440) data time 0.0010 (0.0018) model time 0.2401 (0.2421) loss 2.7224 (2.7163) grad_norm 5.1896 (5.4191) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1010/1251] eta 0:00:58 lr 0.000052 wd 0.0500 time 0.2374 (0.2440) data time 0.0008 (0.0017) model time 0.2365 (0.2421) loss 3.5858 (2.7181) grad_norm 11.0965 (5.4302) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1020/1251] eta 0:00:56 lr 0.000052 wd 0.0500 time 0.2382 (0.2440) data time 0.0010 (0.0017) model time 0.2371 (0.2421) loss 2.2797 (2.7178) grad_norm 6.9587 (5.4268) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1030/1251] eta 0:00:53 lr 0.000052 wd 0.0500 time 0.2485 (0.2439) data time 0.0007 (0.0017) model time 0.2477 (0.2420) loss 2.9986 (2.7200) grad_norm 4.2072 (5.4299) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1040/1251] eta 0:00:51 lr 0.000052 wd 0.0500 time 0.2441 (0.2439) data time 0.0008 (0.0017) model time 0.2432 (0.2420) loss 2.6248 (2.7194) grad_norm 5.2091 (5.4328) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1050/1251] eta 0:00:49 lr 0.000052 wd 0.0500 time 0.2359 (0.2439) data time 0.0010 (0.0017) model time 0.2349 (0.2420) loss 2.8954 (2.7203) grad_norm 7.0453 (5.4386) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [262/300][1060/1251] eta 0:00:46 lr 0.000052 wd 0.0500 time 0.2375 (0.2439) data time 0.0009 (0.0017) model time 0.2366 (0.2420) loss 2.5692 (2.7212) grad_norm 4.4337 (5.4348) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1070/1251] eta 0:00:44 lr 0.000052 wd 0.0500 time 0.2228 (0.2440) data time 0.0007 (0.0017) model time 0.2221 (0.2421) loss 3.4106 (2.7217) grad_norm 4.3798 (5.4341) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1080/1251] eta 0:00:41 lr 0.000052 wd 0.0500 time 0.2378 (0.2439) data time 0.0009 (0.0017) model time 0.2369 (0.2420) loss 3.1901 (2.7234) grad_norm 5.1087 (5.4233) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1090/1251] eta 0:00:39 lr 0.000052 wd 0.0500 time 0.2391 (0.2439) data time 0.0010 (0.0017) model time 0.2381 (0.2420) loss 2.8525 (2.7235) grad_norm 4.5680 (5.4141) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1100/1251] eta 0:00:36 lr 0.000052 wd 0.0500 time 0.2347 (0.2438) data time 0.0007 (0.0017) model time 0.2339 (0.2420) loss 1.8552 (2.7210) grad_norm 5.0376 (5.4090) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1110/1251] eta 0:00:34 lr 0.000052 wd 0.0500 time 0.2385 (0.2438) data time 0.0009 (0.0017) model time 0.2376 (0.2420) loss 3.0277 (2.7199) grad_norm 5.1301 (5.4059) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1120/1251] eta 0:00:31 lr 0.000052 wd 0.0500 time 0.2391 (0.2438) data time 0.0007 (0.0017) model time 0.2384 (0.2420) loss 3.5137 (2.7194) grad_norm 4.0916 (5.3951) loss_scale 
256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1130/1251] eta 0:00:29 lr 0.000052 wd 0.0500 time 0.2399 (0.2438) data time 0.0009 (0.0017) model time 0.2389 (0.2420) loss 2.7288 (2.7179) grad_norm 4.7436 (5.3894) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1140/1251] eta 0:00:27 lr 0.000052 wd 0.0500 time 0.2411 (0.2438) data time 0.0010 (0.0017) model time 0.2401 (0.2419) loss 2.8313 (2.7200) grad_norm 7.6588 (5.4031) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1150/1251] eta 0:00:24 lr 0.000052 wd 0.0500 time 0.2348 (0.2437) data time 0.0009 (0.0017) model time 0.2339 (0.2419) loss 3.4790 (2.7233) grad_norm 5.3709 (5.4242) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1160/1251] eta 0:00:22 lr 0.000052 wd 0.0500 time 0.2437 (0.2437) data time 0.0010 (0.0017) model time 0.2427 (0.2419) loss 3.3432 (2.7232) grad_norm 3.0123 (5.4265) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1170/1251] eta 0:00:19 lr 0.000052 wd 0.0500 time 0.2413 (0.2437) data time 0.0008 (0.0016) model time 0.2405 (0.2419) loss 2.0682 (2.7234) grad_norm 6.9083 (5.4469) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1180/1251] eta 0:00:17 lr 0.000052 wd 0.0500 time 0.2472 (0.2437) data time 0.0007 (0.0016) model time 0.2465 (0.2419) loss 3.2486 (2.7235) grad_norm 4.2532 (5.4424) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1190/1251] eta 0:00:14 lr 0.000052 wd 0.0500 time 0.2459 (0.2437) data time 0.0007 
(0.0016) model time 0.2452 (0.2419) loss 1.8030 (2.7234) grad_norm 4.2925 (5.4417) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1200/1251] eta 0:00:12 lr 0.000052 wd 0.0500 time 0.2428 (0.2436) data time 0.0009 (0.0016) model time 0.2418 (0.2418) loss 2.1049 (2.7249) grad_norm 4.2594 (5.4313) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1210/1251] eta 0:00:09 lr 0.000052 wd 0.0500 time 0.2346 (0.2436) data time 0.0010 (0.0016) model time 0.2336 (0.2418) loss 2.6451 (2.7242) grad_norm 3.9881 (5.4242) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1220/1251] eta 0:00:07 lr 0.000052 wd 0.0500 time 0.2446 (0.2436) data time 0.0011 (0.0016) model time 0.2434 (0.2418) loss 2.4794 (2.7244) grad_norm 4.8116 (5.4149) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1230/1251] eta 0:00:05 lr 0.000052 wd 0.0500 time 0.2470 (0.2436) data time 0.0007 (0.0016) model time 0.2464 (0.2418) loss 2.9338 (2.7258) grad_norm 7.4595 (5.4144) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1240/1251] eta 0:00:02 lr 0.000052 wd 0.0500 time 0.2305 (0.2435) data time 0.0007 (0.0016) model time 0.2298 (0.2418) loss 3.1960 (2.7259) grad_norm 18.5172 (5.4211) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [262/300][1250/1251] eta 0:00:00 lr 0.000052 wd 0.0500 time 0.2247 (0.2434) data time 0.0005 (0.0016) model time 0.2242 (0.2416) loss 2.3250 (2.7251) grad_norm 3.6699 (5.4162) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 
262 training takes 0:05:04 [2024-09-01 08:43:17 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:43:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 08:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.421 (0.421) Loss 0.3931 (0.3931) Acc@1 92.578 (92.578) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 08:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.108) Loss 0.5762 (0.6172) Acc@1 90.039 (87.349) Acc@5 97.461 (97.594) Mem 7381MB [2024-09-01 08:43:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.095) Loss 0.9404 (0.6512) Acc@1 77.246 (86.207) Acc@5 95.020 (97.540) Mem 7381MB [2024-09-01 08:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.089) Loss 1.1719 (0.7443) Acc@1 73.828 (83.997) Acc@5 91.992 (96.569) Mem 7381MB [2024-09-01 08:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.083) Loss 1.0459 (0.7938) Acc@1 77.246 (82.812) Acc@5 94.141 (96.082) Mem 7381MB [2024-09-01 08:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.378 Acc@5 96.044 [2024-09-01 08:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.4% [2024-09-01 08:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.767 (0.767) Loss 0.3828 (0.3828) Acc@1 93.164 (93.164) Acc@5 98.926 (98.926) Mem 7381MB [2024-09-01 08:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.145) Loss 0.5654 (0.6022) Acc@1 90.625 (87.855) Acc@5 97.949 (97.834) Mem 7381MB [2024-09-01 08:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.113) Loss 0.8979 (0.6321) Acc@1 78.125 (86.654) Acc@5 95.898 
(97.731) Mem 7381MB [2024-09-01 08:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.091 (0.103) Loss 1.1152 (0.7216) Acc@1 74.121 (84.454) Acc@5 93.262 (96.856) Mem 7381MB [2024-09-01 08:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 0.9951 (0.7684) Acc@1 77.344 (83.320) Acc@5 94.141 (96.353) Mem 7381MB [2024-09-01 08:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.940 Acc@5 96.300 [2024-09-01 08:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 08:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][0/1251] eta 0:22:10 lr 0.000052 wd 0.0500 time 1.0632 (1.0632) data time 0.5249 (0.5249) model time 0.0000 (0.0000) loss 3.0883 (3.0883) grad_norm 4.0180 (4.0180) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][10/1251] eta 0:06:35 lr 0.000052 wd 0.0500 time 0.2425 (0.3190) data time 0.0010 (0.0486) model time 0.0000 (0.0000) loss 2.4597 (2.5723) grad_norm 5.0024 (4.7937) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][20/1251] eta 0:05:47 lr 0.000052 wd 0.0500 time 0.2492 (0.2824) data time 0.0010 (0.0260) model time 0.0000 (0.0000) loss 2.7979 (2.6713) grad_norm 3.6084 (4.4264) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][30/1251] eta 0:05:28 lr 0.000052 wd 0.0500 time 0.2388 (0.2690) data time 0.0009 (0.0179) model time 0.0000 (0.0000) loss 2.9464 (2.6825) grad_norm 8.6299 (6.1087) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][40/1251] eta 0:05:17 lr 0.000052 wd 0.0500 time 0.2464 (0.2623) data time 0.0007 (0.0138) model time 
0.0000 (0.0000) loss 3.0288 (2.7045) grad_norm 4.1608 (5.9903) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][50/1251] eta 0:05:09 lr 0.000052 wd 0.0500 time 0.2404 (0.2578) data time 0.0010 (0.0113) model time 0.0000 (0.0000) loss 3.2620 (2.6450) grad_norm 3.8270 (5.6706) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][60/1251] eta 0:05:03 lr 0.000052 wd 0.0500 time 0.2330 (0.2549) data time 0.0010 (0.0096) model time 0.2319 (0.2394) loss 3.0814 (2.6358) grad_norm 6.4907 (5.5970) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][70/1251] eta 0:04:58 lr 0.000052 wd 0.0500 time 0.2344 (0.2531) data time 0.0007 (0.0084) model time 0.2337 (0.2403) loss 3.2827 (2.6433) grad_norm 7.4443 (5.6274) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][80/1251] eta 0:04:54 lr 0.000052 wd 0.0500 time 0.2460 (0.2515) data time 0.0007 (0.0074) model time 0.2452 (0.2398) loss 3.5102 (2.6665) grad_norm 8.0852 (5.5981) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][90/1251] eta 0:04:51 lr 0.000052 wd 0.0500 time 0.2551 (0.2507) data time 0.0008 (0.0067) model time 0.2544 (0.2409) loss 1.7715 (2.6569) grad_norm 12.3690 (5.6368) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][100/1251] eta 0:04:47 lr 0.000052 wd 0.0500 time 0.2346 (0.2497) data time 0.0008 (0.0061) model time 0.2337 (0.2406) loss 3.2414 (2.6636) grad_norm 5.9957 (5.5541) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][110/1251] eta 
0:04:48 lr 0.000052 wd 0.0500 time 0.2413 (0.2529) data time 0.0009 (0.0057) model time 0.2404 (0.2478) loss 3.2048 (2.6802) grad_norm 7.0385 (5.5185) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][120/1251] eta 0:04:44 lr 0.000052 wd 0.0500 time 0.2420 (0.2520) data time 0.0010 (0.0053) model time 0.2411 (0.2468) loss 3.1436 (2.7056) grad_norm 4.5400 (5.6179) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][130/1251] eta 0:04:41 lr 0.000052 wd 0.0500 time 0.2394 (0.2512) data time 0.0007 (0.0050) model time 0.2386 (0.2460) loss 1.5312 (2.6943) grad_norm 3.6254 (5.5581) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][140/1251] eta 0:04:38 lr 0.000052 wd 0.0500 time 0.2412 (0.2505) data time 0.0011 (0.0047) model time 0.2401 (0.2455) loss 2.9163 (2.7066) grad_norm 3.7815 (5.4490) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][150/1251] eta 0:04:35 lr 0.000052 wd 0.0500 time 0.2411 (0.2499) data time 0.0010 (0.0044) model time 0.2402 (0.2449) loss 3.0926 (2.7066) grad_norm 4.1688 (5.3897) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][160/1251] eta 0:04:32 lr 0.000052 wd 0.0500 time 0.2436 (0.2493) data time 0.0011 (0.0042) model time 0.2426 (0.2445) loss 2.8917 (2.7083) grad_norm 4.5318 (5.3622) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][170/1251] eta 0:04:29 lr 0.000052 wd 0.0500 time 0.2469 (0.2489) data time 0.0010 (0.0040) model time 0.2459 (0.2441) loss 2.8413 (2.7040) grad_norm 4.8270 (5.3190) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
08:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][180/1251] eta 0:04:26 lr 0.000052 wd 0.0500 time 0.2424 (0.2485) data time 0.0012 (0.0039) model time 0.2412 (0.2439) loss 3.1606 (2.7069) grad_norm 5.8867 (5.3460) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][190/1251] eta 0:04:23 lr 0.000052 wd 0.0500 time 0.2420 (0.2480) data time 0.0012 (0.0037) model time 0.2409 (0.2435) loss 3.0992 (2.7084) grad_norm 4.2041 (5.3188) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][200/1251] eta 0:04:20 lr 0.000052 wd 0.0500 time 0.2389 (0.2477) data time 0.0012 (0.0036) model time 0.2378 (0.2432) loss 2.1886 (2.7191) grad_norm 6.8184 (5.3799) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][210/1251] eta 0:04:17 lr 0.000052 wd 0.0500 time 0.2371 (0.2473) data time 0.0012 (0.0035) model time 0.2359 (0.2430) loss 1.9302 (2.7078) grad_norm 5.8795 (5.3965) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][220/1251] eta 0:04:14 lr 0.000052 wd 0.0500 time 0.2449 (0.2471) data time 0.0010 (0.0034) model time 0.2439 (0.2429) loss 3.0096 (2.7184) grad_norm 6.0897 (5.3605) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][230/1251] eta 0:04:12 lr 0.000052 wd 0.0500 time 0.2484 (0.2469) data time 0.0007 (0.0033) model time 0.2477 (0.2428) loss 2.4519 (2.7207) grad_norm 4.5124 (5.3316) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][240/1251] eta 0:04:09 lr 0.000052 wd 0.0500 time 0.2468 (0.2468) data time 0.0010 (0.0032) model time 0.2457 (0.2428) loss 2.4321 
(2.7311) grad_norm 5.4237 (5.3609) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][250/1251] eta 0:04:06 lr 0.000052 wd 0.0500 time 0.2374 (0.2465) data time 0.0007 (0.0031) model time 0.2367 (0.2426) loss 2.4070 (2.7319) grad_norm 5.5157 (5.3531) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][260/1251] eta 0:04:04 lr 0.000052 wd 0.0500 time 0.2426 (0.2462) data time 0.0010 (0.0030) model time 0.2417 (0.2424) loss 2.4522 (2.7291) grad_norm 4.4642 (5.3765) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][270/1251] eta 0:04:01 lr 0.000052 wd 0.0500 time 0.2324 (0.2465) data time 0.0009 (0.0029) model time 0.2315 (0.2429) loss 3.1851 (2.7230) grad_norm 3.5638 (5.3406) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][280/1251] eta 0:03:59 lr 0.000052 wd 0.0500 time 0.2395 (0.2463) data time 0.0007 (0.0029) model time 0.2388 (0.2427) loss 2.8643 (2.7186) grad_norm 6.1392 (5.3235) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][290/1251] eta 0:03:56 lr 0.000052 wd 0.0500 time 0.2414 (0.2462) data time 0.0009 (0.0028) model time 0.2405 (0.2427) loss 3.1112 (2.7283) grad_norm 21.3413 (5.3498) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][300/1251] eta 0:03:54 lr 0.000052 wd 0.0500 time 0.2403 (0.2461) data time 0.0011 (0.0027) model time 0.2393 (0.2427) loss 2.9433 (2.7370) grad_norm 5.1024 (5.3275) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][310/1251] eta 0:03:52 lr 0.000051 wd 
0.0500 time 0.2158 (0.2466) data time 0.0009 (0.0027) model time 0.2150 (0.2434) loss 3.3404 (2.7451) grad_norm 8.0636 (5.3471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][320/1251] eta 0:03:49 lr 0.000051 wd 0.0500 time 0.2402 (0.2464) data time 0.0011 (0.0026) model time 0.2391 (0.2432) loss 3.0303 (2.7426) grad_norm 4.6979 (5.3252) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:44:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][330/1251] eta 0:03:46 lr 0.000051 wd 0.0500 time 0.2405 (0.2462) data time 0.0010 (0.0026) model time 0.2395 (0.2430) loss 3.0333 (2.7421) grad_norm 4.6705 (5.3031) loss_scale 512.0000 (262.9607) mem 7381MB [2024-09-01 08:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][340/1251] eta 0:03:44 lr 0.000051 wd 0.0500 time 0.2429 (0.2459) data time 0.0007 (0.0025) model time 0.2422 (0.2428) loss 2.9740 (2.7506) grad_norm 4.6176 (5.3006) loss_scale 512.0000 (270.2639) mem 7381MB [2024-09-01 08:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][350/1251] eta 0:03:41 lr 0.000051 wd 0.0500 time 0.2331 (0.2457) data time 0.0010 (0.0025) model time 0.2321 (0.2426) loss 3.1603 (2.7480) grad_norm 6.4979 (5.2751) loss_scale 512.0000 (277.1510) mem 7381MB [2024-09-01 08:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][360/1251] eta 0:03:38 lr 0.000051 wd 0.0500 time 0.2373 (0.2456) data time 0.0010 (0.0025) model time 0.2363 (0.2426) loss 2.9896 (2.7497) grad_norm 3.6981 (5.2671) loss_scale 512.0000 (283.6565) mem 7381MB [2024-09-01 08:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][370/1251] eta 0:03:36 lr 0.000051 wd 0.0500 time 0.2374 (0.2455) data time 0.0010 (0.0024) model time 0.2364 (0.2425) loss 2.7437 (2.7484) grad_norm 4.6595 (5.2552) loss_scale 512.0000 (289.8113) mem 7381MB [2024-09-01 08:45:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [263/300][380/1251] eta 0:03:33 lr 0.000051 wd 0.0500 time 0.2431 (0.2454) data time 0.0007 (0.0024) model time 0.2424 (0.2425) loss 3.1230 (2.7470) grad_norm 4.1757 (5.2582) loss_scale 512.0000 (295.6430) mem 7381MB [2024-09-01 08:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][390/1251] eta 0:03:31 lr 0.000051 wd 0.0500 time 0.2436 (0.2454) data time 0.0011 (0.0023) model time 0.2425 (0.2425) loss 2.9062 (2.7467) grad_norm 3.9504 (5.2262) loss_scale 512.0000 (301.1765) mem 7381MB [2024-09-01 08:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][400/1251] eta 0:03:28 lr 0.000051 wd 0.0500 time 0.2383 (0.2453) data time 0.0009 (0.0023) model time 0.2374 (0.2424) loss 2.7676 (2.7504) grad_norm 4.4776 (5.2501) loss_scale 512.0000 (306.4339) mem 7381MB [2024-09-01 08:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][410/1251] eta 0:03:26 lr 0.000051 wd 0.0500 time 0.2344 (0.2451) data time 0.0011 (0.0023) model time 0.2333 (0.2423) loss 3.0967 (2.7506) grad_norm 3.2902 (5.2369) loss_scale 512.0000 (311.4355) mem 7381MB [2024-09-01 08:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][420/1251] eta 0:03:23 lr 0.000051 wd 0.0500 time 0.2372 (0.2450) data time 0.0011 (0.0023) model time 0.2360 (0.2423) loss 2.7431 (2.7459) grad_norm 5.1825 (5.2170) loss_scale 512.0000 (316.1995) mem 7381MB [2024-09-01 08:45:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][430/1251] eta 0:03:21 lr 0.000051 wd 0.0500 time 0.2474 (0.2450) data time 0.0009 (0.0022) model time 0.2464 (0.2423) loss 3.0932 (2.7460) grad_norm 4.7689 (5.2130) loss_scale 512.0000 (320.7425) mem 7381MB [2024-09-01 08:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][440/1251] eta 0:03:18 lr 0.000051 wd 0.0500 time 0.2418 (0.2449) data time 0.0009 (0.0022) model time 0.2410 (0.2422) loss 3.2085 (2.7478) grad_norm 7.9136 
(5.2125) loss_scale 512.0000 (325.0794) mem 7381MB [2024-09-01 08:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][450/1251] eta 0:03:16 lr 0.000051 wd 0.0500 time 0.2403 (0.2448) data time 0.0009 (0.0022) model time 0.2393 (0.2422) loss 2.4601 (2.7468) grad_norm 6.8783 (5.2130) loss_scale 512.0000 (329.2239) mem 7381MB [2024-09-01 08:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][460/1251] eta 0:03:13 lr 0.000051 wd 0.0500 time 0.2484 (0.2448) data time 0.0009 (0.0021) model time 0.2475 (0.2421) loss 3.1740 (2.7461) grad_norm 3.8812 (5.2071) loss_scale 512.0000 (333.1887) mem 7381MB [2024-09-01 08:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][470/1251] eta 0:03:11 lr 0.000051 wd 0.0500 time 0.2433 (0.2447) data time 0.0008 (0.0021) model time 0.2426 (0.2421) loss 3.3045 (2.7463) grad_norm 5.5189 (5.2164) loss_scale 512.0000 (336.9851) mem 7381MB [2024-09-01 08:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][480/1251] eta 0:03:08 lr 0.000051 wd 0.0500 time 0.2432 (0.2446) data time 0.0009 (0.0021) model time 0.2423 (0.2421) loss 2.8091 (2.7469) grad_norm 5.9271 (5.2211) loss_scale 512.0000 (340.6237) mem 7381MB [2024-09-01 08:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][490/1251] eta 0:03:06 lr 0.000051 wd 0.0500 time 0.2452 (0.2446) data time 0.0007 (0.0021) model time 0.2445 (0.2421) loss 2.1644 (2.7451) grad_norm 3.5039 (5.2250) loss_scale 512.0000 (344.1141) mem 7381MB [2024-09-01 08:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][500/1251] eta 0:03:03 lr 0.000051 wd 0.0500 time 0.2462 (0.2445) data time 0.0010 (0.0021) model time 0.2453 (0.2420) loss 2.6582 (2.7425) grad_norm 4.4291 (5.2290) loss_scale 512.0000 (347.4651) mem 7381MB [2024-09-01 08:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][510/1251] eta 0:03:01 lr 0.000051 wd 0.0500 time 0.2354 (0.2445) data 
time 0.0011 (0.0020) model time 0.2343 (0.2420) loss 3.1787 (2.7410) grad_norm 6.3649 (5.2289) loss_scale 512.0000 (350.6849) mem 7381MB [2024-09-01 08:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][520/1251] eta 0:02:58 lr 0.000051 wd 0.0500 time 0.2380 (0.2444) data time 0.0007 (0.0020) model time 0.2373 (0.2420) loss 3.3754 (2.7416) grad_norm 4.5026 (5.2230) loss_scale 512.0000 (353.7812) mem 7381MB [2024-09-01 08:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][530/1251] eta 0:02:56 lr 0.000051 wd 0.0500 time 0.2423 (0.2444) data time 0.0011 (0.0020) model time 0.2412 (0.2419) loss 3.1605 (2.7467) grad_norm 4.2544 (5.2132) loss_scale 512.0000 (356.7608) mem 7381MB [2024-09-01 08:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][540/1251] eta 0:02:53 lr 0.000051 wd 0.0500 time 0.2511 (0.2443) data time 0.0009 (0.0020) model time 0.2503 (0.2419) loss 3.0851 (2.7435) grad_norm 4.5460 (5.2135) loss_scale 512.0000 (359.6303) mem 7381MB [2024-09-01 08:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][550/1251] eta 0:02:51 lr 0.000051 wd 0.0500 time 0.2472 (0.2442) data time 0.0009 (0.0020) model time 0.2463 (0.2419) loss 2.9086 (2.7452) grad_norm 4.1379 (5.2081) loss_scale 512.0000 (362.3956) mem 7381MB [2024-09-01 08:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][560/1251] eta 0:02:48 lr 0.000051 wd 0.0500 time 0.2406 (0.2442) data time 0.0009 (0.0019) model time 0.2396 (0.2418) loss 2.6278 (2.7406) grad_norm 4.4486 (5.2109) loss_scale 512.0000 (365.0624) mem 7381MB [2024-09-01 08:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][570/1251] eta 0:02:46 lr 0.000051 wd 0.0500 time 0.2444 (0.2441) data time 0.0007 (0.0019) model time 0.2437 (0.2418) loss 3.0477 (2.7428) grad_norm 4.0214 (5.2164) loss_scale 512.0000 (367.6357) mem 7381MB [2024-09-01 08:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [263/300][580/1251] eta 0:02:43 lr 0.000051 wd 0.0500 time 0.2388 (0.2441) data time 0.0010 (0.0019) model time 0.2378 (0.2417) loss 2.9780 (2.7422) grad_norm 5.7063 (5.2108) loss_scale 512.0000 (370.1205) mem 7381MB [2024-09-01 08:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][590/1251] eta 0:02:41 lr 0.000051 wd 0.0500 time 0.2527 (0.2440) data time 0.0009 (0.0019) model time 0.2517 (0.2417) loss 2.2372 (2.7447) grad_norm 4.3769 (5.2026) loss_scale 512.0000 (372.5212) mem 7381MB [2024-09-01 08:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][600/1251] eta 0:02:38 lr 0.000051 wd 0.0500 time 0.2410 (0.2439) data time 0.0008 (0.0019) model time 0.2401 (0.2416) loss 2.1056 (2.7405) grad_norm 6.8799 (5.2075) loss_scale 512.0000 (374.8419) mem 7381MB [2024-09-01 08:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][610/1251] eta 0:02:36 lr 0.000051 wd 0.0500 time 0.2453 (0.2439) data time 0.0009 (0.0019) model time 0.2444 (0.2416) loss 2.5481 (2.7382) grad_norm 4.6853 (5.2023) loss_scale 512.0000 (377.0867) mem 7381MB [2024-09-01 08:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][620/1251] eta 0:02:33 lr 0.000051 wd 0.0500 time 0.2442 (0.2439) data time 0.0009 (0.0019) model time 0.2433 (0.2416) loss 3.0670 (2.7369) grad_norm 5.8668 (5.1977) loss_scale 512.0000 (379.2593) mem 7381MB [2024-09-01 08:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][630/1251] eta 0:02:31 lr 0.000051 wd 0.0500 time 0.2463 (0.2438) data time 0.0011 (0.0018) model time 0.2452 (0.2416) loss 2.4312 (2.7363) grad_norm 4.8859 (5.2122) loss_scale 512.0000 (381.3629) mem 7381MB [2024-09-01 08:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][640/1251] eta 0:02:28 lr 0.000051 wd 0.0500 time 0.2406 (0.2438) data time 0.0010 (0.0018) model time 0.2396 (0.2416) loss 2.5392 (2.7376) grad_norm 3.4344 (5.2033) loss_scale 512.0000 
(383.4009) mem 7381MB [2024-09-01 08:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][650/1251] eta 0:02:26 lr 0.000051 wd 0.0500 time 0.2390 (0.2438) data time 0.0009 (0.0018) model time 0.2381 (0.2416) loss 1.8501 (2.7345) grad_norm 6.5251 (5.2167) loss_scale 512.0000 (385.3763) mem 7381MB [2024-09-01 08:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][660/1251] eta 0:02:24 lr 0.000051 wd 0.0500 time 0.2378 (0.2438) data time 0.0010 (0.0018) model time 0.2367 (0.2416) loss 2.4538 (2.7369) grad_norm 6.8177 (5.2174) loss_scale 512.0000 (387.2920) mem 7381MB [2024-09-01 08:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][670/1251] eta 0:02:21 lr 0.000051 wd 0.0500 time 0.2427 (0.2438) data time 0.0009 (0.0018) model time 0.2418 (0.2416) loss 3.3957 (2.7390) grad_norm 4.6143 (5.2064) loss_scale 512.0000 (389.1505) mem 7381MB [2024-09-01 08:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][680/1251] eta 0:02:19 lr 0.000051 wd 0.0500 time 0.2584 (0.2438) data time 0.0008 (0.0018) model time 0.2576 (0.2416) loss 3.6563 (2.7374) grad_norm 3.4526 (5.1998) loss_scale 512.0000 (390.9545) mem 7381MB [2024-09-01 08:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][690/1251] eta 0:02:16 lr 0.000051 wd 0.0500 time 0.2452 (0.2437) data time 0.0011 (0.0018) model time 0.2441 (0.2416) loss 2.9907 (2.7357) grad_norm 3.7628 (5.1906) loss_scale 512.0000 (392.7062) mem 7381MB [2024-09-01 08:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][700/1251] eta 0:02:14 lr 0.000051 wd 0.0500 time 0.2342 (0.2437) data time 0.0011 (0.0018) model time 0.2332 (0.2416) loss 2.6487 (2.7395) grad_norm 4.1535 (5.1788) loss_scale 512.0000 (394.4080) mem 7381MB [2024-09-01 08:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][710/1251] eta 0:02:11 lr 0.000051 wd 0.0500 time 0.2420 (0.2436) data time 0.0011 (0.0017) model 
time 0.2410 (0.2415) loss 3.1433 (2.7419) grad_norm 3.6394 (5.1708) loss_scale 512.0000 (396.0619) mem 7381MB [2024-09-01 08:46:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][720/1251] eta 0:02:09 lr 0.000051 wd 0.0500 time 0.2389 (0.2436) data time 0.0007 (0.0017) model time 0.2382 (0.2415) loss 1.7490 (2.7410) grad_norm 5.0056 (5.1664) loss_scale 512.0000 (397.6699) mem 7381MB [2024-09-01 08:46:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][730/1251] eta 0:02:06 lr 0.000051 wd 0.0500 time 0.2450 (0.2435) data time 0.0009 (0.0017) model time 0.2442 (0.2415) loss 2.4894 (2.7415) grad_norm 4.8445 (5.1582) loss_scale 512.0000 (399.2339) mem 7381MB [2024-09-01 08:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][740/1251] eta 0:02:04 lr 0.000051 wd 0.0500 time 0.2408 (0.2435) data time 0.0007 (0.0017) model time 0.2401 (0.2414) loss 2.4244 (2.7390) grad_norm 3.7129 (5.1509) loss_scale 512.0000 (400.7557) mem 7381MB [2024-09-01 08:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][750/1251] eta 0:02:01 lr 0.000051 wd 0.0500 time 0.2360 (0.2435) data time 0.0008 (0.0017) model time 0.2352 (0.2415) loss 2.9006 (2.7380) grad_norm 5.4164 (5.1535) loss_scale 512.0000 (402.2370) mem 7381MB [2024-09-01 08:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][760/1251] eta 0:01:59 lr 0.000051 wd 0.0500 time 0.2484 (0.2435) data time 0.0007 (0.0017) model time 0.2477 (0.2414) loss 2.7373 (2.7358) grad_norm 3.7575 (5.1534) loss_scale 512.0000 (403.6794) mem 7381MB [2024-09-01 08:46:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][770/1251] eta 0:01:57 lr 0.000051 wd 0.0500 time 0.2481 (0.2434) data time 0.0007 (0.0017) model time 0.2475 (0.2414) loss 3.1051 (2.7360) grad_norm 4.6321 (5.1540) loss_scale 512.0000 (405.0843) mem 7381MB [2024-09-01 08:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][780/1251] 
eta 0:01:54 lr 0.000051 wd 0.0500 time 0.2439 (0.2434) data time 0.0009 (0.0017) model time 0.2430 (0.2414) loss 3.0371 (2.7367) grad_norm 5.7071 (5.1506) loss_scale 512.0000 (406.4533) mem 7381MB [2024-09-01 08:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][790/1251] eta 0:01:52 lr 0.000051 wd 0.0500 time 0.4389 (0.2436) data time 0.0012 (0.0017) model time 0.4377 (0.2417) loss 3.1107 (2.7396) grad_norm 4.0877 (5.1553) loss_scale 512.0000 (407.7876) mem 7381MB [2024-09-01 08:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][800/1251] eta 0:01:49 lr 0.000051 wd 0.0500 time 0.2343 (0.2436) data time 0.0011 (0.0017) model time 0.2332 (0.2416) loss 2.7708 (2.7388) grad_norm 4.2443 (5.1618) loss_scale 512.0000 (409.0886) mem 7381MB [2024-09-01 08:46:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][810/1251] eta 0:01:47 lr 0.000051 wd 0.0500 time 0.2426 (0.2435) data time 0.0009 (0.0017) model time 0.2416 (0.2416) loss 2.9375 (2.7396) grad_norm 5.4835 (5.1661) loss_scale 512.0000 (410.3576) mem 7381MB [2024-09-01 08:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][820/1251] eta 0:01:44 lr 0.000051 wd 0.0500 time 0.2468 (0.2435) data time 0.0009 (0.0017) model time 0.2458 (0.2416) loss 2.8097 (2.7368) grad_norm 5.5692 (5.1671) loss_scale 512.0000 (411.5956) mem 7381MB [2024-09-01 08:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][830/1251] eta 0:01:42 lr 0.000051 wd 0.0500 time 0.2423 (0.2435) data time 0.0008 (0.0016) model time 0.2416 (0.2415) loss 3.7087 (2.7371) grad_norm 4.4693 (5.1688) loss_scale 512.0000 (412.8039) mem 7381MB [2024-09-01 08:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][840/1251] eta 0:01:40 lr 0.000051 wd 0.0500 time 0.2312 (0.2434) data time 0.0011 (0.0016) model time 0.2301 (0.2415) loss 2.5684 (2.7358) grad_norm 6.0436 (5.1723) loss_scale 512.0000 (413.9834) mem 7381MB [2024-09-01 
08:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][850/1251] eta 0:01:37 lr 0.000051 wd 0.0500 time 0.2368 (0.2434) data time 0.0007 (0.0016) model time 0.2361 (0.2415) loss 2.3984 (2.7345) grad_norm 4.2523 (5.2448) loss_scale 512.0000 (415.1351) mem 7381MB [2024-09-01 08:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][860/1251] eta 0:01:35 lr 0.000051 wd 0.0500 time 0.2452 (0.2434) data time 0.0007 (0.0016) model time 0.2445 (0.2415) loss 3.4174 (2.7378) grad_norm 4.9175 (5.2535) loss_scale 512.0000 (416.2602) mem 7381MB [2024-09-01 08:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][870/1251] eta 0:01:32 lr 0.000051 wd 0.0500 time 0.2349 (0.2434) data time 0.0008 (0.0016) model time 0.2341 (0.2415) loss 2.8286 (2.7385) grad_norm 4.5599 (5.2548) loss_scale 512.0000 (417.3594) mem 7381MB [2024-09-01 08:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][880/1251] eta 0:01:30 lr 0.000050 wd 0.0500 time 0.2407 (0.2434) data time 0.0011 (0.0016) model time 0.2396 (0.2415) loss 2.6133 (2.7372) grad_norm 3.9601 (5.2413) loss_scale 512.0000 (418.4336) mem 7381MB [2024-09-01 08:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][890/1251] eta 0:01:27 lr 0.000050 wd 0.0500 time 0.2374 (0.2433) data time 0.0009 (0.0016) model time 0.2365 (0.2414) loss 2.2879 (2.7388) grad_norm 4.4051 (5.2335) loss_scale 512.0000 (419.4837) mem 7381MB [2024-09-01 08:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][900/1251] eta 0:01:25 lr 0.000050 wd 0.0500 time 0.2408 (0.2433) data time 0.0008 (0.0016) model time 0.2400 (0.2414) loss 1.8126 (2.7373) grad_norm 3.4876 (5.2239) loss_scale 512.0000 (420.5105) mem 7381MB [2024-09-01 08:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][910/1251] eta 0:01:22 lr 0.000050 wd 0.0500 time 0.2337 (0.2432) data time 0.0007 (0.0016) model time 0.2330 (0.2414) loss 3.0311 
(2.7364) grad_norm 4.7026 (5.2291) loss_scale 512.0000 (421.5148) mem 7381MB [2024-09-01 08:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][920/1251] eta 0:01:20 lr 0.000050 wd 0.0500 time 0.2429 (0.2432) data time 0.0007 (0.0016) model time 0.2422 (0.2413) loss 2.5781 (2.7375) grad_norm 5.9997 (inf) loss_scale 256.0000 (419.9957) mem 7381MB [2024-09-01 08:47:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][930/1251] eta 0:01:18 lr 0.000050 wd 0.0500 time 0.2457 (0.2432) data time 0.0009 (0.0016) model time 0.2448 (0.2413) loss 2.4219 (2.7361) grad_norm 8.3279 (inf) loss_scale 256.0000 (418.2342) mem 7381MB [2024-09-01 08:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][940/1251] eta 0:01:15 lr 0.000050 wd 0.0500 time 0.2441 (0.2432) data time 0.0008 (0.0016) model time 0.2433 (0.2413) loss 2.2541 (2.7358) grad_norm 5.0010 (inf) loss_scale 256.0000 (416.5101) mem 7381MB [2024-09-01 08:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][950/1251] eta 0:01:13 lr 0.000050 wd 0.0500 time 0.2414 (0.2432) data time 0.0012 (0.0016) model time 0.2402 (0.2413) loss 2.1972 (2.7346) grad_norm 5.3814 (inf) loss_scale 256.0000 (414.8223) mem 7381MB [2024-09-01 08:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][960/1251] eta 0:01:10 lr 0.000050 wd 0.0500 time 0.2444 (0.2432) data time 0.0009 (0.0016) model time 0.2435 (0.2413) loss 2.6503 (2.7321) grad_norm 10.6776 (inf) loss_scale 256.0000 (413.1696) mem 7381MB [2024-09-01 08:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][970/1251] eta 0:01:08 lr 0.000050 wd 0.0500 time 0.2442 (0.2432) data time 0.0009 (0.0016) model time 0.2433 (0.2413) loss 2.6857 (2.7326) grad_norm 5.9450 (inf) loss_scale 256.0000 (411.5510) mem 7381MB [2024-09-01 08:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][980/1251] eta 0:01:05 lr 0.000050 wd 0.0500 time 0.2548 
(0.2431) data time 0.0007 (0.0015) model time 0.2541 (0.2413) loss 3.2499 (2.7342) grad_norm 6.5839 (inf) loss_scale 256.0000 (409.9653) mem 7381MB [2024-09-01 08:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][990/1251] eta 0:01:03 lr 0.000050 wd 0.0500 time 0.2406 (0.2431) data time 0.0010 (0.0015) model time 0.2396 (0.2413) loss 3.0434 (2.7357) grad_norm 9.4194 (inf) loss_scale 256.0000 (408.4117) mem 7381MB [2024-09-01 08:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1000/1251] eta 0:01:01 lr 0.000050 wd 0.0500 time 0.2399 (0.2431) data time 0.0008 (0.0015) model time 0.2391 (0.2413) loss 2.6982 (2.7383) grad_norm 4.3726 (inf) loss_scale 256.0000 (406.8891) mem 7381MB [2024-09-01 08:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1010/1251] eta 0:00:58 lr 0.000050 wd 0.0500 time 0.2450 (0.2431) data time 0.0009 (0.0015) model time 0.2441 (0.2413) loss 2.9581 (2.7405) grad_norm 4.3448 (inf) loss_scale 256.0000 (405.3966) mem 7381MB [2024-09-01 08:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1020/1251] eta 0:00:56 lr 0.000050 wd 0.0500 time 0.2444 (0.2431) data time 0.0007 (0.0015) model time 0.2437 (0.2413) loss 3.3290 (2.7427) grad_norm 5.2031 (inf) loss_scale 256.0000 (403.9334) mem 7381MB [2024-09-01 08:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1030/1251] eta 0:00:53 lr 0.000050 wd 0.0500 time 0.2401 (0.2431) data time 0.0010 (0.0015) model time 0.2392 (0.2413) loss 2.3305 (2.7417) grad_norm 4.5268 (inf) loss_scale 256.0000 (402.4985) mem 7381MB [2024-09-01 08:47:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1040/1251] eta 0:00:51 lr 0.000050 wd 0.0500 time 0.2367 (0.2434) data time 0.0007 (0.0015) model time 0.2360 (0.2417) loss 2.9556 (2.7412) grad_norm 5.2116 (inf) loss_scale 256.0000 (401.0913) mem 7381MB [2024-09-01 08:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [263/300][1050/1251] eta 0:00:48 lr 0.000050 wd 0.0500 time 0.2358 (0.2434) data time 0.0011 (0.0015) model time 0.2347 (0.2417) loss 2.2679 (2.7393) grad_norm 6.4898 (inf) loss_scale 256.0000 (399.7108) mem 7381MB [2024-09-01 08:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1060/1251] eta 0:00:46 lr 0.000050 wd 0.0500 time 0.2356 (0.2434) data time 0.0011 (0.0015) model time 0.2345 (0.2417) loss 2.7572 (2.7387) grad_norm 6.7581 (inf) loss_scale 256.0000 (398.3563) mem 7381MB [2024-09-01 08:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1070/1251] eta 0:00:44 lr 0.000050 wd 0.0500 time 0.2426 (0.2434) data time 0.0007 (0.0015) model time 0.2419 (0.2416) loss 2.7576 (2.7373) grad_norm 6.1781 (inf) loss_scale 256.0000 (397.0271) mem 7381MB [2024-09-01 08:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1080/1251] eta 0:00:41 lr 0.000050 wd 0.0500 time 0.2381 (0.2433) data time 0.0008 (0.0015) model time 0.2373 (0.2416) loss 2.7160 (2.7372) grad_norm 5.5684 (inf) loss_scale 256.0000 (395.7225) mem 7381MB [2024-09-01 08:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1090/1251] eta 0:00:39 lr 0.000050 wd 0.0500 time 0.2488 (0.2433) data time 0.0010 (0.0015) model time 0.2479 (0.2416) loss 3.0801 (2.7383) grad_norm 4.5039 (inf) loss_scale 256.0000 (394.4418) mem 7381MB [2024-09-01 08:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1100/1251] eta 0:00:36 lr 0.000050 wd 0.0500 time 0.2410 (0.2433) data time 0.0009 (0.0015) model time 0.2400 (0.2416) loss 2.8642 (2.7400) grad_norm 4.2935 (inf) loss_scale 256.0000 (393.1844) mem 7381MB [2024-09-01 08:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1110/1251] eta 0:00:34 lr 0.000050 wd 0.0500 time 0.2385 (0.2433) data time 0.0010 (0.0015) model time 0.2375 (0.2416) loss 3.2774 (2.7404) grad_norm 6.6836 (inf) loss_scale 256.0000 (391.9496) mem 7381MB 
[2024-09-01 08:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1120/1251] eta 0:00:31 lr 0.000050 wd 0.0500 time 0.2424 (0.2433) data time 0.0009 (0.0015) model time 0.2415 (0.2416) loss 1.9918 (2.7380) grad_norm 5.5865 (inf) loss_scale 256.0000 (390.7368) mem 7381MB [2024-09-01 08:48:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1130/1251] eta 0:00:29 lr 0.000050 wd 0.0500 time 0.2435 (0.2433) data time 0.0010 (0.0015) model time 0.2426 (0.2416) loss 2.6794 (2.7379) grad_norm 5.3399 (inf) loss_scale 256.0000 (389.5455) mem 7381MB [2024-09-01 08:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1140/1251] eta 0:00:27 lr 0.000050 wd 0.0500 time 0.2410 (0.2433) data time 0.0012 (0.0015) model time 0.2398 (0.2416) loss 2.7915 (2.7384) grad_norm 7.2622 (inf) loss_scale 256.0000 (388.3751) mem 7381MB [2024-09-01 08:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1150/1251] eta 0:00:24 lr 0.000050 wd 0.0500 time 0.2427 (0.2432) data time 0.0007 (0.0015) model time 0.2420 (0.2415) loss 3.2783 (2.7371) grad_norm 6.7479 (inf) loss_scale 256.0000 (387.2250) mem 7381MB [2024-09-01 08:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1160/1251] eta 0:00:22 lr 0.000050 wd 0.0500 time 0.2376 (0.2432) data time 0.0011 (0.0015) model time 0.2365 (0.2415) loss 2.5520 (2.7370) grad_norm 8.1662 (inf) loss_scale 256.0000 (386.0947) mem 7381MB [2024-09-01 08:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1170/1251] eta 0:00:19 lr 0.000050 wd 0.0500 time 0.2470 (0.2432) data time 0.0009 (0.0015) model time 0.2461 (0.2415) loss 3.4114 (2.7390) grad_norm 5.6152 (inf) loss_scale 256.0000 (384.9838) mem 7381MB [2024-09-01 08:48:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1180/1251] eta 0:00:17 lr 0.000050 wd 0.0500 time 0.2295 (0.2432) data time 0.0010 (0.0015) model time 0.2285 (0.2415) loss 3.0545 
(2.7410) grad_norm 4.4714 (inf) loss_scale 256.0000 (383.8916) mem 7381MB [2024-09-01 08:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1190/1251] eta 0:00:14 lr 0.000050 wd 0.0500 time 0.2439 (0.2432) data time 0.0009 (0.0015) model time 0.2431 (0.2415) loss 2.9803 (2.7418) grad_norm 3.6871 (inf) loss_scale 256.0000 (382.8178) mem 7381MB [2024-09-01 08:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1200/1251] eta 0:00:12 lr 0.000050 wd 0.0500 time 0.2424 (0.2431) data time 0.0009 (0.0014) model time 0.2414 (0.2415) loss 2.9593 (2.7434) grad_norm 4.3370 (inf) loss_scale 256.0000 (381.7619) mem 7381MB [2024-09-01 08:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1210/1251] eta 0:00:09 lr 0.000050 wd 0.0500 time 0.2386 (0.2432) data time 0.0011 (0.0014) model time 0.2375 (0.2415) loss 2.7857 (2.7430) grad_norm 6.6176 (inf) loss_scale 256.0000 (380.7234) mem 7381MB [2024-09-01 08:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1220/1251] eta 0:00:07 lr 0.000050 wd 0.0500 time 0.2361 (0.2431) data time 0.0010 (0.0014) model time 0.2350 (0.2415) loss 2.6427 (2.7428) grad_norm 4.3184 (inf) loss_scale 256.0000 (379.7019) mem 7381MB [2024-09-01 08:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1230/1251] eta 0:00:05 lr 0.000050 wd 0.0500 time 0.2363 (0.2431) data time 0.0007 (0.0014) model time 0.2356 (0.2415) loss 2.9257 (2.7436) grad_norm 6.6730 (inf) loss_scale 256.0000 (378.6970) mem 7381MB [2024-09-01 08:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1240/1251] eta 0:00:02 lr 0.000050 wd 0.0500 time 0.2152 (0.2432) data time 0.0006 (0.0014) model time 0.2146 (0.2416) loss 2.6583 (2.7445) grad_norm 4.3318 (inf) loss_scale 256.0000 (377.7083) mem 7381MB [2024-09-01 08:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [263/300][1250/1251] eta 0:00:00 lr 0.000050 wd 0.0500 time 0.2294 
(0.2431) data time 0.0007 (0.0014) model time 0.2287 (0.2414) loss 2.2348 (2.7427) grad_norm 3.8985 (inf) loss_scale 256.0000 (376.7354) mem 7381MB [2024-09-01 08:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 263 training takes 0:05:04 [2024-09-01 08:48:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:48:31 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 08:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.430 (0.430) Loss 0.3909 (0.3909) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 08:48:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.110) Loss 0.5605 (0.6063) Acc@1 89.746 (87.784) Acc@5 97.754 (97.718) Mem 7381MB [2024-09-01 08:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.095) Loss 0.9277 (0.6387) Acc@1 77.637 (86.537) Acc@5 95.312 (97.628) Mem 7381MB [2024-09-01 08:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.1406 (0.7351) Acc@1 74.414 (84.173) Acc@5 92.383 (96.658) Mem 7381MB [2024-09-01 08:48:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0117 (0.7838) Acc@1 76.562 (82.967) Acc@5 93.945 (96.187) Mem 7381MB [2024-09-01 08:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.566 Acc@5 96.152 [2024-09-01 08:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 08:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.57% [2024-09-01 08:48:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-09-01 08:48:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 08:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.429 (0.429) Loss 0.3833 (0.3833) Acc@1 93.359 (93.359) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 08:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.108) Loss 0.5654 (0.6023) Acc@1 90.430 (87.855) Acc@5 97.949 (97.816) Mem 7381MB [2024-09-01 08:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.096) Loss 0.8984 (0.6323) Acc@1 78.027 (86.621) Acc@5 95.801 (97.712) Mem 7381MB [2024-09-01 08:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.087 (0.090) Loss 1.1162 (0.7221) Acc@1 74.316 (84.425) Acc@5 93.164 (96.834) Mem 7381MB [2024-09-01 08:48:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.063 (0.084) Loss 0.9956 (0.7689) Acc@1 77.637 (83.296) Acc@5 94.238 (96.330) Mem 7381MB [2024-09-01 08:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.910 Acc@5 96.274 [2024-09-01 08:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 08:48:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][0/1251] eta 0:25:06 lr 0.000050 wd 0.0500 time 1.2042 (1.2042) data time 0.4826 (0.4826) model time 0.0000 (0.0000) loss 2.4378 (2.4378) grad_norm 3.1792 (3.1792) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][10/1251] eta 0:06:47 lr 0.000050 wd 0.0500 time 0.2428 (0.3280) data time 0.0009 (0.0447) model time 0.0000 (0.0000) loss 3.2838 (2.6681) grad_norm 4.6082 (5.0452) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:48:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][20/1251] eta 0:05:52 lr 
0.000050 wd 0.0500 time 0.2447 (0.2868) data time 0.0013 (0.0240) model time 0.0000 (0.0000) loss 2.1353 (2.5697) grad_norm 4.3786 (4.8785) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][30/1251] eta 0:05:32 lr 0.000050 wd 0.0500 time 0.2378 (0.2720) data time 0.0009 (0.0165) model time 0.0000 (0.0000) loss 3.1157 (2.6816) grad_norm 4.4964 (4.7964) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][40/1251] eta 0:05:20 lr 0.000050 wd 0.0500 time 0.2429 (0.2646) data time 0.0009 (0.0127) model time 0.0000 (0.0000) loss 3.3444 (2.6980) grad_norm 4.1814 (4.9406) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][50/1251] eta 0:05:11 lr 0.000050 wd 0.0500 time 0.2376 (0.2596) data time 0.0010 (0.0104) model time 0.0000 (0.0000) loss 1.7683 (2.7005) grad_norm 3.4260 (5.4543) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][60/1251] eta 0:05:05 lr 0.000050 wd 0.0500 time 0.2433 (0.2565) data time 0.0009 (0.0089) model time 0.2424 (0.2391) loss 2.0417 (2.7088) grad_norm 6.9717 (5.8193) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][70/1251] eta 0:05:00 lr 0.000050 wd 0.0500 time 0.2423 (0.2542) data time 0.0010 (0.0078) model time 0.2413 (0.2392) loss 2.5385 (2.6765) grad_norm 24.8199 (6.0142) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][80/1251] eta 0:04:55 lr 0.000050 wd 0.0500 time 0.2418 (0.2525) data time 0.0008 (0.0070) model time 0.2410 (0.2394) loss 1.7950 (2.6714) grad_norm 4.9941 (5.9608) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:02 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][90/1251] eta 0:04:51 lr 0.000050 wd 0.0500 time 0.2418 (0.2515) data time 0.0012 (0.0063) model time 0.2405 (0.2400) loss 3.3020 (2.6962) grad_norm 4.0294 (5.8655) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][100/1251] eta 0:04:48 lr 0.000050 wd 0.0500 time 0.2420 (0.2504) data time 0.0007 (0.0058) model time 0.2413 (0.2399) loss 3.1262 (2.6907) grad_norm 5.5743 (5.7863) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][110/1251] eta 0:04:44 lr 0.000050 wd 0.0500 time 0.2307 (0.2494) data time 0.0008 (0.0053) model time 0.2299 (0.2398) loss 3.0666 (2.7014) grad_norm 8.0391 (6.1152) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][120/1251] eta 0:04:41 lr 0.000050 wd 0.0500 time 0.2406 (0.2485) data time 0.0007 (0.0050) model time 0.2399 (0.2394) loss 2.5525 (2.7144) grad_norm 5.4380 (6.0278) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][130/1251] eta 0:04:37 lr 0.000050 wd 0.0500 time 0.2530 (0.2479) data time 0.0007 (0.0047) model time 0.2524 (0.2395) loss 2.9065 (2.7131) grad_norm 4.9812 (5.9251) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][140/1251] eta 0:04:34 lr 0.000050 wd 0.0500 time 0.2459 (0.2475) data time 0.0009 (0.0044) model time 0.2450 (0.2396) loss 2.0923 (2.7167) grad_norm 4.4625 (5.8581) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][150/1251] eta 0:04:32 lr 0.000050 wd 0.0500 time 0.2568 (0.2472) data time 0.0010 (0.0042) model time 0.2558 (0.2399) loss 2.8398 (2.7124) 
grad_norm 6.0312 (5.7992) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][160/1251] eta 0:04:29 lr 0.000050 wd 0.0500 time 0.2570 (0.2469) data time 0.0007 (0.0040) model time 0.2563 (0.2400) loss 3.0306 (2.7052) grad_norm 3.9869 (5.7278) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][170/1251] eta 0:04:26 lr 0.000050 wd 0.0500 time 0.2471 (0.2466) data time 0.0008 (0.0038) model time 0.2463 (0.2400) loss 1.8742 (2.6927) grad_norm 5.2247 (5.7073) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][180/1251] eta 0:04:23 lr 0.000050 wd 0.0500 time 0.2438 (0.2463) data time 0.0010 (0.0037) model time 0.2428 (0.2401) loss 3.0220 (2.7048) grad_norm 9.2489 (5.7753) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][190/1251] eta 0:04:21 lr 0.000050 wd 0.0500 time 0.2424 (0.2460) data time 0.0007 (0.0035) model time 0.2417 (0.2401) loss 2.2831 (2.7163) grad_norm 5.1224 (5.7129) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][200/1251] eta 0:04:18 lr 0.000049 wd 0.0500 time 0.2434 (0.2457) data time 0.0007 (0.0034) model time 0.2427 (0.2400) loss 2.0578 (2.7046) grad_norm 4.2890 (5.6820) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][210/1251] eta 0:04:15 lr 0.000049 wd 0.0500 time 0.2361 (0.2455) data time 0.0010 (0.0033) model time 0.2351 (0.2400) loss 3.2903 (2.7073) grad_norm 3.7463 (5.6170) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][220/1251] eta 0:04:12 lr 0.000049 wd 0.0500 time 
0.2498 (0.2453) data time 0.0007 (0.0032) model time 0.2491 (0.2400) loss 3.1891 (2.7029) grad_norm 11.9528 (5.6052) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][230/1251] eta 0:04:10 lr 0.000049 wd 0.0500 time 0.2368 (0.2452) data time 0.0008 (0.0031) model time 0.2360 (0.2400) loss 2.2013 (2.7054) grad_norm 13.8648 (5.6037) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][240/1251] eta 0:04:07 lr 0.000049 wd 0.0500 time 0.2431 (0.2449) data time 0.0011 (0.0030) model time 0.2419 (0.2400) loss 2.6031 (2.7023) grad_norm 5.4186 (5.5991) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][250/1251] eta 0:04:05 lr 0.000049 wd 0.0500 time 0.2452 (0.2449) data time 0.0012 (0.0029) model time 0.2440 (0.2401) loss 3.2786 (2.7008) grad_norm 4.0914 (5.5734) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][260/1251] eta 0:04:02 lr 0.000049 wd 0.0500 time 0.2363 (0.2447) data time 0.0010 (0.0029) model time 0.2353 (0.2400) loss 2.8433 (2.7004) grad_norm 6.3745 (5.5597) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][270/1251] eta 0:04:00 lr 0.000049 wd 0.0500 time 0.2454 (0.2454) data time 0.0007 (0.0028) model time 0.2447 (0.2411) loss 2.9276 (2.6972) grad_norm 5.9982 (5.5463) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][280/1251] eta 0:03:58 lr 0.000049 wd 0.0500 time 0.2529 (0.2453) data time 0.0007 (0.0027) model time 0.2522 (0.2411) loss 2.9669 (2.6993) grad_norm 3.2329 (5.5328) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:51 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [264/300][290/1251] eta 0:03:55 lr 0.000049 wd 0.0500 time 0.2375 (0.2452) data time 0.0012 (0.0027) model time 0.2363 (0.2411) loss 3.1630 (2.6937) grad_norm 5.2150 (5.5416) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][300/1251] eta 0:03:53 lr 0.000049 wd 0.0500 time 0.2446 (0.2452) data time 0.0009 (0.0026) model time 0.2437 (0.2412) loss 3.2791 (2.6856) grad_norm 3.1355 (5.5333) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][310/1251] eta 0:03:50 lr 0.000049 wd 0.0500 time 0.2505 (0.2451) data time 0.0007 (0.0026) model time 0.2498 (0.2413) loss 2.3449 (2.6808) grad_norm 5.1486 (5.5248) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][320/1251] eta 0:03:48 lr 0.000049 wd 0.0500 time 0.2431 (0.2450) data time 0.0009 (0.0025) model time 0.2422 (0.2413) loss 2.9625 (2.6830) grad_norm 6.9507 (5.5106) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][330/1251] eta 0:03:45 lr 0.000049 wd 0.0500 time 0.2392 (0.2449) data time 0.0007 (0.0025) model time 0.2386 (0.2412) loss 3.0025 (2.6877) grad_norm 5.6537 (5.5299) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][340/1251] eta 0:03:43 lr 0.000049 wd 0.0500 time 0.2365 (0.2448) data time 0.0009 (0.0024) model time 0.2356 (0.2412) loss 3.2397 (2.6947) grad_norm 5.3238 (5.5137) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][350/1251] eta 0:03:40 lr 0.000049 wd 0.0500 time 0.2482 (0.2447) data time 0.0007 (0.0024) model time 0.2475 (0.2412) loss 2.9876 (2.6917) grad_norm 4.8799 
(5.5093) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][360/1251] eta 0:03:38 lr 0.000049 wd 0.0500 time 0.2454 (0.2447) data time 0.0010 (0.0023) model time 0.2444 (0.2412) loss 2.8592 (2.6900) grad_norm 6.0106 (5.4979) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][370/1251] eta 0:03:35 lr 0.000049 wd 0.0500 time 0.2363 (0.2446) data time 0.0014 (0.0023) model time 0.2349 (0.2412) loss 2.3049 (2.6917) grad_norm 5.5109 (5.4831) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][380/1251] eta 0:03:33 lr 0.000049 wd 0.0500 time 0.2451 (0.2446) data time 0.0010 (0.0023) model time 0.2441 (0.2412) loss 3.0199 (2.6887) grad_norm 4.7757 (5.4895) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][390/1251] eta 0:03:30 lr 0.000049 wd 0.0500 time 0.2435 (0.2445) data time 0.0010 (0.0022) model time 0.2425 (0.2413) loss 2.8266 (2.6875) grad_norm 5.3462 (5.4831) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][400/1251] eta 0:03:28 lr 0.000049 wd 0.0500 time 0.2418 (0.2445) data time 0.0007 (0.0022) model time 0.2412 (0.2413) loss 2.5816 (2.6882) grad_norm 3.6851 (5.4658) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][410/1251] eta 0:03:25 lr 0.000049 wd 0.0500 time 0.2442 (0.2445) data time 0.0011 (0.0022) model time 0.2431 (0.2413) loss 2.6715 (2.6879) grad_norm 5.1413 (5.4423) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][420/1251] eta 0:03:23 lr 0.000049 wd 0.0500 time 0.2319 (0.2444) data 
time 0.0010 (0.0021) model time 0.2309 (0.2413) loss 2.4545 (2.6854) grad_norm 4.3495 (5.4291) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][430/1251] eta 0:03:20 lr 0.000049 wd 0.0500 time 0.2382 (0.2443) data time 0.0009 (0.0021) model time 0.2373 (0.2413) loss 2.9847 (2.6828) grad_norm 4.8778 (5.4222) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][440/1251] eta 0:03:18 lr 0.000049 wd 0.0500 time 0.2314 (0.2444) data time 0.0008 (0.0021) model time 0.2306 (0.2414) loss 2.8233 (2.6848) grad_norm 3.0672 (5.4001) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][450/1251] eta 0:03:15 lr 0.000049 wd 0.0500 time 0.2431 (0.2443) data time 0.0009 (0.0021) model time 0.2422 (0.2414) loss 2.4575 (2.6834) grad_norm 2.8915 (5.3850) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][460/1251] eta 0:03:13 lr 0.000049 wd 0.0500 time 0.2375 (0.2446) data time 0.0010 (0.0020) model time 0.2365 (0.2417) loss 2.6123 (2.6871) grad_norm 5.7551 (5.3898) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][470/1251] eta 0:03:10 lr 0.000049 wd 0.0500 time 0.2427 (0.2445) data time 0.0007 (0.0020) model time 0.2420 (0.2417) loss 3.0266 (2.6878) grad_norm 4.4109 (5.3801) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][480/1251] eta 0:03:08 lr 0.000049 wd 0.0500 time 0.2393 (0.2445) data time 0.0007 (0.0020) model time 0.2386 (0.2417) loss 3.1716 (2.6913) grad_norm 5.6847 (5.3783) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [264/300][490/1251] eta 0:03:06 lr 0.000049 wd 0.0500 time 0.2325 (0.2445) data time 0.0009 (0.0020) model time 0.2316 (0.2417) loss 3.1402 (2.6941) grad_norm 3.9433 (5.3653) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][500/1251] eta 0:03:03 lr 0.000049 wd 0.0500 time 0.2368 (0.2444) data time 0.0009 (0.0020) model time 0.2359 (0.2416) loss 1.8278 (2.6977) grad_norm 2.8775 (5.3423) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][510/1251] eta 0:03:01 lr 0.000049 wd 0.0500 time 0.2404 (0.2443) data time 0.0011 (0.0019) model time 0.2393 (0.2416) loss 2.9142 (2.7019) grad_norm 4.4063 (5.3641) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][520/1251] eta 0:02:58 lr 0.000049 wd 0.0500 time 0.2412 (0.2443) data time 0.0011 (0.0019) model time 0.2401 (0.2417) loss 2.4193 (2.7012) grad_norm 5.4077 (5.3518) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][530/1251] eta 0:02:56 lr 0.000049 wd 0.0500 time 0.2327 (0.2443) data time 0.0010 (0.0019) model time 0.2318 (0.2417) loss 3.0631 (2.7014) grad_norm 4.2990 (5.3376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][540/1251] eta 0:02:53 lr 0.000049 wd 0.0500 time 0.2322 (0.2442) data time 0.0008 (0.0019) model time 0.2315 (0.2416) loss 3.4010 (2.7056) grad_norm 5.2237 (5.3299) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:50:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][550/1251] eta 0:02:51 lr 0.000049 wd 0.0500 time 0.2338 (0.2442) data time 0.0008 (0.0019) model time 0.2330 (0.2416) loss 2.7056 (2.7046) grad_norm 5.3045 (5.3208) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 08:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][560/1251] eta 0:02:49 lr 0.000049 wd 0.0500 time 0.2407 (0.2446) data time 0.0012 (0.0019) model time 0.2395 (0.2421) loss 2.5109 (2.7070) grad_norm 7.4417 (5.3245) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][570/1251] eta 0:02:47 lr 0.000049 wd 0.0500 time 0.2381 (0.2453) data time 0.0012 (0.0018) model time 0.2368 (0.2429) loss 2.4720 (2.7076) grad_norm 4.5618 (5.3305) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][580/1251] eta 0:02:44 lr 0.000049 wd 0.0500 time 0.2507 (0.2452) data time 0.0009 (0.0018) model time 0.2499 (0.2429) loss 3.1604 (2.7103) grad_norm 4.5517 (5.3360) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][590/1251] eta 0:02:42 lr 0.000049 wd 0.0500 time 0.2444 (0.2451) data time 0.0008 (0.0018) model time 0.2436 (0.2428) loss 2.3852 (2.7130) grad_norm 4.3103 (5.3412) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][600/1251] eta 0:02:39 lr 0.000049 wd 0.0500 time 0.2411 (0.2451) data time 0.0009 (0.0018) model time 0.2402 (0.2427) loss 2.7757 (2.7146) grad_norm 5.2253 (5.3776) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][610/1251] eta 0:02:37 lr 0.000049 wd 0.0500 time 0.2344 (0.2450) data time 0.0008 (0.0018) model time 0.2336 (0.2426) loss 2.9886 (2.7190) grad_norm 4.8341 (5.3772) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][620/1251] eta 0:02:34 lr 0.000049 wd 0.0500 time 0.2416 (0.2449) data time 0.0009 (0.0018) model 
time 0.2407 (0.2426) loss 2.7599 (2.7180) grad_norm 7.4921 (5.3767) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][630/1251] eta 0:02:32 lr 0.000049 wd 0.0500 time 0.2355 (0.2448) data time 0.0009 (0.0018) model time 0.2346 (0.2425) loss 2.8705 (2.7184) grad_norm 4.4778 (5.3970) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][640/1251] eta 0:02:29 lr 0.000049 wd 0.0500 time 0.2389 (0.2448) data time 0.0009 (0.0017) model time 0.2381 (0.2425) loss 2.5048 (2.7191) grad_norm 4.4923 (5.3858) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][650/1251] eta 0:02:27 lr 0.000049 wd 0.0500 time 0.2393 (0.2447) data time 0.0008 (0.0017) model time 0.2385 (0.2425) loss 3.1259 (2.7188) grad_norm 8.2941 (5.3916) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 08:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][660/1251] eta 0:02:24 lr 0.000049 wd 0.0500 time 0.2345 (0.2447) data time 0.0008 (0.0017) model time 0.2337 (0.2424) loss 2.9863 (2.7179) grad_norm 4.0538 (inf) loss_scale 128.0000 (254.4508) mem 7381MB [2024-09-01 08:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][670/1251] eta 0:02:22 lr 0.000049 wd 0.0500 time 0.2513 (0.2447) data time 0.0010 (0.0017) model time 0.2502 (0.2425) loss 1.7497 (2.7175) grad_norm 3.4904 (inf) loss_scale 128.0000 (252.5663) mem 7381MB [2024-09-01 08:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][680/1251] eta 0:02:19 lr 0.000049 wd 0.0500 time 0.2493 (0.2447) data time 0.0008 (0.0017) model time 0.2485 (0.2425) loss 2.4154 (2.7161) grad_norm 5.1566 (inf) loss_scale 128.0000 (250.7372) mem 7381MB [2024-09-01 08:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][690/1251] eta 
0:02:17 lr 0.000049 wd 0.0500 time 0.2453 (0.2446) data time 0.0008 (0.0017) model time 0.2445 (0.2424) loss 2.9825 (2.7165) grad_norm 3.7552 (inf) loss_scale 128.0000 (248.9609) mem 7381MB [2024-09-01 08:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][700/1251] eta 0:02:14 lr 0.000049 wd 0.0500 time 0.2447 (0.2446) data time 0.0007 (0.0017) model time 0.2441 (0.2424) loss 2.0566 (2.7134) grad_norm 3.8973 (inf) loss_scale 128.0000 (247.2354) mem 7381MB [2024-09-01 08:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][710/1251] eta 0:02:12 lr 0.000049 wd 0.0500 time 0.2476 (0.2446) data time 0.0010 (0.0017) model time 0.2466 (0.2424) loss 2.6076 (2.7120) grad_norm 6.0197 (inf) loss_scale 128.0000 (245.5584) mem 7381MB [2024-09-01 08:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][720/1251] eta 0:02:09 lr 0.000049 wd 0.0500 time 0.2403 (0.2445) data time 0.0008 (0.0017) model time 0.2395 (0.2424) loss 2.4511 (2.7108) grad_norm 6.5914 (inf) loss_scale 128.0000 (243.9279) mem 7381MB [2024-09-01 08:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][730/1251] eta 0:02:07 lr 0.000049 wd 0.0500 time 0.2436 (0.2445) data time 0.0009 (0.0017) model time 0.2427 (0.2423) loss 2.9625 (2.7100) grad_norm 4.4407 (inf) loss_scale 128.0000 (242.3420) mem 7381MB [2024-09-01 08:51:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][740/1251] eta 0:02:04 lr 0.000049 wd 0.0500 time 0.2393 (0.2444) data time 0.0009 (0.0016) model time 0.2383 (0.2423) loss 2.4906 (2.7122) grad_norm 5.1493 (inf) loss_scale 128.0000 (240.7989) mem 7381MB [2024-09-01 08:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][750/1251] eta 0:02:02 lr 0.000049 wd 0.0500 time 0.2383 (0.2444) data time 0.0009 (0.0016) model time 0.2374 (0.2422) loss 3.1457 (2.7152) grad_norm 11.7496 (inf) loss_scale 128.0000 (239.2969) mem 7381MB [2024-09-01 08:51:46 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][760/1251] eta 0:01:59 lr 0.000049 wd 0.0500 time 0.2458 (0.2443) data time 0.0011 (0.0016) model time 0.2447 (0.2422) loss 2.6339 (2.7168) grad_norm 9.4671 (inf) loss_scale 128.0000 (237.8344) mem 7381MB [2024-09-01 08:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][770/1251] eta 0:01:57 lr 0.000049 wd 0.0500 time 0.2395 (0.2442) data time 0.0008 (0.0016) model time 0.2387 (0.2421) loss 2.3557 (2.7157) grad_norm 4.6352 (inf) loss_scale 128.0000 (236.4099) mem 7381MB [2024-09-01 08:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][780/1251] eta 0:01:54 lr 0.000048 wd 0.0500 time 0.2314 (0.2441) data time 0.0012 (0.0016) model time 0.2301 (0.2421) loss 2.3250 (2.7174) grad_norm 4.2426 (inf) loss_scale 128.0000 (235.0218) mem 7381MB [2024-09-01 08:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][790/1251] eta 0:01:52 lr 0.000048 wd 0.0500 time 0.2442 (0.2441) data time 0.0007 (0.0016) model time 0.2435 (0.2421) loss 2.7684 (2.7163) grad_norm 5.0180 (inf) loss_scale 128.0000 (233.6688) mem 7381MB [2024-09-01 08:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][800/1251] eta 0:01:50 lr 0.000048 wd 0.0500 time 0.2392 (0.2441) data time 0.0009 (0.0016) model time 0.2384 (0.2420) loss 1.8450 (2.7135) grad_norm 3.9879 (inf) loss_scale 128.0000 (232.3496) mem 7381MB [2024-09-01 08:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][810/1251] eta 0:01:47 lr 0.000048 wd 0.0500 time 0.2440 (0.2440) data time 0.0009 (0.0016) model time 0.2431 (0.2420) loss 2.6258 (2.7110) grad_norm 5.7218 (inf) loss_scale 128.0000 (231.0629) mem 7381MB [2024-09-01 08:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][820/1251] eta 0:01:45 lr 0.000048 wd 0.0500 time 0.2425 (0.2440) data time 0.0009 (0.0016) model time 0.2416 (0.2420) loss 2.7704 (2.7101) grad_norm 3.8442 
(inf) loss_scale 128.0000 (229.8076) mem 7381MB [2024-09-01 08:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][830/1251] eta 0:01:42 lr 0.000048 wd 0.0500 time 0.2452 (0.2439) data time 0.0010 (0.0016) model time 0.2442 (0.2419) loss 2.3619 (2.7110) grad_norm 7.5730 (inf) loss_scale 128.0000 (228.5824) mem 7381MB [2024-09-01 08:52:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][840/1251] eta 0:01:40 lr 0.000048 wd 0.0500 time 0.2383 (0.2439) data time 0.0008 (0.0016) model time 0.2375 (0.2419) loss 3.0701 (2.7152) grad_norm 3.7283 (inf) loss_scale 128.0000 (227.3864) mem 7381MB [2024-09-01 08:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][850/1251] eta 0:01:37 lr 0.000048 wd 0.0500 time 0.2396 (0.2439) data time 0.0008 (0.0016) model time 0.2388 (0.2419) loss 2.5693 (2.7139) grad_norm 6.0769 (inf) loss_scale 128.0000 (226.2186) mem 7381MB [2024-09-01 08:52:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][860/1251] eta 0:01:35 lr 0.000048 wd 0.0500 time 0.2420 (0.2438) data time 0.0008 (0.0016) model time 0.2412 (0.2418) loss 2.7144 (2.7162) grad_norm 4.7728 (inf) loss_scale 128.0000 (225.0778) mem 7381MB [2024-09-01 08:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][870/1251] eta 0:01:32 lr 0.000048 wd 0.0500 time 0.2419 (0.2438) data time 0.0009 (0.0015) model time 0.2410 (0.2418) loss 2.7002 (2.7156) grad_norm 4.3125 (inf) loss_scale 128.0000 (223.9633) mem 7381MB [2024-09-01 08:52:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][880/1251] eta 0:01:30 lr 0.000048 wd 0.0500 time 0.2451 (0.2437) data time 0.0007 (0.0015) model time 0.2444 (0.2417) loss 2.8101 (2.7148) grad_norm 6.2739 (inf) loss_scale 128.0000 (222.8740) mem 7381MB [2024-09-01 08:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][890/1251] eta 0:01:27 lr 0.000048 wd 0.0500 time 0.2398 (0.2437) data time 0.0007 (0.0015) 
model time 0.2390 (0.2417) loss 2.9746 (2.7147) grad_norm 3.8809 (inf) loss_scale 128.0000 (221.8092) mem 7381MB [2024-09-01 08:52:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][900/1251] eta 0:01:25 lr 0.000048 wd 0.0500 time 0.2307 (0.2436) data time 0.0009 (0.0015) model time 0.2298 (0.2417) loss 2.9270 (2.7132) grad_norm 6.0862 (inf) loss_scale 128.0000 (220.7680) mem 7381MB [2024-09-01 08:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][910/1251] eta 0:01:23 lr 0.000048 wd 0.0500 time 0.2301 (0.2436) data time 0.0008 (0.0015) model time 0.2293 (0.2416) loss 1.6149 (2.7142) grad_norm 5.4267 (inf) loss_scale 128.0000 (219.7497) mem 7381MB [2024-09-01 08:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][920/1251] eta 0:01:20 lr 0.000048 wd 0.0500 time 0.2426 (0.2435) data time 0.0011 (0.0015) model time 0.2414 (0.2416) loss 3.1019 (2.7145) grad_norm 5.6621 (inf) loss_scale 128.0000 (218.7535) mem 7381MB [2024-09-01 08:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][930/1251] eta 0:01:18 lr 0.000048 wd 0.0500 time 0.2464 (0.2435) data time 0.0010 (0.0015) model time 0.2454 (0.2416) loss 2.3091 (2.7117) grad_norm 7.9798 (inf) loss_scale 128.0000 (217.7787) mem 7381MB [2024-09-01 08:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][940/1251] eta 0:01:15 lr 0.000048 wd 0.0500 time 0.2326 (0.2435) data time 0.0010 (0.0015) model time 0.2316 (0.2415) loss 2.7253 (2.7129) grad_norm 3.8533 (inf) loss_scale 128.0000 (216.8247) mem 7381MB [2024-09-01 08:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][950/1251] eta 0:01:13 lr 0.000048 wd 0.0500 time 0.2460 (0.2434) data time 0.0008 (0.0015) model time 0.2453 (0.2415) loss 2.5328 (2.7130) grad_norm 3.7481 (inf) loss_scale 128.0000 (215.8906) mem 7381MB [2024-09-01 08:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][960/1251] eta 0:01:10 lr 
0.000048 wd 0.0500 time 0.2412 (0.2434) data time 0.0009 (0.0015) model time 0.2403 (0.2415) loss 1.9856 (2.7121) grad_norm 6.7328 (inf) loss_scale 128.0000 (214.9761) mem 7381MB [2024-09-01 08:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][970/1251] eta 0:01:08 lr 0.000048 wd 0.0500 time 0.2447 (0.2433) data time 0.0008 (0.0015) model time 0.2440 (0.2414) loss 2.9688 (2.7113) grad_norm 5.9794 (inf) loss_scale 128.0000 (214.0803) mem 7381MB [2024-09-01 08:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][980/1251] eta 0:01:06 lr 0.000048 wd 0.0500 time 0.2531 (0.2435) data time 0.0008 (0.0015) model time 0.2523 (0.2417) loss 2.9428 (2.7123) grad_norm 6.9031 (inf) loss_scale 128.0000 (213.2029) mem 7381MB [2024-09-01 08:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][990/1251] eta 0:01:03 lr 0.000048 wd 0.0500 time 0.2393 (0.2435) data time 0.0012 (0.0015) model time 0.2381 (0.2417) loss 2.7794 (2.7105) grad_norm 7.2638 (inf) loss_scale 128.0000 (212.3431) mem 7381MB [2024-09-01 08:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1000/1251] eta 0:01:01 lr 0.000048 wd 0.0500 time 0.2398 (0.2435) data time 0.0007 (0.0015) model time 0.2390 (0.2416) loss 3.3117 (2.7114) grad_norm 5.5803 (inf) loss_scale 128.0000 (211.5005) mem 7381MB [2024-09-01 08:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1010/1251] eta 0:00:58 lr 0.000048 wd 0.0500 time 0.2419 (0.2435) data time 0.0008 (0.0015) model time 0.2411 (0.2416) loss 1.4982 (2.7102) grad_norm 5.3627 (inf) loss_scale 128.0000 (210.6746) mem 7381MB [2024-09-01 08:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1020/1251] eta 0:00:56 lr 0.000048 wd 0.0500 time 0.2322 (0.2434) data time 0.0007 (0.0015) model time 0.2315 (0.2415) loss 3.3774 (2.7107) grad_norm 4.9652 (inf) loss_scale 128.0000 (209.8648) mem 7381MB [2024-09-01 08:52:50 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [264/300][1030/1251] eta 0:00:53 lr 0.000048 wd 0.0500 time 0.2499 (0.2433) data time 0.0008 (0.0015) model time 0.2491 (0.2415) loss 2.8396 (2.7137) grad_norm 4.4896 (inf) loss_scale 128.0000 (209.0708) mem 7381MB [2024-09-01 08:52:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1040/1251] eta 0:00:51 lr 0.000048 wd 0.0500 time 0.2333 (0.2440) data time 0.0009 (0.0015) model time 0.2324 (0.2422) loss 2.6108 (2.7143) grad_norm 6.4454 (inf) loss_scale 128.0000 (208.2920) mem 7381MB [2024-09-01 08:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1050/1251] eta 0:00:49 lr 0.000048 wd 0.0500 time 0.2478 (0.2440) data time 0.0011 (0.0015) model time 0.2467 (0.2422) loss 2.9387 (2.7161) grad_norm 4.3181 (inf) loss_scale 128.0000 (207.5281) mem 7381MB [2024-09-01 08:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1060/1251] eta 0:00:46 lr 0.000048 wd 0.0500 time 0.2466 (0.2440) data time 0.0011 (0.0015) model time 0.2456 (0.2422) loss 2.6288 (2.7173) grad_norm 5.1483 (inf) loss_scale 128.0000 (206.7785) mem 7381MB [2024-09-01 08:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1070/1251] eta 0:00:44 lr 0.000048 wd 0.0500 time 0.2399 (0.2440) data time 0.0008 (0.0014) model time 0.2391 (0.2422) loss 1.7443 (2.7155) grad_norm 4.2597 (inf) loss_scale 128.0000 (206.0430) mem 7381MB [2024-09-01 08:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1080/1251] eta 0:00:41 lr 0.000048 wd 0.0500 time 0.2439 (0.2439) data time 0.0007 (0.0014) model time 0.2433 (0.2422) loss 2.8821 (2.7153) grad_norm 7.8844 (inf) loss_scale 128.0000 (205.3210) mem 7381MB [2024-09-01 08:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1090/1251] eta 0:00:39 lr 0.000048 wd 0.0500 time 0.2418 (0.2439) data time 0.0009 (0.0014) model time 0.2409 (0.2421) loss 3.1229 (2.7156) grad_norm 5.4703 (inf) loss_scale 
128.0000 (204.6123) mem 7381MB [2024-09-01 08:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1100/1251] eta 0:00:36 lr 0.000048 wd 0.0500 time 0.2353 (0.2443) data time 0.0007 (0.0014) model time 0.2346 (0.2425) loss 3.1291 (2.7141) grad_norm 5.7480 (inf) loss_scale 128.0000 (203.9164) mem 7381MB [2024-09-01 08:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1110/1251] eta 0:00:34 lr 0.000048 wd 0.0500 time 0.2434 (0.2444) data time 0.0011 (0.0014) model time 0.2423 (0.2427) loss 1.9224 (2.7124) grad_norm 6.5148 (inf) loss_scale 128.0000 (203.2331) mem 7381MB [2024-09-01 08:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1120/1251] eta 0:00:32 lr 0.000048 wd 0.0500 time 0.2403 (0.2444) data time 0.0009 (0.0014) model time 0.2394 (0.2427) loss 3.2851 (2.7138) grad_norm 5.3922 (inf) loss_scale 128.0000 (202.5620) mem 7381MB [2024-09-01 08:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1130/1251] eta 0:00:29 lr 0.000048 wd 0.0500 time 0.2419 (0.2444) data time 0.0010 (0.0014) model time 0.2409 (0.2427) loss 3.7511 (2.7148) grad_norm 5.9933 (inf) loss_scale 128.0000 (201.9027) mem 7381MB [2024-09-01 08:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1140/1251] eta 0:00:27 lr 0.000048 wd 0.0500 time 0.2459 (0.2444) data time 0.0009 (0.0014) model time 0.2450 (0.2426) loss 2.5299 (2.7152) grad_norm 3.9694 (inf) loss_scale 128.0000 (201.2550) mem 7381MB [2024-09-01 08:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1150/1251] eta 0:00:24 lr 0.000048 wd 0.0500 time 0.2383 (0.2443) data time 0.0007 (0.0014) model time 0.2376 (0.2426) loss 2.2016 (2.7146) grad_norm 4.9389 (inf) loss_scale 128.0000 (200.6186) mem 7381MB [2024-09-01 08:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1160/1251] eta 0:00:22 lr 0.000048 wd 0.0500 time 0.2448 (0.2443) data time 0.0010 (0.0014) model 
time 0.2438 (0.2426) loss 2.9887 (2.7160) grad_norm 9.5676 (inf) loss_scale 128.0000 (199.9931) mem 7381MB [2024-09-01 08:53:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1170/1251] eta 0:00:19 lr 0.000048 wd 0.0500 time 0.2404 (0.2443) data time 0.0008 (0.0014) model time 0.2396 (0.2426) loss 2.9687 (2.7138) grad_norm 4.2951 (inf) loss_scale 128.0000 (199.3783) mem 7381MB [2024-09-01 08:53:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1180/1251] eta 0:00:17 lr 0.000048 wd 0.0500 time 0.2387 (0.2443) data time 0.0008 (0.0014) model time 0.2379 (0.2426) loss 2.2622 (2.7137) grad_norm 4.5362 (inf) loss_scale 128.0000 (198.7739) mem 7381MB [2024-09-01 08:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1190/1251] eta 0:00:14 lr 0.000048 wd 0.0500 time 0.2313 (0.2442) data time 0.0011 (0.0014) model time 0.2302 (0.2425) loss 2.6759 (2.7128) grad_norm 4.2701 (inf) loss_scale 128.0000 (198.1797) mem 7381MB [2024-09-01 08:53:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1200/1251] eta 0:00:12 lr 0.000048 wd 0.0500 time 0.2408 (0.2443) data time 0.0009 (0.0014) model time 0.2399 (0.2427) loss 3.0976 (2.7128) grad_norm 6.0423 (inf) loss_scale 128.0000 (197.5953) mem 7381MB [2024-09-01 08:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1210/1251] eta 0:00:10 lr 0.000048 wd 0.0500 time 0.2370 (0.2443) data time 0.0009 (0.0014) model time 0.2361 (0.2426) loss 2.6565 (2.7125) grad_norm 5.8565 (inf) loss_scale 128.0000 (197.0206) mem 7381MB [2024-09-01 08:53:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1220/1251] eta 0:00:07 lr 0.000048 wd 0.0500 time 0.2338 (0.2443) data time 0.0012 (0.0014) model time 0.2327 (0.2426) loss 2.7295 (2.7105) grad_norm 5.6229 (inf) loss_scale 128.0000 (196.4554) mem 7381MB [2024-09-01 08:53:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1230/1251] eta 0:00:05 lr 
0.000048 wd 0.0500 time 0.2473 (0.2442) data time 0.0009 (0.0014) model time 0.2464 (0.2426) loss 2.9334 (2.7104) grad_norm 5.5322 (inf) loss_scale 128.0000 (195.8993) mem 7381MB [2024-09-01 08:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1240/1251] eta 0:00:02 lr 0.000048 wd 0.0500 time 0.2237 (0.2441) data time 0.0005 (0.0014) model time 0.2232 (0.2425) loss 3.2296 (2.7092) grad_norm 4.0300 (inf) loss_scale 128.0000 (195.3521) mem 7381MB [2024-09-01 08:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [264/300][1250/1251] eta 0:00:00 lr 0.000048 wd 0.0500 time 0.2238 (0.2440) data time 0.0007 (0.0014) model time 0.2232 (0.2423) loss 3.2821 (2.7104) grad_norm 4.5044 (inf) loss_scale 128.0000 (194.8137) mem 7381MB [2024-09-01 08:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 264 training takes 0:05:05 [2024-09-01 08:53:45 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:53:46 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 08:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.408 (0.408) Loss 0.3960 (0.3960) Acc@1 93.164 (93.164) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 08:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.108) Loss 0.5835 (0.6192) Acc@1 90.039 (87.322) Acc@5 97.852 (97.656) Mem 7381MB [2024-09-01 08:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.095) Loss 0.9199 (0.6476) Acc@1 77.441 (86.347) Acc@5 95.605 (97.600) Mem 7381MB [2024-09-01 08:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.068 (0.089) Loss 1.1621 (0.7422) Acc@1 73.242 (84.095) Acc@5 91.992 (96.620) Mem 7381MB [2024-09-01 08:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0342 (0.7896) Acc@1 76.562 (82.905) Acc@5 93.945 (96.108) Mem 7381MB [2024-09-01 08:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.560 Acc@5 96.048 [2024-09-01 08:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 08:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.756 (0.756) Loss 0.3833 (0.3833) Acc@1 93.359 (93.359) Acc@5 98.926 (98.926) Mem 7381MB [2024-09-01 08:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.147) Loss 0.5649 (0.6025) Acc@1 90.527 (87.855) Acc@5 97.949 (97.825) Mem 7381MB [2024-09-01 08:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.085 (0.116) Loss 0.9014 (0.6327) Acc@1 78.027 (86.635) Acc@5 95.801 (97.721) Mem 7381MB [2024-09-01 08:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.103) Loss 1.1182 (0.7226) Acc@1 74.609 (84.451) Acc@5 92.969 (96.821) Mem 7381MB [2024-09-01 08:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 0.9971 
(0.7695) Acc@1 77.344 (83.303) Acc@5 94.238 (96.315) Mem 7381MB [2024-09-01 08:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.928 Acc@5 96.260 [2024-09-01 08:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 08:53:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][0/1251] eta 0:24:01 lr 0.000048 wd 0.0500 time 1.1526 (1.1526) data time 0.6653 (0.6653) model time 0.0000 (0.0000) loss 2.7378 (2.7378) grad_norm 5.7496 (5.7496) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:53:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][10/1251] eta 0:06:41 lr 0.000048 wd 0.0500 time 0.2418 (0.3237) data time 0.0008 (0.0613) model time 0.0000 (0.0000) loss 2.6830 (2.7369) grad_norm 6.8600 (5.2707) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][20/1251] eta 0:05:49 lr 0.000048 wd 0.0500 time 0.2419 (0.2843) data time 0.0008 (0.0326) model time 0.0000 (0.0000) loss 2.5172 (2.6097) grad_norm 2.9419 (6.0490) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][30/1251] eta 0:05:29 lr 0.000048 wd 0.0500 time 0.2370 (0.2699) data time 0.0012 (0.0224) model time 0.0000 (0.0000) loss 1.9925 (2.6034) grad_norm 4.9319 (5.7307) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][40/1251] eta 0:05:18 lr 0.000048 wd 0.0500 time 0.2376 (0.2629) data time 0.0011 (0.0172) model time 0.0000 (0.0000) loss 1.8544 (2.5894) grad_norm 5.2024 (5.5733) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][50/1251] eta 0:05:10 lr 0.000048 wd 0.0500 time 0.2322 (0.2583) data time 0.0007 (0.0142) model time 0.0000 (0.0000) loss 2.1309 
(2.5987) grad_norm 3.8807 (5.3911) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][60/1251] eta 0:05:04 lr 0.000048 wd 0.0500 time 0.2436 (0.2554) data time 0.0009 (0.0120) model time 0.2427 (0.2397) loss 1.9479 (2.6075) grad_norm 4.4978 (5.3761) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][70/1251] eta 0:04:59 lr 0.000048 wd 0.0500 time 0.2445 (0.2535) data time 0.0007 (0.0105) model time 0.2438 (0.2402) loss 2.5024 (2.6338) grad_norm 8.3249 (5.3913) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][80/1251] eta 0:04:54 lr 0.000048 wd 0.0500 time 0.2411 (0.2518) data time 0.0009 (0.0093) model time 0.2402 (0.2397) loss 3.1662 (2.6358) grad_norm 5.8304 (5.3834) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][90/1251] eta 0:04:51 lr 0.000048 wd 0.0500 time 0.2474 (0.2508) data time 0.0009 (0.0084) model time 0.2464 (0.2401) loss 3.1512 (2.6514) grad_norm 3.7231 (5.2956) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][100/1251] eta 0:04:47 lr 0.000048 wd 0.0500 time 0.2461 (0.2498) data time 0.0009 (0.0077) model time 0.2453 (0.2400) loss 3.0866 (2.6562) grad_norm 4.6300 (5.2865) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][110/1251] eta 0:04:44 lr 0.000047 wd 0.0500 time 0.2350 (0.2491) data time 0.0011 (0.0071) model time 0.2339 (0.2403) loss 2.4602 (2.6619) grad_norm 4.1988 (5.2623) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][120/1251] eta 0:04:41 lr 0.000047 wd 0.0500 
time 0.2464 (0.2485) data time 0.0007 (0.0066) model time 0.2456 (0.2404) loss 2.9579 (2.6830) grad_norm 8.0822 (5.3781) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][130/1251] eta 0:04:38 lr 0.000047 wd 0.0500 time 0.2475 (0.2482) data time 0.0008 (0.0061) model time 0.2467 (0.2407) loss 3.1814 (2.6832) grad_norm 3.9459 (5.3361) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][140/1251] eta 0:04:35 lr 0.000047 wd 0.0500 time 0.2503 (0.2478) data time 0.0007 (0.0058) model time 0.2497 (0.2409) loss 3.2906 (2.6773) grad_norm 5.0603 (5.4248) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][150/1251] eta 0:04:32 lr 0.000047 wd 0.0500 time 0.2399 (0.2475) data time 0.0012 (0.0055) model time 0.2387 (0.2409) loss 2.8309 (2.6669) grad_norm 5.4531 (5.3642) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][160/1251] eta 0:04:29 lr 0.000047 wd 0.0500 time 0.2399 (0.2471) data time 0.0010 (0.0052) model time 0.2389 (0.2409) loss 2.8909 (2.6795) grad_norm 5.3090 (5.3674) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][170/1251] eta 0:04:26 lr 0.000047 wd 0.0500 time 0.2320 (0.2467) data time 0.0008 (0.0049) model time 0.2312 (0.2408) loss 3.1108 (2.6922) grad_norm 5.0918 (5.3562) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][180/1251] eta 0:04:23 lr 0.000047 wd 0.0500 time 0.2491 (0.2464) data time 0.0009 (0.0047) model time 0.2481 (0.2408) loss 1.7308 (2.6930) grad_norm 4.9384 (5.3102) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:41 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [265/300][190/1251] eta 0:04:21 lr 0.000047 wd 0.0500 time 0.2393 (0.2462) data time 0.0007 (0.0045) model time 0.2386 (0.2408) loss 1.6266 (2.6851) grad_norm 5.5119 (5.2960) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][200/1251] eta 0:04:18 lr 0.000047 wd 0.0500 time 0.2393 (0.2460) data time 0.0009 (0.0043) model time 0.2384 (0.2408) loss 3.3056 (2.6995) grad_norm 3.8577 (5.2786) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][210/1251] eta 0:04:15 lr 0.000047 wd 0.0500 time 0.2408 (0.2458) data time 0.0010 (0.0042) model time 0.2398 (0.2408) loss 1.9666 (2.6992) grad_norm 6.9577 (5.3126) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][220/1251] eta 0:04:13 lr 0.000047 wd 0.0500 time 0.2399 (0.2455) data time 0.0008 (0.0040) model time 0.2391 (0.2407) loss 2.5929 (2.7063) grad_norm 6.7831 (5.3562) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][230/1251] eta 0:04:10 lr 0.000047 wd 0.0500 time 0.2308 (0.2453) data time 0.0009 (0.0039) model time 0.2299 (0.2407) loss 1.6536 (2.6988) grad_norm 4.0851 (5.4258) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][240/1251] eta 0:04:07 lr 0.000047 wd 0.0500 time 0.2400 (0.2452) data time 0.0007 (0.0038) model time 0.2393 (0.2407) loss 2.3877 (2.7057) grad_norm 5.6253 (5.4175) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][250/1251] eta 0:04:05 lr 0.000047 wd 0.0500 time 0.2428 (0.2449) data time 0.0009 (0.0037) model time 0.2419 (0.2405) loss 2.9036 (2.7045) grad_norm 5.1283 
(5.4444) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:54:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][260/1251] eta 0:04:02 lr 0.000047 wd 0.0500 time 0.2340 (0.2449) data time 0.0009 (0.0036) model time 0.2332 (0.2406) loss 2.4265 (2.7072) grad_norm 3.8837 (5.4046) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][270/1251] eta 0:04:00 lr 0.000047 wd 0.0500 time 0.2369 (0.2447) data time 0.0010 (0.0035) model time 0.2358 (0.2406) loss 2.1478 (2.7105) grad_norm 4.0876 (5.3885) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][280/1251] eta 0:03:57 lr 0.000047 wd 0.0500 time 0.2366 (0.2446) data time 0.0009 (0.0034) model time 0.2357 (0.2406) loss 3.0152 (2.7155) grad_norm 4.9327 (5.3604) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][290/1251] eta 0:03:54 lr 0.000047 wd 0.0500 time 0.2442 (0.2444) data time 0.0007 (0.0033) model time 0.2435 (0.2404) loss 1.8627 (2.7137) grad_norm 4.4516 (5.3331) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][300/1251] eta 0:03:52 lr 0.000047 wd 0.0500 time 0.2453 (0.2443) data time 0.0010 (0.0032) model time 0.2443 (0.2404) loss 2.9492 (2.7100) grad_norm 5.2811 (5.3413) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][310/1251] eta 0:03:49 lr 0.000047 wd 0.0500 time 0.2455 (0.2442) data time 0.0009 (0.0031) model time 0.2446 (0.2405) loss 2.4547 (2.7036) grad_norm 3.7139 (5.3422) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][320/1251] eta 0:03:47 lr 0.000047 wd 0.0500 time 0.2443 (0.2442) data 
time 0.0007 (0.0031) model time 0.2436 (0.2405) loss 2.6294 (2.7081) grad_norm 3.9939 (5.3303) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][330/1251] eta 0:03:44 lr 0.000047 wd 0.0500 time 0.2423 (0.2441) data time 0.0009 (0.0030) model time 0.2414 (0.2406) loss 2.7542 (2.7126) grad_norm 5.3451 (5.3268) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][340/1251] eta 0:03:42 lr 0.000047 wd 0.0500 time 0.2451 (0.2440) data time 0.0011 (0.0030) model time 0.2439 (0.2406) loss 3.0336 (2.7148) grad_norm 5.9681 (5.3065) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][350/1251] eta 0:03:39 lr 0.000047 wd 0.0500 time 0.2413 (0.2439) data time 0.0010 (0.0029) model time 0.2403 (0.2405) loss 2.6724 (2.7146) grad_norm 4.0628 (5.3627) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][360/1251] eta 0:03:37 lr 0.000047 wd 0.0500 time 0.2442 (0.2439) data time 0.0011 (0.0029) model time 0.2431 (0.2405) loss 2.9315 (2.7155) grad_norm 6.1383 (5.3472) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][370/1251] eta 0:03:34 lr 0.000047 wd 0.0500 time 0.2420 (0.2438) data time 0.0009 (0.0028) model time 0.2411 (0.2405) loss 2.9343 (2.7160) grad_norm 11.2034 (5.3529) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][380/1251] eta 0:03:32 lr 0.000047 wd 0.0500 time 0.2393 (0.2443) data time 0.0008 (0.0028) model time 0.2385 (0.2412) loss 3.2364 (2.7100) grad_norm 4.2010 (5.3750) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [265/300][390/1251] eta 0:03:31 lr 0.000047 wd 0.0500 time 0.2382 (0.2454) data time 0.0010 (0.0027) model time 0.2372 (0.2424) loss 3.0865 (2.7155) grad_norm 3.8985 (5.3618) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][400/1251] eta 0:03:28 lr 0.000047 wd 0.0500 time 0.2417 (0.2452) data time 0.0009 (0.0027) model time 0.2408 (0.2424) loss 2.0802 (2.7187) grad_norm 7.7866 (5.3651) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][410/1251] eta 0:03:26 lr 0.000047 wd 0.0500 time 0.2345 (0.2451) data time 0.0011 (0.0026) model time 0.2334 (0.2423) loss 2.5101 (2.7237) grad_norm 12.3136 (5.3598) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][420/1251] eta 0:03:23 lr 0.000047 wd 0.0500 time 0.2359 (0.2451) data time 0.0010 (0.0026) model time 0.2349 (0.2423) loss 2.1215 (2.7237) grad_norm 3.0240 (5.3455) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][430/1251] eta 0:03:21 lr 0.000047 wd 0.0500 time 0.2329 (0.2450) data time 0.0011 (0.0026) model time 0.2318 (0.2422) loss 3.3221 (2.7258) grad_norm 4.7353 (5.3469) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][440/1251] eta 0:03:18 lr 0.000047 wd 0.0500 time 0.2432 (0.2449) data time 0.0006 (0.0025) model time 0.2425 (0.2422) loss 2.8074 (2.7215) grad_norm 3.8462 (5.3322) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][450/1251] eta 0:03:16 lr 0.000047 wd 0.0500 time 0.2381 (0.2448) data time 0.0010 (0.0025) model time 0.2371 (0.2421) loss 2.0140 (2.7248) grad_norm 5.1750 (5.3441) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 08:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][460/1251] eta 0:03:13 lr 0.000047 wd 0.0500 time 0.2307 (0.2448) data time 0.0010 (0.0025) model time 0.2297 (0.2422) loss 2.2408 (2.7218) grad_norm 4.1437 (5.3332) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][470/1251] eta 0:03:11 lr 0.000047 wd 0.0500 time 0.2347 (0.2451) data time 0.0010 (0.0024) model time 0.2337 (0.2426) loss 2.6678 (2.7240) grad_norm 4.2520 (5.3235) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][480/1251] eta 0:03:08 lr 0.000047 wd 0.0500 time 0.2408 (0.2450) data time 0.0007 (0.0024) model time 0.2401 (0.2425) loss 1.5248 (2.7202) grad_norm 3.1131 (5.3147) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][490/1251] eta 0:03:06 lr 0.000047 wd 0.0500 time 0.2405 (0.2449) data time 0.0009 (0.0024) model time 0.2396 (0.2424) loss 3.0489 (2.7186) grad_norm 3.3494 (5.2926) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][500/1251] eta 0:03:03 lr 0.000047 wd 0.0500 time 0.2434 (0.2449) data time 0.0010 (0.0023) model time 0.2424 (0.2424) loss 2.7207 (2.7134) grad_norm 4.8686 (5.2854) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:55:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][510/1251] eta 0:03:01 lr 0.000047 wd 0.0500 time 0.2420 (0.2448) data time 0.0007 (0.0023) model time 0.2414 (0.2423) loss 3.3997 (2.7117) grad_norm 4.8536 (5.2783) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][520/1251] eta 0:02:58 lr 0.000047 wd 0.0500 time 0.2408 (0.2447) data time 0.0011 (0.0023) model 
time 0.2397 (0.2422) loss 2.8082 (2.7139) grad_norm 5.4254 (5.2900) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][530/1251] eta 0:02:56 lr 0.000047 wd 0.0500 time 0.2390 (0.2446) data time 0.0010 (0.0023) model time 0.2380 (0.2422) loss 2.8595 (2.7159) grad_norm 5.9286 (5.2847) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][540/1251] eta 0:02:53 lr 0.000047 wd 0.0500 time 0.2407 (0.2445) data time 0.0010 (0.0022) model time 0.2397 (0.2421) loss 2.6718 (2.7184) grad_norm 3.6986 (5.2800) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][550/1251] eta 0:02:51 lr 0.000047 wd 0.0500 time 0.2354 (0.2445) data time 0.0010 (0.0022) model time 0.2344 (0.2421) loss 2.8398 (2.7197) grad_norm 3.8102 (5.2737) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][560/1251] eta 0:02:48 lr 0.000047 wd 0.0500 time 0.2413 (0.2444) data time 0.0008 (0.0022) model time 0.2404 (0.2420) loss 3.2811 (2.7221) grad_norm 3.7545 (5.2770) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][570/1251] eta 0:02:46 lr 0.000047 wd 0.0500 time 0.2295 (0.2443) data time 0.0009 (0.0022) model time 0.2286 (0.2420) loss 3.0642 (2.7193) grad_norm 12.0790 (5.2912) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][580/1251] eta 0:02:43 lr 0.000047 wd 0.0500 time 0.2431 (0.2443) data time 0.0011 (0.0021) model time 0.2420 (0.2419) loss 2.7496 (2.7170) grad_norm 3.9054 (5.2876) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][590/1251] 
eta 0:02:41 lr 0.000047 wd 0.0500 time 0.2399 (0.2442) data time 0.0009 (0.0021) model time 0.2389 (0.2419) loss 1.8494 (2.7166) grad_norm 4.8196 (5.2931) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][600/1251] eta 0:02:39 lr 0.000047 wd 0.0500 time 0.2420 (0.2445) data time 0.0007 (0.0021) model time 0.2413 (0.2422) loss 3.1791 (2.7133) grad_norm 3.6869 (5.3155) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][610/1251] eta 0:02:36 lr 0.000047 wd 0.0500 time 0.2466 (0.2444) data time 0.0010 (0.0021) model time 0.2455 (0.2422) loss 2.8113 (2.7145) grad_norm 5.1839 (5.3073) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][620/1251] eta 0:02:34 lr 0.000047 wd 0.0500 time 0.2414 (0.2443) data time 0.0010 (0.0021) model time 0.2404 (0.2421) loss 2.9655 (2.7144) grad_norm 5.6085 (5.3206) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][630/1251] eta 0:02:31 lr 0.000047 wd 0.0500 time 0.2448 (0.2443) data time 0.0008 (0.0021) model time 0.2440 (0.2421) loss 3.0261 (2.7115) grad_norm 7.2872 (5.3190) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][640/1251] eta 0:02:29 lr 0.000047 wd 0.0500 time 0.2399 (0.2442) data time 0.0011 (0.0020) model time 0.2388 (0.2420) loss 2.9123 (2.7123) grad_norm 5.9716 (5.3378) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][650/1251] eta 0:02:26 lr 0.000047 wd 0.0500 time 0.2415 (0.2442) data time 0.0008 (0.0020) model time 0.2407 (0.2420) loss 2.9651 (2.7125) grad_norm 6.7281 (5.3484) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
08:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][660/1251] eta 0:02:24 lr 0.000047 wd 0.0500 time 0.2461 (0.2441) data time 0.0011 (0.0020) model time 0.2450 (0.2419) loss 2.7592 (2.7118) grad_norm 5.4824 (5.3379) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][670/1251] eta 0:02:21 lr 0.000047 wd 0.0500 time 0.2413 (0.2441) data time 0.0008 (0.0020) model time 0.2405 (0.2419) loss 3.3690 (2.7111) grad_norm 6.3742 (5.3907) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][680/1251] eta 0:02:19 lr 0.000047 wd 0.0500 time 0.2409 (0.2440) data time 0.0010 (0.0020) model time 0.2399 (0.2419) loss 3.3264 (2.7118) grad_norm 4.4600 (5.3858) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][690/1251] eta 0:02:16 lr 0.000047 wd 0.0500 time 0.2434 (0.2440) data time 0.0009 (0.0020) model time 0.2425 (0.2419) loss 3.1339 (2.7121) grad_norm 5.2326 (5.3797) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][700/1251] eta 0:02:14 lr 0.000046 wd 0.0500 time 0.2463 (0.2440) data time 0.0008 (0.0019) model time 0.2455 (0.2419) loss 2.0222 (2.7128) grad_norm 7.2200 (5.4040) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][710/1251] eta 0:02:11 lr 0.000046 wd 0.0500 time 0.2450 (0.2439) data time 0.0009 (0.0019) model time 0.2441 (0.2418) loss 2.8615 (2.7133) grad_norm 5.9211 (5.4202) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][720/1251] eta 0:02:09 lr 0.000046 wd 0.0500 time 0.2305 (0.2439) data time 0.0011 (0.0019) model time 0.2294 (0.2418) loss 2.7868 
(2.7165) grad_norm 7.6708 (5.4299) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][730/1251] eta 0:02:07 lr 0.000046 wd 0.0500 time 0.2475 (0.2439) data time 0.0010 (0.0019) model time 0.2465 (0.2418) loss 2.5770 (2.7153) grad_norm 5.0347 (5.4917) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][740/1251] eta 0:02:04 lr 0.000046 wd 0.0500 time 0.2469 (0.2438) data time 0.0007 (0.0019) model time 0.2461 (0.2417) loss 1.9445 (2.7152) grad_norm 4.5590 (5.4892) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][750/1251] eta 0:02:02 lr 0.000046 wd 0.0500 time 0.2462 (0.2437) data time 0.0010 (0.0019) model time 0.2451 (0.2417) loss 2.6886 (2.7158) grad_norm 5.9678 (5.4803) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][760/1251] eta 0:01:59 lr 0.000046 wd 0.0500 time 0.2364 (0.2437) data time 0.0011 (0.0019) model time 0.2353 (0.2416) loss 2.4373 (2.7155) grad_norm 4.2670 (5.4775) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][770/1251] eta 0:01:57 lr 0.000046 wd 0.0500 time 0.2472 (0.2437) data time 0.0009 (0.0019) model time 0.2463 (0.2416) loss 2.2491 (2.7133) grad_norm 4.8189 (5.4756) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][780/1251] eta 0:01:54 lr 0.000046 wd 0.0500 time 0.2345 (0.2436) data time 0.0011 (0.0019) model time 0.2333 (0.2416) loss 3.0675 (2.7104) grad_norm 5.5512 (5.4658) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][790/1251] eta 0:01:52 lr 0.000046 wd 0.0500 
time 0.2551 (0.2436) data time 0.0007 (0.0018) model time 0.2543 (0.2416) loss 2.9671 (2.7110) grad_norm 5.2784 (5.4725) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][800/1251] eta 0:01:49 lr 0.000046 wd 0.0500 time 0.2366 (0.2436) data time 0.0010 (0.0018) model time 0.2356 (0.2416) loss 2.5949 (2.7108) grad_norm 5.9164 (5.4633) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][810/1251] eta 0:01:47 lr 0.000046 wd 0.0500 time 0.2416 (0.2436) data time 0.0007 (0.0018) model time 0.2409 (0.2416) loss 2.2817 (2.7115) grad_norm 4.8361 (5.4526) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][820/1251] eta 0:01:44 lr 0.000046 wd 0.0500 time 0.2418 (0.2435) data time 0.0009 (0.0018) model time 0.2409 (0.2416) loss 2.4028 (2.7099) grad_norm 5.9141 (5.4422) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][830/1251] eta 0:01:42 lr 0.000046 wd 0.0500 time 0.2433 (0.2435) data time 0.0009 (0.0018) model time 0.2424 (0.2416) loss 3.2830 (2.7090) grad_norm 4.7969 (5.4399) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][840/1251] eta 0:01:40 lr 0.000046 wd 0.0500 time 0.2327 (0.2435) data time 0.0007 (0.0018) model time 0.2320 (0.2416) loss 3.4608 (2.7108) grad_norm 3.9383 (5.4350) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][850/1251] eta 0:01:37 lr 0.000046 wd 0.0500 time 0.2429 (0.2435) data time 0.0011 (0.0018) model time 0.2418 (0.2416) loss 2.9103 (2.7117) grad_norm 4.6435 (5.4264) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [265/300][860/1251] eta 0:01:35 lr 0.000046 wd 0.0500 time 0.2390 (0.2435) data time 0.0010 (0.0018) model time 0.2381 (0.2415) loss 3.1078 (2.7143) grad_norm 6.2432 (5.4190) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][870/1251] eta 0:01:32 lr 0.000046 wd 0.0500 time 0.2494 (0.2435) data time 0.0009 (0.0018) model time 0.2484 (0.2415) loss 1.6137 (2.7136) grad_norm 6.3761 (5.4152) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][880/1251] eta 0:01:30 lr 0.000046 wd 0.0500 time 0.2448 (0.2434) data time 0.0007 (0.0018) model time 0.2441 (0.2415) loss 2.6552 (2.7114) grad_norm 3.0619 (5.4071) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][890/1251] eta 0:01:27 lr 0.000046 wd 0.0500 time 0.2484 (0.2434) data time 0.0008 (0.0017) model time 0.2476 (0.2415) loss 3.1661 (2.7112) grad_norm 4.0646 (5.4212) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][900/1251] eta 0:01:25 lr 0.000046 wd 0.0500 time 0.2393 (0.2436) data time 0.0010 (0.0017) model time 0.2383 (0.2417) loss 2.8388 (2.7096) grad_norm 3.4103 (5.4090) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][910/1251] eta 0:01:23 lr 0.000046 wd 0.0500 time 0.2462 (0.2439) data time 0.0009 (0.0017) model time 0.2453 (0.2421) loss 2.9587 (2.7077) grad_norm 5.7843 (5.4085) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][920/1251] eta 0:01:20 lr 0.000046 wd 0.0500 time 0.2359 (0.2439) data time 0.0009 (0.0017) model time 0.2350 (0.2420) loss 1.7564 (2.7065) grad_norm 3.4193 
(5.4019) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][930/1251] eta 0:01:18 lr 0.000046 wd 0.0500 time 0.2423 (0.2438) data time 0.0009 (0.0017) model time 0.2414 (0.2420) loss 3.0717 (2.7070) grad_norm 3.9049 (5.4023) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][940/1251] eta 0:01:15 lr 0.000046 wd 0.0500 time 0.2386 (0.2438) data time 0.0008 (0.0017) model time 0.2378 (0.2420) loss 3.3055 (2.7071) grad_norm 4.2191 (5.4022) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][950/1251] eta 0:01:13 lr 0.000046 wd 0.0500 time 0.2386 (0.2438) data time 0.0009 (0.0017) model time 0.2377 (0.2420) loss 3.0314 (2.7105) grad_norm 6.3635 (5.4239) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][960/1251] eta 0:01:10 lr 0.000046 wd 0.0500 time 0.2460 (0.2437) data time 0.0009 (0.0017) model time 0.2451 (0.2419) loss 1.9065 (2.7109) grad_norm 3.6054 (5.4172) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][970/1251] eta 0:01:08 lr 0.000046 wd 0.0500 time 0.2450 (0.2437) data time 0.0006 (0.0017) model time 0.2443 (0.2419) loss 3.3655 (2.7098) grad_norm 5.4248 (5.4155) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][980/1251] eta 0:01:06 lr 0.000046 wd 0.0500 time 0.2473 (0.2437) data time 0.0008 (0.0017) model time 0.2465 (0.2419) loss 3.0688 (2.7095) grad_norm 3.7273 (5.4071) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][990/1251] eta 0:01:03 lr 0.000046 wd 0.0500 time 0.2410 (0.2437) data 
time 0.0007 (0.0017) model time 0.2403 (0.2419) loss 1.9763 (2.7086) grad_norm 5.3964 (5.4019) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:57:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1000/1251] eta 0:01:01 lr 0.000046 wd 0.0500 time 0.2413 (0.2437) data time 0.0011 (0.0017) model time 0.2402 (0.2419) loss 3.3321 (2.7087) grad_norm 4.1925 (5.4003) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1010/1251] eta 0:00:58 lr 0.000046 wd 0.0500 time 0.2470 (0.2437) data time 0.0008 (0.0017) model time 0.2462 (0.2419) loss 3.1306 (2.7083) grad_norm 4.5976 (5.3995) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1020/1251] eta 0:00:56 lr 0.000046 wd 0.0500 time 0.2365 (0.2437) data time 0.0008 (0.0016) model time 0.2357 (0.2419) loss 3.1254 (2.7099) grad_norm 3.8928 (5.3934) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1030/1251] eta 0:00:53 lr 0.000046 wd 0.0500 time 0.2403 (0.2437) data time 0.0008 (0.0016) model time 0.2395 (0.2419) loss 1.8970 (2.7090) grad_norm 3.8570 (5.3907) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1040/1251] eta 0:00:51 lr 0.000046 wd 0.0500 time 0.2398 (0.2436) data time 0.0009 (0.0016) model time 0.2388 (0.2419) loss 2.4765 (2.7091) grad_norm 6.0672 (5.3904) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1050/1251] eta 0:00:48 lr 0.000046 wd 0.0500 time 0.2387 (0.2436) data time 0.0009 (0.0016) model time 0.2378 (0.2419) loss 2.0671 (2.7090) grad_norm 5.9405 (5.4055) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [265/300][1060/1251] eta 0:00:46 lr 0.000046 wd 0.0500 time 0.2388 (0.2436) data time 0.0010 (0.0016) model time 0.2377 (0.2418) loss 2.8761 (2.7082) grad_norm 6.5844 (5.4079) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1070/1251] eta 0:00:44 lr 0.000046 wd 0.0500 time 0.2497 (0.2436) data time 0.0007 (0.0016) model time 0.2490 (0.2418) loss 3.2537 (2.7091) grad_norm 4.0965 (5.4043) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1080/1251] eta 0:00:41 lr 0.000046 wd 0.0500 time 0.2345 (0.2435) data time 0.0009 (0.0016) model time 0.2336 (0.2418) loss 2.5456 (2.7095) grad_norm 5.7482 (5.3973) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1090/1251] eta 0:00:39 lr 0.000046 wd 0.0500 time 0.2470 (0.2435) data time 0.0010 (0.0016) model time 0.2460 (0.2418) loss 2.9180 (2.7110) grad_norm 3.4937 (5.3904) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1100/1251] eta 0:00:36 lr 0.000046 wd 0.0500 time 0.2502 (0.2435) data time 0.0007 (0.0016) model time 0.2495 (0.2418) loss 3.0002 (2.7111) grad_norm 3.8468 (5.3873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1110/1251] eta 0:00:34 lr 0.000046 wd 0.0500 time 0.2448 (0.2435) data time 0.0007 (0.0016) model time 0.2441 (0.2418) loss 3.0577 (2.7099) grad_norm 5.1824 (5.3832) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1120/1251] eta 0:00:31 lr 0.000046 wd 0.0500 time 0.2372 (0.2435) data time 0.0012 (0.0016) model time 0.2360 (0.2418) loss 2.4961 (2.7099) grad_norm 4.3048 (5.3835) loss_scale 
128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1130/1251] eta 0:00:29 lr 0.000046 wd 0.0500 time 0.2411 (0.2434) data time 0.0009 (0.0016) model time 0.2401 (0.2417) loss 1.7161 (2.7090) grad_norm 4.3407 (5.3795) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1140/1251] eta 0:00:27 lr 0.000046 wd 0.0500 time 0.2396 (0.2436) data time 0.0011 (0.0016) model time 0.2385 (0.2419) loss 2.9978 (2.7087) grad_norm 4.5830 (5.3793) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1150/1251] eta 0:00:24 lr 0.000046 wd 0.0500 time 0.2332 (0.2435) data time 0.0010 (0.0016) model time 0.2323 (0.2419) loss 2.2385 (2.7055) grad_norm 5.0550 (5.4008) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1160/1251] eta 0:00:22 lr 0.000046 wd 0.0500 time 0.2487 (0.2435) data time 0.0009 (0.0016) model time 0.2478 (0.2419) loss 2.9964 (2.7038) grad_norm 3.2282 (5.3909) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1170/1251] eta 0:00:19 lr 0.000046 wd 0.0500 time 0.2313 (0.2435) data time 0.0009 (0.0016) model time 0.2305 (0.2418) loss 2.1702 (2.7050) grad_norm 5.3681 (5.3899) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1180/1251] eta 0:00:17 lr 0.000046 wd 0.0500 time 0.2381 (0.2435) data time 0.0008 (0.0016) model time 0.2373 (0.2418) loss 2.1056 (2.7035) grad_norm 4.8747 (5.3850) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1190/1251] eta 0:00:14 lr 0.000046 wd 0.0500 time 0.2383 (0.2434) data time 0.0009 
(0.0016) model time 0.2375 (0.2418) loss 2.8352 (2.7041) grad_norm 3.7971 (5.3877) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1200/1251] eta 0:00:12 lr 0.000046 wd 0.0500 time 0.2375 (0.2434) data time 0.0011 (0.0015) model time 0.2365 (0.2418) loss 3.3156 (2.7046) grad_norm 5.4471 (5.3891) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1210/1251] eta 0:00:09 lr 0.000046 wd 0.0500 time 0.2407 (0.2434) data time 0.0007 (0.0015) model time 0.2401 (0.2418) loss 2.9930 (2.7027) grad_norm 5.5000 (5.3828) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1220/1251] eta 0:00:07 lr 0.000046 wd 0.0500 time 0.2438 (0.2434) data time 0.0011 (0.0015) model time 0.2427 (0.2417) loss 2.6893 (2.7032) grad_norm 4.2021 (5.3784) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1230/1251] eta 0:00:05 lr 0.000046 wd 0.0500 time 0.2395 (0.2434) data time 0.0008 (0.0015) model time 0.2388 (0.2417) loss 2.7545 (2.7027) grad_norm 4.9471 (5.3715) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1240/1251] eta 0:00:02 lr 0.000046 wd 0.0500 time 0.2263 (0.2433) data time 0.0007 (0.0015) model time 0.2257 (0.2416) loss 2.9849 (2.7029) grad_norm 5.1195 (5.3701) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [265/300][1250/1251] eta 0:00:00 lr 0.000046 wd 0.0500 time 0.2298 (0.2431) data time 0.0005 (0.0015) model time 0.2293 (0.2415) loss 2.2695 (2.7043) grad_norm 5.7837 (5.3849) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 
265 training takes 0:05:04 [2024-09-01 08:58:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 08:58:59 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 08:58:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.463 (0.463) Loss 0.3936 (0.3936) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 08:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.089 (0.117) Loss 0.5728 (0.6183) Acc@1 90.137 (87.420) Acc@5 98.047 (97.718) Mem 7381MB [2024-09-01 08:59:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.084 (0.100) Loss 0.9487 (0.6492) Acc@1 77.539 (86.426) Acc@5 95.410 (97.684) Mem 7381MB [2024-09-01 08:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.093) Loss 1.1543 (0.7450) Acc@1 73.730 (84.154) Acc@5 92.285 (96.689) Mem 7381MB [2024-09-01 08:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.087) Loss 1.0254 (0.7943) Acc@1 77.344 (82.939) Acc@5 93.652 (96.175) Mem 7381MB [2024-09-01 08:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.458 Acc@5 96.084 [2024-09-01 08:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.5% [2024-09-01 08:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.743 (0.743) Loss 0.3838 (0.3838) Acc@1 93.555 (93.555) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 08:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.144) Loss 0.5649 (0.6029) Acc@1 90.527 (87.828) Acc@5 97.949 (97.807) Mem 7381MB [2024-09-01 08:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.113) Loss 0.9023 (0.6332) Acc@1 78.125 (86.658) Acc@5 95.703 
(97.712) Mem 7381MB [2024-09-01 08:59:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.102) Loss 1.1211 (0.7233) Acc@1 74.609 (84.473) Acc@5 93.066 (96.821) Mem 7381MB [2024-09-01 08:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.093) Loss 0.9980 (0.7701) Acc@1 77.246 (83.332) Acc@5 94.238 (96.310) Mem 7381MB [2024-09-01 08:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.964 Acc@5 96.254 [2024-09-01 08:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 08:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][0/1251] eta 0:23:00 lr 0.000046 wd 0.0500 time 1.1034 (1.1034) data time 0.7280 (0.7280) model time 0.0000 (0.0000) loss 2.7588 (2.7588) grad_norm 3.5919 (3.5919) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][10/1251] eta 0:06:37 lr 0.000046 wd 0.0500 time 0.2497 (0.3204) data time 0.0008 (0.0670) model time 0.0000 (0.0000) loss 3.0799 (2.9044) grad_norm 4.1571 (4.6826) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][20/1251] eta 0:05:48 lr 0.000046 wd 0.0500 time 0.2438 (0.2831) data time 0.0008 (0.0356) model time 0.0000 (0.0000) loss 3.4863 (2.8109) grad_norm 5.6366 (4.7895) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][30/1251] eta 0:05:30 lr 0.000046 wd 0.0500 time 0.2521 (0.2703) data time 0.0011 (0.0244) model time 0.0000 (0.0000) loss 2.6208 (2.7782) grad_norm 3.7612 (4.8068) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][40/1251] eta 0:05:19 lr 0.000046 wd 0.0500 time 0.2413 (0.2640) data time 0.0007 (0.0187) model time 
0.0000 (0.0000) loss 2.8066 (2.7860) grad_norm 3.7221 (4.8416) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][50/1251] eta 0:05:12 lr 0.000046 wd 0.0500 time 0.2462 (0.2599) data time 0.0009 (0.0152) model time 0.0000 (0.0000) loss 2.9263 (2.8279) grad_norm 4.5468 (4.7979) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][60/1251] eta 0:05:06 lr 0.000045 wd 0.0500 time 0.2414 (0.2573) data time 0.0008 (0.0129) model time 0.2406 (0.2427) loss 1.7907 (2.8087) grad_norm 4.0249 (5.0036) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][70/1251] eta 0:05:01 lr 0.000045 wd 0.0500 time 0.2414 (0.2553) data time 0.0009 (0.0112) model time 0.2404 (0.2424) loss 3.0251 (2.7935) grad_norm 5.8960 (5.1389) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][80/1251] eta 0:04:56 lr 0.000045 wd 0.0500 time 0.2412 (0.2535) data time 0.0010 (0.0099) model time 0.2402 (0.2415) loss 2.7560 (2.7938) grad_norm 3.7370 (5.4322) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][90/1251] eta 0:04:55 lr 0.000045 wd 0.0500 time 0.2407 (0.2542) data time 0.0008 (0.0090) model time 0.2398 (0.2460) loss 2.3289 (2.7490) grad_norm 4.3249 (5.3048) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][100/1251] eta 0:04:51 lr 0.000045 wd 0.0500 time 0.2435 (0.2532) data time 0.0009 (0.0082) model time 0.2426 (0.2453) loss 2.6481 (2.7426) grad_norm 3.2416 (5.2063) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][110/1251] eta 
0:04:47 lr 0.000045 wd 0.0500 time 0.2387 (0.2521) data time 0.0010 (0.0075) model time 0.2377 (0.2445) loss 3.2632 (2.7557) grad_norm 3.8901 (5.1410) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][120/1251] eta 0:04:44 lr 0.000045 wd 0.0500 time 0.2393 (0.2512) data time 0.0008 (0.0070) model time 0.2385 (0.2438) loss 3.3931 (2.7539) grad_norm 3.7778 (5.1700) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][130/1251] eta 0:04:40 lr 0.000045 wd 0.0500 time 0.2411 (0.2503) data time 0.0007 (0.0065) model time 0.2403 (0.2431) loss 2.9732 (2.7477) grad_norm 4.2782 (5.4606) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][140/1251] eta 0:04:37 lr 0.000045 wd 0.0500 time 0.2373 (0.2496) data time 0.0010 (0.0061) model time 0.2362 (0.2428) loss 3.1672 (2.7608) grad_norm 6.1357 (5.5081) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][150/1251] eta 0:04:34 lr 0.000045 wd 0.0500 time 0.2439 (0.2492) data time 0.0010 (0.0058) model time 0.2429 (0.2427) loss 3.0542 (2.7626) grad_norm 4.5263 (5.5031) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 08:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][160/1251] eta 0:04:31 lr 0.000045 wd 0.0500 time 0.2378 (0.2486) data time 0.0009 (0.0055) model time 0.2369 (0.2423) loss 3.0044 (2.7691) grad_norm 4.1549 (5.4682) loss_scale 256.0000 (135.9503) mem 7381MB [2024-09-01 08:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][170/1251] eta 0:04:29 lr 0.000045 wd 0.0500 time 0.2434 (0.2491) data time 0.0010 (0.0052) model time 0.2424 (0.2435) loss 3.0114 (2.7694) grad_norm 5.5475 (5.4411) loss_scale 256.0000 (142.9708) mem 7381MB [2024-09-01 
08:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][180/1251] eta 0:04:28 lr 0.000045 wd 0.0500 time 0.2403 (0.2504) data time 0.0010 (0.0050) model time 0.2393 (0.2457) loss 2.8238 (2.7709) grad_norm 7.2885 (5.4997) loss_scale 256.0000 (149.2155) mem 7381MB [2024-09-01 08:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][190/1251] eta 0:04:25 lr 0.000045 wd 0.0500 time 0.2476 (0.2499) data time 0.0007 (0.0048) model time 0.2469 (0.2453) loss 3.0187 (2.7712) grad_norm 4.8186 (5.5594) loss_scale 256.0000 (154.8063) mem 7381MB [2024-09-01 08:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][200/1251] eta 0:04:22 lr 0.000045 wd 0.0500 time 0.2415 (0.2495) data time 0.0007 (0.0046) model time 0.2408 (0.2450) loss 3.1131 (2.7754) grad_norm 8.0942 (5.5340) loss_scale 256.0000 (159.8408) mem 7381MB [2024-09-01 09:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][210/1251] eta 0:04:19 lr 0.000045 wd 0.0500 time 0.2439 (0.2491) data time 0.0009 (0.0044) model time 0.2429 (0.2447) loss 3.2862 (2.7844) grad_norm 3.6371 (5.5271) loss_scale 256.0000 (164.3981) mem 7381MB [2024-09-01 09:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][220/1251] eta 0:04:16 lr 0.000045 wd 0.0500 time 0.2443 (0.2489) data time 0.0009 (0.0043) model time 0.2434 (0.2446) loss 1.8839 (2.7901) grad_norm 5.8289 (5.5035) loss_scale 256.0000 (168.5430) mem 7381MB [2024-09-01 09:00:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][230/1251] eta 0:04:13 lr 0.000045 wd 0.0500 time 0.2375 (0.2486) data time 0.0009 (0.0041) model time 0.2366 (0.2444) loss 3.0069 (2.7830) grad_norm 4.0974 (5.4722) loss_scale 256.0000 (172.3290) mem 7381MB [2024-09-01 09:00:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][240/1251] eta 0:04:10 lr 0.000045 wd 0.0500 time 0.2419 (0.2482) data time 0.0007 (0.0040) model time 0.2412 (0.2441) loss 2.8312 
(2.7832) grad_norm 5.8239 (5.5330) loss_scale 256.0000 (175.8008) mem 7381MB [2024-09-01 09:00:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][250/1251] eta 0:04:08 lr 0.000045 wd 0.0500 time 0.2386 (0.2479) data time 0.0009 (0.0039) model time 0.2377 (0.2438) loss 2.8765 (2.7853) grad_norm 11.2330 (5.5751) loss_scale 256.0000 (178.9960) mem 7381MB [2024-09-01 09:00:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][260/1251] eta 0:04:05 lr 0.000045 wd 0.0500 time 0.2324 (0.2476) data time 0.0010 (0.0038) model time 0.2314 (0.2436) loss 3.0052 (2.7886) grad_norm 3.4748 (5.5634) loss_scale 256.0000 (181.9464) mem 7381MB [2024-09-01 09:00:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][270/1251] eta 0:04:02 lr 0.000045 wd 0.0500 time 0.2434 (0.2474) data time 0.0008 (0.0037) model time 0.2426 (0.2435) loss 2.8485 (2.7722) grad_norm 4.1835 (5.6039) loss_scale 256.0000 (184.6790) mem 7381MB [2024-09-01 09:00:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][280/1251] eta 0:03:59 lr 0.000045 wd 0.0500 time 0.2383 (0.2471) data time 0.0008 (0.0036) model time 0.2375 (0.2433) loss 3.1839 (2.7734) grad_norm 5.7396 (5.5992) loss_scale 256.0000 (187.2171) mem 7381MB [2024-09-01 09:00:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][290/1251] eta 0:03:57 lr 0.000045 wd 0.0500 time 0.2404 (0.2468) data time 0.0008 (0.0035) model time 0.2396 (0.2430) loss 1.7768 (2.7727) grad_norm 5.8028 (5.6376) loss_scale 256.0000 (189.5808) mem 7381MB [2024-09-01 09:00:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][300/1251] eta 0:03:54 lr 0.000045 wd 0.0500 time 0.2360 (0.2467) data time 0.0010 (0.0034) model time 0.2350 (0.2430) loss 2.9096 (2.7758) grad_norm 7.5103 (5.6249) loss_scale 256.0000 (191.7874) mem 7381MB [2024-09-01 09:00:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][310/1251] eta 0:03:51 lr 0.000045 wd 
0.0500 time 0.2495 (0.2465) data time 0.0010 (0.0033) model time 0.2485 (0.2429) loss 2.4335 (2.7782) grad_norm 4.1141 (5.5845) loss_scale 256.0000 (193.8521) mem 7381MB [2024-09-01 09:00:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][320/1251] eta 0:03:49 lr 0.000045 wd 0.0500 time 0.2373 (0.2464) data time 0.0008 (0.0032) model time 0.2365 (0.2428) loss 2.3639 (2.7666) grad_norm 8.5650 (5.6282) loss_scale 256.0000 (195.7882) mem 7381MB [2024-09-01 09:00:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][330/1251] eta 0:03:46 lr 0.000045 wd 0.0500 time 0.2438 (0.2463) data time 0.0007 (0.0032) model time 0.2431 (0.2428) loss 3.1403 (2.7673) grad_norm 3.9824 (5.6183) loss_scale 256.0000 (197.6073) mem 7381MB [2024-09-01 09:00:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][340/1251] eta 0:03:44 lr 0.000045 wd 0.0500 time 0.2305 (0.2461) data time 0.0007 (0.0031) model time 0.2298 (0.2427) loss 2.7442 (2.7643) grad_norm 5.8908 (5.6013) loss_scale 256.0000 (199.3196) mem 7381MB [2024-09-01 09:00:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][350/1251] eta 0:03:41 lr 0.000045 wd 0.0500 time 0.2439 (0.2461) data time 0.0008 (0.0031) model time 0.2430 (0.2427) loss 1.9601 (2.7694) grad_norm 3.9173 (5.5902) loss_scale 256.0000 (200.9345) mem 7381MB [2024-09-01 09:00:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][360/1251] eta 0:03:39 lr 0.000045 wd 0.0500 time 0.2413 (0.2460) data time 0.0012 (0.0030) model time 0.2401 (0.2427) loss 3.1651 (2.7678) grad_norm 5.4227 (5.5696) loss_scale 256.0000 (202.4598) mem 7381MB [2024-09-01 09:00:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][370/1251] eta 0:03:36 lr 0.000045 wd 0.0500 time 0.2161 (0.2463) data time 0.0010 (0.0029) model time 0.2152 (0.2431) loss 2.8943 (2.7702) grad_norm 4.9702 (5.5427) loss_scale 256.0000 (203.9030) mem 7381MB [2024-09-01 09:00:41 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [266/300][380/1251] eta 0:03:34 lr 0.000045 wd 0.0500 time 0.2426 (0.2461) data time 0.0012 (0.0029) model time 0.2414 (0.2430) loss 2.9244 (2.7717) grad_norm 3.4563 (5.6173) loss_scale 256.0000 (205.2703) mem 7381MB [2024-09-01 09:00:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][390/1251] eta 0:03:31 lr 0.000045 wd 0.0500 time 0.2376 (0.2461) data time 0.0010 (0.0028) model time 0.2366 (0.2430) loss 3.0602 (2.7754) grad_norm 25.7841 (5.6616) loss_scale 256.0000 (206.5678) mem 7381MB [2024-09-01 09:00:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][400/1251] eta 0:03:29 lr 0.000045 wd 0.0500 time 0.2335 (0.2459) data time 0.0010 (0.0028) model time 0.2325 (0.2429) loss 2.6617 (2.7783) grad_norm 3.4947 (5.6319) loss_scale 256.0000 (207.8005) mem 7381MB [2024-09-01 09:00:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][410/1251] eta 0:03:26 lr 0.000045 wd 0.0500 time 0.2424 (0.2458) data time 0.0011 (0.0028) model time 0.2413 (0.2428) loss 2.2368 (2.7766) grad_norm 4.1379 (5.6009) loss_scale 256.0000 (208.9732) mem 7381MB [2024-09-01 09:00:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][420/1251] eta 0:03:24 lr 0.000045 wd 0.0500 time 0.2408 (0.2457) data time 0.0007 (0.0027) model time 0.2401 (0.2427) loss 2.9075 (2.7757) grad_norm 7.2026 (5.6608) loss_scale 256.0000 (210.0903) mem 7381MB [2024-09-01 09:00:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][430/1251] eta 0:03:21 lr 0.000045 wd 0.0500 time 0.2393 (0.2456) data time 0.0008 (0.0027) model time 0.2385 (0.2427) loss 2.8659 (2.7728) grad_norm 3.6968 (5.6617) loss_scale 256.0000 (211.1555) mem 7381MB [2024-09-01 09:00:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][440/1251] eta 0:03:19 lr 0.000045 wd 0.0500 time 0.2434 (0.2455) data time 0.0010 (0.0026) model time 0.2424 (0.2426) loss 3.0127 (2.7710) grad_norm 9.8611 
(5.6477) loss_scale 256.0000 (212.1723) mem 7381MB [2024-09-01 09:00:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][450/1251] eta 0:03:16 lr 0.000045 wd 0.0500 time 0.2367 (0.2453) data time 0.0008 (0.0026) model time 0.2358 (0.2425) loss 2.9439 (2.7675) grad_norm 4.1017 (5.6372) loss_scale 256.0000 (213.1441) mem 7381MB [2024-09-01 09:01:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][460/1251] eta 0:03:13 lr 0.000045 wd 0.0500 time 0.2376 (0.2452) data time 0.0009 (0.0026) model time 0.2367 (0.2424) loss 2.1629 (2.7615) grad_norm 4.7288 (5.6304) loss_scale 256.0000 (214.0738) mem 7381MB [2024-09-01 09:01:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][470/1251] eta 0:03:11 lr 0.000045 wd 0.0500 time 0.2466 (0.2451) data time 0.0009 (0.0025) model time 0.2457 (0.2424) loss 2.8683 (2.7642) grad_norm 4.9789 (5.6195) loss_scale 256.0000 (214.9639) mem 7381MB [2024-09-01 09:01:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][480/1251] eta 0:03:08 lr 0.000045 wd 0.0500 time 0.2444 (0.2450) data time 0.0007 (0.0025) model time 0.2436 (0.2423) loss 1.9580 (2.7613) grad_norm 8.4695 (5.6123) loss_scale 256.0000 (215.8170) mem 7381MB [2024-09-01 09:01:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][490/1251] eta 0:03:06 lr 0.000045 wd 0.0500 time 0.2322 (0.2449) data time 0.0009 (0.0025) model time 0.2313 (0.2422) loss 1.9557 (2.7619) grad_norm 5.1290 (5.5981) loss_scale 256.0000 (216.6354) mem 7381MB [2024-09-01 09:01:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][500/1251] eta 0:03:03 lr 0.000045 wd 0.0500 time 0.2306 (0.2448) data time 0.0010 (0.0024) model time 0.2296 (0.2421) loss 3.1915 (2.7620) grad_norm 3.7708 (5.5734) loss_scale 256.0000 (217.4212) mem 7381MB [2024-09-01 09:01:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][510/1251] eta 0:03:01 lr 0.000045 wd 0.0500 time 0.2398 (0.2447) data 
time 0.0012 (0.0024) model time 0.2386 (0.2421) loss 3.1251 (2.7632) grad_norm 3.8896 (5.6594) loss_scale 256.0000 (218.1761) mem 7381MB [2024-09-01 09:01:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][520/1251] eta 0:02:58 lr 0.000045 wd 0.0500 time 0.2742 (0.2447) data time 0.0011 (0.0024) model time 0.2731 (0.2421) loss 2.2916 (2.7627) grad_norm 4.0736 (5.6645) loss_scale 256.0000 (218.9021) mem 7381MB [2024-09-01 09:01:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][530/1251] eta 0:02:56 lr 0.000045 wd 0.0500 time 0.2412 (0.2447) data time 0.0010 (0.0023) model time 0.2402 (0.2421) loss 2.8253 (2.7602) grad_norm 3.6880 (5.6629) loss_scale 256.0000 (219.6008) mem 7381MB [2024-09-01 09:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][540/1251] eta 0:02:53 lr 0.000045 wd 0.0500 time 0.2379 (0.2446) data time 0.0008 (0.0023) model time 0.2371 (0.2420) loss 2.1910 (2.7547) grad_norm 3.6770 (5.6480) loss_scale 256.0000 (220.2736) mem 7381MB [2024-09-01 09:01:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][550/1251] eta 0:02:51 lr 0.000045 wd 0.0500 time 0.2399 (0.2446) data time 0.0011 (0.0023) model time 0.2388 (0.2420) loss 3.1546 (2.7549) grad_norm 4.5481 (5.6337) loss_scale 256.0000 (220.9220) mem 7381MB [2024-09-01 09:01:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][560/1251] eta 0:02:48 lr 0.000045 wd 0.0500 time 0.2426 (0.2445) data time 0.0009 (0.0023) model time 0.2417 (0.2420) loss 3.1297 (2.7547) grad_norm 3.6425 (5.6352) loss_scale 256.0000 (221.5472) mem 7381MB [2024-09-01 09:01:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][570/1251] eta 0:02:46 lr 0.000045 wd 0.0500 time 0.2402 (0.2445) data time 0.0007 (0.0023) model time 0.2395 (0.2420) loss 3.5062 (2.7551) grad_norm 4.4562 (5.6520) loss_scale 256.0000 (222.1506) mem 7381MB [2024-09-01 09:01:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [266/300][580/1251] eta 0:02:44 lr 0.000045 wd 0.0500 time 0.2444 (0.2445) data time 0.0012 (0.0022) model time 0.2432 (0.2420) loss 2.6226 (2.7544) grad_norm 3.9320 (5.6603) loss_scale 256.0000 (222.7332) mem 7381MB [2024-09-01 09:01:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][590/1251] eta 0:02:41 lr 0.000045 wd 0.0500 time 0.2461 (0.2444) data time 0.0012 (0.0022) model time 0.2448 (0.2419) loss 2.8782 (2.7502) grad_norm 9.4403 (5.6637) loss_scale 256.0000 (223.2961) mem 7381MB [2024-09-01 09:01:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][600/1251] eta 0:02:39 lr 0.000045 wd 0.0500 time 0.2330 (0.2443) data time 0.0011 (0.0022) model time 0.2319 (0.2419) loss 2.0175 (2.7460) grad_norm 5.7415 (5.6532) loss_scale 256.0000 (223.8403) mem 7381MB [2024-09-01 09:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][610/1251] eta 0:02:36 lr 0.000045 wd 0.0500 time 0.2439 (0.2443) data time 0.0009 (0.0022) model time 0.2431 (0.2419) loss 2.4588 (2.7492) grad_norm 3.0221 (5.6380) loss_scale 256.0000 (224.3666) mem 7381MB [2024-09-01 09:01:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][620/1251] eta 0:02:34 lr 0.000045 wd 0.0500 time 0.2392 (0.2443) data time 0.0007 (0.0022) model time 0.2385 (0.2419) loss 2.5651 (2.7471) grad_norm 5.7304 (5.6356) loss_scale 256.0000 (224.8760) mem 7381MB [2024-09-01 09:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][630/1251] eta 0:02:31 lr 0.000045 wd 0.0500 time 0.2450 (0.2442) data time 0.0007 (0.0021) model time 0.2443 (0.2418) loss 2.8222 (2.7487) grad_norm 6.1294 (5.6362) loss_scale 256.0000 (225.3693) mem 7381MB [2024-09-01 09:01:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][640/1251] eta 0:02:29 lr 0.000045 wd 0.0500 time 0.2427 (0.2441) data time 0.0010 (0.0021) model time 0.2417 (0.2418) loss 2.3673 (2.7486) grad_norm 3.2272 (5.6239) loss_scale 256.0000 
(225.8471) mem 7381MB [2024-09-01 09:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][650/1251] eta 0:02:26 lr 0.000045 wd 0.0500 time 0.2343 (0.2441) data time 0.0008 (0.0021) model time 0.2335 (0.2417) loss 3.2069 (2.7491) grad_norm 4.5654 (5.6164) loss_scale 256.0000 (226.3103) mem 7381MB [2024-09-01 09:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][660/1251] eta 0:02:24 lr 0.000045 wd 0.0500 time 0.2417 (0.2440) data time 0.0007 (0.0021) model time 0.2410 (0.2417) loss 2.5093 (2.7526) grad_norm 5.7418 (5.6026) loss_scale 256.0000 (226.7595) mem 7381MB [2024-09-01 09:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][670/1251] eta 0:02:21 lr 0.000044 wd 0.0500 time 0.2492 (0.2440) data time 0.0011 (0.0021) model time 0.2482 (0.2417) loss 3.0301 (2.7505) grad_norm 3.6350 (5.6011) loss_scale 256.0000 (227.1952) mem 7381MB [2024-09-01 09:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][680/1251] eta 0:02:19 lr 0.000044 wd 0.0500 time 0.2471 (0.2440) data time 0.0010 (0.0021) model time 0.2461 (0.2417) loss 2.5403 (2.7528) grad_norm 7.8051 (5.5970) loss_scale 256.0000 (227.6182) mem 7381MB [2024-09-01 09:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][690/1251] eta 0:02:16 lr 0.000044 wd 0.0500 time 0.2400 (0.2439) data time 0.0011 (0.0020) model time 0.2389 (0.2417) loss 2.6537 (2.7517) grad_norm 5.6665 (5.6028) loss_scale 256.0000 (228.0289) mem 7381MB [2024-09-01 09:01:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][700/1251] eta 0:02:14 lr 0.000044 wd 0.0500 time 0.4489 (0.2448) data time 0.0010 (0.0020) model time 0.4479 (0.2426) loss 3.0511 (2.7483) grad_norm 4.5815 (5.6039) loss_scale 256.0000 (228.4280) mem 7381MB [2024-09-01 09:02:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][710/1251] eta 0:02:12 lr 0.000044 wd 0.0500 time 0.2315 (0.2447) data time 0.0008 (0.0020) model 
time 0.2306 (0.2425) loss 2.7592 (2.7491) grad_norm 4.8069 (5.5989) loss_scale 256.0000 (228.8158) mem 7381MB [2024-09-01 09:02:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][720/1251] eta 0:02:09 lr 0.000044 wd 0.0500 time 0.2433 (0.2446) data time 0.0010 (0.0020) model time 0.2424 (0.2425) loss 2.8809 (2.7489) grad_norm 3.6734 (5.6046) loss_scale 256.0000 (229.1928) mem 7381MB [2024-09-01 09:02:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][730/1251] eta 0:02:07 lr 0.000044 wd 0.0500 time 0.2416 (0.2446) data time 0.0007 (0.0020) model time 0.2408 (0.2424) loss 1.6654 (2.7442) grad_norm 7.9936 (5.6043) loss_scale 256.0000 (229.5595) mem 7381MB [2024-09-01 09:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][740/1251] eta 0:02:04 lr 0.000044 wd 0.0500 time 0.2408 (0.2445) data time 0.0007 (0.0020) model time 0.2401 (0.2424) loss 3.2526 (2.7442) grad_norm 4.8106 (5.6060) loss_scale 256.0000 (229.9163) mem 7381MB [2024-09-01 09:02:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][750/1251] eta 0:02:02 lr 0.000044 wd 0.0500 time 0.2433 (0.2445) data time 0.0010 (0.0020) model time 0.2423 (0.2423) loss 2.8936 (2.7434) grad_norm 3.9388 (5.5936) loss_scale 256.0000 (230.2636) mem 7381MB [2024-09-01 09:02:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][760/1251] eta 0:02:00 lr 0.000044 wd 0.0500 time 0.2316 (0.2444) data time 0.0008 (0.0020) model time 0.2307 (0.2423) loss 1.7314 (2.7419) grad_norm 7.5060 (5.5815) loss_scale 256.0000 (230.6018) mem 7381MB [2024-09-01 09:02:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][770/1251] eta 0:01:57 lr 0.000044 wd 0.0500 time 0.2406 (0.2444) data time 0.0008 (0.0019) model time 0.2398 (0.2423) loss 2.0055 (2.7383) grad_norm 4.6520 (5.5709) loss_scale 256.0000 (230.9313) mem 7381MB [2024-09-01 09:02:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][780/1251] 
eta 0:01:55 lr 0.000044 wd 0.0500 time 0.2363 (0.2444) data time 0.0010 (0.0019) model time 0.2353 (0.2423) loss 3.0461 (2.7397) grad_norm 3.8236 (5.5627) loss_scale 256.0000 (231.2522) mem 7381MB [2024-09-01 09:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][790/1251] eta 0:01:52 lr 0.000044 wd 0.0500 time 0.2434 (0.2443) data time 0.0012 (0.0019) model time 0.2422 (0.2422) loss 1.8544 (2.7419) grad_norm 4.3231 (5.5514) loss_scale 256.0000 (231.5651) mem 7381MB [2024-09-01 09:02:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][800/1251] eta 0:01:50 lr 0.000044 wd 0.0500 time 0.2470 (0.2443) data time 0.0007 (0.0019) model time 0.2463 (0.2422) loss 3.4593 (2.7421) grad_norm 4.0827 (5.5461) loss_scale 256.0000 (231.8702) mem 7381MB [2024-09-01 09:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][810/1251] eta 0:01:47 lr 0.000044 wd 0.0500 time 0.2375 (0.2442) data time 0.0011 (0.0019) model time 0.2364 (0.2422) loss 2.7778 (2.7436) grad_norm 4.3258 (5.5338) loss_scale 256.0000 (232.1677) mem 7381MB [2024-09-01 09:02:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][820/1251] eta 0:01:45 lr 0.000044 wd 0.0500 time 0.2417 (0.2442) data time 0.0013 (0.0019) model time 0.2404 (0.2422) loss 3.2686 (2.7446) grad_norm 5.6031 (5.5252) loss_scale 256.0000 (232.4580) mem 7381MB [2024-09-01 09:02:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][830/1251] eta 0:01:42 lr 0.000044 wd 0.0500 time 0.2375 (0.2442) data time 0.0010 (0.0019) model time 0.2366 (0.2422) loss 2.9500 (2.7426) grad_norm 3.4120 (5.5134) loss_scale 256.0000 (232.7413) mem 7381MB [2024-09-01 09:02:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][840/1251] eta 0:01:40 lr 0.000044 wd 0.0500 time 0.2382 (0.2442) data time 0.0007 (0.0019) model time 0.2375 (0.2422) loss 3.2894 (2.7423) grad_norm 4.3060 (5.5208) loss_scale 256.0000 (233.0178) mem 7381MB [2024-09-01 
09:02:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][850/1251] eta 0:01:37 lr 0.000044 wd 0.0500 time 0.2458 (0.2442) data time 0.0007 (0.0019) model time 0.2451 (0.2422) loss 2.3612 (2.7410) grad_norm 5.3630 (5.5569) loss_scale 256.0000 (233.2879) mem 7381MB [2024-09-01 09:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][860/1251] eta 0:01:35 lr 0.000044 wd 0.0500 time 0.2391 (0.2442) data time 0.0010 (0.0018) model time 0.2381 (0.2422) loss 2.9779 (2.7401) grad_norm 6.1867 (5.5875) loss_scale 256.0000 (233.5517) mem 7381MB [2024-09-01 09:02:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][870/1251] eta 0:01:33 lr 0.000044 wd 0.0500 time 0.2469 (0.2442) data time 0.0009 (0.0018) model time 0.2460 (0.2422) loss 2.4603 (2.7423) grad_norm 5.8211 (5.6062) loss_scale 256.0000 (233.8094) mem 7381MB [2024-09-01 09:02:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][880/1251] eta 0:01:30 lr 0.000044 wd 0.0500 time 0.2452 (0.2442) data time 0.0007 (0.0018) model time 0.2445 (0.2422) loss 2.6453 (2.7443) grad_norm 5.9369 (5.5986) loss_scale 256.0000 (234.0613) mem 7381MB [2024-09-01 09:02:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][890/1251] eta 0:01:28 lr 0.000044 wd 0.0500 time 0.2349 (0.2441) data time 0.0010 (0.0018) model time 0.2339 (0.2422) loss 3.1790 (2.7457) grad_norm 4.4876 (5.5935) loss_scale 256.0000 (234.3075) mem 7381MB [2024-09-01 09:02:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][900/1251] eta 0:01:25 lr 0.000044 wd 0.0500 time 0.2378 (0.2443) data time 0.0007 (0.0018) model time 0.2371 (0.2424) loss 2.5208 (2.7468) grad_norm 5.3202 (5.5952) loss_scale 256.0000 (234.5483) mem 7381MB [2024-09-01 09:02:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][910/1251] eta 0:01:23 lr 0.000044 wd 0.0500 time 0.2431 (0.2443) data time 0.0007 (0.0018) model time 0.2424 (0.2424) loss 3.0765 
(2.7447) grad_norm 3.7481 (5.5961) loss_scale 256.0000 (234.7838) mem 7381MB [2024-09-01 09:02:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][920/1251] eta 0:01:20 lr 0.000044 wd 0.0500 time 0.2322 (0.2443) data time 0.0010 (0.0018) model time 0.2311 (0.2424) loss 2.2952 (2.7431) grad_norm 3.9561 (5.5954) loss_scale 256.0000 (235.0141) mem 7381MB [2024-09-01 09:02:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][930/1251] eta 0:01:18 lr 0.000044 wd 0.0500 time 0.2424 (0.2443) data time 0.0009 (0.0018) model time 0.2415 (0.2424) loss 2.1575 (2.7433) grad_norm 3.4277 (5.5858) loss_scale 256.0000 (235.2395) mem 7381MB [2024-09-01 09:02:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][940/1251] eta 0:01:15 lr 0.000044 wd 0.0500 time 0.2366 (0.2443) data time 0.0010 (0.0018) model time 0.2356 (0.2424) loss 2.6284 (2.7443) grad_norm 4.8295 (5.5760) loss_scale 256.0000 (235.4601) mem 7381MB [2024-09-01 09:02:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][950/1251] eta 0:01:13 lr 0.000044 wd 0.0500 time 0.2422 (0.2443) data time 0.0010 (0.0018) model time 0.2411 (0.2424) loss 3.1385 (2.7434) grad_norm 4.6882 (5.5685) loss_scale 256.0000 (235.6761) mem 7381MB [2024-09-01 09:03:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][960/1251] eta 0:01:11 lr 0.000044 wd 0.0500 time 0.2344 (0.2442) data time 0.0008 (0.0018) model time 0.2336 (0.2424) loss 2.0505 (2.7443) grad_norm 4.8652 (5.5612) loss_scale 256.0000 (235.8876) mem 7381MB [2024-09-01 09:03:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][970/1251] eta 0:01:08 lr 0.000044 wd 0.0500 time 0.2605 (0.2442) data time 0.0009 (0.0017) model time 0.2596 (0.2424) loss 3.1635 (2.7440) grad_norm 4.4317 (5.5582) loss_scale 256.0000 (236.0947) mem 7381MB [2024-09-01 09:03:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][980/1251] eta 0:01:06 lr 0.000044 wd 0.0500 
time 0.2399 (0.2442) data time 0.0009 (0.0017) model time 0.2390 (0.2424) loss 1.6705 (2.7426) grad_norm 4.0123 (5.5511) loss_scale 256.0000 (236.2977) mem 7381MB [2024-09-01 09:03:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][990/1251] eta 0:01:03 lr 0.000044 wd 0.0500 time 0.2428 (0.2442) data time 0.0009 (0.0017) model time 0.2420 (0.2424) loss 2.9303 (2.7423) grad_norm 11.2537 (5.5643) loss_scale 256.0000 (236.4965) mem 7381MB [2024-09-01 09:03:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1000/1251] eta 0:01:01 lr 0.000044 wd 0.0500 time 0.2408 (0.2442) data time 0.0007 (0.0017) model time 0.2400 (0.2423) loss 2.6055 (2.7439) grad_norm 5.4725 (5.5591) loss_scale 256.0000 (236.6913) mem 7381MB [2024-09-01 09:03:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1010/1251] eta 0:00:58 lr 0.000044 wd 0.0500 time 0.2466 (0.2442) data time 0.0010 (0.0017) model time 0.2456 (0.2423) loss 2.7363 (2.7425) grad_norm 5.1925 (5.5578) loss_scale 256.0000 (236.8823) mem 7381MB [2024-09-01 09:03:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1020/1251] eta 0:00:56 lr 0.000044 wd 0.0500 time 0.2376 (0.2444) data time 0.0009 (0.0017) model time 0.2367 (0.2425) loss 2.6524 (2.7435) grad_norm 9.4870 (5.5629) loss_scale 256.0000 (237.0695) mem 7381MB [2024-09-01 09:03:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1030/1251] eta 0:00:54 lr 0.000044 wd 0.0500 time 0.2372 (0.2444) data time 0.0007 (0.0017) model time 0.2365 (0.2426) loss 2.5197 (2.7427) grad_norm 7.5062 (5.5743) loss_scale 256.0000 (237.2532) mem 7381MB [2024-09-01 09:03:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1040/1251] eta 0:00:51 lr 0.000044 wd 0.0500 time 0.2453 (0.2443) data time 0.0009 (0.0017) model time 0.2444 (0.2425) loss 1.7056 (2.7431) grad_norm 6.0032 (5.5860) loss_scale 256.0000 (237.4332) mem 7381MB [2024-09-01 09:03:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [266/300][1050/1251] eta 0:00:49 lr 0.000044 wd 0.0500 time 0.2401 (0.2443) data time 0.0011 (0.0017) model time 0.2390 (0.2425) loss 3.1466 (2.7431) grad_norm 10.0982 (5.5872) loss_scale 256.0000 (237.6099) mem 7381MB [2024-09-01 09:03:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1060/1251] eta 0:00:46 lr 0.000044 wd 0.0500 time 0.2438 (0.2443) data time 0.0008 (0.0017) model time 0.2430 (0.2425) loss 3.3726 (2.7421) grad_norm 5.5036 (5.5943) loss_scale 256.0000 (237.7832) mem 7381MB [2024-09-01 09:03:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1070/1251] eta 0:00:44 lr 0.000044 wd 0.0500 time 0.2408 (0.2443) data time 0.0009 (0.0017) model time 0.2399 (0.2425) loss 3.2086 (2.7427) grad_norm 6.4415 (5.5969) loss_scale 256.0000 (237.9533) mem 7381MB [2024-09-01 09:03:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1080/1251] eta 0:00:41 lr 0.000044 wd 0.0500 time 0.2390 (0.2443) data time 0.0009 (0.0017) model time 0.2381 (0.2425) loss 2.7396 (2.7442) grad_norm 3.8284 (5.6227) loss_scale 256.0000 (238.1203) mem 7381MB [2024-09-01 09:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1090/1251] eta 0:00:39 lr 0.000044 wd 0.0500 time 0.2449 (0.2443) data time 0.0010 (0.0017) model time 0.2438 (0.2425) loss 2.8125 (2.7450) grad_norm 4.1490 (5.6256) loss_scale 256.0000 (238.2841) mem 7381MB [2024-09-01 09:03:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1100/1251] eta 0:00:36 lr 0.000044 wd 0.0500 time 0.2363 (0.2443) data time 0.0009 (0.0017) model time 0.2354 (0.2425) loss 2.9333 (2.7440) grad_norm 5.1107 (5.6196) loss_scale 256.0000 (238.4450) mem 7381MB [2024-09-01 09:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1110/1251] eta 0:00:34 lr 0.000044 wd 0.0500 time 0.2383 (0.2443) data time 0.0009 (0.0017) model time 0.2374 (0.2425) loss 2.9444 (2.7427) grad_norm 
6.6378 (5.6296) loss_scale 256.0000 (238.6031) mem 7381MB [2024-09-01 09:03:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1120/1251] eta 0:00:31 lr 0.000044 wd 0.0500 time 0.2451 (0.2442) data time 0.0008 (0.0016) model time 0.2444 (0.2425) loss 3.6925 (2.7433) grad_norm 4.1487 (5.6226) loss_scale 256.0000 (238.7583) mem 7381MB [2024-09-01 09:03:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1130/1251] eta 0:00:29 lr 0.000044 wd 0.0500 time 0.2422 (0.2442) data time 0.0009 (0.0016) model time 0.2413 (0.2424) loss 1.6563 (2.7441) grad_norm 5.3087 (5.6107) loss_scale 256.0000 (238.9107) mem 7381MB [2024-09-01 09:03:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1140/1251] eta 0:00:27 lr 0.000044 wd 0.0500 time 0.2436 (0.2442) data time 0.0007 (0.0016) model time 0.2429 (0.2424) loss 2.0107 (2.7448) grad_norm 4.6349 (5.6022) loss_scale 256.0000 (239.0605) mem 7381MB [2024-09-01 09:03:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1150/1251] eta 0:00:24 lr 0.000044 wd 0.0500 time 0.2521 (0.2442) data time 0.0009 (0.0016) model time 0.2511 (0.2424) loss 2.9731 (2.7477) grad_norm 5.1318 (5.6062) loss_scale 256.0000 (239.2076) mem 7381MB [2024-09-01 09:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1160/1251] eta 0:00:22 lr 0.000044 wd 0.0500 time 0.2524 (0.2442) data time 0.0008 (0.0016) model time 0.2516 (0.2424) loss 3.0629 (2.7485) grad_norm 4.1621 (5.5994) loss_scale 256.0000 (239.3523) mem 7381MB [2024-09-01 09:03:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1170/1251] eta 0:00:19 lr 0.000044 wd 0.0500 time 0.2384 (0.2441) data time 0.0007 (0.0016) model time 0.2377 (0.2424) loss 1.7873 (2.7473) grad_norm 3.8645 (5.5980) loss_scale 256.0000 (239.4944) mem 7381MB [2024-09-01 09:03:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1180/1251] eta 0:00:17 lr 0.000044 wd 0.0500 time 0.2351 
(0.2441) data time 0.0011 (0.0016) model time 0.2340 (0.2424) loss 2.8181 (2.7481) grad_norm 4.2757 (5.6250) loss_scale 256.0000 (239.6342) mem 7381MB [2024-09-01 09:03:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1190/1251] eta 0:00:14 lr 0.000044 wd 0.0500 time 0.2448 (0.2441) data time 0.0008 (0.0016) model time 0.2440 (0.2424) loss 2.3036 (2.7481) grad_norm 4.1244 (5.6221) loss_scale 256.0000 (239.7716) mem 7381MB [2024-09-01 09:04:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1200/1251] eta 0:00:12 lr 0.000044 wd 0.0500 time 0.2398 (0.2440) data time 0.0007 (0.0016) model time 0.2390 (0.2423) loss 2.8862 (2.7485) grad_norm 3.7047 (5.6197) loss_scale 256.0000 (239.9067) mem 7381MB [2024-09-01 09:04:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1210/1251] eta 0:00:10 lr 0.000044 wd 0.0500 time 0.2483 (0.2440) data time 0.0010 (0.0016) model time 0.2473 (0.2423) loss 2.7620 (2.7499) grad_norm 4.7849 (5.6226) loss_scale 256.0000 (240.0396) mem 7381MB [2024-09-01 09:04:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1220/1251] eta 0:00:07 lr 0.000044 wd 0.0500 time 0.2450 (0.2440) data time 0.0010 (0.0016) model time 0.2440 (0.2423) loss 2.1259 (2.7483) grad_norm 5.5838 (5.6210) loss_scale 256.0000 (240.1704) mem 7381MB [2024-09-01 09:04:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1230/1251] eta 0:00:05 lr 0.000044 wd 0.0500 time 0.2401 (0.2440) data time 0.0007 (0.0016) model time 0.2394 (0.2423) loss 3.3931 (2.7481) grad_norm 4.4057 (5.6144) loss_scale 256.0000 (240.2989) mem 7381MB [2024-09-01 09:04:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [266/300][1240/1251] eta 0:00:02 lr 0.000044 wd 0.0500 time 0.2244 (0.2439) data time 0.0007 (0.0016) model time 0.2237 (0.2422) loss 1.8666 (2.7472) grad_norm 4.7643 (5.6926) loss_scale 256.0000 (240.4255) mem 7381MB [2024-09-01 09:04:12 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [266/300][1250/1251] eta 0:00:00 lr 0.000044 wd 0.0500 time 0.2314 (0.2438) data time 0.0005 (0.0016) model time 0.2309 (0.2421) loss 1.6236 (2.7439) grad_norm 3.4051 (5.7173) loss_scale 256.0000 (240.5500) mem 7381MB [2024-09-01 09:04:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 266 training takes 0:05:04 [2024-09-01 09:04:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:04:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 09:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.480 (0.480) Loss 0.3965 (0.3965) Acc@1 92.969 (92.969) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 09:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.115) Loss 0.5918 (0.6102) Acc@1 89.941 (87.553) Acc@5 98.047 (97.692) Mem 7381MB [2024-09-01 09:04:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.089 (0.098) Loss 0.9199 (0.6428) Acc@1 78.027 (86.444) Acc@5 95.703 (97.628) Mem 7381MB [2024-09-01 09:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.091) Loss 1.1660 (0.7370) Acc@1 73.438 (84.233) Acc@5 92.578 (96.689) Mem 7381MB [2024-09-01 09:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0254 (0.7856) Acc@1 77.246 (83.072) Acc@5 94.043 (96.184) Mem 7381MB [2024-09-01 09:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.606 Acc@5 96.100 [2024-09-01 09:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 09:04:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.61% [2024-09-01 09:04:17 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 09:04:18 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 09:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.479 (0.479) Loss 0.3843 (0.3843) Acc@1 93.555 (93.555) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 09:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.113) Loss 0.5654 (0.6032) Acc@1 90.430 (87.793) Acc@5 98.047 (97.843) Mem 7381MB [2024-09-01 09:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.097) Loss 0.9028 (0.6333) Acc@1 77.930 (86.630) Acc@5 95.801 (97.740) Mem 7381MB [2024-09-01 09:04:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.1230 (0.7237) Acc@1 74.609 (84.454) Acc@5 93.066 (96.837) Mem 7381MB [2024-09-01 09:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 0.9976 (0.7705) Acc@1 77.148 (83.310) Acc@5 94.238 (96.337) Mem 7381MB [2024-09-01 09:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.950 Acc@5 96.278 [2024-09-01 09:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 09:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][0/1251] eta 0:23:12 lr 0.000044 wd 0.0500 time 1.1129 (1.1129) data time 0.5753 (0.5753) model time 0.0000 (0.0000) loss 3.2151 (3.2151) grad_norm 6.2036 (6.2036) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][10/1251] eta 0:06:38 lr 0.000044 wd 0.0500 time 0.2405 (0.3214) data time 0.0009 (0.0532) model time 0.0000 (0.0000) loss 1.9319 (2.8288) grad_norm 4.6855 (5.4105) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][20/1251] eta 0:05:48 lr 0.000044 wd 0.0500 time 0.2377 (0.2832) data time 0.0009 (0.0284) model time 0.0000 (0.0000) loss 3.0547 (2.7871) grad_norm 12.5126 (6.2827) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][30/1251] eta 0:05:28 lr 0.000043 wd 0.0500 time 0.2452 (0.2691) data time 0.0007 (0.0195) model time 0.0000 (0.0000) loss 3.2022 (2.7767) grad_norm 5.5866 (6.2314) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][40/1251] eta 0:05:17 lr 0.000043 wd 0.0500 time 0.2400 (0.2624) data time 0.0010 (0.0150) model time 0.0000 (0.0000) loss 3.2491 (2.7458) grad_norm 4.3943 (5.8945) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][50/1251] eta 0:05:10 lr 0.000043 wd 0.0500 time 0.2408 (0.2582) data time 0.0007 (0.0123) model time 0.0000 (0.0000) loss 1.6945 (2.7311) grad_norm 3.4133 (5.7520) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][60/1251] eta 0:05:04 lr 0.000043 wd 0.0500 time 0.2471 (0.2554) data time 0.0011 (0.0104) model time 0.2460 (0.2403) loss 3.0672 (2.7503) grad_norm 4.3499 (5.5529) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][70/1251] eta 0:04:59 lr 0.000043 wd 0.0500 time 0.2448 (0.2534) data time 0.0008 (0.0091) model time 0.2440 (0.2402) loss 2.7134 (2.7736) grad_norm 5.2169 (5.5236) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][80/1251] eta 0:04:54 lr 0.000043 wd 0.0500 time 0.2473 (0.2519) data time 0.0007 (0.0081) model time 0.2465 (0.2402) loss 3.1427 (2.7753) 
grad_norm 7.9654 (5.5512) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][90/1251] eta 0:04:51 lr 0.000043 wd 0.0500 time 0.2423 (0.2507) data time 0.0009 (0.0073) model time 0.2414 (0.2403) loss 2.8672 (2.7801) grad_norm 4.1664 (5.7294) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][100/1251] eta 0:04:47 lr 0.000043 wd 0.0500 time 0.2451 (0.2500) data time 0.0010 (0.0067) model time 0.2441 (0.2406) loss 2.8411 (2.7799) grad_norm 3.7629 (5.8169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][110/1251] eta 0:04:44 lr 0.000043 wd 0.0500 time 0.2418 (0.2492) data time 0.0007 (0.0062) model time 0.2411 (0.2405) loss 1.7142 (2.7724) grad_norm 4.3537 (6.1265) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][120/1251] eta 0:04:40 lr 0.000043 wd 0.0500 time 0.2369 (0.2483) data time 0.0010 (0.0057) model time 0.2359 (0.2402) loss 2.7983 (2.7557) grad_norm 5.9080 (6.2153) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][130/1251] eta 0:04:37 lr 0.000043 wd 0.0500 time 0.2452 (0.2477) data time 0.0011 (0.0054) model time 0.2441 (0.2400) loss 2.8688 (2.7272) grad_norm 4.2060 (6.1200) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][140/1251] eta 0:04:34 lr 0.000043 wd 0.0500 time 0.2479 (0.2473) data time 0.0008 (0.0051) model time 0.2471 (0.2402) loss 3.3604 (2.7332) grad_norm 3.8175 (6.0616) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:04:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][150/1251] eta 0:04:31 lr 0.000043 wd 0.0500 time 
0.2383 (0.2469) data time 0.0011 (0.0048) model time 0.2372 (0.2402) loss 2.6003 (2.7233) grad_norm 3.3455 (6.0020) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][160/1251] eta 0:04:28 lr 0.000043 wd 0.0500 time 0.2397 (0.2465) data time 0.0009 (0.0046) model time 0.2388 (0.2402) loss 2.3640 (2.7097) grad_norm 4.2861 (5.9512) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][170/1251] eta 0:04:26 lr 0.000043 wd 0.0500 time 0.2432 (0.2463) data time 0.0012 (0.0044) model time 0.2420 (0.2403) loss 3.1084 (2.7162) grad_norm 5.4359 (5.9653) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][180/1251] eta 0:04:23 lr 0.000043 wd 0.0500 time 0.2407 (0.2460) data time 0.0008 (0.0042) model time 0.2400 (0.2402) loss 3.2537 (2.7220) grad_norm 4.4276 (5.8864) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][190/1251] eta 0:04:20 lr 0.000043 wd 0.0500 time 0.2454 (0.2458) data time 0.0007 (0.0040) model time 0.2447 (0.2402) loss 3.4288 (2.7070) grad_norm 4.6891 (5.8873) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][200/1251] eta 0:04:18 lr 0.000043 wd 0.0500 time 0.2456 (0.2455) data time 0.0009 (0.0039) model time 0.2447 (0.2402) loss 2.9266 (2.6972) grad_norm 3.7148 (5.8354) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][210/1251] eta 0:04:15 lr 0.000043 wd 0.0500 time 0.2396 (0.2452) data time 0.0010 (0.0037) model time 0.2386 (0.2400) loss 3.0562 (2.7011) grad_norm 5.2316 (6.4824) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:16 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [267/300][220/1251] eta 0:04:14 lr 0.000043 wd 0.0500 time 0.2425 (0.2469) data time 0.0010 (0.0036) model time 0.2415 (0.2425) loss 2.0373 (2.6982) grad_norm 4.7185 (6.4589) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][230/1251] eta 0:04:11 lr 0.000043 wd 0.0500 time 0.2489 (0.2466) data time 0.0007 (0.0035) model time 0.2483 (0.2423) loss 3.1491 (2.7042) grad_norm 12.2986 (6.4830) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][240/1251] eta 0:04:09 lr 0.000043 wd 0.0500 time 0.2411 (0.2464) data time 0.0008 (0.0034) model time 0.2403 (0.2423) loss 2.9798 (2.6879) grad_norm 4.1307 (6.4704) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][250/1251] eta 0:04:06 lr 0.000043 wd 0.0500 time 0.2388 (0.2462) data time 0.0009 (0.0033) model time 0.2378 (0.2421) loss 2.6984 (2.6902) grad_norm 4.6610 (6.5059) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][260/1251] eta 0:04:03 lr 0.000043 wd 0.0500 time 0.2439 (0.2460) data time 0.0007 (0.0032) model time 0.2432 (0.2420) loss 1.7857 (2.6906) grad_norm 3.9722 (6.4251) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][270/1251] eta 0:04:01 lr 0.000043 wd 0.0500 time 0.2487 (0.2458) data time 0.0007 (0.0031) model time 0.2480 (0.2420) loss 2.3269 (2.6855) grad_norm 4.8386 (6.3839) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][280/1251] eta 0:03:58 lr 0.000043 wd 0.0500 time 0.2455 (0.2456) data time 0.0009 (0.0030) model time 0.2445 (0.2418) loss 2.8548 (2.6890) grad_norm 4.0387 
(6.3213) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][290/1251] eta 0:03:55 lr 0.000043 wd 0.0500 time 0.2381 (0.2454) data time 0.0012 (0.0030) model time 0.2369 (0.2417) loss 2.8176 (2.6848) grad_norm 3.9658 (6.2899) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][300/1251] eta 0:03:53 lr 0.000043 wd 0.0500 time 0.2402 (0.2454) data time 0.0010 (0.0029) model time 0.2392 (0.2418) loss 1.7460 (2.6859) grad_norm 4.8375 (6.2529) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][310/1251] eta 0:03:50 lr 0.000043 wd 0.0500 time 0.2459 (0.2453) data time 0.0009 (0.0028) model time 0.2450 (0.2417) loss 3.0348 (2.6923) grad_norm 7.5488 (6.2725) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][320/1251] eta 0:03:48 lr 0.000043 wd 0.0500 time 0.2496 (0.2452) data time 0.0011 (0.0028) model time 0.2485 (0.2417) loss 2.6558 (2.6977) grad_norm 8.3396 (6.4896) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][330/1251] eta 0:03:45 lr 0.000043 wd 0.0500 time 0.2434 (0.2452) data time 0.0007 (0.0027) model time 0.2427 (0.2418) loss 1.9419 (2.6901) grad_norm 3.6156 (6.4562) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][340/1251] eta 0:03:43 lr 0.000043 wd 0.0500 time 0.2533 (0.2451) data time 0.0009 (0.0027) model time 0.2523 (0.2418) loss 2.7799 (2.6890) grad_norm 4.3829 (6.4471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][350/1251] eta 0:03:40 lr 0.000043 wd 0.0500 time 0.2368 (0.2450) data 
time 0.0008 (0.0026) model time 0.2360 (0.2417) loss 3.0003 (2.6857) grad_norm 5.1322 (6.3866) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][360/1251] eta 0:03:38 lr 0.000043 wd 0.0500 time 0.2428 (0.2449) data time 0.0007 (0.0026) model time 0.2421 (0.2417) loss 2.2526 (2.6921) grad_norm 5.6057 (6.3703) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][370/1251] eta 0:03:35 lr 0.000043 wd 0.0500 time 0.2392 (0.2448) data time 0.0007 (0.0025) model time 0.2386 (0.2417) loss 2.2477 (2.6919) grad_norm 3.6019 (6.3513) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][380/1251] eta 0:03:33 lr 0.000043 wd 0.0500 time 0.2403 (0.2448) data time 0.0009 (0.0025) model time 0.2394 (0.2417) loss 3.3937 (2.6927) grad_norm 6.6510 (6.3095) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:05:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][390/1251] eta 0:03:30 lr 0.000043 wd 0.0500 time 0.2463 (0.2447) data time 0.0007 (0.0025) model time 0.2456 (0.2417) loss 3.2787 (2.6886) grad_norm 4.3484 (6.2667) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][400/1251] eta 0:03:28 lr 0.000043 wd 0.0500 time 0.2440 (0.2446) data time 0.0011 (0.0024) model time 0.2429 (0.2416) loss 2.8137 (2.6884) grad_norm 4.4844 (6.2203) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][410/1251] eta 0:03:25 lr 0.000043 wd 0.0500 time 0.2426 (0.2446) data time 0.0010 (0.0024) model time 0.2417 (0.2416) loss 2.6337 (2.6868) grad_norm 4.0614 (6.1862) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [267/300][420/1251] eta 0:03:23 lr 0.000043 wd 0.0500 time 0.2367 (0.2445) data time 0.0010 (0.0024) model time 0.2357 (0.2416) loss 3.1759 (2.6894) grad_norm 7.3855 (6.1695) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][430/1251] eta 0:03:20 lr 0.000043 wd 0.0500 time 0.2434 (0.2444) data time 0.0009 (0.0023) model time 0.2426 (0.2416) loss 3.0304 (2.6924) grad_norm 4.7911 (6.1486) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][440/1251] eta 0:03:18 lr 0.000043 wd 0.0500 time 0.2394 (0.2448) data time 0.0007 (0.0023) model time 0.2387 (0.2421) loss 2.8963 (2.6987) grad_norm 7.4923 (6.1231) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][450/1251] eta 0:03:16 lr 0.000043 wd 0.0500 time 0.2375 (0.2447) data time 0.0008 (0.0023) model time 0.2367 (0.2420) loss 2.8753 (2.6999) grad_norm 3.5667 (6.0992) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][460/1251] eta 0:03:13 lr 0.000043 wd 0.0500 time 0.2435 (0.2446) data time 0.0007 (0.0022) model time 0.2427 (0.2419) loss 2.8664 (2.7036) grad_norm 7.6166 (6.0972) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][470/1251] eta 0:03:11 lr 0.000043 wd 0.0500 time 0.2376 (0.2446) data time 0.0007 (0.0022) model time 0.2369 (0.2419) loss 1.9663 (2.7016) grad_norm 5.1298 (6.0880) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][480/1251] eta 0:03:08 lr 0.000043 wd 0.0500 time 0.2450 (0.2445) data time 0.0010 (0.0022) model time 0.2440 (0.2419) loss 3.5347 (2.7007) grad_norm 7.2427 (6.0740) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 09:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][490/1251] eta 0:03:05 lr 0.000043 wd 0.0500 time 0.2416 (0.2444) data time 0.0009 (0.0022) model time 0.2406 (0.2418) loss 3.1012 (2.7032) grad_norm 4.7070 (6.0746) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][500/1251] eta 0:03:03 lr 0.000043 wd 0.0500 time 0.2380 (0.2444) data time 0.0009 (0.0021) model time 0.2371 (0.2418) loss 2.2378 (2.7032) grad_norm 4.9186 (6.0531) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][510/1251] eta 0:03:01 lr 0.000043 wd 0.0500 time 0.2375 (0.2443) data time 0.0010 (0.0021) model time 0.2366 (0.2418) loss 2.7705 (2.7065) grad_norm 5.2145 (6.1587) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][520/1251] eta 0:02:58 lr 0.000043 wd 0.0500 time 0.4454 (0.2446) data time 0.0010 (0.0021) model time 0.4444 (0.2422) loss 2.1308 (2.7065) grad_norm 4.2223 (6.1474) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][530/1251] eta 0:02:56 lr 0.000043 wd 0.0500 time 0.2399 (0.2446) data time 0.0009 (0.0021) model time 0.2390 (0.2421) loss 2.7857 (2.7032) grad_norm 3.5783 (6.1309) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][540/1251] eta 0:02:53 lr 0.000043 wd 0.0500 time 0.2451 (0.2445) data time 0.0010 (0.0021) model time 0.2441 (0.2421) loss 2.9619 (2.7015) grad_norm 4.4499 (6.1061) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][550/1251] eta 0:02:51 lr 0.000043 wd 0.0500 time 0.2459 (0.2446) data time 0.0007 (0.0020) model 
time 0.2452 (0.2422) loss 2.5855 (2.7026) grad_norm 6.2932 (6.0889) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][560/1251] eta 0:02:48 lr 0.000043 wd 0.0500 time 0.2374 (0.2445) data time 0.0007 (0.0020) model time 0.2367 (0.2421) loss 2.8280 (2.6999) grad_norm 10.1702 (6.0751) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][570/1251] eta 0:02:46 lr 0.000043 wd 0.0500 time 0.2488 (0.2445) data time 0.0009 (0.0020) model time 0.2478 (0.2422) loss 3.1811 (2.6967) grad_norm 4.0506 (6.0542) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][580/1251] eta 0:02:44 lr 0.000043 wd 0.0500 time 0.2446 (0.2445) data time 0.0008 (0.0020) model time 0.2438 (0.2422) loss 1.9829 (2.6953) grad_norm 5.1454 (6.0286) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][590/1251] eta 0:02:41 lr 0.000043 wd 0.0500 time 0.2461 (0.2444) data time 0.0011 (0.0020) model time 0.2450 (0.2421) loss 1.8773 (2.6952) grad_norm 4.2528 (6.0116) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][600/1251] eta 0:02:39 lr 0.000043 wd 0.0500 time 0.2468 (0.2444) data time 0.0007 (0.0020) model time 0.2461 (0.2421) loss 3.3565 (2.6984) grad_norm 4.6767 (6.0005) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][610/1251] eta 0:02:36 lr 0.000043 wd 0.0500 time 0.2425 (0.2443) data time 0.0009 (0.0019) model time 0.2417 (0.2420) loss 2.0821 (2.6965) grad_norm 3.6837 (5.9787) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][620/1251] 
eta 0:02:34 lr 0.000043 wd 0.0500 time 0.2341 (0.2442) data time 0.0007 (0.0019) model time 0.2334 (0.2420) loss 3.2890 (2.6958) grad_norm 3.9337 (5.9689) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][630/1251] eta 0:02:31 lr 0.000043 wd 0.0500 time 0.2284 (0.2442) data time 0.0010 (0.0019) model time 0.2273 (0.2419) loss 2.7267 (2.6958) grad_norm 4.8803 (5.9630) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][640/1251] eta 0:02:29 lr 0.000043 wd 0.0500 time 0.2380 (0.2441) data time 0.0009 (0.0019) model time 0.2371 (0.2419) loss 2.5169 (2.6977) grad_norm 4.2194 (5.9872) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][650/1251] eta 0:02:26 lr 0.000043 wd 0.0500 time 0.2488 (0.2440) data time 0.0007 (0.0019) model time 0.2481 (0.2418) loss 3.4959 (2.6969) grad_norm 5.1310 (5.9871) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][660/1251] eta 0:02:24 lr 0.000042 wd 0.0500 time 0.2339 (0.2440) data time 0.0011 (0.0019) model time 0.2328 (0.2418) loss 2.9171 (2.6979) grad_norm 4.8648 (5.9856) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][670/1251] eta 0:02:21 lr 0.000042 wd 0.0500 time 0.2408 (0.2439) data time 0.0009 (0.0019) model time 0.2399 (0.2418) loss 2.8905 (2.6996) grad_norm 5.0950 (5.9740) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][680/1251] eta 0:02:19 lr 0.000042 wd 0.0500 time 0.2372 (0.2439) data time 0.0011 (0.0018) model time 0.2361 (0.2418) loss 2.8227 (2.6993) grad_norm 4.3885 (5.9567) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
09:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][690/1251] eta 0:02:16 lr 0.000042 wd 0.0500 time 0.2474 (0.2439) data time 0.0007 (0.0018) model time 0.2467 (0.2418) loss 3.1055 (2.6998) grad_norm 4.3773 (5.9382) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][700/1251] eta 0:02:14 lr 0.000042 wd 0.0500 time 0.2384 (0.2439) data time 0.0007 (0.0018) model time 0.2377 (0.2417) loss 1.5805 (2.6960) grad_norm 6.9823 (5.9309) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][710/1251] eta 0:02:11 lr 0.000042 wd 0.0500 time 0.2439 (0.2438) data time 0.0007 (0.0018) model time 0.2432 (0.2417) loss 3.1163 (2.7002) grad_norm 6.4713 (5.9406) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][720/1251] eta 0:02:09 lr 0.000042 wd 0.0500 time 0.2403 (0.2438) data time 0.0010 (0.0018) model time 0.2393 (0.2417) loss 2.8743 (2.7021) grad_norm 5.6150 (5.9233) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][730/1251] eta 0:02:06 lr 0.000042 wd 0.0500 time 0.2410 (0.2438) data time 0.0011 (0.0018) model time 0.2399 (0.2417) loss 3.1251 (2.7021) grad_norm 7.4091 (5.9139) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][740/1251] eta 0:02:04 lr 0.000042 wd 0.0500 time 0.2354 (0.2437) data time 0.0011 (0.0018) model time 0.2343 (0.2416) loss 2.7466 (2.7022) grad_norm 4.7384 (5.9015) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][750/1251] eta 0:02:02 lr 0.000042 wd 0.0500 time 0.2405 (0.2437) data time 0.0007 (0.0018) model time 0.2398 (0.2416) loss 2.2516 
(2.7031) grad_norm 5.8012 (5.8879) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][760/1251] eta 0:01:59 lr 0.000042 wd 0.0500 time 0.2380 (0.2436) data time 0.0010 (0.0018) model time 0.2370 (0.2416) loss 2.8775 (2.7048) grad_norm 4.2783 (5.8778) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][770/1251] eta 0:01:57 lr 0.000042 wd 0.0500 time 0.2446 (0.2436) data time 0.0008 (0.0017) model time 0.2437 (0.2416) loss 3.0673 (2.7040) grad_norm 4.2204 (5.8625) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][780/1251] eta 0:01:54 lr 0.000042 wd 0.0500 time 0.2713 (0.2436) data time 0.0010 (0.0017) model time 0.2703 (0.2416) loss 3.1223 (2.7040) grad_norm 3.9008 (5.8422) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][790/1251] eta 0:01:52 lr 0.000042 wd 0.0500 time 0.2451 (0.2436) data time 0.0010 (0.0017) model time 0.2441 (0.2416) loss 2.9720 (2.7085) grad_norm 3.4569 (5.8223) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][800/1251] eta 0:01:49 lr 0.000042 wd 0.0500 time 0.2336 (0.2436) data time 0.0011 (0.0017) model time 0.2324 (0.2416) loss 2.9884 (2.7119) grad_norm 4.6263 (5.8060) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][810/1251] eta 0:01:47 lr 0.000042 wd 0.0500 time 0.2428 (0.2436) data time 0.0011 (0.0017) model time 0.2418 (0.2416) loss 2.4306 (2.7143) grad_norm 5.5842 (5.8132) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][820/1251] eta 0:01:44 lr 0.000042 wd 0.0500 
time 0.2446 (0.2436) data time 0.0011 (0.0017) model time 0.2436 (0.2416) loss 2.8538 (2.7148) grad_norm 4.8760 (5.8030) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][830/1251] eta 0:01:42 lr 0.000042 wd 0.0500 time 0.2449 (0.2435) data time 0.0010 (0.0017) model time 0.2438 (0.2416) loss 2.3267 (2.7147) grad_norm 5.7945 (5.7995) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][840/1251] eta 0:01:40 lr 0.000042 wd 0.0500 time 0.2356 (0.2435) data time 0.0007 (0.0017) model time 0.2349 (0.2415) loss 3.1839 (2.7165) grad_norm 4.6243 (5.7932) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][850/1251] eta 0:01:37 lr 0.000042 wd 0.0500 time 0.2401 (0.2435) data time 0.0009 (0.0017) model time 0.2392 (0.2415) loss 3.5014 (2.7171) grad_norm 3.7506 (5.7822) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][860/1251] eta 0:01:35 lr 0.000042 wd 0.0500 time 0.2409 (0.2434) data time 0.0007 (0.0017) model time 0.2402 (0.2415) loss 3.0024 (2.7187) grad_norm 10.9238 (5.7830) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][870/1251] eta 0:01:32 lr 0.000042 wd 0.0500 time 0.2441 (0.2434) data time 0.0008 (0.0017) model time 0.2433 (0.2415) loss 1.8300 (2.7152) grad_norm 4.6693 (5.7703) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][880/1251] eta 0:01:30 lr 0.000042 wd 0.0500 time 0.2415 (0.2434) data time 0.0009 (0.0017) model time 0.2406 (0.2414) loss 2.2422 (2.7165) grad_norm 5.6815 (5.7607) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:07:58 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [267/300][890/1251] eta 0:01:27 lr 0.000042 wd 0.0500 time 0.2374 (0.2433) data time 0.0010 (0.0016) model time 0.2364 (0.2414) loss 2.3184 (2.7186) grad_norm 4.3666 (5.7492) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:08:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][900/1251] eta 0:01:25 lr 0.000042 wd 0.0500 time 0.2404 (0.2433) data time 0.0011 (0.0016) model time 0.2393 (0.2414) loss 2.0144 (2.7190) grad_norm 5.6344 (5.7435) loss_scale 512.0000 (256.2841) mem 7381MB [2024-09-01 09:08:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][910/1251] eta 0:01:22 lr 0.000042 wd 0.0500 time 0.2450 (0.2433) data time 0.0010 (0.0016) model time 0.2440 (0.2414) loss 3.3234 (2.7174) grad_norm 4.8417 (5.7492) loss_scale 512.0000 (259.0911) mem 7381MB [2024-09-01 09:08:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][920/1251] eta 0:01:20 lr 0.000042 wd 0.0500 time 0.2429 (0.2432) data time 0.0009 (0.0016) model time 0.2420 (0.2414) loss 2.3580 (2.7179) grad_norm 3.4763 (5.7445) loss_scale 512.0000 (261.8371) mem 7381MB [2024-09-01 09:08:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][930/1251] eta 0:01:18 lr 0.000042 wd 0.0500 time 0.2444 (0.2432) data time 0.0008 (0.0016) model time 0.2436 (0.2413) loss 3.1393 (2.7185) grad_norm 5.7662 (5.7403) loss_scale 512.0000 (264.5242) mem 7381MB [2024-09-01 09:08:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][940/1251] eta 0:01:15 lr 0.000042 wd 0.0500 time 0.2425 (0.2431) data time 0.0009 (0.0016) model time 0.2415 (0.2413) loss 2.9947 (2.7191) grad_norm 4.1435 (5.7439) loss_scale 512.0000 (267.1541) mem 7381MB [2024-09-01 09:08:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][950/1251] eta 0:01:13 lr 0.000042 wd 0.0500 time 0.2422 (0.2431) data time 0.0009 (0.0016) model time 0.2413 (0.2413) loss 2.7853 (2.7199) grad_norm 3.8298 
(5.7393) loss_scale 512.0000 (269.7287) mem 7381MB [2024-09-01 09:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][960/1251] eta 0:01:10 lr 0.000042 wd 0.0500 time 0.2453 (0.2431) data time 0.0007 (0.0016) model time 0.2445 (0.2412) loss 1.9580 (2.7167) grad_norm 4.9003 (5.7411) loss_scale 512.0000 (272.2497) mem 7381MB [2024-09-01 09:08:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][970/1251] eta 0:01:08 lr 0.000042 wd 0.0500 time 0.2425 (0.2431) data time 0.0009 (0.0016) model time 0.2416 (0.2412) loss 2.8412 (2.7185) grad_norm 4.1160 (5.7273) loss_scale 512.0000 (274.7188) mem 7381MB [2024-09-01 09:08:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][980/1251] eta 0:01:05 lr 0.000042 wd 0.0500 time 0.2379 (0.2433) data time 0.0008 (0.0016) model time 0.2370 (0.2415) loss 2.0378 (2.7189) grad_norm 3.5539 (5.7221) loss_scale 512.0000 (277.1376) mem 7381MB [2024-09-01 09:08:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][990/1251] eta 0:01:03 lr 0.000042 wd 0.0500 time 0.2443 (0.2433) data time 0.0010 (0.0016) model time 0.2433 (0.2414) loss 3.3576 (2.7198) grad_norm 5.6608 (5.7165) loss_scale 512.0000 (279.5076) mem 7381MB [2024-09-01 09:08:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1000/1251] eta 0:01:01 lr 0.000042 wd 0.0500 time 0.2337 (0.2432) data time 0.0007 (0.0016) model time 0.2329 (0.2414) loss 2.8566 (2.7220) grad_norm 5.5947 (5.7083) loss_scale 512.0000 (281.8302) mem 7381MB [2024-09-01 09:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1010/1251] eta 0:00:58 lr 0.000042 wd 0.0500 time 0.2423 (0.2432) data time 0.0010 (0.0016) model time 0.2413 (0.2414) loss 2.9334 (2.7238) grad_norm 4.1542 (5.7010) loss_scale 512.0000 (284.1068) mem 7381MB [2024-09-01 09:08:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1020/1251] eta 0:00:56 lr 0.000042 wd 0.0500 time 0.2430 (0.2432) 
data time 0.0009 (0.0016) model time 0.2421 (0.2414) loss 2.9554 (2.7260) grad_norm 3.5190 (5.7233) loss_scale 512.0000 (286.3389) mem 7381MB [2024-09-01 09:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1030/1251] eta 0:00:53 lr 0.000042 wd 0.0500 time 0.2405 (0.2432) data time 0.0010 (0.0016) model time 0.2395 (0.2414) loss 2.9667 (2.7284) grad_norm 5.2481 (5.7189) loss_scale 512.0000 (288.5276) mem 7381MB [2024-09-01 09:08:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1040/1251] eta 0:00:51 lr 0.000042 wd 0.0500 time 0.2347 (0.2432) data time 0.0009 (0.0016) model time 0.2338 (0.2414) loss 2.2255 (2.7246) grad_norm 4.8040 (5.7110) loss_scale 512.0000 (290.6744) mem 7381MB [2024-09-01 09:08:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1050/1251] eta 0:00:48 lr 0.000042 wd 0.0500 time 0.2420 (0.2431) data time 0.0009 (0.0015) model time 0.2411 (0.2414) loss 2.5519 (2.7244) grad_norm 3.8758 (5.7758) loss_scale 512.0000 (292.7802) mem 7381MB [2024-09-01 09:08:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1060/1251] eta 0:00:46 lr 0.000042 wd 0.0500 time 0.2383 (0.2433) data time 0.0007 (0.0015) model time 0.2376 (0.2416) loss 2.5265 (2.7249) grad_norm 3.6552 (5.7694) loss_scale 512.0000 (294.8464) mem 7381MB [2024-09-01 09:08:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1070/1251] eta 0:00:44 lr 0.000042 wd 0.0500 time 0.2394 (0.2433) data time 0.0009 (0.0015) model time 0.2384 (0.2416) loss 3.2543 (2.7235) grad_norm 3.4886 (5.7569) loss_scale 512.0000 (296.8739) mem 7381MB [2024-09-01 09:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1080/1251] eta 0:00:41 lr 0.000042 wd 0.0500 time 0.2453 (0.2433) data time 0.0010 (0.0015) model time 0.2444 (0.2415) loss 2.2588 (2.7218) grad_norm 4.5423 (5.7536) loss_scale 512.0000 (298.8640) mem 7381MB [2024-09-01 09:08:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [267/300][1090/1251] eta 0:00:39 lr 0.000042 wd 0.0500 time 0.2371 (0.2433) data time 0.0009 (0.0015) model time 0.2362 (0.2415) loss 2.0202 (2.7191) grad_norm 5.0660 (5.7475) loss_scale 512.0000 (300.8176) mem 7381MB [2024-09-01 09:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1100/1251] eta 0:00:36 lr 0.000042 wd 0.0500 time 0.2425 (0.2433) data time 0.0007 (0.0015) model time 0.2417 (0.2415) loss 2.6094 (2.7207) grad_norm 9.7481 (5.7586) loss_scale 512.0000 (302.7357) mem 7381MB [2024-09-01 09:08:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1110/1251] eta 0:00:34 lr 0.000042 wd 0.0500 time 0.2426 (0.2432) data time 0.0008 (0.0015) model time 0.2418 (0.2415) loss 3.2866 (2.7212) grad_norm 5.5986 (5.7496) loss_scale 512.0000 (304.6193) mem 7381MB [2024-09-01 09:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1120/1251] eta 0:00:31 lr 0.000042 wd 0.0500 time 0.2463 (0.2432) data time 0.0008 (0.0015) model time 0.2455 (0.2415) loss 2.2946 (2.7206) grad_norm 12.9175 (5.7467) loss_scale 512.0000 (306.4692) mem 7381MB [2024-09-01 09:08:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1130/1251] eta 0:00:29 lr 0.000042 wd 0.0500 time 0.2409 (0.2432) data time 0.0009 (0.0015) model time 0.2400 (0.2415) loss 3.1718 (2.7209) grad_norm 3.7568 (5.7383) loss_scale 512.0000 (308.2865) mem 7381MB [2024-09-01 09:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1140/1251] eta 0:00:26 lr 0.000042 wd 0.0500 time 0.2428 (0.2432) data time 0.0009 (0.0015) model time 0.2418 (0.2415) loss 2.7812 (2.7210) grad_norm 6.7578 (5.7345) loss_scale 512.0000 (310.0719) mem 7381MB [2024-09-01 09:09:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1150/1251] eta 0:00:24 lr 0.000042 wd 0.0500 time 0.2427 (0.2436) data time 0.0010 (0.0015) model time 0.2417 (0.2419) loss 2.1817 (2.7207) grad_norm 3.6695 (5.7359) loss_scale 
512.0000 (311.8262) mem 7381MB [2024-09-01 09:09:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1160/1251] eta 0:00:22 lr 0.000042 wd 0.0500 time 0.2353 (0.2436) data time 0.0009 (0.0015) model time 0.2343 (0.2419) loss 1.6790 (2.7200) grad_norm 3.5442 (5.7402) loss_scale 512.0000 (313.5504) mem 7381MB [2024-09-01 09:09:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1170/1251] eta 0:00:19 lr 0.000042 wd 0.0500 time 0.2350 (0.2435) data time 0.0009 (0.0015) model time 0.2341 (0.2419) loss 2.7827 (2.7201) grad_norm 4.4789 (5.7312) loss_scale 512.0000 (315.2451) mem 7381MB [2024-09-01 09:09:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1180/1251] eta 0:00:17 lr 0.000042 wd 0.0500 time 0.2361 (0.2435) data time 0.0007 (0.0015) model time 0.2354 (0.2419) loss 1.7963 (2.7196) grad_norm 7.1904 (5.7540) loss_scale 512.0000 (316.9111) mem 7381MB [2024-09-01 09:09:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1190/1251] eta 0:00:14 lr 0.000042 wd 0.0500 time 0.2499 (0.2435) data time 0.0007 (0.0015) model time 0.2492 (0.2418) loss 3.2955 (2.7194) grad_norm 3.3080 (5.7570) loss_scale 512.0000 (318.5491) mem 7381MB [2024-09-01 09:09:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1200/1251] eta 0:00:12 lr 0.000042 wd 0.0500 time 0.2412 (0.2435) data time 0.0010 (0.0015) model time 0.2403 (0.2418) loss 3.0653 (2.7226) grad_norm 3.3265 (5.7511) loss_scale 512.0000 (320.1599) mem 7381MB [2024-09-01 09:09:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1210/1251] eta 0:00:09 lr 0.000042 wd 0.0500 time 0.2374 (0.2435) data time 0.0009 (0.0015) model time 0.2365 (0.2418) loss 2.4185 (2.7196) grad_norm 5.2305 (5.7576) loss_scale 512.0000 (321.7440) mem 7381MB [2024-09-01 09:09:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1220/1251] eta 0:00:07 lr 0.000042 wd 0.0500 time 0.2374 (0.2435) data time 0.0010 
(0.0015) model time 0.2364 (0.2418) loss 3.1673 (2.7199) grad_norm 4.3200 (5.7546) loss_scale 512.0000 (323.3022) mem 7381MB [2024-09-01 09:09:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1230/1251] eta 0:00:05 lr 0.000042 wd 0.0500 time 0.2382 (0.2435) data time 0.0008 (0.0015) model time 0.2374 (0.2418) loss 3.0574 (2.7184) grad_norm 5.0375 (5.7586) loss_scale 512.0000 (324.8351) mem 7381MB [2024-09-01 09:09:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1240/1251] eta 0:00:02 lr 0.000042 wd 0.0500 time 0.2212 (0.2434) data time 0.0007 (0.0015) model time 0.2205 (0.2417) loss 1.8670 (2.7165) grad_norm 5.4130 (5.7491) loss_scale 512.0000 (326.3433) mem 7381MB [2024-09-01 09:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [267/300][1250/1251] eta 0:00:00 lr 0.000042 wd 0.0500 time 0.2215 (0.2432) data time 0.0007 (0.0015) model time 0.2208 (0.2416) loss 2.7792 (2.7144) grad_norm 3.7319 (5.7392) loss_scale 512.0000 (327.8273) mem 7381MB [2024-09-01 09:09:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 267 training takes 0:05:04 [2024-09-01 09:09:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:09:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 09:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.422 (0.422) Loss 0.3979 (0.3979) Acc@1 93.359 (93.359) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 09:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.111) Loss 0.5942 (0.6199) Acc@1 89.941 (87.385) Acc@5 97.852 (97.665) Mem 7381MB [2024-09-01 09:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.095) Loss 0.9434 (0.6535) Acc@1 76.758 (86.230) Acc@5 95.898 (97.633) Mem 7381MB [2024-09-01 09:09:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.089) Loss 1.1719 (0.7483) Acc@1 73.633 (84.088) Acc@5 92.578 (96.683) Mem 7381MB [2024-09-01 09:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0322 (0.7969) Acc@1 76.562 (82.884) Acc@5 93.945 (96.160) Mem 7381MB [2024-09-01 09:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.518 Acc@5 96.114 [2024-09-01 09:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.5% [2024-09-01 09:09:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.784 (0.784) Loss 0.3843 (0.3843) Acc@1 93.457 (93.457) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 09:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.145) Loss 0.5654 (0.6030) Acc@1 90.430 (87.793) Acc@5 98.047 (97.860) Mem 7381MB [2024-09-01 09:09:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.113) Loss 0.9038 (0.6334) Acc@1 78.027 (86.626) Acc@5 95.703 (97.749) Mem 7381MB [2024-09-01 09:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.102) Loss 1.1260 (0.7239) Acc@1 74.609 (84.482) Acc@5 93.066 (96.843) Mem 7381MB [2024-09-01 09:09:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 0.9956 
(0.7707) Acc@1 77.148 (83.322) Acc@5 94.336 (96.344) Mem 7381MB [2024-09-01 09:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.952 Acc@5 96.274 [2024-09-01 09:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 09:09:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][0/1251] eta 0:22:44 lr 0.000042 wd 0.0500 time 1.0910 (1.0910) data time 0.6341 (0.6341) model time 0.0000 (0.0000) loss 2.2037 (2.2037) grad_norm 5.4218 (5.4218) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][10/1251] eta 0:06:38 lr 0.000042 wd 0.0500 time 0.2417 (0.3209) data time 0.0007 (0.0586) model time 0.0000 (0.0000) loss 1.7507 (2.6664) grad_norm 3.7187 (5.0238) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][20/1251] eta 0:05:48 lr 0.000042 wd 0.0500 time 0.2469 (0.2832) data time 0.0012 (0.0312) model time 0.0000 (0.0000) loss 3.0950 (2.7050) grad_norm 3.6944 (5.1321) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][30/1251] eta 0:05:29 lr 0.000042 wd 0.0500 time 0.2444 (0.2697) data time 0.0008 (0.0215) model time 0.0000 (0.0000) loss 2.6622 (2.6863) grad_norm 8.3704 (6.1087) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][40/1251] eta 0:05:18 lr 0.000042 wd 0.0500 time 0.2466 (0.2628) data time 0.0013 (0.0165) model time 0.0000 (0.0000) loss 3.1601 (2.7045) grad_norm 5.5783 (6.3643) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][50/1251] eta 0:05:11 lr 0.000041 wd 0.0500 time 0.2444 (0.2590) data time 0.0011 (0.0134) model time 0.0000 (0.0000) loss 3.0724 
(2.6977) grad_norm 4.7219 (6.2129) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][60/1251] eta 0:05:05 lr 0.000041 wd 0.0500 time 0.2408 (0.2561) data time 0.0007 (0.0114) model time 0.2401 (0.2406) loss 1.8941 (2.6998) grad_norm 4.4003 (6.0622) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][70/1251] eta 0:04:59 lr 0.000041 wd 0.0500 time 0.2298 (0.2538) data time 0.0008 (0.0099) model time 0.2290 (0.2396) loss 3.1228 (2.6865) grad_norm 5.2418 (6.1926) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][80/1251] eta 0:04:55 lr 0.000041 wd 0.0500 time 0.2361 (0.2524) data time 0.0012 (0.0088) model time 0.2348 (0.2402) loss 2.6387 (2.6756) grad_norm 4.8535 (6.0652) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][90/1251] eta 0:04:51 lr 0.000041 wd 0.0500 time 0.2438 (0.2512) data time 0.0010 (0.0080) model time 0.2428 (0.2403) loss 3.1962 (2.6662) grad_norm 5.8427 (5.9434) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][100/1251] eta 0:04:48 lr 0.000041 wd 0.0500 time 0.2435 (0.2504) data time 0.0011 (0.0073) model time 0.2424 (0.2406) loss 2.7588 (2.6606) grad_norm 4.5434 (5.8026) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][110/1251] eta 0:04:44 lr 0.000041 wd 0.0500 time 0.2641 (0.2498) data time 0.0007 (0.0067) model time 0.2634 (0.2409) loss 2.4574 (2.6408) grad_norm 3.8773 (5.7678) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][120/1251] eta 0:04:41 lr 0.000041 wd 0.0500 
time 0.2384 (0.2490) data time 0.0009 (0.0062) model time 0.2374 (0.2407) loss 2.9438 (2.6392) grad_norm 4.3424 (5.7392) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][130/1251] eta 0:04:38 lr 0.000041 wd 0.0500 time 0.2393 (0.2486) data time 0.0007 (0.0058) model time 0.2386 (0.2409) loss 3.4301 (2.6623) grad_norm 5.7891 (5.7062) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][140/1251] eta 0:04:35 lr 0.000041 wd 0.0500 time 0.2433 (0.2481) data time 0.0010 (0.0055) model time 0.2423 (0.2410) loss 3.1910 (2.6725) grad_norm 3.7148 (5.6564) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][150/1251] eta 0:04:32 lr 0.000041 wd 0.0500 time 0.2479 (0.2479) data time 0.0007 (0.0052) model time 0.2472 (0.2413) loss 2.3965 (2.6769) grad_norm 4.5189 (5.6655) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][160/1251] eta 0:04:30 lr 0.000041 wd 0.0500 time 0.2436 (0.2476) data time 0.0010 (0.0049) model time 0.2427 (0.2414) loss 2.7147 (2.6947) grad_norm 5.6342 (5.6771) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][170/1251] eta 0:04:27 lr 0.000041 wd 0.0500 time 0.2420 (0.2473) data time 0.0009 (0.0047) model time 0.2411 (0.2413) loss 2.3685 (2.6871) grad_norm 4.8629 (5.7257) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][180/1251] eta 0:04:24 lr 0.000041 wd 0.0500 time 0.2428 (0.2470) data time 0.0007 (0.0045) model time 0.2421 (0.2413) loss 3.2314 (2.6936) grad_norm 4.3742 (5.6878) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:22 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [268/300][190/1251] eta 0:04:21 lr 0.000041 wd 0.0500 time 0.2399 (0.2467) data time 0.0013 (0.0043) model time 0.2386 (0.2412) loss 3.0999 (2.6988) grad_norm 6.3643 (5.6745) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][200/1251] eta 0:04:19 lr 0.000041 wd 0.0500 time 0.2178 (0.2472) data time 0.0008 (0.0042) model time 0.2170 (0.2422) loss 2.7894 (2.6901) grad_norm 4.5010 (5.6687) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][210/1251] eta 0:04:17 lr 0.000041 wd 0.0500 time 0.2539 (0.2470) data time 0.0010 (0.0040) model time 0.2530 (0.2422) loss 2.4323 (2.6840) grad_norm 5.6703 (5.6263) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][220/1251] eta 0:04:14 lr 0.000041 wd 0.0500 time 0.2426 (0.2467) data time 0.0009 (0.0039) model time 0.2417 (0.2420) loss 2.4462 (2.6813) grad_norm 11.9965 (5.6107) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][230/1251] eta 0:04:11 lr 0.000041 wd 0.0500 time 0.2437 (0.2464) data time 0.0010 (0.0037) model time 0.2426 (0.2419) loss 3.0351 (2.6856) grad_norm 3.8892 (5.5903) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][240/1251] eta 0:04:08 lr 0.000041 wd 0.0500 time 0.2351 (0.2461) data time 0.0007 (0.0036) model time 0.2344 (0.2416) loss 3.1645 (2.6930) grad_norm 5.8381 (5.5868) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][250/1251] eta 0:04:06 lr 0.000041 wd 0.0500 time 0.2423 (0.2459) data time 0.0009 (0.0035) model time 0.2414 (0.2416) loss 2.1726 (2.6906) grad_norm 3.0607 
(5.5436) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][260/1251] eta 0:04:03 lr 0.000041 wd 0.0500 time 0.2451 (0.2457) data time 0.0007 (0.0034) model time 0.2444 (0.2415) loss 2.9227 (2.6764) grad_norm 5.3473 (5.5374) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][270/1251] eta 0:04:00 lr 0.000041 wd 0.0500 time 0.2355 (0.2456) data time 0.0007 (0.0033) model time 0.2348 (0.2415) loss 2.8747 (2.6759) grad_norm 4.3726 (5.5184) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][280/1251] eta 0:03:58 lr 0.000041 wd 0.0500 time 0.2357 (0.2454) data time 0.0010 (0.0033) model time 0.2347 (0.2414) loss 2.9812 (2.6761) grad_norm 7.7916 (5.5293) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][290/1251] eta 0:03:55 lr 0.000041 wd 0.0500 time 0.2394 (0.2452) data time 0.0006 (0.0032) model time 0.2388 (0.2413) loss 2.1197 (2.6834) grad_norm 4.3473 (5.5227) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][300/1251] eta 0:03:53 lr 0.000041 wd 0.0500 time 0.2363 (0.2451) data time 0.0010 (0.0031) model time 0.2354 (0.2413) loss 2.1256 (2.6826) grad_norm 4.2962 (5.5010) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][310/1251] eta 0:03:50 lr 0.000041 wd 0.0500 time 0.2447 (0.2451) data time 0.0010 (0.0030) model time 0.2437 (0.2413) loss 2.5664 (2.6819) grad_norm 4.1820 (5.4861) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][320/1251] eta 0:03:48 lr 0.000041 wd 0.0500 time 0.2362 (0.2449) data 
time 0.0011 (0.0030) model time 0.2351 (0.2413) loss 2.5319 (2.6787) grad_norm 5.7809 (5.4875) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][330/1251] eta 0:03:45 lr 0.000041 wd 0.0500 time 0.2419 (0.2448) data time 0.0009 (0.0029) model time 0.2410 (0.2413) loss 3.1693 (2.6693) grad_norm 4.4523 (5.4998) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][340/1251] eta 0:03:43 lr 0.000041 wd 0.0500 time 0.2408 (0.2454) data time 0.0008 (0.0029) model time 0.2401 (0.2420) loss 3.5606 (2.6745) grad_norm 4.2098 (5.5425) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][350/1251] eta 0:03:41 lr 0.000041 wd 0.0500 time 0.2404 (0.2453) data time 0.0007 (0.0028) model time 0.2397 (0.2420) loss 3.4151 (2.6730) grad_norm 4.9726 (5.5291) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][360/1251] eta 0:03:38 lr 0.000041 wd 0.0500 time 0.2426 (0.2452) data time 0.0006 (0.0028) model time 0.2419 (0.2419) loss 2.3541 (2.6744) grad_norm 6.2850 (5.6086) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][370/1251] eta 0:03:35 lr 0.000041 wd 0.0500 time 0.2424 (0.2450) data time 0.0007 (0.0027) model time 0.2416 (0.2418) loss 2.0620 (2.6731) grad_norm 4.5732 (5.5906) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][380/1251] eta 0:03:33 lr 0.000041 wd 0.0500 time 0.2413 (0.2449) data time 0.0008 (0.0027) model time 0.2405 (0.2417) loss 3.2424 (2.6752) grad_norm 5.6265 (5.5635) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [268/300][390/1251] eta 0:03:30 lr 0.000041 wd 0.0500 time 0.2470 (0.2448) data time 0.0008 (0.0026) model time 0.2462 (0.2416) loss 3.3680 (2.6760) grad_norm 5.8406 (5.5708) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][400/1251] eta 0:03:28 lr 0.000041 wd 0.0500 time 0.2365 (0.2447) data time 0.0008 (0.0026) model time 0.2357 (0.2416) loss 2.9626 (2.6797) grad_norm 4.5583 (5.5525) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][410/1251] eta 0:03:25 lr 0.000041 wd 0.0500 time 0.2350 (0.2445) data time 0.0011 (0.0026) model time 0.2339 (0.2414) loss 3.0707 (2.6862) grad_norm 3.2303 (5.5311) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][420/1251] eta 0:03:23 lr 0.000041 wd 0.0500 time 0.2366 (0.2443) data time 0.0009 (0.0025) model time 0.2358 (0.2413) loss 2.6167 (2.6856) grad_norm 3.0065 (5.5146) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][430/1251] eta 0:03:20 lr 0.000041 wd 0.0500 time 0.2348 (0.2443) data time 0.0008 (0.0025) model time 0.2339 (0.2413) loss 2.6327 (2.6855) grad_norm 3.9011 (5.4939) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][440/1251] eta 0:03:18 lr 0.000041 wd 0.0500 time 0.2479 (0.2442) data time 0.0008 (0.0025) model time 0.2472 (0.2413) loss 2.1060 (2.6889) grad_norm 4.7133 (5.5131) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][450/1251] eta 0:03:15 lr 0.000041 wd 0.0500 time 0.2360 (0.2441) data time 0.0009 (0.0024) model time 0.2351 (0.2412) loss 2.9843 (2.6872) grad_norm 6.6525 (5.5198) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-09-01 09:11:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][460/1251] eta 0:03:13 lr 0.000041 wd 0.0500 time 0.2457 (0.2441) data time 0.0007 (0.0024) model time 0.2450 (0.2412) loss 2.2913 (2.6878) grad_norm 3.6084 (5.5929) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][470/1251] eta 0:03:10 lr 0.000041 wd 0.0500 time 0.2388 (0.2441) data time 0.0010 (0.0024) model time 0.2377 (0.2412) loss 2.9945 (2.6899) grad_norm 3.5147 (5.5883) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][480/1251] eta 0:03:08 lr 0.000041 wd 0.0500 time 0.2390 (0.2440) data time 0.0010 (0.0023) model time 0.2379 (0.2412) loss 2.4043 (2.6906) grad_norm 4.0853 (5.5799) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][490/1251] eta 0:03:05 lr 0.000041 wd 0.0500 time 0.2430 (0.2439) data time 0.0011 (0.0023) model time 0.2419 (0.2412) loss 2.7304 (2.6923) grad_norm 5.9617 (5.6309) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][500/1251] eta 0:03:03 lr 0.000041 wd 0.0500 time 0.2441 (0.2438) data time 0.0007 (0.0023) model time 0.2434 (0.2411) loss 3.4081 (2.6945) grad_norm 5.7042 (5.6227) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][510/1251] eta 0:03:00 lr 0.000041 wd 0.0500 time 0.2388 (0.2438) data time 0.0009 (0.0023) model time 0.2379 (0.2411) loss 3.1845 (2.6925) grad_norm 10.1443 (5.6289) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][520/1251] eta 0:02:58 lr 0.000041 wd 0.0500 time 0.2441 (0.2438) data time 0.0009 (0.0022) model 
time 0.2432 (0.2411) loss 2.3929 (2.6931) grad_norm 5.3736 (5.6228) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][530/1251] eta 0:02:55 lr 0.000041 wd 0.0500 time 0.2487 (0.2438) data time 0.0010 (0.0022) model time 0.2476 (0.2411) loss 2.5539 (2.6955) grad_norm 7.9773 (5.6232) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][540/1251] eta 0:02:53 lr 0.000041 wd 0.0500 time 0.2353 (0.2437) data time 0.0008 (0.0022) model time 0.2345 (0.2411) loss 2.8440 (2.6971) grad_norm 4.0946 (5.6227) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][550/1251] eta 0:02:50 lr 0.000041 wd 0.0500 time 0.2442 (0.2437) data time 0.0007 (0.0022) model time 0.2434 (0.2411) loss 2.7482 (2.6956) grad_norm 3.5543 (5.6091) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][560/1251] eta 0:02:48 lr 0.000041 wd 0.0500 time 0.2394 (0.2436) data time 0.0010 (0.0021) model time 0.2383 (0.2410) loss 2.9900 (2.6978) grad_norm 5.8240 (5.6109) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][570/1251] eta 0:02:45 lr 0.000041 wd 0.0500 time 0.2355 (0.2435) data time 0.0009 (0.0021) model time 0.2346 (0.2410) loss 3.1248 (2.6976) grad_norm 3.7868 (5.6519) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][580/1251] eta 0:02:43 lr 0.000041 wd 0.0500 time 0.2440 (0.2434) data time 0.0007 (0.0021) model time 0.2433 (0.2409) loss 3.0872 (2.6999) grad_norm 4.1417 (5.6583) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][590/1251] 
eta 0:02:40 lr 0.000041 wd 0.0500 time 0.2412 (0.2434) data time 0.0007 (0.0021) model time 0.2405 (0.2409) loss 1.9541 (2.6964) grad_norm 10.4037 (5.6695) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][600/1251] eta 0:02:38 lr 0.000041 wd 0.0500 time 0.2469 (0.2434) data time 0.0007 (0.0021) model time 0.2462 (0.2409) loss 2.7008 (2.6980) grad_norm 6.7617 (5.7005) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:12:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][610/1251] eta 0:02:35 lr 0.000041 wd 0.0500 time 0.2467 (0.2433) data time 0.0010 (0.0021) model time 0.2456 (0.2409) loss 3.0931 (2.7006) grad_norm 6.4226 (5.6955) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][620/1251] eta 0:02:33 lr 0.000041 wd 0.0500 time 0.2333 (0.2433) data time 0.0007 (0.0020) model time 0.2326 (0.2409) loss 3.0986 (2.7001) grad_norm 5.2790 (5.6902) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][630/1251] eta 0:02:31 lr 0.000041 wd 0.0500 time 0.2470 (0.2432) data time 0.0010 (0.0020) model time 0.2460 (0.2408) loss 3.0065 (2.7007) grad_norm 5.0010 (inf) loss_scale 256.0000 (509.5658) mem 7381MB [2024-09-01 09:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][640/1251] eta 0:02:28 lr 0.000041 wd 0.0500 time 0.2401 (0.2432) data time 0.0010 (0.0020) model time 0.2390 (0.2408) loss 2.8755 (2.7026) grad_norm 5.0138 (inf) loss_scale 256.0000 (505.6100) mem 7381MB [2024-09-01 09:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][650/1251] eta 0:02:26 lr 0.000041 wd 0.0500 time 0.2380 (0.2431) data time 0.0010 (0.0020) model time 0.2371 (0.2407) loss 2.4982 (2.7059) grad_norm 5.5822 (inf) loss_scale 256.0000 (501.7757) mem 7381MB [2024-09-01 09:12:16 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][660/1251] eta 0:02:23 lr 0.000041 wd 0.0500 time 0.2346 (0.2431) data time 0.0009 (0.0020) model time 0.2337 (0.2407) loss 2.3565 (2.7077) grad_norm 4.0584 (inf) loss_scale 256.0000 (498.0575) mem 7381MB [2024-09-01 09:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][670/1251] eta 0:02:21 lr 0.000041 wd 0.0500 time 0.2441 (0.2430) data time 0.0010 (0.0020) model time 0.2431 (0.2407) loss 2.8422 (2.7061) grad_norm 3.1144 (inf) loss_scale 256.0000 (494.4501) mem 7381MB [2024-09-01 09:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][680/1251] eta 0:02:18 lr 0.000041 wd 0.0500 time 0.2399 (0.2430) data time 0.0010 (0.0019) model time 0.2389 (0.2407) loss 2.3377 (2.7057) grad_norm 4.3152 (inf) loss_scale 256.0000 (490.9486) mem 7381MB [2024-09-01 09:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][690/1251] eta 0:02:16 lr 0.000040 wd 0.0500 time 0.2308 (0.2430) data time 0.0009 (0.0019) model time 0.2298 (0.2407) loss 2.4712 (2.7058) grad_norm 4.0987 (inf) loss_scale 256.0000 (487.5485) mem 7381MB [2024-09-01 09:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][700/1251] eta 0:02:13 lr 0.000040 wd 0.0500 time 0.2421 (0.2430) data time 0.0007 (0.0019) model time 0.2414 (0.2407) loss 2.8142 (2.7048) grad_norm 3.4316 (inf) loss_scale 256.0000 (484.2454) mem 7381MB [2024-09-01 09:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][710/1251] eta 0:02:11 lr 0.000040 wd 0.0500 time 0.2428 (0.2430) data time 0.0010 (0.0019) model time 0.2418 (0.2407) loss 2.4187 (2.7057) grad_norm 5.1963 (inf) loss_scale 256.0000 (481.0352) mem 7381MB [2024-09-01 09:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][720/1251] eta 0:02:08 lr 0.000040 wd 0.0500 time 0.2441 (0.2429) data time 0.0008 (0.0019) model time 0.2433 (0.2407) loss 2.5341 (2.7072) grad_norm 5.2029 
(inf) loss_scale 256.0000 (477.9140) mem 7381MB [2024-09-01 09:12:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][730/1251] eta 0:02:06 lr 0.000040 wd 0.0500 time 0.2383 (0.2432) data time 0.0008 (0.0019) model time 0.2374 (0.2410) loss 2.6061 (2.7059) grad_norm 6.4964 (inf) loss_scale 256.0000 (474.8782) mem 7381MB [2024-09-01 09:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][740/1251] eta 0:02:04 lr 0.000040 wd 0.0500 time 0.2336 (0.2432) data time 0.0009 (0.0019) model time 0.2328 (0.2410) loss 3.3321 (2.7077) grad_norm 5.0743 (inf) loss_scale 256.0000 (471.9244) mem 7381MB [2024-09-01 09:12:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][750/1251] eta 0:02:02 lr 0.000040 wd 0.0500 time 0.2413 (0.2437) data time 0.0011 (0.0019) model time 0.2401 (0.2416) loss 3.0140 (2.7092) grad_norm 4.7871 (inf) loss_scale 256.0000 (469.0493) mem 7381MB [2024-09-01 09:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][760/1251] eta 0:01:59 lr 0.000040 wd 0.0500 time 0.2404 (0.2437) data time 0.0009 (0.0018) model time 0.2395 (0.2416) loss 2.3227 (2.7087) grad_norm 4.3536 (inf) loss_scale 256.0000 (466.2497) mem 7381MB [2024-09-01 09:12:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][770/1251] eta 0:01:57 lr 0.000040 wd 0.0500 time 0.2427 (0.2437) data time 0.0012 (0.0018) model time 0.2416 (0.2416) loss 3.1928 (2.7073) grad_norm 4.6453 (inf) loss_scale 256.0000 (463.5227) mem 7381MB [2024-09-01 09:12:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][780/1251] eta 0:01:54 lr 0.000040 wd 0.0500 time 0.2381 (0.2436) data time 0.0009 (0.0018) model time 0.2372 (0.2416) loss 2.7452 (2.7105) grad_norm 6.3682 (inf) loss_scale 256.0000 (460.8656) mem 7381MB [2024-09-01 09:12:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][790/1251] eta 0:01:52 lr 0.000040 wd 0.0500 time 0.2408 (0.2436) data time 0.0008 (0.0018) 
model time 0.2400 (0.2416) loss 2.9733 (2.7106) grad_norm 3.3415 (inf) loss_scale 256.0000 (458.2756) mem 7381MB [2024-09-01 09:12:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][800/1251] eta 0:01:49 lr 0.000040 wd 0.0500 time 0.2354 (0.2436) data time 0.0009 (0.0018) model time 0.2345 (0.2415) loss 2.9246 (2.7128) grad_norm 3.3956 (inf) loss_scale 256.0000 (455.7503) mem 7381MB [2024-09-01 09:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][810/1251] eta 0:01:47 lr 0.000040 wd 0.0500 time 0.2386 (0.2436) data time 0.0011 (0.0018) model time 0.2376 (0.2416) loss 3.1795 (2.7121) grad_norm 3.7157 (inf) loss_scale 256.0000 (453.2873) mem 7381MB [2024-09-01 09:12:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][820/1251] eta 0:01:44 lr 0.000040 wd 0.0500 time 0.2365 (0.2436) data time 0.0009 (0.0018) model time 0.2356 (0.2415) loss 3.1058 (2.7121) grad_norm 6.0448 (inf) loss_scale 256.0000 (450.8843) mem 7381MB [2024-09-01 09:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][830/1251] eta 0:01:42 lr 0.000040 wd 0.0500 time 0.2464 (0.2435) data time 0.0007 (0.0018) model time 0.2457 (0.2415) loss 3.0050 (2.7152) grad_norm 3.6660 (inf) loss_scale 256.0000 (448.5391) mem 7381MB [2024-09-01 09:13:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][840/1251] eta 0:01:40 lr 0.000040 wd 0.0500 time 0.2540 (0.2435) data time 0.0009 (0.0018) model time 0.2531 (0.2415) loss 2.4371 (2.7124) grad_norm 4.4130 (inf) loss_scale 256.0000 (446.2497) mem 7381MB [2024-09-01 09:13:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][850/1251] eta 0:01:37 lr 0.000040 wd 0.0500 time 0.2460 (0.2435) data time 0.0008 (0.0018) model time 0.2452 (0.2416) loss 3.4338 (2.7135) grad_norm 5.2678 (inf) loss_scale 256.0000 (444.0141) mem 7381MB [2024-09-01 09:13:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][860/1251] eta 0:01:35 lr 
0.000040 wd 0.0500 time 0.2431 (0.2439) data time 0.0009 (0.0017) model time 0.2422 (0.2419) loss 2.3361 (2.7152) grad_norm 5.9597 (inf) loss_scale 256.0000 (441.8304) mem 7381MB [2024-09-01 09:13:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][870/1251] eta 0:01:32 lr 0.000040 wd 0.0500 time 0.2418 (0.2438) data time 0.0010 (0.0017) model time 0.2408 (0.2419) loss 2.5901 (2.7158) grad_norm 30.0178 (inf) loss_scale 256.0000 (439.6969) mem 7381MB [2024-09-01 09:13:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][880/1251] eta 0:01:30 lr 0.000040 wd 0.0500 time 0.2380 (0.2438) data time 0.0009 (0.0017) model time 0.2370 (0.2418) loss 2.5984 (2.7139) grad_norm 4.4190 (inf) loss_scale 256.0000 (437.6118) mem 7381MB [2024-09-01 09:13:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][890/1251] eta 0:01:28 lr 0.000040 wd 0.0500 time 0.2375 (0.2438) data time 0.0011 (0.0017) model time 0.2364 (0.2418) loss 2.7471 (2.7133) grad_norm 3.9471 (inf) loss_scale 256.0000 (435.5735) mem 7381MB [2024-09-01 09:13:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][900/1251] eta 0:01:25 lr 0.000040 wd 0.0500 time 0.2402 (0.2438) data time 0.0010 (0.0017) model time 0.2392 (0.2418) loss 3.3828 (2.7118) grad_norm 3.8856 (inf) loss_scale 256.0000 (433.5805) mem 7381MB [2024-09-01 09:13:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][910/1251] eta 0:01:23 lr 0.000040 wd 0.0500 time 0.2382 (0.2437) data time 0.0009 (0.0017) model time 0.2373 (0.2418) loss 2.9255 (2.7130) grad_norm 5.0561 (inf) loss_scale 256.0000 (431.6312) mem 7381MB [2024-09-01 09:13:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][920/1251] eta 0:01:20 lr 0.000040 wd 0.0500 time 0.2350 (0.2437) data time 0.0009 (0.0017) model time 0.2340 (0.2418) loss 2.8364 (2.7137) grad_norm 4.4140 (inf) loss_scale 256.0000 (429.7242) mem 7381MB [2024-09-01 09:13:22 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [268/300][930/1251] eta 0:01:18 lr 0.000040 wd 0.0500 time 0.2395 (0.2437) data time 0.0007 (0.0017) model time 0.2388 (0.2418) loss 3.1652 (2.7143) grad_norm 3.9836 (inf) loss_scale 256.0000 (427.8582) mem 7381MB [2024-09-01 09:13:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][940/1251] eta 0:01:15 lr 0.000040 wd 0.0500 time 0.2389 (0.2437) data time 0.0007 (0.0017) model time 0.2382 (0.2418) loss 1.8764 (2.7113) grad_norm 6.6757 (inf) loss_scale 256.0000 (426.0319) mem 7381MB [2024-09-01 09:13:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][950/1251] eta 0:01:13 lr 0.000040 wd 0.0500 time 0.2385 (0.2437) data time 0.0011 (0.0017) model time 0.2374 (0.2418) loss 2.5226 (2.7109) grad_norm 4.6193 (inf) loss_scale 256.0000 (424.2440) mem 7381MB [2024-09-01 09:13:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][960/1251] eta 0:01:10 lr 0.000040 wd 0.0500 time 0.2429 (0.2436) data time 0.0010 (0.0017) model time 0.2419 (0.2418) loss 2.9793 (2.7108) grad_norm 4.9997 (inf) loss_scale 256.0000 (422.4932) mem 7381MB [2024-09-01 09:13:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][970/1251] eta 0:01:08 lr 0.000040 wd 0.0500 time 0.2399 (0.2436) data time 0.0013 (0.0017) model time 0.2386 (0.2418) loss 2.6524 (2.7095) grad_norm 7.5082 (inf) loss_scale 256.0000 (420.7786) mem 7381MB [2024-09-01 09:13:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][980/1251] eta 0:01:06 lr 0.000040 wd 0.0500 time 0.2402 (0.2436) data time 0.0012 (0.0017) model time 0.2390 (0.2418) loss 2.7520 (2.7081) grad_norm 6.7233 (inf) loss_scale 256.0000 (419.0989) mem 7381MB [2024-09-01 09:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][990/1251] eta 0:01:03 lr 0.000040 wd 0.0500 time 0.2405 (0.2436) data time 0.0009 (0.0016) model time 0.2396 (0.2418) loss 1.6551 (2.7076) grad_norm 8.0393 (inf) loss_scale 
256.0000 (417.4531) mem 7381MB [2024-09-01 09:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1000/1251] eta 0:01:01 lr 0.000040 wd 0.0500 time 0.2483 (0.2436) data time 0.0010 (0.0016) model time 0.2473 (0.2418) loss 2.7425 (2.7077) grad_norm 6.0469 (inf) loss_scale 256.0000 (415.8402) mem 7381MB [2024-09-01 09:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1010/1251] eta 0:00:58 lr 0.000040 wd 0.0500 time 0.2372 (0.2436) data time 0.0008 (0.0016) model time 0.2364 (0.2418) loss 3.1750 (2.7079) grad_norm 4.0095 (inf) loss_scale 256.0000 (414.2591) mem 7381MB [2024-09-01 09:13:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1020/1251] eta 0:00:56 lr 0.000040 wd 0.0500 time 0.2462 (0.2436) data time 0.0010 (0.0016) model time 0.2452 (0.2418) loss 2.8596 (2.7060) grad_norm 7.5624 (inf) loss_scale 256.0000 (412.7091) mem 7381MB [2024-09-01 09:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1030/1251] eta 0:00:53 lr 0.000040 wd 0.0500 time 0.2476 (0.2435) data time 0.0007 (0.0016) model time 0.2468 (0.2417) loss 2.9040 (2.7071) grad_norm 5.1788 (inf) loss_scale 256.0000 (411.1891) mem 7381MB [2024-09-01 09:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1040/1251] eta 0:00:51 lr 0.000040 wd 0.0500 time 0.2400 (0.2435) data time 0.0011 (0.0016) model time 0.2389 (0.2417) loss 2.3401 (2.7086) grad_norm 5.5915 (inf) loss_scale 256.0000 (409.6984) mem 7381MB [2024-09-01 09:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1050/1251] eta 0:00:48 lr 0.000040 wd 0.0500 time 0.2418 (0.2435) data time 0.0009 (0.0016) model time 0.2409 (0.2417) loss 3.1520 (2.7097) grad_norm 3.7351 (inf) loss_scale 256.0000 (408.2360) mem 7381MB [2024-09-01 09:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1060/1251] eta 0:00:46 lr 0.000040 wd 0.0500 time 0.2370 (0.2434) data time 0.0010 (0.0016) model 
time 0.2361 (0.2416) loss 2.5371 (2.7090) grad_norm 4.2213 (inf) loss_scale 256.0000 (406.8011) mem 7381MB [2024-09-01 09:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1070/1251] eta 0:00:44 lr 0.000040 wd 0.0500 time 0.2505 (0.2434) data time 0.0007 (0.0016) model time 0.2498 (0.2416) loss 1.7343 (2.7094) grad_norm 5.1080 (inf) loss_scale 256.0000 (405.3931) mem 7381MB [2024-09-01 09:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1080/1251] eta 0:00:41 lr 0.000040 wd 0.0500 time 0.2398 (0.2434) data time 0.0009 (0.0016) model time 0.2389 (0.2416) loss 2.8858 (2.7079) grad_norm 5.1183 (inf) loss_scale 256.0000 (404.0111) mem 7381MB [2024-09-01 09:14:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1090/1251] eta 0:00:39 lr 0.000040 wd 0.0500 time 0.2321 (0.2434) data time 0.0012 (0.0016) model time 0.2309 (0.2416) loss 2.1477 (2.7077) grad_norm 4.6498 (inf) loss_scale 256.0000 (402.6544) mem 7381MB [2024-09-01 09:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1100/1251] eta 0:00:36 lr 0.000040 wd 0.0500 time 0.2375 (0.2433) data time 0.0012 (0.0016) model time 0.2363 (0.2416) loss 2.5223 (2.7093) grad_norm 5.1168 (inf) loss_scale 256.0000 (401.3224) mem 7381MB [2024-09-01 09:14:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1110/1251] eta 0:00:34 lr 0.000040 wd 0.0500 time 0.2512 (0.2433) data time 0.0009 (0.0016) model time 0.2503 (0.2415) loss 2.9301 (2.7096) grad_norm 8.0591 (inf) loss_scale 256.0000 (400.0144) mem 7381MB [2024-09-01 09:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1120/1251] eta 0:00:31 lr 0.000040 wd 0.0500 time 0.2356 (0.2433) data time 0.0009 (0.0016) model time 0.2347 (0.2415) loss 2.8164 (2.7085) grad_norm 4.5064 (inf) loss_scale 256.0000 (398.7297) mem 7381MB [2024-09-01 09:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1130/1251] eta 0:00:29 lr 
0.000040 wd 0.0500 time 0.2431 (0.2433) data time 0.0007 (0.0016) model time 0.2424 (0.2415) loss 1.8609 (2.7068) grad_norm 6.6700 (inf) loss_scale 256.0000 (397.4677) mem 7381MB [2024-09-01 09:14:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1140/1251] eta 0:00:27 lr 0.000040 wd 0.0500 time 0.2441 (0.2432) data time 0.0007 (0.0016) model time 0.2434 (0.2415) loss 2.2516 (2.7083) grad_norm 5.2019 (inf) loss_scale 256.0000 (396.2279) mem 7381MB [2024-09-01 09:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1150/1251] eta 0:00:24 lr 0.000040 wd 0.0500 time 0.2501 (0.2432) data time 0.0009 (0.0016) model time 0.2492 (0.2415) loss 2.1997 (2.7066) grad_norm 3.1488 (inf) loss_scale 256.0000 (395.0096) mem 7381MB [2024-09-01 09:14:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1160/1251] eta 0:00:22 lr 0.000040 wd 0.0500 time 0.2465 (0.2432) data time 0.0008 (0.0016) model time 0.2457 (0.2415) loss 1.9940 (2.7064) grad_norm 5.1439 (inf) loss_scale 256.0000 (393.8122) mem 7381MB [2024-09-01 09:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1170/1251] eta 0:00:19 lr 0.000040 wd 0.0500 time 0.2461 (0.2432) data time 0.0009 (0.0015) model time 0.2451 (0.2415) loss 3.0892 (2.7052) grad_norm 4.7599 (inf) loss_scale 256.0000 (392.6354) mem 7381MB [2024-09-01 09:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1180/1251] eta 0:00:17 lr 0.000040 wd 0.0500 time 0.2481 (0.2432) data time 0.0010 (0.0015) model time 0.2472 (0.2415) loss 3.2676 (2.7063) grad_norm 4.1858 (inf) loss_scale 256.0000 (391.4784) mem 7381MB [2024-09-01 09:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1190/1251] eta 0:00:14 lr 0.000040 wd 0.0500 time 0.2441 (0.2432) data time 0.0007 (0.0015) model time 0.2434 (0.2415) loss 2.6554 (2.7061) grad_norm 5.1419 (inf) loss_scale 256.0000 (390.3409) mem 7381MB [2024-09-01 09:14:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [268/300][1200/1251] eta 0:00:12 lr 0.000040 wd 0.0500 time 0.2366 (0.2432) data time 0.0008 (0.0015) model time 0.2358 (0.2415) loss 2.0648 (2.7049) grad_norm 5.1686 (inf) loss_scale 256.0000 (389.2223) mem 7381MB [2024-09-01 09:14:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1210/1251] eta 0:00:09 lr 0.000040 wd 0.0500 time 0.2417 (0.2432) data time 0.0009 (0.0015) model time 0.2408 (0.2415) loss 2.1492 (2.7021) grad_norm 4.3526 (inf) loss_scale 256.0000 (388.1222) mem 7381MB [2024-09-01 09:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1220/1251] eta 0:00:07 lr 0.000040 wd 0.0500 time 0.2500 (0.2432) data time 0.0010 (0.0015) model time 0.2490 (0.2415) loss 3.2397 (2.7034) grad_norm 6.4630 (inf) loss_scale 256.0000 (387.0401) mem 7381MB [2024-09-01 09:14:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1230/1251] eta 0:00:05 lr 0.000040 wd 0.0500 time 0.2406 (0.2431) data time 0.0011 (0.0015) model time 0.2396 (0.2415) loss 3.3928 (2.7047) grad_norm 5.6691 (inf) loss_scale 256.0000 (385.9756) mem 7381MB [2024-09-01 09:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1240/1251] eta 0:00:02 lr 0.000040 wd 0.0500 time 0.2247 (0.2431) data time 0.0007 (0.0015) model time 0.2240 (0.2414) loss 2.9251 (2.7061) grad_norm 4.2689 (inf) loss_scale 256.0000 (384.9283) mem 7381MB [2024-09-01 09:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [268/300][1250/1251] eta 0:00:00 lr 0.000040 wd 0.0500 time 0.2273 (0.2429) data time 0.0005 (0.0015) model time 0.2268 (0.2413) loss 3.1445 (2.7067) grad_norm 6.2619 (inf) loss_scale 256.0000 (383.8977) mem 7381MB [2024-09-01 09:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 268 training takes 0:05:03 [2024-09-01 09:14:39 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 09:14:40 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 09:14:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.417 (0.417) Loss 0.3801 (0.3801) Acc@1 93.262 (93.262) Acc@5 99.023 (99.023) Mem 7381MB [2024-09-01 09:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.109) Loss 0.5815 (0.6137) Acc@1 90.332 (87.686) Acc@5 97.949 (97.692) Mem 7381MB [2024-09-01 09:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.096) Loss 0.9565 (0.6471) Acc@1 77.441 (86.454) Acc@5 95.312 (97.582) Mem 7381MB [2024-09-01 09:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.091) Loss 1.1318 (0.7414) Acc@1 74.023 (84.227) Acc@5 92.578 (96.673) Mem 7381MB [2024-09-01 09:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0225 (0.7904) Acc@1 76.953 (83.036) Acc@5 94.141 (96.201) Mem 7381MB [2024-09-01 09:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.576 Acc@5 96.124 [2024-09-01 09:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 09:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.832 (0.832) Loss 0.3843 (0.3843) Acc@1 93.359 (93.359) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 09:14:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.149) Loss 0.5659 (0.6030) Acc@1 90.430 (87.793) Acc@5 98.145 (97.860) Mem 7381MB [2024-09-01 09:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.115) Loss 0.9048 (0.6334) Acc@1 78.125 (86.640) Acc@5 95.703 (97.754) Mem 7381MB [2024-09-01 09:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.103) Loss 1.1260 (0.7240) Acc@1 74.609 (84.488) Acc@5 
92.871 (96.834) Mem 7381MB [2024-09-01 09:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.094) Loss 0.9951 (0.7709) Acc@1 77.148 (83.325) Acc@5 94.238 (96.334) Mem 7381MB [2024-09-01 09:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.954 Acc@5 96.270 [2024-09-01 09:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 09:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][0/1251] eta 0:23:00 lr 0.000040 wd 0.0500 time 1.1035 (1.1035) data time 0.6265 (0.6265) model time 0.0000 (0.0000) loss 3.4280 (3.4280) grad_norm 4.5480 (4.5480) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][10/1251] eta 0:06:37 lr 0.000040 wd 0.0500 time 0.2475 (0.3202) data time 0.0009 (0.0579) model time 0.0000 (0.0000) loss 3.1821 (2.8793) grad_norm 5.4933 (4.7247) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][20/1251] eta 0:05:47 lr 0.000040 wd 0.0500 time 0.2383 (0.2820) data time 0.0009 (0.0308) model time 0.0000 (0.0000) loss 3.0898 (2.7725) grad_norm 4.1230 (4.6376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][30/1251] eta 0:05:28 lr 0.000040 wd 0.0500 time 0.2343 (0.2687) data time 0.0009 (0.0212) model time 0.0000 (0.0000) loss 2.8567 (2.8128) grad_norm 5.5889 (4.8285) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][40/1251] eta 0:05:17 lr 0.000040 wd 0.0500 time 0.2424 (0.2621) data time 0.0009 (0.0163) model time 0.0000 (0.0000) loss 2.5159 (2.8299) grad_norm 3.4308 (4.9821) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [269/300][50/1251] eta 0:05:09 lr 0.000040 wd 0.0500 time 0.2455 (0.2579) data time 0.0009 (0.0133) model time 0.0000 (0.0000) loss 2.8899 (2.8138) grad_norm 4.0366 (5.1090) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][60/1251] eta 0:05:07 lr 0.000040 wd 0.0500 time 0.2357 (0.2584) data time 0.0009 (0.0112) model time 0.2348 (0.2597) loss 3.1257 (2.7900) grad_norm 5.6339 (5.1525) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][70/1251] eta 0:05:05 lr 0.000040 wd 0.0500 time 0.2381 (0.2590) data time 0.0008 (0.0098) model time 0.2372 (0.2608) loss 2.4862 (2.7927) grad_norm 4.5253 (5.0823) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][80/1251] eta 0:05:01 lr 0.000040 wd 0.0500 time 0.2377 (0.2571) data time 0.0010 (0.0087) model time 0.2367 (0.2548) loss 3.0320 (2.7719) grad_norm 3.9782 (5.0780) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][90/1251] eta 0:04:56 lr 0.000040 wd 0.0500 time 0.2452 (0.2552) data time 0.0007 (0.0080) model time 0.2444 (0.2505) loss 3.1350 (2.7655) grad_norm 5.4508 (5.1380) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][100/1251] eta 0:04:52 lr 0.000039 wd 0.0500 time 0.2376 (0.2539) data time 0.0008 (0.0073) model time 0.2368 (0.2486) loss 2.5497 (2.7778) grad_norm 6.9533 (5.1629) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][110/1251] eta 0:04:48 lr 0.000039 wd 0.0500 time 0.2372 (0.2525) data time 0.0009 (0.0067) model time 0.2363 (0.2467) loss 3.1473 (2.7664) grad_norm 4.7972 (5.2479) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 09:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][120/1251] eta 0:04:44 lr 0.000039 wd 0.0500 time 0.2359 (0.2515) data time 0.0008 (0.0063) model time 0.2351 (0.2458) loss 3.0169 (2.7509) grad_norm 57.2696 (6.3719) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][130/1251] eta 0:04:43 lr 0.000039 wd 0.0500 time 0.2424 (0.2529) data time 0.0009 (0.0058) model time 0.2414 (0.2486) loss 3.3553 (2.7422) grad_norm 4.3243 (6.2685) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][140/1251] eta 0:04:40 lr 0.000039 wd 0.0500 time 0.2422 (0.2522) data time 0.0013 (0.0055) model time 0.2409 (0.2478) loss 2.8590 (2.7393) grad_norm 5.8635 (6.4214) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][150/1251] eta 0:04:36 lr 0.000039 wd 0.0500 time 0.2446 (0.2514) data time 0.0009 (0.0052) model time 0.2437 (0.2469) loss 3.2128 (2.7569) grad_norm 4.6463 (6.3748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][160/1251] eta 0:04:33 lr 0.000039 wd 0.0500 time 0.2392 (0.2507) data time 0.0011 (0.0050) model time 0.2381 (0.2463) loss 2.8161 (2.7562) grad_norm 5.7071 (6.4803) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][170/1251] eta 0:04:30 lr 0.000039 wd 0.0500 time 0.2421 (0.2501) data time 0.0011 (0.0047) model time 0.2410 (0.2457) loss 3.0977 (2.7688) grad_norm 3.7475 (6.4958) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][180/1251] eta 0:04:27 lr 0.000039 wd 0.0500 time 0.2265 (0.2496) data time 0.0009 (0.0045) model time 0.2257 
(0.2452) loss 2.6689 (2.7685) grad_norm 5.0915 (6.4783) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][190/1251] eta 0:04:24 lr 0.000039 wd 0.0500 time 0.2400 (0.2491) data time 0.0010 (0.0043) model time 0.2390 (0.2448) loss 2.3001 (2.7618) grad_norm 8.1918 (6.4717) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][200/1251] eta 0:04:21 lr 0.000039 wd 0.0500 time 0.2406 (0.2488) data time 0.0009 (0.0042) model time 0.2397 (0.2446) loss 3.3028 (2.7585) grad_norm 5.4843 (6.4024) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][210/1251] eta 0:04:18 lr 0.000039 wd 0.0500 time 0.2440 (0.2485) data time 0.0011 (0.0040) model time 0.2430 (0.2444) loss 2.9682 (2.7616) grad_norm 6.9522 (6.3488) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][220/1251] eta 0:04:15 lr 0.000039 wd 0.0500 time 0.2418 (0.2481) data time 0.0007 (0.0039) model time 0.2411 (0.2441) loss 3.0393 (2.7708) grad_norm 6.5125 (6.2982) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][230/1251] eta 0:04:13 lr 0.000039 wd 0.0500 time 0.2532 (0.2479) data time 0.0011 (0.0037) model time 0.2521 (0.2440) loss 2.2554 (2.7732) grad_norm 5.9458 (6.2612) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][240/1251] eta 0:04:10 lr 0.000039 wd 0.0500 time 0.2362 (0.2476) data time 0.0007 (0.0036) model time 0.2355 (0.2438) loss 3.6025 (2.7747) grad_norm 5.1993 (6.2307) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][250/1251] eta 0:04:07 
lr 0.000039 wd 0.0500 time 0.2415 (0.2473) data time 0.0010 (0.0035) model time 0.2405 (0.2436) loss 1.8095 (2.7782) grad_norm 5.7351 (6.1969) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][260/1251] eta 0:04:04 lr 0.000039 wd 0.0500 time 0.2362 (0.2472) data time 0.0007 (0.0034) model time 0.2355 (0.2435) loss 2.2346 (2.7797) grad_norm 4.8914 (6.1915) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][270/1251] eta 0:04:02 lr 0.000039 wd 0.0500 time 0.2436 (0.2470) data time 0.0008 (0.0033) model time 0.2428 (0.2434) loss 2.1282 (2.7670) grad_norm 7.5844 (6.1882) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][280/1251] eta 0:03:59 lr 0.000039 wd 0.0500 time 0.2355 (0.2467) data time 0.0009 (0.0033) model time 0.2345 (0.2432) loss 2.8464 (2.7615) grad_norm 4.6176 (6.1310) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][290/1251] eta 0:03:56 lr 0.000039 wd 0.0500 time 0.2400 (0.2465) data time 0.0007 (0.0032) model time 0.2393 (0.2430) loss 3.3491 (2.7657) grad_norm 4.1579 (6.1356) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][300/1251] eta 0:03:54 lr 0.000039 wd 0.0500 time 0.2442 (0.2464) data time 0.0007 (0.0031) model time 0.2435 (0.2430) loss 1.7883 (2.7596) grad_norm 5.9244 (6.1392) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][310/1251] eta 0:03:51 lr 0.000039 wd 0.0500 time 0.2425 (0.2463) data time 0.0009 (0.0030) model time 0.2416 (0.2430) loss 1.9379 (2.7591) grad_norm 3.1630 (6.1052) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:07 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][320/1251] eta 0:03:49 lr 0.000039 wd 0.0500 time 0.2443 (0.2462) data time 0.0009 (0.0030) model time 0.2434 (0.2430) loss 3.0657 (2.7600) grad_norm 6.2104 (6.1098) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][330/1251] eta 0:03:46 lr 0.000039 wd 0.0500 time 0.2361 (0.2461) data time 0.0011 (0.0029) model time 0.2350 (0.2429) loss 1.9068 (2.7500) grad_norm 5.1276 (6.1685) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][340/1251] eta 0:03:44 lr 0.000039 wd 0.0500 time 0.2454 (0.2460) data time 0.0009 (0.0029) model time 0.2445 (0.2428) loss 3.0353 (2.7482) grad_norm 3.0191 (6.1498) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][350/1251] eta 0:03:42 lr 0.000039 wd 0.0500 time 0.2400 (0.2464) data time 0.0011 (0.0028) model time 0.2389 (0.2434) loss 2.7659 (2.7505) grad_norm 8.3260 (6.1986) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][360/1251] eta 0:03:39 lr 0.000039 wd 0.0500 time 0.2358 (0.2463) data time 0.0008 (0.0028) model time 0.2349 (0.2434) loss 2.9037 (2.7472) grad_norm 5.5105 (6.1787) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][370/1251] eta 0:03:36 lr 0.000039 wd 0.0500 time 0.2492 (0.2462) data time 0.0008 (0.0027) model time 0.2485 (0.2434) loss 2.8389 (2.7491) grad_norm 4.8002 (6.1728) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][380/1251] eta 0:03:34 lr 0.000039 wd 0.0500 time 0.2455 (0.2461) data time 0.0009 (0.0027) model time 0.2446 (0.2433) loss 2.7942 (2.7447) 
grad_norm 5.6765 (6.1662) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][390/1251] eta 0:03:31 lr 0.000039 wd 0.0500 time 0.2432 (0.2460) data time 0.0007 (0.0026) model time 0.2425 (0.2432) loss 2.7115 (2.7487) grad_norm 4.8894 (6.1211) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][400/1251] eta 0:03:29 lr 0.000039 wd 0.0500 time 0.2416 (0.2459) data time 0.0008 (0.0026) model time 0.2408 (0.2431) loss 2.4203 (2.7399) grad_norm 4.8927 (6.0933) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][410/1251] eta 0:03:26 lr 0.000039 wd 0.0500 time 0.2447 (0.2458) data time 0.0008 (0.0025) model time 0.2439 (0.2430) loss 3.0723 (2.7360) grad_norm 5.8782 (6.0861) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][420/1251] eta 0:03:24 lr 0.000039 wd 0.0500 time 0.2402 (0.2457) data time 0.0009 (0.0025) model time 0.2394 (0.2429) loss 2.7258 (2.7363) grad_norm 4.3827 (6.0707) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][430/1251] eta 0:03:21 lr 0.000039 wd 0.0500 time 0.2381 (0.2456) data time 0.0008 (0.0025) model time 0.2374 (0.2429) loss 3.2950 (2.7397) grad_norm 10.1699 (6.0942) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][440/1251] eta 0:03:19 lr 0.000039 wd 0.0500 time 0.2520 (0.2456) data time 0.0007 (0.0024) model time 0.2512 (0.2429) loss 1.7669 (2.7386) grad_norm 4.7758 (6.0747) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][450/1251] eta 0:03:16 lr 0.000039 wd 0.0500 time 
0.2472 (0.2455) data time 0.0008 (0.0024) model time 0.2465 (0.2429) loss 3.0195 (2.7414) grad_norm 10.2495 (6.0823) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][460/1251] eta 0:03:14 lr 0.000039 wd 0.0500 time 0.2453 (0.2453) data time 0.0010 (0.0024) model time 0.2443 (0.2428) loss 2.3601 (2.7355) grad_norm 5.4148 (6.0935) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][470/1251] eta 0:03:11 lr 0.000039 wd 0.0500 time 0.2380 (0.2452) data time 0.0009 (0.0023) model time 0.2371 (0.2427) loss 2.8410 (2.7381) grad_norm 8.7799 (6.0750) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][480/1251] eta 0:03:09 lr 0.000039 wd 0.0500 time 0.2370 (0.2451) data time 0.0012 (0.0023) model time 0.2358 (0.2426) loss 3.0881 (2.7415) grad_norm 5.2637 (6.0589) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][490/1251] eta 0:03:06 lr 0.000039 wd 0.0500 time 0.2534 (0.2451) data time 0.0008 (0.0023) model time 0.2526 (0.2426) loss 1.9140 (2.7390) grad_norm 4.6732 (6.0453) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][500/1251] eta 0:03:03 lr 0.000039 wd 0.0500 time 0.2438 (0.2450) data time 0.0008 (0.0023) model time 0.2431 (0.2425) loss 3.1058 (2.7402) grad_norm 5.7357 (6.0305) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][510/1251] eta 0:03:01 lr 0.000039 wd 0.0500 time 0.2399 (0.2449) data time 0.0010 (0.0022) model time 0.2389 (0.2424) loss 2.8803 (2.7356) grad_norm 4.7537 (5.9988) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:55 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [269/300][520/1251] eta 0:02:58 lr 0.000039 wd 0.0500 time 0.2476 (0.2448) data time 0.0007 (0.0022) model time 0.2470 (0.2424) loss 2.9309 (2.7373) grad_norm 6.6101 (5.9833) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][530/1251] eta 0:02:56 lr 0.000039 wd 0.0500 time 0.2539 (0.2448) data time 0.0010 (0.0022) model time 0.2529 (0.2424) loss 2.9729 (2.7348) grad_norm 5.0948 (5.9590) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][540/1251] eta 0:02:54 lr 0.000039 wd 0.0500 time 0.2370 (0.2447) data time 0.0009 (0.0022) model time 0.2362 (0.2424) loss 3.1727 (2.7375) grad_norm 9.3110 (5.9491) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][550/1251] eta 0:02:51 lr 0.000039 wd 0.0500 time 0.2338 (0.2447) data time 0.0008 (0.0021) model time 0.2330 (0.2424) loss 3.0789 (2.7381) grad_norm 4.4271 (5.9258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][560/1251] eta 0:02:49 lr 0.000039 wd 0.0500 time 0.2382 (0.2446) data time 0.0008 (0.0021) model time 0.2373 (0.2423) loss 2.3520 (2.7385) grad_norm 4.2226 (5.9397) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][570/1251] eta 0:02:46 lr 0.000039 wd 0.0500 time 0.2437 (0.2446) data time 0.0011 (0.0021) model time 0.2426 (0.2423) loss 3.1899 (2.7416) grad_norm 3.7955 (5.9204) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][580/1251] eta 0:02:44 lr 0.000039 wd 0.0500 time 0.2432 (0.2445) data time 0.0010 (0.0021) model time 0.2422 (0.2422) loss 2.2073 (2.7384) grad_norm 3.9099 
(5.9385) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][590/1251] eta 0:02:41 lr 0.000039 wd 0.0500 time 0.2395 (0.2448) data time 0.0007 (0.0021) model time 0.2388 (0.2426) loss 2.9829 (2.7377) grad_norm 6.0812 (5.9583) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][600/1251] eta 0:02:39 lr 0.000039 wd 0.0500 time 0.2488 (0.2451) data time 0.0010 (0.0020) model time 0.2478 (0.2429) loss 2.4056 (2.7342) grad_norm 6.0104 (5.9423) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][610/1251] eta 0:02:37 lr 0.000039 wd 0.0500 time 0.2463 (0.2450) data time 0.0007 (0.0020) model time 0.2456 (0.2428) loss 1.7784 (2.7356) grad_norm 5.8920 (5.9362) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][620/1251] eta 0:02:34 lr 0.000039 wd 0.0500 time 0.2438 (0.2449) data time 0.0009 (0.0020) model time 0.2430 (0.2428) loss 2.9259 (2.7370) grad_norm 4.0658 (5.9182) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][630/1251] eta 0:02:32 lr 0.000039 wd 0.0500 time 0.2500 (0.2449) data time 0.0008 (0.0020) model time 0.2493 (0.2428) loss 2.2784 (2.7423) grad_norm 8.3860 (5.9078) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][640/1251] eta 0:02:29 lr 0.000039 wd 0.0500 time 0.2460 (0.2449) data time 0.0009 (0.0020) model time 0.2451 (0.2428) loss 2.9891 (2.7428) grad_norm 5.2655 (5.9012) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][650/1251] eta 0:02:27 lr 0.000039 wd 0.0500 time 0.4354 (0.2452) data 
time 0.0009 (0.0020) model time 0.4345 (0.2431) loss 3.0007 (2.7435) grad_norm 6.3709 (5.8997) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][660/1251] eta 0:02:25 lr 0.000039 wd 0.0500 time 0.2432 (0.2454) data time 0.0007 (0.0019) model time 0.2425 (0.2434) loss 2.0171 (2.7409) grad_norm 9.4857 (5.8998) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][670/1251] eta 0:02:22 lr 0.000039 wd 0.0500 time 0.2388 (0.2454) data time 0.0009 (0.0019) model time 0.2379 (0.2434) loss 3.2901 (2.7408) grad_norm 3.8597 (5.8961) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][680/1251] eta 0:02:20 lr 0.000039 wd 0.0500 time 0.2475 (0.2454) data time 0.0008 (0.0019) model time 0.2467 (0.2434) loss 1.8638 (2.7401) grad_norm 6.9833 (5.8967) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][690/1251] eta 0:02:17 lr 0.000039 wd 0.0500 time 0.2431 (0.2453) data time 0.0009 (0.0019) model time 0.2422 (0.2433) loss 2.3955 (2.7421) grad_norm 5.0924 (5.8885) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][700/1251] eta 0:02:15 lr 0.000039 wd 0.0500 time 0.2379 (0.2453) data time 0.0008 (0.0019) model time 0.2371 (0.2433) loss 2.2358 (2.7408) grad_norm 4.1296 (5.8761) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][710/1251] eta 0:02:12 lr 0.000039 wd 0.0500 time 0.2368 (0.2452) data time 0.0007 (0.0019) model time 0.2360 (0.2432) loss 2.7228 (2.7402) grad_norm 4.9488 (5.9492) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [269/300][720/1251] eta 0:02:10 lr 0.000039 wd 0.0500 time 0.2359 (0.2452) data time 0.0010 (0.0019) model time 0.2349 (0.2432) loss 2.2536 (2.7363) grad_norm 4.4128 (6.0289) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][730/1251] eta 0:02:07 lr 0.000039 wd 0.0500 time 0.2393 (0.2451) data time 0.0007 (0.0018) model time 0.2386 (0.2432) loss 2.8236 (2.7361) grad_norm 4.5237 (6.0076) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][740/1251] eta 0:02:05 lr 0.000039 wd 0.0500 time 0.2435 (0.2451) data time 0.0012 (0.0018) model time 0.2423 (0.2431) loss 2.8592 (2.7378) grad_norm 3.6997 (5.9915) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][750/1251] eta 0:02:02 lr 0.000039 wd 0.0500 time 0.2306 (0.2450) data time 0.0011 (0.0018) model time 0.2295 (0.2431) loss 2.5835 (2.7380) grad_norm 4.0358 (5.9841) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][760/1251] eta 0:02:00 lr 0.000039 wd 0.0500 time 0.2462 (0.2450) data time 0.0008 (0.0018) model time 0.2454 (0.2431) loss 2.8572 (2.7371) grad_norm 7.6511 (5.9826) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][770/1251] eta 0:01:57 lr 0.000038 wd 0.0500 time 0.2359 (0.2450) data time 0.0009 (0.0018) model time 0.2349 (0.2431) loss 2.8398 (2.7369) grad_norm 4.7100 (5.9763) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][780/1251] eta 0:01:55 lr 0.000038 wd 0.0500 time 0.2421 (0.2449) data time 0.0010 (0.0018) model time 0.2411 (0.2430) loss 2.7838 (2.7360) grad_norm 3.2972 (5.9689) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 09:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][790/1251] eta 0:01:52 lr 0.000038 wd 0.0500 time 0.2386 (0.2448) data time 0.0009 (0.0018) model time 0.2376 (0.2429) loss 2.4051 (2.7365) grad_norm 5.9166 (5.9714) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][800/1251] eta 0:01:50 lr 0.000038 wd 0.0500 time 0.2390 (0.2448) data time 0.0009 (0.0018) model time 0.2382 (0.2429) loss 3.1196 (2.7361) grad_norm 5.3721 (5.9659) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][810/1251] eta 0:01:47 lr 0.000038 wd 0.0500 time 0.2446 (0.2447) data time 0.0011 (0.0018) model time 0.2436 (0.2429) loss 2.5712 (2.7364) grad_norm 4.9210 (5.9528) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][820/1251] eta 0:01:45 lr 0.000038 wd 0.0500 time 0.2507 (0.2447) data time 0.0007 (0.0018) model time 0.2500 (0.2428) loss 2.5591 (2.7362) grad_norm 6.5758 (5.9525) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][830/1251] eta 0:01:43 lr 0.000038 wd 0.0500 time 0.2420 (0.2447) data time 0.0011 (0.0018) model time 0.2409 (0.2428) loss 2.4951 (2.7345) grad_norm 4.4601 (5.9419) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][840/1251] eta 0:01:40 lr 0.000038 wd 0.0500 time 0.2491 (0.2447) data time 0.0007 (0.0017) model time 0.2485 (0.2428) loss 2.4559 (2.7321) grad_norm 4.8479 (5.9322) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][850/1251] eta 0:01:38 lr 0.000038 wd 0.0500 time 0.2431 (0.2447) data time 0.0007 (0.0018) model 
time 0.2424 (0.2428) loss 3.1801 (2.7330) grad_norm 5.1461 (5.9748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][860/1251] eta 0:01:35 lr 0.000038 wd 0.0500 time 0.2368 (0.2446) data time 0.0008 (0.0018) model time 0.2360 (0.2427) loss 3.5775 (2.7324) grad_norm 4.3925 (5.9630) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][870/1251] eta 0:01:33 lr 0.000038 wd 0.0500 time 0.2410 (0.2446) data time 0.0010 (0.0017) model time 0.2400 (0.2427) loss 2.7390 (2.7324) grad_norm 5.1699 (5.9537) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][880/1251] eta 0:01:30 lr 0.000038 wd 0.0500 time 0.4372 (0.2448) data time 0.0007 (0.0017) model time 0.4365 (0.2429) loss 3.0361 (2.7322) grad_norm 10.0034 (5.9475) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][890/1251] eta 0:01:28 lr 0.000038 wd 0.0500 time 0.2421 (0.2447) data time 0.0010 (0.0017) model time 0.2411 (0.2429) loss 2.7146 (2.7291) grad_norm 5.0945 (5.9459) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][900/1251] eta 0:01:25 lr 0.000038 wd 0.0500 time 0.2412 (0.2447) data time 0.0007 (0.0017) model time 0.2405 (0.2428) loss 2.3010 (2.7298) grad_norm 4.0063 (5.9400) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][910/1251] eta 0:01:23 lr 0.000038 wd 0.0500 time 0.2483 (0.2446) data time 0.0007 (0.0017) model time 0.2476 (0.2428) loss 2.9166 (2.7301) grad_norm 5.5443 (5.9297) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][920/1251] 
eta 0:01:20 lr 0.000038 wd 0.0500 time 0.2429 (0.2446) data time 0.0009 (0.0017) model time 0.2420 (0.2428) loss 3.4180 (2.7316) grad_norm 6.7533 (5.9161) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][930/1251] eta 0:01:18 lr 0.000038 wd 0.0500 time 0.2392 (0.2446) data time 0.0007 (0.0017) model time 0.2384 (0.2427) loss 2.5627 (2.7338) grad_norm 7.2253 (5.9156) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][940/1251] eta 0:01:16 lr 0.000038 wd 0.0500 time 0.2363 (0.2445) data time 0.0011 (0.0017) model time 0.2352 (0.2427) loss 1.9976 (2.7324) grad_norm 9.0546 (5.9227) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][950/1251] eta 0:01:13 lr 0.000038 wd 0.0500 time 0.2435 (0.2445) data time 0.0011 (0.0017) model time 0.2424 (0.2427) loss 2.7446 (2.7326) grad_norm 3.9176 (5.9194) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][960/1251] eta 0:01:11 lr 0.000038 wd 0.0500 time 0.2338 (0.2445) data time 0.0008 (0.0017) model time 0.2330 (0.2427) loss 3.0936 (2.7355) grad_norm 4.7446 (5.9164) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][970/1251] eta 0:01:08 lr 0.000038 wd 0.0500 time 0.2396 (0.2444) data time 0.0006 (0.0017) model time 0.2390 (0.2426) loss 2.7476 (2.7361) grad_norm 20.2229 (5.9392) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][980/1251] eta 0:01:06 lr 0.000038 wd 0.0500 time 0.2583 (0.2444) data time 0.0010 (0.0017) model time 0.2574 (0.2426) loss 3.0197 (2.7365) grad_norm 6.5651 (5.9697) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
09:18:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][990/1251] eta 0:01:03 lr 0.000038 wd 0.0500 time 0.2378 (0.2443) data time 0.0010 (0.0017) model time 0.2368 (0.2426) loss 2.5571 (2.7364) grad_norm 5.9369 (5.9594) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1000/1251] eta 0:01:01 lr 0.000038 wd 0.0500 time 0.2430 (0.2443) data time 0.0009 (0.0017) model time 0.2421 (0.2425) loss 2.7118 (2.7376) grad_norm 4.7670 (5.9437) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1010/1251] eta 0:00:58 lr 0.000038 wd 0.0500 time 0.2372 (0.2443) data time 0.0010 (0.0016) model time 0.2362 (0.2425) loss 2.0964 (2.7390) grad_norm 4.6749 (5.9680) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:18:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1020/1251] eta 0:00:56 lr 0.000038 wd 0.0500 time 0.2452 (0.2442) data time 0.0009 (0.0016) model time 0.2443 (0.2425) loss 1.8099 (2.7374) grad_norm 8.6415 (5.9561) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1030/1251] eta 0:00:53 lr 0.000038 wd 0.0500 time 0.2562 (0.2442) data time 0.0007 (0.0016) model time 0.2555 (0.2424) loss 3.0654 (2.7375) grad_norm 3.5807 (5.9507) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1040/1251] eta 0:00:51 lr 0.000038 wd 0.0500 time 0.2444 (0.2442) data time 0.0010 (0.0016) model time 0.2434 (0.2424) loss 3.1068 (2.7380) grad_norm 4.9330 (5.9495) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1050/1251] eta 0:00:49 lr 0.000038 wd 0.0500 time 0.2396 (0.2441) data time 0.0009 (0.0016) model time 0.2387 (0.2424) loss 
2.8175 (2.7368) grad_norm 5.3178 (5.9476) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1060/1251] eta 0:00:46 lr 0.000038 wd 0.0500 time 0.2480 (0.2441) data time 0.0010 (0.0016) model time 0.2470 (0.2424) loss 2.5731 (2.7364) grad_norm 4.7540 (5.9376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1070/1251] eta 0:00:44 lr 0.000038 wd 0.0500 time 0.2406 (0.2441) data time 0.0007 (0.0016) model time 0.2400 (0.2424) loss 3.5047 (2.7373) grad_norm 4.2014 (5.9343) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1080/1251] eta 0:00:41 lr 0.000038 wd 0.0500 time 0.2439 (0.2441) data time 0.0009 (0.0016) model time 0.2430 (0.2424) loss 3.4054 (2.7389) grad_norm 6.2258 (5.9361) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1090/1251] eta 0:00:39 lr 0.000038 wd 0.0500 time 0.2469 (0.2441) data time 0.0007 (0.0016) model time 0.2462 (0.2423) loss 2.2841 (2.7399) grad_norm 4.0709 (5.9223) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1100/1251] eta 0:00:36 lr 0.000038 wd 0.0500 time 0.2391 (0.2440) data time 0.0010 (0.0016) model time 0.2381 (0.2423) loss 3.0175 (2.7376) grad_norm 3.2566 (5.9170) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1110/1251] eta 0:00:34 lr 0.000038 wd 0.0500 time 0.2343 (0.2443) data time 0.0008 (0.0016) model time 0.2336 (0.2426) loss 2.7203 (2.7392) grad_norm 3.6278 (5.9123) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1120/1251] eta 0:00:32 lr 
0.000038 wd 0.0500 time 0.2331 (0.2444) data time 0.0008 (0.0016) model time 0.2323 (0.2427) loss 2.9719 (2.7389) grad_norm 3.6201 (5.9398) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1130/1251] eta 0:00:29 lr 0.000038 wd 0.0500 time 0.2330 (0.2443) data time 0.0008 (0.0016) model time 0.2322 (0.2427) loss 1.7927 (2.7387) grad_norm 4.8650 (5.9365) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1140/1251] eta 0:00:27 lr 0.000038 wd 0.0500 time 0.2456 (0.2443) data time 0.0011 (0.0016) model time 0.2445 (0.2427) loss 3.3096 (2.7392) grad_norm 3.7673 (5.9399) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1150/1251] eta 0:00:24 lr 0.000038 wd 0.0500 time 0.2428 (0.2443) data time 0.0011 (0.0016) model time 0.2416 (0.2426) loss 3.1663 (2.7402) grad_norm 21.6417 (5.9513) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1160/1251] eta 0:00:22 lr 0.000038 wd 0.0500 time 0.2432 (0.2443) data time 0.0009 (0.0016) model time 0.2423 (0.2426) loss 1.9830 (2.7378) grad_norm 5.8286 (5.9430) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1170/1251] eta 0:00:19 lr 0.000038 wd 0.0500 time 0.2374 (0.2443) data time 0.0012 (0.0016) model time 0.2362 (0.2426) loss 3.1965 (2.7383) grad_norm 4.2803 (5.9322) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1180/1251] eta 0:00:17 lr 0.000038 wd 0.0500 time 0.2384 (0.2443) data time 0.0009 (0.0016) model time 0.2375 (0.2426) loss 2.9979 (2.7358) grad_norm 4.5502 (5.9259) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:39 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1190/1251] eta 0:00:14 lr 0.000038 wd 0.0500 time 0.2445 (0.2442) data time 0.0011 (0.0016) model time 0.2435 (0.2426) loss 2.3652 (2.7348) grad_norm 4.9087 (5.9207) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1200/1251] eta 0:00:12 lr 0.000038 wd 0.0500 time 0.2514 (0.2442) data time 0.0008 (0.0015) model time 0.2505 (0.2426) loss 3.1200 (2.7360) grad_norm 5.1206 (5.9081) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1210/1251] eta 0:00:10 lr 0.000038 wd 0.0500 time 0.2425 (0.2442) data time 0.0009 (0.0015) model time 0.2416 (0.2425) loss 2.4960 (2.7361) grad_norm 3.6162 (5.9046) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1220/1251] eta 0:00:07 lr 0.000038 wd 0.0500 time 0.2434 (0.2442) data time 0.0007 (0.0015) model time 0.2427 (0.2425) loss 3.1732 (2.7372) grad_norm 4.2744 (5.8954) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1230/1251] eta 0:00:05 lr 0.000038 wd 0.0500 time 0.2362 (0.2441) data time 0.0008 (0.0015) model time 0.2353 (0.2425) loss 2.5100 (2.7369) grad_norm 4.3756 (5.8883) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1240/1251] eta 0:00:02 lr 0.000038 wd 0.0500 time 0.2254 (0.2440) data time 0.0007 (0.0015) model time 0.2247 (0.2424) loss 2.5457 (2.7373) grad_norm 4.5016 (5.8878) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [269/300][1250/1251] eta 0:00:00 lr 0.000038 wd 0.0500 time 0.2226 (0.2439) data time 0.0005 (0.0015) model time 0.2221 (0.2423) loss 2.6987 
(2.7392) grad_norm 4.2133 (5.8827) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:19:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 269 training takes 0:05:05 [2024-09-01 09:19:53 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:19:54 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 09:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.477 (0.477) Loss 0.3899 (0.3899) Acc@1 92.773 (92.773) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 09:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.113) Loss 0.5791 (0.6126) Acc@1 90.137 (87.544) Acc@5 98.145 (97.567) Mem 7381MB [2024-09-01 09:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.097) Loss 0.9321 (0.6434) Acc@1 76.953 (86.370) Acc@5 95.703 (97.531) Mem 7381MB [2024-09-01 09:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.091) Loss 1.1406 (0.7386) Acc@1 74.609 (84.227) Acc@5 92.188 (96.610) Mem 7381MB [2024-09-01 09:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0186 (0.7888) Acc@1 76.758 (83.044) Acc@5 94.141 (96.110) Mem 7381MB [2024-09-01 09:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.662 Acc@5 96.078 [2024-09-01 09:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 09:19:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.66% [2024-09-01 09:19:58 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... 
[2024-09-01 09:19:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 09:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.448 (0.448) Loss 0.3843 (0.3843) Acc@1 93.359 (93.359) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 09:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.084 (0.111) Loss 0.5659 (0.6031) Acc@1 90.430 (87.802) Acc@5 98.145 (97.843) Mem 7381MB [2024-09-01 09:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.097) Loss 0.9067 (0.6338) Acc@1 78.125 (86.668) Acc@5 95.898 (97.749) Mem 7381MB [2024-09-01 09:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.090) Loss 1.1260 (0.7245) Acc@1 74.707 (84.473) Acc@5 92.871 (96.815) Mem 7381MB [2024-09-01 09:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 0.9985 (0.7716) Acc@1 77.441 (83.313) Acc@5 94.238 (96.325) Mem 7381MB [2024-09-01 09:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.938 Acc@5 96.262 [2024-09-01 09:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 09:20:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][0/1251] eta 0:23:36 lr 0.000038 wd 0.0500 time 1.1320 (1.1320) data time 0.6096 (0.6096) model time 0.0000 (0.0000) loss 2.6052 (2.6052) grad_norm 3.9843 (3.9843) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][10/1251] eta 0:06:41 lr 0.000038 wd 0.0500 time 0.2382 (0.3237) data time 0.0009 (0.0564) model time 0.0000 (0.0000) loss 2.4716 (2.8164) grad_norm 17.0484 (6.2871) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][20/1251] eta 0:05:50 lr 
0.000038 wd 0.0500 time 0.2360 (0.2846) data time 0.0009 (0.0300) model time 0.0000 (0.0000) loss 2.6957 (2.8142) grad_norm 3.7612 (6.0053) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][30/1251] eta 0:05:30 lr 0.000038 wd 0.0500 time 0.2389 (0.2706) data time 0.0009 (0.0207) model time 0.0000 (0.0000) loss 2.2361 (2.7147) grad_norm 7.5262 (6.4209) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][40/1251] eta 0:05:19 lr 0.000038 wd 0.0500 time 0.2457 (0.2635) data time 0.0007 (0.0159) model time 0.0000 (0.0000) loss 3.5555 (2.7267) grad_norm 4.2793 (6.8559) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][50/1251] eta 0:05:11 lr 0.000038 wd 0.0500 time 0.2442 (0.2594) data time 0.0011 (0.0130) model time 0.0000 (0.0000) loss 1.8326 (2.6613) grad_norm 6.5620 (6.5859) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][60/1251] eta 0:05:05 lr 0.000038 wd 0.0500 time 0.2446 (0.2563) data time 0.0007 (0.0110) model time 0.2439 (0.2397) loss 3.4634 (2.6484) grad_norm 6.1305 (6.4690) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][70/1251] eta 0:05:00 lr 0.000038 wd 0.0500 time 0.2462 (0.2544) data time 0.0013 (0.0096) model time 0.2449 (0.2408) loss 3.1550 (2.6825) grad_norm 4.5232 (6.3925) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][80/1251] eta 0:04:58 lr 0.000038 wd 0.0500 time 0.4098 (0.2549) data time 0.0010 (0.0085) model time 0.4088 (0.2464) loss 2.3314 (2.6914) grad_norm 8.8924 (6.5092) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:25 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][90/1251] eta 0:04:54 lr 0.000038 wd 0.0500 time 0.2419 (0.2534) data time 0.0009 (0.0077) model time 0.2410 (0.2448) loss 3.3295 (2.7090) grad_norm 4.8055 (6.3122) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][100/1251] eta 0:04:50 lr 0.000038 wd 0.0500 time 0.2375 (0.2521) data time 0.0008 (0.0071) model time 0.2367 (0.2437) loss 3.3617 (2.6930) grad_norm 5.0683 (6.1709) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][110/1251] eta 0:04:46 lr 0.000038 wd 0.0500 time 0.2418 (0.2511) data time 0.0007 (0.0065) model time 0.2410 (0.2431) loss 3.1071 (2.6909) grad_norm 5.4325 (6.1135) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][120/1251] eta 0:04:43 lr 0.000038 wd 0.0500 time 0.2352 (0.2503) data time 0.0012 (0.0061) model time 0.2340 (0.2426) loss 3.2020 (2.7058) grad_norm 5.2078 (6.0551) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][130/1251] eta 0:04:39 lr 0.000038 wd 0.0500 time 0.2457 (0.2497) data time 0.0010 (0.0057) model time 0.2447 (0.2426) loss 2.4343 (2.7070) grad_norm 5.4465 (6.0532) loss_scale 512.0000 (271.6336) mem 7381MB [2024-09-01 09:20:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][140/1251] eta 0:04:36 lr 0.000038 wd 0.0500 time 0.2401 (0.2491) data time 0.0009 (0.0053) model time 0.2392 (0.2423) loss 2.0408 (2.7037) grad_norm 5.6764 (6.1010) loss_scale 512.0000 (288.6809) mem 7381MB [2024-09-01 09:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][150/1251] eta 0:04:33 lr 0.000038 wd 0.0500 time 0.2409 (0.2486) data time 0.0010 (0.0050) model time 0.2400 (0.2422) loss 2.7756 (2.7052) 
grad_norm 4.9415 (6.0366) loss_scale 512.0000 (303.4702) mem 7381MB [2024-09-01 09:20:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][160/1251] eta 0:04:30 lr 0.000038 wd 0.0500 time 0.2335 (0.2481) data time 0.0010 (0.0048) model time 0.2325 (0.2419) loss 2.5714 (2.6890) grad_norm 5.8061 (6.0311) loss_scale 512.0000 (316.4224) mem 7381MB [2024-09-01 09:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][170/1251] eta 0:04:27 lr 0.000038 wd 0.0500 time 0.2499 (0.2478) data time 0.0008 (0.0046) model time 0.2491 (0.2419) loss 3.2866 (2.7058) grad_norm 5.1387 (5.9944) loss_scale 512.0000 (327.8596) mem 7381MB [2024-09-01 09:20:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][180/1251] eta 0:04:25 lr 0.000038 wd 0.0500 time 0.2479 (0.2476) data time 0.0007 (0.0044) model time 0.2472 (0.2419) loss 2.8273 (2.7079) grad_norm 7.1503 (5.9604) loss_scale 512.0000 (338.0331) mem 7381MB [2024-09-01 09:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][190/1251] eta 0:04:24 lr 0.000037 wd 0.0500 time 0.2407 (0.2493) data time 0.0010 (0.0042) model time 0.2397 (0.2446) loss 2.8318 (2.7191) grad_norm 6.4282 (5.9418) loss_scale 512.0000 (347.1414) mem 7381MB [2024-09-01 09:20:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][200/1251] eta 0:04:21 lr 0.000037 wd 0.0500 time 0.2344 (0.2488) data time 0.0009 (0.0040) model time 0.2334 (0.2442) loss 2.4637 (2.7329) grad_norm 7.0751 (5.8810) loss_scale 512.0000 (355.3433) mem 7381MB [2024-09-01 09:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][210/1251] eta 0:04:18 lr 0.000037 wd 0.0500 time 0.2351 (0.2484) data time 0.0010 (0.0039) model time 0.2341 (0.2439) loss 2.8494 (2.7419) grad_norm 4.3962 (6.2152) loss_scale 512.0000 (362.7678) mem 7381MB [2024-09-01 09:20:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][220/1251] eta 0:04:15 lr 0.000037 wd 0.0500 time 
0.2400 (0.2481) data time 0.0012 (0.0038) model time 0.2388 (0.2437) loss 2.2560 (2.7358) grad_norm 5.4004 (6.4319) loss_scale 512.0000 (369.5204) mem 7381MB [2024-09-01 09:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][230/1251] eta 0:04:12 lr 0.000037 wd 0.0500 time 0.2383 (0.2478) data time 0.0010 (0.0036) model time 0.2373 (0.2435) loss 3.2155 (2.7332) grad_norm 7.0309 (6.3785) loss_scale 512.0000 (375.6883) mem 7381MB [2024-09-01 09:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][240/1251] eta 0:04:10 lr 0.000037 wd 0.0500 time 0.2438 (0.2476) data time 0.0008 (0.0035) model time 0.2430 (0.2434) loss 2.6925 (2.7362) grad_norm 4.9102 (6.4905) loss_scale 512.0000 (381.3444) mem 7381MB [2024-09-01 09:21:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][250/1251] eta 0:04:07 lr 0.000037 wd 0.0500 time 0.2452 (0.2473) data time 0.0010 (0.0034) model time 0.2443 (0.2432) loss 2.1384 (2.7384) grad_norm 4.5450 (6.4912) loss_scale 512.0000 (386.5498) mem 7381MB [2024-09-01 09:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][260/1251] eta 0:04:04 lr 0.000037 wd 0.0500 time 0.2429 (0.2470) data time 0.0007 (0.0033) model time 0.2421 (0.2430) loss 2.5055 (2.7337) grad_norm 4.1039 (6.4846) loss_scale 512.0000 (391.3563) mem 7381MB [2024-09-01 09:21:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][270/1251] eta 0:04:02 lr 0.000037 wd 0.0500 time 0.2484 (0.2469) data time 0.0008 (0.0032) model time 0.2476 (0.2430) loss 2.2923 (2.7264) grad_norm 5.9105 (6.4275) loss_scale 512.0000 (395.8081) mem 7381MB [2024-09-01 09:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][280/1251] eta 0:03:59 lr 0.000037 wd 0.0500 time 0.2496 (0.2467) data time 0.0007 (0.0032) model time 0.2489 (0.2429) loss 3.0721 (2.7171) grad_norm 3.5064 (6.3849) loss_scale 512.0000 (399.9431) mem 7381MB [2024-09-01 09:21:14 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [270/300][290/1251] eta 0:03:56 lr 0.000037 wd 0.0500 time 0.2428 (0.2466) data time 0.0008 (0.0031) model time 0.2421 (0.2428) loss 3.2393 (2.7190) grad_norm 5.5221 (6.3510) loss_scale 512.0000 (403.7938) mem 7381MB [2024-09-01 09:21:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][300/1251] eta 0:03:54 lr 0.000037 wd 0.0500 time 0.2450 (0.2464) data time 0.0007 (0.0030) model time 0.2443 (0.2427) loss 3.0728 (2.7246) grad_norm 5.1194 (6.3137) loss_scale 512.0000 (407.3887) mem 7381MB [2024-09-01 09:21:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][310/1251] eta 0:03:51 lr 0.000037 wd 0.0500 time 0.2447 (0.2462) data time 0.0009 (0.0030) model time 0.2437 (0.2426) loss 3.1619 (2.7184) grad_norm 4.3442 (6.2630) loss_scale 512.0000 (410.7524) mem 7381MB [2024-09-01 09:21:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][320/1251] eta 0:03:49 lr 0.000037 wd 0.0500 time 0.2344 (0.2461) data time 0.0009 (0.0029) model time 0.2335 (0.2425) loss 2.5725 (2.7208) grad_norm 4.3453 (inf) loss_scale 256.0000 (407.5265) mem 7381MB [2024-09-01 09:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][330/1251] eta 0:03:46 lr 0.000037 wd 0.0500 time 0.2462 (0.2460) data time 0.0007 (0.0028) model time 0.2455 (0.2425) loss 3.0279 (2.7253) grad_norm 5.9669 (inf) loss_scale 256.0000 (402.9486) mem 7381MB [2024-09-01 09:21:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][340/1251] eta 0:03:43 lr 0.000037 wd 0.0500 time 0.2422 (0.2458) data time 0.0010 (0.0028) model time 0.2412 (0.2425) loss 3.1266 (2.7250) grad_norm 4.5398 (inf) loss_scale 256.0000 (398.6393) mem 7381MB [2024-09-01 09:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][350/1251] eta 0:03:41 lr 0.000037 wd 0.0500 time 0.2314 (0.2457) data time 0.0007 (0.0027) model time 0.2307 (0.2424) loss 2.9769 (2.7269) grad_norm 3.9361 (inf) 
loss_scale 256.0000 (394.5755) mem 7381MB [2024-09-01 09:21:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][360/1251] eta 0:03:38 lr 0.000037 wd 0.0500 time 0.2352 (0.2455) data time 0.0010 (0.0027) model time 0.2342 (0.2422) loss 3.3017 (2.7364) grad_norm 7.5046 (inf) loss_scale 256.0000 (390.7368) mem 7381MB [2024-09-01 09:21:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][370/1251] eta 0:03:36 lr 0.000037 wd 0.0500 time 0.2428 (0.2454) data time 0.0009 (0.0026) model time 0.2419 (0.2421) loss 3.1222 (2.7309) grad_norm 6.7898 (inf) loss_scale 256.0000 (387.1051) mem 7381MB [2024-09-01 09:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][380/1251] eta 0:03:34 lr 0.000037 wd 0.0500 time 0.2324 (0.2462) data time 0.0009 (0.0026) model time 0.2315 (0.2431) loss 2.9430 (2.7303) grad_norm 5.9949 (inf) loss_scale 256.0000 (383.6640) mem 7381MB [2024-09-01 09:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][390/1251] eta 0:03:32 lr 0.000037 wd 0.0500 time 0.2383 (0.2464) data time 0.0008 (0.0025) model time 0.2375 (0.2435) loss 3.4904 (2.7283) grad_norm 3.4618 (inf) loss_scale 256.0000 (380.3990) mem 7381MB [2024-09-01 09:21:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][400/1251] eta 0:03:29 lr 0.000037 wd 0.0500 time 0.2441 (0.2463) data time 0.0009 (0.0025) model time 0.2432 (0.2434) loss 2.7317 (2.7344) grad_norm 4.4206 (inf) loss_scale 256.0000 (377.2968) mem 7381MB [2024-09-01 09:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][410/1251] eta 0:03:27 lr 0.000037 wd 0.0500 time 0.2448 (0.2462) data time 0.0012 (0.0025) model time 0.2436 (0.2433) loss 3.1084 (2.7366) grad_norm 5.5010 (inf) loss_scale 256.0000 (374.3455) mem 7381MB [2024-09-01 09:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][420/1251] eta 0:03:24 lr 0.000037 wd 0.0500 time 0.2409 (0.2461) data time 0.0010 (0.0024) model 
time 0.2399 (0.2433) loss 2.5438 (2.7354) grad_norm 5.0205 (inf) loss_scale 256.0000 (371.5344) mem 7381MB [2024-09-01 09:21:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][430/1251] eta 0:03:21 lr 0.000037 wd 0.0500 time 0.2456 (0.2460) data time 0.0007 (0.0024) model time 0.2449 (0.2432) loss 2.8627 (2.7314) grad_norm 4.6783 (inf) loss_scale 256.0000 (368.8538) mem 7381MB [2024-09-01 09:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][440/1251] eta 0:03:19 lr 0.000037 wd 0.0500 time 0.2425 (0.2459) data time 0.0008 (0.0024) model time 0.2417 (0.2431) loss 1.9346 (2.7345) grad_norm 24.0911 (inf) loss_scale 256.0000 (366.2948) mem 7381MB [2024-09-01 09:21:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][450/1251] eta 0:03:16 lr 0.000037 wd 0.0500 time 0.2423 (0.2458) data time 0.0007 (0.0023) model time 0.2417 (0.2431) loss 3.1610 (2.7387) grad_norm 4.9123 (inf) loss_scale 256.0000 (363.8492) mem 7381MB [2024-09-01 09:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][460/1251] eta 0:03:14 lr 0.000037 wd 0.0500 time 0.2528 (0.2457) data time 0.0009 (0.0023) model time 0.2519 (0.2431) loss 3.0884 (2.7376) grad_norm 3.5108 (inf) loss_scale 256.0000 (361.5098) mem 7381MB [2024-09-01 09:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][470/1251] eta 0:03:11 lr 0.000037 wd 0.0500 time 0.2340 (0.2456) data time 0.0010 (0.0023) model time 0.2330 (0.2430) loss 2.6419 (2.7372) grad_norm 5.7268 (inf) loss_scale 256.0000 (359.2696) mem 7381MB [2024-09-01 09:22:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][480/1251] eta 0:03:09 lr 0.000037 wd 0.0500 time 0.2385 (0.2455) data time 0.0012 (0.0023) model time 0.2373 (0.2429) loss 2.7298 (2.7360) grad_norm 4.5695 (inf) loss_scale 256.0000 (357.1227) mem 7381MB [2024-09-01 09:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][490/1251] eta 0:03:06 lr 
0.000037 wd 0.0500 time 0.2366 (0.2454) data time 0.0007 (0.0022) model time 0.2359 (0.2428) loss 3.4215 (2.7343) grad_norm 3.6757 (inf) loss_scale 256.0000 (355.0631) mem 7381MB [2024-09-01 09:22:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][500/1251] eta 0:03:04 lr 0.000037 wd 0.0500 time 0.2412 (0.2453) data time 0.0010 (0.0022) model time 0.2402 (0.2427) loss 2.9454 (2.7330) grad_norm 5.1667 (inf) loss_scale 256.0000 (353.0858) mem 7381MB [2024-09-01 09:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][510/1251] eta 0:03:01 lr 0.000037 wd 0.0500 time 0.2369 (0.2452) data time 0.0007 (0.0022) model time 0.2362 (0.2426) loss 2.9659 (2.7296) grad_norm 3.8335 (inf) loss_scale 256.0000 (351.1859) mem 7381MB [2024-09-01 09:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][520/1251] eta 0:02:59 lr 0.000037 wd 0.0500 time 0.2418 (0.2451) data time 0.0010 (0.0022) model time 0.2408 (0.2426) loss 2.9936 (2.7351) grad_norm 4.1260 (inf) loss_scale 256.0000 (349.3589) mem 7381MB [2024-09-01 09:22:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][530/1251] eta 0:02:56 lr 0.000037 wd 0.0500 time 0.2452 (0.2450) data time 0.0007 (0.0021) model time 0.2446 (0.2425) loss 3.1507 (2.7330) grad_norm 6.4692 (inf) loss_scale 256.0000 (347.6008) mem 7381MB [2024-09-01 09:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][540/1251] eta 0:02:54 lr 0.000037 wd 0.0500 time 0.2408 (0.2450) data time 0.0011 (0.0021) model time 0.2397 (0.2425) loss 3.0704 (2.7321) grad_norm 4.2862 (inf) loss_scale 256.0000 (345.9076) mem 7381MB [2024-09-01 09:22:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][550/1251] eta 0:02:51 lr 0.000037 wd 0.0500 time 0.2428 (0.2449) data time 0.0008 (0.0021) model time 0.2420 (0.2425) loss 1.9070 (2.7298) grad_norm 7.6698 (inf) loss_scale 256.0000 (344.2759) mem 7381MB [2024-09-01 09:22:20 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [270/300][560/1251] eta 0:02:49 lr 0.000037 wd 0.0500 time 0.2397 (0.2449) data time 0.0007 (0.0021) model time 0.2390 (0.2424) loss 1.5349 (2.7291) grad_norm 7.5292 (inf) loss_scale 256.0000 (342.7023) mem 7381MB [2024-09-01 09:22:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][570/1251] eta 0:02:46 lr 0.000037 wd 0.0500 time 0.2396 (0.2448) data time 0.0011 (0.0021) model time 0.2385 (0.2424) loss 2.9473 (2.7288) grad_norm 4.5304 (inf) loss_scale 256.0000 (341.1839) mem 7381MB [2024-09-01 09:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][580/1251] eta 0:02:44 lr 0.000037 wd 0.0500 time 0.2466 (0.2448) data time 0.0011 (0.0020) model time 0.2456 (0.2423) loss 2.9024 (2.7324) grad_norm 4.8037 (inf) loss_scale 256.0000 (339.7177) mem 7381MB [2024-09-01 09:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][590/1251] eta 0:02:41 lr 0.000037 wd 0.0500 time 0.2457 (0.2447) data time 0.0010 (0.0020) model time 0.2447 (0.2423) loss 2.5811 (2.7326) grad_norm 4.7800 (inf) loss_scale 256.0000 (338.3012) mem 7381MB [2024-09-01 09:22:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][600/1251] eta 0:02:39 lr 0.000037 wd 0.0500 time 0.2391 (0.2447) data time 0.0010 (0.0020) model time 0.2381 (0.2423) loss 2.8459 (2.7304) grad_norm 4.9486 (inf) loss_scale 256.0000 (336.9318) mem 7381MB [2024-09-01 09:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][610/1251] eta 0:02:37 lr 0.000037 wd 0.0500 time 0.2474 (0.2449) data time 0.0010 (0.0020) model time 0.2463 (0.2426) loss 1.8972 (2.7309) grad_norm 5.6231 (inf) loss_scale 256.0000 (335.6072) mem 7381MB [2024-09-01 09:22:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][620/1251] eta 0:02:34 lr 0.000037 wd 0.0500 time 0.2431 (0.2449) data time 0.0011 (0.0020) model time 0.2420 (0.2426) loss 1.7259 (2.7245) grad_norm 4.8461 (inf) loss_scale 
256.0000 (334.3253) mem 7381MB [2024-09-01 09:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][630/1251] eta 0:02:32 lr 0.000037 wd 0.0500 time 0.2403 (0.2449) data time 0.0010 (0.0020) model time 0.2393 (0.2426) loss 3.1123 (2.7237) grad_norm 4.9744 (inf) loss_scale 256.0000 (333.0840) mem 7381MB [2024-09-01 09:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][640/1251] eta 0:02:29 lr 0.000037 wd 0.0500 time 0.2434 (0.2448) data time 0.0009 (0.0020) model time 0.2425 (0.2426) loss 3.2431 (2.7269) grad_norm 8.3043 (inf) loss_scale 256.0000 (331.8814) mem 7381MB [2024-09-01 09:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][650/1251] eta 0:02:27 lr 0.000037 wd 0.0500 time 0.2408 (0.2448) data time 0.0008 (0.0019) model time 0.2400 (0.2426) loss 2.1945 (2.7244) grad_norm 4.4463 (inf) loss_scale 256.0000 (330.7158) mem 7381MB [2024-09-01 09:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][660/1251] eta 0:02:24 lr 0.000037 wd 0.0500 time 0.2398 (0.2448) data time 0.0010 (0.0019) model time 0.2388 (0.2426) loss 2.6931 (2.7227) grad_norm 3.8676 (inf) loss_scale 256.0000 (329.5855) mem 7381MB [2024-09-01 09:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][670/1251] eta 0:02:22 lr 0.000037 wd 0.0500 time 0.2450 (0.2447) data time 0.0007 (0.0019) model time 0.2443 (0.2425) loss 2.4129 (2.7211) grad_norm 4.4846 (inf) loss_scale 256.0000 (328.4888) mem 7381MB [2024-09-01 09:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][680/1251] eta 0:02:19 lr 0.000037 wd 0.0500 time 0.2384 (0.2447) data time 0.0009 (0.0019) model time 0.2375 (0.2425) loss 3.1441 (2.7190) grad_norm 4.1906 (inf) loss_scale 256.0000 (327.4244) mem 7381MB [2024-09-01 09:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][690/1251] eta 0:02:17 lr 0.000037 wd 0.0500 time 0.2407 (0.2446) data time 0.0011 (0.0019) model time 
0.2396 (0.2425) loss 2.9787 (2.7173) grad_norm 8.5930 (inf) loss_scale 256.0000 (326.3907) mem 7381MB [2024-09-01 09:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][700/1251] eta 0:02:14 lr 0.000037 wd 0.0500 time 0.2394 (0.2446) data time 0.0008 (0.0019) model time 0.2387 (0.2424) loss 2.7685 (2.7178) grad_norm 5.1112 (inf) loss_scale 256.0000 (325.3866) mem 7381MB [2024-09-01 09:22:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][710/1251] eta 0:02:12 lr 0.000037 wd 0.0500 time 0.2348 (0.2445) data time 0.0007 (0.0019) model time 0.2340 (0.2424) loss 2.4316 (2.7192) grad_norm 5.4086 (inf) loss_scale 256.0000 (324.4107) mem 7381MB [2024-09-01 09:22:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][720/1251] eta 0:02:09 lr 0.000037 wd 0.0500 time 0.2460 (0.2445) data time 0.0007 (0.0018) model time 0.2453 (0.2424) loss 3.2764 (2.7225) grad_norm 3.9006 (inf) loss_scale 256.0000 (323.4619) mem 7381MB [2024-09-01 09:23:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][730/1251] eta 0:02:07 lr 0.000037 wd 0.0500 time 0.2452 (0.2445) data time 0.0008 (0.0018) model time 0.2444 (0.2423) loss 2.0096 (2.7193) grad_norm 3.9544 (inf) loss_scale 256.0000 (322.5390) mem 7381MB [2024-09-01 09:23:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][740/1251] eta 0:02:04 lr 0.000037 wd 0.0500 time 0.2386 (0.2444) data time 0.0010 (0.0018) model time 0.2376 (0.2423) loss 2.0651 (2.7209) grad_norm 6.2300 (inf) loss_scale 256.0000 (321.6410) mem 7381MB [2024-09-01 09:23:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][750/1251] eta 0:02:02 lr 0.000037 wd 0.0500 time 0.2421 (0.2444) data time 0.0010 (0.0018) model time 0.2410 (0.2423) loss 3.1771 (2.7195) grad_norm 5.0060 (inf) loss_scale 256.0000 (320.7670) mem 7381MB [2024-09-01 09:23:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][760/1251] eta 0:01:59 lr 0.000037 wd 
0.0500 time 0.2450 (0.2443) data time 0.0009 (0.0018) model time 0.2440 (0.2423) loss 2.8546 (2.7204) grad_norm 6.3452 (inf) loss_scale 256.0000 (319.9159) mem 7381MB [2024-09-01 09:23:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][770/1251] eta 0:01:57 lr 0.000037 wd 0.0500 time 0.2385 (0.2443) data time 0.0008 (0.0018) model time 0.2377 (0.2422) loss 2.9130 (2.7234) grad_norm 4.6719 (inf) loss_scale 256.0000 (319.0869) mem 7381MB [2024-09-01 09:23:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][780/1251] eta 0:01:55 lr 0.000037 wd 0.0500 time 0.2417 (0.2443) data time 0.0009 (0.0018) model time 0.2408 (0.2422) loss 2.0706 (2.7226) grad_norm 6.6687 (inf) loss_scale 256.0000 (318.2791) mem 7381MB [2024-09-01 09:23:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][790/1251] eta 0:01:52 lr 0.000037 wd 0.0500 time 0.2453 (0.2442) data time 0.0006 (0.0018) model time 0.2446 (0.2422) loss 2.8074 (2.7202) grad_norm 4.6224 (inf) loss_scale 256.0000 (317.4918) mem 7381MB [2024-09-01 09:23:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][800/1251] eta 0:01:50 lr 0.000037 wd 0.0500 time 0.2449 (0.2442) data time 0.0009 (0.0018) model time 0.2440 (0.2421) loss 2.0656 (2.7174) grad_norm 6.5126 (inf) loss_scale 256.0000 (316.7241) mem 7381MB [2024-09-01 09:23:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][810/1251] eta 0:01:47 lr 0.000037 wd 0.0500 time 0.2447 (0.2441) data time 0.0010 (0.0018) model time 0.2437 (0.2421) loss 2.2764 (2.7167) grad_norm 4.1167 (inf) loss_scale 256.0000 (315.9753) mem 7381MB [2024-09-01 09:23:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][820/1251] eta 0:01:45 lr 0.000037 wd 0.0500 time 0.2480 (0.2441) data time 0.0009 (0.0017) model time 0.2471 (0.2421) loss 2.5744 (2.7164) grad_norm 6.2080 (inf) loss_scale 256.0000 (315.2448) mem 7381MB [2024-09-01 09:23:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [270/300][830/1251] eta 0:01:42 lr 0.000037 wd 0.0500 time 0.2444 (0.2441) data time 0.0009 (0.0017) model time 0.2436 (0.2421) loss 2.9115 (2.7153) grad_norm 5.0072 (inf) loss_scale 256.0000 (314.5319) mem 7381MB [2024-09-01 09:23:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][840/1251] eta 0:01:40 lr 0.000037 wd 0.0500 time 0.2364 (0.2440) data time 0.0007 (0.0017) model time 0.2357 (0.2420) loss 3.3046 (2.7185) grad_norm 5.5092 (inf) loss_scale 256.0000 (313.8359) mem 7381MB [2024-09-01 09:23:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][850/1251] eta 0:01:37 lr 0.000037 wd 0.0500 time 0.2381 (0.2440) data time 0.0010 (0.0017) model time 0.2371 (0.2420) loss 3.0536 (2.7207) grad_norm 5.3958 (inf) loss_scale 256.0000 (313.1563) mem 7381MB [2024-09-01 09:23:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][860/1251] eta 0:01:35 lr 0.000037 wd 0.0500 time 0.2461 (0.2439) data time 0.0012 (0.0017) model time 0.2449 (0.2420) loss 2.9254 (2.7216) grad_norm 4.7298 (inf) loss_scale 256.0000 (312.4925) mem 7381MB [2024-09-01 09:23:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][870/1251] eta 0:01:32 lr 0.000037 wd 0.0500 time 0.2460 (0.2439) data time 0.0009 (0.0017) model time 0.2451 (0.2419) loss 2.5905 (2.7241) grad_norm 19.8727 (inf) loss_scale 256.0000 (311.8439) mem 7381MB [2024-09-01 09:23:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][880/1251] eta 0:01:30 lr 0.000037 wd 0.0500 time 0.2412 (0.2439) data time 0.0007 (0.0017) model time 0.2405 (0.2419) loss 3.0734 (2.7263) grad_norm 4.4626 (inf) loss_scale 256.0000 (311.2100) mem 7381MB [2024-09-01 09:23:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][890/1251] eta 0:01:28 lr 0.000036 wd 0.0500 time 0.2459 (0.2438) data time 0.0009 (0.0017) model time 0.2450 (0.2419) loss 2.8714 (2.7244) grad_norm 4.0805 (inf) loss_scale 256.0000 (310.5903) mem 
7381MB [2024-09-01 09:23:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][900/1251] eta 0:01:25 lr 0.000036 wd 0.0500 time 0.2376 (0.2440) data time 0.0010 (0.0017) model time 0.2366 (0.2421) loss 2.8105 (2.7255) grad_norm 5.1077 (inf) loss_scale 256.0000 (309.9845) mem 7381MB [2024-09-01 09:23:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][910/1251] eta 0:01:23 lr 0.000036 wd 0.0500 time 0.2360 (0.2443) data time 0.0010 (0.0017) model time 0.2350 (0.2424) loss 1.9527 (2.7218) grad_norm 5.9489 (inf) loss_scale 256.0000 (309.3919) mem 7381MB [2024-09-01 09:23:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][920/1251] eta 0:01:20 lr 0.000036 wd 0.0500 time 0.2394 (0.2445) data time 0.0011 (0.0017) model time 0.2383 (0.2426) loss 2.1835 (2.7221) grad_norm 12.9639 (inf) loss_scale 256.0000 (308.8122) mem 7381MB [2024-09-01 09:23:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][930/1251] eta 0:01:18 lr 0.000036 wd 0.0500 time 0.2425 (0.2444) data time 0.0009 (0.0017) model time 0.2416 (0.2426) loss 2.2658 (2.7212) grad_norm 3.1838 (inf) loss_scale 256.0000 (308.2449) mem 7381MB [2024-09-01 09:23:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][940/1251] eta 0:01:16 lr 0.000036 wd 0.0500 time 0.2371 (0.2444) data time 0.0010 (0.0016) model time 0.2361 (0.2426) loss 2.9695 (2.7232) grad_norm 4.2543 (inf) loss_scale 256.0000 (307.6897) mem 7381MB [2024-09-01 09:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][950/1251] eta 0:01:13 lr 0.000036 wd 0.0500 time 0.2415 (0.2444) data time 0.0010 (0.0016) model time 0.2405 (0.2425) loss 2.1124 (2.7244) grad_norm 4.1360 (inf) loss_scale 256.0000 (307.1462) mem 7381MB [2024-09-01 09:23:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][960/1251] eta 0:01:11 lr 0.000036 wd 0.0500 time 0.2380 (0.2443) data time 0.0007 (0.0016) model time 0.2373 (0.2425) loss 2.4759 
(2.7235) grad_norm 5.2333 (inf) loss_scale 256.0000 (306.6139) mem 7381MB [2024-09-01 09:24:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][970/1251] eta 0:01:08 lr 0.000036 wd 0.0500 time 0.2509 (0.2443) data time 0.0009 (0.0016) model time 0.2500 (0.2425) loss 2.9306 (2.7266) grad_norm 4.6263 (inf) loss_scale 256.0000 (306.0927) mem 7381MB [2024-09-01 09:24:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][980/1251] eta 0:01:06 lr 0.000036 wd 0.0500 time 0.2394 (0.2443) data time 0.0007 (0.0016) model time 0.2387 (0.2425) loss 2.4242 (2.7253) grad_norm 5.0320 (inf) loss_scale 256.0000 (305.5821) mem 7381MB [2024-09-01 09:24:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][990/1251] eta 0:01:03 lr 0.000036 wd 0.0500 time 0.2417 (0.2443) data time 0.0008 (0.0016) model time 0.2409 (0.2425) loss 3.0065 (2.7250) grad_norm 2.9705 (inf) loss_scale 256.0000 (305.0817) mem 7381MB [2024-09-01 09:24:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1000/1251] eta 0:01:01 lr 0.000036 wd 0.0500 time 0.2425 (0.2442) data time 0.0007 (0.0016) model time 0.2418 (0.2424) loss 2.3325 (2.7230) grad_norm 3.0486 (inf) loss_scale 256.0000 (304.5914) mem 7381MB [2024-09-01 09:24:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1010/1251] eta 0:00:58 lr 0.000036 wd 0.0500 time 0.2364 (0.2442) data time 0.0009 (0.0016) model time 0.2356 (0.2424) loss 3.2353 (2.7254) grad_norm 6.2203 (inf) loss_scale 256.0000 (304.1108) mem 7381MB [2024-09-01 09:24:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1020/1251] eta 0:00:56 lr 0.000036 wd 0.0500 time 0.2429 (0.2442) data time 0.0009 (0.0016) model time 0.2419 (0.2424) loss 2.8640 (2.7259) grad_norm 6.2542 (inf) loss_scale 256.0000 (303.6396) mem 7381MB [2024-09-01 09:24:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1030/1251] eta 0:00:53 lr 0.000036 wd 0.0500 time 0.2477 
(0.2441) data time 0.0007 (0.0016) model time 0.2470 (0.2424) loss 1.7678 (2.7248) grad_norm 4.3542 (inf) loss_scale 256.0000 (303.1775) mem 7381MB [2024-09-01 09:24:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1040/1251] eta 0:00:51 lr 0.000036 wd 0.0500 time 0.2375 (0.2441) data time 0.0009 (0.0016) model time 0.2365 (0.2423) loss 2.5403 (2.7238) grad_norm 4.5111 (inf) loss_scale 256.0000 (302.7243) mem 7381MB [2024-09-01 09:24:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1050/1251] eta 0:00:49 lr 0.000036 wd 0.0500 time 0.2457 (0.2441) data time 0.0009 (0.0016) model time 0.2448 (0.2423) loss 2.6356 (2.7243) grad_norm 3.2741 (inf) loss_scale 256.0000 (302.2797) mem 7381MB [2024-09-01 09:24:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1060/1251] eta 0:00:46 lr 0.000036 wd 0.0500 time 0.2435 (0.2441) data time 0.0010 (0.0016) model time 0.2426 (0.2423) loss 2.2697 (2.7251) grad_norm 5.1354 (inf) loss_scale 256.0000 (301.8435) mem 7381MB [2024-09-01 09:24:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1070/1251] eta 0:00:44 lr 0.000036 wd 0.0500 time 0.2429 (0.2440) data time 0.0009 (0.0016) model time 0.2420 (0.2423) loss 2.8133 (2.7262) grad_norm 5.2760 (inf) loss_scale 256.0000 (301.4155) mem 7381MB [2024-09-01 09:24:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1080/1251] eta 0:00:41 lr 0.000036 wd 0.0500 time 0.2483 (0.2440) data time 0.0010 (0.0016) model time 0.2473 (0.2423) loss 1.8623 (2.7232) grad_norm 5.5895 (inf) loss_scale 256.0000 (300.9954) mem 7381MB [2024-09-01 09:24:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1090/1251] eta 0:00:39 lr 0.000036 wd 0.0500 time 0.2443 (0.2440) data time 0.0008 (0.0016) model time 0.2436 (0.2423) loss 2.3713 (2.7224) grad_norm 5.6076 (inf) loss_scale 256.0000 (300.5830) mem 7381MB [2024-09-01 09:24:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [270/300][1100/1251] eta 0:00:36 lr 0.000036 wd 0.0500 time 0.2464 (0.2440) data time 0.0007 (0.0015) model time 0.2457 (0.2423) loss 2.0211 (2.7219) grad_norm 5.1127 (inf) loss_scale 256.0000 (300.1780) mem 7381MB [2024-09-01 09:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1110/1251] eta 0:00:34 lr 0.000036 wd 0.0500 time 0.2486 (0.2440) data time 0.0010 (0.0015) model time 0.2476 (0.2422) loss 2.6672 (2.7212) grad_norm 3.2081 (inf) loss_scale 256.0000 (299.7804) mem 7381MB [2024-09-01 09:24:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1120/1251] eta 0:00:32 lr 0.000036 wd 0.0500 time 0.2400 (0.2443) data time 0.0013 (0.0015) model time 0.2387 (0.2426) loss 2.9584 (2.7213) grad_norm 5.0853 (inf) loss_scale 256.0000 (299.3898) mem 7381MB [2024-09-01 09:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1130/1251] eta 0:00:29 lr 0.000036 wd 0.0500 time 0.2456 (0.2443) data time 0.0009 (0.0015) model time 0.2446 (0.2426) loss 2.7035 (2.7222) grad_norm 4.6814 (inf) loss_scale 256.0000 (299.0062) mem 7381MB [2024-09-01 09:24:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1140/1251] eta 0:00:27 lr 0.000036 wd 0.0500 time 0.2423 (0.2442) data time 0.0010 (0.0015) model time 0.2413 (0.2425) loss 2.1035 (2.7227) grad_norm 3.4802 (inf) loss_scale 256.0000 (298.6293) mem 7381MB [2024-09-01 09:24:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1150/1251] eta 0:00:24 lr 0.000036 wd 0.0500 time 0.2452 (0.2442) data time 0.0011 (0.0015) model time 0.2441 (0.2425) loss 2.6738 (2.7239) grad_norm 11.1505 (inf) loss_scale 256.0000 (298.2589) mem 7381MB [2024-09-01 09:24:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1160/1251] eta 0:00:22 lr 0.000036 wd 0.0500 time 0.2582 (0.2442) data time 0.0009 (0.0015) model time 0.2573 (0.2425) loss 3.0324 (2.7235) grad_norm 3.8612 (inf) loss_scale 256.0000 (297.8949) mem 7381MB 
[2024-09-01 09:24:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1170/1251] eta 0:00:19 lr 0.000036 wd 0.0500 time 0.2442 (0.2442) data time 0.0008 (0.0015) model time 0.2434 (0.2425) loss 3.0740 (2.7250) grad_norm 6.0213 (inf) loss_scale 256.0000 (297.5371) mem 7381MB [2024-09-01 09:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1180/1251] eta 0:00:17 lr 0.000036 wd 0.0500 time 0.2475 (0.2442) data time 0.0009 (0.0015) model time 0.2465 (0.2425) loss 3.0940 (2.7255) grad_norm 6.7822 (inf) loss_scale 256.0000 (297.1854) mem 7381MB [2024-09-01 09:24:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1190/1251] eta 0:00:14 lr 0.000036 wd 0.0500 time 0.2370 (0.2441) data time 0.0010 (0.0015) model time 0.2359 (0.2425) loss 2.8383 (2.7253) grad_norm 4.5318 (inf) loss_scale 256.0000 (296.8396) mem 7381MB [2024-09-01 09:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1200/1251] eta 0:00:12 lr 0.000036 wd 0.0500 time 0.2433 (0.2441) data time 0.0010 (0.0015) model time 0.2423 (0.2424) loss 2.8284 (2.7239) grad_norm 3.6150 (inf) loss_scale 256.0000 (296.4996) mem 7381MB [2024-09-01 09:24:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1210/1251] eta 0:00:10 lr 0.000036 wd 0.0500 time 0.2461 (0.2441) data time 0.0009 (0.0015) model time 0.2452 (0.2425) loss 2.4429 (2.7218) grad_norm 3.7182 (inf) loss_scale 256.0000 (296.1652) mem 7381MB [2024-09-01 09:25:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1220/1251] eta 0:00:07 lr 0.000036 wd 0.0500 time 0.2412 (0.2441) data time 0.0013 (0.0015) model time 0.2399 (0.2424) loss 2.8840 (2.7225) grad_norm 4.4088 (inf) loss_scale 256.0000 (295.8362) mem 7381MB [2024-09-01 09:25:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1230/1251] eta 0:00:05 lr 0.000036 wd 0.0500 time 0.2487 (0.2441) data time 0.0007 (0.0015) model time 0.2480 (0.2424) loss 2.7692 
(2.7235) grad_norm 3.4941 (inf) loss_scale 256.0000 (295.5126) mem 7381MB [2024-09-01 09:25:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1240/1251] eta 0:00:02 lr 0.000036 wd 0.0500 time 0.2279 (0.2440) data time 0.0007 (0.0015) model time 0.2272 (0.2423) loss 1.5698 (2.7233) grad_norm 3.5501 (inf) loss_scale 256.0000 (295.1942) mem 7381MB [2024-09-01 09:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [270/300][1250/1251] eta 0:00:00 lr 0.000036 wd 0.0500 time 0.2262 (0.2438) data time 0.0007 (0.0015) model time 0.2255 (0.2422) loss 2.9808 (2.7230) grad_norm 4.5327 (inf) loss_scale 256.0000 (294.8809) mem 7381MB [2024-09-01 09:25:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 270 training takes 0:05:05 [2024-09-01 09:25:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:25:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 09:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.451 (0.451) Loss 0.3953 (0.3953) Acc@1 93.164 (93.164) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 09:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.114) Loss 0.5742 (0.6136) Acc@1 90.332 (87.695) Acc@5 98.145 (97.718) Mem 7381MB [2024-09-01 09:25:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.097) Loss 0.8901 (0.6442) Acc@1 78.516 (86.551) Acc@5 96.191 (97.684) Mem 7381MB [2024-09-01 09:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.092) Loss 1.1543 (0.7390) Acc@1 73.926 (84.334) Acc@5 92.383 (96.702) Mem 7381MB [2024-09-01 09:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.0195 (0.7873) Acc@1 77.051 (83.139) Acc@5 94.336 (96.220) Mem 7381MB [2024-09-01 09:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.706 Acc@5 96.156 [2024-09-01 09:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 09:25:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.71% [2024-09-01 09:25:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 09:25:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 09:25:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.441 (0.441) Loss 0.3853 (0.3853) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 09:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.108) Loss 0.5659 (0.6036) Acc@1 90.527 (87.811) Acc@5 98.242 (97.843) Mem 7381MB [2024-09-01 09:25:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.094) Loss 0.9067 (0.6341) Acc@1 78.223 (86.672) Acc@5 95.898 (97.759) Mem 7381MB [2024-09-01 09:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.089) Loss 1.1250 (0.7250) Acc@1 74.707 (84.498) Acc@5 92.773 (96.825) Mem 7381MB [2024-09-01 09:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 0.9985 (0.7721) Acc@1 77.441 (83.315) Acc@5 94.141 (96.322) Mem 7381MB [2024-09-01 09:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.922 Acc@5 96.264 [2024-09-01 09:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 09:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][0/1251] eta 0:22:08 lr 0.000036 wd 0.0500 time 1.0623 (1.0623) data time 0.5025 (0.5025) model time 0.0000 (0.0000) loss 2.8804 (2.8804) grad_norm 6.3327 (6.3327) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][10/1251] eta 0:06:36 lr 0.000036 wd 0.0500 time 0.2408 (0.3193) data time 0.0008 (0.0466) model time 0.0000 (0.0000) loss 3.1525 (2.8972) grad_norm 4.6612 (5.0939) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][20/1251] eta 0:05:48 lr 0.000036 wd 0.0500 time 0.2375 (0.2827) data time 0.0010 (0.0249) model time 0.0000 (0.0000) loss 2.7098 (2.7762) grad_norm 3.7926 (5.0131) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][30/1251] eta 0:05:29 lr 0.000036 wd 0.0500 time 0.2430 (0.2696) data time 0.0008 (0.0172) model time 0.0000 (0.0000) loss 2.6852 (2.7834) grad_norm 4.0879 (5.4209) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][40/1251] eta 0:05:18 lr 0.000036 wd 0.0500 time 0.2349 (0.2629) data time 0.0010 (0.0132) model time 0.0000 (0.0000) loss 2.6944 (2.7111) grad_norm 7.9675 (5.3783) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][50/1251] eta 0:05:11 lr 0.000036 wd 0.0500 time 0.2345 (0.2590) data time 0.0008 (0.0108) model time 0.0000 (0.0000) loss 2.8875 (2.7160) grad_norm 5.9054 (6.2977) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][60/1251] eta 0:05:05 lr 0.000036 wd 0.0500 time 0.2390 (0.2563) data time 0.0010 (0.0092) model time 0.2381 (0.2420) loss 2.8451 (2.7221) grad_norm 5.7288 (6.2336) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][70/1251] eta 0:05:00 lr 0.000036 wd 0.0500 time 0.2432 (0.2544) data time 0.0009 (0.0080) model time 0.2423 (0.2416) loss 1.8695 (2.7493) grad_norm 5.1382 (6.2761) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][80/1251] eta 0:04:56 lr 0.000036 wd 0.0500 time 0.2389 (0.2528) data time 0.0010 (0.0072) model time 0.2380 (0.2414) loss 3.0185 (2.7652) grad_norm 5.0505 (6.1861) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][90/1251] eta 0:04:52 lr 0.000036 wd 0.0500 time 0.2533 (0.2521) data time 0.0008 
(0.0065) model time 0.2525 (0.2422) loss 2.7586 (2.7564) grad_norm 5.8615 (6.2242) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][100/1251] eta 0:04:48 lr 0.000036 wd 0.0500 time 0.2409 (0.2508) data time 0.0007 (0.0060) model time 0.2402 (0.2415) loss 3.1483 (2.7803) grad_norm 6.8544 (6.1121) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][110/1251] eta 0:04:45 lr 0.000036 wd 0.0500 time 0.2464 (0.2499) data time 0.0007 (0.0055) model time 0.2457 (0.2412) loss 2.0308 (2.7900) grad_norm 4.6235 (6.0239) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][120/1251] eta 0:04:41 lr 0.000036 wd 0.0500 time 0.2387 (0.2490) data time 0.0009 (0.0051) model time 0.2378 (0.2408) loss 2.8570 (2.7801) grad_norm 5.5348 (5.9365) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][130/1251] eta 0:04:38 lr 0.000036 wd 0.0500 time 0.2432 (0.2485) data time 0.0010 (0.0048) model time 0.2422 (0.2408) loss 2.8738 (2.7643) grad_norm 6.0396 (5.8359) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][140/1251] eta 0:04:35 lr 0.000036 wd 0.0500 time 0.2402 (0.2481) data time 0.0010 (0.0046) model time 0.2392 (0.2409) loss 2.5426 (2.7520) grad_norm 4.4162 (5.8350) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][150/1251] eta 0:04:32 lr 0.000036 wd 0.0500 time 0.2383 (0.2477) data time 0.0007 (0.0043) model time 0.2376 (0.2409) loss 3.1323 (2.7516) grad_norm 4.5985 (5.7748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[271/300][160/1251] eta 0:04:29 lr 0.000036 wd 0.0500 time 0.2345 (0.2473) data time 0.0012 (0.0041) model time 0.2333 (0.2408) loss 2.9094 (2.7498) grad_norm 6.5234 (5.7798) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:25:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][170/1251] eta 0:04:26 lr 0.000036 wd 0.0500 time 0.2336 (0.2468) data time 0.0009 (0.0039) model time 0.2328 (0.2406) loss 2.5606 (2.7587) grad_norm 4.8738 (5.7341) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][180/1251] eta 0:04:25 lr 0.000036 wd 0.0500 time 0.2406 (0.2477) data time 0.0010 (0.0038) model time 0.2395 (0.2423) loss 2.6506 (2.7385) grad_norm 6.1017 (5.8074) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][190/1251] eta 0:04:22 lr 0.000036 wd 0.0500 time 0.2326 (0.2474) data time 0.0010 (0.0036) model time 0.2316 (0.2421) loss 3.2059 (2.7344) grad_norm 5.8482 (5.7893) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][200/1251] eta 0:04:19 lr 0.000036 wd 0.0500 time 0.2442 (0.2470) data time 0.0007 (0.0035) model time 0.2435 (0.2419) loss 2.7353 (2.7242) grad_norm 4.3981 (5.7921) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][210/1251] eta 0:04:16 lr 0.000036 wd 0.0500 time 0.2350 (0.2468) data time 0.0009 (0.0034) model time 0.2341 (0.2420) loss 2.8839 (2.7222) grad_norm 7.9245 (5.7823) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][220/1251] eta 0:04:14 lr 0.000036 wd 0.0500 time 0.2472 (0.2466) data time 0.0011 (0.0033) model time 0.2461 (0.2419) loss 2.8501 (2.7254) grad_norm 6.0621 (5.8165) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 09:26:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][230/1251] eta 0:04:11 lr 0.000036 wd 0.0500 time 0.2407 (0.2463) data time 0.0007 (0.0032) model time 0.2400 (0.2417) loss 3.5472 (2.7348) grad_norm 5.2288 (5.7927) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][240/1251] eta 0:04:08 lr 0.000036 wd 0.0500 time 0.2335 (0.2461) data time 0.0011 (0.0031) model time 0.2325 (0.2417) loss 2.9803 (2.7180) grad_norm 3.9052 (5.7737) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][250/1251] eta 0:04:06 lr 0.000036 wd 0.0500 time 0.2723 (0.2462) data time 0.0011 (0.0030) model time 0.2713 (0.2419) loss 3.3458 (2.7182) grad_norm 5.8038 (5.7378) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][260/1251] eta 0:04:03 lr 0.000036 wd 0.0500 time 0.2443 (0.2461) data time 0.0011 (0.0029) model time 0.2432 (0.2420) loss 3.2255 (2.7163) grad_norm 4.0209 (5.7393) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][270/1251] eta 0:04:01 lr 0.000036 wd 0.0500 time 0.2399 (0.2460) data time 0.0009 (0.0029) model time 0.2390 (0.2419) loss 1.6912 (2.7032) grad_norm 5.1520 (5.7042) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][280/1251] eta 0:03:58 lr 0.000036 wd 0.0500 time 0.2380 (0.2458) data time 0.0011 (0.0028) model time 0.2369 (0.2419) loss 3.1289 (2.7075) grad_norm 3.6224 (5.6842) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][290/1251] eta 0:03:56 lr 0.000036 wd 0.0500 time 0.2424 (0.2457) data time 0.0009 (0.0027) model time 0.2415 
(0.2419) loss 2.9058 (2.7077) grad_norm 3.7725 (5.6787) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][300/1251] eta 0:03:53 lr 0.000036 wd 0.0500 time 0.2434 (0.2456) data time 0.0007 (0.0027) model time 0.2427 (0.2419) loss 2.6177 (2.7098) grad_norm 3.4715 (5.6564) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][310/1251] eta 0:03:51 lr 0.000036 wd 0.0500 time 0.2412 (0.2455) data time 0.0009 (0.0026) model time 0.2402 (0.2419) loss 2.8207 (2.7096) grad_norm 7.7306 (5.6749) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][320/1251] eta 0:03:48 lr 0.000036 wd 0.0500 time 0.2422 (0.2454) data time 0.0009 (0.0026) model time 0.2413 (0.2419) loss 2.5192 (2.7131) grad_norm 5.3469 (5.6668) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][330/1251] eta 0:03:47 lr 0.000036 wd 0.0500 time 0.2515 (0.2466) data time 0.0007 (0.0025) model time 0.2508 (0.2434) loss 1.6836 (2.7063) grad_norm 5.2798 (5.6414) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][340/1251] eta 0:03:45 lr 0.000035 wd 0.0500 time 0.2430 (0.2471) data time 0.0009 (0.0025) model time 0.2421 (0.2440) loss 3.0626 (2.7036) grad_norm 7.8216 (5.6389) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][350/1251] eta 0:03:42 lr 0.000035 wd 0.0500 time 0.2398 (0.2470) data time 0.0008 (0.0024) model time 0.2390 (0.2440) loss 2.9568 (2.7017) grad_norm 6.0170 (5.6316) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][360/1251] eta 0:03:40 
lr 0.000035 wd 0.0500 time 0.2445 (0.2469) data time 0.0010 (0.0024) model time 0.2435 (0.2440) loss 2.7778 (2.7020) grad_norm 12.7493 (5.6495) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][370/1251] eta 0:03:37 lr 0.000035 wd 0.0500 time 0.2325 (0.2468) data time 0.0009 (0.0023) model time 0.2316 (0.2438) loss 1.9329 (2.6960) grad_norm 4.7258 (5.6360) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][380/1251] eta 0:03:34 lr 0.000035 wd 0.0500 time 0.2386 (0.2467) data time 0.0009 (0.0023) model time 0.2377 (0.2438) loss 2.5956 (2.6951) grad_norm 5.3756 (5.7008) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][390/1251] eta 0:03:32 lr 0.000035 wd 0.0500 time 0.2446 (0.2465) data time 0.0008 (0.0023) model time 0.2438 (0.2436) loss 2.5918 (2.6948) grad_norm 4.6188 (5.6784) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][400/1251] eta 0:03:29 lr 0.000035 wd 0.0500 time 0.2479 (0.2464) data time 0.0008 (0.0022) model time 0.2471 (0.2436) loss 2.9382 (2.6978) grad_norm 4.1040 (5.6514) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:26:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][410/1251] eta 0:03:27 lr 0.000035 wd 0.0500 time 0.2375 (0.2463) data time 0.0007 (0.0022) model time 0.2367 (0.2435) loss 2.2279 (2.6988) grad_norm 21.4852 (5.6849) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][420/1251] eta 0:03:24 lr 0.000035 wd 0.0500 time 0.2343 (0.2462) data time 0.0011 (0.0022) model time 0.2332 (0.2435) loss 2.7534 (2.6948) grad_norm 5.3947 (5.6998) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:03 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][430/1251] eta 0:03:22 lr 0.000035 wd 0.0500 time 0.2416 (0.2461) data time 0.0009 (0.0022) model time 0.2407 (0.2434) loss 1.9149 (2.6918) grad_norm 4.7290 (5.6985) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][440/1251] eta 0:03:19 lr 0.000035 wd 0.0500 time 0.2445 (0.2460) data time 0.0011 (0.0021) model time 0.2435 (0.2433) loss 3.0060 (2.6951) grad_norm 5.8300 (5.6802) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][450/1251] eta 0:03:16 lr 0.000035 wd 0.0500 time 0.2452 (0.2459) data time 0.0009 (0.0021) model time 0.2442 (0.2433) loss 2.2778 (2.6977) grad_norm 4.4199 (5.6623) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][460/1251] eta 0:03:14 lr 0.000035 wd 0.0500 time 0.2399 (0.2458) data time 0.0009 (0.0021) model time 0.2390 (0.2432) loss 2.7857 (2.6951) grad_norm 6.4790 (5.6730) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][470/1251] eta 0:03:11 lr 0.000035 wd 0.0500 time 0.2407 (0.2457) data time 0.0007 (0.0021) model time 0.2400 (0.2431) loss 2.8288 (2.6948) grad_norm 4.9808 (5.6566) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][480/1251] eta 0:03:09 lr 0.000035 wd 0.0500 time 0.2387 (0.2456) data time 0.0011 (0.0020) model time 0.2377 (0.2431) loss 3.0166 (2.6950) grad_norm 15.5293 (5.6662) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][490/1251] eta 0:03:06 lr 0.000035 wd 0.0500 time 0.2411 (0.2455) data time 0.0007 (0.0020) model time 0.2403 (0.2430) loss 2.9738 (2.6990) 
grad_norm 4.9874 (5.6512) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][500/1251] eta 0:03:04 lr 0.000035 wd 0.0500 time 0.2367 (0.2455) data time 0.0007 (0.0020) model time 0.2360 (0.2430) loss 2.9886 (2.6992) grad_norm 5.5389 (5.6947) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][510/1251] eta 0:03:01 lr 0.000035 wd 0.0500 time 0.2516 (0.2454) data time 0.0008 (0.0020) model time 0.2508 (0.2429) loss 1.5365 (2.6972) grad_norm 5.2954 (5.6880) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][520/1251] eta 0:02:59 lr 0.000035 wd 0.0500 time 0.2416 (0.2453) data time 0.0010 (0.0019) model time 0.2406 (0.2429) loss 3.3387 (2.7025) grad_norm 3.6784 (5.6765) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][530/1251] eta 0:02:56 lr 0.000035 wd 0.0500 time 0.2430 (0.2453) data time 0.0009 (0.0019) model time 0.2420 (0.2428) loss 2.3434 (2.7070) grad_norm 3.4629 (5.6801) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][540/1251] eta 0:02:54 lr 0.000035 wd 0.0500 time 0.2402 (0.2452) data time 0.0007 (0.0019) model time 0.2395 (0.2428) loss 3.4175 (2.7066) grad_norm 8.4784 (5.6984) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][550/1251] eta 0:02:51 lr 0.000035 wd 0.0500 time 0.2383 (0.2451) data time 0.0011 (0.0019) model time 0.2371 (0.2427) loss 3.1406 (2.7069) grad_norm 4.9149 (5.7109) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][560/1251] eta 0:02:49 lr 0.000035 wd 0.0500 time 
0.2361 (0.2450) data time 0.0012 (0.0019) model time 0.2349 (0.2427) loss 2.6620 (2.7048) grad_norm 4.3210 (5.7209) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][570/1251] eta 0:02:46 lr 0.000035 wd 0.0500 time 0.2420 (0.2450) data time 0.0010 (0.0019) model time 0.2410 (0.2426) loss 3.1833 (2.7063) grad_norm 5.0073 (5.7169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][580/1251] eta 0:02:44 lr 0.000035 wd 0.0500 time 0.2426 (0.2448) data time 0.0009 (0.0019) model time 0.2417 (0.2425) loss 2.8077 (2.7093) grad_norm 5.5871 (5.6993) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][590/1251] eta 0:02:41 lr 0.000035 wd 0.0500 time 0.2369 (0.2448) data time 0.0012 (0.0018) model time 0.2357 (0.2424) loss 2.5583 (2.7087) grad_norm 4.7483 (5.6939) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][600/1251] eta 0:02:39 lr 0.000035 wd 0.0500 time 0.2413 (0.2447) data time 0.0009 (0.0018) model time 0.2403 (0.2424) loss 3.0098 (2.7098) grad_norm 4.5398 (5.6815) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][610/1251] eta 0:02:36 lr 0.000035 wd 0.0500 time 0.2456 (0.2447) data time 0.0011 (0.0018) model time 0.2445 (0.2424) loss 2.9631 (2.7134) grad_norm 6.8639 (5.6904) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][620/1251] eta 0:02:34 lr 0.000035 wd 0.0500 time 0.2439 (0.2446) data time 0.0007 (0.0018) model time 0.2432 (0.2424) loss 1.9079 (2.7120) grad_norm 5.8461 (5.7101) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:51 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [271/300][630/1251] eta 0:02:31 lr 0.000035 wd 0.0500 time 0.2430 (0.2446) data time 0.0010 (0.0018) model time 0.2420 (0.2424) loss 3.0161 (2.7138) grad_norm 6.3001 (5.7631) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][640/1251] eta 0:02:29 lr 0.000035 wd 0.0500 time 0.2387 (0.2446) data time 0.0007 (0.0018) model time 0.2380 (0.2423) loss 3.0698 (2.7189) grad_norm 11.2754 (5.7624) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][650/1251] eta 0:02:26 lr 0.000035 wd 0.0500 time 0.2407 (0.2445) data time 0.0009 (0.0018) model time 0.2398 (0.2423) loss 3.5051 (2.7217) grad_norm 4.1949 (5.7776) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][660/1251] eta 0:02:24 lr 0.000035 wd 0.0500 time 0.2409 (0.2444) data time 0.0009 (0.0017) model time 0.2400 (0.2422) loss 2.0726 (2.7202) grad_norm 4.9314 (5.7687) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][670/1251] eta 0:02:22 lr 0.000035 wd 0.0500 time 0.2450 (0.2444) data time 0.0011 (0.0017) model time 0.2439 (0.2422) loss 3.1691 (2.7232) grad_norm 7.9816 (5.7617) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][680/1251] eta 0:02:19 lr 0.000035 wd 0.0500 time 0.2423 (0.2444) data time 0.0009 (0.0017) model time 0.2414 (0.2422) loss 3.0182 (2.7196) grad_norm 4.7827 (5.7608) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][690/1251] eta 0:02:17 lr 0.000035 wd 0.0500 time 0.2351 (0.2444) data time 0.0009 (0.0017) model time 0.2342 (0.2422) loss 3.2971 (2.7235) grad_norm 6.2590 
(5.7550) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][700/1251] eta 0:02:14 lr 0.000035 wd 0.0500 time 0.2410 (0.2446) data time 0.0010 (0.0017) model time 0.2400 (0.2425) loss 2.7309 (2.7235) grad_norm 4.6930 (5.7880) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][710/1251] eta 0:02:12 lr 0.000035 wd 0.0500 time 0.2473 (0.2448) data time 0.0007 (0.0017) model time 0.2465 (0.2427) loss 1.8266 (2.7214) grad_norm 5.0455 (5.8010) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][720/1251] eta 0:02:10 lr 0.000035 wd 0.0500 time 0.2474 (0.2451) data time 0.0009 (0.0017) model time 0.2465 (0.2430) loss 3.0849 (2.7184) grad_norm 5.0478 (5.7942) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][730/1251] eta 0:02:07 lr 0.000035 wd 0.0500 time 0.2440 (0.2451) data time 0.0009 (0.0017) model time 0.2431 (0.2430) loss 3.2002 (2.7178) grad_norm 4.5707 (5.7859) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][740/1251] eta 0:02:05 lr 0.000035 wd 0.0500 time 0.2431 (0.2450) data time 0.0009 (0.0017) model time 0.2422 (0.2430) loss 2.0696 (2.7180) grad_norm 6.8480 (5.7788) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][750/1251] eta 0:02:02 lr 0.000035 wd 0.0500 time 0.2413 (0.2450) data time 0.0007 (0.0017) model time 0.2407 (0.2430) loss 2.8647 (2.7129) grad_norm 4.4175 (5.7727) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][760/1251] eta 0:02:00 lr 0.000035 wd 0.0500 time 0.2447 (0.2450) data 
time 0.0009 (0.0017) model time 0.2439 (0.2430) loss 3.0010 (2.7163) grad_norm 4.2487 (5.7586) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][770/1251] eta 0:01:57 lr 0.000035 wd 0.0500 time 0.2422 (0.2449) data time 0.0010 (0.0016) model time 0.2412 (0.2429) loss 2.9367 (2.7170) grad_norm 4.3918 (5.7510) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][780/1251] eta 0:01:55 lr 0.000035 wd 0.0500 time 0.2494 (0.2449) data time 0.0009 (0.0016) model time 0.2484 (0.2429) loss 2.8460 (2.7204) grad_norm 7.3435 (5.7459) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][790/1251] eta 0:01:52 lr 0.000035 wd 0.0500 time 0.2429 (0.2448) data time 0.0008 (0.0016) model time 0.2421 (0.2429) loss 1.6855 (2.7198) grad_norm 5.0931 (5.7353) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][800/1251] eta 0:01:50 lr 0.000035 wd 0.0500 time 0.2466 (0.2448) data time 0.0007 (0.0016) model time 0.2459 (0.2429) loss 1.7848 (2.7189) grad_norm 7.9309 (5.7904) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][810/1251] eta 0:01:47 lr 0.000035 wd 0.0500 time 0.2413 (0.2448) data time 0.0007 (0.0016) model time 0.2407 (0.2428) loss 2.8780 (2.7170) grad_norm 5.0191 (5.7758) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][820/1251] eta 0:01:45 lr 0.000035 wd 0.0500 time 0.2435 (0.2448) data time 0.0008 (0.0016) model time 0.2427 (0.2429) loss 3.1979 (2.7157) grad_norm 4.8327 (5.7724) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [271/300][830/1251] eta 0:01:43 lr 0.000035 wd 0.0500 time 0.2486 (0.2448) data time 0.0012 (0.0016) model time 0.2474 (0.2428) loss 2.2178 (2.7146) grad_norm 4.7945 (5.7732) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][840/1251] eta 0:01:40 lr 0.000035 wd 0.0500 time 0.2350 (0.2447) data time 0.0010 (0.0016) model time 0.2340 (0.2428) loss 2.9065 (2.7133) grad_norm 7.8016 (5.7729) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][850/1251] eta 0:01:38 lr 0.000035 wd 0.0500 time 0.2318 (0.2447) data time 0.0011 (0.0016) model time 0.2307 (0.2428) loss 2.8674 (2.7133) grad_norm 3.8895 (5.7654) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][860/1251] eta 0:01:35 lr 0.000035 wd 0.0500 time 0.2466 (0.2447) data time 0.0009 (0.0016) model time 0.2457 (0.2428) loss 2.9419 (2.7141) grad_norm 5.1550 (5.7536) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][870/1251] eta 0:01:33 lr 0.000035 wd 0.0500 time 0.2393 (0.2446) data time 0.0010 (0.0016) model time 0.2383 (0.2427) loss 2.9941 (2.7135) grad_norm 4.1776 (5.7412) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][880/1251] eta 0:01:30 lr 0.000035 wd 0.0500 time 0.2559 (0.2446) data time 0.0009 (0.0016) model time 0.2550 (0.2428) loss 3.2595 (2.7136) grad_norm 3.4934 (5.7443) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:28:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][890/1251] eta 0:01:28 lr 0.000035 wd 0.0500 time 0.2522 (0.2446) data time 0.0007 (0.0016) model time 0.2515 (0.2427) loss 2.4672 (2.7128) grad_norm 6.4955 (5.7331) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 09:28:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][900/1251] eta 0:01:25 lr 0.000035 wd 0.0500 time 0.2402 (0.2446) data time 0.0009 (0.0015) model time 0.2393 (0.2427) loss 3.3880 (2.7144) grad_norm 4.9567 (5.7295) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][910/1251] eta 0:01:23 lr 0.000035 wd 0.0500 time 0.2457 (0.2446) data time 0.0011 (0.0015) model time 0.2447 (0.2427) loss 2.7956 (2.7153) grad_norm 6.1043 (5.7280) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][920/1251] eta 0:01:20 lr 0.000035 wd 0.0500 time 0.2342 (0.2445) data time 0.0010 (0.0015) model time 0.2332 (0.2427) loss 2.3666 (2.7146) grad_norm 7.4366 (5.7479) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][930/1251] eta 0:01:18 lr 0.000035 wd 0.0500 time 0.2464 (0.2445) data time 0.0007 (0.0015) model time 0.2457 (0.2427) loss 3.2408 (2.7154) grad_norm 5.4239 (5.7403) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][940/1251] eta 0:01:16 lr 0.000035 wd 0.0500 time 0.2459 (0.2445) data time 0.0007 (0.0015) model time 0.2452 (0.2427) loss 2.4908 (2.7153) grad_norm 6.7154 (5.7361) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][950/1251] eta 0:01:13 lr 0.000035 wd 0.0500 time 0.2419 (0.2445) data time 0.0010 (0.0015) model time 0.2409 (0.2426) loss 2.6409 (2.7146) grad_norm 5.9944 (5.7295) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][960/1251] eta 0:01:11 lr 0.000035 wd 0.0500 time 0.2340 (0.2444) data time 0.0010 (0.0015) model 
time 0.2330 (0.2426) loss 2.7747 (2.7136) grad_norm 5.8515 (5.7284) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][970/1251] eta 0:01:08 lr 0.000035 wd 0.0500 time 0.2423 (0.2444) data time 0.0009 (0.0015) model time 0.2414 (0.2426) loss 3.0103 (2.7144) grad_norm 6.2533 (5.7335) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][980/1251] eta 0:01:06 lr 0.000035 wd 0.0500 time 0.2428 (0.2444) data time 0.0010 (0.0015) model time 0.2418 (0.2426) loss 3.1385 (2.7150) grad_norm 9.0314 (5.7334) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][990/1251] eta 0:01:03 lr 0.000035 wd 0.0500 time 0.2450 (0.2443) data time 0.0007 (0.0015) model time 0.2443 (0.2426) loss 3.2045 (2.7159) grad_norm 8.6847 (5.7342) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1000/1251] eta 0:01:01 lr 0.000035 wd 0.0500 time 0.2429 (0.2443) data time 0.0009 (0.0015) model time 0.2419 (0.2425) loss 2.6642 (2.7168) grad_norm 4.0212 (5.7294) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1010/1251] eta 0:00:58 lr 0.000035 wd 0.0500 time 0.2443 (0.2443) data time 0.0007 (0.0015) model time 0.2436 (0.2425) loss 2.1096 (2.7146) grad_norm 4.8384 (5.7351) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1020/1251] eta 0:00:56 lr 0.000035 wd 0.0500 time 0.2292 (0.2443) data time 0.0008 (0.0015) model time 0.2283 (0.2425) loss 3.4321 (2.7132) grad_norm 4.5805 (5.7342) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[271/300][1030/1251] eta 0:00:53 lr 0.000035 wd 0.0500 time 0.2478 (0.2443) data time 0.0009 (0.0015) model time 0.2469 (0.2425) loss 2.0406 (2.7114) grad_norm 6.6369 (5.7335) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1040/1251] eta 0:00:51 lr 0.000035 wd 0.0500 time 0.2355 (0.2443) data time 0.0009 (0.0015) model time 0.2346 (0.2425) loss 2.1877 (2.7124) grad_norm 5.0810 (5.7551) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1050/1251] eta 0:00:49 lr 0.000035 wd 0.0500 time 0.2364 (0.2442) data time 0.0010 (0.0015) model time 0.2353 (0.2425) loss 2.4910 (2.7143) grad_norm 5.5228 (5.7473) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1060/1251] eta 0:00:46 lr 0.000034 wd 0.0500 time 0.2393 (0.2442) data time 0.0010 (0.0015) model time 0.2383 (0.2425) loss 2.3044 (2.7148) grad_norm 6.2095 (5.7431) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:29:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1070/1251] eta 0:00:44 lr 0.000034 wd 0.0500 time 0.2449 (0.2442) data time 0.0011 (0.0015) model time 0.2438 (0.2425) loss 2.4888 (2.7132) grad_norm 3.9186 (5.7441) loss_scale 512.0000 (258.1513) mem 7381MB [2024-09-01 09:29:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1080/1251] eta 0:00:41 lr 0.000034 wd 0.0500 time 0.2446 (0.2442) data time 0.0007 (0.0015) model time 0.2439 (0.2424) loss 2.9785 (2.7143) grad_norm 4.3625 (5.7346) loss_scale 512.0000 (260.4995) mem 7381MB [2024-09-01 09:29:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1090/1251] eta 0:00:39 lr 0.000034 wd 0.0500 time 0.2346 (0.2441) data time 0.0009 (0.0015) model time 0.2337 (0.2424) loss 2.4329 (2.7140) grad_norm 4.6337 (5.7302) loss_scale 512.0000 
(262.8048) mem 7381MB [2024-09-01 09:29:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1100/1251] eta 0:00:36 lr 0.000034 wd 0.0500 time 0.2452 (0.2441) data time 0.0009 (0.0014) model time 0.2443 (0.2424) loss 2.7661 (2.7149) grad_norm 3.9902 (5.7245) loss_scale 512.0000 (265.0681) mem 7381MB [2024-09-01 09:29:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1110/1251] eta 0:00:34 lr 0.000034 wd 0.0500 time 0.2473 (0.2441) data time 0.0009 (0.0014) model time 0.2464 (0.2424) loss 3.2617 (2.7174) grad_norm 3.7888 (5.7182) loss_scale 512.0000 (267.2907) mem 7381MB [2024-09-01 09:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1120/1251] eta 0:00:31 lr 0.000034 wd 0.0500 time 0.2406 (0.2441) data time 0.0008 (0.0014) model time 0.2397 (0.2424) loss 3.3859 (2.7198) grad_norm 7.6301 (5.7177) loss_scale 512.0000 (269.4737) mem 7381MB [2024-09-01 09:29:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1130/1251] eta 0:00:29 lr 0.000034 wd 0.0500 time 0.2622 (0.2441) data time 0.0008 (0.0014) model time 0.2615 (0.2424) loss 2.9680 (2.7215) grad_norm 3.7150 (5.7098) loss_scale 512.0000 (271.6180) mem 7381MB [2024-09-01 09:29:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1140/1251] eta 0:00:27 lr 0.000034 wd 0.0500 time 0.2441 (0.2441) data time 0.0009 (0.0014) model time 0.2432 (0.2424) loss 1.7291 (2.7217) grad_norm 8.8724 (5.7168) loss_scale 512.0000 (273.7248) mem 7381MB [2024-09-01 09:29:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1150/1251] eta 0:00:24 lr 0.000034 wd 0.0500 time 0.2432 (0.2441) data time 0.0009 (0.0014) model time 0.2422 (0.2424) loss 2.8951 (2.7230) grad_norm 3.0149 (5.7150) loss_scale 512.0000 (275.7950) mem 7381MB [2024-09-01 09:30:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1160/1251] eta 0:00:22 lr 0.000034 wd 0.0500 time 0.2346 (0.2440) data time 0.0009 (0.0014) 
model time 0.2337 (0.2424) loss 1.6834 (2.7227) grad_norm 4.3289 (5.7091) loss_scale 512.0000 (277.8295) mem 7381MB [2024-09-01 09:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1170/1251] eta 0:00:19 lr 0.000034 wd 0.0500 time 0.2397 (0.2440) data time 0.0008 (0.0014) model time 0.2389 (0.2424) loss 2.2405 (2.7241) grad_norm 3.4764 (5.7082) loss_scale 512.0000 (279.8292) mem 7381MB [2024-09-01 09:30:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1180/1251] eta 0:00:17 lr 0.000034 wd 0.0500 time 0.2484 (0.2440) data time 0.0008 (0.0014) model time 0.2476 (0.2423) loss 2.1312 (2.7247) grad_norm 6.3495 (5.7142) loss_scale 512.0000 (281.7951) mem 7381MB [2024-09-01 09:30:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1190/1251] eta 0:00:14 lr 0.000034 wd 0.0500 time 0.2423 (0.2440) data time 0.0009 (0.0014) model time 0.2414 (0.2423) loss 2.4916 (2.7226) grad_norm 3.6035 (5.7135) loss_scale 512.0000 (283.7280) mem 7381MB [2024-09-01 09:30:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1200/1251] eta 0:00:12 lr 0.000034 wd 0.0500 time 0.2593 (0.2440) data time 0.0009 (0.0014) model time 0.2583 (0.2423) loss 2.8216 (2.7215) grad_norm 3.5661 (5.7096) loss_scale 512.0000 (285.6286) mem 7381MB [2024-09-01 09:30:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1210/1251] eta 0:00:10 lr 0.000034 wd 0.0500 time 0.2404 (0.2439) data time 0.0007 (0.0014) model time 0.2398 (0.2423) loss 3.4663 (2.7221) grad_norm 5.4640 (5.7033) loss_scale 512.0000 (287.4979) mem 7381MB [2024-09-01 09:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1220/1251] eta 0:00:07 lr 0.000034 wd 0.0500 time 0.2383 (0.2439) data time 0.0012 (0.0014) model time 0.2371 (0.2423) loss 2.8692 (2.7215) grad_norm 4.2486 (5.7039) loss_scale 512.0000 (289.3366) mem 7381MB [2024-09-01 09:30:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[271/300][1230/1251] eta 0:00:05 lr 0.000034 wd 0.0500 time 0.2416 (0.2439) data time 0.0008 (0.0014) model time 0.2408 (0.2422) loss 2.4905 (2.7201) grad_norm 8.4445 (5.7182) loss_scale 512.0000 (291.1454) mem 7381MB [2024-09-01 09:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1240/1251] eta 0:00:02 lr 0.000034 wd 0.0500 time 0.2217 (0.2442) data time 0.0007 (0.0014) model time 0.2210 (0.2426) loss 2.9213 (2.7202) grad_norm 4.5800 (5.7687) loss_scale 512.0000 (292.9251) mem 7381MB [2024-09-01 09:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [271/300][1250/1251] eta 0:00:00 lr 0.000034 wd 0.0500 time 0.2234 (0.2441) data time 0.0004 (0.0014) model time 0.2230 (0.2425) loss 3.2386 (2.7218) grad_norm 6.4720 (5.7765) loss_scale 512.0000 (294.6763) mem 7381MB [2024-09-01 09:30:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 271 training takes 0:05:05 [2024-09-01 09:30:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:30:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 09:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.466 (0.466) Loss 0.4031 (0.4031) Acc@1 92.871 (92.871) Acc@5 98.438 (98.438) Mem 7381MB [2024-09-01 09:30:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.112) Loss 0.5801 (0.6209) Acc@1 90.039 (87.553) Acc@5 98.340 (97.745) Mem 7381MB [2024-09-01 09:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.096) Loss 0.9131 (0.6501) Acc@1 78.027 (86.477) Acc@5 96.289 (97.712) Mem 7381MB [2024-09-01 09:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.1582 (0.7434) Acc@1 73.828 (84.293) Acc@5 92.383 (96.730) Mem 7381MB [2024-09-01 09:30:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0488 (0.7920) Acc@1 76.270 (83.072) Acc@5 94.043 (96.208) Mem 7381MB [2024-09-01 09:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.638 Acc@5 96.146 [2024-09-01 09:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 09:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.785 (0.785) Loss 0.3855 (0.3855) Acc@1 93.164 (93.164) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 09:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.149) Loss 0.5654 (0.6038) Acc@1 90.430 (87.811) Acc@5 98.242 (97.834) Mem 7381MB [2024-09-01 09:30:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.116) Loss 0.9053 (0.6340) Acc@1 78.223 (86.663) Acc@5 95.898 (97.754) Mem 7381MB [2024-09-01 09:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.104) Loss 1.1260 (0.7250) Acc@1 74.512 (84.473) Acc@5 92.773 (96.821) Mem 7381MB [2024-09-01 09:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 0.9995 
(0.7723) Acc@1 77.441 (83.306) Acc@5 94.141 (96.325) Mem 7381MB [2024-09-01 09:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.904 Acc@5 96.270 [2024-09-01 09:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 09:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][0/1251] eta 0:22:09 lr 0.000034 wd 0.0500 time 1.0627 (1.0627) data time 0.7509 (0.7509) model time 0.0000 (0.0000) loss 2.8932 (2.8932) grad_norm 3.7913 (3.7913) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][10/1251] eta 0:07:48 lr 0.000034 wd 0.0500 time 0.2479 (0.3775) data time 0.0009 (0.0692) model time 0.0000 (0.0000) loss 2.7566 (2.8623) grad_norm 4.4705 (5.1844) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][20/1251] eta 0:06:25 lr 0.000034 wd 0.0500 time 0.2487 (0.3131) data time 0.0008 (0.0367) model time 0.0000 (0.0000) loss 2.4383 (2.7486) grad_norm 27.7996 (6.4379) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][30/1251] eta 0:05:54 lr 0.000034 wd 0.0500 time 0.2507 (0.2902) data time 0.0010 (0.0252) model time 0.0000 (0.0000) loss 2.9719 (2.7958) grad_norm 5.5214 (6.3183) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][40/1251] eta 0:05:36 lr 0.000034 wd 0.0500 time 0.2449 (0.2782) data time 0.0007 (0.0193) model time 0.0000 (0.0000) loss 3.2613 (2.7587) grad_norm 6.3484 (6.0615) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][50/1251] eta 0:05:25 lr 0.000034 wd 0.0500 time 0.2401 (0.2711) data time 0.0008 (0.0157) model time 0.0000 (0.0000) loss 
2.1204 (2.7360) grad_norm 3.9291 (5.8039) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][60/1251] eta 0:05:17 lr 0.000034 wd 0.0500 time 0.2432 (0.2663) data time 0.0008 (0.0133) model time 0.2424 (0.2410) loss 2.8485 (2.6881) grad_norm 4.7205 (5.7350) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][70/1251] eta 0:05:10 lr 0.000034 wd 0.0500 time 0.2521 (0.2629) data time 0.0010 (0.0116) model time 0.2511 (0.2411) loss 2.6366 (2.6759) grad_norm 6.1359 (5.7086) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][80/1251] eta 0:05:04 lr 0.000034 wd 0.0500 time 0.2416 (0.2605) data time 0.0010 (0.0103) model time 0.2406 (0.2414) loss 2.4208 (2.6549) grad_norm 9.1325 (5.8848) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][90/1251] eta 0:05:00 lr 0.000034 wd 0.0500 time 0.2405 (0.2584) data time 0.0010 (0.0092) model time 0.2395 (0.2413) loss 2.5188 (2.6280) grad_norm 5.7157 (5.8630) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][100/1251] eta 0:04:55 lr 0.000034 wd 0.0500 time 0.2486 (0.2572) data time 0.0007 (0.0084) model time 0.2479 (0.2420) loss 2.8583 (2.6495) grad_norm 3.7313 (5.8062) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][110/1251] eta 0:04:51 lr 0.000034 wd 0.0500 time 0.2429 (0.2558) data time 0.0009 (0.0077) model time 0.2420 (0.2419) loss 2.0317 (2.6169) grad_norm 5.2659 (5.7569) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][120/1251] eta 0:04:48 lr 0.000034 wd 
0.0500 time 0.2415 (0.2549) data time 0.0010 (0.0072) model time 0.2404 (0.2421) loss 3.0224 (2.6401) grad_norm 4.0812 (5.8878) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][130/1251] eta 0:04:44 lr 0.000034 wd 0.0500 time 0.2386 (0.2537) data time 0.0009 (0.0067) model time 0.2377 (0.2417) loss 3.0708 (2.6571) grad_norm 5.5561 (5.9064) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][140/1251] eta 0:04:40 lr 0.000034 wd 0.0500 time 0.2373 (0.2529) data time 0.0009 (0.0063) model time 0.2365 (0.2416) loss 3.2457 (2.6607) grad_norm 4.2583 (5.9042) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][150/1251] eta 0:04:37 lr 0.000034 wd 0.0500 time 0.2525 (0.2524) data time 0.0009 (0.0060) model time 0.2516 (0.2418) loss 3.0736 (2.6833) grad_norm 10.0828 (6.1818) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][160/1251] eta 0:04:34 lr 0.000034 wd 0.0500 time 0.2323 (0.2517) data time 0.0008 (0.0057) model time 0.2315 (0.2417) loss 2.3291 (2.6899) grad_norm 4.6308 (6.0893) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][170/1251] eta 0:04:31 lr 0.000034 wd 0.0500 time 0.2441 (0.2512) data time 0.0008 (0.0054) model time 0.2433 (0.2417) loss 3.2084 (2.6937) grad_norm 5.0987 (6.0757) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][180/1251] eta 0:04:28 lr 0.000034 wd 0.0500 time 0.2297 (0.2506) data time 0.0008 (0.0051) model time 0.2289 (0.2415) loss 3.2028 (2.7096) grad_norm 4.6271 (6.0176) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:19 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][190/1251] eta 0:04:25 lr 0.000034 wd 0.0500 time 0.2421 (0.2501) data time 0.0008 (0.0049) model time 0.2414 (0.2415) loss 2.8328 (2.7079) grad_norm 9.0639 (5.9911) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][200/1251] eta 0:04:22 lr 0.000034 wd 0.0500 time 0.2397 (0.2497) data time 0.0007 (0.0047) model time 0.2390 (0.2414) loss 3.4463 (2.7108) grad_norm 6.1357 (6.0403) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][210/1251] eta 0:04:19 lr 0.000034 wd 0.0500 time 0.2409 (0.2491) data time 0.0011 (0.0045) model time 0.2398 (0.2411) loss 3.1139 (2.7025) grad_norm 5.5802 (6.1067) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][220/1251] eta 0:04:16 lr 0.000034 wd 0.0500 time 0.2347 (0.2488) data time 0.0011 (0.0044) model time 0.2336 (0.2411) loss 2.9773 (2.6921) grad_norm 4.0704 (6.1666) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][230/1251] eta 0:04:13 lr 0.000034 wd 0.0500 time 0.2295 (0.2485) data time 0.0010 (0.0042) model time 0.2285 (0.2411) loss 2.3175 (2.6882) grad_norm 11.8460 (6.1394) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][240/1251] eta 0:04:10 lr 0.000034 wd 0.0500 time 0.2393 (0.2482) data time 0.0009 (0.0041) model time 0.2384 (0.2411) loss 2.5393 (2.6787) grad_norm 5.2515 (6.1876) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][250/1251] eta 0:04:08 lr 0.000034 wd 0.0500 time 0.2346 (0.2480) data time 0.0010 (0.0040) model time 0.2337 (0.2411) loss 2.5568 (2.6760) 
grad_norm 4.4430 (6.1443) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][260/1251] eta 0:04:05 lr 0.000034 wd 0.0500 time 0.2336 (0.2477) data time 0.0010 (0.0039) model time 0.2325 (0.2411) loss 3.3027 (2.6776) grad_norm 4.1423 (6.1553) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][270/1251] eta 0:04:02 lr 0.000034 wd 0.0500 time 0.2430 (0.2475) data time 0.0010 (0.0038) model time 0.2420 (0.2411) loss 3.0341 (2.6867) grad_norm 4.3115 (6.1591) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][280/1251] eta 0:04:00 lr 0.000034 wd 0.0500 time 0.2473 (0.2474) data time 0.0007 (0.0037) model time 0.2466 (0.2412) loss 3.0526 (2.6808) grad_norm 4.9176 (6.1242) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][290/1251] eta 0:03:57 lr 0.000034 wd 0.0500 time 0.2433 (0.2472) data time 0.0010 (0.0036) model time 0.2423 (0.2412) loss 3.0116 (2.6884) grad_norm 5.6240 (6.0836) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][300/1251] eta 0:03:54 lr 0.000034 wd 0.0500 time 0.2338 (0.2471) data time 0.0007 (0.0035) model time 0.2331 (0.2412) loss 3.0938 (2.6936) grad_norm 6.2018 (6.0857) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][310/1251] eta 0:03:52 lr 0.000034 wd 0.0500 time 0.2429 (0.2469) data time 0.0010 (0.0034) model time 0.2419 (0.2412) loss 3.1394 (2.7014) grad_norm 6.1608 (6.0625) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][320/1251] eta 0:03:49 lr 0.000034 wd 0.0500 time 
0.2376 (0.2467) data time 0.0007 (0.0034) model time 0.2369 (0.2411) loss 2.8580 (2.7042) grad_norm 6.8459 (6.0567) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][330/1251] eta 0:03:47 lr 0.000034 wd 0.0500 time 0.2485 (0.2466) data time 0.0009 (0.0033) model time 0.2476 (0.2411) loss 2.9892 (2.7079) grad_norm 5.1941 (6.0531) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][340/1251] eta 0:03:44 lr 0.000034 wd 0.0500 time 0.2335 (0.2464) data time 0.0007 (0.0032) model time 0.2328 (0.2410) loss 2.0099 (2.7132) grad_norm 6.0564 (6.0258) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][350/1251] eta 0:03:41 lr 0.000034 wd 0.0500 time 0.2302 (0.2463) data time 0.0010 (0.0032) model time 0.2292 (0.2411) loss 3.1196 (2.7128) grad_norm 5.0580 (5.9963) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][360/1251] eta 0:03:39 lr 0.000034 wd 0.0500 time 0.2328 (0.2461) data time 0.0009 (0.0031) model time 0.2319 (0.2410) loss 1.5625 (2.7132) grad_norm 3.9571 (5.9614) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][370/1251] eta 0:03:36 lr 0.000034 wd 0.0500 time 0.2423 (0.2460) data time 0.0010 (0.0030) model time 0.2412 (0.2410) loss 2.5798 (2.7172) grad_norm 6.1396 (5.9825) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][380/1251] eta 0:03:34 lr 0.000034 wd 0.0500 time 0.2397 (0.2459) data time 0.0007 (0.0030) model time 0.2390 (0.2409) loss 2.9912 (2.7190) grad_norm 3.6517 (5.9442) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:07 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [272/300][390/1251] eta 0:03:31 lr 0.000034 wd 0.0500 time 0.2368 (0.2457) data time 0.0007 (0.0029) model time 0.2360 (0.2409) loss 2.6770 (2.7157) grad_norm 3.5012 (5.9389) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][400/1251] eta 0:03:29 lr 0.000034 wd 0.0500 time 0.2525 (0.2456) data time 0.0009 (0.0029) model time 0.2515 (0.2409) loss 2.5968 (2.7160) grad_norm 4.6525 (5.9089) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][410/1251] eta 0:03:26 lr 0.000034 wd 0.0500 time 0.2459 (0.2455) data time 0.0007 (0.0028) model time 0.2452 (0.2409) loss 3.2374 (2.7186) grad_norm 6.1455 (5.9079) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][420/1251] eta 0:03:23 lr 0.000034 wd 0.0500 time 0.2469 (0.2454) data time 0.0010 (0.0028) model time 0.2460 (0.2409) loss 3.1740 (2.7212) grad_norm 3.7186 (5.8783) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][430/1251] eta 0:03:21 lr 0.000034 wd 0.0500 time 0.2369 (0.2453) data time 0.0009 (0.0028) model time 0.2359 (0.2408) loss 2.7739 (2.7231) grad_norm 6.2562 (5.8530) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][440/1251] eta 0:03:18 lr 0.000034 wd 0.0500 time 0.2474 (0.2452) data time 0.0009 (0.0027) model time 0.2465 (0.2407) loss 2.6836 (2.7184) grad_norm 4.3063 (5.8308) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][450/1251] eta 0:03:16 lr 0.000034 wd 0.0500 time 0.2394 (0.2451) data time 0.0007 (0.0027) model time 0.2387 (0.2408) loss 3.0689 (2.7147) grad_norm 5.4575 
(5.8522) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][460/1251] eta 0:03:13 lr 0.000034 wd 0.0500 time 0.2436 (0.2450) data time 0.0007 (0.0026) model time 0.2429 (0.2407) loss 3.2093 (2.7141) grad_norm 6.4852 (5.8685) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][470/1251] eta 0:03:11 lr 0.000034 wd 0.0500 time 0.2347 (0.2452) data time 0.0008 (0.0026) model time 0.2339 (0.2410) loss 2.8754 (2.7115) grad_norm 5.2384 (5.8510) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][480/1251] eta 0:03:09 lr 0.000034 wd 0.0500 time 0.2484 (0.2451) data time 0.0010 (0.0026) model time 0.2475 (0.2411) loss 2.8507 (2.7117) grad_norm 5.1724 (5.8730) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][490/1251] eta 0:03:06 lr 0.000034 wd 0.0500 time 0.2482 (0.2450) data time 0.0007 (0.0025) model time 0.2475 (0.2410) loss 2.7870 (2.7151) grad_norm 5.2328 (5.8541) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][500/1251] eta 0:03:03 lr 0.000034 wd 0.0500 time 0.2474 (0.2449) data time 0.0010 (0.0025) model time 0.2465 (0.2410) loss 2.1314 (2.7176) grad_norm 7.3576 (5.8478) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][510/1251] eta 0:03:01 lr 0.000034 wd 0.0500 time 0.2328 (0.2453) data time 0.0008 (0.0025) model time 0.2319 (0.2414) loss 3.0050 (2.7162) grad_norm 12.6747 (5.8626) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][520/1251] eta 0:02:59 lr 0.000034 wd 0.0500 time 0.2371 (0.2456) 
data time 0.0008 (0.0024) model time 0.2363 (0.2418) loss 2.3194 (2.7173) grad_norm 4.7984 (5.9358) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][530/1251] eta 0:02:57 lr 0.000034 wd 0.0500 time 0.2391 (0.2455) data time 0.0008 (0.0024) model time 0.2382 (0.2418) loss 2.6901 (2.7151) grad_norm 4.2355 (5.9209) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][540/1251] eta 0:02:54 lr 0.000033 wd 0.0500 time 0.2488 (0.2455) data time 0.0007 (0.0024) model time 0.2481 (0.2418) loss 2.2445 (2.7160) grad_norm 7.3056 (5.9092) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][550/1251] eta 0:02:52 lr 0.000033 wd 0.0500 time 0.2514 (0.2455) data time 0.0010 (0.0024) model time 0.2504 (0.2418) loss 2.8472 (2.7199) grad_norm 3.6578 (5.9315) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][560/1251] eta 0:02:49 lr 0.000033 wd 0.0500 time 0.2427 (0.2454) data time 0.0007 (0.0023) model time 0.2420 (0.2418) loss 3.0359 (2.7201) grad_norm 5.9932 (5.9133) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][570/1251] eta 0:02:47 lr 0.000033 wd 0.0500 time 0.2406 (0.2453) data time 0.0011 (0.0023) model time 0.2395 (0.2418) loss 2.5815 (2.7200) grad_norm 4.7441 (5.9627) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][580/1251] eta 0:02:44 lr 0.000033 wd 0.0500 time 0.2385 (0.2452) data time 0.0009 (0.0023) model time 0.2376 (0.2418) loss 2.9383 (2.7201) grad_norm 4.4344 (5.9523) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [272/300][590/1251] eta 0:02:42 lr 0.000033 wd 0.0500 time 0.2475 (0.2452) data time 0.0009 (0.0023) model time 0.2466 (0.2417) loss 2.6898 (2.7149) grad_norm 4.8396 (5.9443) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][600/1251] eta 0:02:39 lr 0.000033 wd 0.0500 time 0.2442 (0.2451) data time 0.0010 (0.0023) model time 0.2432 (0.2417) loss 2.8943 (2.7118) grad_norm 3.9348 (5.9335) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][610/1251] eta 0:02:37 lr 0.000033 wd 0.0500 time 0.2487 (0.2451) data time 0.0007 (0.0022) model time 0.2480 (0.2417) loss 2.7949 (2.7119) grad_norm 4.8169 (5.9579) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][620/1251] eta 0:02:34 lr 0.000033 wd 0.0500 time 0.2494 (0.2450) data time 0.0011 (0.0022) model time 0.2483 (0.2417) loss 3.1155 (2.7146) grad_norm 3.9682 (6.0543) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][630/1251] eta 0:02:32 lr 0.000033 wd 0.0500 time 0.2413 (0.2450) data time 0.0011 (0.0022) model time 0.2402 (0.2417) loss 1.7433 (2.7130) grad_norm 5.4613 (6.0429) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][640/1251] eta 0:02:29 lr 0.000033 wd 0.0500 time 0.2339 (0.2449) data time 0.0010 (0.0022) model time 0.2328 (0.2417) loss 2.9181 (2.7145) grad_norm 4.4675 (6.0560) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][650/1251] eta 0:02:27 lr 0.000033 wd 0.0500 time 0.2402 (0.2449) data time 0.0010 (0.0022) model time 0.2393 (0.2417) loss 2.7159 (2.7146) grad_norm 3.8898 (6.0429) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-09-01 09:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][660/1251] eta 0:02:24 lr 0.000033 wd 0.0500 time 0.2411 (0.2448) data time 0.0008 (0.0021) model time 0.2403 (0.2416) loss 3.3024 (2.7124) grad_norm 4.3257 (6.0274) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][670/1251] eta 0:02:22 lr 0.000033 wd 0.0500 time 0.2498 (0.2448) data time 0.0007 (0.0021) model time 0.2491 (0.2416) loss 1.7092 (2.7074) grad_norm 4.3389 (6.0060) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][680/1251] eta 0:02:19 lr 0.000033 wd 0.0500 time 0.2398 (0.2448) data time 0.0008 (0.0021) model time 0.2390 (0.2416) loss 1.9320 (2.7069) grad_norm 4.1562 (6.0082) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][690/1251] eta 0:02:17 lr 0.000033 wd 0.0500 time 0.2524 (0.2448) data time 0.0009 (0.0021) model time 0.2516 (0.2417) loss 2.8444 (2.7027) grad_norm 4.2481 (5.9884) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][700/1251] eta 0:02:14 lr 0.000033 wd 0.0500 time 0.2470 (0.2447) data time 0.0009 (0.0021) model time 0.2461 (0.2417) loss 2.7998 (2.7011) grad_norm 3.9210 (5.9681) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][710/1251] eta 0:02:12 lr 0.000033 wd 0.0500 time 0.2403 (0.2447) data time 0.0007 (0.0021) model time 0.2396 (0.2416) loss 2.8400 (2.6976) grad_norm 3.3154 (5.9454) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][720/1251] eta 0:02:09 lr 0.000033 wd 0.0500 time 0.2379 (0.2446) data time 0.0009 (0.0020) model 
time 0.2370 (0.2416) loss 3.2749 (2.6985) grad_norm 7.1558 (5.9505) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][730/1251] eta 0:02:07 lr 0.000033 wd 0.0500 time 0.2420 (0.2446) data time 0.0011 (0.0020) model time 0.2409 (0.2416) loss 1.9512 (2.6963) grad_norm 4.2238 (5.9467) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][740/1251] eta 0:02:04 lr 0.000033 wd 0.0500 time 0.2445 (0.2446) data time 0.0008 (0.0020) model time 0.2437 (0.2416) loss 1.9973 (2.6965) grad_norm 4.9379 (5.9353) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][750/1251] eta 0:02:02 lr 0.000033 wd 0.0500 time 0.2410 (0.2445) data time 0.0008 (0.0020) model time 0.2402 (0.2416) loss 2.9883 (2.7011) grad_norm 9.4960 (5.9281) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][760/1251] eta 0:02:00 lr 0.000033 wd 0.0500 time 0.2435 (0.2445) data time 0.0009 (0.0020) model time 0.2426 (0.2416) loss 3.0325 (2.6998) grad_norm 3.6653 (5.9096) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][770/1251] eta 0:01:57 lr 0.000033 wd 0.0500 time 0.2456 (0.2445) data time 0.0009 (0.0020) model time 0.2447 (0.2416) loss 1.6974 (2.6978) grad_norm 4.6331 (5.8985) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][780/1251] eta 0:01:55 lr 0.000033 wd 0.0500 time 0.2361 (0.2444) data time 0.0011 (0.0020) model time 0.2350 (0.2415) loss 2.9787 (2.6934) grad_norm 3.9671 (5.8859) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][790/1251] 
eta 0:01:52 lr 0.000033 wd 0.0500 time 0.2420 (0.2444) data time 0.0010 (0.0019) model time 0.2410 (0.2415) loss 2.8557 (2.6942) grad_norm 3.8067 (5.8732) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][800/1251] eta 0:01:50 lr 0.000033 wd 0.0500 time 0.2468 (0.2443) data time 0.0008 (0.0019) model time 0.2460 (0.2415) loss 2.7456 (2.6963) grad_norm 6.0826 (5.8649) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][810/1251] eta 0:01:47 lr 0.000033 wd 0.0500 time 0.2408 (0.2443) data time 0.0008 (0.0019) model time 0.2401 (0.2415) loss 2.2552 (2.6950) grad_norm 6.3497 (5.8684) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][820/1251] eta 0:01:45 lr 0.000033 wd 0.0500 time 0.2448 (0.2443) data time 0.0011 (0.0019) model time 0.2437 (0.2415) loss 3.2230 (2.6979) grad_norm 5.1612 (5.8598) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][830/1251] eta 0:01:42 lr 0.000033 wd 0.0500 time 0.2329 (0.2442) data time 0.0011 (0.0019) model time 0.2318 (0.2415) loss 2.9890 (2.6979) grad_norm 3.9029 (5.8662) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][840/1251] eta 0:01:40 lr 0.000033 wd 0.0500 time 0.2478 (0.2442) data time 0.0007 (0.0019) model time 0.2471 (0.2414) loss 2.7487 (2.6967) grad_norm 4.9659 (5.8845) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][850/1251] eta 0:01:37 lr 0.000033 wd 0.0500 time 0.2399 (0.2441) data time 0.0007 (0.0019) model time 0.2391 (0.2414) loss 3.0321 (2.6977) grad_norm 5.2960 (5.8951) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 
09:34:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][860/1251] eta 0:01:35 lr 0.000033 wd 0.0500 time 0.2499 (0.2441) data time 0.0011 (0.0019) model time 0.2489 (0.2414) loss 2.6617 (2.6989) grad_norm 5.5528 (5.8902) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][870/1251] eta 0:01:33 lr 0.000033 wd 0.0500 time 0.2449 (0.2441) data time 0.0011 (0.0019) model time 0.2439 (0.2414) loss 2.9230 (2.7005) grad_norm 9.4575 (5.8977) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][880/1251] eta 0:01:30 lr 0.000033 wd 0.0500 time 0.2375 (0.2441) data time 0.0010 (0.0019) model time 0.2364 (0.2414) loss 2.8663 (2.7035) grad_norm 4.4321 (5.8960) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][890/1251] eta 0:01:28 lr 0.000033 wd 0.0500 time 0.2379 (0.2441) data time 0.0009 (0.0018) model time 0.2371 (0.2414) loss 2.5195 (2.7035) grad_norm 3.3725 (5.8966) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][900/1251] eta 0:01:25 lr 0.000033 wd 0.0500 time 0.2385 (0.2440) data time 0.0009 (0.0018) model time 0.2376 (0.2414) loss 2.8302 (2.7058) grad_norm 5.2355 (5.8890) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][910/1251] eta 0:01:23 lr 0.000033 wd 0.0500 time 0.2403 (0.2440) data time 0.0011 (0.0018) model time 0.2391 (0.2414) loss 3.0617 (2.7057) grad_norm 6.3773 (5.9017) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][920/1251] eta 0:01:20 lr 0.000033 wd 0.0500 time 0.2382 (0.2440) data time 0.0008 (0.0018) model time 0.2374 (0.2414) loss 3.3469 
(2.7064) grad_norm 5.4431 (5.9019) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][930/1251] eta 0:01:18 lr 0.000033 wd 0.0500 time 0.2423 (0.2439) data time 0.0009 (0.0018) model time 0.2414 (0.2414) loss 2.8887 (2.7064) grad_norm 8.1216 (5.8972) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][940/1251] eta 0:01:16 lr 0.000033 wd 0.0500 time 0.2450 (0.2446) data time 0.0009 (0.0018) model time 0.2441 (0.2421) loss 3.2147 (2.7030) grad_norm 12.1260 (5.8966) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][950/1251] eta 0:01:13 lr 0.000033 wd 0.0500 time 0.2395 (0.2445) data time 0.0011 (0.0018) model time 0.2384 (0.2420) loss 2.9001 (2.7043) grad_norm 6.3621 (5.8951) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][960/1251] eta 0:01:11 lr 0.000033 wd 0.0500 time 0.2339 (0.2445) data time 0.0009 (0.0018) model time 0.2331 (0.2420) loss 3.1683 (2.7059) grad_norm 4.9136 (5.8821) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][970/1251] eta 0:01:08 lr 0.000033 wd 0.0500 time 0.2401 (0.2445) data time 0.0007 (0.0018) model time 0.2394 (0.2420) loss 2.1413 (2.7055) grad_norm 11.5848 (5.8838) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][980/1251] eta 0:01:06 lr 0.000033 wd 0.0500 time 0.2431 (0.2444) data time 0.0011 (0.0018) model time 0.2421 (0.2420) loss 2.5352 (2.7064) grad_norm 6.0881 (5.8834) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][990/1251] eta 0:01:03 lr 0.000033 wd 
0.0500 time 0.4566 (0.2446) data time 0.0011 (0.0018) model time 0.4555 (0.2422) loss 2.6122 (2.7090) grad_norm 8.9912 (5.8847) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1000/1251] eta 0:01:01 lr 0.000033 wd 0.0500 time 0.2426 (0.2446) data time 0.0007 (0.0018) model time 0.2419 (0.2421) loss 2.6360 (2.7095) grad_norm 34.8103 (5.9121) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1010/1251] eta 0:00:58 lr 0.000033 wd 0.0500 time 0.2417 (0.2445) data time 0.0009 (0.0017) model time 0.2408 (0.2421) loss 2.9112 (2.7096) grad_norm 5.2902 (5.9099) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1020/1251] eta 0:00:56 lr 0.000033 wd 0.0500 time 0.2447 (0.2445) data time 0.0007 (0.0017) model time 0.2441 (0.2421) loss 3.0616 (2.7098) grad_norm 5.8795 (5.9206) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1030/1251] eta 0:00:54 lr 0.000033 wd 0.0500 time 0.2440 (0.2444) data time 0.0009 (0.0017) model time 0.2431 (0.2421) loss 2.8848 (2.7075) grad_norm 4.3327 (5.9176) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1040/1251] eta 0:00:51 lr 0.000033 wd 0.0500 time 0.2408 (0.2444) data time 0.0009 (0.0017) model time 0.2399 (0.2420) loss 2.5745 (2.7081) grad_norm 6.9884 (5.9183) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1050/1251] eta 0:00:49 lr 0.000033 wd 0.0500 time 0.2431 (0.2444) data time 0.0011 (0.0017) model time 0.2420 (0.2420) loss 2.9872 (2.7090) grad_norm 3.7605 (5.9102) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:51 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1060/1251] eta 0:00:46 lr 0.000033 wd 0.0500 time 0.2505 (0.2443) data time 0.0007 (0.0017) model time 0.2498 (0.2420) loss 2.6405 (2.7093) grad_norm 5.8600 (5.9011) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1070/1251] eta 0:00:44 lr 0.000033 wd 0.0500 time 0.2413 (0.2443) data time 0.0008 (0.0017) model time 0.2405 (0.2420) loss 3.3630 (2.7111) grad_norm 2.8764 (5.9301) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1080/1251] eta 0:00:41 lr 0.000033 wd 0.0500 time 0.2342 (0.2443) data time 0.0009 (0.0017) model time 0.2333 (0.2419) loss 2.7534 (2.7117) grad_norm 4.8975 (5.9200) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:34:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1090/1251] eta 0:00:39 lr 0.000033 wd 0.0500 time 0.2361 (0.2442) data time 0.0011 (0.0017) model time 0.2350 (0.2419) loss 3.2439 (2.7125) grad_norm 6.0952 (5.9128) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1100/1251] eta 0:00:36 lr 0.000033 wd 0.0500 time 0.2419 (0.2442) data time 0.0010 (0.0017) model time 0.2409 (0.2419) loss 2.6123 (2.7155) grad_norm 7.0576 (5.9089) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1110/1251] eta 0:00:34 lr 0.000033 wd 0.0500 time 0.2334 (0.2442) data time 0.0007 (0.0017) model time 0.2327 (0.2419) loss 3.4310 (2.7154) grad_norm 5.5800 (5.8975) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1120/1251] eta 0:00:31 lr 0.000033 wd 0.0500 time 0.2373 (0.2441) data time 0.0012 (0.0017) model time 0.2360 (0.2418) loss 2.8795 
(2.7150) grad_norm 4.4141 (5.8944) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1130/1251] eta 0:00:29 lr 0.000033 wd 0.0500 time 0.2456 (0.2441) data time 0.0010 (0.0017) model time 0.2447 (0.2418) loss 2.8402 (2.7161) grad_norm 3.6289 (5.8855) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1140/1251] eta 0:00:27 lr 0.000033 wd 0.0500 time 0.2425 (0.2441) data time 0.0009 (0.0017) model time 0.2417 (0.2418) loss 2.8138 (2.7160) grad_norm 5.6658 (5.8787) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1150/1251] eta 0:00:24 lr 0.000033 wd 0.0500 time 0.2435 (0.2441) data time 0.0009 (0.0017) model time 0.2425 (0.2418) loss 1.8817 (2.7151) grad_norm 6.8889 (5.8770) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1160/1251] eta 0:00:22 lr 0.000033 wd 0.0500 time 0.2321 (0.2440) data time 0.0009 (0.0017) model time 0.2313 (0.2418) loss 2.9194 (2.7150) grad_norm 4.4963 (5.8702) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1170/1251] eta 0:00:19 lr 0.000033 wd 0.0500 time 0.2397 (0.2440) data time 0.0009 (0.0016) model time 0.2388 (0.2418) loss 3.0639 (2.7148) grad_norm 5.8015 (5.8634) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1180/1251] eta 0:00:17 lr 0.000033 wd 0.0500 time 0.2371 (0.2440) data time 0.0007 (0.0016) model time 0.2364 (0.2417) loss 3.3414 (2.7141) grad_norm 4.8632 (5.8550) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1190/1251] eta 0:00:14 lr 0.000033 wd 
0.0500 time 0.2440 (0.2440) data time 0.0009 (0.0016) model time 0.2431 (0.2417) loss 2.5986 (2.7131) grad_norm 3.3820 (5.8475) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1200/1251] eta 0:00:12 lr 0.000033 wd 0.0500 time 0.2421 (0.2439) data time 0.0007 (0.0016) model time 0.2414 (0.2417) loss 2.8252 (2.7151) grad_norm 6.5333 (5.8467) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1210/1251] eta 0:00:09 lr 0.000033 wd 0.0500 time 0.2398 (0.2439) data time 0.0012 (0.0016) model time 0.2386 (0.2417) loss 2.6960 (2.7153) grad_norm 4.8543 (5.8433) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1220/1251] eta 0:00:07 lr 0.000033 wd 0.0500 time 0.2292 (0.2439) data time 0.0011 (0.0016) model time 0.2281 (0.2417) loss 3.1113 (2.7140) grad_norm 7.1464 (5.8579) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1230/1251] eta 0:00:05 lr 0.000033 wd 0.0500 time 0.2303 (0.2438) data time 0.0010 (0.0016) model time 0.2293 (0.2416) loss 2.8373 (2.7136) grad_norm 3.6815 (5.8508) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1240/1251] eta 0:00:02 lr 0.000033 wd 0.0500 time 0.2231 (0.2437) data time 0.0007 (0.0016) model time 0.2224 (0.2416) loss 2.7064 (2.7138) grad_norm 4.1433 (5.8523) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [272/300][1250/1251] eta 0:00:00 lr 0.000033 wd 0.0500 time 0.2298 (0.2436) data time 0.0009 (0.0016) model time 0.2289 (0.2414) loss 2.6376 (2.7136) grad_norm 3.5032 (5.8421) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:36 
msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 272 training takes 0:05:04 [2024-09-01 09:35:36 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:35:37 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 09:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.393 (0.393) Loss 0.4011 (0.4011) Acc@1 92.676 (92.676) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 09:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.106) Loss 0.5718 (0.6214) Acc@1 90.039 (87.429) Acc@5 98.242 (97.692) Mem 7381MB [2024-09-01 09:35:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.094) Loss 0.9351 (0.6528) Acc@1 78.027 (86.361) Acc@5 95.605 (97.633) Mem 7381MB [2024-09-01 09:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.088) Loss 1.1592 (0.7485) Acc@1 74.023 (84.136) Acc@5 92.676 (96.664) Mem 7381MB [2024-09-01 09:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0293 (0.7958) Acc@1 77.148 (83.039) Acc@5 93.945 (96.158) Mem 7381MB [2024-09-01 09:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.586 Acc@5 96.110 [2024-09-01 09:35:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 09:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.859 (0.859) Loss 0.3862 (0.3862) Acc@1 93.164 (93.164) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 09:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.090 (0.151) Loss 0.5654 (0.6041) Acc@1 90.430 (87.820) Acc@5 98.340 (97.852) Mem 7381MB [2024-09-01 09:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.118) 
Loss 0.9062 (0.6344) Acc@1 78.418 (86.672) Acc@5 95.801 (97.759) Mem 7381MB [2024-09-01 09:35:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.105) Loss 1.1289 (0.7260) Acc@1 74.512 (84.498) Acc@5 92.773 (96.828) Mem 7381MB [2024-09-01 09:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.095) Loss 1.0000 (0.7733) Acc@1 77.441 (83.322) Acc@5 94.141 (96.332) Mem 7381MB [2024-09-01 09:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.916 Acc@5 96.282 [2024-09-01 09:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 09:35:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][0/1251] eta 0:23:49 lr 0.000033 wd 0.0500 time 1.1431 (1.1431) data time 0.6986 (0.6986) model time 0.0000 (0.0000) loss 3.0373 (3.0373) grad_norm 4.6295 (4.6295) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][10/1251] eta 0:06:42 lr 0.000033 wd 0.0500 time 0.2548 (0.3244) data time 0.0009 (0.0645) model time 0.0000 (0.0000) loss 2.7887 (2.6996) grad_norm 4.6139 (4.9650) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][20/1251] eta 0:05:49 lr 0.000033 wd 0.0500 time 0.2375 (0.2838) data time 0.0010 (0.0343) model time 0.0000 (0.0000) loss 2.8935 (2.6961) grad_norm 4.6707 (5.3103) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][30/1251] eta 0:05:30 lr 0.000033 wd 0.0500 time 0.2403 (0.2703) data time 0.0007 (0.0235) model time 0.0000 (0.0000) loss 2.4881 (2.7498) grad_norm 4.5006 (5.6306) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][40/1251] eta 0:05:19 lr 0.000032 wd 0.0500 time 
0.2455 (0.2638) data time 0.0008 (0.0181) model time 0.0000 (0.0000) loss 3.2815 (2.7172) grad_norm 5.5013 (5.5556) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:35:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][50/1251] eta 0:05:11 lr 0.000032 wd 0.0500 time 0.2407 (0.2594) data time 0.0011 (0.0147) model time 0.0000 (0.0000) loss 2.7374 (2.7156) grad_norm 6.9783 (5.6993) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][60/1251] eta 0:05:05 lr 0.000032 wd 0.0500 time 0.2368 (0.2564) data time 0.0009 (0.0125) model time 0.2359 (0.2400) loss 2.7115 (2.6961) grad_norm 6.5209 (5.6830) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][70/1251] eta 0:05:00 lr 0.000032 wd 0.0500 time 0.2449 (0.2545) data time 0.0007 (0.0109) model time 0.2442 (0.2410) loss 2.5960 (2.6792) grad_norm 4.8560 (5.6207) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][80/1251] eta 0:04:55 lr 0.000032 wd 0.0500 time 0.2381 (0.2527) data time 0.0009 (0.0096) model time 0.2372 (0.2404) loss 2.9525 (2.6741) grad_norm 4.5583 (5.5486) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][90/1251] eta 0:04:51 lr 0.000032 wd 0.0500 time 0.2327 (0.2514) data time 0.0008 (0.0087) model time 0.2319 (0.2401) loss 2.9237 (2.6764) grad_norm 5.8637 (5.5538) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][100/1251] eta 0:04:48 lr 0.000032 wd 0.0500 time 0.2438 (0.2503) data time 0.0009 (0.0079) model time 0.2429 (0.2401) loss 2.4521 (2.6985) grad_norm 5.2374 (5.5258) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:13 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [273/300][110/1251] eta 0:04:46 lr 0.000032 wd 0.0500 time 0.2389 (0.2512) data time 0.0009 (0.0073) model time 0.2380 (0.2432) loss 2.8684 (2.6797) grad_norm 4.5778 (5.5375) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][120/1251] eta 0:04:43 lr 0.000032 wd 0.0500 time 0.2449 (0.2505) data time 0.0011 (0.0068) model time 0.2438 (0.2431) loss 2.5850 (2.7164) grad_norm 4.0611 (5.5308) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][130/1251] eta 0:04:40 lr 0.000032 wd 0.0500 time 0.2360 (0.2499) data time 0.0009 (0.0063) model time 0.2350 (0.2428) loss 2.9028 (2.7280) grad_norm 5.0873 (5.5365) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][140/1251] eta 0:04:36 lr 0.000032 wd 0.0500 time 0.2450 (0.2493) data time 0.0011 (0.0060) model time 0.2438 (0.2426) loss 3.4902 (2.7425) grad_norm 5.5045 (5.6927) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][150/1251] eta 0:04:33 lr 0.000032 wd 0.0500 time 0.2413 (0.2487) data time 0.0011 (0.0056) model time 0.2403 (0.2423) loss 2.6038 (2.7449) grad_norm 4.3334 (5.6693) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][160/1251] eta 0:04:30 lr 0.000032 wd 0.0500 time 0.2358 (0.2483) data time 0.0008 (0.0054) model time 0.2350 (0.2422) loss 1.6757 (2.7314) grad_norm 3.4946 (5.6441) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][170/1251] eta 0:04:28 lr 0.000032 wd 0.0500 time 0.2435 (0.2480) data time 0.0009 (0.0051) model time 0.2426 (0.2421) loss 1.7839 (2.7389) grad_norm 7.4969 
(5.6164) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][180/1251] eta 0:04:25 lr 0.000032 wd 0.0500 time 0.2358 (0.2476) data time 0.0008 (0.0049) model time 0.2350 (0.2420) loss 1.9791 (2.7388) grad_norm 4.8494 (5.6156) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][190/1251] eta 0:04:22 lr 0.000032 wd 0.0500 time 0.2472 (0.2474) data time 0.0008 (0.0047) model time 0.2465 (0.2420) loss 2.6191 (2.7306) grad_norm 5.8962 (5.6436) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][200/1251] eta 0:04:19 lr 0.000032 wd 0.0500 time 0.2382 (0.2470) data time 0.0007 (0.0045) model time 0.2375 (0.2417) loss 2.9638 (2.7352) grad_norm 6.1556 (5.6135) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][210/1251] eta 0:04:16 lr 0.000032 wd 0.0500 time 0.2335 (0.2466) data time 0.0008 (0.0043) model time 0.2328 (0.2416) loss 3.3661 (2.7354) grad_norm 3.4727 (5.5691) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][220/1251] eta 0:04:13 lr 0.000032 wd 0.0500 time 0.2414 (0.2463) data time 0.0011 (0.0042) model time 0.2403 (0.2414) loss 2.2927 (2.7327) grad_norm 9.0198 (5.6684) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][230/1251] eta 0:04:11 lr 0.000032 wd 0.0500 time 0.2410 (0.2459) data time 0.0008 (0.0040) model time 0.2401 (0.2411) loss 2.9342 (2.7381) grad_norm 6.8108 (5.6543) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][240/1251] eta 0:04:08 lr 0.000032 wd 0.0500 time 0.2459 (0.2457) data 
time 0.0008 (0.0039) model time 0.2451 (0.2410) loss 3.3799 (2.7277) grad_norm 7.5301 (5.6512) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][250/1251] eta 0:04:05 lr 0.000032 wd 0.0500 time 0.2380 (0.2455) data time 0.0007 (0.0038) model time 0.2374 (0.2410) loss 2.0002 (2.7332) grad_norm 5.0143 (5.6711) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][260/1251] eta 0:04:03 lr 0.000032 wd 0.0500 time 0.2415 (0.2453) data time 0.0009 (0.0037) model time 0.2406 (0.2409) loss 2.0752 (2.7276) grad_norm 13.6044 (5.7178) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][270/1251] eta 0:04:00 lr 0.000032 wd 0.0500 time 0.2340 (0.2451) data time 0.0009 (0.0036) model time 0.2331 (0.2408) loss 1.8114 (2.7165) grad_norm 5.6274 (5.6974) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][280/1251] eta 0:03:57 lr 0.000032 wd 0.0500 time 0.2373 (0.2450) data time 0.0011 (0.0035) model time 0.2362 (0.2408) loss 2.5381 (2.7170) grad_norm 5.0811 (5.7159) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][290/1251] eta 0:03:55 lr 0.000032 wd 0.0500 time 0.2424 (0.2447) data time 0.0008 (0.0034) model time 0.2415 (0.2406) loss 3.0869 (2.7200) grad_norm 5.3466 (5.8248) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:36:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][300/1251] eta 0:03:52 lr 0.000032 wd 0.0500 time 0.2358 (0.2445) data time 0.0009 (0.0033) model time 0.2349 (0.2405) loss 2.9720 (2.7210) grad_norm 4.3406 (5.7946) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [273/300][310/1251] eta 0:03:50 lr 0.000032 wd 0.0500 time 0.2545 (0.2445) data time 0.0009 (0.0032) model time 0.2536 (0.2406) loss 2.8970 (2.7201) grad_norm 3.9464 (5.7601) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][320/1251] eta 0:03:47 lr 0.000032 wd 0.0500 time 0.2458 (0.2444) data time 0.0009 (0.0032) model time 0.2449 (0.2406) loss 2.3303 (2.7180) grad_norm 5.3607 (5.7476) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][330/1251] eta 0:03:45 lr 0.000032 wd 0.0500 time 0.2420 (0.2444) data time 0.0008 (0.0031) model time 0.2413 (0.2407) loss 3.1653 (2.7285) grad_norm 4.7163 (5.7237) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][340/1251] eta 0:03:42 lr 0.000032 wd 0.0500 time 0.2393 (0.2442) data time 0.0010 (0.0030) model time 0.2383 (0.2406) loss 3.2000 (2.7324) grad_norm 5.2711 (5.7726) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][350/1251] eta 0:03:39 lr 0.000032 wd 0.0500 time 0.2396 (0.2441) data time 0.0007 (0.0030) model time 0.2389 (0.2405) loss 2.8800 (2.7310) grad_norm 4.1435 (5.7575) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][360/1251] eta 0:03:37 lr 0.000032 wd 0.0500 time 0.2419 (0.2440) data time 0.0009 (0.0029) model time 0.2409 (0.2405) loss 2.8641 (2.7332) grad_norm 3.7636 (5.7567) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][370/1251] eta 0:03:34 lr 0.000032 wd 0.0500 time 0.2372 (0.2439) data time 0.0010 (0.0029) model time 0.2362 (0.2405) loss 2.6403 (2.7359) grad_norm 5.7311 (5.7254) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-09-01 09:37:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][380/1251] eta 0:03:32 lr 0.000032 wd 0.0500 time 0.4585 (0.2444) data time 0.0009 (0.0028) model time 0.4576 (0.2412) loss 1.8663 (2.7346) grad_norm 4.0698 (5.7141) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][390/1251] eta 0:03:31 lr 0.000032 wd 0.0500 time 0.2364 (0.2459) data time 0.0010 (0.0028) model time 0.2354 (0.2429) loss 3.1395 (2.7357) grad_norm 6.9138 (5.7069) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][400/1251] eta 0:03:29 lr 0.000032 wd 0.0500 time 0.2405 (0.2458) data time 0.0008 (0.0027) model time 0.2397 (0.2429) loss 3.1556 (2.7340) grad_norm 5.5913 (5.6937) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][410/1251] eta 0:03:26 lr 0.000032 wd 0.0500 time 0.2424 (0.2457) data time 0.0009 (0.0027) model time 0.2415 (0.2428) loss 2.8433 (2.7297) grad_norm 7.3673 (5.7138) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][420/1251] eta 0:03:24 lr 0.000032 wd 0.0500 time 0.2405 (0.2455) data time 0.0009 (0.0027) model time 0.2396 (0.2426) loss 3.0028 (2.7297) grad_norm 6.4815 (5.7283) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][430/1251] eta 0:03:21 lr 0.000032 wd 0.0500 time 0.2381 (0.2454) data time 0.0008 (0.0026) model time 0.2373 (0.2425) loss 3.3529 (2.7354) grad_norm 3.8289 (5.7152) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][440/1251] eta 0:03:18 lr 0.000032 wd 0.0500 time 0.2372 (0.2453) data time 0.0011 (0.0026) model 
time 0.2361 (0.2425) loss 2.0769 (2.7355) grad_norm 6.7809 (5.7257) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][450/1251] eta 0:03:16 lr 0.000032 wd 0.0500 time 0.2392 (0.2452) data time 0.0009 (0.0025) model time 0.2383 (0.2424) loss 2.6301 (2.7386) grad_norm 4.7019 (5.7341) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][460/1251] eta 0:03:13 lr 0.000032 wd 0.0500 time 0.2528 (0.2452) data time 0.0008 (0.0025) model time 0.2521 (0.2424) loss 2.8405 (2.7398) grad_norm 4.4544 (5.7790) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][470/1251] eta 0:03:11 lr 0.000032 wd 0.0500 time 0.2380 (0.2451) data time 0.0008 (0.0025) model time 0.2372 (0.2424) loss 3.3829 (2.7358) grad_norm 5.7621 (5.7734) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][480/1251] eta 0:03:08 lr 0.000032 wd 0.0500 time 0.2467 (0.2451) data time 0.0009 (0.0024) model time 0.2458 (0.2424) loss 2.9036 (2.7342) grad_norm 5.2015 (5.7857) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][490/1251] eta 0:03:06 lr 0.000032 wd 0.0500 time 0.2484 (0.2450) data time 0.0008 (0.0024) model time 0.2476 (0.2424) loss 3.0858 (2.7350) grad_norm 5.8236 (5.7732) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][500/1251] eta 0:03:03 lr 0.000032 wd 0.0500 time 0.2385 (0.2449) data time 0.0011 (0.0024) model time 0.2374 (0.2423) loss 2.7936 (2.7335) grad_norm 4.9482 (5.7684) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][510/1251] 
eta 0:03:01 lr 0.000032 wd 0.0500 time 0.2456 (0.2449) data time 0.0009 (0.0024) model time 0.2447 (0.2423) loss 2.8062 (2.7344) grad_norm 5.2910 (5.7625) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][520/1251] eta 0:02:58 lr 0.000032 wd 0.0500 time 0.2455 (0.2448) data time 0.0007 (0.0023) model time 0.2448 (0.2422) loss 3.0101 (2.7363) grad_norm 6.3766 (5.7731) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][530/1251] eta 0:02:56 lr 0.000032 wd 0.0500 time 0.2340 (0.2447) data time 0.0010 (0.0023) model time 0.2330 (0.2422) loss 3.2797 (2.7395) grad_norm 4.0485 (5.7596) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][540/1251] eta 0:02:53 lr 0.000032 wd 0.0500 time 0.2398 (0.2447) data time 0.0009 (0.0023) model time 0.2389 (0.2421) loss 1.7565 (2.7407) grad_norm 8.0169 (5.7547) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][550/1251] eta 0:02:51 lr 0.000032 wd 0.0500 time 0.2444 (0.2446) data time 0.0011 (0.0023) model time 0.2433 (0.2420) loss 2.0456 (2.7341) grad_norm 4.2778 (5.7403) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 09:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][560/1251] eta 0:02:48 lr 0.000032 wd 0.0500 time 0.2371 (0.2444) data time 0.0010 (0.0022) model time 0.2360 (0.2419) loss 2.8151 (2.7320) grad_norm 4.5248 (5.7246) loss_scale 1024.0000 (512.9127) mem 7381MB [2024-09-01 09:38:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][570/1251] eta 0:02:46 lr 0.000032 wd 0.0500 time 0.2387 (0.2443) data time 0.0007 (0.0022) model time 0.2380 (0.2419) loss 2.8251 (2.7323) grad_norm 3.1394 (5.7198) loss_scale 1024.0000 (521.8634) mem 7381MB [2024-09-01 
09:38:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][580/1251] eta 0:02:43 lr 0.000032 wd 0.0500 time 0.2432 (0.2443) data time 0.0010 (0.0022) model time 0.2421 (0.2418) loss 2.7961 (2.7338) grad_norm 5.2116 (5.7356) loss_scale 1024.0000 (530.5060) mem 7381MB [2024-09-01 09:38:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][590/1251] eta 0:02:41 lr 0.000032 wd 0.0500 time 0.2567 (0.2446) data time 0.0009 (0.0022) model time 0.2557 (0.2422) loss 2.9743 (2.7363) grad_norm 7.3436 (5.7283) loss_scale 1024.0000 (538.8562) mem 7381MB [2024-09-01 09:38:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][600/1251] eta 0:02:39 lr 0.000032 wd 0.0500 time 0.2437 (0.2445) data time 0.0010 (0.0022) model time 0.2427 (0.2421) loss 2.9487 (2.7372) grad_norm 5.0188 (5.7270) loss_scale 1024.0000 (546.9285) mem 7381MB [2024-09-01 09:38:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][610/1251] eta 0:02:36 lr 0.000032 wd 0.0500 time 0.2423 (0.2445) data time 0.0007 (0.0021) model time 0.2416 (0.2421) loss 3.0984 (2.7356) grad_norm 4.3188 (5.7351) loss_scale 1024.0000 (554.7365) mem 7381MB [2024-09-01 09:38:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][620/1251] eta 0:02:34 lr 0.000032 wd 0.0500 time 0.2424 (0.2444) data time 0.0010 (0.0021) model time 0.2414 (0.2420) loss 2.9021 (2.7384) grad_norm 4.1538 (5.7356) loss_scale 1024.0000 (562.2931) mem 7381MB [2024-09-01 09:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][630/1251] eta 0:02:31 lr 0.000032 wd 0.0500 time 0.2366 (0.2443) data time 0.0009 (0.0021) model time 0.2357 (0.2420) loss 2.7244 (2.7387) grad_norm 7.1905 (5.7385) loss_scale 1024.0000 (569.6101) mem 7381MB [2024-09-01 09:38:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][640/1251] eta 0:02:29 lr 0.000032 wd 0.0500 time 0.2505 (0.2443) data time 0.0010 (0.0021) model time 0.2495 (0.2420) loss 
2.0469 (2.7355) grad_norm 3.8162 (5.7294) loss_scale 1024.0000 (576.6989) mem 7381MB [2024-09-01 09:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][650/1251] eta 0:02:26 lr 0.000032 wd 0.0500 time 0.2436 (0.2442) data time 0.0007 (0.0021) model time 0.2429 (0.2419) loss 2.5149 (2.7354) grad_norm 8.8647 (5.7496) loss_scale 1024.0000 (583.5699) mem 7381MB [2024-09-01 09:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][660/1251] eta 0:02:24 lr 0.000032 wd 0.0500 time 0.2330 (0.2441) data time 0.0008 (0.0021) model time 0.2322 (0.2419) loss 2.6793 (2.7344) grad_norm 8.5924 (inf) loss_scale 512.0000 (582.4871) mem 7381MB [2024-09-01 09:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][670/1251] eta 0:02:21 lr 0.000032 wd 0.0500 time 0.2387 (0.2441) data time 0.0011 (0.0020) model time 0.2376 (0.2418) loss 2.6706 (2.7374) grad_norm 3.3309 (inf) loss_scale 512.0000 (581.4367) mem 7381MB [2024-09-01 09:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][680/1251] eta 0:02:19 lr 0.000032 wd 0.0500 time 0.2400 (0.2440) data time 0.0009 (0.0020) model time 0.2391 (0.2418) loss 2.9195 (2.7390) grad_norm 3.6651 (inf) loss_scale 512.0000 (580.4170) mem 7381MB [2024-09-01 09:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][690/1251] eta 0:02:16 lr 0.000032 wd 0.0500 time 0.2463 (0.2440) data time 0.0009 (0.0020) model time 0.2454 (0.2418) loss 2.8309 (2.7350) grad_norm 5.3744 (inf) loss_scale 512.0000 (579.4269) mem 7381MB [2024-09-01 09:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][700/1251] eta 0:02:14 lr 0.000032 wd 0.0500 time 0.2361 (0.2440) data time 0.0010 (0.0020) model time 0.2351 (0.2417) loss 2.3473 (2.7341) grad_norm 5.0430 (inf) loss_scale 512.0000 (578.4650) mem 7381MB [2024-09-01 09:38:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][710/1251] eta 0:02:11 lr 0.000032 wd 0.0500 time 
0.2392 (0.2439) data time 0.0010 (0.0020) model time 0.2382 (0.2417) loss 3.1965 (2.7337) grad_norm 4.2161 (inf) loss_scale 512.0000 (577.5302) mem 7381MB [2024-09-01 09:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][720/1251] eta 0:02:09 lr 0.000032 wd 0.0500 time 0.2383 (0.2439) data time 0.0010 (0.0020) model time 0.2372 (0.2417) loss 2.9111 (2.7345) grad_norm 4.5606 (inf) loss_scale 512.0000 (576.6214) mem 7381MB [2024-09-01 09:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][730/1251] eta 0:02:07 lr 0.000032 wd 0.0500 time 0.2408 (0.2439) data time 0.0007 (0.0019) model time 0.2402 (0.2417) loss 2.7938 (2.7331) grad_norm 4.3905 (inf) loss_scale 512.0000 (575.7373) mem 7381MB [2024-09-01 09:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][740/1251] eta 0:02:04 lr 0.000032 wd 0.0500 time 0.2418 (0.2439) data time 0.0010 (0.0019) model time 0.2407 (0.2417) loss 2.6934 (2.7335) grad_norm 6.3031 (inf) loss_scale 512.0000 (574.8772) mem 7381MB [2024-09-01 09:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][750/1251] eta 0:02:02 lr 0.000032 wd 0.0500 time 0.2368 (0.2438) data time 0.0009 (0.0019) model time 0.2359 (0.2417) loss 3.0625 (2.7360) grad_norm 3.8971 (inf) loss_scale 512.0000 (574.0399) mem 7381MB [2024-09-01 09:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][760/1251] eta 0:01:59 lr 0.000032 wd 0.0500 time 0.2357 (0.2438) data time 0.0009 (0.0019) model time 0.2348 (0.2417) loss 2.8860 (2.7361) grad_norm 5.7517 (inf) loss_scale 512.0000 (573.2247) mem 7381MB [2024-09-01 09:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][770/1251] eta 0:01:57 lr 0.000032 wd 0.0500 time 0.2413 (0.2438) data time 0.0008 (0.0019) model time 0.2405 (0.2417) loss 2.6606 (2.7365) grad_norm 51.0892 (inf) loss_scale 512.0000 (572.4306) mem 7381MB [2024-09-01 09:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [273/300][780/1251] eta 0:01:54 lr 0.000032 wd 0.0500 time 0.2424 (0.2438) data time 0.0010 (0.0019) model time 0.2414 (0.2417) loss 2.5737 (2.7385) grad_norm 4.3133 (inf) loss_scale 512.0000 (571.6569) mem 7381MB [2024-09-01 09:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][790/1251] eta 0:01:52 lr 0.000032 wd 0.0500 time 0.2391 (0.2437) data time 0.0007 (0.0019) model time 0.2384 (0.2417) loss 3.0870 (2.7407) grad_norm 4.3907 (inf) loss_scale 512.0000 (570.9027) mem 7381MB [2024-09-01 09:39:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][800/1251] eta 0:01:49 lr 0.000031 wd 0.0500 time 0.2419 (0.2437) data time 0.0008 (0.0019) model time 0.2411 (0.2417) loss 2.8938 (2.7393) grad_norm 5.6826 (inf) loss_scale 512.0000 (570.1673) mem 7381MB [2024-09-01 09:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][810/1251] eta 0:01:47 lr 0.000031 wd 0.0500 time 0.2393 (0.2437) data time 0.0009 (0.0019) model time 0.2384 (0.2416) loss 1.9589 (2.7380) grad_norm 5.2573 (inf) loss_scale 512.0000 (569.4501) mem 7381MB [2024-09-01 09:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][820/1251] eta 0:01:45 lr 0.000031 wd 0.0500 time 0.2473 (0.2437) data time 0.0010 (0.0018) model time 0.2462 (0.2416) loss 2.4372 (2.7391) grad_norm 4.1513 (inf) loss_scale 512.0000 (568.7503) mem 7381MB [2024-09-01 09:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][830/1251] eta 0:01:42 lr 0.000031 wd 0.0500 time 0.2448 (0.2437) data time 0.0007 (0.0018) model time 0.2441 (0.2417) loss 1.8664 (2.7362) grad_norm 4.5702 (inf) loss_scale 512.0000 (568.0674) mem 7381MB [2024-09-01 09:39:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][840/1251] eta 0:01:40 lr 0.000031 wd 0.0500 time 0.2318 (0.2436) data time 0.0008 (0.0018) model time 0.2310 (0.2416) loss 3.4753 (2.7368) grad_norm 4.5284 (inf) loss_scale 512.0000 (567.4007) mem 7381MB 
[2024-09-01 09:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][850/1251] eta 0:01:37 lr 0.000031 wd 0.0500 time 0.2494 (0.2436) data time 0.0008 (0.0018) model time 0.2486 (0.2416) loss 1.7348 (2.7361) grad_norm 3.3432 (inf) loss_scale 512.0000 (566.7497) mem 7381MB [2024-09-01 09:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][860/1251] eta 0:01:35 lr 0.000031 wd 0.0500 time 0.2471 (0.2436) data time 0.0011 (0.0018) model time 0.2460 (0.2416) loss 2.4401 (2.7332) grad_norm 3.6306 (inf) loss_scale 512.0000 (566.1138) mem 7381MB [2024-09-01 09:39:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][870/1251] eta 0:01:32 lr 0.000031 wd 0.0500 time 0.2444 (0.2436) data time 0.0007 (0.0018) model time 0.2437 (0.2416) loss 2.2951 (2.7291) grad_norm 4.5041 (inf) loss_scale 256.0000 (564.0230) mem 7381MB [2024-09-01 09:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][880/1251] eta 0:01:30 lr 0.000031 wd 0.0500 time 0.2338 (0.2436) data time 0.0010 (0.0018) model time 0.2329 (0.2416) loss 1.9950 (2.7261) grad_norm 7.6707 (inf) loss_scale 256.0000 (560.5267) mem 7381MB [2024-09-01 09:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][890/1251] eta 0:01:27 lr 0.000031 wd 0.0500 time 0.2433 (0.2435) data time 0.0007 (0.0018) model time 0.2426 (0.2416) loss 1.6580 (2.7253) grad_norm 4.9439 (inf) loss_scale 256.0000 (557.1089) mem 7381MB [2024-09-01 09:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][900/1251] eta 0:01:25 lr 0.000031 wd 0.0500 time 0.2472 (0.2435) data time 0.0007 (0.0018) model time 0.2465 (0.2416) loss 3.1603 (2.7273) grad_norm 3.8032 (inf) loss_scale 256.0000 (553.7669) mem 7381MB [2024-09-01 09:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][910/1251] eta 0:01:23 lr 0.000031 wd 0.0500 time 0.2419 (0.2435) data time 0.0010 (0.0018) model time 0.2409 (0.2416) loss 2.8194 (2.7271) 
grad_norm 4.6134 (inf) loss_scale 256.0000 (550.4984) mem 7381MB [2024-09-01 09:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][920/1251] eta 0:01:20 lr 0.000031 wd 0.0500 time 0.4616 (0.2440) data time 0.0010 (0.0018) model time 0.4606 (0.2421) loss 3.3145 (2.7293) grad_norm 4.6026 (inf) loss_scale 256.0000 (547.3008) mem 7381MB [2024-09-01 09:39:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][930/1251] eta 0:01:18 lr 0.000031 wd 0.0500 time 0.2379 (0.2444) data time 0.0008 (0.0017) model time 0.2371 (0.2425) loss 2.0802 (2.7301) grad_norm 4.2597 (inf) loss_scale 256.0000 (544.1719) mem 7381MB [2024-09-01 09:39:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][940/1251] eta 0:01:15 lr 0.000031 wd 0.0500 time 0.2297 (0.2443) data time 0.0010 (0.0017) model time 0.2286 (0.2425) loss 1.9969 (2.7292) grad_norm 3.4763 (inf) loss_scale 256.0000 (541.1095) mem 7381MB [2024-09-01 09:39:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][950/1251] eta 0:01:13 lr 0.000031 wd 0.0500 time 0.2408 (0.2443) data time 0.0009 (0.0017) model time 0.2398 (0.2425) loss 2.2549 (2.7263) grad_norm 4.6187 (inf) loss_scale 256.0000 (538.1115) mem 7381MB [2024-09-01 09:39:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][960/1251] eta 0:01:11 lr 0.000031 wd 0.0500 time 0.2412 (0.2443) data time 0.0008 (0.0017) model time 0.2404 (0.2425) loss 3.0347 (2.7277) grad_norm 7.7729 (inf) loss_scale 256.0000 (535.1759) mem 7381MB [2024-09-01 09:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][970/1251] eta 0:01:08 lr 0.000031 wd 0.0500 time 0.2477 (0.2443) data time 0.0010 (0.0017) model time 0.2467 (0.2425) loss 2.8276 (2.7277) grad_norm 4.5243 (inf) loss_scale 256.0000 (532.3007) mem 7381MB [2024-09-01 09:39:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][980/1251] eta 0:01:06 lr 0.000031 wd 0.0500 time 0.2458 (0.2443) data 
time 0.0011 (0.0017) model time 0.2448 (0.2425) loss 2.9395 (2.7263) grad_norm 5.5604 (inf) loss_scale 256.0000 (529.4842) mem 7381MB [2024-09-01 09:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][990/1251] eta 0:01:03 lr 0.000031 wd 0.0500 time 0.2439 (0.2443) data time 0.0007 (0.0017) model time 0.2431 (0.2425) loss 2.4036 (2.7232) grad_norm 3.9295 (inf) loss_scale 256.0000 (526.7245) mem 7381MB [2024-09-01 09:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1000/1251] eta 0:01:01 lr 0.000031 wd 0.0500 time 0.2370 (0.2442) data time 0.0007 (0.0017) model time 0.2362 (0.2424) loss 2.5700 (2.7243) grad_norm 3.8673 (inf) loss_scale 256.0000 (524.0200) mem 7381MB [2024-09-01 09:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1010/1251] eta 0:00:58 lr 0.000031 wd 0.0500 time 0.2359 (0.2442) data time 0.0009 (0.0017) model time 0.2349 (0.2424) loss 3.2248 (2.7227) grad_norm 6.3682 (inf) loss_scale 256.0000 (521.3689) mem 7381MB [2024-09-01 09:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1020/1251] eta 0:00:56 lr 0.000031 wd 0.0500 time 0.2363 (0.2442) data time 0.0007 (0.0017) model time 0.2356 (0.2424) loss 2.0865 (2.7214) grad_norm 14.2419 (inf) loss_scale 256.0000 (518.7698) mem 7381MB [2024-09-01 09:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1030/1251] eta 0:00:53 lr 0.000031 wd 0.0500 time 0.2324 (0.2442) data time 0.0009 (0.0017) model time 0.2315 (0.2424) loss 2.0886 (2.7201) grad_norm 5.3400 (inf) loss_scale 256.0000 (516.2211) mem 7381MB [2024-09-01 09:40:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1040/1251] eta 0:00:51 lr 0.000031 wd 0.0500 time 0.2444 (0.2443) data time 0.0007 (0.0017) model time 0.2437 (0.2426) loss 3.1906 (2.7212) grad_norm 4.5458 (inf) loss_scale 256.0000 (513.7214) mem 7381MB [2024-09-01 09:40:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[273/300][1050/1251] eta 0:00:49 lr 0.000031 wd 0.0500 time 0.2314 (0.2443) data time 0.0015 (0.0017) model time 0.2299 (0.2426) loss 2.8716 (2.7199) grad_norm 5.4983 (inf) loss_scale 256.0000 (511.2693) mem 7381MB [2024-09-01 09:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1060/1251] eta 0:00:46 lr 0.000031 wd 0.0500 time 0.2433 (0.2443) data time 0.0011 (0.0017) model time 0.2422 (0.2426) loss 2.9197 (2.7193) grad_norm 23.7898 (inf) loss_scale 256.0000 (508.8633) mem 7381MB [2024-09-01 09:40:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1070/1251] eta 0:00:44 lr 0.000031 wd 0.0500 time 0.2426 (0.2443) data time 0.0007 (0.0016) model time 0.2419 (0.2425) loss 3.3854 (2.7196) grad_norm 6.4028 (inf) loss_scale 256.0000 (506.5023) mem 7381MB [2024-09-01 09:40:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1080/1251] eta 0:00:41 lr 0.000031 wd 0.0500 time 0.2355 (0.2443) data time 0.0008 (0.0016) model time 0.2347 (0.2425) loss 1.7969 (2.7186) grad_norm 4.7962 (inf) loss_scale 256.0000 (504.1850) mem 7381MB [2024-09-01 09:40:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1090/1251] eta 0:00:39 lr 0.000031 wd 0.0500 time 0.2433 (0.2442) data time 0.0007 (0.0016) model time 0.2426 (0.2425) loss 2.8744 (2.7216) grad_norm 4.6202 (inf) loss_scale 256.0000 (501.9102) mem 7381MB [2024-09-01 09:40:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1100/1251] eta 0:00:36 lr 0.000031 wd 0.0500 time 0.2409 (0.2442) data time 0.0009 (0.0016) model time 0.2400 (0.2424) loss 1.9785 (2.7201) grad_norm 7.5820 (inf) loss_scale 256.0000 (499.6767) mem 7381MB [2024-09-01 09:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1110/1251] eta 0:00:34 lr 0.000031 wd 0.0500 time 0.2381 (0.2442) data time 0.0008 (0.0016) model time 0.2373 (0.2424) loss 2.1928 (2.7182) grad_norm 4.1547 (inf) loss_scale 256.0000 (497.4833) mem 7381MB 
[2024-09-01 09:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1120/1251] eta 0:00:31 lr 0.000031 wd 0.0500 time 0.2417 (0.2442) data time 0.0010 (0.0016) model time 0.2407 (0.2424) loss 2.6360 (2.7171) grad_norm 19.4421 (inf) loss_scale 256.0000 (495.3292) mem 7381MB [2024-09-01 09:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1130/1251] eta 0:00:29 lr 0.000031 wd 0.0500 time 0.2441 (0.2443) data time 0.0010 (0.0016) model time 0.2431 (0.2426) loss 2.3144 (2.7162) grad_norm 6.0392 (inf) loss_scale 256.0000 (493.2131) mem 7381MB [2024-09-01 09:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1140/1251] eta 0:00:27 lr 0.000031 wd 0.0500 time 0.2448 (0.2443) data time 0.0008 (0.0016) model time 0.2440 (0.2426) loss 2.4022 (2.7173) grad_norm 6.3502 (inf) loss_scale 256.0000 (491.1341) mem 7381MB [2024-09-01 09:40:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1150/1251] eta 0:00:24 lr 0.000031 wd 0.0500 time 0.2380 (0.2443) data time 0.0008 (0.0016) model time 0.2372 (0.2426) loss 3.1190 (2.7183) grad_norm 5.8653 (inf) loss_scale 256.0000 (489.0912) mem 7381MB [2024-09-01 09:40:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1160/1251] eta 0:00:22 lr 0.000031 wd 0.0500 time 0.2354 (0.2443) data time 0.0009 (0.0016) model time 0.2346 (0.2426) loss 2.7353 (2.7178) grad_norm 5.4148 (inf) loss_scale 256.0000 (487.0835) mem 7381MB [2024-09-01 09:40:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1170/1251] eta 0:00:19 lr 0.000031 wd 0.0500 time 0.2401 (0.2443) data time 0.0014 (0.0016) model time 0.2387 (0.2426) loss 2.3794 (2.7156) grad_norm 5.7506 (inf) loss_scale 256.0000 (485.1102) mem 7381MB [2024-09-01 09:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1180/1251] eta 0:00:17 lr 0.000031 wd 0.0500 time 0.2360 (0.2443) data time 0.0009 (0.0016) model time 0.2351 (0.2426) loss 2.6514 
(2.7140) grad_norm 5.6966 (inf) loss_scale 256.0000 (483.1702) mem 7381MB [2024-09-01 09:40:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1190/1251] eta 0:00:14 lr 0.000031 wd 0.0500 time 0.2463 (0.2443) data time 0.0009 (0.0016) model time 0.2453 (0.2426) loss 2.6922 (2.7150) grad_norm 15.5954 (inf) loss_scale 256.0000 (481.2628) mem 7381MB [2024-09-01 09:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1200/1251] eta 0:00:12 lr 0.000031 wd 0.0500 time 0.2416 (0.2442) data time 0.0009 (0.0016) model time 0.2407 (0.2425) loss 3.1938 (2.7169) grad_norm 3.7427 (inf) loss_scale 256.0000 (479.3872) mem 7381MB [2024-09-01 09:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1210/1251] eta 0:00:10 lr 0.000031 wd 0.0500 time 0.2375 (0.2442) data time 0.0007 (0.0016) model time 0.2367 (0.2425) loss 2.3175 (2.7167) grad_norm 5.8067 (inf) loss_scale 256.0000 (477.5425) mem 7381MB [2024-09-01 09:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1220/1251] eta 0:00:07 lr 0.000031 wd 0.0500 time 0.2311 (0.2442) data time 0.0012 (0.0016) model time 0.2300 (0.2425) loss 3.1145 (2.7166) grad_norm 4.7836 (inf) loss_scale 256.0000 (475.7281) mem 7381MB [2024-09-01 09:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1230/1251] eta 0:00:05 lr 0.000031 wd 0.0500 time 0.2458 (0.2442) data time 0.0007 (0.0016) model time 0.2451 (0.2425) loss 2.7693 (2.7175) grad_norm 4.8667 (inf) loss_scale 256.0000 (473.9431) mem 7381MB [2024-09-01 09:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1240/1251] eta 0:00:02 lr 0.000031 wd 0.0500 time 0.2244 (0.2441) data time 0.0006 (0.0016) model time 0.2238 (0.2424) loss 3.3058 (2.7175) grad_norm 11.9241 (inf) loss_scale 256.0000 (472.1869) mem 7381MB [2024-09-01 09:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [273/300][1250/1251] eta 0:00:00 lr 0.000031 wd 0.0500 time 0.2269 
(0.2439) data time 0.0005 (0.0016) model time 0.2264 (0.2423) loss 1.8868 (2.7160) grad_norm 3.9527 (inf) loss_scale 256.0000 (470.4588) mem 7381MB [2024-09-01 09:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 273 training takes 0:05:05 [2024-09-01 09:40:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:40:51 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 09:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.493 (0.493) Loss 0.3906 (0.3906) Acc@1 92.676 (92.676) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 09:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.086 (0.114) Loss 0.5859 (0.6132) Acc@1 89.941 (87.393) Acc@5 98.047 (97.647) Mem 7381MB [2024-09-01 09:40:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.096) Loss 0.9058 (0.6425) Acc@1 77.832 (86.305) Acc@5 95.703 (97.633) Mem 7381MB [2024-09-01 09:40:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.090) Loss 1.1211 (0.7384) Acc@1 74.512 (84.136) Acc@5 92.383 (96.601) Mem 7381MB [2024-09-01 09:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0225 (0.7879) Acc@1 77.637 (82.977) Acc@5 94.434 (96.129) Mem 7381MB [2024-09-01 09:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.564 Acc@5 96.112 [2024-09-01 09:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 09:40:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.811 (0.811) Loss 0.3865 (0.3865) Acc@1 93.164 (93.164) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 09:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.146) Loss 0.5649 
(0.6045) Acc@1 90.527 (87.820) Acc@5 98.340 (97.860) Mem 7381MB [2024-09-01 09:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.115) Loss 0.9053 (0.6349) Acc@1 78.223 (86.668) Acc@5 95.898 (97.768) Mem 7381MB [2024-09-01 09:40:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.102) Loss 1.1279 (0.7266) Acc@1 74.414 (84.479) Acc@5 92.773 (96.815) Mem 7381MB [2024-09-01 09:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0029 (0.7740) Acc@1 77.246 (83.296) Acc@5 94.141 (96.325) Mem 7381MB [2024-09-01 09:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.884 Acc@5 96.276 [2024-09-01 09:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 09:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][0/1251] eta 0:22:52 lr 0.000031 wd 0.0500 time 1.0971 (1.0971) data time 0.5084 (0.5084) model time 0.0000 (0.0000) loss 2.8788 (2.8788) grad_norm 4.9094 (4.9094) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][10/1251] eta 0:06:42 lr 0.000031 wd 0.0500 time 0.2513 (0.3244) data time 0.0007 (0.0478) model time 0.0000 (0.0000) loss 3.3273 (2.8721) grad_norm 15.4076 (7.3576) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][20/1251] eta 0:05:50 lr 0.000031 wd 0.0500 time 0.2396 (0.2850) data time 0.0012 (0.0255) model time 0.0000 (0.0000) loss 2.6929 (2.8644) grad_norm 5.3442 (8.1833) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][30/1251] eta 0:05:31 lr 0.000031 wd 0.0500 time 0.2363 (0.2713) data time 0.0011 (0.0176) model time 0.0000 (0.0000) loss 3.1243 (2.7520) grad_norm 24.7858 (7.7237) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][40/1251] eta 0:05:20 lr 0.000031 wd 0.0500 time 0.2440 (0.2644) data time 0.0008 (0.0135) model time 0.0000 (0.0000) loss 1.7339 (2.7449) grad_norm 8.5580 (7.4745) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][50/1251] eta 0:05:12 lr 0.000031 wd 0.0500 time 0.2382 (0.2604) data time 0.0007 (0.0111) model time 0.0000 (0.0000) loss 3.0668 (2.7305) grad_norm 4.8811 (6.9704) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][60/1251] eta 0:05:06 lr 0.000031 wd 0.0500 time 0.2366 (0.2571) data time 0.0010 (0.0094) model time 0.2356 (0.2394) loss 3.1268 (2.7414) grad_norm 5.7575 (6.6936) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][70/1251] eta 0:05:00 lr 0.000031 wd 0.0500 time 0.2401 (0.2546) data time 0.0010 (0.0082) model time 0.2391 (0.2389) loss 3.0484 (2.7436) grad_norm 4.3363 (7.4421) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][80/1251] eta 0:04:56 lr 0.000031 wd 0.0500 time 0.2355 (0.2534) data time 0.0008 (0.0073) model time 0.2347 (0.2404) loss 1.9642 (2.7339) grad_norm 4.5930 (7.1789) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][90/1251] eta 0:04:52 lr 0.000031 wd 0.0500 time 0.2470 (0.2522) data time 0.0010 (0.0066) model time 0.2460 (0.2408) loss 3.2112 (2.7273) grad_norm 5.2815 (6.9784) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][100/1251] eta 0:04:49 lr 0.000031 wd 0.0500 time 0.2379 (0.2512) data time 0.0011 
(0.0061) model time 0.2368 (0.2407) loss 2.6534 (2.7264) grad_norm 3.6268 (6.8866) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][110/1251] eta 0:04:45 lr 0.000031 wd 0.0500 time 0.2393 (0.2503) data time 0.0010 (0.0056) model time 0.2383 (0.2407) loss 3.2409 (2.7404) grad_norm 12.5646 (6.7447) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][120/1251] eta 0:04:42 lr 0.000031 wd 0.0500 time 0.2421 (0.2496) data time 0.0007 (0.0053) model time 0.2415 (0.2406) loss 1.7822 (2.7308) grad_norm 6.5274 (6.6173) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][130/1251] eta 0:04:38 lr 0.000031 wd 0.0500 time 0.2419 (0.2488) data time 0.0009 (0.0049) model time 0.2410 (0.2403) loss 2.9215 (2.7037) grad_norm 3.8502 (6.5471) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][140/1251] eta 0:04:35 lr 0.000031 wd 0.0500 time 0.2384 (0.2483) data time 0.0010 (0.0047) model time 0.2373 (0.2405) loss 3.0976 (2.7297) grad_norm 4.1737 (6.4270) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][150/1251] eta 0:04:33 lr 0.000031 wd 0.0500 time 0.2556 (0.2480) data time 0.0007 (0.0044) model time 0.2550 (0.2407) loss 2.1220 (2.7170) grad_norm 4.9024 (6.3276) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][160/1251] eta 0:04:30 lr 0.000031 wd 0.0500 time 0.2359 (0.2477) data time 0.0009 (0.0042) model time 0.2350 (0.2408) loss 3.1772 (2.7147) grad_norm 4.8613 (6.2586) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[274/300][170/1251] eta 0:04:27 lr 0.000031 wd 0.0500 time 0.2289 (0.2475) data time 0.0011 (0.0040) model time 0.2278 (0.2410) loss 3.4223 (2.6919) grad_norm 5.0838 (6.2034) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][180/1251] eta 0:04:24 lr 0.000031 wd 0.0500 time 0.2428 (0.2471) data time 0.0007 (0.0039) model time 0.2421 (0.2408) loss 1.7538 (2.6988) grad_norm 4.3769 (6.1506) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][190/1251] eta 0:04:21 lr 0.000031 wd 0.0500 time 0.2363 (0.2467) data time 0.0009 (0.0037) model time 0.2354 (0.2407) loss 3.0669 (2.6980) grad_norm 5.5339 (6.2317) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][200/1251] eta 0:04:23 lr 0.000031 wd 0.0500 time 0.4451 (0.2506) data time 0.0009 (0.0036) model time 0.4442 (0.2462) loss 2.6898 (2.6924) grad_norm 6.5447 (6.4057) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][210/1251] eta 0:04:20 lr 0.000031 wd 0.0500 time 0.2327 (0.2502) data time 0.0010 (0.0035) model time 0.2317 (0.2458) loss 2.6614 (2.6894) grad_norm 4.4923 (6.3323) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][220/1251] eta 0:04:17 lr 0.000031 wd 0.0500 time 0.2471 (0.2498) data time 0.0007 (0.0034) model time 0.2463 (0.2455) loss 2.1069 (2.6773) grad_norm 6.5849 (6.3109) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:41:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][230/1251] eta 0:04:14 lr 0.000031 wd 0.0500 time 0.2356 (0.2494) data time 0.0011 (0.0033) model time 0.2346 (0.2452) loss 2.8337 (2.6846) grad_norm 4.8703 (6.4234) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 09:41:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][240/1251] eta 0:04:11 lr 0.000031 wd 0.0500 time 0.2473 (0.2490) data time 0.0009 (0.0032) model time 0.2463 (0.2449) loss 2.8329 (2.6879) grad_norm 9.9167 (6.3857) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][250/1251] eta 0:04:08 lr 0.000031 wd 0.0500 time 0.2478 (0.2488) data time 0.0010 (0.0031) model time 0.2468 (0.2447) loss 2.8700 (2.6905) grad_norm 5.2227 (6.3533) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][260/1251] eta 0:04:06 lr 0.000031 wd 0.0500 time 0.2427 (0.2484) data time 0.0010 (0.0030) model time 0.2417 (0.2445) loss 3.0348 (2.6856) grad_norm 7.0839 (6.3642) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][270/1251] eta 0:04:03 lr 0.000031 wd 0.0500 time 0.2413 (0.2482) data time 0.0008 (0.0029) model time 0.2406 (0.2444) loss 2.3343 (2.6862) grad_norm 4.9610 (6.3316) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][280/1251] eta 0:04:00 lr 0.000031 wd 0.0500 time 0.2378 (0.2479) data time 0.0009 (0.0029) model time 0.2369 (0.2441) loss 3.6166 (2.6887) grad_norm 4.6789 (6.3180) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][290/1251] eta 0:03:58 lr 0.000031 wd 0.0500 time 0.2442 (0.2477) data time 0.0007 (0.0028) model time 0.2435 (0.2440) loss 3.3801 (2.6855) grad_norm 6.0886 (inf) loss_scale 128.0000 (255.1203) mem 7381MB [2024-09-01 09:42:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][300/1251] eta 0:03:55 lr 0.000031 wd 0.0500 time 0.2377 (0.2474) data time 0.0008 (0.0027) model time 0.2369 
(0.2437) loss 2.6052 (2.6842) grad_norm 3.7747 (inf) loss_scale 128.0000 (250.8970) mem 7381MB [2024-09-01 09:42:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][310/1251] eta 0:03:52 lr 0.000031 wd 0.0500 time 0.2399 (0.2472) data time 0.0011 (0.0027) model time 0.2388 (0.2436) loss 2.5741 (2.6861) grad_norm 5.7736 (inf) loss_scale 128.0000 (246.9453) mem 7381MB [2024-09-01 09:42:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][320/1251] eta 0:03:50 lr 0.000031 wd 0.0500 time 0.2533 (0.2471) data time 0.0007 (0.0026) model time 0.2526 (0.2436) loss 2.6454 (2.6885) grad_norm 4.6077 (inf) loss_scale 128.0000 (243.2399) mem 7381MB [2024-09-01 09:42:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][330/1251] eta 0:03:47 lr 0.000030 wd 0.0500 time 0.2481 (0.2470) data time 0.0011 (0.0026) model time 0.2471 (0.2435) loss 2.2014 (2.6869) grad_norm 8.0097 (inf) loss_scale 128.0000 (239.7583) mem 7381MB [2024-09-01 09:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][340/1251] eta 0:03:44 lr 0.000030 wd 0.0500 time 0.2367 (0.2468) data time 0.0008 (0.0025) model time 0.2359 (0.2433) loss 2.9214 (2.6803) grad_norm 5.3909 (inf) loss_scale 128.0000 (236.4809) mem 7381MB [2024-09-01 09:42:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][350/1251] eta 0:03:42 lr 0.000030 wd 0.0500 time 0.4084 (0.2470) data time 0.0007 (0.0025) model time 0.4077 (0.2437) loss 2.9798 (2.6785) grad_norm 5.4608 (inf) loss_scale 128.0000 (233.3903) mem 7381MB [2024-09-01 09:42:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][360/1251] eta 0:03:39 lr 0.000030 wd 0.0500 time 0.2476 (0.2469) data time 0.0009 (0.0024) model time 0.2467 (0.2436) loss 2.4876 (2.6807) grad_norm 4.0091 (inf) loss_scale 128.0000 (230.4709) mem 7381MB [2024-09-01 09:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][370/1251] eta 0:03:37 lr 0.000030 wd 0.0500 
time 0.2434 (0.2467) data time 0.0008 (0.0024) model time 0.2426 (0.2435) loss 2.5452 (2.6886) grad_norm 4.6662 (inf) loss_scale 128.0000 (227.7089) mem 7381MB [2024-09-01 09:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][380/1251] eta 0:03:34 lr 0.000030 wd 0.0500 time 0.2292 (0.2466) data time 0.0011 (0.0024) model time 0.2281 (0.2435) loss 3.2919 (2.6896) grad_norm 4.7916 (inf) loss_scale 128.0000 (225.0919) mem 7381MB [2024-09-01 09:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][390/1251] eta 0:03:32 lr 0.000030 wd 0.0500 time 0.2406 (0.2464) data time 0.0009 (0.0023) model time 0.2397 (0.2433) loss 3.3210 (2.6861) grad_norm 5.1659 (inf) loss_scale 128.0000 (222.6087) mem 7381MB [2024-09-01 09:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][400/1251] eta 0:03:29 lr 0.000030 wd 0.0500 time 0.2407 (0.2463) data time 0.0008 (0.0023) model time 0.2399 (0.2433) loss 3.4951 (2.6927) grad_norm 4.4261 (inf) loss_scale 128.0000 (220.2494) mem 7381MB [2024-09-01 09:42:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][410/1251] eta 0:03:27 lr 0.000030 wd 0.0500 time 0.2436 (0.2462) data time 0.0010 (0.0023) model time 0.2427 (0.2432) loss 2.3459 (2.6872) grad_norm 5.5936 (inf) loss_scale 128.0000 (218.0049) mem 7381MB [2024-09-01 09:42:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][420/1251] eta 0:03:24 lr 0.000030 wd 0.0500 time 0.2349 (0.2461) data time 0.0011 (0.0022) model time 0.2338 (0.2431) loss 3.1418 (2.6869) grad_norm 4.9889 (inf) loss_scale 128.0000 (215.8670) mem 7381MB [2024-09-01 09:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][430/1251] eta 0:03:21 lr 0.000030 wd 0.0500 time 0.2387 (0.2460) data time 0.0007 (0.0022) model time 0.2380 (0.2430) loss 2.6110 (2.6847) grad_norm 7.9794 (inf) loss_scale 128.0000 (213.8283) mem 7381MB [2024-09-01 09:42:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [274/300][440/1251] eta 0:03:19 lr 0.000030 wd 0.0500 time 0.2395 (0.2459) data time 0.0010 (0.0022) model time 0.2385 (0.2429) loss 3.2253 (2.6827) grad_norm 3.4166 (inf) loss_scale 128.0000 (211.8821) mem 7381MB [2024-09-01 09:42:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][450/1251] eta 0:03:16 lr 0.000030 wd 0.0500 time 0.2389 (0.2458) data time 0.0008 (0.0021) model time 0.2380 (0.2429) loss 2.0261 (2.6836) grad_norm 3.9717 (inf) loss_scale 128.0000 (210.0222) mem 7381MB [2024-09-01 09:42:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][460/1251] eta 0:03:14 lr 0.000030 wd 0.0500 time 0.2406 (0.2457) data time 0.0010 (0.0021) model time 0.2396 (0.2428) loss 2.7321 (2.6860) grad_norm 8.0358 (inf) loss_scale 128.0000 (208.2430) mem 7381MB [2024-09-01 09:42:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][470/1251] eta 0:03:11 lr 0.000030 wd 0.0500 time 0.2377 (0.2456) data time 0.0009 (0.0021) model time 0.2369 (0.2428) loss 3.1115 (2.6868) grad_norm 6.1095 (inf) loss_scale 128.0000 (206.5393) mem 7381MB [2024-09-01 09:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][480/1251] eta 0:03:09 lr 0.000030 wd 0.0500 time 0.2446 (0.2456) data time 0.0007 (0.0021) model time 0.2440 (0.2428) loss 2.0111 (2.6907) grad_norm 4.5753 (inf) loss_scale 128.0000 (204.9064) mem 7381MB [2024-09-01 09:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][490/1251] eta 0:03:06 lr 0.000030 wd 0.0500 time 0.2414 (0.2455) data time 0.0007 (0.0021) model time 0.2407 (0.2427) loss 3.0006 (2.6933) grad_norm 4.6197 (inf) loss_scale 128.0000 (203.3401) mem 7381MB [2024-09-01 09:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][500/1251] eta 0:03:04 lr 0.000030 wd 0.0500 time 0.2387 (0.2454) data time 0.0009 (0.0020) model time 0.2378 (0.2427) loss 2.7958 (2.6929) grad_norm 4.2670 (inf) loss_scale 128.0000 (201.8363) mem 7381MB 
[2024-09-01 09:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][510/1251] eta 0:03:01 lr 0.000030 wd 0.0500 time 0.2399 (0.2453) data time 0.0009 (0.0020) model time 0.2390 (0.2426) loss 3.4470 (2.6974) grad_norm 5.6022 (inf) loss_scale 128.0000 (200.3914) mem 7381MB [2024-09-01 09:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][520/1251] eta 0:02:59 lr 0.000030 wd 0.0500 time 0.2394 (0.2452) data time 0.0011 (0.0020) model time 0.2383 (0.2426) loss 2.3108 (2.6953) grad_norm 6.6709 (inf) loss_scale 128.0000 (199.0019) mem 7381MB [2024-09-01 09:43:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][530/1251] eta 0:02:56 lr 0.000030 wd 0.0500 time 0.2581 (0.2451) data time 0.0009 (0.0020) model time 0.2572 (0.2425) loss 2.9340 (2.6985) grad_norm 7.2926 (inf) loss_scale 128.0000 (197.6648) mem 7381MB [2024-09-01 09:43:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][540/1251] eta 0:02:54 lr 0.000030 wd 0.0500 time 0.2289 (0.2450) data time 0.0010 (0.0020) model time 0.2279 (0.2424) loss 2.7490 (2.7000) grad_norm 6.3234 (inf) loss_scale 128.0000 (196.3771) mem 7381MB [2024-09-01 09:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][550/1251] eta 0:02:51 lr 0.000030 wd 0.0500 time 0.2461 (0.2450) data time 0.0010 (0.0019) model time 0.2451 (0.2424) loss 2.5074 (2.7036) grad_norm 6.3580 (inf) loss_scale 128.0000 (195.1361) mem 7381MB [2024-09-01 09:43:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][560/1251] eta 0:02:49 lr 0.000030 wd 0.0500 time 0.4428 (0.2453) data time 0.0009 (0.0019) model time 0.4419 (0.2428) loss 3.2610 (2.7075) grad_norm 3.8575 (inf) loss_scale 128.0000 (193.9394) mem 7381MB [2024-09-01 09:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][570/1251] eta 0:02:46 lr 0.000030 wd 0.0500 time 0.2387 (0.2452) data time 0.0011 (0.0019) model time 0.2376 (0.2427) loss 2.7324 (2.7056) 
grad_norm 3.8393 (inf) loss_scale 128.0000 (192.7846) mem 7381MB [2024-09-01 09:43:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][580/1251] eta 0:02:44 lr 0.000030 wd 0.0500 time 0.2369 (0.2451) data time 0.0010 (0.0019) model time 0.2359 (0.2426) loss 2.5676 (2.7021) grad_norm 6.1739 (inf) loss_scale 128.0000 (191.6695) mem 7381MB [2024-09-01 09:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][590/1251] eta 0:02:41 lr 0.000030 wd 0.0500 time 0.2482 (0.2450) data time 0.0011 (0.0019) model time 0.2471 (0.2426) loss 2.0968 (2.6999) grad_norm 13.9819 (inf) loss_scale 128.0000 (190.5922) mem 7381MB [2024-09-01 09:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][600/1251] eta 0:02:39 lr 0.000030 wd 0.0500 time 0.2433 (0.2449) data time 0.0010 (0.0019) model time 0.2423 (0.2425) loss 2.5665 (2.6957) grad_norm 4.5863 (inf) loss_scale 128.0000 (189.5507) mem 7381MB [2024-09-01 09:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][610/1251] eta 0:02:36 lr 0.000030 wd 0.0500 time 0.2445 (0.2449) data time 0.0007 (0.0018) model time 0.2438 (0.2425) loss 1.6404 (2.6953) grad_norm 7.5844 (inf) loss_scale 128.0000 (188.5434) mem 7381MB [2024-09-01 09:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][620/1251] eta 0:02:34 lr 0.000030 wd 0.0500 time 0.2455 (0.2448) data time 0.0010 (0.0018) model time 0.2445 (0.2424) loss 2.5969 (2.6907) grad_norm 4.1955 (inf) loss_scale 128.0000 (187.5684) mem 7381MB [2024-09-01 09:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][630/1251] eta 0:02:32 lr 0.000030 wd 0.0500 time 0.2564 (0.2448) data time 0.0007 (0.0018) model time 0.2558 (0.2424) loss 3.0548 (2.6919) grad_norm 4.2484 (inf) loss_scale 128.0000 (186.6244) mem 7381MB [2024-09-01 09:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][640/1251] eta 0:02:29 lr 0.000030 wd 0.0500 time 0.2403 (0.2447) data 
time 0.0008 (0.0018) model time 0.2395 (0.2423) loss 2.1405 (2.6910) grad_norm 9.4661 (inf) loss_scale 128.0000 (185.7098) mem 7381MB [2024-09-01 09:43:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][650/1251] eta 0:02:27 lr 0.000030 wd 0.0500 time 0.2460 (0.2446) data time 0.0010 (0.0018) model time 0.2450 (0.2423) loss 2.6055 (2.6913) grad_norm 5.5535 (inf) loss_scale 128.0000 (184.8233) mem 7381MB [2024-09-01 09:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][660/1251] eta 0:02:24 lr 0.000030 wd 0.0500 time 0.2441 (0.2446) data time 0.0011 (0.0018) model time 0.2430 (0.2422) loss 2.4860 (2.6942) grad_norm 5.2874 (inf) loss_scale 128.0000 (183.9637) mem 7381MB [2024-09-01 09:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][670/1251] eta 0:02:22 lr 0.000030 wd 0.0500 time 0.2423 (0.2445) data time 0.0009 (0.0018) model time 0.2414 (0.2422) loss 1.7875 (2.6910) grad_norm 7.2130 (inf) loss_scale 128.0000 (183.1297) mem 7381MB [2024-09-01 09:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][680/1251] eta 0:02:19 lr 0.000030 wd 0.0500 time 0.2363 (0.2445) data time 0.0010 (0.0018) model time 0.2353 (0.2422) loss 2.5932 (2.6920) grad_norm 6.1287 (inf) loss_scale 128.0000 (182.3201) mem 7381MB [2024-09-01 09:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][690/1251] eta 0:02:17 lr 0.000030 wd 0.0500 time 0.2352 (0.2444) data time 0.0010 (0.0018) model time 0.2342 (0.2421) loss 2.7602 (2.6907) grad_norm 5.1893 (inf) loss_scale 128.0000 (181.5340) mem 7381MB [2024-09-01 09:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][700/1251] eta 0:02:14 lr 0.000030 wd 0.0500 time 0.2577 (0.2444) data time 0.0010 (0.0017) model time 0.2567 (0.2421) loss 3.0906 (2.6917) grad_norm 5.0526 (inf) loss_scale 128.0000 (180.7703) mem 7381MB [2024-09-01 09:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[274/300][710/1251] eta 0:02:12 lr 0.000030 wd 0.0500 time 0.2436 (0.2446) data time 0.0009 (0.0017) model time 0.2427 (0.2423) loss 3.1643 (2.6910) grad_norm 5.4075 (inf) loss_scale 128.0000 (180.0281) mem 7381MB [2024-09-01 09:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][720/1251] eta 0:02:10 lr 0.000030 wd 0.0500 time 0.4030 (0.2454) data time 0.0007 (0.0017) model time 0.4023 (0.2433) loss 3.0220 (2.6909) grad_norm 5.0752 (inf) loss_scale 128.0000 (179.3065) mem 7381MB [2024-09-01 09:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][730/1251] eta 0:02:07 lr 0.000030 wd 0.0500 time 0.2447 (0.2454) data time 0.0008 (0.0017) model time 0.2440 (0.2432) loss 2.9942 (2.6912) grad_norm 7.3090 (inf) loss_scale 128.0000 (178.6047) mem 7381MB [2024-09-01 09:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][740/1251] eta 0:02:05 lr 0.000030 wd 0.0500 time 0.2544 (0.2453) data time 0.0010 (0.0017) model time 0.2533 (0.2432) loss 2.9824 (2.6937) grad_norm 6.1798 (inf) loss_scale 128.0000 (177.9217) mem 7381MB [2024-09-01 09:44:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][750/1251] eta 0:02:02 lr 0.000030 wd 0.0500 time 0.2397 (0.2452) data time 0.0008 (0.0017) model time 0.2389 (0.2431) loss 2.8153 (2.6909) grad_norm 3.6694 (inf) loss_scale 128.0000 (177.2570) mem 7381MB [2024-09-01 09:44:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][760/1251] eta 0:02:00 lr 0.000030 wd 0.0500 time 0.2471 (0.2452) data time 0.0008 (0.0017) model time 0.2463 (0.2431) loss 2.7895 (2.6925) grad_norm 4.8058 (inf) loss_scale 128.0000 (176.6097) mem 7381MB [2024-09-01 09:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][770/1251] eta 0:01:57 lr 0.000030 wd 0.0500 time 0.2281 (0.2451) data time 0.0009 (0.0017) model time 0.2272 (0.2430) loss 2.9906 (2.6931) grad_norm 5.7856 (inf) loss_scale 128.0000 (175.9792) mem 7381MB [2024-09-01 
09:44:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][780/1251] eta 0:01:55 lr 0.000030 wd 0.0500 time 0.2460 (0.2451) data time 0.0007 (0.0017) model time 0.2454 (0.2430) loss 3.1616 (2.6935) grad_norm 6.6749 (inf) loss_scale 128.0000 (175.3649) mem 7381MB [2024-09-01 09:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][790/1251] eta 0:01:52 lr 0.000030 wd 0.0500 time 0.2411 (0.2450) data time 0.0007 (0.0017) model time 0.2404 (0.2429) loss 2.2169 (2.6948) grad_norm 3.7296 (inf) loss_scale 128.0000 (174.7661) mem 7381MB [2024-09-01 09:44:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][800/1251] eta 0:01:50 lr 0.000030 wd 0.0500 time 0.2522 (0.2450) data time 0.0008 (0.0016) model time 0.2514 (0.2429) loss 2.6776 (2.6975) grad_norm 11.1973 (inf) loss_scale 128.0000 (174.1823) mem 7381MB [2024-09-01 09:44:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][810/1251] eta 0:01:48 lr 0.000030 wd 0.0500 time 0.2388 (0.2449) data time 0.0009 (0.0016) model time 0.2380 (0.2429) loss 2.9281 (2.6966) grad_norm 5.0170 (inf) loss_scale 128.0000 (173.6128) mem 7381MB [2024-09-01 09:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][820/1251] eta 0:01:45 lr 0.000030 wd 0.0500 time 0.2413 (0.2449) data time 0.0008 (0.0016) model time 0.2404 (0.2429) loss 2.8370 (2.6974) grad_norm 5.6746 (inf) loss_scale 128.0000 (173.0572) mem 7381MB [2024-09-01 09:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][830/1251] eta 0:01:43 lr 0.000030 wd 0.0500 time 0.2534 (0.2449) data time 0.0009 (0.0016) model time 0.2525 (0.2429) loss 1.9641 (2.6955) grad_norm 6.7801 (inf) loss_scale 128.0000 (172.5150) mem 7381MB [2024-09-01 09:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][840/1251] eta 0:01:40 lr 0.000030 wd 0.0500 time 0.2405 (0.2449) data time 0.0010 (0.0016) model time 0.2395 (0.2429) loss 3.0228 (2.6965) grad_norm 
3.8806 (inf) loss_scale 128.0000 (171.9857) mem 7381MB [2024-09-01 09:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][850/1251] eta 0:01:38 lr 0.000030 wd 0.0500 time 0.2452 (0.2449) data time 0.0009 (0.0016) model time 0.2443 (0.2429) loss 2.0222 (2.6960) grad_norm 4.9066 (inf) loss_scale 128.0000 (171.4689) mem 7381MB [2024-09-01 09:44:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][860/1251] eta 0:01:35 lr 0.000030 wd 0.0500 time 0.2435 (0.2448) data time 0.0010 (0.0016) model time 0.2425 (0.2429) loss 3.3023 (2.6991) grad_norm 5.8303 (inf) loss_scale 128.0000 (170.9640) mem 7381MB [2024-09-01 09:44:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][870/1251] eta 0:01:33 lr 0.000030 wd 0.0500 time 0.2420 (0.2448) data time 0.0010 (0.0016) model time 0.2411 (0.2428) loss 2.8290 (2.6971) grad_norm 7.5497 (inf) loss_scale 128.0000 (170.4707) mem 7381MB [2024-09-01 09:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][880/1251] eta 0:01:30 lr 0.000030 wd 0.0500 time 0.2400 (0.2450) data time 0.0009 (0.0016) model time 0.2391 (0.2430) loss 2.7802 (2.6972) grad_norm 5.2788 (inf) loss_scale 128.0000 (169.9886) mem 7381MB [2024-09-01 09:44:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][890/1251] eta 0:01:28 lr 0.000030 wd 0.0500 time 0.2421 (0.2450) data time 0.0007 (0.0016) model time 0.2414 (0.2430) loss 3.1578 (2.6980) grad_norm 4.5923 (inf) loss_scale 128.0000 (169.5174) mem 7381MB [2024-09-01 09:44:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][900/1251] eta 0:01:25 lr 0.000030 wd 0.0500 time 0.2433 (0.2450) data time 0.0007 (0.0016) model time 0.2426 (0.2430) loss 2.8551 (2.6984) grad_norm 4.7483 (inf) loss_scale 128.0000 (169.0566) mem 7381MB [2024-09-01 09:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][910/1251] eta 0:01:23 lr 0.000030 wd 0.0500 time 0.2442 (0.2449) data time 0.0007 
(0.0016) model time 0.2435 (0.2430) loss 2.6390 (2.6978) grad_norm 4.7182 (inf) loss_scale 128.0000 (168.6059) mem 7381MB [2024-09-01 09:44:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][920/1251] eta 0:01:21 lr 0.000030 wd 0.0500 time 0.2334 (0.2449) data time 0.0010 (0.0016) model time 0.2323 (0.2430) loss 2.8077 (2.6962) grad_norm 3.6908 (inf) loss_scale 128.0000 (168.1650) mem 7381MB [2024-09-01 09:44:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][930/1251] eta 0:01:18 lr 0.000030 wd 0.0500 time 0.2439 (0.2448) data time 0.0008 (0.0015) model time 0.2431 (0.2429) loss 3.3042 (2.6950) grad_norm 6.4197 (inf) loss_scale 128.0000 (167.7336) mem 7381MB [2024-09-01 09:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][940/1251] eta 0:01:16 lr 0.000030 wd 0.0500 time 0.2407 (0.2448) data time 0.0012 (0.0015) model time 0.2395 (0.2429) loss 3.0720 (2.6937) grad_norm 6.9570 (inf) loss_scale 128.0000 (167.3114) mem 7381MB [2024-09-01 09:44:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][950/1251] eta 0:01:13 lr 0.000030 wd 0.0500 time 0.2432 (0.2448) data time 0.0008 (0.0015) model time 0.2424 (0.2429) loss 1.9329 (2.6931) grad_norm 5.2014 (inf) loss_scale 128.0000 (166.8980) mem 7381MB [2024-09-01 09:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][960/1251] eta 0:01:11 lr 0.000030 wd 0.0500 time 0.2344 (0.2447) data time 0.0010 (0.0015) model time 0.2334 (0.2428) loss 3.0038 (2.6923) grad_norm 6.5649 (inf) loss_scale 128.0000 (166.4932) mem 7381MB [2024-09-01 09:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][970/1251] eta 0:01:08 lr 0.000030 wd 0.0500 time 0.2457 (0.2447) data time 0.0009 (0.0015) model time 0.2448 (0.2428) loss 2.8937 (2.6919) grad_norm 4.9371 (inf) loss_scale 128.0000 (166.0968) mem 7381MB [2024-09-01 09:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][980/1251] eta 
0:01:06 lr 0.000030 wd 0.0500 time 0.2380 (0.2447) data time 0.0007 (0.0015) model time 0.2373 (0.2428) loss 3.1933 (2.6921) grad_norm 4.4563 (inf) loss_scale 128.0000 (165.7085) mem 7381MB [2024-09-01 09:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][990/1251] eta 0:01:03 lr 0.000030 wd 0.0500 time 0.2440 (0.2446) data time 0.0012 (0.0015) model time 0.2428 (0.2428) loss 1.9580 (2.6947) grad_norm 4.5629 (inf) loss_scale 128.0000 (165.3280) mem 7381MB [2024-09-01 09:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1000/1251] eta 0:01:01 lr 0.000030 wd 0.0500 time 0.2260 (0.2446) data time 0.0011 (0.0015) model time 0.2249 (0.2427) loss 2.6194 (2.6932) grad_norm 5.4421 (inf) loss_scale 128.0000 (164.9550) mem 7381MB [2024-09-01 09:45:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1010/1251] eta 0:00:58 lr 0.000030 wd 0.0500 time 0.2403 (0.2445) data time 0.0007 (0.0015) model time 0.2395 (0.2427) loss 3.4917 (2.6951) grad_norm 5.1924 (inf) loss_scale 128.0000 (164.5895) mem 7381MB [2024-09-01 09:45:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1020/1251] eta 0:00:56 lr 0.000030 wd 0.0500 time 0.2325 (0.2445) data time 0.0008 (0.0015) model time 0.2317 (0.2427) loss 2.7505 (2.6959) grad_norm 5.9909 (inf) loss_scale 128.0000 (164.2311) mem 7381MB [2024-09-01 09:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1030/1251] eta 0:00:54 lr 0.000030 wd 0.0500 time 0.2392 (0.2445) data time 0.0009 (0.0015) model time 0.2383 (0.2427) loss 3.1691 (2.6968) grad_norm 7.4261 (inf) loss_scale 128.0000 (163.8797) mem 7381MB [2024-09-01 09:45:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1040/1251] eta 0:00:51 lr 0.000030 wd 0.0500 time 0.2408 (0.2445) data time 0.0008 (0.0015) model time 0.2400 (0.2427) loss 2.7470 (2.6988) grad_norm 5.4056 (inf) loss_scale 128.0000 (163.5351) mem 7381MB [2024-09-01 09:45:16 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1050/1251] eta 0:00:49 lr 0.000030 wd 0.0500 time 0.2381 (0.2445) data time 0.0007 (0.0015) model time 0.2374 (0.2426) loss 1.9025 (2.6988) grad_norm 3.9412 (inf) loss_scale 128.0000 (163.1970) mem 7381MB [2024-09-01 09:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1060/1251] eta 0:00:46 lr 0.000030 wd 0.0500 time 0.2401 (0.2444) data time 0.0011 (0.0015) model time 0.2389 (0.2426) loss 2.7118 (2.6961) grad_norm 4.2709 (inf) loss_scale 128.0000 (162.8652) mem 7381MB [2024-09-01 09:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1070/1251] eta 0:00:44 lr 0.000030 wd 0.0500 time 0.2432 (0.2444) data time 0.0009 (0.0015) model time 0.2423 (0.2426) loss 2.9361 (2.6973) grad_norm 3.6566 (inf) loss_scale 128.0000 (162.5397) mem 7381MB [2024-09-01 09:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1080/1251] eta 0:00:41 lr 0.000030 wd 0.0500 time 0.2358 (0.2444) data time 0.0008 (0.0015) model time 0.2350 (0.2426) loss 2.2805 (2.6939) grad_norm 3.8328 (inf) loss_scale 128.0000 (162.2202) mem 7381MB [2024-09-01 09:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1090/1251] eta 0:00:39 lr 0.000030 wd 0.0500 time 0.2446 (0.2444) data time 0.0008 (0.0015) model time 0.2438 (0.2426) loss 2.9758 (2.6941) grad_norm 7.9412 (inf) loss_scale 128.0000 (161.9065) mem 7381MB [2024-09-01 09:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1100/1251] eta 0:00:36 lr 0.000030 wd 0.0500 time 0.2394 (0.2446) data time 0.0009 (0.0015) model time 0.2385 (0.2428) loss 2.7769 (2.6930) grad_norm 5.6452 (inf) loss_scale 128.0000 (161.5985) mem 7381MB [2024-09-01 09:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1110/1251] eta 0:00:34 lr 0.000030 wd 0.0500 time 0.2407 (0.2445) data time 0.0007 (0.0015) model time 0.2400 (0.2428) loss 2.9693 (2.6932) grad_norm 
7.4410 (inf) loss_scale 128.0000 (161.2961) mem 7381MB [2024-09-01 09:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1120/1251] eta 0:00:32 lr 0.000030 wd 0.0500 time 0.2443 (0.2445) data time 0.0008 (0.0015) model time 0.2435 (0.2428) loss 2.3948 (2.6920) grad_norm 7.4812 (inf) loss_scale 128.0000 (160.9991) mem 7381MB [2024-09-01 09:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1130/1251] eta 0:00:29 lr 0.000029 wd 0.0500 time 0.2499 (0.2445) data time 0.0007 (0.0015) model time 0.2492 (0.2428) loss 3.3149 (2.6944) grad_norm 8.3353 (inf) loss_scale 128.0000 (160.7073) mem 7381MB [2024-09-01 09:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1140/1251] eta 0:00:27 lr 0.000029 wd 0.0500 time 0.2457 (0.2445) data time 0.0007 (0.0015) model time 0.2450 (0.2427) loss 3.0181 (2.6952) grad_norm 3.4768 (inf) loss_scale 128.0000 (160.4207) mem 7381MB [2024-09-01 09:45:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1150/1251] eta 0:00:24 lr 0.000029 wd 0.0500 time 0.2421 (0.2444) data time 0.0008 (0.0014) model time 0.2414 (0.2427) loss 2.8935 (2.6952) grad_norm 7.8198 (inf) loss_scale 128.0000 (160.1390) mem 7381MB [2024-09-01 09:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1160/1251] eta 0:00:22 lr 0.000029 wd 0.0500 time 0.2360 (0.2444) data time 0.0008 (0.0014) model time 0.2351 (0.2427) loss 2.9914 (2.6958) grad_norm 4.2775 (inf) loss_scale 128.0000 (159.8622) mem 7381MB [2024-09-01 09:45:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1170/1251] eta 0:00:19 lr 0.000029 wd 0.0500 time 0.2525 (0.2444) data time 0.0007 (0.0014) model time 0.2517 (0.2427) loss 3.4044 (2.6960) grad_norm 4.9358 (inf) loss_scale 128.0000 (159.5901) mem 7381MB [2024-09-01 09:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1180/1251] eta 0:00:17 lr 0.000029 wd 0.0500 time 0.2444 (0.2444) data time 
0.0009 (0.0014) model time 0.2435 (0.2427) loss 2.9828 (2.6951) grad_norm 4.2080 (inf) loss_scale 128.0000 (159.3226) mem 7381MB [2024-09-01 09:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1190/1251] eta 0:00:14 lr 0.000029 wd 0.0500 time 0.2451 (0.2444) data time 0.0007 (0.0014) model time 0.2445 (0.2427) loss 3.0901 (2.6955) grad_norm 4.7871 (inf) loss_scale 128.0000 (159.0596) mem 7381MB [2024-09-01 09:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1200/1251] eta 0:00:12 lr 0.000029 wd 0.0500 time 0.2415 (0.2444) data time 0.0009 (0.0014) model time 0.2407 (0.2427) loss 2.6005 (2.6963) grad_norm 7.6700 (inf) loss_scale 128.0000 (158.8010) mem 7381MB [2024-09-01 09:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1210/1251] eta 0:00:10 lr 0.000029 wd 0.0500 time 0.2408 (0.2444) data time 0.0010 (0.0014) model time 0.2398 (0.2427) loss 3.0911 (2.6971) grad_norm 7.4015 (inf) loss_scale 128.0000 (158.5467) mem 7381MB [2024-09-01 09:45:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1220/1251] eta 0:00:07 lr 0.000029 wd 0.0500 time 0.2423 (0.2443) data time 0.0010 (0.0014) model time 0.2413 (0.2426) loss 3.3259 (2.6985) grad_norm 4.1905 (inf) loss_scale 128.0000 (158.2965) mem 7381MB [2024-09-01 09:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1230/1251] eta 0:00:05 lr 0.000029 wd 0.0500 time 0.2357 (0.2443) data time 0.0010 (0.0014) model time 0.2347 (0.2426) loss 2.4635 (2.6990) grad_norm 5.0651 (inf) loss_scale 128.0000 (158.0504) mem 7381MB [2024-09-01 09:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [274/300][1240/1251] eta 0:00:02 lr 0.000029 wd 0.0500 time 0.2220 (0.2447) data time 0.0005 (0.0014) model time 0.2216 (0.2431) loss 2.2760 (2.6991) grad_norm 4.9262 (inf) loss_scale 128.0000 (157.8082) mem 7381MB [2024-09-01 09:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[274/300][1250/1251] eta 0:00:00 lr 0.000029 wd 0.0500 time 0.2252 (0.2449) data time 0.0005 (0.0014) model time 0.2247 (0.2432) loss 2.8146 (2.6994) grad_norm 5.8282 (inf) loss_scale 128.0000 (157.5699) mem 7381MB [2024-09-01 09:46:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 274 training takes 0:05:06 [2024-09-01 09:46:06 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:46:06 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 09:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.469 (0.469) Loss 0.3987 (0.3987) Acc@1 92.773 (92.773) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 09:46:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.113) Loss 0.5938 (0.6224) Acc@1 89.844 (87.607) Acc@5 97.852 (97.665) Mem 7381MB [2024-09-01 09:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.097) Loss 0.9268 (0.6541) Acc@1 77.832 (86.537) Acc@5 95.605 (97.647) Mem 7381MB [2024-09-01 09:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.090) Loss 1.1592 (0.7485) Acc@1 74.023 (84.328) Acc@5 92.578 (96.626) Mem 7381MB [2024-09-01 09:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0371 (0.7984) Acc@1 77.148 (83.127) Acc@5 94.336 (96.134) Mem 7381MB [2024-09-01 09:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.728 Acc@5 96.112 [2024-09-01 09:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 09:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.73% [2024-09-01 09:46:10 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth 
saving...... [2024-09-01 09:46:11 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! [2024-09-01 09:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.426 (0.426) Loss 0.3875 (0.3875) Acc@1 93.066 (93.066) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 09:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.109) Loss 0.5654 (0.6049) Acc@1 90.723 (87.891) Acc@5 98.242 (97.825) Mem 7381MB [2024-09-01 09:46:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.095) Loss 0.9053 (0.6352) Acc@1 78.223 (86.719) Acc@5 95.898 (97.754) Mem 7381MB [2024-09-01 09:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.089) Loss 1.1279 (0.7272) Acc@1 74.512 (84.545) Acc@5 92.871 (96.809) Mem 7381MB [2024-09-01 09:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0029 (0.7747) Acc@1 77.344 (83.356) Acc@5 94.238 (96.327) Mem 7381MB [2024-09-01 09:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.934 Acc@5 96.286 [2024-09-01 09:46:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 09:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][0/1251] eta 0:23:48 lr 0.000029 wd 0.0500 time 1.1422 (1.1422) data time 0.6273 (0.6273) model time 0.0000 (0.0000) loss 2.6090 (2.6090) grad_norm 4.9243 (4.9243) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][10/1251] eta 0:06:43 lr 0.000029 wd 0.0500 time 0.2530 (0.3248) data time 0.0009 (0.0579) model time 0.0000 (0.0000) loss 2.7489 (2.7415) grad_norm 4.6102 (5.2965) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][20/1251] eta 
0:05:50 lr 0.000029 wd 0.0500 time 0.2350 (0.2851) data time 0.0010 (0.0308) model time 0.0000 (0.0000) loss 2.7611 (2.7325) grad_norm 7.4563 (5.6300) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][30/1251] eta 0:05:30 lr 0.000029 wd 0.0500 time 0.2444 (0.2710) data time 0.0007 (0.0212) model time 0.0000 (0.0000) loss 2.0024 (2.7276) grad_norm 4.8627 (8.5069) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][40/1251] eta 0:05:20 lr 0.000029 wd 0.0500 time 0.2476 (0.2645) data time 0.0009 (0.0163) model time 0.0000 (0.0000) loss 1.8888 (2.6836) grad_norm 5.2171 (7.7828) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][50/1251] eta 0:05:12 lr 0.000029 wd 0.0500 time 0.2517 (0.2602) data time 0.0009 (0.0133) model time 0.0000 (0.0000) loss 2.9416 (2.6406) grad_norm 4.0874 (7.1916) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][60/1251] eta 0:05:06 lr 0.000029 wd 0.0500 time 0.2525 (0.2574) data time 0.0009 (0.0112) model time 0.2516 (0.2421) loss 2.8935 (2.6091) grad_norm 5.1437 (7.0630) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][70/1251] eta 0:05:01 lr 0.000029 wd 0.0500 time 0.2458 (0.2554) data time 0.0008 (0.0098) model time 0.2450 (0.2422) loss 1.7261 (2.6179) grad_norm 3.3964 (7.2720) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][80/1251] eta 0:04:57 lr 0.000029 wd 0.0500 time 0.2421 (0.2538) data time 0.0007 (0.0087) model time 0.2414 (0.2421) loss 2.7869 (2.6141) grad_norm 4.9072 (7.1841) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:38 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][90/1251] eta 0:04:53 lr 0.000029 wd 0.0500 time 0.2407 (0.2526) data time 0.0007 (0.0078) model time 0.2400 (0.2421) loss 3.1532 (2.6525) grad_norm 6.7459 (7.0887) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][100/1251] eta 0:04:49 lr 0.000029 wd 0.0500 time 0.2449 (0.2516) data time 0.0011 (0.0072) model time 0.2439 (0.2420) loss 2.0236 (2.6566) grad_norm 4.6868 (6.9055) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][110/1251] eta 0:04:46 lr 0.000029 wd 0.0500 time 0.2470 (0.2508) data time 0.0007 (0.0066) model time 0.2463 (0.2419) loss 2.5971 (2.6314) grad_norm 4.2262 (6.7130) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][120/1251] eta 0:04:43 lr 0.000029 wd 0.0500 time 0.2389 (0.2503) data time 0.0008 (0.0061) model time 0.2381 (0.2421) loss 2.5877 (2.6218) grad_norm 6.1661 (6.6475) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][130/1251] eta 0:04:39 lr 0.000029 wd 0.0500 time 0.2371 (0.2496) data time 0.0011 (0.0057) model time 0.2360 (0.2419) loss 2.8102 (2.6253) grad_norm 6.8356 (6.5682) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][140/1251] eta 0:04:36 lr 0.000029 wd 0.0500 time 0.2426 (0.2491) data time 0.0009 (0.0054) model time 0.2418 (0.2418) loss 2.9516 (2.6420) grad_norm 6.9782 (6.4724) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][150/1251] eta 0:04:33 lr 0.000029 wd 0.0500 time 0.2334 (0.2484) data time 0.0011 (0.0051) model time 0.2323 (0.2414) loss 3.1086 (2.6341) 
grad_norm 5.2480 (6.4604) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][160/1251] eta 0:04:30 lr 0.000029 wd 0.0500 time 0.2348 (0.2480) data time 0.0009 (0.0049) model time 0.2339 (0.2414) loss 2.7607 (2.6333) grad_norm 5.6372 (6.3831) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:46:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][170/1251] eta 0:04:27 lr 0.000029 wd 0.0500 time 0.2472 (0.2476) data time 0.0007 (0.0046) model time 0.2465 (0.2412) loss 3.1448 (2.6315) grad_norm 7.3787 (7.1107) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][180/1251] eta 0:04:24 lr 0.000029 wd 0.0500 time 0.2471 (0.2473) data time 0.0009 (0.0044) model time 0.2461 (0.2412) loss 2.8817 (2.6423) grad_norm 4.0907 (7.0268) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][190/1251] eta 0:04:22 lr 0.000029 wd 0.0500 time 0.2458 (0.2470) data time 0.0010 (0.0042) model time 0.2448 (0.2412) loss 2.5780 (2.6345) grad_norm 8.2272 (6.9448) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][200/1251] eta 0:04:19 lr 0.000029 wd 0.0500 time 0.2429 (0.2466) data time 0.0007 (0.0041) model time 0.2421 (0.2411) loss 3.0307 (2.6447) grad_norm 3.9473 (6.8382) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][210/1251] eta 0:04:16 lr 0.000029 wd 0.0500 time 0.2417 (0.2465) data time 0.0007 (0.0039) model time 0.2410 (0.2411) loss 3.0152 (2.6483) grad_norm 3.8822 (6.7484) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][220/1251] eta 0:04:13 lr 0.000029 wd 0.0500 time 
0.2426 (0.2462) data time 0.0007 (0.0038) model time 0.2419 (0.2410) loss 2.9745 (2.6482) grad_norm 4.1297 (6.6909) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][230/1251] eta 0:04:11 lr 0.000029 wd 0.0500 time 0.2451 (0.2459) data time 0.0008 (0.0037) model time 0.2443 (0.2410) loss 2.7080 (2.6505) grad_norm 4.5583 (6.6567) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][240/1251] eta 0:04:08 lr 0.000029 wd 0.0500 time 0.2360 (0.2457) data time 0.0009 (0.0036) model time 0.2351 (0.2409) loss 1.9800 (2.6457) grad_norm 5.1611 (6.5668) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][250/1251] eta 0:04:05 lr 0.000029 wd 0.0500 time 0.2465 (0.2455) data time 0.0011 (0.0035) model time 0.2454 (0.2408) loss 2.8870 (2.6481) grad_norm 3.6073 (6.4979) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][260/1251] eta 0:04:03 lr 0.000029 wd 0.0500 time 0.2400 (0.2453) data time 0.0011 (0.0034) model time 0.2389 (0.2407) loss 2.9594 (2.6489) grad_norm 4.9078 (6.4553) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][270/1251] eta 0:04:00 lr 0.000029 wd 0.0500 time 0.2445 (0.2451) data time 0.0010 (0.0033) model time 0.2435 (0.2407) loss 2.6995 (2.6524) grad_norm 5.6158 (6.4220) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][280/1251] eta 0:03:57 lr 0.000029 wd 0.0500 time 0.2442 (0.2450) data time 0.0008 (0.0032) model time 0.2434 (0.2407) loss 2.9013 (2.6580) grad_norm 4.3448 (6.3834) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:26 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [275/300][290/1251] eta 0:03:55 lr 0.000029 wd 0.0500 time 0.2402 (0.2448) data time 0.0010 (0.0031) model time 0.2392 (0.2406) loss 2.2425 (2.6631) grad_norm 5.7942 (6.3190) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][300/1251] eta 0:03:52 lr 0.000029 wd 0.0500 time 0.2331 (0.2448) data time 0.0012 (0.0030) model time 0.2320 (0.2407) loss 2.9788 (2.6665) grad_norm 4.5368 (6.3741) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][310/1251] eta 0:03:50 lr 0.000029 wd 0.0500 time 0.2449 (0.2448) data time 0.0010 (0.0030) model time 0.2438 (0.2408) loss 3.0348 (2.6712) grad_norm 4.5635 (6.3444) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][320/1251] eta 0:03:47 lr 0.000029 wd 0.0500 time 0.2448 (0.2447) data time 0.0008 (0.0029) model time 0.2440 (0.2409) loss 1.9994 (2.6629) grad_norm 3.2848 (6.3048) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][330/1251] eta 0:03:45 lr 0.000029 wd 0.0500 time 0.2514 (0.2446) data time 0.0009 (0.0029) model time 0.2505 (0.2408) loss 3.2131 (2.6640) grad_norm 6.2498 (6.2821) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][340/1251] eta 0:03:42 lr 0.000029 wd 0.0500 time 0.2437 (0.2445) data time 0.0011 (0.0028) model time 0.2426 (0.2408) loss 3.0905 (2.6662) grad_norm 4.5411 (6.2748) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][350/1251] eta 0:03:40 lr 0.000029 wd 0.0500 time 0.2540 (0.2445) data time 0.0010 (0.0028) model time 0.2530 (0.2408) loss 2.1342 (2.6663) grad_norm 4.9090 
(6.2520) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][360/1251] eta 0:03:37 lr 0.000029 wd 0.0500 time 0.2398 (0.2445) data time 0.0008 (0.0027) model time 0.2390 (0.2409) loss 2.9134 (2.6671) grad_norm 4.0237 (6.2167) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][370/1251] eta 0:03:35 lr 0.000029 wd 0.0500 time 0.2385 (0.2444) data time 0.0009 (0.0027) model time 0.2376 (0.2409) loss 1.6304 (2.6635) grad_norm 3.7374 (6.2199) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][380/1251] eta 0:03:33 lr 0.000029 wd 0.0500 time 0.2410 (0.2449) data time 0.0011 (0.0026) model time 0.2399 (0.2416) loss 2.8891 (2.6695) grad_norm 7.5050 (6.2003) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][390/1251] eta 0:03:30 lr 0.000029 wd 0.0500 time 0.2562 (0.2449) data time 0.0010 (0.0026) model time 0.2552 (0.2416) loss 2.2582 (2.6714) grad_norm 4.8148 (6.1719) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][400/1251] eta 0:03:28 lr 0.000029 wd 0.0500 time 0.2309 (0.2448) data time 0.0010 (0.0025) model time 0.2299 (0.2416) loss 1.6020 (2.6608) grad_norm 7.2688 (6.1610) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][410/1251] eta 0:03:25 lr 0.000029 wd 0.0500 time 0.2373 (0.2447) data time 0.0011 (0.0025) model time 0.2363 (0.2415) loss 2.0612 (2.6588) grad_norm 5.1657 (6.2799) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:47:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][420/1251] eta 0:03:23 lr 0.000029 wd 0.0500 time 0.2427 (0.2446) data 
time 0.0009 (0.0025) model time 0.2418 (0.2415) loss 2.0133 (2.6631) grad_norm 4.4929 (6.2719) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][430/1251] eta 0:03:20 lr 0.000029 wd 0.0500 time 0.2467 (0.2446) data time 0.0007 (0.0024) model time 0.2460 (0.2415) loss 1.6847 (2.6591) grad_norm 5.7829 (6.2578) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][440/1251] eta 0:03:18 lr 0.000029 wd 0.0500 time 0.2432 (0.2446) data time 0.0008 (0.0024) model time 0.2424 (0.2416) loss 2.8847 (2.6574) grad_norm 7.7534 (6.2545) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][450/1251] eta 0:03:16 lr 0.000029 wd 0.0500 time 0.2433 (0.2449) data time 0.0007 (0.0024) model time 0.2426 (0.2420) loss 1.9229 (2.6517) grad_norm 5.1407 (6.2357) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][460/1251] eta 0:03:13 lr 0.000029 wd 0.0500 time 0.2452 (0.2448) data time 0.0007 (0.0023) model time 0.2445 (0.2419) loss 3.0699 (2.6556) grad_norm 6.8628 (6.2179) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][470/1251] eta 0:03:11 lr 0.000029 wd 0.0500 time 0.2413 (0.2448) data time 0.0009 (0.0023) model time 0.2404 (0.2420) loss 2.8404 (2.6541) grad_norm 5.4957 (6.2065) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][480/1251] eta 0:03:08 lr 0.000029 wd 0.0500 time 0.2512 (0.2448) data time 0.0008 (0.0023) model time 0.2504 (0.2420) loss 2.9103 (2.6551) grad_norm 3.8620 (6.1761) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [275/300][490/1251] eta 0:03:06 lr 0.000029 wd 0.0500 time 0.2451 (0.2447) data time 0.0007 (0.0022) model time 0.2444 (0.2419) loss 2.4385 (2.6532) grad_norm 6.9503 (6.1711) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][500/1251] eta 0:03:03 lr 0.000029 wd 0.0500 time 0.2371 (0.2446) data time 0.0008 (0.0022) model time 0.2363 (0.2419) loss 3.2239 (2.6531) grad_norm 3.9541 (6.1491) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][510/1251] eta 0:03:01 lr 0.000029 wd 0.0500 time 0.2368 (0.2446) data time 0.0008 (0.0022) model time 0.2361 (0.2419) loss 2.7996 (2.6515) grad_norm 5.0104 (6.1375) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][520/1251] eta 0:02:58 lr 0.000029 wd 0.0500 time 0.2410 (0.2446) data time 0.0006 (0.0022) model time 0.2404 (0.2419) loss 1.7969 (2.6498) grad_norm 4.5009 (6.1232) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][530/1251] eta 0:02:56 lr 0.000029 wd 0.0500 time 0.4411 (0.2453) data time 0.0010 (0.0022) model time 0.4401 (0.2428) loss 2.6101 (2.6500) grad_norm 5.5205 (6.1215) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][540/1251] eta 0:02:54 lr 0.000029 wd 0.0500 time 0.2469 (0.2452) data time 0.0011 (0.0021) model time 0.2458 (0.2427) loss 2.6849 (2.6525) grad_norm 15.4936 (6.1129) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][550/1251] eta 0:02:51 lr 0.000029 wd 0.0500 time 0.2448 (0.2452) data time 0.0009 (0.0021) model time 0.2439 (0.2427) loss 3.0833 (2.6545) grad_norm 4.8325 (6.1135) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 09:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][560/1251] eta 0:02:49 lr 0.000029 wd 0.0500 time 0.2453 (0.2451) data time 0.0010 (0.0021) model time 0.2443 (0.2426) loss 2.7837 (2.6516) grad_norm 4.5861 (6.1150) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][570/1251] eta 0:02:46 lr 0.000029 wd 0.0500 time 0.2483 (0.2451) data time 0.0007 (0.0021) model time 0.2476 (0.2426) loss 3.2288 (2.6552) grad_norm 5.5275 (6.1063) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][580/1251] eta 0:02:44 lr 0.000029 wd 0.0500 time 0.2388 (0.2450) data time 0.0010 (0.0020) model time 0.2378 (0.2425) loss 2.9867 (2.6584) grad_norm 7.7673 (6.1036) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][590/1251] eta 0:02:41 lr 0.000029 wd 0.0500 time 0.2417 (0.2449) data time 0.0010 (0.0020) model time 0.2407 (0.2425) loss 3.1688 (2.6598) grad_norm 4.2086 (6.0918) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][600/1251] eta 0:02:39 lr 0.000029 wd 0.0500 time 0.2377 (0.2449) data time 0.0011 (0.0020) model time 0.2365 (0.2425) loss 2.8445 (2.6645) grad_norm 6.1489 (6.0832) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][610/1251] eta 0:02:36 lr 0.000029 wd 0.0500 time 0.2411 (0.2448) data time 0.0010 (0.0020) model time 0.2401 (0.2424) loss 2.9665 (2.6682) grad_norm 4.6759 (6.0638) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][620/1251] eta 0:02:34 lr 0.000029 wd 0.0500 time 0.2403 (0.2447) data time 0.0011 (0.0020) model 
time 0.2392 (0.2423) loss 3.1637 (2.6696) grad_norm 7.5491 (6.0527) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][630/1251] eta 0:02:31 lr 0.000029 wd 0.0500 time 0.2385 (0.2446) data time 0.0009 (0.0020) model time 0.2375 (0.2423) loss 2.9732 (2.6648) grad_norm 6.3715 (6.0390) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][640/1251] eta 0:02:29 lr 0.000029 wd 0.0500 time 0.2408 (0.2446) data time 0.0012 (0.0020) model time 0.2396 (0.2423) loss 1.7741 (2.6683) grad_norm 5.8828 (6.0272) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][650/1251] eta 0:02:26 lr 0.000029 wd 0.0500 time 0.2395 (0.2446) data time 0.0009 (0.0019) model time 0.2386 (0.2422) loss 2.4126 (2.6698) grad_norm 7.5163 (6.0253) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][660/1251] eta 0:02:24 lr 0.000029 wd 0.0500 time 0.2416 (0.2445) data time 0.0009 (0.0019) model time 0.2407 (0.2422) loss 2.9523 (2.6688) grad_norm 4.6612 (6.0110) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][670/1251] eta 0:02:22 lr 0.000029 wd 0.0500 time 0.2416 (0.2444) data time 0.0007 (0.0019) model time 0.2409 (0.2421) loss 2.0064 (2.6675) grad_norm 5.0504 (6.0078) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][680/1251] eta 0:02:19 lr 0.000029 wd 0.0500 time 0.2346 (0.2444) data time 0.0012 (0.0019) model time 0.2334 (0.2421) loss 3.0659 (2.6650) grad_norm 8.5479 (6.0047) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][690/1251] 
eta 0:02:17 lr 0.000029 wd 0.0500 time 0.2381 (0.2443) data time 0.0009 (0.0019) model time 0.2371 (0.2421) loss 1.5320 (2.6678) grad_norm 7.1753 (6.0196) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][700/1251] eta 0:02:14 lr 0.000028 wd 0.0500 time 0.2376 (0.2443) data time 0.0007 (0.0019) model time 0.2368 (0.2420) loss 3.1296 (2.6677) grad_norm 4.5522 (6.0119) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][710/1251] eta 0:02:12 lr 0.000028 wd 0.0500 time 0.2356 (0.2443) data time 0.0007 (0.0019) model time 0.2349 (0.2421) loss 2.1035 (2.6689) grad_norm 4.4473 (6.0145) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][720/1251] eta 0:02:09 lr 0.000028 wd 0.0500 time 0.2384 (0.2442) data time 0.0009 (0.0018) model time 0.2375 (0.2420) loss 3.0490 (2.6705) grad_norm 7.2683 (6.0188) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][730/1251] eta 0:02:07 lr 0.000028 wd 0.0500 time 0.2388 (0.2442) data time 0.0007 (0.0018) model time 0.2381 (0.2420) loss 2.4103 (2.6695) grad_norm 4.5270 (6.0076) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][740/1251] eta 0:02:04 lr 0.000028 wd 0.0500 time 0.2405 (0.2442) data time 0.0008 (0.0018) model time 0.2396 (0.2420) loss 1.9718 (2.6696) grad_norm 6.3896 (6.0075) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][750/1251] eta 0:02:02 lr 0.000028 wd 0.0500 time 0.2387 (0.2441) data time 0.0010 (0.0018) model time 0.2376 (0.2420) loss 3.1111 (2.6704) grad_norm 5.5483 (5.9944) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
09:49:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][760/1251] eta 0:01:59 lr 0.000028 wd 0.0500 time 0.2365 (0.2441) data time 0.0008 (0.0018) model time 0.2357 (0.2420) loss 3.0454 (2.6729) grad_norm 3.9882 (5.9877) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][770/1251] eta 0:01:57 lr 0.000028 wd 0.0500 time 0.2347 (0.2441) data time 0.0007 (0.0018) model time 0.2340 (0.2419) loss 2.9172 (2.6730) grad_norm 6.0280 (5.9765) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][780/1251] eta 0:01:54 lr 0.000028 wd 0.0500 time 0.2358 (0.2440) data time 0.0008 (0.0018) model time 0.2350 (0.2419) loss 3.2293 (2.6745) grad_norm 14.9157 (5.9929) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][790/1251] eta 0:01:52 lr 0.000028 wd 0.0500 time 0.2380 (0.2440) data time 0.0007 (0.0018) model time 0.2373 (0.2419) loss 3.0484 (2.6734) grad_norm 7.5516 (6.0070) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][800/1251] eta 0:01:50 lr 0.000028 wd 0.0500 time 0.2402 (0.2440) data time 0.0011 (0.0018) model time 0.2391 (0.2419) loss 2.8842 (2.6748) grad_norm 5.2946 (6.0123) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][810/1251] eta 0:01:47 lr 0.000028 wd 0.0500 time 0.2359 (0.2439) data time 0.0011 (0.0018) model time 0.2348 (0.2419) loss 2.4882 (2.6751) grad_norm 5.0391 (6.0158) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][820/1251] eta 0:01:45 lr 0.000028 wd 0.0500 time 0.2473 (0.2439) data time 0.0007 (0.0017) model time 0.2466 (0.2418) loss 3.5259 
(2.6747) grad_norm 6.3089 (6.0190) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][830/1251] eta 0:01:42 lr 0.000028 wd 0.0500 time 0.2466 (0.2439) data time 0.0007 (0.0017) model time 0.2459 (0.2418) loss 3.3250 (2.6773) grad_norm 3.2403 (6.0221) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][840/1251] eta 0:01:40 lr 0.000028 wd 0.0500 time 0.2342 (0.2438) data time 0.0011 (0.0017) model time 0.2331 (0.2418) loss 2.9444 (2.6750) grad_norm 6.4962 (6.0379) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][850/1251] eta 0:01:37 lr 0.000028 wd 0.0500 time 0.2338 (0.2438) data time 0.0008 (0.0017) model time 0.2330 (0.2417) loss 2.8536 (2.6762) grad_norm 5.0992 (6.0349) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][860/1251] eta 0:01:35 lr 0.000028 wd 0.0500 time 0.2399 (0.2437) data time 0.0009 (0.0017) model time 0.2390 (0.2417) loss 2.3430 (2.6750) grad_norm 6.2455 (6.0276) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][870/1251] eta 0:01:32 lr 0.000028 wd 0.0500 time 0.2399 (0.2437) data time 0.0010 (0.0017) model time 0.2389 (0.2417) loss 3.2723 (2.6777) grad_norm 5.1202 (6.0183) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][880/1251] eta 0:01:30 lr 0.000028 wd 0.0500 time 0.2371 (0.2436) data time 0.0007 (0.0017) model time 0.2365 (0.2416) loss 3.2344 (2.6780) grad_norm 6.9069 (6.0088) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][890/1251] eta 0:01:27 lr 0.000028 wd 0.0500 
time 0.2368 (0.2436) data time 0.0009 (0.0017) model time 0.2359 (0.2416) loss 2.2295 (2.6764) grad_norm 7.0836 (5.9986) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][900/1251] eta 0:01:25 lr 0.000028 wd 0.0500 time 0.2330 (0.2437) data time 0.0010 (0.0017) model time 0.2320 (0.2417) loss 2.7948 (2.6785) grad_norm 4.9159 (5.9995) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][910/1251] eta 0:01:23 lr 0.000028 wd 0.0500 time 0.2392 (0.2436) data time 0.0011 (0.0017) model time 0.2382 (0.2416) loss 2.7505 (2.6805) grad_norm 5.7830 (6.0263) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:49:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][920/1251] eta 0:01:20 lr 0.000028 wd 0.0500 time 0.2282 (0.2435) data time 0.0010 (0.0017) model time 0.2272 (0.2416) loss 2.5748 (2.6802) grad_norm 3.9367 (6.0228) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][930/1251] eta 0:01:18 lr 0.000028 wd 0.0500 time 0.2386 (0.2437) data time 0.0010 (0.0017) model time 0.2376 (0.2418) loss 3.2521 (2.6810) grad_norm 4.2303 (6.0155) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][940/1251] eta 0:01:15 lr 0.000028 wd 0.0500 time 0.2380 (0.2437) data time 0.0008 (0.0016) model time 0.2372 (0.2418) loss 1.8965 (2.6823) grad_norm 4.1620 (6.0057) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][950/1251] eta 0:01:13 lr 0.000028 wd 0.0500 time 0.2386 (0.2437) data time 0.0011 (0.0016) model time 0.2375 (0.2417) loss 2.4739 (2.6800) grad_norm 4.7089 (6.0003) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:09 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [275/300][960/1251] eta 0:01:10 lr 0.000028 wd 0.0500 time 0.2569 (0.2437) data time 0.0011 (0.0016) model time 0.2558 (0.2418) loss 3.0863 (2.6807) grad_norm 3.4736 (5.9976) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][970/1251] eta 0:01:08 lr 0.000028 wd 0.0500 time 0.2376 (0.2436) data time 0.0011 (0.0016) model time 0.2366 (0.2417) loss 2.9386 (2.6786) grad_norm 6.1374 (5.9914) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][980/1251] eta 0:01:06 lr 0.000028 wd 0.0500 time 0.2313 (0.2436) data time 0.0012 (0.0016) model time 0.2302 (0.2417) loss 2.6859 (2.6795) grad_norm 6.3084 (6.0048) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][990/1251] eta 0:01:03 lr 0.000028 wd 0.0500 time 0.2473 (0.2437) data time 0.0007 (0.0016) model time 0.2466 (0.2419) loss 2.6766 (2.6794) grad_norm 4.3481 (5.9968) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1000/1251] eta 0:01:01 lr 0.000028 wd 0.0500 time 0.2332 (0.2437) data time 0.0009 (0.0016) model time 0.2323 (0.2419) loss 2.1634 (2.6782) grad_norm 6.1290 (5.9873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1010/1251] eta 0:00:58 lr 0.000028 wd 0.0500 time 0.2383 (0.2437) data time 0.0009 (0.0016) model time 0.2374 (0.2419) loss 3.0890 (2.6802) grad_norm 6.4020 (5.9818) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1020/1251] eta 0:00:56 lr 0.000028 wd 0.0500 time 0.2311 (0.2437) data time 0.0009 (0.0016) model time 0.2302 (0.2418) loss 3.0850 (2.6799) grad_norm 4.4965 
(5.9832) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1030/1251] eta 0:00:53 lr 0.000028 wd 0.0500 time 0.2356 (0.2437) data time 0.0007 (0.0016) model time 0.2349 (0.2418) loss 2.7711 (2.6799) grad_norm 6.2780 (5.9737) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 09:50:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1040/1251] eta 0:00:51 lr 0.000028 wd 0.0500 time 0.2439 (0.2436) data time 0.0009 (0.0016) model time 0.2429 (0.2418) loss 2.9500 (2.6803) grad_norm 6.1833 (5.9723) loss_scale 256.0000 (128.3689) mem 7381MB [2024-09-01 09:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1050/1251] eta 0:00:49 lr 0.000028 wd 0.0500 time 0.4076 (0.2440) data time 0.0007 (0.0016) model time 0.4069 (0.2422) loss 3.3372 (2.6824) grad_norm 6.8941 (5.9624) loss_scale 256.0000 (129.5833) mem 7381MB [2024-09-01 09:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1060/1251] eta 0:00:46 lr 0.000028 wd 0.0500 time 0.2406 (0.2440) data time 0.0009 (0.0016) model time 0.2396 (0.2422) loss 3.1886 (2.6832) grad_norm 3.4012 (5.9495) loss_scale 256.0000 (130.7747) mem 7381MB [2024-09-01 09:50:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1070/1251] eta 0:00:44 lr 0.000028 wd 0.0500 time 0.2453 (0.2440) data time 0.0007 (0.0016) model time 0.2446 (0.2422) loss 3.0453 (2.6843) grad_norm 7.1757 (5.9502) loss_scale 256.0000 (131.9440) mem 7381MB [2024-09-01 09:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1080/1251] eta 0:00:41 lr 0.000028 wd 0.0500 time 0.2374 (0.2439) data time 0.0007 (0.0016) model time 0.2366 (0.2422) loss 2.9975 (2.6865) grad_norm 5.3519 (5.9468) loss_scale 256.0000 (133.0916) mem 7381MB [2024-09-01 09:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1090/1251] eta 0:00:39 lr 0.000028 wd 0.0500 time 0.2317 
(0.2439) data time 0.0010 (0.0016) model time 0.2306 (0.2421) loss 2.6562 (2.6846) grad_norm 4.3739 (5.9464) loss_scale 256.0000 (134.2181) mem 7381MB [2024-09-01 09:50:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1100/1251] eta 0:00:36 lr 0.000028 wd 0.0500 time 0.2471 (0.2439) data time 0.0010 (0.0016) model time 0.2461 (0.2421) loss 2.2446 (2.6850) grad_norm 12.7413 (5.9458) loss_scale 256.0000 (135.3243) mem 7381MB [2024-09-01 09:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1110/1251] eta 0:00:34 lr 0.000028 wd 0.0500 time 0.2370 (0.2439) data time 0.0009 (0.0015) model time 0.2360 (0.2421) loss 2.8636 (2.6852) grad_norm 6.9923 (5.9400) loss_scale 256.0000 (136.4104) mem 7381MB [2024-09-01 09:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1120/1251] eta 0:00:31 lr 0.000028 wd 0.0500 time 0.2403 (0.2438) data time 0.0008 (0.0015) model time 0.2395 (0.2421) loss 3.1143 (2.6861) grad_norm 6.4778 (5.9365) loss_scale 256.0000 (137.4773) mem 7381MB [2024-09-01 09:50:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1130/1251] eta 0:00:29 lr 0.000028 wd 0.0500 time 0.2528 (0.2438) data time 0.0009 (0.0015) model time 0.2519 (0.2421) loss 2.2776 (2.6867) grad_norm 4.3641 (5.9316) loss_scale 256.0000 (138.5252) mem 7381MB [2024-09-01 09:50:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1140/1251] eta 0:00:27 lr 0.000028 wd 0.0500 time 0.2381 (0.2438) data time 0.0008 (0.0015) model time 0.2373 (0.2420) loss 2.5764 (2.6863) grad_norm 5.2215 (5.9294) loss_scale 256.0000 (139.5548) mem 7381MB [2024-09-01 09:50:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1150/1251] eta 0:00:24 lr 0.000028 wd 0.0500 time 0.2324 (0.2437) data time 0.0008 (0.0015) model time 0.2316 (0.2420) loss 2.0880 (2.6865) grad_norm 7.0154 (5.9218) loss_scale 256.0000 (140.5665) mem 7381MB [2024-09-01 09:50:58 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [275/300][1160/1251] eta 0:00:22 lr 0.000028 wd 0.0500 time 0.2455 (0.2437) data time 0.0007 (0.0015) model time 0.2448 (0.2420) loss 3.2518 (2.6877) grad_norm 6.4264 (5.9224) loss_scale 256.0000 (141.5607) mem 7381MB [2024-09-01 09:51:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1170/1251] eta 0:00:19 lr 0.000028 wd 0.0500 time 0.2410 (0.2437) data time 0.0008 (0.0015) model time 0.2402 (0.2420) loss 1.9421 (2.6856) grad_norm 4.6986 (5.9635) loss_scale 256.0000 (142.5380) mem 7381MB [2024-09-01 09:51:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1180/1251] eta 0:00:17 lr 0.000028 wd 0.0500 time 0.2516 (0.2437) data time 0.0009 (0.0015) model time 0.2507 (0.2420) loss 3.0932 (2.6866) grad_norm 5.1783 (5.9559) loss_scale 256.0000 (143.4987) mem 7381MB [2024-09-01 09:51:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1190/1251] eta 0:00:14 lr 0.000028 wd 0.0500 time 0.2453 (0.2437) data time 0.0010 (0.0015) model time 0.2443 (0.2420) loss 2.4117 (2.6874) grad_norm 4.1647 (5.9472) loss_scale 256.0000 (144.4433) mem 7381MB [2024-09-01 09:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1200/1251] eta 0:00:12 lr 0.000028 wd 0.0500 time 0.2406 (0.2437) data time 0.0007 (0.0015) model time 0.2399 (0.2420) loss 3.0239 (2.6882) grad_norm 10.3780 (5.9478) loss_scale 256.0000 (145.3722) mem 7381MB [2024-09-01 09:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1210/1251] eta 0:00:09 lr 0.000028 wd 0.0500 time 0.2376 (0.2437) data time 0.0011 (0.0015) model time 0.2365 (0.2420) loss 3.3059 (2.6882) grad_norm 3.8112 (5.9487) loss_scale 256.0000 (146.2857) mem 7381MB [2024-09-01 09:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1220/1251] eta 0:00:07 lr 0.000028 wd 0.0500 time 0.2479 (0.2437) data time 0.0009 (0.0015) model time 0.2470 (0.2420) loss 3.3708 (2.6883) grad_norm 
4.5940 (5.9480) loss_scale 256.0000 (147.1843) mem 7381MB [2024-09-01 09:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1230/1251] eta 0:00:05 lr 0.000028 wd 0.0500 time 0.2639 (0.2437) data time 0.0009 (0.0015) model time 0.2629 (0.2420) loss 3.3151 (2.6895) grad_norm 4.6766 (5.9371) loss_scale 256.0000 (148.0682) mem 7381MB [2024-09-01 09:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1240/1251] eta 0:00:02 lr 0.000028 wd 0.0500 time 0.2214 (0.2436) data time 0.0005 (0.0015) model time 0.2210 (0.2419) loss 2.9356 (2.6903) grad_norm 6.6179 (5.9621) loss_scale 256.0000 (148.9380) mem 7381MB [2024-09-01 09:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [275/300][1250/1251] eta 0:00:00 lr 0.000028 wd 0.0500 time 0.2237 (0.2435) data time 0.0007 (0.0015) model time 0.2230 (0.2418) loss 2.7161 (2.6889) grad_norm 3.5439 (5.9637) loss_scale 256.0000 (149.7938) mem 7381MB [2024-09-01 09:51:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 275 training takes 0:05:04 [2024-09-01 09:51:20 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 09:51:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 09:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.439 (0.439) Loss 0.3994 (0.3994) Acc@1 92.480 (92.480) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 09:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.083 (0.113) Loss 0.5640 (0.6094) Acc@1 90.332 (87.740) Acc@5 97.852 (97.763) Mem 7381MB [2024-09-01 09:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.083 (0.097) Loss 0.9097 (0.6427) Acc@1 77.930 (86.561) Acc@5 95.801 (97.661) Mem 7381MB [2024-09-01 09:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.092) Loss 1.1660 (0.7388) Acc@1 73.926 (84.287) Acc@5 92.383 (96.673) Mem 7381MB [2024-09-01 09:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0332 (0.7882) Acc@1 76.953 (83.096) Acc@5 94.238 (96.201) Mem 7381MB [2024-09-01 09:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.628 Acc@5 96.152 [2024-09-01 09:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.6% [2024-09-01 09:51:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.725 (0.725) Loss 0.3877 (0.3877) Acc@1 92.969 (92.969) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 09:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.142) Loss 0.5654 (0.6054) Acc@1 90.723 (87.935) Acc@5 98.242 (97.789) Mem 7381MB [2024-09-01 09:51:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.112) Loss 0.9053 (0.6358) Acc@1 78.223 (86.747) Acc@5 95.801 (97.731) Mem 7381MB [2024-09-01 09:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.101) Loss 1.1279 (0.7277) Acc@1 74.609 (84.573) Acc@5 92.871 (96.799) Mem 7381MB [2024-09-01 09:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 1.0059 
(0.7754) Acc@1 77.539 (83.394) Acc@5 94.434 (96.327) Mem 7381MB [2024-09-01 09:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.960 Acc@5 96.284 [2024-09-01 09:51:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0% [2024-09-01 09:51:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][0/1251] eta 0:22:17 lr 0.000028 wd 0.0500 time 1.0691 (1.0691) data time 0.6419 (0.6419) model time 0.0000 (0.0000) loss 1.9863 (1.9863) grad_norm 5.0321 (5.0321) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][10/1251] eta 0:06:33 lr 0.000028 wd 0.0500 time 0.2381 (0.3172) data time 0.0010 (0.0592) model time 0.0000 (0.0000) loss 2.7425 (2.8754) grad_norm 4.0626 (5.8965) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][20/1251] eta 0:05:47 lr 0.000028 wd 0.0500 time 0.2455 (0.2823) data time 0.0007 (0.0316) model time 0.0000 (0.0000) loss 1.6789 (2.7996) grad_norm 4.8576 (5.4232) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][30/1251] eta 0:05:28 lr 0.000028 wd 0.0500 time 0.2369 (0.2690) data time 0.0011 (0.0217) model time 0.0000 (0.0000) loss 2.1145 (2.7168) grad_norm 7.6167 (5.5902) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][40/1251] eta 0:05:17 lr 0.000028 wd 0.0500 time 0.2369 (0.2619) data time 0.0009 (0.0167) model time 0.0000 (0.0000) loss 2.9900 (2.7433) grad_norm 4.6149 (5.4193) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][50/1251] eta 0:05:09 lr 0.000028 wd 0.0500 time 0.2404 (0.2580) data time 0.0009 (0.0136) model time 0.0000 (0.0000) loss 2.8433 
(2.7512) grad_norm 6.1054 (5.4972) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][60/1251] eta 0:05:04 lr 0.000028 wd 0.0500 time 0.2401 (0.2556) data time 0.0011 (0.0115) model time 0.2390 (0.2427) loss 1.9702 (2.7537) grad_norm 7.5398 (5.5197) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][70/1251] eta 0:04:59 lr 0.000028 wd 0.0500 time 0.2371 (0.2536) data time 0.0009 (0.0100) model time 0.2362 (0.2414) loss 2.7871 (2.7448) grad_norm 5.3095 (5.4046) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][80/1251] eta 0:04:55 lr 0.000028 wd 0.0500 time 0.2443 (0.2522) data time 0.0008 (0.0089) model time 0.2435 (0.2415) loss 3.3344 (2.7642) grad_norm 6.5166 (5.4251) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][90/1251] eta 0:04:51 lr 0.000028 wd 0.0500 time 0.2400 (0.2508) data time 0.0010 (0.0080) model time 0.2390 (0.2406) loss 2.5500 (2.7337) grad_norm 4.5896 (5.3936) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][100/1251] eta 0:04:47 lr 0.000028 wd 0.0500 time 0.2408 (0.2499) data time 0.0009 (0.0073) model time 0.2398 (0.2406) loss 2.7867 (2.7288) grad_norm 10.9412 (5.4159) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][110/1251] eta 0:04:44 lr 0.000028 wd 0.0500 time 0.2427 (0.2490) data time 0.0010 (0.0068) model time 0.2417 (0.2405) loss 3.2430 (2.7131) grad_norm 3.4976 (5.4931) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:51:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][120/1251] eta 0:04:41 lr 0.000028 wd 0.0500 
time 0.2461 (0.2485) data time 0.0007 (0.0063) model time 0.2454 (0.2407) loss 2.8336 (2.7041) grad_norm 3.7599 (5.4805) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][130/1251] eta 0:04:38 lr 0.000028 wd 0.0500 time 0.2423 (0.2481) data time 0.0009 (0.0059) model time 0.2414 (0.2408) loss 2.2545 (2.6942) grad_norm 4.8218 (5.4964) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][140/1251] eta 0:04:34 lr 0.000028 wd 0.0500 time 0.2308 (0.2475) data time 0.0008 (0.0055) model time 0.2300 (0.2406) loss 2.9684 (2.7046) grad_norm 3.3524 (5.4915) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][150/1251] eta 0:04:32 lr 0.000028 wd 0.0500 time 0.2334 (0.2472) data time 0.0009 (0.0052) model time 0.2325 (0.2407) loss 2.3654 (2.7112) grad_norm 5.6712 (5.5062) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][160/1251] eta 0:04:29 lr 0.000028 wd 0.0500 time 0.2422 (0.2467) data time 0.0011 (0.0050) model time 0.2412 (0.2406) loss 3.1853 (2.7184) grad_norm 5.6229 (5.4727) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][170/1251] eta 0:04:27 lr 0.000028 wd 0.0500 time 0.2392 (0.2471) data time 0.0008 (0.0047) model time 0.2383 (0.2415) loss 2.3184 (2.7031) grad_norm 4.6044 (5.4366) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][180/1251] eta 0:04:24 lr 0.000028 wd 0.0500 time 0.2419 (0.2468) data time 0.0009 (0.0045) model time 0.2410 (0.2414) loss 2.1468 (2.6969) grad_norm 4.6580 (5.4198) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:16 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [276/300][190/1251] eta 0:04:21 lr 0.000028 wd 0.0500 time 0.2402 (0.2465) data time 0.0007 (0.0043) model time 0.2395 (0.2413) loss 2.4389 (2.6991) grad_norm 5.5210 (5.5210) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][200/1251] eta 0:04:18 lr 0.000028 wd 0.0500 time 0.2407 (0.2463) data time 0.0009 (0.0042) model time 0.2398 (0.2413) loss 3.1929 (2.7002) grad_norm 8.1841 (5.6428) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][210/1251] eta 0:04:16 lr 0.000028 wd 0.0500 time 0.2435 (0.2460) data time 0.0011 (0.0040) model time 0.2424 (0.2412) loss 2.7374 (2.7016) grad_norm 5.2145 (5.6820) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][220/1251] eta 0:04:14 lr 0.000028 wd 0.0500 time 0.2154 (0.2465) data time 0.0008 (0.0039) model time 0.2147 (0.2421) loss 2.2089 (2.6911) grad_norm 4.6812 (5.7278) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][230/1251] eta 0:04:11 lr 0.000028 wd 0.0500 time 0.2408 (0.2462) data time 0.0009 (0.0038) model time 0.2399 (0.2419) loss 2.8673 (2.6864) grad_norm 3.5368 (5.6887) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][240/1251] eta 0:04:08 lr 0.000028 wd 0.0500 time 0.2496 (0.2461) data time 0.0009 (0.0036) model time 0.2487 (0.2420) loss 2.6961 (2.6831) grad_norm 10.5872 (5.6800) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][250/1251] eta 0:04:06 lr 0.000028 wd 0.0500 time 0.2379 (0.2460) data time 0.0008 (0.0035) model time 0.2372 (0.2420) loss 3.2404 (2.6783) grad_norm 3.6751 
(5.6864) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][260/1251] eta 0:04:03 lr 0.000028 wd 0.0500 time 0.2483 (0.2458) data time 0.0011 (0.0034) model time 0.2472 (0.2419) loss 3.0771 (2.6797) grad_norm 7.4528 (5.6793) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][270/1251] eta 0:04:00 lr 0.000028 wd 0.0500 time 0.2416 (0.2457) data time 0.0007 (0.0034) model time 0.2410 (0.2418) loss 2.2150 (2.6775) grad_norm 4.5437 (5.8144) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][280/1251] eta 0:03:58 lr 0.000028 wd 0.0500 time 0.2485 (0.2455) data time 0.0007 (0.0033) model time 0.2478 (0.2418) loss 2.1229 (2.6669) grad_norm 5.0106 (5.7777) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][290/1251] eta 0:03:55 lr 0.000027 wd 0.0500 time 0.2420 (0.2454) data time 0.0009 (0.0032) model time 0.2411 (0.2417) loss 1.6892 (2.6702) grad_norm 4.1532 (5.7679) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][300/1251] eta 0:03:53 lr 0.000027 wd 0.0500 time 0.2393 (0.2453) data time 0.0010 (0.0031) model time 0.2382 (0.2417) loss 3.1779 (2.6688) grad_norm 27.9967 (5.8910) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][310/1251] eta 0:03:50 lr 0.000027 wd 0.0500 time 0.2422 (0.2451) data time 0.0009 (0.0030) model time 0.2413 (0.2416) loss 2.9109 (2.6699) grad_norm 7.9891 (5.8808) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][320/1251] eta 0:03:49 lr 0.000027 wd 0.0500 time 0.2453 (0.2465) 
data time 0.0007 (0.0030) model time 0.2446 (0.2433) loss 2.7892 (2.6751) grad_norm 5.2105 (5.8546) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][330/1251] eta 0:03:46 lr 0.000027 wd 0.0500 time 0.2444 (0.2463) data time 0.0009 (0.0029) model time 0.2435 (0.2432) loss 2.4069 (2.6738) grad_norm 4.3892 (5.8259) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][340/1251] eta 0:03:44 lr 0.000027 wd 0.0500 time 0.2378 (0.2462) data time 0.0009 (0.0029) model time 0.2368 (0.2431) loss 2.9445 (2.6748) grad_norm 20.6920 (5.8741) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][350/1251] eta 0:03:41 lr 0.000027 wd 0.0500 time 0.2465 (0.2460) data time 0.0008 (0.0028) model time 0.2457 (0.2429) loss 3.4509 (2.6836) grad_norm 5.0814 (5.8550) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:52:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][360/1251] eta 0:03:39 lr 0.000027 wd 0.0500 time 0.2423 (0.2458) data time 0.0009 (0.0028) model time 0.2414 (0.2428) loss 2.5771 (2.6749) grad_norm 4.4653 (5.8258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][370/1251] eta 0:03:36 lr 0.000027 wd 0.0500 time 0.2289 (0.2462) data time 0.0011 (0.0027) model time 0.2278 (0.2433) loss 3.0333 (2.6786) grad_norm 6.8343 (5.8320) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][380/1251] eta 0:03:34 lr 0.000027 wd 0.0500 time 0.2388 (0.2461) data time 0.0007 (0.0027) model time 0.2381 (0.2432) loss 3.1754 (2.6817) grad_norm 4.5098 (5.8242) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [276/300][390/1251] eta 0:03:31 lr 0.000027 wd 0.0500 time 0.2450 (0.2459) data time 0.0006 (0.0026) model time 0.2443 (0.2431) loss 3.2520 (2.6835) grad_norm 6.9350 (5.8143) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][400/1251] eta 0:03:29 lr 0.000027 wd 0.0500 time 0.2411 (0.2458) data time 0.0007 (0.0026) model time 0.2404 (0.2430) loss 2.2026 (2.6824) grad_norm 4.5428 (5.7921) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][410/1251] eta 0:03:26 lr 0.000027 wd 0.0500 time 0.2364 (0.2457) data time 0.0010 (0.0025) model time 0.2355 (0.2429) loss 2.2328 (2.6882) grad_norm 4.4328 (5.7876) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][420/1251] eta 0:03:24 lr 0.000027 wd 0.0500 time 0.2458 (0.2455) data time 0.0009 (0.0025) model time 0.2449 (0.2429) loss 2.6377 (2.6861) grad_norm 4.9085 (5.7745) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][430/1251] eta 0:03:21 lr 0.000027 wd 0.0500 time 0.2439 (0.2455) data time 0.0012 (0.0025) model time 0.2427 (0.2429) loss 2.7546 (2.6908) grad_norm 5.8252 (5.7748) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][440/1251] eta 0:03:19 lr 0.000027 wd 0.0500 time 0.2481 (0.2454) data time 0.0007 (0.0024) model time 0.2474 (0.2428) loss 2.2991 (2.6898) grad_norm 3.8284 (5.7662) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][450/1251] eta 0:03:16 lr 0.000027 wd 0.0500 time 0.2410 (0.2453) data time 0.0008 (0.0024) model time 0.2403 (0.2428) loss 2.1933 (2.6907) grad_norm 6.5055 (5.7644) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 09:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][460/1251] eta 0:03:14 lr 0.000027 wd 0.0500 time 0.2419 (0.2453) data time 0.0009 (0.0024) model time 0.2410 (0.2427) loss 3.1909 (2.6919) grad_norm 5.2125 (5.7610) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][470/1251] eta 0:03:11 lr 0.000027 wd 0.0500 time 0.2435 (0.2453) data time 0.0010 (0.0023) model time 0.2425 (0.2428) loss 3.0399 (2.6937) grad_norm 4.8413 (5.7498) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][480/1251] eta 0:03:09 lr 0.000027 wd 0.0500 time 0.2368 (0.2452) data time 0.0007 (0.0023) model time 0.2361 (0.2427) loss 2.2195 (2.6872) grad_norm 3.5551 (5.7410) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][490/1251] eta 0:03:06 lr 0.000027 wd 0.0500 time 0.2364 (0.2452) data time 0.0011 (0.0023) model time 0.2353 (0.2427) loss 1.8066 (2.6864) grad_norm 4.4739 (5.7321) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][500/1251] eta 0:03:04 lr 0.000027 wd 0.0500 time 0.2492 (0.2451) data time 0.0007 (0.0023) model time 0.2485 (0.2426) loss 1.6046 (2.6887) grad_norm 3.9788 (5.7091) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][510/1251] eta 0:03:01 lr 0.000027 wd 0.0500 time 0.2369 (0.2450) data time 0.0008 (0.0022) model time 0.2361 (0.2426) loss 2.7603 (2.6899) grad_norm 5.6057 (5.7019) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][520/1251] eta 0:02:59 lr 0.000027 wd 0.0500 time 0.2472 (0.2449) data time 0.0011 (0.0022) model 
time 0.2461 (0.2425) loss 1.6584 (2.6886) grad_norm 7.1611 (5.6969) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][530/1251] eta 0:02:56 lr 0.000027 wd 0.0500 time 0.2431 (0.2448) data time 0.0010 (0.0022) model time 0.2421 (0.2424) loss 2.6472 (2.6854) grad_norm 7.7574 (5.6885) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][540/1251] eta 0:02:53 lr 0.000027 wd 0.0500 time 0.2414 (0.2447) data time 0.0007 (0.0022) model time 0.2406 (0.2424) loss 2.6583 (2.6850) grad_norm 5.0286 (5.6942) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][550/1251] eta 0:02:51 lr 0.000027 wd 0.0500 time 0.2349 (0.2446) data time 0.0012 (0.0021) model time 0.2338 (0.2423) loss 2.6067 (2.6837) grad_norm 3.9419 (5.6857) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][560/1251] eta 0:02:48 lr 0.000027 wd 0.0500 time 0.2432 (0.2446) data time 0.0007 (0.0021) model time 0.2425 (0.2423) loss 2.7287 (2.6844) grad_norm 4.1225 (5.6766) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][570/1251] eta 0:02:46 lr 0.000027 wd 0.0500 time 0.2451 (0.2445) data time 0.0007 (0.0021) model time 0.2443 (0.2422) loss 2.0415 (2.6840) grad_norm 5.2136 (5.6709) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][580/1251] eta 0:02:44 lr 0.000027 wd 0.0500 time 0.2445 (0.2445) data time 0.0009 (0.0021) model time 0.2436 (0.2422) loss 3.0156 (2.6807) grad_norm 3.4450 (5.6475) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][590/1251] 
eta 0:02:41 lr 0.000027 wd 0.0500 time 0.2408 (0.2444) data time 0.0010 (0.0021) model time 0.2398 (0.2422) loss 2.4649 (2.6798) grad_norm 5.1173 (5.6581) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][600/1251] eta 0:02:39 lr 0.000027 wd 0.0500 time 0.2474 (0.2444) data time 0.0011 (0.0020) model time 0.2463 (0.2421) loss 3.2235 (2.6781) grad_norm 2.9445 (5.6512) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:53:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][610/1251] eta 0:02:36 lr 0.000027 wd 0.0500 time 0.2362 (0.2443) data time 0.0010 (0.0020) model time 0.2353 (0.2421) loss 3.0263 (2.6754) grad_norm 13.2579 (5.6588) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][620/1251] eta 0:02:34 lr 0.000027 wd 0.0500 time 0.2535 (0.2442) data time 0.0008 (0.0020) model time 0.2527 (0.2420) loss 2.7241 (2.6770) grad_norm 4.4001 (5.6550) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][630/1251] eta 0:02:31 lr 0.000027 wd 0.0500 time 0.2413 (0.2442) data time 0.0011 (0.0020) model time 0.2402 (0.2420) loss 2.5752 (2.6752) grad_norm 3.7559 (5.6506) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][640/1251] eta 0:02:29 lr 0.000027 wd 0.0500 time 0.2470 (0.2441) data time 0.0010 (0.0020) model time 0.2460 (0.2420) loss 2.9235 (2.6798) grad_norm 6.7886 (5.6486) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][650/1251] eta 0:02:26 lr 0.000027 wd 0.0500 time 0.2354 (0.2441) data time 0.0011 (0.0020) model time 0.2343 (0.2419) loss 3.2142 (2.6804) grad_norm 7.0787 (5.7045) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
09:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][660/1251] eta 0:02:24 lr 0.000027 wd 0.0500 time 0.2398 (0.2440) data time 0.0008 (0.0019) model time 0.2390 (0.2419) loss 2.3079 (2.6768) grad_norm 4.5123 (5.7120) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][670/1251] eta 0:02:21 lr 0.000027 wd 0.0500 time 0.2498 (0.2440) data time 0.0007 (0.0019) model time 0.2491 (0.2419) loss 2.9943 (2.6803) grad_norm 9.0890 (5.7140) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][680/1251] eta 0:02:19 lr 0.000027 wd 0.0500 time 0.2375 (0.2440) data time 0.0009 (0.0019) model time 0.2366 (0.2419) loss 1.8106 (2.6756) grad_norm 6.8780 (5.7147) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][690/1251] eta 0:02:16 lr 0.000027 wd 0.0500 time 0.2454 (0.2440) data time 0.0007 (0.0019) model time 0.2446 (0.2419) loss 3.1794 (2.6787) grad_norm 7.6925 (5.7265) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][700/1251] eta 0:02:14 lr 0.000027 wd 0.0500 time 0.2404 (0.2442) data time 0.0009 (0.0019) model time 0.2394 (0.2421) loss 2.9956 (2.6829) grad_norm 4.6249 (5.7491) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][710/1251] eta 0:02:12 lr 0.000027 wd 0.0500 time 0.2410 (0.2442) data time 0.0011 (0.0019) model time 0.2399 (0.2421) loss 1.9807 (2.6813) grad_norm 4.4640 (5.7401) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][720/1251] eta 0:02:09 lr 0.000027 wd 0.0500 time 0.2384 (0.2441) data time 0.0007 (0.0019) model time 0.2376 (0.2421) loss 3.2343 
(2.6829) grad_norm 7.5673 (5.7349) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][730/1251] eta 0:02:07 lr 0.000027 wd 0.0500 time 0.2402 (0.2441) data time 0.0007 (0.0019) model time 0.2395 (0.2420) loss 3.1332 (2.6845) grad_norm 3.9536 (5.7377) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][740/1251] eta 0:02:04 lr 0.000027 wd 0.0500 time 0.2440 (0.2440) data time 0.0009 (0.0018) model time 0.2431 (0.2420) loss 2.0297 (2.6831) grad_norm 10.7248 (5.7482) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][750/1251] eta 0:02:02 lr 0.000027 wd 0.0500 time 0.2371 (0.2442) data time 0.0010 (0.0018) model time 0.2361 (0.2422) loss 2.5813 (2.6856) grad_norm 7.0448 (5.7495) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][760/1251] eta 0:01:59 lr 0.000027 wd 0.0500 time 0.2333 (0.2442) data time 0.0008 (0.0018) model time 0.2325 (0.2422) loss 3.3476 (2.6867) grad_norm 6.8809 (5.7909) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][770/1251] eta 0:01:57 lr 0.000027 wd 0.0500 time 0.2405 (0.2441) data time 0.0009 (0.0018) model time 0.2395 (0.2422) loss 3.0115 (2.6916) grad_norm 3.8288 (5.7821) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][780/1251] eta 0:01:54 lr 0.000027 wd 0.0500 time 0.2433 (0.2441) data time 0.0010 (0.0018) model time 0.2422 (0.2422) loss 2.5248 (2.6908) grad_norm 5.0287 (5.7937) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][790/1251] eta 0:01:52 lr 0.000027 wd 
0.0500 time 0.2404 (0.2441) data time 0.0008 (0.0018) model time 0.2396 (0.2422) loss 1.8595 (2.6940) grad_norm 8.9792 (5.7916) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][800/1251] eta 0:01:50 lr 0.000027 wd 0.0500 time 0.2456 (0.2441) data time 0.0007 (0.0018) model time 0.2449 (0.2421) loss 2.9866 (2.6923) grad_norm 10.4335 (5.8007) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][810/1251] eta 0:01:47 lr 0.000027 wd 0.0500 time 0.2446 (0.2440) data time 0.0011 (0.0018) model time 0.2435 (0.2421) loss 2.1959 (2.6939) grad_norm 8.2899 (5.7978) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][820/1251] eta 0:01:45 lr 0.000027 wd 0.0500 time 0.2477 (0.2440) data time 0.0010 (0.0018) model time 0.2467 (0.2421) loss 2.4277 (2.6960) grad_norm 4.7919 (5.7985) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][830/1251] eta 0:01:42 lr 0.000027 wd 0.0500 time 0.2372 (0.2440) data time 0.0009 (0.0018) model time 0.2364 (0.2421) loss 3.2155 (2.6984) grad_norm 5.6566 (5.7897) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][840/1251] eta 0:01:40 lr 0.000027 wd 0.0500 time 0.4524 (0.2444) data time 0.0008 (0.0017) model time 0.4516 (0.2426) loss 2.9121 (2.7009) grad_norm 5.5400 (5.7857) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][850/1251] eta 0:01:38 lr 0.000027 wd 0.0500 time 0.2435 (0.2446) data time 0.0011 (0.0017) model time 0.2424 (0.2428) loss 3.0028 (2.7011) grad_norm 5.4471 (5.7770) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:54:59 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][860/1251] eta 0:01:35 lr 0.000027 wd 0.0500 time 0.2479 (0.2446) data time 0.0009 (0.0017) model time 0.2469 (0.2428) loss 3.2181 (2.7010) grad_norm 4.2067 (5.7773) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][870/1251] eta 0:01:33 lr 0.000027 wd 0.0500 time 0.2384 (0.2446) data time 0.0010 (0.0017) model time 0.2373 (0.2428) loss 3.3018 (2.7030) grad_norm 6.9130 (5.7744) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][880/1251] eta 0:01:30 lr 0.000027 wd 0.0500 time 0.2400 (0.2445) data time 0.0008 (0.0017) model time 0.2391 (0.2427) loss 2.8206 (2.7038) grad_norm 4.8326 (5.7779) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][890/1251] eta 0:01:28 lr 0.000027 wd 0.0500 time 0.2626 (0.2445) data time 0.0010 (0.0017) model time 0.2616 (0.2427) loss 2.6736 (2.7018) grad_norm 5.6092 (5.7724) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][900/1251] eta 0:01:25 lr 0.000027 wd 0.0500 time 0.2276 (0.2445) data time 0.0009 (0.0017) model time 0.2267 (0.2427) loss 2.8623 (2.7037) grad_norm 3.5007 (5.7655) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][910/1251] eta 0:01:23 lr 0.000027 wd 0.0500 time 0.2391 (0.2447) data time 0.0008 (0.0017) model time 0.2384 (0.2429) loss 1.8678 (2.7004) grad_norm 3.5049 (5.7555) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][920/1251] eta 0:01:20 lr 0.000027 wd 0.0500 time 0.2411 (0.2446) data time 0.0009 (0.0017) model time 0.2402 (0.2429) loss 2.8041 (2.7003) 
grad_norm 5.4530 (5.7523) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][930/1251] eta 0:01:18 lr 0.000027 wd 0.0500 time 0.2395 (0.2446) data time 0.0007 (0.0017) model time 0.2388 (0.2429) loss 3.2007 (2.7000) grad_norm 7.3560 (5.7519) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][940/1251] eta 0:01:16 lr 0.000027 wd 0.0500 time 0.2365 (0.2446) data time 0.0010 (0.0017) model time 0.2355 (0.2428) loss 3.1593 (2.7009) grad_norm 4.4814 (5.7444) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][950/1251] eta 0:01:13 lr 0.000027 wd 0.0500 time 0.2379 (0.2446) data time 0.0010 (0.0017) model time 0.2369 (0.2428) loss 2.3044 (2.6998) grad_norm 7.1643 (5.7717) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][960/1251] eta 0:01:11 lr 0.000027 wd 0.0500 time 0.2423 (0.2445) data time 0.0007 (0.0017) model time 0.2415 (0.2428) loss 2.4773 (2.6998) grad_norm 3.5818 (5.7771) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][970/1251] eta 0:01:08 lr 0.000027 wd 0.0500 time 0.2446 (0.2445) data time 0.0009 (0.0016) model time 0.2437 (0.2428) loss 2.6607 (2.7011) grad_norm 7.1118 (5.7671) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][980/1251] eta 0:01:06 lr 0.000027 wd 0.0500 time 0.2513 (0.2445) data time 0.0008 (0.0016) model time 0.2505 (0.2428) loss 2.0486 (2.7008) grad_norm 5.8052 (5.7634) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][990/1251] eta 0:01:03 lr 0.000027 wd 0.0500 time 
0.2441 (0.2445) data time 0.0011 (0.0016) model time 0.2430 (0.2428) loss 3.2555 (2.6989) grad_norm 5.4022 (5.7516) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1000/1251] eta 0:01:01 lr 0.000027 wd 0.0500 time 0.2381 (0.2445) data time 0.0009 (0.0016) model time 0.2372 (0.2428) loss 2.9539 (2.6993) grad_norm 5.1614 (5.7466) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1010/1251] eta 0:00:58 lr 0.000027 wd 0.0500 time 0.2440 (0.2445) data time 0.0007 (0.0016) model time 0.2433 (0.2428) loss 3.4027 (2.6989) grad_norm 5.3283 (5.7614) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1020/1251] eta 0:00:56 lr 0.000027 wd 0.0500 time 0.2416 (0.2444) data time 0.0007 (0.0016) model time 0.2409 (0.2427) loss 2.2824 (2.6972) grad_norm 5.3104 (5.7569) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1030/1251] eta 0:00:54 lr 0.000027 wd 0.0500 time 0.2474 (0.2444) data time 0.0009 (0.0016) model time 0.2465 (0.2427) loss 2.9383 (2.6972) grad_norm 4.4997 (5.7578) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1040/1251] eta 0:00:51 lr 0.000027 wd 0.0500 time 0.2356 (0.2444) data time 0.0008 (0.0016) model time 0.2348 (0.2427) loss 3.1515 (2.6966) grad_norm 9.8131 (5.7612) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1050/1251] eta 0:00:49 lr 0.000027 wd 0.0500 time 0.2351 (0.2443) data time 0.0013 (0.0016) model time 0.2338 (0.2427) loss 2.9439 (2.6987) grad_norm 4.8791 (5.7761) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:48 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [276/300][1060/1251] eta 0:00:46 lr 0.000027 wd 0.0500 time 0.2438 (0.2443) data time 0.0009 (0.0016) model time 0.2429 (0.2427) loss 2.9120 (2.6995) grad_norm 5.8216 (5.7761) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1070/1251] eta 0:00:44 lr 0.000027 wd 0.0500 time 0.2391 (0.2443) data time 0.0010 (0.0016) model time 0.2380 (0.2426) loss 2.2576 (2.6989) grad_norm 4.2161 (5.7680) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1080/1251] eta 0:00:41 lr 0.000027 wd 0.0500 time 0.2443 (0.2443) data time 0.0009 (0.0016) model time 0.2434 (0.2426) loss 3.1971 (2.6986) grad_norm 11.1760 (5.7845) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1090/1251] eta 0:00:39 lr 0.000027 wd 0.0500 time 0.2405 (0.2443) data time 0.0007 (0.0016) model time 0.2397 (0.2426) loss 3.0003 (2.6993) grad_norm 5.5579 (5.7781) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1100/1251] eta 0:00:36 lr 0.000027 wd 0.0500 time 0.2393 (0.2442) data time 0.0010 (0.0016) model time 0.2384 (0.2426) loss 2.5246 (2.6992) grad_norm 6.2216 (5.7684) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1110/1251] eta 0:00:34 lr 0.000027 wd 0.0500 time 0.2344 (0.2442) data time 0.0007 (0.0016) model time 0.2337 (0.2425) loss 3.3081 (2.6989) grad_norm 3.5549 (5.8044) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1120/1251] eta 0:00:31 lr 0.000027 wd 0.0500 time 0.2397 (0.2442) data time 0.0007 (0.0016) model time 0.2390 (0.2425) loss 2.4741 (2.6991) grad_norm 
4.2017 (5.8062) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1130/1251] eta 0:00:29 lr 0.000027 wd 0.0500 time 0.2422 (0.2441) data time 0.0011 (0.0016) model time 0.2411 (0.2425) loss 2.9626 (2.7011) grad_norm 4.5299 (5.8043) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1140/1251] eta 0:00:27 lr 0.000027 wd 0.0500 time 0.2379 (0.2441) data time 0.0011 (0.0015) model time 0.2368 (0.2424) loss 2.9607 (2.7010) grad_norm 4.1588 (5.7980) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1150/1251] eta 0:00:24 lr 0.000027 wd 0.0500 time 0.2357 (0.2441) data time 0.0006 (0.0015) model time 0.2350 (0.2424) loss 3.2151 (2.7017) grad_norm 4.6669 (5.7903) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1160/1251] eta 0:00:22 lr 0.000026 wd 0.0500 time 0.2390 (0.2440) data time 0.0009 (0.0015) model time 0.2381 (0.2424) loss 2.7623 (2.7006) grad_norm 5.2539 (5.7833) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1170/1251] eta 0:00:19 lr 0.000026 wd 0.0500 time 0.2474 (0.2440) data time 0.0007 (0.0015) model time 0.2466 (0.2424) loss 2.6428 (2.7011) grad_norm 4.1599 (5.7763) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1180/1251] eta 0:00:17 lr 0.000026 wd 0.0500 time 0.2406 (0.2440) data time 0.0007 (0.0015) model time 0.2398 (0.2424) loss 2.5133 (2.7004) grad_norm 6.4382 (5.7785) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1190/1251] eta 0:00:14 lr 0.000026 wd 0.0500 time 0.2413 
(0.2440) data time 0.0007 (0.0015) model time 0.2406 (0.2424) loss 2.7000 (2.7004) grad_norm 3.3754 (5.7864) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1200/1251] eta 0:00:12 lr 0.000026 wd 0.0500 time 0.2425 (0.2440) data time 0.0010 (0.0015) model time 0.2416 (0.2423) loss 2.8986 (2.7003) grad_norm 4.4027 (5.8580) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1210/1251] eta 0:00:10 lr 0.000026 wd 0.0500 time 0.2355 (0.2439) data time 0.0007 (0.0015) model time 0.2348 (0.2423) loss 2.2881 (2.6981) grad_norm 5.1235 (5.8515) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1220/1251] eta 0:00:07 lr 0.000026 wd 0.0500 time 0.2371 (0.2439) data time 0.0011 (0.0015) model time 0.2360 (0.2423) loss 3.1854 (2.6983) grad_norm 4.7336 (5.8450) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1230/1251] eta 0:00:05 lr 0.000026 wd 0.0500 time 0.2381 (0.2439) data time 0.0009 (0.0015) model time 0.2372 (0.2423) loss 1.9237 (2.6988) grad_norm 6.2253 (5.8409) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1240/1251] eta 0:00:02 lr 0.000026 wd 0.0500 time 0.2220 (0.2438) data time 0.0007 (0.0015) model time 0.2213 (0.2422) loss 2.5146 (2.6990) grad_norm 3.7027 (5.8376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [276/300][1250/1251] eta 0:00:00 lr 0.000026 wd 0.0500 time 0.2275 (0.2436) data time 0.0005 (0.0015) model time 0.2270 (0.2420) loss 2.5893 (2.6992) grad_norm 4.0161 (5.8285) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:33 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 398): INFO EPOCH 276 training takes 0:05:04
[2024-09-01 09:56:33 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving......
[2024-09-01 09:56:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!!
[2024-09-01 09:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.453 (0.453) Loss 0.3926 (0.3926) Acc@1 92.676 (92.676) Acc@5 98.535 (98.535) Mem 7381MB
[2024-09-01 09:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.111) Loss 0.5796 (0.6162) Acc@1 90.039 (87.686) Acc@5 97.852 (97.683) Mem 7381MB
[2024-09-01 09:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.096) Loss 0.9160 (0.6443) Acc@1 78.027 (86.593) Acc@5 95.605 (97.642) Mem 7381MB
[2024-09-01 09:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.090) Loss 1.1396 (0.7383) Acc@1 75.586 (84.407) Acc@5 92.090 (96.636) Mem 7381MB
[2024-09-01 09:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0176 (0.7881) Acc@1 77.441 (83.236) Acc@5 93.945 (96.108) Mem 7381MB
[2024-09-01 09:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.780 Acc@5 96.080
[2024-09-01 09:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8%
[2024-09-01 09:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.78%
[2024-09-01 09:56:38 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving......
[2024-09-01 09:56:39 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!!
[2024-09-01 09:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.543 (0.543) Loss 0.3879 (0.3879) Acc@1 92.969 (92.969) Acc@5 98.730 (98.730) Mem 7381MB
[2024-09-01 09:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.118) Loss 0.5654 (0.6055) Acc@1 90.527 (87.917) Acc@5 98.242 (97.843) Mem 7381MB
[2024-09-01 09:56:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.099) Loss 0.9053 (0.6359) Acc@1 78.223 (86.719) Acc@5 95.801 (97.768) Mem 7381MB
[2024-09-01 09:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.093) Loss 1.1289 (0.7279) Acc@1 74.609 (84.555) Acc@5 92.969 (96.825) Mem 7381MB
[2024-09-01 09:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.086) Loss 1.0059 (0.7757) Acc@1 77.441 (83.382) Acc@5 94.434 (96.341) Mem 7381MB
[2024-09-01 09:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.954 Acc@5 96.298
[2024-09-01 09:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 83.0%
[2024-09-01 09:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][0/1251] eta 0:22:25 lr 0.000026 wd 0.0500 time 1.0757 (1.0757) data time 0.6396 (0.6396) model time 0.0000 (0.0000) loss 3.0119 (3.0119) grad_norm 5.5233 (5.5233) loss_scale 256.0000 (256.0000) mem 7381MB
[2024-09-01 09:56:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][10/1251] eta 0:06:33 lr 0.000026 wd 0.0500 time 0.2509 (0.3173) data time 0.0007 (0.0591) model time 0.0000 (0.0000) loss 1.8628 (2.6578) grad_norm 8.4241 (5.4397) loss_scale 256.0000 (256.0000) mem 7381MB
[2024-09-01 09:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][20/1251] eta 0:05:46 lr 0.000026 wd 0.0500 time 0.2407 (0.2818) data time 0.0007 (0.0314) model time 0.0000 (0.0000) loss 2.9661 (2.7009) grad_norm 4.6915 (6.4024) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][30/1251] eta 0:05:27 lr 0.000026 wd 0.0500 time 0.2410 (0.2685) data time 0.0007 (0.0216) model time 0.0000 (0.0000) loss 2.0243 (2.7032) grad_norm 5.8896 (6.6046) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][40/1251] eta 0:05:16 lr 0.000026 wd 0.0500 time 0.2450 (0.2614) data time 0.0007 (0.0165) model time 0.0000 (0.0000) loss 3.0397 (2.7656) grad_norm 3.3690 (6.2789) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][50/1251] eta 0:05:09 lr 0.000026 wd 0.0500 time 0.2447 (0.2575) data time 0.0010 (0.0135) model time 0.0000 (0.0000) loss 3.3329 (2.7672) grad_norm 4.3363 (6.1400) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][60/1251] eta 0:05:03 lr 0.000026 wd 0.0500 time 0.2383 (0.2546) data time 0.0011 (0.0114) model time 0.2372 (0.2386) loss 2.9519 (2.7181) grad_norm 4.6217 (7.5737) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][70/1251] eta 0:04:58 lr 0.000026 wd 0.0500 time 0.2371 (0.2523) data time 0.0011 (0.0100) model time 0.2360 (0.2382) loss 2.9580 (2.7003) grad_norm 7.8927 (7.3350) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][80/1251] eta 0:04:53 lr 0.000026 wd 0.0500 time 0.2375 (0.2509) data time 0.0008 (0.0088) model time 0.2367 (0.2386) loss 2.3286 (2.6911) grad_norm 4.8777 (7.0285) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][90/1251] eta 0:04:50 lr 0.000026 wd 0.0500 time 0.2395 (0.2498) data time 0.0010 
(0.0080) model time 0.2385 (0.2390) loss 2.9136 (2.6956) grad_norm 6.6841 (6.8796) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][100/1251] eta 0:04:46 lr 0.000026 wd 0.0500 time 0.2392 (0.2488) data time 0.0008 (0.0073) model time 0.2384 (0.2390) loss 2.9048 (2.6973) grad_norm 4.3308 (6.7973) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][110/1251] eta 0:04:43 lr 0.000026 wd 0.0500 time 0.2459 (0.2482) data time 0.0008 (0.0067) model time 0.2451 (0.2394) loss 3.4106 (2.7022) grad_norm 4.3945 (7.4548) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][120/1251] eta 0:04:40 lr 0.000026 wd 0.0500 time 0.2412 (0.2477) data time 0.0007 (0.0062) model time 0.2405 (0.2396) loss 1.7961 (2.6950) grad_norm 4.1994 (7.3023) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][130/1251] eta 0:04:37 lr 0.000026 wd 0.0500 time 0.2340 (0.2472) data time 0.0009 (0.0058) model time 0.2331 (0.2396) loss 2.4251 (2.6951) grad_norm 4.7477 (7.1100) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][140/1251] eta 0:04:33 lr 0.000026 wd 0.0500 time 0.2403 (0.2466) data time 0.0010 (0.0055) model time 0.2393 (0.2394) loss 2.0324 (2.6868) grad_norm 6.3932 (6.9832) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][150/1251] eta 0:04:31 lr 0.000026 wd 0.0500 time 0.2410 (0.2463) data time 0.0010 (0.0052) model time 0.2400 (0.2395) loss 2.6979 (2.6954) grad_norm 4.3242 (6.8559) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[277/300][160/1251] eta 0:04:28 lr 0.000026 wd 0.0500 time 0.2496 (0.2460) data time 0.0009 (0.0050) model time 0.2487 (0.2397) loss 2.1551 (2.6784) grad_norm 3.9601 (6.7331) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][170/1251] eta 0:04:25 lr 0.000026 wd 0.0500 time 0.2421 (0.2458) data time 0.0009 (0.0047) model time 0.2412 (0.2398) loss 2.9166 (2.6781) grad_norm 7.0971 (6.7683) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][180/1251] eta 0:04:24 lr 0.000026 wd 0.0500 time 0.2417 (0.2467) data time 0.0011 (0.0045) model time 0.2406 (0.2415) loss 3.1474 (2.6851) grad_norm 6.5315 (6.7223) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][190/1251] eta 0:04:21 lr 0.000026 wd 0.0500 time 0.2462 (0.2465) data time 0.0010 (0.0043) model time 0.2452 (0.2415) loss 2.4195 (2.6878) grad_norm 6.5049 (6.6411) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][200/1251] eta 0:04:18 lr 0.000026 wd 0.0500 time 0.2452 (0.2463) data time 0.0009 (0.0042) model time 0.2443 (0.2415) loss 3.0300 (2.6900) grad_norm 3.9249 (6.5570) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][210/1251] eta 0:04:17 lr 0.000026 wd 0.0500 time 0.4487 (0.2470) data time 0.0010 (0.0040) model time 0.4477 (0.2427) loss 3.1847 (2.6833) grad_norm 9.9540 (6.5258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][220/1251] eta 0:04:14 lr 0.000026 wd 0.0500 time 0.2385 (0.2467) data time 0.0008 (0.0039) model time 0.2377 (0.2425) loss 2.6473 (2.6863) grad_norm 3.7754 (6.4913) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 09:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][230/1251] eta 0:04:11 lr 0.000026 wd 0.0500 time 0.2424 (0.2464) data time 0.0008 (0.0038) model time 0.2416 (0.2423) loss 2.8184 (2.6717) grad_norm 4.5294 (6.5517) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][240/1251] eta 0:04:08 lr 0.000026 wd 0.0500 time 0.2351 (0.2463) data time 0.0013 (0.0036) model time 0.2339 (0.2423) loss 2.7610 (2.6722) grad_norm 4.3755 (6.5240) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][250/1251] eta 0:04:06 lr 0.000026 wd 0.0500 time 0.2398 (0.2462) data time 0.0011 (0.0035) model time 0.2387 (0.2423) loss 2.7976 (2.6701) grad_norm 4.0247 (6.5182) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][260/1251] eta 0:04:03 lr 0.000026 wd 0.0500 time 0.2346 (0.2460) data time 0.0009 (0.0034) model time 0.2337 (0.2422) loss 2.8321 (2.6726) grad_norm 4.3583 (6.6285) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][270/1251] eta 0:04:01 lr 0.000026 wd 0.0500 time 0.2380 (0.2459) data time 0.0010 (0.0034) model time 0.2369 (0.2422) loss 1.7022 (2.6770) grad_norm 4.7373 (6.6717) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][280/1251] eta 0:03:58 lr 0.000026 wd 0.0500 time 0.2464 (0.2457) data time 0.0009 (0.0033) model time 0.2456 (0.2421) loss 2.2116 (2.6718) grad_norm 7.4576 (6.6200) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][290/1251] eta 0:03:56 lr 0.000026 wd 0.0500 time 0.2418 (0.2462) data time 0.0009 (0.0032) model time 0.2409 
(0.2428) loss 3.1214 (2.6758) grad_norm 4.4254 (6.5702) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][300/1251] eta 0:03:55 lr 0.000026 wd 0.0500 time 0.2472 (0.2475) data time 0.0009 (0.0031) model time 0.2462 (0.2444) loss 2.7120 (2.6746) grad_norm 4.2188 (6.4980) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][310/1251] eta 0:03:52 lr 0.000026 wd 0.0500 time 0.2421 (0.2473) data time 0.0007 (0.0030) model time 0.2414 (0.2443) loss 1.8102 (2.6778) grad_norm 7.3079 (6.5056) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][320/1251] eta 0:03:50 lr 0.000026 wd 0.0500 time 0.2409 (0.2472) data time 0.0008 (0.0030) model time 0.2401 (0.2443) loss 3.2488 (2.6801) grad_norm 4.4693 (6.7080) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][330/1251] eta 0:03:47 lr 0.000026 wd 0.0500 time 0.2430 (0.2471) data time 0.0010 (0.0029) model time 0.2421 (0.2442) loss 3.1373 (2.6851) grad_norm 5.0743 (6.6717) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][340/1251] eta 0:03:44 lr 0.000026 wd 0.0500 time 0.2482 (0.2469) data time 0.0007 (0.0029) model time 0.2475 (0.2441) loss 1.5655 (2.6797) grad_norm 4.1622 (6.6321) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][350/1251] eta 0:03:42 lr 0.000026 wd 0.0500 time 0.2427 (0.2468) data time 0.0009 (0.0028) model time 0.2418 (0.2440) loss 2.4088 (2.6832) grad_norm 4.6833 (6.5986) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][360/1251] eta 0:03:39 
lr 0.000026 wd 0.0500 time 0.2399 (0.2467) data time 0.0010 (0.0028) model time 0.2389 (0.2439) loss 2.9107 (2.6863) grad_norm 4.1646 (6.5894) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][370/1251] eta 0:03:37 lr 0.000026 wd 0.0500 time 0.2418 (0.2466) data time 0.0008 (0.0027) model time 0.2410 (0.2439) loss 2.8625 (2.6879) grad_norm 4.4281 (6.5651) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][380/1251] eta 0:03:34 lr 0.000026 wd 0.0500 time 0.2415 (0.2465) data time 0.0009 (0.0027) model time 0.2406 (0.2438) loss 2.6376 (2.6822) grad_norm 7.2903 (6.6207) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][390/1251] eta 0:03:32 lr 0.000026 wd 0.0500 time 0.2417 (0.2463) data time 0.0010 (0.0026) model time 0.2407 (0.2437) loss 2.6014 (2.6788) grad_norm 5.0767 (6.6010) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][400/1251] eta 0:03:29 lr 0.000026 wd 0.0500 time 0.2349 (0.2462) data time 0.0011 (0.0026) model time 0.2338 (0.2436) loss 3.1716 (2.6765) grad_norm 3.7343 (6.5520) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][410/1251] eta 0:03:26 lr 0.000026 wd 0.0500 time 0.2370 (0.2461) data time 0.0010 (0.0026) model time 0.2360 (0.2435) loss 2.3629 (2.6769) grad_norm 7.4100 (6.5355) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][420/1251] eta 0:03:24 lr 0.000026 wd 0.0500 time 0.2435 (0.2460) data time 0.0011 (0.0025) model time 0.2424 (0.2434) loss 2.1154 (2.6748) grad_norm 4.6731 (6.5040) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:29 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][430/1251] eta 0:03:21 lr 0.000026 wd 0.0500 time 0.2437 (0.2460) data time 0.0011 (0.0025) model time 0.2426 (0.2434) loss 3.0228 (2.6710) grad_norm 8.7305 (6.4653) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][440/1251] eta 0:03:19 lr 0.000026 wd 0.0500 time 0.2433 (0.2459) data time 0.0011 (0.0024) model time 0.2423 (0.2434) loss 2.1903 (2.6625) grad_norm 5.6098 (6.4440) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][450/1251] eta 0:03:16 lr 0.000026 wd 0.0500 time 0.2392 (0.2458) data time 0.0009 (0.0024) model time 0.2384 (0.2433) loss 2.8737 (2.6617) grad_norm 5.5528 (6.4444) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][460/1251] eta 0:03:14 lr 0.000026 wd 0.0500 time 0.2388 (0.2457) data time 0.0013 (0.0024) model time 0.2375 (0.2432) loss 3.0662 (2.6638) grad_norm 5.5106 (6.4189) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][470/1251] eta 0:03:11 lr 0.000026 wd 0.0500 time 0.2445 (0.2456) data time 0.0009 (0.0024) model time 0.2435 (0.2431) loss 3.1353 (2.6639) grad_norm 4.0140 (6.3979) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][480/1251] eta 0:03:09 lr 0.000026 wd 0.0500 time 0.2446 (0.2455) data time 0.0010 (0.0023) model time 0.2435 (0.2430) loss 2.6713 (2.6654) grad_norm 4.6672 (6.4334) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][490/1251] eta 0:03:06 lr 0.000026 wd 0.0500 time 0.2432 (0.2453) data time 0.0009 (0.0023) model time 0.2423 (0.2429) loss 1.7876 (2.6624) 
grad_norm 4.6155 (6.4216) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][500/1251] eta 0:03:04 lr 0.000026 wd 0.0500 time 0.2484 (0.2453) data time 0.0010 (0.0023) model time 0.2474 (0.2429) loss 2.8252 (2.6610) grad_norm 5.8900 (6.4062) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][510/1251] eta 0:03:01 lr 0.000026 wd 0.0500 time 0.2459 (0.2452) data time 0.0008 (0.0022) model time 0.2451 (0.2429) loss 2.5226 (2.6635) grad_norm 6.3850 (6.4013) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][520/1251] eta 0:02:59 lr 0.000026 wd 0.0500 time 0.2373 (0.2452) data time 0.0011 (0.0022) model time 0.2362 (0.2428) loss 2.8241 (2.6662) grad_norm 4.2217 (6.3898) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][530/1251] eta 0:02:56 lr 0.000026 wd 0.0500 time 0.2415 (0.2451) data time 0.0010 (0.0022) model time 0.2405 (0.2428) loss 3.0849 (2.6692) grad_norm 4.4001 (6.3750) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 09:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][540/1251] eta 0:02:54 lr 0.000026 wd 0.0500 time 0.2465 (0.2450) data time 0.0010 (0.0022) model time 0.2455 (0.2428) loss 3.0824 (2.6705) grad_norm 4.7948 (6.3565) loss_scale 512.0000 (258.3660) mem 7381MB [2024-09-01 09:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][550/1251] eta 0:02:51 lr 0.000026 wd 0.0500 time 0.2420 (0.2449) data time 0.0012 (0.0022) model time 0.2408 (0.2427) loss 2.6986 (2.6715) grad_norm 4.4496 (6.3540) loss_scale 512.0000 (262.9691) mem 7381MB [2024-09-01 09:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][560/1251] eta 0:02:49 lr 0.000026 wd 0.0500 time 
0.2412 (0.2449) data time 0.0007 (0.0021) model time 0.2405 (0.2427) loss 2.7301 (2.6696) grad_norm 5.9304 (6.3277) loss_scale 512.0000 (267.4082) mem 7381MB [2024-09-01 09:59:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][570/1251] eta 0:02:46 lr 0.000026 wd 0.0500 time 0.2358 (0.2449) data time 0.0007 (0.0021) model time 0.2351 (0.2426) loss 2.5234 (2.6688) grad_norm 5.0106 (6.3209) loss_scale 512.0000 (271.6918) mem 7381MB [2024-09-01 09:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][580/1251] eta 0:02:44 lr 0.000026 wd 0.0500 time 0.2378 (0.2448) data time 0.0008 (0.0021) model time 0.2370 (0.2426) loss 1.6963 (2.6666) grad_norm 7.4003 (6.2973) loss_scale 512.0000 (275.8279) mem 7381MB [2024-09-01 09:59:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][590/1251] eta 0:02:41 lr 0.000026 wd 0.0500 time 0.2545 (0.2448) data time 0.0009 (0.0021) model time 0.2536 (0.2426) loss 2.9392 (2.6702) grad_norm 5.0797 (6.3449) loss_scale 512.0000 (279.8240) mem 7381MB [2024-09-01 09:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][600/1251] eta 0:02:39 lr 0.000026 wd 0.0500 time 0.2458 (0.2447) data time 0.0011 (0.0021) model time 0.2448 (0.2425) loss 2.8130 (2.6678) grad_norm 3.8711 (6.3187) loss_scale 512.0000 (283.6872) mem 7381MB [2024-09-01 09:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][610/1251] eta 0:02:36 lr 0.000026 wd 0.0500 time 0.2396 (0.2446) data time 0.0012 (0.0020) model time 0.2384 (0.2425) loss 2.6044 (2.6670) grad_norm 5.0233 (6.3082) loss_scale 512.0000 (287.4239) mem 7381MB [2024-09-01 09:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][620/1251] eta 0:02:34 lr 0.000026 wd 0.0500 time 0.2381 (0.2446) data time 0.0010 (0.0020) model time 0.2370 (0.2424) loss 2.0589 (2.6677) grad_norm 5.9965 (6.2958) loss_scale 512.0000 (291.0403) mem 7381MB [2024-09-01 09:59:17 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [277/300][630/1251] eta 0:02:31 lr 0.000026 wd 0.0500 time 0.2420 (0.2445) data time 0.0007 (0.0020) model time 0.2413 (0.2424) loss 2.4016 (2.6654) grad_norm 5.3034 (6.2837) loss_scale 512.0000 (294.5420) mem 7381MB [2024-09-01 09:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][640/1251] eta 0:02:29 lr 0.000026 wd 0.0500 time 0.2504 (0.2445) data time 0.0010 (0.0020) model time 0.2494 (0.2424) loss 2.7484 (2.6666) grad_norm 5.2618 (6.2586) loss_scale 512.0000 (297.9345) mem 7381MB [2024-09-01 09:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][650/1251] eta 0:02:26 lr 0.000026 wd 0.0500 time 0.2358 (0.2445) data time 0.0007 (0.0020) model time 0.2351 (0.2424) loss 3.2223 (2.6683) grad_norm 5.1604 (6.2629) loss_scale 512.0000 (301.2227) mem 7381MB [2024-09-01 09:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][660/1251] eta 0:02:24 lr 0.000026 wd 0.0500 time 0.2425 (0.2445) data time 0.0012 (0.0020) model time 0.2413 (0.2424) loss 2.8941 (2.6670) grad_norm 5.6070 (6.2492) loss_scale 512.0000 (304.4115) mem 7381MB [2024-09-01 09:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][670/1251] eta 0:02:22 lr 0.000026 wd 0.0500 time 0.2444 (0.2444) data time 0.0007 (0.0019) model time 0.2437 (0.2424) loss 3.0464 (2.6641) grad_norm 4.3753 (6.2378) loss_scale 512.0000 (307.5052) mem 7381MB [2024-09-01 09:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][680/1251] eta 0:02:19 lr 0.000026 wd 0.0500 time 0.2457 (0.2444) data time 0.0008 (0.0019) model time 0.2449 (0.2423) loss 1.9516 (2.6624) grad_norm 7.4396 (6.2259) loss_scale 512.0000 (310.5081) mem 7381MB [2024-09-01 09:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][690/1251] eta 0:02:17 lr 0.000026 wd 0.0500 time 0.2406 (0.2443) data time 0.0008 (0.0019) model time 0.2398 (0.2423) loss 2.5154 (2.6623) grad_norm 4.1215 
(6.2160) loss_scale 512.0000 (313.4240) mem 7381MB [2024-09-01 09:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][700/1251] eta 0:02:14 lr 0.000026 wd 0.0500 time 0.2470 (0.2445) data time 0.0009 (0.0019) model time 0.2461 (0.2425) loss 2.8462 (2.6615) grad_norm 4.4164 (6.2119) loss_scale 512.0000 (316.2568) mem 7381MB [2024-09-01 09:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][710/1251] eta 0:02:12 lr 0.000026 wd 0.0500 time 0.2451 (0.2445) data time 0.0010 (0.0019) model time 0.2442 (0.2425) loss 2.9023 (2.6617) grad_norm 6.6181 (6.2100) loss_scale 512.0000 (319.0098) mem 7381MB [2024-09-01 09:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][720/1251] eta 0:02:09 lr 0.000026 wd 0.0500 time 0.2398 (0.2444) data time 0.0007 (0.0019) model time 0.2392 (0.2424) loss 3.4526 (2.6652) grad_norm 5.7538 (6.2095) loss_scale 512.0000 (321.6865) mem 7381MB [2024-09-01 09:59:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][730/1251] eta 0:02:07 lr 0.000026 wd 0.0500 time 0.2384 (0.2444) data time 0.0011 (0.0019) model time 0.2373 (0.2424) loss 3.1317 (2.6653) grad_norm 4.0043 (6.1976) loss_scale 512.0000 (324.2900) mem 7381MB [2024-09-01 09:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][740/1251] eta 0:02:04 lr 0.000026 wd 0.0500 time 0.2430 (0.2444) data time 0.0009 (0.0019) model time 0.2420 (0.2424) loss 2.8826 (2.6650) grad_norm 3.9116 (6.2260) loss_scale 512.0000 (326.8232) mem 7381MB [2024-09-01 09:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][750/1251] eta 0:02:02 lr 0.000026 wd 0.0500 time 0.2425 (0.2444) data time 0.0011 (0.0018) model time 0.2414 (0.2424) loss 3.0365 (2.6639) grad_norm 5.8928 (6.2633) loss_scale 512.0000 (329.2889) mem 7381MB [2024-09-01 09:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][760/1251] eta 0:01:59 lr 0.000026 wd 0.0500 time 0.2463 (0.2443) data 
time 0.0009 (0.0018) model time 0.2453 (0.2424) loss 2.9989 (2.6650) grad_norm 6.1260 (6.2489) loss_scale 512.0000 (331.6899) mem 7381MB [2024-09-01 09:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][770/1251] eta 0:01:57 lr 0.000026 wd 0.0500 time 0.2426 (0.2443) data time 0.0006 (0.0018) model time 0.2420 (0.2424) loss 2.9780 (2.6656) grad_norm 11.4116 (6.2363) loss_scale 512.0000 (334.0285) mem 7381MB [2024-09-01 09:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][780/1251] eta 0:01:55 lr 0.000026 wd 0.0500 time 0.2369 (0.2443) data time 0.0010 (0.0018) model time 0.2359 (0.2423) loss 2.9681 (2.6660) grad_norm 4.4227 (6.2242) loss_scale 512.0000 (336.3073) mem 7381MB [2024-09-01 09:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][790/1251] eta 0:01:52 lr 0.000026 wd 0.0500 time 0.2359 (0.2442) data time 0.0013 (0.0018) model time 0.2346 (0.2423) loss 2.9879 (2.6701) grad_norm 5.9294 (6.2131) loss_scale 512.0000 (338.5284) mem 7381MB [2024-09-01 09:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][800/1251] eta 0:01:50 lr 0.000025 wd 0.0500 time 0.2406 (0.2442) data time 0.0007 (0.0018) model time 0.2399 (0.2423) loss 2.9272 (2.6713) grad_norm 6.6769 (6.2079) loss_scale 512.0000 (340.6941) mem 7381MB [2024-09-01 10:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][810/1251] eta 0:01:47 lr 0.000025 wd 0.0500 time 0.2514 (0.2442) data time 0.0011 (0.0018) model time 0.2504 (0.2423) loss 2.4882 (2.6723) grad_norm 6.7343 (6.1897) loss_scale 512.0000 (342.8064) mem 7381MB [2024-09-01 10:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][820/1251] eta 0:01:45 lr 0.000025 wd 0.0500 time 0.2427 (0.2442) data time 0.0007 (0.0018) model time 0.2420 (0.2423) loss 3.1275 (2.6750) grad_norm 4.0827 (6.1875) loss_scale 512.0000 (344.8672) mem 7381MB [2024-09-01 10:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [277/300][830/1251] eta 0:01:42 lr 0.000025 wd 0.0500 time 0.2330 (0.2444) data time 0.0008 (0.0018) model time 0.2323 (0.2425) loss 3.0899 (2.6753) grad_norm 6.4395 (6.1755) loss_scale 512.0000 (346.8785) mem 7381MB [2024-09-01 10:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][840/1251] eta 0:01:40 lr 0.000025 wd 0.0500 time 0.2420 (0.2443) data time 0.0009 (0.0018) model time 0.2410 (0.2425) loss 2.3345 (2.6767) grad_norm 3.6610 (6.1670) loss_scale 512.0000 (348.8419) mem 7381MB [2024-09-01 10:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][850/1251] eta 0:01:37 lr 0.000025 wd 0.0500 time 0.2329 (0.2442) data time 0.0011 (0.0017) model time 0.2319 (0.2424) loss 2.7227 (2.6761) grad_norm 3.9320 (6.1518) loss_scale 512.0000 (350.7591) mem 7381MB [2024-09-01 10:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][860/1251] eta 0:01:35 lr 0.000025 wd 0.0500 time 0.2382 (0.2441) data time 0.0009 (0.0017) model time 0.2372 (0.2423) loss 2.3159 (2.6753) grad_norm 4.4951 (6.1465) loss_scale 512.0000 (352.6318) mem 7381MB [2024-09-01 10:00:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][870/1251] eta 0:01:33 lr 0.000025 wd 0.0500 time 0.2475 (0.2441) data time 0.0009 (0.0017) model time 0.2465 (0.2423) loss 2.2456 (2.6752) grad_norm 4.2890 (6.1325) loss_scale 512.0000 (354.4615) mem 7381MB [2024-09-01 10:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][880/1251] eta 0:01:30 lr 0.000025 wd 0.0500 time 0.2406 (0.2441) data time 0.0012 (0.0017) model time 0.2394 (0.2423) loss 2.8965 (2.6779) grad_norm 10.7789 (6.1279) loss_scale 512.0000 (356.2497) mem 7381MB [2024-09-01 10:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][890/1251] eta 0:01:28 lr 0.000025 wd 0.0500 time 0.2407 (0.2441) data time 0.0008 (0.0017) model time 0.2400 (0.2423) loss 2.8382 (2.6805) grad_norm 5.2979 (6.1297) loss_scale 512.0000 
(357.9978) mem 7381MB [2024-09-01 10:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][900/1251] eta 0:01:25 lr 0.000025 wd 0.0500 time 0.2336 (0.2440) data time 0.0011 (0.0017) model time 0.2326 (0.2422) loss 2.9858 (2.6828) grad_norm 8.9008 (6.1304) loss_scale 512.0000 (359.7070) mem 7381MB [2024-09-01 10:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][910/1251] eta 0:01:23 lr 0.000025 wd 0.0500 time 0.2391 (0.2440) data time 0.0011 (0.0017) model time 0.2381 (0.2422) loss 2.0445 (2.6795) grad_norm 4.7024 (6.1189) loss_scale 512.0000 (361.3787) mem 7381MB [2024-09-01 10:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][920/1251] eta 0:01:20 lr 0.000025 wd 0.0500 time 0.2434 (0.2440) data time 0.0009 (0.0017) model time 0.2425 (0.2422) loss 3.0911 (2.6783) grad_norm 6.9582 (6.1229) loss_scale 512.0000 (363.0141) mem 7381MB [2024-09-01 10:00:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][930/1251] eta 0:01:18 lr 0.000025 wd 0.0500 time 0.2382 (0.2440) data time 0.0008 (0.0017) model time 0.2375 (0.2422) loss 2.7358 (2.6786) grad_norm 5.0807 (6.1282) loss_scale 512.0000 (364.6144) mem 7381MB [2024-09-01 10:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][940/1251] eta 0:01:15 lr 0.000025 wd 0.0500 time 0.2382 (0.2439) data time 0.0008 (0.0017) model time 0.2374 (0.2422) loss 2.9292 (2.6770) grad_norm 4.4659 (6.2134) loss_scale 512.0000 (366.1807) mem 7381MB [2024-09-01 10:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][950/1251] eta 0:01:13 lr 0.000025 wd 0.0500 time 0.2391 (0.2439) data time 0.0010 (0.0017) model time 0.2381 (0.2421) loss 2.3069 (2.6786) grad_norm 5.0995 (6.2181) loss_scale 512.0000 (367.7140) mem 7381MB [2024-09-01 10:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][960/1251] eta 0:01:10 lr 0.000025 wd 0.0500 time 0.2419 (0.2438) data time 0.0007 (0.0017) model 
time 0.2411 (0.2421) loss 2.8058 (2.6782) grad_norm 4.4526 (6.2098) loss_scale 512.0000 (369.2154) mem 7381MB [2024-09-01 10:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][970/1251] eta 0:01:08 lr 0.000025 wd 0.0500 time 0.2431 (0.2438) data time 0.0007 (0.0017) model time 0.2423 (0.2421) loss 3.2470 (2.6806) grad_norm 6.1400 (6.2012) loss_scale 512.0000 (370.6859) mem 7381MB [2024-09-01 10:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][980/1251] eta 0:01:06 lr 0.000025 wd 0.0500 time 0.2372 (0.2438) data time 0.0007 (0.0016) model time 0.2366 (0.2420) loss 3.0824 (2.6806) grad_norm 4.0655 (6.1965) loss_scale 512.0000 (372.1264) mem 7381MB [2024-09-01 10:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][990/1251] eta 0:01:03 lr 0.000025 wd 0.0500 time 0.2378 (0.2438) data time 0.0010 (0.0016) model time 0.2369 (0.2420) loss 2.8795 (2.6807) grad_norm 7.5897 (6.1909) loss_scale 512.0000 (373.5378) mem 7381MB [2024-09-01 10:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1000/1251] eta 0:01:01 lr 0.000025 wd 0.0500 time 0.2437 (0.2437) data time 0.0009 (0.0016) model time 0.2428 (0.2420) loss 2.9053 (2.6826) grad_norm 3.4624 (6.1804) loss_scale 512.0000 (374.9211) mem 7381MB [2024-09-01 10:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1010/1251] eta 0:00:58 lr 0.000025 wd 0.0500 time 0.2374 (0.2437) data time 0.0007 (0.0016) model time 0.2367 (0.2420) loss 1.7901 (2.6819) grad_norm 9.2067 (6.1761) loss_scale 512.0000 (376.2770) mem 7381MB [2024-09-01 10:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1020/1251] eta 0:00:56 lr 0.000025 wd 0.0500 time 0.2440 (0.2437) data time 0.0006 (0.0016) model time 0.2434 (0.2419) loss 2.7505 (2.6813) grad_norm 9.3358 (6.1738) loss_scale 512.0000 (377.6063) mem 7381MB [2024-09-01 10:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[277/300][1030/1251] eta 0:00:53 lr 0.000025 wd 0.0500 time 0.2413 (0.2437) data time 0.0010 (0.0016) model time 0.2403 (0.2419) loss 3.4748 (2.6798) grad_norm 17.3317 (6.1958) loss_scale 512.0000 (378.9098) mem 7381MB [2024-09-01 10:00:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1040/1251] eta 0:00:51 lr 0.000025 wd 0.0500 time 0.2330 (0.2436) data time 0.0012 (0.0016) model time 0.2318 (0.2419) loss 2.5590 (2.6797) grad_norm 6.7557 (6.1873) loss_scale 512.0000 (380.1883) mem 7381MB [2024-09-01 10:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1050/1251] eta 0:00:48 lr 0.000025 wd 0.0500 time 0.2422 (0.2436) data time 0.0011 (0.0016) model time 0.2412 (0.2419) loss 3.1360 (2.6818) grad_norm 4.2795 (6.2463) loss_scale 512.0000 (381.4424) mem 7381MB [2024-09-01 10:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1060/1251] eta 0:00:46 lr 0.000025 wd 0.0500 time 0.2478 (0.2436) data time 0.0007 (0.0016) model time 0.2470 (0.2419) loss 2.4482 (2.6800) grad_norm 5.9111 (6.2370) loss_scale 512.0000 (382.6730) mem 7381MB [2024-09-01 10:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1070/1251] eta 0:00:44 lr 0.000025 wd 0.0500 time 0.2431 (0.2436) data time 0.0009 (0.0016) model time 0.2422 (0.2419) loss 1.9484 (2.6781) grad_norm 4.2873 (6.2286) loss_scale 512.0000 (383.8805) mem 7381MB [2024-09-01 10:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1080/1251] eta 0:00:41 lr 0.000025 wd 0.0500 time 0.2409 (0.2436) data time 0.0009 (0.0016) model time 0.2400 (0.2419) loss 2.3905 (2.6756) grad_norm 4.7386 (6.2228) loss_scale 512.0000 (385.0657) mem 7381MB [2024-09-01 10:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1090/1251] eta 0:00:39 lr 0.000025 wd 0.0500 time 0.2428 (0.2436) data time 0.0007 (0.0016) model time 0.2420 (0.2419) loss 3.1106 (2.6773) grad_norm 5.2230 (6.2183) loss_scale 512.0000 
(386.2291) mem 7381MB [2024-09-01 10:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1100/1251] eta 0:00:36 lr 0.000025 wd 0.0500 time 0.2411 (0.2436) data time 0.0009 (0.0016) model time 0.2402 (0.2419) loss 2.9317 (2.6767) grad_norm 4.2219 (6.2193) loss_scale 512.0000 (387.3715) mem 7381MB [2024-09-01 10:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1110/1251] eta 0:00:34 lr 0.000025 wd 0.0500 time 0.2422 (0.2436) data time 0.0008 (0.0016) model time 0.2414 (0.2419) loss 3.0962 (2.6763) grad_norm 4.6711 (6.2206) loss_scale 512.0000 (388.4932) mem 7381MB [2024-09-01 10:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1120/1251] eta 0:00:31 lr 0.000025 wd 0.0500 time 0.2430 (0.2435) data time 0.0007 (0.0016) model time 0.2423 (0.2419) loss 3.1181 (2.6763) grad_norm 5.8848 (6.2083) loss_scale 512.0000 (389.5950) mem 7381MB [2024-09-01 10:01:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1130/1251] eta 0:00:29 lr 0.000025 wd 0.0500 time 0.2332 (0.2435) data time 0.0009 (0.0016) model time 0.2323 (0.2419) loss 2.6534 (2.6764) grad_norm 4.1124 (6.1976) loss_scale 512.0000 (390.6773) mem 7381MB [2024-09-01 10:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1140/1251] eta 0:00:27 lr 0.000025 wd 0.0500 time 0.4396 (0.2437) data time 0.0011 (0.0016) model time 0.4385 (0.2420) loss 2.8527 (2.6766) grad_norm 5.7792 (6.1937) loss_scale 512.0000 (391.7406) mem 7381MB [2024-09-01 10:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1150/1251] eta 0:00:24 lr 0.000025 wd 0.0500 time 0.2392 (0.2437) data time 0.0012 (0.0016) model time 0.2380 (0.2420) loss 3.0972 (2.6779) grad_norm 4.0249 (6.1866) loss_scale 512.0000 (392.7854) mem 7381MB [2024-09-01 10:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1160/1251] eta 0:00:22 lr 0.000025 wd 0.0500 time 0.2405 (0.2436) data time 0.0009 (0.0016) 
model time 0.2396 (0.2420) loss 2.1825 (2.6778) grad_norm 4.9906 (6.1829) loss_scale 512.0000 (393.8122) mem 7381MB [2024-09-01 10:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1170/1251] eta 0:00:19 lr 0.000025 wd 0.0500 time 0.2360 (0.2436) data time 0.0010 (0.0015) model time 0.2349 (0.2420) loss 2.9748 (2.6794) grad_norm 7.0221 (6.1920) loss_scale 512.0000 (394.8215) mem 7381MB [2024-09-01 10:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1180/1251] eta 0:00:17 lr 0.000025 wd 0.0500 time 0.2468 (0.2436) data time 0.0007 (0.0015) model time 0.2461 (0.2420) loss 2.0786 (2.6792) grad_norm 14.7875 (6.1997) loss_scale 512.0000 (395.8137) mem 7381MB [2024-09-01 10:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1190/1251] eta 0:00:14 lr 0.000025 wd 0.0500 time 0.2369 (0.2436) data time 0.0008 (0.0015) model time 0.2361 (0.2419) loss 2.7477 (2.6788) grad_norm 5.5177 (6.1985) loss_scale 512.0000 (396.7893) mem 7381MB [2024-09-01 10:01:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1200/1251] eta 0:00:12 lr 0.000025 wd 0.0500 time 0.2429 (0.2436) data time 0.0009 (0.0015) model time 0.2420 (0.2419) loss 2.2952 (2.6778) grad_norm 6.2928 (6.1933) loss_scale 512.0000 (397.7485) mem 7381MB [2024-09-01 10:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1210/1251] eta 0:00:09 lr 0.000025 wd 0.0500 time 0.2437 (0.2435) data time 0.0007 (0.0015) model time 0.2430 (0.2419) loss 2.9612 (2.6781) grad_norm 6.0304 (6.1952) loss_scale 512.0000 (398.6920) mem 7381MB [2024-09-01 10:01:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1220/1251] eta 0:00:07 lr 0.000025 wd 0.0500 time 0.2383 (0.2435) data time 0.0011 (0.0015) model time 0.2372 (0.2419) loss 2.1958 (2.6770) grad_norm 6.8346 (6.1893) loss_scale 512.0000 (399.6200) mem 7381MB [2024-09-01 10:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[277/300][1230/1251] eta 0:00:05 lr 0.000025 wd 0.0500 time 0.2352 (0.2438) data time 0.0010 (0.0015) model time 0.2342 (0.2422) loss 2.6117 (2.6770) grad_norm 4.7637 (6.1820) loss_scale 512.0000 (400.5329) mem 7381MB [2024-09-01 10:01:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1240/1251] eta 0:00:02 lr 0.000025 wd 0.0500 time 0.2266 (0.2438) data time 0.0007 (0.0015) model time 0.2259 (0.2422) loss 3.0965 (2.6795) grad_norm 5.7104 (6.1775) loss_scale 512.0000 (401.4311) mem 7381MB [2024-09-01 10:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [277/300][1250/1251] eta 0:00:00 lr 0.000025 wd 0.0500 time 0.2222 (0.2436) data time 0.0007 (0.0015) model time 0.2215 (0.2420) loss 2.4877 (2.6816) grad_norm 5.5898 (6.1664) loss_scale 512.0000 (402.3149) mem 7381MB [2024-09-01 10:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 277 training takes 0:05:04 [2024-09-01 10:01:48 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:01:48 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 10:01:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.449 (0.449) Loss 0.3984 (0.3984) Acc@1 92.969 (92.969) Acc@5 98.926 (98.926) Mem 7381MB [2024-09-01 10:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.067 (0.107) Loss 0.5776 (0.6209) Acc@1 90.039 (87.660) Acc@5 98.047 (97.710) Mem 7381MB [2024-09-01 10:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.094) Loss 0.9126 (0.6486) Acc@1 78.125 (86.556) Acc@5 95.801 (97.661) Mem 7381MB [2024-09-01 10:01:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.088) Loss 1.1631 (0.7434) Acc@1 74.316 (84.315) Acc@5 92.578 (96.661) Mem 7381MB [2024-09-01 10:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0215 (0.7919) Acc@1 76.660 (83.134) Acc@5 94.238 (96.179) Mem 7381MB [2024-09-01 10:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.706 Acc@5 96.126 [2024-09-01 10:01:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 10:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.799 (0.799) Loss 0.3884 (0.3884) Acc@1 92.871 (92.871) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 10:01:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.146) Loss 0.5654 (0.6060) Acc@1 90.430 (87.908) Acc@5 98.242 (97.834) Mem 7381MB [2024-09-01 10:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.115) Loss 0.9062 (0.6364) Acc@1 78.125 (86.686) Acc@5 95.703 (97.763) Mem 7381MB [2024-09-01 10:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.103) Loss 1.1299 (0.7285) Acc@1 74.902 (84.529) Acc@5 92.871 (96.799) Mem 7381MB [2024-09-01 10:01:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0078 
(0.7764) Acc@1 77.051 (83.353) Acc@5 94.531 (96.322) Mem 7381MB [2024-09-01 10:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.936 Acc@5 96.284 [2024-09-01 10:01:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][0/1251] eta 0:25:46 lr 0.000025 wd 0.0500 time 1.2362 (1.2362) data time 0.8030 (0.8030) model time 0.0000 (0.0000) loss 3.1193 (3.1193) grad_norm 3.5396 (3.5396) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][10/1251] eta 0:06:50 lr 0.000025 wd 0.0500 time 0.2361 (0.3309) data time 0.0008 (0.0739) model time 0.0000 (0.0000) loss 2.8325 (2.8701) grad_norm 3.4793 (5.1248) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][20/1251] eta 0:05:55 lr 0.000025 wd 0.0500 time 0.2450 (0.2886) data time 0.0008 (0.0393) model time 0.0000 (0.0000) loss 2.9505 (2.8254) grad_norm 7.8262 (5.2128) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][30/1251] eta 0:05:33 lr 0.000025 wd 0.0500 time 0.2477 (0.2730) data time 0.0007 (0.0269) model time 0.0000 (0.0000) loss 2.3907 (2.7421) grad_norm 4.8620 (5.2406) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][40/1251] eta 0:05:21 lr 0.000025 wd 0.0500 time 0.2425 (0.2652) data time 0.0009 (0.0206) model time 0.0000 (0.0000) loss 2.4247 (2.6792) grad_norm 5.0634 (5.2251) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][50/1251] eta 0:05:15 lr 0.000025 wd 0.0500 time 0.2174 (0.2627) data time 0.0010 (0.0168) model time 0.0000 (0.0000) loss 2.7970 
(2.7414) grad_norm 5.4235 (5.2597) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][60/1251] eta 0:05:09 lr 0.000025 wd 0.0500 time 0.2517 (0.2595) data time 0.0007 (0.0142) model time 0.2510 (0.2424) loss 2.4805 (2.7406) grad_norm 6.9544 (5.2913) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][70/1251] eta 0:05:03 lr 0.000025 wd 0.0500 time 0.2403 (0.2569) data time 0.0009 (0.0123) model time 0.2395 (0.2412) loss 2.7433 (2.7422) grad_norm 5.0653 (5.5050) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][80/1251] eta 0:04:59 lr 0.000025 wd 0.0500 time 0.2336 (0.2554) data time 0.0009 (0.0109) model time 0.2328 (0.2420) loss 3.3348 (2.7290) grad_norm 5.8465 (5.6534) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][90/1251] eta 0:04:54 lr 0.000025 wd 0.0500 time 0.2362 (0.2538) data time 0.0010 (0.0098) model time 0.2352 (0.2413) loss 2.1605 (2.7140) grad_norm 4.4241 (5.7532) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][100/1251] eta 0:04:50 lr 0.000025 wd 0.0500 time 0.2377 (0.2526) data time 0.0009 (0.0090) model time 0.2368 (0.2413) loss 2.4357 (2.6802) grad_norm 3.4916 (5.6268) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][110/1251] eta 0:04:46 lr 0.000025 wd 0.0500 time 0.2379 (0.2515) data time 0.0011 (0.0082) model time 0.2368 (0.2410) loss 3.0305 (2.6928) grad_norm 3.8510 (5.5817) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][120/1251] eta 0:04:43 lr 0.000025 wd 0.0500 
time 0.2414 (0.2507) data time 0.0009 (0.0076) model time 0.2405 (0.2409) loss 2.9524 (2.6820) grad_norm 4.5849 (5.5052) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][130/1251] eta 0:04:40 lr 0.000025 wd 0.0500 time 0.2411 (0.2500) data time 0.0009 (0.0071) model time 0.2402 (0.2409) loss 3.1047 (2.6743) grad_norm 6.3049 (5.4424) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][140/1251] eta 0:04:37 lr 0.000025 wd 0.0500 time 0.2458 (0.2495) data time 0.0008 (0.0067) model time 0.2451 (0.2410) loss 2.1165 (2.6769) grad_norm 7.4323 (5.4715) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][150/1251] eta 0:04:34 lr 0.000025 wd 0.0500 time 0.2412 (0.2491) data time 0.0008 (0.0063) model time 0.2404 (0.2412) loss 3.2233 (2.6857) grad_norm 7.6531 (5.4731) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][160/1251] eta 0:04:31 lr 0.000025 wd 0.0500 time 0.2405 (0.2487) data time 0.0011 (0.0060) model time 0.2394 (0.2412) loss 2.8082 (2.6789) grad_norm 5.2642 (5.5288) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][170/1251] eta 0:04:28 lr 0.000025 wd 0.0500 time 0.2374 (0.2481) data time 0.0007 (0.0057) model time 0.2367 (0.2409) loss 2.2909 (2.6792) grad_norm 7.0911 (5.4900) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][180/1251] eta 0:04:25 lr 0.000025 wd 0.0500 time 0.2422 (0.2477) data time 0.0010 (0.0054) model time 0.2412 (0.2408) loss 2.5116 (2.6840) grad_norm 6.0174 (5.5419) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:44 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [278/300][190/1251] eta 0:04:22 lr 0.000025 wd 0.0500 time 0.2405 (0.2474) data time 0.0008 (0.0052) model time 0.2397 (0.2409) loss 2.9293 (2.6838) grad_norm 5.0400 (5.5383) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][200/1251] eta 0:04:19 lr 0.000025 wd 0.0500 time 0.2571 (0.2471) data time 0.0009 (0.0050) model time 0.2562 (0.2409) loss 1.6665 (2.6726) grad_norm 8.0275 (5.5566) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][210/1251] eta 0:04:16 lr 0.000025 wd 0.0500 time 0.2458 (0.2469) data time 0.0007 (0.0048) model time 0.2451 (0.2408) loss 2.8952 (2.6733) grad_norm 8.9500 (5.5850) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][220/1251] eta 0:04:14 lr 0.000025 wd 0.0500 time 0.2433 (0.2467) data time 0.0007 (0.0046) model time 0.2426 (0.2409) loss 2.9936 (2.6748) grad_norm 6.5725 (5.5627) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][230/1251] eta 0:04:11 lr 0.000025 wd 0.0500 time 0.2389 (0.2464) data time 0.0007 (0.0045) model time 0.2382 (0.2409) loss 3.2413 (2.6820) grad_norm 5.0824 (5.5651) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][240/1251] eta 0:04:09 lr 0.000025 wd 0.0500 time 0.2457 (0.2463) data time 0.0010 (0.0043) model time 0.2447 (0.2409) loss 1.9277 (2.6653) grad_norm 3.4753 (5.5670) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][250/1251] eta 0:04:06 lr 0.000025 wd 0.0500 time 0.2429 (0.2461) data time 0.0009 (0.0042) model time 0.2419 (0.2409) loss 2.8889 (2.6731) grad_norm 4.5701 
(5.5819) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][260/1251] eta 0:04:03 lr 0.000025 wd 0.0500 time 0.2316 (0.2459) data time 0.0009 (0.0041) model time 0.2307 (0.2408) loss 3.1210 (2.6710) grad_norm 4.8924 (5.5704) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][270/1251] eta 0:04:01 lr 0.000025 wd 0.0500 time 0.2427 (0.2458) data time 0.0010 (0.0040) model time 0.2418 (0.2409) loss 2.4039 (2.6784) grad_norm 5.2303 (5.5988) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][280/1251] eta 0:03:58 lr 0.000025 wd 0.0500 time 0.2432 (0.2457) data time 0.0010 (0.0038) model time 0.2422 (0.2409) loss 2.6465 (2.6812) grad_norm 4.9742 (5.6078) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][290/1251] eta 0:03:55 lr 0.000025 wd 0.0500 time 0.2346 (0.2456) data time 0.0008 (0.0037) model time 0.2338 (0.2409) loss 3.1111 (2.6829) grad_norm 7.0190 (5.5941) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][300/1251] eta 0:03:53 lr 0.000025 wd 0.0500 time 0.2415 (0.2454) data time 0.0008 (0.0037) model time 0.2408 (0.2409) loss 2.6865 (2.6822) grad_norm 4.4187 (5.6987) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][310/1251] eta 0:03:50 lr 0.000025 wd 0.0500 time 0.2351 (0.2453) data time 0.0010 (0.0036) model time 0.2341 (0.2409) loss 3.0484 (2.6793) grad_norm 5.8946 (5.6901) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][320/1251] eta 0:03:48 lr 0.000025 wd 0.0500 time 0.2400 (0.2452) data 
time 0.0011 (0.0035) model time 0.2389 (0.2409) loss 2.9791 (2.6725) grad_norm 5.3573 (5.7060) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][330/1251] eta 0:03:45 lr 0.000025 wd 0.0500 time 0.2355 (0.2451) data time 0.0009 (0.0034) model time 0.2346 (0.2409) loss 3.2820 (2.6725) grad_norm 5.0193 (5.6852) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][340/1251] eta 0:03:43 lr 0.000025 wd 0.0500 time 0.2474 (0.2450) data time 0.0007 (0.0033) model time 0.2467 (0.2409) loss 1.9990 (2.6745) grad_norm 4.7669 (5.6884) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][350/1251] eta 0:03:40 lr 0.000025 wd 0.0500 time 0.2308 (0.2449) data time 0.0010 (0.0033) model time 0.2299 (0.2409) loss 2.8213 (2.6732) grad_norm 5.7386 (5.6955) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][360/1251] eta 0:03:38 lr 0.000025 wd 0.0500 time 0.2439 (0.2448) data time 0.0009 (0.0032) model time 0.2429 (0.2408) loss 2.6708 (2.6699) grad_norm 3.7362 (5.6939) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][370/1251] eta 0:03:35 lr 0.000025 wd 0.0500 time 0.2358 (0.2446) data time 0.0009 (0.0032) model time 0.2349 (0.2408) loss 2.1503 (2.6652) grad_norm 4.2351 (5.6844) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][380/1251] eta 0:03:33 lr 0.000025 wd 0.0500 time 0.2437 (0.2446) data time 0.0007 (0.0031) model time 0.2430 (0.2408) loss 3.1609 (2.6695) grad_norm 3.9538 (5.7011) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [278/300][390/1251] eta 0:03:30 lr 0.000025 wd 0.0500 time 0.2406 (0.2445) data time 0.0007 (0.0030) model time 0.2399 (0.2407) loss 3.0552 (2.6735) grad_norm 5.1103 (5.7089) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][400/1251] eta 0:03:27 lr 0.000025 wd 0.0500 time 0.2410 (0.2444) data time 0.0009 (0.0030) model time 0.2400 (0.2407) loss 2.8291 (2.6750) grad_norm 5.7163 (5.8019) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][410/1251] eta 0:03:25 lr 0.000025 wd 0.0500 time 0.2412 (0.2443) data time 0.0007 (0.0029) model time 0.2405 (0.2407) loss 2.6444 (2.6793) grad_norm 3.7102 (5.7969) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][420/1251] eta 0:03:22 lr 0.000025 wd 0.0500 time 0.2423 (0.2442) data time 0.0008 (0.0029) model time 0.2414 (0.2407) loss 3.3654 (2.6829) grad_norm 5.7595 (5.8363) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][430/1251] eta 0:03:20 lr 0.000025 wd 0.0500 time 0.2277 (0.2441) data time 0.0008 (0.0028) model time 0.2269 (0.2406) loss 1.8689 (2.6809) grad_norm 4.4323 (5.8411) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][440/1251] eta 0:03:17 lr 0.000025 wd 0.0500 time 0.2395 (0.2440) data time 0.0007 (0.0028) model time 0.2388 (0.2406) loss 3.1251 (2.6857) grad_norm 4.0505 (5.8477) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][450/1251] eta 0:03:15 lr 0.000025 wd 0.0500 time 0.2335 (0.2439) data time 0.0009 (0.0028) model time 0.2326 (0.2405) loss 3.4647 (2.6915) grad_norm 5.0756 (5.8715) loss_scale 512.0000 
(512.0000) mem 7381MB [2024-09-01 10:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][460/1251] eta 0:03:12 lr 0.000025 wd 0.0500 time 0.2336 (0.2438) data time 0.0011 (0.0027) model time 0.2325 (0.2405) loss 2.6068 (2.6935) grad_norm 5.0913 (5.8777) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][470/1251] eta 0:03:10 lr 0.000024 wd 0.0500 time 0.2419 (0.2438) data time 0.0010 (0.0027) model time 0.2409 (0.2405) loss 2.5311 (2.6894) grad_norm 5.0828 (5.8608) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][480/1251] eta 0:03:07 lr 0.000024 wd 0.0500 time 0.2406 (0.2437) data time 0.0010 (0.0027) model time 0.2395 (0.2404) loss 2.8792 (2.6878) grad_norm 9.2969 (5.8967) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][490/1251] eta 0:03:05 lr 0.000024 wd 0.0500 time 0.2397 (0.2436) data time 0.0007 (0.0026) model time 0.2390 (0.2404) loss 2.8072 (2.6846) grad_norm 3.3132 (5.8687) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][500/1251] eta 0:03:03 lr 0.000024 wd 0.0500 time 0.2483 (0.2443) data time 0.0009 (0.0026) model time 0.2474 (0.2412) loss 2.7439 (2.6845) grad_norm 5.0184 (5.8600) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][510/1251] eta 0:03:01 lr 0.000024 wd 0.0500 time 0.2376 (0.2445) data time 0.0009 (0.0026) model time 0.2368 (0.2415) loss 2.5245 (2.6814) grad_norm 4.4282 (5.8379) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][520/1251] eta 0:02:58 lr 0.000024 wd 0.0500 time 0.2437 (0.2444) data time 0.0007 (0.0025) model 
time 0.2430 (0.2414) loss 2.8584 (2.6782) grad_norm 4.8553 (5.9143) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][530/1251] eta 0:02:56 lr 0.000024 wd 0.0500 time 0.2437 (0.2443) data time 0.0008 (0.0025) model time 0.2429 (0.2414) loss 3.0390 (2.6768) grad_norm 5.6872 (5.9167) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][540/1251] eta 0:02:53 lr 0.000024 wd 0.0500 time 0.2411 (0.2443) data time 0.0010 (0.0025) model time 0.2401 (0.2414) loss 2.3243 (2.6726) grad_norm 3.5637 (5.8985) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][550/1251] eta 0:02:51 lr 0.000024 wd 0.0500 time 0.2408 (0.2443) data time 0.0010 (0.0025) model time 0.2399 (0.2414) loss 2.2983 (2.6696) grad_norm 4.6214 (5.8940) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][560/1251] eta 0:02:48 lr 0.000024 wd 0.0500 time 0.2354 (0.2442) data time 0.0007 (0.0024) model time 0.2348 (0.2414) loss 2.5371 (2.6725) grad_norm 4.1839 (5.9147) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][570/1251] eta 0:02:46 lr 0.000024 wd 0.0500 time 0.2341 (0.2442) data time 0.0010 (0.0024) model time 0.2330 (0.2414) loss 1.7141 (2.6703) grad_norm 3.2221 (5.8931) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][580/1251] eta 0:02:44 lr 0.000024 wd 0.0500 time 0.2402 (0.2445) data time 0.0007 (0.0024) model time 0.2395 (0.2417) loss 2.1347 (2.6678) grad_norm 4.8277 (5.9025) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][590/1251] 
eta 0:02:41 lr 0.000024 wd 0.0500 time 0.2465 (0.2444) data time 0.0011 (0.0024) model time 0.2454 (0.2417) loss 2.7943 (2.6667) grad_norm 4.2283 (5.8850) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][600/1251] eta 0:02:39 lr 0.000024 wd 0.0500 time 0.2455 (0.2443) data time 0.0010 (0.0023) model time 0.2444 (0.2416) loss 3.2915 (2.6648) grad_norm 4.9661 (5.9092) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][610/1251] eta 0:02:36 lr 0.000024 wd 0.0500 time 0.2437 (0.2443) data time 0.0009 (0.0023) model time 0.2428 (0.2416) loss 3.1681 (2.6673) grad_norm 3.1410 (5.8984) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][620/1251] eta 0:02:34 lr 0.000024 wd 0.0500 time 0.2517 (0.2443) data time 0.0011 (0.0023) model time 0.2506 (0.2416) loss 2.7718 (2.6670) grad_norm 5.1963 (5.8897) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][630/1251] eta 0:02:31 lr 0.000024 wd 0.0500 time 0.2435 (0.2442) data time 0.0012 (0.0023) model time 0.2423 (0.2416) loss 3.0578 (2.6715) grad_norm 8.8688 (5.8947) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][640/1251] eta 0:02:29 lr 0.000024 wd 0.0500 time 0.2478 (0.2442) data time 0.0012 (0.0022) model time 0.2467 (0.2416) loss 2.7933 (2.6707) grad_norm 4.6164 (5.8790) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][650/1251] eta 0:02:26 lr 0.000024 wd 0.0500 time 0.2415 (0.2442) data time 0.0006 (0.0022) model time 0.2408 (0.2416) loss 3.1092 (2.6706) grad_norm 5.0814 (5.8603) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 
10:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][660/1251] eta 0:02:24 lr 0.000024 wd 0.0500 time 0.2361 (0.2441) data time 0.0009 (0.0022) model time 0.2352 (0.2415) loss 3.0196 (2.6747) grad_norm 5.3913 (5.8680) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][670/1251] eta 0:02:21 lr 0.000024 wd 0.0500 time 0.2373 (0.2440) data time 0.0009 (0.0022) model time 0.2364 (0.2415) loss 2.9037 (2.6736) grad_norm 5.8234 (5.8511) loss_scale 512.0000 (512.0000) mem 7381MB [2024-09-01 10:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][680/1251] eta 0:02:19 lr 0.000024 wd 0.0500 time 0.2324 (0.2440) data time 0.0011 (0.0022) model time 0.2313 (0.2415) loss 2.2704 (2.6723) grad_norm 5.8557 (inf) loss_scale 256.0000 (510.8722) mem 7381MB [2024-09-01 10:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][690/1251] eta 0:02:16 lr 0.000024 wd 0.0500 time 0.2399 (0.2439) data time 0.0007 (0.0022) model time 0.2392 (0.2414) loss 3.3179 (2.6738) grad_norm 4.1210 (inf) loss_scale 256.0000 (507.1838) mem 7381MB [2024-09-01 10:04:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][700/1251] eta 0:02:14 lr 0.000024 wd 0.0500 time 0.2387 (0.2439) data time 0.0007 (0.0021) model time 0.2380 (0.2414) loss 2.2515 (2.6727) grad_norm 4.0433 (inf) loss_scale 256.0000 (503.6006) mem 7381MB [2024-09-01 10:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][710/1251] eta 0:02:11 lr 0.000024 wd 0.0500 time 0.2430 (0.2439) data time 0.0010 (0.0021) model time 0.2420 (0.2414) loss 2.3576 (2.6741) grad_norm 3.5338 (inf) loss_scale 256.0000 (500.1181) mem 7381MB [2024-09-01 10:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][720/1251] eta 0:02:09 lr 0.000024 wd 0.0500 time 0.2385 (0.2438) data time 0.0011 (0.0021) model time 0.2374 (0.2414) loss 2.5514 (2.6756) 
grad_norm 3.9401 (inf) loss_scale 256.0000 (496.7323) mem 7381MB [2024-09-01 10:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][730/1251] eta 0:02:07 lr 0.000024 wd 0.0500 time 0.2497 (0.2438) data time 0.0010 (0.0021) model time 0.2487 (0.2414) loss 3.3350 (2.6789) grad_norm 7.0230 (inf) loss_scale 256.0000 (493.4391) mem 7381MB [2024-09-01 10:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][740/1251] eta 0:02:04 lr 0.000024 wd 0.0500 time 0.2415 (0.2441) data time 0.0007 (0.0021) model time 0.2408 (0.2417) loss 3.0282 (2.6791) grad_norm 10.4222 (inf) loss_scale 256.0000 (490.2348) mem 7381MB [2024-09-01 10:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][750/1251] eta 0:02:02 lr 0.000024 wd 0.0500 time 0.2416 (0.2440) data time 0.0007 (0.0021) model time 0.2409 (0.2417) loss 3.2000 (2.6818) grad_norm 6.3062 (inf) loss_scale 256.0000 (487.1158) mem 7381MB [2024-09-01 10:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][760/1251] eta 0:01:59 lr 0.000024 wd 0.0500 time 0.2372 (0.2440) data time 0.0009 (0.0021) model time 0.2363 (0.2416) loss 2.9700 (2.6820) grad_norm 3.5050 (inf) loss_scale 256.0000 (484.0788) mem 7381MB [2024-09-01 10:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][770/1251] eta 0:01:57 lr 0.000024 wd 0.0500 time 0.2441 (0.2439) data time 0.0007 (0.0020) model time 0.2434 (0.2416) loss 1.9768 (2.6791) grad_norm 6.3104 (inf) loss_scale 256.0000 (481.1206) mem 7381MB [2024-09-01 10:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][780/1251] eta 0:01:54 lr 0.000024 wd 0.0500 time 0.2372 (0.2439) data time 0.0007 (0.0020) model time 0.2365 (0.2416) loss 2.0231 (2.6792) grad_norm 6.1528 (inf) loss_scale 256.0000 (478.2382) mem 7381MB [2024-09-01 10:05:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][790/1251] eta 0:01:52 lr 0.000024 wd 0.0500 time 0.2302 (0.2439) data 
time 0.0010 (0.0020) model time 0.2292 (0.2416) loss 2.6253 (2.6786) grad_norm 4.8843 (inf) loss_scale 256.0000 (475.4286) mem 7381MB [2024-09-01 10:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][800/1251] eta 0:01:49 lr 0.000024 wd 0.0500 time 0.2337 (0.2438) data time 0.0009 (0.0020) model time 0.2328 (0.2415) loss 2.0450 (2.6766) grad_norm 9.6338 (inf) loss_scale 256.0000 (472.6891) mem 7381MB [2024-09-01 10:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][810/1251] eta 0:01:47 lr 0.000024 wd 0.0500 time 0.2358 (0.2438) data time 0.0011 (0.0020) model time 0.2347 (0.2415) loss 3.0798 (2.6791) grad_norm 4.7868 (inf) loss_scale 256.0000 (470.0173) mem 7381MB [2024-09-01 10:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][820/1251] eta 0:01:45 lr 0.000024 wd 0.0500 time 0.2281 (0.2437) data time 0.0008 (0.0020) model time 0.2273 (0.2414) loss 3.3129 (2.6801) grad_norm 8.1358 (inf) loss_scale 256.0000 (467.4105) mem 7381MB [2024-09-01 10:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][830/1251] eta 0:01:42 lr 0.000024 wd 0.0500 time 0.2448 (0.2437) data time 0.0007 (0.0020) model time 0.2441 (0.2414) loss 1.5758 (2.6788) grad_norm 7.7102 (inf) loss_scale 256.0000 (464.8664) mem 7381MB [2024-09-01 10:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][840/1251] eta 0:01:40 lr 0.000024 wd 0.0500 time 0.2369 (0.2436) data time 0.0012 (0.0020) model time 0.2357 (0.2414) loss 3.0214 (2.6806) grad_norm 4.1937 (inf) loss_scale 256.0000 (462.3829) mem 7381MB [2024-09-01 10:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][850/1251] eta 0:01:37 lr 0.000024 wd 0.0500 time 0.2360 (0.2436) data time 0.0009 (0.0019) model time 0.2351 (0.2414) loss 2.4713 (2.6777) grad_norm 10.1768 (inf) loss_scale 256.0000 (459.9577) mem 7381MB [2024-09-01 10:05:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[278/300][860/1251] eta 0:01:35 lr 0.000024 wd 0.0500 time 0.2405 (0.2436) data time 0.0009 (0.0019) model time 0.2396 (0.2414) loss 2.5349 (2.6770) grad_norm 6.8740 (inf) loss_scale 256.0000 (457.5889) mem 7381MB [2024-09-01 10:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][870/1251] eta 0:01:32 lr 0.000024 wd 0.0500 time 0.2410 (0.2435) data time 0.0010 (0.0019) model time 0.2401 (0.2414) loss 2.7260 (2.6764) grad_norm 4.0449 (inf) loss_scale 256.0000 (455.2744) mem 7381MB [2024-09-01 10:05:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][880/1251] eta 0:01:30 lr 0.000024 wd 0.0500 time 0.2392 (0.2435) data time 0.0011 (0.0019) model time 0.2381 (0.2413) loss 2.3646 (2.6784) grad_norm 3.8245 (inf) loss_scale 256.0000 (453.0125) mem 7381MB [2024-09-01 10:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][890/1251] eta 0:01:27 lr 0.000024 wd 0.0500 time 0.2384 (0.2435) data time 0.0011 (0.0019) model time 0.2373 (0.2413) loss 3.0491 (2.6774) grad_norm 3.7519 (inf) loss_scale 256.0000 (450.8013) mem 7381MB [2024-09-01 10:05:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][900/1251] eta 0:01:25 lr 0.000024 wd 0.0500 time 0.2384 (0.2435) data time 0.0009 (0.0019) model time 0.2375 (0.2413) loss 2.7758 (2.6773) grad_norm 6.3597 (inf) loss_scale 256.0000 (448.6393) mem 7381MB [2024-09-01 10:05:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][910/1251] eta 0:01:23 lr 0.000024 wd 0.0500 time 0.2431 (0.2434) data time 0.0008 (0.0019) model time 0.2422 (0.2413) loss 1.8202 (2.6803) grad_norm 6.4613 (inf) loss_scale 256.0000 (446.5247) mem 7381MB [2024-09-01 10:05:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][920/1251] eta 0:01:20 lr 0.000024 wd 0.0500 time 0.2515 (0.2434) data time 0.0007 (0.0019) model time 0.2508 (0.2413) loss 2.4651 (2.6817) grad_norm 4.0734 (inf) loss_scale 256.0000 (444.4560) mem 7381MB [2024-09-01 
10:05:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][930/1251] eta 0:01:18 lr 0.000024 wd 0.0500 time 0.2426 (0.2434) data time 0.0009 (0.0019) model time 0.2417 (0.2413) loss 3.1419 (2.6821) grad_norm 4.7938 (inf) loss_scale 256.0000 (442.4318) mem 7381MB [2024-09-01 10:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][940/1251] eta 0:01:15 lr 0.000024 wd 0.0500 time 0.2435 (0.2434) data time 0.0007 (0.0019) model time 0.2428 (0.2413) loss 2.0731 (2.6835) grad_norm 8.3760 (inf) loss_scale 256.0000 (440.4506) mem 7381MB [2024-09-01 10:05:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][950/1251] eta 0:01:13 lr 0.000024 wd 0.0500 time 0.2452 (0.2434) data time 0.0010 (0.0018) model time 0.2442 (0.2413) loss 1.9321 (2.6823) grad_norm 35.2373 (inf) loss_scale 256.0000 (438.5110) mem 7381MB [2024-09-01 10:05:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][960/1251] eta 0:01:10 lr 0.000024 wd 0.0500 time 0.2462 (0.2433) data time 0.0010 (0.0018) model time 0.2452 (0.2413) loss 2.4179 (2.6835) grad_norm 4.0545 (inf) loss_scale 256.0000 (436.6119) mem 7381MB [2024-09-01 10:05:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][970/1251] eta 0:01:08 lr 0.000024 wd 0.0500 time 0.2370 (0.2434) data time 0.0009 (0.0018) model time 0.2361 (0.2413) loss 2.7724 (2.6831) grad_norm 7.7886 (inf) loss_scale 256.0000 (434.7518) mem 7381MB [2024-09-01 10:05:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][980/1251] eta 0:01:05 lr 0.000024 wd 0.0500 time 0.2358 (0.2433) data time 0.0011 (0.0018) model time 0.2347 (0.2413) loss 3.0060 (2.6830) grad_norm 9.3674 (inf) loss_scale 256.0000 (432.9297) mem 7381MB [2024-09-01 10:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][990/1251] eta 0:01:03 lr 0.000024 wd 0.0500 time 0.2326 (0.2433) data time 0.0007 (0.0018) model time 0.2319 (0.2412) loss 2.8598 (2.6862) grad_norm 
5.4700 (inf) loss_scale 256.0000 (431.1443) mem 7381MB [2024-09-01 10:06:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1000/1251] eta 0:01:01 lr 0.000024 wd 0.0500 time 0.2442 (0.2433) data time 0.0007 (0.0018) model time 0.2435 (0.2412) loss 1.4362 (2.6844) grad_norm 4.2912 (inf) loss_scale 256.0000 (429.3946) mem 7381MB [2024-09-01 10:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1010/1251] eta 0:00:58 lr 0.000024 wd 0.0500 time 0.2369 (0.2433) data time 0.0009 (0.0018) model time 0.2360 (0.2412) loss 3.1417 (2.6840) grad_norm 6.4993 (inf) loss_scale 256.0000 (427.6795) mem 7381MB [2024-09-01 10:06:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1020/1251] eta 0:00:56 lr 0.000024 wd 0.0500 time 0.2347 (0.2435) data time 0.0010 (0.0018) model time 0.2337 (0.2415) loss 3.0789 (2.6862) grad_norm 3.8822 (inf) loss_scale 256.0000 (425.9980) mem 7381MB [2024-09-01 10:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1030/1251] eta 0:00:53 lr 0.000024 wd 0.0500 time 0.2417 (0.2437) data time 0.0007 (0.0018) model time 0.2411 (0.2417) loss 3.0704 (2.6877) grad_norm 3.1239 (inf) loss_scale 256.0000 (424.3492) mem 7381MB [2024-09-01 10:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1040/1251] eta 0:00:51 lr 0.000024 wd 0.0500 time 0.2448 (0.2436) data time 0.0007 (0.0018) model time 0.2441 (0.2417) loss 3.3862 (2.6868) grad_norm 6.0881 (inf) loss_scale 256.0000 (422.7320) mem 7381MB [2024-09-01 10:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1050/1251] eta 0:00:48 lr 0.000024 wd 0.0500 time 0.2414 (0.2436) data time 0.0009 (0.0018) model time 0.2405 (0.2416) loss 2.5563 (2.6858) grad_norm 5.4352 (inf) loss_scale 256.0000 (421.1456) mem 7381MB [2024-09-01 10:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1060/1251] eta 0:00:46 lr 0.000024 wd 0.0500 time 0.2360 (0.2435) data time 
0.0007 (0.0017) model time 0.2354 (0.2416) loss 2.8507 (2.6863) grad_norm 5.3711 (inf) loss_scale 256.0000 (419.5891) mem 7381MB [2024-09-01 10:06:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1070/1251] eta 0:00:44 lr 0.000024 wd 0.0500 time 0.2370 (0.2435) data time 0.0008 (0.0017) model time 0.2362 (0.2416) loss 2.2780 (2.6885) grad_norm 5.8404 (inf) loss_scale 256.0000 (418.0616) mem 7381MB [2024-09-01 10:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1080/1251] eta 0:00:41 lr 0.000024 wd 0.0500 time 0.2404 (0.2435) data time 0.0009 (0.0017) model time 0.2394 (0.2415) loss 3.0783 (2.6895) grad_norm 7.1577 (inf) loss_scale 256.0000 (416.5624) mem 7381MB [2024-09-01 10:06:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1090/1251] eta 0:00:39 lr 0.000024 wd 0.0500 time 0.2316 (0.2435) data time 0.0011 (0.0017) model time 0.2305 (0.2415) loss 2.1948 (2.6885) grad_norm 7.2965 (inf) loss_scale 256.0000 (415.0907) mem 7381MB [2024-09-01 10:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1100/1251] eta 0:00:36 lr 0.000024 wd 0.0500 time 0.2384 (0.2435) data time 0.0009 (0.0017) model time 0.2375 (0.2415) loss 2.8509 (2.6878) grad_norm 8.4653 (inf) loss_scale 256.0000 (413.6458) mem 7381MB [2024-09-01 10:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1110/1251] eta 0:00:34 lr 0.000024 wd 0.0500 time 0.2505 (0.2434) data time 0.0010 (0.0017) model time 0.2495 (0.2415) loss 3.0794 (2.6898) grad_norm 6.7068 (inf) loss_scale 256.0000 (412.2268) mem 7381MB [2024-09-01 10:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1120/1251] eta 0:00:31 lr 0.000024 wd 0.0500 time 0.2410 (0.2434) data time 0.0009 (0.0017) model time 0.2400 (0.2415) loss 3.1709 (2.6893) grad_norm 4.6624 (inf) loss_scale 256.0000 (410.8332) mem 7381MB [2024-09-01 10:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[278/300][1130/1251] eta 0:00:29 lr 0.000024 wd 0.0500 time 0.2390 (0.2434) data time 0.0009 (0.0017) model time 0.2381 (0.2415) loss 3.1254 (2.6905) grad_norm 5.3593 (inf) loss_scale 256.0000 (409.4642) mem 7381MB [2024-09-01 10:06:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1140/1251] eta 0:00:27 lr 0.000024 wd 0.0500 time 0.2445 (0.2434) data time 0.0007 (0.0017) model time 0.2438 (0.2415) loss 2.9691 (2.6897) grad_norm 4.4516 (inf) loss_scale 256.0000 (408.1192) mem 7381MB [2024-09-01 10:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1150/1251] eta 0:00:24 lr 0.000024 wd 0.0500 time 0.2448 (0.2434) data time 0.0010 (0.0017) model time 0.2439 (0.2415) loss 2.5788 (2.6906) grad_norm 6.8522 (inf) loss_scale 256.0000 (406.7976) mem 7381MB [2024-09-01 10:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1160/1251] eta 0:00:22 lr 0.000024 wd 0.0500 time 0.2395 (0.2434) data time 0.0008 (0.0017) model time 0.2387 (0.2415) loss 2.7708 (2.6912) grad_norm 5.5817 (inf) loss_scale 256.0000 (405.4987) mem 7381MB [2024-09-01 10:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1170/1251] eta 0:00:19 lr 0.000024 wd 0.0500 time 0.2315 (0.2434) data time 0.0011 (0.0017) model time 0.2304 (0.2415) loss 2.2590 (2.6928) grad_norm 6.7677 (inf) loss_scale 256.0000 (404.2220) mem 7381MB [2024-09-01 10:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1180/1251] eta 0:00:17 lr 0.000024 wd 0.0500 time 0.2312 (0.2434) data time 0.0008 (0.0017) model time 0.2305 (0.2415) loss 2.4026 (2.6927) grad_norm 7.4174 (inf) loss_scale 256.0000 (402.9670) mem 7381MB [2024-09-01 10:06:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1190/1251] eta 0:00:14 lr 0.000024 wd 0.0500 time 0.2439 (0.2434) data time 0.0007 (0.0017) model time 0.2432 (0.2415) loss 2.6863 (2.6932) grad_norm 6.0404 (inf) loss_scale 256.0000 (401.7330) mem 7381MB 
[2024-09-01 10:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1200/1251] eta 0:00:12 lr 0.000024 wd 0.0500 time 0.2459 (0.2433) data time 0.0009 (0.0017) model time 0.2450 (0.2415) loss 3.1587 (2.6921) grad_norm 7.4849 (inf) loss_scale 256.0000 (400.5196) mem 7381MB [2024-09-01 10:06:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1210/1251] eta 0:00:09 lr 0.000024 wd 0.0500 time 0.2423 (0.2434) data time 0.0008 (0.0017) model time 0.2415 (0.2415) loss 2.9402 (2.6912) grad_norm 4.3697 (inf) loss_scale 256.0000 (399.3262) mem 7381MB [2024-09-01 10:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1220/1251] eta 0:00:07 lr 0.000024 wd 0.0500 time 0.2404 (0.2433) data time 0.0009 (0.0016) model time 0.2395 (0.2415) loss 2.6492 (2.6916) grad_norm 4.6828 (inf) loss_scale 256.0000 (398.1523) mem 7381MB [2024-09-01 10:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1230/1251] eta 0:00:05 lr 0.000024 wd 0.0500 time 0.2392 (0.2433) data time 0.0010 (0.0016) model time 0.2383 (0.2415) loss 2.8985 (2.6905) grad_norm 7.9259 (inf) loss_scale 256.0000 (396.9976) mem 7381MB [2024-09-01 10:06:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1240/1251] eta 0:00:02 lr 0.000024 wd 0.0500 time 0.2247 (0.2432) data time 0.0004 (0.0016) model time 0.2242 (0.2414) loss 3.1874 (2.6897) grad_norm 6.3086 (inf) loss_scale 256.0000 (395.8614) mem 7381MB [2024-09-01 10:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [278/300][1250/1251] eta 0:00:00 lr 0.000024 wd 0.0500 time 0.2236 (0.2431) data time 0.0005 (0.0016) model time 0.2231 (0.2413) loss 2.8361 (2.6884) grad_norm 4.5151 (inf) loss_scale 256.0000 (394.7434) mem 7381MB [2024-09-01 10:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 278 training takes 0:05:04 [2024-09-01 10:07:01 msvmambav3_tiny_224] (utils.py 118): INFO 
./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:07:01 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:07:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.436 (0.436) Loss 0.3940 (0.3940) Acc@1 92.676 (92.676) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 10:07:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.114) Loss 0.5557 (0.6090) Acc@1 90.527 (87.642) Acc@5 98.047 (97.710) Mem 7381MB [2024-09-01 10:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.097) Loss 0.9185 (0.6406) Acc@1 77.246 (86.454) Acc@5 95.703 (97.666) Mem 7381MB [2024-09-01 10:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.1455 (0.7358) Acc@1 74.316 (84.290) Acc@5 92.578 (96.689) Mem 7381MB [2024-09-01 10:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.085) Loss 1.0039 (0.7864) Acc@1 76.855 (83.117) Acc@5 94.336 (96.182) Mem 7381MB [2024-09-01 10:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.676 Acc@5 96.144 [2024-09-01 10:07:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 10:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.862 (0.862) Loss 0.3884 (0.3884) Acc@1 92.969 (92.969) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 10:07:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.152) Loss 0.5654 (0.6063) Acc@1 90.527 (87.891) Acc@5 98.242 (97.852) Mem 7381MB [2024-09-01 10:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.118) Loss 0.9058 (0.6368) Acc@1 78.027 (86.723) Acc@5 95.703 (97.773) Mem 7381MB [2024-09-01 10:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [30/49] Time 0.083 (0.104) Loss 1.1309 (0.7290) Acc@1 74.902 (84.548) Acc@5 92.773 (96.799) Mem 7381MB [2024-09-01 10:07:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.095) Loss 1.0088 (0.7770) Acc@1 76.758 (83.363) Acc@5 94.531 (96.320) Mem 7381MB [2024-09-01 10:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.942 Acc@5 96.278 [2024-09-01 10:07:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][0/1251] eta 0:22:34 lr 0.000024 wd 0.0500 time 1.0829 (1.0829) data time 0.7611 (0.7611) model time 0.0000 (0.0000) loss 3.0794 (3.0794) grad_norm 6.1687 (6.1687) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][10/1251] eta 0:06:33 lr 0.000024 wd 0.0500 time 0.2380 (0.3172) data time 0.0007 (0.0701) model time 0.0000 (0.0000) loss 1.9860 (2.7350) grad_norm 3.9338 (6.0532) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][20/1251] eta 0:05:46 lr 0.000024 wd 0.0500 time 0.2405 (0.2812) data time 0.0007 (0.0372) model time 0.0000 (0.0000) loss 2.8256 (2.8079) grad_norm 6.3779 (5.7626) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][30/1251] eta 0:05:28 lr 0.000024 wd 0.0500 time 0.2491 (0.2690) data time 0.0009 (0.0255) model time 0.0000 (0.0000) loss 1.9250 (2.7122) grad_norm 5.0331 (5.6490) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][40/1251] eta 0:05:18 lr 0.000024 wd 0.0500 time 0.2477 (0.2628) data time 0.0007 (0.0195) model time 0.0000 (0.0000) loss 2.8949 (2.7079) grad_norm 4.0747 (5.8841) loss_scale 256.0000 (256.0000) 
mem 7381MB [2024-09-01 10:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][50/1251] eta 0:05:15 lr 0.000024 wd 0.0500 time 0.2464 (0.2628) data time 0.0009 (0.0159) model time 0.0000 (0.0000) loss 3.3945 (2.7643) grad_norm 4.7066 (5.6527) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][60/1251] eta 0:05:08 lr 0.000024 wd 0.0500 time 0.2352 (0.2594) data time 0.0009 (0.0135) model time 0.2342 (0.2411) loss 3.4487 (2.7961) grad_norm 5.8511 (5.6490) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][70/1251] eta 0:05:03 lr 0.000024 wd 0.0500 time 0.2441 (0.2569) data time 0.0010 (0.0117) model time 0.2431 (0.2409) loss 2.5767 (2.7790) grad_norm 8.1223 (5.8558) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][80/1251] eta 0:04:58 lr 0.000024 wd 0.0500 time 0.2415 (0.2552) data time 0.0011 (0.0104) model time 0.2404 (0.2413) loss 3.0089 (2.7788) grad_norm 6.1100 (5.8692) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][90/1251] eta 0:04:54 lr 0.000024 wd 0.0500 time 0.2471 (0.2537) data time 0.0008 (0.0094) model time 0.2463 (0.2411) loss 3.4076 (2.7793) grad_norm 3.6637 (5.8301) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][100/1251] eta 0:04:53 lr 0.000024 wd 0.0500 time 0.2410 (0.2547) data time 0.0009 (0.0085) model time 0.2400 (0.2455) loss 2.6371 (2.7605) grad_norm 5.9066 (5.8078) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][110/1251] eta 0:04:49 lr 0.000024 wd 0.0500 time 0.2500 (0.2536) data time 0.0008 (0.0079) model time 0.2492 
(0.2447) loss 2.7961 (2.7535) grad_norm 4.9475 (5.7474) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][120/1251] eta 0:04:45 lr 0.000024 wd 0.0500 time 0.2391 (0.2526) data time 0.0009 (0.0073) model time 0.2382 (0.2442) loss 2.7585 (2.7573) grad_norm 7.2059 (5.7666) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][130/1251] eta 0:04:42 lr 0.000024 wd 0.0500 time 0.2375 (0.2518) data time 0.0009 (0.0068) model time 0.2366 (0.2437) loss 2.7276 (2.7575) grad_norm 6.1276 (5.7944) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][140/1251] eta 0:04:38 lr 0.000024 wd 0.0500 time 0.2372 (0.2510) data time 0.0007 (0.0064) model time 0.2365 (0.2434) loss 3.1722 (2.7557) grad_norm 6.3096 (5.8135) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][150/1251] eta 0:04:35 lr 0.000024 wd 0.0500 time 0.2402 (0.2504) data time 0.0009 (0.0060) model time 0.2393 (0.2431) loss 2.7001 (2.7502) grad_norm 5.0207 (5.9937) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][160/1251] eta 0:04:32 lr 0.000024 wd 0.0500 time 0.2362 (0.2497) data time 0.0007 (0.0057) model time 0.2355 (0.2426) loss 2.7973 (2.7261) grad_norm 6.5961 (5.9577) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][170/1251] eta 0:04:29 lr 0.000024 wd 0.0500 time 0.2423 (0.2491) data time 0.0007 (0.0054) model time 0.2416 (0.2423) loss 3.1641 (2.7193) grad_norm 5.3132 (5.9288) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][180/1251] eta 0:04:26 
lr 0.000023 wd 0.0500 time 0.2446 (0.2487) data time 0.0007 (0.0052) model time 0.2439 (0.2422) loss 3.2845 (2.7173) grad_norm 8.2023 (5.8805) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:07:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][190/1251] eta 0:04:23 lr 0.000023 wd 0.0500 time 0.2397 (0.2482) data time 0.0007 (0.0050) model time 0.2390 (0.2419) loss 2.4032 (2.7181) grad_norm 4.0938 (5.8675) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][200/1251] eta 0:04:21 lr 0.000023 wd 0.0500 time 0.2351 (0.2489) data time 0.0011 (0.0048) model time 0.2340 (0.2432) loss 2.9736 (2.7223) grad_norm 7.0948 (5.9537) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][210/1251] eta 0:04:18 lr 0.000023 wd 0.0500 time 0.2344 (0.2485) data time 0.0009 (0.0046) model time 0.2335 (0.2430) loss 3.2223 (2.7216) grad_norm 6.0472 (5.9553) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][220/1251] eta 0:04:15 lr 0.000023 wd 0.0500 time 0.2377 (0.2481) data time 0.0011 (0.0044) model time 0.2366 (0.2427) loss 3.2324 (2.7278) grad_norm 7.3603 (5.9727) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][230/1251] eta 0:04:12 lr 0.000023 wd 0.0500 time 0.2462 (0.2478) data time 0.0009 (0.0043) model time 0.2453 (0.2425) loss 2.7043 (2.7294) grad_norm 5.3857 (5.9646) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][240/1251] eta 0:04:10 lr 0.000023 wd 0.0500 time 0.2465 (0.2475) data time 0.0008 (0.0042) model time 0.2457 (0.2425) loss 1.6941 (2.7135) grad_norm 5.5154 (5.9048) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:12 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][250/1251] eta 0:04:07 lr 0.000023 wd 0.0500 time 0.2435 (0.2472) data time 0.0009 (0.0040) model time 0.2427 (0.2422) loss 3.4970 (2.7159) grad_norm 3.9189 (5.8502) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][260/1251] eta 0:04:04 lr 0.000023 wd 0.0500 time 0.2434 (0.2469) data time 0.0011 (0.0039) model time 0.2423 (0.2420) loss 2.4422 (2.7203) grad_norm 4.3840 (5.8120) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][270/1251] eta 0:04:02 lr 0.000023 wd 0.0500 time 0.2456 (0.2467) data time 0.0009 (0.0038) model time 0.2447 (0.2420) loss 2.9709 (2.7172) grad_norm 4.6303 (5.7672) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][280/1251] eta 0:03:59 lr 0.000023 wd 0.0500 time 0.2472 (0.2465) data time 0.0007 (0.0037) model time 0.2465 (0.2419) loss 2.7603 (2.7185) grad_norm 6.3625 (5.7697) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][290/1251] eta 0:03:56 lr 0.000023 wd 0.0500 time 0.2422 (0.2463) data time 0.0007 (0.0036) model time 0.2415 (0.2418) loss 2.5558 (2.7180) grad_norm 3.8255 (5.7424) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][300/1251] eta 0:03:54 lr 0.000023 wd 0.0500 time 0.2365 (0.2462) data time 0.0008 (0.0035) model time 0.2357 (0.2418) loss 2.1699 (2.7211) grad_norm 4.8413 (5.8127) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][310/1251] eta 0:03:51 lr 0.000023 wd 0.0500 time 0.2435 (0.2460) data time 0.0007 (0.0034) model time 0.2427 (0.2418) loss 1.4587 (2.7102) 
grad_norm 4.3748 (5.7894) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][320/1251] eta 0:03:48 lr 0.000023 wd 0.0500 time 0.2445 (0.2459) data time 0.0012 (0.0034) model time 0.2434 (0.2418) loss 2.6083 (2.7113) grad_norm 4.6583 (5.7815) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][330/1251] eta 0:03:46 lr 0.000023 wd 0.0500 time 0.2454 (0.2458) data time 0.0011 (0.0033) model time 0.2443 (0.2417) loss 2.8702 (2.7156) grad_norm 6.0198 (5.8378) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][340/1251] eta 0:03:43 lr 0.000023 wd 0.0500 time 0.2385 (0.2457) data time 0.0011 (0.0032) model time 0.2374 (0.2417) loss 1.6191 (2.7095) grad_norm 3.5412 (5.8149) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][350/1251] eta 0:03:41 lr 0.000023 wd 0.0500 time 0.2340 (0.2456) data time 0.0009 (0.0032) model time 0.2331 (0.2417) loss 2.5321 (2.7080) grad_norm 6.8857 (5.8275) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][360/1251] eta 0:03:38 lr 0.000023 wd 0.0500 time 0.2490 (0.2455) data time 0.0008 (0.0031) model time 0.2482 (0.2417) loss 2.5010 (2.7002) grad_norm 5.7910 (5.7902) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][370/1251] eta 0:03:36 lr 0.000023 wd 0.0500 time 0.2413 (0.2454) data time 0.0009 (0.0030) model time 0.2403 (0.2417) loss 2.2444 (2.6986) grad_norm 4.0525 (5.7783) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][380/1251] eta 0:03:33 lr 0.000023 wd 0.0500 time 
0.2431 (0.2454) data time 0.0012 (0.0030) model time 0.2419 (0.2417) loss 2.8555 (2.7007) grad_norm 4.4564 (5.7889) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][390/1251] eta 0:03:31 lr 0.000023 wd 0.0500 time 0.2298 (0.2452) data time 0.0010 (0.0029) model time 0.2288 (0.2416) loss 2.0303 (2.6952) grad_norm 5.0604 (5.7765) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][400/1251] eta 0:03:29 lr 0.000023 wd 0.0500 time 0.2347 (0.2461) data time 0.0009 (0.0029) model time 0.2338 (0.2427) loss 2.5606 (2.6931) grad_norm 4.5320 (5.7521) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][410/1251] eta 0:03:26 lr 0.000023 wd 0.0500 time 0.2429 (0.2460) data time 0.0007 (0.0028) model time 0.2422 (0.2426) loss 3.0473 (2.6921) grad_norm 3.8207 (5.7381) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][420/1251] eta 0:03:24 lr 0.000023 wd 0.0500 time 0.2449 (0.2458) data time 0.0008 (0.0028) model time 0.2442 (0.2425) loss 2.6524 (2.6928) grad_norm 5.2453 (5.7388) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][430/1251] eta 0:03:21 lr 0.000023 wd 0.0500 time 0.2375 (0.2458) data time 0.0007 (0.0028) model time 0.2368 (0.2425) loss 1.8266 (2.6879) grad_norm 4.7805 (5.7291) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:08:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][440/1251] eta 0:03:19 lr 0.000023 wd 0.0500 time 0.2413 (0.2457) data time 0.0009 (0.0027) model time 0.2404 (0.2424) loss 2.5240 (2.6849) grad_norm 4.5541 (5.7351) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:01 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [279/300][450/1251] eta 0:03:16 lr 0.000023 wd 0.0500 time 0.2471 (0.2456) data time 0.0011 (0.0027) model time 0.2460 (0.2424) loss 2.4337 (2.6887) grad_norm 9.3623 (5.7470) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][460/1251] eta 0:03:14 lr 0.000023 wd 0.0500 time 0.2329 (0.2454) data time 0.0007 (0.0026) model time 0.2322 (0.2423) loss 2.1175 (2.6900) grad_norm 4.5594 (5.7664) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][470/1251] eta 0:03:11 lr 0.000023 wd 0.0500 time 0.2523 (0.2454) data time 0.0007 (0.0026) model time 0.2516 (0.2423) loss 2.3433 (2.6893) grad_norm 5.3742 (5.7634) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][480/1251] eta 0:03:09 lr 0.000023 wd 0.0500 time 0.2476 (0.2453) data time 0.0009 (0.0026) model time 0.2467 (0.2422) loss 2.3494 (2.6844) grad_norm 4.2906 (5.7491) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][490/1251] eta 0:03:06 lr 0.000023 wd 0.0500 time 0.2368 (0.2452) data time 0.0011 (0.0025) model time 0.2358 (0.2421) loss 2.8031 (2.6823) grad_norm 5.6085 (5.7504) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][500/1251] eta 0:03:04 lr 0.000023 wd 0.0500 time 0.2443 (0.2452) data time 0.0009 (0.0025) model time 0.2433 (0.2422) loss 2.9863 (2.6837) grad_norm 5.4012 (5.7352) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][510/1251] eta 0:03:01 lr 0.000023 wd 0.0500 time 0.2335 (0.2451) data time 0.0010 (0.0025) model time 0.2324 (0.2421) loss 2.9454 (2.6893) grad_norm 3.8153 
(5.7266) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][520/1251] eta 0:02:59 lr 0.000023 wd 0.0500 time 0.2409 (0.2449) data time 0.0010 (0.0024) model time 0.2399 (0.2420) loss 2.1408 (2.6882) grad_norm 6.4708 (5.7271) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][530/1251] eta 0:02:56 lr 0.000023 wd 0.0500 time 0.2443 (0.2448) data time 0.0007 (0.0024) model time 0.2436 (0.2419) loss 2.8090 (2.6885) grad_norm 7.0773 (5.7451) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][540/1251] eta 0:02:54 lr 0.000023 wd 0.0500 time 0.2419 (0.2448) data time 0.0007 (0.0024) model time 0.2412 (0.2419) loss 3.0382 (2.6868) grad_norm 5.8582 (5.7500) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][550/1251] eta 0:02:51 lr 0.000023 wd 0.0500 time 0.2478 (0.2448) data time 0.0009 (0.0024) model time 0.2469 (0.2419) loss 1.8505 (2.6835) grad_norm 5.2036 (5.7475) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][560/1251] eta 0:02:49 lr 0.000023 wd 0.0500 time 0.2384 (0.2447) data time 0.0009 (0.0023) model time 0.2375 (0.2419) loss 3.2668 (2.6865) grad_norm 3.4952 (5.7601) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][570/1251] eta 0:02:46 lr 0.000023 wd 0.0500 time 0.2498 (0.2447) data time 0.0009 (0.0023) model time 0.2490 (0.2419) loss 1.7189 (2.6874) grad_norm 4.8146 (5.7668) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][580/1251] eta 0:02:44 lr 0.000023 wd 0.0500 time 0.2449 (0.2450) data 
time 0.0010 (0.0023) model time 0.2439 (0.2423) loss 3.1207 (2.6863) grad_norm 6.2378 (5.7637) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][590/1251] eta 0:02:41 lr 0.000023 wd 0.0500 time 0.2426 (0.2450) data time 0.0008 (0.0023) model time 0.2418 (0.2423) loss 2.9533 (2.6848) grad_norm 37.0493 (5.8124) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][600/1251] eta 0:02:39 lr 0.000023 wd 0.0500 time 0.2478 (0.2449) data time 0.0009 (0.0023) model time 0.2469 (0.2423) loss 2.4515 (2.6853) grad_norm 6.1096 (5.8384) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][610/1251] eta 0:02:36 lr 0.000023 wd 0.0500 time 0.2395 (0.2448) data time 0.0009 (0.0022) model time 0.2386 (0.2422) loss 1.8489 (2.6832) grad_norm 5.1930 (5.8368) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][620/1251] eta 0:02:34 lr 0.000023 wd 0.0500 time 0.2418 (0.2448) data time 0.0010 (0.0022) model time 0.2408 (0.2422) loss 2.7143 (2.6831) grad_norm 4.6974 (5.8238) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][630/1251] eta 0:02:31 lr 0.000023 wd 0.0500 time 0.2361 (0.2447) data time 0.0009 (0.0022) model time 0.2352 (0.2421) loss 2.2852 (2.6814) grad_norm 6.1327 (5.8258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][640/1251] eta 0:02:29 lr 0.000023 wd 0.0500 time 0.2506 (0.2447) data time 0.0010 (0.0022) model time 0.2495 (0.2421) loss 2.4276 (2.6804) grad_norm 5.1348 (5.8073) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [279/300][650/1251] eta 0:02:27 lr 0.000023 wd 0.0500 time 0.2405 (0.2446) data time 0.0009 (0.0022) model time 0.2396 (0.2421) loss 2.9204 (2.6814) grad_norm 4.8307 (5.7948) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][660/1251] eta 0:02:24 lr 0.000023 wd 0.0500 time 0.2338 (0.2446) data time 0.0009 (0.0021) model time 0.2329 (0.2421) loss 2.7141 (2.6813) grad_norm 4.5653 (5.8072) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][670/1251] eta 0:02:22 lr 0.000023 wd 0.0500 time 0.2421 (0.2445) data time 0.0007 (0.0021) model time 0.2413 (0.2420) loss 3.3902 (2.6833) grad_norm 4.3340 (5.8362) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][680/1251] eta 0:02:19 lr 0.000023 wd 0.0500 time 0.2420 (0.2445) data time 0.0011 (0.0021) model time 0.2408 (0.2420) loss 2.7069 (2.6839) grad_norm 4.6676 (5.8338) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][690/1251] eta 0:02:17 lr 0.000023 wd 0.0500 time 0.2303 (0.2444) data time 0.0008 (0.0021) model time 0.2295 (0.2420) loss 3.0621 (2.6830) grad_norm 5.5925 (5.8236) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][700/1251] eta 0:02:14 lr 0.000023 wd 0.0500 time 0.2421 (0.2444) data time 0.0007 (0.0021) model time 0.2414 (0.2420) loss 2.9710 (2.6859) grad_norm 4.7052 (5.8262) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][710/1251] eta 0:02:12 lr 0.000023 wd 0.0500 time 0.2346 (0.2444) data time 0.0012 (0.0021) model time 0.2333 (0.2419) loss 2.7380 (2.6825) grad_norm 9.5925 (5.8523) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 10:10:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][720/1251] eta 0:02:09 lr 0.000023 wd 0.0500 time 0.2362 (0.2443) data time 0.0011 (0.0021) model time 0.2351 (0.2419) loss 3.1488 (2.6831) grad_norm 3.8663 (5.9033) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][730/1251] eta 0:02:07 lr 0.000023 wd 0.0500 time 0.4427 (0.2445) data time 0.0013 (0.0020) model time 0.4414 (0.2421) loss 2.8291 (2.6828) grad_norm 4.6806 (5.8985) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][740/1251] eta 0:02:04 lr 0.000023 wd 0.0500 time 0.2471 (0.2445) data time 0.0010 (0.0020) model time 0.2461 (0.2421) loss 2.5279 (2.6848) grad_norm 5.2100 (5.8874) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][750/1251] eta 0:02:02 lr 0.000023 wd 0.0500 time 0.2491 (0.2444) data time 0.0008 (0.0020) model time 0.2482 (0.2421) loss 3.5350 (2.6892) grad_norm 6.1846 (5.8782) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][760/1251] eta 0:01:59 lr 0.000023 wd 0.0500 time 0.2478 (0.2444) data time 0.0007 (0.0020) model time 0.2471 (0.2421) loss 2.9564 (2.6926) grad_norm 4.8381 (5.8797) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][770/1251] eta 0:01:57 lr 0.000023 wd 0.0500 time 0.2507 (0.2443) data time 0.0009 (0.0020) model time 0.2498 (0.2421) loss 3.3251 (2.6900) grad_norm 27.5845 (5.9254) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][780/1251] eta 0:01:55 lr 0.000023 wd 0.0500 time 0.2330 (0.2443) data time 0.0009 (0.0020) model 
time 0.2321 (0.2420) loss 2.6574 (2.6920) grad_norm 4.7400 (5.9220) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][790/1251] eta 0:01:52 lr 0.000023 wd 0.0500 time 0.2453 (0.2442) data time 0.0008 (0.0020) model time 0.2445 (0.2420) loss 2.0221 (2.6922) grad_norm 6.4047 (5.9150) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:10:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][800/1251] eta 0:01:50 lr 0.000023 wd 0.0500 time 0.2408 (0.2442) data time 0.0007 (0.0019) model time 0.2400 (0.2420) loss 2.7704 (2.6922) grad_norm 4.6781 (inf) loss_scale 128.0000 (254.5618) mem 7381MB [2024-09-01 10:10:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][810/1251] eta 0:01:47 lr 0.000023 wd 0.0500 time 0.2430 (0.2442) data time 0.0008 (0.0019) model time 0.2422 (0.2419) loss 2.9645 (2.6943) grad_norm 3.9793 (inf) loss_scale 128.0000 (253.0012) mem 7381MB [2024-09-01 10:10:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][820/1251] eta 0:01:45 lr 0.000023 wd 0.0500 time 0.2402 (0.2441) data time 0.0007 (0.0019) model time 0.2395 (0.2419) loss 2.3279 (2.6934) grad_norm 4.6234 (inf) loss_scale 128.0000 (251.4787) mem 7381MB [2024-09-01 10:10:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][830/1251] eta 0:01:42 lr 0.000023 wd 0.0500 time 0.2399 (0.2441) data time 0.0010 (0.0019) model time 0.2389 (0.2419) loss 3.2019 (2.6969) grad_norm 5.8150 (inf) loss_scale 128.0000 (249.9928) mem 7381MB [2024-09-01 10:10:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][840/1251] eta 0:01:40 lr 0.000023 wd 0.0500 time 0.2388 (0.2441) data time 0.0010 (0.0019) model time 0.2378 (0.2419) loss 3.0436 (2.6979) grad_norm 4.8935 (inf) loss_scale 128.0000 (248.5422) mem 7381MB [2024-09-01 10:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][850/1251] eta 0:01:37 lr 
0.000023 wd 0.0500 time 0.2344 (0.2440) data time 0.0007 (0.0019) model time 0.2337 (0.2418) loss 2.9157 (2.6977) grad_norm 5.4274 (inf) loss_scale 128.0000 (247.1257) mem 7381MB [2024-09-01 10:10:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][860/1251] eta 0:01:35 lr 0.000023 wd 0.0500 time 0.2396 (0.2440) data time 0.0010 (0.0019) model time 0.2386 (0.2418) loss 2.9444 (2.6986) grad_norm 4.3273 (inf) loss_scale 128.0000 (245.7422) mem 7381MB [2024-09-01 10:10:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][870/1251] eta 0:01:32 lr 0.000023 wd 0.0500 time 0.2456 (0.2439) data time 0.0009 (0.0019) model time 0.2447 (0.2418) loss 2.0649 (2.6971) grad_norm 14.2716 (inf) loss_scale 128.0000 (244.3904) mem 7381MB [2024-09-01 10:10:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][880/1251] eta 0:01:30 lr 0.000023 wd 0.0500 time 0.2432 (0.2440) data time 0.0008 (0.0019) model time 0.2424 (0.2418) loss 1.7469 (2.6966) grad_norm 5.8437 (inf) loss_scale 128.0000 (243.0692) mem 7381MB [2024-09-01 10:10:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][890/1251] eta 0:01:28 lr 0.000023 wd 0.0500 time 0.2366 (0.2439) data time 0.0009 (0.0018) model time 0.2357 (0.2418) loss 2.5610 (2.6995) grad_norm 3.9220 (inf) loss_scale 128.0000 (241.7778) mem 7381MB [2024-09-01 10:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][900/1251] eta 0:01:25 lr 0.000023 wd 0.0500 time 0.2349 (0.2439) data time 0.0010 (0.0018) model time 0.2339 (0.2418) loss 2.5069 (2.6973) grad_norm 6.7483 (inf) loss_scale 128.0000 (240.5150) mem 7381MB [2024-09-01 10:10:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][910/1251] eta 0:01:23 lr 0.000023 wd 0.0500 time 0.2441 (0.2439) data time 0.0007 (0.0018) model time 0.2433 (0.2418) loss 3.3379 (2.6994) grad_norm 6.7403 (inf) loss_scale 128.0000 (239.2799) mem 7381MB [2024-09-01 10:10:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [279/300][920/1251] eta 0:01:20 lr 0.000023 wd 0.0500 time 0.2535 (0.2439) data time 0.0008 (0.0018) model time 0.2527 (0.2418) loss 2.0260 (2.6995) grad_norm 4.3038 (inf) loss_scale 128.0000 (238.0717) mem 7381MB [2024-09-01 10:10:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][930/1251] eta 0:01:18 lr 0.000023 wd 0.0500 time 0.2486 (0.2439) data time 0.0009 (0.0018) model time 0.2477 (0.2418) loss 3.3075 (2.6975) grad_norm 3.8684 (inf) loss_scale 128.0000 (236.8894) mem 7381MB [2024-09-01 10:10:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][940/1251] eta 0:01:15 lr 0.000023 wd 0.0500 time 0.2423 (0.2438) data time 0.0007 (0.0018) model time 0.2415 (0.2418) loss 2.3105 (2.6979) grad_norm 7.5119 (inf) loss_scale 128.0000 (235.7322) mem 7381MB [2024-09-01 10:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][950/1251] eta 0:01:13 lr 0.000023 wd 0.0500 time 0.2392 (0.2438) data time 0.0008 (0.0018) model time 0.2384 (0.2418) loss 2.9231 (2.6975) grad_norm 7.4058 (inf) loss_scale 128.0000 (234.5994) mem 7381MB [2024-09-01 10:11:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][960/1251] eta 0:01:10 lr 0.000023 wd 0.0500 time 0.2415 (0.2438) data time 0.0011 (0.0018) model time 0.2404 (0.2417) loss 2.1932 (2.6970) grad_norm 6.8526 (inf) loss_scale 128.0000 (233.4901) mem 7381MB [2024-09-01 10:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][970/1251] eta 0:01:08 lr 0.000023 wd 0.0500 time 0.2368 (0.2437) data time 0.0012 (0.0018) model time 0.2356 (0.2417) loss 2.8129 (2.6978) grad_norm 5.2959 (inf) loss_scale 128.0000 (232.4037) mem 7381MB [2024-09-01 10:11:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][980/1251] eta 0:01:06 lr 0.000023 wd 0.0500 time 0.2469 (0.2437) data time 0.0007 (0.0018) model time 0.2462 (0.2417) loss 2.1043 (2.6950) grad_norm 6.8146 (inf) loss_scale 
128.0000 (231.3394) mem 7381MB [2024-09-01 10:11:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][990/1251] eta 0:01:03 lr 0.000023 wd 0.0500 time 0.2571 (0.2438) data time 0.0008 (0.0018) model time 0.2564 (0.2417) loss 2.8635 (2.6960) grad_norm 5.4858 (inf) loss_scale 128.0000 (230.2967) mem 7381MB [2024-09-01 10:11:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1000/1251] eta 0:01:01 lr 0.000023 wd 0.0500 time 0.2358 (0.2437) data time 0.0008 (0.0018) model time 0.2349 (0.2417) loss 1.8835 (2.6957) grad_norm 6.9638 (inf) loss_scale 128.0000 (229.2747) mem 7381MB [2024-09-01 10:11:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1010/1251] eta 0:00:58 lr 0.000023 wd 0.0500 time 0.2382 (0.2437) data time 0.0008 (0.0017) model time 0.2373 (0.2417) loss 1.8806 (2.6946) grad_norm 4.7221 (inf) loss_scale 128.0000 (228.2730) mem 7381MB [2024-09-01 10:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1020/1251] eta 0:00:56 lr 0.000023 wd 0.0500 time 0.2460 (0.2437) data time 0.0007 (0.0017) model time 0.2453 (0.2417) loss 2.8357 (2.6952) grad_norm 4.9604 (inf) loss_scale 128.0000 (227.2909) mem 7381MB [2024-09-01 10:11:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1030/1251] eta 0:00:53 lr 0.000023 wd 0.0500 time 0.2285 (0.2438) data time 0.0010 (0.0017) model time 0.2275 (0.2418) loss 2.8512 (2.6937) grad_norm 5.9935 (inf) loss_scale 128.0000 (226.3278) mem 7381MB [2024-09-01 10:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1040/1251] eta 0:00:51 lr 0.000023 wd 0.0500 time 0.2394 (0.2438) data time 0.0007 (0.0017) model time 0.2387 (0.2418) loss 2.2917 (2.6930) grad_norm 3.8520 (inf) loss_scale 128.0000 (225.3833) mem 7381MB [2024-09-01 10:11:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1050/1251] eta 0:00:48 lr 0.000023 wd 0.0500 time 0.2414 (0.2438) data time 0.0009 (0.0017) model time 
0.2405 (0.2418) loss 3.0711 (2.6956) grad_norm 5.6830 (inf) loss_scale 128.0000 (224.4567) mem 7381MB [2024-09-01 10:11:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1060/1251] eta 0:00:46 lr 0.000023 wd 0.0500 time 0.2388 (0.2437) data time 0.0009 (0.0017) model time 0.2379 (0.2417) loss 2.8032 (2.6949) grad_norm 7.1305 (inf) loss_scale 128.0000 (223.5476) mem 7381MB [2024-09-01 10:11:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1070/1251] eta 0:00:44 lr 0.000023 wd 0.0500 time 0.2338 (0.2437) data time 0.0009 (0.0017) model time 0.2329 (0.2417) loss 2.0662 (2.6961) grad_norm 3.9266 (inf) loss_scale 128.0000 (222.6555) mem 7381MB [2024-09-01 10:11:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1080/1251] eta 0:00:41 lr 0.000023 wd 0.0500 time 0.2427 (0.2437) data time 0.0010 (0.0017) model time 0.2417 (0.2417) loss 2.9232 (2.6934) grad_norm 7.9392 (inf) loss_scale 128.0000 (221.7798) mem 7381MB [2024-09-01 10:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1090/1251] eta 0:00:39 lr 0.000023 wd 0.0500 time 0.2374 (0.2436) data time 0.0009 (0.0017) model time 0.2365 (0.2417) loss 2.6718 (2.6951) grad_norm 4.2108 (inf) loss_scale 128.0000 (220.9203) mem 7381MB [2024-09-01 10:11:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1100/1251] eta 0:00:36 lr 0.000023 wd 0.0500 time 0.2329 (0.2438) data time 0.0008 (0.0017) model time 0.2320 (0.2419) loss 3.3035 (2.6948) grad_norm 4.4661 (inf) loss_scale 128.0000 (220.0763) mem 7381MB [2024-09-01 10:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1110/1251] eta 0:00:34 lr 0.000023 wd 0.0500 time 0.2363 (0.2438) data time 0.0009 (0.0017) model time 0.2354 (0.2419) loss 2.3688 (2.6951) grad_norm 4.3320 (inf) loss_scale 128.0000 (219.2475) mem 7381MB [2024-09-01 10:11:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1120/1251] eta 0:00:31 lr 
0.000023 wd 0.0500 time 0.2369 (0.2438) data time 0.0010 (0.0017) model time 0.2359 (0.2419) loss 2.2876 (2.6929) grad_norm 11.7051 (inf) loss_scale 128.0000 (218.4335) mem 7381MB [2024-09-01 10:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1130/1251] eta 0:00:29 lr 0.000023 wd 0.0500 time 0.2367 (0.2437) data time 0.0007 (0.0017) model time 0.2360 (0.2419) loss 3.2812 (2.6940) grad_norm 4.6103 (inf) loss_scale 128.0000 (217.6340) mem 7381MB [2024-09-01 10:11:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1140/1251] eta 0:00:27 lr 0.000023 wd 0.0500 time 0.2404 (0.2437) data time 0.0008 (0.0017) model time 0.2396 (0.2418) loss 1.7185 (2.6941) grad_norm 5.5572 (inf) loss_scale 128.0000 (216.8484) mem 7381MB [2024-09-01 10:11:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1150/1251] eta 0:00:24 lr 0.000023 wd 0.0500 time 0.2390 (0.2437) data time 0.0010 (0.0017) model time 0.2380 (0.2418) loss 2.5784 (2.6944) grad_norm 4.4600 (inf) loss_scale 128.0000 (216.0765) mem 7381MB [2024-09-01 10:11:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1160/1251] eta 0:00:22 lr 0.000023 wd 0.0500 time 0.2418 (0.2436) data time 0.0007 (0.0016) model time 0.2412 (0.2418) loss 2.8894 (2.6947) grad_norm 5.1993 (inf) loss_scale 128.0000 (215.3178) mem 7381MB [2024-09-01 10:11:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1170/1251] eta 0:00:19 lr 0.000022 wd 0.0500 time 0.2379 (0.2436) data time 0.0008 (0.0016) model time 0.2371 (0.2417) loss 2.6359 (2.6946) grad_norm 3.6744 (inf) loss_scale 128.0000 (214.5722) mem 7381MB [2024-09-01 10:11:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1180/1251] eta 0:00:17 lr 0.000022 wd 0.0500 time 0.2387 (0.2436) data time 0.0011 (0.0016) model time 0.2375 (0.2417) loss 2.3136 (2.6930) grad_norm 4.8108 (inf) loss_scale 128.0000 (213.8391) mem 7381MB [2024-09-01 10:12:00 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [279/300][1190/1251] eta 0:00:14 lr 0.000022 wd 0.0500 time 0.2362 (0.2435) data time 0.0008 (0.0016) model time 0.2354 (0.2417) loss 2.9164 (2.6925) grad_norm 5.1690 (inf) loss_scale 128.0000 (213.1184) mem 7381MB [2024-09-01 10:12:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1200/1251] eta 0:00:12 lr 0.000022 wd 0.0500 time 0.2364 (0.2435) data time 0.0008 (0.0016) model time 0.2356 (0.2417) loss 2.9576 (2.6909) grad_norm 5.8671 (inf) loss_scale 128.0000 (212.4097) mem 7381MB [2024-09-01 10:12:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1210/1251] eta 0:00:09 lr 0.000022 wd 0.0500 time 0.2355 (0.2435) data time 0.0009 (0.0016) model time 0.2347 (0.2417) loss 3.0778 (2.6917) grad_norm 5.0541 (inf) loss_scale 128.0000 (211.7126) mem 7381MB [2024-09-01 10:12:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1220/1251] eta 0:00:07 lr 0.000022 wd 0.0500 time 0.2330 (0.2435) data time 0.0009 (0.0016) model time 0.2321 (0.2417) loss 2.9580 (2.6925) grad_norm 3.8984 (inf) loss_scale 128.0000 (211.0270) mem 7381MB [2024-09-01 10:12:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1230/1251] eta 0:00:05 lr 0.000022 wd 0.0500 time 0.2341 (0.2435) data time 0.0007 (0.0016) model time 0.2334 (0.2417) loss 2.7103 (2.6937) grad_norm 4.3300 (inf) loss_scale 128.0000 (210.3526) mem 7381MB [2024-09-01 10:12:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1240/1251] eta 0:00:02 lr 0.000022 wd 0.0500 time 0.2255 (0.2434) data time 0.0007 (0.0016) model time 0.2248 (0.2416) loss 2.9797 (2.6959) grad_norm 4.7924 (inf) loss_scale 128.0000 (209.6890) mem 7381MB [2024-09-01 10:12:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [279/300][1250/1251] eta 0:00:00 lr 0.000022 wd 0.0500 time 0.2127 (0.2434) data time 0.0005 (0.0016) model time 0.2122 (0.2416) loss 2.0641 (2.6968) grad_norm 4.6275 (inf) loss_scale 
128.0000 (209.0360) mem 7381MB [2024-09-01 10:12:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 279 training takes 0:05:04 [2024-09-01 10:12:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:12:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.452 (0.452) Loss 0.3911 (0.3911) Acc@1 93.262 (93.262) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:12:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.110) Loss 0.5723 (0.6123) Acc@1 90.137 (87.598) Acc@5 97.949 (97.665) Mem 7381MB [2024-09-01 10:12:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.095) Loss 0.9263 (0.6424) Acc@1 77.246 (86.491) Acc@5 95.117 (97.610) Mem 7381MB [2024-09-01 10:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.089) Loss 1.1562 (0.7370) Acc@1 73.730 (84.318) Acc@5 92.676 (96.661) Mem 7381MB [2024-09-01 10:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0176 (0.7861) Acc@1 76.855 (83.120) Acc@5 93.945 (96.165) Mem 7381MB [2024-09-01 10:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.720 Acc@5 96.142 [2024-09-01 10:12:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 10:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.790 (0.790) Loss 0.3884 (0.3884) Acc@1 92.969 (92.969) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 10:12:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.148) Loss 0.5649 (0.6067) Acc@1 90.527 (87.926) Acc@5 98.242 (97.843) Mem 7381MB [2024-09-01 10:12:21 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.115) Loss 0.9062 (0.6370) Acc@1 77.734 (86.742) Acc@5 95.703 (97.763) Mem 7381MB [2024-09-01 10:12:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.103) Loss 1.1318 (0.7295) Acc@1 74.707 (84.564) Acc@5 92.773 (96.799) Mem 7381MB [2024-09-01 10:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.0088 (0.7775) Acc@1 76.953 (83.360) Acc@5 94.531 (96.320) Mem 7381MB [2024-09-01 10:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.944 Acc@5 96.272 [2024-09-01 10:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:12:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][0/1251] eta 0:21:58 lr 0.000022 wd 0.0500 time 1.0541 (1.0541) data time 0.6367 (0.6367) model time 0.0000 (0.0000) loss 2.7431 (2.7431) grad_norm 5.5221 (5.5221) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][10/1251] eta 0:06:35 lr 0.000022 wd 0.0500 time 0.2457 (0.3187) data time 0.0007 (0.0588) model time 0.0000 (0.0000) loss 3.2099 (2.7952) grad_norm 5.5650 (5.3235) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][20/1251] eta 0:05:46 lr 0.000022 wd 0.0500 time 0.2348 (0.2814) data time 0.0007 (0.0312) model time 0.0000 (0.0000) loss 2.6816 (2.5565) grad_norm 5.8492 (5.5064) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][30/1251] eta 0:05:28 lr 0.000022 wd 0.0500 time 0.2404 (0.2687) data time 0.0009 (0.0214) model time 0.0000 (0.0000) loss 3.1844 (2.6609) grad_norm 4.9278 (5.4495) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [280/300][40/1251] eta 0:05:17 lr 0.000022 wd 0.0500 time 0.2374 (0.2624) data time 0.0007 (0.0165) model time 0.0000 (0.0000) loss 3.0911 (2.7344) grad_norm 6.2384 (5.8676) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][50/1251] eta 0:05:20 lr 0.000022 wd 0.0500 time 0.2366 (0.2667) data time 0.0009 (0.0134) model time 0.0000 (0.0000) loss 2.7819 (2.6997) grad_norm 5.2783 (5.7790) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][60/1251] eta 0:05:12 lr 0.000022 wd 0.0500 time 0.2431 (0.2623) data time 0.0007 (0.0114) model time 0.2424 (0.2388) loss 3.4213 (2.7113) grad_norm 5.3807 (5.7557) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][70/1251] eta 0:05:06 lr 0.000022 wd 0.0500 time 0.2461 (0.2592) data time 0.0007 (0.0099) model time 0.2454 (0.2390) loss 2.2051 (2.7112) grad_norm 4.4667 (5.7104) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][80/1251] eta 0:05:01 lr 0.000022 wd 0.0500 time 0.2403 (0.2571) data time 0.0011 (0.0088) model time 0.2393 (0.2397) loss 3.0730 (2.7142) grad_norm 6.4762 (5.7753) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][90/1251] eta 0:04:56 lr 0.000022 wd 0.0500 time 0.2346 (0.2552) data time 0.0009 (0.0079) model time 0.2336 (0.2396) loss 2.8594 (2.7040) grad_norm 6.3267 (6.2507) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][100/1251] eta 0:04:51 lr 0.000022 wd 0.0500 time 0.2411 (0.2537) data time 0.0008 (0.0073) model time 0.2403 (0.2393) loss 2.6410 (2.7100) grad_norm 3.9234 (6.1973) loss_scale 128.0000 (128.0000) mem 
7381MB [2024-09-01 10:12:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][110/1251] eta 0:04:48 lr 0.000022 wd 0.0500 time 0.2428 (0.2526) data time 0.0011 (0.0067) model time 0.2417 (0.2396) loss 2.7833 (2.7295) grad_norm 5.0455 (6.1243) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][120/1251] eta 0:04:44 lr 0.000022 wd 0.0500 time 0.2435 (0.2516) data time 0.0007 (0.0062) model time 0.2427 (0.2395) loss 3.3307 (2.7288) grad_norm 4.8289 (6.0430) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][130/1251] eta 0:04:41 lr 0.000022 wd 0.0500 time 0.2391 (0.2509) data time 0.0008 (0.0058) model time 0.2383 (0.2398) loss 2.9482 (2.7240) grad_norm 5.8078 (5.9942) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][140/1251] eta 0:04:37 lr 0.000022 wd 0.0500 time 0.2504 (0.2502) data time 0.0009 (0.0055) model time 0.2495 (0.2399) loss 1.5472 (2.7175) grad_norm 7.8250 (6.0184) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][150/1251] eta 0:04:34 lr 0.000022 wd 0.0500 time 0.2462 (0.2497) data time 0.0009 (0.0052) model time 0.2452 (0.2400) loss 2.9204 (2.7129) grad_norm 4.0281 (5.9635) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][160/1251] eta 0:04:31 lr 0.000022 wd 0.0500 time 0.2308 (0.2491) data time 0.0012 (0.0049) model time 0.2297 (0.2400) loss 2.9306 (2.7260) grad_norm 4.9769 (6.0439) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][170/1251] eta 0:04:28 lr 0.000022 wd 0.0500 time 0.2444 (0.2488) data time 0.0007 (0.0047) model time 0.2437 
(0.2401) loss 3.5031 (2.7177) grad_norm 4.5149 (6.0384) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][180/1251] eta 0:04:26 lr 0.000022 wd 0.0500 time 0.2390 (0.2485) data time 0.0009 (0.0045) model time 0.2381 (0.2404) loss 2.6230 (2.7180) grad_norm 4.1547 (5.9984) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][190/1251] eta 0:04:23 lr 0.000022 wd 0.0500 time 0.2382 (0.2481) data time 0.0011 (0.0043) model time 0.2371 (0.2404) loss 1.7702 (2.7206) grad_norm 5.0321 (5.9720) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][200/1251] eta 0:04:20 lr 0.000022 wd 0.0500 time 0.2365 (0.2477) data time 0.0010 (0.0041) model time 0.2356 (0.2403) loss 3.1259 (2.7273) grad_norm 6.9720 (5.9254) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][210/1251] eta 0:04:17 lr 0.000022 wd 0.0500 time 0.2385 (0.2474) data time 0.0007 (0.0040) model time 0.2378 (0.2402) loss 2.0895 (2.7245) grad_norm 8.3168 (5.8723) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][220/1251] eta 0:04:14 lr 0.000022 wd 0.0500 time 0.2494 (0.2471) data time 0.0007 (0.0038) model time 0.2487 (0.2402) loss 2.2842 (2.7143) grad_norm 4.0872 (5.8637) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][230/1251] eta 0:04:12 lr 0.000022 wd 0.0500 time 0.2417 (0.2470) data time 0.0010 (0.0037) model time 0.2407 (0.2404) loss 2.3777 (2.7144) grad_norm 5.4798 (5.8363) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][240/1251] eta 0:04:09 
lr 0.000022 wd 0.0500 time 0.2413 (0.2467) data time 0.0007 (0.0036) model time 0.2406 (0.2404) loss 1.9863 (2.7106) grad_norm 5.4188 (5.8152) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][250/1251] eta 0:04:06 lr 0.000022 wd 0.0500 time 0.2347 (0.2465) data time 0.0009 (0.0035) model time 0.2338 (0.2403) loss 3.1569 (2.7176) grad_norm 5.9985 (5.8109) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][260/1251] eta 0:04:04 lr 0.000022 wd 0.0500 time 0.2410 (0.2462) data time 0.0008 (0.0034) model time 0.2401 (0.2403) loss 1.9040 (2.7160) grad_norm 3.9543 (5.7718) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][270/1251] eta 0:04:01 lr 0.000022 wd 0.0500 time 0.2421 (0.2460) data time 0.0010 (0.0033) model time 0.2411 (0.2402) loss 2.8245 (2.7201) grad_norm 4.7634 (5.7572) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][280/1251] eta 0:03:58 lr 0.000022 wd 0.0500 time 0.2416 (0.2458) data time 0.0009 (0.0032) model time 0.2407 (0.2401) loss 3.0529 (2.7296) grad_norm 4.1463 (5.7625) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][290/1251] eta 0:03:56 lr 0.000022 wd 0.0500 time 0.2371 (0.2456) data time 0.0010 (0.0032) model time 0.2362 (0.2402) loss 2.6246 (2.7271) grad_norm 6.1389 (5.7907) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][300/1251] eta 0:03:53 lr 0.000022 wd 0.0500 time 0.2300 (0.2455) data time 0.0013 (0.0031) model time 0.2287 (0.2402) loss 2.6845 (2.7245) grad_norm 4.8973 (6.0575) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:40 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][310/1251] eta 0:03:50 lr 0.000022 wd 0.0500 time 0.2479 (0.2454) data time 0.0010 (0.0030) model time 0.2469 (0.2403) loss 2.8212 (2.7172) grad_norm 9.5202 (6.0491) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][320/1251] eta 0:03:48 lr 0.000022 wd 0.0500 time 0.2415 (0.2453) data time 0.0009 (0.0030) model time 0.2406 (0.2403) loss 2.4310 (2.7155) grad_norm 4.3344 (6.0276) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][330/1251] eta 0:03:45 lr 0.000022 wd 0.0500 time 0.2379 (0.2452) data time 0.0009 (0.0029) model time 0.2370 (0.2403) loss 3.4667 (2.7145) grad_norm 6.4690 (6.0341) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][340/1251] eta 0:03:43 lr 0.000022 wd 0.0500 time 0.2478 (0.2451) data time 0.0009 (0.0028) model time 0.2469 (0.2404) loss 2.8614 (2.7131) grad_norm 5.5136 (6.0079) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][350/1251] eta 0:03:40 lr 0.000022 wd 0.0500 time 0.2386 (0.2451) data time 0.0009 (0.0028) model time 0.2377 (0.2404) loss 2.3225 (2.7101) grad_norm 8.1033 (6.0000) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][360/1251] eta 0:03:38 lr 0.000022 wd 0.0500 time 0.2377 (0.2450) data time 0.0009 (0.0027) model time 0.2368 (0.2404) loss 1.7279 (2.7095) grad_norm 4.0668 (5.9978) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][370/1251] eta 0:03:36 lr 0.000022 wd 0.0500 time 0.2472 (0.2455) data time 0.0009 (0.0027) model time 0.2464 (0.2412) loss 3.3315 (2.7124) 
grad_norm 4.5717 (6.3821) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][380/1251] eta 0:03:33 lr 0.000022 wd 0.0500 time 0.2414 (0.2454) data time 0.0010 (0.0026) model time 0.2404 (0.2411) loss 2.3863 (2.7093) grad_norm 6.5898 (6.3650) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:13:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][390/1251] eta 0:03:31 lr 0.000022 wd 0.0500 time 0.2455 (0.2452) data time 0.0007 (0.0026) model time 0.2447 (0.2410) loss 2.2274 (2.7074) grad_norm 3.6661 (6.3430) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][400/1251] eta 0:03:28 lr 0.000022 wd 0.0500 time 0.2404 (0.2452) data time 0.0008 (0.0026) model time 0.2396 (0.2410) loss 2.1971 (2.7072) grad_norm 9.8977 (6.3227) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][410/1251] eta 0:03:26 lr 0.000022 wd 0.0500 time 0.2410 (0.2451) data time 0.0008 (0.0025) model time 0.2402 (0.2410) loss 1.5044 (2.7035) grad_norm 4.1646 (6.3267) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][420/1251] eta 0:03:23 lr 0.000022 wd 0.0500 time 0.2554 (0.2450) data time 0.0007 (0.0025) model time 0.2548 (0.2411) loss 2.8494 (2.7049) grad_norm 3.7601 (6.3030) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][430/1251] eta 0:03:21 lr 0.000022 wd 0.0500 time 0.2339 (0.2449) data time 0.0008 (0.0025) model time 0.2331 (0.2410) loss 2.5023 (2.7057) grad_norm 5.8126 (6.2755) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][440/1251] eta 0:03:18 lr 0.000022 wd 0.0500 time 
0.2428 (0.2449) data time 0.0008 (0.0024) model time 0.2420 (0.2410) loss 3.1982 (2.7082) grad_norm 7.2198 (6.2627) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][450/1251] eta 0:03:16 lr 0.000022 wd 0.0500 time 0.2466 (0.2448) data time 0.0010 (0.0024) model time 0.2457 (0.2410) loss 2.5005 (2.7077) grad_norm 4.7810 (6.2585) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][460/1251] eta 0:03:13 lr 0.000022 wd 0.0500 time 0.2456 (0.2447) data time 0.0008 (0.0024) model time 0.2448 (0.2410) loss 3.2042 (2.7031) grad_norm 4.3949 (6.2413) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][470/1251] eta 0:03:11 lr 0.000022 wd 0.0500 time 0.2389 (0.2447) data time 0.0011 (0.0023) model time 0.2377 (0.2410) loss 2.1261 (2.7010) grad_norm 4.0278 (6.2213) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][480/1251] eta 0:03:08 lr 0.000022 wd 0.0500 time 0.2445 (0.2447) data time 0.0008 (0.0023) model time 0.2436 (0.2411) loss 3.0400 (2.7038) grad_norm 4.6545 (6.2090) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][490/1251] eta 0:03:06 lr 0.000022 wd 0.0500 time 0.2448 (0.2445) data time 0.0008 (0.0023) model time 0.2440 (0.2410) loss 2.3109 (2.7101) grad_norm 4.1413 (6.1813) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][500/1251] eta 0:03:03 lr 0.000022 wd 0.0500 time 0.2457 (0.2446) data time 0.0007 (0.0022) model time 0.2449 (0.2411) loss 2.7075 (2.7071) grad_norm 5.6767 (6.1861) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:29 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [280/300][510/1251] eta 0:03:01 lr 0.000022 wd 0.0500 time 0.2497 (0.2449) data time 0.0010 (0.0022) model time 0.2487 (0.2415) loss 3.2206 (2.7066) grad_norm 3.8090 (6.2102) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][520/1251] eta 0:02:58 lr 0.000022 wd 0.0500 time 0.2383 (0.2448) data time 0.0007 (0.0022) model time 0.2376 (0.2414) loss 2.6208 (2.7019) grad_norm 6.9856 (6.2250) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][530/1251] eta 0:02:56 lr 0.000022 wd 0.0500 time 0.2499 (0.2448) data time 0.0009 (0.0022) model time 0.2491 (0.2414) loss 2.1783 (2.7003) grad_norm 4.2045 (6.2144) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][540/1251] eta 0:02:53 lr 0.000022 wd 0.0500 time 0.2485 (0.2447) data time 0.0007 (0.0022) model time 0.2478 (0.2414) loss 1.7508 (2.6963) grad_norm 4.4653 (6.1981) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][550/1251] eta 0:02:51 lr 0.000022 wd 0.0500 time 0.2455 (0.2450) data time 0.0010 (0.0021) model time 0.2445 (0.2418) loss 2.8860 (2.6955) grad_norm 4.4091 (6.1910) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][560/1251] eta 0:02:49 lr 0.000022 wd 0.0500 time 0.2411 (0.2450) data time 0.0009 (0.0021) model time 0.2403 (0.2418) loss 2.5420 (2.6954) grad_norm 6.6224 (6.2019) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][570/1251] eta 0:02:46 lr 0.000022 wd 0.0500 time 0.2403 (0.2449) data time 0.0008 (0.0021) model time 0.2396 (0.2417) loss 2.3842 (2.7001) grad_norm 4.7697 
(6.2105) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][580/1251] eta 0:02:44 lr 0.000022 wd 0.0500 time 0.2436 (0.2448) data time 0.0009 (0.0021) model time 0.2427 (0.2417) loss 2.7334 (2.6997) grad_norm 3.6456 (6.2150) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][590/1251] eta 0:02:41 lr 0.000022 wd 0.0500 time 0.2481 (0.2447) data time 0.0010 (0.0021) model time 0.2471 (0.2417) loss 2.3479 (2.6954) grad_norm 4.5366 (6.1988) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][600/1251] eta 0:02:39 lr 0.000022 wd 0.0500 time 0.2440 (0.2447) data time 0.0007 (0.0020) model time 0.2433 (0.2417) loss 1.9120 (2.6966) grad_norm 4.3708 (6.1885) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][610/1251] eta 0:02:36 lr 0.000022 wd 0.0500 time 0.2390 (0.2446) data time 0.0011 (0.0020) model time 0.2379 (0.2416) loss 2.8696 (2.6976) grad_norm 7.7804 (6.1717) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][620/1251] eta 0:02:34 lr 0.000022 wd 0.0500 time 0.2407 (0.2445) data time 0.0009 (0.0020) model time 0.2398 (0.2416) loss 2.2925 (2.6906) grad_norm 6.9612 (6.1647) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:14:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][630/1251] eta 0:02:31 lr 0.000022 wd 0.0500 time 0.2416 (0.2445) data time 0.0010 (0.0020) model time 0.2407 (0.2415) loss 3.0375 (2.6915) grad_norm 3.2660 (6.1519) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][640/1251] eta 0:02:29 lr 0.000022 wd 0.0500 time 0.2450 (0.2444) data 
time 0.0009 (0.0020) model time 0.2441 (0.2415) loss 2.5065 (2.6880) grad_norm 5.9922 (6.1386) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][650/1251] eta 0:02:26 lr 0.000022 wd 0.0500 time 0.2343 (0.2443) data time 0.0008 (0.0020) model time 0.2335 (0.2414) loss 2.8808 (2.6848) grad_norm 5.4089 (6.1325) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][660/1251] eta 0:02:24 lr 0.000022 wd 0.0500 time 0.2404 (0.2443) data time 0.0010 (0.0019) model time 0.2395 (0.2414) loss 1.9611 (2.6843) grad_norm 7.4347 (6.1264) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][670/1251] eta 0:02:21 lr 0.000022 wd 0.0500 time 0.2477 (0.2443) data time 0.0008 (0.0019) model time 0.2470 (0.2415) loss 2.8100 (2.6873) grad_norm 3.8976 (6.1239) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][680/1251] eta 0:02:19 lr 0.000022 wd 0.0500 time 0.2401 (0.2443) data time 0.0009 (0.0019) model time 0.2391 (0.2415) loss 2.5741 (2.6894) grad_norm 3.8192 (6.1124) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][690/1251] eta 0:02:17 lr 0.000022 wd 0.0500 time 0.2393 (0.2442) data time 0.0010 (0.0019) model time 0.2384 (0.2415) loss 2.5868 (2.6908) grad_norm 8.5663 (6.1085) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][700/1251] eta 0:02:14 lr 0.000022 wd 0.0500 time 0.2425 (0.2442) data time 0.0010 (0.0019) model time 0.2414 (0.2415) loss 2.6150 (2.6907) grad_norm 4.1330 (6.1046) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [280/300][710/1251] eta 0:02:12 lr 0.000022 wd 0.0500 time 0.2465 (0.2442) data time 0.0010 (0.0019) model time 0.2455 (0.2415) loss 2.7657 (2.6925) grad_norm 5.8264 (6.0989) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][720/1251] eta 0:02:09 lr 0.000022 wd 0.0500 time 0.2349 (0.2441) data time 0.0010 (0.0019) model time 0.2339 (0.2414) loss 2.8991 (2.6951) grad_norm 4.0049 (6.0850) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][730/1251] eta 0:02:07 lr 0.000022 wd 0.0500 time 0.2392 (0.2441) data time 0.0012 (0.0019) model time 0.2380 (0.2414) loss 2.8841 (2.6912) grad_norm 4.0214 (6.0766) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][740/1251] eta 0:02:04 lr 0.000022 wd 0.0500 time 0.2334 (0.2440) data time 0.0010 (0.0018) model time 0.2324 (0.2414) loss 3.2020 (2.6895) grad_norm 5.2734 (6.0556) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][750/1251] eta 0:02:02 lr 0.000022 wd 0.0500 time 0.2404 (0.2440) data time 0.0012 (0.0018) model time 0.2393 (0.2413) loss 2.7963 (2.6893) grad_norm 6.0001 (6.0979) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][760/1251] eta 0:01:59 lr 0.000022 wd 0.0500 time 0.2421 (0.2440) data time 0.0009 (0.0018) model time 0.2412 (0.2413) loss 2.8684 (2.6912) grad_norm 5.0079 (6.0983) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][770/1251] eta 0:01:57 lr 0.000022 wd 0.0500 time 0.2350 (0.2439) data time 0.0012 (0.0018) model time 0.2339 (0.2413) loss 2.4845 (2.6886) grad_norm 4.0187 (6.0889) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 10:15:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][780/1251] eta 0:01:54 lr 0.000022 wd 0.0500 time 0.2479 (0.2438) data time 0.0011 (0.0018) model time 0.2469 (0.2412) loss 2.2931 (2.6877) grad_norm 4.4969 (6.0818) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][790/1251] eta 0:01:52 lr 0.000022 wd 0.0500 time 0.2291 (0.2438) data time 0.0009 (0.0018) model time 0.2282 (0.2412) loss 2.9349 (2.6875) grad_norm 4.3970 (6.0782) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][800/1251] eta 0:01:49 lr 0.000022 wd 0.0500 time 0.2413 (0.2438) data time 0.0007 (0.0018) model time 0.2406 (0.2412) loss 2.0872 (2.6866) grad_norm 9.3080 (6.0708) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][810/1251] eta 0:01:47 lr 0.000022 wd 0.0500 time 0.2411 (0.2438) data time 0.0010 (0.0018) model time 0.2401 (0.2412) loss 2.5481 (2.6855) grad_norm 4.7284 (6.0660) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][820/1251] eta 0:01:45 lr 0.000022 wd 0.0500 time 0.2336 (0.2438) data time 0.0009 (0.0018) model time 0.2328 (0.2412) loss 2.8458 (2.6871) grad_norm 4.6648 (6.0637) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][830/1251] eta 0:01:42 lr 0.000022 wd 0.0500 time 0.2397 (0.2437) data time 0.0007 (0.0018) model time 0.2390 (0.2412) loss 2.7012 (2.6892) grad_norm 5.9318 (6.0678) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][840/1251] eta 0:01:40 lr 0.000022 wd 0.0500 time 0.2431 (0.2437) data time 0.0010 (0.0017) model 
time 0.2421 (0.2412) loss 3.1458 (2.6886) grad_norm 8.1730 (6.0607) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][850/1251] eta 0:01:37 lr 0.000022 wd 0.0500 time 0.2382 (0.2437) data time 0.0010 (0.0017) model time 0.2372 (0.2412) loss 3.0118 (2.6865) grad_norm 6.5968 (6.0634) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][860/1251] eta 0:01:35 lr 0.000022 wd 0.0500 time 0.2412 (0.2437) data time 0.0010 (0.0017) model time 0.2403 (0.2412) loss 2.3238 (2.6862) grad_norm 7.8627 (6.0589) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][870/1251] eta 0:01:32 lr 0.000022 wd 0.0500 time 0.2357 (0.2437) data time 0.0009 (0.0017) model time 0.2348 (0.2412) loss 2.6671 (2.6854) grad_norm 4.8236 (6.1180) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:15:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][880/1251] eta 0:01:30 lr 0.000022 wd 0.0500 time 0.2428 (0.2437) data time 0.0007 (0.0017) model time 0.2422 (0.2412) loss 2.6545 (2.6858) grad_norm 4.7934 (6.1639) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][890/1251] eta 0:01:28 lr 0.000022 wd 0.0500 time 0.2313 (0.2439) data time 0.0011 (0.0017) model time 0.2303 (0.2415) loss 2.7868 (2.6871) grad_norm 4.3792 (6.1763) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][900/1251] eta 0:01:25 lr 0.000022 wd 0.0500 time 0.2434 (0.2440) data time 0.0007 (0.0017) model time 0.2427 (0.2417) loss 3.1499 (2.6861) grad_norm 9.2351 (6.1784) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][910/1251] 
eta 0:01:23 lr 0.000022 wd 0.0500 time 0.2363 (0.2440) data time 0.0010 (0.0017) model time 0.2353 (0.2416) loss 1.7172 (2.6843) grad_norm 5.2529 (6.1689) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][920/1251] eta 0:01:20 lr 0.000022 wd 0.0500 time 0.2428 (0.2440) data time 0.0012 (0.0017) model time 0.2416 (0.2416) loss 2.4290 (2.6829) grad_norm 7.5811 (6.1761) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][930/1251] eta 0:01:18 lr 0.000022 wd 0.0500 time 0.2432 (0.2439) data time 0.0010 (0.0017) model time 0.2422 (0.2416) loss 2.7544 (2.6851) grad_norm 5.4234 (6.1754) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][940/1251] eta 0:01:15 lr 0.000021 wd 0.0500 time 0.2418 (0.2439) data time 0.0010 (0.0017) model time 0.2408 (0.2416) loss 2.9599 (2.6846) grad_norm 4.7618 (6.1730) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][950/1251] eta 0:01:13 lr 0.000021 wd 0.0500 time 0.2439 (0.2438) data time 0.0007 (0.0017) model time 0.2432 (0.2416) loss 1.6159 (2.6840) grad_norm 6.4077 (6.1715) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][960/1251] eta 0:01:10 lr 0.000021 wd 0.0500 time 0.2374 (0.2438) data time 0.0010 (0.0016) model time 0.2364 (0.2415) loss 2.6681 (2.6818) grad_norm 4.7346 (6.1615) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][970/1251] eta 0:01:08 lr 0.000021 wd 0.0500 time 0.2426 (0.2438) data time 0.0009 (0.0016) model time 0.2417 (0.2415) loss 2.9319 (2.6818) grad_norm 8.6540 (6.1496) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
10:16:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][980/1251] eta 0:01:06 lr 0.000021 wd 0.0500 time 0.2467 (0.2442) data time 0.0007 (0.0016) model time 0.2459 (0.2419) loss 2.7015 (2.6826) grad_norm 5.6599 (6.1325) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][990/1251] eta 0:01:03 lr 0.000021 wd 0.0500 time 0.2393 (0.2441) data time 0.0010 (0.0016) model time 0.2383 (0.2419) loss 2.9961 (2.6840) grad_norm 4.0764 (6.1224) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1000/1251] eta 0:01:01 lr 0.000021 wd 0.0500 time 0.2374 (0.2441) data time 0.0009 (0.0016) model time 0.2365 (0.2419) loss 2.2943 (2.6821) grad_norm 7.4517 (6.1411) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1010/1251] eta 0:00:58 lr 0.000021 wd 0.0500 time 0.2434 (0.2441) data time 0.0007 (0.0016) model time 0.2428 (0.2419) loss 1.9651 (2.6826) grad_norm 4.5331 (6.1299) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1020/1251] eta 0:00:56 lr 0.000021 wd 0.0500 time 0.2362 (0.2440) data time 0.0009 (0.0016) model time 0.2353 (0.2419) loss 2.8933 (2.6833) grad_norm 4.7760 (6.1508) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1030/1251] eta 0:00:53 lr 0.000021 wd 0.0500 time 0.2376 (0.2441) data time 0.0011 (0.0016) model time 0.2365 (0.2420) loss 2.3061 (2.6851) grad_norm 6.4811 (6.1506) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1040/1251] eta 0:00:51 lr 0.000021 wd 0.0500 time 0.2400 (0.2441) data time 0.0010 (0.0016) model time 0.2390 (0.2419) loss 
1.8547 (2.6850) grad_norm 5.0419 (6.1383) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1050/1251] eta 0:00:49 lr 0.000021 wd 0.0500 time 0.2459 (0.2441) data time 0.0007 (0.0016) model time 0.2451 (0.2419) loss 2.1602 (2.6832) grad_norm 4.4888 (6.1236) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1060/1251] eta 0:00:46 lr 0.000021 wd 0.0500 time 0.2467 (0.2441) data time 0.0007 (0.0016) model time 0.2460 (0.2419) loss 2.9939 (2.6835) grad_norm 9.5449 (6.1270) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1070/1251] eta 0:00:44 lr 0.000021 wd 0.0500 time 0.2416 (0.2440) data time 0.0008 (0.0016) model time 0.2409 (0.2419) loss 3.0564 (2.6840) grad_norm 6.2174 (6.1189) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1080/1251] eta 0:00:41 lr 0.000021 wd 0.0500 time 0.2405 (0.2440) data time 0.0010 (0.0016) model time 0.2394 (0.2419) loss 2.2881 (2.6833) grad_norm 6.5427 (6.1213) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1090/1251] eta 0:00:39 lr 0.000021 wd 0.0500 time 0.2428 (0.2442) data time 0.0011 (0.0016) model time 0.2418 (0.2421) loss 2.7946 (2.6843) grad_norm 4.0781 (6.1236) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1100/1251] eta 0:00:36 lr 0.000021 wd 0.0500 time 0.2482 (0.2442) data time 0.0012 (0.0016) model time 0.2470 (0.2421) loss 2.9151 (2.6834) grad_norm 4.4700 (6.1221) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1110/1251] eta 0:00:34 lr 
0.000021 wd 0.0500 time 0.2426 (0.2442) data time 0.0008 (0.0016) model time 0.2418 (0.2421) loss 2.9851 (2.6856) grad_norm 5.2321 (6.1152) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1120/1251] eta 0:00:31 lr 0.000021 wd 0.0500 time 0.2326 (0.2441) data time 0.0007 (0.0016) model time 0.2318 (0.2421) loss 1.4733 (2.6835) grad_norm 6.5860 (6.1085) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:16:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1130/1251] eta 0:00:29 lr 0.000021 wd 0.0500 time 0.2382 (0.2441) data time 0.0010 (0.0015) model time 0.2372 (0.2421) loss 2.3091 (2.6845) grad_norm 17.4978 (6.1139) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1140/1251] eta 0:00:27 lr 0.000021 wd 0.0500 time 0.2426 (0.2441) data time 0.0007 (0.0015) model time 0.2419 (0.2420) loss 3.4667 (2.6848) grad_norm 7.3279 (6.1064) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1150/1251] eta 0:00:24 lr 0.000021 wd 0.0500 time 0.2433 (0.2441) data time 0.0007 (0.0015) model time 0.2425 (0.2420) loss 3.2137 (2.6855) grad_norm 7.5362 (6.0995) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1160/1251] eta 0:00:22 lr 0.000021 wd 0.0500 time 0.2404 (0.2440) data time 0.0013 (0.0015) model time 0.2391 (0.2420) loss 2.4356 (2.6854) grad_norm 4.7556 (6.0901) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1170/1251] eta 0:00:19 lr 0.000021 wd 0.0500 time 0.2386 (0.2440) data time 0.0010 (0.0015) model time 0.2376 (0.2420) loss 2.0073 (2.6839) grad_norm 4.3276 (6.1013) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:12 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1180/1251] eta 0:00:17 lr 0.000021 wd 0.0500 time 0.2416 (0.2440) data time 0.0010 (0.0015) model time 0.2406 (0.2420) loss 2.7064 (2.6845) grad_norm 9.0736 (6.1011) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1190/1251] eta 0:00:14 lr 0.000021 wd 0.0500 time 0.2382 (0.2440) data time 0.0007 (0.0015) model time 0.2375 (0.2420) loss 2.9583 (2.6851) grad_norm 4.2095 (6.0918) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1200/1251] eta 0:00:12 lr 0.000021 wd 0.0500 time 0.2409 (0.2440) data time 0.0009 (0.0015) model time 0.2400 (0.2420) loss 3.1260 (2.6861) grad_norm 5.1347 (6.0969) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1210/1251] eta 0:00:10 lr 0.000021 wd 0.0500 time 0.2426 (0.2440) data time 0.0010 (0.0015) model time 0.2417 (0.2420) loss 2.5127 (2.6847) grad_norm 6.5244 (6.1175) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1220/1251] eta 0:00:07 lr 0.000021 wd 0.0500 time 0.2373 (0.2439) data time 0.0010 (0.0015) model time 0.2363 (0.2419) loss 2.1241 (2.6827) grad_norm 4.1126 (6.1111) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1230/1251] eta 0:00:05 lr 0.000021 wd 0.0500 time 0.2340 (0.2439) data time 0.0011 (0.0015) model time 0.2329 (0.2419) loss 2.8079 (2.6833) grad_norm 4.6462 (6.1087) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1240/1251] eta 0:00:02 lr 0.000021 wd 0.0500 time 0.2277 (0.2438) data time 0.0007 (0.0015) model time 0.2270 (0.2418) loss 2.6804 
(2.6838) grad_norm 5.3768 (6.1036) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [280/300][1250/1251] eta 0:00:00 lr 0.000021 wd 0.0500 time 0.2274 (0.2437) data time 0.0007 (0.0015) model time 0.2267 (0.2417) loss 2.7638 (2.6826) grad_norm 9.2135 (6.1029) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 280 training takes 0:05:04 [2024-09-01 10:17:28 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:17:29 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:17:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.436 (0.436) Loss 0.3855 (0.3855) Acc@1 93.262 (93.262) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.111) Loss 0.5552 (0.6048) Acc@1 90.625 (87.678) Acc@5 97.949 (97.763) Mem 7381MB [2024-09-01 10:17:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.096) Loss 0.9297 (0.6384) Acc@1 76.855 (86.579) Acc@5 95.215 (97.638) Mem 7381MB [2024-09-01 10:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.082 (0.090) Loss 1.1230 (0.7317) Acc@1 74.902 (84.410) Acc@5 93.066 (96.721) Mem 7381MB [2024-09-01 10:17:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0137 (0.7824) Acc@1 77.148 (83.203) Acc@5 94.727 (96.213) Mem 7381MB [2024-09-01 10:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.740 Acc@5 96.166 [2024-09-01 10:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 10:17:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO 
Test: [0/49] Time 0.809 (0.809) Loss 0.3889 (0.3889) Acc@1 92.969 (92.969) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 10:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.147) Loss 0.5649 (0.6069) Acc@1 90.527 (87.900) Acc@5 98.145 (97.807) Mem 7381MB [2024-09-01 10:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.114) Loss 0.9072 (0.6372) Acc@1 77.930 (86.728) Acc@5 95.703 (97.745) Mem 7381MB [2024-09-01 10:17:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.101) Loss 1.1328 (0.7297) Acc@1 74.805 (84.536) Acc@5 92.773 (96.790) Mem 7381MB [2024-09-01 10:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0088 (0.7778) Acc@1 77.051 (83.348) Acc@5 94.531 (96.320) Mem 7381MB [2024-09-01 10:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.932 Acc@5 96.264 [2024-09-01 10:17:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][0/1251] eta 0:21:11 lr 0.000021 wd 0.0500 time 1.0168 (1.0168) data time 0.6096 (0.6096) model time 0.0000 (0.0000) loss 2.8368 (2.8368) grad_norm 4.2822 (4.2822) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][10/1251] eta 0:06:32 lr 0.000021 wd 0.0500 time 0.2439 (0.3160) data time 0.0009 (0.0563) model time 0.0000 (0.0000) loss 2.6159 (2.5638) grad_norm 5.0403 (5.4873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][20/1251] eta 0:05:45 lr 0.000021 wd 0.0500 time 0.2389 (0.2807) data time 0.0010 (0.0300) model time 0.0000 (0.0000) loss 2.6766 (2.6833) grad_norm 3.5291 (5.3827) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:46 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][30/1251] eta 0:05:27 lr 0.000021 wd 0.0500 time 0.2453 (0.2680) data time 0.0009 (0.0206) model time 0.0000 (0.0000) loss 2.9607 (2.5725) grad_norm 5.4439 (5.7480) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][40/1251] eta 0:05:17 lr 0.000021 wd 0.0500 time 0.2506 (0.2620) data time 0.0010 (0.0158) model time 0.0000 (0.0000) loss 2.6103 (2.5983) grad_norm 5.8111 (6.1518) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][50/1251] eta 0:05:10 lr 0.000021 wd 0.0500 time 0.2459 (0.2583) data time 0.0011 (0.0129) model time 0.0000 (0.0000) loss 1.5920 (2.5859) grad_norm 3.8664 (5.9924) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][60/1251] eta 0:05:04 lr 0.000021 wd 0.0500 time 0.2503 (0.2554) data time 0.0007 (0.0109) model time 0.2496 (0.2397) loss 2.1603 (2.6002) grad_norm 6.2967 (6.0064) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][70/1251] eta 0:04:59 lr 0.000021 wd 0.0500 time 0.2366 (0.2534) data time 0.0009 (0.0095) model time 0.2357 (0.2398) loss 2.9839 (2.6339) grad_norm 6.4438 (6.5021) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:17:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][80/1251] eta 0:04:55 lr 0.000021 wd 0.0500 time 0.2330 (0.2519) data time 0.0010 (0.0085) model time 0.2320 (0.2401) loss 2.7946 (2.6455) grad_norm 6.2260 (6.4301) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][90/1251] eta 0:04:51 lr 0.000021 wd 0.0500 time 0.2348 (0.2508) data time 0.0011 (0.0077) model time 0.2337 (0.2402) loss 2.9547 (2.6697) grad_norm 
5.7905 (6.3446) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][100/1251] eta 0:04:47 lr 0.000021 wd 0.0500 time 0.2427 (0.2499) data time 0.0006 (0.0070) model time 0.2420 (0.2403) loss 2.6792 (2.6580) grad_norm 5.4994 (6.2653) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][110/1251] eta 0:04:44 lr 0.000021 wd 0.0500 time 0.2455 (0.2491) data time 0.0008 (0.0065) model time 0.2447 (0.2402) loss 2.8024 (2.6863) grad_norm 5.8856 (6.1645) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][120/1251] eta 0:04:40 lr 0.000021 wd 0.0500 time 0.2337 (0.2482) data time 0.0010 (0.0060) model time 0.2327 (0.2398) loss 3.2765 (2.7149) grad_norm 6.1928 (6.0974) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][130/1251] eta 0:04:37 lr 0.000021 wd 0.0500 time 0.2400 (0.2476) data time 0.0007 (0.0056) model time 0.2392 (0.2398) loss 3.2366 (2.7204) grad_norm 5.7465 (6.0373) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][140/1251] eta 0:04:34 lr 0.000021 wd 0.0500 time 0.2458 (0.2470) data time 0.0010 (0.0053) model time 0.2449 (0.2396) loss 2.9609 (2.7351) grad_norm 5.8444 (5.9532) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][150/1251] eta 0:04:31 lr 0.000021 wd 0.0500 time 0.2343 (0.2467) data time 0.0009 (0.0050) model time 0.2334 (0.2397) loss 3.0012 (2.7356) grad_norm 3.7907 (6.1494) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][160/1251] eta 0:04:28 lr 0.000021 wd 0.0500 time 0.2453 
(0.2464) data time 0.0007 (0.0048) model time 0.2446 (0.2399) loss 2.9045 (2.7254) grad_norm 5.5944 (6.1777) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][170/1251] eta 0:04:26 lr 0.000021 wd 0.0500 time 0.2413 (0.2461) data time 0.0007 (0.0046) model time 0.2406 (0.2400) loss 3.5015 (2.7272) grad_norm 6.5509 (6.1639) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][180/1251] eta 0:04:23 lr 0.000021 wd 0.0500 time 0.2438 (0.2460) data time 0.0009 (0.0044) model time 0.2428 (0.2402) loss 2.8291 (2.7079) grad_norm 6.3196 (6.0873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][190/1251] eta 0:04:20 lr 0.000021 wd 0.0500 time 0.2442 (0.2457) data time 0.0008 (0.0042) model time 0.2434 (0.2401) loss 3.0532 (2.7186) grad_norm 5.1116 (6.0727) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][200/1251] eta 0:04:17 lr 0.000021 wd 0.0500 time 0.2422 (0.2454) data time 0.0007 (0.0040) model time 0.2415 (0.2400) loss 2.7717 (2.7175) grad_norm 8.4059 (6.0503) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][210/1251] eta 0:04:15 lr 0.000021 wd 0.0500 time 0.2431 (0.2453) data time 0.0010 (0.0039) model time 0.2421 (0.2401) loss 2.7780 (2.7133) grad_norm 3.9973 (6.1018) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][220/1251] eta 0:04:12 lr 0.000021 wd 0.0500 time 0.2438 (0.2452) data time 0.0009 (0.0037) model time 0.2428 (0.2403) loss 3.1538 (2.7125) grad_norm 6.2140 (6.0692) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:34 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [281/300][230/1251] eta 0:04:10 lr 0.000021 wd 0.0500 time 0.2510 (0.2451) data time 0.0009 (0.0036) model time 0.2501 (0.2404) loss 2.9283 (2.7183) grad_norm 6.4924 (6.0404) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][240/1251] eta 0:04:07 lr 0.000021 wd 0.0500 time 0.2456 (0.2450) data time 0.0009 (0.0035) model time 0.2447 (0.2404) loss 2.9841 (2.7128) grad_norm 8.7694 (6.0199) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][250/1251] eta 0:04:05 lr 0.000021 wd 0.0500 time 0.2481 (0.2448) data time 0.0010 (0.0034) model time 0.2471 (0.2404) loss 2.6873 (2.7187) grad_norm 7.1645 (6.0245) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][260/1251] eta 0:04:03 lr 0.000021 wd 0.0500 time 0.2422 (0.2453) data time 0.0009 (0.0033) model time 0.2413 (0.2412) loss 2.3249 (2.7088) grad_norm 6.8820 (6.0366) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][270/1251] eta 0:04:00 lr 0.000021 wd 0.0500 time 0.2458 (0.2452) data time 0.0007 (0.0032) model time 0.2451 (0.2412) loss 2.4401 (2.7093) grad_norm 4.0940 (6.0190) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][280/1251] eta 0:03:58 lr 0.000021 wd 0.0500 time 0.2511 (0.2452) data time 0.0009 (0.0032) model time 0.2502 (0.2413) loss 2.6527 (2.7000) grad_norm 5.6402 (5.9993) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][290/1251] eta 0:03:55 lr 0.000021 wd 0.0500 time 0.2395 (0.2450) data time 0.0010 (0.0031) model time 0.2385 (0.2412) loss 2.3542 (2.6914) grad_norm 4.2410 
(5.9694) loss_scale 256.0000 (128.4399) mem 7381MB [2024-09-01 10:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][300/1251] eta 0:03:52 lr 0.000021 wd 0.0500 time 0.2516 (0.2449) data time 0.0007 (0.0030) model time 0.2509 (0.2412) loss 1.8241 (2.6886) grad_norm 5.0146 (5.9432) loss_scale 256.0000 (132.6777) mem 7381MB [2024-09-01 10:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][310/1251] eta 0:03:50 lr 0.000021 wd 0.0500 time 0.2408 (0.2449) data time 0.0009 (0.0030) model time 0.2399 (0.2412) loss 2.8736 (2.6837) grad_norm 4.9282 (5.9422) loss_scale 256.0000 (136.6431) mem 7381MB [2024-09-01 10:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][320/1251] eta 0:03:48 lr 0.000021 wd 0.0500 time 0.2317 (0.2454) data time 0.0010 (0.0029) model time 0.2307 (0.2419) loss 2.7342 (2.6843) grad_norm 6.7278 (5.9180) loss_scale 256.0000 (140.3614) mem 7381MB [2024-09-01 10:18:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][330/1251] eta 0:03:46 lr 0.000021 wd 0.0500 time 0.2314 (0.2459) data time 0.0012 (0.0028) model time 0.2302 (0.2426) loss 2.8586 (2.6842) grad_norm 7.8269 (5.9094) loss_scale 256.0000 (143.8550) mem 7381MB [2024-09-01 10:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][340/1251] eta 0:03:43 lr 0.000021 wd 0.0500 time 0.2425 (0.2457) data time 0.0011 (0.0028) model time 0.2414 (0.2425) loss 2.9259 (2.6837) grad_norm 10.1120 (5.9198) loss_scale 256.0000 (147.1437) mem 7381MB [2024-09-01 10:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][350/1251] eta 0:03:41 lr 0.000021 wd 0.0500 time 0.2387 (0.2456) data time 0.0008 (0.0027) model time 0.2379 (0.2424) loss 2.3689 (2.6827) grad_norm 5.9709 (5.9178) loss_scale 256.0000 (150.2450) mem 7381MB [2024-09-01 10:19:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][360/1251] eta 0:03:38 lr 0.000021 wd 0.0500 time 0.2453 (0.2455) 
data time 0.0007 (0.0027) model time 0.2446 (0.2424) loss 2.9207 (2.6853) grad_norm 7.9120 (5.9563) loss_scale 256.0000 (153.1745) mem 7381MB [2024-09-01 10:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][370/1251] eta 0:03:36 lr 0.000021 wd 0.0500 time 0.2318 (0.2458) data time 0.0008 (0.0026) model time 0.2310 (0.2429) loss 3.0045 (2.6868) grad_norm 7.5146 (5.9580) loss_scale 256.0000 (155.9461) mem 7381MB [2024-09-01 10:19:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][380/1251] eta 0:03:34 lr 0.000021 wd 0.0500 time 0.2373 (0.2457) data time 0.0007 (0.0026) model time 0.2366 (0.2427) loss 2.0275 (2.6875) grad_norm 7.5256 (5.9829) loss_scale 256.0000 (158.5722) mem 7381MB [2024-09-01 10:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][390/1251] eta 0:03:31 lr 0.000021 wd 0.0500 time 0.2457 (0.2456) data time 0.0010 (0.0026) model time 0.2447 (0.2427) loss 2.5101 (2.6923) grad_norm 4.7661 (5.9637) loss_scale 256.0000 (161.0639) mem 7381MB [2024-09-01 10:19:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][400/1251] eta 0:03:28 lr 0.000021 wd 0.0500 time 0.2421 (0.2455) data time 0.0011 (0.0025) model time 0.2410 (0.2426) loss 2.6074 (2.6886) grad_norm 4.9994 (5.9550) loss_scale 256.0000 (163.4314) mem 7381MB [2024-09-01 10:19:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][410/1251] eta 0:03:26 lr 0.000021 wd 0.0500 time 0.2383 (0.2454) data time 0.0012 (0.0025) model time 0.2371 (0.2426) loss 1.9275 (2.6811) grad_norm 3.9117 (5.9455) loss_scale 256.0000 (165.6837) mem 7381MB [2024-09-01 10:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][420/1251] eta 0:03:23 lr 0.000021 wd 0.0500 time 0.2421 (0.2453) data time 0.0007 (0.0024) model time 0.2414 (0.2425) loss 2.0082 (2.6818) grad_norm 5.7697 (5.9440) loss_scale 256.0000 (167.8290) mem 7381MB [2024-09-01 10:19:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [281/300][430/1251] eta 0:03:21 lr 0.000021 wd 0.0500 time 0.2393 (0.2452) data time 0.0009 (0.0024) model time 0.2384 (0.2425) loss 3.1866 (2.6827) grad_norm 4.4683 (5.9815) loss_scale 256.0000 (169.8747) mem 7381MB [2024-09-01 10:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][440/1251] eta 0:03:18 lr 0.000021 wd 0.0500 time 0.2353 (0.2451) data time 0.0008 (0.0024) model time 0.2345 (0.2424) loss 3.5755 (2.6795) grad_norm 6.0762 (5.9687) loss_scale 256.0000 (171.8277) mem 7381MB [2024-09-01 10:19:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][450/1251] eta 0:03:16 lr 0.000021 wd 0.0500 time 0.2451 (0.2455) data time 0.0009 (0.0023) model time 0.2442 (0.2429) loss 1.9673 (2.6790) grad_norm 6.1307 (5.9936) loss_scale 256.0000 (173.6940) mem 7381MB [2024-09-01 10:19:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][460/1251] eta 0:03:14 lr 0.000021 wd 0.0500 time 0.2425 (0.2459) data time 0.0007 (0.0023) model time 0.2418 (0.2433) loss 3.1474 (2.6768) grad_norm 5.6315 (5.9739) loss_scale 256.0000 (175.4794) mem 7381MB [2024-09-01 10:19:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][470/1251] eta 0:03:11 lr 0.000021 wd 0.0500 time 0.2411 (0.2458) data time 0.0009 (0.0023) model time 0.2402 (0.2432) loss 2.4219 (2.6789) grad_norm 3.7304 (5.9549) loss_scale 256.0000 (177.1890) mem 7381MB [2024-09-01 10:19:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][480/1251] eta 0:03:09 lr 0.000021 wd 0.0500 time 0.2446 (0.2457) data time 0.0009 (0.0023) model time 0.2437 (0.2432) loss 3.1756 (2.6774) grad_norm 6.1342 (5.9498) loss_scale 256.0000 (178.8274) mem 7381MB [2024-09-01 10:19:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][490/1251] eta 0:03:06 lr 0.000021 wd 0.0500 time 0.2415 (0.2456) data time 0.0009 (0.0022) model time 0.2406 (0.2431) loss 1.7775 (2.6784) grad_norm 5.4818 (5.9328) loss_scale 256.0000 
(180.3992) mem 7381MB [2024-09-01 10:19:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][500/1251] eta 0:03:04 lr 0.000021 wd 0.0500 time 0.2342 (0.2455) data time 0.0009 (0.0022) model time 0.2332 (0.2430) loss 3.0633 (2.6815) grad_norm 6.3378 (5.9143) loss_scale 256.0000 (181.9082) mem 7381MB [2024-09-01 10:19:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][510/1251] eta 0:03:01 lr 0.000021 wd 0.0500 time 0.2366 (0.2454) data time 0.0010 (0.0022) model time 0.2357 (0.2430) loss 2.8208 (2.6811) grad_norm 4.6873 (5.9042) loss_scale 256.0000 (183.3581) mem 7381MB [2024-09-01 10:19:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][520/1251] eta 0:02:59 lr 0.000021 wd 0.0500 time 0.2412 (0.2453) data time 0.0007 (0.0022) model time 0.2405 (0.2429) loss 3.0765 (2.6816) grad_norm 3.9247 (5.8874) loss_scale 256.0000 (184.7524) mem 7381MB [2024-09-01 10:19:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][530/1251] eta 0:02:56 lr 0.000021 wd 0.0500 time 0.2380 (0.2453) data time 0.0011 (0.0021) model time 0.2369 (0.2429) loss 2.3282 (2.6809) grad_norm 6.4718 (5.8844) loss_scale 256.0000 (186.0942) mem 7381MB [2024-09-01 10:19:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][540/1251] eta 0:02:54 lr 0.000021 wd 0.0500 time 0.2504 (0.2452) data time 0.0007 (0.0021) model time 0.2497 (0.2429) loss 3.4815 (2.6810) grad_norm 4.8438 (5.8672) loss_scale 256.0000 (187.3863) mem 7381MB [2024-09-01 10:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][550/1251] eta 0:02:51 lr 0.000021 wd 0.0500 time 0.2403 (0.2452) data time 0.0007 (0.0021) model time 0.2396 (0.2428) loss 2.9544 (2.6779) grad_norm 4.2147 (5.8568) loss_scale 256.0000 (188.6316) mem 7381MB [2024-09-01 10:19:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][560/1251] eta 0:02:49 lr 0.000021 wd 0.0500 time 0.2441 (0.2451) data time 0.0007 (0.0021) model 
time 0.2434 (0.2428) loss 3.5577 (2.6834) grad_norm 19.7255 (5.9053) loss_scale 256.0000 (189.8324) mem 7381MB [2024-09-01 10:19:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][570/1251] eta 0:02:46 lr 0.000021 wd 0.0500 time 0.2427 (0.2451) data time 0.0009 (0.0021) model time 0.2418 (0.2428) loss 2.1073 (2.6832) grad_norm 4.4985 (5.8961) loss_scale 256.0000 (190.9912) mem 7381MB [2024-09-01 10:20:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][580/1251] eta 0:02:44 lr 0.000021 wd 0.0500 time 0.2466 (0.2451) data time 0.0008 (0.0020) model time 0.2458 (0.2428) loss 3.6010 (2.6814) grad_norm 4.7737 (5.8915) loss_scale 256.0000 (192.1102) mem 7381MB [2024-09-01 10:20:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][590/1251] eta 0:02:41 lr 0.000021 wd 0.0500 time 0.2438 (0.2450) data time 0.0009 (0.0020) model time 0.2429 (0.2428) loss 2.1741 (2.6807) grad_norm 8.7685 (5.8900) loss_scale 256.0000 (193.1912) mem 7381MB [2024-09-01 10:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][600/1251] eta 0:02:39 lr 0.000021 wd 0.0500 time 0.2383 (0.2450) data time 0.0007 (0.0020) model time 0.2376 (0.2427) loss 1.9012 (2.6825) grad_norm 4.4372 (5.8878) loss_scale 256.0000 (194.2363) mem 7381MB [2024-09-01 10:20:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][610/1251] eta 0:02:36 lr 0.000021 wd 0.0500 time 0.2445 (0.2449) data time 0.0009 (0.0020) model time 0.2436 (0.2427) loss 2.7574 (2.6795) grad_norm 6.3685 (5.8774) loss_scale 256.0000 (195.2471) mem 7381MB [2024-09-01 10:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][620/1251] eta 0:02:34 lr 0.000021 wd 0.0500 time 0.2426 (0.2448) data time 0.0009 (0.0020) model time 0.2417 (0.2426) loss 2.0170 (2.6783) grad_norm 6.6825 (5.8701) loss_scale 256.0000 (196.2254) mem 7381MB [2024-09-01 10:20:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][630/1251] 
eta 0:02:32 lr 0.000021 wd 0.0500 time 0.2503 (0.2448) data time 0.0007 (0.0020) model time 0.2496 (0.2426) loss 3.1678 (2.6791) grad_norm 4.9475 (5.8648) loss_scale 256.0000 (197.1727) mem 7381MB [2024-09-01 10:20:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][640/1251] eta 0:02:29 lr 0.000021 wd 0.0500 time 0.2427 (0.2448) data time 0.0007 (0.0019) model time 0.2420 (0.2426) loss 1.9283 (2.6789) grad_norm 6.5692 (5.8632) loss_scale 256.0000 (198.0905) mem 7381MB [2024-09-01 10:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][650/1251] eta 0:02:27 lr 0.000021 wd 0.0500 time 0.2412 (0.2447) data time 0.0009 (0.0019) model time 0.2403 (0.2425) loss 2.4450 (2.6837) grad_norm 8.3461 (5.8884) loss_scale 256.0000 (198.9800) mem 7381MB [2024-09-01 10:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][660/1251] eta 0:02:24 lr 0.000021 wd 0.0500 time 0.2424 (0.2447) data time 0.0011 (0.0019) model time 0.2413 (0.2426) loss 2.8449 (2.6837) grad_norm 3.9593 (5.9110) loss_scale 256.0000 (199.8427) mem 7381MB [2024-09-01 10:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][670/1251] eta 0:02:22 lr 0.000021 wd 0.0500 time 0.2399 (0.2447) data time 0.0007 (0.0019) model time 0.2392 (0.2425) loss 2.8812 (2.6830) grad_norm 5.6820 (5.9065) loss_scale 256.0000 (200.6796) mem 7381MB [2024-09-01 10:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][680/1251] eta 0:02:19 lr 0.000021 wd 0.0500 time 0.2358 (0.2446) data time 0.0009 (0.0019) model time 0.2349 (0.2425) loss 3.2128 (2.6825) grad_norm 8.3426 (5.8958) loss_scale 256.0000 (201.4919) mem 7381MB [2024-09-01 10:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][690/1251] eta 0:02:17 lr 0.000021 wd 0.0500 time 0.2479 (0.2446) data time 0.0009 (0.0019) model time 0.2470 (0.2425) loss 2.6354 (2.6807) grad_norm 7.7511 (5.8900) loss_scale 256.0000 (202.2808) mem 7381MB [2024-09-01 
10:20:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][700/1251] eta 0:02:14 lr 0.000021 wd 0.0500 time 0.2459 (0.2445) data time 0.0009 (0.0019) model time 0.2450 (0.2424) loss 2.8891 (2.6823) grad_norm 10.1504 (5.9035) loss_scale 256.0000 (203.0471) mem 7381MB [2024-09-01 10:20:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][710/1251] eta 0:02:12 lr 0.000021 wd 0.0500 time 0.2395 (0.2444) data time 0.0012 (0.0018) model time 0.2383 (0.2424) loss 2.1509 (2.6794) grad_norm 7.7681 (5.9409) loss_scale 256.0000 (203.7918) mem 7381MB [2024-09-01 10:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][720/1251] eta 0:02:09 lr 0.000021 wd 0.0500 time 0.2419 (0.2444) data time 0.0009 (0.0018) model time 0.2410 (0.2423) loss 1.8870 (2.6768) grad_norm 6.1746 (5.9293) loss_scale 256.0000 (204.5160) mem 7381MB [2024-09-01 10:20:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][730/1251] eta 0:02:07 lr 0.000021 wd 0.0500 time 0.2457 (0.2444) data time 0.0009 (0.0018) model time 0.2448 (0.2423) loss 2.8786 (2.6775) grad_norm 6.4170 (5.9269) loss_scale 256.0000 (205.2202) mem 7381MB [2024-09-01 10:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][740/1251] eta 0:02:04 lr 0.000021 wd 0.0500 time 0.2382 (0.2444) data time 0.0009 (0.0018) model time 0.2373 (0.2423) loss 1.9954 (2.6813) grad_norm 5.0518 (5.9736) loss_scale 256.0000 (205.9055) mem 7381MB [2024-09-01 10:20:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][750/1251] eta 0:02:02 lr 0.000021 wd 0.0500 time 0.2443 (0.2443) data time 0.0010 (0.0018) model time 0.2432 (0.2423) loss 1.8445 (2.6808) grad_norm 6.8413 (5.9689) loss_scale 256.0000 (206.5726) mem 7381MB [2024-09-01 10:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][760/1251] eta 0:01:59 lr 0.000021 wd 0.0500 time 0.2414 (0.2443) data time 0.0009 (0.0018) model time 0.2405 (0.2423) loss 1.9021 
(2.6804) grad_norm 4.1549 (5.9586) loss_scale 256.0000 (207.2221) mem 7381MB [2024-09-01 10:20:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][770/1251] eta 0:01:57 lr 0.000020 wd 0.0500 time 0.2317 (0.2443) data time 0.0011 (0.0018) model time 0.2305 (0.2423) loss 2.6539 (2.6808) grad_norm 3.6028 (5.9561) loss_scale 256.0000 (207.8547) mem 7381MB [2024-09-01 10:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][780/1251] eta 0:01:55 lr 0.000020 wd 0.0500 time 0.2337 (0.2442) data time 0.0009 (0.0018) model time 0.2327 (0.2423) loss 3.1784 (2.6805) grad_norm 5.4306 (5.9537) loss_scale 256.0000 (208.4712) mem 7381MB [2024-09-01 10:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][790/1251] eta 0:01:52 lr 0.000020 wd 0.0500 time 0.2455 (0.2443) data time 0.0009 (0.0018) model time 0.2447 (0.2423) loss 2.4272 (2.6789) grad_norm 5.6321 (5.9628) loss_scale 256.0000 (209.0721) mem 7381MB [2024-09-01 10:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][800/1251] eta 0:01:50 lr 0.000020 wd 0.0500 time 0.2406 (0.2442) data time 0.0009 (0.0017) model time 0.2396 (0.2423) loss 2.8233 (2.6767) grad_norm 5.2672 (5.9696) loss_scale 256.0000 (209.6579) mem 7381MB [2024-09-01 10:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][810/1251] eta 0:01:47 lr 0.000020 wd 0.0500 time 0.2388 (0.2442) data time 0.0009 (0.0017) model time 0.2379 (0.2422) loss 2.2151 (2.6794) grad_norm 3.8259 (5.9763) loss_scale 256.0000 (210.2293) mem 7381MB [2024-09-01 10:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][820/1251] eta 0:01:45 lr 0.000020 wd 0.0500 time 0.2391 (0.2442) data time 0.0008 (0.0017) model time 0.2384 (0.2422) loss 3.2494 (2.6775) grad_norm 4.5965 (5.9810) loss_scale 256.0000 (210.7868) mem 7381MB [2024-09-01 10:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][830/1251] eta 0:01:42 lr 0.000020 wd 0.0500 
time 0.2404 (0.2441) data time 0.0010 (0.0017) model time 0.2394 (0.2422) loss 2.7294 (2.6770) grad_norm 6.6269 (6.0005) loss_scale 256.0000 (211.3309) mem 7381MB [2024-09-01 10:21:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][840/1251] eta 0:01:40 lr 0.000020 wd 0.0500 time 0.2355 (0.2441) data time 0.0008 (0.0017) model time 0.2346 (0.2422) loss 2.0468 (2.6758) grad_norm 4.8416 (5.9981) loss_scale 256.0000 (211.8621) mem 7381MB [2024-09-01 10:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][850/1251] eta 0:01:37 lr 0.000020 wd 0.0500 time 0.2394 (0.2441) data time 0.0010 (0.0017) model time 0.2384 (0.2422) loss 3.3565 (2.6759) grad_norm 7.6868 (6.0372) loss_scale 256.0000 (212.3807) mem 7381MB [2024-09-01 10:21:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][860/1251] eta 0:01:35 lr 0.000020 wd 0.0500 time 0.2371 (0.2441) data time 0.0009 (0.0017) model time 0.2363 (0.2422) loss 3.3638 (2.6774) grad_norm 5.1519 (6.0453) loss_scale 256.0000 (212.8873) mem 7381MB [2024-09-01 10:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][870/1251] eta 0:01:32 lr 0.000020 wd 0.0500 time 0.2371 (0.2440) data time 0.0010 (0.0017) model time 0.2361 (0.2422) loss 3.1230 (2.6796) grad_norm 6.6608 (6.0480) loss_scale 256.0000 (213.3823) mem 7381MB [2024-09-01 10:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][880/1251] eta 0:01:30 lr 0.000020 wd 0.0500 time 0.2380 (0.2440) data time 0.0008 (0.0017) model time 0.2372 (0.2421) loss 3.2977 (2.6798) grad_norm 5.8348 (6.0381) loss_scale 256.0000 (213.8661) mem 7381MB [2024-09-01 10:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][890/1251] eta 0:01:28 lr 0.000020 wd 0.0500 time 0.2411 (0.2441) data time 0.0007 (0.0017) model time 0.2404 (0.2423) loss 3.1708 (2.6816) grad_norm 6.5545 (6.0357) loss_scale 256.0000 (214.3389) mem 7381MB [2024-09-01 10:21:17 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [281/300][900/1251] eta 0:01:25 lr 0.000020 wd 0.0500 time 0.2427 (0.2441) data time 0.0008 (0.0017) model time 0.2418 (0.2422) loss 2.7943 (2.6822) grad_norm 5.3969 (6.0522) loss_scale 256.0000 (214.8013) mem 7381MB [2024-09-01 10:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][910/1251] eta 0:01:23 lr 0.000020 wd 0.0500 time 0.2407 (0.2440) data time 0.0009 (0.0017) model time 0.2397 (0.2422) loss 2.1181 (2.6825) grad_norm 8.0291 (6.0437) loss_scale 256.0000 (215.2536) mem 7381MB [2024-09-01 10:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][920/1251] eta 0:01:20 lr 0.000020 wd 0.0500 time 0.2338 (0.2440) data time 0.0009 (0.0017) model time 0.2330 (0.2422) loss 2.7551 (2.6823) grad_norm 7.6332 (6.0419) loss_scale 256.0000 (215.6960) mem 7381MB [2024-09-01 10:21:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][930/1251] eta 0:01:18 lr 0.000020 wd 0.0500 time 0.2285 (0.2440) data time 0.0008 (0.0016) model time 0.2277 (0.2422) loss 2.7613 (2.6830) grad_norm 7.1822 (6.0449) loss_scale 256.0000 (216.1289) mem 7381MB [2024-09-01 10:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][940/1251] eta 0:01:15 lr 0.000020 wd 0.0500 time 0.2384 (0.2439) data time 0.0009 (0.0016) model time 0.2375 (0.2421) loss 2.7401 (2.6831) grad_norm 11.3404 (6.0468) loss_scale 256.0000 (216.5526) mem 7381MB [2024-09-01 10:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][950/1251] eta 0:01:13 lr 0.000020 wd 0.0500 time 0.2364 (0.2439) data time 0.0010 (0.0016) model time 0.2354 (0.2421) loss 1.8412 (2.6811) grad_norm 5.7245 (6.0478) loss_scale 256.0000 (216.9674) mem 7381MB [2024-09-01 10:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][960/1251] eta 0:01:10 lr 0.000020 wd 0.0500 time 0.2446 (0.2439) data time 0.0008 (0.0016) model time 0.2439 (0.2421) loss 3.2809 (2.6808) grad_norm 6.8277 
(6.0423) loss_scale 256.0000 (217.3736) mem 7381MB [2024-09-01 10:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][970/1251] eta 0:01:08 lr 0.000020 wd 0.0500 time 0.2368 (0.2439) data time 0.0009 (0.0016) model time 0.2359 (0.2421) loss 2.0452 (2.6793) grad_norm 5.0466 (6.0371) loss_scale 256.0000 (217.7714) mem 7381MB [2024-09-01 10:21:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][980/1251] eta 0:01:06 lr 0.000020 wd 0.0500 time 0.2407 (0.2438) data time 0.0007 (0.0016) model time 0.2399 (0.2420) loss 2.8142 (2.6811) grad_norm 5.4605 (6.0456) loss_scale 256.0000 (218.1611) mem 7381MB [2024-09-01 10:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][990/1251] eta 0:01:03 lr 0.000020 wd 0.0500 time 0.4451 (0.2442) data time 0.0009 (0.0016) model time 0.4442 (0.2425) loss 3.0650 (2.6820) grad_norm 5.9271 (6.0502) loss_scale 256.0000 (218.5429) mem 7381MB [2024-09-01 10:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1000/1251] eta 0:01:01 lr 0.000020 wd 0.0500 time 0.2470 (0.2442) data time 0.0007 (0.0016) model time 0.2463 (0.2424) loss 2.5444 (2.6807) grad_norm 5.4926 (6.0478) loss_scale 256.0000 (218.9171) mem 7381MB [2024-09-01 10:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1010/1251] eta 0:00:58 lr 0.000020 wd 0.0500 time 0.2364 (0.2441) data time 0.0013 (0.0016) model time 0.2351 (0.2424) loss 3.0791 (2.6820) grad_norm 12.0614 (6.0552) loss_scale 256.0000 (219.2839) mem 7381MB [2024-09-01 10:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1020/1251] eta 0:00:56 lr 0.000020 wd 0.0500 time 0.2377 (0.2441) data time 0.0007 (0.0016) model time 0.2369 (0.2423) loss 2.1227 (2.6827) grad_norm 4.3421 (6.0579) loss_scale 256.0000 (219.6435) mem 7381MB [2024-09-01 10:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1030/1251] eta 0:00:53 lr 0.000020 wd 0.0500 time 0.2368 (0.2441) 
data time 0.0012 (0.0016) model time 0.2357 (0.2423) loss 2.9831 (2.6835) grad_norm 3.7828 (6.0556) loss_scale 256.0000 (219.9961) mem 7381MB [2024-09-01 10:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1040/1251] eta 0:00:51 lr 0.000020 wd 0.0500 time 0.2366 (0.2440) data time 0.0010 (0.0016) model time 0.2356 (0.2423) loss 2.0183 (2.6816) grad_norm 4.8566 (6.0445) loss_scale 256.0000 (220.3420) mem 7381MB [2024-09-01 10:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1050/1251] eta 0:00:49 lr 0.000020 wd 0.0500 time 0.2425 (0.2440) data time 0.0007 (0.0016) model time 0.2417 (0.2423) loss 3.1163 (2.6811) grad_norm 6.7391 (6.0405) loss_scale 256.0000 (220.6813) mem 7381MB [2024-09-01 10:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1060/1251] eta 0:00:46 lr 0.000020 wd 0.0500 time 0.2488 (0.2440) data time 0.0009 (0.0016) model time 0.2479 (0.2422) loss 2.6636 (2.6814) grad_norm 6.4085 (6.0328) loss_scale 256.0000 (221.0141) mem 7381MB [2024-09-01 10:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1070/1251] eta 0:00:44 lr 0.000020 wd 0.0500 time 0.2381 (0.2439) data time 0.0010 (0.0016) model time 0.2371 (0.2422) loss 3.1438 (2.6837) grad_norm 4.2491 (6.0421) loss_scale 256.0000 (221.3408) mem 7381MB [2024-09-01 10:22:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1080/1251] eta 0:00:41 lr 0.000020 wd 0.0500 time 0.2437 (0.2439) data time 0.0009 (0.0016) model time 0.2428 (0.2422) loss 3.2061 (2.6858) grad_norm 3.9651 (6.0578) loss_scale 256.0000 (221.6614) mem 7381MB [2024-09-01 10:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1090/1251] eta 0:00:39 lr 0.000020 wd 0.0500 time 0.2376 (0.2439) data time 0.0011 (0.0015) model time 0.2365 (0.2422) loss 2.8545 (2.6866) grad_norm 6.0220 (6.0515) loss_scale 256.0000 (221.9762) mem 7381MB [2024-09-01 10:22:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [281/300][1100/1251] eta 0:00:36 lr 0.000020 wd 0.0500 time 0.2445 (0.2438) data time 0.0009 (0.0015) model time 0.2436 (0.2422) loss 3.3612 (2.6893) grad_norm 4.2604 (6.0434) loss_scale 256.0000 (222.2852) mem 7381MB [2024-09-01 10:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1110/1251] eta 0:00:34 lr 0.000020 wd 0.0500 time 0.2388 (0.2438) data time 0.0007 (0.0015) model time 0.2381 (0.2421) loss 1.9952 (2.6889) grad_norm 4.5360 (6.0356) loss_scale 256.0000 (222.5887) mem 7381MB [2024-09-01 10:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1120/1251] eta 0:00:31 lr 0.000020 wd 0.0500 time 0.2417 (0.2438) data time 0.0011 (0.0015) model time 0.2406 (0.2421) loss 3.1111 (2.6886) grad_norm 5.8102 (6.0362) loss_scale 256.0000 (222.8867) mem 7381MB [2024-09-01 10:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1130/1251] eta 0:00:29 lr 0.000020 wd 0.0500 time 0.2391 (0.2437) data time 0.0009 (0.0015) model time 0.2382 (0.2421) loss 2.6265 (2.6876) grad_norm 5.5075 (6.0420) loss_scale 256.0000 (223.1795) mem 7381MB [2024-09-01 10:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1140/1251] eta 0:00:27 lr 0.000020 wd 0.0500 time 0.2492 (0.2437) data time 0.0007 (0.0015) model time 0.2485 (0.2421) loss 2.1004 (2.6881) grad_norm 3.9236 (6.0514) loss_scale 256.0000 (223.4671) mem 7381MB [2024-09-01 10:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1150/1251] eta 0:00:24 lr 0.000020 wd 0.0500 time 0.2390 (0.2437) data time 0.0008 (0.0015) model time 0.2383 (0.2420) loss 2.8376 (2.6889) grad_norm 5.9884 (6.0532) loss_scale 256.0000 (223.7498) mem 7381MB [2024-09-01 10:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1160/1251] eta 0:00:22 lr 0.000020 wd 0.0500 time 0.2350 (0.2437) data time 0.0008 (0.0015) model time 0.2342 (0.2420) loss 3.2148 (2.6886) grad_norm 5.2107 (6.0505) loss_scale 
256.0000 (224.0276) mem 7381MB [2024-09-01 10:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1170/1251] eta 0:00:19 lr 0.000020 wd 0.0500 time 0.2440 (0.2437) data time 0.0007 (0.0015) model time 0.2432 (0.2420) loss 3.3274 (2.6899) grad_norm 6.7046 (6.0538) loss_scale 256.0000 (224.3006) mem 7381MB [2024-09-01 10:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1180/1251] eta 0:00:17 lr 0.000020 wd 0.0500 time 0.2396 (0.2436) data time 0.0007 (0.0015) model time 0.2388 (0.2420) loss 2.8183 (2.6912) grad_norm 5.7435 (6.0549) loss_scale 256.0000 (224.5690) mem 7381MB [2024-09-01 10:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1190/1251] eta 0:00:14 lr 0.000020 wd 0.0500 time 0.2361 (0.2437) data time 0.0007 (0.0015) model time 0.2354 (0.2421) loss 2.3387 (2.6910) grad_norm 10.3929 (6.0577) loss_scale 256.0000 (224.8329) mem 7381MB [2024-09-01 10:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1200/1251] eta 0:00:12 lr 0.000020 wd 0.0500 time 0.2407 (0.2437) data time 0.0011 (0.0015) model time 0.2396 (0.2421) loss 2.4849 (2.6910) grad_norm 3.7749 (6.0499) loss_scale 256.0000 (225.0924) mem 7381MB [2024-09-01 10:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1210/1251] eta 0:00:09 lr 0.000020 wd 0.0500 time 0.2447 (0.2437) data time 0.0008 (0.0015) model time 0.2439 (0.2421) loss 2.8573 (2.6918) grad_norm 4.6290 (6.0461) loss_scale 256.0000 (225.3476) mem 7381MB [2024-09-01 10:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1220/1251] eta 0:00:07 lr 0.000020 wd 0.0500 time 0.2442 (0.2437) data time 0.0012 (0.0015) model time 0.2430 (0.2421) loss 1.8383 (2.6905) grad_norm 5.3897 (6.0407) loss_scale 256.0000 (225.5987) mem 7381MB [2024-09-01 10:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1230/1251] eta 0:00:05 lr 0.000020 wd 0.0500 time 0.2441 (0.2437) data time 0.0007 
(0.0015) model time 0.2434 (0.2421) loss 2.1818 (2.6910) grad_norm 4.7397 (6.0350) loss_scale 256.0000 (225.8457) mem 7381MB [2024-09-01 10:22:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1240/1251] eta 0:00:02 lr 0.000020 wd 0.0500 time 0.2261 (0.2436) data time 0.0004 (0.0015) model time 0.2257 (0.2420) loss 2.8516 (2.6923) grad_norm 4.8977 (6.0327) loss_scale 256.0000 (226.0886) mem 7381MB [2024-09-01 10:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [281/300][1250/1251] eta 0:00:00 lr 0.000020 wd 0.0500 time 0.2232 (0.2435) data time 0.0005 (0.0015) model time 0.2228 (0.2418) loss 2.9875 (2.6917) grad_norm 4.0497 (6.0326) loss_scale 256.0000 (226.3277) mem 7381MB [2024-09-01 10:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 281 training takes 0:05:04 [2024-09-01 10:22:42 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:22:43 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 10:22:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.501 (0.501) Loss 0.3870 (0.3870) Acc@1 93.359 (93.359) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.113) Loss 0.5825 (0.6098) Acc@1 90.332 (87.882) Acc@5 97.949 (97.727) Mem 7381MB [2024-09-01 10:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.097) Loss 0.9146 (0.6423) Acc@1 77.637 (86.714) Acc@5 95.898 (97.680) Mem 7381MB [2024-09-01 10:22:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.092) Loss 1.1396 (0.7349) Acc@1 74.023 (84.548) Acc@5 93.359 (96.755) Mem 7381MB [2024-09-01 10:22:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.085) Loss 1.0225 (0.7848) Acc@1 77.051 (83.339) Acc@5 94.336 (96.263) Mem 7381MB [2024-09-01 10:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.906 Acc@5 96.198 [2024-09-01 10:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 269): INFO New max accuracy: 82.91% [2024-09-01 10:22:47 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saving...... [2024-09-01 10:22:47 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt.pth saved !!! 
[2024-09-01 10:22:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.429 (0.429) Loss 0.3882 (0.3882) Acc@1 93.164 (93.164) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 10:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.109) Loss 0.5649 (0.6069) Acc@1 90.527 (87.944) Acc@5 98.047 (97.807) Mem 7381MB [2024-09-01 10:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.095) Loss 0.9077 (0.6373) Acc@1 77.930 (86.742) Acc@5 95.703 (97.745) Mem 7381MB [2024-09-01 10:22:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.076 (0.089) Loss 1.1348 (0.7298) Acc@1 74.707 (84.551) Acc@5 92.969 (96.790) Mem 7381MB [2024-09-01 10:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0088 (0.7781) Acc@1 76.855 (83.353) Acc@5 94.531 (96.315) Mem 7381MB [2024-09-01 10:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.924 Acc@5 96.266 [2024-09-01 10:22:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][0/1251] eta 0:23:52 lr 0.000020 wd 0.0500 time 1.1449 (1.1449) data time 0.7070 (0.7070) model time 0.0000 (0.0000) loss 2.6420 (2.6420) grad_norm 5.2807 (5.2807) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:22:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][10/1251] eta 0:07:02 lr 0.000020 wd 0.0500 time 0.2350 (0.3403) data time 0.0008 (0.0651) model time 0.0000 (0.0000) loss 1.5923 (2.5322) grad_norm 4.4983 (5.2168) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][20/1251] eta 0:06:00 lr 0.000020 wd 0.0500 time 0.2469 (0.2929) data time 0.0011 (0.0346) model time 0.0000 (0.0000) loss 2.8845 (2.6068) grad_norm 5.8608 (5.7147) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][30/1251] eta 0:05:37 lr 0.000020 wd 0.0500 time 0.2327 (0.2762) data time 0.0013 (0.0238) model time 0.0000 (0.0000) loss 2.7882 (2.6671) grad_norm 7.2700 (5.7948) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][40/1251] eta 0:05:24 lr 0.000020 wd 0.0500 time 0.2462 (0.2681) data time 0.0007 (0.0182) model time 0.0000 (0.0000) loss 2.6151 (2.7289) grad_norm 4.2867 (5.8469) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][50/1251] eta 0:05:15 lr 0.000020 wd 0.0500 time 0.2449 (0.2630) data time 0.0007 (0.0149) model time 0.0000 (0.0000) loss 2.8534 (2.7339) grad_norm 5.8318 (5.8296) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][60/1251] eta 0:05:08 lr 0.000020 wd 0.0500 time 0.2437 (0.2593) data time 0.0009 (0.0126) model time 0.2428 (0.2395) loss 2.5122 (2.6997) grad_norm 7.9801 (6.1409) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][70/1251] eta 0:05:03 lr 0.000020 wd 0.0500 time 0.2436 (0.2568) data time 0.0010 (0.0109) model time 0.2425 (0.2400) loss 2.9916 (2.7179) grad_norm 5.1778 (6.3355) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][80/1251] eta 0:04:58 lr 0.000020 wd 0.0500 time 0.2305 (0.2545) data time 0.0010 (0.0097) model time 0.2295 (0.2391) loss 2.9653 (2.7135) grad_norm 5.8898 (6.2630) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][90/1251] eta 0:04:53 lr 0.000020 wd 0.0500 time 0.2462 (0.2530) data time 0.0011 
(0.0088) model time 0.2451 (0.2392) loss 2.3794 (2.6854) grad_norm 5.1720 (6.5394) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][100/1251] eta 0:04:49 lr 0.000020 wd 0.0500 time 0.2318 (0.2516) data time 0.0007 (0.0080) model time 0.2310 (0.2390) loss 3.0195 (2.6594) grad_norm 3.8787 (6.4424) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][110/1251] eta 0:04:45 lr 0.000020 wd 0.0500 time 0.2385 (0.2506) data time 0.0011 (0.0073) model time 0.2374 (0.2391) loss 2.5671 (2.6471) grad_norm 4.3530 (6.4143) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][120/1251] eta 0:04:42 lr 0.000020 wd 0.0500 time 0.2451 (0.2498) data time 0.0010 (0.0068) model time 0.2442 (0.2391) loss 2.7181 (2.6468) grad_norm 6.7479 (6.3689) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][130/1251] eta 0:04:39 lr 0.000020 wd 0.0500 time 0.2417 (0.2490) data time 0.0009 (0.0064) model time 0.2408 (0.2391) loss 2.7918 (2.6594) grad_norm 4.5062 (6.3245) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][140/1251] eta 0:04:36 lr 0.000020 wd 0.0500 time 0.2398 (0.2485) data time 0.0008 (0.0060) model time 0.2390 (0.2393) loss 3.2554 (2.6590) grad_norm 5.4816 (6.2421) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][150/1251] eta 0:04:33 lr 0.000020 wd 0.0500 time 0.2485 (0.2480) data time 0.0008 (0.0057) model time 0.2477 (0.2394) loss 2.6795 (2.6571) grad_norm 5.2395 (6.2004) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[282/300][160/1251] eta 0:04:31 lr 0.000020 wd 0.0500 time 0.2398 (0.2484) data time 0.0010 (0.0054) model time 0.2388 (0.2407) loss 3.1104 (2.6617) grad_norm 9.7341 (6.2160) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][170/1251] eta 0:04:28 lr 0.000020 wd 0.0500 time 0.2342 (0.2481) data time 0.0008 (0.0051) model time 0.2334 (0.2408) loss 2.9939 (2.6679) grad_norm 7.2535 (6.3231) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][180/1251] eta 0:04:25 lr 0.000020 wd 0.0500 time 0.2434 (0.2479) data time 0.0008 (0.0049) model time 0.2426 (0.2409) loss 2.8928 (2.6627) grad_norm 4.6975 (6.2645) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][190/1251] eta 0:04:22 lr 0.000020 wd 0.0500 time 0.2402 (0.2476) data time 0.0009 (0.0047) model time 0.2393 (0.2409) loss 2.8100 (2.6649) grad_norm 4.9765 (6.2463) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][200/1251] eta 0:04:19 lr 0.000020 wd 0.0500 time 0.2435 (0.2473) data time 0.0007 (0.0045) model time 0.2428 (0.2410) loss 3.2469 (2.6639) grad_norm 4.7115 (6.1934) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][210/1251] eta 0:04:17 lr 0.000020 wd 0.0500 time 0.2373 (0.2470) data time 0.0008 (0.0043) model time 0.2365 (0.2409) loss 2.7767 (2.6716) grad_norm 3.5472 (6.2441) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][220/1251] eta 0:04:14 lr 0.000020 wd 0.0500 time 0.2353 (0.2467) data time 0.0011 (0.0042) model time 0.2342 (0.2408) loss 3.1498 (2.6762) grad_norm 14.8218 (6.2412) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 10:23:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][230/1251] eta 0:04:11 lr 0.000020 wd 0.0500 time 0.2381 (0.2465) data time 0.0008 (0.0040) model time 0.2373 (0.2409) loss 2.7455 (2.6794) grad_norm 8.1164 (6.2290) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][240/1251] eta 0:04:09 lr 0.000020 wd 0.0500 time 0.2335 (0.2463) data time 0.0009 (0.0039) model time 0.2327 (0.2409) loss 2.8800 (2.6788) grad_norm 5.4021 (6.2096) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][250/1251] eta 0:04:06 lr 0.000020 wd 0.0500 time 0.2425 (0.2462) data time 0.0008 (0.0038) model time 0.2417 (0.2409) loss 3.1745 (2.6805) grad_norm 6.1214 (6.1867) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][260/1251] eta 0:04:03 lr 0.000020 wd 0.0500 time 0.2437 (0.2460) data time 0.0007 (0.0037) model time 0.2429 (0.2409) loss 3.4359 (2.6779) grad_norm 6.2031 (6.1478) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:23:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][270/1251] eta 0:04:03 lr 0.000020 wd 0.0500 time 0.4427 (0.2482) data time 0.0008 (0.0036) model time 0.4420 (0.2438) loss 2.6258 (2.6796) grad_norm 82.9879 (6.4271) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][280/1251] eta 0:04:00 lr 0.000020 wd 0.0500 time 0.2500 (0.2480) data time 0.0009 (0.0035) model time 0.2490 (0.2437) loss 2.6592 (2.6817) grad_norm 6.6339 (6.3980) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][290/1251] eta 0:03:58 lr 0.000020 wd 0.0500 time 0.2428 (0.2478) data time 0.0008 (0.0034) model time 0.2421 
(0.2436) loss 2.7420 (2.6919) grad_norm 4.6217 (6.3851) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][300/1251] eta 0:03:55 lr 0.000020 wd 0.0500 time 0.2418 (0.2477) data time 0.0009 (0.0033) model time 0.2409 (0.2436) loss 2.6928 (2.6883) grad_norm 4.3870 (6.3927) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][310/1251] eta 0:03:52 lr 0.000020 wd 0.0500 time 0.2436 (0.2476) data time 0.0009 (0.0033) model time 0.2426 (0.2436) loss 2.4520 (2.6892) grad_norm 6.6299 (6.3670) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][320/1251] eta 0:03:50 lr 0.000020 wd 0.0500 time 0.2394 (0.2474) data time 0.0012 (0.0032) model time 0.2382 (0.2434) loss 3.3153 (2.6952) grad_norm 4.0004 (6.3504) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][330/1251] eta 0:03:47 lr 0.000020 wd 0.0500 time 0.2427 (0.2471) data time 0.0011 (0.0031) model time 0.2417 (0.2432) loss 2.8441 (2.6971) grad_norm 6.9370 (6.3914) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][340/1251] eta 0:03:44 lr 0.000020 wd 0.0500 time 0.2418 (0.2470) data time 0.0011 (0.0031) model time 0.2407 (0.2432) loss 2.9884 (2.6988) grad_norm 3.6175 (6.3690) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][350/1251] eta 0:03:42 lr 0.000020 wd 0.0500 time 0.2491 (0.2468) data time 0.0007 (0.0030) model time 0.2484 (0.2431) loss 2.3094 (2.6964) grad_norm 4.9380 (6.5841) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][360/1251] eta 0:03:39 
lr 0.000020 wd 0.0500 time 0.2455 (0.2467) data time 0.0009 (0.0029) model time 0.2446 (0.2430) loss 2.9751 (2.6946) grad_norm 7.8170 (6.5944) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][370/1251] eta 0:03:37 lr 0.000020 wd 0.0500 time 0.2382 (0.2465) data time 0.0008 (0.0029) model time 0.2374 (0.2429) loss 2.9833 (2.6983) grad_norm 4.8727 (6.6029) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][380/1251] eta 0:03:34 lr 0.000020 wd 0.0500 time 0.2615 (0.2464) data time 0.0011 (0.0028) model time 0.2604 (0.2429) loss 2.8287 (2.6994) grad_norm 5.3114 (6.6247) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][390/1251] eta 0:03:32 lr 0.000020 wd 0.0500 time 0.2383 (0.2463) data time 0.0007 (0.0028) model time 0.2376 (0.2428) loss 1.7665 (2.6978) grad_norm 8.8123 (6.6119) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][400/1251] eta 0:03:29 lr 0.000020 wd 0.0500 time 0.4520 (0.2468) data time 0.0010 (0.0027) model time 0.4511 (0.2434) loss 3.1944 (2.6993) grad_norm 6.5791 (6.5940) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][410/1251] eta 0:03:27 lr 0.000020 wd 0.0500 time 0.2479 (0.2467) data time 0.0011 (0.0027) model time 0.2468 (0.2434) loss 1.8450 (2.7008) grad_norm 5.7412 (6.5949) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][420/1251] eta 0:03:24 lr 0.000020 wd 0.0500 time 0.2487 (0.2465) data time 0.0009 (0.0027) model time 0.2478 (0.2433) loss 2.4241 (2.7006) grad_norm 5.9508 (6.5687) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:37 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][430/1251] eta 0:03:22 lr 0.000020 wd 0.0500 time 0.2403 (0.2464) data time 0.0007 (0.0026) model time 0.2396 (0.2432) loss 2.0705 (2.6981) grad_norm 5.9741 (6.5514) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][440/1251] eta 0:03:19 lr 0.000020 wd 0.0500 time 0.2409 (0.2463) data time 0.0008 (0.0026) model time 0.2401 (0.2431) loss 2.6210 (2.6974) grad_norm 4.3166 (6.5620) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][450/1251] eta 0:03:17 lr 0.000020 wd 0.0500 time 0.2419 (0.2461) data time 0.0011 (0.0026) model time 0.2409 (0.2430) loss 2.5927 (2.7008) grad_norm 7.5720 (6.5683) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][460/1251] eta 0:03:14 lr 0.000020 wd 0.0500 time 0.2404 (0.2460) data time 0.0010 (0.0025) model time 0.2394 (0.2429) loss 2.6759 (2.7012) grad_norm 5.4712 (6.5481) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][470/1251] eta 0:03:12 lr 0.000020 wd 0.0500 time 0.2459 (0.2459) data time 0.0009 (0.0025) model time 0.2449 (0.2428) loss 2.7585 (2.7010) grad_norm 5.9550 (6.5325) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][480/1251] eta 0:03:09 lr 0.000020 wd 0.0500 time 0.2381 (0.2458) data time 0.0011 (0.0025) model time 0.2370 (0.2428) loss 2.9314 (2.7014) grad_norm 5.4611 (6.5248) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][490/1251] eta 0:03:06 lr 0.000020 wd 0.0500 time 0.2441 (0.2457) data time 0.0007 (0.0024) model time 0.2434 (0.2427) loss 2.9587 (2.7041) 
grad_norm 4.9297 (6.5228) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][500/1251] eta 0:03:04 lr 0.000020 wd 0.0500 time 0.2469 (0.2456) data time 0.0009 (0.0024) model time 0.2460 (0.2427) loss 3.2001 (2.7036) grad_norm 4.9284 (6.5152) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][510/1251] eta 0:03:01 lr 0.000020 wd 0.0500 time 0.2387 (0.2456) data time 0.0007 (0.0024) model time 0.2380 (0.2426) loss 2.5913 (2.7051) grad_norm 5.9602 (6.5316) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][520/1251] eta 0:02:59 lr 0.000020 wd 0.0500 time 0.2434 (0.2455) data time 0.0010 (0.0023) model time 0.2424 (0.2426) loss 1.7506 (2.7048) grad_norm 8.0333 (6.5181) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][530/1251] eta 0:02:56 lr 0.000020 wd 0.0500 time 0.2433 (0.2455) data time 0.0007 (0.0023) model time 0.2426 (0.2426) loss 2.8014 (2.7035) grad_norm 3.3923 (6.5088) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][540/1251] eta 0:02:54 lr 0.000020 wd 0.0500 time 0.2352 (0.2454) data time 0.0011 (0.0023) model time 0.2341 (0.2425) loss 2.9712 (2.7017) grad_norm 3.9223 (6.4858) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][550/1251] eta 0:02:51 lr 0.000020 wd 0.0500 time 0.2424 (0.2453) data time 0.0010 (0.0023) model time 0.2414 (0.2425) loss 2.7193 (2.6973) grad_norm 7.0594 (6.4731) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][560/1251] eta 0:02:49 lr 0.000020 wd 0.0500 time 
0.2414 (0.2452) data time 0.0007 (0.0023) model time 0.2407 (0.2424) loss 1.9457 (2.6969) grad_norm 10.2194 (6.4784) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][570/1251] eta 0:02:46 lr 0.000020 wd 0.0500 time 0.2388 (0.2451) data time 0.0009 (0.0022) model time 0.2378 (0.2424) loss 2.7843 (2.6941) grad_norm 3.1927 (6.4734) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][580/1251] eta 0:02:44 lr 0.000020 wd 0.0500 time 0.2450 (0.2451) data time 0.0009 (0.0022) model time 0.2440 (0.2423) loss 2.8454 (2.6951) grad_norm 4.8034 (6.4707) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][590/1251] eta 0:02:41 lr 0.000020 wd 0.0500 time 0.2353 (0.2449) data time 0.0007 (0.0022) model time 0.2346 (0.2422) loss 1.9809 (2.6931) grad_norm 5.1328 (6.4685) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][600/1251] eta 0:02:39 lr 0.000020 wd 0.0500 time 0.2323 (0.2449) data time 0.0007 (0.0022) model time 0.2316 (0.2422) loss 3.0117 (2.6940) grad_norm 5.6815 (6.4521) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][610/1251] eta 0:02:36 lr 0.000020 wd 0.0500 time 0.2513 (0.2448) data time 0.0007 (0.0021) model time 0.2506 (0.2422) loss 2.8414 (2.6953) grad_norm 34.8663 (6.4710) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][620/1251] eta 0:02:34 lr 0.000020 wd 0.0500 time 0.2464 (0.2448) data time 0.0009 (0.0021) model time 0.2455 (0.2422) loss 2.8469 (2.6968) grad_norm 6.8323 (6.4610) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:26 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [282/300][630/1251] eta 0:02:31 lr 0.000020 wd 0.0500 time 0.2354 (0.2448) data time 0.0009 (0.0021) model time 0.2345 (0.2422) loss 2.6312 (2.6951) grad_norm 5.5631 (6.4599) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][640/1251] eta 0:02:29 lr 0.000019 wd 0.0500 time 0.2530 (0.2447) data time 0.0010 (0.0021) model time 0.2520 (0.2421) loss 2.2287 (2.6948) grad_norm 21.8006 (6.4982) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][650/1251] eta 0:02:26 lr 0.000019 wd 0.0500 time 0.2398 (0.2446) data time 0.0011 (0.0021) model time 0.2387 (0.2420) loss 2.6019 (2.6956) grad_norm 4.4199 (6.4742) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][660/1251] eta 0:02:24 lr 0.000019 wd 0.0500 time 0.2400 (0.2446) data time 0.0010 (0.0021) model time 0.2391 (0.2420) loss 3.0855 (2.6943) grad_norm 4.9695 (inf) loss_scale 128.0000 (255.0318) mem 7381MB [2024-09-01 10:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][670/1251] eta 0:02:22 lr 0.000019 wd 0.0500 time 0.2423 (0.2446) data time 0.0010 (0.0020) model time 0.2414 (0.2421) loss 3.1111 (2.6957) grad_norm 4.7121 (inf) loss_scale 128.0000 (253.1386) mem 7381MB [2024-09-01 10:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][680/1251] eta 0:02:19 lr 0.000019 wd 0.0500 time 0.2385 (0.2445) data time 0.0007 (0.0020) model time 0.2378 (0.2420) loss 3.1712 (2.6973) grad_norm 4.8697 (inf) loss_scale 128.0000 (251.3010) mem 7381MB [2024-09-01 10:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][690/1251] eta 0:02:17 lr 0.000019 wd 0.0500 time 0.2366 (0.2447) data time 0.0010 (0.0020) model time 0.2356 (0.2422) loss 2.7475 (2.6965) grad_norm 5.0858 (inf) 
loss_scale 128.0000 (249.5166) mem 7381MB [2024-09-01 10:25:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][700/1251] eta 0:02:14 lr 0.000019 wd 0.0500 time 0.2386 (0.2446) data time 0.0008 (0.0020) model time 0.2378 (0.2422) loss 2.6667 (2.6984) grad_norm 5.9860 (inf) loss_scale 128.0000 (247.7832) mem 7381MB [2024-09-01 10:25:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][710/1251] eta 0:02:12 lr 0.000019 wd 0.0500 time 0.2505 (0.2446) data time 0.0009 (0.0020) model time 0.2496 (0.2422) loss 2.5360 (2.6995) grad_norm 5.6620 (inf) loss_scale 128.0000 (246.0985) mem 7381MB [2024-09-01 10:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][720/1251] eta 0:02:09 lr 0.000019 wd 0.0500 time 0.2463 (0.2446) data time 0.0009 (0.0020) model time 0.2454 (0.2422) loss 1.8861 (2.6986) grad_norm 5.2076 (inf) loss_scale 128.0000 (244.4605) mem 7381MB [2024-09-01 10:25:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][730/1251] eta 0:02:07 lr 0.000019 wd 0.0500 time 0.2440 (0.2445) data time 0.0007 (0.0020) model time 0.2433 (0.2421) loss 2.7107 (2.6991) grad_norm 4.2250 (inf) loss_scale 128.0000 (242.8673) mem 7381MB [2024-09-01 10:25:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][740/1251] eta 0:02:04 lr 0.000019 wd 0.0500 time 0.2526 (0.2445) data time 0.0010 (0.0020) model time 0.2516 (0.2421) loss 3.1276 (2.7020) grad_norm 6.8554 (inf) loss_scale 128.0000 (241.3171) mem 7381MB [2024-09-01 10:25:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][750/1251] eta 0:02:02 lr 0.000019 wd 0.0500 time 0.2401 (0.2444) data time 0.0011 (0.0019) model time 0.2389 (0.2421) loss 3.0238 (2.6995) grad_norm 6.0098 (inf) loss_scale 128.0000 (239.8083) mem 7381MB [2024-09-01 10:25:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][760/1251] eta 0:02:00 lr 0.000019 wd 0.0500 time 0.2400 (0.2444) data time 0.0007 (0.0019) model 
time 0.2393 (0.2421) loss 1.9314 (2.6993) grad_norm 4.4830 (inf) loss_scale 128.0000 (238.3390) mem 7381MB [2024-09-01 10:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][770/1251] eta 0:01:57 lr 0.000019 wd 0.0500 time 0.2442 (0.2444) data time 0.0011 (0.0019) model time 0.2431 (0.2421) loss 2.4941 (2.6977) grad_norm 4.0773 (inf) loss_scale 128.0000 (236.9079) mem 7381MB [2024-09-01 10:26:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][780/1251] eta 0:01:55 lr 0.000019 wd 0.0500 time 0.2341 (0.2443) data time 0.0009 (0.0019) model time 0.2332 (0.2420) loss 2.2062 (2.6984) grad_norm 4.2328 (inf) loss_scale 128.0000 (235.5134) mem 7381MB [2024-09-01 10:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][790/1251] eta 0:01:52 lr 0.000019 wd 0.0500 time 0.3926 (0.2449) data time 0.0008 (0.0019) model time 0.3918 (0.2426) loss 3.1851 (2.6977) grad_norm 21.7862 (inf) loss_scale 128.0000 (234.1542) mem 7381MB [2024-09-01 10:26:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][800/1251] eta 0:01:50 lr 0.000019 wd 0.0500 time 0.2365 (0.2449) data time 0.0007 (0.0019) model time 0.2358 (0.2426) loss 2.5664 (2.6985) grad_norm 4.6737 (inf) loss_scale 128.0000 (232.8290) mem 7381MB [2024-09-01 10:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][810/1251] eta 0:01:47 lr 0.000019 wd 0.0500 time 0.2348 (0.2448) data time 0.0013 (0.0019) model time 0.2336 (0.2426) loss 2.8965 (2.6971) grad_norm 8.2495 (inf) loss_scale 128.0000 (231.5364) mem 7381MB [2024-09-01 10:26:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][820/1251] eta 0:01:45 lr 0.000019 wd 0.0500 time 0.2495 (0.2447) data time 0.0011 (0.0019) model time 0.2484 (0.2425) loss 2.7759 (2.6975) grad_norm 6.6703 (inf) loss_scale 128.0000 (230.2753) mem 7381MB [2024-09-01 10:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][830/1251] eta 0:01:43 lr 
0.000019 wd 0.0500 time 0.2361 (0.2447) data time 0.0007 (0.0018) model time 0.2354 (0.2425) loss 2.0854 (2.6966) grad_norm 6.1396 (inf) loss_scale 128.0000 (229.0445) mem 7381MB [2024-09-01 10:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][840/1251] eta 0:01:40 lr 0.000019 wd 0.0500 time 0.2405 (0.2447) data time 0.0009 (0.0018) model time 0.2396 (0.2425) loss 3.0993 (2.6997) grad_norm 7.9238 (inf) loss_scale 128.0000 (227.8430) mem 7381MB [2024-09-01 10:26:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][850/1251] eta 0:01:38 lr 0.000019 wd 0.0500 time 0.2413 (0.2446) data time 0.0008 (0.0018) model time 0.2405 (0.2425) loss 3.2506 (2.6996) grad_norm 3.4639 (inf) loss_scale 128.0000 (226.6698) mem 7381MB [2024-09-01 10:26:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][860/1251] eta 0:01:35 lr 0.000019 wd 0.0500 time 0.2469 (0.2446) data time 0.0008 (0.0018) model time 0.2461 (0.2425) loss 2.0884 (2.6967) grad_norm 33.7401 (inf) loss_scale 128.0000 (225.5238) mem 7381MB [2024-09-01 10:26:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][870/1251] eta 0:01:33 lr 0.000019 wd 0.0500 time 0.2412 (0.2446) data time 0.0007 (0.0018) model time 0.2405 (0.2425) loss 2.2437 (2.6972) grad_norm 5.6615 (inf) loss_scale 128.0000 (224.4041) mem 7381MB [2024-09-01 10:26:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][880/1251] eta 0:01:30 lr 0.000019 wd 0.0500 time 0.2422 (0.2446) data time 0.0009 (0.0018) model time 0.2413 (0.2425) loss 2.8381 (2.6971) grad_norm 6.0694 (inf) loss_scale 128.0000 (223.3099) mem 7381MB [2024-09-01 10:26:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][890/1251] eta 0:01:28 lr 0.000019 wd 0.0500 time 0.2455 (0.2446) data time 0.0009 (0.0018) model time 0.2446 (0.2424) loss 2.7824 (2.6969) grad_norm 5.6800 (inf) loss_scale 128.0000 (222.2402) mem 7381MB [2024-09-01 10:26:32 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [282/300][900/1251] eta 0:01:25 lr 0.000019 wd 0.0500 time 0.2416 (0.2445) data time 0.0010 (0.0018) model time 0.2406 (0.2424) loss 3.0142 (2.6955) grad_norm 5.2928 (inf) loss_scale 128.0000 (221.1942) mem 7381MB [2024-09-01 10:26:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][910/1251] eta 0:01:23 lr 0.000019 wd 0.0500 time 0.2445 (0.2445) data time 0.0009 (0.0018) model time 0.2436 (0.2424) loss 2.6271 (2.6966) grad_norm 7.1636 (inf) loss_scale 128.0000 (220.1712) mem 7381MB [2024-09-01 10:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][920/1251] eta 0:01:20 lr 0.000019 wd 0.0500 time 0.2481 (0.2447) data time 0.0009 (0.0018) model time 0.2472 (0.2426) loss 2.7810 (2.6959) grad_norm 55.4002 (inf) loss_scale 128.0000 (219.1705) mem 7381MB [2024-09-01 10:26:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][930/1251] eta 0:01:18 lr 0.000019 wd 0.0500 time 0.2412 (0.2446) data time 0.0007 (0.0018) model time 0.2405 (0.2425) loss 3.2772 (2.6953) grad_norm 11.3466 (inf) loss_scale 128.0000 (218.1912) mem 7381MB [2024-09-01 10:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][940/1251] eta 0:01:16 lr 0.000019 wd 0.0500 time 0.2408 (0.2448) data time 0.0009 (0.0017) model time 0.2399 (0.2428) loss 1.9901 (2.6948) grad_norm 7.3382 (inf) loss_scale 128.0000 (217.2327) mem 7381MB [2024-09-01 10:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][950/1251] eta 0:01:13 lr 0.000019 wd 0.0500 time 0.2476 (0.2448) data time 0.0008 (0.0017) model time 0.2468 (0.2427) loss 3.0267 (2.6952) grad_norm 4.8279 (inf) loss_scale 128.0000 (216.2944) mem 7381MB [2024-09-01 10:26:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][960/1251] eta 0:01:11 lr 0.000019 wd 0.0500 time 0.2439 (0.2447) data time 0.0010 (0.0017) model time 0.2429 (0.2427) loss 2.9010 (2.6958) grad_norm 3.6892 (inf) loss_scale 
128.0000 (215.3757) mem 7381MB [2024-09-01 10:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][970/1251] eta 0:01:08 lr 0.000019 wd 0.0500 time 0.2469 (0.2447) data time 0.0010 (0.0017) model time 0.2459 (0.2427) loss 2.3816 (2.6968) grad_norm 4.0971 (inf) loss_scale 128.0000 (214.4758) mem 7381MB [2024-09-01 10:26:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][980/1251] eta 0:01:06 lr 0.000019 wd 0.0500 time 0.2397 (0.2447) data time 0.0009 (0.0017) model time 0.2387 (0.2427) loss 2.5027 (2.6934) grad_norm 7.8341 (inf) loss_scale 128.0000 (213.5943) mem 7381MB [2024-09-01 10:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][990/1251] eta 0:01:03 lr 0.000019 wd 0.0500 time 0.2434 (0.2447) data time 0.0007 (0.0017) model time 0.2427 (0.2427) loss 2.4118 (2.6926) grad_norm 7.9346 (inf) loss_scale 128.0000 (212.7306) mem 7381MB [2024-09-01 10:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1000/1251] eta 0:01:01 lr 0.000019 wd 0.0500 time 0.2436 (0.2447) data time 0.0007 (0.0017) model time 0.2429 (0.2427) loss 2.7570 (2.6913) grad_norm 6.3314 (inf) loss_scale 128.0000 (211.8841) mem 7381MB [2024-09-01 10:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1010/1251] eta 0:00:58 lr 0.000019 wd 0.0500 time 0.2500 (0.2447) data time 0.0011 (0.0017) model time 0.2489 (0.2427) loss 2.0385 (2.6918) grad_norm 8.4561 (inf) loss_scale 128.0000 (211.0544) mem 7381MB [2024-09-01 10:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1020/1251] eta 0:00:56 lr 0.000019 wd 0.0500 time 0.2471 (0.2446) data time 0.0009 (0.0017) model time 0.2461 (0.2427) loss 2.7891 (2.6917) grad_norm 3.9016 (inf) loss_scale 128.0000 (210.2409) mem 7381MB [2024-09-01 10:27:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1030/1251] eta 0:00:54 lr 0.000019 wd 0.0500 time 0.2404 (0.2446) data time 0.0008 (0.0017) model time 
0.2396 (0.2426) loss 3.0835 (2.6917) grad_norm 8.6649 (inf) loss_scale 128.0000 (209.4433) mem 7381MB [2024-09-01 10:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1040/1251] eta 0:00:51 lr 0.000019 wd 0.0500 time 0.2347 (0.2446) data time 0.0009 (0.0017) model time 0.2338 (0.2426) loss 2.1599 (2.6898) grad_norm 4.4651 (inf) loss_scale 128.0000 (208.6609) mem 7381MB [2024-09-01 10:27:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1050/1251] eta 0:00:49 lr 0.000019 wd 0.0500 time 0.2400 (0.2445) data time 0.0007 (0.0017) model time 0.2393 (0.2426) loss 2.6234 (2.6916) grad_norm 6.4260 (inf) loss_scale 128.0000 (207.8934) mem 7381MB [2024-09-01 10:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1060/1251] eta 0:00:46 lr 0.000019 wd 0.0500 time 0.2500 (0.2445) data time 0.0009 (0.0017) model time 0.2490 (0.2426) loss 2.6660 (2.6895) grad_norm 4.8269 (inf) loss_scale 128.0000 (207.1404) mem 7381MB [2024-09-01 10:27:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1070/1251] eta 0:00:44 lr 0.000019 wd 0.0500 time 0.2425 (0.2445) data time 0.0011 (0.0017) model time 0.2414 (0.2425) loss 2.6420 (2.6870) grad_norm 10.1404 (inf) loss_scale 128.0000 (206.4015) mem 7381MB [2024-09-01 10:27:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1080/1251] eta 0:00:41 lr 0.000019 wd 0.0500 time 0.2309 (0.2444) data time 0.0011 (0.0016) model time 0.2298 (0.2425) loss 2.5754 (2.6874) grad_norm 4.3067 (inf) loss_scale 128.0000 (205.6762) mem 7381MB [2024-09-01 10:27:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1090/1251] eta 0:00:39 lr 0.000019 wd 0.0500 time 0.2399 (0.2444) data time 0.0010 (0.0016) model time 0.2389 (0.2425) loss 2.8703 (2.6880) grad_norm 4.6825 (inf) loss_scale 128.0000 (204.9643) mem 7381MB [2024-09-01 10:27:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1100/1251] eta 0:00:36 lr 
0.000019 wd 0.0500 time 0.2402 (0.2444) data time 0.0007 (0.0016) model time 0.2395 (0.2425) loss 2.2154 (2.6884) grad_norm 3.9277 (inf) loss_scale 128.0000 (204.2652) mem 7381MB [2024-09-01 10:27:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1110/1251] eta 0:00:34 lr 0.000019 wd 0.0500 time 0.2403 (0.2443) data time 0.0008 (0.0016) model time 0.2395 (0.2424) loss 2.5236 (2.6862) grad_norm 8.9970 (inf) loss_scale 128.0000 (203.5788) mem 7381MB [2024-09-01 10:27:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1120/1251] eta 0:00:32 lr 0.000019 wd 0.0500 time 0.2376 (0.2443) data time 0.0011 (0.0016) model time 0.2365 (0.2424) loss 2.5597 (2.6867) grad_norm 5.7812 (inf) loss_scale 128.0000 (202.9045) mem 7381MB [2024-09-01 10:27:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1130/1251] eta 0:00:29 lr 0.000019 wd 0.0500 time 0.2426 (0.2443) data time 0.0012 (0.0016) model time 0.2413 (0.2424) loss 2.7358 (2.6891) grad_norm 10.6839 (inf) loss_scale 128.0000 (202.2423) mem 7381MB [2024-09-01 10:27:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1140/1251] eta 0:00:27 lr 0.000019 wd 0.0500 time 0.2448 (0.2443) data time 0.0008 (0.0016) model time 0.2440 (0.2424) loss 2.9698 (2.6855) grad_norm 6.2203 (inf) loss_scale 128.0000 (201.5916) mem 7381MB [2024-09-01 10:27:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1150/1251] eta 0:00:24 lr 0.000019 wd 0.0500 time 0.2486 (0.2443) data time 0.0007 (0.0016) model time 0.2479 (0.2424) loss 2.3552 (2.6855) grad_norm 5.7357 (inf) loss_scale 128.0000 (200.9522) mem 7381MB [2024-09-01 10:27:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1160/1251] eta 0:00:22 lr 0.000019 wd 0.0500 time 0.2370 (0.2443) data time 0.0011 (0.0016) model time 0.2359 (0.2424) loss 2.9116 (2.6842) grad_norm 10.1763 (inf) loss_scale 128.0000 (200.3239) mem 7381MB [2024-09-01 10:27:37 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [282/300][1170/1251] eta 0:00:19 lr 0.000019 wd 0.0500 time 0.2393 (0.2443) data time 0.0011 (0.0016) model time 0.2382 (0.2424) loss 2.2229 (2.6845) grad_norm 5.1491 (inf) loss_scale 128.0000 (199.7062) mem 7381MB [2024-09-01 10:27:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1180/1251] eta 0:00:17 lr 0.000019 wd 0.0500 time 0.2437 (0.2442) data time 0.0011 (0.0016) model time 0.2426 (0.2424) loss 2.7291 (2.6848) grad_norm 8.8054 (inf) loss_scale 128.0000 (199.0991) mem 7381MB [2024-09-01 10:27:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1190/1251] eta 0:00:14 lr 0.000019 wd 0.0500 time 0.2411 (0.2442) data time 0.0009 (0.0016) model time 0.2402 (0.2424) loss 2.5445 (2.6850) grad_norm 4.1741 (inf) loss_scale 128.0000 (198.5021) mem 7381MB [2024-09-01 10:27:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1200/1251] eta 0:00:12 lr 0.000019 wd 0.0500 time 0.2398 (0.2442) data time 0.0008 (0.0016) model time 0.2390 (0.2424) loss 2.6911 (2.6855) grad_norm 6.8935 (inf) loss_scale 128.0000 (197.9151) mem 7381MB [2024-09-01 10:27:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1210/1251] eta 0:00:10 lr 0.000019 wd 0.0500 time 0.2322 (0.2442) data time 0.0008 (0.0016) model time 0.2314 (0.2424) loss 2.5771 (2.6845) grad_norm 9.7434 (inf) loss_scale 128.0000 (197.3377) mem 7381MB [2024-09-01 10:27:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1220/1251] eta 0:00:07 lr 0.000019 wd 0.0500 time 0.2410 (0.2442) data time 0.0010 (0.0016) model time 0.2400 (0.2424) loss 2.9568 (2.6854) grad_norm 6.9497 (inf) loss_scale 128.0000 (196.7699) mem 7381MB [2024-09-01 10:27:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1230/1251] eta 0:00:05 lr 0.000019 wd 0.0500 time 0.2407 (0.2442) data time 0.0011 (0.0016) model time 0.2396 (0.2424) loss 2.5199 (2.6847) grad_norm 5.8280 (inf) loss_scale 
128.0000 (196.2112) mem 7381MB [2024-09-01 10:27:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1240/1251] eta 0:00:02 lr 0.000019 wd 0.0500 time 0.2220 (0.2441) data time 0.0005 (0.0016) model time 0.2215 (0.2423) loss 3.2495 (2.6852) grad_norm 5.5183 (inf) loss_scale 128.0000 (195.6616) mem 7381MB [2024-09-01 10:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [282/300][1250/1251] eta 0:00:00 lr 0.000019 wd 0.0500 time 0.2272 (0.2439) data time 0.0008 (0.0016) model time 0.2264 (0.2421) loss 2.8813 (2.6864) grad_norm 7.9302 (inf) loss_scale 128.0000 (195.1207) mem 7381MB [2024-09-01 10:27:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 282 training takes 0:05:05 [2024-09-01 10:27:56 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:27:57 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 10:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.449 (0.449) Loss 0.3931 (0.3931) Acc@1 93.652 (93.652) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 10:27:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.110) Loss 0.5649 (0.6144) Acc@1 90.332 (87.802) Acc@5 98.145 (97.718) Mem 7381MB [2024-09-01 10:27:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.096) Loss 0.9282 (0.6457) Acc@1 77.344 (86.616) Acc@5 95.605 (97.619) Mem 7381MB [2024-09-01 10:28:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.090) Loss 1.1699 (0.7407) Acc@1 73.535 (84.362) Acc@5 92.383 (96.642) Mem 7381MB [2024-09-01 10:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0186 (0.7879) Acc@1 76.660 (83.201) Acc@5 94.531 (96.168) Mem 7381MB [2024-09-01 10:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.778 Acc@5 96.144 [2024-09-01 10:28:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 10:28:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.802 (0.802) Loss 0.3879 (0.3879) Acc@1 93.359 (93.359) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.147) Loss 0.5649 (0.6067) Acc@1 90.430 (87.971) Acc@5 98.047 (97.798) Mem 7381MB [2024-09-01 10:28:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.079 (0.114) Loss 0.9097 (0.6373) Acc@1 77.930 (86.747) Acc@5 95.605 (97.731) Mem 7381MB [2024-09-01 10:28:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.102) Loss 1.1348 (0.7299) Acc@1 74.707 (84.551) Acc@5 92.773 (96.768) Mem 7381MB [2024-09-01 10:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0088 
(0.7782) Acc@1 76.855 (83.363) Acc@5 94.531 (96.289) Mem 7381MB [2024-09-01 10:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.940 Acc@5 96.238 [2024-09-01 10:28:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:28:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][0/1251] eta 0:23:21 lr 0.000019 wd 0.0500 time 1.1200 (1.1200) data time 0.6206 (0.6206) model time 0.0000 (0.0000) loss 1.5016 (1.5016) grad_norm 5.9068 (5.9068) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][10/1251] eta 0:06:41 lr 0.000019 wd 0.0500 time 0.2392 (0.3237) data time 0.0008 (0.0573) model time 0.0000 (0.0000) loss 2.3380 (2.6907) grad_norm 9.7016 (6.1336) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][20/1251] eta 0:05:50 lr 0.000019 wd 0.0500 time 0.2379 (0.2846) data time 0.0009 (0.0305) model time 0.0000 (0.0000) loss 2.5145 (2.6819) grad_norm 5.7509 (6.1487) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][30/1251] eta 0:05:30 lr 0.000019 wd 0.0500 time 0.2434 (0.2708) data time 0.0007 (0.0210) model time 0.0000 (0.0000) loss 2.4703 (2.7375) grad_norm 5.1487 (5.8076) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][40/1251] eta 0:05:19 lr 0.000019 wd 0.0500 time 0.2417 (0.2641) data time 0.0010 (0.0161) model time 0.0000 (0.0000) loss 2.4639 (2.7456) grad_norm 4.2028 (5.5181) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][50/1251] eta 0:05:11 lr 0.000019 wd 0.0500 time 0.2341 (0.2595) data time 0.0008 (0.0131) model time 0.0000 (0.0000) loss 2.6753 
(2.7407) grad_norm 12.5412 (5.7591) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][60/1251] eta 0:05:11 lr 0.000019 wd 0.0500 time 0.2431 (0.2619) data time 0.0008 (0.0111) model time 0.2423 (0.2731) loss 3.5108 (2.7251) grad_norm 8.2099 (5.7293) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][70/1251] eta 0:05:08 lr 0.000019 wd 0.0500 time 0.2401 (0.2613) data time 0.0010 (0.0097) model time 0.2391 (0.2650) loss 3.0317 (2.7327) grad_norm 6.7643 (5.6922) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][80/1251] eta 0:05:03 lr 0.000019 wd 0.0500 time 0.2384 (0.2591) data time 0.0009 (0.0086) model time 0.2374 (0.2573) loss 2.7788 (2.7401) grad_norm 4.1526 (5.6782) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][90/1251] eta 0:04:58 lr 0.000019 wd 0.0500 time 0.2352 (0.2571) data time 0.0010 (0.0078) model time 0.2342 (0.2530) loss 2.7979 (2.7230) grad_norm 7.2275 (5.7162) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][100/1251] eta 0:04:54 lr 0.000019 wd 0.0500 time 0.2395 (0.2557) data time 0.0009 (0.0071) model time 0.2386 (0.2508) loss 2.8419 (2.7070) grad_norm 5.1291 (5.7265) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][110/1251] eta 0:04:50 lr 0.000019 wd 0.0500 time 0.2409 (0.2544) data time 0.0009 (0.0066) model time 0.2400 (0.2489) loss 2.8592 (2.7234) grad_norm 6.2709 (5.7874) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][120/1251] eta 0:04:46 lr 0.000019 wd 0.0500 
time 0.2362 (0.2531) data time 0.0009 (0.0061) model time 0.2353 (0.2475) loss 2.7226 (2.7111) grad_norm 6.1550 (5.7130) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][130/1251] eta 0:04:42 lr 0.000019 wd 0.0500 time 0.2443 (0.2522) data time 0.0007 (0.0057) model time 0.2436 (0.2465) loss 2.2999 (2.6766) grad_norm 5.4333 (5.7030) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][140/1251] eta 0:04:39 lr 0.000019 wd 0.0500 time 0.2488 (0.2514) data time 0.0011 (0.0054) model time 0.2476 (0.2457) loss 2.5771 (2.6748) grad_norm 5.9253 (5.7367) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][150/1251] eta 0:04:35 lr 0.000019 wd 0.0500 time 0.2429 (0.2506) data time 0.0008 (0.0051) model time 0.2421 (0.2451) loss 3.3425 (2.6950) grad_norm 4.6073 (5.7957) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][160/1251] eta 0:04:34 lr 0.000019 wd 0.0500 time 0.2426 (0.2512) data time 0.0010 (0.0048) model time 0.2416 (0.2463) loss 2.8108 (2.6984) grad_norm 6.1224 (5.8965) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][170/1251] eta 0:04:30 lr 0.000019 wd 0.0500 time 0.2432 (0.2507) data time 0.0009 (0.0046) model time 0.2423 (0.2459) loss 2.9048 (2.6892) grad_norm 4.2896 (6.0056) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][180/1251] eta 0:04:27 lr 0.000019 wd 0.0500 time 0.2415 (0.2502) data time 0.0009 (0.0044) model time 0.2406 (0.2456) loss 3.1851 (2.6912) grad_norm 4.5638 (5.9781) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:53 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [283/300][190/1251] eta 0:04:25 lr 0.000019 wd 0.0500 time 0.2340 (0.2499) data time 0.0009 (0.0042) model time 0.2332 (0.2453) loss 2.4230 (2.6909) grad_norm 5.6277 (6.0219) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][200/1251] eta 0:04:22 lr 0.000019 wd 0.0500 time 0.2402 (0.2494) data time 0.0009 (0.0041) model time 0.2393 (0.2449) loss 2.0321 (2.6790) grad_norm 5.0086 (6.0426) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:28:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][210/1251] eta 0:04:20 lr 0.000019 wd 0.0500 time 0.2389 (0.2500) data time 0.0011 (0.0039) model time 0.2378 (0.2459) loss 2.7268 (2.6730) grad_norm 7.1457 (6.0313) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][220/1251] eta 0:04:17 lr 0.000019 wd 0.0500 time 0.2486 (0.2498) data time 0.0009 (0.0038) model time 0.2477 (0.2459) loss 1.9663 (2.6740) grad_norm 7.3717 (6.0140) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][230/1251] eta 0:04:14 lr 0.000019 wd 0.0500 time 0.2350 (0.2495) data time 0.0007 (0.0037) model time 0.2343 (0.2456) loss 2.9626 (2.6601) grad_norm 4.0374 (5.9672) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][240/1251] eta 0:04:11 lr 0.000019 wd 0.0500 time 0.2406 (0.2491) data time 0.0009 (0.0036) model time 0.2396 (0.2454) loss 2.2426 (2.6570) grad_norm 6.1233 (5.9737) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][250/1251] eta 0:04:09 lr 0.000019 wd 0.0500 time 0.2470 (0.2489) data time 0.0008 (0.0035) model time 0.2463 (0.2452) loss 2.7989 (2.6607) grad_norm 6.3292 
(6.3755) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][260/1251] eta 0:04:06 lr 0.000019 wd 0.0500 time 0.2437 (0.2485) data time 0.0010 (0.0034) model time 0.2427 (0.2449) loss 2.8495 (2.6567) grad_norm 6.2826 (6.3787) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][270/1251] eta 0:04:03 lr 0.000019 wd 0.0500 time 0.2407 (0.2483) data time 0.0007 (0.0033) model time 0.2400 (0.2447) loss 2.7239 (2.6588) grad_norm 5.3512 (6.3409) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][280/1251] eta 0:04:00 lr 0.000019 wd 0.0500 time 0.2391 (0.2481) data time 0.0011 (0.0032) model time 0.2381 (0.2445) loss 2.6158 (2.6620) grad_norm 4.6771 (6.2985) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][290/1251] eta 0:03:58 lr 0.000019 wd 0.0500 time 0.2401 (0.2478) data time 0.0008 (0.0031) model time 0.2393 (0.2444) loss 2.9337 (2.6591) grad_norm 4.3656 (6.2495) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][300/1251] eta 0:03:55 lr 0.000019 wd 0.0500 time 0.2394 (0.2476) data time 0.0007 (0.0030) model time 0.2387 (0.2442) loss 2.7626 (2.6667) grad_norm 5.2031 (6.2380) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][310/1251] eta 0:03:52 lr 0.000019 wd 0.0500 time 0.2297 (0.2474) data time 0.0009 (0.0030) model time 0.2289 (0.2440) loss 2.2213 (2.6758) grad_norm 7.7943 (6.2402) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][320/1251] eta 0:03:50 lr 0.000019 wd 0.0500 time 0.2436 (0.2472) data 
time 0.0010 (0.0029) model time 0.2426 (0.2439) loss 2.6090 (2.6756) grad_norm 7.4456 (6.2696) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][330/1251] eta 0:03:47 lr 0.000019 wd 0.0500 time 0.2373 (0.2470) data time 0.0007 (0.0029) model time 0.2366 (0.2438) loss 2.0893 (2.6721) grad_norm 6.2132 (6.3177) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][340/1251] eta 0:03:44 lr 0.000019 wd 0.0500 time 0.2441 (0.2469) data time 0.0007 (0.0028) model time 0.2433 (0.2437) loss 1.7456 (2.6711) grad_norm 5.1198 (6.3112) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][350/1251] eta 0:03:42 lr 0.000019 wd 0.0500 time 0.2385 (0.2467) data time 0.0009 (0.0028) model time 0.2376 (0.2435) loss 2.2316 (2.6678) grad_norm 6.4958 (6.3006) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][360/1251] eta 0:03:39 lr 0.000019 wd 0.0500 time 0.2361 (0.2465) data time 0.0013 (0.0027) model time 0.2349 (0.2434) loss 2.4740 (2.6575) grad_norm 4.4888 (6.2842) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][370/1251] eta 0:03:37 lr 0.000019 wd 0.0500 time 0.2328 (0.2464) data time 0.0008 (0.0027) model time 0.2320 (0.2433) loss 3.1272 (2.6608) grad_norm 3.6822 (6.2898) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][380/1251] eta 0:03:34 lr 0.000019 wd 0.0500 time 0.2424 (0.2468) data time 0.0008 (0.0026) model time 0.2416 (0.2439) loss 2.7829 (2.6651) grad_norm 4.5170 (6.2722) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [283/300][390/1251] eta 0:03:32 lr 0.000019 wd 0.0500 time 0.2410 (0.2467) data time 0.0008 (0.0026) model time 0.2402 (0.2438) loss 3.2942 (2.6658) grad_norm 4.3947 (6.2623) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][400/1251] eta 0:03:29 lr 0.000019 wd 0.0500 time 0.2443 (0.2466) data time 0.0010 (0.0025) model time 0.2433 (0.2437) loss 2.8566 (2.6684) grad_norm 4.3442 (6.2625) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][410/1251] eta 0:03:27 lr 0.000019 wd 0.0500 time 0.2392 (0.2465) data time 0.0008 (0.0025) model time 0.2384 (0.2437) loss 2.3506 (2.6685) grad_norm 3.3708 (6.2809) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][420/1251] eta 0:03:24 lr 0.000019 wd 0.0500 time 0.2447 (0.2463) data time 0.0007 (0.0025) model time 0.2441 (0.2435) loss 3.3276 (2.6662) grad_norm 13.2032 (6.4460) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][430/1251] eta 0:03:22 lr 0.000019 wd 0.0500 time 0.2402 (0.2462) data time 0.0011 (0.0024) model time 0.2391 (0.2434) loss 3.2008 (2.6639) grad_norm 5.4021 (6.4321) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][440/1251] eta 0:03:19 lr 0.000019 wd 0.0500 time 0.2401 (0.2462) data time 0.0008 (0.0024) model time 0.2393 (0.2434) loss 3.1428 (2.6690) grad_norm 3.7284 (6.3905) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][450/1251] eta 0:03:17 lr 0.000019 wd 0.0500 time 0.2340 (0.2461) data time 0.0012 (0.0024) model time 0.2328 (0.2434) loss 2.6177 (2.6655) grad_norm 3.5697 (6.4002) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 10:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][460/1251] eta 0:03:14 lr 0.000019 wd 0.0500 time 0.2476 (0.2460) data time 0.0007 (0.0023) model time 0.2468 (0.2433) loss 3.3798 (2.6646) grad_norm 5.4430 (6.3837) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][470/1251] eta 0:03:11 lr 0.000019 wd 0.0500 time 0.2374 (0.2458) data time 0.0007 (0.0023) model time 0.2367 (0.2432) loss 3.0095 (2.6645) grad_norm 5.3876 (6.3878) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][480/1251] eta 0:03:09 lr 0.000019 wd 0.0500 time 0.2444 (0.2457) data time 0.0009 (0.0023) model time 0.2435 (0.2431) loss 3.0152 (2.6659) grad_norm 5.4954 (6.3931) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][490/1251] eta 0:03:06 lr 0.000019 wd 0.0500 time 0.2388 (0.2456) data time 0.0010 (0.0022) model time 0.2377 (0.2430) loss 3.0247 (2.6665) grad_norm 6.4054 (6.3642) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][500/1251] eta 0:03:04 lr 0.000019 wd 0.0500 time 0.2433 (0.2455) data time 0.0009 (0.0022) model time 0.2423 (0.2429) loss 2.5618 (2.6615) grad_norm 4.8168 (6.3367) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][510/1251] eta 0:03:01 lr 0.000019 wd 0.0500 time 0.2392 (0.2454) data time 0.0007 (0.0022) model time 0.2384 (0.2429) loss 3.4595 (2.6617) grad_norm 99.3264 (6.4967) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][520/1251] eta 0:02:59 lr 0.000019 wd 0.0500 time 0.2430 (0.2454) data time 0.0007 (0.0022) model 
time 0.2423 (0.2429) loss 1.9427 (2.6585) grad_norm 6.2569 (6.4864) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][530/1251] eta 0:02:56 lr 0.000019 wd 0.0500 time 0.2438 (0.2453) data time 0.0009 (0.0021) model time 0.2429 (0.2428) loss 2.9807 (2.6569) grad_norm 3.9015 (6.4958) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][540/1251] eta 0:02:54 lr 0.000019 wd 0.0500 time 0.2451 (0.2452) data time 0.0010 (0.0021) model time 0.2441 (0.2428) loss 2.0650 (2.6550) grad_norm 4.2380 (6.4611) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][550/1251] eta 0:02:51 lr 0.000019 wd 0.0500 time 0.2426 (0.2452) data time 0.0007 (0.0021) model time 0.2419 (0.2427) loss 3.0423 (2.6569) grad_norm 7.9412 (6.4555) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][560/1251] eta 0:02:49 lr 0.000019 wd 0.0500 time 0.2380 (0.2451) data time 0.0011 (0.0021) model time 0.2370 (0.2427) loss 2.8623 (2.6578) grad_norm 5.3086 (6.4597) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][570/1251] eta 0:02:46 lr 0.000019 wd 0.0500 time 0.2414 (0.2451) data time 0.0011 (0.0021) model time 0.2403 (0.2427) loss 2.1007 (2.6579) grad_norm 5.5867 (6.4451) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][580/1251] eta 0:02:44 lr 0.000018 wd 0.0500 time 0.4545 (0.2454) data time 0.0007 (0.0020) model time 0.4538 (0.2430) loss 3.4206 (2.6607) grad_norm 4.2660 (6.4176) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][590/1251] 
eta 0:02:42 lr 0.000018 wd 0.0500 time 0.2341 (0.2460) data time 0.0007 (0.0020) model time 0.2334 (0.2438) loss 2.8600 (2.6625) grad_norm 4.8629 (6.4081) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][600/1251] eta 0:02:40 lr 0.000018 wd 0.0500 time 0.2446 (0.2460) data time 0.0010 (0.0020) model time 0.2437 (0.2437) loss 3.1781 (2.6667) grad_norm 4.6548 (6.4001) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][610/1251] eta 0:02:37 lr 0.000018 wd 0.0500 time 0.2346 (0.2459) data time 0.0010 (0.0020) model time 0.2336 (0.2437) loss 2.9648 (2.6654) grad_norm 5.2089 (6.4002) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][620/1251] eta 0:02:35 lr 0.000018 wd 0.0500 time 0.2310 (0.2458) data time 0.0009 (0.0020) model time 0.2301 (0.2436) loss 2.8267 (2.6664) grad_norm 6.9335 (6.3888) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][630/1251] eta 0:02:32 lr 0.000018 wd 0.0500 time 0.2574 (0.2457) data time 0.0008 (0.0020) model time 0.2566 (0.2435) loss 2.8801 (2.6687) grad_norm 4.2482 (6.3833) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][640/1251] eta 0:02:30 lr 0.000018 wd 0.0500 time 0.2440 (0.2456) data time 0.0008 (0.0019) model time 0.2432 (0.2434) loss 1.7908 (2.6660) grad_norm 4.0497 (6.3642) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][650/1251] eta 0:02:27 lr 0.000018 wd 0.0500 time 0.2443 (0.2456) data time 0.0007 (0.0019) model time 0.2436 (0.2434) loss 3.3088 (2.6681) grad_norm 4.0142 (6.3482) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
10:30:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][660/1251] eta 0:02:25 lr 0.000018 wd 0.0500 time 0.2453 (0.2455) data time 0.0010 (0.0019) model time 0.2443 (0.2433) loss 2.6112 (2.6665) grad_norm 5.2172 (6.3339) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][670/1251] eta 0:02:22 lr 0.000018 wd 0.0500 time 0.2385 (0.2454) data time 0.0008 (0.0019) model time 0.2377 (0.2433) loss 2.9805 (2.6682) grad_norm 5.0715 (6.3280) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][680/1251] eta 0:02:20 lr 0.000018 wd 0.0500 time 0.2360 (0.2454) data time 0.0012 (0.0019) model time 0.2348 (0.2433) loss 2.8288 (2.6693) grad_norm 3.9277 (6.3208) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][690/1251] eta 0:02:17 lr 0.000018 wd 0.0500 time 0.2382 (0.2453) data time 0.0009 (0.0019) model time 0.2373 (0.2432) loss 2.6117 (2.6701) grad_norm 6.0056 (6.3057) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][700/1251] eta 0:02:15 lr 0.000018 wd 0.0500 time 0.2406 (0.2453) data time 0.0009 (0.0019) model time 0.2397 (0.2432) loss 3.0117 (2.6700) grad_norm 5.2981 (6.2920) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][710/1251] eta 0:02:12 lr 0.000018 wd 0.0500 time 0.2403 (0.2452) data time 0.0010 (0.0019) model time 0.2393 (0.2431) loss 3.3534 (2.6682) grad_norm 6.3868 (6.2864) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][720/1251] eta 0:02:10 lr 0.000018 wd 0.0500 time 0.2414 (0.2452) data time 0.0009 (0.0018) model time 0.2405 (0.2431) loss 3.0449 
(2.6701) grad_norm 5.5816 (6.3063) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][730/1251] eta 0:02:07 lr 0.000018 wd 0.0500 time 0.2398 (0.2452) data time 0.0008 (0.0018) model time 0.2390 (0.2431) loss 2.0713 (2.6716) grad_norm 6.4633 (6.3273) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][740/1251] eta 0:02:05 lr 0.000018 wd 0.0500 time 0.2400 (0.2451) data time 0.0010 (0.0018) model time 0.2390 (0.2431) loss 2.9188 (2.6711) grad_norm 3.8752 (6.3271) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][750/1251] eta 0:02:02 lr 0.000018 wd 0.0500 time 0.2550 (0.2451) data time 0.0007 (0.0018) model time 0.2543 (0.2430) loss 3.1093 (2.6720) grad_norm 5.1571 (6.3328) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][760/1251] eta 0:02:00 lr 0.000018 wd 0.0500 time 0.2484 (0.2450) data time 0.0010 (0.0018) model time 0.2474 (0.2430) loss 3.0406 (2.6731) grad_norm 4.6181 (6.3337) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][770/1251] eta 0:01:57 lr 0.000018 wd 0.0500 time 0.2328 (0.2450) data time 0.0009 (0.0018) model time 0.2319 (0.2429) loss 1.8437 (2.6706) grad_norm 4.4655 (6.3252) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][780/1251] eta 0:01:55 lr 0.000018 wd 0.0500 time 0.2451 (0.2449) data time 0.0009 (0.0018) model time 0.2442 (0.2429) loss 2.9275 (2.6708) grad_norm 6.8511 (6.3226) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][790/1251] eta 0:01:52 lr 0.000018 wd 0.0500 
time 0.2306 (0.2449) data time 0.0009 (0.0018) model time 0.2297 (0.2429) loss 2.5086 (2.6718) grad_norm 5.4686 (6.3511) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][800/1251] eta 0:01:50 lr 0.000018 wd 0.0500 time 0.2416 (0.2449) data time 0.0008 (0.0018) model time 0.2408 (0.2429) loss 3.3432 (2.6729) grad_norm 6.8676 (6.3446) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][810/1251] eta 0:01:47 lr 0.000018 wd 0.0500 time 0.2354 (0.2448) data time 0.0009 (0.0018) model time 0.2346 (0.2429) loss 3.2936 (2.6751) grad_norm 10.1607 (6.3495) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][820/1251] eta 0:01:45 lr 0.000018 wd 0.0500 time 0.2396 (0.2448) data time 0.0010 (0.0017) model time 0.2385 (0.2428) loss 2.1451 (2.6735) grad_norm 5.7095 (6.3318) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][830/1251] eta 0:01:43 lr 0.000018 wd 0.0500 time 0.2393 (0.2447) data time 0.0009 (0.0017) model time 0.2384 (0.2428) loss 1.6913 (2.6713) grad_norm 4.3424 (6.3181) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][840/1251] eta 0:01:40 lr 0.000018 wd 0.0500 time 0.2444 (0.2447) data time 0.0010 (0.0017) model time 0.2434 (0.2428) loss 3.1651 (2.6719) grad_norm 4.6326 (6.3073) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][850/1251] eta 0:01:38 lr 0.000018 wd 0.0500 time 0.2368 (0.2447) data time 0.0007 (0.0017) model time 0.2361 (0.2427) loss 2.6086 (2.6719) grad_norm 3.9292 (6.2944) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:36 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [283/300][860/1251] eta 0:01:35 lr 0.000018 wd 0.0500 time 0.2355 (0.2446) data time 0.0007 (0.0017) model time 0.2349 (0.2427) loss 2.1445 (2.6731) grad_norm 3.9338 (6.2880) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][870/1251] eta 0:01:33 lr 0.000018 wd 0.0500 time 0.2427 (0.2446) data time 0.0007 (0.0017) model time 0.2420 (0.2427) loss 2.4977 (2.6757) grad_norm 3.7008 (6.2744) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][880/1251] eta 0:01:30 lr 0.000018 wd 0.0500 time 0.2393 (0.2446) data time 0.0011 (0.0017) model time 0.2382 (0.2426) loss 3.0727 (2.6795) grad_norm 5.1082 (6.2759) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][890/1251] eta 0:01:28 lr 0.000018 wd 0.0500 time 0.2382 (0.2445) data time 0.0010 (0.0017) model time 0.2372 (0.2426) loss 2.5781 (2.6806) grad_norm 4.9583 (6.2680) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][900/1251] eta 0:01:25 lr 0.000018 wd 0.0500 time 0.2495 (0.2445) data time 0.0010 (0.0017) model time 0.2485 (0.2426) loss 3.3966 (2.6835) grad_norm 5.3362 (6.2873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][910/1251] eta 0:01:23 lr 0.000018 wd 0.0500 time 0.2384 (0.2445) data time 0.0009 (0.0017) model time 0.2375 (0.2426) loss 2.1477 (2.6837) grad_norm 4.5161 (6.2795) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][920/1251] eta 0:01:20 lr 0.000018 wd 0.0500 time 0.2468 (0.2447) data time 0.0009 (0.0017) model time 0.2460 (0.2428) loss 3.1153 (2.6850) grad_norm 5.8872 
(6.2951) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][930/1251] eta 0:01:18 lr 0.000018 wd 0.0500 time 0.2401 (0.2447) data time 0.0011 (0.0017) model time 0.2390 (0.2428) loss 2.4653 (2.6841) grad_norm 5.7306 (6.2960) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][940/1251] eta 0:01:16 lr 0.000018 wd 0.0500 time 0.2443 (0.2447) data time 0.0011 (0.0016) model time 0.2432 (0.2428) loss 2.8193 (2.6845) grad_norm 5.1245 (6.2820) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:31:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][950/1251] eta 0:01:13 lr 0.000018 wd 0.0500 time 0.2443 (0.2446) data time 0.0011 (0.0016) model time 0.2432 (0.2428) loss 2.9103 (2.6846) grad_norm 7.7014 (6.2817) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][960/1251] eta 0:01:11 lr 0.000018 wd 0.0500 time 0.2441 (0.2446) data time 0.0009 (0.0016) model time 0.2432 (0.2428) loss 2.8316 (2.6851) grad_norm 3.8152 (6.2772) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][970/1251] eta 0:01:08 lr 0.000018 wd 0.0500 time 0.2440 (0.2446) data time 0.0009 (0.0016) model time 0.2431 (0.2428) loss 3.3770 (2.6874) grad_norm 4.6805 (6.2856) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][980/1251] eta 0:01:06 lr 0.000018 wd 0.0500 time 0.2384 (0.2446) data time 0.0007 (0.0016) model time 0.2377 (0.2428) loss 2.8928 (2.6873) grad_norm 7.9609 (6.2871) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][990/1251] eta 0:01:03 lr 0.000018 wd 0.0500 time 0.2442 (0.2446) data 
time 0.0007 (0.0016) model time 0.2435 (0.2427) loss 2.7615 (2.6861) grad_norm 6.4873 (6.2882) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1000/1251] eta 0:01:01 lr 0.000018 wd 0.0500 time 0.2524 (0.2445) data time 0.0008 (0.0016) model time 0.2516 (0.2427) loss 1.7915 (2.6830) grad_norm 6.0080 (6.2802) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1010/1251] eta 0:00:58 lr 0.000018 wd 0.0500 time 0.2338 (0.2445) data time 0.0009 (0.0016) model time 0.2329 (0.2427) loss 3.1698 (2.6842) grad_norm 4.3603 (6.2739) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1020/1251] eta 0:00:56 lr 0.000018 wd 0.0500 time 0.2426 (0.2445) data time 0.0013 (0.0016) model time 0.2414 (0.2427) loss 3.0399 (2.6849) grad_norm 10.1622 (6.2771) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1030/1251] eta 0:00:54 lr 0.000018 wd 0.0500 time 0.2447 (0.2445) data time 0.0007 (0.0016) model time 0.2440 (0.2427) loss 1.6358 (2.6831) grad_norm 6.9793 (6.2884) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1040/1251] eta 0:00:51 lr 0.000018 wd 0.0500 time 0.2468 (0.2444) data time 0.0007 (0.0016) model time 0.2461 (0.2426) loss 2.9073 (2.6826) grad_norm 7.9777 (6.2895) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1050/1251] eta 0:00:49 lr 0.000018 wd 0.0500 time 0.2308 (0.2444) data time 0.0008 (0.0016) model time 0.2300 (0.2426) loss 2.3923 (2.6816) grad_norm 6.4287 (6.2878) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [283/300][1060/1251] eta 0:00:46 lr 0.000018 wd 0.0500 time 0.2466 (0.2444) data time 0.0010 (0.0016) model time 0.2457 (0.2426) loss 1.8594 (2.6787) grad_norm 6.2967 (6.2932) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1070/1251] eta 0:00:44 lr 0.000018 wd 0.0500 time 0.2312 (0.2444) data time 0.0010 (0.0016) model time 0.2302 (0.2426) loss 1.9985 (2.6795) grad_norm 5.7467 (6.3061) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1080/1251] eta 0:00:41 lr 0.000018 wd 0.0500 time 0.2444 (0.2443) data time 0.0008 (0.0016) model time 0.2436 (0.2426) loss 3.1195 (2.6790) grad_norm 5.2124 (6.2967) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1090/1251] eta 0:00:39 lr 0.000018 wd 0.0500 time 0.2525 (0.2445) data time 0.0009 (0.0016) model time 0.2516 (0.2428) loss 2.7761 (2.6792) grad_norm 3.5639 (6.2948) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1100/1251] eta 0:00:36 lr 0.000018 wd 0.0500 time 0.2404 (0.2445) data time 0.0010 (0.0016) model time 0.2394 (0.2427) loss 3.3098 (2.6818) grad_norm 8.9242 (6.2951) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1110/1251] eta 0:00:34 lr 0.000018 wd 0.0500 time 0.2495 (0.2444) data time 0.0007 (0.0015) model time 0.2488 (0.2427) loss 3.1522 (2.6820) grad_norm 7.6163 (6.2916) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1120/1251] eta 0:00:32 lr 0.000018 wd 0.0500 time 0.2470 (0.2444) data time 0.0007 (0.0015) model time 0.2463 (0.2427) loss 3.1469 (2.6840) grad_norm 10.3327 (6.2866) loss_scale 
128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1130/1251] eta 0:00:29 lr 0.000018 wd 0.0500 time 0.2452 (0.2444) data time 0.0007 (0.0015) model time 0.2445 (0.2427) loss 2.9317 (2.6846) grad_norm 5.6842 (6.2847) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1140/1251] eta 0:00:27 lr 0.000018 wd 0.0500 time 0.2406 (0.2446) data time 0.0007 (0.0015) model time 0.2399 (0.2429) loss 3.3218 (2.6865) grad_norm 5.0598 (6.2765) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1150/1251] eta 0:00:24 lr 0.000018 wd 0.0500 time 0.2410 (0.2445) data time 0.0007 (0.0015) model time 0.2403 (0.2429) loss 3.0029 (2.6876) grad_norm 4.6638 (6.2769) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1160/1251] eta 0:00:22 lr 0.000018 wd 0.0500 time 0.2443 (0.2445) data time 0.0009 (0.0015) model time 0.2434 (0.2428) loss 2.4795 (2.6883) grad_norm 8.9543 (6.2895) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1170/1251] eta 0:00:19 lr 0.000018 wd 0.0500 time 0.2472 (0.2445) data time 0.0009 (0.0015) model time 0.2463 (0.2428) loss 2.7160 (2.6878) grad_norm 5.8660 (6.2831) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1180/1251] eta 0:00:17 lr 0.000018 wd 0.0500 time 0.2401 (0.2445) data time 0.0006 (0.0015) model time 0.2395 (0.2428) loss 3.0739 (2.6878) grad_norm 5.0673 (6.2759) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1190/1251] eta 0:00:14 lr 0.000018 wd 0.0500 time 0.2343 (0.2444) data time 0.0009 
(0.0015) model time 0.2334 (0.2428) loss 2.8903 (2.6886) grad_norm 10.4266 (6.2930) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:32:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1200/1251] eta 0:00:12 lr 0.000018 wd 0.0500 time 0.2492 (0.2444) data time 0.0010 (0.0015) model time 0.2482 (0.2428) loss 3.1030 (2.6901) grad_norm 4.4675 (6.2868) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1210/1251] eta 0:00:10 lr 0.000018 wd 0.0500 time 0.2397 (0.2444) data time 0.0009 (0.0015) model time 0.2388 (0.2428) loss 2.8865 (2.6905) grad_norm 4.7657 (6.2837) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1220/1251] eta 0:00:07 lr 0.000018 wd 0.0500 time 0.2501 (0.2444) data time 0.0008 (0.0015) model time 0.2492 (0.2427) loss 2.9902 (2.6910) grad_norm 4.9925 (6.2753) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1230/1251] eta 0:00:05 lr 0.000018 wd 0.0500 time 0.2416 (0.2444) data time 0.0008 (0.0015) model time 0.2408 (0.2427) loss 3.1763 (2.6924) grad_norm 19.0222 (6.2876) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1240/1251] eta 0:00:02 lr 0.000018 wd 0.0500 time 0.2252 (0.2443) data time 0.0005 (0.0015) model time 0.2248 (0.2426) loss 2.9007 (2.6934) grad_norm 4.5593 (6.2793) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [283/300][1250/1251] eta 0:00:00 lr 0.000018 wd 0.0500 time 0.2268 (0.2441) data time 0.0005 (0.0015) model time 0.2264 (0.2425) loss 2.3316 (2.6950) grad_norm 6.9485 (6.2783) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO 
EPOCH 283 training takes 0:05:05 [2024-09-01 10:33:11 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:33:12 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.400 (0.400) Loss 0.3933 (0.3933) Acc@1 93.359 (93.359) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:33:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.106) Loss 0.5776 (0.6150) Acc@1 90.137 (87.562) Acc@5 97.852 (97.710) Mem 7381MB [2024-09-01 10:33:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.093) Loss 0.9170 (0.6462) Acc@1 77.246 (86.537) Acc@5 95.410 (97.656) Mem 7381MB [2024-09-01 10:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.094 (0.088) Loss 1.1455 (0.7394) Acc@1 74.609 (84.350) Acc@5 92.969 (96.705) Mem 7381MB [2024-09-01 10:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0420 (0.7892) Acc@1 76.270 (83.146) Acc@5 94.043 (96.215) Mem 7381MB [2024-09-01 10:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.680 Acc@5 96.156 [2024-09-01 10:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% [2024-09-01 10:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.798 (0.798) Loss 0.3887 (0.3887) Acc@1 93.359 (93.359) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 10:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.078 (0.148) Loss 0.5645 (0.6071) Acc@1 90.430 (88.006) Acc@5 98.047 (97.781) Mem 7381MB [2024-09-01 10:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.115) Loss 0.9092 (0.6376) Acc@1 77.930 (86.775) Acc@5 
95.801 (97.735) Mem 7381MB [2024-09-01 10:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.102) Loss 1.1367 (0.7302) Acc@1 74.609 (84.570) Acc@5 92.676 (96.765) Mem 7381MB [2024-09-01 10:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0098 (0.7785) Acc@1 76.660 (83.379) Acc@5 94.531 (96.280) Mem 7381MB [2024-09-01 10:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.944 Acc@5 96.230 [2024-09-01 10:33:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][0/1251] eta 0:23:23 lr 0.000018 wd 0.0500 time 1.1223 (1.1223) data time 0.5339 (0.5339) model time 0.0000 (0.0000) loss 3.1976 (3.1976) grad_norm 4.4621 (4.4621) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][10/1251] eta 0:06:38 lr 0.000018 wd 0.0500 time 0.2357 (0.3208) data time 0.0013 (0.0495) model time 0.0000 (0.0000) loss 2.9346 (2.7260) grad_norm 6.1287 (5.5613) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][20/1251] eta 0:05:49 lr 0.000018 wd 0.0500 time 0.2584 (0.2842) data time 0.0011 (0.0264) model time 0.0000 (0.0000) loss 2.9852 (2.7247) grad_norm 4.7042 (5.7194) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][30/1251] eta 0:05:29 lr 0.000018 wd 0.0500 time 0.2393 (0.2696) data time 0.0008 (0.0182) model time 0.0000 (0.0000) loss 2.1135 (2.6863) grad_norm 8.9928 (6.3415) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][40/1251] eta 0:05:17 lr 0.000018 wd 0.0500 time 0.2434 (0.2623) data time 0.0008 (0.0140) model 
time 0.0000 (0.0000) loss 2.4943 (2.6582) grad_norm 6.4022 (6.3277) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][50/1251] eta 0:05:09 lr 0.000018 wd 0.0500 time 0.2364 (0.2580) data time 0.0010 (0.0114) model time 0.0000 (0.0000) loss 2.8705 (2.6602) grad_norm 6.3465 (6.2040) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][60/1251] eta 0:05:04 lr 0.000018 wd 0.0500 time 0.2368 (0.2554) data time 0.0008 (0.0097) model time 0.2360 (0.2410) loss 2.3131 (2.6528) grad_norm 5.4267 (6.0390) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][70/1251] eta 0:04:59 lr 0.000018 wd 0.0500 time 0.2340 (0.2533) data time 0.0008 (0.0085) model time 0.2332 (0.2404) loss 3.4006 (2.6619) grad_norm 6.1766 (6.2156) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][80/1251] eta 0:04:54 lr 0.000018 wd 0.0500 time 0.2411 (0.2518) data time 0.0011 (0.0076) model time 0.2400 (0.2402) loss 1.8754 (2.6430) grad_norm 4.5756 (6.2908) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][90/1251] eta 0:04:51 lr 0.000018 wd 0.0500 time 0.2468 (0.2507) data time 0.0009 (0.0068) model time 0.2460 (0.2405) loss 1.5837 (2.6193) grad_norm 3.8189 (6.2285) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][100/1251] eta 0:04:47 lr 0.000018 wd 0.0500 time 0.2435 (0.2497) data time 0.0007 (0.0063) model time 0.2428 (0.2401) loss 1.6210 (2.6066) grad_norm 4.4002 (6.1340) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][110/1251] eta 
0:04:43 lr 0.000018 wd 0.0500 time 0.2576 (0.2488) data time 0.0009 (0.0058) model time 0.2567 (0.2400) loss 3.3011 (2.6179) grad_norm 5.9838 (6.0744) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][120/1251] eta 0:04:40 lr 0.000018 wd 0.0500 time 0.2421 (0.2481) data time 0.0009 (0.0054) model time 0.2412 (0.2400) loss 3.0824 (2.6280) grad_norm 5.8241 (6.1543) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][130/1251] eta 0:04:37 lr 0.000018 wd 0.0500 time 0.2407 (0.2477) data time 0.0007 (0.0050) model time 0.2400 (0.2402) loss 2.6031 (2.6265) grad_norm 6.9680 (6.4522) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][140/1251] eta 0:04:34 lr 0.000018 wd 0.0500 time 0.2372 (0.2472) data time 0.0010 (0.0048) model time 0.2362 (0.2400) loss 3.0316 (2.6286) grad_norm 6.3977 (6.4147) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:33:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][150/1251] eta 0:04:34 lr 0.000018 wd 0.0500 time 0.2502 (0.2497) data time 0.0010 (0.0045) model time 0.2493 (0.2445) loss 3.2315 (2.6339) grad_norm 7.9002 (6.3995) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:34:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][160/1251] eta 0:04:33 lr 0.000018 wd 0.0500 time 0.2415 (0.2505) data time 0.0009 (0.0043) model time 0.2406 (0.2460) loss 3.2931 (2.6275) grad_norm 8.7800 (6.3856) loss_scale 256.0000 (133.5652) mem 7381MB [2024-09-01 10:34:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][170/1251] eta 0:04:30 lr 0.000018 wd 0.0500 time 0.2284 (0.2498) data time 0.0008 (0.0041) model time 0.2276 (0.2453) loss 2.4879 (2.6397) grad_norm 4.8766 (6.3622) loss_scale 256.0000 (140.7251) mem 7381MB [2024-09-01 
10:34:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][180/1251] eta 0:04:27 lr 0.000018 wd 0.0500 time 0.2361 (0.2494) data time 0.0011 (0.0039) model time 0.2350 (0.2450) loss 3.0999 (2.6284) grad_norm 3.6892 (6.3524) loss_scale 256.0000 (147.0939) mem 7381MB [2024-09-01 10:34:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][190/1251] eta 0:04:25 lr 0.000018 wd 0.0500 time 0.4365 (0.2500) data time 0.0011 (0.0038) model time 0.4355 (0.2460) loss 2.7577 (2.6251) grad_norm 4.9460 (6.3105) loss_scale 256.0000 (152.7958) mem 7381MB [2024-09-01 10:34:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][200/1251] eta 0:04:22 lr 0.000018 wd 0.0500 time 0.2377 (0.2495) data time 0.0011 (0.0036) model time 0.2366 (0.2457) loss 2.3495 (2.6336) grad_norm 3.8124 (6.2618) loss_scale 256.0000 (157.9303) mem 7381MB [2024-09-01 10:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][210/1251] eta 0:04:19 lr 0.000018 wd 0.0500 time 0.2355 (0.2493) data time 0.0009 (0.0035) model time 0.2346 (0.2455) loss 2.6986 (2.6454) grad_norm 6.1871 (6.2711) loss_scale 256.0000 (162.5782) mem 7381MB [2024-09-01 10:34:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][220/1251] eta 0:04:16 lr 0.000018 wd 0.0500 time 0.2383 (0.2489) data time 0.0008 (0.0034) model time 0.2374 (0.2452) loss 1.9814 (2.6364) grad_norm 5.4743 (6.5980) loss_scale 256.0000 (166.8054) mem 7381MB [2024-09-01 10:34:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][230/1251] eta 0:04:13 lr 0.000018 wd 0.0500 time 0.2357 (0.2486) data time 0.0009 (0.0033) model time 0.2348 (0.2449) loss 1.8792 (2.6385) grad_norm 5.6638 (6.6171) loss_scale 256.0000 (170.6667) mem 7381MB [2024-09-01 10:34:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][240/1251] eta 0:04:11 lr 0.000018 wd 0.0500 time 0.2526 (0.2483) data time 0.0009 (0.0032) model time 0.2516 (0.2447) loss 2.8122 
(2.6468) grad_norm 6.2488 (6.5839) loss_scale 256.0000 (174.2075) mem 7381MB [2024-09-01 10:34:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][250/1251] eta 0:04:08 lr 0.000018 wd 0.0500 time 0.2392 (0.2481) data time 0.0007 (0.0031) model time 0.2385 (0.2446) loss 3.5477 (2.6524) grad_norm 8.5650 (6.5300) loss_scale 256.0000 (177.4661) mem 7381MB [2024-09-01 10:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][260/1251] eta 0:04:05 lr 0.000018 wd 0.0500 time 0.2416 (0.2480) data time 0.0009 (0.0030) model time 0.2407 (0.2445) loss 2.0993 (2.6451) grad_norm 6.5592 (6.5133) loss_scale 256.0000 (180.4751) mem 7381MB [2024-09-01 10:34:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][270/1251] eta 0:04:03 lr 0.000018 wd 0.0500 time 0.2413 (0.2478) data time 0.0007 (0.0029) model time 0.2406 (0.2444) loss 2.9979 (2.6360) grad_norm 3.6726 (6.4893) loss_scale 256.0000 (183.2620) mem 7381MB [2024-09-01 10:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][280/1251] eta 0:04:00 lr 0.000018 wd 0.0500 time 0.2440 (0.2476) data time 0.0009 (0.0029) model time 0.2431 (0.2442) loss 2.6766 (2.6292) grad_norm 4.6851 (6.6051) loss_scale 256.0000 (185.8505) mem 7381MB [2024-09-01 10:34:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][290/1251] eta 0:03:57 lr 0.000018 wd 0.0500 time 0.2380 (0.2474) data time 0.0009 (0.0028) model time 0.2371 (0.2441) loss 3.2473 (2.6260) grad_norm 9.1803 (6.7040) loss_scale 256.0000 (188.2612) mem 7381MB [2024-09-01 10:34:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][300/1251] eta 0:03:55 lr 0.000018 wd 0.0500 time 0.2460 (0.2472) data time 0.0009 (0.0028) model time 0.2450 (0.2440) loss 2.6891 (2.6272) grad_norm 4.2151 (6.7560) loss_scale 256.0000 (190.5116) mem 7381MB [2024-09-01 10:34:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][310/1251] eta 0:03:52 lr 0.000018 wd 0.0500 
time 0.2316 (0.2470) data time 0.0007 (0.0027) model time 0.2309 (0.2439) loss 2.9104 (2.6339) grad_norm 7.4210 (6.7057) loss_scale 256.0000 (192.6174) mem 7381MB [2024-09-01 10:34:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][320/1251] eta 0:03:49 lr 0.000018 wd 0.0500 time 0.2403 (0.2469) data time 0.0011 (0.0026) model time 0.2392 (0.2438) loss 2.9775 (2.6366) grad_norm 5.1469 (6.8639) loss_scale 256.0000 (194.5919) mem 7381MB [2024-09-01 10:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][330/1251] eta 0:03:47 lr 0.000018 wd 0.0500 time 0.2442 (0.2468) data time 0.0011 (0.0026) model time 0.2432 (0.2437) loss 2.2641 (2.6392) grad_norm 6.2953 (6.8555) loss_scale 256.0000 (196.4471) mem 7381MB [2024-09-01 10:34:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][340/1251] eta 0:03:44 lr 0.000018 wd 0.0500 time 0.2444 (0.2466) data time 0.0009 (0.0025) model time 0.2435 (0.2436) loss 2.7057 (2.6401) grad_norm 5.3847 (6.8233) loss_scale 256.0000 (198.1935) mem 7381MB [2024-09-01 10:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][350/1251] eta 0:03:42 lr 0.000018 wd 0.0500 time 0.2415 (0.2470) data time 0.0010 (0.0025) model time 0.2405 (0.2442) loss 3.1021 (2.6470) grad_norm 4.7909 (6.8159) loss_scale 256.0000 (199.8405) mem 7381MB [2024-09-01 10:34:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][360/1251] eta 0:03:40 lr 0.000018 wd 0.0500 time 0.2398 (0.2470) data time 0.0009 (0.0025) model time 0.2389 (0.2442) loss 2.9868 (2.6508) grad_norm 5.9910 (6.8461) loss_scale 256.0000 (201.3961) mem 7381MB [2024-09-01 10:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][370/1251] eta 0:03:37 lr 0.000018 wd 0.0500 time 0.2369 (0.2468) data time 0.0011 (0.0024) model time 0.2358 (0.2441) loss 1.6738 (2.6507) grad_norm 6.7916 (6.8563) loss_scale 256.0000 (202.8679) mem 7381MB [2024-09-01 10:34:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [284/300][380/1251] eta 0:03:34 lr 0.000018 wd 0.0500 time 0.2518 (0.2468) data time 0.0010 (0.0024) model time 0.2509 (0.2440) loss 3.0124 (2.6521) grad_norm 3.7673 (6.8220) loss_scale 256.0000 (204.2625) mem 7381MB [2024-09-01 10:34:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][390/1251] eta 0:03:32 lr 0.000018 wd 0.0500 time 0.2407 (0.2466) data time 0.0009 (0.0023) model time 0.2398 (0.2439) loss 1.6549 (2.6530) grad_norm 5.7210 (6.8115) loss_scale 256.0000 (205.5857) mem 7381MB [2024-09-01 10:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][400/1251] eta 0:03:29 lr 0.000018 wd 0.0500 time 0.2444 (0.2466) data time 0.0010 (0.0023) model time 0.2434 (0.2439) loss 2.5075 (2.6502) grad_norm 4.7484 (6.8132) loss_scale 256.0000 (206.8429) mem 7381MB [2024-09-01 10:35:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][410/1251] eta 0:03:27 lr 0.000018 wd 0.0500 time 0.2339 (0.2465) data time 0.0011 (0.0023) model time 0.2328 (0.2439) loss 2.4489 (2.6437) grad_norm 6.2886 (6.7900) loss_scale 256.0000 (208.0389) mem 7381MB [2024-09-01 10:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][420/1251] eta 0:03:24 lr 0.000018 wd 0.0500 time 0.2432 (0.2464) data time 0.0009 (0.0023) model time 0.2422 (0.2438) loss 2.0759 (2.6431) grad_norm 4.2431 (6.8568) loss_scale 256.0000 (209.1781) mem 7381MB [2024-09-01 10:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][430/1251] eta 0:03:22 lr 0.000018 wd 0.0500 time 0.2382 (0.2463) data time 0.0009 (0.0022) model time 0.2373 (0.2437) loss 2.8301 (2.6408) grad_norm 9.7005 (6.8864) loss_scale 256.0000 (210.2645) mem 7381MB [2024-09-01 10:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][440/1251] eta 0:03:19 lr 0.000018 wd 0.0500 time 0.2458 (0.2462) data time 0.0007 (0.0022) model time 0.2451 (0.2436) loss 1.8478 (2.6460) grad_norm 4.3781 
(6.8757) loss_scale 256.0000 (211.3016) mem 7381MB [2024-09-01 10:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][450/1251] eta 0:03:17 lr 0.000018 wd 0.0500 time 0.2365 (0.2461) data time 0.0008 (0.0022) model time 0.2357 (0.2436) loss 2.9458 (2.6464) grad_norm 6.7569 (6.8528) loss_scale 256.0000 (212.2927) mem 7381MB [2024-09-01 10:35:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][460/1251] eta 0:03:14 lr 0.000018 wd 0.0500 time 0.2389 (0.2460) data time 0.0010 (0.0021) model time 0.2379 (0.2436) loss 2.7981 (2.6485) grad_norm 8.9726 (6.8493) loss_scale 256.0000 (213.2408) mem 7381MB [2024-09-01 10:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][470/1251] eta 0:03:12 lr 0.000018 wd 0.0500 time 0.2429 (0.2459) data time 0.0009 (0.0021) model time 0.2420 (0.2435) loss 1.9073 (2.6513) grad_norm 3.6481 (6.8063) loss_scale 256.0000 (214.1486) mem 7381MB [2024-09-01 10:35:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][480/1251] eta 0:03:09 lr 0.000018 wd 0.0500 time 0.2389 (0.2459) data time 0.0009 (0.0021) model time 0.2380 (0.2434) loss 2.7133 (2.6509) grad_norm 5.4166 (6.7965) loss_scale 256.0000 (215.0187) mem 7381MB [2024-09-01 10:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][490/1251] eta 0:03:07 lr 0.000018 wd 0.0500 time 0.2446 (0.2458) data time 0.0010 (0.0021) model time 0.2436 (0.2434) loss 3.2506 (2.6538) grad_norm 5.4091 (6.8210) loss_scale 256.0000 (215.8534) mem 7381MB [2024-09-01 10:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][500/1251] eta 0:03:04 lr 0.000018 wd 0.0500 time 0.2425 (0.2457) data time 0.0011 (0.0021) model time 0.2414 (0.2433) loss 2.7144 (2.6528) grad_norm 5.6194 (6.9176) loss_scale 256.0000 (216.6547) mem 7381MB [2024-09-01 10:35:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][510/1251] eta 0:03:01 lr 0.000018 wd 0.0500 time 0.2432 (0.2456) data 
time 0.0008 (0.0020) model time 0.2424 (0.2432) loss 2.5452 (2.6519) grad_norm 7.2608 (6.8855) loss_scale 256.0000 (217.4247) mem 7381MB [2024-09-01 10:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][520/1251] eta 0:02:59 lr 0.000018 wd 0.0500 time 0.2445 (0.2455) data time 0.0009 (0.0020) model time 0.2437 (0.2432) loss 3.3105 (2.6569) grad_norm 9.0824 (6.8880) loss_scale 256.0000 (218.1651) mem 7381MB [2024-09-01 10:35:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][530/1251] eta 0:02:56 lr 0.000018 wd 0.0500 time 0.2395 (0.2455) data time 0.0008 (0.0020) model time 0.2388 (0.2431) loss 3.1052 (2.6603) grad_norm 5.4178 (6.8777) loss_scale 256.0000 (218.8776) mem 7381MB [2024-09-01 10:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][540/1251] eta 0:02:54 lr 0.000018 wd 0.0500 time 0.2372 (0.2453) data time 0.0007 (0.0020) model time 0.2365 (0.2430) loss 2.7428 (2.6606) grad_norm 3.8604 (6.8686) loss_scale 256.0000 (219.5638) mem 7381MB [2024-09-01 10:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][550/1251] eta 0:02:51 lr 0.000018 wd 0.0500 time 0.2371 (0.2452) data time 0.0008 (0.0020) model time 0.2363 (0.2429) loss 3.3991 (2.6641) grad_norm 6.4456 (6.8320) loss_scale 256.0000 (220.2250) mem 7381MB [2024-09-01 10:35:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][560/1251] eta 0:02:49 lr 0.000018 wd 0.0500 time 0.2391 (0.2451) data time 0.0010 (0.0019) model time 0.2381 (0.2428) loss 2.6073 (2.6646) grad_norm 5.4328 (6.8153) loss_scale 256.0000 (220.8627) mem 7381MB [2024-09-01 10:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][570/1251] eta 0:02:46 lr 0.000018 wd 0.0500 time 0.2385 (0.2450) data time 0.0012 (0.0019) model time 0.2373 (0.2428) loss 3.1617 (2.6649) grad_norm 4.0479 (6.8000) loss_scale 256.0000 (221.4781) mem 7381MB [2024-09-01 10:35:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [284/300][580/1251] eta 0:02:44 lr 0.000018 wd 0.0500 time 0.2506 (0.2450) data time 0.0010 (0.0019) model time 0.2496 (0.2427) loss 3.1558 (2.6663) grad_norm 6.9633 (6.7961) loss_scale 256.0000 (222.0723) mem 7381MB [2024-09-01 10:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][590/1251] eta 0:02:41 lr 0.000017 wd 0.0500 time 0.2477 (0.2449) data time 0.0009 (0.0019) model time 0.2468 (0.2427) loss 2.7315 (2.6687) grad_norm 4.7946 (6.7930) loss_scale 256.0000 (222.6464) mem 7381MB [2024-09-01 10:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][600/1251] eta 0:02:39 lr 0.000017 wd 0.0500 time 0.2490 (0.2449) data time 0.0008 (0.0019) model time 0.2482 (0.2427) loss 3.0378 (2.6713) grad_norm 4.4262 (6.7728) loss_scale 256.0000 (223.2013) mem 7381MB [2024-09-01 10:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][610/1251] eta 0:02:36 lr 0.000017 wd 0.0500 time 0.2408 (0.2448) data time 0.0008 (0.0019) model time 0.2400 (0.2426) loss 2.5541 (2.6715) grad_norm 4.7342 (6.7951) loss_scale 256.0000 (223.7381) mem 7381MB [2024-09-01 10:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][620/1251] eta 0:02:34 lr 0.000017 wd 0.0500 time 0.2396 (0.2448) data time 0.0009 (0.0018) model time 0.2387 (0.2426) loss 3.0595 (2.6741) grad_norm 4.2308 (6.7699) loss_scale 256.0000 (224.2576) mem 7381MB [2024-09-01 10:35:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][630/1251] eta 0:02:31 lr 0.000017 wd 0.0500 time 0.2439 (0.2447) data time 0.0007 (0.0018) model time 0.2431 (0.2426) loss 3.2368 (2.6734) grad_norm 5.2430 (6.7622) loss_scale 256.0000 (224.7607) mem 7381MB [2024-09-01 10:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][640/1251] eta 0:02:29 lr 0.000017 wd 0.0500 time 0.2359 (0.2447) data time 0.0011 (0.0018) model time 0.2348 (0.2425) loss 2.8439 (2.6730) grad_norm 4.6567 (6.7653) loss_scale 256.0000 
(225.2480) mem 7381MB [2024-09-01 10:35:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][650/1251] eta 0:02:27 lr 0.000017 wd 0.0500 time 0.2444 (0.2446) data time 0.0009 (0.0018) model time 0.2435 (0.2425) loss 2.9295 (2.6753) grad_norm 8.1925 (6.7450) loss_scale 256.0000 (225.7204) mem 7381MB [2024-09-01 10:36:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][660/1251] eta 0:02:24 lr 0.000017 wd 0.0500 time 0.2333 (0.2446) data time 0.0011 (0.0018) model time 0.2322 (0.2424) loss 2.7702 (2.6744) grad_norm 4.5645 (6.7409) loss_scale 256.0000 (226.1785) mem 7381MB [2024-09-01 10:36:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][670/1251] eta 0:02:22 lr 0.000017 wd 0.0500 time 0.2460 (0.2445) data time 0.0011 (0.0018) model time 0.2449 (0.2424) loss 2.4067 (2.6737) grad_norm 3.4171 (6.7231) loss_scale 256.0000 (226.6230) mem 7381MB [2024-09-01 10:36:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][680/1251] eta 0:02:19 lr 0.000017 wd 0.0500 time 0.2466 (0.2445) data time 0.0010 (0.0018) model time 0.2456 (0.2424) loss 2.6543 (2.6770) grad_norm 5.5598 (6.7064) loss_scale 256.0000 (227.0543) mem 7381MB [2024-09-01 10:36:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][690/1251] eta 0:02:17 lr 0.000017 wd 0.0500 time 0.2368 (0.2445) data time 0.0010 (0.0018) model time 0.2358 (0.2424) loss 1.8563 (2.6782) grad_norm 7.2120 (6.6938) loss_scale 256.0000 (227.4732) mem 7381MB [2024-09-01 10:36:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][700/1251] eta 0:02:14 lr 0.000017 wd 0.0500 time 0.2446 (0.2444) data time 0.0009 (0.0018) model time 0.2438 (0.2424) loss 2.7134 (2.6778) grad_norm 4.0432 (6.6913) loss_scale 256.0000 (227.8802) mem 7381MB [2024-09-01 10:36:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][710/1251] eta 0:02:12 lr 0.000017 wd 0.0500 time 0.3981 (0.2446) data time 0.0010 (0.0017) model 
time 0.3971 (0.2426) loss 2.6642 (2.6744) grad_norm 5.8936 (6.6907) loss_scale 256.0000 (228.2757) mem 7381MB [2024-09-01 10:36:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][720/1251] eta 0:02:09 lr 0.000017 wd 0.0500 time 0.2463 (0.2446) data time 0.0009 (0.0017) model time 0.2453 (0.2426) loss 2.3547 (2.6724) grad_norm 3.9635 (6.6792) loss_scale 256.0000 (228.6602) mem 7381MB [2024-09-01 10:36:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][730/1251] eta 0:02:07 lr 0.000017 wd 0.0500 time 0.2481 (0.2449) data time 0.0007 (0.0017) model time 0.2474 (0.2430) loss 1.6391 (2.6706) grad_norm 3.8162 (6.6910) loss_scale 256.0000 (229.0342) mem 7381MB [2024-09-01 10:36:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][740/1251] eta 0:02:05 lr 0.000017 wd 0.0500 time 0.2489 (0.2449) data time 0.0009 (0.0017) model time 0.2480 (0.2430) loss 2.1235 (2.6703) grad_norm 7.7606 (6.6808) loss_scale 256.0000 (229.3981) mem 7381MB [2024-09-01 10:36:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][750/1251] eta 0:02:02 lr 0.000017 wd 0.0500 time 0.2431 (0.2449) data time 0.0007 (0.0017) model time 0.2424 (0.2429) loss 1.7652 (2.6717) grad_norm 3.6804 (6.6814) loss_scale 256.0000 (229.7523) mem 7381MB [2024-09-01 10:36:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][760/1251] eta 0:02:00 lr 0.000017 wd 0.0500 time 0.2436 (0.2448) data time 0.0008 (0.0017) model time 0.2429 (0.2429) loss 3.1877 (2.6684) grad_norm 5.0028 (6.6909) loss_scale 256.0000 (230.0972) mem 7381MB [2024-09-01 10:36:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][770/1251] eta 0:01:57 lr 0.000017 wd 0.0500 time 0.2424 (0.2448) data time 0.0007 (0.0017) model time 0.2417 (0.2429) loss 2.6492 (2.6659) grad_norm 9.4546 (6.6838) loss_scale 256.0000 (230.4332) mem 7381MB [2024-09-01 10:36:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][780/1251] 
eta 0:01:55 lr 0.000017 wd 0.0500 time 0.2536 (0.2448) data time 0.0009 (0.0017) model time 0.2527 (0.2428) loss 2.4279 (2.6674) grad_norm 7.9930 (6.6789) loss_scale 256.0000 (230.7606) mem 7381MB [2024-09-01 10:36:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][790/1251] eta 0:01:52 lr 0.000017 wd 0.0500 time 0.2505 (0.2448) data time 0.0007 (0.0017) model time 0.2498 (0.2429) loss 3.0889 (2.6698) grad_norm 11.3400 (6.6773) loss_scale 256.0000 (231.0796) mem 7381MB [2024-09-01 10:36:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][800/1251] eta 0:01:50 lr 0.000017 wd 0.0500 time 0.2427 (0.2447) data time 0.0007 (0.0017) model time 0.2420 (0.2428) loss 2.8813 (2.6706) grad_norm 4.8124 (6.6798) loss_scale 256.0000 (231.3908) mem 7381MB [2024-09-01 10:36:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][810/1251] eta 0:01:47 lr 0.000017 wd 0.0500 time 0.2366 (0.2447) data time 0.0008 (0.0016) model time 0.2358 (0.2428) loss 2.9262 (2.6738) grad_norm 6.0034 (6.6699) loss_scale 256.0000 (231.6942) mem 7381MB [2024-09-01 10:36:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][820/1251] eta 0:01:45 lr 0.000017 wd 0.0500 time 0.2512 (0.2447) data time 0.0007 (0.0016) model time 0.2505 (0.2428) loss 2.7854 (2.6738) grad_norm 8.8545 (6.6552) loss_scale 256.0000 (231.9903) mem 7381MB [2024-09-01 10:36:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][830/1251] eta 0:01:42 lr 0.000017 wd 0.0500 time 0.2354 (0.2447) data time 0.0011 (0.0016) model time 0.2342 (0.2428) loss 2.8122 (2.6732) grad_norm 3.9741 (6.6501) loss_scale 256.0000 (232.2792) mem 7381MB [2024-09-01 10:36:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][840/1251] eta 0:01:40 lr 0.000017 wd 0.0500 time 0.2399 (0.2446) data time 0.0009 (0.0016) model time 0.2390 (0.2428) loss 3.0222 (2.6759) grad_norm 4.6973 (6.6344) loss_scale 256.0000 (232.5612) mem 7381MB [2024-09-01 
10:36:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][850/1251] eta 0:01:38 lr 0.000017 wd 0.0500 time 0.2395 (0.2446) data time 0.0009 (0.0016) model time 0.2386 (0.2428) loss 3.0168 (2.6752) grad_norm 3.9585 (6.6156) loss_scale 256.0000 (232.8367) mem 7381MB [2024-09-01 10:36:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][860/1251] eta 0:01:35 lr 0.000017 wd 0.0500 time 0.2400 (0.2446) data time 0.0011 (0.0016) model time 0.2389 (0.2427) loss 1.7099 (2.6757) grad_norm 6.6907 (6.5965) loss_scale 256.0000 (233.1057) mem 7381MB [2024-09-01 10:36:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][870/1251] eta 0:01:33 lr 0.000017 wd 0.0500 time 0.2430 (0.2447) data time 0.0011 (0.0016) model time 0.2419 (0.2429) loss 2.7907 (2.6732) grad_norm 4.8290 (6.5807) loss_scale 256.0000 (233.3685) mem 7381MB [2024-09-01 10:36:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][880/1251] eta 0:01:30 lr 0.000017 wd 0.0500 time 0.2386 (0.2447) data time 0.0011 (0.0016) model time 0.2375 (0.2429) loss 2.5926 (2.6753) grad_norm 5.3459 (6.5716) loss_scale 256.0000 (233.6254) mem 7381MB [2024-09-01 10:36:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][890/1251] eta 0:01:28 lr 0.000017 wd 0.0500 time 0.2342 (0.2446) data time 0.0011 (0.0016) model time 0.2331 (0.2428) loss 2.0618 (2.6726) grad_norm 4.9561 (6.5571) loss_scale 256.0000 (233.8765) mem 7381MB [2024-09-01 10:37:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][900/1251] eta 0:01:25 lr 0.000017 wd 0.0500 time 0.2391 (0.2446) data time 0.0008 (0.0016) model time 0.2383 (0.2428) loss 3.2500 (2.6725) grad_norm 4.7014 (6.5465) loss_scale 256.0000 (234.1221) mem 7381MB [2024-09-01 10:37:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][910/1251] eta 0:01:23 lr 0.000017 wd 0.0500 time 0.2367 (0.2446) data time 0.0009 (0.0016) model time 0.2357 (0.2428) loss 3.0002 
(2.6734) grad_norm 5.1426 (6.5385) loss_scale 256.0000 (234.3622) mem 7381MB [2024-09-01 10:37:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][920/1251] eta 0:01:20 lr 0.000017 wd 0.0500 time 0.2269 (0.2446) data time 0.0008 (0.0016) model time 0.2261 (0.2428) loss 2.3140 (2.6710) grad_norm 9.6741 (6.5287) loss_scale 256.0000 (234.5972) mem 7381MB [2024-09-01 10:37:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][930/1251] eta 0:01:18 lr 0.000017 wd 0.0500 time 0.2393 (0.2446) data time 0.0011 (0.0016) model time 0.2382 (0.2428) loss 1.9077 (2.6708) grad_norm 4.4709 (6.5160) loss_scale 256.0000 (234.8271) mem 7381MB [2024-09-01 10:37:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][940/1251] eta 0:01:16 lr 0.000017 wd 0.0500 time 0.2454 (0.2446) data time 0.0007 (0.0016) model time 0.2447 (0.2428) loss 2.1821 (2.6685) grad_norm 6.0418 (6.5058) loss_scale 256.0000 (235.0521) mem 7381MB [2024-09-01 10:37:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][950/1251] eta 0:01:13 lr 0.000017 wd 0.0500 time 0.2370 (0.2445) data time 0.0009 (0.0015) model time 0.2362 (0.2428) loss 3.4304 (2.6692) grad_norm 4.9958 (6.5057) loss_scale 256.0000 (235.2723) mem 7381MB [2024-09-01 10:37:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][960/1251] eta 0:01:11 lr 0.000017 wd 0.0500 time 0.2283 (0.2445) data time 0.0009 (0.0015) model time 0.2274 (0.2427) loss 3.3471 (2.6705) grad_norm 4.6216 (6.4926) loss_scale 256.0000 (235.4880) mem 7381MB [2024-09-01 10:37:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][970/1251] eta 0:01:08 lr 0.000017 wd 0.0500 time 0.2436 (0.2445) data time 0.0007 (0.0015) model time 0.2429 (0.2427) loss 2.3855 (2.6719) grad_norm 4.2462 (6.4806) loss_scale 256.0000 (235.6993) mem 7381MB [2024-09-01 10:37:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][980/1251] eta 0:01:06 lr 0.000017 wd 0.0500 
time 0.2427 (0.2444) data time 0.0009 (0.0015) model time 0.2418 (0.2427) loss 2.4039 (2.6721) grad_norm 8.1158 (6.4734) loss_scale 256.0000 (235.9062) mem 7381MB [2024-09-01 10:37:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][990/1251] eta 0:01:03 lr 0.000017 wd 0.0500 time 0.2373 (0.2444) data time 0.0007 (0.0015) model time 0.2366 (0.2427) loss 2.5962 (2.6728) grad_norm 4.3405 (6.5505) loss_scale 256.0000 (236.1090) mem 7381MB [2024-09-01 10:37:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1000/1251] eta 0:01:01 lr 0.000017 wd 0.0500 time 0.2443 (0.2444) data time 0.0010 (0.0015) model time 0.2433 (0.2427) loss 3.1424 (2.6737) grad_norm 5.8933 (6.5395) loss_scale 256.0000 (236.3077) mem 7381MB [2024-09-01 10:37:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1010/1251] eta 0:00:58 lr 0.000017 wd 0.0500 time 0.2392 (0.2443) data time 0.0010 (0.0015) model time 0.2382 (0.2426) loss 2.8497 (2.6739) grad_norm 7.7490 (6.5377) loss_scale 256.0000 (236.5025) mem 7381MB [2024-09-01 10:37:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1020/1251] eta 0:00:56 lr 0.000017 wd 0.0500 time 0.2419 (0.2443) data time 0.0010 (0.0015) model time 0.2409 (0.2426) loss 2.9544 (2.6732) grad_norm 4.6391 (6.5546) loss_scale 256.0000 (236.6934) mem 7381MB [2024-09-01 10:37:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1030/1251] eta 0:00:53 lr 0.000017 wd 0.0500 time 0.2446 (0.2443) data time 0.0011 (0.0015) model time 0.2435 (0.2426) loss 3.0496 (2.6745) grad_norm 5.8155 (6.5475) loss_scale 256.0000 (236.8807) mem 7381MB [2024-09-01 10:37:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1040/1251] eta 0:00:51 lr 0.000017 wd 0.0500 time 0.2442 (0.2443) data time 0.0010 (0.0015) model time 0.2432 (0.2426) loss 2.9249 (2.6721) grad_norm 4.6653 (6.5320) loss_scale 256.0000 (237.0644) mem 7381MB [2024-09-01 10:37:37 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [284/300][1050/1251] eta 0:00:49 lr 0.000017 wd 0.0500 time 0.2467 (0.2443) data time 0.0009 (0.0015) model time 0.2458 (0.2426) loss 2.8530 (2.6714) grad_norm 4.1893 (6.5197) loss_scale 256.0000 (237.2445) mem 7381MB [2024-09-01 10:37:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1060/1251] eta 0:00:46 lr 0.000017 wd 0.0500 time 0.2443 (0.2443) data time 0.0009 (0.0015) model time 0.2434 (0.2426) loss 2.6787 (2.6717) grad_norm 5.7655 (6.5170) loss_scale 256.0000 (237.4213) mem 7381MB [2024-09-01 10:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1070/1251] eta 0:00:44 lr 0.000017 wd 0.0500 time 0.2418 (0.2443) data time 0.0011 (0.0015) model time 0.2407 (0.2426) loss 2.9148 (2.6689) grad_norm 9.4370 (6.5269) loss_scale 256.0000 (237.5948) mem 7381MB [2024-09-01 10:37:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1080/1251] eta 0:00:41 lr 0.000017 wd 0.0500 time 0.2417 (0.2446) data time 0.0010 (0.0015) model time 0.2407 (0.2430) loss 3.1020 (2.6691) grad_norm 5.3005 (6.5256) loss_scale 256.0000 (237.7650) mem 7381MB [2024-09-01 10:37:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1090/1251] eta 0:00:39 lr 0.000017 wd 0.0500 time 0.2416 (0.2448) data time 0.0011 (0.0015) model time 0.2405 (0.2431) loss 3.2074 (2.6697) grad_norm 7.7720 (6.5281) loss_scale 256.0000 (237.9322) mem 7381MB [2024-09-01 10:37:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1100/1251] eta 0:00:36 lr 0.000017 wd 0.0500 time 0.2361 (0.2448) data time 0.0009 (0.0015) model time 0.2352 (0.2431) loss 3.2208 (2.6712) grad_norm 20.4935 (6.5628) loss_scale 256.0000 (238.0963) mem 7381MB [2024-09-01 10:37:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1110/1251] eta 0:00:34 lr 0.000017 wd 0.0500 time 0.2606 (0.2447) data time 0.0007 (0.0015) model time 0.2599 (0.2431) loss 2.6408 (2.6716) grad_norm 
13.3015 (6.5665) loss_scale 256.0000 (238.2574) mem 7381MB [2024-09-01 10:37:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1120/1251] eta 0:00:32 lr 0.000017 wd 0.0500 time 0.2379 (0.2447) data time 0.0011 (0.0015) model time 0.2368 (0.2431) loss 3.1463 (2.6733) grad_norm 6.2657 (6.5611) loss_scale 256.0000 (238.4157) mem 7381MB [2024-09-01 10:37:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1130/1251] eta 0:00:29 lr 0.000017 wd 0.0500 time 0.2400 (0.2447) data time 0.0008 (0.0015) model time 0.2392 (0.2430) loss 3.4628 (2.6746) grad_norm 6.0294 (6.5487) loss_scale 256.0000 (238.5712) mem 7381MB [2024-09-01 10:37:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1140/1251] eta 0:00:27 lr 0.000017 wd 0.0500 time 0.2368 (0.2447) data time 0.0007 (0.0015) model time 0.2361 (0.2430) loss 3.0399 (2.6736) grad_norm 10.4455 (6.5427) loss_scale 256.0000 (238.7239) mem 7381MB [2024-09-01 10:38:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1150/1251] eta 0:00:24 lr 0.000017 wd 0.0500 time 0.2396 (0.2446) data time 0.0009 (0.0015) model time 0.2387 (0.2430) loss 2.9864 (2.6755) grad_norm 5.1795 (6.5329) loss_scale 256.0000 (238.8740) mem 7381MB [2024-09-01 10:38:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1160/1251] eta 0:00:22 lr 0.000017 wd 0.0500 time 0.2375 (0.2446) data time 0.0009 (0.0014) model time 0.2366 (0.2430) loss 2.8971 (2.6765) grad_norm 6.2427 (6.5279) loss_scale 256.0000 (239.0215) mem 7381MB [2024-09-01 10:38:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1170/1251] eta 0:00:19 lr 0.000017 wd 0.0500 time 0.2398 (0.2446) data time 0.0007 (0.0014) model time 0.2391 (0.2430) loss 3.0609 (2.6772) grad_norm 5.9959 (6.5318) loss_scale 256.0000 (239.1665) mem 7381MB [2024-09-01 10:38:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1180/1251] eta 0:00:17 lr 0.000017 wd 0.0500 time 
0.2375 (0.2446) data time 0.0010 (0.0014) model time 0.2365 (0.2430) loss 2.9546 (2.6781) grad_norm 8.5285 (6.5311) loss_scale 256.0000 (239.3091) mem 7381MB [2024-09-01 10:38:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1190/1251] eta 0:00:14 lr 0.000017 wd 0.0500 time 0.2420 (0.2446) data time 0.0008 (0.0014) model time 0.2413 (0.2430) loss 2.8390 (2.6781) grad_norm 4.4856 (6.5419) loss_scale 256.0000 (239.4492) mem 7381MB [2024-09-01 10:38:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1200/1251] eta 0:00:12 lr 0.000017 wd 0.0500 time 0.2417 (0.2446) data time 0.0010 (0.0014) model time 0.2407 (0.2430) loss 2.5079 (2.6781) grad_norm 7.2352 (6.5475) loss_scale 256.0000 (239.5870) mem 7381MB [2024-09-01 10:38:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1210/1251] eta 0:00:10 lr 0.000017 wd 0.0500 time 0.2341 (0.2446) data time 0.0008 (0.0014) model time 0.2334 (0.2430) loss 3.5074 (2.6796) grad_norm 6.8765 (6.5466) loss_scale 256.0000 (239.7225) mem 7381MB [2024-09-01 10:38:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1220/1251] eta 0:00:07 lr 0.000017 wd 0.0500 time 0.2414 (0.2445) data time 0.0009 (0.0014) model time 0.2405 (0.2430) loss 2.8975 (2.6795) grad_norm 5.5592 (6.5417) loss_scale 256.0000 (239.8559) mem 7381MB [2024-09-01 10:38:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1230/1251] eta 0:00:05 lr 0.000017 wd 0.0500 time 0.2369 (0.2445) data time 0.0010 (0.0014) model time 0.2359 (0.2430) loss 2.5916 (2.6803) grad_norm 3.6724 (6.5345) loss_scale 256.0000 (239.9870) mem 7381MB [2024-09-01 10:38:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [284/300][1240/1251] eta 0:00:02 lr 0.000017 wd 0.0500 time 0.2241 (0.2446) data time 0.0007 (0.0014) model time 0.2235 (0.2430) loss 2.9450 (2.6796) grad_norm 9.5871 (6.5359) loss_scale 256.0000 (240.1160) mem 7381MB [2024-09-01 10:38:26 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [284/300][1250/1251] eta 0:00:00 lr 0.000017 wd 0.0500 time 0.2234 (0.2444) data time 0.0007 (0.0014) model time 0.2227 (0.2429) loss 2.7819 (2.6800) grad_norm 9.5198 (6.5336) loss_scale 256.0000 (240.2430) mem 7381MB [2024-09-01 10:38:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 284 training takes 0:05:05 [2024-09-01 10:38:26 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:38:27 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.422 (0.422) Loss 0.3904 (0.3904) Acc@1 92.969 (92.969) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:38:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.109) Loss 0.5664 (0.6122) Acc@1 90.430 (87.695) Acc@5 98.047 (97.798) Mem 7381MB [2024-09-01 10:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.081 (0.095) Loss 0.9155 (0.6433) Acc@1 77.734 (86.621) Acc@5 95.898 (97.721) Mem 7381MB [2024-09-01 10:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 1.1455 (0.7362) Acc@1 75.000 (84.384) Acc@5 92.676 (96.762) Mem 7381MB [2024-09-01 10:38:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.084) Loss 1.0234 (0.7848) Acc@1 77.051 (83.210) Acc@5 94.336 (96.263) Mem 7381MB [2024-09-01 10:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.772 Acc@5 96.218 [2024-09-01 10:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 10:38:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.822 (0.822) Loss 0.3887 (0.3887) Acc@1 93.457 (93.457) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 
10:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.148) Loss 0.5645 (0.6072) Acc@1 90.430 (88.006) Acc@5 98.047 (97.798) Mem 7381MB [2024-09-01 10:38:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.115) Loss 0.9106 (0.6378) Acc@1 77.930 (86.775) Acc@5 95.898 (97.754) Mem 7381MB [2024-09-01 10:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.080 (0.103) Loss 1.1367 (0.7305) Acc@1 74.512 (84.558) Acc@5 92.969 (96.780) Mem 7381MB [2024-09-01 10:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.0117 (0.7789) Acc@1 76.562 (83.375) Acc@5 94.336 (96.291) Mem 7381MB [2024-09-01 10:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.936 Acc@5 96.250 [2024-09-01 10:38:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][0/1251] eta 0:22:01 lr 0.000017 wd 0.0500 time 1.0564 (1.0564) data time 0.6278 (0.6278) model time 0.0000 (0.0000) loss 2.6905 (2.6905) grad_norm 5.5133 (5.5133) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][10/1251] eta 0:06:35 lr 0.000017 wd 0.0500 time 0.2398 (0.3187) data time 0.0006 (0.0579) model time 0.0000 (0.0000) loss 1.6898 (2.6845) grad_norm 4.7777 (8.6207) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][20/1251] eta 0:05:48 lr 0.000017 wd 0.0500 time 0.2452 (0.2828) data time 0.0007 (0.0308) model time 0.0000 (0.0000) loss 3.4772 (2.8384) grad_norm 5.7532 (7.5598) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][30/1251] eta 0:05:37 lr 0.000017 wd 0.0500 time 0.2290 (0.2761) 
data time 0.0010 (0.0212) model time 0.0000 (0.0000) loss 2.3712 (2.8257) grad_norm 4.9366 (6.7535) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][40/1251] eta 0:05:24 lr 0.000017 wd 0.0500 time 0.2451 (0.2680) data time 0.0007 (0.0163) model time 0.0000 (0.0000) loss 3.4968 (2.8363) grad_norm 4.5094 (6.7927) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][50/1251] eta 0:05:16 lr 0.000017 wd 0.0500 time 0.2448 (0.2633) data time 0.0007 (0.0133) model time 0.0000 (0.0000) loss 2.1782 (2.8401) grad_norm 19.1850 (7.0530) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][60/1251] eta 0:05:09 lr 0.000017 wd 0.0500 time 0.2392 (0.2596) data time 0.0010 (0.0113) model time 0.2383 (0.2394) loss 3.0982 (2.8303) grad_norm 7.9860 (7.1142) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][70/1251] eta 0:05:03 lr 0.000017 wd 0.0500 time 0.2434 (0.2570) data time 0.0008 (0.0098) model time 0.2426 (0.2400) loss 2.9331 (2.8061) grad_norm 6.8597 (7.3077) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][80/1251] eta 0:04:58 lr 0.000017 wd 0.0500 time 0.2475 (0.2551) data time 0.0009 (0.0087) model time 0.2467 (0.2402) loss 2.9691 (2.7774) grad_norm 4.7515 (6.9779) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][90/1251] eta 0:04:54 lr 0.000017 wd 0.0500 time 0.2417 (0.2538) data time 0.0009 (0.0079) model time 0.2408 (0.2406) loss 2.7859 (2.7558) grad_norm 5.3579 (6.8399) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [285/300][100/1251] eta 0:04:52 lr 0.000017 wd 0.0500 time 0.2343 (0.2544) data time 0.0010 (0.0072) model time 0.2333 (0.2444) loss 2.4454 (2.7209) grad_norm 5.3446 (6.7503) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][110/1251] eta 0:04:49 lr 0.000017 wd 0.0500 time 0.2418 (0.2534) data time 0.0008 (0.0066) model time 0.2409 (0.2441) loss 3.2511 (2.7379) grad_norm 6.9139 (6.6518) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][120/1251] eta 0:04:45 lr 0.000017 wd 0.0500 time 0.2464 (0.2525) data time 0.0009 (0.0061) model time 0.2455 (0.2437) loss 3.1836 (2.7488) grad_norm 7.4228 (7.0239) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][130/1251] eta 0:04:42 lr 0.000017 wd 0.0500 time 0.2432 (0.2518) data time 0.0010 (0.0058) model time 0.2422 (0.2434) loss 3.0644 (2.7497) grad_norm 4.0744 (6.9216) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][140/1251] eta 0:04:38 lr 0.000017 wd 0.0500 time 0.2410 (0.2510) data time 0.0007 (0.0054) model time 0.2402 (0.2431) loss 3.1934 (2.7487) grad_norm 7.7711 (6.8884) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][150/1251] eta 0:04:35 lr 0.000017 wd 0.0500 time 0.2433 (0.2505) data time 0.0008 (0.0051) model time 0.2425 (0.2430) loss 1.6654 (2.7373) grad_norm 3.7988 (6.8405) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][160/1251] eta 0:04:32 lr 0.000017 wd 0.0500 time 0.2316 (0.2498) data time 0.0009 (0.0049) model time 0.2307 (0.2425) loss 3.0214 (2.7414) grad_norm 8.0481 (6.8514) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 10:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][170/1251] eta 0:04:29 lr 0.000017 wd 0.0500 time 0.2413 (0.2492) data time 0.0009 (0.0046) model time 0.2404 (0.2423) loss 2.7085 (2.7357) grad_norm 7.3985 (6.8169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][180/1251] eta 0:04:26 lr 0.000017 wd 0.0500 time 0.2329 (0.2488) data time 0.0007 (0.0044) model time 0.2322 (0.2421) loss 3.0288 (2.7410) grad_norm 5.2067 (6.7318) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][190/1251] eta 0:04:23 lr 0.000017 wd 0.0500 time 0.2287 (0.2482) data time 0.0008 (0.0043) model time 0.2279 (0.2417) loss 2.9042 (2.7310) grad_norm 5.2017 (6.6894) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][200/1251] eta 0:04:20 lr 0.000017 wd 0.0500 time 0.2362 (0.2479) data time 0.0011 (0.0041) model time 0.2351 (0.2417) loss 2.5823 (2.7213) grad_norm 4.6118 (6.6558) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][210/1251] eta 0:04:17 lr 0.000017 wd 0.0500 time 0.2423 (0.2477) data time 0.0008 (0.0039) model time 0.2415 (0.2417) loss 3.1654 (2.7265) grad_norm 7.1952 (6.6388) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][220/1251] eta 0:04:15 lr 0.000017 wd 0.0500 time 0.2493 (0.2475) data time 0.0007 (0.0038) model time 0.2486 (0.2417) loss 2.5977 (2.7107) grad_norm 6.1217 (6.6325) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][230/1251] eta 0:04:12 lr 0.000017 wd 0.0500 time 0.2364 (0.2473) data time 0.0008 (0.0037) model 
time 0.2357 (0.2417) loss 2.3623 (2.7106) grad_norm 7.2278 (6.6193) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][240/1251] eta 0:04:09 lr 0.000017 wd 0.0500 time 0.2357 (0.2470) data time 0.0007 (0.0036) model time 0.2350 (0.2416) loss 2.9317 (2.7148) grad_norm 6.7943 (6.5712) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][250/1251] eta 0:04:07 lr 0.000017 wd 0.0500 time 0.2482 (0.2469) data time 0.0007 (0.0035) model time 0.2475 (0.2417) loss 1.7011 (2.7069) grad_norm 4.6781 (6.5076) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][260/1251] eta 0:04:04 lr 0.000017 wd 0.0500 time 0.2412 (0.2467) data time 0.0009 (0.0034) model time 0.2402 (0.2417) loss 2.9564 (2.7145) grad_norm 4.5524 (6.8314) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][270/1251] eta 0:04:01 lr 0.000017 wd 0.0500 time 0.2424 (0.2466) data time 0.0008 (0.0033) model time 0.2416 (0.2417) loss 2.5222 (2.7120) grad_norm 5.0409 (6.8565) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][280/1251] eta 0:03:59 lr 0.000017 wd 0.0500 time 0.2398 (0.2464) data time 0.0010 (0.0032) model time 0.2388 (0.2416) loss 2.0640 (2.6991) grad_norm 4.7933 (6.7957) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][290/1251] eta 0:03:56 lr 0.000017 wd 0.0500 time 0.2438 (0.2463) data time 0.0009 (0.0032) model time 0.2429 (0.2416) loss 2.6582 (2.6954) grad_norm 5.0474 (6.7663) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][300/1251] 
eta 0:03:54 lr 0.000017 wd 0.0500 time 0.2404 (0.2462) data time 0.0007 (0.0031) model time 0.2396 (0.2417) loss 1.8120 (2.6893) grad_norm 5.1334 (6.7438) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][310/1251] eta 0:03:51 lr 0.000017 wd 0.0500 time 0.2426 (0.2461) data time 0.0009 (0.0030) model time 0.2417 (0.2417) loss 1.9161 (2.6874) grad_norm 4.0077 (6.7019) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][320/1251] eta 0:03:48 lr 0.000017 wd 0.0500 time 0.2433 (0.2459) data time 0.0010 (0.0030) model time 0.2423 (0.2416) loss 2.9640 (2.6897) grad_norm 4.8310 (6.7787) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][330/1251] eta 0:03:46 lr 0.000017 wd 0.0500 time 0.2355 (0.2457) data time 0.0009 (0.0029) model time 0.2346 (0.2414) loss 2.1658 (2.6884) grad_norm 13.4657 (6.8173) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][340/1251] eta 0:03:43 lr 0.000017 wd 0.0500 time 0.2405 (0.2455) data time 0.0009 (0.0028) model time 0.2396 (0.2414) loss 3.1647 (2.6871) grad_norm 7.3662 (6.7758) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][350/1251] eta 0:03:41 lr 0.000017 wd 0.0500 time 0.2372 (0.2454) data time 0.0009 (0.0028) model time 0.2364 (0.2414) loss 2.8061 (2.6849) grad_norm 5.1858 (6.7604) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][360/1251] eta 0:03:38 lr 0.000017 wd 0.0500 time 0.2320 (0.2454) data time 0.0010 (0.0027) model time 0.2310 (0.2414) loss 2.8645 (2.6864) grad_norm 5.0033 (6.8972) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 
10:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][370/1251] eta 0:03:36 lr 0.000017 wd 0.0500 time 0.2494 (0.2453) data time 0.0008 (0.0027) model time 0.2486 (0.2415) loss 2.2559 (2.6904) grad_norm 5.0176 (6.8700) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][380/1251] eta 0:03:33 lr 0.000017 wd 0.0500 time 0.2460 (0.2452) data time 0.0007 (0.0027) model time 0.2453 (0.2414) loss 3.2637 (2.6996) grad_norm 12.0435 (6.8863) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][390/1251] eta 0:03:31 lr 0.000017 wd 0.0500 time 0.2442 (0.2451) data time 0.0011 (0.0026) model time 0.2431 (0.2414) loss 2.9513 (2.6907) grad_norm 6.1726 (6.8702) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][400/1251] eta 0:03:28 lr 0.000017 wd 0.0500 time 0.2399 (0.2451) data time 0.0007 (0.0026) model time 0.2392 (0.2414) loss 2.4464 (2.6885) grad_norm 3.8371 (6.8515) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][410/1251] eta 0:03:26 lr 0.000017 wd 0.0500 time 0.2318 (0.2450) data time 0.0009 (0.0025) model time 0.2309 (0.2414) loss 3.6705 (2.6932) grad_norm 6.8244 (6.8605) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][420/1251] eta 0:03:23 lr 0.000017 wd 0.0500 time 0.2467 (0.2449) data time 0.0010 (0.0025) model time 0.2457 (0.2414) loss 2.2382 (2.6904) grad_norm 9.4863 (6.8349) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][430/1251] eta 0:03:21 lr 0.000017 wd 0.0500 time 0.2446 (0.2449) data time 0.0007 (0.0025) model time 0.2439 (0.2414) loss 2.2124 
(2.6900) grad_norm 4.6273 (6.8015) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][440/1251] eta 0:03:18 lr 0.000017 wd 0.0500 time 0.2399 (0.2449) data time 0.0010 (0.0024) model time 0.2390 (0.2414) loss 1.5420 (2.6816) grad_norm 4.5296 (6.7777) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][450/1251] eta 0:03:16 lr 0.000017 wd 0.0500 time 0.2335 (0.2448) data time 0.0012 (0.0024) model time 0.2323 (0.2414) loss 2.7548 (2.6791) grad_norm 6.6231 (6.7987) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][460/1251] eta 0:03:13 lr 0.000017 wd 0.0500 time 0.2443 (0.2448) data time 0.0009 (0.0024) model time 0.2434 (0.2415) loss 2.9016 (2.6743) grad_norm 7.0948 (6.7815) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][470/1251] eta 0:03:11 lr 0.000017 wd 0.0500 time 0.2415 (0.2447) data time 0.0009 (0.0023) model time 0.2406 (0.2415) loss 2.8537 (2.6747) grad_norm 8.2448 (6.7751) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][480/1251] eta 0:03:08 lr 0.000017 wd 0.0500 time 0.2308 (0.2447) data time 0.0008 (0.0023) model time 0.2301 (0.2415) loss 2.6807 (2.6773) grad_norm 4.7684 (6.7459) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][490/1251] eta 0:03:06 lr 0.000017 wd 0.0500 time 0.2370 (0.2446) data time 0.0011 (0.0023) model time 0.2360 (0.2415) loss 2.7146 (2.6786) grad_norm 4.6946 (6.7306) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][500/1251] eta 0:03:03 lr 0.000017 wd 0.0500 
time 0.2544 (0.2446) data time 0.0009 (0.0023) model time 0.2534 (0.2415) loss 2.7766 (2.6764) grad_norm 7.1376 (6.7186) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][510/1251] eta 0:03:01 lr 0.000017 wd 0.0500 time 0.2468 (0.2446) data time 0.0009 (0.0022) model time 0.2459 (0.2415) loss 3.0396 (2.6759) grad_norm 6.8223 (6.7106) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][520/1251] eta 0:02:58 lr 0.000017 wd 0.0500 time 0.2356 (0.2445) data time 0.0011 (0.0022) model time 0.2345 (0.2415) loss 3.0563 (2.6706) grad_norm 8.4307 (6.7652) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][530/1251] eta 0:02:56 lr 0.000017 wd 0.0500 time 0.2399 (0.2448) data time 0.0007 (0.0022) model time 0.2392 (0.2418) loss 3.0229 (2.6719) grad_norm 6.0682 (6.7778) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][540/1251] eta 0:02:54 lr 0.000017 wd 0.0500 time 0.2435 (0.2448) data time 0.0010 (0.0022) model time 0.2426 (0.2419) loss 2.8642 (2.6760) grad_norm 5.5390 (6.7502) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][550/1251] eta 0:02:51 lr 0.000017 wd 0.0500 time 0.2412 (0.2447) data time 0.0007 (0.0021) model time 0.2404 (0.2418) loss 2.9612 (2.6750) grad_norm 6.5021 (6.7307) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][560/1251] eta 0:02:49 lr 0.000017 wd 0.0500 time 0.2481 (0.2447) data time 0.0007 (0.0021) model time 0.2474 (0.2418) loss 2.7218 (2.6742) grad_norm 5.3029 (6.7218) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:55 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [285/300][570/1251] eta 0:02:46 lr 0.000017 wd 0.0500 time 0.2372 (0.2449) data time 0.0010 (0.0021) model time 0.2362 (0.2421) loss 3.2745 (2.6731) grad_norm 3.7320 (6.6974) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][580/1251] eta 0:02:44 lr 0.000017 wd 0.0500 time 0.2424 (0.2449) data time 0.0007 (0.0021) model time 0.2417 (0.2422) loss 2.5620 (2.6713) grad_norm 5.7764 (6.6795) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][590/1251] eta 0:02:41 lr 0.000017 wd 0.0500 time 0.2415 (0.2448) data time 0.0014 (0.0021) model time 0.2402 (0.2421) loss 2.7222 (2.6724) grad_norm 5.2881 (6.6848) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][600/1251] eta 0:02:39 lr 0.000017 wd 0.0500 time 0.2370 (0.2448) data time 0.0010 (0.0020) model time 0.2360 (0.2420) loss 2.9133 (2.6728) grad_norm 6.1285 (6.6681) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][610/1251] eta 0:02:36 lr 0.000017 wd 0.0500 time 0.2393 (0.2447) data time 0.0010 (0.0020) model time 0.2384 (0.2420) loss 2.7858 (2.6740) grad_norm 5.7995 (6.6460) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][620/1251] eta 0:02:34 lr 0.000017 wd 0.0500 time 0.2472 (0.2446) data time 0.0007 (0.0020) model time 0.2465 (0.2420) loss 1.9883 (2.6724) grad_norm 4.3150 (6.6342) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][630/1251] eta 0:02:32 lr 0.000017 wd 0.0500 time 0.2491 (0.2449) data time 0.0011 (0.0020) model time 0.2480 (0.2423) loss 2.7197 (2.6746) grad_norm 5.5600 
(6.6230) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][640/1251] eta 0:02:30 lr 0.000017 wd 0.0500 time 0.2479 (0.2456) data time 0.0011 (0.0020) model time 0.2468 (0.2430) loss 3.2125 (2.6745) grad_norm 8.2079 (6.6069) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][650/1251] eta 0:02:27 lr 0.000017 wd 0.0500 time 0.2277 (0.2455) data time 0.0012 (0.0020) model time 0.2265 (0.2430) loss 2.5225 (2.6715) grad_norm 5.4476 (6.5950) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][660/1251] eta 0:02:25 lr 0.000017 wd 0.0500 time 0.2384 (0.2455) data time 0.0011 (0.0020) model time 0.2373 (0.2430) loss 3.1700 (2.6720) grad_norm 4.6442 (6.6087) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][670/1251] eta 0:02:22 lr 0.000017 wd 0.0500 time 0.2494 (0.2455) data time 0.0007 (0.0019) model time 0.2487 (0.2430) loss 3.1315 (2.6708) grad_norm 6.2961 (6.5902) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][680/1251] eta 0:02:20 lr 0.000016 wd 0.0500 time 0.2429 (0.2454) data time 0.0008 (0.0019) model time 0.2422 (0.2430) loss 2.1351 (2.6685) grad_norm 7.0042 (6.5667) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][690/1251] eta 0:02:17 lr 0.000016 wd 0.0500 time 0.2412 (0.2454) data time 0.0007 (0.0019) model time 0.2405 (0.2430) loss 3.3079 (2.6700) grad_norm 6.1572 (6.5525) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][700/1251] eta 0:02:15 lr 0.000016 wd 0.0500 time 0.2385 (0.2453) data 
time 0.0008 (0.0019) model time 0.2378 (0.2429) loss 2.7464 (2.6716) grad_norm 5.2905 (6.5444) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][710/1251] eta 0:02:12 lr 0.000016 wd 0.0500 time 0.2373 (0.2453) data time 0.0007 (0.0019) model time 0.2366 (0.2429) loss 3.3960 (2.6720) grad_norm 6.1088 (6.5318) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][720/1251] eta 0:02:10 lr 0.000016 wd 0.0500 time 0.2397 (0.2453) data time 0.0007 (0.0019) model time 0.2390 (0.2429) loss 2.1804 (2.6718) grad_norm 5.9872 (6.5221) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][730/1251] eta 0:02:07 lr 0.000016 wd 0.0500 time 0.2434 (0.2452) data time 0.0007 (0.0019) model time 0.2427 (0.2428) loss 3.0397 (2.6703) grad_norm 5.3330 (6.4993) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][740/1251] eta 0:02:05 lr 0.000016 wd 0.0500 time 0.2419 (0.2452) data time 0.0007 (0.0018) model time 0.2412 (0.2428) loss 1.7592 (2.6680) grad_norm 4.9319 (6.4797) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][750/1251] eta 0:02:02 lr 0.000016 wd 0.0500 time 0.2392 (0.2451) data time 0.0012 (0.0018) model time 0.2380 (0.2428) loss 2.6807 (2.6635) grad_norm 8.2387 (6.5025) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][760/1251] eta 0:02:00 lr 0.000016 wd 0.0500 time 0.2435 (0.2451) data time 0.0010 (0.0018) model time 0.2425 (0.2428) loss 2.5792 (2.6666) grad_norm 5.2059 (6.4924) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [285/300][770/1251] eta 0:01:57 lr 0.000016 wd 0.0500 time 0.2282 (0.2450) data time 0.0009 (0.0018) model time 0.2273 (0.2427) loss 2.9162 (2.6688) grad_norm 6.7526 (6.4913) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][780/1251] eta 0:01:55 lr 0.000016 wd 0.0500 time 0.2385 (0.2450) data time 0.0011 (0.0018) model time 0.2375 (0.2427) loss 3.1988 (2.6708) grad_norm 4.6204 (6.4841) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][790/1251] eta 0:01:52 lr 0.000016 wd 0.0500 time 0.2437 (0.2449) data time 0.0009 (0.0018) model time 0.2428 (0.2426) loss 2.6887 (2.6706) grad_norm 11.3582 (6.4971) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][800/1251] eta 0:01:50 lr 0.000016 wd 0.0500 time 0.2439 (0.2448) data time 0.0007 (0.0018) model time 0.2432 (0.2426) loss 2.7221 (2.6690) grad_norm 6.4192 (inf) loss_scale 128.0000 (255.6804) mem 7381MB [2024-09-01 10:41:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][810/1251] eta 0:01:47 lr 0.000016 wd 0.0500 time 0.2427 (0.2448) data time 0.0010 (0.0018) model time 0.2417 (0.2425) loss 2.7773 (2.6710) grad_norm 4.8500 (inf) loss_scale 128.0000 (254.1060) mem 7381MB [2024-09-01 10:41:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][820/1251] eta 0:01:45 lr 0.000016 wd 0.0500 time 0.2506 (0.2448) data time 0.0007 (0.0018) model time 0.2499 (0.2426) loss 3.3272 (2.6702) grad_norm 7.3669 (inf) loss_scale 128.0000 (252.5700) mem 7381MB [2024-09-01 10:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][830/1251] eta 0:01:43 lr 0.000016 wd 0.0500 time 0.2537 (0.2448) data time 0.0009 (0.0018) model time 0.2528 (0.2426) loss 2.9938 (2.6709) grad_norm 5.0761 (inf) loss_scale 128.0000 (251.0710) mem 
7381MB [2024-09-01 10:42:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][840/1251] eta 0:01:40 lr 0.000016 wd 0.0500 time 0.2447 (0.2448) data time 0.0010 (0.0017) model time 0.2437 (0.2426) loss 2.9303 (2.6730) grad_norm 4.2852 (inf) loss_scale 128.0000 (249.6076) mem 7381MB [2024-09-01 10:42:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][850/1251] eta 0:01:38 lr 0.000016 wd 0.0500 time 0.2427 (0.2447) data time 0.0009 (0.0017) model time 0.2418 (0.2425) loss 1.9640 (2.6716) grad_norm 5.0673 (inf) loss_scale 128.0000 (248.1786) mem 7381MB [2024-09-01 10:42:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][860/1251] eta 0:01:35 lr 0.000016 wd 0.0500 time 0.2423 (0.2447) data time 0.0007 (0.0017) model time 0.2416 (0.2425) loss 2.6678 (2.6738) grad_norm 5.7525 (inf) loss_scale 128.0000 (246.7828) mem 7381MB [2024-09-01 10:42:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][870/1251] eta 0:01:33 lr 0.000016 wd 0.0500 time 0.2362 (0.2447) data time 0.0009 (0.0017) model time 0.2353 (0.2425) loss 2.8356 (2.6735) grad_norm 5.3380 (inf) loss_scale 128.0000 (245.4191) mem 7381MB [2024-09-01 10:42:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][880/1251] eta 0:01:30 lr 0.000016 wd 0.0500 time 0.2388 (0.2446) data time 0.0007 (0.0017) model time 0.2381 (0.2425) loss 2.9900 (2.6750) grad_norm 6.0794 (inf) loss_scale 128.0000 (244.0863) mem 7381MB [2024-09-01 10:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][890/1251] eta 0:01:28 lr 0.000016 wd 0.0500 time 0.2429 (0.2446) data time 0.0011 (0.0017) model time 0.2418 (0.2425) loss 2.8624 (2.6748) grad_norm 9.0913 (inf) loss_scale 128.0000 (242.7834) mem 7381MB [2024-09-01 10:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][900/1251] eta 0:01:25 lr 0.000016 wd 0.0500 time 0.2426 (0.2446) data time 0.0007 (0.0017) model time 0.2419 (0.2425) loss 2.8249 
(2.6750) grad_norm 17.3650 (inf) loss_scale 128.0000 (241.5094) mem 7381MB [2024-09-01 10:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][910/1251] eta 0:01:23 lr 0.000016 wd 0.0500 time 0.2327 (0.2445) data time 0.0009 (0.0017) model time 0.2319 (0.2424) loss 3.3064 (2.6750) grad_norm 6.0555 (inf) loss_scale 128.0000 (240.2634) mem 7381MB [2024-09-01 10:42:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][920/1251] eta 0:01:20 lr 0.000016 wd 0.0500 time 0.2446 (0.2445) data time 0.0007 (0.0017) model time 0.2439 (0.2424) loss 2.7387 (2.6731) grad_norm 4.8627 (inf) loss_scale 128.0000 (239.0445) mem 7381MB [2024-09-01 10:42:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][930/1251] eta 0:01:18 lr 0.000016 wd 0.0500 time 0.2481 (0.2445) data time 0.0011 (0.0017) model time 0.2469 (0.2424) loss 2.1741 (2.6741) grad_norm 5.9185 (inf) loss_scale 128.0000 (237.8518) mem 7381MB [2024-09-01 10:42:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][940/1251] eta 0:01:16 lr 0.000016 wd 0.0500 time 0.2374 (0.2444) data time 0.0008 (0.0017) model time 0.2366 (0.2424) loss 2.2814 (2.6723) grad_norm 4.1429 (inf) loss_scale 128.0000 (236.6844) mem 7381MB [2024-09-01 10:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][950/1251] eta 0:01:13 lr 0.000016 wd 0.0500 time 0.2429 (0.2444) data time 0.0011 (0.0017) model time 0.2418 (0.2424) loss 2.8379 (2.6747) grad_norm 5.9297 (inf) loss_scale 128.0000 (235.5415) mem 7381MB [2024-09-01 10:42:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][960/1251] eta 0:01:11 lr 0.000016 wd 0.0500 time 0.2499 (0.2444) data time 0.0009 (0.0016) model time 0.2490 (0.2424) loss 3.0569 (2.6744) grad_norm 4.5195 (inf) loss_scale 128.0000 (234.4225) mem 7381MB [2024-09-01 10:42:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][970/1251] eta 0:01:08 lr 0.000016 wd 0.0500 time 0.2411 
(0.2444) data time 0.0009 (0.0016) model time 0.2402 (0.2424) loss 2.4348 (2.6740) grad_norm 7.0218 (inf) loss_scale 128.0000 (233.3265) mem 7381MB [2024-09-01 10:42:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][980/1251] eta 0:01:06 lr 0.000016 wd 0.0500 time 0.2429 (0.2444) data time 0.0010 (0.0016) model time 0.2419 (0.2424) loss 2.9682 (2.6721) grad_norm 5.4182 (inf) loss_scale 128.0000 (232.2528) mem 7381MB [2024-09-01 10:42:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][990/1251] eta 0:01:03 lr 0.000016 wd 0.0500 time 0.2475 (0.2444) data time 0.0009 (0.0016) model time 0.2466 (0.2424) loss 3.0570 (2.6738) grad_norm 4.7792 (inf) loss_scale 128.0000 (231.2008) mem 7381MB [2024-09-01 10:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1000/1251] eta 0:01:01 lr 0.000016 wd 0.0500 time 0.2516 (0.2444) data time 0.0009 (0.0016) model time 0.2506 (0.2423) loss 2.6207 (2.6757) grad_norm 6.0401 (inf) loss_scale 128.0000 (230.1698) mem 7381MB [2024-09-01 10:42:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1010/1251] eta 0:00:58 lr 0.000016 wd 0.0500 time 0.2409 (0.2443) data time 0.0009 (0.0016) model time 0.2401 (0.2423) loss 2.5152 (2.6761) grad_norm 3.8091 (inf) loss_scale 128.0000 (229.1592) mem 7381MB [2024-09-01 10:42:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1020/1251] eta 0:00:56 lr 0.000016 wd 0.0500 time 0.2428 (0.2443) data time 0.0007 (0.0016) model time 0.2421 (0.2423) loss 3.3383 (2.6764) grad_norm 7.4766 (inf) loss_scale 128.0000 (228.1685) mem 7381MB [2024-09-01 10:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1030/1251] eta 0:00:54 lr 0.000016 wd 0.0500 time 0.2366 (0.2444) data time 0.0011 (0.0016) model time 0.2355 (0.2425) loss 2.6093 (2.6753) grad_norm 4.5207 (inf) loss_scale 128.0000 (227.1969) mem 7381MB [2024-09-01 10:42:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [285/300][1040/1251] eta 0:00:51 lr 0.000016 wd 0.0500 time 0.2408 (0.2444) data time 0.0007 (0.0016) model time 0.2401 (0.2424) loss 1.6615 (2.6741) grad_norm 6.6573 (inf) loss_scale 128.0000 (226.2440) mem 7381MB [2024-09-01 10:42:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1050/1251] eta 0:00:49 lr 0.000016 wd 0.0500 time 0.2406 (0.2445) data time 0.0008 (0.0016) model time 0.2398 (0.2426) loss 2.5284 (2.6730) grad_norm 6.8600 (inf) loss_scale 128.0000 (225.3092) mem 7381MB [2024-09-01 10:42:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1060/1251] eta 0:00:46 lr 0.000016 wd 0.0500 time 0.2386 (0.2445) data time 0.0010 (0.0016) model time 0.2375 (0.2425) loss 2.6581 (2.6715) grad_norm 4.7838 (inf) loss_scale 128.0000 (224.3921) mem 7381MB [2024-09-01 10:42:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1070/1251] eta 0:00:44 lr 0.000016 wd 0.0500 time 0.2438 (0.2445) data time 0.0010 (0.0016) model time 0.2428 (0.2425) loss 2.8759 (2.6727) grad_norm 4.6529 (inf) loss_scale 128.0000 (223.4921) mem 7381MB [2024-09-01 10:42:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1080/1251] eta 0:00:41 lr 0.000016 wd 0.0500 time 0.2484 (0.2444) data time 0.0007 (0.0016) model time 0.2477 (0.2425) loss 2.3982 (2.6742) grad_norm 3.7391 (inf) loss_scale 128.0000 (222.6087) mem 7381MB [2024-09-01 10:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1090/1251] eta 0:00:39 lr 0.000016 wd 0.0500 time 0.2433 (0.2445) data time 0.0007 (0.0016) model time 0.2426 (0.2426) loss 2.8549 (2.6745) grad_norm 5.5783 (inf) loss_scale 128.0000 (221.7415) mem 7381MB [2024-09-01 10:43:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1100/1251] eta 0:00:36 lr 0.000016 wd 0.0500 time 0.2338 (0.2445) data time 0.0008 (0.0016) model time 0.2330 (0.2426) loss 2.9300 (2.6762) grad_norm 6.3682 (inf) loss_scale 128.0000 (220.8901) mem 7381MB 
[2024-09-01 10:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1110/1251] eta 0:00:34 lr 0.000016 wd 0.0500 time 0.2435 (0.2445) data time 0.0007 (0.0016) model time 0.2428 (0.2426) loss 2.9328 (2.6764) grad_norm 8.5039 (inf) loss_scale 128.0000 (220.0540) mem 7381MB [2024-09-01 10:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1120/1251] eta 0:00:32 lr 0.000016 wd 0.0500 time 0.2610 (0.2445) data time 0.0006 (0.0016) model time 0.2603 (0.2426) loss 2.1586 (2.6754) grad_norm 3.6662 (inf) loss_scale 128.0000 (219.2328) mem 7381MB [2024-09-01 10:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1130/1251] eta 0:00:29 lr 0.000016 wd 0.0500 time 0.2401 (0.2444) data time 0.0007 (0.0016) model time 0.2394 (0.2426) loss 1.7600 (2.6727) grad_norm 4.2689 (inf) loss_scale 128.0000 (218.4262) mem 7381MB [2024-09-01 10:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1140/1251] eta 0:00:27 lr 0.000016 wd 0.0500 time 0.2388 (0.2444) data time 0.0011 (0.0015) model time 0.2377 (0.2425) loss 2.2337 (2.6733) grad_norm 8.7959 (inf) loss_scale 128.0000 (217.6337) mem 7381MB [2024-09-01 10:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1150/1251] eta 0:00:24 lr 0.000016 wd 0.0500 time 0.2475 (0.2444) data time 0.0009 (0.0015) model time 0.2465 (0.2425) loss 1.8330 (2.6737) grad_norm 4.4999 (inf) loss_scale 128.0000 (216.8549) mem 7381MB [2024-09-01 10:43:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1160/1251] eta 0:00:22 lr 0.000016 wd 0.0500 time 0.2400 (0.2443) data time 0.0007 (0.0015) model time 0.2393 (0.2425) loss 1.6190 (2.6741) grad_norm 4.6850 (inf) loss_scale 128.0000 (216.0896) mem 7381MB [2024-09-01 10:43:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1170/1251] eta 0:00:19 lr 0.000016 wd 0.0500 time 0.2413 (0.2445) data time 0.0010 (0.0015) model time 0.2403 (0.2426) loss 2.6683 
(2.6740) grad_norm 3.9939 (inf) loss_scale 128.0000 (215.3373) mem 7381MB [2024-09-01 10:43:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1180/1251] eta 0:00:17 lr 0.000016 wd 0.0500 time 0.2400 (0.2448) data time 0.0010 (0.0015) model time 0.2391 (0.2430) loss 2.6910 (2.6753) grad_norm 6.5759 (inf) loss_scale 128.0000 (214.5978) mem 7381MB [2024-09-01 10:43:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1190/1251] eta 0:00:14 lr 0.000016 wd 0.0500 time 0.2335 (0.2448) data time 0.0007 (0.0015) model time 0.2328 (0.2429) loss 2.7428 (2.6726) grad_norm 5.3395 (inf) loss_scale 128.0000 (213.8707) mem 7381MB [2024-09-01 10:43:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1200/1251] eta 0:00:12 lr 0.000016 wd 0.0500 time 0.2466 (0.2447) data time 0.0008 (0.0015) model time 0.2457 (0.2429) loss 2.4206 (2.6736) grad_norm 6.2460 (inf) loss_scale 128.0000 (213.1557) mem 7381MB [2024-09-01 10:43:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1210/1251] eta 0:00:10 lr 0.000016 wd 0.0500 time 0.2395 (0.2447) data time 0.0009 (0.0015) model time 0.2386 (0.2429) loss 2.8270 (2.6725) grad_norm 5.3089 (inf) loss_scale 128.0000 (212.4525) mem 7381MB [2024-09-01 10:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1220/1251] eta 0:00:07 lr 0.000016 wd 0.0500 time 0.2423 (0.2447) data time 0.0010 (0.0015) model time 0.2413 (0.2428) loss 2.6804 (2.6719) grad_norm 5.6752 (inf) loss_scale 128.0000 (211.7609) mem 7381MB [2024-09-01 10:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1230/1251] eta 0:00:05 lr 0.000016 wd 0.0500 time 0.2453 (0.2446) data time 0.0007 (0.0015) model time 0.2446 (0.2428) loss 2.7616 (2.6727) grad_norm 5.3219 (inf) loss_scale 128.0000 (211.0804) mem 7381MB [2024-09-01 10:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1240/1251] eta 0:00:02 lr 0.000016 wd 0.0500 time 0.2234 
(0.2445) data time 0.0005 (0.0015) model time 0.2229 (0.2427) loss 3.1091 (2.6717) grad_norm 6.7377 (inf) loss_scale 128.0000 (210.4110) mem 7381MB [2024-09-01 10:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [285/300][1250/1251] eta 0:00:00 lr 0.000016 wd 0.0500 time 0.2246 (0.2444) data time 0.0005 (0.0015) model time 0.2242 (0.2426) loss 2.0289 (2.6706) grad_norm 20.8875 (inf) loss_scale 128.0000 (209.7522) mem 7381MB [2024-09-01 10:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 285 training takes 0:05:05 [2024-09-01 10:43:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:43:41 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:43:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.426 (0.426) Loss 0.3877 (0.3877) Acc@1 93.164 (93.164) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.108) Loss 0.5737 (0.6114) Acc@1 90.430 (87.669) Acc@5 97.852 (97.692) Mem 7381MB [2024-09-01 10:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.095) Loss 0.9028 (0.6410) Acc@1 78.418 (86.635) Acc@5 95.312 (97.652) Mem 7381MB [2024-09-01 10:43:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.081 (0.089) Loss 1.1533 (0.7356) Acc@1 74.805 (84.375) Acc@5 92.676 (96.724) Mem 7381MB [2024-09-01 10:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0059 (0.7843) Acc@1 77.246 (83.194) Acc@5 94.141 (96.191) Mem 7381MB [2024-09-01 10:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.744 Acc@5 96.154 [2024-09-01 10:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.7% 
[2024-09-01 10:43:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.807 (0.807) Loss 0.3887 (0.3887) Acc@1 93.457 (93.457) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.073 (0.145) Loss 0.5630 (0.6074) Acc@1 90.430 (87.988) Acc@5 98.047 (97.772) Mem 7381MB [2024-09-01 10:43:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.076 (0.114) Loss 0.9121 (0.6381) Acc@1 77.734 (86.756) Acc@5 95.898 (97.735) Mem 7381MB [2024-09-01 10:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.103) Loss 1.1377 (0.7308) Acc@1 74.512 (84.536) Acc@5 92.969 (96.780) Mem 7381MB [2024-09-01 10:43:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0127 (0.7792) Acc@1 76.562 (83.375) Acc@5 94.434 (96.287) Mem 7381MB [2024-09-01 10:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.924 Acc@5 96.244 [2024-09-01 10:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:43:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][0/1251] eta 0:22:13 lr 0.000016 wd 0.0500 time 1.0656 (1.0656) data time 0.5623 (0.5623) model time 0.0000 (0.0000) loss 3.1977 (3.1977) grad_norm 8.8044 (8.8044) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:43:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][10/1251] eta 0:06:32 lr 0.000016 wd 0.0500 time 0.2442 (0.3163) data time 0.0007 (0.0520) model time 0.0000 (0.0000) loss 2.7829 (2.7297) grad_norm 4.7930 (6.7467) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][20/1251] eta 0:05:45 lr 0.000016 wd 0.0500 time 0.2382 (0.2807) data time 0.0009 (0.0277) model time 0.0000 (0.0000) loss 2.3173 (2.6048) grad_norm 4.6931 (6.0861) 
loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:43:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][30/1251] eta 0:05:27 lr 0.000016 wd 0.0500 time 0.2394 (0.2685) data time 0.0009 (0.0191) model time 0.0000 (0.0000) loss 3.0508 (2.6246) grad_norm 4.6253 (6.4725) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][40/1251] eta 0:05:17 lr 0.000016 wd 0.0500 time 0.2416 (0.2619) data time 0.0009 (0.0147) model time 0.0000 (0.0000) loss 2.9247 (2.6846) grad_norm 5.6015 (6.4201) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][50/1251] eta 0:05:09 lr 0.000016 wd 0.0500 time 0.2375 (0.2579) data time 0.0007 (0.0120) model time 0.0000 (0.0000) loss 2.6947 (2.6997) grad_norm 5.3092 (6.1823) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][60/1251] eta 0:05:04 lr 0.000016 wd 0.0500 time 0.2444 (0.2556) data time 0.0008 (0.0101) model time 0.2435 (0.2428) loss 1.6495 (2.6856) grad_norm 11.1787 (6.3724) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][70/1251] eta 0:04:59 lr 0.000016 wd 0.0500 time 0.2420 (0.2536) data time 0.0007 (0.0089) model time 0.2412 (0.2416) loss 2.7588 (2.6663) grad_norm 4.8454 (6.3346) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][80/1251] eta 0:04:55 lr 0.000016 wd 0.0500 time 0.2393 (0.2522) data time 0.0008 (0.0079) model time 0.2385 (0.2416) loss 2.4471 (2.6993) grad_norm 5.2117 (7.1589) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][90/1251] eta 0:04:51 lr 0.000016 wd 0.0500 time 0.2347 (0.2507) data time 0.0011 
(0.0071) model time 0.2336 (0.2405) loss 2.4485 (2.6748) grad_norm 9.4979 (7.1063) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][100/1251] eta 0:04:47 lr 0.000016 wd 0.0500 time 0.2487 (0.2496) data time 0.0007 (0.0065) model time 0.2480 (0.2402) loss 2.3780 (2.6704) grad_norm 7.0533 (6.9748) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][110/1251] eta 0:04:44 lr 0.000016 wd 0.0500 time 0.2413 (0.2491) data time 0.0010 (0.0060) model time 0.2403 (0.2408) loss 3.0610 (2.6770) grad_norm 6.4597 (6.8565) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][120/1251] eta 0:04:41 lr 0.000016 wd 0.0500 time 0.2333 (0.2485) data time 0.0010 (0.0056) model time 0.2323 (0.2407) loss 2.4821 (2.6969) grad_norm 4.7526 (6.7181) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][130/1251] eta 0:04:38 lr 0.000016 wd 0.0500 time 0.2393 (0.2480) data time 0.0012 (0.0052) model time 0.2381 (0.2408) loss 2.8487 (2.7118) grad_norm 8.2755 (6.7302) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][140/1251] eta 0:04:34 lr 0.000016 wd 0.0500 time 0.2360 (0.2475) data time 0.0010 (0.0049) model time 0.2350 (0.2406) loss 2.3927 (2.7175) grad_norm 7.3480 (6.6857) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][150/1251] eta 0:04:32 lr 0.000016 wd 0.0500 time 0.2408 (0.2472) data time 0.0010 (0.0047) model time 0.2397 (0.2408) loss 2.0870 (2.7132) grad_norm 7.5739 (6.6388) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[286/300][160/1251] eta 0:04:29 lr 0.000016 wd 0.0500 time 0.2395 (0.2469) data time 0.0008 (0.0044) model time 0.2387 (0.2408) loss 2.0033 (2.7112) grad_norm 6.8329 (6.9530) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][170/1251] eta 0:04:26 lr 0.000016 wd 0.0500 time 0.2403 (0.2465) data time 0.0007 (0.0042) model time 0.2395 (0.2407) loss 2.4921 (2.7150) grad_norm 7.2597 (6.8936) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][180/1251] eta 0:04:23 lr 0.000016 wd 0.0500 time 0.2410 (0.2462) data time 0.0011 (0.0040) model time 0.2400 (0.2406) loss 2.7848 (2.7170) grad_norm 3.9786 (6.9565) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][190/1251] eta 0:04:21 lr 0.000016 wd 0.0500 time 0.2458 (0.2460) data time 0.0011 (0.0039) model time 0.2447 (0.2408) loss 2.6461 (2.7209) grad_norm 24.7978 (7.0511) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][200/1251] eta 0:04:18 lr 0.000016 wd 0.0500 time 0.2451 (0.2458) data time 0.0008 (0.0037) model time 0.2444 (0.2408) loss 3.3585 (2.7250) grad_norm 10.2634 (7.0537) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][210/1251] eta 0:04:15 lr 0.000016 wd 0.0500 time 0.2411 (0.2456) data time 0.0009 (0.0036) model time 0.2402 (0.2407) loss 3.0835 (2.7291) grad_norm 5.3090 (7.1202) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][220/1251] eta 0:04:13 lr 0.000016 wd 0.0500 time 0.2419 (0.2454) data time 0.0007 (0.0035) model time 0.2412 (0.2407) loss 2.4613 (2.7274) grad_norm 6.8575 (7.0580) loss_scale 128.0000 (128.0000) mem 
7381MB [2024-09-01 10:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][230/1251] eta 0:04:10 lr 0.000016 wd 0.0500 time 0.2509 (0.2453) data time 0.0008 (0.0034) model time 0.2501 (0.2408) loss 3.3416 (2.7188) grad_norm 6.6515 (7.0370) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][240/1251] eta 0:04:07 lr 0.000016 wd 0.0500 time 0.2367 (0.2451) data time 0.0009 (0.0033) model time 0.2358 (0.2407) loss 2.2187 (2.7121) grad_norm 7.2614 (7.0744) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][250/1251] eta 0:04:05 lr 0.000016 wd 0.0500 time 0.2380 (0.2450) data time 0.0010 (0.0032) model time 0.2369 (0.2408) loss 2.2019 (2.7145) grad_norm 7.9025 (7.0298) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][260/1251] eta 0:04:02 lr 0.000016 wd 0.0500 time 0.2403 (0.2447) data time 0.0008 (0.0031) model time 0.2395 (0.2406) loss 2.6520 (2.7208) grad_norm 5.6373 (6.9980) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][270/1251] eta 0:04:00 lr 0.000016 wd 0.0500 time 0.2476 (0.2454) data time 0.0009 (0.0030) model time 0.2467 (0.2415) loss 2.9972 (2.7243) grad_norm 6.7029 (6.9775) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][280/1251] eta 0:03:58 lr 0.000016 wd 0.0500 time 0.2440 (0.2452) data time 0.0010 (0.0030) model time 0.2430 (0.2415) loss 2.6280 (2.7198) grad_norm 5.4731 (6.9733) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][290/1251] eta 0:03:55 lr 0.000016 wd 0.0500 time 0.2432 (0.2451) data time 0.0009 (0.0029) model time 0.2423 
(0.2414) loss 2.5098 (2.7279) grad_norm 4.4387 (6.9883) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][300/1251] eta 0:03:52 lr 0.000016 wd 0.0500 time 0.2449 (0.2450) data time 0.0007 (0.0028) model time 0.2441 (0.2414) loss 2.3971 (2.7279) grad_norm 6.8248 (6.9435) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][310/1251] eta 0:03:50 lr 0.000016 wd 0.0500 time 0.2424 (0.2449) data time 0.0009 (0.0028) model time 0.2414 (0.2413) loss 2.6990 (2.7305) grad_norm 6.5696 (7.0681) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][320/1251] eta 0:03:48 lr 0.000016 wd 0.0500 time 0.2462 (0.2453) data time 0.0007 (0.0027) model time 0.2455 (0.2419) loss 2.7779 (2.7371) grad_norm 6.5368 (7.0306) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][330/1251] eta 0:03:45 lr 0.000016 wd 0.0500 time 0.2391 (0.2452) data time 0.0007 (0.0027) model time 0.2384 (0.2419) loss 1.8653 (2.7396) grad_norm 5.4134 (6.9960) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][340/1251] eta 0:03:43 lr 0.000016 wd 0.0500 time 0.2400 (0.2451) data time 0.0007 (0.0026) model time 0.2393 (0.2419) loss 2.8174 (2.7376) grad_norm 5.5471 (6.9686) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][350/1251] eta 0:03:41 lr 0.000016 wd 0.0500 time 0.3731 (0.2455) data time 0.0011 (0.0026) model time 0.3719 (0.2424) loss 2.5220 (2.7432) grad_norm 5.8442 (6.9605) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][360/1251] eta 0:03:38 
lr 0.000016 wd 0.0500 time 0.2455 (0.2453) data time 0.0007 (0.0025) model time 0.2449 (0.2423) loss 2.3343 (2.7410) grad_norm 4.9943 (6.9974) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][370/1251] eta 0:03:35 lr 0.000016 wd 0.0500 time 0.2354 (0.2452) data time 0.0012 (0.0025) model time 0.2342 (0.2422) loss 2.8544 (2.7409) grad_norm 5.2793 (6.9807) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][380/1251] eta 0:03:33 lr 0.000016 wd 0.0500 time 0.2401 (0.2451) data time 0.0007 (0.0024) model time 0.2394 (0.2421) loss 3.0821 (2.7506) grad_norm 4.8739 (6.9585) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][390/1251] eta 0:03:30 lr 0.000016 wd 0.0500 time 0.2410 (0.2450) data time 0.0011 (0.0024) model time 0.2399 (0.2420) loss 2.9300 (2.7512) grad_norm 6.0484 (7.0118) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][400/1251] eta 0:03:28 lr 0.000016 wd 0.0500 time 0.2412 (0.2448) data time 0.0009 (0.0024) model time 0.2402 (0.2420) loss 3.0777 (2.7484) grad_norm 6.4273 (6.9829) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][410/1251] eta 0:03:25 lr 0.000016 wd 0.0500 time 0.2320 (0.2448) data time 0.0011 (0.0023) model time 0.2308 (0.2420) loss 3.2506 (2.7495) grad_norm 6.9434 (6.9748) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][420/1251] eta 0:03:23 lr 0.000016 wd 0.0500 time 0.2382 (0.2447) data time 0.0009 (0.0023) model time 0.2373 (0.2419) loss 1.6617 (2.7473) grad_norm 6.8411 (6.9473) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:35 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][430/1251] eta 0:03:20 lr 0.000016 wd 0.0500 time 0.2381 (0.2447) data time 0.0013 (0.0023) model time 0.2368 (0.2419) loss 2.8282 (2.7441) grad_norm 10.4277 (6.9764) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][440/1251] eta 0:03:18 lr 0.000016 wd 0.0500 time 0.2383 (0.2446) data time 0.0007 (0.0022) model time 0.2375 (0.2418) loss 2.9772 (2.7369) grad_norm 10.8874 (6.9653) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][450/1251] eta 0:03:16 lr 0.000016 wd 0.0500 time 0.2456 (0.2450) data time 0.0010 (0.0022) model time 0.2446 (0.2423) loss 3.3209 (2.7353) grad_norm 3.4346 (7.0086) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][460/1251] eta 0:03:14 lr 0.000016 wd 0.0500 time 0.2322 (0.2458) data time 0.0008 (0.0022) model time 0.2315 (0.2433) loss 2.3271 (2.7290) grad_norm 4.6444 (7.0178) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][470/1251] eta 0:03:11 lr 0.000016 wd 0.0500 time 0.2451 (0.2456) data time 0.0009 (0.0022) model time 0.2442 (0.2432) loss 2.8269 (2.7304) grad_norm 10.1007 (6.9890) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][480/1251] eta 0:03:09 lr 0.000016 wd 0.0500 time 0.2411 (0.2456) data time 0.0007 (0.0021) model time 0.2404 (0.2431) loss 2.3883 (2.7292) grad_norm 6.4563 (6.9932) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][490/1251] eta 0:03:06 lr 0.000016 wd 0.0500 time 0.2369 (0.2455) data time 0.0010 (0.0021) model time 0.2360 (0.2431) loss 3.1790 (2.7309) 
grad_norm 8.2574 (6.9991) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][500/1251] eta 0:03:04 lr 0.000016 wd 0.0500 time 0.2402 (0.2454) data time 0.0008 (0.0021) model time 0.2394 (0.2430) loss 3.4789 (2.7325) grad_norm 8.4290 (6.9833) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][510/1251] eta 0:03:01 lr 0.000016 wd 0.0500 time 0.2404 (0.2453) data time 0.0008 (0.0021) model time 0.2396 (0.2429) loss 3.4434 (2.7373) grad_norm 4.8165 (6.9504) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:45:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][520/1251] eta 0:02:59 lr 0.000016 wd 0.0500 time 0.2378 (0.2452) data time 0.0010 (0.0020) model time 0.2369 (0.2429) loss 2.8371 (2.7325) grad_norm 5.6015 (6.9987) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][530/1251] eta 0:02:56 lr 0.000016 wd 0.0500 time 0.2473 (0.2452) data time 0.0010 (0.0020) model time 0.2463 (0.2429) loss 3.0575 (2.7317) grad_norm 6.0442 (7.0040) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][540/1251] eta 0:02:54 lr 0.000016 wd 0.0500 time 0.2292 (0.2451) data time 0.0010 (0.0020) model time 0.2281 (0.2428) loss 2.3704 (2.7327) grad_norm 5.5450 (6.9813) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][550/1251] eta 0:02:51 lr 0.000016 wd 0.0500 time 0.2485 (0.2450) data time 0.0009 (0.0020) model time 0.2476 (0.2427) loss 2.9196 (2.7296) grad_norm 8.1750 (6.9694) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][560/1251] eta 0:02:49 lr 0.000016 wd 0.0500 time 
0.2335 (0.2449) data time 0.0011 (0.0020) model time 0.2324 (0.2426) loss 2.8921 (2.7289) grad_norm 8.7705 (6.9500) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][570/1251] eta 0:02:46 lr 0.000016 wd 0.0500 time 0.2397 (0.2448) data time 0.0011 (0.0020) model time 0.2386 (0.2426) loss 2.1932 (2.7327) grad_norm 6.7371 (6.9361) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][580/1251] eta 0:02:44 lr 0.000016 wd 0.0500 time 0.2383 (0.2447) data time 0.0009 (0.0019) model time 0.2374 (0.2425) loss 3.3468 (2.7281) grad_norm 3.9979 (6.9173) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][590/1251] eta 0:02:41 lr 0.000016 wd 0.0500 time 0.2394 (0.2447) data time 0.0007 (0.0019) model time 0.2387 (0.2425) loss 2.2826 (2.7283) grad_norm 5.0754 (6.8956) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][600/1251] eta 0:02:39 lr 0.000016 wd 0.0500 time 0.2299 (0.2446) data time 0.0011 (0.0019) model time 0.2288 (0.2424) loss 2.9812 (2.7312) grad_norm 6.0333 (6.8700) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][610/1251] eta 0:02:36 lr 0.000016 wd 0.0500 time 0.2422 (0.2446) data time 0.0010 (0.0019) model time 0.2411 (0.2424) loss 2.9059 (2.7336) grad_norm 6.3864 (6.8461) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][620/1251] eta 0:02:34 lr 0.000016 wd 0.0500 time 0.2423 (0.2445) data time 0.0009 (0.0019) model time 0.2415 (0.2424) loss 1.5680 (2.7307) grad_norm 5.4949 (6.8403) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:24 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [286/300][630/1251] eta 0:02:31 lr 0.000016 wd 0.0500 time 0.2382 (0.2445) data time 0.0011 (0.0019) model time 0.2372 (0.2423) loss 2.7200 (2.7271) grad_norm 4.3077 (6.9154) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][640/1251] eta 0:02:29 lr 0.000016 wd 0.0500 time 0.2434 (0.2445) data time 0.0009 (0.0019) model time 0.2425 (0.2423) loss 2.5952 (2.7234) grad_norm 5.1852 (6.9911) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][650/1251] eta 0:02:26 lr 0.000016 wd 0.0500 time 0.2437 (0.2444) data time 0.0010 (0.0018) model time 0.2428 (0.2423) loss 2.8842 (2.7237) grad_norm 4.5484 (6.9817) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][660/1251] eta 0:02:24 lr 0.000016 wd 0.0500 time 0.2347 (0.2444) data time 0.0008 (0.0018) model time 0.2339 (0.2423) loss 3.1789 (2.7267) grad_norm 4.1145 (6.9597) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][670/1251] eta 0:02:21 lr 0.000016 wd 0.0500 time 0.2458 (0.2443) data time 0.0009 (0.0018) model time 0.2449 (0.2422) loss 2.9175 (2.7269) grad_norm 5.3535 (6.9425) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][680/1251] eta 0:02:19 lr 0.000016 wd 0.0500 time 0.2411 (0.2443) data time 0.0010 (0.0018) model time 0.2401 (0.2422) loss 2.9019 (2.7260) grad_norm 4.3214 (6.9340) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][690/1251] eta 0:02:17 lr 0.000016 wd 0.0500 time 0.2408 (0.2443) data time 0.0009 (0.0018) model time 0.2399 (0.2422) loss 2.4694 (2.7254) grad_norm 5.1705 
(6.9188) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][700/1251] eta 0:02:14 lr 0.000016 wd 0.0500 time 0.2424 (0.2442) data time 0.0010 (0.0018) model time 0.2414 (0.2422) loss 2.0293 (2.7258) grad_norm 5.4418 (6.9111) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][710/1251] eta 0:02:12 lr 0.000016 wd 0.0500 time 0.2454 (0.2442) data time 0.0009 (0.0018) model time 0.2445 (0.2421) loss 1.8102 (2.7228) grad_norm 4.2103 (6.8872) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][720/1251] eta 0:02:09 lr 0.000016 wd 0.0500 time 0.2461 (0.2441) data time 0.0008 (0.0018) model time 0.2453 (0.2421) loss 2.8509 (2.7184) grad_norm 5.9197 (6.8804) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][730/1251] eta 0:02:07 lr 0.000016 wd 0.0500 time 0.2434 (0.2441) data time 0.0009 (0.0018) model time 0.2425 (0.2421) loss 3.2357 (2.7199) grad_norm 4.1426 (6.8618) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][740/1251] eta 0:02:04 lr 0.000016 wd 0.0500 time 0.2374 (0.2441) data time 0.0008 (0.0017) model time 0.2367 (0.2420) loss 2.8676 (2.7165) grad_norm 12.0415 (6.8606) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][750/1251] eta 0:02:02 lr 0.000016 wd 0.0500 time 0.2419 (0.2440) data time 0.0010 (0.0017) model time 0.2409 (0.2420) loss 3.0201 (2.7152) grad_norm 3.9888 (6.8443) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][760/1251] eta 0:01:59 lr 0.000016 wd 0.0500 time 0.2458 (0.2440) 
data time 0.0007 (0.0017) model time 0.2452 (0.2420) loss 2.8792 (2.7160) grad_norm 4.9368 (6.8338) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:46:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][770/1251] eta 0:01:57 lr 0.000016 wd 0.0500 time 0.2426 (0.2439) data time 0.0009 (0.0017) model time 0.2417 (0.2419) loss 3.2486 (2.7167) grad_norm 7.0934 (6.8286) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][780/1251] eta 0:01:54 lr 0.000016 wd 0.0500 time 0.2372 (0.2439) data time 0.0009 (0.0017) model time 0.2363 (0.2419) loss 2.1498 (2.7167) grad_norm 4.9841 (6.8201) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][790/1251] eta 0:01:52 lr 0.000016 wd 0.0500 time 0.2356 (0.2440) data time 0.0007 (0.0017) model time 0.2349 (0.2421) loss 2.9487 (2.7166) grad_norm 4.0773 (6.7999) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][800/1251] eta 0:01:50 lr 0.000016 wd 0.0500 time 0.2333 (0.2440) data time 0.0008 (0.0017) model time 0.2325 (0.2420) loss 2.6656 (2.7130) grad_norm 6.7071 (6.7912) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][810/1251] eta 0:01:47 lr 0.000016 wd 0.0500 time 0.2438 (0.2440) data time 0.0010 (0.0017) model time 0.2428 (0.2421) loss 3.0494 (2.7136) grad_norm 6.7835 (6.7995) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][820/1251] eta 0:01:45 lr 0.000016 wd 0.0500 time 0.2454 (0.2440) data time 0.0010 (0.0017) model time 0.2444 (0.2420) loss 2.3748 (2.7092) grad_norm 5.5441 (6.7957) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [286/300][830/1251] eta 0:01:42 lr 0.000016 wd 0.0500 time 0.2448 (0.2439) data time 0.0009 (0.0017) model time 0.2440 (0.2420) loss 3.0073 (2.7121) grad_norm 7.4653 (6.8081) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][840/1251] eta 0:01:40 lr 0.000016 wd 0.0500 time 0.2346 (0.2441) data time 0.0008 (0.0017) model time 0.2337 (0.2423) loss 2.8523 (2.7126) grad_norm 12.2337 (6.8124) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][850/1251] eta 0:01:37 lr 0.000016 wd 0.0500 time 0.2464 (0.2441) data time 0.0009 (0.0016) model time 0.2455 (0.2422) loss 2.3647 (2.7081) grad_norm 5.5731 (6.7913) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][860/1251] eta 0:01:35 lr 0.000016 wd 0.0500 time 0.2424 (0.2441) data time 0.0010 (0.0016) model time 0.2413 (0.2422) loss 1.8190 (2.7086) grad_norm 4.5994 (6.7709) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][870/1251] eta 0:01:32 lr 0.000016 wd 0.0500 time 0.2389 (0.2441) data time 0.0010 (0.0016) model time 0.2379 (0.2422) loss 2.4831 (2.7089) grad_norm 5.2652 (6.7584) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][880/1251] eta 0:01:30 lr 0.000015 wd 0.0500 time 0.2377 (0.2442) data time 0.0008 (0.0016) model time 0.2369 (0.2424) loss 3.2680 (2.7106) grad_norm 7.4010 (6.7469) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][890/1251] eta 0:01:28 lr 0.000015 wd 0.0500 time 0.2418 (0.2442) data time 0.0009 (0.0016) model time 0.2409 (0.2424) loss 1.8327 (2.7104) grad_norm 3.4670 (6.7384) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 10:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][900/1251] eta 0:01:25 lr 0.000015 wd 0.0500 time 0.2407 (0.2442) data time 0.0007 (0.0016) model time 0.2400 (0.2424) loss 2.4204 (2.7099) grad_norm 5.0770 (6.7274) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][910/1251] eta 0:01:23 lr 0.000015 wd 0.0500 time 0.2412 (0.2442) data time 0.0009 (0.0016) model time 0.2403 (0.2424) loss 3.3976 (2.7121) grad_norm 12.2136 (6.7400) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][920/1251] eta 0:01:20 lr 0.000015 wd 0.0500 time 0.2403 (0.2442) data time 0.0007 (0.0016) model time 0.2396 (0.2424) loss 2.7856 (2.7125) grad_norm 35.6417 (6.7640) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][930/1251] eta 0:01:18 lr 0.000015 wd 0.0500 time 0.2432 (0.2441) data time 0.0007 (0.0016) model time 0.2424 (0.2423) loss 3.3835 (2.7104) grad_norm 6.6438 (6.7643) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][940/1251] eta 0:01:15 lr 0.000015 wd 0.0500 time 0.2401 (0.2441) data time 0.0008 (0.0016) model time 0.2392 (0.2423) loss 2.5522 (2.7109) grad_norm 4.1599 (6.7602) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][950/1251] eta 0:01:13 lr 0.000015 wd 0.0500 time 0.2368 (0.2441) data time 0.0009 (0.0016) model time 0.2358 (0.2423) loss 2.4587 (2.7103) grad_norm 7.5866 (6.7525) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][960/1251] eta 0:01:11 lr 0.000015 wd 0.0500 time 0.2452 (0.2440) data time 0.0011 (0.0016) model 
time 0.2442 (0.2422) loss 2.6939 (2.7098) grad_norm 5.2890 (6.7435) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][970/1251] eta 0:01:08 lr 0.000015 wd 0.0500 time 0.2378 (0.2441) data time 0.0011 (0.0016) model time 0.2367 (0.2424) loss 2.6653 (2.7106) grad_norm 5.7728 (6.7270) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][980/1251] eta 0:01:06 lr 0.000015 wd 0.0500 time 0.2413 (0.2444) data time 0.0010 (0.0016) model time 0.2404 (0.2427) loss 1.9918 (2.7071) grad_norm 6.3914 (6.7169) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][990/1251] eta 0:01:03 lr 0.000015 wd 0.0500 time 0.2349 (0.2443) data time 0.0011 (0.0016) model time 0.2338 (0.2426) loss 2.8703 (2.7074) grad_norm 4.4933 (6.7053) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1000/1251] eta 0:01:01 lr 0.000015 wd 0.0500 time 0.2462 (0.2443) data time 0.0007 (0.0015) model time 0.2454 (0.2426) loss 3.2238 (2.7096) grad_norm 3.3890 (6.7339) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1010/1251] eta 0:00:58 lr 0.000015 wd 0.0500 time 0.2458 (0.2443) data time 0.0010 (0.0015) model time 0.2448 (0.2426) loss 2.9902 (2.7095) grad_norm 3.2274 (6.7141) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1020/1251] eta 0:00:56 lr 0.000015 wd 0.0500 time 0.2430 (0.2443) data time 0.0007 (0.0015) model time 0.2423 (0.2426) loss 2.0194 (2.7065) grad_norm 3.8622 (6.7061) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[286/300][1030/1251] eta 0:00:53 lr 0.000015 wd 0.0500 time 0.2382 (0.2443) data time 0.0008 (0.0015) model time 0.2374 (0.2426) loss 1.8529 (2.7038) grad_norm 4.7269 (6.7104) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1040/1251] eta 0:00:51 lr 0.000015 wd 0.0500 time 0.2449 (0.2443) data time 0.0007 (0.0015) model time 0.2442 (0.2426) loss 2.9889 (2.7021) grad_norm 7.0488 (6.7051) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1050/1251] eta 0:00:49 lr 0.000015 wd 0.0500 time 0.2466 (0.2443) data time 0.0007 (0.0015) model time 0.2459 (0.2426) loss 2.3264 (2.7008) grad_norm 4.9910 (6.6962) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1060/1251] eta 0:00:46 lr 0.000015 wd 0.0500 time 0.2472 (0.2442) data time 0.0007 (0.0015) model time 0.2464 (0.2426) loss 2.4165 (2.7010) grad_norm 4.6897 (6.6973) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1070/1251] eta 0:00:44 lr 0.000015 wd 0.0500 time 0.2304 (0.2442) data time 0.0010 (0.0015) model time 0.2294 (0.2425) loss 1.7804 (2.7000) grad_norm 8.0947 (6.6848) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1080/1251] eta 0:00:41 lr 0.000015 wd 0.0500 time 0.2378 (0.2442) data time 0.0011 (0.0015) model time 0.2367 (0.2425) loss 2.7965 (2.6998) grad_norm 4.3999 (6.6795) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1090/1251] eta 0:00:39 lr 0.000015 wd 0.0500 time 0.2348 (0.2441) data time 0.0008 (0.0015) model time 0.2340 (0.2425) loss 3.3967 (2.6987) grad_norm 5.2453 (6.6735) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 10:48:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1100/1251] eta 0:00:36 lr 0.000015 wd 0.0500 time 0.2388 (0.2441) data time 0.0008 (0.0015) model time 0.2380 (0.2425) loss 2.1456 (2.6984) grad_norm 6.0261 (6.6673) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1110/1251] eta 0:00:34 lr 0.000015 wd 0.0500 time 0.2373 (0.2441) data time 0.0007 (0.0015) model time 0.2365 (0.2425) loss 1.8046 (2.6975) grad_norm 9.9035 (6.6797) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1120/1251] eta 0:00:31 lr 0.000015 wd 0.0500 time 0.2471 (0.2441) data time 0.0009 (0.0015) model time 0.2462 (0.2425) loss 2.4492 (2.6971) grad_norm 5.9731 (6.7303) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1130/1251] eta 0:00:29 lr 0.000015 wd 0.0500 time 0.2449 (0.2441) data time 0.0008 (0.0015) model time 0.2441 (0.2425) loss 2.9516 (2.6971) grad_norm 6.7188 (6.7302) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1140/1251] eta 0:00:27 lr 0.000015 wd 0.0500 time 0.2435 (0.2441) data time 0.0007 (0.0015) model time 0.2428 (0.2425) loss 3.1456 (2.6963) grad_norm 6.5491 (6.7249) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1150/1251] eta 0:00:24 lr 0.000015 wd 0.0500 time 0.2372 (0.2441) data time 0.0008 (0.0015) model time 0.2364 (0.2425) loss 2.0126 (2.6938) grad_norm 8.6155 (6.7205) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1160/1251] eta 0:00:22 lr 0.000015 wd 0.0500 time 0.2417 (0.2441) data time 0.0007 (0.0015) 
model time 0.2409 (0.2425) loss 3.0971 (2.6961) grad_norm 7.1771 (6.7115) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1170/1251] eta 0:00:19 lr 0.000015 wd 0.0500 time 0.2467 (0.2441) data time 0.0010 (0.0015) model time 0.2457 (0.2425) loss 2.8556 (2.6952) grad_norm 3.6167 (6.7005) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1180/1251] eta 0:00:17 lr 0.000015 wd 0.0500 time 0.2461 (0.2441) data time 0.0010 (0.0015) model time 0.2452 (0.2425) loss 2.8289 (2.6939) grad_norm 4.4096 (6.7109) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1190/1251] eta 0:00:14 lr 0.000015 wd 0.0500 time 0.2371 (0.2441) data time 0.0009 (0.0014) model time 0.2363 (0.2425) loss 3.5554 (2.6918) grad_norm 3.7537 (6.6978) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1200/1251] eta 0:00:12 lr 0.000015 wd 0.0500 time 0.2445 (0.2441) data time 0.0009 (0.0014) model time 0.2436 (0.2425) loss 2.9489 (2.6908) grad_norm 6.0809 (6.6895) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1210/1251] eta 0:00:10 lr 0.000015 wd 0.0500 time 0.2431 (0.2440) data time 0.0010 (0.0014) model time 0.2421 (0.2424) loss 3.1538 (2.6911) grad_norm 5.1740 (6.6965) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1220/1251] eta 0:00:07 lr 0.000015 wd 0.0500 time 0.2354 (0.2440) data time 0.0011 (0.0014) model time 0.2343 (0.2424) loss 3.2299 (2.6900) grad_norm 7.3711 (6.6950) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[286/300][1230/1251] eta 0:00:05 lr 0.000015 wd 0.0500 time 0.2308 (0.2440) data time 0.0012 (0.0014) model time 0.2296 (0.2424) loss 2.9099 (2.6906) grad_norm 6.9044 (6.7205) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1240/1251] eta 0:00:02 lr 0.000015 wd 0.0500 time 0.2250 (0.2439) data time 0.0007 (0.0014) model time 0.2243 (0.2423) loss 3.1279 (2.6919) grad_norm 3.4307 (6.7179) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [286/300][1250/1251] eta 0:00:00 lr 0.000015 wd 0.0500 time 0.2237 (0.2438) data time 0.0005 (0.0014) model time 0.2232 (0.2422) loss 2.4909 (2.6908) grad_norm 6.9333 (6.7175) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 286 training takes 0:05:04 [2024-09-01 10:48:55 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:48:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 10:48:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.431 (0.431) Loss 0.3940 (0.3940) Acc@1 93.262 (93.262) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.109) Loss 0.5728 (0.6115) Acc@1 90.430 (87.855) Acc@5 97.852 (97.736) Mem 7381MB [2024-09-01 10:48:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.095) Loss 0.9243 (0.6431) Acc@1 77.734 (86.686) Acc@5 95.508 (97.670) Mem 7381MB [2024-09-01 10:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.085 (0.090) Loss 1.1494 (0.7382) Acc@1 75.000 (84.457) Acc@5 92.578 (96.692) Mem 7381MB [2024-09-01 10:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.084) Loss 1.0234 (0.7871) Acc@1 77.051 (83.263) Acc@5 94.531 (96.184) Mem 7381MB [2024-09-01 10:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.822 Acc@5 96.142 [2024-09-01 10:48:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 10:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.765 (0.765) Loss 0.3887 (0.3887) Acc@1 93.359 (93.359) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:49:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.145) Loss 0.5635 (0.6074) Acc@1 90.527 (87.979) Acc@5 97.949 (97.754) Mem 7381MB [2024-09-01 10:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.114) Loss 0.9121 (0.6383) Acc@1 77.734 (86.779) Acc@5 95.703 (97.731) Mem 7381MB [2024-09-01 10:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.101) Loss 1.1396 (0.7312) Acc@1 74.609 (84.570) Acc@5 92.969 (96.777) Mem 7381MB [2024-09-01 10:49:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.093) Loss 1.0137 
(0.7796) Acc@1 76.660 (83.391) Acc@5 94.531 (96.289) Mem 7381MB [2024-09-01 10:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.934 Acc@5 96.248 [2024-09-01 10:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:49:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][0/1251] eta 0:23:06 lr 0.000015 wd 0.0500 time 1.1085 (1.1085) data time 0.5725 (0.5725) model time 0.0000 (0.0000) loss 2.8881 (2.8881) grad_norm 5.0145 (5.0145) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][10/1251] eta 0:06:40 lr 0.000015 wd 0.0500 time 0.2512 (0.3226) data time 0.0010 (0.0531) model time 0.0000 (0.0000) loss 2.6843 (2.6992) grad_norm 6.2957 (6.1930) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][20/1251] eta 0:06:02 lr 0.000015 wd 0.0500 time 0.2161 (0.2941) data time 0.0011 (0.0283) model time 0.0000 (0.0000) loss 2.6602 (2.6899) grad_norm 5.8110 (7.4102) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][30/1251] eta 0:05:39 lr 0.000015 wd 0.0500 time 0.2438 (0.2777) data time 0.0007 (0.0195) model time 0.0000 (0.0000) loss 2.4518 (2.7224) grad_norm 24.2146 (7.5716) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][40/1251] eta 0:05:26 lr 0.000015 wd 0.0500 time 0.2471 (0.2693) data time 0.0010 (0.0150) model time 0.0000 (0.0000) loss 1.5043 (2.6948) grad_norm 13.8073 (7.3555) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][50/1251] eta 0:05:17 lr 0.000015 wd 0.0500 time 0.2511 (0.2641) data time 0.0007 (0.0122) model time 0.0000 (0.0000) loss 
1.8039 (2.7245) grad_norm 4.1862 (6.9968) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][60/1251] eta 0:05:10 lr 0.000015 wd 0.0500 time 0.2423 (0.2605) data time 0.0009 (0.0104) model time 0.2413 (0.2413) loss 2.8234 (2.7510) grad_norm 10.9715 (6.8898) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][70/1251] eta 0:05:04 lr 0.000015 wd 0.0500 time 0.2440 (0.2576) data time 0.0008 (0.0091) model time 0.2432 (0.2400) loss 2.3826 (2.7237) grad_norm 4.0194 (6.8202) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][80/1251] eta 0:04:59 lr 0.000015 wd 0.0500 time 0.2343 (0.2556) data time 0.0008 (0.0081) model time 0.2335 (0.2403) loss 1.9744 (2.7298) grad_norm 7.4273 (6.7178) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][90/1251] eta 0:04:54 lr 0.000015 wd 0.0500 time 0.2296 (0.2539) data time 0.0011 (0.0073) model time 0.2285 (0.2398) loss 2.9699 (2.7160) grad_norm 5.0681 (6.6658) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][100/1251] eta 0:04:50 lr 0.000015 wd 0.0500 time 0.2419 (0.2527) data time 0.0008 (0.0067) model time 0.2411 (0.2402) loss 2.9626 (2.6837) grad_norm 7.1101 (6.5417) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][110/1251] eta 0:04:47 lr 0.000015 wd 0.0500 time 0.2464 (0.2517) data time 0.0008 (0.0061) model time 0.2456 (0.2401) loss 3.4137 (2.7021) grad_norm 4.8638 (6.4192) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][120/1251] eta 0:04:43 lr 0.000015 wd 
0.0500 time 0.2375 (0.2509) data time 0.0007 (0.0057) model time 0.2368 (0.2403) loss 3.2855 (2.7114) grad_norm 4.5338 (6.4191) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][130/1251] eta 0:04:40 lr 0.000015 wd 0.0500 time 0.2530 (0.2501) data time 0.0007 (0.0053) model time 0.2523 (0.2403) loss 3.0231 (2.7293) grad_norm 5.3728 (6.3980) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][140/1251] eta 0:04:37 lr 0.000015 wd 0.0500 time 0.2490 (0.2495) data time 0.0010 (0.0050) model time 0.2480 (0.2403) loss 2.6942 (2.7252) grad_norm 6.2050 (6.3297) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][150/1251] eta 0:04:34 lr 0.000015 wd 0.0500 time 0.2387 (0.2490) data time 0.0009 (0.0048) model time 0.2378 (0.2404) loss 1.7895 (2.7141) grad_norm 4.0279 (6.3296) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][160/1251] eta 0:04:31 lr 0.000015 wd 0.0500 time 0.2411 (0.2486) data time 0.0009 (0.0045) model time 0.2401 (0.2405) loss 2.8329 (2.7112) grad_norm 3.7370 (6.3313) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][170/1251] eta 0:04:28 lr 0.000015 wd 0.0500 time 0.2461 (0.2482) data time 0.0011 (0.0043) model time 0.2449 (0.2405) loss 2.5299 (2.7004) grad_norm 8.9587 (6.3237) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][180/1251] eta 0:04:25 lr 0.000015 wd 0.0500 time 0.2461 (0.2478) data time 0.0008 (0.0041) model time 0.2453 (0.2405) loss 1.8698 (2.6956) grad_norm 5.2202 (6.4541) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:51 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [287/300][190/1251] eta 0:04:22 lr 0.000015 wd 0.0500 time 0.2399 (0.2476) data time 0.0008 (0.0040) model time 0.2391 (0.2406) loss 1.9279 (2.6723) grad_norm 4.2538 (6.4362) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][200/1251] eta 0:04:19 lr 0.000015 wd 0.0500 time 0.2418 (0.2472) data time 0.0010 (0.0038) model time 0.2408 (0.2405) loss 1.6768 (2.6726) grad_norm 6.5296 (6.4061) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][210/1251] eta 0:04:17 lr 0.000015 wd 0.0500 time 0.2447 (0.2469) data time 0.0010 (0.0037) model time 0.2437 (0.2405) loss 2.9114 (2.6745) grad_norm 6.9858 (6.4340) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:49:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][220/1251] eta 0:04:14 lr 0.000015 wd 0.0500 time 0.2462 (0.2467) data time 0.0010 (0.0036) model time 0.2452 (0.2405) loss 2.9827 (2.6698) grad_norm 4.8412 (6.4437) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][230/1251] eta 0:04:11 lr 0.000015 wd 0.0500 time 0.2456 (0.2465) data time 0.0011 (0.0035) model time 0.2446 (0.2405) loss 2.5707 (2.6770) grad_norm 5.5892 (6.4720) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][240/1251] eta 0:04:09 lr 0.000015 wd 0.0500 time 0.2350 (0.2469) data time 0.0009 (0.0034) model time 0.2341 (0.2413) loss 2.5894 (2.6823) grad_norm 7.5984 (6.5873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][250/1251] eta 0:04:08 lr 0.000015 wd 0.0500 time 0.2284 (0.2478) data time 0.0007 (0.0033) model time 0.2277 (0.2427) loss 2.3947 (2.6711) grad_norm 4.9338 
(6.5741) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][260/1251] eta 0:04:05 lr 0.000015 wd 0.0500 time 0.2525 (0.2477) data time 0.0007 (0.0032) model time 0.2518 (0.2427) loss 2.2640 (2.6651) grad_norm 7.2687 (6.5954) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][270/1251] eta 0:04:02 lr 0.000015 wd 0.0500 time 0.2413 (0.2474) data time 0.0009 (0.0031) model time 0.2404 (0.2425) loss 2.4958 (2.6645) grad_norm 4.9698 (6.5613) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][280/1251] eta 0:03:59 lr 0.000015 wd 0.0500 time 0.2364 (0.2471) data time 0.0008 (0.0030) model time 0.2356 (0.2423) loss 3.1776 (2.6631) grad_norm 5.3136 (6.5678) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][290/1251] eta 0:03:57 lr 0.000015 wd 0.0500 time 0.2383 (0.2469) data time 0.0008 (0.0030) model time 0.2374 (0.2423) loss 2.0287 (2.6581) grad_norm 5.0305 (6.5413) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][300/1251] eta 0:03:55 lr 0.000015 wd 0.0500 time 0.2403 (0.2474) data time 0.0008 (0.0029) model time 0.2395 (0.2430) loss 2.0152 (2.6578) grad_norm 6.6264 (6.6507) loss_scale 256.0000 (129.7010) mem 7381MB [2024-09-01 10:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][310/1251] eta 0:03:52 lr 0.000015 wd 0.0500 time 0.2391 (0.2472) data time 0.0007 (0.0028) model time 0.2383 (0.2429) loss 2.6778 (2.6577) grad_norm 4.6212 (6.6354) loss_scale 256.0000 (133.7621) mem 7381MB [2024-09-01 10:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][320/1251] eta 0:03:50 lr 0.000015 wd 0.0500 time 0.2443 (0.2478) data 
time 0.0008 (0.0028) model time 0.2435 (0.2437) loss 2.4730 (2.6540) grad_norm 4.8530 (6.6039) loss_scale 256.0000 (137.5701) mem 7381MB [2024-09-01 10:50:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][330/1251] eta 0:03:48 lr 0.000015 wd 0.0500 time 0.2460 (0.2476) data time 0.0009 (0.0027) model time 0.2450 (0.2436) loss 2.9704 (2.6552) grad_norm 5.8268 (6.5803) loss_scale 256.0000 (141.1480) mem 7381MB [2024-09-01 10:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][340/1251] eta 0:03:45 lr 0.000015 wd 0.0500 time 0.2440 (0.2473) data time 0.0010 (0.0027) model time 0.2430 (0.2434) loss 2.8052 (2.6525) grad_norm 11.4656 (6.5803) loss_scale 256.0000 (144.5161) mem 7381MB [2024-09-01 10:50:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][350/1251] eta 0:03:42 lr 0.000015 wd 0.0500 time 0.2455 (0.2472) data time 0.0007 (0.0026) model time 0.2448 (0.2434) loss 3.0037 (2.6486) grad_norm 5.9044 (6.5563) loss_scale 256.0000 (147.6923) mem 7381MB [2024-09-01 10:50:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][360/1251] eta 0:03:40 lr 0.000015 wd 0.0500 time 0.2390 (0.2470) data time 0.0011 (0.0026) model time 0.2380 (0.2432) loss 2.8265 (2.6479) grad_norm 10.4836 (6.5497) loss_scale 256.0000 (150.6925) mem 7381MB [2024-09-01 10:50:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][370/1251] eta 0:03:37 lr 0.000015 wd 0.0500 time 0.2392 (0.2469) data time 0.0008 (0.0025) model time 0.2384 (0.2431) loss 2.7592 (2.6497) grad_norm 6.0021 (6.5105) loss_scale 256.0000 (153.5310) mem 7381MB [2024-09-01 10:50:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][380/1251] eta 0:03:34 lr 0.000015 wd 0.0500 time 0.2390 (0.2467) data time 0.0008 (0.0025) model time 0.2382 (0.2431) loss 2.0556 (2.6529) grad_norm 8.6015 (6.5064) loss_scale 256.0000 (156.2205) mem 7381MB [2024-09-01 10:50:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [287/300][390/1251] eta 0:03:32 lr 0.000015 wd 0.0500 time 0.2396 (0.2465) data time 0.0008 (0.0025) model time 0.2389 (0.2429) loss 3.1683 (2.6555) grad_norm 10.5010 (6.4942) loss_scale 256.0000 (158.7724) mem 7381MB [2024-09-01 10:50:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][400/1251] eta 0:03:29 lr 0.000015 wd 0.0500 time 0.2443 (0.2464) data time 0.0009 (0.0024) model time 0.2434 (0.2428) loss 1.7030 (2.6575) grad_norm 5.5677 (6.4805) loss_scale 256.0000 (161.1970) mem 7381MB [2024-09-01 10:50:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][410/1251] eta 0:03:27 lr 0.000015 wd 0.0500 time 0.2517 (0.2464) data time 0.0010 (0.0024) model time 0.2507 (0.2429) loss 2.9314 (2.6588) grad_norm 6.0916 (6.4621) loss_scale 256.0000 (163.5036) mem 7381MB [2024-09-01 10:50:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][420/1251] eta 0:03:24 lr 0.000015 wd 0.0500 time 0.2356 (0.2462) data time 0.0007 (0.0024) model time 0.2349 (0.2428) loss 3.3279 (2.6574) grad_norm 4.7014 (6.4370) loss_scale 256.0000 (165.7007) mem 7381MB [2024-09-01 10:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][430/1251] eta 0:03:22 lr 0.000015 wd 0.0500 time 0.2477 (0.2461) data time 0.0010 (0.0023) model time 0.2467 (0.2427) loss 2.9240 (2.6658) grad_norm 7.0662 (6.4270) loss_scale 256.0000 (167.7958) mem 7381MB [2024-09-01 10:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][440/1251] eta 0:03:19 lr 0.000015 wd 0.0500 time 0.2417 (0.2460) data time 0.0009 (0.0023) model time 0.2408 (0.2426) loss 2.7521 (2.6693) grad_norm 5.8989 (6.4205) loss_scale 256.0000 (169.7959) mem 7381MB [2024-09-01 10:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][450/1251] eta 0:03:16 lr 0.000015 wd 0.0500 time 0.2395 (0.2459) data time 0.0009 (0.0023) model time 0.2386 (0.2426) loss 3.1694 (2.6705) grad_norm 9.2299 (6.4238) loss_scale 256.0000 
(171.7073) mem 7381MB [2024-09-01 10:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][460/1251] eta 0:03:14 lr 0.000015 wd 0.0500 time 0.2382 (0.2458) data time 0.0007 (0.0022) model time 0.2374 (0.2425) loss 2.6012 (2.6741) grad_norm 5.6151 (6.4106) loss_scale 256.0000 (173.5358) mem 7381MB [2024-09-01 10:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][470/1251] eta 0:03:11 lr 0.000015 wd 0.0500 time 0.2398 (0.2456) data time 0.0010 (0.0022) model time 0.2387 (0.2424) loss 1.7358 (2.6733) grad_norm 10.9635 (6.3991) loss_scale 256.0000 (175.2866) mem 7381MB [2024-09-01 10:51:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][480/1251] eta 0:03:09 lr 0.000015 wd 0.0500 time 0.2391 (0.2456) data time 0.0008 (0.0022) model time 0.2382 (0.2424) loss 2.4696 (2.6722) grad_norm 5.1178 (inf) loss_scale 128.0000 (174.8358) mem 7381MB [2024-09-01 10:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][490/1251] eta 0:03:06 lr 0.000015 wd 0.0500 time 0.2435 (0.2455) data time 0.0011 (0.0022) model time 0.2425 (0.2423) loss 2.9579 (2.6727) grad_norm 5.7113 (inf) loss_scale 128.0000 (173.8819) mem 7381MB [2024-09-01 10:51:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][500/1251] eta 0:03:04 lr 0.000015 wd 0.0500 time 0.2392 (0.2454) data time 0.0010 (0.0021) model time 0.2381 (0.2423) loss 1.9472 (2.6729) grad_norm 6.2550 (inf) loss_scale 128.0000 (172.9661) mem 7381MB [2024-09-01 10:51:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][510/1251] eta 0:03:01 lr 0.000015 wd 0.0500 time 0.2363 (0.2454) data time 0.0010 (0.0021) model time 0.2353 (0.2423) loss 2.8452 (2.6719) grad_norm 5.2319 (inf) loss_scale 128.0000 (172.0861) mem 7381MB [2024-09-01 10:51:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][520/1251] eta 0:02:59 lr 0.000015 wd 0.0500 time 0.2369 (0.2453) data time 0.0011 (0.0021) model time 0.2358 
(0.2423) loss 2.4278 (2.6678) grad_norm 5.1188 (inf) loss_scale 128.0000 (171.2399) mem 7381MB [2024-09-01 10:51:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][530/1251] eta 0:02:56 lr 0.000015 wd 0.0500 time 0.2480 (0.2452) data time 0.0009 (0.0021) model time 0.2471 (0.2422) loss 3.0308 (2.6722) grad_norm 4.5966 (inf) loss_scale 128.0000 (170.4256) mem 7381MB [2024-09-01 10:51:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][540/1251] eta 0:02:54 lr 0.000015 wd 0.0500 time 0.2389 (0.2452) data time 0.0006 (0.0020) model time 0.2382 (0.2422) loss 2.8679 (2.6744) grad_norm 4.9532 (inf) loss_scale 128.0000 (169.6414) mem 7381MB [2024-09-01 10:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][550/1251] eta 0:02:51 lr 0.000015 wd 0.0500 time 0.2361 (0.2450) data time 0.0009 (0.0020) model time 0.2352 (0.2421) loss 2.9793 (2.6726) grad_norm 7.0975 (inf) loss_scale 128.0000 (168.8857) mem 7381MB [2024-09-01 10:51:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][560/1251] eta 0:02:49 lr 0.000015 wd 0.0500 time 0.2393 (0.2450) data time 0.0009 (0.0020) model time 0.2384 (0.2420) loss 1.9232 (2.6692) grad_norm 6.9565 (inf) loss_scale 128.0000 (168.1569) mem 7381MB [2024-09-01 10:51:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][570/1251] eta 0:02:46 lr 0.000015 wd 0.0500 time 0.2416 (0.2449) data time 0.0011 (0.0020) model time 0.2404 (0.2420) loss 2.7429 (2.6645) grad_norm 7.2010 (inf) loss_scale 128.0000 (167.4536) mem 7381MB [2024-09-01 10:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][580/1251] eta 0:02:44 lr 0.000015 wd 0.0500 time 0.2319 (0.2448) data time 0.0011 (0.0020) model time 0.2308 (0.2420) loss 2.6504 (2.6631) grad_norm 6.3105 (inf) loss_scale 128.0000 (166.7745) mem 7381MB [2024-09-01 10:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][590/1251] eta 0:02:41 lr 0.000015 wd 0.0500 
time 0.2404 (0.2448) data time 0.0009 (0.0020) model time 0.2395 (0.2419) loss 2.9816 (2.6629) grad_norm 4.2207 (inf) loss_scale 128.0000 (166.1184) mem 7381MB [2024-09-01 10:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][600/1251] eta 0:02:39 lr 0.000015 wd 0.0500 time 0.2427 (0.2447) data time 0.0010 (0.0019) model time 0.2417 (0.2419) loss 2.5991 (2.6634) grad_norm 6.2863 (inf) loss_scale 128.0000 (165.4842) mem 7381MB [2024-09-01 10:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][610/1251] eta 0:02:36 lr 0.000015 wd 0.0500 time 0.2326 (0.2446) data time 0.0010 (0.0019) model time 0.2317 (0.2419) loss 3.0728 (2.6659) grad_norm 4.8116 (inf) loss_scale 128.0000 (164.8707) mem 7381MB [2024-09-01 10:51:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][620/1251] eta 0:02:34 lr 0.000015 wd 0.0500 time 0.2419 (0.2446) data time 0.0009 (0.0019) model time 0.2410 (0.2418) loss 2.8850 (2.6683) grad_norm 4.3852 (inf) loss_scale 128.0000 (164.2770) mem 7381MB [2024-09-01 10:51:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][630/1251] eta 0:02:31 lr 0.000015 wd 0.0500 time 0.2316 (0.2445) data time 0.0011 (0.0019) model time 0.2305 (0.2418) loss 2.4781 (2.6699) grad_norm 5.3003 (inf) loss_scale 128.0000 (163.7021) mem 7381MB [2024-09-01 10:51:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][640/1251] eta 0:02:29 lr 0.000015 wd 0.0500 time 0.2415 (0.2445) data time 0.0010 (0.0019) model time 0.2405 (0.2418) loss 2.8496 (2.6691) grad_norm 6.8845 (inf) loss_scale 128.0000 (163.1451) mem 7381MB [2024-09-01 10:51:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][650/1251] eta 0:02:26 lr 0.000015 wd 0.0500 time 0.2406 (0.2444) data time 0.0007 (0.0019) model time 0.2399 (0.2418) loss 1.5977 (2.6675) grad_norm 4.3501 (inf) loss_scale 128.0000 (162.6052) mem 7381MB [2024-09-01 10:51:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [287/300][660/1251] eta 0:02:24 lr 0.000015 wd 0.0500 time 0.2411 (0.2443) data time 0.0008 (0.0019) model time 0.2404 (0.2417) loss 3.4580 (2.6682) grad_norm 7.7185 (inf) loss_scale 128.0000 (162.0817) mem 7381MB [2024-09-01 10:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][670/1251] eta 0:02:21 lr 0.000015 wd 0.0500 time 0.2346 (0.2443) data time 0.0007 (0.0018) model time 0.2339 (0.2417) loss 2.2315 (2.6673) grad_norm 5.5564 (inf) loss_scale 128.0000 (161.5738) mem 7381MB [2024-09-01 10:51:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][680/1251] eta 0:02:19 lr 0.000015 wd 0.0500 time 0.2396 (0.2442) data time 0.0012 (0.0018) model time 0.2384 (0.2417) loss 2.6148 (2.6655) grad_norm 4.4214 (inf) loss_scale 128.0000 (161.0808) mem 7381MB [2024-09-01 10:51:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][690/1251] eta 0:02:16 lr 0.000015 wd 0.0500 time 0.2446 (0.2442) data time 0.0011 (0.0018) model time 0.2436 (0.2416) loss 2.8879 (2.6646) grad_norm 5.2888 (inf) loss_scale 128.0000 (160.6020) mem 7381MB [2024-09-01 10:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][700/1251] eta 0:02:14 lr 0.000015 wd 0.0500 time 0.2368 (0.2441) data time 0.0010 (0.0018) model time 0.2358 (0.2415) loss 2.6672 (2.6597) grad_norm 5.1116 (inf) loss_scale 128.0000 (160.1369) mem 7381MB [2024-09-01 10:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][710/1251] eta 0:02:12 lr 0.000015 wd 0.0500 time 0.2386 (0.2440) data time 0.0008 (0.0018) model time 0.2378 (0.2415) loss 2.1506 (2.6597) grad_norm 6.2752 (inf) loss_scale 128.0000 (159.6850) mem 7381MB [2024-09-01 10:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][720/1251] eta 0:02:09 lr 0.000015 wd 0.0500 time 0.2389 (0.2440) data time 0.0009 (0.0018) model time 0.2379 (0.2415) loss 2.6225 (2.6574) grad_norm 4.7349 (inf) loss_scale 128.0000 (159.2455) mem 7381MB 
[2024-09-01 10:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][730/1251] eta 0:02:07 lr 0.000015 wd 0.0500 time 0.2456 (0.2440) data time 0.0009 (0.0018) model time 0.2447 (0.2415) loss 2.9662 (2.6549) grad_norm 3.5171 (inf) loss_scale 128.0000 (158.8181) mem 7381MB [2024-09-01 10:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][740/1251] eta 0:02:04 lr 0.000015 wd 0.0500 time 0.2356 (0.2439) data time 0.0010 (0.0018) model time 0.2346 (0.2414) loss 2.5699 (2.6566) grad_norm 5.9160 (inf) loss_scale 128.0000 (158.4022) mem 7381MB [2024-09-01 10:52:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][750/1251] eta 0:02:02 lr 0.000015 wd 0.0500 time 0.2506 (0.2439) data time 0.0010 (0.0018) model time 0.2496 (0.2415) loss 2.8836 (2.6556) grad_norm 4.9703 (inf) loss_scale 128.0000 (157.9973) mem 7381MB [2024-09-01 10:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][760/1251] eta 0:01:59 lr 0.000015 wd 0.0500 time 0.2459 (0.2438) data time 0.0009 (0.0017) model time 0.2449 (0.2414) loss 2.9512 (2.6564) grad_norm 7.2274 (inf) loss_scale 128.0000 (157.6032) mem 7381MB [2024-09-01 10:52:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][770/1251] eta 0:01:57 lr 0.000015 wd 0.0500 time 0.4611 (0.2444) data time 0.0012 (0.0017) model time 0.4599 (0.2420) loss 2.6600 (2.6579) grad_norm 4.7081 (inf) loss_scale 128.0000 (157.2192) mem 7381MB [2024-09-01 10:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][780/1251] eta 0:01:55 lr 0.000015 wd 0.0500 time 0.2362 (0.2446) data time 0.0009 (0.0017) model time 0.2354 (0.2422) loss 2.6231 (2.6573) grad_norm 5.0628 (inf) loss_scale 128.0000 (156.8451) mem 7381MB [2024-09-01 10:52:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][790/1251] eta 0:01:52 lr 0.000015 wd 0.0500 time 0.2455 (0.2446) data time 0.0008 (0.0017) model time 0.2447 (0.2423) loss 2.4445 (2.6523) 
grad_norm 4.5008 (inf) loss_scale 128.0000 (156.4804) mem 7381MB [2024-09-01 10:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][800/1251] eta 0:01:50 lr 0.000015 wd 0.0500 time 0.2350 (0.2445) data time 0.0007 (0.0017) model time 0.2343 (0.2422) loss 2.8942 (2.6524) grad_norm 4.9033 (inf) loss_scale 128.0000 (156.1248) mem 7381MB [2024-09-01 10:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][810/1251] eta 0:01:47 lr 0.000015 wd 0.0500 time 0.2494 (0.2445) data time 0.0007 (0.0017) model time 0.2487 (0.2422) loss 1.6745 (2.6524) grad_norm 5.5301 (inf) loss_scale 128.0000 (155.7781) mem 7381MB [2024-09-01 10:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][820/1251] eta 0:01:45 lr 0.000015 wd 0.0500 time 0.2506 (0.2445) data time 0.0007 (0.0017) model time 0.2499 (0.2422) loss 3.0369 (2.6536) grad_norm 4.9074 (inf) loss_scale 128.0000 (155.4397) mem 7381MB [2024-09-01 10:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][830/1251] eta 0:01:42 lr 0.000015 wd 0.0500 time 0.2385 (0.2445) data time 0.0008 (0.0017) model time 0.2378 (0.2422) loss 3.2567 (2.6523) grad_norm 4.5460 (inf) loss_scale 128.0000 (155.1095) mem 7381MB [2024-09-01 10:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][840/1251] eta 0:01:40 lr 0.000015 wd 0.0500 time 0.2468 (0.2444) data time 0.0007 (0.0017) model time 0.2461 (0.2422) loss 2.2496 (2.6528) grad_norm 6.2063 (inf) loss_scale 128.0000 (154.7872) mem 7381MB [2024-09-01 10:52:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][850/1251] eta 0:01:37 lr 0.000015 wd 0.0500 time 0.2395 (0.2444) data time 0.0010 (0.0017) model time 0.2384 (0.2421) loss 2.7307 (2.6554) grad_norm 8.7646 (inf) loss_scale 128.0000 (154.4724) mem 7381MB [2024-09-01 10:52:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][860/1251] eta 0:01:35 lr 0.000015 wd 0.0500 time 0.2411 (0.2444) data 
time 0.0008 (0.0017) model time 0.2403 (0.2421) loss 2.6619 (2.6544) grad_norm 5.2269 (inf) loss_scale 128.0000 (154.1649) mem 7381MB [2024-09-01 10:52:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][870/1251] eta 0:01:33 lr 0.000015 wd 0.0500 time 0.2441 (0.2443) data time 0.0007 (0.0016) model time 0.2434 (0.2421) loss 3.4368 (2.6529) grad_norm 14.1253 (inf) loss_scale 128.0000 (153.8645) mem 7381MB [2024-09-01 10:52:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][880/1251] eta 0:01:30 lr 0.000015 wd 0.0500 time 0.2426 (0.2443) data time 0.0011 (0.0016) model time 0.2416 (0.2421) loss 3.1155 (2.6546) grad_norm 5.3682 (inf) loss_scale 128.0000 (153.5709) mem 7381MB [2024-09-01 10:52:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][890/1251] eta 0:01:28 lr 0.000015 wd 0.0500 time 0.2404 (0.2442) data time 0.0009 (0.0016) model time 0.2395 (0.2420) loss 3.1705 (2.6569) grad_norm 7.9142 (inf) loss_scale 128.0000 (153.2840) mem 7381MB [2024-09-01 10:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][900/1251] eta 0:01:25 lr 0.000015 wd 0.0500 time 0.2377 (0.2442) data time 0.0008 (0.0016) model time 0.2369 (0.2420) loss 2.0786 (2.6575) grad_norm 9.8947 (inf) loss_scale 128.0000 (153.0033) mem 7381MB [2024-09-01 10:52:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][910/1251] eta 0:01:23 lr 0.000015 wd 0.0500 time 0.2395 (0.2442) data time 0.0009 (0.0016) model time 0.2386 (0.2420) loss 2.8809 (2.6572) grad_norm 5.2118 (inf) loss_scale 128.0000 (152.7289) mem 7381MB [2024-09-01 10:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][920/1251] eta 0:01:20 lr 0.000015 wd 0.0500 time 0.2407 (0.2441) data time 0.0011 (0.0016) model time 0.2395 (0.2420) loss 2.2820 (2.6553) grad_norm 6.5359 (inf) loss_scale 128.0000 (152.4604) mem 7381MB [2024-09-01 10:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[287/300][930/1251] eta 0:01:18 lr 0.000015 wd 0.0500 time 0.2410 (0.2441) data time 0.0009 (0.0016) model time 0.2400 (0.2419) loss 2.9469 (2.6569) grad_norm 5.4100 (inf) loss_scale 128.0000 (152.1976) mem 7381MB [2024-09-01 10:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][940/1251] eta 0:01:15 lr 0.000015 wd 0.0500 time 0.2457 (0.2441) data time 0.0012 (0.0016) model time 0.2445 (0.2420) loss 2.5004 (2.6572) grad_norm 10.2075 (inf) loss_scale 128.0000 (151.9405) mem 7381MB [2024-09-01 10:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][950/1251] eta 0:01:13 lr 0.000015 wd 0.0500 time 0.2150 (0.2443) data time 0.0009 (0.0016) model time 0.2141 (0.2422) loss 2.8135 (2.6549) grad_norm 6.9225 (inf) loss_scale 128.0000 (151.6887) mem 7381MB [2024-09-01 10:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][960/1251] eta 0:01:11 lr 0.000015 wd 0.0500 time 0.2345 (0.2442) data time 0.0011 (0.0016) model time 0.2335 (0.2421) loss 3.2559 (2.6563) grad_norm 6.8096 (inf) loss_scale 128.0000 (151.4422) mem 7381MB [2024-09-01 10:53:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][970/1251] eta 0:01:08 lr 0.000015 wd 0.0500 time 0.2399 (0.2442) data time 0.0010 (0.0016) model time 0.2389 (0.2421) loss 2.7695 (2.6556) grad_norm 6.9390 (inf) loss_scale 128.0000 (151.2008) mem 7381MB [2024-09-01 10:53:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][980/1251] eta 0:01:06 lr 0.000015 wd 0.0500 time 0.2402 (0.2442) data time 0.0011 (0.0016) model time 0.2392 (0.2421) loss 2.1840 (2.6563) grad_norm 5.8207 (inf) loss_scale 128.0000 (150.9643) mem 7381MB [2024-09-01 10:53:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][990/1251] eta 0:01:03 lr 0.000015 wd 0.0500 time 0.2395 (0.2441) data time 0.0007 (0.0016) model time 0.2388 (0.2421) loss 2.9794 (2.6589) grad_norm 5.4889 (inf) loss_scale 128.0000 (150.7326) mem 7381MB [2024-09-01 
10:53:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1000/1251] eta 0:01:01 lr 0.000015 wd 0.0500 time 0.2448 (0.2441) data time 0.0009 (0.0016) model time 0.2439 (0.2420) loss 2.8045 (2.6572) grad_norm 8.7867 (inf) loss_scale 128.0000 (150.5055) mem 7381MB [2024-09-01 10:53:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1010/1251] eta 0:00:58 lr 0.000015 wd 0.0500 time 0.2374 (0.2441) data time 0.0008 (0.0016) model time 0.2366 (0.2420) loss 2.7876 (2.6576) grad_norm 4.4842 (inf) loss_scale 128.0000 (150.2829) mem 7381MB [2024-09-01 10:53:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1020/1251] eta 0:00:56 lr 0.000015 wd 0.0500 time 0.2390 (0.2441) data time 0.0011 (0.0016) model time 0.2379 (0.2420) loss 2.9697 (2.6569) grad_norm 5.7261 (inf) loss_scale 128.0000 (150.0646) mem 7381MB [2024-09-01 10:53:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1030/1251] eta 0:00:53 lr 0.000015 wd 0.0500 time 0.2481 (0.2440) data time 0.0007 (0.0015) model time 0.2473 (0.2420) loss 2.9939 (2.6584) grad_norm 7.1600 (inf) loss_scale 128.0000 (149.8506) mem 7381MB [2024-09-01 10:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1040/1251] eta 0:00:51 lr 0.000015 wd 0.0500 time 0.2391 (0.2440) data time 0.0010 (0.0015) model time 0.2381 (0.2420) loss 2.7235 (2.6610) grad_norm 4.8652 (inf) loss_scale 128.0000 (149.6407) mem 7381MB [2024-09-01 10:53:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1050/1251] eta 0:00:49 lr 0.000015 wd 0.0500 time 0.2412 (0.2440) data time 0.0008 (0.0015) model time 0.2403 (0.2419) loss 2.7687 (2.6610) grad_norm 8.5737 (inf) loss_scale 128.0000 (149.4348) mem 7381MB [2024-09-01 10:53:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1060/1251] eta 0:00:46 lr 0.000015 wd 0.0500 time 0.2525 (0.2439) data time 0.0007 (0.0015) model time 0.2518 (0.2419) loss 2.6968 (2.6593) 
grad_norm 5.2894 (inf) loss_scale 128.0000 (149.2328) mem 7381MB [2024-09-01 10:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1070/1251] eta 0:00:44 lr 0.000015 wd 0.0500 time 0.2341 (0.2439) data time 0.0011 (0.0015) model time 0.2330 (0.2419) loss 2.8125 (2.6621) grad_norm 10.0277 (inf) loss_scale 128.0000 (149.0345) mem 7381MB [2024-09-01 10:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1080/1251] eta 0:00:41 lr 0.000015 wd 0.0500 time 0.2419 (0.2439) data time 0.0011 (0.0015) model time 0.2408 (0.2419) loss 2.9557 (2.6636) grad_norm 4.2371 (inf) loss_scale 128.0000 (148.8400) mem 7381MB [2024-09-01 10:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1090/1251] eta 0:00:39 lr 0.000015 wd 0.0500 time 0.2361 (0.2438) data time 0.0007 (0.0015) model time 0.2354 (0.2419) loss 3.0099 (2.6636) grad_norm 41.7344 (inf) loss_scale 128.0000 (148.6489) mem 7381MB [2024-09-01 10:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1100/1251] eta 0:00:36 lr 0.000015 wd 0.0500 time 0.2450 (0.2438) data time 0.0008 (0.0015) model time 0.2441 (0.2418) loss 2.9009 (2.6655) grad_norm 5.6441 (inf) loss_scale 128.0000 (148.4614) mem 7381MB [2024-09-01 10:53:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1110/1251] eta 0:00:34 lr 0.000015 wd 0.0500 time 0.2389 (0.2438) data time 0.0009 (0.0015) model time 0.2380 (0.2418) loss 3.2439 (2.6668) grad_norm 4.7842 (inf) loss_scale 128.0000 (148.2772) mem 7381MB [2024-09-01 10:53:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1120/1251] eta 0:00:31 lr 0.000015 wd 0.0500 time 0.2384 (0.2438) data time 0.0011 (0.0015) model time 0.2373 (0.2418) loss 3.0623 (2.6660) grad_norm 5.9744 (inf) loss_scale 128.0000 (148.0963) mem 7381MB [2024-09-01 10:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1130/1251] eta 0:00:29 lr 0.000015 wd 0.0500 time 0.2426 (0.2438) 
data time 0.0009 (0.0015) model time 0.2416 (0.2418) loss 2.8087 (2.6659) grad_norm 4.7021 (inf) loss_scale 128.0000 (147.9187) mem 7381MB [2024-09-01 10:53:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1140/1251] eta 0:00:27 lr 0.000015 wd 0.0500 time 0.2420 (0.2438) data time 0.0007 (0.0015) model time 0.2413 (0.2418) loss 2.6605 (2.6659) grad_norm 13.1832 (inf) loss_scale 128.0000 (147.7441) mem 7381MB [2024-09-01 10:53:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1150/1251] eta 0:00:24 lr 0.000015 wd 0.0500 time 0.2465 (0.2438) data time 0.0007 (0.0015) model time 0.2458 (0.2418) loss 2.0803 (2.6670) grad_norm 5.9407 (inf) loss_scale 128.0000 (147.5725) mem 7381MB [2024-09-01 10:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1160/1251] eta 0:00:22 lr 0.000015 wd 0.0500 time 0.2319 (0.2437) data time 0.0010 (0.0015) model time 0.2308 (0.2418) loss 2.9212 (2.6657) grad_norm 3.9261 (inf) loss_scale 128.0000 (147.4040) mem 7381MB [2024-09-01 10:53:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1170/1251] eta 0:00:19 lr 0.000015 wd 0.0500 time 0.2357 (0.2437) data time 0.0008 (0.0015) model time 0.2349 (0.2418) loss 2.8615 (2.6662) grad_norm 6.8042 (inf) loss_scale 128.0000 (147.2383) mem 7381MB [2024-09-01 10:53:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1180/1251] eta 0:00:17 lr 0.000015 wd 0.0500 time 0.2449 (0.2437) data time 0.0007 (0.0015) model time 0.2442 (0.2418) loss 2.6395 (2.6651) grad_norm 4.5584 (inf) loss_scale 128.0000 (147.0754) mem 7381MB [2024-09-01 10:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1190/1251] eta 0:00:14 lr 0.000015 wd 0.0500 time 0.2474 (0.2437) data time 0.0007 (0.0015) model time 0.2467 (0.2418) loss 2.7522 (2.6650) grad_norm 10.6842 (inf) loss_scale 128.0000 (146.9152) mem 7381MB [2024-09-01 10:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[287/300][1200/1251] eta 0:00:12 lr 0.000015 wd 0.0500 time 0.2459 (0.2437) data time 0.0008 (0.0015) model time 0.2450 (0.2418) loss 2.8524 (2.6660) grad_norm 3.6192 (inf) loss_scale 128.0000 (146.7577) mem 7381MB [2024-09-01 10:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1210/1251] eta 0:00:09 lr 0.000015 wd 0.0500 time 0.2342 (0.2436) data time 0.0007 (0.0015) model time 0.2335 (0.2417) loss 3.1311 (2.6669) grad_norm 4.8665 (inf) loss_scale 128.0000 (146.6028) mem 7381MB [2024-09-01 10:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1220/1251] eta 0:00:07 lr 0.000014 wd 0.0500 time 0.2369 (0.2436) data time 0.0007 (0.0015) model time 0.2362 (0.2417) loss 3.1929 (2.6675) grad_norm 4.5914 (inf) loss_scale 128.0000 (146.4505) mem 7381MB [2024-09-01 10:54:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1230/1251] eta 0:00:05 lr 0.000014 wd 0.0500 time 0.2403 (0.2438) data time 0.0010 (0.0015) model time 0.2393 (0.2419) loss 2.9031 (2.6694) grad_norm 6.1237 (inf) loss_scale 128.0000 (146.3006) mem 7381MB [2024-09-01 10:54:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1240/1251] eta 0:00:02 lr 0.000014 wd 0.0500 time 0.2226 (0.2437) data time 0.0005 (0.0015) model time 0.2221 (0.2418) loss 2.7938 (2.6684) grad_norm 4.3826 (inf) loss_scale 128.0000 (146.1531) mem 7381MB [2024-09-01 10:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [287/300][1250/1251] eta 0:00:00 lr 0.000014 wd 0.0500 time 0.2214 (0.2437) data time 0.0005 (0.0015) model time 0.2209 (0.2418) loss 2.2352 (2.6679) grad_norm 11.1981 (inf) loss_scale 128.0000 (146.0080) mem 7381MB [2024-09-01 10:54:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 287 training takes 0:05:04 [2024-09-01 10:54:09 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... 
[2024-09-01 10:54:09 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.475 (0.475) Loss 0.3953 (0.3953) Acc@1 93.652 (93.652) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 10:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.076 (0.113) Loss 0.5713 (0.6157) Acc@1 90.039 (87.615) Acc@5 97.754 (97.736) Mem 7381MB [2024-09-01 10:54:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.097) Loss 0.9263 (0.6464) Acc@1 77.637 (86.528) Acc@5 95.703 (97.684) Mem 7381MB [2024-09-01 10:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.091) Loss 1.1504 (0.7405) Acc@1 74.512 (84.416) Acc@5 92.969 (96.689) Mem 7381MB [2024-09-01 10:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0273 (0.7901) Acc@1 77.148 (83.206) Acc@5 93.945 (96.170) Mem 7381MB [2024-09-01 10:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.776 Acc@5 96.140 [2024-09-01 10:54:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 10:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.802 (0.802) Loss 0.3892 (0.3892) Acc@1 93.359 (93.359) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:54:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.147) Loss 0.5640 (0.6077) Acc@1 90.527 (87.935) Acc@5 97.949 (97.754) Mem 7381MB [2024-09-01 10:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.115) Loss 0.9136 (0.6384) Acc@1 77.637 (86.747) Acc@5 95.605 (97.712) Mem 7381MB [2024-09-01 10:54:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.079 (0.102) Loss 1.1416 (0.7316) Acc@1 74.512 (84.548) Acc@5 
93.066 (96.777) Mem 7381MB [2024-09-01 10:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0127 (0.7800) Acc@1 76.562 (83.377) Acc@5 94.531 (96.282) Mem 7381MB [2024-09-01 10:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.918 Acc@5 96.232 [2024-09-01 10:54:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][0/1251] eta 0:22:24 lr 0.000014 wd 0.0500 time 1.0744 (1.0744) data time 0.6781 (0.6781) model time 0.0000 (0.0000) loss 2.9468 (2.9468) grad_norm 7.3344 (7.3344) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][10/1251] eta 0:06:33 lr 0.000014 wd 0.0500 time 0.2293 (0.3169) data time 0.0011 (0.0626) model time 0.0000 (0.0000) loss 2.7284 (2.7274) grad_norm 4.2687 (6.0949) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][20/1251] eta 0:05:44 lr 0.000014 wd 0.0500 time 0.2354 (0.2801) data time 0.0008 (0.0332) model time 0.0000 (0.0000) loss 2.9459 (2.6717) grad_norm 10.1390 (6.0617) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][30/1251] eta 0:05:26 lr 0.000014 wd 0.0500 time 0.2472 (0.2673) data time 0.0011 (0.0228) model time 0.0000 (0.0000) loss 2.4259 (2.6300) grad_norm 3.7165 (6.3245) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][40/1251] eta 0:05:16 lr 0.000014 wd 0.0500 time 0.2359 (0.2610) data time 0.0007 (0.0175) model time 0.0000 (0.0000) loss 3.5351 (2.7024) grad_norm 10.8730 (6.8416) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [288/300][50/1251] eta 0:05:08 lr 0.000014 wd 0.0500 time 0.2350 (0.2568) data time 0.0011 (0.0142) model time 0.0000 (0.0000) loss 3.2719 (2.7010) grad_norm 6.3349 (6.8952) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][60/1251] eta 0:05:02 lr 0.000014 wd 0.0500 time 0.2411 (0.2542) data time 0.0010 (0.0121) model time 0.2401 (0.2398) loss 3.0724 (2.7051) grad_norm 12.4598 (6.8930) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][70/1251] eta 0:04:57 lr 0.000014 wd 0.0500 time 0.2437 (0.2522) data time 0.0007 (0.0105) model time 0.2431 (0.2395) loss 2.9894 (2.7040) grad_norm 7.6242 (6.8017) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][80/1251] eta 0:04:53 lr 0.000014 wd 0.0500 time 0.2380 (0.2509) data time 0.0009 (0.0093) model time 0.2372 (0.2398) loss 3.2373 (2.7023) grad_norm 4.8942 (6.7708) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][90/1251] eta 0:04:49 lr 0.000014 wd 0.0500 time 0.2443 (0.2497) data time 0.0008 (0.0084) model time 0.2436 (0.2397) loss 2.6923 (2.6802) grad_norm 7.5909 (6.6931) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][100/1251] eta 0:04:46 lr 0.000014 wd 0.0500 time 0.2495 (0.2491) data time 0.0009 (0.0077) model time 0.2486 (0.2402) loss 2.9466 (2.6845) grad_norm 7.5105 (6.5969) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][110/1251] eta 0:04:43 lr 0.000014 wd 0.0500 time 0.2400 (0.2484) data time 0.0008 (0.0071) model time 0.2393 (0.2402) loss 2.9868 (2.7093) grad_norm 4.6525 (6.6158) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 10:54:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][120/1251] eta 0:04:40 lr 0.000014 wd 0.0500 time 0.2401 (0.2478) data time 0.0010 (0.0066) model time 0.2391 (0.2402) loss 2.8717 (2.7180) grad_norm 6.9926 (6.5656) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][130/1251] eta 0:04:37 lr 0.000014 wd 0.0500 time 0.2428 (0.2473) data time 0.0010 (0.0062) model time 0.2418 (0.2403) loss 2.7173 (2.7120) grad_norm 4.4300 (6.6721) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][140/1251] eta 0:04:34 lr 0.000014 wd 0.0500 time 0.2354 (0.2467) data time 0.0011 (0.0058) model time 0.2343 (0.2400) loss 2.8283 (2.7068) grad_norm 11.0994 (6.7122) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][150/1251] eta 0:04:31 lr 0.000014 wd 0.0500 time 0.2404 (0.2464) data time 0.0008 (0.0055) model time 0.2396 (0.2400) loss 3.0963 (2.7293) grad_norm 9.8110 (6.6376) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][160/1251] eta 0:04:28 lr 0.000014 wd 0.0500 time 0.2417 (0.2460) data time 0.0007 (0.0052) model time 0.2409 (0.2400) loss 2.7587 (2.7200) grad_norm 5.5328 (6.5969) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][170/1251] eta 0:04:25 lr 0.000014 wd 0.0500 time 0.2423 (0.2457) data time 0.0009 (0.0049) model time 0.2414 (0.2400) loss 2.9335 (2.7291) grad_norm 4.0260 (6.5663) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][180/1251] eta 0:04:22 lr 0.000014 wd 0.0500 time 0.2594 (0.2455) data time 0.0007 (0.0047) model 
time 0.2587 (0.2401) loss 2.6013 (2.7256) grad_norm 6.1193 (6.5153) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][190/1251] eta 0:04:20 lr 0.000014 wd 0.0500 time 0.2383 (0.2453) data time 0.0007 (0.0045) model time 0.2376 (0.2401) loss 2.1567 (2.7220) grad_norm 4.4489 (6.8223) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][200/1251] eta 0:04:18 lr 0.000014 wd 0.0500 time 0.2371 (0.2461) data time 0.0007 (0.0043) model time 0.2364 (0.2414) loss 1.9629 (2.7105) grad_norm 5.9150 (6.8282) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][210/1251] eta 0:04:15 lr 0.000014 wd 0.0500 time 0.2394 (0.2458) data time 0.0009 (0.0042) model time 0.2385 (0.2413) loss 2.9844 (2.7066) grad_norm 5.9750 (6.8356) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][220/1251] eta 0:04:13 lr 0.000014 wd 0.0500 time 0.2415 (0.2456) data time 0.0010 (0.0040) model time 0.2405 (0.2413) loss 2.4039 (2.6973) grad_norm 9.6881 (6.8669) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][230/1251] eta 0:04:10 lr 0.000014 wd 0.0500 time 0.2410 (0.2455) data time 0.0007 (0.0039) model time 0.2403 (0.2413) loss 2.1947 (2.6905) grad_norm 5.4781 (6.8277) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][240/1251] eta 0:04:08 lr 0.000014 wd 0.0500 time 0.2387 (0.2453) data time 0.0010 (0.0038) model time 0.2378 (0.2413) loss 2.7542 (2.6856) grad_norm 7.5333 (6.8167) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][250/1251] 
eta 0:04:05 lr 0.000014 wd 0.0500 time 0.2375 (0.2451) data time 0.0009 (0.0037) model time 0.2367 (0.2412) loss 2.7936 (2.6813) grad_norm 16.8726 (6.8697) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][260/1251] eta 0:04:04 lr 0.000014 wd 0.0500 time 0.2443 (0.2467) data time 0.0008 (0.0036) model time 0.2435 (0.2433) loss 2.0003 (2.6790) grad_norm 9.2781 (6.8258) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][270/1251] eta 0:04:01 lr 0.000014 wd 0.0500 time 0.2434 (0.2465) data time 0.0011 (0.0035) model time 0.2423 (0.2431) loss 2.7602 (2.6857) grad_norm 3.9445 (6.7930) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][280/1251] eta 0:03:59 lr 0.000014 wd 0.0500 time 0.2373 (0.2463) data time 0.0009 (0.0034) model time 0.2364 (0.2430) loss 2.0832 (2.6815) grad_norm 5.8211 (6.7532) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][290/1251] eta 0:03:56 lr 0.000014 wd 0.0500 time 0.2333 (0.2462) data time 0.0009 (0.0033) model time 0.2324 (0.2429) loss 1.9817 (2.6809) grad_norm 4.3788 (6.7138) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][300/1251] eta 0:03:54 lr 0.000014 wd 0.0500 time 0.2468 (0.2461) data time 0.0008 (0.0032) model time 0.2460 (0.2429) loss 3.3926 (2.6898) grad_norm 4.6243 (6.6603) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][310/1251] eta 0:03:51 lr 0.000014 wd 0.0500 time 0.2425 (0.2459) data time 0.0010 (0.0031) model time 0.2415 (0.2428) loss 2.7263 (2.6929) grad_norm 5.1487 (6.6635) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
10:55:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][320/1251] eta 0:03:48 lr 0.000014 wd 0.0500 time 0.2317 (0.2458) data time 0.0011 (0.0031) model time 0.2306 (0.2427) loss 2.1805 (2.6761) grad_norm 4.3163 (6.6004) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][330/1251] eta 0:03:46 lr 0.000014 wd 0.0500 time 0.2376 (0.2457) data time 0.0010 (0.0030) model time 0.2366 (0.2427) loss 2.8213 (2.6818) grad_norm 6.6764 (6.6157) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][340/1251] eta 0:03:43 lr 0.000014 wd 0.0500 time 0.2365 (0.2456) data time 0.0010 (0.0030) model time 0.2356 (0.2426) loss 3.0186 (2.6842) grad_norm 4.2307 (6.6550) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][350/1251] eta 0:03:41 lr 0.000014 wd 0.0500 time 0.2430 (0.2454) data time 0.0009 (0.0029) model time 0.2420 (0.2425) loss 2.2627 (2.6812) grad_norm 5.4539 (6.6166) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][360/1251] eta 0:03:38 lr 0.000014 wd 0.0500 time 0.2468 (0.2453) data time 0.0009 (0.0028) model time 0.2459 (0.2424) loss 2.9564 (2.6816) grad_norm 7.4652 (6.6001) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][370/1251] eta 0:03:35 lr 0.000014 wd 0.0500 time 0.2431 (0.2452) data time 0.0008 (0.0028) model time 0.2423 (0.2423) loss 2.8782 (2.6790) grad_norm 6.2031 (6.6244) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][380/1251] eta 0:03:33 lr 0.000014 wd 0.0500 time 0.2410 (0.2451) data time 0.0007 (0.0027) model time 0.2403 (0.2423) loss 2.6487 
(2.6720) grad_norm 6.1102 (6.6104) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][390/1251] eta 0:03:30 lr 0.000014 wd 0.0500 time 0.2453 (0.2450) data time 0.0009 (0.0027) model time 0.2443 (0.2423) loss 3.1503 (2.6785) grad_norm 7.5269 (6.5852) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][400/1251] eta 0:03:28 lr 0.000014 wd 0.0500 time 0.2399 (0.2449) data time 0.0011 (0.0027) model time 0.2388 (0.2422) loss 2.7659 (2.6852) grad_norm 9.2377 (6.5574) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][410/1251] eta 0:03:25 lr 0.000014 wd 0.0500 time 0.2497 (0.2449) data time 0.0007 (0.0026) model time 0.2490 (0.2422) loss 2.0284 (2.6851) grad_norm 3.7512 (6.5538) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][420/1251] eta 0:03:23 lr 0.000014 wd 0.0500 time 0.2404 (0.2448) data time 0.0010 (0.0026) model time 0.2394 (0.2422) loss 1.5478 (2.6842) grad_norm 3.8761 (6.5357) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][430/1251] eta 0:03:20 lr 0.000014 wd 0.0500 time 0.2474 (0.2447) data time 0.0010 (0.0025) model time 0.2464 (0.2421) loss 2.3525 (2.6825) grad_norm 5.9112 (6.5124) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][440/1251] eta 0:03:18 lr 0.000014 wd 0.0500 time 0.2360 (0.2447) data time 0.0009 (0.0025) model time 0.2350 (0.2421) loss 2.4539 (2.6807) grad_norm 6.2614 (6.4943) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][450/1251] eta 0:03:15 lr 0.000014 wd 0.0500 
time 0.2402 (0.2446) data time 0.0008 (0.0025) model time 0.2394 (0.2420) loss 2.2367 (2.6805) grad_norm 6.9458 (6.4834) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][460/1251] eta 0:03:13 lr 0.000014 wd 0.0500 time 0.2425 (0.2445) data time 0.0008 (0.0024) model time 0.2418 (0.2420) loss 3.2191 (2.6849) grad_norm 6.3325 (6.4637) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][470/1251] eta 0:03:10 lr 0.000014 wd 0.0500 time 0.2428 (0.2444) data time 0.0012 (0.0024) model time 0.2416 (0.2419) loss 2.0599 (2.6844) grad_norm 4.1786 (6.4881) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][480/1251] eta 0:03:08 lr 0.000014 wd 0.0500 time 0.2486 (0.2443) data time 0.0009 (0.0024) model time 0.2477 (0.2419) loss 2.1952 (2.6878) grad_norm 4.1915 (6.8485) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][490/1251] eta 0:03:05 lr 0.000014 wd 0.0500 time 0.2420 (0.2442) data time 0.0010 (0.0024) model time 0.2411 (0.2418) loss 3.1129 (2.6941) grad_norm 21.8453 (6.8560) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][500/1251] eta 0:03:03 lr 0.000014 wd 0.0500 time 0.2374 (0.2444) data time 0.0007 (0.0023) model time 0.2367 (0.2420) loss 2.5332 (2.6908) grad_norm 4.2859 (6.8141) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][510/1251] eta 0:03:01 lr 0.000014 wd 0.0500 time 0.2391 (0.2446) data time 0.0007 (0.0023) model time 0.2384 (0.2423) loss 1.9163 (2.6878) grad_norm 3.8261 (6.7965) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:25 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [288/300][520/1251] eta 0:02:58 lr 0.000014 wd 0.0500 time 0.2404 (0.2445) data time 0.0010 (0.0023) model time 0.2394 (0.2422) loss 3.0610 (2.6864) grad_norm 6.3822 (6.7924) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][530/1251] eta 0:02:56 lr 0.000014 wd 0.0500 time 0.2422 (0.2444) data time 0.0007 (0.0022) model time 0.2415 (0.2421) loss 2.1513 (2.6823) grad_norm 8.5553 (6.7819) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][540/1251] eta 0:02:53 lr 0.000014 wd 0.0500 time 0.2486 (0.2444) data time 0.0010 (0.0022) model time 0.2476 (0.2421) loss 3.2315 (2.6819) grad_norm 7.9151 (6.7634) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][550/1251] eta 0:02:51 lr 0.000014 wd 0.0500 time 0.2416 (0.2443) data time 0.0007 (0.0022) model time 0.2408 (0.2421) loss 2.3240 (2.6774) grad_norm 4.3678 (6.7518) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][560/1251] eta 0:02:48 lr 0.000014 wd 0.0500 time 0.2413 (0.2443) data time 0.0009 (0.0022) model time 0.2404 (0.2420) loss 2.9514 (2.6818) grad_norm 6.7109 (6.7358) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][570/1251] eta 0:02:46 lr 0.000014 wd 0.0500 time 0.2543 (0.2442) data time 0.0007 (0.0022) model time 0.2536 (0.2420) loss 2.1323 (2.6824) grad_norm 4.9011 (6.7204) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][580/1251] eta 0:02:43 lr 0.000014 wd 0.0500 time 0.2404 (0.2441) data time 0.0010 (0.0021) model time 0.2395 (0.2419) loss 2.2348 (2.6808) grad_norm 4.6436 
(6.7061) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][590/1251] eta 0:02:41 lr 0.000014 wd 0.0500 time 0.2427 (0.2441) data time 0.0007 (0.0021) model time 0.2420 (0.2419) loss 3.2339 (2.6804) grad_norm 4.3235 (6.6862) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][600/1251] eta 0:02:38 lr 0.000014 wd 0.0500 time 0.2333 (0.2440) data time 0.0011 (0.0021) model time 0.2322 (0.2418) loss 3.0355 (2.6803) grad_norm 89.0411 (6.8314) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][610/1251] eta 0:02:36 lr 0.000014 wd 0.0500 time 0.2324 (0.2439) data time 0.0008 (0.0021) model time 0.2316 (0.2418) loss 2.9732 (2.6790) grad_norm 5.7461 (6.8591) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][620/1251] eta 0:02:33 lr 0.000014 wd 0.0500 time 0.2441 (0.2439) data time 0.0007 (0.0021) model time 0.2433 (0.2418) loss 2.4935 (2.6801) grad_norm 4.7247 (6.8466) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][630/1251] eta 0:02:31 lr 0.000014 wd 0.0500 time 0.2432 (0.2439) data time 0.0007 (0.0021) model time 0.2425 (0.2418) loss 2.3087 (2.6811) grad_norm 6.5508 (6.8319) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][640/1251] eta 0:02:28 lr 0.000014 wd 0.0500 time 0.2395 (0.2438) data time 0.0008 (0.0020) model time 0.2387 (0.2417) loss 1.9552 (2.6753) grad_norm 6.7339 (6.8220) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][650/1251] eta 0:02:26 lr 0.000014 wd 0.0500 time 0.2408 (0.2438) 
data time 0.0007 (0.0020) model time 0.2401 (0.2417) loss 2.2562 (2.6771) grad_norm 7.6689 (6.8385) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:56:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][660/1251] eta 0:02:24 lr 0.000014 wd 0.0500 time 0.2325 (0.2438) data time 0.0010 (0.0020) model time 0.2315 (0.2417) loss 2.9228 (2.6763) grad_norm 3.4593 (6.8145) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][670/1251] eta 0:02:21 lr 0.000014 wd 0.0500 time 0.2380 (0.2437) data time 0.0008 (0.0020) model time 0.2372 (0.2416) loss 2.8672 (2.6765) grad_norm 8.4833 (6.8167) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][680/1251] eta 0:02:19 lr 0.000014 wd 0.0500 time 0.2309 (0.2436) data time 0.0009 (0.0020) model time 0.2299 (0.2416) loss 3.1645 (2.6770) grad_norm 7.3162 (6.8021) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][690/1251] eta 0:02:16 lr 0.000014 wd 0.0500 time 0.2402 (0.2436) data time 0.0008 (0.0020) model time 0.2394 (0.2415) loss 1.7266 (2.6781) grad_norm 6.5873 (6.7831) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][700/1251] eta 0:02:14 lr 0.000014 wd 0.0500 time 0.2368 (0.2435) data time 0.0012 (0.0019) model time 0.2356 (0.2415) loss 2.0719 (2.6759) grad_norm 7.3703 (6.7703) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][710/1251] eta 0:02:11 lr 0.000014 wd 0.0500 time 0.2374 (0.2435) data time 0.0010 (0.0019) model time 0.2364 (0.2415) loss 3.0371 (2.6795) grad_norm 4.0752 (6.7764) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [288/300][720/1251] eta 0:02:09 lr 0.000014 wd 0.0500 time 0.2395 (0.2437) data time 0.0013 (0.0019) model time 0.2382 (0.2417) loss 1.7619 (2.6796) grad_norm 8.1850 (6.7638) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][730/1251] eta 0:02:06 lr 0.000014 wd 0.0500 time 0.2436 (0.2436) data time 0.0007 (0.0019) model time 0.2429 (0.2416) loss 3.1984 (2.6767) grad_norm 4.7597 (6.7525) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][740/1251] eta 0:02:04 lr 0.000014 wd 0.0500 time 0.2420 (0.2436) data time 0.0009 (0.0019) model time 0.2411 (0.2417) loss 2.9266 (2.6755) grad_norm 5.6915 (6.7381) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][750/1251] eta 0:02:02 lr 0.000014 wd 0.0500 time 0.2394 (0.2436) data time 0.0008 (0.0019) model time 0.2387 (0.2417) loss 2.9594 (2.6744) grad_norm 5.1835 (6.7279) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][760/1251] eta 0:01:59 lr 0.000014 wd 0.0500 time 0.2390 (0.2436) data time 0.0011 (0.0019) model time 0.2379 (0.2417) loss 1.9191 (2.6744) grad_norm 12.7427 (6.7199) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][770/1251] eta 0:01:57 lr 0.000014 wd 0.0500 time 0.2300 (0.2436) data time 0.0008 (0.0019) model time 0.2292 (0.2416) loss 2.2672 (2.6702) grad_norm 7.2030 (6.7736) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][780/1251] eta 0:01:54 lr 0.000014 wd 0.0500 time 0.2351 (0.2435) data time 0.0012 (0.0019) model time 0.2339 (0.2416) loss 3.4224 (2.6703) grad_norm 4.4619 (6.7733) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 10:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][790/1251] eta 0:01:52 lr 0.000014 wd 0.0500 time 0.2368 (0.2435) data time 0.0007 (0.0018) model time 0.2361 (0.2416) loss 2.0163 (2.6682) grad_norm 3.9218 (6.9654) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][800/1251] eta 0:01:49 lr 0.000014 wd 0.0500 time 0.2456 (0.2434) data time 0.0009 (0.0018) model time 0.2447 (0.2415) loss 2.7781 (2.6678) grad_norm 5.0652 (6.9573) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][810/1251] eta 0:01:47 lr 0.000014 wd 0.0500 time 0.2378 (0.2434) data time 0.0007 (0.0018) model time 0.2371 (0.2415) loss 2.1785 (2.6650) grad_norm 3.6491 (6.9471) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][820/1251] eta 0:01:44 lr 0.000014 wd 0.0500 time 0.2430 (0.2433) data time 0.0009 (0.0018) model time 0.2420 (0.2415) loss 3.0533 (2.6669) grad_norm 5.2482 (6.9248) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][830/1251] eta 0:01:42 lr 0.000014 wd 0.0500 time 0.2414 (0.2433) data time 0.0007 (0.0018) model time 0.2407 (0.2415) loss 2.7280 (2.6666) grad_norm 6.2392 (6.9126) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][840/1251] eta 0:01:40 lr 0.000014 wd 0.0500 time 0.2317 (0.2433) data time 0.0012 (0.0018) model time 0.2304 (0.2415) loss 3.1355 (2.6677) grad_norm 5.3020 (6.9219) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][850/1251] eta 0:01:37 lr 0.000014 wd 0.0500 time 0.2374 (0.2433) data time 0.0008 (0.0018) model 
time 0.2367 (0.2415) loss 2.8072 (2.6698) grad_norm 4.2711 (6.9042) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][860/1251] eta 0:01:35 lr 0.000014 wd 0.0500 time 0.2482 (0.2433) data time 0.0009 (0.0018) model time 0.2473 (0.2414) loss 1.5521 (2.6673) grad_norm 5.7984 (6.8883) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][870/1251] eta 0:01:32 lr 0.000014 wd 0.0500 time 0.2345 (0.2433) data time 0.0010 (0.0018) model time 0.2335 (0.2414) loss 2.6039 (2.6660) grad_norm 4.8134 (6.9004) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][880/1251] eta 0:01:30 lr 0.000014 wd 0.0500 time 0.2366 (0.2432) data time 0.0011 (0.0018) model time 0.2356 (0.2414) loss 2.7755 (2.6689) grad_norm 4.2042 (6.8853) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][890/1251] eta 0:01:27 lr 0.000014 wd 0.0500 time 0.2367 (0.2432) data time 0.0008 (0.0017) model time 0.2359 (0.2414) loss 3.0514 (2.6717) grad_norm 5.5701 (6.8907) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][900/1251] eta 0:01:25 lr 0.000014 wd 0.0500 time 0.2463 (0.2432) data time 0.0009 (0.0017) model time 0.2454 (0.2414) loss 2.8671 (2.6722) grad_norm 4.3158 (6.9044) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][910/1251] eta 0:01:22 lr 0.000014 wd 0.0500 time 0.2456 (0.2432) data time 0.0008 (0.0017) model time 0.2449 (0.2414) loss 2.2075 (2.6726) grad_norm 8.6883 (6.8954) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][920/1251] 
eta 0:01:20 lr 0.000014 wd 0.0500 time 0.2523 (0.2432) data time 0.0010 (0.0017) model time 0.2514 (0.2414) loss 3.3225 (2.6748) grad_norm 4.9143 (6.8961) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][930/1251] eta 0:01:18 lr 0.000014 wd 0.0500 time 0.2335 (0.2432) data time 0.0010 (0.0017) model time 0.2325 (0.2414) loss 1.6259 (2.6743) grad_norm 5.2750 (6.8810) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][940/1251] eta 0:01:15 lr 0.000014 wd 0.0500 time 0.2379 (0.2432) data time 0.0009 (0.0017) model time 0.2370 (0.2414) loss 3.2796 (2.6757) grad_norm 5.7355 (6.8727) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][950/1251] eta 0:01:13 lr 0.000014 wd 0.0500 time 0.2374 (0.2432) data time 0.0008 (0.0017) model time 0.2366 (0.2414) loss 3.0918 (2.6744) grad_norm 3.9370 (6.8703) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][960/1251] eta 0:01:10 lr 0.000014 wd 0.0500 time 0.2438 (0.2432) data time 0.0010 (0.0017) model time 0.2428 (0.2414) loss 2.1618 (2.6720) grad_norm 7.4218 (6.8591) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][970/1251] eta 0:01:08 lr 0.000014 wd 0.0500 time 0.2382 (0.2431) data time 0.0008 (0.0017) model time 0.2374 (0.2414) loss 2.6159 (2.6714) grad_norm 6.4972 (6.8479) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][980/1251] eta 0:01:05 lr 0.000014 wd 0.0500 time 0.2398 (0.2431) data time 0.0009 (0.0017) model time 0.2389 (0.2414) loss 2.2058 (2.6682) grad_norm 6.8853 (6.8357) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
10:58:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][990/1251] eta 0:01:03 lr 0.000014 wd 0.0500 time 0.2398 (0.2431) data time 0.0010 (0.0017) model time 0.2387 (0.2414) loss 2.7933 (2.6693) grad_norm 6.8242 (6.8233) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1000/1251] eta 0:01:01 lr 0.000014 wd 0.0500 time 0.2406 (0.2431) data time 0.0009 (0.0017) model time 0.2397 (0.2414) loss 2.7982 (2.6689) grad_norm 4.8099 (6.8143) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1010/1251] eta 0:00:58 lr 0.000014 wd 0.0500 time 0.2366 (0.2431) data time 0.0010 (0.0017) model time 0.2356 (0.2414) loss 3.4870 (2.6700) grad_norm 9.7607 (6.8091) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1020/1251] eta 0:00:56 lr 0.000014 wd 0.0500 time 0.2376 (0.2432) data time 0.0011 (0.0017) model time 0.2366 (0.2415) loss 2.9482 (2.6707) grad_norm 3.5693 (6.8336) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1030/1251] eta 0:00:53 lr 0.000014 wd 0.0500 time 0.2378 (0.2432) data time 0.0008 (0.0016) model time 0.2370 (0.2415) loss 2.7909 (2.6707) grad_norm 3.5104 (6.8240) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1040/1251] eta 0:00:51 lr 0.000014 wd 0.0500 time 0.2488 (0.2434) data time 0.0011 (0.0016) model time 0.2478 (0.2417) loss 1.9946 (2.6693) grad_norm 9.2531 (6.8214) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1050/1251] eta 0:00:48 lr 0.000014 wd 0.0500 time 0.2451 (0.2434) data time 0.0008 (0.0016) model time 0.2443 (0.2417) loss 
2.6886 (2.6706) grad_norm 8.3296 (6.8197) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1060/1251] eta 0:00:46 lr 0.000014 wd 0.0500 time 0.2402 (0.2434) data time 0.0007 (0.0016) model time 0.2395 (0.2417) loss 2.8921 (2.6707) grad_norm 12.6682 (6.8341) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1070/1251] eta 0:00:44 lr 0.000014 wd 0.0500 time 0.2373 (0.2433) data time 0.0008 (0.0016) model time 0.2365 (0.2417) loss 2.5441 (2.6729) grad_norm 4.3553 (6.8238) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1080/1251] eta 0:00:41 lr 0.000014 wd 0.0500 time 0.2457 (0.2433) data time 0.0010 (0.0016) model time 0.2448 (0.2417) loss 2.3150 (2.6719) grad_norm 6.9710 (6.8142) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1090/1251] eta 0:00:39 lr 0.000014 wd 0.0500 time 0.2413 (0.2433) data time 0.0007 (0.0016) model time 0.2406 (0.2416) loss 2.8550 (2.6707) grad_norm 7.3241 (6.8114) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1100/1251] eta 0:00:36 lr 0.000014 wd 0.0500 time 0.2459 (0.2432) data time 0.0009 (0.0016) model time 0.2450 (0.2416) loss 2.5380 (2.6692) grad_norm 4.4551 (6.7973) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1110/1251] eta 0:00:34 lr 0.000014 wd 0.0500 time 0.2444 (0.2432) data time 0.0007 (0.0016) model time 0.2437 (0.2416) loss 3.0096 (2.6692) grad_norm 6.6095 (6.7879) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1120/1251] eta 0:00:31 lr 
0.000014 wd 0.0500 time 0.2351 (0.2432) data time 0.0007 (0.0016) model time 0.2344 (0.2416) loss 2.5624 (2.6698) grad_norm 4.9874 (6.7842) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1130/1251] eta 0:00:29 lr 0.000014 wd 0.0500 time 0.2437 (0.2432) data time 0.0010 (0.0016) model time 0.2427 (0.2415) loss 2.8424 (2.6706) grad_norm 4.8928 (6.7689) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1140/1251] eta 0:00:26 lr 0.000014 wd 0.0500 time 0.2373 (0.2432) data time 0.0007 (0.0016) model time 0.2365 (0.2415) loss 3.2100 (2.6718) grad_norm 6.9417 (6.7683) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:58:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1150/1251] eta 0:00:24 lr 0.000014 wd 0.0500 time 0.2468 (0.2431) data time 0.0009 (0.0016) model time 0.2459 (0.2415) loss 2.4465 (2.6724) grad_norm 3.9781 (6.7625) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1160/1251] eta 0:00:22 lr 0.000014 wd 0.0500 time 0.2457 (0.2431) data time 0.0010 (0.0016) model time 0.2446 (0.2415) loss 2.7098 (2.6722) grad_norm 7.4990 (6.7596) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1170/1251] eta 0:00:19 lr 0.000014 wd 0.0500 time 0.2545 (0.2431) data time 0.0010 (0.0016) model time 0.2535 (0.2415) loss 2.6084 (2.6712) grad_norm 6.0293 (6.7518) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1180/1251] eta 0:00:17 lr 0.000014 wd 0.0500 time 0.2391 (0.2431) data time 0.0008 (0.0016) model time 0.2383 (0.2415) loss 1.8521 (2.6709) grad_norm 3.9597 (6.7560) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:07 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1190/1251] eta 0:00:14 lr 0.000014 wd 0.0500 time 0.2437 (0.2434) data time 0.0009 (0.0016) model time 0.2428 (0.2418) loss 2.9097 (2.6724) grad_norm 4.2725 (6.7448) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1200/1251] eta 0:00:12 lr 0.000014 wd 0.0500 time 0.2423 (0.2434) data time 0.0008 (0.0016) model time 0.2415 (0.2418) loss 1.9617 (2.6732) grad_norm 4.7252 (6.7329) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1210/1251] eta 0:00:09 lr 0.000014 wd 0.0500 time 0.2400 (0.2434) data time 0.0007 (0.0015) model time 0.2393 (0.2418) loss 3.1517 (2.6743) grad_norm 6.6306 (6.7279) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1220/1251] eta 0:00:07 lr 0.000014 wd 0.0500 time 0.2439 (0.2434) data time 0.0010 (0.0015) model time 0.2429 (0.2418) loss 1.5710 (2.6750) grad_norm 6.4863 (6.7273) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 10:59:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1230/1251] eta 0:00:05 lr 0.000014 wd 0.0500 time 0.2421 (0.2434) data time 0.0010 (0.0015) model time 0.2412 (0.2418) loss 2.4131 (2.6725) grad_norm 3.9920 (6.7182) loss_scale 256.0000 (128.9358) mem 7381MB [2024-09-01 10:59:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1240/1251] eta 0:00:02 lr 0.000014 wd 0.0500 time 0.2246 (0.2433) data time 0.0007 (0.0015) model time 0.2239 (0.2417) loss 2.0922 (2.6725) grad_norm 7.3697 (6.7136) loss_scale 256.0000 (129.9597) mem 7381MB [2024-09-01 10:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [288/300][1250/1251] eta 0:00:00 lr 0.000014 wd 0.0500 time 0.2225 (0.2433) data time 0.0005 (0.0015) model time 0.2221 (0.2417) loss 3.3783 
(2.6722) grad_norm 6.2707 (6.7050) loss_scale 256.0000 (130.9672) mem 7381MB [2024-09-01 10:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 288 training takes 0:05:04 [2024-09-01 10:59:22 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 10:59:23 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 10:59:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.399 (0.399) Loss 0.3997 (0.3997) Acc@1 92.578 (92.578) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 10:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.112) Loss 0.5664 (0.6159) Acc@1 90.430 (87.615) Acc@5 97.656 (97.781) Mem 7381MB [2024-09-01 10:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.078 (0.097) Loss 0.9175 (0.6465) Acc@1 77.734 (86.575) Acc@5 95.605 (97.726) Mem 7381MB [2024-09-01 10:59:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.078 (0.091) Loss 1.1543 (0.7408) Acc@1 75.293 (84.384) Acc@5 92.480 (96.746) Mem 7381MB [2024-09-01 10:59:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0283 (0.7902) Acc@1 76.172 (83.201) Acc@5 94.434 (96.222) Mem 7381MB [2024-09-01 10:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.784 Acc@5 96.186 [2024-09-01 10:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 10:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.793 (0.793) Loss 0.3899 (0.3899) Acc@1 93.359 (93.359) Acc@5 98.633 (98.633) Mem 7381MB [2024-09-01 10:59:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.146) Loss 0.5635 (0.6078) Acc@1 90.527 (87.962) Acc@5 97.949 (97.763) Mem 7381MB 
[2024-09-01 10:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.082 (0.115) Loss 0.9146 (0.6387) Acc@1 77.832 (86.756) Acc@5 95.605 (97.712) Mem 7381MB [2024-09-01 10:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.103) Loss 1.1426 (0.7318) Acc@1 74.609 (84.536) Acc@5 93.066 (96.777) Mem 7381MB [2024-09-01 10:59:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.094) Loss 1.0146 (0.7804) Acc@1 76.465 (83.348) Acc@5 94.531 (96.272) Mem 7381MB [2024-09-01 10:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.902 Acc@5 96.224 [2024-09-01 10:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 10:59:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][0/1251] eta 0:21:47 lr 0.000014 wd 0.0500 time 1.0449 (1.0449) data time 0.6276 (0.6276) model time 0.0000 (0.0000) loss 2.7830 (2.7830) grad_norm 29.0548 (29.0548) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][10/1251] eta 0:06:29 lr 0.000014 wd 0.0500 time 0.2458 (0.3141) data time 0.0010 (0.0580) model time 0.0000 (0.0000) loss 2.8420 (2.6394) grad_norm 6.5542 (8.8337) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][20/1251] eta 0:05:45 lr 0.000014 wd 0.0500 time 0.2416 (0.2809) data time 0.0009 (0.0309) model time 0.0000 (0.0000) loss 3.5187 (2.7369) grad_norm 4.0980 (8.8504) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][30/1251] eta 0:05:26 lr 0.000014 wd 0.0500 time 0.2407 (0.2676) data time 0.0010 (0.0213) model time 0.0000 (0.0000) loss 2.9011 (2.7013) grad_norm 4.6233 (7.6558) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:42 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][40/1251] eta 0:05:16 lr 0.000014 wd 0.0500 time 0.2445 (0.2610) data time 0.0010 (0.0163) model time 0.0000 (0.0000) loss 2.8654 (2.7105) grad_norm 10.7858 (7.5040) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][50/1251] eta 0:05:08 lr 0.000014 wd 0.0500 time 0.2389 (0.2569) data time 0.0009 (0.0133) model time 0.0000 (0.0000) loss 2.7465 (2.6792) grad_norm 5.0661 (7.2923) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][60/1251] eta 0:05:03 lr 0.000014 wd 0.0500 time 0.2467 (0.2544) data time 0.0009 (0.0113) model time 0.2458 (0.2408) loss 3.4086 (2.6869) grad_norm 7.4477 (7.0589) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][70/1251] eta 0:04:58 lr 0.000014 wd 0.0500 time 0.2434 (0.2528) data time 0.0007 (0.0099) model time 0.2427 (0.2414) loss 2.4606 (2.6633) grad_norm 6.0331 (6.8053) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][80/1251] eta 0:04:54 lr 0.000014 wd 0.0500 time 0.2407 (0.2514) data time 0.0009 (0.0088) model time 0.2399 (0.2411) loss 3.1374 (2.6278) grad_norm 4.9968 (6.5811) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][90/1251] eta 0:04:50 lr 0.000014 wd 0.0500 time 0.2484 (0.2505) data time 0.0010 (0.0079) model time 0.2474 (0.2414) loss 2.7254 (2.6184) grad_norm 3.9810 (6.5640) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][100/1251] eta 0:04:47 lr 0.000014 wd 0.0500 time 0.2379 (0.2496) data time 0.0008 (0.0072) model time 0.2371 (0.2412) loss 2.8207 (2.6096) 
grad_norm 9.4879 (6.5875) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 10:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][110/1251] eta 0:04:44 lr 0.000014 wd 0.0500 time 0.2405 (0.2490) data time 0.0011 (0.0067) model time 0.2394 (0.2413) loss 1.7880 (2.6030) grad_norm 3.9929 (6.4689) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][120/1251] eta 0:04:41 lr 0.000014 wd 0.0500 time 0.2527 (0.2487) data time 0.0007 (0.0062) model time 0.2520 (0.2417) loss 2.3252 (2.6031) grad_norm 4.5430 (6.4908) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][130/1251] eta 0:04:38 lr 0.000014 wd 0.0500 time 0.2434 (0.2481) data time 0.0010 (0.0058) model time 0.2425 (0.2414) loss 2.8574 (2.6144) grad_norm 5.0026 (6.5171) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][140/1251] eta 0:04:34 lr 0.000014 wd 0.0500 time 0.2394 (0.2474) data time 0.0010 (0.0054) model time 0.2384 (0.2411) loss 2.4028 (2.6288) grad_norm 3.9537 (6.5261) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][150/1251] eta 0:04:31 lr 0.000014 wd 0.0500 time 0.2437 (0.2470) data time 0.0007 (0.0051) model time 0.2430 (0.2410) loss 2.6742 (2.6195) grad_norm 4.8164 (6.4912) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][160/1251] eta 0:04:29 lr 0.000014 wd 0.0500 time 0.2449 (0.2467) data time 0.0009 (0.0049) model time 0.2440 (0.2410) loss 2.0045 (2.6266) grad_norm 13.0108 (6.5016) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][170/1251] eta 0:04:26 lr 0.000014 wd 0.0500 time 
0.2372 (0.2463) data time 0.0009 (0.0047) model time 0.2363 (0.2408) loss 2.6664 (2.6395) grad_norm 4.0750 (6.4732) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][180/1251] eta 0:04:23 lr 0.000014 wd 0.0500 time 0.2362 (0.2460) data time 0.0009 (0.0045) model time 0.2353 (0.2407) loss 1.8331 (2.6407) grad_norm 5.1540 (6.4382) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][190/1251] eta 0:04:20 lr 0.000014 wd 0.0500 time 0.2382 (0.2458) data time 0.0010 (0.0043) model time 0.2372 (0.2407) loss 2.6319 (2.6516) grad_norm 8.4066 (6.4688) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][200/1251] eta 0:04:18 lr 0.000014 wd 0.0500 time 0.2442 (0.2455) data time 0.0011 (0.0041) model time 0.2431 (0.2407) loss 2.7070 (2.6566) grad_norm 8.1656 (6.4881) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][210/1251] eta 0:04:15 lr 0.000014 wd 0.0500 time 0.2555 (0.2454) data time 0.0007 (0.0040) model time 0.2549 (0.2407) loss 2.6452 (2.6636) grad_norm 7.5165 (6.5008) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][220/1251] eta 0:04:12 lr 0.000014 wd 0.0500 time 0.2450 (0.2453) data time 0.0010 (0.0038) model time 0.2441 (0.2408) loss 3.0235 (2.6646) grad_norm 4.8411 (6.4910) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][230/1251] eta 0:04:10 lr 0.000014 wd 0.0500 time 0.2420 (0.2452) data time 0.0008 (0.0037) model time 0.2412 (0.2409) loss 2.9632 (2.6616) grad_norm 6.4658 (6.4756) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [289/300][240/1251] eta 0:04:07 lr 0.000014 wd 0.0500 time 0.2410 (0.2449) data time 0.0010 (0.0036) model time 0.2400 (0.2407) loss 2.6699 (2.6573) grad_norm 4.8539 (6.4127) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][250/1251] eta 0:04:04 lr 0.000014 wd 0.0500 time 0.2416 (0.2447) data time 0.0009 (0.0035) model time 0.2407 (0.2406) loss 2.9198 (2.6515) grad_norm 5.4691 (6.4062) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][260/1251] eta 0:04:02 lr 0.000014 wd 0.0500 time 0.2321 (0.2445) data time 0.0009 (0.0034) model time 0.2312 (0.2405) loss 3.0064 (2.6531) grad_norm 4.8715 (6.3915) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][270/1251] eta 0:03:59 lr 0.000014 wd 0.0500 time 0.2437 (0.2443) data time 0.0009 (0.0033) model time 0.2428 (0.2404) loss 2.6127 (2.6570) grad_norm 7.9018 (6.5728) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][280/1251] eta 0:03:57 lr 0.000014 wd 0.0500 time 0.2483 (0.2443) data time 0.0010 (0.0032) model time 0.2474 (0.2405) loss 2.2717 (2.6620) grad_norm 5.8542 (6.5902) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][290/1251] eta 0:03:54 lr 0.000014 wd 0.0500 time 0.2358 (0.2442) data time 0.0008 (0.0031) model time 0.2349 (0.2405) loss 3.2900 (2.6620) grad_norm 9.8078 (6.5957) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][300/1251] eta 0:03:52 lr 0.000014 wd 0.0500 time 0.2406 (0.2441) data time 0.0010 (0.0031) model time 0.2396 (0.2405) loss 2.4947 (2.6695) grad_norm 9.0469 
(6.5835) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][310/1251] eta 0:03:49 lr 0.000014 wd 0.0500 time 0.2423 (0.2441) data time 0.0009 (0.0030) model time 0.2414 (0.2406) loss 3.1393 (2.6719) grad_norm 5.0483 (6.5497) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][320/1251] eta 0:03:47 lr 0.000014 wd 0.0500 time 0.2464 (0.2441) data time 0.0007 (0.0029) model time 0.2457 (0.2407) loss 1.5616 (2.6679) grad_norm 4.5454 (6.5227) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][330/1251] eta 0:03:44 lr 0.000014 wd 0.0500 time 0.2398 (0.2440) data time 0.0008 (0.0029) model time 0.2390 (0.2407) loss 1.7207 (2.6646) grad_norm 7.1769 (6.5256) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][340/1251] eta 0:03:42 lr 0.000014 wd 0.0500 time 0.2395 (0.2438) data time 0.0008 (0.0028) model time 0.2387 (0.2406) loss 2.6995 (2.6662) grad_norm 6.2527 (6.5561) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][350/1251] eta 0:03:39 lr 0.000014 wd 0.0500 time 0.2415 (0.2437) data time 0.0009 (0.0028) model time 0.2407 (0.2405) loss 2.6117 (2.6610) grad_norm 4.9756 (6.5303) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:00:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][360/1251] eta 0:03:37 lr 0.000014 wd 0.0500 time 0.2410 (0.2437) data time 0.0010 (0.0027) model time 0.2400 (0.2406) loss 1.8586 (2.6577) grad_norm 9.2831 (6.5222) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:01:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][370/1251] eta 0:03:34 lr 0.000014 wd 0.0500 time 0.2359 (0.2436) data 
time 0.0009 (0.0027) model time 0.2350 (0.2405) loss 2.8070 (2.6565) grad_norm 3.7635 (6.5454) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:01:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][380/1251] eta 0:03:32 lr 0.000014 wd 0.0500 time 0.2358 (0.2436) data time 0.0007 (0.0026) model time 0.2351 (0.2406) loss 2.2076 (2.6468) grad_norm 6.0464 (6.5630) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:01:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][390/1251] eta 0:03:29 lr 0.000014 wd 0.0500 time 0.2534 (0.2435) data time 0.0009 (0.0026) model time 0.2525 (0.2406) loss 2.1536 (2.6479) grad_norm 5.5261 (6.5294) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:01:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][400/1251] eta 0:03:27 lr 0.000014 wd 0.0500 time 0.2411 (0.2440) data time 0.0008 (0.0025) model time 0.2403 (0.2412) loss 2.7149 (2.6510) grad_norm 4.7101 (6.5257) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:01:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][410/1251] eta 0:03:25 lr 0.000014 wd 0.0500 time 0.2396 (0.2445) data time 0.0013 (0.0025) model time 0.2383 (0.2418) loss 2.8411 (2.6558) grad_norm 4.6632 (6.5169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:01:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][420/1251] eta 0:03:23 lr 0.000014 wd 0.0500 time 0.2318 (0.2444) data time 0.0009 (0.0025) model time 0.2310 (0.2417) loss 3.2207 (2.6606) grad_norm 4.9586 (inf) loss_scale 128.0000 (253.5677) mem 7381MB [2024-09-01 11:01:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][430/1251] eta 0:03:20 lr 0.000014 wd 0.0500 time 0.2290 (0.2443) data time 0.0008 (0.0024) model time 0.2281 (0.2416) loss 2.6132 (2.6620) grad_norm 4.9438 (inf) loss_scale 128.0000 (250.6543) mem 7381MB [2024-09-01 11:01:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[289/300][440/1251] eta 0:03:18 lr 0.000014 wd 0.0500 time 0.2452 (0.2443) data time 0.0007 (0.0024) model time 0.2445 (0.2416) loss 2.2106 (2.6549) grad_norm 4.4510 (inf) loss_scale 128.0000 (247.8730) mem 7381MB [2024-09-01 11:01:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][450/1251] eta 0:03:15 lr 0.000014 wd 0.0500 time 0.2354 (0.2442) data time 0.0008 (0.0024) model time 0.2346 (0.2416) loss 2.1668 (2.6578) grad_norm 7.3454 (inf) loss_scale 128.0000 (245.2151) mem 7381MB [2024-09-01 11:01:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][460/1251] eta 0:03:13 lr 0.000014 wd 0.0500 time 0.2418 (0.2441) data time 0.0010 (0.0023) model time 0.2408 (0.2415) loss 2.9992 (2.6629) grad_norm 4.6250 (inf) loss_scale 128.0000 (242.6725) mem 7381MB [2024-09-01 11:01:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][470/1251] eta 0:03:10 lr 0.000014 wd 0.0500 time 0.2520 (0.2441) data time 0.0007 (0.0023) model time 0.2513 (0.2416) loss 3.2812 (2.6655) grad_norm 6.7567 (inf) loss_scale 128.0000 (240.2378) mem 7381MB [2024-09-01 11:01:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][480/1251] eta 0:03:08 lr 0.000014 wd 0.0500 time 0.2493 (0.2444) data time 0.0009 (0.0023) model time 0.2484 (0.2420) loss 2.4516 (2.6696) grad_norm 6.7267 (inf) loss_scale 128.0000 (237.9044) mem 7381MB [2024-09-01 11:01:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][490/1251] eta 0:03:05 lr 0.000014 wd 0.0500 time 0.2371 (0.2444) data time 0.0010 (0.0023) model time 0.2360 (0.2420) loss 2.7779 (2.6686) grad_norm 4.9128 (inf) loss_scale 128.0000 (235.6660) mem 7381MB [2024-09-01 11:01:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][500/1251] eta 0:03:03 lr 0.000013 wd 0.0500 time 0.2374 (0.2444) data time 0.0010 (0.0022) model time 0.2364 (0.2420) loss 3.3577 (2.6745) grad_norm 4.6837 (inf) loss_scale 128.0000 (233.5170) mem 7381MB [2024-09-01 
11:01:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][510/1251] eta 0:03:01 lr 0.000013 wd 0.0500 time 0.2391 (0.2444) data time 0.0009 (0.0022) model time 0.2382 (0.2420) loss 1.7980 (2.6731) grad_norm 12.4385 (inf) loss_scale 128.0000 (231.4521) mem 7381MB [2024-09-01 11:01:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][520/1251] eta 0:02:58 lr 0.000013 wd 0.0500 time 0.2404 (0.2444) data time 0.0008 (0.0022) model time 0.2397 (0.2421) loss 2.2908 (2.6715) grad_norm 5.5384 (inf) loss_scale 128.0000 (229.4664) mem 7381MB [2024-09-01 11:01:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][530/1251] eta 0:02:56 lr 0.000013 wd 0.0500 time 0.2375 (0.2444) data time 0.0009 (0.0022) model time 0.2366 (0.2420) loss 2.1206 (2.6716) grad_norm 6.0365 (inf) loss_scale 128.0000 (227.5556) mem 7381MB [2024-09-01 11:01:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][540/1251] eta 0:02:53 lr 0.000013 wd 0.0500 time 0.2441 (0.2443) data time 0.0009 (0.0021) model time 0.2431 (0.2420) loss 3.2032 (2.6718) grad_norm 5.0924 (inf) loss_scale 128.0000 (225.7153) mem 7381MB [2024-09-01 11:01:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][550/1251] eta 0:02:51 lr 0.000013 wd 0.0500 time 0.2413 (0.2442) data time 0.0009 (0.0021) model time 0.2405 (0.2420) loss 3.2163 (2.6714) grad_norm 4.6813 (inf) loss_scale 128.0000 (223.9419) mem 7381MB [2024-09-01 11:01:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][560/1251] eta 0:02:48 lr 0.000013 wd 0.0500 time 0.2403 (0.2442) data time 0.0008 (0.0021) model time 0.2396 (0.2420) loss 2.5927 (2.6663) grad_norm 9.5468 (inf) loss_scale 128.0000 (222.2317) mem 7381MB [2024-09-01 11:01:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][570/1251] eta 0:02:46 lr 0.000013 wd 0.0500 time 0.2407 (0.2442) data time 0.0007 (0.0021) model time 0.2400 (0.2420) loss 1.8608 (2.6653) grad_norm 
5.5882 (inf) loss_scale 128.0000 (220.5814) mem 7381MB [2024-09-01 11:01:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][580/1251] eta 0:02:43 lr 0.000013 wd 0.0500 time 0.2450 (0.2442) data time 0.0010 (0.0021) model time 0.2440 (0.2420) loss 3.1071 (2.6713) grad_norm 3.9359 (inf) loss_scale 128.0000 (218.9880) mem 7381MB [2024-09-01 11:01:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][590/1251] eta 0:02:41 lr 0.000013 wd 0.0500 time 0.2402 (0.2442) data time 0.0008 (0.0020) model time 0.2393 (0.2420) loss 3.0891 (2.6682) grad_norm 6.4230 (inf) loss_scale 128.0000 (217.4484) mem 7381MB [2024-09-01 11:01:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][600/1251] eta 0:02:38 lr 0.000013 wd 0.0500 time 0.2391 (0.2441) data time 0.0010 (0.0020) model time 0.2382 (0.2420) loss 2.7442 (2.6690) grad_norm 4.8505 (inf) loss_scale 128.0000 (215.9601) mem 7381MB [2024-09-01 11:02:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][610/1251] eta 0:02:36 lr 0.000013 wd 0.0500 time 0.2434 (0.2442) data time 0.0009 (0.0020) model time 0.2425 (0.2420) loss 2.8910 (2.6677) grad_norm 4.8902 (inf) loss_scale 128.0000 (214.5205) mem 7381MB [2024-09-01 11:02:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][620/1251] eta 0:02:34 lr 0.000013 wd 0.0500 time 0.2394 (0.2441) data time 0.0007 (0.0020) model time 0.2387 (0.2420) loss 2.7162 (2.6644) grad_norm 5.5875 (inf) loss_scale 128.0000 (213.1272) mem 7381MB [2024-09-01 11:02:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][630/1251] eta 0:02:31 lr 0.000013 wd 0.0500 time 0.2460 (0.2441) data time 0.0009 (0.0020) model time 0.2450 (0.2420) loss 2.9264 (2.6629) grad_norm 5.0680 (inf) loss_scale 128.0000 (211.7781) mem 7381MB [2024-09-01 11:02:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][640/1251] eta 0:02:29 lr 0.000013 wd 0.0500 time 0.2418 (0.2440) data time 0.0009 
(0.0020) model time 0.2408 (0.2420) loss 2.7025 (2.6642) grad_norm 7.8717 (inf) loss_scale 128.0000 (210.4711) mem 7381MB [2024-09-01 11:02:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][650/1251] eta 0:02:26 lr 0.000013 wd 0.0500 time 0.2460 (0.2440) data time 0.0010 (0.0019) model time 0.2450 (0.2419) loss 3.1407 (2.6653) grad_norm 12.9981 (inf) loss_scale 128.0000 (209.2043) mem 7381MB [2024-09-01 11:02:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][660/1251] eta 0:02:24 lr 0.000013 wd 0.0500 time 0.2448 (0.2439) data time 0.0007 (0.0019) model time 0.2441 (0.2419) loss 3.3543 (2.6645) grad_norm 3.4074 (inf) loss_scale 128.0000 (207.9758) mem 7381MB [2024-09-01 11:02:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][670/1251] eta 0:02:21 lr 0.000013 wd 0.0500 time 0.2425 (0.2439) data time 0.0010 (0.0019) model time 0.2415 (0.2418) loss 2.6343 (2.6628) grad_norm 5.8315 (inf) loss_scale 128.0000 (206.7839) mem 7381MB [2024-09-01 11:02:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][680/1251] eta 0:02:19 lr 0.000013 wd 0.0500 time 0.2435 (0.2438) data time 0.0011 (0.0019) model time 0.2424 (0.2418) loss 2.4052 (2.6600) grad_norm 6.6469 (inf) loss_scale 128.0000 (205.6270) mem 7381MB [2024-09-01 11:02:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][690/1251] eta 0:02:16 lr 0.000013 wd 0.0500 time 0.2386 (0.2438) data time 0.0011 (0.0019) model time 0.2374 (0.2418) loss 3.0365 (2.6611) grad_norm 6.2326 (inf) loss_scale 128.0000 (204.5036) mem 7381MB [2024-09-01 11:02:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][700/1251] eta 0:02:14 lr 0.000013 wd 0.0500 time 0.2420 (0.2438) data time 0.0007 (0.0019) model time 0.2413 (0.2418) loss 2.2689 (2.6596) grad_norm 4.2563 (inf) loss_scale 128.0000 (203.4123) mem 7381MB [2024-09-01 11:02:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][710/1251] eta 
0:02:11 lr 0.000013 wd 0.0500 time 0.2387 (0.2438) data time 0.0010 (0.0019) model time 0.2377 (0.2418) loss 2.7159 (2.6580) grad_norm 6.5305 (inf) loss_scale 128.0000 (202.3516) mem 7381MB [2024-09-01 11:02:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][720/1251] eta 0:02:09 lr 0.000013 wd 0.0500 time 0.2428 (0.2437) data time 0.0010 (0.0019) model time 0.2418 (0.2418) loss 2.1128 (2.6544) grad_norm 6.6903 (inf) loss_scale 128.0000 (201.3204) mem 7381MB [2024-09-01 11:02:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][730/1251] eta 0:02:06 lr 0.000013 wd 0.0500 time 0.2304 (0.2437) data time 0.0012 (0.0018) model time 0.2292 (0.2417) loss 2.8239 (2.6545) grad_norm 4.4355 (inf) loss_scale 128.0000 (200.3174) mem 7381MB [2024-09-01 11:02:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][740/1251] eta 0:02:04 lr 0.000013 wd 0.0500 time 0.2371 (0.2436) data time 0.0011 (0.0018) model time 0.2360 (0.2417) loss 2.8180 (2.6563) grad_norm 6.2234 (inf) loss_scale 128.0000 (199.3414) mem 7381MB [2024-09-01 11:02:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][750/1251] eta 0:02:02 lr 0.000013 wd 0.0500 time 0.2308 (0.2436) data time 0.0008 (0.0018) model time 0.2300 (0.2416) loss 2.3043 (2.6543) grad_norm 5.3947 (inf) loss_scale 128.0000 (198.3915) mem 7381MB [2024-09-01 11:02:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][760/1251] eta 0:01:59 lr 0.000013 wd 0.0500 time 0.2457 (0.2435) data time 0.0011 (0.0018) model time 0.2446 (0.2416) loss 2.4295 (2.6511) grad_norm 5.1715 (inf) loss_scale 128.0000 (197.4665) mem 7381MB [2024-09-01 11:02:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][770/1251] eta 0:01:57 lr 0.000013 wd 0.0500 time 0.2458 (0.2435) data time 0.0006 (0.0018) model time 0.2452 (0.2416) loss 2.0074 (2.6509) grad_norm 5.4842 (inf) loss_scale 128.0000 (196.5655) mem 7381MB [2024-09-01 11:02:41 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][780/1251] eta 0:01:54 lr 0.000013 wd 0.0500 time 0.2518 (0.2435) data time 0.0009 (0.0018) model time 0.2509 (0.2416) loss 2.8241 (2.6522) grad_norm 5.9070 (inf) loss_scale 128.0000 (195.6876) mem 7381MB [2024-09-01 11:02:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][790/1251] eta 0:01:52 lr 0.000013 wd 0.0500 time 0.2286 (0.2434) data time 0.0007 (0.0018) model time 0.2278 (0.2415) loss 2.4401 (2.6497) grad_norm 6.8330 (inf) loss_scale 128.0000 (194.8319) mem 7381MB [2024-09-01 11:02:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][800/1251] eta 0:01:49 lr 0.000013 wd 0.0500 time 0.2318 (0.2434) data time 0.0008 (0.0018) model time 0.2310 (0.2415) loss 3.1024 (2.6492) grad_norm 5.8057 (inf) loss_scale 128.0000 (193.9975) mem 7381MB [2024-09-01 11:02:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][810/1251] eta 0:01:47 lr 0.000013 wd 0.0500 time 0.2406 (0.2433) data time 0.0012 (0.0018) model time 0.2394 (0.2414) loss 2.9431 (2.6475) grad_norm 4.5294 (inf) loss_scale 128.0000 (193.1837) mem 7381MB [2024-09-01 11:02:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][820/1251] eta 0:01:45 lr 0.000013 wd 0.0500 time 0.2476 (0.2438) data time 0.0010 (0.0018) model time 0.2466 (0.2419) loss 3.0377 (2.6469) grad_norm 4.4240 (inf) loss_scale 128.0000 (192.3898) mem 7381MB [2024-09-01 11:02:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][830/1251] eta 0:01:42 lr 0.000013 wd 0.0500 time 0.2396 (0.2438) data time 0.0010 (0.0017) model time 0.2386 (0.2419) loss 3.1528 (2.6513) grad_norm 3.8620 (inf) loss_scale 128.0000 (191.6149) mem 7381MB [2024-09-01 11:02:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][840/1251] eta 0:01:40 lr 0.000013 wd 0.0500 time 0.2359 (0.2437) data time 0.0010 (0.0017) model time 0.2349 (0.2419) loss 2.9046 (2.6527) grad_norm 7.4015 
(inf) loss_scale 128.0000 (190.8585) mem 7381MB [2024-09-01 11:02:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][850/1251] eta 0:01:37 lr 0.000013 wd 0.0500 time 0.2442 (0.2437) data time 0.0010 (0.0017) model time 0.2432 (0.2419) loss 2.1122 (2.6522) grad_norm 7.6695 (inf) loss_scale 128.0000 (190.1199) mem 7381MB [2024-09-01 11:03:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][860/1251] eta 0:01:35 lr 0.000013 wd 0.0500 time 0.2372 (0.2437) data time 0.0010 (0.0017) model time 0.2362 (0.2418) loss 1.7728 (2.6482) grad_norm 4.5040 (inf) loss_scale 128.0000 (189.3984) mem 7381MB [2024-09-01 11:03:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][870/1251] eta 0:01:32 lr 0.000013 wd 0.0500 time 0.2367 (0.2436) data time 0.0010 (0.0017) model time 0.2357 (0.2418) loss 3.0301 (2.6509) grad_norm 5.1383 (inf) loss_scale 128.0000 (188.6935) mem 7381MB [2024-09-01 11:03:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][880/1251] eta 0:01:30 lr 0.000013 wd 0.0500 time 0.2420 (0.2436) data time 0.0010 (0.0017) model time 0.2409 (0.2418) loss 2.3988 (2.6497) grad_norm 4.8638 (inf) loss_scale 128.0000 (188.0045) mem 7381MB [2024-09-01 11:03:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][890/1251] eta 0:01:27 lr 0.000013 wd 0.0500 time 0.2351 (0.2435) data time 0.0009 (0.0017) model time 0.2342 (0.2418) loss 3.0985 (2.6524) grad_norm 6.5303 (inf) loss_scale 128.0000 (187.3311) mem 7381MB [2024-09-01 11:03:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][900/1251] eta 0:01:25 lr 0.000013 wd 0.0500 time 0.2347 (0.2435) data time 0.0011 (0.0017) model time 0.2337 (0.2417) loss 2.8498 (2.6545) grad_norm 5.6542 (inf) loss_scale 128.0000 (186.6726) mem 7381MB [2024-09-01 11:03:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][910/1251] eta 0:01:23 lr 0.000013 wd 0.0500 time 0.2442 (0.2435) data time 0.0007 (0.0017) 
model time 0.2435 (0.2417) loss 3.0376 (2.6564) grad_norm 5.5591 (inf) loss_scale 128.0000 (186.0285) mem 7381MB [2024-09-01 11:03:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][920/1251] eta 0:01:20 lr 0.000013 wd 0.0500 time 0.2334 (0.2435) data time 0.0007 (0.0017) model time 0.2327 (0.2417) loss 3.0788 (2.6579) grad_norm 11.2358 (inf) loss_scale 128.0000 (185.3985) mem 7381MB [2024-09-01 11:03:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][930/1251] eta 0:01:18 lr 0.000013 wd 0.0500 time 0.2407 (0.2434) data time 0.0007 (0.0017) model time 0.2400 (0.2417) loss 2.5955 (2.6574) grad_norm 9.6461 (inf) loss_scale 128.0000 (184.7820) mem 7381MB [2024-09-01 11:03:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][940/1251] eta 0:01:15 lr 0.000013 wd 0.0500 time 0.2426 (0.2434) data time 0.0010 (0.0017) model time 0.2416 (0.2416) loss 2.6929 (2.6556) grad_norm 6.4341 (inf) loss_scale 128.0000 (184.1785) mem 7381MB [2024-09-01 11:03:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][950/1251] eta 0:01:13 lr 0.000013 wd 0.0500 time 0.2504 (0.2434) data time 0.0007 (0.0016) model time 0.2497 (0.2416) loss 1.9045 (2.6543) grad_norm 25.5612 (inf) loss_scale 128.0000 (183.5878) mem 7381MB [2024-09-01 11:03:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][960/1251] eta 0:01:10 lr 0.000013 wd 0.0500 time 0.2502 (0.2434) data time 0.0008 (0.0016) model time 0.2494 (0.2416) loss 1.5424 (2.6546) grad_norm 4.9585 (inf) loss_scale 128.0000 (183.0094) mem 7381MB [2024-09-01 11:03:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][970/1251] eta 0:01:08 lr 0.000013 wd 0.0500 time 0.2363 (0.2434) data time 0.0010 (0.0016) model time 0.2353 (0.2416) loss 1.5724 (2.6537) grad_norm 6.1313 (inf) loss_scale 128.0000 (182.4428) mem 7381MB [2024-09-01 11:03:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][980/1251] eta 0:01:05 
lr 0.000013 wd 0.0500 time 0.2374 (0.2433) data time 0.0007 (0.0016) model time 0.2367 (0.2416) loss 2.9911 (2.6542) grad_norm 5.9934 (inf) loss_scale 128.0000 (181.8879) mem 7381MB [2024-09-01 11:03:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][990/1251] eta 0:01:03 lr 0.000013 wd 0.0500 time 0.2361 (0.2433) data time 0.0007 (0.0016) model time 0.2354 (0.2416) loss 2.5982 (2.6547) grad_norm 3.5002 (inf) loss_scale 128.0000 (181.3441) mem 7381MB [2024-09-01 11:03:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1000/1251] eta 0:01:01 lr 0.000013 wd 0.0500 time 0.2562 (0.2433) data time 0.0007 (0.0016) model time 0.2555 (0.2416) loss 3.0491 (2.6529) grad_norm 4.7724 (inf) loss_scale 128.0000 (180.8112) mem 7381MB [2024-09-01 11:03:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1010/1251] eta 0:00:58 lr 0.000013 wd 0.0500 time 0.2406 (0.2435) data time 0.0011 (0.0016) model time 0.2395 (0.2418) loss 3.0718 (2.6554) grad_norm 4.2295 (inf) loss_scale 128.0000 (180.2888) mem 7381MB [2024-09-01 11:03:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1020/1251] eta 0:00:56 lr 0.000013 wd 0.0500 time 0.2506 (0.2435) data time 0.0009 (0.0016) model time 0.2497 (0.2418) loss 2.8360 (2.6537) grad_norm 6.2938 (inf) loss_scale 128.0000 (179.7767) mem 7381MB [2024-09-01 11:03:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1030/1251] eta 0:00:53 lr 0.000013 wd 0.0500 time 0.2327 (0.2434) data time 0.0011 (0.0016) model time 0.2316 (0.2418) loss 3.0515 (2.6554) grad_norm 4.4551 (inf) loss_scale 128.0000 (179.2745) mem 7381MB [2024-09-01 11:03:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1040/1251] eta 0:00:51 lr 0.000013 wd 0.0500 time 0.2461 (0.2434) data time 0.0009 (0.0016) model time 0.2452 (0.2417) loss 2.2227 (2.6553) grad_norm 5.7655 (inf) loss_scale 128.0000 (178.7819) mem 7381MB [2024-09-01 11:03:47 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [289/300][1050/1251] eta 0:00:48 lr 0.000013 wd 0.0500 time 0.2441 (0.2434) data time 0.0008 (0.0016) model time 0.2433 (0.2417) loss 2.9399 (2.6550) grad_norm 4.4778 (inf) loss_scale 128.0000 (178.2988) mem 7381MB [2024-09-01 11:03:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1060/1251] eta 0:00:46 lr 0.000013 wd 0.0500 time 0.2434 (0.2434) data time 0.0009 (0.0016) model time 0.2425 (0.2417) loss 2.6609 (2.6548) grad_norm 6.4212 (inf) loss_scale 128.0000 (177.8247) mem 7381MB [2024-09-01 11:03:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1070/1251] eta 0:00:44 lr 0.000013 wd 0.0500 time 0.2446 (0.2433) data time 0.0007 (0.0016) model time 0.2439 (0.2417) loss 2.9238 (2.6542) grad_norm 12.1449 (inf) loss_scale 128.0000 (177.3595) mem 7381MB [2024-09-01 11:03:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1080/1251] eta 0:00:41 lr 0.000013 wd 0.0500 time 0.2408 (0.2433) data time 0.0010 (0.0016) model time 0.2398 (0.2417) loss 2.9800 (2.6564) grad_norm 4.4766 (inf) loss_scale 128.0000 (176.9029) mem 7381MB [2024-09-01 11:03:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1090/1251] eta 0:00:39 lr 0.000013 wd 0.0500 time 0.2449 (0.2433) data time 0.0010 (0.0016) model time 0.2439 (0.2417) loss 2.5593 (2.6561) grad_norm 8.9609 (inf) loss_scale 128.0000 (176.4546) mem 7381MB [2024-09-01 11:03:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1100/1251] eta 0:00:36 lr 0.000013 wd 0.0500 time 0.2463 (0.2433) data time 0.0007 (0.0016) model time 0.2456 (0.2417) loss 3.1419 (2.6585) grad_norm 4.9026 (inf) loss_scale 128.0000 (176.0145) mem 7381MB [2024-09-01 11:04:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1110/1251] eta 0:00:34 lr 0.000013 wd 0.0500 time 0.2425 (0.2433) data time 0.0009 (0.0015) model time 0.2416 (0.2416) loss 2.9255 (2.6581) grad_norm 5.9998 (inf) loss_scale 
128.0000 (175.5824) mem 7381MB [2024-09-01 11:04:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1120/1251] eta 0:00:31 lr 0.000013 wd 0.0500 time 0.2436 (0.2433) data time 0.0007 (0.0015) model time 0.2429 (0.2416) loss 2.1198 (2.6582) grad_norm 5.0370 (inf) loss_scale 128.0000 (175.1579) mem 7381MB [2024-09-01 11:04:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1130/1251] eta 0:00:29 lr 0.000013 wd 0.0500 time 0.2339 (0.2432) data time 0.0007 (0.0015) model time 0.2331 (0.2416) loss 2.7955 (2.6595) grad_norm 5.7863 (inf) loss_scale 128.0000 (174.7409) mem 7381MB [2024-09-01 11:04:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1140/1251] eta 0:00:26 lr 0.000013 wd 0.0500 time 0.2489 (0.2432) data time 0.0006 (0.0015) model time 0.2482 (0.2416) loss 1.6783 (2.6578) grad_norm 6.4098 (inf) loss_scale 128.0000 (174.3313) mem 7381MB [2024-09-01 11:04:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1150/1251] eta 0:00:24 lr 0.000013 wd 0.0500 time 0.2520 (0.2432) data time 0.0009 (0.0015) model time 0.2511 (0.2416) loss 2.2499 (2.6579) grad_norm 6.4452 (inf) loss_scale 128.0000 (173.9288) mem 7381MB [2024-09-01 11:04:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1160/1251] eta 0:00:22 lr 0.000013 wd 0.0500 time 0.2393 (0.2432) data time 0.0009 (0.0015) model time 0.2384 (0.2416) loss 2.6042 (2.6572) grad_norm 7.1594 (inf) loss_scale 128.0000 (173.5332) mem 7381MB [2024-09-01 11:04:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1170/1251] eta 0:00:19 lr 0.000013 wd 0.0500 time 0.2389 (0.2432) data time 0.0009 (0.0015) model time 0.2380 (0.2416) loss 3.1802 (2.6584) grad_norm 4.6904 (inf) loss_scale 128.0000 (173.1443) mem 7381MB [2024-09-01 11:04:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1180/1251] eta 0:00:17 lr 0.000013 wd 0.0500 time 0.2381 (0.2432) data time 0.0007 (0.0015) model 
time 0.2374 (0.2416) loss 2.3994 (2.6603) grad_norm 5.0320 (inf) loss_scale 128.0000 (172.7621) mem 7381MB [2024-09-01 11:04:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1190/1251] eta 0:00:14 lr 0.000013 wd 0.0500 time 0.2509 (0.2432) data time 0.0010 (0.0015) model time 0.2499 (0.2416) loss 2.4956 (2.6625) grad_norm 4.0236 (inf) loss_scale 128.0000 (172.3862) mem 7381MB [2024-09-01 11:04:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1200/1251] eta 0:00:12 lr 0.000013 wd 0.0500 time 0.2545 (0.2432) data time 0.0007 (0.0015) model time 0.2538 (0.2416) loss 1.7095 (2.6609) grad_norm 4.3934 (inf) loss_scale 128.0000 (172.0167) mem 7381MB [2024-09-01 11:04:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1210/1251] eta 0:00:09 lr 0.000013 wd 0.0500 time 0.2485 (0.2432) data time 0.0009 (0.0015) model time 0.2475 (0.2416) loss 2.5301 (2.6611) grad_norm 5.0022 (inf) loss_scale 128.0000 (171.6532) mem 7381MB [2024-09-01 11:04:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1220/1251] eta 0:00:07 lr 0.000013 wd 0.0500 time 0.2387 (0.2432) data time 0.0014 (0.0015) model time 0.2374 (0.2416) loss 2.5939 (2.6597) grad_norm 5.9589 (inf) loss_scale 128.0000 (171.2957) mem 7381MB [2024-09-01 11:04:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1230/1251] eta 0:00:05 lr 0.000013 wd 0.0500 time 0.2421 (0.2432) data time 0.0009 (0.0015) model time 0.2412 (0.2416) loss 2.3207 (2.6601) grad_norm 5.0009 (inf) loss_scale 128.0000 (170.9439) mem 7381MB [2024-09-01 11:04:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1240/1251] eta 0:00:02 lr 0.000013 wd 0.0500 time 0.2257 (0.2431) data time 0.0005 (0.0015) model time 0.2252 (0.2415) loss 1.6877 (2.6585) grad_norm 5.1729 (inf) loss_scale 128.0000 (170.5979) mem 7381MB [2024-09-01 11:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [289/300][1250/1251] eta 0:00:00 lr 
0.000013 wd 0.0500 time 0.2248 (0.2430) data time 0.0005 (0.0015) model time 0.2244 (0.2414) loss 2.4767 (2.6590) grad_norm 4.9635 (inf) loss_scale 128.0000 (170.2574) mem 7381MB [2024-09-01 11:04:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 289 training takes 0:05:03 [2024-09-01 11:04:35 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:04:36 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:04:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.465 (0.465) Loss 0.3894 (0.3894) Acc@1 93.066 (93.066) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:04:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.110) Loss 0.5654 (0.6092) Acc@1 90.527 (87.757) Acc@5 97.559 (97.727) Mem 7381MB [2024-09-01 11:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.074 (0.096) Loss 0.9072 (0.6397) Acc@1 78.125 (86.644) Acc@5 95.312 (97.712) Mem 7381MB [2024-09-01 11:04:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.089) Loss 1.1475 (0.7331) Acc@1 74.805 (84.470) Acc@5 92.578 (96.758) Mem 7381MB [2024-09-01 11:04:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.083) Loss 1.0195 (0.7834) Acc@1 76.562 (83.289) Acc@5 94.141 (96.272) Mem 7381MB [2024-09-01 11:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.844 Acc@5 96.200 [2024-09-01 11:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:04:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.792 (0.792) Loss 0.3901 (0.3901) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 11:04:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] 
Time 0.082 (0.146) Loss 0.5635 (0.6081) Acc@1 90.527 (87.944) Acc@5 97.949 (97.772) Mem 7381MB [2024-09-01 11:04:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.114) Loss 0.9136 (0.6389) Acc@1 77.832 (86.756) Acc@5 95.508 (97.717) Mem 7381MB [2024-09-01 11:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.102) Loss 1.1436 (0.7320) Acc@1 74.512 (84.542) Acc@5 92.969 (96.780) Mem 7381MB [2024-09-01 11:04:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0156 (0.7807) Acc@1 76.562 (83.360) Acc@5 94.434 (96.263) Mem 7381MB [2024-09-01 11:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.916 Acc@5 96.216 [2024-09-01 11:04:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:04:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][0/1251] eta 0:21:40 lr 0.000013 wd 0.0500 time 1.0399 (1.0399) data time 0.6340 (0.6340) model time 0.0000 (0.0000) loss 3.1220 (3.1220) grad_norm 6.1059 (6.1059) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:04:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][10/1251] eta 0:06:32 lr 0.000013 wd 0.0500 time 0.2407 (0.3166) data time 0.0008 (0.0585) model time 0.0000 (0.0000) loss 2.6093 (2.9104) grad_norm 4.2132 (6.2945) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:04:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][20/1251] eta 0:05:45 lr 0.000013 wd 0.0500 time 0.2439 (0.2807) data time 0.0010 (0.0311) model time 0.0000 (0.0000) loss 2.6521 (2.8034) grad_norm 4.2745 (6.0699) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:04:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][30/1251] eta 0:05:26 lr 0.000013 wd 0.0500 time 0.2357 (0.2678) data time 0.0010 (0.0214) model time 0.0000 (0.0000) loss 2.8408 (2.7567) 
grad_norm 4.6727 (6.0948) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:04:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][40/1251] eta 0:05:16 lr 0.000013 wd 0.0500 time 0.2346 (0.2613) data time 0.0009 (0.0164) model time 0.0000 (0.0000) loss 2.4993 (2.7435) grad_norm 3.6181 (5.9387) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:04:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][50/1251] eta 0:05:18 lr 0.000013 wd 0.0500 time 0.2344 (0.2653) data time 0.0010 (0.0134) model time 0.0000 (0.0000) loss 3.0462 (2.7242) grad_norm 6.9374 (5.9177) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][60/1251] eta 0:05:11 lr 0.000013 wd 0.0500 time 0.2438 (0.2612) data time 0.0007 (0.0114) model time 0.2432 (0.2396) loss 2.6529 (2.7239) grad_norm 7.1777 (5.9245) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][70/1251] eta 0:05:04 lr 0.000013 wd 0.0500 time 0.2388 (0.2581) data time 0.0010 (0.0099) model time 0.2378 (0.2387) loss 2.9055 (2.7420) grad_norm 6.5212 (5.8796) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][80/1251] eta 0:04:59 lr 0.000013 wd 0.0500 time 0.2397 (0.2560) data time 0.0009 (0.0088) model time 0.2388 (0.2394) loss 2.5823 (2.7286) grad_norm 5.1707 (5.9812) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][90/1251] eta 0:04:55 lr 0.000013 wd 0.0500 time 0.2395 (0.2544) data time 0.0009 (0.0079) model time 0.2386 (0.2395) loss 2.9135 (2.7175) grad_norm 7.9258 (5.9475) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][100/1251] eta 0:04:51 lr 0.000013 wd 0.0500 time 0.2421 
(0.2530) data time 0.0010 (0.0073) model time 0.2411 (0.2396) loss 2.4616 (2.7151) grad_norm 6.9294 (6.0474) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][110/1251] eta 0:04:47 lr 0.000013 wd 0.0500 time 0.2415 (0.2519) data time 0.0008 (0.0067) model time 0.2407 (0.2397) loss 3.0671 (2.7220) grad_norm 7.6061 (6.1230) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][120/1251] eta 0:04:43 lr 0.000013 wd 0.0500 time 0.2437 (0.2511) data time 0.0011 (0.0062) model time 0.2425 (0.2397) loss 3.0908 (2.7072) grad_norm 5.9139 (6.1128) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][130/1251] eta 0:04:40 lr 0.000013 wd 0.0500 time 0.2386 (0.2502) data time 0.0009 (0.0058) model time 0.2376 (0.2395) loss 3.3130 (2.7087) grad_norm 4.6427 (6.1370) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][140/1251] eta 0:04:37 lr 0.000013 wd 0.0500 time 0.2379 (0.2495) data time 0.0008 (0.0055) model time 0.2372 (0.2396) loss 3.0885 (2.7264) grad_norm 6.0353 (6.1297) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][150/1251] eta 0:04:34 lr 0.000013 wd 0.0500 time 0.2376 (0.2490) data time 0.0009 (0.0052) model time 0.2367 (0.2398) loss 3.1173 (2.7249) grad_norm 4.2825 (6.0865) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][160/1251] eta 0:04:31 lr 0.000013 wd 0.0500 time 0.2420 (0.2486) data time 0.0009 (0.0049) model time 0.2411 (0.2399) loss 2.9315 (2.7266) grad_norm 7.3881 (6.1418) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:27 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [290/300][170/1251] eta 0:04:29 lr 0.000013 wd 0.0500 time 0.4550 (0.2493) data time 0.0010 (0.0047) model time 0.4540 (0.2416) loss 1.8057 (2.7240) grad_norm 4.3566 (6.1114) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][180/1251] eta 0:04:28 lr 0.000013 wd 0.0500 time 0.2368 (0.2512) data time 0.0009 (0.0045) model time 0.2360 (0.2446) loss 3.0385 (2.7128) grad_norm 6.8469 (6.1342) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][190/1251] eta 0:04:25 lr 0.000013 wd 0.0500 time 0.2376 (0.2507) data time 0.0009 (0.0043) model time 0.2367 (0.2444) loss 3.2611 (2.7195) grad_norm 4.5525 (6.1536) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][200/1251] eta 0:04:23 lr 0.000013 wd 0.0500 time 0.2495 (0.2503) data time 0.0011 (0.0041) model time 0.2484 (0.2442) loss 2.7246 (2.7175) grad_norm 6.0046 (6.2061) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][210/1251] eta 0:04:20 lr 0.000013 wd 0.0500 time 0.2451 (0.2499) data time 0.0010 (0.0040) model time 0.2441 (0.2440) loss 2.8244 (2.7271) grad_norm 5.8309 (6.1762) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][220/1251] eta 0:04:17 lr 0.000013 wd 0.0500 time 0.2436 (0.2496) data time 0.0007 (0.0039) model time 0.2429 (0.2438) loss 2.8325 (2.7193) grad_norm 5.1779 (6.1908) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][230/1251] eta 0:04:14 lr 0.000013 wd 0.0500 time 0.2380 (0.2493) data time 0.0008 (0.0037) model time 0.2372 (0.2437) loss 2.9298 (2.7173) grad_norm 8.3401 
(6.2826) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][240/1251] eta 0:04:11 lr 0.000013 wd 0.0500 time 0.2375 (0.2489) data time 0.0010 (0.0036) model time 0.2365 (0.2435) loss 2.7843 (2.7102) grad_norm 4.5985 (6.2647) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][250/1251] eta 0:04:08 lr 0.000013 wd 0.0500 time 0.2378 (0.2486) data time 0.0010 (0.0035) model time 0.2368 (0.2434) loss 2.5548 (2.7084) grad_norm 5.0470 (6.2408) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][260/1251] eta 0:04:06 lr 0.000013 wd 0.0500 time 0.2448 (0.2483) data time 0.0007 (0.0034) model time 0.2441 (0.2432) loss 1.8059 (2.6979) grad_norm 5.1423 (6.1988) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][270/1251] eta 0:04:03 lr 0.000013 wd 0.0500 time 0.2407 (0.2480) data time 0.0009 (0.0033) model time 0.2397 (0.2430) loss 2.2693 (2.6997) grad_norm 3.8437 (6.3125) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][280/1251] eta 0:04:00 lr 0.000013 wd 0.0500 time 0.2461 (0.2478) data time 0.0008 (0.0032) model time 0.2453 (0.2429) loss 2.7305 (2.6937) grad_norm 16.1336 (6.3811) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][290/1251] eta 0:03:57 lr 0.000013 wd 0.0500 time 0.2382 (0.2476) data time 0.0009 (0.0032) model time 0.2372 (0.2428) loss 2.7898 (2.6975) grad_norm 4.7641 (6.3862) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:05:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][300/1251] eta 0:03:55 lr 0.000013 wd 0.0500 time 0.2426 (0.2475) 
data time 0.0009 (0.0031) model time 0.2416 (0.2429) loss 2.7493 (2.6972) grad_norm 6.5331 (6.3621) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][310/1251] eta 0:03:52 lr 0.000013 wd 0.0500 time 0.2378 (0.2473) data time 0.0009 (0.0030) model time 0.2369 (0.2428) loss 3.0515 (2.6943) grad_norm 9.5043 (6.3580) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][320/1251] eta 0:03:50 lr 0.000013 wd 0.0500 time 0.2405 (0.2471) data time 0.0009 (0.0030) model time 0.2396 (0.2427) loss 2.6207 (2.6910) grad_norm 4.5138 (6.3336) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][330/1251] eta 0:03:47 lr 0.000013 wd 0.0500 time 0.2409 (0.2469) data time 0.0010 (0.0029) model time 0.2399 (0.2426) loss 3.2956 (2.6929) grad_norm 6.8498 (6.3088) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][340/1251] eta 0:03:44 lr 0.000013 wd 0.0500 time 0.2446 (0.2468) data time 0.0007 (0.0028) model time 0.2439 (0.2425) loss 1.7100 (2.6866) grad_norm 10.5669 (6.3274) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][350/1251] eta 0:03:42 lr 0.000013 wd 0.0500 time 0.2426 (0.2466) data time 0.0007 (0.0028) model time 0.2419 (0.2425) loss 2.9396 (2.6875) grad_norm 6.1289 (6.3157) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][360/1251] eta 0:03:39 lr 0.000013 wd 0.0500 time 0.2464 (0.2465) data time 0.0007 (0.0027) model time 0.2457 (0.2424) loss 1.9159 (2.6898) grad_norm 5.7691 (6.3532) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [290/300][370/1251] eta 0:03:37 lr 0.000013 wd 0.0500 time 0.2466 (0.2463) data time 0.0007 (0.0027) model time 0.2459 (0.2423) loss 2.4173 (2.6904) grad_norm 5.6280 (6.3201) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][380/1251] eta 0:03:34 lr 0.000013 wd 0.0500 time 0.2395 (0.2462) data time 0.0007 (0.0027) model time 0.2387 (0.2423) loss 1.8752 (2.6899) grad_norm 6.3871 (6.3440) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][390/1251] eta 0:03:31 lr 0.000013 wd 0.0500 time 0.2342 (0.2460) data time 0.0010 (0.0026) model time 0.2332 (0.2422) loss 2.6146 (2.6878) grad_norm 4.1331 (6.3178) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][400/1251] eta 0:03:29 lr 0.000013 wd 0.0500 time 0.2392 (0.2459) data time 0.0010 (0.0026) model time 0.2382 (0.2421) loss 2.8037 (2.6912) grad_norm 4.4528 (6.3731) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][410/1251] eta 0:03:26 lr 0.000013 wd 0.0500 time 0.2376 (0.2458) data time 0.0008 (0.0025) model time 0.2368 (0.2421) loss 1.7506 (2.6862) grad_norm 5.3080 (6.3492) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][420/1251] eta 0:03:24 lr 0.000013 wd 0.0500 time 0.2506 (0.2457) data time 0.0008 (0.0025) model time 0.2499 (0.2420) loss 2.6131 (2.6880) grad_norm 5.2868 (6.3342) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][430/1251] eta 0:03:21 lr 0.000013 wd 0.0500 time 0.2406 (0.2456) data time 0.0009 (0.0025) model time 0.2397 (0.2420) loss 2.0752 (2.6903) grad_norm 5.8619 (6.3120) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 11:06:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][440/1251] eta 0:03:19 lr 0.000013 wd 0.0500 time 0.2385 (0.2455) data time 0.0010 (0.0024) model time 0.2375 (0.2420) loss 3.0271 (2.6901) grad_norm 7.7768 (6.3272) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][450/1251] eta 0:03:16 lr 0.000013 wd 0.0500 time 0.2456 (0.2455) data time 0.0009 (0.0024) model time 0.2448 (0.2420) loss 1.3397 (2.6879) grad_norm 11.8216 (6.3334) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][460/1251] eta 0:03:14 lr 0.000013 wd 0.0500 time 0.2433 (0.2454) data time 0.0008 (0.0024) model time 0.2425 (0.2420) loss 2.2423 (2.6898) grad_norm 6.3608 (6.3284) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][470/1251] eta 0:03:11 lr 0.000013 wd 0.0500 time 0.2366 (0.2453) data time 0.0009 (0.0023) model time 0.2357 (0.2419) loss 2.7050 (2.6894) grad_norm 4.8173 (6.3352) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][480/1251] eta 0:03:09 lr 0.000013 wd 0.0500 time 0.2416 (0.2452) data time 0.0011 (0.0023) model time 0.2405 (0.2419) loss 2.3418 (2.6873) grad_norm 7.4233 (6.3394) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][490/1251] eta 0:03:06 lr 0.000013 wd 0.0500 time 0.2433 (0.2452) data time 0.0008 (0.0023) model time 0.2425 (0.2418) loss 1.6086 (2.6865) grad_norm 9.4388 (6.3300) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][500/1251] eta 0:03:04 lr 0.000013 wd 0.0500 time 0.2332 (0.2451) data time 0.0011 (0.0023) model 
time 0.2321 (0.2419) loss 2.0272 (2.6835) grad_norm 5.1571 (6.3157) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][510/1251] eta 0:03:01 lr 0.000013 wd 0.0500 time 0.2420 (0.2451) data time 0.0008 (0.0022) model time 0.2412 (0.2418) loss 3.1276 (2.6862) grad_norm 4.7731 (6.3164) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][520/1251] eta 0:02:59 lr 0.000013 wd 0.0500 time 0.2358 (0.2450) data time 0.0008 (0.0022) model time 0.2350 (0.2418) loss 3.3581 (2.6856) grad_norm 4.6940 (6.3048) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][530/1251] eta 0:02:56 lr 0.000013 wd 0.0500 time 0.2354 (0.2449) data time 0.0007 (0.0022) model time 0.2347 (0.2418) loss 2.0868 (2.6851) grad_norm 8.0617 (6.2993) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][540/1251] eta 0:02:54 lr 0.000013 wd 0.0500 time 0.2448 (0.2449) data time 0.0009 (0.0022) model time 0.2439 (0.2417) loss 2.4870 (2.6835) grad_norm 6.9963 (6.3026) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:06:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][550/1251] eta 0:02:51 lr 0.000013 wd 0.0500 time 0.2393 (0.2448) data time 0.0012 (0.0021) model time 0.2381 (0.2417) loss 3.1631 (2.6869) grad_norm 6.0893 (6.2946) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][560/1251] eta 0:02:49 lr 0.000013 wd 0.0500 time 0.2376 (0.2447) data time 0.0008 (0.0021) model time 0.2368 (0.2416) loss 3.0471 (2.6878) grad_norm 4.6011 (6.2917) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][570/1251] 
eta 0:02:46 lr 0.000013 wd 0.0500 time 0.2468 (0.2446) data time 0.0007 (0.0021) model time 0.2461 (0.2416) loss 2.7065 (2.6898) grad_norm 4.6731 (6.2864) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][580/1251] eta 0:02:44 lr 0.000013 wd 0.0500 time 0.2466 (0.2446) data time 0.0009 (0.0021) model time 0.2456 (0.2416) loss 1.9717 (2.6899) grad_norm 5.6071 (6.2849) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][590/1251] eta 0:02:41 lr 0.000013 wd 0.0500 time 0.2445 (0.2445) data time 0.0010 (0.0021) model time 0.2435 (0.2416) loss 2.7289 (2.6912) grad_norm 5.4381 (6.2844) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][600/1251] eta 0:02:39 lr 0.000013 wd 0.0500 time 0.2221 (0.2448) data time 0.0010 (0.0020) model time 0.2211 (0.2419) loss 3.1798 (2.6861) grad_norm 7.3379 (6.2805) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][610/1251] eta 0:02:36 lr 0.000013 wd 0.0500 time 0.2443 (0.2448) data time 0.0011 (0.0020) model time 0.2432 (0.2419) loss 2.8899 (2.6882) grad_norm 6.2414 (6.2980) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][620/1251] eta 0:02:34 lr 0.000013 wd 0.0500 time 0.2458 (0.2448) data time 0.0011 (0.0020) model time 0.2447 (0.2419) loss 2.4950 (2.6873) grad_norm 4.8619 (6.3232) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][630/1251] eta 0:02:31 lr 0.000013 wd 0.0500 time 0.2365 (0.2448) data time 0.0009 (0.0020) model time 0.2356 (0.2420) loss 2.8931 (2.6880) grad_norm 4.2419 (6.3088) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
11:07:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][640/1251] eta 0:02:29 lr 0.000013 wd 0.0500 time 0.2402 (0.2447) data time 0.0009 (0.0020) model time 0.2393 (0.2420) loss 2.5343 (2.6835) grad_norm 6.3376 (6.2999) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][650/1251] eta 0:02:27 lr 0.000013 wd 0.0500 time 0.2477 (0.2447) data time 0.0010 (0.0020) model time 0.2467 (0.2420) loss 2.3173 (2.6822) grad_norm 7.9176 (6.3009) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][660/1251] eta 0:02:24 lr 0.000013 wd 0.0500 time 0.2382 (0.2447) data time 0.0009 (0.0019) model time 0.2373 (0.2420) loss 2.9052 (2.6812) grad_norm 4.7759 (6.2932) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][670/1251] eta 0:02:22 lr 0.000013 wd 0.0500 time 0.2370 (0.2446) data time 0.0011 (0.0019) model time 0.2360 (0.2419) loss 2.7440 (2.6826) grad_norm 4.8030 (6.2851) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][680/1251] eta 0:02:19 lr 0.000013 wd 0.0500 time 0.2369 (0.2445) data time 0.0009 (0.0019) model time 0.2361 (0.2418) loss 3.3134 (2.6845) grad_norm 7.0975 (6.2997) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][690/1251] eta 0:02:17 lr 0.000013 wd 0.0500 time 0.2350 (0.2444) data time 0.0009 (0.0019) model time 0.2342 (0.2418) loss 1.8540 (2.6816) grad_norm 7.3185 (6.2988) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][700/1251] eta 0:02:14 lr 0.000013 wd 0.0500 time 0.2444 (0.2444) data time 0.0007 (0.0019) model time 0.2437 (0.2418) loss 3.6082 
(2.6829) grad_norm 4.7693 (6.3042) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][710/1251] eta 0:02:12 lr 0.000013 wd 0.0500 time 0.4388 (0.2449) data time 0.0010 (0.0019) model time 0.4378 (0.2424) loss 2.2356 (2.6803) grad_norm 5.7915 (6.3197) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][720/1251] eta 0:02:10 lr 0.000013 wd 0.0500 time 0.2508 (0.2452) data time 0.0007 (0.0019) model time 0.2500 (0.2427) loss 2.9572 (2.6810) grad_norm 5.0120 (6.3154) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][730/1251] eta 0:02:07 lr 0.000013 wd 0.0500 time 0.2414 (0.2452) data time 0.0007 (0.0019) model time 0.2407 (0.2427) loss 2.0745 (2.6796) grad_norm 11.1446 (6.3113) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][740/1251] eta 0:02:05 lr 0.000013 wd 0.0500 time 0.2467 (0.2451) data time 0.0010 (0.0018) model time 0.2457 (0.2427) loss 2.6001 (2.6817) grad_norm 8.2042 (6.3084) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][750/1251] eta 0:02:02 lr 0.000013 wd 0.0500 time 0.2405 (0.2451) data time 0.0008 (0.0018) model time 0.2397 (0.2426) loss 2.3041 (2.6808) grad_norm 4.7744 (6.3012) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][760/1251] eta 0:02:00 lr 0.000013 wd 0.0500 time 0.2437 (0.2451) data time 0.0011 (0.0018) model time 0.2426 (0.2426) loss 2.7178 (2.6815) grad_norm 6.7698 (6.2923) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][770/1251] eta 0:01:57 lr 0.000013 wd 
0.0500 time 0.2322 (0.2450) data time 0.0012 (0.0018) model time 0.2310 (0.2426) loss 3.0855 (2.6783) grad_norm 6.8385 (6.2802) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][780/1251] eta 0:01:55 lr 0.000013 wd 0.0500 time 0.2310 (0.2450) data time 0.0007 (0.0018) model time 0.2303 (0.2425) loss 2.7128 (2.6824) grad_norm 4.6214 (6.2813) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:07:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][790/1251] eta 0:01:52 lr 0.000013 wd 0.0500 time 0.2416 (0.2449) data time 0.0008 (0.0018) model time 0.2409 (0.2425) loss 1.8995 (2.6825) grad_norm 4.7943 (6.2758) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][800/1251] eta 0:01:50 lr 0.000013 wd 0.0500 time 0.2400 (0.2448) data time 0.0010 (0.0018) model time 0.2390 (0.2425) loss 2.8796 (2.6839) grad_norm 3.8695 (6.2802) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][810/1251] eta 0:01:47 lr 0.000013 wd 0.0500 time 0.2449 (0.2448) data time 0.0007 (0.0018) model time 0.2442 (0.2425) loss 2.1396 (2.6822) grad_norm 6.1224 (6.2820) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][820/1251] eta 0:01:45 lr 0.000013 wd 0.0500 time 0.2424 (0.2448) data time 0.0007 (0.0018) model time 0.2417 (0.2425) loss 2.2095 (2.6822) grad_norm 6.8581 (6.2826) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][830/1251] eta 0:01:43 lr 0.000013 wd 0.0500 time 0.2313 (0.2448) data time 0.0011 (0.0017) model time 0.2302 (0.2424) loss 3.0529 (2.6809) grad_norm 7.0528 (6.2702) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:10 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [290/300][840/1251] eta 0:01:40 lr 0.000013 wd 0.0500 time 0.2393 (0.2447) data time 0.0009 (0.0017) model time 0.2384 (0.2424) loss 3.0084 (2.6801) grad_norm 6.1851 (6.2580) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][850/1251] eta 0:01:38 lr 0.000013 wd 0.0500 time 0.2318 (0.2447) data time 0.0009 (0.0017) model time 0.2309 (0.2424) loss 3.5512 (2.6830) grad_norm 8.8125 (6.2522) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][860/1251] eta 0:01:35 lr 0.000013 wd 0.0500 time 0.2492 (0.2447) data time 0.0010 (0.0017) model time 0.2482 (0.2424) loss 2.2136 (2.6827) grad_norm 7.7778 (6.2471) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][870/1251] eta 0:01:33 lr 0.000013 wd 0.0500 time 0.2424 (0.2447) data time 0.0010 (0.0017) model time 0.2414 (0.2424) loss 2.0179 (2.6808) grad_norm 7.3111 (6.2418) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][880/1251] eta 0:01:30 lr 0.000013 wd 0.0500 time 0.2392 (0.2446) data time 0.0008 (0.0017) model time 0.2384 (0.2424) loss 3.1004 (2.6806) grad_norm 75.8909 (6.3184) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][890/1251] eta 0:01:28 lr 0.000013 wd 0.0500 time 0.2401 (0.2446) data time 0.0011 (0.0017) model time 0.2390 (0.2424) loss 2.8491 (2.6782) grad_norm 5.3772 (6.3119) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][900/1251] eta 0:01:25 lr 0.000013 wd 0.0500 time 0.2400 (0.2446) data time 0.0011 (0.0017) model time 0.2388 (0.2424) loss 3.0332 (2.6774) grad_norm 5.0785 
(6.3174) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][910/1251] eta 0:01:23 lr 0.000013 wd 0.0500 time 0.2447 (0.2446) data time 0.0011 (0.0017) model time 0.2436 (0.2424) loss 2.4873 (2.6771) grad_norm 4.1482 (6.3953) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][920/1251] eta 0:01:20 lr 0.000013 wd 0.0500 time 0.2422 (0.2445) data time 0.0009 (0.0017) model time 0.2412 (0.2423) loss 2.6107 (2.6772) grad_norm 4.2562 (6.3852) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][930/1251] eta 0:01:18 lr 0.000013 wd 0.0500 time 0.2567 (0.2445) data time 0.0009 (0.0017) model time 0.2558 (0.2423) loss 2.9142 (2.6782) grad_norm 5.3648 (6.3873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][940/1251] eta 0:01:16 lr 0.000013 wd 0.0500 time 0.2424 (0.2445) data time 0.0011 (0.0017) model time 0.2413 (0.2423) loss 2.9701 (2.6782) grad_norm 4.1042 (6.3786) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][950/1251] eta 0:01:13 lr 0.000013 wd 0.0500 time 0.2436 (0.2445) data time 0.0008 (0.0017) model time 0.2427 (0.2423) loss 2.3946 (2.6781) grad_norm 10.7117 (6.4022) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][960/1251] eta 0:01:11 lr 0.000013 wd 0.0500 time 0.2421 (0.2444) data time 0.0009 (0.0016) model time 0.2412 (0.2423) loss 2.3739 (2.6794) grad_norm 4.4159 (6.4102) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][970/1251] eta 0:01:08 lr 0.000013 wd 0.0500 time 0.2522 (0.2444) 
data time 0.0009 (0.0016) model time 0.2512 (0.2423) loss 1.7643 (2.6809) grad_norm 5.7879 (6.3989) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][980/1251] eta 0:01:06 lr 0.000013 wd 0.0500 time 0.2429 (0.2448) data time 0.0010 (0.0016) model time 0.2419 (0.2427) loss 2.7701 (2.6810) grad_norm 5.9716 (6.3951) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][990/1251] eta 0:01:03 lr 0.000013 wd 0.0500 time 0.2465 (0.2448) data time 0.0007 (0.0016) model time 0.2458 (0.2427) loss 3.0263 (2.6797) grad_norm 7.4212 (6.3971) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1000/1251] eta 0:01:01 lr 0.000013 wd 0.0500 time 0.2464 (0.2447) data time 0.0009 (0.0016) model time 0.2455 (0.2426) loss 3.0196 (2.6795) grad_norm 6.2734 (6.3911) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1010/1251] eta 0:00:58 lr 0.000013 wd 0.0500 time 0.2366 (0.2447) data time 0.0010 (0.0016) model time 0.2356 (0.2426) loss 2.8499 (2.6799) grad_norm 6.0046 (6.4094) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1020/1251] eta 0:00:56 lr 0.000013 wd 0.0500 time 0.2474 (0.2447) data time 0.0010 (0.0016) model time 0.2464 (0.2426) loss 2.7151 (2.6798) grad_norm 6.5807 (6.4293) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1030/1251] eta 0:00:54 lr 0.000013 wd 0.0500 time 0.2436 (0.2446) data time 0.0009 (0.0016) model time 0.2426 (0.2426) loss 2.4376 (2.6795) grad_norm 6.9870 (6.4207) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:08:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [290/300][1040/1251] eta 0:00:51 lr 0.000013 wd 0.0500 time 0.2461 (0.2446) data time 0.0007 (0.0016) model time 0.2454 (0.2426) loss 2.9511 (2.6785) grad_norm 7.9349 (6.4100) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1050/1251] eta 0:00:49 lr 0.000013 wd 0.0500 time 0.2421 (0.2446) data time 0.0010 (0.0016) model time 0.2411 (0.2425) loss 3.1343 (2.6799) grad_norm 5.1394 (6.4111) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1060/1251] eta 0:00:46 lr 0.000013 wd 0.0500 time 0.2297 (0.2445) data time 0.0009 (0.0016) model time 0.2288 (0.2425) loss 2.8687 (2.6829) grad_norm 5.5972 (6.4176) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1070/1251] eta 0:00:44 lr 0.000013 wd 0.0500 time 0.2413 (0.2445) data time 0.0011 (0.0016) model time 0.2402 (0.2425) loss 2.5873 (2.6831) grad_norm 5.5491 (6.4118) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1080/1251] eta 0:00:41 lr 0.000013 wd 0.0500 time 0.2374 (0.2444) data time 0.0007 (0.0016) model time 0.2367 (0.2424) loss 2.7439 (2.6833) grad_norm 5.1138 (6.4093) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1090/1251] eta 0:00:39 lr 0.000013 wd 0.0500 time 0.2422 (0.2444) data time 0.0009 (0.0016) model time 0.2413 (0.2424) loss 2.8197 (2.6834) grad_norm 5.3447 (6.4104) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1100/1251] eta 0:00:36 lr 0.000013 wd 0.0500 time 0.2409 (0.2444) data time 0.0010 (0.0016) model time 0.2399 (0.2424) loss 3.0551 (2.6852) grad_norm 7.7560 (6.4043) loss_scale 
128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1110/1251] eta 0:00:34 lr 0.000013 wd 0.0500 time 0.2399 (0.2444) data time 0.0007 (0.0016) model time 0.2392 (0.2424) loss 1.7038 (2.6840) grad_norm 6.0866 (6.4085) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1120/1251] eta 0:00:32 lr 0.000013 wd 0.0500 time 0.2416 (0.2443) data time 0.0009 (0.0016) model time 0.2407 (0.2423) loss 2.8490 (2.6834) grad_norm 4.8704 (6.4005) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1130/1251] eta 0:00:29 lr 0.000013 wd 0.0500 time 0.2357 (0.2443) data time 0.0008 (0.0016) model time 0.2349 (0.2423) loss 2.4467 (2.6830) grad_norm 3.8638 (6.4869) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1140/1251] eta 0:00:27 lr 0.000013 wd 0.0500 time 0.2421 (0.2444) data time 0.0011 (0.0015) model time 0.2410 (0.2425) loss 2.6879 (2.6841) grad_norm 7.1475 (6.4907) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1150/1251] eta 0:00:24 lr 0.000013 wd 0.0500 time 0.2392 (0.2444) data time 0.0008 (0.0015) model time 0.2383 (0.2424) loss 2.1374 (2.6814) grad_norm 4.3661 (6.4795) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1160/1251] eta 0:00:22 lr 0.000013 wd 0.0500 time 0.2402 (0.2443) data time 0.0006 (0.0015) model time 0.2395 (0.2424) loss 2.0236 (2.6785) grad_norm 7.4335 (6.4791) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:09:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1170/1251] eta 0:00:19 lr 0.000013 wd 0.0500 time 0.2463 (0.2443) data time 0.0011 
(0.0015) model time 0.2452 (0.2424) loss 2.6793 (2.6758) grad_norm 6.2130 (6.4739) loss_scale 256.0000 (128.9838) mem 7381MB [2024-09-01 11:09:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1180/1251] eta 0:00:17 lr 0.000013 wd 0.0500 time 0.2521 (0.2443) data time 0.0009 (0.0015) model time 0.2512 (0.2424) loss 1.7572 (2.6765) grad_norm 13.6969 (6.4789) loss_scale 256.0000 (130.0593) mem 7381MB [2024-09-01 11:09:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1190/1251] eta 0:00:14 lr 0.000013 wd 0.0500 time 0.2463 (0.2443) data time 0.0010 (0.0015) model time 0.2453 (0.2424) loss 2.6124 (2.6765) grad_norm 7.9377 (6.4779) loss_scale 256.0000 (131.1167) mem 7381MB [2024-09-01 11:09:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1200/1251] eta 0:00:12 lr 0.000013 wd 0.0500 time 0.2407 (0.2442) data time 0.0011 (0.0015) model time 0.2396 (0.2423) loss 2.3453 (2.6752) grad_norm 4.1148 (6.4716) loss_scale 256.0000 (132.1565) mem 7381MB [2024-09-01 11:09:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1210/1251] eta 0:00:10 lr 0.000013 wd 0.0500 time 0.2427 (0.2442) data time 0.0009 (0.0015) model time 0.2418 (0.2423) loss 2.4845 (2.6772) grad_norm 5.7008 (6.4721) loss_scale 256.0000 (133.1792) mem 7381MB [2024-09-01 11:09:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1220/1251] eta 0:00:07 lr 0.000013 wd 0.0500 time 0.2449 (0.2442) data time 0.0009 (0.0015) model time 0.2439 (0.2423) loss 2.2433 (2.6769) grad_norm 4.2779 (6.4873) loss_scale 256.0000 (134.1851) mem 7381MB [2024-09-01 11:09:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1230/1251] eta 0:00:05 lr 0.000013 wd 0.0500 time 0.3902 (0.2444) data time 0.0010 (0.0015) model time 0.3892 (0.2425) loss 2.9953 (2.6782) grad_norm 5.1681 (6.4896) loss_scale 256.0000 (135.1747) mem 7381MB [2024-09-01 11:09:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [290/300][1240/1251] eta 0:00:02 lr 0.000013 wd 0.0500 time 0.2248 (0.2445) data time 0.0007 (0.0015) model time 0.2242 (0.2426) loss 1.9494 (2.6772) grad_norm 6.0592 (6.4855) loss_scale 256.0000 (136.1483) mem 7381MB [2024-09-01 11:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [290/300][1250/1251] eta 0:00:00 lr 0.000013 wd 0.0500 time 0.2258 (0.2443) data time 0.0007 (0.0015) model time 0.2251 (0.2425) loss 2.9441 (2.6787) grad_norm 10.4826 (6.4806) loss_scale 256.0000 (137.1063) mem 7381MB [2024-09-01 11:09:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 290 training takes 0:05:05 [2024-09-01 11:09:50 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:09:50 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:09:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.446 (0.446) Loss 0.4045 (0.4045) Acc@1 93.359 (93.359) Acc@5 98.535 (98.535) Mem 7381MB [2024-09-01 11:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.080 (0.110) Loss 0.5767 (0.6220) Acc@1 90.820 (87.784) Acc@5 97.852 (97.763) Mem 7381MB [2024-09-01 11:09:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.095) Loss 0.9268 (0.6519) Acc@1 77.344 (86.607) Acc@5 94.824 (97.670) Mem 7381MB [2024-09-01 11:09:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.090) Loss 1.1631 (0.7462) Acc@1 73.828 (84.407) Acc@5 92.773 (96.702) Mem 7381MB [2024-09-01 11:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0312 (0.7952) Acc@1 77.051 (83.256) Acc@5 93.945 (96.218) Mem 7381MB [2024-09-01 11:09:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.824 Acc@5 96.162 [2024-09-01 11:09:54 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:09:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.782 (0.782) Loss 0.3906 (0.3906) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 11:09:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.085 (0.148) Loss 0.5649 (0.6085) Acc@1 90.527 (87.900) Acc@5 97.949 (97.772) Mem 7381MB [2024-09-01 11:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.117) Loss 0.9126 (0.6394) Acc@1 77.734 (86.719) Acc@5 95.508 (97.717) Mem 7381MB [2024-09-01 11:09:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.104) Loss 1.1436 (0.7324) Acc@1 74.609 (84.536) Acc@5 93.066 (96.784) Mem 7381MB [2024-09-01 11:09:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.095) Loss 1.0156 (0.7811) Acc@1 76.758 (83.348) Acc@5 94.531 (96.277) Mem 7381MB [2024-09-01 11:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.906 Acc@5 96.226 [2024-09-01 11:09:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:10:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][0/1251] eta 0:23:18 lr 0.000013 wd 0.0500 time 1.1177 (1.1177) data time 0.5865 (0.5865) model time 0.0000 (0.0000) loss 2.7627 (2.7627) grad_norm 7.0288 (7.0288) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][10/1251] eta 0:06:38 lr 0.000013 wd 0.0500 time 0.2424 (0.3209) data time 0.0009 (0.0543) model time 0.0000 (0.0000) loss 2.6587 (2.7857) grad_norm 5.6980 (6.1779) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][20/1251] eta 0:05:47 lr 0.000013 wd 0.0500 time 0.2396 (0.2825) data time 0.0009 
(0.0289) model time 0.0000 (0.0000) loss 3.1084 (2.8501) grad_norm 6.6244 (6.4476) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][30/1251] eta 0:05:28 lr 0.000013 wd 0.0500 time 0.2488 (0.2693) data time 0.0009 (0.0199) model time 0.0000 (0.0000) loss 1.9907 (2.8069) grad_norm 4.6662 (6.6300) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][40/1251] eta 0:05:17 lr 0.000013 wd 0.0500 time 0.2416 (0.2623) data time 0.0009 (0.0153) model time 0.0000 (0.0000) loss 3.3258 (2.8391) grad_norm 5.2142 (6.6322) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][50/1251] eta 0:05:09 lr 0.000012 wd 0.0500 time 0.2402 (0.2581) data time 0.0009 (0.0125) model time 0.0000 (0.0000) loss 2.9236 (2.8126) grad_norm 7.2066 (6.7766) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][60/1251] eta 0:05:04 lr 0.000012 wd 0.0500 time 0.2428 (0.2553) data time 0.0008 (0.0106) model time 0.2420 (0.2402) loss 1.8629 (2.7877) grad_norm 5.7507 (6.6331) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][70/1251] eta 0:04:59 lr 0.000012 wd 0.0500 time 0.2453 (0.2534) data time 0.0011 (0.0092) model time 0.2442 (0.2405) loss 3.1528 (2.7945) grad_norm 4.5771 (8.7140) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][80/1251] eta 0:04:54 lr 0.000012 wd 0.0500 time 0.2491 (0.2517) data time 0.0010 (0.0082) model time 0.2481 (0.2399) loss 2.9994 (2.8003) grad_norm 4.6080 (9.6305) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[291/300][90/1251] eta 0:04:50 lr 0.000012 wd 0.0500 time 0.2440 (0.2505) data time 0.0009 (0.0074) model time 0.2430 (0.2398) loss 2.7109 (2.7652) grad_norm 6.4667 (9.3098) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][100/1251] eta 0:04:46 lr 0.000012 wd 0.0500 time 0.2447 (0.2492) data time 0.0007 (0.0068) model time 0.2440 (0.2393) loss 2.7243 (2.7471) grad_norm 9.6065 (9.2525) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][110/1251] eta 0:04:43 lr 0.000012 wd 0.0500 time 0.2395 (0.2485) data time 0.0008 (0.0063) model time 0.2386 (0.2393) loss 3.3173 (2.7384) grad_norm 9.5309 (8.9219) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][120/1251] eta 0:04:40 lr 0.000012 wd 0.0500 time 0.2442 (0.2480) data time 0.0011 (0.0058) model time 0.2431 (0.2397) loss 2.9641 (2.7275) grad_norm 5.1215 (8.7367) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][130/1251] eta 0:04:37 lr 0.000012 wd 0.0500 time 0.2360 (0.2475) data time 0.0010 (0.0055) model time 0.2350 (0.2398) loss 3.1440 (2.7131) grad_norm 4.9098 (8.5169) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][140/1251] eta 0:04:34 lr 0.000012 wd 0.0500 time 0.2409 (0.2470) data time 0.0011 (0.0051) model time 0.2398 (0.2397) loss 2.6007 (2.7015) grad_norm 6.0087 (8.3684) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][150/1251] eta 0:04:31 lr 0.000012 wd 0.0500 time 0.2509 (0.2466) data time 0.0010 (0.0049) model time 0.2499 (0.2397) loss 2.4835 (2.7025) grad_norm 6.4091 (8.3912) loss_scale 256.0000 (256.0000) mem 
7381MB [2024-09-01 11:10:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][160/1251] eta 0:04:28 lr 0.000012 wd 0.0500 time 0.2425 (0.2462) data time 0.0010 (0.0046) model time 0.2415 (0.2397) loss 1.9829 (2.7119) grad_norm 5.7837 (8.2003) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][170/1251] eta 0:04:25 lr 0.000012 wd 0.0500 time 0.2426 (0.2459) data time 0.0010 (0.0044) model time 0.2417 (0.2397) loss 2.0783 (2.7135) grad_norm 5.7435 (8.1002) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][180/1251] eta 0:04:23 lr 0.000012 wd 0.0500 time 0.2422 (0.2456) data time 0.0007 (0.0042) model time 0.2415 (0.2397) loss 2.8298 (2.7094) grad_norm 5.2610 (7.9124) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][190/1251] eta 0:04:20 lr 0.000012 wd 0.0500 time 0.2399 (0.2454) data time 0.0010 (0.0041) model time 0.2390 (0.2398) loss 3.0132 (2.7004) grad_norm 5.7617 (7.8080) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][200/1251] eta 0:04:17 lr 0.000012 wd 0.0500 time 0.2438 (0.2453) data time 0.0007 (0.0039) model time 0.2430 (0.2399) loss 2.5213 (2.6985) grad_norm 5.1730 (7.8643) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][210/1251] eta 0:04:14 lr 0.000012 wd 0.0500 time 0.2442 (0.2449) data time 0.0007 (0.0038) model time 0.2435 (0.2397) loss 2.8776 (2.6970) grad_norm 11.5545 (7.8231) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][220/1251] eta 0:04:12 lr 0.000012 wd 0.0500 time 0.2394 (0.2448) data time 0.0009 (0.0037) model time 0.2385 
(0.2397) loss 2.5692 (2.6949) grad_norm 5.9606 (7.7749) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][230/1251] eta 0:04:09 lr 0.000012 wd 0.0500 time 0.2427 (0.2446) data time 0.0010 (0.0035) model time 0.2416 (0.2397) loss 2.6713 (2.6859) grad_norm 3.9893 (7.6855) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:10:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][240/1251] eta 0:04:07 lr 0.000012 wd 0.0500 time 0.2493 (0.2445) data time 0.0009 (0.0034) model time 0.2484 (0.2398) loss 3.1795 (2.6806) grad_norm 5.3879 (7.5844) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][250/1251] eta 0:04:04 lr 0.000012 wd 0.0500 time 0.2394 (0.2444) data time 0.0015 (0.0033) model time 0.2380 (0.2399) loss 3.4020 (2.6867) grad_norm 4.9885 (7.5715) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][260/1251] eta 0:04:02 lr 0.000012 wd 0.0500 time 0.2451 (0.2443) data time 0.0008 (0.0032) model time 0.2442 (0.2399) loss 3.0160 (2.6953) grad_norm 7.6387 (7.5243) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][270/1251] eta 0:03:59 lr 0.000012 wd 0.0500 time 0.2412 (0.2442) data time 0.0007 (0.0032) model time 0.2405 (0.2400) loss 2.5895 (2.6909) grad_norm 4.6001 (7.4605) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][280/1251] eta 0:03:57 lr 0.000012 wd 0.0500 time 0.2388 (0.2442) data time 0.0012 (0.0031) model time 0.2376 (0.2401) loss 2.2502 (2.6874) grad_norm 4.7588 (7.3868) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][290/1251] eta 0:03:54 
lr 0.000012 wd 0.0500 time 0.2291 (0.2441) data time 0.0011 (0.0030) model time 0.2280 (0.2401) loss 2.8812 (2.6964) grad_norm 3.9762 (7.3365) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][300/1251] eta 0:03:52 lr 0.000012 wd 0.0500 time 0.2470 (0.2441) data time 0.0008 (0.0029) model time 0.2462 (0.2402) loss 2.0286 (2.6900) grad_norm 4.5536 (7.2759) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][310/1251] eta 0:03:49 lr 0.000012 wd 0.0500 time 0.2462 (0.2440) data time 0.0007 (0.0029) model time 0.2455 (0.2402) loss 2.5617 (2.6802) grad_norm 5.3552 (7.2256) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][320/1251] eta 0:03:47 lr 0.000012 wd 0.0500 time 0.2467 (0.2439) data time 0.0008 (0.0028) model time 0.2459 (0.2402) loss 3.0429 (2.6833) grad_norm 4.2691 (7.1945) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][330/1251] eta 0:03:44 lr 0.000012 wd 0.0500 time 0.2423 (0.2438) data time 0.0011 (0.0028) model time 0.2412 (0.2403) loss 2.6919 (2.6862) grad_norm 5.4060 (7.1678) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][340/1251] eta 0:03:42 lr 0.000012 wd 0.0500 time 0.2370 (0.2438) data time 0.0009 (0.0027) model time 0.2362 (0.2403) loss 3.0532 (2.6901) grad_norm 6.2364 (7.1283) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][350/1251] eta 0:03:39 lr 0.000012 wd 0.0500 time 0.2352 (0.2437) data time 0.0010 (0.0027) model time 0.2342 (0.2402) loss 2.5313 (2.6925) grad_norm 5.4945 (7.0989) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:27 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][360/1251] eta 0:03:37 lr 0.000012 wd 0.0500 time 0.2519 (0.2436) data time 0.0009 (0.0026) model time 0.2510 (0.2402) loss 2.8849 (2.6919) grad_norm 4.9204 (7.0581) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][370/1251] eta 0:03:34 lr 0.000012 wd 0.0500 time 0.2446 (0.2439) data time 0.0010 (0.0026) model time 0.2436 (0.2407) loss 3.0393 (2.6918) grad_norm 6.4357 (7.0381) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][380/1251] eta 0:03:32 lr 0.000012 wd 0.0500 time 0.2417 (0.2438) data time 0.0009 (0.0025) model time 0.2408 (0.2406) loss 2.9641 (2.6958) grad_norm 7.9005 (7.0396) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][390/1251] eta 0:03:29 lr 0.000012 wd 0.0500 time 0.2364 (0.2438) data time 0.0009 (0.0025) model time 0.2355 (0.2406) loss 2.0551 (2.6901) grad_norm 4.8518 (6.9925) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][400/1251] eta 0:03:27 lr 0.000012 wd 0.0500 time 0.2539 (0.2437) data time 0.0011 (0.0025) model time 0.2527 (0.2406) loss 1.9543 (2.6880) grad_norm 6.7912 (6.9715) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][410/1251] eta 0:03:24 lr 0.000012 wd 0.0500 time 0.2370 (0.2436) data time 0.0007 (0.0024) model time 0.2363 (0.2406) loss 3.1272 (2.6915) grad_norm 6.5586 (6.9696) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][420/1251] eta 0:03:22 lr 0.000012 wd 0.0500 time 0.2304 (0.2435) data time 0.0008 (0.0024) model time 0.2295 (0.2405) loss 2.4808 (2.6959) 
grad_norm 5.5147 (6.9581) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][430/1251] eta 0:03:19 lr 0.000012 wd 0.0500 time 0.2434 (0.2435) data time 0.0007 (0.0023) model time 0.2428 (0.2406) loss 1.7151 (2.6999) grad_norm 4.0065 (6.9398) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][440/1251] eta 0:03:17 lr 0.000012 wd 0.0500 time 0.2347 (0.2435) data time 0.0011 (0.0023) model time 0.2337 (0.2406) loss 2.8235 (2.7003) grad_norm 4.7953 (6.9142) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][450/1251] eta 0:03:15 lr 0.000012 wd 0.0500 time 0.2438 (0.2439) data time 0.0011 (0.0023) model time 0.2427 (0.2411) loss 1.5190 (2.7004) grad_norm 4.3353 (6.8797) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][460/1251] eta 0:03:13 lr 0.000012 wd 0.0500 time 0.2412 (0.2443) data time 0.0009 (0.0023) model time 0.2402 (0.2416) loss 1.8576 (2.6962) grad_norm 13.8897 (6.8706) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][470/1251] eta 0:03:10 lr 0.000012 wd 0.0500 time 0.2411 (0.2443) data time 0.0007 (0.0022) model time 0.2404 (0.2416) loss 2.9979 (2.6975) grad_norm 5.1720 (6.8674) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][480/1251] eta 0:03:08 lr 0.000012 wd 0.0500 time 0.2410 (0.2443) data time 0.0009 (0.0022) model time 0.2401 (0.2416) loss 2.9420 (2.6975) grad_norm 6.0073 (6.8461) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:11:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][490/1251] eta 0:03:05 lr 0.000012 wd 0.0500 time 
0.2342 (0.2442) data time 0.0012 (0.0022) model time 0.2330 (0.2416) loss 2.6658 (2.7000) grad_norm 5.9077 (6.8323) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][500/1251] eta 0:03:03 lr 0.000012 wd 0.0500 time 0.2435 (0.2442) data time 0.0009 (0.0022) model time 0.2426 (0.2416) loss 2.7600 (2.7022) grad_norm 9.8428 (6.8271) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][510/1251] eta 0:03:00 lr 0.000012 wd 0.0500 time 0.2423 (0.2441) data time 0.0008 (0.0021) model time 0.2415 (0.2416) loss 2.9315 (2.6985) grad_norm 6.7791 (6.8168) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][520/1251] eta 0:02:58 lr 0.000012 wd 0.0500 time 0.2392 (0.2441) data time 0.0010 (0.0021) model time 0.2382 (0.2415) loss 3.0822 (2.6988) grad_norm 10.2335 (6.7958) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][530/1251] eta 0:02:55 lr 0.000012 wd 0.0500 time 0.2484 (0.2440) data time 0.0009 (0.0021) model time 0.2475 (0.2415) loss 2.9579 (2.6990) grad_norm 6.1980 (6.7981) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][540/1251] eta 0:02:53 lr 0.000012 wd 0.0500 time 0.2420 (0.2440) data time 0.0008 (0.0021) model time 0.2413 (0.2415) loss 3.2849 (2.6984) grad_norm 6.6649 (6.7658) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][550/1251] eta 0:02:50 lr 0.000012 wd 0.0500 time 0.2435 (0.2439) data time 0.0007 (0.0021) model time 0.2427 (0.2414) loss 2.5193 (2.6961) grad_norm 5.9373 (6.7547) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:15 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [291/300][560/1251] eta 0:02:48 lr 0.000012 wd 0.0500 time 0.2381 (0.2438) data time 0.0010 (0.0020) model time 0.2371 (0.2414) loss 2.3005 (2.6937) grad_norm 3.8603 (6.7406) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][570/1251] eta 0:02:46 lr 0.000012 wd 0.0500 time 0.2507 (0.2438) data time 0.0010 (0.0020) model time 0.2497 (0.2414) loss 2.5745 (2.6934) grad_norm 4.3372 (6.7424) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][580/1251] eta 0:02:43 lr 0.000012 wd 0.0500 time 0.2432 (0.2438) data time 0.0005 (0.0020) model time 0.2427 (0.2414) loss 2.9087 (2.6984) grad_norm 5.0073 (6.7532) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][590/1251] eta 0:02:41 lr 0.000012 wd 0.0500 time 0.2401 (0.2437) data time 0.0007 (0.0020) model time 0.2394 (0.2414) loss 2.6382 (2.6968) grad_norm 6.4033 (6.7616) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][600/1251] eta 0:02:38 lr 0.000012 wd 0.0500 time 0.2436 (0.2437) data time 0.0008 (0.0020) model time 0.2428 (0.2414) loss 2.7336 (2.6976) grad_norm 7.0704 (6.7484) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][610/1251] eta 0:02:36 lr 0.000012 wd 0.0500 time 0.2429 (0.2437) data time 0.0011 (0.0019) model time 0.2418 (0.2414) loss 3.0317 (2.6973) grad_norm 4.5669 (6.7316) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][620/1251] eta 0:02:33 lr 0.000012 wd 0.0500 time 0.2434 (0.2436) data time 0.0008 (0.0019) model time 0.2426 (0.2413) loss 2.6567 (2.6970) grad_norm 7.2534 
(6.7191) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][630/1251] eta 0:02:31 lr 0.000012 wd 0.0500 time 0.2413 (0.2436) data time 0.0007 (0.0019) model time 0.2405 (0.2414) loss 1.5798 (2.6972) grad_norm 6.1185 (6.7097) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][640/1251] eta 0:02:28 lr 0.000012 wd 0.0500 time 0.2368 (0.2436) data time 0.0011 (0.0019) model time 0.2357 (0.2414) loss 2.9764 (2.6960) grad_norm 7.0229 (6.7174) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][650/1251] eta 0:02:26 lr 0.000012 wd 0.0500 time 0.2481 (0.2436) data time 0.0008 (0.0019) model time 0.2474 (0.2413) loss 3.5797 (2.6954) grad_norm 7.0913 (6.7183) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][660/1251] eta 0:02:23 lr 0.000012 wd 0.0500 time 0.2454 (0.2436) data time 0.0009 (0.0019) model time 0.2445 (0.2414) loss 2.4427 (2.6936) grad_norm 6.1106 (6.7253) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][670/1251] eta 0:02:21 lr 0.000012 wd 0.0500 time 0.2395 (0.2435) data time 0.0008 (0.0019) model time 0.2387 (0.2414) loss 2.7130 (2.6939) grad_norm 5.7978 (6.7176) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][680/1251] eta 0:02:19 lr 0.000012 wd 0.0500 time 0.2460 (0.2435) data time 0.0011 (0.0018) model time 0.2449 (0.2414) loss 2.9512 (2.6944) grad_norm 12.4349 (6.7121) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][690/1251] eta 0:02:16 lr 0.000012 wd 0.0500 time 0.2340 (0.2435) 
data time 0.0009 (0.0018) model time 0.2331 (0.2414) loss 2.9300 (2.6930) grad_norm 5.3737 (6.7110) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][700/1251] eta 0:02:14 lr 0.000012 wd 0.0500 time 0.2461 (0.2435) data time 0.0011 (0.0018) model time 0.2450 (0.2414) loss 2.7729 (2.6903) grad_norm 6.4838 (6.7089) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][710/1251] eta 0:02:11 lr 0.000012 wd 0.0500 time 0.2409 (0.2435) data time 0.0008 (0.0018) model time 0.2401 (0.2414) loss 2.7591 (2.6915) grad_norm 8.0409 (6.7010) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][720/1251] eta 0:02:09 lr 0.000012 wd 0.0500 time 0.2428 (0.2434) data time 0.0011 (0.0018) model time 0.2417 (0.2413) loss 2.8981 (2.6933) grad_norm 7.5326 (6.6860) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][730/1251] eta 0:02:06 lr 0.000012 wd 0.0500 time 0.2442 (0.2434) data time 0.0007 (0.0018) model time 0.2435 (0.2413) loss 3.4213 (2.6935) grad_norm 5.4683 (6.6909) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:12:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][740/1251] eta 0:02:04 lr 0.000012 wd 0.0500 time 0.2450 (0.2434) data time 0.0010 (0.0018) model time 0.2440 (0.2413) loss 3.1299 (2.6919) grad_norm 4.7022 (6.6722) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:13:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][750/1251] eta 0:02:01 lr 0.000012 wd 0.0500 time 0.2440 (0.2434) data time 0.0010 (0.0018) model time 0.2430 (0.2413) loss 2.4112 (2.6928) grad_norm 4.0890 (6.6594) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:13:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [291/300][760/1251] eta 0:01:59 lr 0.000012 wd 0.0500 time 0.2359 (0.2433) data time 0.0010 (0.0018) model time 0.2349 (0.2413) loss 2.0605 (2.6910) grad_norm 4.2453 (inf) loss_scale 128.0000 (255.4954) mem 7381MB [2024-09-01 11:13:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][770/1251] eta 0:01:57 lr 0.000012 wd 0.0500 time 0.2547 (0.2433) data time 0.0008 (0.0017) model time 0.2540 (0.2413) loss 3.2132 (2.6900) grad_norm 7.0763 (inf) loss_scale 128.0000 (253.8418) mem 7381MB [2024-09-01 11:13:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][780/1251] eta 0:01:54 lr 0.000012 wd 0.0500 time 0.2436 (0.2433) data time 0.0008 (0.0017) model time 0.2428 (0.2413) loss 2.1970 (2.6905) grad_norm 10.7221 (inf) loss_scale 128.0000 (252.2305) mem 7381MB [2024-09-01 11:13:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][790/1251] eta 0:01:52 lr 0.000012 wd 0.0500 time 0.2389 (0.2433) data time 0.0011 (0.0017) model time 0.2378 (0.2413) loss 2.8124 (2.6909) grad_norm 5.3700 (inf) loss_scale 128.0000 (250.6599) mem 7381MB [2024-09-01 11:13:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][800/1251] eta 0:01:49 lr 0.000012 wd 0.0500 time 0.2399 (0.2433) data time 0.0010 (0.0017) model time 0.2388 (0.2413) loss 2.7933 (2.6939) grad_norm 7.5807 (inf) loss_scale 128.0000 (249.1286) mem 7381MB [2024-09-01 11:13:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][810/1251] eta 0:01:47 lr 0.000012 wd 0.0500 time 0.2438 (0.2433) data time 0.0010 (0.0017) model time 0.2428 (0.2413) loss 2.1117 (2.6927) grad_norm 8.3559 (inf) loss_scale 128.0000 (247.6350) mem 7381MB [2024-09-01 11:13:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][820/1251] eta 0:01:44 lr 0.000012 wd 0.0500 time 0.2465 (0.2433) data time 0.0009 (0.0017) model time 0.2456 (0.2413) loss 2.7020 (2.6921) grad_norm 5.8366 (inf) loss_scale 128.0000 (246.1778) mem 7381MB 
[2024-09-01 11:13:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][830/1251] eta 0:01:42 lr 0.000012 wd 0.0500 time 0.2366 (0.2432) data time 0.0009 (0.0017) model time 0.2358 (0.2413) loss 1.7136 (2.6885) grad_norm 4.9716 (inf) loss_scale 128.0000 (244.7557) mem 7381MB [2024-09-01 11:13:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][840/1251] eta 0:01:39 lr 0.000012 wd 0.0500 time 0.2389 (0.2433) data time 0.0007 (0.0017) model time 0.2382 (0.2413) loss 2.5511 (2.6883) grad_norm 3.6437 (inf) loss_scale 128.0000 (243.3674) mem 7381MB [2024-09-01 11:13:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][850/1251] eta 0:01:37 lr 0.000012 wd 0.0500 time 0.2384 (0.2432) data time 0.0007 (0.0017) model time 0.2377 (0.2413) loss 2.8163 (2.6868) grad_norm 8.6424 (inf) loss_scale 128.0000 (242.0118) mem 7381MB [2024-09-01 11:13:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][860/1251] eta 0:01:35 lr 0.000012 wd 0.0500 time 0.2425 (0.2432) data time 0.0009 (0.0017) model time 0.2415 (0.2413) loss 2.7876 (2.6875) grad_norm 5.1845 (inf) loss_scale 128.0000 (240.6876) mem 7381MB [2024-09-01 11:13:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][870/1251] eta 0:01:32 lr 0.000012 wd 0.0500 time 0.2587 (0.2432) data time 0.0007 (0.0017) model time 0.2580 (0.2413) loss 3.1978 (2.6886) grad_norm 5.1421 (inf) loss_scale 128.0000 (239.3938) mem 7381MB [2024-09-01 11:13:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][880/1251] eta 0:01:30 lr 0.000012 wd 0.0500 time 0.2392 (0.2432) data time 0.0010 (0.0017) model time 0.2381 (0.2413) loss 2.6808 (2.6889) grad_norm 5.8295 (inf) loss_scale 128.0000 (238.1294) mem 7381MB [2024-09-01 11:13:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][890/1251] eta 0:01:27 lr 0.000012 wd 0.0500 time 0.2410 (0.2436) data time 0.0009 (0.0016) model time 0.2401 (0.2418) loss 2.6606 (2.6912) 
grad_norm 3.4189 (inf) loss_scale 128.0000 (236.8934) mem 7381MB [2024-09-01 11:13:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][900/1251] eta 0:01:25 lr 0.000012 wd 0.0500 time 0.2435 (0.2441) data time 0.0007 (0.0016) model time 0.2428 (0.2423) loss 2.9990 (2.6883) grad_norm 7.1919 (inf) loss_scale 128.0000 (235.6848) mem 7381MB [2024-09-01 11:13:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][910/1251] eta 0:01:23 lr 0.000012 wd 0.0500 time 0.2510 (0.2441) data time 0.0007 (0.0016) model time 0.2503 (0.2423) loss 2.2933 (2.6890) grad_norm 5.7706 (inf) loss_scale 128.0000 (234.5027) mem 7381MB [2024-09-01 11:13:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][920/1251] eta 0:01:20 lr 0.000012 wd 0.0500 time 0.2379 (0.2440) data time 0.0007 (0.0016) model time 0.2372 (0.2422) loss 2.4782 (2.6902) grad_norm 3.6011 (inf) loss_scale 128.0000 (233.3464) mem 7381MB [2024-09-01 11:13:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][930/1251] eta 0:01:18 lr 0.000012 wd 0.0500 time 0.2435 (0.2440) data time 0.0011 (0.0016) model time 0.2423 (0.2422) loss 2.7521 (2.6884) grad_norm 5.4634 (inf) loss_scale 128.0000 (232.2148) mem 7381MB [2024-09-01 11:13:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][940/1251] eta 0:01:15 lr 0.000012 wd 0.0500 time 0.2423 (0.2440) data time 0.0010 (0.0016) model time 0.2413 (0.2422) loss 2.3047 (2.6877) grad_norm 4.8849 (inf) loss_scale 128.0000 (231.1073) mem 7381MB [2024-09-01 11:13:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][950/1251] eta 0:01:13 lr 0.000012 wd 0.0500 time 0.2454 (0.2440) data time 0.0007 (0.0016) model time 0.2446 (0.2422) loss 2.7761 (2.6872) grad_norm 3.8853 (inf) loss_scale 128.0000 (230.0231) mem 7381MB [2024-09-01 11:13:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][960/1251] eta 0:01:11 lr 0.000012 wd 0.0500 time 0.2493 (0.2440) data 
time 0.0009 (0.0016) model time 0.2484 (0.2422) loss 2.5008 (2.6871) grad_norm 7.5561 (inf) loss_scale 128.0000 (228.9615) mem 7381MB [2024-09-01 11:13:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][970/1251] eta 0:01:08 lr 0.000012 wd 0.0500 time 0.2434 (0.2440) data time 0.0007 (0.0016) model time 0.2427 (0.2422) loss 3.4245 (2.6886) grad_norm 6.6448 (inf) loss_scale 128.0000 (227.9217) mem 7381MB [2024-09-01 11:13:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][980/1251] eta 0:01:06 lr 0.000012 wd 0.0500 time 0.2473 (0.2439) data time 0.0009 (0.0016) model time 0.2463 (0.2422) loss 2.4097 (2.6884) grad_norm 5.9394 (inf) loss_scale 128.0000 (226.9032) mem 7381MB [2024-09-01 11:14:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][990/1251] eta 0:01:03 lr 0.000012 wd 0.0500 time 0.2515 (0.2441) data time 0.0011 (0.0016) model time 0.2504 (0.2424) loss 2.9859 (2.6898) grad_norm 8.4668 (inf) loss_scale 128.0000 (225.9051) mem 7381MB [2024-09-01 11:14:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1000/1251] eta 0:01:01 lr 0.000012 wd 0.0500 time 0.2440 (0.2443) data time 0.0007 (0.0016) model time 0.2432 (0.2425) loss 1.8049 (2.6896) grad_norm 4.9299 (inf) loss_scale 128.0000 (224.9271) mem 7381MB [2024-09-01 11:14:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1010/1251] eta 0:00:58 lr 0.000012 wd 0.0500 time 0.2319 (0.2442) data time 0.0009 (0.0016) model time 0.2310 (0.2425) loss 3.0633 (2.6914) grad_norm 7.8699 (inf) loss_scale 128.0000 (223.9683) mem 7381MB [2024-09-01 11:14:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1020/1251] eta 0:00:56 lr 0.000012 wd 0.0500 time 0.2442 (0.2442) data time 0.0009 (0.0016) model time 0.2433 (0.2425) loss 3.0103 (2.6919) grad_norm 6.6117 (inf) loss_scale 128.0000 (223.0284) mem 7381MB [2024-09-01 11:14:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[291/300][1030/1251] eta 0:00:53 lr 0.000012 wd 0.0500 time 0.2387 (0.2442) data time 0.0008 (0.0016) model time 0.2379 (0.2425) loss 1.7181 (2.6909) grad_norm 5.9760 (inf) loss_scale 128.0000 (222.1067) mem 7381MB [2024-09-01 11:14:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1040/1251] eta 0:00:51 lr 0.000012 wd 0.0500 time 0.2427 (0.2441) data time 0.0009 (0.0015) model time 0.2418 (0.2424) loss 2.5114 (2.6907) grad_norm 7.5888 (inf) loss_scale 128.0000 (221.2027) mem 7381MB [2024-09-01 11:14:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1050/1251] eta 0:00:49 lr 0.000012 wd 0.0500 time 0.2355 (0.2441) data time 0.0007 (0.0015) model time 0.2348 (0.2424) loss 2.6270 (2.6892) grad_norm 7.7531 (inf) loss_scale 128.0000 (220.3159) mem 7381MB [2024-09-01 11:14:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1060/1251] eta 0:00:46 lr 0.000012 wd 0.0500 time 0.2377 (0.2441) data time 0.0007 (0.0015) model time 0.2370 (0.2424) loss 1.6559 (2.6885) grad_norm 6.0917 (inf) loss_scale 128.0000 (219.4458) mem 7381MB [2024-09-01 11:14:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1070/1251] eta 0:00:44 lr 0.000012 wd 0.0500 time 0.2310 (0.2440) data time 0.0009 (0.0015) model time 0.2301 (0.2424) loss 2.6606 (2.6874) grad_norm 6.2652 (inf) loss_scale 128.0000 (218.5920) mem 7381MB [2024-09-01 11:14:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1080/1251] eta 0:00:41 lr 0.000012 wd 0.0500 time 0.2402 (0.2440) data time 0.0008 (0.0015) model time 0.2394 (0.2423) loss 2.8368 (2.6869) grad_norm 4.4922 (inf) loss_scale 128.0000 (217.7539) mem 7381MB [2024-09-01 11:14:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1090/1251] eta 0:00:39 lr 0.000012 wd 0.0500 time 0.2355 (0.2440) data time 0.0008 (0.0015) model time 0.2346 (0.2423) loss 3.0263 (2.6874) grad_norm 4.5683 (inf) loss_scale 128.0000 (216.9313) mem 7381MB 
[2024-09-01 11:14:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1100/1251] eta 0:00:36 lr 0.000012 wd 0.0500 time 0.2387 (0.2440) data time 0.0010 (0.0015) model time 0.2377 (0.2423) loss 2.8982 (2.6875) grad_norm 8.6704 (inf) loss_scale 128.0000 (216.1235) mem 7381MB [2024-09-01 11:14:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1110/1251] eta 0:00:34 lr 0.000012 wd 0.0500 time 0.2409 (0.2440) data time 0.0009 (0.0015) model time 0.2401 (0.2423) loss 2.6510 (2.6890) grad_norm 5.0522 (inf) loss_scale 128.0000 (215.3303) mem 7381MB [2024-09-01 11:14:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1120/1251] eta 0:00:31 lr 0.000012 wd 0.0500 time 0.2418 (0.2440) data time 0.0009 (0.0015) model time 0.2409 (0.2423) loss 3.1379 (2.6897) grad_norm 4.9227 (inf) loss_scale 128.0000 (214.5513) mem 7381MB [2024-09-01 11:14:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1130/1251] eta 0:00:29 lr 0.000012 wd 0.0500 time 0.2432 (0.2439) data time 0.0007 (0.0015) model time 0.2425 (0.2423) loss 1.7632 (2.6895) grad_norm 6.7639 (inf) loss_scale 128.0000 (213.7860) mem 7381MB [2024-09-01 11:14:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1140/1251] eta 0:00:27 lr 0.000012 wd 0.0500 time 0.2405 (0.2439) data time 0.0009 (0.0015) model time 0.2396 (0.2422) loss 2.9212 (2.6910) grad_norm 4.7919 (inf) loss_scale 128.0000 (213.0342) mem 7381MB [2024-09-01 11:14:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1150/1251] eta 0:00:24 lr 0.000012 wd 0.0500 time 0.2448 (0.2439) data time 0.0007 (0.0015) model time 0.2441 (0.2422) loss 3.1460 (2.6920) grad_norm 5.4812 (inf) loss_scale 128.0000 (212.2954) mem 7381MB [2024-09-01 11:14:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1160/1251] eta 0:00:22 lr 0.000012 wd 0.0500 time 0.2466 (0.2438) data time 0.0009 (0.0015) model time 0.2456 (0.2422) loss 2.5206 
(2.6922) grad_norm 8.9403 (inf) loss_scale 128.0000 (211.5693) mem 7381MB [2024-09-01 11:14:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1170/1251] eta 0:00:19 lr 0.000012 wd 0.0500 time 0.2428 (0.2438) data time 0.0009 (0.0015) model time 0.2418 (0.2422) loss 2.0505 (2.6907) grad_norm 5.8547 (inf) loss_scale 128.0000 (210.8557) mem 7381MB [2024-09-01 11:14:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1180/1251] eta 0:00:17 lr 0.000012 wd 0.0500 time 0.2516 (0.2438) data time 0.0009 (0.0015) model time 0.2507 (0.2422) loss 2.1116 (2.6910) grad_norm 5.2796 (inf) loss_scale 128.0000 (210.1541) mem 7381MB [2024-09-01 11:14:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1190/1251] eta 0:00:14 lr 0.000012 wd 0.0500 time 0.2446 (0.2438) data time 0.0009 (0.0015) model time 0.2436 (0.2421) loss 2.4587 (2.6916) grad_norm 7.1131 (inf) loss_scale 128.0000 (209.4643) mem 7381MB [2024-09-01 11:14:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1200/1251] eta 0:00:12 lr 0.000012 wd 0.0500 time 0.2397 (0.2437) data time 0.0010 (0.0015) model time 0.2387 (0.2421) loss 3.2203 (2.6920) grad_norm 4.0801 (inf) loss_scale 128.0000 (208.7860) mem 7381MB [2024-09-01 11:14:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1210/1251] eta 0:00:09 lr 0.000012 wd 0.0500 time 0.2388 (0.2437) data time 0.0009 (0.0015) model time 0.2378 (0.2421) loss 3.1003 (2.6927) grad_norm 6.2371 (inf) loss_scale 128.0000 (208.1189) mem 7381MB [2024-09-01 11:14:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1220/1251] eta 0:00:07 lr 0.000012 wd 0.0500 time 0.2503 (0.2437) data time 0.0009 (0.0015) model time 0.2494 (0.2421) loss 1.6069 (2.6911) grad_norm 5.0992 (inf) loss_scale 128.0000 (207.4627) mem 7381MB [2024-09-01 11:14:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1230/1251] eta 0:00:05 lr 0.000012 wd 0.0500 time 0.2401 
(0.2437) data time 0.0007 (0.0015) model time 0.2393 (0.2421) loss 2.7910 (2.6913) grad_norm 5.0732 (inf) loss_scale 128.0000 (206.8172) mem 7381MB [2024-09-01 11:15:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1240/1251] eta 0:00:02 lr 0.000012 wd 0.0500 time 0.2290 (0.2436) data time 0.0007 (0.0015) model time 0.2283 (0.2420) loss 2.3679 (2.6897) grad_norm 5.7164 (inf) loss_scale 128.0000 (206.1821) mem 7381MB [2024-09-01 11:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [291/300][1250/1251] eta 0:00:00 lr 0.000012 wd 0.0500 time 0.2225 (0.2434) data time 0.0004 (0.0015) model time 0.2220 (0.2418) loss 3.3454 (2.6899) grad_norm 5.4455 (inf) loss_scale 128.0000 (205.5572) mem 7381MB [2024-09-01 11:15:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 291 training takes 0:05:04 [2024-09-01 11:15:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:15:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 11:15:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.457 (0.457) Loss 0.3943 (0.3943) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 11:15:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.111) Loss 0.5698 (0.6145) Acc@1 90.430 (87.678) Acc@5 97.949 (97.789) Mem 7381MB [2024-09-01 11:15:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.080 (0.096) Loss 0.9204 (0.6456) Acc@1 78.223 (86.616) Acc@5 95.801 (97.749) Mem 7381MB [2024-09-01 11:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.077 (0.090) Loss 1.1543 (0.7406) Acc@1 74.414 (84.372) Acc@5 92.871 (96.777) Mem 7381MB [2024-09-01 11:15:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0215 (0.7899) Acc@1 77.637 (83.256) Acc@5 94.141 (96.275) Mem 7381MB [2024-09-01 11:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.808 Acc@5 96.214 [2024-09-01 11:15:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:15:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.804 (0.804) Loss 0.3906 (0.3906) Acc@1 93.262 (93.262) Acc@5 98.730 (98.730) Mem 7381MB [2024-09-01 11:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.146) Loss 0.5649 (0.6084) Acc@1 90.527 (87.891) Acc@5 97.949 (97.754) Mem 7381MB [2024-09-01 11:15:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.114) Loss 0.9126 (0.6393) Acc@1 77.832 (86.737) Acc@5 95.410 (97.712) Mem 7381MB [2024-09-01 11:15:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.068 (0.102) Loss 1.1445 (0.7325) Acc@1 74.707 (84.545) Acc@5 92.969 (96.777) Mem 7381MB [2024-09-01 11:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.093) Loss 1.0166 
(0.7813) Acc@1 76.855 (83.370) Acc@5 94.434 (96.272) Mem 7381MB [2024-09-01 11:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.918 Acc@5 96.228 [2024-09-01 11:15:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:15:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][0/1251] eta 0:21:38 lr 0.000012 wd 0.0500 time 1.0382 (1.0382) data time 0.6866 (0.6866) model time 0.0000 (0.0000) loss 2.9210 (2.9210) grad_norm 5.9983 (5.9983) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][10/1251] eta 0:06:32 lr 0.000012 wd 0.0500 time 0.2471 (0.3164) data time 0.0007 (0.0633) model time 0.0000 (0.0000) loss 2.4108 (2.6955) grad_norm 5.8061 (6.5464) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][20/1251] eta 0:05:45 lr 0.000012 wd 0.0500 time 0.2428 (0.2805) data time 0.0007 (0.0336) model time 0.0000 (0.0000) loss 3.0223 (2.7205) grad_norm 9.5125 (6.5352) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][30/1251] eta 0:05:27 lr 0.000012 wd 0.0500 time 0.2368 (0.2682) data time 0.0009 (0.0231) model time 0.0000 (0.0000) loss 2.8847 (2.7546) grad_norm 8.3523 (6.5283) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][40/1251] eta 0:05:16 lr 0.000012 wd 0.0500 time 0.2345 (0.2611) data time 0.0009 (0.0177) model time 0.0000 (0.0000) loss 2.9584 (2.7547) grad_norm 6.0007 (6.4697) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][50/1251] eta 0:05:09 lr 0.000012 wd 0.0500 time 0.2351 (0.2574) data time 0.0009 (0.0145) model time 0.0000 (0.0000) loss 3.0502 
(2.7641) grad_norm 6.3052 (6.5442) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][60/1251] eta 0:05:03 lr 0.000012 wd 0.0500 time 0.2408 (0.2547) data time 0.0010 (0.0122) model time 0.2398 (0.2397) loss 3.1655 (2.7588) grad_norm 4.5242 (6.4453) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][70/1251] eta 0:04:58 lr 0.000012 wd 0.0500 time 0.2456 (0.2530) data time 0.0007 (0.0106) model time 0.2449 (0.2408) loss 3.3272 (2.7780) grad_norm 7.4143 (6.4056) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][80/1251] eta 0:04:54 lr 0.000012 wd 0.0500 time 0.2451 (0.2517) data time 0.0011 (0.0095) model time 0.2440 (0.2411) loss 2.8831 (2.7755) grad_norm 7.2465 (6.3221) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][90/1251] eta 0:04:51 lr 0.000012 wd 0.0500 time 0.2369 (0.2507) data time 0.0010 (0.0085) model time 0.2359 (0.2412) loss 2.9519 (2.7629) grad_norm 3.9105 (6.2575) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][100/1251] eta 0:04:47 lr 0.000012 wd 0.0500 time 0.2439 (0.2499) data time 0.0008 (0.0078) model time 0.2431 (0.2413) loss 3.4348 (2.7699) grad_norm 5.7780 (6.3219) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][110/1251] eta 0:04:44 lr 0.000012 wd 0.0500 time 0.2454 (0.2494) data time 0.0009 (0.0072) model time 0.2445 (0.2417) loss 2.5882 (2.7733) grad_norm 6.4405 (6.2867) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][120/1251] eta 0:04:41 lr 0.000012 wd 0.0500 
time 0.2468 (0.2489) data time 0.0011 (0.0066) model time 0.2457 (0.2417) loss 3.1508 (2.7623) grad_norm 5.1129 (6.2674) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][130/1251] eta 0:04:38 lr 0.000012 wd 0.0500 time 0.2438 (0.2484) data time 0.0007 (0.0062) model time 0.2431 (0.2417) loss 3.4245 (2.7519) grad_norm 8.2893 (6.1807) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][140/1251] eta 0:04:35 lr 0.000012 wd 0.0500 time 0.2418 (0.2479) data time 0.0009 (0.0058) model time 0.2409 (0.2415) loss 3.2626 (2.7541) grad_norm 6.0946 (6.1608) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][150/1251] eta 0:04:32 lr 0.000012 wd 0.0500 time 0.2445 (0.2477) data time 0.0007 (0.0055) model time 0.2438 (0.2417) loss 2.5437 (2.7432) grad_norm 4.6564 (6.1218) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][160/1251] eta 0:04:29 lr 0.000012 wd 0.0500 time 0.2399 (0.2473) data time 0.0011 (0.0052) model time 0.2388 (0.2416) loss 2.4104 (2.7395) grad_norm 5.6086 (6.1812) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][170/1251] eta 0:04:27 lr 0.000012 wd 0.0500 time 0.2449 (0.2470) data time 0.0009 (0.0050) model time 0.2440 (0.2416) loss 2.9131 (2.7349) grad_norm 5.3534 (6.4162) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][180/1251] eta 0:04:24 lr 0.000012 wd 0.0500 time 0.2434 (0.2467) data time 0.0011 (0.0048) model time 0.2424 (0.2415) loss 2.3006 (2.7378) grad_norm 7.3807 (6.4317) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:15:59 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [292/300][190/1251] eta 0:04:21 lr 0.000012 wd 0.0500 time 0.2488 (0.2465) data time 0.0009 (0.0046) model time 0.2479 (0.2415) loss 2.2218 (2.7328) grad_norm 5.7545 (6.4672) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][200/1251] eta 0:04:18 lr 0.000012 wd 0.0500 time 0.2397 (0.2464) data time 0.0010 (0.0044) model time 0.2387 (0.2417) loss 2.9068 (2.7422) grad_norm 6.5781 (6.4723) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][210/1251] eta 0:04:16 lr 0.000012 wd 0.0500 time 0.2422 (0.2461) data time 0.0006 (0.0042) model time 0.2416 (0.2415) loss 2.9855 (2.7344) grad_norm 8.4476 (6.4591) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][220/1251] eta 0:04:13 lr 0.000012 wd 0.0500 time 0.2467 (0.2459) data time 0.0008 (0.0041) model time 0.2459 (0.2415) loss 2.9027 (2.7327) grad_norm 6.1044 (6.4748) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][230/1251] eta 0:04:10 lr 0.000012 wd 0.0500 time 0.2433 (0.2457) data time 0.0012 (0.0040) model time 0.2421 (0.2414) loss 2.9739 (2.7342) grad_norm 4.8004 (6.4649) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][240/1251] eta 0:04:08 lr 0.000012 wd 0.0500 time 0.2510 (0.2456) data time 0.0008 (0.0038) model time 0.2502 (0.2414) loss 1.5777 (2.7350) grad_norm 5.0034 (6.4486) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][250/1251] eta 0:04:05 lr 0.000012 wd 0.0500 time 0.2376 (0.2454) data time 0.0010 (0.0037) model time 0.2366 (0.2413) loss 2.9223 (2.7221) grad_norm 6.8340 
(6.4366) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][260/1251] eta 0:04:03 lr 0.000012 wd 0.0500 time 0.4503 (0.2460) data time 0.0010 (0.0036) model time 0.4493 (0.2423) loss 2.9934 (2.7149) grad_norm 5.4904 (6.6502) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][270/1251] eta 0:04:01 lr 0.000012 wd 0.0500 time 0.2472 (0.2459) data time 0.0010 (0.0035) model time 0.2462 (0.2423) loss 2.8183 (2.7158) grad_norm 6.1037 (6.6519) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][280/1251] eta 0:03:59 lr 0.000012 wd 0.0500 time 0.2405 (0.2465) data time 0.0010 (0.0034) model time 0.2395 (0.2431) loss 3.0864 (2.7146) grad_norm 5.5298 (6.6668) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][290/1251] eta 0:03:56 lr 0.000012 wd 0.0500 time 0.2425 (0.2465) data time 0.0007 (0.0033) model time 0.2418 (0.2431) loss 2.7213 (2.7171) grad_norm 5.3065 (6.7101) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][300/1251] eta 0:03:54 lr 0.000012 wd 0.0500 time 0.2386 (0.2463) data time 0.0011 (0.0033) model time 0.2375 (0.2430) loss 2.6781 (2.7161) grad_norm 4.1396 (7.3246) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][310/1251] eta 0:03:52 lr 0.000012 wd 0.0500 time 0.2403 (0.2468) data time 0.0010 (0.0032) model time 0.2393 (0.2437) loss 2.9049 (2.7147) grad_norm 5.5636 (7.3001) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][320/1251] eta 0:03:50 lr 0.000012 wd 0.0500 time 0.2370 (0.2473) data 
time 0.0010 (0.0031) model time 0.2359 (0.2444) loss 2.8447 (2.7135) grad_norm 6.1076 (7.2816) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][330/1251] eta 0:03:48 lr 0.000012 wd 0.0500 time 0.2350 (0.2477) data time 0.0011 (0.0031) model time 0.2340 (0.2450) loss 2.7395 (2.7142) grad_norm 7.5235 (7.2577) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][340/1251] eta 0:03:45 lr 0.000012 wd 0.0500 time 0.2460 (0.2476) data time 0.0007 (0.0030) model time 0.2453 (0.2449) loss 3.3151 (2.7157) grad_norm 7.5262 (7.2562) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][350/1251] eta 0:03:43 lr 0.000012 wd 0.0500 time 0.2511 (0.2476) data time 0.0007 (0.0029) model time 0.2505 (0.2449) loss 2.6339 (2.7139) grad_norm 5.2298 (7.2515) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][360/1251] eta 0:03:40 lr 0.000012 wd 0.0500 time 0.2423 (0.2473) data time 0.0009 (0.0029) model time 0.2414 (0.2447) loss 2.6299 (2.7105) grad_norm 5.0602 (7.2297) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][370/1251] eta 0:03:37 lr 0.000012 wd 0.0500 time 0.2378 (0.2472) data time 0.0010 (0.0028) model time 0.2368 (0.2445) loss 2.4637 (2.7057) grad_norm 5.4566 (7.2052) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][380/1251] eta 0:03:35 lr 0.000012 wd 0.0500 time 0.2448 (0.2470) data time 0.0009 (0.0028) model time 0.2438 (0.2444) loss 2.3722 (2.7056) grad_norm 7.7356 (7.1645) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [292/300][390/1251] eta 0:03:32 lr 0.000012 wd 0.0500 time 0.2501 (0.2469) data time 0.0007 (0.0027) model time 0.2494 (0.2443) loss 3.1358 (2.7065) grad_norm 5.6559 (7.1387) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][400/1251] eta 0:03:30 lr 0.000012 wd 0.0500 time 0.2499 (0.2468) data time 0.0012 (0.0027) model time 0.2487 (0.2443) loss 2.3944 (2.6998) grad_norm 4.1631 (7.1151) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][410/1251] eta 0:03:27 lr 0.000012 wd 0.0500 time 0.2549 (0.2468) data time 0.0011 (0.0027) model time 0.2538 (0.2443) loss 2.7719 (2.7025) grad_norm 6.4889 (7.0746) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][420/1251] eta 0:03:24 lr 0.000012 wd 0.0500 time 0.2464 (0.2467) data time 0.0009 (0.0026) model time 0.2455 (0.2442) loss 2.1673 (2.6977) grad_norm 74.0404 (7.2593) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:16:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][430/1251] eta 0:03:22 lr 0.000012 wd 0.0500 time 0.2425 (0.2465) data time 0.0009 (0.0026) model time 0.2416 (0.2441) loss 2.0804 (2.6910) grad_norm 7.3596 (7.2391) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][440/1251] eta 0:03:19 lr 0.000012 wd 0.0500 time 0.2497 (0.2465) data time 0.0009 (0.0025) model time 0.2488 (0.2441) loss 2.1622 (2.6873) grad_norm 6.0350 (7.2002) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][450/1251] eta 0:03:17 lr 0.000012 wd 0.0500 time 0.2505 (0.2464) data time 0.0009 (0.0025) model time 0.2496 (0.2440) loss 1.7006 (2.6831) grad_norm 5.8244 (7.1978) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 11:17:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][460/1251] eta 0:03:14 lr 0.000012 wd 0.0500 time 0.2382 (0.2463) data time 0.0010 (0.0025) model time 0.2372 (0.2440) loss 3.6949 (2.6866) grad_norm 6.6958 (7.1570) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][470/1251] eta 0:03:12 lr 0.000012 wd 0.0500 time 0.2413 (0.2463) data time 0.0009 (0.0024) model time 0.2404 (0.2439) loss 2.1538 (2.6874) grad_norm 4.6199 (7.1292) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][480/1251] eta 0:03:09 lr 0.000012 wd 0.0500 time 0.2494 (0.2462) data time 0.0010 (0.0024) model time 0.2484 (0.2439) loss 2.9708 (2.6860) grad_norm 5.1117 (7.1033) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][490/1251] eta 0:03:07 lr 0.000012 wd 0.0500 time 0.2414 (0.2462) data time 0.0010 (0.0024) model time 0.2404 (0.2439) loss 3.3692 (2.6852) grad_norm 3.4720 (7.0686) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][500/1251] eta 0:03:05 lr 0.000012 wd 0.0500 time 0.2370 (0.2465) data time 0.0007 (0.0024) model time 0.2363 (0.2442) loss 2.9286 (2.6862) grad_norm 6.6088 (7.0645) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][510/1251] eta 0:03:02 lr 0.000012 wd 0.0500 time 0.2523 (0.2464) data time 0.0009 (0.0023) model time 0.2514 (0.2442) loss 2.5732 (2.6878) grad_norm 6.6253 (7.0874) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][520/1251] eta 0:03:00 lr 0.000012 wd 0.0500 time 0.2444 (0.2463) data time 0.0009 (0.0023) model 
time 0.2435 (0.2441) loss 2.8539 (2.6887) grad_norm 6.5316 (7.0756) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][530/1251] eta 0:02:57 lr 0.000012 wd 0.0500 time 0.2491 (0.2462) data time 0.0007 (0.0023) model time 0.2484 (0.2441) loss 1.6627 (2.6899) grad_norm 7.5567 (7.0455) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][540/1251] eta 0:02:55 lr 0.000012 wd 0.0500 time 0.2404 (0.2462) data time 0.0007 (0.0023) model time 0.2397 (0.2440) loss 2.8701 (2.6906) grad_norm 4.7751 (7.0321) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][550/1251] eta 0:02:52 lr 0.000012 wd 0.0500 time 0.2439 (0.2460) data time 0.0012 (0.0022) model time 0.2427 (0.2439) loss 3.0891 (2.6935) grad_norm 4.7067 (7.0173) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][560/1251] eta 0:02:49 lr 0.000012 wd 0.0500 time 0.2393 (0.2459) data time 0.0007 (0.0022) model time 0.2386 (0.2438) loss 3.0019 (2.6978) grad_norm 6.6222 (7.0405) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][570/1251] eta 0:02:47 lr 0.000012 wd 0.0500 time 0.2389 (0.2459) data time 0.0009 (0.0022) model time 0.2379 (0.2438) loss 2.7984 (2.6996) grad_norm 4.4462 (7.0163) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][580/1251] eta 0:02:44 lr 0.000012 wd 0.0500 time 0.2530 (0.2459) data time 0.0009 (0.0022) model time 0.2521 (0.2438) loss 3.5079 (2.7020) grad_norm 7.2878 (7.0128) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][590/1251] 
eta 0:02:42 lr 0.000012 wd 0.0500 time 0.2439 (0.2458) data time 0.0011 (0.0021) model time 0.2429 (0.2437) loss 2.8030 (2.7004) grad_norm 5.3423 (6.9961) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][600/1251] eta 0:02:40 lr 0.000012 wd 0.0500 time 0.2546 (0.2458) data time 0.0009 (0.0021) model time 0.2537 (0.2437) loss 3.0442 (2.6941) grad_norm 3.5916 (6.9888) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][610/1251] eta 0:02:37 lr 0.000012 wd 0.0500 time 0.2397 (0.2458) data time 0.0009 (0.0021) model time 0.2388 (0.2437) loss 3.0640 (2.6940) grad_norm 9.0254 (7.0185) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][620/1251] eta 0:02:35 lr 0.000012 wd 0.0500 time 0.2368 (0.2457) data time 0.0009 (0.0021) model time 0.2360 (0.2436) loss 3.4113 (2.6923) grad_norm 7.8693 (7.0122) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][630/1251] eta 0:02:32 lr 0.000012 wd 0.0500 time 0.2336 (0.2455) data time 0.0010 (0.0021) model time 0.2326 (0.2435) loss 2.7377 (2.6900) grad_norm 7.6756 (6.9962) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][640/1251] eta 0:02:30 lr 0.000012 wd 0.0500 time 0.2525 (0.2455) data time 0.0007 (0.0021) model time 0.2518 (0.2435) loss 1.8960 (2.6896) grad_norm 8.5544 (6.9727) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][650/1251] eta 0:02:27 lr 0.000012 wd 0.0500 time 0.2497 (0.2455) data time 0.0009 (0.0020) model time 0.2488 (0.2435) loss 3.3850 (2.6889) grad_norm 5.5633 (6.9562) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
11:17:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][660/1251] eta 0:02:25 lr 0.000012 wd 0.0500 time 0.2413 (0.2455) data time 0.0009 (0.0020) model time 0.2404 (0.2435) loss 1.7681 (2.6904) grad_norm 11.4979 (6.9500) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][670/1251] eta 0:02:22 lr 0.000012 wd 0.0500 time 0.2447 (0.2454) data time 0.0007 (0.0020) model time 0.2440 (0.2434) loss 2.8949 (2.6918) grad_norm 5.4298 (6.9555) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:17:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][680/1251] eta 0:02:20 lr 0.000012 wd 0.0500 time 0.2385 (0.2453) data time 0.0009 (0.0020) model time 0.2376 (0.2433) loss 1.7597 (2.6901) grad_norm 3.7663 (6.9503) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][690/1251] eta 0:02:17 lr 0.000012 wd 0.0500 time 0.2346 (0.2452) data time 0.0009 (0.0020) model time 0.2336 (0.2433) loss 3.0134 (2.6906) grad_norm 4.9609 (6.9319) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][700/1251] eta 0:02:15 lr 0.000012 wd 0.0500 time 0.2331 (0.2452) data time 0.0007 (0.0020) model time 0.2324 (0.2432) loss 3.1538 (2.6894) grad_norm 5.8588 (6.9147) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][710/1251] eta 0:02:12 lr 0.000012 wd 0.0500 time 0.2415 (0.2451) data time 0.0007 (0.0020) model time 0.2408 (0.2432) loss 2.3269 (2.6887) grad_norm 5.3161 (6.8955) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][720/1251] eta 0:02:10 lr 0.000012 wd 0.0500 time 0.2366 (0.2450) data time 0.0008 (0.0019) model time 0.2357 (0.2431) loss 2.4035 
(2.6894) grad_norm 3.7279 (6.8813) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][730/1251] eta 0:02:07 lr 0.000012 wd 0.0500 time 0.2432 (0.2450) data time 0.0011 (0.0019) model time 0.2421 (0.2431) loss 1.6847 (2.6884) grad_norm 7.0000 (7.0517) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][740/1251] eta 0:02:05 lr 0.000012 wd 0.0500 time 0.2413 (0.2449) data time 0.0009 (0.0019) model time 0.2403 (0.2430) loss 2.5622 (2.6883) grad_norm 5.3384 (7.0509) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][750/1251] eta 0:02:02 lr 0.000012 wd 0.0500 time 0.2369 (0.2449) data time 0.0007 (0.0019) model time 0.2362 (0.2430) loss 1.9974 (2.6875) grad_norm 8.4552 (7.0583) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][760/1251] eta 0:02:00 lr 0.000012 wd 0.0500 time 0.2404 (0.2448) data time 0.0011 (0.0019) model time 0.2393 (0.2429) loss 2.8738 (2.6871) grad_norm 5.1933 (7.0558) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][770/1251] eta 0:01:57 lr 0.000012 wd 0.0500 time 0.2431 (0.2448) data time 0.0011 (0.0019) model time 0.2420 (0.2429) loss 2.8711 (2.6856) grad_norm 34.3386 (7.0896) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][780/1251] eta 0:01:55 lr 0.000012 wd 0.0500 time 0.3933 (0.2449) data time 0.0008 (0.0019) model time 0.3925 (0.2431) loss 2.5908 (2.6845) grad_norm 8.1719 (7.0966) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][790/1251] eta 0:01:52 lr 0.000012 wd 
0.0500 time 0.2431 (0.2449) data time 0.0009 (0.0019) model time 0.2422 (0.2431) loss 2.5933 (2.6846) grad_norm 7.1496 (7.1421) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][800/1251] eta 0:01:50 lr 0.000012 wd 0.0500 time 0.2403 (0.2450) data time 0.0007 (0.0018) model time 0.2396 (0.2432) loss 2.4099 (2.6825) grad_norm 5.3387 (7.1258) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][810/1251] eta 0:01:48 lr 0.000012 wd 0.0500 time 0.2458 (0.2450) data time 0.0009 (0.0018) model time 0.2449 (0.2432) loss 2.3192 (2.6806) grad_norm 8.1351 (7.1304) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][820/1251] eta 0:01:45 lr 0.000012 wd 0.0500 time 0.2427 (0.2450) data time 0.0009 (0.0018) model time 0.2418 (0.2432) loss 2.1166 (2.6792) grad_norm 5.9621 (7.1174) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][830/1251] eta 0:01:43 lr 0.000012 wd 0.0500 time 0.2459 (0.2449) data time 0.0010 (0.0018) model time 0.2449 (0.2432) loss 2.6602 (2.6789) grad_norm 6.9241 (7.1083) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][840/1251] eta 0:01:40 lr 0.000012 wd 0.0500 time 0.2387 (0.2452) data time 0.0010 (0.0018) model time 0.2378 (0.2434) loss 3.0839 (2.6809) grad_norm 5.2613 (7.1010) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][850/1251] eta 0:01:38 lr 0.000012 wd 0.0500 time 0.2390 (0.2454) data time 0.0009 (0.0018) model time 0.2381 (0.2436) loss 3.2725 (2.6814) grad_norm 5.1666 (7.1867) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:44 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [292/300][860/1251] eta 0:01:36 lr 0.000012 wd 0.0500 time 0.4587 (0.2456) data time 0.0009 (0.0018) model time 0.4578 (0.2438) loss 2.8938 (2.6788) grad_norm 9.3174 (7.1720) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][870/1251] eta 0:01:33 lr 0.000012 wd 0.0500 time 0.2481 (0.2455) data time 0.0010 (0.0018) model time 0.2471 (0.2438) loss 2.2090 (2.6792) grad_norm 5.9350 (7.1703) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][880/1251] eta 0:01:31 lr 0.000012 wd 0.0500 time 0.2422 (0.2455) data time 0.0009 (0.0018) model time 0.2412 (0.2438) loss 2.4777 (2.6792) grad_norm 5.8232 (7.1568) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][890/1251] eta 0:01:28 lr 0.000012 wd 0.0500 time 0.2471 (0.2454) data time 0.0010 (0.0017) model time 0.2461 (0.2437) loss 2.4301 (2.6796) grad_norm 5.1945 (7.1475) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][900/1251] eta 0:01:26 lr 0.000012 wd 0.0500 time 0.2430 (0.2454) data time 0.0010 (0.0017) model time 0.2420 (0.2437) loss 2.5660 (2.6783) grad_norm 5.9052 (7.1544) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][910/1251] eta 0:01:23 lr 0.000012 wd 0.0500 time 0.2400 (0.2453) data time 0.0008 (0.0017) model time 0.2392 (0.2436) loss 1.7080 (2.6765) grad_norm 5.4888 (7.1470) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:18:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][920/1251] eta 0:01:21 lr 0.000012 wd 0.0500 time 0.2431 (0.2453) data time 0.0008 (0.0017) model time 0.2423 (0.2436) loss 3.2113 (2.6773) grad_norm 5.3622 
(7.1359) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][930/1251] eta 0:01:18 lr 0.000012 wd 0.0500 time 0.2420 (0.2453) data time 0.0008 (0.0017) model time 0.2412 (0.2436) loss 3.1153 (2.6775) grad_norm 5.2669 (7.1219) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][940/1251] eta 0:01:16 lr 0.000012 wd 0.0500 time 0.2471 (0.2453) data time 0.0008 (0.0017) model time 0.2463 (0.2436) loss 2.5794 (2.6775) grad_norm 8.0780 (7.1218) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][950/1251] eta 0:01:13 lr 0.000012 wd 0.0500 time 0.2456 (0.2452) data time 0.0007 (0.0017) model time 0.2449 (0.2436) loss 3.0623 (2.6784) grad_norm 7.4250 (7.1067) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][960/1251] eta 0:01:11 lr 0.000012 wd 0.0500 time 0.2422 (0.2452) data time 0.0007 (0.0017) model time 0.2414 (0.2435) loss 1.9099 (2.6760) grad_norm 4.8044 (7.0987) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][970/1251] eta 0:01:08 lr 0.000012 wd 0.0500 time 0.2315 (0.2451) data time 0.0009 (0.0017) model time 0.2306 (0.2435) loss 3.1475 (2.6772) grad_norm 4.6732 (7.0949) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][980/1251] eta 0:01:06 lr 0.000012 wd 0.0500 time 0.2397 (0.2451) data time 0.0008 (0.0017) model time 0.2390 (0.2434) loss 3.0570 (2.6790) grad_norm 5.6784 (7.0851) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][990/1251] eta 0:01:03 lr 0.000012 wd 0.0500 time 0.2369 (0.2450) data 
time 0.0010 (0.0017) model time 0.2359 (0.2434) loss 2.3013 (2.6783) grad_norm 6.5660 (7.0815) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1000/1251] eta 0:01:01 lr 0.000012 wd 0.0500 time 0.2500 (0.2450) data time 0.0009 (0.0017) model time 0.2491 (0.2434) loss 2.4550 (2.6768) grad_norm 5.6244 (7.0766) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1010/1251] eta 0:00:59 lr 0.000012 wd 0.0500 time 0.2464 (0.2450) data time 0.0010 (0.0017) model time 0.2453 (0.2433) loss 2.6378 (2.6768) grad_norm 5.9940 (7.0580) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1020/1251] eta 0:00:56 lr 0.000012 wd 0.0500 time 0.2447 (0.2449) data time 0.0009 (0.0017) model time 0.2438 (0.2433) loss 2.6208 (2.6778) grad_norm 14.9078 (7.0532) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1030/1251] eta 0:00:54 lr 0.000012 wd 0.0500 time 0.2406 (0.2449) data time 0.0009 (0.0016) model time 0.2397 (0.2433) loss 3.5246 (2.6796) grad_norm 6.1289 (7.0412) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1040/1251] eta 0:00:51 lr 0.000012 wd 0.0500 time 0.2410 (0.2451) data time 0.0010 (0.0016) model time 0.2400 (0.2435) loss 2.7263 (2.6777) grad_norm 6.9623 (7.0314) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1050/1251] eta 0:00:49 lr 0.000012 wd 0.0500 time 0.2340 (0.2451) data time 0.0010 (0.0016) model time 0.2329 (0.2435) loss 2.7641 (2.6775) grad_norm 5.9231 (7.0210) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [292/300][1060/1251] eta 0:00:46 lr 0.000012 wd 0.0500 time 0.2457 (0.2451) data time 0.0010 (0.0016) model time 0.2447 (0.2435) loss 2.8970 (2.6765) grad_norm 3.4054 (7.0217) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1070/1251] eta 0:00:44 lr 0.000012 wd 0.0500 time 0.2341 (0.2450) data time 0.0009 (0.0016) model time 0.2331 (0.2434) loss 2.4850 (2.6780) grad_norm 9.8196 (7.0104) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1080/1251] eta 0:00:41 lr 0.000012 wd 0.0500 time 0.2425 (0.2450) data time 0.0008 (0.0016) model time 0.2417 (0.2434) loss 2.3332 (2.6761) grad_norm 4.5357 (6.9979) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1090/1251] eta 0:00:39 lr 0.000012 wd 0.0500 time 0.2461 (0.2450) data time 0.0007 (0.0016) model time 0.2454 (0.2434) loss 2.6062 (2.6757) grad_norm 8.7176 (6.9873) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1100/1251] eta 0:00:36 lr 0.000012 wd 0.0500 time 0.2359 (0.2449) data time 0.0010 (0.0016) model time 0.2349 (0.2433) loss 2.0354 (2.6753) grad_norm 6.9442 (6.9755) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1110/1251] eta 0:00:34 lr 0.000012 wd 0.0500 time 0.2387 (0.2449) data time 0.0009 (0.0016) model time 0.2378 (0.2433) loss 2.6714 (2.6754) grad_norm 5.7550 (6.9774) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1120/1251] eta 0:00:32 lr 0.000012 wd 0.0500 time 0.2423 (0.2449) data time 0.0007 (0.0016) model time 0.2416 (0.2433) loss 3.0648 (2.6774) grad_norm 5.9052 (6.9646) loss_scale 
128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1130/1251] eta 0:00:29 lr 0.000012 wd 0.0500 time 0.2432 (0.2449) data time 0.0009 (0.0016) model time 0.2423 (0.2433) loss 2.8205 (2.6784) grad_norm 5.4567 (6.9660) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1140/1251] eta 0:00:27 lr 0.000012 wd 0.0500 time 0.2377 (0.2448) data time 0.0011 (0.0016) model time 0.2366 (0.2433) loss 2.7470 (2.6790) grad_norm 6.9871 (6.9668) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1150/1251] eta 0:00:24 lr 0.000012 wd 0.0500 time 0.2373 (0.2448) data time 0.0009 (0.0016) model time 0.2364 (0.2433) loss 2.4106 (2.6782) grad_norm 6.6553 (7.0044) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1160/1251] eta 0:00:22 lr 0.000012 wd 0.0500 time 0.2409 (0.2448) data time 0.0010 (0.0016) model time 0.2399 (0.2432) loss 2.8382 (2.6779) grad_norm 4.7130 (7.0051) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:19:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1170/1251] eta 0:00:19 lr 0.000012 wd 0.0500 time 0.2361 (0.2448) data time 0.0009 (0.0016) model time 0.2353 (0.2432) loss 1.9370 (2.6771) grad_norm 3.8730 (6.9895) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1180/1251] eta 0:00:17 lr 0.000012 wd 0.0500 time 0.2500 (0.2447) data time 0.0008 (0.0016) model time 0.2492 (0.2432) loss 2.6406 (2.6768) grad_norm 5.7640 (6.9849) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1190/1251] eta 0:00:14 lr 0.000012 wd 0.0500 time 0.2395 (0.2447) data time 0.0011 
(0.0016) model time 0.2383 (0.2432) loss 2.5158 (2.6768) grad_norm 5.3389 (6.9794) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1200/1251] eta 0:00:12 lr 0.000012 wd 0.0500 time 0.2413 (0.2447) data time 0.0010 (0.0016) model time 0.2403 (0.2431) loss 2.9542 (2.6761) grad_norm 8.2909 (6.9806) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1210/1251] eta 0:00:10 lr 0.000012 wd 0.0500 time 0.2528 (0.2447) data time 0.0010 (0.0015) model time 0.2518 (0.2431) loss 2.8855 (2.6767) grad_norm 5.2778 (6.9694) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1220/1251] eta 0:00:07 lr 0.000012 wd 0.0500 time 0.2403 (0.2446) data time 0.0007 (0.0015) model time 0.2396 (0.2431) loss 2.6982 (2.6755) grad_norm 9.3445 (6.9869) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1230/1251] eta 0:00:05 lr 0.000012 wd 0.0500 time 0.2337 (0.2446) data time 0.0013 (0.0015) model time 0.2324 (0.2431) loss 3.2450 (2.6764) grad_norm 13.6454 (6.9863) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1240/1251] eta 0:00:02 lr 0.000012 wd 0.0500 time 0.2212 (0.2445) data time 0.0006 (0.0015) model time 0.2206 (0.2430) loss 2.4596 (2.6757) grad_norm 6.6544 (6.9863) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [292/300][1250/1251] eta 0:00:00 lr 0.000012 wd 0.0500 time 0.2245 (0.2444) data time 0.0007 (0.0015) model time 0.2238 (0.2429) loss 3.5247 (2.6772) grad_norm 4.4501 (6.9900) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 
292 training takes 0:05:05 [2024-09-01 11:20:18 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:20:19 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:20:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.417 (0.417) Loss 0.3967 (0.3967) Acc@1 93.164 (93.164) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:20:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.075 (0.107) Loss 0.5728 (0.6165) Acc@1 90.723 (87.695) Acc@5 97.949 (97.789) Mem 7381MB [2024-09-01 11:20:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.095) Loss 0.9292 (0.6466) Acc@1 77.246 (86.612) Acc@5 95.605 (97.735) Mem 7381MB [2024-09-01 11:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.083 (0.090) Loss 1.1602 (0.7408) Acc@1 74.805 (84.429) Acc@5 92.676 (96.765) Mem 7381MB [2024-09-01 11:20:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.061 (0.084) Loss 1.0439 (0.7898) Acc@1 76.172 (83.251) Acc@5 93.945 (96.265) Mem 7381MB [2024-09-01 11:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.852 Acc@5 96.206 [2024-09-01 11:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:20:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.755 (0.755) Loss 0.3909 (0.3909) Acc@1 93.262 (93.262) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:20:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.079 (0.144) Loss 0.5654 (0.6088) Acc@1 90.527 (87.855) Acc@5 97.852 (97.745) Mem 7381MB [2024-09-01 11:20:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.087 (0.116) Loss 0.9131 (0.6396) Acc@1 78.223 (86.733) Acc@5 95.508 
(97.712) Mem 7381MB [2024-09-01 11:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.075 (0.103) Loss 1.1445 (0.7329) Acc@1 74.609 (84.520) Acc@5 92.871 (96.777) Mem 7381MB [2024-09-01 11:20:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.094) Loss 1.0176 (0.7818) Acc@1 76.855 (83.344) Acc@5 94.336 (96.270) Mem 7381MB [2024-09-01 11:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.896 Acc@5 96.226 [2024-09-01 11:20:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:20:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][0/1251] eta 0:23:34 lr 0.000012 wd 0.0500 time 1.1309 (1.1309) data time 0.6036 (0.6036) model time 0.0000 (0.0000) loss 2.6092 (2.6092) grad_norm 4.9782 (4.9782) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][10/1251] eta 0:06:36 lr 0.000012 wd 0.0500 time 0.2372 (0.3196) data time 0.0011 (0.0557) model time 0.0000 (0.0000) loss 2.9284 (2.7212) grad_norm 5.6179 (6.2922) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][20/1251] eta 0:05:47 lr 0.000012 wd 0.0500 time 0.2395 (0.2821) data time 0.0008 (0.0297) model time 0.0000 (0.0000) loss 2.8629 (2.7523) grad_norm 5.0985 (5.9151) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][30/1251] eta 0:05:28 lr 0.000012 wd 0.0500 time 0.2456 (0.2687) data time 0.0011 (0.0204) model time 0.0000 (0.0000) loss 2.9151 (2.7630) grad_norm 6.4475 (5.9969) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][40/1251] eta 0:05:17 lr 0.000012 wd 0.0500 time 0.2402 (0.2622) data time 0.0011 (0.0157) model time 
0.0000 (0.0000) loss 2.1944 (2.7129) grad_norm 5.6767 (6.0754) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][50/1251] eta 0:05:10 lr 0.000012 wd 0.0500 time 0.2448 (0.2584) data time 0.0009 (0.0128) model time 0.0000 (0.0000) loss 2.3713 (2.7358) grad_norm 4.7892 (6.2327) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][60/1251] eta 0:05:07 lr 0.000012 wd 0.0500 time 0.2485 (0.2582) data time 0.0007 (0.0109) model time 0.2478 (0.2560) loss 3.1919 (2.7489) grad_norm 4.7195 (6.0957) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][70/1251] eta 0:05:04 lr 0.000012 wd 0.0500 time 0.2450 (0.2578) data time 0.0010 (0.0095) model time 0.2440 (0.2552) loss 2.8799 (2.7310) grad_norm 5.9719 (6.1096) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][80/1251] eta 0:04:59 lr 0.000011 wd 0.0500 time 0.2371 (0.2557) data time 0.0009 (0.0084) model time 0.2362 (0.2502) loss 2.5239 (2.7448) grad_norm 5.5809 (6.2013) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][90/1251] eta 0:04:55 lr 0.000011 wd 0.0500 time 0.2429 (0.2541) data time 0.0008 (0.0076) model time 0.2421 (0.2477) loss 3.3698 (2.7390) grad_norm 11.2394 (6.3003) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][100/1251] eta 0:04:51 lr 0.000011 wd 0.0500 time 0.2423 (0.2529) data time 0.0010 (0.0070) model time 0.2413 (0.2463) loss 2.7010 (2.7310) grad_norm 7.0898 (6.3573) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][110/1251] eta 
0:04:49 lr 0.000011 wd 0.0500 time 0.4384 (0.2537) data time 0.0011 (0.0064) model time 0.4374 (0.2486) loss 3.2746 (2.7283) grad_norm 6.0764 (6.4816) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:20:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][120/1251] eta 0:04:47 lr 0.000011 wd 0.0500 time 0.2343 (0.2542) data time 0.0011 (0.0060) model time 0.2332 (0.2502) loss 2.6368 (2.7374) grad_norm 4.5762 (6.4322) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][130/1251] eta 0:04:44 lr 0.000011 wd 0.0500 time 0.2461 (0.2535) data time 0.0009 (0.0056) model time 0.2452 (0.2493) loss 3.1040 (2.7359) grad_norm 8.5723 (6.4928) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][140/1251] eta 0:04:42 lr 0.000011 wd 0.0500 time 0.2346 (0.2540) data time 0.0011 (0.0053) model time 0.2335 (0.2505) loss 2.8741 (2.7203) grad_norm 6.3355 (6.4420) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][150/1251] eta 0:04:38 lr 0.000011 wd 0.0500 time 0.2452 (0.2533) data time 0.0011 (0.0050) model time 0.2441 (0.2497) loss 2.8541 (2.7039) grad_norm 17.6003 (6.5541) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][160/1251] eta 0:04:35 lr 0.000011 wd 0.0500 time 0.2455 (0.2525) data time 0.0007 (0.0047) model time 0.2448 (0.2488) loss 3.1471 (2.6988) grad_norm 4.5007 (6.5211) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][170/1251] eta 0:04:32 lr 0.000011 wd 0.0500 time 0.2359 (0.2517) data time 0.0009 (0.0045) model time 0.2349 (0.2478) loss 2.9217 (2.6888) grad_norm 6.3158 (6.4938) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
11:21:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][180/1251] eta 0:04:28 lr 0.000011 wd 0.0500 time 0.2347 (0.2510) data time 0.0008 (0.0043) model time 0.2339 (0.2471) loss 2.1530 (2.6745) grad_norm 5.7030 (6.5346) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][190/1251] eta 0:04:25 lr 0.000011 wd 0.0500 time 0.2453 (0.2506) data time 0.0011 (0.0042) model time 0.2442 (0.2468) loss 2.7879 (2.6836) grad_norm 4.9440 (6.4970) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][200/1251] eta 0:04:22 lr 0.000011 wd 0.0500 time 0.2417 (0.2502) data time 0.0010 (0.0040) model time 0.2407 (0.2464) loss 2.5096 (2.6739) grad_norm 5.8746 (6.5893) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][210/1251] eta 0:04:20 lr 0.000011 wd 0.0500 time 0.2467 (0.2499) data time 0.0007 (0.0039) model time 0.2459 (0.2462) loss 2.3754 (2.6673) grad_norm 4.8886 (6.5459) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][220/1251] eta 0:04:17 lr 0.000011 wd 0.0500 time 0.2409 (0.2495) data time 0.0008 (0.0037) model time 0.2401 (0.2458) loss 2.2327 (2.6733) grad_norm 6.7695 (6.5721) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][230/1251] eta 0:04:14 lr 0.000011 wd 0.0500 time 0.2440 (0.2492) data time 0.0008 (0.0036) model time 0.2432 (0.2456) loss 3.4959 (2.6769) grad_norm 7.1320 (6.5790) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][240/1251] eta 0:04:11 lr 0.000011 wd 0.0500 time 0.2451 (0.2489) data time 0.0010 (0.0035) model time 0.2441 (0.2454) loss 3.0451 
(2.6766) grad_norm 4.7067 (6.6366) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][250/1251] eta 0:04:08 lr 0.000011 wd 0.0500 time 0.2296 (0.2486) data time 0.0010 (0.0034) model time 0.2286 (0.2450) loss 3.4344 (2.6810) grad_norm 4.6726 (6.5734) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:21:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][260/1251] eta 0:04:06 lr 0.000011 wd 0.0500 time 0.2260 (0.2488) data time 0.0008 (0.0033) model time 0.2252 (0.2455) loss 2.8294 (2.6824) grad_norm 7.5280 (6.5833) loss_scale 256.0000 (130.4521) mem 7381MB [2024-09-01 11:21:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][270/1251] eta 0:04:03 lr 0.000011 wd 0.0500 time 0.2459 (0.2486) data time 0.0007 (0.0032) model time 0.2452 (0.2453) loss 3.2219 (2.6880) grad_norm 33.6506 (6.6645) loss_scale 256.0000 (135.0849) mem 7381MB [2024-09-01 11:21:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][280/1251] eta 0:04:01 lr 0.000011 wd 0.0500 time 0.2385 (0.2483) data time 0.0007 (0.0031) model time 0.2378 (0.2451) loss 3.1115 (2.6908) grad_norm 7.4789 (6.6490) loss_scale 256.0000 (139.3879) mem 7381MB [2024-09-01 11:21:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][290/1251] eta 0:03:58 lr 0.000011 wd 0.0500 time 0.2412 (0.2480) data time 0.0009 (0.0031) model time 0.2404 (0.2448) loss 1.7990 (2.6868) grad_norm 6.6937 (6.6478) loss_scale 256.0000 (143.3952) mem 7381MB [2024-09-01 11:21:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][300/1251] eta 0:03:55 lr 0.000011 wd 0.0500 time 0.2431 (0.2478) data time 0.0010 (0.0030) model time 0.2421 (0.2447) loss 2.8006 (2.6928) grad_norm 4.7624 (6.6094) loss_scale 256.0000 (147.1362) mem 7381MB [2024-09-01 11:21:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][310/1251] eta 0:03:52 lr 0.000011 wd 
0.0500 time 0.2407 (0.2476) data time 0.0013 (0.0029) model time 0.2394 (0.2445) loss 2.6404 (2.6859) grad_norm 5.6079 (6.5767) loss_scale 256.0000 (150.6367) mem 7381MB [2024-09-01 11:21:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][320/1251] eta 0:03:50 lr 0.000011 wd 0.0500 time 0.2405 (0.2474) data time 0.0007 (0.0029) model time 0.2398 (0.2443) loss 2.7048 (2.6821) grad_norm 6.2532 (6.5527) loss_scale 256.0000 (153.9190) mem 7381MB [2024-09-01 11:21:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][330/1251] eta 0:03:47 lr 0.000011 wd 0.0500 time 0.2408 (0.2472) data time 0.0010 (0.0028) model time 0.2399 (0.2441) loss 3.1520 (2.6904) grad_norm 5.5304 (6.5259) loss_scale 256.0000 (157.0030) mem 7381MB [2024-09-01 11:21:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][340/1251] eta 0:03:44 lr 0.000011 wd 0.0500 time 0.2373 (0.2469) data time 0.0011 (0.0028) model time 0.2362 (0.2439) loss 3.0271 (2.6938) grad_norm 4.2607 (6.7064) loss_scale 256.0000 (159.9062) mem 7381MB [2024-09-01 11:21:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][350/1251] eta 0:03:42 lr 0.000011 wd 0.0500 time 0.2417 (0.2467) data time 0.0009 (0.0027) model time 0.2408 (0.2437) loss 3.0923 (2.6975) grad_norm 4.8453 (6.7080) loss_scale 256.0000 (162.6439) mem 7381MB [2024-09-01 11:21:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][360/1251] eta 0:03:39 lr 0.000011 wd 0.0500 time 0.2441 (0.2465) data time 0.0007 (0.0027) model time 0.2433 (0.2436) loss 3.3468 (2.7035) grad_norm 5.0610 (6.6810) loss_scale 256.0000 (165.2299) mem 7381MB [2024-09-01 11:21:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][370/1251] eta 0:03:37 lr 0.000011 wd 0.0500 time 0.2425 (0.2464) data time 0.0010 (0.0026) model time 0.2416 (0.2435) loss 2.9737 (2.7057) grad_norm 4.9250 (6.7082) loss_scale 256.0000 (167.6765) mem 7381MB [2024-09-01 11:22:01 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [293/300][380/1251] eta 0:03:34 lr 0.000011 wd 0.0500 time 0.2473 (0.2463) data time 0.0009 (0.0026) model time 0.2464 (0.2435) loss 2.7184 (2.7075) grad_norm 7.8681 (6.7437) loss_scale 256.0000 (169.9948) mem 7381MB [2024-09-01 11:22:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][390/1251] eta 0:03:32 lr 0.000011 wd 0.0500 time 0.2425 (0.2463) data time 0.0007 (0.0025) model time 0.2417 (0.2435) loss 3.4713 (2.7120) grad_norm 5.4215 (6.7253) loss_scale 256.0000 (172.1944) mem 7381MB [2024-09-01 11:22:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][400/1251] eta 0:03:29 lr 0.000011 wd 0.0500 time 0.2425 (0.2461) data time 0.0010 (0.0025) model time 0.2415 (0.2433) loss 2.6706 (2.7166) grad_norm 4.4748 (6.7164) loss_scale 256.0000 (174.2843) mem 7381MB [2024-09-01 11:22:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][410/1251] eta 0:03:26 lr 0.000011 wd 0.0500 time 0.2350 (0.2460) data time 0.0008 (0.0025) model time 0.2342 (0.2432) loss 3.3692 (2.7138) grad_norm 6.2380 (6.7064) loss_scale 256.0000 (176.2725) mem 7381MB [2024-09-01 11:22:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][420/1251] eta 0:03:24 lr 0.000011 wd 0.0500 time 0.2400 (0.2459) data time 0.0007 (0.0024) model time 0.2393 (0.2432) loss 1.6002 (2.7127) grad_norm 6.7737 (6.6937) loss_scale 256.0000 (178.1663) mem 7381MB [2024-09-01 11:22:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][430/1251] eta 0:03:21 lr 0.000011 wd 0.0500 time 0.2409 (0.2458) data time 0.0007 (0.0024) model time 0.2402 (0.2431) loss 3.1375 (2.7179) grad_norm 7.6164 (6.7008) loss_scale 256.0000 (179.9722) mem 7381MB [2024-09-01 11:22:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][440/1251] eta 0:03:19 lr 0.000011 wd 0.0500 time 0.2444 (0.2458) data time 0.0007 (0.0024) model time 0.2437 (0.2431) loss 3.0539 (2.7141) grad_norm 4.0385 
(6.6759) loss_scale 256.0000 (181.6961) mem 7381MB [2024-09-01 11:22:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][450/1251] eta 0:03:16 lr 0.000011 wd 0.0500 time 0.2423 (0.2457) data time 0.0008 (0.0023) model time 0.2415 (0.2431) loss 2.4872 (2.7125) grad_norm 6.4643 (6.6694) loss_scale 256.0000 (183.3437) mem 7381MB [2024-09-01 11:22:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][460/1251] eta 0:03:14 lr 0.000011 wd 0.0500 time 0.2393 (0.2456) data time 0.0010 (0.0023) model time 0.2383 (0.2431) loss 3.1788 (2.7102) grad_norm 4.5972 (6.6770) loss_scale 256.0000 (184.9197) mem 7381MB [2024-09-01 11:22:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][470/1251] eta 0:03:11 lr 0.000011 wd 0.0500 time 0.2578 (0.2455) data time 0.0009 (0.0023) model time 0.2569 (0.2430) loss 2.7367 (2.7092) grad_norm 7.7422 (6.6668) loss_scale 256.0000 (186.4289) mem 7381MB [2024-09-01 11:22:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][480/1251] eta 0:03:09 lr 0.000011 wd 0.0500 time 0.2448 (0.2454) data time 0.0007 (0.0022) model time 0.2441 (0.2429) loss 2.1926 (2.7035) grad_norm 5.8882 (6.6834) loss_scale 256.0000 (187.8753) mem 7381MB [2024-09-01 11:22:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][490/1251] eta 0:03:06 lr 0.000011 wd 0.0500 time 0.2334 (0.2453) data time 0.0009 (0.0022) model time 0.2325 (0.2428) loss 2.5122 (2.7044) grad_norm 4.7123 (6.6605) loss_scale 256.0000 (189.2627) mem 7381MB [2024-09-01 11:22:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][500/1251] eta 0:03:04 lr 0.000011 wd 0.0500 time 0.2329 (0.2452) data time 0.0010 (0.0022) model time 0.2319 (0.2427) loss 3.0407 (2.7037) grad_norm 3.8924 (6.6340) loss_scale 256.0000 (190.5948) mem 7381MB [2024-09-01 11:22:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][510/1251] eta 0:03:01 lr 0.000011 wd 0.0500 time 0.2450 (0.2451) data 
time 0.0010 (0.0022) model time 0.2439 (0.2427) loss 2.8822 (2.7059) grad_norm 8.2069 (6.6128) loss_scale 256.0000 (191.8748) mem 7381MB [2024-09-01 11:22:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][520/1251] eta 0:02:59 lr 0.000011 wd 0.0500 time 0.2381 (0.2451) data time 0.0010 (0.0021) model time 0.2371 (0.2426) loss 3.2223 (2.7056) grad_norm 6.6584 (6.6157) loss_scale 256.0000 (193.1056) mem 7381MB [2024-09-01 11:22:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][530/1251] eta 0:02:56 lr 0.000011 wd 0.0500 time 0.2480 (0.2450) data time 0.0010 (0.0021) model time 0.2469 (0.2426) loss 2.7784 (2.7079) grad_norm 6.9697 (6.7420) loss_scale 256.0000 (194.2900) mem 7381MB [2024-09-01 11:22:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][540/1251] eta 0:02:54 lr 0.000011 wd 0.0500 time 0.2484 (0.2450) data time 0.0009 (0.0021) model time 0.2475 (0.2426) loss 2.7994 (2.7089) grad_norm 6.7183 (6.7562) loss_scale 256.0000 (195.4307) mem 7381MB [2024-09-01 11:22:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][550/1251] eta 0:02:51 lr 0.000011 wd 0.0500 time 0.2468 (0.2449) data time 0.0009 (0.0021) model time 0.2459 (0.2426) loss 2.1832 (2.7065) grad_norm 5.9807 (6.7403) loss_scale 256.0000 (196.5299) mem 7381MB [2024-09-01 11:22:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][560/1251] eta 0:02:49 lr 0.000011 wd 0.0500 time 0.2398 (0.2449) data time 0.0010 (0.0021) model time 0.2388 (0.2426) loss 1.9809 (2.7046) grad_norm 5.8183 (6.7299) loss_scale 256.0000 (197.5900) mem 7381MB [2024-09-01 11:22:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][570/1251] eta 0:02:46 lr 0.000011 wd 0.0500 time 0.2309 (0.2449) data time 0.0009 (0.0020) model time 0.2300 (0.2426) loss 2.5447 (2.7060) grad_norm 4.5859 (6.7070) loss_scale 256.0000 (198.6130) mem 7381MB [2024-09-01 11:22:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [293/300][580/1251] eta 0:02:44 lr 0.000011 wd 0.0500 time 0.2405 (0.2452) data time 0.0009 (0.0020) model time 0.2396 (0.2429) loss 2.3661 (2.7075) grad_norm 8.8864 (6.6915) loss_scale 256.0000 (199.6007) mem 7381MB [2024-09-01 11:22:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][590/1251] eta 0:02:42 lr 0.000011 wd 0.0500 time 0.2429 (0.2451) data time 0.0010 (0.0020) model time 0.2419 (0.2429) loss 2.9861 (2.7109) grad_norm 9.3168 (6.7164) loss_scale 256.0000 (200.5550) mem 7381MB [2024-09-01 11:22:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][600/1251] eta 0:02:39 lr 0.000011 wd 0.0500 time 0.2431 (0.2454) data time 0.0007 (0.0020) model time 0.2423 (0.2432) loss 2.4197 (2.7132) grad_norm 6.8664 (6.7085) loss_scale 256.0000 (201.4775) mem 7381MB [2024-09-01 11:22:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][610/1251] eta 0:02:37 lr 0.000011 wd 0.0500 time 0.2385 (0.2453) data time 0.0011 (0.0020) model time 0.2375 (0.2432) loss 2.7089 (2.7129) grad_norm 6.9500 (6.7131) loss_scale 256.0000 (202.3699) mem 7381MB [2024-09-01 11:22:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][620/1251] eta 0:02:34 lr 0.000011 wd 0.0500 time 0.2436 (0.2453) data time 0.0011 (0.0020) model time 0.2425 (0.2431) loss 2.7835 (2.7102) grad_norm 5.9297 (6.7078) loss_scale 256.0000 (203.2335) mem 7381MB [2024-09-01 11:23:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][630/1251] eta 0:02:32 lr 0.000011 wd 0.0500 time 0.3892 (0.2454) data time 0.0009 (0.0019) model time 0.3883 (0.2433) loss 2.1515 (2.7062) grad_norm 12.0749 (6.7106) loss_scale 256.0000 (204.0697) mem 7381MB [2024-09-01 11:23:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][640/1251] eta 0:02:30 lr 0.000011 wd 0.0500 time 0.2370 (0.2458) data time 0.0008 (0.0019) model time 0.2362 (0.2437) loss 2.5212 (2.7047) grad_norm 6.3333 (6.7103) loss_scale 256.0000 
(204.8799) mem 7381MB [2024-09-01 11:23:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][650/1251] eta 0:02:27 lr 0.000011 wd 0.0500 time 0.2423 (0.2459) data time 0.0008 (0.0019) model time 0.2415 (0.2439) loss 2.7784 (2.7035) grad_norm 7.0247 (6.7469) loss_scale 256.0000 (205.6651) mem 7381MB [2024-09-01 11:23:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][660/1251] eta 0:02:25 lr 0.000011 wd 0.0500 time 0.2426 (0.2461) data time 0.0007 (0.0019) model time 0.2419 (0.2441) loss 3.1009 (2.7051) grad_norm 6.1036 (6.7457) loss_scale 256.0000 (206.4266) mem 7381MB [2024-09-01 11:23:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][670/1251] eta 0:02:22 lr 0.000011 wd 0.0500 time 0.2418 (0.2461) data time 0.0010 (0.0019) model time 0.2408 (0.2441) loss 2.1469 (2.7045) grad_norm 6.4213 (6.7444) loss_scale 256.0000 (207.1654) mem 7381MB [2024-09-01 11:23:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][680/1251] eta 0:02:20 lr 0.000011 wd 0.0500 time 0.2391 (0.2460) data time 0.0008 (0.0019) model time 0.2383 (0.2440) loss 2.9049 (2.7034) grad_norm 5.3522 (6.7420) loss_scale 256.0000 (207.8825) mem 7381MB [2024-09-01 11:23:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][690/1251] eta 0:02:17 lr 0.000011 wd 0.0500 time 0.2458 (0.2459) data time 0.0006 (0.0019) model time 0.2452 (0.2439) loss 1.9989 (2.7037) grad_norm 5.2258 (6.7467) loss_scale 256.0000 (208.5789) mem 7381MB [2024-09-01 11:23:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][700/1251] eta 0:02:15 lr 0.000011 wd 0.0500 time 0.2371 (0.2459) data time 0.0013 (0.0018) model time 0.2358 (0.2439) loss 2.6319 (2.7022) grad_norm 4.4423 (6.7278) loss_scale 256.0000 (209.2553) mem 7381MB [2024-09-01 11:23:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][710/1251] eta 0:02:12 lr 0.000011 wd 0.0500 time 0.2378 (0.2458) data time 0.0011 (0.0018) model 
time 0.2367 (0.2438) loss 2.5145 (2.7012) grad_norm 4.5034 (6.7173) loss_scale 256.0000 (209.9128) mem 7381MB [2024-09-01 11:23:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][720/1251] eta 0:02:10 lr 0.000011 wd 0.0500 time 0.2411 (0.2457) data time 0.0007 (0.0018) model time 0.2404 (0.2437) loss 2.3828 (2.6977) grad_norm 5.0202 (6.7039) loss_scale 256.0000 (210.5520) mem 7381MB [2024-09-01 11:23:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][730/1251] eta 0:02:07 lr 0.000011 wd 0.0500 time 0.2429 (0.2457) data time 0.0008 (0.0018) model time 0.2421 (0.2437) loss 2.5213 (2.6989) grad_norm 5.2773 (6.6958) loss_scale 256.0000 (211.1737) mem 7381MB [2024-09-01 11:23:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][740/1251] eta 0:02:05 lr 0.000011 wd 0.0500 time 0.2484 (0.2457) data time 0.0009 (0.0018) model time 0.2475 (0.2437) loss 2.2676 (2.6974) grad_norm 4.8206 (6.6872) loss_scale 256.0000 (211.7787) mem 7381MB [2024-09-01 11:23:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][750/1251] eta 0:02:03 lr 0.000011 wd 0.0500 time 0.2443 (0.2456) data time 0.0007 (0.0018) model time 0.2436 (0.2437) loss 3.4088 (2.6977) grad_norm 4.6621 (6.6869) loss_scale 256.0000 (212.3675) mem 7381MB [2024-09-01 11:23:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][760/1251] eta 0:02:00 lr 0.000011 wd 0.0500 time 0.2389 (0.2456) data time 0.0010 (0.0018) model time 0.2379 (0.2437) loss 3.1249 (2.6960) grad_norm 7.1811 (6.6680) loss_scale 256.0000 (212.9409) mem 7381MB [2024-09-01 11:23:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][770/1251] eta 0:01:58 lr 0.000011 wd 0.0500 time 0.2320 (0.2455) data time 0.0008 (0.0018) model time 0.2312 (0.2436) loss 3.2329 (2.6965) grad_norm 4.1086 (6.6594) loss_scale 256.0000 (213.4994) mem 7381MB [2024-09-01 11:23:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][780/1251] 
eta 0:01:55 lr 0.000011 wd 0.0500 time 0.2349 (0.2455) data time 0.0009 (0.0018) model time 0.2340 (0.2436) loss 1.9643 (2.6949) grad_norm 6.3501 (6.6532) loss_scale 256.0000 (214.0435) mem 7381MB [2024-09-01 11:23:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][790/1251] eta 0:01:53 lr 0.000011 wd 0.0500 time 0.2497 (0.2457) data time 0.0008 (0.0017) model time 0.2489 (0.2438) loss 2.9154 (2.6949) grad_norm 5.1715 (6.6452) loss_scale 256.0000 (214.5740) mem 7381MB [2024-09-01 11:23:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][800/1251] eta 0:01:50 lr 0.000011 wd 0.0500 time 0.2294 (0.2456) data time 0.0010 (0.0017) model time 0.2284 (0.2437) loss 2.9371 (2.6967) grad_norm 3.9558 (6.6328) loss_scale 256.0000 (215.0911) mem 7381MB [2024-09-01 11:23:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][810/1251] eta 0:01:48 lr 0.000011 wd 0.0500 time 0.2367 (0.2456) data time 0.0007 (0.0017) model time 0.2360 (0.2437) loss 2.5291 (2.6941) grad_norm 3.6321 (6.6452) loss_scale 256.0000 (215.5956) mem 7381MB [2024-09-01 11:23:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][820/1251] eta 0:01:45 lr 0.000011 wd 0.0500 time 0.2489 (0.2455) data time 0.0009 (0.0017) model time 0.2480 (0.2437) loss 2.7304 (2.6937) grad_norm 3.9319 (6.6225) loss_scale 256.0000 (216.0877) mem 7381MB [2024-09-01 11:23:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][830/1251] eta 0:01:43 lr 0.000011 wd 0.0500 time 0.2382 (0.2454) data time 0.0010 (0.0017) model time 0.2372 (0.2436) loss 2.9035 (2.6907) grad_norm 6.3950 (6.6056) loss_scale 256.0000 (216.5680) mem 7381MB [2024-09-01 11:23:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][840/1251] eta 0:01:40 lr 0.000011 wd 0.0500 time 0.2403 (0.2454) data time 0.0011 (0.0017) model time 0.2393 (0.2436) loss 3.1039 (2.6877) grad_norm 5.2747 (6.5947) loss_scale 256.0000 (217.0369) mem 7381MB [2024-09-01 
11:23:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][850/1251] eta 0:01:38 lr 0.000011 wd 0.0500 time 0.2297 (0.2453) data time 0.0009 (0.0017) model time 0.2287 (0.2435) loss 3.0091 (2.6890) grad_norm 6.9109 (6.6226) loss_scale 256.0000 (217.4947) mem 7381MB [2024-09-01 11:23:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][860/1251] eta 0:01:35 lr 0.000011 wd 0.0500 time 0.2509 (0.2453) data time 0.0007 (0.0017) model time 0.2501 (0.2435) loss 2.3069 (2.6888) grad_norm 5.1925 (6.6150) loss_scale 256.0000 (217.9419) mem 7381MB [2024-09-01 11:24:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][870/1251] eta 0:01:33 lr 0.000011 wd 0.0500 time 0.2408 (0.2452) data time 0.0010 (0.0017) model time 0.2398 (0.2434) loss 2.2864 (2.6891) grad_norm 6.9096 (6.6060) loss_scale 256.0000 (218.3789) mem 7381MB [2024-09-01 11:24:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][880/1251] eta 0:01:30 lr 0.000011 wd 0.0500 time 0.2449 (0.2452) data time 0.0011 (0.0017) model time 0.2439 (0.2434) loss 2.6611 (2.6854) grad_norm 4.4408 (6.6021) loss_scale 256.0000 (218.8059) mem 7381MB [2024-09-01 11:24:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][890/1251] eta 0:01:28 lr 0.000011 wd 0.0500 time 0.2405 (0.2452) data time 0.0010 (0.0017) model time 0.2394 (0.2434) loss 1.7306 (2.6826) grad_norm 7.2209 (6.5953) loss_scale 256.0000 (219.2233) mem 7381MB [2024-09-01 11:24:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][900/1251] eta 0:01:26 lr 0.000011 wd 0.0500 time 0.2454 (0.2452) data time 0.0010 (0.0017) model time 0.2444 (0.2434) loss 2.0224 (2.6804) grad_norm 5.4036 (6.6009) loss_scale 256.0000 (219.6315) mem 7381MB [2024-09-01 11:24:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][910/1251] eta 0:01:23 lr 0.000011 wd 0.0500 time 0.2367 (0.2451) data time 0.0009 (0.0016) model time 0.2358 (0.2434) loss 1.9601 
(2.6785) grad_norm 4.8706 (6.5920) loss_scale 256.0000 (220.0307) mem 7381MB [2024-09-01 11:24:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][920/1251] eta 0:01:21 lr 0.000011 wd 0.0500 time 0.2500 (0.2451) data time 0.0010 (0.0016) model time 0.2491 (0.2434) loss 2.8552 (2.6752) grad_norm 5.4252 (6.6024) loss_scale 256.0000 (220.4213) mem 7381MB [2024-09-01 11:24:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][930/1251] eta 0:01:18 lr 0.000011 wd 0.0500 time 0.2379 (0.2451) data time 0.0008 (0.0016) model time 0.2371 (0.2433) loss 1.9883 (2.6766) grad_norm 5.0443 (6.5970) loss_scale 256.0000 (220.8034) mem 7381MB [2024-09-01 11:24:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][940/1251] eta 0:01:16 lr 0.000011 wd 0.0500 time 0.2446 (0.2451) data time 0.0009 (0.0016) model time 0.2437 (0.2433) loss 2.4624 (2.6758) grad_norm 5.6036 (6.5867) loss_scale 256.0000 (221.1775) mem 7381MB [2024-09-01 11:24:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][950/1251] eta 0:01:13 lr 0.000011 wd 0.0500 time 0.2415 (0.2451) data time 0.0007 (0.0016) model time 0.2407 (0.2433) loss 3.2212 (2.6792) grad_norm 5.6158 (6.5808) loss_scale 256.0000 (221.5436) mem 7381MB [2024-09-01 11:24:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][960/1251] eta 0:01:11 lr 0.000011 wd 0.0500 time 0.2384 (0.2450) data time 0.0010 (0.0016) model time 0.2375 (0.2433) loss 2.6167 (2.6802) grad_norm 10.5171 (6.5834) loss_scale 256.0000 (221.9022) mem 7381MB [2024-09-01 11:24:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][970/1251] eta 0:01:08 lr 0.000011 wd 0.0500 time 0.2392 (0.2450) data time 0.0008 (0.0016) model time 0.2384 (0.2433) loss 1.9692 (2.6770) grad_norm 5.2978 (6.5978) loss_scale 256.0000 (222.2533) mem 7381MB [2024-09-01 11:24:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][980/1251] eta 0:01:06 lr 0.000011 wd 
0.0500 time 0.2471 (0.2450) data time 0.0010 (0.0016) model time 0.2461 (0.2433) loss 3.1382 (2.6796) grad_norm 6.2602 (6.5917) loss_scale 256.0000 (222.5973) mem 7381MB [2024-09-01 11:24:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][990/1251] eta 0:01:03 lr 0.000011 wd 0.0500 time 0.2392 (0.2450) data time 0.0007 (0.0016) model time 0.2385 (0.2432) loss 2.6473 (2.6817) grad_norm 5.5568 (6.5838) loss_scale 256.0000 (222.9344) mem 7381MB [2024-09-01 11:24:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1000/1251] eta 0:01:01 lr 0.000011 wd 0.0500 time 0.2504 (0.2449) data time 0.0007 (0.0016) model time 0.2497 (0.2432) loss 3.0275 (2.6821) grad_norm 5.3907 (6.5880) loss_scale 256.0000 (223.2647) mem 7381MB [2024-09-01 11:24:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1010/1251] eta 0:00:59 lr 0.000011 wd 0.0500 time 0.2434 (0.2449) data time 0.0008 (0.0016) model time 0.2426 (0.2432) loss 2.6889 (2.6797) grad_norm 5.0369 (6.5785) loss_scale 256.0000 (223.5885) mem 7381MB [2024-09-01 11:24:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1020/1251] eta 0:00:56 lr 0.000011 wd 0.0500 time 0.2346 (0.2448) data time 0.0010 (0.0016) model time 0.2336 (0.2431) loss 2.8681 (2.6799) grad_norm 6.7623 (6.5782) loss_scale 256.0000 (223.9060) mem 7381MB [2024-09-01 11:24:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1030/1251] eta 0:00:54 lr 0.000011 wd 0.0500 time 0.2457 (0.2448) data time 0.0013 (0.0016) model time 0.2444 (0.2431) loss 2.2919 (2.6776) grad_norm 4.8988 (6.5642) loss_scale 256.0000 (224.2173) mem 7381MB [2024-09-01 11:24:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1040/1251] eta 0:00:51 lr 0.000011 wd 0.0500 time 0.2396 (0.2448) data time 0.0009 (0.0016) model time 0.2387 (0.2431) loss 3.7813 (2.6809) grad_norm 6.1723 (6.5601) loss_scale 256.0000 (224.5226) mem 7381MB [2024-09-01 11:24:44 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1050/1251] eta 0:00:49 lr 0.000011 wd 0.0500 time 0.2449 (0.2448) data time 0.0008 (0.0016) model time 0.2440 (0.2431) loss 2.9261 (2.6806) grad_norm 4.6366 (6.5734) loss_scale 256.0000 (224.8221) mem 7381MB [2024-09-01 11:24:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1060/1251] eta 0:00:46 lr 0.000011 wd 0.0500 time 0.2419 (0.2447) data time 0.0007 (0.0016) model time 0.2412 (0.2431) loss 2.8663 (2.6814) grad_norm 3.6121 (6.5662) loss_scale 256.0000 (225.1159) mem 7381MB [2024-09-01 11:24:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1070/1251] eta 0:00:44 lr 0.000011 wd 0.0500 time 0.2429 (0.2447) data time 0.0009 (0.0015) model time 0.2420 (0.2430) loss 3.0247 (2.6805) grad_norm 4.0022 (6.5559) loss_scale 256.0000 (225.4043) mem 7381MB [2024-09-01 11:24:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1080/1251] eta 0:00:41 lr 0.000011 wd 0.0500 time 0.2398 (0.2447) data time 0.0010 (0.0015) model time 0.2388 (0.2430) loss 2.5882 (2.6800) grad_norm 5.0034 (6.5484) loss_scale 256.0000 (225.6873) mem 7381MB [2024-09-01 11:24:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1090/1251] eta 0:00:39 lr 0.000011 wd 0.0500 time 0.2444 (0.2446) data time 0.0008 (0.0015) model time 0.2436 (0.2430) loss 2.7651 (2.6799) grad_norm 3.8928 (6.5471) loss_scale 256.0000 (225.9652) mem 7381MB [2024-09-01 11:24:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1100/1251] eta 0:00:36 lr 0.000011 wd 0.0500 time 0.2445 (0.2446) data time 0.0007 (0.0015) model time 0.2439 (0.2430) loss 2.9508 (2.6786) grad_norm 3.8798 (6.5401) loss_scale 256.0000 (226.2380) mem 7381MB [2024-09-01 11:24:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1110/1251] eta 0:00:34 lr 0.000011 wd 0.0500 time 0.2342 (0.2446) data time 0.0007 (0.0015) model time 0.2335 (0.2429) loss 2.7999 
(2.6808) grad_norm 4.7620 (6.5397) loss_scale 256.0000 (226.5059) mem 7381MB [2024-09-01 11:25:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1120/1251] eta 0:00:32 lr 0.000011 wd 0.0500 time 0.2515 (0.2446) data time 0.0007 (0.0015) model time 0.2509 (0.2429) loss 3.2335 (2.6808) grad_norm 6.0755 (6.5330) loss_scale 256.0000 (226.7690) mem 7381MB [2024-09-01 11:25:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1130/1251] eta 0:00:29 lr 0.000011 wd 0.0500 time 0.2429 (0.2445) data time 0.0011 (0.0015) model time 0.2418 (0.2429) loss 2.6069 (2.6808) grad_norm 4.5000 (6.5260) loss_scale 256.0000 (227.0274) mem 7381MB [2024-09-01 11:25:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1140/1251] eta 0:00:27 lr 0.000011 wd 0.0500 time 0.2289 (0.2445) data time 0.0009 (0.0015) model time 0.2280 (0.2429) loss 2.8652 (2.6804) grad_norm 6.0663 (6.5216) loss_scale 256.0000 (227.2813) mem 7381MB [2024-09-01 11:25:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1150/1251] eta 0:00:24 lr 0.000011 wd 0.0500 time 0.2450 (0.2445) data time 0.0009 (0.0015) model time 0.2442 (0.2429) loss 2.4195 (2.6797) grad_norm 5.8450 (6.5118) loss_scale 256.0000 (227.5308) mem 7381MB [2024-09-01 11:25:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1160/1251] eta 0:00:22 lr 0.000011 wd 0.0500 time 0.2416 (0.2447) data time 0.0009 (0.0015) model time 0.2407 (0.2430) loss 2.7142 (2.6796) grad_norm 4.6488 (6.5124) loss_scale 256.0000 (227.7761) mem 7381MB [2024-09-01 11:25:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1170/1251] eta 0:00:19 lr 0.000011 wd 0.0500 time 0.2456 (0.2450) data time 0.0009 (0.0015) model time 0.2447 (0.2434) loss 2.6180 (2.6800) grad_norm 5.5179 (6.5114) loss_scale 256.0000 (228.0171) mem 7381MB [2024-09-01 11:25:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1180/1251] eta 0:00:17 lr 0.000011 wd 
0.0500 time 0.2339 (0.2453) data time 0.0011 (0.0015) model time 0.2329 (0.2437) loss 2.7226 (2.6778) grad_norm 8.0334 (6.5083) loss_scale 256.0000 (228.2540) mem 7381MB [2024-09-01 11:25:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1190/1251] eta 0:00:14 lr 0.000011 wd 0.0500 time 0.2318 (0.2453) data time 0.0008 (0.0015) model time 0.2309 (0.2437) loss 2.2160 (2.6772) grad_norm 5.8900 (6.5109) loss_scale 256.0000 (228.4870) mem 7381MB [2024-09-01 11:25:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1200/1251] eta 0:00:12 lr 0.000011 wd 0.0500 time 0.2419 (0.2453) data time 0.0009 (0.0015) model time 0.2409 (0.2437) loss 2.8219 (2.6755) grad_norm 4.7951 (6.5047) loss_scale 256.0000 (228.7161) mem 7381MB [2024-09-01 11:25:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1210/1251] eta 0:00:10 lr 0.000011 wd 0.0500 time 0.2491 (0.2452) data time 0.0009 (0.0015) model time 0.2481 (0.2437) loss 2.8590 (2.6752) grad_norm 5.7945 (6.5027) loss_scale 256.0000 (228.9414) mem 7381MB [2024-09-01 11:25:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1220/1251] eta 0:00:07 lr 0.000011 wd 0.0500 time 0.2414 (0.2452) data time 0.0009 (0.0015) model time 0.2405 (0.2437) loss 2.7586 (2.6754) grad_norm 4.8625 (6.4926) loss_scale 256.0000 (229.1630) mem 7381MB [2024-09-01 11:25:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1230/1251] eta 0:00:05 lr 0.000011 wd 0.0500 time 0.2364 (0.2452) data time 0.0009 (0.0015) model time 0.2356 (0.2436) loss 2.3476 (2.6744) grad_norm 3.8544 (6.4898) loss_scale 256.0000 (229.3810) mem 7381MB [2024-09-01 11:25:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1240/1251] eta 0:00:02 lr 0.000011 wd 0.0500 time 0.2236 (0.2451) data time 0.0007 (0.0015) model time 0.2229 (0.2435) loss 3.0322 (2.6755) grad_norm 5.0724 (6.4894) loss_scale 256.0000 (229.5955) mem 7381MB [2024-09-01 11:25:33 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [293/300][1250/1251] eta 0:00:00 lr 0.000011 wd 0.0500 time 0.2266 (0.2450) data time 0.0005 (0.0015) model time 0.2261 (0.2434) loss 2.0034 (2.6761) grad_norm 4.8707 (6.4856) loss_scale 256.0000 (229.8066) mem 7381MB [2024-09-01 11:25:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 293 training takes 0:05:06 [2024-09-01 11:25:33 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:25:34 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.450 (0.450) Loss 0.3901 (0.3901) Acc@1 93.359 (93.359) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:25:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.110) Loss 0.5688 (0.6124) Acc@1 90.430 (87.642) Acc@5 97.949 (97.763) Mem 7381MB [2024-09-01 11:25:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.077 (0.094) Loss 0.9292 (0.6429) Acc@1 78.027 (86.658) Acc@5 95.898 (97.726) Mem 7381MB [2024-09-01 11:25:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.089) Loss 1.1572 (0.7370) Acc@1 74.316 (84.425) Acc@5 92.871 (96.784) Mem 7381MB [2024-09-01 11:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.083) Loss 1.0303 (0.7872) Acc@1 77.051 (83.253) Acc@5 94.043 (96.249) Mem 7381MB [2024-09-01 11:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.800 Acc@5 96.198 [2024-09-01 11:25:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:25:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.804 (0.804) Loss 0.3914 (0.3914) Acc@1 93.262 (93.262) Acc@5 98.828 (98.828) 
Mem 7381MB [2024-09-01 11:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.081 (0.145) Loss 0.5659 (0.6093) Acc@1 90.527 (87.837) Acc@5 97.949 (97.763) Mem 7381MB [2024-09-01 11:25:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.068 (0.112) Loss 0.9136 (0.6400) Acc@1 78.223 (86.733) Acc@5 95.508 (97.726) Mem 7381MB [2024-09-01 11:25:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.071 (0.101) Loss 1.1445 (0.7333) Acc@1 74.707 (84.533) Acc@5 92.969 (96.787) Mem 7381MB [2024-09-01 11:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.092) Loss 1.0186 (0.7823) Acc@1 76.855 (83.348) Acc@5 94.336 (96.280) Mem 7381MB [2024-09-01 11:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.892 Acc@5 96.240 [2024-09-01 11:25:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:25:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][0/1251] eta 0:24:21 lr 0.000011 wd 0.0500 time 1.1679 (1.1679) data time 0.6065 (0.6065) model time 0.0000 (0.0000) loss 2.9835 (2.9835) grad_norm 4.2680 (4.2680) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:25:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][10/1251] eta 0:06:45 lr 0.000011 wd 0.0500 time 0.2446 (0.3269) data time 0.0012 (0.0561) model time 0.0000 (0.0000) loss 3.1466 (2.5755) grad_norm 4.4848 (6.0657) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:25:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][20/1251] eta 0:05:52 lr 0.000011 wd 0.0500 time 0.2455 (0.2867) data time 0.0009 (0.0299) model time 0.0000 (0.0000) loss 2.4281 (2.6242) grad_norm 6.5202 (6.5761) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:25:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][30/1251] eta 0:05:32 lr 0.000011 wd 0.0500 
time 0.2415 (0.2721) data time 0.0011 (0.0207) model time 0.0000 (0.0000) loss 2.9768 (2.6428) grad_norm 5.5623 (6.1357) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:25:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][40/1251] eta 0:05:20 lr 0.000011 wd 0.0500 time 0.2392 (0.2646) data time 0.0008 (0.0159) model time 0.0000 (0.0000) loss 2.8196 (2.6712) grad_norm 6.1190 (6.7539) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:25:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][50/1251] eta 0:05:12 lr 0.000011 wd 0.0500 time 0.2445 (0.2603) data time 0.0009 (0.0130) model time 0.0000 (0.0000) loss 2.5512 (2.6691) grad_norm 4.0697 (6.5448) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:25:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][60/1251] eta 0:05:06 lr 0.000011 wd 0.0500 time 0.2357 (0.2571) data time 0.0010 (0.0110) model time 0.2347 (0.2401) loss 2.6535 (2.6718) grad_norm 5.5776 (6.4100) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][70/1251] eta 0:05:00 lr 0.000011 wd 0.0500 time 0.2348 (0.2548) data time 0.0009 (0.0096) model time 0.2339 (0.2400) loss 2.0957 (2.6590) grad_norm 5.6015 (6.3352) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][80/1251] eta 0:04:57 lr 0.000011 wd 0.0500 time 0.2514 (0.2538) data time 0.0011 (0.0085) model time 0.2502 (0.2418) loss 2.8393 (2.6553) grad_norm 7.9541 (6.2767) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][90/1251] eta 0:04:53 lr 0.000011 wd 0.0500 time 0.2705 (0.2528) data time 0.0009 (0.0077) model time 0.2696 (0.2423) loss 2.4788 (2.6823) grad_norm 7.3697 (6.3121) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:08 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [294/300][100/1251] eta 0:04:50 lr 0.000011 wd 0.0500 time 0.2476 (0.2520) data time 0.0008 (0.0070) model time 0.2469 (0.2425) loss 2.4658 (2.6794) grad_norm 6.6767 (6.3372) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][110/1251] eta 0:04:46 lr 0.000011 wd 0.0500 time 0.2420 (0.2509) data time 0.0014 (0.0065) model time 0.2406 (0.2420) loss 2.3523 (2.6876) grad_norm 4.6745 (6.3192) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][120/1251] eta 0:04:43 lr 0.000011 wd 0.0500 time 0.2464 (0.2502) data time 0.0007 (0.0060) model time 0.2457 (0.2419) loss 3.2020 (2.6954) grad_norm 5.2707 (6.3541) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][130/1251] eta 0:04:39 lr 0.000011 wd 0.0500 time 0.2400 (0.2496) data time 0.0008 (0.0056) model time 0.2391 (0.2418) loss 2.7165 (2.7052) grad_norm 5.2706 (6.3490) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][140/1251] eta 0:04:36 lr 0.000011 wd 0.0500 time 0.2444 (0.2490) data time 0.0007 (0.0053) model time 0.2437 (0.2417) loss 2.3506 (2.6764) grad_norm 8.4409 (6.3611) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][150/1251] eta 0:04:35 lr 0.000011 wd 0.0500 time 0.2550 (0.2499) data time 0.0007 (0.0050) model time 0.2543 (0.2437) loss 3.0295 (2.6788) grad_norm 4.8023 (6.2527) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][160/1251] eta 0:04:33 lr 0.000011 wd 0.0500 time 0.2406 (0.2507) data time 0.0008 (0.0048) model time 0.2397 (0.2452) loss 3.0182 (2.6890) grad_norm 13.5619 
(6.5697) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][170/1251] eta 0:04:30 lr 0.000011 wd 0.0500 time 0.2397 (0.2502) data time 0.0009 (0.0046) model time 0.2388 (0.2449) loss 2.9023 (2.6846) grad_norm 4.3808 (6.5453) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][180/1251] eta 0:04:27 lr 0.000011 wd 0.0500 time 0.2456 (0.2498) data time 0.0008 (0.0044) model time 0.2448 (0.2447) loss 2.8872 (2.6730) grad_norm 6.7778 (6.5621) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][190/1251] eta 0:04:24 lr 0.000011 wd 0.0500 time 0.2371 (0.2493) data time 0.0009 (0.0042) model time 0.2362 (0.2443) loss 1.8730 (2.6608) grad_norm 4.4324 (6.5348) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][200/1251] eta 0:04:21 lr 0.000011 wd 0.0500 time 0.2412 (0.2489) data time 0.0007 (0.0040) model time 0.2405 (0.2440) loss 2.5286 (2.6566) grad_norm 6.5299 (6.4771) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][210/1251] eta 0:04:18 lr 0.000011 wd 0.0500 time 0.2390 (0.2485) data time 0.0011 (0.0039) model time 0.2379 (0.2438) loss 3.0370 (2.6521) grad_norm 4.9047 (6.4285) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][220/1251] eta 0:04:15 lr 0.000011 wd 0.0500 time 0.2504 (0.2483) data time 0.0007 (0.0038) model time 0.2497 (0.2437) loss 2.2617 (2.6527) grad_norm 5.1804 (6.7983) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][230/1251] eta 0:04:13 lr 0.000011 wd 0.0500 time 0.2433 (0.2479) data 
time 0.0010 (0.0036) model time 0.2422 (0.2434) loss 2.9096 (2.6511) grad_norm 7.7013 (6.7718) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][240/1251] eta 0:04:10 lr 0.000011 wd 0.0500 time 0.2395 (0.2476) data time 0.0009 (0.0035) model time 0.2386 (0.2432) loss 3.1743 (2.6535) grad_norm 5.3878 (6.7412) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][250/1251] eta 0:04:07 lr 0.000011 wd 0.0500 time 0.2349 (0.2473) data time 0.0008 (0.0034) model time 0.2341 (0.2430) loss 2.1915 (2.6568) grad_norm 3.6221 (6.7111) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][260/1251] eta 0:04:04 lr 0.000011 wd 0.0500 time 0.2444 (0.2472) data time 0.0007 (0.0033) model time 0.2436 (0.2430) loss 3.2505 (2.6645) grad_norm 5.8515 (6.6706) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][270/1251] eta 0:04:02 lr 0.000011 wd 0.0500 time 0.2388 (0.2470) data time 0.0010 (0.0032) model time 0.2379 (0.2429) loss 1.9494 (2.6607) grad_norm 18.3875 (6.7122) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][280/1251] eta 0:03:59 lr 0.000011 wd 0.0500 time 0.2394 (0.2468) data time 0.0011 (0.0032) model time 0.2383 (0.2428) loss 1.9204 (2.6624) grad_norm 8.2379 (6.8385) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][290/1251] eta 0:03:56 lr 0.000011 wd 0.0500 time 0.2448 (0.2466) data time 0.0009 (0.0031) model time 0.2439 (0.2427) loss 3.4700 (2.6575) grad_norm 4.4002 (6.7954) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [294/300][300/1251] eta 0:03:54 lr 0.000011 wd 0.0500 time 0.2363 (0.2463) data time 0.0007 (0.0030) model time 0.2356 (0.2425) loss 3.0067 (2.6614) grad_norm 4.1910 (6.9115) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:26:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][310/1251] eta 0:03:51 lr 0.000011 wd 0.0500 time 0.2464 (0.2462) data time 0.0008 (0.0030) model time 0.2457 (0.2425) loss 2.3771 (2.6562) grad_norm 4.9090 (6.9044) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:27:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][320/1251] eta 0:03:49 lr 0.000011 wd 0.0500 time 0.2348 (0.2461) data time 0.0009 (0.0029) model time 0.2339 (0.2424) loss 3.4825 (2.6602) grad_norm 5.5626 (6.8638) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:27:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][330/1251] eta 0:03:46 lr 0.000011 wd 0.0500 time 0.2403 (0.2459) data time 0.0007 (0.0028) model time 0.2396 (0.2423) loss 2.5582 (2.6538) grad_norm 6.4576 (6.8831) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:27:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][340/1251] eta 0:03:43 lr 0.000011 wd 0.0500 time 0.2434 (0.2458) data time 0.0007 (0.0028) model time 0.2427 (0.2423) loss 1.7901 (2.6450) grad_norm 3.9932 (6.8873) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:27:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][350/1251] eta 0:03:41 lr 0.000011 wd 0.0500 time 0.2407 (0.2456) data time 0.0009 (0.0027) model time 0.2398 (0.2421) loss 2.2548 (2.6533) grad_norm 5.7640 (6.9967) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:27:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][360/1251] eta 0:03:38 lr 0.000011 wd 0.0500 time 0.2375 (0.2455) data time 0.0011 (0.0027) model time 0.2364 (0.2421) loss 2.6007 (2.6504) grad_norm 5.2418 (6.9613) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 11:27:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 11:27:12 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:27:13 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:28:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 11:28:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 11:29:01 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 11:29:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 11:29:14 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 11:29:15 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 11:29:16 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 11:29:16 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 294) [2024-09-01 11:29:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 11:29:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][370/1251] eta 0:41:48 lr 0.000011 wd 0.0500 time 0.2334 (2.8479) data time 0.0008 (0.1358) model time 0.2326 (2.7121) loss 3.1669 (3.0090) grad_norm 5.4695 (9.6786) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][380/1251] eta 0:16:05 lr 0.000011 wd 0.0500 time 0.2424 (1.1080) data time 0.0009 (0.0460) model time 0.2414 (1.0621) loss 3.2557 (2.8883) grad_norm 10.5512 (8.6746) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][390/1251] eta 0:10:54 lr 0.000011 wd 0.0500 time 0.2396 (0.7600) data time 0.0009 (0.0280) model time 0.2386 (0.7320) loss 3.2123 (2.8766) grad_norm 6.5475 (7.3565) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][400/1251] eta 0:08:40 lr 0.000011 wd 0.0500 time 0.2431 (0.6112) data time 0.0009 (0.0203) model time 0.2422 (0.5909) loss 2.8193 (2.8593) grad_norm 4.5865 (6.8950) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][410/1251] eta 0:07:25 lr 0.000011 wd 0.0500 time 0.2423 (0.5292) data time 0.0009 (0.0160) model time 0.2413 (0.5132) loss 2.9254 (2.8318) grad_norm 4.2808 (6.8034) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[294/300][420/1251] eta 0:06:35 lr 0.000011 wd 0.0500 time 0.2318 (0.4762) data time 0.0007 (0.0133) model time 0.2311 (0.4629) loss 2.0457 (2.8247) grad_norm 6.6521 (6.8559) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][430/1251] eta 0:06:00 lr 0.000011 wd 0.0500 time 0.2400 (0.4397) data time 0.0009 (0.0114) model time 0.2391 (0.4283) loss 3.0701 (2.8135) grad_norm 4.8748 (6.9151) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][440/1251] eta 0:05:34 lr 0.000011 wd 0.0500 time 0.2382 (0.4128) data time 0.0009 (0.0100) model time 0.2373 (0.4028) loss 2.2266 (2.7703) grad_norm 4.3606 (6.7705) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][450/1251] eta 0:05:14 lr 0.000011 wd 0.0500 time 0.2380 (0.3923) data time 0.0010 (0.0089) model time 0.2370 (0.3834) loss 2.5331 (2.7642) grad_norm 5.8306 (6.7795) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][460/1251] eta 0:04:57 lr 0.000011 wd 0.0500 time 0.2376 (0.3762) data time 0.0010 (0.0081) model time 0.2366 (0.3681) loss 2.6225 (2.7416) grad_norm 4.7871 (6.6669) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:29:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][470/1251] eta 0:04:43 lr 0.000011 wd 0.0500 time 0.2424 (0.3632) data time 0.0009 (0.0075) model time 0.2415 (0.3557) loss 2.5725 (2.7734) grad_norm 4.5139 (6.7030) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][480/1251] eta 0:04:31 lr 0.000011 wd 0.0500 time 0.2437 (0.3524) data time 0.0008 (0.0069) model time 0.2429 (0.3455) loss 1.9175 (2.7617) grad_norm 4.9949 (6.7184) loss_scale 256.0000 (256.0000) mem 
7375MB [2024-09-01 11:30:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][490/1251] eta 0:04:21 lr 0.000011 wd 0.0500 time 0.2418 (0.3433) data time 0.0008 (0.0064) model time 0.2410 (0.3369) loss 2.5624 (2.7489) grad_norm 5.1772 (6.6928) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][500/1251] eta 0:04:12 lr 0.000011 wd 0.0500 time 0.2399 (0.3357) data time 0.0008 (0.0060) model time 0.2391 (0.3297) loss 2.4632 (2.7463) grad_norm 5.3540 (6.6339) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][510/1251] eta 0:04:03 lr 0.000011 wd 0.0500 time 0.2400 (0.3290) data time 0.0010 (0.0057) model time 0.2390 (0.3233) loss 2.9096 (2.7352) grad_norm 8.0668 (6.5930) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][520/1251] eta 0:03:56 lr 0.000011 wd 0.0500 time 0.2360 (0.3232) data time 0.0010 (0.0054) model time 0.2351 (0.3178) loss 3.0035 (2.7330) grad_norm 11.2147 (6.5713) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][530/1251] eta 0:03:49 lr 0.000011 wd 0.0500 time 0.2382 (0.3181) data time 0.0010 (0.0051) model time 0.2372 (0.3130) loss 2.6468 (2.7288) grad_norm 4.9402 (6.6480) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][540/1251] eta 0:03:42 lr 0.000011 wd 0.0500 time 0.2339 (0.3136) data time 0.0007 (0.0049) model time 0.2332 (0.3087) loss 2.6438 (2.7265) grad_norm 7.7694 (6.6722) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][550/1251] eta 0:03:37 lr 0.000011 wd 0.0500 time 0.2447 (0.3097) data time 0.0011 (0.0047) model time 0.2437 
(0.3050) loss 2.7576 (2.7243) grad_norm 6.3404 (6.6326) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][560/1251] eta 0:03:31 lr 0.000011 wd 0.0500 time 0.2349 (0.3060) data time 0.0011 (0.0045) model time 0.2337 (0.3015) loss 1.5289 (2.7140) grad_norm 6.4915 (6.6871) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][570/1251] eta 0:03:26 lr 0.000011 wd 0.0500 time 0.2449 (0.3028) data time 0.0011 (0.0043) model time 0.2438 (0.2985) loss 2.0495 (2.7076) grad_norm 5.6511 (6.6416) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][580/1251] eta 0:03:21 lr 0.000011 wd 0.0500 time 0.2335 (0.2999) data time 0.0009 (0.0042) model time 0.2325 (0.2957) loss 2.8944 (2.7011) grad_norm 5.4835 (6.9440) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][590/1251] eta 0:03:16 lr 0.000011 wd 0.0500 time 0.2399 (0.2973) data time 0.0007 (0.0040) model time 0.2392 (0.2932) loss 2.6886 (2.6981) grad_norm 6.9761 (6.9215) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][600/1251] eta 0:03:11 lr 0.000011 wd 0.0500 time 0.2535 (0.2949) data time 0.0010 (0.0039) model time 0.2525 (0.2910) loss 2.7470 (2.6893) grad_norm 5.2751 (6.9042) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][610/1251] eta 0:03:07 lr 0.000011 wd 0.0500 time 0.2486 (0.2927) data time 0.0010 (0.0038) model time 0.2476 (0.2889) loss 2.6280 (2.6870) grad_norm 5.3973 (6.8679) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][620/1251] eta 0:03:03 
lr 0.000011 wd 0.0500 time 0.2393 (0.2908) data time 0.0007 (0.0037) model time 0.2386 (0.2871) loss 2.2197 (2.6770) grad_norm 5.3764 (6.8417) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][630/1251] eta 0:02:59 lr 0.000011 wd 0.0500 time 0.2413 (0.2889) data time 0.0007 (0.0036) model time 0.2407 (0.2853) loss 1.9883 (2.6687) grad_norm 7.7964 (6.8455) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][640/1251] eta 0:02:55 lr 0.000011 wd 0.0500 time 0.2543 (0.2871) data time 0.0010 (0.0035) model time 0.2533 (0.2836) loss 2.8013 (2.6703) grad_norm 5.1138 (6.8301) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][650/1251] eta 0:02:51 lr 0.000011 wd 0.0500 time 0.2347 (0.2855) data time 0.0010 (0.0034) model time 0.2337 (0.2821) loss 2.6541 (2.6672) grad_norm 8.7290 (6.8085) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][660/1251] eta 0:02:48 lr 0.000011 wd 0.0500 time 0.2360 (0.2848) data time 0.0011 (0.0033) model time 0.2349 (0.2815) loss 2.8607 (2.6635) grad_norm 8.3883 (6.8428) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][670/1251] eta 0:02:44 lr 0.000011 wd 0.0500 time 0.2350 (0.2833) data time 0.0009 (0.0032) model time 0.2341 (0.2800) loss 2.0234 (2.6595) grad_norm 9.9541 (6.8396) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][680/1251] eta 0:02:41 lr 0.000011 wd 0.0500 time 0.2303 (0.2826) data time 0.0011 (0.0032) model time 0.2293 (0.2795) loss 2.9785 (2.6665) grad_norm 7.3797 (6.8222) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:52 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][690/1251] eta 0:02:37 lr 0.000011 wd 0.0500 time 0.2351 (0.2813) data time 0.0008 (0.0031) model time 0.2343 (0.2782) loss 2.7986 (2.6761) grad_norm 4.3157 (6.8821) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][700/1251] eta 0:02:34 lr 0.000011 wd 0.0500 time 0.2335 (0.2800) data time 0.0007 (0.0030) model time 0.2327 (0.2770) loss 2.7420 (2.6720) grad_norm 5.8866 (6.8653) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][710/1251] eta 0:02:30 lr 0.000011 wd 0.0500 time 0.2520 (0.2789) data time 0.0007 (0.0030) model time 0.2513 (0.2759) loss 2.2468 (2.6734) grad_norm 5.1335 (6.9057) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:30:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][720/1251] eta 0:02:27 lr 0.000011 wd 0.0500 time 0.2441 (0.2779) data time 0.0010 (0.0029) model time 0.2431 (0.2749) loss 2.6523 (2.6770) grad_norm 8.2878 (6.9202) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][730/1251] eta 0:02:24 lr 0.000011 wd 0.0500 time 0.2435 (0.2768) data time 0.0010 (0.0029) model time 0.2425 (0.2739) loss 2.7335 (2.6738) grad_norm 4.7200 (6.8717) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][740/1251] eta 0:02:20 lr 0.000011 wd 0.0500 time 0.2328 (0.2758) data time 0.0008 (0.0028) model time 0.2320 (0.2730) loss 2.2077 (2.6749) grad_norm 11.4430 (6.9056) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][750/1251] eta 0:02:17 lr 0.000011 wd 0.0500 time 0.2505 (0.2749) data time 0.0008 (0.0028) model time 0.2497 (0.2721) loss 1.8824 (2.6668) 
grad_norm 5.1222 (6.8709) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][760/1251] eta 0:02:14 lr 0.000011 wd 0.0500 time 0.2546 (0.2741) data time 0.0009 (0.0027) model time 0.2537 (0.2713) loss 3.2721 (2.6711) grad_norm 6.9778 (6.8684) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][770/1251] eta 0:02:11 lr 0.000011 wd 0.0500 time 0.2355 (0.2732) data time 0.0008 (0.0027) model time 0.2347 (0.2705) loss 2.7048 (2.6762) grad_norm 8.6796 (6.8680) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][780/1251] eta 0:02:08 lr 0.000011 wd 0.0500 time 0.2458 (0.2726) data time 0.0009 (0.0027) model time 0.2449 (0.2699) loss 2.8882 (2.6780) grad_norm 4.3022 (6.8515) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][790/1251] eta 0:02:05 lr 0.000011 wd 0.0500 time 0.2440 (0.2718) data time 0.0008 (0.0026) model time 0.2432 (0.2692) loss 2.9229 (2.6766) grad_norm 4.1484 (6.8219) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][800/1251] eta 0:02:02 lr 0.000011 wd 0.0500 time 0.2323 (0.2711) data time 0.0008 (0.0026) model time 0.2315 (0.2685) loss 2.9808 (2.6835) grad_norm 7.6593 (6.8128) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][810/1251] eta 0:01:59 lr 0.000011 wd 0.0500 time 0.2442 (0.2704) data time 0.0010 (0.0025) model time 0.2433 (0.2678) loss 2.7722 (2.6865) grad_norm 12.0450 (6.7981) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][820/1251] eta 0:01:56 lr 0.000011 wd 0.0500 time 
0.2405 (0.2697) data time 0.0007 (0.0025) model time 0.2398 (0.2672) loss 2.9575 (2.6842) grad_norm 5.3997 (6.8106) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][830/1251] eta 0:01:53 lr 0.000011 wd 0.0500 time 0.2390 (0.2691) data time 0.0007 (0.0025) model time 0.2383 (0.2667) loss 2.1020 (2.6788) grad_norm 6.9123 (6.7853) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][840/1251] eta 0:01:50 lr 0.000011 wd 0.0500 time 0.2363 (0.2686) data time 0.0007 (0.0024) model time 0.2355 (0.2661) loss 2.6788 (2.6751) grad_norm 7.1242 (6.8477) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][850/1251] eta 0:01:47 lr 0.000011 wd 0.0500 time 0.2402 (0.2680) data time 0.0007 (0.0024) model time 0.2395 (0.2656) loss 3.0968 (2.6759) grad_norm 6.8028 (6.8691) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][860/1251] eta 0:01:44 lr 0.000011 wd 0.0500 time 0.2425 (0.2675) data time 0.0007 (0.0024) model time 0.2418 (0.2651) loss 2.7359 (2.6755) grad_norm 7.1177 (6.8656) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][870/1251] eta 0:01:41 lr 0.000011 wd 0.0500 time 0.2420 (0.2671) data time 0.0010 (0.0024) model time 0.2410 (0.2647) loss 3.3109 (2.6737) grad_norm 3.7423 (6.8362) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][880/1251] eta 0:01:38 lr 0.000011 wd 0.0500 time 0.2356 (0.2665) data time 0.0009 (0.0023) model time 0.2346 (0.2642) loss 2.9602 (2.6798) grad_norm 10.2409 (6.8190) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:40 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [294/300][890/1251] eta 0:01:36 lr 0.000011 wd 0.0500 time 0.2394 (0.2660) data time 0.0008 (0.0023) model time 0.2387 (0.2637) loss 3.6134 (2.6757) grad_norm 4.5016 (6.8047) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][900/1251] eta 0:01:33 lr 0.000011 wd 0.0500 time 0.2425 (0.2655) data time 0.0008 (0.0023) model time 0.2418 (0.2633) loss 2.0137 (2.6718) grad_norm 4.5142 (6.7848) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][910/1251] eta 0:01:30 lr 0.000011 wd 0.0500 time 0.2344 (0.2650) data time 0.0008 (0.0023) model time 0.2336 (0.2628) loss 2.4879 (2.6718) grad_norm 6.9129 (6.7705) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][920/1251] eta 0:01:27 lr 0.000011 wd 0.0500 time 0.2435 (0.2646) data time 0.0010 (0.0022) model time 0.2425 (0.2624) loss 2.7894 (2.6743) grad_norm 6.2786 (6.7638) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][930/1251] eta 0:01:24 lr 0.000011 wd 0.0500 time 0.2423 (0.2642) data time 0.0010 (0.0022) model time 0.2413 (0.2620) loss 2.7433 (2.6761) grad_norm 7.2159 (6.7383) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][940/1251] eta 0:01:22 lr 0.000011 wd 0.0500 time 0.2424 (0.2638) data time 0.0010 (0.0022) model time 0.2414 (0.2616) loss 2.7063 (2.6761) grad_norm 5.3063 (6.7247) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][950/1251] eta 0:01:19 lr 0.000011 wd 0.0500 time 0.2309 (0.2634) data time 0.0011 (0.0022) model time 0.2298 (0.2612) loss 2.7458 (2.6769) grad_norm 4.3419 
(6.7142) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][960/1251] eta 0:01:16 lr 0.000011 wd 0.0500 time 0.2394 (0.2630) data time 0.0007 (0.0022) model time 0.2387 (0.2608) loss 3.1012 (2.6769) grad_norm 6.6004 (6.6917) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:31:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][970/1251] eta 0:01:13 lr 0.000011 wd 0.0500 time 0.2337 (0.2626) data time 0.0007 (0.0021) model time 0.2329 (0.2604) loss 3.2266 (2.6747) grad_norm 8.4965 (6.6872) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:32:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][980/1251] eta 0:01:11 lr 0.000011 wd 0.0500 time 0.2402 (0.2622) data time 0.0008 (0.0021) model time 0.2395 (0.2601) loss 3.6754 (2.6778) grad_norm 6.0427 (6.6811) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:32:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][990/1251] eta 0:01:08 lr 0.000011 wd 0.0500 time 0.2367 (0.2619) data time 0.0010 (0.0021) model time 0.2357 (0.2598) loss 2.5773 (2.6813) grad_norm 14.2492 (6.6813) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:32:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1000/1251] eta 0:01:05 lr 0.000011 wd 0.0500 time 0.2376 (0.2615) data time 0.0010 (0.0021) model time 0.2366 (0.2594) loss 2.4334 (2.6808) grad_norm 4.1025 (6.7388) loss_scale 256.0000 (256.0000) mem 7375MB [2024-09-01 11:32:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1010/1251] eta 0:01:02 lr 0.000011 wd 0.0500 time 0.2362 (0.2612) data time 0.0008 (0.0021) model time 0.2354 (0.2592) loss 2.3203 (2.6777) grad_norm 12.3427 (6.7421) loss_scale 512.0000 (258.3814) mem 7375MB [2024-09-01 11:32:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1020/1251] eta 0:01:00 lr 0.000011 wd 0.0500 time 0.2452 (0.2610) 
data time 0.0008 (0.0021) model time 0.2444 (0.2589) loss 2.4611 (2.6784) grad_norm 5.8366 (6.7207) loss_scale 512.0000 (262.2534) mem 7375MB [2024-09-01 11:32:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1030/1251] eta 0:00:57 lr 0.000011 wd 0.0500 time 0.2406 (0.2606) data time 0.0007 (0.0020) model time 0.2399 (0.2586) loss 2.5202 (2.6748) grad_norm 6.4111 (6.7064) loss_scale 512.0000 (266.0090) mem 7375MB [2024-09-01 11:32:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1040/1251] eta 0:00:54 lr 0.000011 wd 0.0500 time 0.2381 (0.2603) data time 0.0007 (0.0020) model time 0.2373 (0.2583) loss 2.6904 (2.6802) grad_norm 18.0371 (6.7097) loss_scale 512.0000 (269.6533) mem 7375MB [2024-09-01 11:32:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1050/1251] eta 0:00:52 lr 0.000011 wd 0.0500 time 0.2398 (0.2601) data time 0.0010 (0.0020) model time 0.2388 (0.2580) loss 2.8781 (2.6803) grad_norm 5.4899 (6.7051) loss_scale 512.0000 (273.1912) mem 7375MB [2024-09-01 11:32:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1060/1251] eta 0:00:49 lr 0.000011 wd 0.0500 time 0.2413 (0.2597) data time 0.0009 (0.0020) model time 0.2404 (0.2577) loss 2.8356 (2.6787) grad_norm 4.2380 (6.6807) loss_scale 512.0000 (276.6273) mem 7375MB [2024-09-01 11:32:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1070/1251] eta 0:00:46 lr 0.000011 wd 0.0500 time 0.2396 (0.2595) data time 0.0008 (0.0020) model time 0.2388 (0.2575) loss 2.9414 (2.6771) grad_norm 5.4598 (6.6691) loss_scale 512.0000 (279.9660) mem 7375MB [2024-09-01 11:32:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1080/1251] eta 0:00:44 lr 0.000011 wd 0.0500 time 0.2300 (0.2592) data time 0.0009 (0.0020) model time 0.2291 (0.2572) loss 2.9306 (2.6729) grad_norm 7.1971 (6.6748) loss_scale 512.0000 (283.2112) mem 7375MB [2024-09-01 11:32:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 
371): INFO Train: [294/300][1090/1251] eta 0:00:41 lr 0.000011 wd 0.0500 time 0.2319 (0.2589) data time 0.0007 (0.0020) model time 0.2312 (0.2569) loss 2.4736 (2.6723) grad_norm 4.4535 (6.6615) loss_scale 512.0000 (286.3669) mem 7375MB [2024-09-01 11:32:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1100/1251] eta 0:00:39 lr 0.000011 wd 0.0500 time 0.2382 (0.2586) data time 0.0007 (0.0019) model time 0.2375 (0.2567) loss 2.7335 (2.6737) grad_norm 5.4908 (6.6697) loss_scale 512.0000 (289.4367) mem 7375MB [2024-09-01 11:32:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1110/1251] eta 0:00:36 lr 0.000011 wd 0.0500 time 0.2386 (0.2583) data time 0.0009 (0.0019) model time 0.2377 (0.2564) loss 2.5254 (2.6729) grad_norm 3.6132 (6.6669) loss_scale 512.0000 (292.4242) mem 7375MB [2024-09-01 11:32:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1120/1251] eta 0:00:33 lr 0.000011 wd 0.0500 time 0.2369 (0.2581) data time 0.0009 (0.0019) model time 0.2360 (0.2562) loss 1.8996 (2.6704) grad_norm 6.2875 (6.6676) loss_scale 512.0000 (295.3325) mem 7375MB [2024-09-01 11:32:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1130/1251] eta 0:00:31 lr 0.000011 wd 0.0500 time 0.2338 (0.2579) data time 0.0008 (0.0019) model time 0.2330 (0.2560) loss 2.6160 (2.6731) grad_norm 5.7013 (6.6913) loss_scale 512.0000 (298.1647) mem 7375MB [2024-09-01 11:32:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1140/1251] eta 0:00:28 lr 0.000011 wd 0.0500 time 0.2389 (0.2577) data time 0.0010 (0.0019) model time 0.2379 (0.2558) loss 2.0752 (2.6734) grad_norm 4.6718 (6.6734) loss_scale 512.0000 (300.9239) mem 7375MB [2024-09-01 11:32:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1150/1251] eta 0:00:26 lr 0.000011 wd 0.0500 time 0.2406 (0.2575) data time 0.0010 (0.0019) model time 0.2396 (0.2556) loss 2.0301 (2.6730) grad_norm 4.1406 (6.6624) loss_scale 
512.0000 (303.6127) mem 7375MB [2024-09-01 11:32:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1160/1251] eta 0:00:23 lr 0.000011 wd 0.0500 time 0.2411 (0.2573) data time 0.0007 (0.0019) model time 0.2404 (0.2554) loss 2.5091 (2.6744) grad_norm 11.0798 (6.6633) loss_scale 512.0000 (306.2340) mem 7375MB [2024-09-01 11:32:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1170/1251] eta 0:00:20 lr 0.000011 wd 0.0500 time 0.2367 (0.2571) data time 0.0007 (0.0019) model time 0.2360 (0.2552) loss 2.8541 (2.6717) grad_norm 4.9035 (6.6636) loss_scale 512.0000 (308.7901) mem 7375MB [2024-09-01 11:32:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1180/1251] eta 0:00:18 lr 0.000011 wd 0.0500 time 0.2397 (0.2572) data time 0.0007 (0.0019) model time 0.2390 (0.2553) loss 3.0456 (2.6677) grad_norm 8.5413 (6.6648) loss_scale 512.0000 (311.2834) mem 7375MB [2024-09-01 11:32:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1190/1251] eta 0:00:15 lr 0.000011 wd 0.0500 time 0.2484 (0.2570) data time 0.0008 (0.0018) model time 0.2475 (0.2551) loss 2.0017 (2.6670) grad_norm 7.0377 (6.6564) loss_scale 512.0000 (313.7164) mem 7375MB [2024-09-01 11:32:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1200/1251] eta 0:00:13 lr 0.000011 wd 0.0500 time 0.2519 (0.2570) data time 0.0010 (0.0018) model time 0.2510 (0.2551) loss 2.1507 (2.6645) grad_norm 10.3700 (6.6616) loss_scale 512.0000 (316.0910) mem 7375MB [2024-09-01 11:32:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1210/1251] eta 0:00:10 lr 0.000011 wd 0.0500 time 0.2433 (0.2568) data time 0.0008 (0.0018) model time 0.2425 (0.2549) loss 1.6747 (2.6622) grad_norm 14.7218 (6.6663) loss_scale 512.0000 (318.4095) mem 7375MB [2024-09-01 11:33:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1220/1251] eta 0:00:07 lr 0.000011 wd 0.0500 time 0.2404 (0.2566) data time 
0.0008 (0.0018) model time 0.2396 (0.2548) loss 2.8137 (2.6622) grad_norm 5.1921 (6.6479) loss_scale 512.0000 (320.6737) mem 7375MB [2024-09-01 11:33:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1230/1251] eta 0:00:05 lr 0.000011 wd 0.0500 time 0.2412 (0.2564) data time 0.0011 (0.0018) model time 0.2401 (0.2546) loss 2.7015 (2.6609) grad_norm 7.0502 (6.6387) loss_scale 512.0000 (322.8855) mem 7375MB [2024-09-01 11:33:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1240/1251] eta 0:00:02 lr 0.000011 wd 0.0500 time 0.2252 (0.2562) data time 0.0005 (0.0018) model time 0.2247 (0.2544) loss 2.3801 (2.6604) grad_norm 6.2496 (6.6269) loss_scale 512.0000 (325.0469) mem 7375MB [2024-09-01 11:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [294/300][1250/1251] eta 0:00:00 lr 0.000011 wd 0.0500 time 0.2242 (0.2558) data time 0.0005 (0.0018) model time 0.2237 (0.2540) loss 2.1616 (2.6590) grad_norm 31.6424 (6.6445) loss_scale 512.0000 (327.1593) mem 7375MB [2024-09-01 11:33:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 294 training takes 0:03:46 [2024-09-01 11:33:07 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:33:08 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 11:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.387 (0.387) Loss 0.3916 (0.3916) Acc@1 92.871 (92.871) Acc@5 98.828 (98.828) Mem 7375MB [2024-09-01 11:33:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.077 (0.102) Loss 0.5684 (0.6119) Acc@1 90.039 (87.678) Acc@5 97.852 (97.727) Mem 7375MB [2024-09-01 11:33:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.069 (0.091) Loss 0.9331 (0.6427) Acc@1 77.734 (86.612) Acc@5 95.312 (97.693) Mem 7375MB [2024-09-01 11:33:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.069 (0.086) Loss 1.1543 (0.7370) Acc@1 74.512 (84.356) Acc@5 92.871 (96.743) Mem 7375MB [2024-09-01 11:33:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.064 (0.081) Loss 1.0254 (0.7870) Acc@1 77.051 (83.206) Acc@5 93.945 (96.206) Mem 7375MB [2024-09-01 11:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.756 Acc@5 96.154 [2024-09-01 11:33:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.776 (0.776) Loss 0.3911 (0.3911) Acc@1 93.164 (93.164) Acc@5 98.828 (98.828) Mem 7375MB [2024-09-01 11:33:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.082 (0.145) Loss 0.5664 (0.6096) Acc@1 90.527 (87.820) Acc@5 97.949 (97.772) Mem 7375MB [2024-09-01 11:33:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.075 (0.112) Loss 0.9155 (0.6403) Acc@1 78.320 (86.761) Acc@5 95.508 (97.726) Mem 7375MB [2024-09-01 11:33:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.074 (0.101) Loss 1.1465 (0.7335) Acc@1 74.707 (84.539) Acc@5 93.066 (96.784) Mem 7375MB [2024-09-01 11:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.062 (0.092) Loss 1.0195 
(0.7826) Acc@1 76.953 (83.348) Acc@5 94.336 (96.277) Mem 7375MB [2024-09-01 11:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.892 Acc@5 96.238 [2024-09-01 11:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:33:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.89% [2024-09-01 11:33:19 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 11:33:20 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 11:33:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][0/1251] eta 0:15:33 lr 0.000011 wd 0.0500 time 0.7466 (0.7466) data time 0.4997 (0.4997) model time 0.0000 (0.0000) loss 3.1935 (3.1935) grad_norm 5.0191 (5.0191) loss_scale 512.0000 (512.0000) mem 7382MB [2024-09-01 11:33:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][10/1251] eta 0:05:58 lr 0.000011 wd 0.0500 time 0.2479 (0.2891) data time 0.0009 (0.0464) model time 0.0000 (0.0000) loss 3.0345 (2.6735) grad_norm 7.4523 (6.6633) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][20/1251] eta 0:05:28 lr 0.000011 wd 0.0500 time 0.2427 (0.2668) data time 0.0010 (0.0248) model time 0.0000 (0.0000) loss 3.2745 (2.7225) grad_norm 6.8458 (6.7328) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][30/1251] eta 0:05:14 lr 0.000011 wd 0.0500 time 0.2396 (0.2579) data time 0.0008 (0.0171) model time 0.0000 (0.0000) loss 3.2434 (2.6680) grad_norm 6.6806 (7.0134) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][40/1251] eta 0:05:07 lr 
0.000011 wd 0.0500 time 0.2373 (0.2536) data time 0.0009 (0.0132) model time 0.0000 (0.0000) loss 2.8590 (2.6948) grad_norm 5.2805 (6.8053) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][50/1251] eta 0:05:01 lr 0.000011 wd 0.0500 time 0.2550 (0.2512) data time 0.0009 (0.0108) model time 0.0000 (0.0000) loss 1.6815 (2.6866) grad_norm 8.4195 (6.9256) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][60/1251] eta 0:04:57 lr 0.000011 wd 0.0500 time 0.2392 (0.2494) data time 0.0009 (0.0092) model time 0.2383 (0.2396) loss 2.7404 (2.6838) grad_norm 7.9570 (6.8731) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][70/1251] eta 0:04:53 lr 0.000011 wd 0.0500 time 0.2408 (0.2481) data time 0.0007 (0.0080) model time 0.2401 (0.2394) loss 2.9045 (2.6431) grad_norm 6.1442 (6.7545) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][80/1251] eta 0:04:49 lr 0.000011 wd 0.0500 time 0.2345 (0.2469) data time 0.0010 (0.0072) model time 0.2335 (0.2388) loss 2.9825 (2.6455) grad_norm 3.8057 (6.9863) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][90/1251] eta 0:04:45 lr 0.000011 wd 0.0500 time 0.2475 (0.2462) data time 0.0014 (0.0065) model time 0.2462 (0.2389) loss 2.7581 (2.6697) grad_norm 4.8523 (7.0044) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][100/1251] eta 0:04:42 lr 0.000011 wd 0.0500 time 0.2336 (0.2458) data time 0.0011 (0.0060) model time 0.2325 (0.2392) loss 1.9770 (2.6492) grad_norm 3.5810 (6.8555) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:47 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][110/1251] eta 0:04:39 lr 0.000011 wd 0.0500 time 0.2437 (0.2451) data time 0.0009 (0.0055) model time 0.2429 (0.2390) loss 2.8025 (2.6560) grad_norm 7.8930 (7.2467) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][120/1251] eta 0:04:37 lr 0.000011 wd 0.0500 time 0.2392 (0.2450) data time 0.0009 (0.0051) model time 0.2383 (0.2396) loss 3.2453 (2.6544) grad_norm 4.2791 (7.1180) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][130/1251] eta 0:04:34 lr 0.000011 wd 0.0500 time 0.2411 (0.2447) data time 0.0007 (0.0048) model time 0.2404 (0.2396) loss 1.8830 (2.6605) grad_norm 4.4138 (6.9800) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][140/1251] eta 0:04:31 lr 0.000011 wd 0.0500 time 0.2407 (0.2444) data time 0.0009 (0.0045) model time 0.2398 (0.2396) loss 2.4420 (2.6445) grad_norm 8.5413 (7.0129) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][150/1251] eta 0:04:28 lr 0.000011 wd 0.0500 time 0.2424 (0.2440) data time 0.0007 (0.0043) model time 0.2417 (0.2393) loss 3.3612 (2.6542) grad_norm 6.5706 (6.9838) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:33:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][160/1251] eta 0:04:25 lr 0.000011 wd 0.0500 time 0.2435 (0.2438) data time 0.0009 (0.0041) model time 0.2427 (0.2394) loss 2.7485 (2.6627) grad_norm 6.6556 (6.9774) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][170/1251] eta 0:04:23 lr 0.000011 wd 0.0500 time 0.2481 (0.2437) data time 0.0009 (0.0039) model time 0.2471 (0.2396) loss 2.7703 (2.6687) 
grad_norm 8.1179 (6.9109) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][180/1251] eta 0:04:20 lr 0.000011 wd 0.0500 time 0.2398 (0.2434) data time 0.0007 (0.0037) model time 0.2391 (0.2394) loss 2.2629 (2.6756) grad_norm 4.8001 (6.8996) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][190/1251] eta 0:04:18 lr 0.000011 wd 0.0500 time 0.2452 (0.2432) data time 0.0009 (0.0036) model time 0.2443 (0.2394) loss 2.3004 (2.6621) grad_norm 30.8130 (6.9981) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][200/1251] eta 0:04:15 lr 0.000011 wd 0.0500 time 0.2371 (0.2430) data time 0.0009 (0.0035) model time 0.2362 (0.2393) loss 2.7797 (2.6609) grad_norm 5.4836 (6.9290) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][210/1251] eta 0:04:12 lr 0.000011 wd 0.0500 time 0.2386 (0.2429) data time 0.0009 (0.0033) model time 0.2377 (0.2393) loss 2.9451 (2.6579) grad_norm 7.2127 (6.9861) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][220/1251] eta 0:04:10 lr 0.000011 wd 0.0500 time 0.2381 (0.2429) data time 0.0010 (0.0032) model time 0.2371 (0.2394) loss 2.8611 (2.6563) grad_norm 4.6951 (6.9039) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][230/1251] eta 0:04:07 lr 0.000011 wd 0.0500 time 0.2379 (0.2428) data time 0.0009 (0.0031) model time 0.2370 (0.2394) loss 3.0965 (2.6614) grad_norm 7.4009 (6.9641) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][240/1251] eta 0:04:05 lr 0.000011 wd 0.0500 time 
0.2390 (0.2427) data time 0.0011 (0.0031) model time 0.2379 (0.2394) loss 2.8904 (2.6658) grad_norm 4.9765 (6.9226) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][250/1251] eta 0:04:02 lr 0.000011 wd 0.0500 time 0.2391 (0.2426) data time 0.0010 (0.0030) model time 0.2382 (0.2394) loss 3.0742 (2.6669) grad_norm 7.9067 (6.8881) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][260/1251] eta 0:04:00 lr 0.000011 wd 0.0500 time 0.2569 (0.2426) data time 0.0007 (0.0029) model time 0.2561 (0.2395) loss 2.4612 (2.6632) grad_norm 5.7146 (6.9172) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][270/1251] eta 0:03:57 lr 0.000011 wd 0.0500 time 0.2389 (0.2424) data time 0.0008 (0.0028) model time 0.2381 (0.2394) loss 3.0601 (2.6614) grad_norm 5.1119 (6.9232) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][280/1251] eta 0:03:55 lr 0.000011 wd 0.0500 time 0.2409 (0.2424) data time 0.0008 (0.0028) model time 0.2401 (0.2394) loss 2.9254 (2.6528) grad_norm 4.8250 (6.8765) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][290/1251] eta 0:03:52 lr 0.000011 wd 0.0500 time 0.2372 (0.2423) data time 0.0009 (0.0027) model time 0.2363 (0.2394) loss 3.0949 (2.6612) grad_norm 4.7506 (6.8864) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][300/1251] eta 0:03:50 lr 0.000011 wd 0.0500 time 0.2358 (0.2422) data time 0.0010 (0.0027) model time 0.2348 (0.2394) loss 2.5282 (2.6682) grad_norm 4.5560 (6.8633) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:35 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [295/300][310/1251] eta 0:03:47 lr 0.000011 wd 0.0500 time 0.2438 (0.2422) data time 0.0010 (0.0026) model time 0.2428 (0.2395) loss 2.6605 (2.6673) grad_norm 4.4114 (6.8535) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][320/1251] eta 0:03:45 lr 0.000011 wd 0.0500 time 0.2436 (0.2421) data time 0.0008 (0.0026) model time 0.2428 (0.2394) loss 2.6508 (2.6740) grad_norm 5.9212 (6.8718) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][330/1251] eta 0:03:42 lr 0.000011 wd 0.0500 time 0.2487 (0.2420) data time 0.0007 (0.0025) model time 0.2481 (0.2394) loss 1.8372 (2.6748) grad_norm 8.9545 (6.8529) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][340/1251] eta 0:03:40 lr 0.000011 wd 0.0500 time 0.2467 (0.2420) data time 0.0007 (0.0025) model time 0.2460 (0.2394) loss 1.7857 (2.6701) grad_norm 4.9331 (6.8597) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][350/1251] eta 0:03:38 lr 0.000011 wd 0.0500 time 0.2384 (0.2420) data time 0.0007 (0.0024) model time 0.2377 (0.2395) loss 2.0849 (2.6674) grad_norm 7.4351 (6.8343) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][360/1251] eta 0:03:35 lr 0.000011 wd 0.0500 time 0.2431 (0.2420) data time 0.0009 (0.0024) model time 0.2421 (0.2395) loss 2.9806 (2.6693) grad_norm 7.6177 (6.9554) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][370/1251] eta 0:03:33 lr 0.000011 wd 0.0500 time 0.2295 (0.2420) data time 0.0008 (0.0023) model time 0.2286 (0.2395) loss 3.5912 (2.6684) grad_norm 11.5199 
(6.9752) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][380/1251] eta 0:03:30 lr 0.000011 wd 0.0500 time 0.2361 (0.2419) data time 0.0008 (0.0023) model time 0.2354 (0.2395) loss 2.9794 (2.6720) grad_norm 7.3471 (6.9627) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][390/1251] eta 0:03:28 lr 0.000011 wd 0.0500 time 0.2310 (0.2418) data time 0.0013 (0.0023) model time 0.2297 (0.2394) loss 3.0084 (2.6752) grad_norm 8.6779 (6.9616) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][400/1251] eta 0:03:25 lr 0.000011 wd 0.0500 time 0.2373 (0.2418) data time 0.0007 (0.0023) model time 0.2365 (0.2394) loss 2.7493 (2.6790) grad_norm 12.2475 (6.9859) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:34:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][410/1251] eta 0:03:23 lr 0.000011 wd 0.0500 time 0.2411 (0.2417) data time 0.0008 (0.0022) model time 0.2403 (0.2394) loss 2.5151 (2.6784) grad_norm 5.2896 (6.9586) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][420/1251] eta 0:03:20 lr 0.000011 wd 0.0500 time 0.2414 (0.2417) data time 0.0008 (0.0022) model time 0.2407 (0.2394) loss 2.3583 (2.6755) grad_norm 5.4313 (6.9364) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][430/1251] eta 0:03:18 lr 0.000011 wd 0.0500 time 0.2394 (0.2417) data time 0.0009 (0.0022) model time 0.2385 (0.2394) loss 2.5670 (2.6716) grad_norm 7.2103 (6.9677) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][440/1251] eta 0:03:15 lr 0.000011 wd 0.0500 time 0.2319 (0.2416) 
data time 0.0009 (0.0021) model time 0.2310 (0.2393) loss 2.9866 (2.6698) grad_norm 4.7809 (6.9399) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][450/1251] eta 0:03:13 lr 0.000011 wd 0.0500 time 0.2396 (0.2416) data time 0.0009 (0.0021) model time 0.2387 (0.2393) loss 2.3581 (2.6701) grad_norm 6.7420 (6.9227) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][460/1251] eta 0:03:11 lr 0.000011 wd 0.0500 time 0.2469 (0.2415) data time 0.0009 (0.0021) model time 0.2460 (0.2393) loss 2.9095 (2.6723) grad_norm 5.1340 (6.9039) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][470/1251] eta 0:03:08 lr 0.000011 wd 0.0500 time 0.2365 (0.2415) data time 0.0009 (0.0021) model time 0.2356 (0.2393) loss 3.2187 (2.6773) grad_norm 5.1820 (6.8839) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][480/1251] eta 0:03:06 lr 0.000011 wd 0.0500 time 0.2432 (0.2415) data time 0.0011 (0.0021) model time 0.2422 (0.2393) loss 2.8947 (2.6807) grad_norm 9.8047 (6.8841) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][490/1251] eta 0:03:04 lr 0.000011 wd 0.0500 time 0.2415 (0.2419) data time 0.0011 (0.0020) model time 0.2404 (0.2398) loss 2.3182 (2.6842) grad_norm 6.0734 (6.8682) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][500/1251] eta 0:03:01 lr 0.000011 wd 0.0500 time 0.2362 (0.2419) data time 0.0009 (0.0020) model time 0.2353 (0.2398) loss 2.5490 (2.6838) grad_norm 3.8619 (6.8609) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [295/300][510/1251] eta 0:02:59 lr 0.000011 wd 0.0500 time 0.2338 (0.2419) data time 0.0007 (0.0020) model time 0.2331 (0.2399) loss 2.1349 (2.6833) grad_norm 6.1561 (6.8509) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][520/1251] eta 0:02:56 lr 0.000011 wd 0.0500 time 0.2422 (0.2419) data time 0.0010 (0.0020) model time 0.2412 (0.2399) loss 3.0747 (2.6888) grad_norm 6.1605 (6.8417) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][530/1251] eta 0:02:54 lr 0.000011 wd 0.0500 time 0.2381 (0.2418) data time 0.0007 (0.0020) model time 0.2374 (0.2398) loss 1.8433 (2.6906) grad_norm 5.7368 (6.8568) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][540/1251] eta 0:02:51 lr 0.000011 wd 0.0500 time 0.2306 (0.2417) data time 0.0007 (0.0019) model time 0.2299 (0.2397) loss 2.6277 (2.6916) grad_norm 4.9719 (6.8283) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][550/1251] eta 0:02:49 lr 0.000011 wd 0.0500 time 0.2373 (0.2416) data time 0.0010 (0.0019) model time 0.2363 (0.2397) loss 2.9001 (2.6896) grad_norm 6.6263 (6.8203) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][560/1251] eta 0:02:46 lr 0.000011 wd 0.0500 time 0.2370 (0.2416) data time 0.0010 (0.0019) model time 0.2360 (0.2396) loss 2.9217 (2.6911) grad_norm 4.0528 (6.7965) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][570/1251] eta 0:02:44 lr 0.000011 wd 0.0500 time 0.2328 (0.2415) data time 0.0007 (0.0019) model time 0.2321 (0.2396) loss 3.2067 (2.6929) grad_norm 8.0284 (6.7905) loss_scale 512.0000 
(512.0000) mem 7379MB [2024-09-01 11:35:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][580/1251] eta 0:02:42 lr 0.000011 wd 0.0500 time 0.2378 (0.2415) data time 0.0010 (0.0019) model time 0.2368 (0.2396) loss 1.5189 (2.6929) grad_norm 3.3349 (6.7948) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][590/1251] eta 0:02:39 lr 0.000011 wd 0.0500 time 0.2399 (0.2415) data time 0.0010 (0.0019) model time 0.2389 (0.2396) loss 2.4908 (2.6900) grad_norm 6.1507 (6.7920) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][600/1251] eta 0:02:37 lr 0.000011 wd 0.0500 time 0.2368 (0.2415) data time 0.0011 (0.0018) model time 0.2358 (0.2396) loss 2.6426 (2.6841) grad_norm 7.1897 (6.7842) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][610/1251] eta 0:02:34 lr 0.000011 wd 0.0500 time 0.2469 (0.2415) data time 0.0009 (0.0018) model time 0.2460 (0.2396) loss 2.6311 (2.6789) grad_norm 7.0270 (6.8106) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][620/1251] eta 0:02:32 lr 0.000011 wd 0.0500 time 0.2385 (0.2415) data time 0.0011 (0.0018) model time 0.2374 (0.2396) loss 2.9686 (2.6814) grad_norm 6.0740 (6.8061) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][630/1251] eta 0:02:29 lr 0.000011 wd 0.0500 time 0.2393 (0.2415) data time 0.0009 (0.0018) model time 0.2384 (0.2396) loss 2.9931 (2.6843) grad_norm 4.8749 (6.8315) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][640/1251] eta 0:02:27 lr 0.000011 wd 0.0500 time 0.2369 (0.2414) data time 0.0008 (0.0018) model 
time 0.2361 (0.2396) loss 2.9198 (2.6872) grad_norm 3.9019 (6.8260) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][650/1251] eta 0:02:25 lr 0.000011 wd 0.0500 time 0.2403 (0.2414) data time 0.0010 (0.0018) model time 0.2393 (0.2396) loss 2.9568 (2.6854) grad_norm 5.3403 (6.8063) loss_scale 512.0000 (512.0000) mem 7379MB [2024-09-01 11:35:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 383): INFO Suspend command received, saving checkpoint and exiting [2024-09-01 11:35:57 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:35:58 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:37:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 533): INFO Full config saved to ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/config.json [2024-09-01 11:37:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 129): INFO Creating model:vssm/msvmambav3_tiny_224 [2024-09-01 11:37:45 msvmambav3_tiny_224] (optimizer.py 18): INFO ==============> building optimizer adamw.................... [2024-09-01 11:37:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 194): INFO auto resuming from ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth [2024-09-01 11:37:58 msvmambav3_tiny_224] (utils.py 21): INFO ==============> Resuming form ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth.................... 
[2024-09-01 11:37:59 msvmambav3_tiny_224] (utils.py 30): INFO resuming model: [2024-09-01 11:38:00 msvmambav3_tiny_224] (utils.py 37): INFO resuming model_ema: [2024-09-01 11:38:00 msvmambav3_tiny_224] (utils.py 61): INFO => loaded successfully './exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth' (epoch 295) [2024-09-01 11:38:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 234): INFO Start training [2024-09-01 11:38:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][660/1251] eta 0:15:35 lr 0.000011 wd 0.0500 time 0.2201 (1.5828) data time 0.0008 (0.0945) model time 0.2192 (1.4883) loss 2.8048 (3.1148) grad_norm 9.9349 (7.1017) loss_scale 512.0000 (512.0000) mem 7377MB [2024-09-01 11:38:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][670/1251] eta 0:08:23 lr 0.000011 wd 0.0500 time 0.2248 (0.8659) data time 0.0010 (0.0453) model time 0.2238 (0.8206) loss 2.7531 (2.9512) grad_norm 5.9097 (inf) loss_scale 256.0000 (444.6316) mem 7377MB [2024-09-01 11:38:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][680/1251] eta 0:06:07 lr 0.000011 wd 0.0500 time 0.2234 (0.6442) data time 0.0009 (0.0300) model time 0.2225 (0.6142) loss 3.3545 (2.9910) grad_norm 8.0820 (inf) loss_scale 256.0000 (379.5862) mem 7377MB [2024-09-01 11:38:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][690/1251] eta 0:05:00 lr 0.000011 wd 0.0500 time 0.2226 (0.5362) data time 0.0010 (0.0226) model time 0.2216 (0.5136) loss 2.7360 (2.9272) grad_norm 4.7595 (inf) loss_scale 256.0000 (347.8974) mem 7377MB [2024-09-01 11:38:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][700/1251] eta 0:04:20 lr 0.000011 wd 0.0500 time 0.2210 (0.4721) data time 0.0010 (0.0182) model time 0.2200 (0.4540) loss 2.7090 (2.8966) grad_norm 5.5811 (inf) loss_scale 256.0000 (329.1429) mem 7377MB [2024-09-01 11:38:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][710/1251] 
eta 0:03:52 lr 0.000011 wd 0.0500 time 0.2248 (0.4301) data time 0.0009 (0.0153) model time 0.2239 (0.4148) loss 1.7334 (2.8361) grad_norm 3.9119 (inf) loss_scale 256.0000 (316.7458) mem 7377MB [2024-09-01 11:38:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][720/1251] eta 0:03:32 lr 0.000011 wd 0.0500 time 0.2279 (0.4002) data time 0.0008 (0.0132) model time 0.2271 (0.3870) loss 2.6541 (2.8050) grad_norm 3.9698 (inf) loss_scale 256.0000 (307.9420) mem 7377MB [2024-09-01 11:38:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][730/1251] eta 0:03:16 lr 0.000011 wd 0.0500 time 0.2257 (0.3778) data time 0.0011 (0.0117) model time 0.2246 (0.3661) loss 2.5104 (2.7863) grad_norm 5.5929 (inf) loss_scale 256.0000 (301.3671) mem 7377MB [2024-09-01 11:38:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][740/1251] eta 0:03:04 lr 0.000011 wd 0.0500 time 0.2265 (0.3605) data time 0.0009 (0.0105) model time 0.2256 (0.3501) loss 2.6125 (2.7655) grad_norm 8.1899 (inf) loss_scale 256.0000 (296.2697) mem 7377MB [2024-09-01 11:38:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][750/1251] eta 0:02:53 lr 0.000011 wd 0.0500 time 0.2251 (0.3466) data time 0.0011 (0.0095) model time 0.2240 (0.3372) loss 2.9786 (2.7736) grad_norm 12.4378 (inf) loss_scale 256.0000 (292.2020) mem 7377MB [2024-09-01 11:38:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][760/1251] eta 0:02:44 lr 0.000011 wd 0.0500 time 0.2282 (0.3353) data time 0.0009 (0.0087) model time 0.2274 (0.3266) loss 2.8710 (2.7822) grad_norm 6.9810 (inf) loss_scale 256.0000 (288.8807) mem 7377MB [2024-09-01 11:38:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][770/1251] eta 0:02:36 lr 0.000011 wd 0.0500 time 0.2236 (0.3259) data time 0.0010 (0.0081) model time 0.2226 (0.3178) loss 3.0753 (2.7796) grad_norm 5.6276 (inf) loss_scale 256.0000 (286.1176) mem 7377MB [2024-09-01 11:38:45 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][780/1251] eta 0:02:29 lr 0.000011 wd 0.0500 time 0.2233 (0.3180) data time 0.0007 (0.0075) model time 0.2226 (0.3105) loss 2.4929 (2.7624) grad_norm 4.6288 (inf) loss_scale 256.0000 (283.7829) mem 7377MB [2024-09-01 11:38:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][790/1251] eta 0:02:23 lr 0.000011 wd 0.0500 time 0.2192 (0.3111) data time 0.0008 (0.0070) model time 0.2184 (0.3040) loss 3.0147 (2.7583) grad_norm 5.1716 (inf) loss_scale 256.0000 (281.7842) mem 7377MB [2024-09-01 11:38:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][800/1251] eta 0:02:17 lr 0.000011 wd 0.0500 time 0.2218 (0.3051) data time 0.0007 (0.0066) model time 0.2212 (0.2985) loss 2.5388 (2.7535) grad_norm 5.7066 (inf) loss_scale 256.0000 (280.0537) mem 7377MB [2024-09-01 11:38:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][810/1251] eta 0:02:12 lr 0.000011 wd 0.0500 time 0.2203 (0.2999) data time 0.0009 (0.0063) model time 0.2194 (0.2936) loss 3.3636 (2.7514) grad_norm 6.0283 (inf) loss_scale 256.0000 (278.5409) mem 7377MB [2024-09-01 11:38:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][820/1251] eta 0:02:07 lr 0.000011 wd 0.0500 time 0.2235 (0.2954) data time 0.0012 (0.0060) model time 0.2223 (0.2894) loss 2.6574 (2.7545) grad_norm 6.5077 (inf) loss_scale 256.0000 (277.2071) mem 7377MB [2024-09-01 11:38:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][830/1251] eta 0:02:02 lr 0.000011 wd 0.0500 time 0.2251 (0.2914) data time 0.0008 (0.0057) model time 0.2243 (0.2857) loss 2.7034 (2.7343) grad_norm 4.8748 (inf) loss_scale 256.0000 (276.0223) mem 7377MB [2024-09-01 11:38:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][840/1251] eta 0:01:58 lr 0.000011 wd 0.0500 time 0.2244 (0.2879) data time 0.0008 (0.0054) model time 0.2236 (0.2825) loss 3.2655 (2.7359) grad_norm 8.9765 
(inf) loss_scale 256.0000 (274.9630) mem 7377MB [2024-09-01 11:39:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][850/1251] eta 0:01:54 lr 0.000011 wd 0.0500 time 0.2221 (0.2847) data time 0.0009 (0.0052) model time 0.2212 (0.2795) loss 1.9712 (2.7197) grad_norm 7.3956 (inf) loss_scale 256.0000 (274.0101) mem 7377MB [2024-09-01 11:39:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][860/1251] eta 0:01:50 lr 0.000011 wd 0.0500 time 0.2190 (0.2817) data time 0.0007 (0.0050) model time 0.2183 (0.2767) loss 2.7568 (2.7121) grad_norm 5.9147 (inf) loss_scale 256.0000 (273.1483) mem 7377MB [2024-09-01 11:39:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][870/1251] eta 0:01:46 lr 0.000011 wd 0.0500 time 0.2251 (0.2791) data time 0.0009 (0.0048) model time 0.2243 (0.2742) loss 3.4587 (2.7069) grad_norm 4.6982 (inf) loss_scale 256.0000 (272.3653) mem 7377MB [2024-09-01 11:39:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][880/1251] eta 0:01:42 lr 0.000011 wd 0.0500 time 0.2196 (0.2766) data time 0.0008 (0.0047) model time 0.2188 (0.2719) loss 2.2527 (2.7051) grad_norm 4.5973 (inf) loss_scale 256.0000 (271.6507) mem 7377MB [2024-09-01 11:39:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][890/1251] eta 0:01:39 lr 0.000011 wd 0.0500 time 0.2184 (0.2743) data time 0.0006 (0.0045) model time 0.2178 (0.2698) loss 1.7415 (2.6982) grad_norm 5.9240 (inf) loss_scale 256.0000 (270.9958) mem 7377MB [2024-09-01 11:39:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][900/1251] eta 0:01:35 lr 0.000011 wd 0.0500 time 0.2215 (0.2722) data time 0.0009 (0.0043) model time 0.2205 (0.2679) loss 2.7981 (2.6964) grad_norm 6.5337 (inf) loss_scale 256.0000 (270.3936) mem 7377MB [2024-09-01 11:39:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][910/1251] eta 0:01:32 lr 0.000011 wd 0.0500 time 0.2265 (0.2703) data time 0.0011 (0.0042) 
model time 0.2254 (0.2661) loss 2.9984 (2.6868) grad_norm 4.8312 (inf) loss_scale 256.0000 (269.8378) mem 7377MB [2024-09-01 11:39:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][920/1251] eta 0:01:28 lr 0.000011 wd 0.0500 time 0.2274 (0.2687) data time 0.0009 (0.0041) model time 0.2265 (0.2646) loss 1.9073 (2.6765) grad_norm 6.8042 (inf) loss_scale 256.0000 (269.3234) mem 7377MB [2024-09-01 11:39:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][930/1251] eta 0:01:25 lr 0.000011 wd 0.0500 time 0.2227 (0.2670) data time 0.0011 (0.0040) model time 0.2216 (0.2630) loss 2.8871 (2.6827) grad_norm 6.9654 (inf) loss_scale 256.0000 (268.8459) mem 7377MB [2024-09-01 11:39:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][940/1251] eta 0:01:22 lr 0.000011 wd 0.0500 time 0.2174 (0.2662) data time 0.0010 (0.0039) model time 0.2164 (0.2623) loss 2.6531 (2.6835) grad_norm 3.9800 (inf) loss_scale 256.0000 (268.4014) mem 7377MB [2024-09-01 11:39:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][950/1251] eta 0:01:19 lr 0.000011 wd 0.0500 time 0.2306 (0.2648) data time 0.0008 (0.0038) model time 0.2299 (0.2610) loss 3.0746 (2.6742) grad_norm 4.8763 (inf) loss_scale 256.0000 (267.9866) mem 7377MB [2024-09-01 11:39:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][960/1251] eta 0:01:16 lr 0.000011 wd 0.0500 time 0.2232 (0.2642) data time 0.0009 (0.0037) model time 0.2223 (0.2605) loss 2.9401 (2.6718) grad_norm 15.6987 (inf) loss_scale 256.0000 (267.5987) mem 7377MB [2024-09-01 11:39:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][970/1251] eta 0:01:13 lr 0.000011 wd 0.0500 time 0.2218 (0.2629) data time 0.0008 (0.0036) model time 0.2210 (0.2593) loss 3.0080 (2.6799) grad_norm 11.8291 (inf) loss_scale 256.0000 (267.2351) mem 7377MB [2024-09-01 11:39:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][980/1251] eta 0:01:10 
lr 0.000011 wd 0.0500 time 0.2229 (0.2616) data time 0.0009 (0.0035) model time 0.2220 (0.2581) loss 1.4483 (2.6817) grad_norm 6.0097 (inf) loss_scale 256.0000 (266.8936) mem 7377MB [2024-09-01 11:39:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][990/1251] eta 0:01:07 lr 0.000011 wd 0.0500 time 0.2196 (0.2605) data time 0.0009 (0.0034) model time 0.2187 (0.2571) loss 2.4679 (2.6815) grad_norm 20.5329 (inf) loss_scale 256.0000 (266.5723) mem 7377MB [2024-09-01 11:39:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1000/1251] eta 0:01:05 lr 0.000011 wd 0.0500 time 0.2219 (0.2594) data time 0.0024 (0.0034) model time 0.2195 (0.2560) loss 2.6352 (2.6842) grad_norm 5.8257 (inf) loss_scale 256.0000 (266.2693) mem 7377MB [2024-09-01 11:39:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1010/1251] eta 0:01:02 lr 0.000011 wd 0.0500 time 0.2230 (0.2584) data time 0.0007 (0.0033) model time 0.2224 (0.2551) loss 2.4544 (2.6836) grad_norm 4.1802 (inf) loss_scale 256.0000 (265.9833) mem 7377MB [2024-09-01 11:39:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1020/1251] eta 0:00:59 lr 0.000011 wd 0.0500 time 0.2218 (0.2574) data time 0.0009 (0.0032) model time 0.2210 (0.2542) loss 2.6751 (2.6816) grad_norm 5.1595 (inf) loss_scale 256.0000 (265.7127) mem 7377MB [2024-09-01 11:39:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1030/1251] eta 0:00:56 lr 0.000011 wd 0.0500 time 0.2263 (0.2565) data time 0.0006 (0.0032) model time 0.2257 (0.2533) loss 3.0355 (2.6830) grad_norm 7.2324 (inf) loss_scale 256.0000 (265.4565) mem 7377MB [2024-09-01 11:39:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1040/1251] eta 0:00:53 lr 0.000011 wd 0.0500 time 0.2220 (0.2557) data time 0.0008 (0.0031) model time 0.2212 (0.2525) loss 2.7023 (2.6767) grad_norm 7.4980 (inf) loss_scale 256.0000 (265.2134) mem 7377MB [2024-09-01 11:39:46 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [295/300][1050/1251] eta 0:00:51 lr 0.000011 wd 0.0500 time 0.2235 (0.2549) data time 0.0008 (0.0031) model time 0.2227 (0.2518) loss 2.8960 (2.6788) grad_norm 5.6282 (inf) loss_scale 256.0000 (264.9825) mem 7377MB [2024-09-01 11:39:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1060/1251] eta 0:00:48 lr 0.000011 wd 0.0500 time 0.2262 (0.2542) data time 0.0009 (0.0030) model time 0.2254 (0.2512) loss 2.6519 (2.6838) grad_norm 4.8694 (inf) loss_scale 256.0000 (264.7628) mem 7377MB [2024-09-01 11:39:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1070/1251] eta 0:00:45 lr 0.000011 wd 0.0500 time 0.2283 (0.2535) data time 0.0008 (0.0030) model time 0.2274 (0.2505) loss 1.8570 (2.6838) grad_norm 7.2102 (inf) loss_scale 256.0000 (264.5537) mem 7377MB [2024-09-01 11:39:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1080/1251] eta 0:00:43 lr 0.000011 wd 0.0500 time 0.2241 (0.2528) data time 0.0008 (0.0029) model time 0.2234 (0.2499) loss 2.9175 (2.6891) grad_norm 4.4376 (inf) loss_scale 256.0000 (264.3543) mem 7377MB [2024-09-01 11:39:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1090/1251] eta 0:00:40 lr 0.000011 wd 0.0500 time 0.2230 (0.2522) data time 0.0010 (0.0029) model time 0.2220 (0.2493) loss 2.5593 (2.6926) grad_norm 10.3868 (inf) loss_scale 256.0000 (264.1640) mem 7377MB [2024-09-01 11:39:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1100/1251] eta 0:00:37 lr 0.000011 wd 0.0500 time 0.2332 (0.2516) data time 0.0008 (0.0028) model time 0.2324 (0.2488) loss 2.6177 (2.6922) grad_norm 8.9518 (inf) loss_scale 256.0000 (263.9822) mem 7377MB [2024-09-01 11:39:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1110/1251] eta 0:00:35 lr 0.000011 wd 0.0500 time 0.2262 (0.2511) data time 0.0011 (0.0028) model time 0.2251 (0.2483) loss 2.3925 (2.6882) grad_norm 3.5460 (inf) loss_scale 
256.0000 (263.8083) mem 7377MB [2024-09-01 11:40:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1120/1251] eta 0:00:32 lr 0.000011 wd 0.0500 time 0.2270 (0.2506) data time 0.0006 (0.0028) model time 0.2264 (0.2478) loss 1.6465 (2.6828) grad_norm 4.0623 (inf) loss_scale 256.0000 (263.6418) mem 7377MB [2024-09-01 11:40:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1130/1251] eta 0:00:30 lr 0.000011 wd 0.0500 time 0.2222 (0.2500) data time 0.0008 (0.0027) model time 0.2214 (0.2473) loss 2.9873 (2.6817) grad_norm 6.5283 (inf) loss_scale 256.0000 (263.4823) mem 7377MB [2024-09-01 11:40:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1140/1251] eta 0:00:27 lr 0.000011 wd 0.0500 time 0.2226 (0.2494) data time 0.0008 (0.0027) model time 0.2218 (0.2468) loss 2.5929 (2.6852) grad_norm 7.2752 (inf) loss_scale 256.0000 (263.3292) mem 7377MB [2024-09-01 11:40:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1150/1251] eta 0:00:25 lr 0.000011 wd 0.0500 time 0.2246 (0.2489) data time 0.0008 (0.0026) model time 0.2238 (0.2463) loss 2.7402 (2.6829) grad_norm 5.3007 (inf) loss_scale 256.0000 (263.1824) mem 7377MB [2024-09-01 11:40:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1160/1251] eta 0:00:22 lr 0.000011 wd 0.0500 time 0.2310 (0.2484) data time 0.0006 (0.0026) model time 0.2304 (0.2458) loss 3.2421 (2.6831) grad_norm 6.9993 (inf) loss_scale 256.0000 (263.0413) mem 7377MB [2024-09-01 11:40:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1170/1251] eta 0:00:20 lr 0.000011 wd 0.0500 time 0.2221 (0.2480) data time 0.0011 (0.0026) model time 0.2210 (0.2454) loss 2.0233 (2.6849) grad_norm 4.8914 (inf) loss_scale 256.0000 (262.9056) mem 7377MB [2024-09-01 11:40:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1180/1251] eta 0:00:17 lr 0.000011 wd 0.0500 time 0.2263 (0.2475) data time 0.0006 (0.0025) model 
time 0.2257 (0.2450) loss 3.2023 (2.6800) grad_norm 10.3179 (inf) loss_scale 256.0000 (262.7750) mem 7377MB [2024-09-01 11:40:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1190/1251] eta 0:00:15 lr 0.000011 wd 0.0500 time 0.2166 (0.2470) data time 0.0010 (0.0025) model time 0.2155 (0.2445) loss 2.5523 (2.6799) grad_norm 7.3230 (inf) loss_scale 256.0000 (262.6494) mem 7377MB [2024-09-01 11:40:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1200/1251] eta 0:00:12 lr 0.000011 wd 0.0500 time 0.2224 (0.2466) data time 0.0008 (0.0025) model time 0.2216 (0.2442) loss 3.1549 (2.6818) grad_norm 13.2134 (inf) loss_scale 256.0000 (262.5282) mem 7377MB [2024-09-01 11:40:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1210/1251] eta 0:00:10 lr 0.000011 wd 0.0500 time 0.2240 (0.2462) data time 0.0007 (0.0025) model time 0.2233 (0.2438) loss 3.4134 (2.6841) grad_norm 6.0139 (inf) loss_scale 256.0000 (262.4114) mem 7377MB [2024-09-01 11:40:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1220/1251] eta 0:00:07 lr 0.000011 wd 0.0500 time 0.2211 (0.2458) data time 0.0008 (0.0024) model time 0.2203 (0.2434) loss 2.9500 (2.6862) grad_norm 5.0440 (inf) loss_scale 256.0000 (262.2988) mem 7377MB [2024-09-01 11:40:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1230/1251] eta 0:00:05 lr 0.000011 wd 0.0500 time 0.2238 (0.2455) data time 0.0007 (0.0024) model time 0.2231 (0.2431) loss 3.2053 (2.6878) grad_norm 10.3766 (inf) loss_scale 256.0000 (262.1900) mem 7377MB [2024-09-01 11:40:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1240/1251] eta 0:00:02 lr 0.000011 wd 0.0500 time 0.2168 (0.2450) data time 0.0005 (0.0024) model time 0.2163 (0.2426) loss 2.1559 (2.6878) grad_norm 5.8152 (inf) loss_scale 256.0000 (262.0849) mem 7377MB [2024-09-01 11:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [295/300][1250/1251] eta 0:00:00 
lr 0.000010 wd 0.0500 time 0.2128 (0.2445) data time 0.0004 (0.0024) model time 0.2124 (0.2421) loss 2.5922 (2.6898) grad_norm 7.5527 (inf) loss_scale 256.0000 (261.9833) mem 7377MB [2024-09-01 11:40:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 295 training takes 0:02:26 [2024-09-01 11:40:30 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:40:32 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:40:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.331 (0.331) Loss 0.3899 (0.3899) Acc@1 93.262 (93.262) Acc@5 98.926 (98.926) Mem 7377MB [2024-09-01 11:40:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.071 (0.094) Loss 0.5732 (0.6136) Acc@1 90.527 (87.784) Acc@5 97.949 (97.798) Mem 7377MB [2024-09-01 11:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.082) Loss 0.9233 (0.6438) Acc@1 78.027 (86.649) Acc@5 95.215 (97.703) Mem 7377MB [2024-09-01 11:40:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.078) Loss 1.1455 (0.7390) Acc@1 74.316 (84.394) Acc@5 93.164 (96.752) Mem 7377MB [2024-09-01 11:40:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.063 (0.074) Loss 1.0303 (0.7891) Acc@1 77.051 (83.248) Acc@5 94.141 (96.272) Mem 7377MB [2024-09-01 11:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.800 Acc@5 96.200 [2024-09-01 11:40:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:40:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.698 (0.698) Loss 0.3914 (0.3914) Acc@1 93.262 (93.262) Acc@5 98.828 (98.828) Mem 7377MB [2024-09-01 11:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] 
Time 0.070 (0.132) Loss 0.5669 (0.6100) Acc@1 90.430 (87.802) Acc@5 97.949 (97.763) Mem 7377MB [2024-09-01 11:40:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.104) Loss 0.9165 (0.6406) Acc@1 78.320 (86.737) Acc@5 95.703 (97.731) Mem 7377MB [2024-09-01 11:40:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.073 (0.093) Loss 1.1475 (0.7339) Acc@1 74.512 (84.517) Acc@5 93.066 (96.768) Mem 7377MB [2024-09-01 11:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.085) Loss 1.0195 (0.7830) Acc@1 77.051 (83.344) Acc@5 94.336 (96.263) Mem 7377MB [2024-09-01 11:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.888 Acc@5 96.228 [2024-09-01 11:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:40:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.89% [2024-09-01 11:40:41 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 11:40:43 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! 
[2024-09-01 11:40:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][0/1251] eta 0:12:46 lr 0.000010 wd 0.0500 time 0.6128 (0.6128) data time 0.3731 (0.3731) model time 0.0000 (0.0000) loss 2.7393 (2.7393) grad_norm 9.0943 (9.0943) loss_scale 256.0000 (256.0000) mem 7380MB [2024-09-01 11:40:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][10/1251] eta 0:05:24 lr 0.000010 wd 0.0500 time 0.2268 (0.2614) data time 0.0008 (0.0348) model time 0.0000 (0.0000) loss 2.7488 (2.6107) grad_norm 8.4690 (6.5252) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:40:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][20/1251] eta 0:04:59 lr 0.000010 wd 0.0500 time 0.2214 (0.2436) data time 0.0011 (0.0187) model time 0.0000 (0.0000) loss 2.6108 (2.6391) grad_norm 6.7831 (6.3315) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:40:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][30/1251] eta 0:04:49 lr 0.000010 wd 0.0500 time 0.2289 (0.2374) data time 0.0007 (0.0130) model time 0.0000 (0.0000) loss 3.2026 (2.6979) grad_norm 7.1152 (6.0122) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:40:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][40/1251] eta 0:04:43 lr 0.000010 wd 0.0500 time 0.2179 (0.2341) data time 0.0010 (0.0101) model time 0.0000 (0.0000) loss 1.5907 (2.6759) grad_norm 6.7895 (5.9326) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:40:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][50/1251] eta 0:04:38 lr 0.000010 wd 0.0500 time 0.2238 (0.2321) data time 0.0010 (0.0083) model time 0.0000 (0.0000) loss 3.0468 (2.6553) grad_norm 6.1959 (6.2128) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:40:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][60/1251] eta 0:04:34 lr 0.000010 wd 0.0500 time 0.2227 (0.2309) data time 0.0008 (0.0071) model time 0.2219 (0.2236) loss 
2.5135 (2.6457) grad_norm 5.2203 (6.2432) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:40:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][70/1251] eta 0:04:31 lr 0.000010 wd 0.0500 time 0.2216 (0.2299) data time 0.0007 (0.0062) model time 0.2209 (0.2233) loss 2.9057 (2.6688) grad_norm 5.2932 (6.1599) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][80/1251] eta 0:04:28 lr 0.000010 wd 0.0500 time 0.2237 (0.2293) data time 0.0009 (0.0056) model time 0.2227 (0.2236) loss 2.3693 (2.6805) grad_norm 7.9472 (6.1912) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][90/1251] eta 0:04:25 lr 0.000010 wd 0.0500 time 0.2262 (0.2286) data time 0.0008 (0.0051) model time 0.2254 (0.2233) loss 1.8579 (2.6689) grad_norm 3.6348 (6.1565) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][100/1251] eta 0:04:22 lr 0.000010 wd 0.0500 time 0.2270 (0.2284) data time 0.0007 (0.0047) model time 0.2262 (0.2236) loss 2.3458 (2.6586) grad_norm 7.5057 (6.1607) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][110/1251] eta 0:04:20 lr 0.000010 wd 0.0500 time 0.2233 (0.2283) data time 0.0009 (0.0043) model time 0.2223 (0.2241) loss 3.0847 (2.6640) grad_norm 6.1893 (6.1512) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][120/1251] eta 0:04:17 lr 0.000010 wd 0.0500 time 0.2236 (0.2281) data time 0.0008 (0.0040) model time 0.2228 (0.2242) loss 2.1166 (2.6422) grad_norm 4.0616 (6.1193) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][130/1251] eta 0:04:15 lr 0.000010 wd 
0.0500 time 0.2216 (0.2276) data time 0.0011 (0.0038) model time 0.2206 (0.2239) loss 2.9125 (2.6519) grad_norm 7.8730 (6.1913) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][140/1251] eta 0:04:12 lr 0.000010 wd 0.0500 time 0.2180 (0.2274) data time 0.0009 (0.0036) model time 0.2170 (0.2238) loss 2.7313 (2.6644) grad_norm 4.6447 (6.1593) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][150/1251] eta 0:04:10 lr 0.000010 wd 0.0500 time 0.2221 (0.2271) data time 0.0011 (0.0034) model time 0.2210 (0.2236) loss 2.6777 (2.6608) grad_norm 3.8974 (6.2475) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][160/1251] eta 0:04:08 lr 0.000010 wd 0.0500 time 0.2206 (0.2281) data time 0.0009 (0.0033) model time 0.2198 (0.2254) loss 3.1223 (2.6626) grad_norm 6.0401 (6.2606) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][170/1251] eta 0:04:06 lr 0.000010 wd 0.0500 time 0.2175 (0.2279) data time 0.0011 (0.0031) model time 0.2164 (0.2251) loss 3.1340 (2.6694) grad_norm 7.7859 (6.2580) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][180/1251] eta 0:04:03 lr 0.000010 wd 0.0500 time 0.2187 (0.2276) data time 0.0009 (0.0030) model time 0.2178 (0.2249) loss 3.2681 (2.6732) grad_norm 4.1785 (6.2330) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][190/1251] eta 0:04:01 lr 0.000010 wd 0.0500 time 0.2230 (0.2274) data time 0.0010 (0.0029) model time 0.2220 (0.2247) loss 2.5934 (2.6709) grad_norm 5.8046 (6.2201) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:28 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [296/300][200/1251] eta 0:03:58 lr 0.000010 wd 0.0500 time 0.2201 (0.2272) data time 0.0008 (0.0028) model time 0.2193 (0.2245) loss 2.5297 (2.6710) grad_norm 5.7522 (6.2369) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][210/1251] eta 0:03:57 lr 0.000010 wd 0.0500 time 0.2243 (0.2278) data time 0.0006 (0.0027) model time 0.2238 (0.2254) loss 2.1803 (2.6492) grad_norm 4.7483 (6.2539) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:41:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][220/1251] eta 0:03:54 lr 0.000010 wd 0.0500 time 0.2268 (0.2276) data time 0.0006 (0.0026) model time 0.2262 (0.2253) loss 1.6219 (2.6448) grad_norm 3.8074 (inf) loss_scale 128.0000 (252.5249) mem 7381MB [2024-09-01 11:41:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][230/1251] eta 0:03:52 lr 0.000010 wd 0.0500 time 0.2247 (0.2274) data time 0.0008 (0.0026) model time 0.2239 (0.2252) loss 3.1257 (2.6405) grad_norm 4.0134 (inf) loss_scale 128.0000 (247.1342) mem 7381MB [2024-09-01 11:41:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][240/1251] eta 0:03:49 lr 0.000010 wd 0.0500 time 0.2240 (0.2272) data time 0.0008 (0.0025) model time 0.2232 (0.2250) loss 2.4536 (2.6359) grad_norm 4.2566 (inf) loss_scale 128.0000 (242.1909) mem 7381MB [2024-09-01 11:41:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][250/1251] eta 0:03:47 lr 0.000010 wd 0.0500 time 0.2258 (0.2270) data time 0.0006 (0.0024) model time 0.2252 (0.2248) loss 3.3958 (2.6339) grad_norm 7.8372 (inf) loss_scale 128.0000 (237.6414) mem 7381MB [2024-09-01 11:41:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][260/1251] eta 0:03:44 lr 0.000010 wd 0.0500 time 0.2230 (0.2268) data time 0.0006 (0.0024) model time 0.2224 (0.2246) loss 2.3093 (2.6364) grad_norm 6.5848 (inf) loss_scale 
128.0000 (233.4406) mem 7381MB [2024-09-01 11:41:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][270/1251] eta 0:03:42 lr 0.000010 wd 0.0500 time 0.2255 (0.2267) data time 0.0007 (0.0023) model time 0.2248 (0.2245) loss 3.2150 (2.6417) grad_norm 7.3469 (inf) loss_scale 128.0000 (229.5498) mem 7381MB [2024-09-01 11:41:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][280/1251] eta 0:03:39 lr 0.000010 wd 0.0500 time 0.2221 (0.2265) data time 0.0010 (0.0023) model time 0.2211 (0.2244) loss 1.7534 (2.6323) grad_norm 5.0242 (inf) loss_scale 128.0000 (225.9359) mem 7381MB [2024-09-01 11:41:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][290/1251] eta 0:03:37 lr 0.000010 wd 0.0500 time 0.2249 (0.2265) data time 0.0010 (0.0022) model time 0.2238 (0.2243) loss 1.7490 (2.6308) grad_norm 5.2383 (inf) loss_scale 128.0000 (222.5704) mem 7381MB [2024-09-01 11:41:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][300/1251] eta 0:03:35 lr 0.000010 wd 0.0500 time 0.2188 (0.2263) data time 0.0008 (0.0022) model time 0.2181 (0.2242) loss 2.9055 (2.6297) grad_norm 6.0151 (inf) loss_scale 128.0000 (219.4286) mem 7381MB [2024-09-01 11:41:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][310/1251] eta 0:03:32 lr 0.000010 wd 0.0500 time 0.2257 (0.2262) data time 0.0007 (0.0021) model time 0.2249 (0.2242) loss 2.9826 (2.6310) grad_norm 3.7213 (inf) loss_scale 128.0000 (216.4887) mem 7381MB [2024-09-01 11:41:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][320/1251] eta 0:03:30 lr 0.000010 wd 0.0500 time 0.2228 (0.2262) data time 0.0010 (0.0021) model time 0.2218 (0.2241) loss 2.6770 (2.6336) grad_norm 6.2235 (inf) loss_scale 128.0000 (213.7321) mem 7381MB [2024-09-01 11:41:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][330/1251] eta 0:03:28 lr 0.000010 wd 0.0500 time 0.2243 (0.2261) data time 0.0011 (0.0021) model time 
0.2233 (0.2240) loss 2.6741 (2.6413) grad_norm 4.3498 (inf) loss_scale 128.0000 (211.1420) mem 7381MB [2024-09-01 11:42:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][340/1251] eta 0:03:25 lr 0.000010 wd 0.0500 time 0.2222 (0.2260) data time 0.0009 (0.0020) model time 0.2213 (0.2240) loss 2.8974 (2.6392) grad_norm 4.2416 (inf) loss_scale 128.0000 (208.7038) mem 7381MB [2024-09-01 11:42:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][350/1251] eta 0:03:23 lr 0.000010 wd 0.0500 time 0.2286 (0.2260) data time 0.0007 (0.0020) model time 0.2280 (0.2240) loss 1.8453 (2.6276) grad_norm 7.9079 (inf) loss_scale 128.0000 (206.4046) mem 7381MB [2024-09-01 11:42:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][360/1251] eta 0:03:21 lr 0.000010 wd 0.0500 time 0.2219 (0.2259) data time 0.0007 (0.0020) model time 0.2212 (0.2240) loss 2.9754 (2.6272) grad_norm 7.8240 (inf) loss_scale 128.0000 (204.2327) mem 7381MB [2024-09-01 11:42:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][370/1251] eta 0:03:19 lr 0.000010 wd 0.0500 time 0.2264 (0.2259) data time 0.0009 (0.0019) model time 0.2255 (0.2240) loss 2.6306 (2.6299) grad_norm 6.8178 (inf) loss_scale 128.0000 (202.1779) mem 7381MB [2024-09-01 11:42:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][380/1251] eta 0:03:16 lr 0.000010 wd 0.0500 time 0.2248 (0.2259) data time 0.0006 (0.0019) model time 0.2242 (0.2240) loss 3.2290 (2.6334) grad_norm 12.9473 (inf) loss_scale 128.0000 (200.2310) mem 7381MB [2024-09-01 11:42:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][390/1251] eta 0:03:14 lr 0.000010 wd 0.0500 time 0.2259 (0.2258) data time 0.0009 (0.0019) model time 0.2250 (0.2240) loss 2.8311 (2.6319) grad_norm 3.7805 (inf) loss_scale 128.0000 (198.3836) mem 7381MB [2024-09-01 11:42:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][400/1251] eta 0:03:12 lr 0.000010 
wd 0.0500 time 0.2376 (0.2258) data time 0.0007 (0.0019) model time 0.2369 (0.2240) loss 2.6000 (2.6312) grad_norm 7.4428 (inf) loss_scale 128.0000 (196.6284) mem 7381MB [2024-09-01 11:42:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][410/1251] eta 0:03:09 lr 0.000010 wd 0.0500 time 0.2249 (0.2258) data time 0.0010 (0.0018) model time 0.2239 (0.2239) loss 2.6937 (2.6324) grad_norm 8.3595 (inf) loss_scale 128.0000 (194.9586) mem 7381MB [2024-09-01 11:42:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][420/1251] eta 0:03:07 lr 0.000010 wd 0.0500 time 0.2215 (0.2257) data time 0.0008 (0.0018) model time 0.2206 (0.2239) loss 2.7670 (2.6325) grad_norm 5.6375 (inf) loss_scale 128.0000 (193.3682) mem 7381MB [2024-09-01 11:42:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][430/1251] eta 0:03:05 lr 0.000010 wd 0.0500 time 0.2230 (0.2256) data time 0.0006 (0.0018) model time 0.2224 (0.2238) loss 2.0505 (2.6276) grad_norm 10.8225 (inf) loss_scale 128.0000 (191.8515) mem 7381MB [2024-09-01 11:42:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][440/1251] eta 0:03:02 lr 0.000010 wd 0.0500 time 0.2249 (0.2256) data time 0.0007 (0.0018) model time 0.2243 (0.2238) loss 2.6636 (2.6313) grad_norm 4.2580 (inf) loss_scale 128.0000 (190.4036) mem 7381MB [2024-09-01 11:42:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][450/1251] eta 0:03:00 lr 0.000010 wd 0.0500 time 0.2351 (0.2256) data time 0.0008 (0.0018) model time 0.2342 (0.2238) loss 2.5089 (2.6321) grad_norm 4.5274 (inf) loss_scale 128.0000 (189.0200) mem 7381MB [2024-09-01 11:42:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][460/1251] eta 0:02:58 lr 0.000010 wd 0.0500 time 0.2265 (0.2256) data time 0.0009 (0.0017) model time 0.2256 (0.2238) loss 2.7321 (2.6364) grad_norm 14.3835 (inf) loss_scale 128.0000 (187.6963) mem 7381MB [2024-09-01 11:42:29 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [296/300][470/1251] eta 0:02:56 lr 0.000010 wd 0.0500 time 0.2251 (0.2256) data time 0.0008 (0.0017) model time 0.2243 (0.2238) loss 2.4774 (2.6351) grad_norm 13.5413 (inf) loss_scale 128.0000 (186.4289) mem 7381MB [2024-09-01 11:42:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][480/1251] eta 0:02:53 lr 0.000010 wd 0.0500 time 0.2305 (0.2255) data time 0.0006 (0.0017) model time 0.2299 (0.2238) loss 3.0434 (2.6369) grad_norm 8.1070 (inf) loss_scale 128.0000 (185.2141) mem 7381MB [2024-09-01 11:42:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][490/1251] eta 0:02:51 lr 0.000010 wd 0.0500 time 0.2276 (0.2255) data time 0.0008 (0.0017) model time 0.2268 (0.2238) loss 3.2876 (2.6346) grad_norm 7.0466 (inf) loss_scale 128.0000 (184.0489) mem 7381MB [2024-09-01 11:42:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][500/1251] eta 0:02:49 lr 0.000010 wd 0.0500 time 0.2242 (0.2255) data time 0.0010 (0.0017) model time 0.2232 (0.2238) loss 2.6551 (2.6326) grad_norm 5.5764 (inf) loss_scale 128.0000 (182.9301) mem 7381MB [2024-09-01 11:42:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][510/1251] eta 0:02:47 lr 0.000010 wd 0.0500 time 0.2255 (0.2254) data time 0.0008 (0.0017) model time 0.2246 (0.2238) loss 3.2408 (2.6329) grad_norm 6.8545 (inf) loss_scale 128.0000 (181.8552) mem 7381MB [2024-09-01 11:42:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][520/1251] eta 0:02:44 lr 0.000010 wd 0.0500 time 0.2218 (0.2254) data time 0.0008 (0.0017) model time 0.2210 (0.2237) loss 2.4851 (2.6375) grad_norm 4.8625 (inf) loss_scale 128.0000 (180.8215) mem 7381MB [2024-09-01 11:42:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][530/1251] eta 0:02:42 lr 0.000010 wd 0.0500 time 0.2186 (0.2253) data time 0.0010 (0.0016) model time 0.2176 (0.2237) loss 2.7487 (2.6355) grad_norm 4.0365 (inf) loss_scale 
128.0000 (179.8267) mem 7381MB [2024-09-01 11:42:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][540/1251] eta 0:02:40 lr 0.000010 wd 0.0500 time 0.2232 (0.2253) data time 0.0010 (0.0016) model time 0.2222 (0.2237) loss 3.2827 (2.6376) grad_norm 10.0380 (inf) loss_scale 128.0000 (178.8688) mem 7381MB [2024-09-01 11:42:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][550/1251] eta 0:02:37 lr 0.000010 wd 0.0500 time 0.2230 (0.2253) data time 0.0010 (0.0016) model time 0.2220 (0.2237) loss 2.3441 (2.6364) grad_norm 4.9497 (inf) loss_scale 128.0000 (177.9456) mem 7381MB [2024-09-01 11:42:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][560/1251] eta 0:02:35 lr 0.000010 wd 0.0500 time 0.2161 (0.2253) data time 0.0008 (0.0016) model time 0.2154 (0.2237) loss 1.9771 (2.6354) grad_norm 4.5051 (inf) loss_scale 128.0000 (177.0553) mem 7381MB [2024-09-01 11:42:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][570/1251] eta 0:02:33 lr 0.000010 wd 0.0500 time 0.2237 (0.2252) data time 0.0009 (0.0016) model time 0.2228 (0.2236) loss 2.8301 (2.6386) grad_norm 7.4891 (inf) loss_scale 128.0000 (176.1961) mem 7381MB [2024-09-01 11:42:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][580/1251] eta 0:02:31 lr 0.000010 wd 0.0500 time 0.2196 (0.2252) data time 0.0011 (0.0016) model time 0.2185 (0.2236) loss 2.8579 (2.6452) grad_norm 5.0189 (inf) loss_scale 128.0000 (175.3666) mem 7381MB [2024-09-01 11:42:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][590/1251] eta 0:02:28 lr 0.000010 wd 0.0500 time 0.2236 (0.2252) data time 0.0006 (0.0016) model time 0.2230 (0.2236) loss 2.8175 (2.6490) grad_norm 6.4148 (inf) loss_scale 128.0000 (174.5651) mem 7381MB [2024-09-01 11:42:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][600/1251] eta 0:02:26 lr 0.000010 wd 0.0500 time 0.2166 (0.2252) data time 0.0008 (0.0016) model time 
0.2158 (0.2236) loss 2.9763 (2.6506) grad_norm 5.8527 (inf) loss_scale 128.0000 (173.7903) mem 7381MB [2024-09-01 11:43:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][610/1251] eta 0:02:24 lr 0.000010 wd 0.0500 time 0.2255 (0.2252) data time 0.0009 (0.0015) model time 0.2246 (0.2236) loss 2.5349 (2.6514) grad_norm 7.1600 (inf) loss_scale 128.0000 (173.0409) mem 7381MB [2024-09-01 11:43:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][620/1251] eta 0:02:22 lr 0.000010 wd 0.0500 time 0.2197 (0.2251) data time 0.0010 (0.0015) model time 0.2188 (0.2236) loss 2.5287 (2.6495) grad_norm 11.8382 (inf) loss_scale 128.0000 (172.3156) mem 7381MB [2024-09-01 11:43:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][630/1251] eta 0:02:19 lr 0.000010 wd 0.0500 time 0.2198 (0.2251) data time 0.0010 (0.0015) model time 0.2189 (0.2236) loss 2.5575 (2.6476) grad_norm 5.1887 (inf) loss_scale 128.0000 (171.6133) mem 7381MB [2024-09-01 11:43:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][640/1251] eta 0:02:17 lr 0.000010 wd 0.0500 time 0.2189 (0.2251) data time 0.0007 (0.0015) model time 0.2182 (0.2236) loss 2.6262 (2.6467) grad_norm 8.0359 (inf) loss_scale 128.0000 (170.9329) mem 7381MB [2024-09-01 11:43:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][650/1251] eta 0:02:15 lr 0.000010 wd 0.0500 time 0.2240 (0.2251) data time 0.0010 (0.0015) model time 0.2230 (0.2236) loss 1.6631 (2.6451) grad_norm 6.6230 (inf) loss_scale 128.0000 (170.2734) mem 7381MB [2024-09-01 11:43:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][660/1251] eta 0:02:13 lr 0.000010 wd 0.0500 time 0.2215 (0.2251) data time 0.0009 (0.0015) model time 0.2207 (0.2235) loss 2.9218 (2.6468) grad_norm 8.4402 (inf) loss_scale 128.0000 (169.6339) mem 7381MB [2024-09-01 11:43:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][670/1251] eta 0:02:10 lr 0.000010 
wd 0.0500 time 0.2199 (0.2250) data time 0.0009 (0.0015) model time 0.2190 (0.2235) loss 2.6364 (2.6509) grad_norm 5.3216 (inf) loss_scale 128.0000 (169.0134) mem 7381MB [2024-09-01 11:43:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][680/1251] eta 0:02:08 lr 0.000010 wd 0.0500 time 0.2246 (0.2250) data time 0.0006 (0.0015) model time 0.2240 (0.2235) loss 3.1141 (2.6539) grad_norm 5.6381 (inf) loss_scale 128.0000 (168.4112) mem 7381MB [2024-09-01 11:43:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][690/1251] eta 0:02:06 lr 0.000010 wd 0.0500 time 0.2205 (0.2250) data time 0.0008 (0.0015) model time 0.2197 (0.2235) loss 2.9132 (2.6562) grad_norm 5.9019 (inf) loss_scale 128.0000 (167.8263) mem 7381MB [2024-09-01 11:43:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][700/1251] eta 0:02:03 lr 0.000010 wd 0.0500 time 0.2192 (0.2249) data time 0.0010 (0.0015) model time 0.2183 (0.2234) loss 3.0038 (2.6559) grad_norm 4.9719 (inf) loss_scale 128.0000 (167.2582) mem 7381MB [2024-09-01 11:43:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][710/1251] eta 0:02:01 lr 0.000010 wd 0.0500 time 0.2305 (0.2249) data time 0.0008 (0.0015) model time 0.2297 (0.2234) loss 3.0387 (2.6564) grad_norm 6.3749 (inf) loss_scale 128.0000 (166.7060) mem 7381MB [2024-09-01 11:43:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][720/1251] eta 0:01:59 lr 0.000010 wd 0.0500 time 0.2205 (0.2249) data time 0.0008 (0.0015) model time 0.2197 (0.2234) loss 2.7521 (2.6534) grad_norm 5.7220 (inf) loss_scale 128.0000 (166.1692) mem 7381MB [2024-09-01 11:43:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][730/1251] eta 0:01:57 lr 0.000010 wd 0.0500 time 0.2224 (0.2249) data time 0.0010 (0.0014) model time 0.2214 (0.2234) loss 2.9885 (2.6525) grad_norm 6.0142 (inf) loss_scale 128.0000 (165.6471) mem 7381MB [2024-09-01 11:43:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [296/300][740/1251] eta 0:01:55 lr 0.000010 wd 0.0500 time 0.2213 (0.2252) data time 0.0010 (0.0014) model time 0.2203 (0.2237) loss 3.2434 (2.6543) grad_norm 9.3972 (inf) loss_scale 128.0000 (165.1390) mem 7381MB [2024-09-01 11:43:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][750/1251] eta 0:01:52 lr 0.000010 wd 0.0500 time 0.2221 (0.2252) data time 0.0008 (0.0014) model time 0.2214 (0.2237) loss 3.1966 (2.6549) grad_norm 8.1062 (inf) loss_scale 128.0000 (164.6445) mem 7381MB [2024-09-01 11:43:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][760/1251] eta 0:01:50 lr 0.000010 wd 0.0500 time 0.2255 (0.2251) data time 0.0010 (0.0014) model time 0.2244 (0.2237) loss 2.6846 (2.6559) grad_norm 8.4074 (inf) loss_scale 128.0000 (164.1629) mem 7381MB [2024-09-01 11:43:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][770/1251] eta 0:01:48 lr 0.000010 wd 0.0500 time 0.2201 (0.2251) data time 0.0009 (0.0014) model time 0.2192 (0.2237) loss 2.8159 (2.6590) grad_norm 8.5656 (inf) loss_scale 128.0000 (163.6939) mem 7381MB [2024-09-01 11:43:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][780/1251] eta 0:01:46 lr 0.000010 wd 0.0500 time 0.2191 (0.2251) data time 0.0010 (0.0014) model time 0.2181 (0.2236) loss 2.6642 (2.6613) grad_norm 10.1187 (inf) loss_scale 128.0000 (163.2369) mem 7381MB [2024-09-01 11:43:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][790/1251] eta 0:01:43 lr 0.000010 wd 0.0500 time 0.2265 (0.2251) data time 0.0008 (0.0014) model time 0.2257 (0.2236) loss 2.9813 (2.6629) grad_norm 6.9875 (inf) loss_scale 128.0000 (162.7914) mem 7381MB [2024-09-01 11:43:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][800/1251] eta 0:01:41 lr 0.000010 wd 0.0500 time 0.2247 (0.2250) data time 0.0009 (0.0014) model time 0.2238 (0.2236) loss 2.6485 (2.6623) grad_norm 6.2137 (inf) loss_scale 
128.0000 (162.3571) mem 7381MB [2024-09-01 11:43:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][810/1251] eta 0:01:39 lr 0.000010 wd 0.0500 time 0.2226 (0.2250) data time 0.0008 (0.0014) model time 0.2218 (0.2236) loss 3.4239 (2.6651) grad_norm 7.2009 (inf) loss_scale 128.0000 (161.9334) mem 7381MB [2024-09-01 11:43:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][820/1251] eta 0:01:36 lr 0.000010 wd 0.0500 time 0.2251 (0.2250) data time 0.0010 (0.0014) model time 0.2241 (0.2236) loss 2.7134 (2.6651) grad_norm 11.9397 (inf) loss_scale 128.0000 (161.5201) mem 7381MB [2024-09-01 11:43:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][830/1251] eta 0:01:34 lr 0.000010 wd 0.0500 time 0.2226 (0.2250) data time 0.0007 (0.0014) model time 0.2220 (0.2236) loss 2.9382 (2.6647) grad_norm 3.8545 (inf) loss_scale 128.0000 (161.1167) mem 7381MB [2024-09-01 11:43:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][840/1251] eta 0:01:32 lr 0.000010 wd 0.0500 time 0.2292 (0.2250) data time 0.0008 (0.0014) model time 0.2283 (0.2236) loss 2.8209 (2.6645) grad_norm 6.0334 (inf) loss_scale 128.0000 (160.7229) mem 7381MB [2024-09-01 11:43:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][850/1251] eta 0:01:30 lr 0.000010 wd 0.0500 time 0.2433 (0.2250) data time 0.0009 (0.0014) model time 0.2425 (0.2236) loss 2.3633 (2.6645) grad_norm 3.4538 (inf) loss_scale 128.0000 (160.3384) mem 7381MB [2024-09-01 11:43:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][860/1251] eta 0:01:27 lr 0.000010 wd 0.0500 time 0.2296 (0.2250) data time 0.0009 (0.0014) model time 0.2288 (0.2236) loss 2.9016 (2.6656) grad_norm 5.9040 (inf) loss_scale 128.0000 (159.9628) mem 7381MB [2024-09-01 11:43:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][870/1251] eta 0:01:25 lr 0.000010 wd 0.0500 time 0.2236 (0.2250) data time 0.0009 (0.0014) model time 
0.2226 (0.2236) loss 2.8923 (2.6662) grad_norm 5.4776 (inf) loss_scale 128.0000 (159.5959) mem 7381MB [2024-09-01 11:44:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][880/1251] eta 0:01:23 lr 0.000010 wd 0.0500 time 0.2373 (0.2250) data time 0.0007 (0.0014) model time 0.2366 (0.2236) loss 2.1391 (2.6622) grad_norm 8.5614 (inf) loss_scale 128.0000 (159.2372) mem 7381MB [2024-09-01 11:44:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][890/1251] eta 0:01:21 lr 0.000010 wd 0.0500 time 0.2264 (0.2250) data time 0.0009 (0.0014) model time 0.2255 (0.2236) loss 1.6848 (2.6593) grad_norm 5.7001 (inf) loss_scale 128.0000 (158.8866) mem 7381MB [2024-09-01 11:44:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][900/1251] eta 0:01:18 lr 0.000010 wd 0.0500 time 0.2236 (0.2250) data time 0.0008 (0.0014) model time 0.2227 (0.2236) loss 2.8871 (2.6581) grad_norm 4.8371 (inf) loss_scale 128.0000 (158.5438) mem 7381MB [2024-09-01 11:44:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][910/1251] eta 0:01:16 lr 0.000010 wd 0.0500 time 0.2341 (0.2250) data time 0.0008 (0.0013) model time 0.2333 (0.2236) loss 2.1165 (2.6599) grad_norm 6.4012 (inf) loss_scale 128.0000 (158.2086) mem 7381MB [2024-09-01 11:44:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][920/1251] eta 0:01:14 lr 0.000010 wd 0.0500 time 0.2281 (0.2249) data time 0.0007 (0.0013) model time 0.2274 (0.2236) loss 2.9921 (2.6625) grad_norm 4.8108 (inf) loss_scale 128.0000 (157.8806) mem 7381MB [2024-09-01 11:44:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][930/1251] eta 0:01:12 lr 0.000010 wd 0.0500 time 0.2273 (0.2249) data time 0.0008 (0.0013) model time 0.2265 (0.2236) loss 2.5239 (2.6620) grad_norm 3.5285 (inf) loss_scale 128.0000 (157.5596) mem 7381MB [2024-09-01 11:44:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][940/1251] eta 0:01:09 lr 0.000010 wd 
0.0500 time 0.2235 (0.2249) data time 0.0007 (0.0013) model time 0.2228 (0.2236) loss 3.2886 (2.6630) grad_norm 8.5883 (inf) loss_scale 128.0000 (157.2455) mem 7381MB [2024-09-01 11:44:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][950/1251] eta 0:01:07 lr 0.000010 wd 0.0500 time 0.2284 (0.2249) data time 0.0009 (0.0013) model time 0.2275 (0.2236) loss 2.6417 (2.6604) grad_norm 5.8344 (inf) loss_scale 128.0000 (156.9380) mem 7381MB [2024-09-01 11:44:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][960/1251] eta 0:01:05 lr 0.000010 wd 0.0500 time 0.2258 (0.2249) data time 0.0009 (0.0013) model time 0.2249 (0.2236) loss 2.2768 (2.6612) grad_norm 6.0297 (inf) loss_scale 128.0000 (156.6368) mem 7381MB [2024-09-01 11:44:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][970/1251] eta 0:01:03 lr 0.000010 wd 0.0500 time 0.2287 (0.2249) data time 0.0006 (0.0013) model time 0.2281 (0.2236) loss 2.2331 (2.6571) grad_norm 6.8379 (inf) loss_scale 128.0000 (156.3419) mem 7381MB [2024-09-01 11:44:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][980/1251] eta 0:01:00 lr 0.000010 wd 0.0500 time 0.2229 (0.2249) data time 0.0009 (0.0013) model time 0.2220 (0.2236) loss 2.5324 (2.6566) grad_norm 5.0825 (inf) loss_scale 128.0000 (156.0530) mem 7381MB [2024-09-01 11:44:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][990/1251] eta 0:00:58 lr 0.000010 wd 0.0500 time 0.2238 (0.2249) data time 0.0006 (0.0013) model time 0.2231 (0.2236) loss 3.1256 (2.6582) grad_norm 6.7775 (inf) loss_scale 128.0000 (155.7699) mem 7381MB [2024-09-01 11:44:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1000/1251] eta 0:00:56 lr 0.000010 wd 0.0500 time 0.2247 (0.2249) data time 0.0009 (0.0013) model time 0.2239 (0.2235) loss 3.3653 (2.6599) grad_norm 7.0186 (inf) loss_scale 128.0000 (155.4925) mem 7381MB [2024-09-01 11:44:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [296/300][1010/1251] eta 0:00:54 lr 0.000010 wd 0.0500 time 0.2303 (0.2249) data time 0.0006 (0.0013) model time 0.2297 (0.2235) loss 2.4481 (2.6618) grad_norm 6.5636 (inf) loss_scale 128.0000 (155.2206) mem 7381MB [2024-09-01 11:44:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1020/1251] eta 0:00:51 lr 0.000010 wd 0.0500 time 0.2195 (0.2249) data time 0.0008 (0.0013) model time 0.2187 (0.2236) loss 2.4539 (2.6622) grad_norm 7.9877 (inf) loss_scale 128.0000 (154.9540) mem 7381MB [2024-09-01 11:44:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1030/1251] eta 0:00:49 lr 0.000010 wd 0.0500 time 0.2386 (0.2249) data time 0.0008 (0.0013) model time 0.2378 (0.2236) loss 2.6056 (2.6619) grad_norm 7.8147 (inf) loss_scale 128.0000 (154.6925) mem 7381MB [2024-09-01 11:44:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1040/1251] eta 0:00:47 lr 0.000010 wd 0.0500 time 0.2249 (0.2249) data time 0.0009 (0.0013) model time 0.2239 (0.2236) loss 1.8143 (2.6620) grad_norm 7.2431 (inf) loss_scale 128.0000 (154.4361) mem 7381MB [2024-09-01 11:44:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1050/1251] eta 0:00:45 lr 0.000010 wd 0.0500 time 0.2250 (0.2249) data time 0.0010 (0.0013) model time 0.2240 (0.2236) loss 2.9079 (2.6591) grad_norm 6.5560 (inf) loss_scale 128.0000 (154.1846) mem 7381MB [2024-09-01 11:44:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1060/1251] eta 0:00:42 lr 0.000010 wd 0.0500 time 0.2273 (0.2249) data time 0.0010 (0.0013) model time 0.2263 (0.2236) loss 2.7336 (2.6563) grad_norm 6.3726 (inf) loss_scale 128.0000 (153.9378) mem 7381MB [2024-09-01 11:44:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1070/1251] eta 0:00:40 lr 0.000010 wd 0.0500 time 0.2247 (0.2249) data time 0.0006 (0.0013) model time 0.2241 (0.2236) loss 2.9862 (2.6568) grad_norm 6.1264 (inf) loss_scale 
128.0000 (153.6956) mem 7381MB [2024-09-01 11:44:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1080/1251] eta 0:00:38 lr 0.000010 wd 0.0500 time 0.2215 (0.2248) data time 0.0007 (0.0013) model time 0.2209 (0.2235) loss 2.9870 (2.6567) grad_norm 10.0162 (inf) loss_scale 128.0000 (153.4579) mem 7381MB [2024-09-01 11:44:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1090/1251] eta 0:00:36 lr 0.000010 wd 0.0500 time 0.2224 (0.2250) data time 0.0008 (0.0013) model time 0.2216 (0.2237) loss 2.6261 (2.6571) grad_norm 6.2950 (inf) loss_scale 128.0000 (153.2246) mem 7381MB [2024-09-01 11:44:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1100/1251] eta 0:00:33 lr 0.000010 wd 0.0500 time 0.2255 (0.2250) data time 0.0009 (0.0013) model time 0.2247 (0.2237) loss 2.7323 (2.6596) grad_norm 4.8486 (inf) loss_scale 128.0000 (152.9955) mem 7381MB [2024-09-01 11:44:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1110/1251] eta 0:00:31 lr 0.000010 wd 0.0500 time 0.2192 (0.2250) data time 0.0010 (0.0013) model time 0.2182 (0.2237) loss 2.1993 (2.6598) grad_norm 4.7864 (inf) loss_scale 128.0000 (152.7705) mem 7381MB [2024-09-01 11:44:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1120/1251] eta 0:00:29 lr 0.000010 wd 0.0500 time 0.2175 (0.2250) data time 0.0010 (0.0013) model time 0.2166 (0.2237) loss 2.6845 (2.6598) grad_norm 5.8884 (inf) loss_scale 128.0000 (152.5495) mem 7381MB [2024-09-01 11:44:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1130/1251] eta 0:00:27 lr 0.000010 wd 0.0500 time 0.2225 (0.2249) data time 0.0008 (0.0013) model time 0.2217 (0.2237) loss 3.0458 (2.6603) grad_norm 6.8290 (inf) loss_scale 128.0000 (152.3324) mem 7381MB [2024-09-01 11:44:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1140/1251] eta 0:00:24 lr 0.000010 wd 0.0500 time 0.2181 (0.2249) data time 0.0009 (0.0013) model 
time 0.2171 (0.2236) loss 2.8870 (2.6592) grad_norm 6.0318 (inf) loss_scale 128.0000 (152.1192) mem 7381MB [2024-09-01 11:45:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1150/1251] eta 0:00:22 lr 0.000010 wd 0.0500 time 0.2331 (0.2249) data time 0.0010 (0.0013) model time 0.2321 (0.2237) loss 2.2056 (2.6584) grad_norm 5.8875 (inf) loss_scale 128.0000 (151.9096) mem 7381MB [2024-09-01 11:45:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1160/1251] eta 0:00:20 lr 0.000010 wd 0.0500 time 0.2198 (0.2249) data time 0.0007 (0.0013) model time 0.2191 (0.2237) loss 3.1019 (2.6576) grad_norm 5.5940 (inf) loss_scale 128.0000 (151.7037) mem 7381MB [2024-09-01 11:45:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1170/1251] eta 0:00:18 lr 0.000010 wd 0.0500 time 0.2219 (0.2249) data time 0.0008 (0.0013) model time 0.2211 (0.2236) loss 2.7948 (2.6591) grad_norm 4.3716 (inf) loss_scale 128.0000 (151.5013) mem 7381MB [2024-09-01 11:45:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1180/1251] eta 0:00:15 lr 0.000010 wd 0.0500 time 0.2306 (0.2249) data time 0.0010 (0.0012) model time 0.2295 (0.2236) loss 2.6663 (2.6584) grad_norm 6.1797 (inf) loss_scale 128.0000 (151.3023) mem 7381MB [2024-09-01 11:45:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1190/1251] eta 0:00:13 lr 0.000010 wd 0.0500 time 0.2224 (0.2249) data time 0.0008 (0.0012) model time 0.2216 (0.2236) loss 2.8236 (2.6575) grad_norm 4.4594 (inf) loss_scale 128.0000 (151.1066) mem 7381MB [2024-09-01 11:45:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1200/1251] eta 0:00:11 lr 0.000010 wd 0.0500 time 0.2262 (0.2249) data time 0.0009 (0.0012) model time 0.2253 (0.2236) loss 3.1130 (2.6596) grad_norm 6.6654 (inf) loss_scale 128.0000 (150.9142) mem 7381MB [2024-09-01 11:45:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1210/1251] eta 0:00:09 lr 
0.000010 wd 0.0500 time 0.2251 (0.2249) data time 0.0008 (0.0012) model time 0.2243 (0.2236) loss 2.2975 (2.6595) grad_norm 8.7357 (inf) loss_scale 128.0000 (150.7250) mem 7381MB [2024-09-01 11:45:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1220/1251] eta 0:00:06 lr 0.000010 wd 0.0500 time 0.2218 (0.2249) data time 0.0009 (0.0012) model time 0.2209 (0.2236) loss 2.6902 (2.6591) grad_norm 4.8882 (inf) loss_scale 128.0000 (150.5389) mem 7381MB [2024-09-01 11:45:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1230/1251] eta 0:00:04 lr 0.000010 wd 0.0500 time 0.2256 (0.2249) data time 0.0007 (0.0012) model time 0.2249 (0.2236) loss 3.2866 (2.6602) grad_norm 6.5145 (inf) loss_scale 128.0000 (150.3558) mem 7381MB [2024-09-01 11:45:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1240/1251] eta 0:00:02 lr 0.000010 wd 0.0500 time 0.2134 (0.2248) data time 0.0004 (0.0012) model time 0.2130 (0.2236) loss 3.5246 (2.6609) grad_norm 6.0902 (inf) loss_scale 128.0000 (150.1757) mem 7381MB [2024-09-01 11:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [296/300][1250/1251] eta 0:00:00 lr 0.000010 wd 0.0500 time 0.2112 (0.2247) data time 0.0006 (0.0012) model time 0.2107 (0.2235) loss 2.4598 (2.6596) grad_norm 6.9060 (inf) loss_scale 128.0000 (149.9984) mem 7381MB [2024-09-01 11:45:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 296 training takes 0:04:41 [2024-09-01 11:45:24 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:45:24 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 11:45:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.350 (0.350) Loss 0.3918 (0.3918) Acc@1 92.969 (92.969) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.068 (0.099) Loss 0.5669 (0.6123) Acc@1 90.234 (87.686) Acc@5 97.949 (97.754) Mem 7381MB [2024-09-01 11:45:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.065 (0.086) Loss 0.9341 (0.6440) Acc@1 77.344 (86.607) Acc@5 95.410 (97.670) Mem 7381MB [2024-09-01 11:45:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.070 (0.082) Loss 1.1602 (0.7393) Acc@1 74.707 (84.429) Acc@5 92.773 (96.699) Mem 7381MB [2024-09-01 11:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.077) Loss 1.0391 (0.7894) Acc@1 77.539 (83.260) Acc@5 94.043 (96.218) Mem 7381MB [2024-09-01 11:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.822 Acc@5 96.156 [2024-09-01 11:45:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:45:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.703 (0.703) Loss 0.3914 (0.3914) Acc@1 93.262 (93.262) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.134) Loss 0.5664 (0.6103) Acc@1 90.430 (87.793) Acc@5 97.949 (97.763) Mem 7381MB [2024-09-01 11:45:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.073 (0.104) Loss 0.9165 (0.6408) Acc@1 78.320 (86.751) Acc@5 95.605 (97.726) Mem 7381MB [2024-09-01 11:45:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.065 (0.092) Loss 1.1475 (0.7342) Acc@1 74.512 (84.517) Acc@5 93.066 (96.762) Mem 7381MB [2024-09-01 11:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.085) Loss 1.0205 
(0.7834) Acc@1 77.051 (83.348) Acc@5 94.336 (96.258) Mem 7381MB [2024-09-01 11:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.894 Acc@5 96.218 [2024-09-01 11:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:45:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 285): INFO New max accuracy ema: 82.89% [2024-09-01 11:45:32 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saving...... [2024-09-01 11:45:33 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/best_ckpt_ema.pth saved !!! [2024-09-01 11:45:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][0/1251] eta 0:13:22 lr 0.000010 wd 0.0500 time 0.6413 (0.6413) data time 0.4356 (0.4356) model time 0.0000 (0.0000) loss 1.8858 (1.8858) grad_norm 7.7341 (7.7341) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][10/1251] eta 0:05:27 lr 0.000010 wd 0.0500 time 0.2224 (0.2639) data time 0.0007 (0.0404) model time 0.0000 (0.0000) loss 2.4436 (2.5157) grad_norm 7.8420 (7.7998) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][20/1251] eta 0:05:00 lr 0.000010 wd 0.0500 time 0.2219 (0.2445) data time 0.0008 (0.0216) model time 0.0000 (0.0000) loss 2.9853 (2.4961) grad_norm 5.4044 (7.1986) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][30/1251] eta 0:04:50 lr 0.000010 wd 0.0500 time 0.2236 (0.2378) data time 0.0009 (0.0149) model time 0.0000 (0.0000) loss 2.8836 (2.6227) grad_norm 5.1607 (6.7213) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][40/1251] eta 0:04:44 lr 
0.000010 wd 0.0500 time 0.2307 (0.2348) data time 0.0012 (0.0115) model time 0.0000 (0.0000) loss 2.1806 (2.5945) grad_norm 7.3492 (6.5742) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][50/1251] eta 0:04:39 lr 0.000010 wd 0.0500 time 0.2237 (0.2329) data time 0.0006 (0.0094) model time 0.0000 (0.0000) loss 2.9592 (2.5928) grad_norm 5.4139 (6.6667) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][60/1251] eta 0:04:35 lr 0.000010 wd 0.0500 time 0.2249 (0.2315) data time 0.0008 (0.0080) model time 0.2241 (0.2235) loss 2.8746 (2.6285) grad_norm 6.0186 (6.5312) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][70/1251] eta 0:04:32 lr 0.000010 wd 0.0500 time 0.2257 (0.2304) data time 0.0008 (0.0070) model time 0.2248 (0.2233) loss 2.2345 (2.5805) grad_norm 5.4448 (6.4664) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][80/1251] eta 0:04:29 lr 0.000010 wd 0.0500 time 0.2280 (0.2297) data time 0.0007 (0.0063) model time 0.2273 (0.2234) loss 2.4265 (2.5779) grad_norm 5.2831 (6.5288) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][90/1251] eta 0:04:25 lr 0.000010 wd 0.0500 time 0.2264 (0.2290) data time 0.0007 (0.0057) model time 0.2257 (0.2232) loss 3.0891 (2.5779) grad_norm 6.6229 (6.5057) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][100/1251] eta 0:04:23 lr 0.000010 wd 0.0500 time 0.2229 (0.2285) data time 0.0008 (0.0052) model time 0.2221 (0.2232) loss 2.6500 (2.5784) grad_norm 5.7621 (6.6148) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:45:58 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][110/1251] eta 0:04:20 lr 0.000010 wd 0.0500 time 0.2219 (0.2281) data time 0.0009 (0.0048) model time 0.2210 (0.2232) loss 3.1582 (2.5776) grad_norm 8.6050 (6.6569) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][120/1251] eta 0:04:17 lr 0.000010 wd 0.0500 time 0.2240 (0.2278) data time 0.0008 (0.0045) model time 0.2232 (0.2232) loss 2.5047 (2.5626) grad_norm 5.5488 (6.6553) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][130/1251] eta 0:04:15 lr 0.000010 wd 0.0500 time 0.2278 (0.2276) data time 0.0007 (0.0042) model time 0.2270 (0.2234) loss 1.6993 (2.5604) grad_norm 4.5869 (6.6465) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][140/1251] eta 0:04:12 lr 0.000010 wd 0.0500 time 0.2261 (0.2273) data time 0.0007 (0.0040) model time 0.2255 (0.2232) loss 2.8450 (2.5635) grad_norm 39.0345 (6.9182) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][150/1251] eta 0:04:09 lr 0.000010 wd 0.0500 time 0.2263 (0.2271) data time 0.0008 (0.0038) model time 0.2255 (0.2232) loss 2.6535 (2.5625) grad_norm 5.5974 (6.9314) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][160/1251] eta 0:04:07 lr 0.000010 wd 0.0500 time 0.2238 (0.2268) data time 0.0006 (0.0036) model time 0.2232 (0.2231) loss 3.0090 (2.5778) grad_norm 6.5155 (6.9123) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][170/1251] eta 0:04:05 lr 0.000010 wd 0.0500 time 0.2281 (0.2267) data time 0.0011 (0.0034) model time 0.2270 (0.2232) loss 2.5483 (2.5795) 
grad_norm 4.9133 (6.8973) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][180/1251] eta 0:04:02 lr 0.000010 wd 0.0500 time 0.2338 (0.2266) data time 0.0010 (0.0033) model time 0.2328 (0.2233) loss 2.6809 (2.5924) grad_norm 4.1649 (6.8076) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][190/1251] eta 0:04:00 lr 0.000010 wd 0.0500 time 0.2221 (0.2265) data time 0.0009 (0.0032) model time 0.2211 (0.2232) loss 2.7139 (2.5976) grad_norm 10.5741 (6.8183) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][200/1251] eta 0:03:57 lr 0.000010 wd 0.0500 time 0.2233 (0.2263) data time 0.0007 (0.0031) model time 0.2226 (0.2231) loss 2.6196 (2.5994) grad_norm 6.9279 (6.7694) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][210/1251] eta 0:03:55 lr 0.000010 wd 0.0500 time 0.2292 (0.2261) data time 0.0008 (0.0030) model time 0.2284 (0.2230) loss 2.5263 (2.5985) grad_norm 4.7537 (6.7169) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][220/1251] eta 0:03:52 lr 0.000010 wd 0.0500 time 0.2235 (0.2260) data time 0.0009 (0.0029) model time 0.2226 (0.2230) loss 2.8908 (2.6026) grad_norm 5.3077 (6.6825) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][230/1251] eta 0:03:50 lr 0.000010 wd 0.0500 time 0.2236 (0.2259) data time 0.0007 (0.0028) model time 0.2228 (0.2230) loss 2.1646 (2.6043) grad_norm 8.4838 (6.6729) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][240/1251] eta 0:03:49 lr 0.000010 wd 0.0500 time 
0.2300 (0.2266) data time 0.0009 (0.0027) model time 0.2290 (0.2240) loss 2.8269 (2.6133) grad_norm 4.4185 (6.6099) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][250/1251] eta 0:03:46 lr 0.000010 wd 0.0500 time 0.2187 (0.2265) data time 0.0006 (0.0026) model time 0.2181 (0.2239) loss 2.7839 (2.6096) grad_norm 7.3116 (6.6996) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][260/1251] eta 0:03:44 lr 0.000010 wd 0.0500 time 0.2209 (0.2264) data time 0.0010 (0.0026) model time 0.2199 (0.2238) loss 3.0316 (2.6180) grad_norm 5.4386 (6.6948) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][270/1251] eta 0:03:41 lr 0.000010 wd 0.0500 time 0.2241 (0.2262) data time 0.0008 (0.0025) model time 0.2232 (0.2238) loss 2.9922 (2.6174) grad_norm 6.7656 (6.6628) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][280/1251] eta 0:03:39 lr 0.000010 wd 0.0500 time 0.2269 (0.2262) data time 0.0007 (0.0025) model time 0.2262 (0.2238) loss 2.9483 (2.6193) grad_norm 6.6371 (6.6656) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][290/1251] eta 0:03:37 lr 0.000010 wd 0.0500 time 0.2266 (0.2260) data time 0.0007 (0.0024) model time 0.2259 (0.2236) loss 2.3427 (2.6253) grad_norm 4.6519 (6.6348) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][300/1251] eta 0:03:34 lr 0.000010 wd 0.0500 time 0.2249 (0.2259) data time 0.0008 (0.0024) model time 0.2241 (0.2236) loss 2.6127 (2.6250) grad_norm 6.7903 (6.6224) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:43 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [297/300][310/1251] eta 0:03:32 lr 0.000010 wd 0.0500 time 0.2203 (0.2258) data time 0.0011 (0.0023) model time 0.2192 (0.2235) loss 2.5019 (2.6302) grad_norm 4.5607 (6.6211) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][320/1251] eta 0:03:30 lr 0.000010 wd 0.0500 time 0.2182 (0.2257) data time 0.0007 (0.0023) model time 0.2175 (0.2234) loss 3.2836 (2.6379) grad_norm 5.0312 (6.6661) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][330/1251] eta 0:03:27 lr 0.000010 wd 0.0500 time 0.2207 (0.2256) data time 0.0008 (0.0022) model time 0.2198 (0.2234) loss 2.2948 (2.6355) grad_norm 4.1488 (6.6555) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][340/1251] eta 0:03:25 lr 0.000010 wd 0.0500 time 0.2254 (0.2256) data time 0.0007 (0.0022) model time 0.2246 (0.2233) loss 3.1295 (2.6417) grad_norm 4.8674 (6.6507) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][350/1251] eta 0:03:23 lr 0.000010 wd 0.0500 time 0.2245 (0.2255) data time 0.0008 (0.0022) model time 0.2237 (0.2233) loss 2.7885 (2.6350) grad_norm 7.0051 (6.6386) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][360/1251] eta 0:03:20 lr 0.000010 wd 0.0500 time 0.2210 (0.2254) data time 0.0008 (0.0021) model time 0.2202 (0.2233) loss 2.0722 (2.6343) grad_norm 5.9300 (6.6156) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][370/1251] eta 0:03:18 lr 0.000010 wd 0.0500 time 0.2258 (0.2254) data time 0.0007 (0.0021) model time 0.2251 (0.2233) loss 2.0482 (2.6329) grad_norm 6.3497 
(6.6592) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:46:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][380/1251] eta 0:03:16 lr 0.000010 wd 0.0500 time 0.2285 (0.2254) data time 0.0007 (0.0021) model time 0.2278 (0.2233) loss 2.5365 (2.6328) grad_norm 4.4872 (6.7111) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][390/1251] eta 0:03:13 lr 0.000010 wd 0.0500 time 0.2185 (0.2253) data time 0.0008 (0.0020) model time 0.2177 (0.2232) loss 3.0116 (2.6306) grad_norm 5.4802 (6.7212) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][400/1251] eta 0:03:11 lr 0.000010 wd 0.0500 time 0.2290 (0.2253) data time 0.0009 (0.0020) model time 0.2282 (0.2233) loss 3.0565 (2.6275) grad_norm 4.9005 (6.7079) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][410/1251] eta 0:03:09 lr 0.000010 wd 0.0500 time 0.2234 (0.2253) data time 0.0008 (0.0020) model time 0.2226 (0.2233) loss 3.1481 (2.6283) grad_norm 5.8098 (7.3640) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][420/1251] eta 0:03:07 lr 0.000010 wd 0.0500 time 0.2256 (0.2253) data time 0.0013 (0.0020) model time 0.2244 (0.2233) loss 2.4414 (2.6326) grad_norm 6.5984 (7.3835) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][430/1251] eta 0:03:04 lr 0.000010 wd 0.0500 time 0.2251 (0.2253) data time 0.0007 (0.0019) model time 0.2243 (0.2233) loss 3.1187 (2.6365) grad_norm 5.4806 (7.3808) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][440/1251] eta 0:03:02 lr 0.000010 wd 0.0500 time 0.2225 (0.2253) data 
time 0.0006 (0.0019) model time 0.2219 (0.2234) loss 3.0943 (2.6356) grad_norm 5.7570 (7.3621) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][450/1251] eta 0:03:00 lr 0.000010 wd 0.0500 time 0.2212 (0.2253) data time 0.0009 (0.0019) model time 0.2203 (0.2234) loss 2.5834 (2.6342) grad_norm 7.0273 (7.3490) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][460/1251] eta 0:02:58 lr 0.000010 wd 0.0500 time 0.2269 (0.2253) data time 0.0008 (0.0019) model time 0.2261 (0.2234) loss 2.0674 (2.6318) grad_norm 7.0921 (7.3356) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][470/1251] eta 0:02:55 lr 0.000010 wd 0.0500 time 0.2260 (0.2252) data time 0.0008 (0.0018) model time 0.2252 (0.2234) loss 2.3055 (2.6358) grad_norm 5.5176 (7.3233) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][480/1251] eta 0:02:53 lr 0.000010 wd 0.0500 time 0.2266 (0.2252) data time 0.0008 (0.0018) model time 0.2258 (0.2234) loss 3.0701 (2.6412) grad_norm 20.8542 (7.3364) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][490/1251] eta 0:02:51 lr 0.000010 wd 0.0500 time 0.2196 (0.2252) data time 0.0009 (0.0018) model time 0.2187 (0.2234) loss 2.6611 (2.6434) grad_norm 6.3186 (7.3075) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][500/1251] eta 0:02:49 lr 0.000010 wd 0.0500 time 0.2213 (0.2256) data time 0.0008 (0.0018) model time 0.2205 (0.2238) loss 1.5531 (2.6447) grad_norm 6.4299 (7.3206) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [297/300][510/1251] eta 0:02:47 lr 0.000010 wd 0.0500 time 0.2240 (0.2256) data time 0.0008 (0.0018) model time 0.2232 (0.2238) loss 3.0836 (2.6414) grad_norm 7.6178 (7.2937) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][520/1251] eta 0:02:44 lr 0.000010 wd 0.0500 time 0.2289 (0.2255) data time 0.0009 (0.0018) model time 0.2280 (0.2238) loss 2.5463 (2.6408) grad_norm 3.9272 (7.2809) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][530/1251] eta 0:02:42 lr 0.000010 wd 0.0500 time 0.2290 (0.2255) data time 0.0006 (0.0017) model time 0.2284 (0.2238) loss 2.8155 (2.6393) grad_norm 10.2955 (7.2700) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][540/1251] eta 0:02:40 lr 0.000010 wd 0.0500 time 0.2270 (0.2255) data time 0.0007 (0.0017) model time 0.2263 (0.2238) loss 2.6006 (2.6374) grad_norm 4.9850 (7.2671) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][550/1251] eta 0:02:38 lr 0.000010 wd 0.0500 time 0.2237 (0.2255) data time 0.0010 (0.0017) model time 0.2227 (0.2238) loss 2.8224 (2.6364) grad_norm 23.6022 (7.2673) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][560/1251] eta 0:02:35 lr 0.000010 wd 0.0500 time 0.2281 (0.2254) data time 0.0006 (0.0017) model time 0.2275 (0.2238) loss 1.6731 (2.6401) grad_norm 7.0345 (7.2459) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][570/1251] eta 0:02:33 lr 0.000010 wd 0.0500 time 0.2245 (0.2254) data time 0.0010 (0.0017) model time 0.2235 (0.2238) loss 2.7838 (2.6376) grad_norm 6.8909 (7.2220) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 11:47:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][580/1251] eta 0:02:31 lr 0.000010 wd 0.0500 time 0.2198 (0.2254) data time 0.0012 (0.0017) model time 0.2187 (0.2238) loss 3.0280 (2.6395) grad_norm 4.3339 (7.2670) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][590/1251] eta 0:02:28 lr 0.000010 wd 0.0500 time 0.2191 (0.2254) data time 0.0010 (0.0017) model time 0.2181 (0.2237) loss 2.8115 (2.6413) grad_norm 11.6655 (7.2655) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][600/1251] eta 0:02:26 lr 0.000010 wd 0.0500 time 0.2211 (0.2253) data time 0.0008 (0.0016) model time 0.2202 (0.2237) loss 2.6062 (2.6434) grad_norm 6.0198 (7.2683) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][610/1251] eta 0:02:24 lr 0.000010 wd 0.0500 time 0.2253 (0.2254) data time 0.0011 (0.0016) model time 0.2242 (0.2237) loss 2.9888 (2.6456) grad_norm 4.7260 (7.2416) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][620/1251] eta 0:02:22 lr 0.000010 wd 0.0500 time 0.2151 (0.2253) data time 0.0010 (0.0016) model time 0.2141 (0.2237) loss 3.0797 (2.6446) grad_norm 5.2100 (7.2227) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][630/1251] eta 0:02:19 lr 0.000010 wd 0.0500 time 0.2232 (0.2253) data time 0.0009 (0.0016) model time 0.2223 (0.2237) loss 3.0337 (2.6443) grad_norm 9.0230 (7.2275) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][640/1251] eta 0:02:17 lr 0.000010 wd 0.0500 time 0.2247 (0.2253) data time 0.0006 (0.0016) model 
time 0.2241 (0.2237) loss 2.8890 (2.6484) grad_norm 6.3041 (7.2225) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:47:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][650/1251] eta 0:02:15 lr 0.000010 wd 0.0500 time 0.2255 (0.2253) data time 0.0008 (0.0016) model time 0.2247 (0.2237) loss 3.1119 (2.6467) grad_norm 8.6300 (7.2106) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][660/1251] eta 0:02:13 lr 0.000010 wd 0.0500 time 0.2245 (0.2253) data time 0.0008 (0.0016) model time 0.2237 (0.2237) loss 2.8085 (2.6451) grad_norm 4.1170 (7.2064) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][670/1251] eta 0:02:10 lr 0.000010 wd 0.0500 time 0.2249 (0.2253) data time 0.0009 (0.0016) model time 0.2240 (0.2238) loss 2.4777 (2.6461) grad_norm 5.4652 (7.1930) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][680/1251] eta 0:02:08 lr 0.000010 wd 0.0500 time 0.2220 (0.2253) data time 0.0009 (0.0016) model time 0.2210 (0.2238) loss 3.2138 (2.6486) grad_norm 8.3583 (7.2072) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][690/1251] eta 0:02:06 lr 0.000010 wd 0.0500 time 0.2265 (0.2253) data time 0.0008 (0.0016) model time 0.2257 (0.2238) loss 2.2022 (2.6484) grad_norm 4.0750 (7.2012) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][700/1251] eta 0:02:04 lr 0.000010 wd 0.0500 time 0.2260 (0.2253) data time 0.0008 (0.0016) model time 0.2252 (0.2238) loss 3.0289 (2.6506) grad_norm 8.7624 (7.2033) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][710/1251] 
eta 0:02:01 lr 0.000010 wd 0.0500 time 0.2195 (0.2253) data time 0.0009 (0.0015) model time 0.2187 (0.2238) loss 2.7892 (2.6537) grad_norm 8.1913 (7.2017) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][720/1251] eta 0:01:59 lr 0.000010 wd 0.0500 time 0.2217 (0.2253) data time 0.0007 (0.0015) model time 0.2210 (0.2238) loss 3.1306 (2.6550) grad_norm 6.4780 (7.1870) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][730/1251] eta 0:01:57 lr 0.000010 wd 0.0500 time 0.2219 (0.2253) data time 0.0009 (0.0015) model time 0.2210 (0.2238) loss 2.9167 (2.6552) grad_norm 4.4466 (7.1731) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][740/1251] eta 0:01:55 lr 0.000010 wd 0.0500 time 0.2230 (0.2253) data time 0.0007 (0.0015) model time 0.2223 (0.2238) loss 3.2820 (2.6556) grad_norm 5.7886 (7.1725) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][750/1251] eta 0:01:52 lr 0.000010 wd 0.0500 time 0.2191 (0.2252) data time 0.0008 (0.0015) model time 0.2183 (0.2238) loss 2.8271 (2.6576) grad_norm 10.2348 (7.1645) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][760/1251] eta 0:01:50 lr 0.000010 wd 0.0500 time 0.2260 (0.2252) data time 0.0008 (0.0015) model time 0.2252 (0.2237) loss 2.9570 (2.6545) grad_norm 5.3551 (7.1511) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][770/1251] eta 0:01:48 lr 0.000010 wd 0.0500 time 0.2230 (0.2252) data time 0.0008 (0.0015) model time 0.2222 (0.2237) loss 2.6303 (2.6565) grad_norm 5.9684 (7.1457) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
11:48:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][780/1251] eta 0:01:46 lr 0.000010 wd 0.0500 time 0.2221 (0.2252) data time 0.0009 (0.0015) model time 0.2212 (0.2237) loss 2.5947 (2.6565) grad_norm 5.3682 (7.1375) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][790/1251] eta 0:01:43 lr 0.000010 wd 0.0500 time 0.2307 (0.2252) data time 0.0006 (0.0015) model time 0.2301 (0.2237) loss 3.1943 (2.6587) grad_norm 6.3013 (7.1731) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][800/1251] eta 0:01:41 lr 0.000010 wd 0.0500 time 0.2246 (0.2252) data time 0.0008 (0.0015) model time 0.2238 (0.2237) loss 1.8856 (2.6591) grad_norm 7.8120 (7.1688) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][810/1251] eta 0:01:39 lr 0.000010 wd 0.0500 time 0.2232 (0.2252) data time 0.0010 (0.0015) model time 0.2222 (0.2238) loss 2.3530 (2.6597) grad_norm 9.9855 (7.1671) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][820/1251] eta 0:01:37 lr 0.000010 wd 0.0500 time 0.2217 (0.2252) data time 0.0013 (0.0015) model time 0.2205 (0.2237) loss 2.7073 (2.6598) grad_norm 8.4883 (7.1547) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][830/1251] eta 0:01:34 lr 0.000010 wd 0.0500 time 0.2192 (0.2251) data time 0.0006 (0.0015) model time 0.2186 (0.2237) loss 3.1501 (2.6624) grad_norm 10.9611 (7.1473) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][840/1251] eta 0:01:32 lr 0.000010 wd 0.0500 time 0.2249 (0.2251) data time 0.0009 (0.0014) model time 0.2240 (0.2237) loss 2.8598 
(2.6628) grad_norm 4.2673 (7.1346) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][850/1251] eta 0:01:30 lr 0.000010 wd 0.0500 time 0.2166 (0.2251) data time 0.0010 (0.0014) model time 0.2155 (0.2237) loss 2.8144 (2.6632) grad_norm 6.1656 (7.1274) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][860/1251] eta 0:01:27 lr 0.000010 wd 0.0500 time 0.2194 (0.2251) data time 0.0010 (0.0014) model time 0.2184 (0.2236) loss 2.4101 (2.6654) grad_norm 5.0395 (7.1440) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][870/1251] eta 0:01:25 lr 0.000010 wd 0.0500 time 0.2270 (0.2251) data time 0.0008 (0.0014) model time 0.2263 (0.2236) loss 2.7274 (2.6656) grad_norm 4.7444 (7.1368) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][880/1251] eta 0:01:23 lr 0.000010 wd 0.0500 time 0.2243 (0.2251) data time 0.0008 (0.0014) model time 0.2235 (0.2237) loss 2.3024 (2.6671) grad_norm 4.8615 (7.3435) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][890/1251] eta 0:01:21 lr 0.000010 wd 0.0500 time 0.2210 (0.2250) data time 0.0009 (0.0014) model time 0.2201 (0.2236) loss 2.0427 (2.6649) grad_norm 5.2707 (7.3218) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][900/1251] eta 0:01:18 lr 0.000010 wd 0.0500 time 0.2218 (0.2250) data time 0.0007 (0.0014) model time 0.2211 (0.2236) loss 2.5226 (2.6650) grad_norm 3.7317 (7.3110) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:48:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][910/1251] eta 0:01:16 lr 0.000010 wd 0.0500 
time 0.2215 (0.2250) data time 0.0008 (0.0014) model time 0.2207 (0.2236) loss 2.3467 (2.6629) grad_norm 7.1968 (7.3123) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:49:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][920/1251] eta 0:01:14 lr 0.000010 wd 0.0500 time 0.2216 (0.2250) data time 0.0010 (0.0014) model time 0.2205 (0.2236) loss 2.8726 (2.6652) grad_norm 7.4068 (7.2999) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:49:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][930/1251] eta 0:01:12 lr 0.000010 wd 0.0500 time 0.2292 (0.2250) data time 0.0009 (0.0014) model time 0.2283 (0.2236) loss 2.9281 (2.6650) grad_norm 13.5023 (7.2952) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:49:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][940/1251] eta 0:01:09 lr 0.000010 wd 0.0500 time 0.2180 (0.2250) data time 0.0009 (0.0014) model time 0.2170 (0.2236) loss 2.9043 (2.6637) grad_norm 5.2374 (7.2902) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:49:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][950/1251] eta 0:01:07 lr 0.000010 wd 0.0500 time 0.2172 (0.2250) data time 0.0010 (0.0014) model time 0.2162 (0.2236) loss 2.7103 (2.6631) grad_norm 9.9097 (7.2916) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:49:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][960/1251] eta 0:01:05 lr 0.000010 wd 0.0500 time 0.2289 (0.2250) data time 0.0008 (0.0014) model time 0.2281 (0.2236) loss 2.9904 (2.6634) grad_norm 8.9332 (7.2756) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:49:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][970/1251] eta 0:01:03 lr 0.000010 wd 0.0500 time 0.2266 (0.2250) data time 0.0006 (0.0014) model time 0.2260 (0.2236) loss 2.7420 (2.6632) grad_norm 6.0191 (7.2540) loss_scale 256.0000 (128.9228) mem 7381MB [2024-09-01 11:49:13 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [297/300][980/1251] eta 0:01:00 lr 0.000010 wd 0.0500 time 0.2209 (0.2250) data time 0.0006 (0.0014) model time 0.2202 (0.2236) loss 2.5839 (2.6623) grad_norm 6.7843 (7.2370) loss_scale 256.0000 (130.2181) mem 7381MB [2024-09-01 11:49:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][990/1251] eta 0:00:58 lr 0.000010 wd 0.0500 time 0.2176 (0.2250) data time 0.0009 (0.0014) model time 0.2166 (0.2236) loss 2.0339 (2.6623) grad_norm 11.5589 (7.2382) loss_scale 256.0000 (131.4874) mem 7381MB [2024-09-01 11:49:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1000/1251] eta 0:00:56 lr 0.000010 wd 0.0500 time 0.2240 (0.2250) data time 0.0010 (0.0014) model time 0.2231 (0.2236) loss 2.3221 (2.6607) grad_norm 4.9178 (7.2292) loss_scale 256.0000 (132.7313) mem 7381MB [2024-09-01 11:49:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1010/1251] eta 0:00:54 lr 0.000010 wd 0.0500 time 0.2288 (0.2249) data time 0.0006 (0.0014) model time 0.2283 (0.2236) loss 2.2386 (2.6564) grad_norm 4.8539 (7.2129) loss_scale 256.0000 (133.9505) mem 7381MB [2024-09-01 11:49:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1020/1251] eta 0:00:51 lr 0.000010 wd 0.0500 time 0.2273 (0.2249) data time 0.0006 (0.0014) model time 0.2267 (0.2236) loss 2.2035 (2.6548) grad_norm 5.8192 (7.2220) loss_scale 256.0000 (135.1459) mem 7381MB [2024-09-01 11:49:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1030/1251] eta 0:00:49 lr 0.000010 wd 0.0500 time 0.2237 (0.2249) data time 0.0009 (0.0014) model time 0.2228 (0.2235) loss 3.1759 (2.6542) grad_norm 9.0877 (7.2108) loss_scale 256.0000 (136.3181) mem 7381MB [2024-09-01 11:49:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1040/1251] eta 0:00:47 lr 0.000010 wd 0.0500 time 0.2247 (0.2251) data time 0.0009 (0.0014) model time 0.2237 (0.2237) loss 3.1108 (2.6527) grad_norm 3.9423 
(7.2015) loss_scale 256.0000 (137.4678) mem 7381MB [2024-09-01 11:49:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1050/1251] eta 0:00:45 lr 0.000010 wd 0.0500 time 0.2202 (0.2250) data time 0.0011 (0.0013) model time 0.2191 (0.2237) loss 2.2272 (2.6539) grad_norm 4.7041 (7.1949) loss_scale 256.0000 (138.5956) mem 7381MB [2024-09-01 11:49:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1060/1251] eta 0:00:42 lr 0.000010 wd 0.0500 time 0.2299 (0.2251) data time 0.0008 (0.0013) model time 0.2291 (0.2237) loss 2.8508 (2.6523) grad_norm 6.1567 (7.1896) loss_scale 256.0000 (139.7022) mem 7381MB [2024-09-01 11:49:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1070/1251] eta 0:00:40 lr 0.000010 wd 0.0500 time 0.2238 (0.2251) data time 0.0006 (0.0013) model time 0.2231 (0.2237) loss 2.2433 (2.6524) grad_norm 4.9953 (7.1867) loss_scale 256.0000 (140.7880) mem 7381MB [2024-09-01 11:49:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1080/1251] eta 0:00:38 lr 0.000010 wd 0.0500 time 0.2275 (0.2251) data time 0.0006 (0.0013) model time 0.2269 (0.2237) loss 3.0351 (2.6541) grad_norm 7.5348 (7.1888) loss_scale 256.0000 (141.8538) mem 7381MB [2024-09-01 11:49:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1090/1251] eta 0:00:36 lr 0.000010 wd 0.0500 time 0.2295 (0.2250) data time 0.0007 (0.0013) model time 0.2288 (0.2237) loss 3.3741 (2.6560) grad_norm 8.9372 (7.1948) loss_scale 256.0000 (142.9001) mem 7381MB [2024-09-01 11:49:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1100/1251] eta 0:00:33 lr 0.000010 wd 0.0500 time 0.2294 (0.2250) data time 0.0008 (0.0013) model time 0.2287 (0.2237) loss 2.8357 (2.6546) grad_norm 8.8399 (7.1981) loss_scale 256.0000 (143.9273) mem 7381MB [2024-09-01 11:49:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1110/1251] eta 0:00:31 lr 0.000010 wd 0.0500 time 0.2255 
(0.2251) data time 0.0009 (0.0013) model time 0.2246 (0.2237) loss 2.8025 (2.6538) grad_norm 7.9271 (7.2013) loss_scale 256.0000 (144.9361) mem 7381MB [2024-09-01 11:49:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1120/1251] eta 0:00:29 lr 0.000010 wd 0.0500 time 0.2263 (0.2250) data time 0.0008 (0.0013) model time 0.2255 (0.2237) loss 2.7575 (2.6537) grad_norm 6.0502 (7.1882) loss_scale 256.0000 (145.9269) mem 7381MB [2024-09-01 11:49:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1130/1251] eta 0:00:27 lr 0.000010 wd 0.0500 time 0.2264 (0.2250) data time 0.0006 (0.0013) model time 0.2258 (0.2237) loss 1.9792 (2.6539) grad_norm 5.3825 (7.1762) loss_scale 256.0000 (146.9001) mem 7381MB [2024-09-01 11:49:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1140/1251] eta 0:00:24 lr 0.000010 wd 0.0500 time 0.2224 (0.2250) data time 0.0009 (0.0013) model time 0.2215 (0.2237) loss 2.8716 (2.6541) grad_norm 5.9681 (7.1637) loss_scale 256.0000 (147.8563) mem 7381MB [2024-09-01 11:49:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1150/1251] eta 0:00:22 lr 0.000010 wd 0.0500 time 0.2241 (0.2250) data time 0.0008 (0.0013) model time 0.2233 (0.2237) loss 2.0620 (2.6533) grad_norm 5.5280 (7.1630) loss_scale 256.0000 (148.7958) mem 7381MB [2024-09-01 11:49:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1160/1251] eta 0:00:20 lr 0.000010 wd 0.0500 time 0.2192 (0.2250) data time 0.0009 (0.0013) model time 0.2184 (0.2237) loss 3.0577 (2.6539) grad_norm 7.2276 (7.1734) loss_scale 256.0000 (149.7192) mem 7381MB [2024-09-01 11:49:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1170/1251] eta 0:00:18 lr 0.000010 wd 0.0500 time 0.2225 (0.2252) data time 0.0008 (0.0013) model time 0.2217 (0.2239) loss 2.7633 (2.6543) grad_norm 4.5960 (7.1636) loss_scale 256.0000 (150.6268) mem 7381MB [2024-09-01 11:49:59 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [297/300][1180/1251] eta 0:00:15 lr 0.000010 wd 0.0500 time 0.2227 (0.2251) data time 0.0006 (0.0013) model time 0.2221 (0.2239) loss 2.7275 (2.6521) grad_norm 5.8768 (7.1670) loss_scale 256.0000 (151.5191) mem 7381MB [2024-09-01 11:50:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1190/1251] eta 0:00:13 lr 0.000010 wd 0.0500 time 0.2271 (0.2251) data time 0.0008 (0.0013) model time 0.2263 (0.2239) loss 1.4997 (2.6523) grad_norm 7.6669 (7.1710) loss_scale 256.0000 (152.3963) mem 7381MB [2024-09-01 11:50:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1200/1251] eta 0:00:11 lr 0.000010 wd 0.0500 time 0.2277 (0.2251) data time 0.0008 (0.0013) model time 0.2269 (0.2238) loss 3.4040 (2.6532) grad_norm 9.2617 (7.1694) loss_scale 256.0000 (153.2590) mem 7381MB [2024-09-01 11:50:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1210/1251] eta 0:00:09 lr 0.000010 wd 0.0500 time 0.2189 (0.2251) data time 0.0007 (0.0013) model time 0.2182 (0.2238) loss 2.2994 (2.6522) grad_norm 5.2222 (7.1639) loss_scale 256.0000 (154.1073) mem 7381MB [2024-09-01 11:50:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1220/1251] eta 0:00:06 lr 0.000010 wd 0.0500 time 0.2202 (0.2251) data time 0.0009 (0.0013) model time 0.2193 (0.2238) loss 2.5906 (2.6521) grad_norm 10.9938 (7.1526) loss_scale 256.0000 (154.9419) mem 7381MB [2024-09-01 11:50:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1230/1251] eta 0:00:04 lr 0.000010 wd 0.0500 time 0.2312 (0.2251) data time 0.0005 (0.0013) model time 0.2307 (0.2238) loss 1.9245 (2.6506) grad_norm 8.5685 (7.1494) loss_scale 256.0000 (155.7628) mem 7381MB [2024-09-01 11:50:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1240/1251] eta 0:00:02 lr 0.000010 wd 0.0500 time 0.2121 (0.2250) data time 0.0006 (0.0013) model time 0.2116 (0.2237) loss 2.8921 (2.6494) grad_norm 
4.2635 (7.1601) loss_scale 256.0000 (156.5705) mem 7381MB [2024-09-01 11:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [297/300][1250/1251] eta 0:00:00 lr 0.000010 wd 0.0500 time 0.2108 (0.2249) data time 0.0004 (0.0013) model time 0.2104 (0.2236) loss 3.1156 (2.6509) grad_norm 7.7295 (7.1499) loss_scale 256.0000 (157.3653) mem 7381MB [2024-09-01 11:50:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 297 training takes 0:04:41 [2024-09-01 11:50:14 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:50:15 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:50:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.403 (0.403) Loss 0.3892 (0.3892) Acc@1 93.457 (93.457) Acc@5 99.023 (99.023) Mem 7381MB [2024-09-01 11:50:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.070 (0.099) Loss 0.5718 (0.6110) Acc@1 90.332 (87.669) Acc@5 97.852 (97.683) Mem 7381MB [2024-09-01 11:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.064 (0.085) Loss 0.9219 (0.6406) Acc@1 77.734 (86.584) Acc@5 95.410 (97.638) Mem 7381MB [2024-09-01 11:50:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.079) Loss 1.1562 (0.7345) Acc@1 74.219 (84.359) Acc@5 92.578 (96.736) Mem 7381MB [2024-09-01 11:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.075) Loss 1.0234 (0.7842) Acc@1 77.344 (83.229) Acc@5 94.238 (96.249) Mem 7381MB [2024-09-01 11:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.798 Acc@5 96.178 [2024-09-01 11:50:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:50:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 
0.797 (0.797) Loss 0.3909 (0.3909) Acc@1 93.262 (93.262) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:50:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.138) Loss 0.5654 (0.6099) Acc@1 90.430 (87.802) Acc@5 97.754 (97.745) Mem 7381MB [2024-09-01 11:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.070 (0.105) Loss 0.9175 (0.6405) Acc@1 78.223 (86.751) Acc@5 95.605 (97.731) Mem 7381MB [2024-09-01 11:50:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.064 (0.093) Loss 1.1475 (0.7340) Acc@1 74.414 (84.507) Acc@5 93.164 (96.768) Mem 7381MB [2024-09-01 11:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.085) Loss 1.0225 (0.7832) Acc@1 77.051 (83.346) Acc@5 94.336 (96.263) Mem 7381MB [2024-09-01 11:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.888 Acc@5 96.218 [2024-09-01 11:50:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:50:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][0/1251] eta 0:20:21 lr 0.000010 wd 0.0500 time 0.9765 (0.9765) data time 0.6819 (0.6819) model time 0.0000 (0.0000) loss 1.7373 (1.7373) grad_norm 5.7113 (5.7113) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][10/1251] eta 0:06:02 lr 0.000010 wd 0.0500 time 0.2203 (0.2923) data time 0.0009 (0.0629) model time 0.0000 (0.0000) loss 2.5140 (2.5954) grad_norm 4.6222 (5.9019) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][20/1251] eta 0:05:20 lr 0.000010 wd 0.0500 time 0.2187 (0.2606) data time 0.0007 (0.0333) model time 0.0000 (0.0000) loss 2.9211 (2.6085) grad_norm 13.4640 (5.9819) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:30 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [298/300][30/1251] eta 0:05:03 lr 0.000010 wd 0.0500 time 0.2320 (0.2487) data time 0.0007 (0.0229) model time 0.0000 (0.0000) loss 3.4343 (2.6361) grad_norm 7.4382 (6.1472) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][40/1251] eta 0:04:53 lr 0.000010 wd 0.0500 time 0.2150 (0.2425) data time 0.0010 (0.0175) model time 0.0000 (0.0000) loss 2.4588 (2.5741) grad_norm 4.4297 (6.0071) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][50/1251] eta 0:04:47 lr 0.000010 wd 0.0500 time 0.2285 (0.2392) data time 0.0006 (0.0143) model time 0.0000 (0.0000) loss 2.5299 (2.5668) grad_norm 10.3574 (6.1620) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][60/1251] eta 0:04:41 lr 0.000010 wd 0.0500 time 0.2165 (0.2365) data time 0.0008 (0.0121) model time 0.2156 (0.2221) loss 2.2498 (2.5650) grad_norm 7.0654 (6.1258) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][70/1251] eta 0:04:37 lr 0.000010 wd 0.0500 time 0.2222 (0.2347) data time 0.0008 (0.0105) model time 0.2214 (0.2224) loss 2.6875 (2.5897) grad_norm 6.1507 (6.6260) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][80/1251] eta 0:04:33 lr 0.000010 wd 0.0500 time 0.2202 (0.2332) data time 0.0010 (0.0093) model time 0.2192 (0.2221) loss 2.5993 (2.6038) grad_norm 4.3768 (6.7376) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][90/1251] eta 0:04:29 lr 0.000010 wd 0.0500 time 0.2249 (0.2325) data time 0.0008 (0.0084) model time 0.2240 (0.2231) loss 1.9828 (2.5818) grad_norm 18.0237 (6.8496) 
loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][100/1251] eta 0:04:26 lr 0.000010 wd 0.0500 time 0.2298 (0.2318) data time 0.0006 (0.0076) model time 0.2293 (0.2234) loss 1.9255 (2.6082) grad_norm 15.2355 (6.8018) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][110/1251] eta 0:04:23 lr 0.000010 wd 0.0500 time 0.2242 (0.2311) data time 0.0008 (0.0070) model time 0.2234 (0.2233) loss 3.1558 (2.6231) grad_norm 4.0151 (6.7399) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][120/1251] eta 0:04:20 lr 0.000010 wd 0.0500 time 0.2194 (0.2306) data time 0.0010 (0.0065) model time 0.2184 (0.2234) loss 2.0789 (2.6156) grad_norm 4.5931 (6.7022) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][130/1251] eta 0:04:17 lr 0.000010 wd 0.0500 time 0.2198 (0.2300) data time 0.0009 (0.0061) model time 0.2188 (0.2232) loss 3.1588 (2.6363) grad_norm 7.5246 (6.6609) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][140/1251] eta 0:04:15 lr 0.000010 wd 0.0500 time 0.2263 (0.2296) data time 0.0010 (0.0057) model time 0.2253 (0.2232) loss 2.9861 (2.6623) grad_norm 9.8806 (6.7220) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][150/1251] eta 0:04:12 lr 0.000010 wd 0.0500 time 0.2267 (0.2292) data time 0.0010 (0.0054) model time 0.2258 (0.2232) loss 2.8681 (2.6692) grad_norm 14.7177 (6.7813) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:50:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][160/1251] eta 0:04:09 lr 0.000010 wd 0.0500 time 0.2285 (0.2290) data time 
0.0009 (0.0051) model time 0.2277 (0.2233) loss 2.0788 (2.6685) grad_norm 5.6719 (6.8263) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][170/1251] eta 0:04:07 lr 0.000010 wd 0.0500 time 0.2233 (0.2286) data time 0.0008 (0.0049) model time 0.2225 (0.2232) loss 2.4293 (2.6631) grad_norm 4.9603 (6.8182) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][180/1251] eta 0:04:04 lr 0.000010 wd 0.0500 time 0.2285 (0.2284) data time 0.0008 (0.0047) model time 0.2277 (0.2233) loss 2.3052 (2.6628) grad_norm 9.0530 (6.8148) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][190/1251] eta 0:04:02 lr 0.000010 wd 0.0500 time 0.2244 (0.2283) data time 0.0006 (0.0045) model time 0.2238 (0.2234) loss 2.7873 (2.6667) grad_norm 10.1975 (6.8099) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][200/1251] eta 0:03:59 lr 0.000010 wd 0.0500 time 0.2217 (0.2280) data time 0.0007 (0.0043) model time 0.2210 (0.2233) loss 3.2511 (2.6718) grad_norm 20.8401 (6.8407) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][210/1251] eta 0:03:57 lr 0.000010 wd 0.0500 time 0.2235 (0.2278) data time 0.0008 (0.0041) model time 0.2227 (0.2233) loss 2.3342 (2.6762) grad_norm 6.9750 (6.9233) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][220/1251] eta 0:03:54 lr 0.000010 wd 0.0500 time 0.2186 (0.2276) data time 0.0008 (0.0040) model time 0.2177 (0.2232) loss 2.3032 (2.6819) grad_norm 7.4552 (7.0096) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO 
Train: [298/300][230/1251] eta 0:03:52 lr 0.000010 wd 0.0500 time 0.2217 (0.2274) data time 0.0011 (0.0038) model time 0.2206 (0.2231) loss 2.6068 (2.6723) grad_norm 8.3135 (6.9946) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][240/1251] eta 0:03:49 lr 0.000010 wd 0.0500 time 0.2251 (0.2272) data time 0.0007 (0.0037) model time 0.2244 (0.2231) loss 2.8118 (2.6600) grad_norm 7.1238 (7.0362) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][250/1251] eta 0:03:47 lr 0.000010 wd 0.0500 time 0.2220 (0.2270) data time 0.0010 (0.0036) model time 0.2210 (0.2230) loss 2.9086 (2.6666) grad_norm 9.8053 (7.1682) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][260/1251] eta 0:03:45 lr 0.000010 wd 0.0500 time 0.3862 (0.2275) data time 0.0008 (0.0035) model time 0.3853 (0.2237) loss 2.5906 (2.6642) grad_norm 6.6226 (7.1309) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][270/1251] eta 0:03:42 lr 0.000010 wd 0.0500 time 0.2273 (0.2272) data time 0.0006 (0.0034) model time 0.2267 (0.2236) loss 2.1403 (2.6603) grad_norm 4.5358 (7.0893) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][280/1251] eta 0:03:40 lr 0.000010 wd 0.0500 time 0.2193 (0.2271) data time 0.0009 (0.0033) model time 0.2184 (0.2235) loss 1.6870 (2.6653) grad_norm 4.8807 (7.0969) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][290/1251] eta 0:03:38 lr 0.000010 wd 0.0500 time 0.2219 (0.2269) data time 0.0008 (0.0032) model time 0.2211 (0.2234) loss 2.7628 (2.6699) grad_norm 5.5218 (7.0739) loss_scale 256.0000 
(256.0000) mem 7381MB [2024-09-01 11:51:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][300/1251] eta 0:03:35 lr 0.000010 wd 0.0500 time 0.2231 (0.2268) data time 0.0008 (0.0032) model time 0.2223 (0.2233) loss 2.7842 (2.6636) grad_norm 4.8345 (7.0829) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][310/1251] eta 0:03:33 lr 0.000010 wd 0.0500 time 0.2305 (0.2266) data time 0.0009 (0.0031) model time 0.2297 (0.2233) loss 2.5266 (2.6685) grad_norm 5.8285 (7.0925) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][320/1251] eta 0:03:30 lr 0.000010 wd 0.0500 time 0.2206 (0.2265) data time 0.0010 (0.0030) model time 0.2196 (0.2232) loss 2.9482 (2.6691) grad_norm 8.0841 (7.0713) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][330/1251] eta 0:03:28 lr 0.000010 wd 0.0500 time 0.2292 (0.2265) data time 0.0009 (0.0030) model time 0.2283 (0.2233) loss 2.3930 (2.6671) grad_norm 5.5790 (7.0339) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][340/1251] eta 0:03:26 lr 0.000010 wd 0.0500 time 0.2266 (0.2265) data time 0.0008 (0.0029) model time 0.2258 (0.2233) loss 2.8330 (2.6646) grad_norm 4.9418 (7.1473) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][350/1251] eta 0:03:23 lr 0.000010 wd 0.0500 time 0.2243 (0.2264) data time 0.0008 (0.0028) model time 0.2235 (0.2233) loss 2.7611 (2.6629) grad_norm 6.1236 (7.1284) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][360/1251] eta 0:03:21 lr 0.000010 wd 0.0500 time 0.2221 (0.2263) data time 0.0010 (0.0028) model 
time 0.2211 (0.2232) loss 2.2582 (2.6632) grad_norm 9.1246 (7.1021) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][370/1251] eta 0:03:19 lr 0.000010 wd 0.0500 time 0.2274 (0.2262) data time 0.0010 (0.0028) model time 0.2264 (0.2232) loss 2.6154 (2.6569) grad_norm 5.5544 (7.0799) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][380/1251] eta 0:03:16 lr 0.000010 wd 0.0500 time 0.2284 (0.2262) data time 0.0007 (0.0027) model time 0.2276 (0.2232) loss 2.7316 (2.6574) grad_norm 11.9339 (7.0759) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][390/1251] eta 0:03:14 lr 0.000010 wd 0.0500 time 0.2204 (0.2261) data time 0.0008 (0.0027) model time 0.2196 (0.2232) loss 2.9784 (2.6562) grad_norm 11.1631 (7.0737) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][400/1251] eta 0:03:12 lr 0.000010 wd 0.0500 time 0.2175 (0.2260) data time 0.0009 (0.0026) model time 0.2167 (0.2232) loss 3.0400 (2.6548) grad_norm 4.6944 (7.0438) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][410/1251] eta 0:03:10 lr 0.000010 wd 0.0500 time 0.2356 (0.2260) data time 0.0012 (0.0026) model time 0.2344 (0.2232) loss 2.7790 (2.6549) grad_norm 6.6308 (7.0413) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:51:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][420/1251] eta 0:03:07 lr 0.000010 wd 0.0500 time 0.2332 (0.2260) data time 0.0009 (0.0025) model time 0.2323 (0.2233) loss 2.4697 (2.6507) grad_norm 5.5077 (7.0042) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[298/300][430/1251] eta 0:03:05 lr 0.000010 wd 0.0500 time 0.2290 (0.2260) data time 0.0008 (0.0025) model time 0.2282 (0.2233) loss 2.7963 (2.6525) grad_norm 4.1219 (6.9889) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][440/1251] eta 0:03:03 lr 0.000010 wd 0.0500 time 0.2230 (0.2259) data time 0.0009 (0.0025) model time 0.2222 (0.2233) loss 2.6710 (2.6536) grad_norm 10.8265 (6.9826) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][450/1251] eta 0:03:00 lr 0.000010 wd 0.0500 time 0.2225 (0.2259) data time 0.0010 (0.0024) model time 0.2215 (0.2232) loss 2.8759 (2.6507) grad_norm 10.6643 (6.9805) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][460/1251] eta 0:02:58 lr 0.000010 wd 0.0500 time 0.2321 (0.2258) data time 0.0010 (0.0024) model time 0.2311 (0.2233) loss 2.5222 (2.6536) grad_norm 4.6541 (6.9532) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][470/1251] eta 0:02:56 lr 0.000010 wd 0.0500 time 0.2218 (0.2258) data time 0.0008 (0.0024) model time 0.2209 (0.2232) loss 2.9042 (2.6546) grad_norm 5.2822 (6.9130) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][480/1251] eta 0:02:54 lr 0.000010 wd 0.0500 time 0.2241 (0.2258) data time 0.0009 (0.0023) model time 0.2232 (0.2232) loss 2.1949 (2.6581) grad_norm 7.6861 (6.8949) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][490/1251] eta 0:02:51 lr 0.000010 wd 0.0500 time 0.2195 (0.2257) data time 0.0007 (0.0023) model time 0.2188 (0.2233) loss 2.5120 (2.6606) grad_norm 12.1864 (6.9107) loss_scale 256.0000 (256.0000) 
mem 7381MB [2024-09-01 11:52:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][500/1251] eta 0:02:49 lr 0.000010 wd 0.0500 time 0.2193 (0.2256) data time 0.0007 (0.0023) model time 0.2186 (0.2232) loss 2.7119 (2.6590) grad_norm 5.7378 (6.8934) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][510/1251] eta 0:02:47 lr 0.000010 wd 0.0500 time 0.2308 (0.2256) data time 0.0006 (0.0022) model time 0.2302 (0.2232) loss 3.2509 (2.6597) grad_norm 18.5510 (6.9240) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][520/1251] eta 0:02:44 lr 0.000010 wd 0.0500 time 0.2290 (0.2256) data time 0.0008 (0.0022) model time 0.2282 (0.2232) loss 2.7854 (2.6611) grad_norm 5.4775 (6.8978) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][530/1251] eta 0:02:42 lr 0.000010 wd 0.0500 time 0.2247 (0.2256) data time 0.0008 (0.0022) model time 0.2239 (0.2232) loss 3.4390 (2.6664) grad_norm 4.9279 (6.8814) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][540/1251] eta 0:02:40 lr 0.000010 wd 0.0500 time 0.2276 (0.2255) data time 0.0008 (0.0022) model time 0.2268 (0.2232) loss 2.3668 (2.6655) grad_norm 8.1421 (6.8779) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][550/1251] eta 0:02:38 lr 0.000010 wd 0.0500 time 0.2265 (0.2255) data time 0.0011 (0.0022) model time 0.2254 (0.2232) loss 3.2496 (2.6684) grad_norm 4.1096 (6.8819) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][560/1251] eta 0:02:35 lr 0.000010 wd 0.0500 time 0.2272 (0.2255) data time 0.0011 (0.0021) model time 0.2261 
(0.2232) loss 2.6922 (2.6676) grad_norm 5.5072 (6.8871) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][570/1251] eta 0:02:33 lr 0.000010 wd 0.0500 time 0.2269 (0.2254) data time 0.0009 (0.0021) model time 0.2260 (0.2232) loss 3.2556 (2.6698) grad_norm 6.1622 (6.9001) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][580/1251] eta 0:02:31 lr 0.000010 wd 0.0500 time 0.2216 (0.2255) data time 0.0008 (0.0021) model time 0.2207 (0.2232) loss 3.2229 (2.6695) grad_norm 5.7774 (6.9074) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][590/1251] eta 0:02:29 lr 0.000010 wd 0.0500 time 0.2197 (0.2254) data time 0.0010 (0.0021) model time 0.2186 (0.2232) loss 2.7392 (2.6706) grad_norm 4.9837 (6.8967) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][600/1251] eta 0:02:26 lr 0.000010 wd 0.0500 time 0.2211 (0.2254) data time 0.0011 (0.0021) model time 0.2201 (0.2232) loss 2.6393 (2.6697) grad_norm 6.1918 (6.8801) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][610/1251] eta 0:02:24 lr 0.000010 wd 0.0500 time 0.2333 (0.2253) data time 0.0009 (0.0020) model time 0.2324 (0.2232) loss 1.7479 (2.6686) grad_norm 5.7038 (6.8884) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][620/1251] eta 0:02:22 lr 0.000010 wd 0.0500 time 0.2252 (0.2253) data time 0.0011 (0.0020) model time 0.2241 (0.2232) loss 3.0897 (2.6715) grad_norm 4.6580 (6.8695) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][630/1251] eta 0:02:19 
lr 0.000010 wd 0.0500 time 0.2245 (0.2253) data time 0.0008 (0.0020) model time 0.2237 (0.2232) loss 2.9825 (2.6732) grad_norm 5.0822 (6.8818) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][640/1251] eta 0:02:17 lr 0.000010 wd 0.0500 time 0.2257 (0.2253) data time 0.0008 (0.0020) model time 0.2249 (0.2232) loss 3.0045 (2.6750) grad_norm 13.7657 (6.8973) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][650/1251] eta 0:02:15 lr 0.000010 wd 0.0500 time 0.2200 (0.2252) data time 0.0011 (0.0020) model time 0.2188 (0.2231) loss 2.5253 (2.6730) grad_norm 6.7544 (6.9088) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][660/1251] eta 0:02:13 lr 0.000010 wd 0.0500 time 0.2251 (0.2252) data time 0.0008 (0.0019) model time 0.2243 (0.2231) loss 2.7129 (2.6744) grad_norm 4.4459 (6.8809) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][670/1251] eta 0:02:10 lr 0.000010 wd 0.0500 time 0.2219 (0.2252) data time 0.0007 (0.0019) model time 0.2212 (0.2231) loss 2.6583 (2.6765) grad_norm 4.4167 (6.8818) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][680/1251] eta 0:02:08 lr 0.000010 wd 0.0500 time 0.2258 (0.2251) data time 0.0010 (0.0019) model time 0.2248 (0.2231) loss 2.6645 (2.6761) grad_norm 3.8290 (6.8892) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:52:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][690/1251] eta 0:02:06 lr 0.000010 wd 0.0500 time 0.2243 (0.2251) data time 0.0008 (0.0019) model time 0.2236 (0.2231) loss 3.0730 (2.6796) grad_norm 5.3823 (6.8772) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:00 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][700/1251] eta 0:02:04 lr 0.000010 wd 0.0500 time 0.2205 (0.2251) data time 0.0007 (0.0019) model time 0.2198 (0.2231) loss 2.5650 (2.6805) grad_norm 6.1319 (6.8655) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][710/1251] eta 0:02:01 lr 0.000010 wd 0.0500 time 0.2271 (0.2251) data time 0.0009 (0.0019) model time 0.2262 (0.2231) loss 2.1736 (2.6790) grad_norm 11.7840 (6.8687) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][720/1251] eta 0:01:59 lr 0.000010 wd 0.0500 time 0.2254 (0.2251) data time 0.0008 (0.0019) model time 0.2246 (0.2231) loss 2.2120 (2.6780) grad_norm 7.4353 (6.8698) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][730/1251] eta 0:01:57 lr 0.000010 wd 0.0500 time 0.2245 (0.2250) data time 0.0009 (0.0018) model time 0.2236 (0.2231) loss 1.6635 (2.6786) grad_norm 6.6737 (6.8590) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][740/1251] eta 0:01:54 lr 0.000010 wd 0.0500 time 0.2217 (0.2250) data time 0.0008 (0.0018) model time 0.2209 (0.2231) loss 1.6897 (2.6743) grad_norm 3.7199 (6.9017) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][750/1251] eta 0:01:52 lr 0.000010 wd 0.0500 time 0.2260 (0.2250) data time 0.0008 (0.0018) model time 0.2252 (0.2231) loss 3.1938 (2.6774) grad_norm 6.6377 (6.8987) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][760/1251] eta 0:01:50 lr 0.000010 wd 0.0500 time 0.2284 (0.2250) data time 0.0007 (0.0018) model time 0.2276 (0.2231) loss 2.9830 (2.6741) 
grad_norm 6.2069 (6.9183) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][770/1251] eta 0:01:48 lr 0.000010 wd 0.0500 time 0.2268 (0.2250) data time 0.0009 (0.0018) model time 0.2259 (0.2231) loss 3.2122 (2.6740) grad_norm 9.3330 (6.9468) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][780/1251] eta 0:01:46 lr 0.000010 wd 0.0500 time 0.2231 (0.2253) data time 0.0007 (0.0018) model time 0.2224 (0.2234) loss 2.5661 (2.6720) grad_norm 5.8342 (6.9389) loss_scale 256.0000 (256.0000) mem 7381MB [2024-09-01 11:53:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][790/1251] eta 0:01:43 lr 0.000010 wd 0.0500 time 0.2226 (0.2255) data time 0.0010 (0.0018) model time 0.2216 (0.2237) loss 1.6644 (2.6704) grad_norm 6.6497 (inf) loss_scale 128.0000 (255.5145) mem 7381MB [2024-09-01 11:53:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][800/1251] eta 0:01:41 lr 0.000010 wd 0.0500 time 0.2197 (0.2255) data time 0.0007 (0.0018) model time 0.2190 (0.2236) loss 2.1610 (2.6689) grad_norm 7.7182 (inf) loss_scale 128.0000 (253.9226) mem 7381MB [2024-09-01 11:53:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][810/1251] eta 0:01:39 lr 0.000010 wd 0.0500 time 0.2174 (0.2255) data time 0.0010 (0.0018) model time 0.2164 (0.2236) loss 2.2412 (2.6690) grad_norm 4.9952 (inf) loss_scale 128.0000 (252.3699) mem 7381MB [2024-09-01 11:53:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][820/1251] eta 0:01:37 lr 0.000010 wd 0.0500 time 0.2187 (0.2254) data time 0.0010 (0.0018) model time 0.2178 (0.2236) loss 1.9941 (2.6693) grad_norm 5.0625 (inf) loss_scale 128.0000 (250.8551) mem 7381MB [2024-09-01 11:53:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][830/1251] eta 0:01:34 lr 0.000010 wd 0.0500 time 0.2261 (0.2254) 
data time 0.0010 (0.0017) model time 0.2251 (0.2236) loss 2.8436 (2.6683) grad_norm 8.0074 (inf) loss_scale 128.0000 (249.3767) mem 7381MB [2024-09-01 11:53:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][840/1251] eta 0:01:32 lr 0.000010 wd 0.0500 time 0.2316 (0.2254) data time 0.0006 (0.0017) model time 0.2310 (0.2236) loss 2.5280 (2.6659) grad_norm 5.4601 (inf) loss_scale 128.0000 (247.9334) mem 7381MB [2024-09-01 11:53:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][850/1251] eta 0:01:30 lr 0.000010 wd 0.0500 time 0.2319 (0.2254) data time 0.0008 (0.0017) model time 0.2310 (0.2236) loss 1.8421 (2.6636) grad_norm 6.1624 (inf) loss_scale 128.0000 (246.5241) mem 7381MB [2024-09-01 11:53:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][860/1251] eta 0:01:28 lr 0.000010 wd 0.0500 time 0.2164 (0.2253) data time 0.0009 (0.0017) model time 0.2155 (0.2235) loss 2.7916 (2.6651) grad_norm 7.5710 (inf) loss_scale 128.0000 (245.1475) mem 7381MB [2024-09-01 11:53:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][870/1251] eta 0:01:25 lr 0.000010 wd 0.0500 time 0.2229 (0.2253) data time 0.0008 (0.0017) model time 0.2221 (0.2235) loss 2.3613 (2.6683) grad_norm 78.3551 (inf) loss_scale 128.0000 (243.8025) mem 7381MB [2024-09-01 11:53:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][880/1251] eta 0:01:23 lr 0.000010 wd 0.0500 time 0.2184 (0.2253) data time 0.0008 (0.0017) model time 0.2176 (0.2235) loss 2.5729 (2.6686) grad_norm 4.7561 (inf) loss_scale 128.0000 (242.4881) mem 7381MB [2024-09-01 11:53:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][890/1251] eta 0:01:21 lr 0.000010 wd 0.0500 time 0.2219 (0.2253) data time 0.0010 (0.0017) model time 0.2209 (0.2235) loss 2.0739 (2.6686) grad_norm 4.7286 (inf) loss_scale 128.0000 (241.2031) mem 7381MB [2024-09-01 11:53:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[298/300][900/1251] eta 0:01:19 lr 0.000010 wd 0.0500 time 0.2201 (0.2253) data time 0.0007 (0.0017) model time 0.2194 (0.2235) loss 3.3935 (2.6682) grad_norm 7.0989 (inf) loss_scale 128.0000 (239.9467) mem 7381MB [2024-09-01 11:53:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][910/1251] eta 0:01:16 lr 0.000010 wd 0.0500 time 0.2218 (0.2253) data time 0.0010 (0.0017) model time 0.2208 (0.2235) loss 1.9628 (2.6655) grad_norm 3.8542 (inf) loss_scale 128.0000 (238.7179) mem 7381MB [2024-09-01 11:53:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][920/1251] eta 0:01:14 lr 0.000010 wd 0.0500 time 0.2269 (0.2252) data time 0.0008 (0.0017) model time 0.2261 (0.2235) loss 2.9421 (2.6668) grad_norm 4.6800 (inf) loss_scale 128.0000 (237.5157) mem 7381MB [2024-09-01 11:53:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][930/1251] eta 0:01:12 lr 0.000010 wd 0.0500 time 0.2250 (0.2252) data time 0.0010 (0.0017) model time 0.2240 (0.2235) loss 2.4351 (2.6664) grad_norm 10.0765 (inf) loss_scale 128.0000 (236.3394) mem 7381MB [2024-09-01 11:53:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][940/1251] eta 0:01:10 lr 0.000010 wd 0.0500 time 0.2200 (0.2252) data time 0.0009 (0.0016) model time 0.2191 (0.2235) loss 2.6496 (2.6688) grad_norm 17.0112 (inf) loss_scale 128.0000 (235.1881) mem 7381MB [2024-09-01 11:53:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][950/1251] eta 0:01:07 lr 0.000010 wd 0.0500 time 0.2212 (0.2252) data time 0.0011 (0.0016) model time 0.2202 (0.2235) loss 2.6089 (2.6731) grad_norm 19.0069 (inf) loss_scale 128.0000 (234.0610) mem 7381MB [2024-09-01 11:53:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][960/1251] eta 0:01:05 lr 0.000010 wd 0.0500 time 0.2242 (0.2252) data time 0.0006 (0.0016) model time 0.2236 (0.2235) loss 3.0040 (2.6741) grad_norm 14.6770 (inf) loss_scale 128.0000 (232.9573) mem 7381MB [2024-09-01 
11:54:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][970/1251] eta 0:01:03 lr 0.000010 wd 0.0500 time 0.2253 (0.2251) data time 0.0007 (0.0016) model time 0.2247 (0.2235) loss 3.2648 (2.6711) grad_norm 8.4309 (inf) loss_scale 128.0000 (231.8764) mem 7381MB [2024-09-01 11:54:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][980/1251] eta 0:01:01 lr 0.000010 wd 0.0500 time 0.2232 (0.2251) data time 0.0006 (0.0016) model time 0.2225 (0.2234) loss 2.5367 (2.6698) grad_norm 5.8599 (inf) loss_scale 128.0000 (230.8175) mem 7381MB [2024-09-01 11:54:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][990/1251] eta 0:00:58 lr 0.000010 wd 0.0500 time 0.2182 (0.2251) data time 0.0010 (0.0016) model time 0.2172 (0.2234) loss 2.6225 (2.6683) grad_norm 8.2139 (inf) loss_scale 128.0000 (229.7800) mem 7381MB [2024-09-01 11:54:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1000/1251] eta 0:00:56 lr 0.000010 wd 0.0500 time 0.2192 (0.2251) data time 0.0007 (0.0016) model time 0.2185 (0.2234) loss 1.6738 (2.6661) grad_norm 5.9579 (inf) loss_scale 128.0000 (228.7632) mem 7381MB [2024-09-01 11:54:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1010/1251] eta 0:00:54 lr 0.000010 wd 0.0500 time 0.2229 (0.2251) data time 0.0010 (0.0016) model time 0.2219 (0.2234) loss 2.5811 (2.6660) grad_norm 10.8169 (inf) loss_scale 128.0000 (227.7666) mem 7381MB [2024-09-01 11:54:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1020/1251] eta 0:00:51 lr 0.000010 wd 0.0500 time 0.2198 (0.2251) data time 0.0008 (0.0016) model time 0.2190 (0.2234) loss 3.4274 (2.6654) grad_norm 6.4336 (inf) loss_scale 128.0000 (226.7894) mem 7381MB [2024-09-01 11:54:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1030/1251] eta 0:00:49 lr 0.000010 wd 0.0500 time 0.2244 (0.2251) data time 0.0010 (0.0016) model time 0.2234 (0.2234) loss 2.7718 (2.6661) 
grad_norm 4.8932 (inf) loss_scale 128.0000 (225.8312) mem 7381MB [2024-09-01 11:54:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1040/1251] eta 0:00:47 lr 0.000010 wd 0.0500 time 0.2228 (0.2251) data time 0.0008 (0.0016) model time 0.2220 (0.2234) loss 2.3573 (2.6654) grad_norm 7.0458 (inf) loss_scale 128.0000 (224.8915) mem 7381MB [2024-09-01 11:54:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1050/1251] eta 0:00:45 lr 0.000010 wd 0.0500 time 0.2243 (0.2250) data time 0.0007 (0.0016) model time 0.2236 (0.2234) loss 2.5368 (2.6642) grad_norm 4.1721 (inf) loss_scale 128.0000 (223.9696) mem 7381MB [2024-09-01 11:54:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1060/1251] eta 0:00:42 lr 0.000010 wd 0.0500 time 0.2271 (0.2250) data time 0.0008 (0.0016) model time 0.2263 (0.2234) loss 2.4524 (2.6632) grad_norm 7.8888 (inf) loss_scale 128.0000 (223.0650) mem 7381MB [2024-09-01 11:54:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1070/1251] eta 0:00:40 lr 0.000010 wd 0.0500 time 0.2216 (0.2250) data time 0.0008 (0.0016) model time 0.2208 (0.2234) loss 3.0112 (2.6654) grad_norm 5.9136 (inf) loss_scale 128.0000 (222.1774) mem 7381MB [2024-09-01 11:54:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1080/1251] eta 0:00:38 lr 0.000010 wd 0.0500 time 0.2203 (0.2250) data time 0.0010 (0.0015) model time 0.2193 (0.2234) loss 2.8068 (2.6665) grad_norm 4.8128 (inf) loss_scale 128.0000 (221.3062) mem 7381MB [2024-09-01 11:54:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1090/1251] eta 0:00:36 lr 0.000010 wd 0.0500 time 0.2234 (0.2250) data time 0.0011 (0.0015) model time 0.2223 (0.2234) loss 1.7699 (2.6645) grad_norm 3.7621 (inf) loss_scale 128.0000 (220.4510) mem 7381MB [2024-09-01 11:54:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1100/1251] eta 0:00:33 lr 0.000010 wd 0.0500 time 0.2256 (0.2250) 
data time 0.0008 (0.0015) model time 0.2248 (0.2234) loss 2.8165 (2.6639) grad_norm 5.1833 (inf) loss_scale 128.0000 (219.6113) mem 7381MB [2024-09-01 11:54:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1110/1251] eta 0:00:31 lr 0.000010 wd 0.0500 time 0.2223 (0.2250) data time 0.0009 (0.0015) model time 0.2213 (0.2234) loss 2.6760 (2.6647) grad_norm 5.9152 (inf) loss_scale 128.0000 (218.7867) mem 7381MB [2024-09-01 11:54:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1120/1251] eta 0:00:29 lr 0.000010 wd 0.0500 time 0.2221 (0.2249) data time 0.0009 (0.0015) model time 0.2212 (0.2233) loss 1.8946 (2.6641) grad_norm 3.9911 (inf) loss_scale 128.0000 (217.9768) mem 7381MB [2024-09-01 11:54:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1130/1251] eta 0:00:27 lr 0.000010 wd 0.0500 time 0.2207 (0.2249) data time 0.0007 (0.0015) model time 0.2201 (0.2234) loss 1.9212 (2.6616) grad_norm 6.3220 (inf) loss_scale 128.0000 (217.1813) mem 7381MB [2024-09-01 11:54:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1140/1251] eta 0:00:24 lr 0.000010 wd 0.0500 time 0.2222 (0.2249) data time 0.0009 (0.0015) model time 0.2213 (0.2234) loss 2.9079 (2.6625) grad_norm 4.3050 (inf) loss_scale 128.0000 (216.3996) mem 7381MB [2024-09-01 11:54:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1150/1251] eta 0:00:22 lr 0.000010 wd 0.0500 time 0.2246 (0.2249) data time 0.0009 (0.0015) model time 0.2237 (0.2234) loss 2.7752 (2.6619) grad_norm 7.6788 (inf) loss_scale 128.0000 (215.6316) mem 7381MB [2024-09-01 11:54:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1160/1251] eta 0:00:20 lr 0.000010 wd 0.0500 time 0.2213 (0.2249) data time 0.0007 (0.0015) model time 0.2206 (0.2233) loss 2.0406 (2.6623) grad_norm 8.3195 (inf) loss_scale 128.0000 (214.8768) mem 7381MB [2024-09-01 11:54:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: 
[298/300][1170/1251] eta 0:00:18 lr 0.000010 wd 0.0500 time 0.2229 (0.2249) data time 0.0010 (0.0015) model time 0.2219 (0.2233) loss 2.9323 (2.6613) grad_norm 3.8617 (inf) loss_scale 128.0000 (214.1349) mem 7381MB [2024-09-01 11:54:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1180/1251] eta 0:00:15 lr 0.000010 wd 0.0500 time 0.2261 (0.2249) data time 0.0007 (0.0015) model time 0.2254 (0.2233) loss 2.9905 (2.6614) grad_norm 6.4470 (inf) loss_scale 128.0000 (213.4056) mem 7381MB [2024-09-01 11:54:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1190/1251] eta 0:00:13 lr 0.000010 wd 0.0500 time 0.2233 (0.2249) data time 0.0010 (0.0015) model time 0.2223 (0.2233) loss 3.1237 (2.6607) grad_norm 7.5510 (inf) loss_scale 128.0000 (212.6885) mem 7381MB [2024-09-01 11:54:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1200/1251] eta 0:00:11 lr 0.000010 wd 0.0500 time 0.2209 (0.2249) data time 0.0007 (0.0015) model time 0.2203 (0.2233) loss 2.2615 (2.6588) grad_norm 7.8711 (inf) loss_scale 128.0000 (211.9833) mem 7381MB [2024-09-01 11:54:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1210/1251] eta 0:00:09 lr 0.000010 wd 0.0500 time 0.2269 (0.2249) data time 0.0009 (0.0015) model time 0.2260 (0.2233) loss 3.1931 (2.6606) grad_norm 5.6894 (inf) loss_scale 128.0000 (211.2898) mem 7381MB [2024-09-01 11:54:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1220/1251] eta 0:00:06 lr 0.000010 wd 0.0500 time 0.2226 (0.2249) data time 0.0009 (0.0015) model time 0.2218 (0.2233) loss 2.8414 (2.6616) grad_norm 35.4100 (inf) loss_scale 128.0000 (210.6077) mem 7381MB [2024-09-01 11:54:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1230/1251] eta 0:00:04 lr 0.000010 wd 0.0500 time 0.2280 (0.2249) data time 0.0007 (0.0015) model time 0.2272 (0.2233) loss 2.8708 (2.6613) grad_norm 6.6569 (inf) loss_scale 128.0000 (209.9366) mem 7381MB 
[2024-09-01 11:55:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1240/1251] eta 0:00:02 lr 0.000010 wd 0.0500 time 0.2150 (0.2248) data time 0.0006 (0.0015) model time 0.2144 (0.2233) loss 2.9764 (2.6613) grad_norm 6.5314 (inf) loss_scale 128.0000 (209.2764) mem 7381MB [2024-09-01 11:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [298/300][1250/1251] eta 0:00:00 lr 0.000010 wd 0.0500 time 0.2197 (0.2247) data time 0.0006 (0.0015) model time 0.2191 (0.2232) loss 2.7650 (2.6601) grad_norm 6.8908 (inf) loss_scale 128.0000 (208.6267) mem 7381MB [2024-09-01 11:55:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 298 training takes 0:04:41 [2024-09-01 11:55:03 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:55:04 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! [2024-09-01 11:55:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.391 (0.391) Loss 0.3867 (0.3867) Acc@1 93.555 (93.555) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:55:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.069 (0.097) Loss 0.5791 (0.6147) Acc@1 90.625 (87.828) Acc@5 97.852 (97.772) Mem 7381MB [2024-09-01 11:55:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.064 (0.085) Loss 0.9307 (0.6456) Acc@1 77.246 (86.672) Acc@5 95.117 (97.680) Mem 7381MB [2024-09-01 11:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.066 (0.079) Loss 1.1582 (0.7397) Acc@1 74.219 (84.416) Acc@5 92.969 (96.721) Mem 7381MB [2024-09-01 11:55:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.059 (0.075) Loss 1.0449 (0.7905) Acc@1 77.441 (83.246) Acc@5 94.238 (96.227) Mem 7381MB [2024-09-01 11:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.830 
Acc@5 96.176 [2024-09-01 11:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8% [2024-09-01 11:55:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.749 (0.749) Loss 0.3901 (0.3901) Acc@1 93.262 (93.262) Acc@5 98.828 (98.828) Mem 7381MB [2024-09-01 11:55:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.072 (0.133) Loss 0.5654 (0.6100) Acc@1 90.430 (87.784) Acc@5 97.754 (97.745) Mem 7381MB [2024-09-01 11:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.072 (0.104) Loss 0.9180 (0.6404) Acc@1 78.027 (86.742) Acc@5 95.605 (97.726) Mem 7381MB [2024-09-01 11:55:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.067 (0.092) Loss 1.1465 (0.7339) Acc@1 74.512 (84.507) Acc@5 93.066 (96.762) Mem 7381MB [2024-09-01 11:55:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.060 (0.085) Loss 1.0225 (0.7833) Acc@1 77.051 (83.329) Acc@5 94.434 (96.268) Mem 7381MB [2024-09-01 11:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.872 Acc@5 96.224 [2024-09-01 11:55:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9% [2024-09-01 11:55:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][0/1251] eta 0:20:21 lr 0.000010 wd 0.0500 time 0.9762 (0.9762) data time 0.6191 (0.6191) model time 0.0000 (0.0000) loss 2.6043 (2.6043) grad_norm 29.3072 (29.3072) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][10/1251] eta 0:06:06 lr 0.000010 wd 0.0500 time 0.2193 (0.2957) data time 0.0008 (0.0571) model time 0.0000 (0.0000) loss 1.9569 (2.5983) grad_norm 6.2207 (10.3815) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][20/1251] eta 0:05:23 lr 
0.000010 wd 0.0500 time 0.2250 (0.2627) data time 0.0009 (0.0304) model time 0.0000 (0.0000) loss 2.7894 (2.7423) grad_norm 6.7096 (8.3623) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][30/1251] eta 0:05:06 lr 0.000010 wd 0.0500 time 0.2245 (0.2510) data time 0.0011 (0.0209) model time 0.0000 (0.0000) loss 2.9103 (2.6715) grad_norm 5.4171 (7.6707) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][40/1251] eta 0:04:55 lr 0.000010 wd 0.0500 time 0.2234 (0.2440) data time 0.0009 (0.0160) model time 0.0000 (0.0000) loss 2.8751 (2.6890) grad_norm 6.4097 (7.4528) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][50/1251] eta 0:04:48 lr 0.000010 wd 0.0500 time 0.2257 (0.2399) data time 0.0008 (0.0131) model time 0.0000 (0.0000) loss 2.6720 (2.6954) grad_norm 6.5555 (7.6265) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][60/1251] eta 0:04:42 lr 0.000010 wd 0.0500 time 0.2244 (0.2375) data time 0.0008 (0.0111) model time 0.2237 (0.2243) loss 3.0752 (2.6912) grad_norm 11.4948 (7.7253) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][70/1251] eta 0:04:38 lr 0.000010 wd 0.0500 time 0.2247 (0.2356) data time 0.0010 (0.0097) model time 0.2237 (0.2237) loss 2.9980 (2.6870) grad_norm 24.5116 (9.1488) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][80/1251] eta 0:04:34 lr 0.000010 wd 0.0500 time 0.2251 (0.2342) data time 0.0009 (0.0086) model time 0.2242 (0.2236) loss 2.9421 (2.6573) grad_norm 3.7542 (9.1465) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:33 
msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][90/1251] eta 0:04:30 lr 0.000010 wd 0.0500 time 0.2241 (0.2329) data time 0.0010 (0.0077) model time 0.2231 (0.2230) loss 2.8411 (2.6682) grad_norm 9.5325 (8.8864) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][100/1251] eta 0:04:26 lr 0.000010 wd 0.0500 time 0.2238 (0.2320) data time 0.0006 (0.0071) model time 0.2232 (0.2230) loss 3.0737 (2.6689) grad_norm 5.9035 (8.6518) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][110/1251] eta 0:04:23 lr 0.000010 wd 0.0500 time 0.2249 (0.2311) data time 0.0009 (0.0065) model time 0.2240 (0.2227) loss 2.8010 (2.6506) grad_norm 5.4676 (8.4777) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][120/1251] eta 0:04:24 lr 0.000010 wd 0.0500 time 0.2210 (0.2337) data time 0.0006 (0.0060) model time 0.2204 (0.2283) loss 2.9473 (2.6765) grad_norm 6.1665 (8.3894) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][130/1251] eta 0:04:21 lr 0.000010 wd 0.0500 time 0.2244 (0.2329) data time 0.0008 (0.0056) model time 0.2236 (0.2276) loss 3.1383 (2.6822) grad_norm 5.5778 (8.2066) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][140/1251] eta 0:04:18 lr 0.000010 wd 0.0500 time 0.2239 (0.2323) data time 0.0009 (0.0053) model time 0.2230 (0.2271) loss 2.5638 (2.6549) grad_norm 5.6375 (8.0724) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][150/1251] eta 0:04:15 lr 0.000010 wd 0.0500 time 0.2246 (0.2317) data time 0.0008 (0.0050) model time 0.2238 (0.2267) loss 2.8989 (2.6555) 
grad_norm 6.2997 (8.0360) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][160/1251] eta 0:04:12 lr 0.000010 wd 0.0500 time 0.2284 (0.2315) data time 0.0009 (0.0048) model time 0.2275 (0.2267) loss 2.9624 (2.6582) grad_norm 4.0959 (7.9967) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][170/1251] eta 0:04:09 lr 0.000010 wd 0.0500 time 0.2259 (0.2310) data time 0.0010 (0.0045) model time 0.2250 (0.2264) loss 2.6289 (2.6529) grad_norm 4.7946 (7.9102) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][180/1251] eta 0:04:07 lr 0.000010 wd 0.0500 time 0.2175 (0.2307) data time 0.0010 (0.0043) model time 0.2164 (0.2262) loss 2.3870 (2.6518) grad_norm 6.6965 (7.8062) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][190/1251] eta 0:04:04 lr 0.000010 wd 0.0500 time 0.2284 (0.2304) data time 0.0006 (0.0042) model time 0.2279 (0.2260) loss 2.6552 (2.6496) grad_norm 5.5494 (7.7475) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:55:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][200/1251] eta 0:04:01 lr 0.000010 wd 0.0500 time 0.2277 (0.2301) data time 0.0009 (0.0040) model time 0.2267 (0.2259) loss 2.7994 (2.6533) grad_norm 7.9419 (7.9132) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][210/1251] eta 0:03:59 lr 0.000010 wd 0.0500 time 0.2236 (0.2298) data time 0.0008 (0.0039) model time 0.2229 (0.2257) loss 3.0896 (2.6601) grad_norm 5.5808 (7.8325) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][220/1251] eta 0:03:56 lr 0.000010 wd 0.0500 time 
0.2206 (0.2295) data time 0.0010 (0.0037) model time 0.2196 (0.2255) loss 2.3936 (2.6584) grad_norm 7.3975 (7.8098) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][230/1251] eta 0:03:54 lr 0.000010 wd 0.0500 time 0.2203 (0.2292) data time 0.0008 (0.0036) model time 0.2195 (0.2253) loss 2.7824 (2.6632) grad_norm 21.8767 (7.8079) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][240/1251] eta 0:03:51 lr 0.000010 wd 0.0500 time 0.2182 (0.2290) data time 0.0008 (0.0035) model time 0.2174 (0.2252) loss 2.0056 (2.6698) grad_norm 5.3427 (7.7634) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][250/1251] eta 0:03:49 lr 0.000010 wd 0.0500 time 0.2219 (0.2289) data time 0.0007 (0.0034) model time 0.2212 (0.2252) loss 3.1258 (2.6728) grad_norm 45.3911 (7.8768) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][260/1251] eta 0:03:46 lr 0.000010 wd 0.0500 time 0.2214 (0.2287) data time 0.0011 (0.0033) model time 0.2203 (0.2250) loss 2.9982 (2.6629) grad_norm 6.4697 (7.8429) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][270/1251] eta 0:03:44 lr 0.000010 wd 0.0500 time 0.2308 (0.2285) data time 0.0005 (0.0032) model time 0.2303 (0.2250) loss 2.8721 (2.6683) grad_norm 4.2258 (7.7996) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][280/1251] eta 0:03:41 lr 0.000010 wd 0.0500 time 0.2271 (0.2284) data time 0.0009 (0.0031) model time 0.2262 (0.2249) loss 2.9441 (2.6649) grad_norm 5.6408 (7.7384) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:18 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [299/300][290/1251] eta 0:03:39 lr 0.000010 wd 0.0500 time 0.2170 (0.2281) data time 0.0009 (0.0030) model time 0.2161 (0.2247) loss 2.7221 (2.6722) grad_norm 3.7941 (7.6648) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][300/1251] eta 0:03:36 lr 0.000010 wd 0.0500 time 0.2250 (0.2279) data time 0.0008 (0.0030) model time 0.2242 (0.2246) loss 2.5229 (2.6703) grad_norm 5.9585 (7.6061) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][310/1251] eta 0:03:34 lr 0.000010 wd 0.0500 time 0.2231 (0.2278) data time 0.0007 (0.0029) model time 0.2225 (0.2245) loss 2.5865 (2.6810) grad_norm 5.8119 (7.5807) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][320/1251] eta 0:03:31 lr 0.000010 wd 0.0500 time 0.2216 (0.2277) data time 0.0006 (0.0028) model time 0.2210 (0.2245) loss 2.2166 (2.6797) grad_norm 7.8768 (7.6272) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][330/1251] eta 0:03:29 lr 0.000010 wd 0.0500 time 0.2142 (0.2275) data time 0.0011 (0.0028) model time 0.2131 (0.2244) loss 3.1483 (2.6841) grad_norm 6.6792 (7.6162) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][340/1251] eta 0:03:27 lr 0.000010 wd 0.0500 time 0.2234 (0.2274) data time 0.0010 (0.0027) model time 0.2224 (0.2243) loss 2.9879 (2.6910) grad_norm 5.0751 (7.5720) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][350/1251] eta 0:03:24 lr 0.000010 wd 0.0500 time 0.2231 (0.2273) data time 0.0007 (0.0027) model time 0.2224 (0.2242) loss 2.5375 (2.6973) grad_norm 7.7926 
(7.5653) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][360/1251] eta 0:03:22 lr 0.000010 wd 0.0500 time 0.2147 (0.2272) data time 0.0010 (0.0026) model time 0.2137 (0.2242) loss 2.8475 (2.6938) grad_norm 5.9031 (7.5136) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][370/1251] eta 0:03:20 lr 0.000010 wd 0.0500 time 0.2186 (0.2272) data time 0.0011 (0.0026) model time 0.2175 (0.2242) loss 2.6420 (2.6974) grad_norm 6.8259 (7.5125) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][380/1251] eta 0:03:17 lr 0.000010 wd 0.0500 time 0.2265 (0.2271) data time 0.0010 (0.0026) model time 0.2255 (0.2242) loss 2.9041 (2.6907) grad_norm 10.2792 (7.4720) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:40 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][390/1251] eta 0:03:15 lr 0.000010 wd 0.0500 time 0.2216 (0.2270) data time 0.0008 (0.0025) model time 0.2208 (0.2241) loss 2.7610 (2.6953) grad_norm 5.2678 (7.4637) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][400/1251] eta 0:03:13 lr 0.000010 wd 0.0500 time 0.2226 (0.2274) data time 0.0008 (0.0025) model time 0.2219 (0.2247) loss 2.5675 (2.6937) grad_norm 9.7762 (7.4851) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][410/1251] eta 0:03:11 lr 0.000010 wd 0.0500 time 0.2280 (0.2273) data time 0.0008 (0.0024) model time 0.2272 (0.2246) loss 2.7408 (2.6942) grad_norm 5.8250 (7.4601) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][420/1251] eta 0:03:08 lr 0.000010 wd 0.0500 time 0.2299 (0.2272) 
data time 0.0008 (0.0024) model time 0.2291 (0.2246) loss 2.0838 (2.6890) grad_norm 7.0198 (7.4273) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:49 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][430/1251] eta 0:03:06 lr 0.000010 wd 0.0500 time 0.2208 (0.2272) data time 0.0009 (0.0024) model time 0.2198 (0.2245) loss 1.8175 (2.6880) grad_norm 7.6825 (7.4250) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][440/1251] eta 0:03:04 lr 0.000010 wd 0.0500 time 0.2235 (0.2271) data time 0.0009 (0.0023) model time 0.2227 (0.2245) loss 2.6460 (2.6860) grad_norm 5.6384 (7.3839) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][450/1251] eta 0:03:01 lr 0.000010 wd 0.0500 time 0.2240 (0.2270) data time 0.0006 (0.0023) model time 0.2234 (0.2244) loss 2.6894 (2.6847) grad_norm 5.2472 (7.3723) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][460/1251] eta 0:02:59 lr 0.000010 wd 0.0500 time 0.2274 (0.2270) data time 0.0008 (0.0023) model time 0.2266 (0.2244) loss 2.9205 (2.6844) grad_norm 18.0542 (7.3854) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:56:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][470/1251] eta 0:02:57 lr 0.000010 wd 0.0500 time 0.2247 (0.2269) data time 0.0007 (0.0022) model time 0.2241 (0.2244) loss 2.4638 (2.6801) grad_norm 4.4735 (7.3661) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][480/1251] eta 0:02:54 lr 0.000010 wd 0.0500 time 0.2297 (0.2268) data time 0.0006 (0.0022) model time 0.2291 (0.2243) loss 3.1165 (2.6810) grad_norm 9.7430 (7.3707) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:03 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): 
INFO Train: [299/300][490/1251] eta 0:02:52 lr 0.000010 wd 0.0500 time 0.2202 (0.2267) data time 0.0008 (0.0022) model time 0.2194 (0.2243) loss 3.1505 (2.6833) grad_norm 7.9821 (7.3405) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:05 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][500/1251] eta 0:02:50 lr 0.000010 wd 0.0500 time 0.2223 (0.2266) data time 0.0010 (0.0022) model time 0.2213 (0.2242) loss 2.8276 (2.6793) grad_norm 8.5716 (7.3398) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][510/1251] eta 0:02:47 lr 0.000010 wd 0.0500 time 0.2188 (0.2265) data time 0.0008 (0.0021) model time 0.2180 (0.2242) loss 2.6272 (2.6748) grad_norm 9.8372 (7.3827) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][520/1251] eta 0:02:45 lr 0.000010 wd 0.0500 time 0.2277 (0.2265) data time 0.0007 (0.0021) model time 0.2270 (0.2241) loss 2.3961 (2.6709) grad_norm 4.9742 (7.3520) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:12 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][530/1251] eta 0:02:43 lr 0.000010 wd 0.0500 time 0.2236 (0.2264) data time 0.0009 (0.0021) model time 0.2227 (0.2241) loss 2.9441 (2.6720) grad_norm 7.1466 (7.3537) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:14 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][540/1251] eta 0:02:40 lr 0.000010 wd 0.0500 time 0.2259 (0.2264) data time 0.0009 (0.0021) model time 0.2249 (0.2241) loss 2.7411 (2.6700) grad_norm 5.4525 (7.3512) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][550/1251] eta 0:02:38 lr 0.000010 wd 0.0500 time 0.2164 (0.2264) data time 0.0009 (0.0020) model time 0.2155 (0.2241) loss 2.4827 (2.6679) grad_norm 6.4639 (7.3198) loss_scale 128.0000 
(128.0000) mem 7381MB [2024-09-01 11:57:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][560/1251] eta 0:02:36 lr 0.000010 wd 0.0500 time 0.2183 (0.2263) data time 0.0011 (0.0020) model time 0.2172 (0.2241) loss 2.8353 (2.6663) grad_norm 12.4785 (7.3603) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:21 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][570/1251] eta 0:02:34 lr 0.000010 wd 0.0500 time 0.2249 (0.2263) data time 0.0008 (0.0020) model time 0.2242 (0.2240) loss 2.5209 (2.6677) grad_norm 6.5753 (7.3493) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:23 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][580/1251] eta 0:02:31 lr 0.000010 wd 0.0500 time 0.2230 (0.2262) data time 0.0007 (0.0020) model time 0.2224 (0.2240) loss 2.8988 (2.6691) grad_norm 4.7160 (7.3278) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:25 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][590/1251] eta 0:02:29 lr 0.000010 wd 0.0500 time 0.2167 (0.2262) data time 0.0008 (0.0020) model time 0.2159 (0.2240) loss 3.3493 (2.6733) grad_norm 5.1431 (7.2891) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][600/1251] eta 0:02:27 lr 0.000010 wd 0.0500 time 0.2220 (0.2262) data time 0.0009 (0.0019) model time 0.2210 (0.2240) loss 3.3496 (2.6755) grad_norm 19.3052 (7.2819) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][610/1251] eta 0:02:24 lr 0.000010 wd 0.0500 time 0.2259 (0.2262) data time 0.0006 (0.0019) model time 0.2253 (0.2240) loss 2.1849 (2.6728) grad_norm 6.9362 (7.2582) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:32 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][620/1251] eta 0:02:22 lr 0.000010 wd 0.0500 time 0.2210 (0.2261) data time 0.0009 (0.0019) model 
time 0.2201 (0.2240) loss 2.8678 (2.6740) grad_norm 9.3644 (7.3388) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:34 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][630/1251] eta 0:02:20 lr 0.000010 wd 0.0500 time 0.2196 (0.2261) data time 0.0008 (0.0019) model time 0.2187 (0.2240) loss 2.4173 (2.6775) grad_norm 3.8693 (7.3516) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][640/1251] eta 0:02:18 lr 0.000010 wd 0.0500 time 0.2212 (0.2261) data time 0.0010 (0.0019) model time 0.2202 (0.2239) loss 3.0714 (2.6783) grad_norm 6.7841 (7.3452) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][650/1251] eta 0:02:16 lr 0.000010 wd 0.0500 time 0.4459 (0.2264) data time 0.0005 (0.0019) model time 0.4454 (0.2243) loss 2.1984 (2.6811) grad_norm 8.4194 (7.3412) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][660/1251] eta 0:02:13 lr 0.000010 wd 0.0500 time 0.2272 (0.2267) data time 0.0006 (0.0019) model time 0.2266 (0.2246) loss 3.0915 (2.6829) grad_norm 6.3473 (7.3231) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][670/1251] eta 0:02:11 lr 0.000010 wd 0.0500 time 0.2192 (0.2266) data time 0.0011 (0.0018) model time 0.2181 (0.2246) loss 2.6780 (2.6808) grad_norm 11.9149 (7.3255) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][680/1251] eta 0:02:09 lr 0.000010 wd 0.0500 time 0.2217 (0.2266) data time 0.0008 (0.0018) model time 0.2209 (0.2246) loss 2.0966 (2.6778) grad_norm 6.5845 (7.2990) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:48 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][690/1251] 
eta 0:02:07 lr 0.000010 wd 0.0500 time 0.2271 (0.2266) data time 0.0007 (0.0018) model time 0.2263 (0.2246) loss 2.2293 (2.6770) grad_norm 8.8303 (7.3044) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][700/1251] eta 0:02:04 lr 0.000010 wd 0.0500 time 0.2230 (0.2265) data time 0.0007 (0.0018) model time 0.2223 (0.2246) loss 2.4176 (2.6752) grad_norm 4.8234 (7.2915) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][710/1251] eta 0:02:02 lr 0.000010 wd 0.0500 time 0.2182 (0.2265) data time 0.0007 (0.0018) model time 0.2175 (0.2245) loss 2.5726 (2.6790) grad_norm 6.2098 (7.3520) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][720/1251] eta 0:02:00 lr 0.000010 wd 0.0500 time 0.2179 (0.2264) data time 0.0009 (0.0018) model time 0.2169 (0.2245) loss 1.9255 (2.6763) grad_norm 4.8977 (7.3618) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][730/1251] eta 0:01:57 lr 0.000010 wd 0.0500 time 0.2229 (0.2264) data time 0.0006 (0.0018) model time 0.2223 (0.2245) loss 3.0623 (2.6759) grad_norm 5.4875 (7.4786) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:57:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][740/1251] eta 0:01:55 lr 0.000010 wd 0.0500 time 0.2210 (0.2263) data time 0.0009 (0.0017) model time 0.2202 (0.2244) loss 1.9916 (2.6717) grad_norm 4.0556 (7.4852) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][750/1251] eta 0:01:53 lr 0.000010 wd 0.0500 time 0.2235 (0.2263) data time 0.0011 (0.0017) model time 0.2225 (0.2244) loss 3.0667 (2.6706) grad_norm 3.9820 (7.4597) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 
11:58:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][760/1251] eta 0:01:51 lr 0.000010 wd 0.0500 time 0.2227 (0.2262) data time 0.0009 (0.0017) model time 0.2218 (0.2243) loss 3.1556 (2.6724) grad_norm 4.7038 (7.4394) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:06 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][770/1251] eta 0:01:48 lr 0.000010 wd 0.0500 time 0.2217 (0.2262) data time 0.0008 (0.0017) model time 0.2210 (0.2243) loss 2.4110 (2.6713) grad_norm 7.6147 (7.7402) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:08 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][780/1251] eta 0:01:46 lr 0.000010 wd 0.0500 time 0.2224 (0.2261) data time 0.0008 (0.0017) model time 0.2216 (0.2242) loss 2.8349 (2.6732) grad_norm 5.7739 (7.7142) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:10 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][790/1251] eta 0:01:44 lr 0.000010 wd 0.0500 time 0.2275 (0.2261) data time 0.0008 (0.0017) model time 0.2267 (0.2242) loss 3.1168 (2.6754) grad_norm 6.6331 (7.7107) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][800/1251] eta 0:01:41 lr 0.000010 wd 0.0500 time 0.2221 (0.2260) data time 0.0010 (0.0017) model time 0.2211 (0.2242) loss 2.3740 (2.6745) grad_norm 5.2739 (7.7080) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:15 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][810/1251] eta 0:01:39 lr 0.000010 wd 0.0500 time 0.2156 (0.2260) data time 0.0008 (0.0017) model time 0.2148 (0.2242) loss 3.3563 (2.6766) grad_norm 6.8391 (7.6997) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:17 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][820/1251] eta 0:01:37 lr 0.000010 wd 0.0500 time 0.2222 (0.2260) data time 0.0007 (0.0017) model time 0.2215 (0.2241) loss 3.2832 
(2.6768) grad_norm 5.0323 (7.7585) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:19 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][830/1251] eta 0:01:35 lr 0.000010 wd 0.0500 time 0.2208 (0.2259) data time 0.0009 (0.0017) model time 0.2199 (0.2241) loss 2.8867 (2.6788) grad_norm 6.6003 (7.7399) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][840/1251] eta 0:01:32 lr 0.000010 wd 0.0500 time 0.2175 (0.2259) data time 0.0008 (0.0016) model time 0.2167 (0.2241) loss 2.5053 (2.6771) grad_norm 6.2152 (7.7216) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][850/1251] eta 0:01:30 lr 0.000010 wd 0.0500 time 0.2321 (0.2259) data time 0.0007 (0.0016) model time 0.2314 (0.2241) loss 2.7617 (2.6792) grad_norm 9.2838 (7.7097) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:26 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][860/1251] eta 0:01:28 lr 0.000010 wd 0.0500 time 0.2219 (0.2259) data time 0.0010 (0.0016) model time 0.2209 (0.2241) loss 2.6076 (2.6792) grad_norm 8.9470 (7.7306) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:28 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][870/1251] eta 0:01:26 lr 0.000010 wd 0.0500 time 0.2226 (0.2259) data time 0.0010 (0.0016) model time 0.2216 (0.2241) loss 2.5904 (2.6828) grad_norm 4.2659 (7.7151) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:30 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][880/1251] eta 0:01:23 lr 0.000010 wd 0.0500 time 0.2253 (0.2258) data time 0.0008 (0.0016) model time 0.2245 (0.2241) loss 2.7884 (2.6818) grad_norm 7.1383 (7.6971) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:33 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][890/1251] eta 0:01:21 lr 0.000010 wd 0.0500 
time 0.2227 (0.2258) data time 0.0007 (0.0016) model time 0.2220 (0.2240) loss 2.2958 (2.6833) grad_norm 5.9081 (7.6855) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:35 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][900/1251] eta 0:01:19 lr 0.000010 wd 0.0500 time 0.2202 (0.2258) data time 0.0010 (0.0016) model time 0.2193 (0.2240) loss 2.2128 (2.6838) grad_norm 5.5962 (7.6640) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:37 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][910/1251] eta 0:01:16 lr 0.000010 wd 0.0500 time 0.2258 (0.2257) data time 0.0008 (0.0016) model time 0.2250 (0.2240) loss 2.0394 (2.6841) grad_norm 3.7315 (7.6556) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:39 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][920/1251] eta 0:01:14 lr 0.000010 wd 0.0500 time 0.2274 (0.2257) data time 0.0008 (0.0016) model time 0.2266 (0.2240) loss 2.7215 (2.6832) grad_norm 6.6905 (7.6532) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:42 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][930/1251] eta 0:01:12 lr 0.000010 wd 0.0500 time 0.2234 (0.2259) data time 0.0009 (0.0016) model time 0.2225 (0.2242) loss 2.6197 (2.6835) grad_norm 11.1130 (7.6498) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:44 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][940/1251] eta 0:01:10 lr 0.000010 wd 0.0500 time 0.2227 (0.2259) data time 0.0007 (0.0016) model time 0.2220 (0.2242) loss 3.0021 (2.6839) grad_norm 6.9654 (7.6399) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:46 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][950/1251] eta 0:01:07 lr 0.000010 wd 0.0500 time 0.2252 (0.2259) data time 0.0008 (0.0016) model time 0.2244 (0.2241) loss 2.8891 (2.6855) grad_norm 3.8290 (7.6305) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:49 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [299/300][960/1251] eta 0:01:05 lr 0.000010 wd 0.0500 time 0.2295 (0.2258) data time 0.0008 (0.0016) model time 0.2287 (0.2241) loss 2.4273 (2.6843) grad_norm 6.4965 (7.6071) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:51 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][970/1251] eta 0:01:03 lr 0.000010 wd 0.0500 time 0.2269 (0.2258) data time 0.0009 (0.0016) model time 0.2260 (0.2241) loss 2.5830 (2.6828) grad_norm 7.6111 (7.5941) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:53 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][980/1251] eta 0:01:01 lr 0.000010 wd 0.0500 time 0.2255 (0.2258) data time 0.0009 (0.0015) model time 0.2245 (0.2242) loss 2.8810 (2.6822) grad_norm 21.6622 (7.5954) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][990/1251] eta 0:00:58 lr 0.000010 wd 0.0500 time 0.2251 (0.2258) data time 0.0006 (0.0015) model time 0.2245 (0.2241) loss 1.7269 (2.6815) grad_norm 8.5337 (7.5887) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:58:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1000/1251] eta 0:00:56 lr 0.000010 wd 0.0500 time 0.2259 (0.2258) data time 0.0009 (0.0015) model time 0.2250 (0.2241) loss 2.6580 (2.6815) grad_norm 6.4230 (7.5832) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1010/1251] eta 0:00:54 lr 0.000010 wd 0.0500 time 0.2262 (0.2258) data time 0.0009 (0.0015) model time 0.2253 (0.2241) loss 2.8945 (2.6798) grad_norm 5.8453 (7.5801) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1020/1251] eta 0:00:52 lr 0.000010 wd 0.0500 time 0.2223 (0.2258) data time 0.0010 (0.0015) model time 0.2213 (0.2241) loss 2.5184 (2.6794) grad_norm 5.9418 
(7.5680) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:04 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1030/1251] eta 0:00:49 lr 0.000010 wd 0.0500 time 0.2274 (0.2258) data time 0.0008 (0.0015) model time 0.2265 (0.2241) loss 2.1834 (2.6772) grad_norm 5.2999 (7.5674) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:07 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1040/1251] eta 0:00:47 lr 0.000010 wd 0.0500 time 0.2276 (0.2258) data time 0.0009 (0.0015) model time 0.2267 (0.2241) loss 2.0357 (2.6772) grad_norm 7.3265 (7.5587) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:09 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1050/1251] eta 0:00:45 lr 0.000010 wd 0.0500 time 0.2273 (0.2257) data time 0.0007 (0.0015) model time 0.2266 (0.2241) loss 1.8593 (2.6787) grad_norm 5.3951 (7.5670) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:11 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1060/1251] eta 0:00:43 lr 0.000010 wd 0.0500 time 0.2235 (0.2257) data time 0.0011 (0.0015) model time 0.2223 (0.2241) loss 2.8099 (2.6793) grad_norm 5.0335 (7.5489) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:13 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1070/1251] eta 0:00:40 lr 0.000010 wd 0.0500 time 0.2247 (0.2257) data time 0.0006 (0.0015) model time 0.2241 (0.2241) loss 1.9990 (2.6772) grad_norm 5.2527 (7.5929) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:16 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1080/1251] eta 0:00:38 lr 0.000010 wd 0.0500 time 0.2263 (0.2257) data time 0.0008 (0.0015) model time 0.2254 (0.2241) loss 2.7058 (2.6778) grad_norm 22.8930 (7.5968) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:18 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1090/1251] eta 0:00:36 lr 0.000010 wd 0.0500 time 0.2242 
(0.2257) data time 0.0008 (0.0015) model time 0.2234 (0.2241) loss 2.3587 (2.6770) grad_norm 6.4925 (7.6020) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:20 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1100/1251] eta 0:00:34 lr 0.000010 wd 0.0500 time 0.2198 (0.2257) data time 0.0009 (0.0015) model time 0.2189 (0.2241) loss 2.7755 (2.6771) grad_norm 6.5521 (7.5934) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:22 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1110/1251] eta 0:00:31 lr 0.000010 wd 0.0500 time 0.2211 (0.2257) data time 0.0016 (0.0015) model time 0.2195 (0.2241) loss 2.9437 (2.6768) grad_norm 5.9396 (7.5760) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:24 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1120/1251] eta 0:00:29 lr 0.000010 wd 0.0500 time 0.2262 (0.2257) data time 0.0008 (0.0015) model time 0.2254 (0.2241) loss 2.6564 (2.6770) grad_norm 3.9678 (7.5592) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:27 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1130/1251] eta 0:00:27 lr 0.000010 wd 0.0500 time 0.2299 (0.2257) data time 0.0008 (0.0015) model time 0.2291 (0.2241) loss 2.6408 (2.6769) grad_norm 4.5576 (7.5459) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:29 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1140/1251] eta 0:00:25 lr 0.000010 wd 0.0500 time 0.2288 (0.2257) data time 0.0007 (0.0015) model time 0.2281 (0.2241) loss 2.5472 (2.6756) grad_norm 6.5105 (7.5850) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:31 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1150/1251] eta 0:00:22 lr 0.000010 wd 0.0500 time 0.2245 (0.2256) data time 0.0007 (0.0015) model time 0.2238 (0.2241) loss 2.9913 (2.6765) grad_norm 5.4407 (7.5730) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:33 msvmambav3_tiny_224] 
(main_hfai_mnodes.py 371): INFO Train: [299/300][1160/1251] eta 0:00:20 lr 0.000010 wd 0.0500 time 0.2240 (0.2256) data time 0.0008 (0.0015) model time 0.2232 (0.2241) loss 1.9100 (2.6765) grad_norm 5.5589 (7.5569) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:36 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1170/1251] eta 0:00:18 lr 0.000010 wd 0.0500 time 0.3980 (0.2259) data time 0.0006 (0.0014) model time 0.3974 (0.2243) loss 3.1091 (2.6761) grad_norm 4.8495 (7.5660) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:38 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1180/1251] eta 0:00:16 lr 0.000010 wd 0.0500 time 0.2224 (0.2260) data time 0.0006 (0.0014) model time 0.2218 (0.2245) loss 3.0192 (2.6750) grad_norm 4.8096 (7.5519) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:41 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1190/1251] eta 0:00:13 lr 0.000010 wd 0.0500 time 0.2228 (0.2260) data time 0.0006 (0.0014) model time 0.2222 (0.2245) loss 1.9820 (2.6739) grad_norm 9.0654 (7.5423) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:43 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1200/1251] eta 0:00:11 lr 0.000010 wd 0.0500 time 0.2286 (0.2260) data time 0.0007 (0.0014) model time 0.2279 (0.2244) loss 3.1910 (2.6720) grad_norm 6.5903 (7.5298) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:45 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1210/1251] eta 0:00:09 lr 0.000010 wd 0.0500 time 0.2217 (0.2260) data time 0.0009 (0.0014) model time 0.2209 (0.2244) loss 3.3563 (2.6710) grad_norm 15.4913 (7.5308) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:47 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1220/1251] eta 0:00:07 lr 0.000010 wd 0.0500 time 0.2292 (0.2260) data time 0.0009 (0.0014) model time 0.2283 (0.2244) loss 1.9303 (2.6699) grad_norm 
5.6834 (7.5244) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:50 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1230/1251] eta 0:00:04 lr 0.000010 wd 0.0500 time 0.2257 (0.2259) data time 0.0006 (0.0014) model time 0.2251 (0.2244) loss 2.4328 (2.6689) grad_norm 5.6813 (7.5176) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:52 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1240/1251] eta 0:00:02 lr 0.000010 wd 0.0500 time 0.2141 (0.2259) data time 0.0004 (0.0014) model time 0.2137 (0.2244) loss 2.0938 (2.6697) grad_norm 7.5120 (7.5057) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 371): INFO Train: [299/300][1250/1251] eta 0:00:00 lr 0.000010 wd 0.0500 time 0.2132 (0.2258) data time 0.0006 (0.0014) model time 0.2126 (0.2243) loss 2.0881 (2.6700) grad_norm 5.0183 (7.4949) loss_scale 128.0000 (128.0000) mem 7381MB [2024-09-01 11:59:54 msvmambav3_tiny_224] (main_hfai_mnodes.py 398): INFO EPOCH 299 training takes 0:04:42 [2024-09-01 11:59:54 msvmambav3_tiny_224] (utils.py 118): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saving...... [2024-09-01 11:59:55 msvmambav3_tiny_224] (utils.py 120): INFO ./exclude/output_msv2/msvmambav3_tiny_224/20240819223009/latest_ckpt.pth saved !!! 
[2024-09-01 11:59:55 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.351 (0.351) Loss 0.3872 (0.3872) Acc@1 93.555 (93.555) Acc@5 98.730 (98.730) Mem 7381MB
[2024-09-01 11:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.066 (0.099) Loss 0.5752 (0.6145) Acc@1 90.234 (87.766) Acc@5 97.754 (97.718) Mem 7381MB
[2024-09-01 11:59:56 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.071 (0.086) Loss 0.9146 (0.6437) Acc@1 77.734 (86.644) Acc@5 95.508 (97.680) Mem 7381MB
[2024-09-01 11:59:57 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.072 (0.081) Loss 1.1641 (0.7398) Acc@1 74.414 (84.388) Acc@5 92.676 (96.702) Mem 7381MB
[2024-09-01 11:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.076) Loss 1.0293 (0.7890) Acc@1 76.758 (83.239) Acc@5 94.141 (96.208) Mem 7381MB
[2024-09-01 11:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.810 Acc@5 96.148
[2024-09-01 11:59:58 msvmambav3_tiny_224] (main_hfai_mnodes.py 261): INFO Accuracy of the network on the 50000 test images: 82.8%
[2024-09-01 11:59:59 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [0/49] Time 0.827 (0.827) Loss 0.3901 (0.3901) Acc@1 93.359 (93.359) Acc@5 98.828 (98.828) Mem 7381MB
[2024-09-01 12:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [10/49] Time 0.074 (0.139) Loss 0.5664 (0.6104) Acc@1 90.527 (87.775) Acc@5 97.754 (97.781) Mem 7381MB
[2024-09-01 12:00:00 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [20/49] Time 0.066 (0.106) Loss 0.9170 (0.6407) Acc@1 78.027 (86.714) Acc@5 95.605 (97.735) Mem 7381MB
[2024-09-01 12:00:01 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [30/49] Time 0.066 (0.094) Loss 1.1484 (0.7342) Acc@1 74.512 (84.485) Acc@5 93.066 (96.768) Mem 7381MB
[2024-09-01 12:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 442): INFO Test: [40/49] Time 0.058 (0.086) Loss 1.0234 (0.7837) Acc@1 76.855 (83.310) Acc@5 94.434 (96.277) Mem 7381MB
[2024-09-01 12:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 449): INFO * Acc@1 82.854 Acc@5 96.232
[2024-09-01 12:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 276): INFO Accuracy of the network on the 50000 test images: 82.9%
[2024-09-01 12:00:02 msvmambav3_tiny_224] (main_hfai_mnodes.py 295): INFO Training time 0:22:01